# PRODUCING AND ANALYZING MACRO-CONNECTOMES: CURRENT STATE AND CHALLENGES

EDITED BY: Mihail Bota, Sharon Crook and Marcus Kaiser PUBLISHED IN: Frontiers in Neuroinformatics

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-981-5 DOI 10.3389/978-2-88919-981-5


# **PRODUCING AND ANALYZING MACRO-CONNECTOMES: CURRENT STATE AND CHALLENGES**

Topic Editors:

**Mihail Bota,** Cold Spring Harbor Laboratory, USA **Sharon Crook,** Arizona State University, USA **Marcus Kaiser,** Newcastle University, UK

Visualization of the healthy human connectome with BrainX3. Credit: Riccardo Zucca

Construction of comprehensive and detailed matrices of neuroanatomical connections between brain regions (macro-connectomes) is necessary to understand how the nervous system is organized and to elucidate how its different parts interact. Macro-connectomes are also the structural foundation of any finer-granularity approaches at the level of neuron classes and types (meso-connectomes) or individual neurons (micro-connectomes). The advent of novel neuroanatomical methods, as well as combinations of classic techniques, forms the basis of several large-scale projects with the ultimate goal of producing publicly available connectomes at different levels. A parallel approach, that of systematic and comprehensive collation of connectivity data from the published literature and from publicly accessible neuroinformatics platforms, has produced macro-connectomes of different parts of the central nervous system (CNS) in several mammalian species.

The emergence of these public platforms that allow for the manipulation of rich connectivity data sets and enable the construction of CNS macro-connectomes in different species may have significant and long lasting implications. Moreover, when these efforts are leveraged by novel statistical methods, they may influence our way of thinking about the brain. Hence, the present brain region-centric paradigm may be challenged by a network-centric one. Ultimately, these projects will provide the information and knowledge for understanding how different neuronal parts communicate and function, developing novel approaches to diseases and disorders, and facilitating translational efforts in neurosciences.

With this Research Topic we bring together the current state of macro-connectome related projects including the large scale production of thousands of publicly available neuroanatomical experiments, databases with tens of thousands of connectivity records collated from the published literature, and the newest methods for displaying and analyzing this information. This topic also includes a wide range of challenges and how they are addressed - from platforms designed to integrate connectivity data across different sources, species and CNS levels of organization, to languages specifically designed to use these data in models at different scales of resolution, to efforts of 3D reconstruction and data integration, and to approaches for extraction and representation of this knowledge. Finally, we address the present state of different efforts of meso-connectomes construction, and of computational modeling in the context of the information provided by macro-connectomes.

**Citation:** Bota, M., Crook, S., Kaiser, M., eds. (2016). Producing and Analyzing Macro-Connectomes: Current State and Challenges. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-981-5

# Table of Contents

## **Section I: Establishing Macro-Connectomes Experimentally**

*A case study in connectomics: the history, mapping, and connectivity of the claustrum*

Carinna M. Torgerson and John D. Van Horn


## **Section II: Compiling Macro-Connectomes from Existing Data**


Tristan Moreau and Bernard Gibaud

## **Section III: Interaction with Macro-Connectomes**


Xerxes D. Arsiwalla, Riccardo Zucca, Alberto Betella, Enrique Martinez, David Dalmazzo, Pedro Omedas, Gustavo Deco and Paul F. M. J. Verschure

## A case study in connectomics: the history, mapping, and connectivity of the claustrum

## *Carinna M. Torgerson and John D. Van Horn\**

*Department of Neurology, Laboratory of Neuro Imaging, Institute of Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA*

#### *Edited by:*

*Mihail Bota, University of Southern California, USA*

#### *Reviewed by:*

*Graham J. Galloway, The University of Queensland, Australia David Reser, Monash University, Australia*

#### *\*Correspondence:*

*John D. Van Horn, Laboratory of Neuro Imaging, The Institute for Neuroimaging and Informatics, Keck School of Medicine, University of Southern California, 2001 North Soto Street - Room 102, Los Angeles, CA 90032, USA e-mail: jvanhorn@usc.edu*

The claustrum seems to have been waiting for the science of connectomics. Due to its tiny size, the structure has remained remarkably difficult to study until modern technological and mathematical advancements like graph theory, connectomics, diffusion tensor imaging, HARDI, and excitotoxic lesioning. That does not mean, however, that early methods allowed researchers to assess micro-connectomics. In fact, the claustrum is such an enigma that the only things known for certain about it are its histology, and that it is extraordinarily well connected. In this literature review, we provide background details on the claustrum and the history of its study in the human and in other animal species. By explaining the neuroimaging and histology methods that have been undertaken to study the claustrum thus far, and the conclusions these studies have drawn, we illustrate how the shift from micro-connectomics to macro-connectomics advances the field of neuroscience and improves our capacity to understand the brain.

**Keywords: claustrum, connectomics, macro-scale, micro-scale, Wilson's Disease, consciousness**

## **INTRODUCTION**

Macro-scale connectomics, the study of neuronal connections between two or more regions of the brain, combines the principles of functional specialization and functional integration. While there are a mere 30,000–40,000 protein-encoding human genes, and nearly 1.5 million single nucleotide polymorphisms (SNPs), there may be an astonishing 10<sup>15</sup> neuronal connections in the human brain (Lander et al., 2001; Sporns et al., 2005). Despite such complexity, the use of modern neuroimaging is ushering in a new vision for describing the wiring of the brain. While graph theory, diffusion imaging, and even functional imaging are still in their relative infancy, researchers have never possessed more appropriate tools for decoding the enigmas of the brain. Connectomics analysis is particularly revelatory for small, highly connected structures, like the claustrum.

Indeed, the claustrum serves as an informative case study for examining the boon that connectomics research has brought to the field of neuroscience; it has remained one of the most mysterious structures in the brain since the 17th century. Its name, in fact, means "hidden away" (Crick and Koch, 2005). Now, macro-scale connectomics allows for new analyses that may help unlock the secrets of this enigmatic structure. In this literature review, we discuss the limitations that have led the claustrum to be investigated through a connectomics lens in the human and in other animal species, even before Sporns, Tononi, Hagmann, Bullmore, and others launched the modern connectomics movement. Our review describes neuroimaging and histology studies tracing claustral connectivity, its putative role(s) in various neural systems, and its reported influence in neurological syndromes, and examines the recent flurry of interest in the macro-connectomics of the claustrum.

#### **BACKGROUND**

The structure of the claustrum is visible as early as 1672 in the drawings of Thomas Willis (Bayer and Altman, 1991), who first proposed that higher cognitive functions arose from the convolutions of the cerebral cortex, rather than the ventricles (Molnár, 2004). The claustrum was first described, however, by Karl Friedrich Burdach (using the German word "vormauer") in his seminal work *Vom Baue und Leben des Gehirns*, in the early 19th century (Parent, 2012). Burdach himself credited the first illustrated depiction (**Figure 1**) of the structure to the 1786 drawings by Marie-Antoinette's personal physician, Félix Vicq-d'Azyr, who not only discovered the *substantia nigra*, but also provided the most detailed drawings of the basal ganglia of his time (Parent, 2012). Perhaps the first person to appreciate how crucial the claustrum is in multi-modal processing was Theodor Meynert, the director of the psychiatric clinic at the University of Vienna in the late 19th century. Investigating aphasia, Meynert noted that many post-mortem examinations of aphasic patients turned up pathological changes between the insula and the Sylvian fissure. The general belief at the time held that the entire cortex surrounding the Sylvian fissure was dedicated to speech. Meynert hypothesized that the claustrum contained an "acoustic field" that corresponded to the beginning of the *Acousticusstrang*, or "acoustic tract" (Eling, 1994). Information from the acoustic nerve, he posited, was associated with the speech system through spindle-shaped association cells in the claustrum, before being transmitted to the Sylvian fissure. His evidence for this relationship was the well-understood relationship between the claustrum and other "association systems" in the brain (Eling, 1994).

Classification of the claustrum is not a straightforward process. It cannot be accurately described as strictly cortical or subcortical, as it possesses the laminar organization and characteristic pyramidal somata of cortical regions in Insectivora (Narkiewicz and Mamos, 1990), but also contains some notably subcortical cell types (Mathur et al., 2009). Although prominent researchers such as Brodmann and Wernicke suggested that the claustrum represents the innermost layer of the insula due to its close proximity (Landau, 1919), it is somewhat unsatisfactorily included as part of the basal ganglia today. As a case in point, a search for the claustrum on PubMed will automatically also search for the broader term "basal ganglia," presumably reflecting the fact that claustral afferent inputs are believed to be similar to those of the striatum, although its efferents connect directly to the cortex without passing through a thalamic relay (Salerno et al., 1984). These efferent connections have a relatively slow conduction speed, which suggests they are small in diameter and/or poorly myelinated. More recently, Pirone et al. (2012) have concluded that the more likely ontological relationship of the human claustrum is with the insula, rather than the basal ganglia, based on immunostaining of human tissue. The detailed overview provided by Druga (2014) indicates a pallial origin of the claustrum, whereas the striatum appears to be of sub-pallial origin. Puelles (2014) notes that various techniques throughout history have suggested the claustrum could be derived from insular cortical strata, the subpallium/basal ganglia, non-insular pallium, or even that different claustral subdivisions arose from distinct regions. He does, however, note that recent genoarchitectonic and immunocytochemical investigations seem to have reinstated the insular derivative theory. Not only do these new investigations concur with the claustral differences noted in species with and without extreme capsules, they also bolster the recent discussions of the role of the claustrum in consciousness (see Methods for Studying the Claustrum, below), since the insula itself has been recently implicated as a neural site for conscious awareness (Craig, 2009).

Despite its considerable, and controversial, history, the structure itself is diminutive (**Figure 2A**). As represented in the Talairach and Tournoux (1988) atlas, the claustrum is located medial to the insular cortex and lateral to the putamen, between −4 and +16 mm relative to the AC-PC plane. The right claustrum has an approximate average surface area of 1551.15 mm<sup>2</sup> and volume of 828.83 mm<sup>3</sup>, while the left claustrum has a surface area of 1439.16 mm<sup>2</sup> and volume of 705.82 mm<sup>3</sup> (Kapakin, 2011). The noticeable asymmetry (**Figure 2B**) in structure, volume, and average anisotropy (Cao et al., 2003) may relate to function, as the right claustrum, but not the left, has been shown to react differently to congruent and incongruent stimuli (Naghavi et al., 2007). Although the fine-grained anatomy of the claustrum in the human remains poorly understood, it is reasonable to expect that major divisions are present, as in all of the species examined to date, i.e., a ventral "endopiriform" and a dorsal "insular" component. Given a putative shared history with the insula, neural sub-divisions may be present mirroring, in part, those of adjacent insular cortex (Puelles, 2014). Thus, the claustrum is unlikely to function as a single uniform body, *per se*, but rather to have finely interconnected sub-divisions. Lastly, it remains unclear whether blood arrives through the vessels penetrating the insula (Edelstein and Denaro, 2004) or from the deep and superficial sections of the middle cerebral artery (Crick and Koch, 2005).
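The size of the left-right difference reported above can be made concrete with a little arithmetic. The sketch below uses a conventional asymmetry index, (R − L) divided by the mean of R and L; this index is an assumption chosen for illustration and is not a measure reported by Kapakin (2011). Only the four input values come from the text.

```python
def asymmetry_index(right: float, left: float) -> float:
    """Signed asymmetry: (R - L) divided by the mean of R and L.

    Positive values mean the right side is larger. The choice of this
    particular index is an illustrative assumption, not from the source.
    """
    return (right - left) / ((right + left) / 2.0)


# Values quoted in the text from Kapakin (2011).
volume_mm3 = {"right": 828.83, "left": 705.82}
surface_mm2 = {"right": 1551.15, "left": 1439.16}

volume_ai = asymmetry_index(volume_mm3["right"], volume_mm3["left"])
surface_ai = asymmetry_index(surface_mm2["right"], surface_mm2["left"])

print(f"volume asymmetry index:  {volume_ai:.3f}")
print(f"surface asymmetry index: {surface_ai:.3f}")
```

Under this index the right claustrum is about 16% larger in volume and about 7% larger in surface area, relative to the bilateral mean.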

Embryonic observation shows that neurons settle in the claustrum 4–6 days after the peak of neurogenesis (Bayer and Altman, 1991). As most structures do with age, the claustrum increases in volume through middle age and decreases again in old age (Wisco et al., 2008). It shares its ontogeny with the insular cortex, but not with another of its neighbors, the putamen (Pirone et al., 2012). The claustrum is very closely related to both the insula and external capsule. In fact, removing all the fibers of the external capsule that merge in the dorsal claustrum also removes the dorsal claustrum itself, leaving the putamen exposed, without a lateral covering (Fernandez-Miranda et al., 2008b). The dorsal external capsule is mainly composed of projection (intra-hemispheric cortico-subcortical) and not association (intra-hemispheric inter-lobar cortico-cortical) fibers (Fernandez-Miranda et al., 2008a).

and **(B)** a separate coronal view of the claustral body in the left hemisphere from post-mortem tissue (orange arrows). Modern neuroimaging computational segmentation algorithms do not consider the claustrum due to the particular difficulty of extracting it from the surrounding tissues.

## **METHODS FOR STUDYING THE CLAUSTRUM**

The claustrum has been the subject of a relatively small body of research, although the pace of research is increasing in today's era of diffusion imaging and macro-connectomics (**Table 1**). In fact, many studies that mention the claustrum only find its involvement to be a result of their research question, but do not intend to focus on the structure, nor define it particularly carefully. Until recently, the literature was predominantly composed of animal studies (**Table 2**). Human *in vivo* claustrum studies seem to be a phenomenon almost exclusive to the 21st century. While the paucity of neuroimaging technology throughout most of history can account in some part for the scarceness of claustral studies, high-resolution imaging of other small, non-cortical structures was undertaken far earlier. The majority of claustrum investigations have examined the microscopic detail of the claustrum: its neuronal composition (Sherk and LeVay, 1981; Bayer and Altman, 1991; Mathur et al., 2009; Smythies et al., 2012b); its afferents to a single cortical region (Edelstein and Denaro, 2004; Smith et al., 2012); and its excitatory or inhibitory properties (Sherk and LeVay, 1981; Shima et al., 1996). Only a relative few at the time of this writing have attempted to create a macro-scale picture of how the claustrum fits within a larger context—some more recent studies attempt to discover its role in cortico-cortical networks (Hadjikhani and Roland, 1998; Tanne-Gariepy et al., 2002; Poeppl et al., 2014) or to understand its function by analyzing how it alters the electrical frequency of incoming signals (Smythies et al., 2014).

Some unique characteristics have held back research on the claustrum. It is important to note that imaging of the structure is infamously challenging. Early neuroscientists often examined the brains of the deceased in order to relate abnormalities to some functional deficit the patient experienced during life. The claustrum cannot be studied in this way, as researchers have yet to induce a lesion affecting the claustrum without also affecting neighboring structures, although some natural lesions have been reported in epileptic subjects (Sperner et al., 1996; Duffau et al., 2007). Even with the advent of histological staining techniques, it is incredibly difficult to localize an injection in the claustrum without some tracer spreading. The emergence of neuroimaging could not address this limitation of size, as fine-scale structures such as the tiny claustrum can be severely distorted by high-resolution MRI (Konukoglu et al., 2013). Conversely, the claustrum is simply not visible in some low-resolution MR imaging; Meng et al. (2012) note that in the developing brain the claustrum is not visible even at 11.7-T MRI, but can be seen on T2-weighted images at 7T. The difficulty of capturing the claustrum has in fact helped researchers compute signal-to-noise and contrast-to-noise ratios for image restoration because it can only be seen under specific conditions (Konukoglu et al., 2013). In terms of functional imaging, temporal resolution is already notoriously poor. It is not unusual for fMRI spatial resolution to be larger than the width of the claustrum, and therefore attributing any function to just the claustrum runs the risk of ascribing tasks actually carried out in the insula or external capsule to their claustral neighbor. Such misattribution is particularly dangerous in light of the suggestions that the claustrum and insula connect to very different regions (Park et al., 2012).

Within the last 5 years, tractography studies of the claustrum have been undertaken in hopes of obtaining a broader picture of how the claustrum relates to the rest of the brain. An example rendering based upon diffusion imaging of the claustrum is shown in **Figures 3A–D**. Researchers have begun to examine how the claustrum connects functionally disparate cortical networks, and to attempt to extrapolate from these individual examples a larger idea of the role the claustrum plays in the brain. Presumably, the logic of this is that the claustrum performs the same function for all networks in which it participates. The strategy for assessing function seems to be to analyze networks and structures that we understand well, and attribute any unexplainable interaction to the claustrum. This process creates a reactionary chain of research in which one aspect of claustral function is asserted, and then argued against with evidence from a different claustrocortical network. For example, the reciprocal connections from the visual cortex to the claustrum have been used to suggest segregation of function in the claustrum, and yet, analyses of multiple sensory networks have led researchers to conclude that the claustrum functions primarily as a relay station between major networks (Minciacchi et al., 1995).

**Frontiers in Neuroinformatics www.frontiersin.org** November 2014 | Volume 8 | Article 83 | **14**

After centuries of analyzing individual connections, connectomics offers us the opportunity to analyze all connections to and from the claustrum and determine the weight of influence these have on the overall function of the structure. Such a gestalt view could shed light on the role of the claustrum, perhaps even providing a more precise definition of consciousness itself. Some researchers have eschewed a formal definition of consciousness, lest it result in biasing primary research or its results (Crick and Koch, 1990), while others list necessary components—such as an awareness of one's own physical and sentient existence (Craig, 2009), or varying awareness to unchanging stimuli (Blake et al., 2014)—but researchers seem to agree that conscious processes have access to a multitude of sensory information and some processing capacity. Characterization of a network or region in possession of these properties would almost certainly benefit from some graph theoretical calculation of the relative weights of network influence.
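One simple instance of such a graph theoretical calculation is node strength: the sum of the connection weights attached to each region, expressed as a share of the total. The sketch below is illustrative only. The region labels other than "claustrum" are placeholders, and every weight is an arbitrary made-up value, not a measurement from any study cited here.

```python
# Hypothetical symmetric (undirected) connectivity matrix.
# All weights are arbitrary illustration values, NOT measured data.
regions = ["claustrum", "area_1", "area_2", "area_3"]
weights = [
    [0.0, 0.6, 0.2, 0.5],
    [0.6, 0.0, 0.1, 0.0],
    [0.2, 0.1, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
]


def node_strength(matrix):
    """Return the weighted degree (row sum) of each node."""
    return [sum(row) for row in matrix]


strengths = node_strength(weights)
total = sum(strengths)

# Share of the total connection weight that touches each region.
relative = {name: s / total for name, s in zip(regions, strengths)}

for name, share in sorted(relative.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {share:.2f}")
```

Node strength is only one of many possible influence measures; centrality or modularity-based metrics would answer related but different questions.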

Still, reliance on newer investigative techniques will require an accumulation of many large studies in order to find certitude in our conclusions. In terms of DTI investigations, further knowledge of the anatomy of multi-orientational fiber populations may be necessary to increase accuracy; when axons are not oriented in a coherent fashion, so that the voxel-averaged estimate of orientation cannot accurately summarize the orientation of the underlying fibers, continuity may be assumed between the fibers where there is none (Fernandez-Miranda et al., 2008a). Modern *in vivo* human imaging studies of the claustrum still suffer from low statistical power due to small sample sizes. Number of diffusion directions, voxel size, magnetic field strength, and eddy current correction will all need to be carefully controlled to obtain accurate results in a structure so small. In the long term, parcellation algorithms (like FreeSurfer reconstruction; http://surfer.nmr.mgh.harvard.edu/) will need to become more precise if we hope to perform population-level analyses of small structures.
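The voxel-averaging problem can be illustrated with a toy calculation. The sketch below assumes two equally weighted fiber populations running along the x and y axes, so that each diffusion tensor is diagonal and its eigenvalues can be read off directly. The diffusivity values (in units of 10<sup>-3</sup> mm<sup>2</sup>/s) are hypothetical, and the simple average stands in for what a single-tensor fit would see in a mixed voxel.

```python
import math


def fractional_anisotropy(eigenvalues):
    """Standard FA formula for the three eigenvalues of a diffusion tensor."""
    mean = sum(eigenvalues) / 3.0
    numerator = sum((ev - mean) ** 2 for ev in eigenvalues)
    denominator = sum(ev ** 2 for ev in eigenvalues)
    return math.sqrt(1.5 * numerator / denominator)


# Hypothetical axis-aligned fibers: diagonal tensors, so the diagonal
# entries are the eigenvalues along x, y and z.
fiber_along_x = (1.7, 0.3, 0.3)
fiber_along_y = (0.3, 1.7, 0.3)

# A voxel containing both populations in equal proportion (assumption).
voxel_average = tuple((a + b) / 2.0 for a, b in zip(fiber_along_x, fiber_along_y))

fa_single = fractional_anisotropy(fiber_along_x)
fa_mixed = fractional_anisotropy(voxel_average)

print(f"FA of one coherent fiber population: {fa_single:.2f}")
print(f"FA of the voxel-averaged tensor:     {fa_mixed:.2f}")
print(f"x and y diffusivities after averaging: {voxel_average[0]}, {voxel_average[1]}")
```

In this toy case the averaged tensor has equal diffusivity along x and y, so it has no unique principal direction, and its anisotropy drops from roughly 0.80 to roughly 0.48. A tractography step that follows a single principal direction per voxel has nothing reliable to follow here, which is the situation in which false continuity between fibers can arise.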

### **STRUCTURAL MICRO-CONNECTOMICS**

"The central rationale for human connectomics builds on the premise that structural brain connectivity can serve as a basis for understanding brain dynamics and behavior" (Behrens and Sporns, 2012). Despite never having been subjected to network theory analysis, many observations of connections between the claustrum and individual regions or networks have helped establish some facts about claustral micro-connectomics. Hagmann et al. (2008) have applied community detection or modularity analysis to demonstrate—using diffusion imaging and resting state functional MRI—that a close relationship exists between structural connections and functional connections. Therefore, an analysis of the structural connections of the claustrum ought to allow us to hypothesize the functional role it may play in the networks to which it contributes. Functional imaging studies will be discussed in the following section.

Inputs to the claustrum arrive from a plethora of brain areas before being assimilated, integrated, and signals directed to the claustrum (Edelstein and Denaro, 2004). These functions allow rapid adaptation to nuanced changes in one's environment. Intra-claustral interactions, which may involve dendrodendritic synapses and networks of gap junction-linked neurons, are thought to be abundant (Smythies et al., 2012a,b), though the presence of connexin proteins would be a prerequisite for such networks, which requires further exploration. Several authors have also found connections to the corpus callosum and anterior commissure (Berke, 1960; Milardi et al., 2013), although there are only one fifth as many contralateral projections as ipsilateral ones (Markowitsch et al., 1984). There is substantial evidence for claustrum-subcortical connections in the existing animal literature (for instance, in the cat, rat, and a range of insectivora, respectively; Kaufman and Rosenquist, 1985; Sloniewski et al., 1986a; Narkiewicz and Mamos, 1990; Dinopoulos et al., 1992), though there is a paucity of similar findings from *ex vivo* studies of humans.

**FIGURE 3 | (A–C)** Views of a 64-direction diffusion tensor (DTI) fiber tract reconstruction from a 3.0 Tesla Siemens MRI Trio scanner in an example subject showing white matter pathways emanating from the region of the human claustrum. **(D)** The image shows a binary label mask drawn of the left and right claustra on a T1-weighted image of the same subject.

When multiple inputs converge onto the claustrum, this results in a new signal, which demonstrates integration (Edelstein and Denaro, 2004). It is particularly useful in cross-modal matching; for example, it is active when a subject sees and touches something, but not active when two items are seen or two items are touched (Arnow et al., 2002). Information can also be redirected throughout the brain by this structure (Edelstein and Denaro, 2004). In a study using macro-electrodes, the organization of somatic sensory, auditory, and visual projections to the claustrum was found to display heterotopic and multi-sensory convergence characteristics (Spector et al., 1970). Responses differ when the stimulus site or receiver site changes, which indicates it is a non-homogeneous multisensory structure with three electrophysiologically distinct parts (Spector et al., 1970). Claustral neurons appear to discriminate and associate between intermodal and intra-modal sensory stimuli (Spector et al., 1974). The neurons of the striate cortex and pyramidal tract show decreased spontaneous firing during stimulation of the claustrum, which indicates it may play a role in regulating afferent sensory information (Edelstein and Denaro, 2004). Claustral cells themselves display very low spontaneous discharge (Spector et al., 1974). Connections to limbic regions seem to indicate that the claustrum possesses other functions beyond sensory integration. The amount of intra-modal branched inputs is higher in the claustrum than in relay nuclei, which indicates that the claustrum must do more than simply relay information (Minciacchi et al., 1995). Zingg et al. (2014) suggest that the claustrum may provide additional means of direct interaction between neural sub-networks. The structure and function of claustral connections may differ to some extent between genders. For example, the claustrum is active during penile stimulation, but not clitoral stimulation (Georgiadis et al., 2009), and the level of activation corresponds to turgidity of the penis (Arnow et al., 2002). In fact, the claustrum and the brain stem are the only two brain regions active in males during sexual arousal, but inactive during competitive arousal (Redoute et al., 2000).

While the claustrum as a whole is known for multimodal processing, most neurons within it are not multisensory processors (Remedios et al., 2010). Because of this, it has been suggested that the claustrum synchronizes cortical regions that are responsible for bilaterally coordinated behaviors, such as eye movements, without actually changing the data being shuttled through it (Smith and Alloway, 2010). Another hypothesis is that the claustrum works to counterbalance the homunculus of the brain; overrepresented regions in S1 and V1 are relatively underrepresented in the claustrum and a preference exists for retinal periphery representation (LeVay and Sherk, 1981b; Minciacchi et al., 1995). Sherk and LeVay suggest that claustral efferents serve as inhibitors in order to shape the receptive field properties of cortical neurons (Sherk and LeVay, 1981; Shima et al., 1996). Furthermore, the claustrum may play a crucial role in plasticity and reorganization. Such a function would require the claustrum to recognize the unknown modular specific code carried by an afferent axon, or require claustral efferents to return to the same neuron (or neuronal group) that gave rise to the afferent axon (Smythies et al., 2012b).

Micro-connectomics of the claustrum was, until fairly recently, largely studied through retrograde tract-tracing, since the proximity of the claustrum to other areas makes placement of strictly claustral anatomical tracers incredibly imprecise. Unfortunately, retrograde tract-tracing requires clearly defined anatomical boundaries, which are still under investigation in the claustrum (Mathur et al., 2009).

#### **MACRO-CONNECTOMICS**

Sexuality studies may offer the easiest starting point for understanding the interactions between claustrocortical regions. Firstly, the claustrum is activated during penile, but not clitoral arousal (Georgiadis et al., 2009). Men and women tend to experience sexual arousal differently in response to visual stimuli, with one of the most notable differences being that visual stimulation is associated more with male arousal (Redoute et al., 2000; Hamann et al., 2004)—though this assumption has recently been questioned by Rupp and Wallen (2008). Arnow et al. (2002) suggest the claustrum may facilitate reflexive cross-modal transfer of visual input to imagined tactile (penile) stimulation. In both PET and fMRI investigations, claustrum activation corresponded to penile turgidity with statistical significance (Arnow et al., 2002; Stoleru et al., 2012). There is some question whether such activation is specific to sexual arousal, or may be indicative of more broad motivational processing; some imaging has shown claustral activation correlated with thirst, hunger, and emotional motivation (Stoleru et al., 2012).

A further breakdown of this relationship has been undertaken to separate sub-networks in which the claustrum participates during co-activation of the networks. It was demonstrated that psychosexual arousal was characterized by lateral prefrontal cortex, superior parietal lobule, and medial and inferior frontal gyri (all bilaterally), while physiosexual arousal was characterized by anterior cingulate cortex (which the authors consider to be alternatively called ventromedial prefrontal cortex and medial orbitofrontal cortex) (Poeppl et al., 2014). The one region that was activated during both types of arousal was the claustrum. Furthermore, the right claustrum/insula was the only field shown to be involved in visual-tactile stimulation, but not activated in either pure tactile or pure visual stimulation (Hadjikhani and Roland, 1998). Based on observational analysis, researchers have long suspected that sexual stimuli and autonomic processes must be linked in the brain in a way that allows for two-way communication, and meta-analytic review suggests this link may be the claustrum and the putamen. Given its multimodal, cortical and cross-cortical connections, the claustrum seems like a viable candidate for cross-modal matching, such that modality-specific areas can communicate with the claustrum via unimodal connections, which can then be exchanged and altered to output a novel unimodal signal (Smythies et al., 2014). Additional functional imaging investigations of sexual arousal at ultra-high field and spatial resolution would provide confirmation of the psychophysiological role of the claustrum in concert with other brain regions believed to be different between males and females.

Claustral activation also occurs during identification of the use of a fluency heuristic, and may imply that the prioritization of certain inputs, along with the priming effect, may be driven or mediated by the claustrum despite its modest structural size (Volz et al., 2010). It has also been implicated in learning; the insula and claustrum are BOLD-activated during active, but not passive learning (Kersey and James, 2013). Given the prevalence of claustral abnormalities in memory disorders, the role of the claustrum in the creation of fluency heuristics may be due to its involvement in recall. If the conclusions about sensory integration are correct, an aclaustral subject could still respond to isolated stimuli, but would not be able to process complex ones or to coordinate the synchronization of inputs from multiple modalities (Crick and Koch, 2005).

In an fMRI study of musicians, Fauvel et al. (2014) found that the bilateral claustrum was functionally connected with the right inferior orbitofrontal gyrus at rest, and formed a resting-state executive control network. The authors suggested that the network integrates emotion aroused by the auditory stimuli in order to drive planning of future motor sequences that would continue to arouse the appropriate emotionalism in the music, although they admitted that their functional conclusions were merely hypothetical.

Electrolytic lesions and ablations of the claustrum inhibit conditioned activity. Electrical stimulation can lead to salivation, tongue movements, blinking, swallowing, and contralateral upper extremity motions, although results differ between stimulation of the claustral-putamen pathway and the claustral-amygdala pathway (Edelstein and Denaro, 2004). Stimulation can also increase spinal reflexes. Pupillary dilation, licking, swallowing, shivering, and ear movement have been affected by claustral lesions in cats. Lesions of the left claustrum specifically lead to eyelid twitching, myoclonic jerking, and convulsive seizures, although one must remember lesion studies in small structures are more susceptible to downstream effects and other errors (Wada and Tsuchimochi, 1997).

### **CLINICAL SYNDROMES ASSOCIATED WITH CLAUSTRAL DAMAGE AND DYSFUNCTION**

**Table 3** lists the studies of clinical conditions where the claustrum has been implicated, along with the suspected means of claustral involvement. It is important to note that no clinical studies have been able to isolate the claustrum as the only region of involvement. This may support the idea that dysfunction in the claustrum results primarily in network disruption, rather than a specific functional deficit.

This diminutive structure has been investigated for its potential role in seizure generalization (Fernandez-Miranda et al., 2008b). It is thought that epileptiform propagation from limbic sites may be linked to the motor cortex via the claustrum (Mohapel et al., 2000). A case study was conducted on a 12-year-old girl who presented with status epilepticus (multiple, repeated complex partial and myoclonic seizures occurred in the upper extremities and face with orofacial automatisms, eye deviation, and nystagmus) displaying psychotic behavior with agitation, severe cognitive impairment and temporary loss of vision, hearing and speech, as well as loss of orientation in time and place. Bilateral, strip-like lesions were discovered in T1 and T2 images of her claustrum and external capsule. The case indicates that lesions of the claustrum function more like gray matter disease than white matter disease. Additionally, the case study implies a functional correlation between seizures, behavioral state, and abnormalities in the claustrum (Sperner et al., 1996).

In contrast, unilateral removal of the claustrum did not lead to sensorimotor or cognitive impairment in patients receiving surgery for low-grade cerebral glioma (Duffau et al., 2007). So, perhaps the claustrum operates as part of a network or networks, rather than as the epicenter of a network.

The most common clinical association is the effect of the claustrum on memory. Claustral amyloid plaque accumulation has been implicated in the outcomes of Alzheimer's disease and aging (Morys et al., 1994; Fernandez-Miranda et al., 2008b). These amyloid deposits seem to cluster in the ventral claustrum, and may disrupt limbic connections (Morys et al., 1994). These changes in the claustrum associated with aging, however, are subtle, appearing several years after those in the cerebral cortex become apparent. The claustrum has also been examined in cases of memory impairment associated with HIV, AIDS, Parkinson's, and Dementia with Lewy Bodies (DLB) (Kozlowski et al., 1997; Yamamoto et al., 2007; Smith et al., 2008; Kalaitzakis et al., 2009).

#### **Table 3 | Clinical syndromes with putative claustral involvement.**

Other neurological conditions have been examined in the claustrum as well. Negative correlations between anhedonia and metabolism in the claustrum have been shown in both patients with unipolar depression and bipolar disorder (BD), which may be part of overall enlargement of the basal ganglia and increase in claustral GM volume in BD (Chen et al., 2011). Such factors may be the result of differences in pruning between diseased and healthy populations. Severity of delusions in schizophrenia is correlated with the reduction in left claustral volume (Cascella et al., 2011) and schizophrenia patients with hallucinations show signs of white matter excesses (Shapleske et al., 2002). The claustrum is alternately reported as spared from insular gliomas, or commonly found to be invaded by such tumors (Fernandez-Miranda et al., 2008a).

In Wilson's disease, abnormal brightness of T2-weighted images in the claustrum is considered one of the markers of the disease though its effects may be misattributed in the literature to its close neighbor, the putamen (Sener, 1993; King et al., 1996). In fact, Wilson himself cited the possible role of the claustrum in his initial description of the disease in 1912 (though he referred to it as "Progressive Lenticular Degeneration") (Wilson, 1912; Sener, 1993). The disease, which is invariably fatal without intervention, is caused by high copper levels—and sometimes iron levels—in the brain, particularly in the basal ganglia. Symptoms include bilateral tremors of the extremities, spasticity of the limbs and face, emaciation, dysphagia, dysarthria, emotionalism, and difficulty maintaining equilibrium (Wilson, 1912).

#### **CLAUSTRAL INVOLVEMENT IN NEURAL SYSTEMS**

Based on the summation of the micro-connectomics information gathered from centuries of histology, structural imaging, functional imaging, and clinical pathology studies, interactions have been proposed between the claustrum and every major sensory network. The extent to which these networks rely on the function of the claustrum, or to which the claustrum relies on information from these networks, remains to be investigated.

#### **VISUAL**

Claustral neurons are overwhelmingly binocular; 84% respond to stimuli from either eye, while only 40% of overall cortical neurons do (Sherk and LeVay, 1981). Responses in the claustrum to photic stimuli can be abolished by lesioning the lateral geniculate nucleus (LGN) (Edelstein and Denaro, 2004). V1 connections may not be reciprocal, as connections from V1 to the claustrum in macaques have not been reported (Crick and Koch, 2005).

Cortico-claustral connections from layer VI reach layer IV through the claustrum, creating an alternative route to the direct projection from layer VI to layer IV (LeVay and Sherk, 1981a). In the contralateral half of the visual field, the upper fields map to the caudal region of the claustrum, while the lower ones map rostrally. The far periphery can be mapped to the claustral surface, while the vertical meridian lies at the lower limit of the visual region. A single representation of the visual hemi-field exists as a unified map without discontinuities or duplications in the claustrum (LeVay and Sherk, 1981b).

#### **AUDITORY**

The medial ectosylvian gyrus (AII) relays information from the medial geniculate nucleus to the claustrum. In fact, ablation of the entire AII results in a lack of claustral response to MGN stimulation (Edelstein and Denaro, 2004).

#### **MOTOR**

Functional localization has not been demonstrated in the claustrum, so it is unlikely that the claustrum can influence specific muscular groups (Salerno et al., 1984).

The claustral loop to the striate cortex is involved in motion detection, but cannot discriminate the direction of motion (Edelstein and Denaro, 2004). Most (approximately 70%) of movement-related neurons increased discharge regardless of whether the motion was a push, a pull, or a turn, while only 16% were selective to one movement. This differs from the specificity of the motor cortex itself, in which about half of the neurons are responsive to a single motion (Shima et al., 1996). This may be indicative of a high degree of convergence of inputs. This lack of selectivity, along with the presence of inhibitory efferents, may suggest that the claustrum generally suppresses cortical activity immediately before the initiation of movement.

Projections to M1, pre-SMA, SMA-proper, and various subdivisions of PM and area 46 emanate from the entire rostro-caudal extent of the claustrum, with no distinct topographic or somatotopic organization. Along the dorso-ventral axis, however, these projections tend to originate in the dorsal or intermediate claustrum (Tanne-Gariepy et al., 2002). There is overlap of region of origin in most motor areas, but Pre-SMA and SMA-proper both uniquely show local segregation through inter-digitations. Area 46 also receives projections from the most ventral portion of the caudal claustrum, which sends minor projections to M1, the subareas of PM, pre-SMA, and SMA-proper (Tanne-Gariepy et al., 2002). M1 projections to the claustrum are then projected to S1, indicating a role in sensorimotor coordination (Smith et al., 2012).

In terms of oculomotor control, the mid-ventral claustrum receives projections from the frontal eye fields, and is organized topographically according to the size of saccades (Tanne-Gariepy et al., 2002).

#### **SOMATOSENSORY**

While there is general agreement on its heterotopic organization, there is no apparent somatotopic organization of the claustrum (Spector et al., 1974). Some somatic afferents to the claustrum are conveyed via the posterior spinal funiculi fibers. In addition, the anterior ectosylvian gyrus (SII) relays information from the ventral posterolateral nucleus (VPL) to the claustrum. Lesions in the VPL lead to reduced responsivity of the claustrum to skin stimulation. Marked claustral activity was noted in response to stimuli in somatic-vagal-tooth pulp. This may be indicative of trigeminal projections similar to those noted in other structures of the basal ganglia (Edelstein and Denaro, 2004).

## **LOOKING TOWARD THE FUTURE**

Studying connectomics will improve the quality of human brain research. Firstly, connection profiles contain valuable information about a brain region. In fact, anatomical delineations between brain regions can be drawn out by examining which structural elements share similar long-range connections that differ from the connection profiles of other structures. Therefore, connectomics analysis could be a valuable tool in creating a reliable white matter atlas, a resource that does not currently exist. In fact, there is not even a universally agreed upon system of cortical parcellation (Sporns et al., 2005).
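The idea of delineating regions by shared connection profiles can be illustrated with a small sketch. The element names, target columns, and connection strengths below are hypothetical values chosen for illustration, and cosine similarity is only one of several measures that could be used to compare profiles.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two connection profiles (equal-length vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical profiles: each vector holds the connection strength from one
# structural element to four hypothetical long-range targets.
profiles = {
    "elem_a": [0.9, 0.8, 0.0, 0.1],
    "elem_b": [0.8, 0.9, 0.1, 0.0],
    "elem_c": [0.0, 0.1, 0.9, 0.7],
}

sim_ab = cosine_similarity(profiles["elem_a"], profiles["elem_b"])
sim_ac = cosine_similarity(profiles["elem_a"], profiles["elem_c"])
```

In this toy case `elem_a` and `elem_b` have nearly identical profiles and would be grouped into one region, whereas `elem_c` would fall on the other side of a putative boundary.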

At present, the rapid proliferation of claustrum research seems to be mostly composed of macro-connectomics approaches, such as HARDI, DTI, and functional imaging. Shifting the focus from micro to macro, however, is not likely to solve all of the mysteries of the relationship between brain structure and function. Sporns and Tononi, in their noteworthy paper introducing the concept of the connectome, have argued that scientists would not be able to map structure to function in the human brain without a comprehensive connectional model (Sporns et al., 2005). A complete biophysical model of the human connectome, however, would "provide a unified, time-invariant, and readily available neuroinformatics resource that could be used in virtually all areas of experimental and theoretical neuroscience." Several maps of the human macro-scale structural connectome have been presented (Irimia et al., 2012a,b; Barch et al., 2013; Ugurbil et al., 2013; Van Essen et al., 2013), with additional multimodal atlases forthcoming (Amunts et al., 2014). Yet, defining which clusters of connections comprise sub-networks and determining the hierarchy of which networks depend on each other will take a lot of additional time and analysis of both the macro- and the micro-scale.

Micro-scale approaches to assessing connectivity have several shortcomings, although most of these can be mitigated by combining micro- and macro-scale approaches. Assessments of differences in network theory measures, such as betweenness centrality and assortativity, are difficult to translate into clinically relevant knowledge (Johansen-Berg, 2013). Histological attempts to study connectomics often neglect to take into account the difference in tissue size, behavior, and integrity caused by observation and intervention. Where histology tends to underestimate axonal properties, micro-structural observations from tractography data tend to be heavily influenced by the parameters of reconstruction chosen by the researcher (Assaf et al., 2013). Therefore, the combination of macro- and micro-scale connectomics allows for validation of statistics which tend to be particularly prone to error; crossing fibers, a perennial problem for tractography analysis, can be separated according to micro-structural characteristics such as axon diameter in order to increase the validity of diffusion data. Not only does this combination improve the accuracy of existing data analytic techniques, but a combinatory approach also opens the door to new, more complex and multi-scaled analyses. Localized data can be applied to estimation of conduction velocity, which helps in assigning network weight (Johansen-Berg, 2013). Merging EEG and fMRI data with tractography and histology would allow us to take functional research conclusions out of the realm of correlation and start to understand the causation of the data patterns that have emerged in fMRI research. Uniting the two methods would undoubtedly generate further insight about the interplay of overlapping networks in the human brain (Assaf et al., 2013).

The claustrum appears ideally suited for synthesizing multimodal data because it appears to be the infrastructure through which sub-networks communicate. Since its discovery, its function was assumed to be connecting disparate cortical areas. However, using modern functional imaging methods, it is highly challenging to link behavioral function to such a diminutive region. The most relied-upon approaches to inferring its function have involved assessing its connections, interactions, cell types, and shape to draw inferences that can be quantified.

For a combinatory connectomics analysis to succeed in the claustrum, changes will need to occur in the neuroimaging status quo. The claustrum literature is woefully lacking *in vivo* imaging, as well as human imaging. Regardless of species, sample sizes will need to increase dramatically, in order to help overcome the problem of individual differences that plagues neuroimaging data. Intra-individual differences should also be addressed, as there have been very few adjustments made to account for the growing evidence that connectivity is state-dependent. In small structures, tweaks in an imaging protocol can have a far more pronounced effect; therefore, it is vital that claustrum studies utilize the highest available resolution of MR images, and the highest number of diffusion gradients. Including a claustrum label in any of the parcellation software atlases would drastically improve the reliability of the literature.

## **CONCLUSIONS**

After centuries of research and much recent interest, a comprehensive theory of the claustrum's role in the brain and its sensory sub-networks remains lacking. Yet, the field of macro-scale connectomics—made possible through the refinement of neuroimaging and computational methods—provides a new basis for exploring intricate patterns of neural wiring and has already enabled broad-scale exploration of many major brain structures and pathways. The advancing sophistication of brain imaging and computational approaches is ideally suited for understanding the relative connectomics contributions of smaller brain regions which have only begun to be examined via brain mapping methods. Even the seemingly simple mathematical question of whether the claustrum is more crucial in inter- or intra- network connectivity remains unanswered and worthy of examination. Understanding whether the claustrum participates in each subnetwork to an equal quantitative extent would allow assessment of whether the structure performs a single function for each network or multiple functions, unique to each network. Comparison of claustral network metrics throughout human development could illuminate considerable information about the establishment and re-enforcement of inter- and intra-hemispheric communication. Finally, macro-connectomics analysis of a region of such growing popularity could set a leading precedent for investigating small structures and sub-structures of the brain.
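One way to make the inter- versus intra-network question concrete is the participation coefficient from the network science literature, $P_i = 1 - \sum_s (k_{is}/k_i)^2$, where $k_{is}$ is the connection strength of node $i$ to module $s$ and $k_i$ is its total strength. The sketch below is an illustration under stated assumptions rather than an analysis from this article: the graph, node names, and module labels are hypothetical toy data.

```python
def participation_coefficient(node, edges, module_of):
    """Participation coefficient of `node` in an undirected weighted graph.

    edges: iterable of (u, v, weight); module_of: dict mapping node -> module.
    Values near 0 mean connections stay within one module; higher values mean
    connections are spread across several modules.
    """
    strength_by_module = {}
    for u, v, w in edges:
        if u == node:
            other = v
        elif v == node:
            other = u
        else:
            continue
        m = module_of[other]
        strength_by_module[m] = strength_by_module.get(m, 0.0) + w
    total = sum(strength_by_module.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((s / total) ** 2 for s in strength_by_module.values())

# Hypothetical toy network: a "hub" node linked equally to two sensory modules,
# and a node "v3" linked only within the visual module.
module_of = {"hub": "hub", "v1": "visual", "v2": "visual", "v3": "visual",
             "a1": "auditory", "a2": "auditory"}
edges = [("hub", "v1", 1.0), ("hub", "v2", 1.0),
         ("hub", "a1", 1.0), ("hub", "a2", 1.0),
         ("v3", "v1", 1.0), ("v3", "v2", 1.0)]

p_hub = participation_coefficient("hub", edges, module_of)
p_v3 = participation_coefficient("v3", edges, module_of)
```

In this toy graph the hub splits its connections evenly between two modules and scores 0.5, while `v3` connects within a single module and scores 0. Applied to measured connectivity, a comparison of this kind could indicate whether a structure behaves more as an inter-network connector or as a member of one network.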

In conclusion, the purpose of the claustrum—of interest to brain science for centuries—stands a chance of being unlocked through the use of macro-scale connectomics methods. While the study of neuronal connections is far from novel, the increasingly complex statistical algorithms for understanding the functions of network-wide interactions remain in their infancy. Furthermore, studies which combine microscopic detail with macro-scale investigation of multiple brain regions remain rare at present, if not unheard of. Such a wide scope of analysis would likely enhance the knowledge of almost any brain region; however, the putatively abstract and as yet unconfirmed role of the claustrum, beyond when considered in isolation, poses a valuable opportunity for connectomic analysis to break new ground and demonstrate its function by evaluating this stubbornly enigmatic structure across all scales. Likewise, multi-scale analysis of this under-appreciated brain region could demonstrate how much more robust our understanding becomes when networks and sub-networks are considered at all levels, rather than assuming that we can achieve this understanding through an additive process of examining individual micro-scale architecture. It is true that connectomic analysis is already enhancing our understanding of the brain. Yet, the advantages of these computational approaches seem underwhelming when merely used to recapitulate understanding of established networks. It remains unlikely—with contemporary technologies alone—that this mysterious strip of non-cortical, non-subcortical matter nestled deep within the cerebrum will ever be understood without applying connectomic network theory at the broadest possible scales.

### **REFERENCES**


human immunodeficiency virus–a gross-anatomical morphometric study. *Acta Neuropathol.* 93, 136–145. doi: 10.1007/s004010050594


claustrum: an autoradiographic study. *Acta Neurobiol. Exp. (Wars).* 47, 179–182.


**Conflict of Interest Statement:** The Associate Editor Mihail Bota declares that, despite being affiliated to the same institution as authors, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 03 July 2014; accepted: 09 October 2014; published online: 11 November 2014.*

*Citation: Torgerson CM and Van Horn JD (2014) A case study in connectomics: the history, mapping, and connectivity of the claustrum. Front. Neuroinform. 8:83. doi: 10.3389/fninf.2014.00083*

*This article was submitted to the journal Frontiers in Neuroinformatics.*

*Copyright © 2014 Torgerson and Van Horn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Automated multi-subject fiber clustering of mouse brain using dominant sets

#### *Luca Dodero<sup>1</sup>\*<sup>†</sup>, Sebastiano Vascon<sup>1†</sup>, Vittorio Murino<sup>1</sup>, Angelo Bifone<sup>2</sup>, Alessandro Gozzi<sup>2</sup> and Diego Sona<sup>1,3</sup>\**

*<sup>1</sup> Pattern Analysis and Computer Vision Department (PAVIS), Istituto Italiano di Tecnologia, Genova, Italy*

*<sup>2</sup> Magnetic Resonance Imaging Department, Center for Neuroscience and Cognitive Systems@UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy*

*<sup>3</sup> NeuroInformatics Laboratory (NiLab), Fondazione Bruno Kessler, Trento, Italy*

#### *Edited by:*

*Mihail Bota, University of Southern California, USA*

#### *Reviewed by:*

*Hidetoshi Ikeno, University of Hyogo, Japan; Zhuo Wang, University of Southern California, USA*

#### *\*Correspondence:*

*Luca Dodero and Diego Sona, Pattern Analysis and Computer Vision (PAVIS), Istituto Italiano di Tecnologia, Genova, Italy e-mail: luca.dodero@iit.it; diego.sona@iit.it*

*†These authors have contributed equally to this work.*

Mapping of structural and functional connectivity may provide deeper understanding of brain function and dysfunction. Diffusion Magnetic Resonance Imaging (DMRI) is a powerful technique to non-invasively delineate white matter (WM) tracts and to obtain a three-dimensional description of the structural architecture of the brain. However, DMRI tractography methods produce highly multi-dimensional datasets whose interpretation requires advanced analytical tools. Indeed, manual identification of specific neuroanatomical tracts based on prior anatomical knowledge is time-consuming and prone to operator-induced bias. Here we propose an automatic multi-subject fiber clustering method that enables retrieval of group-wise WM fiber bundles. In order to account for variance across subjects, we developed a multi-subject approach based on the Dominant Sets algorithm, via an intra- and cross-subject clustering. The intra-subject step allows us to reduce the complexity of the raw tractography data, thus obtaining homogeneous neuroanatomically-plausible bundles in each diffusion space. The cross-subject step, characterized by a proper space-invariant metric in the original diffusion space, enables the identification of the same WM bundles across multiple subjects without any prior neuroanatomical knowledge. Quantitative analysis was conducted comparing our algorithm with spectral clustering and affinity propagation methods on a synthetic dataset. We also performed qualitative analysis on mouse brain tractography retrieving significant WM structures. The approach serves the final goal of detecting WM bundles at a population level, thus paving the way to the study of the WM organization across groups.

**Keywords: clustering, dominant sets, fibers segmentation, white matter, tractography, multi-subject, diffusion magnetic resonance imaging, DTI**

## **1. INTRODUCTION**

Diffusion magnetic resonance imaging (DMRI) permits noninvasive investigation of the white matter (WM) structure based on the diffusion profile of water molecules in the brain. This technique can be used to estimate the orientation of fibers at the voxel level, which can in turn be used by a number of tractography algorithms to build global fiber trajectories (Basser and Jones, 2002; Tournier et al., 2004). One of the advantages of DMRI over other methods is that it provides neuroscientists and neurosurgeons with the possibility to non-invasively identify fiber bundles, i.e., groups of fibers belonging to the same anatomical regions. These bundles represent major pathways in the overall physical connectivity of the brain. All diffusion-based MRI techniques (e.g., DTI, HARDI, Q-Ball) provide whole brain tractography datasets that are large (typically more than 100,000 fibers), complex and multidimensional, as well as artifact prone (e.g., crossing and broken fibers, low fractional anisotropy near the cortex, etc.), thus greatly complicating the description of large-scale WM structure and limiting the clinical impact of this approach. In most instances, the identification of relevant bundles is carried out via manual identification of regions of interest (ROIs) corresponding to the main known pathways (Mori et al., 2005; Wakana et al., 2007; Catani and de Schotten, 2008). However, this analysis is strongly affected by the prior knowledge used to identify the structures and very much prone to operator bias.

Methods for the automatic decomposition of whole brain tractography into fiber bundles could greatly help reduce complexity and bias associated with manual segmentation. For this reason, there is an urgent need for (semi)-automatic tools determining the bundles within and across subjects with little or no human intervention. This approach, frequently referred to as tractography segmentation, aims at generating a simplified representation of the WM structure, enabling easier navigation and improved understanding of the structural organization of the brain and its overall connectivity.

To automate bundle retrieval, various methods based on different computational paradigms have been proposed over the last few years. For example, the solution proposed in Li et al. (2010) is an evolution of the ROI-based technique that works directly on fibers and applies prior knowledge to perform preliminary parcellation of the brain. Kernel-PCA and C-means are then used to cluster the fibers. However, this approach is limited by the level of detail of the brain atlases, which can prevent the retrieval of small structures or suffer from cross-subject misalignments. Supervised methods were also proposed to retrieve local WM bundles using prior knowledge (Mayer et al., 2011; Olivetti and Avesani, 2011). These approaches require a first manual intervention to select tracts of interest in a subset of subjects and then retrieve the same structure in other subjects, making them unsuitable for a global WM segmentation.

Clustering approaches represent a logical alternative to supervised methods as they permit the discovery of bundle structures without the need for prior anatomical knowledge. A common clustering framework is based on the exploitation of the affinity matrix of a single subject that indicates the similarity between each pair of fibers (Brun et al., 2004; O'Donnell et al., 2006; Zhang et al., 2008). A limitation common to all algorithms based on an affinity matrix is their propensity to suffer from computational load owing to the calculation of pairwise distances between streamlines. Usually the complexity of these algorithms is $O(N^2)$, where $N$ is the total number of fibers. Approaches to reduce computational complexity have been proposed, like Quick Bundles (Garyfallidis et al., 2012), online agglomerative clustering (Demir et al., 2013) and atlas-guided clustering with efficient implementation (Ros et al., 2013). A hierarchical clustering approach on a single subject (Guevara et al., 2011) was proposed to automatically estimate the number of clusters from the dataset. However, the results of this approach are strongly conditioned by the number of hierarchical steps, and several input parameters are required to carry out a comprehensive map of WM bundles.
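The quadratic cost can be seen in a minimal sketch of affinity-matrix construction. The distance used here (a symmetric mean closest-point distance), the Gaussian kernel, and the `sigma` parameter are assumptions chosen for illustration, not the metric of any particular method cited above; the three toy fibers are hypothetical.

```python
import math

def mean_closest_point_distance(f, g):
    """Symmetric mean closest-point distance between two fibers (lists of 3D points)."""
    def directed(a, b):
        return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)
    return 0.5 * (directed(f, g) + directed(g, f))

def affinity_matrix(fibers, sigma=1.0):
    """Gaussian affinity between every pair of fibers.

    Returns the matrix and the number of distance evaluations, which is
    N * (N - 1) / 2 for N fibers, i.e. quadratic in N.
    """
    n = len(fibers)
    A = [[0.0] * n for _ in range(n)]
    evaluations = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = mean_closest_point_distance(fibers[i], fibers[j])
            A[i][j] = A[j][i] = math.exp(-d * d / (2.0 * sigma * sigma))
            evaluations += 1
    return A, evaluations

# Three hypothetical straight fibers: two close together, one far away.
fibers = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0.1, 0), (1, 0.1, 0), (2, 0.1, 0)],
    [(0, 5, 0), (1, 5, 0), (2, 5, 0)],
]
A, evaluations = affinity_matrix(fibers)
```

With 100,000 fibers the same double loop would require roughly $5 \times 10^9$ distance evaluations, which is why strategies that reduce the number of fibers or avoid the full matrix are attractive.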

Multi-subject spectral clustering (O'Donnell and Westin, 2007) was proposed to build a high dimensional WM atlas based on multiple DTI images. One limitation of this approach is that it needs prior information about the number of clusters to be segmented, which is often unknown. To circumvent the problem of prior knowledge, multi-subject hierarchical clustering was proposed (Guevara et al., 2012). All subjects are registered to a common space, but different manually set agglomerative distance thresholds, based on neuroanatomical information, are used to retrieve the same WM bundles across different subjects.

A more advanced multi-subject clustering approach, a non-parametric Bayesian framework based on a Dirichlet process (Wang et al., 2011), was proposed to automatically infer the number of clusters from the data without computing an affinity matrix. However, large datasets can dramatically decrease the quality of the results. Recently, Tunç et al. (2014) proposed a multi-subject adaptive clustering algorithm that builds an atlas from a subset of subjects and uses it to segment new subjects. However, manual thresholds are used to merge fibers and the atlas is strongly dependent on the number of subjects used.

In an attempt to circumvent all these limitations, we present a multi-subject clustering approach based on affinity matrices, directly connected with graph theory and rooted in game theory. The method, based on the Dominant Set framework, benefits from three properties that make it appealing for the problem at hand: (i) it is robust to noise and outliers (Pavan and Pelillo, 2007); (ii) it is robust to parameter settings, generating stable results across different datasets (Dodero et al., 2013b); (iii) it automatically infers the number of clusters (Pavan and Pelillo, 2007). We tested our method on synthetic datasets, comparing the results with state-of-the-art solutions such as spectral clustering (Ng et al., 2002) and affinity propagation (Frey and Dueck, 2007). We also tested our method on a mouse brain dataset with the tractography inferred from DTI images, showing that it can reliably identify neuroanatomically plausible WM bundles in the mouse brain across multiple subjects without any prior neuroanatomical knowledge.

## **2. MATERIALS AND METHODS**

Our main goal was to identify WM bundles across multiple subjects without prior registration of the raw diffusion data or the tractography. The algorithm approaches this problem in two steps. In the first step, the tractography datasets are segmented in the original diffusion space to obtain WM bundles for each subject (Dodero et al., 2013b). Subsequently, the bundles with high intra-subject similarity are clustered across subjects, performing all computations in the original space of the subjects by defining a space-invariant set of landmarks (O'Donnell et al., 2012). **Figure 1** shows a schematic pipeline of the most important steps of our method.

Since unsupervised learning methods can be heavily affected by the chosen similarity measure, and the two clustering levels use different metrics, we investigated and compared different measures, with the aim of finding the encoding that better preserves the relative similarities across metrics.

#### **2.1. STANDARD FIBER SIMILARITIES**

Each fiber is described by a sequence of points in 3D space. To obtain a uniform representation in which all fibers have the same number of equidistant points, each fiber was resampled using B-spline interpolation with *k* = 12 points, as proposed in Garyfallidis et al. (2012). We thus coded the generic *i*-th streamline as a 3D curve described by a fixed-length sequence of points $F_i = \{\mathbf{p}_1^i, \ldots, \mathbf{p}_k^i\}$ with $\mathbf{p}_j^i \in \mathbb{R}^3$. Since fibers have no preferred orientation, the flipped version of each streamline, $F_i' = \{\mathbf{p}_k^i, \ldots, \mathbf{p}_1^i\}$, was also considered in each metric computation.
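The resampling step can be illustrated with a minimal Python sketch. It is written under a stated simplification: points are placed by linear interpolation along the arc length of the polyline, whereas the text above uses B-spline interpolation. The names `resample_streamline` and `flip` are hypothetical and do not come from the original implementation.

```python
import math


def resample_streamline(points, k=12):
    """Resample a 3D polyline to k points equally spaced along its arc length.

    Assumption: linear interpolation between the input points
    (the paper uses B-spline interpolation instead).
    """
    if k < 2 or len(points) < 2:
        raise ValueError("need at least two input points and k >= 2")
    # Cumulative arc length at every input point.
    cum = [0.0]
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("degenerate streamline of zero length")
    out = []
    seg = 0
    for j in range(k):
        target = total * j / (k - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0.0 else min(1.0, (target - cum[seg]) / seg_len)
        a, b = points[seg], points[seg + 1]
        out.append(tuple(a[d] + t * (b[d] - a[d]) for d in range(3)))
    return out


def flip(fiber):
    """Return the same streamline traversed in the opposite direction."""
    return fiber[::-1]
```

With this representation every fiber becomes a list of exactly *k* coordinate triples, so point-wise comparisons between fibers are well defined.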

To cluster WM at single-subject level, we compared the symmetrized mean closest point distance (Guevara et al., 2011) and symmetrized point to point distance.

• *Symmetrized mean closest point distance*

$$d\_{\rm sup}(F\_i, F\_j) = \frac{1}{2} \left( d\_m(F\_i, F\_j) + d\_m(F\_j, F\_i) \right) \tag{1}$$

defined as the average of the two directed (non-symmetric) mean closest points distances between fibers *Fi* and *Fj*.

$$d\_m(F\_i, F\_j) = \frac{1}{k} \sum\_{\mathbf{p}\_k^i \in F\_i} \min\_{\mathbf{p}\_l^j \in F\_j} \|\mathbf{p}\_k^i - \mathbf{p}\_l^j\|\_2 \tag{2}$$

where $\|\cdot\|_2$ is the Euclidean norm.

**FIGURE 1 | Pipeline of our proposed method. (A)** Intra-subject clustering through Dominant Sets. **(B)** Landmark extraction and encoding of the centroids in the landmark space. Each centroid selected in the intra-subject clustering step was encoded on the landmarks. **(C)** Cross-subject clustering through the block affinity matrix and Dominant Sets to find the same WM bundles across multiple subjects.

• *Symmetrized Point to Point Distance*

$$d\_{pp}(F\_i, F\_j) = \min\left(d\_p(F\_i, F\_j), d\_p(F\_i, F\_j')\right) \tag{3}$$

defined as the minimum of the two directed mean point distances between fiber $F_i$ and fiber $F_j$ or its flipped version $F_j'$.

$$d\_p(F\_i, F\_j) = \frac{1}{k} \sum\_{k} \|\mathbf{p}\_k^i - \mathbf{p}\_k^j\|\_2 \tag{4}$$

where $\mathbf{p}_k^i$ and $\mathbf{p}_k^j$ are the corresponding points sampled in the two fibers.
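Equations 1 to 4 can be sketched in a few lines of Python, assuming each fiber is a list of *k* `(x, y, z)` tuples. The function names are illustrative only and are not taken from the original implementation.

```python
import math


def directed_mean_closest(fi, fj):
    """Eq. 2: mean, over the points of fi, of the distance to the closest point of fj."""
    return sum(min(math.dist(p, q) for q in fj) for p in fi) / len(fi)


def symmetrized_mean_closest(fi, fj):
    """Eq. 1: average of the two directed mean closest point distances."""
    return 0.5 * (directed_mean_closest(fi, fj) + directed_mean_closest(fj, fi))


def mean_point_to_point(fi, fj):
    """Eq. 4: mean distance between corresponding points of two fibers."""
    if len(fi) != len(fj):
        raise ValueError("fibers must have the same number of points")
    return sum(math.dist(p, q) for p, q in zip(fi, fj)) / len(fi)


def symmetrized_point_to_point(fi, fj):
    """Eq. 3: minimum over the two possible orientations of fj."""
    return min(mean_point_to_point(fi, fj), mean_point_to_point(fi, fj[::-1]))
```

Note that the closest point distance is insensitive to the orientation of the fibers, whereas the point to point distance needs the explicit comparison with the flipped fiber.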

Regardless of the chosen metric, the affinity matrix $A = (a_{ij})$ encoding the fiber similarities was built as:

$$a\_{ij} = \begin{cases} e^{-\frac{d(F\_i, F\_j)}{\sigma}} & \text{if } (i, j) \in E \\\\ 0 & \text{otherwise,} \end{cases} \tag{5}$$

where σ is a normalization term. We imposed $\sigma = \max_{i,j} d(F_i, F_j)$, which fixes a unique bound for $a_{ij}$ regardless of the dataset used.
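As an illustration of Equation 5, the sketch below turns a precomputed distance matrix into an affinity matrix. Two assumptions are made here and are not stated in the text: the edge set *E* defaults to all pairs of distinct fibers, and the diagonal is left at zero because the graph has no self-loops. The name `affinity_matrix` is hypothetical.

```python
import math


def affinity_matrix(dist, edges=None):
    """Eq. 5 with sigma set to the largest pairwise distance.

    dist  : square matrix (list of lists) of pairwise fiber distances.
    edges : optional set of (i, j) pairs; assumption: all i != j when omitted.
    """
    n = len(dist)
    sigma = max(dist[i][j] for i in range(n) for j in range(n))
    if sigma == 0.0:
        raise ValueError("all distances are zero, sigma is undefined")
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (edges is None or (i, j) in edges):
                affinity[i][j] = math.exp(-dist[i][j] / sigma)
    return affinity
```

With this choice of σ the exponent lies in [−1, 0], so every non-zero affinity falls between 1/e and 1 whatever the physical scale of the dataset.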

## **2.2. LANDMARK-BASED SIMILARITIES**

Starting from the brain atlas registered to each diffusion space, we define a set of landmarks (3D points in the volume), which have different spatial locations in each subject but refer to the same cortical structures across datasets. These points are used to represent the fibers with a cross-subject invariant descriptor, which allows us to avoid space registration and to handle the fiber segmentation in the original space. More specifically, in our experiments with mouse tractography, landmarks were defined from an anatomical T2-weighted mouse brain atlas (Sforazzini et al., 2013) (139 brain regions) linearly registered to each subject's space using FSL's FLIRT, v.5.0.6 (Smith et al., 2004). We next selected a subset of symmetric cortical and sub-cortical areas (50 labels), covering both hemispheres and including all the major cortical and subcortical districts of the mouse brain (Paxinos and Franklin, 2004), and for each ROI we computed the center of gravity, obtaining fifty landmark points. We next tested two landmark-based measures, defined as follows:

• *Symmetrized Minimum Landmark Distance*

Given the list of landmarks $L = \{\mathbf{L}_1, \ldots, \mathbf{L}_n\}$ (with *n* = 50 in our case), each one identifying a specific brain region, we built a corresponding feature vector $\tilde{F}$ (of dimension 1 × *n*) describing each fiber as the list of minimum distances between each landmark and the fiber itself. More specifically, each fiber was encoded as a vector $\tilde{F}_i = \{f_1^i, \ldots, f_n^i\}$ such that:

$$f\_s^i = \min\_{\mathbf{p}\_k^i \in F\_i} \|\mathbf{p}\_k^i - \mathbf{L}\_s\|\_2 \tag{6}$$

Then we define the similarity between fibers as:

$$d\_l(F\_i, F\_j) = \|\tilde{F}\_i - \tilde{F}\_j\|\_2 \tag{7}$$

and the affinity matrix was determined using Equation 5.

• *Landmark Distance*

An alternative and more selective encoding can be obtained by employing a full landmark distance representation, where each point in a fiber is mapped using all elements in the landmark space. In this case each fiber is encoded into a vector $\tilde{F}_i$ of dimension *k* × *n*, where *k* is the number of sample points in a fiber and *n* is the number of landmarks, and each entry of this vector is the Euclidean distance between one fiber point and one landmark. We define $\tilde{F}_i = \{f_{11}^i, \ldots, f_{1n}^i, \ldots, f_{k1}^i, \ldots, f_{kn}^i\}$ such that:

$$f\_{ks}^{i} = \|\mathbf{p}\_k^{i} - \mathbf{L}\_s\|\_2 \tag{8}$$

Equation 7 was then used to compute the similarity, and the corresponding affinity matrix was determined according to Equation 5.
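The two landmark encodings (Equations 6 and 8) and the distance of Equation 7 can be sketched as follows, assuming fibers and landmarks are lists of `(x, y, z)` tuples. Function names are illustrative. The explicit comparison with the flipped fiber in `landmark_distance` follows the remark in Section 2.1 that both orientations are considered in each metric computation.

```python
import math


def min_landmark_encoding(fiber, landmarks):
    """Eq. 6: for each landmark, the distance to the closest point of the fiber."""
    return [min(math.dist(p, lm) for p in fiber) for lm in landmarks]


def full_landmark_encoding(fiber, landmarks):
    """Eq. 8: distance of every fiber point to every landmark (k * n entries)."""
    return [math.dist(p, lm) for p in fiber for lm in landmarks]


def min_landmark_distance(fi, fj, landmarks):
    """Eq. 7 applied to the minimum landmark encoding (orientation independent)."""
    return math.dist(min_landmark_encoding(fi, landmarks),
                     min_landmark_encoding(fj, landmarks))


def landmark_distance(fi, fj, landmarks):
    """Eq. 7 applied to the full encoding, taking the better of the two orientations."""
    ei = full_landmark_encoding(fi, landmarks)
    return min(math.dist(ei, full_landmark_encoding(fj, landmarks)),
               math.dist(ei, full_landmark_encoding(fj[::-1], landmarks)))
```

Because both encodings depend only on distances to anatomically corresponding landmarks, the resulting vectors can be compared between subjects without registering the fibers themselves.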

#### **2.3. METRIC COMPARISON**

The above two groups of similarity measures were defined for the two clustering steps in light of their different requirements. Since the choice of the similarity measure can greatly affect the clustering algorithms, we compared the measures with the aim of selecting the two that produce the most similar results. A landmark-based measure is almost mandatory in order to avoid aligning the tractographies. However, since the landmark-based representation is an approximation of the real fiber location, we have to choose the measure that best preserves the geometry and the shape of the subject's bundles. We thus compared all proposed measures pairwise, computing each similarity measure between each pair of fibers of a randomly chosen subject. **Figure 2** depicts the distributions of all pairwise comparisons. Comparing the similarities with the Pearson correlation, we found that the symmetrized point to point distance and the landmark distance are the most correlated, presenting the closest correspondence (see **Figure 2-Bottom Left**). Based on these results, we adopted the symmetrized point to point distance for intra-subject clustering and the landmark distance for cross-subject clustering.
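The comparison between two measures can be sketched as the Pearson correlation between the upper triangles of their pairwise matrices. This is a minimal illustration with hypothetical helper names; whether distances or affinities were correlated in the original analysis is not specified, and the sketch works for either.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix (each pair once)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def measure_agreement(matrix_a, matrix_b):
    """Correlation between two pairwise matrices computed on the same fibers."""
    return pearson(upper_triangle(matrix_a), upper_triangle(matrix_b))
```

A value close to 1 indicates that the two measures rank fiber pairs almost identically, which is the property sought when one measure replaces the other between the two clustering steps.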

#### **2.4. DOMINANT SETS CLUSTERING**

The Dominant Sets framework (Pavan and Pelillo, 2007) is a graph-theoretic method that generalizes the maximal clique problem to weighted graphs. It finds a compact, coherent and well-separated subset of nodes in a graph, i.e., the *dominant set* (DS). This framework defines the correspondence between clique, DS and cluster from a graph-theoretic perspective, and provides an optimization algorithm used to extract all DSs in a graph. Formally, a dataset is represented as a weighted undirected graph *G* = (*V*, *E*, φ) with no self-loops, in which the vertices *V* are the data points and the edges *E* ⊆ *V* × *V* represent neighborhood relations among pairs of nodes, quantified by the weighting function φ : *E* → R+. A DS formalizes two crucial properties of all clustering techniques: *intra-cluster homogeneity* and *inter-cluster inhomogeneity*.

A graph is compactly represented by its weighted adjacency matrix *A* (the affinity matrix in our approach), which is defined by Equation 5. In our setting, each fiber corresponds to a node in the graph and the weighting function φ provides a measure of the similarity between pairs of fibers. Evaluating these two properties on all possible subsets of *V* is obviously unfeasible; for this reason the problem is cast into the following optimization task:

$$\begin{array}{ll}\text{maximize} & \mathbf{x}^T A \mathbf{x} \\\\ \text{subject to} & \mathbf{x} \in \Delta^n \end{array} \tag{9}$$

where **x** lies in the standard *n*-dimensional simplex $\Delta^n$, i.e., $\sum_i x_i = 1$ and $x_i \geq 0\ \forall i$. In the DS framework, **x** is called the *weighted characteristic vector* and its *i*-th component quantifies the degree of participation of the *i*-th element in the DS. If **x** is a strict local solution of (9), then its support, defined as $\delta(\mathbf{x}) = \{i \mid x_i > 0\}$, is a DS (Pavan and Pelillo, 2003) and thus a cluster. A local maximizer of (9) is found using the *replicator dynamics* (Pavan and Pelillo, 2003), a result from evolutionary game theory that mimics the temporal changes in a population based on the fitness of its individuals:

$$\mathbf{x}\_{i}(t+1) = \mathbf{x}\_{i}(t)\frac{(A\mathbf{x}(t))\_{i}}{\mathbf{x}(t)^{T}A\mathbf{x}(t)}\tag{10}$$

The optimization starts from a point **x**(*t*<sub>0</sub>) located at the barycenter of the simplex, $x_i(t_0) = \frac{1}{n}\ \forall i$. Equation (10) is iterated until stability, which is guaranteed to be reached if the matrix *A* is non-negative and symmetric. In theory, stability is achieved when **x**(*t* + 1) = **x**(*t*); in practice, the iteration stops when the distance between two consecutive steps, ||**x**(*t* + 1) − **x**(*t*)||, is lower than a threshold ε (in our setting ε = 10<sup>−7</sup>). Equation (10) also guarantees that the constraint in Equation (9) remains satisfied over time (Pavan and Pelillo, 2003). In practice, the algorithm operates a selection process over the components of vector **x**, driven by the affinity matrix *A*. At convergence some elements of **x** will emerge ($x_i > 0$) and others will become extinct ($x_i = 0$). In order to extract multiple clusters, a *peeling-off* strategy is applied: once a DS is determined, it is removed from the whole set of vertices *V*, and the process is iterated on the remaining nodes, until all elements are clustered.

Applying the method in practical cases rarely produces a vector **x** with elements exactly equal to zero, mainly because of numerical approximation or premature stopping of the dynamics. Thresholding over **x** is thus integrated into the support calculation:

$$\tilde{\delta}(\mathbf{x}) = \{ i \mid \mathbf{x}\_i > \theta \cdot \max(\mathbf{x}) \}, \quad \theta \in [0, 1] \tag{11}$$

Small values of θ act as a noise reducer, while higher values yield a greater number of clusters, each with higher internal compactness. We fixed this coherence threshold according to the findings of a previous work (Dodero et al., 2013b), which showed that it needs to be very small to make the model stable (θ = 10<sup>−5</sup>).
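The procedure described in this section (replicator dynamics of Equation 10, thresholded support of Equation 11 and peeling-off) can be condensed into the following Python sketch. It is a plain, non-optimized illustration for small matrices, not the authors' implementation; the function names, the `max_iter` safeguard and the handling of nodes left without any positive affinity (returned as singletons) are assumptions introduced here.

```python
import math


def replicator_dynamics(A, eps=1e-7, max_iter=10000):
    """Iterate Eq. 10 from the barycenter of the simplex until ||x(t+1) - x(t)|| < eps."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0.0:  # no positive affinity left: nothing to optimize
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_x, x)))
        x = new_x
        if step < eps:
            break
    return x


def dominant_sets(A, theta=1e-5, eps=1e-7):
    """Peeling-off strategy: extract one dominant set at a time (Eq. 11 for the support)."""
    remaining = list(range(len(A)))
    clusters = []
    while remaining:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        if not any(value > 0.0 for row in sub for value in row):
            # Assumption: unrelated leftovers are reported as singleton clusters.
            clusters.extend([i] for i in remaining)
            break
        x = replicator_dynamics(sub, eps)
        cutoff = theta * max(x)
        support = [remaining[i] for i, xi in enumerate(x) if xi > cutoff]
        clusters.append(support)
        remaining = [i for i in remaining if i not in support]
    return clusters
```

On a toy graph with two tightly connected groups joined by weak edges, the first extracted set is the more cohesive group and the second pass recovers the other one, without the number of clusters being specified in advance.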

#### **2.5. INTRA-SUBJECT CLUSTERING**

DS clustering was first applied to the tractography volume of each single subject to extract the WM bundles (intra-subject clustering). To reduce data dimensionality and thus computational complexity, we split the whole brain into three smaller datasets: left hemisphere, right hemisphere, and inter-hemispheric fibers, resulting in approximately 15,000 fibers per sub-dataset. The quality of the retrieved bundles was then evaluated by measuring the cohesiveness, a quantitative index of the internal coherence of each cluster δ, defined as follows:

$$C(\delta) = \mathbf{x}^T A \mathbf{x} \tag{12}$$

where **x** is the characteristic vector corresponding to δ and *A* is the adjacency matrix. High values of cohesiveness are related to clusters with high internal similarity between elements, while clusters with low cohesiveness aggregate fibers with little structural significance. Hence, we used the cohesiveness index to remove the less significant clusters. **Figure 3-Left** shows an example of cohesiveness determined for all iteratively generated clusters. Since the last generated clusters are generally not significant (Pavan and Pelillo, 2007), we removed the last 5% of the clusters, which are mostly clusters with very low internal cohesiveness.
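Equation 12 and the removal of the last clusters can be sketched as follows. The names are hypothetical, and rounding the 5% up to the next whole cluster is an assumption, since the text does not say how fractional counts are handled.

```python
def cohesiveness(A, x):
    """Eq. 12: C = x^T A x for a characteristic vector x defined on the nodes of A."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def drop_last_clusters(clusters, percent=5):
    """Discard the last `percent` % of the clusters, in order of extraction.

    Assumption: the number of discarded clusters is rounded up.
    """
    n_drop = -(-len(clusters) * percent // 100)  # integer ceiling
    return clusters[:len(clusters) - n_drop]
```

For a cluster of *m* elements with uniform weights 1/*m* and a common pairwise affinity *a*, the index evaluates to *a*(*m* − 1)/*m*, so it grows both with the similarity between members and with the size of the cluster.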

**FIGURE 3 | Left:** Example of cohesiveness curve and polynomial fitting. **Right:** Strategy to remove outliers from intra-subject clustering using a Gaussian curve and a statistical test. All positive peaks and the negative ones above the green line are considered significant for multi-subject clustering.

Moreover, in order to select the most representative WM structures, we normalized the cohesiveness curve by subtracting a second-order polynomial curve fitted to the cohesiveness curve itself. Assuming that the normalized values follow a Gaussian distribution *N*(0, σ), with σ estimated from the data, we considered as outliers in terms of cohesiveness all clusters in the negative tail of the distribution at a significance level of *p* < 0.05. **Figure 3-Right** shows a plot of the normalized cohesiveness with the confidence level below which clusters are rejected. Once the set of cluster candidates was generated for each subject, the medoids were determined for each WM bundle and used as reference tracts in the next step.
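The selection step can be illustrated with the sketch below, which detrends the cohesiveness curve with a least-squares quadratic, estimates σ from the residuals and rejects the clusters in the lower 5% tail of *N*(0, σ). Several details are assumptions made for the example: the cluster index is rescaled to [0, 1] before fitting, σ is the root mean square of the residuals, and the test is one-sided. The medoid is taken as the element with the smallest summed distance to the other members. All names are hypothetical.

```python
import math
from statistics import NormalDist


def fit_quadratic(ts, ys):
    """Least-squares coefficients (a, b, c) of y = a + b*t + c*t**2."""
    s = [sum(t ** p for t in ts) for p in range(5)]           # sums of t^0 .. t^4
    r = [sum((t ** p) * y for t, y in zip(ts, ys)) for p in range(3)]
    # Augmented normal equations, solved by Gauss-Jordan elimination with pivoting.
    m = [[s[0], s[1], s[2], r[0]],
         [s[1], s[2], s[3], r[1]],
         [s[2], s[3], s[4], r[2]]]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def significant_clusters(cohesiveness_curve, p=0.05):
    """Indices of the clusters kept after detrending and the one-sided Gaussian test."""
    n = len(cohesiveness_curve)
    if n < 4:
        return list(range(n))  # too few points to fit a quadratic and test residuals
    ts = [i / (n - 1) for i in range(n)]
    a, b, c = fit_quadratic(ts, cohesiveness_curve)
    residuals = [y - (a + b * t + c * t * t) for t, y in zip(ts, cohesiveness_curve)]
    sigma = math.sqrt(sum(res * res for res in residuals) / n)
    if sigma == 0.0:
        return list(range(n))
    threshold = NormalDist(0.0, sigma).inv_cdf(p)  # negative for p < 0.5
    return [i for i, res in enumerate(residuals) if res >= threshold]


def medoid(members, dist):
    """Member with the smallest total distance to all other members of the cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))
```

Subtracting the fitted trend matters because cohesiveness tends to decrease along the extraction order, so a fixed absolute threshold would mainly penalize late clusters rather than clusters that are weak relative to their neighbors.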

#### **2.6. CROSS-SUBJECT CLUSTERING**

In the proposed approach the bundles retrieved for all subjects separately were then clustered together in a second step according to the DS framework. To this purpose, the clusters determined in the first step were replaced by their representative fiber (in our case the medoid), and all datasets were then joined into a single dataset in such a way that the algorithm groups bundles from different datasets while excluding pairs from the same dataset. In this way coherent clusters of bundles, including no more than one representative bundle from each dataset, were generated.

In more detail, given *n* datasets of bundles $D = \{d_1, \ldots, d_n\}$, the extended dataset $\hat{D}$, obtained as the union of the elements in *D*, $\hat{D} = \bigcup_{i=1}^{n} d_i$, is described by an affinity matrix. The graph-based representation was then generated over $\hat{D}$ so as to avoid cliques containing bundles from the same subject. This was obtained by forcing the elements of the same subject to have zero similarity. The set of edges $\hat{E}$ in the graph describing the new dataset $\hat{D}$ is thus defined as:

$$\hat{E}(i,j) = \begin{cases} e^{-\frac{d(v\_i, v\_j)}{\sigma\_{k,h}}} & \text{if } v\_i \in d\_k, v\_j \in d\_h \text{ and } k \neq h \\\\ 0 & \text{otherwise,} \end{cases} \tag{13}$$

where $v_i$ and $v_j$ are different elements in $\hat{D}$, *d*(·,·) is a measure of distance between two elements, and $\sigma_{k,h}$ is a normalization term between datasets *h* and *k*. To obtain a metric *d*(·,·) invariant to the different subject spaces, tracts were projected onto the landmark space, and the landmark distance was used to compare WM structures. The feature vector in the new space was determined according to Equation 8 and a new similarity matrix was built. The resulting weighted adjacency matrix of $\hat{D}$ exhibits a "block shape" in which the main diagonal is composed of blocks of zeros, ensuring that no pair of bundles from the same subject will appear in a cluster. Importantly, within this framework the algorithm can allow for and easily manage differences in the size of individual subject datasets.

**Figure 4** shows an example of a cross-subject affinity matrix, where the diagonal blocks represent the intra-subject similarity, which we set to 0 to force a maximum of one bundle per subject in each cluster. The off-diagonal blocks describe the similarity between centroids of different subjects. We then applied the DS algorithm to the new adjacency matrix, finding similar WM bundles across multiple subjects.

Aiming at finding the most important WM bundles, we selected only the significant clusters containing the maximum number of structures, i.e., one per subject. All clusters with fewer structures than the number of subjects were discarded, even if their internal cohesiveness was high. The analysis could in any case be extended to the clusters that are currently rejected.

It can be proved that, if nodes *m* and *n* belong to the same dataset and their similarity is forced to be $a_{mn} = 0$, the pair cannot be part of the same DS (cluster); each cluster will therefore contain only relationships between different datasets (the ones with positive weights).
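The block-structured affinity of Equation 13 and the selection of clusters spanning all subjects can be sketched as follows. The inputs are assumed to be the landmark-space feature vectors of the bundle representatives together with the subject each one comes from. The text does not define $\sigma_{k,h}$ explicitly; by analogy with Equation 5 the sketch takes it as the largest distance between the two datasets, which is an assumption. Flipped representatives are omitted for brevity, and all names are hypothetical.

```python
import math


def cross_subject_affinity(features, subject_of):
    """Eq. 13: positive affinity only between representatives of different subjects."""
    n = len(features)
    dist = [[math.dist(features[i], features[j]) for j in range(n)] for i in range(n)]
    # Assumption: sigma_{k,h} is the largest distance between datasets k and h.
    sigma = {}
    for i in range(n):
        for j in range(i + 1, n):
            if subject_of[i] != subject_of[j]:
                key = frozenset((subject_of[i], subject_of[j]))
                sigma[key] = max(sigma.get(key, 0.0), dist[i][j])
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if subject_of[i] != subject_of[j]:
                s = sigma[frozenset((subject_of[i], subject_of[j]))]
                affinity[i][j] = math.exp(-dist[i][j] / s) if s > 0.0 else 1.0
    return affinity


def full_clusters(clusters, subject_of):
    """Keep only the clusters that contain a representative from every subject."""
    n_subjects = len(set(subject_of))
    return [c for c in clusters if len({subject_of[i] for i in c}) == n_subjects]
```

When the representatives are ordered by subject, the zero entries form square blocks along the main diagonal, which is the "block shape" described above; subjects contributing different numbers of bundles simply produce blocks of different sizes.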

#### **2.7. MOUSE BRAIN DATASET**

All procedures were carried out in accordance with the European directive 86/609/EEC governing animal welfare and protection, which is acknowledged by the Italian Legislative Decree no. 116, 27 January 1992. The protocol was reviewed and consented to by the animal care committee of the Istituto Italiano di Tecnologia. All surgical procedures were performed under anesthesia.

DTI volumes from 8 *ex vivo* adult male wild-type mouse brains (C57BL/6J, Charles River, Como, Italy), an inbred strain widely used in neuroscience research, were acquired as previously described (Dodero et al., 2013a; Tucci et al., 2014). Sample preparation for *ex vivo* mouse brain imaging has been recently described in great detail (Dodero et al., 2013a; Tucci et al., 2014). Briefly, *ex vivo* high-resolution DTI images were acquired on paraformaldehyde-fixed specimens, and brains were imaged inside intact skulls to avoid post-extraction deformations. Diffusion tensor images (DTI) were acquired with 81 different gradient orientations at a *b*-value of 1262 s/mm<sup>2</sup> (δ = 5 ms, Δ = 10 ms), in-plane spatial resolution of 130 × 130 μm<sup>2</sup>, and slice thickness of 350 μm in the coronal plane, using a 4-shot EPI sequence with *TR* = 5500 ms and *TE* = 26 ms, and 20 averages, for a total acquisition time of 10 h 52 min. For each specimen, 8 co-centered volumes were acquired with no diffusion weighting (*b* = 0). Co-centered T2-weighted images were also acquired with the same resolution as the DTI volumes, using a 2-D fast spin-echo sequence.

Diffusion Tensor Tractography was performed by estimating the axonal fiber projections with the Fiber Assignment by Continuous Tracking (FACT) algorithm (Mori et al., 1999). A Fractional Anisotropy (FA) threshold (0.1) and an angle threshold (35°) were imposed to start and stop tracking. Fibers shorter than 3 mm were filtered out, leading to a set of about 80,000 streamlines. An anatomical atlas of the C57BL/6J mouse brain (Sforazzini et al., 2013) was used to extract the landmarks needed for mapping the WM bundles across subjects. An in-house FA template was used to linearly register the mouse atlas to each subject's space.

#### **2.8. SYNTHETIC DATASET**

Synthetic WM streamlines and the associated DW-MR images were created using the numerical fibers generator software package (Close et al., 2009). Each synthetic dataset is a spherical volume with a fixed radius, composed of a random number of fibers and bundles. We used the volumes released by the authors, and 10 more volumes were generated in order to introduce more variability across datasets, with an average of 41 ± 4 bundles and 870 ± 37 fibers. Since the synthetic dataset does not contain group volumes, it was only used to compare our algorithm with other state-of-the-art methods, i.e., spectral clustering and affinity propagation, on the first step of the process, i.e., subject-wise fiber segmentation.

In particular, to perform a statistically robust comparison, for each of the above volumes we generated several trials by randomly selecting a number of bundles *k* ∈ {5, 10, 15, 20, 25, 30}. This was repeated 5 times for each volume and for each cluster size. The empirical evaluation was therefore performed on a total of 510 random volumes with different numbers of clusters and fibers. We quantitatively evaluated the performance of all methods using common indexes such as completeness and the adjusted Rand index (Moberts et al., 2005).

### **3. RESULTS**

#### **3.1. CLUSTERING ON SYNTHETIC DATASETS**

For each method tested on the synthetic datasets, we identified a set of optimal parameters. Spectral clustering requires a prior definition of the expected number of clusters *k*, which is however unknown in the problem addressed. Hence, to avoid a biased evaluation, the algorithm was run with a number of clusters *k* ranging from 1 to 40, allowing a fair comparison. A similar requirement holds for both DS and affinity propagation. However, for both approaches empirical methods exist to decide proper parameter values required to obtain a number of clusters approximating the ground truth. Once optimal parameters are fixed, both DS and affinity propagation can then automatically find the optimal number of clusters.

**tractographies.** Each color is associated to a cluster of fibers. The two subjects have different color mappings because inter-subject clustering is results different, there is a strong evidence of similarity in the determined structures.

More specifically, affinity propagation requires the definition of the self-responsibility parameter *p*, which, according to common practice, is known to generate a number of clusters near the ground truth if set to $p = \min(a_{ij})$. The DS framework instead requires fixing θ as described in Section 2.4. To evaluate the three methods we used the adjusted Rand index and the completeness index, which are frequently used to evaluate the performance of clustering algorithms (Moberts et al., 2005). Higher completeness means that fibers belonging to the same anatomical bundle are clustered together. The Rand index is defined as the number of agreeing pairs divided by the total number of pairs. If the two partitions agree completely the Rand index returns a value of 1; its lower limit is 0.
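The pair-counting definition of the Rand index, together with its chance-corrected (adjusted) form in the usual Hubert and Arabie formulation, can be written as the following sketch. The two labelings are given as equally long lists with one cluster label per fiber; the function names are illustrative, and the completeness index is not covered here.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which the two partitions agree."""
    n = len(labels_a)
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i in range(n) for j in range(i + 1, n)
    )
    return agree / comb(n, 2)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance, computed from the contingency table."""
    n = len(labels_a)
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:  # degenerate case, e.g. every element in its own cluster
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

Both indexes depend only on which elements are grouped together, not on the label values, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` equals 1. Unlike the plain Rand index, the adjusted version can take negative values when the agreement is lower than expected by chance.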

**Figure 5** shows average results for spectral clustering, DS, and affinity propagation on the various datasets. The figure reports the results over the 6 groups of volumes, with varying numbers of clusters (*k* = 5, 10, 15, 20, 25, 30). The DS algorithm always identifies a slightly greater number of clusters than the ground truth, an aspect that is not necessarily a drawback for WM fiber segmentation. In general, DS and affinity propagation showed consistent output both in terms of number of clusters retrieved and quality of results.

However, the DS algorithm consistently showed higher completeness and adjusted Rand index values. The results of spectral clustering also show that prior knowledge of the exact number of clusters could in principle produce higher performance (black curve). Affinity propagation exhibited performance similar to that of DS, although it suffers from a higher variance in the number of clusters generated. DS, on the contrary, consistently yielded a solution approximating the optimal one and was more stable across all experiments.

Hippocampus, S2 = Somato-Sensory Cortex, Fro = Cerebral cortex: frontal lobe, Crb = Cerebellum, M2 = Motor Cortex.

**FIGURE 8 | (A)** Results of cross-subject clustering on left-hemispheric fibers with magnifications of some relevant bundles. For each significant bundle we show four random subjects. **(B)** Results of cross-subject clustering on right-hemispheric fibers with magnifications of some relevant bundles. For each significant bundle we show four random subjects. Hp = Hippocampus, S1 = Somato-Sensory Cortex, M2 = Motor Cortex, NACB = Nucleus Accumbens, Pir = Piriform Cortex, Rhinal = Rhinal Cortex, OFC = Orbitofrontal Cortex.

#### **3.2. CLUSTERING ON REAL DATASET**

The proposed approach was also tested on a real dataset.

**Figure 6** shows two examples of qualitative results of intra-subject clustering applied to two mouse tractographies. We obtained a different parcellation scheme for each subject and, at this level, colors do not represent associations between subjects.

**Figure 7** shows some examples of common inter-hemispheric WM bundles in 4 representative subjects (i.e., dorsal hippocampal commissure, hippocampal commissure, forceps minor, corpus callosum, and posterior commissure). Using the above restriction, the algorithm was able to match 70 cross-subject bundles, with significant inter-hemispheric commissures of multiple subjects clustered together. Despite the intrinsic variability of tractography across subjects, the algorithm automatically clustered bundles from different subjects.

**Figure 8** shows the results obtained on the left (A) and right (B) hemispheres, where the algorithm found 70 and 74 common WM bundles, respectively. Although no symmetry constraints were imposed, our method correctly identified inter-hemispheric bundles and preserved symmetry even in the presence of different termination areas characterizing symmetric structures.

## **4. CONCLUSION AND DISCUSSION**

We presented a new method to cluster multiple-subject tractographies and to identify common bundles across subjects for the characterization of WM structure in a population. The proposed solution, based on DS, can be used with diffusion MRI methods that use tractography to generate WM streamlines. We adopted DS clustering to segment single subjects and extended the framework to multiple subjects without resorting to spatial co-registration of the fibers, using instead a landmark-based configuration.

Indeed, projection onto the landmark space, through linear registration of the anatomical atlas to the subject spaces, enables clustering of fibers in the original diffusion space, thus defining common structures across subjects while preserving invariance with respect to the intrinsic variability of each subject.

Clustering in the proposed multiple-subject framework requires different metrics to build the affinity matrices for the single-subject and the cross-subject steps. Several similarity indexes in the space of streamlines were tested, suggesting the use of the symmetrized point to point distance (Equation 3) in the first stage and the landmark distance (Equation 8) in the second stage. We could have used the landmark projections for both steps; however, the symmetrized point to point distance is more robust in the case of small fibers, while the landmark distance, which is an approximation of the real distance between fibers, might fail in these cases. At the single-subject level it is preferable to adopt a distance metric able to capture the bundles characterizing the variability of each subject (O'Donnell and Westin, 2007; Guevara et al., 2011). On the other hand, the choice of the landmark distance is mandatory to cast many subjects into a common space without registering the diffusion data.

We tested the proposed DS clustering on synthetic datasets and compared it with other methods that similarly work with an adjacency matrix between fiber pairs, i.e., spectral clustering and affinity propagation. As mentioned in Section 2.4, we set θ very close to 0 according to our previous work (Dodero et al., 2013b). θ works as a noise reducer and acts on the internal elements of a single cluster. With low values of θ we generally obtained a low number of clusters while preserving high internal similarity. Conversely, higher values yielded over-segmentation, producing many clusters with just a few elements. Moreover, adopting the fiber generator as ground truth and testing the performance of DS, we obtained better values of completeness and adjusted Rand index using θ very close to 0. From these indexes, we observed that our method is more suitable than the other two methods for fiber clustering. Indeed, unlike spectral clustering, our method does not need the number of clusters to be set in advance, and it is more stable than affinity propagation in terms of the number of clusters generated. If the number of clusters is known a priori, spectral clustering works better than DS; however, the segmentation of a whole tractography is an open problem where the number of WM bundles is typically unknown. In this framework, DS performs better than the other algorithms under fair conditions, i.e., with all algorithms generating the same number of clusters.

On the real dataset, our algorithm was able to segment single-subject tractographies, generating anatomically plausible bundles. We did not observe any significant variation of the WM bundles (also in the synthetic dataset) when using different numbers of points to describe the fibers. We therefore used 12 points as suggested in Garyfallidis et al. (2012). According to DS theory, the last clusters are always meaningless and can be considered outliers. The choice to discard the last 5% of clusters is, however, mostly empirical and based on the data distribution.

In the cross-subject analysis, the number of landmarks has little influence on the matching between subjects. However, very few landmarks do not allow a proper representation of all fibers. On the other hand, too many landmarks, while allowing a nearly perfect fiber representation, induce an increased computational complexity. Our choice regarding the number of landmarks represents a good trade-off, since the landmarks cover all the cortical brain regions, which represent the start and end areas of the physical connections, while remaining computationally manageable.

The algorithm was able to group coherent WM bundles of different subjects in their own space while preserving the symmetry of structures. Interestingly, this was obtained in the presence of different shapes across subjects, demonstrating the robustness of the method. In principle, our approach enables the characterization of a population with significant bundles and could be applied to human datasets to build an atlas of WM bundles for clinical applications.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 September 2014; accepted: 08 December 2014; published online: 12 January 2015.*

*Citation: Dodero L, Vascon S, Murino V, Bifone A, Gozzi A and Sona D (2015) Automated multi-subject fiber clustering of mouse brain using dominant sets. Front. Neuroinform. 8:87. doi: 10.3389/fninf.2014.00087*

*This article was submitted to the journal Frontiers in Neuroinformatics.*

*Copyright © 2015 Dodero, Vascon, Murino, Bifone, Gozzi and Sona. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Functional connectivity-based parcellation and connectome of cortical midline structures in the mouse: a perfusion autoradiography study

## *Daniel P. Holschneider<sup>1,2</sup>\*, Zhuo Wang<sup>1</sup> and Raina D. Pang<sup>1</sup>*

*<sup>1</sup> Department of Psychiatry and Behavioral Sciences, University of Southern California, Los Angeles, CA, USA*

*<sup>2</sup> Departments of Neurology, Cell and Neurobiology, Biomedical Engineering, University of Southern California, Los Angeles, CA, USA*

#### *Edited by:*

*Mihail Bota, University of Southern California, USA*

#### *Reviewed by:*

*Trygve B. Leergaard, University of Oslo, Norway*

*Lucina Q. Uddin, University of Miami, USA*

#### *\*Correspondence:*

*Daniel P. Holschneider, Department of Psychiatry and Behavioral Sciences, University of Southern California, 1975 Zonal Ave., KAM 400, MC9037, Los Angeles, CA 90089-9037, USA e-mail: holschne@usc.edu*

Rodent cortical midline structures (CMS) are involved in emotional, cognitive and attentional processes. Tract tracing has revealed complex patterns of structural connectivity demonstrating connectivity-based integration and segregation for the prelimbic, cingulate area 1, retrosplenial dysgranular cortices dorsally, and infralimbic, cingulate area 2, and retrosplenial granular cortices ventrally. Understanding of CMS functional connectivity (FC) remains more limited. Here we present the first subregion-level FC analysis of the mouse CMS, and assess whether fear results in state-dependent FC changes analogous to what has been reported in humans. Brain mapping using [14C]-iodoantipyrine was performed in mice during auditory-cued fear conditioned recall and in controls. Regional cerebral blood flow (CBF) was analyzed in 3-D images reconstructed from brain autoradiographs. Regions-of-interest were selected along the CMS anterior-posterior and dorsal-ventral axes. In controls, pairwise correlation and graph theoretical analyses showed strong FC within each CMS structure, strong FC along the dorsal-ventral axis, with segregation of anterior from posterior structures. Seed correlation showed FC of anterior regions to limbic/paralimbic areas, and FC of posterior regions to sensory areas–findings consistent with functional segregation noted in humans. Fear recall increased FC between the cingulate and retrosplenial cortices, but decreased FC between dorsal and ventral structures. In agreement with reports in humans, fear recall broadened FC of anterior structures to the amygdala and to somatosensory areas, suggesting integration and processing of both limbic and sensory information. 
Organizational principles learned from animal models at the mesoscopic level (brain regions and pathways) will not only critically inform future work at the microscopic (single neurons and synapses) level, but also have translational value to advance our understanding of human brain architecture.

**Keywords: prelimbic cortex, infralimbic cortex, cingulate cortex, retrosplenial cortex, fear, functional connectome**

## **INTRODUCTION**

The importance of building brain connectomes to help understand brain structure and function has received increasing attention (Sporns et al., 2005). Multiple projects are underway to construct structural connectomes for the rodent (Bota et al., 2012) (see also the Mouse Connectome Project, http://www.mouseconnectome.org; and the mouse connectivity database in the Allen Brain Atlas, http://connectivity.brain-map.org) and human brain (Van Essen et al., 2012). In comparison, construction of a brain *functional* connectome is at a far less advanced stage. Current efforts for a human functional connectome are focused on the resting state only (e.g., the 1000 Functional Connectomes Project, http://fcon\_1000.projects.nitrc.org). It is important to note that brain functional connectivity (FC) is dynamic and state-dependent. For example, performing a memory task, processing language information, being aroused emotionally, and listening to music can all elicit distinct patterns of FC, each of which can be considered a manifestation of the functional connectome. Furthermore, even for the same type of task, FC patterns may vary with a change of parameters. Therefore, whereas a completed structural connectome consists of definitive information of a finite number of projections among brain structures, the number of state-dependent FC pathways in a functional connectome greatly outnumbers that in the structural connectome. Although highly challenging, the time is ripe for the design and construction of functional connectomes based on the neuroinformatic tools developed for structural connectomes and the large volume of functional brain imaging data. Functional connectomes would allow comparison within and across experimental paradigms to refine current theories and to derive new theories about how the brain works at the circuit level.

Given the inherent complexity of brain functional connectomes, one approach is to compartmentalize the task and focus on a limited set of brain structures first. This approach has been taken in constructing a structural connectome for the retrosplenial cortex (Sugar et al., 2011) and the amygdala (Schmitt et al., 2012) in rodents. We propose here to choose the cortical midline structures (CMS) to start building a functional connectome for the rodent brain. We employ the term "functional connectome" to refer to a description of the functional relationships between subregions of the CMS, and the term "functional connectivity" to denote the symmetrical statistical association or dependency between individual brain regions (Bullmore and Sporns, 2009).

The CMS structures have attracted great research interest, both individually and as a whole. In rodents, from anterior to posterior, the CMS includes the prelimbic (PrL), cingulate area 1 (Cg1) and retrosplenial dysgranular (RSD) cortices dorsally, and medial orbital (MO), infralimbic (IL), cingulate area 2 (Cg2), and retrosplenial granular (RSG) cortices ventrally. Tract tracing studies have shown strong and reciprocal inter-regional anatomic projections and have suggested connectivity-based integration and segregation (Jones et al., 2005). Lesion and neurochemical mapping has also demonstrated *functional* integration and segregation (Vogt et al., 2013). The CMS structures are involved in a broad range of emotional, cognitive, attentional, and physiologic processes. Importantly, human neuroimaging findings in the past two decades show that the CMS is a central component of the default mode network, a network of structurally and functionally connected brain regions showing the highest metabolic level in the brain in the resting-state, but decreased metabolic rate when the brain is engaged in a task (Raichle et al., 2001). The CMS also contains candidate hubs such as the posterior cingulate cortex and medial prefrontal cortex with the highest level of FC of the resting-state network of the brain (Deco et al., 2011; Andrews-Hanna, 2012). This unique and central role of the CMS in the brain at rest further underscores the importance of better understanding of the functional organization within the CMS and between the CMS and other brain areas.

Human studies to date have just begun to systematically map FC at the subregion-level within the cingulate gyrus (Margulies et al., 2007; Habas, 2010; Yu et al., 2011). Brain FC has also been examined in rodents, including mice (Sif et al., 1989; Lee et al., 2009; Jonckers et al., 2011; White et al., 2011). Prior FC analysis of the CMS has typically selected a single seed region-of-interest (ROI) to represent an entire structure. Such an approach could mask subregion-level functional segregation as suggested by structural connectivity data. In the present study, we provide a more comprehensive examination of subregional FC within the anterior-posterior and dorsal-ventral axes of the CMS of the mouse. Furthermore, we sought to evaluate whether FC patterns of the CMS reported during fear in humans would parallel those observed in the mouse. The CMS in the mouse have been proposed to model many of the cytoarchitectural and receptor binding characteristics of the human CMS (Vogt et al., 2013); however, little is known about their FC. Finally, because most imaging studies in mice have been performed in anesthetized animals, and because anesthesia can impact FC (Nallasamy and Tsao, 2011), we performed cerebral perfusion mapping in the current study in awake, nonrestrained mice. In our study, we applied perfusion mapping with autoradiographic methods, with FC calculated at a single time point using inter-subject, region-of-interest correlation analysis. This approach is similar to FC analyses performed in positron emission tomography (PET) data, but differs from the time-series correlation typically used in functional magnetic resonance imaging (fMRI). As such, our analysis precludes evaluation of the dynamics of functional brain activation.

## **METHODS**

#### **ANIMALS**

Male C57BL/6 mice were bred at the university vivarium from pairs obtained from Taconic (Taconic, Hudson, NY, USA). Mice had been backcrossed onto a C57BL/6 background for greater than 15 generations from an original mixed background [129/P1ReJ (ES cells), C57BL/6J and CD-1] (Bengel et al., 1998). Male mice were weaned at 3 weeks of age, housed in groups of 3–4 on a 12 h light/ 12 h dark cycle (lights on at 0600) until 3 months of age with direct contact bedding and free access to rodent chow (NIH #31M diet) and water. At the start of experimentation, animals were individually housed. All testing was conducted during the light phase of the light/dark cycle (0930–1430). All experimental protocols were approved by the Institutional Animal Care and Use Committee of the University of Southern California. Behavioral data and data of regional brain activation have been previously reported (Pang et al., 2011).

### **FUNCTIONAL BRAIN MAPPING**

#### *Surgery*

Animals were anesthetized with isoflurane (2.0%). The ventral skin of the neck was aseptically prepared and the right external jugular vein was catheterized with a 1-French silastic catheter (SAI Infusion Technologies, Chicago, IL, USA), which was advanced 1 cm to the superior vena cava. The catheter was externalized through subcutaneous space to a dorsal percutaneous port. The catheter was filled with 0.01 mL Taurolidine-Citrate lock solution (SAI Infusion Technologies) and was closed with a stainless steel plug.

#### *Conditioned fear- training phase*

Fear conditioning experiments were conducted as previously described (Pang et al., 2011) at 3 days post-surgery. Animals were habituated to the experimental room for 30 min in their home cages. Thereafter, mice were placed in a Plexiglas box (22.5 × 21 × 18 cm) with a floor of stainless steel rods. The chamber was illuminated with indirect ambient fluorescent light from a ceiling panel (930 lx) and was subjected to background ambient sound (65 dB). After a 2-min baseline, the animals were presented a tone six times (30-s duration, 70 dB, 1000 Hz/8000 Hz continuous, alternating sequence of 250-ms pulses). Each tone was separated by a 1-min quiet period. In the conditioned fear group (body weight = 26 ± 0.5 g, age = 12.8 ± 0.3 wks, *n* = 13) each tone was immediately followed by a foot shock (0.5 mA, 1 s). Control animals (body weight = 26 ± 0.3 g, age = 12.4 ± 0.2 wks, *n* = 11) received identical exposure to the tone but without the foot shock. One minute following the final tone, mice were returned to their home cages.

#### *Functional brain mapping during conditioned fear recall*

Twenty-four hours after the training session, animals were placed in the experimental room for 1 h in their home cages. Thereafter, the animal's percutaneous cannula was connected to a tethered catheter containing the perfusion radiotracer ([14C] iodoantipyrine, 325 µCi/kg bodyweight in 0.18 mL of 0.9% saline, American Radiolabelled Chemicals, St. Louis, MO, USA) and a syringe containing a euthanasia solution (50 mg/kg pentobarbital, 3M KCl). Animals were allowed to rest in a transit cage for 10 min prior to exposure to a novel behavioral cage (a cylindrical Plexiglas cage with a flat Plexiglas floor, dimly lit at 300 lx). Fear-conditioned and control animals received a 2-min exposure to the behavioral cage followed by a 1-min continuous exposure to the conditioned tone. One minute after the start of the tone exposure, the radiotracer was injected intravenously at 1 mL/min using a mechanical infusion pump (Harvard Apparatus, Holliston, MA, USA), followed immediately by injection of the euthanasia solution. This resulted in cardiac arrest within 5–10 s, a precipitous fall of arterial blood pressure, termination of brain perfusion, and death. Brains were rapidly removed and flash frozen in methylbutane over dry ice.

## *Autoradiography*

Brains were sliced in a cryostat at −20°C into 20-µm coronal sections, with an inter-slice spacing of 140 µm. Slices were heat dried on glass slides and exposed to Kodak Ektascan diagnostic film (Eastman Kodak, Rochester, NY, USA) for 14 days at room temperature along with 12 [14C] standards (Amersham Biosciences, Piscataway, NJ, USA). Autoradiographs were then digitized on an 8-bit gray scale using a voltage stabilized light box (Northern Lights Illuminator, InterFocus Ltd., Linton, England) and a Retiga 4000R charge-coupled device monochrome camera (QImaging, Surrey, Canada). Cerebral blood flow (CBF) related tissue radioactivity was measured by the classic [14C] iodoantipyrine method and used as a proxy measure of neuronal activation. In this method, there is a strict linear proportionality between tissue radioactivity and CBF when the data are captured within a brief interval (∼10 s) after the tracer injection (Van Uitert and Levy, 1978; Jones et al., 1991).

### *Image preprocessing*

Three-dimensional (3D) reconstruction has been described before (Nguyen et al., 2004). In short, regional CBF (rCBF) was analyzed on a whole-brain basis using statistical parametric mapping (SPM, version SPM5, Wellcome Center for Neuroimaging, University College London, London, UK). SPM, a software package developed for the analysis of human neuroimaging data (Friston et al., 1991), has recently been adapted by us and others for use in rodent brain autoradiographs (Nguyen et al., 2004; Lee et al., 2005; Dubois et al., 2008). A 3D reconstruction of each animal's brain was conducted using 69 serial coronal sections (starting at bregma +2.98 mm) with a voxel size of 40 × 140 × 40 µm. Adjacent sections were aligned both manually and using TurboReg, an automated pixel-based registration algorithm (Thevenaz et al., 1998). After 3D reconstruction, all brains were smoothed with a Gaussian kernel (FWHM = 120 × 420 × 120 µm). The smoothed brains from all groups were then spatially normalized to a smoothed reference brain (one "artifact free" brain). Following spatial normalization, normalized images were averaged to create a mean image, which was then smoothed to create the smoothed template. Each smoothed original 3D reconstructed brain was then spatially normalized into the standard space defined by the smoothed template (Nguyen et al., 2004). Voxels for each brain failing to reach a specified threshold in optical density (80% of the mean voxel value) were masked out to eliminate the background and ventricular spaces without masking gray or white matter. To account for any global differences in the absolute amount of radiotracer delivered to the brain, adjustments were made by the SPM software in each animal by scaling the voxel intensities so that the mean intensity for each brain was the same (proportional scaling).
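The masking and proportional scaling steps described above can be illustrated with a minimal Python sketch. It is not the SPM implementation: the function name `mask_and_scale`, the target mean of 100, the choice to compute the scaling factor over in-mask voxels only, and the toy voxel values are all assumptions made for illustration.

```python
from statistics import fmean

def mask_and_scale(voxels, threshold_fraction=0.8, target_mean=100.0):
    """Mask low-intensity voxels, then rescale the rest to a common mean.

    voxels: flat list of optical-density values for one brain.
    Masked voxels are returned as None. The target mean (100) is an
    arbitrary, hypothetical choice; only its equality across brains matters.
    """
    cutoff = threshold_fraction * fmean(voxels)   # 80% of the mean voxel value
    kept = [v for v in voxels if v >= cutoff]     # voxels surviving the mask
    scale = target_mean / fmean(kept)             # proportional scaling factor
    return [v * scale if v >= cutoff else None for v in voxels]

# Two made-up brains with the same spatial pattern but twice the tracer dose.
brain_a = [10.0, 50.0, 60.0, 80.0]
brain_b = [20.0, 100.0, 120.0, 160.0]
scaled_a = mask_and_scale(brain_a)
scaled_b = mask_and_scale(brain_b)
```

After scaling, the two toy brains become numerically identical within the mask, which is the point of the adjustment: a global difference in delivered radiotracer no longer contributes to regional differences.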

#### **PAIRWISE INTER-REGIONAL CORRELATION ANALYSIS**

Anatomical regions of interest (ROIs) in the CMS were sampled with a manually drawn circular ROI defined in MRIcro (version 1.40, http://cnl.web.arizona.edu/mricro.htm) on the template brain (**Figure 1**). ROI location was decided according to the anatomic parcellation defined in the Franklin and Paxinos mouse brain atlas (Franklin and Paxinos, 2007), and using the central sulcus, cortical surface, and corpus callosum as primary landmarks. Thirty-seven circular ROIs (100 µm in diameter) were selected bilaterally in 37 coronal slices (bregma +2.56 mm to −2.48 mm, 140-µm inter-slice distance, one ROI on each slice) across the dorsal structures of the CMS (PrL, Cg1, RSD). An additional 37 bilateral ROIs were selected in ventral structures (MO, IL, Cg2, RSG). Since two ROIs were selected at each bregma level, the most anterior part of Cg1 overlapping with PrL, and the most posterior part of IL overlapping Cg2 were not included in the analysis. Mean optical density of each ROI was extracted for each animal using the Marsbar toolbox for SPM (version 0.42, http://marsbar.sourceforge.net/). A pairwise inter-regional correlation matrix was calculated across animals for each group in Matlab (version 6.5.1, The MathWorks Inc., Natick, MA, USA). The matrices were visualized as heatmaps with Z-scores of Pearson's correlation coefficients color-coded. Statistical significance of between-group difference of a correlation coefficient was evaluated using the Fisher's Z-transform test (*P* < 0.05).
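As an illustration of this analysis, the following Python sketch computes an inter-regional Pearson correlation matrix across animals and compares two independent correlation coefficients with the standard Fisher Z-transform test. The original analysis was run in Matlab; the function names and the toy ROI values here are hypothetical, and the test statistic is the textbook form, assumed rather than confirmed to match the authors' exact computation.

```python
import math
from statistics import NormalDist, fmean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(roi_values):
    """roi_values[i] holds the mean optical density of ROI i, one value per animal."""
    n = len(roi_values)
    return [[pearson_r(roi_values[i], roi_values[j]) for j in range(n)]
            for i in range(n)]

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher Z-transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Made-up ROI values for four animals and three ROIs.
toy_rois = [[1.0, 2.0, 3.0, 4.0],
            [2.0, 4.1, 5.9, 8.0],
            [4.0, 3.0, 2.0, 1.0]]
matrix = correlation_matrix(toy_rois)

# Hypothetical coefficients for one ROI pair, with the group sizes used in
# this study (fear-conditioned n = 13, control n = 11).
z_stat, p_value = fisher_z_test(0.2, 13, 0.9, 11)
```

With this sign convention, a negative statistic means the correlation is smaller in the first group passed to `fisher_z_test`.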

### **GRAPH THEORETICAL ANALYSIS**

In graph theory, a network is defined as a set of nodes or vertices and the edges or lines between them (Bullmore and Sporns, 2009). Analysis was performed on networks defined by the above correlation matrices in the Pajek software (version 3.12, http://vlado.fmf.uni-lj.si/pub/networks/Pajek/) (De Nooy et al., 2011). Each ROI was represented by a vertex (node) in a graph, and two vertices with significant correlation (positive or negative) were linked with an edge. We used cluster analysis to delineate the organization of the CMS network. Hierarchical clustering based on dissimilarity was calculated in Pajek using the d1 dissimilarity index, which quantifies the difference in FC profile between each pair of ROIs. Results were visualized as dendrograms. In addition, a Kamada-Kawai algorithm was implemented to arrange the graph such that strongly connected regions were placed closer to each other, while weakly connected regions were placed further apart. The "energized" graphs further facilitated visualization and identification of the organizational characteristics of the CMS network.
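The construction of the network and the idea behind a profile-based dissimilarity can be sketched in a few lines of Python. This is an illustration only and does not reproduce Pajek: the dissimilarity below is a neighbor-set symmetric difference in the spirit of the d1 index, and its normalization (sum of the two largest degrees, with each node counted in its own neighborhood) is an assumption. The toy correlation matrix is made up, and 0.602 is the tabulated two-tailed critical value of Pearson's r at P < 0.05 for n = 11.

```python
def build_graph(corr, r_crit):
    """Link ROIs i and j when |r| reaches the critical value (either sign)."""
    n = len(corr)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i][j]) >= r_crit:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def neighbor_dissimilarity(adj, u, v):
    """Difference in connectivity profile between nodes u and v.

    Counts the nodes that are connected to exactly one of u and v, with each
    node included in its own neighborhood, and divides by the sum of the two
    largest degrees in the graph (assumed normalization).
    """
    degrees = sorted((len(nb) for nb in adj.values()), reverse=True)
    norm = degrees[0] + degrees[1]
    if norm == 0:
        return 0.0
    diff = (adj[u] | {u}) ^ (adj[v] | {v})
    return len(diff) / norm

# Made-up correlation matrix for four ROIs.
toy_corr = [[1.0, 0.9, 0.8, 0.1],
            [0.9, 1.0, 0.7, -0.2],
            [0.8, 0.7, 1.0, -0.7],
            [0.1, -0.2, -0.7, 1.0]]
graph = build_graph(toy_corr, r_crit=0.602)
degree = {node: len(nb) for node, nb in graph.items()}  # degree centrality
d_01 = neighbor_dissimilarity(graph, 0, 1)
d_03 = neighbor_dissimilarity(graph, 0, 3)
```

In the toy graph, nodes 0 and 1 share the same connectivity profile and so have zero dissimilarity, whereas node 3, linked only through a negative correlation with node 2, is dissimilar to node 0. A matrix of such pairwise values is the kind of input a hierarchical clustering routine would turn into a dendrogram.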

#### **SEED CORRELATION ANALYSIS**

To evaluate functional segregation within the CMS, as well as to test the hypothesis that fear conditioning may have resulted in altered CMS functional connectivity profile, we applied seed-ROI correlation analysis. Unilateral seed ROIs—PrL (bregma +2.28 to +2.0 mm), IL (bregma +2.1 to +1.86 mm), Cg1 (bregma +0.88 to +0.6 mm), Cg2 (bregma +0.88 to +0.6 mm), RSD (bregma −0.94 to −1.22 mm), RSG (bregma −0.94 to −1.22 mm)—were hand drawn for the right hemisphere over the template brain as described above. Mean optical density of the seed ROIs was extracted for each animal. Correlation analysis was performed in SPM for each group using the seed values as a covariate. Threshold for significance was set at *P* < 0.05 at the voxel level and an extent threshold of 100 contiguous voxels. Regions showing significant correlations in rCBF with the ROI were considered functionally connected with the ROI.
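The logic of combining a voxel-level threshold with an extent threshold can be sketched as follows. This is a simplified stand-in rather than the SPM procedure: voxels are arranged along one dimension so that "contiguous" means adjacent in a list, the voxel-level test is a plain Pearson correlation against a critical value, and all names and data are hypothetical. The value 0.878 is the tabulated two-tailed critical r at P < 0.05 for the five toy animals.

```python
import math
from statistics import fmean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def seed_map(seed, voxels):
    """Correlate the seed value (one per animal) with every voxel across animals."""
    return [pearson_r(seed, v) for v in voxels]

def extent_filter(flags, min_extent):
    """Keep only runs of significant voxels at least min_extent long.

    One-dimensional simplification of a 3-D cluster extent threshold.
    """
    out = [False] * len(flags)
    start = None
    for k, flag in enumerate(flags + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = k
        elif not flag and start is not None:
            if k - start >= min_extent:
                out[start:k] = [True] * (k - start)
            start = None
    return out

# Made-up data: five animals, four voxels in a row.
seed = [1.0, 2.0, 3.0, 4.0, 5.0]
voxels = [[1.1, 2.0, 2.9, 4.2, 5.0],
          [2.0, 4.0, 6.0, 8.0, 10.5],
          [5.0, 1.0, 4.0, 2.0, 3.0],
          [1.0, 2.0, 3.0, 4.0, 5.5]]
r_map = seed_map(seed, voxels)
significant = [abs(r) >= 0.878 for r in r_map]
connected = extent_filter(significant, min_extent=2)
```

In the toy example the last voxel passes the voxel-level threshold but is isolated, so it is discarded by the extent threshold, which is how small spurious clusters are excluded.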

## **RESULTS**

#### **PAIRWISE INTER-REGIONAL CORRELATION ANALYSIS**

Control mice when exposed to the neutral tone showed strong intra-regional FC in all CMS structures (**Figure 2A**, along the upper-left to lower-right diagonal). Along the anterior-posterior axis, short- to mid-range inter-regional FC connected neighboring structures. In particular, the anterior part of the Cg was functionally connected rostrally with PrL, MO, and IL, whereas the posterior part of the Cg was connected caudally with RSD and RSG. Long-range FC was missing, leaving the anterior (PrL, MO, IL) and posterior part (RSD, RSG) of CMS functionally disconnected. In contrast, along the dorsal-ventral axis, strong FC connected dorsal and ventral structures (PrL↔MO, PrL↔IL, Cg1↔Cg2, RSD↔RSG; **Figure 2A**, along the lower-left to upper-right diagonal).

Fear recall induced significant changes in the pattern of FC in the CMS (**Figures 2B,C**). Fear-conditioned mice compared to controls demonstrated significantly decreased FC along the dorsal-ventral axis (decreases in RSD↔RSG, posterior Cg1↔posterior Cg2, posterior Cg1↔RSG, RSD↔posterior Cg2). Increases in FC were noted primarily between Cg1 and RSD, and between Cg2 and RSG, resulting in a dorsal and a ventral cingulate-retrosplenial cluster with almost complete connections within each cluster. Results showed a similar pattern when examined separately in the left or the right hemisphere (data not shown).

#### **FIGURE 2 | Continued**

across the black diagonal line from upper-left to lower-right. Significant correlations (*P* < 0.05) are marked with white dots. **(C)** Statistical comparison of correlation coefficients between the fear-conditioned and the control group. The matrix of Fisher's *Z*-statistics represents differences in Pearson's correlation coefficients (*r*). Positive *Z*-values indicate greater *r* in the fear-conditioned group, while negative *Z*-values indicate smaller *r*. Significant between-group differences (*P* < 0.05) are marked with white dots. Numbers along the axes denote the anterior-posterior position in mm relative to the bregma. Black rectangles along the vertical axis in **(A)** denote anterior-posterior location of regions of interest used in the seed correlation analysis. Abbreviations: Cg1, cingulate cortex area 1; Cg2, cingulate cortex area 2; IL, infralimbic cortex; MO, medial orbital cortex; PrL, prelimbic cortex; RSD, retrosplenial dysgranular cortex; RSG, retrosplenial granular cortex.

## **GRAPH THEORETICAL ANALYSIS**

The dendrograms in **Figure 3** show hierarchical clustering of the CMS functional network in control and fear-conditioned mice. Dissimilarity between two ROIs is represented by the horizontal distance to their nearest joining point. The control mice showed two main clusters: one anterior (PrL, MO, IL, anterior Cg1, and Cg2), the other posterior (RSD, RSG, posterior Cg1, and Cg2). The anterior cluster could be further divided into an anterior CMS group (PrL, MO, IL) and an anterior Cg group. The posterior cluster could also be divided into two groups, the first included posterior Cg and anterior RSG, while the second included RSD and posterior RSG. In contrast, fear-conditioned mice showed three main clusters: the anterior CMS (PrL, MO, IL), posterior dorsal CMS (Cg1, RSD), and posterior ventral CMS (Cg2, RSG).

Organization of the CMS functional networks was further characterized with energized graphs (**Figure 4**). In the control mice, the relative location of ROIs along the anterior-posterior axis was largely preserved topologically in the functional network (**Figure 4A**). The cingulate cortices connected anterior CMS (PrL, MO, IL) and posterior CMS (RSD, RSG). The two most posterior RSG ROIs were connected directly to the anterior CMS through negative FC. Consistent with the cluster analysis results (**Figure 3B**), the fear-conditioned mice showed the same three clusters (**Figure 4B**). The posterior RSG was connected through negative FC to the ventral aspect of anterior CMS (IL, MO).

Functional segregation of the anterior CMS (PrL, MO, IL), particularly in the fear-conditioned mice, was further visualized in **Figure 5**. Also clearly shown was the functional disconnection between the posterior dorsal (Cg1, RSD) and posterior ventral (Cg2, RSG) aspects of the CMS in fear-conditioned mice.

### **SEED ANALYSIS**

Seed FC analysis in control animals revealed a functional segregation such that anterior CMS (PrL and IL) showed a preferential positive connectivity to limbic/paralimbic structures, while posterior-most CMS (RSD, RSG) showed a preferential positive connectivity to sensory structures (**Table 1**, **Figures 6A**, **7**). As a general trend, positive correlations for the anterior CMS would appear as negative or nonsignificant for the retrosplenial cortices. Likewise, positive correlations for the retrosplenial cortices would appear as negative or nonsignificant for the anterior CMS. Thus, for limbic/paralimbic structures, the PrL and IL showed positive correlations with the anterior insula (aIns), lateral and ventral orbital cortices (LO/VO), lateral and medial septa (LS, MS), amygdala (central n., CeA; basolateral n., BL), dorsal and median raphe (DR, MnR), nucleus accumbens (Acb), ventral caudate putamen (vCPu), dorsal hippocampus (dHPC), dentate gyrus (DG), and postsubiculum (PS). These correlations were either negative or nonsignificant for the retrosplenial seeds. Likewise, for sensory structures such as auditory cortex (Au), mid and posterior insula (mIns, pIns), primary somatosensory cortices (barrel field, S1BF; forelimb, S1FL; hindlimb, S1HL), secondary somatosensory cortex (S2), parietal association cortex (PtA), perirhinal and piriform cortices (PRh, Pir), visual cortices (V1, V2), the sensory thalamus (dorsal lateral geniculate, DLG; lateral dorsal, LD; medial geniculate, MG; ventral posterior lateral/ventral posterior medial, VPL/VPM), anterior pretectal area (APT), inferior and superior colliculi (IC, SC), correlations were positive for RSD and RSG, but negative or nonsignificant for PrL and IL. While functional segregation was clearly noted along the anterior-posterior axis, no distinct segregation was noted along the dorsal-ventral axis. 
The cingulate (Cg1, Cg2) and retrosplenial cortices (RSD, RSG) showed overlapping FC patterns to the sensory areas, whereas the cingulate and the anterior CMS (PrL, IL) overlapped in their FC to the limbic/paralimbic areas. Of note, FC to primary and secondary motor cortices (M1, M2) showed significant positive correlations for all CMS seeds examined.

While the anterior-posterior functional segregation was in general preserved in fear-conditioned mice, FC of the CMS to the limbic/paralimbic and sensory areas showed substantial changes (**Table 2**, **Figures 6B**, **7**). In particular, fear conditioned recall broadened the FC of both PrL and IL to the amygdala, with newly emerged FC to the basomedial (BM) and lateral nuclei (La). Fear recall also changed the correlation of PrL with the medial amygdalar nucleus (PrL↔MeA) from negative to positive, while increasing the positive correlation of PrL↔CeA. The PrL and IL also showed new FC to the sensory and lateral entorhinal cortices (PrL, IL↔Au, pIns, PRh, S1BF, S1HL, LEnt, **Figure 6B**). Fear recall induced functional segregation along the dorsal-ventral axis, particularly in the retrosplenial cortices. Fear-conditioned mice compared to controls showed a loss of FC for RSG, but not RSD, to some of the sensory cortices (PRh, S1HL, S1HL, V1, V2, **Figure 6B**) and the sensory thalamus (Po, VPL/VPM). Whereas RSG became more broadly connected with the limbic/paralimbic areas (new FC with Acb, LH, LS, MS), RSD lost all its limbic/paralimbic FC.

## **DISCUSSION**

To the best of our knowledge this is the first subregion-level functional connectivity analysis of the mouse cerebral midline structures. Our main findings include FC patterns among CMS structures and between CMS and the rest of the brain, as well as the impact of conditioned fear recall on these FC patterns.

### **FUNCTIONAL CONNECTIVITY AMONG THE CORTICAL MIDLINE STRUCTURES**

In the control mice, pairwise correlation analysis showed strong intra-regional FC within each CMS structure (PrL, Cg1, RSD, IL, Cg2, and RSG) and strong inter-regional FC between contiguous structures along the anterior-posterior (PrL/MO/IL↔anterior Cg1/Cg2, posterior Cg1/Cg2↔RSD/RSG), as well as the dorsal-ventral axis (PrL↔MO/IL, Cg1↔Cg2, RSD↔RSG). In addition, the anterior (PrL, MO, IL) and posterior (RSD, RSG) aspects of the CMS were functionally segregated. The fear-conditioned mice showed substantial functional reorganization. While the intra-regional FC was largely preserved, the anterior aspect of the CMS (PrL/MO/IL) became more segregated with the loss of most of its FC with the anterior Cg1/Cg2. Whereas FC was preserved for PrL↔MO/IL along the dorsal-ventral axis, FC was greatly reduced for Cg1↔Cg2, RSD↔RSG, posterior Cg1↔RSG and RSD↔posterior Cg2. In addition, FC was significantly enhanced along the anterior-posterior axis for Cg1↔RSD and Cg2↔RSG.

Graph theoretical analysis underscored these findings regarding functional integration and segregation of the CMS network. In the control mice, the CMS network showed remarkable topological organization along the anterior-posterior axis, with the mid CMS (Cg1/Cg2) connecting rostrally with the anterior CMS (PrL/MO/IL) and caudally with the posterior CMS (RSD/RSG). In contrast, the fear-conditioned mice showed increased functional integration dorsally between Cg1 and RSD, and ventrally between Cg2 and RSG, and increased functional segregation of the network into three clusters: the anterior (PrL/MO/IL), dorsal posterior (Cg1/RSD) and ventral posterior (Cg2/RSG) aspects of the CMS. It is important to note that in many cases, the purely data-driven graph theoretical analysis (**Figures 3**, **4**) was able to segregate functional subnetworks in ways consistent with the underlying anatomic structure.

The structural connectivity across the CMS has been well documented (Jones et al., 2005; Vogt et al., 2013; Vogt and Paxinos, 2014). In the rat, Jones et al. (2005) reported reciprocal structural connections along the dorsal-ventral axis: between the PrL and IL, between the middle one third of the dorsal anterior cingulate (Cg1) and ventral anterior cingulate cortices (Cg2), and between dorsal (RSD) and ventral retrosplenial cortices (RSG). Along the anterior-posterior axis, the PrL provides axonal projections to the anterior part of Cg1 and Cg2, whereas the PrL/IL and anterior Cg1/Cg2 are largely disconnected from the other CMS structures. The posterior one third of Cg1 is reciprocally connected with the anterior aspect of both dorsal and ventral retrosplenial cortices (RSD/RSG) and receives projection from the posterior RSD/RSG. The posterior one third of Cg2 receives projection from the anterior RSD/RSG. Our FC results in the control mice concurred remarkably with these patterns of structural connectivity. Qualitatively similar patterns of structural connectivity among the CMS can be found in tract tracing data published online by the Mouse Connectome Project (http://www.mouseconnectome.org/), and the mouse connectivity database in the Allen Brain Atlas (http://connectivity.brain-map.org/).

#### **FIGURE 4 | Continued**

**cortical midline structures. (A)** In the control mice, relative location of regions of interest (ROIs) along the anterior-posterior axis was largely preserved topologically in the functional network. The cingulate cluster, which is circled and highlighted in yellow, connected anterior and posterior aspects of the cortical midline structures (CMS). **(B)** The fear-conditioned mice

are circled including a posterior dorsal and a posterior ventral cluster. The functional connectivity networks are represented with graphs, in which nodes (vertices) represent regions of interest (ROIs) and edges represent significant correlations. Solid red lines denote significant positive correlations, whereas dashed blue lines denote significant negative correlations. The graphs were energized using the Kamada**–**Kawai algorithm that placed strongly correlated nodes closer to each other while keeping weakly correlated nodes further apart. The size of each node (in area) is proportional to its degree centrality, a measurement of the number of connections linking the node to other nodes in the network. Abbreviations: Cg1, cingulate cortex area 1; Cg2, cingulate cortex area 2; IL, infralimbic cortex; MO, medial orbital cortex; PrL, prelimbic cortex; RSD, retrosplenial dysgranular cortex; RSG, retrosplenial granular cortex. The index numbers denote sequence of ROIs along the anterior-posterior axis, such that #1 denotes the most anterior ROI at 2.56 mm anterior to the bregma, and #37 the most posterior ROI at 2.48 mm posterior to the bregma. The inter-ROI distance along the anterior-posterior axis in the brain is 0.14 mm.

### **FUNCTIONAL CONNECTIVITY OF THE CORTICAL MIDLINE STRUCTURES WITH OTHER BRAIN AREAS**

Seed FC analysis in control animals revealed a functional segregation such that anterior-most structures (PrL and IL) showed a preferential positive connectivity to limbic/paralimbic areas, while posterior structures (RSD, RSG) showed a preferential positive connectivity to sensory areas. Furthermore, the sign of correlation (positive or negative) was often reversed for the anterior compared to the posterior CMS structures with regard to the limbic/paralimbic and sensory areas. Thus, the PrL and IL showed positive correlations with limbic/paralimbic areas, including the anterior insula, septum (lateral, medial), central and basolateral nuclei of the amygdala, nucleus accumbens, and dorsal hippocampus, whereas these limbic/paralimbic areas showed negative or nonsignificant correlations with the RSD and RSG. Likewise, for sensory areas such as the somatosensory cortices (S1BF, S1HL, S1FL, S2), parietal association cortex, visual cortices, auditory cortex, mid and posterior insula, the sensory thalamus (ventroposterior lateral/medial, medial geniculate), anterior pretectal nucleus, and colliculi (inferior, superior), correlations were positive for the RSD and RSG seeds, but negative or nonsignificant for the PrL and IL seeds. The above findings are consistent with the divergent roles of anterior medial prefrontal cortex and that of the retrosplenial cortex, with the former playing a role in the regulation of limbic activity (Margulies et al., 2007; Horn et al., 2010; Ichesco et al., 2012; Connolly et al., 2013; Klavir et al., 2013), and the latter receiving and integrating early-processed sensory information (Vann et al., 2009).

This functional segregation was in general preserved in fear-conditioned mice. In addition, fear-conditioned recall broadened the FC of the PrL and IL to the amygdala, with new FC to the basomedial and medial nuclei. These results are consistent with the existing structural connectivity of the PrL and IL with the amygdala, as well as brain mapping studies suggesting their functional coactivation in the conditioned-fear paradigm (Cassell et al., 1989; Singewald et al., 2003; Holschneider et al., 2006; Knapska and Maren, 2009; Lehner et al., 2009; Sotres-Bayon et al., 2012). The fear-conditioned mice also showed new positive correlations between PrL/IL and sensory areas, including somatosensory cortices (S1BF, S1HL, PRh, pIns), auditory cortex (PrL only) and piriform cortex (PrL only), suggesting integration and processing of sensory information by the PrL/IL during fear recall. Altered FC of the retrosplenial cortices was observed

**FIGURE 5 | Circular plot of graphs representing the functional networks of the cortical midline structures (CMS). (A)** In control mice, functional segregation was observed along the anterior-posterior axis between the anterior aspect (PrL, MO, IL, anterior Cg1, and Cg2) and posterior aspect (RSD, RSG, posterior Cg1, and Cg2) of the CMS. **(B)** In the fear-conditioned mice, functional connectivity between the dorsal and ventral aspect of the CMS was greatly reduced. In each circular plot, regions of interest (ROIs) representing the dorsal CMS are arranged in the upper half of the circle, whereas ROIs representing the ventral CMS are arranged in the lower half. Abbreviations: Cg1, cingulate cortex area 1; Cg2, cingulate cortex area 2; IL, infralimbic cortex; MO, medial orbital cortex; PrL, prelimbic cortex; RSD, retrosplenial dysgranular cortex; RSG, retrosplenial granular cortex. The index numbers denote sequence of ROIs along the anterior-posterior axis such that #1 denotes the most anterior ROI at 2.56 mm anterior to the bregma and #37 the most posterior ROI at 2.48 mm posterior to the bregma. The inter-ROI distance is 0.14 mm.

#### **Table 1 | Summary of seed correlation analysis results in the control mice.**




*Functional connectivity of the cortical midline structures was analyzed using seed correlation for the right prelimbic (PrL), infralimbic (IL), cingulate area 1 (Cg1), cingulate area 2 (Cg2), retrosplenial dysgranular (RSD) and retrosplenial granular (RSG) cortices. Shown are significant left and right (L/R) positive (*+*) and negative (*−*) correlations with the seed (P < 0.05, clusters* ≥ *100 voxels), with double signs denoting broadly represented correlations. "0" and blank cells denote the absence of significant correlations. Gray shaded cells highlight limbic/paralimbic areas. White text on a black background denotes the seed region.*

in the fear-conditioned mice. The RSG was predominantly connected with the limbic/paralimbic areas, whereas the RSD was predominantly connected with the sensory areas. The significance of this shift in FC pattern remains to be further investigated.

The afferent and efferent projections of the CMS structures have been a subject of extensive research (Domesick, 1969; Sesack et al., 1989; Vertes, 2004; Hoover and Vertes, 2007; Sugar et al., 2011). While it is beyond the scope of this report to compare functional and structural connectivity of the CMS in detail, it is important to note that a functional connectome reflects the dynamic, state-dependent recruitment of the underlying structural network.

#### **TRANSLATIONAL ASPECTS**

One widely accepted theoretical construct of the subdivisions of the human cingulate gyrus is the four-region model (Vogt et al., 2013), consisting of the anterior cingulate cortex (ACC; s, subgenual; p, pregenual), the midcingulate cortex (MCC; a, anterior; p, posterior), the posterior cingulate cortex (PCC; d, dorsal; v, ventral), and the retrosplenial cortex (RSC). This midline cortex shows evolutionary expansion across species, with increasing complexity as one progresses from rodents to nonhuman primates to humans (Vogt et al., 2013). PrL and IL in rodents appear to be homologous to primate pregenual ACC and subgenual ACC, respectively (Vogt et al., 2013; Vogt and Paxinos, 2014), although PrL may also show some features of primate dorsolateral prefrontal cortex (for further discussion see Uylings et al., 2003), and IL features of primate orbitomedial cortex (for further discussion see Vertes, 2006). There are no posterior cingulate areas in rodents, and posterior CMS is composed entirely of retrosplenial cortex, which is proportionally much larger in rodents than in humans (Vann et al., 2009; Vogt et al., 2013; Vogt and Paxinos, 2014). Hence, in rodents, the CMS is best described by a three-region model (Vogt and Paxinos, 2014), with key similarities of structural connectivity for intra-cingulate connections for humans, primates and rodents (Vogt et al., 2013).

Functional specialization of the cingulate gyrus has been explored in human subjects by Margulies et al. (2007) who examined resting-state FC patterns for 16 ACC seed regions. Their results demonstrated strong anterior-posterior and dorsal-ventral functional specialization of the ACC, and highlighted the negative relationships between rostral ACC-based affective networks and caudal ACC-based frontoparietal attention networks (Margulies et al., 2007). Habas (2010) mapped the FC patterns of the human rostral and caudal cingulate motor areas (located just under the pre-supplementary and supplementary motor areas), and found that activity in the rostral cingulate motor area was more correlated with activity in prefrontal, orbitofrontal, and language-associated cortices, whereas the caudal cingulate motor area correlated more closely with sensory cortex (Habas, 2010). More recently, Yu et al. (2011) examined functional connectivity of the human cingulate cortex using the four-compartment model (Yu et al., 2011). They found that the subgenual ACC and pregenual ACC were involved in an affective network, while being negatively correlated with a sensorimotor network. In the MCC, however, the anterior MCC was correlated with the sensorimotor network and negatively correlated with the affective network, whereas the posterior MCC only correlated with the sensorimotor network. The dorsal PCC and ventral PCC were involved in the default-mode network and were negatively correlated with the sensorimotor network. In contrast, the RSC was mainly correlated with the PCC and thalamus.

Our findings in the control mice parallel these human findings in general. The anterior CMS (PrL and IL) in the mouse showed a preferential FC to limbic/paralimbic areas, while mid (Cg1, Cg2) and posterior CMS (RSD, RSG) showed greater connectivity to sensory areas. Furthermore, the PrL and IL showed negative correlations with some sensory areas, whereas the cingulate and retrosplenial cortices showed negative correlations with some limbic/paralimbic areas. Differences were noted in the FC of the retrosplenial cortices, with the mouse showing broader FC with sensorimotor regions than that reported in the PCC in humans (Yu et al., 2011). This may reflect the fact that the rodent retrosplenial cortex is proportionally larger than in humans, and contains areas of cortex not represented in the human PCC (Vann et al., 2009). Of note, the retrosplenial cortex showed broad FC with thalamic nuclei in mice, which correlates with strong FC between these regions observed in humans (Yu et al., 2011). Finally, in agreement with prior work in human subjects, fear broadened FC of anterior CMS to the amygdala and to somatosensory areas, suggesting integration and processing of both limbic and sensory information (Hariri et al., 2003; Stein et al., 2007; Cullen et al., 2011; Robinson et al., 2012; Motomura et al., 2013; Prater et al., 2013).

#### **METHODOLOGICAL CONSIDERATIONS**

We applied pairwise inter-regional correlation analysis to autoradiographic CBF data to investigate brain functional connectivity (Wang et al., 2011, 2012). This is a well-established method, which has been applied to analyze rodent brain mapping data of other modalities, including autoradiographic deoxyglucose uptake (Soncrant et al., 1986; Barrett et al., 2003), cytochrome oxidase histochemistry (Fidalgo et al., 2011; Padilla et al., 2011), activity regulated genes (c-fos) (Wheeler et al., 2013), and fMRI (Schwarz et al., 2007). In these studies, correlations are calculated in an inter-subject manner, i.e., across subjects within a group. Hence, perfusion mapping using autoradiographic methods presents a "snap-shot" of brain activity at a single point in time, which in the case of the current study corresponded to a several second time window occurring 1 minute following exposure to the tone. Thus, our methods preclude analysis of the dynamics of functional brain activation. This approach is different from the intra-subject cross correlation analysis often used on fMRI time series data (Pawela et al., 2008; Magnuson et al., 2010; Liang et al., 2011) or that typically performed in electrophysiologic recordings (Scholvinck et al., 2013). Caution needs to be taken comparing FC results between different brain imaging modalities and between different analytic methods (Di et al., 2012; Buckner et al., 2013; Hutchison et al., 2013; Scholvinck et al., 2013; Wehrl et al., 2013).
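The inter-subject correlation described above can be illustrated with a minimal sketch. It assumes one regional CBF value per animal and ROI and uses plain Pearson correlation from NumPy; the function name `inter_subject_fc` and all numeric values are hypothetical and are not taken from the study, whose actual processing pipeline is not reproduced here.

```python
import numpy as np

def inter_subject_fc(cbf):
    """Pearson correlation between every pair of ROIs, computed across subjects.

    cbf: array of shape (n_subjects, n_rois), one regional CBF value per
         animal and ROI (a single "snapshot" per animal, no time series).
    """
    cbf = np.asarray(cbf, dtype=float)
    # rowvar=False treats each column (ROI) as a variable and each row
    # (animal) as an observation.
    return np.corrcoef(cbf, rowvar=False)

# Hypothetical values for 6 animals and 3 ROIs (arbitrary units).
roi_a = np.array([1.0, 1.2, 0.9, 1.4, 1.1, 1.3])
roi_b = 2.0 * roi_a + 0.1   # rises and falls with roi_a across animals
roi_c = 3.0 - roi_a         # falls when roi_a rises
cbf = np.column_stack([roi_a, roi_b, roi_c])

fc = inter_subject_fc(cbf)
print(np.round(fc, 2))
```

Because each animal contributes a single value per ROI, the resulting matrix describes covariation across the group rather than temporal coupling within any one brain.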

**FIGURE 7 | Summary of positive functional connectivity for cortical midline structures with other brain areas.** Significant correlations are shown for seeds representing the right prelimbic, infralimbic, cingulate area 1, cingulate area 2, retrosplenial dysgranular and retrosplenial granular cortices. Brain areas that showed correlation to only one seed in each dorsal-ventral pair are underlined. Abbreviations: aIns, anterior insular cortex; APT, anterior pretectal area; Au, auditory cortex; AD/AV, anterior dorsal/ventral thalamic n.; BL, basolateral amygdalar n.; BM, basomedial amygdalar n.; Cb3–5, cerebellar lobules 3–5; CeA, central amygdalar n.; Cn, cuneiform n.; dCPu, dorsal caudate putamen; DG, dentate gyrus; DLG, lateral geniculate, dorsal; DR, dorsal raphe; DS, dorsal subiculum; IC, inferior colliculus; La, lateral amygdalar n.; LD, lateral dorsal thalamic n.; Lent, lateral entorhinal cortex; Li, linear raphe; LO/VO, lateral/ventral orbital cortex; LS, lateral septum; M1, primary motor cortex; M2, secondary motor cortex; MeA, medial amygdalar n.; mIns, mid insular cortex; MG, medial geniculate; MnR, median raphe; MS, medial septum; Acb, nucleus accumbens; vCPu, ventral caudate putamen; vHPC, ventral hippocampus; Pir, piriform cortex; PnO, pons; pIns, posterior insular cortex; PRh, perirhinal cortex; PS, parasubiculum; PtA, parietal association cortex; S1BF, primary somatosensory cortex, barrel field; S1FL, primary somatosensory cortex, forelimb; S2, secondary somatosensory cortex; SC, superior colliculus; V1/V2, primary/secondary visual cortex; VA/VL, ventral anterior/ventrolateral thalamic n.; VM/Sub, ventromedial/submedial thalamic n.; VPL/VPM, ventroposterolateral/ventroposteromedial thalamic n.

#### **Table 2 | Summary of seed correlation analysis results in the fear-conditioned mice.**




*Functional connectivity of the cortical midline structures was analyzed using seed correlation for the right prelimbic (PrL), infralimbic (IL), cingulate area 1 (Cg1), cingulate area 2 (Cg2), retrosplenial dysgranular (RSD) and retrosplenial granular (RSG) cortices. Shown are significant left and right (L/R) positive (*+*) and negative (*−*) correlations with the seed (P < 0.05, clusters* ≥ *100 voxels), with double signs denoting broadly represented correlations. "0" and blank cells denote the absence of significant correlations. Gray shaded cells highlight limbic/paralimbic areas. White text on a black background denotes the seed region.*

What has become increasingly clear is that FC may occur at different time scales ranging, for example, from milliseconds in electrophysiologic studies, to seconds in fMRI and minutes in PET (Di et al., 2012; Scholvinck et al., 2013; Wehrl et al., 2013). Although the existence of a flow/metabolism coupling to neural activity is well accepted, and indeed forms the basis of the majority of functional brain mapping studies, the exact relationship between neuronal activity, regional CBF and metabolism, as well as the role of vascular distribution and architecture, remains a question of debate (Gsell et al., 2000; Keri and Gulyas, 2003; Van Zijl et al., 2012), and the relationship between dynamic neurometabolic coupling and more static measures of regional covariance remains largely unresolved. Different analytic tools have been adapted to allow the determination of functional associations, either by accounting for the temporal aspects of time series or, as in the current study, by modeling the system over the entire experimental period independent of the temporal order (Stephan, 2004). Honey et al. (2007), who explored the network structure of cerebral cortex on multiple time scales, reported that at the slowest time scale (minutes), the aggregate strength of functional couplings between regions is, on average, a good indicator of the presence of an underlying structural link (Honey et al., 2007). At faster time scales significant fluctuations are observed in the strength of functional coupling. Recent work has compared FC calculated using inter-subject, region-of-interest correlation analysis of 18F-fluorodeoxyglucose PET data and that using time-series correlation analysis of fMRI data (Di et al., 2012; Wehrl et al., 2013). These methods differ in their temporal scales, ranging from minutes for the PET images to seconds for the fMRI images. Findings suggest that in general the two methods generate comparable results with regard to core regions. However, differences in the time scales of data sampling may result in the differential recruitment of ancillary regions, and this effect may be accentuated in studies in which subjects receive an ongoing active stimulation. Future efforts at delineating a functional connectome will need to evaluate FC at multiple time scales to better address the issue of state vs. trait related changes.

It is important to remember that while correlation based analyses provide information about functional connectivity, they do not directly address causal relationships. Thus, it is possible that functional connectivity may arise even in the absence of a direct structural connection through functional linkages across a shared secondary node. However, while indirect interactions can account for some functional linkages, current evidence suggests that topological parameters are generally conserved between structural and functional networks (Bullmore and Sporns, 2009). Our approach to studying FC in the mouse brain appears reasonable and consistent with the current theoretical understanding of functional connectivity as long as one understands that it does not address causality or directionality of individual connections, and that it is conceivable that covariance between two nodes in a circuit may occur in the absence of their direct structural connectivity.
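The point that covariance between two nodes can arise through a shared third node can be made concrete with a toy simulation. This is only an illustrative sketch under stated assumptions: the three signals are synthetic, the node names `a`, `b` and `c` are hypothetical, and the first-order partial correlation shown here is a standard statistic that was not part of the analysis reported in this study.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000  # number of synthetic observations

# Node c drives both a and b; a and b have no direct link to each other.
c = rng.normal(size=n)
a = c + 0.5 * rng.normal(size=n)
b = c + 0.5 * rng.normal(size=n)

def corr(x, y):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

r_ab = corr(a, b)                  # high, although a and b are not linked
r_ab_given_c = partial_corr(a, b, c)  # close to zero once c is accounted for

print(f"r(a, b)     = {r_ab:.2f}")
print(f"r(a, b | c) = {r_ab_given_c:.2f}")
```

In this construction the plain correlation between `a` and `b` is strong even though neither influences the other, which is the situation the paragraph above cautions about when reading a correlation as evidence of a direct structural connection.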

The results of our pairwise correlation analysis highlight the general challenge inherent in the interpretation of any ROI analysis—that is how representative is the selected ROI for assessing the functional connectivity of the structure of interest as a whole? An ROI defined either too large or too small relative to the actual extent of regional activation may result in loss of statistical power. For brain structures with complex spatial patterns of afferent and efferent projections, defining appropriate ROIs may be particularly difficult. A strength of our study was its unbiased approach of ROI selection across sequential coronal slices of the 3D midline cortex of the mouse. This unbiased approach allowed us to detect functional segregation of these regions without the limitations of pre-specified ROIs. In our study, the cingulate cortex (Cg1, Cg2) was itself functionally segregated such that the anterior half correlated more strongly with PrL, MO and IL, whereas the posterior half correlated more strongly with RSD and RSG. However, while our seed analysis (using pre-specified ROIs) suggested that the cingulate cortex had a pattern of functional connectivity that was intermediate between that of the anteriormost and posteriormost CMS, it is likely that an individual seed placed at different locations within the cingulate would result in progressively different FC patterns.

## **CONCLUSION**

Our study provided information on the functional connectivity pattern of the CMS at a mesoscopic level. Thus, while FC as implemented in the current study was not specified at a level that allowed one to distinguish between different processes at synaptic, cellular, columnar or laminar levels, it did allow one to model context-dependent changes at the level of large neural populations. Functional integration and segregation noted in our study paralleled reports of structural connectivity of CMS in the rodent, and were in general consistent with reports of functional connectivity in humans using fMRI. The subregion-level approach to defining individual functional units and constructing macro- to mesoscopic-level connectomes for neural systems such as the CMS offered a balanced solution that facilitated comparison with structural connectivity data. Differences in FC between the control and fear-conditioned mice highlighted the state-dependence of the brain functional connectome, and the importance of evaluating and comparing the functional connectome across states. Organizational principles learned from animal models at the macro- and mesoscopic level (brain regions/subregions and pathways) will not only inform future work at the microscopic level (single neurons and synapses) but may have translational value to advance our understanding of human brain structure and function, as well as of animal models of human cerebral pathology (Lynch et al., 2013).

#### **ACKNOWLEDGMENT**

Research support was provided by NARSAD: The Brain and Behavior Research Fund.

### **REFERENCES**


autoradiographic brain perfusion mapping and functional connectivity study. *Neuroimage* 59, 4168–4188. doi: 10.1016/j.neuroimage.2011.11.047


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 February 2014; accepted: 24 May 2014; published online: 11 June 2014. Citation: Holschneider DP, Wang Z and Pang RD (2014) Functional connectivity-based parcellation and connectome of cortical midline structures in the mouse: a perfusion autoradiography study. Front. Neuroinform. 8:61. doi: 10.3389/fninf.2014.00061*

*This article was submitted to the journal Frontiers in Neuroinformatics.*

*Copyright © 2014 Holschneider, Wang and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Brain-wide map of efferent projections from rat barrel cortex

*Izabela M. Zakiewicz, Jan G. Bjaalie and Trygve B. Leergaard\**

*Department of Anatomy, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway*

#### *Edited by:*

*Mihail Bota, University of Southern California, USA*

#### *Reviewed by:*

*Graham J. Galloway, The University of Queensland, Australia Rembrandt Bakker, Radboud University Nijmegen, Netherlands*

#### *\*Correspondence:*

*Trygve B. Leergaard, Department of Anatomy, Institute of Basic Medical Sciences, University of Oslo, Postboks 1105 Blindern, 0317 Oslo, Norway*

*e-mail: t.b.leergaard@medisin.uio.no*

The somatotopically organized whisker barrel field of the rat primary somatosensory (S1) cortex is a commonly used model system for anatomical and physiological investigations of sensory processing. The neural connections of the barrel cortex have been extensively mapped. But most investigations have focused on connections to limited regions of the brain, and overviews in the literature of the connections across the brain thus build on a range of material from different laboratories, presented in numerous publications. Furthermore, given the limitations of the conventional journal article format, analyses and interpretations are hampered by lack of access to the underlying experimental data. New opportunities for analyses have emerged with the recent release of an online resource of experimental data consisting of collections of high-resolution images from 6 experiments in which anterograde tracers were injected in S1 whisker or forelimb representations. Building on this material, we have conducted a detailed analysis of the brain-wide distribution of the efferent projections of the rat barrel cortex. We compare our findings with the available literature and reports accumulated in the Brain Architecture Management System (BAMS2) database. We report well-known and less known intracortical and subcortical projections of the barrel cortex, as well as distinct differences between S1 whisker and forelimb related projections. Our results correspond well with recently published overviews, but provide additional information about relative differences among S1 projection targets. Our approach demonstrates how collections of shared experimental image data are suitable for brain-wide analysis and interpretation of connectivity mapping data.

#### **Keywords: anterograde transport, axonal tracing, brain atlas, connectivity, connectome, neuroanatomical tract tracing, neuroinformatics, wiring diagram**

## **INTRODUCTION**

The characteristic grid-like arrangement of mystacial representations in the whisker barrel field of the primary somatosensory cortex (S1; Welker, 1971; Chapin and Lin, 1984; Dawson and Killackey, 1987; Welker et al., 1988; Fabri and Burton, 1991a) has made the rat barrel cortex a common model for anatomical and physiological investigations of sensory processing and brain plasticity (Petersen, 2007; Alloway, 2008; Wiest et al., 2008; Feldmeyer et al., 2013). The intracortical and subcortical connections of the S1 barrel cortex have been extensively mapped by use of axonal tract tracing and electrophysiological techniques, and many of the connections target brain regions involved in synchronization of body movements in reply to sensory stimuli (Alloway, 2008; Wiest et al., 2008). A considerable number of studies have shown that the S1 barrel cortex projects to the motor cortex (Chapin and Lin, 1984; Reep et al., 1990; Fabri and Burton, 1991a; Smith and Alloway, 2013), primary and secondary somatosensory cortex (Chapin and Lin, 1984; Koralek et al., 1990; Fabri and Burton, 1991a), insular cortex (Fabri and Burton, 1991a), perirhinal and ectorhinal cortex (Fabri and Burton, 1991a; Naber et al., 2000), auditory and visual cortex (Frostig et al., 2008; Sieben et al., 2013) while subcortical projections terminate bilaterally in the dorsal striatum (Brown et al., 1998; Alloway et al., 1999; Hoffer et al., 2005), ipsilaterally in the thalamus (Fabri and Burton, 1991b; Landisman and Connors, 2007), red nucleus (Ebrahimi-Gaillard and Roger, 1993), superior colliculus (Wise and Jones, 1977a; Hoffer et al., 2005), and pontine nuclei (Mihailoff et al., 1978; Wiesendanger and Wiesendanger, 1982; Mihailoff et al., 1985; Leergaard and Bjaalie, 2007), and contralaterally in the trigeminal nuclei (Killackey et al., 1989; Furuta et al., 2010), dorsal column nuclei (Giuffrida et al., 1986; Shin and Chapin, 1989), and spinal cord (Akintunde and Buxton, 1992).

However, each of the previous investigations has typically covered the projections of one or at most a few brain regions. To our knowledge, only one earlier investigation provided a brain-wide analysis of efferent projections from the S1 barrel cortex in mouse (Welker et al., 1988). Similar data are not available in rat, and no previous study has provided documentation of barrel cortex connectivity across the entire brain, allowing comparison of the projections originating from different S1 body representations. There is increasing awareness in the field about the need for comprehensive maps of rodent brain connectivity, and several large scale initiatives currently employ sophisticated axonal tracing paradigms and high-throughput methodologies to generate large amounts of experimental connectivity data from the mouse brain, such as the Allen Brain Atlas Mouse Connectivity project (www.brain-map.org) and the Mouse Connectome Project (www.mouseconnectome.org). These projects have made impressive amounts of image data from large numbers of tract-tracing experiments publicly available, but few analyses of connectivity have yet been conducted with brain-wide coverage.

Efforts to aggregate information from the literature to gain an overview of rat brain connectivity, such as the Brain Architecture Management System (BAMS, Bota et al., 2005, 2012), provide an overview of the major connections of S1. But the completeness of the presentations is difficult to assess due to lack of access to original data, and lack of brain-wide coverage in the original publications. A related question is whether neighboring body representations in S1 project to the same cortical and subcortical targets across the brain. Distinct topographical organization of S1 forelimb and whisker related projections to major target regions has been described (e.g., Brown et al., 1998; Hoover et al., 2003; Leergaard et al., 2000b), but differences in connectivity across the entire brain are largely unknown. Thus, beyond a few studies comparing S1 projections to different cortical areas (Hoffer et al., 2003) or corticostriatal, corticothalamic, and corticopontine projections from sensory and motor cortex (Hoffer et al., 2005), little is known about differences in densities and extent of S1 whisker barrel projections across all cortical and subcortical target regions. Such differences can only be assessed by brain-wide analyses of connectivity in the same experiments.

We here utilize an online resource containing high-resolution images with tract-tracing data (Zakiewicz et al., 2011; www.rbwb.org) to perform a brain-wide, semiquantitative analysis of the efferent connections of S1 barrel cortex. Our results allow comparison of the different well-known S1 efferent projections as well as less known projections to cortical and subcortical brain regions. We demonstrate distinct differences between S1 whisker and forelimb related projections and discuss possible functional implications of these findings. We finally compare our results to the overview of S1 connections provided by earlier publications and the BAMS database.

## **MATERIALS AND METHODS**

To determine the target regions of S1 efferent projections across the rat brain, we used a collection of high-resolution images of histological sections from six experiments in which axonal tracers were injected in whisker or forelimb representations in S1 (www.rbwb.org) (Zakiewicz et al., 2011).

Detailed procedures are described in Zakiewicz et al. (2011) and experimental metadata are available via the online data system (www.rbwb.org). All experimental procedures were approved by the institutional animal welfare committee of the University of Oslo and the Norwegian Animal Research Authority, and were in compliance with European Community regulations on animal well-being. Briefly, an anterograde axonal tracer (biotinylated dextran amine, BDA, or *Phaseolus vulgaris* leucoagglutinin, *Pha*-L) was injected in the cerebral cortex of anaesthetized adult Sprague Dawley or Wistar rats. After 7 days animals were sacrificed and transcardially perfused with 4% paraformaldehyde, and brains were removed for histological processing. Coronal sections (50 µm thick) were cut on a freezing microtome, and every second section was processed to visualize BDA or *Pha*-L (Gerfen and Sawchenko, 1984). Most sections were further counterstained with Thionine or Neutral red. Alternating sections through S1 were stained for cytochrome oxidase using the procedure of Wong-Riley (1979). High-resolution section images (TIFF format) were obtained through a 10× objective (Olympus UPlanApo, NA 0.40) using a motorized Olympus BX52 microscope running the Virtual Slide module of Neurolucida 7.0 (MBF Bioscience Inc., Williston, VT, USA). Images were converted to the Zoomify PFF format (Zoomify Inc., Santa Cruz, CA, USA) and assembled in an online data repository.

The locations of tracer injection sites were confirmed by analysis of anatomical landmarks and cytochrome oxidase staining pattern (Zakiewicz et al., 2011). All injections were columnar, and involved all cortical layers (**Figure 1**). To assess the size of the injection sites we used image analysis tools in Neurolucida. RGB images were converted (using the red or blue channel for sections stained with Neutral red or Thionine, respectively) to gray scale representations. The grayscale images were binarized with Neurolucida filters (Kodalith, fill holes, erode and pruning), and injection site volumes were estimated by summation of the measured areas multiplied by the section spacing.
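The volume estimate described above (summed cross-sectional areas multiplied by the section spacing) can be sketched as follows. This is a minimal illustration, not the Neurolucida procedure itself: the function name `injection_volume_mm3` and the area values are hypothetical, and the spacing of 0.1 mm is an assumption that follows from 50 µm sections with every second section processed.

```python
import math

def injection_volume_mm3(areas_mm2, section_spacing_mm):
    """Estimate a volume from areas measured on evenly spaced sections.

    areas_mm2: injection site area measured on each analyzed section (mm^2).
    section_spacing_mm: distance between consecutive analyzed sections (mm).
    """
    if section_spacing_mm <= 0:
        raise ValueError("section spacing must be positive")
    if any(area < 0 for area in areas_mm2):
        raise ValueError("areas must be non-negative")
    # Each measured area stands in for a slab one spacing thick.
    return math.fsum(areas_mm2) * section_spacing_mm

# Hypothetical binarized injection site areas on four consecutive
# analyzed sections, in mm^2.
areas = [0.10, 0.25, 0.30, 0.20]
spacing = 0.1  # assumed: 50 um sections, every second section analyzed

volume = injection_volume_mm3(areas, spacing)
print(f"estimated injection volume: {volume:.3f} mm^3")
```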

To identify target regions for efferent projections from the S1 whisker and forelimb representations, we systematically inspected all parts of the microscopic images from the six cases. Individual labeled axons were followed across sections to ensure that their targets were identified. The anatomical location of the observed labeling was determined by superimposing corresponding coronal atlas plates (Paxinos and Watson, 2007) to each image, using affine transformations applied in Adobe Illustrator CS5 (Adobe Systems Inc, San Jose, CA, USA). For each region the spatial registration of atlas overlay was adjusted on the basis of local landmarks and cytoarchitectonic patterns. The nomenclature and abbreviations used in this report are adopted from Paxinos and Watson (2007).

The amount of labeled fibers in each anatomical (sub)region was semiquantitatively assessed by a single examiner, who scored the observed labeling with a density rating system based on predefined criteria. The labeling was scored as "weak" (score = 1) for a few labeled fibers that were possible to count, as "moderate" (score = 2) for several fibers that could be individually discerned but not readily counted, and as "strong" (score = 3) for many labeled fibers forming dense plexuses where individual fibers could not be discerned.

For comparison with connectivity reports registered in the BAMS2 database (http://brancusi1.usc.edu/) we used the online query tools of this database, supported with customized data files kindly provided by Dr. Mihail Bota (personal communication). The connections reported in BAMS2 are based on terminology used in the Swanson (1998) atlas of the rat brain. At the detail level of the target regions reported here, this terminology is compatible with that of Paxinos and Watson (2007). Strengths indicated for connections in BAMS2 were re-interpreted to match those of the present study, aided by the collator notes registered in BAMS2 and cross-checking against the original references. The annotation of strength in BAMS2 had a higher granularity and was reinterpreted to our semiquantitative scale as follows: scores "very light" and "light" were interpreted as light; "light/moderate" and "moderate" as moderate; and "moderate/strong," "strong," and "very strong" as strong.
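The reinterpretation of BAMS2 strength annotations can be written as a simple lookup table. The sketch below is illustrative only: the exact label strings used in BAMS2 exports, the function name, and the equating of "light" with score 1 ("weak" in the scale defined above) are assumptions made for the example.

```python
# Three-level semiquantitative scale used in the present study.
WEAK, MODERATE, STRONG = 1, 2, 3

# Assumed BAMS2 label strings mapped onto the three-level scale.
BAMS2_TO_SCORE = {
    "very light": WEAK,
    "light": WEAK,
    "light/moderate": MODERATE,
    "moderate": MODERATE,
    "moderate/strong": STRONG,
    "strong": STRONG,
    "very strong": STRONG,
}


def reinterpret_bams2_strength(label):
    """Map a BAMS2 strength label to a score of 1, 2 or 3."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(label.strip().lower().split())
    try:
        return BAMS2_TO_SCORE[key]
    except KeyError:
        raise ValueError(f"unrecognized BAMS2 strength label: {label!r}") from None


print(reinterpret_bams2_strength("Light/Moderate"))
```

Intermediate BAMS2 categories are rounded upward in this scheme ("light/moderate" becomes moderate, "moderate/strong" becomes strong), which is worth keeping in mind when comparing strengths across the two sources.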

## **RESULTS**

To identify the cortical and subcortical brain regions receiving projections from the rat S1 whisker barrel cortex, and to compare the projections of S1 whisker representations to the neighboring S1 forelimb representation, we have examined the distribution of anterogradely labeled axons arising from axonal tracer injections in S1 whisker or forelimb representations in a collection of section images from six experiments (www.rbwb.org; Zakiewicz et al., 2011).

### **GENERAL FEATURES OF LABELING**

The six injection sites varied in volume (0.23–2.97 mm³; **Table 1**), but had sharp boundaries and covered the entire thickness of the cerebral cortex, without involvement of the underlying white matter (**Figure 1**). The positions of the injection site centers were inferred from histological analyses of anatomical landmarks and cytochrome oxidase staining patterns (Zakiewicz et al., 2011). Inspection of sections stained for cytochrome oxidase revealed that the injections into S1 barrel cortex involved both barrels (D2, D3 or D5) and adjacent septa.

The two tracers (BDA and *Pha*-L) both gave rise to distinctly labeled axons in intracortical and subcortical targets (**Figures 1**–**4**, summarized in **Tables 1**, **2**). The fibers were sharply defined with visible beaded varicosities, readily observed in the high-resolution images shown in the Whole Brain Connectivity Atlas. Retrogradely labeled cells were also observed in several regions in cases injected with BDA. This labeling is commented on below, but not included in our semiquantitative analysis due to the less robust properties of the 10 kDa BDA tracer for retrograde tracing (Lanciego and Wouterlood, 2000, 2011). Cyto- and chemoarchitectural features were helpful to determine anatomical boundaries. The shape, size, and density of labeled fibers were highly similar across cases, although the amount of labeling reflected the size of the injection sites. While several bilateral projections were observed, the amount of contralateral labeling was always lower, and tended to be distributed in a pattern mirroring the ipsilateral labeling.

### **CORTICO-CORTICAL PROJECTIONS**

#### *Motor and somatosensory cortex*

All tracer injections gave rise to labeled axons in most parts of the injected S1 cortex, reflecting the well-known intrinsic connectivity of S1 (Fabri and Burton, 1991a). We found substantial amounts of labeled fibers in the contralateral S1, bilaterally in the primary motor cortex (M1) and the secondary somatosensory cortex (S2), and to a lesser extent the secondary motor cortex (M2; **Figures 2A–D**), in agreement with earlier observations (Donoghue and Parham, 1983; Reep et al., 1990; Fabri and Burton, 1991a; Wright et al., 1999; Hoffer et al., 2003; Alloway et al., 2004, 2008; Hoffer et al., 2005; Colechio and Alloway, 2009; Smith and Alloway, 2013). The amount of forelimb related projections to motor areas was consistently higher than whisker related projections, relative to the size of the injection sites (**Table 1**). It should be noted that the region drawn as M2 in the employed reference atlas (Paxinos and Watson, 2007) includes the medial and lateral agranular cortex (Donoghue and Wise, 1982). Indeed, the labeling observed in our material (**Figures 2C,D**) fits well with the S1 projections recently described to distribute across the transition zone between the medial and lateral agranular cortex (Smith and Alloway, 2013).

#### **Table 1 | Summary of observations and semiquantitative assessment.**

*IS, injection site; NA, not available.*

*Semiquantitative assessment of amount of labeled fibers:*

*0, confirmed absence of labeled fibers.*

*1, weak; few fibers that are possible to count.*

*2, modest; several fibers that can be individually discerned but not readily counted.*

*3, strong; many fibers forming dense plexuses where individual fibers cannot be discerned.*

*\*Anatomical location was assigned according to Paxinos and Watson (2007). Note that labeled fibers were observed in anterior regions underlying the forceps minor of the corpus callosum, >3 mm anterior of bregma, and that recent proteomic analyses (Mathur et al., 2009) indicate that this part of the brain should not be included in the structural definition of the claustrum.*

**FIGURE 2 | Examples of projections to motor cortex.** Images exemplifying S1 projections to the ipsilateral primary and secondary motor cortex. **(A–D)**, Labeling in M1 and M2 originating from S1 forelimb **(A,B)** and S1 whisker **(C,D)** representations, distributed in distinct columns in M1, and partly across the boundary between M1 and M2. M1, primary motor cortex; M2, secondary motor cortex; S1, primary somatosensory cortex. Scale bar, 0.5 mm.

visual (**C**, case R602) and auditory (**D**, case R602) cortices both labeled axons and retrogradely labeled neurons are observed in superficial layers. Scale bars, 0.5 mm.

**FIGURE 4 | Examples of subcortical labeling.** Images illustrating observed axonal labeling in a selection of subcortical regions. **(A,B)** Elongated plexuses of labeling in the dorsal striatum, arising from S1 whisker (**A**, case R602) and forelimb (**B**, case R605) representations. **(C,D)** Labeled fibers in anterior parts of the claustrum as defined in the employed atlas (Paxinos and Watson, 2007), but in a location that according to recent proteomic analysis is not part of the claustrum (Mathur et al., 2009). (**C**, case R602; **D**, case R606). **(E,F)** Widespread labeled fibers in the lateral, reticular part of the substantia nigra (**E**, case R602; **F**, case R601). Black arrowheads in **(E,F)** indicate labeled fibers in the substantia nigra. Blue arrowheads in **E** indicate a labeled fiber reaching the red nucleus. **(G,H)** Sharply defined dense plexuses of labeling in the thalamus (**G**, case R602; **H**, case R605). **(I)** Example of a loosely organized plexus of labeled fibers in the submedius nucleus thalamus (case R606). **(J,K)** Labeled fibers in the zona incerta (**J**, case R602M; **K**, case R605). **(L,M)** Examples showing labeled fibers in the superficial and deep layers of the superior colliculus (**L**, case R602; **M**, case R605). **(N,O)** Dense plexuses of labeling in the pontine nuclei (**N**, case R602; **O**, case R605). **(P)** Discrete labeling in the caudal part of the contralateral spinal trigeminal nucleus (case R602). Cl, claustrum; cp, cerebral peduncle; CPu, caudate putamen (striatum); DpG, deep gray layer of the superior colliculus; DpWh, deep white layer of the superior colliculus; fmi, forceps minor of the corpus callosum; Pn, pontine nuclei; Po, posterior thalamic nuclear group; RN, red nucleus; Rt, reticular thalamic nucleus; SC, superior colliculus; SNc, substantia nigra, compact part; SNr, substantia nigra, reticular part; Sp5, spinal trigeminal nucleus; SubD, submedius nucleus thalamus, dorsal part; VPL, ventral posterolateral thalamic nucleus; VPM, ventral posteromedial thalamic nucleus; ZI, zona incerta. Scale bars, 0.5 mm.

#### **Table 2 | Overview of S1 efferent projections.**

*Columns 1 (S1 forelimb) and 2 (S1 whisker) show average projections (with color-coded strength) observed in the present study, cumulated from all six cases (Table 1). Column 3 shows the difference between the semiquantitative projection scores from S1 whisker and forelimb representations, such that 0 indicates no difference, while numbers 1–3 indicate degrees of difference.*

#### *Insular and posterior parietal cortex*

In five of six cases, we found significant amounts of labeled axons distributed bilaterally in the insular cortex (**Table 1**; **Figure 3A**), in agreement with earlier descriptions of S1 projections to the parietal ventral cortex (Fabri and Burton, 1991a), which corresponds to the insular cortex as delineated in the reference atlas (Paxinos and Watson, 2007). In the three cases injected in the barrel cortex, we also found labeling bilaterally in the perirhinal and ectorhinal cortex (**Figure 3B**), in line with earlier studies (Fabri and Burton, 1991a; Naber et al., 2000).

Further, in the two S1 whisker experiments we observed substantial labeling in the posterior parietal cortex, in agreement with earlier reports (Koralek et al., 1990; Fabri and Burton, 1991a; Lee et al., 2011). In the three S1 forelimb experiments moderate amounts of labeling were found in the posterior parietal cortex.

#### *Cingulate and retrosplenial cortex*

In the four cases with the largest injection sites we observed some labeling in area 1 of the ipsilateral cingulate cortex. In the cases injected in the S1 barrel cortex a modest amount of labeling was also seen in the ipsilateral retrosplenial cortex. Our observations confirm earlier reports of moderate or weak projections from S1 to the anterior cingulate cortex (Reep et al., 1990; Van Eden et al., 1992; Condé et al., 1995) and retrosplenial cortex (Shibata and Naito, 2008).

#### *Visual and auditory cortex*

In the two animals receiving BDA injections in the S1 barrel cortex, discrete patches of labeled fibers and considerable numbers of retrogradely labeled neurons were observed in the ipsilateral primary and secondary visual cortex (**Figure 3C**), as well as in the neighboring auditory cortex (**Figure 3D**), confirming earlier findings by electrophysiology and tract tracing (Frostig et al., 2008; Sieben et al., 2013).

### **SUBCORTICAL PROJECTIONS**

#### *Basal ganglia*

In all experiments, dense, elongated clusters of labeled axons were seen in the ipsilateral dorsal striatum (**Figures 4A,B**), and in some we also found smaller amounts of labeling in mirrored locations in the contralateral striatum. The corticostriatal projections from the S1 barrel region are well known, and the somatotopic arrangement of projections from different body representations is well characterized (Brown et al., 1998; Alloway et al., 1999; Hoffer and Alloway, 2001).

We further observed weak projections to other parts of the basal ganglia. In two cases (R605 and R606), a few individual labeled fibers were observed in the amygdalostriatal transition area of the ventral striatum, which presumably were en route to the basolateral amygdaloid nucleus (see below). In two experiments with relatively large BDA or *Pha*-L injection sites in the S1 barrel cortex, labeled axons were visible in anterior parts of the ipsilateral claustrum, in the region located ventrally to the forceps minor of the corpus callosum, >3 mm anterior of bregma (**Figure 4C**). It was earlier demonstrated by retrograde tracing that this region projects to S1 (Zhang and Deschenes, 1998). Our findings of (anterograde) *Pha*-L labeling here thus indicate direct projections from S1 whisker representations. However, a recent study (Smith et al., 2012) failed to demonstrate corticoclaustral projections from S1 whisker representations, at least at more posterior levels. Recent proteomic analyses indicate that the claustrum is limited anteriorly to coronal levels which include the striatum (Mathur et al., 2009), and not the anterior region underlying the forceps minor of the corpus callosum, where we observed labeling. This suggests that the labeling we observed in the region defined as claustrum in our reference atlas (Paxinos and Watson, 2007) should not be interpreted as corticoclaustral projections (see footnote to **Table 1**).

Finally, in all cases but one (in which relevant sections were missing) some widespread labeled fibers were found in the ipsilateral reticular part of the substantia nigra (**Figures 4E,F**). While corticonigral projections from prefrontal and motor areas have been reported earlier (Gerfen et al., 1982), evidence of S1 corticonigral projections has to our knowledge not been reported before.

#### *Basal forebrain*

While the basal forebrain is known to project to the cerebral cortex (Sripanidkulchai et al., 1984), it is less clear if the basal forebrain receives projections from S1. In two cases injected in S1 whisker representations, we observed a few labeled fibers in the anterior part of the basolateral amygdaloid nucleus.

#### *Thalamus*

In agreement with earlier reports (Staiger et al., 1999; Wright et al., 1999; Veinante et al., 2000; Wright et al., 2000) we found substantial ipsilateral projections to the ventral posterolateral and ventral posteromedial thalamic nuclei, the posterior thalamic nuclear group, and the reticular thalamic nucleus (**Figures 4G,H**). Also, in all animals injected with BDA, multiple retrogradely labeled neurons were observed in these regions, reflecting the well-known reciprocal connections between S1 and the thalamus (Saporta and Kruger, 1977; Koralek et al., 1988; Berendse and Groenewegen, 1991; Fabri and Burton, 1991a). Further, in the three experiments involving the S1 whisker barrel cortex, we also found substantial labeling in the dorsal part of the ipsilateral submedius thalamic nucleus (**Figure 4I**), which in the two cases with the largest injection sites also included some contralateral labeling. The submedius nucleus is known to receive nociceptive input from the trigeminal nuclei and spinal cord, and has been implicated in modulatory nociceptive processes (Craig and Burton, 1981; Dostrovsky and Guilbaud, 1988; Miletic and Coffield, 1989). This region is reciprocally connected with the cerebral cortex in cat (Craig et al., 1982), but these connections have, as far as we can determine, not been emphasized in earlier studies of the rat brain.

#### *Zona incerta, subthalamic nucleus, and red nucleus*

Moderate amounts of fibers were found in the ipsilateral zona incerta (**Figures 4J,K**) and subthalamic nucleus, in line with earlier observations (Rouzaire-Dubois and Scarnati, 1985; Nicolelis et al., 1992). In cases injected into the whisker barrel cortex we also found a few labeled fibers in the red nucleus (**Figure 4E**). Although somatosensory projections to the red nucleus have been described by use of electrophysiological recordings (Ebrahimi-Gaillard and Roger, 1993) and retrograde tracing technique (Bernays et al., 1988; Akintunde and Buxton, 1992), our anterograde tracing results indicate that corticorubral projections from S1 forelimb and whisker representations are rather insignificant.

#### *Anterior pretectal nucleus and superior colliculus*

In all cases but one (from which relevant material was missing), moderate amounts of labeled fibers were observed in the anterior pretectal nucleus and superior colliculus. In the anterior pretectal nucleus, loose plexuses of labeled fibers were seen, confirming earlier observations of sparse connections by means of retrograde tracing (Cadussea and Roger, 1991). In the superior colliculus (**Figures 4L,M**), labeled fibers were loosely distributed across several layers, and did not aggregate in distinct, topographically organized clusters as described in several earlier studies (Wise and Jones, 1977b; Schwarz and Their, 1995; Hoffer et al., 2003, 2005).

#### *Pontine nuclei*

In all cases, we observed strong projections to the ipsilateral pontine nuclei (**Figures 4N,O**). These fibers were distributed in several well defined clusters in agreement with earlier observations (Leergaard, 2003; Leergaard and Bjaalie, 2007).

#### *Trigeminal nuclei*

Although the trigeminal nuclei are known to receive significant projections from the contralateral S1 (Wise et al., 1979; Killackey et al., 1989; Furuta et al., 2010; Tomita et al., 2012), we only observed limited amounts of labeled fibers in the trigeminal nuclei (**Figure 4P**) in three of six cases, including two cases with tracer injection into the S1 whisker representation (R602, R606) and one case with tracer injection in the forelimb representation (R605). These were the three experiments with relatively large injection sites (**Table 1**). The modest labeling observed in our material stands in contrast to the rather abundant corticotrigeminal labeling seen after tracer injections into S1 orofacial regions (Tomita et al., 2012).

#### *Dorsal column nuclei and spinal cord*

The corticocuneate and corticospinal projections of S1 are known (Wise and Jones, 1977a; Lue et al., 1997; Martinez-Lorenzana et al., 2001) and we also observed substantial amounts of labeled fibers in the corticobulbar and corticospinal tracts, which in some cases (where material was available) could be followed to the contralateral dorsal corticospinal tract. Sparse amounts of labeled fibers were observed in contralateral cuneate nuclei in two experiments where tracer was injected into the S1 forelimb representation, but it should be noted that in three cases material was not available from this region. The amount of labeling observed in our material is compatible with the observation that cortical neurons, retrogradely labeled by tracer deposits in the dorsal column nuclei, are relatively widespread in S1 (Martinez-Lorenzana et al., 2001).

### **NEGATIVE FINDINGS IN BRAIN REGIONS OTHERWISE NOT MENTIONED**

The present analysis covered all sections present in the brainwide collection of section images available in the Whole Brain Connectivity Atlas. All regions and subregions of the brain were manually inspected for labeling. Thus, our results strongly indicate absence of projections from S1 whisker and forelimb representations to brain regions not included in **Table 1**.

### **COMPARISON OF EFFERENT PROJECTIONS FROM S1 FORELIMB AND WHISKER REPRESENTATIONS**

Overall, our results show that S1 forelimb and whisker projections target many of the same cortical and subcortical regions (**Tables 1**, **2**; **Figure 5**), although with different topographical distributions within each region. Some important differences were observed (**Table 2**): the S1 whisker barrel cortex projects to several cortical areas that do not receive projections from the S1 forelimb region, such as the retrosplenial, perirhinal, ectorhinal, auditory, and visual cortex; S1 forelimb representations have more prominent projections to the motor areas (M1 and M2); and projections from the S1 whisker barrel cortex to the insular cortex are more abundant. We further observed some differences in the subcortical projections: the S1 barrel cortex targets the submedius thalamic nucleus, provides stronger projections to the superior colliculus and trigeminal nuclei, has weak projections to the basolateral amygdaloid nucleus and red nucleus, but has no projections to the cuneate nucleus.
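The difference column of Table 2 can be read as a region-by-region comparison of the two semiquantitative scores. A minimal sketch follows, assuming that the difference is the unsigned gap between the whisker and forelimb scores on the 0–3 scale. The region names and scores are placeholders, not values taken from the tables, and the function name is hypothetical.

```python
VALID_SCORES = (0, 1, 2, 3)


def projection_difference(whisker_score, forelimb_score):
    """Return the unsigned difference between two scores on the 0-3 scale."""
    for score in (whisker_score, forelimb_score):
        if score not in VALID_SCORES:
            raise ValueError(f"score outside the 0-3 scale: {score!r}")
    return abs(whisker_score - forelimb_score)


# Placeholder data, region -> (whisker score, forelimb score).
placeholder_scores = {
    "region A": (3, 3),  # equally strong from both representations
    "region B": (2, 0),  # labeled only after whisker injections
    "region C": (1, 2),  # stronger after forelimb injections
}

differences = {
    region: projection_difference(whisker, forelimb)
    for region, (whisker, forelimb) in placeholder_scores.items()
}

for region, diff in differences.items():
    print(f"{region}: difference = {diff}")
```

An unsigned difference records how much two projections differ but not which representation gives the stronger projection, so the sign has to be read from the two score columns themselves.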

### **COMPARISON WITH ACCUMULATED LEGACY DATA**

A large number of previous investigations have explored the connections of the S1 barrel cortex (see references above, and review by Bosman et al., 2011). Many publications on rat brain connections have also been collated and registered in the BAMS2 database (http://brancusi1.usc.edu/), although coverage here is far from exhaustive. We compared our results with S1 efferent connections registered in BAMS2 (only ipsilateral data were available), connections mentioned in a recent review article (Bosman et al., 2011), and projections reported in an earlier brain-wide tract-tracing study conducted in mice (Welker et al., 1988). **Table 3** provides an overview of these comparisons.

Comparing our results with BAMS2 (**Table 3**), we find that all ipsilateral cortico-cortical projections observed in our analysis are registered in BAMS2 with corresponding strengths, and further that BAMS2 contains reports of some additional weak projections to the orbital area (Paperna and Malach, 1991; Reep et al., 1996), temporal association cortex (Paperna and Malach, 1991), and entorhinal cortex (Burwell and Amaral, 1998), regions in which we found no labeling. Consulting the original reports we find that these weak connections were identified by observation of few scattered neurons retrogradely labeled by tracer injection in the different target regions (see, e.g., Burwell and Amaral, 1998). It is unclear whether such cells were located in S1 whisker or forelimb representations.

The collection of subcortical connections of S1 registered in BAMS2 is, however, different from our account, as major corticothalamic projections are not included in BAMS2. The annotated strengths of S1 projections to the striatum, posterior thalamic nuclear group, anterior pretectal nucleus, superior colliculus, and pontine nuclei registered in BAMS2 matched fairly well with our results. Projections to zona incerta, subthalamic nucleus, red nucleus, trigeminal nuclei, and cuneate nucleus are so far not included in BAMS2. BAMS2 contained reports of weak subcortical projections to the nucleus of the optic tract (Schmidt et al., 1993), raphe nuclei (O'Hearn and Molliver, 1984), ventral tuberomammillary nucleus (Köhler et al., 1985), and reticulotegmental nucleus of the pons (O'Hearn and Molliver, 1984), all regions in which we found no labeling. Consulting the original articles, we find that these concern retrograde tracing experiments, yielding some labeling in the parietal cortex, which may or may not include the regions investigated in the present study.

**FIGURE 5 | Wiring diagram, summarizing findings.** Summary diagram showing all connections observed in experiments. Connections arising from S1 whisker representations are indicated by red lines, and connections arising from S1 forelimb representations are indicated by blue lines. Line thickness corresponds to the amount of labeling (low, medium or high) observed, as indicated in **Table 1**. APN, anterior pretectal nucleus; Au1, primary auditory cortex; AuD, secondary auditory cortex, dorsal area; BLA, basolateral amygdaloid nucleus, anterior part; Cg1, cingulate cortex, area 1; Cl, claustrum; CPu, caudate putamen (striatum); Cu, cuneate nucleus; Ect, ectorhinal cortex; Ins, insular cortex; M1, primary motor cortex; M2, secondary motor cortex; Pn, pontine nuclei; Po, posterior thalamic nuclear group; PoT, posterior thalamic nuclear group, triangular part; PRh, perirhinal cortex; PtP, posterior parietal cortex; R, red nucleus; RSD, retrosplenial cortex; Rt, reticular thalamic nucleus; S1, primary somatosensory cortex; S2, secondary somatosensory cortex; SC, superior colliculus; SNr, substantia nigra, reticular part; STh, subthalamic nucleus; SubD, submedius thalamic nucleus, dorsal part; TN, trigeminal nuclei; V1, primary visual cortex; V2, secondary visual cortex; VA/VL, ventral anterior and ventrolateral thalamic nucleus; VPL, ventral posterolateral thalamic nucleus; VPM, ventral posteromedial thalamic nucleus; ZI, zona incerta.

Finally, when comparing our results with a recent review of the rodent barrel cortex (Bosman et al., 2011) and an earlier brainwide tract tracing study in the mouse brain (Welker et al., 1988), we find that all major connections are mentioned in these reports, while most of the moderate or weaker projections observed in our study (and to some extent also registered in BAMS2) are not included.
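The subcortical comparison with BAMS2 described above is essentially a set comparison between target regions. The sketch below illustrates that logic with the region names mentioned in the text. The lists are partial (they omit, for example, the thalamic targets that BAMS2 does not cover) and are transcribed from the prose rather than exported from BAMS2; the function and variable names are hypothetical.

```python
def compare_region_sets(present_study, legacy):
    """Split target regions into shared, present-study-only and legacy-only."""
    present_study, legacy = set(present_study), set(legacy)
    return {
        "both": sorted(present_study & legacy),
        "present_study_only": sorted(present_study - legacy),
        "legacy_only": sorted(legacy - present_study),
    }


# Partial lists of subcortical targets, taken from the text above.
present_study_targets = [
    "striatum", "posterior thalamic nuclear group",
    "anterior pretectal nucleus", "superior colliculus", "pontine nuclei",
    "zona incerta", "subthalamic nucleus", "red nucleus",
    "trigeminal nuclei", "cuneate nucleus",
]
bams2_targets = [
    "striatum", "posterior thalamic nuclear group",
    "anterior pretectal nucleus", "superior colliculus", "pontine nuclei",
    "nucleus of the optic tract", "raphe nuclei",
    "ventral tuberomammillary nucleus",
    "reticulotegmental nucleus of the pons",
]

comparison = compare_region_sets(present_study_targets, bams2_targets)
for category, regions in comparison.items():
    print(f"{category}: {', '.join(regions)}")
```

A comparison of this kind only works once region names have been harmonized between atlases, which is the nomenclature problem raised in the Discussion.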

## **DISCUSSION**

We have mapped projections to cortical and subcortical targets originating from the S1 whisker and forelimb representations in rat. Anterogradely labeled axons, originating from tracer injections in S1 cortex of six animals, were identified across a large collection of histological images (Zakiewicz et al., 2011). Compared to earlier efforts, our brain-wide analysis (summarized in **Tables 1**–**3**, and **Figure 5**) contributes more complete and detailed information about S1 efferent projections, including differences between projections from the S1 whisker and forelimb cortex. Our comparison of the efferent projections of S1 whisker and forelimb representations shows that these generally reach the same targets, but that projections from the S1 barrel cortex target more (sensory related) cortical areas as well as some additional subcortical brain regions. The present analysis is based on experimental tract tracing data from adult male Sprague Dawley and Wistar rats, using two different axonal tracers (**Table 1**). *Pha*-L is considered to be a pure anterograde tracer showing little uptake by fibers of passage (Wouterlood and Jorritsma-Byham, 1993), while BDA can be taken up by passing fibers and also has retrograde properties which may give rise to secondary, or indirect, anterograde labeling (Merchan et al., 1994; Merchan and Berbel, 1996; Lanciego and Wouterlood, 2000). Regardless of these different parameters, the overall pattern of connections observed in this material is remarkably consistent across strain and tracers used (**Table 1**).

#### **Table 3 | Comparison with legacy data.**

*Columns A–C show a comparison of efferent projections from the S1 barrel cortex reported in the present study (column A), a recent review report of the rodent barrel cortex (Bosman et al., 2011), and a brain-wide analysis in mice (Welker et al., 1988). Columns D and E provide a comparison of our findings with ipsilateral projections of the entire S1 region registered in the BAMS2 database, with projection strength re-interpreted to the scale employed in the present study. Data in column D show the maximum average S1 forelimb or whisker projections (from Table 2).*

All six injection sites were columnar in shape and involved all cortical layers without infringement of white matter in the external capsule. The experiments provide information about the efferent connectivity of the entire S1 injection sites, but without the possibility of differentiating layer-specific connections. With semiquantitative assessment we observe a robust relationship between injection site volumes and amount of labeling. The relatively small injection sites may account for weak projections. Hence, the absence of labeled fibers in the cingulate cortex, claustrum, basolateral amygdaloid nucleus, and trigeminal nuclei in one (case R604) out of three experiments with tracer injection in the S1 barrel cortex can be explained by the considerably smaller size of the BDA injection.

Nearly all of the connections demonstrated in our survey have been reported earlier, and only a few projections not observed in our material have been reported in the literature. Thus, our report is in general agreement with earlier literature, and provides the most complete overview so far of the efferent projections of rat S1 barrel cortex. However, an overwhelming wealth of scientific reports describing various details of the connectivity of the S1 barrel cortex exists, and a comprehensive review of the S1 connectivity literature is beyond the scope of our study.

Discrepancies with earlier observations may reflect biological variability or variation in the employed tract tracing paradigms (tracer properties, size and position of tracer injection site, and effective zone of tracer uptake). Reports of connections not observed in the present study mainly concern retrograde tracing studies demonstrating sparse amounts of labeled neurons in the parietal cortex, which may or may not involve the specific S1 representations investigated in our study. There is also a concern that some connections identified by retrograde tracing may involve false positive labeling caused by contamination or uptake of tracer in passing fibers. Our results further highlight the challenges related to the use of different nomenclature and boundary definitions, and the need for efficient ways to compare and translate between different brain atlases. This is particularly evident with respect to the claustrum, where the employed atlas (Paxinos and Watson, 2007) does not hold more recent structural information (Mathur et al., 2009). It is thus unclear which anatomical location would be appropriate for the fibers observed in the region previously regarded as the anterior part of the claustrum. A related problem is found with our observations of S1 barrel cortex projections to the ectorhinal cortex, which is referred to by a different term (postrhinal cortex) in earlier studies of connections (Burwell et al., 1995; Naber et al., 2000).

Some more subtle differences between our results and earlier reports should be mentioned: The observed S1 projections to the red nucleus appear very weak in our material, which is at odds with earlier electrophysiological reports of somatosensory cortical influence on the red nucleus (Ebrahimi-Gaillard and Roger, 1993). This discrepancy may reflect the selection of S1 representations involved in our study. Similarly, the projections to the superior colliculus are unexpectedly weak in our material, as compared to other investigations which have reported strong corticotectal projections from the S1 barrel cortex (Schwarz and Their, 1995; Hoffer et al., 2003, 2005). We have no explanation for this difference, other than experimental factors such as the size and position of the tracer injections.

Overall, relative to the S1 forelimb representation, our study shows that the S1 whisker barrel cortex has more abundant projections to cortical and subcortical regions that are relevant in the context of sensory exploration, such as the perirhinal and ectorhinal cortex, which are implicated in sensory integration and gating (Naber et al., 2000; Rodgers et al., 2008), and to the submedius nucleus of the thalamus, which modulates nociceptive processes (Craig and Burton, 1981; Miletic and Coffield, 1989; Blomqvist et al., 1992).

The presented results are of relevance for ongoing large-scale efforts to systematically map connections in the rodent brain, such as the Mouse Brain Connectome Project and the Allen Mouse Brain Connectivity Atlas. These initiatives provide access to very large collections of images containing tract-tracing data resulting from tracer injections in various parts of the mouse brain. Similar to the Whole Brain Connectivity Atlas resource utilized in our project, these projects provide online access to serial image data in web browsers, allowing investigators to inspect tracer injection sites and ensuing labeling patterns. These resources are conceptually quite similar to the data collection investigated in the present study, and face the same challenges with respect to analysis, interpretation, and extraction of knowledge about connectivity. The three-dimensional image viewer provided by the Allen Mouse Brain Connectivity Atlas offers additional advantages. When looking up experiments involving the S1 barrel cortex, it is straightforward to view well-known projections to, e.g., the ipsilateral M1, contralateral S1, striatum, thalamus, and pontine nuclei. However, the existence of projections to other known targets can only be confirmed by more detailed anatomical analysis of individual section images.

## **CONCLUSIONS**

We have performed the first brain-wide survey of whisker and forelimb related S1 efferent connections in rat based on data shared through an online atlas. The observed connectivity patterns were highly consistent across the 6 experiments, and some distinct differences were observed between projections from S1 forelimb and whisker representations. In comparison to earlier efforts to generate overviews of S1 efferent projections in the rodent brain based on the available literature, our analysis has provided a more detailed overview, allowing assessment of projection strength across target regions and comparison of projections originating from different subregions of S1. Access to organized collections of raw image data and accompanying tools for viewing and inspection of the data represents a first step only. Conclusions regarding connectivity require attention to interpretation of location of labeling in relation to boundaries and potential sources of error in the experiments. Our study sheds light on important challenges inherent to such analyses.

#### **ACKNOWLEDGMENTS**

This research was funded by The Research Council of Norway and EMBIO/MLS@UIO. We thank Dmitri Darine, Ivar A. Moene, and Muthuraja Ramachandran for expert technical assistance, and Dr. Mihail Bota for kindly providing data files exported from the BAMS2 database.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 October 2013; accepted: 14 January 2014; published online: 05 February 2014.*

*Citation: Zakiewicz IM, Bjaalie JG and Leergaard TB (2014) Brain-wide map of efferent projections from rat barrel cortex. Front. Neuroinform. 8:5. doi: 10.3389/fninf.2014.00005*

*This article was submitted to the journal Frontiers in Neuroinformatics.*

*Copyright © 2014 Zakiewicz, Bjaalie and Leergaard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Building the Ferretome

#### Dmitrii I. Sukhinin<sup>1</sup>, Andreas K. Engel<sup>2</sup>, Paul Manger<sup>3</sup> and Claus C. Hilgetag<sup>1,4</sup>\*

<sup>1</sup> Department of Computational Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, <sup>2</sup> Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, <sup>3</sup> School of Anatomical Science, University of the Witwatersrand, Johannesburg, South Africa, <sup>4</sup> Department of Health Sciences, Boston University, Boston, MA, USA

Databases of structural connections of the mammalian brain, such as CoCoMac (cocomac.g-node.org) or BAMS (https://bams1.org), are valuable resources for the analysis of brain connectivity and the modeling of brain dynamics in species such as the non-human primate or the rodent, and have also contributed to the computational modeling of the human brain. Another animal model that is widely used in electrophysiological or developmental studies is the ferret; however, no systematic compilation of brain connectivity is currently available for this species. Thus, we have started developing a database of anatomical connections and architectonic features of the ferret brain, the Ferret(connect)ome, www.Ferretome.org. The Ferretome database has adapted essential features of the CoCoMac methodology and legacy, such as the CoCoMac data model. This data model was simplified and extended in order to accommodate new data modalities that were not represented previously, such as the cytoarchitecture of brain areas. The Ferretome uses a semantic parcellation of brain regions as well as a logical brain map transformation algorithm (objective relational transformation, ORT). The ORT algorithm was also adopted for the transformation of architecture data. The database is being developed in MySQL and has been populated with literature reports on tract-tracing observations in the ferret brain using a custom-designed web interface that allows efficient and validated simultaneous input and proofreading by multiple curators. The database is equipped with a non-specialist web interface. This interface can be extended to produce connectivity matrices in several formats, including a graphical representation superimposed on established ferret brain maps. An important feature of the Ferretome database is the possibility to trace back entries in connectivity matrices to the original studies archived in the system.
Currently, the Ferretome contains 50 reports on connections comprising 20 injection reports with more than 150 labeled source and target areas, the majority reflecting connectivity of subcortical nuclei and 15 descriptions of regional brain architecture. We hope that the Ferretome database will become a useful resource for neuroinformatics and neural modeling, and will support studies of the ferret brain as well as facilitate advances in comparative studies of mesoscopic brain connectivity.

#### Edited by:

Arjen Van Ooyen, VU University Amsterdam, Netherlands

#### Reviewed by:

Gully A. Burns, USC Information Sciences Institute, USA Hidetoshi Ikeno, University of Hyogo, Japan

> \*Correspondence: Claus C. Hilgetag c.hilgetag@uke.de

Received: 21 January 2015 Accepted: 14 April 2016 Published: 10 May 2016

#### Citation:

Sukhinin DI, Engel AK, Manger P and Hilgetag CC (2016) Building the Ferretome. Front. Neuroinform. 10:16. doi: 10.3389/fninf.2016.00016

Keywords: brain connectivity, connectomics, tract-tracing, ferrets, databases as topic, models, theoretical

## INTRODUCTION


## Connectomics

A central perspective for analyzing brain data is the representation of neural relations as complex networks. This representation can be used for almost all structural-functional dimensions of the brain, from the molecular to the systems scale, and structural to cognitive characterizations. The network-theoretical approach is a powerful tool in the hands of neuroscientists, because it provides a formalized framework for the analysis of complex interactions (Klimm et al., 2014). In particular, different types of brain connectivity can be distinguished, such as functional connectivity (reflecting statistical dependencies among neurophysiological events) as well as effective (causal) connectivity (Friston, 1994). The most fundamental type of connectivity is structural or anatomical connectivity, which provides a structural network basis of brain dynamics and function.

Several current projects address the challenge of collating the complete structural network of the brain, the so-called connectome (Sporns et al., 2005), from the cellular to the mesoscopic and macroscopic scale (Leergaard et al., 2012). The neuronal micro-connectome, which is based on invasive methods of imaging and the reconstruction of neuronal elements (including synapses) from brain sections (see Van Essen et al., 2013 for an extensive review), may form the ultimate structural basis of the brain. However, connectomics at the cellular level faces a host of conceptual and technical challenges, and cellular connectomes have so far only been completed for the small nervous system of the nematode Caenorhabditis elegans, possessing just 302 neurons (White et al., 1986; Varshney et al., 2011), as well as partly for neural populations in the zebrafish (Friedrich, 2013) and Drosophila (Chiang et al., 2011; Shih et al., 2015). One of the main problems of constructing connectomes at the microscopic level is the computationally demanding reconstruction of synaptic connections from the raw data, which places limitations on the volume of brain tissue that can be studied (Helmstaedter et al., 2008). Recently, considerable progress in overcoming these limitations has been made in terms of methodology (reviewed in Kleinfeld et al., 2011), resulting in advances that may eventually lead to the creation of a whole connectome of the mouse brain (Mikula et al., 2012). Moreover, by applying new methods from genomics, it might be possible to create micro-connectomes for a wide range of species (Zador et al., 2012).

Examples of connectomes at the macroscopic level include the recently published data on brain-wide mouse connectivity (Oh et al., 2014; Zingg et al., 2014), partly based on optogenetic methods for labeling and tracing axonal connections of large-scale regions of interest (that is, cortical areas and subcortical nuclei). Further anatomical tracing techniques can be used to obtain structural connectivity at the mesoscopic level. The conventional method of histochemical tract-tracing has produced significant insights into the organization of brain connectivity and has resulted in an extensive body of connection data, for example, a detailed description and analysis of macaque monkey visual cortical connectivity (Felleman and Van Essen, 1991) and connectivity of the entire cat cortical (Scannell et al., 1995) and thalamocortical system (Scannell et al., 1999) at the mesoscopic level, as well as extensive connectivity of the rat at the systems level (Bota et al., 2015). These connectivity data were compiled from traditional neuroanatomical studies performed during the last decades. As a further attempt to systematize this approach of generating structural connectivity, and in order to deal with methodological problems such as different parcellation approaches and methods of labeling, connectivity databases such as the CoCoMac database were created (Stephan et al., 2001; Bakker et al., 2012; Stephan, 2013). Over a period of more than 10 years, hundreds of tract-tracing reports for the macaque monkey brain were collated in CoCoMac (Bakker et al., 2012).

A fundamental problem of conventional anatomical tract-tracing studies is that, due to their invasiveness, they cannot be performed in humans. This limitation raises questions about the applicability of data gathered in animal models to humans. The problem can be ameliorated by comparative studies of different animal models (Bohland et al., 2009; Goulas et al., 2014; Zingg et al., 2014; Bota et al., 2015), and through newly developed non-invasive techniques for imaging connectivity-related parameters. For example, diffusion imaging methods such as diffusion tensor imaging (DTI) or diffusion spectrum imaging (DSI) can be used to produce entire connectomes of a human brain in relatively short time (Van Essen et al., 2012). Diffusion imaging measures the anisotropy of water diffusion along axonal paths, which can then be used to infer the course of fiber tracts. The approach is systematically exploited by large-scale projects such as the Human Connectome Project (Toga et al., 2012), which aims to provide a comprehensive description of all long-range pathways of the human brain. However, diffusion-based approaches may be prone to several measuring and reconstruction artifacts (Farquharson et al., 2013).

The rise of new imaging methods such as DTI raises the question of whether connectivity databases based on laborious and invasive anatomical tract-tracing studies are still required. The answer should be affirmative, as such conventional data provide a well-established 'gold standard' of structural brain connectivity. With this approach, one can directly observe the labeled origins and terminations of projection neurons in different brain regions, gather information on the axonal density and direction of projections as well as finer details, such as the laminar origins and terminations of projections. All of these aspects, which may be of substantial functional importance (e.g., Vanduffel et al., 1997), are currently not accessible by diffusion-based tractography.

It should, however, be noted that conventional anatomical tract-tracing studies are not without potential technical and methodological problems either, considering, for example, mislabeling due to the spillage of tracer injections into neighboring regions or the white matter (for further discussion of these issues see Kötter, 2001). Moreover, there are also challenges associated with the many alternative ways of parcellating the brain into different areas, according to criteria that are not fully objective. For example, brains may be parcellated by using various multi-modal macroscopic or cytoarchitectonic criteria (Dombrowski et al., 2001; Amunts et al., 2014), as well as personal preferences. One way to address these problems is through knowledge management methodology. One current project in this field is Neurolex (Neurolex.org; Larson and Martone, 2013), which allows neurobiological knowledge to be organized and queried by inter-referencing and linking it to detailed empirical data. A further example is UBERON<sup>1</sup>, which provides cross-species hierarchical parcellations of regions of interest of the nervous system. However, due to the cross-species generality of the approach, the annotation is rather coarse, as contrasted with detailed existing parcellations in an individual species such as the ferret, which, for structures such as the cerebral cortex, already possess several dozen parcels. Therefore, the practical value of this systematic approach for the current project is limited. Generally, despite the obvious advantages of a systematic organization of neurobiological knowledge for the scientific community, advances in knowledge management methodology are still mostly ignored by the authors of tract-tracing reports (see Bakker et al., 2012 for review).

In addition, many reports in the literature do not provide quantitative data on the number of labeled neurons or the numerical density of axonal terminations, but only categorical information on the presence or absence of pathways, or comparative qualitative measures, such as 'low'/ 'average'/ 'high' density of connections (Lanciego and Wouterlood, 2011). This type of coding may encompass a great range of quantitative values. For example, the density of anatomical pathways (that is, the number of axons in them) can extend over five orders of magnitude (Markov et al., 2011) and may be poorly captured by a limited number of ordinal categories.
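The loss of resolution caused by ordinal coding can be illustrated with a small sketch. It assumes, purely for illustration, that axon counts between 1 and 100,000 (five orders of magnitude) are divided into three equally wide categories on a logarithmic scale; the function name, the category boundaries and the example counts are hypothetical and are not taken from any of the cited reports.

```python
import math


def ordinal_category(n_axons, max_log10=5.0):
    """Map an axon count (>= 1) onto three log-spaced ordinal categories."""
    labels = ("low", "average", "high")
    # Position of the count on a logarithmic scale, between 0 and 1.
    position = math.log10(n_axons) / max_log10
    # Clamp so that the maximum count still falls into the last category.
    index = min(int(position * len(labels)), len(labels) - 1)
    return labels[index]


# Two hypothetical pathways that differ 40-fold in axon number.
sparse_pathway = ordinal_category(50)
dense_pathway = ordinal_category(2000)
```

Under these assumptions, both hypothetical pathways receive the same label, although one contains 40 times as many axons as the other.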

## The Model System of the Ferret Brain

Due to limitations of directly investigating the structural connectivity of the human brain, research has turned to animal models, where extensive developmental, behavioral, or electrophysiological data can be obtained. Here, the ferret brain has some distinctive advantages. For example, one benefit for developmental studies is that the ferret brain has a convoluted, gyrencephalic surface, so that the process of gyrification can be observed in detail (Sawada and Watanabe, 2012). Immaturity of the ferret at birth helps to investigate developmental processes that occur prenatally in other species, such as the cat, and, for example, allows conducting systematic experiments with altered connectivity in order to observe the adaptation of cortical areas to new sensory stimuli (Noctor et al., 2001). Moreover, the relative developmental immaturity of the neonatal ferret facilitates studies on how early lesions in one part of the brain may affect connectivity in other regions (Restrepo et al., 2002), and how lesions have an impact on the development of topographical maps and connectivity between the cerebral hemispheres (Restrepo et al., 2003). A further advantage of the ferret is that its brain shows substantial homologies with other species, such as the cat (Manger et al., 2010) as well as potentially with other carnivores such as the dog (Onishi et al., 2007). Taking these factors into account, extensive work has been performed in this animal model using electrophysiology to relate patterns of electrical activity to behavior (e.g., Fritz et al., 2003; Bizley et al., 2013). These studies have shown that ferrets possess intricate sensory cortical systems (Phillips et al., 1988; Nelken and Versnel, 2000; Innocenti et al., 2002; Bizley et al., 2005, 2007; Manger et al., 2005), making them an appropriate model for the study of sensory processing pathways, response properties and topographies of sensory neurons and multisensory interactions.
In fact, there exists no comparable model at the moment that combines elaborate and easily trainable behavior with the opportunity for extensive anatomical and physiological as well as developmental studies. In particular, similar studies in primates, which proceed only in very few labs, are much more restricted in the scope of investigations and the number of animals studied.

In addition to the advantages of the ferret brain model for anatomico-physiological research, one should also point out its usefulness for comparative studies. Currently, systematically compiled macro-connectivity data are only available for a restricted range of species (macaque monkey, cat, rat, and mouse) limiting the ability of cross-species analyses. Extending the number of available connectomes of different species for systematic statistical and graph theoretical analyses can shed light on the general organization of connectivity patterns in mammalian brain networks (Striedter, 2005). One successful example of such inter-species comparisons is the identification of a densely connected 'rich club' of core brain regions in different species (van den Heuvel and Sporns, 2011; Harriger et al., 2012; Towlson et al., 2013) and its role in brain diseases (van den Heuvel et al., 2013).

Hence, a detailed macroconnectome of the ferret brain will facilitate comparative anatomical studies and support cross-domain exchange in anatomy, electrophysiology and connectomics. Another specific motivation of the Ferretome project is to provide data for the connectivity-based modeling of ferret brain dynamics. This modeling project is part of a research collaboration with experimentalists recording brain activity at multiple sites of the ferret brain using ECoG and multi-electrode approaches (Stitt et al., 2015a,b). As a necessary precondition for the modeling, the structural connectivity of the ferret brain as well as further features of its brain architecture need to be known. However, at the moment, no systematic compilation of connectivity is available for this species. Creating a repository of the macroconnectivity of the ferret brain is a complex task. The collation of data from published tract-tracing reports faces problems similar to those previously addressed by the CoCoMac database (Bakker et al., 2012) or projects such as BAMS (Bota et al., 2005, 2015) and neuroVIISAS (Schmitt and Eipert, 2012). Thus, in the following section we provide a short review of existing database projects that aim at storing connectivity data, in order to define the parameters of a suitable architecture for the ferret brain connectivity database.

<sup>1</sup>uberon.github.io

## COMPARABLE WORK


In the area of connectivity databasing, two main types of approaches for representing brain topography can be distinguished: coordinate-based vs. semantic or logical parcellation schemes. The first type is represented by the XANAT system (Press et al., 2001), while the second approach is used by the remainder of projects reviewed below.

XANAT (Press et al., 2001) was one of the first systems for storing, comparing and analyzing the results of neuroanatomical connection studies. Data can be entered into the system by placing injection and label sites into canonical representations of the neuroanatomical structures of interest, along with verbal descriptions. After the entry procedure, a graphical search can be performed on the data by selecting a specific brain site, or a textual search can be performed using keywords or references to original studies. An important feature of the system is that data may be studied and compared relative to well-known neuroanatomical substrates or stereotaxic coordinates regardless of variable areal boundaries (Press et al., 2001). XANAT can be downloaded and run in the Unix X window environment (as reflected in the name of the software).

The brain architecture management system (BAMS) (Bota et al., 2005, 2015) is a representative example of the attempt to store comprehensive structural descriptions of the brain. Information about four main entities and their attributes can be kept in the system: connections, relations, cell types and molecules. The connections entity represents records of data and metadata of macroscopic neuroanatomical projections between brain regions. The relations entity describes qualitative spatial relations between brain regions. Cell type attributes provide descriptions of neurons, neuronal populations and their classifications. The molecules category represents data on molecules (e.g., neurotransmitters) specific to neurons and brain regions.

BAMS is accessible online via a web interface<sup>2</sup> . The server part is written in PHP<sup>3</sup> and the database itself is handled by MySQL<sup>4</sup> . In BAMS, data can be stored and found for different species; however, the majority of it reflects structural descriptions of the rat. Some data can be exported for further analysis in structured formats (for example, as an adjacency matrix).

A further system, the NeuroVIISAS platform (NeuroVisualization, Image mapping, Information System for Analysis and Simulation; Schmitt and Eipert, 2012), is an example of a neuroinformatics approach that aims to link the storage of connectivity information with visualization and analysis. NeuroVIISAS is an open framework that allows users to perform integrative data analysis, visualization of the data, and even population simulations with the help of a link to the NEST software for neuronal simulations (Gewaltig and Diesmann, 2007). During the data analysis step, it is possible to use a variety of network manipulations, such as network randomization and comparisons to benchmark networks (e.g., scale-free networks). Connectivity matrices can be visualized together with summary indices for characterizing brain connectivity, such as the clustering coefficient (Holland and Leinhardt, 1971) or the joint degree distribution (Albert and Barabási, 2002). Visualization, in particular of rat connectivity, can be provided in the framework of the Paxinos and Watson (2006) atlas. Population simulations based on the connectivity data can be performed using PyNEST (Davison et al., 2008) and NEST (Gewaltig and Diesmann, 2007). In this way, neurobiologically defined connectivity is integrated with computational neuroscience simulations. After script generation and simulation, the produced results can be imported back into NeuroVIISAS and visualized in various formats, including 3D visualization. NeuroVIISAS is free software implemented in Java, with versions for Windows and Linux, which can be operated locally. The main advantage of this approach is that a researcher's own data (connectivity or mapping information) can be quickly added to the framework and analyzed, visualized, and simulated in the local environment.

Finally, CoCoMac (Collation of Connectivity data on the Macaque brain) is a connectivity database and neuroinformatics platform that has been developed for more than a decade (Stephan et al., 2001; Bakker et al., 2012; Stephan, 2013). CoCoMac aims to store two main modalities of data: connectivity tract-tracing studies as well as mapping studies of (mainly) the rhesus macaque. CoCoMac addresses central challenges of collations of connectivity from the anatomical literature, such as the absence of spatial coordinates in many primate anatomical studies and of a universally accepted brain map for the macaque monkey. These aspects result in inconsistencies between alternative brain parcellation schemes, as well as ambiguities and contradictions of results from different tract-tracing studies. The CoCoMac creators postulated five main principles for their project: Objectivity, Reproducibility, Transparency, Flexibility, and Simplicity. These principles reflect the way in which the system links to original data, as well as the schema by which data are inserted and processed in the database. In particular, a specific algorithmic framework was developed, termed objective relational transformation (ORT; Stephan and Kötter, 1999; Stephan et al., 2000b). This framework allows the transformation of all available connectivity data in one brain map into another map, according to relations between areas and brain maps established in the anatomical literature, using an encoding of logico-spatial relations between the regions (e.g., an area A is smaller than, bigger than, equal to, or overlaps with, another area B).

Originally, CoCoMac was created in MS Access, but subsequently the database was converted to MySQL and made accessible through a web interface, with the server side programmed in PHP. With the update to a new version<sup>5</sup> , CoCoMac received several new features including a search/browse wizard and direct access to the database content through specifically developed viewers (Bakker et al., 2012).

In summary, in this section we reviewed existing neuroinformatical approaches for representing experimentally established brain connectivity as a network model at different scales. Despite the rise of new experimental methods, such as DSI/DTI, at the macroscopic level, anatomical tract-tracing studies are still the most reliable source of connectivity data. Availability of macroscopic connectivity data for a variety of species will facilitate comparative studies and deepen our understanding of the particular organization of the human brain. One popular animal model is the ferret due to its valuable features, such as elaborate behavior and immaturity at birth. Creating a complete brain connectivity scheme of an animal even as small as a ferret is a complex task that requires the help of modern methods in computer science such as online databasing. In the next section we turn to the issue of building such a database, populating it with data, supporting it and extracting summary results.

<sup>2</sup>https://bams1.org

<sup>3</sup>https://www.php.net

<sup>4</sup>https://www.mysql.com

<sup>5</sup>http://cocomac.g-node.org

## METHODS

## Basic Design

From a conceptual point of view, the main structure of the Ferretome database was derived from the CoCoMac project (Stephan et al., 2001). The CoCoMac data model allows the storage of three data modalities: mapping information, labeling data, and meta data about brain map relations as well as special data codes.

Mapping information is based on published verbal or graphic descriptions of brain parcellations, structuring the brain into multiple areas and nuclei, typically according to the characteristic architectonic or physiological properties of these parcels (see **Figure 1** for illustration.)

Connection labeling information is based on verbal or graphic descriptions of results of labeling experiments. In the tract-tracing literature, the results of connection labeling experiments may be published together with their own mapping scheme or use previously published maps. In both cases, a tract-tracing experiment describes locations of tracer injections (injection site – a brain area in a specific brain map or part of a brain region, e.g., "caudal parietal cortex") and locations where tracer was found (labeled sites). The density of the label is usually coded in a qualitative parameter – from 'weak' to 'strong.' Further information about the tract-tracing methodology may be given (for example, the number of studied animals, type of tracer and its amount, survival time of animals and thickness of brain sections that were evaluated). See **Figure 2** for details.
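As a minimal sketch of how such labeling records could be turned into directed connections, the following example represents one injection with its labeled sites and derives (source, target, density) tuples from it. The class names, field names and area names are hypothetical and do not correspond to the actual Ferretome or CoCoMac tables; the only assumption about tracing itself is the standard one that anterograde tracers label targets of the injected area, whereas retrograde tracers label its sources.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LabeledSite:
    area: str      # area name within a given brain map
    density: str   # qualitative label density, e.g. "weak" or "strong"


@dataclass
class Injection:
    site: str                   # injected area
    tracer_direction: str       # "anterograde" or "retrograde"
    labeled_sites: List[LabeledSite] = field(default_factory=list)


def to_edges(injection: Injection) -> List[Tuple[str, str, str]]:
    """Return directed (source, target, density) tuples for one injection."""
    if injection.tracer_direction not in ("anterograde", "retrograde"):
        raise ValueError("unknown tracer direction: " + injection.tracer_direction)
    edges = []
    for labeled in injection.labeled_sites:
        if injection.tracer_direction == "anterograde":
            # Tracer travels from cell bodies to terminals: injection -> label.
            edges.append((injection.site, labeled.area, labeled.density))
        else:
            # Tracer travels from terminals to cell bodies: label -> injection.
            edges.append((labeled.area, injection.site, labeled.density))
    return edges


# Hypothetical retrograde injection into "AreaX".
example = Injection(
    site="AreaX",
    tracer_direction="retrograde",
    labeled_sites=[LabeledSite("AreaY", "strong"), LabeledSite("AreaZ", "weak")],
)
edges = to_edges(example)
```

Collecting such tuples over all injections would give the entries of a directed connectivity matrix, with each entry still traceable to the injection record it came from.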

Meta information can be divided into two main types. The first type concerns brain map relations. This type of data is published in its own right or provided as part of tract-tracing studies and is usually given as a verbal description of how brain areas in one parcellated map are related to brain areas in another map. Across the tract-tracing literature, five main relations of brain areas can be found. Brain areas can be identical, area A may be a subarea of area B, two areas can overlap with each other, area B may be a subarea of A, or the areas may be unrelated.

As a second type of meta information, the creators of CoCoMac introduced special descriptions in order to cope with issues of data ambiguity and lack of data. The first of these descriptions is the "Extension code." This code describes the extent of information available for a brain area or a labeled site. This code has several states: information may be available for an entire brain site, part of a brain site or for no part of a brain site. This code is used subsequently by the algorithmic engine of CoCoMac.

A further kind of characterization is given by the so-called precision data codes (PDC). PDCs were used in CoCoMac in order to cope with situations where the information contained in the text of a paper apparently contradicts information in figures or tables. Here, the PDC is coded by letters from "A" to "Q," where "A" stands for the most reliable and consistent description. For example, the PDC "A" for specifying a brain area signifies that "The area is named explicitly in the text/tables and identified with certainty. Additional figures explicitly support the text by showing present (or missing) label in areas defined by names and/or borders", whereas "Q" indicates: "The information about the (un)labeled area is not from an original research report, but from a review article" (more details can be found in Stephan et al., 2001). CoCoMac provides several types of PDCs for different types of data, for example, PDC\_BrainArea, PDC\_lamina, with their own specific descriptions.

All three data modalities can only be entered into the database together with links to a concrete data source. For this purpose, CoCoMac and Ferretome.org provide special tables to store information on literature references and their authors (**Figure 3**).

Another distinctive feature of CoCoMac is the incorporation of the approach of ORT (Stephan et al., 2000b). This powerful algorithm allows the automatic conversion of all available data (including PDCs) from one given brain map into another. ORT uses a custom-developed relational algebra that handles the five main relations between brain areas, as mentioned above: identical, subarea, larger, overlap and disjoint (for details see Stephan et al., 2000b). Specifically, if there exists a report that specifies a relation among brain maps, then it is possible to transform connectivity data from one report to another and hence to build a consistent description of brain connectivity. For example, if two areas from two different brain maps are specified by a report as "identical," then all data associated with these areas can be easily transferred from one map to another. In addition to transforming data for known relations among brain maps, ORT is capable of discovering previously undefined relations between brain areas of different maps (i.e., which are not yet specified in the anatomical literature). For example, if it is known that "A" is identical to "B" and "B" identical to "C," it can be inferred that "A" is identical to "C." The algorithm can also identify inconsistent relations (such as that "A" is a subarea of "B" while also "B" is a subarea of "A").
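The two kinds of inference described above, deriving new relations and detecting inconsistent ones, can be sketched as follows. This is a simplified illustration and not the relational algebra of Stephan et al. (2000b): the single-letter relation codes are hypothetical, and only those compositions that follow from set inclusion alone are filled in, with every other combination treated as undetermined.

```python
# Hypothetical relation codes for rel(A, B):
# I = identical, S = A is a subarea of B, L = A is larger than (contains) B,
# O = partial overlap, D = disjoint (unrelated).
INVERSE = {"I": "I", "S": "L", "L": "S", "O": "O", "D": "D"}

# Compositions that follow from set inclusion alone; any other
# combination is treated as undetermined (None).
COMPOSE = {
    ("S", "S"): "S",  # A within B, B within C        -> A within C
    ("L", "L"): "L",  # A contains B, B contains C    -> A contains C
    ("S", "D"): "D",  # A within B, B apart from C    -> A apart from C
    ("D", "L"): "D",  # A apart from B, B contains C  -> A apart from C
}


def compose(r_ab, r_bc):
    """Infer rel(A, C) from rel(A, B) and rel(B, C), or None if undetermined."""
    if r_ab == "I":
        return r_bc
    if r_bc == "I":
        return r_ab
    return COMPOSE.get((r_ab, r_bc))


def find_conflicts(stated):
    """Return area pairs whose stated relations contradict each other.

    `stated` maps (area_a, area_b) to a relation code; a conflict exists
    when rel(B, A) is stated but is not the inverse of rel(A, B).
    """
    conflicts = []
    for (a, b), rel in stated.items():
        back = stated.get((b, a))
        if back is not None and back != INVERSE[rel] and (b, a) not in conflicts:
            conflicts.append((a, b))
    return conflicts
```

With these definitions, `compose("I", "I")` yields `"I"`, mirroring the example in the text in which "A" is inferred to be identical to "C", and `find_conflicts({("A", "B"): "S", ("B", "A"): "S"})` reports the pair `("A", "B")` as inconsistent.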

## Extending the Basic Design

In creating the Ferretome database, the template data model and algorithmic services of CoCoMac were adjusted to species-specific properties of the ferret brain as well as additional requirements established during the conceptual planning.

The main novelty, in terms of the database structure, was the introduction of extensible and flexible tables that store data about ferret brain architecture and the means to process this data as part of the standard data model. After an extensive review of presently available reports on ferret brain architecture we found that this new data modality has several distinct features. For example, architecture parameters can be applied to a whole area or part of an area. Such parameters can be quantitative as well as qualitative. For instance, quantitative data may exist on primary and secondary cell diameters, the number of layers and sublayers and their thickness. Alternatively, one may find qualitative descriptions of CO reactivity, myelination (e.g., in terms of "weak," "average," "strong"), laminar differentiation and types of neurons and their sizes (e.g., "big pyramidal neurons," "small granular neurons").

As with the labeling data modality, architecture data can be extracted from figures as well as from textual descriptions provided in literature reports. Therefore, the same PDC method of specifying data reliability was employed for this data modality. Different aspects of PDC\_Architecture were gathered from the literature and can be used for an entire brain area as well as for area subcompartments, such as individual cortical layers (**Figure 4**).

For algorithmic services, Ferretome.org uses the implementation of ORT described above. This algorithm was extended in order to process brain architecture data in a similar way to transferring labeling information from one brain map to another. This process does not require additional metadata about relations between brain maps and transfers all available architectonic parameters simultaneously with the connectivity data. In case of ambiguities, when two different brain maps indicate contradictory information about an area or a subpart of the area, the algorithm performs a ranking according to extension codes and PDCs. More reliable data (indicated by better extension codes and PDCs) are shown first.
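A minimal sketch of such a ranking is given below. It orders contradictory records by PDC only and assumes that reliability decreases monotonically from "A" to "Q"; extension codes are left out because their ordering is not specified here. The `Record` fields are hypothetical and do not reflect the actual Ferretome.org schema.

```python
from dataclasses import dataclass

# Assumption: reliability decreases letter by letter from "A" to "Q".
PDC_ORDER = "ABCDEFGHIJKLMNOPQ"


@dataclass
class Record:
    area: str        # brain area the statement refers to
    parameter: str   # e.g. "myelination" (hypothetical field)
    value: str       # e.g. "strong"
    pdc: str         # precision data code, "A" .. "Q"
    source_map: str  # brain map the statement originated from


def rank_records(records):
    """Return records with the most reliable PDC first (stable for ties)."""
    return sorted(records, key=lambda r: PDC_ORDER.index(r.pdc))
```

Because `sorted` is stable, records with the same PDC keep their input order, so a secondary criterion such as an extension code rank could be applied by pre-sorting on it.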

## Data Entry Process

In order to comply with established procedures and recommendations for connectome projects (e.g., Bakker et al., 2012), we introduced specific routines for data entry and data modification.

For data entry, a semi-automated pipeline was created with four main steps: (1) systematic literature search and discovery of tract tracing reports, (2) short-listing and queuing of reports for input, (3) input by one DB collator, (4) proofreading by another DB collator.

The first and second steps are performed outside of the system. During the first step, Ferretome.org curators (trained in brain anatomy) used online search engines such as Google Scholar<sup>6</sup> and PubMed<sup>7</sup> to identify ferret brain tract-tracing reports. In the second step, the DB curators, after an initial assessment of a report, decided if it should be added to the database. If so, a curator created a task inside Ferretome.org (**Figure 5**). Moreover, during this step, the DB curators inspected literature references within selected tract-tracing reports and, if these reports used brain maps delineated elsewhere, the respective reports were also selected for entry.

<sup>6</sup>http://scholar.google.com

<sup>7</sup>http://www.ncbi.nlm.nih.gov/pubmed

FIGURE 2 | Labeling data and its representation in the database. Top left: schematic representation of the outcome of a connection labeling experiment in the ferret brain; top right: corresponding microphotographs of stained sections and labeled neurons; both panels reproduced with permission from Manger et al. (2010). Bottom: Ferretome.org database schema related to the data modality of connection labeling experiments. The central entity of this schema is an injection (linked to a literature table). One connection-tracing report may comprise several injections. Many injections have several outcomes. Every outcome comprises many labeled sites that should be linked to brain sites (cf. Figure 1). All injections are supplied with data about methods, tracers as well as further parameters.

Frontiers in Neuroinformatics | www.frontiersin.org May 2016 | Volume 10 | Article 16 |

During the third step, the system distributed tasks in such a way that the initial data entry and the proofreading of a tract-tracing report were performed by two different researchers. The step included the detailed evaluation of a tract-tracing report, entering all available data into the database and marking the data with extension codes and PDCs. After finishing data entry, the DB curator changed the task status to "finished" in order to proceed to the fourth step. This final step virtually repeats the procedure of the third step, but is performed by a different DB curator.
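The rule that entry and proofreading must be done by different collators can be expressed as a small task workflow. The sketch below is illustrative only: the status names, the `Task` fields and both functions are hypothetical and are not taken from the Ferretome.org source code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    report: str                       # citation key of the tract-tracing report
    status: str = "queued"            # queued -> entered -> proofread
    entered_by: Optional[str] = None
    proofread_by: Optional[str] = None


def finish_entry(task, collator):
    """Step 3: mark data entry as finished by the given collator."""
    if task.status != "queued":
        raise ValueError("entry already finished")
    task.entered_by, task.status = collator, "entered"


def finish_proofreading(task, collator):
    """Step 4: accept proofreading only from a different collator."""
    if task.status != "entered":
        raise ValueError("nothing to proofread yet")
    if collator == task.entered_by:
        raise PermissionError("proofreader must differ from the entering collator")
    task.proofread_by, task.status = collator, "proofread"
```

Enforcing the check in the function that changes the status, rather than in the user interface, keeps the two-person rule intact regardless of how a task is updated.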

From the perspective of a DB curator, the data entry interface is a typical web application in which the user selects the required section and, by means of an input wizard, enters the data found in a tract-tracing report. The data entry pipeline was integrated with a journaling subsystem that keeps track of the changes made by users for every data modality present in the system and allows unwanted changes to be rolled back.

## Use Cases and Technical Information

Although the data entry interface (or 'back office') allows navigation across already inserted data, for the convenience of the end users an entire new interface for data browsing was created ('front office'). This interface interacts with the database in readonly mode. In general, the data browsing interface provides different means of searching information and creating summaries of stored data.

One way in which this interface can be used is for literature search, where users can try to find data by using bibliographical information (i.e., by the title of a literature report or author names) or by entering the acronym of a brain area (**Figure 6**). Another way to access connectivity data is via the connectivity section (or directly from the literature section), which displays the entire information provided in a literature report. Ferretome.org automatically maps all available connectivity data from all brain maps present in the DB into a selected map on-the-fly using the ORT algorithm.

At this point the connectivity data can be extracted in two formats, XML and JSON<sup>8</sup> (more formats are planned, see Discussion) and be further analyzed by approaches such as the brain connectivity toolbox<sup>9</sup> or neuroVIISAS, mentioned above. A snapshot of the data is provided as a Supplementary File.
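As an illustration of how an exported file might be prepared for graph analysis, the sketch below converts a JSON edge list into an adjacency matrix. The layout of the actual Ferretome.org export is not described here, so the assumed structure (a list of objects with `source`, `target` and an optional `strength` field) is purely hypothetical.

```python
import json


def adjacency_from_export(text):
    """Build (areas, matrix) from a JSON edge list.

    Assumed (hypothetical) input: a list of objects such as
    {"source": "X", "target": "Y", "strength": 2}.
    """
    edges = json.loads(text)
    # Collect every area that occurs as a source or a target.
    areas = sorted({e["source"] for e in edges} | {e["target"] for e in edges})
    index = {name: i for i, name in enumerate(areas)}
    matrix = [[0] * len(areas) for _ in areas]
    for e in edges:
        # Rows are sources, columns are targets; missing strength counts as 1.
        matrix[index[e["source"]]][index[e["target"]]] = e.get("strength", 1)
    return areas, matrix
```

The resulting directed, weighted matrix is the kind of input that graph-analysis tools generally expect; the area ordering is returned alongside it so that rows and columns can be labeled.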

Going deeper into technical details, Ferretome.org represents a typical web application with a front-office and a back-office supported by a database. As a database management system, the reliable and free MySQL<sup>10</sup> was employed and phpMyAdmin<sup>11</sup> was used to handle the initial creation and editing of tables. The source code and schema of the database are available on GitHub<sup>12</sup> .

## System Validation

The Ferretome.org system has so far been used by three members of our lab for data entry. These researchers also provided substantial feedback on the general design of the system. Moreover, this project is being developed as part of a research center collaboration<sup>13</sup>. In this context, we initially presented the conceptual design of the database and, later, preliminary results to other researchers at the center who work experimentally on the ferret brain and who are the main local recipients of this project. These researchers provided helpful feedback on the approach and methodology as well as an approval of the general design of the system.

<sup>8</sup>json.org

## RESULTS

Currently, the state of Ferretome.org can be characterized as a beta version. While it integrates all connectivity information for the ferret presently available in the literature (as identified by the DB curators), the available information itself is sparse, so that the information contained in the Ferretome about the brain architecture and macroconnectome of the ferret brain is still limited. Moreover, the relatively small number of anatomical connectivity reports published so far on the ferret covers mostly subcortical connections. However, the database is continuously being populated with newly appearing reports, and we are also working on evaluating still unpublished results of tract-tracing experiments in the ferret as well as performing new experiments. Stored records can be accessed via the web interface (**Figure 7**), where the full summary of inserted data for a given publication is represented as a table. This table can be dynamically extended to display links with other publications (e.g., if brain maps were defined in a different paper and the current record is using these parcellation schemes to describe tract-tracing results).

Using the same interface, the architecture of the brain areas can be obtained directly from the extracted data of a paper, as well as from other records by using the ORT algorithm that transforms connectivity data from one map to another, if relations among parcellation schemes are specified.

At the current point, more than 150 ferret brain papers have been reviewed, 50 of them were entered into the database and, for 30 of them that contain mapping or connectivity data, the proofreading is finished. These 30 reports contain 20 unique injection sites with 200 labeling sites in both ipsi- and contralateral hemispheres of the ferret brain. Architecture data is currently provided for 12 distinct brain areas, primarily for visual and auditory cortex.

## DISCUSSION AND OUTLOOK

Differences in the techniques of different neuroanatomical labs and the absence of well-established standards for producing tract-tracing reports create challenges in extracting architecture and connectivity data for systematic computational analysis and cross-species comparative studies. After a review of existing technologies, approaches and methods, it appeared that the most suitable strategy for databasing structural information of the ferret would be a CoCoMac-like approach and database schema. Our motivation was similar to that of the initial CoCoMac development (Stephan, 2013). First, most tract-tracing reports do not provide the exact spatial location of injection sites, but rather employ semantic localisers (such as an injection being made into 'primary visual cortex' or 'area 17'). Second, brain areas in one brain map may be represented very differently in another brain map. In order to build a comprehensive description of ferret brain connectivity, one needs mechanisms to relate one brain area and its connectivity in one parcellation to another brain area in a different parcellation. This transformation is tedious and error-prone if performed by hand, and therefore requires automation. Here, we focused on the problem of how to adapt the CoCoMac approach to the case of the ferret. Our system includes the main features of the CoCoMac approach, including PDC and extension codes as well as the ORT algorithm, but, in addition, we have extended the database schema in order to flexibly accommodate the representation of architecture information of brain areas.

To provide a wide base for the subsequent use of the database, several additional structural parameters were included. One motivation for this approach was the finding that brain connectivity appears to be closely related to the architectonic similarity of cortical areas (e.g., Hilgetag and Grant, 2010; Beul et al., 2014; van den Heuvel et al., 2015). Many literature reports also provide descriptions of brain cytoarchitecture. Such descriptions include the classification of cells, number of layers and sublayers and their density, amongst other features. Such cytoarchitectonic descriptions can be affected by similar problems as connectivity data, because they are usually defined by researchers within their own brain maps and hence need to undergo transformations from one brain map to another.

An important extension of the CoCoMac methodology is to link connectivity data to tools for visualization, analysis and simulation. This perspective is vital not only for understanding functional implications of connectivity, but also for validating data inserted into the database, by providing analytical summaries that can be compared to global models of connectivity organization. Therefore, Ferretome.org should have the functional capacity to extract data of all modalities (including computed brain maps relations) in a variety of formats in order to integrate well with analysis and simulation platforms and (online) atlases, such as the Scalable Brain Atlas<sup>14</sup> (Bakker et al., 2010). The export of connectivity data in XML and JSON formats is already available and more formats are planned. Integration with atlases will be useful not only for visualization, but can provide new knowledge in the area of comparative studies. For example, co-registering connectivity data with the SVG based Common Atlas format developed by Majka et al. (2012) has facilitated studies in a variety of species, such as opossum and marmoset. Moreover, following the example of the NeuroVIISAS platform (Schmitt and Eipert, 2012), integration with connectivity analysis tools, such as the Brain Connectivity Toolbox<sup>15</sup>, or tools for modeling brain dynamics, like The Virtual Brain<sup>16</sup> (Ritter et al., 2013), will be provided. This integration will allow characterizing features of structural nodes and circuits and linking them to aspects of brain dynamics and function.

In addition to storing fundamental connectivity and architectural data for the ferret brain, several additions are planned for Ferretome.org that will make access to the data easier or more functional. For example, in the past, an attempt was made to provide CoCoMac with visualization and search automation tools by using external software (Kötter, 2004). To follow the CoCoMac example, in the short term, we are planning the integration of visualization tools that can be deployed at the users' computer clients (directly in a browser). For example, the use of WebGL technology will allow future integration with a prospective atlas of the ferret brain. Taking into account that Ferretome.org extensively represents the architecture of brain areas, visualization tools could give to users the opportunity to display simultaneously connectivity data and architecture data. Moreover, by analogy with connectivity data, researchers should have the ability to perform a quick survey of architectural data right in their browser. For example, it will be helpful if architectural information on the cellular density and thickness of cortical layers can be read out in standard formats for further offline analysis.

Although in the current state the database does not contain sufficient data to provide connectivity and architectural data for the entire ferret brain, it may already be sufficient for identifying underrepresented brain areas where, for various reasons, tract-tracing studies have not yet been conducted. As soon as new tract-tracing reports appear in the literature, the data will be added to Ferretome.org. The collated data do not have to be restricted to cortical connectivity and area-to-area connection systems, but could also include the connectivity of neuromodulatory systems. These systems typically include localized cell populations (such as the orexinergic neurons in the hypothalamus, or the cholinergic and noradrenergic neurons in the pons) that project widely throughout the brain and spinal cord (Dell et al., 2013). These projections are easily identified with immunohistochemistry, and could be readily plotted and quantified with stereological techniques (in terms of regional densities, distribution by cortical layers and neuronal types, etc.) and added to the database. In addition, the quantitative distribution could be determined for the GABAergic neurons stained with parvalbumin, calbindin, and calretinin. Such an effort would provide insight into the organization of inhibitory systems in the brain, in addition to excitatory long-range projections. Although the integration of this type of data is a complex task that requires substantial adaptation of the database structure, it appears feasible and was already partly realized in the neuroVIISAS project (for details see Schmitt and Eipert, 2012).

<sup>14</sup>scalablebrainatlas.org

<sup>15</sup>www.brain-connectivity-toolbox.net

<sup>16</sup>www.thevirtualbrain.org

As a further extension of the concept of this connectivity database, we also consider the possibility of adding the modality of large-scale functional connectivity of the ferret brain, both at rest and during tasks. This idea can be implemented with the same methodology as CoCoMac and Ferretome.org, by providing information on the reliability of data and by transformation of data across different brain maps. A worked example of storing functional connectivity data in the CoCoMac framework was provided by CoCoMacStry, a collation of strychnine-induced functional connectivity of the macaque brain (Stephan et al., 2000a). Ultimately, the structural and functional perspective of connectivity data can be linked through computational modeling platforms.

On the practical side, an efficient implementation and management system is required in order to maintain an up-to-date database that is quick and functional as well as easy to handle by administrators and users. One way of achieving this aspect is by providing constant web access to all parts of the database. In this case, data in the database can be reviewed not only by the database collators, but also by external experts. In the long term, an important goal is the involvement of the scientific community, in particular of experimental neuroanatomists, for contributing new data or validating the information already existing in the database. This step is essential for verifying the overall consistency of the data and facilitating the dialog among all parties interested in ferret brain structure and function. Thus, the system has to be designed in such a way that it is accessible and appealing to experimentalists studying the ferret brain. Based on this idea of community participation, one of the options for increasing the value of the databasing project is to have the ability to store the raw data (such as images, or detailed quantitative information) taken directly from experiments. From the technological point of view, this is a challenging task that requires development of special storage subsystems and algorithms for data access as well as data protection methods at different levels of data access, public and private.

In summary, here we introduced Ferretome.org, a ferret brain macro-connectivity and architecture database. This project is built upon the experience of a previous generation of neuroinformatics projects such as XNAT, BAMS, NeuroVIISAS, and in particular CoCoMac. Specifically, Ferretome.org inherited from CoCoMac the basic methodology and philosophy of objectivity and reproducibility, and follows the same data collation rules and standards. In addition, we extended the basic CoCoMac methodology in order to capture architectural data that provide an important context for connectivity data. Currently, we are moving toward extensive population of the database with newly published results and thus hope to make a useful contribution to the study of ferret brain structure and function.

## AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct and intellectual contribution to the work, and approved it for publication.

## ACKNOWLEDGMENTS

We gratefully acknowledge financial support by DFG projects SFB 936/A1 (authors DS and CH) and SFB 936/A2 (AE).

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fninf.2016.00016

## REFERENCES

cerebral cortex. J. Neurosci. 35, 13943–13948. doi: 10.1523/JNEUROSCI.2630-15.2015


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Sukhinin, Engel, Manger and Hilgetag. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application

Leon French<sup>1</sup>, Po Liu<sup>2</sup>, Olivia Marais<sup>2</sup>, Tianna Koreman<sup>2</sup>, Lucia Tseng<sup>2</sup>, Artemis Lai<sup>2</sup> and Paul Pavlidis<sup>2,3</sup>\*

<sup>1</sup> Rotman Research Institute, University of Toronto, Toronto, ON, Canada, <sup>2</sup> Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada, <sup>3</sup> Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada

We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/.

#### Edited by:

Mihail Bota, University of Southern California, USA

#### Reviewed by:

Neil R. Smalheiser, University of Illinois-Chicago, USA Gully A. P. C. Burns, USC Information Sciences Institute, USA

#### \*Correspondence:

Paul Pavlidis, Centre for High-Throughput Biology, University of British Columbia, 2185 East Mall, Vancouver, BC, V6T 1Z4, Canada paul@chibi.ubc.ca

> Received: 26 August 2014 Accepted: 07 May 2015 Published: 21 May 2015

#### Citation:

French L, Liu P, Marais O, Koreman T, Tseng L, Lai A and Pavlidis P (2015) Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application. Front. Neuroinform. 9:13. doi: 10.3389/fninf.2015.00013

Keywords: connectome, text mining, natural language processing, information retrieval

## Introduction

Neuroinformatics research thrives on plentiful amounts of open and computable neuroscience datasets. This type of data is lacking at the level of brain regions, when compared to molecular data about genes or proteins. Currently, the bulk of this neuroscience information is fragmented across the literature. Manual curation can join and formalize the findings (Bota et al., 2012, 2014). To speed up this process in the domain of anatomical connectivity, we created the WhiteText project to automatically extract this information from text. WhiteText was designed to extract mentions of brain regions and statements describing connections between them. While developing WhiteText we asked the following questions:


By extracting thousands of connectivity statements and addressing these questions, we were able to evaluate several text processing techniques and create neuroinformatics resources for connectivity knowledge.

Connectomics is of great interest to neuroscientists who seek to understand brain networks using complete connectivity maps. Global connectivity knowledge is often viewed as fundamental to understanding how the brain processes information. Local connectivity informs focused studies involving smaller numbers of brain regions. Several large-scale projects are seeking to uncover brain region level connectivity maps in human, macaque, rat and mouse. Using advanced magnetic resonance imaging, the Human Connectome Project will provide brain region level connectivity maps for 1,200 healthy individuals to understand human neuroanatomy and its variation (Van Essen et al., 2013). Using neural tracers, the Mouse Connectome Project (Zingg et al., 2014), the Mouse Brain Architecture Project (brainarchitecture.org) and the Allen Mouse Brain Connectivity Atlas (Oh et al., 2014) provide connectome scale data obtained from standardized approaches. The Allen Mouse Brain Connectivity Atlas has produced the most complete and standardized mouse connectome, which covers 295 disjoint regions (Oh et al., 2014). The preceding Brain Architecture Management System (BAMS2) contrasts with these experimental efforts by providing curated reports of rat brain connectivity (Bota et al., 2014, 2015). The BAMS curators standardize published connectivity results obtained from independent labs into a single database. CoCoMac is a similar system that collates connectivity results from macaque studies (Stephan, 2013). The WhiteText project seeks to complement these projects by automatically extracting connectivity reports from past studies to speed up manual curation and add context to large-scale connectome projects.

While there has been significant effort to mine neuroscience information from text (Ambert and Cohen, 2012), our work is most inspired by past efforts to extract information about protein-protein interactions. This task is analogous to extracting connectivity information: it requires extraction of interaction relationships between named entities (proteins instead of brain regions). This close analogy allowed us to leverage work done in the gene and protein domain. At the BioCreative III workshop challenge, 23 teams competed to extract, resolve and link protein and gene mentions (Arighi et al., 2011), generating information on effective approaches. We adapted and extended the text-mining methods previously used to analyze protein networks for extraction of connectivity between brain regions (Tikk et al., 2010). This is an attractive approach because many of the challenges in analyzing text for information about proteins are also faced in mining information about brain regions. These challenges include abbreviations, synonyms, lexical variation and ambiguity. A key foundation for both the BioCreative challenges and our work are hand-annotated corpora to use as training data and gold standards for evaluation. Here we first review our annotated corpus and the methods and results for three tasks required for extracting connectivity relationships between brain region pairs: recognition of brain region mentions, standardization of brain region mentions and connectivity statement extraction (**Figure 1**). We provide only summary results and methods for these tasks and refer readers to the corresponding publications for details. Then we describe a recent evaluation which we used to create a new corpus; and finally a website we created to view the results.

## Review of Progress

## Manually Annotated Corpus

To seed the project we annotated a set of 1,377 abstracts for brain region mentions and connectivity relations (French et al., 2009). We used abstracts rather than full-text documents due to accessibility and their higher proportion of summary statements. This initial corpus allows for training of machine learning methods and later comparison between automatic and manually derived annotations. We focused on abstracts from one journal, the Journal of Comparative Neurology (JCN), because it is enriched with neuroanatomical studies. As described below, we considered other journals in later analyses.

Two trained undergraduate research assistants annotated the corpus for brain region mentions and connectivity relations in any species. Annotated brain region spans matched for 90.7% of the mentions in the subset of 231 abstracts annotated by both curators. In total, 17,585 brain region mentions were annotated with a subset forming 4,276 connections. Three high accuracy text mining methods were applied to all abstracts: species recognition (Gerner et al., 2010), automated expansion of abbreviations (Schwartz and Hearst, 2003) and tokenization (McCallum, 2002; Cunningham et al., 2013). Abbreviations were expanded because they are a common source of ambiguity and confusion (Gaudan et al., 2005). Further, we processed text within individual sentences that would not contain abbreviation expansion information found in previous sentences. Rat was the most common species of the 209 species studied (aside from species relating to reagents used in tract tracing such as horseradish, wheat and pseudorabies virus).

## Recognition of Brain Region Mentions

The first task of recognizing mentions of brain regions in free text is known as "named entity recognition". This step identifies ("highlights") spans of text that refer to brain regions (a "mention"). For this task we employed the MALLET package for natural language processing (McCallum, 2002) to create a conditional random field classifier that was able to label brain region mentions (French et al., 2009). Eight-fold cross-validation was used in this evaluation and abstracts were not split between training and testing. In this cross-validation framework each sentence is used seven times for training and once to test. Feature selection was performed with 14% of abstracts held out. We consistently define precision as the proportion of true positives to positive predictions and recall as the proportion of true positives to actual positives. For this task the classifier recalled 76% of brain region mentions at 81% precision. Precision increases to 92% and recall reaches 86% when partial matches are counted. This performance was much higher than naive dictionary-based methods that attempt to match words to lists of known brain region names. We observed that regions in non-mammals (e.g., insects), which were underrepresented in the corpus, were poorly classified. Thus recall improves when restricting the abstracts to studies of monkey, cat, rat and mouse brain but only in comparison to a similar-sized set of random abstracts.
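The definitions of precision and recall given above, and the effect of counting partial matches, can be made concrete with a short sketch. Mentions are represented as half-open character spans; treating any overlap as a partial match is an assumption made here and may differ from the criterion used in the original evaluation.

```python
def overlaps(a, b):
    """True if the half-open character spans a and b share any position."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall(predicted, gold, partial=False):
    """Precision and recall of predicted spans against gold spans.

    With partial=False a prediction counts only if its span is identical
    to a gold span; with partial=True any overlap counts (an assumption).
    """
    match = overlaps if partial else (lambda a, b: a == b)
    # Predictions supported by at least one gold mention.
    tp_pred = sum(any(match(p, g) for g in gold) for p in predicted)
    # Gold mentions recovered by at least one prediction.
    tp_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = tp_pred / len(predicted) if predicted else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    return precision, recall
```

For gold spans `[(0, 5), (10, 20), (30, 35)]` and predictions `[(0, 5), (12, 20)]`, exact matching gives a precision of 1/2 and a recall of 1/3, whereas partial matching gives a precision of 1 and a recall of 2/3, which mirrors the direction of the change reported above.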

From our analysis, we suspect many incorrectly recognized brain region mentions are due to conjunctions, previously unseen words and brain regions of less commonly studied organisms (e.g., insects and fish). Surrounding words, word base forms and abbreviation expansion were the most informative techniques and features used by the classifier. Although textual features derived from the neuroscience domain did increase performance (lexicons of brain region names for example), we found that most of the knowledge needed to extract brain region mentions can be learned from our large set of annotated examples.

## Standardization

Recognition of brain region mentions only provides a string that is predicted to refer to a brain region. Standardizing this string to a formally defined concept representing the brain region is important to downstream analysis and linking to other resources. The process of mapping free text to formal identifiers is also known as normalization or resolution. For example, this step aims to link the free text "substantia nigra compact part" or "SNC" found in an abstract to the NeuroLex concept birnlex\_990 which has the preferred name of "Substantia nigra pars compacta" (Bug et al., 2008). Viewing birnlex\_990 in the NeuroLex website expands the mention to a definition, information about spatial location and cell types found in the region (Larson and Martone, 2013). To maximize our set of brain region names, we targeted five neuroanatomical lexicons that span several species [NeuroNames (Bowden et al., 2011), NIFSTD (Imam et al., 2012), the Brede Database (Nielsen, 2015), BAMS (Bota et al., 2014), and the Allen Mouse Brain Reference Atlas (Dong, 2008)]. This provided a set of 11,909 target region names that represent an estimated 1,000 different mammalian brain regions (French and Pavlidis, 2012).

For the standardization task we applied simple lexicon-based methods that iteratively modified the original mention until a match was found (French and Pavlidis, 2012). First, a case-insensitive exact string match was attempted on the mentioned region. If that failed to match, word order was ignored by using bag-of-words matching, so that ''reticular thalamic nucleus'' would match ''thalamic reticular nucleus''. Next, stemming was applied to reduce words to base forms (e.g., ''nucleus raphé dorsalis'' would reduce to ''nucleu raph dorsali''). Again, exact matching of stemmed mentions and bag-of-stems matching were attempted. These methods were stringent, as they required all words or stems in a mention to match the name in the lexicons. To improve coverage we designed twelve modifiers that edited the mentions, sacrificing some information. These included removing hemisphere-specific qualifiers, bracketed text and directional prefixes. Application of these modifiers increased standardization coverage from 47% to 63%.
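The matching cascade described above can be sketched as follows. This is a simplified illustration, not the published implementation: `toy_stem` is a hypothetical stand-in for a real stemmer (it only strips a trailing "s"), the two-entry lexicon is invented, and the twelve mention modifiers are omitted.

```python
def toy_stem(word):
    """Crude, hypothetical stemmer: strip one trailing 's' from longer words."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def standardize(mention, lexicon):
    """Try increasingly lenient matches; return (lexicon name, strategy)."""
    strategies = [
        ("exact", lambda s: s.lower()),
        ("bag-of-words", lambda s: frozenset(s.lower().split())),
        ("stemmed exact",
         lambda s: " ".join(toy_stem(w) for w in s.lower().split())),
        ("bag-of-stems",
         lambda s: frozenset(toy_stem(w) for w in s.lower().split())),
    ]
    for label, key in strategies:
        target = key(mention)
        for name in lexicon:
            if key(name) == target:
                return name, label
    return None, None  # no strategy produced a match


# Hypothetical lexicon entries for illustration only.
lexicon = ["thalamic reticular nucleus", "cerebellar peduncles"]

result_exact = standardize("Thalamic Reticular Nucleus", lexicon)
result_bag = standardize("reticular thalamic nucleus", lexicon)
result_stem = standardize("cerebellar peduncle", lexicon)
print(result_exact, result_bag, result_stem)
```

Ordering the strategies from strictest to most lenient means a mention is always resolved by the most conservative rule that succeeds, which is consistent with the high precision reported for this step.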

By testing the above approaches on the manually annotated corpus, we estimated that mentions are mapped at 95% precision and 63% recall (French and Pavlidis, 2012). We note that precision is based on the lexical information (brain region names) and not the specific neuroanatomical location in a given species and atlas. This step is key for many neuroscience text miners because it provides a method for linking abstracts to region-specific data outside the text via formalized brain region names. In addition, patterns of publication interest can be observed: not surprisingly, some regions are more popular than others but popularity can wax and wane over time. Importantly, this work quantified challenges in the standardization of neuroanatomical nomenclatures. We observed that many standardized terms never appear in our input corpus and many mentions used by authors are not in the terminologies. To address this latter gap, we deposited 136 brain region names identified from our analysis into the Neuroscience Lexicon (Larson and Martone, 2013).

## Connectivity Statement Extraction

In a given abstract, mentions of brain regions provide limited information without any context. Our goal was to extract information about the brain regions, namely connectivity. To reduce the complexity of this task we targeted positive statements of connectivity and ignored the direction (efferent/afferent). Further, we limited the manually curated training and test connections to those within sentences. These restrictions allow application of existing tools for extracting protein-protein interactions. The resulting classification task is to determine if a pair of brain region mentions are described as connected or not. A negative prediction includes statements reporting no connectivity between the two regions but the majority of negative pairings are from sentences mentioning two brain regions but containing no connectivity information. Our corpus had available 22,937 total pairs of brain regions of which 3,097 describe connections, with the balance considered negative examples (French et al., 2012).
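As a rough illustration of how the classification instances arise, the sketch below enumerates every unordered pair of mentions within each sentence and computes the share of positive pairs from the counts given above. The sentences and mention lists are hypothetical; the real pipeline works on annotated text rather than plain lists.

```python
from itertools import combinations


def candidate_pairs(sentences):
    """Yield (sentence index, mention A, mention B) for each unordered pair.

    Pairs are only formed within a sentence, and direction is ignored.
    """
    for index, mentions in enumerate(sentences):
        for first, second in combinations(mentions, 2):
            yield index, first, second


# Hypothetical mentions already recognized in three sentences.
sentences = [
    ["habenula", "interpeduncular nucleus", "raphe"],
    ["amygdala"],
    ["thalamus", "cortex"],
]
pairs = list(candidate_pairs(sentences))

# Class balance implied by the corpus counts reported in the text.
positive_rate = 3097 / 22937
print(len(pairs), f"{positive_rate:.1%}")
```

A sentence with n mentions contributes n(n-1)/2 pairs, so long sentences with many regions dominate the negative class.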

By re-using the protein interaction benchmark tools assembled by Tikk and colleagues, we tested several methods on our annotated corpus in a cross-validation framework (Tikk et al., 2010). The best method, the ''shallow linguistic kernel'' (Giuliano et al., 2006), recalled 70% of the sentence level connectivity statements at 50% precision in ten-fold cross-validation. This method is ''shallow'' in the sense that it does not involve parsing complex sentence structure. Sentence length does increase from the top to bottom ranked predictions, suggesting a relationship with complexity. However, similar accuracy was provided by more complex methods that use deeper features such as word dependencies and semantic features.

## Methods

## Extended Evaluation and Corpus Creation

Each interaction was independently judged by two of four undergraduate research assistants and disagreements were resolved by group review. All four curators annotated a small training set of 307 connections for initial training and guideline refinement. Annotator agreement depended on the pair of curators compared, and ranged between 83% and 97%. To speed curation, we used spreadsheets that presented the full sentences and links to the abstracts containing the predicted connections.

## Article Classification and Expanded Predictions of Connectivity

The online MScanner tool was used to find connectivity abstracts outside of the JCN (Poulter et al., 2008). MScanner is not domain specific, but instead uses supervised learning to search PubMed for related articles (Naïve Bayes classifier). Abstracts found to contain connectivity statements in previous evaluations were used as the input set. We applied MScanner with and without the word features (journal name and MeSH terms features were used for both executions). Brain region mentions were extracted with the previously published conditional random field that was trained on the entire first set of manually annotated abstracts. The shallow linguistic kernel (Giuliano et al., 2006) from the ppi-benchmark framework was used to predict connectivity relations (Tikk et al., 2010).

## WhiteText Web

WhiteText Web was implemented with Google Web Toolkit 2.5.1 and the Apache Jena framework. User input is restricted to NeuroLex brain regions that appear in the corpus. We note that this restriction is only placed on one of the two connected regions, allowing any brain region mention text to represent the second region displayed (''Connected Region''). Formalized mapping to synonyms and query expansion to subregions were extracted from the Neuroscience Information Framework (NIF) Gross Anatomy ontology. Subregions were obtained by extracting ''proper\_part\_of'' predicates [Open Biomedical Ontologies (OBO) and relation ontology (RO)]. The list of 110 phrases that describe connectivity, which are underlined in the output, was extracted from the first manually annotated corpus. Example phrases are ''projects to'' and ''terminating in'' (full list on supplementary website). LINNAEUS was used to recognize and normalize species names (Gerner et al., 2010). Sentences from abstracts with more than one species mentioned are duplicated to prevent omission of connections when sorting the table by species.
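Query expansion to subregions amounts to following ''proper\_part\_of'' relations transitively from the query region. The sketch below illustrates the idea on a tiny, hypothetical edge list; WhiteText Web itself obtains these relations from the NIF Gross Anatomy ontology through Apache Jena, which is not reproduced here.

```python
from collections import defaultdict


def expand_to_subregions(region, part_of_edges):
    """Return the region plus every direct or indirect proper part of it.

    part_of_edges is an iterable of (part, whole) pairs.
    """
    children = defaultdict(set)
    for part, whole in part_of_edges:
        children[whole].add(part)

    expanded, stack = {region}, [region]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in expanded:
                expanded.add(child)
                stack.append(child)
    return expanded


# Hypothetical (part, whole) edges for illustration only.
edges = [
    ("medial habenula", "habenula"),
    ("lateral habenula", "habenula"),
    ("habenula", "epithalamus"),
]
habenula_terms = expand_to_subregions("habenula", edges)
print(sorted(habenula_terms))
```

The set of visited regions also guards against cycles, should an ontology extract contain any.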

## Results

## Extended Evaluation and Corpus Creation

Beyond the cross-validation evaluation described above, we have previously applied our method to 12,557 previously unseen JCN abstracts (those not in our corpus) and compared a standardized subset of 2,688 relationships to the data in BAMS (Bota et al., 2012). We found that 63.5% of these connections were reported in BAMS. Using the BAMS data as a gold standard, we also found that precision can be increased at the cost of recall by requiring connections to occur more than once across the corpus (French et al., 2012).
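The trade-off described above, where precision rises and recall falls when a connection must be predicted more than once, corresponds to a simple frequency filter. The sketch below assumes predictions are available as undirected region pairs; the example pairs and the `min_count` parameter name are hypothetical.

```python
from collections import Counter


def recurring_connections(predictions, min_count=2):
    """Keep undirected region pairs predicted at least min_count times."""
    counts = Counter(frozenset(pair) for pair in predictions)
    return {pair for pair, count in counts.items() if count >= min_count}


# Hypothetical predicted connections pooled across a corpus.
predictions = [
    ("habenula", "interpeduncular nucleus"),
    ("interpeduncular nucleus", "habenula"),
    ("amygdala", "thalamus"),
]
kept = recurring_connections(predictions)
print(kept)
```

Using `frozenset` treats a pair the same regardless of order, matching the decision to ignore connection direction.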

To extend these results and obtain more training data, we have now created a new corpus by extending our previous evaluation of 2000 positive predictions (French et al., 2012). **Figure 2** outlines the creation of new corpora from the original corpus. This new corpus is based on running our framework on the test set of 12,557 JCN abstracts. Most importantly, to gauge recall we had to identify negative examples, as our previous effort only manually evaluated positive predictions. By adding new evaluations of negative predictions, the new corpus contains 11,825 brain region pairings extracted from the 12,557 abstracts (12% of possible within sentence pairings), of which 18% were considered positive examples. Recall was 45.5% (as previously reported on the 2000 positive predictions, precision is 55.3%). The drop in accuracy compared to the previous cross-validation test appears to be partly due to automation of preprocessing steps that were done manually in the original corpus of 1,377 abstracts. These automated steps are imperfect and thus a source of errors upstream of the connection prediction step. Specifically, we found that many classification errors could be ascribed to problems with brain region mention extraction (∼10–15% of errors) and abbreviation expansion (<4%). Standardization of brain region mentions was not performed in this evaluation, allowing isolation from low recall in the standardization task. In a cross-validation framework, the shallow linguistic kernel within this new set of evaluated connections replicates the accuracy of the first set (recalling 67% of connectivity statements at 51% precision). In comparison to our first set, this corpus covers a broader set of abstracts but a lower number of connections (**Table 1**). We are making this corpus available to the community to use in further efforts at improving connectivity extraction methods or for developing other text processing tasks that benefit from the annotations provided.

## Article Classification and Expanded Predictions of Connectivity

In our initial work we focused on the Journal of Comparative Neurology due to its enrichment of connectivity reports. Although it is possible to run our pipeline on the entire MEDLINE corpus, we suspected this would produce a large number of false positives from the enormous number of abstracts that mention brain regions but do not discuss connectivity. To create a larger connectivity resource from abstracts more likely to be relevant, we used the MScanner tool (Poulter et al., 2008). MScanner trains a classifier to identify abstracts with features similar to an input training set. Features used by MScanner include journal name, MeSH terms and words in the abstract and title. We trained MScanner using the abstracts from our curated corpora, reasoning that the results would be enriched for abstracts containing connectivity statements (**Figure 2**). The features chosen by MScanner as relevant include the word features ''Nucleus'', ''Medial'', ''Projection'' and the MeSH qualifier ''anatomy and histology''. MScanner yielded a new set of 8,264 abstracts. Applying our pipeline to this set yielded 36,566 predicted statements of connectivity. Over 92% of abstracts were predicted to have at least one connectivity statement. This suggests MScanner provides a good initial filtering step for our pipeline. Beyond the general purpose approach of MScanner, Ambert and Cohen demonstrate more complex article classification tools for extraction of connectivity studies (Ambert and Cohen, 2012).

## WhiteText Web

We created WhiteText Web<sup>1</sup> to provide easy access to the extracted connectivity statements. Given an input brain region, the tool returns all predicted statements of connectivity involving that region and its enclosed subregions. The resulting sentences and connections are highlighted for quick browsing by the user. As shown in **Figure 3**, results are provided in a spreadsheet table format that allows quick sorting by classification score (which approximates prediction confidence), connected regions and species mentioned in the abstract. Source abstracts are easily accessible to allow review of full context. Each predicted connectivity relation is displayed with a flag icon, which allows a user to flag connections that appear incorrect. This user-provided information is logged and will be used in future evaluations.

## Characterization of WhiteText Web Combined Corpus

The WhiteText Web corpus consists of all the curated and predicted connectivity statements mentioned above (17,454 abstracts with at least one connection). Over 200 species were mentioned in these abstracts with rat (24,690 predicted connections), cat (12,469) and rhesus macaque (3,113) having the most mentions (top ten shown in **Table 2**). The JCN is still the main source for our connectivity information due to its selection for the original corpora. However, the use of MScanner has provided abstracts with connectivity predictions from 304 additional journals with the top ten given in **Table 3**. Publication year in the combined corpus is limited by available abstracts in the MEDLINE database (8 abstracts found before 1975) and the dates of our studies and the MScanner database (no abstracts beyond mid 2011). Yearly counts of connectivity studies contained in our combined corpus peak in 1991 with 707 abstracts (**Figure 4**).

<sup>1</sup>http://www.chibi.ubc.ca/whitetext/app/

This table presents summary counts of abstracts and sentence level connectivity counts for abstracts with predicted and curated connections. Region pairs and connection counts are counted within sentences only. Connections were predicted with a shallow linguistic kernel trained on the Original Corpus for both the JCN Predictions (Journal of Comparative Neurology) and MScanner sets. Precision and recall values were computed with the shallow linguistic kernel in a cross-validation framework.

FIGURE 3 | Screenshot of example results from WhiteText Web. The top text input field attempts to match typed text to brain regions in NIFSTD while the user types. The query region column shows the original named brain regions that were matched to the given input of "Habenula" or its children. Sentence text is directly linked to the source abstract in PubMed. Query and connected regions are colored, with underlines marking words that suggest connectivity. Results can be sorted by all columns except the first. A single click on the gray flag in the "Report" column allows users to mark sentences that were incorrectly parsed. The "Export Table" link (top left) provides a tab-separated file containing the returned results.

#### TABLE 2 | Species with the most associated connections in the combined corpus.

Connection counts combine predicted and curated connections in the corpus. NCBI taxonomy identifiers are provided.

#### Accessibility

Data and software used for the project are freely available in standardized formats.<sup>2</sup> The new corpus is additionally provided at http://figshare.com/articles/New\_WhiteText\_Corpus/1400541. Text mined results for a specific brain region can be exported from WhiteText Web as tab-separated files. To store annotated text we used GATE and encoded downstream annotations in AirolaXML and Resource Description Framework (RDF). Use of RDF allows simple queries of extracted connectivity statements with the SPARQL query language. Connectivity matrices are also provided for convenience. Software and documentation are available on GitHub.<sup>3</sup> Further, Bluima, an open source text mining toolkit for neuroscience, has re-used some of our resources for brain region extraction in the UIMA framework (Unstructured Information Management Architecture; Richardet and Telefont, 2013).

<sup>2</sup>http://www.chibi.ubc.ca/WhiteText

#### TABLE 3 | Top ten most frequent journals in the combined corpus.

Frontiers in Neuroinformatics | www.frontiersin.org May 2015 | Volume 9 | Article 13 |

## Discussion

We created and applied a system for large-scale automatic extraction of connectivity knowledge. By analyzing over 20,000 abstracts we found the neuroscience literature contains a wide diversity of terms, species, and brain region names. Unfortunately, this diversity exceeds that of the existing formalized neuroanatomical lexicons. We found it difficult to create a clear set of annotation guidelines due to this diversity that extends to sentence structure and experiment design. While this diversity limits the automatic mining of neuroscience literature, we evaluated several methods that improve automatic extraction. We found great value in general-purpose and biomedical text mining tools. We applied these tools with little or no tuning and report robust and extendable results. This allowed more time for extensive manual evaluation and review. In addition to tested methods, our work provides a database of evaluated connectivity statements that can be used as a starting point for manual curation and to facilitate neuroscience text mining.

Our results and evaluations provide the most critical assessment of text mining for neuroscience to date. We note the NeuroScholar system, which had similar overall goals to our project (Burns and Cheng, 2006). Burns and Cheng sought to automatically label more detailed information about connectivity experiments in full text articles, including methodological details. In contrast, our work focuses on summary statements in abstracts to extract brain region mentions and relationships between them. Building on the WhiteText project resources, a large-scale application of connectivity extraction has been performed on all PubMed abstracts and a large set of full text articles (Richardet et al., 2015). Richardet and colleagues used the WhiteText corpus to help develop a scalable system that combined the shallow linguistic kernel with filter and rule-based methods. By using the Allen Mouse Brain Connectivity Atlas, precision of the extracted connections was estimated at 78%. Both studies support the value and feasibility of automatically extracting connectivity information from natural language text.

It is possible to search for connectivity literature using keyword searches of PubMed or Google Scholar. However, searches for a single brain region will retrieve studies of that region that do not examine connectivity. Adding keywords like ''projections'' will not recall all studies, as demonstrated by our list of over 100 phrases that describe a connection. In contrast, our system is focused on connectivity studies and only presents users with sentences predicted to contain connectivity statements. We designed WhiteText Web for neuroscientists searching for connectivity studies that can be used to design or interpret their experiments. We also designed it to aid curation by adding an easy way to report incorrect predictions. The features of WhiteText Web are similar to those of the NIF Integrated Nervous System Connectivity resource (Larson and Martone, 2013). Our text-mined results are less accurate than the NIF connectivity resource, which is based on six manually curated databases. Also, the NIF resource provides the direction of the connections and reports of no connectivity. However, WhiteText Web provides a wider search covering more species and sources and underlines key words that indicate direction (''projects to'' and ''terminated in'' for example). Further, WhiteText Web provides original text with highlights for quick viewing and the ability to provide instant feedback. NIF and BAMS2 provide increasingly valuable resources for integration and validation as they continue to grow with the published literature (Bota et al., 2015).

Recently, two teams reported large-scale tract-tracing studies in mouse (Oh et al., 2014; Zingg et al., 2014). Our system can supplement these studies by providing evidence of connectivity in other species and providing literature context for the connections. Unlike the large-scale surveys, the connectivity statements we extract are often from studies that potentially provide additional context and relationships. For example, future work could extract behavior and systems that are related to a connection by examining the abstract or full text that contains the connection.

Our work has several limitations. First, our analysis was limited to article titles and abstracts. Application of our methods to the complete texts of papers should provide more brain region mentions that can be mined for connections and other relationships. However, a recent study of neuroscience document classification demonstrated the difficulty of using full text, reporting low performance when using information from full text compared to abstracts for an article classification task (Ambert et al., 2013). This is presumably because abstracts tend to be highly concentrated with factual statements about the study compared to the rest of the article. Like Cohen and colleagues we believe better tools may be needed to fully exploit the different content and structure of full text (Cohen et al., 2010). Richardet et al. address these questions by extracting connectivity from over 630,216 full text neuroscience articles to predict over 250,000 connectivity relations, more than what was extracted from all PubMed abstracts (Richardet et al., 2015). They found that specific filtering rules were needed for full text. However, they report no differences between connections extracted from full text and abstracts in terms of distance and precision. Another limitation is that our tools were limited to analysis of single sentences, so connections that are described in more than one sentence could not be captured. We estimate that at least 25% of connections mentioned in an abstract span multiple sentences, a substantial loss of information for the text mining approach. Directionality of connections is also lost as our methods only predict presence of connection. However, we note that directions are annotated in the first corpus and WhiteText Web underlines direction-specifying keywords for the user. Finally, the accuracy of text mining and natural language processing limits the immediate application and biological interpretation of the results without further manual curation. To further improve the quality of the data in WhiteText, we are performing manual curation to remove errors. The text mining approach can provide a large set of data that is highly enriched for relevant statements, providing a good input for manual curation.

<sup>3</sup>https://github.com/leonfrench/public/wiki

## Acknowledgments

We thank Sanja Rogic, Shreejoy Tripathy and Dmitry Tebaykin for their comments on the paper. The work described was supported by the Natural Sciences and Engineering Research Council of Canada and the National Institutes of Health (GM076990 to PP).


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 French, Liu, Marais, Koreman, Tseng, Lai and Pavlidis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Ontology-based approach for in vivo human connectomics: the medial Brodmann area 6 case study

#### Tristan Moreau\* and Bernard Gibaud

*Medicis, UMR 1099 LTSI, INSERM, University of Rennes 1, Rennes, France*

Different non-invasive neuroimaging modalities and multi-level analysis of human connectomics datasets yield a great amount of heterogeneous data which are hard to integrate into a unified representation. Biomedical ontologies can provide a suitable integrative framework for domain knowledge as well as a tool to facilitate information retrieval, data sharing and data comparisons across scales, modalities and species. In particular, there is an urgent need to fill the gap between neurobiology and *in vivo* human connectomics in order to better take into account the reality highlighted in Magnetic Resonance Imaging (MRI) and relate it to existing brain knowledge. The aim of this study was to create a neuroanatomical ontology, called "Human Connectomics Ontology" (HCO), in order to represent macroscopic gray matter regions connected with fiber bundles assessed by diffusion tractography and to annotate MRI connectomics datasets acquired in the living human brain. First a neuroanatomical "view" called NEURO-DL-FMA was extracted from the reference ontology Foundational Model of Anatomy (FMA) in order to construct a gross anatomy ontology of the brain. HCO extends NEURO-DL-FMA by introducing entities (such as "MR\_Node" and "MR\_Route") and object properties (such as "tracto\_connects") pertaining to MR connectivity. The Web Ontology Language Description Logics (OWL DL) formalism was used in order to enable reasoning with common reasoning engines. Moreover, an experiment was carried out in order to demonstrate how the HCO could be effectively used to address complex queries concerning *in vivo* MRI connectomics datasets. Neuroimaging datasets of five healthy subjects were annotated with terms of the HCO and a multi-level analysis of the connectivity patterns assessed by diffusion tractography of the right medial Brodmann Area 6 was performed using a set of queries. This approach can facilitate comparison of data across scales, modalities and species.

Keywords: ontology, connectome, data sharing, neuroanatomy, neuroimaging, tractography, semantic web, MRI

## 1. Introduction

The human brain is composed of a vast number of interconnected neurons forming structural circuits which transmit information. Multi-scale analysis of this anatomical connectivity (Caspers et al., 2013), from synaptic connections between individual neurons (microscopic scale) to brain regions interconnected via white matter fiber bundles (macroscopic scale), is fundamental to better apprehend the link between structure and function in diseased and healthy brains (Honey et al., 2010). A promising way for studying brain connectivity is to compile a coherent mapping of the network of elements and connections forming the human brain and defined as the human connectome (Sporns et al., 2005).

#### Edited by:

*Maryann E. Martone, University of California San Diego, USA*

#### Reviewed by:

*Mihail Bota, University of Southern California, USA Jose L. V. Mejino, University of Washington, USA*

#### \*Correspondence:

*Tristan Moreau, UMR 1099 LTSI, INSERM, Universite de Rennes 1, 2 avenue du Pr. Leon Bernard, Rennes F-35000, France tristan.soyyo@gmail.com*

> Received: *30 September 2014* Accepted: *24 March 2015* Published: *10 April 2015*

#### Citation:

*Moreau T and Gibaud B (2015) Ontology-based approach for* in vivo *human connectomics: the medial Brodmann area 6 case study. Front. Neuroinform. 9:9. doi: 10.3389/fninf.2015.00009*

Recent advances in Magnetic Resonance Imaging (MRI) and brain networks have opened new possibilities to map and analyse anatomical and functional long-range connectivities in the living brain, giving birth to a new field of research: human connectomics (Behrens and Sporns, 2012). Currently, diffusion MRI (dMRI) and functional MRI (fMRI) are the most popular modalities to assess non-invasively anatomical and functional connectivities, respectively (Craddock et al., 2013). dMRI estimates the local fiber bundle orientations at millimeter voxel resolution as the directions of least hindrance to water diffusion in the brain. Then, tractography aims at reconstructing white matter fiber bundles using algorithmic approaches based on local fiber bundle orientations (Basser et al., 2000). fMRI uses temporal correlations in the fluctuations of the Blood-Oxygenation-Level-Dependent (BOLD) signal to infer functional connectivity (Smith et al., 2011). After reconstruction of anatomical or functional connectivities from MRI (Jbabdi and Johansen-Berg, 2011), in vivo neuroimaging data can be modeled and analyzed using connectomics in order to produce brain networks at macroscopic scale (∼1 cm<sup>3</sup> or greater) (Hagmann et al., 2007; Zalesky et al., 2011; Sporns, 2013). However, there is a great diversity in methodological approaches; in particular, no consensus currently exists on how to best define nodes for charting the in vivo human connectome, i.e., subdividing the brain into macroscopic regions in an anatomofunctionally coherent way (Craddock et al., 2013; Fortino et al., 2013). Indeed, depending on the scope of the study, nodes can represent small regions (∼1 cm) or larger brain areas such as a specific gyrus. Moreover, comparing data across scales, modalities, and species remains challenging (Essen and Ugurbil, 2012; Leergaard et al., 2012). There is a real need for new neuroinformatics tools for in vivo human connectomics that allow different levels of granularity of multi-modal connectivity data to be described, shared, integrated and compared.

Semantic annotation of brain images consists of associating meaningful metadata using terms of an ontology in order to describe and share information related to that resource such as acquisition protocol, anatomical content, diagnosis etc. (Mechouche et al., 2009; Turner et al., 2010). Biomedical ontologies are structured vocabularies representing classes of entities which are of biomedical significance in reality. They focus on the definition of the entities of the domain being modeled and on the relations between them, especially the subtype relation used to organize the entities in a taxonomy. Ontologies also specify other relations, such as the "part of" relation, or any other relation that is relevant in the domain of interest. Specifying the set of relations (called axioms) that apply to all the instances of a class contributes to capturing knowledge about this entity (Gruber, 1995). Axioms can be expressed in the OWL<sup>1</sup> ontology language standard (Web Ontology Language, defined by the W3C), and especially the OWL DL<sup>2</sup> sublanguage, based on Description Logics (DL) (Baader et al., 2003). OWL DL provides a good compromise between expressivity and computational complexity (and decidability). Moreover, it allows reasoning on formal knowledge and automatically inferring new axioms using description logic reasoning engines such as FaCT++<sup>3</sup>. Ontology-based systems and reasoning engines are particularly relevant in the human connectomics realm as they provide the capability to apprehend consistently multi-scale knowledge, to describe heterogeneous data with semantic annotations, and finally to facilitate data querying, sharing and interoperability.

In recent years, different efforts were reported to specify computer models and ontologies to represent, collate, process and share human brain anatomical connectivity (OBO Relation Ontology, Smith et al., 2005; Swanson and Bota, 2010; Larson and Martone, 2013; Bota et al., 2014; Nichols et al., 2014). On one hand, the Foundational Model of Anatomy (FMA) was developed to provide a reference ontology for human anatomy. It includes many terms from Terminologia Anatomica (Federative Committee on Anatomical Terminology, 1998), which itself has its origin in Nomina Anatomica (International Anatomical Nomenclature Committee, 1989). The fundamental difference between these terminologies and ontologies like FMA is that the former provide organizations of terms that enhance part of the intrinsic meaning of each term, in an implicit way, whereas ontologies such as FMA relate terms using relationships bearing explicit semantics such as subsumption links and "part of" links. FMA specifies anatomical connectivity relationships at different levels of granularity (Rosse and Mejino, 2003; Nichols et al., 2014). On the other hand, the Foundational Model of Connectivity (FMC) provides a high level conceptual framework suitable for modeling "structural architecture of nervous connectivity in all animals at all resolutions" (Swanson and Bota, 2010). In particular, this model influenced and is compatible with BAMS, the Brain Architecture Management System built by Mihail Bota and colleagues, a neuroinformatics system to store, mine and model structural connectivity in multiple species such as mouse, rat, cat, macaque and human. Most connectivity data concern pathway-tracing experiments in animals, techniques based on injection of a tracer and tracing of neural connections either from their source to their point of termination (anterograde tracing) or the opposite (retrograde tracing).
However, although these biological ontologies and conceptual models aim at representing anatomical connectivity, none of them can yet be used to represent connectivity assessed by diffusion tractography. Indeed, diffusion tractography can only provide limited insight on the organization of in vivo white matter fiber bundles at the present time (cf. Section 4): for example it cannot determine polarity of connections, nor synaptic connections. Moreover, cytoarchitecture of the cerebral cortex cannot be rendered using MRI due to limited spatial resolution, so that concepts of gray matter region defined using criteria based on spatial distribution of a set of neuron types are not relevant for in vivo connectomics. So, a real need is emerging for a new ontology in order to bridge the gap between experimental neurobiology and in vivo human connectomics observations provided by MRI.

<sup>1</sup>OWL, http://www.w3.org/TR/owl-ref/

<sup>2</sup>OWL DL, http://www.w3.org/TR/owl-ref/#OWLDL

<sup>3</sup>FaCT++, http://owl.man.ac.uk/factplusplus/

An important and complementary field of research in neuroimaging concerns the development of digital atlases providing both a template brain and neuroanatomical labels in a conformed space. Individual brain datasets are aligned to the atlas using volumetric or surface-based registration approaches in order to propagate the neuroanatomical labels of the atlas to brain regions. A great number of in vivo neuroimaging datasets are currently annotated using brain atlases such as the Talairach Atlas (Talairach and Tournoux, 1988), the Montreal Neurological Institute (MNI) atlas (Tzourio-Mazoyer et al., 2002), or the atlases embedded in software tools such as Freesurfer<sup>4</sup> (Fischl et al., 2004; Desikan et al., 2006) or the JHU white matter tractography atlas<sup>5</sup> (Wakana et al., 2007; Hua et al., 2008). Recently, FMA provided a mapping between several terminologies used in brain atlases, such as Freesurfer or the JHU white matter tractography atlas, and the corresponding neuroanatomical concepts defined in FMA (Nichols et al., 2014). This effort facilitates the use of FMA as a reference and pivotal terminology for the annotation of brain segmentation results such as cortical or subcortical gray matter regions, and different white matter fiber bundles.

The main contribution of this paper was to create a generic neuroanatomical ontology called "Human Connectomics Ontology"<sup>6</sup> (HCO) in order to represent macroscopic regions defined on MRI datasets connected via fiber bundles assessed by diffusion tractography in the living human brain. Grounded on the FMA reference ontology, HCO was expressed in OWL and used the OWL DL sublanguage in order to be processable by usual reasoning engines. The latter provide highly optimized implementations of reasoning algorithms to process and correctly answer arbitrarily complex queries, such as those involving, e.g., transitive part-whole and spatial relationships. Moreover, an experiment was carried out in order to show how the HCO could be effectively used to address complex queries concerning in vivo MRI connectomics datasets: a multi-level analysis of the connectivity pattern of the right medial Brodmann Area 6 (BA6) reconstructed by diffusion tractography was performed, using a set of queries on annotated neuroimaging datasets of five healthy subjects. The medial BA6 region is a cortical region defined using a set of cytoarchitectural criteria (Zilles and Amunts, 2010) and is part of the medial frontal cortex located on the midline surface of the hemisphere just in front of the primary motor cortex. This region of interest was chosen because different studies showed how it could be subdivided into different sub-regions in a reproducible way using criteria based on long-range connectivity assessed by diffusion tractography (Johansen-Berg et al., 2004; Anwander et al., 2007; Jbabdi et al., 2009). We believe that this approach can facilitate comparison of data across scales, modalities and species.

The remainder of the paper is organized as follows. Section 2 describes how the HCO was designed and built. Section 3 presents the results of the experimental work. Finally, Sections 4 and 5 are dedicated to the discussion and conclusion.

## 2. Materials and Methods

## 2.1. A Neuroanatomical Ontology for in vivo Human Connectomics

#### 2.1.1. Requirements and design

Competency questions (Neuhaus and Vizedom, 2013) are increasingly used to specify the domain that an ontology should cover. Therefore, we designed a set of competency questions pertaining to a use case inspired by the medial BA6 connectivity-based parcellation (Johansen-Berg et al., 2004), in order to assess how the Human Connectomics Ontology (HCO) can support hypothesis-driven analysis of connectomics datasets at different levels of granularity:


In order to meet these requirements, a neuroanatomical ontology module, called "NEURO-DL-FMA," was first built in order to annotate gross anatomy of the brain (i.e., gray matter regions, white matter fiber bundles). NEURO-DL-FMA was based on a subset of FMA, which is an open source reference ontology representing the phenotypic structure of the human body at different scales. FMA contains more than 85,000 classes and 140 relationships between entities (Rosse and Mejino, 2003; Golbreich et al., 2013). Finally, the HCO was based on NEURO-DL-FMA and aimed at representing nodes connected with fiber bundles assessed by diffusion tractography. Moreover, nearest neighbor topology between gray matter regions was also represented.

**Figure 1** depicts a scenario of information retrieval concerning in vivo connectivity patterns assessed by diffusion tractography. Investigators can pose a wide range of queries using terms (i.e., classes and object properties) of the HCO. For example, an investigator could be interested in retrieving all cortical parcels of the right medial BA6 which have a connectivity pattern similar to the right Supplementary Motor Area (connectivity pattern passing through the right corticospinal tract or connected to gray matter parts of the right precentral gyrus). The query is submitted to a reasoning engine that infers automatically part-whole, connectivity and spatial relationships at different levels of granularity.

<sup>4</sup>Freesurfer, http://surfer.nmr.mgh.harvard.edu/fswiki

<sup>5</sup>FSL, http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL

<sup>6</sup>HCO, https://medicis.univ-rennes1.fr/activities/theme3/projects/semantic-annotation/index

## 2.1.2. NEURO-DL-FMA: a Neuroanatomical Gross Anatomy Ontology

The NEURO-DL-FMA is a neuroanatomical gross anatomy ontology that was built in two steps. First, all useful entities and relations were extracted as a "view" from the FMA reference ontology (OWL Full 3.2.1 version) (Noy and Rubin, 2008). This view was then translated into OWL DL, which was necessary since most commonly used reasoning engines do not support OWL Full.

As the FMA contains more than 85,000 anatomical concepts, the first step was to extract a "view" from the FMA in order to focus only on concepts and relationships of interest. This was done using vSparQL queries (Shaw et al., 2011), and the entities were extracted from the more specific to the more general ones. This view consisted of (1) all concepts denoting a gray matter structure mapping a Freesurfer cortical region, and (2) all concepts denoting a white matter bundle mapping an entity in the JHU white matter tractography atlas. A one-to-one mapping between FMA and the Freesurfer and JHU white matter tractography atlas terminologies was available in the 3.2.1 version of the FMA (Nichols et al., 2014). Then all the entities related using the fma:regional\_part\_of or fma:constitutional\_part\_of object properties were extracted recursively (the "fma" prefix denotes terms originating from the FMA). The Brodmann areas, the hippocampus parts, and the other object and data properties were excluded. All the entities that subsume the entities present in the current view were included recursively. All the metaclasses of FMA (Dameron et al., 2005) included in our view were then discarded as they did not contain useful information for our gross anatomy ontology. All concepts that did not concern the domain of neuroanatomical gross anatomy, such as fma:Human\_body, were also discarded. In order to reuse nearest neighbor topology knowledge represented in the FMA, all the entities related using the fma:attributed\_continuous\_with object property of type fma:Continuous\_with\_relation were included in our view. Finally, only the following three object properties and their inverses (if any) were kept: fma:constitutional\_part, fma:regional\_part, fma:attributed\_continuous\_with.

This view was produced using a web service implementation developed by the University of Washington's Structural Informatics Group<sup>7</sup>. In this implementation, the FMA (OWL Full 3.2.1 version) is embedded in a MySQL<sup>8</sup> relational database. This web service, based on Apache Jena<sup>9</sup>, accepts vSparQL queries allowing portions of the FMA to be extracted by recursively following complex pathways within the ontology graph (Shaw et al., 2011). **Figure 2** shows an example in which all entities related to the fma:Right\_precentral\_gyrus using the fma:regional\_part\_of or fma:constitutional\_part\_of object properties were extracted from the FMA using the following vSparQL query:

CONSTRUCT { ?x ?y ?z } FROM <http://purl.org/sig/fma> FROM NAMEDEV <rpo\_precentral\_gyrus> [ CONSTRUCT { temp:set temp:member ?x. } FROM <http://purl.org/sig/fma> WHERE { fma:Right\_precentral\_gyrus gleen:OnPath("([fma:regional\_part\_of]|[fma:constitutional\_part\_of])\*" ?x). } ] WHERE GRAPH { <rpo\_precentral\_gyrus> { ?x ?y ?z } }.

This query was processed by a web service running on a local server at the University of Rennes 1. The result of this query (cf. **Figure 2**) lists all anatomical concepts from the right precentral gyrus up to the human body entities, illustrating part-whole relationships in FMA, and was expressed using the Resource Description Framework<sup>10</sup> (RDF).
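The recursive path expression in the query above (zero or more steps along either part-whole property) amounts to a graph closure. The following minimal Python sketch illustrates that idea only; it is not the vSparQL engine, and the edge table `PART_OF` and its entity names are hypothetical stand-ins for the FMA triples.

```python
from collections import deque

# Hypothetical toy excerpt of part-whole edges (part -> wholes). The real FMA
# view is stored as RDF triples and contains many more entities.
PART_OF = {
    "Right_precentral_gyrus": ["Right_frontal_lobe"],
    "Right_frontal_lobe": ["Right_cerebral_hemisphere"],
    "Right_cerebral_hemisphere": ["Telencephalon"],
    "Telencephalon": ["Forebrain"],
}


def part_of_closure(start, edges):
    """Return every entity reachable from `start` in zero or more part-of steps."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for whole in edges.get(node, []):
            if whole not in seen:  # guard against revisiting shared ancestors
                seen.add(whole)
                queue.append(whole)
    return seen


closure = part_of_closure("Right_precentral_gyrus", PART_OF)
```

Because the path operator allows zero steps, the starting entity itself belongs to the result, which the sketch mirrors by seeding `seen` with `start`.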

Finally, a translation into OWL DL was necessary in order to enable the subsequent use of reasoning engines. This was done using a local Java program based on the OWL API

<sup>7</sup>University of Washington's Structural Informatics Group, http://sig.biostr. washington.edu/

<sup>8</sup>MySQL, https://www.mysql.com/

<sup>9</sup> Jena, https://jena.apache.org/

<sup>10</sup>RDF, http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/

package<sup>11</sup>. All classes and object properties of the view were included in the ontology. As object properties were expressed at the individuals' level in the view, all object properties were translated at the classes' level using existential restrictions. **Figure 3** depicts an example of translation of the fma:Precentral\_gyrus entity from the view expressed in OWL Full (cf. part 1 of the figure) into NEURO-DL-FMA expressed in the OWL DL formalism (cf. part 2). In the view, entities such as fma:Precentral\_gyrus appear both as a class and as an individual (cf. part 1). While part-whole relationships such as fma:constitutional\_part or fma:regional\_part\_of were expressed at the individuals' level in the view (cf. part 1), the same object properties were expressed at the classes' level using existential restrictions in NEURO-DL-FMA (cf. part 2).
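The translation step described above can be pictured as rewriting each individual-level assertion "s p o" into a class-level axiom stating that every s is related by p to some o. The sketch below only renders such axioms as Manchester-like strings; it does not use the OWL API, and the assertions listed in `ASSERTIONS` are illustrative assumptions rather than actual content of the view.

```python
# Hypothetical individual-level assertions (subject, property, object) in the
# spirit of the OWL Full view; the fillers are illustrative only.
ASSERTIONS = [
    ("fma:Precentral_gyrus", "fma:regional_part_of", "fma:Frontal_lobe"),
    ("fma:Precentral_gyrus", "fma:constitutional_part",
     "fma:Gray_matter_of_precentral_gyrus"),
]


def to_existential_axiom(subject, prop, obj):
    """Render 'subject prop obj' as a class-level existential restriction."""
    return f"{subject} SubClassOf {prop} some {obj}"


axioms = [to_existential_axiom(*triple) for triple in ASSERTIONS]
for axiom in axioms:
    print(axiom)
```

Expressing the relation as `p some o` keeps the ontology within OWL DL, since no entity has to be treated as both a class and an individual.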

### 2.1.3. The Human Connectomics Ontology

The "Human Connectomics Ontology" (HCO) was created in order to represent brain regions, nearest neighbor topology and connectivity relationships assessed by diffusion tractography.

Different classes and object properties were defined in the HCO (cf. **Tables 1**, **2**):


<sup>11</sup>OWL API, http://owlapi.sourceforge.net/

<sup>12</sup>Gray-matter-region: http://brancusi1.usc.edu/thesaurus/definition/gray-matterregion/

<sup>13</sup>Node: http://brancusi1.usc.edu/thesaurus/definition/node/

FIGURE 3 | Example of translation of an entity from a subset (or a "view") of the Foundational Model of Anatomy (FMA) expressed in OWL Full (cf. part 1) into the corresponding NEURO-DL-FMA entity expressed using the OWL DL sublanguage (cf. part 2). The "fma" prefix denotes that the entity was part of the FMA. In the left part of the figure (cf. part 1), concepts such as *fma:Precentral\_gyrus* were both class and instance. Moreover, part-whole relationships such as *fma:constitutional\_part* or *fma:regional\_part\_of* were expressed at the individuals' level. In the right part of the figure (cf. part 2), the same relationships were represented at the classes' level using existential restrictions.

#### TABLE 1 | Definition of Human Connectomics Ontology (HCO) classes.


*The "fma:" prefix denotes terms originating from the Foundational Model of Anatomy. The "hco:" prefix denotes terms defined in the HCO.*

adaptation of the route<sup>14</sup> term, defined in the FMC thesaurus (Swanson and Bota, 2010), dedicated to the brain images domain as diffusion tractography does not probe the path of white matter bundles directly, but water diffusion in brain.

• hco:is\_tracto\_connected: Object property that links an hco:MR\_Node to an hco:MR\_Route. The inverse property is hco:tracto\_connects.

<sup>14</sup>Route: http://brancusi1.usc.edu/thesaurus/definition/route/

• hco:mr\_connection: Symmetric object property that denotes the existence of a pathway assessed by diffusion tractography linking two different hco:MR\_Node. If a hco:is\_tracto\_connected b and b hco:tracto\_connects c, then a hco:mr\_connection c. This property is an adaptation of the connection<sup>15</sup> term, defined in the FMC thesaurus (Swanson and Bota, 2010), dedicated to connections assessed by diffusion tractography.

<sup>15</sup>Connection: http://brancusi1.usc.edu/thesaurus/definition/connection/

#### TABLE 2 | Object properties of the human connectomics ontology.


*The "hco:" prefix denotes terms of the human connectomics ontology. The "fma:" prefix denotes terms originating from the Foundational Model of Anatomy.*

• hco:continuous\_with: Symmetric object property that links an fma:Anatomical\_structure to some nearest neighbor fma:Anatomical\_structure.

**Figure 4** illustrates how a fiber bundle reconstructed by diffusion tractography that connected two cortical parcels via the corpus callosum white matter fiber bundle was represented using terms of the HCO. The hco:s1\_gray\_matter\_of\_right\_superior\_frontal\_gyrus\_17 instance of the hco:MR\_Node class denotes a high resolution cortical parcel. Part-whole relationships were represented using the fma:regional\_part\_of and the fma:constitutional\_part\_of object properties: this cortical parcel was a regional part of some gray matter of the right superior frontal gyrus, which was a constitutional part of the right superior frontal gyrus, which was in turn a regional part of the right frontal lobe. The parcel in the right hemisphere was linked to the other hemisphere via the hco:s1\_mr\_route\_118 instance of the hco:MR\_Route class. This instance was related to the anterior part of the corpus callosum via a fma:regional\_part\_of object property. Finally, the two different cortical parcels were linked via the hco:mr\_connection object property.
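The rule stated for hco:mr\_connection (two different nodes linked to the same route are connected, symmetrically) can be sketched procedurally. In the HCO this inference is left to the reasoner; the Python below is only an illustration, and the link table `IS_TRACTO_CONNECTED` contains made-up data whose identifiers merely imitate the instance names used in the text.

```python
from itertools import combinations

# Hypothetical node -> route links (hco:is_tracto_connected). Since
# hco:tracto_connects is the inverse property, one table is enough.
IS_TRACTO_CONNECTED = [
    ("s1_gray_matter_of_right_superior_frontal_gyrus_17", "s1_mr_route_118"),
    ("s1_gray_matter_of_left_superior_frontal_gyrus_39", "s1_mr_route_118"),
    ("s1_gray_matter_of_right_precentral_gyrus_3", "s1_mr_route_7"),
]


def derive_mr_connections(links):
    """Pair up distinct nodes sharing a route; pairs are unordered (symmetric)."""
    nodes_by_route = {}
    for node, route in links:
        nodes_by_route.setdefault(route, set()).add(node)
    connections = set()
    for nodes in nodes_by_route.values():
        for a, b in combinations(sorted(nodes), 2):
            connections.add(frozenset((a, b)))
    return connections


mr_connections = derive_mr_connections(IS_TRACTO_CONNECTED)
```

Storing each pair as a `frozenset` is one simple way to reflect the symmetry of the property; a route reached by a single node yields no connection.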

### 2.2. Experimental Work

The aim of this experimental work was to assess how the HCO can effectively be used to enhance multi-level hypothesis-driven analysis of connectomics datasets. **Figure 5** depicts an overview of the main steps of this experimental work.

### 2.2.1. Neuroimaging Data and Preprocessing

The analysis was performed for five subjects of the NMR public database (Poupon et al., 2006). This database provided T1 (voxel size 0.9 × 0.9 × 1.2 mm) and diffusion-weighted datasets (voxel size of 1.9 × 1.9 × 2.0 mm) acquired with a GE Healthcare Signa 1.5 Tesla Excite II scanner. The diffusion datasets presented a high angular resolution (HARDI) based on 200 directions and a b-value of 3000 s/mm<sup>2</sup>. A twice-refocused spin echo technique was used to compensate, to first order, for the echo-planar distortions due to eddy currents (Reese et al., 2003). Susceptibility artifacts were corrected using a phase map acquisition. See "MRI acquisitions" on **Figure 5**.

FIGURE 5 | … Automatic segmentation of brain regions using the Freesurfer pipeline. (3) Computation of a high resolution parcellation (CMTK toolkit). (4) Acquisition of diffusion weighted MR images. (5) Computation of … based on the JHU white atlas (FSL toolkit). (8) Computation of connectivity matrices. (9) Automatic annotation of MRI connectomics datasets using terms of the Human Connectomics Ontology (HCO).

The Freesurfer pipeline was applied on T1-weighted datasets, producing cortical (Desikan et al., 2006) and sub-cortical segmentations (Dale et al., 1999). Then a high resolution cortical parcellation was performed for each subject. Each of the Freesurfer cortical regions was arbitrarily subdivided into a set of small and compact parcels of about 1.5 cm<sup>2</sup> (Hagmann et al., 2008), resulting in 1000 parcels covering the entire cortex, using the connectome mapping toolkit<sup>16</sup> (CMTK) (Daducci et al., 2012). The nearest neighbors of each parcel were assessed for each subject using a dilation-based strategy. Thus, a total of 1000 cortical parcels and other regions (i.e., thalamus, caudate, putamen, pallidum, accumbens area, amygdala, hippocampus in both hemispheres, and brain-stem) were defined in the Freesurfer structural space. See "Freesurfer pipeline" and "high resolution parcellation" on **Figure 5**.
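The text does not detail the dilation-based neighbor search, so the following is only a plausible sketch under stated assumptions: parcels are integer labels on a 2D grid (the actual data are 3D), background is 0, and dilation grows each parcel by one voxel along the grid axes. Any other label met by the dilated parcel is recorded as a nearest neighbor.

```python
def neighbor_parcels(labels):
    """Map each parcel label to the labels touched by a one-voxel dilation.

    `labels` is a 2D grid (list of lists) of integer labels, 0 = background.
    """
    neighbors = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            if here == 0:
                continue
            neighbors.setdefault(here, set())
            # Cross-shaped structuring element: up, down, right, left.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    other = labels[rr][cc]
                    if other not in (0, here):
                        neighbors[here].add(other)
    return neighbors


# Made-up 3 x 3 label image with three parcels.
GRID = [
    [1, 1, 2],
    [1, 3, 2],
    [0, 3, 3],
]
adjacency = neighbor_parcels(GRID)
```

The resulting relation is symmetric by construction, which matches the symmetric hco:continuous\_with property used to record nearest neighbors.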

<sup>16</sup>CMTK, http://www.cmtk.org/

Two target masks were defined for the tractography in the Freesurfer structural space. The first target mask was defined as the set of high resolution cortical parcels included in the right medial Brodmann Area 6 (BA6). This right medial BA6 region of interest was defined by restricting the Freesurfer segmentation corresponding to the cortical region of the right superior frontal gyrus (i.e., "ctx-rh-superiorfrontal") from y = −22 to y = 30 (MNI coordinates in the anteroposterior direction) and from the cingulate sulcus to the dorsal surface of the brain in order to include only voxels belonging to the gray matter on the medial wall (Johansen-Berg et al., 2004). The registration between the Freesurfer conformed space and the MNI space was computed using a linear registration (12 DOF) based on mutual information. This was done using the FLIRT tool of the FSL toolbox<sup>17</sup>. Finally, the second target mask was defined as the remaining cortical parcels or regions.

All the 20 white matter tracts of the JHU white matter tractography probabilistic atlas based on diffusion tensor imaging (Wakana et al., 2007; Hua et al., 2008) were segmented using a threshold at 25: anterior radiation of thalamus, corticospinal tract, anterior segment of cingulum bundle, anterior forceps of corpus callosum, posterior forceps of corpus callosum, inferior occipitofrontal fasciculus, inferior longitudinal fasciculus, uncinate fasciculus, superior longitudinal fasciculus in both hemispheres. A total of 22 white matter masks (one for each fiber bundle and two for the rest of the white matter in both hemispheres) were defined as seed masks for the tractography. An automatic linear (12 DOF) registration based on the correlation ratio between the JHU white matter tractography atlas and the Freesurfer structural space was computed using the FLIRT tool. See "JHU white matter atlas" on **Figure 5**.

A registration between the Freesurfer structural space and the diffusion dataset space was computed using a rigid registration (6 DOF) based on mutual information implemented in FLIRT. The registration was performed considering the average of five B0 volumes (i.e., volumes with b-value = 0) and the brain volume in the Freesurfer structural space.

## 2.2.2. Connectivity Assessed by Diffusion Tractography

The aim of the probabilistic tractography was to characterize the connectivity pattern of each structural element, denoted by seeds, by probing the Brownian movement of water molecules within white matter fiber bundles. Probabilistic tractography was performed using the bedpostX and probtrackX2 tools, part of the FSL toolbox (Behrens et al., 2007). BedpostX uses Markov chain Monte Carlo sampling to estimate the diffusion parameters at each voxel. The probabilistic tractography could model up to two fiber bundles in each voxel. The burn-in of the Markov chains was set to 3000 in order to ensure convergence of the model. See "model of diffusion" on **Figure 5**.

A whole brain probabilistic tractography was performed in order to assess gray-to-gray connectivity between the two target masks defined above. The probtrackX2 tractography tool drew 5000 probabilistic streamlines that were sent in both directions from the distribution connectivity of each white matter seed voxel. The 22 white matter masks defined above were used successively as seed for the tractography. If a streamline hit the two target masks at two locations, one on either side of the streamline, then the corresponding row and column of the connectivity matrix were filled. Streamlines that stopped before reaching a length of 30 mm or that passed through an exclusion mask (i.e., ventricles, cerebrospinal fluid (CSF), the choroid plexus) were discarded. A distance correction was used in order to correct for the fact that the connectivity distribution drops with distance from the seed voxel. See "probabilistic tractography" on **Figure 5**.

Each of the 22 voxel-wise connectivity matrices was converted into a region-wise connectivity matrix (11 × 1004) between the 11 cortical regions in the first target mask and the 1004 cortical and other brain regions in the second target mask. The connectivity between two regions in the region-wise connectivity matrix was computed as the mean of the connectivities between the voxels belonging to the corresponding regions. After a logarithmic transformation of the region-wise connectivity matrix, a normalization of the values of each row was achieved. Finally, connectivity values in the region-wise connectivity matrices were thresholded at 0.7 in order to keep only connections reconstructed by diffusion tractography with a high probability. See "connectivity matrices" on **Figure 5**.
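The conversion described above (regional mean, logarithmic transformation, row normalization, thresholding at 0.7) can be sketched for a single row as follows. Several details are not specified in the text and are assumptions here: `log1p` is used as the logarithmic transformation, each row is normalized by its maximum, and values greater than or equal to the threshold are kept. All numbers are made up.

```python
import math


def region_wise_row(voxel_rows, target_regions):
    """Average voxel-wise connectivity into one row of a region-wise matrix.

    voxel_rows: connectivity rows of the voxels of one source region.
    target_regions: one list of column indices per target region.
    """
    row = []
    for cols in target_regions:
        values = [voxel[c] for voxel in voxel_rows for c in cols]
        row.append(sum(values) / len(values))
    return row


def log_normalize_threshold(row, threshold=0.7):
    """Apply log1p (assumed), divide by the row maximum (assumed), binarize."""
    logged = [math.log1p(x) for x in row]
    peak = max(logged)
    scaled = [x / peak if peak > 0 else 0.0 for x in logged]
    return [1 if x >= threshold else 0 for x in scaled]


# Made-up streamline counts: 2 source voxels x 4 target voxels.
voxels = [[100, 120, 0, 4], [80, 100, 2, 0]]
targets = [[0, 1], [2], [3]]  # three target regions
mean_row = region_wise_row(voxels, targets)
binary_row = log_normalize_threshold(mean_row)
```

With these toy values only the first target region survives the threshold, which illustrates how row normalization keeps the strongest connections of each source region regardless of absolute streamline counts.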

## 2.2.3. Automatic Annotation of MRI Connectomics Datasets

The HCO was populated with instances describing fiber bundles assessed by diffusion tractography between different gray matter regions of the five healthy subjects. This was achieved using a Java program based on OWL API. First, each gray matter region was represented as an instance of the hco:Gray\_matter\_part class. If the gray matter region was a high resolution cortical parcel, then the cortical parcel was related to the instance of the overlapping gyrus via the fma:regional\_part\_of object property. If two instances of fma:Anatomical\_structure were found to be nearest neighbors then they were related together using the hco:continuous\_with object property. Finally, each binary region-wise connectivity matrix (11 × 1004) was used to encode the connectivity reconstructed by diffusion tractography between the 11 cortical regions defined in the first target mask and the 1004 brain regions defined in the second target mask (cf. **Figure 5**, "automatic annotation of datasets using HCO"):


<sup>17</sup>FSL, http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL

Frontiers in Neuroinformatics | www.frontiersin.org April 2015 | Volume 9 | Article 9 |

## 2.2.4. Answering Competency Questions through the Human Connectomics Ontology

**Table 3** presents each competency question translated into terms of the HCO before submission to the FaCT++ reasoning engine via the "DL Query" tab of the ontology editor Protégé (Rubin et al., 2007). FaCT++ is an efficient tableaux-based reasoner implemented using C++ that supports OWL DL. It is used as one of the default reasoners in Protégé (version 4). In order to find all parts of (fma:regional\_part\_of, fma:constitutional\_part\_of) an anatomical structure, the hco:part\_of transitive object property was used. The right medial parietal cortex was expressed using a conjunction of terms from the ontology: fma:Cortex\_of\_right\_parietal\_lobe and fma:Medial\_segment\_of\_cerebral\_hemisphere. The inferior frontal cortex was translated into the fma:Orbitobasal\_segment\_of\_right\_frontal\_lobe term.
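To give an idea of what such a query asks of the reasoner, the sketch below answers a question in the spirit of CQ1 procedurally: which parcels of the right superior frontal gyrus are connected through the corticospinal tract or to some part of the right precentral gyrus. It is not the DL query of **Table 3** and does not call FaCT++; the tables `PART_OF` and `ROUTES_OF` are hypothetical toy data, and transitive part-whole reasoning is emulated by walking up the hierarchy.

```python
# Hypothetical toy knowledge base; every identifier below is illustrative.
PART_OF = {
    "parcel_17": "Gray_matter_of_right_superior_frontal_gyrus",
    "parcel_25": "Gray_matter_of_right_superior_frontal_gyrus",
    "parcel_60": "Gray_matter_of_right_precentral_gyrus",
    "Gray_matter_of_right_superior_frontal_gyrus": "Right_superior_frontal_gyrus",
    "Gray_matter_of_right_precentral_gyrus": "Right_precentral_gyrus",
    "route_5": "Right_corticospinal_tract",
    "route_9": "Anterior_forceps_of_corpus_callosum",
}
ROUTES_OF = {  # node -> routes it is tracto-connected to
    "parcel_17": ["route_5"],
    "parcel_25": ["route_9"],
    "parcel_60": ["route_5"],
}


def is_part_of(entity, whole):
    """Walk part-of links upward until `whole` is met or the chain ends."""
    current = PART_OF.get(entity)
    while current is not None:
        if current == whole:
            return True
        current = PART_OF.get(current)
    return False


def answer_cq1():
    """Return parcels of the right superior frontal gyrus matching CQ1."""
    hits = []
    for parcel, routes in ROUTES_OF.items():
        if not is_part_of(parcel, "Right_superior_frontal_gyrus"):
            continue
        via_tract = any(is_part_of(r, "Right_corticospinal_tract") for r in routes)
        via_gyrus = any(
            is_part_of(other, "Right_precentral_gyrus")
            for other, other_routes in ROUTES_OF.items()
            if other != parcel and set(routes) & set(other_routes)
        )
        if via_tract or via_gyrus:
            hits.append(parcel)
    return sorted(hits)


cq1_hits = answer_cq1()
```

A DL reasoner obtains the same kind of answer declaratively from the class expression, without the hand-written traversal shown here.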

## 3. Results

## 3.1. Right Medial Brodmann Area 6 Region of Interest

Our definition of the right medial BA6 was decomposed into eleven high resolution cortical parcels numbered 20, 12, 32, 9, 17, 13, 42, 18, 24, 25, and 38 in all five subjects. Columns 1 and 2 of **Figure 8** depict a map of these parcels on a medial view of the gray/white interface of the right hemisphere.

## 3.2. Structural Connectivity Assessed by Diffusion Tractography

**Table 4** summarizes the set of regions that were found connected to the right medial BA6 via fiber bundles assessed by diffusion tractography for each subject. This set of regions was expressed using FMA terms denoting regions at a high level of granularity (i.e., gyrus level, or other anatomical structures): as an illustration, the hco:s1\_gray\_matter\_of\_left\_superior\_frontal\_gyrus\_39 entity, which was a cortical region of the left superior frontal gyrus (cf. **Figure 4**), was denoted using the fma:Left\_superior\_frontal\_gyrus concept. Of all the anatomical terms connected to the right medial BA6 in our data, 23.7% were common to the five subjects, and 38.9% (respectively 69.5%) were common to at least 4 (respectively 3) subjects.
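The overlap percentages reported above can be computed as sketched below. The sketch assumes that the denominator is the union of terms found in any subject, which the text suggests but does not state explicitly; the subject identifiers and term sets are made up.

```python
from collections import Counter

# Made-up sets of connected anatomical terms, one set per subject.
TERMS_BY_SUBJECT = {
    "s1": {"A", "B", "C"},
    "s2": {"A", "B"},
    "s3": {"A", "C", "D"},
}


def share_common_to_at_least(terms_by_subject, k):
    """Percentage of all observed terms that occur in at least k subjects."""
    counts = Counter(term for terms in terms_by_subject.values() for term in terms)
    common = sum(1 for n in counts.values() if n >= k)
    return 100.0 * common / len(counts)


share_all = share_common_to_at_least(TERMS_BY_SUBJECT, 3)
share_two = share_common_to_at_least(TERMS_BY_SUBJECT, 2)
```

Because the criterion is "at least k subjects", the percentage can only grow as k decreases, as in the reported figures.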

## 3.3. Semantic Annotation of MRI Connectomics Datasets

The HCO and its NEURO-DL-FMA module contained 811 classes, 11 object properties, no data property and a mean of 2321 instances per subject. The HCO was classified in less than 6 s per subject using the FaCT++ reasoning engine on a dual core processor, 3.06 GHz, 3.9 GB RAM workstation. The reasoning engine was used both to keep ontologies in a logically consistent state, and to infer new axioms between brain regions.

An example of the use of some part-whole, spatial and connectivity relationships using the HCO terms is provided on **Figure 6**. The part-whole relationship was expressed using the fma:regional\_part\_of object property denoting the fact that the high resolution cortical parcel 10 (i.e., hco:s1\_gray\_matter\_of\_left\_superior\_frontal\_gyrus\_10) was a regional part of the gray matter of the left superior frontal gyrus of the subject 01. The spatial relationship was expressed thanks to the hco:continuous\_with object property denoting the fact that the cortical parcel 10 had several nearest neighbors in the gray matter of the left frontal gyrus, namely parcels number 16, 24, 32, 35, 39, and 8. Finally, the cortical parcel 10 was linked to the instance number 117 of the MR\_Route class with the hco:is\_tracto\_connected object property denoting the existence of a fiber bundle reconstructed by diffusion tractography.

**Figure 7** aimed at illustrating the use of the hco:tracto\_connects connectivity relationship using HCO terms. The s1\_mr\_route\_117 instance of the MR\_Route class denoted a fiber bundle assessed by diffusion tractography belonging to the subject 01. This route 117 was linked to two cortical parcels in the left (number 10) and right (number 25) superior frontal gyri using the hco:tracto\_connects object property. The fma:regional\_part\_of object property expressed the fact that the route 117 was a part of the anterior forceps of the corpus callosum.

TABLE 3 | Translation of the four competency questions into Description Logic (DL) queries using terms of the Human Connectomics Ontology (HCO).

#### TABLE 4 | Connectivity assessed by diffusion tractography between the right medial Brodmann area 6 (BA6) and other regions of the brain using the FMA terminology for the 5 subjects.

FIGURE 7 | Illustration of the use of the hco:tracto\_connects connectivity relationship between a fiber bundle assessed by diffusion tractography (i.e., hco:MR\_Route) and different cortical parcels (i.e., hco:MR\_Node) using terms of the Human Connectomics Ontology (HCO). The "fma" and "hco" prefixes denote entities of the Foundational Model of Anatomy (FMA) and of the HCO, respectively. The object property *fma:regional\_part\_of* denotes a part-whole relationship.

## 3.4. Ontology Application Testing: the Medial BA6 Case Study

**Table 5** summarizes the results of the queries expressing our competency questions (cf. 2.1.1) and using terms of the HCO before submission to the FaCT++ reasoning engine. These results are expressed using the high resolution cortical parcel identifiers for columns 2, 3, and 4 (i.e., 17 was the identifier of the cortical parcel denoted by the following instance: hco:s1\_gray\_matter\_of\_right\_superior\_frontal\_17). Column 5 of the table lists a set of instances denoting the white matter fiber bundles that were found to match the criteria of the last competency question.

A map of the different high resolution cortical parcels presented in columns 2, 3, and 4 of **Table 5** was plotted on the gray/white interface of the right hemisphere for each subject (cf. **Figure 8**). The first column of this figure represents (in red) the right medial BA6 region of interest for each subject. The second column of the figure depicts a map of the different cortical parcels that are parts of the region of interest and the corresponding identifiers. The third column represents (in blue and green) the results summarized in columns 2 (CQ1 query) and 3 (CQ2 query) of **Table 5**, respectively. The fourth column of the figure represents (in blue and green) the results summarized in columns 3 and 4 of **Table 5**, respectively. The cortical parcels that were found to meet the criteria of several competency questions were represented in orange.

## 4. Discussion

The aim of this study was to design an ontology for in vivo human connectomics, i.e., suitable to describe connectomics data revealed by MRI, thus facilitating their retrieval, sharing and comparison with other neuroscience knowledge resources. A new ontology was created, called the "Human Connectomics Ontology" (HCO), that models brain regions and connectivity relationships assessed by diffusion tractography, using a three-step methodology. First, the domain of discourse was specified using a set of competency questions grounded on the paradigmatic medial BA6 case study. Then, the HCO was based on a neuroanatomical ontology module called "NEURO-DL-FMA" in order to represent gross anatomy of the brain (i.e., gray matter regions, white matter fiber bundles). Finally, a set of entities was explicitly defined in the HCO to represent some aspects of the connectivity that could be observed through diffusion MRI. Moreover, an experiment was carried out in order to show how the HCO could be effectively used to express complex queries and process them using a DL reasoning engine.

## 4.1. A Neuroanatomical Ontology for in vivo Human Connectomics

The medial BA6 case study provided an interesting use case for expressing competency questions for the specification of the HCO. In comparing macaque to human brain, Johansen-Berg et al. showed how the medial BA6 could be subdivided into two major anatomo-functional regions, the Supplementary Motor Area (SMA) and the pre-SMA, using distinct long-range connectivity patterns assessed by diffusion tractography (Johansen-Berg et al., 2004). As these connectivity patterns were concerned with rich neuroanatomical concepts denoting regions at different levels of resolution (e.g., "part of superior frontal gyrus," "part of medial parietal cortex," etc.), our set of competency questions (Neuhaus and Vizedom, 2013), inspired by this study, involved multi-level analysis and rich neuroanatomical expressivity.

TABLE 5 | Results of the different queries corresponding to the Competency Questions (CQ) that were translated into terms of the Human Connectomics Ontology (HCO) and submitted to the FaCT++ reasoning engine (cf. Table 3).

*CQ1: Which gray matter parts of the right superior frontal gyrus have a connectivity pattern passing through the corticospinal tract or through some gray matter parts of the right precentral gyrus?*

*CQ2: Which gray matter parts of the right superior frontal gyrus have a connectivity pattern passing through some gray matter parts of the right medial parietal cortex or through some gray matter parts of the right inferior frontal cortex of the frontal lobe?*

*CQ3: Which gray matter parts of the right superior frontal gyrus have a connectivity pattern passing through the corticospinal tract or through some gray matter parts of the right precentral gyrus or through some gray matter parts contiguous with the right precentral gyrus?*

*CQ4: Which anatomical white matter fiber bundles connect some gray matter parts of the right superior frontal gyrus to some gray matter parts of the right temporal lobe? See* Figure 8 *for a graphic representation of these results.*

NEURO-DL-FMA, defined as a neuroanatomical ontology of the gross anatomy of the brain, was first extracted as a view from the FMA reference ontology in OWL Full and then translated into OWL DL. Several studies have tried to convert the entire FMA (Protégé frames version) into different OWL versions and to use reasoning engines (Golbreich et al., 2006; Golbreich et al., 2013). Our strategy was to extract from the OWL Full version of FMA a "view" of the brain consisting only of entities that were parts of the neuraxis, following Turner et al. (2010) and Shaw et al. (2011). Although the latter study achieved brain image analysis using the DXBrain software (Detwiler et al., 2009), it did not use a DL reasoning engine.

The HCO was designed by taking into account both the reference ontology in neuroanatomy (FMA) and the conceptual framework of structural connectivity (FMC). Although NEURO-DL-FMA was clearly grounded on a subset of FMA, no concept of the BAMS<sup>18</sup> ontology was used in the HCO. Indeed, structural connectivity addressed in BAMS primarily concerns pathway-tracing experiments in animals, whereas we were focusing on connectivity as observed in diffusion MRI. Nevertheless, the FMC was a useful source of inspiration. Some terms of the FMC thesaurus (i.e., gray-matter-region<sup>19</sup>, node<sup>20</sup>, route<sup>21</sup>, connection<sup>22</sup>) were instrumental in the definition of some new HCO entities dedicated to MRI connectomics (i.e., hco:Gray\_matter\_part, hco:MR\_Node, hco:MR\_Route, hco:mr\_connection).

#### 4.2. Experimental Work

The experimental work was carried out to illustrate how the semantic annotation of MRI connectomics datasets, and reasoning about them, could enhance the analysis of the connectivity patterns present in these data. Connectivity was assessed using a probabilistic tractography method in the living human brain. Tractography results should be interpreted with care (Jones et al., 2013). Indeed, anatomical connectivity denotes the white matter fibers that physically connect brain regions, whereas connectivity assessed by diffusion tractography relies on water diffusion as an indirect probe of axon geometry. Tractography infers fiber bundle pathways through the diffusion field by assuming that the direction of least hindered diffusion is aligned with axons (Jbabdi and Johansen-Berg, 2011). Although this hypothesis seems reasonable at the axon level (microscopic scale), it has several practical consequences at the imaging level (macroscopic scale). For example, complex microscopic architectures of white matter fibers are often oversimplified by local models of the axon-diffusion mapping. Moreover, tractography algorithms cannot accurately determine the origin and the termination of connections in the cortex (Jbabdi and Johansen-Berg, 2011). These ambiguities, combined with imaging noise, generate spurious connections between brain regions. This is why it is so important to describe such connections using conceptual entities that distinguish them from connections observed using tracer-based methods, e.g., those collated in the BAMS database. Furthermore, although probabilistic tractography methods do not estimate the connection strength between two regions (as tracer-based methods do), they allow assessing the confidence in the pathway of least hindrance to diffusion. This is a major advantage of probabilistic tractography over deterministic tractography, since the latter cannot provide such confidence cues. So, although tractography is limited by several biases, it is currently the only available tool that gives us the opportunity to investigate anatomical connectivity noninvasively in the living human brain.

<sup>18</sup>BAMS, http://brancusi1.usc.edu/ontology/

<sup>19</sup>Gray-matter-region, http://brancusi1.usc.edu/thesaurus/definition/gray-matterregion/

<sup>20</sup>Node, http://brancusi1.usc.edu/thesaurus/definition/node/

<sup>21</sup>Route, http://brancusi1.usc.edu/thesaurus/definition/route/

<sup>22</sup>Connection, http://brancusi1.usc.edu/thesaurus/definition/connection/

FIGURE 8 | Medial view of the right gray/white interface of the five subjects. Column 1 represents in red the medial Brodmann area 6 region of interest for subjects 01, 02, 03, 04, and 05, respectively. Column 2 depicts a map of the different high resolution cortical parcels that were parts of the region of interest and the corresponding identifiers. The last two columns represent the results of three different queries corresponding to some of our Competency Questions (CQ) that were translated into terms of the Human Connectomics Ontology (HCO) and submitted to the FaCT++ reasoning engine (cf. Table 3). Column 3 represents in blue (resp. green) the cortical parcels matching the query 1 (resp. 2) criteria (cf. Table 5). When a parcel was the result of both queries, it was represented in orange. Column 4 represents in blue the cortical parcels matching the query 3 criteria (cf. Table 5). On column 3, the same color code was kept for the green and orange cortical parcels.

Our automatic annotation of brain images was based on brain segmentations obtained using atlases such as the FreeSurfer atlas (Fischl et al., 2004; Desikan et al., 2006) or the JHU white matter tractography atlas (Wakana et al., 2007; Hua et al., 2008). Such atlases should be used with care in cases of brain pathology, however. Another approach for automatic probabilistic reconstruction of in vivo white matter bundles based on global tractography, called "Tracula," seems more robust in the presence of pathology (Yendiki et al., 2011). However, the JHU white matter tractography atlas was preferred in our experimental work, because FMA provided a one-to-one mapping with the terminology of this atlas. As this atlas provided a probability that a particular voxel belonged to a white matter bundle, some white matter voxels could be mislabelled, particularly where two white matter bundles lie close together. As an illustration, **Table 5** column 4 gives the names of the anatomical white matter fiber bundles reconstructed by diffusion tractography which connected some gray matter parts of the right superior frontal gyrus to some gray matter parts of the right temporal lobe. Two different anatomical bundles were found in our data: the right superior longitudinal fasciculus (found in subjects 01, 02, 03, and 05) and the right inferior longitudinal fasciculus (found in subject 05). Whereas the right superior longitudinal fasciculus was found anatomically relevant in the MRI atlas of human white matter (Mori et al., 2005), the right inferior longitudinal fasciculus was not. This spurious annotation may have resulted from some mislabelled white matter voxels, since the superior and inferior longitudinal fasciculi appeared close to one another, particularly in the occipital lobe of the brain.

The HCO ontology aimed at representing brain regions and connectivity relationships assessed by diffusion tractography in the living brain. This was achieved by creating the corresponding instances of the ontology classes in the annotation file. If the seed of the tractography was located in a segmented white matter bundle, then the instance representing the pathway generated by the tractography algorithm was related to the instance representing this anatomical white matter bundle using the fma:regional\_part\_of object property (cf. **Figure 4**). This is based on the assumption that the whole pathway of the tractography was located within the segmented white matter bundle, which would need to be verified.
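The annotation rule described in this paragraph can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the instance identifiers, and the representation of assertions as (subject, property, object) tuples are hypothetical; only the `fma:regional_part_of` property comes from the text.

```python
from __future__ import annotations

from typing import Optional

# Property named in the text; everything else in this sketch is hypothetical.
REGIONAL_PART_OF = "fma:regional_part_of"


def annotate_pathway(pathway_id: str, seed_bundle_id: Optional[str]) -> list[tuple[str, str, str]]:
    """Return the assertions to add for one tractography pathway.

    If the tractography seed lies inside a segmented white matter bundle,
    the pathway instance is related to that bundle instance. Otherwise no
    assertion is produced. The rule assumes the whole pathway stays inside
    the bundle, which the text notes would need to be verified.
    """
    if seed_bundle_id is None:
        return []
    return [(pathway_id, REGIONAL_PART_OF, seed_bundle_id)]


# Hypothetical instance identifiers, for illustration only.
triples = annotate_pathway("pathway_001", "right_superior_longitudinal_fasciculus_01")
print(triples)
```

In a real annotation file these tuples would be serialized as OWL individuals and object property assertions; the tuple form is used here only to keep the sketch self-contained.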

#### 4.3. Automatic Inferences on Brain Connectivity

Automatic annotation of brain images with terms of an ontology and subsequent analysis using reasoning engines enable powerful information retrieval thanks to the high level representation of the image content embedded in the ontology. As an illustration, when an investigator queries through the HCO all cortical parts of the orbitobasal segment of the right frontal lobe which are connected to some medial BA6 parts via fiber bundles assessed by diffusion tractography, the reasoning engine takes advantage of both class-level knowledge (what are the gyri included in the orbitobasal segment of the right frontal lobe cortex?) and instance-level facts derived from image evidence (which data instantiate some regional parts of these gyri classes?).

Different initiatives have used automatic inferences based on structured knowledge in order to represent cerebral connectivity: (1) Neurolex and (2) the KEfED (Knowledge Engineering from Experimental Design) approach.

(1) Neurolex is a semantic wiki-based website and knowledge management system dedicated to neurobiology whose primary goals are to assist neuroscientists in reviewing anatomical features, linking them to other neuroscience resources, and stimulating discussion with other scientists, especially about controversial or missing features (Larson and Martone, 2013). Because the Semantic MediaWiki platform (on which Neurolex was built) did not support many of the first-order logic features that are needed to achieve OWL DL reasoning, the RDF version of Neurolex was deployed into an instance of the OWLIM semantic repository (http://www.ontotext.com/owlim) providing SPARQL 1.1 querying capabilities. Larson and Martone (2013) demonstrated how a SPARQL query could retrieve from Neurolex "all brain regions that send projections into the cerebellum or any of its parts via mossy fibers." In order to search recursively all subclasses that were regional parts of the cerebellum, the authors used the "property paths"<sup>23</sup> feature of SPARQL 1.1.

(2) Another initiative was based on the KEfED approach. First, an experimental design was modeled as a workflow using a set of KEfED models, which aimed at representing the experiment using structured information. Second, interpretations of the experimental observations were obtained using domain-specific reasoning. Russ et al. (2011) illustrated the relevance of the KEfED approach through a neural connectivity use case based on tract-tracing experiments in animal subjects. Tract-tracing experiments consist of injecting a chemical tracer at a site, from which it is transported along neurons' axonal fibers. Interpretation of such tract-tracing experiments aims at describing connections between different brain regions. Spatial reasoning was used especially to process the part-whole and the overlap relationships between regions. Basic geometric features were imported from the BAMS neuroanatomical ontology of the rat into PowerLoom, a first-order logic knowledge representation and closed-world reasoning system. Thus, the authors demonstrated how connectivity matrices could be inferred through the use of spatial reasoning and the modeling of the tract-tracing experiments using a KEfED approach.

HCO and its NEURO-DL-FMA module differ from these approaches because they were expressed in the W3C standard OWL language and used the OWL DL description logics sublanguage. Moreover, the FaCT++ reasoning engine was used both to ensure the satisfiability of the ontologies and to infer new axioms using transitive part-whole relationships, spatial relationships, and connectivity relationships assessed by diffusion tractography. As a result, complex queries (cf. **Table 3**) could be formulated directly via the "DL Query" tab of the Protégé ontology editor, in a more expressive way than using the SPARQL language.
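The recursive part-whole retrieval that these approaches rely on (SPARQL 1.1 property paths in Neurolex, transitive properties under a DL reasoner in HCO) can be illustrated with a small sketch. It is not taken from either system: the region names and the `PART_OF` and `PROJECTS_TO` tables are hypothetical toy data, and plain Python graph traversal stands in for the query engine.

```python
from __future__ import annotations

from collections import deque

# Toy data (hypothetical): child region -> parent region in a part-of hierarchy.
PART_OF = {
    "cerebellar_cortex": "cerebellum",
    "vermis": "cerebellar_cortex",
    "deep_cerebellar_nuclei": "cerebellum",
    "cerebellum": "hindbrain",
}

# Toy data (hypothetical): source region -> regions it projects to.
PROJECTS_TO = {
    "pontine_nuclei": {"vermis"},
    "inferior_olive": {"cerebellar_cortex"},
    "thalamus": {"cerebral_cortex"},
}


def parts_of(whole: str) -> set[str]:
    """Return `whole` plus every region that is transitively part of it."""
    children: dict[str, list[str]] = {}
    for child, parent in PART_OF.items():
        children.setdefault(parent, []).append(child)
    found, queue = {whole}, deque([whole])
    while queue:  # breadth-first walk down the part-of hierarchy
        for child in children.get(queue.popleft(), []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def regions_projecting_into(whole: str) -> set[str]:
    """Regions with at least one projection into `whole` or any of its parts."""
    targets = parts_of(whole)
    return {src for src, dsts in PROJECTS_TO.items() if dsts & targets}


print(sorted(regions_projecting_into("cerebellum")))
# ['inferior_olive', 'pontine_nuclei']
```

The point of the sketch is that the query names only the whole ("cerebellum"), while the hierarchy supplies the parts; a reasoner or a property path performs the same expansion without the caller enumerating subregions.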

An interesting initiative dedicated to diffusion tractography, the "White Matter Query Language" (Wassermann et al., 2013), used a textual language to express anatomical descriptions of white matter tracts. Wassermann et al. selected streamlines from a whole-brain deterministic tractography using three kinds of terms: anatomical structure terms describing where streamlines end or pass through, relative position terms locating streamlines with respect to other anatomical structures, and logical operation terms. With these expressions they defined some association, projection, and commissural tracts (Wassermann et al., 2013). Although this approach explicitly defined some white matter tracts using a near-to-English syntax, it provided neither an ontology for annotating the results of in vivo human connectomics based on diffusion tractography nor reasoning capabilities for inferring part-whole, spatial, or connectivity relationships at different levels of granularity.
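To make the idea of selecting streamlines with anatomical and logical terms concrete, the following sketch combines two of the three kinds of terms (endpoint or pass-through terms, and logical operations). It does not reproduce the syntax or implementation of the White Matter Query Language: the `Streamline` class, the helper functions, and the region labels are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Streamline:
    """A streamline reduced to the ordered region labels it visits."""
    ident: int
    regions: tuple[str, ...]


def endpoints_in(s: Streamline, region: str) -> bool:
    """True if the streamline starts or ends in `region`."""
    return bool(s.regions) and region in (s.regions[0], s.regions[-1])


def passes_through(s: Streamline, region: str) -> bool:
    """True if the streamline visits `region` anywhere along its path."""
    return region in s.regions


# Hypothetical streamlines, for illustration only.
streamlines = [
    Streamline(1, ("superior_frontal", "centrum_semiovale", "precentral")),
    Streamline(2, ("superior_frontal", "corpus_callosum", "superior_frontal_left")),
    Streamline(3, ("precentral", "internal_capsule", "brainstem")),
]

# Hypothetical tract definition built from logical operations:
# ends in superior_frontal AND passes through precentral AND NOT corpus_callosum.
selected = [
    s.ident
    for s in streamlines
    if endpoints_in(s, "superior_frontal")
    and passes_through(s, "precentral")
    and not passes_through(s, "corpus_callosum")
]
print(selected)  # [1]
```

Such a definition selects streamlines but, as noted above, carries no ontology of the regions it names, so it cannot by itself infer that a part of a region inherits the relationships of the whole.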

## 5. Conclusion and Perspectives

In this article we have described a neuroanatomical ontology dedicated to human connectomics, the Human Connectomics Ontology (HCO), that can represent brain regions and connectivity assessed by diffusion tractography in the living human brain. Moreover, experimental work was carried out to show how the HCO could be effectively used within an information system to express complex queries concerning MRI connectomics datasets and process them using a DL reasoning engine. This approach can facilitate comparison of data across scales, modalities, and species.

Future work will consist of developing a visualization module to display the macro-connectome at different levels of granularity in matrix or network form. This module could leverage the reasoning engine to retrieve connections assessed by diffusion tractography between different gray matter regions. Finally, a long-term goal is to facilitate the consistent querying of in vivo MRI connectomics data and tracer-based observations made in multiple species. This would be of major interest for assessing the validity of putative connections highlighted in human MR connectomics.

<sup>23</sup>Property paths, http://www.w3.org/TR/sparql11-query/#propertypaths

## Author Contributions

TM and BG designed the work, analyzed data, drafted the work, approved the final version to be published and agreed to be accountable for all aspects of the work ensuring questions related to the accuracy or integrity of any part of the work.

## Acknowledgments

The authors would like to thank C. Poupon for the brain datasets used in this paper which were part of the NMR public database (Poupon et al., 2006). We would like to thank Pr. X. Morandi and R. Seizeur, professors in anatomy and neurosurgery, for their assistance in interpreting anatomy.

## References

and basic plan architecture. Proc. Natl. Acad. Sci. U.S.A. 107, 20610–20617. doi: 10.1073/pnas.1015128107


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Moreau and Gibaud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Golgi: Interactive Online Brain Mapping

#### Ramsay A. Brown\* and Larry W. Swanson

Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA

Golgi (http://www.usegolgi.com) is a prototype interactive brain map of the rat brain that helps researchers intuitively interact with neuroanatomy, connectomics, and cellular and chemical architecture. The flood of "-omic" data urges new ways to help researchers connect discrete findings to the larger context of the nervous system. Here we explore Golgi's underlying reasoning and techniques and how our design decisions balance the constraints of building a tool that is both scientifically useful and usable. We demonstrate how Golgi can enhance connectomic literature searches with a case study investigating a thalamocortical circuit involving the Nucleus Accumbens, and we explore Golgi's potential and future directions for growth in systems neuroscience and connectomics.

Keywords: connectome, brain mapping, interactive, neuroanatomy, API

## INTRODUCTION

The nervous system is an evolved computing network that solves the challenge of organizing adaptive cybernetic behavior in animals (Swanson, 2000). The strategy for understanding this network as a biological system of the body is simple and dates to classical antiquity: accurately observe and describe the form of the system, and clarify its mechanism of action by functional experimentation (Swanson, 1995). Thus, a necessary but not sufficient prerequisite is to describe and display accurately the structural organization of the nervous system: what are the parts, what is the internal organization of each part, and how are the parts interconnected? This structural approach was put on modern footing at the macroscopic level in the 1660–1670's by Thomas Willis, and at the microscopic level in the 1880–1890's by Santiago Ramón y Cajal and others (Swanson, 2014). The contemporary revolution in structural neuroscience began around 1970 and was based on the development of experimental axonal circuit tracing methods combined with histochemical methods for the cellular localization of molecules with immunohistochemistry and hybridization histochemistry (Swanson, 1999).

Despite annual improvements in neuroanatomical atlases and connectomic techniques, a complete architectural description of the macrolevel mammalian connectome remains elusive. This is through no fault of the connectomic community but instead reflects the sheer scale and complexity of assembling such a description. To aid in managing the complexity of the task, both in terms of human organization and data management, we turn to automation via neuroinformatics.

Neuroinformatics systems aid in collecting, storing, communicating, analyzing, and visualizing neuroscientific data (Arbib and Grethe, 2001). We identified four emerging trends from neuroinformatics and human-computer interaction that constrained Golgi's design: (1) previous product validation from other neuroinformatics tools; (2) the experimental status of natural language processing and automated reasoning systems; (3) the growing corpus of biocurated reports from legacy literature; and (4) the adoption of high-throughput connectomic techniques.

#### Edited by:

Sean L. Hill, International Neuroinformatics Coordinating Facility (INCF), Sweden

#### Reviewed by:

Sharon Crook, Arizona State University, USA

Marcus Kaiser, Newcastle University, UK

#### \*Correspondence:

Ramsay A. Brown ramsaybr@usc.edu

Received: 09 June 2015

Accepted: 23 October 2015

Published: 17 November 2015

#### Citation:

Brown RA and Swanson LW (2015) Golgi: Interactive Online Brain Mapping. Front. Neuroinform. 9:26. doi: 10.3389/fninf.2015.00026

Neuroscientists are accepting modern digital tools, such as the Brain Architecture Knowledge Management System (BAMS; Bota et al., 2005) and BrainMaps.org (Mikula et al., 2008), into their research workflows. Other tools similar to the one we describe here, such as NeuroVIISAS (Schmitt and Eipert, 2012) and the Allen Institute Brain Atlas (Sunkin et al., 2013), seek to integrate neuroanatomical reference sets with connectomic data for consumption by researchers. But adoption of these types of tools has not stopped with expert workers in the field: software for crowdsourcing scientific exploration (even to laypeople) has demonstrated the value of building systems and User Experiences that encourage participation at different levels of expertise (Seung, 2013). These two trends signaled to us that researchers were becoming increasingly comfortable adopting new software techniques into their scientific workflows (and thus that a potential user base for Golgi existed) and that simple tools and intuitive interfaces for neuroscience can and should be designed.

Natural Language Processing (NLP) and Machine Learning promise to enhance the quality and throughput of semiautomated neuroscientific curation and exploration. The automated synthesis of systems-level insights about neural connectivity from primary reports is improving but still requires human oversight: the current state-of-the-art in NLP for automating curation remains in an experimental stage (Burns et al., 2008). Without doubt, neuroscience will be disrupted by tools that intelligently and automatically assemble disparate reports into cohesive lines of reasoning. For the immediate future, however, generating high-level insights about the brain from primary reports remains a distinctly human task. With this in mind, we designed Golgi as a human-friendly tool to help users connect primary reports into high-level understandings.

The corpus of legacy literature contains decades of connectomic reports, and informatics teams are curating (portions of) this corpus into neuroinformatics workbenches. The analysis of these curated data has allowed some groups to generate exciting novel connectomic insights from previously disparate reports (Bota et al., 2015). New data streams stem not only from curation, but also from new high-throughput techniques. Tools such as functional magnetic resonance imaging generate massive amounts of raw data even in a single set of experiments. Between the staggering amount of data generated by these new techniques and the growing body of curated literature, we identified the need for tools that simplify how researchers navigate a growing corpus of nuanced, sometimes conflicting assertions of connectivity.

In designing Golgi we sought to provide neuroscience with a tool that makes a task traditionally challenging for many users (synthesizing the interactions between multiple nuanced neuroscientific findings) easier by bootstrapping it to users' innate spatial reasoning skills. This pressure led us to design Golgi's User Experience around the framework of an interactive, two-dimensional embryonic fate map of the nervous system. The utility of this map is immediately obvious: it is a flatmap of the central nervous system that displays all of its macrolevel parts in a topologically consistent way, so it can be displayed easily on paper or a computer monitor, much like the wall maps of the earth pioneered by Mercator.

## METHODS

## Where did this Map Come From?

When the nervous system first becomes embryologically recognizable, it is a topologically flat sheet of ectodermal tissue called the neural plate (Swanson, 2012). As development progresses, the plate forms a tube and the walls of the tube progressively differentiate by forming a hierarchy of gray matter regions (analogous to countries on a map of the earth) interconnected by white matter tracts (like airline routes between national capitals).

The adult "flatmap" that underlies Golgi can be conceptually formed by cutting the top (dorsal surface) of the neural tube lengthwise (rostrocaudally) and unfurling the walls of the tube like a book (**Figure 1**). The "fate map" part of the definition involves a series of assumptions based on a variety of evidence (both experimental and theoretical) about where adult nervous system parts are located in the neural plate (for example, the brain is located toward the head end of the neural plate whereas the spinal cord is located toward the tail end) and how they ought to be represented in two dimensions given their volumes in the adult form (Swanson, 2004).

## Building the Map

The atlas displayed in Golgi's UI (User Interface) is a modified version of the canonical rat embryonic flatmap featured in the Swanson (2004) rat atlas.

Original vector graphics obtained from the author were modified using Adobe Illustrator CS6 on Mac OS X.10 to adapt the map for display in web browsers (**Figure 2A**). Once the map was transformed into the web-compatible Portable Network Graphics (PNG) format, three versions were created: a version displaying only the outline of the map, a version including the outline of each brain region, and a version that included abbreviation labels for each region. These three versions were exported as 300-dpi PNG files at four levels of increasing resolution. Each of the resulting images was processed through a bespoke image processing pipeline powered by ImageMagick (ImageMagick Studio LLC, 2014), executed on Mac OS X.10, to slice the full-scale maps into constituent tiles for streamlined display on the web.
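The slicing step can be sketched as a tile-grid computation. The 480 × 480 pixel tile size is the one reported in the figure caption accompanying this section; the file-naming pattern, the function name, and the example image size are assumptions, since the text states only that tiles were named according to their scale and position. The sketch computes tile names and crop boxes and does not call ImageMagick.

```python
import math

TILE = 480  # tile edge in pixels, as reported in the figure caption


def tile_grid(width, height, scale, tile=TILE):
    """List (name, left, top, right, bottom) for every tile of one map image.

    Tiles on the right and bottom edges are clipped to the image bounds.
    The naming pattern is hypothetical; it encodes scale, row, and column
    so that tiles from different map versions stay co-registered.
    """
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    tiles = []
    for row in range(rows):
        for col in range(cols):
            left, top = col * tile, row * tile
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            name = f"map_s{scale}_r{row}_c{col}.png"
            tiles.append((name, left, top, right, bottom))
    return tiles


# Example with an assumed 1920 x 1000 pixel image at scale level 2.
tiles = tile_grid(1920, 1000, scale=2)
print(len(tiles))   # 12 (4 columns x 3 rows)
print(tiles[-1])    # ('map_s2_r2_c3.png', 1440, 960, 1920, 1000)
```

Because a tile's name is a pure function of scale, row, and column, a browser can request exactly the tiles that intersect the current viewport without consulting an index.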

### Frontend Design

Golgi's frontend was designed to make it easy to query and visualize assertions about the mammalian nervous system's anatomical, connective, chemical, and cellular architecture. It is implemented as a series of hand-written HTML (Berjon et al., 2014) pages generated via PHP, made interactive by JavaScript (ECMA Standard, 2011) and styled by CSS (Etemad, 2011) resources. Publicly available JavaScript microlibraries such as jQuery (JQuery.com, 2012) were used to aid in (amongst other purposes) communication (via asynchronous HTTP calls), security (via the Secure Hash Algorithm), and display (via the jQuery UI library). To streamline design and ensure a consistent visual language throughout the product, we used Twitter Bootstrap's core CSS library for UI styling. All frontend code is freely available at https://github.com/ramsaybr/golgi. We encourage the community to fork, improve, and extend Golgi freely.

Here we explore key design decisions we made while developing Golgi with special attention paid to features that distinguish Golgi's User Experience.

#### **Unregistered Use**

Golgi is free to use without first creating a user account, allowing users to immediately start using Golgi to explore the nervous system (**Figure 3**). This decision lowers barriers to adoption and helps users investigate how Golgi could aid their research without requiring them to commit to a registration process.

#### **Search**

Users may explore Golgi by first interacting with a search input box (**Figure 4**). The search input box lets a user look up a gray matter region (a node in network parlance) by either its anatomical name or its atlas abbreviation.

This feature both creates and solves a UI discoverability problem: new users, or users accustomed to alternative nomenclatures, may not be familiar with the set of available Swanson (2004) brain regions they could search for (thereby limiting utility).

from Illustrator. We then used ImageMagick to automatically slice each map into 480 × 480 pixel tiles. Tiles were algorithmically named according to their scale and position to maintain map co-registration. Brain regions and connections were isolated within Illustrator into their own files and saved as Scalable Vector Graphics files for display directly in browsers. (B) Data were extracted from BAMS with the use of the Kimono API (http://www.kimonolabs.com). BAMS follows a structured and predictable URI-based method for querying data on brain regions, connections, molecules, and cell types but exposes no public REST API. This structure enabled us to use Kimono to expose a RESTful API representation of the data encoded on BAMS pages. Using PHP's cURL library we then launched over 70k requests at our newly constructed REST endpoint, structured the responses to match our data architecture, and stored the newly retrieved information in a series of MySQL tables.

To ameliorate this discoverability problem we implemented a predictive text autocompletion algorithm. This helps experienced users access regions of interest more quickly and helps novice users learn the Swanson (2004) terminology more quickly.

Once a user begins typing, the autocompletion algorithm will suggest potential brain region matches. As a user continues typing, the potential match list will narrow until the user is presented with her correct region of interest. This is accomplished by pre-loading an array containing the names and abbreviations of all regions available and using the jQuery "TypeAhead" library for text autocomplete.
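The matching behavior described here can be sketched independently of the browser library. The following is an illustration only, not Golgi's code: it uses simple prefix matching over a pre-loaded list, whereas the matching rules of the client-side library may differ, and the `REGIONS` entries and the `suggest` function are hypothetical examples.

```python
# Pre-loaded (name, abbreviation) pairs; the entries are illustrative examples.
REGIONS = [
    ("Nucleus accumbens", "ACB"),
    ("Anterior cingulate area", "ACA"),
    ("Caudoputamen", "CP"),
    ("Central amygdalar nucleus", "CEA"),
]


def suggest(query, regions=REGIONS, limit=5):
    """Return names of regions whose name or abbreviation starts with `query`.

    Matching is case-insensitive. An empty query yields no suggestions.
    """
    q = query.strip().lower()
    if not q:
        return []
    hits = [
        name
        for name, abbr in regions
        if name.lower().startswith(q) or abbr.lower().startswith(q)
    ]
    return hits[:limit]


print(suggest("ac"))   # ['Nucleus accumbens', 'Anterior cingulate area']
print(suggest("acb"))  # ['Nucleus accumbens']
```

Each additional character can only shrink the candidate list, which is the narrowing behavior described above; matching on abbreviations as well as names lets a user reach a region from either form.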

#### **Inline Display of Visualization Options**

Once the user selects a region of interest she will be presented a summary of its available data within Golgi and will be prompted to "add" the region and highlight it on the map.

Highlighted brain regions are displayed and marked with a context-dependent "pin" icon on the map. The pin's appearance depends on the set of region-associated data currently displayed on the map. This design makes it easier for users to understand what modalities of data are currently active on the map. For example, if a user has selected only connectional data about a region, the pin's appearance will reflect this choice. A region displaying only molecular data will have a different pin, and a region with both active molecular and connectional data will show still a third type of pin.

Clicking a pin offers visualization options for assertions about connectivity, molecules, and cell types associated with the selected region (**Figure 5**). These options are displayed in a UI element dynamically placed adjacent to the region of interest on the map. Decreasing the amount of visual search a user will perform to navigate the interface will make it easier for users to select the information they want displayed.

#### **Links to Primary Literature, BAMS**

Golgi will help users more efficiently find primary sources that support assertions about neural architecture.

Expert-curated supporting evidence and links to primary sources are available for every assertion of connectivity and molecular and cellular architecture. Most assertions are backed by many reports sourced from the same or different primary sources. Many of these reports (to varying degrees) agree with one another; some do not. We present these reports, conflicting or otherwise, free from influence or normativity about technique or conclusion (**Figure 6**). This design decision was made to balance the efficiency of creating a User Experience based around assertions about the nervous system with the necessary respect for the oft-conflicting reports in the literature.

Each report associated with an assertion is available for inspection independently and is presented with a prose summary. These summaries read in plain English and contain information such as the experimental technique used, the injection site in a pathway-tracing experiment, the relative strength and distribution of pathway tracer (if curated), and any notes left by the original curator. The decision to display these data as text (as opposed to in a table) was driven by our desire to tutorialize new users on the nature of the assertions available in Golgi.

All assertions, and their associated connectivity reports, will present the user with not only a hyperlink to the PubMed report page for the cited primary source, but also with a link back to the BAMS connection details page corresponding to the data presented in Golgi. Both of these outbound-linking features will encourage curious users to not only engage with the larger system of BAMS but more deeply explore the data available in the primary source manuscript as well.

#### **Data Layers**

A selected region (and its associated assertions) is placed into a "Data Layer" when visualized on the map. Data Layers let the user segregate assertions into distinct groups that can be selectively displayed on, or temporarily hidden from, the map. Data can be moved between Data Layers, and Data Layers can be created or destroyed.
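The operations listed above (create, destroy, move, show, hide) suggest a simple data structure. The following sketch is one possible model, not Golgi's implementation: the class names, method names, and assertion labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataLayer:
    """A named group of assertions that is shown or hidden as a unit."""
    name: str
    visible: bool = True
    assertions: set = field(default_factory=set)


class LayerStack:
    """Holds all Data Layers and answers what is currently displayed."""

    def __init__(self):
        self.layers = {}

    def create(self, name):
        self.layers[name] = DataLayer(name)
        return self.layers[name]

    def destroy(self, name):
        del self.layers[name]

    def move(self, assertion, src, dst):
        """Move one assertion from layer `src` to layer `dst`."""
        self.layers[src].assertions.remove(assertion)
        self.layers[dst].assertions.add(assertion)

    def displayed(self):
        """Union of the assertions in all visible layers."""
        shown = set()
        for layer in self.layers.values():
            if layer.visible:
                shown |= layer.assertions
        return shown


# Hypothetical assertion labels, for illustration only.
stack = LayerStack()
circuit_a = stack.create("circuit A")
circuit_b = stack.create("circuit B")
circuit_a.assertions.update({"region_1 -> region_2", "region_1 -> region_3"})
stack.move("region_1 -> region_3", "circuit A", "circuit B")
circuit_b.visible = False  # hide one circuit to focus on the other
print(stack.displayed())   # {'region_1 -> region_2'}
```

Keeping visibility on the layer rather than on each assertion means hiding a layer never alters its contents, so a hidden circuit reappears unchanged when the layer is shown again.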

depends on the set of region-associated data currently displayed on the map. Visualization options for displaying connectional, molecular, and cell-type assertions are displayed immediately adjacent to the selected region. This makes Golgi easier to use by cutting down on the amount of visual search a user has to perform.

Data Layers will help users compare data on the map. Keeping Layers isolated but visually co-registered allows researchers to build models while keeping some assertions distinct. For example, a user comparing the relationships between two connectional circuits that share a common node region will be able to keep each circuit isolated in its own Layer. She will be able to focus on one circuit by selectively displaying that Layer and hiding the other. Finally, she will be able to compare these two circuits with a third Data Layer that only contains molecular data for yet another region. Inspiration for this feature was drawn from analogous features found in graphic design programs like Adobe Illustrator and commercial mapping software like Google Maps.
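The behavior described for Data Layers (create, destroy, show, hide, and move assertions between layers) can be pictured with a small data structure. The following is a minimal Python sketch only; the class names, method names, and the string labels used for assertions are hypothetical and are not taken from Golgi's source code.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DataLayer:
    """A named group of assertions that is shown or hidden as a unit."""
    name: str
    visible: bool = True
    assertions: Set[str] = field(default_factory=set)


class LayerStack:
    """Hypothetical container for Data Layers, keyed by layer name."""

    def __init__(self) -> None:
        self.layers: Dict[str, DataLayer] = {}

    def create(self, name: str) -> DataLayer:
        layer = DataLayer(name)
        self.layers[name] = layer
        return layer

    def destroy(self, name: str) -> None:
        del self.layers[name]

    def move(self, assertion: str, src: str, dst: str) -> None:
        """Move one assertion from layer `src` to layer `dst`."""
        self.layers[src].assertions.remove(assertion)
        self.layers[dst].assertions.add(assertion)

    def set_visible(self, name: str, visible: bool) -> None:
        self.layers[name].visible = visible

    def displayed(self) -> Set[str]:
        """Assertions drawn on the map: the union over visible layers."""
        shown: Set[str] = set()
        for layer in self.layers.values():
            if layer.visible:
                shown |= layer.assertions
        return shown
```

Keeping visibility as a property of the layer rather than of each assertion is what makes it cheap to hide one circuit while another stays on the map.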

Users will be afforded control over the level of detail displayed on the flatmap (**Figure 7**). With the Map Layer affordance a user will be able to select from a visualization that only shows the gross outlines of the nervous system, another that includes the outline of individual regions, and a third that includes the Swanson (2004) nomenclature names of the regions superimposed. A click-and-drag interface affords an intuitive interaction point with the map and a familiar ''zoom'' scaling tool will allow users to enlarge the map to focus in on regions of interest without compromising map quality.

#### **Account Personalization**

Private accounts personalize and enhance Golgi. Golgi accounts will allow users to save notes on any assertion for later use. This will let users easily record their thoughts, processes, and observations for later analysis or retrieval from a different computer (**Figure 8**). User data and notes are kept private between Golgi and each user.

#### **Tutorialization**

Design Thinking and user-testing in the development process help product designers and developers build tools that are intuitive to use and minimize barriers to consumer adoption. Yet despite best efforts, most designed systems (Golgi included) still benefit from explicit tutorialization. As such, we are developing a series of short, video-based "real life" examples of how Golgi's features can be used. These videos will demonstrate the way users can interact with different parts of Golgi in the context of a typical use case. This will not only help instruct users in how to use Golgi, but also in why to use Golgi to solve real neuroscientific problems.

#### FIGURE 7 | Data layers, map visualization, and map control affordances.

Three different map visualizations can be displayed: a version only displaying the outline of the map, a version including the outline of each brain region, and a version that includes abbreviation labels for each region. Each map can be easily panned by clicking-and-dragging the mouse and zoomed using a zoom slider. Active brain regions and assertions can be selectively shown and hidden within Data Layers. Data Layers can be created, shown and hidden, and destroyed.

These videos will overlay directly onto Golgi (to place the explained behavior in context) and will be available via an outbound link to YouTube as well. They will provide examples of how to search for a brain part, activate a region on the flatmap, use the pins to explore afferent and efferent connectivity, explore evidence underlying an assertion of connectivity, and work with layers.

#### Backend Design: REST-First Neuroscience

Golgi's backend is a RESTful (Fielding and Taylor, 2002) API custom-built from Apache2, PHP, and MySQL. MySQL was chosen as our relational database management system because of its compatibility with our imported legacy data sources and because the authors had existing experience building and maintaining MySQL services. As an interpreted language, PHP let us quickly test and evaluate our design decisions and rapidly deploy changes to the system, minimizing delays between developing and testing phases. Apache2 was selected as our HTTP server for its existing support within our Ubuntu staging and production servers and because it will adequately meet the usage demands of the product outlined here.

Adhering to a REST-first architecture promotes modular and reusable code, makes Golgi easier to maintain and grow, and will allow other members of the neuroinformatics community to use Golgi as a data source for their own projects.

For example, information about the Nucleus Accumbens is available over HTTP via a REST endpoint.

This endpoint returns JSON-formatted (Bray, 2014) information about the ACB (**Figure 9**). The response contains information both specific to Golgi (for example, coordinates for display on the Swanson, 2004 flatmap) and of general use to developers wishing to build their own neuroinformatics products powered by Golgi.

For example, the response.dataSets[ ] array contains information about what other data one can find associated with this region. In the Nucleus Accumbens example this includes macroconnections involving the nucleus (gray matter region). Because this data is returned in structured, predictable ways, it is amenable to automated incorporation in follow-up requests to Golgi's API.
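As an illustration of such automated follow-up requests, the sketch below builds the region URL and its follow-on URLs in Python. The base URL, the nomenclature identifier `1`, and the `connection`, `molecule`, and `cell` sub-paths are taken from the Figure 9 caption; the function names are our own, the service may no longer be reachable, and nothing is assumed here about the fields of the JSON response.

```python
import json
from urllib.request import urlopen

# URL pattern as given in the Figure 9 caption (illustrative; may be offline).
API_BASE = "http://www.usegolgi.com/api/v1"


def region_url(abbrev, nomenclature_id=1, dataset=""):
    """Build the URL for a region, or for one of its follow-on data sets."""
    url = f"{API_BASE}/nomenclature/{nomenclature_id}/region/{abbrev}"
    return f"{url}/{dataset}" if dataset else url


def follow_on_urls(abbrev, datasets=("connection", "molecule", "cell")):
    """URLs for the follow-on requests named in the Figure 9 caption."""
    return [region_url(abbrev, dataset=d) for d in datasets]


def fetch_json(url, timeout=10.0):
    """GET a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A client would call `fetch_json(region_url("ACB"))` first and then iterate over `follow_on_urls("ACB")`; because responses are stateless, the follow-on requests can be issued in any order.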

A returned Connection object contains both source and target region data identifiers unique to Golgi as well as textual region names and abbreviations and information about the species and nomenclature for which these assertions are valid.

RESTful architecture lets developers consume Golgi as an API service layer. For example, future neuroinformatics tools will be able to take advantage of Golgi's data about connectivity, chemical, molecular, and anatomical architecture via standard HTTP calls. This makes Golgi useful as a platform layer for the general neuroscientific community.

## Data Extraction for Map Content

Golgi's data is sourced from the Brain Architecture Knowledge Management System (BAMS). BAMS' databases of neuroanatomical parts, connections, cell types, and molecules provide Golgi with a large, expertly curated source of neuroscientific data. The parts, connections, cell types, and molecules encoded in Golgi and extracted from BAMS are species-exclusive to the rat. The decision to constrain Golgi to rat-based data was multifaceted. First and foremost BAMS contains more data about the rat than it does any other species. Second, Swanson's hierarchical, internally-consistent nomenclature and flatmap either only exist for the rat (in the case of the nomenclature) or are the most detailed in the rat (in the case of the flatmap, for which he has also developed simpler, experimental flatmaps for other non-rat species). Finally, while inter-species neurohomology and comparisons of connectivity are a compelling line of inquiry in their own right, they were not our primary design goal for this product. As such, we have scoped Golgi's current data offering to the rat, based on the Swanson (2004) nomenclature. Data extracted from BAMS was transformed from raw XML to records in MySQL inside Golgi's API via a bespoke extraction pipeline written in PHP and powered by the Kimono API (**Figure 2B**). Data was then manually inspected for both completeness (all data that was desired was transferred) and transfer accuracy (what was represented in BAMS is faithfully represented in Golgi) once inside Golgi's MySQL database. Development continued only once the extraction pipeline achieved both transfer accuracy and completeness.
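The two acceptance checks described above, completeness and transfer accuracy, can be sketched as follows. This is an illustration in Python rather than the authors' PHP pipeline: the XML layout and table columns are hypothetical (the BAMS export schema is not described here), and SQLite stands in for MySQL so that the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export format, used only to make the sketch runnable.
SAMPLE_XML = """<connections>
  <connection id="1" source="ILA" target="ACB"/>
  <connection id="2" source="ACB" target="SI"/>
</connections>"""


def parse_records(xml_text):
    """Turn each <connection> element into an (id, source, target) tuple."""
    root = ET.fromstring(xml_text)
    return [
        (int(el.get("id")), el.get("source"), el.get("target"))
        for el in root.iter("connection")
    ]


def load(db, records):
    """Insert the parsed records into a relational table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS connection "
        "(id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
    )
    db.executemany("INSERT INTO connection VALUES (?, ?, ?)", records)
    db.commit()


def verify(db, records):
    """Return (complete, accurate) for the stored rows versus the source.

    complete: every source record has a stored row.
    accurate: every stored row that has a source record matches it exactly.
    """
    rows = db.execute("SELECT id, source, target FROM connection")
    stored = {row[0]: row for row in rows}
    complete = all(rec[0] in stored for rec in records)
    accurate = all(stored[rec[0]] == rec for rec in records if rec[0] in stored)
    return complete, accurate
```

Reporting the two properties separately makes it clear whether a failed transfer dropped records or altered them.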

The data visualized on Golgi are assertions of connectivity (or cytoarchitecture or chemoarchitecture), not individual reports of data as can be found in BAMS. This distinction is significant because it affects how users understand what Golgi is showing them (see ''Links to Primary Literature, BAMS'' Section). A connection that a user plots on Golgi is a summary of connectivity that has been supported by at least one report of connectivity curated from the literature.


#### FIGURE 9 | Continued

from a request to http://www.usegolgi.com/api/v1/nomenclature/1/region/ACB. Responses are stateless and consistent in structure. Statelessness and structural consistency allow Golgi (and developers building on top of Golgi as a backend data service) to automate follow-on queries. For example, a request to the /region/ACB endpoint can be followed by requests to the /region/ACB/connection, /region/ACB/molecule, and /region/ACB/cell endpoints.

## Development Technology and Production Infrastructure

Golgi is deployed on a virtualized 64-bit Ubuntu 14.04 LTS Elastic Compute Cloud instance on Amazon Web Services (**Figure 10**). Amazon Elastic Compute Cloud's elasticity minimizes the costs of operating Golgi while maximizing its performance. DNS routing is managed via Amazon Web Services' Route 53.

Golgi was developed on physical testing environments running Mac OS X.10 and Ubuntu 14.04LTS using Sublime Text 3, MAMP, and Ubuntu's native LAMP stack for development and local testing. Golgi's source code has been made available at http://www.github.com/ramsaybr/golgi. We encourage the community to fork, improve, and extend Golgi freely.

Golgi was developed and tested using Google Chrome's Blink CSS layout engine and V8 JavaScript engine. Extending support to Firefox and Internet Explorer will require nontrivial code refactoring. First, client-side browser identification via JavaScript will automate the detection of the user's browser type and load the appropriate CSS and JavaScript files optimized for that browser. Golgi's current CSS, tested and optimized for the Blink layout engine, will be rewritten and retested for both the Gecko (supporting Firefox) and Trident (supporting Internet Explorer) layout engines so as to ensure a consistent User Experience across all browsers (particularly when Document Object Model [DOM] objects and their CSS properties are directly modified from JavaScript). All of Golgi's JavaScript will also be retested and refactored as needed so as to maintain consistency in user interactions, third-party library functionality (for example, with jQuery), and asynchronous communication with Golgi's API across the V8 (Chrome), Chakra (Internet Explorer), and SpiderMonkey (Firefox) JavaScript engines.

While future development priorities include extending support for Internet Explorer and Firefox browsers, we suggest that, for best performance, users access Golgi using Google Chrome.

## RESULTS

## The Flipped Perspective of Model-First Inquiry

Golgi will let users explore connectivity assured that the primary literature supports them at each step of their model building. In designing Golgi's User Experience around connectivity assertions (as opposed to reports from the primary literature) we promote a ''model-first'' paradigm for exploring connectomics. We bridge this assertion-level User Experience with the imperative for evidence-based reasoning by providing users an easy way to view details about the individual reports that underlie the connectivity assertions. Bolstered by links to the primary source from which the connection report was curated, this approach will enhance a user's ability to quickly explore connectomic assertions without sacrificing scientific integrity. Once a user activates a connection she will be able to investigate the associated primary sources to better understand the data underlying their assertions.

This approach is a strength of Golgi as a research tool: a user will be able to quickly and easily build high-level connectomic models, assured that the circuits she has outlined are supported by curated reports from the primary literature (as opposed to doctrine or authority). By reversing the literature search workflow ("model-first" as opposed to "source-first"), Golgi will provide her a way to quickly assemble high-level models of regions of interest and their systems-level relationships while respecting the imperative for source authenticity.

## Exploring the Nucleus Accumbens

Golgi combines the power of neuroinformatics databases with the usability and explanatory power of anatomical atlases. It will enhance traditional literature searches by letting a user start from a "model-first" perspective and focus in on specific primary sources as she progresses. Here we demonstrate how this functionality will assist in model building by outlining a macrolevel circuit implicated in the hedonic response, addiction, and depression (Thompson and Swanson, 2010).

We begin by searching Golgi for the Nucleus Accumbens: a small cerebral gray matter region located ventral to the dorsal Striatum (Caudoputamen) implicated in the hedonic response, addiction, and emotional valence. To find the Nucleus Accumbens we begin by using the search tool and starting to type in its name (**Figure 11**).

Before we have completed the term ''Nucleus'', Golgi's TypeAhead algorithm has constrained the available search results presented to us. The Nucleus Accumbens is displayed as a potential match. As we continue typing the search result is singled out as the only appropriate match to our input. We click the name and the green ''Search'' button.
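The narrowing behavior described here can be illustrated with a few lines of Python. The matching rule actually used by Golgi's TypeAhead is not specified in the text, so this sketch assumes simple case-insensitive prefix matching against region names; the function name and the sample list are hypothetical.

```python
# Small illustrative list of region names; not Golgi's actual search index.
REGION_NAMES = [
    "Nucleus Accumbens",
    "Nucleus Ambiguus",
    "Nucleus of the Solitary Tract",
    "Infralimbic Area",
]


def typeahead(query, names):
    """Return the names that start with the typed text, ignoring case."""
    typed = query.strip().lower()
    if not typed:
        return []
    return [name for name in names if name.lower().startswith(typed)]
```

With this list, typing `nucl` leaves three candidates, and continuing to `Nucleus Acc` leaves only the Nucleus Accumbens, mirroring the narrowing described above.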

Our search results (a summary of the data available about the Nucleus Accumbens) slide into view and we are presented with options for data visualization: we can display the Nucleus Accumbens on the map, we can display the Nucleus Accumbens on the map and immediately begin displaying associated assertions, or we can record notes about the Nucleus Accumbens that will save to our Golgi account.

As we click the green ''Add ACB to map'' button the search results slide out of view and the area on the map representing the Nucleus Accumbens highlights and changes color to amber. For clarity and to cue the user to their next behavior a pin is placed on the region. The pin's crowning halo is a thin ring of amber to signify that the region is active but no assertions are currently displayed.

To clarify the anatomical context of the Nucleus Accumbens we adjust our map display options. We click the orange "Adjust Map View" button in the map control panel and select "Regions + Names". This updates the map to include the anatomical boundaries between regions as defined in Swanson (2004) and labels each region with its neuroanatomical name (or abbreviation). We use the zoom slider to investigate more closely by zooming to an 800% enlargement. By clicking and dragging on the map we pan over to view the Nucleus Accumbens and the rest of the Striatum.

Clicking on the Nucleus Accumbens' pin reveals a UI element that summarizes the assertions currently displayed and our options for displaying more (**Figure 12**).

To begin our exploration of Nucleus Accumbens circuitry we click the green "Add new +" button underneath the "0 Connections" header. The current visualization summary is replaced with a panel for selecting input (afferent macrolevel connections terminating in the ACB) and/or output (efferent macrolevel connections emanating from the ACB) connections to visualize on the map. Here we select one input and two outputs: an afferent input from the Infralimbic Area (ILA) and two efferent connections to the Substantia Innominata (SI) and the Anterior Lateral Hypothalamic Area (LHAa). Once these regions have been selected we click the green "Add Selected to Map" button.

Golgi: (1) checks the browser's memory for information stored on the ILA, SI, and LHAa and recalls it for displaying the regions properly; (2) downloads the image files that display those regions on the map and displays them according to data retrieved in step 1; (3) calls the Golgi API for connectional, molecular, and cellular assertions (and all supporting evidence) associated with these new regions (to speed up future searches that the user is likely to make); (4) downloads the image files encoding the connections that will be rendered on the map and displays them according to data recalled in step 1; (5) modifies the pin dropped on the ACB to include a green halo as well, signifying that the ACB is currently displaying associated connectional assertions; and (6) drops similar pins on the newly displayed ILA, SI, and LHAa (**Figure 13**).
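Steps 1 and 3 above amount to a cache-first lookup combined with prefetching. The sketch below shows that pattern in Python under stated assumptions: Golgi's client is written in JavaScript, and the class name, method names, and the injected `fetch` callable are hypothetical stand-ins for its browser-side storage and API calls.

```python
class RegionCache:
    """Cache-first store: reuse local data and call the API only on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: region abbreviation -> region data
        self._store = {}         # stands in for the browser's memory
        self.fetch_count = 0     # number of API calls actually made

    def get(self, abbrev):
        """Return data for a region, fetching it only if it is not cached."""
        if abbrev not in self._store:
            self._store[abbrev] = self._fetch(abbrev)
            self.fetch_count += 1
        return self._store[abbrev]

    def prefetch(self, abbrevs):
        """Warm the cache for regions the user is likely to request next."""
        for abbrev in abbrevs:
            self.get(abbrev)
```

After `prefetch(["ILA", "SI", "LHAa"])`, a later `get("SI")` is served from the cache without another API call, which is the speed-up step 3 is aiming for.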

We have explored a simple macrolevel connection chain of (using the Neural Systems Language notation) @ILA >+ ACB @ACB >− SI, LHAa (Brown and Swanson, 2013). From here we continue our circuit building in the same manner. By clicking the pin for the SI we select an output connection to the LHAa and PT and add them to the map. Likewise we can select the LHAa pin and activate output connections to the PT as well.

FIGURE 11 | Searching for Nucleus Accumbens. Searching for a region of interest, like the Nucleus Accumbens, is made easier by Golgi's autocompleting search algorithm. All possible matching options are displayed and subsequently removed as the user continues typing. This focuses the list down to the best possible matches to the region of interest.

Now our simple @ILA >+ ACB @ACB >− SI, LHAa has grown to @ILA >+ ACB @ACB >− SI, LHAa @SI >− PT @LHAa >+ PT as we have introduced a connectional divergence (@ACB >− SI, LHAa) and convergence (@SI >− PT @LHAa >+ PT). As we continue to plot output connections by selecting the PT we find the ILA as an available output target and select it for plotting (**Figure 14**).

We have traced a full thalamocortical loop between the ventromedial prefrontal cortex (ILA), a hedonic behavior modifier (the ACB), a behavior releaser and hypothalamic central pattern generator (the SI and LHAa, respectively) and a thalamic feedback relay (PT). Golgi's interface for interacting with high-level assertions of connectivity made this task straightforward and helped us quickly model a nuanced macrolevel circuit.
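The circuit traced above can be written down as a small signed directed graph, which makes the divergence, the convergence, and the closed loop explicit. The Python sketch below is our own illustration and not part of Golgi: the edges follow the notation given in the text (@ILA >+ ACB @ACB >− SI, LHAa @SI >− PT @LHAa >+ PT) plus the PT to ILA connection, whose sign the text does not state and which is therefore recorded as `None`.

```python
from collections import defaultdict

# (source, target, sign) for each macroconnection in the walkthrough.
EDGES = [
    ("ILA", "ACB", "+"),
    ("ACB", "SI", "-"),
    ("ACB", "LHAa", "-"),
    ("SI", "PT", "-"),
    ("LHAa", "PT", "+"),
    ("PT", "ILA", None),  # sign not given in the text
]


def adjacency(edges):
    """Map each source region to the list of its target regions."""
    out = defaultdict(list)
    for src, dst, _sign in edges:
        out[src].append(dst)
    return out


def divergent(edges):
    """Regions with more than one output connection."""
    return sorted(r for r, targets in adjacency(edges).items() if len(targets) > 1)


def convergent(edges):
    """Regions with more than one input connection."""
    counts = defaultdict(int)
    for _src, dst, _sign in edges:
        counts[dst] += 1
    return sorted(r for r, n in counts.items() if n > 1)


def find_loop(edges, start):
    """Depth-first search for a path that leaves `start` and returns to it."""
    out = adjacency(edges)
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in out.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None
```

Here `divergent(EDGES)` identifies the ACB, `convergent(EDGES)` identifies the PT, and `find_loop(EDGES, "ILA")` returns a path that starts and ends at the ILA, matching the loop described in the text.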

To explore the reports supporting any of these assertions we need only click on any pin in the circuit. Where we previously used the data visualization interface to activate regions and connections on the map, here we use the "View Connection Details" button to learn more about the connections we visualized. All input and output connections involving the selected region of interest (in this case the ACB) are available for closer inspection. As we select the ACB >SI connection we are presented with a high-level summary of the connection, options for viewing individual reports supporting the connection collated from the literature, and a space to save notes on this connection to our Golgi account (**Figure 15**).

FIGURE 12 | Visualization options. Once the region of interest has been found it is easily activated on the map. Activating a region colors it amber and drops a context-dependent pin onto the region. Clicking the pin displays visualization options (for the display of connection, molecular, and cell-type assertions) immediately adjacent to the pin itself. This helps streamline the user's workflow by reducing the amount of visual search they have to complete.

Clicking any of the green ''View Report'' buttons displays a summary of the primary document from which this connection report was sourced (including a hyperlink to the primary document's PubMed entry) and where available, curated details describing the experimental technique.

## DISCUSSION

Neuroinformatics offers tools that help researchers collect, analyze, store, and disseminate models and data. We note here two compelling patterns that emerged as the field grew.

First, frameworks emerged that promote structured exchange of information between neuroinformatics solutions (the federation of tools was not ensured as a natural consequence of the software ecosystem or development processes). Nevertheless the emergent pattern is often mutually beneficial to participating tools: federation and synthesis help each tool become more useful (and, by proxy, better adopted).

A second pattern emerged in which individual tools synthesize disparate modalities of neuroscientific data and assertions into a single User Experience. This trend, and the tools leading it, are not new: systems such as the BAMS offer researchers these types of multifaceted solutions. Tools in this pattern (of which we have aspired to build Golgi as one) combine reports from multiple lines of neuroscientific inquiry in a common data framework (such as internally-consistent, hierarchical neuroanatomical nomenclatures and associated atlases) and provide users with multifaceted reports that help them connect disparate observations and data sources into systems-level models.

We see tools that follow this second pattern, that aggregate data from disparate lines of inquiry, holding the potential to enhance workflows for three user segments: (1) research investigators; (2) computational modelers; and (3) clinicians.

## Golgi can Facilitate Systems-Level Investigation

The state of the art in experimental neuroscience requires increasingly multifaceted experimental manipulations at multiple levels and scopes. For example, investigators designing optogenetic experiments often must incorporate data about neuroanatomy, cytoarchitecture, genetics, proteomics, and connectomics to properly execute their experiments. More challenging still: the multifaceted data they integrate should be internally-consistent in its method and anatomical framework. To this extent, there is a pressure for tools that help investigators seamlessly aggregate reports on disparate neurological phenomena in an internally-consistent manner. Golgi will help investigators generate high-level insights within internally-consistent frameworks. This will facilitate new experiments better than will either isolated single-focus neuroinformatics tools or traditional literature searches.

## Golgi can help Facilitate Large-Scale Simulation Frameworks

Computational neuroscience has, on two ends of its spectrum, historically focused on emulating large biologically-inspired neural networks with high levels of abstraction or simulating restricted ensembles of neurons with higher degrees of biological realism. Contemporary investigations like those outlined in the Human Brain Project aim to synthesize biological realism with large-scale networks of simulated neurons. This signals to us an increased pressure for tools that make biological neural network architecture readily accessible.

As semiconductor price-performance continues to improve, more research teams may find it feasible to incorporate large-scale biologically-realistic simulations into their experimental workflows. Tools like Golgi that provide streamlined access to aggregated connectomic findings may be positioned to provide the connectomic framework for neural simulation suites. The export of selected neural architectures or direct federation with computational simulation tools will offer the computational neuroscience community easy ways to incorporate structurally-observed network architectures into their simulations. To this end we predict that tools that streamline the exploration of, and interaction with, connectomics data will experience closer relationships with computational neural simulation suites. In this scenario data exchange and system integration efforts can be expedited by bootstrapping APIs off existing interchange systems such as NeuroML (Gleeson et al., 2010).

While artificial neural network simulation suites exist (Aisa et al., 2008) that simulate the macrolevel connectivity of layers constrained to behave as unique brain regions, some more sophisticated systems seeking greater biological plausibility require the specification of mesolevel connectivity (connectivity between distinct cell populations) (Arbib, 2003). To this end, Golgi would need to evolve to incorporate both macrolevel connections and mesolevel connections as well. In the near future connectomics may converge on both experimental techniques well-suited to mesolevel investigation as well as palatable ontologies for describing mesoconnectivity. So too will Golgi need to evolve to incorporate this new information and federate it with other neuroinformatics and simulation tools.

However, in its current form, Golgi contains information that relates brain regions to distinct cell types that have been reported within those brain regions as recorded in BAMS. Nevertheless, the data available in this iteration of Golgi is limited to reporting the presence of these cell types and does not encode any mesoconnectivity. This is due to two related reasons. Primarily, both the available techniques for exploring mesoconnectivity and consensus understanding of how to discuss and organize acquired connectional data at the mesolevel remain developmental (Bohland et al., 2009). Because of this, the connectomic data available online in internally-consistent, hierarchical mesolevel ontologies is sparse. As such, we restricted the scope of data included in Golgi to macroconnectivity.

## Golgi can Connect Benchside with Bedside

When tools are built with ease-of-use as a design constraint they maximize the scope of their potential user base. Balancing Golgi's academic integrity with its ease-of-use was a primary design goal. The resulting User Experience minimizes barriers to adoption and encourages non-research experts to use it. Making Golgi easier to use for users like clinicians encourages its use as a dissemination and teaching platform.

In this scenario, neurologists and other clinicians will benefit from streamlined access to contemporary connectomics. Users in this segment often face constraints of time, background understanding, and institutional access that keep literature searches about a neural pathway of interest from being effective or efficient. We designed Golgi to minimize the amount of time or background knowledge a non-research expert will need to learn more about a particular pathway or circuit under investigation. By lowering these barriers to access Golgi can help clinicians maintain a contemporary understanding of neural architecture quickly and easily. Lowering these barriers for this user group may help improve outcomes as more clinicians are equipped with more current knowledge that helps them make better intervention decisions.

## Golgi's Context Among Analogous Neuroinformatics Tools

Golgi's development was encouraged by the existence and spread of other anatomical neuroinformatics frameworks. These other tools not only bolstered our belief that a tool like Golgi would be useful for the field, they provided both inspiration and a confirmation of how Golgi could distinguish itself from other tools and add value to researchers' workflows. Here we briefly explore three analogous neuroinformatics solutions, their strengths, and how Golgi may distinguish itself from each.

NeuroVIISAS offers an open framework for storing, processing, visualizing, and simulating neural data. While the majority of data collected and available from the creating team at the University of Rostock focuses on the rat and mouse, neuroVIISAS is species-flexible. As an expert-oriented system neuroVIISAS offers solutions to problems ranging from the digital exploration of traditional neuroanatomical atlases (like the Paxinos/Watson mouse brain atlas) to network analysis according to graph-theoretic measures and the direct incorporation of cell-type connectivity with computational simulation. Its broad feature set offers many solutions within a single common data neuroinformatics framework. Golgi may distinguish itself from neuroVIISAS by focusing narrowly on solving a single neuroinformatics problem very well: spatially exploring rat macroconnectivity. Constraining the use-case of our system allows us to constrain its User Interface and, by proxy, the learning curve that a user will face when using Golgi. This narrow approach is aligned with a trend in consumer-facing software products towards smaller, "one-solution" applications. This trend has two major causes. First, users seek out new applications to solve a narrow problem they face. As such, single-purpose apps tend to outcompete "all-in-one" apps because single-purpose apps better capture "first-to-mind" market positioning than do larger apps. Second, data aggregation, shared ontological frameworks, and common interchange formats like the NIFSTD allow single-focus neuroinformatics solutions to share data and resources while each maintaining a focused User Experience and minimized learning period.

The Mouse Connectome Project (MCP) produced by the Laboratory of Neural Imaging (Hintiryan et al., 2012) offers a user-friendly online web app for exploring macroconnectivity in the C57Bl/6J mouse brain. The approach taken by the MCP is distinct from the approach taken by neuroVIISAS or BAMS: rather than aggregating and synthesizing connectivity data from the available corpus of literature (as did BAMS and neuroVIISAS), the MCP itself generated novel connectivity data using state-of-the-art double co-injection tract tracing techniques. The MCP makes this data available online in the form of both queryable tables of connectivity and an interactive display interface. This interface allows users to view coronal section slices of the actual microscopy data collected for each experiment corresponding to queried regions and connections of interest. Rather than display interactive coronal sections, Golgi distinguishes itself from the MCP by displaying the entire brain as an embryonic flatmap upon which several connections that may have spanned numerous serial sections can be contiguously displayed simultaneously. This approach allows users to grasp the spatial relationships between several connections at once more intuitively than is possible in a serial section viewer like the MCP.

The interactive neural map produced by Jianu et al. (2012) allows users to explore diffusion tensor imaging tractography data in two dimensions via a web interface. Like the MCP, Jianu's neural map displays connectivity data collected de novo explicitly for the neural map. Jianu's map makes it easy for users to explore the relationships between white matter tracts (as observed in diffusion tensor imaging) and their relationship to one another in space. Yet unlike the MCP or neuroVIISAS, Jianu's map is unable to provide the user with any information about macroconnections and the relationship between brain parts. While both Jianu's neural map and Golgi take advantage of two-dimensional visualizations and brain flattening, the two systems take very different approaches to the problem of flattening the brain. Jianu's map simultaneously displays coronal, horizontal, and midsagittal views in which all fiber tracts are simultaneously displayed across all three views. This means that, for example, two tracts viewed in a midsagittal view, one more medial than the other, may appear to overlap each other. This forces the user to reference the horizontal or coronal view to distinguish the relative mediolateral position of each tract. Golgi instead relies on the Swanson (2004) flatmap to show the brain as a single flattened surface. This decision lets us show the entire brain in a single view and (when combined with the neuroanatomical compass we provide) makes it easier for users to understand the spatial relationships between regions and their connections than would be accommodated in the three-view system used in Jianu's neural map.

## CONCLUSION

We envision four threads of future technological growth for Golgi: data, display, detail, and development.

## Data

The data inside Golgi will be under pressures to grow in two directions: deep within a species and wide between species.

As more data within a species is collected experimentally or curated into knowledge management systems so too should Golgi include these new findings. To this end, establishing high-throughput pipelines for updating the data inside Golgi and designing API services between Golgi and other neuroinformatics tools will help ensure that Golgi expands its usefulness and is populated with contemporary data.

As a parallel line of development, incorporating different existing brain atlases within Golgi would increase the depth of available data for a given species. These atlases and their atlas-specific reports could exist independently from data registered to other atlases, or could be spatially co-registered for comparison across atlases for which strict topological relationships between terms and regions have been established (though this process comes with unique challenges of its own).

As an orthogonal vector of growth, incorporating findings from other species into separate maps is rate-limited not only by the availability of those findings but also by the existence of hierarchical, internally consistent nomenclatures and corresponding reference atlases in which the data can be contextualized. To this extent, Golgi's potential to represent data from other species depends on the neuroanatomy community's progress in producing high-caliber, interoperable brain atlases and nomenclatures for more species. Golgi's data architecture and front-end structure have been designed such that accommodating future brain atlases and atlases from different species can be done without drastically overhauling either the front-end or back-end framework.

## Display

The way that connectomics data is displayed is constrained by the affordances allowed by our two-dimensional display devices. Decisions in Golgi's construction—such as the choice of the Swanson (2004) embryonic flatmap as our atlas framework—were constrained by the fact that we predict our users will (overwhelmingly) use two-dimensional display devices and human-interface devices like a keyboard and mouse. Given these constraints, a two-dimensional map is well suited for a world of two-dimensional human-computer interaction.

As both human-computer interaction and display technologies evolve, so too will our software. To this end, one major future developmental branch for Golgi will be to use new methods of displaying the nervous system. New visualization and interaction methods may compel us, at minimum, to reimagine how we display data on our existing brain atlases. At their most disruptive, new three-dimensional or four-dimensional visualization techniques (that is, those that incorporate how connectomic and proteomic data change over the course of development and during dynamic functioning) may require us to invent wholly new ways to visualize the nervous system.

## Detail

The state-of-the-art in connectomics is growing in precision and detail. Golgi currently renders assertions of connectivity at the macrolevel (between brain regions) because far more macrolevel data are available than mesolevel data (between cell populations). The connectomics community, in collaboration with neuroanatomists, must develop frameworks for distinguishing mesonodes (distinct cell populations) from one another before our existing macrolevel data can be interpreted in this new context (and new experiments can be performed de novo at the mesolevel). As more connectivity information becomes available in mesolevel frameworks and mappable on mesolevel atlases, Golgi will need to provide affordances for displaying data at the mesoscopic level as well.

## Development

Beneath advancing the complexity of Golgi's display interface, widening and deepening the level of connectomic data it contains, and increasing the scope of connectivity users can explore lies the quotidian albeit critical operational challenge of ensuring Golgi's continued hosting, maintenance, and active development.

Golgi's hosting on Amazon Web Services helps us balance the demands of keeping Golgi adequately available to interested users against the need for cost-efficiency. The Amazon Web Services Elastic Compute Cloud service allows us to run Golgi cost-effectively when demand is low, and to expand capacity only when demand is high. High server availability and routine machine image backups help us ensure that any unpredicted system failures can be quickly remediated with minimal system downtime and data loss.

Golgi's data repository, as collected from BAMS, is currently being developed alongside a semi-automated daemon system that keeps Golgi up-to-date with BAMS without human intervention. This system, built on top of the Kimono API prototype pipeline we initially built for populating Golgi, will periodically check for discrepancies between Golgi's data and BAMS data and update Golgi to minimize these differences. This means that as BAMS grows and is populated with more and better connectional data, Golgi will grow along with it. This helps us focus available development labor on more creative and productive ends in expanding Golgi than routine data aggregation.

Finally, we released Golgi under open-source public version control on GitHub. Now, any enthusiastic developer can learn more about neuroinformatics and freely contribute to the development of this tool for the betterment of the community and the state-of-the-art. This reflects not only our desire to spread Golgi's adoption as far as we can, but also our belief that the best way to inure Golgi against deprecation and disinterest is to remove barriers to community involvement in Golgi's development. This type of public stewardship of projects and ideas not only helps make Golgi better, it helps enfranchise talented, interested developers looking to join our fast-growing field.

## AUTHOR CONTRIBUTIONS

Conceived and developed the project: RAB, LWS; Developed software: RAB; wrote the manuscript: RAB, LWS.

## ACKNOWLEDGMENTS

We would like to thank Drs. Gully APC Burns, Hong-Wei Dong, Olaf Sporns, Mihail Bota, and Alan Watts as well as Mr. T. Dalton Combs and Ms. Cathleen Crayton for guiding feedback and support while developing Golgi. Development was supported by the Rose Hills Foundation and the National Institutes of Health; grant numbers: 5R01NS050792; 5R37NS016686.


**Conflict of Interest Statement**: The reviewers Marcus Kaiser and Sharon Crook declare that, despite being the Topic Editors of the Frontiers Research Topic this manuscript is part of, the review was handled objectively and no conflicts of interest exist. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Brown and Swanson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Network dynamics with BrainX3: a large-scale simulation of the human brain network with real-time interaction

*Xerxes D. Arsiwalla<sup>1</sup>\*, Riccardo Zucca<sup>1</sup>, Alberto Betella<sup>1</sup>, Enrique Martinez<sup>1</sup>, David Dalmazzo<sup>1</sup>, Pedro Omedas<sup>1</sup>, Gustavo Deco<sup>2,3</sup> and Paul F. M. J. Verschure<sup>1,3</sup>\**

*<sup>1</sup> Synthetic Perceptive Emotive and Cognitive Systems Lab, Center of Autonomous Systems and Neurorobotics, Universitat Pompeu Fabra, Barcelona, Spain*

*<sup>2</sup> Computational Neuroscience Group, Center for Brain and Cognition, Universitat Pompeu Fabra, Barcelona, Spain*

*<sup>3</sup> Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain*

#### *Edited by:*

*Mihail Bota, University of Southern California, USA*

#### *Reviewed by:*

*Bruce Graham, University of Stirling, UK*

*Olusola Ajilore, University of Illinois at Chicago, USA*

#### *\*Correspondence:*

*Xerxes D. Arsiwalla and Paul F. M. J. Verschure, Synthetic Perceptive Emotive and Cognitive Systems Lab, Center of Autonomous Systems and Neurorobotics, Universitat Pompeu Fabra, La Nau Building, Roc Boronat 138, 08018 Barcelona, Spain e-mail: x.d.arsiwalla@gmail.com; paul.verschure@upf.edu*

BrainX3 is a large-scale simulation of human brain activity with real-time interaction, rendered in 3D in a virtual reality environment, which combines computational power with human intuition for the exploration and analysis of complex dynamical networks. We ground this simulation on structural connectivity obtained from diffusion spectrum imaging data and model it on neuronal population dynamics. Users can interact with BrainX3 in real-time by perturbing brain regions with transient stimulations to observe reverberating network activity, simulate lesion dynamics or implement network analysis functions from a library of graph theoretic measures. BrainX3 can thus be used as a novel immersive platform for exploration and analysis of dynamical activity patterns in brain networks, both at rest and in a task-related state, for discovery of signaling pathways associated with brain function and/or dysfunction and as a tool for virtual neurosurgery. Our results demonstrate these functionalities and provide insight into the dynamics of the resting-state attractor. Specifically, we found that a noisy network seems to favor a low firing attractor state. We also found that the dynamics of a noisy network are less resilient to lesions. Our simulations of transcranial magnetic stimulation (TMS) perturbations show that even though TMS inhibits most of the network, it also sparsely excites a few regions. This is presumably due to anti-correlations in the dynamics and suggests that even a lesioned network can show sparsely distributed increased activity, compared to the healthy resting-state, over specific brain areas.

**Keywords: connectomics, virtual reality, neural dynamics, large-scale brain networks, big data, virtual neurosurgery**

#### **1. INTRODUCTION**

How should one visualize and simulate the large amounts of data being generated nowadays in neurobiology, in ways that could inform our understanding of the structure and function of the brain? Would that also link to clinical applications? Over the years, the cumulative spate of studies in structural and functional neuroimaging, electrophysiology, genetic imaging and axonal tracing has generated enormous amounts of data (found in online repositories such as http://www.neuroscienceblueprint.nih.gov/connectome/ and http://www.brain-map.org to name a few). On one hand, these data have led to many insights on the intricate patterns of signaling and connectivity, as well as the existence of multi-scale processes in the brain; on the other hand, they have exposed the need for an integrative framework for modeling and simulating whole-brain dynamics and function. To address this need, neuroscientists started using ideas from network engineering, thinking of the brain as a complex dynamical network of neurons, thus giving rise to the field of brain connectomics (Hagmann, 2005; Sporns et al., 2005). Akin to the genome, which is a map of the genetic sequence of an organism, the connectome is a map of the neuronal circuitry of an organism, emphasizing the nodes and their connections, where nodes can represent anything from volumes of neuronal tissue to single neurons. Of course, a map of this network alone is not sufficient to predict or understand function. Because the brain is a dynamic network, its signaling processes operate across a range of spatial and temporal scales. Therefore, a mechanistic understanding of these processes is essential to gain insight into cognitive function. At the same time, the complexity of the connectome means that these signaling circuits cannot be understood in isolation or even in a serial manner, but necessarily have to be seen in the functional context of the whole network. This calls for a large-scale network level analysis and simulation of whole-brain activity and an associated immersive visualization and interaction system. This is the challenge BrainX3 aims to tackle.

BrainX3 is a large-scale simulation of the human cerebral connectome, which uses both anatomical structure and biophysical dynamics in order to reconstruct activity and predict function. Structural connectivity of the network is obtained from human Diffusion Spectrum Imaging (DSI) data (Hagmann et al., 2008). Each node of the connectivity matrix corresponds to a population of neurons. For simulating dynamics, BrainX3 offers the user a choice of dynamical models that can be implemented, namely, population dynamics based on a linear-threshold transfer function, or a non-linear sigmoidal transfer function with decay, or dynamical mean-field models (Wong and Wang, 2006; Deco et al., 2013). The last of these models are the most interesting as they come closest to biology; they compute aggregate neural activity taking into account synaptic dynamics and stochasticity.

Until the advent of the connectome (Hagmann, 2005; Sporns et al., 2005), the traditional method of choice for investigating non-invasive human resting-state dynamics has heavily relied on back-inference of neural mechanisms from BOLD signals. Despite the many successes of this approach, a few drawbacks remain; for instance, it does not provide precise temporal information on the flow of activity through the network, and being a signal based on haemodynamic responses, it is at best only a proxy for neural activity. However, the birth of the field of connectomics has helped turn this around. Combining structural connectivity data with detailed neuronal dynamics made it possible to instead predict functional activity (Honey et al., 2009). These predictions have been validated for both spiking neuron as well as dynamical mean-field models, when compared to empirical BOLD data (Deco et al., 2013). In this work we implement mean-field dynamics in BrainX3 using Hagmann et al.'s DSI network to reconstruct neurodynamic activity for the entire cortex within a 3D virtual environment, for the purpose of investigating temporal patterns of whole-brain neural activity when the brain is in the resting-state. The resting-state refers to the state of spontaneous neural activity recorded in the absence of any specific tasks being instructed to the subject. In computational models, this corresponds to no specific (or localized) external input currents being injected into the network above the overall baseline current. The activity in this spontaneous state is far from random. Hence, understanding the biophysical dynamics of the resting-state constitutes an important challenge.

The technology of BrainX3 is layered on four modular components: an input module, a data processing module, a visualization and interaction module, and a simulation and analysis module. The integration of these modules generates the full user experience. Simulations run in the eXperience Induction Machine (XIM), an enclosed virtual/mixed reality chamber, which enables user-immersion and exploration of virtually rendered scenarios (refer to **Figure 1**) (Bernardet et al., 2010; Betella et al., 2014c). Based on DSI data, a 3D model of the connectome network is reconstructed within the XIM, providing users with an inside-out perspective of the brain connectome and allowing them to navigate through pathways in the network. Additionally, the XIM supports a custom-built large-scale neural simulator, iqr, which communicates bi-directionally with the virtual brain network and imposes dynamics on it (Bernardet and Verschure, 2010). Furthermore, BrainX3 is based on a natural user interaction paradigm: using natural gestures (hand gestures, body posture, etc.), users can navigate the virtual space, select and bookmark brain areas, perform surgeries, and stimulate any region of the network to investigate the dynamics of the neuronal activity reverberating through associated areas. Finally, for data analysis, BrainX3 communicates bi-directionally, using standard protocols such as UDP or YARP (Metta et al., 2006), with external network analysis tools, including the Brain Connectivity Toolbox (BCT) (Rubinov and Sporns, 2010) running on a MATLAB client.

What is BrainX3 capable of? It provides the possibility of analyzing and interacting in real-time with the simulated activity. Compared to functional correlations, dynamical analysis of causal activity serves as a powerful tool to unravel mechanisms of large-scale neural circuits. Indeed, coupling structural connectivity data with sufficiently detailed population dynamics should be enough to predict functional correlations and large-scale activity patterns. As examples, we use BrainX3 to demonstrate the dynamics of the brain in the resting-state as well as under perturbations due to evoked stimuli. We investigate how neural activity reorganizes following simulated lesions. Additionally, using graph theoretic measures from the BCT we can also determine shortest paths between nodes.

**FIGURE 1 | Computer rendering of BrainX3 within the eXperience Induction Machine (XIM).** The 3D connectome network and its simulated dynamics are projected on the frontal screen. The screen on the right displays regional information of selected brain areas from a curated database, while the left screen shows 2D axial slices of the brain and indicates regions of activity. The user can navigate and interact with the model with predefined hand gestures.

At this point, it is also worth drawing attention to the growing eco-system of 'big brain projects' and other neuroinformatics tools which complement BrainX3. Among these are the Connectome Workbench of the Human Connectome Project (http://www.humanconnectome.org/connectome/connectomeworkbench.html) (Marcus et al., 2013), the Brain Explorer 2 of the Allen Institute for Brain Science (http://www.brain-map.org), the Glass Brain Project (http://neuroscapelab.com/projects/glass-brain/) (Mullen et al., 2013), the VisNEST tool (Nowke et al., 2013) and The Virtual Brain (Jirsa et al., 2010; Sanz-Leon et al., 2013). While many of them are 3D visualization tools, some of them also include dynamics and interaction. In that sense, The Virtual Brain comes closest to the objectives of BrainX3, although it does not, for example, include real-time interaction. Unlike the aforementioned tools, BrainX3 runs in a completely immersive virtual reality chamber, facilitating real-time interaction with the simulation using natural gestures. For the benefit of the neuroinformatics community, a portable laptop version, including interaction (but without user immersion), is currently under development. Besides visualization, interaction and simulation, BrainX3 has been developed with a vision toward a "smart exploration space for big data" as part of the European Union CEEDS (Collective Experience of Empathic Data Systems) project (http://ceeds-project.eu and http://www.brainx3.com).

### **2. MATERIALS AND METHODS**

#### **2.1. HARDWARE AND SYSTEM ARCHITECTURE**

The virtual reality environment supporting BrainX3 is the eXperience Induction Machine (XIM) (Bernardet et al., 2010; Betella et al., 2014c; Omedas et al., 2014). The XIM is a 25 m<sup>2</sup> human accessible space (schema in **Figure 1**) equipped with 360° surround screens, an interactive luminous floor with pressure sensors, a marker-free tracking system, a Kinect™, microphones, a sonification system and wearable sensors, which together support human-machine interaction in the exploration of complex datasets. The computational platform running BrainX3 in the XIM includes four latest generation machines that communicate bi-directionally using the YARP protocol within a high speed LAN connection: 2 PCs dedicated to graphical rendering (Intel Core i7 2600K 3.4 GHz/8 MB/LGA1155, two Kingston DDR3 4 GB 1333 MHz modules, AMD FirePro V7900 Professional with AMD Eyefinity technology) with a total of eight display port outputs, each of which is connected to an HD projector (Epson PowerLite Pro G5450WUNL), thus creating a 360° projection display that surrounds the user; 1 server dedicated to sensor recording and real-time interaction (HP ProLiant DL160 G6, Xeon E5506, 2.13 GHz) that is connected to the XIM sensors and effectors, including a Microsoft Kinect2™, the sonification system and the interactive floor; and 1 server dedicated to simulation and computation (HP ProLiant DL160 G6, Xeon E5506, 2.13 GHz) that runs the neural network simulator iqr (http://iqr.sourceforge.net) and MATLAB.

Within the XIM virtual reality environment, BrainX3 functions as a data visualization and simulation tool. The processing architecture of BrainX3 is schematically illustrated in **Figure 2**. The input module (layer) has two components: the *network data* and the *network atlas*. The *network data* is the connectome dataset, while the *network atlas* contains the coordinates of each element of the network. Both of these are stored in GraphML (XML) format. The *data parser* generates a data structure and specifies the components of the graph. The *graphical allocator* reads the meta data associated to each one of the elements that compose the network and associates them to the 3D coordinate system, included in the *network atlas*. In BrainX3, we adopt the standard anatomical coordinates of the Talairach atlas (Talairach and Tournoux, 1988), however other coordinate systems can just as easily be applied. The *geometry provider* plots the results as a 3D representation of the data by combining the instances generated by the parser with the coordinates specified within the atlas. The components responsible for data processing follow a *Model-View-Controller* (MVC) design pattern. Both data processing and real-time rendering have been developed and implemented using Unity 3D (http://unity3d.com/). The advantage of such a modular structure is that it provides BrainX3 with the adaptability for visualization and simulation of other data types besides neural data, which can be stored as a network or organized in a hierarchical structure, such as gene regulatory networks, social networks, etc.
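The parsing and allocation steps described above can be illustrated with a minimal sketch. It is not the BrainX3 implementation (which is built in Unity 3D); it only shows how a GraphML file could be read into node coordinates and weighted edges using the Python standard library. The attribute names `x`, `y`, `z` and `weight`, the helper names, and the toy document that merges network data and atlas into one file are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard GraphML schema.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Toy document: attribute names x, y, z and weight are hypothetical.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="kx" for="node" attr.name="x" attr.type="double"/>
  <key id="ky" for="node" attr.name="y" attr.type="double"/>
  <key id="kz" for="node" attr.name="z" attr.type="double"/>
  <key id="kw" for="edge" attr.name="weight" attr.type="double"/>
  <graph id="toy" edgedefault="undirected">
    <node id="n1"><data key="kx">-10.0</data><data key="ky">20.0</data><data key="kz">30.0</data></node>
    <node id="n2"><data key="kx">12.5</data><data key="ky">-4.0</data><data key="kz">8.0</data></node>
    <edge source="n1" target="n2"><data key="kw">0.25</data></edge>
  </graph>
</graphml>"""


def read_attributes(element, key_names):
    """Collect the numeric <data> children of a node or edge by attribute name."""
    return {key_names[d.get("key")]: float(d.text)
            for d in element.findall("g:data", NS)}


def parse_graphml(text):
    """Return ({node_id: (x, y, z)}, [(source, target, weight), ...])."""
    root = ET.fromstring(text)
    # Map each <key> id to the human-readable attribute name it declares.
    key_names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    graph = root.find("g:graph", NS)

    nodes = {}
    for node in graph.findall("g:node", NS):
        attrs = read_attributes(node, key_names)
        nodes[node.get("id")] = (attrs["x"], attrs["y"], attrs["z"])

    edges = []
    for edge in graph.findall("g:edge", NS):
        attrs = read_attributes(edge, key_names)
        edges.append((edge.get("source"), edge.get("target"), attrs.get("weight", 1.0)))
    return nodes, edges


nodes, edges = parse_graphml(SAMPLE)
print(nodes["n1"], edges[0])
```

Keeping the coordinates in a dictionary keyed by node identifier mirrors the separation between network data and network atlas: a different atlas would only change the coordinate lookup, not the edge list.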

#### **2.2. VISUALIZATION AND SIMULATION**

Visualization and reconstruction of the connectome within BrainX3 is based on DSI data of white matter fiber structural connectivity averaged from five healthy right-handed male human subjects (Hagmann et al., 2008). The dataset contains 998 voxels (nodes) belonging to 33 cortical areas per hemisphere (refer to **Table 1**), for a total of 66 areas. The 998 Regions of Interest (ROIs) have an average size of 1.5 cm<sup>2</sup> and each ROI is associated with {*x*, *y*, *z*} coordinates as per the Talairach coordinates of ROIs (Talairach and Tournoux, 1988). Since tractography does not determine the directionality of the fibers, the connectivity matrix (approximately 17000 bi-directional connections) is symmetric at the ROI level. Connection strengths within the network refer to the normalized number of white matter fiber tracts between ROIs.
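As a rough check on scale, these figures imply a sparse network. Assuming the approximately 17000 connections count undirected ROI pairs (an interpretation, not a statement from the dataset description), the connection density, written here with the illustrative symbol ρ, is

$$\rho \approx \frac{17000}{\binom{998}{2}} = \frac{17000}{497503} \approx 0.034$$

so only about 3.4% of all possible ROI pairs would be directly connected under that assumption.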

To introduce dynamics into the visualization, the large-scale multi-level neural network simulator, iqr (http://iqr.sourceforge.net/), is bi-directionally interfaced to Unity. iqr allows the user to design complex neuronal models through a graphical interface and to visualize, analyze and modify the model's parameters in real-time (Bernardet and Verschure, 2010). The architecture of iqr is modular, providing the possibility to define custom neurons and synapses. iqr can simulate large neuronal systems of up to 500k neurons and connections and can be directly interfaced to external sensors and effectors. In order to enable real-time user interaction with the reconstructed data, user input from Unity is sent to iqr (Arsiwalla et al., 2013). The neuronal simulator computes the processes and broadcasts the output of the simulation back to the Unity engine in the XIM. iqr can receive commands through Unity at any time during the simulation. Upon receiving input from iqr, Unity updates the visualized population activity on each node. In its current form, BrainX3 can accommodate networks of up to 4000 nodes (albeit with a slower simulation time).

**Table 1 | 33 brain regions on each hemisphere (ID), abbreviated name (Abbr.), anatomical name (Brain region) and ROI node numbers for each region on the right (R) and left (L) hemispheres.**


#### **2.3. DYNAMICAL MODELS IN BRAINX<sup>3</sup>**

As the connectivity data currently used by BrainX3 are derived from neuroimaging sources, it is more appropriate to model network dynamics by means of neuronal population models. At present, BrainX3 can run simulations with any of three models: (i) the linear-threshold model, (ii) the non-linear (sigmoidal) model and (iii) the dynamical mean-field model. The linear-threshold model simply sums all the input signals to a population module from various dendrites (within a fixed time window) and fires an output signal to neighboring modules only if the summed input crosses a designated threshold. Additionally, each neuronal population module is stochastic, having Gaussian noise. This model was demonstrated in earlier work (Arsiwalla et al., 2013). The non-linear model is similar, except that the linear-threshold filter is replaced by a sigmoidal filter with decay. It was used in Betella et al. (2014b).
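The linear-threshold rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the iqr implementation: the function name `linear_threshold_step` is hypothetical, the output above threshold is assumed to be the summed input minus the threshold, and the fixed time window is collapsed into a single discrete update.

```python
import random


def linear_threshold_step(activity, weights, threshold, noise_sd=0.0, rng=None):
    """One update of a stochastic linear-threshold population network.

    activity: current output of each population module
    weights:  weights[i][j] is the connection strength from module j to module i
    """
    rng = rng if rng is not None else random.Random()
    updated = []
    for row in weights:
        # Sum all weighted inputs arriving at this module, plus Gaussian noise.
        total = sum(w * a for w, a in zip(row, activity)) + rng.gauss(0.0, noise_sd)
        # Assumption: the output is linear above threshold and zero below it.
        updated.append(total - threshold if total > threshold else 0.0)
    return updated


# Two mutually connected modules, no noise: only the second one fires.
weights = [[0.0, 1.0],
           [1.0, 0.0]]
print(linear_threshold_step([2.0, 0.5], weights, threshold=1.0))
```

Replacing the thresholded branch with a sigmoid and adding a decay term would give the flavor of the second, non-linear model.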

The dynamical mean-field model is a mathematical reduction of a spiking attractor network consisting of integrate and fire neurons with excitatory and inhibitory synapses (Wong and Wang, 2006; Deco et al., 2013). Global brain dynamics of the network of interconnected local networks is described by the following set of differential equations derived in Deco et al. (2013):

$$\frac{dS_i}{dt} = -\frac{S_i}{\tau_s} + (1 - S_i)\,\gamma H(x_i) + \sigma \nu_i(t) \tag{1}$$

$$H(x_i) = \frac{a x_i - b}{1 - \exp(-d(a x_i - b))} \tag{2}$$

$$x_i = w J_N S_i + G J_N \sum_j C_{ij} S_j + I_0 \tag{3}$$

where *H*(*x<sub>i</sub>*) and *S<sub>i</sub>* respectively correspond to the population rate and the average synaptic gating variable at each local node *i*, *w* is the local recurrent excitation, *G* is a global scaling parameter, *C<sub>ij</sub>* is the matrix of structural connectivity expressing the neuroanatomical connections between areas *i* and *j*, and *ν<sub>i</sub>* is uncorrelated Gaussian noise with amplitude σ. All parameter values, with the exception of σ, which was systematically varied in the present simulation study, are as in Deco et al. (2013) and have been summarized in **Table 2**.

**Table 2 | List of parameters of the dynamical mean-field model implemented in BrainX3 from Deco et al. (2013).**

Among the three types of models described above, mean-field models are the most interesting as they come closest to biology. They compute aggregate neural activity taking into account synaptic dynamics and stochasticity. Hence, in this paper, our simulations will be based on the dynamical mean-field model. However, compared to Deco et al. (2013), where the dynamics was parametrized on 66 regions, in this work we scale the dynamics to 998 ROIs.
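To make Equations (1) to (3) concrete, the following is a minimal numerical sketch using an Euler-Maruyama step in NumPy. It is not the iqr implementation. The parameter values are illustrative figures commonly quoted for this family of models and are assumptions here, since Table 2 is not reproduced in this text. The square-root-of-time-step noise scaling, the clipping of the gating variable to [0, 1], the guard in `transfer`, and the small random connectivity matrix are likewise choices made for the example.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from Table 2); time in ms.
PARAMS = dict(a=270.0, b=108.0, d=0.154, gamma=0.641e-3, tau_s=100.0,
              w=0.9, J_N=0.2609, I_0=0.3)


def transfer(x, a, b, d):
    """Population rate H(x) of Equation (2)."""
    u = a * x - b
    # Guard the removable singularity at u = 0, where H tends to 1/d.
    u = np.where(np.abs(u) < 1e-9, 1e-9, u)
    return u / (1.0 - np.exp(-d * u))


def total_input(S, C, G, p):
    """Input x_i of Equation (3): local recurrence, network coupling, baseline."""
    return p["w"] * p["J_N"] * S + G * p["J_N"] * (C @ S) + p["I_0"]


def simulate(C, G, sigma, duration_ms, dt_ms=0.1, seed=0, p=PARAMS):
    """Integrate Equation (1); return final gating variables and firing rates."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    S = np.full(n, 0.001)
    for _ in range(int(round(duration_ms / dt_ms))):
        H = transfer(total_input(S, C, G, p), p["a"], p["b"], p["d"])
        drift = -S / p["tau_s"] + (1.0 - S) * p["gamma"] * H
        noise = sigma * np.sqrt(dt_ms) * rng.standard_normal(n)
        # Clipping keeps S a valid gating fraction (an addition for this sketch).
        S = np.clip(S + dt_ms * drift + noise, 0.0, 1.0)
    rates = transfer(total_input(S, C, G, p), p["a"], p["b"], p["d"])
    return S, rates


# Toy symmetric connectivity for five nodes, standing in for the 998-ROI matrix.
A = np.random.default_rng(1).random((5, 5))
C = (A + A.T) / 2.0
np.fill_diagonal(C, 0.0)
S, rates = simulate(C, G=2.3, sigma=0.01, duration_ms=20.0)
print(rates)
```

The same loop applies unchanged to a 998 by 998 matrix; only the cost of the matrix-vector product in `total_input` grows.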

#### **2.4. REAL-TIME INTERACTION FRAMEWORK**

BrainX3 allows users to interact in real-time with the simulation (the simulation itself does not run in real time: each millisecond of simulated time takes between 20 and 50 ms to compute, and longer during interaction). This is a form of on-line interaction, as opposed to a pre-programmed off-line mode of interaction. It gives users the possibility to perturb the simulated activity (by injecting currents using predefined hand gestures) mid-way through the run. Gesture recognition and signaling within the XIM is supported via the Social Signal Interpretation (SSI) framework (Wagner et al., 2013) and is based on the Microsoft Kinect™ v2 technology. The Kinect™ detects body joints and two main hand actions: the closed hand and pointing with a finger. All high-level mapping and interpretation of gestures is performed by SSI. In order to rotate the network sideways, the user simply has to clench her/his fist and make a sideways arm movement. For zooming into the network, the user moves directly toward the screen and the visualization becomes immersive, placing the user "inside" the 3D reconstruction (refer to **Figure 3**). For stimulating or inhibiting brain areas, the user simply has to control the cursor with a hand movement, select a node or region in the network with a grabbing gesture, then drag and drop the cursor on the icon in the graphical user interface associated with stimulation or inhibition. This respectively corresponds to injecting external excitatory or inhibitory currents into the dynamics. The strength of the stimulation current is pre-defined in the iqr configuration file (but can be arbitrarily chosen). Stimulations can be performed on one or more brain areas simultaneously. BrainX3 then reconstructs reverberating neural activity propagating through the connectome. Furthermore, in order to equip the user with tools for analysis of the outcome of the simulation, BrainX3 is also interfaced with the MATLAB Brain Connectivity Toolbox, which enables several graph-theoretic operations to be performed on the reconstructed network, such as finding the shortest path between any two nodes or detecting community structure in the data (Rubinov and Sporns, 2010). BrainX3 also includes customized interaction functionalities that allow the user to bookmark areas of interest, to tag and visually highlight chosen pathways, to filter network complexity and to model lesions by disabling nodes in order to obtain altered activity associated with the lesion.

**FIGURE 4 | Simulation of resting-state activity vs. noise showing that increased noise suppresses network activity.** Snapshot of resting-state neural activity at a single time-point (after the dynamics stabilizes around the attractor) for different levels of noise amplitude within BrainX3. From top to bottom, noise amplitudes: **(A)** 0.01, **(B)** 0.05, **(C)** 0.07, and **(D)** 0.1. Each row shows screenshots from the posterior, superior and lateral perspectives. The color bar on the right represents neuronal activity in Hz (warmer colors represent higher activation of the nodes). The full simulation can be seen on videos 01 and 02 of the following link: https://www.youtube.com/playlist?list=PL-BcYpSz98wqVAKuI-ymqDII-6nXK\_8uq.

indicated by colors from the color bar (warmer colors refer to higher activation), mean firing rate for all 998 nodes over the last 2 s of simulation and time-series signals extracted for three seed ROIs rCAC (node 193, shown in black), rISTC (node 205, shown in green) and lPCUN (node 830, shown in magenta).
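Two of the analysis operations mentioned in Section 2.4, modeling a lesion by disabling nodes and finding the shortest path between two nodes, can be sketched as follows. BrainX3 delegates the graph analysis to the MATLAB Brain Connectivity Toolbox; this standard-library Python version is only an illustration. The function names `lesion` and `shortest_path` are hypothetical, and taking the length of a connection as the inverse of its weight is an assumption (a common convention for weighted networks).

```python
import heapq


def lesion(weights, nodes):
    """Return a copy of the matrix with every connection of the given nodes removed."""
    dead = set(nodes)
    return [[0.0 if (i in dead or j in dead) else w for j, w in enumerate(row)]
            for i, row in enumerate(weights)]


def shortest_path(weights, source, target):
    """Dijkstra's algorithm; edge length is assumed to be 1 / connection weight."""
    n = len(weights)
    dist = [float("inf")] * n
    prev = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == target:
            break
        for v, w in enumerate(weights[u]):
            if w > 0.0 and d + 1.0 / w < dist[v]:
                dist[v] = d + 1.0 / w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    if dist[target] == float("inf"):
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Toy symmetric network of four nodes.
W = [[0.0, 1.0, 0.25, 0.0],
     [1.0, 0.0, 1.0, 0.0],
     [0.25, 1.0, 0.0, 0.5],
     [0.0, 0.0, 0.5, 0.0]]
print(shortest_path(W, 0, 3))               # route through node 1
print(shortest_path(lesion(W, [1]), 0, 3))  # rerouted after lesioning node 1
```

Comparing paths before and after a lesion in this way gives a simple structural counterpart to the altered activity observed in the simulation.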

### **3. RESULTS**

We now put the functional capabilities of BrainX3 to the test to gain insights into the large-scale dynamics of the human connectome. We start by simulating the global dynamics of the resting-state. Then we lesion the structural network and study aberrant cortical activity for both focal lesions, as in stroke patients, and diffuse lesions, as in multiple sclerosis patients. Next, we study the effect of external perturbations such as transcranial magnetic stimulation on the network and its resulting evoked activity. Finally, we demonstrate an exercise in tracing pathways within the cortex in order to extract functional circuits as well as to analyze them. Videos explicitly demonstrating these results in BrainX3 have been uploaded on the following link: https://www.youtube.com/playlist?list=PL-BcYpSz98wqVAKuI-ymqDII-6nXK\_8uq.

#### **3.1. DYNAMICS OF THE RESTING-STATE**

**Figures 4**, **5** show results from 10 s of simulation of resting-state dynamics. An important observation made in Deco et al. (2013) was that the resting-state network operates at the edge of a bifurcation. This fixes the global coupling parameter of the model. Analogously, for the scaled model we implement here, the value of the global coupling *G* is determined to be 2.3 using the same

**FIGURE 6 | Simulation of lesioned (focal) brain activity vs. noise.** Snapshot of neural activity at a single time-point in BrainX3 following a focal lesion in areas rCUN, rLOCC, and rPCUN. From top to bottom, noise amplitudes: **(A)** 0.01, **(B)** 0.05, **(C)** 0.07, and **(D)** 0.1. Each row shows screenshots from the posterior, superior and lateral perspectives. The color bar on the right represents neuronal activity in Hz (warmer colors represent higher activation of the nodes and lesioned nodes are shown in black). The full simulation can be seen on videos 03 and 04 of the following link: https://www.youtube.com/playlist?list=PL-BcYpSz98wq VAKuI-ymqDII-6nXK\_8uq.

observation. Besides that, all other mean-field model parameters (except the noise amplitude σ) are held at exactly the same values as in Deco et al. (2013). The numerics run in time steps of 0.1 ms, but we sample data every 1 ms, giving 10,000 points for a run of 10 s. **Figure 4** shows screenshots from the front display of BrainX3 at the end of four runs of the simulation. Each run used a different value of noise amplitude, shown in rows A, B, C and D with σ = 0.01, 0.05, 0.07, and 0.1 respectively. The three snapshots in each row (from left to right) correspond to the posterior, superior and lateral views respectively. Since BrainX3 is interfaced to MATLAB via YARP/UDP, in addition to the 3D reconstruction, we also obtained time-series data that can be analyzed using any statistical tool. This is shown in **Figure 5**. This analysis was performed off-line using MATLAB. The four subplots A, B, C, and D refer to the same four noise levels. Further, each subplot includes three graphics: a 2D distribution of ROIs with colored nodes indicating activity level at the end of the simulation, a plot showing the mean firing rate of every ROI over the last 2 s, and a plot showing the full time-series signal of three randomly chosen nodes. The mean firing activity represents the stable fixed point of the dynamics and in fact the attractor of the resting-state network. The three time-series signals correspond to seed nodes 193, 205, and 830, located in regions rCAC (black), rISTC (green) and lPCUN (magenta) respectively. **Table 1** shows the mapping of ROI identities to anatomical region names.
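The sampling arithmetic and the "mean firing rate over the last 2 s" measure described above can be sketched as follows. This is an illustrative Python fragment, not the MATLAB analysis used for **Figure 5**; the function names and the synthetic trace are assumptions made for the example.

```python
def downsample(trace, dt_ms=0.1, sample_ms=1.0):
    """Keep one value per sampling interval from a trace integrated at dt_ms."""
    stride = round(sample_ms / dt_ms)   # 10 integration steps per sample
    return trace[::stride]


def mean_rate_last_window(samples, sample_ms=1.0, window_s=2.0):
    """Average firing rate (Hz) over the final window_s seconds of a sampled trace."""
    n = round(window_s * 1000.0 / sample_ms)
    tail = samples[-n:]
    return sum(tail) / len(tail)


# Synthetic trace for one node: 10 s integrated at 0.1 ms (100,000 steps),
# silent for the first half and constant at 3 Hz for the second half.
n_steps = round(10.0 * 1000.0 / 0.1)
trace = [0.0] * (n_steps // 2) + [3.0] * (n_steps // 2)

samples = downsample(trace)
print(len(samples))                    # 10000
print(mean_rate_last_window(samples))  # 3.0
```

Applying `mean_rate_last_window` to each of the 998 ROI traces would give one value per ROI, which is the quantity plotted per node in the mean firing rate panels.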

An interesting insight that we gain upon comparing the results of these simulations is the way noise affects the network attractors themselves. This is summarized in the histogram on the left-hand side of **Figure 11**, showing the total mean firing rate of the resting-state network integrated over all ROIs. Each column of the histogram refers to a given noise amplitude. What is interesting is that, rather than jumping into a hyperactive or chaotic state upon increasing intrinsic noise, the dynamics of the network seem to quiet down. For σ = 0.07, mean activity for each node is around 40 Hz and for σ = 0.1, it almost goes to zero. Remarkably, this happens without the use of any ROI-to-ROI inhibitory connections. Noise seems to reverse the stability of the previously unstable low firing attractor state.

#### **3.2. DYNAMICS OF STROKE AND MULTIPLE SCLEROSIS**

Having looked at the healthy resting-state network above, we now show how lesions can be simulated in BrainX3. We consider two lesion types: (i) focal lesions, which occur in the case of stroke patients, and (ii) diffuse lesions, which typically occur in patients with multiple sclerosis. **Figures 6**, **7** show results for the former lesion type with the same four levels of noise as above. **Figures 8**, **9** show results with diffuse lesions. The focal lesion is constructed on the right hemisphere by severing all white matter fiber connections from all nodes in regions rCUN, rLOCC, and rPCUN. This gives a total of 52 disconnected ROIs, amounting to 6.64% of the total connections. The diffuse lesions are constructed by randomly disconnecting individual ROIs distributed throughout the network. To compare with the focal case, we chose 50 scattered ROIs, which amount to 4.91% of the total connections.
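The two lesion constructions can be sketched on a small toy network as follows. This is a hedged illustration in plain Python, not the BrainX3 implementation: the symmetric matrix representation, the region labels `A`, `B`, `C`, all function names and the fixed random seed are assumptions made for the example, and the percentages it prints relate to the toy ring network only, not to the connectome figures quoted above.

```python
import random


def count_edges(weights):
    """Number of nonzero entries above the diagonal of a symmetric matrix."""
    n = len(weights)
    return sum(1 for i in range(n) for j in range(i + 1, n) if weights[i][j] != 0)


def lesion(weights, nodes):
    """Return a copy of weights with every connection of the given nodes set to zero."""
    cut = set(nodes)
    return [[0 if (i in cut or j in cut) else w for j, w in enumerate(row)]
            for i, row in enumerate(weights)]


def focal_nodes(region_of, regions):
    """Focal lesion: all nodes whose region label is in the chosen regions."""
    return [i for i, r in enumerate(region_of) if r in regions]


def diffuse_nodes(n_nodes, k, seed=0):
    """Diffuse lesion: k nodes drawn uniformly at random, without replacement."""
    return random.Random(seed).sample(range(n_nodes), k)


def percent_removed(weights, nodes):
    """Percentage of connections lost when the given nodes are disconnected."""
    before = count_edges(weights)
    after = count_edges(lesion(weights, nodes))
    return 100.0 * (before - after) / before


# Toy network: a ring of 6 nodes with hypothetical region labels.
n = 6
ring = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
region_of = ["A", "A", "B", "B", "C", "C"]

focal = focal_nodes(region_of, {"A"})
print(focal, percent_removed(ring, focal))  # [0, 1] 50.0
diffuse = diffuse_nodes(n, 2)
print(diffuse, percent_removed(ring, diffuse))
```

The sketch also shows why equal or similar node counts need not remove the same fraction of connections: adjacent lesioned nodes share edges, whereas scattered nodes generally do not.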

colors from the color bar (warmer colors refer to higher activation and lesioned nodes are shown in black), mean firing rate for all nodes over the last 2 s of simulation and time-series signals extracted for the three seed ROIs rCAC (node 193, shown in black), rISTC (node 205, shown in green) and lPCUN (node 830, shown in magenta).

In **Figure 10**, we compare differences between healthy resting-state activity (**Figure 4**) and the lesioned activity (**Figure 6**). The four plots on the left side of **Figure 10** show the difference in mean firing rate between the healthy and focally lesioned networks (on the y-axis) in the attractor state at every ROI (on the x-axis), for the four noise amplitudes (in increasing order from top to bottom). From this we see that for the lowest noise amplitude (0.01), the lesion mostly affects activity in its anatomical vicinity. However, by the time we reach noise amplitude 0.07, the span of the network affected by the lesion has dramatically increased. Furthermore, the histogram shown in the center of **Figure 11** integrates the differences in mean firing over all ROIs to give the total difference in mean firing between the healthy and focally lesioned network for each noise amplitude. The columns of the histogram are on the positive side of the x-axis (except for noise amplitude 0.1, when the activity in both networks is just noise). By the time the noise amplitude rises to 0.07, the lesioned network dramatically differs from the healthy network in total firing. These observations suggest that noisy networks are less resilient to focal lesions.

On the other hand, a comparison of mean activity between the focally lesioned (**Figure 6**) and diffuse lesioned (**Figure 8**) networks is shown in the four plots on the right side of **Figure 10**. The y-axis denotes the difference in mean firing rate between the diffuse and focally lesioned networks in the attractor state. The x-axis runs over all 998 ROIs. Though the number of disabled nodes in both cases is almost the same, we find activity levels in the case of diffuse lesions to be markedly higher than in the case of a focal lesion (of course, both conditions have diminished activity compared to the healthy network). The same can be seen from the histogram on the right-hand side of **Figure 11**, showing the total difference in mean firing between the diffuse and focal lesion activity for each noise amplitude, integrated over all ROIs. Again, the columns are on the positive side of the x-axis. Thus, the connectome network shows more resilience to diffuse lesions than to focal lesions with a similar number of nodes. This is presumably due to the wiring architecture of the brain, which allows for alternate pathways in order to protect against random damage.
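The quantities compared in **Figures 10**, **11** can be written compactly. The notation below is introduced here for illustration and does not appear in the original figures: $\bar{r}_i^{X}(\sigma)$ denotes the mean firing rate of ROI $i$ over the last 2 s in condition $X$ (RS for healthy resting-state, FL for focal lesion, DL for diffuse lesion) at noise amplitude $\sigma$, and "integrated over all ROIs" is read as a sum over the $N = 998$ ROIs.

$$
\Delta_i^{\mathrm{RS,FL}}(\sigma) = \bar{r}_i^{\mathrm{RS}}(\sigma) - \bar{r}_i^{\mathrm{FL}}(\sigma),
\qquad
\Delta_i^{\mathrm{DL,FL}}(\sigma) = \bar{r}_i^{\mathrm{DL}}(\sigma) - \bar{r}_i^{\mathrm{FL}}(\sigma)
$$

$$
\Delta_{\mathrm{total}}^{X,\mathrm{FL}}(\sigma) = \sum_{i=1}^{N} \Delta_i^{X,\mathrm{FL}}(\sigma)
= \sum_{i=1}^{N} \bar{r}_i^{X}(\sigma) - \sum_{i=1}^{N} \bar{r}_i^{\mathrm{FL}}(\sigma),
\qquad X \in \{\mathrm{RS}, \mathrm{DL}\}
$$

Under this reading, the per-ROI differences $\Delta_i$ are what the plots in **Figure 10** display, and the totals $\Delta_{\mathrm{total}}$ are the columns of the center and right histograms in **Figure 11**; a positive total means the first condition has higher overall firing than the focally lesioned network.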

#### **3.3. CAUSAL EFFERENTS OF TMS PERTURBATIONS**

Non-invasive physiological perturbations of specific brain areas using transcranial magnetic stimulation (TMS) have successfully been used for probing neural circuits and their functions. They can be applied either to excite or to completely inhibit a given brain area, in the presence or in the absence of a task. What we want to computationally reconstruct in BrainX3 are the causal efferents of the evoked activity due to this stimulation. In **Figure 12** we show results for an inhibitory stimulation applied to all the nodes in areas rCUN, rLOCC and rPCUN (the same regions on which we earlier simulated a focal lesion), with a network noise amplitude of 0.01. TMS is applied during the first 5 s of the simulation and the network returns to resting-state once stimulation is discontinued. The bottom right plot in **Figure 12** shows how this affects the time-series of the same three seed nodes we used (rCAC (black), rISTC (green) and lPCUN (magenta)), which are connected to but not part of the perturbed regions. The change

**FIGURE 8 | Simulation of lesioned (diffuse) brain activity vs. noise.** Snapshot of neural activity at a single time-point in BrainX3 following a diffuse lesion. The lesion was simulated by disconnecting 50 randomly selected nodes. From top to bottom, noise amplitudes: **(A)** 0.01, **(B)** 0.05, **(C)** 0.07, and **(D)** 0.1. Each row shows screenshots from the posterior, superior and lateral perspectives. The color bar on the right represents neuronal activity in Hz (warmer colors represent higher activation of the nodes and lesioned nodes are shown in black).

in the firing rate is of the order of 10–20 Hz and upon removing the stimulation, we find that the network returns to resting-state activity in about 40 ± 10 ms. The left diagram in **Figure 12** shows all 998 nodes, with the stimulated nodes in gray and the colors in all other nodes denoting the difference in the average firing rate (averaged over 2 s) for each node after and before the perturbation. The averaging is done to take into account variations due to noise. The plot on the top right of **Figure 12** shows the exact differences (in red) in average firing rate, after minus before, for each of the 998 nodes (the stimulated areas are shaded in gray), with the black, green and magenta markers referring to the seed nodes. Hence, values above the zero line correspond to efferent areas of the network that are inhibited during TMS, whereas values below the zero line correspond to efferents that are actually excited during TMS.
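The "after minus before" measure can be sketched as below. This is an illustrative Python fragment under stated assumptions, not the analysis code behind **Figure 12**: it assumes traces sampled every 1 ms, takes "before" to be the 2 s window ending at TMS offset (i.e., during stimulation) and "after" to be the 2 s window starting at TMS offset, and uses synthetic two-node data. The exact placement of the averaging windows is not specified in the text, so this choice is hypothetical.

```python
def window_mean(samples, start_ms, stop_ms):
    """Mean of a trace sampled every 1 ms over the interval [start_ms, stop_ms)."""
    window = samples[start_ms:stop_ms]
    return sum(window) / len(window)


def after_minus_before(series, tms_off_ms, window_ms=2000):
    """Per-node change in mean firing rate around TMS offset.

    Positive values: the node fired less during stimulation (inhibited).
    Negative values: the node fired more during stimulation (excited).
    """
    return [
        window_mean(s, tms_off_ms, tms_off_ms + window_ms)
        - window_mean(s, tms_off_ms - window_ms, tms_off_ms)
        for s in series
    ]


# Synthetic data: TMS during the first 5 s of a 10 s run, 1 ms sampling.
inhibited_node = [5.0] * 5000 + [15.0] * 5000   # suppressed while TMS is on
excited_node = [20.0] * 5000 + [15.0] * 5000    # elevated while TMS is on

print(after_minus_before([inhibited_node, excited_node], tms_off_ms=5000))
# [10.0, -5.0]
```

With this sign convention, inhibited efferents appear above the zero line and excited efferents below it, matching the reading of the top right plot described above.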

Clearly, the results show that areas anatomically closer to the perturbed regions are most affected, but they also show specific long-range connections in the frontal, temporal and limbic lobes that are affected by stimulating areas rCUN, rLOCC, and rPCUN (in the occipital and parietal lobes). **Figure 12** shows that ROIs in regions rPARC, rCAC, rISTC, rPC, rSP, rIP, and rLING are strongly inhibited. Since the stimulated areas here are exactly the same ones that we lesioned for simulating stroke dynamics, the map of efferents we find after TMS is also part of the affected pathways following the lesion. As described in the next subsection and **Figure 13**, in BrainX3 we can extract these efferents explicitly in a 3D reconstruction.

Though most of the TMS efferents are inhibited during the stimulation phase, interestingly, a small number of them are also excited, showing an average firing rate higher than the nonperturbed (resting-state) value. These occur sparsely in regions rLOF, rRMF, rCMF, lLOF, lPOPE, and lFUS. A possible explanation for the occurrence of these excitations is that these ROIs were the ones that were anti-correlated to the stimulated nodes, when the network was in the resting-state.

#### **3.4. PATHFINDING IN THE BRAIN**

Besides simulation, another utility of BrainX3 is that it can be customized for real-time analysis and circuit extraction. This can be done either by analyzing output signals of neural activity from the simulation or by implementing graph-theoretic algorithms on the network. Here, we provide an example of bookmarking pathways efferent to the focal lesion discussed above. Bookmarking in BrainX3 can simply be done using natural gestures. In **Figure 13** we trace the connectivity span (within the healthy dataset) of all three areas that we had lesioned earlier. All edges emanating from the previously lesioned ROIs are bookmarked in thick black, giving a clear spatial impression of the extent of the lesion on the network. Though the lesion lies only in the occipital and parietal lobes, its effects are felt as far as the frontal, temporal and limbic lobes. Extracting circuits this way is intuitive and user-controllable, compared to automated processes based on correlation data.
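The bookmarking operation described above amounts to collecting every edge that emanates from a chosen set of ROIs. A minimal Python sketch is given below; it is not the gesture-driven BrainX3 implementation, and the matrix representation, the function name `efferents` and the toy chain network are assumptions made for this example.

```python
def efferents(weights, seeds):
    """Collect all edges emanating from the seed nodes.

    weights: square connectivity matrix as a list of lists (nonzero = connected).
    Returns (edges, reached): every (seed, neighbour) pair, and the sorted list
    of non-seed nodes that the seeds connect to directly.
    """
    seeds = set(seeds)
    edges = sorted(
        (i, j)
        for i in seeds
        for j, w in enumerate(weights[i])
        if w != 0
    )
    reached = sorted({j for _, j in edges} - seeds)
    return edges, reached


# Toy network: a chain 0 - 1 - 2 - 3 - 4 (not connectome data).
chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]

edges, reached = efferents(chain, [1, 2])
print(edges)    # [(1, 0), (1, 2), (2, 1), (2, 3)]
print(reached)  # [0, 3]
```

In the setting of **Figure 13**, the seeds would be the ROIs of rCUN, rLOCC and rPCUN, the `edges` list would correspond to the connections drawn in thick black, and `reached` would list the directly connected ROIs outside the selected regions.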

#### **4. DISCUSSION**

As techniques of quantitative analysis and measurement devices in neuroscience improve, it is becoming more evident that the role of large-scale dynamics and whole-brain measures cannot be ignored. Functional correlations by themselves are insufficient for inference of mechanisms and principles underlying brain function. Large-scale temporal activity maps across structurally connected brain areas are more informative of whole-brain circuit mechanisms. Being able to predict these maps by implementing realistic biophysical dynamics brings us closer to identifying the neural correlates of cognitive functions. BrainX3 is a small step in this direction. It opens the possibility of analyzing neural activity propagation due to causal dynamics. Being immersive, it gives a much better intuitive anatomical perspective of the brain than a 2D atlas would.

**FIGURE 11 | Comparison of total mean firing rates between healthy and lesioned brains.** The histogram on the left shows the total mean firing rate of the healthy resting-state network (Total RS*FR*), integrated over all ROIs, for each value of noise amplitude. The histogram at the center shows the total difference in mean firing between the healthy and focally lesioned networks (Total RS*FR* - Total FL*FR*) for each noise amplitude, integrated over all ROIs and the histogram on the right shows the same difference but for the diffuse vs. the focally lesioned networks (Total DL*FR* - Total FL*FR*).

BrainX3, as we have shown in this paper, is a platform for data visualization, simulation, analysis and interaction, which combines computational power with human intuition in representing and interacting with large complex data. For the human connectome network above, we have shown an anatomically-spaced 3D simulation of whole-brain neural activity, based on the dynamical mean-field model, which was earlier tested in Deco et al. (2013) for resting-state dynamics. The results shown included the resting-state network, lesioned brains as well as externally stimulated networks. Our simulations above shed some light on the spatial distribution of activity in the attractor state, how it maintains a level of resilience to damage, and the effects of noise and physiological perturbations. Specifically, we found that a noisy network seems to favor a low firing attractor. This is simply a consequence of the detailed biophysics of our model. Interestingly, both computational and empirical studies in the literature have claimed that an increase in neural noise (in the form of random background activity) is associated with aging brains, which show a lower signal-to-noise ratio and less distinctive cortical representations, leading to reduced information processing (Li et al., 2001; Li and Sikström, 2002; Hong and Rebec, 2012). In particular, fMRI data in D'Esposito et al. (1999) and Huettel et al. (2001) show fewer activated voxels and an increase in noise in older participants, compared to younger ones. Our observation about the effect of noise on neural firing is consistent with this literature, and as future research we plan to model neural dynamics in aging

**FIGURE 13 | Pathfinding in the brain.** Extracting efferents of selected regions in BrainX3, shown in the top figure. The selected regions are the rCUN, rLOCC, and rPCUN. All paths emerging from these regions are traced in thick black. Screenshots refer to the posterior (top left) and superior view (top right). The bottom figure shows a reference atlas with labels of brain regions in the posterior (bottom left) and superior (bottom right) sides of the network. The colors in the atlas refer to major lobes: frontal (blue), temporal (green), occipital (red), parietal (orange) and cingulate cortex (purple).

brains. We also found that a noisy network is less resilient to focal lesions. Between diffuse and focal lesions, the connectome network shows more resilience to the former, suggesting that the brain's wiring architecture provides alternate pathways for propagation of activity in order to protect against non-localized damage. Our results on TMS perturbations generate a temporal sequence of causal activations, which, in the example of stimulating regions in the occipital and parietal lobes, map to efferent areas that presumably constitute a functional pathway. Interestingly, we also noticed that even though TMS inhibits most of the network, it also sparsely excites a few regions. Presumably, these are the regions anti-correlated to the perturbed ROIs. This suggests that even a lesioned network can show increased activity over sparsely distributed brain areas, compared to healthy brain networks. Knowledge of these active areas can be clinically useful for assessing levels of consciousness in patients with severe brain injury. These observations demonstrate the role of BrainX3 as a hypothesis generator. As is often the case with complex data, one might not always have a specific hypothesis to start with. Instead, discovering meaningful patterns and associations in big data might be a necessary incubation step for formulating well-defined hypotheses.

BrainX3 is not only a generator of simulated data of dynamical processes in complex networks, but it also provides a natural user interaction paradigm (including user immersion and gesture-based inputs) for the visualization and exploration of complex network datasets. In previous work, we have validated BrainX3 against standard desktop-based visualization and simulation tools and found that our system is better at supporting structural understanding of the data, based on the performance of subjects on a memory task (Betella et al., 2013, 2014a,b). As future applications of our technology, we foresee online user-interaction with simulations as a step toward virtual brain surgery, enabling a surgeon to try out several surgical procedures and assess risk factors on models based on the patient's data before actually performing the surgery. However, to be useful for any form of precision surgery, besides improving usability and integration with other input/output devices relevant for surgery, the size of the simulation will have to be significantly scaled to much finer resolutions matching those of surgical standards, and even more detailed biophysical models will have to be used (including plasticity and pharmacological inputs). This would mean working with networks having millions of nodes and proportionately many more connections (such as from precision microscopy), which would require optimizing BrainX3 with parallel computing. This is the next step in the development of BrainX3: scaling and optimizing the simulation for very large networks.

## **AUTHOR CONTRIBUTIONS**

XA, RZ, AB, and PV contributed to the design, analysis, interpretation and writing of the manuscript. EM, PO, and DD contributed to the technical implementation. GD contributed to the analysis.

## **FUNDING**

This research has been supported by the EC FP7 project, CEEDS (FP7-ICT-2009-5), under grant agreement n. 258749.

## **ACKNOWLEDGMENTS**

We thank Olaf Sporns and Chris Honey for sharing the DSI data, Microsoft Inc. for the Kinect™ v2 sensor and SDK, provided under the Developer Preview Program, and Laura Serra for technical support.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 12 September 2014; accepted: 02 February 2015; published online: 24 February 2015.*

*Citation: Arsiwalla XD, Zucca R, Betella A, Martinez E, Dalmazzo D, Omedas P, Deco G and Verschure PFMJ (2015) Network dynamics with BrainX3: a large-scale simulation of the human brain network with real-time interaction. Front. Neuroinform. 9:02. doi: 10.3389/fninf.2015.00002*

*This article was submitted to the journal Frontiers in Neuroinformatics.*

*Copyright © 2015 Arsiwalla, Zucca, Betella, Martinez, Dalmazzo, Omedas, Deco and Verschure. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*