# ANATOMY AND PLASTICITY IN LARGE-SCALE BRAIN MODELS

EDITED BY: Markus Butz, Wolfram Schenck and Arjen van Ooyen PUBLISHED IN: Frontiers in Neuroanatomy

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-065-7 DOI 10.3389/978-2-88945-065-7


## **ANATOMY AND PLASTICITY IN LARGE-SCALE BRAIN MODELS**

Topic Editors:

**Markus Butz,** Jülich Research Center, Germany **Wolfram Schenck,** Bielefeld University of Applied Sciences, Germany **Arjen van Ooyen,** VU University Amsterdam, Netherlands

3D-PLI based fiber orientation map of a section through simulated fibers modeling the optic chiasm of a hooded seal (Axer et al., this E-book)

Supercomputing facilities are becoming increasingly available for simulating activity dynamics in large-scale neuronal networks. On today's most advanced supercomputers, networks with up to a billion neurons can be readily simulated. However, building biologically realistic, full-scale brain models requires more than just a huge number of neurons. In addition to network size, the detailed local and global anatomy of neuronal connections is of crucial importance. Moreover, anatomical connectivity is not fixed, but can rewire throughout life (structural plasticity), an aspect that is missing in most current network models, in which plasticity is confined to changes in synaptic strength (synaptic plasticity).

The papers in this Ebook, which may broadly be divided into three themes, aim to bring together high-performance computing with recent experimental and computational research in neuroanatomy. In the first theme (fiber connectivity), new methods are described for measuring and data-basing microscopic and macroscopic connectivity. In the second theme (structural plasticity), novel models are introduced that incorporate morphological plasticity and rewiring of anatomical connections. In the third theme (large-scale simulations), simulations of large-scale neuronal networks are presented with an emphasis on anatomical detail and plasticity mechanisms. Together, the articles in this Ebook make the reader aware of the methods and models by which large-scale brain networks running on supercomputers can be extended to include anatomical detail and plasticity.

**Citation:** Butz, M., Schenck, W., van Ooyen, A., eds. (2016). Anatomy and Plasticity in Large-Scale Brain Models. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-065-7

# Table of Contents

### **INTRODUCTION**

*Editorial: Anatomy and Plasticity in Large-Scale Brain Models*

Markus Butz, Wolfram Schenck and Arjen van Ooyen

### **FIBER CONNECTIVITY**


Nicole Schubert, Markus Axer, Martin Schober, Anh-Minh Huynh, Marcel Huysegoms, Nicola Palomero-Gallagher, Jan G. Bjaalie, Trygve B. Leergaard, Mehmet E. Kirlangic, Katrin Amunts and Karl Zilles

### **STRUCTURAL PLASTICITY**


Sandra Diaz-Pier, Mikaël Naveau, Markus Butz-Ostendorf and Abigail Morrison

*Structural Plasticity, Effectual Connectivity, and Memory in Cortex*

Andreas Knoblauch and Friedrich T. Sommer

### **LARGE-SCALE SIMULATIONS**

*Anatomically Detailed and Large-Scale Simulations Studying Synapse Loss and Synchrony Using NeuroBox*

Markus Breit, Martin Stepniewski, Stephan Grein, Pascal Gottmann, Lukas Reinhardt and Gillian Queisser


Shanglin Zhou, Michele Migliore and Yuguo Yu

*Closed-Loop Brain Model of Neocortical Information-Based Exchange*

James Kozloski

# Editorial: Anatomy and Plasticity in Large-Scale Brain Models

Markus Butz<sup>1</sup>, Wolfram Schenck<sup>2</sup> and Arjen van Ooyen<sup>3</sup>\*

<sup>1</sup> Simulation Laboratory Neuroscience, Bernstein Facility for Simulation and Database Technology, Institute for Advanced Simulation, Jülich Aachen Research Alliance, Jülich Research Center, Jülich, Germany, <sup>2</sup> Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, Bielefeld, Germany, <sup>3</sup> Department of Integrative Neurophysiology, VU University Amsterdam, Amsterdam, Netherlands

Keywords: brain models, simulation, supercomputing, high-performance computing, anatomy, connectivity, structural plasticity

**The Editorial on the Research Topic**

**Anatomy and Plasticity in Large-Scale Brain Models**

### INTRODUCTION

Supercomputing facilities are becoming increasingly available for simulating electrical activity in large-scale neuronal networks. On today's most advanced supercomputers, networks with up to a billion neurons can be readily simulated. However, building biologically realistic, full-scale brain models requires more than just a huge number of neurons. In addition to network size, the detailed local and global anatomy of neuronal connections is of crucial importance. Moreover, anatomical connectivity is not fixed, but can rewire throughout life (structural plasticity; Butz et al., 2009), an aspect that is missing in most current network models, in which plasticity is confined to changes in synaptic strength (synaptic plasticity).

The papers in this research topic, which may broadly be divided into three themes, aim to bring together high-performance computing with recent experimental and computational research in neuroanatomy. In the first theme (fiber connectivity), new methods are described for measuring and data-basing microscopic and macroscopic connectivity. In the second theme (structural plasticity), novel models are introduced that incorporate morphological plasticity and rewiring of anatomical connections. In the third theme (large-scale simulations), simulations of large-scale neuronal networks are presented with an emphasis on anatomical detail and plasticity mechanisms. Together, the papers in this research topic contribute to extending high-performance computing in neuroscience to encompass anatomical detail and plasticity.

Edited and reviewed by:

Javier DeFelipe, Cajal Institute, Spain

\*Correspondence: Arjen van Ooyen arjen.van.ooyen@gmail.com

Received: 22 September 2016 Accepted: 20 October 2016 Published: 07 November 2016

#### Citation:

Butz M, Schenck W and van Ooyen A (2016) Editorial: Anatomy and Plasticity in Large-Scale Brain Models. Front. Neuroanat. 10:108. doi: 10.3389/fnana.2016.00108

### FIBER CONNECTIVITY

Investigating the brain's connectivity requires multiscale approaches and hence strategies for integrating data across different spatial scales. Axer et al. demonstrate how to bridge microscopic visualizations of fibers obtained by 3D-PLI (polarized light imaging; Axer et al., 2011) to meso- or macroscopic fiber orientations based on dMRI (diffusion magnetic resonance imaging). A relatively new technique, 3D-PLI is applicable to microtome sections of postmortem brains and uses birefringence of brain tissue, induced by optical anisotropy of the myelin sheaths around axons, to derive a 3D description of the underlying fiber architecture. To be able to link 3D-PLI to dMRI measurements, the authors introduce fiber orientation distribution functions (ODFs) extracted from 3D-PLI. They demonstrate the validity of their approach with simulated 3D-PLI data as well as real 3D-PLI data from the human brain and the brain of a hooded seal.

Capturing different aspects of brain organization, such as connectivity and molecular composition, necessitates the use of different neuroimaging techniques. To subsequently integrate the multiscale and multimodal data into a complete 3D brain model requires an accurate definition of the spatial positions of structural entities. Defined by MRI, the Waxholm Space (WHS) (http://software.incf.org/software/waxholm-space) provides such a reference space for rodent brain data. The aim of the study by Schubert et al. was to extend the WHS rat brain atlas with information about cytoarchitecture, receptor expression and spatial orientation of fiber tracts, derived from autoradiography and PLI images. To incorporate these distinct classes of information into the WHS, the authors improved currently available registration algorithms to align sections and to correct for deformations. The extended WHS rat brain atlas now enables combined studies on receptor and cell distributions as well as fiber densities in the same anatomical structures at microscopic scales. Furthermore, the methods developed facilitate future integration of data of other modalities.

### STRUCTURAL PLASTICITY

According to the long-standing connectionists' dogma, information is stored in the connection weights of neural networks. However, in biological neural networks, a synaptic connection is much more than just a weight factor, and a wealth of biological mechanisms can bring about changes in functional and structural connectivity. The following papers deal with models that go beyond the traditional modeling concept of plasticity as merely up- or down-regulating synaptic connection strength (synaptic plasticity).

Fauth and Tetzlaff review experimental and modeling studies that address the formation and deletion of synapses (structural plasticity) and how this is regulated by electrical activity. Whereas adapting existing synapses in a Hebbian manner predominantly serves memory consolidation, the de novo formation of synapses and deletion of existing synapses can, together with other synaptic mechanisms, grant stability to a network facing continuously changing inputs. The authors point out that these different plasticity mechanisms may be involved in different neural functions and, remarkably, may respond very differently to changes in neuronal activity.

Changes in the anatomical layout of connections (structural plasticity) are crucially dependent on morphological changes in individual neurons. A lack of understanding of how structural changes affect network function is partly due to the absence of tools for studying the impact of neuronal morphology on neuronal function. Therefore, Bozelos et al. introduce a simulation tool for systematically varying the morphology of any type of neuron, which will help investigate the role of neuronal morphology and morphological changes in large-scale neuronal networks.

Diaz-Pier et al. go beyond the level of single neurons and describe an approach based on a recent model of homeostatic structural plasticity (Butz and van Ooyen, 2013), which for the first time enables growing from scratch the neuronal network connectivity of a cortical column in silico. The resulting connectivity shows remarkable similarities with a real cortical column.

In biologically realistic, sparsely connected cortical networks, Knoblauch and Sommer study what the computational contribution of structural plasticity is to memory formation based on synaptic plasticity. They show that neuronal networks with structural plasticity can more efficiently adapt their connectivity to the computational problem at hand. As a consequence, structural plasticity may significantly increase the number of stably maintained memory items and may even account for psychological phenomena such as the spacing effect in rehearsal learning (Greene, 1989).

### LARGE-SCALE SIMULATIONS

The third theme concerns large-scale simulations, with a special focus on anatomic detail and/or plasticity mechanisms. Because of the high computational demands of such simulations, it is crucial to have the right hardware and software tools available. Breit et al., Gosui and Yamazaki, and Knight et al. report on recent developments in this area. Furthermore, Zhou et al. show how to successfully employ established neuro-simulators for this purpose, and Kozloski how to move up to full brain-scale models.

Breit et al. developed the novel simulation framework NeuroBox and applied it to the investigation of the interplay between synapse loss and signaling synchrony. NeuroBox provides a link between the description of the morphology of neurons and networks on the one hand and the simulation of such networks in the simulation framework UG4 (Vogel et al., 2013) on the other hand. This combination enables the simulation of anatomically detailed models of large networks on supercomputers with good scaling behavior.

Gosui and Yamazaki tackle the problem of how to carry out long-term simulations within a manageable time frame. They studied the long-term gain adaptation of optokinetic response eye movements over a period of 5 days of simulated time in a large-scale cerebellar model. By implementing the simulation software on highly parallel processors (graphics processing units; GPUs), the simulation could be carried out in real-world time. This approach opens up new opportunities for studying long-term plasticity in neural networks, e.g., in the context of memory consolidation.

Knight et al. demonstrate that large-scale models that incorporate complex and demanding plasticity mechanisms (in this case, based on the Bayesian Confidence Propagation Neural Network paradigm; Lansner and Holst, 1996) can be simulated in a highly power-efficient way on the SpiNNaker neuromorphic architecture. Furthermore, the network in this study was the largest plastic neural network ever simulated on neuromorphic hardware. This shows that (some) neuromorphic hardware is already up to the task of simulating detailed anatomy and plasticity in large-scale brain models.

Zhou et al. rely in their work on the established neurosimulator NEURON (Hines and Carnevale, 2001). They implemented a large-scale model of the olfactory bulb that includes plasticity mechanisms. They investigated how prior odor experience influences the representation of new odor inputs. The results show that prior experience changes the pattern in which sparse responses occur in different sub-networks within the olfactory bulb. From a methodological point of view, this study demonstrates how to set up a large-scale simulation of a plastic neural network with rather detailed anatomy.

Kozloski positions his modeling and simulation work at brain scale, comprising the neocortex, basal ganglia and thalamus. The proposed model is based on the principle of information-based exchange and does not cover anatomical details. Nevertheless, it is grounded in neuroanatomical observations and accounts for various forms of plasticity. Modeling at this rather abstract level enables long-term simulations at brain scale and extensive parameter variations. In this specific study, variations in dynamic set points and modulations were investigated, leading to a theory about the emergence of neurodegenerative diseases.

### CONCLUSION

Biologically realistic large-scale brain models require, besides a huge number of neurons, the right layout of local and global connections. Moreover, they need to capture the plasticity mechanisms operative in the adult brain. These mechanisms include, in addition to changes in the strength of synapses, structural modifications in anatomical connections. With the articles in this research topic, we hope that the reader will become aware of the methods and models by which large-scale brain networks running on supercomputing facilities may be extended to include anatomical detail and the full plastic potential of the brain.

### AUTHOR CONTRIBUTIONS

MB, WS, and AvO wrote the Editorial.

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Butz, Schenck and van Ooyen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Estimating Fiber Orientation Distribution Functions in 3D-Polarized Light Imaging

Markus Axer<sup>1</sup>\*†, Sven Strohmer<sup>2,3</sup>†, David Gräßel<sup>1</sup>, Oliver Bücker<sup>2</sup>, Melanie Dohmen<sup>1</sup>, Julia Reckfort<sup>1</sup>, Karl Zilles<sup>1,4,5</sup> and Katrin Amunts<sup>1,6</sup>

<sup>1</sup> Research Centre Jülich, Institute of Neuroscience and Medicine, Jülich, Germany, <sup>2</sup> Jülich Supercomputing Centre, Institute for Advanced Simulation, Research Centre Jülich, Jülich, Germany, <sup>3</sup> Research Centre Jülich, Simulation Lab Neuroscience, Bernstein Facility for Simulation and Database Technology, Institute for Advanced Simulation, Jülich, Germany, <sup>4</sup> Department of Psychiatry, Psychotherapy, and Psychosomatics, RWTH Aachen University, Aachen, Germany, <sup>5</sup> JARA Jülich-Aachen Research Alliance, Translational Brain Medicine, Aachen, Germany, <sup>6</sup> C. and O. Vogt Institute for Brain Research, Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany

#### Edited by:

Wolfram Schenck, University of Applied Sciences Bielefeld, Germany

#### Reviewed by:

José A. Armengol, University Pablo de Olavide, Spain; Richard S. Nowakowski, Florida State University, USA; Kathleen S. Rockland, Boston University School of Medicine, USA

#### \*Correspondence:

Markus Axer m.axer@fz-juelich.de

† These authors have contributed equally to this work.

Received: 23 December 2015 Accepted: 29 March 2016 Published: 19 April 2016

#### Citation:

Axer M, Strohmer S, Gräßel D, Bücker O, Dohmen M, Reckfort J, Zilles K and Amunts K (2016) Estimating Fiber Orientation Distribution Functions in 3D-Polarized Light Imaging. Front. Neuroanat. 10:40. doi: 10.3389/fnana.2016.00040

Research on the human brain connectome requires multiscale approaches derived from independent imaging methods, ideally applied to the same object. Hence, comprehensible strategies for data integration across modalities and across scales are essential. We have established a concept to bridge the spatial scales from microscopic fiber orientation measurements based on 3D-Polarized Light Imaging (3D-PLI) to meso- or macroscopic dimensions. By creating orientation distribution functions (pliODFs) from high-resolution vector data via series expansion with spherical harmonics, utilizing high-performance computing and supercomputing technologies, data fusion with Diffusion Magnetic Resonance Imaging has become feasible, even for a large-scale dataset such as the human brain. We validated our approach with two types of datasets that were converted from fiber orientation maps into pliODFs: simulated 3D-PLI data showing artificial, but clearly defined fiber patterns and real 3D-PLI data derived from sections through the human brain and the brain of a hooded seal.

Keywords: connectome, fiber architecture, human brain, polarized light imaging, 3D-PLI, ODF

### INTRODUCTION

The repertoire of neuroimaging tools able to target neuronal connectivity in both the living and the post mortem brain is continuously growing. Technological developments, in particular in the field of microscopy (Osten and Margrie, 2013), new preparation and labeling methods (Chung et al., 2013; Costantini et al., 2015), and a better understanding of how to process the collected data (Amunts et al., 2013; Silvestri et al., 2015) facilitate this advancement (for a recent review, see Amunts and Zilles, 2015). Several of these techniques address either cellular or even molecular dimensions, e.g., light sheet microscopy, or they provide data at meso- to macroscopic scales, e.g., Diffusion Magnetic Resonance Imaging (dMRI). Thus, the data output generated by different technical approaches and imaging techniques results in different data types, formats, and sizes, and is obtained on different spatial scales. As a consequence, comprehensible comparison across modalities and across scales has become a basic necessity for the neuroscience community.

In the present study, we demonstrate how to bridge the spatial scales from microscopic post mortem fiber visualization and orientation measurements based on 3D-Polarized Light Imaging (3D-PLI; Axer et al., 2011a,b) to meso- or macroscopic dimensions as targeted by dMRI. Our approach enables the propagation of the entire information on the microscopic fiber architecture within individual voxels by means of data fusion. Based on the commonly used strategy of integrating vector data into a comprehensive description by employing Orientation Distribution Functions (ODFs; Bunge, 1982), we introduce the pliODF derived from 3D-PLI. pliODFs benefit from the unique property of 3D-PLI to extract high-resolution 3D vector fields indicating the spatial orientation of single fibers and fiber tracts in unstained brain sections. By transferring the entire vector data within a certain compartment, the so-called super-voxel, into a 3D statistical description (often visualized in the form of a glyph), an efficient downscaling of high-resolution vector-like data becomes feasible. Considering the sheer size of a 3D-PLI dataset that covers a whole human brain (i.e., 2500 sections scanned at a pixel size of 1.3 µm sum up to at least 500 TByte), the development of a method that reliably resamples large-scale microscopic data is of particular importance.
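To make the idea of condensing a super-voxel's vectors into a spherical-harmonic description concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a real spherical-harmonic basis truncated at degree 2 (only the even degrees 0 and 2, since fiber orientations are axial) and estimates each coefficient as the mean of the basis function over all unit vectors in the super-voxel. The function names `real_sh_basis_deg2`, `odf_coefficients`, and `evaluate_odf` are hypothetical, as is the toy super-voxel.

```python
import numpy as np

def real_sh_basis_deg2(u):
    """Real spherical harmonics of degree 0 and 2 evaluated at unit vectors u, shape (N, 3)."""
    u = np.asarray(u, dtype=float)
    x, y, z = u[:, 0], u[:, 1], u[:, 2]
    c0 = 0.5 / np.sqrt(np.pi)            # Y_0,0
    c2a = 0.5 * np.sqrt(15.0 / np.pi)    # prefactor of Y_2,-2, Y_2,-1, Y_2,1
    c2b = 0.25 * np.sqrt(5.0 / np.pi)    # prefactor of Y_2,0
    c2c = 0.25 * np.sqrt(15.0 / np.pi)   # prefactor of Y_2,2
    return np.stack([
        np.full_like(x, c0),
        c2a * x * y,
        c2a * y * z,
        c2b * (3.0 * z**2 - 1.0),
        c2a * x * z,
        c2c * (x**2 - y**2),
    ], axis=1)

def odf_coefficients(vectors):
    """Series coefficients: mean of each basis function over the super-voxel's vectors."""
    return real_sh_basis_deg2(vectors).mean(axis=0)

def evaluate_odf(coeffs, directions):
    """Evaluate the truncated series at the given unit directions."""
    return real_sh_basis_deg2(directions) @ coeffs

# Hypothetical super-voxel: all fibers run along the x-axis (both signs, axial data).
super_voxel = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]] * 50)
coeffs = odf_coefficients(super_voxel)

axes = np.eye(3)  # probe the ODF along x, y and z
odf_x, odf_y, odf_z = evaluate_odf(coeffs, axes)
```

In this toy case the series peaks along the x-axis, as expected. A truncation this low can produce negative side lobes, so a practical implementation would use a higher maximum degree.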

The choice of using ODF-like statistical descriptions of multiple-fiber compartments is based on the fact that recent dMRI methods such as Diffusion Spectrum Imaging, Q-ball Imaging or Spherical Deconvolution (Tuch et al., 2002; Alexander, 2005; Wedeen et al., 2005; Dell'Acqua et al., 2013) also approximate the distribution of fiber orientations within an MRI voxel by means of fiber Orientation Distribution Functions (fODF; Alexander et al., 2002; Hess et al., 2006; Rathi et al., 2009; Assemlal et al., 2011). Using a similar mathematical description is clearly beneficial for multimodal comparisons. Approaches have been reported that aim to extract textural 2D information, i.e., local fiber orientations, from images of histological brain sections stained for myelin in order to create 2D structural ODFs (Leergaard et al., 2010; Budde and Frank, 2012). In this context, small regions of interest were successfully compared to fODFs obtained from dMRI measurements. These studies represent important steps toward a region-based comparison of 2D fiber architecture obtained from different modalities. However, the 3D fiber architecture across large datasets has not been addressed yet.

Here, we benefit from the three-dimensional nature of the polarized light imaging approach, which provides measurements of the 3D fiber structures at the level of individual brain sections. Consequently, a super-voxel, and hence a pliODF, can be composed of arbitrary compartments within a section, but also across aligned neighboring sections, without the need to change the software tools. To demonstrate the validity of our approach, two types of 3D-PLI datasets were transformed into pliODFs: (i) simulated 3D-PLI data showing synthetic, but clearly defined fiber-like patterns and (ii) real 3D-PLI data derived from sections through the human brain and the brain of a hooded seal. The human brain data were selected to highlight the gain of high-resolution imaging of brain regions with challenging fiber compositions such as the complex fiber crossings in the corona radiata or low fiber density regions in the cortex. The chiasm of the hooded seal with its nearly perpendicularly decussating fiber tracts (cf. Dohmen et al., 2015) appeared to be well suited to show the transition from simulation-based ODF generation to the simplest crossing fiber tract constellation observable in real brain tissue.

### MATERIALS AND METHODS

3D-PLI (Axer et al., 2011a,b; Zeineh et al., 2016) has demonstrated its unique capabilities (i) to reveal fiber structures at multiple scales, such as long-range connections and even single fibers and crossings within unstained histological brain sections, and (ii) to determine spatial fiber orientations (i.e., 3D unit vectors) down to the scale of fiber diameters (0.4–15 µm). 3D-PLI is applicable to unstained microtome sections of post mortem brains and utilizes the optical birefringence of nerve fibers, which basically arises from the highly ordered arrangement of lipid molecules in the myelin sheath surrounding most of the axons in the brain. Polarimetric setups (e.g., a polarizing microscope) are employed to carry out birefringence measurements and to give contrast to individual nerve fibers and their tracts. Supported by fundamental principles of optics and dedicated simulation approaches (Dohmen et al., 2015; Menzel et al., 2015), the measured signals are additionally interpreted in terms of spatial fiber orientations by means of unit orientation vector descriptions (**Figure 1**). The algorithms used for the fiber orientation interpretation have been implemented as an automated 3D-PLI analysis workflow suitable for distributed supercomputing, as described by Axer et al. (2011b) and Amunts et al. (2014).

### The Fiber Orientation Map

Application of the 3D-PLI analysis workflow results in a unit vector-based description of the fiber orientation determined for each tissue voxel, referred to as a native voxel. The native voxel dimensions are defined by the image pixel size provided by the optical setup and the thickness of the studied histological brain section (70 µm in the present study). The generation of pliODFs was exclusively performed on 3D-PLI datasets with native voxel dimensions of 1.3 × 1.3 × 70 µm³. Each orientation vector reflects the net effect of all fibers comprised within a voxel. The assembly of all unit vectors represents the fiber orientation map (FOM).

The orientation unit vector $\vec{u}_i$ at voxel location $i$ can be parameterized by two angles, i.e., by spherical coordinates: the direction angle $\varphi_i$, which represents the projection of the principal fiber axis within the sectioning plane, and the inclination angle $\alpha_i$, which is the angle between the principal fiber axis and the sectioning plane (**Figure 1** and Equation 1).

$$
\vec{u}_i = \begin{pmatrix}
\cos \alpha_i \cdot \cos \varphi_i \\
\cos \alpha_i \cdot \sin \varphi_i \\
\sin \alpha_i
\end{pmatrix} \tag{1}
$$
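Equation 1 translates directly into code. The sketch below is illustrative only: the function name `orientation_vector` and the choice of degrees as the input unit are assumptions, not part of the 3D-PLI workflow described here.

```python
import numpy as np

def orientation_vector(direction_deg, inclination_deg):
    """Unit orientation vector(s) from direction and inclination angles (Equation 1).

    Accepts scalars or arrays of equal shape; the last axis of the result
    holds the (x, y, z) components.
    """
    phi = np.deg2rad(direction_deg)       # in-plane direction angle
    alpha = np.deg2rad(inclination_deg)   # out-of-plane inclination angle
    return np.stack([
        np.cos(alpha) * np.cos(phi),
        np.cos(alpha) * np.sin(phi),
        np.sin(alpha),
    ], axis=-1)

# An in-plane fiber along the x-axis and a fiber perpendicular to the section.
in_plane = orientation_vector(0.0, 0.0)
through_plane = orientation_vector(0.0, 90.0)

# A small map of angles yields one unit vector per voxel.
directions = np.array([[0.0, 90.0], [135.0, 45.0]])
inclinations = np.array([[0.0, 0.0], [45.0, -45.0]])
vector_map = orientation_vector(directions, inclinations)
```

By construction every vector has unit length, since $\cos^2\alpha_i(\cos^2\varphi_i + \sin^2\varphi_i) + \sin^2\alpha_i = 1$.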

### Image Generation and Data Acquisition

#### Simulated 3D-PLI Data

The simulation software tool SimPLI (Dohmen et al., 2015; Menzel et al., 2015) was used to generate two synthetic 3D-PLI datasets with known configurations of fiber-like structures. In SimPLI, three main steps of simulation are implemented: (i) the generation of an arbitrary spatial arrangement of synthetic fibers and the discretization into a three-dimensional fiber orientation vector field with a certain resolution (e.g., 70µm isotropic), (ii) the calculation of the transmitted light intensity based on the Jones matrix calculus (Jones, 1941) yielding a synthetic 3D-PLI image series, and (iii) the simulation of environmental clutter arising from the camera and the tissue in the optical path by adding blurring and noise effects.

In order to validate the different methodological steps employed to transfer a FOM into a set of orientation distribution functions, a well-defined template was required that provides unambiguous structural macroscopic and microscopic features in terms of left/right, top/down and in-plane/out-of-plane orientations. This dataset, generated by means of SimPLI, is shown in **Figures 2A–D**. It is composed of a stack of 18 images and comprises birefringent "fibers" forming human-readable structures ("fiber bundles"), such as the capital letter "R" and the "±" sign. The line thickness of the letters (or the thickness of the "fiber bundles") was chosen to be 20 pixels on average. The fiber inclination angles in "R" were all set to α = 0°, while the inclination angles were set to α = +45° and α = −45° for the "+" and "−" sign, respectively. The direction angles ϕ were aligned with the local structures using a right-handed coordinate system, i.e., the horizontal components (e.g., the "−" sign) are identified by ϕ = 0° while the vertical components are represented by ϕ = 90°. The diagonal element of the "R" has a direction of ϕ = 135°. The background is composed of 90° inclined fibers corresponding to light intensity variations equal to zero. This dataset was subjected to the 3D-PLI analysis workflow to extract the corresponding FOM (**Figures 2E**, **3A**).
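Why a background of 90° inclined fibers produces no intensity variation can be illustrated with the sinusoidal signal model commonly used for 3D-PLI in the cited literature. The sketch below is a simplification under stated assumptions: the model form I(ρ) = (I₀/2)·[1 + sin(2ρ − 2ϕ)·sin δ] with δ = δ_max·cos²α is taken from that literature and is not derived in the text above, while the function name `pli_intensity` and the value of `delta_max` are hypothetical.

```python
import numpy as np

def pli_intensity(rho_deg, direction_deg, inclination_deg, i0=1.0, delta_max=0.5):
    """Transmitted intensity versus polarizer rotation angle rho (simplified model).

    delta_max is a hypothetical maximum phase retardation (radians) that lumps
    together section thickness, birefringence and wavelength.
    """
    rho = np.deg2rad(rho_deg)
    phi = np.deg2rad(direction_deg)
    alpha = np.deg2rad(inclination_deg)
    delta = delta_max * np.cos(alpha) ** 2   # retardation shrinks with inclination
    return 0.5 * i0 * (1.0 + np.sin(2.0 * (rho - phi)) * np.sin(delta))

rho = np.arange(0, 180, 10)                        # rotation angles in degrees

background = pli_intensity(rho, 0.0, 90.0)         # fibers perpendicular to the section
minus_sign = pli_intensity(rho, 0.0, -45.0)        # horizontal bar, inclination -45 degrees
plus_horizontal = pli_intensity(rho, 0.0, 45.0)    # horizontal bar, inclination +45 degrees

background_amplitude = background.max() - background.min()
minus_amplitude = minus_sign.max() - minus_sign.min()
```

In this simplified model the background profile is flat, whereas the inclined bars show a clear sinusoidal modulation. Because only cos²α enters, inclinations of +45° and −45° yield identical profiles here.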

A second, more realistic dataset, a model of the optic chiasm of the hooded seal, was generated with SimPLI and subjected to 3D-PLI analysis (cf. **Figure 5F**); it has already been described in detail in Dohmen et al. (2015). This dataset serves as a bridge to the real 3D-PLI datasets described in the following section.

### 3D-PLI Data Obtained from Histological Sections of the Hooded Seal and the Human Brain

Two human brains and the optic chiasm of a hooded seal were immersed in 4% paraformaldehyde. After cryoprotection (with a 20% glycerin solution), the brain tissue was deep frozen at −70 °C and stored at the same temperature until further processing. A whole human brain and the occipital lobe of a human brain were coronally sectioned and the optic chiasm was axially sectioned using a large-scale cryostat microtome (Polycut CM 3500, Leica, Germany), and eventually coverslipped with a 20% glycerin solution. The chosen section thickness was 70 µm. During sectioning, each blockface was imaged using a CCD camera mounted above the brain in order to obtain an undistorted reference for each section (cf. **Figures 5A**, **6A**, **7A**). No staining was applied. In each case, this procedure resulted in a complete series of sections through large tissue samples, which enables a 3D reconstruction. The brains were acquired in accordance with local legal and ethical requirements.

The brain sections were measured in two custom-made polarimetric setups, the large-area polarimeter and the polarizing microscope, providing images with pixel dimensions of 64 × 64µm and 1.3 × 1.3µm, respectively (for technical details refer to Axer et al., 2011b). The datasets were passed to the 3D-PLI analysis workflow to extract the corresponding FOMs. For pliODF generation only the high-resolution data were used, while the 64µm-sized FOMs were used for plausibility checks and reference measures.

### Estimation of Fiber Orientation Distribution Functions (pliODFs)


The fundamental aim was to generate a mathematical description, the orientation distribution function (pliODF), that quantifies the spatial distribution of fiber orientations determined by 3D-PLI within a rectangular compartment. The compartment can be defined in a single FOM or in a series of FOMs and is referred to as a super-voxel. A super-voxel is composed of r × c × s (rows × columns × sections) native voxels. In order to calculate pliODFs, fundamental approaches from materials science (texture analysis in crystallography; Bunge, 1982) and directional statistics (Mardia and Jupp, 2000) were applied and adapted to the needs of 3D-PLI.
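As a minimal illustration of the super-voxel concept for a single section (s = 1), the following Python sketch groups the entries of a toy 2D FOM into r × c blocks. The function name `group_into_supervoxels`, the nested-list representation of the FOM, and the choice to drop incomplete blocks at the image border are assumptions made for this example; they are not taken from the implementation described here.

```python
from collections import defaultdict


def group_into_supervoxels(fom, r, c):
    """Group the entries of a 2D FOM (list of rows) into r x c super-voxels.

    Returns a dict mapping (block_row, block_col) -> list of entries.
    Assumption: incomplete blocks at the right/bottom border are dropped.
    """
    n_rows, n_cols = len(fom), len(fom[0])
    full_rows = (n_rows // r) * r
    full_cols = (n_cols // c) * c
    blocks = defaultdict(list)
    for i in range(full_rows):
        for j in range(full_cols):
            blocks[(i // r, j // c)].append(fom[i][j])
    return dict(blocks)


# Toy 4 x 6 "FOM" whose entries are (row, col) labels instead of vectors.
fom = [[(i, j) for j in range(6)] for i in range(4)]
blocks = group_into_supervoxels(fom, r=2, c=3)
print(len(blocks))       # number of super-voxels
print(blocks[(1, 1)])    # native entries collected in one super-voxel
```

In a real dataset each entry would be an orientation vector (or an inclination/direction pair), and every block would then be passed on to the histogram step.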

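The implemented procedure was based on three steps (cf. **Figure 3**):

(A) the division of a FOM into regular domains (super-voxels),

(B) the creation of a normalized directional histogram on the unit sphere for each super-voxel, and

(C) the approximation of the orientation probability distribution density by fitting the directional histogram with a series expansion using spherical harmonics.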

In the following, steps (A) to (C) will be explained in more detail. As mentioned above (**Figure 1**), a 3D-PLI orientation vector can be expressed in spherical coordinates (the polar angle ϑ = π/2 − α and the azimuth angle ϕ) and defines two points on the unit sphere S<sup>2</sup>. This feature was used to construct a directional histogram aiming at a statistical description of the fiber orientations contained in a super-voxel. To discretize the distribution of fiber orientations, the surface of a unit sphere was subdivided into planar bins (bin centers characterized by latitude and longitude) with a total bin number n calculated by

$$n = \left(\#\ \text{of latitudes}\right) \times \left(\#\ \text{of longitudes}\right) + 2\ \text{polar caps}.\tag{2}$$

n was adapted to the specific requirements of the datasets used. In a final step, the number of vectors falling into each bin was determined (**Figure 3B**).
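The binning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the polar angle is divided into equal steps, the first and last step form the two polar caps, and each remaining latitude band is split into equal longitude sectors. The actual bin geometry used in the study is not specified beyond Equation 2, and the names `bin_index` and `directional_histogram` are hypothetical. Each orientation is counted twice, once for each of the two antipodal points it defines on the sphere.

```python
import math


def bin_index(theta, phi, n_lat, n_lon):
    """Bin index for polar angle theta in [0, pi] and azimuth phi (radians).

    Assumed layout: index 0 is the north polar cap, the last index is the
    south polar cap, and the n_lat * n_lon band bins lie in between.
    """
    step = math.pi / (n_lat + 2)              # assumed: equal steps in theta
    band = min(int(theta / step), n_lat + 1)
    if band == 0:
        return 0
    if band == n_lat + 1:
        return n_lat * n_lon + 1
    sector = int((phi % (2 * math.pi)) / (2 * math.pi) * n_lon) % n_lon
    return 1 + (band - 1) * n_lon + sector


def directional_histogram(orientations, n_lat, n_lon):
    """Count (inclination alpha, direction phi) pairs, given in radians."""
    counts = [0] * (n_lat * n_lon + 2)
    for alpha, phi in orientations:
        theta = math.pi / 2 - alpha           # polar angle from inclination
        counts[bin_index(theta, phi, n_lat, n_lon)] += 1
        # The antipodal point represents the same fiber orientation.
        counts[bin_index(math.pi - theta, phi + math.pi, n_lat, n_lon)] += 1
    return counts


# Three in-plane fibers and one fiber running through the section plane.
sample = [(0.0, math.radians(10.0))] * 3 + [(math.pi / 2, 0.0)]
hist = directional_histogram(sample, n_lat=9, n_lon=18)
print(len(hist), sum(hist))
```

Note that equal-angle bins of this kind do not cover equal areas on the sphere, so a practical implementation would have to account for the bin area when normalizing.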

FIGURE 3 | Three steps toward pliODF generation. (A) First, a FOM is divided into regular domains or super-voxels. The exemplarily enlarged super-voxel contains 40 × 40 × 1 native voxels representing three predominant fiber orientations, which show a relative frequency of occurrence of ∼¼ (blue color), ∼¼ (magenta color) and ∼½ (cyan color), respectively. The color sphere defines the relation between orientation and color-coding. (B) Second, a normalized directional histogram with a discretized binning on a unit sphere is created for each super-voxel. The relative fraction of fiber orientations assigned to a particular bin is reflected by the length of the colored solid angle originating from the middle of the sphere. The symmetry of the histogram with respect to point reflection across the center of the sphere is evident. Here, the total number of bins distributed over the sphere was set to 164 and the three predominant input fiber orientations are still preserved. (C) Third, a spherical harmonics expansion is used to approximate each directional histogram. Depending on the selected depth of expansion (e.g., to the 4th or the 6th band), orientation distribution features might become occluded by interpolation.

Normalization of the directional histogram made it possible to describe the empirical orientation probability distribution density p(ω) mathematically by

$$\frac{dN}{Z} = p\left(\boldsymbol{\omega}\right)d\Omega,\tag{3}$$

with Z being a normalization factor (∫ dN/Z = 1), dΩ the solid angle differential, and ω the spatial direction. p(ω) was expanded into a series of generalized spherical harmonics:

$$\begin{split} p\left(\boldsymbol{\omega}\right) &= \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \sum_{n=-l}^{l} C_{l}^{mn} T_{l}^{mn} \left(\boldsymbol{\omega}\right) \\ &= \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \sum_{n=-l}^{l} C_{l}^{mn} e^{im\varphi_{2}} P_{l}^{mn} \left(\Phi\right) e^{in\varphi_{1}} . \end{split} \tag{4}$$

The P<sub>l</sub><sup>m</sup> represent the associated Legendre polynomials. Due to the existing rotational symmetry (i.e., the determined fiber orientations are invariant with respect to rotation around the axis defined by the unit vector), the general description of the series simplified to an expansion in terms of spherical harmonics Y<sub>l</sub><sup>m</sup>(ϑ, ϕ):

$$p\left(\vartheta,\varphi\right) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} C_{l}^{m} Y_{l}^{m} \left(\vartheta,\varphi\right),\tag{5}$$

with l and m denoting the band index and sub-band index, respectively. A series expansion to the 6th band, for example, means l = 0,... , 6 and −l ≤ m ≤ l (with 1 + 3 + 5 + 7 + 9 + 11 + 13 = 49 members).

The expansion of the empirical orientation probability distribution p(ϑ, ϕ) into a series of real-valued, symmetric spherical harmonics reads:

$$p\left(\vartheta,\varphi\right) = \sum_{l=0}^{\infty} \sum_{m=-2l}^{2l} \mathcal{C}_{2l}^{m} \mathcal{Y}_{2l}^{m} \left(\vartheta,\varphi\right) \approx \sum_{l=0}^{\hat{L}} \sum_{m=-2l}^{2l} \mathcal{C}_{2l}^{m} \mathcal{Y}_{2l}^{m} \left(\vartheta,\varphi\right),\tag{6}$$

with

$$
\frac{1}{2}\left(L+1\right)\left(L+2\right) = \left(2\hat{L}+1\right)\left(\hat{L}+1\right) \tag{7}
$$

coefficients (with 2L̂ = L).
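The coefficient counts in Equations 5 to 7 are easy to check numerically. The short Python sketch below counts the members of a full expansion and of an expansion restricted to even bands, and compares the latter with both sides of Equation 7; the function names are chosen for this example only.

```python
def n_coeffs_full(L):
    """Number of (l, m) pairs for l = 0..L with -l <= m <= l."""
    return sum(2 * l + 1 for l in range(L + 1))


def n_coeffs_even(L):
    """Number of (l, m) pairs when only even bands l = 0, 2, ..., L are kept."""
    if L % 2 != 0:
        raise ValueError("L must be even for the symmetric expansion")
    return sum(2 * l + 1 for l in range(0, L + 1, 2))


for L in (2, 4, 6, 8):
    L_hat = L // 2
    left = (L + 1) * (L + 2) // 2           # left-hand side of Equation 7
    right = (2 * L_hat + 1) * (L_hat + 1)   # right-hand side of Equation 7
    print(L, n_coeffs_full(L), n_coeffs_even(L), left, right)
```

For an expansion to the 6th band this gives 49 members for the full series and 28 coefficients for the symmetric one, so exploiting the point symmetry of the histogram roughly halves the number of unknowns in the fit.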

To generate a pliODF, the coefficients C<sub>2l</sub><sup>m</sup> of the expansion up to the L-th band had to be determined. This was realized using a least-squares fit algorithm, because of its numerical stability and the efficiency of the numerical implementation. Due to discretization and the truncated series expansion, the pliODF only approximates the empirical orientation probability distribution p.
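To make the fitting step concrete, the following Python sketch fits real, even spherical harmonics to values sampled on the sphere by solving the normal equations. Several simplifications are assumptions of this example rather than properties of the implementation described here: the expansion is truncated at band 2 instead of band 6 to keep the basis short, a Fibonacci lattice stands in for the histogram bin centers, and a plain Gaussian elimination replaces whatever least-squares solver was actually used. All function names are hypothetical.

```python
import math


def real_sh_l2(x, y, z):
    """Real orthonormal spherical harmonics of bands 0 and 2 at a unit vector."""
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = 0.5 * math.sqrt(15.0 / math.pi)
    c2 = 0.25 * math.sqrt(5.0 / math.pi)
    c3 = 0.25 * math.sqrt(15.0 / math.pi)
    return [c0, c1 * x * y, c1 * y * z, c2 * (3 * z * z - 1),
            c1 * x * z, c3 * (x * x - y * y)]


def fibonacci_sphere(n):
    """n roughly uniform unit vectors (stand-ins for histogram bin centers)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        rho = math.sqrt(1.0 - z * z)
        pts.append((rho * math.cos(golden * k), rho * math.sin(golden * k), z))
    return pts


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_coefficients(points, values):
    """Least-squares coefficients from the normal equations (B^T B) c = B^T p."""
    basis = [real_sh_l2(*p) for p in points]
    k = len(basis[0])
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(k)]
           for i in range(k)]
    btp = [sum(row[i] * v for row, v in zip(basis, values)) for i in range(k)]
    return solve(btb, btp)


# Synthetic test: sample a known band-limited function and recover it.
points = fibonacci_sphere(200)
true_c = [0.28, 0.0, 0.0, 0.15, 0.0, -0.05]
values = [sum(c * y for c, y in zip(true_c, real_sh_l2(*p))) for p in points]
fitted = fit_coefficients(points, values)
print([round(c, 6) for c in fitted])
```

Because the synthetic input lies exactly in the span of the basis, the fit reproduces the input coefficients up to rounding. A measured histogram contains contributions beyond the truncation band, which is why the pliODF only approximates the empirical distribution.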

#### Computing

Both the size of the processed data and the computationally intensive algorithms to determine the expansion coefficients required a supercomputing environment. For this reason, we used the Juelich Dedicated GPU Environment (JuDGE), hosted by the Jülich Supercomputing Center (JSC), Germany. It was equipped with 206 compute nodes, each consisting of two Intel Xeon Westmere 6-core processors operating at 2.66 GHz and 96 GB of main memory. Each node additionally contained either two NVIDIA Tesla M2050 (Fermi) GPUs or two NVIDIA Tesla M2070 (Fermi) GPUs.


The inter- and intra-node communication was realized by the message passing interface (MPI) protocol.

Runtime measurements were performed for a region of interest in a FOM gained from the human brain tissue [i.e., ROI (1) as depicted in **Figure 6B**], which was composed of 3712 × 4576 orientation vectors. The number of compute cores was set to 72 in order to keep the compute time within an acceptable range while still clearly exposing the differences in runtime. The number of bands and the size of super-voxels were taken as parameters.

#### Visualization

ODFs are typically visualized either (i) by means of a textured sphere, where the color of a point on the surface represents the probability of its corresponding orientation, or (ii) by simply scaling the surface with the probability p. To visualize the generated pliODFs, the second option was applied in combination with the color-coding scheme also used to visualize different fiber orientations in a FOM, with the surface intensity increasing with the probability (**Figure 3C**). The scheme is based on the RGB color space with the red channel representing the x-component, the green channel representing the y-component, and the blue channel representing the z-component of the fiber orientation in the reference coordinate frame. The peaks of the pliODF shape and the color-coding reflect the most common fiber directions such that both the pliODFs and the FOMs are visually comparable.
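The color-coding described above can be illustrated with a small Python sketch that converts an inclination angle α and a direction angle ϕ into an 8-bit RGB triple. Two points are assumptions of this example: the z-axis is taken as the out-of-plane axis (consistent with ϑ = π/2 − α), and the absolute value of each component is used so that a vector and its antipode receive the same color. The exact mapping of the published color sphere may differ, and the function name `orientation_to_rgb` is hypothetical.

```python
import math


def orientation_to_rgb(alpha, phi):
    """Map inclination alpha and direction phi (radians) to an (R, G, B) triple.

    Assumption: x and y span the section plane, z is the out-of-plane axis,
    and absolute values are used because fiber orientations have no sign.
    """
    x = math.cos(alpha) * math.cos(phi)
    y = math.cos(alpha) * math.sin(phi)
    z = math.sin(alpha)
    return tuple(round(255 * abs(v)) for v in (x, y, z))


print(orientation_to_rgb(0.0, 0.0))                 # in-plane, along x
print(orientation_to_rgb(0.0, math.pi / 2))         # in-plane, along y
print(orientation_to_rgb(math.pi / 2, 0.0))         # perpendicular to the section
print(orientation_to_rgb(math.radians(45.0), 0.0))  # inclined by 45 degrees
```

Under these assumptions, purely horizontal, vertical, and through-plane fibers map to pure red, green, and blue, respectively, while inclined fibers receive mixed colors.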

### RESULTS

### Simulated 3D-PLI Data

The FOM obtained from the simulated dataset was divided into super-voxels of 10 × 10 × 1, 20 × 20 × 1, and 40 × 40 × 1 native voxels, respectively, to create pliODFs. These clusters were chosen (i) to demonstrate the accurate interpretation of the input vectors including the conservation of the coordinate system and (ii) to investigate how different super-voxel dimensions affect the shape of the resulting pliODF representations. Therefore, super-voxel dimensions close to the (fiber) structure dimensions (here: about 20 pixels, equivalent to 36 microns; cf. **Figure 3A**) were of specific interest. The binning of the directional histogram was set to (9 latitudes × 18 longitudes + 2 polar caps) = 164 bins.

As demonstrated in **Figures 3A–C** on the basis of an exemplary 40 × 40 × 1 super-voxel, the implemented resampling procedure of high-resolution FOMs provided a comprehensible description of the distribution of fiber orientations, in the form of a directional histogram or a pliODF. The comparison of the pliODFs based on different levels of expansion with the corresponding directional histogram suggested that, in the studied cases and with the selected bin sizes, the approximation of the probability density function with spherical harmonics should be carried out to at least the 6th band in order to reliably resolve orthogonal contributions (e.g., cyan and magenta).

The resampling results for different super-voxel dimensions are shown in **Figures 4B–D**. Compared to the original unit vector description of the fiber orientations, the peaks of the pliODFs obtained from the small super-voxels (**Figures 4B,C**) reflect the main underlying (microscopic) fiber orientations corroborated by the matching colors. In addition, the general (macroscopic) orientations of the letters agree with the orientations of the input structures. As expected, the complexity of the pliODF shapes increases in larger samples (**Figure 4D**), maintaining the major portions of fiber orientations.

FIGURE 5 | Real and simulated brain section from the hooded seal. (A) Blockface image of the optic chiasm of the hooded seal before sectioning. (B) Fiber orientation map of a medial section through the optic chiasm. Optic nerves and optic tracts appear as massive and rather homogeneous fiber bundles. Most fiber tracts from the optic nerves decussate to the contralateral optic tract. (C) The decussation zone in the center (i.e., the chiasm) is characterized by a patch pattern produced by small fiber tracts (red and green color; exemplary orientations are indicated by black lines) and fiber crossings characterized by signal attenuation (blue color; exemplary highlighted by white arrow). Based on this FOM, pliODFs were created for super-voxel dimensions of 40 × 40 × 1 native voxels. (D,E) demonstrate different enlargements of the field of pliODFs overlaid with the input FOM. (F) FOM of a simulated section through the optic chiasm and (G) corresponding pliODFs for super-voxel dimensions of 40 × 40 × 1 native voxels. (H) Zoom into the FOM of the fiber decussation zone and (I) corresponding pliODFs. The effects of crossing and bending fibers on the ODF shapes are obvious.

### 3D-PLI Data Obtained from a Hooded Seal and a Human Brain

#### Brain Sections of a Hooded Seal

FOMs taken from hooded seal brain tissue show the optic nerve traversing through the optic chiasm into the optic tract (**Figures 5A,B**). The center of the optic chiasm reveals decussate fiber populations alternating with blue dots caused by signal attenuation (white arrow, **Figure 5C**). A pattern of crossing fiber tracts was used to prove the functionality of our implementation in a realistic setting. The FOMs were sampled using a histogram binning of (50 latitudes × 100 longitudes + 2 polar caps) = 5002 bins with a super-voxel size of 52 × 52 × 70µm<sup>3</sup> , which is equivalent to 40 × 40 × 1 native voxels. The series expansion of the pliODFs was confined to the 6th band. The fused images of the high-resolution FOMs with the corresponding pliODFs demonstrated a sound resampling (**Figures 5D,E**). The simulated chiasm of the hooded seal was analyzed accordingly and showed concordant results (**Figures 5H,I**).

#### Human Brain Sections

Three high-resolution FOMs of selected regions of interest from a coronal section through the human occipital lobe (**Figure 6A**) were resampled at different super-voxel dimensions (**Figure 6B**), but with fixed histogram binning (50 latitudes × 100 longitudes + 2 polar caps = 5002 bins). The targeted super-voxel sizes of 26 × 26 × 70µm<sup>3</sup>, 52 × 52 × 70µm<sup>3</sup>, and 260 × 260 × 70µm<sup>3</sup> correspond to 20 × 20 × 1, 40 × 40 × 1, and 200 × 200 × 1 native voxels, respectively. The series expansion was confined to the 6th band. The pliODFs were compared both with the underlying high-resolution FOMs acquired with the polarizing microscope (**Figure 6C**) and with FOMs obtained with the large-area polarimeter (**Figure 6D**).

In addition, two high-resolution FOMs of selected regions of interest from a coronal section through the human brain (**Figure 7**) at the level of the central region were resampled at different super-voxel dimensions (**Figures 7C,D**), but with fixed histogram binning (50 latitudes × 100 longitudes + 2 polar caps = 5002 bins). The targeted super-voxel sizes of 65 × 65 × 70µm<sup>3</sup> , and 260 × 260 × 70µm<sup>3</sup> correspond to 50 × 50 × 1, and 200 × 200 × 1 native voxels, respectively. The series expansion was confined to the 6th band.

The following observations were made:


fibers (indicated by the white arrows). The largest super-voxel size is equivalent to 260 × 260 × 70 µm<sup>3</sup> and corresponds approximately to the level of high-resolution post mortem dMRI measurements. (C) Region (2) demonstrates for a super-voxel dimension of 50 × 50 × 1 native voxels the preservation of the overall fiber structure in comparison with the original high-resolution FOM obtained with the polarizing microscope. Zooming into the data reveals pliODFs with multiple fiber orientations in inhomogeneous white matter regions. (D) For region (3), pliODFs (super-voxel dimension of 50 × 50 × 1 native voxels) are opposed to the vector-based representation of the FOM of the same brain region measured with the large-area polarimeter at 64 × 64 × 70µm<sup>3</sup> voxel size. The white arrows indicate a crossing zone of fibers.

in tissue regions basically composed of parallel fibers (**Figure 6D**).

(iii) Super-voxels localized in transition areas of adjacent or crossing fiber tracts preserved the corresponding fiber orientations in the pliODF with high fidelity (**Figures 6D**, **7C**), while the same regions scanned with the large-area polarimeter showed attenuated signals or averaged orientations (e.g., **Figure 6D**).

#### Runtime Behavior

Runtime measurements (for 3712 × 4576 orientation vectors) performed on the JuDGE supercomputer showed that (i) with increasing super-voxel size the overall runtime decreased and (ii) with increasing depth of expansion the overall runtime increased (cf. **Figure 8**).

### DISCUSSION

### General Concept

3D-Polarized Light Imaging has been demonstrated in previous neuroanatomical studies (Axer et al., 2011a,b; Caspers et al., 2015; Dohmen et al., 2015; Zilles et al., 2015; Zeineh et al., 2016) to provide unique high-resolution data on the brain's fiber architecture in various species, such as mouse, rat, seal, vervet monkey, and human. The color-coded fiber orientation map (FOM) was the fundamental image modality for each of these studies, since it comprised both the highlighted fiber structures and their local 3D orientations. Hence, each FOM enabled a comprehensive delineation and identification of neuroanatomical fiber structures across different scales covering the micrometer (i.e., cortical and sub-cortical fibers) to the centimeter range (i.e., long-range fiber tracts across entire brain sections).

In the present study, we have successfully implemented a dedicated methodology for integrating high-resolution vector data obtained from FOMs into a comprehensive statistical description, the orientation distribution functions (pliODFs). By this means, efficient downscaling of high-resolution FOMs was achieved and cross-scale, cross-modality comparisons were enabled. The general concept of pliODF generation can be summarized in three steps:


### Validation Strategy

To prove the concept, we applied our implementation to simulated and real 3D-PLI datasets that reflected different characteristics of fiber compositions: well-defined parallel fibers (letter simulation, **Figure 4**), perpendicular fiber tract crossings (measurement and simulation of the hooded seal chiasm, **Figure 5**), and complex fiber architectures (measurements in human brain sections, **Figures 6**, **7**). That way, the feasibility of our approach became traceable from the simplest to the most challenging cases.

The reliability of the steps required to derive a pliODF in the form of a glyph was successfully demonstrated on the basis of a simulated 3D-PLI dataset. The simulation tool SimPLI (Dohmen et al., 2015; Menzel et al., 2015) has specifically been developed to model the effects of birefringence in brain tissue as well as the results of polarimetric measurements. However, one of the generated synthetic datasets (cf. **Figure 2**) was not intended as a realistic simulation of fiber architecture, but was designed to enable systematic testing of the conservation of the involved coordinate systems and of fiber-like/bundle-like orientations. In combination with the same color-coding scheme applied to FOMs and pliODFs, this turned out to be an excellent approach for demonstration and validation purposes. This becomes evident in the study of directional histograms (i.e., the probability distributions of orientation vectors on a sphere) and their fitting with a series expansion with spherical harmonics (**Figure 3**).

The directional histogram is a proximate reflection of the main fiber orientations in a super-voxel as provided by the FOMs. However, the discrete arbitrary binning of the histogram affects the orientation information and, consequently, the final approximation of the density function. The different bin widths used in this study for the simulated and experimental datasets were determined by trial. For future studies, an automated data-driven binning in S<sup>2</sup>, depending on input and target data, is essential. The creation of directional histograms from high-statistics vector data is not computationally intensive and represents a very fast and robust method to derive the prevalent fiber directions in super-voxels. The precision of the fiber directions obtained from the directional histogram (defined by the bin centers) is basically limited to the size of the solid angle spanned by a bin.

This limitation can be overcome by fitting the directional histogram with a series expansion using spherical harmonics to generate a pliODF, but at the expense of a drastically increased computing time. The runtime measurements for the resampling of a FOM composed of 3712 × 4576 orientation vectors showed that the computationally most intensive step was the determination of the expansion coefficients. According to Equation 7, the number of coefficients to be determined increases non-linearly with increasing expansion depth. Hence, the series expansion has to be truncated. For the experiments done in this study, the expansion to the 6th band appeared to be sufficient to assess the functionality of our implementation, but this level of resolution has to be reviewed when addressing different questions regarding data precision, data size, or computing time. By increasing the super-voxel size, the computation time is decreased in a non-linear way, due to the lower number of pliODFs to be computed.

What does this mean for whole human brain imaging and comparison? A human brain volume of 1200 cm<sup>3</sup> translates into 10<sup>10</sup> native voxels provided by high-resolution 3D-PLI measurements. A voxel size of 2 mm isotropic, such as provided by standard clinical in vivo dMRI technologies (e.g., Bastiani and Roebroeck, 2015), leads to 150,000 dMRI voxels for a human brain. Resampling of 3D-PLI to this dMRI scale means integrating 65,000 orientation vectors into a single super-voxel or pliODF, respectively. For the runtime measurements performed on the 3712 × 4576 pixel-sized FOM, about 170,000 pliODFs with a super-voxel dimension of 10 × 10 × 1 native voxels had to be computed. The computation time (expansion to the 6th band) was about 9.5 h utilizing only 72 compute cores. We therefore conclude that the computation of pliODFs for an entire human brain at 2 mm resolution is feasible. Post mortem dMRI has recently progressed to study the structural organization of the entire human brain at a voxel size of 0.7 mm isotropic (Miller et al., 2011), which results in 3.5 million target (super-)voxels, i.e., a factor of about 20 more than provided by standard in vivo dMRI measurements. Nevertheless, adjusting pliODFs to submillimeter dMRI data for a whole human brain is still in the realm of the feasible, taking into account that supercomputing facilities provide thousands of computing cores.
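The voxel counts quoted in this paragraph follow from simple arithmetic, which the Python sketch below reproduces. The brain volume, voxel sizes, FOM dimensions, and the figure of 10<sup>10</sup> native voxels are taken from the text; the variable and function names are chosen for this example only.

```python
BRAIN_VOLUME_MM3 = 1200 * 1000      # 1200 cm^3 expressed in mm^3


def n_voxels(edge_mm):
    """Number of isotropic voxels of the given edge length filling the brain."""
    return BRAIN_VOLUME_MM3 / edge_mm ** 3


in_vivo = n_voxels(2.0)             # standard clinical dMRI
post_mortem = n_voxels(0.7)         # high-resolution post mortem dMRI
ratio = post_mortem / in_vivo

# Orientation vectors per 2 mm super-voxel, using the 1e10 native voxels
# stated in the text for the whole brain.
vectors_per_supervoxel = 1e10 / in_vivo

# pliODFs for the benchmark FOM with 10 x 10 x 1 super-voxels.
n_pliodfs = (3712 * 4576) / (10 * 10)

print(f"in vivo dMRI voxels:      {in_vivo:,.0f}")
print(f"post mortem dMRI voxels:  {post_mortem:,.0f}")
print(f"ratio:                    {ratio:.1f}")
print(f"vectors per super-voxel:  {vectors_per_supervoxel:,.0f}")
print(f"pliODFs in benchmark FOM: {n_pliodfs:,.0f}")
```

The results agree with the rounded figures given above: 150,000 voxels at 2 mm, roughly 3.5 million at 0.7 mm (a factor of a little over 20), on the order of 65,000 orientation vectors per 2 mm super-voxel, and about 170,000 pliODFs for the benchmark FOM.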

### Scope of Application

We demonstrated that the pliODF generation workflow enables the efficient study of scaling effects and related partial volume effects, already at the level of single brain sections, for both real and simulated data (cf. **Figures 5–7**). 3D-PLI measurements of the same tissue at two distinct resolutions (native voxel sizes of 64 × 64 × 70µm and 1.3 × 1.3 × 70µm) were beneficial in this context. The low-resolution FOMs were used as independent references for prevailing fiber orientations derived from pliODFs computed from super-voxel dimensions that matched the native voxel dimensions of the low-resolution measurements (e.g., **Figure 6D**). While the principal orientations of distinct fiber tracts from both types of fiber orientation descriptions agreed, the benefit of pliODFs became evident at transition zones between differently oriented fiber tracts. In the latter case, pliODFs preserved details about the complex fiber population, which was not observed for the low-resolution measurements. This is because, in a 3D-PLI measurement, a native voxel collects birefringence effects from multiple fibers, resulting in a measurement of superimposed sinusoidal signals (Reckfort et al., 2015), each with a fiber-orientation-specific amplitude and phase. As demonstrated by Dohmen et al. (2015), the derived fiber orientation vector for a native voxel significantly depends on the complexity of the underlying fiber population. For the low-resolution measurements this means that about 50–100 myelinated fibers contribute to the measured signal, if fiber diameters between 0.4 and 15µm (Aboitiz et al., 1992) are assumed. Future studies will further elaborate on scaling effects by combining sophisticated simulation approaches (Dohmen et al., 2015; Menzel et al., 2016) with measurements across scales (Reckfort et al., 2015). This will open up new avenues to derive observer-independent quality measures for 3D-PLI measurements, beyond neuroanatomical inspection.

Even though the concept of the pliODF generation has been demonstrated for FOMs of individual brain sections, its application is not limited to section-like data; it can also be extended to a volume of FOMs. In the latter case, the brain section images have to be re-aligned into a coherent 3D brain volume, which requires the application of complex nonlinear image registration techniques (Palm et al., 2010; Amunts et al., 2013). Assuming a whole human brain reconstruction from 2500 serial coronal sections scanned at 1.3µm pixel size with a single section image size of 70,000 × 100,000 pixels on average, it becomes evident that the utilization of distributed high performance computing in a supercomputing environment is essential. The process of registration aims at correcting for up to cm-sized tissue deformation introduced during brain sectioning and tissue mounting. This poses a major challenge when addressing 3D brain reconstruction at the µm-scale, as required for long-range pixel-wise tracing of fiber tracts across hundreds to thousands of brain sections. By integrating many orientation vectors into pliODFs, local inaccuracies in section alignment are likely to be smoothed out and, therefore, tractography of long-distance fiber pathways becomes feasible for large-scale 3D-PLI datasets, but at the expense of resolution. As a benefit, advanced tractography algorithms exploiting multiple direction information of individual dMRI voxels (Sotiropoulos et al., 2010; Reisert et al., 2011) become applicable to 3D-PLI data and, vice versa, pliODFs (or, alternatively, directional histograms) are suitable to validate various methods of tractography.

As pointed out by Hubbard and Parker (2009), "it is important to . . . test not only the ability of the tractography algorithms to track fibers from voxel to voxel, but to observe the details of the voxel-scale information and independently quantify the ability of dMRI to assess fiber orientation. The integrity of this orientation information is paramount to the validity of the reconstructed tract." 3D-PLI with its pliODFs particularly opens up the avenue to align with dMRI measurements by crossing the scales using common data formats, and to provide sub-voxel information on the underlying fiber architecture based on an independent technology complementary to the dMRI approach. Based on the pliODF generation, comparisons of 3D-PLI and dMRI can now be conducted at the level of individual voxels of the same size. Describing the local distribution of fiber orientations by means of spherical harmonics opens up the possibility to utilize methods being developed in the scope of computer vision to assess the similarity of datasets via shape descriptors and shape metrics, for example. This appears to be a promising approach to derive observer independent quality measures for 3D-PLI measurements, which will be evaluated in future studies.

### CONCLUSIONS

The future of research about the brain connectome will depend on multiscale approaches validated by independent imaging methods applied to the same object simultaneously. We have successfully established a concept to bridge the spatial scales from microscopic fiber orientation measurements based on 3D-PLI to macroscopic dimensions by means of creating orientation distribution functions (pliODFs) from high-resolution vector data. With pliODFs the fusion and comparison with dMRI data becomes feasible even for whole human brains. The key technology of supercomputing, which is indispensable for addressing real big data challenges as posed by the human brain, has been applied to achieve this goal. However, our implementation is not limited to supercomputing environments; it was also demonstrated to run successfully on local Linux cluster systems. The established software package to generate pliODFs from 3D-PLI datasets described in this work will be made available through an ICT portal currently being developed by the Human Brain Project consortium.

### AUTHOR CONTRIBUTIONS

MA coordinated and substantially contributed to the conception and design of the study as well as to the analysis and interpretation of the 3D-PLI data. He drafted the manuscript. SS participated in the design of the study and was accountable for the mathematical description of the pliODFs and the software implementation. He helped to draft the manuscript. DG contributed to the interpretation of the 3D-PLI data and was accountable for the visualization of the directional histograms and the pliODFs. He helped to draft the manuscript. OB implemented the HPC-based analysis workflow of 3D-PLI data and generated the pliODFs for the whole human brain section. MD created and analyzed the simulated data set. JR conducted the 3D-PLI measurements and helped to compare the high- and low-resolution data sets. KZ contributed to the anatomical content of the study and helped with the interpretation of the pliODFs. KA contributed to the anatomical content and substantially assisted in the conception of the study. All authors read and revised the final manuscript and gave approval for publication.

### FUNDING

This work was supported by the Helmholtz Association portfolio theme "Supercomputing and Modeling for the Human Brain," by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102 (Human Brain Project), and by the National Institutes of Health under grant agreement no. R01MH092311.

### ACKNOWLEDGMENTS

We would like to thank M. Cremer and C. Schramm, Research Centre Jülich, Germany, for their excellent technical assistance and preparation of the histological sections. We also thank L. Folkow, Arctic University of Norway, for supplying the tissue sample of the hooded seal as well as A. Wree, University Rostock, for initiating the latter collaboration.

### REFERENCES

Alexander, D. C., Barker, G. J., and Arridge, S. R. (2002). Detection and modeling of non-Gaussian apparent diffusion coefficient profiles in human brain data. Magn. Res. Med. 48, 331–340. doi: 10.1002/mrm.10209

Amunts, K., Bücker, O., and Axer, M. (2014). "Towards a multiscale, high-resolution model of the human brain," in Brain-Inspired Computing, vol. 8603, eds L. Grandinetti, T. Lippert, and N. Petkov (Cetraro: Springer International Publishing), 3–14. doi: 10.1007/978-3-319-12084-3\_1


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Axer, Strohmer, Gräßel, Bücker, Dohmen, Reckfort, Zilles and Amunts. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## 3D Reconstructed Cyto-, Muscarinic M<sup>2</sup> Receptor, and Fiber Architecture of the Rat Brain Registered to the Waxholm Space Atlas

Nicole Schubert<sup>1</sup>\*, Markus Axer<sup>1</sup>\*, Martin Schober<sup>1</sup>, Anh-Minh Huynh<sup>1</sup>, Marcel Huysegoms<sup>1</sup>, Nicola Palomero-Gallagher<sup>1</sup>, Jan G. Bjaalie<sup>2</sup>, Trygve B. Leergaard<sup>2</sup>, Mehmet E. Kirlangic<sup>1</sup>, Katrin Amunts<sup>1,3</sup> and Karl Zilles<sup>1,4,5</sup>

<sup>1</sup> Institute of Neuroscience and Medicine 1, Research Centre Jülich, Jülich, Germany, <sup>2</sup> Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway, <sup>3</sup> C. and O. Vogt Institute for Brain Research, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany, <sup>4</sup> Translational Brain Medicine, Jülich-Aachen Research Alliance, Aachen, Germany, <sup>5</sup> Department of Psychiatry, Psychotherapy, and Psychosomatics, RWTH Aachen University, Aachen, Germany

#### Edited by:

Wolfram Schenck, University of Applied Sciences Bielefeld, Germany

#### Reviewed by:

Marina Bentivoglio, University of Verona, Italy; M. Mallar Chakravarty, McGill University, Canada

\*Correspondence: Nicole Schubert n.schubert@fz-juelich.de; Markus Axer m.axer@fz-juelich.de

Received: 31 December 2015; Accepted: 18 April 2016; Published: 03 May 2016

#### Citation:

Schubert N, Axer M, Schober M, Huynh A-M, Huysegoms M, Palomero-Gallagher N, Bjaalie JG, Leergaard TB, Kirlangic ME, Amunts K and Zilles K (2016) 3D Reconstructed Cyto-, Muscarinic M2 Receptor, and Fiber Architecture of the Rat Brain Registered to the Waxholm Space Atlas. Front. Neuroanat. 10:51. doi: 10.3389/fnana.2016.00051

High-resolution multiscale and multimodal 3D models of the brain are essential tools to understand its complex structural and functional organization. Neuroimaging techniques addressing different aspects of brain organization should be integrated in a reference space to enable topographically correct alignment and subsequent analysis of the various datasets and their modalities. The Waxholm Space (http://software.incf.org/software/waxholm-space) is a publicly available 3D coordinate-based standard reference space for the mapping and registration of neuroanatomical data in rodent brains. This paper provides a newly developed pipeline combining imaging and reconstruction steps with a novel registration strategy to integrate new neuroimaging modalities into the Waxholm Space atlas. As a proof of principle, we incorporated large scale high-resolution cyto-, muscarinic M<sup>2</sup> receptor, and fiber architectonic images of rat brains into the 3D digital MRI based atlas of the Sprague Dawley rat in Waxholm Space. We describe the whole workflow, from image acquisition to reconstruction and registration of these three modalities into the Waxholm Space rat atlas. The registration of the brain sections into the atlas is performed by using both linear and non-linear transformations. The validity of the procedure is qualitatively demonstrated by visual inspection, and a quantitative evaluation is performed by measurement of the concordance between representative atlas-delineated regions and the same regions based on receptor or fiber architectonic data.
This novel approach enables for the first time the generation of 3D reconstructed volumes of nerve fibers and fiber tracts, or of muscarinic M<sup>2</sup> receptor density distributions, in an entire rat brain. Additionally, our pipeline facilitates the inclusion of further neuroimaging datasets, e.g., 3D reconstructed volumes of histochemical stainings or of the regional distributions of multiple other receptor types, into the Waxholm Space. Thereby, a multiscale and multimodal rat brain model was created in the Waxholm Space atlas of the rat brain. Since the registration of these multimodal high-resolution datasets into the same coordinate system is an indispensable requisite for multi-parameter analyses, this approach enables combined studies on receptor and cell distributions as well as fiber densities in the same anatomical structures at microscopic scales for the first time.

Keywords: brain atlas, polarized light imaging, quantitative receptor autoradiography, histology, image registration

### 1. INTRODUCTION

Virtual high-resolution multiscale and multimodal 3D models of the brain are essential tools to visualize and understand the complex structural and functional organization of the brain. To capture different aspects of brain organization, such as the long-range fiber tracts connecting different brain regions, intracortical connectivity, and differences in molecular compositions, complementary neuroimaging techniques should be utilized. In order to interpret and compare measurements derived from different experimental techniques, all brain data sets should be integrated into a standard reference space.

This integrative approach leading to a multimodal and multiscale brain model is a major challenge because of the enormous structural complexity of the brain. The different brain regions differ not only by their cytoarchitecture, i.e., the varying densities of cells between the different layers within and between brain areas, but also by the expression of neurotransmitter receptors and gene expression. This microstructural diversity leads to a segregation of the cerebral cortex and the subcortical regions into hundreds of well definable entities with complex spatial arrangements (Toga et al., 2006; Zilles and Amunts, 2010; Amunts et al., 2013; Amunts and Zilles, 2015). Moreover, the different entities are connected by long range and short range fiber tracts, which also show an enormous spatial complexity (Zilles and Amunts, 2012). Therefore, an accurate definition of the spatial positions of structural entities is an indispensable requirement, particularly for multimodal and multiscale data sets. This is far from trivial (Bjaalie, 2002), because it often requires the registration of data collected from different brains, with different spatial resolutions, different dimensions of methodically introduced structural deformations and artifacts, and structures of considerable intersubject variability. Therefore, a multiscale and multimodal analysis must be based on an integration of the various data in a common stereotaxic brain atlas framework.

The International Neuroinformatics Coordinating Facility (INCF) Digital Atlasing Project created such a standardized framework, i.e., Waxholm Space, that operates as a connection point between miscellaneous rodent brain data. The Waxholm Space (WHS) is a common open access (http://software.incf.org/software/waxholm-space) 3D reference space based on high resolution magnetic resonance imaging (MRI) data anchored in a standardized spatial coordinate system. It also supports infrastructure for data exchange. The WHS of the mouse brain (Hawrylycz et al., 2011) was extended with, among others, neuroanatomical atlases (Goldowitz, 2010), gene expression databases, and MRI and diffusion tensor imaging (DTI) data (Johnson et al., 2010). Papp et al. (2014) introduced and implemented the WHS atlas of the Sprague Dawley rat brain. The WHS rat brain atlas currently only contains high resolution MRI and DTI images, which served as basis for the delineation of the 79 major anatomical structures it depicts (Papp et al., 2014, 2015; Kjonigsen et al., 2015).

The aim of the present study was to complement the WHS rat brain atlas with information on cytoarchitecture, receptor expression and spatial orientation of fiber tracts. Thus, we processed entire postmortem brains of the Wistar rat for three different neuroimaging techniques: microscopic analysis of histological cell body stained serial sections for cytoarchitectonic analysis, in vitro receptor autoradiography to demonstrate muscarinic M<sup>2</sup> receptor density distributions (Zilles et al., 2002; Zilles and Amunts, 2010), and 3D Polarized Light Imaging (PLI) for high resolution visualization of fiber tracts (Axer et al., 2011a,b). Imaging of the cell body stained sections enables a precise microscopical identification of cytoarchitectonically definable areas and the visualization of the spatial distribution of neurons. Quantitative in vitro receptor autoradiography is a well established technique to visualize the topographically heterogeneous distribution of neurotransmitter receptors, the key molecules of signal transmission. 3D PLI has been introduced recently and has opened up new avenues to analyze the complex architecture of nerve fibers and fiber tracts in postmortem brains at a microscopic resolution.

All of the above mentioned techniques require brain sectioning and mounting on glass slides, and this approach results in a loss of spatial alignment between neighboring sections. To obtain 3D brain models, the sections have to be aligned, the artificial deformations must be corrected, and 3D reconstructions must be performed. Therefore, it was necessary to improve currently available registration algorithms and adapt them to the specific requirements inherent to images obtained from receptor autoradiography and PLI. Finally, the nature of the data acquired in the present study also required the development of a novel registration strategy which enables integration of large scale high-resolution images into the 3D MRI volume of the WHS rat atlas.

The data provided here complement the Waxholm Space rat brain atlas and yield a multiscale and multimodal rat brain model, enabling for the first time combined studies on receptor and cell distributions as well as fiber densities in the same anatomical structures at microscopic scales. Furthermore, the model will be publicly accessible through the Human Brain Project (HBP) portal, intended for multi-parameter analyses, refinement of the atlas labels, or further expansion via the proposed registration strategies.

### 2. MATERIALS AND METHODS

### 2.1. Tissue Processing and Image Acquisition

### 2.1.1. Tissue Sectioning and Blockface Imaging

All animal procedures were approved by the institutional animal welfare committee at the Research Centre Jülich, and were in accordance with European Union (National Institutes of Health) guidelines for the use and care of laboratory animals. The brain of one adult male Wistar rat was used for the visualization of cell bodies and of muscarinic M<sup>2</sup> receptors. It is referred to as the receptor brain. The brain of a second adult male Wistar rat, which we refer to as the PLI brain, was processed for the visualization of nerve fibers and fiber tracts. The receptor brain was immediately deep frozen in isopentane at −50°C and serially sectioned in the coronal plane at 20µm thickness using a cryostat microtome (Leica Microsystems, Germany). The ensuing 1362 sections were thaw-mounted onto glass slides and organized in series of adjoining triplets of which one section was used for visualization of cell bodies, and the other two sections were processed for quantitative in vitro receptor autoradiography. The PLI brain was immersion fixed in 4% buffered formaldehyde. After two cryoprotection steps (10% glycerin for 3 days, followed by 20% glycerin for 14 days at +4°C), the brain was deep frozen in isopentane at −50°C and serially sectioned in the coronal plane at 60µm thickness using the same cryostat microtome (Leica Microsystems, Germany). The 446 ensuing sections were placed on glass slides and stored at −80°C in airtight plastic bags until further processing. They were thaw mounted and coverslipped with 20% glycerin the day before image acquisition took place. During sectioning of both brains, blockface images of every section were taken with a CCD camera (AVT Oscar F-810 C, 3272 × 2469 pixels, 15µm × 15µm, RGB) which was installed vertically above the cryostat, in order to obtain undistorted reference images. Spatial resolution in the z-direction was 20µm for images obtained from the receptor brain, and 60µm for images obtained from the PLI brain.
A total of 1361 blockface images were taken for the receptor brain, and 446 for the PLI brain.

### 2.1.2. Receptor Brain

A total of 452 sections from the receptor brain were stained with a silver staining technique according to Merker (1983). This technique stains all cell bodies and differs from the widely used cresyl-violet stain of Nissl substance in its higher contrast and more intense visualization of cytoarchitecture. The sections processed for quantitative in vitro receptor autoradiography were used to demonstrate the densities (in fmol/mg protein) of two different receptor binding sites of the cholinergic muscarinic M<sup>2</sup> receptor, i.e., the agonistic and the antagonistic binding site, according to previously published protocols (Zilles et al., 2002; Palomero-Gallagher et al., 2013). Sections were incubated with 1.7 nM <sup>3</sup>H-oxotremorine-M (PerkinElmer, USA) or with 5 nM <sup>3</sup>H-AF-DX 384 (PerkinElmer, USA) to visualize the agonistic and antagonistic binding sites of the M<sup>2</sup> receptor, respectively. Binding assays were preceded by a preincubation in the respective buffer to eliminate the endogenous transmitters and finalized by a washing step. The labeled sections were exposed together with plastic scales of increasing and known radioactivity concentrations against beta-radiation (tritium) sensitive films, which were developed after 15 weeks.

The ensuing 430 autoradiographs of the agonistic binding site of the M<sup>2</sup> receptor as well as the 452 cell body stained histological sections were digitized using a high resolution camera (Zeiss) with an in-plane resolution of 5µm × 5µm (4164 × 3120 pixels, RGB). Since each of these sections was 20µm thick and sections had been organized into triplets, the resulting spatial resolution in the z-direction was 60µm for both the digitized autoradiographs and the digitized histological sections. For further details of quantification of receptor density in fmol/mg protein and color coding see Zilles et al. (2002).

### 2.1.3. PLI Brain

The 446 sections from the brain cut at 60µm were used to acquire 3D-PLI data reflecting the fiber architecture in gray and white matter regions (cf. Axer et al., 2011b; Dohmen et al., 2015; Menzel et al., 2015; Reckfort et al., 2015 for technical details). Briefly, 3D-PLI utilizes the optical birefringence of brain tissue, which is basically induced by the optical anisotropy of myelin sheaths wrapped around axons. By passing linearly polarized light through brain sections and by detecting the local changes in the polarization state of light, a 3D description of the underlying fiber architecture is derived. The imaging system used is a polarimeter. The sections were successively scanned with a large-area polarimeter (LAP), and subjected to an analysis workflow, which comprises calibration, independent component analysis, polarization analysis and calculation of fiber orientation maps (FOMs). FOMs are the fundamental data structure provided by 3D-PLI and have an in-plane resolution of 64µm × 64µm, and, since each section was 60µm thick, a spatial resolution in the z-direction of 60µm. They contain a single 3D fiber orientation vector per voxel, which is interpreted as the spatial orientation of the fibers in this voxel.

### 2.2. 3D Reconstruction

### 2.2.1. 3D Reconstruction of Blockface Images

Non-linear deformations introduced by brain sectioning, mounting and staining were corrected using blockface images as undistorted references for the spatial alignment of histological, autoradiographic and PLI images. Hence, in a first step the blockface images had to be 3D reconstructed. In short, the robust and efficient reconstruction method applied here consisted of a two-phase registration: a marker-based alignment of the blockface images and a refinement of the pre-reconstructed volume using 3D information. First, the coordinates of markers (circles) labeled on the microtome chuck were extracted. The centers of the circles of neighboring images were aligned to each other by means of a translation transformation. Processing all images leads to an almost smoothly reconstructed 3D stack of blockface images of the brain. However, this approach causes perspective errors due to the different heights of the sectioning plane and microtome chuck with the markers, and thus their different distances to the camera lens. Therefore, in the second part of the method the median along the z-direction of the marker-based reconstructed blockface volume was calculated to eliminate the outliers caused by perspective errors. The number of images used in the voxelwise computation of this median volume can be specified by the radius of the median. In a next step, the marker-based reconstructed volume was aligned slice-by-slice onto the median volume using a translation transform estimated by an intensity based image registration algorithm with sum of squared differences as metric. By using this technique we took advantage of 3D information in what is actually a 2D slice-by-slice registration method. This led to an accurately aligned volume of blockface images that was an important reference to recover the spatial coherence of the non-linearly deformed sections corresponding to the blockface images. The procedure of 3D reconstruction of blockface images was introduced by Schober et al. (2015) with modified markers for the reconstruction. Finally, the reconstructed blockface volumes of the receptor and PLI brains were separated from the surroundings by means of a 3D watershed algorithm. The 3D reconstruction was carried out separately for images obtained from the receptor brain and for those of the PLI brain. Thus, we obtained two distinct blockface coordinate spaces (BCS): the BCS of the receptor brain (BCS<sup>R</sup>), and the BCS of the PLI brain (BCS<sup>PLI</sup>).
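
The second phase of this reconstruction (a voxelwise median along z, followed by a slice-wise translation that minimizes the sum of squared differences) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tiny list-based images, the integer brute-force search (used here in place of the intensity based registration algorithm) and the averaging of the squared differences over the overlap are assumptions made purely for illustration.

```python
from statistics import median

def median_volume(stack, radius):
    """Voxelwise median along z over a window of +/- radius slices."""
    depth, rows, cols = len(stack), len(stack[0]), len(stack[0][0])
    volume = []
    for z in range(depth):
        lo, hi = max(0, z - radius), min(depth, z + radius + 1)
        volume.append([[median(stack[k][y][x] for k in range(lo, hi))
                        for x in range(cols)] for y in range(rows)])
    return volume

def ssd(image, reference, dy, dx):
    """Squared difference, averaged over the overlap, after shifting image by (dy, dx)."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            sy, sx = y - dy, x - dx  # source pixel that lands on (y, x)
            if 0 <= sy < rows and 0 <= sx < cols:
                total += (image[sy][sx] - reference[y][x]) ** 2
                count += 1
    return total / count if count else float("inf")

def best_translation(image, reference, max_shift):
    """Brute-force search for the integer shift with the smallest SSD (illustrative only)."""
    shifts = [(dy, dx) for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: ssd(image, reference, *s))

# Hypothetical toy stack: the middle slice is displaced by one pixel in y and x.
base = [[0] * 6 for _ in range(6)]
base[2][2] = base[2][3] = base[3][2] = 10
moved = [[0] * 6 for _ in range(6)]
moved[3][3] = moved[3][4] = moved[4][3] = 10
stack = [base, moved, base]

reference = median_volume(stack, radius=1)
shift = best_translation(stack[1], reference[1], max_shift=2)
```

In this toy case the median over three slices suppresses the displaced slice, so the reference for the middle slice equals the undisplaced image and the search recovers the shift that moves the slice back.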

### 2.2.2. 3D Reconstruction of Cyto- and Receptor Architecture Images

After reconstruction of the blockface volume of the receptor brain, each histological and autoradiographic image was aligned to its corresponding blockface image. Due to the highly different information each modality comprises (i.e., cell body distribution patterns vs. M<sup>2</sup> receptor densities), it was necessary to establish different registration strategies. Each histological section was rigidly aligned with its corresponding blockface image. The centers of gravity of the brain tissue in blockface and histological image were calculated and superimposed. To determine the center of gravity, the separation of the brain tissue from the background is required. This was done by means of thresholding, extracting the largest connected component and morphological operations. After alignment of the centers of gravity a brute force optimizer tested all rotation angles with the sum of squared differences as metric. Details are described in Schubert et al. (2015). Reconstruction of the autoradiographic images is considerably more challenging due to the fact that receptors of a given type, in our case the muscarinic M<sup>2</sup> receptor, are not necessarily expressed in all brain regions. Furthermore, when present, they can occur at very different concentrations throughout the brain. Thus, the range of gray values present in an autoradiographic image is much larger than that of a histological section. Therefore, the registration has to compensate for "empty regions," i.e., regions without information in the images because that part of the brain does not express the receptor in question. First of all, an intrastack registration matched consecutive autoradiographic images by means of a scale-invariant feature transform algorithm that detected characteristic points in the images based on their gradient information. Afterwards, these points were rigidly aligned by minimizing the Euclidean distance between them. 
With this pre-registered autoradiographic volume we were able to use a landmark based method to align the autoradiographic images to their corresponding blockface images. In every 30th image anatomical landmarks were manually set and the rigid transformation between these landmarks was calculated. For the images lying between these landmark images, the transformations were interpolated and applied to the autoradiographic images (Huynh et al., 2015). Data acquisition and 3D reconstruction of the cyto- and receptor architecture are illustrated in **Figure 1**. At the end of this procedure, the 3D reconstructed histological volume and the 3D reconstructed M<sup>2</sup> receptor volume were each in the BCS<sup>R</sup>.
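
The landmark based step can be sketched as follows, assuming 2D landmarks given as coordinate pairs. The closed-form least-squares fit and the linear interpolation of angle and translation between two landmark sections are assumptions for illustration; the source does not specify the interpolation scheme, and all function names are hypothetical.

```python
import math

def fit_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source landmarks onto target."""
    n = len(source)
    csx = sum(p[0] for p in source) / n
    csy = sum(p[1] for p in source) / n
    ctx = sum(q[0] for q in target) / n
    cty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - csx, py - csy  # centered source landmark
        bx, by = qx - ctx, qy - cty  # centered target landmark
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    angle = math.atan2(cross, dot)
    c, s = math.cos(angle), math.sin(angle)
    # Translation that maps the rotated source centroid onto the target centroid.
    return angle, ctx - (c * csx - s * csy), cty - (s * csx + c * csy)

def apply_rigid(transform, point):
    """Rotate the point about the origin by the angle, then translate it."""
    angle, tx, ty = transform
    c, s = math.cos(angle), math.sin(angle)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)

def interpolate_transform(key_a, key_b, index_a, index_b, index):
    """Linear interpolation of (angle, tx, ty) for a section between two landmark sections."""
    w = (index - index_a) / (index_b - index_a)
    return tuple((1 - w) * a + w * b for a, b in zip(key_a, key_b))

# Hypothetical landmarks: recover a known transform, then interpolate halfway
# between landmark sections 0 and 30.
known = (0.3, 5.0, -2.0)
landmarks = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
moved = [apply_rigid(known, p) for p in landmarks]
estimate = fit_rigid_2d(landmarks, moved)
halfway = interpolate_transform((0.0, 0.0, 0.0), (0.3, 6.0, -3.0), 0, 30, 15)
```

Interpolating the angle directly is only reasonable for the small rotations expected between neighboring sections.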

### 2.2.3. 3D Reconstruction of Fiber Architecture Images

The 3D reconstruction of the PLI data consists of two steps: a rigid slice-by-slice registration of the PLI images to the corresponding blockface images and a non-rigid refinement method. The first step is based on estimating a transformation of the PLI images to the corresponding image of the reconstructed blockface volume by image registration. To align the PLI images to the blockface images, the masks of the brain tissue of both data sets are required. For that, the reconstructed blockface volume was segmented by means of a 3D watershed algorithm and the PLI images were manually segmented. Using the segmented images the centers of gravity of the corresponding brain masks were calculated and aligned. Based on this initial transformation, an intensity based rigid registration was performed using mutual information as metric. The second step, the refinement, was done by means of a slice-by-slice B-Spline registration with sum of squared differences as metric and a grid size of 5 × 6. At the end of this procedure, the 3D reconstructed fiber volume was in the BCS<sup>PLI</sup>.

### 2.3. Data Integration into a Reference Space

### 2.3.1. Waxholm Space Atlas

The WHS atlas of the Sprague Dawley rat brain is an open access atlas based on a high resolution MRI and DTI template in which both WHS and stereotaxic coordinates are defined. The T2\*-weighted anatomical MRI (512 × 1024 × 512 pixels) with an isotropic spatial resolution of 39µm was acquired ex vivo by means of a 7T small animal MRI system. Anatomical delineations in the atlas are based on image contrast observed in T2\*-weighted images and diffusion tensor images. Technical details are described in Papp et al. (2014). The latest version of the atlas contains 79 structures with new and updated delineations of the hippocampal formation and parahippocampal region, as described in Kjonigsen et al. (2015). The atlas is available from the INCF Software Center (http://software.incf.org/software/waxholm-space-atlas-of-the-sprague-dawley-rat-brain).

### 2.3.2. Data Integration into the Waxholm Space Atlas

In order to achieve an accurate analysis of the multimodal data sets, we aligned the atlas data to the coordinate space of each reconstructed data set, i.e., to the BCS<sup>R</sup> and the BCS<sup>PLI</sup>, by means of an advanced image registration. Several methods have been proposed in the literature to integrate brain sections into 3D data volumes. Strategies which rely on successively increasing the degrees of freedom of the transformation demonstrated the best results. For instance, Dauguet et al. (2007) and Li et al. (2009) suggested a two step procedure consisting of a rigid transformation followed by non-linear transformations. However, this approach has several drawbacks, since a rigid transformation only aligns brain orientation, so that the brains after this transformation still differ in size and shape. Non-linear registrations work locally and, therefore, need well aligned volumes as starting point, which a rigid transformation cannot guarantee. Lebenberg et al. (2010) added an affine transformation between the rigid and non-linear transformations. This step is also important for our data, because the affine transformation is able to align the size and shape of the brains by scaling and shearing. However, since Lebenberg et al. (2010) not only registered a single mouse hemisphere, but also excluded the olfactory bulb and cerebellum, they were not confronted with the challenges posed by registering whole brains. Therefore, modifications of their strategy are essential to enable accurate registration of structures such as the olfactory bulb, or even of the hemispheres, since they are independent of each other at levels rostral (or caudal) to the corpus callosum. We used a pyramidal method, i.e., a coarse to fine approach, for all three transformations (rigid, affine and non-linear) to align initially large structures followed by aligning small and fine structures, whereas Lebenberg et al. (2010) only applied this multi-resolution approach to the affine transformation. Furthermore, we computed the similarity of the brains by means of Mutual Information, which is the best metric for multimodal registration tasks (Rueckert et al., 1999), in all three transformation steps, while Lebenberg et al. (2010) used Correlation Coefficient in the affine registration step. Finally, we studied the influence of the grid spacing used for the non-linear transformation to achieve best possible results for the whole brains, which was not tested by Lebenberg et al. (2010).

The T2\*-weighted atlas MRI was aligned to the respective reconstructed blockface volume. The estimated transformation was then applied to the digital atlas delineations. To compensate for the variability between the atlas data and the rat brain data sets (i.e., the 3D reconstructed histological, M<sup>2</sup> receptor and fiber volumes), the registration strategy consists of the successive steps explained below (**Figure 2**). Initially, image parts belonging to the rat brain of the MRI data set were separated from the rest of the head using the atlas template. All voxels in the MRI whose corresponding atlas label was not equal to 0 were marked as "brain," the other voxels were marked as background. The resulting masked MRI containing only the brain volume was used for the registration. Note that it was necessary to perform two separate registrations, since we have reconstructed blockface images in two separate spaces, namely BCS<sup>R</sup> and BCS<sup>PLI</sup>. The procedure is the same in both cases and encompasses the following registration steps: The T2\*-weighted MRI was manually aligned to a few selected images from the reconstructed blockface volume using the anchoring tool, a custom tool for affine registration of histological images to brain atlas space (Moene et al., 2011; Papp et al., 2016), in Navigator3. Then, the transformations were automatically propagated to the remaining images. These steps can be iterated with different parameters to prioritize specific boundaries or structures. Afterwards, the manually aligned MRI was automatically re-oriented to match the spatial orientation of the blockface volume. A global 3D affine registration initialized with the previously computed parameters was then optimized with normalized mutual information as similarity measure. To speed up the registration and prevent local minima, a coarse to fine multiresolution approach was used which consisted of six levels of Gaussian smoothing pyramids. Finally, a 3D non-linear registration based on cubic B-Splines was used to refine the former parameters and local discrepancies. Again, normalized mutual information was chosen as similarity measure, and a pyramidal approach was used. The manual anchoring as well as the automatic 3D affine and non-linear transformations were directly applied to the atlas template with one exception: instead of cubic B-Spline interpolation we used nearest neighbor interpolation to preserve accurate label boundaries and avoid gaps. We employed elastix (Klein et al., 2010) for the 3D registration. After registration to the respective blockface volumes, the resulting volume dimensions of the MRI and atlas template equal the dimensions of the blockface volume coordinate space: 996 × 1356 × 1361 pixels with a spatial resolution of 15µm × 15µm × 20µm for the receptor brain (BCS<sup>R</sup>), and 588 × 723 × 446 pixels with a spatial resolution of 22µm × 22µm × 60µm for the PLI brain (BCS<sup>PLI</sup>).
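
To make the similarity measure concrete, the following sketch computes a normalized mutual information from a joint intensity histogram. It assumes one common definition, NMI = (H(F) + H(M)) / H(F, M), which ranges from 1 for independent images to 2 for a one-to-one intensity mapping; the source does not state which definition was used, and the bin count, intensity range and function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (natural logarithm) of a histogram given as a Counter."""
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def normalized_mutual_information(fixed, moving, bins=8, max_value=256):
    """NMI = (H(F) + H(M)) / H(F, M) from paired intensities of two aligned images."""
    width = max_value / bins
    pairs = [(int(f // width), int(m // width)) for f, m in zip(fixed, moving)]
    total = len(pairs)
    h_fixed = entropy(Counter(p[0] for p in pairs), total)
    h_moving = entropy(Counter(p[1] for p in pairs), total)
    h_joint = entropy(Counter(pairs), total)
    if h_joint == 0.0:
        raise ValueError("NMI is undefined for two constant images")
    return (h_fixed + h_moving) / h_joint

# Hypothetical flattened intensities: an inverted-contrast copy and an unrelated pattern.
fixed = [0, 0, 100, 100, 200, 200, 250, 250]
inverted = [250, 250, 200, 200, 100, 100, 0, 0]
unrelated = [0, 100, 0, 100, 0, 100, 0, 100]

nmi_inverted = normalized_mutual_information(fixed, inverted)
nmi_unrelated = normalized_mutual_information(fixed, unrelated)
```

The inverted-contrast pair reaches the maximum even though the intensities differ everywhere, which illustrates why a measure of this kind is suited to comparing blockface and MRI data.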

#### 2.3.3. Evaluation of the Registration Results

Qualitative evaluation of the registration results in terms of anatomical accuracy was done by superimposing the aligned MRI and the blockface volume to compare the external borders and internal structures, as well as superimposing the atlas contours with the cyto-, muscarinic M<sup>2</sup> receptor and fiber architecture.

Quantitative evaluation was performed by computing the quality of the alignment of atlas based segmented and receptor or PLI based segmented structures. Two structures which are defined in the WHS atlas were used: the pial brain surface, and the hippocampal formation. In our datasets different strategies were used to generate the pial surface contour and that of the hippocampal formation: The surface of the entire brains was segmented with a 3D watershed algorithm, and the hippocampal formation was manually delineated on the original images by two of the coauthors (NP-G and KZ). A comprehensive evaluation suggests the use of three uncorrelated measures that cover different aspects of the segmentation: an overlap measure (e.g., Dice coefficient), the Hausdorff distance and the average surface distance (Handels, 2000; Heimann et al., 2004). The Dice coefficient (DC) (Dice, 1945) assesses the overall overlap of the segments; it is sensitive to misplacement of the segments, but gives less weight to outliers. The average surface distance (ASD) and the Hausdorff distance (HD) determine the discrepancy of the surface of the segments. ASD is defined as the average error of all distances. A small ASD indicates a small error and variance between the segments. The HD returns the maximum distance between the segments, and therefore the maximum error. It is sensitive to outliers. The measurements were determined between an atlas based segmented structure A and the corresponding receptor or PLI based segmented structure B. The DC calculates the spatial overlap accuracy of two segmented structures A and B, whereby 0 is the result of disjoint segments and 1 is the result of a perfect agreement of the segments. With |·| denoting the number of voxels in the respective segmented structure, the DC is:

$$DC(A, B) = \frac{2|A \cap B|}{|A| + |B|} \tag{1}$$

The ASD determines the minimal distance in mm of one segment to the other and vice versa. This value is 0 for a perfect registration.

$$ASD(A,B) = \frac{\sum_{a \in A} \min_{b \in B} d(a,b) + \sum_{b \in B} \min_{a \in A} d(b,a)}{|A| + |B|} \tag{2}$$

The HD is defined as the maximum distance in mm of a segment to the nearest point in another segment and vice versa. A low HD indicates a good match.

$$HD(A,B) = \max(h(A,B), h(B,A)) \quad \text{with} \quad h(A,B) = \max_{a \in A} \min_{b \in B} d(a,b) \tag{3}$$

with the Euclidean distance d between point a and b.
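
Equations (1) to (3) translate directly into code. The sketch below represents each segment as a set of voxel index tuples and uses an exhaustive nearest-neighbor search, which is only practical for small examples. The `spacing` argument, which converts voxel indices to physical distances, and all function names are assumptions for illustration; for ASD and HD the sets would hold the surface voxels of the segments.

```python
import math

def dice(a, b):
    """Equation (1): overlap of two voxel sets, 0 for disjoint and 1 for identical sets."""
    return 2 * len(a & b) / (len(a) + len(b))

def directed_distances(a, b, spacing):
    """Distance from every voxel in a to its nearest voxel in b, in physical units."""
    def dist(p, q):
        return math.sqrt(sum(((pi - qi) * s) ** 2 for pi, qi, s in zip(p, q, spacing)))
    return [min(dist(p, q) for q in b) for p in a]

def average_surface_distance(a, b, spacing):
    """Equation (2): symmetric average of the nearest-neighbor distances."""
    d_ab = directed_distances(a, b, spacing)
    d_ba = directed_distances(b, a, spacing)
    return (sum(d_ab) + sum(d_ba)) / (len(a) + len(b))

def hausdorff_distance(a, b, spacing):
    """Equation (3): largest nearest-neighbor distance in either direction."""
    return max(max(directed_distances(a, b, spacing)),
               max(directed_distances(b, a, spacing)))

# Hypothetical 2D segments: two 2 x 2 squares offset by one voxel along the second axis.
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
B = {(0, 1), (0, 2), (1, 1), (1, 2)}
spacing = (1.0, 1.0)

dc = dice(A, B)
asd = average_surface_distance(A, B, spacing)
hd = hausdorff_distance(A, B, spacing)
```

For these two squares half of the voxels overlap, so DC is 0.5; the non-overlapping voxels each lie one unit from the other segment, giving an ASD of 0.5 and an HD of 1.0.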

Note that the aim of this procedure was not to prove that our definition of pial surface or hippocampal formation is better than that of the WHS atlas. We only wanted to determine whether the overlap could be improved (i.e., differences could be reduced) with different algorithms.

### 2.4. Hard- and Software

The processing was partially done using high-performance computing tools and supercomputing facilities of the Jülich Supercomputing Centre, Germany [Juelich Dedicated GPU Environment (JuDGE)], as well as the in-house Solaris computer cluster. Custom C++ software programs using ITK, elastix, OpenCV, MPI, OpenMP, Qt and OpenGL performed the 3D reconstruction of the postmortem rat brains, the data integration into the WHS atlas, as well as the evaluation and the visualization of the results.

### 3. RESULTS

The registration of the atlas MRI volume to the respective blockface volume was done in three successive steps: rigid, affine and non-linear B-Spline based registration. All three steps used the Adaptive Stochastic Gradient Descent approach for optimization. A multi-resolution registration with six levels was used to overcome local minima problems. The brain volumes were downsampled by a factor of 2 compared to the next resolution level. The similarity of the intensity values of blockface and MRI data was determined with the Mutual Information metric, which was specifically developed for multimodal data sets (Viola and Wells, 1997). As expected, the matching of brain structures from the different modalities improved considerably with increasing degrees of freedom of the transformations. Considerable differences were found after the rigid registration. The affine registration improved the matching, but differences still existed. The application of the non-linear registration led to a high matching quality. The complete registration took 39 min for the PLI brain (rigid 5 min 16 s, affine 5 min 16 s, B-Spline 28 min 30 s) and 66 min for the receptor brain (rigid 11 min 33 s, affine 12 min 8 s, B-Spline 42 min 18 s).
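
The six-level schedule implied by this description can be written out explicitly. The sketch below lists the downsampling factor and resulting volume size per level, coarsest first, for the receptor brain blockface dimensions given in Section 2.3.2. Rounding the sizes up and the function name are assumptions; the actual pyramid implementation may treat image borders differently.

```python
import math

def pyramid_schedule(shape, levels=6, factor=2):
    """Downsampling factor and volume size at each resolution level, coarsest first."""
    schedule = []
    for level in range(levels):
        f = factor ** (levels - 1 - level)
        schedule.append((f, tuple(math.ceil(n / f) for n in shape)))
    return schedule

# Blockface volume dimensions of the receptor brain (BCS R) from Section 2.3.2.
schedule = pyramid_schedule((996, 1356, 1361))
factors = [f for f, _ in schedule]
```

With a factor of 2 between levels, the coarsest of the six levels is downsampled 32-fold along each axis, so the first registration steps operate on a volume of only a few tens of voxels per side.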

### 3.1. Qualitative Evaluation

The results after each step are illustrated as checkerboard images of blockface and MRI volumes in three orthogonal views (coronal, horizontal and sagittal, cf. **Figure 3**). The differences in size and shape of the brains nearly disappeared in the progression from rigid through affine to non-linear registration. Depending on the registration method, considerable to minor differences could be easily detected at three sites: the outer surface of the entire brain, the olfactory bulb and the cerebellum. In the blockface volume, the entire brain was wider, the olfactory bulb was deflected and laterally displaced, and the cerebellum was more flattened compared to the MRI volume. The difference between the outer surfaces (**Figure 3**, green circles) decreased after affine registration and disappeared after the non-linear registration. The positions of the cerebella (**Figure 3**, blue circles) and olfactory bulbs (**Figure 3**, yellow circles) differed considerably between blockface and MRI volumes. This could not be compensated by linear and global transformations. Only non-linear registration fitted these structures. In conclusion, rigid transformation sufficiently centered the brains, affine registration compensated differences in size and shape of the brains, and finally, non-linear registration aligned both small local mismatches and large differences (cf. olfactory bulb and cerebellum).

The cytoarchitectonic, M<sub>2</sub> receptor distribution, and fiber orientation volumes were superimposed with the atlas to demonstrate the quality of the match between the reconstructed and the atlas volumes (**Figures 4**, **5**).

In both brains, the overall matching quality was high, particularly at the anterior commissure (**Figure 4**, magenta arrow) and the corpus callosum (**Figure 5**, green arrow). In the receptor brain, the artificial gap between the hemispheres was caused by the removal of the unfixed brain from the skull, which resulted in an anti-clockwise rotation and lateral displacement of the hemispheres and thereby local mismatches at the mesial cortical surface, the border between the retrosplenial cortex and the underlying white matter, and the medial protrusion of the neocortex toward the hippocampus (**Figure 4**, yellow arrows). The differences between the position and shape of the cerebella in the atlas and the reconstructions led to further mismatches (**Figure 4**, red arrows), which were not compensated by the registration. In the PLI brain, minor mismatches were found at the outer surface of the brain (**Figure 5**, red arrows) and the cerebellum (**Figure 5**, yellow arrow). The better match of the PLI brain is understandable, because the receptor brain was not fixed and was therefore more prone to distortions, whereas the PLI brain was fixed before deep freezing.

### 3.2. Quantitative Evaluation

#### FIGURE 4 | Superimposition of the atlas structures (white contours) on the histological volume of the receptor brain (upper row) and on the M<sub>2</sub> receptor density distribution volume (lower row). The color legend of the lower row denotes the distribution of M<sub>2</sub> receptor densities (red: high, black: low). The overall matching quality is high, especially at the anterior commissure (magenta arrow). The gap between the hemispheres results in small mismatched boundaries (yellow arrows). The large differences of the cerebella in location and shape yield some small discrepancies (red arrows).

The quality of alignment between the atlas and the reconstructed volumes was estimated by comparing the topography of the surface of the entire brain and of the hippocampal complex using three different measures introduced in Section 2.3.3. The surface of the entire brains was segmented with a 3D watershed algorithm. The atlas labels were used to extract the surface of the entire brain and hippocampal complex. The outer contour of the hippocampal complex was manually traced in the M<sub>2</sub> and the FOM sections by experienced neuroanatomists. The hippocampal complex comprises the Cornu Ammonis regions 1, 2, and 3, the dentate gyrus and the subicular complex with the subiculum, presubiculum and parasubiculum. The hippocampal complex spans a wide portion of the brain in both the rostrocaudal and dorsoventral directions, and can be used to demonstrate the registration quality for inner anatomical structures. Since the spatial location of the olfactory bulbs differed between MRI and blockface volumes, and caused far-reaching effects on the registration quality of many other brain structures, particularly at rostral levels, the quantitative evaluation was carried out with registrations with or without the olfactory bulbs. An important registration parameter was the spacing of the control points in the B-Spline grid, which indicates the flexibility of the transformation. A low spacing guarantees a high flexibility of the transformation.
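The three measures used in this evaluation (Dice coefficient, Hausdorff distance and average surface distance) can be sketched on binary masks as follows. This is a minimal illustration under common textbook definitions, with distances in voxel units; the function names, the brute-force nearest-neighbour search, and the symmetric averaging are assumptions and may differ from the definitions given in Section 2.3.3.

```python
import numpy as np

def dice(a, b):
    """Dice coefficient of two binary masks (1 = full overlap)."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def surface_points(mask):
    """Coordinates of foreground voxels that touch the background along an axis."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = np.ones_like(mask)
    crop = (slice(1, -1),) * mask.ndim
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            interior &= np.roll(padded, shift, axis=axis)[crop]
    return np.argwhere(mask & ~interior)

def directed_distances(p, q):
    """For every point in p, the Euclidean distance to the nearest point in q."""
    diff = p[:, None, :] - q[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def hausdorff(a, b):
    pa, pb = surface_points(a), surface_points(b)
    return max(directed_distances(pa, pb).max(), directed_distances(pb, pa).max())

def average_surface_distance(a, b):
    pa, pb = surface_points(a), surface_points(b)
    d = np.concatenate([directed_distances(pa, pb), directed_distances(pb, pa)])
    return d.mean()

# Hypothetical example: two 10 x 10 squares, one shifted by 2 voxels.
a = np.zeros((20, 20), dtype=bool)
b = np.zeros((20, 20), dtype=bool)
a[5:15, 5:15] = True
b[5:15, 7:17] = True

dc = dice(a, b)                       # 2 * 80 / (100 + 100) = 0.8
hd = hausdorff(a, b)                  # 2.0 voxels
asd = average_surface_distance(a, b)
```

The Dice coefficient summarizes volume overlap, whereas the Hausdorff distance reports the single worst surface mismatch and is therefore sensitive to local outliers such as a displaced olfactory bulb.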

The results of the comparisons after each registration step using different spacing of the control points are shown in **Figure 6**. Using the Dice coefficient, the affine registration was sufficient to reach a high matching of the entire brain and the hippocampal complex well above 0.7, which is commonly accepted as a limit for a good match (Zijdenbos et al., 1994). However, the measures improved significantly after the non-linear B-Spline registration. The Dice coefficient and the average surface distance reached their optimum at middle flexibility of the transformation grid for the entire brain comparison. The Hausdorff distance reached an optimum at high flexibility. This is caused by the fact that the artificial displacements of the olfactory bulbs were compensated. However, this led to undesired transformations of brain structures at rostral levels. That was also reflected in the quantitative evaluation of the hippocampal complex. Here, the best results were achieved by the less flexible affine registration (receptor brain), or after a non-linear registration at low flexibility levels (PLI brain). Comparing the results after registration without inclusion of the olfactory bulbs, the measures improved significantly for the hippocampal complex in the receptor brain. In contrast to the receptor brain, the results for the PLI brain did not differ considerably between the registrations with and without the olfactory bulbs. The results are summarized in **Table 1**.

### 4. DISCUSSION

The study aimed at integrating multimodal (i.e., cytoarchitectonic, muscarinic M<sub>2</sub> receptor distribution and fiber orientation data) and multiscale (i.e., mesoscopic resolution of blockface images and MR data of the 3D digital WHS Atlas, and microscopic resolution of sections) data in a common stereotaxic reference space. This was achieved by linear and non-linear registration. The qualitative and quantitative evaluations demonstrated a good matching of all data sets. We selected the whole brain, and additionally the hippocampal complex, which spans a wide distance within the brain in both the rostro-caudal and dorso-ventral directions, as examples to demonstrate the quality of the registration methods.

### 4.1. Methodic Challenges

#### 4.1.1. 3D Reconstruction

The 3D reconstruction of rodent brains is often carried out by means of rigid or affine registration transformations guided by blockface images (Ourselin et al., 2000; Lebenberg et al., 2010) or by volumes obtained from MRI (Li et al., 2009; Yang et al., 2012). We used blockface images for the reconstruction because they provide largely undistorted reference images of the brain sections. Furthermore, a reconstructed blockface volume is an excellent reference template, particularly if a real 3D volume (e.g., MRI volume) is missing or the resolution of the respective MRI volume is not appropriate (Schober et al., 2015). A particular challenge of the 3D reconstruction was the differential deformation of the brains inevitably caused by different tissue processing techniques. While the PLI brain was fixed and deep frozen before sectioning, the receptor brain was just deep frozen to maintain the receptor architecture. The fixation and deep freezing of the PLI brain introduces fewer deformations compared to the native brain size and shape than the deep freezing of the unfixed receptor brain (e.g., location of olfactory bulb). The sectioning procedure and mounting of sections from fixed brains also result in fewer deformations than those of sections from unfixed brains. To compensate for these deformations, we performed, in addition to the common linear reconstruction strategies, non-linear transformations, which yielded largely good reconstruction results. However, the distortions caused by the anti-clockwise rotation of the hemispheres during the mounting procedure, together with the more fragile nature of these unfixed cryostat microtome sections, could not be completely eliminated at some sites (**Figure 4**, yellow arrows). A further challenge was the different information provided by the different modalities, e.g., "empty regions" (cf. Section 2.2.2) in sections of the receptor brain. Therefore, particular reconstruction strategies were necessary for each modality, e.g., landmarks were interactively introduced in the receptor images, which was not required in the cytoarchitectonic and fiber tract images, because the latter images do not contain empty regions. In particular, the strategy developed to solve the problem of empty regions in receptor autoradiographs represents a crucial step forward in the reconstruction of future datasets coding for the regional and laminar distribution patterns of receptors. Empty regions are a recurrent problem, but the brain structures that do not express a certain type of receptor, or do so only at extremely low densities, vary considerably between receptor types.

#### FIGURE 6 | Diagrams of the quantitative analyses of the Dice coefficient (DC), Hausdorff distance (HD) and average surface distance (ASD) from rigid, affine to non-linear transformations, whereby numbers 20–100 describe the flexibility of the grid (20: high flexibility, 100: low flexibility). The upper row illustrates the registration results of the receptor brain and the lower row demonstrates the registration results of the PLI brain. Diagrams in the first column indicate the results of the analyses of the whole brains (wb), the diagrams in the second column indicate the results of the analyses of the hippocampal complex (hc). The last two columns show diagrams of the analyses of the wb (3rd column) and the hc (4th column) after a registration excluding the olfactory bulb (ob). The best results of the quantitative and the qualitative analyses are marked with yellow rectangles.

#### TABLE 1 | The quantitative evaluation is measured by Dice coefficients (DC), Hausdorff distances (HD) and average surface distances (ASD) between receptor and PLI based and atlas based segmented structures, considering the whole brain and an internal structure, the hippocampal complex.

The Dice coefficient ranges between 0 and 1, where 1 indicates full overlap. The lowest Hausdorff distance and average surface distance values indicate the best alignments. The best results are labeled bold.

#### 4.1.2. Data Integration

Although many studies described the registration of 3D data, e.g., MRI volumes, to 3D atlases of rodent (Sergejeva et al., 2015) or human brains (Collins and Evans, 1997), only a few studies registered postmortem rodent brain sections to an MRI volume based atlas (Lebenberg et al., 2010, 2011; Abdelmoula et al., 2014; Sergejeva et al., 2015). Lebenberg et al. (2010) published the alignment of autoradiographic and histological data of one hemisphere of the mouse brain into a 3D digital MRI based atlas by means of a three step strategy containing rigid, affine and non-linear (elastic) transformations using the reconstructed blockface volume as intermediate modality between atlas and postmortem data. Abdelmoula et al. (2014) used a similar method to transfer mass spectrometry data into the Allen Mouse Brain atlas (Goldowitz, 2010) via affine and non-rigid B-Spline based registration. Sergejeva et al. (2015) identified anatomical landmarks in MRI, blockface or histological images for a landmark based affine registration of these data to the WHS rodent atlases (Johnson et al., 2010).

Since the 3D reconstructed receptor and PLI brains as well as the WHS Atlas brain slightly differed in size and shape of the entire brain and its inner structures, the use of a non-linear registration was indispensable. We improved the three-step registration strategy published in Lebenberg et al. (2010) by the consistent use of a multi-resolution registration, i.e., application of a pyramidal method to all three transformation steps, by the use of a similarity criterion (Mutual Information) as an optimal metric for multimodal registration tasks, and by testing the influence of the grid spacing used by the non-linear transformation on the registration results. The combination of linear and non-linear transformations of the brains, and the use of the blockface volume as intermediate modality between fiber, cyto- and receptor architectonic and MRI data provided a maximal concordance of the brains. Rigid and affine transformations optimized the matching of the position of the different brains and compensated global shearing and scaling misalignments. Local structural adaptations were done with the non-linear B-Spline based registration. A crucial parameter was the flexibility of the B-Spline grid. With higher flexibility the algorithm generally works more accurately, but unrealistic deformations can be induced. This is illustrated by increasing the grid flexibility, which led to the best overlap and distance results with the atlas brain (**Figure 6**). The overlap of the receptor or PLI brains and the MRI brain of the atlas was nearly perfect (Dice coefficients of 0.98 for the M<sub>2</sub> receptor brain; 0.97 for the PLI brain), but structures within the forebrain, the olfactory bulb and the cerebellum were unrealistically deformed. To overcome this problem, a lower grid flexibility was chosen, although this led to a lower overlap of the entire brains, particularly in the region of the olfactory bulb.
Since the position of the bulb in the receptor and the PLI brain does not reflect its natural position, but is extremely deformed by the necessary preparation steps for receptor autoradiography and PLI measurement, the sections through the bulb were excluded from the registration. This led to a much better overlap and improved distance measurements of the hippocampal complex, particularly in the receptor brain, and still to a good matching of the entire brains, well above the commonly accepted limit for a good match (Dice coefficient of 0.84 for both brains). Although the quantitative evaluation was based on automatically extracted contours in the case of the whole brain and on manually defined contours in the case of the hippocampus, a comparison of the results obtained for both structures reveals a high consistency. Likewise, although the quantitative evaluation of the matching of the hippocampal complex was based on independent delineations in the receptor, PLI and MRI brains by different experienced neuroanatomists, the results demonstrated high congruence of the different delineations of the hippocampal complex. This registration strategy was very effective in most brain regions. However, the remaining anti-clockwise rotation of the hemispheres in the fragile sections of the unfixed receptor brain and the mismatch between the neocortex and hippocampus in the center of the section (**Figure 4**) could only be compensated with an extremely high flexibility of the B-Spline grid. This would introduce large, biologically unrealistic deformations in adjoining brain regions.
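The trade-off between grid spacing and flexibility discussed above can be made concrete by counting the free parameters of a cubic B-Spline transformation. The sketch below assumes a common convention in which each axis carries the number of grid cells needed to cover the volume plus three additional control points for a cubic spline; the volume size is hypothetical, and the spacing values are interpreted in the same (unspecified) units as the volume extent.

```python
import math

def control_points_per_axis(extent, spacing, spline_order=3):
    """Control points along one axis (assumed convention: cells + spline order)."""
    return math.ceil(extent / spacing) + spline_order

def bspline_parameters(shape, spacing, spline_order=3):
    """Total number of displacement coefficients of the B-Spline grid."""
    n_points = 1
    for extent in shape:
        n_points *= control_points_per_axis(extent, spacing, spline_order)
    return n_points * len(shape)   # one displacement component per dimension

shape = (512, 256, 400)            # hypothetical volume extent
for spacing in (20, 40, 60, 80, 100):
    print(spacing, bspline_parameters(shape, spacing))
```

Halving the spacing multiplies the number of parameters by roughly eight in 3D, which is consistent with the observation that a fine grid can fit displaced structures such as the olfactory bulb but can also introduce unrealistic local deformations.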

### 4.2. Limitations and Applications

We are aware that there is a series of putative limitations in the present study. One of them is that different section thicknesses had to be used for processing of the receptor and PLI brains due to technical constraints. The PLI method requires fixation of brain tissue as well as a minimal section thickness in order to enable extraction of information concerning the direction of the fibers, and previous studies from our group have shown 60 µm to be an optimal thickness (Axer et al., 2011a,b). Quantitative in vitro receptor autoradiography requires usage of unfixed deep-frozen brains, since the method is based on the fact that the receptors to be visualized must maintain their ability to bind the radioactively labeled ligand present in the incubation buffer (Zilles et al., 2002). Unfortunately, it is technically not possible to obtain 60 µm thick sections from tissue processed in this manner, and, therefore, we used 20 µm thick sections. However, the different section thicknesses were accounted for during 3D reconstruction, so we do not think this poses a problem for the registration of our different image modalities to the WHS atlas. Quite the contrary, the methods developed here to overcome these differences in section thickness will facilitate future inclusion of multiscale data into the WHS rat atlas or the WHS mouse atlas.

Our cyto-, M<sub>2</sub> receptor, and fiber architectonic datasets were obtained from adult Wistar rat brains, whereas the WHS atlas is based on an MRI scan of an adult Sprague Dawley rat (Papp et al., 2014, 2015; Kjonigsen et al., 2015). The fact that brains from different rat strains have been used may also be viewed as a putative problem. However, this issue has been addressed in the past and is not thought to constitute a problem, since comparison of the cyto- and chemoarchitecture of the hippocampal formation in different rat strains has shown it to be a highly conserved brain structure (Kjonigsen et al., 2015).

A large variety of 3D digital atlases based on MRI or reconstructed histological or histochemical sections are available for rodent brains (Goldowitz, 2010; Dorr et al., 2008; Li et al., 2009, 2010; Johnson et al., 2010), nonhuman primate brains (Paxinos et al., 2000; Calabrese et al., 2015) and human brains (Hawrylycz et al., 2012; Shen et al., 2012; Amunts et al., 2013; Amunts and Zilles, 2015). Compared to the currently available atlases, the WHS (Hawrylycz et al., 2011; Papp et al., 2014) is a unique framework operating as a hub of an infrastructure connecting rodent brain data and reference spaces. Enriching this framework with cyto-, M<sub>2</sub> receptor and fiber architecture provides a valuable extension for analyses of the enormous structural complexity of the brain data.

### 4.3. Conclusion

We developed a tool to register multiscale and multimodal rat brain data to the WHS atlas brain. It enables retrieval of detailed information on volume densities of cell bodies, neurotransmitter receptor densities, and fiber tract architecture and orientation in microscopically identified brain regions (cf. **Figure 7**).

Therefore, our results considerably expand the database of the WHS. Furthermore, the methods developed in the present study enable future integration of data of other modalities, which can further enhance the neuroscientific impact of the atlas. The 3D reconstructions of the cyto-, receptor and fiber architectonic images registered to WHS will be publicly accessible through the Human Brain Project (HBP) portal.

### REFERENCES


### FUNDING

This study was partially supported by the National Institutes of Health under grant agreement no. R01MH092311, by the Helmholtz Association through the Helmholtz Portfolio Theme "Supercomputing and Modeling for the Human Brain," and by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102 (Human Brain Project) and by the Research Council of Norway (Norwegian Node of the International Neuroinformatics Coordinating Facility, project ID 214842).

### ACKNOWLEDGMENTS

We would like to thank M. Cremer, Research Centre Jülich, Germany, for excellent technical assistance and preparation of the histological sections, and G. Csucs and D. Darine for valuable assistance with the use of neuroinformatics tools and infrastructures.

macaque brain. NeuroImage 117, 408–416. doi: 10.1016/j.neuroimage.2015.05.072


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Schubert, Axer, Schober, Huynh, Huysegoms, Palomero-Gallagher, Bjaalie, Leergaard, Kirlangic, Amunts and Zilles. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Opposing Effects of Neuronal Activity on Structural Plasticity

Michael Fauth<sup>1,2\*</sup> and Christian Tetzlaff<sup>2,3</sup>

<sup>1</sup> Department of Computational Neuroscience, Third Institute of Physics - Biophysics, Georg-August University, Göttingen, Germany, <sup>2</sup> Bernstein Center for Computational Neuroscience, Göttingen, Germany, <sup>3</sup> Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany

The connectivity of the brain is continuously adjusted to new environmental influences by several activity-dependent adaptive processes. The most investigated adaptive mechanism is activity-dependent functional or synaptic plasticity regulating the transmission efficacy of existing synapses. Another important but less prominently discussed adaptive process is structural plasticity, which changes the connectivity by the formation and deletion of synapses. In this review, we show, based on experimental evidence, that structural plasticity can be classified similar to synaptic plasticity into two categories: (i) Hebbian structural plasticity, which leads to an increase (decrease) of the number of synapses during phases of high (low) neuronal activity and (ii) homeostatic structural plasticity, which balances these changes by removing and adding synapses. Furthermore, based on experimental and theoretical insights, we argue that each type of structural plasticity fulfills a different function. While Hebbian structural changes enhance memory lifetime, storage capacity, and memory robustness, homeostatic structural plasticity self-organizes the connectivity of the neural network to assure stability. However, the link between functional synaptic and structural plasticity as well as the detailed interactions between Hebbian and homeostatic structural plasticity are more complex. This implies even richer dynamics requiring further experimental and theoretical investigations.

### Edited by:

Arjen Van Ooyen, Vrije Universiteit (VU), Netherlands

#### Reviewed by:

Jochen Triesch, Johann Wolfgang Goethe University, Germany Andreas Vlachos, Heinrich-Heine-University, Germany

> \*Correspondence: Michael Fauth mfauth@gwdg.de

Received: 08 September 2015 Accepted: 16 June 2016 Published: 28 June 2016

#### Citation:

Fauth M and Tetzlaff C (2016) Opposing Effects of Neuronal Activity on Structural Plasticity. Front. Neuroanat. 10:75. doi: 10.3389/fnana.2016.00075

Keywords: structural plasticity, architectural plasticity, timescales, synaptic plasticity, network topology

### INTRODUCTION

Information from the environment leads to the activation of neural subnetworks in the brain. The connectivity of these neural subnetworks, i.e., the existence and strength of synapses between neurons, influences the neuronal activation and, thereby, determines the way environmental information is processed. Accordingly, the long-term storage of information is related to activity-dependent (long-lasting) changes in connectivity (Hebb, 1949; Morris et al., 1986; Rioult-Pedotti et al., 1998; Leuner et al., 2003; Pastalkova et al., 2006; Whitlock et al., 2006; reviewed, e.g., in Martin et al., 2000; Chklovskii et al., 2004; Dudai, 2004; Hübener and Bonhoeffer, 2010). Basically two types of activity-dependent mechanisms yield such changes: synaptic or functional plasticity and structural plasticity. Structural or architectural plasticity determines the formation and removal of synapses. On the other hand, synaptic or functional plasticity changes the electrochemical transmission efficacy of synapses by altering, for instance, the receptor configuration of the postsynaptic site. Note that, as we will show, this functional synaptic plasticity is associated with structural changes at existing synapses (size, postsynaptic density, etc.) and these changes are sometimes summarized as structural plasticity (Lamprecht and LeDoux, 2004). However, here we restrict structural plasticity to changes of the number of synapses (and of axonal/dendritic trees) and refer to the long-term functional changes at existing synapses as synaptic plasticity.

The alterations of the transmission efficacy by synaptic plasticity depend on the level of neuronal activation. However, the mapping between activity level and triggered synaptic changes is not unique. In general, these changes are categorized into two classes: Hebbian and homeostatic synaptic plasticity. Hebbian synaptic plasticity yields an increase in synaptic efficacy given high neuronal activities (long-term potentiation; LTP; Bliss and Lomo, 1973; Lynch et al., 1983; Bliss and Collingridge, 1993; see Feldman, 2009 for a review), while low levels of activity induce a decrease (long-term depression; LTD; Lynch et al., 1977; Dudek and Bear, 1992; Mulkey and Malenka, 1992; see Collingridge et al., 2010 for a review). Thus, Hebbian synaptic plasticity basically maps the neuronal activation onto the synaptic efficacies or rather connectivity (high activity → stronger connections; low activity → weaker connections; Hebb, 1949; Bliss and Lomo, 1973; Dudek and Bear, 1992; Kirkwood et al., 1996). These changes in the connectivity, in turn, influence the neuronal activities. Along these lines, theoretical studies show (Rochester et al., 1956; Riedel and Schild, 1992; Gerstner and Kistler, 2002; Kolodziejski et al., 2010) that Hebbian synaptic plasticity alone induces a positive feedback loop leading to unrestricted synaptic (and thus neuronal) dynamics. On the other hand, homeostatic synaptic plasticity, such as synaptic scaling (Turrigiano et al., 1998), acts conversely to Hebbian synaptic plasticity. If neuronal activities are high, synaptic efficacies are decreased, while, if activities are low, efficacies are increased (high activity → weaker connections; low activity → stronger connections; Turrigiano et al., 1998; Hou et al., 2008, 2011; Ibata et al., 2008). Thereby, homeostatic synaptic plasticity alone induces a negative feedback loop and, thus, stabilizes the dynamics.
As several theoretical results indicate (Tetzlaff et al., 2011; Zenke et al., 2013; Toyoizumi et al., 2014), the combination of both plasticity processes leads to desired, stable dynamics.

We will argue in this review that, analogous to functional synaptic plasticity, structural plasticity can also be categorized into two different classes of activity-dependency: (i) One class of structural changes maps features of the neuronal activity onto the connectivity, such that the connectivity is strengthened with high activity levels and vice versa. These changes will be referred to as Hebbian structural plasticity (Hebb, 1949; Helias et al., 2008). (ii) The other class of structural changes weakens (strengthens) the connectivity given high (low) neuronal activities and, thus, stabilizes the dynamics. This class is named homeostatic structural plasticity (Butz et al., 2009).

Note that this classification is phenomenological. Changes in connectivity (synaptic as well as structural) are not directly linked to neuronal activity. Neuronal activity initiates such changes by triggering secondary processes, such as molecular signaling cascades, which lead to the corresponding changes. For the plasticity processes discussed here, these underlying signaling cascades can have different degrees of similarity, which we will not consider in detail. The focus of this review is to systematize the qualitative links between the neuronal activity level and resulting connectivity changes.

Moreover, we focus on morphological changes of connections between excitatory neurons only. The dynamics of inhibitory synapses has been reviewed, for instance, by Vogels et al. (2013) for inhibitory synaptic plasticity and by Flores and Méndez (2014) for inhibitory structural plasticity. Further non-synaptic homeostatic mechanisms stabilizing neural network dynamics have been reviewed in Turrigiano and Nelson (2004), Marder and Goaillard (2006), or Yin and Yuan (2014).

In the following, as structural and synaptic plasticity are linked to each other, we first briefly outline the main findings for synaptic plasticity. Then, we review the morphological changes of synapses induced by synaptic plasticity and relate these changes to the dynamics of synapses and, thus, to structural plasticity. Following this, we summarize the experimental evidence of activity-dependent structural changes and categorize these, similar to synaptic plasticity, into the two classes of Hebbian and homeostatic structural plasticity. We also briefly review indications of Hebbian and homeostatic processes occurring during development. Finally, we sort theoretical investigations studying the dynamics of structural plasticity by this categorization and, based on their results, arrive at conclusions about the different functional roles of Hebbian and homeostatic structural plasticity.

### ACTIVITY-DEPENDENT SYNAPTIC PLASTICITY

The most investigated long-term plasticity in neuronal systems is synaptic plasticity. This mechanism adapts synaptic efficacies (by, e.g., altering the number of AMPA receptors at the postsynaptic site) between neurons dependent on the neuronal activation. One distinguishes between two different forms of synaptic plasticity: (i) Hebbian synaptic plasticity and (ii) homeostatic synaptic plasticity.

(i) Hebbian synaptic plasticity adapts the synaptic efficacies seconds or minutes after onset of a stimulus-induced neuronal activation. In general, neuronal activity induces a calcium influx into the postsynaptic site inducing a complex molecular cascade which changes, amongst others, the number of AMPA receptors determining the synaptic efficacy (Kauer et al., 1988; Muller and Lynch, 1988; Shi et al., 1999; reviewed, e.g., in Malenka and Bear, 2004). Many experiments show that a low calcium level (thus a low neuronal activity level) leads to a decrease of the number of AMPA receptors (long-term depression: LTD; Lynch et al., 1977; Dudek and Bear, 1992; Mulkey and Malenka, 1992; Beattie et al., 2000; see Collingridge et al., 2010 for a review) while a high calcium level yields an insertion of new ones resulting in a stronger synaptic efficacy (long-term potentiation: LTP; Bliss and Lomo, 1973; Lynch et al., 1983; Malenka et al., 1992; Bliss and Collingridge, 1993; see Feldman, 2009 for a review). Thus, after several minutes, Hebbian synaptic plasticity maps the strength of the stimulus onto the strength of the synaptic transmission. Note, synapses from several input sources connecting to the same postsynaptic neuron can interact with each other yielding cooperative and competitive dynamics (Miller, 1996). Moreover, the change of the synaptic efficacy can also depend on the relative timing of pre- and postsynaptic action potentials (spike-timing-dependent plasticity: STDP; Levy and Steward, 1983; Gerstner et al., 1996; Markram et al., 1997b; Bi and Poo, 1998; see Markram et al., 2011 for a review), such that also temporal correlations might be mapped onto the synaptic efficacies. However, as several theoretical studies indicate, Hebbian synaptic plasticity alone induces a positive feedback loop leading to unrestricted growth of the synaptic efficacy (Rochester et al., 1956; Riedel and Schild, 1992; Gerstner and Kistler, 2002; Kolodziejski et al., 2010). 
In other words, if a stimulus drives the firing of the postsynaptic neuron, LTP potentiates the corresponding synaptic efficacy and, by this, induces a stronger input drive which, in turn, generates more potentiation and so forth. Thus, Hebbian synaptic plasticity alone would yield unstable, divergent dynamics of the synaptic efficacies.

(ii) Another process adapting the transmission strength of a synapse is homeostatic synaptic plasticity. Several different homeostatic processes dampen the dynamics of neuronal systems on various levels (Zhang and Linden, 2003; Turrigiano and Nelson, 2004; Marder and Goaillard, 2006; Turrigiano, 2011; Yin and Yuan, 2014). Thus, it is reasonable that homeostatic processes, like synaptic scaling, also adapt synaptic efficacies (Turrigiano et al., 1998; Hengen et al., 2013; Keck et al., 2013). Amongst others, this mechanism depends mainly on the average postsynaptic activity (Ibata et al., 2008). Here, in contrast to Hebbian synaptic plasticity, if the neuronal activity is high, the synaptic efficacies are decreased and, if the activities are low, the efficacies are increased (Turrigiano et al., 1998; Burrone et al., 2002; Kim et al., 2012). Hereby, synaptic scaling is unspecific, i.e., it scales all synapses onto a postsynaptic neuron preserving relative differences between synaptic efficacies induced by Hebbian plasticity (Turrigiano, 2008). However, several experiments indicate (e.g., Turrigiano et al., 1998; Hengen et al., 2013; Keck et al., 2013, but see also Ibata et al., 2008) that, compared to Hebbian synaptic plasticity, this process is much slower (hours to days) which complicates the analysis of both processes within the same experimental setup (Vitureira and Goda, 2013). Nevertheless, theoretical investigations show that synaptic scaling is one way to solve the problem of unrestricted growth discussed above (Tetzlaff et al., 2011; Zenke et al., 2013; Toyoizumi et al., 2014). Please note that there are also other solutions proposed to solve this problem (von der Malsburg, 1973; Sejnowski, 1977a,b; Bienenstock et al., 1982; Oja, 1982).
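The argument of the two paragraphs above can be illustrated with a minimal numerical sketch. It is a toy model under stated assumptions, not the rule used in any of the cited studies: a single linear neuron (`post = w * input_rate`), a Hebbian term proportional to the product of pre- and postsynaptic activity, and a weight-dependent scaling term that pushes the postsynaptic activity toward a target value. The function name `simulate_weight`, the functional form of the scaling term, and all parameter values are hypothetical choices made for illustration.

```python
def simulate_weight(use_scaling, mu=0.1, gamma=0.1, target_rate=1.0,
                    input_rate=1.0, w0=0.5, dt=0.1, n_steps=2000):
    """Euler integration of a toy plasticity rule for one synapse.

    Hebbian term (assumed form):  mu * input_rate * post
    Scaling term (assumed form):  gamma * (target_rate - post) * w**2
    """
    w = w0
    for _ in range(n_steps):
        post = w * input_rate              # linear neuron: output follows the weight
        dw = mu * input_rate * post        # Hebbian: more activity -> more growth
        if use_scaling:
            # Homeostatic: high activity weakens, low activity strengthens
            dw += gamma * (target_rate - post) * w ** 2
        w += dt * dw
    return w


if __name__ == "__main__":
    print("Hebbian only     :", simulate_weight(use_scaling=False))
    print("Hebbian + scaling:", simulate_weight(use_scaling=True))
```

In this sketch the Hebbian-only weight grows exponentially, whereas adding the scaling term lets the weight settle at a finite fixed point. Other stabilizing terms would serve the same illustrative purpose.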

In summary, investigations in the field of synaptic plasticity show that, at least, two classes of processes adapt synaptic efficacies: Hebbian synaptic plasticity and homeostatic synaptic plasticity. Hereby, Hebbian synaptic plasticity maps neuronal activities onto the synaptic efficacies (high act. → stronger connect.; low act. → weaker connect.), which are, in turn, stabilized by homeostatic processes (high act. → weaker connect.; low act. → stronger connect.).

## ACTIVITY-DEPENDENT STRUCTURAL PLASTICITY

Activity-dependent structural plasticity basically influences two different physical substrates: On the one hand, neurites (i.e., dendrites and axons) grow and retract dependent on the level of neuronal activation (Cohan and Kater, 1986; van Huizen and Romijn, 1987). These growth processes determine the basic shape of a neuron and its regions of afferent and efferent connections. On the other hand, synapses (i.e., dendritic spines and axonal boutons) are continuously formed and deleted. Although an axon and a dendrite lie close together and the gap could be bridged by a synapse, the existence of a synapse is not guaranteed (Kalisman et al., 2005). In fact, the formation and deletion of a synapse also depend on the neuronal activation of both neurons (see e.g., Annis et al., 1994; Nägerl et al., 2007; Kwon and Sabatini, 2011; Hill and Zito, 2013).

As the majority of cortical synapses resides on dendritic spines (Yuste, 2010), many studies applied time-lapse imaging of the dynamics of dendritic spines for analyzing the structural dynamics or structural plasticity of single synapses. This raises the problem that the existence of a dendritic spine does not guarantee the existence of a functional synapse. However, several experiments provide evidence that, at least a few hours after spine formation, newborn spines are structurally and functionally equivalent to mature spines hosting a synapse (Trachtenberg et al., 2002; Knott et al., 2006; Nägerl et al., 2007; Zito et al., 2009). Similarly, the emergence and stabilization of axonal terminals or boutons also seem to involve synapse formation and maturation (Friedman et al., 2000; Ruthazer et al., 2006). Thus, the existence of a spine or bouton is a good indicator for the existence of a functional synapse.

### Link between Structural and Synaptic Plasticity

The dynamics of synapses is determined by the dynamics of dendritic spines. Accordingly, structural plasticity depends on the morphology of spines, such as their sizes and shapes (Nägerl et al., 2008; Tønnesen et al., 2011, 2014). Experiments indicate that the volume of a dendritic spine correlates with the synaptic efficacy of the corresponding synapse (Matsuzaki et al., 2001; Knott et al., 2006; Zito et al., 2009) which, in turn, is influenced by synaptic plasticity. Accordingly, stimuli causing long-term potentiation (LTP) also cause spine enlargements (Fifková and Van Harreveld, 1977; Okamoto et al., 2004; Yang et al., 2008, for a review see Yuste and Bonhoeffer, 2001) while stimuli causing long-term depression (LTD) induce spine shrinkage (Okamoto et al., 2004; Zhou et al., 2004; Oh et al., 2013). Hereby, synaptic and structural changes rely on distinct signaling cascades, which are triggered by the same signals (Matsuzaki et al., 2004; Zhou et al., 2004). Thus, blocking synaptic plasticity, for instance, by blocking NMDA receptors also prevents changes in the spine volume. Several experiments indicate that the spine head volume is correlated with the lifetime or stability of the spine (Grutzendler et al., 2002; Majewska et al., 2006; Yasumatsu et al., 2008; Loewenstein et al., 2015). Thus, the spine stability or removal of a synapse is, in turn, correlated with the synaptic efficacy of the corresponding synapse, which also has been directly observed in several experiments (Holtmaat et al., 2005; Le Bé and Markram, 2006, reviewed, e.g., in Kasai et al., 2003). In combination with STDP, this relation between synaptic weight, spine volume and spine stability could give rise to a spike-timing-dependent structural plasticity (Helias et al., 2008; Deger et al., 2012), which still has to be experimentally verified.
Interestingly, the stability of a synapse is also influenced by the reliability of signal transmission of the synapse (Wiegert and Oertner, 2013) which is also altered by synaptic plasticity (Stevens and Wang, 1994). Thus, for Hebbian-like changes, structural and synaptic plasticity are linked with each other by the morphology of spines or properties of the synapse (Segal, 2005).

Some evidence indicates a similar link for homeostatic changes: in vitro (Murthy et al., 2001) and in vivo (Keck et al., 2013) studies show that changes of the spine volume also go along with the activity-dependent homeostatic scaling of synaptic efficacies. Given the aforementioned correlation between spine volume and spine stability, we expect that structural plasticity is also linked to homeostatic synaptic plasticity.

In the following, we will summarize experimental results indicating the different aspects of activity-dependent structural plasticity in more detail. We will classify these aspects according to Hebbian (high act. → stronger connect.; low act. → weaker connect.) and homeostatic (high act. → weaker connect.; low act. → stronger connect.) structural plasticity in adult networks. In addition, we will show that many of these experiments support the here discussed link between synaptic and structural plasticity. We will also provide a brief survey of structural dynamics during development. Finally, we will discuss experimental evidence of the interaction of Hebbian and homeostatic structural plasticity in the same neural system.

### Evidence for Hebbian Structural Plasticity

### LTP-Stimuli

The induction of LTP by a strong neuronal activation is mainly associated with the increase of the synaptic efficacy (e.g., number of AMPA receptors) of existing synapses (Malenka and Bear, 2004; Feldman, 2009). However, as early as the 1980s, first studies (Lee et al., 1980; Chang and Greenough, 1984) indicated that 15–20 min after applying the strong stimulus the number of synapses is enhanced, too. In addition, one also observes an increase in the number of filopodia (Lee et al., 1980; Chang and Greenough, 1984), which seem to be the precursors of dendritic spines (Ziv and Smith, 1996). Accordingly, about 30 min after stimulation, an increased number of dendritic spines can be observed (Moser et al., 1994; Trommald et al., 1996; Engert and Bonhoeffer, 1999, but see also Desmond and Levy, 1990). The strength of the effect and the detailed timescale, however, depend strongly on the tissue and preparation method used (Sorra and Harris, 1998; Dunaevsky et al., 1999; Kirov et al., 1999; Bourne et al., 2007; Bourne and Harris, 2011), but most studies report timescales between 5 and 30 min.

This increase in the number of spines after an LTP-stimulus provides further support for an interaction between Hebbian synaptic plasticity and Hebbian structural plasticity: A strong neuronal activation will induce an increase in synaptic efficacies or rather in the spine volumes implying the stabilization of these enlarged dendritic spines. Given a continuous formation of new spines, this also implies that the new and small spines, which would be pruned without stimulation, will be enlarged and stabilized by synaptic plasticity. Together with the already existing (and further stabilized) spines, the stabilization of new spines by the strong stimulus would lead to an increase of the number of spines as observed experimentally. For this, the rate of forming new spines could be independent of the neuronal activity and stay constant. This potential explanation of the increase in spine number is supported by a recent study demonstrating that LTP stabilizes nascent spines (Hill and Zito, 2013). Accordingly, blocking the signals inducing LTP (by blocking the NMDA-channels) also prevents the increase in the number of dendritic spines and also of axonal boutons (Engert and Bonhoeffer, 1999; Maletic-Savatic et al., 1999; Toni et al., 1999; Nikonenko et al., 2003). Thus, the dynamics of dendritic spines can be explained by the link between Hebbian structural plasticity and synaptic plasticity.
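The reasoning in the preceding paragraph (constant formation, efficacy-dependent removal) can be written as a one-variable toy model. The sketch below is only an illustration under assumptions that are not taken from the cited experiments: spine number `n` follows dN/dt = F - r(w) N, the removal rate is assumed to be inversely proportional to the synaptic efficacy `w`, and LTP is modelled as a step increase of `w`. The function name and all numbers are hypothetical.

```python
def simulate_spine_number(weight_after_ltp=2.0, formation_rate=1.0,
                          base_removal=0.5, dt=0.01, t_ltp=20.0, t_end=60.0):
    """Spine number under constant formation and weight-dependent removal.

    Assumed toy dynamics: dN/dt = F - (base_removal / w) * N.
    The formation rate F never changes; only the stability of spines does.
    """
    w = 1.0                                    # efficacy before the LTP stimulus
    n = formation_rate * w / base_removal      # start at the pre-LTP steady state
    trace = []
    for i in range(round(t_end / dt)):
        if i * dt >= t_ltp:
            w = weight_after_ltp               # LTP: step increase of efficacy
        removal_rate = base_removal / w        # stronger synapses are removed less often
        n += dt * (formation_rate - removal_rate * n)
        trace.append(n)
    return trace


if __name__ == "__main__":
    trace = simulate_spine_number()
    print("spine number before LTP:", trace[0])
    print("spine number after LTP :", trace[-1])
```

With these assumed values the spine number rises from the old steady state F/r toward the new one although the formation rate is held fixed, which is the point of the argument. Whether the formation rate really stays constant is an open experimental question.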

Note, although an increase of the number of spines could be explained by assuming a constant rate of forming new spines, the LTP-dependent appearance of more filopodia (Lee et al., 1980; Chang and Greenough, 1984; Maletic-Savatic et al., 1999) suggests that the formation rate changes, too. Thus, further experiments are required to clarify whether the formation rate of dendritic spines (and also of axonal boutons) stays constant or whether it is adapted by the level of neuronal activity.

At the presynaptic neuron, an LTP-inducing stimulus also triggers a structural remodeling: the number of axonal boutons increases. This effect arises already 15 min after the stimulation (Nikonenko et al., 2003; Ninan et al., 2006). The fact that both the numbers of dendritic spines and of axonal boutons are enhanced suggests that new synapses are formed by these new elements. In addition, recent findings indicate also an LTP-dependent increase in the probability that a bouton hosts one or more functional synapses (Medvedev et al., 2014). Thus, newly formed spines have a very high chance of connecting to a new or old bouton and, hence, forming a new synapse.

### LTD-Stimuli

A link between the dynamics of Hebbian structural plasticity and LTD-inducing stimuli has been established, too. Several experimental studies show that the induction of an LTD-stimulus (low frequency) yields a separation of pre- and postsynaptic terminals (Bastrikova et al., 2008) and a loss of dendritic spines (Nägerl et al., 2004; Wiegert and Oertner, 2013). Thus, similar to the dynamics triggered by an LTP-stimulus, due to the induction of a low frequency stimulation, the synaptic efficacy is decreased, spines shrink and decrease their stability, and the removal rates of dendritic spines are increased (Segal, 2005). This is supported by experiments showing that the prevention of LTD by blocking NMDA-channels impedes the structural effects (Nägerl et al., 2004; Yu et al., 2013). Thus, also these results indicate that structural plasticity is linked to synaptic plasticity which influences the stability of the corresponding dendritic spines. The timescale of spine shrinkage and removal seems to depend on the experimental conditions: some experiments report spine shrinkage after about 20 min of LTD-induction (Oh et al., 2013) while other studies report no significant changes in spine volume or stability up to 30 min after the induction of LTD (Wiegert and Oertner, 2013). As for LTP-induced dynamics, the presynaptic site is also influenced by a low level of activation as it increases the turnover of axonal boutons (Paola et al., 2006; Stettler et al., 2006) resulting in a loss of synapses (Becker et al., 2008).

In summary, stimulation protocols inducing Hebbian synaptic plasticity change the stability and number of synapses. A strong activation induces the formation of more synapses while a low activation induces a loss of synapses. These variations in the number of synapses seem to depend on changes in the stability of the corresponding dendritic spines and axonal boutons correlated to the actual synaptic efficacy adapted by Hebbian synaptic plasticity. However, it is still not clear whether the rate of newly formed spines and boutons is changed, too. Furthermore, the data about the dynamics induced by LTD-stimuli are less comprehensive than the data for LTP-stimuli.

Most of the above discussed structural dynamics happens on a timescale of the order of several minutes to one hour. On this timescale the dendritic trees and axons hosting spines and boutons remain quite stable (Ziv and Smith, 1996; Grutzendler et al., 2002; Trachtenberg et al., 2002; Lee et al., 2006; Paola et al., 2006; Stettler et al., 2006). Hence, fast Hebbian changes of the network structure must be mainly implemented by the growth or removal of dendritic spines. On slower timescales, also changes of the dendrites and axons take place. However, as we will discuss in the following, such changes are mainly triggered by homeostatic processes.

### Evidence for Homeostatic Plasticity

As already mentioned above, the connectivity of neural networks is not only adapted by Hebbian-like changes. Similar to synaptic plasticity, also structural changes show homeostatic dynamics, i.e., a decrease of connectivity with high neuronal activities and an increase with low activities. Typically, these homeostatic dynamics are observed under chronically altered conditions of neuronal activity and, thus, also at slower timescales. In general, the resulting structural changes seem to counterbalance the altered conditions and, thereby, regulate the activity back to an intermediate level (for a complete review of homeostatic structural processes see Butz et al., 2009). Like Hebbian structural plasticity, homeostatic structural changes are determined by the dynamics of dendritic spines and axonal boutons. However, under extreme conditions, as in epilepsy or after lesions, also changes of the dendritic and axonal trees are observed.

As early as 1978, Wolff et al. (1978) observed in vivo the growth of protrusions and thickenings on the dendritic tree after decreasing neuronal activity. For this, they applied the inhibitory neurotransmitter GABA for 3–7 days. Further studies verified that chronic blockage of neuronal activity can yield an increase in the number of spines after approximately 8 h (Dalva et al., 1994; Rocha and Sur, 1995; McAllister et al., 1996; Kirov and Harris, 1999) indicating a slower timescale for homeostatic structural changes as compared to Hebbian ones. Hereby, already the blockage of NMDA channels leads to an increase in spine number (Yu et al., 2013; Chen et al., 2015) or prevents spine elimination (Bock and Braun, 1999). Note that during development blocking activity or NMDA receptors can show the opposite effect (Annis et al., 1994; Collin et al., 1997). However, the newly formed spines often host silent synapses needing synaptic plasticity to be converted to functional synapses (Nakayama et al., 2005). On the other hand, persistent depolarization of neurons leads to a loss of dendritic spines (Müller et al., 1993; Drakew et al., 1996). Already the application of high levels of NMDA induces a spine loss by the destabilization of the spine actin scaffold (Halpain et al., 1998). Thus, the number of spines is adapted in an activity-dependent homeostatic manner.

Furthermore, the changes in the number of spines also depend on the calcium level (Kirov and Harris, 1999; Kirov et al., 2004; Tian et al., 2010). Accordingly, it has been proposed that dendritic spines follow a calcium-dependent homeostasis (Segal et al., 2000). As the postsynaptic calcium level is largely influenced by neuronal activity (Spruston et al., 1995; Helmchen et al., 1996; reviewed, e.g., in Higley and Sabatini, 2008), the calcium-dependent homeostasis could, in turn, imply an activity-dependent homeostasis as described above. However, the detailed relation between calcium, activity, and spine dynamics is more complex, as the calcium level is also regulated by other signals such as neurotrophins (Stoop and Poo, 1996) or cell adhesion molecules (Bixby et al., 1994). Furthermore, in contrast to the postsynaptic activity, calcium is a local signal allowing different dynamics at different branches of the dendritic tree. Accordingly, by comparing different branches of the same dendrite, where each branch receives stimuli from other brain regions, such different spine dynamics are observed (Mattson, 1988; Bravin et al., 1999; Lohmann et al., 2005; Deller et al., 2006; Vuksic et al., 2011, see also Yu and Goda, 2009; Vlachos et al., 2012a, 2013 for evidence on local homeostasis of synaptic efficacies). However, in summary, these experiments indicate that the number of spines or synapses is adapted by activity-dependent homeostatic structural plasticity.

### Evidence from Networks in Extreme Situations

Further evidence for homeostatic dynamics is obtained in more complex settings which we summarize in the following. Note that under these conditions dynamics of dendritic and axonal trees are observed, too.

For example, homeostatic regulation of connectivity is found in animal models of epilepsy. Epileptic seizures are network states of high and synchronous activity. Given a homeostatic dynamic, this would lead to a decrease in the number of spines which, indeed, was found in animal models of epilepsy (Scheibel et al., 1974; Paul and Scheibel, 1986; Geinisman et al., 1990; Isokawa and Levesque, 1991; Isokawa, 1998). These changes are likely signs of structural plasticity rather than mere damages by the epileptic seizures, as the number of spines recovers after several days without seizures (Müller et al., 1993; Isokawa, 1998). The spine loss is only visible at least 5 h after the seizure (Mizrahi et al., 2004), which implies that, also under these conditions, the timescale of homeostatic structural plasticity is typically slower than for Hebbian structural plasticity described above. Interestingly, after several days with reoccurring seizures also changes of the neuronal morphology, like retraction of dendritic branches, are measured (Colling et al., 1996; Jiang et al., 1998).

In contrast to the elevated activities during epilepsy, phenomena like strokes, lesions, or deprivations typically lead to lowered activity levels in a group of neurons. For instance, for deprived neurons a homeostatic dynamic would increase the number of spines. Indeed, experiments show that, after 4 days of monocular deprivation, the number of newly formed spines in the binocular cortex of adult mice doubles compared to control conditions (Hofer et al., 2009). Interestingly, a second phase of monocular deprivation at the same eye does not lead to an increased formation of new spines. Now, the synaptic efficacies of spines formed during the first phase are strengthened by (presumably homeostatic) synaptic plasticity counterbalancing the lost input (Hofer et al., 2009). These findings support the link between (homeostatic) structural and synaptic plasticity.

However, as shown by Keck et al. (2008), also smaller interferences, like small lesions of the retina, lead to more new spines (in the lesion projection zone in the visual cortex). In this experiment, although more new spines are formed, the spine density is comparable to control conditions after 3 days. Another experiment shows that trimming the whiskers of rats leads to an increased number of spines and an outgrowth of dendritic trees in the input-receiving layer in the barrel cortex (Vees et al., 1998; other layers might be affected differently, see Chen et al., 2015). Along this line, one observes massive reorientation of the dendritic trees of adult rats after whisker removal, while the system regains the pre-removal dendritic lengths and spine densities (Tailby et al., 2005). Note, however, already the retraction of dendrites from denervated areas can increase the excitability of neurons, such that activity-homeostasis can be reached without regaining the pre-lesion dendritic length (Platschek et al., 2016).

Interestingly, not only the dendrites of the neurons with lesioned afferents, but also axons of neighboring neurons contribute to regain homeostasis. Although these neighboring neurons are not directly affected by the lesions, they can also be expected to experience altered activity levels. This triggers, after a few days, the growth of axons from the neighboring neurons toward the deprived region (Darian-Smith and Gilbert, 1994; Yamahachi et al., 2009; Marik et al., 2010). Furthermore, damaged axons can grow out and form new synapses, similar to growth dynamics during development (Canty et al., 2013). In summary, we find that lesions trigger the formation of new spines and the outgrowth of dendrites, which, together with new innervation from neighboring neurons, presumably form new synapses and restore the activity level.

Thus, very high or low activity levels occurring in extreme situations like epilepsy, lesions, or stroke are counterbalanced by structural changes on the timescale of several hours to days, thereby, contributing to activity-dependent homeostasis.

## Activity-Dependent Structural Plasticity during Development

As already mentioned above, apart from networks in extreme situations, many experiments in adult networks observe very small or no changes of the axonal or dendritic arborization. This is different during the development of neural networks, when these dendritic and axonal trees are formed. Interestingly, also during this phase activity-dependent structural processes contribute to the network dynamics. In the following we will briefly discuss these experiments.

### Homeostatic Structural Changes during Development

Single, isolated neurons in culture typically start growing axons and dendrites. This initial process could already be a homeostatic mechanism, as such neurons typically exhibit only weak activities (Kater et al., 1989). The further outgrowth of neurites also seems to be homeostatically regulated: On the one hand, the application of the inhibitory neurotransmitter GABA, which normally decreases activity, triggers an increased outgrowth (Mattson and Kater, 1989). On the other hand, excitatory neurotransmitters such as glutamate (but not NMDA, see Mattson et al., 1988), which normally yield an increased activity, induce the degeneration of the dendritic structures (Haydon et al., 1984, 1987; Mattson, 1988; Mattson et al., 1988; Mattson and Kater, 1989). The strength of this effect is dose-dependent (Mattson et al., 1988). Note, during early developmental phases, GABA is an excitatory neurotransmitter (Barker et al., 1998; Ben-Ari, 2002). Still, in the above studies GABA shows the inverse effect of the excitatory neurotransmitters. Furthermore, in these experiments, changes in the axonal dynamics are initiated only at very high doses and lead to a retraction of the axon.

Further experiments targeted downstream signals of these neurotransmitters. Also here, several indications show that especially the postsynaptic calcium level seems to trigger dendritic changes: on a slower timescale, an increased level of calcium induces a retraction of dendrites while a decrease of calcium leads to an outgrowth of dendrites (Mattson and Kater, 1987, for a similar effect for CaMKII see Wu and Cline, 1998). These dynamics are summarized in the calcium-dependent homeostasis hypothesis for dendrites (Kater et al., 1989; Lipton and Kater, 1989). Furthermore, recent experiments suggest that also the dynamics of filopodia are regulated dependent on local calcium currents (Lohmann et al., 2005). As discussed above, the calcium level is mainly influenced by the neuronal activity. Therefore, we suppose that the calcium-dependent homeostasis hypothesis implies an activity-dependent homeostasis, i.e., neurons grow and retract their dendrites to find an optimal level of input which, in turn, assures a medium activity level. This hypothesis is supported by experiments showing that increased activity, due to electrical stimulation, prevents dendritic outgrowth (Cohan and Kater, 1986; Fields et al., 1990 but see Garyantes and Regehr, 1992), whereas blocking activity yields enhanced growth of dendrites (van Huizen and Romijn, 1987; Fields et al., 1990). Note, these experiments demonstrate a relation between activity and the dendritic outgrowth describing only the potential connectivity between neurons and not the realized connectivity between neurons. Whether this also yields the formation of more functional synapses remains unclear.
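The idea that a neuron grows and retracts its dendrites until it receives an input level that assures a medium activity can be sketched as a simple feedback loop. The following toy model is an assumption made for illustration and is not taken from the cited studies: connectivity `c` stands in for dendritic extent, activity is a saturating function of the total input `c * drive`, and `c` grows when activity is below a target and retracts when it is above. The function name and all parameter values are hypothetical.

```python
def simulate_homeostatic_growth(initial_connectivity, drive=2.0,
                                target_activity=0.5, growth_rate=1.0,
                                dt=0.1, n_steps=2000):
    """Toy activity-dependent outgrowth and retraction of a dendritic field.

    Assumed dynamics: dc/dt = growth_rate * (target_activity - activity),
    with activity = x / (1 + x) and x = c * drive.
    """
    c = initial_connectivity
    for _ in range(n_steps):
        total_input = c * drive
        activity = total_input / (1.0 + total_input)     # saturating response
        c += dt * growth_rate * (target_activity - activity)
        c = max(c, 0.0)                                  # extent cannot be negative
    total_input = c * drive
    return c, total_input / (1.0 + total_input)


if __name__ == "__main__":
    # An initially unconnected neuron grows out; an over-connected one retracts.
    print(simulate_homeostatic_growth(initial_connectivity=0.0))
    print(simulate_homeostatic_growth(initial_connectivity=5.0))
```

In this sketch both starting conditions converge to the same connectivity and the same medium activity. Note that, as stated above, such a variable only describes potential connectivity, not the number of functional synapses.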

Further evidence for homeostatic structural changes during development comes from experiments analyzing the time course of the developmental process of neuronal networks. During development, neural networks evolve from an initial unconnected state to a connected matured state. Initially, neurons have very low activities (Ramakers et al., 1990; Chiappalone et al., 2006; Wagenaar et al., 2006), which could trigger the outgrowth of neurites and formation of synapses in a homeostatic manner (van Ooyen, 2011). Before reaching the matured state, neural networks typically pass through a phase of extreme build up of synapses followed by a phase of synapse pruning (so-called overshoot), dependent on the level of neuronal activity (Feldman and Dowd, 1975; Huttenlocher et al., 1982; Huttenlocher, 1984; van Huizen et al., 1985, 1987; van Huizen and Romijn, 1987; van Pelt et al., 1996; Bock and Braun, 1999; Hua and Smith, 2004; Zuo et al., 2005a,b). Such an overshoot in synapse number is typical for neural networks with homeostatically regulated connectivity (van Ooyen, 2003, 2011).

### Hebbian Structural Changes during Development

During development some structural changes of axonal and dendritic trees also show Hebbian-like dynamics as described in the following: Neurites grow by constantly adding and removing branches (Wu and Cline, 1998; Sin et al., 2002; Wong and Ghosh, 2002; Portera-Cailliau et al., 2003). Hereby, only a few branches become stable and form the axonal or dendritic tree, while others are removed on the timescale of minutes to hours (Wu and Cline, 1998). Thereby, the activation of receptors and local calcium transients are necessary to stabilize and maintain dendritic branches (Lohmann et al., 2002; Vaillant et al., 2002; Hutchins and Kalil, 2008). Accordingly, in animals experiencing four hours of increased neuronal activity due to visual stimulation, one observes significantly more stabilized dendritic branches as compared to animals left in the dark (Sin et al., 2002). Similarly, the blockage of neuronal activities yields much less complex dendritic trees (Groc et al., 2002).

Interestingly, the stabilization of dendritic and axonal branches also depends on the connectivity, more precisely, on the existence and maturation of synaptic contacts on the branch (Haas et al., 2006; Ruthazer et al., 2006). As shown above, in adult networks, the activity-stability relationship of synapses implements Hebbian changes in connectivity. Thus, if the dynamics underlying the stabilization of synaptic contacts are similar during development and in adult networks, the activity-dependent stabilization of spines and, therefore, of branches would indicate a Hebbian-component of the growth of dendritic trees.

### EVIDENCE FOR THE INTERACTION OF HEBBIAN AND HOMEOSTATIC STRUCTURAL PLASTICITY

As we discussed above, for adult networks, the alteration in neuronal activity causes two different directions of structural changes (see **Figure 1**). On a fast timescale (minutes to hours) the number of dendritic spines goes along with the change in neuronal activity in a Hebbian manner. On a typically slower timescale (hours to days), the dynamics of dendrites and dendritic spines homeostatically counterbalance the change in activity and regulate it back to an intermediate target regime. Obviously, in experiments, chronic changes in neuronal activity should trigger both processes which, then, interact with each other. With these two mechanisms and their typically different timescales at hand, in the following, we will discuss direct conclusions about the dynamics of structural changes during a period of altered activity.

For example, when neurons start to receive reduced or LTD-inducing inputs, the corresponding synapses will be depressed and, therefore, more likely to be removed due to Hebbian structural plasticity, so that the spine density is reduced (**Figure 1**, bottom center). Later on, due to the reduced activity of the neuron, homeostatic structural plasticity yields the formation of new synapses, so that the spine density will increase (**Figure 1**, bottom right). Note, as the homeostatic changes are unspecific, very likely these new synapses connect to other, more active inputs. Thus, when the neural network has again reached its homeostatic level and assuming that the synaptic efficacies are, in the long run, similar to those before the activity alteration, the spine density is probably at the same level as before receiving the LTD-inducing inputs. Thus, as a direct consequence from the interaction of Hebbian and homeostatic structural plasticity in the same neural network, we expect in general a transient decrease in the spine density.

Such transient changes have been observed already in the 1970s (Parnavelas et al., 1974; Goldowitz et al., 1979). In these studies, the transection of afferent hippocampal axons yields a strong decrease in spine density around 4 days after deafferentation and a restoration of the initial spine density after 10–50 days (Parnavelas et al., 1974; Goldowitz et al., 1979; Vuksic et al., 2011). Strikingly, this transient change in spine density does not result from changes in the spine formation rate, but rather from changing the elimination rate or the stability of the spines (Vlachos et al., 2012b). Similarly, one observes changes in the spine elimination rate in barrel cortex after whisker trimming also leading to a transient decrease of the spine density (Zuo et al., 2005a; Miquelajauregui et al., 2015).

These results are consistent with the correlation between spine stability, spine volume, and synaptic efficacy governing the interaction of synaptic and structural plasticity: First, Hebbian synaptic plasticity would decrease the efficacies and the stability of spines, such that their density decreases. Later, synaptic scaling would scale up the synaptic efficacies of both old and new synapses and, thereby, stabilize them and increase spine density. Interestingly, at the same time, Hebbian synaptic plasticity can induce competitive effects between newly formed and up-scaled preexisting spines, which destabilizes the newly formed synapses and, thereby, protracts the recovery of the system (Vlachos et al., 2013).
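The two-step explanation above (fast Hebbian depression, slow compensatory scaling, constant spine formation) can be turned into a small toy simulation. It is a sketch under assumptions that go beyond the cited experiments: the efficacy is written as `w = s * h`, where `h` tracks the input quickly (Hebbian component) and `s` is a slow scaling factor that restores `w` to a target value; spines form at a constant rate and are removed at a rate inversely proportional to `w`. The function name, the functional forms, and the time constants (in arbitrary units) are hypothetical.

```python
def simulate_deprivation(t_end=500.0, dt=0.05, tau_hebb=1.0, tau_scale=50.0,
                         formation_rate=1.0, base_removal=1.0,
                         input_after=0.5, target_weight=1.0):
    """Spine density after a step reduction of the input at t = 0.

    Assumed toy dynamics:
      dh/dt = (input - h) / tau_hebb              fast Hebbian component
      ds/dt = s * (target_weight - w) / tau_scale slow synaptic scaling
      dN/dt = F - (base_removal / w) * N          constant formation, w-dependent removal
    """
    h, s = 1.0, 1.0                                   # baseline before deprivation
    n = formation_rate * (s * h) / base_removal       # baseline spine density
    density = []
    for _ in range(round(t_end / dt)):
        w = s * h
        dh = (input_after - h) / tau_hebb
        ds = s * (target_weight - w) / tau_scale
        dn = formation_rate - (base_removal / w) * n
        h += dt * dh
        s += dt * ds
        n += dt * dn
        density.append(n)
    return density


if __name__ == "__main__":
    density = simulate_deprivation()
    print("baseline:", density[0])
    print("minimum :", min(density))
    print("final   :", density[-1])
```

Under these assumptions the spine density first drops and then returns to its baseline while the formation rate stays constant throughout, in qualitative agreement with the transient decrease described above. The competitive effects between new and preexisting spines are not included in this sketch.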

On the other hand, paradigms which supposedly trigger higher neural activities, such as motor learning or an enriched environment, have been demonstrated to elicit a transient increase in the number of spines after 2–3 days of stimulation. After 7 days the number of spines reaches control level again (Xu et al., 2009; Yang et al., 2009, see **Figure 1**, upper row). Also during these experiments, the number of new filopodia remains constant, which suggests a constant formation rate of new spines. Interestingly, the repeated training selectively stabilizes mainly the newly formed spines, while the stability of preexisting spines drops (Xu et al., 2009). Also for these types of experiments the interaction of structural and synaptic plasticity provides a potential explanation for the observed dynamics. We expect that Hebbian synaptic plasticity leads to a selective potentiation and, thus, a stabilization of the synapses which are important for learning (especially the ones hosted by newly formed spines, which are important for the task performance, see Xu et al., 2009; Yang et al., 2009). This, in turn, leads to an increased spine number and higher neuronal activities. In the long run, this increased activity triggers unspecific homeostatic synaptic plasticity decreasing the stability of synapses and inducing their pruning. Remarkably, in experiments, when training is stopped earlier, the newly formed spines are less stable than the preexisting ones (Xu et al., 2009). Following our reasoning, this could imply that learning was not long enough to trigger sufficient potentiation to stabilize the newly formed synapses.

The interaction of Hebbian and homeostatic mechanisms could also be used to explain a detailed EM-study conducted by Bourne and Harris (2011). This study shows that, 5–30 min after a typically LTP-inducing tetanic burst stimulation, a transient increase in the number of stubby spines, shaft synapses, and nonsynaptic protrusions can be observed. However, already after 2 h these structures are not present anymore. In addition, the number of small spines is decreased compared to prestimulation, whereas the postsynaptic densities of all remaining spines have been enlarged such that the PSD (postsynaptic density) area per micrometer dendrite is the same as for controls (Bourne and Harris, 2011). This suggests a strong and, possibly, fast homeostatic mechanism (the authors argue for a resource homeostasis of the polyribosomes which are used for spine creation and enlargement). Thus, probably a group of synapses is selectively stabilized by Hebbian synaptic plasticity increasing neuronal activity. At the same time, homeostatic synaptic and structural plasticity counterbalance these changes and decrease the stability of all synapses leading to the removal of small, unpotentiated synapses. These dynamics are similar to the dynamics during motor learning described above.

These examples demonstrate that Hebbian and homeostatic as well as synaptic and structural plasticity are strongly interwoven and jointly adapt the connectivity of the neural network according to alterations in neuronal activity. To understand these complex interactions in more detail, further experiments are needed. However, to also assess the general principles, theoretical network models are required. In the following, we will discuss the state of the art of theoretical models of structural plasticity.

### THEORETICAL MODELS OF STRUCTURAL PLASTICITY

In this section, we will summarize theoretical and computational studies analyzing the dynamics and functional consequences of structural plasticity. As models of structural plasticity basically adapt the connectivity, they enable predictions about properties of the connectivity in neural networks. These properties range from statistical features (e.g., the statistics of subnetwork structures (motifs) or the probability distribution of the number of synapses between two neurons) to graph-theoretical features (e.g., small-worldness or shortest path lengths), which can be compared to biological data. Many studies also investigate functional consequences of structural plasticity such as the influence on the storage capacity or the ability to classify different inputs. The majority of studies focus on either Hebbian or homeostatic structural plasticity; at the end of this section, however, we will provide an overview of the few studies combining both processes of structural plasticity.

### Hebbian Structural Plasticity

As discussed above, Hebbian structural plasticity is mainly realized by the dynamics of dendritic spines. Thus, models of Hebbian structural plasticity typically describe the dynamics of dendritic spines. Synapses in these models appear and disappear at predefined potential synaptic locations with certain probabilities influenced by neuronal activities, synaptic efficacies and/or other hidden variables. As activities and efficacies depend on synaptic plasticity, Hebbian structural plasticity and Hebbian synaptic plasticity are strongly interconnected and the majority of models of Hebbian structural plasticity also incorporate the dynamics of Hebbian synaptic plasticity and some even homeostatic synaptic plasticity.

The simplest neural network to study the influence of Hebbian structural plasticity on the network's dynamics and connectivity is a postsynaptic neuron receiving input from one presynaptic neuron. Several experiments show that the connectivity between such pairs of neurons (the probability distribution of the number of synapses) is non-trivial (Markram et al., 1997a; Feldmeyer et al., 1999, 2002, 2006; Hardingham et al., 2010): these neurons are either unconnected (no synapse) or connected by multiple synapses (four to five synapses). This finding does not depend on the detailed anatomy of neurons, as the number of potential synapse locations is much higher than the number of realized synapses (Fares and Stepanyants, 2009). Instead, as theoretical models show (Deger et al., 2012; Fauth et al., 2015b), Hebbian structural plasticity yields the formation of such multi-synaptic connections in a broad range of activity levels. By changing the activity level, the number of synapses between the neurons can be adjusted, providing a way to change connectivity and, thus, store information in an activity-dependent manner (Fauth et al., 2015a,b). Furthermore, although the storage capacity per synapse is decreased, information stored in such structures can persist for timescales much longer than the lifetime of a single synapse, as the storage is collectively implemented by all synapses and does not rely on the existence of single ones (Fauth et al., 2015a).
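To illustrate how a bimodal distribution of synapse numbers can arise, the following sketch computes the stationary distribution of a simple birth-death process. It is a hand-tuned toy model and not the model of Deger et al. (2012) or Fauth et al. (2015b): the cooperative stabilization term (a per-synapse deletion rate that falls with the number of existing synapses), the function name, and all parameter values are assumptions.

```python
import math


def stationary_synapse_distribution(n_sites=20, formation=0.01,
                                    d_max=1.0, d_min=0.03, gamma=2.0):
    """Stationary distribution of the number of synapses k between two neurons.

    Assumed toy dynamics:
      k -> k + 1 with rate formation * (n_sites - k)   (free potential sites)
      k -> k - 1 with rate k * deletion_per_synapse(k) (cooperative stability)
    """
    def deletion_per_synapse(k):
        # A lone synapse is unstable (d_max); several synapses stabilize
        # each other, down to a floor of d_min.
        return d_min + (d_max - d_min) * math.exp(-gamma * (k - 1))

    # Detailed balance for a one-step process: p[k+1] / p[k] = birth(k) / death(k+1)
    weights = [1.0]
    for k in range(n_sites):
        birth = formation * (n_sites - k)
        death = (k + 1) * deletion_per_synapse(k + 1)
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


p = stationary_synapse_distribution()
for k, pk in enumerate(p[:10]):
    print(k, round(pk, 3))
```

For the assumed parameters the distribution has one peak at zero synapses and a second peak at several synapses, separated by a dip at small non-zero numbers. The position of the second peak is set by the hand-picked rates, so it should not be read as a prediction.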

Instead of considering a system consisting of one postsynaptic neuron, which receives inputs from one presynaptic neuron by multiple synapses, other studies considered a slightly more complex system: a postsynaptic neuron receiving inputs from several presynaptic neurons (note that in this system each presynaptic neuron is considered to be connected by only one synapse to the postsynaptic neuron). Here, the stability of synapses depends on the activity-dependent calcium influx; a high calcium influx causes stabilization of synapses and a low influx implies destabilization of synapses. As for the multi-synaptic connections, high neuronal activities lead to a stabilization of all synapses (Helias et al., 2008). However, for intermediate activity levels only correlated inputs are stabilized. Thus, the information stored in the connectivity could also be the information about the correlations between different inputs (Helias et al., 2008). In addition, synapses from uncorrelated inputs are pruned or deleted and lose their (noisy) influence on the postsynaptic neuron (Helias et al., 2008). Thus, Hebbian structural plasticity might help to prune synapses which are unimportant for the dynamics of the neural network.

Accordingly, also in more complex and biologically more plausible systems, such as large recurrent networks (Bourjaily and Miller, 2011; Zheng et al., 2013; Miner and Triesch, 2016), synaptic pruning preferentially removes synapses which only weakly contribute to synaptic transmission. These models use synaptic plasticity rules which typically yield a bimodal distribution of the electrical transmission efficacies with many efficacies close to zero. In combination with synaptic pruning, however, synapses with small efficacies are removed, leading to the emergence of a unimodal distribution as observed in the cortex (Song et al., 2005). Accordingly, the continuous pruning and creation of synapses can also be interpreted as a process of stochastic inference, in which the network continuously tests and evaluates the "usefulness" of synapses to process or represent external stimuli (Kappel et al., 2015). Thus, synaptic pruning might minimize the resources for synaptic maintenance while preserving important dynamics.

Further advantages of pruning or deletion of uncorrelated or unimportant synapses have been revealed for simpler feedforward neuronal networks, which are typically used to study associative memory: the storage capacity of these networks is increased (Knoblauch et al., 2009, 2014). Considering a Willshaw or Hopfield network (Willshaw et al., 1969; Hopfield, 1982), the deletion of the weak or unimportant synapses increases the storage capacity per synapse without perturbing the stored patterns (Knoblauch et al., 2009). Furthermore, pruning prevents the occurrence of catastrophic forgetting and could explain phenomena such as retrograde amnesia or the difference between spaced and block learning (Knoblauch, 2009; Knoblauch et al., 2014). Intuitively, the increase in storage capacity per synapse contradicts the finding of multi-synaptic connections described above (Deger et al., 2012; Fauth et al., 2015b). However, the influence of multi-synaptic connections on memory has to be further investigated, as models at the network level are so far missing.

So far, theoretical studies of structural plasticity in recurrent networks mostly investigated storage capacity and compared the properties of the resulting connectivity with the properties of biologically measured connectivities as, for instance, the statistics of the so-called motifs (Milo et al., 2002), i.e., configurations of the connectivity in small subnetworks. In cortical networks, groups of strongly connected neurons occur more often than in random networks (Markram et al., 1997a; Feldmeyer et al., 1999; Song et al., 2005; Perin et al., 2011). As groups of strongly connected neurons typically show strongly correlated activities, which, in turn, lead to stabilization of the corresponding connections, this overrepresentation is naturally reproduced by Hebbian structural plasticity interacting with synaptic plasticity (Bourjaily and Miller, 2011; Miner and Triesch, 2016, but see also Zheng and Triesch, 2014). Remarkably, with the formation of more strongly connected subgroups of neurons, the network's performance in discriminating different inputs increases (Bourjaily and Miller, 2011).

In summary, these results show that Hebbian structural plasticity improves several properties of neural networks compared to networks adapted only by synaptic plasticity. Especially, the storage of memories is improved in storage lifetime, capacity, and noise robustness. Furthermore, perhaps related to these improvements in memory storage, also the ability to discriminate inputs is enhanced. However, further investigations are needed to understand the influence of Hebbian structural plasticity on the dynamics of neural networks.

### Homeostatic Structural Plasticity

As already described above, homeostatic structural plasticity adapts dendrites and axons dependent on the neuronal activity to reach and sustain an intermediate activity regime (Butz et al., 2009). The slow timescale of homeostatic structural plasticity implies that its influences are mainly observed over long durations, such as during development, or in networks under extreme activity conditions, such as after lesions. Thus, theoretical models investigating the dynamics of homeostatic structural plasticity also concentrate mainly on these two paradigms.

During the development of a neural network from a naive initial state to a matured network, it passes through an overshoot phase of building up many synapses followed by a pruning phase until the network settles in the ground state (van Huizen et al., 1985, 1987; van Huizen and Romijn, 1987; van Ooyen, 2003, 2011). Such dynamics are already seen in a purely excitatory network model governed by homeostatic structural plasticity without the differentiation between axons and dendrites (van Ooyen and van Pelt, 1994). Introducing inhibition as well makes this overshoot effect more pronounced and can lead to oscillatory and bursting neuronal activities (van Ooyen et al., 1995; van Ooyen and van Pelt, 1996). Assuming different homeostatic dynamics for axons and dendrites results in even more complex activity dynamics matching cell culture data (Tetzlaff et al., 2010). The resulting network state is the so-called critical state which is predestined for maintaining stability (Bak et al., 1987; Bak, 1996). Thus, the complex interactions between all these different homeostatic processes are important to bring the whole system into a stable state showing dynamics matching experimental data.
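The overshoot can be illustrated with a strongly reduced sketch in the spirit of activity-dependent outgrowth models. A single effective connectivity variable `w` stands in for the summed overlap of neuritic fields, and the functional forms and parameter values below are assumptions for illustration rather than a reimplementation of any cited model.

```python
import math


def simulate_overshoot(t_end=4000.0, dt=0.05, rho=0.01, T=1.0,
                       theta=0.5, alpha=0.1, eps=0.8, beta=0.1):
    """Mean-field toy model of homeostatic outgrowth (assumed equations).

    x: mean membrane potential (fast), w: effective connectivity (slow).
    Connectivity grows while the firing rate is below the set point eps
    and shrinks while it is above.
    """
    def firing(x):
        # Sigmoidal firing rate as a function of membrane potential.
        return 1.0 / (1.0 + math.exp((theta - x) / alpha))

    def growth(f):
        # Outgrowth (> 0) below the set point, retraction (< 0) above it.
        return 1.0 - 2.0 / (1.0 + math.exp((eps - f) / beta))

    x, w = 0.0, 0.0
    w_trace = []
    for _ in range(int(round(t_end / dt))):
        f = firing(x)
        x += dt * (-x + (1.0 - x) * w * f) / T  # fast activity dynamics
        w += dt * rho * growth(f)               # slow structural dynamics
        w_trace.append(w)
    return w_trace


w = simulate_overshoot()
print("peak connectivity:", max(w), "final connectivity:", w[-1])
```

In this sketch the overshoot stems from hysteresis: activity stays low until connectivity exceeds a critical value, then jumps to a high level, after which connectivity is pruned back until the firing rate settles at the set point. The final connectivity is therefore well below its transient peak.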

All of these developmental models consider the dynamics of axons and dendrites. However, as described above, also the dynamics of dendritic spines and axonal boutons are determined by homeostatic structural plasticity. Theoretical network models from the 1980s (Dammasch et al., 1986, 1988; Cromme and Dammasch, 1989) already showed that also such detailed models of homeostatic structural plasticity self-organize to reach a desired activity regime. Again, the resulting system is quite stable such that even the insertion of new neurons (by, for instance, neurogenesis in the hippocampus) does not perturb the global network state (Butz et al., 2008). Furthermore, by introducing a distance-dependency for forming new synapses, the network develops into a small-world network (Butz et al., 2014b).

The dynamics of these models can also be compared to in vivo measurements after input lesions or stroke-induced lesions (Butz et al., 2009; Butz and van Ooyen, 2013; Butz et al., 2014a). Interestingly, this comparison between in vivo and model dynamics enables conclusions on the activity-dependency of the different homeostatic processes. For instance, after a retinal lesion, neurons in the lesion projection zone (which have lost their external input) start to connect with active neurons at the border of the zone (Keck et al., 2008). In network models, these dynamics can only be seen if, at low neuronal activity, new dendritic spines are predominantly formed and axonal boutons are pruned (Butz and van Ooyen, 2013; Butz et al., 2014a). In contrast, if at low activity boutons are formed and spines are deleted, the system still reaches homeostasis, but the neurons in the lesion projection zone predominantly connect with each other and, thus, the whole zone decouples from the rest of the network (Butz and van Ooyen, 2013; Butz et al., 2014a).

Further predictions from these models are, for instance, that similar effects arise in networks with lesions after stroke (Butz et al., 2009, 2014a). Neurons affected by the deafferentation (near the lesion zone) have problems in regaining their activity homeostasis when the rest of the network is in homeostasis. This problem can be solved if, after stroke, the neurons which are still in homeostasis receive an external stimulation to trigger homeostatic structural plasticity and, thus, encourage rewiring. After this stimulation the whole network has reached homeostasis (Butz et al., 2009). Thus, studying the effects of structural plasticity also helps to gain insights into new potential medical treatments.

### Models of the Interaction of Hebbian and Homeostatic Structural Plasticity

So far only a few models have investigated the interaction between Hebbian and homeostatic structural plasticity (Levy and Desmond, 1985; Adelsberger-Mangan and Levy, 1993, 1994; Levy, 2004; Thomas et al., 2015). Basically, these models combine weight-dependent synapse removal (Hebbian structural plasticity) with activity-dependent synapse formation (homeostatic structural plasticity). The combination of these processes in a feed-forward network optimizes the information transfer from input to output layer and also supports the separation of information in the output layer while maintaining homeostasis (Adelsberger-Mangan and Levy, 1993, 1994). As already seen in developmental models, also in the combined models the assumption of different dynamics for axons and dendrites in homeostatic structural plasticity increases the overall performance of the network (Adelsberger-Mangan and Levy, 1994). In other words, this combination of Hebbian and homeostatic structural plasticity provides an unsupervised way to transfer, compress, and store information (Hebbian structural plasticity) in an energy-efficient representation, i.e., with a low number of needed neurons and low firing rates (homeostatic structural plasticity). Along this line, each postsynaptic neuron becomes selective or tuned to a specific input pattern. The number of neurons tuned to one pattern grows with the occurrence of this pattern (Thomas et al., 2015). This could, in principle, be a solution to the problem of memory allocation or rather allocation of inputs to specific groups of neurons in the brain (Rogerson et al., 2014; Tetzlaff et al., 2015). These results provide first insights into the complex dynamics resulting from the interaction between Hebbian and homeostatic structural plasticity. However, further theoretical investigations are needed.

### CONCLUSIONS AND OPEN QUESTIONS

In this review, we showed that structural plasticity can be classified into two categories (for a schematic summary, see **Figure 2**; italic and bold fonts indicate key references for experimental and theoretical studies respectively): (i) Hebbian structural plasticity leads to an increase (decrease) of the number or density of dendritic spines and contacts with axonal boutons during phases of high (low) activity (**Figure 2**, first column, orange). (ii) When these alterations in activity persist, homeostatic structural plasticity balances these changes by removing (adding new) synapses (**Figure 2**, second column, orange) and, after days, even by retracting (growing out) the dendrites themselves (**Figure 2**, second column, green).

In addition, we showed that there is a strong interaction between structural plasticity and synaptic plasticity. Both have been demonstrated to depend on (local) calcium levels. Even more strikingly, the synaptic transmission efficacies are related to the stability of the synapses. Thus, structural plasticity and its influences on the dynamics of neural networks can only be understood in conjunction with synaptic plasticity.

*Figure caption fragment:* … respective effect. Studies which target both activity regimes and/or plasticity types are placed in-between them.

These complex interactions and their functional implications are best understood by theoretical models. For instance, Hebbian structural plasticity seems to remove and create synapses selectively. This selectivity leads to experimentally measured local connectivities and, furthermore, enhances memory lifetime, storage capacity, and robustness. For this, the pruning of unneeded synapses plays an especially important role. However, as for synaptic plasticity, these Hebbian dynamics lead to a positive feedback between connectivity and activity and, thus, to increasing neuronal activities. Structural plasticity therefore also requires a homeostatic process regulating activities back to an intermediate level. Accordingly, theoretical studies show that homeostatic structural plasticity organizes the connectivity of the network to maintain network stability. The combination of Hebbian and homeostatic structural plasticity preserves and improves network functions, such as memory storage and input discrimination, and, in parallel, stabilizes the global dynamics in a resource-efficient manner.

Still, there are many open questions, which are summarized in the following. On the experimental side, it is, for example, still unclear whether the increased number of spines after LTP-inducing stimuli results from increased stabilization or from an increased spine formation. Also, the relation between structural and synaptic plasticity is still not completely understood. Along these lines, especially the relation between homeostatic structural plasticity and synaptic scaling has not been completely unraveled yet.

Furthermore, we argued that the interaction of Hebbian and homeostatic structural plasticity will lead to a transient increase (decrease) of, e.g., the number of spines in a system which undergoes prolonged phases of enhanced (decreased) activity. Such transient changes are observed in experiments. However, experimental investigations of whether these dynamics occur due to this interaction are still missing. In general, the interaction of Hebbian and homeostatic processes in the same system is difficult to tackle and has been addressed only by a few studies.

Accordingly, most of the theoretical studies have also focused on either Hebbian or homeostatic structural plasticity. The interaction of both mechanisms, especially in recurrent networks, is largely unknown. Moreover, theoretical studies are often restricted to either reproducing biological data or investigating functional consequences of structural plasticity. Therefore, more studies are needed to link experimentally obtained connectivity features to functional predictions.

The theoretical models reviewed here mostly considered point neurons. However, the actual position or location of a synapse on the dendritic tree strongly influences the details of synaptic plasticity (Sjöström and Häusser, 2006). In addition, neighborhood relations between synapses influence synaptic plasticity via, for instance, calcium currents (Oh et al., 2015). Obviously, due to the interactions between synaptic and structural plasticity, these local influences on synaptic plasticity also affect structural plasticity. On the other hand, structural plasticity might select synapses with certain synaptic plasticity rules and remove others. Thereby, structural plasticity could act as a metaplasticity-like process (Abraham, 2008) which adds another level of complexity to the interaction of the different plasticities.

Taken together, we already have a decent understanding of the basic mechanisms governing Hebbian and homeostatic structural changes. Yet, their interaction with each other and with synaptic plasticity, as well as their functional relevance, still leave many open questions.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

### FUNDING

This research was supported by the Federal Ministry of Education and Research (BMBF) Germany under grant number 01GQ1005B [CT] and 01GQ1005A [MF], and by the Göttingen Graduate School for Neuroscience and Molecular Biosciences (DFG Grant GSC226/2) [MF].

### ACKNOWLEDGMENTS

We want to thank Prof. Florentin Wörgötter for his feedback on the manuscript.

### REFERENCES

synaptic density in specific cortical circuits. Nat. Commun. 4:2038. doi: 10.1038/ncomms3038

in hippocampal neurons. J. Neurosci. 27, 8149–8156. doi: 10.1523/JNEUROSCI.0511-07.2007

in spine stability after entorhinal denervation. J. Comp. Neurol. 520, 1891–1902. doi: 10.1002/cne.23017


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Fauth and Tetzlaff. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# REMOD: A Tool for Analyzing and Remodeling the Dendritic Architecture of Neural Cells

Panagiotis Bozelos<sup>1,2</sup>, Stefanos S. Stefanou<sup>1,3†</sup>, Georgios Bouloukakis<sup>1,4†</sup>, Constantinos Melachrinos<sup>1</sup> and Panayiota Poirazi<sup>1</sup>\*

<sup>1</sup> Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology-Hellas (FORTH), Crete, Greece, <sup>2</sup> Department of Molecular Biology and Genetics, Democritus University of Thrace, Crete, Greece, <sup>3</sup> Department of Biology, University of Crete, Crete, Greece, <sup>4</sup> Computer Science Department, University of Crete, Crete, Greece

Dendritic morphology is a key determinant of how individual neurons acquire a unique signal processing profile. The highly branched dendritic structure that originates from the cell body explores the surrounding 3D space in a fractal-like manner until it reaches a certain degree of complexity. Its shape undergoes significant alterations under various physiological or neuropathological conditions. Yet, despite the profound effect that these alterations can have on neuronal function, the causal relationship between the two remains largely elusive. The lack of a systematic approach for remodeling neural cells and their dendritic trees is a key limitation that contributes to this problem. Such causal relationships can be inferred via the use of large-scale neuronal models whereby the anatomical plasticity of neurons is accounted for, in order to enhance their biological relevance and hence their predictive performance. To facilitate this effort, we developed a computational tool named REMOD that allows the structural remodeling of any type of virtual neuron. REMOD is written in Python and can be accessed through a dedicated web interface that guides the user through various options to manipulate selected neuronal morphologies. REMOD can also be used to extract meaningful morphology statistics for one or multiple reconstructions, including features such as Sholl analysis, total dendritic length and area, path length to the soma, centrifugal branch order, diameter tapering and more. As such, the tool can be used for the analysis and/or the remodeling of neuronal morphologies of any type.

Edited by:

Arjen Van Ooyen, VU University Amsterdam, Netherlands

#### Reviewed by:

Benjamin Torben-Nielsen, University of Hertfordshire, UK

Jaap Van Pelt, VU University Amsterdam, Netherlands

### \*Correspondence:

Panayiota Poirazi poirazi@imbb.forth.gr

†These authors have contributed equally to the work.

Received: 27 August 2015 Accepted: 16 November 2015 Published: 06 January 2016

#### Citation:

Bozelos P, Stefanou SS, Bouloukakis G, Melachrinos C and Poirazi P (2016) REMOD: A Tool for Analyzing and Remodeling the Dendritic Architecture of Neural Cells. Front. Neuroanat. 9:156. doi: 10.3389/fnana.2015.00156

Keywords: neuron, dendrite, dendritic remodeling, computational tool, statistical analysis

### INTRODUCTION

The morphological complexity of dendrites has been documented since the times of Ramón y Cajal (1911) and is generally considered as an important factor for the proper functioning of the nervous system. Dendritic morphology also demonstrates a great deal of variation across different neuronal cell types (Ascoli, 2006), thus imposing an extra layer of complexity onto the deeply tangled relationship between structure and function (Mainen and Sejnowski, 1996; Krichmar et al., 2002; Schaefer et al., 2003). This structural diversity has been suggested to play a key role in shaping the mode of connectivity between neurons (Sholl, 1956; Kalisman et al., 2003; Chklovskii, 2004) as well as the information processing and signal integration capabilities of neural cells (reviewed in Spruston, 2008). Hence, the morphology of the dendritic tree directly influences the functionality of neural tissue, both at the single-cell and network levels, in complex ways.

To achieve this structural intricacy, neurons appear to follow an innate growth program (McAllister, 2000; Libersat and Duch, 2004) while responding adaptively to extrinsic guidance cues supplied by the extracellular matrix (for detailed reviews, see Wong and Wong, 2000; Procko and Shaham, 2010). Moreover, calcium signaling events induced by the electrophysiological activity of neurons are regarded as an indispensable part of the morphogenesis process (Wu and Cline, 1998; Wong and Ghosh, 2002; Lohmann et al., 2005). Importantly, dendrites appear to be dynamic structures, even after the developmental period (Magariños et al., 1996; Stranahan et al., 2007), and these morphological changes ought to be related to neuronal function.

A considerable amount of research has accumulated to validate this notion, by quantifying the dendritic alterations that happen in response to either physiological or neuropathological stimuli. For instance, in Alzheimer's disease, besides the well-known histopathological hallmarks of extracellular amyloid plaques and intracellular neurofibrillary tangles (NFTs), dendritic atrophy is consistently found among hippocampal and cortical pyramidal neurons (Yamada et al., 1988; Moolman et al., 2004). In other studies, when rats were reared under chronic stress conditions, abnormal morphological changes to dendritic trees were seen across at least three different brain regions: the hippocampus, amygdala and prefrontal cortex (Vyas et al., 2002; Shansky and Morrison, 2009). Interestingly, neurons in these areas seemed to respond completely differently to the same type of stress. In the CA3 subregion of the hippocampus, as well as in the medial prefrontal cortex, the dendritic arbor retracted. In contrast, dendritic trees in the basolateral amygdala (BLA) exhibited excessive growth, as evidenced by their increased total length and number of branches.

Dendritic remodeling also occurs under physiological conditions. For instance, when there is an increased need for receiving, integrating and encoding complex spatiotemporal patterns about newly encountered environments, the dendritic arbors of rat hippocampal neurons grow (Tronel et al., 2010), perhaps to accommodate new synapses and/or counterbalance the impending overexcitability that would otherwise be toxic to the cell. Despite the profound effect that such dendritic alterations can have on neural processing, a causal relationship between structural changes and neuronal output has not been established yet.

A great deal of neuroanatomical research effort has been devoted to providing a direct link between structure and function, at both the single-cell and network levels. During the past few decades, the advent of intracellular labeling techniques and the application of various visualization methods have led to a dramatic increase in high resolution dendritic morphology data. Tremendous progress in imaging methods and automation is also expected to pave the way for an exponential growth of data acquisition in the forthcoming years (Peng et al., 2015). The acquired morphological reconstructions are stored in dedicated databases such as NeuroMorpho (Ascoli et al., 2007), Fly Circuit (Chiang et al., 2011) and the Cell-Centered Database (Martone et al., 2003), and enable the quantitative analysis of neuronal shapes by the use of parameters relevant to the metrical and topological properties of the cell. Investigators interested in deciphering the role of neuroanatomy in the proper functioning of the nervous system have developed and provided the community with commercial and freeware tools such as Neurolucida Explorer (Glaser and Glaser, 1990), L-Measure (Scorcioni et al., 2008), BTMORPH (Torben-Nielsen, 2014), and NLMorphologyViewer<sup>1</sup> to trace, analyze and visualize the shape of the reconstructed neurons. However, most of these tools need to be locally installed and often require at least some basic programming skills, making them difficult to use by non-experts (for a detailed review, see Parekh and Ascoli, 2013).
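As a rough illustration of what such metrical parameters look like in practice, the sketch below computes total dendritic length, path length to the soma, and a simple Sholl count for a tiny reconstruction. It assumes the widely used SWC text format (one node per line: id, type, x, y, z, radius, parent id). The function names and the example morphology are hypothetical, and this is not REMOD's code or the code of any tool named above.

```python
import math

# Hypothetical five-node reconstruction: a soma, a stem, and one bifurcation.
SWC_EXAMPLE = """\
# id type x y z radius parent
1 1 0 0 0 5 -1
2 3 10 0 0 1 1
3 3 20 0 0 1 2
4 3 30 10 0 0.5 3
5 3 30 -10 0 0.5 3
"""


def parse_swc(text):
    """Parse SWC text into a dict: node id -> type, coordinates, radius, parent."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split()
        nodes[int(f[0])] = {
            "type": int(f[1]),
            "xyz": tuple(float(v) for v in f[2:5]),
            "radius": float(f[5]),
            "parent": int(f[6]),
        }
    return nodes


def segment_length(nodes, node_id):
    """Euclidean length of the segment joining a node to its parent."""
    parent = nodes[node_id]["parent"]
    if parent == -1:
        return 0.0
    return math.dist(nodes[node_id]["xyz"], nodes[parent]["xyz"])


def total_length(nodes):
    """Sum of all segment lengths (measured from the soma centre)."""
    return sum(segment_length(nodes, i) for i in nodes)


def path_length_to_soma(nodes, node_id):
    """Length of the path from a node back to the root along the tree."""
    length = 0.0
    while nodes[node_id]["parent"] != -1:
        length += segment_length(nodes, node_id)
        node_id = nodes[node_id]["parent"]
    return length


def sholl_intersections(nodes, radii):
    """Count segments whose end points straddle each sphere around the root.
    This is a simplification: a segment is counted at most once per sphere."""
    root = next(n for n in nodes.values() if n["parent"] == -1)
    counts = []
    for r in radii:
        crossings = 0
        for n in nodes.values():
            if n["parent"] == -1:
                continue
            d_child = math.dist(n["xyz"], root["xyz"])
            d_parent = math.dist(nodes[n["parent"]]["xyz"], root["xyz"])
            if min(d_child, d_parent) < r <= max(d_child, d_parent):
                crossings += 1
        counts.append(crossings)
    return counts


nodes = parse_swc(SWC_EXAMPLE)
print(total_length(nodes), path_length_to_soma(nodes, 4),
      sholl_intersections(nodes, [5, 15, 25]))
```

Real reconstructions contain thousands of nodes, several neurite types, and somata described by more than one point, so a production tool would need to handle those cases explicitly.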

More advanced approaches to the remodeling of dendritic trees incorporate the de novo generation of neuronal morphologies, based either on intrinsic correlations between morphometric parameters (e.g., the NETMORPH tool by Koene et al., 2009), on principles of neuronal material conservation vs. conduction times (e.g., the TREES toolbox by Cuntz et al., 2011), or on the dynamic and competitive interplay of retraction and outgrowth processes (e.g., the CX3D tool by Zubler et al., 2013; Hjorth et al., 2014). Others implement a hybrid of local and global generative algorithms to efficiently reproduce the anatomical variability of various neuronal classes (e.g., the L-Neuron and ArborVitae tools by Ascoli et al., 2001). Recently, the NeuroMac tool introduced an interesting context-aware approach in which developing neurons interact with the surrounding brain substrate (Torben-Nielsen and De Schutter, 2014). At the core of all the above algorithms lies the use of a limited set of statistical descriptors and assumptions on growth rules to generate stochastic neuronal structures that are statistically indistinguishable from the real neurons of the same morphological class. There is little evidence as to whether these assumptions and rules would hold under different physiological and/or pathological conditions. While the importance of these tools is widely recognized by the neuroanatomical community, to our knowledge, there is currently no tool available to implement targeted, assumption-free alterations on specific dendrites or branches of already-grown morphologies. Given that dendritic remodeling happens quite frequently in nature, as part of the ever-adapting neuronal function, a tool to remodel the dendritic morphology of digital reconstructions is long overdue.
Moreover, the recent use of detailed biophysical and morphologically realistic large-scale brain models to investigate microcircuit structure and function (Markram et al., 2015) highlights the need to understand how different anatomical features and their plasticity shape brain function and dysfunction. Towards this goal, a systematic methodology that allows efficient remodeling of any type of neuronal morphology is a prerequisite.

In this work, we introduce REMOD, a computational tool that allows the structural remodeling of any type of digitally reconstructed neuron. The algorithm focuses on the simulation of the end result of a dendritic remodeling process, without explicitly implementing any growth and/or retraction rules. It should be noted that such an end result can be achieved by several different sets of manipulations, some of which may not be biologically relevant. The important advantage of REMOD is that it provides the flexibility to choose which manipulations are considered relevant for each experimenter, condition and cell type, thus allowing a full exploration of the parameter space. This flexibility is particularly important when investigating the effects of specific morphological manipulations on a given response pattern. Such manipulations can be used to tease out the contribution of distinct morphological factors from other processes such as biophysical mechanisms.

<sup>1</sup>http://neuronland.org/NLMorphologyViewer/NLMorphologyViewer.html

The tool is written in Python and can be accessed through a dedicated web interface that guides the user through various options to manipulate selected neuronal morphologies. More explicitly, it provides the ability to upload one or more morphology files (in SWC, the most widely used format) and choose specific dendrites or dendritic regions on which to perform one of the following actions: shrink, remove, extend, branch or scale. The user retains complete control over the extent of each alteration and, if a chosen action is not possible due to pre-existing structural constraints, appropriate warnings are generated. It is worth mentioning that REMOD can also be used to extract a plethora of descriptive statistics for one or multiple neurons, such as Sholl analysis, total dendritic length and area, path length to the soma, centrifugal branch order, diameter tapering and more. As such, the tool can be used for both the analysis and the remodeling of neuronal morphologies of any type.

### METHODS

Recent advances in high-throughput single-neuron imaging techniques are expected to stimulate a morphological data "explosion" that will revolutionize the computational neuroanatomy field. Importantly, this data "explosion" needs to be accompanied by the development of software tools designed to meet the future requirements of parsing and manipulating large volumes of 3D neuron reconstruction data in a transparent, reliable and highly consistent manner (Peng et al., 2015). In order to achieve this longer-term goal, an appropriate approach needs to be adopted by the community to properly address this "big data" challenge. Therefore, it is imperative for the new generation of software packages to be optimized for this data volume, to handle it in a user-friendly way, and to be open-source and publicly available, allowing ease of use, sharing and active development by researchers in the field.

The proposed tool is developed according to this philosophy. The software implementation is in Python 2.7 and the tool is open source and available for use as a web application at www.remod.gr/. The source code and documentation of the tool are available on the GitHub repository hosting service at github.com/bozelosp/remod, where the tool can be downloaded as a standalone application that retains almost all of its online capabilities and/or be further developed.

The software package contains flexible and extendable modular blocks of code that can be generally classified in three categories: (i) parsing the morphology reconstruction data encoded in standard SWC files; (ii) performing morphometric and statistical analysis of the provided dendritic trees; and (iii) executing specific remodeling actions and exporting the new morphologies in the SWC format. The following paragraphs explain the user-action workflow between the abovementioned categories.

First, the user uploads one or more morphology reconstructions, properly formatted as SWC files. REMOD parses the geometric information of the specific 3D structure and topology and provides a rotating 3D visualization of the corresponding tree, as well as many depictions of the extracted morphometric statistics in the form of tables and charts. Next, the interface guides the user through a range of remodeling options that can be implemented on the selected morphologies as shown in **Figure 1**.
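To make the parsing step concrete, the sketch below reads SWC text and sums segment lengths. It is a minimal illustration in Python 3, not REMOD's own code: the function names `parse_swc` and `total_length` are hypothetical, and it assumes the standard SWC layout of seven whitespace-separated fields per sample (index, type, x, y, z, radius, parent index), with `#` marking comment lines and type codes 3 and 4 denoting basal and apical dendrites.

```python
import math

def parse_swc(text):
    """Parse SWC text into {id: (type, x, y, z, radius, parent_id)}."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.split()
        nodes[int(f[0])] = (int(f[1]), float(f[2]), float(f[3]),
                            float(f[4]), float(f[5]), int(f[6]))
    return nodes

def total_length(nodes, types=(3, 4)):
    """Sum the lengths of all segments whose child sample has a type in `types`.

    Each sample is joined to its parent by a straight segment.
    """
    total = 0.0
    for t, x, y, z, _, parent in nodes.values():
        if t in types and parent in nodes:
            _, px, py, pz, _, _ = nodes[parent]
            total += math.sqrt((x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2)
    return total

# Toy morphology (illustrative values only): soma, two basal samples, one apical sample.
EXAMPLE_SWC = """# toy neuron
1 1 0 0 0 5 -1
2 3 10 0 0 1 1
3 3 10 10 0 1 2
4 4 0 0 20 1 1
"""

nodes = parse_swc(EXAMPLE_SWC)
basal_length = total_length(nodes, types=(3,))
apical_length = total_length(nodes, types=(4,))
```

Keeping the parent index with every sample is what makes the later remodeling actions simple to express, since each action only adds, removes or moves samples along parent-child chains.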

For example, the researcher can manually select specific dendrites on which to impose a morphological alteration or, more conveniently, select a whole region of the neuron (i.e., the basal or apical tree), or even a randomly selected portion of it. Remodeling actions can be implemented on any and/or multiple selected dendrites, irrespective of whether they are terminal branches or not.

Next, the researcher decides the remodeling action to apply. If the plan is to further grow the dendritic tree, then the appropriate options to choose might be additive extension or branching, i.e., adding two new dendrites stemming from an existing parental one. The researcher can specify the extent of growth in terms of a percentage of the previous dendritic length or by defining the desirable length in micrometers. In this case, the algorithm will generate a series of somewhat randomly directed cylinders that radiate away from the parental dendritic segment and the soma as shown in **Figure 2**. The added segments have random lengths sampled from a realistic distribution and mimic the way dendrites explore the available 3D space in a quasi-random, quasi-directed way.

Similarly, to simulate a dendritic retraction, the user decides between two options: shrinking or complete removal of the selected dendrites. In both actions, cylinders are removed from the SWC file to match the desirable reduction in length, without altering any other parts of the morphology file. An example of each case is shown in **Figure 3**.
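The retraction actions can be sketched as follows. This is an illustrative Python 3 example under stated assumptions, not the REMOD implementation: the helper names are hypothetical, a morphology is reduced to `{id: (x, y, z, parent_id)}`, and whole cylinders are removed from the tip inward until at least the requested fraction of the branch length is gone (a fraction of 1.0 corresponds to complete removal).

```python
import math

def children_count(nodes):
    """Number of child samples for every sample id."""
    counts = {i: 0 for i in nodes}
    for _, _, _, parent in nodes.values():
        if parent in counts:
            counts[parent] += 1
    return counts

def terminal_branch(nodes, tip):
    """Sample ids from `tip` back to, but excluding, the nearest branch point or root."""
    counts = children_count(nodes)
    branch = [tip]
    parent = nodes[tip][3]
    # Walk towards the soma while the parent is an unbranched, non-root sample.
    while parent in nodes and counts[parent] == 1 and nodes[parent][3] in nodes:
        branch.append(parent)
        parent = nodes[parent][3]
    return branch

def segment_length(nodes, i):
    """Length of the cylinder joining sample `i` to its parent."""
    x, y, z, p = nodes[i]
    px, py, pz, _ = nodes[p]
    return math.sqrt((x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2)

def shrink_branch(nodes, tip, fraction):
    """Return a copy of `nodes` with cylinders removed from the tip inward.

    Removal stops once at least `fraction` of the branch length is gone;
    all other samples are left untouched.
    """
    branch = terminal_branch(nodes, tip)
    target = fraction * sum(segment_length(nodes, i) for i in branch)
    kept = dict(nodes)
    removed = 0.0
    for i in branch:
        if removed >= target:
            break
        removed += segment_length(nodes, i)
        del kept[i]
    return kept

# Toy tree (illustrative): the soma (1) leads to a branch point (2) with two daughters.
toy = {1: (0, 0, 0, -1), 2: (10, 0, 0, 1), 3: (20, 0, 0, 2),
       4: (30, 0, 0, 3), 5: (40, 0, 0, 4), 6: (10, 10, 0, 2)}
shrunk = shrink_branch(toy, tip=5, fraction=0.5)
removed_all = shrink_branch(toy, tip=5, fraction=1.0)
```

Because removal happens in units of whole cylinders, the achieved reduction can slightly exceed the requested one; a finer-grained variant would shorten the last remaining cylinder instead.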

The scaling action can be used to enlarge or reduce the dimensions of either the entire dendritic tree, or the basal/apical regions of the tree, independently. Remodeled morphologies are exported in the SWC standard format and can be downloaded and/or sent to the user via email as shown in **Figure 4**.

REMOD is also able to extract useful statistics for the morphological and topological features of the neuronal reconstructions. Metrics such as total dendritic length, path length, surface and volume are available for both the basal and the apical regions of the tree. Sholl analysis is implemented for the number of branch points, the number of intersections or the sum of dendritic lengths included in defined radius steps from the soma. The tool also supports the comparison between two groups of morphologically different neurons using the abovementioned metrics, as shown in **Figure 5**. This utility can be especially useful for both experimentalists and modelers wishing to identify anatomical differences between two particular groups of neurons.
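As an illustration of the intersection-based variant of Sholl analysis, the following Python 3 sketch counts how many dendritic segments cross concentric spheres centered on the soma. It is a simplified example rather than REMOD's routine: the function name is hypothetical, morphologies are given as `{id: (x, y, z, parent_id)}`, each sample is joined to its parent by a straight segment, and a segment is counted at a given radius when its two end points lie on opposite sides of that sphere.

```python
import math

def sholl_intersections(nodes, step, soma_id=1):
    """Return {radius: number of segments crossing the sphere of that radius}."""
    sx, sy, sz, _ = nodes[soma_id]
    # Euclidean distance of every sample from the soma.
    dist = {i: math.sqrt((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2)
            for i, (x, y, z, _) in nodes.items()}
    n_shells = int(max(dist.values()) // step)
    counts = {}
    for k in range(1, n_shells + 1):
        r = k * step
        counts[r] = sum(
            1 for i, (_, _, _, p) in nodes.items()
            if p in nodes and min(dist[i], dist[p]) < r <= max(dist[i], dist[p])
        )
    return counts

# Toy tree (illustrative): two dendrites leaving the soma in different directions.
toy = {1: (0, 0, 0, -1), 2: (10, 0, 0, 1), 3: (25, 0, 0, 2), 4: (0, 15, 0, 1)}
profile = sholl_intersections(toy, step=10)
```

Comparing two groups of neurons then amounts to computing such a profile per cell and contrasting the group averages at each radius.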

### RESULTS

To demonstrate the capabilities of REMOD we implemented morphological changes that led to dendritic atrophy and enhanced dendritic arborization, in hippocampal and amygdaloid pyramidal neurons, respectively. Such changes were reported to occur as a result of stress by Vyas et al. (2002). The modifications performed aimed at reproducing the percentage of change in various dendritic features as reported experimentally and were applied to a large number of morphologies downloaded from the NeuroMorpho database.

Frontiers in Neuroanatomy | www.frontiersin.org January 2016 | Volume 9 | Article 156 |

According to Vyas et al. (2002), chronic immobilization stress (CIS) led to a significant decrease in the total dendritic length and the number of branch points in hippocampal CA3 pyramidal neurons as compared with neurons of the same class in control animals. The marked shrinkage was evident in both apical and basal dendrites of the examined pyramidal cells. Specifically, the total dendritic length of apical dendrites was decreased by 29%, and the number of apical branch points was reduced by 31%. The same analysis applied to basal dendrites found their total dendritic length shrunk by 16% and their number of branch points reduced by 16%. Because the number of branch points is affected, a simple scaling of the entire dendritic tree length would not reproduce the observed phenotype. With the goal of simulating the overall shrinkage effect and reproducing the reported percentages, we took the following approach:


• Accordingly, the basal tree was pruned by randomly selecting 15% of its terminal dendrites for complete removal and shrinking another 5% of them by 5% of their initial length. The results of this processing are shown in **Tables 1, 2** and **Figure 6**.

It should be noted that other combinations of dendrite removal/pruning may also reproduce the reported averaged changes in dendritic length. Due to the lack of information as to which specific morphological features were altered in the experiments, we used a randomized approach.

Interestingly, the same CIS paradigm that caused dendritic atrophy in CA3 pyramidal neurons induced the opposite effect in BLA pyramidal neurons. The total dendritic length of BLA pyramidal neurons under conditions of CIS increased by 25% and the number of branch points by 15% compared to the control cells. With the goal of reproducing the experimentally obtained results of dendritic extension in the amygdala we implemented the following series of actions in REMOD:


neuron depicted in Figure 1.

Overall, despite the fact that the reconstructed morphologies analyzed by Vyas et al. (2002) were significantly smaller than the neurons of the same class downloaded from the NeuroMorpho database, we reliably reproduced the percentage morphometric changes as well as the overall distributions of changes between the control and CIS-treated cells in both neuronal classes.

### DISCUSSION

Over the past decade, computational simulations of the electrophysiological behavior of neurons have become fairly common as neuroscientists strive to develop a comprehensive understanding of the nervous system (Bower, 2013; Dudai and Kathinka, 2014). At the same time, the rapid advancement of imaging techniques has greatly enhanced the acquisition of high-resolution 3D neuronal reconstructions, resulting in anatomically precise and biophysically detailed compartmental models that render realistic simulation of neuronal behavior possible under virtual experimental conditions. Thus, it is expected that multi-compartmental cable models with ever-increasing accuracy in detail will become critically important to better understand the rich repertoire of computational operations performed by dendrites.

thereafter a 50% of the apical terminal dendrites were selected for extension by 60% of their parent length.

In addition, experimental neuroscientists have made substantial progress in revealing how neural plasticity might relate to development, learning and disease and how its influence could be projected far beyond the dynamics of synapses (Butz et al., 2009; Yau et al., 2011). Along with motile spines and axonal branches, the dendritic architecture of neural cells is also subject to significant morphological readjustments (for a detailed review, see Emoto, 2011). Thus, many challenging questions arise about the structural plasticity of the nervous system, including the rules that govern the dendritic morphological configuration, the adaptability of the neural connectivity and the preservation of functional continuity. Hence, a software tool that is able to induce structural alterations upon virtual neurons and statistically quantify the changes between acquired and remodeled morphologies could prove an invaluable resource for modelers and neuroanatomists alike, complementary to the existing approaches.

In this paper, we have presented REMOD, a novel open-access software tool for the analysis and remodeling of dendritic morphologies. The tool is written in the non-commercial Python programming language and is publicly available on the GitHub platform, under the open-source MIT license to encourage active development by the neuroanatomical community. The tool is accompanied by an easily accessible front-end interface that anyone can use via a web browser (http://www.remod.gr), thus eliminating the need to download and configure additional packages and/or restricted libraries locally on the desktop.

TABLE 4 | Difference in the number of branch points of BLA pyramidal neurons as reported experimentally after CIS and via remodeling with REMOD.

REMOD is designed to fulfill the emerging need among computational neuroscientists to simulate the dynamic nature of the dendritic structure. Other toolboxes have been developed for the special purposes of generating synthetic neurons or neuronal networks de novo, or even configuring the connectivity between them (Kalisman et al., 2003; Koene et al., 2009). These methods exploit experimentally substantiated principles, such as the conservation of neuronal material vs. conduction time constraints, the dynamic negotiation between retraction and outgrowth processes and the context-awareness of neuronal arborization, in order to enforce local or global generative algorithms that efficiently replicate the extent of morphological variability manifested by various neuronal types (Ascoli et al., 2001; Cuntz et al., 2011; Zubler et al., 2013; Hjorth et al., 2014; Torben-Nielsen and De Schutter, 2014). Still, the modeling of structural plasticity in the brain is arguably hindered by a multitude of challenges. These challenges predominantly include the complex interaction between numerous interdependent factors, such as genetic, metabolic and behavioral influences, that cannot be easily dissected for integration into the current predictive models (Bestman Da Silva and Cline, 2008; Polleux and Ghosh, 2008).

To overcome this limitation, the researcher can approach morphology from a different methodological perspective, that of remodeling an already-existing neuronal structure ad arbitrium. Principally, this approach provides greater flexibility to choose the structural manipulations that are considered more relevant for each experimenter, physiological condition and cell type, thereby allowing for a greater exploration of the available parameter space. This type of freedom could prove particularly important when investigating the effects of certain morphological alterations on a given neuronal response pattern, even if they are unlikely to occur. Such manipulations can, for instance, be used to assess and tease out the individual contribution of distinct morphological factors from other processes, like the electrophysiological properties of ion channels. Furthermore, REMOD offers the ability to perform statistical morphometric analyses of the user-provided morphologies via uploading to our server, an additional utility that may also be of great service to experimentalists.

In this work, we sought to demonstrate the capabilities of our toolbox by reproducing dendritic remodeling in the hippocampus and the amygdala brain regions of the rat under stress conditions, as reported by Vyas et al. (2002). Using REMOD, we faithfully reproduced the percentage of morphological changes reported in this paper, by imposing dendritic remodeling on 3D reconstructions of the same neuronal types downloaded from the NeuroMorpho repository. Unfortunately, we did not have access to the original morphologies in order to perform a direct comparison.

The reproduction of the experimental results (with respect to the percentage of changes observed), delivered using REMOD, reflects the ability of the tool to simulate complicated neuronal phenomena that may occur under physiological or pathological conditions. We believe that the functions provided within the first release of the tool are flexible and efficient enough to simulate any type of dendritic remodeling, capturing the variant expression of neuronal adaptability that has been extensively documented, but still insufficiently explained in current theoretical models.

To the best of our knowledge, REMOD is the first software package that allows the remodeling of existing, already-grown, detailed neuronal morphologies, in parallel with the effortless extraction of morphological descriptive statistics. Thus, REMOD allows the implementation of a systematic approach for altering virtual neuronal morphologies, which is likely to promote further research into understanding the hidden associations between critical neuroanatomical characteristics and the distinct electrophysiological patterns that individual neurons, as well as neural networks, exhibit. Exploiting the benefits and the capabilities of dendritic remodeling will aid the transition from investigating a "rigid" neuronal function to a refined exploration of the intricate effect of morphology on dendritic function and neuronal processing.

### ACKNOWLEDGMENTS

This work was supported by the ERC Starting Grant dEMORY (ERC-2012-StG-311435) and the BIOSYS research project, Action KRIPIS, project No MIS-448301 (2013SE01380036) that was funded by the General Secretariat for Research and Technology, Ministry of Education, Greece.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Bozelos, Stefanou, Bouloukakis, Melachrinos and Poirazi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Automatic Generation of Connectivity for Large-Scale Neuronal Network Models through Structural Plasticity

Sandra Diaz-Pier<sup>1</sup>\*, Mikaël Naveau<sup>1,2</sup>, Markus Butz-Ostendorf<sup>1</sup> and Abigail Morrison<sup>1,3,4</sup>

<sup>1</sup> Simulation Laboratory Neuroscience – Bernstein Facility for Simulation and Database Technology, Institute for Advanced Simulation, Jülich Aachen Research Alliance, Jülich Research Center, Jülich, Germany, <sup>2</sup> Serine Proteases and Pathophysiology of the Neurovascular Unit, Institut National de la Santé et de la Recherche Médicale UMR-S U919, Caen Normandy University, Groupement d'Intérêt Public (GIP) CYCERON, Caen, France, <sup>3</sup> Institute of Neuroscience and Medicine (INM-6), Computational and Systems Neuroscience, Jülich Research Centre, Jülich, Germany, <sup>4</sup> Faculty of Psychology, Institute of Cognitive Neuroscience, Ruhr-University Bochum, Bochum, Germany

With the emergence of new high performance computation technology in the last decade, the simulation of large scale neural networks which are able to reproduce the behavior and structure of the brain has finally become an achievable target of neuroscience. Due to the number of synaptic connections between neurons and the complexity of biological networks, most contemporary models have manually defined or static connectivity. However, it is expected that modeling the dynamic generation and deletion of the links among neurons, locally and between different regions of the brain, is crucial to unravel important mechanisms associated with learning, memory and healing. Moreover, for many neural circuits that could potentially be modeled, activity data is more readily and reliably available than connectivity data. Thus, a framework that enables networks to wire themselves on the basis of specified activity targets can be of great value in specifying network models where connectivity data is incomplete or has large error margins. To address these issues, we present an implementation of a model of structural plasticity in the neural network simulator NEST. In this model, synapses consist of two parts, a pre- and a post-synaptic element. Synapses are created and deleted during the execution of the simulation following local homeostatic rules until a mean level of electrical activity is reached in the network. We assess the scalability of the implementation in order to evaluate its potential usage in the self-generation of connectivity of large scale networks. We show and discuss the results of simulations on simple two-population networks and more complex models of the cortical microcircuit involving 8 populations and 4 layers using the new framework.

Keywords: structural plasticity, large scale neural networks, high performance computing, homeostatic growth, self-organizing network

## 1. INTRODUCTION

Models of large scale neural networks are an important tool for understanding the mechanics of the brain (De Garis et al., 2010; Helias et al., 2012; Eliasmith and Trujillo, 2014). Such models are created based on experimental information that has been collected for years by neuroscientists and combine mathematical methods with algorithms to reproduce observed behavior. It is known that the connectivity of the network plays an essential role in defining the way function is achieved at higher levels of activity. Nevertheless, obtaining accurate measurements of connectivity is complex, even with the most advanced experimental techniques, due to the resolution of sensors and difficult access to the target areas. The dynamics of the connectivity are also not yet well understood, although it has been shown that synaptic plasticity is fundamental for understanding how learning and memory work. Non-invasive techniques such as DTI imaging and fMRI scans can provide a glimpse of the real complexity of the problem in structure and function. Higher resolution techniques like electron microscopy (Gray, 1959), photostimulation (Dantzker and Callaway, 2000) and electrophysiological recordings (Thomson et al., 2002) provide more detailed connectivity information of specific regions. Regardless, creating an exact connectivity map of even a small region of the brain is extremely challenging (Deco et al., 2008; Essen et al., 2012; Van Essen and Ugurbil, 2012; Reckfort et al., 2013). This poses a significant problem for the modeling approach, as connectivity must be specified. For small networks, parameter scans can be carried out with respect to the unknown or imprecisely known connection probabilities between populations. For larger networks, which are more costly to simulate and also potentially have many more unknown connectivity parameters, this approach is hardly feasible.

#### Edited by:

Jackson Cioni Bittencourt, University of Sao Paulo, Brazil

#### Reviewed by:

Nuno Miguel M. Amorim Da Costa, University of Zurich (UZH) and Swiss Federal Institute of Technology Zurich (ETH), Switzerland

Ping Liu, University of Connecticut Health Center, USA

> \*Correspondence: Sandra Diaz-Pier s.diaz@fz-juelich.de

Received: 30 November 2015 Accepted: 06 May 2016 Published: 26 May 2016

#### Citation:

Diaz-Pier S, Naveau M, Butz-Ostendorf M and Morrison A (2016) Automatic Generation of Connectivity for Large-Scale Neuronal Network Models through Structural Plasticity. Front. Neuroanat. 10:57. doi: 10.3389/fnana.2016.00057

One way to address the issue of modeling connectivity within a neural network is to allow a network model to determine its own suitable connectivity to achieve target activity patterns, e.g., experimental measurements of the spiking frequency, which is easier to measure accurately than connectivity. In addition to addressing the problem of network model specification, a framework that accounts for the appearance and disappearance of synapses on the basis of network activity can provide insight into how connectivity is generated during development and learning or even into how healing after lesions takes place (De Paola et al., 2006). It can also help understand how certain structures arise as a result of exposure to adequate external stimuli during critical periods in the development of the brain (Hensch, 2005) and the mechanisms underlying experience-dependent structural synaptic plasticity (Holtmaat and Svoboda, 2009).

An appropriate model of structural plasticity that incorporates the dynamic generation, deletion and rewiring of synapses within a network was presented by Butz and van Ooyen (2013). In this model, synapses are represented as connections between pre- and post-synaptic elements. The growth or diminishment of these synaptic elements is an independent process for each neuron. The model is based on the idea that plasticity in cortical networks is mainly driven by the need of individual neurons to homeostatically maintain their average electrical activity. As a consequence, if activity is lower than a desired set-point, neurons will form synaptic elements, and remove them when activity becomes too high. Additionally, a minimum level of activity is needed to form synaptic elements at all. If activity falls below this level, the neuron will remove synaptic elements, too. Results show that small networks of hundreds or thousands of neurons robustly grow toward a stable homeostatic equilibrium of activity and connectivity. An important advance on earlier work is that all cell types had different desired average firing rates (achieved by different homeostatic set-points) and developed connectivity accordingly. It was shown that these local rules for structural plasticity can account for network rewiring after a partial loss of external input (deafferentation) and show remarkable similarities with biological data from network rewiring in the primary visual cortex after focal retinal lesions (Keck et al., 2008; Yamahachi et al., 2009). Further analysis by Butz et al. (2014) of changes in network topology revealed that betweenness centrality could be used as an indicator of successful brain repair, in the sense that it is related to the ability of the neurons to restore their electrical activity by network rewiring. The authors concluded that structural plasticity may account for network reorganization on different spatial scales.

In this work, we provide a complete description of how the structural plasticity model proposed by Butz and van Ooyen (2013) could be implemented in the neuronal network simulator NEST (Gewaltig and Diesmann, 2007) in order to create self-organizing large scale neural networks. We evaluate the scalability of the implementation and assess the performance of the model on two use cases. We demonstrate that our implementation is capable of self-organizing the connectivity within a cortical microcircuit model consisting of 100,000 neurons in total, starting with a fully disconnected setup. We also show the scenario where partial information of the connectivity is given as the initial condition and a stable connectivity pattern is obtained in the end.

The structural plasticity extension to NEST is included in release 2.10.0 (Bos et al., 2015) and creates a novel possibility for setting up large-scale neuronal networks. While supercomputers are required for very large-scale simulation, we show that smaller networks can also be run on a personal workstation or laptop according to the NEST development philosophy. This is a fundamental advantage of this implementation of structural plasticity in terms of capacity to test different configurations, as it provides high flexibility and portability for the neuroscientist.

The corresponding extension of the Python interface of NEST (PyNEST) allows the user to set up their own structural plasticity experiments for large scale networks.

The rest of this work is divided into three major parts. The first describes the major elements of the structural plasticity algorithm and the set of tests that were designed in order to measure the performance of an implementation of this algorithm. We also present some use cases for the structural plasticity framework. In the second part we provide the results of the technical implementation in NEST and describe how the design matches the memory and speed requirements for large scale simulations. We also present results for the use cases described in the previous section. In the third part, we discuss the results of the implementation and performance tests.

Some of this material has previously been presented in abstract form (Naveau and Butz, 2014).

### 2. MATERIALS AND METHODS

### 2.1. The Algorithm of Structural Plasticity

The original formulation of the structural plasticity algorithm defined in Butz and van Ooyen (2013) consists of three repeating steps which are described as follows:

(1) Update in electrical activity and intracellular calcium concentration. The electrical activity is calculated for each neuron on a millisecond timescale. The time-averaged level of the neuron's electrical activity drives changes in neuronal morphology. Intracellular calcium concentration is updated according to the electrical activity as follows:

$$\frac{d\mathrm{Ca}}{dt} = \begin{cases} -\frac{\mathrm{Ca}(t)}{\tau} + \beta & \text{if the neuron fires} \\ -\frac{\mathrm{Ca}(t)}{\tau} & \text{otherwise} \end{cases} \tag{1}$$

where τ is the calcium decay constant and β is the calcium intake constant which indicates how much calcium is accumulated each time the neuron fires. Calcium concentration is linearly proportional to average firing rate and thus is the measure that is used to guide the growth dynamics of the synaptic elements.
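The relation between calcium concentration and firing rate can be checked numerically. The Python 3 sketch below is an illustration only, not the NEST implementation: it interprets Equation (1) as an exponential decay with time constant τ between spikes plus an increment of β at each spike, and the function name and parameter values are chosen purely for the example. Under this reading, regular firing at rate r drives the calcium concentration toward approximately βτr.

```python
import math

def calcium_at(spike_times, tau, beta, t_end, ca0=0.0):
    """Calcium concentration at time `t_end` (same time unit as `tau`).

    Between spikes the concentration decays exponentially with time
    constant `tau`; every spike adds `beta`.
    """
    ca, t = ca0, 0.0
    for s in sorted(spike_times):
        if s > t_end:
            break
        ca = ca * math.exp(-(s - t) / tau) + beta
        t = s
    return ca * math.exp(-(t_end - t) / tau)

# Illustrative values: time in ms, one spike every 100 ms (10 Hz) for 100 s.
tau, beta = 10000.0, 0.001
spikes = [100.0 * k for k in range(1, 1001)]
ca_final = calcium_at(spikes, tau, beta, t_end=100000.0)

# Expected level for regular firing at r = 0.01 spikes/ms: beta * tau * r.
ca_expected = beta * tau * 0.01
```

Doubling the firing rate in this example doubles the level that the concentration settles around, which is the linear proportionality the growth rules rely on.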

(2) Update in synaptic elements. The detailed morphology of the synaptic elements is abstracted, and is represented in this formulation only by the number of possible synaptic contacts on axons (axonal elements representing axonal boutons: senders of synaptic activity) and on dendrites (dendritic elements representing dendritic spines: receivers of synaptic activity) collectively called synaptic elements. Synaptic elements are created or deleted according to a homeostatic rule. In general, the homeostatic rule will create synaptic elements when the activity is lower than the desired setpoint and delete them when the activity is higher until the desired activity level is achieved. This homeostasis is represented by a curve which defines how quickly new elements are created or deleted according to the current level of electrical activity. The original work considers two types of growth curve, linear and Gaussian:

**Linear:**

$$\frac{dz}{dt} = \nu(1 - \frac{1}{\epsilon}Ca(t))$$

where ν is the growth rate and ε is the target level of calcium concentration that the neuron should achieve.

**Gaussian:**

$$\frac{dz}{dt} = \nu \left( 2 \exp \left( -\left( \frac{\mathrm{Ca}(t) - \xi}{\zeta} \right)^2 \right) - 1 \right)$$

where ξ = (η + ε)/2, ζ = (ε − η)/(2√(ln 2)) and ν is the growth rate as before. In this Gaussian growth curve, η represents the minimum calcium concentration that the neuron must have in order to start creating new synaptic elements. As in the linear growth curve, ε represents the target level of calcium concentration that the neuron should achieve.

A synaptic element is formed (or deleted) when the rounded-down value of z increases (or decreases) by one. Newly formed synaptic elements are initially vacant and available for synapse formation.
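The update of synaptic elements can be sketched in a few lines. The Gaussian curve below is written with a squared exponent scaled by ζ, so that its zero crossings fall at η and ε as described in the text. The function names and the single forward-Euler step are illustrative assumptions, not the NEST implementation.

```python
import math


def linear_growth(ca, nu, eps):
    """Linear growth curve: dz/dt = nu * (1 - Ca / eps)."""
    return nu * (1.0 - ca / eps)


def gaussian_growth(ca, nu, eta, eps):
    """Gaussian growth curve with zero crossings at eta and eps."""
    xi = (eta + eps) / 2.0
    zeta = (eps - eta) / (2.0 * math.sqrt(math.log(2.0)))
    return nu * (2.0 * math.exp(-((ca - xi) / zeta) ** 2) - 1.0)


def update_synaptic_elements(z, dz_dt, dt_ms):
    """Advance z by one Euler step and report the change in whole elements.

    Returns the new z and floor(z_new) - floor(z_old): a positive value
    means elements are formed, a negative value means they are deleted.
    """
    z_new = z + dz_dt * dt_ms
    return z_new, math.floor(z_new) - math.floor(z)


nu, eta, eps = 1.0e-4, 0.0, 0.05
# Calcium halfway between eta and eps: growth is maximal (equal to nu).
z, formed = update_synaptic_elements(0.95, gaussian_growth(0.025, nu, eta, eps), 1000.0)
# Calcium above the target eps: growth is negative and an element is lost.
z, deleted = update_synaptic_elements(z, gaussian_growth(0.075, nu, eta, eps), 1000.0)
print(formed, deleted)
```

Only integer crossings of z change the number of elements; the fractional part carries over to the next update.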

(3) Update in connectivity. In every connectivity update, available synaptic elements allow the formation of new synapses, and deleted synaptic elements dictate synapse breaking. Every available synaptic element has the same probability of being randomly chosen for a new connection. Synaptic elements to be deleted are also chosen uniformly at random from the pool of already connected elements. It is important to notice that in this algorithm, when a synapse breaks due to the deletion of one synaptic element, the counterpart remains and becomes vacant again. This remaining counterpart can form a new synapse at the next update in connectivity. This effect models network rewiring by re-routing of axons or dendrites.
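The connectivity update can be illustrated with a small stand-alone sketch. It assumes that vacant elements are tracked as per-neuron counts and that pairing is done by shuffling two global pools. The function names and data structures are hypothetical and do not reflect the NEST data structures, and self-connections and multiple synapses between the same pair are not excluded here.

```python
import random


def match_vacant_elements(vacant_axonal, vacant_dendritic, rng):
    """Pair vacant axonal and dendritic elements uniformly at random.

    Both arguments map a neuron id to its number of vacant elements.
    Returns a list of (pre_id, post_id) synapses.
    """
    pre_pool = [nid for nid, k in vacant_axonal.items() for _ in range(k)]
    post_pool = [nid for nid, k in vacant_dendritic.items() for _ in range(k)]
    rng.shuffle(pre_pool)
    rng.shuffle(post_pool)
    # zip stops at the shorter pool; surplus elements are left unpaired.
    return list(zip(pre_pool, post_pool))


def break_synapse_of(neuron_id, side, synapses, vacant_counterparts, rng):
    """Delete one connected element of neuron_id (side 0 = pre, 1 = post).

    The element at the other end of the broken synapse survives and is
    counted as vacant again in vacant_counterparts.
    """
    candidates = [i for i, syn in enumerate(synapses) if syn[side] == neuron_id]
    pre_id, post_id = synapses.pop(rng.choice(candidates))
    survivor = post_id if side == 0 else pre_id
    vacant_counterparts[survivor] = vacant_counterparts.get(survivor, 0) + 1
    return pre_id, post_id


rng = random.Random(1)
vacant_axonal = {0: 2, 1: 1}
vacant_dendritic = {1: 1, 2: 3}
synapses = match_vacant_elements(vacant_axonal, vacant_dendritic, rng)
n_formed = len(synapses)

# Neuron 0 loses one axonal element; the dendritic counterpart is freed.
freed_dendritic = {}
broken = break_synapse_of(0, 0, synapses, freed_dendritic, rng)
print(n_formed, broken, freed_dendritic)
```

The number of synapses formed in one update is limited by the smaller of the two pools, which is why the availability of compatible elements alone determines the connection probability.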

An important characteristic of this algorithm is that it relies on global communication to update the connectivity in the network, as available compatible synaptic elements must be matched during the simulation to create new connections. This must be taken into consideration in the design of any implementation of this model.

### 2.2. Scalability

To assess the scalability of the framework, we designed strong and weak scaling tests of the structural plasticity implementation. For all tests, networks with 80% excitatory and 20% inhibitory neurons were created. The growth rate for synaptic elements in the simulation was set to 4.0 × 10<sup>−4</sup> elements/ms for the excitatory elements of the inhibitory population and 1.0 × 10<sup>−4</sup> elements/ms for all the other elements. The set point for the desired calcium concentration in the excitatory population was defined as 0.05 Ca<sup>2+</sup>, while in the inhibitory population it was set to 0.2 Ca<sup>2+</sup>. The calcium concentration intake constant was set to β = 0.001 and the calcium concentration decay constant to τ = 10000.0 for all neurons. The post synaptic amplitude of individual synapses was set to 1.0 mV. External input was provided using a Poisson generator with a frequency of 10<sup>4</sup> Hz. The post synaptic amplitude of individual synaptic input was set to 0.01 mV. The simulation was run for 100 s, with a step size for the numerical integration of 0.1 ms. The updates in the network connectivity were performed every 10 ms. These values were chosen because they proved to be one parameter combination that allowed for stable self-organizing growth of the network toward the homeostatic equilibrium (see Section 3.3.1 for additional comments on the selection of this parameter set).

Weak scaling tests were performed for networks with 5000 neurons per node and settings of 1, 2, 4, 8, and 16 nodes, each node using 28 cores. Strong scaling tests were performed with a network of 100,000 neurons on the same hardware configurations as the weak scaling tests. Only physical cores were used; simultaneous multithreading was not enabled. A hybrid parallelization approach was chosen, in which MPI is used for communication between nodes and OpenMP for intra-node communication. All measurements were performed on the JUROPATEST cluster, which provides up to 70 nodes (T-Platforms V210s Blades), each with 2 × Intel(R) Xeon(R) CPU E5-2695 v3 (Haswell) 14-core processors (2.30 GHz) and 128 GB DDR memory, running Scientific Linux release 6.5 (Carbon).

### 2.3. Use Cases for the Structural Plasticity Framework

The main objective of the structural plasticity framework is to provide the user with a tool to model the dynamic creation and deletion of synapses between neurons of a neural network in a scalable manner. There are several applications in which structural plasticity can be used. In this section we detail two use cases as examples. The first use case shows the basic functionality of the framework and how it can be used to study the relationship between connectivity and activity. We also show how this simple set-up can model critical development periods in the network connectivity. The second example is a more complicated case with several populations, where the objective is to show how connectivity can be self-generated in a network by using the synaptic element growth curves as connectivity fitness rules. All simulations were carried out with NEST version 2.8.0 extended by our structural plasticity implementation.

### 2.3.1. A Simple Two Population Network

In this initial use case, we generate a network with a total of 1000 leaky integrate and fire neurons, 80% excitatory and 20% inhibitory. For the excitatory neurons, η = 0.0, ε = 0.05 and ν = 1.0 × 10<sup>−4</sup> elements/ms. For the inhibitory neurons, η = 0.0, ε = 0.2 and ν = 1.0 × 10<sup>−4</sup> elements/ms, except for the excitatory elements which had ν = 4.0 × 10<sup>−4</sup> elements/ms. The connectivity in the system was allowed to evolve using a Gaussian growth curve for 3000 s, with an integration step of 0.1 ms and a delay of the connectivity update equal to 100 integration steps. The simulations were performed on a workstation with 8 Intel Core i7-4770 @ 3.4 GHz CPUs running openSUSE 13.1.

### 2.3.2. The Cortical Microcircuit Network

In this second use case, we create a four layer network based on the model of the cortical microcircuit proposed by Potjans and Diesmann (2014). Each layer contains one inhibitory and one excitatory population of leaky integrate and fire neurons. In the simulations presented here, the network starts with the same number of neurons in each population as in the previous study, but without any synaptic connections. For each population, we define a level of desired mean electrical activity based on experimental literature and a growth curve which defines the dynamics of the variation in the number of pre- and post-synaptic elements. These are Gaussian shaped curves with two intersections with the x-axis that determine the minimum amount of electrical activity required to form any synapse (η), and the target mean calcium concentration for the neuron (ε). The curves are illustrated in **Figure 1**.

FIGURE 1 | Growth curves for each synaptic element in each layer of the cortical microcircuit model. The growth curves define the rate at which synaptic elements are created depending on the amount of calcium concentration in the cell at the moment. Red curves are for neurons in the excitatory population. Blue curves are for neurons in the inhibitory population. Solid lines are for the excitatory synaptic elements and dotted lines represent inhibitory synaptic elements. The vertical purple line defines the target level of calcium concentration for excitatory neurons and the vertical cyan line represents the target level of calcium concentration for inhibitory neurons. It is important to highlight that all synaptic elements of the same neuron must have a growth curve with the same target level of calcium concentration, otherwise equilibrium will never be reached.

In the first example, we tune the growth rate to achieve a stable growth regime for the network connectivity. This means that the structural plasticity algorithm will stop creating and deleting synaptic connections when the desired mean activity is reached, and that this mean activity is actually reached on average in each population. In a second example, the growth rate provided leads to an unstable connectivity pattern, where the target mean electrical activities are never reached by all populations. A table containing the parameters for both cases can be found in the Appendix in the Supplementary Material.

A third example was run to illustrate a more common situation, in which some assumptions about the connectivity in a network exist and the structural plasticity framework can help to find a suitable balance between excitation and inhibition in the network. Here we used the original model of Potjans and Diesmann (2014) and enabled structural plasticity after an initial stabilization period of 30 s.

Simulations were performed on JUROPATEST (70 nodes with 2 × 14-core Intel(R) Xeon(R) CPU E5-2695 v3 (Haswell) processors at 2.30 GHz and 128 GB DDR memory, running Scientific Linux release 6.5) and JURECA (260 compute nodes with Intel Xeon E5-2680 v3 Haswell CPUs with 2 × 12 cores per CPU and 128 GB of RAM per node, running the CentOS 7 Linux distribution).

### 3. RESULTS

### 3.1. Implementation of the Structural Plasticity Model into NEST

The implementation of the structural plasticity algorithm described in this work is based on version 2.8 of the NEST software (Eppler et al., 2015). In accordance with the original formalization described in Section 2.1, the algorithm consists of three repeating parts, which are shown in general form in **Figure 2** and described as follows:


the ConnBuilder was extended to include the new methods ConnBuilder::sp\_disconnect\_ and ConnBuilder::sp\_connect\_. Once new synapses are formed, synaptic elements are tagged from "vacant" to "connected." It is important to notice that when a synapse breaks due to the deletion of one synaptic element, the counterpart remains and becomes vacant again. This remaining counterpart can form a new synapse at the next update in connectivity. This effect preserves the network rewiring capabilities of the original formulation. A detailed diagram of how the new calls are integrated into the normal simulation flow of NEST can be seen in **Figure 3**.

An important feature that we implemented to simulate structural plasticity in NEST is the ability to create and delete synapses during the simulation time. Our new implementation of the connection management overcomes the limitation of the NEST simulator that currently models networks with a fixed connectivity. We have implemented the dynamic creation and deletion of synapses using the new connection framework released in version 2.6.0. The new connection framework improves memory usage to store connection data and reduces the computation time needed to create a connection.

The main limitation of the structural plasticity algorithm described by Butz and van Ooyen (2013) is that it requires global knowledge of the synaptic elements of the entire network. Fortunately, the MPI global communications, also used by the NEST kernel to communicate the electrical activity between the neurons during the simulation, do not pose a substantial bottleneck, since changes in connectivity are assumed to take place on average around a factor of 100 more slowly than changes in electrical activity. Therefore, selecting a biologically realistic growth rate of around 10<sup>−4</sup> elements/ms results in a data exchange rate low enough not to affect the scalability of the simulator as a whole. At the end of each connectivity update step, the number of created/deleted synaptic elements per neuron is communicated to all MPI processes, and a global shuffle subsequently assigns the new pairs of neurons that should be connected and likewise chooses existing connections for deletion. In the current implementation, no topological constraints are taken into account when deciding which neurons will be connected. The probability of two neurons connecting to each other depends solely on the number of available compatible synaptic elements between them. The actual creation and deletion of the synapses is finally done in parallel using the NEST connection framework. As stated before, a single update in connectivity should not produce a major modification of the network. This means that only a small fraction of the neurons should create or delete a synaptic element between two updates in connectivity.
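A back-of-the-envelope estimate illustrates why the exchanged data volume stays small. The sketch below assumes that the magnitude of dz/dt never exceeds the growth rate ν, which holds for the Gaussian growth curve, so that ν multiplied by the update interval bounds the mean number of integer element changes per neuron and element type. The function name is hypothetical; the parameter values are those of the scaling tests in Section 2.2.

```python
def max_element_changes_per_update(n_neurons, growth_rate_per_ms, update_interval_ms):
    """Upper bound on the mean number of element changes per update.

    Assumes |dz/dt| <= growth rate for every neuron, so each neuron
    changes z by at most growth_rate * interval between two updates.
    The bound applies per synaptic element type.
    """
    return n_neurons * growth_rate_per_ms * update_interval_ms


# 100,000 neurons, nu = 1e-4 elements/ms, connectivity update every 10 ms.
bound = max_element_changes_per_update(100000, 1.0e-4, 10.0)
print(bound)
```

For these values the bound is on the order of 100 element changes per element type and update, compared with 100,000 neurons whose spikes are exchanged continuously.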

It is important to highlight that the usage of global communication is a characteristic of the technical implementation of the algorithm and is not related to the functionality of the model. If topology were to be taken into account, the ability of a neuron to connect to any other would be limited by the constraints imposed by its position relative to the others. Global communication would still be used by the implementation, but only relevant information would be taken into account to define the connectivity. The local homeostatic rules only define the creation or deletion of synaptic elements per neuron. The number of available synaptic elements is transmitted globally, and the structural plasticity manager takes care of forming new synapses or deleting existing ones based on this information.

The update of electrical activity and of the number of synaptic elements is performed by every individual neuron and therefore benefits from the parallel framework already implemented in NEST. Indeed, the NEST software has already demonstrated good scaling properties on supercomputers, including the JUQUEEN system (Helias et al., 2012; Kunkel et al., 2014).

Finally, the Python interface of NEST (PyNEST) was extended to allow users to easily set up the structural plasticity parameters. It is important to highlight that the user can enable structural plasticity inside the simulation and then disable it when the network has achieved a desired connectivity pattern or activity level. The user can now also delete synapses even without enabling structural plasticity, in a similar way as the connect functions work in NEST.

### 3.1.1. Setting up a Network in NEST with Structural Plasticity

In this section we present the high level functions that the structural plasticity framework adds to NEST, using PyNEST.

In order to set up the network using structural plasticity, one first needs to define the time at which updates in the structure of the network should take place as follows:

```
nest.SetStructuralPlasticityStatus({
    'structural_plasticity_update_interval': update_interval,
})
```
The next step is to define the synapses which can be dynamically modified by the structural plasticity manager during the simulation. This is achieved by:

```
nest.SetStructuralPlasticityStatus({
    'structural_plasticity_synapses': {
        # Excitatory synapses: formed between 'Axon_ex' and 'Den_ex' elements
        'structural_plasticity_synapse_ex': {
            'model': 'structural_plasticity_synapse_ex',
            'post_synaptic_element': 'Den_ex',
            'pre_synaptic_element': 'Axon_ex',
        },
        # Inhibitory synapses: formed between 'Axon_in' and 'Den_in' elements
        'structural_plasticity_synapse_in': {
            'model': 'structural_plasticity_synapse_in',
            'post_synaptic_element': 'Den_in',
            'pre_synaptic_element': 'Axon_in',
        },
    }
})
```
Here, two types of synapses are being defined, one for the excitatory synapses and another one for the inhibitory synapses. It is important to notice that in this definition, a name for the post and pre synaptic elements is also specified. This allows the structural plasticity manager to create new synapses of the type specified in model when synaptic elements related to this label become available. This way of setting up the dynamic synapses also allows the user to define static connectivity constraints in the network. This can be achieved by using one synapse model which is not registered for structural plasticity to define this fixed connectivity. For the moment, no other constraints in connectivity like indegree or outdegree ranges can be specified. Nevertheless, thanks to its flexible design, the model can be extended to add new constraints.

The next step involves defining the growth curves for the synaptic elements defined above. This is done as follows:

```
growth_curve_e_e = {
    'growth_curve': "gaussian",
    'growth_rate': 0.0001,
    'continuous': False,
    'eta': 0.0,
    'eps': 0.05,
}
```
This is an example of a Gaussian growth curve where the minimum level of calcium concentration required to start generating synaptic elements is η = 0.0 Ca<sup>2+</sup>, and the desired calcium concentration is set to ε = 0.05 Ca<sup>2+</sup>. Finally, the rate at which the synaptic elements will grow is ν = 1 × 10<sup>−4</sup> elements/ms. Independent growth curves can be created for each synaptic element.

FIGURE 5 | Upper panel: Calcium concentration and numbers of connections as functions of time in a simple two population network. The cyan and black curves show the calcium concentration measured in the inhibitory and excitatory populations, respectively. The paler horizontal lines indicate the corresponding target levels ε. The blue and red dashed curves indicate the total number of connections in the inhibitory and excitatory populations, respectively. Vertical gray lines indicate the times of the snapshots displayed in the lower panel. Lower panel (A–F): Evolution of the connectivity in the two population network visualized using MSPViz. Images show half of the total amount of neurons in the network, where triangles represent excitatory neurons and circles inhibitory neurons. Red lines indicate excitatory connections while blue lines indicate inhibitory connections.

Now that we have defined the growth curve, we can assign this growth curve to the synaptic elements that each neuron will be able to grow. After that, we create the neurons and let NEST know that these synaptic elements are linked to the neurons:

```
synaptic_elements = {
    'Den_ex': growth_curve_e_e,
    'Den_in': growth_curve_e_i,
    'Axon_ex': growth_curve_e_e,
}
nodes = nest.Create('iaf_neuron', number_excitatory_neurons)
nest.SetStatus(nodes, 'synaptic_elements', synaptic_elements)
```
In this case we are creating the neurons pertaining to the excitatory population. Each neuron has three types of synaptic elements, one dendritic excitatory, one dendritic inhibitory and one axonal excitatory.
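As an illustration, the parameter values of the two population network (Section 2.3.1) could be arranged into growth curve dictionaries as sketched below. The helper names `make_growth_curve` and `check_common_target` are hypothetical, and reading "the excitatory elements" of the inhibitory neurons as their `Den_ex` elements is an assumption. The sketch does not call NEST; the resulting dictionaries would be passed to `nest.SetStatus` as in the listing above.

```python
def make_growth_curve(growth_rate, eta, eps):
    """Build a Gaussian growth curve dictionary in the format shown above."""
    return {
        'growth_curve': "gaussian",
        'growth_rate': growth_rate,
        'continuous': False,
        'eta': eta,
        'eps': eps,
    }


def check_common_target(synaptic_elements):
    """Return True if all elements of one neuron share the same target eps."""
    targets = {curve['eps'] for curve in synaptic_elements.values()}
    return len(targets) == 1


# Excitatory neurons: eta = 0.0, eps = 0.05, nu = 1e-4 elements/ms.
elements_ex = {
    'Den_ex': make_growth_curve(1.0e-4, 0.0, 0.05),
    'Den_in': make_growth_curve(1.0e-4, 0.0, 0.05),
    'Axon_ex': make_growth_curve(1.0e-4, 0.0, 0.05),
}

# Inhibitory neurons: eps = 0.2. The excitatory element (assumed here to be
# 'Den_ex') grows with nu = 4e-4 elements/ms, all others with nu = 1e-4.
elements_in = {
    'Den_ex': make_growth_curve(4.0e-4, 0.0, 0.2),
    'Den_in': make_growth_curve(1.0e-4, 0.0, 0.2),
    'Axon_in': make_growth_curve(1.0e-4, 0.0, 0.2),
}

print(check_common_target(elements_ex), check_common_target(elements_in))
```

The consistency check reflects the requirement stated in the caption of Figure 1: all synaptic elements of the same neuron must share the same target calcium level, otherwise equilibrium cannot be reached.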

The final step is to enable structural plasticity and simulate:

```
nest.EnableStructuralPlasticity()
nest.Simulate(t_sim)
```
A complete PyNEST example which describes how to create a network with two populations, enable structural plasticity and simulate the network is available as Supplementary Material for this paper.

### 3.2. Scalability

While the update in electrical activity has been proven to scale up to 10<sup>9</sup> neurons, it is important to verify that updating the number of elements and deleting and forming synapses do not restrict the expected scaling, at least in the desired regime of up to 10<sup>6</sup> neurons. Updates in synaptic elements and connectivity make use of MPI's "AllGather" communication scheme to communicate the data. This collective communication is also used by the NEST kernel to communicate the spiking activity between the neurons during the simulation. Although AllGather implements communication between all processes, it is very unlikely that a large amount of data has to be communicated when a reasonable growth rate of around 10<sup>−4</sup> elements/ms is chosen, because updating the number of synaptic elements and the connectivity are very slow processes compared to the update in electrical activity.

### 3.2.1. Weak Scaling

**Figure 4A** shows the efficiency, defined as the speed-up divided by the number of nodes, of the implementation as measured by a weak scaling test with 28 OMP threads running on each node. It is visible that, as the number of neurons increases, so does the total number of synapses. The presence of new synapses leads to an increase of communication between neurons, which leads to a decrease in the efficiency of the simulation.

### 3.2.2. Strong Scaling for a Network of 100,000 Neurons

**Figure 4B** shows the computation times of the strong scaling tests for a network of 100,000 neurons, and **Figure 4C** shows the efficiency, defined as speed-up divided by the number of nodes, of the strong scaling test. The peak efficiency is achieved with 4 nodes and 112 cores. These results show supra-linear scaling for this network. Supra-linear scaling for biological neural networks in NEST, due to increasingly efficient caching, was previously demonstrated by Morrison et al. (2005) and Plesser et al. (2007).
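The efficiency measure used here can be written down explicitly. In the sketch below, the function name is hypothetical and the timings are invented placeholder values chosen only to show what supra-linear scaling (efficiency above 1) looks like; they are not measurements from this work.

```python
def strong_scaling_efficiency(t_one_node, t_n_nodes, n_nodes):
    """Efficiency = speed-up / number of nodes, with speed-up = T(1) / T(n)."""
    speed_up = t_one_node / t_n_nodes
    return speed_up / n_nodes


# Hypothetical wall-clock times in seconds, NOT measurements from this work.
timings = {1: 1000.0, 2: 450.0, 4: 200.0, 8: 125.0}
efficiencies = {
    n_nodes: strong_scaling_efficiency(timings[1], t, n_nodes)
    for n_nodes, t in timings.items()
}
print(efficiencies)
```

An efficiency of exactly 1 corresponds to ideal linear scaling; values above 1 indicate that doubling the resources more than halves the run time.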

These results show that the introduction of the new structural plasticity framework into NEST has no impact on the scalability of the simulation up to a network size close to that of a cortical column, provided a suitably low growth rate is selected.

### 3.3. Performance on the Use Cases

### 3.3.1. A Simple Two Population Network

The upper panel of **Figure 5** shows the evolution of the calcium concentration and total number of connections for the two population model described in Section 2.3.1.

The lower panel of **Figure 5** shows a graphical representation of the evolution of the connectivity in the network. During the first 30 s of the simulation, mostly excitatory connections are created (**Figure 5A**). This allows the calcium concentration to increase in both populations. When the target mean electrical activity is reached and overshoots in the excitatory population (**Figure 5B**), the number of excitatory connections starts to decrease (**Figure 5C**) until the desired level of calcium concentration is achieved and stabilized in the excitatory population. However, both pre- and post-synaptic elements in the inhibitory population are still being created because it has not yet reached its target mean electrical activity. It is important to remember that neurons have no information regarding the global status of the network and the evolution of their synaptic elements depends solely on the predefined local homeostatic rules. At around 40 s (**Figure 5D**), an increase in excitatory connections is triggered by the enhanced levels of inhibition. This leads to a complete rewiring of the network (**Figure 5E**). The trend is preserved until the target mean electrical activity in the inhibitory population is also reached (**Figure 5F**).

In this network setting, the inhibitory population has a higher level of activity than the excitatory population. It is important to remember that the probability of two neurons connecting depends only on the number of available compatible synaptic elements between them. At the start of the simulation, the inhibitory population must offer more post-synaptic elements for excitatory synapses than the excitatory population, otherwise the excitatory population would reach equilibrium first and cease to create excitatory pre-synaptic elements. As a result, not enough excitatory synapses would be created to the inhibitory population and it would never reach the desired level of activity. It is important to remember that the structural plasticity parameter space is broad and a certain amount of exploration is required to discover combinations for each synaptic element which take the network to equilibrium. However, there is in general no unique combination of parameters leading to equilibrium, and different equilibrium combinations will typically produce different connectivity patterns. At this point, biological constraints must be applied to choose between them.

#### 3.3.2. The Cortical Microcircuit Network

In the case of the cortical microcircuit model described in Section 2.3.2, **Figure 6** shows the changes in calcium concentration, while **Figure 7** shows the evolution of connectivity among layers as the simulation runs. In this case, parameters which lead to stable network connectivity were chosen. Reaching stable connectivity in the network requires around 700 biological seconds of simulation, which takes 24 h using 25 nodes and 24 cores per node on the JURECA cluster. During the first 20–30 s of simulation, connectivity increases sharply in every layer. After the initial overshoot, a smoother approach toward the desired activity levels is achieved. Judged only from the calcium concentration diagram, the evolution of the network appears to be quite stable. The connectivity plots, however, show a continuous dynamical reorganization. While neurons in some layers might start deleting connections due to excess activity, the post-synaptic neurons must then create new connections in order to compensate for missing activity if they have not reached their set point yet. This leads to a continuous search for compensating excitation and inhibition which must satisfy the requirements of all 8 populations. From **Figure 7** it can be seen that outgoing connections from excitatory populations in layers IV, V, and VI are quite stable. On the other hand, layer II/III exhibits the highest amount of reorganization, both from the excitatory and inhibitory populations. This might be because their low target levels of activity are easily influenced by variations in all other layers. Inhibitory populations in all layers in general exhibit a higher degree of reorganization during the whole simulation.

The search space of connectivity parameters for this model of the cortical microcircuit is large, as each setup requires 64 values to be defined. If a brute force exploration were performed on these parameters by simulating each combination for 1 biological second, only 1–2 values per parameter could be considered before more biological seconds would be simulated than with the structural plasticity approach. When adequate synaptic element growth curves are defined, the structural plasticity framework allows a progressive exploration of the space in which the dynamics of the 8 populations are balanced at every step, thus providing an efficient way to find stable connectivity combinations.
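The comparison with a brute force search can be made explicit with a short calculation. It assumes an exhaustive grid in which every one of the 64 parameters takes the same number of candidate values; the 700 biological seconds are the time reported above for reaching stable connectivity with structural plasticity. The constant and function names are illustrative.

```python
N_PARAMETERS = 64                # connectivity values per setup
SECONDS_PER_RUN = 1              # biological seconds simulated per combination
STRUCTURAL_PLASTICITY_S = 700    # biological seconds with structural plasticity


def brute_force_seconds(values_per_parameter):
    """Total biological seconds for an exhaustive grid over all parameters."""
    return values_per_parameter ** N_PARAMETERS * SECONDS_PER_RUN


for k in (1, 2, 3):
    total = brute_force_seconds(k)
    print(k, total, total > STRUCTURAL_PLASTICITY_S)
```

With a single value per parameter there is only one combination to test, whereas two values per parameter already give 2<sup>64</sup> combinations, far beyond the 700 s budget.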

**Figure 8** presents a comparison of the proportional values of connectivity among layers between the results obtained from the simulation using structural plasticity and the original values reported by Potjans and Diesmann (2014). The average error in percentage connectivity is 1.058 ± 1.175.

A second case was also explored, in which parameters that lead to unstable network activity were chosen. **Figure 9** shows the evolution of connectivity among layers and **Figure 10** shows the changes in calcium concentration in each layer for this scenario. Overshoots in the connectivity are caused by the choice of a higher rate of synaptic element creation. The system behaves as a feedback control system, with a delay which is defined by the time between updates in connectivity and the synaptic element creation rate. The synaptic element growth rate determines the steepness of the growth curve, and influences the speed at which control changes are made. The instability in the connectivity is reflected in the calcium concentration, which never reaches the desired levels. A stable setting involves finding a suitable balance between the speed of creation of excitatory and inhibitory connections in relation to the desired level of activity for each layer.
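The control-loop picture can be illustrated with a deliberately simplified toy model. This is not the cortical microcircuit and not the NEST implementation: a single variable z drives the calcium level through a first-order low-pass filter, the linear growth curve is used, and the lag in the loop comes from the calcium filter rather than from the connectivity update interval. All names and the `gain` parameter are hypothetical.

```python
def peak_calcium(nu, t_sim_ms=100000.0, dt_ms=1.0, tau_ms=10000.0,
                 eps=0.05, gain=0.05):
    """Forward-Euler simulation of a toy homeostatic feedback loop.

    Assumptions (for illustration only):
      tau * dCa/dt = -Ca + gain * z      calcium low-pass filters the drive
      dz/dt        = nu * (1 - Ca/eps)   linear growth curve
    Returns the largest calcium value reached during the run.
    """
    ca, z, peak = 0.0, 0.0, 0.0
    for _ in range(int(t_sim_ms / dt_ms)):
        d_ca = (-ca + gain * z) / tau_ms
        d_z = nu * (1.0 - ca / eps)
        ca += d_ca * dt_ms
        z += d_z * dt_ms
        peak = max(peak, ca)
    return peak


eps = 0.05
peak_slow = peak_calcium(1.0e-5)   # slow growth: calcium creeps up to the target
peak_fast = peak_calcium(1.0e-3)   # fast growth: calcium overshoots the target
print(peak_slow / eps, peak_fast / eps)
```

In this toy loop, a low growth rate lets calcium approach the set point from below without crossing it, while a growth rate two orders of magnitude higher drives it well above the set point before the element count is corrected, which is qualitatively the kind of overshoot described above.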

To study the behavior of the structural plasticity algorithm on partially pre-connected networks, another simulation was set up in which the initial conditions in connectivity for the structural plasticity algorithm were those specified in the original model of Potjans and Diesmann (2014). The network was simulated without plasticity for an initial period of 30 s in order to allow the calcium concentration to reach an initial stable value. The evolution of the calcium concentration in all layers after the structural plasticity algorithm was enabled can be seen in **Figures 11A,B**. The stability point is reached much faster than in the scenario with no initial connections, at around 400 s. A final simulation was set up in which the connectivity was specified with a 10% error margin from the original setup reported by Potjans and Diesmann (2014). The evolution of the calcium concentration in all layers after plasticity was enabled can be seen in **Figures 11C,D**. The structural plasticity algorithm is able to find a suitable balance between excitation and inhibition. The initial overshoot in electrical activity is a reflection of the initial stronger reconfigurations of the network connectivity. It is important to highlight that a suitable growth scheme is required for the algorithm to reach this stability. Not all setups will become stable or find a solution; this depends on the initial conditions, the desired set points, the shape of the growth curve and the growth rate.

FIGURE 8 | Comparison of the normalized connectivity in the microcircuit model between the results obtained with the structural plasticity framework (red) and the values reported by Potjans and Diesmann (2014) (blue). The radius of the circle represents the linearly normalized value of the percentage of connections between layers.

### 4. DISCUSSION

In this paper we have described the implementation of a framework of structural plasticity for the neural network simulator NEST. We show that the framework is scalable and can be used to model the dynamical creation and deletion of synapses inside a large scale network guided by simple homeostatic rules.

This work also presents some use cases for the framework and some of its potential applications. Researchers can now use structural plasticity in NEST to generate the connectivity of a network from scratch, defining homeostatic rules, in the form of synaptic element growth curves, which may vary according to their needs. The shape of the growth curve defines the speed with which new synaptic elements are created, and as a result, defines the acceleration at which calcium is stored inside the neuron. The relationship between the growth speeds of excitatory and inhibitory elements at a given level of calcium concentration is fundamental to achieving stable setups under the model of structural plasticity. As has been shown, some parameter combinations lead to unstable activity in the network. There are cases where the desired average electrical activity will never be reached by the system. In other cases the average electrical activity will oscillate continuously or suddenly go out of bounds. This relationship also depends on the size of the network and the neuron model used. As a consequence, some care is required in navigating the parameter space in order to achieve the desired results.

The example of the two population network illustrates how this framework can be used to understand the interaction between activity and the creation of synapses. The behavior observed in the simulation can be used to model how inhibition triggers critical periods of connectivity during development of neural networks (Hensch, 2005). During this window, external stimuli can also be used to shape the formation of the new connections. Together with the performance measurements, these results show that our implementation of structural plasticity is suitable to study the development of connectivity patterns inside a neural network in an efficient and scalable manner.

In the specific case of the cortical microcircuit presented in this work, we are able to see some similarities and differences between the results obtained by simulating with the structural plasticity framework and the data reported in Potjans and Diesmann (2014). One of the most visible differences is the smaller amount of recurrent connections generated in the simulation for layer 2/3. This layer has a very low target electrical activity, which is initially almost reached by external input. This means that very few synapses are required to reach this target. This fact limits the creation of synapses for this layer. Note that the results shown in this paper were obtained only by defining target activity levels; no other connectivity constraints were specified. A more elaborate simulation could incorporate tailored growth curves for each layer, and implement additional connectivity restrictions which promote recurrent connections and other connectivity patterns that do not emerge naturally from the current approach.

Another visible difference is that the excitatory populations of layers 5 and 6 show a higher number of connections than reported in the original work. On the other hand, connections from and to the inhibitory populations of layer 5 and layer 2/3 are well fit. Except for connections between the inhibitory and excitatory populations of layer 4, connections from and to layer 4 are also well predicted.

In this paper we describe a framework which can be used to study structural network dynamics. The focus of this paper is on the technical implementation. It is beyond the scope of the present work to perform a deep analysis of the biological results that can be obtained using this framework. However, we provide some examples of how the framework can be used, its capacities and limitations. This implementation gives researchers flexibility to explore complex connectivity dynamics by extending the synaptic elements growth rules. As our implementation is integrated into NEST, simulations using structural plasticity can also be combined with other features available in the simulator. For example, the user may take into account dynamic synaptic weights by combining this framework with synaptic plasticity. The framework can also be further extended using the current topology framework in NEST in order to constrain connectivity by relative position.

We also show that the structural plasticity algorithm is able to solve the complex balance of interaction between layers with different levels of electrical activity when partial information of the connectivity is available. This result is very promising, as it shows that given the right growth rules, it would now be possible to reconstruct connectivity inside a network without having exact anatomical information. We therefore conclude that our approach represents a novel and useful technique to close the current gaps in information about the connectivity in certain regions of the brain.

### AUTHOR CONTRIBUTIONS

SD worked on the implementation of the structural plasticity framework in NEST, simulated the use cases and performed the scalability tests. MN did most of the implementation of the structural plasticity framework in NEST and assessed the simulations. MB gave the theoretical guidance regarding the structural plasticity algorithm and the use cases. AM gave scientific and theoretical guidance on the simulation and implementation of the framework.

### FUNDING

This work was supported by the Helmholtz Association through the Helmholtz Portfolio Theme "Supercomputing and Modeling for the Human Brain" and its Initiative and Networking Fund, and by the Jülich Aachen Research Alliance (JARA).

### ACKNOWLEDGMENTS

We would like to thank Juan Pedro Brito Méndez and Benjamin Weyers for their collaboration in the development of the MSPViz tool to visualize the evolution of neural networks using our implementation of structural plasticity in NEST.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnana.2016.00057


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Diaz-Pier, Naveau, Butz-Ostendorf and Morrison. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Structural Plasticity, Effectual Connectivity, and Memory in Cortex

#### Andreas Knoblauch<sup>1</sup> \* and Friedrich T. Sommer <sup>2</sup>

*<sup>1</sup> Informatics Faculty, Albstadt-Sigmaringen University, Albstadt, Germany, <sup>2</sup> Redwood Center for Theoretical Neuroscience, University of California at Berkeley, Berkeley, CA, USA*

Learning and memory is commonly attributed to the modification of synaptic strengths in neuronal networks. More recent experiments have also revealed a major role of structural plasticity including elimination and regeneration of synapses, growth and retraction of dendritic spines, and remodeling of axons and dendrites. Here we work out the idea that one likely function of structural plasticity is to increase "effectual connectivity" in order to improve the capacity of sparsely connected networks to store Hebbian cell assemblies that are supposed to represent memories. For this we define effectual connectivity as the fraction of synaptically linked neuron pairs within a cell assembly representing a memory. We show by theory and numerical simulation the close links between effectual connectivity and both information storage capacity of neural networks and effective connectivity as commonly employed in functional brain imaging and connectome analysis. Then, by applying our model to a recently proposed memory model, we can give improved estimates on the number of cell assemblies that can be stored in a cortical macrocolumn assuming realistic connectivity. Finally, we derive a simplified model of structural plasticity to enable large scale simulation of memory phenomena, and apply our model to link ongoing adult structural plasticity to recent behavioral data on the spacing effect of learning.

#### Edited by:

*Markus Butz, Independent Researcher, Germany*

#### Reviewed by:

*Christian Tetzlaff, Max Planck Institute for Dynamics and Self-Organization, Germany*

*Sen Cheng, Ruhr University Bochum, Germany*

#### \*Correspondence:

*Andreas Knoblauch knoblauch@hs-albsig.de*

Received: *30 November 2015* Accepted: *26 May 2016* Published: *16 June 2016*

#### Citation:

*Knoblauch A and Sommer FT (2016) Structural Plasticity, Effectual Connectivity, and Memory in Cortex. Front. Neuroanat. 10:63. doi: 10.3389/fnana.2016.00063*

Keywords: synaptic plasticity, effective connectivity, transfer entropy, learning, potential synapse, memory consolidation, storage capacity, spacing effect

### 1. INTRODUCTION

Traditional theories attribute adult learning and memory to Hebbian modification of synaptic weights (Hebb, 1949; Bliss and Collingridge, 1993; Paulsen and Sejnowski, 2000; Song et al., 2000), whereas recent evidence also suggests a role for network rewiring by structural plasticity including generation of synapses, growth and retraction of spines, and remodeling of dendritic and axonal branches, both during development and adulthood (Raisman, 1969; Witte et al., 1996; Engert and Bonhoeffer, 1999; Chklovskii et al., 2004; Butz et al., 2009; Holtmaat and Svoboda, 2009; Xu et al., 2009; Yang et al., 2009; Fu and Zuo, 2011; Yu and Zuo, 2011). One possible function of structural plasticity is effective information storage, both in terms of space and energy requirements (Poirazi and Mel, 2001; Chklovskii et al., 2004; Knoblauch et al., 2010). Indeed, due to space and energy limitations, neural networks in the brain are only sparsely connected, even on a local scale (Abeles, 1991; Braitenberg and Schüz, 1991; Hellwig, 2000). Moreover, it is believed that the energy consumption of the brain is dominated by the number of postsynaptic potentials or, equivalently, the number of functional non-silent synapses (Attwell and Laughlin, 2001; Laughlin and Sejnowski, 2003; Lennie, 2003). Together this implies a pressure to minimize the number and density of functional (non-silent) synapses. It has therefore been suggested that the function of structural plasticity is to "move" the rare expensive synapses to the most useful locations, while keeping the mean number of synapses at a constant low level (Knoblauch et al., 2014). In this way, sparsely connected networks can have computational abilities that are equivalent to those of densely connected networks.
For example, it is known that memory storage capacity of neural associative networks scales with the synaptic density, such that networks with a high connectivity can store many more memories than networks with a low connectivity (Buckingham and Willshaw, 1993; Bosch and Kurfess, 1998; Knoblauch, 2011). For modeling structural plasticity it is therefore necessary to define different types of "connectivity," for example, to be able to distinguish between the actual number of anatomical synapses per neuron and the "potential" or "effectual" synapse number in an equivalent network with a fixed structure (Stepanyants et al., 2002; Knoblauch et al., 2014).

In this work we develop substantial new analytical results and insights focusing on the relation between network connectivity, structural plasticity, and memory. First, we work out the relation between "effectual connectivity" in structurally plastic networks and functional measures of brain connectivity such as "effective connectivity" and "transfer entropy." Assuming a simple model of activity propagation between two cortical columns or areas, we argue that effectual connectivity is basically equivalent to the functional measures, while maintaining a precise anatomical interpretation. Second, we give improved estimates on the information storage capacity of a cortical macrocolumn as a function of effectual connectivity (cf., Stepanyants et al., 2002; Knoblauch et al., 2010, 2014). For this we develop exact methods (Knoblauch, 2008) to analyze associative memory in sparsely connected cortical networks storing random activity patterns by structural plasticity. Moreover, we generalize our analyses, which are reasonable only for very sparse neural activity, to a recently proposed model of associative memory with structural plasticity (Knoblauch, 2009b, 2016) that is much more appropriate for the moderately sparse activity deemed necessary to stabilize cell assemblies or synfire chains in networks with sparse connectivity (Latham and Nirenberg, 2004; Aviel et al., 2005). Third, we point out in more detail how effectual connectivity may relate to cognitive phenomena such as the spacing effect, that is, the finding that learning improves if rehearsal is distributed over multiple sessions (Ebbinghaus, 1885; Crowder, 1976; Greene, 1989). For this, we analyze the temporal evolution of effectual connectivity and optimize the time gap between learning sessions to compare the results to recent behavioral data on the spacing effect (Cepeda et al., 2008).

### 2. MODELING

### 2.1. Memory, Cell Assemblies and Synapse Ensembles

Memories are commonly identified with patterns of neural activity that can be revisited, evoked and/or stabilized by appropriately modified synaptic connections (Hebb, 1949; Bliss and Collingridge, 1993; Martin et al., 2000; Paulsen and Sejnowski, 2000; for alternative views see Arshavsky, 2006). In the simplest case such a memory corresponds to a group of neurons that fire at the same time and, according to the Hebbian hypothesis that "what fires together wires together" (Hebb, 1949) develop strong mutual synaptic connections (Caporale and Dan, 2008; Clopath et al., 2010; Knoblauch et al., 2012). Such groups of strongly connected neurons are called cell assemblies (Hebb, 1949; Palm et al., 2014) and have a number of properties that suggest a function for associative memory (Willshaw et al., 1969; Marr, 1971; Palm, 1980; Hopfield, 1982; Knoblauch, 2011): For example, if a stimulus activates a subset of the cells, the mutual synaptic connections will quickly activate the whole cell assembly which is thought to correspond to the retrieval or completion of a memory. In a similar way, a cell assembly in one brain area u can activate an associated cell assembly in another brain area v. We call the set of synapses that supports retrieval of a given set of memories their synapse ensemble S. Memory consolidation is then the process of consolidating the synapses S.

Formally, networks of cell assemblies can be modeled as associative networks, that is, single layer neural networks employing Hebbian-type learning. **Figure 1** illustrates a simple associative network with clipped Hebbian learning (Willshaw et al., 1969; Palm, 1980; Knoblauch et al., 2010; Knoblauch, 2016) that associates binary activity patterns u<sup>1</sup>, u<sup>2</sup>, ... and v<sup>1</sup>, v<sup>2</sup>, ... within neuron populations u and v having size m = 7 and n = 8, respectively. Here synapses are binary, where a weight Wij may increase from 0 to 1 if both presynaptic neuron u<sub>i</sub> and postsynaptic neuron v<sub>j</sub> have been synchronously activated at least θij times,

$$W_{ij} = \begin{cases} 1, & \omega_{ij} := \sum_{\mu=1}^{M} R(u_i^{\mu}, v_j^{\mu}) \ge \theta_{ij} \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

where M is the number of stored memories, ωij is called the synaptic potential, R defines a local learning rule, and θij is the threshold of the synapse. In the following we will consider the special case of Equation (1) with Hebbian learning, R(u<sub>i</sub><sup>µ</sup>, v<sub>j</sub><sup>µ</sup>) = u<sub>i</sub><sup>µ</sup> · v<sub>j</sub><sup>µ</sup>, and minimal synaptic thresholds θij = 1, which corresponds to the well-known Steinbuch or Willshaw model (**Figure 1**; cf., Steinbuch, 1961; Willshaw et al., 1969). Further, we will also investigate the recently proposed general "zip net" model, where both the learning rule R and synaptic thresholds θij may be optimized for memory performance (Knoblauch, 2016): For R we assume the optimal homosynaptic or covariance rules, whereas synaptic thresholds θij are chosen large enough such that the chance p<sub>1</sub> := pr[Wij = 1] of potentiating a given synapse is 0.5 to maximize entropy of synaptic weights (see Appendix A.3 for further details). In general, we can identify the synapse ensemble S that supports storage of a memory set M by those neuron pairs ij with a sufficiently large synaptic potential ωij ≥ θij where θij may depend on M. For convenience we may represent S as a binary matrix (with Sij = 1 if ij ∈ S and Sij = 0 if ij ∉ S) similar to the weight matrix Wij.
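The learning rule of Equation (1) can be illustrated with a minimal Python sketch for the Willshaw special case (Hebbian rule R = u · v, thresholds θij = 1). The function name `learn_willshaw` and the toy patterns (m = n = 4, two associations) are assumptions made for illustration only; they are not the patterns shown in Figure 1.

```python
def learn_willshaw(u_patterns, v_patterns, theta=1):
    """Clipped Hebbian learning (Equation 1) with R(u, v) = u * v."""
    m, n = len(u_patterns[0]), len(v_patterns[0])
    # Synaptic potentials omega[i][j]: number of coincident pre/post activations.
    omega = [[0] * n for _ in range(m)]
    for u, v in zip(u_patterns, v_patterns):
        for i in range(m):
            for j in range(n):
                omega[i][j] += u[i] * v[j]
    # Clip the potentials at the synaptic threshold to obtain binary weights.
    weights = [[1 if omega[i][j] >= theta else 0 for j in range(n)]
               for i in range(m)]
    return weights, omega


# Hypothetical toy associations u^1 -> v^1 and u^2 -> v^2.
U = [[1, 1, 0, 0], [0, 1, 1, 0]]
V = [[1, 1, 0, 0], [0, 0, 1, 1]]

W, omega = learn_willshaw(U, V)
# Fraction p_1 of potentiated synapses in the fully connected network.
p1 = sum(map(sum, W)) / (len(W) * len(W[0]))
```

In this toy example the synapse ensemble S coincides with the weight matrix W, because the network is fully connected and all required synapses can be potentiated.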

FIGURE 1 | … association (B), pruning of irrelevant silent synapses (C), and the asymptotic storage capacity in bit/synapse as a function of the fraction *p*<sub>1</sub> of potentiated synapses (D) for networks with and without structural plasticity (*C*<sup>tot</sup> vs. *C*<sup>wp</sup>; computed from Equations (49, 50, 47) for *P*<sub>eff</sub> = 1; subscripts ε refer to maximized values at output noise level ε). Note that networks with structural plasticity can have a much higher storage capacity in sparsely potentiated networks with small fractions *p*<sub>1</sub> ≪ 1 of potentiated synapses.

After learning a memory association u<sup>µ</sup> → v<sup>µ</sup>, a noisy input ũ can retrieve an associated memory content v̂ in a single processing step by

$$\hat{v}_{j} = \begin{cases} 1, & x_{j} = \left(\sum_{i=1}^{m} \tilde{u}_{i} W_{ij} + \mathcal{N}_{j}\right) \ge \Theta_{j} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

for appropriately chosen neural firing thresholds Θ<sub>j</sub>. The model may include random variables N<sub>j</sub> to account for additional synaptic inputs and further noise sources, but for most analyses and simulations (except Section 3.1) we assume N<sub>j</sub> = 0 such that retrieval depends deterministically on the input ũ. In **Figure 1B**, stimulating with a noisy input pattern ũ ≈ u<sup>1</sup> perfectly retrieves the corresponding output pattern v̂ = v<sup>1</sup> for thresholds Θ<sub>j</sub> = 2. In the literature, input and output patterns are also called address and content patterns, and the (noisy) input pattern used for retrieval is called query pattern. In the illustrated completely connected network, the thresholds can simply be chosen according to the number of active units in the query pattern, whereas in biologically more realistic models, firing thresholds are thought to be controlled by recurrent inhibition, for example, regulating the number of active units to a desired level l being the mean activity of a content pattern (Knoblauch and Palm, 2001). Thus, a common threshold strategy in the more abstract models is to simply select the l most activated "winner" neurons having the largest dendritic potentials x<sub>j</sub>. In general, the retrieval outputs may have errors and the retrieval quality can then be judged by the output noise

$$\hat{\epsilon} = \frac{\sum_{j=1}^{n} |\hat{v}_{j} - v_{j}^{\mu}|}{l} \tag{3}$$

defined as the Hamming distance between v̂ and v<sup>µ</sup> normalized to the mean number l of active units in an output pattern. Similarly, we can define input noise ε̃ as the Hamming distance between ũ and u<sup>µ</sup> normalized to the mean number k of active units in an input pattern.
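One-step retrieval (Equation 2, with N<sub>j</sub> = 0) and the output noise of Equation (3) can be sketched in a few lines of Python. The helper names `retrieve` and `output_noise`, the 4 × 4 weight matrix, and the query patterns are hypothetical and serve only to illustrate the definitions; the threshold is set to the number of active query units, as described above for fully connected networks.

```python
def retrieve(u_query, weights, theta):
    """One-step retrieval (Equation 2) without additional noise (N_j = 0)."""
    m, n = len(weights), len(weights[0])
    # Dendritic potentials x_j = sum_i u_i * W_ij.
    x = [sum(u_query[i] * weights[i][j] for i in range(m)) for j in range(n)]
    return [1 if x[j] >= theta else 0 for j in range(n)]


def output_noise(v_hat, v_true, l):
    """Hamming distance between v_hat and v_true, normalized by l (Equation 3)."""
    return sum(abs(a - b) for a, b in zip(v_hat, v_true)) / l


# Hypothetical Willshaw weight matrix storing the toy associations
# u^1 = [1,1,0,0] -> v^1 = [1,1,0,0] and u^2 = [0,1,1,0] -> v^2 = [0,0,1,1].
W = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
v2 = [0, 0, 1, 1]
l = 2  # active units per output pattern

# Complete query u^2: threshold = number of active query units.
u_full = [0, 1, 1, 0]
v_full = retrieve(u_full, W, theta=sum(u_full))
eps_full = output_noise(v_full, v2, l)

# Partial query containing only the unit shared by u^1 and u^2:
# the query is ambiguous, so both stored output patterns are activated.
u_part = [0, 1, 0, 0]
v_part = retrieve(u_part, W, theta=sum(u_part))
eps_part = output_noise(v_part, v2, l)
```

The complete query is retrieved without error, whereas the ambiguous partial query yields two false-positive output units and thus an output noise of 1.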

In the illustrated network u and v are different neuron populations corresponding to hetero-association. However, all arguments will also apply to auto-association when u and v are identical (with m = n, k = l), and cell assemblies correspond to cliques of interconnected neurons. In that case output activity can be fed back to the input layer iteratively to improve retrieval results (Schwenker et al., 1996). Stable activation of a cell assembly can then be expected if output noise ε̂ after the first retrieval step is lower than input noise ε̃.

Capacity analyses show that each synapse can store a large amount of information. For example, even without any structural plasticity, the Willshaw model can store C<sup>wp</sup> = 0.69 bit per synapse by weight plasticity (wp) corresponding to a large number of about n<sup>2</sup>/log<sup>2</sup> n small cell assemblies, quite close to the theoretical maximum of binary synapses (Willshaw et al., 1969; Palm, 1980). However, unlike in the illustration, real networks will not be fully connected, but, on a local scale of macrocolumns, the chance that two neurons are connected is only about 10% (Braitenberg and Schüz, 1991; Hellwig, 2000). In this case it is still possible to store a considerable number of memories, although maximal M scales with the number of synapses per neuron, and cell assemblies need to be relatively large in this case (Buckingham and Willshaw, 1993; Bosch and Kurfess, 1998; Knoblauch, 2011).

By including structural plasticity, for example, through pruning the unused silent synapses after learning in a network with high connectivity (**Figure 1C**), the total synaptic capacity of the Willshaw model can even increase to C<sup>tot</sup> ∼ log n ≫ 1 bit per (non-silent) synapse, depending on the fraction p<sub>1</sub> of potentiated synapses (**Figure 1D**; see Knoblauch et al., 2010). Moreover, the same high capacity can be achieved for networks that are sparsely connected at any time, if the model includes ongoing structural plasticity and repeated memory rehearsal or additional consolidation mechanisms involving memory replay (Knoblauch et al., 2014).

In Section 3.2 we precisely compute the maximal number of cell assemblies that can be stored in a Willshaw-type cortical macrocolumn. As the Willshaw model is optimal only for extremely small cell assemblies with k ∼ log n (Knoblauch, 2011), we will extend these results also for the general "zip model" of Equation (1) that performs close to optimal Bayesian learning even for much larger cell assemblies (Knoblauch, 2016).

### 2.2. Anatomical, Potential, and Effectual Connectivity

As argued in the introduction, connectivity is an important parameter to judge performance. However, network models with structural plasticity need to consider different types of connectivity, in particular, anatomical connectivity P, potential connectivity Ppot, effectual connectivity Peff, and target connectivity as measured by consolidation load P<sub>1S</sub> (see **Figure 2**; cf., Krone et al., 1986; Braitenberg and Schüz, 1991; Hellwig, 2000; Stepanyants et al., 2002; Knoblauch et al., 2014),

$$P := \frac{\#\,\text{actual synaptic connections}}{mn}\,, \tag{4}$$

$$P_{\text{pot}} := \frac{\#\,\text{potential synaptic connections}}{mn}\,, \tag{5}$$

$$P_{\text{eff}} := \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} H(W_{ij} S_{ij})}{\sum_{i=1}^{m} \sum_{j=1}^{n} H(S_{ij}^{2})}\,, \tag{6}$$

$$P_{1S} := \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} H(S_{ij}^{2})}{mn}\,, \tag{7}$$

where H is the Heaviside function (with H(x) = 1 if x > 0 and 0 otherwise) to include the general case of non-binary weights and synapse ensembles (Wij, Sij ∈ R).
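To make the definitions concrete, the following Python sketch evaluates anatomical connectivity, effectual connectivity, and consolidation load (Equations 4, 6, and 7) for a small network. The matrix `exists`, marking which neuron pairs currently have an actual synapse, and all numerical values are hypothetical; potential connectivity (Equation 5) would be computed in the same way as P from a matrix of potential synapses.

```python
def heaviside(x):
    """H(x) = 1 if x > 0 and 0 otherwise."""
    return 1 if x > 0 else 0


def connectivity_measures(exists, W, S):
    """Return (P, P_eff, P_1S) following Equations (4), (6), and (7)."""
    m, n = len(S), len(S[0])
    pairs = [(i, j) for i in range(m) for j in range(n)]
    actual = sum(exists[i][j] for i, j in pairs)
    realized = sum(heaviside(W[i][j] * S[i][j]) for i, j in pairs)
    required = sum(heaviside(S[i][j] ** 2) for i, j in pairs)
    return actual / (m * n), realized / required, required / (m * n)


# Hypothetical synapse ensemble S (8 required synapses among 16 neuron pairs).
S = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
# Hypothetical sparse wiring: 1 marks an actual synapse (silent or consolidated).
exists = [[1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0]]
# An actual synapse carries weight 1 only where it is also required by S.
W = [[exists[i][j] * S[i][j] for j in range(4)] for i in range(4)]

P, P_eff, P_1S = connectivity_measures(exists, W, S)
```

Here 6 of 16 neuron pairs are connected (P = 0.375), but 4 of the 8 required synapses are already realized, so the effectual connectivity (P<sub>eff</sub> = 0.5) exceeds the anatomical connectivity.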

First, anatomical connectivity P is defined as the chance that there is an actual synaptic connection between two randomly chosen neurons (**Figure 2A**)<sup>1</sup>. However, for example in the pruned network of **Figure 1C**, the anatomical connectivity P equals the fraction p<sub>1</sub> of potentiated synapses (before pruning) and, thus, conveys only little information about the true (full) connectivity within a cell assembly. Instead, it is more adequate to consider potential and effectual connectivity (**Figures 2B,C**).

Second, potential connectivity Ppot is defined as the chance that there is a potential synapse between two randomly chosen neurons, where a potential synapse is defined as a cortical location ij where pre- and postsynaptic fibers are close enough such that a synapse could potentially be generated or has already been generated (Stepanyants et al., 2002).

Third, effectual connectivity Peff, defined as the fraction of "required synapses" that have already been realized, is most interesting to judge the functional state of memories or cell assemblies during ongoing learning or consolidation with structural plasticity. Here we call the synapse ensemble Sij required for stable storage of a given memory set also the consolidation signal. If ij corresponds to an actual synapse, we may identify the case Sij > 0 with tagging synapse ij for consolidation (Frey and Morris, 1997). In case of simple binary network models such as the Willshaw or zip net models, the Sij simply equal the optimal synaptic weights in a fully connected network after storing the whole memory set (Equation 1). Intuitively, if a set of cell assemblies or memories has a certain effectual connectivity Peff, then retrieval performance will be as if these memories had been stored in a structurally static network with anatomical connectivity Peff, whereas true P in the structurally plastic network may be much lower than Peff.

<sup>1</sup>More precisely, this means the presence of at least one synapse connecting the first to the second neuron. This definition is motivated by simplifications employed by many theories for judging how many memories can be stored. These simplifications include, in particular, (1) point neurons neglecting dendritic compartments and non-linearities, and (2) ideal weight plasticity such that any desired synaptic strength can be realized. Then having two synapses with strength 1 would be equivalent to a single synapse with strength 2. The definition is further justified by experimental findings that the number of actual synapses per connection is narrowly distributed around small positive values (Fares and Stepanyants, 2009; Deger et al., 2012).

Last, target connectivity or consolidation load P<sub>1S</sub> is the fraction of neuron pairs ij that require a consolidated synapse as specified by Sij. This means that P<sub>1S</sub> is a measure of the learning load of a consolidation task.

Note that our definitions of Peff and P<sub>1S</sub> apply as well to network models with gradual synapses (Wij, Sij ∈ R). More generally, by means of the consolidation signal Sij, we can abstract from any particular network model or application domain. Our theory is therefore not restricted to models of associative memory, but may be applied as well to other connectionist domains, given that the "required" synapse ensembles {ij|Sij ≠ 0} and their weights can be defined properly by Sij. The following provides a minimal model to simulate the dynamics of effectual connectivity during consolidation.

### 2.3. Modeling and Efficient Simulation of Structural Plasticity

**Figure 3A** illustrates a minimal model of a "potential" synapse that can be used to simulate the dynamics of ongoing structural plasticity (Knoblauch, 2009a; Deger et al., 2012; Knoblauch et al., 2014). Here a potential synapse ij<sup>ν</sup> is the possible location of a real synapse connecting neuron i to neuron j, for example, a cortical location where axonal and dendritic branches of neurons i and j are close enough to allow the formation of a novel connection by spine growth and synaptogenesis (Krone et al., 1986; Stepanyants et al., 2002). Note that there may be multiple potential synapses per neuron pair, ν = 1, 2, . . .. The model assumes that a synapse can be either potential but not yet realized (state π), realized but still silent (state and weight 0), or realized and consolidated (state and weight 1).

FIGURE 2 | Illustration of different types of "connectivity" corresponding to actual (A), potential (B), and requested synapses (C). The requested synapses in (C) correspond to the synapse ensemble *S* required to store the memory patterns in Figure 1.

For real synapses, state transitions are modulated by the consolidation signal Sij specifying synapses to be potentiated and consolidated. Then structural plasticity means the transition processes between states π and 0 described by transition probabilities p<sub>g</sub> := pr[state(t + 1) = 0|state(t) = π] and p<sub>e|s</sub> := pr[state(t + 1) = π|state(t) = 0, Sij = s]. Similarly, weight plasticity means the transitions between states 0 and 1 described by probabilities p<sub>c|s</sub> := pr[state(t + 1) = 1|state(t) = 0, Sij = s] and p<sub>d|s</sub> := pr[state(t + 1) = 0|state(t) = 1, Sij = s]. For simplicity, we do not distinguish between long-term potentiation (LTP) and synaptic consolidation (or L-LTP), both corresponding to the transition from state 0 to 1. In accordance with the state diagram of **Figure 3A**, the evolution of synaptic states can then be described by probabilities p<sup>(s)</sup><sub>state</sub>(t) that a given potential synapse receiving Sij = s is in a certain state ∈ {π, 0, 1} at time step t = 0, 1, 2, ...,

$$p_1^{(s)}(t) = (1 - p_{d|s(t)})\,p_1^{(s)}(t-1) + p_{c|s(t)}\,p_0^{(s)}(t-1) \tag{8}$$

$$p_0^{(s)}(t) = (1 - p_{c|s(t)} - p_{e|s(t)})\,p_0^{(s)}(t-1) + p_{d|s(t)}\,p_1^{(s)}(t-1) + p_g\,p_\pi^{(s)}(t-1) \tag{9}$$

$$\begin{split} p_{\pi}^{(s)}(t) &= (1 - p_g)\,p_{\pi}^{(s)}(t-1) + p_{e|s(t)}\,p_0^{(s)}(t-1) \\ &= 1 - p_1^{(s)}(t) - p_0^{(s)}(t)\,, \end{split} \tag{10}$$

where the consolidation signal s(t) = Sij(t) may depend on time.
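The update of Equations (8–10) for model A can be written as a short Python function. This is a sketch under stated assumptions: the function name `step` and the transition probabilities (p<sub>g</sub> = 0.1, p<sub>e|s</sub> = 1 − s, p<sub>c|s</sub> = s, p<sub>d|s</sub> = 0) are illustrative choices, not parameter values taken from the simulations reported here.

```python
def step(p, pg, pe, pc, pd):
    """One time step of Equations (8)-(10); p = (p_pi, p_0, p_1)."""
    p_pi, p_0, p_1 = p
    new_1 = (1 - pd) * p_1 + pc * p_0                      # Equation (8)
    new_0 = (1 - pc - pe) * p_0 + pd * p_1 + pg * p_pi     # Equation (9)
    new_pi = (1 - pg) * p_pi + pe * p_0                    # Equation (10)
    return new_pi, new_0, new_1


# Assumed parameters for a tagged synapse (s = 1): silent synapses are
# consolidated immediately and consolidated synapses are never lost.
tagged = dict(pg=0.1, pe=0.0, pc=1.0, pd=0.0)
# Assumed parameters for an untagged synapse (s = 0): silent synapses are
# eliminated in the next step and never consolidated.
untagged = dict(pg=0.1, pe=1.0, pc=0.0, pd=0.0)

p_tagged = (1.0, 0.0, 0.0)     # all potential synapses initially unrealized
p_untagged = (1.0, 0.0, 0.0)
history = []
for t in range(50):
    p_tagged = step(p_tagged, **tagged)
    p_untagged = step(p_untagged, **untagged)
    history.append((p_tagged, p_untagged))
```

With these assumed parameters, tagged potential synapses accumulate in state 1, whereas untagged potential synapses keep cycling between states π and 0. In both cases the three state probabilities sum to one at every step, as required by the second line of Equation (10).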

The second model variant (**Figure 3B**) can be described in a similar way except that p<sub>d|s</sub> describes the transition from state 1 to state π. Model B is more convenient to analyze the spacing effect. We will see that, in relevant parameter ranges, both model variants behave qualitatively and quantitatively very similarly. However, in most simulations we have used model A.

Note that a binary synapse in the original Willshaw model (Equation 1, **Figures 1A,B**) is a special case of the described potential synapse (p<sub>g</sub> = p<sub>e|s</sub> = p<sub>d|s</sub> = 0, p<sub>c|s</sub> = s ∈ {0, 1}, Sij = Wij as in Equation 1). Then pruning following a (developmental) learning phase (**Figure 1C**) can be modeled by the same parameters except for setting p<sub>e|s</sub> > 0. Finally, adult learning with ongoing structural plasticity can be modeled by introducing a homeostatic constraint to keep P constant (cf., Equation 69 in Appendix B.1; cf., Knoblauch et al., 2014), such that in each step the numbers of generated and eliminated synapses are about the same. **Figure 4** illustrates such a simulation for p<sub>e|s</sub> = 1 − s and a fixed consolidation signal Sij corresponding to the same memories as in **Figure 1**. Here the unstable silent (state 0) synapses take part in synaptic turnover until they grow at a tagged location ij with Sij = 1 where they get consolidated (state 1) and escape further turnover. This process of increasing effectual connectivity (see Equation 70 in Appendix B.2) continues until all potential synapses with Sij = 1 have been realized and consolidated (**Figure 4**, t = 4) or synaptic turnover comes to an end once all silent synapses have been depleted.

Microscopic simulation of large networks of potential synapses can be expensive. We have therefore developed a method for efficient simulation of structural plasticity on a macroscopic level: Instead of the lower case probabilities (Equations 8–10) we consider additional memory-specific uppercase connectivity variables P<sup>(s)</sup><sub>state</sub> defined as the fractions of neuron pairs ij that receive a certain consolidation signal s(t) = Sij(t) and are in a certain state ∈ {∅, π, 0, 1} (where ∅ denotes neuron pairs without any potential synapses). In general,

$$P_1^{(s)}(t) = P_{\text{pot}}^{(s)} \sum_{\mathfrak{n}=1}^{\infty} \mathfrak{p}(\mathfrak{n}) \left( 1 - (1 - p_1^{(s)}(t))^{\mathfrak{n}} \right) \tag{11}$$

$$P_{\pi}^{(s)}(t) = P_{\text{pot}}^{(s)} \sum_{\mathfrak{n}=1}^{\infty} \mathfrak{p}(\mathfrak{n}) \left( p_{\pi}^{(s)}(t) \right)^{\mathfrak{n}} \tag{12}$$

$$P_0^{(s)}(t) = P_{\text{pot}}^{(s)} - P_1^{(s)}(t) - P_{\pi}^{(s)}(t) \tag{13}$$

where p<sub>1</sub><sup>(s)</sup> and p<sub>π</sub><sup>(s)</sup> are as in Equations (8, 10); P<sub>pot</sub><sup>(s)</sup> is the fraction of neuron pairs receiving s that have at least one potential synapse; and p(n) is the conditional distribution of potential synapse number n per neuron pair having at least one potential synapse. Thus, we define a pre-/postsynaptic neuron pair ij to be in state 1 iff it has at least one state-1 synapse; in state 0 iff it does not have a state-1 synapse but at least one state-0 synapse; and in state π if it is neither in state 1 nor state 0 but has at least one potential synapse. See Fares and Stepanyants (2009) for neuroanatomical estimates of p(n) in various cortical areas.

Summing over s we obtain further connectivity variables P<sub>1</sub>, P<sub>0</sub>, P<sub>π</sub> from which we can finally determine the familiar network connectivities defined in the previous section,

$$P_{\text{state}}(t) = \sum_{s} P_{\text{state}}^{(s)}(t) \quad \text{for state} \in \{ \emptyset, \pi, 0, 1 \} \tag{14}$$

$$P(t) = P_0(t) + P_1(t) \tag{15}$$

$$P_{\text{pot}}(t) = P_{\pi}(t) + P_0(t) + P_1(t) \tag{16}$$

$$P_{1S} = \sum_{s \neq 0} \sum_{\text{state} \in \{\emptyset, \pi, 0, 1\}} P_{\text{state}}^{(s)}(t) \tag{17}$$

$$P_{\text{eff}}(t) = \frac{\sum_{s \neq 0} P_1^{(s)}(t)}{P_{1S}} \,. \tag{18}$$

In general, the consolidation signal s = s(t) = Sij(t) will not be constant but may be a time-varying signal (e.g., if different memory sets are consolidated at different times). To efficiently simulate a large network of many potential synapses, we can partition the set of potential synapses in groups that receive the same signal s(t). For each group we can calculate the temporal evolution of state probabilities p<sup>(s)</sup><sub>π</sub>(t), p<sup>(s)</sup><sub>0</sub>(t), p<sup>(s)</sup><sub>1</sub>(t) of individual synapses from Equations (8–10). From this we can then compute from Equations (11–13) the group-specific macroscopic connectivity variables P<sup>(s)</sup><sub>π</sub>(t), P<sup>(s)</sup><sub>0</sub>(t), P<sup>(s)</sup><sub>1</sub>(t), and finally from Equations (14–18) the temporal evolution of the various network connectivities Pπ(t), P0(t), P1(t), P(t) as well as effectual connectivity Peff(t) for certain memory sets. For such an approach the computational cost of simulating structural plasticity scales only with the number of different groups corresponding to different consolidation signals s(t) (instead of the number of potential synapses as for the microscopic simulations).
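The macroscopic step from synapse-level probabilities to network connectivities (Equations 11–18) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the dictionary layout, and the per-group field `frac` (the fraction of all neuron pairs receiving signal s) are assumptions introduced here. In particular, the sketch assumes that the sum over all states in Equation (17), including ∅, equals `frac` for each group. The synapse-level inputs `p1` and `p_pi` are taken as given, since Equations (8–10) are not reproduced in this section.

```python
from typing import Dict, Mapping


def group_pair_states(p1: float, p_pi: float, P_pot: float,
                      p_n: Mapping[int, float]) -> Dict[str, float]:
    """Equations (11)-(13): pair-level fractions for one signal group s.

    p_n maps a synapse number n >= 1 to its probability (hypothetical layout).
    """
    P1 = P_pot * sum(w * (1.0 - (1.0 - p1) ** n) for n, w in p_n.items())
    P_pi = P_pot * sum(w * p_pi ** n for n, w in p_n.items())
    P0 = P_pot - P1 - P_pi
    return {"pi": P_pi, "0": P0, "1": P1}


def network_connectivity(groups: Mapping[int, Mapping[str, float]],
                         p_n: Mapping[int, float]) -> Dict[str, float]:
    """Equations (14)-(18): sum the group variables over all signals s."""
    totals = {"pi": 0.0, "0": 0.0, "1": 0.0}
    P_1S = 0.0        # Eq. 17: all pairs (any state) with signal s != 0
    P1_tagged = 0.0   # numerator of Eq. 18
    for s, g in groups.items():
        states = group_pair_states(g["p1"], g["p_pi"], g["P_pot"], p_n)
        for name in totals:
            totals[name] += states[name]
        if s != 0:
            # Assumption: the sum over states (including "no potential
            # synapse") is the fraction of neuron pairs in this group.
            P_1S += g["frac"]
            P1_tagged += states["1"]
    return {
        "P_pi": totals["pi"], "P_0": totals["0"], "P_1": totals["1"],
        "P": totals["0"] + totals["1"],                     # Eq. 15
        "P_pot": totals["pi"] + totals["0"] + totals["1"],  # Eq. 16
        "P_1S": P_1S,
        "P_eff": P1_tagged / P_1S if P_1S > 0 else float("nan"),  # Eq. 18
    }


# Illustrative numbers only: one unconsolidated group (s = 0) and one
# consolidated group (s = 1), at most one potential synapse per pair.
groups = {
    0: {"frac": 0.95, "P_pot": 0.95, "p1": 0.0, "p_pi": 0.9},
    1: {"frac": 0.05, "P_pot": 0.05, "p1": 0.4, "p_pi": 0.5},
}
result = network_connectivity(groups, p_n={1: 1.0})
```

Because the loop runs over signal groups rather than over individual potential synapses, the cost of one update grows with the number of distinct signals s(t), which is the scaling argument made in the text.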

Note that *P*eff increases with time from the anatomical level *P*eff = 9/22 ≈ *P* at *t* = 1 toward the level of potential connectivity with *P*eff = 15/22 ≈ *P*pot at *t* = 4. Correspondingly, output noise ε̂ decreases with increasing *P*eff. At each time firing threshold Θ is chosen maximally to activate at least *l* = 3 neurons corresponding to the mean cell assembly size in the output population.

Moreover, this approach is the basis for further simplifications and the analysis of cognitive phenomena like the spacing effect described in Appendix B. For example, for simplicity, the following simulations and analyses assume that each neuron pair ij can have at most a single potential synapse [i.e., p(1) = 1]. In previous works we have simulated also a model variant allowing multiple synapses per neuron pair, where we observed very similar results as for single synapses (Knoblauch et al., 2014). As synapse number per connected neuron pair has sometimes been reported to be narrowly distributed around a small number (e.g., n = 4; cf., Fares and Stepanyants, 2009), one may also identify each single synapse in our model with a group of about 4 real cortical synapses (see Section 4).

This assumption is actually justified by evidence that n is narrowly distributed around a small number, e.g., n = 4 (Fares and Stepanyants, 2009). This means that two neurons are either unconnected or connected by a group of about four synapses (which is actually a very surprising finding as it is unclear how the neurons can regulate n; cf., Deger et al., 2012). This situation is consistent with our modeling assumption p(1) = 1 if we identify each model synapse with such a group of about 4 real synapses.

### 3. RESULTS

### 3.1. Information Storage Capacity, Effectual Connectivity and its Relation to Functional Measures of Brain Connectivity

FIGURE 5 | Relation between effectual connectivity *P*eff, information storage capacity *C*, and output noise ε̂. (A) Processing model for computing storage capacity *C* := *C*abs/*Pmn* for *M* given memory associations between input patterns *u*<sup>µ</sup> and output patterns *v*<sup>µ</sup> stored in the synaptic weights (Equation 1; *p* := pr[*u*<sup>µ</sup> = 1], *q* := pr[*v*<sup>µ</sup> = 1]; *k* and *l* are mean cell assembly sizes in neuron populations *u* and *v*). During retrieval noisy address inputs *ũ*<sup>µ</sup> with component errors *p*<sub>ab</sub> := pr[*ũ*<sup>µ</sup><sub>i</sub> = *b*|*u*<sup>µ</sup><sub>i</sub> = *a*] and input noise ε̃ := *p*10 + (1/*q* − 1)*p*01 are propagated through the network (Equation 2) yielding output patterns *v̂*<sup>µ</sup> with component errors *q*<sub>ab</sub> := pr[*v̂*<sup>µ</sup><sub>j</sub> = *b*|*v*<sup>µ</sup><sub>j</sub> = *a*] and output noise ε̂ = *q*10 + (1/*q* − 1)*q*01. The retrieved information is then the transinformation between *v*<sup>µ</sup> and *v̂*<sup>µ</sup>. To simplify analysis, we assume independent transmission of individual (i.i.d.) memory bits *v*<sup>µ</sup><sub>j</sub> over a binary channel with transmission errors *q*01, *q*10. (B) Information storage capacity *C*(*P*eff) (blue curve), and output noise ε̂(*P*eff) (red curve) as functions of effectual connectivity *P*eff for a structurally plastic Willshaw network (similar to Figure 4) of *m* = *n* = 100,000 neurons storing *M* = 10<sup>6</sup> cell assemblies of sizes *k* = *l* = 50 and having anatomical connectivity *P* = 0.1 assuming zero input noise (ε̃ = 0). Data have been computed similar to Equation (37) using Equations (44–46) for 0 ≤ *P*eff ≤ *P*/*p*1.

For an information-theoretic evaluation, associative memories are typically viewed as memory channels that transmit the original content patterns v<sup>µ</sup> and retrieve corresponding retrieval output pattern v̂<sup>µ</sup> (see **Figure 5A**). Thus, the absolute amount of transmitted or stored information Cabs of all M memories equals the transinformation or mutual information (Shannon and Weaver, 1949; Cover and Thomas, 1991)

$$C\_{\rm abs} := T(\hat{V}; V) := \sum p(\hat{V}, V) \log\_2 \frac{p(\hat{V}, V)}{p(\hat{V}) \cdot p(V)} \tag{19}$$

where V := (v<sup>1</sup>, v<sup>2</sup>, …, v<sup>M</sup>) and V̂ := (v̂<sup>1</sup>, v̂<sup>2</sup>, …, v̂<sup>M</sup>) correspond to the sets of original and retrieved content patterns, and p(.) to their probability distributions. If all M memories and n neurons have independent and identically distributed (i.i.d.) activities (e.g., same fraction q of active units per pattern and component transmission error probabilities q01, q10), we can approximate this memory channel by a simple binary channel transmitting M · n memory bits v<sup>µ</sup><sub>j</sub> ↦ v̂<sup>µ</sup><sub>j</sub> as assumed in Appendix A. Then

$$C\_{\rm abs} \approx M \cdot T(\hat{v}^{\mu}; v^{\mu}) \approx M \cdot n \cdot T(q, q\_{01}, q\_{10}) \tag{20}$$

where T(v̂<sup>µ</sup>; v<sup>µ</sup>) is the transinformation for single memory patterns and T(q, q01, q10) is the transinformation of a single bit (see Equation 38). From this we obtain the normalized information storage capacity C per synapse after dividing Cabs by the number of synapses Pmn (similar to Equation 37).
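As a worked illustration of Equation (20), the following Python sketch evaluates the single-bit transinformation of a binary channel and the resulting capacity per synapse. Equation (38) is not reproduced in this section, so the sketch assumes it has the standard form T = I(output) − I(output | input) for a binary channel with add-error probability q01 and miss-error probability q10. The function names are hypothetical, and the parameter values are borrowed from the setting described for Figure 5B purely as an example.

```python
import math


def binary_entropy(x: float) -> float:
    """Shannon information (in bits) of a binary variable with pr[1] = x."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def transinformation(q: float, q01: float, q10: float) -> float:
    """Mutual information (bits) between one stored and one retrieved bit.

    Assumed standard binary-channel form: output entropy minus the
    conditional entropy of the output given the stored bit.
    """
    q_out = q * (1.0 - q10) + (1.0 - q) * q01   # pr[retrieved bit = 1]
    noise = q * binary_entropy(q10) + (1.0 - q) * binary_entropy(q01)
    return binary_entropy(q_out) - noise


def capacity_per_synapse(M: int, m: int, n: int, P: float,
                         q: float, q01: float, q10: float) -> float:
    """C = C_abs / (P m n) with C_abs ~ M n T(q, q01, q10) (Equation 20)."""
    c_abs = M * n * transinformation(q, q01, q10)
    return c_abs / (P * m * n)


# Example: m = n = 100,000 neurons, M = 10^6 memories, k = l = 50 active
# units (q = l / n), anatomical connectivity P = 0.1, error-free retrieval.
n_neurons = 100_000
c_noise_free = capacity_per_synapse(M=10**6, m=n_neurons, n=n_neurons, P=0.1,
                                    q=50 / n_neurons, q01=0.0, q10=0.0)
c_noisy = capacity_per_synapse(M=10**6, m=n_neurons, n=n_neurons, P=0.1,
                               q=50 / n_neurons, q01=1e-4, q10=0.0)
```

With these inputs the noise-free value is an upper limit for the given M and q; any add or miss errors reduce the transinformation and hence the capacity per synapse.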

In our first experiment we have investigated the relation between information storage capacity and effectual connectivity Peff during ongoing structural plasticity. For this we have assumed a larger network of size m = n = 100,000 with anatomical connectivity P = 0.1 and larger cell assemblies with sizes k = l = 50, but otherwise a similar setting as for the toy example illustrated by **Figure 4**. **Figure 5B** shows output noise ε̂ and normalized capacity C as functions of effectual connectivity Peff for a given number of M = 10<sup>6</sup> random memories. Interestingly, both ε̂ and C turn out to be monotonic functions of Peff because output errors decrease with increasing Peff (see Equations 45, 46). Therefore, output noise ε̂(Peff) also decreases with increasing Peff whereas, correspondingly, stored information per synapse C(Peff) increases with Peff. Because monotonic functions are invertible, we can thus conclude that effectual connectivity Peff is an equivalent measure of information storage capacity or the transinformation (= mutual information) between the activity patterns of two neuron populations u and v. As can be seen from our data, C(Peff) tends to be even linear over a large range, C ∼ Peff, until saturation occurs as ε̂ → 0, corresponding to high-fidelity retrieval outputs.

Next, based on this equivalence between Peff and C, we work out the close relationship between Peff and commonly used functional measures of brain connectivity. Recall that we have introduced "effectual connectivity" as a measure of memory-related synaptic connectivity (**Figure 2C**) that shares with other definitions of connectivity (such as anatomical and potential connectivity) the idea that any "connectivity" measure should correspond to the chance of finding a connection element (such as an actual or potential synapse) between two cells. By contrast, in brain imaging and connectome analysis (Friston, 1994; Sporns, 2007) the term "connectivity" has a more heterogeneous meaning ranging from patterns of synaptic connections (anatomical connectivity) and correlations between neural activity (functional connectivity) to causal interactions between brain areas. The latter is also referred to as "effective connectivity" although usually measured in information theoretic terms (bits) such as delayed mutual information or transfer entropy (Schreiber, 2000). For example, in the simplest case the transfer entropy between activities u(t) and v(t) measured in two brain areas u and v is defined as

$$T\_{u \rightarrow v} := \sum p(v(t+1), u(t), v(t)) \log\_2 \frac{p(v(t+1)|u(t), v(t))}{p(v(t+1)|v(t))} \tag{21}$$

where p(.) denotes the distribution of activity patterns (see Equation 4 in Schreiber, 2000)<sup>2</sup>. Such ideas of effective connectivity come from the desire to extract directions of information flow between two brain areas from measured neural activity, contrasting with (symmetric) correlation measures that can neither detect processing directions nor distinguish between causal interactions and correlated activity due to a common cause.
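The definition in Equation (21) can be illustrated with a simple plug-in estimator that replaces the probabilities by relative frequencies counted from two discrete time series. This is only a sketch for the simplest case of history length one. The function name `transfer_entropy` is hypothetical, and the estimator is the naive frequency-based one, which is biased for short recordings and is not claimed to be the procedure behind the simulated data in this article.

```python
import math
import random
from collections import Counter
from typing import Hashable, Sequence


def transfer_entropy(u: Sequence[Hashable], v: Sequence[Hashable]) -> float:
    """Plug-in estimate (bits) of T_{u->v} from Equation (21).

    Uses single-step histories: v(t+1) is predicted from u(t) and v(t).
    """
    triples = Counter(zip(v[1:], u[:-1], v[:-1]))   # (v(t+1), u(t), v(t))
    total = sum(triples.values())
    count_uv = Counter()    # counts of (u(t), v(t))
    count_vv = Counter()    # counts of (v(t+1), v(t))
    count_v = Counter()     # counts of v(t)
    for (v_next, u_now, v_now), c in triples.items():
        count_uv[(u_now, v_now)] += c
        count_vv[(v_next, v_now)] += c
        count_v[v_now] += c
    te = 0.0
    for (v_next, u_now, v_now), c in triples.items():
        p_given_both = c / count_uv[(u_now, v_now)]               # p(v(t+1)|u(t),v(t))
        p_given_self = count_vv[(v_next, v_now)] / count_v[v_now] # p(v(t+1)|v(t))
        te += (c / total) * math.log2(p_given_both / p_given_self)
    return te


# Toy example: v copies u with a delay of one step, so information flows
# from u to v but not from v to u.
rng = random.Random(0)
u_series = [rng.randint(0, 1) for _ in range(20_000)]
v_series = [0] + u_series[:-1]
te_forward = transfer_entropy(u_series, v_series)
te_backward = transfer_entropy(v_series, u_series)
```

In the toy example the forward estimate is close to one bit per step and the backward estimate is close to zero, which is the asymmetry that distinguishes transfer entropy from symmetric correlation measures.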

To see the relation between these functional measures of "effective connectivity" and Peff, first, note that transfer entropy equals the well-known conditional transinformation or conditional mutual information between v(t + 1) and u(t) given v(t) (Dobrushin, 1959; Wyner, 1978),

$$T(v(t+1); u(t)|v(t)) := \sum p(v(t+1), u(t), v(t)) \log\_2 \frac{p(v(t+1), u(t)|v(t))}{p(v(t+1)|v(t)) \cdot p(u(t)|v(t))} \tag{22}$$

$$= \sum p(v(t+1), u(t), v(t)) \log\_2 \frac{p(v(t+1)|u(t), v(t)) \cdot p(u(t)|v(t))}{p(v(t+1)|v(t)) \cdot p(u(t)|v(t))} = T\_{u \rightarrow v} \tag{23}$$

Second, we may apply this to one-step retrieval in an associative memory (Equation 2). Then u(t) = ũ<sup>µ</sup> is a noisy input, and the update v(t + 1) = F(u(t)) = v̂<sup>µ</sup> produces the corresponding output pattern, where the mapping F corresponds to activity propagation through the associative network. As here the update does not depend on the old state v(t), we may approximate transfer entropy by the regular transinformation or mutual information

$$T\_{u \rightarrow v} = T(v(t+1);\,u(t)|v(t)) \approx T(F(u(t));\,u(t)) \tag{24}$$

$$=I(u(t)) - I(u(t)|F(u(t))) \tag{25}$$

$$=I(F(u(t))) - I(F(u(t))|u(t)) \tag{26}$$

where I(X) := −Σ<sub>x</sub> p(x) log p(x) is the Shannon information of a random variable X, and I(X|Y) := −Σ<sub>x,y</sub> p(x, y) log p(x|y) the conditional information of X given Y (Shannon and Weaver, 1949; Cover and Thomas, 1991). Thus, up to normalization, transfer entropy T<sub>u→v</sub> ≈ T(F(u(t)); u(t)) = T(v̂<sup>µ</sup>; ũ<sup>µ</sup>) has a very similar form as storage capacity Cabs in Equation (20). If F(u) is deterministic, the second term in Equation (26) vanishes and transfer entropy equals the output information I(F(u(t))) ≤ I(u(t)). If F(u) is also invertible, the second term in Equation (25) would vanish and T<sub>u→v</sub> = I(u(t)) = I(F(u(t))) = Cabs/M. However, in the associative memory application many input patterns are (ideally) mapped to one memory and F(u) is noninvertible and thus T<sub>u→v</sub> = I(F(u(t))) < I(u(t)). Moreover, in more realistic cortex models F is also nondeterministic as v(t + 1) will depend not only on activity u(t) from a single input area, but also on inputs from further cortical and subcortical areas as well as on numerous additional noise sources. Thus, in fact it will be T<sub>u→v</sub> < I(F(u(t))).

<sup>2</sup> The general case considers delay vectors (u(t), u(t − 1), …, u(t − K + 1)) and (v(t), v(t − 1), …, v(t − L + 1)) instead of u(t) and v(t).

Third, we can compare T<sub>u→v</sub> to information storage capacity (Equation 20) by normalizing to single memory patterns,

$$\frac{C\_{\text{abs}}}{M} = \frac{CPmn}{M} = T(\hat{v}^{\mu}; v^{\mu}) = T(F(u(t)); v^{\mu(u(t))}) \tag{27}$$

$$=I(F(u(t))) - I(F(u(t))|v^{\mu(u(t))}) \tag{28}$$

where µ(u(t)) is a function determining the memory index of the input pattern u<sup>µ(u(t))</sup> best matching the current input ũ = u(t). Thus, comparing Equation (26) to Equation (28) yields generally

$$T\_{u \rightarrow v} - \frac{C\_{\text{abs}}}{M} = I(F(u(t)) | v^{\mu(u(t))}) - I(F(u(t)) | u(t)) \geq 0 \,, \tag{29}$$

where the bound is true as v<sup>µ(u(t))</sup> is a deterministic function of u(t). In particular, for deterministic F, transfer entropy T<sub>u→v</sub> = Cabs/M + I(F(u(t))|v<sup>µ(u(t))</sup>) typically exceeds normalized capacity Cabs/M, whereas equality follows for I(F(u(t))|v<sup>µ(u(t))</sup>) = I(F(u(t))|u(t)), for example, error-free retrieval with F(u(t)) = v<sup>µ(u(t))</sup>. Appendix A.4 shows that equality holds generally as well for nondeterministic propagation of activity (e.g., Equation 2 with N<sub>j</sub> ≠ 0) if we assume that component retrieval errors occur independently with probabilities q01 := pr[v̂<sup>µ</sup><sub>j</sub> = 1|v<sup>µ</sup><sub>j</sub> = 0] ≈ pr[v̂<sup>µ</sup><sub>j</sub> = 1|v<sup>µ</sup><sub>j</sub> = 0, ũ] = pr[v̂<sup>µ</sup><sub>j</sub> = 1|ũ] and q10 := pr[v̂<sup>µ</sup><sub>j</sub> = 0|v<sup>µ</sup><sub>j</sub> = 1] ≈ pr[v̂<sup>µ</sup><sub>j</sub> = 0|v<sup>µ</sup><sub>j</sub> = 1, ũ] = pr[v̂<sup>µ</sup><sub>j</sub> = 0|ũ] corresponding to the same (nondeterministic, i.i.d.) processing model as we have presumed in our capacity analysis (**Figure 5A**; see also Appendix A, Equations 42–43 or Equations 45–46 for Willshaw networks). Then normalizing transfer entropy TE and information capacity CN per output unit yields (see Equations 53, 38)

$$\text{TE} := \frac{T\_{u \to v}}{n} \stackrel{>}{\approx} T(q, q\_{01}, q\_{10}) \approx \text{CN} := \frac{\text{C}\_{\text{abs}}}{M n} = \frac{\text{CPm}}{M} \,. \tag{30}$$

Thus, "effective connectivity" as measured by transfer entropy becomes (up to normalization) equivalent to the information storage capacity C of associative networks (see Equation 37 with Equation 38).

FIGURE 6 | Transfer entropy, output entropy and information capacity. (A) Normalized transfer entropy (TE := *T*<sub>u→v</sub>/*n*) is bounded by normalized information storage capacity (solid; CN := *CPm*/*M* ≤ TE; see Equation 30 with Equation 38) and output entropy (dashed; OE := *I*(*v̂*<sup>µ</sup><sub>j</sub>) ≥ TE), where TE = OE for deterministic retrieval and TE = CN for non-deterministic retrieval with independent output noise (see text for details). The curves show TE, CN, OE as functions of output noise ε̂ = (1 − *q*)*q*01 assuming only add noise *q*01 = pr[*v̂*<sub>j</sub> = 1|*v*<sub>j</sub> = 0] but no miss noise *q*10 = pr[*v̂*<sub>j</sub> = 0|*v*<sub>j</sub> = 1] = 0 (e.g., as is the case for optimal "pattern part" retrieval; see Equation 46 in Appendix A.2). Different curves correspond to different fractions *q* of active units in a memory pattern (thick, medium, and thin lines correspond to *q* = 0.5, *q* = 0.1, and *q* = 0.01, respectively). (B) Contour plot of CN = min TE as function of output noise ε̂ and activity parameter *q* for *q*10 = 0. (C) Contour plot of OE = max TE as function of output noise ε̂ and activity parameter *q* for *q*10 = 0. (D) TE (thick solid) and CN (thin dashed) as functions of ε̂ for simulated retrieval (zero input noise ε̃ = 0) in Willshaw networks of size *n* = 10,000 storing *M* = 1000 cell assemblies of size *k* = 100 (*q* = 0.01) and increasing *P*eff from 0 to 1 (markers correspond to *P*eff = 0.001, 0.01, 0.1, 0.15, 0.2, …, 0.95, 1). Each data point corresponds to averaging over 10 networks each performing 10,000 retrievals of 100 memories (see Equations 51, 52). Different curves correspond to different levels of intrinsic noise N<sub>j</sub> in output neurons *v*<sub>j</sub> (see Equation 2; N<sub>j</sub> uniformly distributed in [0; Nmax] for Nmax = 0, 1, 10, 100 as indicated by black, blue, green, red lines). Note that, already for low noise levels, retrieval is non-deterministic such that TE becomes monotonic decreasing in ε̂ and, thus, similar or even equivalent to CN (and effectual connectivity *P*eff; see Figure 5B and Equation 49; cf. Figures 7, 8).

**Figure 6** shows upper bounds TE ≤ OE := I(v̂<sup>µ</sup><sub>j</sub>) and lower bounds TE ≥ CN of transfer entropy as functions of output noise level ε̂ = q·q10 + (1 − q)q01 for different activities q of output patterns (cf., Equations 26, 29, 30). For low output noise (ε̂ → 0) both T<sub>u→v</sub> and C approach the full information content of the stored memory set. In general both TE and CN are monotonic functions of ε̂ for relevant (sufficiently low) noise levels ε̂. While TE increases with ε̂ for deterministic retrieval (N<sub>j</sub> = 0; cf. Equation 2), TE becomes a decreasing function of ε̂ already for low levels of intrinsic noise (N<sub>j</sub> on the order of single synaptic inputs; see panel D). Similar decreases are obtained even without intrinsic noise, N<sub>j</sub> = 0, if the target assembly v<sup>µ</sup> receives (noisy) synaptic inputs from multiple cortical populations (data not shown; cf., Braitenberg and Schüz, 1991).

Our results thus show that, at least for realistic intrinsic noise and/or inter-columnar synaptic connectivity, transfer entropy T<sub>u→v</sub> becomes equivalent to information capacity C. Because of the monotonic (or even linear) dependence of C on Peff (see **Figure 5B** and Equation 49; cf. **Figures 7**, **8**), transfer entropy is equivalent also to effectual connectivity Peff. Thus, we may interpret effectual connectivity Peff as an essentially equivalent measure of "effective connectivity" as previously defined for functional brain imaging. Still, due to its anatomical definition, Peff can only measure a potential causal interaction. For example, if both the synaptic connections from brain area u to v and the reverse connections from v to u have high Peff, we will not be able to infer the direction of information flow in a certain memory task unless we measure the actual neural activity.

### 3.2. Storage Capacity of a Macrocolumnar Cortical Network

FIGURE 7 | Exact storage capacities for a finite Willshaw network having the size of a cortical macrocolumn (*n* = 10<sup>5</sup>). (A) Contour plot of pattern capacity *M*<sub>ε</sub> (maximal number of stored memories or cell assemblies) as a function of assembly size *k* (number of active units in a memory pattern) and effectual network connectivity *P*eff assuming output noise level ε = 0.01 and no input noise (*ũ* = *u*<sup>µ</sup>). (B) Weight capacity *C*<sup>wp</sup><sub>ε</sub> (in bit/synapse) corresponding to maximal *M*<sub>ε</sub> in (A) for networks without structural plasticity. (C) Total storage capacity *C*<sup>tot</sup><sub>ε</sub> (in bit/non-silent synapse) corresponding to maximal *M*<sub>ε</sub> in (A) for networks with structural plasticity. Note that *C*<sup>tot</sup> may increase even further if less than the maximum *M*<sub>ε</sub> memories are stored (see text for details). (D) Minimal anatomical connectivity *P*1 = *p*1*P*eff ≤ *P* required to achieve the data in (A–C). Data computed as described in Appendix A.1. Red and blue dashed lines correspond to plausible values of *P*eff for networks with and without structural plasticity (assuming *P* = 0.1, *P*pot = 0.5). Note that only the area below the magenta dashed line (*P*1 = 0.1) is consistent with *P* = 0.1. Our exact data is in good agreement with earlier approximative data (Knoblauch et al., 2014, Figure 5) unless *k* is very small (e.g., *k* < 50).

*P* = *P*1 = 0.1 as a function of cell assembly size *k* and potential network connectivity *P*pot (which is here an upper bound on the achievable effectual connectivity *P*eff). (B) Total storage capacity *C*<sup>tot</sup><sub>ε</sub> for zip nets including structural plasticity for the setting of (A). (C) Contour plot of the pattern capacity *M*<sub>ε</sub> of an optimal Bayesian associative network (Knoblauch, 2011) without structural plasticity as a function of cell assembly size *k* and anatomical network connectivity *P*. (D) Weight capacity *C*<sup>wp</sup><sub>ε</sub> for the Bayesian net for the setting of (C). Other parameters are as assumed for Figure 7 (ε = 0.01, *ũ* = *u*<sup>µ</sup>). Data computed as described in Appendix A.3. Red and blue dashed lines correspond to plausible values for *P*pot and *P*, respectively.

A typical cortical macrocolumn comprises on the order of n = 10<sup>5</sup> neurons below about 1 mm<sup>2</sup> cortex surface, where the anatomical connectivity is about P = 0.1 and the potential connectivity about Ppot = 0.5 corresponding to a filling fraction f := P/Ppot = 0.2 (Braitenberg and Schüz, 1991; Hellwig, 2000; Stepanyants et al., 2002). Sizes of cell assemblies have been estimated to be somewhere between 50 and 500 in entorhinal cortex (Waydo et al., 2006). Given these data we can try to estimate the number M of local cell assemblies or memories that can be stored in a macrocolumn (Sommer, 2000). In a previous work (Knoblauch et al., 2014, **Figure 5**) we have estimated the storage capacity for the Willshaw model (**Figures 1**, **4**) by approximating dendritic potential distributions by Gaussians. However, this approximation can be off as, in particular for sparse activity, dendritic potentials can strongly deviate from Gaussians. We have therefore developed a method to compute the exact storage capacity for the Willshaw model storing random memories (see Appendix A). **Figure 7** shows corresponding contour plots of pattern capacity M<sub>ε</sub>, weight capacity C<sup>wp</sup><sub>ε</sub>, total synaptic capacity C<sup>tot</sup><sub>ε</sub>, and the required minimal anatomical connectivity P1 (assuming that all silent synapses have been pruned in the end). We can make several observations: First, the exact results can significantly deviate from the approximations (cf., Knoblauch et al., 2014, **Figure 5**). In particular, for extremely sparse activity (k < 10) the Gaussian assumption seems violated and the true capacities are significantly lower than estimated previously. Still, for larger, more realistic 50 < k < 500 the new data is in good agreement with the previous Gaussian estimates, and for even larger k > 500 the true capacities even slightly exceed the previous estimates. Second, the previous conclusions, therefore, largely hold: Without structural plasticity (Peff = P = 0.1) the storage capacity would be generally very low and only a small number of memories could be stored. For very sparse k ≈ 50 not even a single memory could be stored and thus, the cell assembly hypothesis would be inconsistent with experimental estimates of k. Third, by contrast, networks including structural plasticity increasing Peff from P = 0.1 to Ppot ≈ 0.5 can store many more memories: For example, for k = 50, the pattern capacity increases from M ≈ 0 to about M ≈ 800,000. For k = 500, there is still an increase from M ≈ 13,000 to M ≈ 45,000. Fourth, correspondingly, networks without structural plasticity would have only a very small weight capacity C<sup>wp</sup>: For example, at Peff = P = 0.1 it is C<sup>wp</sup> ≈ 0 bps for k ≤ 50 and still C<sup>wp</sup> < 0.07 bps for k = 500. Fifth, by contrast, networks with structural plasticity have a much higher total synaptic capacity C<sup>tot</sup>, i.e., they can store much more information per actual synapse and are therefore also much more energy-efficient, in particular for sparse activity: Although the very high values C<sup>tot</sup> → log n are approached only for unrealistically low k and high Peff, they can still store C<sup>tot</sup> ≈ 0.5 bps for realistic Peff = 0.5 and k = 50. This high value appears to decrease, however, to only C<sup>tot</sup> ≈ 0.06 bps for k = 500 which would suggest that, for relatively large cell assemblies with k = 500, a network without structural plasticity (at P = 0.1) would be more efficient than a network with structural plasticity (at Peff = 0.5). However, as the Willshaw model is known to be sub-optimal for relatively large k ≫ log n, we will re-discuss this issue below for a more general network model. Sixth, another weakness of the Willshaw model is that the fraction p1 := 1 − (1 − k<sup>2</sup>/n<sup>2</sup>)<sup>M</sup> of 1-synapses is coupled both to cell assembly size k and number of stored memories M (due to the fixed synaptic threshold θ = 1, cf., Equation 1). Therefore, the residual (minimal) anatomical connectivity of a pruned network P1 = p1Peff depends also on k and M, and we can obtain P1 ≈ P = 0.1 consistent with physiology only in a limited range of the k-Peff-planes of **Figure 7**. At least, physiological k ≈ 50 and Peff ≈ 0.5 match physiological P1 = 0.1, whereas larger k ≫ 50 would require P1 being larger than the anatomical connectivity P = 0.1. As many cortical areas comprise significant fractions P0 > 0 of silent synapses we may as well allow for smaller P1 < P = 0.1 satisfying P0 + P1 = P (where C<sup>tot</sup> would become a measure only of energy efficiency, but no longer of space efficiency), but the very high values of C<sup>tot</sup> ≫ 1 can generally be reached only for tiny fractions of 1-synapses.
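The coupling between the fraction p1 of 1-synapses, the assembly size k, and the number of memories M in the Willshaw model can be checked numerically. The sketch below evaluates p1 = 1 − (1 − k²/n²)^M and the residual connectivity P1 = p1·Peff for the two example settings quoted in the text (n = 10⁵, Peff = 0.5, with M ≈ 800,000 at k = 50 and M ≈ 45,000 at k = 500). The function names are hypothetical, and the use of `log1p`/`expm1` is only a numerical convenience for the very small per-memory probability k²/n².

```python
import math


def willshaw_p1(k: int, n: int, M: int) -> float:
    """Fraction of 1-synapses, p1 = 1 - (1 - k^2 / n^2)^M, for k = l, m = n."""
    per_memory = (k / n) ** 2              # chance a given pair is co-active
    return -math.expm1(M * math.log1p(-per_memory))


def residual_connectivity(k: int, n: int, M: int, P_eff: float) -> float:
    """Minimal anatomical connectivity of a pruned network, P1 = p1 * P_eff."""
    return willshaw_p1(k, n, M) * P_eff


n = 10**5
P1_small_assemblies = residual_connectivity(k=50, n=n, M=800_000, P_eff=0.5)
P1_large_assemblies = residual_connectivity(k=500, n=n, M=45_000, P_eff=0.5)
print(f"k = 50:  P1 = {P1_small_assemblies:.3f}")
print(f"k = 500: P1 = {P1_large_assemblies:.3f}")
```

For k = 50 the result stays close to the anatomical connectivity P = 0.1, whereas for k = 500 it clearly exceeds 0.1, in line with the statement that larger assemblies would require more 1-synapses than the anatomy provides.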

In order to overcome some weaknesses of the Willshaw model we have recently proposed a novel network model (so-called binary "zip nets") where the fraction p1 of potentiated 1-synapses is no longer coupled to cell assembly size k and number M (Knoblauch, 2009b, 2010b, 2016). Instead, the model assumes that synaptic thresholds θij (see Equation 1) are under homeostatic control to maintain a constant fraction p1 (or P1) of potentiated 1-synapses. We have shown for the limit Mpq → ∞ that this model can reach for p1 = 0.5 up to a "zip" factor ζ ≈ 0.64 almost the same high storage capacities M<sub>ε</sub> and C<sup>wp</sup><sub>ε</sub> as the optimal Bayesian neural network (Kononenko, 1989; Lansner and Ekeberg, 1989; Knoblauch, 2011), although requiring only binary synapses. Moreover, if compressed by structural plasticity, zip nets can also reach C<sup>tot</sup><sub>ε</sub> → log n for p1 → 0, similar to the Willshaw model. As the Willshaw model is optimal only for extremely sparse activity (k ≤ log n) it is thus interesting to evaluate the performance gain of structural plasticity for physiological k using the zip net instead of the Willshaw model. **Figure 8** shows data from evaluating storage capacity of a cortical macrocolumn of size n = 10<sup>5</sup> both for the zip net model (upper panels) and the Bayesian model (lower panels), the latter being a benchmark for the optimal network without structural plasticity (Knoblauch, 2011). In order to compute the capacity of the zip net we have assumed physiological anatomical connectivity P = P1 = 0.1 where structural plasticity "moves" the P1n<sup>2</sup> relevant 1-synapses to the most useful locations within the limits given by potential connectivity Ppot (as P1 is fixed, unlike in the Willshaw model, final Peff after learning may be lower than Ppot in zip nets; see Appendix A.3 for methodological details). We can make the following observations: First, as expected, for high connectivity and very sparse activity (e.g., k ≪ 100) the zip nets may perform worse than the Willshaw model (because the Willshaw model then performs close to the optimal Bayesian net). Second, for more physiological parameters Ppot ≤ 0.5, k ≥ 50 the zip net can store significantly more memories than the Willshaw model, for example, for Ppot = 0.5 the zip net reaches M ≈ 1,000,000 for k = 50 and still M ≈ 120,000 for k = 500. Third, also the total synaptic capacity C<sup>tot</sup> is higher than for the Willshaw network, for example for Ppot = 0.5, it is C<sup>tot</sup> ≈ 0.6 for k = 50 and still C<sup>tot</sup> ≈ 0.5 for k = 500 (remember that the corresponding value for the Willshaw model required unphysiological P1 > 0.1). Fourth, although the Bayesian network can store significantly more memories M it has only a moderate storage capacity below C<sup>wp</sup> = 0.25. In fact, for plausible cell assembly sizes, the binary synapses of the zip net with structural plasticity at P = 0.1 and Ppot = 0.5 achieve more than double the capacity of the optimal (but biologically implausible) Bayesian network with real-valued synapses at P = 0.1.

In summary, the new data confirms our previous conclusion that structural plasticity strongly increases space and energy efficiency of associative memory storage in neural networks under physiological conditions (Knoblauch et al., 2014).

### 3.3. Structural Plasticity and the Spacing Effect

In previous works we have linked structural plasticity and cognitive effects like retrograde amnesia, absence of catastrophic forgetting, and the spacing effect (Knoblauch, 2009a; Knoblauch et al., 2014). Here we focus on a more detailed analysis of the spacing effect, i.e., the finding that learning is most efficient if it is distributed in time (Ebbinghaus, 1885; Crowder, 1976; Greene, 1989). For example, learning a vocabulary list in two sessions each lasting 10 min is more efficient than learning in a single session of 20 min. We have explained this effect by slow ongoing structural plasticity and fast synaptic weight plasticity: spaced learning is useful because during the (long) time gaps between two (or more) learning sessions structural plasticity can grow many novel synapses that are potentially useful for storing new memories and that can quickly be potentiated and consolidated by synaptic weight plasticity during the (brief) learning sessions (Knoblauch et al., 2014, Section 7.3).
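The qualitative argument (slow structural turnover between sessions, fast consolidation during sessions) can be illustrated with a deliberately crude mean-field toy in Python. It is not the theory of Appendix B.2 and not either of the synapse models of Figure 3. The update rule, the generation rate `g`, the assumption that every relevant neuron pair has a potential synapse, and the function name `simulate_peff` are all simplifying assumptions made here. The parameter values (P = 0.1, turnover 0.01 per step, four rehearsal sessions of five steps) are chosen for illustration only.

```python
from typing import Iterable, List


def simulate_peff(rehearsal_steps: Iterable[int], n_steps: int = 305,
                  P: float = 0.1, p_e: float = 0.01) -> List[float]:
    """Toy trajectory of effectual connectivity for one memory set.

    x: fraction of memory-relevant neuron pairs with a consolidated synapse
       (taken here as P_eff); y: fraction with a silent synapse.
    """
    rehearsal = set(rehearsal_steps)
    # Assumed generation rate chosen so that, without any consolidation,
    # the silent-synapse density settles at the anatomical connectivity P.
    g = p_e * P / (1.0 - P)
    x, y = 0.0, P
    trace = []
    for t in range(n_steps):
        if t in rehearsal:
            # Fast weight plasticity: all silent synapses at relevant
            # locations are potentiated and consolidated.
            x, y = x + y, 0.0
        # Slow structural plasticity: silent synapses are eliminated with
        # probability p_e and new ones grow at free potential locations.
        y = y + g * (1.0 - x - y) - p_e * y
        trace.append(x)
    return trace


spaced = [t for start in (0, 100, 200, 300) for t in range(start, start + 5)]
massed = list(range(20))          # same total rehearsal time, no gaps
peff_spaced = simulate_peff(spaced)[-1]
peff_massed = simulate_peff(massed)[-1]
print(f"spaced: {peff_spaced:.3f}, massed: {peff_massed:.3f}")
```

In this toy the spaced schedule ends with a clearly higher fraction of consolidated synapses than the massed schedule of equal total duration, because each gap lets turnover place new silent synapses at useful locations before the next session consolidates them.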

Appendix B.2 develops a simplified theory of the spacing effect that is based on model variant B of a potential synapse (which can be analyzed more easily than model A; see **Figure 3**) and the concept and methods proposed in Section 2.3. In particular, with Equations (73–75) we can easily compute the temporal evolution of effectual connectivity Peff(t) for arbitrary rehearsal sequences of a novel set of memories to be learned. As output noise ε̂ is a decreasing function of Peff (see **Figure 5B**), we can use Peff as a measure of retrieval performance.

To illustrate the effect of spaced vs. non-spaced rehearsal (or consolidation) on Peff, and to verify the theory in Appendix B.2, **Figure 9** shows the temporal evolution of Peff(t) for different models and synapse parameters. It can be seen that for high potential connectivity Ppot ≈ 1 and low deconsolidation probability p<sub>d|s</sub> ≈ 0 the spacing effect is most pronounced and the network easily realizes high-performance long-term memory (with high Peff; see panel A). Larger p<sub>d|0</sub> > 0 is plausible to model short-term memory, whereas realizing long-term memory would then require repeated consolidation steps (panels B–D). Significant spacing effects are visible for any parameter set. Comparing the microscopic simulations of both synapse models from **Figure 3** to the macroscopic simulations using the methods of Section 2.3 and Appendix B.2, it can be seen that all model and simulation variants behave qualitatively and quantitatively very similarly. This justifies using the theory of Appendix B.2 in the following analysis of recent psychological experiments exploring the spacing effect.

For example, Cepeda et al. (2008) describe an internet-based learning experiment investigating the spacing effect over longer time intervals of more than a year (up to 455 days). The structure of the experiment followed **Figure 10**. The subjects had to learn a set of facts in an initial study session. After a gap interval (0–105 days) without any learning the subjects restudied the same material. After a retention interval (RI; 7–350 days) there was the final test.

These experiments showed that the final recall performance depends on both the gap and the RI, with the following characteristics: First, for any gap duration, recall performance declines as a function of RI in a negatively accelerated fashion, which corresponds to the familiar "forgetting curve." Second, for any RI greater than zero, an increase in study gap causes recall to first increase and then decrease. Third, as RI increases, the optimal gap increases, whereas the ratio of optimal gap to RI declines. The following shows that our simple associative memory model based on structural plasticity can explain most of these characteristics.

FIGURE 9 | Verification of the theoretical analyses of the spacing effect in Section B.2 in Appendix. Each curve shows effectual connectivity *P*eff over time for different network and learning parameters. Thin solid lines correspond to simulation experiments of synapse model A (magenta; see Figure 3A) and synapse model B (black; see Figure 3B), where both variants assume that at most one synapse can connect a neuron pair (p(1) = 1). Green dashed lines correspond to the theory of synapse model A in Appendix B.1 (see Equations 54–56). Blue dash-dotted lines correspond to the theory of synapse model B in Appendix B.2 (see Equations 71–72) and, virtually identical, red-dashed lines correspond to the final theory of model B (see Equations 73–75). For comparison, thick light-gray lines correspond to non-spaced rehearsal of the same total duration as the spaced rehearsal sessions (using model A). (A) Spaced rehearsal of a set of *M* = 20 memories at times *t* = 0–4, 100–104, 200–204, and 300–304. Each memory had *k* = *l* = 50 active units out of *m* = *n* = 1000 neurons corresponding to a consolidation load *P*1<sup>S</sup> ≈ 0.0488. Further we used anatomical connectivity *P* = 0.1, potential connectivity *P*pot = 1, initial fraction of consolidated synapses of *P*<sup>1</sup> = 0 and *pe*|<sup>1</sup> = *pd*|<sup>1</sup> = 0, *pc*|<sup>*s*</sup> = *s*. In each simulation step a fraction *pe* := *pe*|<sup>0</sup> = 0.01 of untagged silent synapses was replaced by new synapses at other locations, but there was no deconsolidation, *pd* := *pd*|<sup>0</sup> = 0. (B) Similar parameters as before, but *P*pot = 0.4, *P*<sup>1</sup> = 0.04, *pe* = 0.1, and *pd* = 0.02. Memories were rehearsed for a single time step *t* = 0, *t* = 100, *t* = 200, and *t* = 300. (C) Similar parameters as for panel B, but smaller *pd* = 0.05. (D) Similar parameters as for panel C, but larger *P*<sup>1</sup> = 0.095, i.e., 95 percent of real synapses are initially consolidated. Rehearsal times were *t* = 0, 100, 200, . . . , 700. Note that the theoretical curves for model A closely match the experimental curves (magenta vs. green). The theory for model B is still reasonably good (black vs. blue/red), although panel D shows some deviations from the simulation experiments. Such deviations may be due to the small number of unstable silent synapses (*P*1 near *P*). In any case, synapse models A and B behave very similarly.

It is straightforward to model the experiments of Cepeda et al. (2008) by applying our model of structural plasticity and synaptic consolidation. **Figure 11** illustrates $P\_{\rm eff}(t)$ for a learning protocol as employed in the experiments: In an initial study session facts are learned until time $t^{(1)}$ when some desired performance level $P\_{\rm eff}^{(1)}$ is reached. After a gap the facts are rehearsed briefly at time $t^{(2)}$, reaching a performance equivalent to $P\_{\rm eff}^{(2)}$. After the retention interval, at time $t^{(3)}$, performance still corresponds to an effectual connectivity $P\_{\rm eff}^{(3)}$.

Similar to Cepeda et al. (2008), we want to optimize the gap duration in order to maximize $P\_{\rm eff}^{(3)}$ for a given retention interval RI. After the second rehearsal at time $t^{(2)}$, $P\_{\rm eff}$ decays exponentially by a fixed factor $1 - p\_{d|0}$ per time step (Equation 74). Therefore, $P\_{\rm eff}^{(3)} = P\_{\rm eff}^{(2)} (1 - p\_{d|0})^{t^{(3)} - t^{(2)}}$ is a function of $P\_{\rm eff}^{(2)}$ that decreases with the retention interval length $t^{(3)} - t^{(2)}$. We can therefore equivalently maximize $P\_{\rm eff}^{(2)}$ with respect to the gap length $\Delta t := t^{(2)} - t^{(1)}$. For $p\_{c|s} = s$, $p\_{e|1} = p\_{d|1} = 0$, a good approximation of $P\_{\rm eff}^{(2)}$ follows from Equation (73),

$$P\_{\rm eff}^{(2)} \approx \frac{P P\_{\rm pot} + [(P\_{\rm pot} - P)P\_{\rm eff}^{(1)} - P\_{\rm pot}P\_1^{(t1)}] (1 - p\_{d|0})^{\Delta t} - P\_{\rm pot}(P - P\_1^{(t1)})(1 - p\_{e|0})^{\Delta t}}{P\_{\rm pot} - P\_1^{(t1)}(1 - p\_{d|0})^{\Delta t} - [P - P\_1^{(t1)}](1 - p\_{e|0})^{\Delta t}}, \tag{31}$$

where $P\_1^{(t1)} := P\_1^{(t0)} (1 - P\_{1S})(1 - p\_{d|0})^{t^{(1)}} + P\_{1S} P\_{\rm eff}^{(1)}$, with $P\_1^{(t0)}$ denoting the initial fraction of consolidated synapses at time 0.<sup>3</sup> Since $P\_{\rm eff}^{(2)}$ does not depend on the RI, we can already see that the optimal gap interval $\Delta t$ does not depend on the RI either (which contrasts with the experiments reporting that the optimal $\Delta t$ increases with RI). Optimizing $\Delta t$ yields the optimality criterion (see Appendix B.3)

$$\frac{P\_{\text{eff}}^{(1)} - P\_1^{(t1)}}{P - P\_1^{(t1)}} + (\alpha - 1) \frac{P\_{\text{eff}}^{(1)}}{P\_{\text{pot}}} \eta^{\alpha} - \alpha \eta^{\alpha - 1} = 0 \tag{32}$$

with

$$\eta := (1 - p\_{d|0})^{\Delta t} = e^{\Delta t \ln(1 - p\_{d|0})} \quad \Leftrightarrow \quad \Delta t = \frac{\ln \eta}{\ln(1 - p\_{d|0})} \tag{33}$$

$$\alpha := \frac{\ln(1 - p\_{e|0})}{\ln(1 - p\_{d|0})} \,, \tag{34}$$

which can easily be evaluated using standard Newton-type numerical methods. Note that Equation (32) can be used to link neuroanatomical and neurophysiological to psychological data.

For example, given the optimal gap $\Delta t\_{\rm opt}$ from psychological experiments, Equation (32) gives a constraint on the remaining network and learning parameters. Alternatively, we can solve Equation (32) to determine the optimal gap $\Delta t\_{\rm opt}$ given the remaining parameters.
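As a minimal sketch of the second use, the optimality criterion can be solved numerically. The code below uses plain bisection rather than a Newton-type method, writes x for the quantity (1 − pd|0)^Δt, and only handles the case pe|0 > pd|0, where the left-hand side of the criterion is monotonically decreasing in x on (0, 1]. The function names and the value P_eff1 = 0.15 for the effectual connectivity after the first session are hypothetical; the remaining defaults are the parameter values quoted in the main text for Figure 12.

```python
import math


def consolidated_after_first_session(P_eff1, p_d, P1_t0=0.02, P_1S=0.001, t1=10):
    """Fraction of consolidated synapses after the first study session,
    following the definition given below Equation (31)."""
    return P1_t0 * (1.0 - P_1S) * (1.0 - p_d) ** t1 + P_1S * P_eff1


def optimal_gap(P_eff1, P1_t1, P=0.1, P_pot=0.4, p_e=0.01, p_d=0.0001):
    """Solve the optimality criterion (Equation 32) by bisection.

    Returns the gap length in time steps, or None if the criterion has no
    root in (0, 1), i.e., no interior optimum exists in this sketch.
    """
    alpha = math.log(1.0 - p_e) / math.log(1.0 - p_d)   # Equation (34)
    if alpha <= 1.0:
        return None  # sketch restricted to p_e > p_d (see lead-in)
    c0 = (P_eff1 - P1_t1) / (P - P1_t1)

    def criterion(x):  # left-hand side of Equation (32)
        return (c0 + (alpha - 1.0) * P_eff1 / P_pot * x ** alpha
                - alpha * x ** (alpha - 1.0))

    lo, hi = 0.0, 1.0  # x = 1 corresponds to a gap of zero
    if criterion(lo) <= 0.0 or criterion(hi) >= 0.0:
        return None
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if criterion(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x_opt = 0.5 * (lo + hi)
    return math.log(x_opt) / math.log(1.0 - p_d)        # Equation (33)


def gap_for(p_e, p_d, P_eff1=0.15):
    """Convenience wrapper; P_eff1 = 0.15 is an assumed value."""
    P1_t1 = consolidated_after_first_session(P_eff1, p_d)
    return optimal_gap(P_eff1, P1_t1, p_e=p_e, p_d=p_d)


print("optimal gap (time steps):", gap_for(0.01, 0.0001))
```

Bisection is chosen here only because it needs no derivative and no starting guess; any standard root finder would serve equally well.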


We have verified Equation (32) by simulations illustrated in **Figure 12** (compare simulation data to Cepeda et al., 2008, **Figure 3**). For these simulations we chose physiologically plausible model parameters: Similarly as before, we used $P\_{\rm pot} = 0.4$ (Stepanyants et al., 2002; DePaola et al., 2006) and $P = 0.1$ (Braitenberg and Schüz, 1991; Hellwig, 2000). Further, we used $P\_1^{(t0)} = 0.02$, as neurophysiological experiments investigating two-state properties of synapses suggest that about 20% of synapses are in the "up" state (Petersen et al., 1998; O'Connor et al., 2005).<sup>4</sup> Then we chose a small consolidation load $P\_{1S} = 0.001$, assuming that the small set of novel facts is negligible compared to the presumably large set of older memories. As before, we assumed $p\_g$ in homeostatic balance to maintain a constant anatomical connectivity $P(t)$ (Equation 69) and binary consolidation signals $s = S\_{ij} \in \{0, 1\}$ with $p\_{c|s} = s$ and $p\_{d|1} = p\_{e|1} = 0$ for any synapse $ij$. For the remaining learning

<sup>3</sup> Note that a constant (instead of decaying) "background" consolidation $P\_1$ can be modeled, for example, by using $P\_1^{(t0)} = 0$ and then excluding the initially consolidated synapses from further simulation. This means simulating a network with anatomical connectivity $P' = P - P\_1$, potential connectivity $P'\_{\rm pot} = P\_{\rm pot} - P\_1$, no initial consolidation with $P'\_1 = 0$, and otherwise the same parameters as the original network. Then the effectual connectivity can be computed from $P\_1^{(1)} = P\_1^{(1)\prime} + P\_{1S} P\_1$ using Equation (18), where $P\_1^{(1)\prime} mn$ is obtained from the simulation.

<sup>4</sup> It may be more realistic that the total number of "up"-synapses is kept constant by homeostatic processes (i.e., $P\_1/P = 0.2$). However, here we were more interested in verifying our theory, which assumes exponential decay of "up"-synapses. To account for homeostasis with constant $P\_1$ one may proceed as described in footnote 3. Nevertheless, the qualitative behavior of the model does not strongly depend on $P\_1$ or $P\_1^{(t0)}$ unless their values are close to $P$, which would strongly impair learning.

FIGURE 12 | Simulation of the spacing effect described by Cepeda et al. (2008, Figure 3) using synapse model variant A (green lines) and B (magenta lines; see Figure 3). Each curve shows final effectual connectivity *P*eff = *P*<sup>(3)</sup>eff as a function of rehearsal gap Δ*t* for different retention intervals (RI = 7, 35, 70, 350 days) assuming an experimental setting as illustrated in Figures 10, 11. Initially, memory facts were rehearsed for tr1 = 10 time steps (1 time step = 1 h). After the gap, memory facts were rehearsed again for a single time step (tr2 = 1). Finally, after RI steps the resulting effectual connectivity was tested. Red dashed lines indicate optimal gap interval length for synapse model B as computed from solving Equation (32). Different panels correspond to different synapse parameters *pe*|<sup>0</sup> and *pd*|<sup>0</sup>: Elimination probabilities are *pe*|<sup>0</sup> = 0.1 (top panels A,D), *pe*|<sup>0</sup> = 0.01 (middle panels B,E), and *pe*|<sup>0</sup> = 0.001 (bottom panels C,F). Deconsolidation probabilities are *pd*|<sup>0</sup> = 0.0001 (left panels A–C) and *pd*|<sup>0</sup> = 0.001 (right panels D–F). Remaining model parameters are described in the main text.

parameters pe|<sup>0</sup> and pd|<sup>0</sup> we have chosen several combinations to test their relevance for fitting the model to the observed data.

The simulation results of **Figure 12** imply the following conclusions: First, the simulations show that the optimal gap determined by Equation (32) closely matches the simulation results for both synapse models (**Figure 3**). Second, for fixed deconsolidation $p\_{d|0}$, larger $p\_{e|0}$ implies smaller optimal gaps $\Delta t\_{\rm opt}$. Thus, faster synaptic turnover implies smaller optimal gaps. Third, for fixed turnover $p\_{e|0}$, larger $p\_{d|0}$ implies smaller $\Delta t\_{\rm opt}$. Thus, faster deconsolidation also implies smaller optimal gaps. Fourth, together this means that faster (weight and structural) plasticity implies smaller optimal gaps. Fifth, although model variants A and B (**Figure 3**) behave very similarly for most parameter settings, they can differ significantly for some parameter combinations. For example, for $p\_{e|0} = p\_{d|0} = 0.001$ (panel F) the peak in $P\_{\rm eff}$ of model A is more than a third larger than the peak of model B. In fact, there the curve of model B is almost flat. Still, even here, the optimal gap interval length is very similar for the two models. An obvious reason why model A sometimes performs better than model B is that deconsolidation of a synapse in model A does not necessarily imply elimination as in model B (see **Figure 3**). Sixth, our simple model already satisfies two of the three characteristics of the spacing effect mentioned above: Both the forgetting effect and the existence of an optimal time gap can be observed in a wide parameter range. Best fits to the experimental data occurred for $p\_{e|0} = 0.01$ and $p\_{d|0} = 0.0002$ (between parameters of panels B,C; data not shown). Last, however, our simple model cannot reproduce the third characteristic: As argued above, the optimal gap interval length $\Delta t\_{\rm opt}$ does not depend on the retention interval RI. This is in contrast to the experiments of Cepeda et al. (2008) reporting that $\Delta t\_{\rm opt}$ increases with RI.

Nevertheless, we have shown in some preliminary simulations that a slight extension of the model can easily resolve the latter discrepancy (Knoblauch, 2010a): By mixing two populations of synapses having different plasticity parameters corresponding to a small and large optimal gap (or fast and slow plasticity), respectively, it is possible to obtain a dependence of optimal spacing as in the experiments.
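Both points can be illustrated with a short sketch that evaluates the approximation of Equation (31) on a grid of gap lengths and then applies the exponential decay over the retention interval. This is not the simulation code of Knoblauch (2010a). The function names, the value P_eff1 = 0.15 after the first session, the equal population weights, and the particular fast and slow parameter pairs are all hypothetical choices for illustration; time steps are hours as in Figure 12.

```python
def peff_after_gap(dt, P_eff1, P1_t1, P, P_pot, p_e, p_d):
    """Effectual connectivity after the second rehearsal, Equation (31)."""
    x = (1.0 - p_d) ** dt
    y = (1.0 - p_e) ** dt
    num = (P * P_pot + ((P_pot - P) * P_eff1 - P_pot * P1_t1) * x
           - P_pot * (P - P1_t1) * y)
    den = P_pot - P1_t1 * x - (P - P1_t1) * y
    return num / den


def final_peff(dt, ri, p_e, p_d, P_eff1=0.15, P=0.1, P_pot=0.4,
               P1_t0=0.02, P_1S=0.001, t1=10):
    """Effectual connectivity at test time: Equation (31) followed by
    exponential decay with factor (1 - p_d) per step over the RI."""
    P1_t1 = P1_t0 * (1.0 - P_1S) * (1.0 - p_d) ** t1 + P_1S * P_eff1
    p2 = peff_after_gap(dt, P_eff1, P1_t1, P, P_pot, p_e, p_d)
    return p2 * (1.0 - p_d) ** ri


def best_gap(ri, populations, gaps=range(0, 3001)):
    """Gap (in time steps) maximizing the weighted sum over populations.

    populations: list of (weight, p_e, p_d) tuples (hypothetical format).
    """
    def total(dt):
        return sum(w * final_peff(dt, ri, p_e, p_d)
                   for w, p_e, p_d in populations)
    return max(gaps, key=total)


single = [(1.0, 0.01, 0.0001)]
mixed = [(0.5, 0.1, 0.001), (0.5, 0.001, 0.0001)]  # fast and slow synapses

for ri_days in (7, 35, 70, 350):
    ri = 24 * ri_days
    print(ri_days, "days:", best_gap(ri, single), best_gap(ri, mixed))
```

For a single population the retention interval only rescales the curve, so the best gap stays the same. In the mixture the fast population's contribution fades for long retention intervals, so the best gap moves toward the larger optimum of the slow population.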

### 4. DISCUSSION

In this theoretical work we have identified roles of structural plasticity and effectual connectivity Peff for network performance, measuring brain connectivity, and optimizing learning protocols. Analyzing how many cell assemblies or memories can be stored in a cortical macrocolumn (of size 1 mm<sup>3</sup>), we find a strong dependence of storage capacity on Peff and cell assembly size k (see **Figures 7**, **8**). We find that, without structural plasticity, when cell assemblies would have a connectivity close to the low anatomical connectivity P ≈ 0.1, only a small number of relatively large cell assemblies could be stably stored (Latham and Nirenberg, 2004; Aviel et al., 2005) and, correspondingly, retrieval would not be energy efficient (Attwell and Laughlin, 2001; Laughlin and Sejnowski, 2003; Lennie, 2003; Knoblauch et al., 2010; Knoblauch, 2016). It thus appears that storing and efficiently retrieving a large number of small cell assemblies as observed in some areas of the medial temporal lobe (Waydo et al., 2006) would require structural plasticity increasing Peff from the low anatomical level toward the much larger level of potential connectivity Ppot ≈ 0.5 (Stepanyants et al., 2002). Similarly, our model predicts ongoing structural plasticity for any cortical area that exhibits sparse neural activity and high capacity.

Moreover, we have shown a close relation between our definition of effectual connectivity Peff and previous measures of functional brain connectivity. While the latter, for example transfer entropy, are solely based on correlations between neural activity in cortical areas (Schreiber, 2000), our definition of Peff as the fraction of realized required synapses has also a clear anatomical basis (**Figure 2**). Via the link of memory channel capacity C(Peff) used to measure storage capacity of a neural network, we have shown that Peff is basically an equivalent measure of functional connectivity as transfer entropy. By this, it may become possible to establish an anatomically grounded link between structural plasticity and functional connectivity. For example, this could enable predictions on which cortical areas exhibit strong ongoing structural plasticity during certain cognitive tasks.

Further, as one example linking cognitive phenomena to their potential anatomical basis, we have more closely investigated the spacing effect, i.e., that learning becomes more efficient if rehearsal is distributed over multiple sessions (Crowder, 1976; Greene, 1989; Cepeda et al., 2008). In previous works we have already shown that the spacing effect can easily be explained by structural plasticity and that, therefore, structural plasticity may be the common physiological basis of various forms of the spacing effect (Knoblauch, 2009a; Knoblauch et al., 2014). Here we have extended these results to explain some recent long-term memory experiments investigating the optimal time gap between two learning sessions (Cepeda et al., 2008). For a given retention interval, our model, if fitted to neuroanatomical data, can easily explain the profile of the psychological data, in particular, the existence of an optimal gap that maximizes memory retention. It is even possible to analyze this profile, linking the optimal gap to parameters of the synapse model, in particular, the rate of deconsolidation $p\_{d|0}$ and elimination $p\_{e|0}$. Our results show that small optimal gaps correspond to fast structural and weight plasticity with a high synaptic turnover rate $p\_{e|0}$ and relatively large $p\_{d|0}$ with a high forgetting rate, whereas large gaps correspond to slow plasticity processes. This result has two implications: First, it may be used to explain the remaining discrepancy that in the psychological data the time gap depends on the retention interval, whereas in our model it does not: As preliminary simulations indicate, the experimental data could be reproduced by mixing (at least) two synapse populations with different sets of parameters, where they could be both within the same cortical area (stable vs. unstable synapses; cf., Holtmaat and Svoboda, 2009) or distributed to different areas (e.g., fast plasticity in the medial temporal lobe, and slower plasticity in neocortical areas). Second, as the temporal profile of optimal learning depends on parameters of structural plasticity, it may become possible in future experiments to link behavioral data on memory performance to physiological data on structural plasticity in cortical areas where these memories are finally stored.

Although we have concentrated on analyzing one-step retrieval in feed-forward networks, our results apply as well to recurrent networks and iterative retrieval (Hopfield, 1982; Schwenker et al., 1996; Sommer and Palm, 1999): Obviously, all results on the temporal evolution of $P\_{\rm eff}$ (including the results on the spacing effect) depend only on synapses having proper access to consolidation signals $S\_{ij}$ by either repeated rehearsal or memory replay, and therefore hold independently of network and retrieval type. However, linking $P\_{\rm eff}$ to output noise (Equation 3) needs to assume a particular retrieval procedure. At least one-step retrieval is known to be almost equivalent for both feedforward and recurrent networks, yielding almost identical output noise and pattern capacity $M\_\epsilon$ (Knoblauch, 2008). Estimating retrieved information for pattern completion in auto-associative recurrent networks, however, requires subtracting the information already provided by the input patterns $\tilde{u}^\mu$. Here information storage capacity $C$ is maximal if $\tilde{u}^\mu$ contains half of the one-entries (or information) of the original pattern $u^\mu$, which leads to factor 1/2 and 1/4 decreases of $M$ and $C$ compared to hetero-association (cf., Equations 48, 49 for $\lambda = 1/2$; Palm and Sommer, 1992). Nevertheless, up to such scaling, our results demonstrating $C$ increasing with $P\_{\rm eff}$ are still valid. Similarly, our capacity analyses of $M\_\epsilon$ and $C\_\epsilon$ can also be applied to iterative retrieval by requiring that the one-step output noise level $\epsilon$ is smaller than the initial input noise $\tilde{\epsilon}$. As typically output noise $\hat{\epsilon}$ steeply decreases with input noise $\tilde{\epsilon}$ (cf. Equation 45), additional retrieval steps will drive $\hat{\epsilon}$ toward zero, with activity quickly converging to the memory attractor.

Our theory depends on the assumption that potential connectivity Ppot is significantly larger than anatomical connectivity P. This assumption may be challenged by experimental findings suggesting that cortical neuron pairs are either unconnected or have multiple (e.g., 4 or 5) instead of single synapses (Fares and Stepanyants, 2009) and the corresponding theoretical works to explain these findings (Deger et al., 2012; Fauth et al., 2015b). For example, Fauth et al. (2015a) predict that narrow distributions of synapse numbers around 4 or 5 follow from a regulatory interaction between synaptic and structural plasticity, where connections having a smaller synapse number cannot stably exist. If true this would mean that most potential synapses could never become stable actual synapses because the majority of potentially connected neuron pairs have less than 4 potential synapses (e.g., see Fares and Stepanyants, 2009, **Figure 1**). As a consequence, actual Ppot would be significantly lower than assumed in our work, perhaps only slightly larger than P, strongly limiting a possible increase of effectual connectivity Peff by structural plasticity. On the other hand, the data of Fares and Stepanyants (2009) are based only on neuron pairs having very low distances (< 50 µm), whereas our model rather applies to cortical macrocolumns where most neuron pairs have much larger distances. Thus, unlike Fauth et al. (2015a), our theory of structural plasticity increasing effectual connectivity and synaptic storage efficiency predicts that neuron pairs within a macrocolumn should typically be connected by a much smaller synapse number (e.g., 1 or perhaps 2).

### AUTHOR CONTRIBUTIONS

Conceived, designed, and performed experiments: AK. Analyzed the data: AK, FS. Contributed simulation/analysis tools: AK. Wrote the paper: AK, FS.

### FUNDING

FS was supported by INTEL, the Kavli Foundation and the National Science Foundation (grants 0855272, 1219212, 1516527).

### ACKNOWLEDGMENTS

We thank Edgar Körner, Ursula Körner, Günther Palm, and Marc-Oliver Gewaltig for many fruitful discussions.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnana.2016.00063

### REFERENCES

Bosch, H., and Kurfess, F. (1998). Information storage capacity of incompletely connected associative memories. Neural Netw. 11, 869–876.


Abstract: Computational and Systems Neuroscience. 2010, doi: 10.3389/conf.fnins.2010.03.00227


Palm, G. (1980). On associative memories. Biol. Cybernet. 36, 19–31.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Knoblauch and Sommer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Anatomically Detailed and Large-Scale Simulations Studying Synapse Loss and Synchrony Using NeuroBox

Markus Breit<sup>1</sup>, Martin Stepniewski<sup>1</sup>, Stephan Grein<sup>1,2</sup>, Pascal Gottmann<sup>1</sup>, Lukas Reinhardt<sup>1</sup> and Gillian Queisser<sup>1,2</sup>\*

*<sup>1</sup> Computational Neuroscience, Department for Computer Science and Mathematics, Goethe Center for Scientific Computing, Goethe University, Frankfurt am Main, Germany, <sup>2</sup> Department of Mathematics, Temple University, Philadelphia, PA, USA*

The morphology of neurons and networks plays an important role in processing electrical and biochemical signals. Based on neuronal reconstructions, which are becoming abundantly available through databases such as *NeuroMorpho.org*, numerical simulations of Hodgkin-Huxley-type equations, coupled to biochemical models, can be performed in order to systematically investigate the influence of cellular morphology and the connectivity pattern in networks on the underlying function. Development in the area of synthetic neural network generation and morphology reconstruction from microscopy data has brought forth the software tool NeuGen. Coupling these morphology data (from databases, synthetic generation, or reconstruction) to the simulation platform UG 4 (which harbors a neuroscientific portfolio) and VRL-Studio has brought forth the extendible toolbox NeuroBox. NeuroBox allows users to perform numerical simulations on hybrid-dimensional morphology representations. The code basis is designed in a modular way, such that, e.g., new channel or synapse types can be added to the library. Workflows can be specified through scripts or through the VRL-Studio graphical workflow representation. Third-party tools, such as ImageJ, can be added to NeuroBox workflows. In this paper, NeuroBox is used to study the electrical and biochemical effects of synapse loss vs. synchrony in neurons, to investigate large morphology data sets within detailed biophysical simulations, and to demonstrate the capability of utilizing high-performance computing infrastructure for large-scale network simulations. Using new synapse distribution methods and Finite Volume based numerical solvers for compartment-type models, our results demonstrate how an increase in synaptic synchronization can compensate for synapse loss at the electrical and calcium level, and how detailed neuronal morphology can be integrated in large-scale network simulations.

Keywords: HPC, large-scale neuronal networks, synaptic plasticity, electrical scale, anatomy, reconstruction, simulation, cable equation

#### Edited by:

*Markus Butz, Independent Researcher, Germany*

#### Reviewed by:

*Bruce Graham, University of Stirling, UK; Zoltan Rusznak, Neuroscience Research Australia, Australia*

> \*Correspondence: *Gillian Queisser gillian.queisser@temple.edu*

Received: *20 November 2015* Accepted: *25 January 2016* Published: *12 February 2016*

#### Citation:

*Breit M, Stepniewski M, Grein S, Gottmann P, Reinhardt L and Queisser G (2016) Anatomically Detailed and Large-Scale Simulations Studying Synapse Loss and Synchrony Using NeuroBox. Front. Neuroanat. 10:8. doi: 10.3389/fnana.2016.00008*

### 1. INTRODUCTION

The structure of neurons and networks in the brain is known to change continuously over time. Cellular growth, synapse formation or synapse loss, and reorganization of intracellular architecture constantly change the overall cellular and network anatomy (Hughes, 1958; Abbott and Nelson, 2000; Sheng and Hoogenraad, 2007; Shepherd and Huganir, 2007; Tai et al., 2008; Colon-Ramos, 2009; Branco et al., 2010; Zeltser et al., 2012; Tyagarajan and Fritschy, 2014). These changes in geometric layout can be interpreted as a strong indicator that the anatomy at the (sub)cellular and network level is deeply involved in function on various levels. Neuroscientific research has always been devoted to the interplay between morphology and function. Experimental research draws on microscopy techniques that can make morphology and spatio-temporal signals visible (Spacek and Harris, 1997; Arellano et al., 2007; Chen et al., 2008), while theoretical work in Computational Neuroscience has brought forth an abundance of cellular and network models, many of which rely on a spatial representation of neurons and networks (Bower and Beeman, 1997; Hines and Carnevale, 1997; Balls et al., 2004; Gewaltig and Diesmann, 2007; Andrews et al., 2010). General purpose simulators such as NEURON or Genesis couple electrical and biochemical models to graph-representations of neurons and synaptically connected networks. The importance of neuronal morphology used in such simulations can be seen in reconstruction projects, such as the database project NeuroMorpho (cf. Ascoli, 2006). Currently more than 30,000 cell reconstructions are freely available on this platform.

Reconstructing morphology from microscopy data is a further example of how deeply structure is integrated in the brain. Semi-manual or fully automated reconstruction methods are being developed in research groups around the world (e.g., Jungblut et al., 2011; Popov et al., 2011; Burette et al., 2012), trying to unravel the filigreed multi-level organization of the brain. This dedication has advanced the field significantly, yet many anatomical questions remain unresolved. To leverage the power of large-scale network simulations, synthetic neuron morphology tools have been developed (Wolf et al., 2013). These algorithms are capable of generating synthetic networks with realistic morphology statistics, which can be used within detailed functional simulations. In order to use these large data sets in detailed and large network simulations, high-performance computing platforms become an indispensable component of the process. While most of the available network simulators were originally conceived to run serially, there has been an effort to parallelize and optimize the code for ever-growing computing power.

In this paper, we present an approach focusing on the topic of cellular and network anatomy within a large-scale computing context. Building on scalable numerical methods in a flexible and parallelized discretization and solver framework for general ordinary and partial differential equation systems, this unified approach does not make use of the NEURON simulation environment (Hines and Carnevale, 1997) used in similar projects (Markram, 2015; Ramaswamy, 2015; Reimann, 2015). We introduce some of the authors' contributions in morphology reconstruction as well as artificial construction, hybrid-dimensional modeling and simulation of coupled biochemical and electrical signals, and link these to newly developed algorithms for massively parallel simulation of cable equation models and synapse distribution on cells. The latter can be used to simulate healthy and disease state neurons with different synapse numbers and distributions.

The Materials and Methods section of this paper discusses the tool NeuGen (Eberhard et al., 2006; Wolf et al., 2013) and how it ties into a generalized simulation framework. Our model for simulating electrical signals builds upon the known cable theory and is briefly summarized. We introduce our methods for handling synapse types and synapse distributions and introduce a new way of numerically discretizing the resulting model equations and computational domains, ultimately resulting in a system that can be solved on massively parallel computing architectures. These methods are compiled in the toolbox NeuroBox which is developed on top of the numerics engine UG 4 (cf. Vogel et al., 2013) that has been used in several detailed studies of structure-function interplay (Xylouris et al., 2007; Hansen et al., 2008; Nägel et al., 2008, 2009; Wittmann et al., 2009; Grillo et al., 2010; Muha et al., 2011).

To study this anatomy-high-performance framework we present a study of synapse loss vs. signal synchronicity and its influence on somatic calcium signals, as well as large and detailed network simulations (10,000 neurons, each neuron containing 574–586 degrees of freedom) of a neocortical column synthetically generated with NeuGen. In these studies we show that synapse loss, which is a major factor in neurodegenerative diseases, can be partially compensated by an increase in synaptic synchronicity, while somatic calcium signals rely strongly on the activation and frequency of action potentials. We further show that wave activation in neocortical networks is clearly driven by synapse density and that our employed simulation framework scales well on JUQUEEN, one of the high-performance computers at the German Jülich Supercomputing Center. This in turn demonstrates that large-scale network simulations do not necessarily have to come at the cost of anatomy anymore.

### 2. MATERIALS AND METHODS

In this section we will introduce the tools and methods used for the simulations performed in Section 3. A combination of neuron and network generating tools (Section 2.1), synapse distribution algorithms, a new approach for numerical discretization of the network topology and a parallel computing framework (Section 2.2) forms the basis of our detailed anatomical and large-scale network simulations and is integrated in a new and extendible simulation toolbox, NeuroBox (Section 2.3).

### 2.1. Generating Large and Anatomically Detailed Networks

The generation of large neural networks (containing more than 10,000 neurons) is accomplished with the neural network generator NeuGen (Eberhard et al., 2006). NeuGen uses anatomical fingerprints, i.e., experimental morphology data and standard deviations, to generate anatomically consistent neurons that fit the experimental mean and standard deviation. NeuGen thus generates non-identical neurons of various types (e.g., pyramidal cells and spiny stellate cells of the neocortex and hippocampus) and synaptically connects these to form neural networks. The topology of the network is described in terms of graph theory as an undirected, connected graph containing edges and vertices in three-dimensional coordinate space. NeuGen algorithms sample parameter values from experimental data distributions and incorporate two categories of synapses: primary synapses, representing external stimulation of the network, and interconnecting synapses, which represent chemical synapses between neurons present in the network, typically formed by a presynaptic axon and a postsynaptic dendrite. The anatomy of the network can be exported to a 3D graphics format for visualization and to various discrete morphology file formats that can be used in simulators such as NEURON (Hines and Carnevale, 1997) or UG 4 (Vogel et al., 2013). NeuGen is intended to provide anatomically accurate large network topologies for general purpose neuron network simulators.

The algorithm, which is not a growth-based algorithm, is summarized by the following steps (cf. **Figure 1**):

– Generate sections for each neuron based on anatomical fingerprints


It is worth highlighting two parameters when discussing anatomical detail. To regulate the number of vertices for each neuron (which represents the level of detail at which neuron morphology is represented), one may adjust a parameter termed section\_length, the average compartment length in µm. In cases where memory consumption is a constraint, choosing an increased section length permits the creation and simulation of larger networks (with less anatomical detail) using the same amount of memory. Secondly, the number of synapses inserted into the network may be adjusted by a global threshold parameter termed dist\_synapse. If and only if the Euclidean distance between two sections falls below the threshold specified by this parameter, these sections will be marked as potential synaptic contact points. Whether or not a synapse will be placed in the network depends on the type of pre- and postsynaptic neurons. A connectivity matrix specifies which classes of neurons are interconnected by synapses (Wolf et al., 2013).
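The role of dist\_synapse and the connectivity matrix can be illustrated with a minimal sketch. It is not NeuGen's implementation: the function name, the reduction of each section to a single representative point, and the use of a set of allowed type pairs in place of the connectivity matrix are all assumptions made for illustration.

```python
import math


def potential_contacts(axon_sections, dendrite_sections, dist_synapse, allowed):
    """Return (axon_index, dendrite_index) pairs closer than dist_synapse.

    Hypothetical data layout: each section is a tuple
    (cell_type, (x, y, z)) with coordinates in micrometers, and `allowed`
    is a set of (presynaptic_type, postsynaptic_type) pairs that stands in
    for the connectivity matrix.
    """
    contacts = []
    for i, (pre_type, pre_pos) in enumerate(axon_sections):
        for j, (post_type, post_pos) in enumerate(dendrite_sections):
            # The connectivity matrix decides whether this pairing is possible.
            if (pre_type, post_type) not in allowed:
                continue
            # Strictly below the threshold marks a potential contact point.
            if math.dist(pre_pos, post_pos) < dist_synapse:
                contacts.append((i, j))
    return contacts
```

The double loop is quadratic in the number of sections; a spatial index would be needed at network scale, which this sketch deliberately leaves out.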

Subsequent simulations need to refer to the compartments contained in the grid for simulation control setup. Therefore, an alphanumerical identifier is stored within the grid, too. The identifier is a string composed of the cell type (e.g., pyramidal or stellate) and the compartment type (e.g., axon or dendrite); it groups all edges and vertices belonging to a given cell and compartment type (cf. **Figure 2**). If desired, one can request the identifier to group edges and vertices based also on the section number of the compartment, resulting in fine-grained access to the network (not shown).

generate three-dimensional representations (see Grein et al., 2014) quality assessment of the generated grid can be performed in a semi-automatic way to allow for the best possible preparation for the subsequent numerical simulations, for instance, we check for intersecting dendrites introduced during neuron tracing.

The network can be exported to a variety of formats including a format suitable for large neural network grid generation, e.g., a custom sparse data format based on a file format derived from TXT (plain text or compressed plain text) or a more convenient XML-based file format.

To use NeuGen in conjunction with the simulation framework UG 4 (Vogel et al., 2013, cf. Section 3), the generated morphology is exported to the UG 4 geometry format UGX (an XML-based file format). To that end, topology information of the exported network, consisting of the raw edges and vertices, is enriched by grid attachments such as diameter information and synapses, together with their parametrizations. This procedure is implemented as a plugin for UG 4 and produces large neural networks (≥ 10,000 neurons) in a matter of seconds (cf. **Table 1**).

In addition to directly writing UGX-files from NeuGen, it is possible to convert the following formats to UGX: SWC (commonly used in the NeuroMorpho.org database, Ascoli et al., 2007), HOC (widespread format utilized by NEURON, Hines and Carnevale, 1997), TXT and NeuroML. The last three file formats can be exported directly by NeuGen. NeuGen and the corresponding UG 4-plugins thus form an efficient pipeline for integrating large and anatomically realistic neural networks and publicly accessible anatomical neuron reconstructions into neuron and network simulation frameworks.

### 2.2. Simulating Electrical and Biochemical Signals

Having established methods for generating network topologies in the previous section, we now focus on the steps from modeling electrical signals and handling membrane transport mechanisms (including synapses) to discretizing the model equations by means of a new approach via finite volumes. Lastly, we summarize parallel methods for efficiently solving large-scale networks.

TABLE 1 | Network creation statistics sorted by size, i.e., by number of contained cells within the network, in ascending order.

*The networks are composed of L5A and L5B pyramidal cells (≈ 16% each), of L4 spiny stellate cells (≈ 42%) as well as L2/3 pyramidal neurons (≈ 26%). The smallest network contains 12 and the largest network 120,000 cells in total. To create even larger networks with the same memory resources, one can decrease the number of compartments using the section\_length parameter.*

#### 2.2.1. Model Equations for Membrane Potential and Ion Species

We follow the well-established cable theory (cf. Thompson, 1854; Scott, 1975) to model electrical signals on spatially resolved neuron morphologies. A neuron's morphology is given as a graph consisting of vertices in a three-dimensional space and edges connecting them. Common file formats for neuronal morphologies (such as SWC or HOC) contain radius or diameter values assigned to each vertex. We make use of this diameter in the simplest way, i.e., by supposing the morphology to be piecewise tubular, each piece being located around a vertex and having the radius associated with this vertex. With only very few modifications, we also implemented compartments shaped like truncated cones, resulting in a continuous radius along the neurites; however, we restrict ourselves to the case of tubular compartments in the following description for the sake of simplicity. In each of the compartments, we impose the following equation expressing the membrane's role as an ideal capacitor:

$$C_m \frac{\partial V}{\partial t} = I_{ax} + I_m, \tag{1}$$

where V is the membrane potential, C<sub>m</sub> is the capacitance of the compartment and I<sub>ax</sub>, I<sub>m</sub> are the axial and transmembrane (inward) electric currents, respectively. The compartment's capacitance C<sub>m</sub> depends on its shape and can be expressed in terms of a membrane-specific constant c<sub>m</sub>,

$$C_m = c_m \cdot 2\pi a l,$$

where a and l are the radius and the axial length of the compartment, respectively.

Axial currents need to be calculated at both ends of a compartment, at the interface with the neighboring compartments. They are assumed to be purely ohmic in nature and are expressed in terms of voltage between the two vertices associated with the neighboring compartments:

$$I_{ax}^{x_2 \to x_1} = \frac{V(x_2) - V(x_1)}{\frac{r_c}{\pi} \int_{x_1}^{x_2} (a(x))^{-2} \, dx} = \frac{V(x_2) - V(x_1)}{\frac{r_c}{2\pi} \left(a_1^{-2} + a_2^{-2}\right) |x_2 - x_1|},$$

where r<sub>c</sub> is a material constant, the specific resistance of the cytosol, x is the axial coordinate, and x<sub>1</sub>, x<sub>2</sub> as well as a<sub>1</sub>, a<sub>2</sub> are depicted in **Figure 3**. Note that the above equation implicitly assumes that the extracellular potential is constant in space.

Finally, the transmembrane current I<sub>m</sub> into the compartment is expressed in terms of electrical flux density i<sub>m</sub> as

$$I_m = i_m \cdot 2\pi a l \tag{2}$$

and depends on transport mechanisms (e.g., Hodgkin-Huxley-type channels, Na/K pumps, leakage), synapses and electrodes definable on the membrane.
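The per-compartment quantities above translate directly into code. The following is a minimal sketch under the tubular-compartment assumption; the function names are hypothetical, and all arguments are assumed to be given in one consistent unit system.

```python
import math


def compartment_capacitance(c_m, a, l):
    """C_m = c_m * 2*pi*a*l for a tubular compartment (radius a, length l)."""
    return c_m * 2.0 * math.pi * a * l


def axial_current(V1, V2, a1, a2, distance, r_c):
    """Ohmic current from vertex 2 to vertex 1.

    The axial resistance r_c/(2*pi) * (a1**-2 + a2**-2) * distance
    corresponds to two half-segments of radius a1 and a2 in series.
    """
    resistance = r_c / (2.0 * math.pi) * (a1 ** -2 + a2 ** -2) * distance
    return (V2 - V1) / resistance


def membrane_current(i_m, a, l):
    """I_m = i_m * 2*pi*a*l, cf. Equation (2)."""
    return i_m * 2.0 * math.pi * a * l
```

For equal radii a<sub>1</sub> = a<sub>2</sub> = a, the resistance reduces to the familiar r<sub>c</sub> |x<sub>2</sub> − x<sub>1</sub>| / (πa²) of a uniform cylinder.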

In order to track individual ion species, concentrations for K<sup>+</sup>, Na<sup>+</sup>, and Ca<sup>2+</sup> or any other ion type can be added to the model. Each of the species satisfies a diffusion-convection equation in axial direction and is coupled to transport mechanisms in the plasma membrane.

Note that as these ions are charged, they are affected by potential gradients in reality; conversely, and for the same reason, their concentrations directly affect the potential. A physically more accurate model of ionic movement in neurons incorporating both electric and diffusive properties of individual ion species is electro-diffusion. It has been demonstrated that the modeling error introduced by using the cable equation can be prominent in thin compartments (Qian and Sejnowski, 1989) or where three-dimensional structural detail is concerned (Lopreore et al., 2008).

#### 2.2.2. Membrane Transport Mechanisms

What is truly at the heart of most neuronal simulations is transport across membranes. We have defined an interface allowing the addition of arbitrary transport mechanisms to the electrical model in the transmembrane current density term i<sub>m</sub> of Equation (2). These transport mechanisms are granted access to the underlying grid as well as to the unknowns of the voltage and ion species equations. Thus, they are able to declare and calculate their own sets of states, which may depend on given ones and vary in space and in time, like the gating parameters m, n and h in classical Hodgkin-Huxley-type channels, which are governed by ordinary differential equations in time that depend on the membrane potential (Hodgkin and Huxley, 1952). As the dependence of inner states of membrane transport systems on the potential and on ion concentrations is typically strongly non-linear, we have decided (in the interest of fast computation) to include transmembrane currents only by an explicit scheme, i.e., inner states are updated before any time step of the solution process using only the solution from the previous time step.

The concept is not unlike the NMODL model description language for NEURON by Hines and Carnevale (1997, 2000). In fact, we have developed an automated translation unit that can convert existing NMODL files to C++ source code compilable in our framework.

#### 2.2.3. Synapses

Glutamate being the primary excitatory neurotransmitter in most synapses of the central nervous system, we define excitatory synaptic input localized at dendrites as the postsynaptic response of AMPA or NMDA receptors to presynaptic glutamate signals. AMPA and NMDA receptors, cation channels that become permeable in glutamate-bound state and thereby exhibit a conductance change in direct response to incoming presynaptic spikes, induce transmembrane flux of sodium, potassium and calcium ions causing a local excitatory depolarization of the membrane potential.

In our simulations we distinguish two general categories of synapses. The first category comprises primary synapses, which are connected to dendrites as the postsynaptic side; they are used to initialize activity in single cells as well as networks and represent connections to other neurons not included in the simulation. The second category comprises synapses connecting dendrites and axons that are both present within a network morphology. We call these interconnecting synapses.

As there is no information on the presynaptic side of primary synapses, the common and simple approach of alpha functions provides a reasonable approximation to model postsynaptic conductance profiles (Roth and van Rossum, 2009):

$$g(t) = g_{\text{max}} \frac{t - t_{\text{onset}}}{\tau} \exp\left(-\frac{t - t_{\text{onset}} - \tau}{\tau}\right), \tag{3}$$

where g<sub>max</sub> denotes the maximal conductance, τ the rise/decay time and t<sub>onset</sub> the arrival time of a single presynaptic spike. Note that g<sub>max</sub> occurs at t = t<sub>onset</sub> + τ. The synaptic current I<sub>ps</sub>(t) is then defined by

$$I_{\text{ps}}(t) = g(t)\left(V(t) - E_{\text{rev}}\right), \tag{4}$$

with g(t) given by (3) for t<sub>onset</sub> ≤ t ≤ t<sub>onset</sub> + 6τ and g(t) = 0 otherwise. V(t) denotes the current postsynaptic membrane potential and E<sub>rev</sub> a reversal potential. For glutamatergic synapses, we use E<sub>rev</sub> ≈ 0 mV (Purves et al., 2001).
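Equations (3) and (4), including the cut-off window of 6τ, can be written down as a short sketch. The function names are hypothetical, and the caller is assumed to supply times, conductances and potentials in consistent units.

```python
import math


def alpha_conductance(t, g_max, t_onset, tau):
    """Alpha-function conductance of Equation (3), zero outside
    the window t_onset <= t <= t_onset + 6*tau."""
    if t < t_onset or t > t_onset + 6.0 * tau:
        return 0.0
    s = (t - t_onset) / tau
    # exp(-(t - t_onset - tau) / tau) equals exp(1 - s)
    return g_max * s * math.exp(1.0 - s)


def primary_synapse_current(t, V, g_max, t_onset, tau, E_rev=0.0):
    """Synaptic current of Equation (4) for postsynaptic potential V."""
    return alpha_conductance(t, g_max, t_onset, tau) * (V - E_rev)
```

Evaluating the conductance at t = t<sub>onset</sub> + τ returns g<sub>max</sub>, in line with the remark on the location of the peak.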

Interconnecting synapses are activated upon rise of the presynaptic membrane potential above a threshold V<sub>th</sub>, and the following current I<sub>is</sub>(t) to the postsynaptic end is modeled according to a bi-exponential activity function:

$$t_{\text{max}} = \frac{\tau_1 \tau_2}{\tau_2 - \tau_1} \log \left( \frac{\tau_2}{\tau_1} \right), \tag{5}$$

$$n = \left(\exp\left(-\frac{t_{\text{max}}}{\tau_2}\right) - \exp\left(-\frac{t_{\text{max}}}{\tau_1}\right)\right)^{-1}, \tag{6}$$

$$I_{\text{is}}(t) = n \, g_{\text{max}} \left( V - E_{\text{rev}} \right) \left( \exp \left( -\frac{t}{\tau_2} \right) - \exp \left( -\frac{t}{\tau_1} \right) \right), \tag{7}$$

where g<sub>max</sub> is the maximal conductance; E<sub>rev</sub> is a reversal potential; τ<sub>1</sub> and τ<sub>2</sub> are constants regulating rise and decay time of the conductance; t<sub>max</sub> designates the point in time (after initial activation) at which the conductance is maximal, and the factor n normalizes the conductance such that its value is g<sub>max</sub> at t<sub>max</sub>.
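The effect of the normalization factor n can be checked numerically. The sketch below assumes that the conductance is the bi-exponential profile multiplied by n and g<sub>max</sub>, as the description of n implies; the function names and the time constants used in any call are hypothetical.

```python
import math


def biexp_peak_time(tau1, tau2):
    """Time of maximal conductance after activation, Equation (5)."""
    return tau1 * tau2 / (tau2 - tau1) * math.log(tau2 / tau1)


def biexp_norm(tau1, tau2):
    """Normalization factor n, Equation (6)."""
    t_max = biexp_peak_time(tau1, tau2)
    return 1.0 / (math.exp(-t_max / tau2) - math.exp(-t_max / tau1))


def biexp_conductance(t, g_max, tau1, tau2):
    """Normalized bi-exponential conductance; t is the time since activation."""
    if t < 0.0:
        return 0.0
    n = biexp_norm(tau1, tau2)
    return n * g_max * (math.exp(-t / tau2) - math.exp(-t / tau1))
```

With this scaling the conductance starts at zero, reaches exactly g<sub>max</sub> at t<sub>max</sub>, and decays with τ<sub>2</sub> afterwards (for τ<sub>1</sub> < τ<sub>2</sub>).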

Synaptic currents, like all other transmembrane currents, are evaluated using the solution for the potential of the previous time step only. This has significant benefits in parallel computation, as there is no direct coupling of solutions for the next time step between cells connected to one another by synapses.

#### 2.2.4. Activation Patterns of Primary Synapses

Our implementation provides a method to set generic activation patterns for a given set of input synapses in the computational domain. To achieve that, we introduce the continuous random variables X<sub>onset</sub> and X<sub>τ</sub> for the timing parameters t<sub>onset</sub> and τ [cf. Equation (3)], respectively. We assume both to be normally distributed, i.e., X<sub>onset</sub> ∼ N(µ<sub>onset</sub>, σ<sup>2</sup><sub>onset</sub>) and X<sub>τ</sub> ∼ N(µ<sub>τ</sub>, σ<sup>2</sup><sub>τ</sub>), with probability density functions given by:

$$f_{\mathcal{N}}(x_{\xi}, \mu_{\xi}, \sigma_{\xi}^{2}) = \frac{1}{\sigma_{\xi}\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x_{\xi} - \mu_{\xi}}{\sigma_{\xi}}\right)^{2}}, \quad \xi \in \{\text{onset}, \tau\} \tag{8}$$

After specification of a peak conductance g<sub>max</sub>, a mean onset time µ<sub>onset</sub> and duration µ<sub>τ</sub> of synaptic activity as well as corresponding standard deviation values σ<sub>onset</sub> and σ<sub>τ</sub>, the parameters t<sub>onset</sub> and τ are set to random values drawn from the above normal distributions N(µ<sub>onset</sub>, σ<sup>2</sup><sub>onset</sub>) and N(µ<sub>τ</sub>, σ<sup>2</sup><sub>τ</sub>), respectively.
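Drawing such an activation pattern amounts to two independent normal draws per synapse. The following sketch uses Python's standard library and a hypothetical function name; it makes no attempt to reject negative draws, which is an omission of this illustration and not a statement about the actual implementation.

```python
import random


def draw_activation_pattern(n_synapses, mu_onset, sigma_onset,
                            mu_tau, sigma_tau, seed=None):
    """Return a list of (t_onset, tau) pairs, one per primary synapse.

    Both parameters are drawn independently from normal distributions.
    A standard deviation of zero reproduces complete synchrony.
    """
    rng = random.Random(seed)
    pattern = []
    for _ in range(n_synapses):
        t_onset = rng.gauss(mu_onset, sigma_onset)
        tau = rng.gauss(mu_tau, sigma_tau)
        pattern.append((t_onset, tau))
    return pattern
```

Passing a fixed seed makes a pattern reproducible, which is convenient when the same input is to be replayed on differently thinned synapse sets.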

#### 2.2.5. Spatial Distribution of Primary Synapses

Given neuron morphologies (defined as graphs in three-dimensional coordinate space), we attach all information parameterizing synapses to the dendritic edges they are associated with. The distribution is managed by the C++ class SynapseDistributor. It provides methods to create new synapses or delete existing ones according to user-specified statistical distributions.

In our studies, we assume a uniform distribution of n<sub>syn</sub> ∈ ℕ synapses on the edge sample space

$$S_{\text{edge}} := \{ e_i \mid i = 1, \ldots, n_{\text{edge}} \} \tag{9}$$

of the basal and apical dendrites. For this purpose, we consider a discrete random variable X<sup>i</sup><sub>syn</sub>, i ∈ {1, ..., n<sub>syn</sub>}, for the i-th synapse to be distributed. With every draw, X<sup>i</sup><sub>syn</sub> can take one of the edge indices j ∈ {1, ..., n<sub>edge</sub>} as its value, i.e., X<sup>i</sup><sub>syn</sub> = x<sup>i</sup><sub>j</sub> := j. To account for the heterogeneous edge lengths, every edge index is assigned a probability given by the following probability mass function:

$$P\left(X_{\text{syn}}^{i} = x_{j}^{i}\right) = p_{j}^{i} := \frac{\|e_{j}\|_{2}}{\sum_{k=1}^{n_{\text{edge}}} \|e_{k}\|_{2}} \tag{10}$$

The exact location of the i-th synapse x<sup>i</sup><sub>j</sub> on the j-th edge is then drawn from a continuous uniform distribution in the range (0, 1).
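The two-stage draw (edge index weighted by length, then a uniform position along the edge) can be sketched as follows. This is an illustration in Python, not the SynapseDistributor class; the function name and the representation of edges as pairs of 3D points are assumptions.

```python
import math
import random


def distribute_synapses(edges, n_syn, seed=None):
    """Place n_syn synapses on edges with probability proportional to length.

    `edges` is a list of (p, q) pairs of 3D points. Returns a list of
    (edge_index, local_position) tuples with local_position in [0, 1).
    """
    rng = random.Random(seed)
    lengths = [math.dist(p, q) for p, q in edges]
    # Equation (10): edge j is chosen with probability |e_j| / sum_k |e_k|.
    chosen = rng.choices(range(len(edges)), weights=lengths, k=n_syn)
    # The position along the chosen edge is uniform on the unit interval.
    return [(j, rng.random()) for j in chosen]
```

Weighting by edge length is what makes the resulting synapse density uniform per unit of dendritic length instead of uniform per edge.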

#### 2.2.6. Discretization and Solution

We use a first-order (vertex-centered) Finite Volume (FV) scheme. This type of discretization method is well-suited for any type of problem resulting from a conservation law. In a FV scheme, one typically has a conservation formulation like the following:

$$\frac{\partial \rho}{\partial t} = -\text{div}\vec{j} \qquad \text{on the domain } \Omega,\tag{11}$$

where ρ is the density of a conserved quantity and $\vec{j}$ is a flux density of the same quantity. In our case, ρ represents the charge density for the voltage equation and the ionic concentration for the species equations; the flux densities are given by the electric current density and the ionic flux density, respectively. The conservation equation is then transformed into a system of ordinary (i.e., non-differential) equations by partitioning the domain on which the equation holds into so-called control volumes (in our case, those are exactly the compartments as defined above),

$$
\Omega = \bigcup_i \Omega_i,
$$

then by integrating Equation (11) on each control volume (thus ensuring local conservation)

$$\int_{\Omega_i} \frac{\partial \rho}{\partial t} \, dx = \int_{\Omega_i} -\text{div}\,\vec{j} \, dx \left( = -\int_{\partial \Omega_i} \vec{j} \cdot \vec{n}_i \, dS \right) \quad \forall i,$$

Frontiers in Neuroanatomy | www.frontiersin.org February 2016 | Volume 10 | Article 8 |

and finally by assuming the unknown function to be part of some finite-dimensional space (in our case: piecewise linear) in order to be able to represent it by a finite number of unknowns which can be used to express the integrals explicitly in a system of ordinary equations,

$$
\rho = \sum_{k} \lambda_k \rho_k,
$$

where {ρ<sub>k</sub>} are a known basis of the finite-dimensional function space, while {λ<sub>k</sub>} are the coefficients in the corresponding representation of ρ and, at the same time, the unknowns of the resulting system of equations.

Time discretization is achieved by an Euler scheme, backwards with respect to axial fluxes and forward with respect to radial fluxes. The latter treatment results in a step size requirement for the time integration, the numerically well-known Courant-Friedrichs-Lewy (CFL) condition (Courant et al., 1928). In this particular case this condition states: The more trans-membrane flux there is, the smaller the time step has to be chosen. If the requirement is not met (i.e., if the time step size is chosen too big) the solution will "explode," meaning that it will tend to infinity very rapidly. In order to prevent such instability, we calculate and use an estimate for the allowed step size. Thus, our time step is neither too big ("explosion") nor too small (inefficiency).

Discretization is performed using the numerical framework UG 4 (Vogel et al., 2013). It is written in C++ and simulations can be set up and run using the widespread scripting language Lua, which makes this framework easy to use without learning a highly specialized language of its own.

Solution of the symmetric system of linear equations emerging from the discretization is also done within the UG 4 framework. The tree structure of neurons allows for an efficient usage (i.e., with linear runtime complexity in terms of the number of unknowns) of a direct solver if the unknowns are numbered in such a way that, in each row of the matrix, there is at most one non-zero entry to the right of the diagonal. We use a Cuthill-McKee (Cuthill and McKee, 1969) ordering to guarantee this. We solve by calculating the LU decomposition in a sparse matrix format.
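Why such a numbering yields linear runtime can be seen from a small sketch. It is not the UG 4 solver: it assumes the ordering is already given as a hypothetical `parent` array with `parent[i] > i` for every vertex except the root (numbered last), stores one off-diagonal coupling per vertex, and performs the elimination without any fill-in.

```python
def solve_tree_system(parent, diag, off, rhs):
    """Solve A x = rhs for a symmetric matrix with tree sparsity in O(n).

    parent[i] > i is the only neighbor of i with a larger index
    (parent[root] == -1, root numbered last); diag[i] = A[i][i] and
    off[i] = A[i][parent[i]] = A[parent[i]][i].
    """
    n = len(diag)
    d = list(diag)
    r = list(rhs)
    # Forward elimination: fold every vertex into its parent. Children have
    # smaller indices, so they are already folded in when i is reached.
    for i in range(n):
        p = parent[i]
        if p < 0:
            continue
        factor = off[i] / d[i]
        d[p] -= factor * off[i]
        r[p] -= factor * r[i]
    # Back substitution from the root down to the leaves.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        p = parent[i]
        coupling = off[i] * x[p] if p >= 0 else 0.0
        x[i] = (r[i] - coupling) / d[i]
    return x
```

Each vertex is visited once in each sweep, so the cost grows linearly with the number of unknowns, and no entries outside the original sparsity pattern are created.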

#### 2.2.7. Parallelization

As UG 4 comes with full MPI support for parallel calculations, the inevitable usage of large-scale computer facilities for the simulation of large networks is straightforward. Partitioning of the domain can be performed using METIS 5.0 (Karypis and Kumar, 1998) and can be achieved on two levels:

In large networks, whole neurons can be assigned to the processors (as described for NEURON in Migliore et al., 2006), resulting in an "embarrassing" parallelism, since there is no direct coupling between the neurons if synaptic events triggered on the presynaptic side in one time step are taken into account only in the next time step on the postsynaptic side. If whole neurons can be distributed in such a way that the processors' workloads are well balanced, this will be the preferred way of parallelizing, as the solution of the problem works exactly like in the serial case and communication is only needed at active synapses.

On a second parallelization level, it is also possible to cut neurons and assign their parts to different processors. The process of solving the system of equations is a little bit more involved then. Assuming the system to be solved on a processor is

Ax = b,

then the iterative solving process on each processor is defined by the following pseudo-code:

x<sub>0</sub> = solution from the preceding time step

d = d<sub>0</sub> = b − Ax<sub>0</sub> ("defect" vector)

**while** |d| > |d<sub>0</sub>| · reductionFactor on any processor **do**

c = A<sup>−1</sup>d (calculate correction)

Sum up (over all processors) the corrections in all cutting points and store back in c.

x = x + c (update solution)

d = b − Ax (update defect)

**end while**

In order for this to work, the process-wise matrices A need to be stored "additively," i.e., the entries of the global system matrix must be equal to the sum of the corresponding entries in the process-wise matrices (where existent).

It usually takes about five to fifteen iterations until convergence is achieved, depending on how many neurons are cut and at which locations. Of course, in the case where no neuron is cut by the distribution of the network, the iteration will converge in one step. The gain in computation time from parallelizing on this level is not as big as from distributing whole neurons, obviously—however, it can still provide some speedup as it is not solving the system which takes the most time, but setting it up in the first place.

### 2.3. Simulation Workflow

The efficiency of simulating large and complex systems in neuroscience strongly depends on the scaling properties of code on high performance computers (Section 2.2.7). Additional aspects when looking at efficiency are the time invested for setting up a model, the computational tools, compiling and visualizing data and finally accessibility to an extendible code basis.

The simulation toolbox NeuroBox focusses on these aspects by allowing users to compile visual or script-based workflows. Workflows can define models and numerical tools and can include third-party tools, such as ImageJ (Schneider et al., 2012). The multi-level design, founded on the multi-physics engine UG 4 (Vogel et al., 2013) and the Visual Reflection Library (VRL, Hoffer et al., 2013), allows non-experts intuitive access to advanced numerical methods for solving anatomically detailed biophysical models. NeuroBox is an open-source project hosted on GitHub and is conceived as a modular and extendible C++ framework, where new biological components such as ion channels, receptors, synapse types etc. can be added manually or through an NMODL importer. This section briefly introduces script-based and visual workflow design and examples of the extendibility of NeuroBox, which as a platform is capable of hosting large multi-domain workflows.

#### 2.3.1. Using Lua Scripts

The complete process of setting up and solving a problem in parallel is handled internally by UG 4. In order to use its functionality, we developed our code as a UG 4 plugin and compile it against the UG 4 libraries. We register our classes and functions at the UG 4 registry (this is done in the C++ code) in order to make them available at the Lua script level, where a simulation can then be formulated using the registered functionality (in addition to any valid Lua command; see **Figure 4**). A schematic representation of a typical simulation workflow is shown in **Figure 5**; an example script with extensive comments is provided in Listing 1 in the Supplementary Material.

registry and compiling makes the channel available for usage in simulations defined either by a Lua script or by a graphical workflow representation using VRL-Studio.

FIGURE 5 | Illustration of the simulation workflow. After creation of a neuronal (network) morphology, the system of linear equations emerging from the cable equation is assembled by the central class CableEquation; synapse handling (i.e., activation, calculation of fluxes, parallel coordination) is taken care of by the class SynapseHandler, while all trans-membrane fluxes are handled by individual classes which all derive from a common interface known to the CableEquation class. The system is solved using UG4 solvers and parallelism.

#### 2.3.2. Using Graphical Workflows

We take advantage of the open-source software VRL-Studio (Hoffer et al., 2013) to represent simulation workflows graphically. Each class and function registered in the UG 4 registry can be represented in VRL-Studio. This allows any user to put together a simulation by dragging and dropping the graphical representations of involved objects (like instances of the cable equation discretization, the channel and pump mechanisms or the synapse handler) and adding calls to their methods with only a few clicks. Scripts are not necessary but possible: VRL-Studio combines textual and visual programming in a single interactive development environment. Since script-based development has advantages for some tasks, VRL-Studio provides access to the UG 4 APIs, and Lua scripts can be integrated into the visual workflow. A Lua editor with advanced autocompletion support allows for intuitive Lua-based development. Even more important is the fact that VRL-Studio workflows can integrate any Java library, such as ImageJ and JFreeChart. Automatic GUI generation works for these external libraries as well. Users can easily extend existing workflows with custom Groovy scripts, e.g., for pre- and post-processing. Custom scripts are also available as graphical components. Using external libraries in custom scripts is a powerful tool for adding domain-specific knowledge to the NeuroBox platform.

Typically the following steps can be followed to set up a new NeuroBox workflow (a screenshot of a simple graphical simulation workflow created in this way is depicted in **Figure 6**):

1. The first step is the definition of the computational domain (the neuronal morphology) and the unknown functions (membrane potential, ion concentrations) to be computed. This is done by adding an instance of DomainAndFunctionDefinition to the canvas and selecting the grid file as well as names for the unknown functions and subsets of the domain they are supposed to be defined on (subsets defined in the geometry file can be chosen from a list).


#### 2.3.3. Adding Functionality

As there is an abundance of membrane transport mechanisms and even more models trying to describe them, it is hardly possible to implement all of them in advance. In order to support a large pool of available models, we wrote a file converter that will produce C++ code suitable to be compiled with our UG 4 implementation from any model file conforming to NEURON's NMODL description language (Hines and Carnevale, 2000). Of course, membrane transport models can also be implemented directly on the C++ level, implementing the required methods of a pre-defined interface class. This requires writing code for the initialization and updating (typically: evolving some kind of gating variables, expressed in terms of ordinary differential equations) of a model as well as code for the computation of the ion and charge flux through the membrane effectuated by the model. After registration of a new model at the UG 4 registry and compilation of the corresponding code, the model can be used on the Lua script level or on the graphical workflow level in VRL-Studio. The whole process is depicted schematically in **Figure 4**.

window represent a method call with the contents of the panel as parameters. The control flow is defined by the yellow connections, data transfer between objects is marked by gray connections.

### 2.4. Setups for Our Simulations

#### 2.4.1. Synapse Loss Simulations

We conducted in silico experiments investigating the impact of synapse loss in various activation patterns, particularly focussing on the effects it has on the formation of action potentials and the somatic calcium signal. For the simulations we chose a layer 3 pyramidal cell from the rat neocortex reconstructed by Radman et al. (2009), which was well suited to serve as reference cell for further studies as its reconstruction comprised the complete description of soma, dendrites and axon. The corresponding neuronal morphology is publicly available in the SWC file format as part of the NeuroMorpho.org database (Ascoli et al., 2007) under the name 13-L3pyr-77. It was converted to the UGX file format to meet UG 4 format specifications.

Subject to the discrete probability distribution specified in Section 2.2.5, N = 100 distributions of n<sub>syn</sub> = 1000 synapses each were drawn from the sample space S<sub>edge</sub> defined in Equation (9). We simulated synapse loss by successively removing portions of the previously created synapses uniformly from the neuron.
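One simple way to realize successive, uniform removal is to shuffle the synapse list once and keep shrinking prefixes of it. The sketch below illustrates this idea only; the function name and the choice of nested subsets are assumptions, and the actual removal procedure may differ.

```python
import random


def synapse_loss_series(synapses, fractions_removed, seed=None):
    """Return one synapse list per requested loss fraction.

    The synapses are shuffled once; each loss level keeps a prefix of the
    shuffled list, so the removed synapses are a uniform random subset and
    higher loss levels are subsets of lower ones.
    """
    rng = random.Random(seed)
    order = list(synapses)
    rng.shuffle(order)
    series = []
    for fraction in fractions_removed:
        n_keep = len(order) - round(fraction * len(order))
        series.append(order[:n_keep])
    return series
```

Nesting the subsets means that each level of loss extends the previous one, which mimics a progressive process on one and the same synapse distribution.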

Regarding synapse activity, we used a maximal conductance of g<sub>max</sub> = 1.2 nS and a constant rise/decay time of τ = 0.4 ms, representing a fast AMPA receptor channel parameterization (Gabbiani et al., 1994), throughout the simulations. We compared three levels of input pattern synchrony, namely: complete synchrony (σ<sub>onset</sub> = 0), moderate asynchrony (σ<sub>onset</sub> = 5 ms), and high asynchrony (σ<sub>onset</sub> = 10 ms).

A fraction of 0.2–4% of the current through AMPA receptor channels is carried by calcium ions, depending mainly on the exact AMPA subtype (Burnashev et al., 1995; Garaschuk et al., 1996). As we did not consider calcium buffer (calmodulin, calbindin) reactions in our simulations, we reduced this amount to 0.1% in order to (roughly) represent fast binding of free calcium to these buffers. Calcium dynamics were also regulated by N-type voltage-dependent calcium channels modeled according to Borg-Graham (1999) and NCX and PMCA pump mechanisms (first-order, second-order Hill-type model, resp.). A leakage term was added to ensure zero-flux for the equilibrium state.

#### 2.4.2. Network Simulations

For the simulations in Section 3.2.2, we used NeuGen to create five neocortex geometries composed of 3500 L2/3 pyramidal cells, 3500 L4 spiny stellate cells, and 1500 L5A and L5B pyramidal cells each, whose somata were contained in a box with extensions of about 0.5 mm × 0.5 mm × 1 mm (length × width × depth), resulting in a cell density of the same order of magnitude as reported by Rockel et al. (1980). In each of the five geometries, NeuGen distributed an average of 30 primary synapses per L4 spiny stellate cell and an average of 25 per L5B pyramidal cell (cf. Constantinople and Bruno, 2013) for thalamic input. Interconnecting synapses were created wherever axon and dendrite from compatible neuron types came close enough, with the critical distance dist\_synapse (cf. Section 2.1) being 1 µm for the first network, 2 µm for the second and so on. The numbers of synapses thus created show a cubic dependence on the critical creation distance (cf. **Table 2**), which is to be expected, as the sphere around any dendritic point within which axonal points eligible for connection through a synapse are located grows cubically in volume with increasing radius.
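As a rough worked check of the two quantitative statements above, reading the cell counts as 1500 each for L5A and L5B (i.e., 10,000 cells in total) and idealizing the density of axonal points as homogeneous on the scale of the critical distance d:

$$\frac{10{,}000 \text{ cells}}{0.5 \times 0.5 \times 1 \text{ mm}^3} = 4 \times 10^{4} \text{ cells/mm}^3, \qquad N_{\text{syn}}(d) \propto \frac{4}{3}\pi d^3 \;\Rightarrow\; \frac{N_{\text{syn}}(2d)}{N_{\text{syn}}(d)} = 8, \quad \frac{N_{\text{syn}}(3d)}{N_{\text{syn}}(d)} = 27.$$

Under this idealization, doubling the critical distance would be expected to increase the synapse count roughly eightfold.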

Axonal, dendritic and somatic membranes contained classical Hodgkin-Huxley-type sodium and potassium channels. Their flux density is described by

$$i_{hh} = c(T) \left( g_K n^4 (V - E_K) + g_{Na} m^3 h \left( V - E_{Na} \right) \right), \tag{12}$$

$$\frac{\partial n}{\partial t} = c(T) \left( \alpha_n(V) \left( 1 - n \right) - \beta_n(V) \, n \right), \tag{13}$$

$$\frac{\partial m}{\partial t} = c(T) \left( \alpha_m(V) \left( 1 - m \right) - \beta_m(V) \, m \right), \tag{14}$$

$$\frac{\partial h}{\partial t} = c(T) \left( \alpha_h(V) \left( 1 - h \right) - \beta_h(V) \, h \right), \tag{15}$$

where c(T) is a temperature-dependent constant with a value of about 3.2 at 37 °C (roughly taken from Collins and Rojas, 1982; Tiwari and Sikdar, 1999); g<sub>K</sub>, g<sub>Na</sub> are (location-specific) conductance constants; E<sub>K</sub> and E<sub>Na</sub> are Nernst potentials; and the rate functions α and β are taken from the original Hodgkin and Huxley publication (Hodgkin and Huxley, 1952).
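The structure of Equations (12)–(15) can be sketched in a few lines. The rate values α and β are passed in as plain numbers here, so the sketch does not reproduce the Hodgkin and Huxley rate functions themselves; the function names and the forward Euler update (chosen to mirror the explicit treatment of inner states described in Section 2.2.2) are assumptions of this illustration.

```python
def gate_derivative(x, alpha, beta, c_T):
    """Right-hand side shared by Equations (13)-(15) for a gate x in {n, m, h}."""
    return c_T * (alpha * (1.0 - x) - beta * x)


def euler_gate_step(x, alpha, beta, c_T, dt):
    """Advance a gating variable by one explicit (forward) Euler step."""
    return x + dt * gate_derivative(x, alpha, beta, c_T)


def hh_flux_density(V, n, m, h, g_K, g_Na, E_K, E_Na, c_T):
    """Flux density of Equation (12)."""
    return c_T * (g_K * n ** 4 * (V - E_K) + g_Na * m ** 3 * h * (V - E_Na))
```

A gate sitting at its steady state α/(α + β) has zero derivative, so an Euler step leaves it unchanged; this is a convenient sanity check when initializing channels at rest.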

We used a leakage flux density to achieve zero net flux at resting potential:

$$i_l = c(T) \, g_l \left( V - E_l \right), \tag{16}$$

where g<sub>l</sub> is the leakage flux conductance and E<sub>l</sub> an (artificial) reversal potential calibrated to ensure zero membrane net flux at resting potential.
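The calibration of the reversal potential can be written out explicitly. As a sketch, assume that the Hodgkin-Huxley flux of Equation (12) and the leakage flux of Equation (16) are the only membrane fluxes at rest (any further mechanism would add a term), and write V<sub>rest</sub> for the resting potential and n<sub>∞</sub>, m<sub>∞</sub>, h<sub>∞</sub> for the steady-state gating values at V<sub>rest</sub> (notation introduced here). The zero-net-flux condition then reads

$$c(T) \, g_l \left( V_{\text{rest}} - E_l \right) + c(T) \left( g_K n_\infty^4 \left( V_{\text{rest}} - E_K \right) + g_{Na} m_\infty^3 h_\infty \left( V_{\text{rest}} - E_{Na} \right) \right) = 0,$$

which gives

$$E_l = V_{\text{rest}} + \frac{g_K n_\infty^4 \left( V_{\text{rest}} - E_K \right) + g_{Na} m_\infty^3 h_\infty \left( V_{\text{rest}} - E_{Na} \right)}{g_l}.$$

The temperature factor c(T) cancels, so under this assumption the calibrated reversal potential does not depend on it.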




TABLE 3 | Base synaptic conductance values for connections between different types in units of nS.


*Connections that are not created by NeuGen are marked by a dash.*

For initialization, the membrane potential was set globally to the resting potential of −0.065 V, and the voltage-dependent potassium and sodium channels were set to their resting states. At the beginning of the simulation, thalamic input synapses were activated using an alpha function [cf. Equation (3)] with t<sub>onset</sub> and τ drawn from normal distributions with (µ<sub>onset</sub>, σ<sub>onset</sub>) = (5 ms, 2.5 ms) and (µ<sub>τ</sub>, σ<sub>τ</sub>) = (2.5 ms, 0.1 ms), respectively, and g<sub>max</sub> = 1.2 nS.
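The thalamic activation can be sketched as follows. Equation (3) is not reproduced in this section, so the sketch assumes the common normalization of the alpha function that reaches g<sub>max</sub> exactly one time constant τ after onset. The function names are hypothetical, and the treatment of negative onset draws is not specified in the text (here such a synapse is simply already active at t = 0).

```python
import math
import random


def alpha_conductance(t, t_onset, tau, g_max=1.2e-9):
    """Alpha-function conductance in siemens; peaks at g_max when t - t_onset == tau.

    Assumed form: g_max * (s / tau) * exp(1 - s / tau) for s = t - t_onset >= 0.
    """
    s = t - t_onset
    if s < 0.0:
        return 0.0
    return g_max * (s / tau) * math.exp(1.0 - s / tau)


def draw_thalamic_parameters(rng, mu_onset=5e-3, sigma_onset=2.5e-3,
                             mu_tau=2.5e-3, sigma_tau=1e-4):
    """Draw (t_onset, tau) in seconds from the normal distributions quoted in the text."""
    t_onset = rng.gauss(mu_onset, sigma_onset)
    tau = rng.gauss(mu_tau, sigma_tau)
    return t_onset, tau


rng = random.Random(1)  # fixed seed so that the same input pattern can be reused
t_onset, tau = draw_thalamic_parameters(rng)
g_at_peak = alpha_conductance(t_onset + tau, t_onset, tau)  # approximately 1.2 nS
```

Fixing the seed mirrors the requirement that all networks receive the same thalamic input pattern.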

Synapses between cells of the network were exclusively excitatory glutamatergic in nature and modeled as described in Section 2.2.3 using a parameterization which represents a fast AMPA receptor channel (Gabbiani et al., 1994). The maximal conductance parameter of synapse S with a presynaptic neuron of type T<sub>1</sub> and a postsynaptic neuron of type T<sub>2</sub> is calculated by NeuGen according to the formula

$$g_{\max}(S) = \left(1 + 0.001 \cdot \text{dsd}(S)\right) \cdot g_s(T_1, T_2), \tag{17}$$

with dsd(S) being the distance of the postsynaptic site to the soma in µm, and g<sub>s</sub>(T<sub>1</sub>, T<sub>2</sub>) a type-specific base conductance, the values of which are summarized in **Table 3**. All other synaptic parameters were the same for each synapse. No delay through neurotransmitter release and diffusion was considered.
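As a brief illustration of Equation (17), the following sketch scales a base conductance with the distance to the soma. The table entry and the type labels are hypothetical placeholders; the actual base conductances are those of Table 3.

```python
# Hypothetical base conductance in nS for one (presynaptic, postsynaptic) type pair.
# The real values are listed in Table 3 and are not reproduced here.
BASE_CONDUCTANCE_NS = {("L4_stellate", "L23_pyramidal"): 0.5}


def max_conductance_ns(dsd_um, pre_type, post_type, table=BASE_CONDUCTANCE_NS):
    """Equation (17): base conductance increased by 0.1 % per micrometre from the soma."""
    return (1.0 + 0.001 * dsd_um) * table[(pre_type, post_type)]


# A synapse 200 µm from the soma is 20 % stronger than one located at the soma.
g_distal = max_conductance_ns(200.0, "L4_stellate", "L23_pyramidal")
```

The distance factor thus partially compensates for the stronger attenuation of distal inputs on their way to the soma.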

All parameter values for the network simulations are summarized in **Table 4**.

Simulations were performed on 160 processors for a simulated time period of 20 ms and took about two hours. Parallel scaling results for this type of problem are presented in Section 3.2.1.

### 3. RESULTS

### 3.1. Influence of Synapse Loss on Formation of Action Potentials and Somatic Calcium Signal

The human brain is one of the most complex structures known in the universe. It consists of nearly 100 billion nerve cells, each of which is entangled in a dense and constantly adapting network of massive information exchange. On average, a single neuron is linked with 10,000 to 100,000 other neuronal or non-neuronal cells via synapses (Cragg, 1975). Brain function relies essentially on those highly dynamic synaptic connections.

In this part of our study, we investigate the three-dimensional spatial distribution and the temporal activity pattern of glutamatergic synapses in neurons of the cerebral cortex. Both are key factors for the integrative properties of the cell. For this purpose, we have developed a tool for automatic placement of synaptic functionality onto neuron morphologies. We apply this tool to systematically assess the impact of activation patterns on the signal processing in single neurons. In particular, we perform in silico experiments where we successively knock out synapses at dendritic locations. We thus investigate situations where synapse loss contributes to pathological states, e.g., Alzheimer's disease (Scheff et al., 2006). At the same time, we address the question under which circumstances the neuron will sustain its integrative capability. More precisely, how do impulse conductance and especially the initiation of action potentials at the axon hillock depend on the number of input synapses and their signal synchrony? Does a higher input signal synchrony sustain action potential initiation during increasing synapse loss? The degree of synchrony is defined by the size of the standard deviation from a given mean value. In our experiments, we vary the standard deviation σ<sub>onset</sub> of the start time of synaptic excitations.

TABLE 4 | Parameters for the large-scale network simulation.

In the following sections, we present the results of a series of in silico experiments on a layer 3 pyramidal cell from the rat neocortex (cf. Section 2.4.1), in which we compare three levels of input pattern synchrony, namely: complete synchrony (σ<sub>onset</sub> = 0), moderate asynchrony (σ<sub>onset</sub> = 5 ms), and high asynchrony (σ<sub>onset</sub> = 10 ms). We randomly distributed 1000 excitatory synapses on the geometry in 100 sample configurations. In each of these 100 configurations, we gradually increased synapse loss and analyzed the neuron's capability of creating action potentials, and at the same time, recorded corresponding calcium levels within the soma.
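The experimental protocol can be sketched as follows. This is only an illustration of the sampling logic under stated assumptions: the function names are hypothetical, synapses are identified by index rather than by position on the morphology, and the knock-out is assumed to be nested (synapses removed at a lower loss level stay removed at higher levels), which the text does not state explicitly.

```python
import random


def draw_onsets(n_synapses, mu_onset_ms, sigma_onset_ms, rng):
    """One activation onset (ms) per synapse; sigma_onset_ms = 0 gives complete synchrony."""
    return [rng.gauss(mu_onset_ms, sigma_onset_ms) for _ in range(n_synapses)]


def removal_order(n_synapses, rng):
    """Random permutation of synapse indices; synapses at the end are knocked out first."""
    order = list(range(n_synapses))
    rng.shuffle(order)
    return order


def survivors(order, loss_fraction):
    """Indices of the synapses that remain after the given fraction has been lost."""
    n_keep = round(len(order) * (1.0 - loss_fraction))
    return order[:n_keep]


rng = random.Random(42)                      # one seed per sample configuration
onsets = draw_onsets(1000, 15.0, 5.0, rng)   # moderate asynchrony
order = removal_order(1000, rng)
active = survivors(order, 0.75)              # 250 synapses remain at 75 % loss
active_onsets = [onsets[i] for i in active]
```

Repeating this for 100 seeds and for each synchrony level would reproduce the structure of the sample configurations described above; the neuron simulation itself is not part of the sketch.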

#### 3.1.1. Generation of Action Potentials

Both the moderate (µ<sub>onset</sub> = 15 ms, σ<sub>onset</sub> = 5 ms) and the high (µ<sub>onset</sub> = 30 ms, σ<sub>onset</sub> = 10 ms) asynchrony cases show a strong action potential spike train response to the initial synapse distribution. The number of spikes ranges from two to three in the moderate asynchrony case and from one to three in the high asynchrony case (**Figure 7**). The synchronous setup, however, produced exactly one action potential for the initial distribution of 1000 synapses in all samples. Only the cation influx at synapses that newly become active over time in the asynchronous cases can induce the repetitive spiking, whereas cation influx through all synapses is completely compensated by potassium efflux during hyperpolarization in the synchronous case. The number of action potentials decreased with increasing synapse loss in both asynchronous cases until complete signal breakdown (in at least 90% of the sample patterns) at 75% synapse loss in the moderately and at 60% in the highly asynchronous case. In contrast, synchronous activation patterns sustained generation of an action potential up to a loss of about 97.7% (corresponding to 23 remaining synapses).

#### 3.1.2. Calcium Signaling

In all setups, synchronous as well as asynchronous, calcium levels at the soma exclusively depend on whether or not an action potential is elicited. We see step increases in the calcium concentration with every action potential. Calcium diffusion, however, is only able to transport calcium within a very local vicinity of its original point of entry at active synapses. After termination of electrical signaling, calcium levels exponentially decay to equilibrium levels due to the activity of NCX and PMCA pumps. This shows a direct correspondence between synapse loss and somatic calcium levels through the number of action potentials elicited in a neuron. Sample time courses of membrane potential and calcium concentration at the soma (from the moderately asynchronous setting) for various levels of excitatory synapse loss are depicted in **Figure 8**.

### 3.2. Large-Scale Network Simulations with Detailed Anatomy

#### 3.2.1. Parallel Scaling

In order to test the parallel scaling properties of our network simulation implementation, we created six neocortical geometries containing 320, 640, 1280, 2560, 5120, and 10,240 neurons, respectively. The average number of compartments per neuron in the six geometries ranged from 574 to 586. We defined a random thalamic activation pattern, where synapse activation times and durations for the thalamic input synapses created by NeuGen were drawn from the same normal distribution for all geometries. We then performed one thousand time steps using 32, 64, 128, 256, 512, and 1024 processors of the Jülich supercomputer JUQUEEN on the geometries, respectively, so that in each simulation a processor was assigned approximately the same amount of work ("weak scaling"). We profiled the execution of the program to obtain the amount of time spent in the main components of the simulation. **Table 5** shows the results. Leaving out the loading of the geometry into memory and its distribution to the involved processors (both are inherently serial), we achieve good scaling. The times spent for preparing the channel mechanisms and synapses before a time step, for assembling, for factorizing the matrix and applying the inverse remain approximately constant. As a typical network simulation will have more than 1000 time steps, the loading and distribution of the domain (which is only performed once, i.e., at the start of the parallel simulation) will have much less of an impact on scaling behavior than in this particular study. We thus demonstrated that our code is suitable to be used efficiently for simulations of large-scale networks of neurons.
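The weak-scaling setup can be checked with a few lines. The neuron and processor counts are those quoted above; the efficiency helper is a hypothetical illustration of how timings such as those in Table 5 are commonly summarized, and no timing values from the study are assumed here.

```python
neurons = [320, 640, 1280, 2560, 5120, 10240]
processors = [32, 64, 128, 256, 512, 1024]


def neurons_per_processor(neurons, processors):
    """Work per processor; constant values indicate a weak-scaling experiment."""
    return [n / p for n, p in zip(neurons, processors)]


def weak_scaling_efficiency(times):
    """Efficiency T(p_0) / T(p) relative to the smallest run; 1.0 is ideal scaling."""
    return [times[0] / t for t in times]


load = neurons_per_processor(neurons, processors)  # 10 neurons per processor throughout
```

With roughly 580 compartments per neuron, each processor therefore handles on the order of 5800 compartments in every run.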

#### 3.2.2. Network Connectivity Affects Network Activity

When a neuronal network is created by NeuGen, synapses connect presynaptic axons to postsynaptic dendrites (if the involved neuron types allow this) where axon and dendrite are sufficiently close to each other (cf. Wolf et al., 2013). The maximal distance dist\_synapse for which synapses are placed can be chosen by the user. This criterion, albeit not representative of an actual model of synaptogenesis (NeuGen does not reproduce neuronal growth, but only a fully grown state), might be considered as a parameterization of the agility of filopodia and growth cones during synaptogenesis (Munno and Syed, 2003). In any case, it has a direct effect on the connectivity properties of the network.

FIGURE 8 | Courses of the membrane potential in mV (row 1) and calcium concentrations in mM (row 2) measured at the soma. 400 (column 1), 300 (column 2), 200 (column 3) synaptic inputs asynchronously activated at µ<sub>onset</sub> = 15 ms with standard deviation σ<sub>onset</sub> = 5 ms.

#### TABLE 5 | Weak parallel scaling results obtained by code profiling.

*Timings show almost ideal scaling for setting up the system of equations as well as solving it. Loading and distributing the domain to the involved processors is inherently serial as it is done by a single processor in every simulation; however, the time percentage of the two tasks is much lower in typical simulations as they run much longer and loading and distributing is only performed once at the beginning. The remaining variance of the simulation runs is mainly due to differences in the quality of domain distribution. All time values in units of seconds.*

We conducted simulations on five neocortical networks, each composed of the same 10,000 neurons, but with the connection distance ranging from 1 µm to 5 µm in steps of 1 µm. This resulted in networks with increasing numbers of synapses and connected neurons (**Table 2**). As previously described in Wanner (2007), we initialized network activity by depolarizing L4 spiny stellate cells via primary thalamic input synapses; activity then spread out through the cortical layers due to interconnecting synapses. Analysis of the time courses of the membrane potential at the somata in conjunction with activity data from the interconnecting synapses (**Figure 9**) reveals a significant impact of the connectedness on the overall qualitative (and quantitative) behavior following the same thalamic input pattern in all five simulations.

In the least connected network, the number of synapses connecting thalamically activated L4 spiny stellate cells to L2/3 pyramidal cells (only 7.5 per L2/3 pyramidal cell on average) does not suffice to lead to the depolarization of a single L2/3 cell in the network. Obviously, this means there can be no active synapses connecting L2/3 to L5A and, although there are also synapses connecting L4 to L5A directly, there is no activity in L5A either. While in the network next in synapse number considerable depolarization of layer 2/3 pyramidal neuron somata manifests itself due to a 7.5-fold increase in the average number of active synapses from L4 to L2/3, there is still practically no signal in L5A. Only in the networks created with synapse creation distance parameters ≥ 3 µm are action potentials elicited at the somata of L5A. The same networks exhibit the formation of a second action potential in some of the initially activated L2/3 somata; the two most connected networks also show the occurrence of a second action potential in some of the L5A cells. These second action potentials are the combined effect of (i) charge from previous synaptic inputs that has not yet been cleared and (ii) additional influx at the re-activated synapses. It is noteworthy that somatic activity in both L2/3 and L5A pyramidal cells peaks higher (L2/3: 0.0, 0.56, 0.84, 0.87, 0.87; L5A: 0.0, 0.0, 0.39, 0.78, 0.87) and earlier (L2/3: –, 8.8 ms, 7.9 ms, 7.6 ms, 7.6 ms; L5A: –, –, 11.4 ms, 10.2 ms, 9.7 ms) the more synaptic connections there are in the respective cortical layers.

FIGURE 9 | Simulation results for networks of 10,000 neurons. The columns contain time plots for networks created with a synapse creation distance of 2 µm, 3 µm, 4 µm (from left to right), respectively. Row 1: Relative number of active somata, i.e., *V* ≥ −45 mV, in different layers (L2/3 red, L4 green, L5A blue, L5B orange). Row 2: Number of active synapses at L2/3 pyramidal neurons originating from different layers (L2/3 red, L4 green). Row 3: Number of active synapses at L5A pyramidal neurons originating from different layers (L2/3 red, L4 green, L5A blue). Initial activation of L4 spiny stellate and L5B pyramidal cells by the same thalamic input pattern in all simulations.

The explicit influence of the spatial extension of the neural network can be identified in **Figure 10**: Somatic depolarization and hyperpolarization expand through the layers of L2/3 and L5A pyramidal cells like a wave, activating the neurons in the order imposed by the distance to the respective origin of that activation.

## 4. DISCUSSION

In this paper we presented studies of electrical and biochemical signals in single cells and networks to investigate the interplay between synapse loss and signaling synchrony. A major focus was the anatomically realistic representation of cells and networks, for which a novel simulation toolbox NeuroBox was developed.

The synapse distribution studies on the layer 3 pyramidal cell from the rat neocortex show a significant impact of the activation pattern (in space and time) on the signal conductance capabilities of the cell. Two effects are apparent: (1) The more asynchronous the input signals are, the more spikes can be generated by this input—up to a point where the asynchrony begins to affect the likelihood of generating a single spike. (2) The more synchronous input signals are, the higher the cell's resilience is to synapse loss with regard to its capability of generating action potentials in response to synaptic input.

In the context of the study of synaptic input patterns, we also conducted simulations of calcium dynamics, including Ca<sup>2+</sup> influx through synaptic AMPA-R channels as well as voltage-dependent calcium channels, NCX and PMCA pump mechanisms distributed throughout the membranes of dendrites and soma. Results showed that the somatic calcium concentration, a key factor in the control of gene expression (Hardingham et al., 1997) and thus in the development and survival of cells, is directly coupled to the number of action potentials initiated in the cell, each action potential leading to a step increase in calcium levels. However, we neglected the effects of internal calcium stores as well as a proper treatment of calcium buffers here. In particular, the large amounts of calcium releasable through ryanodine and IP<sub>3</sub> receptor channels in the membrane of the endoplasmic reticulum need to be taken into account in a detailed three-dimensional simulation in order to achieve a more accurate description of calcium signaling, possibly including calcium waves (cf. Berridge, 1998, among others). A method of coupling the one-dimensional simulation of the membrane potential to a detailed three-dimensional simulation of calcium signals has previously been developed by the authors (Grein et al., 2014) and may be applied here.

Using NeuGen for the generation of a neocortical column, we have shown that our implementation of a compartment model for the cable equation and trans-membrane current mechanisms in neural networks is adequate for large-scale applications and scales well with the number of neurons involved. It is reasonable to assume that simulations on even larger neural networks can successfully and efficiently be conducted on high-performance computers with the help of our implementation.

The neural network simulations we performed were very basic in nature. We only considered four of the diverse neuron types present in the neocortex. Unlike, e.g., Anderson et al. (2007), Vierling-Claassen et al. (2010), and Neymotin et al. (2011), we did not take into account inhibitory synapses and their role in regulating cortical signal processing. Unlike the three aforementioned contributions, however, we created neural networks whose spatial resolution (about 500 compartments per neuron in our simulations as compared to 3, 16 and 1 in theirs) allowed for a realistic spatial positioning of synapses. We utilized the simple (yet not unreasonable) distance rule of NeuGen to create synapses instead of putting experimental projection data (as extensively reviewed for excitatory neurons by Feldmeyer, 2012) to good use. Incorporation of experimental findings into the existing framework, however, is not difficult. The addition of inhibitory synapses, for instance, is merely a question of reparameterization in a preprocessing step. All that considered, our network simulations make it possible to examine the impact of intra- and trans-laminar synaptic connections on each level and can therefore serve as a valuable tool to decipher the functional role of detailed anatomy in cortical information processing.

With a focus on accessible workflow control that includes high-performance numerical methods, a modular neuroscientific repository and the option of including third-party tools, we developed the toolbox NeuroBox and used it to perform all simulations in this paper. NeuroBox is an open source project hosted on GitHub with the intent to offer its full functional scope to a broad community. Visual workflow design and control through VRL-Studio makes NeuroBox projects easy to use and share with experts and non-experts alike. This feature is highly beneficial for rapid prototyping and offers an efficient pathway from in silico experiment design to its full implementation.

The possibility to integrate third-party tools, such as ImageJ, anatomical reconstructions (e.g., neuromorpho.org) and the automated import of NMODL models, integrates NeuroBox ideally into ongoing endeavors in the computational neuroscience field. Due to the modular design, this toolbox is easily extendible through various pathways discussed in Section 2 and thus can grow with continued research. As problem sizes typically increase alongside growing high-performance computing power, NeuroBox was built with links to UG 4, a general purpose package for solving partial differential equations. Advanced numerical methods with time and space adaptivity, error estimation and a parallel communication layer advance the possibilities for solving anatomically realistic large-scale network problems.

### AUTHOR CONTRIBUTIONS

Cable equation modeled and implemented by MB and PG. Synapse handling designed and implemented by MB, SG, LR, MS. Neuron and network morphology preparation by SG. Network simulations carried out by MB and PG, synapse loss and calcium simulations by MB, PG, LR, and MS. GQ co-designed all methods and experiments and analyzed data. MB, SG, GQ, and MS wrote the manuscript.

### FUNDING

The work presented in this paper was funded by the BMBF (Bernstein Center for Computational Neuroscience Heidelberg/Mannheim) and the program for US-German collaborative research in Computational Neuroscience (01GQ1410B).

### ACKNOWLEDGMENTS

We want to thank Michael Hoffer for the helpful discussions and support around all VRL-based developments. Visualization of neuronal network grid (**Figure 2**) realized with ProMesh (Reiter, 2012, 2014). Plots generated with GnuPlot (Williams et al., 2010), visualization of solutions on grid (**Figure 10**) created with ParaView (Ahrens et al., 2005).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnana.2016.00008


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Breit, Stepniewski, Grein, Gottmann, Reinhardt and Queisser. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Real-World-Time Simulation of Memory Consolidation in a Large-Scale Cerebellar Model

Masato Gosui<sup>1</sup> and Tadashi Yamazaki<sup>1,2,3</sup>\*

*<sup>1</sup> Department of Communication Engineering and Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan, <sup>2</sup> Neuroinformatics Japan Center, RIKEN Brain Science Institute, Saitama, Japan, <sup>3</sup> Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Ibaraki, Japan*

We report development of a large-scale spiking network model of the cerebellum composed of more than 1 million neurons. The model is implemented on graphics processing units (GPUs), which are dedicated hardware for parallel computing. Using 4 GPUs simultaneously, we achieve realtime simulation, in which computer simulation of cerebellar activity for 1 s completes within 1 s in the real-world time, with temporal resolution of 1 ms. This allows us to carry out a very long-term computer simulation of cerebellar activity in a practical time with millisecond temporal resolution. Using the model, we carry out computer simulation of long-term gain adaptation of optokinetic response (OKR) eye movements for 5 days aimed to study the neural mechanisms of posttraining memory consolidation. The simulation results are consistent with animal experiments and our theory of posttraining memory consolidation. These results suggest that realtime computing provides a useful means to study a very slow neural process such as memory consolidation in the brain.

#### Edited by:

*Arjen Van Ooyen, VU University Amsterdam, Netherlands*

#### Reviewed by:

*Andreas Knoblauch, Albstadt-Sigmaringen University, Germany Christian Tetzlaff, Max Planck Institute for Dynamics and Self-Organization, Germany*

#### \*Correspondence:

*Tadashi Yamazaki fina15@neuralgorithm.org*

Received: *04 September 2015* Accepted: *18 February 2016* Published: *03 March 2016*

#### Citation:

*Gosui M and Yamazaki T (2016) Real-World-Time Simulation of Memory Consolidation in a Large-Scale Cerebellar Model. Front. Neuroanat. 10:21. doi: 10.3389/fnana.2016.00021*

Keywords: cerebellum, model, memory consolidation, optokinetic response, realtime simulation, graphics processing unit

### 1. INTRODUCTION

Memory formation has two stages: memory acquisition and memory consolidation (Dudai, 2004). A single session of training forms a type of memory which is fragile and persists only for a short period of minutes to hours. This phase is called memory acquisition. After the training, the learned memory, a short-term memory, decays spontaneously and quickly within a day. Meanwhile, repeated training with sufficient rest between training sessions gradually forms another type of memory, a long-term memory, which is robust and persists for days and weeks. This phase is called memory consolidation. Memory consolidation occurs after training but not during training. That is, when we take a rest after training, the brain still continues working to consolidate the learned memory. This posttraining memory consolidation is thought to be the basis of the spacing effect (Ebbinghaus, 1885), in which massed training is inferior to repeated training in forming a robust long-term memory, even if the total training time is equal. Therefore, it is important to study how the brain works after training as well as during training to elucidate the memory mechanisms and our behaviors.

In cerebellar motor learning, both memory formation and consolidation occur within the cerebellum. In gain adaptation of vestibulo-ocular reflex (VOR) and optokinetic response (OKR), parallel fiber-Purkinje cell (PF-PC) synapses in the cerebellar cortex store short-term memory, whereas mossy fiber-vestibular nuclear cell (MF-VN) synapses in the brain stem store long-term memory (Kassardjian et al., 2005; Shutoh et al., 2006). OKR is an oculomotor reflex in which the eye moves in the same direction as the movement of the visual world to reduce the slip of the retinal image. In OKR adaptation, the amplitude of eye movement, called gain, changes by training. By a single 1-h training, the gain increases quickly, which corresponds to memory acquisition. After the training, the gain decreases naturally to the original level within a day. By repeating the 1-h training every day, the gain increases gradually throughout 1 week (Shutoh et al., 2006), which represents memory consolidation. Moreover, injection of muscimol, a γ-aminobutyric acid (GABA) receptor agonist, into the cerebellar cortex immediately after the training disrupts memory consolidation (Okamoto et al., 2011), indicating that training alone is not sufficient for memory consolidation. Accumulating evidence suggests that posttraining memory consolidation of OKR gain takes the following steps (Shutoh et al., 2006). By a single 1-h training, PF-PC synapses undergo LTD induced by conjunctive activation of PFs and the CF innervating the same PCs (Ito, 1989), and thereby the OKR gain increases. After the training, PF-PC synapses gradually recover from the LTD, which erases the memory of learned OKR gain in the cortex. On the other hand, because inhibition exerted by PCs on the VN is weakened due to the LTD, the VN is depolarized tonically. This depolarization, combined with presynaptic MF activation, induces LTP at MF-VN synapses (McElvain et al., 2010; Person and Raman, 2010), thereby forming the memory of OKR gain in the nucleus. In this way, while the cortical memory is erased gradually after the training, the nuclear memory forms simultaneously as a long-term memory, as if the learned cortical memory is transferred to the nucleus and consolidated there.

We have proposed a theory of the cerebellar posttraining memory consolidation in OKR adaptation (Yamazaki et al., 2015). The theory captures an essence of the macroscopic dynamics of synaptic mechanisms underlying the posttraining memory consolidation. On the other hand, the theory does not provide insights into the mesoscopic cellular/synaptic dynamics of the posttraining memory consolidation. For example, the theory does not tell us about spatiotemporal spike patterns of individual neurons. To study the detailed cellular/synaptic dynamics, an elaborate, realistic cerebellar model is necessary. A problem of such elaborate models, however, is that they require too much computational time. Typically, computer simulation of large-scale spiking network models is 10–100 times slower than the real-world time (Nageswaran et al., 2009). This means that if we wanted to carry out a computer simulation of memory consolidation for 1 week, and the computer simulation was 100 times slower than real time, the simulation would take about 2 years in total to complete. This is practically impossible.
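The estimate of about 2 years follows directly from the slowdown factor:

$$7\ \text{days} \times 100 = 700\ \text{days} \approx 1.9\ \text{years}.$$

Even at the lower end of the quoted range, a 10-fold slowdown would still turn 1 week of simulated time into 70 days of computation.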

In this study, we adopted high-performance computing (HPC) technology to solve these problems. We used graphics processing units (GPUs) to calculate equations of neurons in parallel, which could speed up the numerical simulation drastically. Specifically, we built a very large-scale spiking network model of the cerebellum composed of 1 million neurons, which is a model of 1 mm<sup>3</sup> of cats' cerebellum. Moreover, owing to the parallel computing on GPUs, we were able to conduct the computer simulation fast enough to complete a very long computer simulation in a practical time. Eventually, we achieved realtime simulation, which means that computer simulation of the cerebellar activity for 1 s completes within 1 s in the real-world time (Igarashi et al., 2011; Yamazaki and Igarashi, 2013). This is essential for computer simulation of the cerebellar posttraining memory consolidation, because the memory consolidation takes days or even weeks. Using the present cerebellar model, we performed computer simulation of long-term OKR adaptation of training for 5 days, and obtained qualitatively the same results as experiments (Shutoh et al., 2006) and our previous theoretical model (Yamazaki et al., 2015). We also examined the detailed spike patterns of neurons, which were abstracted and therefore ignored in our theory.

### 2. MATERIALS AND METHODS

### 2.1. Model

Our cerebellar model is built based on 1 mm<sup>3</sup> of the cerebellar corticonuclear microcomplex (**Figure 1**) of cats, which is thought to be a functional module of the cerebellum (Ito, 1984, 2012). The original model had 100,000 granule cells, which is 10 times fewer than in cats' cerebellum (Ito, 1984), and was already reported elsewhere (Yamazaki and Tanaka, 2007; Yamazaki and Nagao, 2012; Yamazaki and Igarashi, 2013). In this study, we extended the previous model as follows. First, the present model includes 1 million granule cells, so that the model includes the same number of neurons as 1 mm<sup>3</sup> of the cats' cerebellum. Second, the present model has synaptic plasticity at mossy fiber-vestibular nuclear cell (MF-VN) synapses, as well as parallel fiber-Purkinje cell (PF-PC) synapses. Except for the number of granule cells and MF-VN synaptic plasticity, the previous and present models are the same. Therefore, we summarize the model specification only briefly below. The details are found in our previous papers (Yamazaki and Tanaka, 2007; Yamazaki and Nagao, 2012; Yamazaki and Igarashi, 2013).

The present model is composed of 1,048,576 (= 1024 × 1024) granule cells, 1024 Golgi cells, 16 PCs, 16 basket cells, 1 inferior olivary cell, and 1 VN, connected according to anatomical data from cats (Yamazaki and Tanaka, 2007). Neurons are modeled as conductance-based integrate-and-fire units (Gerstner et al., 2014):

$$\begin{split} C\frac{du}{dt} = {} & -g_{\text{leak}}(u(t) - E_{\text{leak}}) - g_{\text{ex:AMPA}}(t)(u(t) - E_{\text{ex}}) \\ & -g_{\text{ex:NMDA}}(t)(u(t) - E_{\text{ex}}) - g_{\text{inh:GABA}}(t)(u(t) - E_{\text{inh}}) \\ & -g_{\text{ahp}}(t)(u(t) - E_{\text{ahp}}) + I_{\text{ext}}(t), \end{split} \tag{1}$$

where u(t) is the membrane potential at time t, C is the capacitance, g<sub>leak</sub> and E<sub>leak</sub> are the conductance and reversal potential of the leak current, respectively, g<sub>ex:AMPA</sub>(t), g<sub>ex:NMDA</sub>(t), and g<sub>inh:GABA</sub>(t) are the synaptic conductances of excitatory α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) and N-methyl-D-aspartic acid (NMDA) synapses and of inhibitory GABA synapses, E<sub>ex</sub> and E<sub>inh</sub> are reversal potentials, g<sub>ahp</sub>(t) and E<sub>ahp</sub> are the conductance and reversal potential of after-hyperpolarization, respectively, and I<sub>ext</sub>(t) is an external current. When u(t) exceeds a threshold θ<sub>spike</sub> at time t, the neuron elicits a spike at time t. Cell parameters are taken from electrophysiological data of turtles and rodents (Yamazaki and Tanaka, 2007). The values used in this study are summarized in **Table 1**.

FIGURE 1 | … plastic change as in (A). Arrowheads represent types of synaptic connections (triangle, excitatory; circle, inhibitory). GR, granule cell; GO, Golgi cell; PC, Purkinje cell; BS, basket cell; VN, vestibular nucleus; IO, inferior olive; MF, mossy fiber; CF, climbing fiber; PF, parallel fiber.

Synaptic conductance g<sub>x</sub>(t) for type x is calculated as a convolution of presynaptic spike events with an exponential kernel as

$$g_x(t) = \sum_j w_j \cdot \bar{g}_x \sum_{f \in S_j} \exp_x \left( - \left( t - t^{(f)} \right) \right) \Theta \left( t - t^{(f)} \right), \tag{2}$$

where ḡ<sub>x</sub> is the peak conductance, w<sub>j</sub> is the synaptic weight, which is constant, S<sub>j</sub> is the set of spikes elicited by presynaptic cell j, t<sup>(f)</sup> is the time of the f-th spike, exp<sub>x</sub>(t) is the exponential kernel, and Θ(t) is the Heaviside step function. The exponential kernels used in the present study are summarized in **Table 2**, whereas the synaptic weights are shown in **Table 3**.
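The structure of Equation (2) can be illustrated with a minimal Python sketch. It assumes a single decaying exponential kernel exp(−s/τ) with a made-up time constant; the kernels actually used are those of Table 2, which is not reproduced here, and the function names, weights, and spike times are hypothetical. The sketch compares a direct sum over all past spikes with a step-by-step update (decay, then add newly arrived spikes), which gives the same values on the 1 ms grid.

```python
import math

def conductance_direct(t, spike_trains, weights, g_bar, tau):
    """Evaluate Eq. (2) at time t by summing the kernel over all past spikes.

    Assumes a single-exponential kernel exp(-s / tau); tau is a hypothetical
    value, not one taken from Table 2.
    """
    total = 0.0
    for w_j, spikes in zip(weights, spike_trains):
        for t_f in spikes:
            if t >= t_f:  # Heaviside step: only past (or current) spikes count
                total += w_j * g_bar * math.exp(-(t - t_f) / tau)
    return total

def conductance_recursive(n_steps, dt, spike_trains, weights, g_bar, tau):
    """Same quantity on a time grid: decay by one step, then add new spikes."""
    decay = math.exp(-dt / tau)
    # Summed weight of the spikes arriving at each time step.
    arrivals = [0.0] * n_steps
    for w_j, spikes in zip(weights, spike_trains):
        for t_f in spikes:
            arrivals[int(round(t_f / dt))] += w_j * g_bar
    g, trace = 0.0, []
    for k in range(n_steps):
        g = g * decay + arrivals[k]
        trace.append(g)
    return trace

# Hypothetical example: two presynaptic cells, spike times in ms on the 1 ms grid.
spike_trains = [[2.0, 10.0], [5.0]]
weights = [1.0, 0.5]
trace = conductance_recursive(20, 1.0, spike_trains, weights, g_bar=0.2, tau=5.0)
direct = [conductance_direct(k * 1.0, spike_trains, weights, 0.2, 5.0)
          for k in range(20)]
```

The recursive form needs only one state variable per conductance and no spike history, which is why it is a common choice in time-stepped simulators; whether the original program uses it is not stated in the text.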

The model has two distinct sites of synaptic plasticity. One is the PF-PC synapses, which undergo long-term depression (LTD) by conjunctive activation of granule-cell axons, called parallel fibers (PFs), and a climbing fiber (CF) innervating the same PC (Ito, 1989), and long-term potentiation (LTP) by activation of PFs alone (Lev-Ram et al., 2003). We modeled this bidirectional plasticity as follows:

$$\begin{split} \tau_w \frac{dw_{ij}}{dt} &= -w_{ij}(t) + x_{ij}(t) \\ \tau_x \frac{dx_{ij}}{dt} &= -x_{ij}(t) - c_{\text{LTD}} \sum_{s=0}^{50\,\text{ms}} \text{PF}_{ij}(t-s)\,\text{CF}(t) + c_{\text{LTP}}\,\text{PF}_{ij}(t), \end{split} \tag{3}$$

#### TABLE 1 | Summary of cell parameters.


*GR, granule cell; GO, Golgi cell; PC, Purkinje cell; BS, basket cell; VN, vestibular nuclear neuron; IO, inferior olivary neuron; -, nonexistent.*

where w<sub>ij</sub>(t) is the synaptic weight between PC i and PF j, τ<sub>w</sub> and τ<sub>x</sub> are time constants with τ<sub>w</sub> ≪ τ<sub>x</sub>, x<sub>ij</sub>(t) is an internal variable, c<sub>LTD</sub> and c<sub>LTP</sub> are constants, PF<sub>ij</sub>(t) is 1 if PF j on PC i elicits a spike at time t and 0 otherwise, and CF(t) is 1 if the climbing fiber elicits a spike at time t and 0 otherwise. The term Σ<sub>s=0</sub><sup>50 ms</sup> PF<sub>ij</sub>(t − s)CF(t) means that PFs that elicit spikes 0–50 ms before the climbing fiber elicits a spike undergo LTD. If n spikes travel along a PF during these 50 ms, the weight change becomes n times c<sub>LTD</sub>.

#### TABLE 2 | Summary of synaptic functions.

*Abbreviations as in* Table 1*.*

*Abbreviations as in* Table 1*.*

Transmission delay of PF spikes might be essential for plasticity (Knoblauch et al., 2014). The conduction velocity of PFs has been experimentally estimated as 0.24 m/s (Vranesic et al., 1994). For a 1-mm PF, this gives a maximal transmission delay of about 4.2 ms (1 mm divided by 0.24 m/s), which is negligible compared with the 50-ms time window assumed for LTD. Therefore, we did not model transmission delays of PF spikes. We do not specify the exact biological counterpart of x<sub>ij</sub>. A potential interpretation is the intracellular concentration of kinases involved in the PKC-MAPK positive feedback loop, which plays an essential role in the maintenance of induced LTD (Kuroda et al., 2001). The initial values of w and x were set at 1.0 and 0.0, respectively.
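The input term of the x equation in Equation (3) can be sketched in Python as follows. This is an illustrative sketch only: the constants `c_ltd` and `c_ltp` are made-up values (the actual ones are in Table 4, which is not reproduced here), the function names are hypothetical, and spike trains are represented as lists of 0/1 values on the 1 ms grid. The sketch shows how a 50-ms sliding window makes the LTD contribution scale with the number of PF spikes that precede a CF spike.

```python
from collections import deque

WINDOW_MS = 50  # LTD time window from the text (PF spikes 0-50 ms before a CF spike)

def plasticity_drive(pf_window, pf_now, cf_now, c_ltd, c_ltp):
    """Input term of the x equation in Eq. (3) for one synapse at one 1 ms step.

    pf_window holds the PF spike indicators (0 or 1) for s = 0..50 ms,
    including the current step. c_ltd and c_ltp are hypothetical values.
    """
    n_recent = sum(pf_window)  # number of PF spikes inside the window
    return -c_ltd * n_recent * cf_now + c_ltp * pf_now

def run(pf_spikes, cf_spikes, c_ltd=0.1, c_ltp=0.01):
    """Return the drive term at every step of two 0/1 spike trains (1 ms bins)."""
    window = deque(maxlen=WINDOW_MS + 1)  # s = 0, 1, ..., 50
    drives = []
    for pf_now, cf_now in zip(pf_spikes, cf_spikes):
        window.append(pf_now)
        drives.append(plasticity_drive(window, pf_now, cf_now, c_ltd, c_ltp))
    return drives

# Three PF spikes at 10, 30 and 55 ms, and one CF spike at 60 ms.
pf = [1 if t in (10, 30, 55) else 0 for t in range(100)]
cf = [1 if t == 60 else 0 for t in range(100)]
drives = run(pf, cf)
```

In this example all three PF spikes fall within 50 ms of the CF spike at 60 ms, so the LTD term at that step is three times `c_ltd`, as described in the text, while each isolated PF spike contributes `c_ltp`.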

The other plasticity site is the MF-VN synapses, which undergo bidirectional plasticity by a modified Hebbian mechanism. The original equation was proposed in our previous theoretical model (Yamazaki et al., 2015), based on Zhang and Linden (2006), Person and Raman (2010), and McElvain et al. (2010), as follows:

$$\tau_v \frac{dv}{dt} = -v(t) \left\langle \mathrm{MF}(t) \right\rangle + \left\langle \mathrm{MF}(t) \left( \mathrm{VN}(t) - \theta(t) \right) \right\rangle,\tag{4}$$

where τ<sub>v</sub> is a time constant, v(t) is the synaptic weight at MF-VN synapses at time t, MF(t) is the activity of MFs, VN(t) is the activity of VN, ⟨·⟩ is the temporal average over a certain time window (we assumed 6 s), and θ(t) is a running average of VN(t), namely θ(t) = ⟨VN(t)⟩.

#### TABLE 4 | Summary of learning parameters.

The left-hand side represents the temporal increment of v(t). The first term on the right-hand side represents LTD by activation of MFs alone, and the second term represents the Hebbian mechanism, in which the weight change follows the correlated activity of pre- and postsynaptic neurons. Here, the term θ(t) acts as a threshold: the synapses undergo LTP only when the postsynaptic neuron is activated strongly enough to exceed θ(t), and otherwise undergo LTD or no change. In this way, θ(t) determines the direction of synaptic change. Moreover, the value of θ(t) itself changes over time depending on the history of VN(t). A higher θ(t) makes it harder for the synapse to undergo LTP. The initial value of v was set at 1.0. The parameters for w and v are summarized in **Table 4**.
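A small Python sketch can make the sign of the Hebbian term in Equation (4) concrete. It rests on assumptions that are not in the original text: activities are sampled on a regular grid covering one 6 s averaging window, θ is taken as the mean of VN over that same window and held fixed inside the average, and the amplitudes, `tau_v`, and the function name are made up. Under these assumptions the Hebbian term equals the covariance of MF and VN activity over the window.

```python
import math

def mfvn_rate_of_change(v, mf, vn, tau_v):
    """Right-hand side of Eq. (4) divided by tau_v, from sampled activities.

    mf and vn are equal-length lists of activity samples covering one
    averaging window (6 s in the text). Assumption of this sketch: theta is
    the mean of vn over the same window and is held fixed inside the average.
    """
    n = len(mf)
    mean_mf = sum(mf) / n
    theta = sum(vn) / n
    hebb = sum(m * (x - theta) for m, x in zip(mf, vn)) / n
    return (-v * mean_mf + hebb) / tau_v

T = 6.0
times = [k * 0.01 for k in range(600)]  # one 6 s window, 10 ms sampling
mf = [15.0 * (1.0 + math.sin(2 * math.pi * t / T)) for t in times]
# Hypothetical VN activity, modulated in phase or in anti-phase with the MFs.
vn_in = [40.0 + 10.0 * math.sin(2 * math.pi * t / T) for t in times]
vn_anti = [40.0 - 10.0 * math.sin(2 * math.pi * t / T) for t in times]
dv_in = mfvn_rate_of_change(1.0, mf, vn_in, tau_v=1.0)
dv_anti = mfvn_rate_of_change(1.0, mf, vn_anti, tau_v=1.0)
```

With these made-up numbers, in-phase modulation of VN yields a positive rate of change (net LTP), whereas anti-phase modulation yields a negative one (net LTD), in line with the threshold role of θ(t) described above.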

As far as we have tested, the general network dynamics do not change much over a wide range of parameter settings. We have found three conditions that are necessary to achieve robust learning. First, the granule-Golgi cell recurrent network should be tuned so as to generate the population code of granule cells robustly. Second, basket cell→PC synaptic connections should not be too strong; otherwise, PCs would be completely silent. Third, PC→VN synaptic connections should not be too strong; otherwise, VN would be completely silent. When these three conditions are satisfied, the network, as far as we have tested, works robustly.

### 2.2. Simulation Paradigm

We conducted computer simulation of long-term OKR gain adaptation as in Shutoh et al. (2006). Specifically, we repeated a 1-h simulated OKR training session followed by a 23-h rest 5 times, corresponding to 5 days of training. In each OKR training session, the simulated optokinetic stimulus is fed to MFs, and retinal slip is fed to a CF. Both the optokinetic stimulus and the retinal slip are modeled as Poisson spikes with the following firing rates:

$$\begin{aligned} f_{\text{MF}_{\text{train}}}(t) &= \overline{\text{MF}_{\text{train}}} \left( 1 + \sin \frac{2\pi t}{T} \right) \quad \text{(for MFs)}\\ f_{\text{CF}_{\text{train}}}(t) &= \overline{\text{CF}_{\text{train}}} \left( 1 + \sin \frac{2\pi t}{T} \right) \quad \text{(for a CF)}, \end{aligned} \tag{5}$$

where f<sub>MFtrain</sub>(t) and f<sub>CFtrain</sub>(t) are the firing rates of MFs and a CF, respectively, and MF<sub>train</sub> and CF<sub>train</sub> (written with an overline in Equation 5) are the mean activities of MFs and a CF, which are set at 15 spikes/s and 1.5 spikes/s, respectively. T is the period of one cycle of the optokinetic stimulus, which is assumed to be rotated sinusoidally in front of animal subjects. We set T = 6 s, consistent with the experiments (Shutoh et al., 2006). Because one cycle is 6 s, a daily 1-h training session consists of 600 cycles of the simulated optokinetic stimulus. After training, we assumed that both MFs and a CF elicited spikes spontaneously with the following firing rates:

$$\begin{aligned} f_{\text{MF}_{\text{rest}}}(t) &= \overline{\text{MF}_{\text{rest}}} \quad \text{(for MFs)}\\ f_{\text{CF}_{\text{rest}}}(t) &= \overline{\text{CF}_{\text{rest}}} \quad \text{(for a CF)}, \end{aligned} \tag{6}$$

where MF<sub>rest</sub> and CF<sub>rest</sub> are set at 5 spikes/s and 1 spike/s, respectively.
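The input spike trains of Equations (5) and (6) can be sketched in Python as follows. The sketch assumes that a Poisson process is approximated on the 1 ms grid by an independent Bernoulli draw per step with probability rate × Δt, which is adequate when that product is small; the function names and the random seed are illustrative and not taken from the original program.

```python
import math
import random

T_CYCLE = 6.0   # period of the optokinetic stimulus (s)
DT = 0.001      # 1 ms time step (s)

def rate_train(t, mean_rate):
    """Eq. (5): sinusoidally modulated firing rate (spikes/s)."""
    return mean_rate * (1.0 + math.sin(2.0 * math.pi * t / T_CYCLE))

def poisson_spikes(rate_fn, duration, rng):
    """Draw a 0/1 spike train on the 1 ms grid; P(spike) = rate * DT per step."""
    n_steps = int(round(duration / DT))
    return [1 if rng.random() < rate_fn(k * DT) * DT else 0
            for k in range(n_steps)]

rng = random.Random(0)  # fixed seed so the example is repeatable
# 100 stimulus cycles (600 s) of one MF during training, mean 15 spikes/s.
mf_train = poisson_spikes(lambda t: rate_train(t, 15.0), 600.0, rng)
# 600 s of the CF at rest, Eq. (6), constant 1 spike/s.
cf_rest = poisson_spikes(lambda t: 1.0, 600.0, rng)

mf_mean_rate = sum(mf_train) / 600.0
cf_mean_rate = sum(cf_rest) / 600.0
cycles_per_hour = int(3600 / T_CYCLE)  # number of 6 s cycles in a 1-h session
```

Averaged over whole cycles, the empirical MF rate should lie close to the mean of 15 spikes/s, because the sinusoidal factor averages to one.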

Once we define the firing rates of MFs and the CF as above, and assume that the activity of a simulated neuron (e.g., its firing rate) reflects the strength of its input signals almost linearly, as is the case for the integrate-and-fire neurons used in this study (Gerstner et al., 2014), we can estimate the activity of VN as a linear sum of excitatory MF activity and inhibitory PC activity. The PC activity can in turn be estimated as a linear sum of PF activity and basket cell activity, and ultimately in terms of MF activity alone. By substituting the MF and VN activities for MF(t) and VN(t) in Equation (4), we obtain the following simplified equation for v. The detailed derivation is found in our previous paper (Yamazaki et al., 2015).

$$
\tau_v \frac{dv}{dt} = -w(t) + w_c,\tag{7}
$$

where w(t) is the average synaptic weight of all PF-PC synapses, and w<sub>c</sub> is a constant that defines the initial weight of PF-PC synapses, namely, 1.0. For simplicity, we used Equation (7) rather than Equation (4) to update v(t).
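A forward-Euler sketch of Equation (7) in Python shows the direction of change it implies. The time constants and the recovery curve of the mean PF-PC weight below are invented for illustration (the real parameters are in Table 4, not reproduced here), and `update_v` is a hypothetical helper name.

```python
import math

def update_v(v, w_mean, dt, tau_v, w_c=1.0):
    """One forward-Euler step of Eq. (7): tau_v * dv/dt = -w(t) + w_c."""
    return v + dt * (w_c - w_mean) / tau_v

# Hypothetical scenario: the mean PF-PC weight sits at 0.8 right after
# training and relaxes exponentially back to w_c = 1.0. The values of
# tau_v and tau_rec are made up (arbitrary time units).
v, tau_v, tau_rec, dt = 1.0, 1000.0, 100.0, 1.0
for k in range(2000):
    w_mean = 1.0 - 0.2 * math.exp(-k * dt / tau_rec)
    v = update_v(v, w_mean, dt, tau_v)
```

As long as the mean PF-PC weight stays below w<sub>c</sub>, v increases; once the weight has returned to w<sub>c</sub>, v stops changing. In this toy run v ends near 1.02.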

### 2.3. Data Analysis

We conducted computer simulation of the 5-day OKR training, and obtained spike data of all individual neurons and synaptic weight data of PF-PC synapses and MF-VN synapses. The total simulated time was 5 × 24 × 60 × 60 × 1000 = 4.32 × 10<sup>8</sup> ms, with a temporal resolution of 1 ms.

We analyzed how the OKR gain changed before and after training on each day. To do so, before training on each day, we fed 10 cycles of the simulated optokinetic stimulus to the network and obtained the spike data of VN. We made a spike histogram with a bin size of 100 ms, fitted the data with a cosine function with a period of 6 s, and calculated the modulation amplitude. We defined this modulation amplitude as the OKR gain before training. We followed the same procedure to obtain the OKR gain after training on each day.
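The gain estimate described above can be sketched in Python. The sketch assumes that, with equally spaced bins over whole cycles, the least-squares fit of an offset plus a cosine of fixed period reduces to a single Fourier component; the function names are hypothetical, and the test spike train (a deterministic train with a rate of 40 + 10 sin(2πt/6) spikes/s) is made up for illustration rather than taken from the simulation.

```python
import math

def modulation_amplitude(spike_times, n_cycles, period=6.0, bin_size=0.1):
    """Amplitude of the best-fitting cosine (fixed period) to a spike histogram.

    spike_times are in seconds and span n_cycles whole stimulus cycles. With
    equally spaced bins over whole cycles, the least-squares fit of
    c + A*cos(2*pi*t/period + phi) reduces to a single Fourier component.
    """
    n_bins = int(round(n_cycles * period / bin_size))
    counts = [0] * n_bins
    for t in spike_times:
        k = int(t / bin_size)
        if 0 <= k < n_bins:
            counts[k] += 1
    rates = [c / bin_size for c in counts]  # spikes/s in each bin
    omega = 2.0 * math.pi / period
    centers = [(k + 0.5) * bin_size for k in range(n_bins)]
    a = 2.0 / n_bins * sum(r * math.cos(omega * t) for r, t in zip(rates, centers))
    b = 2.0 / n_bins * sum(r * math.sin(omega * t) for r, t in zip(rates, centers))
    return math.hypot(a, b)

def regular_spikes(rate_fn, duration, dt=0.001):
    """Deterministic spike train whose local rate follows rate_fn (test input)."""
    spikes, acc = [], 0.0
    for k in range(int(round(duration / dt))):
        t = (k + 0.5) * dt
        acc += rate_fn(t) * dt
        if acc >= 1.0:
            acc -= 1.0
            spikes.append(t)
    return spikes

# Hypothetical VN response: baseline 40 spikes/s, modulation 10 spikes/s.
spikes = regular_spikes(lambda t: 40.0 + 10.0 * math.sin(2 * math.pi * t / 6.0), 60.0)
gain = modulation_amplitude(spikes, n_cycles=10)
```

For this synthetic train the estimated amplitude should come out close to the 10 spikes/s modulation that was built in, up to binning and spike-quantization error.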

We also examined how robustly the granule cells transmit mossy fiber signals against noise in the Poisson spike trains. Granule cells must produce almost identical spike patterns in response to the same optokinetic stimulus with different noise across cycles; otherwise, learning at Purkinje cells would fail. To quantify the reproducibility of the granule cell spike pattern in response to the same simulated optokinetic stimulus, we calculated the reproducibility index at time t, defined as the normalized cross correlation, as follows:

$$R(t) = \frac{\sum_{j} z_{j}^{(i)}(t)\, z_{j}^{(i+1)}(t)}{\sqrt{\sum_{j} \left(z_{j}^{(i)}(t)\right)^{2}} \sqrt{\sum_{j} \left(z_{j}^{(i+1)}(t)\right)^{2}}},\tag{8}$$

where z<sub>j</sub><sup>(i)</sup>(t) is the activity of granule cell j at cycle i of the simulated optokinetic stimulus at time t, which was calculated by convolution of the spikes with a causal exponential:

$$z_j^{(i)}(t) = \sum_{f \in S_j^{(i)}} \exp\left(-\frac{t - t^{(f)}}{\tau}\right) \Theta\left(t - t^{(f)}\right),\tag{9}$$

where S<sub>j</sub><sup>(i)</sup> is the set of spikes elicited by granule cell j at cycle i, t<sup>(f)</sup> is the time of the f-th spike, τ is a time constant of 8.3 ms, and Θ(t) is the Heaviside step function. Intuitively, z<sub>j</sub><sup>(i)</sup>(t) is a temporal trace of EPSPs of PF j on a PC at cycle i, and τ = 8.3 ms is the time constant of AMPA receptor-mediated PF-EPSPs at a PC (Llano et al., 1991). We calculated the average and standard deviation of the reproducibility index among 10 pairs of two successive cycles.
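The reproducibility index can be sketched in a few lines of Python. The sketch assumes the standard normalization of a cross correlation, namely division by the Euclidean norms of the two activity vectors, and measures spike times in ms from the start of each cycle; the function names and the toy spike patterns are illustrative.

```python
import math

TAU_MS = 8.3  # time constant of PF-EPSPs at a PC, from the text

def activity(spike_times, t, tau=TAU_MS):
    """Eq. (9): causal exponential trace of one cell's spikes at time t (ms)."""
    return sum(math.exp(-(t - tf) / tau) for tf in spike_times if tf <= t)

def reproducibility(cycle_a, cycle_b, t):
    """Eq. (8): normalized cross correlation of two population patterns at time t.

    cycle_a and cycle_b are lists (one entry per granule cell) of spike-time
    lists, with times measured from the start of each cycle. Returns 0.0 when
    either population is completely silent.
    """
    za = [activity(s, t) for s in cycle_a]
    zb = [activity(s, t) for s in cycle_b]
    num = sum(a * b for a, b in zip(za, zb))
    den = math.sqrt(sum(a * a for a in za)) * math.sqrt(sum(b * b for b in zb))
    return num / den if den > 0.0 else 0.0

# Toy patterns of three granule cells (spike times in ms within a cycle).
cycle_1 = [[1.0, 5.0], [3.0], []]
cycle_2 = [[1.0, 5.0], [3.0], []]   # identical to cycle_1
cycle_3 = [[], [], [2.0]]           # a disjoint set of active cells
r_same = reproducibility(cycle_1, cycle_2, t=6.0)
r_disjoint = reproducibility(cycle_1, cycle_3, t=6.0)
```

Identical patterns give an index of 1, and patterns with no active cells in common give 0, so values in between measure how much of the population pattern recurs from one cycle to the next.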

### 2.4. Numerical Method

All equations that govern the network dynamics are solved numerically. Specifically, the differential equations describing membrane potentials are solved by a 2nd-order Runge-Kutta method with a temporal resolution (Δt) of 1 ms. The simulation program is written in C with CUDA (Compute Unified Device Architecture) (NVIDIA, 2015), and most of the calculation is performed on GPUs.
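A 2nd-order Runge-Kutta step for the membrane equation can be sketched in Python as below. The text does not say which 2nd-order variant is used, so the sketch assumes Heun's method; it also keeps only the leak term, one excitatory conductance, and the external current from Equation (1), holds the conductance fixed within a step, omits the spike threshold, and uses made-up parameter values in arbitrary consistent units.

```python
def dudt(u, g_leak, e_leak, g_ex, e_ex, i_ext, c_m):
    """Simplified right-hand side of Eq. (1): leak, one excitatory
    conductance, and an external current (other terms omitted)."""
    return (-g_leak * (u - e_leak) - g_ex * (u - e_ex) + i_ext) / c_m

def rk2_step(u, dt, **p):
    """One Heun (2nd-order Runge-Kutta) step, conductances held fixed."""
    k1 = dudt(u, **p)
    k2 = dudt(u + dt * k1, **p)
    return u + 0.5 * dt * (k1 + k2)

# Hypothetical parameters (not the values of Table 1).
params = dict(g_leak=0.1, e_leak=-70.0, g_ex=0.02, e_ex=0.0, i_ext=0.0, c_m=1.0)
u = -70.0
for _ in range(200):  # 200 steps of 1 ms, subthreshold only
    u = rk2_step(u, 1.0, **params)

# Analytic steady state for constant conductances, used as a check.
u_inf = ((params["g_leak"] * params["e_leak"] + params["g_ex"] * params["e_ex"]
          + params["i_ext"]) / (params["g_leak"] + params["g_ex"]))
```

With constant conductances the membrane potential relaxes to the conductance-weighted average of the reversal potentials, which gives a simple way to check an integrator of this kind.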

In our previous study (Yamazaki and Igarashi, 2013), we used only 1 GPU (NVIDIA GeForce GTX580) to simulate 100,000 granule cells in realtime. The present model has 10 times more granule cells, which makes the simulation far slower than realtime. The most time-consuming part is the calculation of synaptic conductances of Golgi cells, basket cells, and PCs, which receive excitatory inputs from granule cells via PFs; because of the large number of granule cells, this calculation takes too much time. To address this issue, we decomposed the granular layer network, composed of granule cells and Golgi cells, into 4 identical subnetworks and calculated the dynamics in parallel on 4 GPUs (2 boards of NVIDIA GeForce GTX TITAN Z, each containing 2 GPUs). In the following, we explain how we decompose the network and calculate the conductances on 4 GPUs.

**Figure 2A** illustrates a part of the granular layer of our model. The granular layer is composed of 1024 × 1024 granule cells and 32 × 32 Golgi cells arranged regularly on a two-dimensional grid. Granule cells are further divided into 32 × 32 clusters, where each cluster consists of 32 × 32 granule cells. Because the dendrites of granule cells are short, we assumed that the granule cells in the same cluster share inhibitory inputs from the same Golgi cells. In contrast, we assumed that each granule cell receives 4 excitatory MF inputs independently of the other granule cells. This structure allows us to decompose the granular layer network into 4 identical subnetworks composed of 512 × 512 granule cells and 32 × 32 Golgi cells, where the granule cells are further divided into 32 × 32 clusters, each containing 16 × 16 = 256 granule cells, as shown in **Figure 2B**. We simulated each subnetwork on one GPU, thereby employing 4 GPUs for the simulation of 4 subnetworks. In each subnetwork, we calculated a quarter of the synaptic conductance of each Golgi cell from the granule cells in the same subnetwork. We then exchanged the partial conductances across subnetworks over GPUs and obtained the full conductance by summing the partial values (**Figure 2C**). This exchange is done by direct memory exchange between GPUs, called peer access, which is much faster than conventional memory exchange via CPUs. Because the calculation of synaptic conductance is linear, our split-reduction method over 4 GPUs gives the same result as the conventional method. The same method is used to calculate the synaptic conductances of basket cells and PCs as well.
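The split-reduction idea can be illustrated with a short Python sketch in which plain lists stand in for the memory of the 4 GPUs. It does not reproduce CUDA peer access or the actual data layout; the population size, spike probability, weights, and function names are made up. It only demonstrates the linearity argument: summing four partial conductances gives the same value as one sum over the whole presynaptic population, up to floating-point rounding.

```python
import math
import random

def partial_conductance(spikes, weights):
    """Weighted sum of presynaptic spikes for one target cell within one part."""
    return sum(w * s for w, s in zip(weights, spikes))

def split_reduce(spikes, weights, n_parts=4):
    """Split the presynaptic population into n_parts, sum each part, then reduce."""
    size = len(spikes) // n_parts
    partials = [
        partial_conductance(spikes[p * size:(p + 1) * size],
                            weights[p * size:(p + 1) * size])
        for p in range(n_parts)
    ]
    return sum(partials), partials

rng = random.Random(1)  # fixed seed so the example is repeatable
n_granule = 1024        # hypothetical number of presynaptic granule cells
spikes = [1 if rng.random() < 0.05 else 0 for _ in range(n_granule)]
weights = [rng.random() for _ in range(n_granule)]

full = partial_conductance(spikes, weights)      # conventional single sum
total, partials = split_reduce(spikes, weights)  # four partial sums, then reduce
matches = math.isclose(total, full, rel_tol=1e-12, abs_tol=1e-12)
```

Only the four partial values per target cell need to be exchanged between devices, rather than the spikes of all granule cells, which is what keeps the communication cost of such a scheme low.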

### 3. RESULTS

### 3.1. Simulation Time

First, we measured how much the simulation was accelerated by using multiple GPUs. Using only 1 GPU, a simulation of 6 s of cerebellar activity, corresponding to 1 cycle of the simulated optokinetic stimulus, takes 17.7 s. Using 2 GPUs, it takes 9.10 s. Finally, using 4 GPUs, it takes 5.33 s, which is faster than realtime. Therefore, we used 4 GPUs for all further simulations.
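The reported timings can be turned into speed-up, parallel efficiency, and real-time factor with a few lines of Python. The timings are the measured values quoted above; the derived quantities and the helper name `scaling_table` are additions made here for illustration.

```python
SIMULATED_S = 6.0                           # one stimulus cycle (s)
WALL_CLOCK_S = {1: 17.7, 2: 9.10, 4: 5.33}  # measured wall-clock times from the text

def scaling_table(wall_clock, simulated):
    """Return {n_gpus: (speed-up, parallel efficiency, real-time factor)}."""
    base = wall_clock[1]
    table = {}
    for n_gpus, wall in wall_clock.items():
        speedup = base / wall
        efficiency = speedup / n_gpus
        realtime_factor = wall / simulated  # below 1 means faster than real time
        table[n_gpus] = (speedup, efficiency, realtime_factor)
    return table

table = scaling_table(WALL_CLOCK_S, SIMULATED_S)
for n_gpus, (speedup, efficiency, rt) in table.items():
    print(f"{n_gpus} GPU(s): speed-up {speedup:.2f}, "
          f"efficiency {efficiency:.2f}, real-time factor {rt:.2f}")
```

With these numbers, 4 GPUs give roughly a 3.3-fold speed-up over 1 GPU (about 83% parallel efficiency), and only the 4-GPU configuration has a real-time factor below 1.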

### 3.2. Long-term OKR Gain Change

We conducted computer simulation of long-term OKR adaptation for 5 days. On each day, we performed a simulated 1-h OKR training session. During the training, MFs convey the simulated optokinetic stimulus, whereas a CF conveys simulated retinal slip error signals. After the training, both the MFs and the CF elicit Poisson spikes spontaneously at their respective constant firing rates.

**Figure 3A** plots the OKR gain obtained in our long-term OKR training simulation for 5 days. With daily 1-h training, the OKR gain increases during training, and after training the learned OKR gain almost disappears. This indicates memory acquisition of OKR gain. On the other hand, over the 5 days, the OKR gain gradually increases, indicating memory consolidation. The present numerical result is qualitatively consistent with previous experimental and theoretical results (Shutoh et al., 2006; Yamazaki et al., 2015).

**Figure 3B** plots the daily increment of learned OKR gain by 1 h training. The increment becomes larger day by day, suggesting that repeated daily training accelerates the memory acquisition. This result is consistent with previous experiments (Shutoh et al., 2006).

### 3.3. Change of Synaptic Weights

**Figure 4** plots the change of weights at PF-PC synapses (w) and MF-VN synapses (v) throughout the 5 days of training. For w, we calculated the average of all PF-PC synaptic weights with respect to PFs and PCs. Similarly, for v, we calculated the average of all MF-VN synaptic weights with respect to MFs. PF-PC synapses undergo LTD during training, and spontaneously return slowly to the original weight value after training. PF-PC synapses repeat the same temporal change on each of the 5 days, suggesting that PF-PC synapses store only short-term memory of OKR gain for hours. On the other hand, MF-VN synapses change little during training, and slowly increase after training. The synaptic weight accumulates every day after training, suggesting that MF-VN synapses store long-term memory of OKR gain.

The overall dynamics are as follows. First, memory of OKR gain is formed in the cerebellar cortex by PF-PC LTD during training. Second, after training, the learned cortical memory decays slowly and disappears completely by the next day. Finally, during the slow decay of the cortical memory, memory is formed in the vestibular nucleus by MF-VN LTP, as if the cortical memory were transferred to the nucleus and consolidated. The present numerical result is consistent with the previous theoretical results (Yamazaki et al., 2015).

### 3.4. Change of Eye Movement Trajectory

So far, both the current numerical and previous theoretical studies show qualitatively the same results. A benefit of our numerical study is that we could obtain detailed data of individual neurons such as membrane potential and spike trains with a fine temporal resolution of 1 ms, which were abstracted in our theoretical model (Yamazaki et al., 2015).

**Figure 5** plots the firing rate of VN in response to the simulated sinusoidal optokinetic stimulus before and after training on the 1st day (A) and the 5th day (B). The firing rate is modulated sinusoidally, like the input signals. The modulation amplitude increases with daily 1-h training, and also increases gradually over the 5 days. On the other hand, the baseline firing rate changes little and stays within 30–50 spikes/s. Here, the modulation amplitude of VN represents the OKR gain (Shutoh et al., 2006), suggesting that the OKR gain becomes larger with repeated daily training. These results also suggest that realtime simulation allows us to study both macroscopic behaviors of a neural network, such as OKR gain, and mesoscopic dynamics of individual neurons in the network, such as membrane potentials and spike trains.

### 3.5. Robust Signal Transmission by the Enormous Number of Granule Cells

Granule cells must transmit information conveyed by mossy fibers to Purkinje cells and interneurons faithfully against input noise, otherwise, learning at Purkinje cells would fail. In OKR, mossy fibers convey information on visual world movement, and granule cells produce a spatiotemporal spike pattern that represents the stimulus reliably. For this purpose, the almost identical spike pattern of granule cells must be produced across cycles of the optokinetic stimulus.

Here, we examined how the enormous number of granule cells helps them to transmit mossy fiber information faithfully and robustly against input noise. Specifically, we calculated the reproducibility index (Equation 8), which quantifies the reproducibility of the spike pattern of granule cells across cycles of the simulated optokinetic stimulus, while changing the number of granule cells in the network.

**Figure 6A** plots an example of the spike pattern of granule cells, whereas **Figure 6B** plots the reproducibility. As can be seen, the reproducibility is better with 1 million granule cells than with 0.1 million granule cells. This result suggests that a functional role of the enormous number of granule cells is robust transmission of mossy fiber signals to PCs against input noise.

### 4. DISCUSSION

### 4.1. Understanding Memory Consolidation Mechanisms

Memory consolidation is a slow process that takes days and weeks. The neural mechanisms of memory consolidation can be studied in two ways: by conducting experiments or by building theoretical models. A theoretical model is a mathematical description of a specific phenomenon. To make such a model, we ignore most experimental details and capture the essence of the phenomenon. For example, in our theoretical model of posttraining memory consolidation in the cerebellum (Yamazaki et al., 2015), we abstracted away the detailed physiology of individual neurons, the detailed anatomical structure, and the detailed input stimuli. This provides a clear view of how the memory consolidates after training, but it leaves the detailed neuronal processes during memory consolidation unknown. Large-scale, realistic spiking network models are appropriate for this purpose, but their computational time becomes the problem instead.

HPC technology solves this problem. The advantage is twofold. First, the technology allows us to build a larger-scale model composed of more neurons and synapses with more detailed morphology and biophysical properties than conventional models. Very large-scale functional brain models have been built (Izhikevich and Edelman, 2008; Eliasmith et al., 2012). Notably, the Blue Brain Project and the Human Brain Project attempt to build a realistic whole brain model, and they recently published a very detailed cortical microcolumn model (Markram et al., 2015). Second, the technology allows us to carry out computer simulations much faster than on a single-threaded CPU. This second advantage makes the above-mentioned long-term computer simulation possible in a reasonable time. For instance, if the computer simulation runs in real time, a simulation of memory consolidation for 1 week completes in 1 week. In our study, we adopted GPUs. Using our large-scale, detailed spiking network model of the cerebellum implemented on multiple GPUs, we were able to simulate the detailed temporal dynamics of individual neurons while simultaneously observing the slow memory consolidation process. The present study is, as far as the authors know, the first demonstration of a computer simulation of an elaborate spiking network model that covers several days of simulated time. We were able to replicate our previous theoretical results (Yamazaki et al., 2015), and further examined detailed neuronal and synaptic dynamics during memory consolidation. In cerebellar motor learning, the location of motor memory and the role of LTD at PF-PC synapses have been a matter of debate for more than 30 years (Mellvill-Jones, 2000). The present study could provide an answer from the modeling viewpoint.

### 4.2. Realtime Simulation and the Programming

The present cerebellar model consists of more than 1 million spiking neurons. In general, computer simulation of such a large-scale model takes a very long time. The simulation could be 10–100 times slower than real-world time (Nageswaran et al., 2009). However, owing to HPC technology, we were able to conduct computer simulation in realtime, where simulation of cerebellar activity for 1 s completes within 1 s of real-world time. This allowed us to conduct a complete computer simulation of long-term OKR adaptation training for 5 days in a practical time.

We used 4 GPUs simultaneously to perform realtime simulation of 1 million neurons. To do so, we had to write the simulation program in C with CUDA, a platform for GPU computing, and to employ parallel algorithms to use the GPUs efficiently. Specifically, we used such algorithms to compute the synaptic conductances of Golgi cells, basket cells, and PCs, which receive excitatory inputs from many granule cells. This is quite technical and difficult, so a simpler way to harness the power of parallel computing in neuroscience is desirable. One potential way would be to develop a neural simulator primarily designed for GPUs and other accelerators. Naveros et al. (2014) have reported the development of such a spiking neuron network simulator on a GPU. Some groups have used the software for realtime robot control (Garrido et al., 2013; Casellato et al., 2015).

Realtime simulation is only a milestone, and we expect even faster computer simulation. Scalability, however, would be a problem. Generally speaking, using more GPUs incurs more communication latency and more overhead from communication operations, which could easily become a bottleneck.

### 4.3. Advantages of Large-Scale Models over Theoretical Models

Although the present study reproduced qualitatively the same results as our previous theoretical model (Yamazaki et al., 2015), some results are slightly different. First, the MF-VN synaptic weight in the present model tends to decay spontaneously, whereas that in the theoretical model did not. This is because in the theoretical model, the decay term was canceled out and removed by a mathematical treatment. In experiments (Shutoh et al., 2006), the learned long-term OKR gain almost vanishes 2 weeks after the last training, suggesting that it is natural for the synaptic weight to decay spontaneously. Second, the increase of the modulation amplitude of VN between before and after the 1-h training gradually becomes larger over the 5 days in the current study (**Figure 3B**), whereas the change is constant in the theoretical model. The same experiments demonstrate that the increase becomes gradually larger day by day. This result suggests that the present large-scale model captures the detailed dynamics of long-term OKR gain adaptation better than the theoretical model.

Moreover, the present model allows us to study the detailed temporal dynamics of individual neurons with a fine temporal resolution. We were able to obtain detailed spike data of PCs and VN, and analyzed the firing patterns as in **Figure 6**. This is an advantage of an elaborated spiking network model over theoretical models, which abstract detailed temporal dynamics of individual neurons. We will be able to go into the details of molecular mechanisms of memory acquisition and consolidation (Abel and Lattal, 2001; Ito, 2002), if the HPC technology advances further.

We were also able to examine how the number of neurons could affect the stability of the network dynamics. In the present model, we incorporated more than 1 million granule cells, because the cats' cerebellum has 1 million granule cells per 1 mm<sup>3</sup> (Ito, 1984). The cerebellar granule cells constitute the largest population in the whole brain (Azevedo et al., 2009). A question arises: why does the cerebellum have such an enormous number of granule cells? A theoretical study has demonstrated that incorporating more granule cells makes the network more reliable for controlling hardware robots (Pinzon-Morales and Hirata, 2015). In the present study, we demonstrated that the enormous number of granule cells makes signal transmission from MFs to PFs more robust as in **Figure 6**.

In summary, the combination of large-scale, detailed spiking network models with HPC technology for realtime simulation will provide a powerful means to study the detailed mesoscopic neural mechanisms of macroscopic behavioral phenomena that unfold over days and weeks, such as memory formation.

### 4.4. Data Sharing

We will release the source code of the model used in this study under an opensource license upon publication, to facilitate open collaboration and ensure scientific reproducibility, on Cerebellar Platform (https://cerebellum.neuroinf.jp/).

### AUTHOR CONTRIBUTIONS

TY designed research; MS and TY performed research; MS and TY analyzed data; TY and MS wrote the paper.

### FUNDING

JSPS Kakenhi Grant Number (26430009) and UEC Tenure Track Program (6F15).

### ACKNOWLEDGMENTS

We would like to thank Soichi Nagao, Jun Igarashi, and Daisuke Miyamoto for helpful discussions.

### REFERENCES


responses at central vestibular nerve synapses. Neuron 68, 763–775. doi: 10.1016/j.neuron.2010.09.025


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Gosui and Yamazaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Large-Scale Simulations of Plastic Neural Networks on Neuromorphic Hardware

#### James C. Knight<sup>1</sup>\*, Philip J. Tully<sup>2,3,4</sup>, Bernhard A. Kaplan<sup>5</sup>, Anders Lansner<sup>2,3,6</sup> and Steve B. Furber<sup>1</sup>

*<sup>1</sup> Advanced Processor Technologies Group, School of Computer Science, University of Manchester, Manchester, UK, <sup>2</sup> Department of Computational Biology, Royal Institute of Technology, Stockholm, Sweden, <sup>3</sup> Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden, <sup>4</sup> Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, UK, <sup>5</sup> Department of Visualization and Data Analysis, Zuse Institute Berlin, Berlin, Germany, <sup>6</sup> Department of Numerical analysis and Computer Science, Stockholm University, Stockholm, Sweden*

SpiNNaker is a digital, neuromorphic architecture designed for simulating large-scale spiking neural networks at speeds close to biological real-time. Rather than using bespoke analog or digital hardware, the basic computational unit of a SpiNNaker system is a general-purpose ARM processor, allowing it to be programmed to simulate a wide variety of neuron and synapse models. This flexibility is particularly valuable in the study of biological plasticity phenomena. A recently proposed learning rule based on the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm offers a generic framework for modeling the interaction of different plasticity mechanisms using spiking neurons. However, it can be computationally expensive to simulate large networks with BCPNN learning since it requires multiple state variables for each synapse, each of which needs to be updated every simulation time-step. We discuss the trade-offs in efficiency and accuracy involved in developing an event-based BCPNN implementation for SpiNNaker based on an analytical solution to the BCPNN equations, and detail the steps taken to fit this within the limited computational and memory resources of the SpiNNaker architecture. We demonstrate this learning rule by learning temporal sequences of neural activity within a recurrent attractor network which we simulate at scales of up to 2.0 × 10<sup>4</sup> neurons and 5.1 × 10<sup>7</sup> plastic synapses: the largest plastic neural network ever to be simulated on neuromorphic hardware. We also run a comparable simulation on a Cray XC-30 supercomputer system and find that, if it is to match the run-time of our SpiNNaker simulation, the supercomputer system uses approximately 45× more power. This suggests that cheaper, more power efficient neuromorphic systems are becoming useful discovery tools in the study of plasticity in large-scale brain models.

Keywords: SpiNNaker, learning, plasticity, digital neuromorphic hardware, Bayesian confidence propagation neural network (BCPNN), event-driven simulation, fixed-point accuracy

#### Edited by:

*Wolfram Schenck, University of Applied Sciences Bielefeld, Germany*

#### Reviewed by:

*Guy Elston, Centre for Cognitive Neuroscience, Australia; Thomas Pfeil, Ruprecht-Karls-University Heidelberg, Germany*

#### \*Correspondence:

*James C. Knight james.knight@manchester.ac.uk*

Received: *30 November 2015* Accepted: *18 March 2016* Published: *07 April 2016*

#### Citation:

*Knight JC, Tully PJ, Kaplan BA, Lansner A and Furber SB (2016) Large-Scale Simulations of Plastic Neural Networks on Neuromorphic Hardware. Front. Neuroanat. 10:37. doi: 10.3389/fnana.2016.00037*

## 1. INTRODUCTION

Motor, sensory and memory tasks are composed of sequential elements and are therefore thought to rely upon the generation of temporal sequences of neural activity (Abeles et al., 1995; Seidemann et al., 1996; Jones et al., 2007). However, it remains a major challenge to learn such functionally meaningful dynamics within large-scale models using biologically plausible synaptic and neural plasticity mechanisms. Using SpiNNaker, a neuromorphic hardware platform for simulating large-scale spiking neural networks, and BCPNN, a plasticity model based on Bayesian inference, we demonstrate how temporal sequence learning could be achieved through modification of recurrent cortical connectivity and intrinsic excitability in an attractor memory network.

Spike-Timing-Dependent Plasticity (STDP; Bi and Poo, 1998) inherently reinforces temporal causality, which has made it a popular choice for modeling temporal sequence learning (Dan and Poo, 2004; Caporale and Dan, 2008; Markram et al., 2011). However, to date, all large-scale neural simulations using STDP (Morrison et al., 2007; Kunkel et al., 2011) have been run on large cluster machines or supercomputers, both of which consume many orders of magnitude more power than the few watts required by the human brain. Mead (1990) suggested that the solution to this huge gap in power efficiency was to develop an entirely new breed of "neuromorphic" computer architectures inspired by the brain. Over the following years, a number of these neuromorphic architectures have been built with the aim of reducing the power consumption and execution time of large neural simulations.

Large-scale neuromorphic systems have been constructed using a number of approaches: NeuroGrid (Benjamin et al., 2014) and BrainScaleS (Schemmel et al., 2010) are built using custom analog hardware; True North (Merolla et al., 2014) is built using custom digital hardware and SpiNNaker (Furber et al., 2014) is built from software programmable ARM processors.

Neuromorphic architectures based around custom hardware, especially the type of sub-threshold analog systems which Mead (1990) proposed, have huge potential to enable truly low-power neural simulation, but inevitably the act of casting algorithms into hardware requires some restrictions to be accepted in terms of connectivity, learning rules, and control over parameter values. As an example of these restrictions, of the large-scale systems previously mentioned, only BrainScaleS supports synaptic plasticity in any form, implementing both short-term plasticity and pair-based STDP using a dedicated mixed-mode circuit.

As a software programmable system, SpiNNaker will require more power than a custom hardware based system to simulate a model of a given size (Stromatias et al., 2013). However this software programmability gives SpiNNaker considerable flexibility in terms of the connectivity, learning rules, and ranges of parameter values that it can support. The neurons and synapses which make up a model can be freely distributed between the cores of a SpiNNaker system until they fit within memory; and the CPU and communication overheads taken in advancing the simulation can be handled within a single simulation time step.

This flexibility has allowed the SpiNNaker system to be used for the simulation of large-scale cortical models with up to 5.0 × 10<sup>4</sup> neurons and 5.0 × 10<sup>7</sup> synapses (Sharp et al., 2012, 2014); and various forms of synaptic plasticity (Jin et al., 2010; Diehl and Cook, 2014; Galluppi et al., 2015; Lagorce et al., 2015). In the most recent of these papers, Galluppi et al. (2015) and Lagorce et al. (2015) demonstrated that Sheik et al.'s (2012) model of the learning of temporal sequences from audio data can be implemented on SpiNNaker using a voltage-gated STDP rule. However, this model only uses a small number of neurons and Kunkel et al.'s (2011) analysis suggests that STDP alone cannot maintain the multiple, interconnected stable attractors that would allow spatio-temporal sequences to be learnt within more realistic, larger networks. This conclusion adds to growing criticism of simple STDP rules regarding their failure to generalize over experimental observations (see e.g., Lisman and Spruston, 2005, 2010; Feldman, 2012 for reviews).

We address some of these issues by implementing spike-based BCPNN (Tully et al., 2014)—an alternative to phenomenological plasticity rules which exhibits a diverse range of mechanisms including Hebbian, neuromodulated, and intrinsic plasticity all of which emerge from a network-level model of probabilistic inference (Lansner and Ekeberg, 1989; Lansner and Holst, 1996). BCPNN can translate correlations at different timescales into connectivity patterns through the use of locally stored synaptic traces, enabling a functionally powerful framework to study the relationship between structure and function within cortical circuits. In Sections 2.1–2.3, we describe how this learning rule can be combined with a simple point neuron model as the basis of a simplified version of Lundqvist et al.'s (2006) cortical attractor memory model. In Sections 2.4, 2.5, we then describe how this model can be simulated efficiently on SpiNNaker using an approach based on a recently proposed event-driven implementation of BCPNN (Vogginger et al., 2015). We then compare the accuracy of our new BCPNN implementation with previous non-spiking implementations (Sandberg et al., 2002) and demonstrate how the attractor memory network can be used to learn and replay spatio-temporal sequences (Abbott and Blum, 1996). Finally, in Section 3.3, we show how an anticipatory response to this replay behavior can be decoded from the neurons' sub-threshold behavior which can in turn be used to infer network connectivity.

## 2. MATERIALS AND METHODS

### 2.1. Simplified Cortical Microcircuit Architecture

We constructed a network using connectivity based on a previously proposed cortical microcircuit model (Lundqvist et al., 2006) and inspired by the columnar structure of neocortex (Mountcastle, 1997). The network consists of N<sub>HC</sub> hypercolumns arranged in a grid where each hypercolumn consists of 250 inhibitory basket cells and 1000 excitatory pyramidal cells evenly divided into 10 minicolumns. Within each hypercolumn, the pyramidal cells send AMPA-mediated connections to the basket cells with a connection probability of 10 % and a weight of 0.4 nA (defined as a postsynaptic current (PSC) amplitude). The basket cells then send GABAergic connections back to the pyramidal cells with a connection probability of 10 % and a weight of 2 nA. The basket cells are also recurrently connected through GABAergic connections, again with a connection probability of 10 % and a connection weight of 2 nA. The functional outcome of this local connectivity (excitatory to inhibitory and vice versa) is to enable winner-take-all (WTA) dynamics within each hypercolumn. While the strength of the local synapses remains fixed, all pyramidal cells in the network are also recurrently connected to each other through global AMPA and NMDA connections using plastic BCPNN synapses (see Section 2.2), also with a connection probability of 10 %. All connections in the network have distance-dependent synaptic delays such that, between two cells located in hypercolumns H<sup>pre</sup><sub>xy</sub> and H<sup>post</sup><sub>xy</sub>, the delay is calculated based on the Euclidean distance between the grid coordinates of the hypercolumns (meaning that all local connections have delays of 1 ms):

$$t_d^{H_{xy}^{pre} H_{xy}^{post}} = \frac{d_{norm}\sqrt{\left(H_x^{post} - H_x^{pre}\right)^2 + \left(H_y^{post} - H_y^{pre}\right)^2}}{V} + 1 \tag{1}$$

where the conduction velocity V = 0.2 mm ms<sup>−1</sup> and d<sub>norm</sub> = 0.75 mm.
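As a minimal sketch of Equation (1), the delay between two hypercolumns can be computed directly from their grid coordinates. The function name and the tuple representation of grid coordinates are illustrative assumptions, not part of the SpiNNaker software.

```python
import math

V_MM_PER_MS = 0.2   # conduction velocity V quoted in the text
D_NORM_MM = 0.75    # d_norm quoted in the text


def synaptic_delay_ms(pre_xy, post_xy):
    """Delay in ms between hypercolumns at integer grid coordinates (Equation 1)."""
    dx = post_xy[0] - pre_xy[0]
    dy = post_xy[1] - pre_xy[1]
    # Euclidean grid distance scaled to mm, divided by velocity, plus 1 ms
    return D_NORM_MM * math.hypot(dx, dy) / V_MM_PER_MS + 1.0
```

Connections within a single hypercolumn have zero grid distance, so the expression reduces to the 1 ms local delay mentioned above.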

### 2.2. Synaptic and Intrinsic Plasticity Model

The spike-based BCPNN learning rule is used to learn the strengths of all global synaptic connections and the intrinsic excitabilities of all pyramidal cells in the network described in Section 2.1. The goal of the learning process is to estimate the probabilities of pre- and postsynaptic neurons firing (P<sub>i</sub> and P<sub>j</sub> respectively), along with the probability of them firing together (P<sub>ij</sub>). Then, as Lansner and Holst (1996) describe, these probabilities can be used to calculate the synaptic strengths and intrinsic excitabilities of the network, allowing it to perform Bayesian inference. Tully et al. (2014) developed an approach for estimating these probabilities based on pre- and postsynaptic spike trains (S<sub>i</sub> and S<sub>j</sub> respectively), defined as summed Dirac delta functions δ(·) where t<sup>f</sup> represents the times of spikes:


$$S_i(t) = \sum_{t_i^f} \delta(t - t_i^f) \qquad S_j(t) = \sum_{t_j^f} \delta(t - t_j^f) \tag{2}$$

These spike trains are then smoothed using exponentially weighted moving averages to calculate the Z traces:

$$\tau_{z_i} \frac{dZ_i}{dt} = \frac{S_i}{f_{max} \Delta t} - Z_i \qquad \tau_{z_j} \frac{dZ_j}{dt} = \frac{S_j}{f_{max} \Delta t} - Z_j \tag{3}$$

Here, the maximum allowed firing rate f<sub>max</sub> and spike duration Δt = 1 ms combine with the lowest attainable probability estimate ε = 1000/(f<sub>max</sub>τ<sub>p</sub>) introduced in Equation (5) to maintain a linear mapping from neuronal spike rates to probabilities. For more details on the Bayesian transformation entailed by these equations, see Tully et al. (2014). The Z trace time constants τ<sub>z<sub>i</sub></sub> and τ<sub>z<sub>j</sub></sub> determine the time scale over which correlations can be detected and are inspired by fast biological processes such as Ca<sup>2+</sup> influx via NMDA receptors or voltage-gated Ca<sup>2+</sup> channels. The Z traces are then fed into the P traces, where a coactivity term is introduced:

$$\tau_p \frac{dP_i}{dt} = Z_i - P_i \qquad \tau_p \frac{dP_{ij}}{dt} = Z_i Z_j - P_{ij} \qquad \tau_p \frac{dP_j}{dt} = Z_j - P_j \tag{4}$$

The P trace time constant τ<sub>p</sub> models long-term memory storage events such as gene expression or protein synthesis. It can be set higher to more realistically match these slow processes, but since simulation time increases with higher τ<sub>p</sub> values, in this work we keep them just long enough to preserve the relevant dynamics. Estimated levels of activity in the P traces are then combined to compute a postsynaptic bias membrane current I<sub>β<sub>j</sub></sub> and synaptic weight between pre- and postsynaptic neurons w<sub>ij</sub>:

$$I_{\beta_j} = \beta_{gain} \log(P_j + \epsilon) \qquad w_{ij} = w_{gain}^{syn} \log \frac{P_{ij} + \epsilon^2}{(P_i + \epsilon)(P_j + \epsilon)} \tag{5}$$

Here, β<sub>gain</sub> is used to scale the BCPNN bias into an intrinsic input current to the neuron which is used to model an A-type K<sup>+</sup> channel (Jung and Hoffman, 2009) or other channel capable of modifying the intrinsic excitability of a neuron (Daoudal and Debanne, 2003). Similarly, w<sup>syn</sup><sub>gain</sub> is used to scale the BCPNN weight into a current-based synaptic strength.
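To make the flow from spikes to weights concrete, the following is a minimal time-driven sketch of Equations (3)–(5) using forward-Euler integration. It assumes the simulation time step equals the spike duration Δt = 1 ms and that spikes are given as 0 or 1 per step; the function names and default values are illustrative and do not describe the SpiNNaker or NEST implementations.

```python
import math


def bcpnn_euler_step(z_i, z_j, p_i, p_j, p_ij, s_i, s_j,
                     f_max=20.0, tau_zi=5.0, tau_zj=5.0, tau_p=2000.0, dt=1.0):
    """Advance the traces by one step. Times in ms, f_max in Hz, s_i/s_j in {0, 1}."""
    scale = f_max * dt / 1000.0              # f_max * dt as a dimensionless number
    z_i += dt / tau_zi * (s_i / scale - z_i)  # Equation (3), presynaptic
    z_j += dt / tau_zj * (s_j / scale - z_j)  # Equation (3), postsynaptic
    p_i += dt / tau_p * (z_i - p_i)           # Equation (4)
    p_j += dt / tau_p * (z_j - p_j)
    p_ij += dt / tau_p * (z_i * z_j - p_ij)
    return z_i, z_j, p_i, p_j, p_ij


def bcpnn_weight_and_bias(p_i, p_j, p_ij, eps, w_gain=1.0, beta_gain=1.0):
    """Synaptic weight and intrinsic bias current from the P traces (Equation 5)."""
    weight = w_gain * math.log((p_ij + eps ** 2) / ((p_i + eps) * (p_j + eps)))
    bias = beta_gain * math.log(p_j + eps)
    return weight, bias
```

Driving both neurons with the same regular 20 Hz spike train brings P<sub>i</sub> close to 1 and yields a positive weight, whereas silencing the postsynaptic neuron yields a negative weight. Every trace is touched on every step, which is the cost that an event-driven formulation avoids.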

### 2.3. Neuronal Model

We model excitatory and inhibitory cells as integrate-and-fire (IAF) neurons with exponentially decaying PSCs (Liu and Wang, 2001; Rauch et al., 2003). The sub-threshold membrane voltage V<sub>m</sub> of these neurons evolves according to:

$$\tau_m \frac{dV_m}{dt} = -V_m + R_m \left(I_s + I_a + I_{\beta_j}\right) \tag{6}$$

The membrane time constant τ<sub>m</sub> and capacitance C<sub>m</sub> determine the input resistance R<sub>m</sub> = τ<sub>m</sub>/C<sub>m</sub> through which input currents from the afferent synapses (I<sub>s</sub>), the spike-frequency adaption mechanism (I<sub>a</sub>) and the intrinsic input current from the BCPNN learning rule (I<sub>β<sub>j</sub></sub>, described in Section 2.2) are applied. When V<sub>m</sub> reaches the threshold V<sub>t</sub> a spike is emitted, V<sub>m</sub> is reset to V<sub>r</sub> and α is added to the adaption current I<sub>a</sub>. We use Liu and Wang's (2001) model of spike-frequency adaption with the adaption time constant τ<sub>a</sub>:

$$\tau_a \frac{dI_a}{dt} = -I_a \tag{7}$$

The synaptic input current to postsynaptic neuron j (I<sub>s<sub>j</sub></sub>) is modeled as a sum of exponentially shaped PSCs from other presynaptic neurons in the network:

$$\tau_{syn} \frac{dI_{s_j}}{dt} = -I_{s_j} + \sum_{syn} \sum_{i=0}^{n} w_{ij}^{syn} \sum_{t_i^f} \delta(t - t_i^f) \tag{8}$$

w<sup>syn</sup><sub>ij</sub> indicates the weight of the connection between neurons i and j [where syn ∈ (AMPA, GABA, NMDA) denotes the synapse type], t<sup>f</sup><sub>i</sub> represents the arrival time of spikes from presynaptic neuron i (where there are n neurons in the network), and τ<sub>syn</sub> is the synaptic time constant.
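The neuron dynamics of Equations (6) and (7) can be sketched with a simple forward-Euler loop driven by a constant synaptic current. All parameter values below are hypothetical placeholders chosen only so that the example fires; they are not the values used in this work. The sketch also assumes that α is negative so that, given the sign convention of Equation (6), each spike reduces the net input current.

```python
def simulate_iaf(i_s, n_steps, dt=0.1, tau_m=20.0, c_m=0.25, v_t=20.0, v_r=0.0,
                 tau_a=300.0, alpha=0.0, i_beta=0.0):
    """Return spike times (ms) of an IAF neuron with spike-frequency adaption.

    Units: ms, nF, nA and mV, so that R_m = tau_m / c_m is in megaohms.
    """
    r_m = tau_m / c_m
    v, i_a = v_r, 0.0
    spike_times = []
    for step in range(n_steps):
        v += dt / tau_m * (-v + r_m * (i_s + i_a + i_beta))  # Equation (6)
        i_a += dt / tau_a * (-i_a)                           # Equation (7)
        if v >= v_t:
            spike_times.append(step * dt)
            v = v_r          # reset membrane voltage
            i_a += alpha     # step the adaption current on each spike
    return spike_times
```

For example, `simulate_iaf(0.5, 10000)` fires regularly, while `simulate_iaf(0.5, 10000, alpha=-0.05)` fires less often, with inter-spike intervals that lengthen as the adaption current builds up.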

### 2.4. Simulating Spiking Neural Networks on SpiNNaker

SpiNNaker is a digital neuromorphic architecture designed for the simulation of spiking neural networks. Although systems built using this architecture are available in sizes ranging from single boards to room-size machines, they all share the same basic building block: the SpiNNaker chip (Furber et al., 2014). Each of these chips is connected to its six immediate neighbors using a chip-level interconnection network with a hexagonal mesh topology. Each SpiNNaker chip contains 18 ARM cores connected to each other through a network-on-chip, and connected to an external network through a multicast router. Each core has two small tightly-coupled memories: 32 KiB for instructions and 64 KiB for data; and shares 128 MiB of off-chip SDRAM with the other cores on the same chip. Although this memory hierarchy is somewhat unusual, the lack of global shared memory means that many of the problems of simulating large spiking neural networks on a SpiNNaker system are shared with more typical distributed computer systems. Morrison et al. (2005) and Kunkel et al. (2012) developed a collection of approaches for mapping such networks onto large distributed systems in a memory-efficient manner while still obtaining supralinear speed-up as the number of processors increases. The SpiNNaker neural simulation kernel employs a very similar approach where, as shown in **Figure 1**, each processing core is responsible for simulating between 100 and 1000 neurons and their afferent synapses. The neurons are simulated using a time-driven approach with their state held in the tightly-coupled data memory. Each neuron is assigned a 32 bit ID and, when a simulation step results in a spike, it sends a packet containing this ID to the SpiNNaker router. These "spike" packets are then routed across the network fabric to the cores that are responsible for simulating these synapses.

Biological neurons have on the order of 10<sup>3</sup> – 10<sup>4</sup> afferent synapses, so updating all of these every time step would be extremely computationally intensive. Instead, as individual synapses only receive spikes at relatively low rates, they can be updated only when they transfer a spike as long as their new state can be calculated from:


FIGURE 1 | Mapping of a spiking neural network to SpiNNaker. For example a network consisting of 12 neurons is distributed between two SpiNNaker cores. Each core is responsible for simulating six neurons (filled circles) and holds a list of afferent synapses (non-filled circles) associated with each neuron in the network. The SpiNNaker router routes spikes from firing neurons (filled circles) to the cores responsible for simulating the neurons to which they make efferent synaptic connections.

Using this event-driven approach on SpiNNaker is also advantageous as, due to their sheer number, synapses need to be stored in the off-chip SDRAM which has insufficient bandwidth for every synapse's parameters to be retrieved every simulation time step (Painkras and Plana, 2013). Instead, on receipt of a "spike" packet, a core retrieves the row of the connectivity matrix associated with the firing neuron from SDRAM. Each of these rows describes the parameters associated with the synapses connecting the firing neuron to those simulated on the core. Once a row is retrieved the weights are inserted into an input ring-buffer, where they remain until any synaptic delay has elapsed and they are applied to the neuronal input current.
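The input ring-buffer can be illustrated with a small sketch in which each slot holds the summed weights due to arrive at one future time step. The class name, method names and list-based storage are assumptions made for illustration and do not reflect the layout used by the SpiNNaker kernel.

```python
class InputRingBuffer:
    """Accumulates synaptic weights until their delay (in time steps) has elapsed."""

    def __init__(self, n_neurons, max_delay_steps):
        self.n_neurons = n_neurons
        self.size = max_delay_steps + 1
        self.slots = [[0.0] * n_neurons for _ in range(self.size)]
        self.now = 0  # index of the slot belonging to the current time step

    def add_weight(self, weight, neuron, delay_steps):
        """Schedule a weight to reach the neuron delay_steps from now."""
        if not 1 <= delay_steps < self.size:
            raise ValueError("delay outside the range covered by the buffer")
        self.slots[(self.now + delay_steps) % self.size][neuron] += weight

    def advance(self):
        """Return the input due in the current step, clear its slot and move on."""
        due = self.slots[self.now]
        self.slots[self.now] = [0.0] * self.n_neurons
        self.now = (self.now + 1) % self.size
        return due
```

A weight added with a delay of two steps is returned by the third call to `advance()`, and weights scheduled for the same neuron and step are summed.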

In addition to enabling large-scale simulations with static synapses, this event-driven approach can, in theory, be extended to handle any type of plastic synapse that meets the three criteria outlined above. However, simulating plastic synapses has additional overheads in terms of memory and CPU load, both of which are very limited resources on SpiNNaker. Several different approaches have been previously developed that aim to minimize memory usage (Jin et al., 2010), reduce CPU load (Diehl and Cook, 2014) or offload the processing of plastic synapses to dedicated cores (Galluppi et al., 2015). Morrison et al. (2007) also extended their work on distributed spiking neural network simulation to include synaptic plasticity, developing an algorithm for simulating plastic synapses in an event-driven manner, using a simplified model of synaptic delay to reduce CPU and memory usage. In this work, we combine elements of Diehl and Cook's (2014) and Morrison et al.'s (2007) approaches, resulting in **Algorithm 1** which is called whenever the connectivity matrix row associated with an incoming "spike" packet is retrieved from the SDRAM. As well as the weights of the synapses connecting the presynaptic neuron to the postsynaptic neurons simulated on the local core (w<sub>ij</sub>), the row also has a header containing the time at which the presynaptic neuron last spiked (t<sub>old</sub>) and its state at that time (s<sub>i</sub>). The exact contents of the state depends on the plasticity rule being employed, but as only the times of presynaptic spikes are available at the synapse, the state often consists of one or more low-pass filtered versions of this spike train.

**Algorithm 1**

**function** processRow(t)

**for each** j **in** postSynapticNeurons **do**

history ← getHistoryEntries(j, t<sub>old</sub>, t)

**for each** (t<sub>j</sub>, s<sub>j</sub>) **in** history **do**

w<sub>ij</sub> ← applyPostSpike(w<sub>ij</sub>, t<sub>j</sub>, t<sub>old</sub>, s<sub>i</sub>)

**end for**

(t<sub>j</sub>, s<sub>j</sub>) ← getLastHistoryEntry(t)

w<sub>ij</sub> ← applyPreSpike(w<sub>ij</sub>, t, t<sub>j</sub>, s<sub>j</sub>)

addWeightToRingBuffer(w<sub>ij</sub>, j)

**end for**

s<sub>i</sub> ← addPreSpike(s<sub>i</sub>, t, t<sub>old</sub>)

t<sub>old</sub> ← t

**end function**

The algorithm begins by looping through each postsynaptic neuron (j) in the row and retrieving a list of the times (t<sub>j</sub>) at which that neuron spiked between t<sub>old</sub> and t and its state at that time (s<sub>j</sub>). In the SpiNNaker implementation, these times and states are stored in a fixed-length circular queue located in the tightly-coupled data memory to which a new entry gets added whenever a local neuron fires. Next, the effect of the interaction between these postsynaptic spikes and the presynaptic spike that occurred at t<sub>old</sub> is applied to the synapse using the applyPostSpike function. The synaptic update is then completed by applying the effect of the interaction between the presynaptic spike that instigated the whole process and the most recent postsynaptic spike to the synapse using the applyPreSpike function before adding this weight to the input ring buffer. Finally, the header of the row is updated by calling the addPreSpike function to update s<sub>i</sub> and setting t<sub>old</sub> to the current time.
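The row-processing steps described above can be sketched in Python as follows. To keep the example runnable, the plasticity rule is a hypothetical additive pair-based rule with a single exponentially decaying trace per neuron. The constants, the `Row` class, the dictionary used in place of the ring buffer and the choice to treat the interval as t<sub>old</sub> < t<sub>j</sub> ≤ t are all assumptions for illustration; synaptic delays are ignored.

```python
import math

TAU = 20.0                      # ms, illustrative trace time constant
A_PLUS, A_MINUS = 0.01, 0.012   # illustrative learning rates


def decay(trace, dt):
    return trace * math.exp(-dt / TAU)


def add_spike(trace, t, t_last):
    """Low-pass filtered spike train: decay since the last spike, then add one."""
    return decay(trace, t - t_last) + 1.0


def apply_post_spike(w, t_j, t_old, s_i):
    # Postsynaptic spike at t_j interacts with the presynaptic trace from t_old
    return w + A_PLUS * decay(s_i, t_j - t_old)


def apply_pre_spike(w, t, t_j, s_j):
    # Presynaptic spike at t interacts with the postsynaptic trace from t_j
    return w - A_MINUS * decay(s_j, t - t_j)


class Row:
    """One connectivity-matrix row: header (t_old, s_i) plus per-target weights."""

    def __init__(self, weights):
        self.weights = dict(weights)  # postsynaptic index -> weight
        self.t_old = 0.0
        self.s_i = 0.0


def process_row(row, t, post_history, ring_buffer):
    """post_history maps j to a time-ordered list of (t_j, s_j) entries."""
    for j in row.weights:
        w = row.weights[j]
        history = [(t_j, s_j) for (t_j, s_j) in post_history[j]
                   if row.t_old < t_j <= t]
        for t_j, s_j in history:
            w = apply_post_spike(w, t_j, row.t_old, row.s_i)
        if post_history[j]:
            t_j, s_j = post_history[j][-1]   # most recent postsynaptic spike
            w = apply_pre_spike(w, t, t_j, s_j)
        row.weights[j] = w
        ring_buffer[j] = ring_buffer.get(j, 0.0) + w
    row.s_i = add_spike(row.s_i, t, row.t_old)
    row.t_old = t
```

Because the update only runs when a presynaptic spike arrives, the cost per synapse scales with the spike rate rather than with the number of simulation time steps.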

### 2.5. An Event-based, SpiNNaker Implementation of Bayesian Learning

Equations (3)–(5) cannot be directly evaluated within the event-driven synaptic processing scheme outlined in Section 2.4, but as they are simple first-order linear ODEs, they can be solved to obtain closed-form solutions for Z(t) and P(t). These equations then need only be evaluated when spikes occur. Vogginger et al. (2015) converted this resultant system of equations into a spike-response model (Gerstner and Kistler, 2002) which, as it only consists of linear combinations of e<sup>−t/τ<sub>z</sub></sup> and e<sup>−t/τ<sub>p</sub></sup>, can be re-framed into a new set of state variables Z<sup>∗</sup><sub>i</sub>, Z<sup>∗</sup><sub>j</sub>, P<sup>∗</sup><sub>i</sub>, P<sup>∗</sup><sub>j</sub>, and P<sup>∗</sup><sub>ij</sub>. These, like the state variables used in many STDP models, are simply low-pass filtered versions of the spike-trains and can be evaluated when a spike occurs at time t:

$$Z_i^*(t) = Z_i^*(t^{last}) e^{-\frac{\Delta t}{\tau_{z_i}}} + S_i(t) \qquad P_i^*(t) = P_i^*(t^{last}) e^{-\frac{\Delta t}{\tau_p}} + S_i(t) \tag{9}$$

Here Δt = t − t<sup>last</sup> is the time elapsed since the state variable was last updated. Z<sup>∗</sup> and P<sup>∗</sup> can now be stored in the pre- and postsynaptic state (s<sub>i</sub> and s<sub>j</sub>) and updated in the addPreSpike function called from Algorithm 1, and when postsynaptic neurons fire. The correlation trace P<sub>ij</sub> can similarly be re-framed in terms of a new state variable:

$$P_{ij}^*(t) = P_{ij}^*(t^{last}) e^{-\frac{\Delta t}{\tau_p}} + S_i(t) Z_j^*(t) \tag{10}$$

P<sup>∗</sup><sub>ij</sub> can now be stored alongside the synaptic weight w<sub>ij</sub> in each synapse and evaluated in the applyPreSpike and applyPostSpike functions called from Algorithm 1. The final stage of the event-based implementation is to obtain the P<sub>i</sub>, P<sub>j</sub> and P<sub>ij</sub> values required to evaluate Equation (5) from the new state variables and thus obtain w<sub>ij</sub> and β<sub>j</sub>:

$$P_i(t) = a_i \left( Z_i^*(t) - P_i^*(t) \right) \tag{11}$$

$$P_{ij}(t) = a_{ij} \left( Z_i^*(t) Z_j^*(t) - P_{ij}^*(t) \right) \tag{12}$$

With the following coefficients used for brevity:

$$\begin{aligned} \tau_{z_{ij}} &= \left(\frac{1}{\tau_{z_i}} + \frac{1}{\tau_{z_j}}\right)^{-1} & a_i &= \frac{1}{f_{max}\left(\tau_{z_i} - \tau_p\right)} \\ a_{ij} &= \frac{1}{f_{max}^{2}\left(\tau_{z_j} + \tau_{z_i}\right)\left(\tau_{z_{ij}} - \tau_p\right)} \end{aligned} \tag{13}$$
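As a hedged check of Equations (9), (11), and (13), the sketch below evaluates P<sub>i</sub> for a single neuron in two ways: by updating Z<sup>∗</sup><sub>i</sub> and P<sup>∗</sup><sub>i</sub> only at spike times, and by integrating the underlying ODEs with small Euler steps. It assumes times in seconds and a convention in which each spike increments Z<sub>i</sub> by 1/(f<sub>max</sub>τ<sub>z<sub>i</sub></sub>), which is the scaling under which a<sub>i</sub> has the form given in Equation (13). The function names are illustrative.

```python
import math


def p_event_driven(spike_times, t_query, f_max, tau_z, tau_p):
    """P_i(t_query) from Z* and P*, which are touched only at spike times."""
    z_star = p_star = 0.0
    t_last = 0.0
    for t in spike_times:                      # sorted, all <= t_query
        dt = t - t_last
        z_star = z_star * math.exp(-dt / tau_z) + 1.0   # Equation (9)
        p_star = p_star * math.exp(-dt / tau_p) + 1.0
        t_last = t
    dt = t_query - t_last                      # decay up to the query time
    z_star *= math.exp(-dt / tau_z)
    p_star *= math.exp(-dt / tau_p)
    a_i = 1.0 / (f_max * (tau_z - tau_p))      # Equation (13)
    return a_i * (z_star - p_star)             # Equation (11)


def p_time_driven(spike_times, t_query, f_max, tau_z, tau_p, h=1e-4):
    """Reference value from forward-Euler integration with step h."""
    spike_steps = {round(t / h) for t in spike_times}
    z = p = 0.0
    for step in range(round(t_query / h)):
        if step in spike_steps:
            z += 1.0 / (f_max * tau_z)         # assumed per-spike increment
        p += h / tau_p * (z - p)
        z -= h / tau_z * z
    return p
```

With three spikes in one second of simulated time, the event-driven version performs four exponential updates per trace, whereas the time-driven reference performs ten thousand Euler steps to reach approximately the same value.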

This approach makes implementing spike-based BCPNN on SpiNNaker feasible from an algorithmic point of view, but limitations of the SpiNNaker architecture further complicate the problem. The most fundamental of these limitations is that, as Moise (2012, p. 20) explains, for reasons of silicon area and energy efficiency, SpiNNaker has no hardware floating point unit. While floating point operations can be emulated in software, this comes at a significant performance cost meaning that performance-critical SpiNNaker software needs to instead use fixed-point arithmetic. Hopkins and Furber (2015) discussed the challenges of using fixed-point arithmetic for neural simulation on the SpiNNaker platform in detail but, in the context of this work, there are two main issues of particular importance. Firstly the range of fixed-point numeric representations is static so, to attain maximal accuracy, the optimal representation for storing each state variable must be chosen ahead of time. Vogginger et al. (2015) investigated the use of fixed-point types for BCPNN as a means of saving memory and calculated that, in order to match the accuracy of a time-driven floating point implementation, a fixed-point format with 10 integer and 12 fractional bits would be required. However, not only is the model described in Section 2.2 somewhat different from the reduced modular model considered by Vogginger et al. (2015), but the ARM architecture only allows 8, 16, or 32 bit types to be natively addressed. Therefore, we reevaluated these calculations for the SpiNNaker implementation and chose to use 16 bit types for two reasons:


Given a total of 16 bits, the number of bits used for the integer and fractional parts of the fixed-point representation needs to be determined based on the range of the state variables. As all of the Z<sup>∗</sup> and P<sup>∗</sup> state variables are linear sums of exponential spike responses and P<sup>∗</sup> has the largest time constant, it decays slowest, meaning that it will reach the highest value. Therefore we can calculate the maximum value which our fixed-point format must be able to represent in order to handle a maximum spike frequency of f<sub>max</sub> as follows:

$$P_{max}^* = \frac{1}{1 - e^{-\frac{1}{f_{max} \times \tau_p}}} \tag{14}$$

In order to match the firing rates of pyramidal cells commonly observed in cortex, low values of the maximum firing rate (f<sub>max</sub>, e.g., 20 or 50 Hz) are often used with the BCPNN model described in Section 2.2. On this basis, by using a signed fixed-point format with 6 integer and 9 fractional bits, if f<sub>max</sub> = 20 Hz, traces with τ<sub>p</sub> < 3.17 s can be represented and, if f<sub>max</sub> = 50 Hz, traces with τ<sub>p</sub> < 1.27 s can be represented.
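The quoted limits on τ<sub>p</sub> can be reproduced by inverting Equation (14) for the largest value that the fixed-point format can hold. This is a minimal sketch with illustrative function names; it takes the largest representable value of a signed format with 6 integer and 9 fractional bits to be 2<sup>6</sup> − 2<sup>−9</sup>.

```python
import math


def p_star_max(f_max_hz, tau_p_s):
    """Largest value reached by P* at a sustained rate of f_max (Equation 14)."""
    return 1.0 / (1.0 - math.exp(-1.0 / (f_max_hz * tau_p_s)))


def largest_representable(integer_bits, fractional_bits):
    """Largest positive value of a signed fixed-point format (sign bit excluded)."""
    return 2 ** integer_bits - 2.0 ** -fractional_bits


def max_tau_p(f_max_hz, integer_bits=6, fractional_bits=9):
    """Largest tau_p (s) for which P*_max still fits in the format."""
    limit = largest_representable(integer_bits, fractional_bits)
    # Solve limit = 1 / (1 - exp(-1 / (f_max * tau_p))) for tau_p
    return -1.0 / (f_max_hz * math.log(1.0 - 1.0 / limit))
```

Evaluating `max_tau_p(20.0)` and `max_tau_p(50.0)` gives values that agree with the 3.17 s and 1.27 s limits above to within rounding.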

The second problem caused by the lack of floating point hardware is that there is no standard means of calculating transcendental functions for fixed-point arithmetic. This means that the exponential and logarithm functions required to implement BCPNN must be implemented by other means. While it is possible to implement approximations of these functions using, for instance, a Taylor series, the resultant functions are likely to take on the order of 100 CPU cycles to evaluate (Moise, 2012), making them too slow for use in the context of BCPNN where around ten of these operations will be performed every time a spike is transferred. Another approach is to use precalculated lookup tables (LUTs). These are particularly well suited to implementing functions such as e<sup>−t/τ</sup> where t is discretized to simulation time steps and, for small values of τ, the function decays to 0 after only a small number of table entries. While the log(x) function has neither of these ideal properties, x can be normalized into the form x = y × 2<sup>n</sup> with n ∈ ℤ and y ∈ [1, 2), so a LUT is only required to cover the interval [1, 2) within which log(x) is relatively linear.
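Both lookup-table strategies can be sketched with Python integers standing in for fixed-point values. The 9 fractional bits follow the format discussed above, while the 64-entry logarithm table, the absence of interpolation and all names are assumptions made for illustration rather than details of the SpiNNaker implementation.

```python
import math

FRAC_BITS = 9
ONE = 1 << FRAC_BITS            # fixed-point representation of 1.0
LOG_LUT_BITS = 6                # assumed table size: 64 entries over [1, 2)
LOG_LUT = [round(math.log(1.0 + k / 2 ** LOG_LUT_BITS) * ONE)
           for k in range(2 ** LOG_LUT_BITS)]
LN2_FIXED = round(math.log(2.0) * ONE)


def make_exp_lut(tau_steps):
    """Table of exp(-t / tau) for t = 0, 1, ... steps until it rounds to zero."""
    lut = []
    t = 0
    while True:
        value = round(math.exp(-t / tau_steps) * ONE)
        if value == 0:
            return lut
        lut.append(value)
        t += 1


def exp_decay(lut, value_fixed, dt_steps):
    """Multiply a fixed-point value by exp(-dt / tau) using the table."""
    if dt_steps >= len(lut):
        return 0                # fully decayed beyond the end of the table
    return (value_fixed * lut[dt_steps]) >> FRAC_BITS


def log_fixed(x_fixed):
    """Natural log of a positive fixed-point value, in the same format."""
    if x_fixed <= 0:
        raise ValueError("log is only defined for positive values")
    n = x_fixed.bit_length() - 1            # x = y * 2**(n - FRAC_BITS), y in [1, 2)
    if n >= LOG_LUT_BITS:
        index = (x_fixed >> (n - LOG_LUT_BITS)) & (2 ** LOG_LUT_BITS - 1)
    else:
        index = (x_fixed << (LOG_LUT_BITS - n)) & (2 ** LOG_LUT_BITS - 1)
    return LOG_LUT[index] + (n - FRAC_BITS) * LN2_FIXED
```

With τ equal to 5 time steps the exponential table needs only a few dozen entries, and the logarithm costs one bit scan, one table read and one multiply-add regardless of the magnitude of x.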

## 3. RESULTS

### 3.1. Validating BCPNN Learning on SpiNNaker with Previous Implementations

In this section we demonstrate that the implementation of BCPNN we describe in Section 2.5 produces connection weights and intrinsic excitabilities comparable to those learned by previous models. To do this we used the procedure developed by Tully et al. (2014) and the network described in **Table 1** to compare two neurons, connected with a BCPNN synapse, modeled using both our spiking BCPNN implementation and as abstract units with simple, exponentially smoothed binary activation patterns (Sandberg et al., 2002). We performed this comparison by presenting the neurons with five patterns of differing relative activations, each repeated for ten consecutive 200 ms trials. Correlated patterns meant both neurons were firing at f<sub>max</sub> Hz or ε Hz each trial; independent patterns meant uniform sampling of f<sub>max</sub> Hz and ε Hz patterns for both neurons in each trial; anti-correlated patterns meant one neuron fired at f<sub>max</sub> Hz and the other at ε Hz or vice-versa in each trial; both muted meant both neurons fired at ε Hz in all trials; and post muted meant uniform sampling of presynaptic neuron activity while the postsynaptic neuron fired at ε Hz in all trials.

As **Figure 2** shows, during the presentation of patterns in which both units are firing, the responses from the abstract model fall well within the standard deviation of the SpiNNaker model's responses, but as units are muted, the two models begin to diverge. Further investigation into the behavior of the individual state variables shows that this is due to the P<sup>∗</sup> term of Equation (11) coming close to underflowing the 16 bit fixed-point format when a long time has passed since the last spike. This inaccuracy in the P<sup>∗</sup> term is then further amplified when the weights and intrinsic excitabilities are calculated using Equation (5) as, for small values of x, log(x) approaches its vertical asymptote. The standard deviations visible in **Figure 2** reflect the fact that for the spiking learning rule, the realization of Poisson noise that determined firing rates was different for each trial, but with a rate modulation that was repeated across trials.

### 3.2. Learning Sequential Attractors using Spike-Based BCPNN

In this section we consider a functional use case of the modular attractor network described in Section 2.1 involving learning temporal sequences of attractors. With asymmetrical BCPNN time constants, it was previously proposed that this network could self-organize spontaneously active sequential attractor trajectories (Tully et al., 2014). We built a suitable network using the neuron and plasticity models described in Sections 2.2, 2.3; and the parameters listed in **Table 2**. Using this network we employed a training regime (a subset of which is shown in **Figure 3A**) in which we repeatedly stimulated all cells in a mutually exclusive sequence of minicolumns for 50 training epochs. Each minicolumn was stimulated for 100 ms, such that the neurons within it fired at an average rate of f<sub>max</sub> Hz. During training we disabled the term in Equation (8) that incorporates input from the plastic AMPA and NMDA synapses meaning that, while the weights were learned online, the dynamics of the network did not disturb the training regime. A recall phase followed this learning phase in which a 50 ms stimulus of f<sub>max</sub> Hz was applied to all cells in the first minicolumn of the learned sequence. During both the training and recall phases we provided background input to each cell in the network from an independent 65 Hz Poisson spike source. These Poisson spike sources are simulated on additional SpiNNaker cores to those used for the neural simulation algorithm described in Section 2.4.

#### TABLE 1 | Model description of the BCPNN validation network.

(A) Model summary; (B) Populations; (C) Connectivity; (D) Neuron and synapse model.

*After Nordlie et al. (2009).*

versions of the learning rule. SpiNNaker simulations were repeated 10 times and averaged, with standard deviations illustrated by the shaded regions.

We found that the training regime was able to produce the recurrent connectivity required to perform temporal sequence recall in the same serial order that patterns were presented during training as shown in **Figure 3B**. Each sequence element replayed as a learned attractor state that temporarily stifled the activity of all other cells in the network due to WTA and asymmetrically projected NMDA toward neurons of the subsequent sequence element, allowing a stable trajectory to form. Activity within attractor states was sharpened and stabilized by learned auto-associative AMPA connectivity; and sequential transitions were jointly enabled by neural adaptation and inter-pattern heteroassociation via NMDA synapses.

#### TABLE 2 | Parameters for the modular attractor network.

(B) Neuron and synapse model

(C) Plasticity

| Type | Parameter | Value | Description |
|---|---|---|---|
| BCPNN AMPA synapses as described in Section 2.2 | f<sub>max</sub> | 20 Hz | Maximum spiking frequency |
| | τ<sub>z<sub>i</sub></sub> | 5 ms | Presynaptic primary trace time constant |
| | τ<sub>z<sub>j</sub></sub> | 5 ms | Postsynaptic primary trace time constant |
| | τ<sub>p</sub> | 2000 ms | Probability trace time constant |
| | w<sup>syn</sup><sub>gain</sub> | 0.546/N<sub>HC</sub> nA | Weight gain |
| BCPNN NMDA synapses as described in Section 2.2 | f<sub>max</sub> | 20 Hz | Maximum spiking frequency |
| | τ<sub>z<sub>i</sub></sub> | 5 ms | Presynaptic primary trace time constant |
| | τ<sub>z<sub>j</sub></sub> | 150 ms | Postsynaptic primary trace time constant |
| | τ<sub>p</sub> | 2000 ms | Probability trace time constant |
| | w<sup>syn</sup><sub>gain</sub> | 0.114/N<sub>HC</sub> nA | Weight gain |
| | β<sub>gain</sub> | 0.05 nA | Intrinsic bias gain |

(D) Input

*After Nordlie et al. (2009).*

Because of the modular structure of the network described in Section 2.1, this temporal sequence learning can be performed using networks of varying scales by instantiating different numbers of hypercolumns and linearly scaling the w<sup>syn</sup><sub>gain</sub> parameter of the connections between them. By doing this, we investigated how the time taken to simulate the network on SpiNNaker scales with network size. **Figure 4** shows how these times are split between the training and testing phases; and how long is spent generating data on the host computer, transferring it to and from SpiNNaker and actually running the simulation. As the SpiNNaker simulation always runs at a fixed fraction of real-time (for this simulation 0.5×), the simulation time remains constant as the network grows, but the times required to generate the data and to transfer it grow significantly, meaning that when N<sub>HC</sub> = 16 (2.0 × 10<sup>4</sup> neurons and 5.1 × 10<sup>7</sup> plastic synapses), the total simulation time is 146 min. However, the amount of time spent in several phases of the simulation is increased by limitations of the current SpiNNaker toolchain. 84 min is spent downloading the learned weight matrices and re-uploading them for the testing, a process that is only required because the changing of parameters (in this case, whether learning is enabled or not) mid-simulation is not currently supported. Additionally, the current implementation of the algorithm outlined in Section 2.4 only allows neurons simulated on one core to have afferent synapses with a single learning rule configuration. This means that we have to run the training regime twice with the same input spike trains, once for the AMPA synapses and once for the NMDA synapses, doubling the time taken to simulate the training network.
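The network sizes quoted for N<sub>HC</sub> = 16 follow from the connectivity described in Section 2.1, as this short sketch shows. It computes the expected number of plastic synapses, assuming that each ordered pair of pyramidal cells (including self-pairs) is connected with 10 % probability for each of the two plastic synapse types (AMPA and NMDA). The function name and keyword defaults are illustrative.

```python
def network_size(n_hc, pyramidal_per_hc=1000, basket_per_hc=250,
                 p_connect=0.1, n_plastic_types=2):
    """Return (total neurons, expected number of plastic synapses)."""
    n_pyramidal = n_hc * pyramidal_per_hc
    neurons = n_hc * (pyramidal_per_hc + basket_per_hc)
    # All-to-all candidate pairs among pyramidal cells, thinned by p_connect,
    # once for AMPA and once for NMDA
    plastic_synapses = n_plastic_types * p_connect * n_pyramidal ** 2
    return neurons, plastic_synapses
```

For `network_size(16)` this gives 20,000 neurons and roughly 5.12 × 10<sup>7</sup> plastic synapses, consistent with the figures above.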

Previous supercomputer simulations of modular attractor memory networks have often used more complex neuron models and connectivity (Lundqvist et al., 2010), making simulation times difficult to compare with our SpiNNaker simulation due to the simplifications we outlined in Section 2.1. In order to present a better comparison, we built a network model with the same connectivity as our SpiNNaker model and simulated it on a Cray XC-30 supercomputer system using NEST version 2.2 (Gewaltig and Diesmann, 2007) with the spike-based BCPNN implementation developed by Tully et al. (2014). NEST does not include the adaptive neuron model we described in Section 2.3, so we used the adaptive exponential model (Brette and Gerstner, 2005), a simple point neuron model with spike-frequency adaptation.

As previously discussed, SpiNNaker runs at a fixed fraction of real time, so we distributed our NEST simulations across increasing numbers of Cray XC-30 compute nodes (each consisting of two 2.5 GHz Intel Ivy Bridge Xeon processors) until each simulation completed in the same time as the corresponding SpiNNaker simulation shown in **Figure 4**. **Table 3** shows the results of both these supercomputer simulations and a second set in which the time taken for the mid-simulation downloading and re-uploading of weights (currently required by the SpiNNaker software) has been removed. Because of this redundant step, and because NEST parallelizes the generation of simulation data across the compute nodes, our modular attractor network can be simulated using 2 compute nodes at all three scales. However, if we remove the time spent downloading and re-uploading the weights, 9 compute nodes are required to match the run-time of the SpiNNaker simulation when NHC = 16.

While a more in-depth measurement of power usage is outside the scope of this work, we can derive approximate figures for the power usage of our simulations on both systems based on the 1 W peak power usage of the SpiNNaker chip and the 30 kW power usage of a Cray XC-30 compute rack (Cray, 2013). These figures ignore the power consumed by the host computer connected to the SpiNNaker system and by the "blower" and storage cabinets connected to the Cray XC-30, and they assume that all CPUs are running at peak power usage. Nevertheless, they show that even in the worst case, SpiNNaker uses 45× less power than the Cray XC-30 and, if the limitations of the current SpiNNaker software are addressed, this can be improved to 200×.

### 3.3. Connectivity Patterns Show Different Signatures in Membrane Potentials

The purpose of this section is to study how learning parameters influence the resulting connectivity patterns, and how the learned connectivity affects membrane dynamics during sequence replay. For this purpose we vary two parameters of the learning rule that control the time window within which correlations are detected: τ<sub>zi</sub> on the presynaptic side and τ<sub>zj</sub> on the postsynaptic side. The network is trained using the same regime described in Section 3.2 in two different configurations, one with τ<sub>zi</sub> = τ<sub>zj</sub> on NMDA synapses, and one with τ<sub>zi</sub> ≠ τ<sub>zj</sub>. If τ<sub>zi</sub> and τ<sub>zj</sub> are equal, the Z<sub>i</sub> and Z<sub>j</sub> traces evolve in the same manner, meaning that, as their dynamics propagate through the P traces to the synaptic weights, the forward and reciprocal connections between minicolumns develop symmetrically as shown in the top row of **Figure 5**. However, when τ<sub>zi</sub> ≠ τ<sub>zj</sub>, the Z<sub>i</sub> and Z<sub>j</sub> traces evolve differently and, as the bottom row of **Figure 5** shows, asymmetrical connections develop between minicolumns. It is important to note that the spiking activity during the training regime is the same in both configurations, so the shape of the resulting connectivity results only from the learning time constants τ<sub>zi</sub> and τ<sub>zj</sub>.
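The link between trace time constants and connection symmetry can be illustrated with a minimal sketch. It assumes that each primary trace is a plain exponentially decaying response to a single spike (a simplification of the BCPNN traces of Section 2.2) and uses the time-integrated product of the pre- and postsynaptic traces as a stand-in for the co-activation that drives weight changes. The function name, the 20 ms spike lag and the integration settings are illustrative assumptions, not part of the SpiNNaker implementation.

```python
import math

def trace_overlap(lag_ms, tau_zi_ms, tau_zj_ms, dt_ms=0.1, t_max_ms=2000.0):
    """Approximate the integral of Z_i(t) * Z_j(t) for one spike pair.

    lag_ms is the postsynaptic spike time minus the presynaptic spike time.
    Each trace jumps to 1 at its spike and then decays exponentially
    (an assumed, simplified trace shape).
    """
    t_pre = max(0.0, -lag_ms)   # shift so that the earlier spike is at t = 0
    t_post = t_pre + lag_ms
    total = 0.0
    for k in range(int(t_max_ms / dt_ms)):
        t = k * dt_ms
        z_i = math.exp(-(t - t_pre) / tau_zi_ms) if t >= t_pre else 0.0
        z_j = math.exp(-(t - t_post) / tau_zj_ms) if t >= t_post else 0.0
        total += z_i * z_j * dt_ms
    return total

# Equal time constants: the overlap ignores the order of the two spikes.
sym_fwd = trace_overlap(+20.0, 5.0, 5.0)
sym_bwd = trace_overlap(-20.0, 5.0, 5.0)

# Unequal time constants: the overlap depends on which spike came first.
asym_fwd = trace_overlap(+20.0, 5.0, 150.0)
asym_bwd = trace_overlap(-20.0, 5.0, 150.0)

print(sym_fwd, sym_bwd)
print(asym_fwd, asym_bwd)
```

With equal time constants the overlap depends only on the magnitude of the lag, whereas with unequal time constants it also depends on the spike order, which is in line with the symmetric and asymmetric kernels of **Figure 5**.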

In order to analyze the effect of the different learned connectivity patterns shown in **Figure 5**, we studied the impact of the two connectivity kernels on the subthreshold dynamics of neurons during sequence replay. As described in Section 3.2, after training, the trained sequence can be replayed by applying a 50 ms stimulus at a rate of *f*<sub>max</sub> to all cells of the first minicolumn in the learned sequence. During the subsequent replay, once the stimulus had been removed, we recorded the membrane potential of all of the neurons in the network and, for each minicolumn, stored the point in time at which its firing rate was maximal. We then aligned the membrane potential traces to this point in time and averaged them over all cells in a minicolumn. Interestingly,


TABLE 3 | Comparison of power usage of modular attractor network simulations running on SpiNNaker with simulations distributed across enough compute nodes of a Cray XC-30 system to match SpiNNaker simulation time.

*Cray XC-30 power usage is based on the 30 kW power usage of an entire Cray XC-30 compute rack (Cray, 2013). SpiNNaker power usage is based on the 1 W peak power usage of the SpiNNaker chip (Furber et al., 2014).*

*Top: SpiNNaker simulation times include downloading of learned weights and re-uploading required by current software.*

*Bottom: Time taken to download learned weights, re-generate and re-upload model to SpiNNaker have been removed.*

*<sup>a</sup>We are unsure why more supercomputer compute nodes are required to match the SpiNNaker simulation times when NHC* = 9 *than when NHC* = 16*. We assume this is an artifact of the different scaling properties of the two simulators, but further investigation is outside of the scope of this work.*

as **Figure 6** illustrates, these averaged and aligned membrane responses show different characteristics for the network models built on symmetric and asymmetric connectivity. Both network types show similar membrane characteristics before the sequence arrives at the minicolumn, but the network with symmetric connectivity shows a significantly slower decrease in membrane potential after the sequence has passed. In contrast, the network with asymmetric connectivity shows a strong after-stimulus hyperpolarization due to the increased inhibitory input originating from minicolumns later in the sequence, which are activated subsequently. The slower decrease in the mean membrane potential in the symmetric network can be explained by the excitatory projections in both directions of the sequence providing excitatory current flow to previously activated neurons. The implications of this experiment and interpretations of these different characteristics are discussed in Section 4.2.
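The alignment and averaging step used in this section can be sketched as follows. This is a minimal illustration under assumed data structures: each minicolumn is represented by a list of equally long per-cell membrane potential traces and a population firing rate sampled on the same time grid. The function name and the toy values are hypothetical.

```python
def peak_aligned_average(traces, rate, half_window):
    """Average membrane potentials over cells in a window centred on the
    sample at which the minicolumn's firing rate is maximal."""
    peak = max(range(len(rate)), key=lambda k: rate[k])
    start, stop = peak - half_window, peak + half_window + 1
    if start < 0 or stop > len(rate):
        raise ValueError("window around the peak exceeds the recording")
    n_cells = len(traces)
    return [sum(trace[k] for trace in traces) / n_cells
            for k in range(start, stop)]

# Toy data: two cells, five samples; the firing rate peaks at sample 2.
rate = [0.0, 1.0, 5.0, 2.0, 0.0]
traces = [[-70.0, -65.0, -55.0, -60.0, -72.0],
          [-70.0, -63.0, -53.0, -62.0, -74.0]]
aligned = peak_aligned_average(traces, rate, half_window=1)
print(aligned)
```

Averaging after alignment means that cells from different minicolumns, which peak at different times during replay, contribute to the same point of the mean trace.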

### 4. DISCUSSION

The contribution of this study is threefold. Firstly, we have shown that BCPNN can be efficiently implemented within the constraints of the SpiNNaker neuromorphic architecture. Secondly, we have shown how BCPNN can be used in a functionally meaningful context to perform near real-time learning of temporal sequences within a large-scale modular attractor network, the largest plastic neural network ever to be simulated on neuromorphic hardware. Finally, we have demonstrated the value of SpiNNaker as a tool for investigating plasticity within large-scale brain models by exploring how, by changing a single parameter in the BCPNN learning rule, both symmetric and asymmetric connectivity can be learned, which in turn influence underlying membrane potential characteristics.

### 4.1. Learning Temporal Sequences in Cortical Microcircuits

The total duration of temporal sequences is longer than the time courses of individual cellular or synaptic processes and therefore, such sequences are thought to be driven by circuit-level phenomena, although the intricacies of this relationship have yet to be fully explored. The massively recurrent and long-range nature of cortical connectivity, taken together with the emergence of temporal sequences at fine scales and distributed over spatial areas, suggests the presence of generic cortical microcircuit mechanisms. The model presented here is modularly organized into hypercolumns, each implementing WTA dynamics (Douglas and Martin, 2004). This modular structure also allowed us to vary the number of hypercolumns the network contained without affecting its functionality (Djurfeldt et al., 2008). Such distributed systems generally exhibit an improved signal-to-noise ratio, error resilience, generalizability and a structure suitable for Bayesian calculations (McClelland et al., 1986; Barlow, 2001). Like their uniformly interconnected counterparts, they can also exhibit high variability in their spike train statistics (Litwin-Kumar and Doiron, 2012; Lundqvist et al., 2012). Moreover, due to their capacity to exhibit a rich repertoire of behaviorally relevant activity states, these topologies are also well suited for information processing and stimulus sensitivity (Lundqvist et al., 2010; Wang et al., 2011).

Previous investigations have shown that the attractors which emerge within such modular networks reproduce features of local UP states (Lundqvist et al., 2006). This observation remains consistent with the extension considered here since, in vivo, UP state onsets are accompanied by the sequential activation of cortical neurons (Luczak et al., 2007). This redundant neural coding scheme should not necessarily be viewed in terms of anatomical columns, but rather functional columns consisting of subgroups of neurons with similar receptive fields that are highly connected (Yoshimura et al., 2005) and co-active (Cossart et al., 2003). Similar stereotypical architectures have been used as building blocks for other unified mathematical frameworks of the neocortex (Johansson and Lansner, 2007; George and Hawkins, 2009; Bastos et al., 2012).

The dynamics of the model consist of attractors, whose activations produce self-sustaining spiking among groups of neurons spread across different hypercolumns. Activity within attractors is sharpened by the fast dynamics of the AMPA receptor, until the network transitions to a subsequent attractor due to neural adaptation and asymmetrical NMDA connectivity, both of which have longer time constants of activation. In this work we have shown how these dynamics could be learned using BCPNN, a learning rule which offers an alternative to phenomenological STDP rules that often require complementary mechanisms due to their prevailing instability (Kempter et al. 2001; Babadi and Abbott 2010; but see Gütig et al. 2003).

### 4.2. Sequence Anticipation and Asymmetric Connectivity as Observed in the Membrane Potential Dynamics

In both the symmetric and asymmetric networks, the stimulus-aligned mean membrane potential traces show a similar rise prior to sequence arrival which can be interpreted as a form of anticipation of the impending activity peak. By anticipation we mean the premature build-up of a neuronal response which becomes increasingly similar to the response when the actual stimulus is present and which represents the neural signature of expectation or prediction of future input. Anticipation is an important function of neural circuits and is observed not only in early sensory structures such as the retina (Berry et al., 1999; Hosoya et al., 2005; Vaney et al., 2012), but also in downstream structures and cortical areas (Rao and Ballard, 1999; Enns and Lleras, 2008) which are involved in more abstract cognitive tasks (Riegler, 2001; Butz et al., 2003). Anticipation can also be regarded as a form of prediction of future events: something which Bar (2007) and Bubic et al. (2010) argue is a fundamental function of the brain. This predictive capability can improve sensory perception (Yoshida and Katz, 2011; Rohenkohl et al., 2012) and is important for other modalities such as motor control and learning (Shadmehr et al., 2010; Schlerf et al., 2012). However, the connectivity which implements this predictive or anticipatory function, and the mechanisms which give rise to it, are not well understood. We believe that BCPNN learning helps fill this gap: as we discussed in Section 3.2, it can learn functional connectivity at a network scale and, as previously argued in this section, it exhibits anticipatory behavior.

We studied the network response by looking at the membrane potential dynamics prior to and after a stimulus and compared the response of two network connectivities trained with different learning parameters. As membrane potential dynamics are the result of a multitude of parameters, we constructed identical experimental settings in terms of input, to make sure that the differences in the membrane potential dynamics can be linked as closely as possible to the differences in the underlying connectivity. That is, the only major difference between the two settings is the characteristic shape of the connectivity (either being symmetric or asymmetric, see **Figure 5**) resulting from different learning parameters. Since the two models implement a different flow of recurrent excitation, the gain parameters in both networks have been adjusted so that both operate in a similar activity regime in order to enable a meaningful comparison of the temporal characteristics introduced by the connectivity shape. The voltage traces arising from the different network connectivities shown in **Figure 6** exhibit different post-stimulus characteristics during sequence replay, with a faster hyperpolarization happening in networks with asymmetric connectivity. Hence we propose that by aligning the average membrane voltage of a population of neurons—in a perceptual context, to its preferred stimulus and, in a task-related context, to its peak activity—and then analyzing the post-stimulus characteristics of this average voltage, the population's afferent connectivity can be inferred.

### 4.3. Asymmetric Connectivity Supports Motion Preference

In the context of visual perception of motion, asymmetric connectivity has been found to play an important role in direction-sensitive ganglion cells in the retina (Kim et al., 2008; Vaney et al., 2012). A previous study by Kaplan et al. (2013) proposed asymmetric connectivity as a means of extrapolating the trajectory of a moving stimulus in the absence of a stimulus: similar, in many respects, to the experiment presented here. Recently, this hypothesized tuning property-based connectivity has been confirmed by the observation of neuronal modules in mouse V1 that exhibit similar motion direction preference (Wertz et al., 2015). The model we present here uses a Hebbian-Bayesian rule to explain how such feature-selective connectivity between neurons tuned to similar stimulus features could arise. It could therefore serve not only as a framework for modeling observed connectivity patterns and helping to understand their functional implications, but also as a means of linking experimentally observed connectivity with earlier modeling studies (Kaplan et al., 2013) by explaining how asymmetric connectivity can arise through learning.

The question of how a preferred sequence direction could be learned and replayed is not only relevant for sensory systems, but also other systems where sequence learning, generation and replay are important (Luczak et al., 2015). We addressed this question by training networks with both symmetric and asymmetric connectivity using a single sequence direction. We then triggered sequence replay in both networks in a similar way to experiments by Gavornik and Bear (2014) and Xu et al. (2012) which studied sequence learning in early visual cortices. Models with symmetric connectivity can show sequence replay in both directions, not only in the trained one. The intuition is that, if one were to employ the same training protocol described in Section 3.2, one could replay the sequence forwards or backwards by presenting a cue to the first or last attractor. Instead of being directed by asymmetrical connectivity, the preferred sequence trajectory would evolve according to adaptation. Hence, the direction of the sequence during training alone is not sufficient to create a preferred replay direction as observed in experiments (Xu et al., 2012). Instead, we argue that the asymmetric connectivity caused by a difference in the learning parameters, i.e., an unequal temporal correlation time window, is necessary to replay sequences in only the trained, and therefore preferred, direction.

### 4.4. Results in Context of Anatomical Data

The presented model addresses the question of how connectivity emerges at a cellular and network level in response to temporally varying stimuli. Through usage of different learning time constants, connectivity kernels of varying widths develop as shown in **Figure 5**. There exists a large body of anatomical evidence reporting regional variations in cortical circuitry in terms of structural features such as dendritic morphology and the density of dendritic spines (see e.g., Jacobs and Scheibel, 2002; Elston, 2003 for reviews). In the visual system the hierarchical organization of areas (Riesenhuber and Poggio, 1999) is reflected in their varying dendritic complexity. When compared to areas such as V1, V2, and V4 which respond to simpler visual features, areas associated with more complex functionality also exhibit more complex dendritic morphologies and have a higher number of connections (Jacobs and Scheibel, 2002; Elston and Fujita, 2014).

It stands to reason that the structural and electrophysiological differences observed in both pyramidal cells and interneurons influence activity on a cellular level (Spruston, 2008), shaping the way in which information is integrated and therefore the functional roles of both the individual cells and the entire circuit (Elston, 2003). These regional variations appear to be consistent across species and to change during development (see Elston and Fujita, 2014 for a recent review). Pyramidal cells in V1 reduce their dendritic complexity and those in inferotemporal and prefrontal areas grow larger dendritic structures over the first months and years of development. In the light of the presented model, these observations could lead to the interpretation that reducing the dendritic extent of pyramidal cells mirrors an improved perceptual precision in V1 as finer temporal correlations are detected (represented by short learning time constants τ<sub>zi,j</sub> and smaller dendritic extent as shown in the panels on the left side of **Figure 5**). In contrast, as more abstract associations are learned, pyramidal cells in higher areas grow more spines over larger dendritic territories. This allows these cells to integrate information from more diverse sources, requiring integration and association to occur over longer time scales (larger τ<sub>zi,j</sub>). In this context, it is important to note that the learning time constants may not necessarily equal the synaptic time constants (which are determined by receptor and channel kinetics), but could vary depending on the area and with it the function or task to be learned.

Despite the fact that our model uses point neurons and thus does not directly represent the dendritic field, we argue that the learning time constants determine a neuron's capability to integrate information over time which, given a topographic stimulus representation such as that seen in V1, could be linked to the size of the dendritic field of a neuron. Hence, the presented learning framework offers the possibility to study these arguments in more quantitative detail.

### 4.5. Scaling the Modular Attractor Model

In Section 3.2 we presented simulations of the modular attractor network model described in Section 2.1 with up to 16 hypercolumns, connected using sparse, random 10 % global connectivity. At this scale each pyramidal cell in the network receives 4.0 × 10<sup>3</sup> afferent excitatory synapses but—if the model were scaled up to, for example, the scale of the mouse neocortex with approximately 1.6 × 10<sup>7</sup> neurons (Braitenberg and Schüz, 2013)—each pyramidal cell would receive 1.3 × 10<sup>6</sup> afferent synapses. As we discussed in Section 4.4, pyramidal cell connectivity varies widely across the layers and areas of the neocortex. However, in this section we base our discussion of the scaling properties of our model on the assumption that each pyramidal cell receives 8.0 × 10<sup>3</sup> afferent synapses. This number is consistent with averages calculated across cortical layers and areas in mice (Braitenberg and Schüz, 2013), cats (Beaulieu and Colonnier, 1989) and humans (Pakkenberg et al., 2003). The reason this number is significantly lower than the one obtained by naïvely scaling our current model is because of the "patchy" nature of long-range cortical connectivity (Goldman and Nauta, 1977; DeFelipe et al., 1986; Gilbert and Wiesel, 1989; Bosking et al., 1997). Specifically, each pyramidal cell only connects to around 10, approximately hypercolumn-sized, clusters of neurons located within a radius of a few millimeters. Additionally, while each hypercolumn in our model contains 10 minicolumns, biological hypercolumns typically have closer to 100 (Mountcastle, 1997; Buxhoeveden and Casanova, 2002). This means that, because of the winner-take-all dynamics within each hypercolumn, while 10 % of neurons in our model are active at any given time, only 1 % would be active in a more realistic model.

As Sharp and Furber (2013) discuss, when simulating spiking neural networks on SpiNNaker, the majority of CPU time is spent within the event-driven synaptic processing stage, making the CPU load highly dependent on the rate of incoming synaptic events (a single spike innervating a single synapse). The combined effect of the more realistic global connectivity and sparser activity discussed in the previous paragraph would be to reduce the rate of incoming synaptic events by a factor of 5 when compared to our current model. This means that a model with more realistic connectivity could actually run faster than the current model on SpiNNaker, potentially in biological real time rather than the 0.5× real time we use in this work.
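The factor of 5 follows from the figures quoted above, as the short calculation below illustrates. It assumes that the rate of incoming synaptic events per neuron is proportional to the number of afferent synapses multiplied by the fraction of neurons active at any time, with the firing rate of active neurons unchanged between the two models. The function name is hypothetical.

```python
def relative_event_rate(afferent_synapses, active_fraction):
    """Quantity proportional to the incoming synaptic event rate per neuron,
    assuming identical firing rates for active neurons in both models."""
    return afferent_synapses * active_fraction

# Current model: 4.0e3 afferent synapses, 10 % of neurons active.
current = relative_event_rate(4.0e3, 0.10)

# More realistic model: 8.0e3 afferent synapses, 1 % of neurons active.
realistic = relative_event_rate(8.0e3, 0.01)

reduction = current / realistic
print(reduction)
```

Doubling the number of afferent synapses is therefore more than offset by the tenfold reduction in the fraction of active neurons.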

However, as we discussed in Section 3.2, the time spent actually running simulations on SpiNNaker is often dwarfed by the time spent generating simulation data on the host computer and transferring it to and from the SpiNNaker system. One way of reducing the time taken to generate the simulation data and upload it to SpiNNaker would be to perform some of the data generation on SpiNNaker itself. The most obvious target for this approach would be the generation of the connectivity matrices as, not only do these represent the bulk of the uploaded data, but they are typically defined probabilistically, meaning that they could be generated based on a very small uploaded definition. While this approach would undoubtedly reduce the time taken to generate and upload the simulation data, even the 1 min currently taken to download the results at the end of the simulation would grow to several hours if the network were scaled up to the size of even a mouse's neocortex. These slow upload and download times are due to current simulations all having been run on single-board SpiNNaker systems, connected to the host computer through a single Ethernet link. While the theoretical bandwidth of this link is 100 Mbit s<sup>−1</sup>, inefficiencies in the current SpiNNaker system software reduce the effective bandwidth to only a few MiB s<sup>−1</sup>.
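A rough order-of-magnitude check of the download estimate can be sketched as below. It assumes, purely for illustration, that the volume of downloaded results grows linearly with the number of neurons and that the effective bandwidth stays unchanged. The function name is hypothetical and the neuron counts are those quoted in Sections 3.2 and 4.5.

```python
def scaled_download_minutes(current_minutes, current_neurons, target_neurons):
    """Extrapolate download time, assuming data volume is proportional to
    neuron count and the effective bandwidth is fixed."""
    return current_minutes * target_neurons / current_neurons

# 1 min for the 2.0e4-neuron network, extrapolated to 1.6e7 neurons.
minutes = scaled_download_minutes(1.0, 2.0e4, 1.6e7)
hours = minutes / 60.0
print(minutes, hours)
```

Under these assumptions the download alone would take on the order of hours, which motivates the bandwidth improvements discussed in this section.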

Not only is work currently underway to improve the bandwidth of the Ethernet links, but in the case of large-scale network simulations running across multiple SpiNNaker boards, if the host computer is powerful enough and connected to the SpiNNaker system through a sufficiently fast network, data can be transferred to multiple SpiNNaker boards in parallel. Furthermore, if still more bandwidth is required, each SpiNNaker board also has several high-speed serial connectors which could be used for transferring data to and from SpiNNaker at the full 1 Gbit s<sup>−1</sup> bandwidth of the chip-level interconnect network. Together, the improvements to the scalability of the model discussed in this section would also act to further increase the power efficiency of SpiNNaker when compared to traditional supercomputer systems, which we briefly discussed in Section 3.2.

### 4.6. Extensions of BCPNN on SpiNNaker and other Future Considerations

Since we have shown that BCPNN learning is possible on SpiNNaker, the implementation we describe in Section 2.5 could be extended to support spike-based reinforcement learning (Izhikevich, 2007) by adding an extra level of E (i.e., "eligibility") traces with time constants between those of the Z and P traces (Tully et al., 2014). Representing downstream cellular processes that interact with increased intracellular Ca<sup>2+</sup> concentrations (Yagishita et al., 2014), E traces propagate into the P traces at a rate previously described as κ (Tully et al., 2014). The κ parameter models the delivery of delayed reward signals in the form of interactions with global neuromodulatory systems, which have been linked to the emergence of sequential activity (Gavornik and Bear, 2014; Ikegaya, 2004). Using this extended BCPNN model, the modular attractor memory model we describe in Section 2.1 could be extended to include basal ganglia input (Berthet et al., 2012), allowing it to switch between behavioral sequences when this might be a beneficial strategy for successful task completion (Ponzi and Wickens, 2010).
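The role of κ in such an extended trace cascade can be illustrated with a minimal sketch. It treats the Z, E and P traces as a chain of first-order low-pass filters, with κ scaling the rate at which E propagates into P, and integrates them with a forward-Euler step. The function names, the normalized input `drive`, the value of the E time constant (100 ms, chosen only to lie between the 5 ms and 2000 ms of Table 2) and the Euler scheme are all assumptions made for illustration. This is not the event-driven update used in the SpiNNaker implementation.

```python
def step_traces(z, e, p, drive, kappa, dt_ms=1.0,
                tau_z_ms=5.0, tau_e_ms=100.0, tau_p_ms=2000.0):
    """Advance an assumed Z -> E -> P low-pass cascade by one Euler step."""
    z = z + dt_ms * (drive - z) / tau_z_ms
    e = e + dt_ms * (z - e) / tau_e_ms
    p = p + dt_ms * kappa * (e - p) / tau_p_ms  # kappa gates learning
    return z, e, p

def run(kappa, steps=500, drive=1.0):
    """Drive the cascade with a constant normalized input for `steps` ms."""
    z = e = p = 0.0
    for _ in range(steps):
        z, e, p = step_traces(z, e, p, drive, kappa)
    return z, e, p

_, e_off, p_off = run(kappa=0.0)  # no reward signal: P does not change
_, e_on, p_on = run(kappa=1.0)    # reward signal present: P follows E slowly
print(p_off, p_on, e_on)
```

With κ = 0 the eligibility trace still builds up but leaves the P trace untouched, so activity is only consolidated into the probability estimates when a modulatory signal arrives.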

Similarly, Vogginger et al.'s (2015) original event-driven BCPNN model includes a set of E<sup>∗</sup> state variables which are used to represent the components of the spike-response model arising from E trace dynamics. Though omitted here, the SpiNNaker BCPNN implementation could be extended to include these traces at the cost of some extra computation and the memory required to store an additional 16 bit trace with each synapse and with each entry in the postsynaptic history structure. In Section 3.1 we showed that by using a 16 bit fixed-point representation for the Z<sup>∗</sup> and P<sup>∗</sup> state variables, we can produce results comparable to previous floating-point implementations when both τ<sub>p</sub> and *f*<sub>max</sub> are relatively small. However, this approach does not scale to the type of model described by Fiebig and Lansner (2014) where learning time constants span many orders of magnitude. In these situations, it may be necessary to use a 32 bit fixed-point representation for the P<sup>∗</sup> traces, further increasing the memory and computational cost of the learning rule.
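One way in which limited fixed-point resolution can interact with slow trace dynamics is sketched below. The example quantizes the increment of a single forward-Euler step of a first-order filter to a given number of fractional bits. The choice of 15 and 31 fractional bits, the Euler update and the numerical values are hypothetical and are not taken from the implementation evaluated in Section 3.1, which uses an event-driven update.

```python
def to_fixed(value, frac_bits):
    """Quantize a real value to a signed fixed-point integer with
    `frac_bits` fractional bits (round to nearest)."""
    return int(round(value * (1 << frac_bits)))

def euler_increment(target, current, dt_ms, tau_ms, frac_bits):
    """Fixed-point increment for one Euler step of
    tau * dP/dt = target - current."""
    return to_fixed((target - current) * dt_ms / tau_ms, frac_bits)

# A small input with a slow time constant: the step rounds to zero
# when only 15 fractional bits are available.
inc_16bit = euler_increment(0.01, 0.0, 1.0, 2000.0, frac_bits=15)

# The same step is representable with 31 fractional bits.
inc_32bit = euler_increment(0.01, 0.0, 1.0, 2000.0, frac_bits=31)

# With a fast time constant, 15 fractional bits are sufficient.
inc_fast = euler_increment(0.01, 0.0, 1.0, 20.0, frac_bits=15)

print(inc_16bit, inc_32bit, inc_fast)
```

In this simplified setting, longer time constants and smaller trace values push individual updates below the smallest representable step, which gives an intuition for why wider representations may be needed as time constants grow.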

As spikes from neuromodulator-releasing populations can arrive at the synapse at any time, integrating spike-based reinforcement learning into an event-driven, distributed simulation requires incorporating the times of modulatory as well as postsynaptic spikes into Algorithm 1. Because entire populations of neuromodulator-releasing neurons can affect the modulatory input received by a single synapse, the per-neuron history structure discussed in Section 2.4 is not a viable means of storing them. Potjans et al. (2010) extend Morrison et al.'s (2007) STDP algorithm to support neuromodulated learning by introducing "volume transmitter" populations which handle all the incoming modulatory input to a virtual "volume." These populations maintain a spike history of all incoming modulatory spikes, which they deliver to the synapses of neuronal populations within this volume, both at presynaptic spike times and after a fixed period, so as to "flush out" the spike-history data structure and allow it to be kept relatively small. This approach has the potential to map well to the SpiNNaker architecture and could be used as the basis of a future SpiNNaker implementation of spike-based reinforcement learning using BCPNN.

A benefit of the model proposed here is its robustness and flexibility. Non-sequential attractor networks without learning have previously been emulated on a neuromorphic microchip (Pfeil et al., 2013) and on a simulated version of the BrainScaleS system (Petrovici et al., 2014). Though not shown here, the connectivity required by these types of randomly hopping attractor networks can also be learned. Variations of this network run on supercomputers have been shown to account for disparate cognitive phenomena including perceptual rivalry and completion (Kaplan and Lansner, 2013); attentional blink (Lundqvist et al., 2006; Silverstein and Lansner, 2011); and diverse oscillatory regimes (Lundqvist et al., 2010). However, our model is a reduced version of these previous detailed models insofar as we did not utilize Hodgkin-Huxley neurons with calcium-dependent potassium channels or regular spiking non-pyramidal cells; nor did we explicitly model connections among basket cells, saturating synapses, a V<sub>m</sub>-dependent Mg<sup>2+</sup> blockade or short-term depression.

A problem not stressed by the aforementioned models is how the connectivity required for stable activity propagation might be learned (Wörgötter and Porr, 2005; Kunkel et al., 2011), despite the biochemical (Peters et al., 2014) and metabolic (Picard et al., 2013) changes accompanying learned sequential behaviors. Several promising approaches have been developed (Sussillo and Abbott, 2009; Laje and Buonomano, 2013; Hennequin et al., 2014), albeit with biological motivations driven more from the perspective of algorithmic optimization, rather than from bottom-up neural processing. Here, we have shown that activity could propagate through recurrent cortical microcircuits as a result of a probabilistic learning rule based on neurobiologically plausible time courses and dynamics. The model predicts that the interaction between several learning and dynamical processes constitutes a compound mnemonic engram that can flexibly generate step-wise sequential increases of activity within pools of excitatory neurons. We have shown that this large-scale learning model can be efficiently simulated at scale using neuromorphic hardware, and our simulations suggest that flexible systems such as SpiNNaker offer a promising tool for the study of collective dynamical phenomena emerging from the complex interactions occurring between individual neurons and synapses whose properties change over time.

### AUTHOR CONTRIBUTIONS

JK developed the SpiNNaker BCPNN implementation and performed the experiments. JK and BK analyzed the data. PT, AL, and JK developed the simplified modular attractor network architecture. JK, PT, BK, AL, and SF wrote the paper.

### ACKNOWLEDGMENTS

The design and construction of SpiNNaker was funded by EPSRC (the UK Engineering and Physical Sciences Research Council) under grants EP/G015740/1 and EP/G015775/1. The research was supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement 320689 and also by the EU Flagship Human Brain Project (FP7-604102) and the EU BrainScaleS project (EU-FP7-FET-269921). JK is supported by a Kilburn Studentship from the School of Computer Science at The University of Manchester. Additionally, we thank Andrew Mundy for creating **Figure 1**; Michael Hopkins, Patrick Camilleri, and Luis Plana for taking the time to read the manuscript and provide feedback; and our reviewers for their helpful and valuable feedback.

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Knight, Tully, Kaplan, Lansner and Furber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Odor Experience Facilitates Sparse Representations of New Odors in a Large-Scale Olfactory Bulb Model

Shanglin Zhou<sup>1</sup>, Michele Migliore<sup>2,3</sup> and Yuguo Yu<sup>1</sup>\*

*<sup>1</sup> School of Life Science and The Collaborative Innovation Center for Brain Science, The Center for Computational Systems Biology, Fudan University, Shanghai, China, <sup>2</sup> Division of Palermo, Institute of Biophysics, National Research Council, Palermo, Italy, <sup>3</sup> Department of Neurobiology, Yale University School of Medicine, New Haven, CT, USA*

Prior odor experience has a profound effect on the coding of new odor inputs by animals. The olfactory bulb, the first relay of the olfactory pathway, can substantially shape the representations of odor inputs. How prior odor experience affects the representation of new odor inputs in the olfactory bulb, and the underlying network mechanism, are still unclear. Here we carried out a series of simulations based on a large-scale realistic mitral-granule network model and found that prior odor experience not only accelerated formation of the network, but also significantly strengthened sparse responses in the mitral cell network while decreasing sparse responses in the granule cell network. This modulation of sparse representations may be due to the increase of inhibitory synaptic weights. Correlations among mitral cells within the network and correlations between mitral network responses to different odors decreased gradually when the number of prior training odors was increased, resulting in a greater decorrelation of the bulb representations of input odors. Based on these findings, we conclude that the degree of prior odor experience facilitates the degree of sparse representation of new odors by the mitral cell network through an experience-enhanced inhibition mechanism.

#### Edited by:

*Wolfram Schenck, University of Applied Sciences Bielefeld, Germany*

#### Reviewed by:

*Kei M. Igarashi, Norwegian University of Science and Technology, Norway; Takaki Komiyama, University of California, San Diego, USA*

#### \*Correspondence:

*Yuguo Yu yuyuguo@fudan.edu.cn*

Received: *26 August 2015* Accepted: *27 January 2016* Published: *11 February 2016*

#### Citation:

*Zhou S, Migliore M and Yu Y (2016) Odor Experience Facilitates Sparse Representations of New Odors in a Large-Scale Olfactory Bulb Model. Front. Neuroanat. 10:10. doi: 10.3389/fnana.2016.00010*

Keywords: odor representation, prior experience, sparse representation, olfactory bulb, large scale network

### INTRODUCTION

Prior sensory experience is very important for animals in learning and processing novel incoming signals. In olfaction, prior odor experience can significantly improve the ability of the animal to discriminate new odor inputs (Mandairon et al., 2006a,b,c; Mandairon and Linster, 2009; Sinding et al., 2011). The olfactory bulb is the first relay of the olfactory pathway, and encodes odor inputs as the network responses of mitral cells (Kay and Sherman, 2007; Mandairon and Linster, 2009). The olfactory bulb has been observed to encode signals in a spatiotemporally sparse and decorrelated manner (Khan et al., 2010; Yu et al., 2014). Moreover, it has been observed that mitral cells become less responsive after prior odor exposure (Buonviso et al., 1998; Buonviso and Chaput, 2000; Fletcher and Wilson, 2003; Mandairon and Linster, 2009; Kato et al., 2012). On the other hand, it has been shown that interneurons may become more (Mandairon et al., 2008) or less (Kato et al., 2012) responsive to new odors.

In previous experimental and computational studies, the numbers of prior odor experiences and new incoming odors are limited. How an animal's prior experience with odorants affects the representation by the olfactory bulb (i.e., the firing patterns of mitral and granule cells) in response to new odors is an open question. Considering the limitations of current experimental techniques, it is nearly impossible to access the synaptic dynamics or neuronal response to odor inputs in the olfactory bulb network at a large scale. However, large-scale supercomputer simulation of realistic olfactory bulb models has been employed to carry out a series of simulations examining these issues (Yu et al., 2013, 2014; Migliore et al., 2014, 2015). Our previous reports have shown that a sparse spatial spiking representation of specific odor signals can emerge naturally from mitral-granule interactions and can be realistically implemented by our model via balanced excitatory-inhibitory synapses (Yu et al., 2013, 2014). Here, we examine how and to what extent prior odor experience modulates the excitatory and inhibitory interactions and how they shape odor representations.

To address these issues, we performed a series of simulations based on a previously established large-scale olfactory bulb model (Yu et al., 2013, 2014). The simulation results show that prior odor experience can accelerate the formation of sparseness in the mitral cell network in response to new odors. Furthermore, the sparseness of the mitral cell network is increased but the sparseness of the granule cell network is decreased with an increasing number of prior training odors. Further analysis demonstrated that this phenomenon is accompanied by a nonlinear change in the excitatory and inhibitory synaptic weighting of the network. Mitral cell network responses demonstrated a gradual increase in their intrinsic decorrelation property, suggesting an increased odor discrimination ability.

### MATERIAL AND METHODS

### Computational Simulations

All simulations were carried out with the NEURON simulation program v7.3 (Hines and Carnevale, 1997, 2001) on a Cray XC30 system (INCF, Sweden). All the present work was based on a previously verified scaled-up olfactory bulb model (Yu et al., 2013, 2014). Briefly, the network was composed of multi-compartment canonical models of 500 mitral and 10,000 granule cells, implemented as described in our previous studies (Migliore and Shepherd, 2008; Migliore et al., 2010). The model uses a reduced number of mitral cells (MCs) and granule cells (GCs) (glom: MC: GC = 1: 5: 100). As we have already explained in detail in a previous paper (Yu et al., 2013), the reason for this choice is that our main aim with this model is to understand the basic processes underlying lateral and feedback inhibition in a network. For this purpose the full number of cells is not needed, especially in the presence of experimental data limited to a very small subset of glomeruli; however, the relative ratio between mitral and granule cells is consistent with experimental estimations, validated against a number of experimental findings (Willhite et al., 2006; Shusterman et al., 2011). The canonical model for mitral cells was implemented with 312 compartments representing an axon, the soma, the apical dendrite, and two lateral dendrites each 1.5 mm in length, in the range indicated by anatomical measurements (Mori et al., 1981). Real mitral cells have a number of lateral dendrites that cover a relatively large, two-dimensional surrounding area. From this point of view, our simplifying choice of using only two lateral dendrites per mitral cell has the obvious limitation that, since many glomeruli are at variable distances from the single projection tract, the interactions between mitral cells belonging to specific neighboring glomeruli are not precisely represented. 
However, our additional choice to project the glomeruli into a single tract results in the interactions of a given mitral cell with many nearby mitral cells still holding in a generic sense, so that the model gives a relatively accurate reflection of these population interactions within the mitral-granule network. In this way we were able to maintain the requirements for computational resources within a reasonable limit. An indirect proof of the overall quality of this model is its qualitative agreement with a number of experimental findings (Yu et al., 2013). Uniform passive properties were used, with R<sub>a</sub> = 150 Ω·cm, τ<sub>m</sub> = 20 ms, and R<sub>m</sub> and C<sub>m</sub> adjusted to obtain an input resistance of about 100 MΩ. Resting potential was set at −65 mV and temperature at 35°C. Cells were modeled as regular firing cells (Migliore et al., 2005), with Na, KA, and KDR conductances uniformly distributed over the entire dendritic tree (Bischofberger and Jonas, 1997). Kinetics for the Na conductance were from hippocampal pyramidal neurons (Migliore et al., 1999), whereas those for KA and KDR were from mitral cell data (Wang et al., 1996). Granule cells were modeled with a soma and a 20-segment radial dendrite (250 µm of total length) representing the dendritic tree. Na<sup>+</sup> and KA channels were distributed throughout (Schoppa and Westbrook, 1999; Pinato and Midtgaard, 2005; Zelles et al., 2006) whereas KDR was present only in the soma (Schoppa and Westbrook, 1999).

Effective dendrodendritic coupling between granule cell synapses and mitral cell secondary dendrites was implemented by connecting a GC synapse, containing the same proportion of AMPA and NMDA channels, with the appropriate compartment of mitral cell GABA channel-containing secondary dendrites. The details of the synaptic mechanisms have been described in our previous work (Yu et al., 2013, 2014). It should be noted that we applied a generic use-dependent plasticity rule to the dendrodendritic connection. Briefly, all synaptic weights started at zero and, in response to an odor input, the components (inhibitory or excitatory) of each dendrodendritic synapse were independently modified according to local spiking activity in the lateral dendrite of the mitral cell or the granule cell synapse. After each spike, the peak conductance (w) and the state (p) of any given synapse were updated from their current value w<sub>{exc,inh},p</sub> = g<sub>max,{exc,inh}</sub>·S(p) to a new value. The new values were calculated according to the instantaneous presynaptic interspike interval (ISI) (see Migliore et al., 2007) as w<sub>{exc,inh},p+Δ</sub> = g<sub>max,{exc,inh}</sub>·S(p+Δ). The value of p was limited to the range 0–50, and was subjected to the classical scheme Δ = {0, +1, −1} (Stanton, 1996), in which Δ = 0 for an ISI ≥ 250 ms (i.e., no changes for spike rates ≤ 4 Hz), Δ = −1 for 33 < ISI < 250 ms (LTD in the range of 4–30 Hz), and Δ = +1 for ISI ≤ 33 ms (LTP for a spike rate ≥ 30 Hz). The sigmoidal activation function S(p) was defined as S(p) = 1/{1 + exp[(25 − p)/3]} (Haykin, 1994). In this way, the weight (i.e., the peak synaptic conductance) of any given synapse could transition from a fully depressed (w ≈ 0, for p = 0) to a fully potentiated state (w ≈ g<sub>max</sub>, for p = 50), or vice versa, over a span of 50 consecutive spikes of the appropriate frequency. At the beginning of a simulation p = 0, and spikes that would have resulted in values of p < 0 or p > 50 were ignored.
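The update scheme described above can be illustrated with a minimal Python sketch. It is not the NEURON mechanism from the ModelDB entry: the function names (`delta_from_isi`, `sigmoid`, `update_state`, `weight`) and the value `g_max = 1.0` are hypothetical choices made only for illustration, while the ISI thresholds, the range of p, and the form of S(p) are taken from the text.

```python
import math

P_MIN, P_MAX = 0, 50  # allowed range of the synaptic state p (from the text)


def delta_from_isi(isi_ms):
    """Return the state change for one presynaptic interspike interval (ms)."""
    if isi_ms >= 250.0:   # spike rate <= 4 Hz: no change
        return 0
    if isi_ms > 33.0:     # 33 < ISI < 250 ms: depression (LTD)
        return -1
    return 1              # ISI <= 33 ms: potentiation (LTP)


def sigmoid(p):
    """Activation function S(p) = 1 / (1 + exp((25 - p) / 3))."""
    return 1.0 / (1.0 + math.exp((25.0 - p) / 3.0))


def update_state(p, isi_ms):
    """Apply one spike; updates that would leave the range [0, 50] are ignored."""
    p_new = p + delta_from_isi(isi_ms)
    return p_new if P_MIN <= p_new <= P_MAX else p


def weight(p, g_max):
    """Peak conductance w = g_max * S(p)."""
    return g_max * sigmoid(p)


# Illustration: 50 spikes at 50 Hz (ISI = 20 ms) starting from p = 0.
# g_max = 1.0 is an arbitrary value, not a parameter of the published model.
p = 0
for _ in range(50):
    p = update_state(p, 20.0)
print(p, weight(p, g_max=1.0))
```

Starting from p = 0, a run of 50 short intervals moves the state to p = 50, where S(p) is close to 1, which matches the statement that a synapse can go from fully depressed to fully potentiated over 50 consecutive spikes.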

It should be stressed that synaptic plasticity is fundamental to any dynamic network. Although in the mitral-granule circuit it has not been observed directly, we consider this lack of information as a shortcoming of the experimental techniques rather than a demonstration that there is no plasticity in the olfactory bulb. Indeed, recent studies have shown more or less direct evidence for long term plasticity of olfactory input in mitral cells (Ennis et al., 1998; Ma et al., 2012), and in granule cells (Patneau and Stripling, 1992; Gao and Strowbridge, 2009; Arenkiel et al., 2011). Also note that the plasticity rule used in this model has already been shown (Yu et al., 2013) to generate synaptic clusters and firing patterns in qualitative agreement with experimental findings. As discussed in detail elsewhere (Xiong and Chen, 2002; Migliore and Shepherd, 2008), the formation of synaptic clusters consistent with those observed experimentally is an extremely robust process that can be understood by considering the following dynamics: (a) a strong odor input causes mitral cells to fire at high frequency; (b) somatic APs backpropagate along the lateral dendrites and potentiate excitatory mitral–granule synapses along their way, activating granule cells; (c) granule cells begin to fire at high frequency, potentiating inhibitory synapses on the lateral dendrites of mitral cells; (d) inhibition from granule cells hinders AP backpropagation as it travels far from the soma, thus reducing, locally, the firing frequency of mitral and granule cells; and (e) this finally results in the selective depression of synapses far from the soma of the active mitral cell. Therefore, as long as (1) action potentials backpropagate along the mitral cell lateral dendrites, (2) granule cells form dendrodendritic connections, and (3) LTD and LTP are induced by different levels of synaptic activity, a column will form independently of the specific learning rule. 
This mechanism is robust and independent of the plasticity rule used to update the synaptic weights during a simulation (Migliore et al., 2007, 2010); we have tested it with Hebbian, non-Hebbian, and spike-time-dependent plasticity, obtaining in all cases the same qualitative result (i.e., the formation of a column).

It should be noted that in this paper we were interested in the results obtained for a relatively high odor concentration, which is needed to form glomerular units as observed in the experiments. The overall amount of LTP or LTD obtained in a real system, and its overall effect on the I/O properties, will of course depend on the actual plasticity rules in effect for mitral and granule cells. The experimental evidence on these processes is insufficient. However, we stress that the plasticity rule used in this model has already been shown (Yu et al., 2013) to generate synaptic clusters and firing patterns in qualitative agreement with experimental findings.

Other details of the model were identical to those described previously (Yu et al., 2013, 2014). The simulation codes used to run the simulations described in the present work are available in the ModelDB database (http://senselab.med.yale.edu/modeldb, accession number 144570), with the exception of run control files. Kinetic equations and implementation details for all ionic currents are described in these model files.

### Odor Input Paradigm

In our model, the network contains 100 glomeruli, 500 mitral cells, and 10,000 granule cells. The 100 glomeruli are spatially distributed, and 74 of them are chosen to have active responses so as to represent the spatial responses to 72 different odor stimuli. Each glomerulus makes synaptic connections with five mitral cells. These 74 glomeruli are therefore connected to 370 spatially distributed mitral cells. The other 130 mitral cells are connected to the remaining 26 glomeruli (which could be stimulated by new odors, other than the present 72 odors). We distributed them in such a way as to have a roughly uniform overall spatial distribution of glomeruli. Note that although no odor input is fed to those 130 mitral cells, their firing is modulated by the random background activity and by the lateral inhibition received from granule cells that are connected with odor-activated mitral cells. As described in our previous work, 72 odor inputs were used for simulations (Yu et al., 2013, 2014). The basic activation strength (0–4) for each glomerulus and each odor is taken directly from the experimental values kindly provided by Mori et al. (2006). To simulate an odor presentation, these values are multiplied by a coefficient representing the odor concentration, resulting in an aggregate synaptic input of up to 10 nS, as explained in detail in the Methods section of Yu et al. (2013).

A new model of the olfactory bulb, representing the actual 3D layout of the mitral-granule cell network, has been recently developed (e.g., Migliore et al., 2015). This model represents in a very realistic way the possible interactions between glomeruli located within the dendritic field of mitral cells, and it would be especially useful to study natural odors, which exhibit a rather broad and dense input. However, it requires large computational resources. With the particular set of inputs we are considering in this paper, i.e., single monomolecular odors with rather sparse and segregated inputs, such a model would not give results qualitatively different from those obtained with the 1D model.

To represent the range of intensities with adequate sensitivity (i.e., including the weakest concentration without saturating the network at the highest concentration), we set the peak conductance sensitivity to give suprathreshold responses to levels 3 and 4. Then, we defined strong inputs as strengths of 3 or 4 and weak inputs as strengths of 0, 1, or 2 (**Figure 1A**). All odor inputs were presented over 4–10 Hz. To address how prior odor experience interferes with the subsequent sparse representation of new odors, a series of odor inputs was used to train the network in sequence. In the single odor experience condition, the first odor was presented within the first 5 s, and after a 5 s rest, another odor input was presented for the next 5 s (**Figure 1C**). In other experience conditions (for instance, the five odor experience condition), more odors were presented in the same manner: each odor input was presented for 5 s, with a 5 s resting state between presentations. The last odor input was denoted as the new odor input, and all prior odor inputs were defined as experienced odors, implying that in the five odor experience condition, a total of six odor inputs were used. For the control condition, only the new odor inputs were given, at the time when the new odors were given in the experience conditions (**Figure 1B**). Unless otherwise noted, all experienced odors during training were presented in order from low level to high level of input strength.

FIGURE 1 | Mitral cell network responses in naïve and prior odor experience conditions. (A) Two example odor input strengths to each mitral cell. k7-1, heptyl methyl ketone; 8OH, octanol. (B) A raster plot shows the mitral cell network response in the naïve condition. k7-1 was delivered at the 10th second after a resting state (each mitral cell fires spontaneously and randomly at a low frequency) of 5 s. (C) A raster plot shows the mitral cell network response to the new odor input k7-1 in the single odor input (8OH) experience condition. Red rectangles represent the time elapsed to reach a stable sparseness level for the network response in different conditions.

### Sparseness Calculation

The sparseness of the network response was calculated with the same method as in our previous work (Yu et al., 2014). Briefly, based on previous work (Vinje and Gallant, 2000; Franco et al., 2007), the sparseness of the response to a given stimulus can be calculated as follows:

$$S = \left\{ 1 - \frac{\left[\sum_{i=1}^{N} \left(\frac{r_i}{N}\right)\right]^2}{\sum_{i=1}^{N} \frac{r_i^2}{N}} \right\} / \left(1 - \frac{1}{N}\right),$$

where S is the sparseness of the network in one period of odor input (from the beginning of one input to the beginning of the next odor input); r<sub>i</sub> is the mean firing rate of mitral cell i in that period; and N is the total number of mitral cells (500). A high sparseness value in the present work indicates that only a few neurons fire at high rates.
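As a minimal sketch of this measure, the following Python function evaluates S for a list of mean firing rates. The function name `sparseness` and the choice to raise an error for an all-silent network are assumptions made for illustration and are not taken from the published analysis code.

```python
def sparseness(rates):
    """Sparseness S of a population response (after Vinje and Gallant, 2000).

    `rates` holds the mean firing rate of each cell during one odor period.
    """
    n = len(rates)
    mean_rate = sum(rates) / n
    mean_square = sum(r * r for r in rates) / n
    if mean_square == 0.0:
        # Assumption: treat an entirely silent network as an error case.
        raise ValueError("sparseness is undefined when all rates are zero")
    return (1.0 - mean_rate ** 2 / mean_square) / (1.0 - 1.0 / n)


# Two limiting cases for N = 500 cells (rates in Hz are illustrative).
one_active = [0.0] * 499 + [10.0]   # a single cell carries the response
all_equal = [5.0] * 500             # every cell fires at the same rate
print(sparseness(one_active), sparseness(all_equal))
```

The two limiting cases bracket the measure: S approaches 1 when a single cell carries the whole response and equals 0 when all cells fire at the same rate.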

### Correlation between Mitral Cell Firing in a Network

To calculate the correlation between mitral cell firing in a network, we used a coherence measure based on the normalized cross-correlation of neuronal pairs in the network. The coherence between two mitral cells i and j was measured by the cross-correlation of their spike trains at zero time lag within a time bin of τ. Specifically, supposing that a long time interval T (one period of odor input) was divided into K small bins of width τ (T/K = τ), and that the two spike trains (values of 0 or 1) were given by X(l) and Y(l), with l = 1, 2, ..., K, the coherence for the pair (K<sub>ij</sub>) was calculated as follows (Wang and Buzsaki, 1996; Yu et al., 2014):

$$K_{ij}(\tau) = \frac{\sum_{l=1}^{K} X(l)Y(l)}{\sqrt{\sum_{l=1}^{K} X(l)} \sqrt{\sum_{l=1}^{K} Y(l)}}.$$

The correlation between mitral cells across the whole network, K, was then defined as the average of K<sub>ij</sub>(τ) over all pairs of mitral cells in the network. That is,

$$K = \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=1, j \neq i}^{N} K_{ij}(\tau),$$

where N is the total number of mitral cells in the network. In the present work, τ was set to 20 ms throughout the analysis.
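The coherence measure can be sketched in a few lines of Python. The helper names (`bin_spikes`, `pair_coherence`, `network_coherence`) are hypothetical, and returning zero coherence for a pair that includes a silent cell is an assumption, since the text does not say how such pairs were handled. Because K<sub>ij</sub> is symmetric, averaging over unordered pairs gives the same value as the double sum over i ≠ j.

```python
import math
from itertools import combinations


def bin_spikes(spike_times_ms, t_start_ms, t_stop_ms, bin_ms=20.0):
    """Convert spike times into a 0/1 train with bins of width `bin_ms`."""
    n_bins = int(round((t_stop_ms - t_start_ms) / bin_ms))
    train = [0] * n_bins
    for t in spike_times_ms:
        k = int((t - t_start_ms) // bin_ms)
        if 0 <= k < n_bins:
            train[k] = 1
    return train


def pair_coherence(x, y):
    """Zero-lag coherence K_ij of two binary spike trains."""
    n_x, n_y = sum(x), sum(y)
    if n_x == 0 or n_y == 0:
        return 0.0  # assumption: pairs with a silent cell contribute zero
    shared = sum(a * b for a, b in zip(x, y))
    return shared / math.sqrt(n_x * n_y)


def network_coherence(trains):
    """Average K_ij over all pairs of cells (K_ij is symmetric)."""
    pairs = list(combinations(trains, 2))
    return sum(pair_coherence(x, y) for x, y in pairs) / len(pairs)


# Illustrative spike times (ms) in a 100 ms window, binned at 20 ms.
cell_a = bin_spikes([5.0, 25.0, 45.0], 0.0, 100.0)
cell_b = bin_spikes([5.0, 65.0], 0.0, 100.0)
print(pair_coherence(cell_a, cell_b))
```

In this toy example the two cells share one occupied bin out of three and two occupied bins respectively, so the pair coherence is 1/√6.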

### Correlation between Mitral Cell Network Responses

To compare the similarity between mitral cell network responses to odor inputs x and y during an odor input period, we calculated the correlation coefficient (C<sub>xy</sub>) as follows:

$$C_{xy} = \frac{1}{N} \sum_{i=1}^{N} \text{Corrcoef}\left\{ \text{MC}_{i}\left[x(t)\right], \text{MC}_{i}\left[y(t)\right] \right\},$$

where MC<sub>i</sub> is the i-th mitral cell; MC<sub>i</sub>[x(t)] and MC<sub>i</sub>[y(t)] are the responses of mitral cell i during an odor input period to odor inputs x(t) and y(t), respectively; and Corrcoef denotes the classic correlation coefficient. To investigate how prior odor experience affects the network response to the new odor inputs, we calculated the average of C<sub>xy</sub> between one new odor input in the experience conditions and the other tested new odor inputs in the naïve condition.
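A minimal Python sketch of C<sub>xy</sub> is given below, assuming that each cell's response is available as a binned time series of equal length for the two odors. The names `pearson` and `response_correlation` are hypothetical, and assigning a coefficient of zero to a cell whose response is constant is an assumption, since the Pearson coefficient is undefined in that case.

```python
import math


def pearson(a, b):
    """Classic (Pearson) correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # assumption: a constant response contributes zero
    return cov / math.sqrt(var_a * var_b)


def response_correlation(resp_x, resp_y):
    """C_xy: per-cell correlation of responses to odors x and y, averaged over cells.

    resp_x[i] and resp_y[i] are the binned responses of cell i to each odor.
    """
    coeffs = [pearson(a, b) for a, b in zip(resp_x, resp_y)]
    return sum(coeffs) / len(coeffs)


# Toy example with two cells: one responds identically in shape to both
# odors, the other responds with the opposite time course.
resp_x = [[1, 2, 3, 4], [4, 3, 2, 1]]
resp_y = [[2, 4, 6, 8], [1, 2, 3, 4]]
print(response_correlation(resp_x, resp_y))
```

Lower values of C<sub>xy</sub> indicate that the network responds to the two odors with more distinct activity patterns, which is the sense in which the text speaks of decorrelation.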

### 1/2 Time of Sparseness

To test the dynamic evolution of the sparseness of the mitral cell network response, sparseness values were calculated at series time points when the odor inputs were presented. This sparseness time series could be fitted by the classic logarithmic function as follows:

$$S = A_2 + (A_1 - A_2) / \left( 1 + \left(\frac{\chi}{\chi_0}\right)^{S_{1/2}} \right),$$

where S is the sparseness, and S<sub>1/2</sub> is the time at which S reaches half of the maximum S (A<sub>1</sub>).
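For a quick check that does not involve curve fitting, the half-maximum time can also be read directly from a sampled sparseness time series. The sketch below is a fit-free stand-in and not the fitting procedure used in this work: it finds the first time at which the sampled sparseness reaches half of its maximum, using linear interpolation between neighboring sniff points. The function name `half_max_time` and the sample values are hypothetical.

```python
def half_max_time(times, values):
    """First time at which `values` reaches half of its maximum.

    Linear interpolation is used between the two samples that bracket
    the half-maximum level.
    """
    half = 0.5 * max(values)
    if values[0] >= half:
        return times[0]
    for k in range(1, len(values)):
        if values[k] >= half:
            t0, t1 = times[k - 1], times[k]
            v0, v1 = values[k - 1], values[k]
            return t0 + (half - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("half-maximum level was never reached")


# Illustrative sparseness samples at four time points (s); not model output.
times = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 0.2, 0.6, 0.8]
print(half_max_time(times, values))
```

With a fitted curve, the same level-crossing idea applies to the fitted function instead of the raw samples, which is less sensitive to noise in individual sniff points.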

### Correlation of Input Strengths between Different Odors

In some simulations, we quantified the similarity of two odor inputs by calculating the Pearson correlation coefficient based on their strength values for 500 mitral cells (i.e., 500 values for each odor). A higher correlation coefficient indicates that a pair of odors is more similar.

Data are presented as mean ± SEM. Statistical significance was assessed by paired Student's t-test or ANOVA with Tukey's multiple comparison test, and p < 0.05 was considered significant. Data analyses were performed using Graphpad Prism software Version 6.0 (San Diego, USA).

### RESULTS

To systematically address how prior odor experience affects the representation of new odor inputs by the olfactory bulb network, we used a previously verified olfactory bulb network model that includes 500 mitral cells and 10,000 granule cells connected through dendrodendritic synapses (Yu et al., 2013, 2014). In this model, we simulated different odor inputs to mitral cells with varied strength intensities ranging from 0 to 4 based on previous experimental results (Mori et al., 2006; **Figure 1A**). As shown in **Figure 1B**, in the naïve condition (i.e., no prior odor input experience, without odor inputs during the first 10 s), a sparse spatial spiking representation of a specific odor input (k7-1 in this example) emerged naturally within several seconds of the training period from the mitral-granule cell interactions, as verified by our previous work (Yu et al., 2013, 2014). In one training paradigm, after delivery of a prior odor input (8OH) for 5 s and a 5 s resting state (no odor input), a new odor input (k7-1 or other, see below) induced a different response of the mitral cell network compared with that observed in the naïve condition (**Figure 1C**, compare with the mitral cell network response during the period of 10th–15th second in **Figure 1B**). From the raster plot, we observed that the response of the mitral cell network reached a stable sparseness state much faster than in the naïve condition (**Figures 1B,C**; note that the red rectangle denoting the time course to reach stable sparseness is much narrower in **Figure 1C** than in **Figure 1B**). Since the sparseness of the mitral cell network reaches steady state after about 2 s of odor stimulus, we trained the network with each specific odor input for 5 s in the following results. We also extended the simulation time to 10 s, and no significantly different results were found (Supplementary Figure 1). We will now present additional details describing our results.

### Prior Odor Input Experience Facilitates the Evolution of the Sparseness of the Mitral Cell Network Response

Experimental and computational studies have shown that the response of the mitral cell network to odor inputs tends to be heterogeneous and spatiotemporally sparse (Yu et al., 2013, 2014). Our previous reports have shown that a sparse spatial spiking representation of specific odor signals can emerge naturally within several seconds of the training period from mitral-granule cell interactions and that the network response reaches a stable level of sparseness (Yu et al., 2013, 2014). To address how prior odor experience affects the evolution of sparseness in the mitral cell network and the final sparseness level in response to new odor inputs, we fixed the prior odor inputs to 8OH or o-Eph and then varied the new odor inputs or trained the network only with the new odor inputs (**Figures 1B,C**). As in our previous reports, the sparseness of the mitral cell network response gradually evolved from a relatively low sparseness level to a high sparseness level (**Figures 1**, **2A**). We found that the sparseness of the mitral cell network response to new odor inputs in the single odor experience (8OH or o-Eph) condition was larger than that in the naïve condition at all sniff points at which the inputs were given (**Figure 2A**, n = 14 for the number of second odors). **Figure 2B** shows that the stable sparseness levels of the mitral cell network (represented by the last sniff point of 14.8 s) in both 8OH and o-Eph experience conditions are statistically larger than those in the naïve condition (one way ANOVA analysis, p < 0.01, **Figure 2B**). To demonstrate this phenomenon in a more systematic way, we trained the network with additional prior odor series in a manner similar to the single odor experience condition. As shown in **Figure 2C**, the stable sparseness level (represented by the last sniff point) of the mitral cell network increases with the number of prior odors experienced (one way ANOVA, p < 0.01). 
This effect was more pronounced for the sparseness at the first sniff (**Figure 2C**). As shown in Supplementary Figure 2, the prior odors were delivered from low input strength level to high strength level in the 72 odor experience condition [72 Odors (lh)]. We also reversed the training sequence [i.e., prior odors were delivered from high input strength level to low strength level, 72 Odors (hl)]. Interestingly, the final sparseness of the mitral cell network is significantly lower in the 72 odor (hl) condition than that in the 72 odor (lh) condition (paired t-test, p < 0.001, Supplementary Figure 2). We plan to address this phenomenon extensively in future work, but the present work will mainly focus on the former training sequence (i.e., all prior odors were delivered from low input strength level to high strength level). Moreover, we fixed the new odor to 8OH and varied the experienced odor inputs. We found that the final sparseness level of the mitral cell network in response to 8OH was negatively correlated with the correlation coefficients of input strength of experienced odors and 8OH (**Figure 2D**). Similar results were found in the cases of k7-1 and o-Eph as the second odors (Supplementary Figure 3A).

It is worth noting that the new odors we used in our model were different from the experienced odors. We also tested the case in which the second odor was the same as the first odor (experienced odor) in the single experienced odor condition and found no significant difference (Supplementary Figure 4).

A previous experimental study showed that prior odor experience could increase the tuning specificity of mitral cells to a variety of odors (Fletcher and Wilson, 2003). In our model, prior odor training could decrease the response of mitral cells to weak odor input, leading to a slight increase in the tuning specificity of the mitral cells (Supplementary Figure 5A). To test whether the network sparseness change observed above was due to the increase in tuning specificity of mitral cells, we set the responses of mitral cells receiving no input from a given new odor to the same values as in the naïve condition and left the remaining responses of mitral cells (those receiving an input strength of at least 1 from the new odor) unchanged as in **Figure 2C**; we found that the stable sparseness levels hardly changed under different conditions (one way ANOVA, Supplementary Figure 5B). Such results suggest that the observed sparseness change of the mitral cell network under prior odor experience conditions is mainly due to the increase of the sparseness of mitral cells with no input from the new odors. The experimentally observed tuning specificity of MCs after odor exposure (e.g., Fletcher and Wilson, 2003) may involve additional mechanisms that are beyond the present model simulation study.

vs. the correlation coefficients of input strength of these 14 experienced odor inputs and 8OH. The solid line represents the linear fitting curve.

To quantify how prior odor experience affects the evolution of the sparseness of the mitral cell network response to new odors, we fitted the time course of sparseness using a classical logarithmic function (**Figure 3A**). Then, based on the fitting curve, we determined the time at which the sparseness reaches half of the maximum value (denoted S<sub>1/2</sub>). We found that S<sub>1/2</sub> of the network response to new odor inputs in both the 8OH and o-Eph experience conditions is less than that in the naïve condition (one way ANOVA, p < 0.001, **Figure 3B**), implying that prior odor experience could accelerate the formation of sparse states in the mitral cell network in response to new odor inputs. Moreover, to determine how the correlation between experienced and new odor input strengths affects the evolution of sparseness induced by new odor inputs, we fixed the new odor as 8OH and varied the experienced odor inputs, then measured the S<sub>1/2</sub> of the network response to 8OH. Interestingly, we found that S<sub>1/2</sub> was negatively correlated with the correlation coefficient of the input strength of the experienced odors and 8OH (r = 0.89, **Figure 3C**). Similar results were found in the cases of k7-1 and o-Eph as the second odors (Supplementary Figure 3B). This may imply that the network response to a new odor input requires less time to evolve to a stable sparseness state when the experienced odor input is more similar to the new odor.

In summary, prior odor experience could accelerate the evolution of sparseness in the mitral cell network response to new odor inputs and increase the sparseness level.

### Prior Odor Input Experience Increases the Response of the Granule Cell Network

Previous experiments have shown that prior odor experience has a profound effect on the activity of the granule cell network in response to new odor inputs (Mandairon et al., 2008; Kato et al., 2012). We next tested the activity of the granule cell network in our model system, applying the same sparseness measure used for mitral cells to quantify the activity in the granule cell network. In contrast to the sparseness of the mitral cell network, the sparseness of the granule cell network decreased with the number of prior odors, implying that more prior odors lead to a larger increase of the response of the granule cell network to new odor inputs (**Figure 4**, one way ANOVA, p < 0.01).

fitting curve described by the classical logarithmic function (see Materials and Methods). (B) Half-time of sparseness (S1/2) of the mitral cell network response to 17 new odor inputs in the single odor input (8OH or o-Eph) experience or naïve conditions. S1/2 is the time elapsed from the first presentation of new odor inputs to the time when the sparseness reaches half of the maximum value. \*\*\**p* < 0.001, one-way ANOVA with Tukey's *post-hoc* comparison test. (C) S1/2 of mitral cell network responses to 8OH in nine different single odor input experience conditions vs. the correlation coefficients of input strength of these nine experienced odor inputs and 8OH. The solid line represents the linear fitting curve.

## Effects of Prior Odor Experience on Synaptic Weight in the Mitral Cell Network

Because synaptic plasticity exists in our model, the different stable sparseness of mitral or granule cell networks under different conditions may be due to the final synaptic weights in the bulb network. We tested the excitatory and inhibitory synaptic weight under different conditions in response to specific new odors. We divided the input strength into a strong group with strengths of 3 or 4 and a weak group with strengths of 0, 1, or 2. As shown in **Figure 5A**, the average excitatory synaptic weights for a mitral cell receiving weak or all inputs significantly increased with prior odor number, but decreased for mitral cells receiving strong inputs (two way ANOVA, p < 0.01). The same scenario applied to the average inhibitory synaptic weight (**Figure 5B**, two way ANOVA, p < 0.01).
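The grouping used here can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the data layout (one input strength and one average weight per mitral cell), and the example values are all assumptions.

```python
def mean_weight_by_group(input_strengths, weights):
    """Average synaptic weight for weak (0, 1, 2), strong (3, 4), and all inputs.

    `input_strengths[i]` is the odor input strength of mitral cell i and
    `weights[i]` is that cell's synaptic weight (hypothetical per-cell values).
    """
    groups = {"weak": [], "strong": [], "all": []}
    for strength, weight in zip(input_strengths, weights):
        if strength in (3, 4):
            groups["strong"].append(weight)
        elif strength in (0, 1, 2):
            groups["weak"].append(weight)
        else:
            raise ValueError(f"unexpected input strength: {strength}")
        groups["all"].append(weight)
    # Empty groups are reported as None rather than dividing by zero.
    return {name: (sum(v) / len(v) if v else None) for name, v in groups.items()}


# Made-up example with five cells.
result = mean_weight_by_group([0, 1, 2, 3, 4], [0.1, 0.2, 0.3, 0.4, 0.6])
```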

Our previous studies reported that the change of sparseness of the mitral cell network resulted from the evolved dynamic changes in the synaptic weight of both excitatory and inhibitory dendrodendritic synapses (Yu et al., 2013, 2014). Such a developed dynamic change of synaptic weights has been suggested to affect the changes in the time course of mitral cell network sparseness (Yu et al., 2013, 2014). Now we would like to examine how the synaptic weights will further evolve with the continuous training of prior odor experience, and then examine how this prior experience could modulate the response sparseness of the mitral network to new odor inputs. We analyzed the time course of average excitatory weight (Gex) and inhibitory weight (Gin) for mitral cells receiving strong inputs and weak inputs. Similar to our previous results (Yu et al., 2013, 2014), the time courses of response sparseness are tightly correlated with the changes of synaptic weight (especially excitatory synaptic weight) during the response to 8OH, both for naïve (Supplementary Figure 6) and single odor input experience (k7-1, Supplementary Figure 6). In the naïve condition, strong excitatory synaptic inputs gradually increased from 0.06 nS to a steady state of ∼0.47 nS after 2 s of 8OH input (Supplementary Figures 6A,B). However, in the k7-1 odor experience condition, the sparseness of the mitral cell network reached a maximum level immediately in response to new 8OH input, and the strong excitatory and inhibitory synaptic inputs also reached a maximum value at the beginning of the 8OH input period (Supplementary Figures 6A,B). Therefore, these results imply that prior odor experience accelerates the evolution of synaptic weight in the mitral cell network to the steady state, which in turn accelerates the evolution of sparseness in the mitral cell network in response to new odor inputs.

### Prior Odor Input Experience Decreased the Correlation of the Mitral Cell Firing Pattern

Sparse coding is an efficient scheme by which an individual neuron independently encodes different properties of the input (Olshausen and Field, 1996; Vinje and Gallant, 2000). This naturally leads us to predict that when the mitral cell network reaches a high sparseness level, the correlation among responses of mitral cells in the network should reach a low level. In fact, we have verified this prediction in our previous work (Yu et al., 2014). We therefore tried to determine whether prior odor experience would also affect the evolution of the decorrelated state among mitral cells in the network in response to incoming odor inputs. We quantified this correlation by averaging the correlation coefficients of all possible pairs of 500 mitral cell responses at each sniff point during new odor delivery. Similar to the evolution of sparseness in the mitral cell network, the correlation among mitral cells gradually evolved from a relatively high level to a low level (**Figure 6A**). We found that the correlation of mitral cell firing in the network in response to new odor inputs in the one odor experience (8OH or o-Eph) condition was lower than that in the naïve condition at all the sniff points tested (**Figure 6A**). The stable correlation level of mitral cell firing in the network (represented by the last sniff point) for both the 8OH and o-Eph experience conditions was significantly lower than that in the naïve condition (paired t-test, p < 0.05, **Figure 6B**). We then tested the correlation in conditions with more prior odor inputs. As shown in **Figure 6C**, the correlations of mitral cell firing in the network in response to new odor inputs at the first and last sniff following prior odor experience both decreased as the number of prior odors increased (one way ANOVA, p < 0.01).
Moreover, we fixed the new odor to 8OH and varied the experienced odor inputs; we found that the correlation of mitral cell firing in the network in response to the new odor 8OH at the last sniff was weakly linearly correlated with the correlation coefficients of the input strength of experienced odors and 8OH (**Figure 6D**). Similar results were found in the cases of k7-1 and o-Eph as the second odors (Supplementary Figure 3C). This result implies that the mitral cell firing response tends to be more decorrelated if the input strength of the incoming odor differs more greatly from that of the previously experienced odor.
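The pairwise averaging described above can be sketched as follows. This is an illustrative sketch only: it assumes each cell's response at a sniff point is represented as a numeric vector (for example, binned firing within that sniff), which is an assumption about the data layout, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # convention chosen here: a constant response is uncorrelated
    return cov / sqrt(vx * vy)


def mean_pairwise_correlation(responses):
    """Average Pearson correlation over all distinct pairs of cell responses."""
    pairs = list(combinations(responses, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Made-up responses of three cells: two co-vary, one is anti-correlated.
responses = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
mean_corr = mean_pairwise_correlation(responses)
```

For 500 cells this averages over 500 × 499 / 2 = 124,750 pairs per sniff point; a lower average indicates a more decorrelated population response.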

### Prior Odor Input Experience Decreases the Correlation of the Mitral Cell Network Response

We have already tested the effects of prior odor experience on the mitral cell firing pattern in response to the corresponding new odor inputs. A more direct way to measure the coding efficiency of the mitral cell network is to calculate the similarity of the network response to different odor inputs, especially to similar odor inputs (Yu et al., 2014). We therefore investigated how prior odor experience affects the mitral cell network response to different new odor inputs. Odors 7OH and 6OH are two very similar odor inputs (**Figure 7A**). After the training process, the response of the mitral cell network to 6OH differed more from the naïve network response to 7OH in the one odor input (8OH) experience condition than in the naïve condition (**Figures 7B,C**). For instance, the firing rates of mitral cells 1–100 to 6OH in the 8OH experience condition (**Figure 7C**, right) differed more from those to 7OH in the naïve condition (**Figure 7B**, left) than did the firing rates to 6OH in the naïve condition (**Figure 7B**, right). A similar result was also found when we compared the response of the mitral cell network to 7OH in the 8OH experience condition or the naïve condition with that to 6OH in the naïve condition (**Figures 7B,C**).

To quantify the similarity of the mitral cell network response to different odor inputs in different conditions, we measured the network response correlation as described in the Materials and Methods section. The correlations of mitral cell network responses evolved gradually from a relatively high to a low level (**Figure 8A**). Furthermore, the correlations between mitral cell network responses to incoming odor inputs in the one odor experience (8OH or o-Eph) condition were lower than in the naïve condition at all sniff points tested (**Figure 8A**).

The stable correlation level of the mitral cell network response (represented by the last sniff point) in both the 8OH and o-Eph experience conditions was significantly lower than in the naïve condition (one way ANOVA, p < 0.05, **Figure 8B**). We then tested the correlation in conditions with longer series of prior odor inputs. As shown in **Figure 8C**, the correlations of the mitral cell network response to incoming odor inputs after prior odor experience decreased at both the first and last sniff points as the number of prior odors increased (one way ANOVA, p < 0.01). As in **Figure 6D**, we fixed the new odor to 8OH and varied the experienced odor inputs, and we found that the correlation of the mitral cell network response to the incoming odor 8OH was weakly linearly correlated with the correlation coefficients of the input strength of experienced odors and 8OH (**Figure 8D**). Similar results were also found in the cases of k7-1 and o-Eph as the second odors (Supplementary Figure 3D).

### DISCUSSION

Using a scaled-up mitral-granule cell network model and a set of odors with relatively strong input strength, we systematically investigated how prior odor input affects the coding paradigm of mitral cells to new incoming odor inputs. The following findings were observed: (1) as the number of prior odors increased, the activity of the mitral cell network decreased and that of the granule cell network increased, gradually reaching an equilibrium level; (2) prior odor experience accelerated the formation of a stable sparseness level of the mitral cell network to new odors; (3) increasing prior odor experience also facilitated the evolution of the mitral cell network to a more decorrelated state; (4) prior odor experience decreased the correlation of the mitral cell network response to new odors, and this effect was more obvious after training the network with a larger number of prior odors. Note that all the changes gradually reach an equilibrium level that does not change with additional odor experience. All changes may be attributed to two key factors: (1) the continuous LTP effect for those mitral cells receiving a sustained strong input; (2) the activation of more and more mitral cells as more odors are presented to the network, inducing more dynamic changes in the excitatory and inhibitory synaptic weights of the dendrodendritic synapses. An equilibrium is reached when most of the relevant mitral cells have been activated during the past odor experiences.

(C, right) differs more than that of the naive condition (B, right) from the firing pattern of the network to 7OH in the naïve conditions (B, left).

## Sparseness of Mitral and Granule Cells in Prior Odor Experience Conditions

Sparse coding has been suggested as an efficient way to code sensory inputs (Olshausen and Field, 1996; Rinberg et al., 2006; Davison and Katz, 2007; Koulakov and Rinberg, 2011). Previous studies have found that the response of the mitral cell network to incoming odor inputs decreases after prior odor experience (Buonviso et al., 1998; Buonviso and Chaput, 2000; Fletcher and Wilson, 2003; Kato et al., 2012). We systematically tested several prior odor inputs in our large-scale olfactory bulb model, and we found that the sparseness of the mitral cell network response to incoming odors increases with the number of prior odors (**Figure 2C**). Moreover, we found that the sparseness of the granule cell network response to incoming odors decreases with the number of prior odors (**Figure 4B**). In fact, previous experimental results also showed that prior odor experience could increase the activity of interneurons (Mandairon et al., 2008). We further found that the average excitatory and inhibitory synaptic weights both increase with the number of prior odor inputs (**Figure 5**). As the mitral cell is the main target of the granule cell and the granule cell is also the main target of the mitral cell, we can infer that the increase of excitatory synaptic weight leads to the increase of granule cell activity, and that the increase of inhibitory synaptic weight combined with the increase of granule cell activity leads to the decrease of mitral cell activity.

A previous experimental study also showed that prior odor experience could decrease granule cell activity (Kato et al., 2012). In our simulations, prior odor experience decreases the response of the mitral cell network, which tends to decrease the activity of granule cells. By contrast, prior odor experience increases the average excitatory synaptic weight onto granule cells, which tends to increase the activity of granule cells. These two opposing effects of prior odor experience on granule cell activity might explain the varied changes in granule cell activity after prior odor experience observed in experiments.

It should be noted that an experimental study showed that mitral cell responses decrease more after exposure to the same prior odor than after experience of a different odor (Kato et al., 2012). In our work, the response of mitral cells hardly changed in the same prior odor condition. This suggests that the phenomenon observed by Kato et al. (2012) might be attributed to a different mechanism that is not included in the current model network.

Previous experimental results have also shown that the first sniff after odor input is very important for odor discrimination behavior (Uchida and Mainen, 2003; Cury and Uchida, 2010). In addition to analyzing olfactory bulb responses at equilibrium, we also analyzed the response properties right after the first sniff in all conditions. We found that the sparseness change (or the change in the correlation of the mitral cell network response) induced by prior odor experience is more significant for the first sniff cycle than for the last sniff cycle. More experienced odors result in a smaller difference in the sparseness level (or response correlation) between the first and last sniff cycles. This may suggest that the response in the first sniff may contain important information for odor discrimination that can be enhanced by the experienced odors.

odor input experience conditions vs. the correlation coefficients of input strength of these 14 experienced odor inputs and 8OH. The solid line represents the linear fitting curve.

It is worthwhile to note that the rate of increase of the sparseness of the mitral cell network tends to decrease as the number of prior odors increases (**Figure 2C**). For instance, the increase of sparseness between the three and five odor conditions (an increase of only two odors) is larger than that between the 15 and 72 odor conditions (an increase of 57 odors). We may infer that the sparseness level of the mitral network would saturate after a certain number of prior odor experiences. Another interesting possibility that needs further testing is that different training sequences of a series of prior odors would have significantly different effects on the response of the mitral cell network to incoming odors.

### Sparseness Evolution in Mitral Cell Network

Previous studies have shown that a sparse spiking representation of a specific odor can emerge naturally from the mitral-granule cell synaptic connections after several seconds of a learning period (with a certain odor input frequency) (Yu et al., 2013, 2014). This phenomenon may correspond to the learning process of an animal exposed to a new odor input. In the one odor experience condition, we found that prior odor experience accelerates the process of reaching the stable sparse state (**Figure 3**). Furthermore, we found that this acceleration was well correlated with the acceleration of the excitatory and inhibitory synaptic weights toward their maximum values (Supplementary Figure 6). Further experiments are needed to confirm these findings.

We also found that the more similar the new odor was to the experienced odor, the faster the mitral cell network reached a stable sparseness level, which may be due to less time needed to train more overlapping synaptic interactions to reach steady state. On the other hand, more disparate prior experienced odors lead to a higher stable sparseness level of mitral cell to the new odor, which may be due to the increase of the overall inhibitory synaptic weight resulting from the activation of more mitral and granule cells by more experienced odors. We may infer that the rate of formation of stable sparseness and the sparseness level itself are two different aspects of the odor representation of the mitral cell network.

### Decorrelation of the Mitral Cell Network Response

Our previous study has shown that the response of the mitral cell network tends to be decorrelated and that this decorrelation is accompanied by sparseness (Yu et al., 2014). Our current work shows that correlations within the mitral cell network in response to new odors decrease with the number of prior odors (**Figure 6C**). A similar relationship holds for the correlation between mitral cell network responses to different odor inputs, which is a direct way to quantify coding efficiency under different conditions (**Figure 8C**). This may partly explain why enrichment could increase the ability of an animal to discriminate different odors (Mandairon et al., 2006a,b,c; Sinding et al., 2011).

Previous experimental and computational results have extensively shown the importance of granule cell activity and inhibitory synaptic weight for representation of odor inputs in olfactory bulb network (Mandairon et al., 2006a, 2008; Koulakov and Rinberg, 2011; Kato et al., 2012). We infer that such a decorrelated state between mitral cell firing in a specific network and the network response to different odor inputs is due to an increase in granule cell activity and inhibitory synaptic weight after odor experience.

In summary, using a scaled-up olfactory bulb model, we systematically investigated how prior odor experience affects the sparse representation of new odor inputs by the olfactory bulb network. In conclusion, the gradually increased inhibitory weight of granule cells together with the slightly increased firing rates of granule cell populations promote the response sparseness and decorrelated state of mitral populations to new odor inputs. These results may help to better explain how prior sensory experience affects the behavior of animals in response to new odor inputs.

### AUTHOR CONTRIBUTIONS

YY and ZS designed research; ZS and MM performed research; ZS and YY wrote the paper. All authors reviewed the manuscript.

### ACKNOWLEDGMENTS

We are grateful for support from the National Natural Science Foundation of China (31271170, 31571070), the China 863 program (2015AA020508), and the program for the Professor of Special Appointment (Eastern Scholar SHH1140004) at Shanghai Institutions of Higher Learning. MM was supported by grant RO1-NS11613 from the National Institutes of Health, USA. Thanks to Tom McTavish for proofreading.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnana.2016.00010


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Zhou, Migliore and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Closed-Loop Brain Model of Neocortical Information-Based Exchange

#### James Kozloski\*

IBM Research Division, Computational Biology Center, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

Here we describe an "information-based exchange" model of brain function that ascribes to neocortex, basal ganglia, and thalamus distinct network functions. The model allows us to analyze whole brain system set point measures, such as the rate and heterogeneity of transitions in striatum and neocortex, in the context of neuromodulation and other perturbations. Our closed-loop model is grounded in neuroanatomical observations, proposing a novel "Grand Loop" through neocortex, and invokes different forms of plasticity at specific tissue interfaces and their principal cell synapses to achieve these transitions. By implementing a system for maximum information-based exchange of action potentials between modeled neocortical areas, we observe changes to these measures in simulation. We hypothesize that similar dynamic set points and modulations exist in the brain's resting state activity, and that different modifications to information-based exchange may shift the risk profile of different component tissues, resulting in different neurodegenerative diseases. This model is targeted for further development using IBM's Neural Tissue Simulator, which allows scalable elaboration of networks, tissues, and their neural and synaptic components toward ever greater complexity and biological realism.

#### Edited by:

Wolfram Schenck, University of Applied Sciences Bielefeld, Germany

#### Reviewed by:

Alino Martinez-Marcos, Universidad de Castilla, Spain; Alexander Peyser, Forschungszentrum Jülich, Germany

#### \*Correspondence:

James Kozloski kozloski@us.ibm.com

Received: 29 August 2015 Accepted: 02 January 2016 Published: 18 January 2016

#### Citation:

Kozloski J (2016) Closed-Loop Brain Model of Neocortical Information-Based Exchange. Front. Neuroanat. 10:3. doi: 10.3389/fnana.2016.00003

Keywords: neocortex, thalamus, basal ganglia, information-based exchange, brain model

### 1. INTRODUCTION

Synaptic plasticity regulates neuronal responses to patterns of inputs impinging on dendritic arbors from multiple presynaptic sources. Resulting input selectivity at the single neuron level is often associated with learning and memory in models of cognition. At the circuit level, synaptic plasticity can serve more complex functions over arbitrary inputs, from selecting fixed points in recurrent networks (Hopfield, 1982), to implementing optimizations such as information maximization in artificial neural networks (Linsker, 1997), to dynamically encoding inputs in winnerless networks (Rabinovich et al., 2001). A challenge to analyzing the role of any neuron or circuit that implements these functions for cognition is that of modeling appropriate, naturalistic neuronal and circuit inputs, which in real brains derive from tens of thousands to millions of other neurons.

Here we present a closed-loop brain model, including component models of several neural tissues that we hypothesize implement some of these functions. Synapses and plasticity connecting components at principal cell interfaces together create a set of closed neuroanatomical loops. Without extrinsic inputs or stochastic intrinsic drivers, our model avoids the challenges and assumptions of modeling naturalistic inputs separately, and instead derives them exclusively from the dynamics of upstream neurons and tissues. The challenge then is model validation, which we do not address in this report. Instead, the aim here is to delineate hypotheses and a theory of brain resting state function using the model and its simulations. We propose that models implemented similarly constitute a class of "brain models," and are distinct from component "neural tissue models," which instead assume an arbitrary set of inputs or stochastic processes to drive intrinsic tissue dynamics. By avoiding these assumptions, a coarse but consistent model of global brain function may be useful for better constraining the most detailed neural tissue simulations.

We introduce the term "traversal" to refer to a "synfire chain" as defined by Abeles (1991), but with additional neuroanatomical constraints defining a minimum set of neocortical regions traversed by the event. The cortico-cortical feedback loop in our model acts as a substrate for combined traversals of sensory, limbic, and motor areas, which we propose together drive behavior in the organism. The cortico-thalamocortical feed forward loop acts to maximize the entropy of these global traversals and to maximize information about the environment relayed as inputs to the loop. Lastly, the striato-nigro-striatal loop provides a means to select subsequent configurations by monitoring changes in ongoing traversals and signaling them with dopamine to alter routing within the feed forward entropy maximizing network. We propose this function as the substrate for reward learning in the organism. Each of these loops therefore has both a closed-loop function (global traversal, traversal entropy maximization, and traversal change monitoring and rerouting) and an organismal input-output function (behavior generation, sensory processing, and behavior selection based on reward learning).

The objective of this report is to describe the closed-loop model and simulations of it. Perturbations to the closed loop that alter dynamic set points will also be described. We hypothesize about dynamic disease mechanisms and progression based on the model, and describe methods that use neural tissue simulation (Kozloski and Wagner, 2011) for modeling treatments that alter brain disease risks. To summarize our overall approach and long-term research goals, the driving hypotheses relating our closed-loop model to brain disorder and disease states are: (1) The primary disease and disorder risk is a disturbance in plasticity that critically maintains brain system dynamic set points; (2) Compensatory circuit dynamics achieve near-normal set points despite genetic or environmental perturbations, but with increased secondary risk of neuronal dysfunction, damage, or loss; (3) Secondary risk correlates with feed forward destruction or dysfunction of neural tissues because with each neuron function lost, maintaining system set points requires even greater secondary risks; and (4) Slowing progression may therefore lie in mitigating the primary risk's effect on system set points or in limiting secondary risks incurred by inherent compensatory dynamics.

### 2. INFORMATION-BASED EXCHANGE BRAIN MODEL

### 2.1. Cytoarchitectonics of Bidirectional Neocortical Projections: The "Grand Loop"

We propose a model that emphasizes a specific cortico-cortical connectivity across the major sensory, limbic, and motor categories of Brodmann areas. This emphasis derives from several observations. First, we note the importance of signals traversing all three categories of cortical representations in order to produce a stable basis for perception and behavior by integrating information about the environment, internal needs, and behavioral opportunities of the organism. While many loops have been discovered in studies of the neocortical connectome, none provide the directed graph (feed forward vs. feedback) needed to identify a system to support such traversals. Instead, we note that the cytoarchitectonic granularity of neocortical areas provides one means to interpret feed forward (i.e., more granular to less granular) and feedback (i.e., less granular to more granular) connections between cortical areas (Rempel-Clower and Barbas, 2000) and therefore a means to identify a backbone for global brain traversals (**Figure 1A**).
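The granularity rule described above (feed forward from more granular to less granular areas, feedback in the reverse direction) can be expressed as a small classification function. This is a minimal sketch: the numeric ranks, the function name, and the handling of equal granularity are assumptions made here for illustration and are not part of the model specification.

```python
# Hypothetical ordinal ranks for cytoarchitectonic granularity.
GRANULARITY_RANK = {"agranular": 0, "dysgranular": 1, "granular": 2}


def projection_direction(source_granularity, target_granularity):
    """Classify a cortico-cortical projection by the granularity of its endpoints."""
    source = GRANULARITY_RANK[source_granularity]
    target = GRANULARITY_RANK[target_granularity]
    if source > target:
        return "feed forward"   # more granular to less granular
    if source < target:
        return "feedback"       # less granular to more granular
    return "undetermined"       # equal granularity: the rule gives no direction


forward = projection_direction("granular", "dysgranular")
backward = projection_direction("agranular", "dysgranular")
```

Applying such a rule to every connected pair of areas turns an undirected connectome into a directed graph, which is what is needed to identify a backbone for traversals.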

Granularity refers to the density of punctate Nissl bodies in stained layer 4 of neocortex. The granularity across all of neocortex was studied and mapped extensively by von Economo (1929), and we reproduce his illustrations and some key findings in **Figures 1A,B**. Note that granular cortices typically have smaller diffuse Nissl bodies in layer 5, and agranular cortices have very large diffuse layer 5 Nissl bodies. Tiling in von Economo's map shows that regions of cortex with similar granularity are adjacent, with key exceptions at the boundaries between primary motor (M1) and primary somatosensory (S1), hippocampus (HC) and retrosplenial granular areas (RGA), and subgenual anterior cingulate (ACC) and prefrontal (PFC) cortices. Each of these three pairs of Brodmann areas are interconnected, and in our model represent key boundaries in the backbone for traversing the sensory-limbic (HC-RGA), limbic-motor (ACC-PFC), and motor-sensory (M1-S1) cortices (**Figure 1B**, black arrows). To complete a "Grand Loop" backbone, we join each pair of areas by an area in their adjacent dysgranular neocortical regions: the secondary somatosensory (S2), posterior cingulate (PCC), and supplemental motor (SMA) areas (**Figure 1C**). While others have noted that organizing principles for intrinsic microcircuits may be derived from combining von Economo's observations with those regarding granularity and the direction of cortico-cortical projections (Beul and Hilgetag, 2015), none to our knowledge have proposed a Grand Loop that traverses all of neocortex according to these principles.

### 2.2. Cortico-Cortical and Cortico-Thalamo-Cortical Functional Pathways

Having defined the feed forward neocortical Grand Loop, we will now embellish this structural model with additional components based on observations regarding feed forward projections and signaling between neocortical areas. Sherman and Guillery emphasized different roles for direct cortico-cortical feed forward projections, which join one cortical area to another through their supragranular layers, and indirect cortico-thalamo-cortical projections, which join infragranular layers of the same original area to the granular layer of the same target area (**Figure 2**; Guillery and Sherman, 2011). In Sherman and Guillery's model, direct cortico-cortical projections are "modulatory," providing restricted activation to the target area, and indirect cortico-thalamo-cortical projections are "driving," providing activation across all layers of the target area. **Figure 2** represents Sherman and Guillery's model (based on a simplification of their schematic). We will now describe how this model may be integrated into the Grand Loop.

FIGURE 1 | (A) Granularity of different neocortical areas, adapted from von Economo (1929). Colors at bottom correspond to the map in (B). (B) von Economo's neocortical tiling based on the granularity of large regions of neocortex spanning multiple Brodmann areas. The locations of three Brodmann areas per stage are waypoints along a feed forward Grand Loop (arrows). (C) These Brodmann areas are connected based on projection data. Evidence that feed forward connections progress from granular to agranular areas provides directionality. The reciprocal feedback loop is not shown.

Recall that each station of the loop in **Figure 1** is coupled in the feed forward direction. These connections, largely through the supragranular layers, are mirrored in the feedback direction by connections through infragranular layers (**Figure 2**; Rempel-Clower and Barbas, 2000). Thus, the Grand Loop represents two reciprocal loops, one in the feed forward direction and one in the feedback direction. Furthermore, according to Sherman and Guillery, higher order thalamic nuclei provide at every stage a redundant relay for driving inputs over each feed forward connection. The local cortical circuit then receives signals from these thalamic nuclei and mixes the otherwise independent direct feed forward modulation loop and feedback traversal loop, primarily at layer 4's synaptic connections onto supragranular layers, and at supragranular layers' connections onto the infragranular layers' apical dendrites.

We proposed previously that layers 2/3 of neocortex implement a network for maximizing mutual information between thalamic inputs and cortical responses (Kozloski et al., 2007). Entropy maximization in these layers (equivalent to information maximization when noise in the inputs is assumed to be negligible) would require a dense lateral network (Linsker, 1997), which fits well with the high proportion (∼22%) of total cortical synapses dedicated to intralaminar 2/3 connections (Binzegger et al., 2004). Given this role for the supragranular layers, we now propose that the role of cortico-thalamo-cortical driving inputs in Sherman and Guillery's model is to provide inputs both from first order thalamic nuclei about the environment and from feedback traversals through higher order thalamic nuclei about the behavioral state of the organism. A global supragranular network then extracts maximally informative features from combinations of these inputs. In addition, we propose that these features become conditional modulators on feedback traversals by boosting or reducing the gain on proximal inputs to layer 5 neurons by means of synaptic inputs onto their apical dendrites from layer 2/3 neurons.
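The equivalence between entropy maximization and information maximization noted above follows from the standard decomposition of mutual information. Writing $X$ for the thalamic input and $Y$ for the cortical response (symbols introduced here only for illustration):

$$
I(X;Y) = H(Y) - H(Y \mid X)
$$

The conditional term $H(Y \mid X)$ measures the variability of the response that is not explained by the input, that is, the noise. If this term is negligible, or at least does not depend on the network's synaptic weights, then maximizing $I(X;Y)$ over those weights amounts to maximizing the response entropy $H(Y)$.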

### 2.3. Basal Ganglia Gating of Feed Forward Functional Pathways

In Sherman and Guillery's model, thalamic relay neurons in both first order and second order nuclei are subject to modulation. Modulation may derive from direct cortico-thalamic feedback from layer 6, inhibition from the thalamic reticular nucleus, or from neuromodulatory inputs such as norepinephrine from the locus coeruleus. Sherman and Guillery's model derives largely from their studies of sensory cortices and feed forward pathways through them, projecting from more granular to less granular regions. Here we extend the discussion of thalamic relay neuron modulation to include a role for inhibitory inputs from the basal ganglia to thalamic nuclei that act as relays in the frontal lobe from more granular limbic and motor areas to less granular areas in these regions.

The basal ganglia (including ventral limbic and dorsal motor) are in a privileged position to influence traversals by means of their inhibitory inputs onto thalamic relay neurons within the Grand Loop. These inputs derive from nucleus innominata in the ventral limbic subpallium and from the globus pallidus in the dorsal motor subpallium. Neurons in the ventral pallidus (nucleus innominata) receive inhibition from medium spiny neurons (MSNs) in the nucleus accumbens (ventral striatum) and those in the dorsal globus pallidus from those in the dorsal striatum. These neurons then either directly disinhibit thalamic relay neurons or indirectly inhibit thalamic relay neurons through an additional stage of inhibitory neurons (in globus pallidus, this is organized as direct and indirect projections through the external and internal segments). Spiking models of inhibitory pallidothalamic gating have focused on the bird song system (Goldberg et al., 2012), where gating inputs to thalamic relay neurons serve the role of transitioning syllables of the organism's vocalizations. Here we propose a more generic role for this gating in selecting and deselecting different pathways for internal traversals of the pallium.

Inputs to these direct and indirect pathways through basal ganglia derive from neocortical layer 5 neurons' projections onto MSNs, and their corticostriatal synapses undergo spike-timing-dependent plasticity (STDP), which is modulated differentially by dopamine depending on the selective expression of either D1 dopamine receptors in the direct or D2 dopamine receptors in the indirect pathways (Pawlak and Kerr, 2008; **Figure 3**). Each layer 5 neuron's collaterals then include a branch descending to the brainstem or spinal cord, a branch descending to thalamus (Guillery and Sherman, 2011), and additionally a branch descending to striatum (Lévesque et al., 1996). A recent review of additional types of layer 5 projection neurons and the role of corticostriatal connectivity in disease provides a thorough examination and schematic of these pathways (Shepherd, 2013), and our model of thalamic gating, for now and for simplicity, includes only the "Pyramidal Tract" layer 5 neurons and their projections to basal ganglia and thalamus for the function of thalamic gating.

In summary, our model extends Sherman and Guillery's model of cortico-thalamo-cortical gating of driving, feed forward inputs to include modulation from striatal and pallidal neurons in both the direct and indirect pathways (**Figure 3**). MSNs in our model receive convergent layer 5 collaterals from all layer 5 neurons that send convergent collaterals onto a specific thalamic relay neuron. This relay neuron is then gated by the same MSNs, indirectly through globus pallidus (the gate opens for the direct pathway, and closes for the indirect pathway). Such a scheme does not preclude so called "closed loops" that originate and terminate in the same cortical area (Kelly and Strick, 2004), but downplays their significance as only partial regulators of feed forward thalamic gating (**Figure 4A**).

The basal ganglia in our model is then a "forward driver gate" for all feed forward driving signals relayed through the frontal lobe's cortico-thalamo-cortical functional pathways. Because these pathways relay layer 5 traversals through thalamus to the granular and supragranular layers of cortex, they can indirectly control the routing of feedback signals and the selection of certain traversals over others through the Grand Loop, as we describe in the next section. Additional area to area cortico-thalamo-cortical pathways not on the main loop backbone (such as the visual system) are then available for additional modulation and traversals of the global layer 5 behavioral network, possibly including loops requiring reafference from the environment.

### 2.4. Information Based Gating of Feedback Traversals

Our model provides two distinct functional signaling pathways through the Grand Loop: feed forward for driving the supragranular entropy maximizing network, and feedback for traversal of the infragranular behavior generation network. The latter, in our implementation of the model, propagates synfire events through a loop, as described by Zheng and Triesch in their model of "synfire ring" formation and propagation (Zheng and Triesch, 2014). Restricting synfire activity to the feedback direction is a key aspect of our model. Unlike other models of feedback, which ascribe to it solely a sensory processing "top down" function, we model the propagation of feedback activity as potentially independent of feed forward activity (for example when a coupling parameter between these two networks is zero). Specifically, activations in the supragranular layers are rate coded, while activations in the infragranular layers are spike timing based in order to support synfire events. (We won't speculate here on how these distinct coding schemes are implemented and maintained by the neocortical microcircuit, but it would seem there are ample mechanisms available.)

Conditional coupling between features, extracted by information maximization in the supragranular layers, and spike propagation in the infragranular layers, is then under the control of a parameter that models cholinergic modulation in neocortex. Acetylcholine enhances the influence of sensory inputs on pyramidal cell firing relative to their processing of intrinsic signals within neocortical circuits (Hasselmo and Giocomo, 2006). We model this modulatory parameter as changing the slope and dynamic range of a gain function. The function sets the gain on feedback integration within the synfire ring based on the level of activity in the corresponding functional units (e.g., orientation columns) in layer 2/3. Thus, feature encoding acts as a gate for synfire propagation, and we hypothesize this gain function may be implemented by layer 2/3 inputs to layer 5 neurons' apical dendrites. Varying cholinergic modulation of these inputs in cortex then controls the slope and dynamic range of the mapping from layer 2/3 activity to layer 5 feedback integration gain. The result is that propagation of synfire activity through a column of cortex is informed by the categorization of thalamic inputs to that same area. Information maximization among responses in the supragranular areas over environmental inputs becomes entropy maximization of synfire propagation pathways through the infragranular layers, provided that coupling between these is strong (i.e., cholinergic modulation is high). It is because of this coupling that we have named our model an information-based exchange network.

### 2.5. The Forward Driver Gate: Bursting, Modulation, and Plasticity

Having proposed a central cortico-thalamo-cortical routing function for striatal MSNs by means of their directly disinhibiting or indirectly inhibiting thalamic relay neurons, we will now propose on what basis a striatal MSN adapts to perform this function in the context of system set points. We call this the forward driver gate's "routing function." Our model of MSN firing includes constraints from a weak, asymmetric lateral inhibitory network giving rise to "winnerless competition" (Rabinovich et al., 2001), and closely matching the periods of striatal bursting lasting hundreds of milliseconds observed in vivo (Miller et al., 2008). Ponzi and Wickens have similarly used this network to model spiking properties of striatum (Ponzi and Wickens, 2010), and have shown that at transition points in the lateral network configuration (from low, ∼10%, to high, ∼20%, rates of connectivity), an optimal balance is achieved that facilitates winnerless encoding of variations in driving inputs from neocortex (Ponzi and Wickens, 2013). To achieve this balance, our model instead varies the strength of cortical inputs dynamically by a dual source of modulation of STDP at the corticostriatal synapse.

The first dynamic modulator of STDP at the corticostriatal synapse in our model is GABA inhibition from the lateral network, itself responsible for "turn-taking" among MSNs and their bursts, characteristic of the winnerless network. We assume that both direct and indirect pathways show STDP reversal under GABA inhibition (Fino et al., 2010; Paille et al., 2013), and we model winnerless competition between striatal neurons as the source for this inhibition (**Figure 4B**).

The second dynamic modulator of STDP at the corticostriatal synapse is dopamine. Given the routing function's potential as a critical determiner of the emergence of behavior, affect, and cognition in the organism via its direct control over traversals of the layer 5 network, reward-based learning of this function is ultimately required. For now, we simulate our brain model of information-based exchange with dopamine-based learning serving only a closed-loop function, separate from the environment and therefore independent of reward encoding. This closed-loop function is sensitive to system set points and monitors traversals. It is equivalent to so-called "tonic firing" in dopamine neurons, which can also include bursts. We propose that the intrinsic dynamics of dopamine neuron membrane currents implement this closed-loop function by measuring time and the abruptness of changes to system states, with bursts generated under specific conditions summarized below. Dopamine provides a potent modulation of STDP at the corticostriatal synapse (Pawlak and Kerr, 2008), and further modulates it differentially at the inputs to D1-MSNs and D2-MSNs. In our model this differential modulation, combined with GABA modulation, produces the complex routing function summarized in **Figure 4C**.

Dopamine neurons have been shown to fire bursts of action potentials to signal basal reward inputs to the organism encoded as strong excitatory inputs to medial tegmentum and substantia nigra pars compacta (SNc) neurons. Because of these responses, the dopamine system has been extensively modeled as recapitulating reinforcement learning and operant conditioning in the organism. We propose here for the first time an additional closed-loop role for dopamine neurons in learning routing functions and selecting traversals. Specifically, we propose that dopamine neurons signal changes to traversals, and thereby influence the subsequent emergence of new traversals. The basis for this proposal derives from recent connectomics studies, which demonstrate that 70% of dopamine neuron inputs are inhibitory, and that most of this inhibition arises from striatum (Watabe-Uchida et al., 2012). Our closed-loop role for dopamine modulation depends on this inhibition, and this proportion and source suggest that closed-loop responses to inhibitory inputs, not open loop responses to basal excitation and reward, may be the predominant operating mode of the dopaminergic system.

Dopamine neurons exhibit heterogeneous combinations of intrinsic $I_h$ and $I_A$ currents (Amendola et al., 2012), as well as T-type calcium currents, which together generate post inhibitory rebound bursting in slice preparations. These currents' role in vivo has not yet been demonstrated, but our model assumes that the dynamical criteria for dopamine neuron bursting (and subsequent learning of routing at corticostriatal synapses) are implemented at least in part by rebound bursting. Other models have explored rebound bursting in dopamine neurons (Lobb et al., 2011), but not in the context of a closed-loop regulatory function. In our model, if the duration and abruptness of removal of striatal inhibition to dopamine neurons are appropriate, a rebound burst occurs. This aspect of the model indirectly imposes the additional criterion that inputs from layer 5 to striatum that transition the MSN winnerless network should be similarly matched to the duration and abruptness of change required for rebound spiking in dopamine neurons. In this way the striato-nigro-striatal loop monitors changes in traversals and alters routing within the feed forward entropy maximizing network by modulating corticostriatal STDP. We therefore propose that MSNs learn this routing based in part on their ability to recognize patterns of spiking in layer 5 that remain stable for a minimum duration of time then fade rapidly, a property expected during traversals of the Grand Loop.

### 3. SIMULATION METHODS

We simulated the model to explore its dynamics, characterize preliminary set points for measurement and analysis, and study traversal behavior and its regulation under different modulatory conditions. The five major components of the model to be simulated included cortical layers 2/3, 5, thalamus, striatum, and dopamine neurons. Meeting this challenge at the detailed level of neural tissue simulation is beyond the scope of this report, and without a good understanding of target model set points, likely impossible. We therefore aimed to draw upon four simplified abstractions of the key behaviors we ascribe to principal cells in these structures (Linsker, 1997; Rabinovich et al., 2001; Mihalas and Niebur, 2009; Zheng and Triesch, 2014). With four base component models replicated from other studies, we then coupled them across novel interfaces, realizing the closed, functioning Grand Loop, complete with its subcortical regulators.

### 3.1. Component Models

Four component models from the literature were targeted here to capture the functions of cortical layers 2/3 and 5, striatum, and dopamine neurons in the brain model. These four met sufficient requirements to implement information-based exchange (**Algorithm 1**), with very few changes to published parameters. We list the models below and describe the requirements they satisfy. Parameters defined in the original references for each component model are found in **Table 1**. Because thalamic relay neurons were implemented as a simple set of sums over inputs, they are described as an interface between component models in the subsequent section.

• **Neocortex, layer 2/3**: The model applies the "Infomax" algorithm of Bell and Sejnowski (1995) to thalamic relay neuron inputs. A neural network implementation of the same optimization (Linsker, 1997), based entirely on a local learning rule, establishes the biological plausibility of this function for this tissue (Kozloski et al., 2007). The three stage network modifies a weight matrix $C$ that couples thalamic relay neurons to cortical units (e.g., at layer 4) based on microcircuit feedback from layer 2/3. Stage one receives the rate vector $\hat{x}_I$ from an ensemble of time averaged thalamic spike trains and computes the zero mean input vector $x_I = \hat{x}_I - x_0$, where $x_0 \approx \langle \hat{x}_I \rangle$ is learned at the learning rate $\beta_{x_0}$. Stage two computes the sum of weighted inputs to stage three, $u \equiv C x_I$. In addition, each stage two unit computes an element of the output vector $y$, $y_i = \sigma(u_i)$, where $\sigma(\cdot)$ denotes a nonlinear squashing function, here the logistic transfer function, $y = 1/(1 + e^{-(u + w_0)})$, where $w_0$ is an adaptive output bias vector learned by $\Delta w_0 = \beta_{w_0}[1 - 2y]$. The output vector $y$ maximizes the mutual information over the input ensemble and provides regulatory microcircuit feedback to the model of layer 5 described below.

Stage three then computes an entropy maximizing learning vector, which is fed back to stage two and applied by Hebbian learning to modify $C$ at the learning rate $\beta_{C_0}$. Derived by Linsker (1997), this learning vector when applied in this way precisely yields the Infomax anti-redundancy term of Bell and Sejnowski (1995), $(C')^{-1}$ (i.e., the inverse of the transpose of the input weight matrix), which for simplicity may also be computed directly. In this model of layer 2/3, the entropy maximizing learning vector emerges from a fully connected lateral network, whose weight matrix $\hat{Q}$ undergoes Hebbian learning according to $\Delta\hat{Q} = \beta_Q[uu' - \hat{Q}]$, such that $\hat{Q} \approx Q = \langle uu' \rangle$. For a given input presentation, these lateral connections evolve an auxiliary vector $v$ according to $v_t = v_{t-1} + u - \alpha\hat{Q}v_{t-1}$. Regardless of initial $v$, and assuming the scalar $\alpha$ is chosen so that $v$ converges, the Infomax anti-redundancy term can be approximated by iterating the lateral network and applying its output by Hebbian learning, since $(C')^{-1} = Q^{-1}C\langle x_I x_I' \rangle$, and by substitution, $(C')^{-1} = \alpha\langle v_\infty x_I' \rangle$ (Linsker, 1997).

TABLE 1 | Component models and parameters.

• **Neocortex, layer 5**: The model evolves from a self-organizing recurrent network (SORN) of binary spiking units through application of homeostatic plasticity, weight normalization, and STDP learning rules, together with synaptic pruning and synaptogenesis (Zheng and Triesch, 2014). This biologically consistent set of synaptic modifications creates distributions of synaptic densities and weights that evolve over time to closely match data from developing neocortex. The weight matrix W also develops robust feed forward motifs and synfire activity similar to the model of Kozloski and Cecchi (2010), but with the remarkable topological feature of a closed, global loop of distinct propagation layers (**Figure 5**), which together engender "synfire rings." We evolved this network for 200,000 time steps ($\Delta t = 1$ ms) to create four areas of cortex, which were then embedded into the larger model as two frontal lobe (M1, Msup) and two sensory lobe (S1, Ssec) areas. Weights close to zero were held at zero for the remainder of all simulations. Propagating activity is maintained in the excitatory network, satisfying the requirement for layer 5 traversals. An inhibitory network that undergoes biologically plausible inhibitory STDP at its synapses onto excitatory neurons, together with homeostatic plasticity in the excitatory network, maintains spiking activity, s(t), in the synfire ring at a nominal firing rate of 100 spikes/s. The inhibitory network imposes global, persistent competition across the network of excitatory layers. We propose this inhibition as an approximate functional model of inhibition from the thalamic reticular nucleus, which also integrates activity from across the thalamocortical system.


forward area to area connections over all others. This self-organized topology supports synfire ring activity, which in the current model is referred to as traversal activity.

the voltage-dependence of $\Theta(t)$ permits the model to generate rebound spiking under conditions when the neuron has been hyperpolarized deeply, or for a prolonged period (**Figure 7**). Due to the independent spike-induced current $R_2$, each rebound event generates a burst of four action potentials. This simplification's phenomenology also approximates that generated by other more complex models of rebound firing in dopamine neurons (Lobb et al., 2011).
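The lateral-network approximation described for the layer 2/3 component can be checked numerically. The following is a minimal sketch, not the authors' Matlab implementation: it assumes a small 2 × 2 weight matrix and a hand-picked zero mean input ensemble, computes $Q = \langle uu' \rangle$ directly from the ensemble rather than learning $\hat{Q}$ by the Hebbian rule, and chooses $\alpha$ as the inverse trace of $Q$ so that the iteration converges. All function and variable names are illustrative.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def outer_mean(us, xs):
    """Ensemble average <u x'> over paired samples."""
    n = len(us)
    return [[sum(u[i] * x[j] for u, x in zip(us, xs)) / n
             for j in range(len(xs[0]))] for i in range(len(us[0]))]

def lateral_fixed_point(q, u, alpha, steps=200):
    """Iterate v_t = v_{t-1} + u - alpha * Q v_{t-1}, starting from v = 0."""
    v = [0.0] * len(u)
    for _ in range(steps):
        qv = matvec(q, v)
        v = [vi + ui - alpha * qi for vi, ui, qi in zip(v, u, qv)]
    return v

def anti_redundancy_term(c, xs, steps=200):
    """Approximate (C')^{-1} as alpha * <v_inf x_I'>."""
    us = [matvec(c, x) for x in xs]          # u = C x_I
    q = outer_mean(us, us)                   # Q = <u u'> (assumed known, not learned)
    # Assumption: alpha = 1 / trace(Q) keeps every alpha * eigenvalue in (0, 1].
    alpha = 1.0 / sum(q[i][i] for i in range(len(q)))
    vs = [lateral_fixed_point(q, u, alpha, steps) for u in us]
    return [[alpha * e for e in row] for row in outer_mean(vs, xs)]

# Illustrative weight matrix and zero mean input ensemble (both assumed).
C = [[1.0, 0.5], [-0.3, 0.8]]
XS = [(1, 0), (-1, 0), (0, 2), (0, -2), (1, 1), (-1, -1)]

estimate = anti_redundancy_term(C, XS)

# Direct computation of the inverse of the transpose for comparison.
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
exact = [[C[1][1] / det, -C[1][0] / det],
         [-C[0][1] / det, C[0][0] / det]]
```

Under these assumptions `estimate` and `exact` agree to numerical precision, which illustrates why the term "may also be computed directly" without changing the learning outcome.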

### 3.2. Component Model Interfaces

The interfaces between component models that implement the integrated brain model of information-based exchange summarized in **Algorithm 1** are now listed and described.


gating vector $\hat{G}$, such that elements of the feed forward input vector are $x_{FF_j} \leftarrow \hat{G}_j \cdot \theta_{FF_j}$.


• **Layer 5 to Striatum**: The inputs from the Layer 5 model to an MSN in the Striatum model are drawn from all layer 5 neurons in the cortical area for which the MSN gates inputs at the thalamus, and from those in the areas connected to it in either the feed forward or feedback directions. These Layer 5 inputs may also be directed to motor outputs of the model to a simulated environment (as in the Pyramidal tract). Corticostriatal synapses are subjected to STDP that differentially adjusts weights, $W_{CStr}$, based on correlation between cortical spiking and the derivative of the burst outputs of MSNs, $P'(x_f(t))$. Pre-post pairing is defined as a cortical spike occurring while this derivative is positive, and post-pre pairing as a cortical spike occurring while it is negative. Each kind of pairing is computed separately and subjected to the modulatory conditions at the synapse, as illustrated in **Figure 4**. Briefly, depending on (1) the identity of the MSN (D1- or D2-type), (2) whether dopamine is or is not present at the synapse, and (3) whether the inhibitory synaptic current $z_f(t)$ at the MSN exceeds a threshold ($z_f(t) > 0.00707$), each pairing value may be either 1 or 0, and the adjustment to the weight is the product of this value and a learning rate of 0.002. As in Zheng and Triesch (2014), weights are normalized such that the sum of all inputs to an MSN cannot exceed 0.1. When weights reach zero they are pruned, and a single new connection may then be formed during a time step with probability 0.2.


FIGURE 7 | (A) Long duration time series plots of Dopamine Neuron model variables. When the membrane potential (blue) reaches the variable threshold (red), a spike reset occurs. Firing rates of Dopamine neurons across all simulations were consistently on average ∼1.6 Hz, and varied locally depending upon the ongoing integration of dynamic inputs. (B) An expanded time scale reveals bursts (inset) occurring in response to deep hyperpolarization (single star), or prolonged weaker hyperpolarization (triple star) events.

neuron projection is to a synapse, not a neuron. Specifically, dopamine spiking results in a persistent dopamine modulation of STDP at a specific set of corticostriatal synapses. Dopamine neurons are assigned randomly without replacement to corticostriatal synapses onto each MSN. Following a Dopamine Neuron model spike, dopamine modulation persists at the synapse for a time $\tau_{DA}$.
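The corticostriatal update described in the Layer 5 to Striatum interface can be sketched as follows. This is only an illustration under stated assumptions: the learning rate (0.002) and the cap on summed input (0.1) come from the text, while `pairing_value` stands in for the outcome of the **Figure 4** modulatory conditions (which are not reproduced here), `direction` is an assumed sign convention for the step, and rescaling only when the cap is exceeded is one possible reading of the normalization. Pruning and synaptogenesis are omitted, and all function names are hypothetical.

```python
LEARNING_RATE = 0.002  # learning rate quoted in the text
MAX_INPUT_SUM = 0.1    # cap on summed corticostriatal input to one MSN (from the text)

def classify_pairing(cortical_spike, burst_derivative):
    """Classify one synapse at one time step as 'pre-post', 'post-pre', or None.

    A cortical spike while the MSN burst output is rising counts as pre-post;
    a cortical spike while it is falling counts as post-pre.
    """
    if not cortical_spike or burst_derivative == 0:
        return None
    return "pre-post" if burst_derivative > 0 else "post-pre"

def update_weight(weight, pairing, pairing_value, direction):
    """Apply one fixed-size step to a corticostriatal weight.

    pairing_value (0 or 1) is a placeholder for the Figure 4 conditions
    (MSN type, dopamine presence, inhibitory current threshold).
    direction (+1 or -1) is an ASSUMED sign convention, not taken from the text.
    """
    if pairing is None:
        return weight
    return max(0.0, weight + direction * pairing_value * LEARNING_RATE)

def cap_total_input(weights):
    """Rescale weights only when their sum exceeds MAX_INPUT_SUM (assumed reading)."""
    total = sum(weights)
    if total <= MAX_INPUT_SUM:
        return list(weights)
    return [w * MAX_INPUT_SUM / total for w in weights]
```

Keeping the pairing classification separate from the gated update mirrors the statement that each kind of pairing is computed separately before the modulatory conditions are applied.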

### 3.3. Simulation Materials and Experiments

We simulated the model to explore the rate and heterogeneity of transitions in traversals and in subcortical modulators of these traversals. The configuration (**Table 2**) allowed for rapid prototyping in Matlab because of the simulation's small size. Following initialization of the cortico-cortical Grand Loop network of four areas, we simulated the full model for an additional 500,000 iterations using an Intel Xeon E5-2640 v3 Processor (20 MB Cache, 2.60 GHz), requiring 2.5 hours of compute time. The first 50,000 iterations were used to adjust the biases of the layer 2/3 model, during which time Ach modulation of layer 5 was drawn from the positive half of a zero mean normal distribution with standard deviation of 1. All plots, except where noted, show the final iterations of the 500,000 total. Reported are experiments wherein the parameter τDA was set at either 25 or 100 ms, and Ach at 0.25, 0.5, or 0.75. All plots except **Figure 12** show results for τDA = 100 ms.

### 4. SIMULATION RESULTS

### 4.1. Coordinated Behavior Among Component Models

Behavior of the model may be analyzed first based on inspection of various raster plots from different components of the model.

#### **Algorithm 1:** Information-Based Spike Exchange

```
while Simulation Running do
    for (M1, Msup, Ssec, S1) do
        x_FB(t) ← M_FB · Σ_{T=t−τ_X}^{t} s_FB(T);
        θ_FF(t) ← M_FF · Σ_{T=t−τ_X}^{t} s_FF(T);
        if (M1, Msup) then
            x_FF(t) ← G(θ_FF(t));
        else
            x_FF(t) ← θ_FF(t);
        end
        x̂_I ← x_FB + x_FF;
        y ← σ(C x_I);
        ΔC ← Infomax(x_I, C, y, ...);
        U ← [1 − Ach(1 − y_j)] / [1 − Ach/2];
        s(t) := SORN(s_FB(t − 1), W, U, ...);
        x_f(t) := Winnerless(s(t), W_CStr, P(x_f(t − 1)), W_Str, ...);
        V(t) := IAF(P(x_f(t)), Θ(t − 1), I_1(t − 1), I_2(t − 1), ...);
        ΔW_CStr ← STDP(s(t), P(x_f(t)), P(x_f(t − 1)), W_Str, Σ_{T=t−τ_DA}^{t} V(T), ...);
        if (M1, Msup) then
            Ĝ ← H[D · P(x_f(t))]
        end
    end
end
```
In this way coordination between the different components is apparent. We first observed that cortico-cortical traversals through the feedback layer 5 network occurred without subcortical regulation, and were similar to the synfire events reported by Zheng and Triesch (2014). There are two main regulators of these traversals in our model: (1) an information based gain on layer 5 feedback inputs provided from layer 2/3, and (2) basal ganglia gating of cortico-thalamo-cortical feed forward inputs to layer 2/3 information maximization by the forward driver gate.

Upon introducing these regulators, we noted that traversals became structured into long bouts of smoothly alternating and repeating patterns of activity across the different cortical layers' raster plots. Each pattern persisted for ∼400 ms (**Figure 8A**), and sequences of patterns, while similar over each cycle, were not identical. The Ach parameter provides a means to adjust the influence of categories learned by layer 2/3 on traversals. For this initial experiment, Ach = 0.25 provided a gain U ∈ (0.86, 1.14) for y ∈ (0, 1).

Information maximization creates maximal entropy in the ensemble of output vectors over an input ensemble, and because of the logistic function, activity in each layer 2/3 neuron was typically close to zero or one. We interpret these values as cortical up and down states, which have both an extrinsic and intrinsic origin in the local cortical microcircuit.

Maximizing entropy of the ensemble of gain functions, applied to layer 5 inputs in the feedback traversal network, had the interesting effect of creating more irregularity in the patterns of activity across all of cortex as Ach increased. At Ach = 0.5, U ∈ (0.67, 1.33) (**Figure 8B**), pattern combinations became varied, even though average global firing rates imposed by homeostatic plasticity in the network were consistently maintained (100 spikes/s). Finally, at Ach = 0.75, U ∈ (0.4, 1.6), traversal transition rates increased significantly, and patterns were highly varied (**Figure 8C**).
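The gain ranges quoted for the three Ach settings follow from the expression for U given in **Algorithm 1**. A minimal sketch that reproduces them is shown here; the function name `feedback_gain` is illustrative and not taken from the authors' code.

```python
def feedback_gain(y, ach):
    """Gain on layer 5 feedback integration, U = [1 - Ach(1 - y)] / [1 - Ach/2].

    y is the layer 2/3 unit activity in (0, 1); ach is the cholinergic
    modulation parameter.
    """
    return (1.0 - ach * (1.0 - y)) / (1.0 - ach / 2.0)

# Endpoints of the gain range for each Ach value reported in the text.
for ach in (0.25, 0.5, 0.75):
    low = round(feedback_gain(0.0, ach), 2)
    high = round(feedback_gain(1.0, ach), 2)
    print(ach, low, high)
```

Note that the expression gives U = 1 at y = 0.5 for any Ach, so raising Ach widens the gain range symmetrically around unity rather than shifting it.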

Inspecting the information-bearing up and down states in layer 2/3 directly in state rasters from all four cortical areas also reveals coordination between areas and with transitions in traversals. In **Figure 9A**, under Ach = 0.25, the rate of state changes among layer 2/3 units appeared coordinated, especially in the secondary sensory area. This coordination is less regularly transitioned than in the traversals, and occurs at a higher rate. At higher Ach = 0.5 (**Figure 9B**) up and down state coordination with traversals increases, while coordination across layer 2/3 is weakened. At Ach = 0.75 (**Figure 9C**) states become synchronized in the secondary sensory area and more coordinated with traversals overall, even though traversals themselves become more heterogeneous. Note that the heterogeneity in traversals due to increased control by the information maximizing network is not due to a lack of convergence in the weights of the networks. Weights among both the layer 2/3 Infomax input weights C and layer 5 feedback weights W converged during these simulations.

MSN bursts generated by the model were ongoing, as in the winnerless network and the model of Ponzi and Wickens (2010). These bursts appeared in fast sequences, which were of longer duration in D1-type MSNs than D2-type (**Figure 10**). Variability in burst rate between MSNs was also observed, with some not firing at all, likely because of inhibition from the active network. Increasing Ach had only a small effect on the raster appearance, and so we began our quantitative analysis by examining coordination between the Striatum model and the Layer 5 model.

### 4.2. Measurements of Information-Based Exchange

To quantify coordination between striatum bursting and cortical layer 5 spiking, we computed pairwise linear correlation coefficients between each cortical spike train and striatal burst train. We plotted each using a color scale (red, more correlated; blue, less correlated) in a matrix showing how different areas of cortex fired in relation to D1- and D2-type MSN bursts (**Figure 11**, left column). Only significant correlations were plotted, and all others were represented by zero. We also show that the mean of each distribution of correlation values (**Figure 11**, right column) for both D1- (blue) and D2-type (red) MSNs differed. Most coefficient distributions of D1 vs. D2 burst correlation with cortical spiking were significantly different (p < 0.05), based on pairwise Student's t-tests. More striking is the difference in sign for each mean coefficient of correlation to each cortical area as Ach increases. Positive correlation coefficients dominated at low Ach and negative at high. At the intermediate level, M1 in particular showed a divergence in sign between mean correlation coefficients for D1-type (positive) and D2-type (negative) MSNs.
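The pairwise linear correlation underlying this analysis can be sketched in a few lines. The sketch below assumes each train is a list of equal length binned values; the significance test and the zeroing of non-significant entries are not reproduced, and returning 0.0 for a constant (e.g., silent) train is an assumption made here for convenience.

```python
import math

def pearson(a, b):
    """Pearson linear correlation coefficient between two equal-length trains."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumed convention for trains with no variance
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(cortical_trains, burst_trains):
    """Rows index cortical spike trains, columns index MSN burst trains."""
    return [[pearson(c, m) for m in burst_trains] for c in cortical_trains]
```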

Finally, to quantify information-based exchange directly, we measured the entropy of cortical spiking and dopamine neuron spiking, and the mutual information between cortical and dopamine neuron spiking (Strong et al., 1998). Instead of measuring entropy and information among spike trains of individual neurons however, which quantifies the distribution of patterns of spikes over time, we measured entropy and information in population spiking, which quantifies the distribution of patterns of spikes over the population for single time steps. The method was aimed at asking if traversals themselves show entropy maximization based on increased modulation from layer 2/3. Synfire events are encoded by the sets of units that participate at every stage of the chain or ring propagation. Therefore, if the entropy of synfire population spiking increases, it can be concluded that the synfire chain entropy itself has increased.
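The population-level measurement can be illustrated with a simple plug-in estimator. This sketch assumes a raster stored as one tuple of 0/1 values per time step, treats each tuple as a population "word," and estimates entropy from word frequencies; it does not include the sampling-bias corrections of Strong et al. (1998), and the function names are illustrative.

```python
import math
from collections import Counter

def population_words(raster):
    """Convert raster[t][i] (0/1 for unit i at time step t) into one word per step."""
    return [tuple(step) for step in raster]

def entropy_bits(words):
    """Plug-in entropy (in bits) of the empirical word distribution."""
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information_bits(words_a, words_b):
    """I(A; B) = H(A) + H(B) - H(A, B), with simultaneous words paired by time step."""
    joint = list(zip(words_a, words_b))
    return entropy_bits(words_a) + entropy_bits(words_b) - entropy_bits(joint)
```

For example, two populations whose words alternate in lockstep share one bit per time step, whereas populations whose words vary independently share none.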

We found that entropy in cortical layer 5 population spiking increased as Ach increased (**Figure 12A**). We also show that as the window of dopamine integration τDA increased, the entropy of layer 5 population spiking increased slightly as well. Surprisingly, the entropy of dopamine neuron population spiking (**Figure 12B**) remained constant while both parameters in the model were altered. Finally, to measure how increasing traversal entropy depends on dopamine population spiking, we measured the mutual information between these two populations, and found it to decrease as Ach increased (**Figure 12C**).

**FIGURE 12 |** … squares 100 ms. In all conditions H increases with Ach and τDA. (B) Same as (A) but showing entropy of dopamine neuron population spiking. (C) With increasing cortical population spiking entropy (A), the mutual information between cortical and dopamine neuron population spiking decreased.

## 5. DISCUSSION

We discuss the brain model of information-based exchange in three contexts: brain evolution and development, brain resting state networks, and new approaches to the study of brain disorders such as neurodegenerative diseases.

### 5.1. Brain Evolution and Development

We propose that the Grand Loop, spanning sensory, limbic, and motor cortices, and specifically traversing in our model somatosensory cortices, is prototypical and embryonic in origin, since other modalities develop fully only after birth and do not share a granular-agranular tiling boundary in von Economo's map. The topological relationship between other modalities and this backbone may then provide alternative pathways for completing a full traversal and rapidly binding percepts, needs, and behaviors. Finally, the tight coupling between somatosensory inputs and limbic states (i.e., tissue damage, pain) and motor states (i.e., sensorimotor feedback, proprioception) argues that this loop is likely preeminent in brain evolution, organization, and development.

This model additionally provides insights into those organisms lacking cortices, wherein the stages of the proposed traversals may not be segregated anatomically (e.g., into Brodmann areas), but instead may be nucleated (e.g., in the birdsong system), or even superimposed within the same pallial regions (e.g., in fish and amphibians). Synfire ring development is robust given the synaptic modifications proposed by Zheng and Triesch (2014). It furthermore does not require anatomical segregation between layers to emerge or for synfire activity to propagate (e.g., for **Figure 5**, we sorted each matrix after areas developed in order to illustrate them clearly and connect subcortical structures to each).

Synfire rings may represent a prototypical substrate for behavior generation (**Figure 13**), and through subpallial regulatory inputs from thalamus and basal ganglia as described herein, for behavior selection. In such a scenario, the evolution of a multilaminate neocortex to support such rings may have solved the problem of entropy maximization over the ensemble of synfire events in very large networks. Since the neural network implementation of Infomax requires a dense lateral network, optimizing each stage of a synfire ring, and traversals in general, would necessitate both the segregation of stages and a registered information-maximizing network (**Figure 13**). This solution to the problem would then support rapid expansion of the synfire ring substrate by evolution, given that redundancy in large networks could suddenly be managed and eliminated by information maximization.

### 5.2. Resting State Networks

The challenge of modeling resting state activity in the brain has presented itself based on observations that distinct networks spanning multiple cortical areas appear in imaging studies to serve either active or inactive states of the organism (Fox et al., 2005). Inactivity-correlated networks appear even under anesthesia (Vincent et al., 2007), and these areas have very high metabolic rates, tipping the brain's energy budget toward a large investment in the organism's doing nothing.

What this costly outlay accomplishes may be explained by our model's use of closed-loop activity in the information-based exchange network to increase entropy over the ensemble of traversals. In an evolutionary context, this activity may be viewed as preadapting the brain to selecting novel behaviors in novel contexts by maximizing such a quantity first, before engaging with the environment, then using the preadapted diverse traversals to explore it and seek reward.

While others have noted that resting state dynamics may represent a "constant state of inner exploration" (Deco et al., 2011), our model is the first to assign a quantitative measure to the fruits of this brain activity, providing a new way to reason about the trade-off between evolutionary pressures toward latent adaptive behaviors and the large metabolic cost of resting state network activity.

### 5.3. Dynamic Disease Risk

We hypothesize that basic controls are required to establish "cognitive homeostasis," i.e., a process by which variables that change brain dynamics are carefully regulated so that properties of brain state transitions (and thus brain information processing and behavioral dynamics) remain relatively stable under constant neuromodulatory conditions. We refer to these stable properties as "set points," i.e., targeted norms for critical system variables supporting normal behavior, percepts, affect, and cognition. In our model, these controls are based on a consistent set of parameters that yield consistent spiking and bursting patterns, even when the network undergoes reorganization (e.g., when Ach was modified, the system adjusted and produced stable traversals). Stable ranges of firing among burst rates and traversals, coefficients of correlated firing and bursts, and entropy and mutual information among population spiking and bursting have been our initial targets for describing these system set points using the brain model.
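The notion of a set point as a targeted range for a system variable can be made concrete with a small Python sketch. It assumes that each monitored statistic (for example a burst rate, a population entropy, or a mutual information value) has a stable range, and it reports which statistics fall outside that range after a perturbation. All variable names and numerical ranges below are hypothetical placeholders chosen for illustration; they are not values from the model.

```python
def set_point_deviations(measured, set_points):
    """Return the measured statistics that fall outside their target range.

    measured:   dict mapping statistic name -> measured value
    set_points: dict mapping statistic name -> (low, high) stable range
    """
    deviations = {}
    for name, (low, high) in set_points.items():
        value = measured[name]
        if not (low <= value <= high):
            deviations[name] = value
    return deviations


# Hypothetical stable ranges (placeholders, not results from the model).
set_points = {
    "burst_rate_hz": (2.0, 6.0),
    "population_entropy_bits": (3.0, 5.0),
    "mutual_information_bits": (0.1, 1.0),
}

# Hypothetical measurements before and after a simulated perturbation.
baseline = {
    "burst_rate_hz": 4.0,
    "population_entropy_bits": 4.2,
    "mutual_information_bits": 0.5,
}
perturbed = {
    "burst_rate_hz": 4.5,
    "population_entropy_bits": 2.1,
    "mutual_information_bits": 0.6,
}

baseline_out = set_point_deviations(baseline, set_points)
perturbed_out = set_point_deviations(perturbed, set_points)
```

Here the baseline run stays within every range, while the perturbed run is flagged only for its population entropy. A check of this form is one simple way to ask whether a perturbation leaves the system's set points intact.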

In real brains, given evolutionary pressures for robust self-regulation and behavior, the system is certainly replete with controls aimed at maintaining these set points. The challenge of studying brain disorders such as neurodegenerative disease is sorting primary and secondary risks from the multitude of compensatory mechanisms, each of which manifests itself as a deviation from normal brain and neuronal function given some primary genetic or injury risk. Researchers have shown, for example, that mutant Huntingtin protein disturbs NMDA receptor localization, densities, and currents at the corticostriatal synapse in mouse models of the disease (Cepeda et al., 2001). Knowing how this change arises and perturbs circuit dynamics, plasticity, and system set points may provide a better understanding of why certain neurons succumb and others don't when subjected to the same mutant protein.

We propose that perturbations in our model may result in stable dynamics, but with measurable risks related to stressors on normal neuronal function. If these deviations are extreme in our model, and therefore difficult to compensate for in biological tissue, a cascade of neuronal dysfunction may result. Neurodegenerative diseases such as Huntington's, Parkinson's, and Alzheimer's may then be understood as cascading failures given initial stressors derived from plasticity abnormalities at the corticostriatal synapse, within the striato-nigro-striatal loop, and over the process of entropy maximization in layer 2/3, respectively. For example, subtle changes to STDP or homeostatic plasticity may result in increased synaptic competition or cycling in the space of possible weights, which is then difficult to compensate for locally, given that traversals entail global brain states. If these risks increase when stressed neurons are removed from a simulation, the model may then be used to predict disease progression.

Implementation of the current brain model of information-based exchange forms a framework for the analysis of cognitive homeostasis and disease using IBM's scalable approach to structural and neurophysiological modeling of neocortex and brain nuclei (Kozloski and Wagner, 2011). Here we extend this approach and that of many brain modeling projects, which seem focused on validating complex local circuit and tissue models at the expense of validating tissue inputs. Minimal complexity brain models, in our case an information-based exchange network, may be necessary to capture brain dynamics and provide validatable inputs to complex tissue models. The current model has now been reimplemented in the same model graph simulation infrastructure in which IBM's Neural Tissue Simulator was implemented, and thus will allow direct coupling between these in a single scalable, extensible program.

With this new approach, inputs and models of the various components may be simulated and compared to in vivo experimental observations. Furthermore, simulations over very long time scales can be used to stress the model and its set points in physiologically and clinically realistic ways. Additional perturbations to the model may include physiological stimulation, such as simulated deep brain stimulation (DBS) in simulated neural tissue, simulated drugs with known targets in the detailed model, and different simulated disease states with hypothesized mechanisms at the level of gene, protein, regulatory network, etc. Stimulation, drug effects, and disease mechanisms can then be targeted to test certain hypotheses about modifications to dynamic disease risk, and to study the wider system's behavior. Increasing complexity of perturbation sets (targets and combinations) may be designed to validate the model under different therapeutic conditions, and to test for phenotypic outcomes (e.g., symptomatology). Therefore, elaboration of these simulations within each modeled neural tissue might allow for in silico study of therapeutic interventions in living brain tissue.

In the above discussion, a model of several brain circuit components and their global set points has been proposed as a means to test disease mechanisms and therapeutic inputs such as DBS and drugs. The implicit assumption of these tests is that risks can be inferred from outlier variables that maintain system set points, and that these outliers may then be implicated as causes of phenotypic symptoms such as abnormal behavior at the organismal, circuit, neuronal, or synapse level. Targeting these variables in real world systems is one approach we propose for novel therapeutic design and discovery using brain modeling combined with neural tissue simulation.

### FUNDING

A portion of this work was funded by CHDI.

### ACKNOWLEDGMENTS

We acknowledge many years of helpful discussions with Charles C. Peck and Guillermo Cecchi. We also acknowledge discussions with Ralph Linsker and Roger Traub on component model design, and Robert Rogers and Robert Kerr on the integrated model design. We acknowledge helpful comments on the manuscript from Erik Schomburg, Tuan Hoang Trong, and Pengsheng Zheng. Finally, we acknowledge the artistic contributions of Stella Kozloski, who produced the drawings based on von Economo's work.

### REFERENCES


spike timing-dependent plasticity. J. Physiol. 588, 3045–3062. doi: 10.1113/jphysiol.2010.188466


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kozloski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.