# **CLOSING THE LOOP AROUND NEURAL SYSTEMS**

**Topic Editors Steve M. Potter, Ahmed El Hady and Eberhard E. Fetz**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

**ISSN** 1664-8714 **ISBN** 978-2-88919-356-1 **DOI** 10.3389/978-2-88919-356-1

Topic Editors:

**Steve M. Potter,** Georgia Institute of Technology, USA **Ahmed El Hady,** Max Planck Institute for Dynamics and Self Organization, Germany **Eberhard E. Fetz,** University of Washington, USA

Closing the loop around neural systems: most commonly used words. This word cloud was generated using all the keywords, titles, authors' affiliations, and abstracts from this Research Topic. The font size is proportional to the number of times the word was used.

Closed-loop neurophysiology, in which stimuli are delivered in real time based on recordings or behavior, has been accelerated by recent software and hardware developments and by the emergence of novel tools to control neuronal activity with spatial and temporal precision. Real-time stimulation feedback enables a wide range of innovative studies of information processing and plasticity in neuronal networks. This Research Topic e-Book comprises 16 Original Research Articles, seven Methods Articles, and seven Reviews, Mini-Reviews, and Perspectives, all peer-reviewed and published in Frontiers in Neural Circuits. The contributions deal with closed-loop neurophysiology experiments at a variety of levels of neural circuit complexity. Some include modeling and theoretical analyses. New enabling technologies and techniques are described. Novel work is presented from experiments in vitro, in vivo, and in humans, along with their clinical and technological implications for improving the human condition.

# Table of Contents



Rishi R. Dhingra, Yenan Zhu, Frank J. Jacono, David M. Katz, Roberto F. Galán and Thomas E. Dick

Martin Egelhaaf, Norbert Boeddeker, Roland Kern, Rafael Kurtz and Jens P. Lindemann

*Closed-Loop Response Properties of a Visual Interneuron Involved in Fly Optomotor Control*

Naveed Ejaz, Holger G. Krapp and Reiko J. Tanaka

*The Iso-Response Method: Measuring Neuronal Stimulus Integration with Closed-Loop Experiments*

Tim Gollisch and Andreas V. M. Herz

*Statistics of Neuronal Identification with Open- and Closed-Loop Measures of Intrinsic Excitability*

Ted Brookings, Rachel Grashow and Eve Marder

A. Hanuschkin, S. Ganguli and R. H. R. Hahnloser

*Real-Time System for Studies of the Effects of Acoustic Feedback on Animal Vocalizations*

Mike Skocik and Alexay Kozhevnikov

*Learning and Exploration in Action-Perception Loops*

Daniel Y. Little and Friedrich T. Sommer

Pedram Afshar, Ankit Khambhati, Scott Stanslaski, David Carlson, Randy Jensen, Dave Linde, Siddharth Dani, Maciej Lazarewicz, Peng Cong, Jon Giftakis, Paul Stypulkowski and Tim Denison

*Dynamic Control of Modeled Tonic-Clonic Seizure States with Closed-Loop Stimulation*

Bryce Beverlin II and Theoden I. Netoff

Armin Walter, Ander Ramos Murguialday, Martin Spüler, Georgios Naros, Maria Teresa Leão, Alireza Gharabaghi, Wolfgang Rosenstiel, Niels Birbaumer and Martin Bogdan

# **NEURAL CIRCUITS**

**EDITORIAL** published: 23 September 2014 doi: 10.3389/fncir.2014.00115

## Closed-loop neuroscience and neuroengineering

#### *Steve M. Potter<sup>1</sup>\*, Ahmed El Hady<sup>2</sup> and Eberhard E. Fetz<sup>3</sup>*

*<sup>1</sup> Laboratory for Neuroengineering, Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, USA*

*<sup>2</sup> Department of Non Linear Dynamics, Max Planck Institute for Dynamics and Self Organization, Goettingen, Germany*

*<sup>3</sup> Departments of Physiology and Biophysics and Bioengineering, Washington National Primate Research Center, University of Washington, Seattle, WA, USA*

*\*Correspondence: steve.potter@bme.gatech.edu*

#### *Edited and reviewed by:*

*Florian Engert, Harvard University, USA*

**Keywords: feedback, real-time, BCI, microelectrode array, CMOS, DBS, stimulation**

Feedback and closed-loop circuits exist in just about every part of the nervous system. It is curious, therefore, that for decades neuroscientists have been probing the nervous system in an open-loop manner to understand it. Instead of the linear, reductionistic "stimulate **→** record response" approach, a more modern approach is taking hold: closed-loop neuroscience. It respects the inherent "loopiness" of neural circuits, and the fact that the nervous system is embodied, and embedded in an environment. Through active sensing, behaving animals can influence their environment in ways that alter subsequent sensory inputs. Therefore, loops abound not only in the nervous system itself, but through its dynamic interactions with the world. By interposing our own technology in some of these loops, we can achieve unprecedented control over the system being studied and explore the functional consequences. This Research Topic, "Closing the Loop Around Neural Systems," presents a diverse set of recent methodological, scientific and theoretical advances from neuroscientists and neuroengineers who are pioneering closed-loop neuroscience.

As shown here, cutting-edge researchers are taking advantage of real-time or "on-line" processing of large streams of neural data. This has become feasible thanks to advances in computer processing power, in electronics such as microprocessors and field-programmable gate arrays (FPGAs), and in specialized and open-source software. These advances have enabled a wide variety of new neuroscience approaches to understanding, modulating, and interfacing with the nervous system—approaches in which the variables being monitored can influence the experiment in progress, just as active sensing can influence an animal's next input.

Our call for submissions to this Frontiers in Neural Circuits Research Topic yielded an overwhelming response, indicating that closing the loop around neural systems is an exciting and rapidly expanding field. Perhaps this is because of the diversity of ways in which "closed loops" can be interpreted and implemented. This Research Topic presents seven Methods articles, 16 Original Research articles, and seven Reviews, Mini-Reviews, and Perspectives, for a total of 30 accepted papers published in Frontiers in Neural Circuits between April 2012 and October 2013. A map showing the locations of all the contributors<sup>1</sup> reveals that most are in the USA and Europe, although researchers in Russia, Japan, and Israel are also represented.

Several articles describe or review new technologies that increase the options for closed-loop neuroscience. Two papers by Bareket-Keren and Hanein (2013) and Robinson et al. (2013) review the latest in carbon nanotube and nanowire multielectrode arrays (MEAs) for neural interfacing. Franke et al. (2012) review high-density MEAs with many electrodes and real-time spike sorting. Müller et al. (2013) present sophisticated hardware and software for very fast (sub-millisecond) closed-loop recording and stimulation of cultured networks using their CMOS array with 11,011 electrodes. Newman et al. (2013) created an application programming interface (API) for their open-source NeuroRighter electrophysiology system that greatly enhances its ability to carry out closed-loop experiments in which recorded signals trigger electrical stimulation or other hardware. Five examples of closed-loop experiments *in vitro* and *in vivo* are described.

A number of articles present advances using acute or cultured networks *in vitro*. Bonifazi et al. (2013) present EU Brain Bow project efforts in progress, to create and study bi-directional neural interfaces. Their work includes both patterned dissociated cultures and their responses to laser ablation, and a whole-brain *in vitro* preparation and its response to focal ischemia. The goal is to develop the closed-loop prostheses of the future. Tessadori et al. (2012) present their Hybrain2 software for real-time control of hybrid neural-robotic systems, consisting in this case of a virtual wheeled robot interfaced to a living hippocampal network on an MEA. Brewer et al. (2013) reconstructed a hippocampal trisynaptic loop *in vitro* on an MEA with small tunnels for neurites to grow through. Pimashkin et al. (2013) used an adaptively enhanced learning protocol to study learning in dissociated hippocampal networks on MEAs.

<sup>1</sup>See: https://mapsengine.google.com/map/edit?mid=zDBeK\_5W8FVs.knb4\_z5h9NpQ

Others studied the nervous systems of intact or semi-intact animals with closed-loop approaches. Nishimura et al. (2013) restored arm movements in a spinal cord-injured non-human primate (NHP) with an artificial cortico-spinal connection and an artificial musculo-spinal connection. This system allows volitional control and boosting of weak, residual muscle activity. Opris et al. (2012) enhanced performance on a delayed match-to-sample task in NHPs using cortical microstimulation contingent on recordings that predict incorrect responses. Dhingra et al. (2013) studied the role of the vagal mechanosensory feedback loop for respiration in a perfused *in situ* brainstem preparation of a mouse model for Rett syndrome. To help map the brain's feedback loops, Beier et al. (2013) demonstrate in the mouse a new transsynaptic retrograde tracer, based on the vesicular stomatitis virus with a rabies virus coat. Egelhaaf et al. (2012) provide a comprehensive review of work on insect vision, emphasizing the importance of active sensing for interpreting optic flow to optimize flying. Ejaz et al. (2013) present closed-loop experiments to study fly visual circuits in which recorded neural responses control a fast turntable on which the fly is mounted. Gollisch and Herz (2012) review how the locust auditory system, salamander retina, and the monkey visual cortex have been used to efficiently explore a large parameter space of iso-response curves, via on-line analysis of incoming data to generate the next stimuli.
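The iso-response idea mentioned above can be illustrated with a toy closed-loop search. The sketch below is only an illustration under stated assumptions, not the procedure used in the reviewed studies: the model neuron `model_response`, its weights, and the bisection search are hypothetical stand-ins for a real recording and its on-line analysis, and the search assumes that the response grows monotonically with stimulus intensity along each direction. For every stimulus direction, the loop "measures" a response, compares it with a target, and uses the outcome to choose the next stimulus.

```python
import math

def model_response(s1, s2):
    """Hypothetical neuron: sums two stimulus components linearly, then saturates."""
    return math.tanh(1.0 * s1 + 2.0 * s2)

def iso_response_point(angle, target, measure=model_response,
                       rho_max=10.0, iterations=40):
    """Bisect along one stimulus direction until the response matches the target."""
    lo, hi = 0.0, rho_max
    for _ in range(iterations):
        rho = 0.5 * (lo + hi)
        # "Measure" the response to the current stimulus (simulated here).
        r = measure(rho * math.cos(angle), rho * math.sin(angle))
        # Feedback step: the measured response decides the next stimulus.
        if r < target:
            lo = rho
        else:
            hi = rho
    rho = 0.5 * (lo + hi)
    return rho * math.cos(angle), rho * math.sin(angle)

def iso_response_curve(target, n_directions=5):
    """Collect one iso-response point per direction in the first quadrant."""
    angles = [i * (math.pi / 2) / (n_directions - 1) for i in range(n_directions)]
    return [iso_response_point(a, target) for a in angles]

if __name__ == "__main__":
    for s1, s2 in iso_response_curve(target=math.tanh(1.0)):
        print(f"s1={s1:.3f}  s2={s2:.3f}")
```

Because the toy neuron integrates its inputs linearly before saturating, the recovered points fall on a straight line in stimulus space; a different integration rule would bend the curve, which is the kind of signature such measurements are meant to expose.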

The "Model-in-the-loop" paradigm is a powerful approach to understanding complex neural network dynamics. Brookings et al. (2012) interfaced an excised crab stomatogastric ganglion (STG) to a dynamic clamp model neuron to help determine the relative contributions of intrinsic and network properties of STG neurons to network function. Hsiao et al. (2013) interfaced a dentate gyrus-CA1 model to an acute hippocampal slice preparation on an MEA, with the goal of developing cognitive prostheses that could someday replace damaged brain regions.

Theoretical advances are described in several modeling and simulation papers. Witt et al. (2013) modeled the ability of closed-loop optogenetic stimulation to control communication between neural populations by altering their phase relationships. DiMattina and Zhang (2013) reviewed the use of feedback to optimize stimuli continuously during an experiment, for real-time model estimation. Hanuschkin et al. (2013) modeled the sensory-motor loop by which birds learn to produce stereotyped songs. Skocik and Kozhevnikov (2013) demonstrate a system for real-time audio feedback to study birdsong learning. Little and Sommer (2013) optimized exploration strategies in embodied agents based on information-theoretic analysis. Manoonpong et al. (2013) demonstrate the value of adaptive forward models in developing a legged robot locomotion controller. Molkov et al. (2013) modeled the roles of local (brainstem) and distal (lungs) feedback in mammalian respiratory circuits. Wallach (2013) reviews the concept and implementation of a response clamp, in which closed-loop control of a selected neural response variable is used to uncover network properties in cultured networks.
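The response-clamp concept can be made concrete with a small simulation. The following is a minimal sketch under loudly stated assumptions, not the implementation reviewed by Wallach: the sigmoidal `response_probability` curve, the drifting threshold, and the simple integral feedback rule are all hypothetical choices made for illustration. The controller adjusts stimulus amplitude after every trial so that the response probability stays near a target value.

```python
import math

def response_probability(amplitude, threshold, slope=1.0):
    """Hypothetical sigmoidal stimulus-response curve of a simulated unit."""
    return 1.0 / (1.0 + math.exp(-(amplitude - threshold) / slope))

def run_response_clamp(target=0.5, steps=200, gain=0.5, drift=0.005):
    """Integral feedback on stimulus amplitude to hold the response at target."""
    amplitude = 0.0
    trace = []
    for t in range(steps):
        threshold = 2.0 + drift * t          # slowly drifting excitability (assumed)
        p = response_probability(amplitude, threshold)
        trace.append((amplitude, threshold, p))
        amplitude += gain * (target - p)     # feedback: push response toward target
    return trace

if __name__ == "__main__":
    for amplitude, threshold, p in run_response_clamp()[::40]:
        print(f"amplitude={amplitude:.2f}  threshold={threshold:.2f}  p={p:.2f}")
```

Once the response is clamped, the amplitude the controller has to deliver follows the hidden threshold, so the control signal itself becomes a readout of the slowly changing excitability.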

On the clinical side, Afshar et al. (2013) describe and test a new platform for closed-loop deep brain stimulation (DBS). This is the beginning of "smart neuromodulators" that tune themselves to provide optimal benefit to those suffering from, for example, epilepsy or Parkinson's disease. Beverlin and Netoff (2013) present theoretical analysis of a model neural network, aimed at closed-loop seizure control with just such a smart DBS device. Fernandez-Vargas et al. (2013) explored closed-loop optimization of a flickering light display as part of a visually-evoked potential (VEP) brain-computer interface that could be used by locked-in patients to communicate. Walter et al. (2012, 2013) explored transcranial cortical magnetic stimulation (TMS) in a motor task in 3 paralyzed stroke patients wearing a mechatronic hand orthosis. TMS was triggered by recorded brain states that were processed in real time for spectral estimation and to deal with stimulation artifacts.

The diversity of methods, experiments, tools, and analyses in this Research Topic suggests that many more areas of neuroscience research would benefit from adopting a closed-loop perspective.

## **ACKNOWLEDGMENTS**

Many thanks to all the authors and to the many reviewers who helped make this an outstanding set of articles!

## **REFERENCES**


integration of brains and machines. *Front. Neural Circuits* 6:99. doi: 10.3389/fncir.2012.00099


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 December 2013; accepted: 01 September 2014; published online: 23 September 2014.*

*Citation: Potter SM, El Hady A and Fetz EE (2014) Closed-loop neuroscience and neuroengineering. Front. Neural Circuits 8:115. doi: 10.3389/fncir.2014.00115*

*This article was submitted to the journal Frontiers in Neural Circuits.*

*Copyright © 2014 Potter, El Hady and Fetz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Carbon nanotube-based multi electrode arrays for neuronal interfacing: progress and prospects

#### *Lilach Bareket-Keren<sup>1,2</sup> and Yael Hanein<sup>1,2</sup>\**

*<sup>1</sup> School of Electrical Engineering, Tel-Aviv University, Tel-Aviv, Israel*

*<sup>2</sup> Tel-Aviv University Center for Nanoscience and Nanotechnology, Tel-Aviv University, Tel-Aviv, Israel*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Graham W. Knott, University of Lausanne, Switzerland David J. Margolis, University of Zurich, Switzerland*

#### *\*Correspondence:*

*Yael Hanein, School of Electrical Engineering, Tel-Aviv University, 69978 Tel-Aviv, Israel. e-mail: yaelha@tauex.tau.ac.il*

Carbon nanotube (CNT) coatings have been demonstrated over the past several years as a promising material for neuronal interfacing applications. In particular, in the realm of neuronal implants, CNTs have major advantages owing to their unique mechanical and electrical properties. Here we review recent investigations utilizing CNTs in neuro-interfacing applications. Cell adhesion, neuronal engineering and multi electrode recordings with CNTs are described. We also highlight prospective advances in this field, in particular, progress toward flexible, bio-compatible CNT-based technology.

**Keywords: carbon nanotubes, multi electrode array, neuronal recording, stimulation**

## **INTRODUCTION**

Extensive investigations over the past 50 years revealed the great potential of implanted electrodes for recording and stimulating neuronal signals. Such devices are currently being employed for the treatment of a wide range of conditions such as deafness, Parkinson's disease and chronic pain, to name just a few (Schwartz, 2004; Clark, 2006; Wichmann and DeLong, 2006; McCreery, 2008; Plow et al., 2012). Recent studies also suggested the use of neuro-stimulation in a growing number of additional disabling conditions, such as schizophrenia and Alzheimer's disease (George et al., 2007; Laxton et al., 2010). As high resolution, multi-site recording and stimulation devices are very attractive for neural recording and stimulation applications, the concept of the multi electrode array (MEA) has gained increased attention in this field. An MEA device consists of an array of electrically conducting microelectrodes (typically 20–200 µm in diameter), connected to external circuitry to allow recording or stimulation of neural electrical activity. Extensive effort has indeed demonstrated the potential of MEAs as an effective tool in various neurological applications. In particular, micro-fabrication technologies were employed to form finely shaped metallic [e.g., gold, platinum, and titanium nitride (TiN)] electrodes. The realization of such electrodes is readily achieved using a toolbox borrowed from the micro electro mechanical system (MEMS) field. This toolbox includes fabrication processes as well as materials with improved performance.

The scope of the current review is to explore, within the framework of micro fabricated neuro-electrodes, the employment of carbon nanotubes (CNTs) as a novel material with unique properties for neuro-applications. To this end, the CNT properties will be reviewed as well as their processing and fabrication into devices. The general field of micro fabricated neuro-electrodes will be introduced only briefly, as a full treatment is beyond the scope of this review. We refer the reader for further reading on micro fabricated neuro-electrodes to HajjHassan et al. (2008), on the biocompatibility of CNTs to Warheit et al. (2004), Bottini et al. (2006), Carrero-Sanchez et al. (2006), and Firme and Bandaru (2010), and on the use of CNTs in biology to Bekyarova et al. (2005), Tarakanov et al. (2010), and Bottini et al. (2011).

We begin by reviewing the fundamental chemical, physical and electrical properties of CNTs (Thostenson et al., 2001; Harris, 2009; Lan et al., 2011). We then examine studies in which the neuron-CNT interface was explored. Next, the use of CNTs for neuronal patterning is discussed followed by a review of the electrical interfacing between CNTs and neurons and the study of CNT MEAs for neuronal applications. Finally, we discuss the progress toward flexible, bio-compatible CNT technology.

#### **BEYOND CONVENTIONAL MICRO-FABRICATION**

Despite rapid recent development, contemporary MEAs for neuronal applications are still typified by a relatively low signal-to-noise ratio (SNR), low spatial resolution (leading to poor site specificity) and limited biocompatibility. Clearly, further development is needed to make better electrodes suited for seamless integration between electronic devices and neuronal systems. The limited performance of these MEA systems stems primarily from the challenging interface between the biological systems and the artificial, electronic systems. The design of an interface between a living tissue and an electronic device must consider the dramatic structural and chemical differences between these two systems: living tissues are soft, whereas electronic devices are usually rigid. Tissue conducts charges by ionic transport, whereas electronic devices conduct electrons and holes. Therefore, neural electrodes should accommodate differences in mechanical properties, bioactivity, and mechanisms of charge transport. A proper electrode-neuron interface is critical in ensuring both the viability of the cells and the effectiveness of the electrical interface.

A fundamental limiting feature of many contemporary MEAs is large electrode dimensions. Smaller electrodes would allow better spatial resolution and specific cell recording or stimulation. Also, reduction in electrode size (and therefore the dimensions of the entire device) is related to decreased tissue injury and immune response (Szarowski et al., 2003; Biran et al., 2005; Polikov et al., 2005; McConnell et al., 2009). While manufacturing small electrodes is technologically possible, the reduction in electrode size needed for improving both stimulation and recording is challenging. Small electrodes fail to provide sufficient charge injection owing to their high interface impedance. Low reversible charge storage capacity (CSC) means that the electrode cannot inject enough current into the tissue at a small enough overpotential to avoid irreversible electrochemical reactions (i.e., electrolysis) and the ensuing damage to the electrode and the tissue (Cogan, 2008). Thus, to reduce electrode size without sacrificing the electrode's ability to transfer charge, electrodes with high specific area are desired. High impedance also contributes to increased overall noise levels in recorded signals, thus reducing the recording sensitivity. An additional concern is the polarity of the electrode. For better biocompatibility, polar electrodes are desired (Merrill et al., 2005). These issues are further discussed later in the text.
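The trade-off between electrode size, charge injection and noise can be sketched numerically. The snippet below is a rough illustration under stated assumptions, not a model taken from the cited work: it treats the electrode as a flat disc, multiplies an assumed charge-injection limit by the effective area, and estimates thermal (Johnson-Nyquist) noise by treating the electrode as a purely resistive source. The function names and the numerical values in the usage section (a 0.1 mC/cm² limit, a tenfold area gain, a 1 MΩ resistance) are hypothetical placeholders.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K

def geometric_area_cm2(diameter_um):
    """Area of a flat disc electrode, converted from micrometers to cm^2."""
    radius_cm = (diameter_um / 2.0) * 1e-4
    return math.pi * radius_cm ** 2

def max_charge_per_phase_nc(diameter_um, limit_mc_per_cm2, area_gain=1.0):
    """Charge per pulse phase (nC) = injection limit times effective area.

    area_gain models a rough or porous coating whose effective area exceeds
    the geometric area (an illustrative assumption).
    """
    area = geometric_area_cm2(diameter_um) * area_gain
    return limit_mc_per_cm2 * area * 1e6  # 1 mC = 1e6 nC

def thermal_noise_uv(resistance_ohm, bandwidth_hz, temperature_k=310.0):
    """RMS Johnson-Nyquist noise voltage in microvolts."""
    v_rms = math.sqrt(4.0 * BOLTZMANN * temperature_k * resistance_ohm * bandwidth_hz)
    return v_rms * 1e6

if __name__ == "__main__":
    for d in (200, 50, 20):
        smooth = max_charge_per_phase_nc(d, 0.1)
        rough = max_charge_per_phase_nc(d, 0.1, area_gain=10.0)
        print(f"{d:>3} um: smooth {smooth:.2f} nC, rough {rough:.2f} nC")
    print(f"noise at 1 MOhm, 10 kHz: {thermal_noise_uv(1e6, 1e4):.1f} uV rms")
```

Because available charge scales with the square of the diameter, shrinking a smooth electrode from 200 µm to 20 µm cuts it a hundredfold, while a coating that raises the effective area recovers part of that loss without enlarging the footprint.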

Coupling neural cells intimately to the electrodes is also important; otherwise the efficacy of both recording and stimulation is compromised. Recording is compromised by background noise of nearby neurons. Also, the conductance of the solution affects both recording and stimulation (Grattarola and Martinoia, 1993). The most common means to promote neural adhesion is through the use of cell adhesion proteins (Sorribas et al., 2001; Heller et al., 2005). Synthetic positively charged polymers, such as polylysine (Crompton et al., 2007) and poly(ethyleneimine) (PEI) (Ruardij et al., 2000), are commonly used to promote neural cell attachment (He and Bellamkonda, 2005; Khan and Newaz, 2010). The temperature-sensitive poly(N-isopropylacrylamide) (PNIPAm) was used to improve the binding between a retinal implant and the retina (Tunc et al., 2007). Conducting polymers (CPs), such as poly(ethylenedioxythiophene) (PEDOT) and polypyrrole (PPy), were used as neural growth substrate and electrode coating and are of particular interest due to their combined electronic and ionic conductivity (George et al., 2005; Abidian and Martin, 2008; Asplund et al., 2009; Abidian et al., 2010). The main disadvantage of CPs is their low stability under continued stimulation and exposure to ultraviolet (UV) radiation or heat. Applied voltage results in the insertion or removal of counter ions, so the CPs undergo swelling, shrinkage or breaking that gradually degrades their conductance (Yamato et al., 1995; Marciniak et al., 2004). Additionally, synthetic polymers and CPs are often fabricated using complex or toxic polymerization schemes that are not well suited for cell interfacing applications, and the resulting residues are often not easily removed (Wan, 2008).

#### **SURFACE ROUGHNESS AND CARBON NANOTUBES IN NEURONAL INTERFACING**

Recent studies have shown that surface topography is an important parameter affecting neuronal anchoring and branching (Seidlits et al., 2008; Hoffman-Kim et al., 2010; Roach et al., 2010). In fact, cells preferentially adhere to rough surfaces when exposed to the same chemistry (Fan et al., 2002). Therefore, new electrode materials were investigated to realize electrodes with improved electrical properties, affinity to neuronal cells and biocompatibility by exploiting the electrodes' morphological properties rather than their chemical ones.

CNTs are an ideal material to meet these requirements. They are well suited for neural electrical interfacing applications owing to their large surface area, superior electrical and mechanical properties, as well as their ability to support excellent neuronal cell adhesion (Malarkey and Parpura, 2007; Ben-Jacob and Hanein, 2008; Voge and Stegemann, 2011). Recent studies have indeed confirmed the great potential of CNT surfaces as a biocompatible substrate on which neurons can readily adhere. This affinity was linked to surface properties including roughness, polarity, charge, and chemistry (Hu et al., 2004; Gabay et al., 2005a,b; Malarkey et al., 2009; Sorkin et al., 2009). The high surface area of CNTs can lead to a significant increase in charge injection capacity and decreased interfacial impedance (Gabay et al., 2007; Keefer et al., 2008).

Investigations so far have focused on several main themes: the effect of chemically modified CNTs on the viability of neuronal cells, process outgrowth and branching (Mattson et al., 2000; Hu et al., 2004; Matsumoto et al., 2007), electrical interfacing with neurons (Gheith et al., 2006; Wang et al., 2006; Gabay et al., 2007; Shein et al., 2009), and the development of neural implants (Webster et al., 2004; Nunes et al., 2012). CNTs are now widely investigated as an interfacing material for neuronal applications (Malarkey and Parpura, 2007; Ben-Jacob and Hanein, 2008; Pancrazio, 2008; Lee and Parpura, 2009; Voge and Stegemann, 2011). As highlighted above, both surface chemistry and surface topography are critically important parameters determining the formation of effective electrodes. Many schemes have been developed to address these challenges using CNT coatings (pristine and chemically modified), offering exciting opportunities, as will be further explored below.

#### **CARBON NANOTUBES**

We begin our review with a brief overview of the physical properties of CNTs. CNTs are hollow cylinders formed in the shape of a rolled graphene sheet. Single walled CNTs (SWCNTs) are the simplest of these objects with a diameter ranging between 0.4 and 2.5 nm and lengths of up to a few millimeters. Multi walled carbon nanotubes (MWCNTs) are composed of a set of coaxially organized SWCNTs and are 2–100 nm in diameter while their length can vary from one to several hundred micrometers (Harris, 2009). The arrangement of the carbon atoms in the graphene sheet can be of different chirality: armchair, chiral, or zigzag. Together, the chirality, the tube diameter and the number of graphene walls determine the CNT conductivity. Generally, SWCNTs can be metallic or semiconducting, whereas MWCNTs typically exhibit metallic behavior (Charlier et al., 2007). CNTs are also mechanically stable with very high tensile strengths and chemical inertness (Ciraci et al., 2004; Hayashi et al., 2007). CNTs are commonly synthesized from a catalyst by a variety of methods including chemical vapor deposition (CVD), electric arc discharge and laser ablation (Thostenson et al., 2001; Seah et al., 2011). Their physical properties make CNTs a durable nanomaterial for biological applications, especially where a long lasting material is desired (e.g., scaffolds for support of cellular growth). Although the surface of CNTs is fundamentally inert, it can be readily functionalized with different polymers or bioactive molecules, such as peptides and proteins, to improve their biocompatibility and bioactivity (Bekyarova et al., 2005; Yang et al., 2007; Lu et al., 2009; Bottini et al., 2011).
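The link between chirality, diameter and conductivity can be illustrated with the standard chiral-index description of a SWCNT, which is not spelled out in the text above and is introduced here as an assumption. In that description a tube is labeled by two integers (n, m); in the simplest zone-folding picture it is metallic when n − m is a multiple of 3 and semiconducting otherwise, and its diameter is d = a·√(n² + nm + m²)/π with a ≈ 0.246 nm, the graphene lattice constant. The function names below are hypothetical, and curvature effects in very narrow tubes are ignored.

```python
import math

GRAPHENE_LATTICE_NM = 0.246  # graphene lattice constant a, in nm

def classify_chirality(n, m):
    """Name the tube geometry from its chiral indices (n, m)."""
    if n == m:
        return "armchair"
    if n == 0 or m == 0:
        return "zigzag"
    return "chiral"

def is_metallic(n, m):
    """Simple zone-folding rule: metallic when (n - m) is divisible by 3."""
    return (n - m) % 3 == 0

def diameter_nm(n, m):
    """Tube diameter from the length of the chiral vector divided by pi."""
    return GRAPHENE_LATTICE_NM * math.sqrt(n * n + n * m + m * m) / math.pi

if __name__ == "__main__":
    for n, m in [(10, 10), (9, 0), (10, 0), (6, 5)]:
        kind = "metallic" if is_metallic(n, m) else "semiconducting"
        print(f"({n},{m}): {classify_chirality(n, m)}, {kind}, "
              f"d = {diameter_nm(n, m):.2f} nm")
```

Under this rule every armchair tube is metallic, while zigzag and chiral tubes can fall into either class, consistent with the statement that SWCNTs can be metallic or semiconducting.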

#### **CARBON NANOTUBES AND NEURONS**

The first investigations into the use of CNTs in neuro-interfacing applications focused on characterizing neuronal adhesion and proliferation on CNT coated surfaces. Mattson and co-workers were the first to discuss the use of CNTs as a substrate for neuronal growth (Mattson et al., 2000). The researchers grew embryonic rat hippocampal neurons on coverslips covered with PEI and MWCNTs. They found that pristine MWCNT substrates allowed neuronal attachment but did not support neurite branching as elaborate as that of cells cultured on PEI-coated coverslips. However, when MWCNTs were non-covalently functionalized (by physisorption) with 4-hydroxynonenal (4-HNE), a molecule that promotes neurite outgrowth, large increases in the number of neurites per cell and in the overall neurite lengths were observed. This study demonstrated that MWCNTs can serve as a permissive substrate for neuronal cell adhesion and growth and that modifying MWCNTs with a biologically relevant molecule can be used to modulate neuronal growth and neurite outgrowth (Mattson et al., 2000).

The pioneering work of Mattson and co-workers was followed by a succession of studies aiming to further elucidate the observed effects. Hu et al. studied the effect of charge. The charge of a MWCNT substrate was modified by functionalization with carboxyl groups, poly-m-aminobenzene sulfonic acid (PABS), or ethylenediamine (EN) to create negatively charged, zwitterionic, or positively charged nanotubes, respectively. The number of neurites was then counted for each type of nanotube and functionalization. Longer neurites and more elaborate branching were observed on positively charged CNT substrates (Hu et al., 2004). Xie and co-workers determined that MWCNT mats functionalized with carboxyl groups are a permissive substrate for the growth of rat dorsal root ganglion (DRG) neurons, as confirmed by scanning electron microscopy (SEM) imaging. The researchers further suggested that the functional groups act as anchoring seeds that enhance the adhesion of neural cells and neurites (Xie et al., 2006).

Covalent modifications of CNTs with neurotrophins, protein growth factors that promote the survival and differentiation of neurons, were studied by Matsumoto et al. (2007). MWCNTs were functionalized with nerve growth factor (NGF) and brain-derived neurotrophic factor (BDNF). Embryonic chick DRG neurite outgrowth on modified MWCNTs was similar to that seen with soluble NGF and BDNF in culturing media, indicating that the covalently attached factors were still bioactive. Pristine MWCNTs were also shown to support the growth of neurons (Gabay et al., 2005a,b; Galvan-Garcia et al., 2007). This effect is illustrated in **Figure 1**, which shows the strong affinity between dissociated locust neurons and pristine CNT islands after several days of incubation. Galvan-Garcia and co-workers reported that MWCNTs in the form of sheets or yarns supported long-term growth of a variety of cell types ranging from skin fibroblasts and Schwann cells to postnatal cortical and cerebellar neurons. When highly purified, these CNT sheets allowed neurons to extend processes in a similar number and length to those grown on planar polyornithine substrates (a permissive support). Thus, these results suggest that the interaction between neurons and CNTs may be affected by the purity of the CNTs, as well as by the three-dimensional organization of the CNT substrate.

**FIGURE 1 | A false-colored SEM image of fixed locust frontal ganglion neuronal cells cultured on carbon nanotube islands.** The carbon nanotube islands were grown using the chemical vapor deposition method directly on a quartz support. For further details see Sorkin et al. (2009). Width of field of view is 77 µm.

Although initial investigations focused on MWCNTs, SWCNTs were also studied as neuronal substrates. Hu and co-workers synthesized a PEI-functionalized SWCNT graft copolymer (SWCNT-PEI) (Hu et al., 2005). Covalent functionalization was used to render the SWCNTs soluble in aqueous media. Next, rat hippocampal neurons were cultured on coverslips coated with SWCNT-PEI and the results were compared with those of pristine MWCNT or PEI substrates. Fluorescence microscopy was used to examine neuronal viability, as indicated by the ability of the cells to accumulate the vital stain calcein. It was found that SWCNT functionalization diluted the effect of the PEI's positive charge, resulting in neurite outgrowth and branching to an extent intermediate between that of as-prepared CNT films and PEI alone. These results were consistent with the initial findings of Mattson and colleagues using fixed cells. Modified MWCNTs were found to be inferior to PEI as a culturing substrate (Hu et al., 2005). Gheith and co-workers demonstrated that freestanding SWCNT-polymer films prepared by the layer-by-layer (LBL) technique are compatible with neuronal cell culturing. The films were prepared by layering SWCNTs with a negatively charged polyacrylic acid polymer. The SWCNTs were coated with amphiphilic poly(*N*-cetyl-4-vinylpyridinium bromide-*co*-*N*-ethyl-4-vinylpyridinium bromide-*co*-4-vinylpyridine). The presence of positively charged groups on the surface of this polymer promoted cell adhesion. Cell cultures of the neuronal model cell line NG108 effectively grew and proliferated on these substrates. Moreover, the number of neurites spun from individual cells exceeded that developed on traditional cell growth substrates (Gheith et al., 2005). However, not all CNT functionalizations lead to substrates that enhance neural cell growth. Liopo and co-workers showed that 4-tertbutylphenyl or 4-benzoic acid functionalized SWCNTs were less supportive of NG108 cell attachment and growth than pristine nanotubes (Liopo et al., 2006).

Carbon nanofibers (CNFs) are a form of carbon material closely related to MWCNTs and were also tested as a neuronal substrate. CNFs consist of multi-walled graphene structures stacked on top of each other like a stack of ice cream cones (Rodriguez, 1993). Nguyen-Vu and co-workers directly grew forest-like vertically aligned CNFs (VACNFs) on a substrate with a lithographically patterned catalyst. After the CNF film was submerged in a liquid and dried, the CNFs irreversibly stuck together to form microbundles. A uniform freestanding film was achieved after coating the CNFs with a thin layer of the CP PPy by electrochemical deposition. The PC12 cell line grew as monolayers on the CNF films only after further coating with a collagen film. Otherwise, cells appeared to float on top of the CNF surface (Nguyen-Vu et al., 2006). In a subsequent study, the neuronal marker NGF was introduced to the VACNF surface to promote the formation of well-differentiated cells with mature neurites. The freestanding VACNFs coated with PPy and NGF were found to bend toward the cell body and adhere to it. Therefore, it was suggested that the soft PPy coating contributes to better mechanical contact with cells due to a reduction in the local mechanical stress (Nguyen-Vu et al., 2007).

#### **CNT CONDUCTIVITY**

Since CNTs may vary between being conducting and semiconducting, their electrical properties were also studied. Malarkey and co-workers varied the conductivity of SWCNT-polyethylene glycol (PEG) graft copolymer coatings by changing the film thickness, while maintaining a constant surface roughness (Malarkey et al., 2009). Rat hippocampal neurons were then seeded. It was shown that thinner, less conducting SWCNT films, resulted in longer neurite processes, while thicker, more conductive films, produced larger cell bodies. Smooth, positively charged PEI substrates resulted in a larger number of growth cones per cell body. This study demonstrated that differences in conductance, roughness, and surface charge can modulate neuronal cell growth and morphology.

#### **CARBON NANOTUBE SURFACE ROUGHNESS**

Overall, the neuron-CNT interaction appears to be strongly affected by surface roughness. It was suggested that the roughness of CNTs contributes to anchoring neural cells (Zhang et al., 2005; Xie et al., 2006; Sorkin et al., 2009). Zhang et al. (2005) fabricated patterned vertical MWCNT surfaces. The CNTs were then functionalized with poly-L-lysine (PLL). Cell cultures of the neuronal cell line H19-7 preferentially adhered to the MWCNT patterns. Neuronal growth cones were found to make contact with the nanotube surface, and these strong interactions allowed the neurons to spread along patterns and form interactions with one another. It was established that guided neurite growth formed preferentially on long vertical MWCNTs compared to short ones. The authors attributed this behavior to a possible increased adsorption of the PLL molecules onto the long nanotubes. An additional mechanism may be that long nanotubes are flexible and undergo deformation to accommodate the proliferating neurites.

Sorkin and co-workers characterized the arrangement of neurons and glia cells on CNT surfaces (see **Figure 2**). Three-dimensional, small, isolated and pristine CNT islands were fabricated and plated with cells. Two biological model systems were used: cortical neurons from rats, and ganglion cells from locusts. Neurons were found bound and preferentially anchored to the rough surfaces. For both model systems, the morphology of neuronal processes on the small, isolated islands of high density CNTs was found to be conspicuously curled and entangled. In this study, it was demonstrated that the roughness of the surface must match the diameter of the neuronal processes in order to allow them to bind. It was suggested that entanglement, a mechanical effect, may constitute an additional mechanism by which neurons anchor themselves to rough surfaces (Sorkin et al., 2009).

**FIGURE 2 | Rat neuronal cultures on CNT islands. (A)** Fluorescent confocal image of fixed neurons (red) and glia cells (green) cultured on a carbon nanotube island. Disk diameter is 20 µm. **(B,C)** HRSEM images of a neuronal process forming a loop around several CNTs (designated by arrows). The image in **(C)** corresponds to the area marked by the dashed box in **(B)**. Clearly identifiable process segments were manually highlighted. Processes appear to bind to the carbon nanotube surface in a manner akin to that of tendrils. Adapted from Sorkin et al. (2009).

**Table 1** summarizes the main results described above, emphasizing how different CNTs and CNT modifications affect neuronal adhesion. The general picture that emerges from these investigations is that MWCNTs, SWCNTs, and CNFs are permissive substrates for neuronal growth and proliferation. Neuronal interaction with CNTs is affected by CNT surface chemical modification, conductivity, charge, and roughness. Positive charge had a positive effect on neurite branching and length. Altering conductivity resulted in changes in neurite length and cell body size. Surface roughness contributed to anchoring neurons to the surface. Chemical modification of the CNT surface with 4-HNE and PEI had a positive effect on neurite branching and growth, whereas modification with 4-tertbutylphenyl and 4-benzoic acid diminished neuronal cell growth.

#### **CARBON NANOTUBES FOR NEURONAL PATTERNING**

Patterned CNT films, such as those discussed above, provide a unique scheme for creating and studying engineered neuronal networks. Studies using patterned CNTs can provide insight into the collective activity of neural networks. CNT patterns also offer a route for developing three-dimensional scaffolds as a step toward designing circuits for bio-computational purposes and neuro-prosthetics applications. This approach can also be used to build advanced neuro-chips for bio-sensing applications (e.g., drug and toxin detection) where the structure and stability of the networks are important.

Zhang and co-workers cultured neurons on micron-scale patterns with different geometries. These patterns were designed to support an investigation into mechanisms underlying neuronal extension, guidance, and interaction. Straight lines, squares and circular features were used, as well as different lengths of the nanotubes. It was found that neurons preferentially adhered to MWCNT patterns. Growth cones were attached to the nanotube surface, allowing the neurons to spread along patterns and interact with one another (Zhang et al., 2005).

CNT islands were also used extensively by us to engineer neuronal networks into a system with well-defined geometry (see **Figure 3**), so the interplay between geometry and neuronal activity can be systematically investigated (Gabay et al., 2005a,b; Sorkin et al., 2006, 2009; Greenbaum et al., 2009; Shein et al., 2009) (see **Figure 2** for a typical example). In one of the first publications to use MWCNTs for neuronal interfacing applications, Gabay and co-workers imprinted a pattern of iron nanoparticle catalyst on quartz substrates using a poly (dimethylsiloxane) (PDMS) stamp and then grew CNTs from the iron catalyst islands. Rat cortical neurons and glial cells accumulated preferentially on the MWCNT islands and formed interconnected networks, bridging across the non-permissive quartz to form connections between adjacent islands. Using the patch clamp technique, cultured neurons were found to be electro-physiologically active with normal resting membrane potentials, demonstrating that the MWCNT did not alter the neuronal integrity (Gabay et al., 2005a,b).

In subsequent work, Sorkin and co-workers examined the dynamics of neuronal network organization by placing rat cortical and hippocampal neurons on substrates patterned with MWCNTs or poly-D-lysine. Cell clusters were found to spontaneously anchor to patterned islands, with neurites connecting nearby islands through a single non-adherent straight bundle composed of axons and dendrites. Square, triangular and circular structures of connectivity were successfully realized. Monitoring the dynamics of the networks in real time revealed that the self-assembly process is mainly driven by the ability of the cells to move while continuously stretching neurite bundles in between. The patterned networks were stable for as long as 11 weeks (Sorkin et al., 2006). In a subsequent study, Sorkin and co-workers cultured rat cortical neurons, as well as locust frontal ganglion neurons, on micro-patterned MWCNT islands. Neuronal processes tended to wrap around and entangle with the rough MWCNT islands. It appears that the similar dimensions of the CNTs (within the island) and the neurites support an anchoring mechanism allowing neurons to attach (Sorkin et al., 2009). Greenbaum and co-workers demonstrated the use of specially designed CNT substrates to form small networks of locust frontal ganglion neurons. It was suggested that mechanical tension is created along the cell's processes and pulls the cell's soma; neuronal activity was recorded from single cells (Greenbaum et al., 2009). These effects were further explored (Anava et al., 2009; Hanein et al., 2011) to show that mechanical effects are indeed ubiquitous in these developing networks.

## **CARBON NANOTUBES FOR ELECTRICAL NEURONAL INTERFACING**

As discussed in the "Introduction" section, contemporary electrodes used for neuro-prosthetic applications have relatively high impedance and poor CSC. In order to better appreciate these challenges and to evaluate the potential of CNTs in neuronal electrode applications, we begin with a brief overview of the electrical processes taking place at the neuron-electrode interface.

#### **EXTRACELLULAR RECORDING AND STIMULATION OF NEURONAL ACTIVITY**

Signal transmission in neuronal systems is the result of ionic currents passing through specific ion channels across the cell membrane. Extracellular recording methods monitor the electrical field associated with these currents. The time course of the extracellular action potential is typically ∼1 ms and the amplitude is in the range of a few tens to a few hundreds of microvolts (Cogan, 2008; Buzsaki et al., 2012). This amplitude is significantly smaller than that of the corresponding intracellular spike, which is in the range of tens of millivolts. Additionally, extracellular signals diminish rapidly as a function of distance from the cell. A reverse process takes place during stimulation: charges are delivered from the electrode and induce a buildup of membrane potential. Under a strong enough field, voltage-sensitive ion channels in the cell membrane trigger the generation of an action potential (Roth, 1994; Tehovnik, 1996; Basser and Roth, 2000).


**Table 1 | Neuronal adhesion on CNT coated surfaces.**


Stimulating neurons and recording extracellular signals can be achieved using a conducting electrode placed close to the cell or its processes. The electrochemical properties of the electrode are fundamental to its performance as a stimulating or recording electrode. Clearly, an effective interface is a prerequisite for both stimulation and recording. While neuronal stimulation and recording are related in nature, these two applications have somewhat different requirements. Foremost, the amount of charge required for stimulation is orders of magnitude higher than what is recorded. Recording may often be impossible with electrodes that are well suited for stimulation. In neuronal recording, the typically small signals make noise considerations very important (Musial et al., 2002). For safe stimulation, however, delivering the appropriate charge to the tissue without causing electrode or tissue damage is the main consideration (McCreery et al., 1988, 1990; Cogan, 2008).

The electrode material and the reactions at the electrode-tissue interface (the reactions mediating the transition from electron flow in the electrode to ion flow in the tissue) are the main parameters determining the safe range for stimulation. The reactions taking place during charge injection can be capacitive or Faradaic (**Figure 4A**). Capacitive reactions involve displacement current and are associated with the charging and discharging of the electrode-electrolyte double layer due to redistribution of charged species in the electrolyte. Faradaic reactions, on the other hand, involve the transfer of electrons across the electrode-electrolyte interface and require that some species, on the surface of the electrode or in solution, are oxidized or reduced. These reactions can lead to irreversible processes that cause electrode or tissue damage. Therefore, while maximizing the current injected through an electrode is important, this should ideally be achieved using non-Faradaic electrodes. Capacitive charge delivery is therefore a critical consideration in the design of electrodes both for recording and stimulation.
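The capacitive mechanism can be summarized with the standard relations for an ideal capacitor. This is a simplified sketch, assuming a constant double-layer capacitance $C_{dl}$ and an interface voltage $V(t)$; these symbols are introduced here for illustration and do not appear in the cited works.

$$
i_C(t) = C_{dl}\,\frac{dV(t)}{dt}, \qquad Q_C = C_{dl}\,\Delta V
$$

Under this idealization, the charge $Q_C$ that can be delivered without electron transfer is bounded by the capacitance multiplied by the voltage excursion $\Delta V$ that the interface tolerates before Faradaic reactions begin. This is one way to see why a large effective surface area, and hence a large $C_{dl}$, is advantageous.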

The capacitive and Faradaic reactions at the electrode-electrolyte interface are modeled by a simple electrical circuit consisting of two elements, a capacitor and a resistive element in parallel. **Figures 4B,C** illustrate circuit models of the electrode-electrolyte interface and of extracellular recording and stimulation of neuronal tissue, respectively. The capacitive mechanism, which represents the ability of the electrode to cause charge flow in the electrolyte without electron transfer, is modeled as a simple electrical capacitor called the double layer capacitor (Bard and Falkner, 2000; Merrill et al., 2005). Faradaic processes are modeled as a Faradaic impedance (Bard and Falkner, 2000; Merrill et al., 2005). There are two limiting cases derived from this model: the ideally polarizable electrode and the ideally non-polarizable electrode (Bard and Falkner, 2000; Merrill et al., 2005). The ideally non-polarizable electrode has zero Faradaic resistance; therefore, current flows readily in Faradaic reactions and there is no change in voltage across the interface upon the passage of current. Thus, the electrode potential remains near equilibrium, even upon current flow. The ideally polarizable electrode has an infinite Faradaic impedance and is modeled by a pure capacitor. In an ideally polarizable electrode, all the current is transferred through capacitive action; thus, the electrode potential is easily perturbed away from the equilibrium potential. Real electrode interfaces are modeled by the double layer capacitor in parallel with a finite Faradaic impedance, together in series with the solution resistance. A highly polarizable electrode is one that can accommodate a large amount of injected charge on the double layer prior to initiating Faradaic reactions. Thus, for improved biocompatibility, highly polarizable electrodes are desired. An additional important parameter used in the description of neuronal stimulation electrodes is the reversible CSC, also known as the reversible charge injection limit (Robblee and Rose, 1990; Merrill et al., 2005). The CSC of an electrode is the total amount of charge that may be stored reversibly, including storage in the double layer capacitance, pseudocapacitance, or any reversible Faradaic reaction. The material used for the electrode, the size and shape of the electrode, the electrolyte composition, and the parameters of the electrical stimulation waveform all influence the CSC. For a detailed description of the electrochemistry of the electrode-electrolyte interface of recording and stimulation neuronal electrodes, we refer the reader to Bard and Falkner (2000), Merrill et al. (2005), and Cogan (2008).
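The circuit model described above (a double layer capacitor in parallel with a Faradaic element, in series with the solution resistance) can be sketched numerically. The example below is a minimal illustration that treats the Faradaic impedance as a plain resistor, which is a simplifying assumption. The function name, parameter names, and component values are hypothetical and are not taken from the cited works.

```python
import math


def interface_impedance(freq_hz: float,
                        c_dl_farad: float,
                        r_faradaic_ohm: float,
                        r_solution_ohm: float) -> complex:
    """Impedance of R_solution in series with (C_dl parallel to R_faradaic).

    r_faradaic_ohm = math.inf models the ideally polarizable electrode
    (pure capacitor); r_faradaic_ohm = 0 models the ideally
    non-polarizable electrode (interface voltage does not change).
    """
    if r_faradaic_ohm == 0.0:
        # The parallel branch is shorted: only the solution resistance remains.
        return complex(r_solution_ohm, 0.0)

    omega = 2.0 * math.pi * freq_hz
    # Admittances of the two parallel branches add.
    admittance = 1j * omega * c_dl_farad + 1.0 / r_faradaic_ohm
    if admittance == 0:
        raise ValueError("open circuit: pure capacitor at zero frequency")
    return r_solution_ohm + 1.0 / admittance


if __name__ == "__main__":
    # Illustrative values only: compare a small and a large double-layer capacitance.
    for c_dl in (1e-9, 1e-7):
        z = interface_impedance(1000.0, c_dl, math.inf, 10e3)
        print(f"C_dl = {c_dl:.0e} F -> |Z| at 1 kHz = {abs(z):.0f} ohm")
```

In this sketch, increasing the double-layer capacitance lowers the impedance magnitude at a given frequency, which is the behavior exploited by high surface area electrode coatings.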

Overall, increased capacitance results in decreased impedance and reduced noise levels, and allows wider voltage windows for safe electrical stimulation. Contemporary Faradaic electrode materials include mainly noble metals such as gold, platinum, titanium, and iridium, as well as alloys of these metals, iridium oxide, stainless steel, and highly doped semiconductors such as silicon. Capacitive electrode materials include TiN, tantalum-tantalum oxide, and the more recently investigated CNTs. The capacitive nature of CNT electrodes is therefore yet another major advantage.
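The link between impedance and noise can be illustrated with the Johnson-Nyquist (thermal) noise formula, $v_{rms} = \sqrt{4 k_B T R \Delta f}$. The text above does not specify a noise model, so thermal noise is assumed here as a common first approximation. The function name and the example resistance, bandwidth, and temperature are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def thermal_noise_rms_volts(resistance_ohm: float,
                            bandwidth_hz: float,
                            temperature_k: float = 310.0) -> float:
    """RMS thermal noise voltage of a resistance over a given bandwidth.

    resistance_ohm stands for the real part of the electrode impedance;
    310 K (body temperature) is an assumed default.
    """
    return math.sqrt(4.0 * BOLTZMANN_J_PER_K * temperature_k
                     * resistance_ohm * bandwidth_hz)


if __name__ == "__main__":
    # Illustrative comparison: a 100-fold drop in resistance over a 10 kHz band.
    for r in (1e6, 1e4):
        v = thermal_noise_rms_volts(r, 10e3)
        print(f"R = {r:.0e} ohm -> noise = {v * 1e6:.2f} uV rms")
```

With these assumed values, a 1 MΩ electrode gives roughly 13 µV rms of thermal noise, which is comparable to the smaller extracellular signals discussed earlier. Because the noise scales with the square root of the resistance, a 100-fold reduction in resistance lowers it tenfold.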

#### **CARBON NANOTUBES FOR RECORDING AND STIMULATION OF NEURONAL ACTIVITY**

As we discussed above, CNTs have several fundamental properties which make them ideally suited for neuronal interfacing. They support neuronal proliferation, they are conducting and they form extremely high specific area, capacitive electro-chemical electrodes. Accordingly, many recent studies have employed CNTs as a coating material for neuro-electrodes.

Direct stimulation of isolated neurons in culture using SWCNT-coated substrates was demonstrated by several groups (Gheith et al., 2006; Liopo et al., 2006; Mazzatenta et al., 2007). Gheith and co-workers incorporated positively charged SWCNTs and polyacrylic acid into LBL multilayers with sufficiently high electrical conductivity to electrically stimulate a model neuronal cell line (NG108). The use of the SWCNT LBL films as culturing substrates did not perturb the key electrophysiological features of the NG108 cells, which confirms previous observations (Gheith et al., 2006). The electrical coupling of NG108 cells, as well as rat primary peripheral neurons, to unmodified and to 4-tertbutylphenyl or 4-benzoic acid modified SWCNTs deposited onto polyethylene terephthalate (PET) films was assessed by Liopo et al. (2006). Neurons showed voltage-activated currents when electrically stimulated through the conducting SWCNT film. The same issue was subsequently addressed by Mazzatenta and co-workers, who used electrophysiological measurements and computational modeling in order to understand the nature of the electrical coupling between neurons and pure SWCNTs (Mazzatenta et al., 2007). The authors cultured rat hippocampal neurons on glass cover slips coated with pristine SWCNT films. SEM revealed contacts between neuronal membranes and SWCNTs. Electrical recordings using a patch clamp indicated that neurons grown on SWCNT substrates displayed spontaneous electrical activity. Stimulation of cultured neurons was achieved by applying current through the nanotube substrate. Finally, a mathematical model describing the electrical coupling between the SWCNTs and the neurons was suggested (Mazzatenta et al., 2007).

Some studies suggested that CNTs boost neuronal electrical activity (Lovat et al., 2005; Cellot et al., 2009). Lovat and co-workers functionalized CNTs with pyrrolidine groups. This functionalization removed impurities and improved the CNT solubility in organic solvents. Glass cover slips were then coated with a drop of the solution. Evaporation of the solvent and heat treatment resulted in defunctionalization, leaving purified MWCNTs on the glass. Neurons grown on MWCNT films showed a six-fold increase in the frequency of the spontaneous postsynaptic currents and spontaneous action potential generation when compared to those grown on untreated glass. The authors proposed that the high conductivity of the CNT substrate might have affected the voltage-dependent membrane processes, resulting in the increased activity (Lovat et al., 2005). Cellot and co-workers have suggested that CNTs improve electrical communication between neurons through the formation of tight contacts with the cell membranes. They used thin CNT films formed by solution deposition on glass followed by thermal treatment. Rat hippocampal neurons were seeded onto the films and showed an increase in synaptic firing (Cellot et al., 2009), enhanced formation of synapses, and changes in synaptic dynamics (Cellot et al., 2011).

Composite CNT coatings enhance recording and stimulation of neurons *in vitro* and *in vivo* by decreasing the impedance and increasing charge transfer. Keefer and co-workers successfully coated electrodes with MWCNTs using different deposition schemes (Keefer et al., 2008). Commercial tungsten and stainless steel sharpened wire electrodes were coated with CNTs, using covalent attachment of the CNT coating, electrodeposition of a CNT-gold coating, or electrodeposition of CNTs combined with a CP (PPy). The different CNT coatings resulted in lower impedance and higher charge transfer capacity compared with bare metal electrodes. The *in vivo* recording quality of CNT-coated sharp electrodes was tested in the motor cortex of anesthetized rats and in the visual cortex of monkeys. Compared with bare metal electrodes, CNT-coated electrodes had reduced noise and improved detection of spontaneous activity (Keefer et al., 2008). Baranauskas and co-workers tested PPy-CNT coated platinum/tungsten microelectrodes. The PPy-CNT coating significantly reduced the microelectrode impedance and induced a significant improvement of the SNR, up to four-fold on average. *In vivo* signals were recorded from rat cortex (Baranauskas et al., 2011). Other CP-CNT composite coatings, including PPy-CNT (Lu et al., 2010; Chen et al., 2011a) and PEDOT-CNT (Luo et al., 2011), were tested. These coatings similarly resulted in enhanced electrochemical properties and were found to be biocompatible, although the devices were not used in recording or stimulation. The PPy-CNT coatings greatly improve the electrochemical performance of the test electrodes, and further investigation into the durability of these coatings under long-term stimulation and recording use would be important to reveal their full potential.

Collectively, the studies reviewed above show that CNTs may provide a superior means for electrical coupling between devices and neurons. We shall now discuss the use of CNT electrodes for both electrical recording and stimulation of neurons in the form of MEAs.

#### **CARBON NANOTUBE MEA FOR NEURONAL RECORDING AND STIMULATION**

A major development in the use of CNTs in neuro-applications is the design and fabrication of CNT MEAs (Gabay et al., 2007). Such MEAs were made by synthesizing islands of high density CNTs. Both MWCNT and SWCNT structures were used. CNTs were either deposited as a coating on top of metal electrodes (Keefer et al., 2008; Gabriel et al., 2009; Fuchsberger et al., 2011) or directly grown from a catalyst-patterned substrate (Wang et al., 2006; Gabay et al., 2007; Yu et al., 2007).

MWCNT-gold coated indium-tin oxide MEAs were used to record and stimulate mouse cortical cultures by Keefer and co-workers. The CNT-coated electrodes were found to be suited for recording and improved the effectiveness of stimulation (Keefer et al., 2008). Pristine CNT coatings were also used. Gabriel et al. coated standard platinum MEAs with SWCNTs, which were directly deposited onto electrodes by drop coating and drying. CNT coating resulted in enhanced electrical properties, namely decreased impedance and increased capacitance. The researchers successfully performed extracellular recordings from ganglion cells of isolated rabbit retinas (Gabriel et al., 2009). Fuchsberger and co-workers proposed the deposition of MWCNT layers onto TiN microelectrode arrays by means of a micro-contact printing technique using PDMS stamps. The coated MEA was applied for the electrochemical detection of dopamine and electrophysiological measurements of rat hippocampal neuronal cultures. MWCNT-coated microelectrodes were found to have recording properties superior to those of commercial TiN microelectrodes (Fuchsberger et al., 2011). Drop coating and micro-contact printing methods are quite simple to implement. However, the film may have weaker adhesion to the surface than films produced by covalent or electrochemical techniques; therefore, careful validation of the coating adhesion is important.

CNT MEAs based on top–down fabrication approaches were also reported. Superior electrical properties of CNT microelectrodes were presented by Gabay and co-workers. We fabricated the CNT MEAs by synthesizing high density MWCNT islands on a silicon dioxide substrate. The three-dimensional nature of the CNT electrodes contributes to a very large surface area, and consequently to high electrode specific capacitance (non-Faradaic behavior was validated) and low frequency dependence of the electrode impedance. Spontaneous activity of cultured rat neurons was recorded (Gabay et al., 2005a,b, 2007). Direct electrical interfacing between pristine CNT microelectrodes and cultured rat neurons was also demonstrated by Shein et al. (2009). Each electrode recorded the activity from a cluster of several neurons; this activity was characterized by bursting events (see **Figure 5**). The same CNT MEAs were further used to study the electrical activity of neuronal networks (Shein Idelson et al., 2010) as well as to interface with mouse retina (Shoval et al., 2009).

**FIGURE 5 | Spontaneous electrical activity of neuronal clusters on CNT MEA. (A)** Voltage traces of spontaneous electrical activity recorded from a CNT electrode. **(B)** Raster plot of the spontaneous spiking activity in several CNT electrodes. Activity patterns are characterized by bursting events; short time windows (several hundreds of milliseconds) of rapid collective neuronal firing, which are followed by long intervals (seconds) of sporadic firing. For further details see Shein et al. (2009).
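The bursting pattern described in the caption (short windows of rapid collective firing separated by long intervals of sporadic firing) can be located in spike-time data with a simple binned-count threshold. The sketch below is one illustrative approach and is not the analysis method of Shein et al. (2009). The function name, the 100 ms bin width, and the threshold of five spikes per bin are assumptions chosen for the example.

```python
from typing import List, Tuple


def detect_bursts(spike_times_ms: List[int],
                  bin_ms: int = 100,
                  min_spikes_per_bin: int = 5) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) windows in which pooled firing is dense.

    spike_times_ms: non-negative integer spike times in milliseconds,
    pooled over all electrodes. Consecutive bins that each reach the
    threshold are merged into a single burst window.
    """
    if not spike_times_ms:
        return []

    # Histogram of spike counts per time bin (integer arithmetic avoids
    # floating-point surprises at bin edges).
    n_bins = max(spike_times_ms) // bin_ms + 1
    counts = [0] * n_bins
    for t in spike_times_ms:
        counts[t // bin_ms] += 1

    bursts = []
    start_bin = None
    for i, count in enumerate(counts):
        if count >= min_spikes_per_bin:
            if start_bin is None:
                start_bin = i  # a new burst begins
        elif start_bin is not None:
            bursts.append((start_bin * bin_ms, i * bin_ms))
            start_bin = None
    if start_bin is not None:
        # The recording ended while a burst was still in progress.
        bursts.append((start_bin * bin_ms, n_bins * bin_ms))
    return bursts


if __name__ == "__main__":
    # Sporadic spikes plus one dense 300 ms episode starting at 10 s.
    spikes = [500, 3000] + [10000 + 10 * k for k in range(30)] + [15200]
    print(detect_bursts(spikes))
```

The bin width and threshold trade sensitivity against false detections and would need tuning to the number of electrodes and the baseline firing rate of a given culture.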

The retina tests revealed that the SNR of the CNT electrodes improved over time, suggesting a gradual (over 2 days) improvement in the tissue-electrode coupling. Recent stimulation tests by the same group revealed a similar improvement in the stimulation threshold (Eleftheriou et al., 2012).

Wang and co-workers presented a prototype of vertically aligned MWCNT pillars as microelectrodes on a quartz substrate (Wang et al., 2006). The nanotubes were functionalized with PEG to create a hydrophilic surface. The resulting hydrophilic CNT microelectrodes offer a high charge injection limit without Faradaic reactions. *In vitro* electrical stimulation of embryonic rat hippocampal neurons was then achieved and detected by observing changes in the intracellular calcium level using a calcium indicator (Wang et al., 2006). A VACNF MEA was fabricated and tested for potential electrophysiological applications by Yu et al. (2007). Extracellular stimulation and recording of both spontaneous and evoked activity in organotypic hippocampal slices was reported. de Asis and co-workers systematically compared a PPy-coated VACNF MEA with tungsten wire electrodes, a planar platinum MEA, and an as-grown VACNF MEA for the recording of evoked signals from acute hippocampal slices (de Asis et al., 2009). Recently, Su and co-workers synthesized CNTs on a cone-shaped silicon tip by catalytic thermal CVD. Oxygen plasma treatment was used to change the CNT surface from hydrophobic to hydrophilic in order to improve CNT wettability and electrical properties. Electrochemical characterization of the oxygen plasma-treated three-dimensional CNT probes revealed lower impedance and higher capacitance compared with the bare silicon tip. Furthermore, the oxygen-treated CNT probes were employed to record signals from a crayfish nerve cord (Su et al., 2010).

The development of CNT MEAs has a few important advantages over silicon probes commonly used in current neuroscience research and clinical applications. Silicon probes typically consist of a silicon support, silicon nitride, and silicon dioxide insulation layer. The electrodes are usually coated with iridium, gold or platinum. The first designs include the Michigan array (Wise et al., 1970; Wise and Angell, 1975) and the Utah array (Campbell et al., 1991). The Michigan probe includes several microelectrode sites patterned on each shank of the structure and the Utah array is a three-dimensional electrode array which consists of multiple sharpened silicon needles. However, a major shortcoming of these devices is the electrode material which is metallic and therefore Faradaic (compared with the capacitive CNT electrodes) and has no affinity to neuronal cells compared with the preferred neuronal adhesion to the rough CNT surfaces.

## **FLEXIBLE CNT MEA FOR RECORDING AND STIMULATION OF NEURONAL ACTIVITY**

Typical MEMS electrodes, despite their many advantages, are rigid and are therefore poorly suited for long-term *in vivo* neural applications. Accordingly, there is increased interest in the development of flexible MEAs. Specifically, the combination of flexible substrates and CNT electrodes for neuronal applications has gained attention.

Lin and co-workers were the first to fabricate and implement a flexible CNT-based electrode array for neuronal recording. The CNT electrode array was grown and patterned on a silicon substrate and was then transferred onto a flexible Parylene-C film. The four-step process included CNT growth, polymer binding, flexible film transfer, and partial isolation. The resulting vertically aligned CNTs were partially embedded in the polymer film. The electrophysiological response of a crayfish nerve cord was recorded with two Teflon-coated silver wires used as a stimulation and a reference electrode. The SNR of the flexible CNT electrode was 257 (Lin et al., 2009).

Direct growth of CNTs on flexible polyimide substrates by catalyst-assisted CVD was also demonstrated (Hsu et al., 2010). The length of the MWCNTs was controlled and increased approximately linearly with the growth time, resulting in decreased impedance and increased capacitance. UV-ozone exposure improved the interfacial properties between the CNT electrodes and the electrolyte by increasing the surface wettability (changing it from superhydrophobic to hydrophilic). UV-ozone treatment yielded a 50-fold impedance reduction. Furthermore, flexible CNT electrodes were found to exhibit resistive characteristics, in contrast to the results described above (Nguyen-Vu et al., 2006), which suggested that capacitive conduction dominates. Examination of neuronal cell cultures indicated good biocompatibility. Finally, recordings of evoked action potentials from lateral giant neurons in the abdominal ganglia of crayfish were achieved. The SNR was about 150, as good as that of a suction pipette and better than that of gold electrodes (SNR of 122 and 36, respectively). In a subsequent study, a flexible CNT MEA integrated with a chip containing 16 recording amplifiers was presented (Chen et al., 2011b). CNTs were again grown directly on a flexible polyimide substrate. The CNT microelectrode had ten times lower electrode impedance and six times higher capacitance, resulting in better charge injection capacity compared with a gold microelectrode of the same size. Tests with cultured neurons validated the biocompatibility of the device. *In vitro*, spontaneous spikes were recorded from a caudal photoreceptor neuron in the crayfish tail with an SNR of 6.2. The flexible CNT MEA was also applied to record the electrocorticogram (ECoG) of a rat motor cortex.

Our group has recently developed a novel all-CNT flexible electrode suited for recording and stimulation of neuronal tissue. Flexible devices were realized by transferring high-density MWCNT films onto a flexible PDMS film (Hanein, 2010). Deliberately poor adhesion between the CNT film and the growth substrate allowed the transfer of the CNTs to the PDMS substrate (**Figure 6A**). This poor adhesion resulted from direct growth of the CNTs on SiO2. The technology is simple and the resulting stimulating electrodes are nearly purely capacitive. The electrodes exhibit a capacitance of 2 mF/cm², which is similar to that of TiN and pristine MWCNT electrodes fabricated on a rigid silicon substrate (2 and 10 mF/cm², respectively) (Gabay et al., 2007). Recent recording and stimulation tests with chick retina (**Figure 6B**) validate the suitability of the device for high-efficacy neuronal stimulation applications (David-Pur et al., submitted).

**Table 2** summarizes the main findings related to CNT-based neuronal electrical recording and stimulation. The overall picture that emerges from these data is that CNTs were used for neuronal electrical interfacing in three main schemes: CNT-coated substrates, CNT-coated sharpened metal wire electrodes, and CNT MEAs. CNT substrates were used as an *in vitro* growth substrate for neurons, and the electrical activity was recorded using the intracellular patch clamp technique. Electrical stimulation through these CNT-coated surfaces was also demonstrated. CNT-coated sharpened wire electrodes were used for both *in vitro* and *in vivo* neuronal extracellular recording and stimulation. The CNT MEA scheme allows for *in vitro* patterned neuronal growth in conjunction with extracellular recording and stimulation. The final and most recent scheme is the development of flexible CNT MEAs, which represents a major step toward implantable neuro-prosthetic applications.

**FIGURE 6 |** [...] an embryonic chick retina (day 14) by a CNT electrode (one out of sixteen 50 µm diameter electrodes in the array) using a biphasic anodic first pulse of [...] artifact of the stimulation. Spontaneous activity prior to stimulation is marked with asterisks.


**Table 2 | Neuronal electrical interfacing CNT technologies.**

## **CONCLUSIONS AND PERSPECTIVES**

In this review we explored the different properties that make CNTs uniquely suited for neuronal interfacing. We have also shown that intensive investigations over the past 10 years have explored CNTs for neuronal interfacing, from surface properties affecting cell adhesion and proliferation to the development of CNT-based MEAs and flexible electrode arrays for *in vivo* applications. This intensive research was motivated by the need to find therapies for neural disorders that require the use of electrical stimulation, as well as by the need to address basic questions in neuroscience. In particular, the study of engineered neuronal circuits can greatly benefit from such CNT-based platforms. The study of neuronal circuits aims at rebuilding damaged neuronal tissues. Natural circuits are not amenable to manipulation and have a highly complex structure, and are thus extremely challenging to study. Engineered *in vitro* neuronal networks, however, allow monitoring and systematic investigation and provide a unique platform for the study of activity patterns, morphology-activity relationships, and network damage and repair methods. All these applications can greatly benefit from an efficient neuronal scaffold having the ability to record and stimulate neuronal electrical activity.

The challenging requirements in the field of neural prosthetics, namely, reduction of electrode size while maintaining efficient electrochemical function, as well as reduction of immune response to the implanted device (linked to both size and rigidity of the implanted device), are only poorly fulfilled by commonly used materials. Thus, the development of an efficient neuroprosthetic platform will highly benefit from the realization of CNT electrodes on a flexible substrate.

The emerging applications of CNTs in the field of neuroscience must take into account cytotoxicity considerations. The potential toxicity of CNTs has been extensively studied, so far with mixed results (Shvedova et al., 2003, 2009, 2010; Dumortier et al., 2006; Firme and Bandaru, 2010; Zhao and Liu, 2012). Better understanding of the interaction between CNTs and the biological environment is required in order to facilitate efficient development of both safe and effective CNT-based neural technologies. Further testing of the corrosion resistance and stress durability of CNT electrodes is required. Another essential step is further study of the nature of neuron-CNT electrical interfacing. Also, comprehensive long-term recording and stimulation studies in animal models, followed by clinical trials and approval by regulatory authorities such as the US Food and Drug Administration (FDA), must be accomplished to allow routine use of CNT MEAs in neuroscience. The vast literature reviewed here, along with recent studies using CNTs embedded in a polymeric support, shows that CNTs, if handled properly, are safe as an implantable coating.

Several very promising directions in the study of CNT-based neuro-prosthetic devices currently exist. First is the integration of drug-eluting coatings. These coatings will allow the reduction of inflammation caused by the insertion of the neuronal implant into the tissue and improve survival of neurons in contact with the device. There is a growing interest in the study of such coatings (Zhong and Bellamkonda, 2005; Wadhwa et al., 2006; He et al., 2007); such studies will also benefit from addressing the development of a coating that will not impair the electrical activity of the device. Second is the research toward realization of CNT-based flexible MEAs, as elaborated in the text above. Some very recent work done in this area by our group and others revealed the great potential of such devices. Finally, combining light-sensitive function with the enhanced neuronal interfacing properties of CNTs will be highly beneficial for the development of novel retinal implants.

To conclude, the enhanced electrochemical properties of CNTs, their flexible and simple micro-fabrication procedure, as well as their bio-compatibility and durability, suggest that CNT electrodes are a promising platform for high resolution neuronal applications. The resemblance of CNT surfaces to the nanostructured features of natural neural tissue makes CNTs a suitable platform for tissue engineering and regeneration (Tran et al., 2010; Voge and Stegemann, 2011). Also, the high electrical conductivity of CNTs allows direct electrical interfacing with neurons (Shein-Idelson et al., 2011). Clearly, CNTs have enormous potential in the development of neuronal interfaces, and further study will enable the utilization of CNT-based technology to expand the understanding of the nervous system and to realize therapeutic approaches.

## **ACKNOWLEDGMENTS**

The authors acknowledge many useful discussions with Moshe David-Pur, Giora Beit-Yaakov, and Dr. Dorit Raz-Prag. They also acknowledge the support of a grant from the Israel Ministry of Science and Technology, the Israel Science Foundation and the European Research Council funding under the European Community's Seventh Framework Programme (FP7/2007–2013)/ERC grant agreement FUNMANIA-306707.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 August 2012; accepted: 22 December 2012; published online: 09 January 2013.*

*Citation: Bareket-Keren L and Hanein Y (2013) Carbon nanotube-based multi electrode arrays for neuronal interfacing: progress and prospects. Front. Neural Circuits 6:122. doi: 10.3389/fncir.2012.00122*

*Copyright © 2013 Bareket-Keren and Hanein. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Nanowire electrodes for high-density stimulation and measurement of neural circuits

#### *Jacob T. Robinson<sup>1</sup>\*, Marsela Jorgolli<sup>2</sup> and Hongkun Park<sup>2,3</sup>\**

*<sup>1</sup> Departments of Electrical and Computer Engineering and Bioengineering, Rice University, Houston, TX, USA*

*<sup>2</sup> Department of Physics, Harvard University, Cambridge, MA, USA*

*<sup>3</sup> Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany Bianxiao Cui, Stanford University, USA*

#### *\*Correspondence:*

*Jacob T. Robinson, Departments of Electrical and Computer Engineering and Bioengineering, Rice University, 6100 Main Street, MS 380, Houston, TX 77005, USA. e-mail: jtrobinson@rice.edu; Hongkun Park, Department of Chemistry and Chemical Biology and Physics, Harvard University, 12 Oxford St., Cambridge, MA 02138, USA. e-mail: hongkun\_park@harvard.edu*

Brain-machine interfaces (BMIs) that can precisely monitor and control neural activity will likely require new hardware with improved resolution and specificity. New nanofabricated electrodes with feature sizes and densities comparable to neural circuits may lead to such improvements. In this perspective, we review the recent development of vertical nanowire (NW) electrodes that could provide highly parallel single-cell recording and stimulation for future BMIs. We compare the advantages of these devices and discuss some of the technical challenges that must be overcome for this technology to become a platform for next-generation closed-loop BMIs.

#### **Keywords: brain machine interface (BMI), nanotechnology, nanowires, neuroengineering, electrophysiology**

Today, brain-machine interfaces (BMIs) enable users to manipulate prosthetic limbs and computer interfaces by monitoring and processing their neural activity (Donoghue et al., 2007; Simeral et al., 2011). BMIs can also be used to treat neurological disorders such as Parkinson's disease (Volkmann, 2004), obsessive compulsive disorder (Bourne et al., 2012), and depression (Howland et al., 2011) by applying voltage or current pulses to specific regions deep within the brain, a treatment known as deep brain stimulation (DBS). As remarkable as today's BMI technology is, it is in many ways in its infancy. Future technology will seek to improve the precision with which external devices can be manipulated and the specificity of stimulation to the level of individual cells. These improvements will help expand the capabilities of neural prosthetics and extend the range of disorders that can be treated using DBS (Donoghue et al., 2007). To achieve these goals, the next generation of BMIs will need improved resolution for measurement and stimulation, as well as the ability to adjust their spatial and temporal stimulation patterns based on the current state of the neural activity (devices with this latter capability are often termed "closed-loop" BMIs) (Stanslaski et al., 2012).
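The closed-loop idea described above, in which stimulation is adjusted according to the measured state of the neural activity, can be illustrated with a toy simulation. This is a minimal sketch and not a model of any device discussed here: the proportional control rule, the assumption that stimulation suppresses the firing rate, and all names and parameter values (`run_closed_loop`, `drive`, `gain`, `max_amp`) are hypothetical.

```python
def run_closed_loop(initial_rate, target_rate, gain, steps,
                    drive=2.0, max_amp=10.0):
    """Toy closed-loop stimulator (illustrative assumptions only).

    Each step: measure the firing rate, choose a stimulation amplitude
    proportional to the excess over the target, then update the rate with
    a made-up tissue response in which stimulation lowers the rate and a
    constant `drive` raises it.
    """
    rate = initial_rate
    history = []
    for _ in range(steps):
        error = rate - target_rate                    # measure
        amp = min(max(gain * error, 0.0), max_amp)    # decide (clamped)
        rate = max(rate + drive - amp, 0.0)           # assumed tissue response
        history.append((rate, amp))
    return history


closed = run_closed_loop(initial_rate=20.0, target_rate=10.0, gain=0.5, steps=50)
open_loop = run_closed_loop(initial_rate=20.0, target_rate=10.0, gain=0.0, steps=50)
print("final rate with feedback:   ", round(closed[-1][0], 3))
print("final rate without feedback:", round(open_loop[-1][0], 3))
```

With feedback the toy rate settles where the stimulation amplitude balances the drive; with the gain set to zero (no feedback) the rate grows without bound. A purely proportional rule leaves a steady-state offset from the target, which is one reason practical controllers are usually more elaborate.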

Currently, the large size and small number of electrodes in BMIs limit their stimulation and measurement resolution. State-of-the-art devices for DBS typically have 4–8 millimeter-sized electrodes (Stanslaski et al., 2012), whereas BMIs for neural recording typically use a few dozen electrodes that are 10–100 microns in diameter (Hochberg et al., 2006; Donoghue et al., 2007; Du et al., 2011) (**Figure 1A**). This density and feature size is a far cry from that of the human brain, which contains approximately one hundred billion neurons, each with a diameter as small as 10 microns (Williams and Herrup, 1988). In fact, a single square millimeter of brain tissue contains approximately one million neurons (Williams and Herrup, 1988). To match this number and density, future BMIs must feature smaller and denser electrode arrays in order to precisely monitor and control neural circuit activity. Furthermore, smaller electrodes (*<*1 micron in diameter) may also enable the recording of intracellular electrical signals of individual neurons (**Figure 1A**): compared to extracellular recording, these intracellular measurements will have improved signal-to-noise ratio and enable a clear cell-to-electrode registry (**Figures 1B–D**). Importantly, the improved signal-to-noise ratio also enables intracellular electrodes to record subthreshold neural activity (e.g., postsynaptic potentials) that can be used to determine the strength of synaptic connectivity.

Fortunately, devices that match the feature size and density of neural circuits are routinely fabricated on silicon using contemporary nanofabrication techniques (Arden, 2002). This observation highlights the future role for semiconductor fabrication and nanotechnology as a platform for high-precision BMIs. Recently, these nanofabrication techniques have been used to create vertical nanowires (NWs) and nanotubes that can intracellularly stimulate and record the activity of neurons and other electrically active cells. Here we review this technology, highlight the characteristics that make NW electrodes an attractive platform for future BMIs, and comment on some of the challenges that face the development of these next-generation devices.

**FIGURE 1 |** [...] diameters less than 1 micron can be used as intracellular probes. Photo credits: DBS, EPDA.com; microelectrodes, microsystems.utah.edu. **(B)** Equivalent circuit model for a cell on top of an extracellular (left) and intracellular (right) electrode. The membrane resistance, capacitance, and Nernst potential are shown as R*m*, C*m*, and E*m*, respectively. The voltage recorded extracellularly [...] is proportional to V*m*, and typically has a magnitude of greater than 10 mV for a neuronal action potential. **(C)** Optical microscope image of a rat cortical neuron grown on top of a vertical NW electrode, scale bar 10 microns. **(D)** Scanning electron micrograph of a set of vertical NWs, scale bar 1 micron. [**(C)** and **(D)** adapted from Robinson et al. (2012)].

## **NWs AS INTRACELLULAR ELECTRODES**

The electrical activity of neurons is most directly measured as the electrical potential across the cellular membrane. As a result, intracellular electrodes that can directly measure the membrane potential are widely considered the "gold standard" in neuronal recording. These high fidelity measurements, however, come at a cost. To monitor intracellular signals via traditional patch-clamp methods, glass micropipettes must be carefully aligned to the cellular membrane using manually controlled micromanipulators. Once the pipette is placed in contact with the cell, an intracellular electrical connection can be formed either by slowly driving the pipette through the cellular membrane or by applying negative pressure to seal the membrane against the pipette and then rupturing the circumscribed patch using a current pulse or a rapid impulse of negative pressure (**Figure 2A**). While this painstaking process is a commonly performed procedure, it is not scalable. As a result, today's BMIs are based on extracellular recording techniques that sacrifice the high fidelity of intracellular measurements for the sake of scalability. For instance, contemporary extracellular electrodes have array sizes approaching one hundred electrodes and can simultaneously measure the activity of dozens of individual neurons (Hochberg et al., 2006; Donoghue et al., 2007; Du et al., 2011).

While extracellular electrodes succeed in monitoring large numbers of neurons, there are potential challenges for using these devices as the basis for closed-loop BMIs. For instance, a single extracellular electrode records the spiking activity of many nearby cells. This makes it difficult to identify the activity that corresponds to each individual neuron. While a variety of computational techniques can be used to sort the recorded spikes based on their waveforms, this process typically requires a training period of several minutes, and must be repeated on a daily basis as the electrical coupling between the cells and extracellular electrode changes (Donoghue et al., 2007). Furthermore, typical signal-to-noise ratios are less than 10:1 (Hochberg et al., 2006; Donoghue et al., 2007; Du et al., 2011), and therefore slight degradation of the signal amplitude during chronic implantation leads to a gradual reduction in the number of individual neurons that can be recorded (Dickey et al., 2009). Stimulation using extracellular electrodes also lacks precise cell-to-electrode registry. During voltage or current pulses from an extracellular electrode, many cells in the vicinity of the electrode will be activated. This shortcoming ultimately limits the spatial accuracy of stimuli applied via extracellular probes.

**FIGURE 2 | Intracellular recording methods. (A)** Whole cell patch pipette configuration measures a voltage (Vpipette) proportional to the membrane potential (V*m*). **(B)** A vertical glass nanotube (blue) is grown on top of an FET (pink) that lies within an insulated NW (gray). When the nanotube penetrates the cellular membrane, the membrane potential can be measured as a change in the source-drain current (ISD). **(C)** A platinum NW (red) is deposited on top of a platinum electrode (red) that is insulated by silicon nitride (blue). The voltage recorded at the NW (VNW) is then proportional to the membrane potential. **(D)** A silicon NW (gray) insulated by glass (blue) is capped with a metallic film such as platinum (red). Similarly to **(C)**, VNW is proportional to V*m*; however, in this configuration the NW sidewalls are insulated by glass, improving the amplitude of the measured signal and providing a surface for cell membrane fusion.
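The sensitivity of extracellular recordings to signal amplitude can be made concrete with a simple amplitude-threshold spike detector. The sketch below is illustrative only and is not the sorting procedure used in the cited studies: the threshold factor `k`, the robust noise estimate (median absolute deviation divided by 0.6745), the refractory window and the synthetic trace are all assumptions chosen for the example.

```python
import statistics


def noise_sigma(trace):
    """Robust noise estimate: median absolute deviation scaled to a Gaussian sigma."""
    med = statistics.median(trace)
    mad = statistics.median(abs(x - med) for x in trace)
    return mad / 0.6745


def detect_spikes(trace, k=4.0, refractory=3):
    """Return sample indices where the trace falls below -k * sigma.

    Crossings closer than `refractory` samples to the previous detection
    are ignored so that one spike is not counted twice.
    """
    threshold = -k * noise_sigma(trace)
    spikes, last = [], -refractory - 1
    for i, x in enumerate(trace):
        if x < threshold and i - last > refractory:
            spikes.append(i)
            last = i
    return spikes


def make_trace(spike_amplitude, n=100, spike_at=(20, 60)):
    """Synthetic trace (arbitrary units): small periodic 'noise' plus negative spikes."""
    base = [1.0, -1.0, 0.5, -0.5]
    trace = [base[i % 4] for i in range(n)]
    for i in spike_at:
        trace[i] = -spike_amplitude
    return trace


print(detect_spikes(make_trace(10.0)))  # large spikes are detected
print(detect_spikes(make_trace(4.0)))   # degraded spikes fall below threshold
```

In this toy example the two spikes of amplitude 10 are found at samples 20 and 60, whereas the same spikes attenuated to amplitude 4 no longer cross the threshold, mirroring how a modest loss of amplitude can remove a unit from the recorded population.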

To improve the cell-to-electrode registration, the size of the electrodes can be scaled down so that an individual electrode can interface to at most a single neuron (**Figure 1A**). This is the approach taken recently for vertical NW electrodes. Three recent papers have shown that these electrodes can be made small enough to penetrate the cellular membrane without compromising cell viability, and record or stimulate individual cells (Duan et al., 2012; Robinson et al., 2012; Xie et al., 2012). Thus silicon-based intracellular electrodes can provide both precise cell-to-electrode registration as well as large signal-to-noise ratios typically reserved for patch clamp recordings. Importantly, these NW devices can be fabricated using semiconductor nanofabrication techniques that can be scaled up to produce tens of thousands of recording sites in a single fabrication run, making this technology a potential platform for next generation BMIs requiring an increased number of electrodes.

Although the three recent demonstrations of vertical NW electrodes employed different fabrication strategies, each reported successful intracellular electrical measurements. Duan et al. used electron beam lithography to define nanoscale gold islands on the gate region of NW field-effect transistors (FETs). They used these islands as precursors for germanium (Ge) NW vapor-liquid-solid (VLS) growth (Duan et al., 2012). The Ge NWs were then coated with SiO2 using atomic layer deposition and the Ge core was subsequently etched away using hydrogen peroxide. This process left a nanoscale glass tube leading to the gate region of the NW FET (**Figure 2B**). The authors showed that when this glass nanotube penetrated the cellular membrane of a cardiomyocyte, the intracellular membrane potential could be recorded. In this configuration the cardiomyocyte membrane potential gates the NW FET such that the source-drain conductance maps to the intracellular membrane potential. One advantage of using the FET conductance to transduce the membrane potential is that the gain of the NW FET can be used to amplify the measured signal. An alternative strategy pursued by Robinson et al. used plasma etching to micromachine solid silicon (Si) NWs out of highly conductive silicon-on-insulator wafers (Robinson et al., 2012). The resulting Si NWs were then insulated by a thermally grown silicon oxide that was subsequently removed from the Si NW tips. The Si NW tips were coated with an evaporated platinum or gold film (**Figure 2D**). Unlike NW FETs, this approach requires amplifying electronics to boost the signal recorded by the Si NWs. At the same time, however, the Si NW electrodes can also be used to stimulate electrical activity by injecting current into the cell, thereby evoking neuronal action potentials on demand. Using the stimulation capabilities of the solid Si NWs, functional circuits can be reconstructed by systematically stimulating individual electrodes and recording the resulting response at another cell (Robinson et al., 2012). A third fabrication technique was employed by Xie et al., who used focused ion beams to deposit 100 nm diameter platinum NWs on top of planar platinum electrodes (Xie et al., 2012) (**Figure 2C**). The functionality of these devices was similar to that reported by Robinson et al., although Xie et al. used their device to monitor mitotic cardiac cells as opposed to primary neurons.
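The FET-based transduction described above can be summarized by a standard small-signal relation. This is a generic linearization offered as an aid to the reader, not an expression taken from Duan et al.; the symbols $g_m$ (transconductance), $V_G$ (effective gate potential) and $V_{SD}$ (source-drain bias) are introduced here for illustration, and the relation assumes that the potential inside the nanotube follows the membrane potential $V_m$.

$$
\Delta I_{SD} \approx g_m \, \Delta V_m, \qquad g_m = \left.\frac{\partial I_{SD}}{\partial V_G}\right|_{V_{SD}}
$$

Under this assumption, a larger transconductance yields a larger current change for the same change in membrane potential, which is the sense in which the FET provides gain at the recording site.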

For each NW electrode device, care must be taken to secure a stable intracellular measurement. To improve the stability of the cell-electrode interface and promote NW penetration, Duan et al. coated their devices with a phospholipid (Duan et al., 2012). One drawback to this method is that the phospholipid coating prevents cells from being cultured directly on top of the electrodes. As a result, to test these devices cells were grown on a separate PDMS substrate, inverted, and aligned atop the devices for measurement. Alternatively, Robinson et al. and Xie et al. did not use a surface treatment to promote penetration and reported intracellular recordings in cells grown directly on top of the devices. Both groups reported that, at times, voltage pulses are required to permeabilize the membrane covering the electrode in order to achieve intracellular electrical coupling. Xie et al. showed that over time, the permeabilized membrane recovers; however, repeated application of voltage pulses can restore the intracellular electrical coupling. The observed time scale of this recovery is consistent with the kinetics of cell membrane electroporation for biomolecule delivery (Saulis et al., 1991).

#### **FUTURE CHALLENGES FOR NW ELECTRODES** *in vivo*

While electrically facilitated membrane permeabilization is adequate for *in vitro* studies, a long-term *in vivo* interface will likely require alternative approaches to stabilize the NW-cell interface. Potential surface treatments include the phospholipid coating used by Duan et al. and biomimetic surfaces developed by Almquist and Melosh (2010). While these methods have been successful in securing stable interfaces between cells and nanostructures, future studies should investigate their long term stability and biocompatibility *in vivo*.

In addition to stabilizing the cellular interface, future *in vivo* devices must also deal with the motion of the brain resulting from the expanding and contracting vasculature (Enzmann and Pelc, 1992). Improving the flexibility of the NW electrodes may help them function in living tissue like the brain. Recently, Tian et al. have taken steps in this direction by embedding NW FETs in a flexible polymer matrix and releasing them from the rigid substrate (Tian et al., 2012). While these devices, which have only recently been reported, have yet to be used for intracellular measurements, this approach may allow intracellular electrodes to remain within the cell while the tissue moves. Supporting this idea are reports that vertical NWs can pin neurons in place and inhibit them from migrating away from NW electrodes (Xie et al., 2010). Such effects may help secure intracellular coupling *in vivo*. An alternative approach may be to embed vertical silicon NWs in a PDMS or other polymeric matrix that can be peeled off the silicon substrate. Such methods have recently been used to create flexible photonic crystal cavities based on semiconductor NWs, and may be adapted to support vertical NW electrodes (Yu et al., 2013).

Another challenge to using NW electrodes *in vivo* is recording beyond the layer of dead cells and protective glia that surround implanted electrodes. Studies have reported the thickness of the dead cell layers to be approximately 40 microns (Chia and Levene, 2009). Therefore, to access healthy cells, the NW electrodes must penetrate through this dead layer. One approach is to make long electrode shanks tipped with NW electrodes. Alternatively, very high aspect ratio NWs can be fabricated using deep reactive ion etching and oxide thinning processes (Morton et al., 2008). Such high aspect ratio NWs will have increased flexibility (Li et al., 2009), which may help accommodate tissue movement.

Another concern for closed-loop BMIs is the potential for crosstalk between stimulation and recording electrodes that could interfere with the feedback algorithms. Potential solutions to this problem fall primarily into two categories: (1) recording hardware and/or software can be modified to reduce the crosstalk using techniques such as blanking, spectral filtering, or common mode rejection (Stanslaski et al., 2012), or (2) stimulation can be performed optically using channelrhodopsin (Erickson et al., 2008) or near-IR stimulation (Wells et al., 2005). The advantage of modifying the recording electronics is that there is no need to genetically modify neuronal populations or to employ optical sources and components. However, the added electronic elements needed to reduce crosstalk can increase the size and power consumption of the BMI. Optical stimulation, on the other hand, typically produces negligible electronic crosstalk, simplifying the design requirements and lowering the power consumption of the recording electronics.
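Two of the signal-processing remedies named above, blanking and common mode rejection, can be sketched in software. The following is a minimal illustration and not the implementation of Stanslaski et al.: the function names, the fixed blanking width, and the use of a simple across-channel average as the common-mode estimate are assumptions made for the example.

```python
def blank(trace, stim_indices, width):
    """Return a copy of `trace` with `width` samples zeroed from each stimulation onset."""
    out = list(trace)
    for start in stim_indices:
        for i in range(start, min(start + width, len(out))):
            out[i] = 0.0
    return out


def common_average_reference(channels):
    """Subtract the across-channel mean at every sample.

    Any component shared by all channels (for example a stimulation
    artifact picked up equally everywhere) is removed; channel-specific
    signals remain. `channels` is a list of equal-length sample lists.
    """
    n = len(channels)
    means = [sum(column) / n for column in zip(*channels)]
    return [[x - m for x, m in zip(ch, means)] for ch in channels]


# Example: an artifact of +4 appears on both channels at sample 1.
recorded = [[1.0, 5.0, 1.0],
            [3.0, 5.0, 3.0]]
print(common_average_reference(recorded))
print(blank([1.0, 2.0, 3.0, 4.0, 5.0], stim_indices=[1], width=2))
```

Blanking discards all data during the stimulation window, whereas average referencing preserves timing but also removes any genuine neural signal that happens to be common to all channels; which trade-off is acceptable depends on the feedback algorithm.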

Finally, to achieve the thousands to millions of recording sites that will be desired for future BMIs, the number of electrodes must be increased. The NW electrodes described here have relied on a direct electrical connection to each electrode. This method, requiring a dedicated wire for each recording/stimulation site, is impractical for devices with large numbers of electrodes. Fortunately, this "interconnect problem" has already been solved for semiconductor electronics. Image sensors, for instance, contain millions of pixels that can be read using only a few dozen connections. This reduced number of contacts is achieved by interleaving data from several pixels and transmitting it over a single wire. This multiplexing process can be implemented on-chip using complementary metal oxide semiconductor (CMOS) technology (Lei et al., 2011). Yet another advantage of silicon-based electrodes is that they can be readily coupled to back-side CMOS electronics using bonding or other post-processing methods such as those used for extracellular electrodes (Kim et al., 2009). The combination of NW electrodes and CMOS multiplexing will allow high-resolution BMIs to be created on a compact monolithic platform.
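The interleaving described above is a form of time-division multiplexing, which can be sketched as follows. This is a software illustration of the principle only, not a description of the cited CMOS circuits; the function names and the example figures (1024 electrodes sampled at 20 kHz) are hypothetical.

```python
def multiplex(frames):
    """Interleave samples onto one line.

    `frames[t][ch]` is the sample of channel `ch` at time step `t`; the
    output stream is ordered t0ch0, t0ch1, ..., t1ch0, t1ch1, ...
    """
    return [sample for frame in frames for sample in frame]


def demultiplex(stream, n_channels):
    """Recover one sample list per channel from an interleaved stream."""
    if len(stream) % n_channels != 0:
        raise ValueError("stream length must be a multiple of n_channels")
    return [stream[ch::n_channels] for ch in range(n_channels)]


def line_rate(n_channels, sample_rate_hz):
    """Samples per second that a single shared line must carry."""
    return n_channels * sample_rate_hz


frames = [[1, 2, 3],
          [4, 5, 6]]          # two time steps, three electrodes
stream = multiplex(frames)
print(stream)
print(demultiplex(stream, 3))
print(line_rate(1024, 20_000))  # hypothetical array: samples per second on one wire
```

The cost of sharing a wire is bandwidth: the shared line must run at the per-channel sampling rate multiplied by the number of channels it serves, so practical designs typically divide the array among several multiplexed output lines.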

#### **CONCLUSIONS**

In the future, closed-loop BMIs will see improvements both to the hardware that interfaces to the neural circuits, and to the software that drives their activity. Ultimately this hardware may be able to stimulate and record the activity of thousands to millions of neurons with single-cell resolution. The high resolution combined with algorithms to identify the state of the neural circuit and predict its response to stimuli would provide a basis for new classes of BMIs that can accurately control neural circuits and translate this recorded activity into accurate manipulation of prosthetic devices. While there is currently no technology that can achieve such a high-resolution electrical interface, NW-based electrodes have made considerable progress toward this goal *in vitro*. To take the next step and employ nanotechnology for BMIs, efforts must be taken to improve the stability of the interface, flexibility of the electrodes, and compatibility with layers of dead cells and glia that accompany surgical implantation of electrodes. With these improvements, NW electrodes may become the preferred technology for high-resolution BMIs.

## **ACKNOWLEDGMENTS**

This work was supported by an NIH Pioneer award (5DP1OD003893-03) and an NSF EFRI award (EFRI-0835947) to Hongkun Park.

## **REFERENCES**

extracellular nanoscale field-effect transistor. *Nat. Nanotechnol.* 7, 174–179.

light for in vivo neural stimulation. *J. Biomed. Opt.* 10, 064003.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 October 2012; accepted: 24 February 2013; published online: 12 March 2013.*

*Citation: Robinson JT, Jorgolli M and Park H (2013) Nanowire electrodes for high-density stimulation and measurement of neural circuits. Front. Neural Circuits 7:38. doi: 10.3389/fncir.2013.00038*

*Copyright © 2013 Robinson, Jorgolli and Park. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## High-density microelectrode array recordings and real-time spike sorting for closed-loop experiments: an emerging technology to study neural plasticity

*Felix Franke\*, David Jäckel, Jelena Dragas, Jan Müller, Milos Radivojevic, Douglas Bakkum and Andreas Hierlemann*

*Department of Biosystems Science and Engineering, ETH Zürich, Basle, Switzerland*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Suguru N. Kudoh, Kwansei Gakuin University, Japan*

*Michela Chiappalone, Italian Institute of Technology, Italy*

#### *\*Correspondence:*

*Felix Franke, Department of Biosystems Science and Engineering, ETH Zürich, 4058 Basle, Switzerland. e-mail: felfranke@gmail.com*

Understanding plasticity of neural networks is a key to comprehending their development and function. A powerful technique to study neural plasticity includes recording and control of pre- and post-synaptic neural activity, e.g., by using simultaneous intracellular recording and stimulation of several neurons. Intracellular recording is, however, a demanding technique and is limited in that only a small number of neurons can be stimulated and recorded from at the same time. Extracellular techniques offer the possibility to simultaneously record from larger numbers of neurons with relative ease, at the expense of increased effort to sort out single neuronal activities from the recorded mixture, a time-consuming and error-prone step referred to as spike sorting. In this mini-review, we describe recent technological developments in two separate fields, namely CMOS-based high-density microelectrode arrays, which also allow for extracellular stimulation of neurons, and real-time spike sorting. We argue that these techniques, when combined, will provide a powerful tool to study plasticity in neural networks consisting of several thousand neurons *in vitro*.

**Keywords: closed-loop, real-time, spike sorting, multielectrode arrays, neural cultures**

#### **INTRODUCTION**

The understanding of neural circuits and their activities is to a major extent based on measurements with extracellular electrodes. This is because extracellular recordings are relatively easy to perform and very well established. In contrast to single-cell measurements with intracellular recording techniques, extracellular electrodes pick up the action potentials (spikes) of all neurons in their vicinity. This is a blessing as well as a curse. An advantage is that, in principle, several neurons can be measured simultaneously using a single extracellular electrode, but the price to pay is the need to assign single spikes to their putative neuronal sources. This problem is referred to as spike sorting; it is known to be difficult and error-prone (Lewicki, 1998) and often involves a highly time-consuming manual component.

Depending on the experiment, time-consuming spike sorting can be regarded as a mere inconvenience, and many studies have focused on the development of spike sorting algorithms for the offline analysis of the recordings after performing the experiment (see e.g., Letelier and Weber, 2000; Shoham and Fellows, 2003; Delescluse and Pouzat, 2006). For real-time closed-loop experiments and brain-machine interfaces (BMI), however, spike trains must be available already during the recording, so that time-consuming spike sorting is not merely a problem but essentially prohibits performing such experiments. Therefore, spike sorting is usually avoided in those experiments by detecting just the presence of action potentials, e.g., by applying a voltage threshold, which can be implemented relatively easily and efficiently, also in hardware (Guillory and Normann, 1999). Real-time spike detection allows for studying closed-loop feedback of neural activity, for example, through the implementation of visual feedback to an awake monkey (Fetz, 1969), or by applying electrical stimulation to neurons in an awake animal (Jackson et al., 2006). Electrical stimulation of neurons that depends on the activity of other neurons (see also **Figure 1**) was also successfully used in neural cultures on top of multi-electrode arrays (MEAs): electrical feedback stimuli have been used to control the bursting activity of cultured neurons in Wagenaar et al. (2005) and the connection strengths between neurons in Müller et al. (in review). The closed-loop approach can also be used to connect a neural network to a robot (Bontorin et al., 2007; Potter, 2010). For a review of real-time closed-loop electrophysiology see, e.g., Arsiero et al. (2007). These studies, however, were all realized without using spike sorting, either by limiting the number of single neurons that were recorded from (by trying to detect only one specific neuron per electrode), or by using multi-unit activities.

Recent developments in measurement techniques and in spike sorting algorithms now make it possible to overcome some of the limitations of extracellular recordings. A possible setup using spike sorting for closed-loop stimulation of specific neurons is shown in **Figure 1**. To use the closed loop, e.g., to investigate spike-timing-dependent plasticity, the latency induced by real-time spike sorting must not exceed a few milliseconds. In the following, we will review the advances in MEA recording technology with a special focus on high-density MEAs and show that the high density of the electrodes provides unprecedented signal quality that holds the promise to enable clear and reliable assignment of single spikes to putative neurons (Litke et al., 2004; Prentice et al., 2011; Jäckel et al., 2012).


## **MEA RECORDING TECHNOLOGY**

Planar MEAs are two-dimensional arrangements of recording electrodes for *in vitro* extracellular measurements of cultured neuronal cells or slice preparations. They allow for recording of electrical activity simultaneously on many electrodes at high temporal resolution. Thus, they represent an important tool to study the dynamics in neuronal networks (e.g., Potter et al., 2006; Bontorin et al., 2007; Chao et al., 2007; Rolston et al., 2010; Müller et al., in review).

An important parameter of MEAs is the inter-electrode distance (IED). For multi-electrode arrangements on shafts of needles, such as tetrode configurations (Eckhorn and Thomas, 1993; O'Keefe and Recce, 1993), this distance is small enough (less than 20 µm) that a single action potential can be simultaneously detected on several electrodes. The maximal distance between a neuron and an electrode at which the action potentials of the neuron can still be measured is assumed to be smaller than 50–70 µm, although this greatly depends on the recording setup and the respective preparation (Buzsáki, 2004; Frey et al., 2009b). For traditional, commercially available MEAs, however, the IED was usually much larger [100–200 µm IED and 60–200 metal electrodes on a glass substrate (Stett et al., 2003)] so that MEA recordings constituted, in principle, multiple simultaneous single-electrode recordings. In other words, the distance between the electrodes was too large to detect activity of the same single neuron on multiple electrodes.

(Feldman, 2012). Parts of this graph were adopted from Einevoll et al. (2011).

From the signal processing point of view, this is an unfavorable recording situation, as recording the same action potential with more than one electrode was shown to strongly increase spike sorting performance (Gray et al., 1995). Furthermore, many neurons will lie in between electrodes and not be measured at all. To ensure that neurons lie close to the electrodes, additional measures can be taken during the preparation of the cultures, such as patterning the cells at electrode locations (Shein et al., 2009), but this adds complexity to the experimental procedure.

Recent advances in microtechnology, especially the realization of MEAs in complementary metal–oxide–semiconductor (CMOS) technology (Berdondini et al., 2009; Lambacher et al., 2010; Hierlemann et al., 2011), made it possible to greatly increase the number of electrodes per MEA, for example to 4096 in Berdondini et al. (2009), 11,011 in Frey et al. (2010) or 16,384 in Lambacher et al. (2010), while decreasing the IED to less than 20 µm, a distance comparable to that of the previously mentioned electrode ensembles on needles (e.g., tetrodes). Additionally, this technology provides increased signal quality through on-chip amplification and digitization circuits. Using on-chip multiplexing schemes, high-density MEA (HDMEA) systems have been realized that can read out large numbers of electrodes arranged at high spatial density (Eversmann et al., 2003; Berdondini et al., 2005; Hutzler et al., 2006; Frey et al., 2009a).

The closely spaced microelectrodes of HDMEAs ensure that virtually every neuron on the array is detected by multiple electrodes. Along with the additional information about where the signal originated, the high electrode density greatly improves spike sorting (Gray et al., 1995; Harris et al., 2000; Einevoll et al., 2011; Prentice et al., 2011). **Figure 2** shows an example of such a recording.

However, HDMEAs improve not only recording but also stimulation capabilities. Localized, reliable stimulation of single cells (Hottowy et al., 2012) is a powerful tool for plasticity experiments (Müller et al., in review). Indeed, subcellular-sized electrodes have been shown to provide reliable stimulation of individual neurons *in vitro*. This has been demonstrated using MEAs with particularly high electrode densities that feature only stimulation capabilities, such as those of Braeken et al. (2010) and Lei et al. (2011). Procedures for optimally stimulating a given neuron by using multiple electrodes and complex stimulation patterns are currently under investigation.

HDMEAs featuring recording and stimulation circuitry (Frey et al., 2010; Eversmann et al., 2011) combine the advantages of reliable spike sorting and localized single-neuron stimulation, which paves the way to truly bidirectional experiments at the single-cell level within the network context.

#### **FIGURE 2 | Spike sorting for high-density multi-electrode recordings of cultured neurons. (A)** Example recording of 6 out of 102 electrodes of a HDMEA (left), where mainly two neurons were recorded from, and a close up on two spikes (middle) [similar figure as in Frey et al. (2009a), however, with cultured cortical neurons]. Spikes of individual neurons are recorded by multiple electrodes. Colored traces are identified spikes from two neurons. Note that on the trace of electrode 4, the two spikes are hardly distinguishable and that only combining the information of different channels enables unambiguous spike assignment, see also (Fiscella et al., 2012). (Right top) Several superimposed spike traces of the two neurons. The colored traces are the spike-triggered averages (STAs) of the two neurons on the respective electrodes. The templates of the two neurons (green and violet) spatially overlap (right bottom), indicating that the same set of electrodes recorded from both neurons. **(B)** Spikes (left) and templates (right) for 10 identified neurons (colored traces). For each neuron, the electrode was chosen on which its template had the largest peak-to-peak amplitude (indicated by the colored arrows in the right panel). Note that some of the spikes are visible on more than one electrode (three channels marked by asterisks) and that high-amplitude spikes on one electrode can overlap with spikes on another electrode. Right: for illustration purposes the identified templates are superimposed onto a MAP2 staining of the culture they were recorded from (Bakkum et al., in review). Note that the IED is similar to the distance between neurons.

## **REAL-TIME SPIKE SORTING ALGORITHMS**

The overall spike sorting process consists of a number of nontrivial processing steps (for a schematic of the spike sorting process see, e.g., Einevoll et al., 2011). First, spikes need to be detected in the noisy signals. For multi-electrode-shaft and HDMEA recordings, a single action potential can be detected on multiple electrodes. Then, a short piece of data is usually cut out around the detected events (potentially on multiple electrodes) and structured into a vector in a high-dimensional space. Spike features are then extracted from this piece using, e.g., principal component analysis (Lewicki, 1998). This step aims at reducing the dimension of the vector space in order to keep dimensions that carry most information about the origin of the spikes and to remove dimensions that only carry noise. The goal of the feature space representation and dimensionality reduction is that spikes from the same neuron, i.e., spikes that appear similar to each other, are located close together while being distant from spikes of other neurons. The most demanding step, achieved by using a clustering routine, is to determine how many neurons were recorded from, and which spike was produced by which neuron. Since most standard spike sorting procedures (e.g., Harris et al., 2000; Shoham and Fellows, 2003; Quiroga et al., 2004) need to store all individual spikes before the clustering step, they are not applicable for online spike sorting, with the notable exceptions of Öhberg et al. (1996), where a neural network is used for real-time spike sorting, and Rutishauser and Schuman (2006), where the clusters are formed in an online procedure. The output of the spike sorting consists of the number of neurons, the individual neuronal spike trains, and the prototypic spike waveforms (called templates) for every neuron.
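The first two steps of this pipeline (detection and cutting out waveforms) can be sketched in a few lines of Python. This is a minimal single-electrode illustration, not the procedure of any cited work: the function names, the negative-going threshold convention, and the `dead_time` parameter are assumptions made here for the example.

```python
from typing import List, Sequence


def detect_spikes(trace: Sequence[float], threshold: float, dead_time: int) -> List[int]:
    """Return sample indices at which the trace first drops to or below -threshold.

    After each detection, further crossings are ignored for `dead_time` samples.
    """
    events: List[int] = []
    last_event = -dead_time  # permits a detection right at the start of the trace
    for i in range(1, len(trace)):
        crossed = trace[i] <= -threshold < trace[i - 1]
        if crossed and i - last_event >= dead_time:
            events.append(i)
            last_event = i
    return events


def cut_waveforms(trace: Sequence[float], events: Sequence[int],
                  pre: int, post: int) -> List[List[float]]:
    """Cut `pre` samples before and `post` samples from each event onward.

    Events too close to either end of the trace are skipped.
    """
    return [list(trace[i - pre:i + post])
            for i in events
            if i - pre >= 0 and i + post <= len(trace)]
```

Each cut-out waveform is the vector that would subsequently be passed to feature extraction and clustering.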

Since some data from a certain preparation can already be recorded and stored prior to a specific experiment, templates can be pre-computed using an offline spike sorter. This way, fast and efficient classifiers can be designed based on stored templates that are able to sort spikes in real-time. It does not come as a surprise that almost all research efforts in the direction of real-time spike sorting follow this approach (Friedman, 1968; Mishelevich, 1970; Roberts and Hartline, 1975; Stein et al., 1979; Salganicoff et al., 1988; Yang and Shamma, 1988; Gozani and Miller, 1994; Santhanam et al., 2004; Asai et al., 2005; Takahashi and Sakurai, 2005; Vollgraf et al., 2005; Biffi et al., 2010; Franke, 2011), although not all of these approaches explicitly make use of templates to derive spike classifiers.

So far, real-time spike sorting was mainly achieved by deriving simple hardware-implementable decision rules, based on the spike templates. One such rule is to check whether the spike voltage sample at a given time lies between a lower and an upper threshold relative to the peak of the spike waveform (a so-called hoop), as described in Santhanam et al. (2004). Such decision rules are also used in commercially available recording systems and were individually applied to single electrodes (Nicolelis et al., 1997; Wessberg et al., 2000; Taylor et al., 2002; Guenther et al., 2009).
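A hoop-style decision rule might look as follows in Python. This is a sketch under stated assumptions, not the implementation of Santhanam et al. (2004): here each hoop is taken to be a (sample offset from the peak, lower bound, upper bound) triple, and the names `Hoop` and `passes_hoops` are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed representation: (offset in samples relative to the peak, lower bound, upper bound)
Hoop = Tuple[int, float, float]


def passes_hoops(waveform: Sequence[float], peak_index: int, hoops: Sequence[Hoop]) -> bool:
    """Return True if the waveform passes through every hoop of one unit."""
    for offset, lower, upper in hoops:
        idx = peak_index + offset
        if idx < 0 or idx >= len(waveform):
            return False  # the hoop lies outside the available samples
        if not (lower <= waveform[idx] <= upper):
            return False  # the voltage misses this hoop
    return True
```

Because the rule needs only a handful of comparisons per spike, it maps naturally onto simple hardware, which is consistent with its use on single electrodes.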

However, there have been only a few applications of these approaches to multielectrode arrays in real-time scenarios, such as Takahashi and Sakurai (2005), where independent-component analysis was used to separate individual neuronal activities. The information of several recording channels must be efficiently combined for multi-electrode recordings. Extending a spike sorting method that works for single electrodes to multi-electrodes is not a trivial task and might not be possible for all methods.

As already discussed, HDMEAs impose even higher demands on the methods due to the large overall number of simultaneously recorded neurons and the large number of electrodes that are available per single neuron. There are a number of approaches to spike sorting of HDMEA data (Meister et al., 1994; Litke et al., 2004; Jäckel et al., 2011, 2012; Prentice et al., 2011; Fiscella et al., 2012) but none of those has been evaluated with respect to low latency real-time spike sorting so far. There is also no commercial system with real-time spike sorting available, and it is currently unclear how effective the application of the "hoop"-approach (Santhanam et al., 2004) is. Another ICA-based real-time approach has been described in Takahashi and Sakurai (2005), but the performance of ICA to separate all neurons of HDMEA data sets was found to be limited (Jäckel et al., 2012).

#### **LINEAR FILTERS FOR SPIKE SORTING**

Linear-filter-based spike sorting approaches rely on linear filters that preferentially respond to one template that is considered to represent spikes from a single neuron (Roberts and Hartline, 1975; Stein et al., 1979; Gozani and Miller, 1994; Vollgraf and Obermayer, 2006; Franke et al., 2010; Franke, 2011). Spikes can then be detected by thresholding the filter outputs. An alternative method was suggested in Vollgraf et al. (2005), where a preprocessing filter was designed to be tuned to the average spike waveform of all spikes. However, the detected spikes subsequently have to be clustered in the filter output space, which introduces a complex problem after the filtering. Filter-based methods hold the promise to be suitable even for low-latency real-time spike sorting of MEA data: linear filters can be efficiently implemented in hardware and they scale well with the number of recording electrodes. First, all electrodes can be processed in parallel, and, second, if spikes of one neuron cannot be detected on a given electrode, this electrode can be ignored for the corresponding filter (Jäckel et al., 2011).

It was argued that linear-filter-based spike sorting provides only moderate performance in terms of sorting quality (Wheeler and Heetderks, 1982; Lewicki, 1994; Guido et al., 2006), but it was shown more recently that this could be due to the fact that the candidate filters have been derived in the frequency domain, which was shown to be non-optimal (Vollgraf and Obermayer, 2006).

## **REAL-TIME IMPLEMENTATION**

Numeric computations behind linear filters are based on multiply-accumulate (MAC) operations. For every recording electrode, a set of filter coefficients has to be multiplied with the most recent samples of the recordings, and all multiplications over all electrodes are then summed up. Since multiplications are independent of each other, they can be done in parallel on a digital signal processor (DSP) as a single processing step. DSPs are well suited for implementing MAC-based algorithms, but filter-based spike sorting algorithms can consist of more complex operations [like buffering the filter outputs, thresholding, and estimation of the filter with the maximal output (Franke, 2011)], which require more flexibility than provided by DSPs. Such more complex operations can, however, be implemented by using field-programmable gate arrays (FPGAs). The digital interface of a MEA can be controlled by these fast and reprogrammable devices. By integrating data analysis modules, as well as stimulation logic, directly on the FPGA, the complete closed-loop experiment can be realized in "programmable hardware" (Hafizovic et al., 2007). This obviates the necessity to route the signal path through a PC, which would increase latency and jitter. Another advantage of FPGAs is the relatively large available memory to store filter coefficients.
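The MAC structure described above can be made concrete with a short Python sketch. It computes one filter output sample as a sum over electrodes and filter taps and then selects the filter with the maximal above-threshold output. The data layout (`recent[e][k]` holding the sample at lag `k` on electrode `e`) and the function names are assumptions for illustration; a DSP or FPGA would execute the inner products in parallel rather than in nested loops.

```python
from typing import List, Optional, Sequence

Samples = Sequence[Sequence[float]]  # indexed as [electrode][lag]


def filter_output(recent: Samples, coeffs: Samples) -> float:
    """One output sample: multiply-accumulate over all electrodes and taps."""
    acc = 0.0
    for x_e, f_e in zip(recent, coeffs):
        for x, f in zip(x_e, f_e):
            acc += f * x  # one MAC operation
    return acc


def classify(recent: Samples, filters: Sequence[Samples],
             thresholds: Sequence[float]) -> Optional[int]:
    """Return the index of the filter with the largest above-threshold output, or None."""
    best: Optional[int] = None
    best_value = float("-inf")
    for i, (coeffs, threshold) in enumerate(zip(filters, thresholds)):
        y = filter_output(recent, coeffs)
        if y >= threshold and y > best_value:
            best, best_value = i, y
    return best
```

Setting all coefficients of an electrode to zero corresponds to ignoring that electrode for the given neuron, so those MACs can be skipped entirely in an optimized implementation.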

#### **OVERLAPPING SPIKES**

When two spikes occur nearly at the same time, they can cause problems for the spike sorting: the overlapping signals could be detected as a single spike instead of being recognized as two spikes, and the distorted overall waveform can lead to misclassifications. With multi-electrode recordings, there can be two different types of spike overlaps: (1) temporal overlaps include spikes that occur nearly at the same time but on different electrodes, while (2) spatio-temporal overlaps occur nearly at the same time and also on the same electrodes. Purely temporal overlaps do not cause any problems for filter-based methods, as the filters corresponding to one neuron can be made "blind" to the electrodes of another neuron and can be treated separately. Spatio-temporal overlaps (see **Figure 2**), however, will distort the outputs of both filters. A way to solve this problem is to remove the corresponding waveform from the data once a spike has been detected, and to then recompute the filter outputs (Gozani and Miller, 1994; Franke, 2011). This approach is not well suited for a demanding real-time implementation, since it will generate a larger delay for overlapping spikes than for non-overlapping ones. The realization of an efficient overlap resolution technique for high-electrode-density data in real-time applications is still an open issue.
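The "detect, subtract, recompute" idea can be illustrated with a greedy Python sketch. It is a simplification and not the method of Gozani and Miller (1994) or Franke (2011): templates are assumed to be time-aligned with the data segment, a single electrode is used, and the acceptance criterion (a template is accepted only if subtracting it lowers the residual energy) is an assumption chosen for this example.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def resolve_overlap(segment: Sequence[float],
                    templates: Sequence[Sequence[float]],
                    max_spikes: int = 3) -> Tuple[List[int], List[float]]:
    """Greedily explain a data segment as a sum of aligned templates.

    Returns the indices of the accepted templates and the remaining residual.
    """
    residual = list(segment)
    found: List[int] = []
    if not templates:
        return found, residual
    for _ in range(max_spikes):
        # gain_i > 0 exactly when subtracting template i lowers the residual energy:
        # |r - t|^2 = |r|^2 - 2 * (r.t - 0.5 * t.t)
        gains = [dot(residual, t) - 0.5 * dot(t, t) for t in templates]
        best = max(range(len(templates)), key=lambda i: gains[i])
        if gains[best] <= 0.0:
            break
        found.append(best)
        # Remove the detected waveform, then recompute the gains in the next pass.
        residual = [r - t for r, t in zip(residual, templates[best])]
    return found, residual
```

The loop makes the latency problem visible: a segment containing two overlapping spikes needs two passes over all templates, whereas an isolated spike needs only one.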

#### **DISCUSSION/OUTLOOK**

A number of issues in implementing real-time spike sorting still remain unsolved. It would be desirable to make the linear filters as short as possible to achieve the smallest possible delay (the delay of a causal filter is directly related to its length) (Vollgraf and Obermayer, 2006). However, it has not yet been investigated how short the filters for HDMEA recordings can be while still ensuring a high spike sorting quality. Furthermore, the filters described in Roberts and Hartline (1975) are, in principle, more powerful than a simple matched filter (Vollgraf et al., 2005; Franke, 2011), since they try to suppress spikes from other neurons. This may be useful to resolve overlapping spikes but comes at a price: the filters might be less robust to noise, since they are under stronger constraints. Additionally, spike waveforms of two different neurons may not necessarily be linearly independent, which poses a problem for this kind of linear filter.

Given the high spatial resolution of HDMEAs, it will be interesting to investigate how the quality of the results obtained by using simple spike sorting algorithms compares to that of more complex ones. Promising algorithms for use with high electrode density include the aforementioned "hoop"-approach (Santhanam et al., 2004), or a sorting that is solely based on the identities of the electrodes on which a spike was detected.

An important issue for spike sorting is the occurrence of bursts. Here, a neuron produces potentially many spikes with successively decreasing amplitudes and, possibly, varying waveforms (Fee et al., 1996). For most algorithms, it is not known how the spike sorting error rate is affected by bursts. HDMEAs seem to offer the potential to correctly sort spikes according to their relative amplitude distribution over many electrodes, which may be a robust feature also preserved during bursts (Rinberg et al., 1999).

HDMEAs are a valuable tool to study neural networks and, in combination with real-time spike sorting, hold great promise for new closed-loop experiments to study, e.g., neural plasticity. We have discussed the potential applicability of spike-sorting algorithms for this purpose and come to the conclusion that the combination of hardware-optimized algorithms with HDMEA recordings may enable high-performance spike sorting of more than a hundred neurons with latencies in the range that is required to stimulate and control synaptic plasticity (Feldman, 2012). This may allow for experiments similar to those reported in Fetz (1969), Jackson et al. (2006), Bontorin et al. (2007), and Rebesco et al. (2010), however, with the possibility to use sophisticated feedback stimuli upon occurrence of defined signature signals of single neurons within a local population.

#### **ACKNOWLEDGMENTS**

This work was financially supported by the European Community through the ERC Advanced Grant 267351, "NeuroCMOS" and the Swiss National Science Foundation through the Ambizione Grant PZ00P3\_132245. Felix Franke acknowledges individual support through an EU-funded Marie Curie Training Network of FP6: CT 2006-035854, CELLCHECK.

#### **REFERENCES**

spike sorting with unsupervised learning. *Lect. Notes Comput. Sci.* 3696, 109–114.

Berdondini, L., Imfeld, K., Maccione, A., Tedesco, M., Neukom, S., Koudelka-Hep, M., et al. (2009). Active pixel sensor array for high spatio-temporal resolution electrophysiological recordings from single cell to large scale neuronal networks. *Lab Chip* 9, 2644–2651.


Tomas, J. (2007). A real-time closedloop setup for hybrid neural networks. *Conf. Proc. IEEE Eng. Med. Biol. Soc.* 2007, 3004–3007.


C. (2006). A new technique to construct a wavelet transform matching a specified signal with applications to digital, real time, spike, and overlap pattern recognition. *Digit. Signal Process.* 16, 24–44.


recordings. *J. Neurophysiol.* 108, 334–348.


Mathematical analysis of optimal multichannel filtering for nerve signals. *Biol. Cybern.* 32, 19–24.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 October 2012; paper pending published: 31 October 2012; accepted: 02 December 2012; published online: 20 December 2012.*

*Citation: Franke F, Jäckel D, Dragas J, Müller J, Radivojevic M, Bakkum D and Hierlemann A (2012) High-density microelectrode array recordings and real-time spike sorting for closed-loop experiments: an emerging technology to study neural plasticity. Front. Neural Circuits 6:105. doi: 10.3389/fncir.2012.00105*

*Copyright © 2012 Franke, Jäckel, Dragas, Müller, Radivojevic, Bakkum and Hierlemann. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Sub-millisecond closed-loop feedback stimulation between arbitrary sets of individual neurons

*Jan Müller\*, Douglas J. Bakkum and Andreas Hierlemann*

*Bio Engineering Laboratory, ETH Zürich, Basel, Switzerland*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Antonio Novellino, ETT s.r.l., Italy*

*Jürg Streit, University of Bern, Switzerland*

*\*Correspondence: Jan Müller, Bio Engineering Laboratory, ETH Zürich, Basel, Switzerland. e-mail: 217534@gmail.com*

We present a system to artificially correlate the spike timing between sets of arbitrary neurons that were interfaced to a complementary metal–oxide–semiconductor (CMOS) high-density microelectrode array (MEA). The system features a novel reprogrammable and flexible event engine unit to detect arbitrary spatio-temporal patterns of recorded action potentials and is capable of delivering sub-millisecond closed-loop feedback of electrical stimulation upon trigger events in real-time. The relative timing between action potentials of individual neurons as well as the temporal pattern among multiple neurons, or neuronal assemblies, is considered an important factor governing memory and learning in the brain. Artificially changing timings between arbitrary sets of spiking neurons with our system could provide a "knob" to tune information processing in the network.

**Keywords: closed-loop, high-density microelectrode array, STDP, acausal stimulation, LTD, sub-millisecond**

## **INTRODUCTION**

Different theories describing learning and memory in the brain have been developed, and converging evidence shows that the precise activity timing of individual neurons or groups of neurons may play a paramount role in plasticity of neuronal circuits. The well-known spike-timing-dependent plasticity (STDP) rule states that if two synaptically connected neurons fire within tens of milliseconds of each other, the connectivity strength of the involved synapses gets potentiated or depressed depending on the firing order. In pioneering studies, STDP rules were discovered (Markram et al., 1997) and further characterized (Bi and Poo, 1998; Song et al., 2000) by observing the effect of correlated firing of two neurons either artificially induced by stimulating a pre- and a post-synaptic neuron with two patch-clamps or by applying trains of paired-pulse stimuli to one neuron in the network (Bi and Poo, 1999). Furthermore, computation in a network is likely due not only to the relative timing of two individual neurons but also to the correlated activity of different neurons forming an associated group, i.e., assembly (Chang et al., 2000; Izhikevich, 2006). In this vein, different studies reported the existence of precise time-locked activity patterns of multiple neurons, both *in vivo* and *in vitro* (Abeles and Gerstein, 1988; Bienenstock, 1995; Ikegaya et al., 2004; Rolston et al., 2007). Having a system to generate feedback stimulation quickly and accurately to interact with such activity patterns would expand such studies beyond finding rules governing the plasticity between two cells toward finding rules governing the spatio-temporal dynamics of whole networks or assemblies (Froemke and Dan, 2002; Izhikevich et al., 2004).
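The dependence of the sign of plasticity on firing order is often modeled with an exponential STDP window. The Python sketch below uses that common textbook form purely for illustration; the amplitudes and time constants are placeholder assumptions, not values taken from the studies cited above.

```python
import math


def stdp_weight_change(dt_ms: float,
                       a_plus: float = 0.01,
                       a_minus: float = 0.012,
                       tau_plus_ms: float = 20.0,
                       tau_minus_ms: float = 20.0) -> float:
    """Weight change for one spike pair, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) potentiates; post-before-pre (dt_ms < 0) depresses.
    All parameter values are illustrative placeholders.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0
```

In such a model the effect decays within tens of milliseconds, which is why a feedback system intended to probe STDP must control stimulation timing on a millisecond or finer scale.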

In recent years, different systems to artificially control such feedback stimulation in a closed-loop manner, and thus study neuronal plasticity, have been developed for both *in vivo* (Jackson et al., 2006b; Bontorin et al., 2007; Venkatraman et al., 2009) and *in vitro* applications (Bontorin et al., 2007; Hafizovic et al., 2007; Novellino et al., 2007; Rolston et al., 2010; Zrenner et al., 2010; Wallach et al., 2011). In turn, activity-dependent feedback stimulation was shown to modify the functional connectivity of neuronal networks, both *in vivo* and *in vitro*, as done by reprogramming the motor output of freely behaving primates (Jackson et al., 2006a), changing the functional connectivity in rat forelimb sensorimotor cortex (Rebesco et al., 2010), or shaping *in vitro* neocortical networks into predefined activity states (Bakkum et al., 2008b). *In vivo* systems usually record from needles inserted into a certain location of the brain and subsequently stimulate the same or another site upon the detection of activity. These systems usually comprise the implanted needles, a head stage to amplify the signals, and some means to transmit the acquired signals to a PC. In the case of closed-loop feedback stimulation, these systems usually feature a dedicated very-large-scale-integrated application-specific circuit (VLSI ASIC) (Chen et al., 2009; Rizk et al., 2009; Lee et al., 2010; Azin et al., 2011), or use a general-purpose microcontroller to achieve the respective goals (Mavoori et al., 2005; Zanos et al., 2011). Most *in vitro* systems, on the other hand, use a data acquisition card (DAQ) to sample data for analysis on a PC; feedback stimulation is typically returned through a DAQ card as well.

In order to accurately control the timing of feedback stimulation loops within the timescales relevant for STDP to occur, the delays introduced by a system must be understood. A generic description is given in **Figure 1**. Different system implementations will have different sources for and values of delays. Signal-processing algorithms introduce an inherent delay in the processing itself. Systems that rely on general-purpose computers might introduce latencies and jitter through the presence of data buffers, interrupts, shared resources, or user interactions. In **Figure 1**, the time points *t*<sub>0–3</sub> and *t*<sub>S</sub> specify the occurrence of important events. At *t*<sub>0</sub> = 0, the trigger neuron emits an action potential, which is recorded by the acquisition system. After passing through the signal-processing stages, it is ready to be detected as a spike event at time *t*<sub>1</sub>. From there, the system emits a stimulation pulse that reaches the electrode at time *t*<sub>2</sub>. Conventionally, the loop is considered "closed" at this point. The stimulation pulse evokes neuronal activity, frequently activating nearby axons (Bakkum et al., 2008a) whose signals propagate antidromically toward the soma until eliciting an action potential at time *t*<sub>3</sub>. In the case depicted in **Figure 1**, where the trigger neuron is synaptically connected to the elicited neuron, an additional biological time, *t*<sub>S</sub>, denotes the duration of action potential propagation through the axon of the trigger neuron until synaptic activation of the elicited neuron. In the case where the interval *t*<sub>0</sub> − *t*<sub>1</sub> − *t*<sub>2</sub> is shorter than *t*<sub>0</sub> − *t*<sub>S</sub>, that is, when the signal propagates faster through the artificial feedback-loop than down the axon toward the biological synapse, acausal stimulation, and thus the induction of long-term depression (LTD) according to the STDP rule, is possible.


In order to apply closed-loop stimulation feedback precise and fast enough to study plasticity at the timescales of STDP or acausal stimulation, and flexible enough to interact with cell assemblies, we developed a field-programmable gate array (FPGA)-based system, interfaced with a complementary metal–oxide–semiconductor high-density microelectrode array (CMOS-MEA). The CMOS-MEA features a total of 126 readout and 42 stimulation channels, which can be connected to an almost arbitrary subset of 11,011 electrodes (5 × 7 µm<sup>2</sup> each), arranged in a 2 × 1.75 mm<sup>2</sup> array. The feedback stimulation loop is closed around the CMOS-MEA using an FPGA that performs signal-processing, such as spike-detection and feedback generation. The system functionality was verified using cultured networks of cortical neurons and glia. The minimum programmable latency of the closed-loop stimulation feedback (*t*<sub>0</sub> − *t*<sub>1</sub> − *t*<sub>2</sub>) was 400 µs with jitter below 50 µs, suitable to induce STDP. This is faster than many axonal propagation delays (*t*<sub>0</sub> − *t*<sub>S</sub>), rendering it possible to conduct acausal stimulation experiments. An "event engine" was designed and implemented to trigger feedback stimulation at the occurrence of activity patterns, such as those described in Ikegaya et al. (2004) and Rolston et al. (2007). Patterns could be of almost arbitrary length and could consist of up to thousands of individual elements, limited only by the available resources of the FPGA. Configurations for the event engine could be (re)loaded within milliseconds. Unique to this system is the possibility to enable low-latency, high-throughput, STDP-like experiments as well as acausal stimulations across many individual neurons or neuronal assemblies in parallel, through the simultaneous application of many feedback stimulation loops. To infer changes in synaptic strengths, correlations between putative mono-synaptically connected neurons (Fujisawa et al., 2008) can be monitored using extracellular spikes. In the future, high-throughput STDP experiments will be possible by adding a patch electrode to the system in order to monitor changes in intracellular post-synaptic currents.


## **METHODS**

## **SYSTEM ARCHITECTURE**

The main design goals were to implement (1) multiple feedback stimulation loops (2) to match arbitrary spike patterns with (3) short latencies (<1 ms) and (4) high accuracy (<50 µs) (5) while still recording from all available 126 channels. A main component of the presented system is an FPGA, used to hijack signals traveling between the analog-to-digital converter on the CMOS device and the host PC. Due to the inherent parallel nature of FPGAs, signal-processing and feedback generation using data from additional recording channels can be done without introducing additional delays or jitter.

The system consists of three main parts as shown in **Figure 2**. The first is a high-density CMOS-MEA device featuring on-chip signal-conditioning, stimulation, and analog-to-digital conversion (ADC) units (Frey et al., 2010), described in more detail in the next section. It is plugged into a custom printed circuit board (PCB) that provides reference voltages and clock signals. The digital data provided by the CMOS-MEA are transmitted through a low-voltage differential link to reduce sensitivity to electromagnetic interference as caused, for example, by a nearby incubator. The second part is an FPGA, which reads in the differential signals and subsequently performs signal-processing, spike-detection, and feedback stimulation, as well as compression and framing of the data to be sent via TCP/IP over Ethernet to a host PC, the third main part. On the host PC, further data analysis can be performed online or offline. The host PC is also used to program and control the CMOS-MEA device during experimentation with different settings, like amplifier gain or electrode-to-amplifier routing, so that the device can be adapted for use in different experimental sessions.

#### **CMOS DEVICE**

The CMOS-MEA includes 126 readout channels with programmable amplification (0 dB to 80 dB), on-chip ADCs sampling at 20 kHz, and stimulation capabilities (see below). It features a sensor area of 2 × 1.75 mm<sup>2</sup> with a total of 11,011 electrodes, each with a size of 5 × 7 µm<sup>2</sup> and a pitch of 18 µm. Beneath the electrodes resides a sophisticated analog-switching matrix to connect an almost arbitrary subset of the 11,011 electrodes to the 126 readout channels. The readout electronics were placed outside of the sensor array, instead of directly below the electrodes as done in active-pixel sensor devices (APS) (Berdondini et al., 2009), to provide space for larger circuitry elements that produce less noise. This scheme also allows for reducing the pitch of the electrodes below the spatial requirements of the readout electronics. See Frey et al. (2010) for more details.

#### **FPGA**

A reprogrammable Virtex-II Pro FPGA (Xilinx Inc., San Jose, USA) was used as an intermediate signal-processing device between the CMOS-MEA and the host PC to perform real-time signal-processing, decision-making and feedback generation. The FPGA acquires digital data coming from the differential link and forwards it to a PC over Ethernet. The Virtex-II Pro features an embedded PowerPC microprocessor running at 300 MHz that operates a Linux kernel with a BusyBox-based system. The TCP/IP stack of the Linux kernel handles the network communication and data transfer. As the embedded PowerPC microprocessor is relatively slow compared to modern CPUs, it creates a bottleneck for fast data transmission. We measured the latency between the TCP/IP stack of the FPGA and the host PC to be 83 ± 21 ms (mean ± SD, *N* = 308) at full-frame data transmission, which is larger than the STDP window of up to tens of milliseconds. One solution to this problem might be to stop streaming the full data readout while performing a closed-loop experiment and to route out only the data channels strictly needed for the closed-loop feedback stimulation. This would free some of the bandwidth of the Ethernet link and make it available for faster feedback stimulation. Crucially, however, we would lose the possibility to simultaneously monitor neural activity elsewhere in the cultured network by applying such a paradigm. Another option might be to bypass the Ethernet link by streaming the data directly to a DAQ card, attached to the host PC, and to send stimulation information back through a second link to the FPGA. All these methods are less practical than using the universal TCP/IP connection, which plugs into almost every kind of host PC and does not require additional hardware. An attractive alternative for achieving low latencies was to implement all needed signal-processing and feedback generation directly on the FPGA. The next paragraphs highlight the different building blocks needed to implement such a scheme. Although the FPGA can be reprogrammed at will, this is time-consuming and error-prone and, therefore, not suitable during an experimental session. To allow reconfiguration without reprogramming, a more flexible, module-based design was developed in VHDL and programmed into the FPGA logic together with a software interface to quickly reconfigure the connectivity of the individual modules (see "Event Engine").

**FIGURE 2 | (B)** Photograph of the CMOS-MEA plugged into the custom printed-circuit board, which is connected through an LVDS link to the Xilinx FPGA. The feedback stimulation loop is closed around the CMOS-MEA and the FPGA. The components are described in detail in the text.

#### **SPIKE-DETECTION**

One such signal-processing building block is spike-detection, which extracts spiking events from the raw voltage traces recorded at the electrodes. Spike-detection is implemented as a threshold crossing. The signals are first digitally band-pass filtered with a two-tap Butterworth filter (500 Hz–3 kHz) to suppress DC offset components and higher-frequency noise; this emphasizes the frequency components of the action potentials. The detection threshold level is user-programmable and typically set around 4.5 times the noise standard deviation. During experimentation, this value can be determined by software running online on the host PC. After an identified spike event, a programmable refractory period, set to 3 ms, was applied. After stimulation, detection was disabled for 3 ms as well, to avoid oscillating loops due to feedback stimulation artifacts being falsely classified as spikes.
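The threshold-and-refractory logic described above can be sketched in software as follows. This is only an illustration, not the VHDL implementation: it assumes an already band-pass-filtered trace, estimates the noise level crudely as the standard deviation of the whole trace, and assumes that spikes are detected on negative-going crossings. The function name and these choices are hypothetical.

```python
import numpy as np

FS_HZ = 20_000  # sampling rate stated in the text

def detect_spikes(filtered, threshold_factor=4.5, refractory_ms=3.0, fs_hz=FS_HZ):
    """Return sample indices of negative threshold crossings.

    Assumptions (not from the source): negative polarity, and the noise
    SD approximated by the SD of the entire filtered trace.
    """
    filtered = np.asarray(filtered, dtype=float)
    threshold = -threshold_factor * np.std(filtered)
    refractory = int(round(refractory_ms * 1e-3 * fs_hz))  # 60 samples at 20 kHz
    spikes = []
    for i in range(1, len(filtered)):
        # A crossing happens when the trace drops below the threshold.
        crossed = filtered[i] < threshold <= filtered[i - 1]
        # Ignore crossings that fall inside the refractory period.
        if crossed and (not spikes or i - spikes[-1] >= refractory):
            spikes.append(i)
    return spikes

# Synthetic trace: unit-amplitude alternating "noise" plus three large deflections.
trace = np.ones(2000)
trace[1::2] = -1.0
trace[[500, 520, 1000]] = -20.0  # sample 520 lies within 3 ms of sample 500
print(detect_spikes(trace))
```

In this synthetic example the deflection at sample 520 is suppressed because it falls within the refractory period of the event at sample 500.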

## **EVENT ENGINE**

To avoid time-consuming reprogramming of the FPGA fabric, a more flexible and modular event-based scheme for feedback generation (Event Engine) was designed and implemented. The event engine consists of small building blocks, called modules, each of which implements a specific simple function. Each module has one or more event sinks as inputs and one event source as an output. By connecting the event sources to the appropriate event sinks, almost arbitrary pattern-matching and event-handling algorithms can be achieved. **Table 1** summarizes the implemented modules. **Figure 3** shows different basic configurations to achieve defined pattern matching. In **Figure 3A**, the simplest closed-loop configuration is depicted, where the source of a spike-detection module gets connected to the sink of a delay unit and from there to a stimulation function generator. Whenever the source produces an event (i.e., in this case detects a spike), the sink triggers a stimulation pulse after a defined time delay. By means of software, the sources can be connected to sinks dynamically, within milliseconds, while running an experiment, such that pattern matching can adapt to ongoing activity in the living culture. One notable property is the lack of time binning. Each spike is represented as a single pulse with a temporal resolution set by the sampling frequency, i.e., 20 kHz. As a consequence, certain desired operations might not work as intended, because biological neurons have some inherent variability in when they spike. For example, the user might want to match a pattern where two neurons spike together (see **Figure 3E**). To achieve this, a SPREADING module "spreads" the spike pulse in time in order to compensate for jitter. This way, the subsequent AND module can generate an output event whenever the two neurons fire together within a specified range of time. As discussed in Ikegaya et al. (2004) and Rolston et al. (2007), 2 ms is suitable for most recurring patterns. Another module can be used to convert the spread-out spike pulse back into a single one-shot event, which can then be used, for example, to trigger the stimulation unit only once per spread-out pulse. The particular selection of implemented modules (as listed in **Table 1**) represents a minimal set, which, if combined in the appropriate way, allows for matching different kinds of events, such as specific spatio-temporal activity patterns, time sequences, network bursts, local bursts, etc. In order to keep the event engine as flexible as possible and adaptable to different, possibly unforeseen pattern-matching sequences, the implementation of a minimal set of small building blocks was chosen over an approach in which each envisioned pattern would require a single, but more complex and less flexible, building block. Thus, available modules can be combined in an almost unlimited number of ways, limited only by the available FPGA memory that keeps track of all source-sink associations.
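The coincidence-matching chain (SPREAD, AND, then conversion back to a one-shot event) can be mimicked in software on per-sample Boolean event streams. The sketch below is a hypothetical model of that behavior, not the FPGA implementation; the function names and the array representation are assumptions made for illustration.

```python
import numpy as np

FS_HZ = 20_000                      # sampling rate stated in the text
SPREAD_SAMPLES = 2 * FS_HZ // 1000  # 2 ms expressed in samples (40)

def spread(events, width):
    """SPREAD: hold each single-sample event high for `width` samples."""
    events = np.asarray(events, dtype=bool)
    out = np.zeros_like(events)
    for i in np.flatnonzero(events):
        out[i:i + width] = True
    return out

def and_gate(a, b):
    """AND: active whenever both inputs are active."""
    return np.logical_and(a, b)

def onset(events):
    """Inverse of SPREAD: keep only the first sample of each active run."""
    events = np.asarray(events, dtype=bool)
    previous = np.concatenate(([False], events[:-1]))
    return events & ~previous

# Two channels spiking 1 ms (20 samples) apart.
a = np.zeros(400, dtype=bool)
b = np.zeros(400, dtype=bool)
a[100] = True
b[120] = True

trigger = onset(and_gate(spread(a, SPREAD_SAMPLES), spread(b, SPREAD_SAMPLES)))
print(np.flatnonzero(trigger))  # sample index at which stimulation would be triggered
```

Without the spreading step, the AND of the two raw single-sample streams would never be active unless both spikes fell on exactly the same sample.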

## **STIMULATION/FUNCTION GENERATOR**

The CMOS-MEA has 42 on-chip integrated stimulation units, which are driven by two 10-bit DACs. A function generator is implemented on the FPGA to produce arbitrary stimulation waveforms. A defined waveform has to be programmed at the start of the experiment. We used biphasic voltage pulses, first positive then negative, of 200 µs duration per phase and ±300 or ±400 mV amplitude. The stimulation buffers can be chosen to operate in voltage or current mode (Livi et al., 2010). Whenever the event engine outputs an event, the appropriate stimulation buffer, located on the CMOS-MEA, gets connected, and the function generator starts its operation. Stimulation artifacts on the readout channels could result in falsely detected spikes and cause a reverberation problem for low-latency feedback-loops. Therefore, spike-detection is blanked during a time period of a few milliseconds after stimulation onset.
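As an illustration of the stimulation waveform described above, the following sketch builds the sample sequence of a positive-first biphasic pulse. The update rate of 100 kHz is an assumption chosen for the example; the actual DAC update rate and voltage scaling are not specified in the text, and the function name is hypothetical.

```python
import numpy as np

def biphasic_pulse(amplitude_mv=300.0, phase_us=200.0, update_rate_hz=100_000):
    """Positive-first biphasic voltage pulse as an array of millivolt samples.

    update_rate_hz is a hypothetical value used only for illustration.
    """
    samples_per_phase = int(round(phase_us * 1e-6 * update_rate_hz))
    positive = np.full(samples_per_phase, amplitude_mv)
    negative = np.full(samples_per_phase, -amplitude_mv)
    return np.concatenate((positive, negative))

pulse = biphasic_pulse()
print(len(pulse), pulse[0], pulse[-1])
```

The two phases have equal duration and opposite amplitude, so the waveform is charge-balanced in the voltage sense (its samples sum to zero).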

## **CULTURES**

The performance of the closed-loop system was tested with cortical neurons and glia grown over the CMOS-MEA. Animal handling protocols were approved by the Basel-Stadt Veterinary office according to Swiss federal laws on animal welfare. Briefly, a timed-pregnant rat was anesthetized using isoflurane, then decapitated to obtain E18 embryos. Cortices were extracted from the embryos and dissociated enzymatically in trypsin (Invitrogen) followed by mechanical trituration. A layer of laminin (Sigma) over a layer of poly(ethyleneimine) (Sigma) was used to adhere between 20,000 and 40,000 cells. Plating media consisted of 850 µL of Neurobasal, supplemented with 10% horse serum (HyClone), 0.5 mM GlutaMAX (Invitrogen), and 2% B27 (Invitrogen). After 24 h, the plating media was changed to growth media: 850 µL of DMEM (Invitrogen), supplemented with 10% horse serum, 0.5 mM GlutaMAX, and 1 mM sodium pyruvate (Invitrogen). Cultures matured for 3–4 weeks prior to experimentation, and experiments were conducted inside an incubator to control environmental conditions (34.5 °C and 5% CO<sub>2</sub>). For further details see Hales et al. (2010).

#### **Table 1 | A minimal set of modules making up the event engine.**

*Configurable parameters are represented in italics (t, p, n, c), and input events are denoted in bold letters (A, B).*

**FIGURE 3 | Example configurations of the event engine.** Stitching together the appropriate set of modules allows the event engine to be configured to match a variety of patterns in order to trigger feedback stimulation. Different minimal examples are shown. **(A)** A DELAY element is inserted after a DETECTION module to trigger STIMULATION after a programmable delay. This configuration, with the delay set to zero, was used for the experiments shown in **Figures 5** and **7**. **(B)** Either an event on channel **A** OR an event on channel **B** triggers stimulation. **(C)** Stimulation is triggered (trace **C**) only if there is no event on channel **B** within a programmable time window before and after an event on channel **A**. **(D)** A RAND module propagates or discards the events, in this case with a probability of ½. **(E)** Events on channel **A** and channel **B** are fed through SPREAD modules into an AND module, which outputs events (on trace **C**) when both inputs are active. The intermediate trace **C** is fed into a SPREAD<sup>−1</sup> module to trigger stimulation at the onset of the event. **(F)** When the event on channel **B** happens subsequently to an event on channel **A**, an event **C** is generated. **(G)** An ACCU module is set to increment when an event on either channel **A** OR channel **B** happens, and to decrement when a delayed event from channel **B** (trace **C**) arrives. In this example, the ACCU threshold is set to three events. Once the threshold is reached, the internal counter gets reset to zero. When the three input events happen shortly after each other, a stimulation event gets emitted. As shown in the example, the delayed channel **B** (trace **C**) decrements the accumulator and thus delays or prohibits crossing of the threshold. **(H)** All modules can be combined to achieve almost arbitrarily complex pattern matching. For example, this configuration was used to match the pattern of **Figure 6**. The formula describing this pattern is: STIMULATION(*1*, SPREAD<sup>−1</sup>(AND(AND(SPREAD(*2 ms*, **A**), SPREAD(*2 ms*, **B**)), SPREAD(*2 ms*, **C**)))).

#### **EVALUATION AND RESULTS**

This section begins with data characterizing the suitability of our setup to perform closed-loop feedback stimulation experiments, using cultures of cortical neurons and glia for validation. First, the process of identifying neurons to be used in closed-loop feedback stimulation will be described. Then the system's loop speed and jitter performance will be quantified. An example event engine was run to provide stimulation feedback, triggered by an activity pattern. Preliminary data and techniques to analyze the consequences of such stimulation on the functional connectivity between neurons will be presented and discussed. Finally, an experimental session to induce LTD through acausal stimulation will be sketched, and its implications discussed. Data in the figures demonstrate proof-of-principle experiments from individual cultures; the setup has, however, been successfully applied to many tens of cultures.

#### **RECORDING/STIMULATION SELECTIVITY**

High-density CMOS-MEAs can potentially sample from complete neuronal populations. Due to the high density (18 µm pitch) of the CMOS electrode array, every neuron lying on the 2 × 1.75 mm<sup>2</sup> array can be bidirectionally addressed. On the other hand, when stimulating one electrode, a defined subset of neurons is often directly activated in response (Bakkum et al., 2008a). **Figure 4** shows such a scenario. In **Figure 4A**, one electrode, marked with a black cross, was stimulated multiple times, and the evoked activity was recorded during a window of 12 ms after stimulation onset. The median calculated over all voltage traces filters out noise and spontaneously spiking neurons/traces. Reliable activity (usually with a jitter on the order of 100 µs or below) is considered to be due to an antidromic action potential initiated at the neuron's axon (Lipski, 1981). Since only a subset of 126 out of the 11,011 electrodes can be read out simultaneously, the stimulation sequence was repeated multiple times, each time with a different subset of electrodes, until all electrodes were covered. After recording all sequences, the traces of the individual recordings were aligned in time. To highlight the electrodes that recorded elicited action potentials, the negative peak of the recorded voltage level during 12 ms after stimulation is color-coded and clipped at −100 µV. The red circles around the exemplified 11 spots highlight neurons that fired directly elicited action potentials. Their traces are individually shown in **Figure 4B**, demonstrating that the elicited action potentials were reliably and precisely fired after a given time, and only in a few cases (traces 2, 4, 6, 9) did activity with different timing occur. Such activity could stem from a different neuron that happened to sit near the same electrode and/or from action potentials occurring within a coincident network burst.

#### **FIGURE 4 | Identification of directly evocable action potentials.**

**(A)** Data recorded in response to repeated stimulation of one electrode (black cross) from the whole 2 × 1.75 mm<sup>2</sup> sensor area of the CMOS-MEA (each pixel is one electrode). Recording electrode configurations were scanned across the array in sets of 126 electrodes at a time. For every configuration, data were recorded for 12 ms after stimulation onset. The amplitude of the negative voltage peak within these 12 ms is color-coded and clipped at −100 µV. Blue indicates the detection of directly evoked somatic action potentials. **(B)** Example traces from 11 somas and the stimulation pulse are shown on the right. Traces from 30 stimulation trials are overlaid, with the median trace highlighted in black. The stimulation artifact was blanked prior to recording. Numbers are ordered by increasing distance from the stimulation site.

As shown, recording and stimulation with the CMOS-MEA feature high spatial resolution and, therefore, are locally very confined. However, the facts that one electrode can detect signals from more than one neuron, and that the stimulation through one electrode can directly evoke action potentials of more than one neuron, have to be considered when planning closed-loop feedback stimulation experiments. In this case, the feedback-loop is not closed between two neurons, but includes two sets of neurons.

#### **FEEDBACK LATENCIES**

According to the rules of STDP, the timing window to induce long-term potentiation (LTP) at synapses is between less than a few milliseconds and up to tens of milliseconds post-synaptic activation before and after pre-synaptic activity. Thus, even though feedback cycles of 5–10 ms are fast enough to induce LTP, we aimed at reaching cycle-times below 1 ms to enable the system to perform acausal stimulations, as explained in the respective section below.

**Figure 5** shows the overlay of 128 traces of the feedback-loop. Here, the event engine was configured to detect events on only one channel and stimulate immediately after detection, i.e., without any further delays in order to test the system performance (cf. **Figure 3A**). The traces are aligned at the onset-time of the stimulation pulse, and time zero is set to be at the negative peak of the spike of the trigger neuron. In red are the traces from the trigger neuron, and in black, the traces from the elicited neuron. The timing between a trigger neuron spike and the onset of the stimulation pulse was 200 µs, i.e., 4 sampling periods. This delay arises as follows: 50 µs (1 sampling period) was used to buffer the incoming data in the FPGA; 100 µs accounted for the delay of the two-tap Butterworth filter; and the last 50 µs account for all other delays, such as synchronizing the stimulation pulse with the recording sampling time. Delays for sending digital data between the CMOS device and the FPGA were on the order of nanoseconds and thus are negligible. When stimulating with biphasic voltage pulses, the steep negative transition, which injects negative current (I = C × dV/dt), is the time point at which a cell is activated (Wagenaar et al., 2004; Bakkum et al., 2008b). Thus, this time point was taken to measure the latency between stimulation and an elicited spike. In the case depicted in **Figure 5**, this timing is 0.85 ms, and the overall latency between trigger neuron activity and a spike on the elicited neuron was 1.25 ms.
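The latency figures quoted above can be checked with simple arithmetic. The sketch below assumes that the interval from stimulation onset to the steep negative transition equals the 200 µs duration of the first (positive) phase of the biphasic pulse; this reading is an interpretation that is consistent with the reported totals, not an explicit statement of the text.

```python
# Contributions to the detection-to-stimulation delay, in microseconds (from the text).
BUFFER_US = 50      # one sampling period of input buffering in the FPGA
FILTER_US = 100     # delay of the band-pass filter
OTHER_US = 50       # remaining delays, e.g., synchronization with the sampling clock

# Assumption: the negative transition follows onset by one 200 µs phase.
FIRST_PHASE_US = 200
EVOKED_SPIKE_US = 850   # negative transition to elicited spike (Figure 5 case)

spike_to_stim_onset_us = BUFFER_US + FILTER_US + OTHER_US
spike_to_activation_us = spike_to_stim_onset_us + FIRST_PHASE_US
spike_to_elicited_us = spike_to_activation_us + EVOKED_SPIKE_US

print(spike_to_stim_onset_us)   # 200 µs, i.e., 4 sampling periods of 50 µs
print(spike_to_activation_us)   # 400 µs closed-loop latency
print(spike_to_elicited_us)     # 1250 µs overall latency
```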

As can be seen in **Figure 5**, besides achieving short feedback cycles, another advantage of using digital hardware (in this case FPGAs) for feedback generation is that no additional jitter is introduced, as such a system is fully deterministic. Sources of jitter in other systems (Hafizovic et al., 2007; Rolston et al., 2010) that close the feedback-loop around general-purpose or real-time personal computers are, for example, system interrupts that might disrupt the data processing, or buffer sizes of the USB, TCP/IP, or DAQ cards, which have to be set large enough in order to guarantee full data throughput. Usually these buffers have a size larger than one sample period. Depending on when an event happened inside this buffer, the latency could be larger or smaller and thus introduce jitter. This can be avoided by using digital hardware to hijack the data stream. In our case, the jitter was below ±50 µs and arose from the fact that neural activity is, of course, not aligned to the sampling period of the CMOS-MEA (50 µs). The exact time of the threshold crossing relative to the negative spike peak depends, among other things, on the slope of the spike waveform. Since the recorded signal was not interpolated between samples, this was an unavoidable source of jitter.
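Why sampling alone bounds this jitter by one sampling period can be shown with a simplified model: an event occurring at an arbitrary time is only seen at the next sampling instant. The sketch below ignores the dependence on the spike slope mentioned above, and the function name is hypothetical.

```python
import numpy as np

SAMPLE_PERIOD_US = 50.0  # sampling period of the CMOS-MEA (20 kHz)

def quantization_delay_us(event_times_us, period_us=SAMPLE_PERIOD_US):
    """Delay from each event to the first sampling instant at or after it."""
    t = np.asarray(event_times_us, dtype=float)
    return np.ceil(t / period_us) * period_us - t

# Events spread evenly over 1 ms in 0.5 µs steps.
delays = quantization_delay_us(np.linspace(0.0, 1000.0, 2001))
print(delays.min(), delays.max())  # the spread stays within one sampling period
```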

## **PATTERN MATCHING**

To demonstrate the event engine in operation, feedback stimulation, triggered by an activity pattern, was performed. For the dataset presented in **Figure 6**, the event engine was programmed according to **Figure 3H** and classified spontaneous activity patterns as follows: A neuron recorded on electrode N2 fires an action potential; then an action potential is recorded from a neuron on electrode N3 after 3 ms; finally an action potential is recorded on electrode N1 after another 1.5 ms. Each individual event occurrence was allowed to have a jitter of ±1 ms. After successful identification of such a pattern, a stimulation pulse was emitted to elicit action potentials on a different neuron, NE. The cell cultures under investigation typically exhibited bursting behavior, and almost all of the patterns occurred during bursts. During bursts, the cells usually fired more than once at an elevated frequency, which explains why the neurons on electrodes N1–N3 showed additional spikes "outside" of the detected pattern. Nevertheless, the pattern matching event engine identified 22 activity pattern occurrences during 12 min of recording.
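The N2, N3, N1 sequence described above can be expressed as an offline check on spike-time lists. The sketch below is one possible reading, in which the ±1 ms tolerance is applied to each inter-spike interval; the event engine itself realizes the match with SPREAD and AND modules, and the function name and parameters here are hypothetical.

```python
def match_pattern(n2_ms, n3_ms, n1_ms, d23_ms=3.0, d31_ms=1.5, jitter_ms=1.0):
    """Return (t_N2, t_N3, t_N1) triples matching N2 -> N3 -> N1.

    Assumption: the jitter tolerance applies to each interval separately.
    """
    matches = []
    for t2 in n2_ms:
        for t3 in n3_ms:
            # N3 must follow N2 by about d23_ms.
            if abs((t3 - t2) - d23_ms) > jitter_ms:
                continue
            for t1 in n1_ms:
                # N1 must follow N3 by about d31_ms.
                if abs((t1 - t3) - d31_ms) <= jitter_ms:
                    matches.append((t2, t3, t1))
    return matches

# Spike times in milliseconds for the three electrodes (made-up values).
print(match_pattern([10.0, 50.0], [13.4, 58.0], [14.5, 70.0]))
```

Only the first triple in this made-up example satisfies both interval constraints; the spikes at 50, 58, and 70 ms are too far apart.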

#### **CORRELATION ANALYSIS**

To assess the connectivity between different neurons and the efficacy of change induced by the closed-loop feedback stimulation, cross-correlation curves (Perkel et al., 1967) were computed between spike trains of the trigger neuron and the elicited neurons. When exceeding a 95% confidence interval (Brillinger, 1976), correlation is considered significant. **Figure 7** shows three descriptive cases, comparing the cross-correlation curves from 1 h of spontaneous activity before and after closed-loop feedback stimulation was applied for 1 h. To evaluate the significance of the change, a procedure similar to that in Fujisawa et al. (2008) was used. Briefly, the two 1 h recordings of spontaneous activity were divided into smaller segments of 10 min duration, which were randomly assigned to be before or after the closed-loop stimulation. Cross-correlation from this shuffled data was computed for both "before" and "after," and the difference was evaluated. This procedure was repeated 1000 times to generate a surrogate data set. Points on the *x*-axis, where the true difference is larger than 95% of the surrogate data, were assigned to be significant and are marked with an orange bar in **Figure 7**. Assessing the true connectivity of neuronal networks by means of extracellular measurements is difficult, and using the cross-correlation to that end is not ideal, as effects like common inputs or firing rate changes cannot be easily explained. However, in our context of evaluating the effect of feedback stimulation, we do not necessarily seek to precisely explain the changes in network connectivity, but rather to demonstrate that a change occurred at all and to what extent.
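A raw cross-correlogram of the kind used in this analysis can be computed as a histogram of spike-time differences. The sketch below is a minimal illustration with assumed window and bin sizes; it does not implement the confidence intervals or the surrogate-based significance test described above, and the function name is hypothetical.

```python
import numpy as np

def cross_correlogram(ref_ms, target_ms, window_ms=10.0, bin_ms=1.0):
    """Histogram of (target - reference) spike-time lags within +/- window_ms."""
    ref = np.asarray(ref_ms, dtype=float)
    target = np.sort(np.asarray(target_ms, dtype=float))
    edges = np.arange(-window_ms, window_ms + bin_ms, bin_ms)
    counts = np.zeros(len(edges) - 1, dtype=int)
    for t in ref:
        # Restrict to target spikes inside the window around this reference spike.
        lo = np.searchsorted(target, t - window_ms, side="left")
        hi = np.searchsorted(target, t + window_ms, side="right")
        counts += np.histogram(target[lo:hi] - t, bins=edges)[0]
    return edges, counts

# Made-up spike trains: the target neuron fires 2.5 ms after each reference spike.
reference = np.array([100.0, 200.0, 300.0])
edges, counts = cross_correlogram(reference, reference + 2.5)
print(edges[np.argmax(counts)])  # left edge of the bin holding the peak lag
```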

#### **ACAUSAL STIMULATION**

One motivation for very short feedback cycles is to open the possibility of acausal stimulation. If the closed-loop stimulation (*t*<sub>0</sub> − *t*<sub>2</sub>) is faster than the time it takes the action potential to travel along the axon and reach the synapses (*t*<sub>0</sub> − *t*<sub>S</sub>), acausal stimulation and, therefore, induction of LTD by means of closed-loop feedback stimulation is possible. The time that it takes for an action potential, initiated at the axon hillock, to propagate down the axonal arbor to the synapses depends on the propagation velocity of action potentials along axons and the length of the axons. Action potential conduction velocities in unmyelinated axons were reported to be around 0.2–0.4 m s<sup>−1</sup> (Debanne et al., 2011). As demonstrated in **Figure 5**, the closed-loop stimulation (*t*<sub>0</sub> − *t*<sub>2</sub>) can be as fast as 0.4 ms, meaning acausal stimulation is possible for trigger neurons (*t*<sub>0</sub>) with unmyelinated axons that synapse to an elicited neuron (*t*<sub>3/S</sub>) after a minimum axial length of 80–160 µm. **Figure 8** shows such an acausal stimulation procedure. First, before applying a closed-loop, the activity between different neurons was measured and then evaluated by computing the cross-correlation. In the example in **Figure 8**, the firing activity of the second neuron B with respect to the first neuron A was elevated around a delay of 2.5 ms, implying neuron A has a functional connection with neuron B. Integrating the cross-correlation curve where it exceeds the confidence intervals, around 2–3 ms after the reference time zero, reveals a probability of around 40% for neuron B to spike 2–3 ms after neuron A had fired. Once two such neurons have been identified, closed-loop stimulation can be applied between them with a very short feedback cycle. In the presented example, the delay from the trigger neuron to the elicited spike was around 1 ms, smaller than the average delay between the occurrence of their spontaneous action potentials. The closed-loop feedback stimulation was applied for 20 min, and, afterwards, the correlation was measured again. Now, the correlation no longer exceeded the confidence intervals at around 2–3 ms after the trigger neuron. Note, however, that Bi and Poo (1998) have shown that LTD can only be induced if the spontaneous synaptic efficiency is not strong enough to evoke a post-synaptic action potential. Otherwise, the post-synaptic Ca<sup>2+</sup> influx dominates, and LTP will occur. For the experiment shown in **Figure 8**, the elicited neuron spiked only a fraction of the time, and provided an intermediary synapse; in all other cases, evoked excitatory post-synaptic currents (EPSCs) remained below the threshold. Further experiments are required before drawing conclusions. Additionally, to explore LTD and LTP in more depth and, advantageously, across many synapses simultaneously, extracellular recordings targeted to many trigger neurons and an elicited neuron on the CMOS-MEA could be combined with an intracellular patch-clamp, attached to the elicited neuron and measuring the incoming EPSCs.

**FIGURE 7 | Cross-correlation analysis.** Three descriptive cases of changes in correlated firing between trigger neurons and elicited neurons. Spontaneous activity was recorded 1 h before and 1 h after the application of closed-loop feedback stimulation. Periods where the difference exceeded a confidence bound (see text) were assigned to be significant and are indicated with an orange bar. The 95% confidence intervals are indicated with black dashed lines. Cross-correlation is computed based on trains with 9000–13000 spikes per neuron. **(A)** Relative probability remained constant, but the timing between trigger neuron and elicited neuron changed and became more synchronous. **(B)** The elicited neuron became more likely to fire in concert with the trigger neuron. **(C)** Relative timing within a network burst changed.

The time delay between the plotted spikes of neuron A and neuron B was chosen to align with the maximum peak of the cross-correlation curve. Bottom: Cross-correlation curve of spike-times of neuron B with respect to neuron A. 95% confidence intervals are drawn with dotted red lines. Cross-correlations were computed with trains having 2000–3000 spikes. Significantly elevated correlated activity of neuron B can be detected around 2.4 ± 0.4 ms after neuron A fired an action potential. **(B)** Same

cross-correlation no longer shows a significant peak for latencies larger than zero. The time delay between the plotted spikes of neuron A and neuron B was again chosen to align with the maximum peak of the cross-correlation. **(D)** Geometric sketch of the situation. The trigger neuron A and its axon are shown in green and the elicited neuron B in yellow. **(E)** Comparison of the two cross-correlation curves before (black) and after (red) the acausal stimulation with their 95% confidence intervals.

## **DISCUSSION**

With the presented system, capable of applying multiple flexible feedback-loops simultaneously, many different experiments will be possible. The dynamic clamp technique proved to be a valuable tool for investigating the membrane dynamics involved in action potential generation (Destexhe and Bal, 2009; Economo et al., 2010). In such systems, intracellularly applied closed-loop-controlled voltage feedback enables the manipulation of cell membrane functions. Similarly, extracellularly applied closed-loop stimulation feedback, as presented in this work, might provide a useful tool for investigating cellular and network level plasticity and enable the manipulation of neuronal network functions. Potential questions include how information processing and the amount of memory that can be stored in a cultured network are influenced by adding one or more feedback-loops. Further experiments might involve more detailed studies of both LTP and LTD of individual sets of neurons by implementing causal and acausal feedback-loops between them. Using the pattern matching capabilities of the event engine will allow for extending plasticity studies to the network level. For example, investigations of the temporal order and history of spike trains, similar to those reported by Froemke and Dan (2002) and Ikegaya et al. (2004), could be performed in parallel on multiple different neurons and pathways; in addition, the respective pathways could be dynamically altered by targeted closed-loop feedback stimulations. Further rules governing plasticity beyond the classical STDP could be investigated.

An inherent limitation of extracellular recording systems is the inability to directly measure EPSCs. Conventional plasticity studies rely on patch-clamp to directly measure the EPSC to assess synaptic connectivity strength. Since these currents are not accessible with extracellular measurement techniques, indirect methods to assess synaptic connectivity have to be employed. Although cross-correlation seems attractive and is commonly used to assess connectivity, either between different brain regions or networks, or even between individual cells, it remains to be investigated to what extent correlation analysis unveils the direct synaptic strength between neurons. A combination of patch-clamp techniques and MEAs would provide a more direct way to measure the EPSC than through the computation of cross-correlation curves. By patching the post-synaptic neuron, EPSC strengths can be directly measured and related to extracellularly recorded pre-synaptic activity. Combining the advantages of both techniques, i.e., the precise EPSC measurements through patch-clamp, and the large-scale parallel, extracellular measurements and stimulations through CMOS-MEAs with flexible feedback-loops programmed by the event engine, would greatly expand experimental horizons. One could study the plasticity of hundreds of synapses in parallel. Furthermore, by hooking up the patch-clamp system to the event engine through dedicated spike-detection and stimulation modules, feedback-loops could be applied through the patch-clamp between extracellularly recorded and intracellularly stimulated (or vice versa) neurons.

Although, due to the high density of electrodes, potentially all neurons can be read out individually, the recorded signals from two different neurons located close to each other are sometimes difficult to separate. A spike-sorting step, incorporated prior to event detection, can help to separate even neurons recorded with the same electrodes. This holds in particular for high-density electrode arrays (Franke et al., 2012). Spike-sorting might enable the identification of neurons with smaller spiking amplitudes, close to the noise level, and the identification of more neurons or cell assemblies. However, a drawback of more sophisticated spike-sorting algorithms is an additional time delay in the detection phase (*t*<sub>0</sub> − *t*<sub>1</sub>). Spike-sorting, together with intracellular stimulation through patch-clamp as described above, could eliminate the aforementioned limitations in section "Recording/stimulation selectivity": Trigger spikes can be assigned to an individual neuron through spike-sorting, and stimulation pulses will only activate action potentials in the patched neuron.

## **CONCLUSION**

By using an FPGA to perform signal processing as well as feedback generation, fast and flexible loop cycles have been realized. Our approach of using reconfigurable digital hardware to perform computationally intensive tasks, such as signal filtering, spike identification, decision-making, and feedback generation, is a compromise between the traditionally employed general-purpose (micro-)processors, which introduce additional latencies and jitter, and highly integrated application-specific circuits (VLSI ASICs), which are much less flexible in terms of adaptations to new experimental paradigms. Our achieved closed-loop feedback latencies are lower than many axonal propagation delays and thus enable acausal stimulation. Due to the flexible event engine, high-throughput experiments applying many feedback-loops in parallel are conceivable.

#### **ACKNOWLEDGMENTS**

We thank Milos Radivojevic and Marta Lewandowska for culturing assistance and Felix Franke, Michele Fiscella, Ian Jones, and David Jäckel for helpful discussions. This work was financially supported through the ERC Advanced Grant 267351 "NeuroCMOS" and the Swiss National Science Foundation Ambizione Grant PZ00P3\_132245.

#### **REFERENCES**


(2011). A battery-powered activity-dependent intracortical microstimulation IC for brain-machine-brain interface. *IEEE J. Solid-State Circ.* 46, 731–745.

Bakkum, D. J., Chao, Z. C., and Potter, S. M. (2008a). Long-term activity-dependent plasticity of action potential propagation delay and amplitude in cortical networks. *PLoS ONE* 3:e2088. doi: 10.1371/journal.pone.0002088

Bakkum, D. J., Chao, Z. C., and Potter, S. M. (2008b). Spatio-temporal electrical stimuli shape behavior of an embodied cortical network in a goal-directed learning task. *J. Neural Eng.* 5, 310–323.

Berdondini, L., Imfeld, K., Maccione, A., Tedesco, M., Neukom, S., Koudelka-Hep, M., et al. (2009). Active pixel sensor array for high spatio-temporal resolution electrophysiological recordings from single cell to large scale neuronal networks. *Lab Chip* 9, 2644–2651.


and real-time spike sorting for closed-loop experiments: an emerging tool to study neural plasticity. *Front. Neural Circuits* 6:105. doi: 10.3389/fncir.2012.00105


ADC. *IEEE J. Solid-State Circ.* 45, 1935–1945.


learning through spike-timing-dependent synaptic plasticity. *Nat. Neurosci.* 3, 919–926.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 September 2012; paper pending published: 30 October 2012; accepted: 22 December 2012; published online: 10 January 2013.*

*Citation: Müller J, Bakkum DJ and Hierlemann A (2013) Sub-millisecond closed-loop feedback stimulation between arbitrary sets of individual neurons. Front. Neural Circuits 6:121. doi: 10.3389/fncir.2012.00121*

*Copyright © 2013 Müller, Bakkum and Hierlemann. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Closed-loop, multichannel experimentation using the open-source NeuroRighter electrophysiology platform

#### **Jonathan P. Newman<sup>1</sup> , Riley Zeller-Townson<sup>1</sup> , Ming-Fai Fong1,2, Sharanya Arcot Desai 1,3,4 , Robert E. Gross 3,4,5 and Steve M. Potter <sup>1</sup>\***

<sup>1</sup> Laboratory for Neuroengineering, Department of Biomedical Engineering, Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, USA

<sup>4</sup> Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA

<sup>5</sup> Department of Biomedical Engineering, Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, USA

#### **Edited by:**

Eberhard E. Fetz, University of Washington, USA

#### **Reviewed by:**

Yang Dan, University of California, Berkeley, USA Stavros Zanos, University of Washington, USA

#### **\*Correspondence:**

Steve M. Potter, Laboratory for Neuroengineering, Department of Biomedical Engineering, The Georgia Institute of Technology, 313 Ferst Drive, Atlanta, GA 30332, USA. e-mail: steve.potter@bme.gatech.edu

Single neuron feedback control techniques, such as voltage clamp and dynamic clamp, have enabled numerous advances in our understanding of ion channels, electrochemical signaling, and neural dynamics. Although commercially available multichannel recording and stimulation systems are commonly used for studying neural processing at the network level, they provide little native support for real-time feedback. We developed the open-source NeuroRighter multichannel electrophysiology hardware and software platform for closed-loop multichannel control with a focus on accessibility and low cost. NeuroRighter allows 64 channels of stimulation and recording for around US \$10,000, along with the ability to integrate with other software and hardware. Here, we present substantial enhancements to the NeuroRighter platform, including a redesigned desktop application, a new stimulation subsystem allowing arbitrary stimulation patterns, low-latency data servers for accessing data streams, and a new application programming interface (API) for creating closed-loop protocols that can be inserted into NeuroRighter as plugin programs. This greatly simplifies the design of sophisticated real-time experiments without sacrificing the power and speed of a compiled programming language. Here we present a detailed description of NeuroRighter as a stand-alone application, its plugin API, and an extensive set of case studies that highlight the system's abilities for conducting closed-loop, multichannel interfacing experiments.

**Keywords: closed-loop, multichannel, real-time, multi-electrode, micro-electrode array, electrophysiology, open-source, network**

## **1. INTRODUCTION**

Multi-electrode neural interfacing systems, such as planar electrode arrays, silicon probes, and microwire arrays are commonly used to record spatially distributed neural activity *in vitro* and *in vivo*. Advances in nanoscale fabrication techniques have continued to push channel counts and electrode resolution (Du et al., 2011; Fiscella et al., 2012; Robinson et al., 2012), allowing for increasingly detailed measurements of network activity states. Because multi-electrode neural interfaces provide many parallel measurements, they can be used to rapidly estimate ensemble features of network activity (e.g., the population firing rate or network-level synchronization). This makes them well suited for real-time applications.

However, most commercial software interfaces for controlling multichannel hardware lack flexible support for real-time, bi-directional communication with neural tissue. Additionally, commercial software is often hard to integrate into complex multicomponent experimental configurations. As a result, multichannel hardware has not been incorporated into closed-loop interfacing schemes to the degree of single-cell recording systems, such as voltage and dynamic clamp (Cole, 1949; Marmont, 1949; Hamill et al., 1981; Prinz et al., 2004; Arsiero et al., 2007; Kispersky et al., 2011). There are some exceptions to this trend (Jackson et al., 2006b; Azin and Guggenmos, 2011; Zanos et al., 2011). These systems are typically limited to low channel counts and/or low recording resolution in order to achieve embedded real-time processing at the recording site using a microcontroller or DSP. This approach has clear advantages for experiments on freely moving animals, but is limited in terms of input and output bandwidth, processing power to enable complex experimental protocols, and ease of programming. Neuroscience research would benefit from a multichannel acquisition platform that (1) enables bi-directional interaction with neuronal networks, (2) is practical for everyday use, (3) is straightforwardly extensible for complex closed-loop protocols, (4) works with a variety of multi-electrode interfaces, (5) provides large channel counts and high recording resolution, and (6) is low cost. This type of system would be particularly applicable to three areas of neuroscience research:

• Feedback Control of Network Variables: Neuronal networks are complex systems with many recurrently interacting components. This often results in ambiguity in cause and effect relationships between network variables (Rich and Wenner, 2007; Turrigiano, 2011). Feedback control can be used to parse variables of neural activation that are causally linked (Cole, 1949). Feedback control of network-level variables (e.g., population firing rate, neuronal synchronization, or neurotransmission levels) can potentially clarify their causal relationships (Wagenaar et al., 2005; Wallach et al., 2011).

<sup>2</sup> Department of Physiology, Emory University School of Medicine, Atlanta, GA, USA

<sup>3</sup> Department of Neurosurgery, Emory University School of Medicine, Atlanta, GA, USA


Here, we present substantial improvements to NeuroRighter, an open-source, multichannel neural interfacing platform which we designed specifically to enable bi-directional, real-time communication with neuronal networks (Rolston et al., 2009a, 2010). In the first half of the paper, we provide a description of NeuroRighter's capabilities, including an application programming interface (API) that facilitates the creation of custom real-time experiment protocols. In the second half of the paper, we demonstrate these features with a variety of case studies. Each case study highlights a different aspect of NeuroRighter's abilities in the areas of network-level feedback control, artificial embodiment, and closed-loop control of aberrant activity states in freely moving animals.

## **2. THE NEURORIGHTER MULTICHANNEL ELECTROPHYSIOLOGY PLATFORM**

NeuroRighter is an open-source, low-cost multichannel electrophysiology system designed for bi-directional neural interfacing (Rolston et al., 2009a, 2010). A complete system, including all necessary electronics and a host computer, can be assembled for less than \$10,000 USD. The NeuroRighter software is free. Extensive documentation on the construction and usage of a NeuroRighter system is available online<sup>1</sup>. NeuroRighter's source code, the API reference, and demonstration closed-loop protocol code are available from the NeuroRighter code repository<sup>2</sup>. Questions on NeuroRighter assembly and usage can be submitted to the NeuroRighter-Users forum<sup>3</sup>. Tutorials on API usage are provided in sections 1 and 2 of the Supplementary Material.

#### **2.1. HARDWARE**

Here we provide a summary of NeuroRighter's hardware building blocks. Hardware components can be used with neural interfaces designed for applications both *in vivo* and *in vitro*. Printed circuit board (PCB) performance specifications are provided in (Rolston et al., 2009a) and layouts are available online. A complete NeuroRighter system meets or exceeds the performance of commercial alternatives in terms of noise levels, stimulation channel count, stimulation recovery times, and flexibility (Rolston et al., 2009a). NeuroRighter's PCBs are designed to be modular: electrode interfacing and stimulation PCBs have identical footprints and use vertical headers to route power between boards. This allows interfacing PCBs to be stacked on top of one another for increased channel counts and the use of a single DC power supply (or set of batteries) for all hardware.

## **2.1.1. ADC/DAC boards**

NeuroRighter uses National Instruments (NI; National Instruments Corp, Austin, TX, USA) data acquisition hardware driven with NI's hardware control library, DAQmx. NI PCI-6259, PCIe-6259, PCIe-6353, and PCIe-6363 16-bit, 1 M sample/sec data acquisition cards are currently supported. Each card supports 32 analog inputs (AI), 4 analog outputs (AO), and 48 I/O-configurable digital channels. NI SCB-68 screw-terminal connector boxes are used to interface each data acquisition card with external hardware. Up to 3 cards can be used in a single NeuroRighter system to meet channel count requirements.
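The channel and sample-rate budget implied by these figures can be worked through in a short sketch. It assumes that the quoted 1 M sample/sec is an aggregate rate shared across a card's multiplexed analog inputs, which is consistent with the per-channel rate of about 31 kSamples/sec quoted for the NeuroRighter application; the function names are hypothetical.

```python
def per_channel_rate_hz(aggregate_rate_hz=1_000_000, inputs_per_card=32):
    """Per-channel sampling rate if one card's aggregate rate is shared equally."""
    return aggregate_rate_hz / inputs_per_card


def total_analog_inputs(n_cards, inputs_per_card=32, max_cards=3):
    """Total analog input channels for a system with n_cards acquisition cards."""
    if not 1 <= n_cards <= max_cards:
        raise ValueError("a NeuroRighter system uses between 1 and 3 cards")
    return n_cards * inputs_per_card


rate = per_channel_rate_hz()        # 31250.0 samples/sec per channel
two_cards = total_analog_inputs(2)  # 64 analog inputs
```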

## **2.1.2. Multichannel amplifier interfacing boards**

NeuroRighter provides two types of PCB to interface the NI data acquisition cards with multi-electrode amplifier systems. For *in vivo* applications, a 16-channel filter module provides 1.6× signal buffering, anti-aliasing filtering (−3 dB point at 8.8 kHz), DC offset subtraction (−3 dB point at 1 Hz), and regulated power to the headstage. Up to four of these modules can be stacked together in order to meet channel count requirements. For *in vitro* applications, a 68-channel conversion board provides power and signal routing for planar electrode array amplifier systems, e.g., Multichannel Systems' 60-channel amplifiers (Multichannel Systems, Reutlingen, Germany), which have a manufacturer-settable pass-band. Both boards interface with the SCB-68 connector boxes using 34-channel ribbon cables, wired as signal/ground pairs to reduce capacitive crosstalk between adjacent lines during stimulation.

## **2.1.3. Electrical micro-stimulation hardware**

NeuroRighter includes all-channel (up to 64 electrodes) stimulation capabilities for both *in vivo* and *in vitro* systems. This system is based upon the circuits presented in (Wagenaar and Potter, 2004; Wagenaar et al., 2004) and includes two separate PCBs: (1) a voltage- or current-controlled signal generation PCB, and (2) a signal multiplexing and isolation PCB to select different electrodes for stimulation and isolate recording electrodes from stimulation cables between stimulus pulses.

<sup>1</sup>https://sites.google.com/site/neurorighter/

<sup>2</sup>http://code.google.com/p/neurorighter/

<sup>3</sup>http://groups.google.com/group/neurorighter-users


#### **2.1.4. Generic I/O**

NeuroRighter provides 4 analog output channels and 32 bits of programmable digital I/O for controlling or recording digital signals from laboratory equipment. An auxiliary set of up to 32 analog input channels and 32 bits of digital I/O can also be used. Channel counts of generic I/O in a NeuroRighter system depend on the number of data acquisition cards in the user's system, and the number of analog input channels reserved for the electrodes.

NeuroRighter's hardware serves as an adaptable interface between multi-electrode sensors and data acquisition cards for recording and microstimulation. There are many other options for routing signals to and from the acquisition cards. Therefore, except for the acquisition cards themselves, the hardware we present here is not required to make use of NeuroRighter's software.

#### **2.2. SOFTWARE**

The NeuroRighter software application was written in C# (pronounced "C-Sharp"). C# is a modern, general purpose, object-oriented programming language. The software is free and its source code is maintained on a publicly accessible repository<sup>4</sup> . For standard installations, NeuroRighter is distributed as an installation package for 32- or 64-bit Windows operating systems (Microsoft Corp., Redmond, WA). NeuroRighter installations contain two software components:


#### **2.2.1. The NeuroRighter application**

As a stand-alone application, NeuroRighter can be used for high-quality multichannel recordings (16-bit resolution, 31 kSamples/sec/channel) and all-channel stimulation protocols. NeuroRighter's graphical interface is organized into tabbed pages, each of which encapsulates a particular group of functions or visualization tools (**Figure 1**). In the following section, we discuss the main functional aspects of the stand-alone NeuroRighter application.

*2.2.1.1. Main interface.* The main NeuroRighter interface (**Figure 1C**) is an access point for all of the application's functionality. It facilitates user manipulation of hardware settings, online filter settings, data visualization windows, stimulation tools, and other features, which are discussed below. Additionally, some recording settings can be manipulated within the main interface itself:

*Online acquisition settings.* Many filter settings can be adjusted during data collection. This allows the user to fine tune acquisition settings while gaining visual feedback of the effect on incoming data streams. Bandpass, spike detection, and spike sorting parameters can be adjusted during a recording.

*Data visualization.* Data visualization tools in NeuroRighter use the Microsoft XNA game development framework. This ensures that online visualization does not consume CPU cycles by offloading plotting routines to a supported graphics card. Visualization tools are provided for single-unit activity, local field potentials (LFP), multiunit activity (MUA), electroencephalograph (EEG) traces, and auxiliary analog input streams. Additionally, overlay plots are used to display sorted spike waveforms for each channel (**Figure 1C**).

*File saving.* Data streams selected by the user are written to disk with a unique file extension that designates their type. These binary files can be read with MATLAB (Mathworks, Natick, MA) functions included with NeuroRighter installations.

<sup>4</sup>http://code.google.com/p/neurorighter/

**FIGURE 1 | Portions of NeuroRighter's graphical user interface. (A)** The hardware settings interface. **(B)** The spike-detection filter and spike sorting interface. **(C)** The main application window. Sorted spike waveforms recorded from a 59-channel, planar electrode array are shown on the spike visualization tab of the main GUI. The position of each waveform corresponds to the position of the recording electrode on which it was detected.

*2.2.1.2. Hardware configuration.* Correctly specifying mixed digital and analog signal routing, clock synchronization, and trigger synchronization on a multi-board data acquisition system can be complicated. NeuroRighter simplifies this process using a graphical hardware settings interface (**Figure 1A**). Here, the user specifies the types of signals carried by the NI acquisition cards in his or her system, amplifier gain settings, auxiliary input and output channels, options for electrode impedance measurement, signal referencing, and real-time data streaming options. Upon closing the settings dialog, NeuroRighter performs the required signal routing and clock synchronization. All NI cards are synchronized to a single clock oscillator using an NI real-time system integration bus (RTSI, **Figure 3**).

*2.2.1.3. Time-series filtering.* Incoming data from the A/D converters are passed through a cascade of digital filters to produce different neural data streams. First, channel voltages are passed through several linear filters to extract frequency bands for single-unit activity (∼200–5000 Hz) and LFP (∼1–500 Hz). MUA, which reflects the firing rate of neurons within the vicinity of the recording electrode, is extracted by rectifying and then low-pass filtering the single-unit activity data stream (Supèr and Roelfsema, 2005).
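The MUA extraction step (rectify, then low-pass filter) can be sketched as follows. This is only an illustration of the principle: the first-order filter, the 200 Hz cutoff, the 25 kHz sampling rate, and the function names are assumptions, and NeuroRighter's actual filter type and order are set by the user and are not reproduced here.

```python
import math


def lowpass_first_order(samples, cutoff_hz, fs_hz):
    """One-pole IIR low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def multiunit_activity(spike_band, cutoff_hz, fs_hz):
    """Rectify the single-unit band, then smooth it to obtain an MUA envelope."""
    rectified = [abs(v) for v in spike_band]
    return lowpass_first_order(rectified, cutoff_hz, fs_hz)


# Toy input: a full-scale oscillation whose rectified value is constant at 1.0.
fs_hz = 25_000.0
spike_band = [1.0 if i % 2 == 0 else -1.0 for i in range(2000)]
mua = multiunit_activity(spike_band, cutoff_hz=200.0, fs_hz=fs_hz)
```

For this toy input the envelope rises from zero and settles at the rectified amplitude, whereas low-pass filtering the unrectified signal would average toward zero.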

In addition to traditional filtering methods, NeuroRighter provides several specialized filtering options. Common-mode noise sources such as AC mains pickup or movement artifacts in freely moving animals can corrupt neural recordings. NeuroRighter allows the mean or median of all recording electrodes (with appropriate scaling) to be subtracted from individual electrode voltage streams to combat common-mode interference (Rolston et al., 2009b). This is an effective method for reducing non-periodic common-mode interference, such as movement artifacts, where template subtraction methods are inappropriate. Finally, NeuroRighter includes an implementation of the SALPA filter (Wagenaar and Potter, 2002), which subtracts locally fit cubic splines from electrode traces following the application of a stimulus pulse. This removes the capacitive artifacts from non-saturated recording channels and allows online action potential detection within 2 ms after a stimulus pulse.
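Common-mode subtraction of the kind described above can be illustrated with a minimal sketch that subtracts the across-electrode median from every electrode at each sample. The per-channel scaling mentioned in the text is omitted, and the data layout (one list of electrode voltages per sample) and function name are assumptions made for illustration.

```python
from statistics import median


def subtract_common_median(frames):
    """Remove the across-electrode median from each sample frame.

    frames: list of frames, each a list with one voltage per electrode.
    """
    referenced = []
    for frame in frames:
        common = median(frame)  # estimate of the shared interference
        referenced.append([v - common for v in frame])
    return referenced


# Toy data: the same neural signal with and without a 7.0 common-mode offset.
clean = [[0.0, 1.0, -1.0, 0.5]]
contaminated = [[v + 7.0 for v in frame] for frame in clean]
result_clean = subtract_common_median(clean)
result_contaminated = subtract_common_median(contaminated)
```

Because the offset is shared by all electrodes, both inputs yield the same referenced output, which is the property that makes the method suitable for non-periodic interference such as movement artifacts.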

Sampling rates for different data streams can be set independently. Filter settings (pass-band and filter order) can be modified during data acquisition (**Figure 1C**). Raw data, as well as the result of each filtering stage, yield separate data streams (**Table 1**).

*2.2.1.4. Spike filtering.* Spike filtering in NeuroRighter is a three-step process: (1) detection, (2) validation, and (3) sorting. NeuroRighter detects spikes using a threshold criterion that compares individual voltage samples to the estimated RMS voltage on the corresponding electrode. Upon threshold crossing, a peak-aligned voltage "snippet" is extracted from the raw voltage stream. Each snippet is validated using a series of *ad hoc* criteria based upon waveform slope, width, and peak-to-peak amplitude. Finally, spikes can be sorted online using an automated Gaussian mixture modeling algorithm. Details of the spike detection and sorting algorithms used by NeuroRighter are provided in section 3 in the Supplementary Material.
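The detection step alone (RMS-based threshold followed by peak-aligned snippet extraction) can be sketched as below. The threshold multiplier, snippet lengths, dead time, and function names are illustrative assumptions rather than NeuroRighter's defaults, the RMS is estimated naively over the whole trace, and the validation and Gaussian mixture sorting stages are not shown.

```python
import math


def estimate_rms(samples):
    """Root-mean-square voltage of a trace."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def detect_spikes(samples, threshold_rms=5.0, pre=2, post=3):
    """Return (peak_index, snippet) pairs for threshold crossings.

    A crossing occurs when |v| exceeds threshold_rms times the RMS estimate.
    Each snippet holds `pre` samples before and `post` samples after the peak.
    """
    threshold = threshold_rms * estimate_rms(samples)
    n = len(samples)
    spikes = []
    i = pre
    while i < n - post:
        if abs(samples[i]) > threshold:
            # Walk forward to the local extremum so the snippet is peak-aligned.
            peak = i
            while peak + 1 < n - post and abs(samples[peak + 1]) > abs(samples[peak]):
                peak += 1
            spikes.append((peak, samples[peak - pre:peak + post + 1]))
            i = peak + post + 1  # skip the rest of this waveform (dead time)
        else:
            i += 1
    return spikes


# Toy trace: low-amplitude noise with one triangular spike peaking at index 50.
trace = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
trace[49], trace[50], trace[51] = 1.0, 3.0, 1.0
detected = detect_spikes(trace)
```

Aligning on the peak rather than on the crossing sample keeps waveforms from the same unit comparable, which matters for any subsequent sorting stage.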

The spike detection/sorting configuration is controlled through a child GUI (**Figure 1B**). All relevant spike detection, validation, and sorting parameters are under user control and are manipulated using the spike detection GUI. Because spike-detection settings are changed using a secondary GUI, the effects of parameter changes can be simultaneously monitored on the visualization tabs in the main interface while data collection occurs. A complete list of these parameters is shown in Table S1 in the Supplementary Material. Spike filters, including trained spike sorters, can be saved and reused.

*2.2.1.5. Stimulation.* NeuroRighter provides several options for delivering complex stimulus patterns to neural tissue either manually through the NeuroRighter application or using scripted protocols. Simple, periodic stimulation protocols, consisting of single- or double-phase, square, current- or voltage-controlled pulses on any electrode, can be performed directly from the main GUI. Stimuli can be triggered "on demand" in response to a mouse click or by using a hardware-timed, periodic sequence of triggers.

Scripted protocols can be used to deliver complex, potentially non-periodic stimulus patterns and to access general purpose analog and digital output lines. NeuroRighter uses a double-buffered output engine, called StimSrv (**Table 2**), to produce arbitrary, hardware-timed stimulation, analog-output, or digital output signals (**Table 1**, bottom). StimSrv can be accessed on-the-fly using NeuroRighter's API (section 2.2.2) or with user-written scripts. The schematic in **Figure 2A** demonstrates how StimSrv delivers uninterrupted output. First, a block of the NI cards' memory is reserved and divided into two sections, each of which comprises a single output buffer. At a given instant, one buffer is reserved for sample generation and one is available for writing. When all samples in the read buffer are exhausted, the buffers switch roles, allowing seamless delivery of constantly varying output signals. This allows the delivery of complex, aperiodic stimulation patterns and the orchestration of experimental apparatuses using analog and digital output lines. All output is clock-synchronized to input data streams, allowing *a priori* specification of stimulus delivery times, relative to the start of the experiment, with single-sample precision. Stimulation scripts can be created with a set of MATLAB functions that are included with NeuroRighter installations (see section 1 in the Supplementary Material).
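The double-buffering scheme described above can be illustrated with a small software model. This is a conceptual sketch only: the class and method names are hypothetical, the buffers are plain Python lists rather than reserved memory on the NI cards, and hardware timing is replaced by explicit calls that emit one sample at a time.

```python
class DoubleBufferedOutput:
    """Toy model of a two-buffer output engine with role swapping."""

    def __init__(self, buffer_len):
        self.buffer_len = buffer_len
        self._buffers = [[0.0] * buffer_len, [0.0] * buffer_len]
        self._read_idx = 0  # buffer currently used for sample generation
        self._pos = 0       # next sample to emit from the read buffer

    def write_next(self, samples):
        """Fill the buffer that is NOT currently being read."""
        if len(samples) != self.buffer_len:
            raise ValueError("samples must fill exactly one buffer")
        self._buffers[1 - self._read_idx] = list(samples)

    def next_sample(self):
        """Emit one sample; swap buffer roles when the read buffer is exhausted."""
        value = self._buffers[self._read_idx][self._pos]
        self._pos += 1
        if self._pos == self.buffer_len:
            self._read_idx = 1 - self._read_idx
            self._pos = 0
        return value


engine = DoubleBufferedOutput(buffer_len=3)
stream = []
engine.write_next([1.0, 2.0, 3.0])  # queued while the initial (zero) buffer plays
stream += [engine.next_sample() for _ in range(3)]
engine.write_next([4.0, 5.0, 6.0])  # queued while the first written buffer plays
stream += [engine.next_sample() for _ in range(6)]
```

The emitted stream is continuous across buffer boundaries, and each write only has to complete before the other buffer finishes playing, which is what allows output to be changed on the fly without gaps.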

**Figure 2B** demonstrates the use of a scripted stimulation protocol to deliver spatio-temporal patterns of electrical stimuli. One-second trials of spatially uniform, and temporally Poisson random stimulus pulses were delivered to a dissociated cortical network. Each trial consisted of either a new, random stimulus realization or a single repeated realization. Each type of stimulus sequence was interleaved with no delay between adjacent trials. **Figure 2Bi** shows stimulus raster plots for 100 trials of each stimulus type, with a grayscale indicating the stimulus trial. For repeated stimuli, individual trials cannot be seen since the recording and stimulation subsystems are clock-synchronized and every repeated stimulus sequence occupies the same set of samples relative to the start of a trial. **Figure 2Bii** shows spiking patterns in response to random and repeated stimuli for 4 units across trials. The delivery of repeated stimuli to the network results in extremely reproducible spiking patterns, and non-repeated, random stimuli probe the variability of population spiking response. This type of stimulus protocol is commonly used to estimate the mutual information between a stimulation process and the population spiking response (Strong et al., 1998; Yu et al., 2010).
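The structure of such a protocol, interleaving fresh Poisson realizations with one frozen realization, can be sketched as follows. The stimulus rate, seed, and function names are assumptions for illustration; the protocol in **Figure 2B** was generated with the MATLAB scripting functions mentioned above, and electrode assignment is not modeled here.

```python
import random


def poisson_train(rate_hz, duration_s, rng):
    """Stimulus times of a homogeneous Poisson process on [0, duration_s)."""
    times = []
    t = rng.expovariate(rate_hz)  # exponential inter-stimulus intervals
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_hz)
    return times


def interleaved_trials(n_pairs, rate_hz=10.0, duration_s=1.0, seed=0):
    """Alternate a new random realization with one repeated (frozen) realization."""
    rng = random.Random(seed)
    frozen = poisson_train(rate_hz, duration_s, rng)
    trials = []
    for _ in range(n_pairs):
        trials.append(("random", poisson_train(rate_hz, duration_s, rng)))
        trials.append(("repeated", list(frozen)))
    return trials


trials = interleaved_trials(n_pairs=5)
repeated = [times for label, times in trials if label == "repeated"]
```

Every repeated trial holds identical times relative to trial onset, so responses to it isolate trial-to-trial variability, while the random trials sample the overall response distribution.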

#### **2.2.2. NeuroRighter's application programming interface**

NeuroRighter installations include an API that facilitates the creation of real-time protocols. The API comprises a set of tools for interacting with NeuroRighter's input and output streams. Protocols written using the API are externally compiled libraries that can "plug in" to the NeuroRighter application in order to impart real-time and closed-loop functionality. The software packages included with the API are shown in **Table 2**. Each package contains a different set of tools for interacting with NeuroRighter's data streams. Here we discuss the contents and usage of each of these tools. Additionally, a detailed API reference is available online<sup>5</sup>.

#### **Table 1 | Overview of NeuroRighter's input and output streams.**

Each stream is accessed using a dedicated server that includes functions for reading from, or writing to, its data buffer.

#### **Table 2 | Packages included with NeuroRighter's Plugin API.**

*2.2.2.1. NeuroRighterTask.* User-defined protocols employ the NeuroRighter application as a real-time data server. These protocols are inherited from a base component called NRTask, which belongs to the NeuroRighterTask package. Closed-loop protocols created with the plugin API are derived from NRTask (see section 2 in the Supplementary Material for details). Three functions included in NRTask can then be accessed to impart real-time functionality.

1. NRTask.Setup(): This function is called when the base NRTask component is instantiated. It allows one-time setup operations to take place, such as the declaration of variables, allocation of internal buffers, file streaming setup, GUI initialization, etc.


<sup>5</sup>https://potterlab.gatech.edu/main/neurorighter-api-ref/

**LISTING 1 | Code structure for two types of real-time plugin implemented with the API. (A)** Pseudocode for a StimSrv-based real-time plugin. **(B)** Pseudocode for a real-time plugin triggered by NewData events.
**Listing 1A** and **1B** provide pseudocode for two real-time plugins that both respond to a spike produced by a particular detected unit. A real-time protocol written using the API will follow the structure of one of these code skeletons, regardless of its complexity. First, the user references the required packages from the API. Next, the plugin is designated to be a child of NRTask, which provides the protocol with automatic access to NeuroRighter's data servers. Finally, the Setup(), Loop(), and Cleanup() functions are overridden (**Listing 1A**), or a NewData event is subscribed to (**Listing 1B**), to impart real-time functionality. After it is compiled (using either Visual Studio or Mono<sup>6</sup>), the plugin can be executed through NeuroRighter's GUI. Plugin protocols executed through NeuroRighter operate on a high-priority thread to decrease closed-loop response latency. **Figure 3** shows the interaction between a plugin created using the API, the NeuroRighter executable, and hardware. Functional examples of plugin protocols are provided in section 5 of the Supplementary Material.
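The Setup/Loop/Cleanup structure described above can be illustrated with a minimal, language-neutral sketch. The Python below is not the NeuroRighter API: the class `NRTaskSketch`, the in-memory `FakeServers` object, and the `run_task` driver are hypothetical stand-ins for NRTask, the DataSrv/StimSrv servers, and NeuroRighter's periodic calls to the plugin, respectively. It only shows how a plugin that responds to spikes from one detected unit might be organized.

```python
from abc import ABC, abstractmethod


class NRTaskSketch(ABC):
    """Hypothetical Python stand-in for the NRTask base component."""

    @abstractmethod
    def setup(self):
        """One-time initialization (variables, buffers, files)."""

    @abstractmethod
    def loop(self):
        """Periodic work: read inputs, decide, write outputs."""

    @abstractmethod
    def cleanup(self):
        """Release resources when the protocol ends."""


class FakeServers:
    """Hypothetical in-memory replacement for the input and output servers."""

    def __init__(self, spike_batches):
        self._batches = list(spike_batches)  # unit IDs detected per cycle
        self.written = []                    # digital words "sent" to hardware

    def read_new_spikes(self):
        return self._batches.pop(0) if self._batches else []

    def write_digital(self, word):
        self.written.append(word)


class UnitTriggeredTask(NRTaskSketch):
    """Emit a digital word each time the chosen unit fires."""

    def __init__(self, servers, trigger_unit):
        self.servers = servers
        self.trigger_unit = trigger_unit

    def setup(self):
        self.responses = 0

    def loop(self):
        for unit in self.servers.read_new_spikes():
            if unit == self.trigger_unit:
                self.servers.write_digital(unit)
                self.responses += 1

    def cleanup(self):
        return self.responses


def run_task(task, n_cycles):
    """Plain for-loop standing in for the host application's scheduler."""
    task.setup()
    for _ in range(n_cycles):
        task.loop()
    return task.cleanup()
```

For example, `run_task(UnitTriggeredTask(FakeServers([[1, 2], [3], [2, 2]]), 2), 3)` processes three input cycles and responds to each spike from unit 2.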

*2.2.2.2. Server.* Components derived from NRTask have automatic access to NeuroRighter's input and output servers, which belong to the Server package. There are two banks of data servers: (1) DataSrv, which can be used to read NeuroRighter's input streams (**Table 1**, top) and (2) StimSrv, which can be used to write to output streams (**Table 1**, bottom). DataSrv and StimSrv objects encapsulate isolated data servers, each of which handles a particular data stream. Each server includes methods for reading the hardware clock, reading from and writing to its own data buffer, and accessing stream metadata. Because input and output servers are simultaneously accessible from within a user-defined NRTask, sending output signals (e.g., stimuli) contingent on recorded input is straightforward. The user can select which data streams are sent to DataSrv or available for writing on StimSrv using the Hardware Settings GUI (**Figure 1A**).

<sup>6</sup>http://www.mono-project.com/Main\_Page

loop() function is called at the instant of a buffer switch. **(B)** Example

A final important feature of each data server within DataSrv is a NewData event. A NewData event is fired for a given stream each time it receives new data from the A/D card or a digital filter. Functions within a plugin can subscribe to these events so that feedback processing only occurs when new data are acquired. This reduces computational overhead and the latency of the closed-loop response. Plugins that use NewData events to generate feedback are not required to include a Loop() function or to use StimSrv to send output signals. Instead, standard calls to the National Instruments driver library (DAQmx) can be used to access the NI cards directly. Alternatively, output can be generated using natively supported external communication protocols (USB, TCP/IP, UDP, serial, etc.). **Listing 1B** provides pseudocode for a real-time protocol analogous to **Listing 1A**, but using the NewData event to trigger a response. This type of plugin provides lower response latencies but is less capable of producing complex, precisely timed output signals. A functional example of a NewData-based plugin is provided in section 5.2 in the Supplementary Material.
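The event-driven alternative can be sketched in the same spirit. The Python below is an illustration only and does not use the NeuroRighter API or DAQmx: `StreamSketch`, `subscribe`, `push`, and `make_unit_responder` are hypothetical names. The point is that the response function runs only when the stream announces new data, and that the output path (`send_output`) is supplied by the plugin rather than by an output server.

```python
class StreamSketch:
    """Hypothetical data stream that notifies subscribers when new data arrive."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        """Register a callable to be invoked with each new batch of data."""
        self._handlers.append(handler)

    def push(self, spikes):
        """Simulate the arrival of new data by calling every subscriber."""
        for handler in self._handlers:
            handler(spikes)


def make_unit_responder(trigger_unit, send_output):
    """Build a handler that calls send_output for each spike of trigger_unit."""

    def on_new_data(spikes):
        for unit in spikes:
            if unit == trigger_unit:
                send_output(unit)

    return on_new_data


def demo():
    """Feed three batches of unit IDs through the stream; return what was sent."""
    sent = []
    stream = StreamSketch()
    stream.subscribe(make_unit_responder(2, sent.append))
    for batch in ([1, 2], [3], [2]):
        stream.push(batch)
    return sent
```

In this sketch no polling loop exists; work is done only inside the handler, which mirrors the reduced overhead attributed to NewData-based plugins in the text.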

the 4 units are shown to the right.

neural signals.

**software elements.** NeuroRighter serves as a high-level interface between hardware and custom user-written protocols (pink box). NeuroRighter simplifies hardware level programming by using datatypes and methods that are specialized for multichannel neural recording and stimulation. This facilitates the creation of low-latency, closed-loop protocols. Neural signals and secondary data streams are fed into the NI cards' analog and digital inputs where they are digitalized and stored temporarily in on-board memory. NeuroRighter periodically transfers data

*2.2.2.3. Datatypes.* NeuroRighter's input and output servers operate on high-level data types that encapsulate different forms of multichannel input and output data. These include multichannel buffers for continuous data streams (such as raw electrode voltages or LFP recordings) and discrete event types (such as detected spikes or stimulation events). Extensive documentation on each of these data types is provided in the API reference.

*2.2.2.4. Log.* The Log package provides access to a data logging tool that operates within the NeuroRighter executable, but can be invoked from a user protocol. This tool can be used to write information to a log file using a separate, low-priority thread. This is useful in the development of real-time protocols because core NeuroRighter operations (such as the timing of hardware reads, writes, and other triggers) are logged to this file as well, providing context for messages written from the plugin.

## **3. CASE STUDIES**

NeuroRighter's abilities for orchestrating closed-loop experiments are best demonstrated through example. Here we present five case studies in which protocols created with the API were used to measure NeuroRighter's closed-loop reaction time, clamp network firing levels in dissociated cultured cortical networks, react to seizures in freely moving animals with multi-electrode electrical stimulation, and control robots serving as artificial embodiments. Experimental methods and plugin examples are provided in section 4 in the Supplementary Material. The plugin code used in these case studies is available for download on NeuroRighter's code repository<sup>7</sup>. Additionally, we provide all code used in the reaction-time case study in section 5 in the Supplementary Material.

## **3.1. LOW-LATENCY CONTROL OF REAL-TIME HARDWARE**

DataSrv serves data to NeuroRighter's visualization tools, filtering algorithms, and externally compiled plugins. The plugin API provides functions for safe interaction with DataSrv so that custom operations can be performed on incoming data streams. User-written plugins can interact with any of the computer's native communication ports, or write data back to StimSrv in order to control external hardware as a function of recorded

Rapid response times are critical for maintaining a tight feedback loop in which features of incoming data streams (e.g., spikes, EEG, temperature, or animal motion) are used to trigger or adjust the delivery of stimuli. To benchmark the response speed of protocols written using the API, we wrote a protocol that generated output signals in response to recorded action potentials. We picked two sorted units from a dissociated neural culture to serve as triggers for hardware activation. When one of these units fired, it triggered the output of a digital word encoding the identity of the detected unit. These signals serve as a generic stand-in for a stimulation pattern or any other hardware control signal that might be used in a feedback control scheme. Output signals were then recorded using NeuroRighter's digital input port. The delay between action potential detection and signal generation could then be measured using the same sample clock. A diagram of the experimental protocol is shown in **Figure 4A**. We wrote protocols to test three hardware options for generating the required digital output:


<sup>7</sup>http://code.google.com/p/neurorighter/source/browse/NR-ClosedLoop-Examples/

<sup>8</sup>http://www.arduino.cc/

hardware option.

The response latency, measured from the time of an action potential peak to the corresponding change in the digital port, was calculated for each hardware option (**Figure 4**). Mean response latencies were 46.9 ± 3.1 ms for StimSrv, 7.1 ± 1.5 ms for NewData, and 9.2 ± 1.3 ms for the Arduino board. Latencies were measured while NeuroRighter performed bandpass filtering, spike detection, spike sorting, data streaming, and data saving for 64 electrode inputs, each sampled at 25 kHz. Experiments were conducted on a desktop computer using an Intel Core i7 processor (Santa Clara, California, USA) and running 64-bit Windows Vista.

units, the plugin triggered the generation of a digital word encoding the detected unit using either StimSrv, unbuffered digital output triggered by a

The differences in reaction latency for different hardware options are a result of both the method used to communicate with the hardware and how the input sent from NeuroRighter is interpreted and transformed into a physical output signal. The differences in response times for NewData and Arduino are largely attributable to the different communication protocols and command interpretation by the client device. For instance, the Arduino used an RS-232 serial interface, whereas NewData communicates with the NI cards via PCIe. StimSrv's long latency in comparison to the other options is a result of its double buffering system, which requires a relatively long time period between updates to the NI D/A's output buffer. While StimSrv is slow in comparison to the NewData and microcontroller options, it provides an interface that is easier to use and allows the uninterrupted delivery of arbitrarily complex signal outputs. On the other hand, the Arduino and NewData methods can only respond by generating finite-sample or periodic control signals. We have found that StimSrv is fast enough for most of our closed-loop requirements. For this reason, we used StimSrv to generate physical outputs for the remainder of the case studies. However, as demonstrated above, the API's modularity allows the use of faster hardware options with little change in coding complexity.
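Because spikes and digital outputs are recorded against the same sample clock, the latency calculation reduces to differencing sample indices. The following Python sketch shows one way this could be done offline; it is not the analysis code used for **Figure 4**. The function names are hypothetical, the pairing rule (each spike is matched to the first later output event) is an assumption, and treating the reported ± values as a sample standard deviation is also an assumption.

```python
import statistics

SAMPLE_RATE_HZ = 25_000  # sampling rate stated in the text


def response_latencies_ms(spike_samples, output_samples,
                          sample_rate_hz=SAMPLE_RATE_HZ):
    """Pair each spike with the first unused output event at or after it.

    Both arguments are sample indices on a shared clock. Returns the
    latency of each matched pair in milliseconds.
    """
    outputs = sorted(output_samples)
    latencies = []
    j = 0
    for spike in sorted(spike_samples):
        # Skip output events that precede this spike.
        while j < len(outputs) and outputs[j] < spike:
            j += 1
        if j == len(outputs):
            break  # no output left to match
        latencies.append((outputs[j] - spike) * 1000.0 / sample_rate_hz)
        j += 1  # each output event is matched to one spike only
    return latencies


def summarize(latencies):
    """Return (mean, sample standard deviation) of the latencies."""
    return statistics.mean(latencies), statistics.stdev(latencies)
```

With spikes at samples 0, 25000, and 50000 and outputs at samples 175, 25200, and 50150, the matched latencies are 7, 8, and 6 ms.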

## **3.2. MULTICHANNEL POPULATION FIRING CLAMP**

The population firing rate is a building block of the neural code. The ability to precisely control population firing in the face of experimental perturbations can be used to understand its role in network function. To demonstrate NeuroRighter's ability to control the network firing rate, we implemented the feedback controller presented in Wagenaar et al. (2005) to control the firing activity in dissociated cortical cultures grown on 59-channel micro-electrode arrays. This algorithm adjusts the stimulation amplitude of voltage-controlled, biphasic pulses on 10 electrodes to desynchronize population firing and force the network firing rate to track target values. The control law is given by

three different hardware options. N is the number of spikes recorded for each

$$v_k[t+\Delta T] = v_k[t] - \alpha v_k[t] \left(\frac{\langle f_u[t] \rangle}{f^*} - 1\right), \tag{1}$$

where *v<sub>k</sub>* is the stimulation voltage on electrode *k*, ⟨*f<sub>u</sub>*[*t*]⟩ is the average firing rate across sorted units detected with the 59-electrode array, computed over a window extending 2 s into the past, *f*<sup>∗</sup> is the target firing rate, Δ*T* is the update period of the feedback loop (as defined within NeuroRighter's Hardware Settings GUI), and α defines the time constant of the feedback controller as

$$
\tau_{FB} = \Delta T / \alpha. \tag{2}
$$

We used Δ*T* = 10 ms and α = 0.002 so that τ<sub>FB</sub> = 5 s. Electrodes were stimulated at a 10 Hz aggregate frequency (1 Hz per electrode for 10 electrodes) in a random, repeating sequence. Additionally, individual electrode voltages were multiplied by a tuning factor that was inversely proportional to the number of spikes that occurred within 30 ms following a stimulus pulse on that electrode, as described in Wagenaar et al. (2005). This factor equalizes each electrode's ability to evoke a spiking response, and is critical for achieving the desynchronizing effect of the controller on population activity.
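As a worked illustration of Equations (1) and (2), the update can be written in a few lines of Python. This is a minimal sketch under stated assumptions, not the plugin used in the experiments: the function names are hypothetical, and the per-electrode tuning factor described above is deliberately omitted.

```python
def feedback_time_constant(delta_t_s, alpha):
    """Equation (2): tau_FB = Delta T / alpha, in seconds."""
    return delta_t_s / alpha


def update_voltages(voltages, mean_rate_hz, target_rate_hz, alpha):
    """Equation (1): one multiplicative update of every electrode voltage.

    voltages       -- current stimulation voltages v_k[t], one per electrode
    mean_rate_hz   -- average firing rate across units, <f_u[t]>
    target_rate_hz -- target firing rate f*
    alpha          -- dimensionless gain of the controller
    """
    relative_error = mean_rate_hz / target_rate_hz - 1.0
    return [v - alpha * v * relative_error for v in voltages]
```

With the values given in the text, `feedback_time_constant(0.010, 0.002)` evaluates to 5 s. When the measured rate equals the target, the voltages are left unchanged; when the rate exceeds the target, each voltage is reduced in proportion to its current value, and when the rate falls below the target, each voltage is increased.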

**FIGURE 5 | NeuroRighter can be used to clamp population firing rates in vitro using closed-loop electrical stimulation. (A)** Schematic of the multi-electrode population firing clamp. **(B)** Step tracking performance is shown for a range of target firing rates, *f*<sup>∗</sup> (dotted lines). The average neuronal firing rate across detected units, ⟨*f<sub>u</sub>*[*t*]⟩ (colored lines), is shown for each step in *f*<sup>∗</sup>. Tracking failures are colored gray. **(C)** Time-averaged neuronal firing rate for the last 2.5 min of each 5 min protocol compared to the reference signal, *f*<sup>∗</sup>. The dotted line is identity. **(D)** The mean control voltage across the stimulating electrodes over the final 2.5 min of each step protocol at different values of *f*<sup>∗</sup>.

We used the controller to clamp network firing at target rates for 5 min epochs. These results are shown in **Figure 5**. The controller was able to achieve target rates within the range of *f* <sup>∗</sup> = 1.5–4.5 Hz/Unit. An animation of neural activity before and during firing-rate clamping is provided in the Supplementary Material.

The monotonically increasing relationship between the mean stimulation voltage, ⟨*v<sub>k</sub>*[*t*]⟩, and target firing rate, *f*<sup>∗</sup> (**Figure 5D**), might indicate that knowledge of the stimulation voltage versus firing rate relationship is sufficient to design an open-loop controller capable of holding network firing rates. To test this, we clamped firing at *f*<sup>∗</sup> = 3.0 Hz/Unit over 10 min epochs for 15 trials. Five minutes into each 10 min protocol, we stopped updating stimulation voltages on the ten stimulating electrodes, but continued multi-electrode stimulation in open-loop mode (**Figure 6**). Although the desired mean firing rate was achieved fairly consistently, the open-loop control scheme could not react to the rapid changes in excitability that are typical of cultured cortical networks (Wagenaar et al., 2006b). This variability is reflected in the large range of control signals required to track the target rate over the first 5 min of each trial. As a result, the RMS error of ⟨*f<sub>u</sub>*[*t*]⟩ about *f*<sup>∗</sup> increased by a factor of 5.1 for open-loop compared to closed-loop epochs. The variance of firing during open-loop stimulation is comparable to that of spontaneous (non-evoked) firing behavior that was recorded before the controller was switched on (**Figure 6**, top).

## **3.3. LONG-TERM POPULATION FIRING CLAMP WITH SYNAPTIC DECOUPLING**

#### **3.3.1. Experiment 1**

*In vitro* neural preparations allow continuous experimental access to neural tissue over very long time scales (Potter and DeMarse, 2001), and therefore serve as important models for understanding slowly occurring developmental processes (Turrigiano et al., 1998; Minerbi et al., 2009; Gal et al., 2010). To demonstrate that NeuroRighter is capable of stable closed-loop neural interfacing over long time scales, we ran the multi-electrode feedback controller described in section 3.2 for 6 h epochs. This protocol started with a 1 h recording of spontaneous activity. Then, the controller was engaged to clamp population firing to *f*<sup>∗</sup> = 3.0 Hz/Unit for 6 h. Following the clamping protocol, spontaneous network activity was recorded for an additional hour.

**Figure 7A** shows the resulting multichannel stimulation signal (**Figure 7Ai**), neuronal firing rate in relation to *f*<sup>∗</sup> (**Figure 7Aii**), individual unit firing rates (**Figure 7Aiii**), and zoomed rastergrams before, during, and after multi-electrode stimulation was applied (**Figure 7Aiv**). The controller tracked the *f*<sup>∗</sup> = 3.0 Hz/Unit target over the duration of the 6 h protocol. Additionally, network activity was desynchronized through most of the control epoch, but occasionally the controller allowed bouts of synchronized network activity (Wagenaar et al., 2006b).

#### **3.3.2. Experiment 2**

Spiking and neurotransmission have a strong reciprocal influence on one another, making their individual effects on network development difficult to quantify (Turrigiano, 2011). For instance, *N*-methyl-d-aspartate (NMDA)-ergic neurotransmission plays a large role in sustained network recruitment (Nakanishi and Kukita, 1998). For this reason, long-term changes in the state of *in vitro* networks following the application of synaptic blockers (e.g., changes in firing rate, spiking patterns, or synaptic strength) are difficult to attribute directly to effects on neurotransmission because of secondary, confounding effects on network activity levels. However, the closed-loop population clamp provides a solution to this problem. A firing rate controller has the potential to compensate for changes in network excitability induced by the application of a drug, removing its confounding effect on network activity.

To test this, we used the multichannel population clamp during the bath application of d(−)-2-amino-5-phosphonopentanoic acid (AP5), a competitive antagonist of the NMDA receptor. This protocol proceeded identically to experiment 1 except that 1 h after the start of closed-loop stimulation, NeuroRighter triggered the perfusion of 50 µM AP5 into the culturing medium using a syringe pump and a custom, gas-permeable perfusion lid (Potter and DeMarse, 2001; Figure S5 in the Supplementary Material). Four hours after AP5 was applied, NeuroRighter triggered the pump a second time to perform a series of washes with normal culturing medium that removed AP5 from the bath.

Time-series results of this protocol are shown in **Figure 7B**. The contents of these plots are analogous to **Figure 7A** but have arrows to indicate when AP5 was added to, and removed from, the culturing chamber. The controller successfully compensated for changes in network excitability caused by the addition of AP5. Changes in network dynamics were reflected in the control signal, which became smoother in the presence of AP5 (**Figure 7Bi**).

#### **3.3.3. Comparing Experiments 1 and 2**

**Figure 7C** shows the average pair-wise firing rate correlation functions (Tchumatchenko et al., 2010) for 30 randomly selected units from experiment 1 (black lines) and experiment 2 (red lines). **Figures 7Ci,iii** show the correlation functions of spontaneous network activity before and after the controller was engaged, respectively. **Figure 7Cii** shows correlation functions for epochs during the clamping phase (which included the AP5 treatment for experiment 2). The periodicity of this correlation function follows the 10 Hz aggregate stimulation frequency during the clamping period.
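To make the idea of an average pair-wise correlation function concrete, the following Python sketch bins spike times into 10 ms bins (the bin width given in the caption of **Figure 7**) and averages a lagged correlation over all unit pairs. It is an illustration only: the exact estimator of Tchumatchenko et al. (2010) is not reproduced here, the Pearson normalization is an assumption, and all function names are hypothetical.

```python
from itertools import combinations


def bin_spikes(spike_times_s, duration_s, bin_s=0.010):
    """Count spikes in consecutive bins of width bin_s (seconds)."""
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times_s:
        i = int(t / bin_s)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def correlation_at_lag(x, y, lag):
    """Pearson correlation between x[i] and y[i + lag] over overlapping bins."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (var_a * var_b) ** 0.5


def average_pairwise_correlation(binned_units, lags):
    """Average correlation_at_lag over every pair of units, for each lag."""
    pairs = list(combinations(binned_units, 2))
    return [sum(correlation_at_lag(x, y, lag) for x, y in pairs) / len(pairs)
            for lag in lags]
```

In this sketch, two units that fire in alternating 10 ms bins are perfectly anti-correlated at zero lag and perfectly correlated at a lag of one bin, which is the kind of periodic structure a fixed-frequency stimulation sequence would be expected to produce.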

Intriguingly, although the pair-wise spiking correlations for experiments 1 and 2 were very similar for epochs of spontaneous activity before and during multichannel stimulation (**Figures 7Ci,ii**), they were remarkably different once the stimulator was turned off (**Figure 7Ciii**). When AP5 was not present during the clamping phase (experiment 1), the firing correlation between units appeared to be enhanced following multichannel stimulation. In contrast, pair-wise correlations were almost non-existent following a population clamp in which AP5 was present (experiment 2). Because the firing statistics (firing rate and correlation structure) during the 6-h clamping period were nearly identical for both experiments 1 and 2, this effect on the correlation structure of network activity cannot be due to effects on firing activity, but required blocking NMDAergic transmission. Without the closed-loop controller in place, AP5 would have affected network activity levels, obfuscating the mechanism of AP5's effect.

This case study demonstrates the ability of the closed-loop controller to quickly adapt to drug-induced changes in network excitability, to decouple network variables that are normally causally intertwined, and to operate robustly over many hours. Additionally, this case study demonstrates NeuroRighter's ability to control peripheral equipment aside from electrical stimulators.

## **3.4. REAL-TIME SEIZURE INTERVENTION IN FREELY MOVING RATS**

Aside from *in vitro* recording hardware, NeuroRighter can interface with many different types of neural probes, including those designed to record from and stimulate freely moving animals. To demonstrate this, we performed electrical micro-stimulation in response to paroxysmal activity of hippocampal recordings taken from a rat with induced temporal lobe epilepsy. Many studies have shown potentially therapeutic effects of electrical stimulation on epileptic brain tissue, which could serve as an alternative to pharmacological or surgical treatment methods. For instance, electrical stimulation triggered by characteristic field potential abnormalities can potentially abrogate seizures and lead to a decreased frequency of behavioral symptoms (Mormann et al., 2007; Morrell, 2011; Nelson et al., 2011).

We used the plugin API to create a closed-loop protocol that could detect temporal lobe seizures in freely moving rats and react with multi-electrode stimulation (**Figure 8A**). This control scheme is similar to that of the NeuroPace responsive neurostimulation system (Sun et al., 2008) (NeuroPace Inc., Mountain View, CA, USA), with the exception that we used multi-micro-electrode stimulation instead of driving a single macroelectrode.

Rats were rendered epileptic using focal injections of tetanus toxin into the right-dorsal hippocampus (Hawkins and Mellanby, 1987; see section 4C in the Supplementary Material). LFPs were recorded from CA1 and CA3 regions of the hippocampus using a chronically implanted 16-channel microwire array (Tucker-Davis Technologies, Alachua, FL; **Figure 8B**). The microwire array consisted of two rows of electrodes, with 8 electrodes per row.

**FIGURE 7 | Long-term population clamp. (A)** (i) The mean stimulation voltage (black) and individual electrode stimulation voltages (gray) over the course of the 6-h clamping protocol. (ii) The neuronal firing rate (black) compared to the target rate (red line). (iii) Individual unit firing rates, sorted in order of increasing rate during the 1 h period prior to the start of closed-loop control. (iv) Zoomed rastergrams showing short time scale network spiking before, during and after the controller was engaged. **(B)** Same as **(A)** except that AP5 was added 1 h after the start of the closed-loop controller and removed 4 h later. This is indicated by the arrows at the top of the figure. **(C)** Average pair-wise correlation functions between units for experiments with and without AP5 application (red and black lines, respectively). Cross-correlations were created from spiking data (i) during spontaneous activity before the closed-loop controller was engaged, (ii) half-way through the closed-loop-control period, and (iii) during spontaneous network activity following closed-loop control. The data used to create the correlation functions is centered about locations used to create the rastergrams shown in **(A**iv**)** and **(B**iv**)**. To create the correlation functions, unit firing rates were calculated using 10 ms time bins.

multichannel electrical stimulation through the recording electrodes via a stimulation multiplexing board (green). **(B)** Implantation sites of the microwire array. Top view shows the electrode penetration sites (black dots) in the right-dorsal hippocampus. The red line indicates position of except with closed-loop stimulation engaged. Electrical stimulation was applied on electrode 1 along with nine other electrodes (not shown). Red dots indicate stimulation times for e01 and stimulation artifacts appear on the LFP trace. e05–e07 and e11 were not used for stimulus application.

Multi-electrode stimulation was triggered in response to detected seizures while the rat moved around its cage. To accomplish this, a "line length" measure, which has been shown to be effective for threshold-based seizure detection, was calculated online on each LFP channel (Esteller et al., 2001). A line length increment for a single LFP channel is defined as the absolute difference between successive samples of the LFP,

$$l_k[t] = \left|x_k[t] - x_k[t - T_s]\right|, \tag{3}$$

where *x<sub>k</sub>*[*t*] is the LFP value on the *k*th channel at time *t*, and *T<sub>s</sub>* is the LFP sampling period of 500 µs. *l<sub>k</sub>*[*t*] was passed through a first-order averaging filter,

$$L_k^{\tau_{\rm filt}}[t+T_s] = l_k[t] + \exp\left(\frac{-T_s}{\tau_{\rm filt}}\right) \cdot \left(L_k^{\tau_{\rm filt}}[t] - l_k[t]\right), \tag{4}$$

where τ<sub>filt</sub> is the filter time constant. For each recording channel, we calculated *L<sub>k</sub>*<sup>τ<sub>filt</sub></sup>[*t* + *T<sub>s</sub>*] using two values of τ<sub>filt</sub>, 1 and 60 s, which resulted in short and long time averages that could be compared to detect rapidly occurring trends in *l<sub>k</sub>*[*t*]. Specifically, seizures were defined as events for which the criterion

$$L_k^{1\,\text{sec}}[t] > 2 \cdot L_k^{60\,\text{sec}}[t] \tag{5}$$

was met on at least 4 of the 16 recording channels. Upon seizure detection, 10 randomly chosen electrodes were stimulated sequentially at 45 Hz (aggregate frequency) for 10 s using biphasic, 1 V, 400 µs per phase, square waves. **Figures 8C,D** show seizure events without and with closed-loop stimulation engaged. During stimulus application, *L<sub>k</sub>*<sup>τ<sub>filt</sub></sup>[*t*] values were frozen to prevent stimulation artifacts from affecting the line length averages.
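Equations (3)–(5) and the channel-count rule can be sketched compactly in Python. This is a minimal illustration under stated assumptions rather than the plugin used in the experiment: the class and function names are hypothetical, the initial filter state is an arbitrary choice, and the freezing of the averages during stimulation is not modeled.

```python
import math


def line_length_increment(current_sample, previous_sample):
    """Equation (3): absolute difference between successive LFP samples."""
    return abs(current_sample - previous_sample)


class LineLengthFilter:
    """Equation (4): first-order averaging filter for one channel."""

    def __init__(self, tau_s, ts_s=500e-6, initial=0.0):
        self.decay = math.exp(-ts_s / tau_s)  # exp(-T_s / tau_filt)
        self.value = initial                  # L_k[t]; start value is assumed

    def update(self, increment):
        # L[t + Ts] = l[t] + exp(-Ts / tau) * (L[t] - l[t])
        self.value = increment + self.decay * (self.value - increment)
        return self.value


def channels_over_threshold(short_avgs, long_avgs, ratio=2.0):
    """Equation (5): count channels whose 1 s average exceeds ratio x 60 s average."""
    return sum(1 for s, l in zip(short_avgs, long_avgs) if s > ratio * l)


def seizure_detected(short_avgs, long_avgs, min_channels=4, ratio=2.0):
    """Seizure criterion: Equation (5) met on at least min_channels channels."""
    return channels_over_threshold(short_avgs, long_avgs, ratio) >= min_channels
```

In use, each channel would own two `LineLengthFilter` instances, one with `tau_s=1.0` and one with `tau_s=60.0`, both updated with the same increment at every sample; `seizure_detected` is then evaluated on the two lists of current filter values.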

There was no easily discernible effect of microstimulation on seizure duration or intensity during this pilot experiment. However, this proof of concept demonstrates the API's utility for conducting experiments on freely moving animals and for modulating aberrant neural activity states. These features are useful for testing stimulation algorithms that do not just react to a seizure occurrence, but *predict* oncoming seizures ahead of time in order to apply a preventative action, which has proven a difficult goal to achieve (Mormann et al., 2007).

## **3.5. SILENT BARRAGE AND ROBOTIC EMBODIMENT**

The complexity of neural systems often necessitates intricate experimental protocols for proper investigation. To meet this requirement, the plugin API can be used to integrate NeuroRighter with complicated configurations of external hardware and software. Working in collaboration with the SymbioticA group at the University of Western Australia, we used NeuroRighter for intercontinental neural control of a robotic system. This project was part of an art-science collaboration called Silent Barrage (Zeller-Townson et al., 2011), in which a dissociated cortical culture in Atlanta, Georgia, USA, was embodied with a remote array of robotic drawing machines situated in an interactive art gallery<sup>9</sup>. This system is an extension of the MEART project (Bakkum et al., 2007).

**Figure 9A** shows an illustration of the Silent Barrage system. Using the plugin API, a protocol was written to communicate between NeuroRighter and a custom web server running on the same computer. The web server in turn communicated with a client computer controlling a robotic body consisting of 32 independent robots. Each robot had a rotating actuator capable of climbing up and down a vertical column (**Figure 9C**). Columns were arranged in a grid that reflected the electrode layout of the MEA (**Figures 9A,B**). The height of each rotating actuator at a given moment was determined by the instantaneous firing rate detected on two adjacent electrodes from the 59-channel MEA. As the actuators traveled up and down, they periodically marked their positions on the vertical poles using an ink pen. Over time, this resulted in a visual record of spatiotemporal activity of the culture inscribed on each column (**Figure 9C**).

Silent Barrage was exhibited in the United States (New York), Spain (Madrid), Brazil (São Paulo), Ireland (Dublin), and China (Beijing). Visitors to the exhibitions were encouraged to mingle amongst the robotic embodiment and they were observed using overhead cameras (**Figures 9A,B**). The resulting video feed was processed on site to extract features of audience movement (Horn and Schunck, 1981) and these data were streamed back to NeuroRighter's web server in Atlanta. Audience movement measures were then used to adjust stimulation patterns delivered through NeuroRighter's all-channel stimulator. The relationship between incoming video data and electrical stimulation varied from exhibit to exhibit, from simple single-electrode rate coding schemes to more complex multi-electrode schemes where artificial neural networks were used to deliver particular stimulus patterns based upon learned features of incoming video data. Electrical stimulation modulated the culture's firing patterns, thus closing the loop around the dissociated culture, robotic body, and audience members separated by thousands of kilometers. While on exhibit in the National Art Museum of China, Silent Barrage was perhaps the Earth's largest behaving "organism."

## **4. DISCUSSION**

Closed-loop electrophysiology systems are powerful tools for neuroscience research because they can be used to parse recurrent systems into independently manipulable components. Voltage clamp techniques use feedback control to separate membrane potential from the recurrent influence of voltage-dependent ionic conductances (Marmont, 1949). Seminal experiments using voltage clamp have fostered our understanding of ion channels, neuronal excitability, and synaptic transmission. More recently, dynamic clamp has been used to deliver artificial transmembrane or synaptic conductances into living neurons (Prinz et al., 2004; Kispersky et al., 2011). Using these approaches, feedback control transforms dynamic features of *individual neurons* into controlled experimental variables. Similarly, closed-loop multichannel systems like NeuroRighter can transform features of *neural networks* into controlled experimental variables (Arsiero et al., 2007). NeuroRighter is a powerful tool for controlling network variables, improving upon currently available systems in terms of cost, usability, accessibility, extensibility, and hardware standardization (Wagenaar et al., 2006a; Stirman et al., 2011; Wallach et al., 2011; Ahrens et al., 2012). We have thus demonstrated NeuroRighter's power in conducting basic and translational neuroscience research through a variety of case studies.

<sup>9</sup>http://silentbarrage.com/

Altered gene expression, synaptic input, or environmental conditions can induce changes in spiking activity, which in turn trigger activity-dependent processes. Because of this, it becomes difficult to distinguish the role these factors play in shaping network dynamics and neural plasticity independent of firing rate. Closed-loop multichannel feedback systems provide an opportunity to render the population firing rate a controlled experimental variable and enable study of cellular and network processes as a function of a defined activity state. We used NeuroRighter to clamp the firing rate of a living neural network to user-defined setpoints over both short and long timescales (Sections 3.2, 3.3). Further, we were able to control population firing rate during prolonged application of the NMDA receptor antagonist, AP5 (Section 3.3). Our controller compensated for the loss of NMDA-mediated excitation and maintained network spiking at the target firing rate. Therefore, the effects of AP5 could be deduced through comparison with a control culture that underwent an identical clamping protocol but with intact synaptic transmission. In most studies that use long-term drug application, the individual roles of spiking and excitatory neurotransmission on plasticity are ambiguous (Turrigiano, 2011). By using a real-time multichannel feedback system, we have begun to unravel the independent effects of spiking and NMDAergic transmission on network behavior. This approach could also be used to more directly study the effects of altered genetic or environmental factors on network activity.

In addition to better controlled experimental variables, real-time feedback can be used to improve the relevance of experiments using reduced neural preparations in studies of behavior. Implicit in animal behavior is the interplay between motor output and sensory perception (e.g., head movement affects the visual input stream and vice-versa). While reduced neural preparations or immobilized animals provide excellent experimental accessibility, their major weakness is that they do not preserve a functional sensory-motor loop. We have demonstrated that NeuroRighter is well-equipped for performing closed-loop experiments that restore the sensory-motor loop by interfacing living neural networks with artificial bodies (Section 3.5). The advantages of this approach over traditional open-loop techniques are twofold. First, neural systems can engage in "motor" behaviors without sacrificing delicate optical (Ahrens et al., 2012) or electrophysiological (Harvey et al., 2009) access due to actual motion. Second, the experimenter has complete control over the mapping between a recorded neural signal and its resulting "motor" effect (DeMarse et al., 2001; Ahrens et al., 2012). For example, Ahrens et al. (2012) recently examined optomotor adaptation in paralyzed larval zebrafish by embedding them in a virtual environment. Visual stimuli in the virtual environment provided a perception of motion, and induced fictive motor-nerve activity. Recorded motor-nerve activity was used to drive motion of the virtual environment. Changes in sensory-motor feedback gain could be achieved by adjusting the efficacy with which fictive motor patterns propelled the fish through its virtual world. All the while, full brain activity was recorded through single-cell resolution imaging, which would be nearly impossible to achieve in a freely moving animal. This study highlights how closed-loop interfaces between artificial bodies or environments and a living neural system allow excellent experimental access during behaviors requiring an intact sensory-motor loop.

Aside from basic research, closed-loop multichannel electrophysiology has possible medical applications. Predictive application of drugs or electrical stimulation has the potential to increase the efficacy and safety of treatments for various neurological disorders (Mormann et al., 2007; Rosin et al., 2011) and improve neural rehabilitation procedures (Jackson et al., 2006a). For example, a reliable seizure prediction algorithm would open the possibility for targeted interventions that abort seizures before they occur. Mormann et al. (2007) provide an extensive comparison of different methods for seizure prediction. Unfortunately, the outlook for the clinical applicability of these algorithms remains pessimistic, and future studies will require a high-throughput validation system to test the robustness of seizure prediction algorithms under a variety of circumstances. We have demonstrated that NeuroRighter can be used for this purpose (Section 3.4). The stimulation algorithm we used is very similar to a method called responsive neurostimulation (NeuroPace Inc., Mountain View, CA, USA) that recently showed very promising results in a large, double-blind, pivotal clinical trial (Morrell, 2011). This form of closed-loop seizure modulation is not truly predictive as it was triggered on the occurrence of "unequivocal seizure onset" (Litt and Echauz, 2002). However, the API provides a means for easy reconfiguration in order to test alternative, predictive methods to abort seizures before they begin, using multichannel electrical stimulation or the local application of an anti-convulsive drug. Additionally, a plugin could be reconfigured for closed-loop modulation of other pathological neuronal activities or to facilitate motor rehabilitation (Jackson et al., 2006a).

Tools that enable closed-loop interaction with neural tissue at the network level have great potential to advance experimental neuroscience. Historically, open-source projects have been extremely good at adapting equipment and code designed for a singular purpose to other uses. For this reason, we envision a large role for open-source software and open-access hardware communities in the development of technologies for closed-loop electrophysiology systems. Rapid improvements in microprocessor performance, embedded computer systems, on-chip multichannel signal processing, and A/D conversion technology must be matched by projects that can expose their powerful features to researchers with little or no background in embedded systems or computer science. NeuroRighter is one of several open-source hardware/software projects that are enabling more labs to carry out sophisticated electrophysiology with less money and more experimental flexibility<sup>10</sup>.

## **ACKNOWLEDGMENTS**

This work was supported by NSF COPN grant 1238097 and NIH grant 1R01NS079757-01, NSF GRFP Fellowship 08-593 to Jonathan P Newman, NSF GRFP Fellowship 09-603 to Ming-fai Fong, and NSF IGERT Fellowship DGE-0333411 to Jonathan P Newman and Ming-fai Fong. Sharanya Arcot Desai was supported by a Faculty for the Future fellowship, provided by the Schlumberger Foundation. We thank Guy Ben-Ary, Phil Gamblen, Peter Gee, Stephen Bobic, and Douglas Swehla for their contributions to the Silent Barrage robotic embodiment project. We thank J. T. Shoemaker for performing tissue harvests. We thank Ted French for his work creating the gas-permeable perfusion caps for delivery of AP5. Finally, we gratefully acknowledge all those who have contributed to NeuroRighter's hardware forums and supplied bug reports to the NeuroRighter code repository. Jon Erickson and Neal Laxpati have been especially helpful in this regard.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Neural\_Circuits/10.3389/fncir.2012.00098/abstract

<sup>10</sup>http://www.its.caltech.edu/~daw/meabench/ http://code.google.com/p/arte-ephys/ http://open-ephys.com/ http://www.backyardbrains.com/Home.aspx




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 August 2012; accepted: 18 November 2012; published online: 18 January 2013.*

*Citation: Newman JP, Zeller-Townson R, Fong M-F, Arcot Desai S, Gross RE and Potter SM (2013) Closed-loop, multichannel experimentation using the open-source NeuroRighter electrophysiology platform. Front. Neural Circuits 6:98. doi: 10.3389/fncir.2012.00098*

*Copyright © 2013 Newman, Zeller-Townson, Fong, Arcot Desai, Gross and Potter. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## *In vitro* large-scale experimental and theoretical studies for the realization of bi-directional brain-prostheses

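*Paolo Bonifazi<sup>1†</sup>, Francesco Difato<sup>2†</sup>, Paolo Massobrio<sup>3†</sup>, Gian L. Breschi<sup>2</sup>, Valentina Pasquale<sup>2</sup>, Timothée Levi<sup>4</sup>, Miri Goldin<sup>1</sup>, Yannick Bornat<sup>4</sup>, Mariateresa Tedesco<sup>3</sup>, Marta Bisio<sup>2</sup>, Sivan Kanner<sup>5</sup>, Ronit Galron<sup>5</sup>, Jacopo Tessadori<sup>2</sup>, Stefano Taverna<sup>2</sup> and Michela Chiappalone<sup>2</sup>\**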

*<sup>1</sup> School of Physics and Astronomy, Tel Aviv University, Tel Aviv, Israel*

*<sup>5</sup> Department of Neurobiology, George S. Wise Faculty of Life Sciences, Sagol School of Neuroscience, Tel Aviv University, Tel-Aviv, Israel*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Suguru N. Kudoh, Kwansei Gakuin University, Japan; Hsin Chen, National Tsing-Hua University, Taiwan; Jason G. Fleischer, The Neurosciences Institute, USA; Pavel M. Itskov, Champalimaud Foundation, Portugal*

#### *\*Correspondence:*

*Michela Chiappalone, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, Genova, 16163, Italy.*

*e-mail: michela.chiappalone@iit.it*

*†These authors have contributed equally to this work.*

Brain-machine interfaces (BMI) were born to control "actions from thoughts" in order to recover motor capability of patients with impaired functional connectivity between the central and peripheral nervous system. The final goal of our studies is the development of a new proof-of-concept BMI—a neuromorphic chip for brain repair—to reproduce the functional organization of a damaged part of the central nervous system. To reach this ambitious goal, we implemented a multidisciplinary "bottom-up" approach in which *in vitro* networks are the paradigm for the development of an *in silico* model to be incorporated into a neuromorphic device. In this paper we present the overall strategy and focus on the different building blocks of our studies: (i) the experimental characterization and modeling of "finite size networks" which represent the smallest and most general self-organized circuits capable of generating spontaneous collective dynamics; (ii) the induction of lesions in neuronal networks and the whole brain preparation with special attention on the impact on the functional organization of the circuits; (iii) the first production of a neuromorphic chip able to implement a real-time model of neuronal networks. A dynamical characterization of the finite size circuits with single cell resolution is provided. A neural network model based on Izhikevich neurons was able to replicate the experimental observations. Changes in the dynamics of the neuronal circuits induced by optical and ischemic lesions are presented respectively for *in vitro* neuronal networks and for a whole brain preparation. Finally the implementation of a neuromorphic chip reproducing the network dynamics in quasi-real time (10 ns precision) is presented.

**Keywords:** *In vitro* **modular networks, whole brain, lesioned circuits,** *in silico* **neuronal circuit, hardware spiking neural network**

## **INTRODUCTION**

Millions of people worldwide are affected by neurological disorders that disrupt connections between brain and body, causing paralysis, or impair cognitive capabilities. This number is likely to increase in coming years, yet current assistive technology is still limited. Over the last decade Brain-Machine Interfaces (BMIs) and neuro-prostheses (Nicolelis, 2003; Hochberg et al., 2006, 2012; Nicolelis and Lebedev, 2009) have been the object of extensive research and offer the promise of treatment for such disabilities. These devices could profoundly improve the quality of life for affected individuals, and could have a more widespread impact on society.

Neural interfaces have mainly been devoted to restoring motor function that is lost due to injuries at the level of the spinal cord (Collinger et al., 2013), or to recover sensorial capacities, e.g., artificial retinal or cochlear implants (Chader et al., 2009). However, recent interest has also focused on neural prostheses for restoring cognitive functions. For example, a hippocampal prosthesis for improving memory function in behaving rats was recently presented (Berger et al., 2011, 2012), and the same group has also tested a device in primate prefrontal cortex aimed at restoring impaired cognitive functions (Hampson et al., 2012; Opris et al., 2012).

The realization of such prostheses implies that we know how to interact with neuronal cell assemblies, taking into account the intrinsic spontaneous activation of neuronal networks and understanding how to drive them into a desired state in order to produce a specific behavior. The long-term goal of replacing damaged brain areas with artificial devices requires neural network-like prosthetics or models that could be fed with recorded electrophysiological patterns and that could provide a substitute output to recover the desired functions. While ultimately this approach must be tested and applied *in vivo*, important insights could be gained using *in vitro* systems of increasing architectural complexity, which can be more easily and thoroughly accessed, monitored, manipulated, and modeled than *in vivo* systems (at least at present).

*<sup>2</sup> Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genova, Italy*

*<sup>3</sup> Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genova, Genova, Italy*

*<sup>4</sup> IMS laboratory, UMR CNRS 5218, University of Bordeaux, Talence, France*

The final goal of the studies presented in this paper is to develop a test-bed for the development of a new generation of neuro-prostheses capable of restoring lost communication between neuronal circuits. These studies constitute the object of the European project BRAIN BOW (www.brainbowproject.eu). Healthy and lesioned *in vitro* neuronal circuits are characterized in parallel to the development of *in silico* neuronal networks, with the goal of establishing bi-directional communication to mimic or bypass an injured neuronal network. In order to develop an experimental and computational platform for the prototyping of neuro-prostheses, we followed a bottom-up approach using *in vitro* biological neuronal systems with increasing structural complexity. Our approach takes advantage of the unique features of *in vitro* neuronal cultures, which represent a powerful experimental model to investigate the inherent properties of neuronal cell assemblies as a complement to artificial computational models. We use engineered networks of increasing structural complexity, from isolated finite-size networks up to interacting assemblies, as a model of intercommunicating neuronal circuitries. Moreover, we scaled our studies up to the isolated whole guinea-pig brain (IWB), to translate to an *in vivo* model.

In this paper we present the overall multidisciplinary strategy and preliminary results on the different building blocks of the project. The structure-function relationship of "finite size circuits" was characterized with single cell resolution by combining calcium imaging and immunocytochemistry. Similarly to what was previously observed in isolated neuronal clusters (Shein-Idelson et al., 2010), we found that the frequency of synchronous network events increased with circuit size. This result was reproduced by *in silico* neural network models based on Izhikevich neurons with scale-free connectivity. The feasibility of controlled network lesions was explored by optically transecting cell processes and monitoring the subsequent change in functional network connectivity. In addition, in a whole brain preparation, a focal ischemic lesion in the hippocampus was demonstrated to cause an interruption of the limbic olfactory pathway. Finally, a neural network hardware model with arbitrary connectivity based on Izhikevich neurons, working at nanosecond time scale, is presented. These experimental and computational platforms represent a starting point for restoring functional closed-loop communication in a neuronal network with lesioned circuitries.

## **MATERIALS AND METHODS**

#### **EXPERIMENTAL MODELS**

The repertoire of activity patterns exhibited by an *in vitro* neural network is strongly dependent on the complexity of its geometry (Shein-Idelson et al., 2011). While homogeneous networks (**Figure 1A**) tend to display highly stereotyped bursts which spread to most of the connected cells (Kamioka et al., 1996; Van Pelt et al., 2004; Chiappalone et al., 2006; Eytan and Marom, 2006), networks composed of smaller sub-networks with sparse connections (**Figure 1C**) usually present non-repetitive patterns of sparse spiking and local bursts (Macis et al., 2007; Shein-Idelson et al., 2010). The first cellular model proposed in this work is that of a finite size network (**Figure 1B**), namely an isolated neuronal circuit consisting of a small number of neurons (dozens to a few hundred) that is still able to spontaneously produce bursts similar to those observed in larger homogeneous networks (cf. section "Results"). Characterization of activity within these assemblies could allow their use as building blocks for larger, more complex structures of interconnecting sub-networks. At the other end of the complexity spectrum we set the isolated whole brain of a guinea pig (**Figure 1D**). This model is used to investigate the properties of one complex functional neuronal assembly (the olfactory tract, see below) embedded in an intact brain (cf. section "Results").

## *Finite size networks: patterning, cell culture, and calcium imaging*

The procedure adopted for the preparation of "finite size networks" is in accordance with the NIH standards for care and use of laboratory animals and was approved by the Tel-Aviv University Animal Care and Use Committee.

Cultures were prepared as described in Herzog et al. (2011). After the fourth day *in vitro*, the growth medium was enriched with 0.5% Pen-Strep (Biological Industries Beit Haemek), 2% B-27 (Gibco), and 0.75% glutamax (Biological Industries Beit Haemek). Cells were plated at a density of 750 cells/mm<sup>2</sup> on a 23 mm square glass coverslip previously glued on a 35 mm petri dish. Coverslips were coated with spots of poly-D-lysine (PDL, Sigma), and petri dishes were homogeneously coated with PDL. The cells attaching homogeneously on the free surface of the petri dish (i.e., not covered by the glass coverslip) functioned as a "supporting network" (Kleinfeld et al., 1988). PDL spots were created using either manual drop deposition or polydimethylsiloxane (PDMS) stencils. For manual drop deposition, an Eppendorf pipette with a tip of 10 μl capacity was used. The spots were created by touching the tip filled with 2 μl PDL on the coverslip surface and then drying the coverslips at 37 °C for 30 min.

When PDMS stencils were used, the procedure to create PDL spots was based on a soft lithography process, as described in Sorkin et al. (2006). Briefly, an SU8-2075 (Micro Chem) mould on a silicon wafer with a feature thickness of approximately 200 μm was used to shape the PDMS. The feature was composed of squares of 700 μm × 700 μm separated by at least 1 mm, in order to obtain isolated neuronal islands. The size of the square was chosen to fit the field of view of a 10× objective in the calcium imaging setup described in detail below and in Herzog et al. (2011). Once the PDMS substrate was shaped and dried on the silicon wafer, the PDMS stencils were detached and placed directly on the glass coverslips. Drops of the PDL solution were dripped onto the PDMS stencil until the features were completely covered. After mild vacuum degassing for 15 min, the excess PDL solution was removed and the sample was dried at 37 °C for 30 min. The PDMS stencil was removed before cell plating.

Calcium imaging of the patterned neuronal networks grown on coverslips was performed in buffered-ACSF solution (containing, in mM, 10 HEPES, 4 KCl, 1.5 CaCl<sub>2</sub>, 0.75 MgCl<sub>2</sub>, 139 NaCl, 10 D-glucose, adjusted with sucrose to an osmolarity of 325 mOsm, and with NaOH to a pH of 7.4). In order to load the cells with the calcium-sensitive dye, cultures were incubated for 30 min in 1 ml ACSF supplemented with 1 μl of 10% pluronic acid F-127 (Biotium 59000) and 1 μl Oregon-Green BAPTA-I AM (Invitrogen 06807) previously diluted with 7.6 μl anhydrous-DMSO. Following incubation, cultures were washed with ACSF and recorded at 37 °C. In order to avoid artifacts due to evaporation and pH change, the ACSF was replaced every 20 min during the recording session.

Calcium-fluorescence images were acquired with an EMCCD camera (Andor Ixon-885) mounted on an upright Olympus microscope (BX51WI) using a 10× water-immersion objective (Olympus, NA 0.4). Fluorescent excitation was provided via a 120 W mercury lamp (EXFO X-Cite 120PC) coupled to the microscope optical axis with a dichroic mirror, and equipped with an emission filter matching the dye spectrum (Chroma T495LP). Images were acquired at 59 fps in 2 × 2 binning mode using Andor software data-acquisition card (SOLIS) installed on a personal computer.

#### *Immunocytochemical staining*

At the end of calcium-imaging experiments, cultures were washed twice with PBS, then fixed with 4% PFA (15 min) and left in PBS for no more than 5 days before staining. For immunocytochemical staining, fixed cultures were washed three times with PBS (10 min each) and then incubated with 1% Triton X-100 in PBS for 30 min. Cultures were blocked with 2% BSA, 10% normal serum and 0.5% Triton X-100 in PBS for 2 h at room temperature. The cultures were incubated overnight with the first primary antibody (GAD67, 1:250, Millipore, MAB5406) in blocking solution at 4 °C. The next day cultures were incubated with the second primary antibody (MAP2, 1:500, Chemicon, AB5622) overnight at 4 °C. Cultures were then washed three times with TBS and incubated with the secondary antibodies in 2% BSA, 2 mM CaCl<sub>2</sub> in TBS for 1 h at room temperature. After being washed three times with TBS, the cultures were mounted with aqueous mounting medium containing DAPI (Vector).

#### *In vitro whole brain*

Young adult Hartley guinea pigs (150–300 g, Charles River) were used for IWB recordings. All procedures were approved by the Italian Department of Health and were conducted in accordance with FELASA guidelines and Italian and European directives (DL 116/92 and 2010/63/EU). Animals were anesthetized with sodium thiopental (125 mg/kg, i.p.) and transcardially perfused with a cold (4 °C), oxygenated (95% O<sub>2</sub>, 5% CO<sub>2</sub>) saline solution composed of 126 mM NaCl, 3 mM KCl, 1.2 mM KH<sub>2</sub>PO<sub>4</sub>, 1.3 mM MgSO<sub>4</sub>, 2.4 mM CaCl<sub>2</sub>, 26 mM NaHCO<sub>3</sub>, 15 mM glucose, 2.1 mM HEPES, and 3% dextran (MW 70,000). The pH of the solution was corrected to 7.1 with 1 N HCl. After assessing the absence of nociceptive and ocular reflexes, the brain was gently dissected out of the skull, transferred to a recording chamber, and perfused at 7 ml/min with the above solution (pH = 7.3, 15 °C) via a peristaltic pump (Minipulse II, Gilson, France) through a cannula inserted in the basilar artery (**Figure 5**). Prior to recording, the temperature of the preparation was gradually increased to 32 °C (0.2 °C/min) (Llinas et al., 1981; Muhlethaler et al., 1993; De Curtis et al., 1998). In order to induce an ischemic insult in the hippocampal formation, a silk thread was positioned under the left rostral and caudal posterior cerebral arteries [r- and c-PCA, see Librizzi et al. (1999)] and a loose knot was prepared around the vessels. The flow was interrupted by pulling the thread ends to tighten the knot (**Figure 5**) (Pastori et al., 2007).

#### **READ-OUT SYSTEMS**

#### *Optical manipulation and recording system for in vitro neural networks*

The optical system combined a laser dissector with a microscope for simultaneous fluorescence and bright field imaging during electrophysiological recording of neural network activity, as previously described (Difato et al., 2011a).

The light source used to perform calcium fluorescence imaging was composed of TTL modulable laser diodes (TECBL-15 G-473-TTL-FC, World Star Tech. Inc., USA) coupled to the microscope (BX51, Olympus, Italy) through a circle top-hat engineered diffuser (ED1-C20-MD, Thorlabs, Optoprim, Italy) to remove laser speckles. A pair of UV doublets (Thorlabs, Optoprim, Italy) coupled the laser light to the microscope objective (60×, 0.9 NA water dipping). The laser light was focused on the back focal plane of the microscope objective to produce a homogeneous wide field illumination on the sample. A light emitting diode at 590 nm wavelength served as the bright field illumination source (M590L2, Thorlabs, Optoprim, Italy). The wavelength of the diode was chosen to avoid interference with the emission spectra of the fluorochrome (Fluo4-AM, Invitrogen) used to label the sample. A dichroic mirror separated the light coming from the sample (green and red portion of light spectra) onto two cameras. Green emission light was directed to CCD1 (V887ECSUVB EMCCD, Andor, Lot Oriel, Italy), which acquired the calcium fluctuations due to network activity, and the red portion of the light spectra was directed to CCD2 (Pilot PIA1000-48GM, Basler, Advanced Technologies, Italy) to perform bright-field imaging. The CCD image acquisitions and light sources were synchronized with a TTL signal coming from a D/A board (PCI-6529, National Instruments, Italy). The use of TTL-modulable light sources for fluorescence and bright field imaging allowed a precisely timed illumination of the sample, thereby reducing phototoxicity and facilitating long term calcium imaging of neural networks. Bright-field images were acquired at 1 Hz to detect network topography before and after laser dissection of network connections. Cells were previously incubated for 10 min with 5 μM Fluo-4 AM (Invitrogen, Italy). To monitor the neural network activity before and after laser induced network lesions, calcium imaging was performed at 60 Hz (light exposure of 3 ms each frame, at an average power at the sample of 60 μW).

Cells were kept under the microscope at 35 °C using a Peltier device (QE1 resistive heating with TC-344B dual channel heater controller, Warner Instruments, Italy). For neuronal cultures plated on Petri dishes, pH and humidity were controlled by aerating a custom-designed polydimethylsiloxane (PDMS) sleeve, which integrated the objective for optical access, with humidified carbogen (95% O<sub>2</sub>, 5% CO<sub>2</sub>).

A pulsed, sub-nanosecond UV Nd:YAG laser at 355 nm (PowerChip nano-Pulse UV laser PNV-001525-040, Teem Photonics, Italy) served as the source for performing laser microdissection experiments. The diaphragm of the epi-illuminator was substituted by a narrow-band laser mirror, which reflects 355 nm laser light while passing all other wavelengths coming from the laser diodes used for fluorescence microscopy (DM6, TLM1-350-45-P, CVI, Italy), thus allowing fluorescence imaging and laser dissection to be performed simultaneously. Damage to the neural network was inflicted with the laser pulse repetition rate set at 100 Hz and an average power at the sample of about 4 μW.

#### *Electrophysiological system for the in vitro whole brain*

Extra- and intracellular recordings were performed simultaneously in piriform and medial entorhinal cortex (PC and m-ERC). To test the viability of the preparation throughout the experiment, we monitored evoked local field potentials (LFPs) in PC and m-ERC in response to the electrical stimulation (0.5–3 mA, 0.3 ms) of the lateral olfactory tract (LOT) using custom-made bipolar electrodes made of twisted, insulated silver wires. Intracellular recordings were performed with sharp micropipettes filled with 3 M potassium acetate (input resistance 70–110 MΩ) and attached to an electronically controlled micromanipulator (Sutter Instruments, Novato, CA, USA). Signals were amplified by an intracellular amplifier (IR-283A Cygnus Technology, PA, USA). Field potentials were recorded using glass pipettes filled with 0.9% NaCl (resistance 2–5 MΩ) or microwire arrays (Tucker-Davis Technologies, Alachua, FL, USA) featuring 16 tungsten planar recording wires (filament diameter 50 μm, tip angle 45°), each separated by 250 μm (impedance 30–40 kΩ). The extracellular signals were acquired using a PBX3 preamplifier (Plexon, Dallas, TX, USA) configured to separately process spikes (150 Hz–8 kHz bandwidth) and local field potentials (0.7–300 Hz).

Data were digitized at 25 kHz using a PCI-6071E A/D board (National Instruments, Austin, TX, USA) and stored on the hard drive of a personal computer. Recordings were performed using ELPHO software developed by Dr. Vadym Gnatkovsky at the C. Besta Neurological Institute (Milan, Italy).

## **COMPUTATIONAL MODEL**

In the following sections we will present the computational model used to mimic the dynamics expressed by finite size networks (cf. section "Experimental Models").

## *Neuron model*

The neuron model used for the finite size networks is based on the Izhikevich equations (Izhikevich, 2003). The dynamics of this model depend on four parameters that, correctly chosen, reproduce the spiking behavior and voltage traces of specific types of cortical neurons. From a mathematical point of view, the model is described by a two-dimensional system of ordinary differential equations.

$$\frac{dv}{dt} = 0.04v^2 + 5v + 140 - u + I_{\text{syn}} + I_{\text{noise}} \tag{1}$$

$$\frac{du}{dt} = a(bv - u)\tag{2}$$

with the after-spike resetting conditions:

$$\text{if } v \ge 40\,\text{mV} \to \begin{cases} v \gets c \\ u \gets u + d \end{cases} \tag{3}$$

In Equations (1–3), *v* is the membrane potential of the neuron, *u* is a membrane recovery variable which takes into account the activation of K<sup>+</sup> and inactivation of Na<sup>+</sup> channels; *I*<sub>syn</sub> describes the synaptic input from other neurons; *I*<sub>noise</sub> is a current source generator introduced to model the spontaneous subthreshold electrophysiological activity of the neurons. Practically, we introduced a stochastic source of noise (modeled according to an Ornstein-Uhlenbeck process) to each neuron described as follows:

$$dI_{\text{noise}} = -\frac{I_{\text{noise}}}{\tau_I}\,dt + \frac{m_I}{\tau_I}\,dt + s_I\sqrt{\frac{2\,dt}{\tau_I}}\,\xi_t\tag{4}$$

In Equation (4) the quantity ξ<sub>*t*</sub> is a white noise with zero mean and unit variance. In this way, *I*<sub>noise</sub> is Gauss-distributed at any time *t* and, after a transient of magnitude τ<sub>*I*</sub> (correlation length), converges to a process with a mean equal to *m<sub>I</sub>* and standard deviation *s<sub>I</sub>*. For the simulation, we set τ<sub>*I*</sub> = 1 ms, *m<sub>I</sub>* = 25 pA, and *s<sub>I</sub>* = 9 pA.
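As an illustration of how Equations (1–4) can be integrated numerically, the following is a minimal sketch for a single isolated neuron (*I*<sub>syn</sub> = 0) using a forward Euler scheme. The function name `simulate_neuron`, the 0.1 ms time step, the initial conditions, and the regular-spiking default parameters are assumptions made for this example only; they are not taken from the simulation code used in this work.

```python
import math
import random


def simulate_neuron(a=0.02, b=0.2, c=-65.0, d=8.0,
                    tau_i=1.0, m_i=25.0, s_i=9.0,
                    t_max=1000.0, dt=0.1, seed=0):
    """Forward-Euler sketch of Equations (1)-(4) for one neuron with I_syn = 0.

    Times are in ms. The step size dt and the initial state are assumptions.
    Returns the list of spike times.
    """
    rng = random.Random(seed)
    v = c              # membrane potential, started at the reset value
    u = b * v          # recovery variable
    i_noise = m_i      # noise current, started at its mean
    spikes = []
    for k in range(int(t_max / dt)):
        # Equation (4): Ornstein-Uhlenbeck update of the noise current
        xi = rng.gauss(0.0, 1.0)
        i_noise += (-(i_noise - m_i) / tau_i) * dt \
            + s_i * math.sqrt(2.0 * dt / tau_i) * xi
        # Equations (1) and (2)
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_noise
        du = a * (b * v - u)
        v += dv * dt
        u += du * dt
        # Equation (3): after-spike reset
        if v >= 40.0:
            spikes.append(k * dt)
            v = c
            u += d
    return spikes


if __name__ == "__main__":
    print(len(simulate_neuron()), "spikes in 1 s of simulated time")
```

A smaller time step, or a higher-order integrator, may be preferable when precise spike timing matters.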

Among the possible firing patterns generated by the neuron model of Equations (1, 2), we implemented the family of regular spiking (RS) neurons and the family of fast spiking (FS) neurons in proportions of 75% and 25%, respectively, in agreement with the experimental findings (cf. section "Finite Size Network Dynamics"). Mathematically, the four aforementioned parameters were set as follows:

$$a = \begin{bmatrix} 0.02 \\ 0.02 + 0.08r_i \end{bmatrix} \quad b = \begin{bmatrix} 0.2 \\ 0.25 - 0.05r_i \end{bmatrix}$$

$$c = \begin{bmatrix} -65 + 15r_i^2 \\ -65 \end{bmatrix} \quad d = \begin{bmatrix} 8 - 6r_i^2 \\ 2 \end{bmatrix} \tag{5}$$

In Equation (5), the first row refers to the excitatory neurons and the second one to the inhibitory neurons. *r<sub>i</sub>* is a random variable which spans from 0 to 1, and *i* is the neuron index. *r<sub>i</sub>* was added in order to introduce further variability in the neuron dynamics: for example, a neuron exhibits classic RS behavior if *r<sub>i</sub>* = 0, and bursting behavior if *r<sub>i</sub>* = 1.
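The parameter assignment of Equation (5) can be sketched as follows. The helper name `draw_parameters`, the uniform draw of *r<sub>i</sub>*, and the convention of placing the excitatory (RS) neurons first in the list are assumptions of this example.

```python
import random


def draw_parameters(n_neurons, frac_excitatory=0.75, seed=0):
    """Draw (a, b, c, d) for each neuron following Equation (5).

    Assumptions of this sketch: r_i is uniform on [0, 1) and the first
    75% of the indices are the excitatory (RS) neurons.
    """
    rng = random.Random(seed)
    n_exc = round(frac_excitatory * n_neurons)
    params = []
    for i in range(n_neurons):
        r = rng.random()
        if i < n_exc:
            # first row of Equation (5): excitatory, regular spiking family
            params.append({"type": "RS", "a": 0.02, "b": 0.2,
                           "c": -65.0 + 15.0 * r ** 2,
                           "d": 8.0 - 6.0 * r ** 2})
        else:
            # second row of Equation (5): inhibitory, fast spiking family
            params.append({"type": "FS", "a": 0.02 + 0.08 * r,
                           "b": 0.25 - 0.05 * r,
                           "c": -65.0, "d": 2.0})
    return params


if __name__ == "__main__":
    for p in draw_parameters(8):
        print(p)
```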

#### *Finite size network model*

Graph theory was used to represent the network connectivity. All graphs are defined by nodes, which represent the neurons, and edges, which model the morphological connections among the neurons. The structure of the graph is described by the adjacency matrix, a square matrix of size equal to the number of nodes *N* with binary entries. If the element *a<sub>ij</sub>* = 1, a connection from node *j* to node *i* is present; otherwise *a<sub>ij</sub>* = 0 means no connection. All auto-connections are avoided (*a<sub>ii</sub>* = 0, ∀ *i*). Then, the value 1 of the non-zero *a<sub>ij</sub>* elements was replaced by a weight in order to mimic different synaptic strengths. Synaptic weights were chosen randomly from a normal distribution with a mean value and standard deviation equal to 10 and 3.5, respectively.

To model the synaptic transmission we chose the approach of the pulse-coupled neural networks: practically, the firing of the *j*-th neuron causes an instantaneous change in the membrane potential of the neuron *i*-th by means of the weight *sij*.

Among the possible graphs, following the experimental findings regarding the functional connectivity of such confined neuronal assemblies (cf. section "Finite Size Network Dynamics"), we implemented neuronal networks with a scale-free (SF) connectivity (Barabasi and Albert, 1999). Briefly, in SF networks the degree distribution follows a power law: if *m* is the number of edges incident to a node, i.e., the degree, the power law distribution is given by Dorogovtsev and Mendes (2002):

$$P(m) = m^{-\gamma} \tag{6}$$

where γ is the characteristic exponent. This law implies that most nodes have just a few connections, while others, named *hubs*, have a very high number of links. To build an SF network, we made use of the algorithm described in Batagelj and Brandes (2005), which is particularly efficient in terms of computation when dealing with large-scale networks. Nodes are added successively. For each node, *m* edges are generated. The endpoints are selected from the nodes whose edges have already been created, with a bias toward high-degree nodes.
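A minimal Python sketch of this construction is given below. It follows the flat endpoint-list idea of the preferential-attachment generator in Batagelj and Brandes (2005), but it is not the authors' implementation. The function names and seeds are hypothetical, and two choices are assumptions not stated in the text: each generated edge is directed from the newly added node to the selected endpoint, and duplicate edges are collapsed. Self-loops are dropped in line with *aii* = 0, and weights are drawn from the normal distribution given above without truncation.

```python
import random


def preferential_attachment_edges(n_nodes, m_edges, seed=0):
    """Generate n_nodes * m_edges edges with a bias toward high-degree nodes.

    All edge endpoints are stored in one flat list; drawing a uniform
    position in that list selects a node with probability proportional
    to its current degree.
    """
    rng = random.Random(seed)
    endpoints = [0] * (2 * n_nodes * m_edges)
    for v in range(n_nodes):
        for i in range(m_edges):
            k = 2 * (v * m_edges + i)
            endpoints[k] = v
            r = rng.randint(0, k)  # inclusive bounds, may pick position k itself
            endpoints[k + 1] = endpoints[r]
    return [(endpoints[2 * e], endpoints[2 * e + 1])
            for e in range(n_nodes * m_edges)]


def weighted_adjacency(n_nodes, edges, mean=10.0, sd=3.5, seed=1):
    """Build the weight matrix w[i][j] for a connection from node j to node i.

    Assumptions: edge (src, dst) is read as src -> dst, self-loops and
    duplicate edges are discarded, and weights are not truncated.
    """
    rng = random.Random(seed)
    w = [[0.0] * n_nodes for _ in range(n_nodes)]
    for src, dst in edges:
        if src != dst and w[dst][src] == 0.0:
            w[dst][src] = rng.gauss(mean, sd)
    return w
```

In a pulse-coupled simulation, a spike of neuron *j* would then add `w[i][j]` to the membrane potential of every neuron *i*.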

In order to mimic the experimental conditions of the confined assemblies described in section "Finite Size Network Dynamics," in section "Simulation Results" we present the results regarding the ongoing activity of networks made up of 90, 100, 120, 150, 240, 320, and 520 neurons.

#### **DATA ANALYSIS AND STATISTICS**

## *Analysis of network dynamics based on calcium fluorescence imaging*

Custom software running in MATLAB (Crépel et al., 2007; Bonifazi et al., 2009) was used for the automatic identification of the cells loaded with the calcium indicator and for the extraction of their fluorescence signals as a function of time (time resolution 59 Hz). To detect the calcium events (i.e., the onset and offset of neuronal firing) from the fluorescence trace $F_{ij}$ of the neurons ($1 \le i \le M$, $M$ number of neurons; $1 \le j \le N$, $N$ number of frames), we calculated the first derivative of the fluorescence signal ($\Delta F_{ij} = F_{i,j+1} - F_{ij}$) and integrated $\Delta F_{ij}$ in overlapping sliding time windows of 1 s ($I_{ij} = \sum_{j \le n \le j+59} \Delta F_{in}$; $1 \le j \le N - 59$). A Gaussian fit centered at zero was used to extract the standard deviation $\sigma_i$ of the noise of the processed signal $I_{ij}$. Signal transients exceeding the threshold of $3\sigma_i$ for at least 5 consecutive points were considered as calcium events. The onset and the offset of these calcium events were determined using a four-parameter sigmoidal equation as described in Takano et al. (2012). The estimated onset and offset times were fixed at 5% and 95% of the sigmoidal plateau, respectively.
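The detection steps above (first derivative, 1-s sliding integration, 3σ threshold held for at least 5 consecutive points) can be sketched as follows for a single trace. This is an illustration under stated assumptions rather than the custom MATLAB software: the function name is hypothetical, and the Gaussian fit centered at zero is replaced by a simpler stand-in that estimates σ from the negative half of the integrated signal. The sigmoidal onset/offset refinement is not included.

```python
import math


def detect_calcium_events(trace, window=59, n_sigma=3.0, min_points=5):
    """Return (first_index, last_index) pairs of supra-threshold runs.

    Assumption: the noise sigma is estimated from the negative values of
    the integrated signal (zero-centered half-distribution), as a simple
    substitute for the Gaussian fit described in the text.
    """
    # First derivative of the fluorescence trace.
    diff = [trace[j + 1] - trace[j] for j in range(len(trace) - 1)]
    # Sum of the derivative over overlapping windows n = j .. j + window.
    span = window + 1
    integrated = [sum(diff[j:j + span]) for j in range(len(diff) - span + 1)]
    negatives = [x for x in integrated if x < 0]
    if not negatives:
        return []
    sigma = math.sqrt(sum(x * x for x in negatives) / len(negatives))
    threshold = n_sigma * sigma
    events, start = [], None
    for j, x in enumerate(integrated):
        if x > threshold:
            if start is None:
                start = j
        else:
            if start is not None and j - start >= min_points:
                events.append((start, j - 1))
            start = None
    if start is not None and len(integrated) - start >= min_points:
        events.append((start, len(integrated) - 1))
    return events
```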

The reconstruction of the functional connectivity of the network was based on pair-wise correlation analysis of the onset time series extracted from the calcium imaging data, as described in Bonifazi et al. (2009). Briefly, when the firing onset of cell *j* repeatedly preceded the firing onset of cell *k*, a functional connection directed from *j* to *k* was established. In order to reveal these temporal correlations, the post-stimulus time histogram of cell *k* centered on the firing onsets of cell *j* was calculated within a maximal time lag of 500 ms. Both the Student's *t*-test and the Kolmogorov-Smirnov test, with a significance level of 5%, were used to exclude the possibility that the post-stimulus time distribution could be a Gaussian distribution with zero mean or a uniform distribution, respectively. In this way, we excluded cases where the activation of two neurons was completely uncorrelated (uniform distribution) or synchronous (Gaussian centered at zero).

The cross correlation between the firing onset time series of individual neurons was used to estimate the average correlation and average time of activation of each neuron relative to all others, similarly to what is described in Bonifazi et al. (2009) and Marissal et al. (2012). Briefly, the cross correlation between the time series of neurons *a* and *b* was calculated as follows:

$$\text{CC}_{ab}(\tau) = \frac{\sum_{0 \le t \le T} (a_{t+\tau} - \langle a \rangle) \cdot (b_t - \langle b \rangle)}{\sigma_a \cdot \sigma_b} \tag{7}$$

where ⟨*a*⟩ and ⟨*b*⟩ are the means and σ*<sup>a</sup>* and σ*<sup>b</sup>* the standard deviations of the time series, *t* is the sampling time, *T* the duration of the entire movie and |τ| ≤ 1 s.

The maximum cross-correlation value ($CC^{\max}_{ab}$) and the time lag of its occurrence ($\tau^{\max}_{ab}$) were used to calculate, respectively, the average correlation and the average time of activation of neuron *i* according to the following formulas:

$$CC^{\max}_{i} = \frac{1}{n} \sum_{j \neq i} CC^{\max}_{ij} \qquad \tau^{\max}_{i} = \frac{1}{n} \sum_{j \neq i} \tau^{\max}_{ij}$$

where *n* is the number of neurons displaying a positive cross-correlation with neuron *i*.
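The following Python sketch illustrates how the maximum cross-correlation, its lag, and the per-neuron averages could be computed from binned onset time series. It is a plain reading of Equation (7) and of the averaging formulas, not the authors' code; the function names are hypothetical, the lag is expressed in frames, and samples falling outside the recording are simply skipped (an assumption about edge handling). With this sign convention, a neuron that activates before its partners obtains a negative average lag.

```python
import math


def max_cross_correlation(a, b, max_lag):
    """Return (CC_max, lag_at_max) of Equation (7) for |lag| <= max_lag."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / len(a))
    sd_b = math.sqrt(sum((x - mean_b) ** 2 for x in b) / len(b))
    if sd_a == 0.0 or sd_b == 0.0:
        return 0.0, 0
    best_cc, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(len(b)):
            s = t + lag
            if 0 <= s < len(a):  # skip samples outside the recording
                total += (a[s] - mean_a) * (b[t] - mean_b)
        cc = total / (sd_a * sd_b)
        if cc > best_cc:
            best_cc, best_lag = cc, lag
    return best_cc, best_lag


def average_correlation_and_lag(series, max_lag):
    """For each neuron, average CC_max and lag over positively correlated partners."""
    results = []
    for i, a in enumerate(series):
        ccs, lags = [], []
        for j, b in enumerate(series):
            if j == i:
                continue
            cc, lag = max_cross_correlation(a, b, max_lag)
            if cc > 0:
                ccs.append(cc)
                lags.append(lag)
        n = len(ccs)
        results.append((sum(ccs) / n, sum(lags) / n) if n else (0.0, 0.0))
    return results
```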

## *Processing of electrophysiological signals from the IWB*

Raw data acquired by the ELPHO software were loaded into MATLAB (Mathworks Inc., Natick, MA, USA) for off-line processing. First, raw traces were band-pass filtered to select either multi-unit activity (MUA, 800 Hz–3 kHz) or local field potentials (LFP, 1–300 Hz). Stimulation artifacts were suppressed using an off-line MATLAB implementation of the SALPA algorithm (Wagenaar and Potter, 2002). Highly noisy channels were identified by visual inspection and excluded from the analysis. Then, spikes were detected in the MUA raw data by means of the PTSD algorithm (Maccione et al., 2009) (peak lifetime period = 2 ms; refractory period = 1 ms; threshold = ±8 times the estimated noise standard deviation). The result of the spike-detection procedure consists of a series of point processes (i.e., spike trains), one for each recording channel (Bologna et al., 2010).

We evaluated the network-wide evoked response by computing the Peri Stimulus Time Histogram (PSTH; Perkel et al., 1967) for each recording channel of the array and for the full array [time bin = 4 ms, time window = (−100 ms, +400 ms) relative to the stimulus onset]. We also measured the intensity of the response as the average number of evoked spikes in a 200-ms time window following each stimulus. The final dataset comprised 4 recordings in control brains (duration ∼300 s, 10–20 paired pulses delivered to the LOT at 0.05 Hz, inter-pulse interval 200 ms) and 3 recordings before and after the induction of focal ischemia (same stimulation protocol).
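As an illustration of these two measures, the sketch below computes a PSTH with the bin size and time window given above, and the response intensity in a 200-ms window. The function names are hypothetical, times are assumed to be in milliseconds, and the PSTH is returned as raw counts summed over stimuli (the normalization used by the authors is not stated here).

```python
def psth(spike_times_ms, stimulus_times_ms, bin_ms=4.0, window_ms=(-100.0, 400.0)):
    """Count spikes in bins of bin_ms within window_ms around each stimulus."""
    n_bins = int(round((window_ms[1] - window_ms[0]) / bin_ms))
    counts = [0] * n_bins
    for stim in stimulus_times_ms:
        for spike in spike_times_ms:
            rel = spike - stim
            if window_ms[0] <= rel < window_ms[1]:
                counts[int((rel - window_ms[0]) // bin_ms)] += 1
    return counts


def response_intensity(spike_times_ms, stimulus_times_ms, window_ms=200.0):
    """Average number of spikes in the window following each stimulus."""
    total = sum(1 for stim in stimulus_times_ms for spike in spike_times_ms
                if 0.0 <= spike - stim < window_ms)
    return total / len(stimulus_times_ms)
```

For the full-array PSTH, the spike trains of all channels can be concatenated before calling `psth`.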

#### **RESULTS**

#### **FINITE SIZE NETWORK DYNAMICS**

#### *Spontaneous synchronizations in finite size networks*

To build an experimental model for the study of physiological and impaired communications between neuronal assemblies we grew finite size neuronal networks, i.e., networks composed of neuronal assemblies spatially separated by hundreds of micrometers and interconnected through long neurites. As a first step, we focused on the properties of single modules, i.e., the structural and dynamical properties of isolated and spatially confined neuronal circuits (**Figure 2**). Isolated neuronal circuits located within an 800 × 800 μm spot were obtained by plating the cells on glass cover slips previously coated with a geometrically defined molecular adhesive layer (PDL). The individual cell populations varied from a few dozen up to a few hundred neurons. Similar to homogeneous and clustered cultures (Chiappalone et al., 2006; Shein-Idelson et al., 2010), finite size circuits displayed spontaneous synchronized events after 2 weeks in culture (**Figure 2**, panel B1) occurring with a frequency linearly correlated with the number of cells present in the circuit (Pearson correlation 0.88, **Figure 2C1**). Likewise, depending on the density of the plating and on the vicinity to the supporting network, finite size circuits organized into monolayers or into three-dimensional clusters, with a higher propensity for clustering at increased plating density or at larger distances from the supporting network (data not shown). We used calcium imaging of monolayer neuronal circuits (performed with a 10× objective) in combination with immunocytochemical staining to map the functional and structural properties of all the neurons in the circuits with single-cell resolution. GABAergic cells could be specifically identified (**Figure 2A3**), allowing us to investigate their specific involvement in spontaneous synchronization processes, similar to the work of Bonifazi et al. (2009) in developing hippocampal networks.

A pair-wise analysis based on the cross-correlation between the firing onset time series of pairs of neurons (see section "Materials and Methods") was used to estimate the average correlation and average time of activation of each neuron relative to all others (Bonifazi et al., 2009; Marissal et al., 2012). In all the circuits analyzed (*n* = 4) the time correlation graph presented a bimodal distribution (**Figure 2C2**), indicating that network events synchronized first the population of neurons plotted on the left side of the graph (i.e., with a time lag < 0), whereas neurons on the right (i.e., with a time lag > 0) were activated next. In addition, the presence of highly correlated early-activated GABAergic neurons was observed (red points within the violet circle in **Figure 2C2**). Interestingly, the existence of a characteristic, early-activated neuronal population within the network synchronizations has already been documented in developing hippocampal circuits (Bonifazi et al., 2009), even in the absence of GABAergic transmission (Marissal et al., 2012). Notably, in the presence of GABAergic transmission it has been shown that early-activated GABAergic neurons can play the role of hub cells in orchestrating network dynamics (Bonifazi et al., 2009). The similarity between these previous observations and the results presented here suggests that cortical circuits share common innate features in their functional organization.

#### *Effect of laser ablation on functional connectivity*

To monitor the synaptic re-organization of lesioned neuronal circuits with single cell resolution, we reconstructed the functional connectivity of a neuronal subset of a larger neuronal network 20 min before and after laser-induced ablations (see section "Materials and Methods").

Two micro-lesions (lesion 1 and lesion 2) were induced next to the center of the field of view, using an average laser power at the sample of 4 μW and 5 μW, respectively. The second lesion was performed at higher power to obtain a more pronounced alteration of the network. Indeed, this lesion produced a strong intracellular calcium increase in several cells, and a calcium "shockwave" started to propagate through the network. After a few minutes, only directly ablated cells displayed a saturated calcium fluorescence signal, while the other neurons recovered a relatively low basal calcium level and presented spontaneous activity (cf. **Figure A1**). The frequency of occurrence of spontaneous network synchronizations was not affected by the lesions (**Figure 3**, 4th and 5th rows), with no statistically significant difference between the inter-burst interval distributions before and after the lesions (Student's *t*-test, *p* > 0.05). However, the number of cells recruited within the network events in the imaged field (i.e., close to the location of the lesion) decreased by 31 ± 10% (Student's *t*-test, *p* < 0.05).

Based on the calcium dynamics of the cells imaged in a circular field of 244 μm diameter (**Figure 3**), we reconstructed the functional connectivity of the neuronal population through a pair-wise analysis of the onset of firing (see section "Materials and Methods"). Briefly, if the activation of cell *i* reliably preceded the activation of cell *j* (i.e., over several repetitions with statistical significance, see section "Materials and Methods"), we inferred a functional connection directed from *i* to *j*. Cell pairs that were synchronously activated or did not display any activation order were not included in the directed functional connectivity reconstruction (see section "Materials and Methods"). **Figure 3** (1st row) shows the location of the ten neurons with the highest number of functional INPUT (violet) and OUTPUT (yellow) connections before and after the lesions. Interestingly, after the lesions, top-ranked INPUT and OUTPUT neurons segregated into spatially distinct regions. Top-ranked OUTPUT neurons relocated to the bottom right region, while top-ranked INPUT neurons remained in the rest of the circuit. In addition, just one out of the ten neurons in each group belonged to the top-ranked group both before and after the lesion. The relocation of the functional connections (drawn for clarity just for the five best-ranked neurons) can additionally be observed in **Figure 3** (2nd and 3rd rows).
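Once the directed functional connections have been inferred, ranking the cells by their number of INPUT and OUTPUT connections is a simple counting step. The sketch below is an illustration only; the function name, the representation of connections as `(source, target)` index pairs, and the tie-breaking by cell index are assumptions.

```python
def top_ranked_cells(connections, n_cells, top_n=10):
    """Return (top OUTPUT cells, top INPUT cells) from directed (src, dst) pairs.

    Assumption: ties are broken by the lower cell index.
    """
    out_degree = [0] * n_cells
    in_degree = [0] * n_cells
    for src, dst in connections:
        out_degree[src] += 1
        in_degree[dst] += 1

    def rank(degree):
        return sorted(range(n_cells), key=lambda c: (-degree[c], c))[:top_n]

    return rank(out_degree), rank(in_degree)
```

The overlap between the top-ranked groups before and after a lesion can then be obtained as the size of the intersection of the two index sets.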

#### *In vitro* **WHOLE BRAIN**

We also characterized the activity of an *ex vivo* experimental model (i.e., the isolated brain of a guinea pig, **Figure 4**) before and after a lesion induced by a focal ischemia.

#### *Network response to LOT stimulation in the m-ERC*

Electrical stimulation of the LOT induced a polysynaptic response in the m-ERC mediated by the interposed activation of the hippocampus (Biella and De Curtis, 2000; Gnatkovsky and De Curtis, 2006) (**Figure 4**). The intracellular correlate of the LOT-evoked delayed response in neurons of m-ERC superficial layers was characterized by an early GABAA receptor-mediated inhibitory postsynaptic potential (IPSP; latency from LOT stimulation: 51 ± 1 ms, *n* = 12), followed by a relatively slow (duration 409 ± 36 ms) NMDA-dependent depolarizing component which often reached threshold for spike firing. Conversely, pyramidal cells in deeper layers responded to LOT stimulation with an early excitatory postsynaptic potential (EPSP) occurring 15 ± 1 ms after the population spike recorded in the dentate gyrus (DG, **Figure 4**). The EPSP often crossed the threshold for action potential firing and was followed by a relatively slow inhibitory potential mediated by GABAB receptors (Gnatkovsky and De Curtis, 2006). The early inhibition of the superficial principal cells is presumably due to a feed-forward mechanism sustained by interneurons recorded in layers II/III (i.e., basket and chandelier cells; Canto et al., 2008). In **Figure 4** the firing of an interneuron corresponds to the early IPSP measured in the pyramidal cells in the same layer.

**FIGURE 3 | Directed functional connectivity before (left) and after (right) lesion.** The number of OUTPUT and INPUT functional connections has been calculated for all the imaged neurons based on the temporal correlation between the firing onsets of the neurons (see section "Materials and Methods"). The ten top ranked cells, i.e., the cells with the largest number of functional OUTPUT (yellow) and INPUT connections (pink), are represented in the top row. For graphic clarity, the connectivity graphs shown in the 2nd and 3rd rows (respectively INPUT and OUTPUT connections) include only the five top ranked cells. The data refer to a homogeneous neuronal network where functional hub cells (i.e., neurons with a very large number of functional connections) were not identified. The fluorescent images show the cells loaded with the calcium indicator Fluo4 (see section "Materials and Methods"). The locations of the two lesions (L1 and L2) are marked by the white arrows. The green rectangle highlights the region shown in **Figure A1**. The field of view is a circular region of 244 μm diameter. The raster plot (representing the firing onsets) and the fraction of activated cells are shown respectively in the 4th and 5th row.

positioned in the center of the m-ERC (delimited by dotted line). S, stimulating electrode; LOT, lateral olfactory tract; PC, piriform cortex; l-ERC, lateral entorhinal cortex; DG, dentate gyrus; m-ERC, medial entorhinal cortex. **(B)** Stereomicroscope photograph of the isolated brain positioned in the perfusion chamber. **(C)** Electrical responses to LOT stimulation recorded in the m-ERC. Left, intracellularly recorded voltage traces from a superficial pyramidal cell lying at 200–300 μm from pial surface (black trace), a

pathway (PC and l-ERC) and subsequently invading the hippocampal structure (DG and CA1, dark and light gray spots, respectively). The left margin of the gray area is aligned to the first component of the m-ERC LFP. Right, simplified scheme of the polysynaptic neuronal circuitry within the m-ERC, based on the evoked response pattern and delay analysis of the neuronal response to LOT stimulation. The gray cell represents a putative interneuron mediating a feedback GABAergic inhibition onto a deep pyramidal cell and a feed-forward inhibition onto another interneuron.

Spiking responses to paired-pulse LOT stimulation (inter-pulse interval 200 ms) were recorded by 16-channel MEAs implanted in the superficial layers of the m-ERC (200–500 μm from pial surface; **Figure 5A**). **Figure 5B** shows the peri-stimulus raster plots of two selected channels (19 and 24, experiment #1) in response to each of the two LOT stimulations for a selected experiment. An earlier phase, which we observed in almost all active recording channels, was characterized by two relatively sharp peaks: the first corresponding to the far-field response originating in the hippocampus and the other corresponding to the initial phase of the m-ERC response (**Figure 4**). This was followed by a late, long-lasting but less reliable component (cf. channels 19 and 24). The histogram in **Figure 5C** displays the number of spikes (mean ± S.E.M.) evoked by the 1st and the 2nd pulse for all experiments (control condition) as a measure of response intensity. In 2 out of 4 experiments (#1, #4) we observed a stronger activation after the 1st rather than the 2nd pulse, whereas in the other 2 experiments (#2, #3) responses to the 2nd pulse were slightly stronger than to the 1st pulse (no statistically significant difference). However, one must consider that first evoked responses in experiments #1 and #4 were on average more intense, probably reflecting a relatively high probability of excitatory neurotransmitter release upon the first pulse. This would nearly deplete the available pool of synaptic glutamatergic vesicles, leading to a paired-pulse depression of the postsynaptic response.

**FIGURE 5 | [...] ischemic lesioning of the hippocampus. (A)** Local field potentials (LFP) and multi-unit activity (MUA) raw traces from two selected electrodes (19 and 24, experiment #1) recorded in response to an individual paired-pulse stimulus (ISI 200 ms) delivered to the LOT. The volume-conducted components originating in DG and CA1 are indicated by the dark gray and light gray dots, respectively. **(B)** Peri-stimulus raster plots for the same two representative electrodes. The corresponding PSTHs are superimposed (bin size = 4 ms). **(C)** Summary plot of mean number of evoked spikes (mean ± S.E.M.) after 1st and 2nd pulse for all four experiments. <sup>∗</sup>*p* < 0.05, Mann–Whitney *U*-test. **(D)** LFP and MUA raw traces of one selected electrode recorded in response to a paired-pulse stimulus either before (black trace) or after (gray trace) an

### *Cutting the olfactory pathway: hippocampal focal ischemia*

Occlusion of the posterior left cerebral arteries abruptly reduced ACSF perfusion of the hippocampus, resulting in a block of the propagation of the synaptic activity toward the entorhinal cortex (**Figure 4**). About 5 min after the ischemic insult, LOT stimulation failed to evoke any response (**Figure 5**). Stimulus-triggered raster plots and the corresponding pre- and post-lesion PSTH are shown in **Figure 5E**. The bar graph in **Figure 5F** summarizes the total number of spikes evoked by a paired-pulse stimulus before and after the ischemic lesion. A significant reduction of the response intensity caused by the lesion was observed in all three analyzed experiments.

## **SIMULATION RESULTS**

In this section, we report the results of simulations in which we modeled the effects of changing the number of neurons in confined networks. Each simulation lasted 10 min, sampled at 10 kHz. Networks were simulated in MATLAB (Mathworks Inc., Natick, MA, USA). Peak trains were stored and then processed by using SpyCode software (Bologna et al., 2010), conveniently adapted to manage large-scale networks.

## *Dynamics of finite size networks*

We simulated the ongoing activity of neuronal networks made up of 90, 100, 120, 150, 240, 320, and 520 neurons. The choice of these network sizes followed from the experimental findings described in section "Finite Size Network Dynamics" (assuming a neuron/glia ratio equal to 2:1). In addition, 25% of such neurons were considered inhibitory (Isaacson and Scanziani, 2011) and were modeled as FS neurons (cf. section "Computational Model").

ischemic lesion of the hippocampus. **(E)** Peri-stimulus raster plot for the same representative electrode, before and after the lesion. The corresponding PSTHs are superimposed (bin size = 4 ms). **(F)** Summary plot of mean number of evoked spikes by a paired pulse stimulus delivered to the LOT (mean ± S.E.M.) either before or after the ischemic lesion of the hippocampus for all analyzed experiments. <sup>∗</sup>*p* < 0.05, Mann–Whitney *U*-test.

Model neurons were connected following a scale-free (SF) topology. **Figure 6A** shows the degree distribution of the simulated SF networks. For all SF networks, the degree distribution was fitted by a power law and the corresponding exponent lay between −1.04 (network made up of 90 neurons) and −1.34 (network made up of 520 neurons).

The simulated networks displayed spontaneous synchronized events (network bursts) independently of their size (**Figure 6B**). However, the frequency of occurrence of those synchronized events varied linearly with the number of cells present in the circuit (Pearson correlation 0.96, **Figure 6C**). To facilitate comparison with **Figure 2C1**, the *x*-axis of **Figure 6C** reports the total cell number (neurons + glia), although the number of neurons effectively simulated is indicated near the blue dots. The simulation results fit the experimental data well, as confirmed by the slope of the linear fit (0.00015 vs. 0.00016). An interesting finding was that the simulated networks tended to show a higher proportion of random spiking activity and less bursting than normally observed in actual finite-size neuronal networks. This is consistent with other experimental results of interconnected finite-size networks previously reported in the literature (Macis et al., 2007).

#### **HARDWARE SET-UP FOR A BRAIN PROSTHESIS**

The hardware set-up that will be used to interface the biological component (either the neuronal culture or the *in vitro* whole brain) is a Spiking Neural Network (SNN) system. This SNN implements biologically realistic neural network models, spanning from the electrophysiological properties of a single neuron up to network plasticity rules. As already discussed in the modeling section, the choice of the Izhikevich neuron model is relevant because (1) it is biologically realistic, and (2) it operates in biological real time. By real time, we mean that computation results are provided within a firmly controlled delay (10 ns precision), which is lower than the sampling period (100 μs to 1 ms). Among the modules of the system, the computation-critical tasks are the implementation of the SNN model, which represents the prosthesis itself, and the analysis of biological signals to produce events from the recorded activity.

The digital Izhikevich neurons and detection system are implemented as a configurable digital integrated circuit (field-programmable gate array, FPGA) using the VHDL language. We implement Regular Spiking (RS) neurons (excitatory) and Fast Spiking (FS) interneurons (inhibitory) similar to those found in cell culture (**Figures 2**, **3** and **6**). The hardware models follow the Izhikevich equations with parameters corresponding to RS activity (*a* = 0.02, *b* = 0.2, *c* = −65, and *d* = 8). In **Figure 7A1** we describe the choice of the topology (Cassidy and Andreou, 2008) to implement the Izhikevich equations. We implement a neuron on a Xilinx Virtex 5 XC5VLX50 FPGA board. This neuron uses very few resources (only 2% of the FPGA) and works in real time. In **Figure 7A2** we compare the f(I) behavior of biological RS neurons and of one RS neuron implemented in the FPGA.
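For readers who want to explore the f(I) relation in software, the sketch below integrates a single RS neuron with the parameters listed above. It assumes the standard form of the Izhikevich (2003) model, since Equations (1, 2) are not reproduced in this section, and it uses a plain forward-Euler step in floating point. It is therefore not a description of the fixed-point FPGA implementation; the function name, time step, and duration are hypothetical choices, and the current is expressed in the model's own units.

```python
def izhikevich_firing_rate(current, a=0.02, b=0.2, c=-65.0, d=8.0,
                           duration_ms=1000.0, dt_ms=0.1):
    """Return the firing rate (Hz) of one Izhikevich neuron for a constant input.

    Assumed model (standard Izhikevich form):
        dv/dt = 0.04 v^2 + 5 v + 140 - u + I
        du/dt = a (b v - u)
        if v >= 30: v <- c, u <- u + d
    """
    v = c
    u = b * v
    spikes = 0
    for _ in range(int(duration_ms / dt_ms)):
        v += dt_ms * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt_ms * a * (b * v - u)
        if v >= 30.0:  # spike: reset membrane potential and recovery variable
            v = c
            u += d
            spikes += 1
    return spikes / (duration_ms / 1000.0)
```

Calling the function over a range of currents gives a software f(I) curve that could be set against a curve such as the one in **Figure 7A2**.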

Concerning the SNN, our goal was to implement a model using 80 neurons (FS and RS) with high connectivity capacity (e.g., 6400 synapses). The network structure is fully configurable, and synapses are excitatory or inhibitory conductances which provide a current depending on the postsynaptic membrane voltage. Delays are also implemented to provide good timing accuracy. The network is defined in the RAM of the digital board, which stores the characteristics of all neurons and synaptic connections in the network. A synaptic connection is defined by a synaptic weight and the address of the neuron linked by this synapse. Together with complementary functions such as loopback stimulation and monitoring, this system will be able to perform cross-platform neural computation.

term of frequency of the neuron *vs*. the stimulation current. **(B)** Outputs of the detection system to be implemented in the closed-loop set-up of the brain prosthesis. *First row (a)*: raw electrophysiological signal. *Second row (b)*: the same signal with added Gaussian white noise to reduce Signal [...] The emphasized detected spike is a false positive. This shows that the signal *(b)* represents the limit of signals that can be reliably processed by our system. These signals were first recorded then input to the system with a waveform generator.

The detection of neural electrophysiological activity is performed by a reconfigurable acquisition system based on a wavelet detection circuit for *in vitro* biological signals. Our strategy for real-time spike detection is to implement a pre-processor, which emphasizes spike shapes and attenuates out-of-band noise. This pre-processor provides two outputs corresponding to different wavelet detail levels. The first one is essentially composed of out-of-band noise and is used to determine a threshold level adapted to the signal amplitude. The second output is compared to the threshold to discriminate spike events. The pre-processing algorithm is the Stationary Wavelet Transform (SWT). The detection system computes the SWT, the adaptive threshold and the comparison in real time. This method proved to be very efficient at extracting action potentials of excitable cells from very noisy signals (Raoux et al., 2012). **Figure 7B** shows the performance of the method on a single-channel setup. Action potentials are marked by arrows on signal (a). We added significant noise [signal (b)] and then sent the signal to the detector, which provides outputs (c) and (d).
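The two-output scheme described above can be illustrated off-line with a small Python sketch. Every specific choice here is an assumption made for illustration: an unnormalized Haar filter, three decomposition levels, a threshold equal to a gain times the median-based noise estimate of the finest detail level, positive-going crossings only, and a refractory period in samples. The wavelet, the levels, and the threshold rule used in the actual FPGA design are not specified in this section, and a real-time system would update the threshold continuously rather than over the whole recording.

```python
import statistics


def haar_swt_details(signal, levels):
    """Undecimated (stationary) Haar transform; returns one detail list per level."""
    approx = list(signal)
    details = []
    for level in range(levels):
        step = 2 ** level  # filter taps spread apart at each level, no downsampling
        new_approx, detail = [], []
        for n in range(len(approx)):
            prev = approx[n - step] if n >= step else approx[0]  # edge padding
            new_approx.append((approx[n] + prev) / 2.0)
            detail.append((approx[n] - prev) / 2.0)
        details.append(detail)
        approx = new_approx
    return details


def detect_spikes(signal, levels=3, gain=4.0, refractory=20):
    """Return sample indices where the coarsest detail level crosses the threshold.

    The finest detail level (mostly out-of-band noise) sets the threshold;
    the coarsest detail level is compared against it.
    """
    details = haar_swt_details(signal, levels)
    noise_sigma = statistics.median([abs(x) for x in details[0]]) / 0.6745
    threshold = gain * noise_sigma
    spikes, last = [], -refractory
    for n, x in enumerate(details[-1]):
        if x > threshold and n - last >= refractory:
            spikes.append(n)
            last = n
    return spikes
```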

To summarize, all modules (i.e., Izhikevich neuron, neural network specifications, detection and stimulation modules) will be implemented into the FPGA. This modular system will be used as a cross-platform neural computation unit. Microelectrode arrays will be used to record and electrically stimulate living neural networks, with a specific emphasis on stimulation localization. Dedicated integrated electronics will be designed to implement the communication channels between the living and the artificial networks. The biological signals (from living to artificial) will be processed by using on-line spike detection algorithms and a rate-based decoding (Rieke et al., 1997; Novellino et al., 2007; Tessadori et al., 2012), while the firing rate of an artificial neuronal sub-network will be translated into the stimulation frequency for the biological network (from artificial to living), thus following a similar rate-based strategy. The system including the artificial and living neural networks will form a closed loop with a regulated feedback (cf. next Section).
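The rate-based strategy in both directions can be illustrated with a minimal sketch. The window length, the gain, and the maximum stimulation frequency below are hypothetical placeholders, as are the function names; the actual decoding and coding blocks are not specified at this level of detail in the text.

```python
def firing_rate_hz(spike_times_s, t_now_s, window_s=1.0):
    """Estimate the firing rate from spikes in the last window_s seconds."""
    n = sum(1 for t in spike_times_s if t_now_s - window_s < t <= t_now_s)
    return n / window_s


def rate_to_stimulation_frequency(rate_hz, gain=0.1, max_freq_hz=5.0):
    """Map a firing rate to a stimulation frequency, clamped to [0, max_freq_hz].

    gain and max_freq_hz are illustrative values, not taken from the source.
    """
    return min(max(gain * rate_hz, 0.0), max_freq_hz)
```

In a closed loop, the first function would be applied to spikes detected on-line, and the second to the firing rate of the artificial sub-network.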

## **A BI-DIRECTIONAL NEURO-PROSTHESIS**

The knowledge that we gained through the various studies presented here will contribute to the final realization of a bidirectional communication between *in vitro* and *in silico* models of interconnected cell assemblies. By studying the dynamics of *in vitro* networks (see **Figures 2**, **3**), we will create a computational model (see **Figure 6**) exhibiting the same I/O function as its biological counterpart (**Figure 8**, panel A). Through this approach we also plan to further our knowledge about the interplay between structural connectivity and dynamics in neuronal networks. Once we have realized and tested our model, we will bi-directionally integrate it into a biological network made up of a few interconnected sub-networks, replacing one of these that has been previously lesioned (**Figure 8A**, right panel).

The same conceptual approach will be applied to the olfactory-limbic pathway in the IWB (**Figure 8**, panel B). After a thorough characterization of spontaneous activity patterns (e.g., spontaneous periodic events, which strikingly resemble the ones shown by primary cortical cultures; see **Figure 1**) and LOT-evoked responses generated in the m-ERC (see **Figure 8**), we will include such information into a realistic computational model. We will then induce an ischemic lesion of the hippocampus and realize a functional model able to reproduce the same transfer function of the damaged part in order to restore the original pathway. **Figure 8** summarizes this approach, both for *in vitro* interconnected finite-size networks and for the guinea pig IWB.

The final step foreseen in the BRAIN BOW project is the hardware implementation of the signal processing algorithms and computational models to achieve our proof-of-concept neuroprosthesis based on a neuromorphic chip. **Figure 9** illustrates the closed-loop architecture that we plan to develop. Raw traces recorded by means of either planar or implanted MEAs (depending on the biological sample) will be fed into the artificial element and pre-processed online to extract multi-unit activity patterns (MUA). Spatio-temporal spiking patterns will then be translated by the "decoding" block into signals delivered to the neuronal network model. After elaboration, output patterns produced by the model will be finally translated by the "coding" block into a stimulation delivered to the neural element (**Figure 9**).

## **DISCUSSION**

This paper presents a bottom-up, multidisciplinary approach toward the realization of a neural prosthesis capable of replacing lesioned neuronal circuitries. The final goal of the studies consists of developing a neuromorphic chip reproducing the function of a lesioned circuit without replicating its specific architecture or structural organization.

As a general model of a self-organized neuronal circuit, finite size neuronal circuits in culture are produced and studied in an isolated configuration to reveal innate (and therefore most general) features of intra-circuit organization (cf. **Figure 1**). Since finite size networks can spontaneously interconnect in a "multimodular" network organization, they also represent an optimal experimental model to reveal innate inter-circuit communication properties (cf. **Figure 1**), as shown in previous studies (Macis et al., 2007; Raichman and Ben-Jacob, 2008; Shein-Idelson et al., 2010, 2011). The structural-functional configuration of the finite size circuits can be replicated by an *in silico* neuronal network and then implemented on a neuromorphic prosthetic chip. The capability of the neuromorphic chip to replace the function of a lesioned circuit will be tested at increasing levels of network complexity, from an *in vitro* modular network to an isolated whole brain system (IWB). In an attempt to present the overall scientific approach of the BRAIN BOW project (cf. sections "Introduction" and "A Bi-Directional Neuro-Prosthesis"), this paper shows first results from the different levels of investigation grounding the overall strategy.

#### **FINITE SIZE CIRCUITS AND INNATE FUNCTIONAL ORGANIZATION OF CORTICAL CIRCUITS**

As previously shown by Shein-Idelson et al. (2010), cultured cortical neuronal networks composed of at least a few dozen neurons are able to produce spontaneous collective dynamics known as network bursts, characterized by oscillatory activity in the gamma-theta range, and with the frequency of the bursts increasing with the number of neurons in the network. We confirmed these findings here using optical measurements on monolayer circuits (cf. **Figure 2**). By combining calcium imaging with immunocytochemistry, we have found that network events first recruit a characteristic population of neurons which includes GABAergic neurons. In particular, the time-lag correlation of the finite size cortical circuits is similar to that observed in developing hippocampal circuits (Bonifazi et al., 2009), in which a scale-free functional connectivity distribution was accompanied by the existence of GABAergic hub cells able to play a key role in the orchestration of the spontaneous network events. Altogether, these observations suggest that cortical neuronal circuits share a common innate functional organization which might include the existence of GABAergic hub cells.

#### **MONITORING EFFECTS OF LESIONED NEURONAL CIRCUITS IN FINITE SIZE NETWORKS**

After characterizing the spontaneous dynamics of the finite size networks we monitored how a focalized lesion can trigger functional reorganization in the neuronal circuit. We made controlled laser ablations of different intensities on our networks (e.g., targeting single modules, inter-connections between modules, single neurites/cell bodies/cell assemblies). After the lesions, the neuronal circuits continued to produce spontaneous network events with no significant changes in the frequency of occurrence (**Figures 3** and **A1**). These were presumably generated outside the imaged field where the lesions were performed. The number of cells recruited during network events decreased either because they were directly lesioned by the laser ablation or because of a change in the local functional organization of the circuits (see the functional connectivity graphs of **Figure 4**). In a previous study, Maeda et al. (1995) made a lesion in a homogeneous network over a MEA to study the origin of spontaneous network bursting. More recently, Difato et al. (2011a) reported controlled sequential ablation of single connections in a neuronal network, causing modulation of its activity without irreversibly damaging it. By combining MEA recording and calcium imaging the authors found changes in electrophysiological patterns in the network and identified the contribution of neuronal sub-populations to the network activity (Difato et al., 2011b). To the best of our knowledge, our study is the first to make a spatially defined micro-lesion at the single cell scale and to analyze the neuronal dynamics and connectivity by means of optical-only tools. This methodology, which can be extended to the use of genetically encoded calcium sensors, allows a more detailed and prolonged monitoring of the functional reorganization of the circuit over hours or days with the advantage, when compared to electrophysiological recordings, that the high spatial resolution (i.e., single cell) can be linked to morphological/structural cellular properties through *post-hoc* immunocytochemical characterizations. This could also facilitate testing of methods to promote functional circuit repair, such as pharmacological approaches.

#### **SIMULATION RESULTS (SOFTWARE AND HARDWARE)**

Given the similarity between the synchronization dynamics observed in developing hippocampal networks (Bonifazi et al., 2009) and in the finite circuits (**Figure 2**) with early activated GABAergic cells forecasting synchrony, we hypothesized a common innate structural-functional organization in neocortical and paleocortical circuits. Therefore, we used a scale-free topology (Barabasi and Albert, 1999) to model a neuronal network based on Izhikevich neurons (Izhikevich, 2003) (**Figure 6**). The proposed model was able to reproduce the empirical dependence between bursting rate and circuit size. However, the model predicted a richer repertoire of firing patterns (e.g., **Figure 6B**). Indeed, such patterns can be found in biological networks (Segev et al., 2002; Macis et al., 2007; Marconi et al., 2012). Thus our synthetic models (conveniently tuned and adapted) are able to reproduce the dynamics found in *in vitro* networks. Our results also demonstrate that the hardware element of the prosthesis (cf. section "Hardware Set-up for a Brain Prosthesis" and "A Bi-Directional Neuro-Prosthesis") can be constituted by a neuromorphic model (SNN) built on the same equations as the computational model (Izhikevich, 2003), since it reproduces similar firing rate distributions (**Figure 7**). Thus, the computational (software) model serves as a bridge between the biological networks and the hardware implementation.
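As an illustration of the modeling approach described above, the following minimal sketch couples Izhikevich neurons (Izhikevich, 2003) through a scale-free graph grown by preferential attachment (Barabasi and Albert, 1999). It is not the model used in this study: the network size, the attachment parameter `m`, the uniform excitatory weight, the noisy input current, the undirected connections, and the use of regular-spiking parameters for all cells are assumptions chosen only to keep the example short.

```python
import numpy as np

def barabasi_albert_edges(n, m, rng):
    """Grow an undirected scale-free graph by preferential attachment."""
    edges = [(i, j) for i in range(m + 1) for j in range(i)]  # seed clique
    # Each node appears in `targets` once per incident edge (degree-weighted).
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(targets[rng.integers(len(targets))])
        for old in chosen:
            edges.append((new, old))
            targets.extend((new, old))
    return edges

def simulate(n=100, m=2, steps=1000, seed=0):
    """Simulate `steps` ms of activity; return a list of (time_ms, neuron) spikes."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n, n))
    for i, j in barabasi_albert_edges(n, m, rng):
        W[i, j] = W[j, i] = 6.0  # hypothetical uniform excitatory weight
    a, b, c, d = 0.02, 0.2, -65.0, 8.0  # regular-spiking parameters
    v = np.full(n, c)   # membrane potential (mV)
    u = b * v           # recovery variable
    spikes = []
    for t in range(steps):  # 1 ms per step
        fired = v >= 30.0
        spikes.extend((t, int(k)) for k in np.flatnonzero(fired))
        v[fired] = c
        u[fired] += d
        # Assumed input: Gaussian noise plus input from neighbors that just fired.
        I = 5.0 * rng.standard_normal(n) + W @ fired
        for _ in range(2):  # two 0.5 ms half-steps for numerical stability
            v += 0.5 * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
        u += a * (b * v - u)
    return spikes
```

Calling `simulate()` returns the spike raster as a list of `(time_ms, neuron_index)` pairs, from which firing-rate distributions or burst rates could be computed for different values of `n`.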

#### **COMPARISON TO PREVIOUS WORK AND PROSPECTIVE RESULTS**

In the last decades, great efforts have been made to develop neuroprostheses to restore lost sensory or motor functions (Taylor et al., 2002; Chader et al., 2009; Collinger et al., 2013), but very few groups have focused on neuro-prostheses targeting lesions at the level of the CNS and aimed at recovering lost cognitive capabilities (Berger et al., 2011; Prueckl et al., 2011; Bamford et al., 2012; Hampson et al., 2012; Opris et al., 2012). Although our studies are limited to simplified *in vitro* models of cell assemblies, their final aim is to provide useful insights for the design of future cognitive prostheses. We believe that our approach would help us understand how we can influence/drive the dynamics of a neuronal assembly by interfacing it to an artificial network, implemented either in software or hardware. This is not the first attempt to realize an *in vitro* closed-loop system: previous studies have used a robotic actuator or a control algorithm aimed at clamping network activity to a desired level (Demarse et al., 2001; Martinoia et al., 2004; Wagenaar et al., 2005; Wallach et al., 2011). However, we seek to extend these approaches by replacing a real biological network with a simulated network, and hence by implementing bi-directional communication between biological and simulated networks. This research project builds on previously published results in the field of *in vitro* closed-loop electrophysiology (Arsiero et al., 2007). It can also be generalized to a more structured experimental model like the *in vitro* whole brain of a guinea pig, which lies between *in vivo* (as it retains the original tridimensional architecture) and *in vitro* (as it is disconnected from any sensory input/motor output). 
In contrast to other groups which have exclusively investigated *in vivo* brain prostheses (Prueckl et al., 2011; Bamford et al., 2012; Berger et al., 2012; Hampson et al., 2012; Opris et al., 2012), we are trying to exploit the unique advantages of *in vitro* electrophysiology (accessibility, visibility, and control of physical and chemical conditions) to study neural information processing in neuronal assemblies, and to understand which parameters are relevant for effectively interfacing biological and artificial networks. In addition to informing the design of future *in vivo* approaches, our approach could also illuminate how network structure constrains and drives network dynamics.

## **ACKNOWLEDGMENTS**

The research leading to these results has received funding from the European Union's Seventh Framework Programme (ICT-FET FP7/2007-2013, FET Young Explorers scheme) under grant agreement n. 284772 BRAIN BOW (www.brainbowproject.eu).

The authors wish to thank Dr. Marina Nanni and Claudia Chiabrera from NBT-IIT for technical assistance in cell culture preparation; Dr. Pieter Laurens Baljon for the MATLAB implementation of the SALPA algorithm; Prof. Marco de Curtis for helpful discussion about the *in vitro* guinea pig whole brain preparation and recording; Prof. Sergio Martinoia, Prof. Sylvie Renaud, Prof. Eshel Ben-Jacob, Prof. Ari Barzilai, and Prof. Yael Hanein for their support and for useful discussion.

The authors at TAU thank Nitzan Herzog for help with the calcium imaging setup, Mark Shein-Idelson for network patterning, Inna Brainis for cell culture, and Moshe David-Pur for silicon wafer processing. Dr. Paolo Bonifazi is additionally supported by the Joint Italy-Israel Laboratory on Neuroscience.

#### **REFERENCES**

A. (1996). Spontaneous periodic synchronized bursting during formation of mature patterns of connections in cortical cultures. *Neurosci. Lett.* 206, 109–112.

*Neural Circuits* 6:99. doi: 10.3389/fncir.2012.00099


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; accepted: 25 February 2013; published online: 14 March 2013.*

*Citation: Bonifazi P, Difato F, Massobrio P, Breschi GL, Pasquale V, Levi T, Goldin M, Bornat Y, Tedesco M, Bisio M, Kanner S, Galron R, Tessadori J, Taverna S and Chiappalone M (2013) In vitro large-scale experimental and theoretical studies for the realization of bi-directional brain-prostheses. Front. Neural Circuits 7:40. doi: 10.3389/fncir.2013.00040*

*Copyright © 2013 Bonifazi, Difato, Massobrio, Breschi, Pasquale, Levi, Goldin, Bornat, Tedesco, Bisio, Kanner, Galron, Tessadori, Taverna and Chiappalone. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDIX**

**FIGURE A1 | Subpopulation of a neuronal network before and after laser-induced lesions.** The first panel, starting from the upper left corner, shows a subpopulation of a neuronal network loaded with Fluo4-AM. White arrows, labeled L1 and L2, indicate the positions of the lesions inflicted on the network. The red arrows indicate the position of the UV laser focus spot. The average power delivered at the sample is 4 μW during Lesion 1 and 5 μW during Lesion 2. We delivered 300 UV light pulses for each lesion, at a pulse-repetition rate of 100 Hz. At 25 s, after Lesion 1, the L2 position is centered onto the UV focus spot. The last panel shows the same field of view as the first one, after the laser-inflicted damage. The cells directly affected by the UV laser presented a saturated calcium signal. Numbers indicate seconds. The field of view is 150 × 150 μm. Calcium imaging was acquired at 60 Hz.

## Modular neuronal assemblies embodied in a closed-loop environment: toward future integration of brains and machines

#### **Jacopo Tessadori<sup>1</sup>, Marta Bisio<sup>1</sup>, Sergio Martinoia<sup>1,2</sup> and Michela Chiappalone<sup>1</sup>\***

<sup>1</sup> Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genova, Italy

<sup>2</sup> Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genova, Genova, Italy

#### **Edited by:**

Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany

#### **Reviewed by:**

Guo-Qiang Bi, University of Pittsburgh, USA

Wolfgang Stein, Illinois State University, USA

#### **\*Correspondence:**

Michela Chiappalone, NeuroTech Unit, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy. e-mail: michela.chiappalone@iit.it

Behaviors, from simple to most complex, require a two-way interaction with the environment and the contribution of different brain areas depending on the orchestrated activation of neuronal assemblies. In this work we present a new hybrid neuro-robotic architecture based on a neural controller bi-directionally connected to a virtual robot implementing a Braitenberg vehicle aimed at avoiding obstacles. The robot is equipped with proximity sensors and wheels, allowing it to navigate within a circular arena with obstacles of different sizes. As neural controller, we used hippocampal cultures dissociated from embryonic rats and kept alive over Micro Electrode Arrays (MEAs) for 3–8 weeks. The developed software architecture guarantees a bi-directional exchange of information between the natural and the artificial part by means of simple linear coding/decoding schemes. We used two different kinds of experimental preparation: "random" and "modular" populations. In the second case, the confinement was assured by a polydimethylsiloxane (PDMS) mask placed over the surface of the MEA device, thus defining two populations interconnected via specific microchannels. The main results of our study are: (i) neuronal cultures can be successfully interfaced to an artificial agent; (ii) modular networks show different dynamics with respect to random cultures, both in terms of spontaneous and evoked electrophysiological patterns; (iii) the robot performs better if a reinforcement learning paradigm (i.e., a tetanic stimulation delivered to the network following each collision) is activated, regardless of the modularity of the culture; (iv) the robot controlled by the modular network further enhances its capabilities in avoiding obstacles during the short-term plasticity trial. The developed paradigm offers a new framework for studying, in simplified model systems, neuro-artificial bi-directional interfaces for the development of new strategies for brain-machine interaction.

**Keywords: bi-directional, in vitro, hippocampal cultures, confinement, micro electrode array, robot**

## **INTRODUCTION**

Algorithms based on classical models of computation cannot match the capabilities of living beings in dealing with unexpected situations. Different fields of study, such as developmental biology (West-Eberhard, 2003; Gilbert, 2009), embodied cognition (Clark, 1997), and evolutionary robotics (Bongard, 2011), point to the lack of a developmental phase in traditional silicon-based technology as a likely cause of this shortcoming. This developmental process is especially evident in the Central Nervous System (CNS), where morphological changes, both reversible and permanent, occur on a wide range of different time scales. One possible way to deal with this issue is the realization of hybrid systems, where biological components could be exploited for their plastic properties.

In the recent past, several different hybrid model systems have been developed (DeMarse et al., 2001; Martinoia et al., 2004; Mussa-Ivaldi et al., 2010; Warwick et al., 2010; Kudoh et al., 2011), consisting of living neurons coupled to a robotic system.

This solution allows the use of an artificial body whose dynamics can be easily and completely modeled, as opposed to the case of even the simplest animals. Furthermore, the exchange of information in a hybrid system can be limited to the desired level of complexity.

Following this "embodied neurophysiology" approach, we built a closed-loop electrophysiological system by interfacing a virtual mobile robot with a population of neurons, extracted from rat embryos and cultured over Micro Electrode Arrays (MEA; Novellino et al., 2007). The proposed paradigm represents an innovative, simplified, and controllable closed-loop system where it is possible to investigate the dynamic and adaptive properties of a neural population interacting with an external environment by means of an artificial body (i.e., the mobile robot). The main innovations of this experimental setup are: (i) the flexible software architecture at the base of the closed-loop experiments, here described in detail; (ii) the introduction of a modular network design. Starting from the observation of the high degree of modularity in the brain, different studies point out how such a property is likely to have a profound impact on neural activity (Hubel et al., 1977; Sporns et al., 2000; Derdikman et al., 2003; Kumar et al., 2010; Pan et al., 2010; Boucsein et al., 2011). In this work, we took advantage of the modular structure of the network to obtain a better separation between interacting cell assemblies. A significant improvement over previous work would be the added capability of inducing plastic changes in a controlled fashion. A step in this direction is taken in this setup by the use of a tetanic stimulation, delivered following a collision with an obstacle, to enhance interconnected pathways and thereby improve robot behavior (Jimbo et al., 1999; Chiappalone et al., 2008). It is worth pointing out that the final objective of this work is not to achieve the best possible control of the robot: excluding any biological component would, at this stage, easily provide better performance and more reliable results. What is being developed here is groundwork for the integration of electronic systems and neural networks, with the twofold long-term objective of taking advantage of neural plasticity in more complex control systems and performing closed-loop experiments to gauge the computational and learning properties of relatively simple neural preparations.

## **MATERIALS AND METHODS**

The setup developed for experiments of embodied electrophysiology is characterized by several different software, hardware and wetware components (**Figure 1**). The wetware part consists of hippocampal neurons cultured onto a standard 60-electrode MEA. The front-end electronics are constituted by a MEA1060-Inv-BC amplification system (Multichannel Systems, MCS, Reutlingen, Germany) and the computer used is a desktop machine (Dell Precision T5500, 2.66 GHz, 3.43 GB RAM) equipped with a DAQ E NI6255 (National Instruments, Austin, TX, USA) data acquisition board. An *ad hoc* adaptor was realized to interface the DAQ board with the amplification system. The software used for the management and acquisition is HyBrain2, a specifically developed software based on what is described in a previous work (Mulas et al., 2010): it allows control of all the parameters of the neurorobotics experiments and performs the required data processing, such as the implementation of the coding, decoding and short-term plasticity schemes. Information is sent to the culture as a series of electrical stimulations through a Stimulus Generator 4002 (Multichannel Systems). Three different robots can be used for the experiments: two physical ones (Khepera II and its successor Khepera III, from K-Team, Zi les Plains-Praz, Switzerland) and a virtual implementation within the HyBrain2 architecture. The relevant elements of the robot are a set of distance sensors and two independently controlled wheels. Both the physical and the virtual robots move in a circular arena with obstacles. In all of the experiments, the task the robot is trying to perform is obstacle avoidance. While both physical robots have been tested and work properly within the setup, only experiments with the virtual one are reported in the following. The main problems with the physical robots are that their position must be tracked from camera images (which is computationally expensive and occasionally fails) and that their sensors are non-ideal: among other things, ambient lighting conditions affect the performance of the infrared distance sensors, and it was deemed unwise to add such a source of unpredictability at this stage of the development.

## **NETWORK MODULE**

#### **Neuronal preparation: random and modular cultures**

Dissociated neuronal cultures were prepared from hippocampi of 18-day-old embryonic rats (pregnant female rats were obtained from Charles River Laboratories). Culture preparation was performed as previously described (Frega et al., 2012). Briefly, the hippocampi of 4–5 embryos were dissected out from the brain and dissociated first by enzymatic digestion in trypsin solution 0.125% (30 min at 37°C) and subsequently by mechanical dissociation with a fine-tipped Pasteur pipette. The resulting tissue was re-suspended in Neurobasal medium supplemented with 2% B-27, 1% Glutamax-I, 1% Pen-Strep solution, and 10% Fetal Bovine Serum (Invitrogen, Carlsbad, CA, USA), at the final concentration of 60 k cells/ml.

**FIGURE 1 |** [...] a micro electrode array; (ii) a computer which hosts the developed software tool (i.e., HyBrain2) which manages the communication between the biological and the artificial part; (iii) the robotic module composed by a robot, either real or virtual, with sensors and actuators navigating into a circular arena with obstacles.

Cells were then plated onto standard 60-channel MEAs previously coated with poly-d-lysine and laminin to promote cell adhesion (final density around 1200 cells/mm<sup>2</sup>) and maintained with 1 ml of nutrient medium (**Figures 2A,B**). They were then placed in a humidified incubator having an atmosphere of 5% CO<sub>2</sub> and 95% air at 37°C. Half of the medium was changed weekly. Recordings were performed on cultures between 20 and 60 days *in vitro* (DIVs).

Considering the multitude of connections that usually form in a random culture, one way to better control network complexity is to constrain the growth of neuronal cells along specific pathways (Chang et al., 2001; Boehler et al., 2012). To do this, a dual-compartment chamber with two interconnecting microchannels was realized in polydimethylsiloxane (PDMS), a biocompatible, inert, and non-toxic polymer often used for this purpose (Raichman and Ben-Jacob, 2008; Levy et al., 2012). The modular structures were fabricated by replica molding from a dedicated master, using a previously developed technique (Berdondini et al., 2006). The resulting structures were then placed on MEA substrates in order to confine the growth of the neuronal cells plated on them, as shown in **Figure 2B**.

#### **Micro electrode arrays**

Micro electrode arrays (Multichannel Systems, MCS, Reutlingen, Germany) consist of 60 TiN/SiN planar round electrodes (30 µm diameter; 200 µm center-to-center inter-electrode distance, see **Figure 2A**) arranged in an 8 × 8 square grid excluding corners. In some devices, one recording electrode is replaced by a larger ground electrode. Each electrode provides information on the activity of the neural network in its immediate area. A microwire connects each micro electrode of the MEA to a different channel of a dedicated amplifying system with a gain of 1100. The amplified 60-channel data are then conveyed to the data acquisition card, which samples them at 10 kHz per channel and converts them into digital, 12-bit data (**Figures 2C,D**).

#### **HYBRAIN2 SOFTWARE**

The need for real-time access to data led to the adoption of a general-purpose acquisition card (NI6255, National Instruments, Austin, TX, USA) and required the development of dedicated software: HyBrain2. The core of the program handles incoming data from the acquisition card and graphically displays them in a panel such as the one shown in **Figure 3A**. Spike detection options can be selected from this panel, such as threshold amplitudes or update times, as well as software blanking of stimulus artifacts. While a rather sophisticated algorithm for blanking (i.e., SALPA filtering; Wagenaar and Potter, 2002) has been included and validated, it has not been used in the described experiments, as it tends to compete for CPU time with the rest of the system, leading to occasional resource starvation. In its current version, HyBrain2 does not make use of raw data other than for displaying. Instead, incoming data is processed by a spike detection algorithm (Maccione et al., 2009) whose output is a series of time stamps.
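To make the idea of reducing raw traces to time stamps concrete, the sketch below implements a simple amplitude-threshold detector. It is not the algorithm of Maccione et al. (2009) used in HyBrain2: the negative-threshold rule, the median-based noise estimate, the threshold multiplier `k`, and the 1 ms refractory period are all assumptions made for illustration; only the 10 kHz sampling rate comes from the text.

```python
import numpy as np

def detect_spikes(trace, fs=10_000.0, k=5.0, refractory_ms=1.0):
    """Return spike time stamps (s) at downward crossings of -k * sigma."""
    trace = np.asarray(trace, dtype=float)
    # Robust noise estimate from the median absolute amplitude (assumed rule).
    sigma = np.median(np.abs(trace)) / 0.6745
    below = trace < -k * sigma
    # Keep only the first sample of each excursion below threshold.
    onsets = np.flatnonzero(below[1:] & ~below[:-1]) + 1
    refractory = int(round(refractory_ms * 1e-3 * fs))
    kept, last = [], -refractory - 1
    for idx in onsets:
        if idx - last > refractory:  # discard crossings inside the refractory period
            kept.append(idx)
            last = idx
    return np.array(kept) / fs
```

The returned array of time stamps is the only representation that a rate-based coding or decoding stage would need from each electrode.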

As explained later in more detail, both the coding and decoding algorithms for the closed-loop control of the robot are rate-based; spike time stamps therefore retain all the information these algorithms need. **Figure 3B** shows the panels used for configuration of the parameters of these algorithms, such as selection of recording and stimulation electrodes, pulse amplitudes and lengths, and maximum and minimum allowed wheel speeds for the robot.

**FIGURE 2 | Random and modular neuronal assemblies over micro electrode arrays. (A)** On the left, a random culture grown on a standard MEA device. On the right, the MEA layout is shown: a square matrix of 59 micro electrodes (the missing one is the reference electrode), in which the inter-electrode distance is 200 µm and the micro electrode diameter is 30 µm. **(B)** On the left, a confined culture on a MEA substrate. On the right, the bi-compartmental system realized in PDMS with two interconnection microchannels. Compartment height is 700 µm, and width is 1500 µm. Microchannel height is 100 µm, and width is 50 µm. **(C)** Spontaneous electrophysiological activity of a confined culture of hippocampal neurons, recorded from all the micro electrodes. **(D)** A typical hippocampal burst waveform recorded from a single channel.

**FIGURE 3 |** [...] display panel, including options for data visualization, artifact filtering, and spike detection. **(B)** Several panels allow the configuration of coding and decoding algorithms and saving of data during experiments. **(C)** The robot [...] robot, this can also be used to draw the arena itself. **(D)** The physical robot inside the arena where two obstacles are placed. The dotted red line represents the trajectory of the robot inside the arena.

A module of the software is dedicated to managing the robot itself: in **Figure 3C**, a sample experiment with a virtual robot is shown. Here, the software generates the robot environment and controls all the relevant parameters of the robot itself, while, in the case of a physical robot (such as that in **Figure 3D**), the software provides a simple tracking feature on images provided by a webcam positioned over the arena and the required communication with the robot itself. All the data produced during experiments, including electrode readings, time stamps, and robot navigation data, can be stored for later analysis in text and/or binary format, while common parameter configurations can be saved and loaded in order to minimize experiment setup times and human errors.

#### **ROBOTIC MODULE**

The robot, either virtual or physical, is basically a two-wheeled sensor platform: six infrared sensors are mounted on the robot at different angles, providing information about the distance of surrounding objects in different directions, whereas the speed profiles of the two wheels determine the direction and velocity of the robot itself.

The arena consists of an enclosed space containing the robot and several different round obstacles in random positions. A typical experiment with the virtual robot is shown in **Figure 3C**: the robot is moving in a 400 × 400 pixel circular arena, where dark green pixels represent obstacles or arena walls, whereas light green pixels are free for the robot to move in. The robot (small pink circle) is collecting information about its environment through its six sensors: each black line departing from the robot represents the line of sight of a different sensor; their angles are fixed with respect to the robot heading (in this case, 30°, 45°, and 90° on both sides of the robot direction), while the length of each line is equal to the distance from the robot center to the closest obstacle in the sensor direction. This distance defines the reading of the sensor: the output is 0 if the robot is in direct contact with an obstacle, 1 if the closest obstacle is at the maximum distance possible (the diameter of the arena, in this case). The three sensor readings on each side are averaged to provide the neuronal network with a single value per side.
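The sensor model described above can be sketched in a few lines. The function names, the dictionary layout, and the sign convention for angles (negative for the left side) are hypothetical; the sketch also assumes that distances are measured so that 0 corresponds to direct contact, and it follows the convention stated in this section (0 at contact, 1 at the maximum possible distance).

```python
def sensor_reading(distance, max_distance):
    """Normalized reading: 0 at contact, 1 at the maximum possible distance."""
    return min(max(distance / max_distance, 0.0), 1.0)

def side_inputs(distances, max_distance):
    """Average the readings on each side of the robot.

    distances: dict mapping sensor angle in degrees (negative = left,
    positive = right) to the distance of the closest obstacle.
    Returns the (left, right) averaged readings.
    """
    left = [sensor_reading(d, max_distance) for a, d in distances.items() if a < 0]
    right = [sensor_reading(d, max_distance) for a, d in distances.items() if a > 0]
    return sum(left) / len(left), sum(right) / len(right)

# Example with the sensor angles quoted in the text and a 400-pixel arena.
left, right = side_inputs(
    {-90: 400, -45: 200, -30: 0, 30: 100, 45: 100, 90: 100}, max_distance=400
)
```

In this example the left side averages readings of 1, 0.5, and 0, while the three right-side sensors all see an obstacle at a quarter of the maximum distance.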

In the case shown in **Figure 3C**, the robot is performing an obstacle avoidance task, as can be inferred from the red trajectory. The speed of a wheel is inversely proportional to the average of the sensor readings on the same side; therefore, the robot turns away from close obstacles. The ideal behavior of the robot is that of a Braitenberg vehicle (Braitenberg, 1984) in the case of no loss of information and no significant delays between sensor data collection and motor command execution. Obtaining a behavior as close as possible to this one is the goal of the coding, decoding, and short-term plasticity process implemented here.

During experiments, collisions with obstacles or walls are unavoidable: following such an event, the robot moves back to a previous position in its path, at a fixed distance from any obstacle.

#### **INTERFACING THE NETWORK AND THE ROBOTIC MODULE**

#### **Decoding scheme**

Although many different decoding schemes are possible, so far the only one implemented is a firing-rate-based algorithm (Adrian, 1928; Rieke et al., 1997; Martinoia et al., 2004). For this scheme, only one feature of the recorded signals is used: the frequency of spikes at each location. A group of electrodes (i.e., a sub-population of neurons) on the MEA is selected and defined as the "output area" through the procedure described in the Section "Experimental protocol." The number of spikes occurring over that area in non-overlapping 100 ms windows constitutes the basis for calculating the motor signal for the corresponding wheel. In the current architecture, a linear relation is implemented between wheel speed and motor signal: if no spikes are detected in a time window, the corresponding wheel turns at a set minimum speed; the speed then increases linearly with the number of detected spikes until a defined maximum firing rate is reached, beyond which the wheel turns at its maximum speed. A low-pass filtering effect is added by averaging each sample with the previous one, in order to smooth robot movements.

Dissociated neural networks are especially prone to bursting (Chiappalone et al., 2006) and this pattern of activity has been shown to code different information than just the sum of its spikes (Cozzi et al., 2006). A module for the detection of bursts has already been added to the HyBrain2 software, but its output is not yet part of the control loop of the robot.

For each wheel, the speed is therefore defined as:

$$\omega_{i} = \begin{cases} \dfrac{f_{i,t} + f_{i,t-1}}{2f_{i}^{\text{MAX}}} \left( \omega_{i}^{\text{MAX}} - \omega_{i}^{\text{min}} \right) + \omega_{i}^{\text{min}} & \text{for } f_{i} < f_{i}^{\text{MAX}}\\ \omega_{i}^{\text{MAX}} & \text{for } f_{i} \ge f_{i}^{\text{MAX}} \end{cases}$$

where subscript *i* denotes wheel side, ω<sub>*i*</sub> is the wheel speed, and *f*<sub>*i*,*t*</sub> is the averaged firing rate over all the electrodes corresponding to the *i*-th recording area at time sample *t*. ω<sub>*i*</sub><sup>MAX</sup>, ω<sub>*i*</sub><sup>min</sup>, and *f*<sub>*i*</sub><sup>MAX</sup> are parameters set by the experimenter before the start of the experiment.
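The decoding rule can be written as a short function. This is a sketch rather than the HyBrain2 implementation: the function and argument names are hypothetical, and the saturation condition is assumed to apply to the two-sample average of the firing rate.

```python
def wheel_speed(f_t, f_prev, f_max, w_min, w_max):
    """Map the firing rate of an output area to the speed of one wheel.

    f_t, f_prev: firing rates in the current and previous 100 ms windows.
    f_max, w_min, w_max: experimenter-set parameters.
    """
    f_avg = 0.5 * (f_t + f_prev)  # two-sample average (low-pass effect)
    if f_avg >= f_max:            # saturate at the maximum wheel speed
        return w_max
    return f_avg / f_max * (w_max - w_min) + w_min
```

With no spikes in either window the function returns `w_min`; with an averaged rate of half `f_max` it returns the midpoint between `w_min` and `w_max`.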

#### **Coding scheme**

Likewise, the coding scheme is linear and rate-based: two groups of electrodes are defined as "input areas" and assigned to the sensors on the left and right side of the robot body. The details for area selection are fully explained in the Section "Experimental Protocol." Each sensor provides a reading, normalized to 1 for an object in direct contact with the robot and 0 for an object at the far end of the designed arena (while this behavior is nearly ideal for the virtual robot, it is far from so in the case of the physical robot, as already mentioned in the Section "Materials and Methods"). Note that this convention is the complement of the distance-based sensor output described in the Section "Robotic Module" (0 at contact, 1 at maximum distance). The readings from the sensors on the same side of the robot are then averaged and coded back to the corresponding sensory area. As mentioned before, the coding is linear and frequency-based: a fixed stimulus is delivered to the sensory area at a frequency directly proportional to the averaged same-side sensor readings. The stimulation rate for each input region is determined as:

$$s\_i = \left(s\_i^{\text{MAX}} - s\_i^{\text{min}}\right)r\_i + s\_i^{\text{min}}$$

where *s*<sub>*i*</sub> is the stimulation rate of the *i*-th input area and *r*<sub>*i*</sub> the normalized average of all the sensor readings on the corresponding side of the robot, whereas *s*<sub>*i*</sub><sup>MAX</sup> and *s*<sub>*i*</sub><sup>min</sup> are user-set parameters fixing the maximum and minimum stimulation rates.
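The coding rule can be sketched in the same way. The names below are illustrative, and the clamping of the averaged reading to the interval [0, 1] is an added safeguard that the text does not mention.

```python
def stimulation_rate(sensor_readings, s_min, s_max):
    """Map same-side sensor readings to the stimulation rate of one input area.

    sensor_readings : normalized readings (0 = far end of the arena, 1 = contact)
    s_min, s_max    : minimum and maximum stimulation rate (user-set)
    """
    # r_i: average of all the sensor readings on one side of the robot.
    r = sum(sensor_readings) / len(sensor_readings)
    # Assumption: keep r inside [0, 1] in case of out-of-range readings.
    r = min(max(r, 0.0), 1.0)
    # s_i = (s_max - s_min) * r_i + s_min
    return (s_max - s_min) * r + s_min
```

An obstacle-free side (all readings at 0) is stimulated at `s_min`, while a side in contact with an obstacle (all readings at 1) is stimulated at `s_max`.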

#### **Short-term plasticity protocol**

In order to progress toward the desired behavior, it is necessary to define a learning rule that allows a modification of connectivity between input and output areas by rewarding "good behavior," while discouraging "bad behavior." The effect of tetanic stimulation in these networks was already demonstrated by our group and by others in the past, showing that a 20 Hz stimulation should strengthen the synaptic connections of receiving neurons (Jimbo et al., 1999; Tateno and Jimbo, 1999; Madhavan et al., 2007; Chiappalone et al., 2008; le Feber et al., 2010). In all these papers the effect of the tetanus on the change of firing rate was studied in a time frame comparable to that of our experiments (30 min to 1 h). Additionally, in a previous paper from our group (Chiappalone et al., 2008), we were able to demonstrate that a single tetanic shock to a neuronal network had an immediate effect in terms of an increase in the Post Stimulus Time Histogram (PSTH) area (i.e., an increase in the number of spikes evoked by a stimulus), a medium-term effect (i.e., a few hours after the tetanus delivery), and a long-term effect (i.e., 1 day after the tetanus delivery).

The above observations have been used to define the learning rule in the current implementation of the software: following each robot collision, a 2-s-long, 20 Hz stimulation is delivered to the same-side input area. The rationale for this choice is that collisions are usually caused by poor correlation between stimulation in an input area and detected activity in the corresponding output area, thus making the network responses to stimulation insufficient to steer the robot in the correct direction. Our hypothesis is that tetanic stimulation strengthens all participating connections, thus correcting the problem, as demonstrated in the studies cited above. A tetanic stimulation induces short-term plasticity effects which allow the groups of neurons involved in the obstacle avoidance task to fire at a higher frequency, thus causing the corresponding wheel to increase its angular velocity. Since input-output regions were selected according to connection strength (see Experimental Protocol below), this should increase responses detected from the desired electrodes upon delivery of a stimulus from the input electrodes. This leads to a generalized strengthening of connections in the network and to an improvement in the driving of the robot.
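The timing of the learning rule can be illustrated with a short sketch that lists the pulse times of one tetanus. Only the duration (2 s), the rate (20 Hz) and the same-side targeting come from the text; the function name, the return format and the assumption that the train starts exactly at the collision time are hypothetical.

```python
def tetanus_after_collision(collision_side, collision_time_s,
                            duration_s=2.0, rate_hz=20.0):
    """Return the input area to stimulate and the pulse times of the tetanus.

    collision_side   : "left" or "right", side of the robot that hit the obstacle
    collision_time_s : time of the collision in seconds
    """
    if collision_side not in ("left", "right"):
        raise ValueError("collision_side must be 'left' or 'right'")
    # A 2-s train at 20 Hz contains 40 pulses, one every 50 ms.
    n_pulses = int(round(duration_s * rate_hz))
    period_s = 1.0 / rate_hz
    # Assumption: the first pulse is delivered at the collision time itself.
    pulse_times = [collision_time_s + k * period_s for k in range(n_pulses)]
    # The tetanus goes to the input area on the same side as the collision.
    return collision_side, pulse_times
```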

## **ON-LINE PROCESSING OF ELECTROPHYSIOLOGICAL SIGNALS**

#### **Spike detection**

The electrophysiological signals acquired from MEA electrodes must be preprocessed in order to remove the stimulus artifact and to isolate spikes from noise. The spike detection algorithm uses a differential peak-to-peak threshold to follow the variability of the signal, and a set of controls is performed in order to make the algorithm as reliable as possible (Maccione et al., 2009). The threshold is proportional to the noise SD and is calculated separately for each individual channel (typically as six or seven times the SD) before the beginning of the actual experiment (i.e., during phase 1 of the protocol described below).
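The following sketch shows the general idea of a per-channel, SD-based peak-to-peak threshold. It is a deliberately simplified stand-in and not the published algorithm of Maccione et al. (2009): the sliding-window logic, the refractory skip and all names are assumptions made for illustration.

```python
import statistics

def channel_threshold(baseline, k=7.0):
    """Threshold for one channel: k times the SD of a baseline recording."""
    return k * statistics.pstdev(baseline)

def detect_spikes(signal, threshold, window, refractory):
    """Return the sample indices of detected spikes on one channel.

    A spike is flagged when the peak-to-peak excursion inside a sliding
    window of `window` samples exceeds `threshold`. After a detection the
    search resumes `refractory` samples (>= 1) after the detected peak.
    """
    spikes = []
    i = 0
    while i + window <= len(signal):
        chunk = signal[i:i + window]
        if max(chunk) - min(chunk) > threshold:
            # Place the spike at the largest absolute deflection in the window.
            peak = i + max(range(window), key=lambda j: abs(chunk[j]))
            spikes.append(peak)
            i = peak + refractory
        else:
            i += 1
    return spikes
```

Because the threshold is computed from each channel's own baseline, noisy electrodes automatically receive a higher threshold than quiet ones.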

#### **Blanking of stimulus artifact**

Stimulus artifacts are detected when the recorded signal exceeds a defined threshold much higher than the one used for spike detection. The artifact is then suppressed by canceling the first samples in the spike train occurring immediately after it, corresponding to a signal blanking of 4 ms after stimulus delivery.
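A minimal sketch of this blanking step is given below, under the assumption that artifacts are located by a simple amplitude threshold and that blanking amounts to discarding the spikes detected in the 4 ms that follow each artifact. Function names and the data layout (spike and artifact times in seconds) are illustrative.

```python
def detect_artifacts(signal, fs_hz, artifact_threshold, blank_ms=4.0):
    """Return the onset times (s) of stimulus artifacts on one channel."""
    n_blank = max(1, int(round(blank_ms * 1e-3 * fs_hz)))
    onsets = []
    i = 0
    while i < len(signal):
        if abs(signal[i]) > artifact_threshold:
            onsets.append(i / fs_hz)
            # Skip the blanked interval so one artifact is reported once.
            i += n_blank
        else:
            i += 1
    return onsets

def remove_blanked_spikes(spike_times, artifact_times, blank_ms=4.0):
    """Discard the spikes that fall within blank_ms after any artifact onset."""
    blank_s = blank_ms * 1e-3
    return [t for t in spike_times
            if not any(a <= t < a + blank_s for a in artifact_times)]
```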

## **EXPERIMENTAL PROTOCOL**

The typical experimental protocol followed in this work consists of a five-step procedure:


During the first step of the experimental session, the spontaneous activity of the network is observed in order to determine, empirically, which electrodes are the most likely candidates as "input" sites (i.e., sites from which stimulation must be delivered). Typical features to look for in this phase are a sustained mean firing rate (i.e., a sufficient number of spikes per second, usually higher than 0.1 spikes/s) and patterns of activity not synchronous with other regions. The best candidates (usually a set of 8–10 sites) are then selected for the second step of the experiment. From each of the candidate "input" channels, in turn, a 500-µs, 1.5 V peak-to-peak, bipolar square wave is delivered every 5 s, until a total of 40 stimuli per channel have been delivered, while spiking activity is detected from the other electrodes.

At the end of this phase, for every stimulation electrode involved, 59 PSTHs are generated (Chiappalone et al., 2007): these graphs report the average number of spikes detected from each electrode in the 600 ms following each stimulation and therefore provide information on the strength of the connections in the culture. Through a custom-made script developed in the Matlab environment (The MathWorks, Natick, MA, USA), the generated PSTHs are then compared in order to look for areas that present a significant degree of specificity, i.e., where responses are not elicited by stimulation delivered from all the electrodes, but only from some of them. In this way, it is possible to define an output (recording) area that will respond mostly to stimulation from the corresponding input area, while remaining silent during stimulation from the opposite input area (see "Input and Output Sites of a Neuronal Population" of the Results).
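The PSTH computation described above can be sketched as follows. The authors used a custom Matlab script that is not shown in the paper, so this Python version is only an illustration: the 600 ms window comes from the text, the 4 ms bin is the one quoted in the figure captions, and the function names are hypothetical.

```python
def psth(stim_times, spike_times, window_s=0.6, bin_s=0.004):
    """Average number of spikes per stimulus in each post-stimulus bin.

    stim_times  : times (s) of the stimuli delivered from one electrode
    spike_times : times (s) of the spikes detected at one recording electrode
    """
    n_bins = int(round(window_s / bin_s))
    counts = [0] * n_bins
    for s in stim_times:
        for t in spike_times:
            lag = t - s
            if 0.0 <= lag < window_s:
                counts[min(int(lag / bin_s), n_bins - 1)] += 1
    # Divide by the number of stimuli to obtain the average response.
    return [c / len(stim_times) for c in counts]

def psth_area(hist):
    """Area of the PSTH: average number of spikes evoked by one stimulus."""
    return sum(hist)
```

Repeating the computation for the 59 recording electrodes gives, for each stimulation site, the set of PSTHs whose areas are compared in the selection of the output areas.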

During steps 3 and 4, the robot is left free to roam the arena with the rules described above, with a tetanic stimulus following each collision with an obstacle delivered during step 4. If the starting hypotheses hold true, this will progressively drive the network toward the desired condition of reliable and specific evoked responses.

Finally, we collect the data on the robot performance. In order to verify the neural-based behavior of the robot, we compared the results obtained (i) in a neuron-controlled experiment (a MEA with living neurons grown on it, bi-directionally connected to the robot), (ii) in an open-loop experiment (a MEA with living neurons grown on it, but without sensory feedback), and (iii) in an "empty" MEA experiment (a MEA with culturing medium only). In case (ii), the robot performs in a way imposed by the spontaneous firing rate of the neural network, usually in a random pattern, while in the case of the "empty" MEA (iii) the robot basically drives in a straight line (see the Supplementary Videos and Closed-Loop Robot Navigation of the Results).

#### **DATABASE OF EXPERIMENTS, DATA ANALYSIS, AND STATISTICS**

Experiments on a total of *N* = 17 different cultures, ranging from 20 to 60 DIV, have been conducted: 11 of those were random hippocampal cultures, while the other six experiments were conducted on hippocampal cultures, divided into sub-populations by a confinement mask, as described above. Those six cultures were also compared for spontaneous activity evaluation with a subset of six random cultures (age range of the subset: 21–42 DIV).

In order to highlight differences in terms of synchronization between the two populations, a cross-correlation algorithm was applied to the spike trains, a technique introduced previously (Frega et al., 2012). Briefly, the cross-correlation function (i.e., cross-correlogram) is defined by the incidence of a spike at electrode *y* after a spike was fired at electrode *x*. More specifically, given two spike trains (i.e., *x* and *y*) from two electrodes of a MEA, we count the number of spikes in the *y* train within a time frame of ±T (in the order of tens of milliseconds) around the spikes of the *x* train, using bins of amplitude ∆τ (usually set at a multiple of the sampling frequency). The correct *C*<sub>*xy*</sub>(τ) is obtained by means of a normalization procedure, by dividing each element of the array by the square root of the product between the number of peaks in the *x* and the *y* train. If the obtained *C*<sub>*xy*</sub>(τ) shows a distribution that clearly deviates from flat, electrodes *x* and *y* are considered correlated. For each cross-correlogram *C*<sub>*xy*</sub>(τ) we then estimated the coefficient *C*<sub>peak</sub>. *C*<sub>peak</sub> represents the value of the cross-correlogram in an area around the maximum detected peak and it is usually evaluated in order to quantify the *correlation level* between two recording channels. The statistical distribution of all *C*<sub>peak</sub> values was computed for the two experimental groups during spontaneous activity (i.e., random vs. modular cultures).

For each robot run, two different parameters have been computed in order to evaluate the performance of the robot, namely the average distance traveled by the robot between hits (measured in pixels) and the average number of hits per second. The virtual robot is implemented so that, following a collision against an obstacle, it is immediately moved to the last location where its center was at least 20 pixels away from any other object. Since the robot radius is 5 pixels, the lower limit for the average distance traveled by the robot during each robot run is 15 pixels.
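The cross-correlogram and the *C*peak coefficient used for the synchronization analysis can be sketched as below. This is an illustration of the description given in the text, not the code of Frega et al. (2012): the default values of T and ∆τ, the width of the area summed around the peak, and all names are assumptions.

```python
import math

def cross_correlogram(x, y, T=0.05, dtau=0.001):
    """Normalized cross-correlogram C_xy between two spike trains.

    x, y : spike times (s) of the two electrodes
    T    : half-width of the time frame around each x spike (s)
    dtau : bin width (s)
    """
    n_bins = int(round(2 * T / dtau))
    counts = [0] * n_bins
    for tx in x:
        for ty in y:
            lag = ty - tx
            if -T <= lag < T:
                counts[min(int((lag + T) / dtau), n_bins - 1)] += 1
    # Normalize by the square root of the product of the two spike counts.
    norm = math.sqrt(len(x) * len(y))
    if norm == 0:
        return [0.0] * n_bins
    return [c / norm for c in counts]

def c_peak(cxy, half_width=1):
    """Sum of C_xy in a small area around its maximum (assumed +/- 1 bin)."""
    k = max(range(len(cxy)), key=cxy.__getitem__)
    lo, hi = max(0, k - half_width), min(len(cxy), k + half_width + 1)
    return sum(cxy[lo:hi])
```

Two electrodes that always fire within the same lag bin give a *C*peak close to 1, whereas independent trains give a flat correlogram and a small *C*peak.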

Statistical tests were employed to assess the significant difference among diverse experimental conditions. The normal distribution of experimental data was assessed using the Kolmogorov-Smirnov normality test. According to the distribution of the data, we performed either parametric (e.g., ANOVA, **Figure 7**) or nonparametric (e.g., Mann–Whitney U test, **Figures 4**–**6** and **8**) tests and *p* values < 0.05 were considered significant. Statistical analysis was carried out by using OriginPro (OriginLab Corporation, Northampton, MA, USA).

#### **RESULTS**

#### **NETWORK DYNAMICS: SPONTANEOUS ACTIVITY IN RANDOM AND MODULAR NETWORKS**

Hippocampal cultures grown *in vitro* over MEAs show a spontaneous (i.e., ongoing) activity, similar to that exhibited by *in vivo* systems during their development (Ben-Ari, 2001) or during deep sleep (Corner, 2008). Their electrophysiological behavior is characterized by spontaneous spiking which becomes synchronized with the maturation of the network, giving rise to phenomena called "bursts," network bursts (Pasquale et al., 2010) or network spikes (Eytan and Marom, 2006). These network bursts are the fingerprints of a steady state in which the network dynamics have found a balance between excitation and inhibition (on average 70–80% of neurons are excitatory and the remaining 20–30% are inhibitory interneurons). Such a state can be easily disrupted pharmacologically by acting on the glutamatergic as well as on the GABAergic receptors or by adding neuromodulators (Keefer et al., 2001; Eytan et al., 2004; Frega et al., 2012). Another possibility to alter such stereotyped behavior is to introduce modularity (i.e., interconnected populations) instead of having a single uniform and random culture (Raichman and Ben-Jacob, 2008; Shein Idelson et al., 2010; Kanagasabapathi et al., 2012).

**Figure 4** shows the spontaneous activity from a representative random (**Figure 4A**, top) and a modular culture (**Figure 4A**, bottom) during the fourth week of development. While in the random culture the activity is highly synchronized and packed in the form of "network bursts" (van Pelt et al., 2004; Pasquale et al., 2010), in the modular culture we can identify two different temporal patterns of activity with moments of synchronized bursts interleaved with sparse spiking periods. Synchronized network bursts spread to the whole culture also in the modular networks, even if, globally, modular cultures are much less correlated than the random ones (**Figure 4B**).

#### **NETWORK DYNAMICS: EVOKED ACTIVITY IN RANDOM AND MODULAR NETWORKS**

The activity of the network can be modulated by means of electrical stimulation. The typical response of a network can be evaluated through the Post Stimulus Time Histogram (PSTH; see Materials and Methods). In **Figure 5A** the maps of the PSTH obtained as a consequence of the stimulation from site 13 (top) and site 72 (bottom) are reported for a non-confined culture. Typically, the PSTH is characterized by an "early response," lasting 20–40 ms, and by a "late response," lasting more than 100–200 ms, usually due to the generation of an evoked burst synchronized over the whole network (Gal et al., 2010). The integral calculated over the PSTH profile represents the average number of evoked spikes at a specific site and it is used for quantifying the strength of the connection between a specific stimulation site and all the recording ones (Chiappalone et al., 2008). This parameter is the basis for the choice of the input-output connections for our neuro-robotic studies (see Input and Output Sites of a Neuronal Population). **Figure 5B** reports the maps of the PSTH obtained in a modular network. When stimulation is delivered from site 21 (top compartment, **Figure 5B** top), mainly the electrodes of the top compartment respond to the stimulation. A few activations can also be observed in the bottom compartment, but with a dominant late response and an almost absent early one. In the same network, when stimulation comes from one electrode of the bottom compartment (electrode 28, **Figure 5B** bottom) practically only that compartment responds to the stimulus.

To further test the actual confinement of the evoked responses, we also analyzed the distribution of the mean latencies (i.e., the time between the stimulus and the first evoked spike) obtained for each pair of stimulation-recording electrodes (Mainen and Sejnowski, 1995; Tateno and Jimbo, 1999): simply by eye, it is clear that the evoked response is (mostly) limited to the compartment hosting the stimulation electrode. **Figures 5C,D** report the distribution of the latencies from the electrodes in the same compartment (i.e., top or bottom) as the stimulating electrode (S) compared to those from the electrodes in the other compartment (O). Only in the case of confined networks (**Figure 5D**) are the two distributions statistically different: the latencies evaluated at the electrodes belonging to the same compartment as the stimulation are significantly lower than those of the electrodes in the other compartment. This proves that dividing the neural network into two sub-populations does have an effect on the stimulus response.

Bottom. PSTHs obtained by stimulating electrode 28 (black square) in the bottom compartment of the same confined network. Shaded area indicates the top compartment. X-axis: time (0, 400) ms, bin 4 ms; Y-axis: probability of

same (S) or other (O) compartment with respect to stimulating electrodes. No statistical differences can be noted in a random culture. N = 11 random cultures. **(D)** Box-plot of the latency from the first evoked spikes in the same (S) or other (O) compartment with respect to stimulating electrodes. In a modular network, the latency between the stimulus and the first evoked spike is statistically lower for the electrodes belonging to the same cluster of the stimulating electrodes. N = 6 modular cultures. Box range: percentile 25–75; Box whiskers: percentile 5–95; line: median; square: mean. Mann–Whitney test for not-normal data, significance level = \*p < 0.05.

**FIGURE 6 | Input-output selection. (A)** Map obtained in a representative "random" culture for the selection of the output sites, given two input sites (e.g., 26 and 47): red, left recording area; blue, right recording area. **(B)** Schematic representation of the input (yellow and light blue) and respective recording (red and blue) areas for the same experiment reported in **(A)** ("random" culture): note that the selected electrodes are quite spread over the entire recording area. **(C)** Map obtained in a representative "confined" culture for the selection of the output sites, given two input sites (e.g., 27 and 62): red, left recording area; blue, right recording area. **(D)** Schematic representation of the input (yellow and light blue) and respective recording (red and blue) areas for the same experiment reported in **(C)** ("confined" culture): note that the selected recording electrodes are close to the stimulating electrode and they follow the structure of the underlying network. **(E)** A box-plot representing the distances from the bisector of the selected recording electrodes in the set of random and confined cultures used within this study (N = 11 random and N = 6 modular cultures). The distribution of the distances in the modular case is significantly higher than in the random case. Box range: percentile 25–75; box whiskers: percentile 5–95; line: median; square: mean. Mann–Whitney test for not-normal data, significance level = \*p < 0.05. **(F)** Pie chart representing the percentage of networks in which at least 50% of the recording electrodes were selected in the same compartment of the stimulating electrode. The percentage is higher for the modular networks (N = 11 random and N = 6 modular cultures).

#### **INPUT AND OUTPUT SITES OF A NEURONAL POPULATION**

The simplest architecture that can be adopted for the proposed task includes two electrodes to deliver coded sensory information, one for each set of sensors. While the same could be said for output sites, the point of interest in this work was the response of the network, therefore a set of 8–10 electrodes is chosen to act as output sites for each wheel.

The main disadvantage in dealing with dissociated cultures instead of experimental models with a preserved neural structure is the lack of a predefined architecture. For this reason, before starting an experiment, a procedure is performed to define the stimulation (sensory input) and recording (motor output) areas of the network. During this procedure (i.e., phase 2 of our experimental protocol; see Materials and Methods), we stimulated the cultures by delivering trains of 40 electrical stimuli (1.5 V peak-to-peak, biphasic pulses, 500 µs total duration) from 8 to 10 sites in a serial way. Then, the PSTH area (i.e., the number of spikes in the 600 ms following each stimulation) between each pair of stimulation-recording electrodes is computed and the related maps, like those reported in **Figures 6A,C**, are produced. The coordinates of each square in that map represent the PSTH areas at a specific recording site relative to stimulation from the two stimulating sites reported on the axes (Stim[26] and Stim[47] in **Figure 6A**, for example). All the possible input-output combinations are explored and only the pathways producing "selective" responses are retained. These "selective" pathways are identified by a pool of recording sites, with respect to a pair of stimulating sites, for which the measured responses fall far away from the bisector (i.e., a pool of recording sites closer to the axes).
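The selectivity criterion can be made concrete with a small sketch. For a recording site whose PSTH areas for the two stimulating sites are (*a*, *b*), the distance from the bisector *a* = *b* is |*a* − *b*|/√2. The ranking rule below (keep the *n* sites farthest from the bisector on each side) is a hypothetical selection rule written for illustration; the text does not specify the exact procedure, and the names are not taken from the authors' script.

```python
import math

def bisector_distance(area_a, area_b):
    """Distance of the point (area_a, area_b) from the line area_a == area_b."""
    return abs(area_a - area_b) / math.sqrt(2.0)

def select_output_sites(psth_areas, n_sites=8):
    """Pick the most selective recording sites for each of two input sites.

    psth_areas : dict mapping recording site -> (PSTH area for stimulation
                 from input A, PSTH area for stimulation from input B)
    Returns (sites assigned to A, sites assigned to B), each ranked by
    decreasing distance from the bisector and truncated to n_sites.
    """
    def ranked(prefers_a):
        sites = [s for s, (a, b) in psth_areas.items()
                 if a != b and (a > b) == prefers_a]
        sites.sort(key=lambda s: bisector_distance(*psth_areas[s]),
                   reverse=True)
        return sites[:n_sites]
    return ranked(True), ranked(False)
```

A site that responds equally to both inputs lies on the bisector and is never selected, while a site that responds to only one input lies on an axis and is ranked first.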

Those specific pathways of sensory-motor activations can then be conveniently utilized for driving the robot and for implementing simple reactive behaviors (e.g., obstacle avoidance). **Figures 6B,D** report the selected inputs (i.e., two electrodes, one for the left and one for the right area) and output regions, characterized by eight electrodes each, corresponding to the maps in **Figures 6A,C**, respectively, for two representative cultures (i.e., random and modular).

The presence of a confinement structure tends to generate networks showing a higher degree of functional separation (i.e., selectivity), as well as a physical one, when compared to totally random networks: as can be seen in **Figure 6E**, the average distance from the bisector of the evoked response pair is significantly increased in the case of the modular network. The geometry of the stimulation-recording pairs is also affected, as they are more likely to be clustered together on the same half of the culture (**Figure 6F**).

#### **CLOSED-LOOP ROBOT NAVIGATION**

All the parameters relevant to the movement of the robot are recorded during the experiment. In **Figures 7A,B**, more than 1000 s of signal recordings are plotted (**Figure 7A** for the left side and **Figure 7B** for the right side). The top panels show sensory information, with the blue trace representing the average value of the proximity sensors on the left side of the robot and the red one the average value of those on the right. In the second graph, a measure of stimulation is shown, expressed as the mean stimulation rate. The third line of graphs reports the firing rates, measured in spikes per second; wheel speeds (shown in the lower graph, expressed in pixels per second) closely follow neural activity.

The results of the behavior described so far can be observed in **Figures 8A–C**, where a virtual arena is shown along with the path drawn by the robot (in red) in a 20-min long robot run, respectively in an "empty" experiment (**Figure 8A**), an open-loop experiment (**Figure 8B**) and finally a closed-loop experiment (**Figure 8C**). While collisions are fairly frequent even in the latter case, the behavior of the robot is still much closer to the desired one than in an open-loop configuration or (obviously) in the absence of a biological substrate. As can be observed from the graph in **Figure 8D**, the average path traveled between hits is significantly higher in the closed-loop case.

The values are obtained in N = 5 experiments for the empty and the open-loop case and in the N = 17 experiments reported in the text (light blue = empty MEA; blue = open-loop MEA; cyan = closed-loop MEA). The closed-loop experiments give the best results. Statistical analysis was carried out by using one-way ANOVA (\*p < 0.05) for normal distributions (Kolmogorov–Smirnov test of normality), while for mean comparison both the Tukey and the Bonferroni tests were used.

false positives in the spike detection algorithm on background noise or stimulation artifacts. As can be inferred from the image, though, their total impact is almost null and the robot moves almost precisely in a straight line. **(B)** Reconstruction of a 20-min long robot trajectory in open-loop. During this robot run the control loop has been opened by stopping stimulation to the neural culture. As a result, the robot is, similarly to the

#### **IMPACT OF MODULARITY AND TETANIC STIMULATION ON ROBOT NAVIGATION**

Despite the improvement in performance of the closed-loop scenario compared to the control cases, robot collisions against obstacles are still a frequent occurrence in random networks. Observation of PSTHs reveals that random networks show a very high degree of connectivity, with evoked responses showing a strong overlap regardless of the stimulating electrodes position (**Figure 5A**). The introduction of a confinement mask shows a marked separation in the responses obtained from stimulation, as can be observed from **Figure 5B**. This, in turn, leads to a reduction in the amount of "cross talk" between input and output channels, with a consequent increase in the navigation performance of the robot. **Figures 9A,C** compare the improvement in performance between the random network structure and the modular one. Specifically, **Figure 9A** shows the comparison between performances evaluated as the average distance between consecutive collisions in different conditions (without and with tetanic stimulation, respectively on the left and right graphs), while **Figure 9C** displays the same performances evaluated through a different parameter, average number of hits per second. The tetanic stimulation leads to a further improvement in the performance, especially when performed on a network with a modular geometry, as can be observed in **Figures 9B,D**: the first couple of graphs show the increase in performance following the introduction of the tetanic stimulation routine (in a random network, left, and in a modular one, right) evaluated as distance between collisions, while the graphs in **Figure 9D** show the performance obtained in the same experiment as average number of hits per second. Examples of changes in effective connectivity obtained in modular and random networks can be observed in **Figure A1** in the Appendix. 
Although a full quantification is still needed, preliminary analyses of changes in connectivity show that tetanic stimulation does affect the network response, by strengthening the connections on one side and weakening or not affecting the connections on the opposite side.

While all of the described comparisons yield statistically significant results in the case of the average distance parameter, it is not the case for the average number of collisions: the only condition that causes a large enough change to be significant is the introduction of a tetanic stimulation on a modular network.
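The two performance measures compared above (average distance between hits, in pixels, and average number of hits per second) can be computed from a logged robot trajectory as in the sketch below. The data layout (one position of the robot center per sample plus the indices of the samples at which a hit occurred) is an assumption, as is the choice of excluding the relocation jump that follows each hit.

```python
import math

def path_length(points):
    """Length in pixels of a polyline given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def run_performance(path, hit_indices, duration_s):
    """Return (mean distance between hits in pixels, hits per second).

    path        : positions of the robot center, one (x, y) per sample
    hit_indices : increasing indices of the samples at which a hit occurred
    duration_s  : duration of the robot run in seconds
    """
    # Each segment starts one sample after the previous hit, so the jump
    # caused by relocating the robot after a collision is not counted.
    starts = [0] + [h + 1 for h in hit_indices[:-1]]
    segments = [path_length(path[s:h + 1]) for s, h in zip(starts, hit_indices)]
    mean_distance = sum(segments) / len(segments) if segments else None
    return mean_distance, len(hit_indices) / duration_s
```

A run without collisions returns `None` for the mean distance, since the measure is undefined in that case.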

#### **DISCUSSION AND FUTURE DEVELOPMENT**

In this paper we successfully interfaced, in a bi-directional way, a network of neurons coming from the hippocampus of embryonic rats with a virtual robot. The robot, which has sensors and wheels, is forced to move in a static arena with obstacles and its task consists in avoiding collisions. Looking at the spontaneous electrophysiological activity of the network, we first select a set of possible "inputs," then we evaluate the evoked response of the entire culture by delivering patterns of electrical stimulation. This procedure allows us to select the "outputs" of our network. Then, by applying a linear rate-based decoding strategy, we were able to transform the spike frequency into velocity and the sensory information collected by the robot "eyes" into stimulation frequency for our neurons. The behavior of the robot during the closed-loop experiments was significantly better than that in open-loop (i.e., without any sensory feedback) or in the "empty" MEA condition, proving that the activity driving the robot is actually neural-based (cf. **Figure 8**). In general, these results prove that an *in vitro* network of biological neurons can control an external agent. While ours is not the first setup to achieve this goal, to our knowledge no previous work reports an extensive set of experiments like the ones we performed (DeMarse et al., 2001; Martinoia et al., 2004; Novellino et al., 2007; Bakkum et al., 2008; Kudoh et al., 2011); rather, they focus on a single thesis supported by data obtained from a limited number of analogous preparations. Here, we introduce for the first time statistical comparisons obtained on a sizable number of different preparations with highly different spiking behaviors, such as those observed on random and modular networks. Furthermore, bi-modularity of cultures is introduced here for the first time in the context of closed-loop interfaces and its impact is shown to be relevant for the performance of the embodied agent.

Early experiments on random networks showed the tendency of these cultures to evolve toward a degenerate state where mostly network-wide synchronous activity can be observed. The addition of a confinement mask and the consequent modularity qualitatively changed the behavior of the network, preventing or at least strongly reducing the appearance of synchronized network bursts (cf. **Figure 4**). This change alone was enough to provide a significant increase in the performance of the robot (cf. **Figures 5**, **6**, and **9**). These results lead to two possible investigation lines on the same experimental setup: increasing the modularity of the network might allow more complex behavior to emerge, while chronic stimulation since the day of plating might be used in future experiments to define functionally but not physically distinct sub-populations of neurons within the same culture.

Another point of novelty in our approach has been the systematic use of tetanic stimulation on hippocampal cultures over MEA. Previous approaches aiming at demonstrating plasticity in neuronal assemblies by using stimulation protocols from embedded extracellular electrodes were always applied to cortical cultures (Jimbo et al., 1999; Madhavan et al., 2007; Chiappalone et al., 2008; Stegenga et al., 2010). Here we used hippocampal cells and we proved that tetanic stimulation worked successfully, providing an increase in performance both in random and modular networks (cf. **Figure 9**). A further analysis of the data is being conducted to determine whether it is possible to define a clear relationship between spontaneous activity of the network and its impact on the observed changes in connectivity strength, since the patterns of induced change proved to be more complex than expected (see **Figure A1** in the Appendix for a preliminary example of effective connection changes induced by tetanic stimulation). This could allow the design of a more successful learning scheme. The exact biological mechanisms linking performance increase and tetanic stimulation are still unclear and further investigations and targeted experiments are needed. Along this direction, the use of pharmacological manipulation could make it possible to change the state of the network and thus to investigate the roles of synaptic transmission and receptors involved in the process of adaptation and learning depending on specific stimulation protocols.

networks, in the absence or presence of tetanic stimulation (respectively, left and right graph), evaluated as average distance (in pixels) between consecutive hits. **(B)** Comparison of robot performances between different conditions of tetanic stimulation, in random (left graph) and modular networks (right graph), evaluated as average distance between hits. **(C)** Comparison between robot performance in random and modular networks, in the absence or presence of tetanic stimulation (respectively, left and right graph),

(left graph) and modular networks (right graph), evaluated in hits per second. All the values are obtained in the experiments described in text (N = 11 experiments for the random condition, N = 6 for the modular), with a tetanic stimulation session following each standard robot run. Box range: percentile 25–75; box whiskers: percentile 5–95; line: median; square: mean. Statistical analysis was carried out by using Mann–Whitney test for not-normal data, significance level = \*p < 0.05.

As expected, the final performance of the robot is worse than what was possible to achieve without including biological components in the closed-loop (data not shown): for the task of obstacle avoidance, it would be possible to program the robot so that it can perform the navigation task with no risk of hitting obstacles. However, our neuro-robotic framework proved to be a valid tool for the study of mechanisms of neural coding and the computational and adaptive properties of neuronal assemblies with the final goal to facilitate progress in understanding neural pathologies, designing neural prosthetics, and creating fundamentally different types of artificial or hybrid intelligence.

## **ACKNOWLEDGMENTS**

The authors wish to thank Dr. Marina Nanni for the technical assistance in cell culture preparation. The authors would also like to thank Alessandro Bosca and Luca Berdondini for their help and technical assistance in the realization of the PDMS masks used for part of the experiments reported in the paper.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Neural\_Circuits/10.3389/fncir.2012.00099/abstract

#### **REFERENCES**


*(GECCO '11)*, ed. N. Krasnogor (New York: ACM).


**Video S1 | Video of a closed-loop robot run.** This video of a virtual robot run is running at 40× real speed. The arena is composed of dark green solid obstacles and light green "floor" which the robot can move upon. The magenta circle is the virtual robot itself, the red dots highlight the path followed by the robot center over time, while black circles represent hits against obstacles. While the amount of obstacles hit by the robot shows that control is not perfect, the robot is able to take advantage of sensory information to extricate itself from all the situations encountered in a limited amount of time and hits.

**Video S2 | Video of an "empty MEA" robot run.** This video of a virtual robot run is running at 40× real speed. The arena is composed of dark green solid obstacles and light green "floor" which the robot can move upon. The magenta circle is the virtual robot itself; the red dots highlight the path followed by the robot center over time, while black circles represent hits against obstacles. The starting direction of the robot in this trial is rotated 90° clockwise with respect to the other two videos shown. Total lack of biological material on the MEA prevents closing of the sensory-motor loop. As a consequence, the robot shows a total inability to navigate its environment. The small changes in robot heading are likely due to false positives of the spike detection algorithm on background noise or stimulation artifacts. As can be inferred from the video, though, their total impact is almost null and the robot moves almost precisely in a straight line.

**Video S3 | Video of an open-loop robot run.** This video of a virtual robot run is running at 40× real speed. The arena is composed of dark green solid obstacles and light green "floor" which the robot can move upon. The magenta circle is the virtual robot itself, the red dots highlight the path followed by the robot center over time, while black circles represent hits against obstacles. During this robot run the control loop has been opened by stopping stimulation to the neural culture. As a result, the robot is, similarly to the previous case, lacking any capability of navigating its environment. Changes in robot direction are, in this case, provoked by the spontaneous activity of the neural network.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 August 2012; accepted: 18 November 2012; published online: 12 December 2012.*

*Citation: Tessadori J, Bisio M, Martinoia S and Chiappalone M (2012) Modular neuronal assemblies embodied in a closed-loop environment: toward future integration of brains and machines. Front. Neural Circuits 6:99. doi: 10.3389/fncir.2012.00099*

*Copyright © 2012 Tessadori, Bisio, Martinoia and Chiappalone. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDIX**

**FIGURE A1 | Maps of changes in effective connectivity. (A)** Changes in effective connectivity occurring during a tetanic stimulation experiment. The large dots, in yellow and light blue, represent electrodes used for delivery of both tetanic stimulation and sensory information for the left and right inputs. The smaller dots in blue and red indicate the position of electrodes used for recording from the two "motor" areas. Change in effective connectivity is defined as the difference in the area of PSTHs measured after and before the short-term plasticity experiment, divided by the average of these two values. Variations greater than 20% are represented as lines on the maps, with gray and black lines indicating, respectively, a decrease and an increase in functional connectivity. Only connections involving either stimulating electrode have been represented for clarity, with thicker lines highlighting the connections used in the closed-loop control of the robot (left input-left output and right input-right output areas). This map, in particular, displays the change in connectivity observed in a random culture during a 30-min short-term plasticity experiment. In this culture, tetanic stimulation led to a widespread increase of connection strengths involving the electrode represented in yellow, while those involving the one in light blue underwent a mixed change, with about half of them strengthened and half weakened. **(B)** Same map as **(A)** obtained from recordings on a modular culture before and after a 30-min short-term plasticity experiment. While tetanic stimulation was delivered to both the yellow and light blue electrodes (respectively in the "lower" and "upper" halves of the culture), only one of the sub-populations was significantly affected, with a diffuse increase in connectivity.
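The change measure defined in this caption can be written out as a short calculation. The sketch below is illustrative only: the function names and the example PSTH areas are hypothetical, and the only elements taken from the caption are the formula (difference of post and pre PSTH areas divided by their average) and the 20% drawing threshold.

```python
def connectivity_change(area_pre: float, area_post: float) -> float:
    """Relative change in PSTH area: (post - pre) / mean(pre, post)."""
    mean_area = (area_pre + area_post) / 2.0
    if mean_area == 0:
        # Assumption: no evoked response in either recording counts as no change.
        return 0.0
    return (area_post - area_pre) / mean_area


def classify_change(change: float, threshold: float = 0.20) -> str:
    """Label a connection as the map would draw it (changes beyond 20% only)."""
    if change > threshold:
        return "increase"   # black line in the caption's convention
    if change < -threshold:
        return "decrease"   # gray line in the caption's convention
    return "not drawn"


# Hypothetical PSTH areas (arbitrary units) for three connections.
for pre, post in [(10.0, 15.0), (10.0, 11.0), (15.0, 10.0)]:
    change = connectivity_change(pre, post)
    print(f"pre={pre}, post={post}: change={change:+.2f} -> {classify_change(change)}")
```

Because the difference is divided by the mean of the two areas rather than by the pre value, the measure is symmetric: swapping pre and post only flips its sign.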

## Toward a self-wired active reconstruction of the hippocampal trisynaptic loop: DG-CA3

## *Gregory J. Brewer<sup>1,2</sup>\*<sup>†</sup>, Michael D. Boehler<sup>1</sup>, Stathis Leondopulos<sup>1,3</sup>, Liangbin Pan<sup>3</sup>, Sankaraleengam Alagapan<sup>3</sup>, Thomas B. DeMarse<sup>3</sup> and Bruce C. Wheeler<sup>3</sup>*

*<sup>1</sup> Department of Medical Microbiology, Immunology and Cell Biology, Southern Illinois University School of Medicine, Springfield, IL, USA*

*<sup>2</sup> Department of Neurology, Southern Illinois University School of Medicine, Springfield, IL, USA*

*<sup>3</sup> Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Ulrich Egert, University of Freiburg, Germany; Inah Lee, Seoul National University, South Korea*

#### *\*Correspondence:*

*Gregory J. Brewer, Department of Medical Microbiology, Immunology and Cell Biology, Southern Illinois University School of Medicine, PO Box 19626, Springfield, IL 62794-9626, USA e-mail: gbrewer@siumed.edu*

#### *†Present address:*

*Gregory J. Brewer, Natural Sciences II, Department of Bioengineering, University of California, Irvine, USA*

The mammalian hippocampus functions to encode and retrieve memories by transiently changing synaptic strengths, yet encoding in individual subregions for transmission between regions remains poorly understood. Toward the goal of better understanding the coding in the trisynaptic pathway from the dentate gyrus (DG) to the CA3 and CA1, we report a novel microfabricated device that divides a micro-electrode array into two compartments of separate hippocampal network subregions connected by axons that grow through 3 × 10 × 400 µm tunnels. Gene expression by qPCR demonstrated selective enrichment of separate DG, CA3, and CA1 subregions. Reconnection of DG to CA3 altered burst dynamics associated with marked enrichment of GAD67 in DG and GFAP in CA3. Surprisingly, DG axon spike propagation was preferentially unidirectional to the CA3 region at 0.5 m/s with little reverse transmission. Therefore, select hippocampal subregions intrinsically self-wire in anatomically appropriate patterns and maintain their distinct subregion phenotype without external inputs.

**Keywords: multielectrode array, dentate gyrus, GAD67, burst, GFAP**

#### **INTRODUCTION**

The mammalian hippocampus crucially encodes the formation of long-term episodic memories and spatial navigation, yet the staged encoding mechanisms remain elusive. While we know molecular details of many types of synapses in the major regions of the hippocampus important to learning and memory at the single neuron level, we do not know whether these regions self-wire into the anatomically accurate network or require external electrical or chemical inputs. Further, brain functional studies are saddled with a tradeoff between high-spatial resolution (e.g., MRI, fMRI, EEG, ECoG) vs. high-temporal resolution (e.g., *in vivo* electrode arrays, single cell patch clamp). To bridge this gap, our strategy employs *in vitro* culture in an attempt to recapitulate entire *in vivo* brain regions in culture. Today a wide variety of cell types from various areas of the brain can easily be explanted, cultured *in vitro*, and studied in detail. While *in vitro* technology does provide exquisite temporal and spatial access, it too has significant shortcomings. A key hurdle toward reconstructing brain areas *in vitro* has been the difficulty of controlling the structural connectivity among cells to begin to recapitulate the *in vivo* architecture. The connections in the hippocampal formation of the brain uniquely propagate forward excitatory communication from one region to the next with the CA3 region distinctive for recurrent collateral excitation. We begin to create a functional tri-synaptic network of the hippocampal formation from the entorhinal cortex (EC) to the dentate gyrus (DG) to the CA3 to the CA1 (Cajal, 1968; Amaral and Lavenex, 2006). Other aspects of hippocampal anatomy not modeled here are the connections from the EC through the perforant path to the CA3 in a feed-forward fashion. In addition, the hippocampus receives modulatory inputs from the amygdala and basal forebrain. Output from the CA1 proceeds through the subiculum and returns to the EC to complete the loop. With smaller numbers of electrodes placed in the rat brain, others have monitored activity from each hippocampal region in behaving animals to describe specific patterns of activity for each region (Rolls and Kesner, 2006; Leutgeb et al., 2007), suggesting staged encoding, but we lack information about the inputs necessary to evoke these patterns and their network relationships.

To achieve these staged connections, we combine microfabrication (MEMS) technology, which channels connections between cultured subregions of the hippocampus, with a multi-electrode array that simultaneously monitors activity (**Figure 1**). Inspired by Campenot (1977, 1987), the MEMS device creates compartments in which we separately place cells from each major area of the hippocampus (EC, DG, CA3, CA1) connected by microscale tunnels through which axons can pass between the wells to define neuronal communication pathways between each well (Taylor et al., 2005; Dworak and Wheeler, 2009; Pan et al., 2011; Kanagasabapathi et al., 2012). Due to the defined geometry of the hippocampus, the cells can be dissociated from micro-dissected subregions of DG, CA3, CA1, and EC (Mattson et al., 1989; Baranes et al., 1996; Zhao et al., 2001; Lein et al., 2004). Cells are loaded into the compartments after the device is placed over an array of extracellular electrodes (microelectrode array or MEA) to measure neural activity in each subnetwork as well as communication between the subnetworks through the tunnels (Morefield et al., 2000; Czarnecki et al., 2012; Downes et al., 2012; Kanagasabapathi et al., 2012; Dranias et al., 2013).

how the tunnels promoted selective growth of axons from one compartment into another.

Here we reconstructed paired components of the tri-synaptic pathway, with a focus on the DG to CA3 connection, to address four questions: (1) Can specific subregions of the hippocampus be reproducibly dissected, as evidenced by region-restricted gene expression? (2) Will these regions establish and maintain their original identity in a uniform culture environment when removed from external hormonal gradients and input activity? (3) Given that CA3 development precedes DG *in vivo* (Bayer, 1980), is the natural axonal polarity of DG → CA3 intrinsically controlled by the neurons or does recapitulation of *in vivo* connectivity require external cues? (4) Do the dynamics of neural activity differentiate between each area *in vitro* and to what extent are they similar to activity patterns seen *in vivo*? We addressed these questions by quantitative PCR of region-restricted gene expression, by evaluation of distinct spike and burst dynamics in each sub-region compartment, and by establishing the polarity of directional communication between sub-regions, whether random or anatomically accurate from the DG to the CA3. Surprisingly, intrinsic capabilities of the DG neurons promote axon extension toward the CA3 neurons, with limited back propagation.

## **RESULTS**

In order to reconstruct subregions of the rat hippocampus, we microscopically dissected these regions from postnatal day 4 rats. At this time, the CA3 and CA1 are well-developed and the dentate granule and hilar region (DG) have nearly completed neurogenesis (Bayer, 1980). Single cell suspensions from each region were plated into separate compartments of a microfabricated PDMS device, positioned over a 60-electrode microarray (**Figure 1**). Cells were plated at physiological density ratios of 100 for DG to 33 for CA3 or 41 for CA1. Between the two compartments was a 400 µm barrier perforated by a series of 51 narrow channels that excluded cell somata, but promoted growth of axons along the length of the tunnels (Taylor et al., 2005; Dworak and Wheeler, 2009; Berdichevsky et al., 2010; Pan et al., 2011; Kanagasabapathi et al., 2012; Wang et al., 2012). For quality control of the dissection and to determine whether the microdissected DG, CA1, and CA3 regions maintain their *in vivo* identity in culture, we performed qPCR on the neurons that developed in the compartments for 3 weeks. We selected several genes based on their demonstrated enrichment in specific regions of the adult hippocampus (Lein et al., 2004). When standard, uniform cultures from a single subregion were assessed from glass slips without tunnel devices, the enriched gene expression of the adult animal was replicated, with detection of specific transcripts for each region (DG, CA1, and CA3) cultured in the common culture medium (**Figure 2A**). But would the subregion types of expression be maintained across the microtunnel devices? **Figure 2B** shows that the same subregion-enriched gene expression is maintained between compartments with the same subregion in each compartment as well as in devices with different subregions on each side (**Figures 2C–E**). These results indicate the fidelity of the dissection and culture process as well as the ability of these hippocampal subregions to maintain their specific identities in the absence of external vascular, hormonal, or electrical instruction.

We examined network spike and burst dynamics with the goal of decoding communication between hippocampal subregions. We recorded from paired compartments of DG and CA3 neurons as a model of this part of the brain anatomy. As controls, we recorded activity from networks comprised of either DG on both sides or CA3 neurons on both sides of tunnel-connected compartments or in random single compartment models. Regardless of configuration, 80% of electrodes were active with an average spike rate around 12 Hz in the NbActiv4 medium (data not shown). However, burst dynamics differed between hippocampal subregions. The spike rate outside of bursts for DG increased with anatomically correct tunnel connection to CA3 neurons, while CA3 neurons showed the opposite trend (**Figure 3A**). **Figure 3B** shows that about 60% of spikes occurred within bursts in DG networks, regardless of configuration, while only 45% of spikes occurred in bursts in the CA3 networks apposed in tunnels. The average duration of each burst (**Figure 3C**) showed trends that mirrored the extra-burst spike rate, with DG apposed to CA3 showing longer burst durations while CA3 bursts were shorter. The larger changes in inter-burst interval (**Figure 3D**) showed a co-modulation upon anatomical connection with increased intervals for the DG and CA3 apposed configuration. Intraburst spike rate (**Figure 3E**) and spikes per burst (**Figure 3F**) also differed with configuration.

**phenocopies selective expression in the adult hippocampus as measured by qPCR.** Specific genes probed for expression after 3 weeks in **(A)** the indicated homogeneous random cultures without tunnel devices. Note enrichment of Trpc6 in the DG cultures, Prkcd in the CA3 cultures, and Nov in the CA1 cultures. **(B)** The same hippocampal subregions plated into each of two compartments of the tunnel device. Note same enrichment profile in the tunnel device as in the random cultures. **(C)** Enriched expression of the DG gene Trpc6 whenever DG neurons are present in heterologous combinations of hippocampal subregions cultured between tunnels, normalized to DG cultured on both sides of the tunnels. **(D)** Enriched expression of the CA3 gene Prkcd when CA3 neurons are present in heterologous combinations of hippocampal sub-regions cultured between tunnels, normalized to CA3 cultured on both sides of the tunnels. **(E)** Enriched expression of the CA1 gene Nov when CA1 neurons are present in heterologous combinations of hippocampal sub-regions cultured between tunnels, normalized to CA1 cultured on both sides of the tunnels. Note the similar expression of each region-specific gene to neurons of that region in combination with heterologous regions, while the other 4 combinations without this region express lower levels of this marker mRNA (*n* = 3 separate cultures).
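The burst measures compared in Figure 3 (percent of spikes in bursts, burst duration, inter-burst interval, intra-burst spike rate, spikes per burst, and extra-burst spike rate) can all be derived from one spike train once bursts are delimited. The sketch below is not the authors' detection algorithm, which is not described in this section. It assumes a simple inter-spike-interval rule, and the parameters `max_isi` and `min_spikes`, the function names, and the example spike train are all hypothetical.

```python
def detect_bursts(spike_times, max_isi=0.1, min_spikes=4):
    """Group spikes into bursts: runs of at least min_spikes spikes whose
    consecutive inter-spike intervals are all <= max_isi (seconds).
    Both thresholds are illustrative assumptions, not values from the paper."""
    bursts, current = [], []
    for t in sorted(spike_times):
        if current and t - current[-1] > max_isi:
            if len(current) >= min_spikes:
                bursts.append(current)
            current = []
        current.append(t)
    if len(current) >= min_spikes:
        bursts.append(current)
    return bursts


def _mean(values):
    """Arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def burst_metrics(spike_times, recording_s, max_isi=0.1, min_spikes=4):
    """Summarize one electrode's spike train with the measures named in Figure 3."""
    bursts = detect_bursts(spike_times, max_isi, min_spikes)
    n_spikes = len(spike_times)
    n_in_bursts = sum(len(b) for b in bursts)
    durations = [b[-1] - b[0] for b in bursts]
    # Inter-burst interval: end of one burst to the start of the next.
    gaps = [nxt[0] - prev[-1] for prev, nxt in zip(bursts, bursts[1:])]
    time_outside = recording_s - sum(durations)
    return {
        "percent_spikes_in_bursts":
            100.0 * n_in_bursts / n_spikes if n_spikes else float("nan"),
        "mean_burst_duration_s": _mean(durations),
        "mean_inter_burst_interval_s": _mean(gaps),
        "mean_spikes_per_burst": _mean([len(b) for b in bursts]),
        # One possible convention: spikes in the burst divided by its duration.
        "mean_intra_burst_rate_hz":
            _mean([len(b) / d for b, d in zip(bursts, durations) if d > 0]),
        "extra_burst_rate_hz":
            (n_spikes - n_in_bursts) / time_outside if time_outside > 0 else float("nan"),
    }


# Hypothetical spike train (seconds): two bursts and one isolated spike.
spikes = [0.00, 0.01, 0.02, 0.03, 1.00, 2.00, 2.02, 2.04, 2.06, 2.08]
for name, value in burst_metrics(spikes, recording_s=3.0).items():
    print(f"{name}: {value:.3f}")
```

The reported values depend strongly on the burst definition, so comparisons between configurations are meaningful only when the same rule and thresholds are applied to every recording.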

*In vivo*, granule cells in DG uni-directionally synapse with pyramidal cells in CA3 with no back-propagation of connections from CA3 to DG. In our preparation, cells from DG and CA3 were plated simultaneously in apposing compartments that would permit connectivity in either direction. If axonal polarity of DG → CA3 is intrinsically controlled by the neurons (i.e., self-wire) in the absence of other external cues found *in vivo*, then polarity of connectivity from DG-CA3 should be maintained *in vitro* and could also account for the distinct burst dynamics of DG apposed to CA3. Previously, we showed that selective axon polarity from one side to the other, as opposed to bi-directional axon crossing, could be achieved in cortical neurons across tunnels by plating and culture of one side of a device followed after 1 week by plating and culture on the opposite side (Dworak et al., 2010; Pan et al., 2011). Here we tested whether DG axons would, by intrinsic neuron properties, mimic *in vivo* conditions and preferentially cross the tunnels to innervate the CA3 neurons in the other compartment when both compartments were plated on the same day, with fewer axons crossing from CA3 to DG (reverse direction). This axon polarity could be measured in our devices by the direction of the time delay between spikes detected on the two microelectrodes embedded in the tunnels as shown in **Figure 4Ai**, where a 0.48 ms delay from the DG side to the CA3 side over the 0.2 mm distance indicates a forward conduction velocity of 0.42 m/s. In contrast, when CA3 was connected to CA3 (or DG-DG), **Figure 4Aii** shows a 0.40 ms delay in the opposite direction, implying a reverse speed of 0.50 m/s that was evident in about half the spikes in these configurations. Individual tunnels were examined to determine whether axon conduction velocity was directionally polarized or whether evidence of multiple spike heights and shapes suggested several axons in a tunnel. Statistics based on all spike pairs above a high positive threshold indicated 81% unidirectional axon conduction from DG to CA3 at a mean velocity of 0.54 ± 0.02 m/s (SD, *n* = 9167 spike pairs). Other positive conduction velocities were DG:DG 0.47 ± 0.02 (*n* = 2615) and CA3:CA3 0.51 ± 0.01 (*n* = 1343) m/s, all within the range for the hippocampus *in vivo* (Patolsky et al., 2006).

corresponding random cultures and (ii) is higher for DG than CA3 apposed across tunnels. **(B)** Percent spikes in bursts are generally higher for any DG culture than any CA3 culture. Log normal distribution statistics apply to **(C–F)**. **(C)** Burst duration was longer for DG than CA3 when they were apposed. For random cultures without tunnel devices, DG burst durations are much lower than CA3 random cultures. **(D)** Inter-burst intervals are lengthened by 50% in DG apposed to CA3 compared to DG self-apposed across tunnels and 300% compared to DG in random networks. Similarly, inter-burst times are longer for CA3 apposed to DG than CA3 apposed to itself or random CA3 cultures. **(E)** Intra-burst spike rates are shortened by 20% in DG apposed to CA3 compared to DG self-apposed across tunnels but longer in the reverse direction. **(F)** Spikes per burst decreased by 14% in DG apposed to CA3 compared to DG self-apposed across tunnels and even less in the reverse direction. In all cases *n* displayed is total degrees of freedom from burst or non-burst segments from 3 min recordings of networks of 4 random DG, 4 random CA3, 8 DG(DG), 8 CA3(CA3), 5 DG(CA3), and 5 CA3(DG). Different letters above bars indicate significant differences (a shared letter indicates a non-significant comparison) by *post-hoc* Tukey multiple-comparison analysis after significant ANOVA, *p* < 0.05, normal distribution statistics.

**FIGURE 4 | Native polarity established from DG to CA3.** Delay times of spikes traveling in axons in tunnels were determined from the difference in spike times at two tunnel electrodes separated by 200 µm. **(Ai)** Example of spike travelling from DG to CA3 with a 480 µs delay indicating a velocity of 0.42 m/s. **(Aii)** Example of spike propagation from the top to the bottom compartment (arbitrarily designated reverse direction for CA3-CA3). **(B)** Statistical analysis of directional propagation indicates 62% of tunnels spontaneously connect axons with anatomical accuracy from DG-CA3, while homologous regions across tunnels fail to show polarity (Wilcoxon non-parametric test).
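The velocities quoted in the text follow directly from the electrode spacing and the measured delay. A minimal sketch of that arithmetic is given below; the function name and the sign convention (positive delay meaning the DG-side electrode fires first) are assumptions made for illustration.

```python
def conduction_velocity(distance_um: float, delay_ms: float) -> float:
    """Signed conduction velocity in m/s.

    distance_um: spacing between the two tunnel electrodes, in micrometers.
    delay_ms: spike time at the far electrode minus spike time at the near
              electrode, in milliseconds (assumed convention: positive means
              forward propagation). Zero delays must be excluded by the caller.
    """
    # 1 µm/ms equals 1 mm/s, i.e., 1e-3 m/s.
    return (distance_um / delay_ms) * 1e-3


# Values taken from the text: 200 µm spacing, 0.48 ms forward delay,
# and a 0.40 ms delay in the opposite direction.
print(round(conduction_velocity(200.0, 0.48), 2))   # forward example
print(round(conduction_velocity(200.0, -0.40), 2))  # reverse example
```

With a 200 µm spacing and velocities near 0.5 m/s, the delays are only a few hundred microseconds, so the sampling interval of the recording bounds how finely velocity can be resolved.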

Closer examination on a tunnel-by-tunnel basis revealed a more complicated situation. In 3 tunnels on one array, >99% of the spikes propagated from DG to CA3. In other tunnels, examination of the waveforms for negative directions from CA3 back to DG indicated multiple roughly simultaneous spikes, likely the spikes from two or more axons, whose sum was detected by the electrode as a shift in the peak within the 2 ms detection window. We never observed a tunnel with only back propagation, suggesting that these events are rare and supporting the conclusion that axons from DG neurons preferentially connect to CA3. For all the DG-CA3 tunnels with measurable spike pairs (*n* = 26 tunnels), more than 60% of the spikes propagated in the DG to CA3 direction (**Figure 4B**). This polarity contrasts with the nearly equal directional distributions of DG-DG and CA3-CA3.
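The tunnel-by-tunnel tally described above amounts to computing, for each tunnel, the fraction of spike pairs with a forward delay. The following sketch shows one way to do this under stated assumptions: the tunnel labels and delay values are invented, the sign convention (positive delay means DG to CA3) is assumed, and the 0.60 cutoff simply mirrors the 60% figure in the text.

```python
def forward_fraction(delays_ms):
    """Fraction of spike pairs with a positive (forward) delay.
    Pairs with zero delay carry no direction and are ignored."""
    directed = [d for d in delays_ms if d != 0]
    if not directed:
        return float("nan")
    return sum(1 for d in directed if d > 0) / len(directed)


def summarize_tunnels(delays_by_tunnel, cutoff=0.60):
    """Return per-tunnel forward fractions and the tunnels exceeding the cutoff."""
    fractions = {name: forward_fraction(d) for name, d in delays_by_tunnel.items()}
    polarized = [name for name, f in fractions.items() if f > cutoff]
    return fractions, polarized


# Hypothetical signed delays (ms) for two tunnels.
example = {
    "tunnel_1": [0.48, 0.50, 0.45, -0.40],  # mostly forward
    "tunnel_2": [0.40, -0.40],              # no preferred direction
}
fractions, polarized = summarize_tunnels(example)
print(fractions)
print(polarized)
```

A per-tunnel fraction of this kind is what a non-parametric test across tunnels, such as the one named in the Figure 4 caption, would take as input.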

Differences in GABAergic neuron and astroglia content could also affect burst dynamics. Transfection of networks with a lentivirus carrying a GFP reporter driven by the inhibitory neuron GAD67 promoter was used to evaluate neurons expressing the GABA synthetic enzyme GAD67. **Figure 5** shows higher GAD<sup>+</sup> inhibitory neuron density in heterologous sub-region connections in DG compared to CA3 (**Figures 5A,B**) and higher GAD<sup>+</sup> inhibitory neurons/nucleus in homologous sub-region connection in DG compared to CA3 (**Figures 5C,D**). Quantitation of GAD67 neurons relative to nuclei (**Figure 5G**) showed a 5-fold increase in GAD<sup>+</sup> neuron density in DG over CA3 for heterologous and a 3-fold increase in the homologous configuration. The dissected DG region, which includes the hilus, contains a higher percentage of GAD67-expressing neurons than CA3 (Harvey and Boksa, 2011), which could contribute to a stronger inhibitory drive to enable a higher percentage of spikes in bursts (**Figure 3B**) and a longer burst duration (**Figure 3C**) for DG than CA3. The same mechanism could also contribute to the longer inter-burst intervals in DG apposed to CA3 (**Figure 3D**) and offer more opportunity for higher extra-burst spike rates. Some of these GAD67 neurons sent axons across the tunnels (not shown), better seen with GAD65 immunoreactivity (**Figure 5I**), as a feed-forward inhibitory component (Cabezas et al., 2012).

By increasing glutamate uptake, higher astroglial density could also affect burst dynamics (Boehler et al., 2007). A nuclear count of more than 2-fold above the plating density for CA3 apposed to DG (**Figure 5H**) suggested proliferation of astroglia. **Figures 5E,F** show by GFAP staining that astroglia are indeed more activated on the CA3 side than on the DG side.

## **DISCUSSION**

Here we reconstructed rat hippocampal sub-regions in pairs connected by axon-conducting tunnels to demonstrate intrinsic retention of *in vivo* behavior in the absence of external electrical and hormonal stimuli. The subregions maintained their physiological distinctions based on qPCR expression of subregion-enriched genes, distinct spike dynamics, GABAergic neurons, astroglia, and preferential wiring direction.

**FIGURE 5 | DG networks contain 3–5× more GABAergic (GAD67-GFP) neurons than CA3 networks while CA3 has more astroglia.** Some GAD<sup>+</sup> neurons traverse the tunnels. **(A–D)** Green is GAD67-GFP expression 11 days after infection with lentivirus carrying the GAD67 promoter fused to GFP. Red is pseudocolored for blue bisbenzamide labeled nuclei. **(E,F)** GFAP immunostain in DG or CA3 compartments. **(G)** Nuclei per somata from bisbenzamide stain for DNA. Note red vertical striped CA3 apposed to DG is 50% higher than CA3 by itself. **(H)** Percent GAD67 labeled neurons per nuclei. (*N* = 6 20× fields from each of 2 networks). **(I)** GAD65 immunolabeled axons traverse tunnels.

Our observations on spike and burst dynamics are consistent with a synchronously connected network; neither DG nor CA3 operates as an independent oscillating/bursting center. Isolated DG neurons in the network burst at a higher rate than CA3 neurons, which suggests regions preprogrammed to drive faster-bursting DG onto slower-bursting CA3. The connected network nevertheless operates in a constrained fashion, with predominantly forward connectivity from DG to CA3 and with sufficient recurrent GABAergic innervation in DG and astroglia (Boehler et al., 2007) in CA3 to modulate the dynamic behavior. Further, the higher extra-burst spike rate, slightly longer burst duration, and higher GABAergic neuron density in DG provide extra drive from DG to CA3, establishing the background readiness of CA3 to be receptive to other inputs for learning or information fusion.

A key hurdle toward reconstructing brain areas *in vitro* has been the difficulty controlling the structural connectivity among cells to reflect, or even begin to adequately recapitulate the *in vivo* architecture in an *in vitro* model. A variety of novel *in vitro* technologies address this difficult problem. These technologies have been targeted toward modifying the surface chemistry to provide guidance cues that promote preferential attachment and growth (Boehler et al., 2012), using microfluidics (Morin et al., 2006), or alternatively, to capitalize on the intrinsic neuronal property to follow topographical features in their environment such as pillars (Dowell-Mesfin et al., 2004), ridges (Curtis and Wilkinson, 1997), or gradients (Hattori et al., 2010). The tunnel approach used in this paper confirms the intrinsic ability of *ex vivo* neurons to reconnect in an *in vivo* order (Czarnecki et al., 2012; Downes et al., 2012; Kanagasabapathi et al., 2012; Dranias et al., 2013).

Our model is a greatly reduced analogue of the *in vivo* circuit due to its anatomical incompleteness as a two-dimensional network. It fails to include modulatory cholinergic, noradrenergic, serotonergic, or dopaminergic inputs. It certainly lacks hormonal fluctuations and efficient removal of waste metabolites. With extracellular electrodes, we only monitor the net effects of thousands of synapses as individual action potentials per neuron.

Advantages of this model include direct stimulation and monitoring of electrical activity on time scales of milliseconds to weeks, pharmacologic access and most importantly, the ability to monitor inputs, axon communication, and outputs of the hippocampal region. The hippocampus is well-known for its different levels of information processing, but the details of the coding remain elusive. Our model has the potential to decode the information from the easily monitored spiking dynamics between hippocampal subregions. This technology will enable determination of the network integration of stimulation-dependent plasticity and how subregion-specific information patterns are reliably transmitted but differentially processed within each hippocampal subregion.

#### **MATERIALS AND METHODS**

#### **MICROFABRICATION OF TWO-COMPARTMENT TUNNEL DEVICES**

A multilayered mold made of photoresist SU-8 was fabricated on a silicon wafer. The first layer of the mold was made for the microtunnel structure. Briefly, SU-8 2002 (Microchem, Inc.) was spun on a 4-inch silicon wafer at a nominal thickness of 3 µm, baked, exposed with the first mask, baked again, and developed. The second, thicker layer of the mold was made for the well structure. SU-8 2050 was spun on at a nominal thickness of 120 µm and then baked. The second mask was aligned to marks on the silicon wafer and then the second SU-8 film was exposed, baked again, and developed. This mold was slowly filled with PDMS silicone rubber [polydimethylsiloxane; Sylgard 184 (Dow-Corning, Midland, Michigan) 10:1 ratio of pre-polymer (base)/cross-linker (curing agent)]. Once the PDMS spread over the entire wafer, it was heated for 2 h at 70 °C for curing or, in later devices, for 10 h at 70 °C. Two wells for culture and another smaller circular well for a reference electrode were formed on the peeled PDMS with a punch. Finally, a circular PDMS ring was placed around the entire device to form a chamber for holding cell culture media.

#### **DISSECTION OF RAT HIPPOCAMPAL SUBREGIONS**

The SIUSM LACUC approved these experiments as conforming to the Laboratory and Animal Use Guidelines of the NIH. To obtain neurons for electrical and genetic analysis, hippocampal sub-regions were isolated from anesthetized 4-day-old Sprague–Dawley rat pups as described (Mattson et al., 1989; Baranes et al., 1996; Zhao et al., 2001; Lein et al., 2004). The entire hippocampus was dissected away from the overlying neocortex of each brain hemisphere and removed as an intact structure for further sub-region dissection. The boundaries of the DG could be seen in the dissected hippocampus. Briefly, the CA1 or top portion of Ammon's horn was isolated at the natural division of the hippocampal fissure separating CA1 and DG-CA3. Using DG rostral and ventral ends as anchors, cuts were first made along the DG-CA1 boundary until the CA1 was separated and isolated. The CA3 sub-region (bottom remainder of Ammon's horn) was then dissected away from the DG following the clearly visible boundaries.

## **NEURON CULTURE**

Hippocampal sub-region cells were plated at 1000 cells/mm<sup>2</sup> for DG, 330 cells/mm<sup>2</sup> for CA3, and 410 cells/mm<sup>2</sup> for CA1 on poly-D-lysine coated MEAs or glass cover-slips with attached PDMS micro-tunnels in NbActiv4™ medium (Brewer et al., 2008) (BrainBits, Springfield, IL). Poly-D-lysine (Sigma SLBB8061V) was dissolved at 37 °C for 1 h in sterile water before application to the devices at 100 µg/mL and incubation overnight at room temperature. The PDMS tunnels served to connect axons from the separated source sub-region to the target subregion. Sub-region cultures were plated at a ratio respective to their anatomical density *in vivo* (final ratios DG-CA3 3:1, CA3-CA1 1:1.25) (Braitenberg, 1981). **Figure 1** depicts tunnel dimensions in relation to MEA dimensions. Briefly, the 51 tunnels were 400 µm long, 10 µm wide, and 3 µm high, spaced 40 µm apart, allowing for the coverage of seven electrode pairs in the middle of the MEA. Twenty-two electrodes were left uncovered in each of the top (target) and bottom (source) wells of the MEA (well area = 6.28 mm<sup>2</sup>). Homologous cultures connected with tunnels were plated at a 1:1 ratio. Homologous random cultures were plated on 15 mm glass slips (Assistant Brand, Carolina Biologicals). Source cultures were plated first on the bottom half of the MEA and incubated for 15–30 min before adding target cultures. Cultures were incubated at 37 °C, 5% CO<sub>2</sub>, 9% O<sub>2</sub> and saturating humidity (Thermo-Forma, Columbus, OH). Every 4–5 days, one-half of the culture medium was removed and replaced with the same volume of fresh medium up until the day of recording.
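As a quick consistency check on the plating numbers, the stated densities can be converted into approximate cell counts per well and compared with the quoted ratios. The sketch below assumes that each density applies uniformly over the stated well area of 6.28 mm²; the variable and function names are illustrative.

```python
# Plating densities (cells/mm^2) and well area (mm^2) as stated in the text.
DENSITY_PER_MM2 = {"DG": 1000, "CA3": 330, "CA1": 410}
WELL_AREA_MM2 = 6.28


def cells_per_well(region: str) -> float:
    """Approximate number of cells seeded in one well, assuming the plating
    density applies uniformly over the whole well area."""
    return DENSITY_PER_MM2[region] * WELL_AREA_MM2


for region in DENSITY_PER_MM2:
    print(f"{region}: about {cells_per_well(region):.0f} cells per well")

# Density ratios to compare with the quoted 3:1 (DG:CA3) and 1:1.25 (CA3:CA1).
print(f"DG:CA3 = {DENSITY_PER_MM2['DG'] / DENSITY_PER_MM2['CA3']:.2f}:1")
print(f"CA3:CA1 = 1:{DENSITY_PER_MM2['CA1'] / DENSITY_PER_MM2['CA3']:.2f}")
```

The computed ratios come out close to the rounded values given in the text, and the counts indicate on the order of a few thousand cells per compartment.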

#### **HIPPOCAMPAL SUB-REGION RNA EXTRACTION AND DETECTION THROUGH qPCR**

RNA was extracted from 3-week-old cultures using 20 µL (device compartments) or 100 µL (random) Trizol (Life Technologies #15596-026) applied directly to glass cover-slips or tunnel wells. After addition of 20% volume chloroform and 5% glycogen (final 250 µg/mL), samples were centrifuged and precipitated according to the manufacturer. Five hundred nanograms of RNA was used to create a cDNA pool with the High Capacity RNA-to-cDNA Kit (Applied Biosystems #4387406) per instructions. One hundred nanograms of cDNA was used in a 20 µL multiplex Taqman reaction using primers for genes known to be enriched in specific hippocampal sub-regions (Lein et al., 2004): transient receptor potential cation channel 6, enriched in DG (Trpc6, Applied Biosystems #00677559); protein kinase C delta, enriched in CA3 (Prkcd, Applied Biosystems #00440891), a member of the family of serine- and threonine-specific protein kinases activated by calcium and the second messenger diacylglycerol; nephroblastoma overexpressed gene, enriched in CA1 (Nov, Applied Biosystems #00578390), a member of the CCN family of secreted extracellular matrix-associated signaling proteins; and polymerase (RNA) II (DNA directed) polypeptide A (POLR2A, Applied Biosystems #4448489) as the internal standard reference. POLR2A was chosen as the internal standard reference over the more conventional housekeeping gene GAPDH because its lower level of expression (Allen Brain Atlas) is more appropriate for other low-expression genes. qPCR reactions were run with a 2× master mix (Applied Biosystems #4369016) in a StepOne Plus PCR system (Applied Biosystems) at the manufacturer's recommended optimized conditions of 10 min at 95°C for enzyme activation followed by 40 cycles of 15 s denaturation at 95°C and 1 min anneal/extend at 60°C. Primer expression was normalized to POLR2A, with fold-change differences determined using the 2<sup>−ΔΔCt</sup> method. Graphed results show fold changes in the hippocampal sub-region gene probe expression relative to the hippocampal sub-region of known gene enrichment.
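
The comparative Ct calculation used for the fold changes can be illustrated with a short sketch. It is not the authors' analysis script: the function name and the Ct values are hypothetical, and the calibrator is assumed to be the sub-region of known enrichment, as the text describes.

```python
def fold_change(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    ct_target, ct_ref: Ct of the gene of interest and of the reference
    gene (POLR2A in the text) in the sample being tested.
    ct_target_cal, ct_ref_cal: the same two Ct values in the calibrator
    sample (assumed here to be the sub-region of known enrichment).
    """
    delta_sample = ct_target - ct_ref              # dCt in the test sample
    delta_calibrator = ct_target_cal - ct_ref_cal  # dCt in the calibrator
    ddct = delta_sample - delta_calibrator
    return 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only.
print(fold_change(27.0, 24.0, 25.0, 24.0))  # ddCt = 2, prints 0.25
```

A value below 1 means the probe is less expressed in the tested sub-region than in the calibrator sub-region.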

## **MULTI-ELECTRODE ARRAYS AND RECORDING**

MEAs from Multichannel Systems (MCS, Reutlingen, Germany) consisted of 60 TiN3 electrodes with diameters of 30 µm and spacing of 200 µm, one of which served as ground. The spontaneous activity on the MEAs was measured using an MCS 1100× amplifier at 25 kHz sampling with a hardware filter of 1–3000 Hz at 37°C under continuous flow of hydrated, sterile 5% CO2, 9% O2, balance N2 (custom made, AGA, Springfield, IL). A Teflon membrane (ALA Scientific, Westbury, NY) was used to reduce evaporation and chances of contamination. MCRack software was used to record 3 min of spontaneous activity from 3-week-old cultures.

#### **SPIKE ACTIVITY ANALYSIS**

Offline data analysis was performed using a modification of SpyCode V2.0 software (Bologna et al., 2010) including custom MATLAB scripts (The Mathworks, Natick, MA). After high-pass filtering the data at 300 Hz, spikes were identified as peak-to-peak amplitudes that exceeded 9 times the minimum root-mean-square of 200 ms contiguous windows. A dead-time or refractory period of 1 ms was assumed after each detected spike. Bursts were defined as 4 or more spikes with no greater than a 50 ms inter-spike interval.
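
The burst criterion stated above (4 or more spikes with inter-spike intervals of at most 50 ms) can be sketched as follows. This is a minimal illustration and not the SpyCode implementation; the function name and the example spike train are hypothetical.

```python
def detect_bursts(spike_times_ms, max_isi_ms=50.0, min_spikes=4):
    """Group sorted spike times (ms) into bursts.

    A burst is a run of at least `min_spikes` spikes in which every
    consecutive inter-spike interval is <= `max_isi_ms`.
    Returns a list of (start_ms, end_ms, n_spikes) tuples.
    """
    bursts = []
    run = []
    for t in spike_times_ms:
        # A gap longer than max_isi_ms closes the current run.
        if run and t - run[-1] > max_isi_ms:
            if len(run) >= min_spikes:
                bursts.append((run[0], run[-1], len(run)))
            run = []
        run.append(t)
    # Close the final run.
    if len(run) >= min_spikes:
        bursts.append((run[0], run[-1], len(run)))
    return bursts


# Hypothetical spike train on one electrode (times in ms).
spikes = [0, 10, 30, 45, 200, 210, 400, 420, 440, 460, 480]
print(detect_bursts(spikes))  # [(0, 45, 4), (400, 480, 5)]
```

The pair of spikes at 200 and 210 ms is not reported because it falls short of the four-spike minimum.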

Delay times of spikes within tunnels were used to determine directionality. Due to larger amplitudes in tunnels, thresholds were determined on a per-electrode basis as visually selected asymptotic minima of the continuous voltage distribution. Furthermore, the time stamp of any *spike event* was chosen as the time of occurrence of the voltage maximum within any such event. A histogram of delay times between spike pairs was constructed within the limits of ±280–600µs, conforming to the known range of axonal conduction delays (Patolsky et al., 2006) and limited in resolution by the 40µs sampling period. The histograms showed distinct peaks, each indicating the high precision in delay time that is consistent with an action potential propagating on a single axon past a pair of electrodes.
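
A delay histogram of this kind can be sketched in a few lines. The sketch assumes that spike time stamps are given in microseconds on the 40 µs sampling grid, that a positive delay means the spike reached electrode A before electrode B, and that the dominant sign of the delays indicates direction; these conventions and all names are illustrative assumptions, not the authors' code.

```python
from collections import Counter


def delay_histogram(times_a_us, times_b_us,
                    min_delay_us=280, max_delay_us=600, bin_us=40):
    """Histogram of delays (t_b - t_a) between spike pairs on two
    tunnel electrodes, keeping only delays whose magnitude lies in
    [min_delay_us, max_delay_us]. Keys are signed delays in microseconds,
    rounded to the sampling grid."""
    hist = Counter()
    for ta in times_a_us:
        for tb in times_b_us:
            delay = tb - ta
            if min_delay_us <= abs(delay) <= max_delay_us:
                hist[round(delay / bin_us) * bin_us] += 1
    return dict(sorted(hist.items()))


def dominant_direction(hist):
    """Compare counts of positive and negative delays (assumed criterion)."""
    forward = sum(c for d, c in hist.items() if d > 0)
    backward = sum(c for d, c in hist.items() if d < 0)
    if forward == backward:
        return "undetermined"
    return "a_to_b" if forward > backward else "b_to_a"


# Hypothetical time stamps (µs) on two electrodes under one tunnel.
a = [1000, 5000, 9000]
b = [1400, 5400, 8600]
h = delay_histogram(a, b)
print(h)                      # {-400: 1, 400: 2}
print(dominant_direction(h))  # a_to_b
```

The nested loop is quadratic in the number of spikes, which is acceptable for a sketch but would need a windowed search for long recordings.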

#### **GAD67 GFP REPORTER FOR GABAergic NEURONS**

At 7 DIV, a fluorescent GAD67 lentivirus reporter (System Biosciences #SR10023VA-1, 10 µg/ml) diluted in 8 µg/ml protamine sulfate (Sigma #P4020) for better adsorption was added to DG and CA3 sub-cultures with PDMS micro-tunnels on glass cover-slips. Cultures were incubated in NbActiv4 medium for two more weeks before imaging. At 3 weeks, cultures were switched from NbActiv4 medium to Hibernate Low Fluorescence—Glucose (BrainBits #112012) and bisbenzamide was added to stain cell nuclei for 2 min (final concentration 300 ng/ml, diluted in PBS, Sigma #B2261). Cultures were rinsed two times in Hibernate LF—glucose before being imaged. Images were taken through an Olympus 20×/0.45 objective, and recorded with a Retiga Exi CCD camera (QImaging, Surrey, BC, Canada). Image Pro<sup>+</sup> software was used in digital analysis and display of the immunostain and nuclear stain. After flattening backgrounds, a constant density segmentation threshold was set for cell counts.

#### **GFAP OR GAD65 IMMUNOSTAINS FOR ASTROGLIA OR GABAergic NEURONS**

For immunostains, DG and CA3 sub-cultures with PDMS micro-tunnels were plated as described previously on glass cover-slips and cultured for 3 weeks, then fixed in 4% paraformaldehyde for 10 min. Cells were permeabilized and weakly antigenic sites blocked in 5% normal goat serum (NGS) and 0.5% Triton X-100 (TX-100) in PBS. Conjugate mouse anti-GFAP/Alexa Fluor 488 (Molecular Probes #A21294) was diluted 1:500 in 5% NGS and 0.05% TX-100. Cells were incubated for 90 min at 22°C, then rinsed four times in PBS. Other antibodies were mouse anti-GAD65 1:250 (Sigma #G1166) with secondary Alexa Fluor 588-conjugated goat anti-mouse 1:1000 (Molecular Probes #11031). Nuclei of cells were stained for 2 min with bisbenzamide (final concentration 300 ng/ml, diluted in PBS, Sigma #B2261). Slips were rinsed two final times, imaged through an Olympus 20×/0.45 objective, and recorded with a Retiga Exi CCD camera (QImaging, Surrey, BC, Canada).

#### **STATISTICS**

Statistical differences were determined by Student's *t*-test with *p* < 0.05 considered significant for two-way comparisons of normally distributed data. Log-normal adjustments were made when appropriate. *Post-hoc* Tukey adjustments for multiple comparisons are reported after significant ANOVA. Statistical differences of spike times were determined by the Wilcoxon test with significance at *p* < 0.05 (Hollander and Wolfe, 1999).

## **ACKNOWLEDGMENTS**

This work was supported by a grant from NIH, R01 NS052233.

#### **REFERENCES**

analyzer. *Neural Netw*. 23, 685–697. doi: 10.1016/j.neunet.2010.05.002


332, 11835–11840. doi: 10.1523/JNEUROSCI.5543-11.2012


*Eur. J. Neurosci*. 35, 375–388. doi: 10.1111/j.1460-9568.2011.07966.x


62, 1767–1776. doi: 10.1016/j.neuropharm.2011.11.022


*Bioelectron*. 15, 383–396. doi: 10.1016/S0956-5663(00)00095-6


Zhao, X., Lein, E. S., He, A., Smith, S. C., Aston, C., and Gage, F. H. (2001). Transcriptional profiling reveals strict boundaries between hippocampal subregions. *J. Comp. Neurol*. 441, 187–196. doi: 10.1002/cne.1406

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest, with the exception that the culture medium, NbActiv4, was provided by BrainBits LLC, which is owned by one of the authors (Gregory J. Brewer).

*Received: 01 May 2013; accepted: 23 September 2013; published online: 21 October 2013.*

*Citation: Brewer GJ, Boehler MD, Leondopulos S, Pan L, Alagapan S, DeMarse TB and Wheeler BC (2013) Toward a self-wired active reconstruction of the hippocampal trisynaptic loop: DG-CA3. Front. Neural Circuits 7:165. doi: 10.3389/fncir.2013.00165*

*This article was submitted to the journal Frontiers in Neural Circuits.*

*Copyright © 2013 Brewer, Boehler, Leondopulos, Pan, Alagapan, DeMarse and Wheeler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Alexey Pimashkin<sup>1</sup>\*, Arseniy Gladkov<sup>1,2</sup>, Irina Mukhina<sup>1,2</sup> and Victor Kazantsev<sup>1,3</sup>*

*<sup>1</sup> Department of Neurodynamics and Neurobiology, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia*

*<sup>2</sup> Normal Physiology Department, Nizhny Novgorod State Medical Academy, Nizhny Novgorod, Russia*

*<sup>3</sup> Laboratory of Nonlinear Processes in Living Systems, Institute of Applied Physics of the Russian Academy of Science, Nizhny Novgorod, Russia*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Sergio Martinoia, University of Genova, Italy*

*Caleb Kemere, Rice University, USA*

*Robert E. Hampson, Wake Forest University Health Sciences, USA*

#### *\*Correspondence:*

*Alexey Pimashkin, Department of Neurodynamics and Neurobiology, Lobachevsky State University of Nizhny Novgorod, 23 Prospekt Gagarina, Nizhny Novgorod 603950, Russia.*

*e-mail: pimashkin@neuro.nnov.ru*

Learning in neuronal networks can be investigated using dissociated cultures on multielectrode arrays supplied with appropriate closed-loop stimulation. Previous studies showed that weakly responding neurons on the electrodes can be trained to increase their evoked spiking rate within a predefined time window after the stimulus. Such neurons can be associated with weak synaptic connections in the nearby culture network, and the stimulation leads to an increase in connectivity and in the response. However, it was not possible to apply the learning protocol to neurons on electrodes with relatively strong synaptic inputs that respond at higher rates. We propose an adaptive closed-loop stimulation protocol capable of achieving learning even for highly responsive electrodes. This means that the culture network can appropriately reorganize its synaptic connectivity to generate a desired response. We introduce an adaptive reinforcement condition that accounts for the response variability in control stimulation. It significantly extends the learning protocol to a large number of responding electrodes independently of their base response level. We also found that the learning effect was preserved 4–6 h after training.

**Keywords: multielectrode arrays, hippocampal cultures, closed-loop, learning** *in vitro***, learning in neural networks**

## **INTRODUCTION**

Neuronal networks formed in dissociated cultures grown on multielectrode arrays have been widely used as a biological model to monitor mechanisms of information encoding, synaptic plasticity, memory formation, and learning at the network level *in vitro* (le Feber et al., 2010; Frega et al., 2012; Maccione et al., 2012). Planar microelectrode systems permit simultaneous recording and electrical stimulation in different parts of the cultured neuronal network (Thomas et al., 1972).

After 2–3 weeks of spontaneous development, cultured neural networks display spontaneous burst discharges. The discharges consist of 0.1–1 Hz sequences of population bursts of 50–300 ms duration. Recent investigations showed that spatio-temporal patterns of spiking activity within the bursts are organized in a statistically repeatable and reproducible way (Raichman and Ben-Jacob, 2008; Pimashkin et al., 2011). Such repeatability indicated the presence of quite stable synaptic connectivity formed in the cultured network. External electrical stimulation modified the spiking pattern and, hence, induced long-term changes in the synaptic architecture of the underlying network. If the stimulation is applied under closed-loop conditions, such changes may be directed to achieve a predefined profile of the evoked response. The latter can further be associated with navigating robots capable of implementing simple behavioral tasks (Chao et al., 2008; Shahaf et al., 2008).

Low-frequency electrical stimulation in the form of a pulse train (0.03–0.1 Hz) induced population burst responses over most of the neurons in the network during 50–300 ms after the stimulus artifact (Maeda et al., 1995; Wagenaar et al., 2004). Such stimulation did not change functional characteristics of the evoked response over either short- or long-term periods (Chiappalone et al., 2008). However, spontaneous bursts can change their pattern after the low-frequency stimulation, indicating changes in the network connectivity (Brewer et al., 2009; Bologna et al., 2010; Ide et al., 2010; le Feber et al., 2010). Increasing the stimulation frequency up to 1 Hz or higher led to suppression of the evoked responses (Jimbo et al., 1993; Shahaf and Marom, 2001; Eytan et al., 2003; Wagenaar et al., 2005; le Feber et al., 2010). Note that tetanic stimulation at 10 Hz induced spike timing-dependent plasticity (STDP) in the culture network (Wagenaar et al., 2006a,b). Note also that if signal propagation through synaptic pathways was blocked by applying 6-cyano-7-nitroquinoxaline-2,3-dione (CNQX) and (2R)-amino-5-phosphonovaleric acid (APV), antagonists of α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) and *N*-methyl-D-aspartate (NMDA) receptors, respectively, then evoked spikes could be observed only at latencies shorter than 10 ms (Wagenaar et al., 2004). They represent a direct response to the excitation of an axon passing both the stimulation and the recording electrode, or to the excitation of a cell whose axon passes the recording electrode. Blocking Na<sup>+</sup> channels with tetrodotoxin (TTX) abolished all spontaneous and evoked activity in the culture network. These results suggested that under normal conditions the stimulus-evoked spikes with latencies greater than 10 ms represented "network" spikes generated by signal propagation through the synaptic pathways of the culture network.

A closed-loop protocol of learning in cultured networks of cortical neurons stimulated by a low-frequency signal (0.3–1 Hz) was proposed by Shahaf and Marom (2001). Each stimulus response was defined as the number of evoked spikes appearing in a 50 ± 10 ms post-stimulus interval. For continuous stimulation they introduced the response-to-stimulus ratio (R/S) for a single electrode. This quantity was defined as a moving average over the 10 preceding responses. It characterized slow changes in the response caused by plasticity of synaptic pathways between neurons located near the stimulating and recording electrodes. If the R/S value exceeded a certain threshold (R/S = 0.2 in Shahaf and Marom, 2001), the stimulation was stopped for 5 min, providing the reinforcement. The cycle was then repeated several times. The time interval needed to reach the threshold in each cycle was treated as the adaptation time. A decrease of the adaptation time over the stimulation cycles was then interpreted as learning. In contrast, low-frequency stimulation without the reinforcement (e.g., open-loop conditions) did not induce the learning effect. Changes in the response were observed only on the trained electrode and not on the other electrodes. le Feber et al. (2010) found that closed-loop stimulation in cortical cultures induced significant changes in synaptic connectivity, in contrast to the open-loop conditions. It was also noted that after training the spontaneous bursts changed, with enhanced correlation and synchrony (Li et al., 2007). This learning protocol was used in several other studies (Marom and Shahaf, 2002; Stegenga et al., 2009). It is important to note that only low-activity electrodes recording one spike per 10 stimuli (e.g., with R/S = 0.1) were used for learning. Long-term changes were monitored for more than 30 cycles of stimulation. Electrodes with higher R/S (R/S = 0.5) were also examined for learning, but the learning effect was observed only during the first six cycles of stimulation (Staveren et al., 2005).

"fncir-07-00087" — 2013/5/22 — 18:24 — page 1 — #1

In this paper we present the results of learning experiments in hippocampal cultured networks on multielectrode arrays with closed-loop stimulation. Using an adaptive, activity-dependent reinforcement condition, we found that electrodes with relatively high response activity (R/S > 0.1) can be used for learning. Thus, closed-loop stimulation could modify synaptic pathways of the culture network with the relatively strong connections typically formed in spontaneous development. We also show that the adaptive reinforcement significantly increases the number of highly responsive electrodes that can be trained (typically more than 50%) relative to the ones with lower response (R/S < 0.1) used in previous studies.

#### **MATERIALS AND METHODS**

#### **CELL CULTURING**

Cell cultures were prepared from the hippocampus of C57BL/6 mouse embryos at the 18th prenatal day (E18) following standard procedures (Potter and DeMarse, 2001; Pimashkin et al., 2011). After trypsin treatment, cells were dissociated by trituration and plated on 64-electrode arrays (Alpha MED Science, Japan) pre-coated with the adhesion-promoting molecule polyethyleneimine (PEI). The final density of the cell culture was about 15,000–20,000 cells/mm<sup>2</sup>. Note that previous studies used cultures with cell densities of about 10,000–50,000 cells/mm<sup>2</sup> (Shahaf and Marom, 2001) and 5000 cells/mm<sup>2</sup> (le Feber et al., 2010). In both studies the cultures were plated from cortical cells. In similar learning experiments with hippocampal cultures the density was 2000 cells/mm<sup>2</sup> (Li et al., 2007).

Cells were maintained in Neurobasal culture medium (Invitrogen 21103-049) with B27 (Invitrogen 17504-044), glutamine (Invitrogen 25030-024) and fetal calf serum (PanEco κ055), under constant conditions of 37°C, 100% humidity, and 5% CO2 in air in an incubator (MCO-18AIC, SANYO). No antibiotics or antimycotics were used. Glial growth was not suppressed because glial cells were essential to long-term culture health. One half of the medium was changed every 2 days. Experiments were performed when neuronal networks were 3–6 weeks *in vitro*, which permitted their functional and structural maturation (Eytan et al., 2003).

#### **ELECTROPHYSIOLOGY**

Extracellular potentials were collected through 64 planar platinum black electrodes simultaneously with the integrated MED64 system (Alpha MED Science, Japan). The 8 × 8 (64) microelectrode arrays with 50 μm × 50 μm electrodes and 150 μm spacing were used for recording at a sampling rate of 20 kHz/channel (**Figure 1A**). Stimuli were generated using a four-channel voltage/current stimulator (STG4004, MultiChannel Systems, Germany). Closed-loop conditions were implemented by custom-made software (Labview®) using real-time signal analysis and conditional stimulation.

#### **SPIKE DETECTION**

Spikes were detected with a threshold calculated from the median of the signal according to the following formula:

$$T = N_S\,\sigma, \quad \sigma = \text{median}\left(\frac{|x|}{0.6745}\right) \tag{1}$$

where *x* is the bandpass-filtered (0.3–8 kHz) data signal, σ is an estimate of the standard deviation of the signal in the absence of spikes (Quiroga et al., 2004), and *N<sub>S</sub>* is a spike detection coefficient determining the detection threshold (Pimashkin et al., 2011). For a signal containing Gaussian noise, the standard deviation equals the median of the absolute values of the signal divided by 0.6745, so dividing by 0.6745 converts the median into an estimate of the standard deviation.

The spike detection coefficient *N<sub>S</sub>* made it possible to take into account the contribution of different spike amplitudes. *N<sub>S</sub>* = 4 was used for all data, which accounted for spikes with amplitudes of more than 20 μV. The minimal interspike interval was set to 1 ms. Detected spikes were then plotted in a raster diagram.
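
The threshold of Equation 1 and the 1 ms minimal interspike interval can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' software: the signal is assumed to be already bandpass-filtered, threshold crossings are taken on the absolute value of the signal (the polarity is not specified in the text), and the function names and test signal are hypothetical.

```python
import statistics


def detection_threshold(signal_uv, n_s=4.0):
    """T = N_S * sigma with sigma = median(|x| / 0.6745) (Equation 1)."""
    sigma = statistics.median(abs(v) / 0.6745 for v in signal_uv)
    return n_s * sigma


def detect_spikes(signal_uv, fs_hz, n_s=4.0, min_isi_ms=1.0):
    """Return sample indices where |x| exceeds T, skipping crossings
    closer than min_isi_ms to the previously detected spike."""
    threshold = detection_threshold(signal_uv, n_s)
    min_gap = int(round(min_isi_ms * fs_hz / 1000.0))
    spikes = []
    last = None
    for i, v in enumerate(signal_uv):
        if abs(v) > threshold and (last is None or i - last >= min_gap):
            spikes.append(i)
            last = i
    return spikes


# Hypothetical signal: +/-5 µV background with three large samples.
signal = [5.0, -5.0] * 50
signal[20], signal[21], signal[70] = -60.0, -40.0, 80.0
print(detect_spikes(signal, fs_hz=20000))  # [20, 70]
```

Sample 21 also exceeds the threshold but lies within 1 ms (20 samples at 20 kHz) of the spike at sample 20, so it is not counted separately.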

#### **STIMULATION PROTOCOL**

We used trains of biphasic rectangular voltage pulses (600 mV and 300 μs per phase, with positive phase first) at low frequency in the range of 0.05–0.06 Hz. The stimulation frequency was chosen to induce bursting activity in the 20–500 ms post-stimulus interval (**Figure 1B**). Note that in previous studies the stimulation frequencies were significantly higher (0.1 Hz, 0.3 Hz, Shahaf and Marom, 2001; 0.2–0.33 Hz, le Feber et al., 2010) without any relation to spontaneous bursting frequency. In our experiments most stimuli delivered at frequencies higher than 0.1 Hz did not evoke stable bursting activity. However, we found that stimulation at 0.05 Hz and/or 0.06 Hz, which is close to the characteristic bursting frequency, led to evoked bursts. Note also that, technically, the lower frequencies were preferable for long-term stimulation because they cause less electrode degradation due to electrolysis.

As in previous studies, we characterized the response by the response-to-stimulus ratio (R/S) calculated for each response and for each electrode. For this purpose, we counted the number of spikes detected in the 40–80 ms post-stimulus interval on each electrode independently and then defined R/S as the moving average across the 10 preceding responses (Shahaf and Marom, 2001). This quantity indicated slow changes of the neuronal response over the past 170–200 s.
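
The R/S computation can be sketched as follows. The sketch assumes that the moving average includes the current response together with the nine before it, and that for the first few stimuli the mean is taken over the responses available so far; both choices, and all names, are assumptions rather than details given in the text.

```python
def count_spikes_in_window(spike_latencies_ms, start_ms=40.0, end_ms=80.0):
    """Number of spikes whose post-stimulus latency falls in the window."""
    return sum(1 for t in spike_latencies_ms if start_ms <= t <= end_ms)


def response_to_stimulus_ratio(responses, window=10):
    """R/S after each stimulus: mean spike count over the `window`
    most recent responses on one electrode."""
    ratios = []
    for i in range(len(responses)):
        recent = responses[max(0, i - window + 1): i + 1]
        ratios.append(sum(recent) / len(recent))
    return ratios


# Hypothetical spike counts in the 40-80 ms window for 12 stimuli.
responses = [0, 1, 0, 2, 1, 0, 0, 3, 1, 2, 4, 0]
rs = response_to_stimulus_ratio(responses)
print(rs[9], rs[10], rs[11])  # 1.0 1.4 1.3
```

At 0.05–0.06 Hz, ten responses span roughly 170–200 s, which is the time scale quoted above.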

#### *Control stimulation (open-loop)*

The control stimulation was performed for 75 min (five cycles of 10 min stimulation and 5 min rest) at a stimulation frequency of 0.05 Hz (150 stimuli) or 0.06 Hz (180 stimuli). In more than 50% of the experiments (14 out of 24) the control stimulation was performed for 31 cycles (465 min, ∼7.75 h) to test the learning effect without reinforcement. After control stimulation the R/S values were calculated for each electrode.

The stimulation electrode was first chosen at random. If it evoked bursts recorded by most of the electrodes during 5 min of stimulation, it was accepted as the stimulation electrode. If no bursting response was found, we tried another one. We considered only stably responding cultures, i.e., those in which the total number of spikes in the 20–300 ms post-stimulus interval, summed over all recording electrodes, did not significantly increase or decrease during control stimulation. Slow changes of the responses were tested by comparing the responses in the first and the last half of the recordings with the Mann–Whitney rank-sum test (*p* < 0.05). If the two sets of responses were not significantly different, the stimulation electrode was retained for further training; otherwise, we tested another randomly chosen electrode or took another culture for the experiments. We also note that most of the cultures in which the responses increased or decreased during control stimulation demonstrated stable responses after several days. The responses were compared by relative changes of the mean value and of the standard deviation of the responses to the first and the last 30 stimuli in the 20–300 ms interval, normalized to the number of spikes in the first 30 responses. The recording from each electrode was characterized by two statistical indicators: the mean R/S value, M(R/S), and the R/S standard deviation, σ(R/S). The electrode for training was randomly chosen among the electrodes having an M(R/S) value in the range of 0–8 with a standard deviation in the range 0.1 M(R/S) < σ(R/S) < 2 M(R/S) in control stimulation.
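
The selection criterion for the training electrode can be written as a simple predicate. This sketch assumes the sample standard deviation and a strictly positive mean; the function name and the example R/S series are hypothetical.

```python
import statistics


def eligible_for_training(rs_values, max_mean=8.0):
    """True if 0 < M(R/S) <= max_mean and
    0.1 * M(R/S) < sigma(R/S) < 2 * M(R/S) for the control R/S series."""
    m = statistics.mean(rs_values)
    s = statistics.stdev(rs_values)  # sample standard deviation (assumed)
    return 0 < m <= max_mean and 0.1 * m < s < 2 * m


# Hypothetical control R/S series for four electrodes.
print(eligible_for_training([0.5, 1.0, 1.5, 1.0, 0.5, 1.5]))  # True
print(eligible_for_training([0.0] * 6))                       # False, silent
print(eligible_for_training([9.0, 10.0, 11.0]))               # False, mean > 8
print(eligible_for_training([2.0] * 4))                       # False, no variability
```

The training electrode would then be drawn at random from the electrodes for which the predicate holds.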

#### *Training stimulation (closed-loop)*

Training stimulation was applied in closed-loop conditions. It started one hour after the control stimulation. The training consisted of cyclic stimulation with continuous evaluation of the response. If the R/S value of the response to the current stimulus exceeded a defined threshold, the stimulation stopped automatically. This provided the reinforcement that directed the culture toward the required state. We introduced a novel algorithm that defines the R/S threshold for the reinforcement condition by taking into account the responses in control stimulation. Such a definition sets different threshold values for different parts of the culture (e.g., different electrodes) involved in the training experiment. We took the highest 15% of the distribution of R/S values observed for the selected electrode in control stimulation (example in **Figure 2C**). The lower boundary of that fraction of the distribution was assigned as the R/S threshold value. The threshold may also be referred to as the 85th percentile. The percentage of the R/S values used for threshold estimation was defined as the *threshold estimation parameter*, R/SThr%.
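
The threshold definition can be sketched directly from this description: sort the control R/S values, take the highest R/SThr% of them, and use the lowest value of that group. The rounding rule (the size of the top group is rounded up) and the names are assumptions made for the illustration.

```python
def rs_threshold(control_rs, top_percent=15):
    """Lower boundary of the highest `top_percent` % of control R/S values."""
    ordered = sorted(control_rs, reverse=True)
    # Integer ceiling of top_percent% of the sample size, at least 1.
    n_top = max(1, (top_percent * len(ordered) + 99) // 100)
    return ordered[n_top - 1]


# Hypothetical control R/S values for one electrode: 1, 2, ..., 20.
control = list(range(1, 21))
print(rs_threshold(control, 15))  # 18: the top 15% are 20, 19, 18
print(rs_threshold(control, 5))   # 20
print(rs_threshold(control, 20))  # 17
```

A larger R/SThr% gives a lower threshold, which is reached more easily by spontaneous fluctuations of the response.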

The training phase of the experiment consisted of a sequence of stimulation cycles with the same frequency and real-time evaluation of the R/S value on the selected electrode. If the R/S value of the activity from the selected electrode in response to a stimulus reached the R/S threshold, or if the stimulation time exceeded 10 min, the stimulation was automatically stopped for 5 min, completing the training cycle. The training cycle was then repeated 30–35 times. Thus the response of the neurons on the selected electrode altered the stimulation duration in each cycle. The time interval from the beginning of the cycle to the moment when the R/S value became greater than or equal to the R/S threshold was defined as the *adaptation time*, TR/S. The TR/S was monitored for each cycle, and the sequence of TR/S values defined the *learning curve*. The relative change of the TR/S during the experiment was defined as the *adaptation time ratio*, K(TR/S), and was estimated as the mean TR/S in the last 10 cycles divided by the mean TR/S in the first 10 cycles. A decrease of the TR/S during the stimulation cycles [K(TR/S) < 0.5] was then treated as successful learning, by the neurons on the selected electrode, of the desired response to the stimulation. To compare the efficiency of the closed-loop stimulation, the parameters K(TR/S) and TR/S were also calculated for control stimulation (e.g., the open-loop).
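
The adaptation time and the adaptation time ratio can be sketched offline as below. The sketch assumes that the k-th stimulus of a cycle is delivered k stimulation periods after the cycle starts and that a cycle without a threshold crossing is assigned the 10 min maximum; the names and the example values are hypothetical, and the real-time Labview implementation is not reproduced.

```python
def adaptation_time(rs_per_stimulus, threshold, stim_period_s, max_time_s=600.0):
    """Time (s) from cycle start until R/S >= threshold, capped at max_time_s."""
    for k, rs in enumerate(rs_per_stimulus, start=1):
        elapsed = k * stim_period_s
        if elapsed > max_time_s:
            break
        if rs >= threshold:
            return elapsed
    return max_time_s


def adaptation_time_ratio(times_s, n=10):
    """K: mean adaptation time of the last n cycles over that of the first n."""
    first, last = times_s[:n], times_s[-n:]
    return (sum(last) / len(last)) / (sum(first) / len(first))


# One hypothetical cycle at 0.05 Hz (20 s period), threshold 0.9.
print(adaptation_time([0.2, 0.4, 0.9, 1.2], 0.9, 20.0))  # 60.0

# Hypothetical learning curve over 30 cycles.
curve = [600.0] * 10 + [400.0] * 10 + [150.0] * 10
k = adaptation_time_ratio(curve)
print(k, k < 0.5)  # 0.25 True
```

With the Methods criterion K(TR/S) < 0.5, this hypothetical learning curve would count as successful learning.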

We also checked whether the learning effect was stable 4–6 h after the experiments by performing a four-cycle training stimulation.

At longer time intervals (days or weeks) the cultures changed significantly due to spontaneous development. In our experiments we reused some of them no sooner than 2 days after the last training stimulation. When multiple experiments were performed on a single culture, we selected electrodes from different regions of the array for each new experiment to avoid possible influence of the previous stimulation experiments.

#### *Spontaneous activity analysis*

To analyze the effect of the stimulation on the state of the culture network, we recorded spontaneous bursting activity for 10 min. We compared the average inter-burst intervals, average number of spikes per burst, and burst durations for the recordings before and after the stimulation experiments. Detection of individual bursts was based on threshold estimation of the basal spike rate activity, measured as the total number of spikes observed in each 50 ms time bin (see Pimashkin et al., 2011 for more details). Statistical analysis of the bursting activity characteristics was performed by the Mann–Whitney rank-sum test (*p* < 0.05).

#### **RESULTS**

#### **OPEN-LOOP STIMULATION**

First we analyzed responses of the culture to long-lasting (five cycles – 75 min and 31 cycles – 465 min) low-frequency stimulation (0.05, 0.06 Hz) through stimulation electrodes that evoked a population bursting response (see Materials and Methods). The stimuli were initially delivered through one randomly chosen electrode (**Figure 1B**). The dynamics of the evoked network response recorded from all electrodes was characterized by the post-stimulus time histogram (PSTH). For each 10 ms time interval after the stimulus artifact, the total number of spikes recorded from all electrodes was calculated (**Figure 1C**). The maximum spike rate of the response was observed 50–100 ms after the stimulus.
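
A PSTH with 10 ms bins can be computed as in the following sketch. The latencies are assumed to be pooled over all electrodes and expressed in milliseconds relative to the stimulus; the 300 ms window, the names and the example data are illustrative assumptions.

```python
def psth(spike_latencies_ms, bin_ms=10.0, window_ms=300.0):
    """Spike counts per post-stimulus bin, pooled over electrodes."""
    n_bins = int(window_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_latencies_ms:
        if 0 <= t < window_ms:
            counts[int(t // bin_ms)] += 1
    return counts


# Hypothetical pooled latencies (ms) after one stimulus.
latencies = [5, 12, 18, 55, 57, 58, 61, 95, 250, 310]
counts = psth(latencies)
peak_bin = counts.index(max(counts))
print(peak_bin * 10, (peak_bin + 1) * 10)  # 50 60: peak in the 50-60 ms bin
```

The spike at 310 ms falls outside the window and is not counted.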

Then we analyzed the characteristics of the responses in the control stimulation (**Figure 2**). We found that 31.13% of the electrodes (64 in total) had 0 < M(R/S) ≤ 0.1 during the control stimulation (14 trials of long recordings, *n* = 10 cultures). The percentage of electrodes having 0 < M(R/S) ≤ 10 was 58.16% (**Figure 2A**). Particular electrodes for training stimulation were chosen among the electrodes with 0 < M(R/S) ≤ 8 (see Materials and Methods). Note that in previous studies only electrodes with an average R/S value during the control stimulation, M(R/S), equal to 0.1 were chosen for training, and R/S = 0.2 was set as the R/S threshold (Shahaf and Marom, 2001; Li et al., 2007; Stegenga et al., 2009; le Feber et al., 2010).

The time course of the R/S values for each stimulus response during the control stimulation is shown in **Figure 2B**. The ends of the 10 min stimulation cycles are marked by blue lines. Note that the responses were quite variable. The learning threshold was defined as the lowest value among the highest 15% of R/S values, i.e., the 85th percentile of the R/S values (see Materials and Methods). An example of the distribution of R/S values from the selected electrode and the R/S threshold is shown in **Figure 2C**. In other words, the threshold was set to detect quite rare, high-rate responses. Note that for different electrodes the R/S thresholds ranged from 0.2 to 12 in different experiments.

After the threshold was defined, the adaptation time *TR/S* could be estimated for each cycle. To confirm that the learning effect can be induced only in closed-loop conditions, we estimated a learning curve (*TR/S* for each cycle) for control stimulation (**Figure 2D**). The results show that the adaptation time remains relatively stable. Next we analyzed the influence of the R/S threshold estimation parameter on the adaptation dynamics by setting different R/SThr% values: 5, 10, 15, and 20% (95th, 90th, 85th, and 80th percentile, respectively). Note that the lower the threshold is set, the easier it is to reach by spontaneous fluctuations of the response. Hence, the adaptation curves for the lower thresholds were located lower (**Figure 2D**). However, changing the threshold did not qualitatively change the adaptation dynamics.

#### **CLOSED-LOOP STIMULATION**

Next we performed the training stimulation experiments with reinforcement (see Materials and Methods). In these conditions the stimulation was turned off when the learning threshold was reached in each cycle.

The adaptation dynamics for one experiment is shown in **Figure 3A**. The adaptation time for the case of successful learning (black curve) went down after several cycles of the training stimulation. We also found that some of the cultures could not be trained, as illustrated by the red curve in **Figure 3A**. For those cultures the adaptation time fluctuated near its maximal value for the whole duration of the stimulation. The training stimulation was applied to 17 different cultures in 24 experiments. **Figure 3B** shows the average learning curve for the set of successful experiments.

**Figure 3** [...] experiments, six cultures). Dashed curves illustrate the standard deviation. **(C)** Average adaptation time during five cycles at the beginning of the training experiment (1, *n* = 6), when learning is achieved (2, *n* = 6) and 4–6 h after the end of the main experiment (3, *n* = 3). Error bars correspond to the standard deviation; statistical significance was tested by *t*-test (*p* < 0.05). **(D)** The R/S threshold values for successful (black markers) and failed (colored markers) [...] experiments the learning was achieved using R/S threshold parameter 15% (out of 9 and out of 24 experiments in total). **(E)** Average adaptation time ratio for control stimulation, failed learning and successful learning. Error bars correspond to the standard deviation. The ratios of the successful learning were significantly different from the control stimulation (*t*-test, *p* < 0.05).

In contrast to the open-loop case (control stimulation), the adaptation time decreased, indicating a learning effect. To quantify it we used the adaptation time ratio K(TR/S) (see Materials and Methods). If K(TR/S) was lower than 0.6, the training was considered successful. We also analyzed the influence of the threshold estimation parameter (**Figure 3D**). Interestingly, only R/SThr% = 15% induced the learning effect (black squares in **Figure 3D**). It was found in six of nine experiments (*n* = 9 cultures) with an absolute value of the R/S threshold less than 1. Similar statistics of about 50% successful experiments were reported in previous studies (Shahaf and Marom, 2001; le Feber et al., 2010).

A decrease of the adaptation time *TR/S* was typically observed after 10–14 stimulation cycles (see **Figures 3A,B**). In two experiments it decreased almost immediately, after the second stimulation cycle. On average, at the end of the training experiment the adaptation time became 110.62 ± 81.17. Note that the average R/S values for the first and for the last 30 stimuli were not statistically different. In several experiments we obtained rather high *TR/S* values after 2–4 h of stimulation, leading to higher deviations in the averaged values (**Figure 3B**).

To confirm that the learning effect of closed-loop stimulation can induce long-term changes (on the time scale of hours), we performed several experiments after the main course of learning. Training stimulation of four cycles (60 min) was applied 4–6 h after the end of the main experiments. We found that in three of six cases the learning effect was preserved, as illustrated in **Figure 3C**.

Next we addressed the question of whether the pattern of the response changed due to the stimulation. We analyzed changes in the number of spikes recorded in the evoked response. **Figure 4A** illustrates these changes in one of the successful experiments. After learning, the spike intensity of the response increased, e.g., more responses composed of doublets, triplets and larger numbers of spikes were observed. The average increase over all successful experiments is illustrated in **Figure 4B**. We also analyzed the response from other responding electrodes, as illustrated in **Figure 4C**. We found that after successful learning the activity of the whole culture network increased significantly.

Changes in spontaneous activity were monitored by 10 min recordings (see Materials and Methods). We calculated the average inter-burst interval, average spikes per burst and burst duration, as shown in **Figure 4D**. For each characteristic we did not find any significant difference between the four phases of the experiment (before the control stimulation, before and after training stimulation, and 4–6 h after the main learning experiments).

## **DISCUSSION**

We applied low-frequency stimulation to a hippocampal culture network with on-line monitoring of the response-to-stimulus ratio (R/S) in open-loop and closed-loop conditions. The key response indicator was defined as the average number of post-stimulus spikes per 10 stimuli in the 40–80 ms time interval. Note that this interval corresponded to a peak in the post-stimulus histogram.

interval. The responses were taken from 100 stimuli in the beginning of the control stimulation, beginning and ending of the training stimulation. Number of evoked spikes in 40–80 ms post-stimulus interval recorded from selected electrode **(B)** and from all recording electrodes **(C)**. The responses were taken from 100 stimuli in the beginning of the control stimulation, beginning and

changes measured before the control stimulation, before and after training stimulation and 4–6 h after the main learning experiments. The quantities of the inter-burst intervals, spikes per burst and burst durations were normalized to those measured in control conditions (before stimulation). The values were compared with the rank-sum test (*p* < 0.05).

We found that learning in a culture network can be achieved using an adaptive, activity-dependent reinforcement condition defined by the response-to-stimulus ratio (R/S) threshold value calculated from the statistics of control (i.e., open-loop) stimulation. The threshold was estimated from the appearance of rare, high-rate responses in control stimulation (e.g., the highest 15% of the R/S values). Such responses may be associated with signal propagation along spontaneously activated and relatively strong synaptic pathways in the culture network. In other words, learning in our experiments means that particular synaptic pathways relative to a particular stimulation electrode became "strengthened" to satisfy the reinforcement condition. In contrast to previous studies, our approach can use electrodes with quite high basal activity in control stimulation, 0 < M(R/S) < 0.5. Note that the total number of such electrodes was quite high, 67 ± 11%, which indicates that the learning protocol can be applied to a rather large number of electrodes. About 50% of trials were successful, which is comparable to earlier studies (Shahaf and Marom, 2001; le Feber et al., 2010).

Note that the R/S threshold in fact defines the reinforcement condition, which is crucial for successful learning. In particular, we found that for lower values of the R/S threshold the learning effect was not achieved at all (**Figure 3D**). This can be explained by the high variability of basal responses in the culture network, which increases the fraction of random over-threshold responses. Such random responses defeat the learning effect, which relies on regular changes in synaptic pathways in the network.

It is believed that the learning effect is associated with structural and functional plasticity of the underlying neuronal networks. In simple terms, synaptic connections are modified by closed-loop stimulation to achieve an adaptive state defined by the reinforcement condition. In earlier studies low-activity electrodes were typically used (Shahaf and Marom, 2001; le Feber et al., 2010). Their activation implied that synaptic connections accompanying the electrodes were strengthened after the stimulation. Our results eventually demonstrated that not only can weak connections between stimulating and recording electrodes be strengthened, but well-functioning synaptic pathways can also be modified for active electrodes.

Previous studies (Shahaf and Marom, 2001; Li et al., 2007) demonstrated that such training was quite selective: only neurons on the trained electrodes increased the number of spikes in the response and hence the R/S value. In our experiments we found some increase of the responses from all electrodes (**Figure 4C**) as well as an increase from the trained (selected) electrode (**Figure 4B**). However, the difference between these increases was not significant, indicating the absence of selectivity. We assume that this happened because the overall activity (mean R/S) and the R/S threshold were higher than in the previous studies. Setting higher reinforcement conditions for reaching the threshold in our learning protocol may require stronger modification of the overall synaptic connectivity (hence lower selectivity) to achieve learning.

Another important question was how long the synaptic changes can be preserved in the network after learning. We checked the response of our six trained cultures after 4–6 h and found that the learning effect was preserved in three of six samples (**Figures 3C,E**). Thus, training stimulation in closed-loop conditions may induce long-term changes in the structure and function of the synaptic connectivity of the culture network. We also found that the spontaneous activity of the trained cultures was relatively stable and did not change significantly after the learning experiments, e.g., we did not find a statistical difference in the characteristics of the spontaneously generated bursts (inter-burst intervals, spikes per burst and burst durations).

## **ACKNOWLEDGMENTS**

This research was supported by the Ministry of Education and Science of Russia, projects No. 8055, 14.B37.21.0927, 14.B37.21.1073, 14.B37.21.1203, 14.132.21.1663, 11.519.11.1003, Grant for Leading Scientists (No. 11.G34.31.0012), by the Russian President Grant No. MK-4602.2013.4 and the MCB Program of the Russian Academy of Sciences.

#### **REFERENCES**

synchronized bursting in developing networks of cortical neurons. *J. Neurosci.* 15, 6834–6845.

with wavelets and superparamagnetic clustering. *Neural Comput.* 16, 1661–1688.


(2005). The effect of training of cultured neuronal networks, can they learn? *Proc. Neural Eng.* 328–331.


for stimulation of dissociated cultures using multi-electrode arrays. *J. Neurosci. Methods* 138, 27–37.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 August 2012; accepted: 18 April 2013; published online: 24 May 2013.*

*Citation: Pimashkin A, Gladkov A, Mukhina I and Kazantsev V (2013) Adaptive enhancement of learning protocol in hippocampal cultured networks grown on multielectrode arrays. Front. Neural Circuits 7:87. doi: 10.3389/fncir.2013.00087*

*Copyright © 2013 Pimashkin, Gladkov, Mukhina and Kazantsev. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Restoration of upper limb movement via artificial corticospinal and musculospinal connections in a monkey with spinal cord injury

## *Yukio Nishimura1,2,3\*†, Steve I. Perlmutter 1,2 and Eberhard E. Fetz 1,2\**

*<sup>1</sup> Department of Physiology & Biophysics, University of Washington, Seattle, WA, USA*

*<sup>2</sup> Washington National Primate Research Center, University of Washington, Seattle, WA, USA*

*<sup>3</sup> Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Tokyo, Japan*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Liang Guo, Massachusetts Institute of Technology, USA*

*Abhishek Prasad, University of Miami, USA*

#### *\*Correspondence:*

*Yukio Nishimura, Department of Developmental Physiology, National Institute for Physiological Sciences, National Institutes of Natural Sciences, 38 Nishigonaka, Myoudaiji, Okazaki, Aichi 444-858, Japan. e-mail: yukio@nips.ac.jp; Eberhard E. Fetz, Department of Physiology & Biophysics, University of Washington, 1705 NE Pacific Street, HSB G424, Box 357290, Seattle, WA 98195, USA. e-mail: fetz@uw.edu*

#### *†Present address:*

*Yukio Nishimura, Department of Developmental Physiology, National Institute for Physiological Sciences, National Institutes of Natural Sciences, Okazaki, Japan.*

Functional loss of limb control in individuals with spinal cord injury or stroke can be caused by interruption of corticospinal pathways, although the neural circuits located above and below the lesion remain functional. An artificial neural connection that bridges the lost pathway and connects cortical to spinal circuits has potential to ameliorate the functional loss. We investigated the effects of introducing novel artificial neural connections in a paretic monkey that had a unilateral spinal cord lesion at the C2 level. The first application bridged the impaired spinal lesion. This allowed the monkey to drive the spinal stimulation through volitionally controlled power of high-gamma activity in either the premotor or motor cortex, and thereby to acquire a force-matching target. The second application created an artificial recurrent connection from a paretic agonist muscle to a spinal site, allowing muscle-controlled spinal stimulation to boost on-going activity in the muscle. These results suggest that artificial neural connections can compensate for interrupted descending pathways and promote volitional control of upper limb movement after damage of descending pathways such as spinal cord injury or stroke.

**Keywords: brain–computer interface, artificial neural connection, hand, spinal cord injury, local field potential, muscle, spinal cord, monkey**

## **INTRODUCTION**

Functional loss of limb control in individuals with spinal cord injury or stroke can involve interruption of descending pathways to spinal networks, although the neural circuits located above and below the lesion retain their function. An artificial neural connection that bridges the lost pathway has potential to compensate for the functional loss. Recent studies showed that monkeys could use cortical activity to control functional electrical stimulation (FES) in muscles transiently paralyzed by nerve block (Moritz et al., 2008; Pohlmeyer et al., 2009; Ethier et al., 2012). However, restoring coordinated movement of paralyzed limbs with peripheral FES remains problematic (Popovic et al., 2002). Stimulation of peripheral nerve or muscle often evokes movement about only a single joint and recruits the largest, most fatigable motor units first. Spinal microstimulation offers an alternative method to produce coordinated movement and more natural recruitment of motor units (Mushahwar and Horch, 2000; Mushahwar et al., 2000; Mussa-Ivaldi and Bizzi, 2000; Saigal et al., 2004; Moritz et al., 2007). In anesthetized animals, current can be delivered to spinal sites to produce coordinated patterns of muscle contraction (Zimmermann et al., 2011).

Several studies have demonstrated that multichannel spike signals recorded with intracortical electrode arrays can be used to estimate arm movements (Kennedy et al., 2000; Wessberg et al., 2000; Serruya et al., 2002; Carmena et al., 2003; Choi et al., 2009; Vargas-Irwin et al., 2010) and muscle activity (Morrow and Miller, 2003; Santucci et al., 2005; Koike et al., 2006; Pohlmeyer et al., 2007; Schieber and Rivlis, 2007). Although recordings of cell assemblies by intracortical electrodes can provide a rich repertoire of signals, their limitations include signal deterioration due to glial scarring (Polikov et al., 2005), potential displacement from the recording site (Leuthardt et al., 2004) and invasive recording techniques. Chronically implanted electrode arrays typically lose the ability to record cell spikes after several years (Krüger et al., 2010; Simeral et al., 2011). Reliable spike recording is a challenge for the long durations required for clinical applications. Movement parameters can also be decoded from local field potentials (LFPs; Zhuang et al., 2010; Flint et al., 2012) and the electrocorticogram

"fncir-07-00057" — 2013/4/9 — 21:42 — page 1 — #1

(Schalk et al., 2007; Sanchez et al., 2008; Miller et al., 2009; Chao et al., 2010; Shin et al., 2012) in motor-related areas, potentially offering more stable signals that represent the activities of many neurons near the electrode. Thus, instead of relying on cell spikes recorded with intracortical electrodes, it is possible to use cortical field potentials or muscle activity as a surrogate of cortical cell activity.

Here we describe a case study in which an awake monkey with spinal cord injury could volitionally control the paretic upper limb through artificial neural connections using LFPs in motor cortex or activity of muscles to trigger stimulation of a spinal site appropriate to restore goal-directed movement of the affected arm.

## **MATERIALS AND METHODS**

Experiments were performed with a male *Macaca nemestrina* monkey (4 years old, weight 5.5 kg). The experiments were approved by the Institutional Animal Care and Use Committee (IACUC) at the University of Washington and all procedures conformed to the National Institutes of Health Guide for the Care and Use of Laboratory Animals.

#### **SURGERIES**

All implant surgeries were performed using sterile techniques while the animal was anesthetized using 1–1.5% sevoflurane. Dexamethasone, cephalexin, and ketoprofen were administered preoperatively and buprenorphine was given postoperatively.

## *Cortical implants*

Silver electrode wires (0.1 mm diameter, ∼50 kΩ at 1 kHz) were chronically implanted, after making small incisions in the dura, in the digit, wrist and arm areas of primary motor cortex (M1) and the arm area of the dorsal aspect of premotor cortex (PMd) in the left hemisphere (contralateral to the spinal lesion, **Figure 1A**). The incisions of the dura mater were sutured closed. Small titanium-steel screws were attached to the skull as anchors. A stainless steel head-post was mounted on the skull for head fixation. The cortical electrodes and the head-post chamber were anchored to the screws with acrylic cement.

## *Surgery for EMG recording*

Initially, electromyographic (EMG) activity was measured with electrodes surgically implanted in 16 arm and hand muscles, identified by anatomical features and by movements evoked by trains of low-intensity stimulation. Bipolar, multistranded stainless steel wires (Cooner Wire, Chatsworth, CA, USA) were sutured into each muscle and wires were routed subcutaneously to a connector on the animal's back. A jacket worn by the monkey prevented access to the back connector between recording sessions. After these electrodes were broken by the monkey, additional wires were implanted transcutaneously for subsequent EMG recordings.

## *Surgery for spinal implant and spinal cord lesion*

We made two separate unilateral laminectomies on the right side in the same surgery to prepare to record the activity of spinal neurons during behavior. The laminae and dorsal spinous processes of the C2–C3 and C5–C7 vertebrae were removed. A chamber was implanted over the rostral laminectomy. Stimulus electrodes were implanted at the caudal site. After recovery from surgery the monkey exhibited an upper arm hemiparesis, including inability to control the digits independently, and the recordings were not performed. The deficit remained throughout the 3 months during which these experiments were performed. Post-mortem histology revealed that a spinal cord lesion was inadvertently created around C2 to C3 during surgery, perhaps due to a contusion on the spinal surface or hemorrhage (**Figure 1D**).

For the stimulus electrode, the dura mater and arachnoid under the C5–C7 vertebrae were removed. Eleven polyurethane-coated, platinum–iridium wires (diameter 30 μm; impedance 200–600 kΩ at 1 kHz) were inserted 2.5–4 mm into the lower-cervical spinal cord, targeting the ventral horn where hand motoneurons are located (Jenny and Inukai, 1983; Chiken et al., 2001). Penetration depth was determined by making a sharp bend in the microwire at the appropriate length. A second bend several millimeters more proximal provided strain relief and allowed the wires to float on the cord (Mushahwar et al., 2000). The microwires were bonded with cyanoacrylate glue to the spinal surface at each penetration point. The wires were routed into a silicone tube, which was glued with dental acrylic to bone screws placed in the lateral masses and T1 dorsal process, and routed through the skin to a connector. The spinal cord was covered with the subcutaneous fascia and gelfoam. The laminectomy was closed with acrylic cement. The skin and underlying soft tissue were then sutured closed.

#### **TORQUE-TRACKING TASK**

Prior to surgery the monkey had been trained to perform a torque-tracking task (Maier et al., 1998). The monkey controlled the one-dimensional position of a cursor on a video monitor with isometric flexion and extension wrist torques, and acquired targets displayed on the screen. The monkey was required to maintain torque within each target for 0.5–1.0 s to receive a juice reward. Targets remained on the screen until the hold criterion was met, followed by presentation of the next target, either immediately or after a variable reward period.

#### **INTRACORTICAL MICROSTIMULATION**

A few days after the cortical implant, movements evoked by intracortical microstimulation (ICMS) through the implanted electrodes were documented by visual observation with the monkey awake. Trains of 10 constant-current, biphasic square-wave pulses with 0.2-ms durations at 300 Hz evoked movements at thresholds of 20–120 μA (**Figure 1B**). The ICMS map was re-established 14 days after the spinal lesion with the monkey awake. After the spinal cord injury, trains of ICMS at 450 μA were ineffective at most sites (**Figure 1C**).

#### **SPINAL STIMULATION PROCEDURE**

Intraspinal stimuli consisting of constant-current, biphasic square-wave pulses with 0.2-ms durations were delivered through the spinal microwires. In general, stimulation (10–700 μA) was delivered by a single electrode. The output effects evoked from each spinal site were documented with stimulus-triggered averages of rectified EMG (St-TA) during task performance. Current pulses were delivered at a low rate (10–20 Hz) to avoid temporal summation. Stimulus-evoked facilitation and suppression of EMG were identified as consistent features in the St-TAs above or below, respectively, 2 standard deviations (SD) of baseline. Baseline was defined as the interval from 30 to 0 ms preceding the trigger pulse. The mean percent increase (MPI) measured the average values between onset and offset of the feature minus baseline, divided by baseline. Based on post-stimulus effects in St-TAs, we chose a single electrode and current for the artificial neural connection paradigm.
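The feature detection and MPI calculation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name, the restriction to facilitation (suppression would mirror it with the lower bound), the choice of the first contiguous supra-threshold run as the feature, and the scaling of MPI to percent are all assumptions.

```python
from statistics import mean, stdev


def facilitation_feature(avg_emg, times_ms, n_sd=2.0):
    """Detect a post-stimulus facilitation in a stimulus-triggered average.

    avg_emg: averaged rectified EMG samples.
    times_ms: sample times relative to the trigger pulse (ms).
    Baseline is the 30 ms preceding the trigger, as in the text.
    Returns onset, offset and MPI (in percent), or None if no feature is found.
    """
    baseline = [v for v, t in zip(avg_emg, times_ms) if -30 <= t < 0]
    base_mean = mean(baseline)
    upper = base_mean + n_sd * stdev(baseline)

    feature = []  # first contiguous run of post-stimulus samples above `upper`
    for v, t in zip(avg_emg, times_ms):
        if t < 0:
            continue
        if v > upper:
            feature.append((t, v))
        elif feature:
            break
    if not feature:
        return None

    feature_mean = mean([v for _, v in feature])
    # MPI: (feature mean - baseline) / baseline, expressed in percent (assumed scaling)
    mpi = (feature_mean - base_mean) / base_mean * 100.0
    return {"onset_ms": feature[0][0], "offset_ms": feature[-1][0], "mpi_percent": mpi}
```

A baseline mean of zero would make the ratio undefined, so in practice the rectified EMG needs a non-zero baseline level for this measure to be meaningful.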

#### **BEHAVIORAL TASK WITH ARTIFICIAL NEURAL CONNECTION**

Prior to establishing an artificial neural connection, the monkey learned to control a computer cursor with brain activity or muscle activity in separate operant conditioning sessions. Rack-mounted instrumentation was programmed to compile a running average (200 ms) of either EMG or rectified, high-gamma (90–160 Hz) LFP activity to create a continuous signal that controlled the one-dimensional position of a cursor on a video monitor. Targets that indicated high- or low-amplitude LFP or EMG were randomly presented on the screen. Targets remained on the screen until the monkey held the cursor within each target for 0.5–1.0 s to receive a juice reward.
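As an illustration of how such a control signal might be computed in software, the sketch below rectifies a signal and applies a causal 200-ms running average. It assumes the input has already been band-pass filtered (e.g., to 90–160 Hz for LFP); the function names, the sampling-rate parameter and the linear mapping to cursor position are hypothetical and do not describe the rack-mounted instrumentation used in the study.

```python
def running_average_rectified(samples, fs_hz, window_ms=200.0):
    """Rectify the signal and smooth it with a causal moving average.

    samples: band-passed LFP or raw EMG samples (filtering assumed done upstream).
    fs_hz: sampling rate in Hz (hypothetical parameter).
    Early samples are averaged over however many samples are available so far.
    """
    n = max(1, round(fs_hz * window_ms / 1000.0))
    rectified = [abs(s) for s in samples]
    out, acc = [], 0.0
    for i, v in enumerate(rectified):
        acc += v
        if i >= n:
            acc -= rectified[i - n]  # drop the sample leaving the window
        out.append(acc / min(i + 1, n))
    return out


def cursor_position(level, low, high):
    """Map a control-signal level linearly onto [0, 1], clipped (assumed mapping)."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (level - low) / (high - low)))
```

A causal window is used so that the cursor depends only on past samples, which is what an on-line controller requires.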

After a few sessions of this preliminary task, an artificial connection to the spinal cord was established in subsequent sessions. Instead of directly controlling cursor position, LFP or EMG activity triggered spinal stimuli. Cursor position was now driven by isometric torque produced about the wrist. For each session, either flexion or extension torque controlled cursor position, depending on the torque produced by the spinal stimulation used in that session. The monkey learned to control LFP or EMG activity to acquire targets displayed on the monitor to receive a juice reward as described above (Maier et al., 1998). The direction of cursor movement was matched in all sessions; i.e., increases in LFP or EMG activity in preliminary sessions moved the cursor in the same direction as increases in torque during the sessions with an artificial connection.

#### **ARTIFICIAL CORTICOSPINAL CONNECTION**

In several preliminary sessions, the monkey controlled the cursor with high-frequency gamma (90–160 Hz) LFP activity recorded in either M1 or PMd. Then an artificial corticospinal connection (ACSC) was established, using the same signal that had been used in the preliminary sessions to trigger trains of spinal stimulation and thus bridge the impaired corticospinal connection. Rack-mounted instrumentation was programmed to compile a running average of rectified LFP activity in the high-gamma band and to trigger delivery of intraspinal microstimulation at 300 Hz to a single electrode whenever the LFP exceeded a threshold determined by the experimenter. Prior to each session, we determined the background noise level of the high-gamma band signal and set the threshold so that no stimulation was delivered when the monkey was at rest. St-TAs of EMG guided the choice of a single electrode in the spinal cord, the stimulus current, and the position of the cursor on the screen for the artificial neural connection. A few sessions of ACSC were tested within a day, but different pairs of cortical and spinal sites were chosen.

#### **ARTIFICIAL MUSCULOSPINAL CONNECTION**

In several preliminary sessions, the monkey controlled the cursor with EMG activity recorded from a single forearm muscle. Even after the spinal cord lesion, the monkey could produce some muscle activity in the paretic hand. Then, an artificial musculospinal connection (AMSC) was established using the same EMG to trigger trains of spinal stimulation. Rack-mounted instrumentation was programmed to compile a running average of muscle activity and trigger a train of intraspinal microstimulation at 300 Hz during the time that EMG exceeded a threshold determined by the experimenter. Prior to each session, we determined the background noise level of the EMG signal and set the threshold so that no stimulation was delivered when the monkey was at rest. The stimulus-evoked EMG was insufficient to cross threshold, so the monkey could terminate stimulation by terminating his volitional EMG activity. Sessions of AMSC were tested on different days than ACSC. A few sessions of AMSC were tested within a day, but different pairs of muscles and spinal sites were chosen.
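The threshold-gated triggering used for both artificial connections can be sketched as below. The sketch is illustrative only: in the study the threshold was set by the experimenter from the resting noise level, whereas the `margin` factor, the function names and the millisecond sample grid here are hypothetical.

```python
def threshold_from_rest(rest_levels, margin=1.1):
    """Place the threshold just above the largest level seen at rest.

    `margin` is a hypothetical safety factor; the study states only that the
    threshold was set so that no stimulation occurred at rest.
    """
    return max(rest_levels) * margin


def stimulation_epochs(levels, threshold, dt_ms):
    """Return (start_ms, stop_ms) epochs during which the level exceeds threshold."""
    epochs, start = [], None
    for i, level in enumerate(levels):
        if level > threshold and start is None:
            start = i * dt_ms
        elif level <= threshold and start is not None:
            epochs.append((start, i * dt_ms))
            start = None
    if start is not None:  # signal still above threshold at the end of the record
        epochs.append((start, len(levels) * dt_ms))
    return epochs


def pulses_in_epoch(start_ms, stop_ms, rate_hz=300):
    """Count pulses at rate_hz within the epoch, with the first pulse at onset."""
    duration_ms = stop_ms - start_ms
    if duration_ms <= 0:
        return 0
    return (duration_ms * rate_hz + 999) // 1000  # integer ceiling of duration * rate
```

Because stimulation stops as soon as the smoothed signal falls below threshold, the animal can terminate the train by relaxing the controlling LFP or EMG activity, as described in the text.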

### **HISTOLOGICAL PROCEDURES**

At the end of the experiments, the monkey was deeply anesthetized with an overdose of sodium pentobarbital (50–75 mg/kg, i.v.) and perfused transcardially with 0.1 M phosphate-buffered saline (PBS, pH 7.3), followed by 4% paraformaldehyde in 0.1 M PBS (pH 7.3). The spinal cord was removed immediately and saturated with fresh PBS containing, successively, 10, 20, and 30% sucrose. Serial sections 50 μm thick were cut on a freezing microtome. Sections processed for Nissl staining with 1% cresyl violet were used to assess the extent of the lesion and the location of electrode tracks.

#### **STATISTICAL ANALYSIS**

To determine the statistical difference in task performance between "artificial neural connection" and "catch" trials, we used the unpaired *t*-test. One-way analysis of variance (ANOVA) with repeated measures was performed to determine significant differences in task performance among M1, PMd, and M1 and PMd. *Post hoc* multiple comparisons were conducted using the Bonferroni test. The statistical significance level was set at *p* < 0.05. All pooled values are reported as mean ± SD.

## **RESULTS**

#### **EXTENT OF SPINAL LESION AND HISTOLOGIC EVIDENCE OF ELECTRODE LOCATION IN SPINAL CORD**

**Figure 1D** shows the spinal cord section with the maximum extent of the lesion, located at the C2 level, as evidenced by gliosis. The dorsolateral region on the side ipsilateral to the lesion was severely deformed because of mechanical damage and degeneration of axons. The lesion area covered most of the right dorsolateral funiculus. The dorsal funiculi were partially damaged on both sides, but the ventrolateral funiculi were preserved on both sides. The lesion extended from the caudal part of C1 to most of C2. Thus, the lesion interrupted most of the corticospinal and rubrospinal tracts but preserved the reticulospinal tract and some ascending tracts.

We intended to position the electrode tips in the ventral horn and intermediate zone where motoneurons and premotoneuronal interneurons are located. We found two electrode tracks in the sections. **Figure 1E** shows one electrode track at the level of C6. The electrode tip was located in the ventral horn, as shown in **Figures 1E a,b**. The second recovered electrode was located in the medial intermediate zone.

## **FUNCTIONAL DEFICIT**

The monkey's ability to independently control movement of digits, such as for precision grip, exhibited deficits shortly after the lesion and did not recover throughout the 3-month experimental period. Power grip recovered gradually 5–7 weeks after the lesion, consistent with a previous study (Alstermark et al., 2011).

The cortical somatotopic maps before and after lesion are shown in **Figures 1B,C**. Movements could be evoked from only two of five previously effective sites in M1, and only with higher currents than before the injury. The PMd was even more affected since no movements at all could be elicited from two previously effective sites after the lesion. Thus the extent and excitability of the upper limb representation in motor cortex decreased substantially after the spinal lesion, consistent with a previous study (Schmidlin et al., 2004).

### **SPINAL STIMULATION**

To document the muscle responses evoked by intraspinal stimuli we compiled St-TAs of rectified EMG during performance of the wrist flexion and extension task. **Figure 2** shows the St-TAs for a spinal site located caudal to the lesion. Spinal stimulation below the lesion evoked facilitation or suppression effects in multiple muscles, as found for effects evoked from the intact spinal cord (Moritz et al., 2007). Furthermore, spinal stimulation activated synergistic muscle groups. For example, stimuli at the site in **Figure 2** strongly facilitated finger extensor muscles [e.g., extensor digitorum 4 and 5 (ED45) and extensor digitorum communis (EDC)] and suppressed antagonist flexor muscles. Facilitation or suppression effects were evoked in 45.6% of the 12–16 recorded muscles from all spinal sites. Based on the responses in St-TAs, we chose a single electrode and current for the artificial neural connection paradigm. For all electrodes, stimulus effects gradually deteriorated over 3 months, presumably due to electrode encapsulation by the physiological reaction (Mushahwar et al., 2000). Finally, the whole spinal implant including wire electrodes with dental acrylic and bone screws sloughed off the vertebrae after 3 months.

## **ARTIFICIAL CORTICOSPINAL CONNECTION**

To bridge the spinal cord lesion, high-gamma LFP activity in either M1 or PMd was used to trigger trains of spinal stimulation (**Figure 3A**). **Figure 3B** shows a typical example of intraspinal stimulation controlled by the LFP signal recorded from the digit area of M1 (site identified by arrow in **Figure 1A**). During the period of FES (green bar), the monkey was able to trigger and stop stimulation volitionally, thereby repeatedly acquiring the targets. To document that the LFP-controlled intraspinal stimulation was necessary, the stimulation was briefly turned off during "catch trials" (white bar in **Figure 3B**) in five sessions. The monkey continued to make efforts to acquire the target in the catch trials, as evidenced by the above-chance increases in high-gamma activity, but did not succeed in acquiring the targets. We applied such LFP-controlled intraspinal stimulation in 12 different sessions (duration of sessions: 8–47 min; range of trial number within each session: 46–245 trials), using 11 different pairs of cortical and spinal sites, summarized in **Figure 4A**. The average task performance in LFP-controlled intraspinal stimulation trials was significantly higher than that in catch trials (compare green and black bars in **Figure 4A**). Task performance was comparable with LFP recorded from M1 and PMd (cf. blue and red bars in **Figure 4A**). Task performance using LFP from cortical sites from which movements could and could not be evoked after injury was similar. We also examined the task performance during the transition from the operant conditioning of LFP to ACSC. **Figure 4B** shows the time course of task performance in the operant conditioning session (before time zero in **Figure 4B**) and the subsequent ACSC session (after time zero in **Figure 4B**) using LFP from the same cortical electrode. The monkey quickly learned after a few sessions to modulate the power of LFP to acquire targets. Switching from the operant control sessions to the ACSC session was very smooth. The task performance in the ACSC session was sustained at nearly the same level as in the operant conditioning session.

extension. **(B)** Muscle responses evoked by a single pulse at 90 μA. The vertical scale bar at right indicates mean percent increase (MPI) over baseline. communis (EDC), extensor carpi radialis (ECR), brachioradialis (BR), biceps brachii (BB), pectoralis (PEC), and deltoid (DEL).

#### **VOLITIONAL BOOSTING OF MUSCLE ACTIVITY BY AN ARTIFICIAL MUSCULOSPINAL CONNECTION**

Although the spinal cord lesion produced a severe deficit in forearm movements, the monkey could still produce weak muscle activity. To investigate whether an artificial recurrent connection could boost the activity of a muscle, we used EMG of the paretic muscles to trigger spinal stimulation at a site that produced a contraction of the same muscle (**Figure 5A**). **Figure 5B** shows a typical example of the muscle-controlled intraspinal stimulation using EMG of the paretic extensor carpi ulnaris (ECU) wrist extensor muscle. As shown during the period of FES (green bar), the monkey was able to sustain the EMG burst and torque to acquire the target. During the catch trial, the monkey made a few unsuccessful attempts to produce wrist torque, as seen in the EMG and torque, but was unable to acquire the target. Thus, the muscle-controlled intraspinal stimulation effectively boosted on-going muscle activity of the paretic agonist. We applied similar muscle-controlled FES in 10 different sessions (duration of sessions: 12–33 min; range of trial number within each session: 42–180 trials), using five different pairs of muscle and spinal sites. The average task performance in muscle-controlled intraspinal stimulation trials was significantly higher than that in catch trials (compare green and black bars in **Figure 6A**). **Figure 6B** shows the time course of task performance in the session of operant conditioning of EMG activity and subsequent AMSC session using EMG from the same muscle. The task performance in AMSC sessions was sustained at nearly the same level as in the operant conditioning session.

## **DISCUSSION**

This case report demonstrates that LFP- or EMG-controlled stimulation in a spinal site could be used to produce volitionally controlled functional wrist torque in a paretic monkey with a spinal cord lesion rostral to the stimulation site. The monkey could volitionally control brain and muscle activities to produce synergistic muscle responses with intraspinal stimulation caudal to the lesion. These results suggest that muscle- or LFP-controlled FES could compensate for the interrupted descending pathways and restore volitional control of functional movement in the upper limb after spinal cord injury or stroke.

The fact that stimulation in a spinal site caudal to a spinal cord lesion can evoke synergistic muscle responses suggests that activity-dependent spinal stimulation may be a promising target

"fncir-07-00057" — 2013/4/9 — 21:42 — page 5 — #5

**FIGURE 3 | Brain-controlled intraspinal stimulation below the lesion. (A)** Schematic shows local field potential (LFP) in motor cortex gating trains of electrical stimulation (300 Hz) to a spinal site below the lesion. The switch in the recurrent loop was opened for catch trials. **(B)** Four successful trials with the artificial corticospinal connection (ACSC, green) and one catch trial (white). During the catch trial, the monkey made several unsuccessful attempts to produce wrist torque, as seen in the EMG and torque. The blue rectangles indicate duration and force range of target. The pink vertical bars indicate duration of electrical stimulation in the spinal site. The red line in second trace represents the threshold for spinal stimulation. From top, raw LFP in motor cortex, rectified and smoothed high-gamma LFP (90–160 Hz), EMG from four muscles (abbreviations as in **Figure 2**), and wrist torque. Arrows indicate times of successful task completion and reward.

for neuroprosthetics that can restore movements after spinal cord injury. In contrast to FES of muscles, spinal microwires are subject to less mechanical fatigue than wires implanted peripherally and require lower stimulus currents to evoke movements. Intraspinal stimulation also produces more natural, graded recruitment of motor units than muscle or nerve stimulation (Mushahwar and Horch, 1998). We found that spinal stimulation caudal to the lesion evoked facilitation or suppression effects in multiple muscles. Intraspinal stimulation is known to activate many afferent fibers of passage (Gaunt et al., 2006), and probably excites motoneurons transsynaptically by activating a sufficient number of their inputs, such as propriospinal, corticospinal, and/or afferent axons. Fibers of passage have lower activation thresholds than cell bodies and are thus recruited at lower stimulus currents (Gustafsson and Jankowska, 1976). Afferent axons directly excite synergist muscles and inhibit antagonist muscles via inhibitory spinal interneurons. Since we used a single signal, derived from either cerebral cortex or muscle, to control stimulation, the degree of movement control demonstrated here remains limited. Extending this strategy to control more natural and complex movements would require additional input signals and output spinal sites. Compared with FES in muscle, the activation of functional muscle synergies from single intraspinal sites could significantly reduce the number of implanted electrodes as well as the number of independent control signals required from a neuroprosthetic system.

Task performance with LFP recorded in M1 was comparable with performance using LFP from PMd or EMG from muscle. Furthermore, LFP from any cortical site could control spinal stimulation-evoked wrist movements, regardless of whether stimulation of the cortical site evoked wrist movements or not (cf. **Figure 1C**). Previous biofeedback studies have shown that cells in motor (Fetz and Baker, 1973; Fetz and Finocchio, 1975; Moritz et al., 2008) or somatosensory (Moritz and Fetz, 2011) cortex with no discernable relation to muscles can be volitionally modulated after brief practice sessions. We used a similar operant conditioning paradigm with biofeedback for eliciting cortical LFP or EMG to trigger spinal stimuli. The level of performance in the operant

**(ACSC). (A)** Average task performance with the ACSC and during catch trials. Error bars represent standard deviation. **(B)** Time course of task performance. Before time zero the monkey was required to control the cursor with LFP activity. After time zero the task involved ACSC, using the same cortical electrode in digit area of M1.

"fncir-07-00057" — 2013/4/9 — 21:42 — page 6 — #6

**FIGURE 5 | Muscle-controlled spinal cord stimulation. (A)** Schematic shows EMG activity gating a train of stimuli to a spinal site below the lesion. **(B)** Five successful trials with AMSC (green) and unsuccessful catch trials (white). During the catch trial, the monkey made several unsuccessful attempts to produce wrist torque, as seen in the EMG and torque. The blue rectangles indicate duration and force range of target. The pink bars indicate duration of electrical stimulation in the spinal site. The red line in top row represents the threshold for gating spinal stimulation. The upper and lower traces are the EMG from ECR and wrist torque generated by the monkey, during stimulation (AMSC, in green) or without stimulation (Catch, in white). Arrows indicate times of successful task completion and reward.

**connection (AMSC). (A)** Average task performance for AMSC and catch trials. Error bars represent standard deviation. **(B)** Time course of task

cursor with EMG activity. After time zero the monkey controlled wrist torque via spinal stimulation triggered from the same muscle with the AMSC.

"fncir-07-00057" — 2013/4/9 — 21:42 — page 7 — #7

conditioning task was identical to that with the ACSC and AMSC artificial neural connections. Thus, an arbitrary cortical or muscle signal could be brought under volitional control using biofeedback, to substantially expand the sources of control signals for brain–computer interfaces.

Implementation of the artificial connections with a portable bidirectional neural interface will enable adaptive learning over much longer times and under more varied conditions (Jackson et al., 2006b; Nishimura et al., 2010). The autonomous 'Neurochip' system can discriminate brain or muscle activity and deliver stimulation during free behavior (Zanos et al., 2011). Such autonomous low-power circuits could allow subjects to practice continuously with an artificial connection, without requiring complex decoding algorithms or external devices such as robotic arms. Further development of such direct control of a paretic extremity may lead to implantable devices that could help restore volitional movements to individuals with impaired motor control. Furthermore, recent evidence suggests that continuous activity-dependent stimulation promotes plasticity in motor cortex (Jackson et al., 2006a) and corticospinal connections (Fetz et al., 2010). Thus, activity-dependent stimulation during free behavior may produce both adaptive learning to exploit artificial connections (Nishimura et al., 2010) and Hebbian strengthening of spared pathways after neural damage in descending pathways (Fetz et al., 2010). Furthermore, long-term exposure to artificial neural connections could induce reorganization of cortical and spinal circuitry and facilitate functional recovery.

## **REFERENCES**


of grasp following paralysis through brain-controlled stimulation of muscles. *Nature* 485, 368–371.


In conclusion, this study demonstrates that artificial neural connections that bridge impaired pathways can ameliorate functional loss. Closed-loop control with intraspinal microstimulation driven by brain or muscle activity could control synergistic muscle activities in the upper limb in a monkey with spinal cord injury. The success of our protocol suggests that neurorehabilitative treatment could exploit similar paradigms for restoring volitional control of the extremity for individuals with spinal cord injury or stroke.

## **CONTRIBUTIONS**

Yukio Nishimura and Steve I. Perlmutter conducted the experiments. Yukio Nishimura analyzed the data. Yukio Nishimura, Steve I. Perlmutter, and Eberhard E. Fetz conceived the study and wrote the manuscript.

## **ACKNOWLEDGMENTS**

We thank A. Price, R. Robinson, and J. Skiver-Thompson for technical help, T. Eto for histology and L. Shupe for programming. The work was supported by grants from the National Institutes of Health NS 12542, Christopher & Dana Reeve Foundation, Life Sciences Discovery Fund, and the W. M. Keck Foundation for E. F., the National Institutes of Health NS 40867 for Steve I. Perlmutter and Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency for Yukio Nishimura.

than local alpha-motoneuron responses. *J. Neurophysiol.* 96, 2995– 3005.


"fncir-07-00057" — 2013/4/9 — 21:42 — page 8 — #8


forearm muscles. *PLoS ONE* 4:e5924. doi: 10.1371/journal.pone.0005924


network of diverse motor cortex neurons. *J. Neurophysiol.* 97, 70–82.


E. (2011). The Neurochip-2: an autonomous head-fixed computer for recording and stimulating in freely behaving monkeys. *IEEE Trans. Neural Syst. Rehabil. Eng.* 19, 427–435.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; paper pending published: 15 January 2013; accepted: 13 March 2013; published online: 11 April 2013.*

*Citation: Nishimura Y, Perlmutter SI, and Fetz EE (2013) Restoration of upper limb movement via artificial corticospinal and musculospinal connections in a monkey with spinal cord injury. Front. Neural Circuits 7:57. doi: 10.3389/fncir.2013.00057*

*Copyright © 2013 Nishimura, Perlmutter, and Fetz. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

"fncir-07-00057" — 2013/4/9 — 21:42 — page 9 — #9

## Closing the loop in primate prefrontal cortex: inter-laminar processing

#### *Ioan Opris<sup>1</sup>\*, Joshua L. Fuqua<sup>1</sup>, Peter F. Huettl<sup>2</sup>, Greg A. Gerhardt<sup>2</sup>, Theodore W. Berger<sup>3</sup>, Robert E. Hampson<sup>1</sup> and Sam A. Deadwyler<sup>1</sup>\**

*<sup>1</sup> Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC, USA*

*<sup>2</sup> Department of Anatomy and Neurobiology, University of Kentucky, Lexington, KY, USA*

*<sup>3</sup> Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Manuel Casanova, University of Louisville, USA*

*Randy M. Bruno, Columbia University, USA*

#### *\*Correspondence:*

*Ioan Opris, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC, USA. e-mail: ioopris@wfubmc.edu*

*Sam A. Deadwyler, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA. e-mail: sdeadwyl@wfubmc.edu*

Prefrontal cortical (PFC) activity in the primate brain emerging from minicolumnar microcircuits plays a critical role in cognitive processes dealing with executive control of behavior. However, the specific operations of columnar laminar processing in PFC are not completely understood. Here we show, via implementation of unique microanatomical recording and stimulating arrays, that minicolumns in PFC are involved in the executive control of behavior in rhesus macaque nonhuman primates (NHPs) performing a delayed-match-to-sample (DMS) task. PFC neurons demonstrate functional interactions between pairs of putative pyramidal cells within specified cortical layers via anatomically oriented minicolumns. Results reveal target-specific, spatially tuned firing between inter-laminar (layer 2/3 and layer 5) pairs of neurons participating in the gating of information during the decision making phase of the task, with differential correlations between activity in layer 2/3 and layer 5 in the integration of spatial vs. object-specific information for correct task performance. Such inter-laminar processing was exploited by the interfacing of an online model which delivered stimulation to layer 5 locations in a pattern associated with successful performance, thereby closing the columnar loop externally in a manner that mimicked normal processing in the same task. These unique technologies demonstrate that PFC neurons encode and process information via minicolumns, which provides a closed loop form of "executive function"; hence, disruption of such inter-laminar processing could form the basis for cognitive dysfunction in primate brain.

**Keywords: prefrontal cortex, inter-laminar correlated firing, nonhuman primates, columnar correlates of target selection, columnar correlates of task difficulty, spatial vs. object tuning**

## **INTRODUCTION**

The prefrontal cortex (PFC) with its privileged position at the top of sensory-motor processing hierarchy (Alexander et al., 1986; Fuster, 2001) has been traditionally viewed as the seat of higher cognitive functions such as working memory and executive control of behavior (Fuster and Alexander, 1971; Funahashi et al., 1989; Miller, 2000). According to many theories of cognition, cortical mechanisms of executive function coordinate and control "online" cognitive processes underlying memory storage, behavioral selection and motor planning (Posner and Snyder, 1975; Goldman-Rakic, 1996; Shallice and Burgess, 1996; Miyaki et al., 2000; Miller and Cohen, 2001; Baddeley, 2002; Graybiel, 2008). Prefrontal neural activity in the primate brain that emerges from cortical laminar minicolumns is hypothesized to play a critical role in cognitive processes dealing with working memory and executive control of behavior (Goldman-Rakic, 1996; Mountcastle, 1997; Rao et al., 1999; Miller and Cohen, 2001; Baddeley, 2002; Casanova et al., 2007, 2009).

Cortical minicolumns are vertically oriented "aggregates" of cell bodies that represent the basic anatomic and physiologic microcircuitry of the cerebral cortex (Mountcastle, 2003) and consist of pyramidal cells and several types of GABAergic, inhibitory interneurons (i.e., double-bouquet, basket, and chandelier cells) (Casanova et al., 2002a,b, 2007; Sokhadze et al., 2012). Minicolumns in PFC are interconnected to each other through horizontal "long range" projections in layer 2/3 (Kritzer and Goldman-Rakic, 1995), inter-laminar miniloops (Weiler et al., 2008; Takeuchi et al., 2011) and "reverberatory loops" through projections to the subcortical basal ganglia nuclei and thalamus (Alexander et al., 1986). Such "reverberatory loops" combine incoming signals from thalamus in layer 4 and inputs from cortical horizontal projections in layer 2/3, in order to compare inputs to a threshold criterion triggering an output response under specific conditions.

The ability to make behavioral selections in humans involves attention, target/goal choice, planning and monitoring of actions, and is regarded as a facet of decision making based on sensory evidence, expected costs, and benefits associated with the outcome (Opris and Bruce, 2005; Opris et al., 2005a,b; Heekeren et al., 2008; Pesaran et al., 2008; Resulaj et al., 2009). In order to make optimal selections or decisions, many areas in the primate brain with converging inputs to the supra-granular layers of the PFC are activated (Kritzer and Goldman-Rakic, 1995; Opris et al., 2011; Takeuchi et al., 2011), thus raising the question as to how the PFC processes information required for selection of a particular behavioral response necessary for achieving a functional objective. It has been shown that neurons in PFC recorded from rhesus macaque nonhuman primates (NHPs) demonstrate functional interactions between inter-laminar "cell pairs" synaptically connected via cortical minicolumns (Kritzer and Goldman-Rakic, 1995; Mountcastle, 1997; Buffalo et al., 2011; Opris et al., 2011; Takeuchi et al., 2011) and that these cells coordinate activity required to encode spatial location and select the target location or target features.

In the studies presented here this presumed executive function of PFC minicolumns was examined via custom designed conformal multielectrode arrays (MEAs) implemented to record the firing of inter-laminar cell pairs oriented in cortical "microstrips" in NHPs (Opris et al., 2011). The recording pads on the MEAs matched the dimensions of two interconnected cell layers in PFC (layer 2/3 and layer 5) which allowed simultaneous monitoring of columnar oriented cells in each layer in order to characterize the control of arm movements in a cognitive task requiring working memory and image-based target selection (Deadwyler et al., 2007; Hampson et al., 2011). The results reveal target-specific, spatially tuned firing between columnar oriented pairs of interlaminar PFC neurons, during the decision making and/or motor planning phase of the task (Hampson et al., 2011).

## **METHODS**

All animal procedures were reviewed and approved by the Institutional Animal Care and Use Committee of Wake Forest University, in accordance with U.S. Department of Agriculture, International Association for the Assessment and Accreditation of Laboratory Animal Care, and National Institutes of Health guidelines.

#### **VISUAL DELAYED-MATCH-TO-SAMPLE (DMS) TASK**

The NHPs utilized as subjects in this study (*n* = 4) were trained for at least 2 years to perform a well characterized, custom-designed visual delayed-match-to-sample (DMS) task (Hampson et al., 2011; Opris et al., 2011) shown in **Figure 1A**. Animals were seated in a primate chair with a platform in front of a display screen in which position of the arm on the platform was tracked via a UV-fluorescent reflector affixed to the back of the wrist, illuminated via a 15 W UV lamp, and detected by an LCD camera positioned 30 cm above. Hand position and movement were digitized and displayed as a bright yellow cursor on the screen and horizontal positions of illuminated clip-art targets were computed from the video image using a Plexon Cineplex scanner. The DMS task paradigm is shown in **Figure 1A**. Trials were initiated by the animal placing the cursor inside a yellow 3" circle or square randomly illuminated in one of the nine spatial positions on the screen. The presence of either the circle or square constituted the "Start" signal for the trial and indicated "trial type" with respect to the Match reward contingency on the same trial (**Figure 1A**). Placement of the cursor into the Start signal image produced a trial unique clip-art image randomly displayed in one of eight peripheral screen positions on each trial for 2.0 s, which characterized the "Sample Phase" of the task. Movement of the cursor into the Sample image (Sample Response) blanked the screen and initiated the Delay phase for 10–60 s, randomly selected on each trial. Timeout of the Delay interval initiated the onset of the Match phase of the task (Match phase "onset") in which 2–7 trial unique clip-art images, including the Sample image, were presented on the screen with position selected randomly on each trial.
Placing the cursor into either, (1) the Sample image (*Object* trial) or (2) the same location as the prior Sample Response (*Spatial* trial), during the Match phase constituted the correct "Match Response (MR)" which produced a drop of juice as the reward, delivered via a sipper tube located near the animal's mouth, and blanked the screen for 10 s until the next trial. Placement of the cursor into one of the nonmatch (distracter) images on an *Object* trial, or a different spatial location on the screen during a *Spatial* trial, constituted a MR error that blanked the screen without reward delivery and initiated the 10 s inter-trial interval (ITI). All clip-art images (sample and distracter) were unique for each trial in sessions of 100–150 trials and were chosen from a 10,000 image selection buffer which was updated to replace 20% of the images every month. The four NHPs were trained to overall performance levels of 70–75% correct with respect to the above described DMS task parameters.
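The trial structure and reward contingency described above can be summarized in a short sketch. It is only an illustration under stated assumptions: the `Trial` class, `make_trial`, and `is_correct` are hypothetical names, image IDs are plain integers, the trial type is drawn with equal probability, and on Spatial trials the sketch forces some image to occupy the Sample location (the text does not spell out how that location was populated).

```python
import random
from dataclasses import dataclass


@dataclass
class Trial:
    trial_type: str       # "object" or "spatial"
    sample_image: int     # hypothetical image ID of the Sample
    sample_position: int  # one of eight peripheral screen positions (0-7)
    delay_s: float        # Delay phase duration, 10-60 s
    match_layout: dict    # screen position -> image ID shown in the Match phase


def make_trial(rng, image_pool):
    """Draw one DMS trial: 2-7 Match images (Sample included), random positions."""
    trial_type = rng.choice(["object", "spatial"])   # assumption: equal probability
    n_images = rng.randint(2, 7)
    images = rng.sample(image_pool, n_images)        # images[0] serves as the Sample
    sample_position = rng.randrange(8)
    positions = rng.sample(range(8), n_images)
    if trial_type == "spatial" and sample_position not in positions:
        # Assumption: a distracter is moved so the Sample location is occupied.
        positions[-1] = sample_position
    return Trial(trial_type, images[0], sample_position,
                 rng.uniform(10.0, 60.0), dict(zip(positions, images)))


def is_correct(trial, chosen_position):
    """Apply the Match Response rule for Object vs. Spatial trials."""
    if chosen_position not in trial.match_layout:
        return False
    if trial.trial_type == "object":
        return trial.match_layout[chosen_position] == trial.sample_image
    return chosen_position == trial.sample_position
```

For example, `make_trial(random.Random(1), list(range(10000)))` draws one trial from a hypothetical 10,000-image pool, and `is_correct` returns whether a given cursor position would have been rewarded.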

## **SURGERY**

Animals were surgically prepared with cylinders for attachment of a microelectrode manipulator over the specified brain regions of interest. During surgery animals were anesthetized with ketamine (10 mg/kg), then intubated and maintained with isoflurane (1–2% in oxygen 6 l/min). Recording cylinders (Crist Instruments, Hagerstown, MD) were placed over 20 mm diameter craniotomies for electrode access to stereotaxic coordinates of the Frontal Cortex (25 mm anterior relative to interaural line and 12 mm lateral to midline/vertex) in the caudal region of the Principal Sulcus (**Figure 2A**), the dorsal limb of Arcuate Sulcus in area 8 and the dorsal part of premotor area 6 (Hampson et al., 2011), areas previously shown by PET imaging to become activated during task performance (Hampson et al., 2009). Two titanium posts were secured to the skull for head restraint with titanium steel screws embedded in bone cement. Following surgery, animals were given 0.025 mg/kg buprenorphine for analgesia and penicillin to prevent infection. Recording cylinders were disinfected thrice weekly with Betadine during recovery and daily during recording. Vascular access ports (Norfolk Medical Products, Skokie, IL) for drug infusions were implanted subcutaneously in the mid-scapular region, the end of the catheter threaded subcutaneously, to a femoral incision, inserted into the femoral vein, and threaded for a distance calculated to terminate in the vena cava and flushed daily with 5 ml heparinized saline needed for IV drug administration.

## **ELECTROPHYSIOLOGY: RECORDING AND STIMULATION**

Electrophysiological procedures and analysis utilized the MAP Spike Sorter by Plexon, Inc. (Dallas, TX) for 64 channels

were signaled by presentation of one of the two "Focus" signals into which the animal placed the cursor to start the trial. On *Object* trials (yellow ring) reward was delivered for selection of the same clip-art image to be presented in the Sample phase, when it appeared later in the Match phase of the trial, irrespective of position on the screen. On *Spatial* trials (blue square) reward was delivered in the Match phase for selection of the image in the "spatial location on the screen" in which the image was presented in the Sample phase, irrespective of the clip-art image occupying that position in the Match phase. The sequence of events on both types of trials: (1) presentation of "Focus signal" to initiate the trial with cursor placement into the signal, (2) presentation of the 'Sample' clip-art image requiring cursor movement into the image "Sample Response" (3) initiation of a variable "Delay" interval of 1–60 s with the screen blank, (4) upon timeout of the delay interval the Match phase is initiated in which the Sample image is presented on the screen at random locations accompanied by 1–6 other non-match (distracter) images.

mouth. Placement of the cursor into an inappropriate image or location for = 0.5 s caused the trial to terminate and the screen to blank without reward delivery. The inter-trial interval (ITI) was 10.0 s, and Object and Spatial trials were randomly presented 0.6 and 0.4 percent of trials per session, respectively. **(B)** DMS performance averaged over all animals (mean % correct MRs) for *Object* trials as a function of number of Match phase distracter images (number of images 2–7) and length of delay interval (10–60 s). Asterisks: \**F*(1, 486) = 7.98, *p* < 0.01; \*\**F*(1, 486) = 12.24, *p* < 0.001, ANOVA. **(C)** Behavioral performance averaged over all animals (mean % correct MRs) for *Spatial* trials as a function of length of delay interval (1–20 s) and number of Match phase distracter images (number of images 2–7). Asterisks: \**F*(1, 486) = 7.98, *p* < 0.01, ANOVA.

simultaneous recordings. Custom-designed conformal ceramic MEAs were constructed at the University of Kentucky, Center for Microelectrode Technology (CenMet), Lexington, KY, and consisted of etched platinum pads (**Figure 2B**) for recording multiple single neuron activity (Hampson et al., 2004, 2011). Single extracellular action potentials (**Figure 2B**) were isolated and analyzed with respect to activity on specific recording pads (impedance range 0.5–3.0 MOhms) during different events within DMS trials (**Figure 2C**). The configuration of the MEA (**Figure 2B**) was specially designed to conform to the columnar anatomy of the PFC such that the top four recording pads recorded activity from neurons in the supra-granular layer 2/3 (L2/3) while the lower set of four pads, separated vertically by 1350 µm, simultaneously recorded neuron activity in the infragranular layer 5 (L5) of the PFC (**Figures 2B** and **C**). Recordings from multiple pads in designated locations on the MEAs were analyzed by a nonlinear model previously perfected for assessing and extracting spatiotemporal multineuron firing patterns in PFC using the same MEAs and to deliver task-contingent electrical stimulation to L5 in the same pattern as recorded during correct trial performance (Hampson et al., 2012). Stimulation consisted of 1.0 ms bipolar pulses (50–70 µA) delivered to L5 recording locations following presentation of the Match phase screen and prior to the completion of the MR (**Figure 7**).

#### **ELECTROCHEMICAL RECORDING**

Ceramic MEAs similar to those utilized above for electrophysiological recording were also prepared for electrochemical recording (Burmeister et al., 2004, 2008; Quintero et al., 2007, 2011; Hascup et al., 2008, 2011; Fuqua et al., 2010). The electrochemistry arrays consisted of four recording sites (15 × 333 µm) in two rows, separated by 500 µm, with a 7 cm polyimide shaft for depth positioning. The electrodes were configured to record from Layer 2/3 with the reference in Layer 1. MEAs were dip coated with Nafion®, a fluoropolymer which excludes the passage of anions, thus ensuring that only cations would reach the platinum recording surface. The dorsal ("sentinel" or reference) recording sites were coated with bovine serum albumin (BSA) plus glutaraldehyde; ventral recording sites were coated with glutamate oxidase (GluOx) and BSA + glutaraldehyde. The GluOx coating allowed the ventral pads to be sensitive to glutamate release through the enzymatic production of H<sub>2</sub>O<sub>2</sub>. A +0.7 V charging potential was applied to the MEA once per second (using an Ag/AgCl reference electrode) to oxidize the H<sub>2</sub>O<sub>2</sub> resulting from detection of glutamate at the electrode. The "relaxation" current from H<sub>2</sub>O<sub>2</sub> oxidation was proportional to second-by-second changes in glutamate concentration at the electrode (Quintero et al., 2011).
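Because the oxidation current is described as proportional to glutamate concentration, a once-per-second current reading can be converted to a concentration estimate with a linear calibration. The sketch below assumes a self-referencing step in which the sentinel-site current is subtracted from the GluOx-site current, and a calibration slope obtained separately; the subtraction step, the function name, the units, and the slope value are all illustrative assumptions rather than details given in the text.

```python
def glutamate_estimate(gluox_current_pa, sentinel_current_pa, slope_pa_per_um):
    """Estimate glutamate (in µM) from paired once-per-second current samples (pA).

    gluox_current_pa:    currents at the glutamate-oxidase-coated (ventral) site
    sentinel_current_pa: currents at the BSA-only sentinel (dorsal) site
    slope_pa_per_um:     hypothetical calibration slope, pA per µM glutamate
    """
    if slope_pa_per_um <= 0:
        raise ValueError("calibration slope must be positive")
    if len(gluox_current_pa) != len(sentinel_current_pa):
        raise ValueError("current traces must have the same length")
    # Subtracting the sentinel removes signal that is not glutamate-dependent.
    return [(g - s) / slope_pa_per_um
            for g, s in zip(gluox_current_pa, sentinel_current_pa)]
```

With a hypothetical slope of 5 pA/µM, readings of 12 pA (GluOx site) and 2 pA (sentinel site) would correspond to an estimate of 2 µM.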

#### **DATA ANALYSIS**

Task performance was determined for each animal (*n* = 4) as percent correct trials within and across sessions and related to simultaneous MEA recordings on individual trials during Match phase image selection MR in the task (Hampson et al., 2011). Cell types were identified as regular firing pyramidal cells in terms of baseline (nonevent) firing rate (Opris et al., 2009) and significant changes (*z* > 3.09, *p* < 0.001) in firing (see below) on single trials in perievent histograms (PEHs) derived for intervals of ±2.0 s relative to the time of Match screen presentation that signaled onset of the Match phase of the task (**Figure 2C**). Task-related neural activity was classified according to locations on the conformal MEA which were positioned specifically in L2/3 and L5 (**Figure 2B**) upon insertion in PFC prior to the start of the DMS session. To account for neuronal responses in terms of columnar microcircuit organization, neurons recorded on the MEAs were characterized by (1) simultaneous cell activity on both sets of vertically separated (1350 µm) pads (L2/3 cell upper and L5 cell lower), during electrode positioning (**Figure 2B**), and (2) whether the same cell pair firing was modulated similarly during the Match phase of the DMS task (Hampson et al., 2012). Standard (Z) scores of increased firing rates relative to nonevent baseline values were calculated for individual cells for each DMS task event. Firing rate was analyzed in 250 ms bins for ±2.0 s relative to the time of initiation (0.0 s) of task events. Only neurons with firing rates significantly elevated relative to the pre-event (−2.0 to 0.0 s) baseline period were included for analysis.
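The binning and Z-score criterion described above can be sketched as follows. This is a minimal illustration, assuming that the baseline mean and standard deviation are taken from the pre-event bins of the same peri-event histogram; the function names and that choice of baseline estimator are assumptions, not the authors' exact procedure.

```python
import math
from statistics import mean, stdev


def peri_event_counts(spike_times, event_times, half_window=2.0, bin_width=0.25):
    """Sum spike counts in bins spanning +/- half_window (s) around each event."""
    n_bins = int(round(2 * half_window / bin_width))
    counts = [0] * n_bins
    for ev in event_times:
        for t in spike_times:
            idx = math.floor((t - ev + half_window) / bin_width)
            if 0 <= idx < n_bins:          # ignore spikes outside the window
                counts[idx] += 1
    return counts


def z_scores(counts, n_baseline_bins):
    """Z-score every bin against the first n_baseline_bins (pre-event) bins."""
    base = counts[:n_baseline_bins]
    mu = mean(base)
    sd = stdev(base)
    if sd == 0:
        raise ValueError("baseline has zero variance; z-score undefined")
    return [(c - mu) / sd for c in counts]


def is_task_modulated(counts, n_baseline_bins, criterion=3.09):
    """True if any post-event bin exceeds the criterion (z > 3.09, p < 0.001)."""
    z = z_scores(counts, n_baseline_bins)
    return any(value > criterion for value in z[n_baseline_bins:])
```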
Differences in cross-correlation between neuron spikes of L2/3 and L5 cell pairs on the same vertical sets of MEA pads (**Figure 2B**) were assessed for the same temporal intervals using standardized distributions of correlation coefficients assessed under different conditions related to performance in the Match Phase (**Figures 2**–**5**). Mean cross-correlation histograms (CCHs) were calculated and compared relative to mean coefficients normalized relative to probability of firing for the same populations of cell pairs under different experimental conditions (**Figures 2B**, **3C**, **4C**, and **5C**), all of which satisfied the 99% confidence requirement (Opris et al., 2011). CCHs were generated using a shift predictor algorithm built into NeuroExplorer version 4 (http://www.neuroexplorer.com/), which computed chance cross-correlation levels by randomizing the actual spike sequence and calculating cross-correlations five different times for a given pair of neurons, which was then subtracted from the true coefficients for CCHs to adjust for correlated firing due to differences in cell firing rates and frequency of bursting (Opris et al., 2011; Takeuchi et al., 2011). Population (mean) CCHs, normalized as a function of probability, were computed by averaging coefficients across multiple cell pairs and plotting the mean values (±SEM) in 1.0 ms bins (**Figures 2B**, **3C**, **4C**, and **5C**).
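The idea of subtracting a chance-level correlogram can be illustrated with a generic trial-shift predictor. This is a textbook-style sketch, not NeuroExplorer's implementation: it pairs each reference trial with the target spikes from five other trials, averages those correlograms, and subtracts the result from the same-trial correlogram. The function names and the ±50 ms lag window are assumptions, and the normalization to correlation coefficients is omitted.

```python
import math


def cch_counts(ref_spikes, target_spikes, max_lag, bin_width):
    """Count target-minus-reference lags in bins spanning [-max_lag, +max_lag)."""
    n_bins = int(round(2 * max_lag / bin_width))
    counts = [0] * n_bins
    for r in ref_spikes:
        for t in target_spikes:
            idx = math.floor((t - r + max_lag) / bin_width)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return counts


def shift_corrected_cch(ref_trials, target_trials, max_lag=0.05,
                        bin_width=0.001, n_shifts=5):
    """Raw same-trial CCH minus the mean CCH over n_shifts trial-shifted pairings."""
    n_trials = len(ref_trials)
    if n_trials != len(target_trials):
        raise ValueError("both cells need the same number of trials")
    if n_trials <= n_shifts:
        raise ValueError("need more trials than shifts")
    n_bins = int(round(2 * max_lag / bin_width))
    raw = [0] * n_bins
    predictor = [0.0] * n_bins
    for i in range(n_trials):
        same = cch_counts(ref_trials[i], target_trials[i], max_lag, bin_width)
        raw = [a + b for a, b in zip(raw, same)]
        for k in range(1, n_shifts + 1):
            # Pair trial i of the reference cell with trial i+k of the target cell.
            other = target_trials[(i + k) % n_trials]
            shifted = cch_counts(ref_trials[i], other, max_lag, bin_width)
            predictor = [p + s / n_shifts for p, s in zip(predictor, shifted)]
    return [r - p for r, p in zip(raw, predictor)]
```

Correlations that are locked to the task event survive trial shifting and are therefore removed, whereas spike-timing relationships that occur only within a trial remain in the corrected histogram.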

#### **IDENTIFICATION OF CORTICAL LAYERS AND MINICOLUMNS**

The conformal MEA (model W3) probe (**Figure 2B**) was designed so that the two sets of recording pads could only record simultaneous activity from neurons separated by ∼1350 µm, which, given the orientation of insertion into PFC (**Figure 1B**), could only consist of infra-granular layer 5 and supra-granular layer 2/3 cell activity (Hansen and Dragoi, 2011; Opris et al., 2011; Takeuchi et al., 2011). Misplacement of the probe due to a different angular penetration relative to columnar orientation in PFC was detectable by the absence of simultaneous cell recordings on the sets of vertically separated (1350 µm) pads. In addition, the MEA (Hampson et al., 2004; Opris et al., 2011) employed here allowed simultaneous recording of two PFC minicolumns (**Figure 2B**): with proper vertical alignment (<5.0°), activity from adjacent minicolumns could be detected, since MEA pads were separated laterally by 40 µm, which exceeds the distances reported (28 µm) from anatomic assessments (Casanova et al., 2009; Hansen and Dragoi, 2011; Mo et al., 2011; Takeuchi et al., 2011).

for trials with a few (2 and 3 images) vs. many (6 and 7 images) distracter images constructed from the same interlaminar L2/3-L5 cell pair shown in **(A)**. **(D)** Normalized population CCHs for trials with low (2, 3; red) vs. high (6, 7; blue) numbers of images in the Match phase consisting of the average correlation coefficients across individual CCHs from 27 different inter-laminar cell pairs. Scatter plot showing differential distributions of individual CCH peak correlation coefficients on trials with low vs. high numbers of images for the same cell pairs (*n* = 27) comprising the population CCH. ∗∗*p* < 0.001, ∗*p* < 0.01, ANOVA.

**and columnar firing. (A)** Raster and peri-event histograms comparing firing of PFC L2/3 and L5 cells in the Match phase as a function of short (≤20 s) vs. long (>40 s) delays in the DMS task. **(B)** Population peri-event histograms depicting the activity of L2/3 and L5 cells during short (≤20 s) vs. long (>40 s) delays (*n* = 23 cells in L2/3, *p* < 0.01

distribution of Match response latencies for short (red) vs. long (blue) delays. **(C)** Example inter-laminar CCHs for L2/3 and L5 cell pairs shown in **(B)** on trials with short vs. long delays. **(D)** Population of inter-laminar CCHs and scatter plot for short vs. long delay trials (see **Figure 3D**). ∗∗*p* < 0.001, ∗*p* < 0.01, ANOVA.

#### **TUNING PLOTS**

For each inter-laminar cell pair (L2/3 and L5), firing on the same trials was plotted with respect to the position of the target selected in the Match phase (**Figure 6B**). Directionality was assigned according to the eight positions on the screen with reference to placement of the cursor in the center, providing angles corresponding to the location of the match image around the periphery of the screen and yielding 0° (directly lateral), 45°, 90°, 135°, 180°, 225°, 270°, 315°, and 360° movement directions from the center of the screen (Rao et al., 1999; Felsen et al., 2002). Mean firing rate from Match phase onset until the time of occurrence of the MR (i.e., typically 0.5–1.0 s, **Figures 4D** and **5D**) was calculated and represented for each inter-laminar cell pair in polar coordinates as tuning plots of the average firing rate over all trials in a single session. Directional biases, or "preferences", for cell pairs were defined as response locations with the highest mean firing rates relative to all the other positions responded to during the session (**Figure 6B**). A *tuning index* plot (Meyer et al., 2011) was employed for comparing the distribution of biases for the same cells on *Object* vs. *Spatial* trials (**Figure 6E**).
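As a small worked illustration of the preference and tuning-index calculation, the sketch below uses the definition given elsewhere in this article, TI = (PF − NF)/(PF + NF), with PF the mean firing rate at the preferred location and NF the rate at the lowest-firing location. The function names and the example rates are hypothetical and are not data from the study.

```python
import numpy as np

def preferred_direction(mean_rates, angles_deg):
    """Return the angle (deg) of the target position with the highest mean firing rate."""
    return angles_deg[int(np.argmax(mean_rates))]

def tuning_index(mean_rates):
    """TI = (PF - NF) / (PF + NF), with PF the highest and NF the lowest mean rate."""
    rates = np.asarray(mean_rates, dtype=float)
    pf, nf = rates.max(), rates.min()
    if pf + nf == 0:
        return float("nan")  # undefined for a silent cell
    return (pf - nf) / (pf + nf)

# Hypothetical mean Match phase rates (Hz) at the eight screen positions.
angles = [0, 45, 90, 135, 180, 225, 270, 315]
rates = [12.0, 8.0, 5.0, 4.0, 6.0, 9.0, 20.0, 10.0]
print(preferred_direction(rates, angles), round(tuning_index(rates), 3))
```

For these made-up rates the preferred direction is 270° and TI = (20 − 4)/(20 + 4) ≈ 0.667; values near 1 indicate a strong bias for one position and values near 0 indicate little bias.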

#### **RESULTS**

The four NHP subjects trained to perform the DMS task (Hampson et al., 2011) were required to select the same video image presented on-screen in the prior Sample phase from a set of 2–7 images in the subsequent Match phase after an intervening Delay of 10–60 s (**Figure 1A**). The NHPs made hand tracking movements of a cursor on the screen in the Match phase to obtain a juice reward for selection of the correct (Sample) image in different positions which varied on each trial with respect to image-type and screen position. The key variables in the task therefore were: (1) the number of distracter images (2–7) presented randomly in different screen positions in the Match phase on each trial, (2) the duration of the intervening delay interval (1.0–60.0 s), and (3) the random placement of the Sample (target) image in 1 of 7 spatial positions on the screen in the Match phase (after the delay interval). Previous research with the same DMS task has indicated the necessity of attention, short-term memory and response latency, together with influence of type of choice, as factors that affect cognitive workload in the same task (Porrino et al., 2005; Deadwyler et al., 2007). Recent analyses of PFC activity showed that animals execute a "decision process" in the Match phase of the task (**Figure 1A**) involving target selection (Hampson et al., 2011) and that this involved inter-laminar synchrony in cell activity (Hampson et al., 2012). In the study presented here, PFC columnar inter-laminar pair-wise cell firing from four NHPs (60 cell pairs: 21 in animal K, 16 in B, 12 in E and 11 in G) was characterized for all of the above-mentioned task-related parameters shown previously (Porrino et al., 2005; Deadwyler et al., 2007) to control cognitive processing in this DMS task.

**trials. (A)** Rasters and peri-event histograms showing firing in the Match phase of a PFC L2/3-L5 cell pair recorded during *Spatial* vs. *Object* type trials. **(B)** Population peri-event histograms depicting the activity of PFC L2/3 (*n* = 50) and L5 (*n* = 54) cells on *Spatial* (blue) vs. *Object* (red) trials presented during the Match phase in the DMS task [*F*(1, 1039) = 12.89, *p* < 0.001, ANOVA]. Histograms show distribution of Match response latencies for *Spatial* (blue) vs. *Object* (red) trials. **(C)** Average inter-laminar cross-correlation for the same cell pairs (*n* = 26) recorded on *Object* vs. *Spatial* trials. Scatter plot shows differential distribution of peak CCH values for *Object* vs. *Spatial* trials for the same cell pairs. **(D)** Behavioral performance as a function of the number of images (2–7) on *Object* vs.

neurotransmitter concentrations in PFC Layer 2/3. Mean (±S.E.M.) glutamate concentration ([Glutamate]) measured as a percentage increase over baseline (average 8.69 ± 0.77 µM) glutamate concentration. Horizontal axis indicates phase of DMS task: intertrial interval (ITI), Sample phase, Delay phase (Dly), end of delay phase 5 s prior to Match (PreM), Match phase and reinforcement (Reinf.). Asterisks: ∗*p* < 0.01, ∗∗*p* < 0.001, *Object* vs. *Spatial* trials; #*p* < 0.01, ##*p* < 0.001 DMS task phases vs. ITI. **(F)** Frequency of phasic glutamate release events measured as transient increase (<2.0 s duration) of at least 5% in [Glutamate] for the same trials shown in **(E)**. Frequency normalized to number of events per second per DMS trial (Fuqua et al., 2010).

spatial tuning plot (diagram in center) for a PFC L2/3 cell on *Spatial* (blue) and *Object* (red) trials. The tuning plot in the middle displays Match phase mean firing rates (shaded areas in PEHs) along radial axes corresponding to movement of the cursor into each of the eight screen image positions from the screen center summed over all trials in a single session. The spatial (i.e., screen position) "bias", indicated by the highest firing rate for target selection, for the *Object* trial tuning vectors was in the "medial left" position (i.e., 180°), while the bias for *Spatial* trial tuning vectors was in the "down" (i.e., 270°) position. **(C)** Average firing rate for *Spatial* biases (preferred target locations) and *Object* biases summed across different (*n* = 42) inter-laminar (L2/3 and L5) cells. **(D)** Scatter plot comparing preferred (i.e., highest) firing rate directions for the same cells in **(C)** on *Spatial* vs. *Object* trials, indicating a more biased directional firing on *Spatial* trials. **(E)** Histogram comparing the distribution of preferred firing for the same cells as a function of a tuning index (TI) derived as TI = (PF − NF)/(PF + NF), on *Spatial* (blue) and *Object* (red) trials, where PF represents preferred location/direction firing rate and NF stands for non-preferred direction firing rate. The plot in **(E)** shows that there was a trend for lower TIs, less bias for one position, on *Spatial* vs. *Object* trials by showing more cells with lower TI values. ∗∗*p* < 0.001, ANOVA.

#### **MULTIELECTRODE ARRAY RECORDINGS FROM CORTICAL LAYERS AND MINICOLUMNS**

Prior reports of neural relationships to executive function and decision making in a sensorimotor hierarchy (Miller and Cohen, 2001; Opris and Bruce, 2005; Heekeren et al., 2008; Pesaran et al., 2008; Opris et al., 2012) referred to recordings made in dorsolateral PFC as shown in **Figures 2A** and **B**, which were also reported to depend on the interaction between neurons in different layers in the same area (Goldman-Rakic, 1996; Opris et al., 2011; Takeuchi et al., 2011). In this study, inter-laminar connectivity was sensed by previously described conformal-designed MEAs (Hampson et al., 2012) positioned to simultaneously record neurons located in PFC layer 2/3 and layer 5 in adjacent "minicolumns" during performance of the DMS task (**Figures 2B** and **C**). The MEA contained two linear sets of four recording pads separated vertically by 1350 µm to conform to the distance between PFC cortical cell layer 2/3 (L2/3) and layer 5 (L5) when inserted perpendicular to the parallel lamellae (see "Methods"). The two sets of dual vertical pads in each upper and lower position on the MEA were separated horizontally by 40 µm in order to exceed the reported 28 µm width of single cortical minicolumns (Casanova et al., 2007; Opris et al., 2011). This allowed simultaneous recording from two adjacent L2/3 and L5 columnar "cell pairs" constituting neural activity from two separate minicolumns on a single MEA probe. This pad configuration ensured that only cells in L2/3 and L5 were recorded, since the appearance of cells simultaneously on both sets of vertical pads required 0° angular placement relative to both cell layers (Takeuchi et al., 2011) as shown in **Figure 2B**. In this study, spatiotemporal analyses of 180 prefrontal cortical (PFC) pyramidal cells recorded in four NHPs revealed a large number (*n* = 60) of confirmed L2/3 and L5 cell pairs in this region of PFC (**Figure 2A**) that displayed inter-laminar interactions during the Match phase of the DMS task.

#### **INTER-LAMINAR PROCESSING IN PFC DURING DMS TASK**

The relevance of minicolumnar activity to decision making has been investigated in several types of cognitive processing tasks (Goldman-Rakic, 1996; Opris and Bruce, 2005; Heekeren et al., 2008; Pesaran et al., 2008; Resulaj et al., 2009; Opris et al., 2011, 2012). An example of this inter-laminar interaction during target selection in the Match phase of the DMS task (**Figure 1A**) is shown in **Figure 2C** in raster/PEHs constructed over ±2.0 s for the Sample and Match phases of the trial for a cell pair recorded in the PFC with the MEA (**Figure 2B**). The cell pair was recorded on appropriate sets of pads as shown in the illustration of the two cells in L2/3 and L5 next to the MEA (**Figure 2B**). Neurons in both layers showed significant increases in mean firing during Sample (L2/3: *Z* = 7.30, *p* < 0.001; L5: *Z* = 4.16, *p* < 0.001) and Match (L2/3: *Z* = 12.86, *p* < 0.001; L5: *Z* = 6.20, *p* < 0.001) screen presentations (post events: 0.0–2.0 s) and during subsequent movements associated with target selection in this task (Hampson et al., 2011). A consistent finding employing this recording configuration was that, within neuron pairs, significantly higher mean firing rates in the 0.0 to +2.0 s interval were observed for L2/3 cells after Match phase onset [*F*(1, 153) = 20.93, *p* < 0.001], as demonstrated in the upper and lower raster/PEHs in **Figure 2C**. More precise functional connections between individual cells within each minicolumn were determined by cross-correlation histograms (CCHs; Opris et al., 2011; Takeuchi et al., 2011; Hong et al., 2012) constructed for the same minicolumn cell pairs. This is shown for the firing displayed in the PEHs in **Figure 2C**: although there was significantly correlated firing in both phases (Match: *Z* = 12.23, *p* < 0.001; Sample: *Z* = 10.12, *p* < 0.001), the differences in peak correlation shown in the CCHs [*F*(1, 401) = 9.41, *p* < 0.001] indicate that the cell pair firing was more synchronized in the Match than in the Sample phase of the task.

#### **EFFECTS OF TASK DIFFICULTY ON INTER-LAMINAR PROCESSING**

#### *Number of match phase images*

As shown in prior reports (Porrino et al., 2005; Deadwyler et al., 2007; Hampson et al., 2011), a major cognitive factor influencing target selection in the Match phase of this task was the number of distracter images (number of images) presented with the Sample image on a given trial (**Figure 1A**). **Figure 3A** shows an example of a graded decrease in cell pair firing in both L2/3 and L5 as a function of the number of images presented in the Match phase. In agreement with prior results (Hampson et al., 2011), overall mean firing rates of L2/3 (*n* = 26) and L5 (*n* = 16) neurons (**Figure 3B**) systematically decreased as a function of the number of images in the Match phase [L2/3: *F*(6, 1039) = 8.29, *p* < 0.001; L5: *F*(6, 639) = 8.64, *p* < 0.001, ANOVA]. More importantly, this decrease was also expressed in terms of correlated firing between L2/3-L5 cell pairs, as shown in **Figures 3C** and **D** (*n* = 27), in which Match phase CCHs on trials with few (2 and 3) images showed significantly higher correlations than on trials with more (6 and 7) images [*F*(1, 53) = 7.21, *p* < 0.01, ANOVA]. This finding of decreased inter-laminar correlated firing is consistent with the fact that increasing the number of distracter images decreases task performance (**Figures 1B** and **C**) due to an increase in the cognitive workload of the task (Hampson et al., 2011; Kelley and Lavie, 2011).

#### *Duration of delay*

Another factor increasing cognitive workload in the DMS task is memory of the Sample target image across the delay interval (**Figure 1A**), which has been shown to influence Match target selection (Deadwyler et al., 2007). Consistent with this relationship, **Figure 4B** shows that average firing rates for L2/3 and L5 cell pairs were significantly lower on "long" (>40 s) vs. "short" (≤20 s) delay trials [L2/3: *F*(1, 919) = 6.67, *p* < 0.01, *n* = 23; L5: *F*(1, 719) = 10.92, *p* < 0.001, *n* = 18, ANOVA]. **Figure 4C** shows that Match phase (0.0–2.0 s) CCHs for both L2/3 and L5 cells were significantly lower on "short" vs. "long" delay trials [short delay: *F*(1, 1639) = 10.87, *p* < 0.001; long delay: *F*(1, 1639) = 6.71, *p* < 0.01], as were the average CCHs for all L2/3 vs. L5 cell pairs (**Figure 4B**) under both conditions [*F*(1, 45) = 7.27, *p* < 0.01, ANOVA]. The decrease in inter-laminar correlation as a function of short vs. long delays is shown more explicitly in the scatter plot in **Figure 4D**, where short delay trials produced higher correlation coefficients than long delay trials for the same cell pairs.

#### **EFFECT OF 'TRIAL TYPE' (***OBJECT* **vs.** *SPATIAL***) ON INTER-LAMINAR PROCESSING**

PFC minicolumns are a functional neuronal "module" (Buxhoeveden and Casanova, 2002; Casanova et al., 2003) with basic associative abilities to integrate horizontal and vertical anatomic "components" of the cortex (Mountcastle, 1997; Lund et al., 2003; Tanaka, 2003; Opris et al., 2011). The visual signals carrying *Spatial* information ascend from visual cortex on the dorsal stream to be integrated in PFC minicolumns with signals from the ventral stream that label the visual features of the clip art images (such as color, shape, and brightness) used on *Object* trials. To compare firing in PFC layers L2/3 and L5 on *Spatial* vs. *Object* trials, we examined the image selection ability of cortical minicolumns during the Match phase of the DMS task in the same cells during both types of trial in the same session. **Figures 5A** and **B** show differences in L2/3 and L5 cells with respect to mean (±SEM) firing rate changes during the Match phase interval of *Spatial* and *Object* trials within the same DMS sessions. Mean firing rates during the Match phase (0.0–2.0 s) were significantly higher for L2/3 vs. L5 cells for both types of trials [*F*(1, 1039) = 12.89, *p* < 0.001, ANOVA]; however, **Figure 5B** shows that rates were significantly lower on *Spatial* vs. *Object* trials for L2/3 cells [*F*(1, 499) = 10.96, *p* < 0.001, *n* = 50], but not for L5 cells [*F*(1, 539) = 1.12, ns, *n* = 54; ANOVA]. **Figure 5C** shows that these differences in firing rates were also associated with significant decreases in mean CCHs for the same L2/3-L5 cell pairs on *Object* vs. *Spatial* trials [*F*(1, 51) = 12.20, *p* < 0.001], which, as indicated in the "Methods" section, were not due to alterations in firing rate *per se* (Hong et al., 2012). These results are consistent with the differences in degree of difficulty between *Object* vs. *Spatial* trials with respect to task performance, as shown in **Figure 5D**.

The contribution of different cellular networks to differential columnar processing on *Object* vs. *Spatial* trials was examined by employing electrochemical recording of glutamate levels in PFC Layer 2/3. Glutamate neurotransmission has been implicated in learning and memory (Dudkin et al., 2003; Riedel et al., 2003); therefore, we hypothesized that changes in levels of released glutamate would correlate with differential cognitive processing (Stephens et al., 2010) of DMS trials. Glutamate-sensitive electrochemical recording MEAs were tested in three sessions for each of the four NHPs. The average basal glutamate concentration across animals and sessions was 8.69 ± 0.77 µM. **Figure 5E** shows the percent change in *tonic* glutamate concentration ([Glutamate]) from baseline sorted by individual phases (events) in the DMS task, averaged separately across animals for *Object* vs. *Spatial* trials. While both *Object* and *Spatial* trials exhibited significantly increased glutamate concentrations [*F*(5, 789) = 11.42, *p* < 0.001] in the Delay and Match phases of the task compared to baseline and ITI levels (Fuqua et al., 2010), glutamate levels were significantly elevated on *Object* relative to *Spatial* trials [*F*(2, 789) = 32.17, *p* < 0.001]. **Figure 5F** depicts the frequency of *phasic* (i.e., transient) glutamate increases putatively related to neurotransmitter release events (Stephens et al., 2010). Although the frequency of glutamate release detected in the vicinity of the electrode was similar, it was still greater for *Object* vs. *Spatial* trials (**Figure 5E**), suggesting that the difference in overall tonic concentration represented activity of a network of glutamate synapses throughout PFC.
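The phasic-event criterion stated in the caption of **Figure 5F** (a transient increase of at least 5% in [Glutamate] lasting less than 2.0 s) can be sketched as a simple threshold-crossing detector. This is a hypothetical illustration, not the detection procedure of Fuqua et al. (2010): the function name `phasic_events`, the fixed scalar baseline, and the sampling rate `fs` are assumptions.

```python
import numpy as np

def phasic_events(conc, baseline, fs, min_increase=0.05, max_duration=2.0):
    """Return (start_s, stop_s) for runs >= baseline * (1 + min_increase) shorter than max_duration.

    Assumptions: `conc` is an evenly sampled concentration trace (µM),
    `baseline` is a single reference value, and `fs` is in samples per second.
    """
    above = np.asarray(conc, dtype=float) >= baseline * (1.0 + min_increase)
    padded = np.concatenate(([False], above, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)    # first sample of each supra-threshold run
    stops = np.flatnonzero(change == -1)    # first sample after each run
    return [(s / fs, e / fs) for s, e in zip(starts, stops) if (e - s) / fs < max_duration]
```

Dividing the number of detected events by the trial duration would give an event frequency in events per second per trial, the normalization named in the caption.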

#### *Spatial tuning*

Another comparison of *Object* vs. *Spatial* trial processing was provided by examining "tuning plots" (Rao et al., 1999; Felsen et al., 2002) of PFC L2/3 and L5 cell pairs constructed for each target location on the screen during Match target selection (**Figure 6A**). **Figure 6B** shows an example of L2/3 cell firing on both *Spatial* (blue) and *Object* (red) trials. This type of comparison clearly dissociates the L2/3 cell biases/preferences on *Spatial* trials (tuning vector points to lower target location; 270°) vs. *Object* trials (**Figure 6A**; tuning vector points to left target location; 180°). **Figure 6C** shows average PEHs of preferred firing rates on *Spatial* (blue) vs. *Object* (red) trials for 42 neurons, which showed significant increases [*F*(1, 1679) = 19.63, *p* < 0.001, ANOVA] on *Spatial* vs. *Object* trials. Finally, a scatter plot of mean firing rates (**Figure 6D**) at biased target locations of the same cells (*n* = 42) as in **Figure 6C** shows a significant difference in preferred firing on *Spatial* vs. *Object* trials (*p* < 0.001; paired *t*-test). A "tuning index" defined as TI = (PF − NF)/(PF + NF), where PF represents mean firing rate in the preferred/biased location and NF the non-preferred (lowest) firing location, was calculated to compare firing in the Match phase on *Object* vs. *Spatial* trials. **Figure 6E** shows the comparison of tuning index for Match target selection on *Spatial* vs. *Object* trials, which have comparable magnitudes in selection abilities under different prior trial-specific instructions delivered via the Focus signal (**Figure 1A**); this is consistent with the multifunctional roles of these same cells in executive control. The results shown in **Figures 6D** and **E** indicate dominance of preferred location firing on *Spatial* vs. *Object* trials, which was likely the result of the influence of the prior trial type instruction in the Focus phase of the task.

#### **CLOSING THE LOOP WITH INTERLAMINAR REGULATED STIMULATION**

The unique properties of conformal MEAs also provide the basis for applying a system-specific model to control firing of cells via application of electrical stimulation to the same loci in which columnar firing has been detected and analyzed with respect to DMS task performance (Hampson et al., 2012). This same model was implemented here to test whether it could facilitate performance on trials that show a distinctive difference in performance as a function of the prior instructions as to the type of response to make in the Match phase (i.e., *Object* vs. *Spatial* trials). **Figure 7A** shows the integration of a multi-input multi-output (MIMO) nonlinear math model to assess the patterns of firing in L2/3 and L5 cells recorded in the columnar manner with the MEA shown with adjacent vertical pads (Hampson et al., 2012; Opris et al., 2012). **Figure 7B** reflects the type of input and output firing patterns recorded and analyzed by the MIMO model and also illustrates how the output pattern of L5 cell firing is duplicated via a multichannel stimulator that is capable of delivering predetermined patterns of pulses to the same L5 pads to mimic firing on correct trials. The advantage of the MIMO model is that the online recording provides the means to detect when an inappropriate L2/3 firing pattern occurs, which triggers the delivery of the appropriate L5 stimulation pattern, providing the means to override errors and enhance performance (Hampson et al., 2012). The results of stimulation delivery are shown in **Figures 7C** and **D**, in which the effects on performance are compared to trials in which stimulation was not delivered, irrespective of trial type. **Figure 7C** shows the change in latency to respond on stimulation trials with respect to the time of onset of the Match phase, while **Figure 7D** shows the increase in correct performance on trials as a function of the number of distracter images in the Match phase.
Finally, in agreement with all prior demonstrations and correlations of columnar specificity with respect to the influence of trial type on DMS performance, **Figure 7E** shows that *Spatial* trials that received MIMO stimulation showed improved performance relative to *Object* trials (with the same number of distracter images and delays of 1–20 s). These results indicate that MIMO-derived stimulation facilitated the cognitive processing required to retrieve the "rule" for successful Match phase selection of the appropriate Sample item as shown in **Figure 6**.
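The detect-then-stimulate logic described in this section can be illustrated with a deliberately simplified sketch. It does not implement the nonlinear MIMO model: a plain correlation against a correct-trial template stands in for the model's detection of an inappropriate L2/3 pattern, and the function names, the threshold of 0.5, and the even spacing of pulses within a bin are all hypothetical choices made only for illustration.

```python
import numpy as np

def should_stimulate(l23_counts, correct_template, threshold=0.5):
    """True when the observed L2/3 bin counts correlate poorly with the correct-trial template.

    Stand-in for the MIMO model's detection step; the threshold is an assumption.
    """
    x = np.asarray(l23_counts, dtype=float)
    t = np.asarray(correct_template, dtype=float)
    if x.std() == 0 or t.std() == 0:
        return True  # a flat pattern carries no evidence of the correct firing pattern
    r = np.corrcoef(x, t)[0, 1]
    return bool(r < threshold)

def l5_pulse_times(l5_template_counts, bin_width, t_start=0.0):
    """Convert a binned correct-trial L5 pattern into pulse times (s), spread evenly within each bin."""
    times = []
    for i, n in enumerate(l5_template_counts):
        for j in range(int(n)):
            times.append(t_start + (i + (j + 0.5) / n) * bin_width)
    return times
```

In a closed-loop setting of this kind, the decision function would run online on each trial's L2/3 activity, and the pulse schedule would be sent to the stimulator only when the decision is positive, so that trials already showing the correct pattern are left untouched.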

## **DISCUSSION**

#### **INTER-LAMINAR PROCESSING IN PREFRONTAL CORTEX vs. CLOSING THE LOOP**

Response execution during DMS trial, and the illustrated firing in L5 which

The findings reported here (**Figures 2**, **3**, and **4**) are consistent with the idea that neurons in the supra- and infra-granular layers form efficient mini-columnar circuits during Match phase target selection required for effective performance of this DMS task (Swadlow et al., 2002; Pesaran et al., 2008; Resulaj et al., 2009; Buffalo et al., 2011; Opris et al., 2011; Takeuchi et al., 2011). The implementation of the unique MEA (**Figure 2B**) provided the basis for the detailed assessment of inter-laminar correlated firing (Opris et al., 2011) that was validated in multiple recordings of L2/3 and L5 cell pairs that yielded similar relations following differential changes in performance-dependent task parameters across animals and sessions (**Figures 3D**, **4D**, and **5D**). The increase in L2/3 and L5 correlations specific to the decision for target selection in the Match phase of the task (**Figures 2**, **3**, and **4**) suggests that a key variable in controlling task performance was activation of L5 neurons via specific minicolumnar input from paired neurons in layers 2 and 3, which have been shown to participate in the integration of "long-range" sensory inputs from the parietal dorsal visual stream (Opris and Bruce, 2005; Heekeren et al., 2008; Pesaran et al., 2008; Resulaj et al., 2009). Such integration was definitely reduced by trial difficulty, as indicated by the reduction in firing synchrony between L2/3 and L5 cell pairs relative to trials with less cognitive demand (**Figures 1B**, **3C**, **D**, and **4C**, **D**). Prior investigations have shown that the firing of adjacent minicolumns is not correlated with respect to L2/3 and L5 activation during the Match phase of the task (Hampson et al., 2012; Opris et al., 2012). This again supports the notion that specific columnar processing was the basis for effective task performance and that such processing with respect to correlated firing between columns was independent, potentially reflecting processing of different forms of task-specific information (Miyaki et al., 2000; Miller and Cohen, 2001; Opris et al., 2011).

of 1–20 s. ∗∗*p* < 0.001, ANOVA.

Another feature demonstrating the columnar nature of this type of multineuron processing was the fact that classified L2/3–L5 cell pairs also showed the same Match phase spatial tuning biases (Felsen et al., 2002) during the session (**Figure 6**), which indicates the possible presence of previously identified PFC minicolumnar selection biases (Rao et al., 1999; Resulaj et al., 2009; Opris et al., 2011) in the cell pairs reported here (**Figures 3**, **4**, and **5**). This columnar processing trend, with the same tuning bias of L2/3 and L5 cells, was found in 81% of the cell pairs on *Spatial* trials and in the same percentage on *Object* trials, although the direction of tuning biases in the same minicolumn varied between these two trial contingencies.

**Figures 5B** and **C** show a very important distinction with respect to PFC inter-laminar processing, which elucidates why animals were less efficient in performing *Spatial* vs. *Object* types of trials with the same delays (**Figure 5D**) in the same behavioral sessions. The reduction in L2/3-L5 cell pair correlation on *Spatial* trials shown in **Figure 5C** reflects a difference related to a state controlled by "prior" trial-specific instruction (**Figure 1A**, Focus signal) and suggests a lack of contextual encoding sufficient to maintain the same level of inter-laminar communication. This is supported also by the demonstration of the independent influence of trial delay shown in **Figure 4**, which clearly had a greater influence on *Spatial* vs. *Object* trials. In addition, the electrochemical measurement of glutamate concentration in Layer 2/3 (**Figures 5E**–**F**) suggests that different networks, circuits, or even possibly, inter-laminar columns of PFC neurons, differentially support the processing of *Spatial* vs. *Object* trials. Thus, inter-laminar processing likely underlies the putative "executive function" of this brain region. These unique neural recordings demonstrate that relations between prefrontal neurons that encode and process information between cortical layers via minicolumns are likely relevant factors involved in executive dysfunction, in which inter-laminar disruption could be the basis for the cognitive impairment, as shown recently (Hampson et al., 2011, 2012; Opris et al., 2012). This was verified by the fact that delivery of the appropriate firing pattern with MIMO model-derived electrical stimulation, in the same L5 neural firing pattern as during successful execution of the MR in the task, improved performance when more distracter images were present (**Figure 7D**). However, the fact that MIMO stimulation also facilitated performance by avoiding a different type of error with respect to retaining and implementing the "rule" for the type of trial (*Object* or *Spatial*) being executed (**Figure 7E**) suggests that closing PFC columnar loops activates a process that normally functions to enhance cognitive decision making in NHPs performing tasks that require retention of the contexts in which target selections are made.

### **ACKNOWLEDGMENTS**

We thank Joshua Long, Joseph Noto, Brian Parish, Mack Miller, and Shahina Kozhisseri for their assistance on this project. This work was supported by National Institutes of Health Grants DA06634, DA023573, DA026487 and by Defense Advanced Research Projects Agency (DARPA) contract N66601-09-C-2080 to Sam A. Deadwyler.

#### **REFERENCES**

simultaneous measures of choline and acetylcholine in CNS. *Biosens. Bioelectron.* 23, 1382–1389.

Trippe, J. (2009). Minicolumnar width: comparison between supragranular and infragranular layers. *J. Neurosci. Methods* 184, 19–24.


the monkey's dorsolateral prefrontal cortex. *J. Neurophysiol*. 61, 331–349.


A. (2004). "Ceramic-based microelectrode neuronal recordings in the rat and monkey," in *Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS),* (Lexington, KY), 25, 3700–3703.


visual cortex. *Cereb. Cortex* 21, 659–665.


A. (2012). Columnar processing in primate prefrontal cortex: evidence for executive control microcircuits. *J. Cogn. Neurosci.* 24, 2334–2347.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 August 2012; paper pending published: 21 September 2012; accepted: 30 October 2012; published online: 22 November 2012.*

*Citation: Opris I, Fuqua JL, Huettl PF, Gerhardt GA, Berger TW, Hampson RE and Deadwyler SA (2012) Closing the loop in primate prefrontal cortex: inter-laminar processing. Front. Neural Circuits 6:88. doi: 10.3389/fncir.2012.00088*

*Copyright © 2012 Opris, Fuqua, Huettl, Gerhardt, Berger, Hampson and Deadwyler. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Decreased Hering–Breuer input-output entrainment in a mouse model of Rett syndrome

#### **Rishi R. Dhingra<sup>1,2</sup>, Yenan Zhu<sup>2,3</sup>, Frank J. Jacono<sup>1,4</sup>, David M. Katz<sup>2</sup>, Roberto F. Galán<sup>2,3</sup> and Thomas E. Dick<sup>1</sup>\***

<sup>1</sup> Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Case Western Reserve University, Cleveland, OH, USA

<sup>2</sup> Department of Neurosciences, Case Western Reserve University, Cleveland, OH, USA

<sup>3</sup> Systems Biology and Bioinformatics Program, Case Western Reserve University, Cleveland, OH, USA

<sup>4</sup> Louis Stokes Veterans Affairs Medical Center, Case Western Reserve University, Cleveland, OH, USA

#### **Edited by:**

Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany

#### **Reviewed by:**

Mathias Dutschmann, Florey Neuroscience Institutes, Australia

Donald R. McCrimmon, Northwestern University Feinberg School of Medicine, USA

Angelina Y. Fong, Macquarie University, Australia

#### **\*Correspondence:**

Thomas E. Dick, Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106, USA. e-mail: thomas.dick@cwru.edu

Rett syndrome, a severe X-linked neurodevelopmental disorder caused by mutations in the gene encoding methyl-CpG-binding protein 2 (Mecp2), is associated with a highly irregular respiratory pattern including severe upper-airway dysfunction. Recent work suggests that hyperexcitability of the Hering–Breuer reflex (HBR) pathway contributes to respiratory dysrhythmia in Mecp2 mutant mice. To assess how enhanced HBR input impacts respiratory entrainment by sensory afferents in closed-loop in vivo-like conditions, we investigated the input (vagal stimulus trains) – output (phrenic bursting) entrainment via the HBR in wild-type and MeCP2-deficient mice. Using the in situ perfused brainstem preparation, which maintains an intact pontomedullary axis capable of generating an in vivo-like respiratory rhythm in the absence of the HBR, we mimicked the HBR feedback input by stimulating the vagus nerve (at threshold current, 0.5 ms pulse duration, 75 Hz pulse frequency, 100 ms train duration) at an inter-burst frequency matching that of the intrinsic oscillation of the inspiratory motor output of each preparation. Using this approach, we observed significant input-output entrainment in wild-type mice as measured by the maximum of the cross-correlation function, the peak of the instantaneous relative phase distribution, and the mutual information of the instantaneous phases. This entrainment was associated with a reduction in inspiratory duration during feedback stimulation. In contrast, the strength of input-output entrainment was significantly weaker in Mecp2<sup>−/+</sup> mice. However, Mecp2<sup>−/+</sup> mice also had a reduced inspiratory duration during stimulation, indicating that reflex behavior in the HBR pathway was intact. Together, these observations suggest that the respiratory network compensates for enhanced sensitivity of HBR inputs by reducing HBR input-output entrainment.

**Keywords: closed-loop, entrainment, vagus, Hering–Breuer reflex, Mecp2**

## **INTRODUCTION**

Rett syndrome is caused by loss of MeCP2 function and is associated with an increase in respiratory pattern irregularity characterized by periods of forceful breathing (hyperventilation), breathing pauses, and abnormal cardiorespiratory coupling, as well as increased mean respiratory frequency (Weese-Mayer et al., 2006; Katz et al., 2012). MeCP2-deficient mice have a similar irregular breathing phenotype including increased mean respiratory frequency, increased variability in frequency, and increased frequency of apneas of both central and obstructive types (Katz et al., 2009; Voituron et al., 2010). The intrinsic neuronal mechanisms associated with these breathing alterations include widespread hyperexcitability in several respiratory areas of the brainstem including the nucleus tractus solitarius (nTS; Kline et al., 2010; Kron et al., 2012), Kölliker–Fuse nuclei (KFn; Stettner et al., 2007), locus coeruleus (Taneja et al., 2009), and ventrolateral medulla (Medrihan et al., 2008). Accordingly, therapies targeted at reducing neuronal hyperexcitability are effective in reducing the frequency of central apnea in mice (Abdala et al., 2010).

At the network level, MeCP2-deficiency leads to exaggerated post-inspiratory (PI) activity in vagal nerve recordings whose efferent fibers innervate the upper-airway (Stettner et al., 2007). The PI motor pattern is controlled by peripheral and pontine drives (see **Figure 1A**). Pulmonary stretch receptors (PSRs), located in the airways and lungs, send feedback encoding lung volume to the respiratory network via the vagal nerves to inhibit inspiration and facilitate expiration [the Hering–Breuer reflex (HBR); Kubin et al., 2006]. Neurons in the nTS, called pump cells, receive these vagal inputs and relay the information to medullary PI neurons as well as to the KFn, causing robust inhibition of inspiration and prolongation of expiration (Berger, 1977; Ezure and Tanaka, 1996; Ezure et al., 1998, 2002). The dorsolateral (dl) pontine drive for the PI motor pattern was identified in studies in which blockade of NMDAergic transmission in the KFn after transection of the vagal nerves eliminated the inspiratory off-switch and led to an apneustic breathing pattern (Fung et al., 1994; Ling et al., 1994). Moreover, pump cell projections to the dl pons may gate an excitatory efference copy of central pattern generator (CPG) output from late-inspiratory neurons of the pre-Bötzinger complex (Cohen and Shaw, 2004; Dick et al., 2008). Thus, in the absence of vagal afferent activity, the central representation of the "pattern" is disinhibited in the pons, and dl pontine activity can drive PI activity. This circuit motif thereby allows the network to generate a PI rhythm in the absence of closed-loop feedback control.

**FIGURE 1 | Closing the Hering–Breuer mechanosensory feedback loop in the in situ arterially perfused preparation. (A)** Schematic of the closed-loop in vivo respiratory rhythm generating network. rCPG, respiratory CPG; PSR, pulmonary stretch receptor; nTS, nuclei of the solitary tract; KFn, Kölliker–Fuse nuclei. **(B)** To mimic Hering–Breuer reflex (HBR) feedback, we first estimated the intrinsic oscillation frequency, ω<sub>0</sub>, from an epoch of integrated phrenic nerve activity (PNA). Second, we estimated the minimum threshold to evoke the inspiratory inhibitory HBR by applying 10 s stimulus trains (20 Hz, 0.5 ms pulse-width) of increasing stimulus intensities to the contralateral vagus nerve. Once these two parameters were determined from each experimental preparation, we generated a fictive feedback input that consisted of a 2-min stimulus of rhythmic 100 ms trains (75 Hz, 0.5 ms pulse-width) whose inter-burst frequency matched ω<sub>0</sub> with pulse amplitude just above the threshold for resetting. **(C)** A representative tracing of fictive feedback input (shaded bars) and PNA output (trace) from a wild-type mouse. The dashed line indicates the threshold used for post hoc event detection. **(D,E)** To analyze entrainment between the input and output, onset times for the two signals were extracted and used to generate the instantaneous phase time series, ϕ<sub>output</sub>(t) **(D)** and ϕ<sub>input</sub>(t) **(E)**. ϕ(t) increases linearly from 0 to 2π between events and represents the movement of each oscillator around its limit-cycle.

In MeCP2-deficient mice, because both the nTS and the KFn appear functionally hyperexcitable, it is unclear whether the excessive PI activity and the resultant respiratory pattern dysrhythmia are due to a peripheral PI mechanism involving the nTS or a pontine PI mechanism involving the KFn. To disentangle this issue, we simulated the closed-loop behavior of the network by re-introducing rhythmic vagal feedback *in situ* and measuring the ability of the wild-type versus MeCP2-deficient networks to entrain to a threshold-amplitude periodic vagal input. Entrainment to rhythmic inputs is a fundamental property of oscillators: it occurs when a weak external forcing causes the oscillator to adjust its period and become phase-locked with the imposed rhythm. In the entrained regime, the ratio between the oscillation frequency and the frequency of the imposed forcing takes on rational values. From the phase approximation model, we know that the existence of stable coupled dynamics between the oscillator and the rhythmic forcing depends on (1) the difference in frequency between the intrinsic oscillation frequency and that of the rhythmic input and (2) the strength of the coupling. In humans and cats, respiration readily entrains to HBR inputs during mechanical ventilation (Petrillo and Glass, 1984; Graves et al., 1986). In rats, respiration can also be entrained directly by rhythmically stimulating vagal afferent nerve endings (Dutschmann et al., 2009).
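The dependence of phase-locking on frequency mismatch and coupling strength can be illustrated with the simplest form of the phase approximation, the Adler equation for the relative phase ψ = ϕ<sub>output</sub> − ϕ<sub>input</sub>: dψ/dt = Δω − ε sin ψ, which has a stable locked solution only when |Δω| ≤ ε. The Python sketch below is illustrative only: the sinusoidal coupling function, the function name `mean_phase_drift`, and all parameter values are assumptions and are not taken from the study.

```python
import math

def mean_phase_drift(delta_omega, epsilon, t_end=100.0, dt=0.001):
    """Integrate d(psi)/dt = delta_omega - epsilon*sin(psi) with forward Euler
    and return the average drift rate of psi over the second half of the run.
    A rate near zero means the relative phase has locked."""
    n_steps = int(round(t_end / dt))
    half = n_steps // 2
    psi = 0.0
    psi_half = 0.0
    for step in range(n_steps):
        if step == half:
            psi_half = psi  # remember the phase at the midpoint
        psi += dt * (delta_omega - epsilon * math.sin(psi))
    return (psi - psi_half) / ((n_steps - half) * dt)

# Mismatch smaller than the coupling strength: relative phase settles (locking).
locked = mean_phase_drift(delta_omega=0.5, epsilon=1.0)
# Mismatch larger than the coupling strength: relative phase keeps drifting.
drifting = mean_phase_drift(delta_omega=1.0, epsilon=0.5)
```

In this toy model, matching the stimulus frequency to the intrinsic frequency corresponds to making Δω small, and stimulating at threshold corresponds to fixing ε at a comparable value across preparations.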

In this report, we test the hypothesis that increased respiratory pattern irregularity in *Mecp2*<sup>−/+</sup> mice is associated with an enhancement of respiratory entrainment by HBR inputs. While the strengthening of the coupling at the level of the nTS in MeCP2-deficient mice predicts an increase in entrainment between the CPG and the vagal input, we observed that *Mecp2*<sup>−/+</sup> mice display reduced input-output entrainment, consistent with a dysfunctional pontine PI mechanism that causes respiratory dysrhythmia in these mice. However, the peripheral HBR pathway is still functional because inspiratory duration decreased during rhythmic vagal stimulation.

## **MATERIALS AND METHODS**

Experimental protocols were approved by the Case Western Reserve University Institutional Animal Care and Use Committee and were performed with strict adherence to all American Association for Accreditation of Laboratory Animal Care International (AAALAC), National Institutes of Health and National Research Council guidelines.

Experiments were performed in adult (10–12 week postnatal age), female *Mecp2*<sup>tm1.1Jae</sup> mice maintained on a mixed background (129Sv, C57BL/6, Balb/c; *n* = 6 wild-type, *n* = 5 *Mecp2*<sup>−/+</sup>; Chen et al., 2001; Guy et al., 2001). We utilized heterozygous female mice because they more closely model the human condition in which the mutation is lethal in males and results in somatic mosaicism in females due to the stochastic nature of X-chromosome inactivation. Female *Mecp2*<sup>tm1.1Jae</sup> heterozygotes show respiratory pattern irregularities like their null male littermates albeit at a later developmental time (Schmid et al., 2012). Further, a recent consortium on developing translational therapies for Rett syndrome has highlighted the importance of validating preclinical findings in heterozygous female *Mecp2* mutants (Katz et al., 2012).

To close the vagal mechanosensory feedback loop (**Figure 1B**), we stimulated the vagus nerve in the arterially perfused *in situ* preparation (hereafter, the *in situ* preparation), which is devoid of peripheral feedbacks, but maintains both an intact pontomedullary respiratory CPG and intact peripheral sensory nerve inputs (Paton, 1996). Briefly, mice were deeply anesthetized with isoflurane (1.5–3%, Piramal Healthcare, Andhra Pradesh, India). Once the mouse failed to respond to a noxious paw pinch, it was transected below the diaphragm and transferred into an ice-cold artificial cerebrospinal fluid (aCSF) bath for precollicular decerebration, cerebellectomy, and dissection of phrenic and vagal nerves. The preparation was then transferred to a recording chamber. The descending aorta was cannulated and perfused with aCSF (125 mM NaCl, 3 mM KCl, 1.25 mM KH<sub>2</sub>PO<sub>4</sub>, 2.5 mM CaCl<sub>2</sub>, 1.25 mM MgSO<sub>4</sub>, 25 mM NaHCO<sub>3</sub>, 10 mM d-glucose) containing 1.25% Ficoll (31°C) using a peristaltic pump (Watson-Marlow 505S, Cornwall, UK). The perfusate was continuously bubbled with a gas mixture containing 94% O<sub>2</sub>/6% CO<sub>2</sub>. Because of the small size of the descending aorta, we were not able to measure perfusion pressure accurately. Adequate perfusion of the brainstem was maintained with flows between 17 and 20 ml/min. Within minutes of cannulation, respiratory movements resumed. If the respiratory activities were initially disorganized, then a single bolus of NaCN (0.1 ml, 0.03% w/v) was delivered to stimulate the peripheral chemoreceptors and restore the eupneic-like patterning of respiratory motor output. Vasopressin was not administered during these experiments.

#### **NERVE RECORDINGS**

Phrenic (PNA) and vagal (VNA) nerve activities were used as indices of fictive respiratory motor output. Signals were recorded from the distal end of either nerve via suction electrodes, filtered (0.003–3 kHz), amplified (5–20 K; Grass P511, West Warwick, RI, USA), digitized (Power 1401, CED, Cambridge, UK), and stored (10 kHz sampling frequency) on a computer using Spike2 acquisition software (CED, Cambridge, UK).

#### **EXPERIMENTAL PROTOCOL**

After the tuned respiratory rhythm stabilized (15–20 min), baseline activity was recorded for at least 5 min to assess differences in respiratory patterning between the genotypes and to measure the intrinsic oscillation frequency of each preparation for determining the burst frequency of fictive vagal feedback. Next, the threshold amplitude for evoking the HBR was determined by measuring the threshold for an expiratory prolongation response to a constant train of vagal stimulation (20 Hz, 10 s train duration, 0.5 ms pulse duration). The threshold amplitude was defined as the stimulus current necessary to evoke an expiratory prolongation of at least 1.5 × the baseline expiratory duration. Having determined feedback burst frequency and pulse amplitude parameters, custom scripts written in MATLAB were used to generate rhythmic event trains (75 Hz, 100 ms train duration, 0.5 ms pulse duration) whose inter-burst frequency was matched to the intrinsic PNA burst frequency (∼75–200 breaths/min). Burst stimulation was used because the afferent discharge of slowly adapting PSRs is characterized by sinusoidal ramps in impulse frequency *in vivo* (Widdicombe, 1954; Luck, 1970). As these vagal PSR fibers are large and myelinated (Düring et al., 1974), the pulse duration was chosen to preferentially activate myelinated fibers.
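The stimulus timing described above can be sketched as a list of pulse onset times. This is a hedged Python illustration and not the authors' MATLAB script: the function name `fictive_feedback_pulses` and the rule that a pulse is included only if it ends within the 100 ms train are assumptions made here for concreteness.

```python
def fictive_feedback_pulses(burst_rate_hz, trial_s=120.0, train_s=0.100,
                            pulse_rate_hz=75.0, pulse_width_s=0.0005):
    """Return pulse onset times (s) for rhythmic stimulus trains.

    burst_rate_hz is the inter-burst frequency, matched to the intrinsic
    phrenic burst frequency of the preparation (e.g., 2 Hz = 120 breaths/min).
    Assumption: a pulse belongs to a train only if it ends within train_s.
    """
    burst_period = 1.0 / burst_rate_hz
    pulse_period = 1.0 / pulse_rate_hz
    n_bursts = int(trial_s / burst_period)
    onsets = []
    for b in range(n_bursts):
        train_start = b * burst_period
        k = 0
        while k * pulse_period + pulse_width_s <= train_s:
            onsets.append(train_start + k * pulse_period)
            k += 1
    return onsets

# A 2-min trial for a preparation breathing at 120 bursts/min.
pulse_times = fictive_feedback_pulses(burst_rate_hz=2.0)
```

Under these assumptions each 100 ms train contains eight pulses at 75 Hz, and the train onsets are independent of the ongoing phrenic activity.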

Fictive feedback stimulation trials (2 min duration) were separated by at least 1 min of baseline activity and were repeated until the decay of the preparation caused the intrinsic oscillation frequency to drift by more than 30% from baseline, which typically occurred 1.5–5 h after the resumption of PNA. During fictive feedback stimulation trials, custom Spike2 scripts transformed the MATLAB-generated event time series into a TTL output to deliver the fictive feedback to the preparation. Importantly, the fictive feedback was not triggered by the PNA. Instead, the rhythmic feedback was started irrespective of the current phase of the respiratory CPG. Nonetheless, in wild-type mice the ongoing respiratory oscillation entrained to the fictive feedback within a few cycles (**Figure 4A**, *left panels*).

#### **DATA ANALYSIS**

Phrenic nerve activity onset and offset times were derived from a threshold-crossing algorithm and visually inspected for artifacts (**Figures 1C–E**). Respiratory period, phase durations, variability, and apnea index were calculated from the recorded baseline epoch (*n* = 6 wild-type; 5 *Mecp2*<sup>−/+</sup>). Apneas were defined as respiratory cycles with duration more than 1.5 times the mean period. The Wilcoxon signed rank test was used to determine the significance of HBR-induced reduction in Ti (**Figure 3**).
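A minimal Python sketch of threshold-crossing event detection and of the apnea criterion stated above is given here for illustration. The function names and the sample-by-sample crossing rule are assumptions; the original analysis details (for example, how the integrated PNA was smoothed or how the threshold was chosen) are not specified in the text.

```python
def detect_bursts(signal, fs_hz, threshold):
    """Return (onsets, offsets) in seconds from upward and downward
    crossings of `threshold` in a sampled, integrated nerve signal."""
    onsets, offsets = [], []
    above = signal[0] > threshold
    for i in range(1, len(signal)):
        now = signal[i] > threshold
        if now and not above:
            onsets.append(i / fs_hz)
        elif above and not now and onsets:
            offsets.append(i / fs_hz)  # only count offsets after a detected onset
        above = now
    return onsets, offsets

def count_apneas(onsets, factor=1.5):
    """Count cycles whose duration exceeds `factor` times the mean period."""
    periods = [b - a for a, b in zip(onsets, onsets[1:])]
    mean_period = sum(periods) / len(periods)
    return sum(1 for p in periods if p > factor * mean_period)

# Synthetic example: five 0.2 s bursts sampled at 100 Hz, with one long pause.
fs = 100.0
trace = [0.0] * 400
for start in (10, 60, 110, 260, 310):
    for i in range(start, start + 20):
        trace[i] = 1.0
burst_onsets, burst_offsets = detect_bursts(trace, fs, threshold=0.5)
n_apneas = count_apneas(burst_onsets)
```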

To test our hypothesis that increased pattern irregularity in MeCP2-deficient mice is associated with alterations in the ability of the CPG to entrain to afferent feedback inputs, we characterized the input-output entrainment generated by fictive vagal feedback in *Mecp2*<sup>−/+</sup> mice (*n* = 11 trials) versus wild-type littermates (*n* = 22 trials) using the cross-correlogram and several statistical measures derived from the instantaneous phase time series including the relative phase histogram, the instantaneous phase coherence, the synchronization index, and the mutual information of the instantaneous phases.

The normalized cross-correlogram, or transfer function, was computed using standard routines available in the MATLAB Signal Processing Toolbox. For computation of the cross-correlogram, the input signal was represented by a square-wave function with a pulse-width equal to the train duration. Before computing the cross-correlogram, both input and output signals were scaled between 0 and 1 and DC-removed. The maximum of the cross-correlogram was used as an index of the strength of input-output coupling.
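The preprocessing and correlation steps can be sketched as follows in plain Python. This is not the MATLAB routine used in the study: the function name is hypothetical, and normalizing by the zero-lag energies of the two signals is an assumption about what "normalized" means here.

```python
import math

def normalized_xcorr(x, y, max_lag):
    """Cross-correlogram of x and y for lags -max_lag..max_lag (in samples).
    Both signals are scaled to [0, 1] and mean-subtracted (DC-removed) first;
    the result is divided by the geometric mean of the signal energies."""
    def prep(s):
        lo, hi = min(s), max(s)
        scaled = [(v - lo) / (hi - lo) for v in s]
        mean = sum(scaled) / len(scaled)
        return [v - mean for v in scaled]

    x, y = prep(x), prep(y)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    n = len(x)
    cc = {}
    for lag in range(-max_lag, max_lag + 1):
        # Sum over the indices where both x[i] and y[i + lag] exist.
        total = sum(x[i] * y[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        cc[lag] = total / norm
    return cc

# Square-wave "input" and an "output" that follows it 3 samples later.
stim = [1.0 if i % 20 < 5 else 0.0 for i in range(200)]
resp = [1.0 if (i - 3) % 20 < 5 else 0.0 for i in range(200)]
cc = normalized_xcorr(stim, resp, max_lag=25)
peak_lag = max(cc, key=cc.get)
```

The maximum of `cc` then plays the role of the coupling index described above.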

The instantaneous phase time series was determined from the onset times, *t<sub>k</sub>*, of the phrenic or input signals. The instantaneous phase, ϕ(*t*), which is assumed to grow linearly in time within each cycle, was defined according to the following equation:

$$\varphi(t) = 2\pi \frac{t - t_k}{t_{k+1} - t_k} + 2\pi k, \quad t_k \le t < t_{k+1} \tag{1}$$

where *t<sub>k</sub>* is the time of the *k*-th event and *t<sub>k+1</sub>* is the time of the next event, so that ϕ(*t*) advances by 2π from one event to the next.
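A minimal Python sketch of this linear phase interpolation is shown below, with the phase advancing by 2π between consecutive events. The function name and the cumulative 2πk offset are assumptions of this sketch; taking the result modulo 2π gives the 0 to 2π sawtooth described in the caption of Figure 1.

```python
import bisect
import math

def instantaneous_phase(event_times, t):
    """Phase at time t, interpolated linearly between consecutive events.

    event_times must be sorted; t must satisfy first event <= t < last event.
    The phase grows by 2*pi per cycle and accumulates across cycles.
    """
    k = bisect.bisect_right(event_times, t) - 1  # index of the last event <= t
    if k < 0 or k >= len(event_times) - 1:
        raise ValueError("t must lie within [first event, last event)")
    t_k, t_next = event_times[k], event_times[k + 1]
    return 2.0 * math.pi * (t - t_k) / (t_next - t_k) + 2.0 * math.pi * k

events = [0.0, 1.0, 3.0]                 # hypothetical burst onset times (s)
phase_a = instantaneous_phase(events, 0.5)   # halfway through cycle 0
phase_b = instantaneous_phase(events, 2.0)   # halfway through cycle 1
wrapped_b = phase_b % (2.0 * math.pi)        # sawtooth form, 0 to 2*pi
```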

From the instantaneous phase time series, the instantaneous relative phase difference time series, ϕ<sub>output</sub> − ϕ<sub>input</sub>, was computed. Histograms of the instantaneous relative phase were computed using 21 bins over the range 0–2π and scaled by the number of samples to determine the probability of a given instantaneous relative phase. The deviation from a uniform distribution was determined using the Rayleigh test for circular uniformity. The maximum of the instantaneous relative phase histogram was used as a measure of input-output coupling.
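The histogram and uniformity test can be sketched in Python as follows. The function names are hypothetical, and the p-value uses the common first-order approximation p ≈ exp(−Z) with Z = nR², which is an assumption of this sketch and may differ from the exact procedure used in the study.

```python
import cmath
import math

def relative_phase_histogram(rel_phases, n_bins=21):
    """Probability histogram of relative phases wrapped to [0, 2*pi)."""
    two_pi = 2.0 * math.pi
    counts = [0] * n_bins
    for p in rel_phases:
        idx = min(int((p % two_pi) / two_pi * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(rel_phases) for c in counts]

def rayleigh_test(rel_phases):
    """Return (Z, approximate p) for the Rayleigh test of circular uniformity."""
    n = len(rel_phases)
    r = abs(sum(cmath.exp(1j * p) for p in rel_phases)) / n  # resultant length
    z = n * r * r
    return z, math.exp(-z)  # first-order approximation of the p-value

# Tightly clustered relative phases versus evenly spread ones.
clustered = [1.0 + 0.01 * i for i in range(-10, 11)]
spread = [2.0 * math.pi * k / 100 for k in range(100)]
hist = relative_phase_histogram(clustered)
z_clustered, p_clustered = rayleigh_test(clustered)
z_spread, p_spread = rayleigh_test(spread)
```

The maximum of `hist` corresponds to the coupling measure described above.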

To define the regions in the instantaneous relative phase time series that were associated with strong input-output entrainment, we computed the phase coherence of the relative phases. The phase coherence is a windowed statistic that measures the squared magnitude of the mean resultant vector of the relative phase:

$$\gamma(t) = \left| \frac{1}{N} \sum_{j=t-w}^{t} e^{i\left[\varphi_{\text{output}}(j) - \varphi_{\text{input}}(j)\right]} \right|^2 \tag{2}$$

where *w* is the window length and *N* is the number of samples in the window. The phase coherence was computed with a 3 s window and yields a value between 0 and 1. Values near 0 indicate an absence of phase-locking, whereas values closer to 1 indicate the presence of phase-locking. To measure the latency to entrainment, the duration of entrainment, and the number of phase slips, we used a phase coherence threshold of 0.9.
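A sliding-window phase coherence and the extraction of supra-threshold epochs can be sketched in Python as below. The function names, the window expressed in samples, and the direct (unoptimized) summation are assumptions made for illustration.

```python
import cmath

def phase_coherence(rel_phase, window):
    """Squared magnitude of the mean unit vector of the relative phase in a
    trailing window of `window` samples; one value per complete window."""
    gamma = []
    for end in range(window - 1, len(rel_phase)):
        segment = rel_phase[end - window + 1:end + 1]
        mean_vec = sum(cmath.exp(1j * p) for p in segment) / len(segment)
        gamma.append(abs(mean_vec) ** 2)
    return gamma

def locked_epochs(gamma, threshold=0.9):
    """Return (start, stop) index pairs of contiguous runs with gamma > threshold."""
    epochs, start = [], None
    for i, g in enumerate(gamma):
        if g > threshold and start is None:
            start = i
        elif g <= threshold and start is not None:
            epochs.append((start, i))
            start = None
    if start is not None:
        epochs.append((start, len(gamma)))
    return epochs

# Constant relative phase (locked) followed by a steadily drifting phase.
rel = [0.5] * 50 + [0.7 * i for i in range(50)]
gamma = phase_coherence(rel, window=10)
epochs = locked_epochs(gamma)
```

From `epochs`, the latency to entrainment is the start of the first epoch and the duration of entrainment is the epoch length, each converted to seconds with the sampling interval.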

The synchronization index, or phase-locking value, also maps the circular distribution of relative instantaneous phase onto the unit circle. The magnitude of the synchronization index is proportional to the degree of input-output entrainment. The synchronization index is computed via the following equation:

$$\gamma_{(n,m)} = \left| \left\langle e^{i\left[n\varphi_{\text{output}} - m\varphi_{\text{input}}\right]} \right\rangle \right| \tag{3}$$

for any *n:m* coupling. In the present study, the input oscillation frequency was chosen such that only 1:1 coupling was observed.
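The synchronization index translates directly into a few lines of Python. The function name and the example phase series are hypothetical; the angle brackets in the equation are taken here to mean an average over all samples in a trial.

```python
import cmath

def synchronization_index(phi_output, phi_input, n=1, m=1):
    """Magnitude of the time-averaged unit vector of n*phi_output - m*phi_input."""
    vectors = [cmath.exp(1j * (n * a - m * b))
               for a, b in zip(phi_output, phi_input)]
    return abs(sum(vectors) / len(vectors))

phi_in = [0.1 * k for k in range(1000)]
locked_out = [p + 0.3 for p in phi_in]     # constant phase lag (1:1 locking)
drifting_out = [1.37 * p for p in phi_in]  # different frequency, no locking
half_rate_out = [0.5 * p for p in phi_in]  # one output cycle per two input cycles

si_locked = synchronization_index(locked_out, phi_in)
si_drifting = synchronization_index(drifting_out, phi_in)
si_two_to_one = synchronization_index(half_rate_out, phi_in, n=2, m=1)
```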

Mutual information is a measure of statistical dependence in a pair of time series. We used mutual information of the instantaneous phases in combination with surrogate data testing to quantify input-output entrainment and to allow for a statistical determination of the significance of the observed phase-locking. The mutual information index is defined according to the following equation:

$$I\left(\varphi_{\text{input}};\ \varphi_{\text{output}}\right) = H\left(\varphi_{\text{input}}\right) + H\left(\varphi_{\text{output}}\right) - H\left(\varphi_{\text{input}}, \varphi_{\text{output}}\right) \tag{4}$$

where *H*(ϕ<sub>X</sub>) is the entropy of time series ϕ<sub>X</sub> computed from the individual probability histogram, and *H*(ϕ<sub>X</sub>, ϕ<sub>Y</sub>) is the joint entropy of time series ϕ<sub>X</sub> and ϕ<sub>Y</sub> computed from the joint probability histogram. The entropy of either distribution is computed according to:

$$H\left(\varphi_{X}\right) = -\sum_{k=1}^{L} P\left(\varphi_{X}\left(k\right)\right) \ln P\left(\varphi_{X}\left(k\right)\right) \tag{5}$$

where *L* is the number of bins in the histogram and *P*(ϕ<sub>X</sub>(*k*)) is the probability of observing ϕ<sub>X</sub> in bin *k*. Note that because the mutual information index is sensitive to the number of bins *L*, we consistently used 50 bins in the generation of all histograms to allow for comparisons across experiments.
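A histogram-based estimate of the mutual information of two phase series can be sketched in Python using the standard identity I(X; Y) = H(X) + H(Y) − H(X, Y), with each entropy computed from a probability histogram with 50 bins. The function name and the binning of phases wrapped to [0, 2π) are assumptions of this sketch.

```python
import math

def phase_mutual_information(phi_in, phi_out, n_bins=50):
    """Plug-in mutual information (nats) from marginal and joint histograms."""
    two_pi = 2.0 * math.pi

    def bin_index(p):
        return min(int((p % two_pi) / two_pi * n_bins), n_bins - 1)

    n = len(phi_in)
    p_in = [0.0] * n_bins
    p_out = [0.0] * n_bins
    p_joint = {}
    for a, b in zip(phi_in, phi_out):
        i, j = bin_index(a), bin_index(b)
        p_in[i] += 1.0 / n
        p_out[j] += 1.0 / n
        p_joint[(i, j)] = p_joint.get((i, j), 0.0) + 1.0 / n

    def entropy(probs):
        return -sum(p * math.log(p) for p in probs if p > 0.0)

    return entropy(p_in) + entropy(p_out) - entropy(p_joint.values())

# Bin-centered phases: every (input, output) bin pair visited once ...
centers = [2.0 * math.pi * (b + 0.5) / 50 for b in range(50)]
indep_in = [centers[k // 50] for k in range(2500)]
indep_out = [centers[k % 50] for k in range(2500)]
mi_independent = phase_mutual_information(indep_in, indep_out)
# ... versus an output phase that is fully determined by the input phase.
mi_locked = phase_mutual_information(indep_in, indep_in)
```

With 50 bins the estimate is bounded above by ln 50 (about 3.9 nats), which is reached only for perfect one-to-one dependence of uniformly distributed phases.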

To determine the significance of the observed input-output coupling, a surrogate data testing scheme was needed to represent the null hypothesis of independent pairs of oscillatory activity. To generate bootstrapped distributions of the null hypothesis, we randomized the inter-event intervals of both the input and output before computing the instantaneous phases (500 surrogates/trial). The mutual information of these surrogate time series was then computed to generate the bootstrapped mutual information histogram. The observed mutual information value of the coupling was considered significant if it fell above the 99% confidence interval of the bootstrap distribution. This criterion served as the basis for the identification of intermediate and severe *Mecp2*<sup>−/+</sup> entrainment defects discussed in **Figures 4** and **5**.
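The surrogate scheme can be sketched end to end in Python as follows. All function names, the 100 Hz phase sampling rate, the fixed random seed, and the use of the 99th percentile of the surrogate values as the cutoff are assumptions of this sketch and not details confirmed by the text.

```python
import bisect
import math
import random

def sample_phase(events, times):
    """Wrapped linear phase (0 to 2*pi) of an event train at each sample time."""
    out = []
    for t in times:
        k = bisect.bisect_right(events, t) - 1
        out.append(2 * math.pi * (t - events[k]) / (events[k + 1] - events[k]))
    return out

def mutual_information(a, b, n_bins=50):
    """Plug-in mutual information (nats) of two wrapped phase series."""
    n = len(a)
    def idx(p):
        return min(int(p / (2 * math.pi) * n_bins), n_bins - 1)
    joint, pa, pb = {}, {}, {}
    for x, y in zip(a, b):
        i, j = idx(x), idx(y)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        pa[i] = pa.get(i, 0) + 1
        pb[j] = pb.get(j, 0) + 1
    def h(counts):
        return -sum(c / n * math.log(c / n) for c in counts.values())
    return h(pa) + h(pb) - h(joint)

def shuffle_intervals(events, rng):
    """Rebuild an event train from its randomly permuted inter-event intervals."""
    intervals = [b - a for a, b in zip(events, events[1:])]
    rng.shuffle(intervals)
    out = [events[0]]
    for d in intervals:
        out.append(out[-1] + d)
    return out

def surrogate_test(in_events, out_events, n_surrogates=500, fs_hz=100.0, seed=0):
    """Return (observed MI, 99th-percentile surrogate MI, significant?)."""
    rng = random.Random(seed)
    t0 = max(in_events[0], out_events[0])
    t1 = min(in_events[-1], out_events[-1])
    times = [t0 + i / fs_hz for i in range(int((t1 - t0) * fs_hz))]
    def mi(a, b):
        return mutual_information(sample_phase(a, times), sample_phase(b, times))
    observed = mi(in_events, out_events)
    null = sorted(mi(shuffle_intervals(in_events, rng),
                     shuffle_intervals(out_events, rng))
                  for _ in range(n_surrogates))
    cutoff = null[(99 * n_surrogates + 99) // 100 - 1]  # ceil(0.99 * n) - 1
    return observed, cutoff, observed > cutoff
```

Calling `surrogate_test(input_onsets, output_onsets)` on the onset times of one trial returns the observed value, the cutoff, and the significance decision; because shuffling preserves the set of intervals, only the temporal ordering of cycles is destroyed in the surrogates.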

All data were expressed as mean ± SEM. Unless stated otherwise, we applied one-way repeated measures ANOVA to determine the significance of the results. If significant, we used a Bonferroni *post hoc* test to determine specific differences.

## **RESULTS**

Representative traces of PNA from wild-type and *Mecp2*<sup>−/+</sup> baseline breathing patterns are shown in **Figure 2A**. The duration of the respiratory period was not significantly different between genotypes (**Figure 2B**), but the variability of the respiratory period [CV(Ttot)] was greater in *Mecp2*<sup>−/+</sup> than wild-type mice (*Mecp2*<sup>−/+</sup>, 0.23 ± 0.02 versus wild-type, 0.09 ± 0.01; *p* < 0.001; **Figure 2C**). The irregularity of the pattern was characterized by a higher frequency of spontaneous apnea in *Mecp2*<sup>−/+</sup> versus wild-type mice (*Mecp2*<sup>−/+</sup>, 3.7 ± 0.4 apneas/min versus wild-type, 0 ± 0 apneas/min; *p* < 0.001; **Figure 2D**). The intrinsic respiratory oscillation in *Mecp2*<sup>−/+</sup> mice had strong vagal efferent activity (*data not shown*) consistent with patterns reported previously (Stettner et al., 2007; Abdala et al., 2010).

During rhythmic vagal nerve stimulation (**Figure 3A**), inspiratory duration (Ti) decreased in both wild-type and *Mecp2*<sup>−/+</sup> mice as indicated by the pair-wise deviation from the line of identity (wild-type: baseline, 0.20 ± 0.01 s versus stimulation, 0.13 ± 0.001 s, *p* = 5 × 10<sup>−7</sup>; *Mecp2*<sup>−/+</sup>: baseline, 0.15 ± 0.01 s versus stimulation, 0.10 ± 0.01 s, *p* = 5 × 10<sup>−4</sup>; **Figure 3B**), consistent with the role of this peripheral sensory modality in the inspiratory off-switching mechanism. Further, the reduction in Ti induced by vagal stimulation tended to be smaller in *Mecp2*<sup>−/+</sup> versus wild-type mice, though this difference was not significant (*Mecp2*<sup>−/+</sup>, −5.0 ± 1.0% versus wild-type, −7.1 ± 0.1%, *p* = 0.065; **Figure 3C**).

To test our hypothesis, we analyzed the input-output phase-locking between rhythmic vagal stimulation and the central respiratory oscillation (**Figures 4** and **5**). To characterize the significance of input-output coupling in individual trials, we relied on the mutual information index applied in conjunction with a bootstrapping approach wherein surrogate time series were generated via shuffling the inter-burst intervals (**Figures 4D,E** and **5A–G**; see Materials and Methods). As expected, wild-type mice had significant input-output entrainment in all trials (19/19 trials). By contrast, entrainment varied in *Mecp2*<sup>−/+</sup> mice: a severe group (6/11 trials) had a complete loss of input-output coupling; and an intermediate group (5/11 trials) had weak, but still significant input-output phase-locking (**Figure 4**). To fully characterize the changes in input-output coupling, we analyzed the relative phase difference time series (**Figures 4A,B**), the input-output cross-correlogram (**Figure 4C**), and the mutual information of the instantaneous phases (**Figures 4D,E**).

Representative traces of the relative phase time series during closed-loop stimulation are presented in **Figure 4A** (*top panels*). Entrainment between input and output is observed as epochs with a slope near 0, whereas sharp spikes in the time series are indicative of phase slips (**Figure 4A**). To characterize the duration and latency to input-output phase-locking, we measured the phase coherence, or mean phase angle on the unit circle, using a sliding windowed algorithm (**Figure 4A**, *bottom panels*). Phase-locked epochs within the trial were indicated by contiguous time regions where the phase coherence was >0.9 (**Figure 4A**, *shaded regions in top panels, horizontal lines in bottom panels*). *Mecp2*<sup>−/+</sup> mice had a strong tendency toward greater latency to input-output phase-locking from the beginning of a stimulation trial relative to wild-type mice (*Mecp2*<sup>−/+</sup>, 11.3 ± 3.3 s versus wild-type, 3.7 ± 1.4 s, *p* = 0.07). Further, severe *Mecp2*<sup>−/+</sup> mice had a significantly greater latency to phase-locking compared to intermediate or wild-type mice (severe, 18.2 ± 3.9 s versus wild-type, 3.7 ± 1.4 s, *p* < 0.001, versus intermediate, 3.0 ± 2.1 s, *p* < 0.01, **Figure 5A**). Wild-type mice had longer durations of stable entrainment relative to *Mecp2*<sup>−/+</sup> mice (wild-type, 7.8 ± 0.8 s versus *Mecp2*<sup>−/+</sup>, 4.5 ± 0.4 s, *p* < 0.01, versus severe, 3.7 ± 0.3 s, *p* < 0.05, **Figure 5B**). Compared to the severe group, intermediate *Mecp2*<sup>−/+</sup> mice had a tendency for longer durations of entrainment, but this was not significant. Further, *Mecp2*<sup>−/+</sup> mice also showed a mild tendency for an increased number of phase slips during stimulation trials relative to wild-type mice (*Mecp2*<sup>−/+</sup>, 37.5 ± 6.9 slips/trial versus wild-type, 24.9 ± 5.4 slips/trial, *p* = 0.37, **Figure 4C**). Finally, the severe *Mecp2*<sup>−/+</sup> group had fewer bouts of input-output phase-locking (3/6 trials), whereas the intermediate *Mecp2*<sup>−/+</sup> group consistently showed short bouts of input-output phase-locking (5/5 trials).

Representative relative phase histograms are shown in **Figure 4B**. A preferred relative phase between the vagal input and phrenic output was observed in both wild-type and the intermediate *Mecp2*<sup>−/+</sup> group, but not the severe *Mecp2*<sup>−/+</sup> group, which had a more uniform circular distribution of relative phases during rhythmic stimulation trials. However, all distributions had a measurable directionality as indicated by the Rayleigh test for deviance from circular uniformity, indicative of a functional HBR. For the group, the maximum of the relative phase histogram and the synchronization index – the mean resultant vector of the relative phase time series – were both greater in wild-type relative to *Mecp2*<sup>−/+</sup> mice [Max (Rel. Phase Hist.): wild-type, 0.66 ± 0.05 versus *Mecp2*<sup>−/+</sup>, 0.48 ± 0.08, *p* < 0.05, wild-type versus severe, 0.28 ± 0.02, *p* < 0.001, severe versus intermediate, 0.71 ± 0.10, *p* < 0.01, **Figure 5D**; and synchronization index: wild-type, 0.67 ± 0.04 versus *Mecp2*<sup>−/+</sup>, 0.44 ± 0.09, *p* < 0.05, wild-type versus severe, 0.19 ± 0.03, *p* < 0.001, severe versus intermediate, 0.72 ± 0.08, *p* < 0.001, **Figure 5F**].

We also computed the transfer function of the system during rhythmic feedback stimulation as a measure of input-output phase-locking (**Figure 4C**). Cross-correlograms were periodic with the successive peaks and troughs decaying monotonically with increasing lag. While the qualitative structure of the functions was not changed between wild-type and *Mecp2*<sup>−/+</sup> mice, the peak of the transfer function decreased in *Mecp2*<sup>−/+</sup> mice [Max (Cross-correlation): *Mecp2*<sup>−/+</sup>, 0.23 ± 0.05 versus wild-type, 0.50 ± 0.04, *p* < 0.001]. Wild-type and intermediate mice also significantly differed from the severe group [Max (Cross-correlation): wild-type, 0.23 ± 0.05 versus severe, 0.11 ± 0.01, *p* < 0.001, severe versus intermediate, 0.38 ± 0.05, *p* < 0.05, **Figure 5E**].

Finally, we characterized the joint probability distribution functions of the instantaneous input- and output-phases by computing their mutual information, which quantifies the general dependence between the phase of the input and the phase of the output. To determine the significance of the observed entrainment, we performed bootstrap analyses by shuffling the inter-event intervals and re-computing the mutual information of the instantaneous phases. Phase-locking, characterized by clear banding in the joint probability distribution function, was observed in the wild-type and intermediate *Mecp2*<sup>−/+</sup> group (**Figure 4D**, left and center panels respectively), whereas the uniform joint probability distribution function of the severe *Mecp2*<sup>−/+</sup> group (**Figure 4D**, right panel) reflected the drifting of the instantaneous phases. Representative bootstrapped mutual information histograms were roughly Gaussian, though bounded >0 because of the rarity of obtaining a perfectly uniform joint probability distribution (**Figure 4E**).

#### **FIGURE 4 | Continued**

regions in top panels). Mecp2<sup>−/+</sup> mice had increased latency to and reduced duration of entrainment relative to wild-type mice. **(B)** Representative histograms of the instantaneous phase difference are shown for a wild-type (left panel) and Mecp2<sup>−/+</sup> mice (middle and right panels). In all cases, even the severe Mecp2<sup>−/+</sup> mice, the distribution was significantly different from a uniform circular distribution as determined by the Rayleigh test for circular non-uniformity. **(C)** Representative cross-correlograms are shown for a wild-type (left panel), intermediate Mecp2<sup>−/+</sup> (middle panel), and severe Mecp2<sup>−/+</sup> (right panel) mice. The maximum of the cross-correlogram was reduced in Mecp2<sup>−/+</sup> versus wild-type mice. **(D)** Representative joint probability histograms of the instantaneous input- and output-phases depict the strength of input-output entrainment in the intensity of banding. Entrainment was strong in wild-type and intermediate Mecp2<sup>−/+</sup> preparations, but was abolished in severe Mecp2<sup>−/+</sup> mice. **(E)** The significance of the observed input-output entrainment was determined by generating bootstrap distributions of the mutual information of the instantaneous phases. Surrogate instantaneous phase time series were generated by shuffling the input- and output-inter-event intervals before determining the phase. The observed value of the mutual information is indicated by the arrowheads on the abscissa. The upper-bound of the 99% confidence interval is indicated by dashed vertical lines. Wild-type and intermediate Mecp2<sup>−/+</sup> mice always showed significant input-output entrainment, whereas severe Mecp2<sup>−/+</sup> mice did not have significant input-output entrainment.

The input-output entrainment was considered significant if the observed mutual information was greater than the 99% confidence interval of the bootstrap distribution. Five of 11 *Mecp2*<sup>−/+</sup> trials had significant entrainment according to the bootstrap results and were thereby classified as the intermediate *Mecp2*<sup>−/+</sup> phenotype. For the group, the mutual information of the instantaneous phases was greater in wild-type relative to *Mecp2*<sup>−/+</sup> mice (mutual information: wild-type, 0.76 ± 0.07 versus *Mecp2*<sup>−/+</sup>, 0.44 ± 0.14, *p* < 0.05). Severe mice also had significantly weaker entrainment as measured by the mutual information of the instantaneous phases compared to both wild-type and intermediate mice (severe, 0.11 ± 0.02 versus wild-type, 0.76 ± 0.07, *p* < 0.001, versus intermediate, 0.85 ± 0.19, *p* < 0.01, **Figure 5G**).

#### **DISCUSSION**

Imposing rhythmic vagal feedback stimulation in the *in situ* preparation decreased Ti and evoked robust bouts of input-output phase-locking in wild-type mice. Contrary to our hypothesis, *Mecp2*<sup>−/+</sup> mice had significantly weaker input-output phase-locking, though the decrease in Ti during vagal feedback stimulation suggested that the HBR was still intact. Using mutual information with bootstrapped surrogate distributions to evaluate significant input-output entrainment, *Mecp2*<sup>−/+</sup> mice were separated into intermediate and severe entrainment phenotypes consistent with the mosaic expression of MeCP2. Severe *Mecp2*<sup>−/+</sup> mice completely lost input-output entrainment, whereas intermediate *Mecp2*<sup>−/+</sup> mice had significant input-output entrainment, but it was weaker than in wild-type mice. Together, our findings identify a compensatory adaptation of the MeCP2-deficient respiratory network that decouples the respiratory rhythm from vagal feedback inputs.

#### **TECHNICAL CONSIDERATIONS**

In the present study, we assessed the ability of the isolated adult brainstem respiratory network to entrain to rhythmic vagal stimulation as a model of the *in vivo* closed-loop condition. As noted in the introduction, the presence of stable entrainment depends on two factors: (1) the frequency difference between the oscillators, e.g., the weakly coupled oscillator network that comprises the respiratory rhythm generator, and the periodic input, e.g., the rhythmic vagal stimulation; and (2) the strength of the coupling between the oscillator and the input. We controlled for frequency differences by tuning the fictive vagal feedback frequency to that of each preparation. Thus, even though the *Mecp2*<sup>−/+</sup> mice used in this study had a slightly increased period *in situ*, this did not prevent stable entrainment because they received a suitably slower fictive feedback input. Similarly, we controlled for differences in the strength of coupling by stimulating the vagus nerve at the threshold for evoking HBR-like responses. Additionally, for the purpose of investigating closed-loop control of respiratory behavior, the utilization of the *in situ* preparation was critical because it maintains an intact pontomedullary axis, spares sensory afferent and motoneuronal efferent pathways, and produces an *in vivo*-like respiratory rhythm, allowing re-introduction of fictive vagal feedback without confounding changes in chemosensory and baroreceptor afferent pathways (Paton, 1996). Further, the absence of anesthesia was particularly important for investigating the MeCP2-deficient breathing phenotype, as breathing arrhythmias in these mice are reduced by anesthetics (Viemari et al., 2005; Abdala et al., 2010).

A key caveat of our study is that fictive feedback was delivered by stimulating whole vagal nerve bundles, which contain fibers from three types of pulmonary receptors: slowly adapting receptor (SAR) fibers, rapidly adapting receptor (RAR) fibers, and C-fibers. In rats, SAR and RAR fibers can be stimulated preferentially because they are thick and myelinated. Accordingly, a short pulse duration and a low stimulus current were chosen to activate myelinated rather than unmyelinated fibers. Moreover, vagal stimulation, unlike lung inflation, has been shown to activate PI output recorded from the pharyngeal branch of the vagus (Hayashi and McCrimmon, 1996). However, in that study, the authors observed similar effects on inspiratory and expiratory phase durations when comparing vagal stimulation- and lung inflation-induced HBRs, suggesting that the functional effects of vagal stimulation and lung inflation on network output are similar enough to warrant such comparison. These authors went on to use this same paradigm of vagal stimulation to identify neurons in the ventrolateral medulla whose activities are modulated in a paucisynaptic fashion to mediate the HBR (Hayashi et al., 1996). Similarly, in the present study and others, we observe HBR-like responses to vagal stimulation consistent with those reported for lung inflation (Karczewski et al., 1980; Budzinska et al., 1981; Siniaia et al., 2000; Dutschmann et al., 2009). Moreover, the respiratory network readily entrains to rhythmic vagal stimulation in rats (Dutschmann et al., 2009) as well as in our fictive feedback trials in wild-type mice. However, from the present findings, we can conclude only that the vagal feedback entrainment behavior is lost in the MeCP2-deficient respiratory network; further experiments are needed to dissect the mechanistic basis for the loss of functional connectivity between input and output.

#### **RELEVANCE TO RESPIRATORY ABNORMALITIES IN RETT SYNDROME**

Given previous findings of hyperexcitability in TS-nTS synapses and exaggerated HBR-like responses to vagal stimulation, we hypothesized that entrainment between peripheral feedbacks and the respiratory rhythm should be enhanced in MeCP2-deficient mice (Stettner et al., 2007; Kline et al., 2010; Song et al., 2011). Instead, by using rhythmic vagal stimulation, we observed a reduction in input-output entrainment, suggesting that despite the exaggerated HBR responses observed during vagal stimulation with continuous trains, vagal feedback appears to be filtered by a compensatory adaptation of the network such that rhythmic inputs above the threshold for eliciting phase resetting have little consistent effect on the respiratory rhythm. Previous findings support a role for dysfunctional postnatal maturation of the KFn in the loss of phase-locking with afferent feedbacks. As mentioned earlier, the KFn is a key determinant of the PI motor pattern because local blockade of NMDAergic transmission results in apneusis (Fung et al., 1994; Ling et al., 1994; Dutschmann and Herbert, 2006). Moreover, the KFn has reciprocal connectivity with the vl NTS such that the KFn can gate the influence of peripheral inputs on the respiratory rhythm (Herbert et al., 1990; Ezure et al., 2002; Dutschmann and Dick, 2012). Dutschmann et al. (2009) demonstrated that after postnatal maturation, repeated trials of vagal stimulation lead to an anticipatory transition from inspiration to expiration that precedes the arrival of vagal stimuli. This learning process also depended on NMDAergic transmission, which has been shown to mature postnatally with a similar developmental time course (Kron et al., 2008). In MeCP2-deficient mice, glutamate microinjection in the KFn results in an exaggerated PI apnea (Stettner et al., 2007). Moreover, the response to constant vagal stimulation shows a loss of habituation and desensitization (Stettner et al., 2007; Song et al., 2011), which have previously been shown to depend on the KFn (Siniaia et al., 2000). These data lead to the speculation that ponto-vagal interactions may be the critical factor mediating the irregular breathing pattern in *Mecp2*<sup>−/+</sup> mice.

Alternatively, the noise in the respiratory network is likely a critical factor in preventing stable coupling between the network and its peripheral feedbacks in MeCP2-deficient mice. From the phase approximation model, we know that noise reduces the parameter space associated with stable phase-locked dynamics between an oscillator and a rhythmic forcing. Thus, if the respiratory rhythm generator itself is more variable, this would prevent stable entrainment with the dynamics of the periphery. Accordingly, variability in the respiratory rhythm recorded from MeCP2-deficient mice is present in the absence of peripheral feedbacks: both in the isolated CPG *in vitro* (Viemari et al., 2005), as well as *in situ* where the network is intact, but functionally isolated from peripheral feedback (Stettner et al., 2007; Abdala et al., 2010).
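To make the role of noise concrete, a textbook form of the phase approximation for a periodically forced oscillator (the noisy Adler equation) can be written as below. The symbols are introduced here for illustration only and are not taken from the study: $\psi$ is the relative phase between the respiratory rhythm and the stimulus, $\Delta\omega$ is the frequency difference, $\varepsilon$ is the coupling strength, $\xi(t)$ is white noise, and $\sigma$ is its intensity. The sinusoidal coupling term is an assumption of this sketch.

$$
\frac{d\psi}{dt} = \Delta\omega - \varepsilon \sin\psi + \sigma\,\xi(t)
$$

Without noise ($\sigma = 0$), a stable phase-locked solution exists only when $|\Delta\omega| \le \varepsilon$. With $\sigma > 0$, fluctuations can push $\psi$ over the barrier separating neighboring locked states, producing phase slips. Slips remain rare only when $|\Delta\omega|$ lies well inside this boundary, so the range of frequency differences and coupling strengths that yields effectively stable phase-locking shrinks as $\sigma$ grows.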

#### **PHASE SYNCHRONIZATION MEASURES FOR INVESTIGATING CLOSED-LOOP RESPIRATORY BEHAVIOR**

Over the past decade, several reports have explored the restoration of rhythmic vagal feedback in reduced preparations. Mellen and Feldman (2000) first re-introduced phasic lung inflation in the *en bloc* preparation, which was modified to maintain the lungs and vagal nerve pathways. Though they established that the medullary components were sufficient to evoke the HBR mechanism, as evidenced by a reduction in inspiratory duration, their preparation did not include pontine components that also modulate the PI motor pattern. Utilization of the *in situ* preparation overcame this limitation and confirmed that phasic lung inflation reduces inspiratory duration and increases respiratory frequency (Harris and St-John, 2005). Importantly, these earlier approaches utilized closed-loop stimulation wherein lung inflation was triggered by the onset of PNA. Interactions between the respiratory rhythm and rhythmic HBR feedback inputs are also apparent when the stimulation is not coupled with the motor output, as in the present study. As mentioned above, Dutschmann et al. (2009) used this latter approach to demonstrate that the pontine control of PI activity is subject to postnatal maturation and depends on NMDAergic transmission in the KFn. However, in that report, only cycle-triggered averaging was used to demonstrate the presence of functional input-output entrainment. Our approach extends this and earlier methodologies by introducing robust measures of phase synchronization that are necessary to evaluate entrainment. In the present study, the use of surrogate data sets in concert with mutual information of the instantaneous phases as a test statistic permitted assessment of significant phase-locking between the rhythmic input and respiratory motor output in individual stimulation trials. Additionally, the use of instantaneous relative phase and phase coherence allowed us to identify bouts of entrainment within individual stimulation trials. Applying our improved methodology to MeCP2-deficient mice revealed intermediate and severe decrements in input-output entrainment that were not recognized previously. In the future, our approach could be extended with a data-driven modeling approach based on the generic phase oscillator model by incorporating multiple central field recordings within the pontomedullary respiratory column to understand changes in the directionality of coupling within the respiratory network during HBR feedback stimulation (Zhu et al., 2013).
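The identification of entrainment bouts from the instantaneous relative phase and phase coherence, mentioned above, can be illustrated with a short Python sketch. It assumes one common definition of phase coherence, the magnitude of the mean unit vector of the relative phase within a sliding window. The function names, the window length, and the 0.9 threshold are hypothetical choices for this example and are not values reported in the study.

```python
import numpy as np


def sliding_phase_coherence(phi_in, phi_out, window):
    """Phase coherence |<exp(i * relative phase)>| in sliding windows of `window` samples."""
    relative_phase = np.asarray(phi_out, dtype=float) - np.asarray(phi_in, dtype=float)
    unit_vectors = np.exp(1j * relative_phase)
    kernel = np.ones(window) / window
    # Moving average of the unit vectors; values near 1 indicate a stable relative phase.
    return np.abs(np.convolve(unit_vectors, kernel, mode="valid"))


def entrainment_bouts(coherence, threshold=0.9):
    """Return (start, stop) index pairs (stop exclusive) where coherence exceeds threshold."""
    above = np.concatenate(([False], np.asarray(coherence) > threshold, [False]))
    # Transitions False->True mark bout starts, True->False mark bout ends.
    edges = np.flatnonzero(np.diff(above.astype(int)))
    return list(zip(edges[::2], edges[1::2]))
```

Each returned pair indexes windows of the coherence series, so a bout that starts at index `s` begins at sample `s` of the original phase series and covers at least `window` samples.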

On a more general level, the findings of this study raise the possibility that the respiratory rhythm is tuned to the dynamics of the periphery. In locomotor CPGs, central oscillations are coupled to the dynamics of the limb at a frequency between the intrinsic frequencies of the neural oscillator and of the physical limb system (Hatsopoulos, 1996; Chiel and Beer, 1997). A hallmark of this phenomenon, resonance tuning of a CPG, is that the frequency of the intrinsic oscillation is reduced in the absence of feedback (Pearson et al., 1983). Mathematically, considering the most general model of phase-coupled oscillators, this property is a consequence of the fact that the coupling between oscillators is both weak and additive with respect to the intrinsic oscillation frequency. While researchers have not previously considered resonance tuning in the context of respiration, the intrinsic respiratory oscillation frequency is reduced after removal of PSR (Stella, 1938; Dhingra et al., 2011) or peripheral chemoreceptor (Eldridge, 1974; Miller and Tenney, 1975; Hayashi et al., 1983) afferent inputs in cats and rodents. Further investigation of whether resonance tuning of respiratory dynamics may be critical for rhythmogenesis is warranted because the presence of excitatory feedback changes membrane current dynamics underlying phase switching in models of feedback-coupled locomotor CPGs (Spardy et al., 2011). Even though CPGs are defined by their ability to transform a constant drive into a rhythmic output without sensory feedback, in the intact animal, output rhythms are modulated continuously in a closed-loop fashion by peripheral afferent oscillations that, contrary to the assertions of the last decades, may be central to the rhythm generating mechanism.
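The statement that weak, additive coupling places the shared frequency between the two intrinsic frequencies can be illustrated with a minimal pair of phase oscillators. The notation is introduced here for illustration and the sinusoidal coupling function is an assumption: $\theta_n$ and $\theta_p$ are the phases of the neural oscillator and the periphery, $\omega_n$ and $\omega_p$ their intrinsic frequencies, and $\varepsilon_n, \varepsilon_p > 0$ the strengths with which each is driven by the other.

$$
\frac{d\theta_n}{dt} = \omega_n + \varepsilon_n \sin(\theta_p - \theta_n), \qquad
\frac{d\theta_p}{dt} = \omega_p + \varepsilon_p \sin(\theta_n - \theta_p)
$$

In the phase-locked state both oscillators advance at a common frequency $\Omega$ with a constant phase difference $\psi = \theta_p - \theta_n$, so that $\Omega = \omega_n + \varepsilon_n \sin\psi = \omega_p - \varepsilon_p \sin\psi$. Eliminating $\sin\psi$ gives

$$
\Omega = \frac{\varepsilon_p\,\omega_n + \varepsilon_n\,\omega_p}{\varepsilon_n + \varepsilon_p},
\qquad \text{provided } |\omega_p - \omega_n| \le \varepsilon_n + \varepsilon_p .
$$

Because $\Omega$ is a weighted average, it lies between $\omega_n$ and $\omega_p$. Removing feedback to the neural oscillator ($\varepsilon_n = 0$) returns it to $\omega_n$, which is lower than $\Omega$ whenever $\omega_p > \omega_n$.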

#### **ACKNOWLEDGMENTS**

This work was supported by National Institutes of Health Grants HL-080318 (Thomas E. Dick), HL-42131 (David M. Katz), T32 HL-007913 (Rishi R. Dhingra), The Mt. Sinai Health Care Foundation (Roberto F. Galán) and Award I01BX000873 from the Biomedical Laboratory Research and Development Service of the VA Office of Research and Development (Frank J. Jacono). We also wish to thank Dr. Richard Romaniuk and Dr. Miriam Kron for their insightful comments on the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 September 2012; accepted: 01 March 2013; published online: 03 April 2013.*

*Citation: Dhingra RR, Zhu Y, Jacono FJ, Katz DM, Galán RF and Dick TE (2013) Decreased Hering–Breuer input-output entrainment in a mouse model of Rett syndrome. Front. Neural Circuits 7:42. doi: 10.3389/fncir.2013.00042*

*Copyright © 2013 Dhingra, Zhu, Jacono, Katz, Galán and Dick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Vesicular stomatitis virus with the rabies virus glycoprotein directs retrograde transsynaptic transport among neurons *in vivo*

#### *Kevin T. Beier<sup>1</sup>, Arpiar B. Saunders<sup>2</sup>, Ian A. Oldenburg<sup>2</sup>, Bernardo L. Sabatini<sup>2</sup> and Constance L. Cepko<sup>1</sup>\**

*<sup>1</sup> Department of Genetics and Department of Ophthalmology, Harvard Medical School, Harvard University and Howard Hughes Medical Institute, Boston, MA, USA <sup>2</sup> Department of Neurobiology, Harvard Medical School, Harvard University and Howard Hughes Medical Institute, Boston, MA, USA*

#### *Edited by:*

*Eberhard E. Fetz, University of Washington, USA*

#### *Reviewed by:*

*Naoshige Uchida, Harvard University, USA; Jeffrey C. Smith, National Institutes of Health, USA*

#### *\*Correspondence:*

*Constance L. Cepko, Department of Genetics, Department of Ophthalmology, Harvard Medical School, Howard Hughes Medical Institute, Boston, MA 02115, USA. e-mail: cepko@genetics.med.harvard.edu*

Defining the connections among neurons is critical to our understanding of the structure and function of the nervous system. Recombinant viruses engineered to transmit across synapses provide a powerful approach for the dissection of neuronal circuitry *in vivo*. We recently demonstrated that recombinant vesicular stomatitis virus (VSV) can be endowed with anterograde or retrograde transsynaptic tracing ability by providing the virus with different glycoproteins. Here we extend the characterization of the transmission and gene expression of recombinant VSV (rVSV) with the rabies virus glycoprotein (RABV-G), and provide examples of its activity relative to the anterograde transsynaptic tracer form of rVSV. rVSV with RABV-G was found to drive strong expression of transgenes and to spread rapidly from neuron to neuron in only a retrograde manner. Depending upon how the RABV-G was delivered, VSV served as a polysynaptic or monosynaptic tracer, or was able to define projections through axonal uptake and retrograde transport. In animals co-infected with rVSV in its anterograde form, rVSV with RABV-G could be used to begin to characterize the similarities and differences in connections to different areas. rVSV with RABV-G provides a flexible, rapid, and versatile tracing tool that complements the previously described VSV-based anterograde transsynaptic tracer.

**Keywords: vesicular stomatitis virus, transsynaptic infection, rabies, retrograde transneuronal tracing,** *in vivo***, technology, polysynaptic**

## **INTRODUCTION**

Mapping neuronal connectivity in the central nervous system (CNS) of even simple organisms is a difficult task. Recombinant viruses engineered to trace synaptic connections and express transgenes promise to enable higher-throughput mapping of connections among neurons than other methods, e.g., serial reconstruction from electron micrographs (Bock et al., 2011; Briggman et al., 2011). The Pseudorabies (PRV) and Rabies viruses (RABV) have been the best characterized and most utilized circuit tracing viruses to date (Ugolini et al., 1989; Kelly and Strick, 2000). RABV was recently modified by Wickersham and colleagues such that it can travel across only one synapse, allowing for a straightforward definition of monosynaptic connections (Wickersham et al., 2007b). This strategy permitted the first unambiguous identification of retrogradely connected cells from an initially infected cell ("starter cell"), without the need for electrophysiology. Moreover, the starter cell could be defined through the expression of a specific viral receptor that limited the initial infection.

Recently, we created an anterograde monosynaptic virus that complements the previously available retrograde viral tracers (Beier et al., 2011). Vesicular stomatitis virus (VSV), a virus related to RABV, with its own glycoprotein (G) gene (VSV-G), or with a G from the unrelated lymphocytic choriomeningitis virus (LCMV-G), spreads in the anterograde direction across synapses. VSV can be used as a polysynaptic tracer that spreads across many synapses, owing to the fact that the normal, replication-competent form of the virus does not cause serious diseases in humans (Brandly and Hanson, 1957; Johnson et al., 1966; Brody et al., 1967). Whether the virus is a monosynaptic or polysynaptic tracer is determined by the method of delivery of the G gene (**Figure 1A**). Advantages of VSV are that it is well-characterized, is relatively simple in comparison to PRV, and rapidly grows to high titer in tissue culture cells. It is also being developed as a vaccine vector, often using a G of another virus as the immunogen, as well as being developed as a cytocidal agent that will target tumor cells in humans (Balachandran and Barber, 2000; Stojdl et al., 2000, 2003).

Previous studies of the anatomical patterns of transmission, as well as physiological recordings, have shown that the transmission of VSV and RABV among neurons is via synapses (Kelly and Strick, 2000; Wickersham et al., 2007b; Beier et al., 2011). In addition, it has been shown that RABV, as well as lentiviruses with RABV-G in their envelope, travel retrogradely from an injection site (Mazarakis et al., 2001; Wickersham et al., 2007a). We hypothesized that providing a recombinant VSV (rVSV) with the RABV-G would create a retrograde polysynaptic transsynaptic tracer without the biosafety concerns inherent to RABV. Our initial characterization of rVSV with RABV-G showed that indeed it could be taken up as a retrograde tracer (Beier et al., 2011). To determine if it could transmit among neurons following its replication in neurons, and to further analyze the transmission patterns of both the monosynaptic and polysynaptic forms of rVSV with RABV-G, we made injections into several CNS and peripheral locations. In addition, we performed co-infections of rVSV with RABV-G and the anterograde form of rVSV in order to exploit the differences in the directionality of transmission of these two viruses in mapping circuits.

**FIGURE 1 | Synaptic tracing strategies using VSV. (A)** Schematic illustrating the strategies for polysynaptic or monosynaptic retrograde or anterograde transsynaptic transmission of rVSV encoding GFP. The initially infected cell is indicated by an asterisk. VSV encoding a glycoprotein (G) within its genome can spread polysynaptically. The direction of the spread depends on the identity of the glycoprotein. Infected neurons are shown in green. In some cases, the initially infected starter cell can be defined by the expression of an avian receptor, TVA (tagged with a red fluorescent protein). The TVA-expressing neurons can then be specifically infected by rVSVΔG with the EnvA/RABV-G (A/RG) glycoprotein (Wickersham et al., 2007b) on the virion surface [rVSVΔG(A/RG)]. These starter cells are then yellow, due to viral GFP and mCherry from TVA-mCherry expression. For monosynaptic tracing, the G protein is expressed *in trans* in the TVA-expressing cell, and thus complements rVSVΔG to allow transmission in a specific direction. **(B)** Genomic diagrams of rVSV vectors. All VSVs contain four essential proteins: N, P, M, and L. Some viruses encode a G gene in their genome, which allows them to spread polysynaptically. rVSV vectors typically encode a transgene in the first position, while others carry an additional transgene in the G position. **(C)** Morphological characterization of rVSV-infected neurons in several locations within the mouse brain. **(i,ii)** Caudate-putamen (CP) neurons at 4 dpi from an injection of the CP with rVSV(VSV-G) viruses encoding **(i)** CFP or **(ii)** Korange. **(iii)** Labeled neurons of the CA1 region of the hippocampus are shown at 5 dpi following injection into the hippocampus of rVSV(VSV-G) encoding Venus. **(iv,v)** Cortical pyramidal neurons are shown following injection into the CP of rVSV(RABV-G) expressing **(iv)** GFP at 24 hpi, or **(v)** mCherry at 48 hpi. Inset in **(iv)** is a high magnification of the neuron in panel **(iv)**, highlighting labeling of dendritic spines. **(vi)** Multiple viruses can be co-injected into the same animal. Here, individual rVSVΔG(VSV-G) viruses encoding CFP, GFP, Venus, Korange, and mCherry were used to infect the cortex. Scale bars = 50 µm.

## **RESULTS**

#### **VSV CAN ENCODE A VARIETY OF TRANSGENES**

Schematics of viruses created and used throughout this study are shown in **Figure 1**. We created rVSV vector plasmids carrying different transgenes in either the first or fifth genomic positions (**Figure 1B**). After rescuing each virus, we tested the ability of each to express transgenes in different brain regions through intracranial injections (**Figure 1C**). All rVSV vectors drove robust fluorophore expression 1 or 2 days post-infection (dpi) (**Figure 1C**) (van den Pol et al., 2009). In fact, by 12 h post-infection (hpi), labeling was sufficiently bright to image fine morphological details, such as dendritic spines (**Figure 1C**,**iv**).

#### **PHYSIOLOGY OF CELLS INFECTED WITH rVSV ENCODING RABV-G**

To characterize the physiological properties of cells infected with rVSV, we tested a replication-competent rVSV encoding GFP, with RABV-G in the genome in place of VSV-G [hereafter designated rVSV(RABV-G)]. van den Pol et al. reported that hippocampal neurons infected with replication-incompetent (G-deleted or "ΔG") rVSV were physiologically healthy at 12–14 hpi, but were less so by 1 day post-infection (dpi) (van den Pol et al., 2009). Given the known toxicity of both VSV and RABV-G (Coulon et al., 1982), we tested the physiology of cortical pyramidal neurons in the motor cortex (M1) infected with rVSV(RABV-G). Between 12 and 18 hpi, the membrane capacitance, input resistance, resting membrane potential, and current-to-action potential firing relationship were indistinguishable between infected and uninfected neurons (**Figure 2**). However, by 2 dpi, electrophysiological properties were so abnormal in the infected cortical pyramidal cells that physiological measurements could not be made.

#### **VSV EXPRESSES TRANSGENES RAPIDLY IN NEURONS**

The speed and strength of the expression of transgenes encoded by VSV depend upon the gene's genomic position (van den Pol et al., 2009; Beier et al., 2011). Genes in the first position are expressed the most highly, with a decrease in the level of expression in positions more 3′ within the viral plus strand. When GFP was inserted into the first position of VSV, GFP fluorescence was first detectable at approximately 1 hpi in cultured cells (van den Pol et al., 2009).

In order to quantify the relative expression of a fluorescent protein in the first genomic position in neurons, rat hippocampal slices were infected with a replication-incompetent rVSV that expresses mCherry (rVSVΔG, **Figures 1A,B**). This was a ΔG virus which had the RABV-G supplied *in trans* during the preparation of the virus stock [referred to as rVSVΔG(RABV-G)]. Average fluorescence intensity of the infected cells was measured every hour over the course of 18 h. By 4 hpi at 37°C, red fluorescence was clearly visible, and reached maximal levels by approximately 14 hpi (*N* = 3, **Figure 3**). Similar results were obtained with a virus encoding GFP in the first genomic position rather than mCherry (i.e., **Figure 1B**) (*N* = 3).

#### **rVSV(RABV-G) SPREADS TRANSSYNAPTICALLY IN THE RETROGRADE DIRECTION**

We previously demonstrated that rVSV(RABV-G) could be taken up retrogradely by neurons (Beier et al., 2011), but these experiments did not distinguish between direct axonal uptake of the initial inoculum vs. retrograde transsynaptic transmission following viral replication. To distinguish between these two mechanisms and to extend the previous analyses, we conducted further experiments in the mammalian visual system (**Figures 4A–G**). As visual cortex area 1 (V1) does not receive direct projections from retinal ganglion cells (RGCs), but rather receives secondary input from RGCs via the lateral geniculate nucleus (LGN), infection of RGCs from injection of V1 would demonstrate retrograde transmission from cells which supported at least one round of viral replication. Following a V1 injection with rVSV(RABV-G), GFP-positive RGCs were observed in the retina by 3 dpi (*N* = 3; **Figure 4G**). Importantly, viral labeling in the brain was restricted to primary and secondary projection areas, even at 7 dpi. These included the LGN (**Figure 4D**) and the hypothalamus (**Figure 4E**), two areas known to project directly to V1 (Kandel, 2000). Selective labeling was observed in other areas, such as cortical areas surrounding V1 (**Figure 4C**), which project directly to V1, and also in the superior colliculus (SC) stratum griseum centrale, which projects to the LGN (**Figure 4F**). Labeling was also observed in the nucleus basalis, which projects to the cortex, as well as many components of the basal ganglia circuit, which provide input to the thalamus [such as the caudate-putamen (CP), globus pallidus (GP), and the subthalamic nucleus (STn)]. The amygdala, which projects to the hypothalamus, was also labeled. Consistent with a lack of widespread viral transmission, animals did not exhibit signs of disease at 7 dpi.

These data show that rVSV(RABV-G) can spread in a retrograde direction from the injection site, but do not address whether the virus can spread exclusively in the retrograde direction. Directional transsynaptic specificity can only be definitively addressed using a unidirectional circuit. We therefore turned to the primary motor cortex (M1) to CP connection, in which neurons project from the cortex to the CP, but not in the other direction (**Figure 4H**) (Beier et al., 2011). Injections of rVSV(RABV-G) into M1 should not label neurons in the CP if the virus can only label cells across synapses in the retrograde direction. Indeed, at 2 dpi, areas directly projecting to the injection site, including the contralateral cortex, were labeled (**Figure 4I**). Only axons from cortical cells were observed in the CP, with no GFP-labeled cell bodies present in the CP (**Figure 4J**), consistent with lack of anterograde transsynaptic spread. By 3 dpi, a small number of medium spiny neurons (MSNs) in the CP were observed, likely via secondary spread from initially infected thalamic or GP neurons (data not shown).

**rVSV(RABV-G) into the CP.** Slices were cut 12 hpi and recordings were taken over the subsequent 6 h. **(A)** Example spike trains driven by 100, 200, and 400 pA square current pulses lasting 1 s for infected (left) and uninfected (right) neurons. **(B)** A summary plot showing current/action

averages. Infection does not alter the **(C)** input resistance, the **(D)** capacitance, or **(E)** resting membrane voltages (infected cells, *N* = 10, uninfected cells, *N* = 9). Horizontal bars denote mean with standard error of the mean.

#### **PERIPHERAL UPTAKE OF rVSV(RABV-G) AND TRANSMISSION TO THE CNS**

A particular advantage of retrograde viral tracers is the ability to label CNS neurons projecting to peripheral sites. This has been a powerful application of both RABV and PRV (Ugolini et al., 1989; Standish et al., 1994). To test if rVSV(RABV-G) could also perform this function, we examined the innervation of the dura surface by neurons of the trigeminal ganglion, a neuronal circuit thought to be involved in migraine headaches (Penfield and McNaughton, 1940; Mayberg et al., 1984). These neurons have axons, but not canonical dendrites, and send projections into the spinal cord and brainstem. Therefore, the only way trigeminal neurons could become labeled from viral application to the dura is through retrograde uptake of the virus.

We applied rVSV(RABV-G) to the intact dura mater and analyzed the dura, trigeminal ganglion, and CNS for labeling (**Figure 4K**). At the earliest time point examined, 3 dpi, we observed axons traveling along the dura, but little other evidence of infection (**Figure 4L**). No labeled neuronal cell bodies on the dura were observed, consistent with the lack of neurons on this surface. In contrast, we did find labeled cell bodies in the trigeminal ganglion (**Figure 4M**). No infection was seen in the CNS, even at 4 dpi, consistent with the lack of inputs from the brain into the trigeminal ganglion (*N* = 4 animals).

#### **THE KINETICS OF RETROGRADE TRANSSYNAPTIC SPREAD**

To further characterize patterns and kinetics of viral transmission and directional specificity of transsynaptic spread, injections of rVSV(RABV-G) were made into the CP (**Figure 5A**). In order to determine which cells were labeled by direct uptake of virus in the inoculum, a separate set of animals was injected into the CP with the replication-incompetent rVSVΔG(RABV-G) (*N* = 3 animals, analyzed 3 dpi). Cells labeled by rVSVΔG(RABV-G) were observed in the CP, GP, substantia nigra (SN), thalamus, and layers 3 and 5 of the cortex, consistent with infection at the axon terminal and retrograde labeling of cell bodies of neurons known to project directly to the CP (**Figure 5C**) (Albin et al., 1995). Areas labeled by CP injection are indicated in **Figure 5B**.

The patterns of spread for the replication-competent rVSV(RABV-G) were characterized over the course of 1–5 dpi (**Figures 5D–H**). During this interval, progressively more cells in infected regions were labeled by rVSV(RABV-G), including within the CP, nucleus basalis, cortex, and GP (listed in **Figure 5B**). In addition, more cortical cells were labeled in clusters near cortical pyramidal neurons, both ipsilateral and contralateral to the injected side, including neurogliaform cells (data not shown). These data are in contrast to those observed following infection with an anterograde transsynaptic tracing virus, such as rVSV with its own G gene, rVSV(VSV-G) (**Figure 5B**). At 3 dpi following rVSV(VSV-G) injection into the CP, the cerebral cortex was not labeled, but regions receiving projections from the CP, such as the STn, GP, and SN, were labeled (Beier et al., 2011).

In order to investigate other areas for evidence of cell-to-cell retrograde transsynaptic spread, the nucleus basalis was examined following infection of the CP with replication-competent rVSV(RABV-G). The nucleus basalis was labeled by 2 dpi (**Figures 5E–H**), consistent with at least a single transsynaptic jump, as this area does not directly project to the CP. The virus appeared to travel transsynaptically at the rate of roughly 1 synapse per day, as evidenced by the lack of labeled neurogliaform cells in the cortex, and lack of neurons in the nucleus basalis at 1 dpi, and label appearing in these cell types/areas at 2 dpi, as previously observed (Beier et al., 2011). Labeling remained well-restricted to the expected corticostriatal circuits at 5 dpi, suggesting that viral spread becomes less efficient after crossing one or two connections, consistent with injections into V1 (**Figure 4**). While glial cells can be infected and were observed near the injection site (van den Pol et al., 2002; Chauhan et al., 2010), infected glial cells away from the injection site generally were not observed.

#### **POLYSYNAPTIC TRACERS CAN BE COMBINED** *in vivo*

One advantage of having both anterograde and retrograde forms of the same virus is that they can be used in parallel, or in tandem, to trace circuitry to and from a single or multiple sites of injection, with each virus having similar kinetics of spread and gene expression. In fact, if different fluorophores are used in different viruses, e.g., rVSV(VSV-G) and rVSV(RABV-G), then the viruses can be co-injected into the same site and their transmission can be traced independently (**Figure 6A**). This is most straightforward if there are no cells at the injection site that are initially infected by both viruses. Co-infected cells can be easily detected, as they would express both fluorescent proteins shortly after injection.

In order to determine whether two viruses would allow simultaneous anterograde and retrograde transsynaptic tracing from a single injection site, a rVSV(VSV-G) expressing Venus and a rVSV(RABV-G) expressing mCherry were injected individually (**Figures 6B–D**) or co-injected (**Figures 6E–G**) into the motor cortex, and brains were examined 3 dpi. The pattern of labeling from the co-injected brains was equivalent to the patterns observed when each virus was injected individually: rVSV(VSV-G) was observed to infect neurons in the cortex, CP, and downstream nuclei, whereas the rVSV(RABV-G) was not observed to infect neurons in the CP, but rather in the thalamus and nucleus basalis (*N* = 4). The initial co-infection rate is dependent upon the dose of the initial inocula. When injecting 3 × 10<sup>3</sup> focus-forming units (ffu) rVSV(VSV-G) and 3 × 10<sup>4</sup> ffu rVSV(RABV-G), no co-infection was observed at the injection site. Thus, co-infection of the same brain region, without co-infection of the same cells, does not alter the spreading behavior of either rVSV(VSV-G) or rVSV(RABV-G).

**FIGURE 4 | rVSV(RABV-G) exhibits polysynaptic retrograde spread** *in vivo***. (A)** Schematics of two parasagittal sections separated by 1.3 mm are shown. rVSV(RABV-G) injected into V1 (black needle) should yield infected cells in the labeled areas shown in green, including RGCs in the retina (panel **ii**). Areas projecting directly to V1, such as the hypothalamus (h), LGN, as well as other cortical areas, can be labeled by direct retrograde uptake of injected virions, whereas RGCs, which project to the LGN, can only be labeled by secondary viral spread. **(B)** rVSV(RABV-G) was injected into V1 (yellow arrowhead), and both the brain and retina were examined 7 dpi. Infection in the brain appeared to be primarily in directly projecting areas, including the surrounding cortices, the LGN (white arrow), and hypothalamus (white arrowhead). Higher magnifications of labeled cells from a V1 injection are shown in panels **(C–G)**. **(C)** somatosensory cortex, 7 dpi; **(D)** LGN, 3 dpi; **(E)** hypothalamus, 3 dpi; **(F)** SC, 3 dpi; **(G)** RGC, 3 dpi. **(H)** Schematic of a coronal section showing rVSV(RABV-G) injected into M1 (black needle). The contralateral cortex (green) should be labeled by this virus, while at early time points such as 2 dpi, the CP, which receives projections from the cortex but does not itself send projections to the cortex, should not (gray). **(I)** Coronal section showing GFP-labeled neurons in M1, imaged 4 dpi. The injection site was in M1, indicated by a yellow arrowhead, with neurons projecting to the injection site indicated by the white arrowhead. **(J)** CP neuronal cell bodies were not labeled, but labeled cortical axon bundles running through the CP were observed (inset shows axon bundles in the area demarcated by the white arrowhead). **(K–M)** rVSV(RABV-G) can trace circuits into the CNS from a peripheral site. **(K)** Parasagittal schematic showing a predicted area of infection following infection of the dura with a retrogradely transported virus. rVSV(RABV-G) was applied to the intact dura (arrow) and if retrograde uptake and transport can occur, trigeminal ganglion neurons that project to the dura (green) should become labeled. **(L)** Examples of axons located on the dura, 3 dpi. Infected neuronal cell bodies were not located on the dura, **(M)** but instead were observed in the trigeminal ganglion. No infection of the brain was observed in these animals. Scale bars: **(B,I,J)** = 1 mm, **(L)** = 100 µm, **(C–G,M)** = 50 µm.

**FIGURE 5 | Time course of rVSV(RABV-G) spread from the CP recapitulates the connectivity of known basal ganglia-thalamo-cortical circuits. (A)** A parasagittal schematic showing the relevant projections into and from the injection site in the CP. Black needle points to injection site, green = primary projecting regions, blue = secondary projecting region. CP, caudate-putamen; GP, globus pallidus; SN, substantia nigra; STh, subthalamic nucleus; Th, thalamus; NB, nucleus basalis. **(B)** Assessment of viral spread from rVSV(RABV-G) and rVSV(VSV-G) injections into the CP. The presence or absence of labeling is indicated by (+) and (−), respectively. The extent of labeling is indicated by the number of (+). Some animals were infected with ΔG viruses to determine which areas were labeled by direct uptake of the virions, rather than by replication and transmission. These were sacrificed at 3 dpi. **(C)** Parasagittal section of a brain infected with VSVΔG(RABV-G). The injection site is marked by a red arrow. Several areas that project directly to the CP were labeled due to direct uptake of the virions, including the cortex, thalamus, and GP (arrowheads), 3 dpi. **(D–H)** Replication-competent rVSV(RABV-G) was injected into the CP (red arrows), and the time course of labeling was monitored for 5 days [**(D)** 1 day, **(E)** 2, **(F)** 3, **(G)** 4, and **(H)** 5 days]. Insets show high magnifications of areas indicated by white arrows. Sections from animals at 1 dpi show labeling consistent with the initial infection [compare to rVSVΔG(RABV-G), panel **C**], while spread to secondarily connected areas, such as the nucleus basalis, was observed at 2 dpi (yellow arrows). Viral spread was relatively restricted to the basal ganglia circuit, even out to 5 dpi. Scale bars = 1 mm.

**FIGURE 6 | Simultaneous anterograde and retrograde transsynaptic circuit tracing using rVSV. (A)** Connectivity schematics of parasagittal sections indicating patterns of spread from injection of M1 (injection needles) with a polysynaptic virus transmitting across synapses in the **(i)** anterograde or **(ii)** retrograde directions. Panel **(iii)** shows the pattern from co-injection of two polysynaptic viruses, one anterograde and one retrograde. Green represents the anterograde virus, red the retrograde virus, and yellow, both. Note that yellow indicates that the area is predicted to host infection by both viruses, with potentially some individual cells showing infection with both viruses. **(B)** The anterograde transsynaptic virus rVSV(VSV-G), when injected alone, labeled M1 as well as anterograde projection areas, such as the CP, GP, and thalamus, whereas **(C)** the retrograde virus rVSV(RABV-G) labeled M1 as well as areas projecting to the cortex, including the thalamus. **(D)** High magnification of thalamic cells shown in **(C)** (white arrow). **(E,F)** Examples taken from a series of parasagittal sections from the same brain of an animal injected with both viruses simultaneously into M1. Co-infection of cells in M1 was not observed **(G)**, and no spurious labeling of anterograde or retrograde projection regions was observed, i.e., the combination of viruses was equal to the sum of each virus injected individually. Insets show high magnifications of thalamic neurons in **(E)** and **(F)** labeled by the two viruses (indicated by the yellow arrows) demonstrating no co-labeling. **(G)** A high magnification view of the injection site in the cortex shown in panel **(F)** (white arrow), showing independent labeling of neurons by each virus. **(H)** A schematic of a parasagittal section depicting the pattern of transmission of an anterograde (green) and retrograde (red) virus injected into two different areas of the basal ganglia circuit. This strategy can be used to connect multiple elements in a circuit. The rVSV(VSV-G) that expressed Venus (labeled cells depicted in green) was injected into M1, while the rVSV(RABV-G) that expressed mCherry was injected into the SN, where it labeled direct pathway MSNs in the CP (yellow). **(I)** Using these coordinates, largely non-overlapping regions of the CP were labeled by these viruses, as shown in **(J)**. Scale bars: **(B,C,E,F,I)** = 1 mm; **(D,G,J)** = 50 µm.

One example of how this dual retrograde and anterograde transsynaptic tracing system can be used is to determine whether three distinct regions are connected, and the directionality of any connections. For example, the anterograde transsynaptic virus can be injected into one region, the retrograde into another, and a third region can then be examined for evidence of labeling by either or both viruses (e.g., **Figure 6H**). To test this possibility, rVSV(VSV-G) was injected into the motor cortex, rVSV(RABV-G) was injected into the substantia nigra pars reticulata (SNr), and animals were sacrificed at 3 dpi. We observed that cells were singly labeled, either with Venus [rVSV(VSV-G)] or with mCherry [rVSV(RABV-G)], and were located largely in different regions of the CP (**Figures 6I,J**) (*N* = 3 animals). These results suggest that the anterograde connections from the cells infected with rVSV(VSV-G) in M1 were with CP MSNs that did not project to the region of the SNr injected with rVSV(RABV-G).

#### **VSV CAN TRACE MONOSYNAPTICALLY CONNECTED CIRCUITS IN THE RETROGRADE DIRECTION**

In addition to polysynaptic tracing, VSV can be modified to trace circuits monosynaptically (Beier et al., 2011). With RABV, this was achieved *in vivo* by first infecting with an adeno-associated virus (AAV) expressing TVA, a receptor for an avian retrovirus, and RABV-G (Wall et al., 2010). This was followed 3 weeks later by infection with a ΔG RABV with an EnvA/RABV-G chimeric glycoprotein on the virion surface (Wickersham et al., 2007b), which allowed infection specifically of the cells expressing TVA. A similar strategy was used to test rVSV's ability to monosynaptically trace retrogradely connected neurons *in vivo*. Inputs to choline acetyltransferase (ChAT)-expressing neurons in the striatum were used for this test. These neurons primarily receive input from the cortex and the thalamus (Thomas et al., 2000; Bloomfield et al., 2007) (**Figure 7A**). In order to mark this population, we crossed ChAT-Cre mice to Ai9 mice, which express tdTomato in cells with a Cre expression history (Madisen et al., 2010). Six-week-old mice from this cross were injected in the CP with two AAV vectors: one expressing a Cre-conditional ("floxed") TVA-mCherry fusion protein, and another expressing a floxed RABV-G. Two weeks later, the mice were injected at the same coordinates with rVSVΔG with the EnvA/RABV-G chimeric glycoprotein on the virion surface [rVSVΔG(A/RG)] (Beier et al., 2011). Cells successfully infected with these two AAV vectors could host infection by a rVSV and should be able to produce rVSV virions with RABV-G on the surface. Such starter cells should also express tdTomato and GFP. If rVSV were to be produced, and if it were to transmit across the synapse retrogradely, cortical and thalamic neurons should be labeled by GFP.

Mice injected with these AAV and rVSV viruses were sacrificed 5 days after rVSV infection, and brains were analyzed for fluorescence. As expected for starter cells, some neurons in the CP expressed both tdTomato and GFP (**Figure 7B**). Outside of the CP, small numbers of GFP+ neurons that were not mCherry+ were observed in the cortex (**Figures 7C,D**) and thalamus (**Figure 7E**), consistent with retrograde spread. Control animals not expressing Cre, or not injected with AAV encoding RABV-G, showed no labeled cells in the cortex or thalamus (*N* = 3 for both controls and experimental condition).

## **DISCUSSION**

#### **rVSV(RABV-G) IS A RETROGRADE TRACER IN THE CNS**

Here, we report on the use of rVSV as a retrograde transsynaptic tracer for CNS circuitry. VSV can be modified to encode the RABV-G protein in the viral genome, allowing the virus to replicate and transmit across multiple synaptically connected cells, i.e., as a polysynaptic tracer. Alternatively, if the virus has the G gene deleted from its genome and RABV-G is provided *in trans*, it behaves as a monosynaptic tracer (Beier et al., 2011). Although it has been known for many years that RABV travels retrogradely among neurons (Astic et al., 1993; Ugolini, 1995; Kelly and Strick, 2003), and pseudotyping lentiviruses with RABV-G is sufficient for axonal transport (Mazarakis et al., 2001), the retrograde transmission specificity among neurons had not been clearly shown to be a property of the G protein itself, as it might have been due to other viral proteins in addition to, or instead of, the viral G protein. Since native VSV does not have these retrograde transsynaptic properties (van den Pol et al., 2002; Beier et al., 2011), and the only alteration to the VSV genome was the substitution of the VSV G gene with the G gene of RABV, it is clear that the RABV glycoprotein is responsible for retrograde direction of viral transmission across synapses, at least in the case of rVSV.

**FIGURE 7 | Monosynaptic retrograde tracing using rVSV** *in vivo***. (A)** A schematic of a parasagittal section showing the predicted pattern of monosynaptic retrograde spread from choline acetyltransferase (ChAT)-expressing neurons in the CP to directly connected cells. A combination of two Cre-dependent adeno-associated viruses (AAVs), one expressing a TVA-mCherry fusion protein and the other RABV-G, were injected into the CP of ChAT-Cre/Ai9 animals. This permits expression of the transgenes encoded in the AAVs in cells with a ChAT expression history. Two weeks later, rVSVΔG(A/RG), a G-deleted virus that only infects TVA-expressing neurons, was injected into the same region, and the brain was observed 5 days later. The injection of rVSV into the CP (black needle) should result in infection of TVA-expressing neurons in the CP. From these starter cells, monosynaptic spread could occur only to directly connected inputs such as those in the cortex and thalamus (green). **(B,B′)** Initially infected cells in the CP were both red (TVA-expressing) and green (rVSV infected) (arrow). **B′** shows the red and blue channels only, blue = DAPI. **(C–E)** Examples of rVSV-infected cells in the cortex **(C,E)** and thalamus **(D)** that were infected by monosynaptic transmission from the starter cells (arrows indicate cells infected by transmission), *N* = 3. Scale bars: **(B,D)** = 50 µm, **(C,E)** = 500 µm.

#### **VSV AS A VIRAL VECTOR FOR THE CNS: RAPID GENE EXPRESSION AND GENOME CAPACITY**

The early onset of gene expression from VSV relative to RABV (one hour vs. multiple hours) makes it beneficial in experimental paradigms in which the experiment needs to be done within a narrow window of time, such as tissue slices and explants. In addition, more than one transgene can be encoded in the viral genome without the need for a 2A or IRES element. The use of the first position of the genome enhances the expression level of the transgene inserted at that location, since VSV (and RABV) express genes in a transcriptional gradient; therefore, the first gene is the most highly transcribed (Knipe, 2007). This allows rational prediction of expression levels, so that one can choose the position of insertion of a transgene, or transgenes, according to this gradient and the desired level of expression. The size of the viral capsid is apparently not rigid, allowing for the inclusion of genomes that are substantially larger than the native genome, unlike the rigid capacity of some other viral vectors, such as AAV (Duan et al., 2000; Yan et al., 2000).
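The transcriptional gradient can be pictured with a simple geometric model. The sketch below is illustrative only: it assumes that transcription drops by a fixed fraction at each gene junction, and the default value of 0.3 is a placeholder assumption rather than a figure reported in this study. The function name `relative_expression` is likewise hypothetical.

```python
def relative_expression(position: int, attenuation: float = 0.3) -> float:
    """Expression of the gene at `position` relative to the first gene.

    Assumes (for illustration) that a constant fraction `attenuation` of
    transcription is lost at every gene junction crossed.
    """
    if position < 1:
        raise ValueError("genome positions are numbered from 1")
    return (1.0 - attenuation) ** (position - 1)


# Compare the first (transgene) position with the fifth (G) position.
for pos in range(1, 6):
    print(f"position {pos}: {relative_expression(pos):.3f}")
```

Under these assumptions, a transgene placed in the first position is expressed several-fold more strongly than one placed in the fifth, which is the kind of prediction that could guide the choice of insertion site.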

#### **SUFFICIENCY OF GLYCOPROTEINS TO CONFER DIRECTIONALITY OF rVSV SPREAD ENABLES NOVEL APPLICATIONS**

The fact that VSV can be made to spread anterogradely (Beier et al., 2011) or retrogradely across synapses with the change of a single gene affords several advantages over viral tracers that have not shown such flexibility in the directionality of tracing. In addition to the obvious application of tracing anterograde connections, combinations can be made to exploit the different forms of the virus. One example, which employs simultaneous infection with an anterograde and a retrograde form of VSV, is demonstrated in **Figure 6**. This experiment was designed to address whether the anterograde projections from the cortex to the CP would label the same brain regions as were labeled by a retrograde virus injected into the SN. Although a block of superinfection by the virus may preclude infection of the same cell with multiple rVSVs, adjacent cells could still become labeled by different viruses (Whitaker-Dowling et al., 1983). The observed results could be due to a preferential labeling by the anterograde transsynaptic virus of indirect pathway MSNs in this experiment, which then synapse onto the GP, thereby reflecting a viral bias. Alternatively, it could indicate that the cortical neurons in the injected region largely do not connect to the MSNs that project to the area of the SN injected with the retrograde virus. One further possibility is that too little virus was used to observe co-labeling of a given region. However, given the density of infection (see **Figures 6I,J**), this last possibility seems unlikely. Additionally, the spread of the polysynaptic rVSV(RABV-G) appears to attenuate with increasing numbers of synapses crossed, permitting an analysis of more restricted viral spread. This is fortunate: if spread were to continue unchecked, it would lead to widespread infection and lethality, and reconstruction of connectivity would be more difficult. This reduced efficiency appears to also hold for the monosynaptic form of VSV complemented with RABV-G, as the efficiency of transmission appeared lower than in the comparable experiment with RABV (Watabe-Uchida et al., 2012). This is likely due to viral attenuation when VSV-G is replaced with RABV-G.

#### **ADVANTAGES OF VSV OVER OTHER VIRAL TRACERS: SAFETY**

We were attracted to the use of VSV as a viral tracer due to its long track record as a safe, replication-competent laboratory agent. Laboratory workers using VSV have not contracted any diseases, and natural VSV infections among human populations in Central America and the southwestern United States (Rodríguez, 2002) occur without evident pathology (Johnson et al., 1966; Brody et al., 1967). VSV was thus an attractive candidate for use as a polysynaptic tracer for CNS studies, which requires an ability to replicate through multiple transmission cycles. Both replication-competent and replication-incompetent forms of VSV are in use under Biosafety Level 2 containment. Replication-competent RABV requires Biosafety Level 3, because infection with replication-competent RABV is almost always fatal to humans, and to mice when injected intracerebrally (Smith, 1981; Knipe, 2007).

Differences in pathogenicity between VSV and RABV are likely due to the ability of RABV to evade the innate immune system, particularly interferon (Hangartner et al., 2006; Junt et al., 2007; Lyles and Rupprecht, 2007; Rieder and Conzelmann, 2009; Iannacone et al., 2010). VSV infection efficiently triggers an interferon response, and it has not evolved a method of escape from this response, unlike RABV (Brzózka et al., 2006). In fact, VSV is being pursued as a vaccine for other viruses, including RABV (Lichty et al., 2004; Publicover et al., 2004; Kapadia et al., 2005; Schwartz et al., 2007; Iyer et al., 2009; Geisbert and Feldmann, 2011). VSV does not typically spread beyond the initially infected site in the periphery (Kramer et al., 1983; Vogel and Fertsch, 1987). This likely is the cause of the minor or absent symptoms in humans and animals infected in nature. Polysynaptic VSV vectors are thus predicted to be much safer than polysynaptic RABV vectors. We have tested this prediction by injecting a series of mice in the footpads and hind leg muscles with rVSV(RABV-G), with the result that no injected animals showed any evidence of morbidity or mortality (Beier, Goz et al., in preparation).

While VSV is safer for laboratory workers than RABV, its main drawback is its rapid cellular toxicity (van den Pol et al., 2009; Beier et al., 2011). Toxicity is due to suppression of cellular transcription and a block in the export of cellular RNAs from the nucleus to the cytoplasm (Black and Lyles, 1992; Her et al., 1997; Ahmed and Lyles, 1998; Petersen et al., 2000; von Kobbe et al., 2000), as well as inhibition of the translation of cellular mRNAs (Francoeur et al., 1987; Jayakar et al., 2000; Kopecky et al., 2001). VSV is much quicker to enact its gene expression program than is RABV, such that cells suffer the toxic effects more quickly than after RABV infection. One aspect of VSV that can be exploited in the future to slow the onset of toxicity is the use of VSV mutants and variants. One such mutant is M51R, which permitted us to conduct physiological analyses of pre- and post-synaptic cells (Beier et al., 2011). We are in the process of examining the transmission properties of this mutant *in vivo*, as well as the effects of other mutations or viral variants on prolonging the health of neurons after infection.

#### **SUMMARY**

rVSV vectors can be used to study the connectivity of neuronal circuitry. In addition to combinations of replication-competent forms of VSV, the replication-incompetent, monosynaptic forms of the virus can be easily combined, without the need to change viruses (Beier et al., 2011). This allows a straightforward way to study both the projections into, and out from, a genetically defined cell population. This can be done with the same viral genome, with the only change needed being the glycoprotein, for the selection of the direction of transmission. This flexibility of VSV makes it a powerful, multi-application vector for studying connectivity in the CNS.

## **MATERIALS AND METHODS**

#### **VIRUS CONSTRUCTION AND PRODUCTION**

All rVSV clones were derived from the rVSVΔG backbone (Chandran et al., 2005). mCherry, Kusabira orange, Venus, and CFP were cloned into the first (GFP) position using XhoI and MscI sites, and VSV-G (a gift from Richard Mulligan, Harvard Medical School, Boston, MA) and RABV-G (a gift from Ed Callaway, Salk Institute, San Diego, CA) were cloned into the fifth (G) position using the MluI and NotI restriction sites. Genes for fluorescent proteins were obtained from Clontech.

Viruses were rescued as previously described (Whelan et al., 1995). At 95% confluency, eight 10 cm plates of BSR cells were infected at an MOI of 0.01. Viral supernatants were collected at 24-h intervals, ultracentrifuged at 21,000 RPM using a SW28 rotor, and resuspended in 0.2% of the original volume. For titering, concentrated viral stocks were applied in a dilution series to 100% confluent BSR cells and plates were examined at 12 hpi. Viral stocks were stored at −80 °C.
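The concentration and titering arithmetic can be sketched as follows. Only the 0.2% resuspension volume comes from the text; the foci count, dilution, and inoculum volume are hypothetical values chosen for illustration, and both function names are assumptions.

```python
def concentration_factor(resuspension_percent: float) -> float:
    """Fold-concentration when a pellet is resuspended in the given
    percentage of the original supernatant volume."""
    return 100.0 / resuspension_percent


def titer_ffu_per_ml(foci: int, dilution: float, inoculum_ml: float) -> float:
    """Titer of the undiluted stock, from foci counted in one well of a
    dilution series (dilution expressed as a fraction, e.g. 1e-6)."""
    return foci / (dilution * inoculum_ml)


fold = concentration_factor(0.2)         # 0.2% of original volume (from the text)
titer = titer_ffu_per_ml(30, 1e-6, 0.1)  # hypothetical: 30 foci, 1e-6 dilution, 0.1 mL
print(f"{fold:.0f}-fold concentration; titer of about {titer:.1e} ffu/mL")
```

Resuspending in 0.2% of the original volume corresponds to a roughly 500-fold concentration, assuming no loss of infectious particles during ultracentrifugation.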

For ΔG viruses, 293T cells were transfected with PEI (Ehrhardt et al., 2006) at 70% confluency on 10 cm dishes with 5 µg of pCAG-RABV-G. Twenty-four hours post-transfection, the cells were infected at an MOI of 0.01 with rVSVΔG expressing either GFP or mCherry. Viral supernatants were collected for the subsequent 4 days at 24-h intervals.

Virus preparations are now available from the Salk GT3 viral core (http://vectorcore.salk.edu/). All plasmids are available from Addgene (http://www.addgene.org/).

#### **AAV VECTORS**

AAV-FLEx-RABV-G and AAV-FLEx-TVA-mCherry plasmids originated from the Lab of Naoshige Uchida (Watabe-Uchida et al., 2012), and virus stocks were generous gifts from Brad Lowell, Harvard Medical School.

#### **INJECTIONS OF MICE**

ChAT-Cre (B6;129S6-Chat<sup>tm1(cre)Lowl</sup>/J) and Ai9 (B6.Cg-Gt(ROSA)26Sor<sup>tm9(CAG-tdTomato)Hze</sup>/J) mice were obtained from the Jackson Laboratory (Madisen et al., 2010).

Eight-week-old CD-1 mice were injected using pulled capillary microdispensers (Drummond Scientific, Cat. No: 5-000-2005), using coordinates from The Mouse Brain in Stereotaxic Coordinates (Franklin and Paxinos, 1997). Injection coordinates (in mm) used were:

Primary Motor Cortex: A/P +1.34 from bregma, L/M 1.7, D/V −1 from pial surface

LGN: A/P −2.46 from bregma, L/M 2, D/V −2.75

Superior Colliculus: A/P −3.88 from bregma, L/M 0.5, D/V −1

CP: A/P +1 from bregma, L/M 1.8, D/V −2.5

Primary Visual Cortex (V1): A/P −3.4 from bregma, L/M 2.5, D/V −0.8.

SNr: A/P −3.28 from bregma, L/M 1.5, D/V −4.25

For multi-color analysis (**Figures 1C,D**), 3 × 10<sup>9</sup> ffu/mL rVSV was injected into various regions. For CP injections, 100 nL of rVSV(RABV-G) or rVSV(VSV-G) at 3 × 10<sup>7</sup> ffu/mL was injected at a rate of 100 nL/min. For the replication-incompetent viruses, 100 nL of 1 × 10<sup>7</sup> ffu/mL rVSVΔG(RABV-G) or rVSVΔG(VSV-G) was injected. In the motor cortex, 100 nL of 1 × 10<sup>7</sup> ffu/mL rVSV(RABV-G) was injected, and mice were harvested 2 dpi. For V1 injections, 100 nL of 3 × 10<sup>10</sup> ffu/mL rVSV(RABV-G) was injected, and mice were examined 3 or 7 dpi.

For infections of the dura mater, 1 µL of 3 × 10<sup>10</sup> ffu/mL rVSV(RABV-G) was applied to the surface of the dura. The virus was allowed to absorb, the surface was subsequently covered in bone wax, and the wound was sutured.

For co-injections of virus into the same animal, 100 nL of a combination of 3 × 10<sup>7</sup> ffu/mL rVSV(VSV-G) and 3 × 10<sup>8</sup> ffu/mL rVSV(RABV-G) was co-injected into the motor cortex, and brains were examined 3 dpi. For injections of the viruses into different regions, 100 nL of 3 × 10<sup>7</sup> ffu/mL rVSV(VSV-G) was injected into M1, and 100 nL of 3 × 10<sup>8</sup> ffu/mL rVSV(RABV-G) into the SNr, and brains were examined 3 dpi. A lower titer of rVSV(VSV-G) was used, as rVSV(RABV-G) is attenuated.
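As a consistency check, injected volume and titer can be converted to an absolute dose. The short sketch below assumes that each stated titer is the final concentration of that virus in the 100 nL delivered; under that assumption it reproduces the 3 × 10<sup>3</sup> and 3 × 10<sup>4</sup> ffu doses quoted in the Results. The function name `dose_ffu` is illustrative.

```python
NL_PER_ML = 1e6  # nanoliters per milliliter


def dose_ffu(volume_nl: float, titer_ffu_per_ml: float) -> float:
    """Number of focus-forming units delivered in a single injection."""
    return volume_nl * titer_ffu_per_ml / NL_PER_ML


vsv_g_dose = dose_ffu(100, 3e7)   # rVSV(VSV-G): 100 nL at 3e7 ffu/mL
rabv_g_dose = dose_ffu(100, 3e8)  # rVSV(RABV-G): 100 nL at 3e8 ffu/mL
print(f"rVSV(VSV-G): {vsv_g_dose:.0e} ffu, rVSV(RABV-G): {rabv_g_dose:.0e} ffu")
```

The tenfold difference in dose follows directly from the tenfold difference in titer, since the injected volume is the same for both viruses.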

All mouse work was conducted in biosafety containment level 2 conditions and was approved by the Longwood Medical Area Institutional Animal Care and Use Committee.

#### **SLICE PREPARATION AND PHARMACOLOGY**

Recordings were made from cortical pyramidal neurons in slices taken from postnatal day 12–18 mice, inoculated in the CP 12–18 h prior with rVSV(RABV-G). Coronal slices (300 µm thick) were cut in ice-cold external solution containing (in mM): 110 choline, 25 NaHCO<sub>3</sub>, 1.25 NaH<sub>2</sub>PO<sub>4</sub>, 2.5 KCl, 7 MgCl<sub>2</sub>, 0.5 CaCl<sub>2</sub>, 25 glucose, 11.6 Na-ascorbate, and 3.1 Na-pyruvate, bubbled with 95% O<sub>2</sub> and 5% CO<sub>2</sub>. Slices were then transferred to artificial cerebrospinal fluid (ACSF) containing (in mM): 127 NaCl, 25 NaHCO<sub>3</sub>, 1.25 NaH<sub>2</sub>PO<sub>4</sub>, 2.5 KCl, 1 MgCl<sub>2</sub>, 2 CaCl<sub>2</sub>, and 25 glucose, bubbled with 95% O<sub>2</sub> and 5% CO<sub>2</sub>. After an incubation period of 30–40 min at 34 °C, slices were stored at room temperature. All experiments were conducted at room temperature (25 °C). In all experiments, 50 µM picrotoxin, 10 µM 2,3-dioxo-6-nitro-1,2,3,4-tetrahydrobenzo[f]quinoxaline-7-sulfonamide (NBQX), and 10 µM 3-((R)-2-carboxypiperazin-4-yl)-propyl-1-phosphonic acid (CPP) were present in the ACSF to block GABA<sub>A/C</sub>, AMPA, and NMDA receptor-mediated transmission, respectively. All chemicals were from Sigma or Tocris.

#### **ELECTROPHYSIOLOGY AND IMAGING**

Whole-cell recordings were obtained from infected and uninfected deep layer cortical pyramidal neurons identified with video-IR/DIC, and GFP fluorescence was detected using epifluorescence illumination. Within the deep layers of the cortex, 2-photon laser scanning microscopy (2PLSM) was used to confirm the cell types based on morphology. Deep layer pyramidal neurons had large cell bodies, classic pyramidal shape, and dendritic spines. Glass electrodes (2–4 MΩ) were filled with internal solution containing (in mM): 135 KMeSO<sub>4</sub>, 5 KCl, 5 HEPES, 4 MgATP, 0.3 NaGTP, 10 Na<sub>2</sub>HPO<sub>4</sub>, 1 EGTA, and 0.01 Alexa Fluor-594 (to image neuronal morphology), adjusted to pH 7.4 with KOH. Current and voltage recordings were made at room temperature using an Axopatch 200B or a Multiclamp 700B amplifier. Data were filtered at 5 kHz and digitized at 10 kHz.


#### **DATA ACQUISITION AND ANALYSIS**

Imaging and physiology data were acquired and analyzed as described previously (Carter and Sabatini, 2004). Resting membrane potential was determined by the average of three 5-s sweeps with no injected current. Passive properties of the cell, namely membrane resistance (Rm), series resistance (Rs), and capacitance (Cm), were measured while clamping cells at −65 mV and applying voltage steps from −55 to −75 mV. The current–firing relationship was determined in current clamp with 1-s periods of injected current from 100 to 500 pA.
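One common way to estimate Rs, Rm, and Cm from a voltage-step response is sketched below. This is not necessarily the procedure used here, which follows Carter and Sabatini (2004); it assumes an idealized cell modeled as Rs in series with Rm in parallel with Cm, a baseline-subtracted current trace, and no pipette capacitance or filtering artifacts. The function names and the synthetic test values are hypothetical.

```python
import math


def passive_properties(current_pa, dt_ms, step_mv):
    """Estimate Rs and Rm (in MOhm) and Cm (in pF) from a voltage-step response.

    current_pa: baseline-subtracted current samples (pA), first sample at step onset
    dt_ms:      sampling interval (ms)
    step_mv:    signed size of the voltage step (mV)
    """
    i_peak = max(current_pa, key=abs)          # instantaneous current, limited by Rs
    n_tail = max(1, len(current_pa) // 10)
    i_ss = sum(current_pa[-n_tail:]) / n_tail  # steady-state current through Rs + Rm
    rs = step_mv / i_peak * 1e3                # mV/pA is GOhm; x1e3 gives MOhm
    rm = step_mv / i_ss * 1e3 - rs
    # Charge carried by the capacitive transient (trapezoidal rule); pA*ms = fC.
    transient = [i - i_ss for i in current_pa]
    q_fc = sum(0.5 * (a + b) * dt_ms for a, b in zip(transient, transient[1:]))
    # For this circuit, Q = step * Cm * (Rm / (Rs + Rm))**2, so invert for Cm.
    cm = q_fc / step_mv * ((rs + rm) / rm) ** 2  # fC/mV = pF
    return rs, rm, cm


def synthetic_step(rs_mohm, rm_mohm, cm_pf, step_mv, dt_ms, duration_ms):
    """Ideal current response of the Rs-(Rm || Cm) circuit, for testing only."""
    r_parallel = rs_mohm * rm_mohm / (rs_mohm + rm_mohm)
    tau_ms = cm_pf * r_parallel * 1e-3            # pF*MOhm is us; x1e-3 gives ms
    i_ss = step_mv / (rs_mohm + rm_mohm) * 1e3    # mV/MOhm is nA; x1e3 gives pA
    i_0 = step_mv / rs_mohm * 1e3
    n = int(round(duration_ms / dt_ms)) + 1
    return [i_ss + (i_0 - i_ss) * math.exp(-k * dt_ms / tau_ms) for k in range(n)]


# Hypothetical cell: Rs = 10 MOhm, Rm = 190 MOhm, Cm = 100 pF, step from -65 to -75 mV.
trace = synthetic_step(10.0, 190.0, 100.0, -10.0, dt_ms=0.01, duration_ms=20.0)
rs, rm, cm = passive_properties(trace, dt_ms=0.01, step_mv=-10.0)
print(f"Rs = {rs:.1f} MOhm, Rm = {rm:.1f} MOhm, Cm = {cm:.1f} pF")
```

With real recordings, low-pass filtering blunts the peak of the transient, so Rs estimated from the peak current tends to be overestimated; fitting the decay of the transient is a frequently used alternative.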

#### **HIPPOCAMPAL SLICE CULTURES**

Time-course experiments on viral gene expression were carried out in organotypic hippocampal slice cultures prepared from postnatal day 5–7 Sprague-Dawley rats as described previously (Stoppini et al., 1991). Slices were infected after 7 days *in vitro*, and images were acquired on a two-photon microscope.

### **ACKNOWLEDGMENTS**

We are grateful for technical assistance from Vanessa Kainz in the laboratory of Rami Burstein at the Beth Israel Deaconess Medical Center. This work was supported by HHMI (Constance L. Cepko and Bernardo L. Sabatini), and #NS068012-01 (Kevin T. Beier).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 October 2012; paper pending published: 22 December 2012; accepted: 20 January 2013; published online: 07 February 2013.*

*Citation: Beier KT, Saunders AB, Oldenburg IA, Sabatini BL and Cepko CL (2013) Vesicular stomatitis virus with the rabies virus glycoprotein directs retrograde transsynaptic transport among neurons in vivo. Front. Neural Circuits 7:11. doi: 10.3389/fncir.2013.00011*

*Copyright © 2013 Beier, Saunders, Oldenburg, Sabatini and Cepko. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Spatial vision in insects is facilitated by shaping the dynamics of visual input through behavioral action

## *Martin Egelhaaf\*, Norbert Boeddeker, Roland Kern, Rafael Kurtz and Jens P. Lindemann*

*Neurobiology and Centre of Excellence "Cognitive Interaction Technology", Bielefeld University, Germany*

#### *Edited by:*

*Eberhard E. Fetz, University of Washington, USA*

#### *Reviewed by:*

*Vladimir Brezina, Mount Sinai School of Medicine, USA Alexander Borst, Max Planck Institute of Neurobiology, Germany*

#### *\*Correspondence:*

*Martin Egelhaaf, Neurobiology and Centre of Excellence "Cognitive Interaction Technology", Bielefeld University, Germany. e-mail: martin.egelhaaf@uni-bielefeld.de*

Insects such as flies or bees, with their miniature brains, are able to control highly aerobatic flight maneuvres and to solve spatial vision tasks, such as avoiding collisions with obstacles, landing on objects, or even localizing a previously learnt inconspicuous goal on the basis of environmental cues. With regard to solving such spatial tasks, these insects still outperform man-made autonomous flying systems. To accomplish their extraordinary performance, flies and bees have been shown by their characteristic behavioral actions to actively shape the dynamics of the image flow on their eyes ("optic flow"). The neural processing of information about the spatial layout of the environment is greatly facilitated by segregating the rotational from the translational optic flow component through a saccadic flight and gaze strategy. This active vision strategy thus enables the nervous system to solve apparently complex spatial vision tasks in a particularly efficient and parsimonious way. The key idea of this review is that biological agents, such as flies or bees, acquire at least part of their strength as autonomous systems through active interactions with their environment and not by simply processing passively gained information about the world. These agent-environment interactions lead to adaptive behavior in surroundings of a wide range of complexity. Animals with even tiny brains, such as insects, are capable of performing extraordinarily well in their behavioral contexts by making optimal use of the closed action–perception loop. Model simulations and robotic implementations show that the smart biological mechanisms of motion computation and visually-guided flight control might be helpful to find technical solutions, for example, when designing micro air vehicles carrying a miniaturized, low-weight on-board processor.

**Keywords: spatial behavior, optic flow, saccades, flying insects, obstacle avoidance, navigation behavior**

## **OPTIC FLOW AS AN IMPORTANT SPATIAL CUE FOR FAST MOVING ANIMALS**

Behavior is a phenomenon that takes place in space and is intricately entangled with it. The organism is required to interact with its surroundings in a way appropriate to the respective situational context. It should be able to respond appropriately to objects, for instance, by avoiding collisions with obstacles or by detecting and fixating inanimate objects of interest or other organisms, such as a predator, prey, or mate. On a larger spatial scale, organisms should be able to navigate from one place to another and to localize a goal on the basis of environmental spatial cues.

Insects are obviously well able to cope with these behavioral challenges in a highly virtuosic and efficient way. Think of a blowfly, for example, landing on the rim of a cup, or two flies chasing each other; without technical assistance, our visual system is incapable of resolving the complexity of such flight maneuvres, and the speed at which they are executed far exceeds the capacities of our own motor system. During their virtuosic flight maneuvres, blowflies can make up to ten sudden ("saccadic") turns per second, during which they may reach angular velocities of up to 4000°/s. The extraordinary navigational skills of bees are another awe-inspiring example of insect spatial behavior: spatial cues enable bees to localize previously learnt, barely visible goals, such as a food source or the entrance to their nest, over large distances even in cluttered environments. All these feats are accomplished with visual systems of comparatively poor spatial resolution and extremely small brains that consist of no more than a million neurons, underlining the resource efficiency of the underlying mechanisms.

We will argue in this review that biological agents, such as flying insects, are such efficient and adaptive autonomous systems because they rely, to a large extent, on strategies by which they shape their sensory input through the specific way they move and change their gaze direction. In this way, they actively reduce the complexity of their sensory input and, thus, the computational load for the underlying brain mechanisms. Therefore, by exploiting the consequences of the action–perception cycle, animals with even tiny brains, such as insects, can perform extraordinarily well in solving spatial vision tasks in a wide range of behavioral contexts. This view contrasts somewhat with common conceptions of how spatial vision is accomplished.

If laypeople are asked for the requirements of spatial vision, they are likely to reply that most animals, including humans, are equipped with two eyes which allow them to view the world from slightly different vantage points, and that the nervous system makes use of the resulting disparity information for depth vision. However, the spatial range that can be resolved in this way is critically restricted by the distance between the eyes, the overlap of their visual fields and their spatial resolution (Collett and Harkness, 1982). Hence, stereoscopic vision—if it is available at all to a particular animal species—is functional only in the near range. This poses a problem, especially for fast moving animals, such as many flying insects (as well as for human car drivers), because, in order to control appropriate reactions, such as avoiding collisions with obstacles, spatial information is required at much greater distances than may be available through stereoscopic mechanisms. Amongst the depth cues that are available in addition to binocular information, for example, contrast differences between near and distant objects (Collett and Harkness, 1982), the retinal image motion induced by self-movements of the animal ("optic flow") is particularly relevant (Koenderink, 1986; Rogers, 1993; Poteser and Kral, 1995; Lappe, 2000; Redlick et al., 2001; Vaina et al., 2004).

Whenever an animal moves in its environment, the retinal images are continually displaced. During translatory movements, these displacements depend on the distance of environmental objects to the eyes, their angular location relative to the direction of motion and the velocity of locomotion. Only translational optic flow is distance dependent and, thus, contains spatial information, whereas rotational optic flow is useless for spatial vision, because all objects during rotations are displaced at the same angular velocity irrespective of their distance (**Figure 1**; Koenderink, 1986). Hence, the translatory optic flow component contains information about the relative distance of environmental objects from the animal: objects nearby pass quickly, while objects far-off appear virtually stationary. This motion-induced spatial information is based on behavioral action, because it is only available during self-motion, but not when the animal is stationary. Many animals, ranging from insects to humans, were concluded to exploit optic flow information for depth cueing.
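The geometry described above can be made concrete with a minimal sketch. For motion in a horizontal plane, the angular velocity of the image of a stationary point is commonly written as the sum of a translational term, (v/d)·sin θ, and a rotational term equal to the negative yaw rate. The function name and the example values below are illustrative assumptions and are not taken from the studies cited.

```python
import math


def image_angular_velocity(speed, yaw_rate, distance, bearing):
    """Angular velocity (rad/s) of the image of a stationary point.

    speed    -- translation speed of the observer (m/s)
    yaw_rate -- rotation of the observer about its vertical axis (rad/s)
    distance -- distance of the point from the observer (m)
    bearing  -- angle between the direction of motion and the point (rad)
    """
    translational = (speed / distance) * math.sin(bearing)  # depends on distance
    rotational = -yaw_rate                                   # same for every point
    return translational + rotational


near, far = 0.2, 2.0       # hypothetical object distances in metres
side = math.pi / 2         # objects viewed at right angles to the flight path

# Pure translation at 1 m/s: the near object moves ten times faster on the eye.
flow_near_t = image_angular_velocity(1.0, 0.0, near, side)
flow_far_t = image_angular_velocity(1.0, 0.0, far, side)

# Pure rotation at 2 rad/s: both objects move at the same angular velocity.
flow_near_r = image_angular_velocity(0.0, 2.0, near, side)
flow_far_r = image_angular_velocity(0.0, 2.0, far, side)

print(f"translation: near {flow_near_t:.2f} rad/s, far {flow_far_t:.2f} rad/s")
print(f"rotation:    near {flow_near_r:.2f} rad/s, far {flow_far_r:.2f} rad/s")
```

Under these assumptions only the translational term carries a distance signal, which is the property the saccadic strategy discussed later in this review exploits.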

We will focus in this review on the spatial behavior of insects that is based on depth information derived from optic flow. Since optic flow is particularly relevant during fast locomotion in three dimensions, we will mainly cover spatial vision in flight and address four major issues: (1) Components of insect behavior that are thought to be involved in solving basic spatial tasks and how they may depend on motion-based information; (2) the processing of motion-dependent spatial information and how it is facilitated by active gaze movements; (3) the representation of behaviorally relevant spatial information in the visual system; and (4) the behavioral significance of neurons extracting information about self-motion of the animal, as well as the environment, from the image flow generated on the eyes as a consequence of the action–perception loop being closed. Obviously, solving any spatial vision task, especially by flying insects that lack passive stability, requires, as a precondition, the animal's flight attitude to be somehow stabilized by appropriate feedback control systems. This issue, though very important for spatial orientation behavior and widely analysed for decades, will be touched on only briefly, because it has already been thoroughly reviewed (Hengstenberg, 1993; Taylor and Krapp, 2008).

**FIGURE 1 | Schematic illustration of the consequences of rotational (upper diagram) or translational self-motion (bottom diagram) for the resulting optic flow.** Superimposed images were either generated by rotating a camera around its vertical axis or by translating it forward. Rotational self-motion leads to image movements (red arrows) of the same velocity (reflected in the arrow length) irrespective of the distance of environmental objects from the observer. In contrast, the optic flow elicited by translational self-motion (blue arrows) depends on the distance of objects from the observer. Hence, translational optic flow contains spatial information.

## **BEHAVIOR INVOLVED IN SPATIAL TASKS AND ITS CONTROL BY VISUAL MOTION CUES**

Many animals, including humans, use optic flow for the control of spatial behavior. Since spatial information can most easily be extracted from the retinal image flow during translatory self-motion, some animals execute translatory movements of their body and/or head that appear to be dedicated to generating optic flow suitable for depth cueing. Locusts, mantids, and dragonflies, for instance, sitting in ambush perform lateral body and head movements in preparation for a jump or for catching prey, respectively (Collett, 1978; Sobel, 1990; Collett and Paterson, 1991; Kral and Poteser, 1997; Olberg et al., 2005). Some bird species bob their heads back and forth, most likely to acquire depth information (Davies and Green, 1988; Necker, 2007). Moreover, flying insects, such as flies and bees (Schilstra and van Hateren, 1999; Boeddeker et al., 2010; Braun et al., 2010, 2012; Geurten et al., 2010), but also birds (Eckmeier et al., 2008), employ a saccadic flight and gaze strategy in which short and rapid head and body saccades are separated by largely translatory locomotion. This strategy facilitates access to spatial information from the resulting optic flow.

with a random texture of different heights. The floor and walls of the flight arena were covered with the same texture. Hence, the discs could only be discriminated by relative motion cues induced on the eyes by the self-motion of the animal. Flies landed on discs raised at least 1 cm above the floor significantly more often than on a reference disc on the floor (data from Kimmerle et al., 1996). **(B)** Contour plot of the turning responses of tethered flying flies measured with a yaw torque compensator (comp) for different combinations of temporal frequencies of object motion (OM) and translatory background motion (tBM). The motion stimuli were striped patterns (spatial period 6.3°) presented on two monitor screens placed at an angle of 90° symmetrically in front of the fly. OM was displayed within a vertical 6.3° wide window in front of the right eye. Object-induced responses are given in a color coded way with warmer colors indicating larger responses. Flies show strong turning responses when OM is faster than tBM. The strongest

of honeybees in a cylindrical flight arena with three cylindrical landmarks (upper left diagram). The landmarks were either homogeneously red or were covered by the same random pattern as the background. Bees were trained to find a barely visible feeder placed between the homogeneous landmarks. The trajectory of one search flight maneuvre is shown in the top view (bottom left diagram). The feeder (green circle) and the landmarks (black dots) are indicated. The position of the bee is indicated by red dots at each 32 ms interval; straight lines represent the orientation of the long axis of the bee. The duration of search flights until landing on the feeder was not significantly increased when the pattern of the landmarks was changed from homogeneous red to the random dot texture that also covered the background (right diagram). Red lines indicate median values, the upper and lower margins of the boxes, the 75th and 25th percentiles; the whiskers indicate the data range (Data from Dittmar et al., 2010).

The use of optic flow to gain spatial information has been shown most convincingly in behavioral experiments in which animals responded to objects that were camouflaged by covering them with the same texture as their background. Thus, these objects could be discriminated only on the basis of optic flow cues elicited during self-motion. *Drosophila*, for instance, is well able to discriminate the distance of different objects on the basis of slight differences in their retinal velocities (Schuster et al., 2002). Bees (Srinivasan et al., 1987; Lehrer et al., 1988) and blowflies (Kimmerle et al., 1996) use relative motion cues mainly at the edges of objects to discriminate between their height and to land on them (**Figure 2A**; Srinivasan et al., 1990; Kimmerle et al., 1996; Kern et al., 1997). Bees also use motion contrast in discrimination tasks (Lehrer and Campan, 2005) and for navigating back to the previously learnt location of a barely visible goal (**Figure 2C**; see below; Dittmar et al., 2010). Moreover, hawk-moths hovering in front of a flower use motion cues to control their distance to the nectar donating blossom (Pfaff and Varjú, 1991; Farina et al., 1994; Kern and Varjú, 1998). However, motion information is also used for spatial tasks that are not related to objects. Bees, for instance, exploit optic flow information to estimate distances traveled during navigation flights. The dependence of optic flow information on the depth structure of the environment is also relevant in this context: experimental manipulation of the environment between flights can induce characteristic errors in distance estimation because estimates of distances traveled in a given environment cannot be generalized to environments with different depth structures (Srinivasan et al., 2000; Esch et al., 2001; review: Wolf, 2011).
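Why a flow-based distance estimate does not generalize across environments can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the odometer accumulates the lateral image velocity v/d over a straight flight; the function name, the tunnel dimensions and the calibration step are hypothetical and do not describe the mechanism identified in the cited studies.

```python
def integrated_optic_flow(distance_flown, speed, wall_distance, dt=0.01):
    """Accumulate lateral image motion (rad) over a straight flight.

    The lateral image velocity of a wall at `wall_distance` is speed / wall_distance.
    """
    steps = round(distance_flown / (speed * dt))
    total = 0.0
    for _ in range(steps):
        total += (speed / wall_distance) * dt
    return total


# Hypothetical calibration: a 6 m flight in a wide tunnel (walls 0.5 m away).
wide_slow = integrated_optic_flow(6.0, speed=1.0, wall_distance=0.5)
wide_fast = integrated_optic_flow(6.0, speed=2.0, wall_distance=0.5)

# The same 6 m flown in a narrow tunnel (walls 0.1 m away).
narrow = integrated_optic_flow(6.0, speed=1.0, wall_distance=0.1)

# Reading the narrow-tunnel signal with the wide-tunnel calibration.
flow_per_metre_wide = wide_slow / 6.0
estimated_distance = narrow / flow_per_metre_wide

print(f"wide tunnel:   {wide_slow:.1f} rad (slow), {wide_fast:.1f} rad (fast)")
print(f"narrow tunnel: {narrow:.1f} rad, read as {estimated_distance:.1f} m")
```

In this sketch the accumulated signal is independent of flight speed but scales inversely with wall distance, so a calibration obtained in one depth structure misjudges the distance flown in another.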

What are the mechanisms involved in solving spatial behavioral tasks? Insects play a pivotal role in systems analyses of these mechanisms, both at the behavioral and the neural level. Behavioral systems analyses have been mainly performed in flight simulators on tethered flying flies, because the visual input can be perfectly controlled by the experimenter while, in most experimental paradigms, turning responses are recorded. Here, the visual consequences of locomotion are emulated by motion stimuli to which the tethered animal is exposed. However, the degrees of freedom of movement that can be executed by the animal and monitored by the experimenter in these behavioral paradigms are constrained, thus providing only limited access to the rich behavioral repertoire of the animal. Apart from a few exceptions (e.g., Land and Collett, 1974; Collett and Land, 1975; Wagner, 1982; Zeil, 1986), it has only recently become possible to investigate spatial behavior systematically under free-flight conditions with high spatial and temporal resolution and to also reconstruct what an animal has seen during largely unconstrained behavior (Lindemann et al., 2003). In the following, we restrict the review to only a few components of spatial behavior that have been experimentally investigated in detail.

#### **OBJECT DETECTION AND OBJECT-DIRECTED RESPONSES**

It has been known for a long time from experiments in tethered flight that flies can discriminate objects from their background on the basis of motion cues and attempt to fixate them in the frontal visual field (Virsik and Reichardt, 1976; Reichardt and Poggio, 1979; Reichardt et al., 1983; Egelhaaf, 1985a; Egelhaaf and Borst, 1993a; Kimmerle et al., 1997, 2000; Maimon et al., 2008; Aptekar et al., 2012). In these experiments, the tethered animal could not move, and only its yaw torque was measured. Relative motion was generated by specifically controlling object and background displacements. In real life, this situation usually occurs as a consequence of the action–perception cycle being closed while the animal moves in a three-dimensional environment and actively generates relative motion cues on its eyes through its behavior (see above).

Only three features of the control system mediating object detection in flies will be mentioned here. (1) The detectability of objects depends to a large extent on the dynamical properties of object and background motion. Object detection is facilitated if the background moves at a moderate velocity, such as during translation in an environment where the background is at a medium distance from the animal (**Figure 2B**) (Kimmerle et al., 1997). (2) The visual pathways extracting motion-dependent object information and those processing other types of motion information (e.g., those controlling compensatory optomotor responses or translation velocity) are commonly assumed to segregate at the level of the fly's third visual neuropile. The object system appears to be distinguished by its dynamical and other properties. In particular, the object system responds to high-frequency changes of the retinal position and velocity of the object, whereas strong compensatory optomotor responses are evoked by low-frequency velocity changes (Egelhaaf, 1987; Aptekar et al., 2012). The object pathway appears to be kept separate from the other pathways up to the level of the steering muscles that mediate object-induced turns (Egelhaaf, 1989). (3) Even when the object moves exactly in the same way in subsequent stimulus presentations, it may either be fixated by the fly or no fixation responses may be elicited at all. Such a bimodal distribution of responses in the behavioral context of object detection—a full response or no response—suggests a gating mechanism in the neural pathway mediating motion-induced object fixation (Kimmerle et al., 2000).

Currently we can only speculate about the functional significance under real-life conditions of a control system that induces turning responses in tethered flight toward an object moving in front of its background. Potentially, an object may initiate landing behavior under free-flight conditions. This is plausible in blowflies as well as in bees, because (1) an object is most effective in eliciting fixation responses when the ventral part of the visual field is stimulated (Virsik and Reichardt, 1976), and (2) when detecting and approaching a landing site in free-flight, relative motion cues are exploited mainly in the ventral visual field (Wagner, 1982; Lehrer et al., 1988; Kimmerle et al., 1996; Kern et al., 1997; van Breugel and Dickinson, 2012). Similar object-detection systems could play an important role in bees during local navigation when landmarks based on contrast, texture, and relative motion cues need to be detected to guide the animal to its goal (see below).

#### **COLLISION AVOIDANCE**

In many situations, objects or other structures in the environment (e.g., extended surfaces, such as walls) are not goals the animal may aim for, but may interfere with the animal's trajectory as obstacles that need to be avoided. Thus, collision avoidance represents a basic, but highly relevant spatial task. Again, optic flow has been shown in a variety of animals, including humans, to be one of the most relevant cues that may signal an impending collision (e.g., Lappe, 2000; Vaina et al., 2004).

Optic flow has been shown to be relevant in collision avoidance behavior for both tethered and free-flying flies. There is consensus amongst studies that asymmetries in the optic flow across the two eyes, for instance, when approaching environmental structures on one side, are decisive for eliciting collision avoidance responses: (1) Flies tend to turn away from the eye experiencing image expansion (Tammero and Dickinson, 2002a,b; Tammero et al., 2004; Bender and Dickinson, 2006b; Budick et al., 2007; Reiser and Dickinson, 2010). (2) The probability of eliciting an evasive turn has been concluded to be highest if the focus of image expansion is located in the lateral rather than in the frontal part of the visual field (Tammero and Dickinson, 2002a; Tammero et al., 2004; Bender and Dickinson, 2006b). Such optic flow might occur during flights with a strong sideways component. These results do not imply that the focus of expansion in the retinal motion pattern during object approach is explicitly extracted by the neuronal circuits that mediate collision avoidance. Based on experiments done in free-flight in different types of flight arenas that allow for more complex behavior than in tethered flight, mechanisms that rely on asymmetries in the optic flow field across the two eyes other than explicitly extracting the focus of expansion are well able to account for relevant aspects of collision avoidance (see below; Lindemann et al., 2008; Mronz and Lehmann, 2008; Kern et al., 2012).

#### **INTERACTION BETWEEN OBJECT FIXATION AND COLLISION AVOIDANCE**

Expanding visual flow fields are encountered by flying insects not only when they encounter an obstacle, but also when flying straight toward an object that may serve as a landing site or as a landmark in the context of navigation behavior. As sketched above, tethered flying *Drosophilae* turn away from an expanding retinal image. Given the strength of this evasive response, it is difficult to explain how flies can fly straight in natural surroundings with ample objects surrounding them. This apparent paradox is partially resolved by the finding that *Drosophila*, when flying toward a conspicuous object, tolerates a level of expansion that would otherwise induce avoidance (Reiser and Dickinson, 2010). This suggests that the gain of the control system mediating evasive turns is reduced if prominent visual features are attractive and represent a behavioral goal. Therefore, flies appear to require a goal to keep an overall flight direction, either toward a salient object (Heisenberg and Wolf, 1979; Götz, 1987; Maimon et al., 2008; Reiser and Dickinson, 2010), toward an attractive odorant (Budick and Dickinson, 2006), when flying upwind (Budick et al., 2007), or while pursuing a moving target such as a potential mate (Trischler et al., 2010).

#### **SPATIAL INFORMATION RELEVANT FOR LOCAL NAVIGATION**

Whereas collision avoidance and landing are spatial tasks that must be solved by any flying insect, local navigation is relevant especially for particular insects, such as bees, some wasps and ants, which care for their brood and, thus, have to return to their nest after foraging. Consequently, the full complexity of spatial navigation has been analysed mainly in bees, wasps, and ants both in artificial and natural environments. Nonetheless, basic elements of local navigation could be found also in *Drosophila* (Foucaud et al., 2010; Ofstad et al., 2011). Since various aspects of insect navigation and the underlying mechanisms have been reviewed recently (Collett and Collett, 2002; Collett et al., 2006; Zeil et al., 2009; Zeil, 2012), only selected issues will be addressed here, and spatial information processing during flight will be the major focus.

Visual landmarks represent crucial spatial cues and are employed to localize a goal, especially if it is barely visible itself. Information about the landmark constellation around the goal is memorized during elaborate learning flights: the animal flies characteristic sequences of ever increasing arcs while facing the area around the goal. During these learning flights, the animal somehow gathers relevant information that is subsequently used to relocate the goal when returning to it after an excursion. A variety of visual cues, such as contrast, texture and color, are suitable to define landmarks and are employed to find the goal (reviews: Collett and Collett, 2002; Collett et al., 2006; Zeil et al., 2009; Zeil, 2012). Recently, landmarks that are defined by motion cues alone were shown to be sufficient for bees to locate the goal (Dittmar et al., 2010). In this study, several landmarks that were camouflaged by their texture and, thus, could not be discriminated from the background by stationary cues were placed in particular locations surrounding the goal (**Figure 2C**). The mechanisms by which the landmark constellation is learnt and how the memorized information is eventually used to locate the goal are not yet fully understood. However, it is clear that optic flow information generated actively during the bees' typical learning and searching flights is essential for the acquisition of a spatial memory of the goal environment. Moreover, in the vicinity of the landmarks, the animals were found to adjust their flight movements according to specific textural properties of the landmarks (Dittmar et al., 2010; Braun et al., 2012).

Landmarks close to the goal are, for geometrical reasons, most suitable to define the goal location, because the retinal locations of close landmarks are displaced more than distant ones during the translational movements of the animal (Stürzl and Zeil, 2007). Emerging as a direct consequence of the closed action–perception cycle, this property "weighs" the relevance of environmental objects to serve as landmarks for local navigation in the vicinity of the goal.

## **SPATIAL INFORMATION BASED ON SACCADIC GAZE AND FLIGHT STRATEGY**

speed. **(C)** Orientation of the fly's longitudinal body axis (solid red line) and flight direction (broken black line) in the external coordinate system. **(D)** Angular velocity of the fly. The fly changed its gaze and heading direction through a series of short and fast body turns. Flight direction and body axis

can deviate considerably. **(F)** Head (blue) and body orientation (red). The head usually turns with the thorax but at a higher angular speed, starting, and finishing slightly earlier. **(G)** Head (blue) and body (red) angular velocity. **(E–G)** Data from Boeddeker et al. (2010).

Saccadic gaze changes have a rather uniform time course and are shorter than 100 ms. Angular velocities of up to several thousand °/s can occur during saccades (**Figure 3**). Since roll movements of the body that are performed for steering purposes during saccades, and also during sideways translations, are compensated by counter-directed head movements, the animals' gaze direction is kept virtually constant during intersaccades (Schilstra and van Hateren, 1999; Boeddeker and Hemmi, 2010; Boeddeker et al., 2010; Braun et al., 2010, 2012; Geurten et al., 2010, 2012). Saccade dynamics in flies have been shown to be fine-tuned by mechanosensory feedback from the halteres, the gyroscopic sense organs of dipteran flies, which evolved from the hind wings. Haltere feedback may thus contribute to increasing the duration of intersaccadic intervals (Sherman, 2003; Bender and Dickinson, 2006a). Nevertheless, halteres are no prerequisite for a saccadic gaze strategy, given that bees and wasps, which lack halteres, show flight dynamics similar to those of flies (**Figure 3**) (Boeddeker et al., 2010). By squeezing body and head rotations into the brief saccades, translational gaze displacements last for more than 80% of the entire flight time (van Hateren and Schilstra, 1999; Boeddeker and Hemmi, 2010; Boeddeker et al., 2010; Braun et al., 2010, 2012; van Breugel and Dickinson, 2012).
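One simple way to separate saccades from intersaccadic intervals in a recorded yaw-velocity trace is to apply a velocity threshold. The following sketch operates on a synthetic trace; the 300°/s threshold, the 1 kHz sampling rate and the trace itself are assumptions chosen for the example, not values or methods taken from the cited studies.

```python
def segment_saccades(yaw_velocity, threshold=300.0):
    """Return (start, end) sample indices of runs where |velocity| > threshold.

    `end` is exclusive. Velocities and threshold are in degrees per second.
    """
    saccades = []
    start = None
    for i, velocity in enumerate(yaw_velocity):
        if abs(velocity) > threshold:
            if start is None:
                start = i                    # saccade onset
        elif start is not None:
            saccades.append((start, i))      # saccade offset
            start = None
    if start is not None:                    # trace ends during a saccade
        saccades.append((start, len(yaw_velocity)))
    return saccades


# Synthetic 1 s trace at 1 kHz: slow drift of 20 deg/s plus three 40 ms saccades.
trace = [20.0] * 1000
for onset, peak in [(100, 2000.0), (450, -3000.0), (800, 2500.0)]:
    for i in range(onset, onset + 40):
        trace[i] = peak

saccades = segment_saccades(trace)
saccade_samples = sum(end - start for start, end in saccades)
intersaccadic_fraction = 1.0 - saccade_samples / len(trace)

print(saccades)
print(f"intersaccadic fraction: {intersaccadic_fraction:.2f}")
```

With real data the threshold would have to be chosen relative to the noise level and the species' turning dynamics; a fixed value is used here only to keep the example short.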

It should be noted that flying insects may appear to meander smoothly when their overall flight trajectory is inspected (Boeddeker et al., 2005; Kern et al., 2012). Although it has frequently been a source of misunderstanding, this smoothness does not contradict a saccadic flight style. As a consequence of inertial forces, flying insects, in particular large ones, may continue to move in their previous direction for some time after a saccadic change in body orientation. Thus, the saccadic gaze strategy is reflected only to some extent in the overall flight trajectories (**Figure 3**). This may be different in the much smaller *Drosophila* where at least some rapid large-amplitude turns can be seen in the overall flight trajectories (Tammero and Dickinson, 2002b).

Blowflies do not fly exactly straight even in straight flight tunnels without any obstacles. Rather they perform sequences of saccades, alternating their direction and the saccade amplitude depending on the clearance of the animal with respect to the walls of the flight tunnel (Kern et al., 2012). A saccadic flight style may be functionally relevant, even if the overall flight course pursued by the animal is straight. This is because the animal normally has no prior knowledge about the spatial structure of the environment. Thus, the uncertainty about whether it can fly on a straight course or not needs to be resolved on the basis of optic flow information. Regular changes of flight and gaze direction might, therefore, be a useful flight strategy, because it would allow the animal to check (during intersaccadic intervals) the translational optic flow for environmental information (Kern et al., 2012).

Since the saccadic flight and gaze strategy leads to either primarily rotational or primarily translational optic flow on the eyes, it can be interpreted as a behavioral adaptation to facilitate spatial vision. This is because only translational optic flow depends on the distance of the animal to environmental objects and, thus, contains spatial information (see above). A segregation of optic flow fields into their rotational and translational components can, at least in principle, be accomplished computationally for most realistic situations (Longuet-Higgins and Prazdny, 1980; Prazdny, 1980; Dahmen et al., 2000). However, such a computational strategy for the nervous system appears to be a lot more demanding than preventing the formation of composite rotational and translational optic flow by behavioral means. Thus, a saccadic gaze and flight strategy can be regarded as an efficient way to provide the nervous system with input from which spatial information can be extracted with relatively little computational effort.
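Why the behavioral segregation keeps the computation simple can be sketched numerically. If gaze does not rotate, the distance to an object follows directly from the measured image velocity, d = v·sin θ / flow. Any residual rotation adds a distance-independent term and biases this estimate. The sketch assumes that the translation speed is known, which is a simplification made only for the example; all names and numbers are hypothetical.

```python
import math


def distance_from_flow(flow, speed, bearing):
    """Invert flow = (speed / distance) * sin(bearing); valid for pure translation."""
    return speed * math.sin(bearing) / flow


speed = 1.0              # m/s, assumed to be known to the observer
bearing = math.pi / 2    # object viewed at right angles to the flight path
true_distance = 0.5      # m

# Intersaccadic interval: gaze is stable, the flow is purely translational.
translational_flow = (speed / true_distance) * math.sin(bearing)
estimate_clean = distance_from_flow(translational_flow, speed, bearing)

# Same situation with a residual yaw rotation of 1 rad/s superimposed.
residual_yaw = 1.0
contaminated_flow = translational_flow - residual_yaw
estimate_biased = distance_from_flow(contaminated_flow, speed, bearing)

print(f"stable gaze:   {estimate_clean:.2f} m")
print(f"rotating gaze: {estimate_biased:.2f} m")
```

Removing the rotational term computationally would require an independent estimate of the rotation; keeping gaze stable between saccades avoids that step altogether.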

#### **CONTROL OF SACCADES AS THE MAIN ROTATIONAL COMPONENTS OF FLIGHT BEHAVIOR**

The saccadic gaze strategy of insects has been characterized in various functional contexts: flies exhibit a saccadic flight pattern during spontaneous behavior, for instance, when cruising around without any obvious goal. This was shown in a wide range of environments including outdoor conditions (**Figure 3A**). Saccade frequencies of up to 10 per second were observed (Schilstra and van Hateren, 1999; van Hateren and Schilstra, 1999; Tammero and Dickinson, 2002b; Boeddeker et al., 2005, 2010; Braun et al., 2010, 2012; Dittmar et al., 2010; Geurten et al., 2010). The direction, amplitude and frequency of saccades depend not only on the spatial outline, but also on the texture of the environment. Thus, saccades are, at least to some extent, under visual control and serve purposes in spatial behavior, such as in collision avoidance behavior (Frye and Dickinson, 2007; Geurten et al., 2010; Braun et al., 2012; Kern et al., 2012).

There is consensus that intersaccadic optic flow during collision avoidance behavior plays a decisive role in controlling the direction and amplitude of saccades. However, which optic flow parameters are most relevant is still unresolved. Nevertheless, all proposed mechanisms of evoking saccades rely on some sort of asymmetry in the optic flow pattern in front of the two eyes. The asymmetry may be due to the location of the expansion focus in front of one eye or to a difference between the overall optic flow in the visual fields of the two eyes (Tammero and Dickinson, 2002b; Lindemann et al., 2008; Mronz and Lehmann, 2008; Kern et al., 2012).
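The second kind of asymmetry mentioned above, a difference in overall optic flow between the two eyes, can be turned into a toy decision rule. The sketch below is not one of the published models; the normalization, the threshold value and the function name are assumptions made for illustration only.

```python
def saccade_command(left_flow, right_flow, threshold=0.2):
    """Toy rule: turn away from the eye that experiences the stronger overall flow.

    left_flow, right_flow -- local image speeds sampled in each visual field
    threshold             -- hypothetical minimum normalized asymmetry for a saccade
    Returns "left", "right" or None (no saccade).
    """
    left = sum(abs(f) for f in left_flow)
    right = sum(abs(f) for f in right_flow)
    total = left + right
    if total == 0:
        return None                      # no image motion, nothing to react to
    asymmetry = (right - left) / total   # ranges from -1 to +1
    if asymmetry > threshold:
        return "left"                    # stronger flow on the right: turn left
    if asymmetry < -threshold:
        return "right"                   # stronger flow on the left: turn right
    return None


# A wall close to the right eye produces stronger flow on that side.
near_wall_on_right = saccade_command([0.5, 0.5], [2.0, 2.0])
balanced_tunnel = saccade_command([1.0, 1.0], [1.1, 1.0])
near_wall_on_left = saccade_command([3.0], [1.0])

print(near_wall_on_right, balanced_tunnel, near_wall_on_left)
```

Restricting the input lists to frontal or to lateral parts of the visual field would be one way to explore, within such a sketch, which eye regions drive the decision.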

At least in blowflies, not all of the visual field appears to be involved in saccade control. The optic flow in the lateral parts of the visual field does not play a role in determining saccade direction (Kern et al., 2012). This feature might be related to the way in which blowflies fly: during intersaccades, they predominantly fly forwards, with some sideways component after saccades that shifts the pole of expansion of the flow field slightly toward frontolateral locations (Kern et al., 2012). In contrast, in *Drosophila*, which are able to hover and fly sideways (Ristroph et al., 2009), lateral and even rear parts of the visual field have also been shown to be involved in saccade control. Therefore, in *Drosophila*, a mechanism that also takes lateral retinal areas into account for saccade control is plausible from a functional point of view (Tammero and Dickinson, 2002b).

#### **CONTROL OF INTERSACCADIC TRANSLATIONAL MOTION**

Whereas saccades are fairly stereotyped across different behavioral contexts, the intersaccadic translational movements may vary to a much larger extent, depending on the behavioral context as well as the spatial layout of the environment (Braun et al., 2010, 2012). This aspect has been addressed systematically in two different behavioral contexts: (1) The dependence of translation velocity on the spatial layout of the environment, and (2) the control of translational movements during visual landmark navigation in the vicinity of an invisible goal.

Insects tend to decelerate when their flight path is obstructed. Flight speed is thought to be controlled by optic flow generated during translational flight (David, 1979, 1982; Farina et al., 1995; Kern and Varjú, 1998; Baird et al., 2005, 2006, 2010; Frye and Dickinson, 2007; Fry et al., 2009; Dyhr and Higgins, 2010; Straw et al., 2010; Kern et al., 2012). Flies, bees, and moths were concluded to keep the optic flow on their eyes at a "preset" total strength by adjusting their flight speed. Accordingly, they decelerate when the translational optic flow increases, for instance, while passing a narrow gap or flying in a narrow tunnel (**Figures 4A,B**) (Srinivasan et al., 1991, 1996; Verspui and Gray, 2009; Baird et al., 2010; Portelli et al., 2011; Kern et al., 2012). However, not all parts of the visual field contribute equally to the input of the velocity controller. Whereas the intersaccadic optic flow generated in eye regions looking well in front of the insect has a strong impact on flight speed, the lateral visual field plays only a minor role (Baird et al., 2010; Portelli et al., 2011; Kern et al., 2012).
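The idea of holding optic flow at a preset strength can be sketched as a simple feedback loop. The example assumes a proportional controller acting on the image velocity v/d, where d is the clearance to the nearest structure; the set-point, gain and clearances are hypothetical values, and the controller is an illustration rather than the mechanism established in the cited work.

```python
def settle_speed(clearance, flow_setpoint=2.0, gain=2.0, dt=0.01, steps=2000,
                 start_speed=0.1):
    """Simulate a proportional controller that holds image velocity at a set-point.

    clearance     -- distance to the nearest structure (m)
    flow_setpoint -- preset image velocity (rad/s), a hypothetical value
    Returns the flight speed (m/s) after `steps` Euler updates.
    """
    speed = start_speed
    for _ in range(steps):
        flow = speed / clearance                       # experienced image velocity
        speed += gain * (flow_setpoint - flow) * dt    # too much flow: decelerate
    return speed


narrow_tunnel_speed = settle_speed(clearance=0.1)
wide_tunnel_speed = settle_speed(clearance=0.5)

print(f"narrow tunnel: {narrow_tunnel_speed:.2f} m/s")
print(f"wide tunnel:   {wide_tunnel_speed:.2f} m/s")
```

In the steady state of this sketch the speed equals the set-point multiplied by the clearance, so the simulated animal flies more slowly in the narrow tunnel while experiencing the same image velocity in both.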

Translational flight maneuvres during the spatial navigation of bees have a particularly elaborate fine structure and can be described by a distinct set of prototypical movements (**Figure 4C**). The optic flow generated during flight sequences close to visual landmarks appears to be systematically employed to localize a virtually invisible goal. Not only the overall velocity, but also the relative distribution of sideways and forward translational movements depend on the insect's distance and orientation relative to the landmarks and the goal (Zeil et al., 2009; Dittmar et al., 2010, 2011; Braun et al., 2012; Zeil, 2012). Bees, for example, frequently tend to perform translational movements with a strong sideways component close to landmarks, as if they wanted to scrutinize them in detail. These sideways movements are more pronounced if the landmarks are camouflaged by the same texture as their background and, thus, can be detected only by relative motion cues in the optic flow fields (Dittmar et al., 2010; Braun et al., 2012).

**FIGURE 4 | (A)** Control of translational velocity in blowflies. Boxplot of the translational velocity in flight tunnels of different widths, in a flight arena with two obstacles and in a cubic flight arena (sketched below data). Translation velocity strongly depends on the geometry of the flight arena. **(B)** Boxplot of the retinal image velocities within intersaccadic intervals experienced in the fronto-ventral visual field (see inset) in the different flight arenas. In this area of the visual field, the intersaccadic retinal velocities are kept roughly constant by regulating the translation velocity according to clearance with respect to environmental structures. The upper and lower margins of the boxes in **(A)** and **(B)** indicate the 75th and 25th percentiles, and the whiskers the data range (Data from Kern et al., 2012). **(C)** Translational and rotational prototypical movements of honeybees during local landmark navigation (see example in **Figure 2C**). Homing flight sequences can be decomposed into nine prototypical movements using clustering algorithms in order to reduce the behavioral complexity. Each prototype is depicted as a star plot containing the four velocity components drawn onto color-coded lines equally dividing the drawing plane (see inset). For each line, the distance of the dot from the center determines the value of the corresponding velocity component, and the error bars give the standard deviation of this value. Percentage values provide the relative occurrence of each prototype. More than 80% of flight-time corresponds to a varied set of translational prototypical movements and less than 20% has significantly non-zero rotational velocity corresponding to the saccades (Data from Braun et al., 2012).

## **PROCESSING OF OPTIC FLOW IN THE INSECT NERVOUS SYSTEM**

Separating the rotational and translational optic flow components behaviorally can be viewed as an efficient strategy to reduce the computational load for the nervous system when extracting information about the environment and, especially, about its spatial layout. Nonetheless, the retinal image flow resulting from the closed action–perception cycle still has complex spatiotemporal properties, and its processing represents a demanding challenge for the nervous system. In particular, there is not much time for gathering environmental information between saccades. With up to 10 saccades per second being generated, intersaccadic intervals may be as short as only a few ms and rarely longer than 100–200 ms. Time is a critical issue for at least three reasons: (1) All neural processing is time-consuming, beginning with the biophysical mechanisms of signal transduction in the photoreceptors, and ending with transmitter signaling at neuromuscular junctions. (2) Sensory input is encoded by nerve cells with only limited reliability. Repeated presentation of the same input may lead to variable neural responses, which constrain the information which can be transmitted within a given time interval. (3) Neural computations are not necessarily rigid, but may flexibly adjust to the prevailing stimulus conditions. To be functionally beneficial, the time constants of such adaptive processes need to match the behaviorally relevant timescale of changes of the various visual stimulus parameters.

These three issues become particularly challenging if information is to be processed and represented with sufficient reliability on the very short timescales that are behaviorally relevant for fast-flying insects. The virtuosity of the spatial behavior of many insects is proof that their sensory and nervous systems somehow cope successfully with this challenge. Since insects accomplish all this with very small brains comprising only a million or fewer neurons, they seem to be champions of resource-efficient information processing and behavioral control.

So far, we only have vague conceptions of how all this is accomplished. In the following, we briefly sketch the available knowledge about the processing of retinal image flow. Particular focus is placed on how the spatiotemporal properties of image flow are shaped by the closed action–perception cycle.

#### **SPATIOTEMPORAL VISUAL INPUT OF INSECTS IS SHAPED BY ACTIVE GAZE STRATEGIES**

From what has been sketched above, it may be obvious that the spatiotemporal characteristics of the input to the visual system will depend strongly not only on the features of the behavioral surroundings, but also on the specific dynamical characteristics of locomotion. These movements, resulting from the closed-loop nature of the behavior, may, in turn, depend on the environmental properties. The statistical properties of a wide variety of natural scenes have been characterized in many studies. The scenes analysed were usually stationary, or they resulted from movements either at constant velocities or with dynamics that differ a lot from that of unrestrained gaze changes during natural locomotion (e.g., Eckert and Buchsbaum, 1993; Dong and Attick, 1995; van Hateren, 1997; Simoncelli and Olshausen, 2001; Betsch et al., 2004; Geisler, 2008). In a recent study, we simulated the natural dynamics of the saccadic gaze strategy of insects and registered the resulting image sequences in a large variety of natural environments (Schwegmann et al., in preparation).

Given the characteristic temporal structure of behavioral dynamics, the parameters within these image sequences also change in a temporally structured way. Two aspects of such changes may be particularly relevant for extracting behaviorally relevant environmental information from the retinal image flow: (1) Relevant image parameters, such as brightness, contrast, and spatial frequency composition, vary according to image region and viewing direction, and fluctuate more rapidly during saccadic turns than during intersaccades. (2) During translatory intersaccadic movements, image parameters resulting from close structures fluctuate in general much more than those resulting from distant structures (**Figure 5**).

The dynamical properties imposed by the saccadic gaze change and the image statistics of natural environments constrain the time constants of information processing. Furthermore, the adaptive mechanisms that are thought to adjust the sensitivity of the visual system to the prevailing stimulus conditions have to operate on a suitable timescale. In particular, to optimize the encoding of the fluctuations of environmental image features during the intersaccadic intervals, adaptation in the visual system should essentially take place on a timescale shorter than the duration of these intervals (i.e., within some tens of milliseconds) and may be driven by the high-frequency changes of the respective image parameters. Several physiological components of motion adaptation have been described at the different levels of the fly visual system (e.g., Maddess and Laughlin, 1985; Brenner et al., 2000a; Harris et al., 2000; Fairhall et al., 2001; Kurtz, 2007; Kalb et al., 2008; Liang et al., 2008). To what extent the time constants of these processes, which have been identified with experimenter designed motion stimuli, match the dynamics of parameter changes in the natural visual input, and how these adaptive processes are controlled, is still not clear.

## **PERIPHERAL PROCESSING OF MOTION INFORMATION**

How is the environmental and, in particular, the spatial information extracted from the retinal image flow and represented in the visual motion pathway? The retinal input is transformed at the level of photoreceptors in basically two ways: (1) The retinal input is sampled by the array of photoreceptors. Compared with technical imaging systems, the number of image points and, thus, the spatial resolution is very low, with only approximately 750 image points per eye in *Drosophila* (Hardie, 1985), 5000 in the blowfly *Calliphora* (Beersma et al., 1977) and 5400 in honeybees (Seidl and Kaiser, 1981). The visual angle between photoreceptors is matched by their acceptance angle resulting in a blurred retinal image (Götz, 1965; van Hateren, 1993). Despite the low spatial resolution of the eyes of insects, they are obviously able to accomplish even intricate spatial vision tasks (see above). The low number of retinal input channels reduces the computational load for subsequent information processing tremendously and, thus, may be one reason why insects are so efficient with respect to computational expenditure. (2) As a consequence of the biophysical transduction machinery, the photoreceptors represent a kind of temporal low-pass filter. Owing to adaptive mechanisms, the strength of this temporal blurring depends on the ambient brightness, with the time-constants of blurring reflecting a trade-off between fast transmission and the reliability of the retinal output signals given the stochastic nature of the photons impinging on the photoreceptors (Juusola et al., 1994, 1996; Juusola, 2003).

The photoreceptor output is fed into the neural network of the first visual neuropile, the lamina (**Figure 6A**). Here, those photoreceptors looking at the same point in visual space converge on common second order neurons (Kirschfeld, 1972), thereby increasing the reliability of signal transmission, especially at low-light intensities (Laughlin, 1994). The photoreceptor signals are further processed in the lamina. (1) They are temporally band-pass filtered, thereby enhancing the representation of contrast changes in the retinal images (Laughlin, 1994; van Hateren, 1997). Owing to the special properties of the synapses between photoreceptors and second order neurons, the signal time course becomes faster and more transient with increasing background intensity (Juusola et al., 1995). Given the noisiness of the input signals and the limited dynamic range of nerve cells, the overall brightness-dependent spatiotemporal filter properties of the peripheral visual system are thought to maximize the flow of information about natural moving images (van Hateren, 1992). It should be noted that these conclusions are based so far on image sequences resulting from smoothly superimposed rotational and translational movements, without taking the different dynamical properties of image changes during saccades and intersaccades into account. During translational intersaccadic movements, the image dynamics can be expected to depend on the depth structure of the scenery, because the retinal images of distant objects move at lower velocities than those of near objects (**Figure 5**). (2) Recent evidence based on targeted genetic manipulations of individual cell types in the peripheral visual system of *Drosophila* indicates, though there are differences in details between studies, that the lamina output is segregated into parallel ON and OFF pathways, signaling either brightness increases or decreases (Joesch et al., 2010; Reiff et al., 2010; Clark et al., 2011).
One functional consequence of splitting the visual input into ON and OFF components is to facilitate the biophysical implementation of the mechanism of motion detection at subsequent stages of the visual system. The core of this mechanism is a multiplication-like interaction between neighboring retinal input channels (see below), which gives a positive output for two positive as well as for two negative inputs (Egelhaaf and Borst, 1992, 1993b; Eichner et al., 2011).
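
Why an ON/OFF split can ease the implementation of a sign-correct multiplication can be seen from a short algebraic sketch. It assumes idealized half-wave rectification into ON and OFF channels; the function names are hypothetical, and the sketch only verifies the arithmetic. It makes no claim about which of the four partial products are actually implemented in the fly visual system.

```python
def split_on_off(x):
    """Half-wave rectify a signed contrast signal into (ON, OFF) parts,
    both of which are non-negative."""
    return max(x, 0.0), max(-x, 0.0)

def multiply_via_on_off(a, b):
    """Rebuild the signed product a * b from products of non-negative
    channel signals only (an idealized, assumed scheme)."""
    a_on, a_off = split_on_off(a)
    b_on, b_off = split_on_off(b)
    same_sign = a_on * b_on + a_off * b_off      # ON x ON and OFF x OFF
    opposite_sign = a_on * b_off + a_off * b_on  # ON x OFF and OFF x ON
    return same_sign - opposite_sign

for a, b in [(0.5, 0.25), (-0.5, -0.25), (0.5, -0.25), (-0.5, 0.25)]:
    print(a, b, multiply_via_on_off(a, b))
```

Two positive and two negative inputs both yield a positive output, as required of the multiplication-like interaction, although each individual product only ever involves non-negative signals.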

#### **LOCAL MOTION COMPUTATION**

A lot is known, especially in flies, about the computations underlying motion vision. The available evidence on bees suggests that motion information is processed in their visual system according to similar principles. Local motion detection is assumed to be accomplished in the second visual neuropile, the medulla (**Figure 6A**). Motion-specific responses have been found in the two most proximal layers of the medulla. Most motion sensitive medulla neurons that could be functionally characterized have small receptive fields, as is expected from neurons involved in local motion detection (review: Strausfeld et al., 2006). As a consequence of the small size of the neurons in this brain area and the difficulty of recording their activity, conclusions concerning the cellular mechanisms underlying motion detection are still tentative. A lot of progress is currently being made by combining the sophisticated repertoire of genetic and molecular approaches in *Drosophila* with electrophysiological and imaging techniques to identify the different components of the neural circuits underlying motion detection (Rister et al., 2007; Joesch et al., 2008, 2010; Katsov and Clandinin, 2008; Borst, 2009; Reiff et al., 2010; Clark et al., 2011; Schnell et al., 2012).

A large number of features of motion detection can be accounted for by a computational model, the so-called correlation-type motion detector. In its simplest form, a local motion detector is composed of two mirror-symmetrical subunits. In each subunit, the signals of adjacent light-sensitive cells receiving the filtered brightness signals from neighboring points in visual space are multiplied after one of them has been delayed. The final detector response is obtained by subtracting the outputs of two such subunits with opposite preferred directions, thereby considerably enhancing the direction selectivity of the motion detection circuit. Each motion detector reacts with a positive signal to motion in a given direction and with a negative signal to motion in the opposite direction (reviews: Reichardt, 1961; Borst and Egelhaaf, 1989; Egelhaaf and Borst, 1993b). Various elaborations of this basic motion detection scheme have been proposed to account for the responses of insect motion-sensitive neurons under a wide range of stimulus conditions including even natural optic flow as experienced under free-flight conditions (see e.g., Borst et al., 2003; Lindemann et al., 2005; Brinkworth et al., 2009).
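
The basic scheme can be written down in a few lines. The following is a minimal sketch, assuming that the delay is realized as a first-order low-pass filter and that the input is a drifting sine grating sampled at two neighboring points; the function names, the time constant and the stimulus parameters are illustrative assumptions rather than values from the cited studies.

```python
import math

def low_pass(signal, tau, dt):
    """First-order low-pass filter, used here as the 'delay' stage."""
    alpha = dt / (tau + dt)
    out, y = [], signal[0]
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def correlation_detector(left, right, tau=0.035, dt=0.001):
    """Two mirror-symmetrical subunits: each multiplies the delayed signal of
    one input with the undelayed signal of its neighbor; the subunit outputs
    are then subtracted."""
    left_delayed = low_pass(left, tau, dt)
    right_delayed = low_pass(right, tau, dt)
    return [ld * r - rd * l
            for ld, r, rd, l in zip(left_delayed, right, right_delayed, left)]

def grating_inputs(temporal_freq, phase_offset, duration=2.0, dt=0.001, direction=+1):
    """Brightness at two neighboring points viewing a drifting sine grating.
    direction=+1: the pattern reaches the left input first."""
    n = int(duration / dt)
    left = [math.sin(2 * math.pi * temporal_freq * i * dt) for i in range(n)]
    right = [math.sin(2 * math.pi * temporal_freq * i * dt - direction * phase_offset)
             for i in range(n)]
    return left, right

for direction in (+1, -1):
    left, right = grating_inputs(4.0, math.pi / 2, direction=direction)
    response = correlation_detector(left, right)
    print(direction, sum(response) / len(response))
```

The time-averaged output is positive for one direction of motion and negative, with roughly equal magnitude, for the opposite direction.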

#### **EXTRACTION OF OPTIC FLOW INFORMATION**

Since the optic flow as induced during locomotion has a global structure, it cannot be represented in any specific way by local mechanisms alone. Rather, local motion measurements from large parts of the visual field need to be combined. This is accomplished in the third visual neuropile, the lobula complex, by directionally selective wide-field neurons (**Figure 6**) in all insect species analysed so far. Independent of the species under investigation, these neurons will here be collectively referred to as LWCs (lobula complex wide-field cells). LWCs have been investigated in particular detail in flies, where they reside in the distinct posterior part of the lobula complex; they are, therefore, often termed lobula plate tangential cells (LPTCs). In bees, the lobula complex is undivided; however, bees have very similar motion-sensitive wide-field neurons to those characterized in the lobula plate of flies (DeVoe et al., 1982; Ibbotson, 1991). Most LWCs spatially pool the outputs of many retinotopically arranged local motion-sensitive neurons on their large dendrites and, accordingly, have large receptive fields. These local motion-sensitive neurons are thought to correspond to the local motion detectors, as described above. LWCs are excited by motion in their preferred direction and are inhibited by motion in the opposite direction (reviews: Hausen and Egelhaaf, 1989; Krapp, 2000; Borst and Haag, 2002; Egelhaaf et al., 2002; Egelhaaf, 2006; Taylor and Krapp, 2008; Borst et al., 2010).

For fly LWCs, the local motion-sensitive elements that synapse onto their dendrites have been concluded to differ in their preferred direction of motion. As a consequence, local preferred directions of LWCs change gradually over their receptive field and it has been suggested that they coincide with the directions of the velocity vectors characterizing the flow fields that are induced during certain types of self-motion (Hausen, 1982; Krapp et al., 1998, 2001; Petrowitz et al., 2000; Taylor and Krapp, 2008).

Despite the characteristic patterns of preferred directions in the receptive fields of LWCs, dendritic pooling of motion input is not sufficient to obtain specific responses during particular types of self-motion. Network interactions, mediated by both electrical and chemical synapses, between LWCs within one brain hemisphere and between both halves of the visual system are important for shaping their specific sensitivities for optic flow (**Figure 6B**; reviews: Borst and Haag, 2002; Egelhaaf et al., 2002; Egelhaaf, 2006; Borst et al., 2010). To enhance the specificity of LWCs for particular global optic flow patterns, interactions between both visual hemispheres are particularly relevant. The optic flow, for instance, across both eyes during forward translation is directed backwards. In contrast, during a pure rotation about the animal's vertical axis, optic flow is directed backwards across one eye, but forwards across the other eye. Thus, translational and rotational optic flow can, at least in principle, be distinguished if motion from both eyes is taken into account (Hausen, 1982; Egelhaaf et al., 1993; Horstmann et al., 2000; Farrow et al., 2003, 2006; Karmeier et al., 2003; Borst and Weber, 2011; Hennig et al., 2011). Other LWCs of blowflies, the figure detection (FD) cells, respond best to the motion of objects rather than to global optic flow patterns. This object sensitivity could be shown for one prominent element of this group of cells to be a consequence of inhibitory synaptic interactions with other LWCs (**Figures 6B–D**) (Egelhaaf, 1985b; Warzecha et al., 1993; Kimmerle and Egelhaaf, 2000a,b; Hennig et al., 2008, 2011; Hennig and Egelhaaf, 2012; Liang et al., 2012). FD cells are thought to play a prominent role in detecting stationary objects in the environment, such as landing sites that are distinguished from their background by motion, and also other visual cues. 
Other LWCs found in various fly species respond to much smaller objects than do FD cells. These cells were interpreted as being involved in detecting and pursuing prey and/or mates (Olberg, 1981, 1986; Gilbert and Strausfeld, 1991; Nordström et al., 2006; Nordström and O'Carroll, 2006; Barnett et al., 2007; Geurten et al., 2007; Trischler et al., 2007) and it is suggested they owe their exquisite sensitivity for extremely small targets to a variety of local and global synaptic interactions (Nordström, 2012).
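
The argument that translational and rotational optic flow can be distinguished by combining motion signals from both eyes can be made concrete with a deliberately simplified sketch. It assumes that each eye delivers a single signed horizontal flow value (front-to-back motion counted as positive) and that the two components superimpose linearly; the function name and this sign convention are assumptions, and the real network interactions between LWCs are not linear in this way.

```python
def decompose_binocular_flow(left_front_to_back, right_front_to_back):
    """Linear caricature of binocular integration.

    Forward translation drives both eyes front-to-back (same sign), whereas
    rotation about the vertical axis drives the two eyes with opposite signs.
    """
    translation = 0.5 * (left_front_to_back + right_front_to_back)
    rotation = 0.5 * (left_front_to_back - right_front_to_back)
    return translation, rotation

print(decompose_binocular_flow(1.0, 1.0))    # pure forward translation
print(decompose_binocular_flow(1.0, -1.0))   # pure rotation about the vertical axis
print(decompose_binocular_flow(1.5, 0.5))    # superposition of both
```

The sum of the two monocular signals isolates the translational component and their difference isolates the rotational component, which is the principle, though not the mechanism, behind the interactions between both visual hemispheres.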

Although the synaptic interactions between LWCs may increase their specificity for particular types of optic flow and stimulus sizes, this specificity is usually far from being perfect, and most neurons still respond to a wide range of "non-optimal" stimuli indicating that behaviorally relevant motion information is encoded by the activity profile of populations of LWCs rather than by the responses of individual cells.

Despite their specific differences, LWCs have general properties which may be functionally relevant in the context of spatial vision.

• *Velocity dependence*: LWCs do not operate like odometers: their mean responses increase with increasing velocity, reach a maximum, and then decrease again. Hence, their response does not reflect pattern velocity unambiguously. This ambiguity is even more complex, since the location of the velocity maximum depends on the textural properties of the moving stimulus pattern. If the spatial frequency of a drifting sine-wave grating is shifted to lower values, the velocity optimum shifts to higher values. In terms of the correlation model of motion detection, the location of the temporal frequency optimum is determined by the time constant of the delay filters in the local motion detectors (review: Egelhaaf and Borst, 1993b). The pattern dependence of velocity tuning is reduced if the stimulus pattern consists of a broad range of spatial frequencies, as is characteristic of natural scenes (Dror et al., 2001; Straw et al., 2008). Despite these ambiguities, flies and bees appear to regulate their intersaccadic translation velocity during free-flight to keep the retinal velocities in that part of the operating range of the motion detection system in which responses increase monotonically with retinal velocities (Baird et al., 2010; Portelli et al., 2011; Kern et al., 2012).


level of LWCs, making them dependent on the direction of motion (reviews: Clifford and Ibbotson, 2003; Egelhaaf, 2006; Kurtz, 2009). All these processes are usually regarded as adaptive, although their functional significance is still not entirely clear. Several non-exclusive possibilities have been proposed, such as adjusting the dynamic range of motion sensitivity to the prevailing stimulus dynamics (Brenner et al., 2000a; Fairhall et al., 2001), saving energy by adjusting the neural response amplitudes without affecting the overall information that is conveyed (Heitwerth et al., 2005), and increasing the sensitivity to changes in stimulus parameters resulting from environmental discontinuities (Maddess and Laughlin, 1985; Liang et al., 2008, 2011; Kurtz et al., 2009).
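
The velocity dependence described above, and its shift with spatial frequency, follow directly from the correlation model under simple assumptions. The sketch below assumes a first-order low-pass filter with time constant `tau` as the delay stage and a drifting sine grating as the stimulus; under these assumptions the mean response is proportional to ωτ/(1 + (ωτ)²), with ω the angular temporal frequency, and peaks at ωτ = 1. The function names and the numerical values of `tau` and `sampling_base` are illustrative assumptions.

```python
import math

def mean_detector_response(velocity, wavelength, tau=0.035, sampling_base=2.0):
    """Steady-state mean response of a correlation detector to a sine grating.

    velocity in deg/s; wavelength and sampling_base in deg; tau in s.
    """
    omega = 2.0 * math.pi * velocity / wavelength   # temporal frequency (rad/s)
    geometry = math.sin(2.0 * math.pi * sampling_base / wavelength)
    return geometry * omega * tau / (1.0 + (omega * tau) ** 2)

def optimal_velocity(wavelength, tau=0.035):
    """Velocity at which the response peaks, i.e. where omega * tau == 1."""
    return wavelength / (2.0 * math.pi * tau)

for wavelength in (20.0, 40.0):
    v_opt = optimal_velocity(wavelength)
    print(wavelength, round(v_opt, 1),
          [round(mean_detector_response(k * v_opt, wavelength), 3)
           for k in (0.5, 1.0, 2.0, 4.0)])
```

Doubling the spatial wavelength, that is, halving the spatial frequency, doubles the velocity optimum, while the temporal frequency optimum stays fixed by the filter time constant.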


## **BEHAVIORAL SIGNIFICANCE OF OPTIC FLOW NEURONS**

What is the functional significance of the response characteristics of the motion sensitive and directionally selective LWCs described above? Two related and, to some extent, interdependent views are prevalent in the literature: (1) LWCs are conventionally conceived as self-motion sensors and, in particular, rotation detectors, in other words, neural elements sensing deviations of the animal from its normal attitude and/or flight course. (2) It is often implicitly assumed that the motion detection system should produce responses that come close to a veridical representation of the retinal velocities. Deviation from this velocity representation, such as the ambiguities in the responses resulting from the pattern properties of the stimulus and the fact that the response first increases with increasing velocity, but then decreases again beyond some velocity level (see above), are then regarded as deficiencies of an imperfect biological mechanism. However, it is becoming increasingly obvious from recent research that both views need to be qualified given the peculiar spatiotemporal characteristics of the retinal image flow resulting from the active vision strategies of insects. Moreover, constraints imposed by the timescale of behavior need to be taken into account when interpreting the functional significance of LWCs.

#### **A ROLE OF LWCs IN MEDIATING COMPENSATORY OPTOMOTOR TURNING RESPONSES**

LWCs are commonly thought to mediate compensatory optomotor turning responses of the entire body as well as the head. The strongest, though not very specific, evidence is based on the fact that many characteristics of the behavioral responses correlate well with the response characteristics of LWCs: they show similar velocity sensitivity, and the local preferred directions of various LWCs appear to match rotational optic flow fields and, thus, were interpreted as an adaptation to detect rotational self-motion of the animal around different axes (Krapp and Hengstenberg, 1996; Krapp et al., 1998, 2001; Krapp, 2000; Elyada et al., 2009).

Optomotor following of the entire animal is often analysed in tethered flight both under open- and closed-loop conditions: Here, the fly generates turning responses of the head and the body and follows the moving pattern. This response is usually interpreted to reflexively stabilize the retinal images by minimizing the retinal velocities, for instance, resulting from external and/or internal disturbances (Hausen and Egelhaaf, 1989; Krapp, 2000; Borst and Haag, 2002; Egelhaaf, 2006; Taylor and Krapp, 2008; Borst et al., 2010). However, only rotational optic flow can be eliminated in this way, and the retinal images cannot be stabilized entirely during flight, because the animal needs to translate if it wants to move from one place to another.

A general feature of compensatory optomotor responses is that they are relatively slow. Their response dynamics differ considerably from the much faster object-induced fixation responses (Egelhaaf, 1987, 1989; Warzecha and Egelhaaf, 1996; Duistermars et al., 2007; Rosner et al., 2009). What is the functional significance of such slow compensatory optomotor responses under natural behavioral conditions? Since intersaccadic gaze stabilization is very fast, it is hardly conceivable that it could be controlled by optomotor feedback. Optomotor feedback can play a role only at a much slower timescale, for instance, to compensate for steady asymmetries at the level of the sensory input (e.g., dirt on one eye or internal gain differences) or the motor output (e.g., worn-out wings). Evidence for this comes from experiments where asymmetries were introduced to the visual system by occluding one of the eyes (Kern et al., 2000, in preparation). These behavioral results indicate that LWCs may play a role in mediating compensatory responses of the animal to slow unintended deviations from course, after their output signals are considerably low-pass filtered. So far, it is not clear where in the nervous system downstream of the lobula complex and by what mechanisms this filtering is accomplished.

In addition to the body, the head of flies and bees also performs compensatory optomotor responses in both tethered and free-flight. Compensatory head movements are most prominent during roll rotations of the body as are generated during banked saccadic turns and during sideways translations (Hengstenberg, 1993; van Hateren and Schilstra, 1999; Boeddeker and Hemmi, 2010; Boeddeker et al., 2010; Geurten et al., 2010). Fast gaze stabilization in flies is mainly achieved by mechanosensory input from halteres that act as gyroscopes (Sandeman and Markl, 1980). However, some LWCs have a rather direct impact on head muscles and, thus, on mediating head rotations (Milde et al., 1987, 1995; Gronenberg and Strausfeld, 1990; Gronenberg et al., 1995; Huston and Krapp, 2008, 2009). Bees, like most other insects, lack specialized inertial sensors like halteres. Nonetheless, they also show an optomotor reflex that uses visual motion to stabilize the head with respect to the visual environment under free-flight conditions at retinal velocities of up to 300°/s (Boeddeker and Hemmi, 2010). Experiments on fruit flies provide a similar picture: whereas the visual system is tuned to relatively slow rotation, the haltere-mediated response to mechanical oscillation increases with rising angular velocity (Hengstenberg, 1993; Sherman and Dickinson, 2003, 2004).

In conclusion, LWCs are likely to mediate optomotor responses on a relatively slow timescale, and might thus help to compensate for rotational optic flow arising from internal asymmetries of the animal. Given the extremely rapid timescale on which gaze direction is stabilized during saccadic flight maneuvres and the response latencies of visually mediated head responses, the functional role of LWCs for compensatory head rotations under free-flight conditions is still not entirely clear.

#### **A ROLE OF LWCs IN GATHERING INFORMATION ABOUT THE ENVIRONMENT DURING INTERSACCADIC INTERVALS**

The time that flies and bees keep their gaze straight amounts to more than 80% of the overall flight-time (Schilstra and van Hateren, 1999; van Hateren and Schilstra, 1999; Boeddeker et al., 2005, 2010; Braun et al., 2010, 2012; Geurten et al., 2010; van Breugel and Dickinson, 2012). Hence, rotations are squeezed into relatively short and rapid saccadic turns. This flight and gaze strategy has been interpreted as a way to facilitate gathering environmental information that is contained in the retinal image flow during translatory self-motion (see above). Therefore, motion-sensitive neurons appear to be predestined to provide environmental information during intersaccadic intervals.

This suggestion is plausible, because the specificity of most LWCs for rotational optic flow is not exclusive and they also respond strongly to translational optic flow (Hausen, 1982; Horstmann et al., 2000; Karmeier et al., 2003, 2006; Taylor and Krapp, 2008). Moreover, the most prominent rotations performed by insects in free-flight, the saccadic turns, lead to angular velocities that are much beyond the monotonic operating range of the motion detection system (see above); rather the monotonic operating range roughly matches the intersaccadic translational velocities in those retinal regions that are probably involved in controlling the translation velocity of the animal (Kern et al., 2012).

As has been stressed above, LWCs are not veridical sensors of velocity and, thus, do not provide unambiguous information about self-motion. This is particularly obvious for the translatory movements during intersaccadic intervals, because here, retinal velocities do not only depend on the velocity of locomotion, but also on the three-dimensional layout of the environment. This dependency is reflected in the responses of HS cells, a group of three fly LWCs with a main preferred direction from the front to the back in the visual field of one eye. These neurons depolarize if environmental structures are sufficiently close, especially during translatory self-motion with a strong sideways component (**Figure 8**) (Boeddeker et al., 2005; Kern et al., 2005; Lindemann et al., 2005; Liang et al., 2012). Similar results were obtained in further LWCs during translatory movements in other directions (Karmeier et al., 2006). However, spatial information is only provided by LWCs if rotational movements are largely eliminated during the intersaccadic intervals, emphasizing the importance of the active saccadic flight and gaze strategy in the context of spatial vision (Kern et al., 2006). The responses to objects nearby are further augmented by adaptation mechanisms, which depend on stimulus history, and, thus, on the properties of previous flight sequences (Liang et al., 2008, 2011).

What is the range within which spatial information is encoded in this way? Under spatially constrained conditions where the flies flew at translational velocities of only slightly more than 0.5 metres per second, the spatial range within which significant distance dependent intersaccadic responses are evoked amounts to approximately two metres (Kern et al., 2005; Liang et al., 2012). Since a given retinal velocity is determined in a reciprocal way by distance and velocity of self-motion, respectively, the spatial range that is represented by LWCs can be expected to increase with increasing translational velocity. In other words, the behaviorally relevant spatial range can be assumed to scale with locomotion velocity. From an ecological point of view, this consequence of the closed-loop nature of vision is economical and efficient, since the behaviorally relevant spatial depth range increases during fast self-motion. A fast moving animal can thus initiate an avoidance maneuvre earlier and at a greater distance from an obstacle than when moving slowly.
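
The scaling argument can be written out as a short worked relation. As an illustrative assumption, consider a stationary point at distance $d$ seen at an angle $\theta$ relative to the direction of translation, and suppose that a response is evoked only if the retinal angular velocity exceeds a single threshold $\omega_{\min}$ (a simplification introduced here, not a measured quantity):

$$
\omega = \frac{v \sin\theta}{d}
\quad\Longrightarrow\quad
d_{\max} = \frac{v \sin\theta}{\omega_{\min}} .
$$

For a purely lateral viewing direction ($\theta = 90^\circ$), the values given above ($v \approx 0.5$ m/s, $d_{\max} \approx 2$ m) would correspond to $\omega_{\min} \approx 0.25$ rad/s, or roughly $14^\circ$/s. Under the same assumptions, doubling the translational velocity to 1 m/s would double the range to about 4 m, since $d_{\max}$ is proportional to $v$.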

Recently, we found that the responses of bee LWCs to visual stimuli as experienced during navigation flights in the vicinity of an invisible goal also strongly depend on the spatial layout of the environment. The spatial landmark constellation that guides the bees to their goal leads to a characteristic time-dependent response profile in LWCs during the intersaccadic intervals of navigation flights (Mertes et al. in preparation).

The responses of LWCs of flies and bees do not only depend on the retinal velocities, but are also sensitive to pattern properties (**Figure 7**; see above). Although the pattern-dependent modulations in the neural responses have been conventionally viewed as detrimental to the velocity signal, they may reflect functionally relevant information about the environment (Meyer et al., 2011; Hennig and Egelhaaf, 2012). This may be the case especially during intersaccadic translatory movements: since the retinal velocity scales with distance, an object nearby will lead to larger intersaccadic depolarization than a more distant one. Assuming that objects nearby are especially functionally relevant, object detection via optic flow automatically weighs objects according to their distance and, thus, their functional relevance. In other words, cluttered spatial scenery is segmented in this way, without much computational expenditure, into nearby and distant objects.

The amplitude of pattern-induced neural responses depends to a large extent on the size of the neuron's receptive field. Large receptive fields blur pattern-dependent response fluctuations and, thus, improve the quality of velocity signals (**Figure 7**). However, they do this at the expense of how well the signals can be localized. Hence, if motion signals originating from an object need to be localized by a neuron in the visual field, its receptive field should be sufficiently small; then, however, velocity coding is only poor and the signal provides local pattern information (Meyer et al., 2011). Hence, a neuron that is to encode spatial information on the basis of optic flow elicited during translatory self-motion should possess a receptive field that matches the size of the behaviorally relevant objects or textures. Sensitivity to objects may be further augmented by inhibitory spatial interactions, as is characteristic of blowfly FD cells (Hennig and Egelhaaf, 2012), and also by adaptive mechanisms (Liang et al., 2008, 2011). The enhanced sensitivity to objects in FD cells results from non-linearities in the synaptic interactions between an inhibitory neuron and the FD cell, on the one hand (Egelhaaf, 1985c; Hennig et al., 2008), and from the excitatory receptive field of the FD cell being smaller than that of its inhibitory input, on the other hand (**Figures 6B–D**) (Egelhaaf, 1985b; Egelhaaf et al., 1993; Krapp et al., 2001). In addition, the larger receptive field of the inhibitory LWC enhances the pattern-dependent response fluctuations in the FD cell (Hennig and Egelhaaf, 2012). Thus, the same mechanism which accounts for the FD cells being highly sensitive to objects defined by relative motion cues is also responsible for their sensitivity to objects which are defined by discontinuities in the textural properties of the environment.

It became evident in recent studies that the response properties of fly LWCs are affected by the behavioral state of the animal. Most prominently, the response amplitudes of LWCs increase if the animal is behaviorally active during the electrophysiological recording (Chiappe et al., 2010; Maimon et al., 2010; Rosner et al., 2010; Jung et al., 2011). This effect can be mimicked to some extent by application of the octopamine agonist CDM, which may induce an increase in overall spike rate and a slight shift in the velocity tuning (Longden and Krapp, 2009, 2010; Jung et al., 2011; de Haan et al., 2012; Rien et al., 2012). Octopamine has already been shown much earlier to increase the overall spike rate of LWCs in honeybees, although changes in velocity tuning have not been tested (Kloppenburg and Erber, 1995). These changes in LWC properties related to the behavioral state of the animal are unlikely to alter the conclusions about how environmental features are represented during intersaccadic LWC responses. High intersaccadic velocities, for instance, occur close to objects or the walls of the flight arena. A shift in velocity tuning toward higher velocities would reduce the likelihood of retinal velocities beyond the monotonic response range of the motion detection system and, thus, would improve the encoding of distance information.

We can conclude that LWCs of flies and bees provide information about the spatial layout and the pattern properties of the environment. This information is linked to the translational self-motion of the flying animal during intersaccadic intervals. As a consequence of the action–perception cycle and the distance dependence of translational optic flow, this spatial information is confined to the behaviorally relevant range of up to a few metres. Within this range, the animal has to take action, for instance, to avoid collisions with obstacles, to select a landing place or to employ environmental objects as landmarks in order to learn and/or find the location of a barely visible goal.

#### **CONSTRAINTS SET BY A TIMESCALE OF NATURAL BEHAVIOR**

In classical behavioral paradigms using tethered flying insects, the experimenter-defined motion sequences usually stay constant on a timescale of several hundreds of milliseconds and even seconds. However, during unrestrained behavior, the retinal motion patterns continually change. As a consequence of the typical saccadic flight and gaze strategy of insects (see above), optic flow dynamics during natural locomotion also deviate considerably from dynamic stimuli (e.g., white-noise velocity fluctuations) that are often employed in characterizing LWCs. In the context of spatial vision, the intersaccadic intervals are of particular interest. Although they take up, on the whole, more than 80% of the entire flight time, they may be as short as 30 ms.

Why is the duration of intersaccadic intervals and, thus, the timescale on which information about the environment needs to be processed an issue at all? On the one hand, neurons are relatively unreliable computing devices and, on the other hand, the spatial behavior of flying insects takes place on a comparatively rapid timescale. The problem of reliability is particularly daunting, as there is not much redundancy at the output level of the insect visual system which would allow for the pooling of information across equivalent neurons.

When the same stimulus is presented repeatedly to a neuron, the responses may vary a lot between trials. Neuronal activity fluctuates continually even during constant velocity motion (reviews: Pelli, 1991; de Ruyter van Steveninck and Bialek, 1995; Warzecha and Egelhaaf, 2001). On the basis of individual response traces, it is not easily possible to discern stimulus-driven activity changes from those that are due to sources not associated with the stimulus ("noise"). The origin of various potential noise sources in the visual motion pathway and the consequences of the unreliable nature of neural signals have been analysed in flies (e.g., de Ruyter van Steveninck and Bialek, 1995; de Ruyter van Steveninck et al., 1997; Warzecha and Egelhaaf, 1999; Warzecha et al., 2000; Egelhaaf et al., 2001; Lewen et al., 2001; Borst, 2003; Grewe et al., 2003, 2007; Nemenman et al., 2008). These aspects, as well as the impact of neuronal noise on the precision with which motion information can be encoded, have been controversially discussed (Haag and Borst, 1997, 1998; Warzecha and Egelhaaf, 1997; Warzecha et al., 1998, 2000, 2003; Brenner et al., 2000b; Fairhall et al., 2001; Kalb, 2006). One aspect appears to be especially relevant in the context of computing spatial information: given that neuronal responses are noisy, it will take some time to infer reliably behaviorally relevant environmental information from neuronal activity. Bayesian analysis of noisy intersaccadic responses of individual fly LWCs and populations of LWCs reveals that sufficiently reliable information about translatory self-motion and, thus, about spatial parameters of the environment can be decoded already on a timescale of little more than 5 ms and, thus, on a time-scale of even the shortest intersaccadic intervals (Karmeier et al., 2005). 
Since the neural responses in this analysis were integrated over time, the intersaccadic responses decoded on this basis do not allow for resolving temporal response fluctuations that may arise from pattern properties during an intersaccadic interval. How much the neural responses fluctuate in a pattern-dependent way on a timescale of intersaccades needs to be investigated by scrutinizing individual responses to translations in natural surroundings.
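The logic of decoding environmental information from noisy responses within a short time window can be sketched as follows. This is an illustration only and not the analysis of Karmeier et al. (2005): it assumes independent Poisson spike counts, a flat prior, and two hypothetical stimuli ("near" and "distant" object) with made-up mean firing rates.

```python
import math

def poisson_log_likelihood(count, rate_hz, window_s):
    """Log-probability of `count` spikes in `window_s` for a Poisson neuron."""
    expected = rate_hz * window_s
    return count * math.log(expected) - expected - math.lgamma(count + 1)

def posterior(counts, rates_hz, window_s):
    """Posterior over candidate stimuli given the spike counts of several neurons.

    counts:   one spike count per neuron
    rates_hz: rates_hz[s][n] is the assumed mean rate of neuron n for stimulus s
    A flat prior and independent neurons are assumed.
    """
    logs = [sum(poisson_log_likelihood(c, r, window_s)
                for c, r in zip(counts, stimulus_rates))
            for stimulus_rates in rates_hz]
    peak = max(logs)                      # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical tuning: stimulus 0 = near object, stimulus 1 = distant object.
RATES = [[200.0, 180.0], [50.0, 60.0]]
p_short = posterior([1, 1], RATES, window_s=0.005)   # 5 ms window
p_long = posterior([6, 5], RATES, window_s=0.030)    # 30 ms window
print(p_short, p_long)
```

With these invented rates the posterior sharpens as the integration window lengthens or as more neurons are pooled; how quickly it does so for real LWCs depends on their actual response statistics.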

## **CONCLUSIONS**

Despite their small brains with fewer than a million neurons and a spatial resolution of their eyes much lower than that of any useful technical camera system, insects such as flies or bees are able to solve complex spatial tasks, such as avoiding collisions with obstacles, landing on objects or even finding hardly visible goals on the basis of spatial landmark information. Insects outperform man-made autonomous flying systems in these tasks, especially if resource efficiency with respect to computational expenditure and energy consumption is taken as a benchmark. Moreover, insects accomplish this at flight velocities that imply rapidly time-varying retinal image flow. The processing of rapid retinal image flow poses great challenges for the neuronal machinery, given the limited reliability of neurons as computing devices. Obviously, as a consequence of millions of years of evolution, insect nervous systems have become well adapted to cope successfully with these computational challenges and to solve those computational tasks that are relevant for the success of the species efficiently and parsimoniously.

One means to accomplish their extraordinary performance is that flies and bees actively shape the image flow on their eyes by their characteristic flight behavior. Neural processing of spatial and textural information about the environment is greatly facilitated by largely segregating the rotational from the translational optic flow through a saccadic flight and gaze strategy. It is suggested that tuning the neural networks of motion computation to the specific spatiotemporal properties of the actively shaped optic flow patterns enables the nervous system to solve apparently complex spatial vision tasks more efficiently and parsimoniously than might be possible without such an active vision strategy. Only by taking into account the characteristics of the retinal image flow that is generated under natural closed-loop conditions did it become clear that the classical interpretations of the functional significance of neurons sensitive to optic flow need to be at least modified and extended: these neurons reflect information not only about the animal's self-motion, but also, through the image flow generated during intersaccadic translational movements, about the outside world. Accordingly, these neurons may be regarded as sensors for environmental information that, as a consequence of the distance dependence of translational optic flow, weigh environmental information in computationally inexpensive ways according to its presumptive significance for spatial vision.

Hence, we can conclude from the experimental work on the spatial behavior of insects and the underlying neural mechanisms, in combination with model simulations, that biological systems such as flies or bees derive part of their power as autonomous systems from scrutinizing their environment during the execution of sets of carefully selected motor routines, instead of just passively gathering information about the world. These animal–environment interactions lead to adaptive behavior in environments of a wide range of complexity. Model simulations and robotic implementations reveal that the smart biological mechanisms of motion computation and flight control might be helpful when designing micro air vehicles that may carry an onboard processor of only relatively small size and weight (Floreano et al., 2009).

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 September 2012; accepted: 03 December 2012; published online: 20 December 2012.*

*Citation: Egelhaaf M, Boeddeker N, Kern R, Kurtz R and Lindemann JP (2012) Spatial vision in insects is facilitated by shaping the dynamics of visual input through behavioral action. Front. Neural Circuits 6:108. doi: 10.3389/fncir.2012.00108*

*Copyright © 2012 Egelhaaf, Boeddeker, Kern, Kurtz and Lindemann. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Closed-loop response properties of a visual interneuron involved in fly optomotor control

## *Naveed Ejaz<sup>1,2</sup>\*, Holger G. Krapp<sup>2</sup> and Reiko J. Tanaka<sup>2</sup>*

*<sup>1</sup> Institute of Cognitive Neuroscience, University College London, London, UK <sup>2</sup> Department of Bioengineering, Imperial College London, London, UK*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Michael Nitabach, Yale University School of Medicine, USA Dierk F. Reiff, Albert-Ludwigs-Universität Freiburg, Germany*

#### *\*Correspondence:*

*Naveed Ejaz, Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London WC1N 3AR, UK. e-mail: n.ejaz@ucl.ac.uk*

Due to methodological limitations, neural function is mostly studied under open-loop conditions. Normally, however, nervous systems operate in closed-loop, where sensory input is processed to generate behavioral outputs, which in turn change the sensory input. Here, we investigate the closed-loop responses of an identified visual interneuron, the blowfly H1-cell, that is part of a neural circuit involved in optomotor flight and gaze control. Those behaviors may be triggered by attitude changes during flight in turbulent air. The fly analyses the resulting retinal image shifts and performs compensatory body and head rotations to regain its default attitude. We developed a fly-robot interface to study H1-cell responses in a 1 degree-of-freedom image stabilization task. Image shifts, induced by externally forced rotations, modulate the cell's spike rate, which controls counter rotations of a mobile robot to minimize relative motion between the robot and its visual surroundings. A feedback controller closed the loop between neural activity and the rotation of the robot. Under these conditions we found the following H1-cell response properties: (i) the peak spike rate decreases when the mean image velocity is increased, (ii) the relationship between spike rate and image velocity depends on the standard deviation of the image velocities, suggesting adaptive scaling of the cell's signaling range, and (iii) the cell's gain decreases linearly with increasing image accelerations. Our results reveal a remarkable qualitative similarity between the response dynamics of the H1-cell under closed-loop conditions and those obtained in previous open-loop experiments. Finally, we show that the adaptive scaling of the H1-cell's responses, while maximizing information on image velocity, decreases the cell's sensitivity to image accelerations. Understanding such trade-offs in biological vision systems may advance the design of smart vision sensors for autonomous robots.

**Keywords: brain machine interface, optomotor control, closed-loop system, blowfly, electrophysiology, optic flow, motion vision**

#### **INTRODUCTION**

In recent years an increasing interest has emerged to apply biological principles of signal processing and control design to autonomous robotics. An enormous body of behavioral and physiological data accumulated over several decades on how the nervous system, mostly of insects, uses sensory signals for motor control (e.g., review: Taylor and Krapp, 2007) led to a significant growth in biomimetic robotics (Floreano et al., 2009; Srinivasan, 2011; Srinivasan et al., 2012). The major drive for this development comes from two directions: engineers are keen to exploit biology for the design of new robust as well as adaptive sensor and control systems, while neurobiologists are interested in robotics as a tool to validate their experimentally derived functional principles (Webb, 2008; Barth et al., 2012).

A prominent example of the joint venture between neurobiologists and engineers is the application of functional principles of insect vision to guidance, navigation, and control in aerial robotics (Srinivasan et al., 2012). Discoveries on how flies and bees process visual motion information to estimate their self-motion and control their flight have sparked a number of projects where the underlying principles were implemented in autonomous small-scale air vehicles (Hyslop and Humbert, 2010; Hyslop et al., 2010). Although most control systems, both in biology and engineering, operate under closed-loop conditions, many implementations so far were based on experimental data obtained under open-loop conditions.

Invertebrate animal models are ideally suited for studying the response properties of neural control circuits generating movements under both open- and closed-loop conditions. Specifically, flies display a broad repertoire of visually guided behaviors including gaze and flight stabilization reflexes which can readily be quantified at both the behavioral and the electrophysiological level. Visuo-motor stabilization behaviors or optomotor reflexes have been extensively studied at the behavioral level under both open- and closed-loop conditions (Götz, 1964, 1968; review: Heisenberg and Wolf, 1993). Correspondingly, a great deal is now known about the open-loop response properties of a population of visual interneurons in flies, the lobula plate tangential cells (LPTCs; review: Krapp and Wicklein, 2008), which contribute to the control of optomotor reflexes (review: Hausen, 1993). However, with only a single exception (Warzecha and Egelhaaf, 1996), studies on LPTC response properties were all carried out under open-loop conditions. The specific involvement of the LPTCs in fly visual stabilization behavior naturally poses the question as to whether or not response dynamics observed under closed-loop conditions are comparable with those measured in open-loop.

Here, we compare the open- and closed-loop response properties of an identified visual interneuron, the H1-cell, which is part of a neural circuit that provides optomotor reflexes in the fly (Haag and Borst, 2001). Specifically, we compare the effect of dynamically changed image velocities and accelerations on the instantaneous spike rate of the cell under open- and closed-loop conditions. We report that the response properties are qualitatively similar under both conditions and discuss the implication of our results in the context of fly optomotor reflexes with respect to potential applications to bio-inspired control design.

## **MATERIALS AND METHODS**

#### **FLY-ROBOT INTERFACE**

The closed-loop fly-robot interface (FRI; **Figure 1A**) uses the H1-cell of an immobilized fly, placed in front of two cathode ray tube (CRT) displays, as a sensor that provides an estimate of the horizontal angular velocity of a visual pattern (spatial wavelength λ<sub>sp</sub> = 11°, contrast ≈ 100%). The spike rate of the H1-cell resulting from pattern motion on the CRT displays was used by closed-loop feedback controllers to regulate the angular velocity of the robot (**Figure 1B**). The robot was positioned on a turntable placed inside a cylindrical arena lined with a vertically oriented grating pattern. The dynamic properties of the robot (Arexx Engineering, ASURO Robot Kit) and the turntable represented the real-world actuator components of the FRI. Relative motion between the robot and the visual pattern forced by movement of the turntable mimicked self-motion of the animal resulting in horizontal pattern shifts. High-speed cameras mounted on the robot captured the visual image shifts at 200 fps and presented them on the CRT displays.

## **ELECTROPHYSIOLOGY RECORDINGS**

Experiments were carried out on 2–3 day old female blowflies, *Calliphora vicina*. Each animal was immobilized and its symmetrical deep pseudo-pupil (Franceschini, 1975) was used to align the head with respect to the CRT displays. Two small holes were cut in the right and left part of the animal's rear head capsule for placement of the recording and ground electrodes, respectively.

**FIGURE 1 | The fly-robot interface (FRI). (A)** A fly was placed in front of a visual display consisting of two high-speed CRT displays. Input to the two monitors was provided by two high-speed video cameras mounted on a mobile robot. The robot was positioned on a turntable placed inside a cylindrical arena lined with a vertically oriented grating pattern. Robot and turntable movements were limited to rotations around the vertical axis. Visual motion as a result of the rotation of the turntable was captured by the cameras. Electrophysiology recordings from the H1-cell were used to control the rotation of the robot. **(B)** Block diagram of the closed-loop FRI. Relative motion between the turntable and the robot, ω<sub>p</sub> − ω<sub>r</sub>, caused spiking in the H1-cell. The responses of the H1-cell (instantaneous spike rate *F*) were used by a controller to compensate for externally generated turntable movements by driving the robot in the opposite direction. **(C)** The *F2E* converter maps *F* onto the control input *E* using piece-wise sigmoid functions; *E* was then used to update the robot speed *V*<sub>r</sub> (modified from Ejaz et al., 2012).

Tungsten electrodes were used to record the extracellular spike rate of the left H1-cell's telodendritic output arborisation (Krapp et al., 2001) in the right lobula plate. An amplitude threshold was used to digitize the spike times at a resolution of 0.1 ms. The digitized spikes were convolved with a causal half-Gaussian kernel (σ<sub>fr</sub> = 0.05 ms) to obtain an estimate of the instantaneous spike rate, *F*. The instantaneous spike rate, *F*, is considered to reflect the visual motion (ω<sub>p</sub> − ω<sub>r</sub>) under closed-loop conditions and was used as an input for the two closed-loop controllers described in the section below. A video protocol of the fly preparation can be found in Ejaz et al. (2011a).
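The rate estimate described above can be sketched in a few lines. This is a minimal illustration, not the authors' acquisition code: the function names are hypothetical, the kernel is truncated at four standard deviations (an assumption), and the bin width and kernel width in the usage lines are placeholder values.

```python
import math

def half_gaussian_kernel(sigma_s, dt_s, n_sigmas=4.0):
    """Causal half-Gaussian: weights for the spike bin and later bins only."""
    n = int(round(n_sigmas * sigma_s / dt_s)) + 1
    weights = [math.exp(-0.5 * (k * dt_s / sigma_s) ** 2) for k in range(n)]
    area = sum(weights) * dt_s          # normalise so the kernel integrates to 1
    return [w / area for w in weights]

def instantaneous_rate(spike_train, kernel):
    """Convolve a binary spike train (one entry per time bin) with the kernel.

    Each spike contributes only to its own bin and to later bins, so the
    estimate at any time depends on present and past spikes alone.
    """
    rate = [0.0] * len(spike_train)
    for i, spike in enumerate(spike_train):
        if spike:
            for k, w in enumerate(kernel):
                if i + k < len(rate):
                    rate[i + k] += w
    return rate

# Placeholder values: 0.1 ms bins, 5 ms kernel width, one spike at bin 10.
DT = 0.0001
kernel = half_gaussian_kernel(sigma_s=0.005, dt_s=DT)
train = [0] * 500
train[10] = 1
rate = instantaneous_rate(train, kernel)   # in spikes per second
```

Causality matters in a closed-loop setting: the controller can only act on spikes that have already occurred, so a symmetric smoothing kernel would not be usable online.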

#### **CLOSED-LOOP CONTROLLERS**

For the controller, a non-linear transformation (*F2E* converter) and a feedback gain (*K*<sub>p</sub>) were applied to the instantaneous spike rate, *F*, in order to obtain an 8-bit value (*V*<sub>r</sub>), which was used to modulate the robot's angular velocity (ω<sub>r</sub>; **Figure 1B**). *F* depends on pattern motion determined by the difference between the turntable and the robot angular velocities. The *F2E* converter converts *F* into the control input *E*, based on piece-wise sigmoid functions for 0 ≤ *F* ≤ *F*<sub>s</sub> and for *F*<sub>s</sub> ≤ *F* ≤ *F*<sub>max</sub>, where *F*<sub>s</sub> and *F*<sub>max</sub> represent the spontaneous and maximum spike rates, respectively (**Figure 1C**). ±*E*<sub>max</sub> represents the upper and lower 8-bit values, over which the robot speed is modulated. Using this controller, the robot speed is updated by:

$$V\_{\mathbf{r}}(t+1) = K\_{\mathbf{p}} \cdot E + V\_{\mathbf{r}}(t). \tag{1}$$

Prior to the closed-loop experiments, both *F*<sub>s</sub> (mean ± SE: 19.67 ± 2.3) and *F*<sub>max</sub> (mean ± SE: 78 ± 4.27) were determined in open-loop conditions for each fly, using three trials of 5 s stimulation without and with image motion in the preferred direction (PD) of the H1-cell, respectively.
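A minimal sketch of the converter and of the update rule in Eq. 1 is given below. The text specifies only that the mapping is piece-wise sigmoid on either side of the spontaneous rate, so the particular logistic shape, its steepness, the sign convention, the value of 127 for *E*<sub>max</sub> and the clamping of the speed are all assumptions made for illustration. The spike rates in the usage lines are the mean values reported above.

```python
import math

def _unit_sigmoid(x, steepness=6.0):
    """Logistic curve rescaled to run from 0 at x = 0 to 1 at x = 1 (assumed shape)."""
    def logistic(u):
        return 1.0 / (1.0 + math.exp(-steepness * (u - 0.5)))
    low, high = logistic(0.0), logistic(1.0)
    return (logistic(x) - low) / (high - low)

def f2e(rate, f_spont, f_max, e_max=127.0):
    """Map spike rate F onto control input E with one sigmoid branch per side.

    E is 0 at the spontaneous rate, +e_max at f_max and -e_max at zero rate.
    """
    if rate >= f_spont:
        fraction = (rate - f_spont) / (f_max - f_spont)
        sign = 1.0
    else:
        fraction = (f_spont - rate) / f_spont
        sign = -1.0
    fraction = min(max(fraction, 0.0), 1.0)    # saturate outside the fitted range
    return sign * e_max * _unit_sigmoid(fraction)

def update_speed(v_r, rate, k_p, f_spont, f_max, e_max=127.0):
    """Eq. 1: V_r(t+1) = K_p * E + V_r(t), clamped to +/- e_max (assumption)."""
    v_next = v_r + k_p * f2e(rate, f_spont, f_max, e_max)
    return max(-e_max, min(e_max, v_next))

F_SPONT, F_MAX = 19.67, 78.0
v = 0.0
for rate in (19.67, 40.0, 78.0):
    v = update_speed(v, rate, k_p=0.1, f_spont=F_SPONT, f_max=F_MAX)
    print(f"F = {rate:5.2f} -> V_r = {v:6.2f}")
```

Because the update adds *K*<sub>p</sub>·*E* to the previous speed, the robot speed stays constant whenever the cell fires at its spontaneous rate and changes fastest when the rate saturates.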

As shown below in more detail, we used two different controllers to close the loop between the visual motion (ω<sub>p</sub> − ω<sub>r</sub>) observed by the fly and the instantaneous spiking rate, *F*, of its H1-cell. The first one is a static gain controller, which consisted of a fixed feedback gain *K*<sub>p</sub> and an *F2E* converter with constant *F*<sub>max</sub> (Ejaz et al., 2011a,b). This controller belongs to the class of linear feedback controllers in which the control effort is proportional to the error being controlled for. In our case, the updated robot speed is proportional to the visual motion error (Eq. 1) under closed-loop conditions. Note that an equivalent proportional controller was previously used by Warzecha and Egelhaaf (1996) to generate a feedback signal based on the differential activity of two H1-cells under closed-loop conditions. In the second controller, the condition of a fixed feedback gain was relaxed in order to obtain an adaptive gain controller. To achieve an adaptive feedback gain, the maximum spike rate, *F*<sub>max</sub>, is updated every 50 ms over a historical time window of length Δ*T*<sub>ws</sub>. Continuously updating *F*<sub>max</sub> scaled the sigmoid mapping between *F* and *E* during motion in the PD (*F*<sub>s</sub> ≤ *F* ≤ *F*<sub>max</sub>; **Figure 1C**) of the H1-cell, where the updated value for the robot speed was calculated with *K*<sub>p</sub> = 1 in Eq. 1. This scaling method provided the basis for the adaptive feedback gain, and was motivated by a neural coding strategy proposed by Laughlin (1994).

Once a value for the updated robot speed is estimated using either controller, it is transmitted to the robot via Bluetooth. As a result, the robot speeds up or slows down in order to correct for the visual motion error.
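The adaptive rescaling can be sketched as a small tracker for the maximum spike rate. How the maximum is estimated over the history window is not spelled out in the text, so taking the largest rate sample within the window is an assumption, as are the class name, the sampling interval and the optional lower bound.

```python
from collections import deque

class AdaptiveFmax:
    """Track F_max over a sliding history window, refreshed at fixed intervals."""

    def __init__(self, window_s, dt_s, f_max_initial, update_every_s=0.05, floor=0.0):
        self.samples = deque(maxlen=max(1, round(window_s / dt_s)))
        self.update_every = max(1, round(update_every_s / dt_s))
        self.f_max = f_max_initial
        self.floor = floor              # keeps F_max from collapsing (assumption)
        self._since_update = 0

    def step(self, rate):
        """Feed one spike-rate sample; return the F_max currently in force."""
        self.samples.append(rate)       # the deque drops samples older than the window
        self._since_update += 1
        if self._since_update >= self.update_every:
            self.f_max = max(max(self.samples), self.floor)
            self._since_update = 0
        return self.f_max

# Rate sampled every 10 ms, 100 ms history window, update every 50 ms.
tracker = AdaptiveFmax(window_s=0.1, dt_s=0.01, f_max_initial=78.0)
history = [tracker.step(40.0) for _ in range(5)]
```

In a complete loop the returned value would replace the constant *F*<sub>max</sub> in the *F2E* mapping, so that the sigmoid is stretched or compressed to the range of rates seen recently; setting `floor` slightly above the spontaneous rate would keep that mapping well defined.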

### **CLOSED-LOOP EXPERIMENTS**

We carried out two closed-loop experiments using the setup described above.

#### *Constant input with static gain controller*

In order to determine the input/output relationship for the H1 cell, we applied a constant angular velocity for 12.5 s set to

$$\omega\_{\mathrm{p}} = \begin{cases} 0^{\circ}/\text{s} & \text{for} \quad 0 \le t < 2.5\,\text{s}, \\ 144^{\circ}/\text{s} & \text{for} \quad 2.5\,\text{s} \le t \le 15\,\text{s}, \end{cases} \tag{2}$$

and used the static gain controller to close the loop.

We measured the cell's spike rates (*F*) and the image velocities (ω<sup>p</sup> − ωr) for five flies with four different values of *Kp*, in a total of 111 trials (12 trials for *Kp* = 0.01, 24 trials for *Kp* = 0.1, 15 trials for *Kp* = 0.5, and 60 trials for *Kp* = 1.0) and discretized them at a rate of 100 Hz.

**Figure 2A** shows the spike rate of the H1-cell plotted against the image velocity. A sigmoid function was fitted (least-squares fit) to the data shown in the plot:

$$F = \frac{A}{1 + e^{-\beta(\omega\_{\mathrm{p}} - \omega\_{\mathrm{r}})}},\tag{3}$$

where *A* is the upper asymptote, which captures the peak spike rate, and β is the growth parameter, which determines the slope of the function. We use the fitting parameters *A* and β to evaluate the effect of different image velocities on the input–output relationship of the H1-cell. Larger values of *A* correspond to larger peak spike rates generated by the cell for the image velocities encountered over the trial. Smaller and larger values of β correspond to shallower and steeper slopes, respectively, of the function converting image velocity into spike rate.
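A least-squares fit of Eq. 3 can be sketched with a plain grid search. The fitting routine used in the study is not specified, so the grid search, the grid ranges and the synthetic, noise-free data below are purely illustrative; the generating values *A* = 2.68 and β = 0.29 are those quoted in the legend of Figure 2.

```python
import math

def sigmoid(image_velocity, a, beta):
    """Eq. 3: F = A / (1 + exp(-beta * (omega_p - omega_r)))."""
    return a / (1.0 + math.exp(-beta * image_velocity))

def fit_sigmoid(velocities, rates, a_grid, beta_grid):
    """Return the (A, beta) pair on the grid with the smallest squared error."""
    best = None
    for a in a_grid:
        for beta in beta_grid:
            sse = sum((sigmoid(v, a, beta) - f) ** 2
                      for v, f in zip(velocities, rates))
            if best is None or sse < best[0]:
                best = (sse, a, beta)
    return best[1], best[2]

# Synthetic example: noise-free normalised rates generated from Eq. 3.
velocities = list(range(-30, 31, 2))                       # image velocity samples
rates = [sigmoid(v, 2.68, 0.29) for v in velocities]
a_grid = [round(0.5 + 0.02 * i, 2) for i in range(150)]    # 0.50 ... 3.48
beta_grid = [round(0.01 * i, 2) for i in range(1, 61)]     # 0.01 ... 0.60
a_hat, beta_hat = fit_sigmoid(velocities, rates, a_grid, beta_grid)
```

For real, noisy data a gradient-based optimiser would normally replace the grid search; the sketch only makes explicit which quantity is minimised and how *A* and β enter it.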

#### *Sinusoidal input with adaptive gain controller*

In order to determine the frequency response of the H1-cell, we applied sinusoidal angular velocities ω<sub>p</sub> = 72[sin(2π*f*<sub>i</sub>*t*) + 1] to the closed-loop system, where the input frequency *f*<sub>i</sub> covered a range of 0.03 ≤ *f*<sub>i</sub> ≤ 1.0 Hz, and updated the adaptive gain controller based on estimation time windows, Δ*T*<sub>ws</sub> = [0.05, 0.10, 0.15] (*N* = 5 flies), to close the loop. In previous work, we showed that the adaptive gain controller has a higher cut-off frequency as compared to the static gain controller with the corresponding frequency response gains for the two controllers being approximately equal (Ejaz et al., 2012). The adaptive gain controller was therefore chosen primarily because it allowed us to obtain H1-cell responses over a wider range of frequencies as compared to the static gain controller.

At each input frequency, *f*<sub>i</sub>, the amplitude (power spectral densities) and phase for the H1 input (ω<sub>p</sub> − ω<sub>r</sub>), *G*<sub>i</sub> and *P*<sub>i</sub>, and those for the H1 output (spike rate *F*), *G*<sub>o</sub> and *P*<sub>o</sub>, were calculated using a periodogram. Sequences were pre-multiplied with a Hamming window equal to the length of the sequence. The obtained gain *G*<sub>o</sub>/*G*<sub>i</sub> and phase (*P*<sub>o</sub> − *P*<sub>i</sub>) are shown in **Figures 4C,D**, respectively.
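The gain and phase computation at a single input frequency can be sketched as follows. This is an assumption-laden illustration rather than the analysis code used in the study: the spectral component is evaluated directly at the stimulus frequency, the mean is removed before windowing, and the gain is taken as the ratio of the two spectral power values. The test signals are synthetic.

```python
import cmath
import math

def hamming(n):
    """Hamming window spanning the whole sequence."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def spectral_component(signal, freq_hz, sample_rate_hz):
    """Windowed Fourier component of the mean-removed signal at one frequency."""
    n = len(signal)
    mean = sum(signal) / n
    window = hamming(n)
    return sum(w * (x - mean) * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
               for k, (w, x) in enumerate(zip(window, signal)))

def gain_and_phase(inp, out, freq_hz, sample_rate_hz):
    """Power ratio G_o/G_i and phase difference P_o - P_i (radians)."""
    c_in = spectral_component(inp, freq_hz, sample_rate_hz)
    c_out = spectral_component(out, freq_hz, sample_rate_hz)
    gain = abs(c_out) ** 2 / abs(c_in) ** 2
    phase = cmath.phase(c_out / c_in)
    return gain, phase

# Synthetic check: output has half the amplitude and lags by 45 degrees.
FS, F_IN = 100.0, 1.0
inp = [math.sin(2 * math.pi * F_IN * k / FS) for k in range(1000)]
out = [3.0 + 0.5 * math.sin(2 * math.pi * F_IN * k / FS - math.pi / 4)
       for k in range(1000)]
gain, phase = gain_and_phase(inp, out, F_IN, FS)
```

Since the gain here is a ratio of powers, halving the amplitude gives a gain of about 0.25; taking the square root would express the same result as an amplitude ratio.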

**FIGURE 2 | (A)** Experimental measurement showing the input–output relationship for the H1-cell under closed-loop conditions (blue) and its least-squares sigmoid fit (*A* = 2.68, β = 0.29) (red), obtained from the fly-robot response with the static gain controller (*Kp* = 0.1) and sinusoidal input (*fi* = 0.03 Hz). For further explanation see text. **(B)** Distribution of the image velocities observed in closed-loop is approximately Gaussian.

## **RESULTS**

We performed two experiments (described in Materials and Methods) in order to determine whether the responses of the H1-cell were different under open- and closed-loop conditions.

#### **EFFECTS OF THE MOMENTS OF THE IMAGE VELOCITY DISTRIBUTION ON THE H1-CELL INPUT–OUTPUT RELATIONSHIP UNDER OPEN- AND CLOSED-LOOP CONDITIONS**

The input–output relationship of the H1-cell, i.e., the relationship between the image velocity (input) and the spike rate (output), was obtained for different gains, *Kp*, of the static gain controller using a constant angular velocity (**Figure 2A**). Here, the spike rate was normalized by its mean over each trial. The obtained input–output relationship for the H1-cell can be approximated by a sigmoid function, as was suggested by Brenner et al. (2000) for the open-loop experiments, although the variability of the H1-cell response turned out to be much larger under our closed-loop conditions. The highly variable responses are possibly due to the non-stationarity of the image velocity distributions during closed-loop experiments. The image velocities previously used under open-loop conditions by Brenner et al. (2000) and Fairhall et al. (2001) were generated from a normal distribution with zero mean and fixed standard deviation for the duration of each trial, over which the spike rate was measured. In our closed-loop experiments, however, the image velocities observed by the fly depended on the performance of the FRI in minimizing the retinal slip speeds (ω<sub>p</sub> − ω<sub>r</sub>). During the course of a closed-loop trial, the performance of the FRI typically varied between perfect image stabilization and short periods of high image velocities. Therefore, while the overall image velocities observed by the fly during a trial are normally distributed (**Figure 2B**), the standard deviation of the image velocities, when calculated over a shorter time interval, changes constantly during a trial, resulting in a highly variable input–output relationship of the H1-cell (**Figure 2A**).

To characterize the H1-cell response under closed-loop conditions, we initially measured the first (mean μ<sub>v</sub>) and second (standard deviation σ<sub>v</sub>) moments of the input, i.e., the image velocity observed by the fly (**Figure 3A**). Increasing the feedback gain *Kp* monotonically increases both μ<sub>v</sub> and σ<sub>v</sub>. The increase in σ<sub>v</sub> can be explained by control oscillations, which are particularly pronounced for high feedback gains (Ejaz et al., 2011a). Such control oscillations are not specific to the FRI, but have also been observed as yaw torque fluctuations during closed-loop optomotor tasks in *Drosophila* (Mayer, 1989; Wolf and Heisenberg, 1990; Warzecha and Egelhaaf, 1996). The increase in μ<sub>v</sub> can be explained by the fact that we use a single H1-cell for closed-loop control. Ideally, a fly would attempt to maintain optomotor equilibrium by balancing clockwise and counter-clockwise rotations so that the observed image motion is minimized. The two H1-cells would contribute sensitivity to motion in opposing directions, based on which the optomotor equilibrium is maintained. However, when a single H1-cell is used for closed-loop control, the optomotor equilibrium is unbalanced, which in turn leads to an increased value of μ<sub>v</sub>. An unbalanced optomotor equilibrium does not, however, seem to have drastic behavioral consequences for the fly. In a behavioral study, Kern and Egelhaaf (2000) occluded one eye in *Lucilia* and measured the turning responses in both freely flying and walking flies inside a visual arena. The authors concluded that it was hard to tell from the turning responses that the fly had been limited to the use of monocular vision and that, while the flies exhibited a slight turning preference toward the stimulated eye (i.e., increased μ<sub>v</sub>), no such asymmetry could be observed in individual responses. As a result, while increasing μ<sub>v</sub> decreases the overall performance of image stabilization under closed-loop conditions, it does not affect the conclusions that can be drawn regarding the response properties of the H1-cell.

After characterizing the input to the H1-cell by its standard deviation and mean, we investigated the effect of σ<sub>v</sub> on the cell's response properties. Its input–output relationship has previously been shown in open-loop measurements to adapt to the standard deviation, σ<sub>v</sub>, of the input image velocity distribution (Brenner et al., 2000; Fairhall et al., 2001). Large values of σ<sub>v</sub> cause the input–output function to expand along the *x*-axis (image velocity), leading to a shallower slope, i.e., a smaller value of β, for the response function. In comparison, small values of σ<sub>v</sub> cause compression along the velocity axis, resulting in a steeper slope for the response function and consequently a higher value of β. Our experiments show that the response function also scales in proportion to the standard deviation of the image velocity under closed-loop conditions (**Figures 3B,C**). As reported by Brenner et al. (2000) for open-loop conditions, normalizing the input–output relationship by the standard deviation removes differences in the cell's adaptation properties under closed-loop conditions as well (**Figure 3C**).

Note, however, that this normalization does not change the peak spike rate, *A*, in our closed-loop experiments (**Figure 3C**), suggesting that the peak spike rate of the H1-cell may be controlled by another moment of the image velocity distribution, possibly its mean. Further open-loop experiments on the H1-cell have shown that increasing either the mean (Reisenman et al., 2003) or the standard deviation (Borst et al., 2005) of the image velocity results in a decrease of its peak spike rate. In our closed-loop experiments, an approximately 2-fold increase in the standard deviation (from σ<sub>v</sub> = 16.0 deg/s to σ<sub>v</sub> = 28.5 deg/s) results in a spike rate reduction of approximately 18% (**Figure 3C**). Such a decrease is larger than would be predicted for an increased standard deviation under open-loop conditions (Borst et al., 2005). This suggests that the peak spike rate of the H1-cell under closed-loop conditions depends on both the mean and the standard deviation of the image velocity.

The effects of the moments of the input distribution on the spike rate of the H1-cell for the open- and closed-loop conditions are highly similar in that (i) the H1-cell response is adjusted to the standard deviation of the image velocity and (ii) the H1-cell decreases its peak spike rate when mean and standard deviation of the image velocity distribution are increased.
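The adaptive rescaling in point (i) can be illustrated with a minimal sketch. It assumes a logistic form for the sigmoid and a slope β that is inversely proportional to σ<sub>v</sub>; the function name `h1_response` and the constant `k` are hypothetical and are not taken from the fits reported here. Under these assumptions, responses evaluated at the same normalized velocity v/σ<sub>v</sub> coincide for different distribution widths.

```python
import math

def h1_response(v, sigma, peak=1.0, k=4.0):
    """Hypothetical logistic input-output function.

    The slope is set to beta = k / sigma, i.e., the curve stretches along
    the velocity axis in proportion to the standard deviation sigma.
    """
    beta = k / sigma
    return peak / (1.0 + math.exp(-beta * v))

# Standard deviations (deg/s) quoted for the closed-loop comparison.
sigmas = [16.0, 28.5]

for z in (-1.0, 0.0, 0.5, 1.0):
    # Evaluate each curve at the same normalized velocity z = v / sigma.
    rates = [h1_response(z * s, s) for s in sigmas]
    print(z, [round(r, 4) for r in rates])
```

The peak rate is held fixed in this sketch, so it does not capture point (ii), the decrease of the peak spike rate with increasing mean and standard deviation.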

#### **EFFECTS OF INCREASING IMAGE ACCELERATIONS ON THE GAIN OF THE H1-CELL RESPONSE UNDER OPEN- AND CLOSED-LOOP CONDITIONS**

In the second experiment, we introduced sinusoidal angular velocity perturbations into the closed-loop system, while varying the length of the spike rate estimation time window Δ*T*<sub>ws</sub> for the adaptive gain controller. For each trial, the resulting closed-loop image velocities (**Figure 4A**) and H1-cell spike rates were recorded to investigate the cell's frequency response.

The gain of the H1-cell decreased linearly with increasing input frequencies, with a gradient of approximately 8–9 dB/dec (**Figure 4B**). The linear decrease in gain with frequency did not, by and large, depend on the time window Δ*T*<sub>ws</sub> for the adaptive controller. The corresponding phase (**Figure 4C**) of the H1-cell decreased from approximately 35° at a frequency of *fi* = 0.03 Hz to around 0° for *fi* > 0.03 Hz.

We are tempted to argue here that, under closed-loop conditions, the frequency-dependent decrease of the H1-cell response gain (**Figure 4B**) is related to the increase in image acceleration. For the sinusoidal image velocity perturbations we used in the second experiment, an increase in the input frequency, *fi*, leads to an increase in the image acceleration. Therefore, the gain plot of the H1-cell (**Figure 4B**) represents the relationship between the cell's spike rate and the image accelerations under closed-loop conditions and suggests that the response gain of the cell decreases for increasing image acceleration. Such an approximately linear decrease in response gain of the cell with increasing accelerations was also observed under open-loop conditions (Borst et al., 2005).
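The link between input frequency and image acceleration follows from differentiating a sinusoid: for a velocity v(t) = A sin(2π*f*t), the peak acceleration is 2π*f*A. The sketch below checks this numerically. The amplitude of 30 deg/s is an arbitrary illustrative value, not one reported in this study, and the function names are hypothetical.

```python
import math

def peak_acceleration(amplitude, freq_hz):
    """Analytic peak of d/dt [A sin(2*pi*f*t)], in units of amplitude per second."""
    return 2.0 * math.pi * freq_hz * amplitude

def numeric_peak_acceleration(amplitude, freq_hz, steps=20000):
    """Peak acceleration from central finite differences over one period."""
    period = 1.0 / freq_hz
    dt = period / steps

    def velocity(t):
        return amplitude * math.sin(2.0 * math.pi * freq_hz * t)

    return max(
        abs(velocity((i + 1) * dt) - velocity((i - 1) * dt)) / (2.0 * dt)
        for i in range(steps)
    )

amplitude = 30.0  # deg/s, illustrative assumption
for freq in (0.03, 0.3, 1.0):  # Hz, input frequencies mentioned in the text
    print(freq, peak_acceleration(amplitude, freq),
          numeric_peak_acceleration(amplitude, freq))
```

At a fixed amplitude, a tenfold increase in frequency therefore gives a tenfold increase in peak acceleration. In the closed-loop experiment the amplitude of the observed image velocity also changes with frequency, so this proportionality is only a first-order guide.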

The effects of increasing the frequency on the moments of the image velocity distribution are shown in **Figures 4D,E**. As the frequency increased, σ<sub>v</sub> increased from around 28 to 46°/s (**Figure 4D**) and the peak spike rate decreased (**Figure 4E**). Actually, the decrease in the peak spike rate is largely the result of the increase in σ<sub>v</sub>, as the corresponding mean (μ<sub>v</sub>) of the image velocities is very close to 0°/s for the frequency range we examined (**Figure 4D**). It should be noted that increasing σ<sub>v</sub> is equivalent to increasing the image velocity amplitudes and therefore produces higher image accelerations, which in turn decreases the response gain of the cell.

## **DISCUSSION**

## **THE FRI AS A CLOSED-LOOP EXPERIMENTAL SYSTEM**

The use of a robotic controller to understand animal behavior provides real-world physical interactions typically missing from modeling studies where a low-pass filter is used to describe the dynamics of the fly flight motor system. As argued by Webb (2006), this lack of physical interaction would mean that complex motion dynamics such as slipping due to friction cannot be accounted for in the computer model. Indeed, recent work by Dickson et al. (2010) showed that both body-inertia and -damping play a significant role in the dynamics of saccadic yaw turns in *Drosophila* flight. While the configuration of the fly in such a closed-loop experimental setup is far removed from conditions during natural flight, the stimulus velocity distributions observed by the fly in the FRI are within range of those used in previous measurements of the H1-cell under open-loop conditions (Warzecha and Egelhaaf, 1996; Brenner et al., 2000; Borst, 2003; Borst et al., 2005).

While H1-cell responses have been studied extensively under open-loop conditions (e.g., Maddess and Laughlin, 1985; Brenner et al., 2000; Borst, 2003; Reisenman et al., 2003), this paper presents the first study of the cell's response properties for a variety of image velocity profiles under closed-loop conditions. Our FRI was used to generate dynamic visual stimuli, i.e., sinusoidal and constant image velocity perturbations to drive the responses of the H1-cell.

In a pioneering study, Warzecha and Egelhaaf (1996) obtained electrophysiology recordings from both the ipsi- and the contralateral H1 LPTCs while the fly compensated for externally imposed visual motion under closed-loop conditions. In that study, however, comparatively small and constant image velocities (18) were used and thus there was little or no modulation of the image accelerations presented to the H1-cells. As a result, Warzecha and Egelhaaf (1996) characterized the cell's closed-loop responses only for a rather narrow velocity profile, compared to the dynamic visual stimuli generated by our FRI. While they found the responses of the cell to decrease as the image velocity increased, which is in agreement with our findings, they did not observe the dependence of the H1-cell responses on image accelerations we report here.

A limitation of our experiments was that only the activity of one H1-cell had been considered for closed-loop control. During walking and free-flight, a fly receives information about its yaw rotation from both the ipsi- and the contra-lateral H1-cells. Using only the activity of a single cell for visual stabilization reduced the fly's sensitivity to yaw rotations. Given that the peak spike rate of the H1-cell has been found to decrease strongly with an increase of the mean image velocity, in both open- (Reisenman et al., 2003) and closed-loop (**Figure 3B**) measurements, one key function of two H1-cells could be to keep the fly in optomotor equilibrium by trying to minimize the mean image velocity. Such a strategy of minimizing the mean image velocity would remove any restrictions on the peak spike rate of the cell. This in turn would be advantageous as the fly would remain sensitive to differences in image velocities as opposed to absolute values, which appears to be a general feature of biological sensing (Taylor and Krapp, 2007).

Our results with the FRI show that the open- and closed-loop responses are qualitatively similar, in the sense that the H1-cell maximizes the information transmitted about the image velocity distribution by adapting its input–output relationship (Brenner et al., 2000; **Figures 3B,C**). We found in addition that higher image acceleration, as a result of increased standard deviation of the image velocity distribution, decreases the gain of the H1-cell. It is important to note that this dependence is not the result of the dynamic properties of the robot or the turntable. This is because the decrease in the H1-cell response gain is too large (8–9 dB), even for small changes in acceleration (between 0.03 and 0.3 Hz), to be explained by the frequency response of either the robot or the turntable (Ejaz et al., 2011b, 2012).

In the following, we discuss our findings in more detail, with an emphasis on the coding of visual motion information, optomotor control, and the translation of closed-loop results into biomimetic applications.

**FIGURE 4 | H1-cell frequency response. (A)** An example time course of the image velocity observed by the fly for an input frequency *f*<sub>i</sub> = 0.3 Hz. The gain **(B)** and phase **(C)** plots over different input frequencies. **(D)** The mean (μ<sub>v</sub>) and standard deviation (σ<sub>v</sub>) of the image velocities. \*Values of σ<sub>v</sub> for *f*<sub>i</sub> = 0.3 Hz and *f*<sub>i</sub> = 1.0 Hz are significantly different (calculated using Wilcoxon rank-sum method with α = 0.001). **(E)** Peak spike rate (\*) for different input frequencies.

#### **THE DECREASE IN THE RESPONSE GAIN FOR INCREASING IMAGE ACCELERATION IS A DIRECT CONSEQUENCE OF THE ADAPTIVE SCALING PROPERTY OF H1**

A key finding of the closed-loop experiments we report here is that the decrease in sensitivity to image accelerations is a direct result of the H1-cell's adaptive scaling property. To the best of our knowledge, this relationship has not been explicitly highlighted previously in open-loop measurements and is discussed below.

**Figure 5** shows how the H1-cell decreases its sensitivity to acceleration by scaling its response range to fit that of a wider image velocity distribution. Increasing the standard deviation of the image velocities results in a decrease in the gradient, β, of the cell's input–output function (**Figures 5A,B**). This decrease in β directly results in a linear decrease in the peak acceleration sensitivity of the cell (**Figures 5C,D**). The adaptive re-scaling of the H1-cell responses, which maximizes information transfer, apparently comes at the expense of a reduced sensitivity to image acceleration. It is tempting to speculate that the trade-off between maximizing information transmission related to the input image velocity and the reduced sensitivity to acceleration might reflect a more general strategy preferred during the evolution of sensory systems. While in the visual system a decreased sensitivity to acceleration might be partly compensated for by signals from other sensory modalities (e.g., the halteres), a decrease in information transmission would be detrimental for the neural representation of visual motion. Given that neurons are required to process information under very strict energy constraints (Laughlin et al., 1998; Laughlin, 2001), inefficiencies in neural coding might come at a high evolutionary cost. In addition, inefficient coding at the sensory system level will most certainly propagate downstream to produce inadequate motor outputs. Altogether, a loss of acceleration sensitivity as a result of adaptive re-scaling might be a comparatively small cost to pay.
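The linear dependence of the peak gradient on β can be made concrete under one assumption: if the sigmoid is taken to be a logistic function *A* / (1 + exp(−βv)), its gradient at v = 0 is *A*β/4. The sketch below verifies this numerically. The logistic form, the function names, and the β values are illustrative assumptions and are not the fitted values of this study.

```python
import math

def sigmoid_rate(v, beta, peak=1.0):
    """Hypothetical logistic input-output function with slope parameter beta."""
    return peak / (1.0 + math.exp(-beta * v))

def slope_at_zero(beta, peak=1.0, dv=1e-6):
    """Gradient of the input-output function at v = 0 by central differences."""
    return (sigmoid_rate(dv, beta, peak) - sigmoid_rate(-dv, beta, peak)) / (2.0 * dv)

for beta in (0.05, 0.1, 0.2):  # illustrative slope parameters
    # For a logistic function the analytic gradient at zero is peak * beta / 4.
    print(beta, slope_at_zero(beta), beta / 4.0)
```

Halving β halves the steepest gradient of the curve, which is the sense in which a wider velocity distribution lowers the sensitivity to changes in image velocity.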

**FIGURE 5 | Effect of σ<sub>v</sub> on the acceleration sensitivity of the H1-cell. (A)** The fitted values of β for the proportional controller with gain *K*p are plotted against the standard deviation of the velocity distribution, σ<sub>v</sub>. Increasing σ<sub>v</sub> linearly decreases β as per the relationship specified by the regression line. **(B)** The input–output functions for three different values of β (normalized by the peak spike rate, *F*) show that **(C)** decreasing β decreases the gradient at the point σ<sub>v</sub> = 0 deg/s and **(D)** that this decrease is linear. This reduction in gradient of the H1-cell input–output function represents a decrease in sensitivity to changes in the image velocity, i.e., a decrease in sensitivity to image acceleration. The results show that increasing σ<sub>v</sub> directly decreases the sensitivity of the H1-cell to image accelerations.

The dependence of the H1-cell frequency response on image acceleration is also found to be qualitatively similar for open- and closed-loop studies. In earlier work, Brenner et al. (2000) showed that altering the image velocity or acceleration resulted in a modulation of the responses of the H1-cell under open-loop conditions. In subsequent open-loop studies, the gain of the H1-cell was proposed to depend on acceleration and other higher-order time derivatives of image velocity (Borst, 2003; Borst et al., 2005). Specifically, the responses of the H1-cell decreased as a result of increasing image accelerations, which is also what we report under closed-loop conditions (**Figure 5**).

Our results also show that the acceleration sensitivity of the H1-cell is highest while there is little or no pattern motion (σ<sub>v</sub> = 0°/s, **Figure 5C**). This is clearly advantageous for the fly, as it enables the H1-cell to respond more quickly to image motion that rapidly changes from null direction to PD, as Lewen et al. (2001) proposed earlier.

While maintaining a high sensitivity to acceleration, i.e., a high value of β, might make sense intuitively, it comes with potential risks. An input–output response function with a high value of β means that very small changes in angular velocity result in large changes of spike rate. β therefore partly determines the forward gain in the motion vision pathway of the system. The potential risk, however, is that a high forward gain in combination with inherent noise in the system may easily drive the responses of downstream neural circuits into saturation. Additionally, if the feedback gain (on top of the forward gain of the H1-cell) is too high and control delays are too long, then the feedback control system is in danger of becoming unstable. In this context, decreasing sensitivity to acceleration by having a lower response gain for increasing frequencies is possibly advantageous from a control theoretic point of view and this argument is discussed in the following section.

**FIGURE 6 | H1 and HSE receptive fields and horizontal network connections. (A)** Top row shows monocular and binocular receptive fields of the H1 and the HSE LPTCs, respectively. The insets show the dendritic arborization patterns of both cells in the left lobula plate as well as the HSN LPTC in the right lobula plate. The dendritic input arborizations and the telo-dendritic output arborizations of the H1-cell are connected via a thin axon that transmits visual motion information from the left to the right lobula plate using action potentials. The HSE and the HSN cells arborize in the equatorial and the north sections of the lobula plate, respectively (modified from Krapp et al., 2001). **(B)** The connectivity in the network of LPTCs sensitive to horizontal motion. Excitatory and inhibitory interactions are depicted with open triangles and filled circles, respectively. The HSE and HSN cells receive excitatory input from the contralateral H1 and H2 cells and project onto descending neurons which in turn supply the neck and flight motor systems of the fly (modified from Krapp et al., 2001).

## **SIMILARITY IN H1-CELL RESPONSES UNDER OPEN- AND CLOSED-LOOP CONDITIONS AND ITS IMPLICATIONS FOR OPTOMOTOR CONTROL**

As a hetero-lateral neuron, the H1 cell helps disambiguate between rotation- and translation-induced optic flow as it is completely inhibited during forward translation but excited during yaw rotations. The cell also provides excitatory input to the HSE and HSN cells (**Figure 6**), two major output neurons of the visual system that respond to visual motion with a graded modulation of their membrane potential (Hausen, 1976, 1982). By connecting to the contralateral HSE and HSN cells, it makes the response of these output cells more specific to yaw rotation. Therefore, the response properties and connectivity of the H1 cell make it an important neuron in the optomotor pathway of the fly.

It is by no means trivial that the responses of the H1-cell to the moments of the image velocity distribution (mean, standard deviation, acceleration) are highly similar under open- and closed-loop conditions. This similarity may reflect the way in which the sensory-motor control loops are organized in the fly.

One particular model of the sensory-motor control loop in the fly proposed by Warzecha and Egelhaaf (1996) and Borst et al. (2005) does not require sensory feedback to explain the non-linear response properties of the H1-cell. In this model, the non-linear properties of the H1-cell, and the LPTCs in general, can be predicted solely based on the properties of the Reichardt (1987) elementary movement detector (EMD). Warzecha and Egelhaaf (1996) suggested that the reduced gain of the H1-cell at higher image velocities is the result of intrinsic response properties of EMDs. Similarly, Borst et al. (2005) showed that an EMD model could explain the dependence of the H1-cell responses on the standard deviation and the autocorrelation time constant of the image velocities. In both cases, no feedback signals were required to explain the non-linear response properties of the H1-cell, which were suggested to be based on the computational structure of the EMD model. This particular model of control architecture is closely linked to that proposed by Wehner (1987), who argued that the architecture and response properties of invertebrate sensory systems reflect a detailed model of the physical world. If the model is true, no feedback signals are necessary and H1-cell responses under closed-loop conditions are simply the result of EMD properties, readily observable under open-loop conditions.

An alternate control architecture involves forward models, in which a copy of a motor command (efference copy) is used to subtract those components of the sensory feedback that are due to the animal's own action (Chan et al., 1998; Wolpert and Ghahramani, 2000; reviews: Webb, 2004; Krapp and Wicklein, 2008). Forward models or efference copies have been proposed to explain the mechanism by which flies adjust their gain parameters when faced with unexpected visual feedback during an optomotor task inside a flight simulator (Kirschfeld, 1989). One possible explanation for the similarity of the H1-cell responses under open- and closed-loop conditions could be that in fully restrained animals an efference copy from the motor system may not be sent to the LPTCs. Our FRI did not allow us to assess the potential impact of an efference copy signal. Even though the fly was under closed-loop conditions, it was immobilized for the purpose of obtaining electrophysiology recordings, which makes it unlikely that the animal would have generated motor commands similar to those under free flight conditions. At this point, we can only speculate on which control architecture, i.e., with or without feedback control at the level of the LPTCs, best explains the similarity in H1-cell responses as measured under open- and closed-loop conditions.

Finally, in the context of optomotor control, the frequency response of the H1-cell (**Figures 4C** and **5D**) imposes certain limitations on the fly's ability to compensate for externally imposed yaw rotations. For the visual system to contribute to the stabilization of visual motion, the reduction of gain and cut-off frequency of the horizontal cells in the lobula plate must be higher than those of the flight muscles which produce compensatory torque. The response delay in the motion vision pathway (≈30 ms) for the fly is long compared to other sensory systems like the ocelli (≈15 ms) and the halteres (≈10 ms) (Taylor and Krapp, 2007). It would therefore make sense for the cell's response not to have a high gain at high frequencies, which would potentially result in instabilities of the control system, as initially proposed by Warzecha and Egelhaaf (1996). A low gain at high frequencies would be in agreement with the proposed primary function of LPTCs to mainly compensate for slow drifts (Collett et al., 1993). In comparison, the halteres and the ocelli, with their short response delays, would be better suited to control yaw rotations in the higher dynamic range. The importance of keeping delays to a minimum within the optomotor control loop, and in biological control loops in general (Dickson et al., 2010), is also evidenced by our finding that the response phase of the H1-cell stays close to zero over the tested frequency range.
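The role of delays in this argument can be illustrated with the standard result that a pure time delay τ adds a phase lag of 360°·*f*·τ at frequency *f*. The sketch below applies it to the three response delays quoted above. Treating each pathway as a pure delay is a simplifying assumption, and the evaluation frequencies are illustrative choices rather than values from this study.

```python
def delay_phase_lag_deg(delay_s, freq_hz):
    """Phase lag in degrees introduced by a pure time delay at a given frequency."""
    return 360.0 * freq_hz * delay_s

# Approximate response delays quoted in the text (Taylor and Krapp, 2007).
delays_ms = {"motion vision": 30.0, "ocelli": 15.0, "halteres": 10.0}

frequencies_hz = (1.0, 10.0, 50.0)  # illustrative frequencies

for name, delay_ms in delays_ms.items():
    lags = [delay_phase_lag_deg(delay_ms / 1000.0, f) for f in frequencies_hz]
    print(name, [round(lag, 1) for lag in lags])
```

Because the lag grows in proportion to both delay and frequency, the pathway with the longest delay reaches a given phase lag at the lowest frequency, which is consistent with the argument that a high gain at high frequencies would be least desirable in the motion vision pathway.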

The surprising qualitative similarity between closed- and open-loop data suggests that it is reasonable, as a first approximation, to base any implementations of fly-inspired (optomotor) control design on experimental open-loop data. This could potentially expedite the translation of biological design principles into technical applications, as methodologically more challenging closed-loop experiments may not always be required to conclusively characterize the dynamics of neuronal responses. It should be noted, however, that the present study focused only on a 1 DoF visual stabilization task. Neuronal closed- and open-loop activity supporting multisensory control of higher dimensional tasks may well show very different response dynamics, in particular if observed in freely or semi-freely moving animals.

## **ACKNOWLEDGMENTS**

We would like to thank K.D. Longden and K.D. Peterson for helping with the electrophysiology experiments, and K.D. Longden, M. Wicklein, and D.A. Schwyn for helpful discussions on the work in this manuscript. Research was supported by a Higher Education Commission Pakistan Fellowship to Naveed Ejaz, a US Airforce Research Labs grant [FA 8655-09-1-3022] to Holger G. Krapp, and an EPSRC Career Acceleration Fellowship to Reiko J. Tanaka.

## **REFERENCES**


(2009). *Flying Insects and Robots*. Berlin: Springer-Verlag.


optomotor responses. *Naturwis* 76, 378–380.


Barth, J. A. C. Humphrey, and M. V. Srinivasan (Berlin: Springer-Verlag), 19–39.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 November 2012; accepted: 07 March 2013; published online: 27 March 2013.*

*Citation: Ejaz N, Krapp HG and Tanaka RJ (2013) Closed-loop response properties of a visual interneuron involved in fly optomotor control. Front. Neural Circuits 7:50. doi: 10.3389/fncir.2013.00050*

*Copyright © 2013 Ejaz, Krapp and Tanaka. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## The iso-response method: measuring neuronal stimulus integration with closed-loop experiments

#### *Tim Gollisch<sup>1</sup>\* and Andreas V. M. Herz<sup>2</sup>*

*<sup>1</sup> Department of Ophthalmology and Bernstein Center for Computational Neuroscience Göttingen, University Medical Center Göttingen, Göttingen, Germany*

*<sup>2</sup> Department Biology II and Bernstein Center for Computational Neuroscience Munich, Ludwig-Maximilians-Universität München, Munich, Germany*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*C. J. Heckman, Northwestern University, USA Ronen Segev, Ben-Gurion University, Israel*

#### *\*Correspondence:*

*Tim Gollisch, Department of Ophthalmology and Bernstein Center for Computational Neuroscience Göttingen, University Medical Center Göttingen, Waldweg 33, 37073 Göttingen, Germany. e-mail: tim.gollisch@med.uni-goettingen.de*

Throughout the nervous system, neurons integrate high-dimensional input streams and transform them into an output of their own. This integration of incoming signals involves filtering processes and complex non-linear operations. The shapes of these filters and non-linearities determine the computational features of single neurons and their functional roles within larger networks. A detailed characterization of signal integration is thus a central ingredient to understanding information processing in neural circuits. Conventional methods for measuring single-neuron response properties, such as reverse correlation, however, are often limited by the implicit assumption that stimulus integration occurs in a linear fashion. Here, we review a conceptual and experimental alternative that is based on exploring the space of those sensory stimuli that result in the *same* neural output. As demonstrated by recent results in the auditory and visual system, such *iso-response stimuli* can be used to identify the non-linearities relevant for stimulus integration, disentangle consecutive neural processing steps, and determine their characteristics with unprecedented precision. Automated closed-loop experiments are crucial for this advance, allowing rapid search strategies for identifying iso-response stimuli during experiments. Prime targets for the method are feed-forward neural signaling chains in sensory systems, but the method has also been successfully applied to feedback systems. Depending on the specific question, "iso-response" may refer to a predefined firing rate, single-spike probability, first-spike latency, or other output measures. Examples from different studies show that substantial progress in understanding neural dynamics and coding can be achieved once rapid online data analysis and stimulus generation, adaptive sampling, and computational modeling are tightly integrated into experiments.

**Keywords: neural computation, sensory systems, stimulus integration, closed-loop experiment, isoresponse, neuron models**

## **INTRODUCTION**

Mapping high-dimensional input streams into low-dimensional output spike trains is a core operation of almost every neuron in the brain. No auditory neuron is sensitive to only one frequency of a time-varying sound signal, no visual neuron responds to only one wavelength in a light stimulus. Both types of neurons rather integrate inputs over a range of frequencies. Similarly, strong dimensional reduction also occurs when retinal ganglion cells integrate signals over space via tens to hundreds of bipolar cells with smaller receptive fields, when pyramidal cells combine input from 10,000 other cortical neurons, or when cerebellar Purkinje cells are innervated by 200,000 parallel fibers to cause well-orchestrated movement patterns. In all these cases, huge amounts of information are lost—and need to be lost, or rather discarded, so that those particular stimulus combinations can be distilled that are indeed important for behavior.

Extracting the specific rule of how a given neuron combines its inputs is a prerequisite for understanding its computational function. Consider, for example, the responses of auditory neurons to a sound pressure wave $s(t)$ with several frequency components, $s(t) = \sum_i s_i \cos(2\pi\nu_i t)$. A neuron whose firing rate $r$ is some function $f$ of the summed amplitudes, $r = f(\sum_i s_i)$, encodes the maximal sound amplitude whereas another neuron whose activity depends on the summed squares of these components, $r = g(\sum_i s_i^2)$, encodes sound energy. In both cases, it is a particular scalar quantity, $\sum_i s_i$ or $\sum_i s_i^2$, respectively, that matters for the neuron's firing rate, whereas the detailed composition of the vector $(s_1, s_2, s_3, \ldots)$ is irrelevant. Similarly, the shapes of the output non-linearities $f$ and $g$ are of no importance for the fact that the two neurons encode sound amplitude and energy, respectively, as long as the cells' firing thresholds, saturation levels, and input sensitivities are such that behaviorally important signal ranges can be encoded. Moreover, this simple example demonstrates that measuring a cell's input-output relation by changing the total input strength, as often done in electrophysiological experiments, will provide information about the output non-linearity, but will typically *not* reveal which computation is represented by the cell's activity.
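The distinction between the two hypothetical neurons can be made concrete with a short sketch. The output non-linearities $f$ and $g$ are arbitrarily set to `tanh` here, and the function names and stimulus values are illustrative assumptions. Stimuli with the same summed amplitude give identical responses in the first neuron, and stimuli with the same summed squares give identical responses in the second, regardless of how the components are distributed.

```python
import math

def amplitude_neuron(s, f=math.tanh):
    """Firing rate r = f(sum_i s_i); f is an arbitrary illustrative non-linearity."""
    return f(sum(s))

def energy_neuron(s, g=math.tanh):
    """Firing rate r = g(sum_i s_i^2); g is an arbitrary illustrative non-linearity."""
    return g(sum(x * x for x in s))

single = (1.0, 0.0)                                # one component only
equal_amplitude = (0.5, 0.5)                       # same summed amplitude as `single`
equal_energy = (math.sqrt(0.5), math.sqrt(0.5))    # same summed squares as `single`

for name, s in (("single", single),
                ("equal_amplitude", equal_amplitude),
                ("equal_energy", equal_energy)):
    print(name, amplitude_neuron(s), energy_neuron(s))
```

In the plane spanned by two stimulus components, the stimuli that leave the first neuron's response unchanged lie on a straight line, whereas those for the second neuron lie on a circle, independently of the choice of $f$ and $g$.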

This observation calls for alternative methods to investigate the principles and mechanisms of stimulus integration and to reveal the potential non-linearities involved in this process. Here, we review recent advances to this end, based on closed-loop measurements of iso-response stimuli. Iso-response stimuli are defined as those combinations of the individual stimulus components that yield the same predefined neuronal response. To efficiently search for sets of such stimulus combinations in neurophysiological experiments, closed-loop experiments with automated data analysis and appropriate feedback to the applied stimulation provide an essential ingredient. As discussed and exemplified below, this iso-response approach has already led to new fundamental insights into the function of neurons and neural circuits in different sensory modalities and provides a large potential for future developments and advances in a wide range of systems.
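The closed-loop search for iso-response stimuli can be sketched as follows. This is a minimal illustration and not the search procedure of any particular study: it assumes a noise-free response that increases monotonically with stimulus strength along each direction, and it replaces the recorded cell by a simulated energy-integrating neuron. The names `simulated_neuron` and `find_iso_response_scale`, the target rate, and the search bounds are all hypothetical.

```python
import math

def simulated_neuron(s1, s2):
    """Stand-in for a recorded cell: energy integration followed by saturation.

    Returns a firing rate in spikes/s. In an experiment this call would be a
    stimulus presentation followed by online analysis of the measured response.
    """
    return 100.0 * math.tanh(s1 * s1 + s2 * s2)

def find_iso_response_scale(direction, target_rate, respond,
                            lo=0.0, hi=5.0, tol=1e-6):
    """Bisection along a fixed stimulus direction.

    Scales the direction vector until the response matches target_rate,
    assuming the response grows monotonically with the scale factor and
    that the target lies between the responses at lo and hi.
    """
    d1, d2 = direction
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if respond(mid * d1, mid * d2) < target_rate:
            lo = mid   # response too weak: increase stimulus strength
        else:
            hi = mid   # response too strong: decrease stimulus strength
    return 0.5 * (lo + hi)

target = 50.0  # spikes/s, illustrative predefined response
for angle_deg in (0, 30, 60, 90):
    theta = math.radians(angle_deg)
    direction = (math.cos(theta), math.sin(theta))
    scale = find_iso_response_scale(direction, target, simulated_neuron)
    print(angle_deg, round(scale * direction[0], 4), round(scale * direction[1], 4))
```

For this simulated cell the recovered points lie on a circle, which identifies the squaring of the inputs without any knowledge of the output non-linearity. With real, noisy responses the bisection step would need repeated presentations or a more robust root-finding scheme.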

## **MODEL FRAMEWORK FOR INVESTIGATING STIMULUS INTEGRATION**

A common methodology for analyzing a neuron's stimulus-response relation is based on system identification theory and applies the framework of cascade models (see e.g., Marmarelis and Marmarelis, 1978; Korenberg and Hunter, 1986). These models aim at describing input-output systems in a phenomenological way by a sequence of mathematical primitives, such as linear filters and non-linear transformations. The most prominent member of the cascade model family is arguably the LN model (Hunter and Korenberg, 1986; Sakai, 1992; Meister and Berry, 1999; Chichilnisky, 2001; Paninski, 2003; Schwartz et al., 2006), which comprises a stage of linear filtering of the stimulus, followed by a non-linear transformation of the filter output.

The appeal of this model stems from its simple interpretation; the linear filter describes how different stimulus components are integrated and thus represents the neuron's receptive field structure, whereas the non-linearity captures the output transformation induced by spike generation. In addition, the model elements can be derived in physiological experiments with relative ease. The linear filter, for example, can readily be found through calculating the spike-triggered average (STA) in response to broad-band stimulation, such as white-noise input (de Boer and Kuyper, 1968; Bryant and Segundo, 1976; Eggermont et al., 1983; Chichilnisky, 2001; Paninski, 2003). In using a single linear filter for the stimulus integration stage, however, the LN model implicitly assumes that the entire stimulus integration occurs in a linear fashion. All non-linear effects are relegated to the output non-linearity. The LN model is thus of limited use as soon as the true processing chain contains non-linear operations before stimulus integration is complete. Prominent examples are complex cells in visual cortex, whose input stage corresponds to the sum of two squared Gabor filter signals, resulting in the well-known energy model (Adelson and Bergen, 1985), so that the cells' input-output function corresponds to an LNLN instead of an LN cascade.
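The STA estimate described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the toy LN neuron, its sigmoid spike probability, the example kernel, and all function names are hypothetical and are not taken from the cited studies. For Gaussian white-noise input and a monotonic output non-linearity, the STA is expected to be proportional to the true filter, which the cosine similarity checks.

```python
import math
import random


def simulate_ln_spikes(kernel, n_steps, seed=0):
    """Toy LN neuron (assumed): linear filter, then sigmoid spike probability."""
    rng = random.Random(seed)
    stimulus = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]  # Gaussian white noise
    spikes = [0] * n_steps
    for t in range(len(kernel) - 1, n_steps):
        # Linear stage: kernel[j] weights the stimulus j time steps in the past.
        drive = sum(kernel[j] * stimulus[t - j] for j in range(len(kernel)))
        # Non-linear stage: hypothetical sigmoid, then a Bernoulli spike draw.
        p_spike = 1.0 / (1.0 + math.exp(-(2.0 * drive - 2.0)))
        spikes[t] = 1 if rng.random() < p_spike else 0
    return stimulus, spikes


def spike_triggered_average(stimulus, spikes, n_lags):
    """Average the stimulus history (lags 0..n_lags-1) preceding each spike."""
    sta = [0.0] * n_lags
    n_spikes = 0
    for t in range(n_lags - 1, len(stimulus)):
        if spikes[t]:
            n_spikes += 1
            for j in range(n_lags):
                sta[j] += stimulus[t - j]
    return [value / n_spikes for value in sta]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical filter used only for this illustration.
true_kernel = [0.0, 0.2, 0.6, 1.0, 0.5, -0.3, -0.4, -0.1]
stimulus, spikes = simulate_ln_spikes(true_kernel, n_steps=20000, seed=1)
sta = spike_triggered_average(stimulus, spikes, n_lags=len(true_kernel))
similarity = cosine_similarity(sta, true_kernel)
```

The STA recovers the filter shape only up to a scale factor, which is why the comparison uses a normalized similarity measure rather than the raw values.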

A step forward is made by analyzing the spike-triggered covariance (STC) matrix (Bryant and Segundo, 1976; de Ruyter van Steveninck and Bialek, 1988; Brenner et al., 2000; Schwartz et al., 2006; Samengo and Gollisch, 2012), an extension of the STA. STC analysis allows one to extract multiple linear filters whose contributions are non-linearly combined. This works well for assessing whether a neuron can be described as a linear integrator (STC then yields just one filter) or is better described by non-linear stimulus integration (STC yields multiple filters). Furthermore, this analysis can thereby identify those stimulus components (i.e., filters) whose non-linear integration underlies a neuron's response characteristics. Yet, STC analysis by itself is typically not sufficient for quantitatively assessing the functional form of non-linear stimulus integration, in particular because several parallel filters have to be considered and non-linear effects of stimulus integration and of the output stage need to be separated. We will return to this aspect later and discuss the complementary nature of STC and iso-response analysis.
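For reference, the two spike-triggered statistics can be written compactly. The notation below is an assumption introduced here for illustration: $\mathbf{s}(t_n)$ denotes the stimulus segment preceding the $n$-th of $N$ spikes, and $C_{\mathrm{prior}}$ the covariance matrix of all presented stimulus segments.

```latex
\begin{align}
  \mathrm{STA} &= \frac{1}{N} \sum_{n=1}^{N} \mathbf{s}(t_n), \\
  \mathrm{STC} &= \frac{1}{N-1} \sum_{n=1}^{N}
      \bigl(\mathbf{s}(t_n) - \mathrm{STA}\bigr)
      \bigl(\mathbf{s}(t_n) - \mathrm{STA}\bigr)^{\top}.
\end{align}
```

In the commonly used form of the analysis, candidate filters are the eigenvectors of $\mathrm{STC} - C_{\mathrm{prior}}$ whose eigenvalues differ significantly from zero; the number of such eigenvectors indicates how many stimulus dimensions affect the response.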

Given the above considerations, let us thus consider a model that goes beyond the LN model by incorporating an explicit separation between non-linear operations before and after stimulus integration has taken place (**Figure 1A**). The input to this model is provided by two or more stimulus components $s_1, \ldots, s_n$ that separately undergo some non-linear transformation $N_1(\cdot)$. The linear sum of these terms then serves as input to a second non-linearity $N_2(\cdot)$. This results in a sequence of non-linear, linear, and again non-linear operations and is thus correspondingly called an NLN cascade (Marmarelis and Marmarelis, 1978; Korenberg and Hunter, 1986). In what follows, the NLN cascade model serves as a canonical framework for studying stimulus integration and helps us formalize the relevant challenges and strategies. More complex cascades can be obtained by extending the linear sum to a linear filter operation or by combining more elementary building blocks. For example, auditory signal transduction has been described by an LNLN cascade (**Figure 1D**; Gollisch and Herz, 2005).

The important feature of the canonical model of **Figure 1A** is that it separates non-linear transformations occurring after stimulus integration has taken place (function N2) from non-linear transformations occurring just before or in the course of stimulus integration (function N1). Thus, it is the function N1 that determines the nature of stimulus integration and dictates which scalar measure is distilled out of the combination of stimulus components si. N2, on the other hand, provides a transformation that determines how this scalar measure is represented, but does not affect what is represented in the neuron's output. Hence, the benefit of the canonical model of **Figure 1A** is to provide a framework for separating non-linearities that are relevant for stimulus integration from those that are irrelevant for this purpose, even if they strongly influence the neural output, for example in the form of an all-or-none spike generation threshold or pronounced response saturation.
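The separation of roles between N1 and N2 can be made concrete with a minimal sketch of the NLN cascade. The particular choices of non-linearities (`square`, `identity`, `saturating`, `thresholded`) and the stimulus values are hypothetical and serve only to illustrate the argument: two stimuli with the same summed squares evoke the same output under any N2 when N1 is a squaring operation, but not when N1 is linear.

```python
import math


def nln_response(components, n1, n2):
    """NLN cascade: apply n1 to each component, sum linearly, then apply n2."""
    return n2(sum(n1(s) for s in components))


# Hypothetical choices for the integration non-linearity N1.
def identity(s):
    return s


def square(s):
    return s * s


# Hypothetical choices for the output non-linearity N2.
def saturating(x):
    return 100.0 * x / (x + 10.0)


def thresholded(x):
    return max(0.0, x - 5.0)


stim_a = (3.0, 4.0)  # summed squares: 25, summed amplitudes: 7
stim_b = (5.0, 0.0)  # summed squares: 25, summed amplitudes: 5

# With N1 = square, both stimuli are equivalent for every choice of N2.
energy_equivalent = all(
    math.isclose(nln_response(stim_a, square, n2), nln_response(stim_b, square, n2))
    for n2 in (saturating, thresholded)
)

# With N1 = identity, the same two stimuli are no longer equivalent.
linear_equivalent = math.isclose(
    nln_response(stim_a, identity, saturating),
    nln_response(stim_b, identity, saturating),
)
```

Changing N2 alters the response values themselves, but it never changes which stimuli are grouped together as equivalent; that grouping is set by N1 alone.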

## **THE ISO-RESPONSE METHOD**

As seen in the discussion above, a fundamental challenge to studying neuronal information processing is that non-linearities relevant for stimulus integration need to be separated from subsequent non-linearities, in particular those at the output stage. To approach this challenge, an experimental design is needed that directly reflects these different non-linear processing stages. Crucial insight is provided by a strategy known from measuring threshold curves in neurobiology (Evans, 1975) or using equivalence criteria in psychophysics (Jameson and Hurvich, 1972): instead of estimating the full input-output relation, stimulus parameters are varied such that the neuron's response stays at a *constant level* (Gollisch et al., 2002; Gollisch and Herz, 2003).

The key idea behind this concept is that staying at a constant response level removes the effect of the output non-linearity in the canonical model of stimulus integration (**Figure 1A**). How different stimulus components have to be combined to reach this response level thus serves as a direct signature of the nature of stimulus integration. This is most easily seen when considering a system with two independent input channels $s_1$ and $s_2$. In the two-dimensional stimulus space spanned by $s_1$ and $s_2$, iso-response stimuli are typically located on one-dimensional curves, which we call iso-response curves. Linear integration, for example, is characterized by iso-response curves that are straight lines, even if the overall response function of the neuron is strongly non-linear because of the output non-linearity (**Figure 1B**). Deviations from linearity in the integration process, on the other hand, lead to differently shaped curves. As a simple example, integration in the form of a sum of squares yields circular iso-response curves, defined by the circle equation $s_1^2 + s_2^2 = \mathrm{const}$ (**Figure 1C**).
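The argument can be spelled out in one step for the canonical model with two inputs. The derivation below assumes, as an added condition not stated explicitly in the text, that the output non-linearity $N_2$ is strictly monotonic around the chosen response level $r_0$, so that it can be inverted there.

```latex
\begin{align}
  r_0 = N_2\bigl(N_1(s_1) + N_1(s_2)\bigr)
  \quad &\Longleftrightarrow \quad
  N_1(s_1) + N_1(s_2) = N_2^{-1}(r_0) \equiv c, \\
  N_1(s) = s: \quad & s_1 + s_2 = c
      \quad \text{(straight line)}, \\
  N_1(s) = s^2: \quad & s_1^2 + s_2^2 = c
      \quad \text{(circle of radius } \sqrt{c}\text{)}.
\end{align}
```

The output non-linearity enters only through the constant $c$, so it sets the size of the curve for a given response level but not its shape.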

In higher dimensional stimulus spaces, the iso-response curves become iso-response manifolds. Linear integration then corresponds to an iso-response manifold whose shape is a hyperplane. The iso-response manifolds represent the invariances of a neuron's input-output relation and therefore provide an important characterization of the neuron's computational role, even when considering only low-dimensional stimulus subspaces. These still supply a signature of the neuron's invariances; for example, if a neuron has ellipsoids as iso-response manifolds in a high-dimensional stimulus space, an investigation of a two-dimensional planar subspace will display elliptic iso-response curves. High-dimensional hyperplanes, on the other hand, always yield simple straight lines in a two-dimensional projection.

The prime advantage of the method lies in the fact that the iso-response manifolds are independent of the potentially highly non-linear operation occurring at the final output stage; the iso-response approach relies solely on comparisons of stimuli for which this output stage has identical effects. This focus on a particular response range also makes the approach experimentally efficient, which is of special importance when data acquisition time is limited. Furthermore, by their very definition, iso-response stimuli are "perceived" as identical by the neuron under investigation. The shape of an iso-response manifold thus has a direct functional interpretation, whereas it is often difficult to assign a particular meaning to the specific shape of a neuron's traditional stimulus-response curve. Finally, as the full stimulus-response curve need not be determined within the iso-response paradigm, strong stimulation can be obviated so that experimental artifacts caused by activity-dependent cellular fatigue are not an issue.

Depending on the investigated neuron or on the considered stimuli, different neuronal output characteristics may be relevant for information transmission. Accordingly, the iso-response concept is not limited to "iso-firing rate" but can also be implemented as "iso-first-spike latency" (Bölinger and Gollisch, 2012), "iso-firing phase," or other iso-response variants. In fact, *every* neural response feature that depends on input stimuli can serve as an iso-response dimension, including the value of the probability that a single spike occurred at all (Gollisch and Herz, 2005). Other useful target response measures could be the firing phase relative to some underlying large-scale rhythm or a specific temporal discharge pattern. This goes along with a freedom of choice regarding the dynamics of the chosen stimulus. Iso-response methods can be applied with extremely brief, highly non-stationary stimuli down to the sub-millisecond range (Gollisch and Herz, 2005) as well as with longer, stationary stimuli (Gollisch et al., 2002; Horwitz and Hass, 2012). The first paradigm provides a chance to disentangle rapid biophysical processes that subserve temporally precise stimulus integration, whereas the second setting allows one to focus on the system's spectral or spatial integration properties, independently of temporal dynamics. Furthermore, a given neuron may use different coding schemes for different stimulus attributes. To cover such multiplexing of information (or rule it out for the neuron under study), one can apply different iso-response measures within one experiment (Bölinger and Gollisch, 2012).
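Three of the response measures mentioned above can be computed from the same set of recorded trials, as the following sketch illustrates. The data layout (one list of spike times in seconds per stimulus repetition), the analysis window, and the function names are assumptions made for this example only.

```python
import math


def _spikes_in_window(trial, window):
    start, stop = window
    return [t for t in trial if start <= t < stop]


def mean_spike_count(trials, window):
    """Average number of spikes per trial inside the analysis window."""
    return sum(len(_spikes_in_window(trial, window)) for trial in trials) / len(trials)


def spike_probability(trials, window):
    """Fraction of trials with at least one spike inside the window."""
    hits = sum(1 for trial in trials if _spikes_in_window(trial, window))
    return hits / len(trials)


def mean_first_spike_latency(trials, window):
    """Mean time of the first spike after window start; None if no trial spiked."""
    latencies = []
    for trial in trials:
        in_window = _spikes_in_window(trial, window)
        if in_window:
            latencies.append(min(in_window) - window[0])
    return sum(latencies) / len(latencies) if latencies else None


# Hypothetical spike times (seconds after stimulus onset) for four repetitions.
example_trials = [[0.012, 0.020], [], [0.015], [0.010, 0.030, 0.045]]
example_window = (0.0, 0.05)

count = mean_spike_count(example_trials, example_window)
probability = spike_probability(example_trials, example_window)
latency = mean_first_spike_latency(example_trials, example_window)
```

Any of these scalar summaries could serve as the target quantity in an iso-response search; which one is appropriate depends on the neuron and on the stimulus dynamics, as discussed above.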

## **HISTORICAL BACKGROUND OF ISO-RESPONSE MEASUREMENTS**

The concept of measuring different stimuli that yield the same response also underlies the measurements of threshold tuning curves, which are widely used, for example, to characterize auditory neurons (Tasaki, 1954; Holton and Weiss, 1983; Harris and Dallos, 1984; Geisler et al., 1990). Here, the predefined response is typically set to be the minimal notable difference from baseline activity, and these thresholds are obtained along the axis of varying sound frequency. The measurement of thresholds (as compared to measuring the response strengths for a given stimulus amplitude at different sound frequencies) has the advantage that it avoids overly strong stimulation, which would trigger non-linear suppression mechanisms, blurring the tuning characteristics (Eustaquio-Martín and Lopez-Poveda, 2011).

Other early applications of iso-response measurements have been carried out in the visual system. In the frog retina, threshold intensities of spots in the receptive field center of a recorded ganglion cell were obtained for different light intensity in the surround (Barlow, 1953). This was used to study whether signals in the center and surround of the receptive field were combined in a linear or non-linear fashion. For neurons in primary visual cortex, the combined direction and spatial frequency selectivity was characterized by measuring responses to different combinations of motion direction and spatial frequency and then extracting iso-response curves in the 2D direction–frequency space (Jones et al., 1987). The purpose of these iso-response curves was to provide an easy visualization of the data, which were then analyzed to determine whether motion direction and frequency affected the response independently of each other or whether an interaction between these stimulus dimensions became apparent.

These early applications of the iso-response paradigm, however, did not aim at detailed characterizations of the non-linearities involved in stimulus integration. This requires high-precision measurements of iso-response stimuli, despite the limited recording time in physiological experiments. A key development for providing the required efficiency in the assessment of iso-response stimuli has been the possibility to use closed-loop experiments, benefiting from the recent colossal advancements in computer hardware and software.

## **MEASURING ISO-RESPONSE STIMULI WITH CLOSED-LOOP EXPERIMENTS**

From an experimental viewpoint, the iso-response methodology suggests a conceptual change in the design of a neurophysiological experiment—instead of measuring how responses vary for different predefined stimuli, the goal is to manipulate stimuli such that the recorded cell's output stays at the same level, or at least remains within a small predefined range. This challenging task can only be accomplished efficiently within a closed-loop setting (Benda et al., 2007) so that information about changes in the neural output can immediately be fed back to the stimulus generator (**Figure 2A**).

In a first, exploratory phase of an iso-response experiment, the closed-loop setting is highly useful to determine which stimulus dimensions should be explored at all (e.g., which spatial locations or spectral components). In the second phase, the actual iso-response stimuli are determined. To do so, the closed-loop setup is used to implement a search algorithm. The search for a particular stimulus that provides a predefined response can, for example, proceed radially outwards from stimulus origin in different directions (**Figure 2B**). Alternatively, the search can move along an iso-response curve (**Figure 2C**) by starting at some stimulus and then searching in its vicinity for stimuli leading to the same response. The search for a stimulus that yields a predefined response is essentially a root-finding problem, for which many algorithms of varying efficiency and complexity exist (Press et al., 1992). Essentially, however, the search amounts to comparing the measured response to a target and deciding whether increasing or decreasing the strength of the stimulus components reduces the deviation. As it is not always possible to exactly reach the desired response value, the parameter values for these stimuli are often determined through interpolation from stimuli that led to responses within a small region around the set response. To save precious experimental time, this can also be done offline.

**FIGURE 2 | Closed-loop methods for measuring iso-response stimuli. (A)** Closing the loop by tuning stimulus parameters according to measured responses. In response to a stimulus with two components s1 and s2 (top), a recorded neuron (right) responds with spikes that are automatically detected, for example by a threshold criterion (bottom). The spike response is then compared to a chosen target criterion (left), which may be the number of spikes, the timing of the first spike, the probability of spiking, or any other accessible response feature. According to this comparison, the values of s1 and s2 are adjusted for the next stimulus presentation in order to approach the target response. **(B)** Potential search strategy in radial directions of the stimulus space. This combines several linear searches, which can be performed sequentially or interleaved, typically starting near the origin so that overly strong stimulation is avoided. **(C)** Potential search strategy by tracking iso-response curves. Here, previously measured iso-response stimuli are used as starting conditions for searching nearby stimuli that yield the same response. This can be done, for example, by changing the ratio of s1 and s2 while keeping the same radial distance as for a previously measured iso-response stimulus and then tuning this radial distance until the desired response is obtained. As compared to the strategy of pure radial searches in **(B)**, this sequential search can provide higher recording efficiency, but does not allow interleaving multiple searches.
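A radial search of the kind described above can be sketched as a simple bracketing-and-bisection loop. This is a minimal illustration under several assumptions that are not part of the original studies: the response is taken to increase monotonically with radial distance, the "neuron" is a noise-free stand-in function (`model_neuron`, hypothetical), and each call to `measure` stands for one stimulus presentation. With real, noisy recordings, each probe would typically average several repetitions, and the final estimate would be interpolated from nearby measurements.

```python
import math


def radial_iso_search(measure, angle, target, r_start=0.1, tol=1e-3, max_trials=60):
    """Find the radius along one direction of (s1, s2) space that yields `target`.

    `measure(s1, s2)` returns the response to one stimulus; it is assumed to
    increase monotonically with radius along the chosen direction.
    """
    direction = (math.cos(angle), math.sin(angle))

    def probe(radius):
        return measure(radius * direction[0], radius * direction[1])

    # Phase 1: start near the origin and double the radius until the response
    # reaches the target, which avoids unnecessarily strong stimulation.
    steps = 0
    r_lo, r_hi = 0.0, r_start
    while probe(r_hi) < target and steps < max_trials:
        r_lo, r_hi = r_hi, 2.0 * r_hi
        steps += 1

    # Phase 2: bisect the bracket [r_lo, r_hi] until the response is within tol.
    r_mid = r_hi
    while steps < max_trials:
        r_mid = 0.5 * (r_lo + r_hi)
        response = probe(r_mid)
        steps += 1
        if abs(response - target) <= tol:
            break
        if response < target:
            r_lo = r_mid  # too weak: increase stimulus strength
        else:
            r_hi = r_mid  # too strong: decrease stimulus strength
    return r_mid * direction[0], r_mid * direction[1]


def model_neuron(s1, s2):
    """Hypothetical stand-in: energy integration followed by a saturating output."""
    return 50.0 * math.tanh((s1 * s1 + s2 * s2) / 4.0)


# Radial searches in several directions for a target response of 25.
angles = [k * math.pi / 8.0 for k in range(5)]
iso_points = [radial_iso_search(model_neuron, a, target=25.0) for a in angles]
```

For this stand-in neuron the recovered points lie on a circle, because the saturating output stage affects only the radius at which the target is reached, not the shape of the curve.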

In either phase of the iso-response experiments, precise, flexible, and fast stimulus control is needed, as well as good control over the data acquisition, in particular regarding spike detection and spike sorting (Lewicki, 1998; Quiroga et al., 2004; Santhanam et al., 2004; Wood et al., 2004; Rutishauser et al., 2006). The rapid detection of iso-response stimuli through efficient closed-loop approaches can then not only be used to obtain high-accuracy measurements, but also allows one to measure and compare different variations of iso-response curves from the same cells. For example, it may help elucidate the mechanisms underlying the non-linearities of stimulus integration to repeat iso-response measurements in the presence of pharmacological blockers, for different response measures, or using different stimulus components as the inputs s1 and s2. To illustrate the power and potential of closed-loop methods for iso-response measurements, we will, in the following, summarize some key ideas and results of recent applications of this method in different sensory systems.

## **EXAMPLE I: THE AUDITORY PERIPHERY OF LOCUSTS**

We begin with the integration of acoustic stimuli in locust auditory receptor cells. In this model system, three different types of iso-response experiments have been performed to address several distinct questions. In a first study, iso-firing rate stimuli were used to discriminate between rival hypotheses for spectral integration of sound signals (Gollisch et al., 2002). In a second study, iso-spike probability experiments revealed temporal integration mechanisms on a sub-millisecond scale (Gollisch and Herz, 2005). In a third study, iso-firing rate stimuli were used once more, but they were now designed such that different adaptation mechanisms could be discerned (Gollisch and Herz, 2004). Together, the three iso-response studies led to new insights and quantitative results far beyond the scope of traditional experiments.

Locust auditory receptor neurons are directly attached to the animal's eardrum via short dendrites. When the eardrum vibrates in response to incident sound, mechanosensory ion channels in the neurons open (Gillespie and Walker, 2001). The transduction currents cause depolarizations of the neuronal membrane and thereby trigger spikes, which can be recorded from the receptors' axons in the auditory nerve (Hill, 1983a). Individual receptor cells are broadly tuned to sound frequencies above a few kilohertz and do not phase-lock to the sound's carrier frequency (Hill, 1983b).

Returning to the example from the introduction, let us consider sound pressure waves $s(t)$ that consist of superimposed pure tones, $s(t) = \sum_i s_i \cos(2\pi\nu_i t)$. How the cells' average firing rate *r* depends on sound intensity is subject to three rival hypotheses, in which *r* is considered to be a non-linear function $r = f(J)$ of the "effective stimulus intensity" *J*, which in turn represents a different fundamental measure of sound intensity according to each hypothesis (Garner, 1947; Tougaard, 1996; Heil and Neubauer, 2001): *Amplitude Hypothesis*: *J* is proportional to a weighted sum of the tone amplitudes, $J_{\mathrm{AH}} = \sum_i \lambda_i s_i$, where the factors $\lambda_i$ represent the relative sensitivities of the eardrum to different sound frequencies. Thus, *J* reflects the maximum amplitude of the eardrum vibration. *Energy Hypothesis*: *J* corresponds to the energy of the eardrum oscillations, $J_{\mathrm{EH}} = \sum_i \lambda_i^2 s_i^2$. *Pressure Hypothesis*: *J* corresponds to the temporal mean of the absolute value of the oscillation, $J_{\mathrm{PH}} = \langle |\hat{s}(t)| \rangle$, where $\hat{s}(t)$ describes the sound pressure wave after taking the sensitivities $\lambda_i$ into account.

Which of these three hypotheses applies to locust auditory receptors? Answering this question about the true physical cause of output activity is complicated by the strongly non-linear dependence of *r* on *J* through the output non-linearity *f* and because *J* cannot be determined directly since the locust auditory system is very delicate so that one cannot reliably measure sound transduction prior to the receptor cells' spike generation. Thus, to investigate stimulus integration independently of *f*, the iso-response paradigm was implemented, using superpositions of two sine-wave stimuli in order to identify those amplitude combinations that led to the same firing rate. Note that for each of the three hypotheses, the stimulus-response relation takes on the form of the canonical model of **Figure 1A**: the output non-linearity N2 is always given by the function *f*, whereas N1 is either just a linear function (amplitude hypothesis), a squaring relation (energy hypothesis), or a more complicated non-linearity that has to be determined numerically (pressure hypothesis). The iso-response curves can thus distinguish between the three hypotheses independently of the non-linear relation between the effective stimulus strength and the firing rate.
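The three candidate measures of effective intensity can be compared numerically for a two-tone stimulus. The sketch below rests on assumptions introduced here for illustration: the filtered wave is taken as the sum of the tones weighted by their sensitivities, the temporal mean is approximated by uniform sampling over a fixed window, and the frequencies, sensitivities, and amplitudes are hypothetical values.

```python
import math


def effective_intensities(amplitudes, freqs_hz, sensitivities,
                          duration_s=0.1, n_samples=100000):
    """Return (J_AH, J_EH, J_PH) for a superposition of pure tones."""
    weighted = [lam * s for lam, s in zip(sensitivities, amplitudes)]
    j_ah = sum(weighted)                   # amplitude hypothesis
    j_eh = sum(w * w for w in weighted)    # energy hypothesis
    # Pressure hypothesis: temporal mean of |filtered wave|, found numerically.
    dt = duration_s / n_samples
    total = 0.0
    for n in range(n_samples):
        t = n * dt
        wave = sum(w * math.cos(2.0 * math.pi * f * t)
                   for w, f in zip(weighted, freqs_hz))
        total += abs(wave)
    j_ph = total / n_samples
    return j_ah, j_eh, j_ph


freqs = (4000.0, 7000.0)   # hypothetical tone frequencies in Hz
lambdas = (1.0, 1.0)       # hypothetical, equal sensitivities

single_tone = effective_intensities((1.0, 0.0), freqs, lambdas)
half = 1.0 / math.sqrt(2.0)
two_tones = effective_intensities((half, half), freqs, lambdas)
```

The two stimuli carry the same energy, so they are equivalent under the energy hypothesis, whereas the amplitude and pressure measures assign them different values. Each hypothesis therefore predicts a differently shaped iso-response curve in the space of the two tone amplitudes.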

As indicated in **Figure 3A** for an exemplary receptor neuron, the amplitude and pressure hypotheses were rejected by the measured shapes of iso-response curves, whereas the energy hypothesis provided a good fit to the data (Gollisch et al., 2002). To test the generality of this conclusion, a useful extension is to investigate how iso-response curves for different response levels are related to one another (**Figure 3B**). For locust auditory receptor neurons, iso-firing rate curves obtained for the same neuron at different firing rates turned out to lie on ellipses that are scaled versions of one another. Again, this finding is in accordance with the energy hypothesis, which predicts that the ratio of the ellipses' half-axes should always equal the ratio of the constants $\lambda_1$ and $\lambda_2$. In addition, the energy model also holds for the initial transient response at stimulus onset as well as for superpositions of multiple pure tones and even accurately predicts receptor responses to broad-band noise stimulation (Gollisch et al., 2002).
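The scaling property follows directly from the energy hypothesis for two tones. In the short derivation below, $J_0$ denotes the effective intensity that corresponds to a chosen firing rate, and $a_1$, $a_2$ are the half-axes of the iso-response ellipse along $s_1$ and $s_2$; these symbols are introduced here for illustration.

```latex
\begin{align}
  \lambda_1^2 s_1^2 + \lambda_2^2 s_2^2 = J_0
  \quad &\Longleftrightarrow \quad
  \frac{s_1^2}{a_1^2} + \frac{s_2^2}{a_2^2} = 1,
  \qquad a_i = \frac{\sqrt{J_0}}{\lambda_i}, \\
  \frac{a_1}{a_2} &= \frac{\lambda_2}{\lambda_1}
  \quad \text{(independent of } J_0\text{)}.
\end{align}
```

Changing the target firing rate changes $J_0$ and thus rescales both half-axes by the same factor, which is why the ellipses for different response levels are scaled copies of one another.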

These observations led to the conclusion that sound-intensity coding in this insect model system is well captured by a cascade model (**Figure 1D**), in which the sound wave is first mechanically filtered by the eardrum and the transduction stage then provides a squaring non-linearity prior to temporal integration of the electrical signals in the receptor neuron. A non-linear output stage finally describes the firing-rate encoding of the effective sound intensity *J*EH, resulting in an LNLN-cascade. The temporal dynamics of this cascade, in particular of the different filtering stages, however, were beyond the reach of this first set of experiments with stationary stimuli.

Instead, disentangling the characteristics of temporal integration in sound encoding requires the application of highly dynamic stimuli. Accordingly, the iso-response paradigm was extended to a response measure appropriate for such a dynamic scenario, namely the probability of occurrence of a single spike following a brief stimulus (Gollisch and Herz, 2005). Thus, iso-response curves were measured for double-click stimuli with inter-click intervals of less than one millisecond. The two click amplitudes s1 and s2 were again adjusted via a closed-loop search algorithm during an experiment such that a recorded cell responded to repeated stimulation with a fixed spike probability.

When the inter-click intervals were sufficiently large (**Figure 3C**, middle panel), the iso-response curves were approximately circular (**Figure 3D**, open squares). This finding further corroborates the energy hypothesis, as the circular iso-response curve shows that equal spike probability was obtained for equal sound energy, $s_1^2 + s_2^2$. When very short inter-click intervals were chosen (**Figure 3C**, top panel), however, the iso-response curves were nearly straight lines (**Figure 3D**, filled circles). Thus, on short time scales, the sum of the two click amplitudes, $s_1 + s_2$, determines the spike probability. This is readily explained if one assumes that the two stimulus components are already mechanically integrated by the oscillation of the eardrum, which is expected to act as a linear filter for the sound-pressure wave (Schiolten et al., 1981).

The different shapes of the iso-response curves on different time scales imply that different integrative steps are relevant during the mechanosensory transduction process. This is expected, as the sound pressure wave is first mechanically filtered by the eardrum. After conversion into electrical signals, these are integrated by the capacitive properties of the neuron's cell membrane. In the LNLN cascade of sound transduction (**Figure 1D**), the two temporal integration steps are captured by the linear filters L1 and L2, respectively. How can the temporal structure of these two filters, separated by the squaring non-linearity of mechanosensory transduction, be disentangled? The solution again lies in properly designed iso-response measurements, here by comparing the click amplitudes necessary to evoke the same spike probability when the pressure deflection of the second click either has the same or the opposite sign of the pressure deflection of the first click (**Figure 3C**, bottom panel). The rationale behind this approach is that the linear integration before the squaring non-linearity is sensitive to a change in sign, whereas the integration following the squaring transformation is not. Using the mathematical description of the LNLN cascade, this reasoning can be cast into formulas for extracting filter shapes of L1 and L2 at different time points, which correspond to the applied inter-click intervals (Gollisch and Herz, 2005).
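The rationale of the sign-flip comparison can be illustrated with a toy LNLN simulation. Everything in this sketch is an assumption chosen for illustration and is not the analysis of the cited study: L1 is modeled as a damped sinusoid, L2 as an exponential leaky integrator, the clicks as ideal impulses, the read-out as the peak of the integrated signal, and all parameter values are hypothetical.

```python
import math


def peak_effective_intensity(a1, a2, gap_us, f_khz=5.0, tau1_us=200.0,
                             tau2_us=500.0, dt_us=2.0, duration_us=6000.0):
    """Toy LNLN cascade driven by two clicks separated by `gap_us` microseconds."""
    omega = 2.0 * math.pi * f_khz * 1e-3  # angular frequency in rad per microsecond

    def l1(t):
        # First linear filter (assumed): impulse response of a damped oscillator.
        return math.exp(-t / tau1_us) * math.sin(omega * t) if t >= 0.0 else 0.0

    leak = math.exp(-dt_us / tau2_us)  # per-step decay of the second filter
    j, j_peak = 0.0, 0.0
    for n in range(int(duration_us / dt_us)):
        t = n * dt_us
        x = a1 * l1(t) + a2 * l1(t - gap_us)  # linear superposition (L1)
        j = j * leak + x * x * dt_us          # squaring, then leaky integration (L2)
        j_peak = max(j_peak, j)
    return j_peak


def relative_sign_effect(gap_us):
    """Relative change of the read-out when the second click flips its sign."""
    same = peak_effective_intensity(1.0, 1.0, gap_us)
    opposite = peak_effective_intensity(1.0, -1.0, gap_us)
    return abs(same - opposite) / same


short_gap_effect = relative_sign_effect(200.0)   # within the memory of L1
long_gap_effect = relative_sign_effect(3000.0)   # L1 has decayed, L2 has not fully
```

In this toy model, flipping the sign of the second click changes the read-out markedly at the short interval, where the two clicks still interact before the squaring stage, and has almost no effect at the long interval, where they interact only after squaring. This difference is what allows the contributions of the two filters to be separated.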

This approach showed that L1 resembles the filter of a damped oscillator (**Figure 3E**). In fact, the measured resonance frequencies of these oscillators corresponded to the receptor cells' maximal spectral sensitivity, which typically lies in the range of several kilohertz (Gollisch and Herz, 2005). In addition, these measurements revealed damping time constants of typically a few hundred microseconds, thus providing insight into the mechanical eardrum properties at the different sites where the receptor cells are attached. By contrast, the second filter L2 rather had the shape of a leaky integrator with exponential decay characteristics, thus showing the time scales of electrical integration at the cell membrane (**Figure 3F**). Typically, the decay of L2 was slower than that of L1. Thus, long inter-click intervals surpass the mechanical integration and rather reveal the quadratic integration characteristics of electrical signals as evident in the approximately circular iso-response curves for sufficiently long inter-click intervals (**Figure 3D**).

Note that the assessment of the integration dynamics on time scales as short as a few tens of microseconds could be achieved by measuring the spike probability with comparatively large temporal windows of several milliseconds. This makes the approach insensitive to variability in spike timing, which mars the temporal resolution of traditional correlation techniques (Aldworth et al., 2005; Dimitrov and Gedeon, 2006; Gollisch, 2006). By contrast, the temporal resolution in these iso-response measurements is limited only by the accuracy of stimulus delivery, which may easily reach the microsecond range with appropriate hardware and software.

On much longer time scales, many neurons exhibit spike-frequency adaptation (**Figure 3G**). An initially high firing rate slowly decreases over time, even though the stimulus stays constant. There is a wide range of different biophysical mechanisms known to be involved in spike-frequency adaptation. In many neurons, a major contribution stems from output-driven components that are triggered by the spiking activity of the neuron. Adaptation may, however, also contain components that are driven by the sensory or synaptic input in a feed-forward way. The different dependences of adaptation on the sensory input and neural output will have distinct effects on the coding properties of a sensory neuron. For a functional characterization of adaptation, we therefore have to identify the causal relationships between sensory input, neural activity, and the level of adaptation.

To tackle this problem, one needs to measure input-driven adaptation, which is triggered by the strength of a stimulus component si, independently of output-driven adaptation, which follows the total response level of the neuron. Applying again the iso-response approach to auditory receptor neurons, this can be done by tuning the intensities for different sound frequencies in such a way that the steady-state firing rate is the same (Gollisch and Herz, 2004). Consequently, the level of output-driven adaptation must be equal. Switching between these sounds (**Figure 3H**) can then reveal input-driven components, because these need to approach a new equilibrium value after such a switch. This process results in transient deflections of the firing rate, which can be observed in electrophysiological recordings of the spiking activity (**Figure 3I**). The careful tuning of the sounds leads to a high sensitivity of the method that allows one to detect input-driven adaptation components even when they are far smaller in effect than simultaneously present output-driven components.

## **EXAMPLE II: RETINA**

The vertebrate retina is a neural network at the back of the eyeball that constitutes the first stage of visual processing. The processed visual signals are encoded by retinal ganglion cells into patterns of spikes for transmission along the optic nerve to various brain regions. As in many other sensory systems, the network of the retina features a great deal of convergence; a single ganglion cell can collect signals from tens to hundreds of excitatory bipolar cells (Freed and Sterling, 1988), which in turn each collect signals from many photoreceptors. Inhibitory interactions mediated by horizontal cells and amacrine cells influence which signals are transmitted in this processing chain and how they are modified.

The spikes from an individual retinal ganglion cell thus reflect the processing of this complex upstream circuit. What the circuit computes follows to a large degree from the nature of the non-linearities associated with the ganglion cell's integration over its collection of inputs (Gollisch and Meister, 2010). That this integration can occur in a non-linear fashion has been known for more than forty years, since ganglion cells were first categorized as linear X cells and non-linear Y cells (Enroth-Cugell and Robson, 1966). Yet, the classical experiments for identifying non-linear stimulus integration with reversing spatial gratings only indicate whether or not a non-linearity is present and do not directly reveal its functional form. Moreover, it is likely that the class of non-linearly integrating cells is composed of various types of ganglion cells, which may express different types of non-linear characteristics, serving different visual functions.

Based on the iso-response paradigm, the nature of stimulus integration in the receptive field can be analyzed by subdividing the receptive field into two halves (**Figure 4A**) and using the values of the visual contrast in each half as inputs, analogous to the canonical model of **Figure 1A**. This approach has recently been applied to measuring stimulus integration by Off-type ganglion cells in the salamander retina (Bölinger and Gollisch, 2012). The contrast combinations (s1, s2) were flashed briefly onto the receptive field of a ganglion cell, whose spikes were recorded extracellularly. Closed-loop experiments were then used to find such combinations that either gave the same spike count (iso-rate curves) or the same first-spike latency (iso-latency curves). As the stimulus started from an intermediate gray illumination, both positive contrast (brightening) as well as negative contrast (dimming) could be applied, and iso-response stimuli were therefore measured beyond just one quadrant of stimulus space (**Figure 4B**).

After determining the receptive field center of a retinal ganglion cell (dashed line), different contrast levels s1 and s2 were simultaneously displayed for 500 ms, each in one half of the receptive field. **(B)** Stimulus space. Iso-response stimuli were measured in the space spanned by s1 and s2. Experiments were performed on Off-type ganglion cells, which best respond to negative contrast. Several sample stimulus patterns are shown at their respective locations in stimulus space. The origin corresponds to the gray level of background illumination. **(C)** Iso-rate and iso-latency curves for a

a threshold-quadratic non-linearity of stimulus integration. **(D)** Iso-rate and iso-latency curves for a different ganglion cell from a subpopulation in the salamander retina. While the iso-latency curve has a similar shape as the curves in **(C)**, the iso-rate curve shows a notch along the lower-left diagonal, corresponding to particular sensitivity to homogeneous stimulation of the receptive field. This follows from a dynamic local gain control mechanism, mediated by inhibitory interactions. All panels reprinted from Bölinger and Gollisch (2012), Copyright (2012), with permission from Elsevier.

The iso-response curves revealed that all measured ganglion cells in the salamander retina featured non-linear stimulus integration. For the majority of cells, iso-rate curves and iso-latency curves had the same general shape, as shown by an example in **Figure 4C**. The curves were approximately circular in the region where both contrast values were negative (corresponding to the preferred contrast for these Off-type cells). In this region, the curves thus resembled the circular iso-response curves seen in a simple model (**Figure 1C**) and in the previous example (**Figure 3A**), suggesting that a sum of squares determines the response of these ganglion cells. Combinations of negative contrast in one half of the receptive field and positive contrast in the other, however, yielded sections of the iso-response curves that were nearly parallel to the axes of the plot. This suggests that the amount of positive contrast had little or no effect on the response strength, corresponding to a thresholding mechanism that implements a half-wave rectification. Together, the shape of these iso-response curves indicates that a threshold-quadratic transformation is the fundamental non-linearity of stimulus integration over the receptive field center of these ganglion cells.
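The shape described above can be reproduced with a small closed-loop search on a model cell. This is a minimal sketch, assuming a hypothetical Off-type model in which a threshold-quadratic sum over the two receptive field halves is followed by a saturating output non-linearity; the function names, the constants, and the use of bisection as the search rule are illustrative assumptions and not the procedure of Bölinger and Gollisch (2012).

```python
import math

def response(s1, s2):
    """Hypothetical Off-cell: threshold-quadratic integration, then a saturating output."""
    drive = max(0.0, -s1) ** 2 + max(0.0, -s2) ** 2  # only negative contrast contributes
    return 20.0 * drive / (0.25 + drive)             # assumed output non-linearity

def iso_radius(angle_deg, target, r_max=5.0, n_iter=50):
    """Search along one direction in (s1, s2) space for the stimulus giving `target`.

    Returns the distance from the origin, or None if the target response
    is not reached within r_max along this direction.
    """
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    if response(r_max * c, r_max * s) < target:
        return None
    lo, hi = 0.0, r_max
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if response(mid * c, mid * s) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sample the iso-response curve for a fixed target response.
for angle in (45, 135, 180, 200, 225, 250, 270, 315):
    r = iso_radius(angle, target=10.0)
    if r is None:
        print(angle, "target not reached")
    else:
        print(angle, r * math.cos(math.radians(angle)), r * math.sin(math.radians(angle)))
```

In the quadrant where both contrasts are negative, the returned points lie on a circle; in the mixed quadrants, the coordinate of the negative contrast stays fixed, so the curve runs parallel to an axis. Because the search only compares responses with the target, the output non-linearity does not affect the shape of the curve.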

Other recorded ganglion cells, however, showed a fundamentally different shape of the iso-rate curves (**Figure 4D**). Instead of the circular shape in the region where both contrast values are negative, the curves show a pronounced notch, indicating that particularly small contrast levels were required to reach the target spike count when both receptive field halves were stimulated with the same negative contrast. Accordingly, the cells were named "homogeneity detectors," as they appear particularly suited to detect large, homogeneous objects, even at low contrast (Bölinger and Gollisch, 2012).

Both types of ganglion cells, those with threshold-quadratic non-linearities as well as homogeneity detectors, are strongly non-linear in their integration characteristics. They would thus both be classified as Y-type cells according to a conventional investigation of linear vs. non-linear stimulus integration with reversing grating stimuli (Enroth-Cugell and Robson, 1966; Bölinger and Gollisch, 2012). The assessment of integration characteristics with iso-response curves, on the other hand, allowed an analysis of the particular type of non-linearity in a quantitative and detailed fashion and thus provided a distinction between different types of non-linear stimulus integration that had not been apparent before.

Interestingly, the iso-latency curves of homogeneity detectors did not display the characteristic notch, but rather showed the circular region, similar to the majority of measured iso-rate curves. The comparison between iso-rate and iso-latency curves thus already provides insights regarding the mechanism responsible for the characteristics of homogeneity detectors; it suggests that sensitivity to homogeneous stimuli is obtained through a process that acts only after the first spike is initiated and thus has a dynamic nature. Further investigations showed that this phenomenon is brought about by local inhibitory circuitry, acting as a local gain control and coming into effect with a slight delay because of the additional synaptic stage involved in the inhibitory pathway (Bölinger and Gollisch, 2012).

## **EXAMPLE III: VISUAL CORTEX**

A further recent application of iso-response measurements has shed light onto the integration of color information by neurons in primate visual cortex (Horwitz and Hass, 2012). This study was motivated by the puzzle that neuronal responses in visual cortex to color stimuli often appeared incongruent with representing linear sums and differences of cone signals, an expectation that had been developed on the basis of psychophysical color perception experiments (Hering, 1920; Hurvich and Jameson, 1957). To resolve this issue and test whether non-linear integration of cone signals had been a missing ingredient in the models with which the data had been analyzed, Horwitz and Hass (2012) measured iso-response surfaces of macaque V1 neurons in a three-dimensional color stimulus space, defined by the activation of the three types of cones in the retina (**Figure 5**). Using drifting chromatic gratings as stimuli, the iso-response stimuli were defined as those combinations of cone activation that elicited the same firing rate over the stimulus duration.

The iso-response stimuli define two-dimensional surfaces in this three-dimensional stimulus space. For some cells, the iso-response surfaces were simple planes (**Figure 5A**), indicating that these cells indeed represent a linear combination of cone activation strengths. Other cells, however, showed strong deviations from linear integration; for those cells, the iso-response data points were much better fitted by quadratic models, either corresponding to a hyperboloid (**Figure 5B**) or to an ellipsoid (**Figure 5C**). Taken together, the data show that iso-response surfaces of individual cells are generally well described by either a linear or a quadratic integration model. This finding demonstrates that the previous lack of a coherent description of cortical responses to color stimuli in terms of cone activations resulted from not taking non-linear integration into account.
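For orientation, the two model classes can be written in a generic form. The notation below is an assumption made for illustration (a stimulus vector $\mathbf{x}$ of the three cone-activation coordinates, a weight vector $\mathbf{w}$, a symmetric matrix $A$, and a response criterion $c > 0$) and is not necessarily the parameterization used by Horwitz and Hass (2012):

$$
\text{linear:}\quad \mathbf{w}^{\top}\mathbf{x} = c,
\qquad\qquad
\text{quadratic:}\quad \mathbf{x}^{\top} A\,\mathbf{x} = c .
$$

In this form, the quadratic iso-response surface is an ellipsoid when all eigenvalues of $A$ are positive, and a hyperboloid when $A$ has eigenvalues of both signs. In the latter case there are directions $\mathbf{x}$ with $\mathbf{x}^{\top} A\,\mathbf{x} \le 0$, along which the criterion $c$ cannot be reached at any stimulus strength.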

Interestingly, the hyperboloid iso-response surface of **Figure 5B** is similarly non-convex as the iso-response curve of homogeneity detectors measured in the retina (**Figure 4D**). This shape suggests that the cells are especially sensitive to one particular stimulus dimension—homogeneous stimulation of the receptive field in the case of the retinal neuron; a particular cone activation pattern in the case of the cortical neuron—whereas responses in other directions appear suppressed; in the case of the cortical neuron, this means that for certain combinations of cone activation, the desired response is never reached. One may thus hypothesize that the hyperboloid shape of the iso-response surface in cortical neurons is brought about by a similar active suppression mechanism as mediated by local inhibition in the case of the retinal homogeneity detectors.

**FIGURE 5 | Iso-response measurements of integration of cone signals by neurons in macaque primary visual cortex.** The panels show iso-response stimuli (black circles) obtained with drifting chromatic gratings for three representative sample cells in **(A)**, **(B)**, and **(C)**, respectively. Stimuli that yielded the same firing rate are plotted in the space spanned by S-cone activation (S) and by the sum and difference of L-cone and M-cone activation (L + M and L − M, respectively). For each cell, the data are shown in the 3D plots for two different viewpoints (left and right column, respectively). Gray lines indicate directions in stimulus space along which the predefined response criterion could not be reached. As shown by the green surface plots, iso-response stimuli are well fitted by a linear plane for the cell in **(A)**, by a hyperboloid for the cell in **(B)**, and by an ellipsoid for the cell in **(C)**. Panels **(B)** and **(C)** also show best fits of linear planes (black quadrangles), which do not provide good descriptions of the iso-response stimuli. Reprinted by permission from Macmillan Publishers Ltd: Nature Neuroscience (Horwitz and Hass, 2012), copyright (2012).

The iso-response surface in the shape of an ellipsoid (**Figure 5C**), on the other hand, indicates that the cell represents a sum of squares, similar to findings in both the locust auditory system (**Figure 3A**) and the salamander retina (**Figure 4C**) as well as in the energy model for complex cells in visual cortex (Adelson and Bergen, 1985). The ubiquity of this type of non-linear stimulus integration may indicate a general-purpose representation, providing invariance under rotations in stimulus space.

## **MULTIPLE STAGES OF STIMULUS INTEGRATION**

The canonical model of **Figure 1A** suggests that the iso-response method is most easily applied to systems with two non-linear stages, one before stimulus integration has taken place and one afterwards. Yet, valuable insight can also be obtained for systems with more successive non-linearities. First, from a functional point of view it may not be necessary to disentangle all non-linear stages; rather, it may be of interest to determine the total, combined non-linear transformation before stimulus integration takes place and separate it from the total non-linearity afterwards. This procedure aims at casting the investigated system again into the form of the canonical NLN cascade of **Figure 1A**, but will fail for systems that deviate strongly from this simplified structure.

Second, one may profit from the fact that many neural systems, in particular sensory systems, are organized in a hierarchical fashion so that the relevant temporal, spatial, and spectral scales increase from processing layer to processing layer. This allows one to choose the stimulus layout—by appropriately defining what is represented by the two components s1 and s2—in such a way that the relevant stimulus integration occurs at a certain stage along the processing chain, dividing the chain into the total non-linear transformation before and after this stage. By varying the stimulus scale used in the analysis, one can thus distinguish between successive non-linear stages.

To illustrate this strategy, let us consider a model with three non-linear stages N1, N2, and N3, separated by successive stages of stimulus integration, which first only pool over sets of neighboring inputs and subsequently integrate over these sets (**Figure 6A**). To separate these integration stages, we now first choose a "coarse" stimulus layout, in which the four input channels are combined into pairs so that "nearby" channels, which are pooled together already in the first integration stage, receive the same stimulus intensity s1 or s2, respectively (stimulus pattern inside the blue box in **Figure 6A**). For this stimulus layout, s1 and s2 remain separate through both N1 and N2 and are combined only prior to the output non-linearity N3. This means that the iso-response curve of s1 and s2 will reflect the concatenation of N1 and N2, but is not influenced by N3. Now, let us consider a "finer" stimulus layout, in which "nearby" input channels already receive different stimulus components s1 and s2 (stimulus pattern inside the green box in **Figure 6A**). For this layout, s1 and s2 are combined directly after N1 and before N2, which means that the iso-response curve of s1 and s2 will now only reflect the non-linearity N1 and be insensitive to both N2 and N3. Investigating and comparing the shapes of iso-response curves on a fine and coarse scale thus can be used to derive both non-linearities N1 and N2. Finally, for completeness, N3 could simply be obtained by homogeneously stimulating all four input channels with the same, varying stimulus intensity, thus measuring the combined effect of all three non-linear stages, and comparing this to the effect of N1 and N2 alone.
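The coarse and fine layouts can be compared on a small model cascade. This is a minimal sketch, assuming a hypothetical four-channel cascade in which N1 is linear, N2 is quadratic, and N3 is saturating; these particular choices, the function names, and the bisection search are illustrative assumptions and are restricted to non-negative stimulus components.

```python
import math

def n1(x):
    return x              # assumed linear first stage

def n2(u):
    return u * u          # assumed quadratic second stage

def n3(v):
    return v / (1.0 + v)  # assumed saturating output stage

def cascade(x):
    """Four inputs; channels (0, 1) and (2, 3) are pooled first, then the two pools."""
    pool_a = n2(n1(x[0]) + n1(x[1]))
    pool_b = n2(n1(x[2]) + n1(x[3]))
    return n3(pool_a + pool_b)

def coarse(s1, s2):
    return [s1, s1, s2, s2]  # nearby channels share one stimulus component

def fine(s1, s2):
    return [s1, s2, s1, s2]  # nearby channels receive different components

def iso_point(layout, angle_deg, target, r_max=10.0, n_iter=60):
    """Bisection along one direction of the (s1, s2) plane until `target` is reached."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    lo, hi = 0.0, r_max
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if cascade(layout(mid * c, mid * s)) < target:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    return r * c, r * s

for angle in (0, 30, 60, 90):
    print(angle, "coarse:", iso_point(coarse, angle, 0.5), "fine:", iso_point(fine, angle, 0.5))
```

With these assumed stages, the coarse layout yields points on a circle, reflecting the quadratic N2 acting on the linearly transformed components, whereas the fine layout yields points on a straight line, reflecting the linear N1 alone. In neither case does the output stage N3 influence the shape of the curve.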

The strategy of comparing iso-response curves measured with coarse and fine stimulus layouts has been used to track the origin of the non-linearities in the receptive fields of retinal ganglion cells that were described in **Figure 4** (Bölinger and Gollisch, 2012). Spatial stimulus integration in the retina occurs successively from photoreceptor cells via bipolar cells to ganglion cells. These integration stages cover different spatial scales; photoreceptor cells integrate light over a distance of about 10 μm (Mariani, 1986; Sherry et al., 1998), whereas bipolar cells have receptive fields of roughly 50–100 μm diameter (Wu et al., 2000; Baccus et al., 2008) and ganglion cells in the range of 200–600 μm. Thus, analyzing whether the non-linear structures of iso-response curves persist or change on spatial scales below several tens of micrometers allows one to test whether the site of the non-linearity is before or after stimulus integration by bipolar cells. This concept has been applied by arranging the stimulus components in a checkerboard-like fashion with different sizes of the individual checkerboard fields. Measurements of iso-response stimuli then showed that, as the scale of the fields fell roughly below 100 μm, the shapes of iso-response curves approached straight lines (**Figure 6B**). This meant that no relevant non-linearity occurred between photoreceptor cells and bipolar cells; to good approximation, stimuli were integrated linearly by bipolar cells.

**FIGURE 6 | Approach for disentangling non-linearities at multiple stages of stimulus integration in hierarchical models. (A)** Cascade model with three consecutive non-linear stages, N1, N2, and N3, separated by two integration stages. The model assumes that first nearby stimulus components are integrated, whose results are then combined in a subsequent stage. Different stimulation schemes that can be used to separate the effects of the non-linearities are shown on top. When nearby input channels are stimulated with the same stimulus component s1 or s2, respectively (stimulus pattern in blue box), the iso-response curve is affected by the combination of N1 and N2. When the two stimulus components s1 and s2 are placed so that they are combined already by the first integration stage (stimulus pattern in green box), only non-linearity N1 is relevant for the shape of the iso-response curve. **(B)** Application of the strategy to separate different integration stages of spatial integration by retinal ganglion cells. When nearby spatial locations receive the same contrast (blue data points), the iso-rate curve shows the standard threshold-quadratic non-linearity as in **Figure 4C**. When the two contrast components s1 and s2 are interleaved so that presynaptic bipolar cells typically already start integrating the two components, but individual photoreceptors only receive either one of the components (green and orange data points, corresponding to squares in the stimulus layout with 150 and 60 μm side length, respectively), the iso-rate curves approach straight lines, showing that the integration stage from photoreceptors to bipolar cells can be approximated as linear integration. Panel **(B)** reprinted from Bölinger and Gollisch (2012), Copyright (2012), with permission from Elsevier.

Essentially the same principle was also behind the separation of different integration stages in locust auditory receptor neurons, as discussed above, by probing the system with pairs of acoustic clicks at different inter-click intervals (**Figure 3C**; Gollisch and Herz, 2005). For very short inter-click intervals, iso-response curves showed linear integration of the two clicks, corresponding to the linear mechanical integration at the eardrum; for longer inter-click intervals that surpassed the mechanical integration time, the quadratic non-linearity of transduction became apparent (**Figure 3D**).

## **COMPARISON WITH SPIKE-TRIGGERED COVARIANCE ANALYSIS**

The iso-response method aims at identifying non-linear interactions in consecutive stages of neuronal processing. This relates the method conceptually to cascade models and reverse-correlation techniques, such as STA and STC analysis. As already discussed above, STA analysis fails to capture non-linear integration, because all stimulus integration is assumed to occur linearly in the single-filter LN model. STC analysis and related information-theoretic techniques (Paninski, 2003; Sharpee et al., 2004; Pillow and Simoncelli, 2006), on the other hand, provide multiple filters and a corresponding multi-dimensional non-linearity. While the popularity of STC analysis primarily rests on its ability to determine the number and shapes of relevant filters, it also, in principle, allows studying non-linear stimulus integration by analyzing the features of the multi-dimensional non-linearity. A primary challenge for this is again the need to separate non-linearities of stimulus integration from the non-linearity at the output stage. If no explicit models of the output non-linearity are available, calculating iso-response curves within the multi-dimensional stimulus subspace that is spanned by the identified filters (Rust et al., 2005) appears to be the method of choice for identifying non-linearities of stimulus integration, even if these iso-response curves must be computed in an offline fashion.
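The difference between STA and STC analysis can be seen in a toy example. This is a minimal sketch, assuming a hypothetical model neuron whose spike probability grows with the sum of squares of two stimulus components drawn from Gaussian white noise; the spike-probability function, the sample size, and the random seed are illustrative assumptions.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def fires(x1, x2):
    """Hypothetical neuron: spike probability grows with the sum of squares."""
    p = min(1.0, 0.1 * (x1 * x1 + x2 * x2))
    return rng.random() < p

# Collect the spike-triggered ensemble under two-dimensional Gaussian white noise.
triggered = []
for _ in range(20000):
    x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    if fires(x1, x2):
        triggered.append((x1, x2))

n = len(triggered)

# Spike-triggered average (STA): mean of the spike-triggered stimuli.
sta = (sum(x for x, _ in triggered) / n, sum(y for _, y in triggered) / n)

# Spike-triggered covariance (STC): covariance of the spike-triggered stimuli.
cxx = sum((x - sta[0]) ** 2 for x, _ in triggered) / (n - 1)
cyy = sum((y - sta[1]) ** 2 for _, y in triggered) / (n - 1)
cxy = sum((x - sta[0]) * (y - sta[1]) for x, y in triggered) / (n - 1)

# Eigenvalues of the symmetric 2x2 covariance matrix in closed form.
center = 0.5 * (cxx + cyy)
spread = math.sqrt((0.5 * (cxx - cyy)) ** 2 + cxy ** 2)
eigenvalues = (center - spread, center + spread)

print("STA:", sta)
print("STC eigenvalues (prior variance is 1):", eigenvalues)
```

For this symmetric non-linearity, the STA is close to zero and therefore carries no information about the relevant stimulus dimensions, whereas both STC eigenvalues lie clearly above the prior variance of 1. The iso-response curves of this model, which are circles, would then have to be estimated offline from the spike-triggered data within the identified subspace.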

Note, however, that there are important practical differences between analyzing non-linear stimulus integration with STC analysis or with closed-loop iso-response measurements. First, STC analysis is based on continuous, stationary stimulation, typically with white-noise statistics. The closed-loop iso-response method, on the other hand, can also be applied under non-stationary presentation of individually analyzed stimulus segments and can thus be used also for fairly brief stimuli, such as flashed visual images or short sound bursts. This difference in stimulus statistics can have interesting consequences for the processing features of the investigated system. For example, high-threshold inhibition from amacrine cells in the retina (Bölinger and Gollisch, 2012) may be effectively absent in white-noise experiments, but contribute to ganglion cell processing for flashed or saccade-like image presentations.

Second, STC analysis can yield a fairly large number of filters, and the high dimensionality of the associated stimulus subspace may impede a detailed analysis of the non-linear stage (Rust et al., 2005). Unless spiking is well described by a Poisson process, the temporal dynamics of spike generation alone can lead to a collection of several relevant filters (Agüera y Arcas and Fairhall, 2003; Agüera y Arcas et al., 2003). Along the same line, STC analysis of retinal ganglion cells with purely temporal stimuli has been shown to yield multiple temporal filter components (Fairhall et al., 2006). When on top of temporal variations, stimuli have further structure, such as spatial dimensions, one obtains additional filters, including filter combinations that mix temporal effects with other stimulus dimensions. A detailed analysis of the full non-linear stage then easily becomes impractical, in particular for more than two or three dimensions, both for reasons of graphical display and required amounts of data. As a feasible alternative, one may aim at analyzing non-linearities in low-dimensional subspaces, for example, spanned by just two selected filters (Rust et al., 2005). However, all other relevant filters then effectively act as noise sources, reducing the efficiency of this analysis. The closed-loop iso-response approach circumvents this problem by focusing on a chosen, low-dimensional set of stimulus components, such as two spectral or spatial stimulus components. This becomes particularly useful when combined with prior closed-loop identification of appropriate stimulus components, for example, by matching the components to the location and size of a receptive field. The possibility of focusing on a few purposefully selected stimulus components as well as on a narrow response regime is the benefit of the technically more demanding closed-loop approach. Yet, the selected components remain a choice of the experimenter under the assumption that these correspond to meaningful, separate input channels for the neuron under study.

In this view, STC analysis and iso-response measurements are complementary. While the strength of the STC analysis lies mostly in determining—with relatively few prior assumptions—the number and nature of stimulus features that are non-linearly integrated, the iso-response method assumes certain stimulus components to be relevant features and aims at determining their non-linear integration in detail. For systems with little prior expectation about the relevant input channels, it may well make sense to base a closed-loop measurement of iso-response stimuli on the results of a prior STC analysis for guiding the choice of the applied stimulus components.

#### **NEXT STEPS AND FUTURE CHALLENGES**

As shown by the above examples, the iso-response method provides a powerful concept for studying how neurons integrate sensory inputs. Using different types of stimuli allows one to focus on spectral, spatial, temporal, or spatio-temporal integration. Exploring and comparing different output measures, such as firing rate or first-spike latency, provides valuable insight into potential coding schemes. Furthermore, unlike in correlation-based approaches, the temporal resolution of the iso-response method is *not* limited by the precision with which the output signal can be measured. This is best illustrated by the experiments where click stimuli were presented to auditory receptor neurons whose output was measured in terms of the probability that a single, isolated spike is generated within a window stretching several milliseconds (Gollisch and Herz, 2005). The temporal filters L1 and L2 of the corresponding LNLN cascade were determined at a temporal resolution below 20 microseconds, restricted only by the precision of the acoustic stimulus generator. The stochastic nature of neural responses did not cause any limitations; in fact, the iso-spike-probability paradigm is only feasible because of a nonzero intrinsic noise level so that a single spike is generated in some, but not all trials. The critical, beneficial role of a neural characteristic that is usually considered an experimental nuisance was an interesting observation in these studies. In addition, one may think that the iso-response paradigm applies to conventional feedforward chains only; but as demonstrated by the study on input- vs. output-driven adaptation, certain feedback loops can also be studied with iso-response methods (Gollisch and Herz, 2004). We are thus confident that the iso-response paradigm will see further conceptual and methodological extensions in the future.

On the practical side, ongoing advances in soft- and hardware technology will increase the closed-loop interaction speed and also make it possible to include second-level analyses in the very design of iso-response experiments. This concerns, for example, automated stopping rules in the search algorithms and automated selection of search directions, two developments of key importance for extending the iso-response approach to higher-dimensional search spaces. Closed-loop experiments have already been used to determine stimulus *ensembles* that are optimal from an information-theoretic point of view (Machens et al., 2005). This is a computationally highly demanding task. With ever-rising computer power, however, it might be interesting to extend this concept and search for iso-information stimulus ensembles.

A prominent research area that could also benefit strongly from the iso-response methodology concerns the computations carried out by dendrites and dendritic trees. Synaptic integration along dendrites is often assumed to be linear, although it has been known for a long time that non-linearities exist and that they can have substantial consequences for neuronal computation (Koch et al., 1983; Mel, 1994; Poirazi et al., 2003; Katz et al., 2009; Abrahamsson et al., 2012). Based on traditional measurement paradigms, however, electrophysiological as well as imaging experiments can only address the question whether synaptic integration is linear, sublinear, or perhaps superlinear. Characterizing these non-linearities using the iso-response method would be an important step toward understanding dendritic computation. To investigate the scope and limits of such an approach, one could first focus on single-cell models of increasing complexity (Herz et al., 2006) with which one can test the method under well-defined and easily modifiable control conditions.

As demonstrated by the examples presented in this review, the iso-response method opens a new vista on neural dynamics and information processing. By focusing on one key question ("Which input combinations generate the *same* neural output?"), the method automatically reveals the invariance classes of the neuron (or neural substructure) under study. This feature should prove particularly helpful for studying sensory systems with complex and poorly understood stimulus spaces, such as olfaction, as well as for understanding multi-sensory integration and higher cortical processing. Note in this context that neural responses in the cortical area MST have been explained using an LNLN cascade model (Mineault et al., 2012). As shown in this review, the iso-response method is ideally suited to explore such models and determine their parameters with high precision. This suggests that even neural processing levels far from the sensory periphery can be studied quantitatively using the iso-response method.

At least conceptually, one could also extend this method beyond the single-neuron level and study multi-neuronal activity patterns. As a simple example, one may explore iso-synchrony stimuli that keep the level of synchronous activity between two or more neurons constant. Searching for multi-neuronal response patterns will require some conceptual developments regarding the applied search algorithm, that is, how to systematically tune stimuli toward eliciting a given multi-neuronal spike pattern. On the technological side, the necessary methods for fast and reliable online spike detection and sorting of multiple spike trains have already begun to become available (Quiroga et al., 2004; Santhanam et al., 2004; Wood et al., 2004; Rutishauser et al., 2006), but still need to be further explored for practical applications of closed-loop experiments.

At a larger scale, network activity could be characterized by identifying iso-population-response stimuli, using local-field-potential, MEG, or even fMRI signals. As for single neurons, one may learn far more by carefully analyzing those stimulus combinations that cause the same large-scale response than by observing that certain stimuli lead to more activation than others, without really knowing how to interpret differences in the activation levels. Within the iso-response framework, the tricky task of construing activity changes can be circumvented, and one can directly focus on one of the most important functional characteristics of a specific neuron or neural population: How are sensory or synaptic inputs integrated over space, frequencies, and time?

## **REFERENCES**

rescaling maximizes information transmission. *Neuron* 26, 695–702.

## **ACKNOWLEDGMENTS**

This work was supported by the German Initiative of Excellence, the International Human Frontier Science Program Organization, and the Deutsche Forschungsgemeinschaft (DFG) through the Collaborative Research Center 889 (Tim Gollisch) as well as by the German Ministry for Science and Education (BMBF) through the Bernstein Center for Computational Neuroscience Munich (FKZ 01GQ1004A) (Andreas V. M. Herz).

2nd. (2006). Selectivity for multiple stimulus features in retinal ganglion cells. *J. Neurophysiol.* 96, 2724–2738.


coding in an insect auditory system. *J. Neurosci.* 22, 10434–10448.


*Approach.* New York, NY: Plenum Press.


detection and sorting of extracellularly recorded action potentials in human medial temporal lobe recordings, *in vivo*. *J. Neurosci. Methods* 154, 204–224.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 August 2012; paper pending published: 15 September 2012; accepted: 29 November 2012; published online: 19 December 2012.*

*Citation: Gollisch T and Herz AVM (2012) The iso-response method: measuring neuronal stimulus integration with closed-loop experiments. Front. Neural Circuits 6:104. doi: 10.3389/fncir.2012.00104*

*Copyright © 2012 Gollisch and Herz. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Statistics of neuronal identification with open- and closed-loop measures of intrinsic excitability

## *Ted Brookings <sup>1</sup>\*, Rachel Grashow <sup>2</sup> and Eve Marder <sup>1</sup>*

*<sup>1</sup> Volen Center and Biology Department, Brandeis University, Waltham, MA, USA*

*<sup>2</sup> Environmental Health Department, Harvard School of Public Health, Boston, MA, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Scott Hooper, Ohio University, USA; Farzan Nadim, New Jersey Institute of Technology, USA; Ronald L. Calabrese, Emory University, USA*

#### *\*Correspondence:*

*Ted Brookings, Volen Center and Biology Department, Brandeis University, Volen Center MS 013, 415 South Street, Waltham, MA 02454, USA.*

*e-mail: ted.brookings@googlemail.com*

In complex nervous systems, patterns of neuronal activity and measures of intrinsic neuronal excitability are often used as criteria for identifying and/or classifying neurons. We asked how well identification of neurons by conventional measures of intrinsic excitability compares with a measure of neuronal excitability derived from a neuron's behavior in a two-cell network constructed with the dynamic clamp. We used four cell types from the crab stomatogastric ganglion: the pyloric dilator, lateral pyloric, gastric mill, and dorsal gastric neurons. Each neuron was evaluated for six conventional measures of intrinsic excitability (intrinsic properties, IPs). Additionally, each neuron was coupled by reciprocal inhibitory synapses made with the dynamic clamp to a Morris–Lecar model neuron and the resulting network was assayed for four measures of network activity (network activity properties, NAPs). We searched for linear combinations of IPs that correlated with each NAP, and combinations of NAPs that correlated with each IP. In the process we developed a method to correct for multiple correlations while searching for correlating features. When properly controlled for multiple correlations, four of the IPs were correlated with NAPs, and all four NAPs were correlated with IPs. Neurons were classified into cell types by training a linear classifier on sets of properties, or using *k*-medoids clustering. The IPs were modestly successful in classifying the neurons, and the NAPs were more successful. Combining the two measures did better than either measure alone, but not well enough to classify neurons with perfect accuracy, thus reiterating that electrophysiological measures of single-cell properties alone are not sufficient for reliable cell identification.

**Keywords: clustering algorithms, multiple correlations, feature selection, stomatogastric ganglion, identified neurons, dynamic clamp, half-center oscillator, Morris–Lecar model**

## **INTRODUCTION**

A major step in elucidating the connectivity of nervous system circuits is identifying the neurons in the circuit. In the case of small invertebrate circuits neuronal identification is often straightforward (Getting and Dekin, 1985; Getting, 1989; Marder and Calabrese, 1996; Marder and Bucher, 2001, 2007; Kristan et al., 2005), using a combination of neuronal projection patterns, position, firing patterns, size, and color. This has facilitated the establishment of the connectivity diagrams of the circuits underlying stereotyped behaviors in a variety of animals (Mulloney and Selverston, 1974a,b; Selverston et al., 1976; Getting et al., 1980; Selverston and Miller, 1980; Getting, 1981; Hume and Getting, 1982; Hume et al., 1982; Miller and Selverston, 1982a,b; Pearson et al., 1985; Katz, 1996; Marder and Calabrese, 1996; Perrins and Weiss, 1996; Schmidt et al., 2001; Sasaki et al., 2007; Calabrese et al., 2011).

In contrast, developing relatively unambiguous connectivity diagrams for circuits with larger numbers of neurons such as those found in most vertebrate nervous systems has been historically more difficult, partially because neuronal identification has been challenging. This is starting to change with the advent of new genetic and molecular techniques. Nonetheless, classification of neurons into types and subtypes is not yet routine in larger networks (Jonas et al., 2004; Sugino et al., 2006; Toledo-Rodriguez and Markram, 2007; Miller et al., 2008; Okaty et al., 2011a,b), and a variety of electrophysiological measures are often used to classify neurons into types and subtypes.

The use of electrophysiological measurements alone for identification can be potentially problematical, as many neurons can change their activity patterns as a function of neuromodulation and activation of modulatory pathways (Dickinson et al., 1990; Meyrand et al., 1991; Weimann et al., 1991; Weimann and Marder, 1994). Moreover, recent work has shown that the same identified neurons can show large ranges in the values of many conventional measures of intrinsic excitability (intrinsic properties; IPs) in different animals (Grashow et al., 2010). In this study we compare directly the utility of six IPs with less conventional measures obtained by introducing a biological neuron into an artificial network with an oscillatory model neuron, and analyzing the resulting activity (network activity properties, NAPs). IPs are obtained via open-loop stimulation, whereas the NAPs are obtained in closed loop: the dynamic clamp injects current into the biological neuron based on the state of the model neuron, which is in turn affected by the biological neuron. The NAPs nevertheless constitute a measure of the biological neuron's properties, because the model neuron is standardized, and thus differences in NAPs between experiments must originate in differences in the neurons themselves.

Our hypothesis was that a measure of intrinsic excitability that challenges a biological neuron with a time-varying closed-loop stimulation might better reveal its essential properties than more static measures. To this end, we searched for relationships between IPs and NAPs, and asked if either was effective in predicting the neuron's cell type.

Grashow et al. (2010) searched for relationships between IPs and NAPs, as pairwise correlations. We expected whole sets of properties to give more information about neuronal identity than individual properties, therefore we asked the related but distinct question: can the IPs be reconstructed from NAPs, and vice versa? Thus for each IP we chose a subset of NAPs and performed a linear regression to fit the IP with the chosen NAPs. We then searched for the subset of NAPs that gave the least regression error per degree of freedom. This search for the most relevant set of properties, known as "feature selection" (see Materials and Methods), found a small set of NAPs that correlated highly with each IP; and when applied conversely, a small set of IPs that correlated highly with each NAP. We were then confronted with the problem of assessing the significance of the correlation that we had discovered. Consequently, one of the goals of this paper was to develop a statistical procedure to correctly assess the significance of multiple correlations, especially those that arise from feature selection.

While Grashow et al. (2010) showed that IPs and NAPs differ across cell types, here we attempt to deduce neuronal type from properties. We trained linear classifiers to test if neuronal types fall into different (linearly separable) regions of property-space. We used *k*-medoids clustering to test if neuronal identity could be blindly discovered, looking only at the properties. One challenge of this approach is that the result of a clustering algorithm is the assignment of each cell to an essentially unlabeled cluster index, making assessment of clustering accuracy an issue. Particularly problematic is determining if differing clustering results from two different sets of properties are significant or merely statistical flukes. Therefore another goal was to develop a procedure to assess the significance of differences in clustering results.

## **MATERIALS AND METHODS**

The majority of the raw data used for these analyses was published in Grashow et al. (2010), and then supplemented with additional experiments. All experiments were done on identified neurons of the stomatogastric ganglion (STG) of the crab *Cancer borealis*. For each neuron we measured six traditional IPs, and with dynamic clamp, four NAPs. Details of the experimental methods are identical to those previously published (Grashow et al., 2010). Here we reiterate the essential details.

#### **ELECTROPHYSIOLOGY**

Recordings and current injections were performed in discontinuous current-clamp mode with sample rates between 1.8 and 2.1 kHz. Input resistance was measured as the slope of the voltage–current (*VI*) curve in response to hyperpolarizing current injections (voltage was measured after the neuron reached steady state). The frequency–current (FI) curve was measured as the response to depolarizing current injections (typically between 0 and 1 nA). In the STG cells that we assayed, the FI curve had a characteristic shape that was curved at lower injected currents, but became approximately linear at higher injected currents. We computed FI slope by fitting a line to the linear region of the FI curve. This linear region was determined by fitting a line to the FI curve, then progressively eliminating the point with the lowest current injected, until the residual error was small or only three points remained. The residual error was considered acceptable if the sum of squared error divided by the degrees of freedom (number of points minus two) was less than 2.0. Spike frequency at 1 nA was read from the FI curve. Minimum voltage with zero injected current was taken from a trace where the neuron was not perturbed (in silent cells, this would be the resting membrane potential).
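The linear-region search described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names `fit_line` and `fi_slope` and the example FI data are hypothetical, and the acceptance threshold of 2.0 is assumed to be expressed in squared units of spike frequency.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fi_slope(currents, rates, max_error=2.0, min_points=3):
    """Fit the linear region of an FI curve by dropping low-current points."""
    points = sorted(zip(currents, rates))  # ascending injected current
    while True:
        xs = [c for c, _ in points]
        ys = [r for _, r in points]
        slope, _, sse = fit_line(xs, ys)
        error_per_dof = sse / (len(points) - 2)  # dof = number of points minus two
        if error_per_dof < max_error or len(points) <= min_points:
            return slope, len(points)
        points = points[1:]  # eliminate the point with the lowest injected current


# Made-up FI curve: flat at low current, linear above 0.2 nA.
currents = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]  # nA
rates = [0.0, 0.0, 8.0, 16.0, 24.0, 32.0]  # Hz
slope, n_used = fi_slope(currents, rates)
```

In this example the fit to all six points fails the error criterion, so the lowest-current point is removed and the remaining five points define the slope.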

#### **CONSTRUCTING THE HYBRID CIRCUIT**

Real-time Linux dynamic clamp (Dorval et al., 2001), version 2.6, was run on an 800-MHz Dell Precision desktop computer. STG neurons were incorporated into a hybrid network with a simulated Morris and Lecar (1981) model. The biological neuron and the Morris–Lecar model neuron were connected with mutually inhibitory synapses, and artificial hyperpolarization-activated inward current (*I*h) was added to the biological neuron. The maximal conductance of the synapse from the model to the STG cell was *g*¯syn. The maximal conductance of the synapse from the biological cell to the Morris–Lecar cell was 2 × *g*¯syn. The maximal conductance of the *I*h current was *g*¯h. *g*¯syn and *g*¯h were each independently varied from 10 to 100 nS in 15 nS steps, forming a seven-by-seven grid from every possible combination.

Identically to the procedures in Grashow et al. (2010), the Morris–Lecar model contained a non-inactivating Ca<sup>2+</sup> conductance and a non-inactivating K<sup>+</sup> conductance, in addition to a leak conductance. The membrane voltage of the Morris–Lecar neuron was determined based on the following equations:

$$\begin{aligned} C\frac{dV}{dt} &= -\bar{g}\_{\text{leak}}\left(V - E\_{\text{leak}}\right) - \bar{g}\_{\text{Ca}}M\left(V - E\_{\text{Ca}}\right) - \bar{g}\_{\text{K}}N\left(V - E\_{\text{K}}\right) \\ \frac{dN}{dt} &= \tau\_{N}\left(N\_{\infty} - N\right) \\ \frac{dM}{dt} &= \tau\_{M}\left(M\_{\infty} - M\right) \\ M\_{\infty} &= \frac{1}{1 + \exp\left(\frac{-\left(V - V\_{1/2,\text{Ca}}\right)}{V\_{\text{slope,Ca}}}\right)} \\ N\_{\infty} &= \frac{1}{1 + \exp\left(\frac{-\left(V - V\_{1/2,\text{K}}\right)}{V\_{\text{slope,K}}}\right)} \\ \tau\_{N} &= \tau\_{0K}\,\text{sech}\left(\frac{V - V\_{1/2,\text{K}}}{2V\_{\text{slope,K}}}\right) \end{aligned}$$

The values of the fixed parameters are in **Table 1**. *C* was the membrane capacitance of the Morris–Lecar model neuron. *g*¯Ca, *g*¯K, and *g*¯leak were the maximal conductances for the Ca<sup>2+</sup>, K<sup>+</sup>, and leak conductances, respectively. *V*1/2,Ca was the half-activation voltage of the Ca<sup>2+</sup> conductance and *V*slope,Ca was the slope of the activation curve for *g*Ca<sup>2+</sup>. *E*Ca was the reversal potential for the Ca<sup>2+</sup> current, and τM was the time constant for *M*, the activation variable of the Ca<sup>2+</sup> conductance. *V*1/2,K was the half-activation of the K<sup>+</sup> current, *V*slope,K was the slope of the activation curve for the K<sup>+</sup> conductance and *E*K was the reversal potential of K<sup>+</sup>. τ0K is the scale factor for the time constant for *N*, the activation variable of the K<sup>+</sup> current. *E*leak is the reversal potential for the leak current.

**Table 1 | Values of parameters for the Morris–Lecar model cell, artificial hyperpolarization-activated currents, and artificial inhibitory synapses.**

The artificial *I*h (Buchholtz et al., 1992; Sharp et al., 1996) was described by the equations:

$$\begin{aligned} I\_{\mathrm{h}} &= \bar{g}\_{\mathrm{h}} R \, (E\_{\mathrm{h}} - V) \\ \frac{dR}{dt} &= k\_{\mathrm{R}} \, (R\_{\infty} - R) \end{aligned}$$

where

$$\begin{aligned} R\_{\infty}(V) &= \frac{1}{1 + \exp\left[\left(V - V\_{1/2}\right) / s\_{\mathrm{R}}\right]} \\ k\_{\mathrm{R}}(V) &= c\_{\mathrm{R}}\left\{1 + \exp\left[\left(V - V\_{\mathrm{kR}}\right) / s\_{\mathrm{kR}}\right]\right\} \end{aligned}$$

and *g*¯h (varied from 10 to 100 nS) was the maximal conductance of *I*h; *R* was the instantaneous activation; *R*∞ was the steady-state activation; *E*h was the *I*h reversal potential; *V*1/2 was the half-maximum activation; *s*R was the step width; *c*R was the rate constant; *V*kR was the half-maximum potential for the rate; and *s*kR was the step width for the rate.

The artificial inhibitory graded transmission synapse from the Morris–Lecar model to the biological cell was based on Sharp et al. (1996) and was described by the following equations:

$$\begin{aligned} I\_{\text{syn}} &= \bar{g}\_{\text{syn}} \cdot S \cdot \left( E\_{\text{syn}} - V\_{\text{post}} \right) \\ (1 - S\_{\infty}) \, \tau\_{\text{syn}} \frac{dS}{dt} &= (S\_{\infty} - S) \end{aligned}$$

where

$$S\_{\infty} \left( V\_{\text{pre}} \right) = \begin{cases} \tanh \left[ \left( V\_{\text{pre}} - V\_{1/2} \right) / V\_{\text{slope}} \right] & \text{if } V\_{\text{pre}} > V\_{1/2} \\ 0 & \text{otherwise} \end{cases}$$

and *g*¯syn (varied from 10 to 100 nS) was the maximal synaptic conductance; *S* was the instantaneous synaptic activation; *S*∞ was the steady-state synaptic activation. The reversal potential of the synaptic current, *E*syn, had different values in the two synapses: −80 mV when the biological neuron was postsynaptic, and −70 mV when the Morris–Lecar model was postsynaptic. *V*pre and *V*post are the presynaptic and postsynaptic potentials, respectively; τsyn was the time constant for synaptic decay; *V*1/2 was the synaptic half-activation voltage and *V*slope was the synaptic slope voltage.
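To make the synapse equations concrete, the sketch below advances the activation *S* by one forward-Euler step and returns the resulting synaptic current. It is an illustration only, not the dynamic clamp implementation: the function names, the integration scheme, the guard used when *S*∞ approaches 1, and all numerical values in the example are assumptions (the fixed parameters actually used are those of **Table 1**).

```python
import math


def s_inf(v_pre, v_half, v_slope):
    """Steady-state activation: tanh above V_1/2, zero otherwise."""
    if v_pre > v_half:
        return math.tanh((v_pre - v_half) / v_slope)
    return 0.0


def synapse_step(s, v_pre, v_post, dt, g_syn, e_syn, tau_syn, v_half, v_slope):
    """One forward-Euler step of the graded synapse; returns (new_s, i_syn)."""
    target = s_inf(v_pre, v_half, v_slope)
    # (1 - S_inf) * tau_syn * dS/dt = S_inf - S.
    # Assumption: clamp the effective time constant at dt so the step stays
    # stable when S_inf is close to 1.
    effective_tau = max((1.0 - target) * tau_syn, dt)
    new_s = s + dt * (target - s) / effective_tau
    i_syn = g_syn * new_s * (e_syn - v_post)  # nS * mV gives pA
    return new_s, i_syn


# Hypothetical values: presynaptic cell below V_1/2, so S decays with tau_syn.
new_s, i_syn = synapse_step(s=0.5, v_pre=-60.0, v_post=-50.0, dt=1.0,
                            g_syn=50.0, e_syn=-80.0, tau_syn=100.0,
                            v_half=-45.0, v_slope=5.0)
```

Because the factor (1 − *S*∞) multiplies the time constant, activation rises quickly when the presynaptic cell is strongly depolarized and decays with τsyn once the presynaptic voltage falls below *V*1/2.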

#### **ASSAYING HALF-CENTER ACTIVITY**

Network activity was classified into one of four categories. If the biological cell did not fire action potentials and had no oscillation, the network was "silent." If the biological cell had no spikes but did have a slow membrane potential oscillation, the network was "model dominated." If the biological cell fired action potentials, the network was either "half-center" or "bio-dominated." In half-center networks, the biological cell had slow membrane potential oscillations, and the predominant flow of synaptic current was from the bursting cell (the biological cell, then the model cell in alternation) to the non-bursting cell greater than 90% of the time.

#### **FINDING CORRELATIONS BETWEEN NAPs AND IPs**

For each IP, we searched for subsets of NAPs that were highly correlated with it. Conversely for each NAP we searched for subsets of IPs that were highly correlated with it. Both analyses were required because the problem is inherently asymmetric: it is possible for each individual NAP to be well-explained by a linear combination of IPs, but conversely have no individual IP that is well-explained by a linear combination of NAPs. We describe the algorithm to find subsets of NAPs that are highly correlated with an IP; the converse algorithm is identical, only with the data sets swapped.

We formed matrices *M*IP and *M*NAP, whose rows denote neuron identity and whose columns denote different IPs and NAPs respectively. We *z*-scored each matrix column to eliminate the effects of scaling and offset. For each IP (column of *M*IP) we searched for a subset of NAPs (columns of *M*NAP) and linear coefficients that approximated the column of *M*IP. For a given subset of NAPs we formed a reduced matrix *R*NAP that contained only the columns corresponding to the properties we chose, then performed linear regression to find the best linear coefficients. We assessed the quality of this fit, or "prediction error," as mean square error per degree of freedom, where the number of degrees of freedom was the number of rows (STG neurons) minus the number of columns of *R*NAP (properties in the subset). We associated this prediction error with the subset of NAPs, and found the best subset of NAPs by minimizing the prediction error with greedy feature selection.

#### **GREEDY FEATURE SELECTION**

The greedy algorithm is a heuristic approach to searching a large space of candidate features. It quickly selects a very good subset of features, although not necessarily the best. We initialized the greedy algorithm by creating an empty set of features, and declaring its prediction error to be infinite. We then proceeded iteratively as follows:


At the end of iteration, the output of the greedy algorithm was the best subset of features and their prediction error.
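A minimal sketch of the prediction error and of one common greedy variant (forward selection) is given below. The iteration rule is an assumption: at each step the sketch adds the unused feature that most lowers the prediction error and stops when no addition helps, which may differ in detail from the procedure used in the study. The names `solve`, `prediction_error`, and `greedy_select`, and the toy data, are hypothetical. Columns are assumed to be already *z*-scored (or at least zero-mean), so no offset term is fitted, and the selected columns are assumed to be linearly independent.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def prediction_error(columns, target):
    """Mean square regression error per degree of freedom (rows minus columns)."""
    k, n = len(columns), len(target)
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(p * t for p, t in zip(ci, target)) for ci in columns]
    coef = solve(gram, rhs)  # normal equations of the linear regression
    fitted = [sum(coef[j] * columns[j][i] for j in range(k)) for i in range(n)]
    return sum((t - f) ** 2 for t, f in zip(target, fitted)) / (n - k)


def greedy_select(features, target):
    """Forward selection starting from an empty set with infinite error."""
    chosen, best_error = [], float("inf")
    while len(chosen) < len(features):
        trials = [(prediction_error([features[f] for f in chosen + [name]], target), name)
                  for name in features if name not in chosen]
        error, name = min(trials)
        if error >= best_error:
            break  # no remaining feature improves the fit
        chosen.append(name)
        best_error = error
    return chosen, best_error


# Toy data: the target is exactly 2*a + b, and c is unrelated to it.
features = {"a": [-3, -1, 1, 3, -2, 2], "b": [1, -1, -1, 1, 0, 0], "c": [1, -3, 3, -1, 0, 0]}
target = [-5, -3, 1, 7, -4, 4]
chosen, best_error = greedy_select(features, target)
```

Dividing by the degrees of freedom penalizes larger subsets, so a feature is kept only if it reduces the squared error by more than the cost of the extra column.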

#### **ASSIGNING** *p***-VALUES TO CORRELATIONS**

Given enough properties, we expect to see large correlations between some of them, even if they are merely random numbers. This is exacerbated by the greedy algorithm, because it discovers the largest correlations and ignores small ones. To determine whether the correlations in the data were large enough to be likely real, we computed the probability that equally large correlations would be found randomly in uncorrelated data.

Correlations in data are related to ordering. Independently scrambling the order of rows of *M*IP and rows of *M*NAP eliminates any correlations between IPs and NAPs, while preserving their distributions as well as the relationships within the IPs and within the NAPs. Performing the greedy search for correlations on these scrambled data returns the prediction errors for correlations between unrelated data. Because there are many ways to scramble the rows, we did this repeatedly (10,000 scrambled trials) and obtained an empirical estimate of the null distribution for prediction errors. The *p*-value of a correlation arising randomly is the proportion of prediction errors in the null distribution that is lower than the prediction error from the unscrambled data.
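The scrambling procedure can be sketched as below. The sketch is generic: `error_fn` stands for whatever search returns a prediction error (in the study, the greedy search), and the stand-in `unexplained_variance`, the function names, and the toy data are hypothetical. Whole rows are shuffled, so relationships within each matrix are preserved while the pairing between the two matrices is destroyed.

```python
import random


def unexplained_variance(rows_a, rows_b):
    """Stand-in error: 1 - r^2 between the first column of each matrix."""
    xs = [r[0] for r in rows_a]
    ys = [r[0] for r in rows_b]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 1.0 - sxy * sxy / (sxx * syy)


def permutation_p_value(rows_a, rows_b, error_fn, n_trials=10000, seed=0):
    """Proportion of scrambled trials whose error is lower than the real one."""
    rng = random.Random(seed)
    observed = error_fn(rows_a, rows_b)
    lower = 0
    for _ in range(n_trials):
        shuffled_a = rows_a[:]
        shuffled_b = rows_b[:]
        rng.shuffle(shuffled_a)  # independent row scrambles break any
        rng.shuffle(shuffled_b)  # relationship between the two matrices
        if error_fn(shuffled_a, shuffled_b) < observed:
            lower += 1
    return lower / n_trials


# Toy data: the second matrix is an exact linear function of the first.
rows_a = [[i] for i in range(10)]
rows_b = [[2 * i + 1] for i in range(10)]
p_value = permutation_p_value(rows_a, rows_b, unexplained_variance, n_trials=200)
```

With a finite number of scrambled trials the smallest non-zero *p*-value that can be resolved is one over the number of trials, which is one reason to use many trials.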

#### **CORRECTING FOR MULTIPLE CORRELATIONS**

Because of the need to assign many *p*-values, we expected that several might yield apparently "significant" results by chance, thus *p*-values needed to be adjusted to compensate for this problem. We describe a method for evaluating and correcting *p*-values obtained from scrambled data, that is conceptually similar to the Holm–Bonferroni correction (Holm, 1979; Aickin and Gensler, 1996) for multiple comparisons. A fit to scrambled data is expected to occasionally produce outliers with very low errors, although it is not clear that these outliers would be concentrated on any particular IP or NAP. Thus to determine if our best fit is likely due to chance, we compare our best fit to the best fit for each scrambled trial, regardless of which property gave the best fit in the different scrambled trials.

Therefore we started by sorting the prediction errors from the correlation search into increasing order. Then we proceeded iteratively as with the Holm–Bonferroni technique, starting with the best fit (least prediction error). We sorted the prediction error from each scrambled trial into increasing order, generating an empirical estimate of the null distribution of the best fit. The adjusted *p*-value for the fit was the proportion of scrambled best fits that had lower error. If the adjusted *p*-value did not meet the significance criterion (*p* < 0.05) the fit and all higher error fits were not significant and the iteration stopped. Otherwise the fit was significant, and the corresponding property was removed from consideration in future *p*-values, both in the scrambled and unscrambled data. For example, if property P\_3 was the best fit and was found to be significant, then P\_3 would be removed from each scrambled trial regardless of whether it was the best fit for that scrambled trial. Then the iteration would continue with the next best fit. In this way (similar to Holm–Bonferroni), the best overall fit is compared to the best of *N* scrambled fits, the second-best is compared to *N* − 1, etc., until one of the fits is not significant. Furthermore, the data being removed from consideration are data that have already been shown to have correlations significantly better than chance.
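The step-down iteration can be sketched as follows, under the assumption that each scrambled trial is stored as a mapping from property name to the prediction error found for that property. The function name `adjusted_p_values`, the data layout, and the toy numbers are hypothetical.

```python
def adjusted_p_values(real_errors, scrambled_trials, alpha=0.05):
    """Step-down correction for multiple correlations.

    real_errors: {property: prediction error} from the unscrambled data.
    scrambled_trials: list of {property: prediction error}, one per trial.
    Returns adjusted p-values for every significant fit plus the first
    non-significant fit, at which the iteration stops.
    """
    remaining = set(real_errors)
    results = {}
    for prop in sorted(real_errors, key=real_errors.get):  # best fit first
        # Best fit of each scrambled trial among properties still in play.
        best_scrambled = [min(trial[p] for p in remaining) for trial in scrambled_trials]
        lower = sum(1 for e in best_scrambled if e < real_errors[prop])
        p_adj = lower / len(scrambled_trials)
        results[prop] = p_adj
        if p_adj >= alpha:
            break  # this fit and all higher-error fits are not significant
        remaining.discard(prop)  # removed from scrambled and unscrambled data
    return results


# Toy example with three properties and 20 scrambled trials.
real_errors = {"P1": 10, "P2": 50, "P3": 89.5}
scrambled_trials = [{"P1": 60 + i, "P2": 70 + i, "P3": 80 + i} for i in range(20)]
p_adjusted = adjusted_p_values(real_errors, scrambled_trials)
```

In the toy example the two best fits beat every scrambled best fit, whereas the third is beaten by half of them, so the iteration stops there.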

#### **LINEAR CLASSIFICATION**

Linear classifiers were constructed of four binary classifiers, one for each neuronal type. Each binary classifier computed the likelihood that a set of properties belonged to a cell corresponding to the classifier's type. The likelihood function for binary classifiers was a logistic function acting on a linear function of *z*-scored neuronal properties (Bishop, 1996; Taylor et al., 2006). If there are *N* properties, then the likelihood *L* was calculated as

$$L(\overrightarrow{W}) = P\left(w\_0 + p\_1 w\_1 + p\_2 w\_2 + \dots + p\_N w\_N\right),$$

where *pn* are the *z*-scored properties, *wn* are the weights (*w0* is an offset), and *P* is the logistic function,

$$P(\mathbf{x}) = \frac{1}{1 + \exp(-\mathbf{x})}.$$

The weights for all binary classifiers were trained simultaneously using the whole dataset (or a subset if we were using cross-validation). To train the weights, we used a "soft max" function

$$S = \frac{L\_{\text{correct}}}{L\_{\text{DG}} + L\_{\text{PD}} + L\_{\text{GM}} + L\_{\text{LP}}},$$

where LDG, LPD, LGM, and LLP were the likelihoods computed for each cell type, and Lcorrect is the likelihood for the correct cell type for a given cell. When training the classifier, we maximized the sum of the log-likelihood of *S* over the whole dataset (or a subset if we were using cross-validation),

$$\overrightarrow{W} = \arg\max\_{\overrightarrow{W}} \sum\_{k \in \{\text{neurons}\}} \log \left( S\_{k} (\overrightarrow{W}) \right)$$

The optimization was performed using an iterative line-search method (Bishop, 1996), initialized with Fisher's linear discriminant (Bishop, 1996). When determining the results of the trained classifier, we determined the cell type as the one with the greatest likelihood.
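The classifier's likelihoods, the "soft max" score, and the training objective can be written compactly as below. This sketch only evaluates the quantities defined above; it does not implement the line-search optimization or the Fisher initialization, and the weights and properties in the example are made-up values chosen for illustration.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def likelihoods(properties, weights):
    """weights: {cell_type: [w0, w1, ..., wN]}; properties: z-scored [p1, ..., pN]."""
    return {
        cell_type: logistic(w[0] + sum(p * wi for p, wi in zip(properties, w[1:])))
        for cell_type, w in weights.items()
    }


def soft_max_score(properties, weights, correct_type):
    """S = L_correct divided by the sum of likelihoods over all cell types."""
    L = likelihoods(properties, weights)
    return L[correct_type] / sum(L.values())


def log_likelihood(dataset, weights):
    """Training objective: sum over neurons of log S (to be maximized)."""
    return sum(math.log(soft_max_score(p, weights, t)) for p, t in dataset)


def classify(properties, weights):
    """Predicted type is the one whose binary classifier is most likely."""
    L = likelihoods(properties, weights)
    return max(L, key=L.get)


# Hypothetical weights for two z-scored properties.
weights = {"PD": [0.0, 1.0, 0.0], "LP": [0.0, -1.0, 0.0],
           "GM": [0.0, 0.0, 1.0], "DG": [0.0, 0.0, -1.0]}
cell = [2.0, 0.0]
predicted = classify(cell, weights)
score = soft_max_score(cell, weights, "PD")
objective = log_likelihood([(cell, "PD")], weights)
```

Normalizing by the sum of all four likelihoods means that raising the likelihood of the correct type only helps if the likelihoods of the other types do not rise as much.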

To determine how well linear classification generalized, we used leave-two-out cross-validation. One thousand trials were conducted. In each trial the properties for two randomly chosen neurons were selected to be test data, and a linear classifier was trained on the remaining data. After the classifier was trained, the test data were classified. The cross-validation accuracy was the proportion of test data that were classified correctly.

#### **FINDING CLUSTERS**

We used *k*-medoids (Hastie et al., 2009) to blindly categorize STG neurons based on their *z*-scored properties. We set *k* = 4 and measured distance between points using the *L1* (taxicab) norm. The *k*-medoids algorithm was initialized using the same procedure as *k*-means++ (Arthur and Vassilvitskii, 2007). Because the initialization is not deterministic, we used 200 trials, using the results of the trial that had the smallest average distance from each point to its medoid.
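A compact sketch of this clustering procedure is shown below. It uses the simple alternating variant of *k*-medoids (assign points to the nearest medoid, then move each medoid to the member with the smallest total distance to its cluster), which is an assumption about the update rule; the function names, the iteration cap, and the toy points are hypothetical. The sketch assumes that the data contain at least *k* distinct points.

```python
import random


def l1(a, b):
    """Taxicab (L1) distance between two property vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def seed_medoids(points, k, rng):
    """k-means++-style seeding: distant points are more likely to be picked."""
    medoids = [rng.randrange(len(points))]
    while len(medoids) < k:
        weights = [min(l1(p, points[m]) for m in medoids) ** 2 for p in points]
        medoids.append(rng.choices(range(len(points)), weights=weights)[0])
    return medoids


def assign(points, medoids):
    """Label each point with the index (0..k-1) of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: l1(p, points[medoids[j]]))
            for p in points]


def k_medoids(points, k, n_trials=200, max_iter=100, seed=0):
    """Return (mean distance to medoid, labels, medoid indices) of the best trial."""
    rng = random.Random(seed)
    best = None
    for _trial in range(n_trials):
        medoids = seed_medoids(points, k, rng)
        for _step in range(max_iter):
            labels = assign(points, medoids)
            updated = []
            for j in range(k):
                members = [i for i, lab in enumerate(labels) if lab == j]
                # New medoid: member with the smallest total distance to its cluster.
                updated.append(min(members, key=lambda i: sum(
                    l1(points[i], points[m]) for m in members)))
            if updated == medoids:
                break
            medoids = updated
        labels = assign(points, medoids)
        cost = sum(l1(p, points[medoids[lab]]) for p, lab in zip(points, labels))
        cost /= len(points)
        if best is None or cost < best[0]:
            best = (cost, labels, medoids)
    return best


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cost, labels, medoids = k_medoids(points, k=2, n_trials=20)
```

Keeping the trial with the smallest mean distance to the medoid makes the final result much less dependent on any single random initialization.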

The results of the clustering allow construction of a contingency table called a "confusion matrix." The rows of the matrix correspond to STG cell type, the columns to cluster label, and the entries are the number of cells so categorized [e.g., the gastric mill (GM), 2 entry corresponds to the number of GM cells grouped into cluster 2]. Cluster labels were assigned the cell identity that maximized the proportion of cells correctly identified (proportion correct). Mutual information (MI) was computed as in Vinh et al. (2009). Adjusted mutual information (AMI) was computed as

$$\text{AMI} = \frac{\text{MI} - \overline{\text{MI}}}{\text{MI}\_{\text{max}} - \overline{\text{MI}}},$$

where $\overline{\text{MI}}$ is the expected value of MI for a random clustering, and MImax is the maximum possible MI. MImax was computed as in Vinh et al. (2009). We computed $\overline{\text{MI}}$ as the mean of the distribution of MI for random clustering, generated using a bootstrap technique.
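As an illustration, MI can be computed directly from a confusion matrix, and AMI then follows from the formula above. In this sketch MI is expressed in nats (natural logarithm), which is an assumption about units; the expected and maximum MI are passed in as arguments because their computation follows Vinh et al. (2009) and the bootstrap, and the numbers used in the example are made up.

```python
import math


def mutual_information(confusion):
    """MI (in nats) between row labels (cell type) and column labels (cluster)."""
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    mi = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                mi += (count / total) * math.log(
                    count * total / (row_sums[i] * col_sums[j]))
    return mi


def adjusted_mutual_information(mi, mi_expected, mi_max):
    """AMI = (MI - expected MI) / (maximum MI - expected MI)."""
    return (mi - mi_expected) / (mi_max - mi_expected)


# Two toy confusion matrices: perfect clustering and uninformative clustering.
mi_perfect = mutual_information([[5, 0], [0, 5]])
mi_random = mutual_information([[5, 5], [5, 5]])
ami_example = adjusted_mutual_information(0.5, 0.1, 0.9)  # made-up inputs
```

AMI is 0 when a clustering is no more informative than a random one and 1 when MI reaches its maximum.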

#### **GENERATING THE DISTRIBUTION OF MUTUAL INFORMATION VIA BOOTSTRAP**

We modeled the confusion matrix as being generated by binomial random numbers [e.g., if two out of 13 dorsal gastric (DG) neurons were in cluster 1, the DG,1 entry was modeled as binomial random with a maximum value of 13 and an expectation value of 2]. From a given confusion matrix, we randomly generated synthetic confusion matrices using the same binomial distributions for each entry, and computed MI from these synthetic confusion matrices. We used 10,000 synthetic confusion matrices to generate an empirical distribution of MI. Because the clustering algorithm will always place at least one cell in every cluster, any synthetic confusion matrices with a column of all zeros were discarded and regenerated. To generate the MI distribution for real data sets, we used the confusion matrix generated by the results of the *k*-medoids algorithm. To generate the MI distribution for clustering of random data sets, we used a confusion matrix with identical columns, and rows that summed to the number of cells in our actual data set (e.g., the sum of the DG row was equal to the number of DG cells in our data).

#### **COMPUTING** *P***-VALUES FOR DIFFERENCES IN CLUSTERING**

We did not seek to compute a rigorous probability that one set of properties is inherently superior to another with regard to clustering performance. Instead we asked if the difference in MI between two clustering results (MIlow and MIhigh) can be plausibly ascribed merely to fluctuations in the number of cells in each cluster. To compute *p* between two clusterings, we used the bootstrap method to obtain the distribution of MIlow. We then calculated *p* as the proportion of synthetic MIlow that is greater than the actual MIhigh. We called differences with *p* < 0.05 "significant."
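The bootstrap and the resulting *p*-value can be sketched as follows. Each entry of the confusion matrix is redrawn as a binomial count whose maximum is the row total and whose expectation is the observed count; the function names, the use of natural logarithms for MI, and the toy inputs are assumptions. The sketch also assumes that every row and every column of the input matrix contains at least one cell.

```python
import math
import random


def mutual_information(confusion):
    """MI in nats between row labels and column labels of a confusion matrix."""
    total = sum(sum(row) for row in confusion)
    rows = [sum(row) for row in confusion]
    cols = [sum(col) for col in zip(*confusion)]
    return sum((c / total) * math.log(c * total / (rows[i] * cols[j]))
               for i, row in enumerate(confusion)
               for j, c in enumerate(row) if c > 0)


def synthetic_matrix(confusion, rng):
    """Redraw every entry as binomial(n = row total, p = count / row total)."""
    while True:
        synthetic = []
        for row in confusion:
            n = sum(row)
            synthetic.append([sum(rng.random() < count / n for _ in range(n))
                              for count in row])
        # Discard and regenerate matrices that contain an empty cluster.
        if all(sum(col) > 0 for col in zip(*synthetic)):
            return synthetic


def mi_distribution(confusion, n_samples=10000, seed=0):
    rng = random.Random(seed)
    return [mutual_information(synthetic_matrix(confusion, rng))
            for _ in range(n_samples)]


def clustering_p_value(confusion_low, mi_high, n_samples=10000, seed=0):
    """Proportion of synthetic MI_low values that exceed the actual MI_high."""
    samples = mi_distribution(confusion_low, n_samples, seed)
    return sum(1 for mi in samples if mi > mi_high) / n_samples


# Toy input: an uninformative two-cluster result.
confusion_low = [[5, 5], [5, 5]]
p_never = clustering_p_value(confusion_low, mi_high=0.7, n_samples=200)
p_always = clustering_p_value(confusion_low, mi_high=-1.0, n_samples=200)
```

The two toy calls bracket the possible outcomes: with two clusters MI cannot exceed ln 2 ≈ 0.693, so no synthetic value exceeds 0.7, whereas every synthetic value exceeds −1.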

#### **SOURCE CODE**

Source code implementing the statistical methods that we developed is hosted permanently at http://www.bio.brandeis.edu/MarderLabCode/

## **RESULTS**

The STG of the crab *C. borealis* has 26–27 neurons that can be reliably identified according to their projection patterns (Marder and Bucher, 2007). Each STG has two pyloric dilator (PD) neurons, one lateral pyloric (LP) neuron, four GM neurons, and one DG neuron. The data in this paper come from 55 neurons (PD *n* = 13; LP *n* = 15; GM *n* = 14; DG *n* = 13). The PD and LP neurons are part of the circuit that generates the fast (period ∼1 s) pyloric rhythm and the GM and DG neurons are part of the circuit that generates the slow (period 6–10 s) gastric mill rhythm.

Conventional IPs were measured by injecting current steps and ramps into individual neurons to measure input resistance, spike threshold voltage, FI slope, spike frequency with 1 nA injected current, spike height, and minimum voltage with zero injected current. **Figure 1A** shows a recording of a DG neuron in response to a current ramp, showing the voltage at threshold and the spike height. **Figure 1B** shows the same cell in response to depolarizing current pulses of different amplitudes. **Figure 1C** shows the plot of spike frequency vs. injected current. Spike frequency with 1 nA injected current can be read directly from this plot, while the linear fit (blue line) allows determination of FI slope. **Figure 2** summarizes all of the IPs that went into the analysis, with the new data points in color, and those from the prior study (Grashow et al., 2010) in gray. Note that the variance of each measure is considerable, and there is a great deal of overlap across cell types.

Because of the overlap in these measures even across neurons with very different characteristic behaviors during ongoing network activity, we reasoned that a set of properties that better captured the potential dynamics of the neurons in a closed-loop dynamic network might be more useful in characterizing these neurons than the conventional, open-loop IPs shown in **Figure 1**.

Stomatogastric ganglion neurons are part of circuits that are rhythmically active, so we sought a measure that would place these neurons into a rhythmically active circuit under experimenter control. Therefore we used the dynamic clamp to create two-cell circuits: one cell being the neuron to be evaluated, the second a standard model neuron used in all experiments. **Figure 3** shows the result of a dynamic clamp experiment in which an isolated DG neuron was coupled with reciprocal inhibition to a Morris–Lecar model neuron (Morris and Lecar, 1981; Grashow et al., 2010), and the strength of the synaptic conductances (*g*¯syn) and an imposed *I*h conductance (*g*¯h) were varied. A schematic of this circuit is shown in **Figure 3A**. **Figure 3B** shows the behavior of the model neuron and the biological neuron in the uncoupled state, and **Figures 3C,D** show different patterns of resulting network activity. **Figure 3E** illustrates the case in which the model and biological neurons are firing in alternating bursts of activity, or half-center oscillations. We obtained NAPs exclusively by examining networks with half-center activity, because these were precisely the networks that exhibited rhythmic activity with the complex mix of spiking and slow membrane potential oscillations that characterizes the membrane potential trajectories that STG neurons display during ongoing pyloric and gastric mill rhythms.

Spike threshold voltage is the voltage at the point of maximum curvature (dashed line) before the first spike in response to a ramp of injected current. **(B)** The FI curve was obtained by measuring spike frequency in response to depolarizing current steps. **(C)** FI Slope is the slope of the best-fit line to the linear region of the FI curve. For this DG neuron, the four rightmost points were used (see Materials and Methods). The spike rate in response to 1 nA of injected current is (in this case) the last data point.

**Figure 3F** shows a map of the network behavior as *g*¯syn and *g*¯h were varied. The map positions that produced half-center alternating bursts are shown as red dots. For each of the 55 experiments, we used these maps to calculate the proportion of map positions that resulted in half-center activity. In the map shown in **Figure 3F**, this proportion was 11/49. For each set of parameters that gave half-center activity (map positions with alternating bursts) we calculated the half-center frequency and the number of spikes/burst in the biological neuron (**Figure 3G**). Because each biological neuron used had a different set of IPs, we expected that the map produced with each one would be different. The hypothesis was that features of these maps and of their half-center behavior (size, location, burst frequencies, number of spikes/burst of half-center activity) would constitute a data set that might more reliably capture the neurons' dynamics, and consequently their cellular identity, than the conventional measures of intrinsic excitability.
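The map-derived measures can be summarized with a few lines of code. The sketch below assumes a hypothetical data layout in which each map position is a dictionary, and it also includes the spread of *g*¯h among half-center positions (the "Std *g*¯h" of **Figure 3F**), computed here as a population SD, which is an assumption. The toy map is constructed to contain 11 half-center positions out of 49 and is not real data.

```python
import statistics


def network_activity_properties(networks):
    """Summarize one neuron's map.

    networks: list of dicts, one per (g_syn, g_h) map position, with keys
    'g_syn', 'g_h', 'half_center' (bool) and, for half-center positions,
    'frequency' (Hz) and 'spikes_per_burst'.
    Assumes at least one half-center position exists.
    """
    half_centers = [n for n in networks if n["half_center"]]
    return {
        "proportion_half_center": len(half_centers) / len(networks),
        "mean_frequency": statistics.mean(n["frequency"] for n in half_centers),
        "mean_spikes_per_burst": statistics.mean(
            n["spikes_per_burst"] for n in half_centers),
        "sd_g_h": statistics.pstdev(n["g_h"] for n in half_centers),
    }


# Toy seven-by-seven map (10 to 100 nS in 15 nS steps) with a horizontal swath.
grid = [10, 25, 40, 55, 70, 85, 100]
networks = []
for g_syn in grid:
    for g_h in grid:
        is_hc = g_h == 40 or (g_h == 55 and g_syn >= 55)
        entry = {"g_syn": g_syn, "g_h": g_h, "half_center": is_hc}
        if is_hc:
            entry.update(frequency=0.5, spikes_per_burst=4)
        networks.append(entry)
naps = network_activity_properties(networks)
```

Each neuron's map is thereby reduced to a small vector of summary values that can be compared across experiments.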

**FIGURE 3 | Network activity of the artificial circuit depends on** *(g***¯syn,** *g***¯h***)* **parameter values. (A)** Schematic for two-cell synthetic circuit. A model Morris–Lecar neuron is connected to a biological STG neuron (either DG, GM, LP, or PD) via artificial mutual inhibitory synapses. Dynamic clamp simulated the synapses, as well as injecting artificial *h*-conductance into the STG neuron. **(B)** Voltage traces from uncoupled (*g*¯ syn = 0, *g*¯<sup>h</sup> = 0)Morris–Lecar model (top) and DG neuron (bottom). **(C–E)** Voltage traces from connected circuit with different (*g*¯ *syn* , *g*¯*<sup>h</sup>* ) parameter values. Colors denote network activity classification: green traces denote

**Figure 4** presents the data from all of the half-centers found in the 55 experiments analyzed. In most of the maps half-center activity was found in a horizontal swath,indicating that half-center activity was more sensitive to *g*¯*<sup>h</sup>* than to *g*¯*syn*. The SD of *g*¯*<sup>h</sup>* provides a measure of the width of the horizontal swath. **Figure 4A** shows that although the variance of this measure for each cell type is considerable, the PDs had a larger SD than the other cell types. The proportion of half-centers in the 55 neurons is shown in **Figure 4B**. The mean half-center frequency was higher in the PD neuron set of networks (**Figure 4C**), and mean number of spikes/burst was lowest in the networks made with GM neurons (**Figure 4D**).

We refer to the four measures – "SD of *g*¯h,""proportion of halfcenter networks," "mean half-center frequency," and "mean spikes per burst" – as NAPs. There are many conceivable network properties; however we restricted the analysis to a handful that were simple to measure and that we reasoned would be related to both IPs and cell identity.

#### **CORRELATIONS BETWEEN NAPs AND IPs**

We first asked if there is any predictive relationship between these measures of network activity and conventional IPs. For each IP, we searched for the subset of NAPs that best predicted it (see Materials and Methods). Conversely, we looked for the subset of IPs that best predicted each NAP.

To assess the significance of any correlations we repeated the predictive analysis on 10,000 shuffled trials (see Materials and Methods). In a shuffled trial, we scrambled the cell identity while preserving the distribution of individual properties. The shuffled trials provided an empirical estimate of the null distribution for prediction error; and because the null distribution was estimated from trials with multiple correlations, we were able to correct for multiple correlations and calculate adjusted *p*-values (see Materials and Methods).

networks dominated by the DG neuron, blue denotes networks dominated by the Morris–Lecar neuron, and red denotes networks exhibiting half-center oscillations. **(C)** *ḡ*<sub>syn</sub> = 70, *ḡ*<sub>h</sub> = 55. **(D)** *ḡ*<sub>syn</sub> = 85, *ḡ*<sub>h</sub> = 10. **(E)** *ḡ*<sub>syn</sub> = 40, *ḡ*<sub>h</sub> = 40. **(F,G)** NAPs are representative of a biological neuron's overall response to all of the (*ḡ*<sub>syn</sub>, *ḡ*<sub>h</sub>) parameter values in a map. **(F)** The proportion of half-center networks is trivially obtained from the map. Std *ḡ*<sub>h</sub> is the SD of *ḡ*<sub>h</sub> values among half-center networks. **(G)** Spikes per burst and half-center frequency are both obtained from individual networks, then averaged over all half-center networks.

The results of this analysis are detailed in **Table 2**. **Figure 5** shows selected scatter-plots of several properties vs. the values predicted by their best-fit linear combination. Three of the six IPs were significantly predicted by NAPs, and all of the NAPs were significantly predicted by IPs. However, the *R*<sup>2</sup> values of the correlations were low, indicating weak predictive value.

#### **LINEAR CLASSIFIER**

We next asked whether we could reliably determine the identity of the four neuron types using the six measurements of IPs (**Figure 2**), using the four measurements of NAPs taken from the dynamic clamp networks (**Figure 4**), or by combining the two sets of data together. Given the large range of these measurements within a cell type and the overlap of the values across the cell types, it is clear that no single measure would reliably allow the identification of the neurons.

To identify neuron types by their properties, the different cell types must have properties that segregate into different clusters. To check if this is the case, we attempted to train a linear classifier to determine neuronal identity based on a given set of *z*-scored properties. The classifier was constructed of four binary linear classifiers, one for each neuronal class. Binary classifiers estimated the likelihood that a neuron's properties corresponded to the binary classifier's STG type. The likelihood was a number between zero and one; whichever binary classifier returned the highest likelihood "won," and the overall classifier then determined that the properties belonged to a neuron of the corresponding type.

Conceptually, for a set of *N* properties, a linear classifier describes four *N* − 1 dimensional oriented hyperplanes (one for each binary classifier), with all the neurons of the correct type on the "plus" side of a hyperplane, and the remaining neurons on the "minus" side. In practice, the situation may be less straightforward, with all four hyperplanes being in compromise positions, allowing their pooled information to determine identity. In such a situation, placement and orientation of hyperplanes may depend on outlier points, especially when the number of cells is small but *N* is large.
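The one-classifier-per-type scheme described above can be sketched with a small one-vs-rest logistic model. The training rule (batch gradient descent), learning rate, and number of epochs are assumptions chosen for illustration; the text does not specify how the binary classifiers were fit.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """z-score each property (column) across all cells."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def train_binary(rows, targets, lr=0.5, epochs=300):
    """Fit one hyperplane (w, b) separating one cell type from the rest."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(rows, targets):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def train_one_vs_rest(rows, labels):
    """One binary classifier per cell type."""
    return {c: train_binary(rows, [1.0 if l == c else 0.0 for l in labels])
            for c in sorted(set(labels))}

def classify(model, x):
    """Each binary classifier returns a value in (0, 1); the highest wins."""
    scores = {c: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for c, (w, b) in model.items()}
    return max(scores, key=scores.get)
```

Each `(w, b)` pair defines one of the oriented hyperplanes: cells with `w·x + b > 0` lie on its "plus" side.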

We trained a linear classifier for each set of properties. The classifier based on IPs identified 85% of cells correctly, the classifier based on NAPs identified 84% correctly, and the combined properties classifier identified 100% of cells correctly (**Figure 6**). However, these high accuracies were partially due to overfitting outlier points. When we tested the generalizability of these classifiers with leave-two-out cross-validation (see Materials and Methods), the accuracy of each classifier dropped somewhat: IPs classified 64% correctly, NAPs 68%, and the combined data 78%. Thus only by combining the two data sets can the cell types be distinguished on the basis of properties, and even then the boundary between them is complex and dependent upon the position of outlier points.
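The gap between training accuracy and cross-validated accuracy comes from scoring each cell only with a classifier that never saw it. A minimal leave-two-out loop is sketched below. It holds out every possible pair of cells, which is an assumption about the scheme, and it uses a nearest-centroid rule as a stand-in for the linear classifier purely to keep the example short.

```python
from itertools import combinations

def nearest_centroid_fit(rows, labels):
    """Stand-in classifier: store the mean property vector of each type."""
    sums, counts = {}, {}
    for x, l in zip(rows, labels):
        acc = sums.setdefault(l, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[l] = counts.get(l, 0) + 1
    return {l: [v / counts[l] for v in acc] for l, acc in sums.items()}

def nearest_centroid_predict(centroids, x):
    """Assign the type whose centroid is closest in Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda l: dist2(centroids[l]))

def leave_two_out_accuracy(rows, labels):
    """Hold out every pair of cells, train on the rest, score the pair."""
    correct = total = 0
    for i, j in combinations(range(len(rows)), 2):
        train_rows = [r for k, r in enumerate(rows) if k not in (i, j)]
        train_labels = [l for k, l in enumerate(labels) if k not in (i, j)]
        centroids = nearest_centroid_fit(train_rows, train_labels)
        for k in (i, j):
            correct += nearest_centroid_predict(centroids, rows[k]) == labels[k]
            total += 1
    return correct / total
```

The number of held-out pairs grows as n(n − 1)/2, so this exhaustive version is practical only for small data sets such as the one analyzed here.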

#### **FINDING CLUSTERS**

Ideally, one would like to identify neuron types blindly, not merely verify that their properties fall into clusters. This approach would be applicable to a system where cells cannot be unambiguously identified as they can in the STG. We used the *k*-medoids algorithm (Hastie et al., 2009) with *k* = 4 to find clusters of properties. However, clusters do not directly correspond to any particular cell type (i.e., after running *k*-medoids, a cell is labeled "cluster 2" not "GM"). To address this issue, we computed two measures of accuracy. We assigned cluster labels to cell identity to maximize the number of cells that are correctly categorized, and computed the "proportion correct." This number is necessarily in the interval between 1/*k* and 1. We also computed the MI between the cluster labels and the cell identities. By appropriately scaling the MI we computed the AMI, which has a maximum value of one, and an expected value of zero for random numbers. In addition to our real data, we applied the clustering technique to a synthetic set of properties generated from Gaussian random numbers, to illustrate chance results. We used a bootstrap technique to estimate *p*-values for significant differences in MI between clusterings.
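The two accuracy measures can be sketched as follows. The function names are hypothetical, the logarithm base (bits) is an assumption because the text does not state the units of MI, and the chance-corrected AMI is not reproduced here. The sketch also assumes there are at least as many cell types as clusters.

```python
from collections import Counter
from itertools import permutations
from math import log2

def proportion_correct(cluster_ids, cell_types):
    """Best achievable agreement over all one-to-one cluster-to-type labelings."""
    clusters = sorted(set(cluster_ids))
    types = sorted(set(cell_types))
    pairs = Counter(zip(cluster_ids, cell_types))
    best = 0
    # Try every assignment of a distinct cell type to each cluster.
    for perm in permutations(types, len(clusters)):
        matched = sum(pairs[(c, t)] for c, t in zip(clusters, perm))
        best = max(best, matched)
    return best / len(cell_types)

def mutual_information(cluster_ids, cell_types):
    """MI (in bits) between cluster labels and cell identities."""
    n = len(cell_types)
    pairs = Counter(zip(cluster_ids, cell_types))
    n_cluster = Counter(cluster_ids)
    n_type = Counter(cell_types)
    return sum((nij / n) * log2(nij * n / (n_cluster[c] * n_type[t]))
               for (c, t), nij in pairs.items())
```

For *k* = 4 the exhaustive search covers only 24 labelings, so no specialized assignment algorithm is needed.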

Both of the accuracy measures (MI/AMI and proportion correct) showed the same general trends. No set of properties was able to correctly identify all cells. When we performed *k*-medoids clustering on Gaussian random numbers, as expected we obtained no information (proportion correct = 0.29, MI = 0.089, AMI = −0.002). Real neuronal properties were able to obtain significant information: for IPs proportion correct = 0.60, MI = 0.36, and AMI = 0.21; for NAPs proportion correct = 0.69, MI = 0.72, and AMI = 0.49; and for both sets joined proportion correct = 0.84, MI = 0.90, and AMI = 0.62. The results of clustering on the combined properties are depicted in **Figure 7A**. All sets of STG properties achieved results significantly better than random numbers, with *p* < 0.001. The differences between sets of properties were also significant (NAPs vs. IPs *p* = 0.01, combined properties vs. IPs *p* < 0.001, combined properties vs. NAPs *p* = 0.03). The quantification of accuracy is summarized in **Figure 7B**. Together these results show that NAPs encode more information about cell identity than traditional IPs, but both sets contain distinct information (**Figure 7C**).

#### **DISCUSSION**

Establishing reasonable and reliable methods for classifying and characterizing neurons has been a far thornier practical problem than might have been predicted from first principles. While there are large and uniquely identifiable neurons in small invertebrate nervous systems and in the spinal cords of fish and frogs, in most regions of the vertebrate central nervous system and in many brain areas in invertebrates, cell identification cannot be achieved by size or location of the neurons alone. The electrophysiological firing patterns of many neurons change (Dickinson et al., 1990; Weimann et al., 1991), either as a consequence of neuromodulation, development, or disease. Transmitter phenotype (Borodinsky et al., 2004) and transcription factor expression (William et al., 2003; Wienecke et al., 2010) are often developmentally or activity regulated, complicating the use of a single chemical marker to identify neurons, although chemical markers may be sufficient at times (Zagoraiou et al., 2009). Neuronal projection patterns to distant targets such as muscles or other brain regions often provide unambiguous identification, but when multiple cell types are entangled in local circuits, even projection patterns may not be sufficient. These issues are further confounded by the large variance measured in a variety of properties of individual neurons (Getting, 1981; Hume and Getting, 1982; Swensen and Bean, 2005; Schulz et al., 2006, 2007; Goaillard et al., 2009; Tobin et al., 2009; Grashow et al., 2010). This raises the question of whether combining multiple measures can serve to better cluster or identify neurons, and if so, what kinds of assays are potentially more useful than conventional measures of intrinsic excitability.

#### **Table 2 | Correlations between IPs and NAPs.**

*Each property (in bold) was fit with one or more properties from the other data set. The optimal set of fitting parameters was determined via greedy feature selection. We list the resulting R<sup>2</sup>, adjusted p-value, and optimal fitting properties for each property. p-Values are adjusted for multiple correlations and the feature-selection process.*

In this study we used six conventional measures of neuronal IPs and four measures of how neurons behaved in an artificial network to determine whether any or all of these measures could correctly cluster and identify neurons whose identity was already known. This exercise highlighted a number of difficulties that, to a greater or lesser degree, will potentially plague investigators wishing to use electrophysiological measures to identify neurons. It is clear from the variance in each of the IPs across individual neurons of the same class and from the overlap of these values across cell types, that no single measure would reliably serve to identify the neurons (**Figure 2**; Grashow et al., 2010). Therefore, our goal was to determine whether the combined set of electrophysiological measures would reliably allow us to cluster the neurons into groups that mapped correctly with their identity.

One might naively think that increasing the number of electrophysiological measures performed for each neuron would increase the likelihood of proper identification. This might appear to be especially the case if each measure probes a different essential feature of the cell's performance. For that reason, we chose to embed each neuron in an artificial network that we reasoned would test its dynamic behavior differently than the conventional measures of excitability. Nonetheless, increasing the number of measures made for each neuron tested comes with a statistical cost, as each additional measure increases the likelihood of finding spurious correlations. Because it is necessary to correct for this before assigning statistical significance, adding measures that even partially sample the same biophysical attributes of a neuron may be more counterproductive than helpful. For some cell types a different set of measured IPs or NAPs might be more useful than those studied here. Nonetheless, our point is simply to say that "more is not necessarily better." To this end, in choosing the NAPs that we included, we used our biological intuition to select four that appeared to be relatively functionally independent of each other, and we discarded many other potential network measures that might have added relatively little to the analysis and added a substantial multiple comparison statistical burden. Obviously, these problems are more acute with relatively modest-sized data sets, such as that analyzed here, and become less acute with data sets with *n*'s in the several thousand (at which point it is also possible to use additional methods).

Because STG neurons can be recorded from intracellularly for many hours, it was experimentally feasible for us to ask whether these NAPs would be useful in neuron characterization. We were surprised that the networks did not provide more information than they did, and we do not expect or recommend that an investigator working in systems where recordings cannot be maintained for many hours attempt the same process, although other closed-loop measures that are less time-consuming could be devised.

We assessed significance of correlations with a custom bootstrapping method that combines shuffled trials and the Holm–Bonferroni correction for multiple comparisons. It is commonly assumed that correlations do not need to be corrected because they are only indicative of interesting relationships, not a rigorous test in themselves. However, the process of finding correlations may be *too* effective – if it can find seemingly strong correlations in random data, then there may be confusion between correlations that are indicative of real relationships, and those that are most likely spurious.

Data-mining commonly produces large spurious correlations. When we simply applied a Pearson correlation test, every correlation was "significant" and there were several *p*-values less than 10<sup>−10</sup>. Using shuffled trials properly accounts for the power of data-mining and dramatically increased the magnitude of the *p*-values (many were still highly significant). Building Holm–Bonferroni into the procedure allowed further correction for investigating many properties.

**FIGURE 6 | A linear classifier can be trained to correctly identify cell type.** One hundred percent of neurons were correctly identified when a linear classifier was trained on combined IPs and NAPs. To visualize the grouping of neurons by property, we projected the 10-dimensional space of properties down to two dimensions. Analogously to the results of Principal Component Analysis, we determined components (combinations of properties) that pointed along directions of particular interest. These directions (Dimension 1 and Dimension 2) were chosen to provide maximal spacing between the different neuronal types.

Although these two methods (comparing to shuffled trials, and adjusting *p*-values with the Holm–Bonferroni method) are commonly used separately, we are not aware of them being used together as done here. However, we believe it is necessary to combine them because we had no *a priori* hypothesis about which correlations were likely to be most significant. After searching a data set for correlations we find several with varying strengths, and want to know which correlations are weak enough that they could plausibly be drawn from the distribution of expected spurious correlations. To do this it was necessary to keep track of all the correlations for each shuffled trial, and therefore integrate the multiple comparisons correction into the shuffled trial structure.
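For reference, the standard Holm–Bonferroni step-down adjustment on its own can be written in a few lines. This sketch shows only the textbook correction applied to a list of raw *p*-values; it does not reproduce the combined shuffled-trial procedure described here, and the function name is hypothetical.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the original order."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A correlation is then declared significant at level α when its adjusted *p*-value is at most α.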

The results of our correlation analysis appear to differ from those of Grashow et al. (2010), which analyzed much of the same data. However, it should be borne in mind that the two analyses ask different questions. Grashow et al. (2010) asked, "What are the pairwise relationships between these different properties?", while here we asked "How well can we reconstruct one set of properties from the other?" Both are potentially interesting questions, and have their strengths and weaknesses.

Grashow et al. (2010) tested all possible pairwise correlations (48 total) between larger property sets. Because of this, their correction for multiple comparisons was quite large. We searched for the best linear reconstruction for each property (10 total), thus incurring a smaller penalty for multiple comparisons. However, the approach here may not find all the correlations that are of potential interest. For instance, if an NAP is highly correlated with two IPs, and those two IPs are highly correlated with each other, then this approach is unlikely to discover that the NAP is correlated with both IPs. More likely it would only discover one of them. This is because adding the second IP will decrease the degrees of freedom without substantially improving the fit, leading the combination to be heavily discounted by the feature-selection algorithm. Thus some of the differences between the two studies result from asking different questions.

However, some differences between the studies were due to improvements in methodology. As is often done, Grashow et al. (2010) computed adjusted *p*-values by using the Pearson correlation coefficient test for normally distributed data, then applying the Holm–Bonferroni correction for multiple comparisons. These data exhibit deviations from normality (e.g., **Figure 2F**, where LP exhibits substantial skew, and PD appears to be bimodal) and thus the Pearson correlation test is not fully appropriate. Furthermore the Holm–Bonferroni correction may be overly conservative because it does not account for potential relationships between the different quantities whose correlation is being tested. Here, basing the significance test on scrambled trials, we avoided making assumptions about the distribution of the data. By incorporating the adjustment for multiple correlations into the scrambled trials, we directly computed the approximate null distribution for the *n*th-best correlation. This enabled us to compute a *p*-value that is conservative enough to account for multiple correlations without being unnecessarily over-conservative.

In general, the methods we used – searching for correlations and clustering – decrease in effectiveness as the dimensionality they must work in (the number of properties) increases. Increasing dimensionality increases the probability of finding correlations between random numbers, making corrections for multiple correlations more severe and therefore decreasing the ability to detect real correlations. Clustering algorithms suffer from a host of problems referred to collectively as "the curse of dimensionality" (Bishop, 1996; Beyer et al., 1999; Hinneburg et al., 2000; Houle et al., 2010), which cause them to find the "wrong" clusters. It is especially difficult if extraneous dimensions are added, because algorithms naturally tend to break up clusters at essentially meaningless gaps in the extraneous dimensions. We decreased the number of properties we considered to avoid this problem as much as possible.

In this analysis the information pertaining to cell identity can be thought of as a Venn diagram (**Figure 7C**), with IPs containing some information, NAPs containing more information, and some overlap between the two sets. It is possible that because we used six measures of IPs, thus a higher dimension than the NAPs, they are merely falling afoul of the curse of dimensionality and thus unfairly penalized when compared to the four NAPs. However, when the two sets are grouped together the resulting set of properties has still higher dimensionality and outperforms either one separately, suggesting that the Venn diagram is appropriately representing the data. Looking at the raw MI, if we naively assume that non-overlapping information combines additively, we see that the NAPs contain roughly twice as much information as the IPs, and that roughly half the information in the IPs is in the overlapping region of the Venn diagram (**Figure 7C**). Thus the NAPs do capture the neuronal dynamics of each cell type better than the conventional measures of IPs, although we cannot perfectly identify neurons only by their properties. The success of NAPs suggests that closed-loop dynamic current perturbations yield greater information about cell identity than static perturbations. However, the properties extracted from this perturbation must be chosen carefully, because the size of the data sets will never be large enough to search through the essentially infinite space of all possible properties.
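The arithmetic behind this estimate can be made explicit using the raw MI values reported in the Results (IPs 0.36, NAPs 0.72, combined 0.90) under the same naive additivity assumption. The symbols below are introduced here only for illustration:

$$
I_{\mathrm{overlap}} \approx I_{\mathrm{IP}} + I_{\mathrm{NAP}} - I_{\mathrm{combined}} = 0.36 + 0.72 - 0.90 = 0.18,
\qquad
\frac{I_{\mathrm{NAP}}}{I_{\mathrm{IP}}} = \frac{0.72}{0.36} = 2,
\qquad
\frac{I_{\mathrm{overlap}}}{I_{\mathrm{IP}}} = \frac{0.18}{0.36} = 0.5 .
$$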

We used *k*-medoids, one of the simplest clustering algorithms (Andreopoulos et al., 2009), but not always the best. The *k*-medoids algorithm is nearly the same as the *k*-means algorithm; however, clusters are represented by one of the members of the cluster (called the "medoid," and chosen to minimize distance to other members of the cluster) rather than the mean of members of the cluster. *k*-Medoids is more robust to outliers than *k*-means (Andreopoulos et al., 2009; Hastie et al., 2009) and is applicable in situations when computing mean objects is impossible or undesirable. *k*-Medoids is slower than *k*-means with very large data sets [selecting the medoid is O(*N*<sup>2</sup>) while computing the mean is O(*N*)]. However, as is common for electrophysiology, the data set studied here is small and we expect plentiful outliers due to the variability in neurons and noise inherent in measuring neuronal properties. In principle, more sophisticated density-based (Sander et al., 1998), nearest-neighbor-based (Ertöz et al., 2003; Bohm et al., 2004; Pei et al., 2009; Kriegel et al., 2011), or correlation-based (Kriegel et al., 2008) methods are capable of determining the number of clusters, recognizing extraneous dimensions, and finding clusters with complex shapes. When deciding on the methods we would use for this paper, we conducted pilot tests for a variety of clustering algorithms on synthetic data sets of a size comparable to our IPs and NAPs. In these pilot tests, we found that the more sophisticated clustering algorithms were less successful at identifying cluster membership (i.e., neuronal identity) correctly. With small data sets, there were inevitably extraneous large density fluctuations or extraneous correlations; therefore, for this study, *k*-medoids was superior by virtue of being simpler, but this would certainly change with a larger data set. These results suggest that with a large number of neurons and a small number of highly relevant properties, identifying cells via clustering is likely to be fruitful.
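The difference between a medoid and a mean, and the reason for the difference in cost, can be seen in a small sketch. The function names and the toy points are hypothetical, and Euclidean distance is an assumption.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def medoid(points):
    """Member of the cluster with the smallest total distance to the others.

    Every point is compared with every other point, hence O(N^2).
    """
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def centroid(points):
    """Coordinate-wise mean of the cluster, computed in a single O(N) pass."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

# Five tightly grouped points plus one extreme outlier.
cluster = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (50, 50)]
representative = medoid(cluster)   # stays inside the tight group
average = centroid(cluster)        # dragged toward the outlier
```

The medoid is always an actual cell's property vector, which is why the approach still applies when averaging the objects being clustered is not meaningful.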

In this analysis we implemented corrections for multiple comparisons and methods to determine the statistical significance of the resulting correlations. In many studies reporting correlations, corrections for multiple correlations were not made. Obviously, if the correlations are robust, they will persist after the appropriate corrections are made. Nonetheless, it is likely that some reported correlations in the literature would not have survived a more rigorous statistical analysis. Of course, it is essential to remember that a weak correlation may point to a fundamental biological insight, while a strong correlation may not always help illuminate an underlying biological process.

#### **ACKNOWLEDGMENTS**

We thank Dr. Adam Taylor for useful discussions. This work was supported by MH467842, T32 NS07292, and NS058110 from the National Institutes of Health.


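*Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.*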

*Received: 28 January 2012; accepted: 06 April 2012; published online: 27 April 2012.*

*Citation: Brookings T, Grashow R and Marder E (2012) Statistics of neuronal identification with open- and closedloop measures of intrinsic excitability. Front. Neural Circuits 6:19. doi: 10.3389/fncir.2012.00019*

*Copyright © 2012 Brookings, Grashow and Marder. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits noncommercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

## Nonlinear dynamical model based control of *in vitro* hippocampal output

*Min-Chi Hsiao<sup>1</sup>\*, Dong Song<sup>2</sup> and Theodore W. Berger<sup>3</sup>*

*<sup>1</sup> Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA*

*<sup>2</sup> Department of Biomedical Engineering, Center for Neural Engineering, University of Southern California, Los Angeles, CA, USA*

*<sup>3</sup> Department of Biomedical Engineering, Program in Neuroscience, and Center for Neural Engineering, University of Southern California, Los Angeles, CA, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Karim Oweiss, Michigan State University, USA Thierry R. Nieus, Italian Institute of Technology, Italy*

#### *\*Correspondence:*

*Min-Chi Hsiao, Department of Biomedical Engineering, University of Southern California, 3641 Watts Way, Los Angeles, CA 90089, USA. e-mail: mhsiao@usc.edu*

This paper describes a modeling-control paradigm to control the hippocampal output (CA1 response) for the development of hippocampal prostheses. In order to bypass a damaged hippocampal region (e.g., CA3), downstream hippocampal signal (e.g., CA1 responses) needs to be reinstated based on the upstream hippocampal signal (e.g., dentate gyrus responses) via appropriate stimulations to the downstream (CA1) region. In this approach, we optimize the stimulation signal to CA1 by using a predictive DG-CA1 nonlinear model (i.e., DG-CA1 trajectory model) and an inversion of the CA1 input–output model (i.e., inverse CA1 plant model). The desired CA1 responses are first predicted by the DG-CA1 trajectory model and then used to derive the optimal stimulation intensity through the inverse CA1 plant model. Laguerre-Volterra kernel models for random-interval, graded-input, contemporaneous-graded-output system are formulated and applied to build the DG-CA1 trajectory model and the CA1 plant model. The inverse CA1 plant model to transform desired output to input stimulation is derived from the CA1 plant model. We validate this paradigm with rat hippocampal slice preparations. Results show that the CA1 responses evoked by the optimal stimulations accurately replicate the CA1 responses recorded in the hippocampal slice with intact trisynaptic pathway.

**Keywords: neural prosthesis, Volterra kernel, inverse control, trajectory model, hippocampus**

## **INTRODUCTION**

A neural prosthesis is a prosthetic device that interfaces with the nervous system to improve or restore impaired neural function (Berger et al., 1994; Schwartz, 2004; Patil and Turner, 2008). The neuroprosthetic technology has been advancing rapidly (Bernotas et al., 1986; Creasey et al., 2004; Mayberg et al., 2005; Hochberg et al., 2006; Allison et al., 2007; Stacey and Litt, 2008). Neural prostheses can be categorized according to the directions of the signal communication between the device and the nervous system (Turner et al., 2005; Song et al., 2007). The first category of neural prostheses attempts to decode neural signals and then to activate an external object. An example would be the neuroprobes decoding motor cortex signals to control a robotic arm (Donoghue, 2002; Nicolelis, 2003; Taylor et al., 2003). The second kind of neural prostheses encodes external sensory stimuli and intends to activate the nervous system. Examples are cochlear implants and artificial retinas (Middlebrooks et al., 2005; Weiland et al., 2005). The third kind of neural prostheses, which forms a bidirectional closed-loop system with the nervous system, receives incoming neural signals from one nervous region and sends its output to activate another nervous system region (Berger et al., 2001, 2011). For the neural prosthesis that involves stimulation to the nervous system, the output system responses could be influenced by the stimulation parameters such as location, intensity, and frequency. Because the signal transformation in the nervous system is nonlinear, it is also important to consider the nonlinearity between stimulation patterns and the output responses.

Without considering this nonlinear relationship, large deviations between the device-evoked responses and the desired responses are expected. In practice, such deviations can be mitigated by tuning the stimulation parameters (Lauer et al., 2000; O'Suilleabhain et al., 2003; McIntyre et al., 2004; Tellez-Zenteno et al., 2006; Rupp and Gerner, 2007; Albert et al., 2009; McLachlan et al., 2010). This optimization procedure is typically performed manually and empirically, e.g., assuming a static and linear relation between the stimulation pattern and the desired responses, and then searching for the optimal ratio between the stimulation intensity and the outcome responses via a trial-and-error procedure. To formally solve this important problem, one needs to develop a rigorous stimulation paradigm that takes the (nonlinear dynamical) relationship between stimulation signals and system responses into account (Liu and Oweiss, 2010; Liu et al., 2011).

We are in the process of developing a neural prosthesis to restore the long-term memory formation function of the hippocampus that is lost in Alzheimer's disease, stroke, epilepsy, or other neurological disorders. Our concept of such a prosthetic device is a biomimetic model of the input–output nonlinear dynamics of the hippocampus, that is, a model that captures how hippocampal circuitry re-encodes, or transforms, incoming spatio-temporal patterns of neural activity (i.e., short-term memories) into outgoing spatio-temporal patterns of neural activity (i.e., long-term memories) (Squire, 1992; Berger et al., 2001, 2005; Burgess et al., 2002). We have shown in rodents, both *in vitro* (Chan et al., 2004; Hsiao et al., 2006) and *in vivo* (Song et al., 2007, 2009; Berger et al., 2011, 2012), that a nonlinear hippocampal model is capable of predicting accurately the output signals based on the ongoing input signals in the hippocampus. In this study, we extend this concept by developing a rigorous stimulation paradigm with control theory, and then implementing it in rat hippocampal slices.

The intrinsic circuitry of the hippocampus consists of three major subregions: dentate gyrus (DG), CA3, and CA1 as shown in **Figure 1A**. This trisynaptic circuit can be maintained in a transverse slice preparation (Andersen et al., 1969, 2000; Amaral and Witter, 1989). The signal transformations in all three regions are highly nonlinear and dynamical (Berger et al., 1988; Sclabassi et al., 1988; Bartesaghi et al., 2006). From an engineering perspective, the hippocampal circuit can be viewed as a cascade of input–output transfer functions between the DG, CA3, and CA1 subregions (**Figure 1B**). In the context of extracellular recording as in this study, the evoked field potentials in each subsystem are measured as input–output signals. For example, the CA3 response (field excitatory postsynaptic potentials amplitude, fEPSP) can be used as the input signal to CA1, and the CA1 response can be considered as the final system output. A schematic diagram of such a hippocampal prosthesis is shown in **Figure 1C**, where CA3 is damaged, thus the signal transmission from DG to CA1 cannot be completed. In the replacement scenario, the prosthesis model processes the DG signals and generates optimal stimulations to elicit the desired output response in the CA1 region.

**FIGURE 1 | (A)** A rat hippocampal slice and its major intrinsic pathways. The input signals from perforant path fibers excite dentate granule cells. Dentate output, in turn, excites CA3 pyramidal cells through mossy fibers. Output from CA3 is transmitted to CA1 pyramidal cells through Schaffer collaterals. This so-called "trisynaptic pathway" is the principal network involved in hippocampal neuronal information processing. **(B)** A block diagram showing the trisynaptic pathway in a hippocampal slice. **(C)** A schematic diagram of a hippocampal prosthesis model functionally replacing the original pathway, where CA3 is damaged, so the signal transmission cannot be completed. This bi-directional prosthetic device receives incoming neural signals from one hippocampal region (DG) and sends its output to stimulate another hippocampal region (CA1). **(D)** The proposed modeling-control paradigm to optimize the stimulation patterns. In this framework, the desired CA1 output is first predicted with the DG signal by the trajectory model, and then converted to the desired stimulation patterns through the inverse model. The desired stimulation patterns then drive the output system (CA1) to the desired output responses.

The successful implementation of such a device depends on three sequential components. First, the device must capture incoming neuronal signals reliably from the input region. Second, it must mimic the damaged system precisely through a computational model. Finally, the device should reproduce the desired responses in the output region through electrical stimulation. Thus, through bi-directional communication with the brain, the prosthetic device could essentially bypass the damaged region and substitute the lost function.

This paper describes the procedure for deriving optimal stimulation patterns using an inverse control concept (Houk, 1988; Widrow and Walach, 1996; Camacho and Bordons, 2003; Normann, 2007). The "trajectory model" is a model that predicts the desired output response based on the input patterns. This model can be developed using available knowledge or built directly from experimental input–output data. The stimulation-response properties of the output system are described by the "plant model." The "inverse plant model" describes a system whose transfer function is the inverse transformation of the plant model (Widrow and Bilello, 1993; Widrow and Plett, 1997; Karniel et al., 2001). This inverse transformation can be determined once the input–output transformation of the plant model is fully explored. Once these three models (i.e., trajectory, plant, and inverse plant models) are built, the signals flow as shown in **Figure 1D**. Signals recorded from the DG (input) system pass through the trajectory model to predict the desired output. The inverse plant model is then used to derive the desired stimulation amplitudes from the desired output. Finally, the CA1 (output) region generates the controlled output responses. The results show that the strategy described in this paper can control CA1 output activity (shown in **Figure 1D** as "Controlled CA1 responses") so that it replicates the CA1 activity recorded from hippocampal slices with an intact trisynaptic pathway (shown in **Figure 1B** as "Trisynaptic CA1 responses").

## **MATERIALS AND METHODS**

The proposed modeling-control paradigm was verified using an *in vitro* rat hippocampal slice preparation. Section "Experimental Procedures" provides an explanation of the methodology used to prepare the hippocampal slices, and the description of our electrophysiology experimental setup. Section "Modeling-Control Paradigm Implementation and Data Collection" describes the estimations and validations of the trajectory model, the plant model and the inverse plant model, and the associated data collection and analysis procedures. The overall experimental protocol is described in section "Modeling-Control Framework Experiment Protocol."

## **EXPERIMENTAL PROCEDURES**

#### *Acute hippocampal slice preparation*

Hippocampal slices from 8- to 10-week-old male Sprague-Dawley rats (250–300 g) were prepared. The animals were first anesthetized with halothane (Halocarbon Laboratory, USA) and then decapitated. The skulls were rapidly removed and the brains were carefully extracted. Hippocampi were separated from the cortices in an iced sucrose buffer solution (Sucrose 206 mM; KCl 2.8 mM; NaH2PO4 1.25 mM; NaHCO3 26 mM; Glucose 10 mM; MgSO4 2 mM; Ascorbic Acid 2 mM). Hippocampal slices 400 micrometers thick were cut transversely from the ventral hippocampi using a vibratome (Leica VT1000S, Germany). The slices were incubated for at least 1 h in 2 mM MgSO4 artificial cerebrospinal fluid (aCSF) at room temperature to equilibrate. During each electrophysiological recording session, one slice at a time was transferred to the planar multielectrode array. The array was attached to a circular plastic chamber and perfused with normal aCSF (NaCl 128 mM; KCl 2.5 mM; NaH2PO4 1.25 mM; NaHCO3 26 mM; Glucose 10 mM; MgSO4 1 mM; Ascorbic Acid 2 mM; CaCl2 2 mM) maintained at room temperature (24–26 °C). In the recording chamber, each slice was held down by a metallic ring with nylon mesh attached to it. The slice was positioned by manipulating the ring with a small brush. All the solutions were bubbled with a gas mixture of 95% O2 and 5% CO2. The protocol described above was approved by the Department of Animal Resources and Institutional Animal Care and Use Committee at the University of Southern California.

#### *Electrophysiological recording setup and procedures*

Electrophysiology data were collected through an extracellular recording technique using an MEA60 system (Multi Channel Systems, Germany), as seen in **Figure 2**. This system consisted of pre-amplifiers (1200× gain), a data acquisition device (MC\_Card), and an 8-channel stimulus generator (STG1008), all operated using software provided by Multi Channel Systems (MC\_Rack V3.2.0 and MC\_Stimulus V2.0.6). A conformal 60-channel planar multielectrode array was made specifically for this study. The geometry of this conformal array was designed to match the cytoarchitecture of the hippocampus slices (**Figure 2C**) and was platinum based. Details in fabrication and the arrangement of the array can be found in Gholmieh et al. (2006) and Taketani and Baurdy (2006). Collected data were sampled at a frequency of 10 kHz per channel and were recorded using MC\_Rack. The MEA60 system was assembled over an inverted microscope (Leica DM-IRB, Germany). In each experiment, the position of the slice on the MEA was captured by a digital image capture system (Diagnostic Instruments, Spot RT Digital Camera, USA) with SPOT (V4.6.4.3) software and Adobe Photoshop (Adobe V7.0, USA).

**FIGURE 2 | A photo of the electrophysiological recording system. (A)** The MEA60 system and **(B)** the conformal planar MEA (Gholmieh et al., 2006; Taketani and Baurdy, 2006). **(C)** A photomicrograph of a hippocampal slice on the conformal MEA. The set alignment of this array is according to rat hippocampal cytoarchitecture covering major subregions of DG, CA3, and CA1. The waveforms represent the trisynaptic response of the hippocampal slice recorded in each region. The white lines indicate the amplitude measurement of DG population spike amplitude and CA1 fEPSP amplitude (see section "FARIT-Induced Trisynaptic Data Collection and Analysis").

#### *Stimulation and data collection procedures*

In this study, biphasic current pulses with a duration of 100 μs per phase were used in all stimulation patterns. Different stimulation trains were programmed in MC\_Stimulus and used to study the nonlinear properties of different regions. There was a 5–7 min waiting period between stimulus trains. The evoked neural responses were simultaneously recorded from different regions. The channels were first selected based on the placement of the recording electrodes on the cytoarchitecture of the slices (i.e., DG channels must be in the DG region, CA3 channels must be in the CA3 region, etc.). Among those channels, the channels with the largest response (dendritic population spike or EPSP) amplitudes were further selected and analyzed. The main purpose of this procedure was to find the most representative responses for each region; channels with small responses or inappropriate placements were not analyzed because their recordings may reflect a non-cell-body placement or a mixture of activities from multiple regions.

#### **MODELING-CONTROL PARADIGM IMPLEMENTATION AND DATA COLLECTION**

#### *DG-CA1 trajectory model implementation*

*FARIT-induced trisynaptic data collection and analysis.* An external bipolar electrode of twisted Nichrome wires was used to elicit the trisynaptic response. Paired-pulse or quadruplet-pulse electrical stimulation was applied to the perforant pathway of each slice using the external electrode to generate electrophysiological responses throughout the trisynaptic pathway (evoked field potentials in DG, CA3, and CA1, as seen in **Figure 2C**). When the full trisynaptic response was observed, we stimulated the slice with a series of fixed-amplitude, random inter-impulse-interval trains (FARITs). Four 300-pulse Poisson distributed FARITs of a fixed current intensity (biphasic, 150–300 μA) were delivered to the perforant path (1200 impulses; range of intervals: 2 ms to 5 s; mean frequency: 2 Hz). Response amplitudes from selected channels in the DG and CA1 regions were analyzed to build the DG-CA1 trajectory model. The neuronal response measure in DG was the population spike amplitude, calculated by averaging the distance between the negative peak and the first positive peak (measure "a" in **Figure 2C**) and the distance between the negative peak and the second positive peak (measure "b" in **Figure 2C**) (Houk, 1988). To measure responses in the CA1 region, the field potential amplitude was defined as the negative peak of the waveform (measure "c" in **Figure 2C**).
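As an illustration only, the timing of a FARIT with the statistics stated above could be generated as in the following Python sketch. The function name `make_farit`, the rejection of out-of-range intervals, and the seeds are assumptions made for this example; the trains used in the experiments were programmed in MC\_Stimulus, and the exact sampling procedure is not given in the text.

```python
import random


def make_farit(n_pulses=300, mean_rate_hz=2.0, min_isi_s=0.002,
               max_isi_s=5.0, seed=None):
    """Return pulse times (s) of one random-interval train.

    Inter-impulse intervals are drawn from an exponential distribution
    (Poisson process at mean_rate_hz); intervals outside
    [min_isi_s, max_isi_s] are redrawn (an assumption of this sketch).
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while len(times) < n_pulses:
        isi = rng.expovariate(mean_rate_hz)  # mean interval = 1 / rate
        if min_isi_s <= isi <= max_isi_s:
            t += isi
            times.append(t)
    return times


# Four 300-pulse trains, 1200 impulses in total.
trains = [make_farit(seed=s) for s in range(4)]
total_pulses = sum(len(train) for train in trains)
```

Because intervals are truncated to the 2 ms to 5 s range, the realized mean frequency of such a train is only approximately 2 Hz.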

*Trajectory model configuration.* A single-input, single-output discrete model was derived from Volterra series as expressed below (Marmarelis and Orme, 1993; Marmarelis, 2004):

$$\begin{aligned} \mathbf{y}(n) &= k\_0 + \sum\_{m=0}^{M} k\_1(m)\mathbf{x}(n-m) \\ &+ \sum\_{m\_1=0}^{M} \sum\_{m\_2=0}^{M} k\_2(m\_1, m\_2)\mathbf{x}(n-m\_1)\mathbf{x}(n-m\_2) + \dots \end{aligned} \tag{1}$$

The zeroth-order kernel *k*<sub>0</sub> is the value of the output *y* when the input is absent. First-order kernels *k*<sub>1</sub> describe the relationship between each single input *x*(*n* − *m*) and the output *y*. Second-order kernels *k*<sub>2</sub> describe the relationship between the output *y* and each unique pair of inputs *x*(*n* − *m*<sub>1</sub>), *x*(*n* − *m*<sub>2</sub>). The term *n* represents the time of occurrence of the present impulse in the input–output sequence, and *m* represents the interval of the impulses prior to the present impulse within the kernel memory window *M*; *m* = 0 denotes the present input. The input to the system can be expressed as a series of variable-amplitude, random-interval delta functions:

$$x(t) = \sum\_{i=1}^{I} A\_i \, \delta\left(t - t\_i\right) \tag{2}$$

where *i* is the impulse index and *I* is the total number of impulses. The time of occurrence of the *i*th impulse is *t*<sub>i</sub>. In the DG-CA1 trajectory model experiment, DG population spike amplitudes were used as the input (*A*<sub>i</sub>) and CA1 fEPSP amplitudes were used as the output *y*(*n*). Because the input amplitude varies, in order to isolate the influence of the present input, we considered the zero-lag terms in the original Volterra series (1) separately, as follows:

$$\begin{aligned} y(n) &= k\_0 + k\_1(0)\mathbf{x}(n) + \sum\_{m=1}^M k\_1(m)\mathbf{x}(n-m) \\ &+ k\_2(0,0)\mathbf{x}(n)\mathbf{x}(n) + \sum\_{m\_1=1}^M k\_2(m\_1,0)\mathbf{x}(n-m\_1)\mathbf{x}(n) \\ &+ \sum\_{m\_2=1}^M k\_2(0,m\_2)\mathbf{x}(n)\mathbf{x}(n-m\_2) \\ &+ \sum\_{m\_1=1}^M \sum\_{m\_2=1}^M k\_2(m\_1,m\_2)\mathbf{x}(n-m\_1)\mathbf{x}(n-m\_2) + \dots \end{aligned}$$

and can be then rearranged as:

$$\begin{aligned} \mathbf{y}(n) &= k\_0 + k\_1(0)\mathbf{x}(n) + k\_2(0, 0)\mathbf{x}(n)^2 + \sum\_{m=1}^{M} k\_1(m)\mathbf{x}(n-m) \\ &+ \sum\_{m\_1=1}^{M} \sum\_{m\_2=1}^{M} k\_2(m\_1, m\_2)\mathbf{x}(n - m\_1)\mathbf{x}(n - m\_2) \\ &+ 2 \sum\_{m=1}^{M} k\_2(m)\mathbf{x}(n)\mathbf{x}(n - m) + \dots \end{aligned} \tag{3}$$

The first three terms on the right represent the static input–output curve. The last three terms describe the nonlinear dynamical effect of the inputs on the output. To reduce the number of open parameters, kernel estimation is facilitated by expanding the kernels on the orthonormal Laguerre basis functions *L* (Marmarelis, 1993):

$$L\_l(m) = \alpha^{(m-l)/2} (1-\alpha)^{1/2} \sum\_{k=0}^l (-1)^k \binom{m}{k} \binom{l}{k} \alpha^{l-k} (1-\alpha)^k$$

where α is the Laguerre parameter (0 < α < 1), which affects the time extent of the basis functions. The convolution of the Laguerre basis functions *L* with the inputs *x* can be represented as

$$\nu\_l(t\_i) = \sum\_{t\_i - \mu < t\_j \le t\_i} A\_j L\_l(t\_i - t\_j)$$

where *A*<sub>j</sub> is the input spike amplitude in (2), *t*<sub>i</sub> is the time of occurrence of the current impulse in the input–output sequence, and *t*<sub>j</sub> is the time of occurrence of the *j*th impulse prior to the present impulse within the kernel memory window μ. The adapted Laguerre expansion of the Volterra kernels with *L* basis functions can be rewritten as:

$$\begin{aligned} y(t\_i) &= c\_0 + A\_i c\_1(0) + A\_i^2 c\_2(0, 0) + \sum\_{l=1}^{L} c\_1(l) \nu\_l(t\_i) \\ &+ \sum\_{l\_1=1}^{L} \sum\_{l\_2=1}^{L} c\_2(l\_1, l\_2) \nu\_{l\_1}(t\_i) \nu\_{l\_2}(t\_i) \\ &+ 2A\_i \sum\_{l=1}^{L} c\_2(l) \nu\_l(t\_i) + \dots \end{aligned} \tag{4}$$

where *c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, ... are the kernel expansion coefficients. Since the number of basis functions can be made much smaller than the memory length, the number of open parameters is greatly reduced by this expansion technique. The kernel expansion coefficients (*c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, ...) can be estimated via the least-squares method, and can be used to reconstruct the Volterra kernels (*k*<sub>i</sub>) using the Laguerre basis functions (*L*):

$$k\_0 = c\_0, \quad k\_1(m) = \sum\_{l=1}^{L} c\_1(l) L\_l(m), \quad k\_2(m\_1, m\_2) = \sum\_{l\_1=1}^{L} \sum\_{l\_2=1}^{L} c\_2(l\_1, l\_2) L\_{l\_1}(m\_1) L\_{l\_2}(m\_2)$$
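The Laguerre basis functions and the convolution terms ν<sub>l</sub> defined above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes that impulse times are expressed as integer lags (e.g., in sampling bins), that the basis order is counted from zero, and that the present impulse is excluded from the convolution so that its effect stays in the separate zero-lag terms. The names `laguerre`, `inner`, and `convolve_events` are hypothetical.

```python
import math


def laguerre(l, m, alpha):
    """Discrete Laguerre function L_l(m) for order l >= 0 and integer lag m >= 0."""
    s = sum((-1) ** k * math.comb(m, k) * math.comb(l, k)
            * alpha ** (l - k) * (1.0 - alpha) ** k
            for k in range(l + 1))
    return alpha ** ((m - l) / 2.0) * math.sqrt(1.0 - alpha) * s


def inner(l1, l2, alpha, n_lags=400):
    """Inner product of two basis functions over the first n_lags lags."""
    return sum(laguerre(l1, m, alpha) * laguerre(l2, m, alpha)
               for m in range(n_lags))


def convolve_events(times, amps, i, l, alpha, memory):
    """v_l(t_i): amplitude-weighted basis values of impulses before impulse i.

    Only impulses j < i whose lag t_i - t_j is shorter than the memory
    window contribute (assumption: the present impulse is excluded).
    """
    return sum(amps[j] * laguerre(l, times[i] - times[j], alpha)
               for j in range(i)
               if times[i] - times[j] < memory)


# Small example: three impulses at integer times with different amplitudes.
v_demo = convolve_events([0, 3, 5], [1.0, 2.0, 0.5], i=2, l=0,
                         alpha=0.5, memory=10)
```

Evaluating `inner` for equal and unequal orders gives values close to 1 and 0, respectively, which is the orthonormality property that makes the least-squares estimation of the expansion coefficients well behaved.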

#### *CA1 plant model implementation*

*RARIT-induced monosynaptic data collection and analysis.* After collecting the FARIT data for building the DG-CA1 trajectory model, in the same slice, paired-pulse stimulation was applied to the *stratum radiatum* from a pair of stimulation electrodes in the conformal array in order to elicit the monosynaptic CA1 response. The pair of stimulation electrodes was selected according to their location and their ability to evoke typical paired-pulse facilitation. In this set of experiments, the amplitudes of the FARITs were modified to form random-amplitude, random-interval trains (RARITs; Gaussian distributed, mean amplitude: 150 μA, which is the mean CA1 evoked postsynaptic potential amplitude observed in the FARIT-induced trisynaptic dataset). Once the pair of stimulation electrodes was determined, four 300-pulse RARITs were delivered to the slice. A channel from the CA1 region was selected and fEPSP amplitudes were analyzed for suitability in constructing the CA1 plant model.
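For illustration, the amplitude sequence of one RARIT could be drawn as in the Python sketch below. Only the Gaussian distribution and the 150 μA mean come from the text; the standard deviation (`sd_ua`), the lower bound (`min_ua`), and the function name are assumptions chosen for this example.

```python
import random


def make_rarit_amplitudes(n_pulses=300, mean_ua=150.0, sd_ua=30.0,
                          min_ua=1.0, seed=None):
    """Return Gaussian-distributed stimulation amplitudes in microamperes.

    sd_ua and min_ua are hypothetical: the spread of the distribution is
    not stated in the text, and draws below min_ua are redrawn so that
    every pulse has a positive amplitude.
    """
    rng = random.Random(seed)
    amps = []
    while len(amps) < n_pulses:
        a = rng.gauss(mean_ua, sd_ua)
        if a >= min_ua:
            amps.append(a)
    return amps


rarit_amps = make_rarit_amplitudes(seed=1)
rarit_mean = sum(rarit_amps) / len(rarit_amps)
```

These amplitudes would be paired with random inter-impulse intervals of the same kind as those used for the FARITs.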

*CA1 plant model configuration.* The same Laguerre-Volterra (LV) modeling approach described in section "Trajectory Model Configuration" was applied to build a CA1 plant model. In this set of experiments, the amplitudes of the RARITs (from the previous section) were used as measures of the input signal *A*<sub>i</sub> in (2), and the fEPSP amplitudes of CA1 were used as measures of the output. Once the input–output transformation of the CA1 system was fully explored, the inverse model could then be derived.
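The least-squares estimation of the expansion coefficients in (4) can be sketched as follows. This is a toy example under strong simplifying assumptions, not the estimation code used in the study: a single zeroth-order Laguerre function stands in for the full basis, amplitudes are in arbitrary normalized units, impulse times are integer bins, and the "recorded" responses are synthetic, noise-free outputs of a model with known coefficients (`true_c`, hypothetical values). All function names are illustrative.

```python
import math
import random

ALPHA = 0.9  # hypothetical Laguerre parameter


def l0(m):
    """Zeroth-order discrete Laguerre function at integer lag m."""
    return math.sqrt(1.0 - ALPHA) * ALPHA ** (m / 2.0)


def memory_term(times, amps, i, window=200):
    """v(t_i): contribution of impulses preceding impulse i within the window."""
    return sum(amps[j] * l0(times[i] - times[j])
               for j in range(i) if times[i] - times[j] < window)


def features(a, v):
    """Regressors of Equation (4) for one basis function."""
    return [1.0, a, a * a, v, v * v, 2.0 * a * v]


def solve(mat, rhs):
    """Solve mat @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit(rows, y):
    """Ordinary least squares via the normal equations."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    return solve(xtx, xty)


# Synthetic random-amplitude, random-interval train (integer time bins).
rng = random.Random(0)
times, t = [], 0
for _ in range(300):
    t += rng.randint(1, 40)
    times.append(t)
amps = [max(0.1, rng.gauss(1.5, 0.3)) for _ in times]

true_c = [0.2, 1.0, 0.1, 0.5, -0.05, 0.05]  # hypothetical coefficients
rows = [features(amps[i], memory_term(times, amps, i)) for i in range(len(times))]
y = [sum(c * f for c, f in zip(true_c, row)) for row in rows]
est_c = fit(rows, y)  # should reproduce true_c for noise-free data
```

With real recordings the responses contain noise, so the estimated coefficients would be assessed on independent data rather than compared with known values.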

#### *Inverse CA1 plant model implementation*

*Inverse CA1 plant model configuration.* The inverse model was built to transform the output (i.e., desired output of a CA1 region) to the input (i.e., desired input stimulation to a CA1 region). To develop the inverse model based on the LV model, the original Equation (4) was rearranged to:

$$\begin{aligned} &\left[c\_2(0,0)\right]A\_i^2 + \left[c\_1(0) + 2\sum\_{l=1}^L c\_2(l)\nu\_l(t\_i)\right]A\_i \\ &+ \left[c\_0 + \sum\_{l=1}^L c\_1(l)\nu\_l(t\_i) + \sum\_{l\_1=1}^L \sum\_{l\_2=1}^L c\_2(l\_1, l\_2)\nu\_{l\_1}(t\_i)\nu\_{l\_2}(t\_i) - y(t\_i)\right] = 0 \end{aligned} \tag{5}$$

In (5), the desired output *y* and the coefficients *c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, ... were obtained during the process of model estimation. All the convolution terms could also be determined using the coefficients and the previous stimulation amplitudes *A*<sub>i−1</sub>, *A*<sub>i−2</sub>, .... Once all the terms were determined, (5) became a quadratic equation in the unknown desired input stimulation (*A*). It can be simplified as:

$$aA^2 + bA + c = 0\tag{6}$$

where

$$\begin{aligned} a &= c\_2(0,0), \quad b = c\_1(0) + 2\sum\_{l=1}^L c\_2(l)\nu\_l(t\_i), \\ c &= c\_0 + \sum\_{l=1}^L c\_1(l)\nu\_l(t\_i) + \sum\_{l\_1=1}^L \sum\_{l\_2=1}^L c\_2(l\_1,l\_2)\nu\_{l\_1}(t\_i)\nu\_{l\_2}(t\_i) - y(t\_i) \end{aligned}$$

such that the transformation of the inverse model (output to input) became an operation of solving *A* in (6). In this study, the roots of the quadratic equation can all be calculated from

$$A = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \tag{7}$$

The validation of this inverse model implementation is shown in the Results (section "Inverse CA1 Plant Model Implementation and Validation Results"). All the calculated stimulation amplitudes were used to compose new stimulation trains, called desired-amplitude RITs (DARITs; see section "DARIT-Induced Monosynaptic Data Collection and Analysis").
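The root calculation in (6) and (7) can be illustrated with a static special case in which all memory terms are zero, so that *a* = *c*<sub>2</sub>(0,0), *b* = *c*<sub>1</sub>(0), and *c* = *c*<sub>0</sub> − *y*. The Python sketch below is an assumption-laden example: the coefficient values and the function names are hypothetical, and the explicit check for a negative discriminant is an addition that the text does not describe.

```python
import math


def static_plant(amp, c0, c1, c2):
    """Static input-output curve: y = c0 + c1*A + c2*A^2."""
    return c0 + c1 * amp + c2 * amp * amp


def inverse_static_plant(y_desired, c0, c1, c2):
    """Solve a*A^2 + b*A + c = 0 for A using the '+' root of Equation (7)."""
    a, b, c = c2, c1, c0 - y_desired
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("no real stimulation amplitude produces this output")
    return (-b + math.sqrt(disc)) / (2.0 * a)


# Hypothetical coefficients in normalized units.
C0, C1, C2 = 0.2, 1.0, 0.1
y_target = static_plant(1.5, C0, C1, C2)
amp_recovered = inverse_static_plant(y_target, C0, C1, C2)
```

The "+" root returns the original amplitude only on the branch where the output increases with stimulation amplitude (2*aA* + *b* > 0), which holds for the positive coefficients chosen here.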

*DARIT-induced monosynaptic data collection and analysis.* In this set of experiments, the amplitudes of the RARITs (as used in section "RARIT-Induced Monosynaptic Data Collection and Analysis") were replaced with the optimal stimulation amplitudes calculated from the inverse CA1 plant model (from the previous section, "Inverse CA1 Plant Model Configuration"); the resulting trains were named DARITs. Four 300-pulse DARITs were delivered to the slice through the same pair of stimulation electrodes as in the RARIT experiments. A channel from the CA1 region was selected and the fEPSP amplitudes were analyzed.

### *Model validation*

In this study, data were evaluated using the Variance Accounted For (VAF) and the Normalized Mean Square Error (NMSE) as described below:

$$\text{VAF} = (1 - \text{var}(Y\_i - X\_i) / \text{var}(Y\_i))$$

$$\text{NMSE} = \sum\_{i} (Y\_i - X\_i)^2 \Big/ \sum\_{i} Y\_i^2$$

where *X* is the amplitude predicted by the model, *Y* is the amplitude measured from the recorded data, and *var* is the variance of the data. Specific data sets were chosen for comparison and are presented in the Results section. To evaluate the predictive power of the models, we used a cross-validation method, i.e., independent datasets were used for model estimation and model prediction. All goodness-of-fit values reported in this paper were obtained using this method.
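Both measures are straightforward to compute. The Python sketch below follows the definitions above, with *Y* the recorded and *X* the predicted amplitudes; the function names and the example values are illustrative, and the population form of the variance is an assumption (the ratio in the VAF is the same for the sample form).

```python
def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def vaf(recorded, predicted):
    """Variance Accounted For: 1 - var(Y - X) / var(Y)."""
    residuals = [y - x for y, x in zip(recorded, predicted)]
    return 1.0 - variance(residuals) / variance(recorded)


def nmse(recorded, predicted):
    """Normalized Mean Square Error: sum (Y - X)^2 / sum Y^2."""
    num = sum((y - x) ** 2 for y, x in zip(recorded, predicted))
    den = sum(y ** 2 for y in recorded)
    return num / den


# Made-up amplitudes, for illustration only.
recorded = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
example_vaf = vaf(recorded, predicted)
example_nmse = nmse(recorded, predicted)
```

Multiplying either value by 100 gives the percentages in which the results are reported.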

## **MODELING-CONTROL FRAMEWORK EXPERIMENT PROTOCOL**

The experimental protocol to verify our modeling-control paradigm for an *in vitro* hippocampal prosthesis involves the following steps:


… approach, and then formulating an inverse CA1 plant model (**Figure 3B**).

**FIGURE 3 |** … CA1 plant model can then be formulated. **(C)** Applying DG patterns (from **A**) … amplitudes from **A** and **C** can then be compared.


## **RESULTS**

The diagrams and performance of the DG-CA1 trajectory and CA1 plant model predictions, and the inverse CA1 plant model implementation, are presented in this section. The presented protocol was conducted in six experiments. In each experiment, two sets of data were collected from a hippocampal slice. The first dataset was composed of the FARIT-induced trisynaptic data and was used to build the DG-CA1 trajectory model. The second dataset was composed of the RARIT-induced monosynaptic data and was used to build the CA1 plant model. The two resulting models and their predictions are presented. This section also includes the implementation of the inverse CA1 plant model and, lastly, the validation of the modeling-control paradigm used for regulating CA1 nonlinear dynamics.

#### **DG-CA1 TRAJECTORY MODEL AND THE PREDICTION RESULTS**

The FARIT-induced hippocampal trisynaptic data were analyzed for use in building the DG-CA1 trajectory model. The amplitudes of evoked DG population spikes were used as measures of the input to the system, and the amplitudes of evoked CA1 fEPSPs were used as measures of output of the system. An LV kernel model was applied to study the nonlinearity of this system. Examples of the first and the second order LV kernels are shown in **Figures 4A,B**, respectively.

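**FIGURE 4 | (A)** The first order and **(B)** the second order LV kernel of the DG-CA1 trajectory model. The singular points and lines showing on the edge of each figure (indicated by arrows) represent the effect of present input. **(C)** A segment of comparison between FARIT-induced trisynaptic CA1 response amplitudes and the amplitudes predicted by the DG-CA1 trajectory model. **(D)** The Q–Q plot of the data distribution between actual trisynaptic CA1 responses recorded from the slice and the outputs predicted by the DG-CA1 trajectory model.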

It should be noted that in **Figure 4A**, the singular point represents the *k*<sub>1</sub>(0) term in (3). The difference in polarity in **Figure 4A** shows the importance of isolating the zero-lag terms. In **Figure 4B**, the singular point represents the *k*<sub>2</sub>(0,0) term in (3), and the two lines indicated by arrows represent the second-order terms *k*<sub>2</sub>(*m*<sub>1</sub>, *m*<sub>2</sub>) in which one input of the pair is the present input (*m* = 0).

Model estimation was completed using population spike amplitudes and the intervals of the input–output sequences. From all datasets, the slice response amplitudes were analyzed and compared with the predicted amplitudes. The mean VAF was 65.97 ± 17.30%. In **Figure 4C**, a segment of FARIT-induced trisynaptic CA1 fEPSP amplitudes is compared to its counterpart predicted from the DG-CA1 trajectory model. The result is further confirmed by an overall comparison between the actual responses and model predicted outputs, as shown in **Figure 4D**. The quantile–quantile (Q–Q) plot demonstrates that the actual trisynaptic CA1 responses recorded from the slice are accurately predicted by the DG-CA1 trajectory model.
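The points of a Q–Q plot such as the one described above can be obtained by pairing matched quantiles of the recorded and predicted amplitudes. The following Python sketch is one possible way to do this and is not taken from the study; the function name `qq_points`, the number of quantiles, and the example values are assumptions.

```python
import statistics


def qq_points(recorded, predicted, n_quantiles=10):
    """Pair matched quantiles of two samples (one point per cut point).

    Points lying on the identity line indicate that the two amplitude
    distributions agree at that quantile.
    """
    q_recorded = statistics.quantiles(recorded, n=n_quantiles)
    q_predicted = statistics.quantiles(predicted, n=n_quantiles)
    return list(zip(q_recorded, q_predicted))


# Made-up amplitudes: the prediction is offset by a constant of 0.1.
recorded = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
predicted = [value + 0.1 for value in recorded]
points = qq_points(recorded, predicted)
```

A Q–Q plot compares distributions rather than impulse-by-impulse values, so it complements the VAF and the segment-wise comparison instead of replacing them.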

#### **CA1 PLANT MODEL AND THE PREDICTION RESULTS**

The RARIT-induced CA1 monosynaptic data were analyzed for use in building the CA1 plant model. The random amplitudes of the RARITs were used as measures of input into the system, and the amplitudes of evoked CA1 fEPSPs were used as measures of the output of the system. The LV kernel model was applied to study the nonlinearity of the CA1 system. Examples of the first and the second order LV kernels are shown in **Figures 5A,B**, respectively.

Model estimation was completed using stimulation intensities and intervals of the input–output sequences. The VAF between slice response and model prediction was 85.56 ± 13.91%, averaged from six datasets. This shows that the CA1 plant model can accurately predict CA1 amplitudes based on stimulation amplitudes. A segment of the RARIT-induced monosynaptic CA1 fEPSP amplitudes is compared to its counterpart predicted by the CA1 plant model, as shown in **Figure 5C**. **Figure 5D** displays the Q–Q plot of the overall monosynaptic CA1 responses and CA1 plant model predicted results.

**FIGURE 5 | (A)** The first order and **(B)** the second order LV kernel of the CA1 plant model. The singular points and lines showing on the edge of each figure (indicated by arrows) represent the effect of present input. **(C)** A segment of comparison between RARIT-induced monosynaptic CA1 response amplitudes and the amplitudes predicted by the CA1 plant model. **(D)** The Q–Q plot of the data distribution between actual monosynaptic CA1 responses recorded from the slice and the outputs predicted by the CA1 plant model.

#### **INVERSE CA1 PLANT MODEL IMPLEMENTATION AND VALIDATION RESULTS**

The inverse CA1 plant model was implemented using RARIT-induced monosynaptic data. The purpose of formulating such an inverse model is to transform output predictions into input stimulations. The output predictions were acquired from the CA1 plant model and applied as the *y* in (5). The coefficients *c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, ... were obtained during the process of model estimation. Three terms involving the convolution of the Laguerre basis functions with the input amplitudes were unknown:

$$\sum\_{l=1}^{L} c\_2(l)\nu\_l(t\_i), \quad \sum\_{l=1}^{L} c\_1(l)\nu\_l(t\_i), \text{ and } \sum\_{l\_1=1}^{L} \sum\_{l\_2=1}^{L} c\_2(l\_1, l\_2)\nu\_{l\_1}(t\_i)\nu\_{l\_2}(t\_i).$$

Based on our experimental design, no stimulation existed before the stimulation train was sent, so these unknown terms were equal to zero. Thus, the first stimulation amplitude can be calculated by (7). After the first stimulation amplitude was calculated, it was then applied to convolve with the Laguerre basis function and formulate the unknown terms for calculating the next stimulation amplitude. The operations for solving the root were run through all data points in order to process the transformation from output into input. As a result, the inverse model allows us to convert the desired output response amplitudes to input stimulation amplitudes in a dynamic, recursive manner.
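The recursive procedure described above can be sketched in Python as follows. The sketch is a toy under stated assumptions rather than the implementation used in the study: the plant has a single zeroth-order Laguerre function, its coefficients are hypothetical values in normalized units, impulse times are integer bins, and the desired outputs are generated by the same plant so that the recovered amplitudes can be compared with known ones.

```python
import math

ALPHA = 0.9                            # hypothetical Laguerre parameter
C0, C1_0, C2_00 = 0.2, 1.0, 0.1        # c0, c1(0), c2(0,0)
C1_L, C2_LL, C2_L = 0.5, -0.05, 0.05   # c1(l), c2(l1,l2), c2(l) for one basis function


def l0(m):
    """Zeroth-order discrete Laguerre function at integer lag m."""
    return math.sqrt(1.0 - ALPHA) * ALPHA ** (m / 2.0)


def memory_term(times, amps, i, window=200):
    """v(t_i) from the amplitudes already known for impulses before i."""
    return sum(amps[j] * l0(times[i] - times[j])
               for j in range(i) if times[i] - times[j] < window)


def plant(amp, v):
    """Forward plant model, Equation (4) with one basis function."""
    return (C0 + C1_0 * amp + C2_00 * amp * amp
            + C1_L * v + C2_LL * v * v + 2.0 * amp * C2_L * v)


def invert_train(times, y_desired):
    """Convert desired outputs to stimulation amplitudes, one impulse at a time."""
    amps = []
    for i, y in enumerate(y_desired):
        v = memory_term(times, amps, i)   # zero for the first impulse
        a = C2_00
        b = C1_0 + 2.0 * C2_L * v
        c = C0 + C1_L * v + C2_LL * v * v - y
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            raise ValueError("desired output is not reachable at impulse %d" % i)
        amps.append((-b + math.sqrt(disc)) / (2.0 * a))  # Equation (7)
    return amps


# Known stimulation train -> plant outputs -> amplitudes recovered recursively.
times = [0, 5, 12, 40, 43, 90, 95, 130]
stim = [1.5, 1.2, 1.9, 1.4, 1.7, 1.1, 1.6, 1.3]
desired = [plant(stim[i], memory_term(times, stim, i)) for i in range(len(stim))]
recovered = invert_train(times, desired)
max_error = max(abs(r - s) for r, s in zip(recovered, stim))
```

In this noise-free toy case the recovered amplitudes match the original train to within rounding error, which mirrors the kind of check used to validate the inverse model against the RARIT amplitudes.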

The validation of this inverse model was completed by comparing the calculated stimulation amplitudes with the RARIT amplitudes. The scatter plot in **Figure 6** shows that the calculated stimulation amplitudes and the RARIT stimulation amplitudes are identical, showing that: (1) the real roots could all be calculated; and (2) this inverse model implementation can successfully derive optimal stimulations based on desired response amplitudes.

#### **MODELING-CONTROL RESULTS**

Following the protocol of this modeling-control framework experiment, the desired CA1 output was first predicted by the DG-CA1 trajectory model and then passed to the inverse CA1 plant model to derive the optimal stimulation amplitudes. These amplitudes were used to formulate DARITs, which were then delivered to the slice, and the monosynaptic CA1 responses were recorded. The proposed modeling-control framework was intended to evoke CA1 activity similar to the original CA1 activity. Thus, DARIT-induced monosynaptic CA1 amplitudes were compared with FARIT-induced trisynaptic CA1 response amplitudes. Two examples are shown in **Figure 7**.

Each panel illustrates results from one experiment: amplitudes of fEPSPs recorded from the CA1 region are shown as a function of 50 impulses chosen from among 2400 impulses of the stimulation trains (1200 administered with FARIT stimulation; 1200 administered with DARIT stimulation). In order to collapse the *x* axis to include more data points, time intervals between impulses are not represented in the figures; only the "Input Event" number (sequence of sample impulses) is shown. Data for the FARIT-induced trisynaptic CA1 responses (CA1-trisynaptic) are shown as blue diamonds; data for the DARIT-induced monosynaptic CA1 responses (CA1-Controlled) are shown as red squares. As seen in **Figure 7**, the variation in CA1 fEPSP amplitudes was also captured by our model-controlled paradigm. The accuracy was evaluated using the NMSE of the amplitude, and the average NMSE was 15.41 ± 8.35%. A Q–Q plot comparing the entire data sets is shown in **Figure 8**.

### **DISCUSSIONS**

One of the essential objectives of a neural prosthetic device is to recreate the desired neural responses. While it is important to develop a reliable hardware model to represent the computational functions of a system, the control between device stimuli and actual responses is equally important. For example, in the application of deep brain stimulation (DBS), many efforts have been made in calibrating the stimulation parameters to achieve the desired effect (Mayberg et al., 2005; Okun et al., 2005). DBS devices depend on a trial-and-error process for finding the optimal stimulation pattern. Patients must repeatedly perform an exercise so that a neurologist can adjust stimulation parameters such as voltage, amplitude, pulse width, frequency, and electrode position (Moro et al., 2002; Volkmann et al., 2002; O'Suilleabhain et al., 2003). Another example is the application of functional electrical stimulation (FES) (Riener, 1999; Duffell et al., 2008; Donovan-Hall et al., 2011). The basic principle of FES is the generation of action potentials in the uninjured lower motor neurons by external electrical stimulation. This device faces problems such as muscle fatigue, spasticity, and limited force in the stimulated muscle. Using control strategies is one way to avoid internal disturbances and improve the time-consuming trial-and-error adjustment (Matjacic et al., 2003; Braz et al., 2009). Current neural prostheses face the same problem: the stimulation signals need to be adjusted manually or empirically based on the output response. In this article, we describe a rigorous approach to generate the stimulation patterns using a modeling-control framework. In the hippocampal slice preparation, with the purpose of restoring the CA1 output responses observed in the intact trisynaptic (DG to CA3 to CA1) circuitry, a nonlinear trajectory model was built to predict the desired CA1 output based on DG input. The predicted CA1 output was then converted to optimal stimulation through an inverse plant model of CA1 (i.e., an inverse transformation of CA1 input–output properties). Thus, the stimulation was essentially derived from the desired output response and was used to reactivate the CA1 response. An experimental validation of this modeling-control paradigm using hippocampal slices is provided. In one of our preliminary studies, the CA1 region was stimulated with non-optimal stimulation parameters, meaning that the nonlinearity of the CA1 input–output relationship was not taken into account. The average NMSE from four experiments was 35.23 ± 18.21%, which is much higher than that obtained with the modeling-control paradigm.

In the current experimental paradigm, the only open parameter is the stimulation intensity, since the desired output responses are single EPSPs, the stimulations are standard biphasic pulses, and there is no frequency (multiple impulses would elicit undesired multiple EPSPs). We are aware that in other applications (e.g., DBS), the phase and frequency are equally important parameters and could also be optimized. Our current modeling-control paradigm can be extended and used as a platform for the optimization of those parameters in the future. We understand that the stimulation site is also critical for this kind of device; misplacement of the electrode lead can cause poor efficacy or adverse effects (Richardson et al., 2009; van den Munckhof et al., 2010). Current clinical DBS surgeries are assisted by preoperative image analysis (MRI or CT) and guided intra-operatively with computerized stereotactic techniques. The lead can also be switched, with limited adjustability, to compensate for inappropriate placement. By applying the stimulation electrode array in the current experimental setup, the optimal stimulation site and its influence on the model can be further evaluated.

Our demonstrations also show that implementing a bidirectional neural prosthesis implicitly replaces the damaged system. In this approach, we do not need to explicitly estimate the transformational property of the CA3 region in the trisynaptic circuit. As long as we have the trajectory responses of the output system and have identified the nonlinear input–output relationship of the output system, we are able to drive it to the desired output through its inverse model. One potential issue here is how to know the desired trajectory responses of the intact system. In our opinion, there are several solutions or mechanisms that can mitigate this problem. First, we may develop a "generic model" from data recorded in normal animals. Previous studies have shown that there is a significant number of common features in the functional input–output properties across different animals, despite animal-to-animal variations. In the case described in this study, all trajectory models are qualitatively similar in terms of kernel polarity, kernel duration, and kernel shape. Using a model derived from other animals is imperfect, but at least provides a good approximation. Second, in behaving animal applications, the "imperfect" outputs generated by the "imperfect model" will be read out by the downstream brain regions. Neural plasticity, which is ubiquitous in the central nervous system, may play a role in adapting the system to the imperfect outputs or model. Third, more sophisticated computational methods such as reinforcement learning can potentially be used to develop self-adaptive or co-adaptive models.

The paradigm introduced in this paper did not include error feedback. This was based on the assumption that the error observed in the output responses is instantaneous and does not influence future output. In order to extend the paradigm to a closed-loop feedback system (Bernotas et al., 1986; Houk, 1988; Veltink et al., 1992; Abbas and Riener, 2001; Liu et al., 2011), the output error, which may be caused by the interface between the electrodes and the nervous system, by variation of the system, or by internal disturbances, needs to be considered. The trajectory model developed in this study can be used as a reference model that provides the desired output responses; these are compared with the actual responses recorded from the output system to calculate the error signal (**Figure 9**). In this case, the feedback error will be used as an external input signal and sent to adjust the properties of the inverse model. The influence of previous errors on the current output may be taken into account in a dynamic manner. In such a scheme, the optimal stimulation signals are calculated by the inverse plant model based on both the input and error signals.

## **ACKNOWLEDGMENTS**

This work was supported in part by the NSF (BMES ERC and BITS Program), DARPA (HAND Project), ONR (Adaptive Neural System Program), NIBIB, and the Brain Restoration Foundation.

### **REFERENCES**

… in brain interfaces. *Nat. Neurosci.* 5(Suppl), 1085–1088.

… review. *IEEE Trans. Rehabil. Eng.* 8, 205–208.

… disease of electrical parameter settings in STN stimulation. *Neurology* 59, 706–713.

… spinal cord injury and challenges for the future. *Acta Neurochir. Suppl.* 97(Pt 1), 419–426.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 August 2012; paper pending published: 27 September 2012; accepted: 30 January 2013; published online: 20 February 2013.*

*Citation: Hsiao M-C, Song D and Berger TW (2013) Nonlinear dynamical model based control of in vitro hippocampal output. Front. Neural Circuits 7:20. doi: 10.3389/fncir.2013.00020*

*Copyright © 2013 Hsiao, Song and Berger. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Controlling the oscillation phase through precisely timed closed-loop optogenetic stimulation: a computational study

#### *Annette Witt<sup>1</sup>, Agostina Palmigiano<sup>2</sup>, Andreas Neef<sup>3</sup>, Ahmed El Hady<sup>3</sup>, Fred Wolf<sup>3</sup>\* and Demian Battaglia<sup>2</sup>\**

*<sup>1</sup> Cognitive Neuroscience Department, German Primate Center, Bernstein Center for Computational Neuroscience, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany*

*<sup>2</sup> Bernstein Center for Computational Neuroscience, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany*

*<sup>3</sup> Max Planck Institute for Dynamics and Self-Organization, Bernstein Center for Computational Neuroscience and Bernstein Focus Neurotechnology, CRC-889 Cellular Basis of Sensory Processing, Göttingen, Germany*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Carmen Canavier, LSU Health Sciences Center, USA; Nathan Urban, Carnegie Mellon University, USA; Robert J. Butera, Butera Lab, Georgia Institute of Technology (Georgia Tech), USA*

#### *\*Correspondence:*

*Fred Wolf and Demian Battaglia, Max Planck Institute for Dynamics and Self-Organization, Am Faßberg 17, D-37077 Göttingen, Germany. e-mail: fred@nld.ds.mpg.de; demian@nld.ds.mpg.de*

Dynamic oscillatory coherence is believed to play a central role in flexible communication between brain circuits. To test this communication-through-coherence hypothesis, experimental protocols that allow a reliable control of phase-relations between neuronal populations are needed. In this modeling study, we explore the potential of closed-loop optogenetic stimulation for the control of functional interactions mediated by oscillatory coherence. The theory of non-linear oscillators predicts that the efficacy of local stimulation will depend not only on the stimulation intensity but also on its timing relative to the ongoing oscillation in the target area. Induced phase-shifts are expected to be stronger when the stimulation is applied within specific narrow phase intervals. Conversely, stimulations with the same or even stronger intensity are less effective when timed randomly. Stimulation should thus be properly phased with respect to ongoing oscillations (in order to optimally perturb them) and the timing of the stimulation onset must be determined by a real-time phase analysis of simultaneously recorded local field potentials (LFPs). Here, we introduce an electrophysiologically calibrated model of Channelrhodopsin 2 (ChR2)-induced photocurrents, based on fits holding over two decades of light intensity. Through simulations of a neural population which undergoes coherent gamma oscillations (either spontaneously or as an effect of continuous optogenetic driving), we show that precisely-timed photostimulation pulses can be used to shift the phase of oscillation, even at transduction rates smaller than 25%. We then consider a canonical circuit with two inter-connected neural populations oscillating at gamma frequency in a phase-locked manner. We demonstrate that photostimulation pulses applied locally to a single population can induce, if precisely phased, a lasting reorganization of the phase-locking pattern and hence modify functional interactions between the two populations.

**Keywords: oscillations, functional connectivity, modeling, closed-loop systems, optogenetic stimulation, phase response**

## **INTRODUCTION**

Neural activity of brain circuits at many scales has often been reported to display oscillatory components at different frequencies (Eckhorn et al., 1988; Gray et al., 1989; Kreiter and Singer, 1996; Tallon-Baudry et al., 1996; Roelfsema et al., 1997; Varela et al., 2001; Brovelli et al., 2004; Samonds and Bonds, 2004; Melloni et al., 2007; Buffalo et al., 2011). In particular, the *communication-through-coherence* hypothesis (Fries, 2005) suggests that oscillatory coherence between different neural circuits could control functional interactions between them with a high degree of flexibility (Womelsdorf et al., 2007). Moreover, evidence for a role of enhanced inter-areal oscillatory coherence in attentional modulation is rapidly accumulating (Fries et al., 2001; Gregoriou et al., 2009; Rotermund et al., 2009; Bosman et al., 2012; Gregoriou et al., 2012; Grothe et al., 2012).

The circuit mechanisms underlying the local generation of oscillations, specifically in the gamma range of frequencies (30–100 Hz), have been explored in studies *in vitro* (Whittington et al., 1995; Bartos et al., 2007) and *in silico* (Brunel and Hakim, 1999; Whittington et al., 2000; Brunel and Hansel, 2006; Wang, 2010). All of these contributions have highlighted the crucial role played by the interplay of GABAergic interneurons in creating time-windows in which excitatory and inhibitory neurons can fire in a sparsely synchronized manner, before being counteracted by strong and delayed feedback inhibition. More recently, the functional involvement of local inhibitory networks could be causally verified *in vivo* by targeted selective optogenetic stimulation of parvalbumin-positive basket cells in a cortical circuit (Cardin et al., 2006; Sohal et al., 2009).

In an analogous way, optogenetic techniques might be used for direct tests of the communication-through-coherence hypothesis and other suggested functional roles of brain oscillations, like their implication in phase coding (Lisman, 2005; Koepsell et al., 2010; Nadasdy, 2010; Kayser et al., 2012). For such applications, however, improved optogenetic stimulation protocols are needed that allow for precise control of the phase relations between different neuronal populations or assemblies, rather than a pure enhancement of oscillatory power.

Theoretical investigations suggest that, due to non-trivial phase response properties (Pikovsky et al., 2001) of oscillating neuronal populations (Akam et al., 2012), stimulation pulses might have a strong influence on local and long-range phase-relations, but only if properly timed with respect to the ongoing oscillatory dynamics (Tiesinga and Sejnowski, 2010; Battaglia et al., 2012). Application of phase-timed stimuli requires a real-time estimate of the phase from continuously recorded local field potential (LFP) data.

Optogenetic stimulation conditional on recorded activity constitutes a closed-loop setup. The advantage of closed-loop stimulation compared to open-loop stimulation is the possibility to program an artificial feedback with defined rules and constraints dependent on the target system's dynamical history. Closed loop electrical stimulation has been successfully used to clamp network activity (Wallach et al., 2011), to control the firing rate of neurons (Miranda-Dominguez et al., 2010), to control bursting activity (Wagenaar et al., 2005), and to train cultured neuronal networks (Marom and Shahaf, 2002). Closing the loop between living neurons and robotics has also been used to realize embodiment—by using representations generated by network activity either to control a robotic arm (Bakkum et al., 2007) or control autonomous systems (Bandyopadhyay, 2005)—or to study neuronal plasticity (Novellino et al., 2007).

In this study, we explore through a modeling approach the feasibility of closed-loop optogenetic control of the phase of a local oscillation and of inter-areal phase synchronization. To this end, we simulated the activity of populations of excitatory and inhibitory conductance-based neurons with random connectivity. To investigate the case where a sparse transduction with Channelrhodopsin 2 (ChR2) is achieved *in vitro* or *in vivo*, small fractions of these neurons were endowed with a newly developed and data-constrained conductance-based model of a light-activated channel. This case is of particular interest, since it has been shown that low transduction rates achieved through either particle-mediated gene transfer or via lipid reagents (Takahashi et al., 2012) can increase the spatial specificity of light stimulation (Schoenenberger et al., 2008). Our model, however, applies robustly also to the case of higher ChR2 transduction rates, such as those that can be achieved using viral transfection (Adamantidis et al., 2007; Aravanis et al., 2007), in utero electroporation (Petreanu et al., 2007) or in Thy1 (thymus cell antigen 1) transgenic mice (Wang et al., 2007).

Demonstrating the reliability of our model, we first simulated phase shifting of LFP oscillations with open-loop optogenetic stimulation, quantitatively reproducing and generalizing experimental results *in vitro* (Akam et al., 2012). We then moved to the analysis of a canonical cortical circuit with two interacting areas. Here, we simulated a realistic closed-loop stimulation protocol which was suited to trigger lasting changes of inter-areal phase relations and, correspondingly, to affect communication-through-coherence. Thus, we intend our modeling exploration to foster the implementation of a new generation of closed-loop optogenetic experiments *in vitro* and *in vivo* aiming at inducing distributed reorganization of functional interactions at the system level.

#### **MATERIALS AND METHODS**

#### **ChR2 PHOTOCURRENT EXPERIMENTAL CHARACTERIZATION**

Human embryonic kidney cells were transfected with a plasmid encoding a ChR2-YFP fusion protein. The pcDNA 3.1-ChR2-YFP construct was kindly provided by Ernst Bamberg (MPI for Biophysics, Frankfurt, Germany). After two to four days, successfully transfected cells were identified by their YFP fluorescence. In the whole-cell configuration, the membrane voltage was clamped to −60 mV. Channelrhodopsin's conductance was changed by 500 ms long light pulses. The conductance change was monitored as a time and light-intensity dependent current change (**Figure 1B**). In the case of cultured hippocampal neurons, cells were transfected at 7 DIV with AAV1/2-CAG-ChR2-YFP virus. After 1 week, successfully transduced cells could be identified by their YFP fluorescence.

Whole-field illumination was provided by an extended laser beam (488 nm). Light intensity was controlled by neutral density filters (optical density 1 and 2, respectively) and by means of the software provided for the laser. A comparison of the light-induced current waveforms for 90% attenuation by software and a neutral density filter with an optical density of 1.0 showed excellent agreement, indicating that the software produced the intended attenuation. The laser was switched using a built-in mechanical shutter with a response time in the μs range, achieved through the minute spatial extent of the beam.

#### **BIOPHYSICALLY CALIBRATED MODEL OF ChR2 PHOTOCONDUCTANCE**

Based on the results of the previously described experiment, we modeled the evoked photocurrents as the product of activation and inactivation functions. The current activation could be described by a single exponential function and the current inactivation by the sum of two exponential functions (see also **Figure 1B**). This light-induced conductance change could be well described by the functional form:

$$F_{\rm ChR2}(t) = A_{\rm act} \left( 1 - e^{-\frac{t - t_{\rm ON} - d}{\tau_{\rm act}}} \right) \cdot \left( A_{\rm persist} + A_{\rm inact}^{(1)} e^{-\frac{t - t_{\rm ON} - d}{\tau_{\rm inact}^{(1)}}} + A_{\rm inact}^{(2)} e^{-\frac{t - t_{\rm ON} - d}{\tau_{\rm inact}^{(2)}}} \right) \tag{1}$$

Here $d$ represents a latency observed between the time $t_{\rm ON}$ of light onset and the actual start of the conductance rise, and $A_{\rm persist}$ is set to $A_{\rm persist} = 1 - A_{\rm inact}^{(1)} - A_{\rm inact}^{(2)}$ in order to prevent the inactivation conductance factor from becoming negative. Note that Equation 1 holds true only as long as the light is switched on. After switching off the light, the response returns to baseline with a single exponential time course with time constant $\tau_{\rm OFF}$. When individual current responses were fitted, the latency $d$, the amplitude $A_{\rm act}$, the inactivating fractions $A_{\rm inact}^{(1)}$ and $A_{\rm inact}^{(2)}$, and the activation time constant $\tau_{\rm act}$ were found to be dependent on the light intensity $W_{\rm light}$. However, the time constants related to inactivation were almost unchanged for different light intensities. Therefore, for simultaneously fitting current responses evoked by different light intensities (ranging over two orders of magnitude), two global (i.e., light-independent) parameters $\tau_{\rm inact}^{(1)}$ and $\tau_{\rm inact}^{(2)}$ were used. In order to model the dependence on the light intensity of the other parameters ($d$, $\tau_{\rm act}$, $A_{\rm act}$, $A_{\rm inact}^{(1)}$, and $A_{\rm inact}^{(2)}$) we fitted the following functions to the recorded data:

$$d = d_A + d_B W_{\rm light} + \frac{d_C}{W_{\rm light}} \tag{2}$$

$$\tau_{\rm act} = \tau_{\rm act}^{(0)} + c_{\rm act} e^{-k_{\rm act} W_{\rm light}} \tag{3}$$

$$A_{\rm act} = a_0 + \frac{a_{\rm min} - 1}{1 + \left(W_{0.5} / W_{\rm light}\right)^2} \tag{4}$$

$$A_{\rm inact}^{(1)} = b_0 + \frac{b_1}{b_2 + \left(W_{\rm light} - W_{\rm inact}\right)^2} \tag{5}$$

$$A_{\rm inact}^{(2)} = c_{\rm inact} e^{-k_{\rm inact} W_{\rm light}} \tag{6}$$

All the parameters of Equations (2–5) are the result of least-squares fits. For Equation (6) $k_{\rm inact}$ has been set manually to assure convergence of the fitting procedure. All fitted parameters of the ChR2 conductance model, together with their standard deviations, are summarized in **Table 1**. Light intensity is measured relative to the maximum intensity $W_{\rm max}$ that can be achieved in our setup. A precise calibration of the absolute power density at the maximal intensity was not performed. We have estimated it to be approx. $W_{\rm max}$ = 130 mW/mm<sup>2</sup> for a continuous illumination, which is rather high compared to the 5–6 mW/mm<sup>2</sup> used by Ishizuka et al. (2006) and Ernst et al. (2008) and the maximum (around 20 mW/mm<sup>2</sup>) used in Nikolic et al. (2009).

**FIGURE 1 | Evoked ChR2 photocurrent: conductance-based model. (A)** Whole cell voltage clamp recording of a cultured neuron, transduced with Channelrhodopsin 2 (ChR2) and illuminated with LED light (during the time interval shown by a green horizontal bar). Two current intensity recordings have been performed, the first in a physiological solution, i.e., with all channels active (black curve), the second with TTX in the bath, i.e., with blocked Na-channels (red curve). When the Na channels are still active (black curve), even the voltage clamp (−70 mV) at the soma cannot prevent the cell from spiking. **(B)** Activation kinetics of the photo-induced conductance in human embryonic kidney cells (HEK-293) that are transfected with ChR2. For increasing light power density (100% $W_{\rm max}$ corresponds approximately to 130 mW/mm<sup>2</sup>) the activation becomes faster. Peak conductance increases from 0 to ∼10% of the maximal intensity and decreases for higher light intensities. Note the different scale of evoked currents in neurons and HEK cells. **(C)** Simulated photocurrents generated by the conductance-based model described by Equation (1), for different light intensities (expressed relative to maximum illumination intensity) and for a rectangular shaped light pulse stimulation with a duration of 3 ms. Model parameters and their dependence on light intensity (see **Table 2**) are obtained from fits to photoconductance recordings analogous to the one shown in panel **(B)**, performed for different light intensities. For short light pulses as used here, the experiments indicate that the largest conductances are obtained for light intensities between 10% and 50% (interpolation of the simulated photocurrent results in an optimal value of 18% of the maximum light intensity).
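The conductance waveform of Equation (1), together with the single-exponential return to baseline after light offset, can be sketched numerically as follows. This is a minimal illustration and not the authors' implementation: the function name `chr2_waveform` and all numerical parameter values are hypothetical placeholders (they are not the fitted values of Table 1), and the sketch assumes that the decay after light offset starts from the value reached at the moment the light is switched off.

```python
import numpy as np

def chr2_waveform(t, t_on, t_off, d, A_act, tau_act,
                  A_inact1, tau_inact1, A_inact2, tau_inact2, tau_off):
    """Conductance waveform F_ChR2(t) for one rectangular light pulse (times in ms)."""
    t = np.asarray(t, dtype=float)
    A_persist = 1.0 - A_inact1 - A_inact2  # constraint stated in the text

    def light_on(u):
        # u = t - t_ON - d, clipped at zero so the waveform vanishes before the rise.
        u = np.maximum(u, 0.0)
        activation = A_act * (1.0 - np.exp(-u / tau_act))
        inactivation = (A_persist
                        + A_inact1 * np.exp(-u / tau_inact1)
                        + A_inact2 * np.exp(-u / tau_inact2))
        return activation * inactivation

    F_on = light_on(t - t_on - d)
    # Assumption: after light offset the response decays from its value at t_off.
    F_at_offset = light_on(np.asarray(t_off - t_on - d, dtype=float))
    decay = np.exp(-np.maximum(t - t_off, 0.0) / tau_off)
    return np.where(t > t_off, F_at_offset * decay, F_on)

# Hypothetical placeholder parameters (NOT the fitted values of Table 1).
params = dict(d=0.2, A_act=1.0, tau_act=0.5,
              A_inact1=0.4, tau_inact1=10.0,
              A_inact2=0.2, tau_inact2=100.0, tau_off=10.0)

# A 3 ms rectangular pulse starting at t = 1 ms, sampled every 0.01 ms.
time = np.arange(0.0, 60.0, 0.01)
waveform = chr2_waveform(time, t_on=1.0, t_off=4.0, **params)
```

In a full model the light-dependent parameters would themselves be computed from Equations (2–6) before calling such a function.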

#### **ChR2-TRANSDUCED NEURONAL POPULATIONS MODEL**

A local neuronal population was modeled as a random network of *NE* = 4000 excitatory and *NI* = *NE/*4 = 1000 inhibitory conductance-based model neurons of the Wang-Buzsáki (WB) type (Wang and Buzsáki, 1996). The WB model describes a single compartment neuron endowed with sodium and potassium currents. The membrane potential follows the equation:

$$C\frac{dV}{dt} = -I_L - I_{\rm Na} - I_K + I_{\rm syn} + I_{\rm noise} + \kappa I_{\rm ChR2} \tag{7}$$

where $C$ is the capacitance of the neuron, $I_L = g_L(V - V_L)$ is the leakage current, $I_{\rm syn}$ reflects recurrent interactions with other neurons in the network, $I_{\rm noise}$ models the driving exerted by background noise and $I_{\rm ChR2}$ is the photocurrent induced by external light stimulation. Sodium and potassium currents are voltage-dependent and given by $I_{\rm Na} = g_{\rm Na} m_\infty^3 h (V - V_{\rm Na})$ and $I_K = g_K n^4 (V - V_K)$. The activation of the sodium current was modeled as instantaneous. We used sodium and potassium current voltage-dependent activation and inactivation functions as given in Wang and Buzsáki (1996).

*Parameters to simulate time and light-intensity dependent conductance changes mediated by channelrhodopsin 2. Errors are sample standard deviations. Parameters returned from the global fit procedure do not have a measure of uncertainty. See section Materials and methods for the model description.*

The synaptic current evoked by a single presynaptic action potential was given by $I_{\rm spike}(t) = -g_\alpha s_{\rm spike}(t)(V - V_\alpha)$, where the reversal potential $V_\alpha$ of the synapse is 0 mV for excitatory AMPA synapses ($\alpha = E$) and −80 mV for inhibitory GABA synapses ($\alpha = I$). The time-course of the postsynaptic conductance was described as a difference of exponentials:

$$s_{\rm spike}(t) \propto \left( e^{-(t + d_{\rm syn} - t_{\rm spike})/\tau_{\rm rise}} - e^{-(t + d_{\rm syn} - t_{\rm spike})/\tau_{\rm decay}} \right) \tag{8}$$

for *t > t*spike, 0 otherwise, where *t*spike is the time of the presynaptic spike, *d*syn is a combined conduction and synaptic delay, and τrise and τdecay are respectively the rise- and decay time constants. The normalization constant of *s*spike*(t)* was chosen such that its peak value is equal to 1. The peak conductances of all excitatory and inhibitory synapses were set to *gE* and *gI*, respectively. The total recurrent current *I*syn*(t)* was then given by the sum of the contributions *I*spike*(t)* from all presynaptic spikes fired before time *t*.
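A unit-peak difference-of-exponentials time course of this kind can be sketched as below. The function name `synaptic_kernel` and the numerical values are hypothetical. Two conventions are assumptions of this sketch rather than statements of the source: the kernel is taken to start a delay `d_syn` after the presynaptic spike, and the sign of the difference is chosen so that the kernel is positive when `tau_decay > tau_rise`. The normalization uses the analytical location of the peak.

```python
import numpy as np

def synaptic_kernel(t, t_spike, d_syn, tau_rise, tau_decay):
    """Unit-peak difference-of-exponentials conductance time course.

    Requires tau_decay > tau_rise. Times are in ms.
    """
    # Assumption: the conductance starts to rise d_syn after the presynaptic spike.
    u = np.asarray(t, dtype=float) - t_spike - d_syn
    u_pos = np.maximum(u, 0.0)
    raw = np.exp(-u_pos / tau_decay) - np.exp(-u_pos / tau_rise)
    # Analytical peak time of the raw kernel, used to normalize its maximum to 1.
    u_peak = (tau_rise * tau_decay / (tau_decay - tau_rise)
              * np.log(tau_decay / tau_rise))
    peak = np.exp(-u_peak / tau_decay) - np.exp(-u_peak / tau_rise)
    return np.where(u > 0.0, raw / peak, 0.0)

# Hypothetical time constants, for illustration only.
time = np.arange(0.0, 50.0, 0.001)
kernel = synaptic_kernel(time, t_spike=0.0, d_syn=0.5, tau_rise=1.0, tau_decay=5.0)
```

The total recurrent current would then be obtained by summing one such kernel per presynaptic spike, each scaled by the peak conductance and the driving force.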

The background noise input *I*noise to each neuron was modeled as an additional synaptic current induced by statistically independent Poisson trains of excitatory spikes with a common firing rate νnoise and a peak conductance *g*noise.

Excitatory and inhibitory neurons in the populations were transduced by ChR2 with the same probability, given by the transduction rate $P_{\rm ChR2}$. The photocurrent prefactor κ was set to 1 and 0 respectively for transduced and non-transduced neurons. The induced photocurrent was given by $I_{\rm ChR2}(t) = -g_{\rm ChR2} F_{\rm ChR2}[W_{\rm light}(t)] (V - V_{\rm ChR2})$. The conductance waveform $F_{\rm ChR2}(t)$ given by Equation (1), which depends on the applied waveform $W_{\rm light}(t)$ of the optical stimulation, was multiplied by a prefactor $g_{\rm ChR2}$, such that the peak photocurrent evoked by a pulse with optimal light intensity in the used model neurons (simulated at resting potential) was 2 nA. The reversal potential was $V_{\rm ChR2} \cong 0$.

Excitatory neurons established synapses with other excitatory or inhibitory neurons within the same local circuit with probability $P_E$, inhibitory neurons with probability $P_I$. In addition, when considering multiple interconnected local areas, excitatory neurons within a local circuit established long-range connections with excitatory or inhibitory neurons in a remote local area with a probability $P_E^{(lr)}$.
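The random local connectivity and the sparse ChR2 transduction described above can be sketched as follows. The function name `build_network`, the reduced population sizes and the value used for the excitatory connection probability are hypothetical choices made for illustration. Excluding self-connections is also an assumption of this sketch, since the text does not state how they are treated.

```python
import numpy as np

def build_network(n_exc, n_inh, p_exc, p_inh, p_chr2, seed=0):
    """Random local connectivity and ChR2 transduction mask.

    Returns conn, where conn[i, j] is True if presynaptic neuron j projects to
    postsynaptic neuron i, and kappa, the photocurrent prefactor (1 or 0).
    Neurons 0..n_exc-1 are excitatory, the remaining ones inhibitory.
    """
    rng = np.random.default_rng(seed)
    n = n_exc + n_inh
    # The connection probability depends only on the type of the presynaptic neuron.
    p_pre = np.concatenate([np.full(n_exc, p_exc), np.full(n_inh, p_inh)])
    conn = rng.random((n, n)) < p_pre[np.newaxis, :]
    np.fill_diagonal(conn, False)  # assumption: no self-connections
    # Every neuron is transduced with the same probability, whatever its type.
    kappa = (rng.random(n) < p_chr2).astype(float)
    return conn, kappa

# Reduced network (the study uses 4000 excitatory and 1000 inhibitory neurons);
# p_exc is a hypothetical value, p_inh is the reference value quoted in the text.
conn, kappa = build_network(n_exc=400, n_inh=100, p_exc=0.1, p_inh=0.3, p_chr2=0.25)
```

For the full network size a sparse representation would be preferable to a dense boolean matrix.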

#### **ADOPTED PARAMETERS AND OSCILLATORY SYNCHRONY**

The neuronal population model described in the previous section can generate two qualitatively different dynamical regimes, characterized by different degrees of oscillatory coherence. The network resides in one or the other regime depending both on the drive to the network, controlled in this study by varying the background firing rate νnoise, and on the strength of local inhibitory interactions, controlled in this study by varying the probability of inhibitory connection *PI*.

The single neuron and network parameters used for all simulations are summarized in **Table 2**. However, we note that qualitatively similar dynamical features, in particular the existence of a smooth transition between a weakly and a strongly synchronous oscillatory regime, would be obtained for a broad range of parameters, with the frequency of the collective oscillation primarily determined by the synaptic time constants τrise and τdecay (Brunel and Wang, 2003). We also find that the transition toward strong synchrony tends to get sharper with increasing network size [not shown, but see also Brunel and Hakim (1999)].

Synchronization of the population activity was quantified through the synchronization index χ (Golomb and Hansel, 2000):

$$\chi = \frac{\sigma_{\rm LFP}^2}{\langle \sigma_{V_i}^2 \rangle} \tag{9}$$

given by the ratio between the variance of the average membrane potential of all excitatory and inhibitory neurons in the local population (here conventionally referred to as the "LFP" signal) and the average variance of the membrane potentials $V_i$ of individual neurons in the population. The synchronization index χ is bounded in the unit range, χ = 0 meaning asynchronous and χ = 1 fully synchronous dynamics.

**Table 2 | Parameters of the spiking neuronal network model.**

*Parameters to simulate the activity of transduced neuronal populations (see section Materials and methods for the model description).*
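The synchronization index of Equation (9) is straightforward to compute from simulated membrane potentials. The sketch below follows the definition as written in the text (a ratio of variances); the function name `synchrony_index` and the synthetic test signals are illustrative assumptions.

```python
import numpy as np

def synchrony_index(v):
    """Synchronization index chi of Equation (9).

    v has shape (n_neurons, n_samples) and holds membrane potential traces.
    """
    v = np.asarray(v, dtype=float)
    lfp = v.mean(axis=0)                      # population-averaged potential ("LFP")
    return lfp.var() / v.var(axis=1).mean()   # variance of LFP / mean single-cell variance

rng = np.random.default_rng(1)
time = np.arange(0.0, 1.0, 0.001)
common = np.sin(2 * np.pi * 40.0 * time)

# Fully synchronous toy population: every neuron follows the same trace.
chi_sync = synchrony_index(np.tile(common, (200, 1)))
# Asynchronous toy population: independent noise in every neuron.
chi_async = synchrony_index(rng.standard_normal((200, 5000)))
```

For independent traces the index is of order 1/N, which is why it approaches 0 only for large populations.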

The dependency of firing rate of excitatory and inhibitory neurons, of the collective oscillation frequency and of the synchrony level χ was studied by systematically varying the parameters νnoise in the range between 2 and 6 kHz and *PI* between 0.2 and 0.6 (the reference values, tabulated in **Table 2**, being νnoise = 3 kHz and *PI* = 0.3). All the quantities were evaluated over simulated time-series lasting 20 s of real time.

#### **ANALYSIS OF PHASE RESPONSE**

Although the simulation generates spike trains for all neurons, we focus here on alterations of the ongoing collective activity and, therefore, on the oscillating LFP signal. A single rectangular-shaped light pulse with a given intensity *W*light and duration *T*light was applied to the considered network at a specific time of application *t*ON. For different values of *W*light and *T*light, we tested the effects of a total of 1500 different light onset times *t*ON, distributed uniformly over a time interval of approximately 50 oscillation periods. Averaging over multiple periods was required because of stochastic fluctuations of the period length. For each stimulation pulse, the activity of the network was further simulated over 60 oscillation cycles following the perturbation.

In every simulation run, the initial conditions, the network topology and the background noise were kept identical, in order to exclusively study the dependence of the induced perturbation on the parameters of the light stimulation and its application time. Pairs of LFP time series were thus generated consisting of a time series after the application of a photostimulation and a time series of the corresponding unperturbed neural dynamics. For every such pair of time series, instantaneous phase values were extracted using a Hilbert transform (Gabor, 1946), an approach extensively used for investigating phase dynamics and synchronization of non-linear oscillators (Pikovsky et al., 2001). The induced phase shift was then measured by averaging the phase difference φ between the perturbed and the unperturbed LFPs over the last 50 recorded oscillation cycles. A transient of 10 oscillation cycles immediately following *t*ON was discarded to ignore transient effects caused by the applied light pulse. The times of perturbation application *t*ON were translated into phases and binned into 30 equally sized phase bins. The observed phase shifts φ were averaged over each bin and plotted as a function of the phase of perturbation application φ(*t*ON) for different light intensities *W*light and perturbation pulse durations *T*light, and also for networks with different transduction rates *P*ChR2.

The dependency of phase responses on varying values of light intensity, pulse duration and timing of the perturbation was investigated for a specific realization of the network random connectivity. We have repeated our analysis for three different random realizations of connectivity (with the same homogeneous probabilities of connection, *PI* and *PE*). The corresponding phase responses to light stimuli were qualitatively and quantitatively very similar (not shown). In particular, differences between random network instances were of the same order of magnitude as the error bars shown in **Figure 4**, corresponding to fluctuations of the phase response over time for the same connectivity realization. These similarities are not surprising and match theoretical expectations, since dynamical effects arising from fluctuations due to finite-size connectivity are small for the large network size adopted here (Golomb and Hansel, 2000). Therefore, we can conclude that our results hold in general for random networks with the same (in a probabilistic sense) connectivity features.

#### **ANALYSIS OF PHASE LOCKING CHANGES**

If two coupled neuronal populations are simulated with the parameters given in **Table 2**, the oscillations of the two LFPs self-organize in a phase-locked configuration. The temporarily stable relative phase difference, φ, can have two different values: φlocked or 1−φlocked (phases are measured over the cyclic unit interval 0 ≤ φ ≤ 1). Both phase-locking values correspond to out-of-phase configurations in which either of the two populations leads in phase over the other.

In our simulations, only one of the two local neuronal populations was transduced with ChR2. We applied light stimulation pulses to this transduced population, with a light intensity *W*light = 20% (expressed as the percentage of the maximum possible light intensity of our setup *W*max) and a pulse duration of *T*light = 3 ms. Similar to the protocol used for the phase response analysis of a single population, 1500 different pulse onset times, *t*ON, were used, which were uniformly distributed over 50 oscillation cycles. Starting from random initial conditions, no perturbation was applied for the first 100 oscillation cycles, to ensure complete convergence to a stable phase-locked attractor. Without loss of generality, we considered the configuration in which the phase of the transduced population leads over that of the non-transduced population (i.e., in which the stable inter-circuit phase difference is close to φlocked before the perturbation).

Variations of the phase-difference between the two populations were measured in two different time-windows. We first studied the short-term effects of the light stimulation, by averaging the instantaneous Hilbert phase difference over the first 5 oscillation cycles after the perturbation. Binning different onset times according to the corresponding phase of application of the perturbation (as done for the estimation of single population phase response), we quantified the probability *P*shifting*(*φ*)*, that a light pulse induces a relative variation of more than 10% (reduction or increase) of the inter-population phase-difference. For each application phase bin, *P*shifting*(*φ*)* was compared with the probability of observing similarly large spontaneous fluctuations of φ in the unperturbed activity of the same network.

We then studied longer term effects of the light stimulation by averaging the difference of the instantaneous Hilbert phases over the 50 cycles that follow the ten omitted oscillation cycles directly after stimulation. The aim of this long-term analysis was to assess the occurrence of a switching from the phase-locking pattern with phase-difference close to φlocked toward the other phase-locking pattern with phase difference close to 1 − φlocked. Once again binning onset times according to the corresponding phase of perturbation application, we quantified the probability *P*switching(φ) that the long-term averaged phase difference was within a tolerance interval of 1 − φlocked ± δ, with δ = 0.05 (i.e., the transduced population switched steadily from the role of phase leader to phase laggard). For each phase bin, *P*switching(φ) was compared to the probability of observing a spontaneous switching of the phase locking (from φlocked to 1 − φlocked) over an equivalent time span of 50 cycles, based on time-series of the unperturbed dynamics of the same network.

The probabilities *P*shifting*(*φ*)* and *P*switching*(*φ*)* were finally plotted as polar histograms with ten equally-spaced bins for the phase of the onset of the light stimulation φ*(t*ON*)*, in which the corresponding probabilities of spontaneous shifting or switching were also reported in order to identify phase bins in which the effects induced by the perturbation pulse were significantly low or high (**Figure 5**).
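The estimation of the switching probability per application-phase bin can be sketched as below. The function names, the value of φlocked and the four synthetic trials are hypothetical. Measuring the tolerance interval as a distance on the cyclic unit interval is an assumption of this sketch, and bins that contain no trial are returned as NaN.

```python
import numpy as np

def circular_distance(a, b):
    """Distance between phases a and b on the cyclic unit interval."""
    diff = (np.asarray(a, dtype=float) - b) % 1.0
    return np.minimum(diff, 1.0 - diff)

def switching_probability(onset_phase, longterm_dphi, phi_locked,
                          delta=0.05, n_bins=10):
    """Fraction of trials per onset-phase bin that end within delta of 1 - phi_locked."""
    onset_phase = np.asarray(onset_phase, dtype=float) % 1.0
    switched = circular_distance(longterm_dphi, 1.0 - phi_locked) <= delta
    bins = np.minimum((onset_phase * n_bins).astype(int), n_bins - 1)
    counts = np.bincount(bins, minlength=n_bins)
    hits = np.bincount(bins, weights=switched.astype(float), minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, hits / counts, np.nan)

# Four synthetic trials: application phase of the pulse and long-term phase difference.
onset_phase = [0.05, 0.05, 0.55, 0.55]
longterm_dphi = [0.30, 0.70, 0.71, 0.68]
p_switching = switching_probability(onset_phase, longterm_dphi, phi_locked=0.3)
```

The same binning applied to unperturbed time series would give the spontaneous switching probability used as the comparison baseline.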

#### **ONLINE PHASE PREDICTION**

A closed-loop approach (**Figure 6**) is necessary to estimate a time *t*ON which corresponds to a future occurrence of a given target phase φtarget, leading to the largest possible probability of switching of the inter-areal locking (**Figure 5**).

To study the feasibility of such an approach, we modeled its implementation, considering the same bi-areal network used to characterize induced switching between phase-locked states (see previous section and **Figure 5**). Simulated LFPs were recorded from both the stimulation target area and a second coupled area. However, the calculations performed online involved only the LFP time-series $V(t)$ recorded in the target area. The time-series $\tilde{V}(t)$ of the second area were recorded and analyzed offline to determine phase-locking patterns before and after the stimulation.

We approximated the "true" Hilbert phase $\varphi_H(t)$ associated to $V(t)$ by a linearly interpolated phase. This approximation could be simply done by interpolating a variable $\varphi_L(t)$ that was linearly growing in the unit interval $0 \le \varphi_L < 1$ between any two times $t_k$ and $t_{k+1}$ delimiting an oscillation cycle. As shown by **Figure 7B**, the phase variables $\varphi_H(t)$ and $\varphi_L(t)$ are related by a mildly non-linear map, described as a static non-linearity $\varphi_H = f_{LH}(\varphi_L)$. However, we systematically ignored this non-linearity in the following by approximating $\varphi_H(t)$ directly by $\varphi_L(t)$.
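The linearly interpolated phase can be sketched as follows, taking the times that delimit the oscillation cycles as given. How those cycle boundaries are detected is not specified in this passage, so they are treated here as an input; the function name `linear_phase` and the example times are hypothetical.

```python
import numpy as np

def linear_phase(t, cycle_starts):
    """Phase growing linearly from 0 to 1 between consecutive cycle start times.

    Times outside the range covered by complete cycles yield NaN.
    """
    t = np.asarray(t, dtype=float)
    cycle_starts = np.asarray(cycle_starts, dtype=float)
    # Index k of the cycle containing each time, i.e. t_k <= t < t_{k+1}.
    k = np.searchsorted(cycle_starts, t, side="right") - 1
    valid = (k >= 0) & (k < cycle_starts.size - 1)
    k_safe = np.clip(k, 0, cycle_starts.size - 2)
    t0 = cycle_starts[k_safe]
    t1 = cycle_starts[k_safe + 1]
    return np.where(valid, (t - t0) / (t1 - t0), np.nan)

# Two cycles of unequal duration (20 ms and 25 ms), for illustration.
phases = linear_phase([10.0, 20.0, 32.5, 50.0, -1.0], cycle_starts=[0.0, 20.0, 45.0])
```

Because the phase is defined per cycle, fluctuations of the period length only change the slope within each cycle and never produce phase jumps at the boundaries.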

The workflow for the prediction of the perturbation onset time *t*<sub>ON</sub> is split up into multiple stages (**Figure 6**). First of all, it was necessary, during a *testing stage*, to detect the presence of sufficiently strong local oscillations and to measure their average frequency *f*<sub>peak</sub>. It was then important to monitor the characteristics of LFP oscillations (band-passed around *f*<sub>peak</sub>) in the stimulation target area (*monitoring stage*) and to extract, based on observations of past activity, a model able to approximately predict future phase evolution (*prediction stage*).

Even in the ideal case of an elevated synchrony index χ and sustained oscillations, the duration of oscillation periods *T<sub>i</sub>* fluctuated from cycle to cycle around their average *T̄* (cf. **Figure 7A**). Let us suppose that the last oscillation period recorded in the monitoring stage was *T<sub>k</sub>* = *t<sub>k</sub>* − *t*<sub>*k*−1</sub> and that the prediction stage lasts (less than) *s* oscillation cycles. Neglecting correlations between period lengths of consecutive cycles, the time of beginning of the next cycle after the end of the prediction stage could be estimated via a simple *linear extrapolation*:

$$t\_{k+s}^{(0)} = t\_k + s\bar{T} \tag{10}$$

However, for our network model, the temporal autocorrelation function of period lengths *T<sub>i</sub>*, *i* = 1, ..., *k* displayed a fast but not instantaneous decay for increasing lags (measured in oscillation cycles). These weak, positive correlations between consecutive cycle durations could be well captured by a *first order autoregressive process* [AR(1)], *T<sub>i</sub>* = *T̄* + *a*(*T*<sub>*i*−1</sub> − *T̄*) + ε<sub>*i*</sub>, with *T̄* the average oscillation period over the monitoring time-window, *a* the AR(1) coefficient and ε<sub>*i*</sub> an i.i.d. Gaussian distributed residual noise term (Brockwell and Davis, 1996). With this AR(1) model, the beginning of the next cycle was estimated as:

$$t\_{k+s}^{(1)} = t\_k + s\bar{T} + \left(\frac{a^{s+1} - a}{a - 1}\right) \cdot \left(T\_k - \bar{T}\right) \tag{11}$$

The AR(1) coefficient was derived as:

$$a = \frac{k}{k-1} \frac{\sum\_{i=1}^{k-1} \left(T\_i - \bar{T}\right) \left(T\_{i+1} - \bar{T}\right)}{\sum\_{i=1}^{k} \left(T\_i - \bar{T}\right)^2} \tag{12}$$

based on the periods *T<sub>i</sub>*, *i* = 1, ..., *k*, measured during the monitoring stage and on their average duration *T̄*.
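The two extrapolation schemes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original implementation: the function names are hypothetical, and the AR(1) correction is applied to the deviation of the last period from the mean, *T<sub>k</sub>* − *T̄*, which is what the AR(1) model stated above implies for the expected durations of the next *s* cycles.

```python
def ar1_coefficient(periods):
    """Estimate the AR(1) coefficient a from period lengths (Eq. 12)."""
    k = len(periods)
    mean = sum(periods) / k
    num = sum((periods[i] - mean) * (periods[i + 1] - mean)
              for i in range(k - 1))
    den = sum((p - mean) ** 2 for p in periods)
    return (k / (k - 1)) * num / den

def predict_linear(t_k, mean_period, s):
    """Linear extrapolation of the start of cycle k + s (Eq. 10)."""
    return t_k + s * mean_period

def predict_ar1(t_k, mean_period, last_period, a, s):
    """AR(1) forecast of the start of cycle k + s.

    Assumption of this sketch: the geometric factor multiplies the
    deviation of the last period from the mean period.
    """
    # Sum of a**j for j = 1..s, written as in Eq. 11 when a != 1.
    geom = float(s) if a == 1.0 else (a ** (s + 1) - a) / (a - 1.0)
    return t_k + s * mean_period + geom * (last_period - mean_period)
```

For *a* = 0 the AR(1) forecast reduces to the linear extrapolation, which is the regime in which the simpler scheme is preferable.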

Spectral analysis of LFPs recorded in the stimulation target area and in a second coupled area was performed during the testing stage. A windowed Fast Fourier Transform (FFT) was applied to demeaned chunks of the LFP signal, to extract a rough estimate of the instantaneous power spectrum. When the power at some frequency *f*peak in the gamma range exceeded a determined threshold in both recorded areas, the monitoring stage started.
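A minimal Python sketch of the spectral test on one LFP chunk is given below. Several choices are assumptions made for illustration only: the Hann window, the band limits of 30 to 100 Hz, the function name `gamma_peak`, and the use of a plain discrete Fourier sum in place of an FFT routine.

```python
import cmath
import math

def gamma_peak(chunk, fs_hz=1000.0, band=(30.0, 100.0)):
    """Return (frequency, power) of the strongest spectral bin in the band."""
    n = len(chunk)
    mean = sum(chunk) / n
    # Demean the chunk and apply a Hann window.
    x = [(v - mean) * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
         for i, v in enumerate(chunk)]
    best_f, best_p = None, 0.0
    for m in range(1, n // 2):
        f = m * fs_hz / n
        if not band[0] <= f <= band[1]:
            continue
        # Direct evaluation of one discrete Fourier coefficient.
        c = sum(xi * cmath.exp(-2j * math.pi * m * i / n)
                for i, xi in enumerate(x))
        p = abs(c) ** 2 / n
        if p > best_p:
            best_f, best_p = f, p
    return best_f, best_p
```

In the protocol described above, the monitoring stage would start once the returned power exceeds the chosen threshold in the chunks of both recorded areas.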

During the monitoring stage, a computationally efficient low-order recursive time domain filter (Percival and Walden, 1993) was applied to clean the oscillating LFP signals. The filtered time-series was computed online as:

$$V\_{\text{filtered}}(t) = V(t) + \alpha\_1 V\_{\text{filtered}}(t-1) + \alpha\_2 V\_{\text{filtered}}(t-2) \tag{13}$$

Filter coefficients were chosen as α<sub>2</sub> = −0.99 and α<sub>1</sub> = 4α<sub>2</sub> cos(2π(1 − *f*<sub>peak</sub>))/(1 − α<sub>2</sub>) (assuming a sampling rate of 1 kHz). The pass frequency was then equal to *f*<sub>peak</sub> and the main frequency of the activity of recorded areas was maintained. The LFP time-series *V*(*t*) and *Ṽ*(*t*) recorded during the monitoring stage were stored. An analysis of the inter-areal phase-locking pattern before stimulation was then performed offline, while the closed-loop experiment was continuing. A monitoring stage including approximately 20 oscillation cycles was found to be sufficiently long to achieve accurate model estimation.
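The recursion of Eq. 13 can be sketched in Python as follows. Note the assumption made here: instead of the coefficient expression quoted above, the sketch uses the standard pole-placement rule for a two-pole resonator (poles at radius √(−α<sub>2</sub>) and angle 2π*f*<sub>peak</sub>/*f*<sub>s</sub>, with *f*<sub>s</sub> the sampling rate). The function names are hypothetical.

```python
import math

def resonator_coefficients(f_peak_hz, fs_hz=1000.0, alpha2=-0.99):
    """Coefficients of a two-pole resonator centered near f_peak_hz.

    Assumption of this sketch: standard pole placement, not the
    coefficient expression given in the text.
    """
    r = math.sqrt(-alpha2)  # pole radius, close to 1 for a narrow band
    alpha1 = 2.0 * r * math.cos(2.0 * math.pi * f_peak_hz / fs_hz)
    return alpha1, alpha2

def recursive_filter(v, alpha1, alpha2):
    """Apply y[t] = v[t] + alpha1 * y[t-1] + alpha2 * y[t-2] (Eq. 13)."""
    out = []
    y1 = y2 = 0.0  # filter state, initially at rest
    for x in v:
        y = x + alpha1 * y1 + alpha2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out
```

Each output sample costs two multiplications and two additions, which is why such a filter suits online use; with α<sub>2</sub> = −0.99 the transient decays over a few hundred samples at 1 kHz.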

The limited amount of fast computations to be performed during the prediction stage is summarized as follows:


After the application of the perturbation pulse, the LFPs of both areas were recorded and stored. An analysis of the inter-areal phase-locking pattern after stimulation was then performed offline and compared to the phase-locking assessed before stimulation. In case of failed switching, either the same linear model was used to extrapolate directly the time *t*<sub>ON</sub> of a further stimulation pulse, or a new testing stage was initiated, verifying that oscillations were still ongoing or waiting for the next oscillatory epoch to begin.

The decision between a prediction scheme based on the AR(1) model and a simpler linear extrapolation scheme depends ultimately on the correlation statistics of the series of period lengths. It can be shown that the prediction error of the estimated phase is reduced by the AR(1) prediction scheme compared to linear extrapolation by a maximal amount of 100%/√(1 − *a*<sup>2</sup>) (and by exactly this amount for Gaussian distributed samples). If the AR(1) parameter *a* estimated from the recordings during the monitoring window is small (as a rule of thumb, *a* < 0.3), then the performance improvement is negligible and advantage can be taken of the faster computation of the simpler linear extrapolation. Unfortunately, this criterion requires the evaluation of *a*. Nevertheless, the analysis of **Figure 7E** indirectly suggests that the AR(1) coefficient depends non-monotonically on the synchrony level, and that it increases going from low to intermediate synchrony indices χ, but drops again going toward higher χ. The choice of a high power threshold during the testing stage guarantees a high level of synchrony and, therefore, small values of *a* during the monitoring stage. This allows one to adopt the computationally faster step (5) instead of (4). However, a tradeoff should be made between the need for a fast prediction and the probability of detecting a number of oscillatory epochs sufficient for meeting the testing stage criteria.

## **RESULTS**

#### **DATA-CONSTRAINED MODEL OF ChR2-PHOTOCURRENT**

In order to assess from *in silico* experiments the efficacy of optogenetic stimulation in inducing changes of local phase or of inter-areal phase relations, we first derived a realistic and fully data-constrained model of the evoked ChR2 conductance. To do so, we first performed an experimental characterization of photocurrents evoked in living cells *in vitro* by light stimulation over a broad range of light intensities spanning two decades of power (see section Materials and Methods). Then, based on this systematic set of measurements, we fitted to the whole dataset a unique conductance-based model that describes the evoked time-dependent photocurrent, and hence the conductance, as the product of activation and inactivation factors.

The light-activated ChR2 ion channel mediates a current that is carried mostly by Na<sup>+</sup>, K<sup>+</sup>, and H<sup>+</sup> with contributions of Ca<sup>2+</sup>. Its reversal potential is typically around 0 mV and therefore it is depolarizing at neuronal resting potential. We found that upon illumination onset, a current built up with a nearly exponential time course with a time constant τ<sub>act</sub> ranging from 10 ms, for very weak light intensities that barely evoked any current response, to below 1 ms for high intensities. For a large range of intensities the current displayed a transient behavior and its amplitude, after reaching a peak, decayed over tens of milliseconds to reach a plateau. This inactivation behavior was biphasic and its time constants were not dependent on light intensity, unlike the activation time constant. Finally, when the light was switched off, the current decayed back to baseline with a time course that was well described by a single exponential with a 10 ms time constant.

**Figure 1A** depicts inward currents induced by a light pulse of moderate intensity (approximately 3 mW/mm<sup>2</sup> for 10 ms) in a cultured hippocampal neuron transduced with ChR2. Even such a weak light pulse was able to elicit an action current, as the axon escaped the voltage-clamp (**Figure 1A**, black line). The ChR2 photocurrent could be isolated by blocking Na-channels with tetrodotoxin (**Figure 1A**, red line).

To achieve an improved characterization of the photocurrent time-course, we systematically analyzed recordings from (non-spiking) transfected kidney cells (**Figure 1B**) using a very large range of light power densities for the characterization of ChR2 activation and inactivation kinetics. We found that the peak and steady state photocurrent do not increase monotonically with light power density. A maximal peak current is achieved around 10–20% of the maximum power density (see section Discussion). For applications where such power densities can be attained, for instance with a laser or a strongly focused LED, a careful tuning of the applied light intensity could thus potentially reduce the minimum transduction rate needed to efficiently drive the local oscillations in a target area.

As detailed in section Materials and Methods, it was possible to capture the time-course of the evoked ChR2 current with a single conductance-based model with light-dependent parameters. The simulated photocurrents generated by the model in response to a single square pulse of light lasting 3 ms are shown in **Figure 1C** for various light intensities (corresponding to the typical short pulse length used in the simulations of next sections). As evident from **Figure 1C**, our data-constrained model was able to capture the non-monotonic dependence of peak photocurrent on the light intensity, leading to the largest peak photocurrent for a light intensity of approximately 18% of the largest deliverable intensity *W*<sub>max</sub>.

#### **SPIKING NETWORK MODELS OF TRANSDUCED OSCILLATING AREAS**

To study the response to light stimulation of systems involving transfected neuronal areas, we simulated the activity of simple canonic circuits composed of just one local area or of two local areas mutually coupled with equal strength. Each area was modeled as a large network of randomly interconnected excitatory and inhibitory neurons. As shown in **Figure 2A**, a fraction of these excitatory and inhibitory model neurons were equipped with ChR2 photoconductances, inducing depolarization in response to simulated light stimulation.

For most of the analyses reported in this study, we adopted within each local area strong and delayed inhibition and a sufficiently strong background drive (see **Table 2**). With such a choice of parameters, local circuits underwent, through an "ING" type (i.e., "interneuron-generated") mechanism (Whittington et al., 2000; Brunel and Wang, 2003; Brunel and Hansel, 2006; Tiesinga and Sejnowski, 2009), a marked and persistent oscillatory activity, well visible in the traces of an LFP-like signal. The collective frequency of oscillation was in the gamma range. Since driving was provided by background Poisson noise, the spiking activity of individual neurons was very irregular and characterized by a weaker firing rate (cf. **Figure 2B**). Weak pairwise correlations between spike trains coexisted thus with stronger pairwise correlations between membrane potential fluctuations (Yu and Ferster, 2010; Battaglia and Hansel, 2011). While inhibitory connections were confined within each local area, excitatory neurons could additionally establish long-range connections between distant local areas (**Figure 5A**). In this case, the gamma oscillations generated by each local circuit were set into one of many possible multistable phase-locked states (**Figure 5B**).

The dynamical features of the simulated neural activity, including in particular its degree of oscillatory synchrony, depended markedly on the noisy drive to the network and on the strength of local inhibition. For increased drive intensity and/or stronger inhibitory interactions, a smooth transition occurred toward a dynamic regime characterized by elevated collective synchronization (**Figure 3A**). In this synchronous regime, the frequencies of the network oscillation were in the gamma range, varying between 40 and 70 Hz (**Figure 3B**), while the average firing rate of individual excitatory neurons varied between 1 and 3 Hz (**Figure 3C**) and of inhibitory neurons between 2 and 7 Hz (**Figure 3D**).

Starting from a very wide range of parameters including the probability of inhibitory connections and the strength of the external driving force (**Figure 3**), oscillatory synchrony can be robustly boosted by enhancing the external drive to the network. Qualitatively reproducing existing experimental findings (Adesnik and Scanziani, 2010; Akam et al., 2012), our simulations showed that slowly ramping or constant low-intensity optogenetic stimulation can be used to "switch on" a markedly oscillatory behavior. As shown by **Figure 3E**, a network with poorly synchronous activity can be optogenetically driven toward higher oscillatory synchrony, as evident not only from LFP spectrograms but also visually from LFP traces.

In the following, we will mainly consider model networks tuned to generate strong LFP gamma oscillations. However, such a choice is not an arbitrary restriction. Indeed, high synchrony regimes (either spontaneously emergent or induced artificially by continuous photostimulation) are particularly suited for analyses of phase shifting and locking.

#### **SHIFTING THE PHASE OF AN ONGOING LOCAL OSCILLATION**

It is well known that the effect of a perturbation to an oscillating system depends on the phase at which the perturbation is applied (Pikovsky et al., 2001). To explore the phase dependency of light stimulation, we applied simulated stimulation pulses with different durations *T*<sub>light</sub> to local populations with different transduction rates *P*<sub>ChR2</sub> (**Figure 4**). Light intensity was always set to the optimum value of *W*<sub>light</sub> = 18% *W*<sub>max</sub>, which led to maximum evoked peak photocurrents.

For all the explored conditions, we always found strongest effects on the phase of an ongoing oscillation when the perturbation was applied at a phase half-way between the trough and the peak of the collective population oscillation (**Figure 4B**). In this case the phase of the perturbed oscillation was advanced with respect to the unperturbed case (**Figures 4C,D**). Short pulses lasting 1 or 3 ms led only to phase advance effects. As shown in **Figure 4C**, phase advances of the order of one quarter of a cycle could be achieved using such short pulses, over a very wide range of transduction rates, going from very high (100%) down to moderate (25%). Noticeable phase advance effects (although reduced to just one tenth of a cycle) could even be detected for transduction rates as low as 5%.

As displayed by **Figure 4D**, longer stimulation durations also led to phase lagging effects. These effects occurred in different ranges of perturbation application phases than for phase advancing effects. However, phase lagging effects were always weaker than phase advancing effects. For instance, for a transduction rate of 25%, pulses lasting 10 ms could induce phase advances of over a quarter of cycle, but only phase laggings of less than one tenth of cycle.

The positive peaks of the *phase response curves* (PRCs) plotted in **Figures 4C,D** were aligned across all conditions. The strongest phase shifting effects were always observed when the perturbation was applied in proximity of the phase φ = 0.17. We also mention that for the short stimulation duration used, the evoked photocurrent was dominated by the fast activation time-course. Inactivation played no role in determining the response. As a matter of fact, the effect of the fast initial rise of the photocurrent was to evoke a spike in the transduced neurons, as in **Figure 1A**, and additional synchronous spikes evoked in a subpopulation of cells were the dynamic cause of the induced phase shift, as in Battaglia et al. (2012).

#### **PERTURBING PHASE RELATIONS BETWEEN DIFFERENT OSCILLATING POPULATIONS**

After the controlled shifting of the phase of a local oscillation, we explored whether precisely phased stimulation could be used to manipulate phase relations between different local oscillating circuits. To do so, we considered a canonic circuit of two coupled oscillating areas, interconnected by long-range random excitatory projections (**Figure 5A**). In general, when driven into a synchronous regime, motifs of a few local areas mutually connected with equal strength can give rise to different phase-locked states. These states are associated to different patterns of inter-areal phase relations, which are maintained in a relatively stable manner over long time intervals (Battaglia et al., 2007, 2012).

**FIGURE 4 | Phase shifts induced by photostimulation. (A)** Examples of phase shifts induced by a single light pulse. Top: the phase (blue curve) of the oscillation of the transduced population is shifted by light perturbation (illustrated as a lightning symbol with green background) and, afterwards (magenta curve), remains advanced with respect to the unperturbed oscillation (gray curve). Bottom: such a phase shift cannot be seen when the timing of the light perturbation corresponds to other differently chosen oscillation phases. **(B)** Waveform of the oscillating LFP in dependence on the Hilbert phase. Shown are 500 oscillation cycles (gray) of a LFP and their average waveform (blue). By our conventions, the phase ranges in the unit interval. The maximum of the LFP is obtained for (Hilbert) phase values close to 0.3 while the minimum occurs for phase values close to 0.6. **(C,D)** Phase shifts caused by light pulses applied at different (Hilbert) phases of the ongoing LFP oscillation. An optimal light intensity of 18% *W*<sub>max</sub> is used. **(C)** Dependence of the phase shift on the transduction rate *P*<sub>ChR2</sub> of the population (for a stimulus duration *T*<sub>light</sub> = 3 ms). **(D)** Dependence of the phase shift on the stimulus duration *T*<sub>light</sub> (for a fixed transduction rate of *P*<sub>ChR2</sub> = 25%). Bold characters in the legend denote the "reference" phase shift, i.e., *P*<sub>ChR2</sub> = 25% and *T*<sub>light</sub> = 3 ms of stimulus duration (green curves). In panels **(C)** and **(D)**, error bars are standard deviation of the phase shifts obtained for different perturbations applied in a same time-bin.

The specific bi-areal network of **Figure 5A** generated two multi-stable phase-locked states. In the unperturbed system, background noise caused spontaneous switching between these two states (i.e., from one configuration of inter-areal phase relations to another). The result of these stochastic fluctuations was a clearly bimodal distribution of the instantaneous phase difference between the two areas (**Figure 5B**). The actual phase relations in the phase-locked modes depend ultimately on the PRC of the coupled populations. As discussed in Battaglia et al. (2007, 2012), the PRCs associated to our network model are such that they lead to *out-of-phase* locking for sufficiently strong inhibition (unless long-range synaptic delays are tuned *ad-hoc* within narrow intervals associated to in- or anti-phase configurations). Out-of-phase locking is found also in more general systems of pulse-coupled neurons (or neuronal masses) under certain conditions on synaptic delays (Woodman and Canavier, 2011; Wang et al., 2012).

In out-of-phase locked modes, it is always possible to identify one area (leader) whose oscillations lead in phase over the oscillations of the other area (laggard). This leads to anisotropic directed functional influences between local circuits (Battaglia et al., 2012), in agreement with the communication-through-coherence hypothesis (Fries, 2005), despite the fact that inter-areal connections are reciprocal and of equal strength in both directions. Switching between alternative phase-locking configurations would thus correspond to changes in the dominant direction of inter-areal functional influences. Spontaneous switching was a relatively rare event in the high synchrony regime explored here (the average waiting time for spontaneous switching was over 60 periods). Nevertheless, optogenetic stimulation could be used to actively trigger switching events (**Figure 5C**).

Inter-areal phase relations after the application of a single perturbation pulse were compared to the average locked phase difference before the pulse itself. We studied how both transient short-term and persistent long-term effects depend on the phase of perturbation onset. **Figure 5D** shows the probability that the average inter-areal phase difference for the five cycles directly following the perturbation has increased or reduced by at least 10% relative to the average phase difference prior to the perturbation. For a wide range of phases of stimulation onset, such probability was larger than 50% and remarkably larger than the level accounted for by spontaneous fluctuations of the inter-areal phase difference.

The dependency on the perturbation phase was more pronounced for long-term effects. **Figure 5E** shows the probability of a switch in phase locking, i.e., that the average inter-areal phase difference over a long time window beginning ten cycles after the perturbation has changed its sign (note, indeed, that the two phase-locked configurations of the simulated bi-areal motif are characterized by average phase-differences of φ = ±φ<sub>locked</sub>, cf. **Figure 5B**). In contrast to short-term shifting, the probability of actual switching was concentrated in a narrow phase interval centered on the peak of the single-area PRC, as expected from theory (Battaglia et al., 2012). The switching probability for other phase bins dropped quickly to the level of spontaneous switching.

Our simulations show that the peak probability of optogenetically-induced switching could rise above 60% even for small transduction rates of 25%. However, this happened only if the phase of the perturbation onset was precisely selected. Indeed, the comparison of **Figures 5D,E** shows that many of the short-term shifting effects observed for randomly phased perturbations did not develop into lasting changes in phase-locking. To conclude, we would like to mention that a similar pulse-induced reorganization of inter-areal phase relations could be achieved even when the perturbation was applied to the laggard rather than to the leader area [not shown, but see (Battaglia et al., 2012)].

**FIGURE 5 |** [...] long-range excitatory connections. **(B)** Both populations oscillate in a non-regular way but with the same main frequency. A histogram of the instantaneous phase difference is shown for a pair of very long LFP time series (over 50,000 oscillation cycles). This distribution is clearly bimodal, indicating the existence of two favorite modes of approximate out-of-phase locking (with the orange population leading in phase over the violet, or the other way around). **(C)** LFP traces of two phase-locked populations. The application of a light pulse stimulation (denoted by a green background and a lightning symbol) can induce switching to another phase-locked mode. This is shown by the qualitative changes between the crosscorrelogram (XC, computed over 500 ms) of the two LFPs before (left) and after (right) light stimulation. Note the changed sign of the lag of the highest XC peak, [...] stimulation (*P*<sub>ChR2</sub> = 25%, *T*<sub>light</sub> = 3 ms). This probability is presented by a polar histogram in dependence on the phase of the onset of the light stimulation (with respect to the leader population). The red circle indicates the probability of similarly large spontaneous phase shifts (i.e., without photostimulation). **(E)** Phase difference averaged over 50 cycles starting 10 cycles after the light pulse. A switching is considered as successful if the sign of this average phase difference has changed (see panel **B**). The probability of successful phase switching is given by a polar histogram, as in panel **(D)**. The red circle indicates the probability of spontaneous switching in the case of non-stimulated activity. Ignoring transient effects, switching can be induced with high probability only if the perturbation is applied within a specific narrow phase range.

#### **CLOSING THE LOOP**

As discussed in the last section, the controlled switching of inter-areal phase-locking (and, hence, of functional connectivity) required perturbations optimally phased with respect to ongoing oscillations. To increase the probability of inducing switching, the timing of perturbation must thus be determined based on phase information extracted from recordings of the recent population activity. We suggest here a possible closed-loop protocol for the online prediction of the timing of stimulation achieving an optimal switching rate. The workflow of the proposed idealized experiment is outlined by a schematic time bar (**Figure 6A**) and a corresponding flow chart (**Figure 6B**). The potential performance of such a protocol was studied by simulating the induction of switching in the bi-areal network of **Figure 5A**.

In contrast to this well behaved *in silico* model, oscillatory coherence in *in vivo* or *in vitro* recordings is usually transient and confined to specific epochs. There is nevertheless experimental evidence that epochs of phase synchronization at fast gamma frequencies can persist over several hundreds of ms *in vivo* (Varela et al., 2001; Pesaran et al., 2002; Gregoriou et al., 2009; Bosman et al., 2012; Grothe et al., 2012). Detecting the onset of one such oscillatory epoch was precisely the aim of the *testing stage*, in which LFPs in both areas of the bi-areal motif were recorded and their spectral characteristics extracted in real-time to verify that LFP power and inter-areal coherence with respect to a common frequency (band) rose above a minimum threshold (see section Materials and Methods).

**FIGURE 6 | Closed loop strategy for precisely phased photostimulation. (A)** Schematic illustration of the proposed experimental protocol. During the *testing stage* (light blue) the LFP is recorded and tested for sufficiently strong power in the gamma-range. If the gamma band power is high enough, then a bandpass-filter is tailored to its peak frequency (light gray arrow). In the *monitoring* stage (red), phases are extracted from the band-passed LFP. Based on these observations, during the *prediction stage* (yellow), lasting only a very few oscillation periods, a linear model of phase evolution is extrapolated to predict the time at which the target phase of the oscillation will occur next. A light pulse is then delivered at this predicted time (green background with lightning symbol). **(B)** The workflow of the closed loop experiment is presented as a flow chart, with the left swim lane presenting computation and decision steps and the right swim lane showing recording and stimulation of the transfected neuronal population. Curved green arrows highlight the closed-loop nature of the workflow, i.e., the light pulse stimulation delivered at a time depending on the phase evolution of LFP oscillations during the monitoring window.

The *monitoring stage* was entered immediately after the detection of an epoch of reliable inter-areal coherence. During this monitoring stage, LFP signals were recorded, filtered in real time through a low-order band-pass filter with a pass frequency optimized during the testing window and, finally, stored.

A fast online analysis of the phase dynamics of the stored LFP of only the target area was then performed during the following *prediction stage*. Its aim was to predict the timing of one of the next occurrences of the target phase, solely from the phase information acquired during the monitoring stage. To keep the prediction window as short as possible, we propose to use computationally cheap and consequently linear techniques for phase extrapolation. Indeed, the "real" phase values (given by Hilbert Transform of the LFP signal, see section Materials and Methods) and a simple linear descriptor of the phase are strongly correlated (**Figure 7B**) and non-linear effects can be neglected in a first-order approximation.

The phase-locking between LFPs recorded after the stimulation application was finally compared with the locking existing before the stimulation to verify the successful induction of state switching.

**Figure 7** analyzes the simulated performance of the proposed protocol, when applied to *in silico* recordings from the bi-areal network motif of **Figure 5**. **Figure 7C** shows how the predicted onset phases of light stimulation concentrate around the actual target phase given by the peak PRC value of φ<sub>target</sub> = 0.18. The scattering of predicted phases is computed by hypothesizing prediction stages with different possible (short) durations. This estimate was done with two prediction schemes which both have fast implementations: a simple linear extrapolation based on the average period length and a first-order autoregressive model [AR(1)] (see section Materials and Methods), accounting for correlations between the durations of successive oscillation cycles, at least approximately. For increasing lengths of the prediction window, the median and the average value of the predicted Hilbert phase remained very close to the target (**Figure 7C**). However, the distribution of extrapolated phase values broadened, as indicated by their increasing dispersion. Nevertheless, for a prediction window lasting three oscillation cycles (a sufficiently long time to perform the fast computation required for linear extrapolation; see section Discussion), the interquartile range of predicted phase values was still contained in the width of the reference PRC. Consequently, we still expect an enhanced effectiveness of light stimulation pulses applied at the inferred time *t*<sub>ON</sub>, compared to randomly timed pulses.

The error made in predicting a target phase depends necessarily on the quality of the recorded oscillation. The dynamical regime of the simulations in **Figures 5** and **7C** was strongly synchronous. As previously discussed, the degree of synchrony of the collective response depends on the external driving force to the network and on the strength of local inhibition (**Figure 3A**). We performed phase prediction based on recordings of simulated dynamics with different degrees of synchrony. As shown in **Figure 7D**, stronger synchrony was associated to smaller prediction errors. Interestingly, prediction errors remained moderate even when considering regimes "at the edge of synchrony." Furthermore, adopting a more elaborate AR(1) approach yielded the strongest performance improvement with respect to simpler linear extrapolation precisely for these intermediate synchrony values (**Figure 7E**).

In contrast, prediction errors associated to weakly synchronous dynamics were larger and even the AR(1) approach failed to improve over linear extrapolation in these cases. However, in these regimes, the dynamics rarely displayed long-lasting oscillatory epochs and the probability of spontaneous switching was comparable to the one of induced switching, thus invalidating our analysis protocol. In these cases, therefore, continuous photostimulation should be used to enhance the degree of coherence of the coupled populations' activity (analogously to **Figure 3E**).

**FIGURE 7 | Online prediction of the phase of stimulation onset. (A)** The period length of LFP oscillations fluctuates from cycle to cycle and has a broad-range uni-modal distribution (here shown for period lengths as estimated from the Hilbert phases). **(B)** Hilbert phase versus linear phase for a sample LFP time series. To speed up the computation of *t*<sub>ON</sub> in the prediction stage, the Hilbert phase can be approximated by a linear phase, since, as here shown, they are strongly correlated and the mild static non-linearity *f<sub>LH</sub>* linking them can be neglected. **(C)** Distribution of the phase of *t*<sub>ON</sub> predicted by two different methods and for different lengths of the prediction window (measured in oscillation cycles). Shown are histograms and box plots (box giving median and interquartile range, white circle the mean value and whiskers the 5-th and 95-th percentiles) of the predicted phase of light stimulation φ(*t*<sub>ON</sub>) for two prediction methods, namely pure linear extrapolation based on the average period length (green) and first order autoregressive [AR(1)] models (orange), applied to period lengths recorded during the monitoring stage. Both the median and the mean of predicted Hilbert phase are in good agreement with the exact target phase (leading with highest probability to a phase shift) with a dispersion not larger than the width of the positive part of the reference phase-shift response curve (reproduced from **Figures 4C,D** on the top of the panel). **(D)** The prediction error (i.e., the standard deviation of the inferred phase φ(*t*<sub>ON</sub>) of photostimulation onset) depends on the synchronization level of the neuronal population activity (cf. **Figure 3A**). The prediction error based on linear extrapolation (measured in units of average oscillation period lengths) is shown for different probabilities of local inhibitory connection *p<sub>I</sub>* and background noise rates ν<sub>noise</sub>. Larger synchronization leads to better prediction. **(E)** The ratio of the prediction error based on the AR(1) model and the prediction error based on linear extrapolation in dependence on the same parameters. For intermediate synchrony levels, the prediction error can be consistently reduced by the use of an AR(1) model.

## **DISCUSSION**

#### **FROM POWER BOOSTING TO RELIABLE PHASE CONTROL**

Optogenetic stimulation has been successfully applied to boost the power of fast neural oscillations *in vivo* and *in vitro*. In Cardin et al. (2006), pulsed optogenetic stimulation *in vivo* was used to highlight the existence of a resonance at gamma range frequencies of local inhibitory cortical microcircuits. Adesnik and Scanziani (2010) and Akam et al.

(2012) experimented with ramped light stimulation to induce long-lasting oscillatory episodes in slices.

Beyond controlling oscillation power, the experiments by Akam et al. (2012) are closely related to the first part of our model study. They used 5 ms-long light stimulation pulses to shift local oscillation phases and quantify the phase response curves (PRCs) of oscillations in hippocampal slices, analogously to the simulated experiment of **Figure 4**. The hippocampal PRC measured by Akam et al. (2012) was distinctly biphasic, leading to phase advancement or phase delaying, depending on the phase of application of the stimulation. Such biphasic PRC shape is in qualitative and approximately in quantitative agreement with the PRCs extracted from our local population model for stimulation pulses of comparable lengths (cf. **Figure 4D**, orange curve for 5 ms-long pulses and red curve for 10 ms-long pulses).

Interestingly, however, the PRCs extracted from our model for shorter stimulation durations lacked phase-delaying regions and displayed only a narrow phase range leading to consistent phase advancement. Furthermore, they were characterized by a relatively broad range of application phases for which light stimulation was completely ineffective. These features of the PRC shapes are robustly obtained if the circuit mechanism for the generation of oscillations dominantly relies on delayed mutual interactions within interneuronal networks (Battaglia et al., 2007, 2012). One can actually use very different neuronal models to obtain oscillatory and phase-locking behaviors that qualitatively match those observed. For instance, spatially structured networks of integrate-and-fire neurons (Battaglia and Hansel, 2011) have dynamical regimes that tightly correspond to those of homogeneous networks of the conductance-based neurons (Battaglia et al., 2007) that we adopt here. We therefore predict that similar-looking PRCs could be obtained in the case of kainate-induced *in vitro* oscillations in slices, in which excitatory neurons are entrained by a coherently oscillating interneuronal population but are not actively involved in the generation of the local rhythm (Fisahn et al., 2004; Bartos et al., 2007; Andersson et al., 2012).

Narrow phase ranges associated with large PRC values reduce the probability of inducing stable phase shifting by applying stimulation at arbitrary times. However, such narrow intervals become a desirable resource when optogenetic stimulation is precisely phased conditional on ongoing oscillations, as would be feasible with a closed-loop setup. Indeed, PRC shapes like the reference PRC discussed in **Figure 4** (green curve for *P*<sub>ChR2</sub> = 25% and *T*<sub>light</sub> = 3 ms light pulses) could allow an "all-or-none" control of phase shifting, in which strong effects are obtained only if the stimulation is applied within a specific target range of phases, but in which undesired switching triggered by noise or by a misapplied input is largely suppressed.

## **A SIMPLE ChR2 MODEL CAPTURES NON MONOTONIC PHOTORESPONSE**

The light-activated cation channel ChR2 activates more rapidly and supports larger peak current amplitudes for increasing light intensities. Therefore, we speculated that brief, high-intensity light pulses would provide the optimal stimulation for our purposes. To our knowledge, there were no studies that systematically documented ChR2 current responses for stimuli with light intensities above 20 mW/mm<sup>2</sup> (Ishizuka et al., 2006; Ernst et al., 2008; Lin et al., 2009). At this intensity the activation rate is still light sensitive, and we aimed to increase it even more using light intensities as high as approximately 130 mW/mm<sup>2</sup>. While the activation rate did indeed increase further, the fact that the peak current amplitude *decreased* for intensities above approximately 20 mW/mm<sup>2</sup> came to us as a surprise (**Figures 1B,C**). This behavior has not been reported before, to the best of our knowledge, though the measurements published in Lin et al. (2009) hint at a decreasing peak amplitude for the highest intensity applied there, which was approximately 19.8 mW/mm<sup>2</sup>.

Such a phenomenon might be reminiscent of the photoreactive P480b intermediate state, which can be converted by blue light to the early P500 intermediate state. This transition was proposed as a shortcut of the photocycle from a spectroscopic study of ChR2 channels (Ritter et al., 2008). Since previously published models of ChR2 currents (Nikolic et al., 2006, 2009) could not account for this non-monotonic light response, it was necessary to deploy a novel model. Our simple conductance-based model correctly captures the existence of an optimal light intensity for photostimulation, without the need to incorporate elaborate details about the ChR2 molecular structure and dynamics. Note that the application of our model is not limited to brief light pulses, but can also predict light-induced conductance in response to ramps of light (cf. **Figure 3E**).

Our model is also accurately data-constrained. To calibrate model parameters, light induced changes of ChR2 conductance were measured in voltage clamp. If the voltage can be clamped throughout a cell, any changes in the whole-cell current can be attributed to ChR2 conductances. In differentiated neurons, however, this perfect voltage control cannot be attained. This is obvious from the recording in **Figure 1A** (black trace), where the activation of ChR2 depolarized the axon sufficiently to activate voltage-dependent sodium channels, which created an unclamped spike. Even when sodium channels are blocked, the conditions are not optimal for a precise biophysical characterization. Using essentially passive and electrotonically compact cells, such as HEK-293 cells (Nikolic et al., 2009), provided optimal recording conditions (**Figure 1B**). The smaller amplitude of the photocurrents in these cells reflected differences in cell surface and expression levels, while the biophysical properties of ChR2 were most likely identical to those expressed in neurons.

## **TECHNICAL FEASIBILITY**

As discussed above, the extraction of PRCs describing the collective response of a transduced neuronal population to light stimulation was already achieved *in vitro* (Akam et al., 2012). Our modeling study suggests that a similar approach could be successfully applied *in vivo*, since phase-shifting effects can be robustly obtained with high and low transduction rates, covering the wide range achievable with different experimental techniques (Adamantidis et al., 2007; Petreanu et al., 2007; Wang et al., 2007; Takahashi et al., 2012). The success rate will depend on a suitably tuned light intensity and on the ability to select the phase of the stimulation onset conditional on ongoing oscillation dynamics. Another factor that might enhance the controllability of phases is the use of faster variants of ChR2, such as the ChETA (Gunaydin et al., 2010) and E123T/T159C (Berndt et al., 2011) mutants.

A closed-loop approach is required for determining the optimal timing of pulse stimulations. **Figure 7C** shows that if the time required for the prediction stage is of the order of a few oscillation cycles, then the discrepancy between the target and the actual perturbation phase is comparable to the width of the peak of the PRC. Consequently, the resulting phase shifting should remain close to the optimum. The prediction strategy that we propose (**Figure 6**) is based solely on a small number of linear computations, which are particularly suited for ultrafast (millisecond scale) implementation on reconfigurable hardware chips (Zhuo and Prasanna, 2008; Sadrozinski and Wu, 2011) or on GPU architectures (Owens et al., 2008; Volkov and Demmel, 2008) on which FFT algorithms can be efficiently implemented (Bhattacharyya et al., 2010). As a matter of fact, hardware implementations of period extraction (Waskito et al., 2010) and autoregressive modeling of biological signals (Marinkovic et al., 2005; Kim and Rosen, 2010) have already proven to be order(s) of magnitude faster than on conventional CPUs. Taking into account these high levels of performance and the approximations we propose to implement, a length of the prediction window of ∼50 ms that corresponds to approximately three cycles of a 40–70 Hz rhythm appears completely realistic.
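To make the "small number of linear computations" concrete, the two prediction rules compared in **Figure 7C** can be sketched as below. This is only an illustration under stated assumptions: the function names, the least-squares estimate of the AR(1) coefficient, and the conversion of the target phase into a fraction of the forecast period are hypothetical choices made here, not the implementation used in the study.

```python
import numpy as np

def fit_ar1(periods):
    """Estimate mean period and AR(1) coefficient from monitored period lengths.

    Assumed model (illustrative): T[k+1] - mu = a * (T[k] - mu) + noise.
    """
    periods = np.asarray(periods, dtype=float)
    mu = periods.mean()
    dev = periods - mu
    denom = np.dot(dev[:-1], dev[:-1])
    # Least-squares slope of dev[k+1] against dev[k]; 0 if periods are constant.
    a = np.dot(dev[1:], dev[:-1]) / denom if denom > 0 else 0.0
    return mu, a

def predict_onset(t_last, periods, n_cycles, target_phase, use_ar1=True):
    """Predict the time at which `target_phase` (radians) is reached
    `n_cycles` full cycles after the last observed cycle start `t_last`.

    With use_ar1=False this reduces to linear extrapolation with the
    average period length.
    """
    periods = np.asarray(periods, dtype=float)
    mu, a = fit_ar1(periods)
    if not use_ar1:
        a = 0.0
    last_dev = periods[-1] - mu
    # Forecast the next n_cycles + 1 period lengths.
    j = np.arange(1, n_cycles + 2)
    forecast = mu + (a ** j) * last_dev
    frac = target_phase / (2.0 * np.pi)
    return t_last + forecast[:n_cycles].sum() + frac * forecast[n_cycles]

# Example: periods in ms, prediction window of three cycles, target phase pi.
monitored = [20.0, 22.0, 19.0, 21.0, 20.5, 21.5]
t_linear = predict_onset(100.0, monitored, 3, np.pi, use_ar1=False)
t_ar1 = predict_onset(100.0, monitored, 3, np.pi, use_ar1=True)
```

Both rules need only a mean, one dot-product ratio and a short sum, which is consistent with the claim that the prediction stage is cheap enough for millisecond-scale hardware.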

Our simulated oscillations constitute an idealized model for neuronal rhythms measured *in vivo* or *in vitro*. In our model, especially when the synchronization index is very high, cycle-to-cycle period length fluctuations are positively correlated with weak to intermediate correlation strength. In real neuronal oscillations, however, adaptation or other phenomena might introduce more complex correlation patterns between the lengths of different periods. Nevertheless, such correlations might still be captured by AR(1) modeling, as hinted at by the better performance of AR(1) in dynamic regimes at the "edge of synchrony" (**Figure 7E**), in which period length fluctuations are more strongly correlated.

Under specific experimental conditions, long-lasting oscillatory epochs might be a rare event. It would then become difficult to meet the conditions for the applicability of our protocol (i.e., the testing stage of **Figure 6** might never be passed). In this case, continuous optogenetic stimulation could be used to stabilize and boost oscillations, as simulated in **Figure 3E**. Then, similarly to the approach of Akam et al. (2012), precisely timed "kicks," superposed on this continuous light stimulation, could be used to perturb the instantaneous phase. In this sense, optogenetic stimulation is more promising than electric micro-stimulation. First, it allows combining continuous and pulsed stimulation within a single setup. Second, it can control the degree of synchronization with high selectivity, not only by providing an unspecific drive to the entire network, but also by enhancing the drive to specific neuronal subpopulations, such as FS-PV cells, which provide the phasic inhibition crucial for rhythm generation (Cardin et al., 2006; Sohal et al., 2009).

Finally, we are optimistic that the network models of transduced neural populations that were pioneered by Talathi et al. (2011) and further developed in this study are powerful tools, which will be increasingly adopted to conduct, optimize and accelerate the design and the calibration of closed-loop optogenetic experimental protocols.

## **PROBING PHASE-CODING AND COMMUNICATION-THROUGH-COHERENCE**

Reliable optogenetic manipulation of the phase dynamics of oscillating neuronal populations would open the way to an interventional exploration of phase coding schemes. In the phase coding framework, it is argued that the phase of spikes relative to a "reference clock," paced either by a stimulus-locked (De Charms and Merzenich, 1996; Arabzadeh et al., 2006) or an internally generated oscillation (O'Keefe and Recce, 1993; Siegel et al., 2009), carries information that is independent of and multiplexed with the information conveyed by rate fluctuations (Montemurro et al., 2008). Anticipating or delaying the ticks of such a "reference clock," such as the one putatively framed by slow cortical oscillations (Kayser et al., 2012), should perturb the decoding of phase-based representations.

Beyond the control of the phase of a local oscillation, inter-areal phase correlations could be disrupted transiently by unspecific optogenetic stimulation (**Figure 5D**). Furthermore, precisely phased perturbations determined within a closed-loop system could induce persistent switching between alternative phase-locked dynamic patterns (Tiesinga and Sejnowski, 2010; Battaglia et al., 2012). In this sense, the realization of an experiment inspired by the idealized analysis of **Figure 5** would provide a direct test of the communication-through-coherence hypothesis (Fries, 2005). More specifically, it would allow experimental testing of whether different sets of inter-areal phase relations lead to different inter-areal functional interactions and to an altered balance between bottom-up and top-down information flows, as predicted by theory (Battaglia et al., 2012).

A reorganization of phase relations between distant neuronal populations might have perceptual or behavioral consequences. Selective alteration of inter-population phase relations, for instance between areas FEF and V4 (Gregoriou et al., 2009) or areas V1 and V4 (Grothe et al., 2012), might be used to suppress or boost attentional effects or even to emulate reorienting of attention. Furthermore, our theoretical investigations suggest that stimulation applied locally to a single area might induce distributed reorganization of phase relations between other more distant areas (Battaglia et al., 2012). Closed-loop optogenetic stimulation might then eventually be used to trigger system-level switching between global brain states (Deco et al., 2009; Freyer et al., 2011).

## **AUTHOR CONTRIBUTIONS**

Agostina Palmigiano and Demian Battaglia performed the simulations. Annette Witt, Agostina Palmigiano, and Demian Battaglia analyzed the simulations. Andreas Neef and Ahmed El Hady performed and analyzed the experiments. Annette Witt, Andreas Neef, Fred Wolf, and Demian Battaglia designed the models and the study. Annette Witt, Agostina Palmigiano, Andreas Neef, Ahmed El Hady, Fred Wolf, and Demian Battaglia wrote the manuscript.

## **ACKNOWLEDGMENTS**

The authors thank Pascal Fries, Alexander Gail, Theo Geisel, Andreas Kreiter, Shy Shoham, and Walter Stühmer for inspiring discussions. Annette Witt was supported by the Stifterverband für die deutsche Wissenschaft and by the Claussen-Simon-Stiftung. We all acknowledge financial support by the German Federal Ministry of Education and Research (BMBF) via the Bernstein Center for Computational Neuroscience—Göttingen (01GQ1005B, 01GQ0430, 01GQ07113), the Bernstein Focus Neurotechnology—Göttingen (01GQ0811) and the Bernstein Focus Visual Learning (01GQ0921, 01GQ0922), the German Israel Research Foundation and the VolkswagenStiftung (ZN2632) and the Deutsche Forschungsgemeinschaft through CRC-889 (906-17.1/2006).

#### **REFERENCES**


Bandyopadhyay, P. R. (2005). Trends in biorobotic autonomous undersea vehicles. *IEEE J. Oceanic Eng.* 30, 109–139.


Battaglia, D., Witt, A., Wolf, F., and Geisel, T. (2012). Dynamic effective connectivity of inter-areal brain circuits. *PLoS Comp. Biol.* 8:e1002438. doi: 10.1371/journal.pcbi.1002438


coordination of action-potential timing. *Nature* 381, 610–613.


neuronal inputs by differential modulations of gamma-band phase-coherence. *J. Neurosci.* 32, 16172–16180.


J. (2008). Monitoring light-induced structural changes of Channelrhodopsin-2 by UV-visible and Fourier transform infrared spectroscopy. *J. Biol. Chem.* 283, 35033–35041.

Roelfsema, P. R., Engel, A. K., König, P., and Singer, W. (1997). Visuomotor integration is associated with zero time-lag synchronization among cortical areas. *Nature* 385, 157–161.


driven by metabotropic glutamate receptor activation. *Nature* 373, 612–615.


**Conflict of Interest Statement:** Ahmed El Hady is also appointed editor of the present special research topic issue. The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; accepted: 07 March 2013; published online: 17 April 2013.*

*Citation: Witt A, Palmigiano A, Neef A, El Hady A, Wolf F and Battaglia D (2013) Controlling the oscillation phase through precisely timed closed-loop optogenetic stimulation: a computational study. Front. Neural Circuits 7:49. doi: 10.3389/fncir.2013.00049*

*Copyright © 2013 Witt, Palmigiano, Neef, El Hady, Wolf and Battaglia. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Adaptive stimulus optimization for sensory systems neuroscience

## *Christopher DiMattina<sup>1</sup>\* and Kechen Zhang<sup>2</sup>*

*<sup>1</sup> Program in Psychology, Florida Gulf Coast University, Fort Myers, FL, USA*

*<sup>2</sup> Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self-Organization, Germany*

#### *Reviewed by:*

*Demian Battaglia, Max Planck Institute for Dynamics and Self-Organization, Germany*

*Tim Gollisch, University Medical Center Göttingen, Germany*

#### *\*Correspondence:*

*Christopher DiMattina, Program in Psychology, Florida Gulf Coast University, 10501 FGCU Boulevard South, Fort Myers, FL 33965-6565, USA*

*e-mail: chris\_dimattina@yahoo.com*

In this paper, we review several lines of recent work aimed at developing practical methods for adaptive on-line stimulus generation for sensory neurophysiology. We consider various experimental paradigms where on-line stimulus optimization is utilized, including the classical *optimal stimulus* paradigm where the goal of experiments is to identify a stimulus which maximizes neural responses, the *iso-response* paradigm which finds sets of stimuli giving rise to constant responses, and the *system identification* paradigm where the experimental goal is to estimate and possibly compare sensory processing models. We discuss various theoretical and practical aspects of adaptive firing rate optimization, including optimization with stimulus space constraints, firing rate adaptation, and possible network constraints on the optimal stimulus. We consider the problem of system identification, and show how accurate estimation of non-linear models can be highly dependent on the stimulus set used to probe the network. We suggest that optimizing stimuli for accurate model estimation may make it possible to successfully identify nonlinear models which are otherwise intractable, and summarize several recent studies of this type. Finally, we present a two-stage stimulus design procedure which combines the dual goals of model estimation and model comparison and may be especially useful for system identification experiments where the appropriate model is unknown beforehand. We propose that fast, on-line stimulus optimization enabled by increasing computer power can make it practical to move sensory neuroscience away from a descriptive paradigm and toward a new paradigm of real-time model estimation and comparison.

**Keywords: sensory coding, optimal stimulus, adaptive data collection, neural network, parameter estimation**

## **INTRODUCTION**

One classical approach in sensory neurophysiology has been to describe sensory neurons in terms of the stimuli that are most effective at driving these neurons. The stimulus that elicits the highest response is often referred to as the *optimal stimulus* (Albrecht et al., 1980; Stork et al., 1982; DiMattina and Zhang, 2008). Although the optimal stimulus provides a simple and intuitive means of characterizing a sensory neuron, positively identifying the optimal stimulus may be technically difficult for high-dimensional stimuli, and simply knowing the optimal stimulus without adequately exploring responses to other stimuli may provide limited information about sensory function (Olshausen and Field, 2005). Due to these practical and conceptual limitations of characterizing neurons by the optimal stimulus, many researchers have recently taken engineering-inspired approaches to studying neural coding, for example, by characterizing neurons in terms of the mutual information between sensory stimuli and a neuron's responses (Machens, 2002; Sharpee et al., 2004; Machens et al., 2005; Chase and Young, 2008), by characterizing iso-response surfaces in stimulus parameter spaces (Bölinger and Gollisch, 2012; Horwitz and Hass, 2012), or by fitting predictive mathematical models of neural responses to neurophysiology data (Wu et al., 2006). However, just like the classical optimal stimulus paradigm, these engineering-inspired methods also give rise to non-trivial high-dimensional stimulus optimization problems.

With recent advances in desktop computing power, it has become practical to perform stimulus optimization adaptively in real-time during the course of an experiment (Benda et al., 2007; Newman et al., 2013). In this review, we consider several recent lines of work on adaptive on-line stimulus optimization, focusing on single-unit recording *in vivo* for systems-level sensory neuroscience. Other kinds of closed-loop neuroscience experiments like dynamic patch clamping or closed-loop seizure interventions are considered elsewhere (Prinz et al., 2004; Newman et al., 2013). We first discuss the concept of the optimal stimulus and consider how its properties may be constrained by the underlying functional model describing a neuron's stimulus–response relation. We then discuss how adaptive stimulus optimization has been utilized experimentally to find complex high-dimensional stimuli which optimize a neuron's firing rate, including promising recent studies using evolutionary algorithms. We also discuss a different kind of study where stimuli are "optimized" to elicit a desired constant firing rate so that iso-response contours of the stimulus–response function may be obtained, as well as studies seeking maximally informative stimulus ensembles. Finally, we discuss how adaptive stimulus optimization can be utilized for effective estimation of the parameters of sensory processing models, as well as for effective model comparison. In conclusion, we suggest that adaptive stimulus optimization can not only make the classical optimal stimulus paradigm more tractable, but can potentially move sensory neuroscience toward a fundamentally new experimental paradigm of real-time model estimation and comparison.

"fncir-07-00101" — 2013/6/5 — 10:46 — page 1 — #1

#### **THE OPTIMAL STIMULUS**

#### **DEFINING THE OPTIMAL STIMULUS**

In order for a sensory neuron to be useful to an organism, there must be a consistent functional relationship between the parameters of sensory stimuli and neural responses. Although this relationship may be highly complex and non-linear, for any set of stimuli defined by parameters **x** = (*x*<sub>1</sub>, ..., *x<sub>n</sub>*)<sup>T</sup> we may think abstractly of the expected neural responses being described by some function *r* = *f*(**x**). For simplicity and definiteness, in this section we will focus our discussion of the optimal stimulus on the most common case where *r* is a scalar quantity which represents the firing rate of a single neuron, and will assume that the expected firing rate is entirely a function of the stimulus parameters, ignoring variables such as spiking history and stimulus-specific adaptation by assuming that they are kept constant (Ulanovsky et al., 2003; Bartlett and Wang, 2005; Asari and Zador, 2009).

Given this formulation, the problem of finding the optimal stimulus **x**<sub>0</sub> is simply the problem of maximizing the function *f*(**x**). Perhaps the simplest and most intuitive notion of the optimal stimulus is that of a firing rate peak in stimulus parameter space centered at **x**<sub>0</sub>, as illustrated in **Figure 1A**. Here *f* is maximized at **x**<sub>0</sub>, and for any stimulus perturbation Δ**x** we have *f*(**x**<sub>0</sub> + Δ**x**) < *f*(**x**<sub>0</sub>). However, for high-dimensional stimulus spaces like image pixel space (Simoncelli et al., 2004) or auditory frequency space (Yu and Young, 2000; Barbour and Wang, 2003a) this intuitive notion of the optimal stimulus as a response peak is hardly the only possibility. In the example shown in **Figure 1B**, the neuron is tuned along one direction in the stimulus space, but is untuned along an orthogonal direction. In this case, there is not a single optimal stimulus **x**<sub>0</sub> as in **Figure 1A**, but rather a continuum of optimal stimuli lying along a ridge containing **x**<sub>0</sub> (**Figure 1B**, thick green line). Another theoretical possibility is the saddle-shaped response surface in **Figure 1C**, where depending on the dimension chosen for exploration, the same stimulus **x**<sub>0</sub> can be either a firing rate peak or a valley.

For high-dimensional stimulus spaces, a full factorial exploration is impossible since the number of stimuli needed grows exponentially with the dimension, a problem referred to colloquially as the *curse of dimensionality* (Bellman, 1961). In many experiments, stimulus spaces are explored in a restricted subset of dimensions. The behavior of a neuron along the unexplored stimulus dimensions may take various forms, including the ones considered above. One cannot assume that the stimulus–response relationship must always be a single peak as in **Figure 1A**. Indeed, one of the challenges of sensory neurophysiology is that without prior knowledge about the neuron under study, there are no constraints whatsoever on the possibilities for the optimal stimulus, which must be found in a process of trial-and-error with no way to conclusively prove global optimality (Olshausen and Field, 2005). We now briefly discuss a recent theoretical study describing possible constraints on the optimal stimulus which arise from general anatomical properties of underlying functional circuitry.
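The exponential growth behind the curse of dimensionality is easy to check with a few lines of arithmetic. The sketch below assumes, purely for illustration, that each stimulus dimension is sampled at 10 levels and that a 16 × 16 pixel patch (256 dimensions) stands in for image pixel space; the function name is hypothetical.

```python
def n_factorial_stimuli(levels_per_dim: int, n_dims: int) -> int:
    """Number of stimuli in a full factorial design: levels ** dimensions."""
    return levels_per_dim ** n_dims

# Illustrative values only: 10 levels per dimension.
for n_dims in (2, 10, 256):
    count = n_factorial_stimuli(10, n_dims)
    print(f"{n_dims:>3} dimensions -> {len(str(count)) - 1} orders of magnitude")
```

Even at one stimulus per second, the 10-dimensional case would already require centuries of recording time, which is why adaptive methods that sample the space selectively are needed.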

#### **CONSTRAINTS FROM UNDERLYING FUNCTIONAL CIRCUITRY**

Ultimately, the stimulus–response relationship function *f*(**x**) is generated by the underlying neural circuitry connecting the sensory periphery to the neuron under study, but in general this circuitry is highly complex (Felleman and Van Essen, 1991; Shepherd, 2003) and not generally known to the experimenter. Nevertheless, recent theoretical work suggests that very basic anatomical properties of the neural circuitry may be able to provide experimentally useful constraints on the possibilities for the optimal stimulus (DiMattina and Zhang, 2008).

**FIGURE 1 | Hypothetical stimulus–response relationships for a sensory neuron.** The red circle represents the boundary of the set of permissible stimuli. **(A)** Stimulus **x**<sub>0</sub> is a firing rate peak which corresponds to the intuitive notion of the optimal stimulus where any perturbation away from **x**<sub>0</sub> results in a decrease in the firing rate. **(B)** This neuron is tuned to one stimulus dimension but is insensitive to the second dimension. Instead of a single optimal stimulus **x**<sub>0</sub> there is a continuum of optimal stimuli (green line). **(C)** A neuron whose stimulus–response function around the point **x**<sub>0</sub> is saddle-shaped. Along one stimulus dimension **x**<sub>0</sub> is a firing rate maximum, and along the other stimulus dimension **x**<sub>0</sub> is a minimum.

Consider the simple hypothetical sensory network shown in **Figure 2A** (left panel) which receives synaptic inputs from two peripheral sensory receptors (filled black circles) which linearly transduce the stimulus variables *x*<sub>1</sub>, *x*<sub>2</sub> and pass their outputs to a pair of interneurons, which in turn converge onto the output neuron from which responses *r* are measured. Since there are physical limits on the intensities of stimuli which can be generated by laboratory equipment, we may reasonably assume that the collection *X* of permissible stimuli is some closed subset of the real plane consisting of an interior and boundary (rightmost panel, thick red line). We may also reasonably assume that each neuron's input–output property is described by an increasing gain function *g*(*u*). With these reasonable assumptions, it is simple to show that the gradient of the function *f*(**x**) implemented by this circuit cannot vanish, and thus an optimal stimulus which is a firing rate peak as in **Figure 1A** is impossible. Therefore, it follows that the optimal stimulus must lie on the boundary of *X* (**Figure 2A**, right panel), with the exact location depending on the synaptic weights and other parameters of the network.
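The argument can be checked numerically on a toy version of this circuit. In the sketch below, the logistic gain function, the weight values, and the unit disk used as the stimulus set *X* are all assumptions chosen for illustration and are not taken from DiMattina and Zhang (2008). By the chain rule the gradient is ∇*f* = *g*′(*u*<sub>out</sub>) · [(**w**<sub>2</sub> ∘ *g*′(**u**<sub>1</sub>))<sup>T</sup> **W**<sub>1</sub>], which cannot vanish when *g*′ > 0 and **W**<sub>1</sub> is non-degenerate.

```python
import numpy as np

def gain(u):
    """Increasing gain function g(u); a logistic sigmoid is assumed here."""
    return 1.0 / (1.0 + np.exp(-u))

def gain_prime(u):
    """Derivative g'(u), strictly positive for the logistic function."""
    g = gain(u)
    return g * (1.0 - g)

# Hypothetical weights: W1 (receptors -> interneurons) is non-degenerate,
# w2 (interneurons -> output neuron) is non-zero.
W1 = np.array([[1.0, 0.5],
               [-0.3, 0.8]])
w2 = np.array([1.0, 0.7])

def response(x):
    """Firing rate f(x) of the output neuron for stimuli x of shape (..., 2)."""
    return gain(gain(x @ W1.T) @ w2)

def gradient(x):
    """Analytical gradient of f with respect to the stimulus (chain rule)."""
    u1 = x @ W1.T                # inputs to the two interneurons
    u2 = gain(u1) @ w2           # input to the output neuron
    return gain_prime(u2)[..., None] * ((gain_prime(u1) * w2) @ W1)

# Polar grid over the assumed stimulus set X (closed unit disk).
radii = np.linspace(0.0, 1.0, 51)
angles = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
R, TH = np.meshgrid(radii, angles, indexing="ij")
X = np.stack([R * np.cos(TH), R * np.sin(TH)], axis=-1)

rates = response(X)
min_grad_norm = np.linalg.norm(gradient(X), axis=-1).min()
best = np.unravel_index(np.argmax(rates), rates.shape)
best_radius = R[best]

print("smallest gradient norm on the grid:", min_grad_norm)
print("radius of the best stimulus:", best_radius)
```

Because the gradient never vanishes inside the disk, the largest response on the grid is found on the boundary circle, in line with **Figure 2A**.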

In general, it can be shown that for hierarchical neural networks which can be arranged into layers, if the gain functions are increasing, the number of neurons in successive layers is decreasing or constant, and the weight matrices connecting successive layers are non-degenerate, then it is impossible for the optimal stimulus for any neuron in this network to be a firing rate peak like that illustrated in **Figure 1A** (DiMattina and Zhang, 2008). It is important to note that this result requires that the stimuli be defined in the space of activities of the input units to the neural network, such as image pixel luminances. One interesting corollary of this result is that if the space *X* of permissible stimuli is bounded by a maximum power constraint $\sum_{i=1}^{n} x_i^2 \leq E$, the optimum firing rate will be obtained for a stimulus **x** ∈ *X* having the greatest power or contrast, since this stimulus will lie on the boundary. Indeed, for many sensory neurons in the visual, auditory, and somatosensory modalities, increasing the stimulus contrast monotonically increases the firing rate response (Albrecht and Hamilton, 1982; Cheng et al., 1994; Oram et al., 2002; Barbour and Wang, 2003b; Ray et al., 2008), which is interesting considering that convergent networks satisfying the conditions of the theorem can model the functional properties of many sensory neurons (Riesenhuber and Poggio, 1999, 2000; Lau et al., 2002; Prenger et al., 2004; Cadieu et al., 2007).

At first, this result may seem to be of limited applicability since it is well known that the numbers of neurons in successive processing stages can be widely divergent (Felleman and Van Essen, 1991). However, the theorem applies only to the *functional network* which connects a given neuron to the sensory periphery. For instance, in the example shown in **Figure 2B**, the functional network connecting neuron *a* to the input layer is a convergent network with the number of units decreasing from layer to layer (blue), whereas the full network is divergent with the number of units increasing from layer to layer. Similarly, it is important to note that the neural network to which we apply the theorem may not be a literal description of the actual neural circuit, but simply a mathematical description of the functional relationship between the stimulus parameters and the neural response. For instance, a standard functional model of the ventral visual stream postulates a feedforward architecture similar to the models of complex cells postulated by Hubel and Wiesel (Riesenhuber and Poggio, 1999, 2000), and the theorem can be applied to neurons in these models. Similarly, divisive normalization models postulated for visual and auditory neurons (Heeger, 1992b; Koelling and Nykamp, 2012) can be re-written in a form to which the theorem applies and shown to have a non-vanishing gradient (Koelling and Nykamp, 2012).

#### **ADAPTIVE OPTIMIZATION OF FIRING RATE**

Despite the conceptual difficulties with the notion of an optimal stimulus, it provides sensory neuroscience with an intuitive first-pass description of neural function when an appropriate quantitative model is unknown. In this section, we discuss adaptive stimulus optimization methods which have been utilized experimentally for optimizing the firing rate of sensory neurons in high-dimensional stimulus spaces where a full factorial exploration would be intractable. Mathematically, the optimization problem may be specified as that of finding

$$\mathbf{x}^{*} = \underset{\mathbf{x} \in X}{\arg\max}\ f(\mathbf{x}),\tag{1}$$

where **x**<sup>∗</sup> is the optimal stimulus, *f* is the (unknown) stimulus–response function, and *X* is the set of allowable stimuli. Methods to optimize firing rate fall into two general categories: those that ascend the local gradient of the stimulus–response function, and those which utilize genetic or evolutionary approaches. We discuss each of these approaches and their relative merits, along with issues of adaptation and constrained stimulus spaces.

#### **LOCAL HILL-CLIMBING**

Due to the inherent variability in neural responses (Tolhurst et al., 1983; Rieke et al., 1997), optimizing the firing rate of sensory neurons is a difficult stochastic optimization problem (Spall, 2003). Early work on adaptive stimulus optimization was performed by Harth and Tzanakou (1974), who applied a method of stochastic gradient ascent known as ALOPEX, or "Algorithm of Pattern Extraction" to neurons in the frog visual tectum (Tzanakou et al., 1979). This method works by computing correlations between random perturbations of the current stimulus and changes in firing rate and using these correlations to iteratively update the current stimulus to increase the expected firing rate, eventually converging to the optimal stimulus. More recently, related methods have been employed to optimize the responses of neurons in the primary visual (Foldiak, 2001) and auditory (O'Connor et al., 2005) cortices, providing independent verification of previously described receptive field properties like orientation selectivity (Hubel and Wiesel, 1962) or inhibitory sidebands (Shamma et al., 1993). Variations of ALOPEX have also been utilized to quickly find the best frequency for auditory nerve fibers, an essential first step in many auditory neurophysiology experiments (Anderson and Micheli-Tzanakou, 2002).
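The correlation-based idea can be illustrated with a minimal sketch. The code below is a simplified, sign-based variant in the spirit of ALOPEX and is not the published algorithm: the tuning function `mean_rate`, the noise level, the step size and the flip probability `p_flip` are all hypothetical choices made for illustration. Each stimulus dimension keeps moving in its previous direction when the last change coincided with an increase in the (noisy) response, reverses otherwise, and is occasionally flipped at random to maintain exploration.

```python
import math
import random


def mean_rate(x):
    """Hypothetical tuning curve: smooth peak of 20 spikes/s at (0.6, -0.3)."""
    dist2 = (x[0] - 0.6) ** 2 + (x[1] + 0.3) ** 2
    return 20.0 * math.exp(-dist2)


def noisy_response(x, rng, noise_sd=0.1):
    """One simulated trial: mean rate plus Gaussian response variability."""
    return mean_rate(x) + rng.gauss(0.0, noise_sd)


def alopex_like_search(x0, n_iter=400, step=0.02, p_flip=0.25, seed=0):
    """Sign-based, correlation-driven stimulus search (illustrative only)."""
    rng = random.Random(seed)
    x_prev = list(x0)
    r_prev = noisy_response(x_prev, rng)
    # First move: a random +/- step on every stimulus dimension.
    x = [xi + step * rng.choice((-1.0, 1.0)) for xi in x_prev]
    for _ in range(n_iter):
        r = noisy_response(x, rng)
        d_resp = r - r_prev
        x_next = []
        for xi, xpi in zip(x, x_prev):
            # Correlate the last change in this dimension with the response change.
            corr = (xi - xpi) * d_resp
            direction = 1.0 if corr > 0 else -1.0
            # Random flips keep the search exploring despite noisy responses.
            if rng.random() < p_flip:
                direction = -direction
            x_next.append(xi + step * direction)
        x_prev, r_prev, x = x, r, x_next
    return x


x_final = alopex_like_search([0.0, 0.0])
print(x_final, mean_rate(x_final))
```

Because the step size is fixed, this variant cannot diverge, but it also never settles exactly on the peak; it jitters around a local maximum with an amplitude set by `step` and the response noise.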

In addition to these correlation-based approaches, numerous other computational methods have been utilized for firing rate optimization. One approach is to iteratively make local linear or quadratic approximations to the neural responses around a reference stimulus (Bandyopadhyay et al., 2007a; Koelling and Nykamp, 2008, 2012), which can then be used to determine good search directions in the stimulus space. This approach has been utilized by Young and colleagues in order to determine that the optimal stimulus for neurons in the dorsal cochlear nucleus is a spectral edge centered at the neuron's best frequency (Bandyopadhyay et al., 2007b), consistent with suggestions from previous studies (Reiss and Young, 2005). An alternative optimization method which does not require estimating the local response function gradient is the Nelder–Mead simplex search (Nelder and Mead, 1965), which has been used to optimize the responses of neurons in cat auditory cortex to four-tone complexes (Nelken et al., 1994).

#### **GENETIC ALGORITHMS**

One limitation of the stimulus optimization methods above is that they are local searches which iteratively update the location of a single point (or simplex of points). Therefore, it is certainly possible for optimization runs to end up stuck at local firing rate maxima. Furthermore, points of vanishing gradient do not necessarily indicate maxima (Koelling and Nykamp, 2012), as we can see from the examples in **Figure 1**. In addition, local search methods only identify a single optimal stimulus, and do not sample the stimulus space richly enough to fully describe neural coding. One possible alternative adaptive optimization method used in previous neurophysiological studies which can potentially surmount both of these problems is a *genetic algorithm* (Goldberg, 1989). A genetic algorithm works by populating the stimulus space widely with many stimuli (analogous to "organisms"), which survive to the next generation with a probability proportional to the firing rate they elicit (analogous to their "fitness"). The parameters of the surviving stimuli are combined at random in a factorial manner ("crossing-over") and mutated in order to produce a new generation of different stimuli based on the properties of the current generation. Over several iterations of this algorithm, a lineage of stimuli will evolve which maximizes the firing rate of the neuron under study, and since the sampling of the stimulus space is nonlocal, genetic algorithms are more likely to avoid the problem of local maxima than hill-climbing methods.
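The selection, crossing-over and mutation steps described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the procedure of any particular study: the four-dimensional tuning function `firing_rate` is hypothetical and noiseless, stimulus parameters are assumed to lie in [0, 1], and the retention of the single best stimulus in each generation (elitism) is an added convenience that the text does not describe.

```python
import math
import random


def firing_rate(stim):
    """Hypothetical 4-D tuning function peaking at (0.5, 0.5, 0.5, 0.5)."""
    return 50.0 * math.exp(-sum((s - 0.5) ** 2 for s in stim) / 0.1)


def next_generation(population, rates, rng, mutation_sd=0.05):
    """Produce a new generation by selection, crossing-over and mutation."""
    # Fitness-proportional selection; the small floor avoids an all-zero weight vector.
    weights = [r + 1e-9 for r in rates]
    # Elitism (an assumption of this sketch): carry the best stimulus over unchanged.
    best = max(range(len(population)), key=lambda i: rates[i])
    children = [list(population[best])]
    while len(children) < len(population):
        mom, dad = rng.choices(population, weights=weights, k=2)
        # Crossing-over: each stimulus dimension is inherited from one parent at random.
        child = [rng.choice(pair) for pair in zip(mom, dad)]
        # Mutation: small Gaussian jitter, clipped to the allowed range [0, 1].
        child = [min(1.0, max(0.0, c + rng.gauss(0.0, mutation_sd))) for c in child]
        children.append(child)
    return children


def run_ga(n_stimuli=30, n_generations=25, n_dims=4, seed=1):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_dims)] for _ in range(n_stimuli)]
    best_per_generation = []
    for _ in range(n_generations):
        rates = [firing_rate(s) for s in population]
        best_per_generation.append(max(rates))
        population = next_generation(population, rates, rng)
    return population, best_per_generation


population, best = run_ga()
print(best[0], best[-1])
```

In a real experiment `firing_rate` would be replaced by a measured, noisy response, so the best recorded rate would no longer be guaranteed to be non-decreasing across generations.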

Genetic algorithms were applied to neurophysiology studies by Winter and colleagues, who optimized the parameters of amplitude-modulated tones defined in a four-dimensional space in order to study neural coding in the inferior colliculus (Bleeck et al., 2003). The optimal stimuli found by this method were in agreement with tuning functions found by traditional methods, thereby validating the procedure. More recently, a very powerful demonstration of genetic algorithms as a tool for adaptive optimization was given by Connor and colleagues studying the representation of two-dimensional shape in V4 (Carlson et al., 2011) and three-dimensional shape in the inferotemporal cortex (Yamane et al., 2008; Hung et al., 2012). The parameter space needed to define three-dimensional shapes is immense and impossible to explore factorially, with most of the stimuli in this space being ineffective. Nevertheless, a genetic algorithm was successful at finding shape stimuli having features which were effective at driving neurons, with the optimization results being consistent over multiple runs. Furthermore, because the genetic algorithm cross-over step generates stimuli which factorially combine different stimulus dimensions, it did a sufficiently thorough job of sampling the stimulus space to permit the investigators to fit predictive models which accurately described the tuning of the neurons to arbitrary shape stimuli (Yamane et al., 2008).

As reviewed above, the different methods developed for automatically optimizing firing rate responses of sensory neurons differ greatly, both in their general search strategy (i.e., gradient ascent versus genetic algorithms) as well as their exact methods for implementing that strategy (Nelken et al., 1994; Foldiak, 2001; Koelling and Nykamp, 2012). Furthermore, it is important to note that while genetic algorithms are a commonly chosen alternative to gradient ascent in the existing literature (Bleeck et al., 2003; Yamane et al., 2008; Chambers et al., 2012; Hung et al., 2012), a wide variety of alternative optimization methods could in principle be applied as well, such as simulated annealing (Kirkpatrick et al., 1983) and particle swarm optimization (Kennedy and Eberhart, 1995). However, without direct comparisons of algorithms on benchmark problems using numerical simulation, it is hard to compare these various methods directly and fairly. As automated stimulus optimization becomes more widely used in physiological experiments, systematic comparison of optimization methods on benchmark problems is certainly an interesting avenue for future research in computational neuroscience.

#### **STIMULUS SPACE CONSTRAINTS**

Quite often, one may wish to optimize neuronal responses in a constrained stimulus space under constraints which are more complex than simple upper and lower bounds on stimulus dimensions. For many neurons one can always increase the firing rate simply by increasing the stimulus energy or contrast (Albrecht and Hamilton, 1982; Cheng et al., 1994; Oram et al., 2002), so it is of interest to optimize the stimulus with the constraint of fixed stimulus energy. In Eq. 1, the optimal stimulus is defined over the set of all allowable stimuli, *X*, which depends on the constraints in the stimulus space. When each component of the stimulus **x** = (*x*<sub>1</sub>, ..., *x*<sub>*n*</sub>)<sup>T</sup> is constrained between an upper bound and a lower bound (e.g., the luminance of image pixels has a limited range of possible values), the set *X* is a hypercube:

$$X = \left\{ \mathbf{x} : a_i \le x_i \le b_i, \ i = 1, \dots, n \right\}.\tag{2}$$

With a quadratic energy constraint, the allowable stimulus set *X* becomes a hypersphere:

$$X = \left\{ \mathbf{x} : \sum_{i=1}^{n} x_i^2 = E \right\}. \tag{3}$$

For example, Lewi et al. (2009) derived a fast procedure for choosing stimuli under a power constraint in the context of efficient model estimation. Optimizing the stimulus in Eq. 1 subject to an energy constraint is a constrained optimization problem for which many numerical solution methods are available (Douglas et al., 2000; Nocedal and Wright, 2006).
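One simple way to respect the constraint sets of Eqs 2 and 3 inside an iterative search is to map each proposed stimulus back onto the allowable set after every update. The helper functions below are a hypothetical sketch of that idea (the function names are ours and do not come from any of the cited methods): clipping gives the nearest point in the hypercube, and rescaling gives the nearest point on the hypersphere for any non-zero stimulus.

```python
import math


def project_to_hypercube(x, lower, upper):
    """Clip each component x_i to its allowed interval [a_i, b_i] (Eq. 2)."""
    return [min(b, max(a, xi)) for xi, a, b in zip(x, lower, upper)]


def project_to_sphere(x, energy):
    """Rescale x so that the sum of x_i**2 equals the fixed energy E (Eq. 3)."""
    norm2 = sum(xi * xi for xi in x)
    if norm2 == 0.0:
        raise ValueError("the zero stimulus has no unique projection onto the sphere")
    scale = math.sqrt(energy / norm2)
    return [xi * scale for xi in x]


print(project_to_hypercube([1.4, -0.2, 0.5], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
print(project_to_sphere([3.0, 4.0], 1.0))
```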

In special cases where there is prior information about the functional form of *f*(**x**), the constrained optimization problem may permit numerically elegant solutions for finding optimal stimuli subject to non-linear constraints, as well as finding invariant transformations of a stimulus which leave responses unchanged. A recent study (Berkes and Wiskott, 2006, 2007) considered the problem of optimizing the responses of any neuron whose functional properties are given by an inhomogeneous quadratic form *f*(**x**) = **x**<sup>T</sup>**Ax** + **b**<sup>T</sup>**x** + *c*, subject to an energy constraint **x**<sup>T</sup>**x** = *E*. This study presented a very efficient algorithm for computing the optimal stimulus **x**<sup>∗</sup> which requires only a bounded one-dimensional search for a Lagrange multiplier, followed by analytical calculation of the optimal stimulus. In addition, they demonstrated a procedure for finding approximate invariant transformations in the constrained stimulus space, which for complex cells amount to shifts in the phase of a Gabor patch. As quadratic models have become popular tools for characterizing non-linear sensory neurons (Heeger, 1992a; Yu and Young, 2000; Simoncelli et al., 2004; Berkes and Wiskott, 2005; Bandyopadhyay et al., 2007a), their algorithm offers a useful tool for sensory neuroscience.
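Why a one-dimensional search suffices can be seen from the stationarity conditions of the constrained problem. The following is a standard textbook sketch, assuming that **A** is symmetric and that **b** is not orthogonal to the leading eigenvector of **A**; it is not a reproduction of the published algorithm. Introducing a Lagrange multiplier λ for the energy constraint gives

$$\mathcal{L}(\mathbf{x}, \lambda) = \mathbf{x}^{\mathrm{T}}\mathbf{A}\mathbf{x} + \mathbf{b}^{\mathrm{T}}\mathbf{x} + c - \lambda\left(\mathbf{x}^{\mathrm{T}}\mathbf{x} - E\right),$$

and setting the gradient with respect to **x** to zero yields

$$2\mathbf{A}\mathbf{x} + \mathbf{b} - 2\lambda\mathbf{x} = \mathbf{0} \quad\Longrightarrow\quad \mathbf{x}(\lambda) = \tfrac{1}{2}\left(\lambda\mathbf{I} - \mathbf{A}\right)^{-1}\mathbf{b}.$$

Substituting into the constraint leaves a single scalar equation in λ:

$$\mathbf{x}(\lambda)^{\mathrm{T}}\mathbf{x}(\lambda) = \tfrac{1}{4}\,\mathbf{b}^{\mathrm{T}}\left(\lambda\mathbf{I} - \mathbf{A}\right)^{-2}\mathbf{b} = E.$$

For λ larger than the largest eigenvalue of **A**, the left-hand side decreases monotonically in λ, so the multiplier can be located by a one-dimensional search, after which the stimulus follows in closed form from **x**(λ).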

#### **NEURAL RESPONSE ADAPTATION**

It is well known that when the same stimulus is presented repeatedly to sensory neurons, they exhibit firing rate adaptation, becoming less sensitive to that stimulus over time (Ulanovsky et al., 2003; Asari and Zador, 2009). Similarly, responses to sensory stimuli can often be non-stationary and are affected by context provided by preceding stimuli (Bartlett and Wang, 2005). Adaptation potentially presents a difficulty for stimulus optimization methods, since toward the end of the optimization run as the algorithm converges on a (locally) optimal stimulus, a series of very similar stimuli may be presented repeatedly, thereby leading to firing rate adaptation. This phenomenon has been observed in studies in the published literature (Yamane et al., 2008) and presents a potential obstacle to studies of adaptive stimulus optimization (Koelling and Nykamp, 2012). Given the suppression of neural responses to stimuli which occur with high probability (Ulanovsky et al., 2003), one way of dealing with adaptation may be to intersperse random stimuli with those generated by the optimization run, so as to reduce adaptation effects. However, this may be an inefficient method for dealing with adaptation, since it increases the number of stimuli needed in an experiment (Koelling and Nykamp, 2012).

Apart from these technical considerations, the problem of firing rate adaptation illustrates a fundamental conceptual limitation of phenomenological sensory neurophysiology. In particular, it demonstrates that the act of probing a sensory neuron with stimuli can potentially change the response properties of the neuron itself, possibly including its optimal stimulus. Therefore, it may not be conceptually correct to characterize the stimulus optimization problem as it is written in Eq. 1, but rather to characterize it as a far more complicated optimization problem where the function *f*(**x**, **h**(*t*)) to be optimized is constantly changing, dependent on both the stimulus **x** and response history **h**(*t*). In this case, the optimal stimulus for a given neuron may only be well-defined with respect to a given history of stimuli and responses.

One solution to this problem would be to have a mathematical model of the neuron's stimulus–response function which takes adaptation into account. Indeed, recent work has demonstrated that bilinear models of sensory neurons incorporating adaptation parameters can greatly improve predictions when compared with standard linear receptive field models (Ahrens et al., 2008a). Other work has shown that the failure of spectrotemporal receptive field (STRF) models to account fully for neural responses to natural stimuli may be accounted for by rapid synaptic depression (David et al., 2009), further underscoring the importance of including adaptation parameters in neural models. We discuss the issues of neuronal adaptation and stimulus-response history further when we discuss the estimation of neural models using active data collection.

On the whole, however, the problem of adaptation does not seem to pose a fatal limitation to adaptive firing rate optimization, as it has been applied successfully in many recent studies (Foldiak, 2001; O'Connor et al., 2005). Furthermore, there are many neurons in the brain for which adaptation effects are small and thus do not pose a concern (Ingham and McAlpine, 2004). These methods are potentially of great importance for investigating neural coding of complex stimuli defined in high-dimensional spaces (Yamane et al., 2008), and it is of great interest to better understand how adaptation affects stimulus optimization and receptive field characterization.

## **ISO-RESPONSE SURFACES AND MODEL COMPARISON**

In high-dimensional stimulus spaces, the same response from a sensory neuron can be elicited by a continuum of equally effective optimal stimuli rather than a single optimal stimulus (**Figure 1**). Therefore, in some experiments it may be of interest to find sets of equivalent stimuli known as *iso-response surfaces* which yield the same response. One possible way of formalizing an optimization problem for this class of experiments is to formulate it as finding stimuli

$$\mathbf{x}^{*} = \underset{\mathbf{x} \in X}{\arg\min}\ d\big(f(\mathbf{x}), c\big), \tag{4}$$

where *d*(·,·) is some metric (e.g., squared error) quantifying the discrepancy between the desired response *c* and the neuronal response *f*(**x**). Multiple optimization runs from different starting locations and for different values of the desired constant response *c* permit the experimenter to determine families of iso-rate surfaces for the neuron under study. The geometrical shapes of the iso-rate surfaces can help to determine how stimulus variables *x*<sub>1</sub>, ..., *x*<sub>*n*</sub> are integrated, and thus provide a useful tool for comparison of hypothetical models. For instance, linear integration of stimulus energy would yield iso-response surfaces which are hyperplanes of the form ∑<sub>*i*=1</sub><sup>*n*</sup> *x*<sub>*i*</sub> = *c*, whereas non-linear integration would yield non-planar iso-response surfaces. **Figure 3** illustrates iso-response surfaces for two different hypothetical sensory processing models.
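The way in which the shape of an iso-response contour distinguishes two integration rules can be reproduced in a toy setting. The sketch below assumes two hypothetical, noiseless model neurons (`energy_model` and `linear_sum_model`) whose responses grow monotonically along any ray from the origin, and traces a contour by bisection on the stimulus amplitude along each direction. With real, noisy responses a stochastic search that minimizes the discrepancy in Eq. 4 would be needed instead.

```python
import math


def energy_model(x):
    """Hypothetical energy integrator: saturating function of x1^2 + x2^2."""
    energy = x[0] ** 2 + x[1] ** 2
    return 100.0 * energy / (1.0 + energy)


def linear_sum_model(x):
    """Hypothetical linear integrator followed by rectification."""
    return max(0.0, 30.0 * (x[0] + x[1]))


def iso_response_point(model, angle, target, r_max=10.0, tol=1e-9):
    """Bisection on the radius r so that model(r * direction) matches target.

    Assumes the response increases monotonically with r along this direction.
    """
    direction = (math.cos(angle), math.sin(angle))
    lo, hi = 0.0, r_max
    if model((hi * direction[0], hi * direction[1])) < target:
        return None  # target response is not reached within r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model((mid * direction[0], mid * direction[1])) < target:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    return (r * direction[0], r * direction[1])


angles = [math.pi * k / 16 for k in range(1, 8)]  # directions in the first quadrant
circle = [iso_response_point(energy_model, a, 50.0) for a in angles]
line = [iso_response_point(linear_sum_model, a, 30.0) for a in angles]
print(circle[0], line[0])
```

For the energy model the recovered points lie on a circle, whereas for the linear model they lie on a straight line, mirroring the distinction between curved and planar iso-response surfaces drawn in the text.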

The iso-response surface method was used by Gollisch et al. (2002) to test several competing hypotheses about how spectral energy is integrated in locust auditory receptors. The iso-response contours to combinations of two or three pure tone stimuli with fixed frequencies and variable amplitudes were of elliptical shape, consistent with an energy-integrator model of spectral integration. Further work extended the iso-response method to incorporate temporal integration, yielding a complete cascade model of auditory transduction (Gollisch and Herz, 2005).

A more recent study applied this technique to study the integration of visual contrast over space in salamander retinal ganglion cells, revealing a threshold-quadratic non-linearity in the receptive field center as well as a subset of ganglion cells most sensitive to spatially homogeneous stimuli (Bölinger and Gollisch, 2012). The iso-response surface method has also been applied fruitfully in mammalian sensory systems. A recent study by Horwitz and Hass (2012) utilized this procedure to study integration of color signals from the three retinal cone types in single neurons in the primary visual cortex. It was found that half of the neurons had planar iso-response surfaces, consistent with linear integration of cone signals. However, the other half showed a variety of non-linear iso-response surfaces, including cup-shaped surfaces indicating sensitivity to only narrow regions of color space.

Although the iso-response surface method has been applied successfully in stimulus spaces of low dimensionality (two or three dimensions), tracing out level hyper-surfaces in higher-dimensional spaces may pose a formidable computational challenge (Han et al., 2003; Willett and Nowak, 2007). In future research, dimensionality reduction procedures might be useful for extending the iso-response surface method to high-dimensional stimulus spaces like pixel space or auditory frequency space (Yu and Young, 2000), as well as for high-dimensional spaces defining complex naturalistic stimuli like 3D shapes or species-specific communication sounds (DiMattina and Wang, 2006; Yamane et al., 2008).

### **MAXIMALLY INFORMATIVE STIMULUS ENSEMBLES**

It has been proposed that one of the major goals of sensory coding is to efficiently represent the natural environment (Barlow, 1961; Simoncelli, 2003). In this spirit, another class of closed-loop stimulus optimization methods has been developed to find the optimal *ensemble* of sensory stimuli for maximizing the mutual information between stimuli and neural responses (Machens, 2002). This method differs from efforts to find the optimal stimulus or efforts to find iso-response surfaces because the goal is not to find an individual stimulus **x**<sup>∗</sup> which optimizes the desired criterion (i.e., Eq. 1), but rather to find the optimal distribution *p*<sup>∗</sup>(**x**) which optimizes the mutual information *I*(*y*; **x**), where *y* denotes the observed neural response (typically the firing rate of a single neuron). Mathematically, we write

$$p^{*}(\mathbf{x}) = \underset{p(\mathbf{x}) \in P}{\arg\max}\ I(y; \mathbf{x}), \qquad I(y; \mathbf{x}) = \int_X \int_Y p(y \mid \mathbf{x})\, p(\mathbf{x}) \ln \frac{p(y \mid \mathbf{x})}{p(y)}\, dy\, d\mathbf{x}, \tag{5}$$

where *P* is the space of probability densities on the stimulus space *X*, and *p*(*y* | **x**) and *p*(*y*) are determined experimentally by observing neural responses to stimuli. In practice, one starts with an assumption of a uniform distribution with finite support and then applies the Blahut–Arimoto algorithm (Blahut, 1972; Arimoto, 1972) to iteratively update the sampling weights (Machens, 2002). This method has been applied experimentally to characterize grasshopper auditory receptor neurons, demonstrating optimality for processing behaviorally relevant species-specific communication sounds (Machens et al., 2005; Benda et al., 2007).
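For a finite set of stimuli and responses, the iterative reweighting can be sketched as follows. This is a minimal textbook form of the Blahut–Arimoto update, not the experimental implementation: the conditional response probabilities `p_y_given_x` are assumed to be known exactly, whereas in an experiment they must be estimated from recorded responses, and the three-stimulus example at the end is hypothetical.

```python
import math


def mutual_information(p_x, p_y_given_x):
    """I(y; x) in nats for a discrete stimulus distribution and response model."""
    n_y = len(p_y_given_x[0])
    p_y = [sum(px * row[j] for px, row in zip(p_x, p_y_given_x)) for j in range(n_y)]
    info = 0.0
    for px, row in zip(p_x, p_y_given_x):
        for pyx, py in zip(row, p_y):
            if px > 0.0 and pyx > 0.0:
                info += px * pyx * math.log(pyx / py)
    return info


def blahut_arimoto(p_y_given_x, n_iter=200):
    """Iteratively reweight the stimulus ensemble, starting from a uniform one."""
    n_x = len(p_y_given_x)
    n_y = len(p_y_given_x[0])
    p_x = [1.0 / n_x] * n_x
    for _ in range(n_iter):
        p_y = [sum(px * row[j] for px, row in zip(p_x, p_y_given_x)) for j in range(n_y)]
        # d[i]: Kullback-Leibler divergence between p(y | x_i) and the marginal p(y).
        d = [
            sum(pyx * math.log(pyx / py) for pyx, py in zip(row, p_y) if pyx > 0.0 and py > 0.0)
            for row in p_y_given_x
        ]
        # Stimuli whose responses stand out from the average gain sampling weight.
        weights = [px * math.exp(di) for px, di in zip(p_x, d)]
        total = sum(weights)
        p_x = [w / total for w in weights]
    return p_x, mutual_information(p_x, p_y_given_x)


# Hypothetical example: two informative stimuli and one that evokes random responses.
channel = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
p_opt, info = blahut_arimoto(channel)
print(p_opt, info)
```

In this example the uninformative third stimulus is progressively removed from the ensemble, which is the qualitative behavior one expects from a maximally informative stimulus distribution.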

## **ADAPTIVE OPTIMIZATION FOR SENSORY MODEL ESTIMATION**

An ideal gold standard for sensory neuroscience is to obtain a complete and accurate functional stimulus–response model of the neuron under study. In theory, once such a model is attained, one can then numerically or analytically calculate from this model the neuron's optimal stimulus, its iso-response surfaces, and its maximally informative stimulus ensembles. That is, if one identifies the system, one gets "for free" other information one may be interested in. However, despite its conceptual appeal, the problem of system identification is of great practical difficulty. This is because one needs to specify an accurate yet experimentally tractable model whose parameters can be estimated from data obtained during the available observation time. Unfortunately, research in computational neuroscience has shown that tractable (e.g., linear and quadratic) models are not accurate, whereas more biologically accurate models (deep, multiple layer neural networks incorporating dynamics, recurrence, etc.) often pose intractable parameter estimation problems.

It is well known from the fields of statistics and machine learning that one can more quickly and accurately estimate the parameters of a function using adaptive data collection, where new observations are generated in an iterative, adaptive manner so as to optimize the expected utility of the responses given the goal of estimating the model parameters (Lindley, 1956; Bernardo, 1979; MacKay, 1992). Mathematically, the optimization problem is to find at each iteration

$$\mathbf{x}_{n+1}^{*} = \underset{\mathbf{x} \in X}{\arg\max}\ U_{n}^{(\mathcal{E})}(\mathbf{x}),\tag{6}$$

where *U*<sub>*n*</sub><sup>(E)</sup>(**x**) is the estimation utility function based on the data of the first *n* stimulus–response pairs. There are many choices for this function, including expected squared error (Müller and Parmigiani, 1995), expected prediction error (Sugiyama, 2006), and mutual information between stimuli and model parameters (Paninski, 2005). The generic name for this strategy is *optimal experimental design* or OED (Federov, 1972; Atkinson and Donev, 1992; Cohn et al., 1996), and it is often studied in a Bayesian framework (MacKay, 1992; Chaloner and Verdinelli, 1995). Recent theoretical and experimental work has shown that such methods can potentially be fruitfully applied in neuroscientific experiments (Paninski, 2005; Paninski et al., 2007; Lewi et al., 2009, 2011; DiMattina and Zhang, 2011). Not only can optimal experimental design make the estimation of high-dimensional models practical (Lewi et al., 2009), but it can also make it tractable to estimate highly non-linear models which cannot be readily identified from random "white noise" data of the kind typically used in system identification experiments (DiMattina and Zhang, 2010, 2011). We first discuss the application of such methods in psychology and cognitive science, and then discuss recent theoretical and experimental work on applications of OED methods to sensory neurophysiology experiments.
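A small worked example may help to make Eq. 6 concrete. The sketch below assumes a hypothetical one-parameter model in which the probability of a spike is a logistic function of the stimulus with unknown threshold `theta`, represents the posterior on a discrete grid, and uses the mutual information between the next response and the parameter as the utility. The model, the grid and the candidate stimulus set are illustrative assumptions and are not taken from any of the cited studies.

```python
import math


def p_spike(x, theta, slope=4.0):
    """Hypothetical response model: spike probability for stimulus x and threshold theta."""
    return 1.0 / (1.0 + math.exp(-slope * (x - theta)))


def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def expected_info_gain(x, thetas, posterior):
    """Mutual information between the next response and theta under the current posterior."""
    probs = [p_spike(x, th) for th in thetas]
    p_marginal = sum(w * p for w, p in zip(posterior, probs))
    mean_entropy = sum(w * binary_entropy(p) for w, p in zip(posterior, probs))
    return binary_entropy(p_marginal) - mean_entropy


def choose_stimulus(candidates, thetas, posterior):
    """Eq. 6 evaluated on a finite set of candidate stimuli."""
    return max(candidates, key=lambda x: expected_info_gain(x, thetas, posterior))


def update_posterior(x, spiked, thetas, posterior):
    """Bayes rule on the parameter grid after observing one response."""
    likes = [p_spike(x, th) if spiked else 1.0 - p_spike(x, th) for th in thetas]
    unnorm = [w * l for w, l in zip(posterior, likes)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


thetas = [-1.0 + 0.1 * i for i in range(21)]      # parameter grid
prior = [1.0 / len(thetas)] * len(thetas)         # flat prior
candidates = [-3.0 + 0.5 * i for i in range(13)]  # allowable stimuli
x_next = choose_stimulus(candidates, thetas, prior)
posterior = update_posterior(x_next, True, thetas, prior)
print(x_next, sum(t * w for t, w in zip(thetas, posterior)))
```

Stimuli far from every plausible threshold evoke nearly the same response under all parameter values and therefore carry almost no information, which is why the utility steers sampling toward the region where the candidate models disagree.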

#### **ADAPTIVE STIMULUS OPTIMIZATION IN PSYCHOLOGY AND COGNITIVE SCIENCE**

Psychophysics has long utilized adaptive data collection, with the classic example being the staircase method for threshold estimation (Cornsweet, 1962). More recently, an adaptive Bayesian approach to threshold estimation (QUEST) which chooses new stimuli for each trial at the current Bayesian estimate of the threshold was developed (Watson and Pelli, 1983), and subsequent work extended this approach to permit simultaneous estimation of both the threshold and slope of the psychometric function (Snoeren and Puts, 1997). Another line of work applied an information-theoretic approach to estimating the slope and threshold parameters, where stimuli were chosen at each trial to maximize the expected information gained about the slope and threshold parameters (Kontsevich and Tyler, 1999). More sophisticated methods of this kind have been utilized for psychometric functions defined on two-dimensional stimuli (Kujala and Lukka, 2006), with these procedures being applied for estimating contrast sensitivity functions (Lesmes et al., 2010) and color sensitivity of human observers (Kujala and Lukka, 2006). In addition to finding widespread application in sensory psychophysics, adaptive methods have also been used more broadly in the cognitive sciences (Wixted and Ebbesen, 1991; Rubin and Wenzel, 1996; Nosofsky and Zaki, 2002; Opfer and Siegler, 2007; Kujala et al., 2010; McCullough et al., 2010).

#### **GENERALIZED LINEAR MODELS AND BIOPHYSICAL MODELS**

More recently, investigators in computational neuroscience have demonstrated that adaptive information-theoretic sampling where stimuli are chosen to maximize the expected information gain between a stimulus and the model parameters can be a highly effective means of estimating the parameters of sensory processing models (Paninski, 2005; Paninski et al., 2007). A fast information-theoretic algorithm has been developed for the generalized linear model which applies a static non-linearity to the output (Lewi et al., 2009). The generalized linear model has been utilized in numerous studies (Simoncelli et al., 2004) and enjoys a likelihood function with no local maxima (Paninski, 2004). Their algorithm relied on a Gaussian approximation to the posterior density, permitting fast recursive updates, with the calculations for finding the optimal stimulus growing only as the square of the stimulus space dimensionality. Numerical simulations demonstrated that their procedure was asymptotically efficient, with the empirically computed variance of the posterior density converging to the minimum theoretically possible variance.

One issue which potentially affects studies of stimulus optimization is neuronal adaptation due to the stimulus history (Ulanovsky et al., 2003; Asari and Zador, 2009). In sensory neurons, this may be manifested as the system actually changing its underlying parameters which we seek to estimate as the experiment progresses. However, the procedure developed by Lewi et al. (2009) was demonstrated to be robust to parameter drift in numerical simulations, suggesting the ability to compensate for changes to the system brought about by adaptation effects. Furthermore, their model also permits the estimation of a spike-history filter, allowing neuronal response history to influence predictions to new stimuli.

A further study by this group applied this algorithm to fitting generalized linear models to avian auditory neurons probed with conspecific song samples, and it was found that accurate estimation could be obtained using vastly fewer samples when they were chosen adaptively using the algorithm than when they were chosen non-adaptively (Lewi et al., 2011). Although this procedure has yet to be applied in real on-line experiments, it provides experimenters working on a variety of systems with a powerful tool for quickly characterizing neurons whose responses are well described by generalized linear models (Chichilnisky, 2001) or related models (Pillow et al., 2008).

More recently, this group has also applied optimal experimental design to the cellular neuroscience problem of accurately estimating voltages from dendritic trees using measurements suffering from low signal-to-noise ratio (Huggins and Paninski, 2012). Using simulated compartmental models, these authors demonstrated that by adaptively choosing observation locations which minimize the expected squared error of the voltage measurement, a substantial improvement in accuracy was obtained compared to random sampling. This procedure is potentially of great experimental usefulness because techniques like two-photon imaging permit spatially complete observations of dendrites, but with low signal-to-noise ratios (Djurisic et al., 2008; Canepari et al., 2010).

## **MULTIPLE LAYER NEURAL NETWORKS**

Since many sensory neurons are non-linear (Young et al., 2005; Wu et al., 2006), it is of interest to characterize neurons using various non-linear models, including quadratic and bilinear models (Yu and Young, 2000; Berkes and Wiskott, 2006; Ahrens et al., 2008a,b), neural networks (Lau et al., 2002; Prenger et al., 2004; Cadieu et al., 2007) and basis function networks (Poggio and Girosi, 1990). A generalized linear model is also a non-linear model because it employs a static non-linearity at the output stage. Although a generalized linear model allows limited non-linearities, it enjoys tractable and consistent estimation procedures without problems of local minima (Paninski, 2004). Identifying more complex nonlinear models like hierarchical neural networks from physiological data tends to be harder due to problems like local minima and plateaus in the error surface (Amari et al., 2006; Cousseau et al., 2008; Wei and Amari, 2008; Wei et al., 2008).

For studies aimed at estimating generalized linear models, the use of a fixed white-noise stimulus set is often quite effective and is theoretically well-justified (Chichilnisky, 2001; Paninski, 2004; Wu et al., 2006). However, recent theoretical work suggests that using fixed stimulus sets like white noise may be deeply problematic for efforts to identify non-linear hierarchical network models due to *continuous parameter confounding* (DiMattina and Zhang, 2010). This problem is illustrated for a very simple non-linear neural network model shown in **Figure 4A**. In this example, the goal is to recover the parameters (*w*, *v*) of the network by performing maximum likelihood (ML) estimation given noisy stimulus–response observations. When the input stimuli *x* only drive the hidden unit over a region of its gain function which can be well approximated by a power function (**Figure 4B**, top), the estimates obtained by ML for different datasets lie scattered along the continuum *vw*<sup>α</sup> = *C*, as one would expect for a power law gain function *g*(*u*) = *Au*<sup>α</sup> (**Figure 4C**, top). (Here the constant *C* = *v*<sub>T</sub>*w*<sub>T</sub><sup>α</sup>, where *w*<sub>T</sub> and *v*<sub>T</sub> are the true values of the input and output weights.) In contrast, when the input stimuli *x* drive the hidden unit over a full range of its gain so that the power law approximation is poor (**Figure 4B**, bottom), the true parameters are accurately recovered for different datasets (**Figure 4C**, bottom).
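The power law case can be checked numerically. The sketch below assumes a network with a single hidden unit and no bias term, positive stimuli and weights, and hypothetical values for the exponent and the true parameters. It constructs a second parameter pair on the curve *vw*<sup>α</sup> = *C* and shows that the two networks give the same output for every stimulus, so no amount of data from this regime can tell them apart.

```python
def power_law_response(x, w, v, gain_scale=1.0, alpha=1.5):
    """Output v * g(w * x) of a one-hidden-unit network with gain g(u) = A * u**alpha."""
    return v * gain_scale * (w * x) ** alpha


def confounded_pair(w_true, v_true, w_new, alpha=1.5):
    """Return (w_new, v_new) such that v_new * w_new**alpha == v_true * w_true**alpha."""
    c = v_true * w_true ** alpha
    return w_new, c / w_new ** alpha


w_true, v_true = 1.0, 2.0                         # hypothetical true parameters
w_alt, v_alt = confounded_pair(w_true, v_true, w_new=1.7)
for x in (0.1, 0.5, 1.0, 2.0):                    # positive stimuli only
    print(x, power_law_response(x, w_true, v_true), power_law_response(x, w_alt, v_alt))
```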

A hypothetical experiment which suffers from this problem is illustrated in **Figure 5**. We see that when the network in **Figure 5A** is probed with random stimuli (**Figure 5B**, right), the hidden unit is driven over a limited range of its gain function which may be well approximated by an exponential, so that the sigmoidal gain (**Figure 5C**, black curve) may *de facto* be replaced by a new exponential gain function *g*(*u*) = *Ae*<sup>α*u*</sup> (**Figure 5C**, red curve). With this new gain, it follows that a continuum of different values of the output weight *v* and bias *w*<sub>0</sub> lying on the curve *ve*<sup>α*w*<sub>0</sub></sup> = *C* will yield models whose responses to the training data are indistinguishable and therefore multiple estimates of these parameters from different random training sets will lie scattered along this curve (**Figure 5D**). (Here the constant *C* = *v*<sub>T</sub>*e*<sup>α*w*<sub>0T</sub></sup>, where *v*<sub>T</sub> and *w*<sub>0T</sub> are the true values of the output weight and hidden unit bias.)

**FIGURE 4** | **(A)** […] output weight parameters (*w*, *v*) we wish to estimate from noisy stimulus–response data. Noise is drawn from a Poisson distribution. **(B)** *Top*: The input stimuli *x* ∈ [−0.5, 0.5] only drive the hidden unit over a limited region of its gain function (black curve) which may be well approximated by a power function […] **(C)** *Top*: When trained with sets of stimuli like that in the top of **Figure 4B**, the estimates (black dots) lie scattered along the curve predicted by the power law confounding theory. *Bottom*: When trained with sets of stimuli like those in the bottom panel of **Figure 4B**, the true parameter values (red triangle) are more reliably recovered.

Adaptive stimulus optimization methods like information-theoretic sampling (Paninski, 2005) can in principle overcome this problem of continuous parameter confounding, as we see in **Figure 5D**, where the correct network parameters are reliably recovered when optimally designed stimuli (**Figure 5B**, left) are used. This simple example suggests that adaptive stimulus optimization may make it tractable to reliably recover the parameters of complex hierarchical networks needed to model non-linear neurons, whereas it is much harder to recover these networks using standard stimulus sets like white noise.

Many previous studies in the statistics and machine learning literature have demonstrated that faster convergence and smaller generalization error may be obtained when neural networks are trained adaptively using optimally designed stimuli (Lindley, 1956; MacKay, 1992; Cohn et al., 1994). Recently, we have developed a practical method for implementing the information-theoretic stimulus optimization approach derived for generalized linear models (Lewi et al., 2009) for arbitrary nonlinear models like hierarchical neural networks. Although this method employs numerous approximations, it has been shown in simulated experiments to be effective at recovering non-linear neural networks having multiple hidden units, and is fast enough to utilize in real experiments (Tam et al., 2011; Dekel, 2012; Tam, 2012).

## **ADAPTIVE OPTIMIZATION FOR SENSORY MODEL COMPARISON**

Quite often the appropriate model for describing a sensory neuron or perceptual quantity is unknown. Therefore, an important experimental goal may be to discriminate between two or more competing models. Mathematically, the optimization problem is to iteratively find stimuli

$$\mathbf{x}_{n+1}^{*} = \underset{\mathbf{x}}{\arg\max}\ U_{n}^{(\mathrm{C})}(\mathbf{x}),\tag{7}$$

which optimize a model comparison utility function *U<sub>n</sub>*<sup>(C)</sup>(**x**), one choice of which may be the expected reduction in model space entropy (Cavagnaro et al., 2010; DiMattina and Zhang, 2011). This equation may be regarded as the optimal comparison counterpart of the equation for optimal estimation (Eq. 6). We now briefly discuss recent studies making use of adaptive stimulus optimization for model selection.

#### **PSYCHOPHYSICAL MODEL COMPARISON**

**FIGURE 5 |** [...] input from a broadly integrating interneuron (I-unit). **(B)** Examples of optimally designed (left) and random (right) stimuli. Note that the optimally designed stimuli exhibit complex correlated structure. **(C)** Random stimuli (green dots) only drive the E-unit over a limited range of its gain function (black curve) [...] the exponential theory (black curve). In contrast, estimates attained from optimally designed stimuli accurately recover the true parameters (red triangle).

Although standard model comparison methods like the Bayesian Information Criterion (BIC; Schwarz, 1978) or predictive cross-validation may be applied *post hoc* (Vladusich et al., 2006; Wu et al., 2006), numerous studies suggest that performing experiments using stimuli optimized for model comparison may be far more effective (Atkinson and Fedorov, 1975a,b). One method for model comparison developed recently for psychophysical experiments is known as MAximum Differentiation (MAD) competition (Wang and Simoncelli, 2008). Given two perceptual models which relate stimulus parameters to a perceptual quantity, this method generates a pair of stimuli which respectively maximize and minimize the response of one model while holding the other model's response fixed. Next, this procedure is repeated with the roles of the two models reversed. Testing human subjects on the two pairs of synthesized stimuli can determine which model is "better" in the sense of telling us which model's max/min pairs are simpler to discriminate. This procedure has been fruitfully applied to comparing image quality assessment models which aim to predict human perception of image quality (Wang and Bovik, 2006).
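The logic of MAD competition can be sketched on a toy problem. In the Python example below, the two "perceptual models", the two-parameter stimulus space and the helper `mad_pair` are all hypothetical, and exhaustive search over a small grid stands in for the constrained optimization that a realistic stimulus space would require.

```python
def model_a(x):
    """Hypothetical perceptual model A: sums the two stimulus parameters."""
    return x[0] + x[1]

def model_b(x):
    """Hypothetical perceptual model B: takes the larger stimulus parameter."""
    return max(x[0], x[1])

def mad_pair(free_model, fixed_model, fixed_level, candidates, tol=1e-9):
    """Return (argmax, argmin) of free_model over the candidates on which
    fixed_model's response equals fixed_level (within tol)."""
    level_set = [x for x in candidates
                 if abs(fixed_model(x) - fixed_level) <= tol]
    return max(level_set, key=free_model), min(level_set, key=free_model)

grid = [i / 10.0 for i in range(11)]
candidates = [(a, b) for a in grid for b in grid]

# Pair 1: extremize model A while model B's response is held at 1.0.
pair_1 = mad_pair(model_a, model_b, 1.0, candidates)
# Pair 2: roles reversed, extremize model B while model A is held at 1.0.
pair_2 = mad_pair(model_b, model_a, 1.0, candidates)
print(pair_1, pair_2)
```

Each pair contains two stimuli that one model treats as equivalent and the other treats as maximally different. Whether a subject can tell the members of each pair apart then indicates which model's predictions fail.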

An information-theoretic method for model comparison was recently derived by Cavagnaro et al. (2010). Given a set of models with the *i*-th model having prior probability *P*<sub>0</sub>(*i*), stimuli are chosen to maximize the mutual information between the stimulus and the model index *i* by minimizing the expected model space entropy in a manner directly analogous to information-theoretic model estimation (Paninski, 2005), except that in this case the unknown variable is a discrete model index *i* rather than a continuous parameter value **θ**. This method was applied to competing models of memory retention from the cognitive science literature (Rubin et al., 1999) and was shown to permit much more accurate discrimination than standard non-adaptive methods.
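A minimal sketch of this selection rule follows, under simplifying assumptions that are not part of the published method: each model has fixed rather than uncertain parameters, the response is binary, and the two retention curves, their parameter values and the candidate retention intervals are hypothetical. For a binary response, the expected reduction in model space entropy equals the mutual information between the model index and the response.

```python
import math

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def mutual_information(prior, success_probs):
    """I(model index; binary response) for one candidate stimulus.

    prior[i] is P0(i); success_probs[i] is model i's predicted
    probability of a positive response to the stimulus.
    """
    marginal = sum(P * p for P, p in zip(prior, success_probs))
    conditional = sum(P * binary_entropy(p)
                      for P, p in zip(prior, success_probs))
    return binary_entropy(marginal) - conditional

# Two hypothetical retention models with fixed, illustrative parameters.
def power_model(t):
    return 0.9 * (t + 1.0) ** -0.5

def exponential_model(t):
    return 0.9 * math.exp(-0.1 * t)

prior = [0.5, 0.5]
candidate_intervals = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 40.0]

utilities = {
    t: mutual_information(prior, [power_model(t), exponential_model(t)])
    for t in candidate_intervals
}
best_interval = max(utilities, key=utilities.get)
print(best_interval, utilities[best_interval])
```

At a retention interval of zero both models predict the same recall probability, so that stimulus carries no information about the model index. In a full adaptive experiment the prior would be replaced by the current posterior over models after each trial.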

#### **NEURAL MODEL COMPARISON**

In general, the correct parameters of competing sensory processing models are not known beforehand. Therefore, it is of interest to consider how to conduct experiments which estimate and discriminate competing models. Typically, investigators in neurophysiology and neuroimaging have applied model-comparison techniques *post hoc* (David and Gallant, 2005; Vladusich et al., 2006; Penny, 2012), particularly in the system identification literature (Prenger et al., 2004; David and Gallant, 2005; Wu et al., 2006; Sharpee et al., 2008; Rabinowitz et al., 2012; Schinkel-Bielefeld et al., 2012). However, a fundamental limitation with *post hoc* analysis is that it is not possible to generate and test critical stimuli which are optimized for model comparison, as this is only possible while the system is under observation. This limitation can only be overcome by fitting multiple models to a sensory neuron during the course of an experiment and then using the fitted models to generate and present critical stimuli which are optimized to best discriminate the models. Although previous work has presented stimuli on-line to test or verify a single model (deCharms et al., 1998; Touryan et al., 2002), very little work in single-unit *in vivo* sensory neurophysiology has presented stimuli optimized for model comparison in real-time (Tam et al., 2011).

A recent study considered a two-stage approach for combining the goals of model estimation and comparison in neurophysiology experiments, illustrated schematically in **Figure 6A** (DiMattina and Zhang, 2011). In the first stage, stimuli are adaptively optimized for parameter estimation, with the optimal stimulus for each model being presented in turn. In the second stage, stimuli are generated adaptively in order to optimally discriminate competing models making use of an information-theoretic criterion (Cavagnaro et al., 2010) or a likelihood-based criterion. In the special case of two models *f*<sub>1</sub>(**x**, **θ**<sub>1</sub>), *f*<sub>2</sub>(**x**, **θ**<sub>2</sub>) and Gaussian response noise, it can be shown that under a likelihood criterion the best stimulus for model discrimination is the stimulus which maximizes the quantity [*f*<sub>1</sub>(**x**, **θ**<sub>1</sub>) − *f*<sub>2</sub>(**x**, **θ**<sub>2</sub>)]<sup>2</sup>, and furthermore this stimulus will maximally increase the BIC in favor of the best model (DiMattina and Zhang, 2011).
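The Gaussian special case lends itself to a compact illustration. With equal noise variance σ<sup>2</sup> under both models, the expected log-likelihood ratio in favor of the true model at a stimulus is the squared difference of the two predictions divided by 2σ<sup>2</sup>, so ranking stimuli by the squared difference is sufficient. In the Python sketch below, the functional forms of the additive and multiplicative models, their parameter values and the candidate grid are assumptions chosen for illustration.

```python
def additive_model(x, theta):
    """Illustrative additive model: f1(x) = theta[0]*x[0] + theta[1]*x[1]."""
    return theta[0] * x[0] + theta[1] * x[1]

def multiplicative_model(x, theta):
    """Illustrative multiplicative model: f2(x) = theta[0]*x[0]*x[1]."""
    return theta[0] * x[0] * x[1]

def discrimination_utility(x, theta1, theta2):
    """Squared difference between the two models' predictions at x."""
    return (additive_model(x, theta1) - multiplicative_model(x, theta2)) ** 2

theta1 = (1.0, 1.0)   # hypothetical fitted parameters of the additive model
theta2 = (2.0,)       # hypothetical fitted parameter of the multiplicative model

grid = [i / 10.0 for i in range(11)]
candidates = [(a, b) for a in grid for b in grid]

best = max(candidates,
           key=lambda x: discrimination_utility(x, theta1, theta2))
print(best, discrimination_utility(best, theta1, theta2))
```

With these parameter values the two models agree when both inputs are fully on or fully off, and disagree most when only one input is on, which is where the search places the critical stimulus.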

**Figure 6** illustrates a numerical experiment making use of this two-stage procedure for the problem of discriminating an additive and multiplicative model of neural responses (**Figure 6B**), where the additive model is assumed to be the true model. After the estimation phase, the BIC does not have a strong preference for either model, only being correct about half the time (**Figure 6C**). However, after presenting 500 stimuli optimized for discriminating the additive and multiplicative model and applying the BIC to all available data, the correct (additive) model is preferred for 24 of 25 Monte Carlo trials (red curve). As a control, presenting additional stimuli optimized for model estimation only improves final model selection moderately (blue curve), while presenting random stimuli does not at all improve model selection performance (green curve). This procedure has now been applied in neurophysiology experiments to generate critical stimuli to distinguish between two competing models of spectral processing by single neurons in the primate inferior colliculus (Tam et al., 2011; Tam, 2012).
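The effect of stimulus choice on the BIC can be reproduced in miniature. The Python sketch below is not the simulation of **Figure 6**: the two models have fixed, hypothetical parameters instead of being refit to the data, the noise level and number of trials are assumed, and the BIC is written in the convention in which lower values indicate the preferred model.

```python
import math
import random

def additive(x):
    """Hypothetical additive model (counted as 2 parameters)."""
    return 1.0 * x[0] + 1.0 * x[1]

def multiplicative(x):
    """Hypothetical multiplicative model (counted as 1 parameter)."""
    return 2.0 * x[0] * x[1]

def gaussian_log_likelihood(model, stimuli, responses, sigma):
    """Log-likelihood of the responses under Gaussian noise with s.d. sigma."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        - (y - model(x)) ** 2 / (2.0 * sigma ** 2)
        for x, y in zip(stimuli, responses)
    )

def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L); lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def bic_difference(stimuli, sigma=0.1, seed=0):
    """BIC(multiplicative) - BIC(additive), data simulated from the additive model."""
    rng = random.Random(seed)
    responses = [additive(x) + rng.gauss(0.0, sigma) for x in stimuli]
    n = len(stimuli)
    bic_add = bic(gaussian_log_likelihood(additive, stimuli, responses, sigma), 2, n)
    bic_mul = bic(gaussian_log_likelihood(multiplicative, stimuli, responses, sigma), 1, n)
    return bic_mul - bic_add

uninformative = bic_difference([(1.0, 1.0)] * 50)   # both models predict 2.0
informative = bic_difference([(1.0, 0.0)] * 50)     # predictions differ by 1.0
print(uninformative, informative)
```

When the stimuli are ones on which the models agree, the likelihood terms cancel and the difference reduces to the complexity penalty, which here favors the simpler (incorrect) model. Stimuli on which the models disagree produce a large positive difference in favor of the true additive model.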

## **DISCUSSION**

With increasing computer power, it is becoming practical for neuroscience experiments to utilize adaptive stimulus optimization where stimuli are generated in real-time during the course of the experiment (Benda et al., 2007; Newman et al., 2013). Various experiments have utilized adaptive stimulus optimization in order to break the "curse of dimensionality" and find the optimal stimulus for a sensory neuron in spaces which are too large for factorial exploration (O'Connor et al., 2005; Yamane et al., 2008). However, simply characterizing the optimal stimulus for a sensory neuron provides at best only a partial description of neural coding (Olshausen and Field, 2005). Therefore, in addition to helping to find the optimal stimulus, adaptive stimulus optimization makes it possible to pursue engineering-inspired approaches to sensory neurophysiology which may yield greater functional insights, for instance finding stimulus ensembles maximizing information between stimuli and neural responses (Machens, 2002; Machens et al., 2005) or fitting and comparing multiple non-linear models to neural responses (Lewi et al., 2009; DiMattina and Zhang, 2011). **Table 1** summarizes the various closed-loop stimulus optimization paradigms discussed in this review, and **Figure 7** schematically illustrates the closed-loop experimental approach.

The vast majority of the work to date has applied closed-loop methods to studying scalar firing rate responses measured from single neurons. However, as closed-loop approaches are continuing to develop, and as new techniques like optical imaging (Ohki et al., 2005; Bock et al., 2011) are making it increasingly feasible to observe large numbers of neurons simultaneously, it is of great interest for future investigations to apply these methods to neural populations and to measurements beyond scalar firing rate. Here we briefly discuss some possible directions for future research.

While the notion of the optimal stimulus is well-defined for single neurons, it is not well-defined for neural populations. However, an alternative approach to stimulus optimization for a population of neurons is to find the stimulus at which the population is best at discriminating nearby stimuli, as opposed to the stimulus yielding the highest firing rate response. Indeed, it has been suggested by a number of investigators that high-slope regions of tuning curves, where nearby stimuli are best discriminated, are much more important in sensory coding than tuning curve peaks (Seung and Sompolinsky, 1993; Harper and McAlpine, 2004; Butts and Goldman, 2006; Bonnasse-Gahot and Nadal, 2008). Under reasonable assumptions of independent Poisson responses, the one-dimensional stimulus *x* at which a neural population can best discriminate nearby stimuli *x* + δ*x* is the stimulus which maximizes the Fisher information $I_{\mathrm{F}}(x) = \sum_{i=1}^{N} f_i'(x)^2 / f_i(x)$, where $f_i(x)$ is the tuning curve of the *i*-th neuron (Dayan et al., 2001). It is relatively straightforward to extend this Fisher information formalism to higher dimensional stimulus spaces (Zhang and Sejnowski, 1999; Johnson et al., 2001; Bethge et al., 2002). Local approximation of the Fisher information matrix has been used in previous work aimed at stimulus optimization in a single neuron (Bandyopadhyay et al., 2007b), and this technique could readily generalize to find the stimulus which is best discriminated from nearby stimuli by a population code.
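The distinction between tuning curve peaks and high-slope regions is easy to verify for the Fisher information of independent Poisson neurons. The Python sketch below assumes Gaussian tuning curves with hypothetical centers, width and peak rate, and locates the most discriminable stimulus by grid search.

```python
import math

def tuning_curve(x, center, width=1.0, peak=10.0):
    """Gaussian tuning curve f_i(x) (mean spike count); the shape is assumed."""
    return peak * math.exp(-0.5 * ((x - center) / width) ** 2)

def tuning_slope(x, center, width=1.0, peak=10.0):
    """Analytic derivative f_i'(x) of the Gaussian tuning curve."""
    return -(x - center) / width ** 2 * tuning_curve(x, center, width, peak)

def fisher_information(x, centers):
    """I_F(x) = sum_i f_i'(x)**2 / f_i(x) for independent Poisson neurons."""
    return sum(tuning_slope(x, c) ** 2 / tuning_curve(x, c) for c in centers)

xs = [-4.0 + 0.01 * k for k in range(801)]   # stimulus grid on [-4, 4]

# Single neuron centered at 0: information vanishes at the peak.
single = [0.0]
best_single = max(xs, key=lambda x: fisher_information(x, single))

# Small population with hypothetical tuning curve centers.
population = [-1.0, 0.0, 1.0]
best_population = max(xs, key=lambda x: fisher_information(x, population))
print(best_single, best_population)
```

For a single Gaussian tuning curve of unit width, the Fisher information is zero at the peak and is largest at a distance of √2 from it, on the flank of the curve. The same function applied to a list of centers gives the population quantity to be maximized.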

Extension of the definition of iso-response surfaces (Gollisch et al., 2002) to multiple neurons is relatively straightforward. In particular, if we can view each neuron as implementing a function *f*(**x**) on the stimulus space, then the region of stimulus space which simultaneously satisfies multiple constraints *f*<sub>1</sub>(**x**) = *c*<sub>1</sub>, ···, *f*<sub>*N*</sub>(**x**) = *c*<sub>*N*</sub> should simply be the (possibly empty) intersection of the regions of stimulus space satisfying each individual constraint. It would be interesting to extend the maximally informative ensemble approach (Machens, 2002) to multiple neurons as well. One potential difficulty is that the number of possible responses which one needs to measure to compute the probability distribution *p*(*y*|**x**) increases exponentially with the number of neurons in the population. Indeed, this exponential increase in the number of symbols with the dimensionality of the response space is a well-known problem with applications of information-theoretic methods in neuroscience (Rieke et al., 1997). It would be desirable to develop more efficient computational techniques for studying neuronal populations in the future (Yarrow et al., 2012).

**Table 1 | Summary of various closed-loop stimulus optimization approaches utilized in sensory systems neuroscience.**

In addition to considering neural populations, another direction for extending the closed-loop paradigm is to consider neural responses more sophisticated than firing rates, for instance the temporal patterns of neural responses (Optican and Richmond, 1987; Victor and Purpura, 1996), first spike latency (VanRullen et al., 2005; Gollisch and Meister, 2008), or synchronous responses in neural populations (Brette, 2012). Since a temporal pattern is a vector rather than a scalar, one needs to extract a scalar quantity from a temporal pattern in order to define the optimal stimulus. For example, synchrony can be defined as a scalar quantity (Steinmetz et al., 2000) and can in principle be optimized over a stimulus space in the same manner as firing rate. The iso-response paradigm (Gollisch et al., 2002) would generalize quite well to both spike pattern and synchrony measures. In the case of spike patterns, the goal would be to find the equivalence class of all stimuli which could elicit a desired pattern of spiking, and theoretical efforts have demonstrated that it is possible to design stimuli to produce a desired spike pattern (Ahmadian et al., 2011). Similarly, for iso-synchrony curves one could find equivalence classes of stimuli yielding the same degree of synchrony in the population by utilizing algorithms similar to those developed for firing rate.

One of the most powerful applications of the closed-loop paradigm is the ability to move sensory neurophysiology toward a model-based paradigm, where experiments are performed with the goal of identifying and comparing multiple competing non-linear models (Paninski, 2005; Lewi et al., 2009; DiMattina and Zhang, 2011; Tam et al., 2011). One advantage of model identification is that successful identification gives the experimenter a variety of biologically important information about the neuron or neuronal population "for free." That is, once one has determined an accurate model for a sensory neuron, the optimal stimulus for maximizing firing rate, the iso-response surfaces, or the stimulus ensemble maximizing information transmission can be predicted from this model, and these predictions can be tested experimentally. However, the model-based approach is not without its difficulties, as many sensory neurons are poorly described by tractable linear and quadratic models and may be better described by more complex models like basis functions and neural networks. Recent work has demonstrated that in principle, adaptive stimulus optimization methods long utilized in machine learning and psychophysics can be applied in sensory neurophysiology for purposes of model estimation and comparison (Paninski, 2005; Lewi et al., 2009; DiMattina and Zhang, 2011). In particular, our recent study has presented a practical two-stage experimental method for generating stimuli which are optimal for estimating the parameters of multiple non-linear models and then generating stimuli on-line in order to critically compare the predictions of different models (DiMattina and Zhang, 2011). This method is presently being applied in ongoing auditory neurophysiology studies (Tam et al., 2011; Dekel, 2012; Tam, 2012), and may be applicable to a broad variety of sensory systems.

#### **ACKNOWLEDGMENTS**

Supported by grant NSF IIS-0827695. Thanks to Sarah Osmer DiMattina for her assistance with graphic design.

#### **REFERENCES**

Bandyopadhyay, S., Young, E. D., and Reiss, L. A. J. (2007b). "Spectral edges as optimal stimuli for the dorsal cochlear nucleus," in *Hearing – From Basic Research to Applications*, eds B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Upperkamp, et al. (New York: Springer-Verlag), 39–45.


determining visual receptive fields. *Vis. Res.* 14, 1475–1482.


response strategies, selective attention, and stimulus generalization. *J. Exp. Psychol. Learn. Mem. Cogn.* 28, 924–940.


Sharpee, T., Rust, N., and Bialek, W. (2004). Analyzing neural responses to natural signals: maximally informative dimensions. *Neural Comput.* 16, 223–250.

Shepherd, G. (2003). *The Synaptic Organization of the Brain*, 5th Edn. New York: Oxford University Press.

Simoncelli, E. (2003). Vision and the statistics of the visual environment. *Curr. Opin. Neurobiol.* 13, 144–149.


cat and monkey visual cortex. *Vis. Res.* 23, 775–85.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 August 2012; accepted: 08 May 2013; published online: 06 June 2013.*

*Citation: DiMattina C and Zhang K (2013) Adaptive stimulus optimization for sensory systems neuroscience. Front. Neural Circuits 7:101. doi: 10.3389/fncir.2013.00101*

*Copyright © 2013 DiMattina and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## A Hebbian learning rule gives rise to mirror neurons and links them to control theoretic inverse models

## *A. Hanuschkin<sup>1,2</sup>, S. Ganguli<sup>3</sup> and R. H. R. Hahnloser<sup>1,2</sup>\**

*<sup>1</sup> Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland*

*<sup>3</sup> Department of Applied Physics, Stanford University, Stanford, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Yang Dan, University of California, USA*

*Klaus R. Pawelzik, University Bremen, Germany*

*\*Correspondence: R. H. R. Hahnloser, Institute of Neuroinformatics, University of Zurich and ETH Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland e-mail: rich@ini.phys.ethz.ch*

Mirror neurons are neurons whose responses to the observation of a motor act resemble responses measured during production of that act. Computationally, mirror neurons have been viewed as evidence for the existence of internal inverse models. Such models, rooted within control theory, map desired sensory targets onto the motor commands required to generate those targets. To jointly explore both the formation of mirrored responses and their functional contribution to inverse models, we develop a correlation-based theory of interactions between a sensory and a motor area. We show that a simple eligibility-weighted Hebbian learning rule, operating within a sensorimotor loop during motor explorations and stabilized by heterosynaptic competition, naturally gives rise to mirror neurons as well as control theoretic inverse models encoded in the synaptic weights from sensory to motor neurons. Crucially, we find that the correlational structure or stereotypy of the neural code underlying motor explorations determines the nature of the learned inverse model: random motor codes lead to causal inverses that map sensory activity patterns to their motor causes; such inverses are maximally useful, by allowing the imitation of arbitrary sensory target sequences. By contrast, stereotyped motor codes lead to less useful predictive inverses that map sensory activity to future motor actions. Our theory generalizes previous work on inverse models by showing that such models can be learned in a simple Hebbian framework without the need for error signals or backpropagation, and it makes new conceptual connections between the causal nature of inverse models, the statistical structure of motor variability, and the time-lag between sensory and motor responses of mirror neurons. Applied to bird song learning, our theory can account for puzzling aspects of the song system, including the necessity of sensorimotor gating and the selectivity of auditory responses to bird's own song (BOS) stimuli.

**Keywords: mirror neurons, inverse problem, linear models, songbird, sensory motor learning**

## **INTRODUCTION**

Complex vertebrate motor behaviors are generated by dedicated cortical circuits. The organization of these circuits and the plasticity rules that lead to their development and that guarantee their maintenance are functionally related to neural activity in single units and across larger populations (Gallese et al., 1996; Rizzolatti et al., 1996; Rizzolatti and Craighero, 2004; Harvey et al., 2012). For example, neural activity often strongly co-varies with motor behavior, allowing for estimation of detailed limb movement parameters from mere single-neuron recordings (Georgopoulos et al., 1986; Schwartz et al., 1988) and facilitating neural prosthesis (Santhanam et al., 2006; Ethier et al., 2012). However, in other cases, the amount of firing variability in single neurons can be dramatically dissociated from behavioral variability. For example, in songbirds, two distinct premotor areas are responsible for the generation of different aspects of the same vocal behavior. On the one hand, the cortical area HVC is involved in generating stereotyped adult song; lesions of HVC lead to degradation of typical adult song toward more unstructured subsong typical of very young birds (Nottebohm et al., 1976; Aronov et al., 2008). On the other hand, its counterpart, the lateral magnocellular nucleus of the anterior nidopallium (LMAN), is involved in subsong production in very young birds, and in adults it is involved in the production of very subtle song variability that is barely noticeable to the human ear (Aronov et al., 2008). Lesions of LMAN in juveniles abolish song learning (Bottjer et al., 1984), and lesions in adults reduce the already small variability of adult undirected songs (the songs not directed toward another bird), manifested, for example, by reduced fluctuations of sound pitch (Kao et al., 2005; Stepanek and Doupe, 2010).

These lesion studies, which ascribe differential roles to HVC and LMAN in song production, are paralleled by findings from electrophysiology. In HVC of singing birds, single principal neurons fire highly stereotyped spiking patterns associated with a given song syllable, with precision of individual action potentials in the sub-millisecond range (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007). By contrast, in LMAN of birds singing undirected songs, neurons fire very variable spike patterns, patterns that fluctuate on a trial-to-trial basis between loosely timed high-frequency bursts of action potentials and no spiking at all (Olveczky et al., 2005; Kao et al., 2008). Thus, stereotyped adult song is subserved by precise firing in HVC whereas subtle variability of adult song is subserved by large firing variability in LMAN (**Figure 1**). The diverse neural codes in LMAN and HVC are integrated in a dedicated nucleus that mediates the differential influences of both these stereotypy and variability generators. Both HVC and LMAN project to the robust nucleus of the arcopallium (RA), which is the cortical output nucleus that directly innervates syringeal and respiratory motor neurons.

*<sup>2</sup> Neuroscience Center Zurich (ZNZ), Zurich, Switzerland*

Whether stereotyped or variable, internal motor patterns responsible for generating behavior cannot be fully understood without considering the sensory input reaching the motor system.

**FIGURE 1 |** [...] stereotyped song motif (in a different bird), three exemplary motif oscillograms are shown on top. In each rendition of the motif the neuron produces a brief burst of spikes at precisely the same time.

Indeed, the very development of motor systems as well as the formation of motor plans are profoundly shaped by sensory inputs. For example, the development of the mirror neuron system depends on sensorimotor experience (Catmur, 2012), and the successful development of birdsong depends on intact HVC and LMAN activity during sensory exposure (Basham et al., 1996; Roberts et al., 2012).

We have learned much about the integration of sensory inputs into motor systems from single neuron studies examining responses during motor production and during matched sensory states. Among the key findings are mirror neurons that fire similarly when an animal executes a motor act and when it sees or hears another animal perform that same act. For example, mirror neurons in F5 of monkey premotor cortex fire both when the monkey touches an object and sees another subject touch that object (Rizzolatti et al., 1996; Rizzolatti and Craighero, 2004). Mirror neurons also exist in HVC of songbirds; these neurons fire at a precise time in the song, both when the bird sings the song and when it hears a similar song produced by another bird (Prather et al., 2008).

Mirror neurons establish a link between the observation of an act in another and self-generation of that same act. Such a remarkable correspondence between sensory and motor roles in single neurons has led to numerous suggestions about the function of mirror neurons in communication, imitation learning, cultural learning, and language development (Rizzolatti and Craighero, 2004; Oztop et al., 2012). Most importantly, mirrored responses have been proposed to be causally related to streams of motor and sensory activity (Oztop et al., 2006, 2012). A recent proposal is to tie properties of the mirror neuron system to correlative learning rules (Cooper et al., 2012). Accordingly, sensory responses in mirror neurons could develop from the contingency of motor-related firing and its sensory consequences feeding back to motor areas. Here we develop this idea and propose a simple mathematical theory of mirror neuron formation from correlational learning rules. To examine the critical role of motor variability, we study, based on earlier work (Hahnloser and Ganguli, 2013), mirror neuron formation for both motor codes with strongly correlated firing patterns among neurons, as in HVC, as well as for motor codes with uncorrelated firing patterns among neurons, as in LMAN.

We are particularly interested in relating mirror neuron properties to their computational role in control theoretic inverse models. Mirror neurons have previously been recognized as direct evidence of inverse models, which are models that transform desired sensory states into motor commands that can achieve those states and may be used for action generation (Oztop et al., 2012). From the control-theoretic perspective, internal inverse models give rise to mirrored responses because of the precise correspondence between a desired sensory target, the motor commands for producing that target, and the resulting sensory feedback. We pursue this idea and elucidate the conditions under which inverse models can arise from correlational learning during sensory feedback-dependent motor explorations.

We assume inverse models form in a context without prior knowledge of the structure of either the motor apparatus or the delayed sensory feedback. We design an eligibility-weighted correlational learning rule that allows for the formation of both inverse models and mirror neurons. In the rule we propose, synaptic strengthening depends on contiguous co-activation of pre- and postsynaptic neurons, whereas synaptic weakening depends on heterosynaptic competition between sensory afferents innervating the same motor neuron. We argue that from a synaptic perspective, this rule is considerably simpler and more plausible than previously proposed rules and computational approaches toward systems-level inverse models based on error backpropagation (Jordan and Rumelhart, 1992). Our rule is most closely related to direct inverse model approaches (Miller, 1987; Slotine, 1987), in which, however, the possibility of unknown feedback delays has not been adequately addressed. Most importantly, we find that whether the formed mirror neuron system and inverse model are suitable for action imitation depends on the correlational structure of the neural code associated with motor production. Whereas a variable (explorative) motor code leads to causal inverse models and is suitable for mirror-neuron dependent action imitation, a stereotyped (repetitive) motor code leads to predictive inverse models and is not suitable for action imitation. Thus, our work provides an interesting link between the correlational structure of motor behavior, its underlying neural code, and fine-grained temporal properties of mirror neuron responses and their suitability for flexible action imitation.

Furthermore, these conceptual connections suggest a set of natural experiments designed to probe for the existence, and characterize the causal nature, of inverse models by measuring the fine-grained temporal properties of the sensory and motor responses of mirror neurons. As we discuss below, when applied to the bird song system, these experiments make a specific, testable prediction about the existence and temporal properties of mirror neurons in the variable motor circuit LMAN, and they explain the origin of previously observed temporal properties of mirror neurons in the stereotyped motor circuit HVC.

## **RESULTS**

#### **A LINEAR FRAMEWORK**

We develop our theory in a simple linear framework in which the sensory response *a(t)* in a sensory brain area at time *t* is a vector of firing rates that is linearly related to the motor cause *m(t* − *τ)* at an earlier time *t* − *τ*, where *m(t* − *τ)* is a vector of firing rates in a motor area such as HVC or LMAN. The *time delay of sensory feedback τ* = *τ<sub>m</sub>* + *τ<sub>a</sub>* is the sum of the time *τ<sub>m</sub>* needed to translate motor activity into behavioral (vocal) output and the time *τ<sub>a</sub>* it takes for a vocalization to elicit a sensory response. We assume a linear *motor-sensory mapping* modeled by the matrix **Q**, allowing us to specify the form of delayed sensory feedback as *a(t)* = **Q***m(t* − *τ)* (**Figure 2**).

Note that for simplicity we assume linearity of the motor-sensory mapping **Q**. However, the simple linearity assumption inherent in **Q** need not be inconsistent with the existence of non-linearities between motor neuron activity and behavioral output (for example, song) and also with non-linearities between behavioral output and sensory responses. While it is the case that each of these transformations is highly non-linear, the dimensionality of motor behavior patterns realizable by muscle activity, or recorded by early sensory responses, is much smaller than the dimensionality of sensory or motor activity patterns deep within the cortex, by virtue of the fact that cortical motor and sensory neurons greatly outnumber the few muscles and sensory receptors involved in the composite motor to sensory feedback loop. For example, within the bird song system, it is probable that the low dimensional, composite non-linear transformation from cortical motor patterns, to muscle activity in the syrinx, to song, to cochlear response, back to cortical sensory feedback, could be well-approximated by a direct high dimensional linear map from the cortical motor area back to the cortical sensory area. This is in exact analogy to the theory of support vector regression approaches from machine learning, in which low dimensional non-linear maps can be well-approximated by high dimensional linear maps (Smola and Schölkopf, 2004). Thus, for our purposes, all we assume is that there exists at least one high dimensional linear map from cortical motor patterns to cortical sensory feedback patterns that approximates the composite feedback pathway implemented through the non-linear processes of motor generation and perception.

**FIGURE 2 | Delayed feedback and inverse model, illustrated by vocal production in birds.** In our model of delayed sensory feedback the auditory response *a(t)* in a sensory area at time *t* depends linearly on motor activity *m(t* − *τ)* in a motor brain area at an earlier time *t* − *τ* according to *a(t)* = **Q***m(t* − *τ)*, where **Q** is the unknown motor-sensory mapping and *τ* the unknown delay of auditory feedback. An inverse **V** is a mapping from sensory neurons back onto motor neurons that inverts the action of **Q**: **V** = **Q**<sup>−1</sup>.

Now, an inverse model in this context is a mapping **V** = **Q**<sup>−1</sup> expressed in the synaptic weights **V** from sensory onto motor neurons. Such a mapping allows sensory neurons to postdict the possible motor cause *m<sup>a</sup>* of a sensory target (vector) *a* (either driven externally, recalled from memory, or resulting from a planning strategy) according to *m<sup>a</sup>* = **V***a*. Such a postdiction ability of inverse models can be used in feedforward motor control in which the appropriate stream of motor commands *m<sup>a</sup>(t)* can be computed for a given desired sensory target sequence *a(t)* according to *m<sup>a</sup>(t)* = **V***a(t)*.
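The relation between delayed feedback and postdiction can be illustrated with a small numerical sketch. The 2 × 2 matrix `Q`, the delay `tau` (in time steps), and the motor sequence below are arbitrary illustrative assumptions, not quantities from our model fits; the helper functions `matvec` and `inverse_2x2` are likewise hypothetical names introduced only for this example.

```python
# Minimal sketch of delayed linear feedback a(t) = Q m(t - tau) and postdiction with V = Q^-1.
# Q, tau and the motor sequence are illustrative assumptions.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(M_ij * v_j for M_ij, v_j in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("Q must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

Q = [[2.0, 1.0], [1.0, 1.0]]   # hypothetical motor-sensory mapping
V = inverse_2x2(Q)             # inverse model, V = Q^-1
tau = 3                        # feedback delay in time steps

# Hypothetical activity m(t) of two motor neurons over 8 time steps.
motor = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 2.0],
         [0.0, 0.0], [2.0, 1.0], [1.0, 1.0], [0.0, 3.0]]

# Delayed sensory feedback, defined only for t >= tau.
sensory = {t: matvec(Q, motor[t - tau]) for t in range(tau, len(motor))}

# Postdiction: m^a(t) = V a(t) recovers the motor cause m(t - tau).
postdicted = {t: matvec(V, a) for t, a in sensory.items()}
```

In this sketch, `postdicted[t]` coincides with `motor[t - tau]` for every time step at which feedback is available, which is the sense in which **V** maps a sensory pattern back onto its motor cause.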

The goals of our theory are to outline a biologically plausible, local mechanism for learning of the synaptic mapping **V** and to characterize the associated emergence of mirror neurons in this process.

#### **ELIGIBILITY-WEIGHTED HEBBIAN LEARNING**

We designed a simple learning rule in which potentiation of sensory-to-motor synaptic connections **V** arises from correlated firing in pairs of sensory and motor neurons. Because sensory feedback is delayed, synapses must be able to detect correlated firing within some non-zero time window, which we achieve by introducing an eligibility trace *e(s)* that establishes a link between activity at time *t* in a motor neuron and activity in a sensory neuron at a later time *t* + *s* (see also **Figures 3A,C**). The eligibility trace modulates the change in synaptic strength associated with correlated pre- and postsynaptic firing; it is a biophysical process that resides on the postsynaptic side of **V** synapses and is triggered by activity (i.e., spikes) in the postsynaptic (motor) neuron. Intuitively, we imagine that the spiking of a motor neuron, elicited for example by an internal source of motor variation that generates exploratory motor behavior, triggers the eligibility trace, which in turn makes all synapses from sensory neurons onto that motor neuron eligible for future modification. Thus, if the delayed sensory feedback arrives at the sensory area within the window of eligibility, sensory-to-motor synapses can potentially learn to postdict the motor cause by correlating the current sensory feedback with past motor activity that might have generated it. We further assume that the eligibility is monotonically decaying in time, implying that sensory inputs preferentially connect onto motor neurons that were recently and reliably activated rather than motor neurons that were activated a long time ago. Necessarily, the decay of the eligibility trace must be slow enough to attribute significant eligibility to sensory inputs with motor-to-sensory delays *τ*, which we subsume in the condition *e(τ)* ≫ 0.
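As an illustration of the condition on *e(τ)*, the following sketch evaluates one possible eligibility trace. The exponential shape, the function name `eligibility`, and the values of the delay and decay time are assumptions chosen for illustration (loosely motivated by the orders of magnitude mentioned in the Discussion); the theory itself only requires a monotonic decay.

```python
import math

def eligibility(s_ms, decay_ms):
    """Exponentially decaying eligibility e(s), normalized so that its integral over s >= 0 is 1."""
    if s_ms < 0:
        return 0.0  # the trace is triggered by the motor spike, so it vanishes before it
    return math.exp(-s_ms / decay_ms) / decay_ms

tau_ms = 50.0      # hypothetical motor-to-sensory delay
decay_ms = 300.0   # hypothetical decay time of the trace

# Fraction of the initial eligibility that is still available when feedback arrives.
relative_eligibility = eligibility(tau_ms, decay_ms) / eligibility(0.0, decay_ms)
```

For an exponential trace this fraction is exp(−*τ*/decay time), so a decay time much longer than *τ* keeps most of the eligibility available at the moment the feedback arrives, whereas a decay time much shorter than *τ* leaves almost none.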

The full correlational learning rule describing changes in synaptic strength V<sub>*ij*</sub> from auditory neuron *j* onto motor neuron *i* reads:

$$\delta \mathrm{V}_{ij} = \int_0^\infty \mathrm{d}s \left[ e(s)\, m_i(t-s)\, a_j(t) \right] - \hat{m}_i(t)\, a_j(t), \qquad (1)$$

where *m̂<sub>i</sub>(t)* = Σ<sub>*k*</sub> V<sub>*ik*</sub>*a<sub>k</sub>(t)* is the (silently) postdicted motor activity, corresponding to the summed auditory input to neuron *i* at time *t*. The subtractive term *m̂<sub>i</sub>a<sub>j</sub>* provides an equal-time heterosynaptic depression (Lynch et al., 1977; Chistiakova and Volgushev, 2009) among all sensory afferent synapses onto a motor neuron. The strength of this depression depends on the amount of presynaptic activity but does not depend directly on postsynaptic activation. The utility of such depression is not only to stabilize activity but also to force synaptic connections towards inverse mappings, as we will see. Note that we assume **V** synapses are "silently" correlating pre- and postsynaptic activity in Equation 1, i.e., **V** synapses do not contribute either to postsynaptic depolarization or to postsynaptic hyperpolarization. In other words, while the inverse model is being learned, the motor activity *m<sub>i</sub>(t)* is entirely driven by some source other than the afferent auditory input. Thus, from the perspective of extracellular physiology, it would appear that sensory feedback arriving at the motor area through the inverse model is gated out of the motor area while that motor area is engaged in internally generated motor explorations.

**FIGURE 3 | Cross-correlation functions for variable and stereotyped motor codes. (A)** In a variable motor code *m(t)* (shaded area), activity bursts *m*<sub>1</sub> (black) and *m*<sub>2</sub> (blue) of width *t*<sub>0</sub> in two example motor neurons occur at diverse time lags relative to each other across renditions of the song motif. Auditory tuning in the shown sensory neuron is such that it responds with *a*<sub>1</sub> to bursts *m*<sub>1</sub> after a time lag *τ*. Repeated co-activation *m*<sub>1</sub> → *a*<sub>1</sub> and non-zero eligibility *e(τ)* (red bar) at time lag *τ* lead to increased synaptic weight V<sub>11</sub> (red arrow) and to a causal inverse. Lack of correlation between *m*<sub>2</sub> and *a*<sub>1</sub>, as well as heterosynaptic competition, prevents V<sub>21</sub> from similarly increasing (blue thin arrow). **(B)** The cross-correlation function *C<sub>ij</sub>(t)* for variable codes is flat except for the auto-correlation peak at zero time lag (motor activity is uncorrelated among neuron pairs). Note: based on square activity pulses in motor neurons in **(A)**, the true cross-correlation shape is triangular (blue dotted line), which we approximate by a square pulse of width *t*<sub>0</sub> ≈ 10 ms. The auto-correlation peak height is *C*<sub>0</sub>. **(C)** In a stereotyped motor code *m(t)* (shaded area), bursts *m*<sub>1</sub> (black) and *m*<sub>2</sub> (blue) occur at a fixed time lag relative to each other across renditions of the song motif (traveling pulse of activity). Repeated co-activation *m*<sub>2</sub> → *a*<sub>1</sub> at higher eligibility (red bar) than the eligibility of *m*<sub>1</sub> → *a*<sub>1</sub> leads to strengthening of synapse V<sub>21</sub> (red arrow) and to a predictive inverse. **(D)** The cross-correlation function *C<sub>ij</sub>(t)* for stereotyped codes also peaks at non-zero time lags.

In the following we examine the outcome of this learning rule in response to various forms of motor codes, with the goal of computing the synaptic weight matrix **V** at a steady state of the learning rule, ⟨d**V**/dt⟩ = 0, where ⟨·⟩ denotes averaging over time (e.g., over different renditions of the song). To simplify the calculations, we assume motor codes with narrow spike-train cross-correlation functions, i.e., the width *t*<sub>0</sub> of spike-train cross-correlation functions is much smaller than the characteristic decay time of the eligibility trace. Although such functions have not been extensively studied due to the difficulty of simultaneously recording from several neurons during singing, narrow cross-correlation is plausible for RA and HVC neurons because pseudo-simultaneous recordings can be constructed from serial recordings thanks to high firing stereotypy in these cells, yielding cross-correlation widths on the order of 10 ms (Leonardo and Fee, 2005). Note that in LMAN, because of high firing variability, similar estimation of cross-correlation width is virtually impossible.

We model motor codes with diverse inherent levels of randomness. We model stereotyped motor codes by assuming that spike-train cross correlations extend over large time lags, in agreement with a traveling pulse of activity (Hahnloser et al., 2002; Harvey et al., 2012). We model variable motor codes by assuming that cross-correlations vanish except in a peak at zero time lag (white noise assumption), **Figure 3B**.

#### **A VARIABLE NEURAL CODE YIELDS CAUSAL INVERSES**

If motor activity is uncorrelated among different neuron pairs, the resulting sensory-to-motor map **V** = *e(τ)t*<sub>0</sub>**Q**<sup>−1</sup> equals the inverse of **Q** weighted by the eligibility at time lag *τ* (Equation A5, for the derivation see Appendix A3). Hence, **V** is a causal inverse that maps sensory representations onto their motor causes (in **Figure 3A**, auditory neurons map onto those motor neurons whose firing correlates most strongly with their own).

For example, during singing the motor cause *m*<sub>1</sub>*(t* − *τ)* (say a neuron that generates a 4 kHz tone) will frequently be followed by auditory response *a*<sub>1</sub>*(t)* (a 4 kHz detector neuron), leading to strengthening of synapse V<sub>11</sub>. By contrast, due to high variability of the motor code, associations between *m*<sub>2</sub>*(t* − *τ)* (say a neuron that generates a 3 kHz tone) and *a*<sub>1</sub>*(t)* are much less frequent (because the bird randomizes the production of 3 and 4 kHz tones). Hence, synapse V<sub>21</sub> from the 4 kHz detector onto the 3 kHz generator will lose to synapse V<sub>11</sub> due to heterosynaptic competition (**Figure 3A**).
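The emergence of a causal inverse under a variable code can be checked in a small discrete-time simulation. The sketch below is a simplified illustration under stated assumptions, not the analysis of Appendix A3: time is discretized so that *t*<sub>0</sub> corresponds to one time step, motor activity is modeled as independent zero-mean ±1 fluctuations, the eligibility trace is exponential and normalized to sum to 1, the matrix `Q` is an arbitrary well-conditioned choice, and the time-averaged form of Equation 1 is iterated rather than the online rule.

```python
import math
import random

random.seed(0)

N, TAU, LAGS, STEPS = 2, 2, 10, 20000
DECAY = 3.0  # decay time of the eligibility trace, in time steps (assumption)

# Hypothetical, well-conditioned motor-sensory mapping Q and its exact inverse.
Q = [[1.0, 0.5], [-0.5, 1.0]]
Q_INV = [[0.8, -0.4], [0.4, 0.8]]

# Discrete eligibility trace e(s), s = 0..LAGS-1, normalized to sum to 1.
raw = [math.exp(-s / DECAY) for s in range(LAGS)]
e = [r / sum(raw) for r in raw]

# Variable motor code: independent +/-1 fluctuations (white across time and neurons).
m = [[random.choice((-1.0, 1.0)) for _ in range(N)] for _ in range(STEPS)]

# Time averages that appear in the averaged learning rule:
#   c_ma[i][j] = < sum_s e(s) m_i(t - s) a_j(t) >,   c_aa[k][j] = < a_k(t) a_j(t) >
c_ma = [[0.0] * N for _ in range(N)]
c_aa = [[0.0] * N for _ in range(N)]
start = max(TAU, LAGS - 1)
count = STEPS - start
for t in range(start, STEPS):
    a = [sum(Q[j][k] * m[t - TAU][k] for k in range(N)) for j in range(N)]  # a(t) = Q m(t - tau)
    trace = [sum(e[s] * m[t - s][i] for s in range(LAGS)) for i in range(N)]
    for i in range(N):
        for j in range(N):
            c_ma[i][j] += trace[i] * a[j] / count
            c_aa[i][j] += a[i] * a[j] / count

# Iterate the time-averaged rule V <- V + eta * (c_ma - V c_aa) to its steady state.
V = [[0.0] * N for _ in range(N)]
eta = 0.5
for _ in range(200):
    V = [[V[i][j] + eta * (c_ma[i][j] - sum(V[i][k] * c_aa[k][j] for k in range(N)))
          for j in range(N)] for i in range(N)]

# Analytical expectation for a variable code (with t0 equal to one time step).
predicted = [[e[TAU] * Q_INV[i][j] for j in range(N)] for i in range(N)]
```

Up to sampling noise, the learned `V` matches `predicted`, i.e., the inverse of `Q` scaled by the eligibility at lag `TAU`. Only the lag-`TAU` term of the eligibility-weighted Hebbian product survives the time average, because motor activity at all other lags is uncorrelated with the current feedback.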

#### **A STEREOTYPED NEURAL CODE YIELDS PREDICTIVE INVERSES**

If the motor code is stereotyped and different motor neuron pairs are correlated at even very large time lags (extending over the full range of the eligibility trace and possibly beyond), then **V** ≈ *e(*0*)t*<sub>0</sub>**H**<sub>*τ*</sub>**Q**<sup>−1</sup> is approximately a concatenation of the inverse of **Q** and a shifter matrix **H**<sub>*τ*</sub> that maps motor activity at one time onto motor activity at a time lag *τ* later, i.e., **V** is a predictive inverse of **Q** (Equation A10). Under a predictive inverse **V**, a sensory neuron maps onto those motor neurons that were most recently active (and reliably follow in activation other motor neurons that give rise to the sensory neuron's response).

For example, during singing, the motor cause *m*<sub>1</sub>*(t* − *τ)* of a 4 kHz tone will frequently occur before the cause *m*<sub>2</sub>*(t)* of a 3 kHz tone (because the bird produces stereotyped downsweep syllables). Hence, the 4 kHz auditory detector response *a*<sub>1</sub>*(t)* will find much higher eligibility in motor neuron 2, leading to strengthening of V<sub>21</sub> at the expense of V<sub>11</sub>, i.e., the 4 kHz detector neuron connects onto the 3 kHz generator neuron (**Figure 3C**).
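The same kind of sketch can be run for a stereotyped code. Here the motor code is idealized as a pulse travelling around a ring of `N` neurons, the motor-sensory mapping is taken to be the identity, and the eligibility trace is exponential and normalized; all of these choices, and all parameter values and function names, are illustrative assumptions rather than part of the derivation of Equation A10.

```python
import math

N, TAU, LAGS = 6, 2, 4   # ring size, feedback delay, number of eligibility lags (assumptions)
DECAY = 2.0              # decay time of the eligibility trace, in time steps (assumption)

raw = [math.exp(-s / DECAY) for s in range(LAGS)]
e = [r / sum(raw) for r in raw]   # normalized, monotonically decaying eligibility trace

def motor(t):
    """Stereotyped code: a travelling pulse, neuron (t mod N) is active at time t."""
    return [1.0 if i == t % N else 0.0 for i in range(N)]

def sensory(t):
    """Delayed feedback a(t) = Q m(t - tau), with Q taken to be the identity."""
    return motor(t - TAU)

# Time averages over exactly one period of the motif (t = N .. 2N-1).
c_ma = [[0.0] * N for _ in range(N)]
c_aa = [[0.0] * N for _ in range(N)]
for t in range(N, 2 * N):
    a = sensory(t)
    trace = [sum(e[s] * motor(t - s)[i] for s in range(LAGS)) for i in range(N)]
    for i in range(N):
        for j in range(N):
            c_ma[i][j] += trace[i] * a[j] / N
            c_aa[i][j] += a[i] * a[j] / N

# Iterate the time-averaged rule V <- V + eta * (c_ma - V c_aa) to its steady state.
V = [[0.0] * N for _ in range(N)]
eta = 3.0
for _ in range(100):
    V = [[V[i][j] + eta * (c_ma[i][j] - sum(V[i][k] * c_aa[k][j] for k in range(N)))
          for j in range(N)] for i in range(N)]

# For each sensory neuron j, the motor neuron that receives its strongest synapse.
strongest_target = [max(range(N), key=lambda i: V[i][j]) for j in range(N)]

def playback_response(t):
    """Motor-area response m^a(t) = V a(t) when the recorded song is played back."""
    a = sensory(t)
    return [sum(V[i][j] * a[j] for j in range(N)) for i in range(N)]
```

In this idealized setting, sensory neuron *j* connects most strongly onto motor neuron (*j* + `TAU`) mod `N`, the neuron that is active at the moment the feedback arrives, which is the shifter structure of a predictive inverse. Weaker synapses also form onto motor neurons that were active earlier within the eligibility window. During playback, the most strongly driven motor neuron at time *t* is the one that was active at the same song time during singing.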

#### **LACK OF RESPONSE TO PERTURBED AUDITORY FEEDBACK AND SELECTIVITY FOR THE BOS**

During Hebbian learning of **V** in Equation 1 we required that synapses **V** are not able to drive spike responses in motor neurons during singing (**V** synapses learn silently). The main intuitive reason for the necessity of silent learning is that the learning goal of the inverse model synapses is to silently correlate the motor and sensory streams, without the perturbation of the motor stream that would result if sensory feedback were to pass through and drive spikes in the motor area. If the inverse model synapses allowed sensory feedback to significantly drive motor spikes, then the incoming sensory signals would drive motor activity, resulting in cyclic motor output with cycle time approximately equal to *τ*, i.e., birds would unavoidably produce repetitive motor output (stuttering).

Interestingly, there is much evidence for the gating out of sensory information in song motor nuclei. Principal motor neurons in LMAN and HVC do not respond to playback of white noise stimuli during singing (Leonardo, 2004; Kozhevnikov and Fee, 2007) and during states of high arousal (Cardin and Schmidt, 2003), though there are reports of distorted feedback responses in HVC interneurons in Bengalese finches (Sakata and Brainard, 2008). Lack of feedback sensitivity in principal motor neurons is usually ascribed to a form of gating caused by specific thalamic or neuromodulatory mechanisms (Dave et al., 1998; Schmidt and Konishi, 1998; Shea and Margoliash, 2003; Cardin and Schmidt, 2004; Coleman et al., 2007; Hahnloser et al., 2008), see also the Discussion.

By contrast, LMAN (Doupe and Konishi, 1991; Doupe, 1997; Solis and Doupe, 1999; Roy and Mooney, 2007) and HVC neurons (Katz and Gurney, 1981; Margoliash, 1983, 1986; Williams and Nottebohm, 1985) respond to auditory stimulation while birds are anesthetized or asleep, which we model as gating on of **V** synapses, i.e., we assume that auditory responses in motor neurons are driven via the learned inverse models.

The puzzling observation of the gating out of sensory inputs to motor areas during motor exploration is naturally accounted for in our theory by the necessity of correlating the current presynaptic sensory stream with past postsynaptic motor streams in order to learn an unbiased inverse model (of the unperturbed motor stream).

Also, interestingly, in both HVC and LMAN sensory responses are strongest for the bird's own song (BOS) stimuli compared to other stimuli including the tutor song or the BOS played back in reverse time (McCasland and Konishi, 1981; Margoliash, 1986; Lewicki, 1996; Solis and Doupe, 1999). Such selectivity follows naturally from our model assumptions: for both stereotyped and variable motor codes, the mappings, whether causal or predictive, can only invert sensory responses that lie in the image of **Q** and cannot invert the full space of responses orthogonal to the image of **Q**. Such a restriction arises because only sensations that could arise through combinations of previously experienced sensory feedback during singing can actually be inverted into appropriate motor commands. In other words, the inverse model synapses map prior sensory feedback generated by the bird's own previous song into appropriate motor commands, but necessarily fail to map sensory activity patterns that are very different from the BOS into coherent motor patterns. Thus, assuming HVC and LMAN can be thought of as downstream of the output of an inverse model, our Hebbian learning rule generating inverse models can naturally account for the preference of sensory responses in HVC and LMAN for BOS; sounds very different from BOS are not appropriately inverted, and therefore presumably do not lead to coherent activation of motor patterns via sensory inputs propagating through the inverse model synapses.

#### **INVERSE MODELS AND SENSORIMOTOR MIRRORING**

The Hebbian learning rule in Section Eligibility-Weighted Hebbian Learning determines the wiring of sensory afferents into motor areas based on sensorimotor experience. How could one experimentally test for the existence of such wiring without painstaking, detailed inspection of anatomical connections and characterization of the sensorimotor mapping **Q**? Here we outline the design of experiments to probe for the existence of either causal or predictive inverse models. We propose to record from single neurons both in sensory and motor states and to compare motor activity and sensory-evoked responses using cross-correlation functions: as we will show, the time lag of peak cross correlation provides evidence for either predictive or causal inverses.

In such mirroring experiments that we propose, a single neuron is first recorded during singing and then during playback of the just recorded songs while the bird is asleep in the dark (during which the auditory gate is open and motor neurons become responsive to auditory stimuli, presumably through an inverse model from an upstream sensory area). In our model, sensory responses *m<sup>a</sup><sub>i</sub>(t)* = *m̂<sub>i</sub>(t)* = Σ<sub>*k*</sub> V<sub>*ik*</sub>*a<sub>k</sub>(t)* during playback are driven via synaptic weights **V** (assumed to be at a steady state of Equation 1, ⟨d**V**/dt⟩ = 0). Computing the cross-correlation functions Corr*(s)* of the sensory response *m<sup>a</sup><sub>i</sub>(t)* with motor activity *m<sub>i</sub>(t)* (as a function of time lag *s*) yields that (see **Figure 4**):


For derivations and model assumptions see Appendices A2–A4. In particular, here we assumed no synaptic delay between auditory and motor neuron, though this assumption can be relaxed. In summary, for both stereotyped and variable motor codes, sensory responses mirror motor activity. The amount of randomness in the motor code dictates the time lag of peak cross-correlation between motor activity and sensory-evoked responses, which we refer to as the *mirroring offset*. The mirroring offset thus serves as an important experimental observable that provides a window into fundamental differences in the types of inverse models that are computed by Hebbian learning, **Figure 4**.
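How the two mirroring offsets arise can be sketched directly from the steady-state forms of **V** quoted in the two preceding sections. The following lines are a simplified illustration under stated assumptions, not the derivation given in the Appendices: the sign convention for the lag *s*, the notation *C<sub>ii</sub>* for the auto-correlation function of motor neuron *i* (peaked at zero lag), and the neglect of synaptic delays are choices made here for illustration.

$$
\begin{aligned}
&\text{Playback:} && a(t) = \mathbf{Q}\,m(t-\tau), \qquad m^{a}(t) = \mathbf{V}\,a(t) \\
&\text{Variable code:} && \mathbf{V} = e(\tau)\,t_0\,\mathbf{Q}^{-1} \;\Rightarrow\; m^{a}(t) = e(\tau)\,t_0\,m(t-\tau) \\
&\text{Stereotyped code:} && \mathbf{V} \approx e(0)\,t_0\,\mathbf{H}_{\tau}\mathbf{Q}^{-1} \;\Rightarrow\; m^{a}(t) \approx e(0)\,t_0\,m(t) \\
&\text{Cross-correlation:} && \mathrm{Corr}(s) = \langle m_i(t)\,m^{a}_i(t+s)\rangle \propto
\begin{cases}
e(\tau)\,C_{ii}(s-\tau), & \text{variable code (peak at } s=\tau\text{)} \\
e(0)\,C_{ii}(s), & \text{stereotyped code (peak at } s=0\text{)}
\end{cases}
\end{aligned}
$$

Under these assumptions, a causal inverse produces a mirroring offset equal to the feedback delay *τ*, a predictive inverse produces a mirroring offset near zero, and the two peak heights differ by the factor *e(τ)*/*e(*0*)*.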

Note that variable motor codes are associated with weaker mirroring than stereotyped codes, i.e., the cross-correlation functions for variable codes exhibit lower peak amplitudes than cross-correlation functions associated with stereotyped codes: in our model, the ratio of peak cross correlation is given by the eligibility at time lag *τ* divided by the eligibility at time lag zero (Equation A12 derived in the Appendices A3, A4). Thus, the steeper the eligibility trace, the weaker the mirrored response in case of variable motor codes. By contrast, the shape of the eligibility trace is expected to have almost no influence in case of stereotyped codes.

**FIGURE 4 |** […] *m<sup>a</sup>*<sub>1</sub> in case of a causal inverse (middle column) and to motor neuron response *m<sup>a</sup>*<sub>2</sub> in case of a predictive inverse (right column) after an additional time lag *τ<sub>s</sub>* (spike propagation time from auditory to motor area) that is assumed to be 0 […]

Note that the auditory response *m<sup>a</sup><sub>i</sub>(t)* = Σ<sub>*k*</sub> V<sub>*ik*</sub>*a<sub>k</sub>(t)* in a motor neuron to song playback is mathematically identical to the (silently) postdicted motor activity *m̂<sub>i</sub>(t)* = Σ<sub>*k*</sub> V<sub>*ik*</sub>*a<sub>k</sub>(t)* defined after Equation 1 and used in learning the inverse model. Nevertheless, we use different symbols for these quantities to disentangle their meaning: the former is a superthreshold sensory response elicited in a quiet non-singing state of the bird, whereas the latter is a subthreshold subtractive term that stabilizes synaptic learning during singing. The biophysical underpinnings of these two terms might largely be identical, with the silent nature of the postdicted activity arising from some form of response gating (see also the Discussion).

#### **GRADIENT DESCENT**

We note that the learning rule in Equation 1 corresponds to gradient descent on the following error function:

$$\mathcal{E}(t) = \frac{1}{2} \sum_{i} \int_{0}^{\infty} \left[ m_{i}(t - s) - \sum_{k} \mathrm{V}_{ik}\, a_{k}(t) \right]^{2} e(s)\, \mathrm{d}s \tag{2}$$

For a derivation, see Appendix A1. Thus, synaptic weights **V** converge such as to yield optimal postdiction *m̂<sub>i</sub>(t)* = Σ<sub>*k*</sub> V<sub>*ik*</sub>*a<sub>k</sub>(t)* of motor activity from sensory feedback. The fact that our eligibility-weighted Hebbian learning rule with heterosynaptic competition originates from gradient descent on an error function confers a degree of robustness to the learning, and suggests generalizations to situations in which the synaptic transformation from sensory to motor areas is non-linear.
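The correspondence between Equation 1 and the gradient of Equation 2 can also be checked numerically. The sketch below uses discretized lags, an eligibility trace normalized to sum to 1 (without this normalization the subtractive term of Equation 1 would carry an extra constant factor), and arbitrary illustrative values for all activities and weights; the function names are hypothetical.

```python
# Finite-difference check that Equation 1 equals minus the gradient of Equation 2.
# All numbers are arbitrary illustrative values (assumptions of this sketch).

e = [0.5, 0.3, 0.2]                    # eligibility e(s) at lags s = 0, 1, 2 (sums to 1)
m_hist = [[1.0, 0.4, -0.2],            # m_0(t - s) for s = 0, 1, 2
          [0.3, -1.0, 0.8]]            # m_1(t - s) for s = 0, 1, 2
a = [0.7, -0.5]                        # sensory activity a(t)
V = [[0.2, -0.1], [0.4, 0.3]]          # current sensory-to-motor weights

def error(W):
    """Equation 2: eligibility-weighted squared postdiction error."""
    total = 0.0
    for i, row in enumerate(W):
        m_hat = sum(w * a_k for w, a_k in zip(row, a))
        total += 0.5 * sum(e_s * (m_is - m_hat) ** 2 for e_s, m_is in zip(e, m_hist[i]))
    return total

def delta_v(W, i, j):
    """Equation 1: eligibility-weighted Hebbian term minus heterosynaptic depression."""
    m_hat = sum(w * a_k for w, a_k in zip(W[i], a))
    hebb = sum(e_s * m_is * a[j] for e_s, m_is in zip(e, m_hist[i]))
    return hebb - m_hat * a[j]

def numerical_gradient(W, i, j, h=1e-6):
    """Central finite difference of the error with respect to W[i][j]."""
    up = [row[:] for row in W]
    down = [row[:] for row in W]
    up[i][j] += h
    down[i][j] -= h
    return (error(up) - error(down)) / (2 * h)

# Largest deviation between the weight change and the negative gradient.
max_mismatch = max(abs(delta_v(V, i, j) + numerical_gradient(V, i, j))
                   for i in range(2) for j in range(2))
```

Because the error is quadratic in the weights, the central difference agrees with the analytical gradient up to floating-point rounding, so `max_mismatch` is negligibly small.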

**FIGURE 4 |** […] *m<sup>a</sup>*<sub>2</sub> is selective to the sound feature that during singing was generated by the much earlier burst *m*<sub>1</sub> in a different neuron (black burst in **(A)**, right panel), but not the feature generated by *m*<sub>2</sub> (blue burst in **(A)**, right panel).

#### **PROBABILISTIC MODELS**

More realistic neuron models are non-linear and contain spikes that are potentially probabilistic and certainly binary events. Also, more realistically, we may want to explicitly model intrinsic noise in motor and sensory-related responses rather than deal with motor variability only through their effects on cross correlations. As a first step to dealing with such realism, we have derived two probabilistic neuron models in which inverse models and mirroring can be studied in similar manners as in the linear model, outlined in the following.

In one of these models we calculate the influence of probabilistic (binary) responses on the strength of mirroring. We consider a random motor area that at any time can only be in one of two possible states *M* = 1 and *M* = 0 with prior probability *p(M* = 1*)* = 1/2. Assume analogously that the sensory area is such that a particular sensory feature is either detected (*S* = 1) or not detected (*S* = 0). We then model the relationship between motor activity and sensory consequence in terms of conditional dependencies between these two random variables. We assess the strength of mirroring in this model in terms of the cross-correlation coefficient between the two random variables (as derived in Appendix A5) and find the following result:


Thus, the simple probabilistic model shows that the strength of mirroring may also be strongly reduced by the amount of intrinsic noise present in sensory and motor systems.
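The qualitative effect of intrinsic noise on the correlation between two binary variables can be illustrated as follows. This is a generic sketch and not the specific result of Appendix A5: the parametrization by the two conditional probabilities *p(S* = 1 | *M* = 1*)* and *p(S* = 1 | *M* = 0*)*, as well as the function and argument names, are assumptions introduced here.

```python
import math

def mirroring_correlation(p_s_given_m1, p_s_given_m0, p_m1=0.5):
    """Correlation coefficient between binary motor state M and sensory detection S."""
    p_s1 = p_m1 * p_s_given_m1 + (1.0 - p_m1) * p_s_given_m0   # P(S = 1)
    covariance = p_m1 * p_s_given_m1 - p_m1 * p_s1             # E[MS] - E[M]E[S]
    var_m = p_m1 * (1.0 - p_m1)
    var_s = p_s1 * (1.0 - p_s1)
    if var_m == 0.0 or var_s == 0.0:
        raise ValueError("correlation is undefined when M or S is constant")
    return covariance / math.sqrt(var_m * var_s)

noiseless = mirroring_correlation(1.0, 0.0)   # S is a perfect copy of M
noisy = mirroring_correlation(0.8, 0.2)       # 20% missed and 20% false detections
```

With a noiseless sensory consequence the correlation coefficient is 1, whereas 20% missed detections together with 20% false detections reduce it to 0.6 in this parametrization.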

## **DISCUSSION**

We have presented a simple model for the development of mirror neuron systems that is mathematically tractable, allowing us to relate mirror neuron properties such as the correlative strengths and the time lag of peak mirrored responses to the stereotypy (the correlation structure) of motor-related firing. Mirroring properties depend on the variability of the neural motor code which may be dissociated from apparent variability of the motor behavior as is the case in LMAN neurons that fire highly variable spike patterns despite high song stereotypy in adults. Our conclusions are valid for arbitrary sensory systems, provided they are able to signal sensory feedback from motor actions with sufficient sensitivity matched to the behavioral richness generated by the motor system (and of course provided that sensory afferents are subject to correlative Hebbian learning). In our derivation we have assumed that cross-correlation functions among motor neuron pairs are narrow, which was a simplifying assumption that allowed us to derive simple analytical forms of the sensory-to-motor mapping **V** and of mirroring properties. Approximate inverses should also result for motor codes with more complex time dependence, because by construction, the learning rule we considered corresponds to a gradient-descent rule that achieves minimal inversion error.

Although inverse models are attractive as models for vocal learning (Guenther et al., 2006; Hahnloser and Ganguli, 2013), they have previously been judged to be inappropriate for vocal learning in songbirds, mainly for two reasons: (1) young birds require many song repetitions with auditory feedback (Doya and Sejnowski, 2000), and (2) the learning schemes proposed either used a biologically implausible algorithm (Jordan and Rumelhart, 1992) or assumed the preexistence of an approximate inverse model (Kawato, 1990). Here we have suggested a resolution to both of these issues and have shown that, contrary to previous beliefs, inverse models constitute a potentially plausible framework for vocal learning in birds, too: the many song explorations used by young birds could be required to actually learn the high dimensional inverse model, and the correlational learning we proposed is quite plausible and simple (but non-trivial nevertheless). This suggests potentially opening up the hypothesis space for learning rules operating within cortico-basal ganglia circuits, in both mammalian and bird song systems, to include models spanning the range from pure reinforcement learning (RL) to pure inverse model learning. Of particular interest would be intermediate learning rules that synergistically incorporate both dopamine-dependent plasticity thought to underlie RL and Hebbian-based plasticity shown here to mediate inverse model learning, in order to implement sophisticated model-based RL strategies. For example, a simple proposal would be that dopamine delivered to striatal synapses from the ventral tegmental area (VTA) might not be released purely nonspecifically, but instead might be delivered by an inverse model that can partially map errors in sensory coordinates to errors in motor coordinates, thereby guiding learning in ways more sophisticated than pure RL (O'Reilly and Frank, 2006).

The key to learning causal inverse models is motor variability. In motor areas such as HVC that fire stereotyped patterns, auditory afferents cannot disentangle cause-and-effect, leading to preferential formation of predictive inverses rather than causal ones. Predictive inverses have limited usefulness for action imitation from action observation, because under a predictive inverse, observation of a particular motor gesture will lead to imitation of the subsequent gesture in the imitator's motor repertoire, which may not be part of the actions to be imitated. For example, if a bird repeatedly sings *ABCD* during formation of the inverse and wants to later imitate repetitions of *ABDB*, then its predictive inverse will constrain it to produce repetitions of *BCDA* because perception of *A* maps to production of *B*, perception of *B* maps to production of *C*, etc.

Small temporal delays between motor activity and activity evoked by playback of BOS or BOS-resembling sounds have been reported previously. Prather et al. (2008) showed that there are small mirroring offsets of just a few milliseconds in HVC<sub>X</sub> neurons of awake swamp sparrows and reported similar (not quantified) results in Bengalese finches. Furthermore, Dave and Margoliash (2000) also observed a small time lag of auditory-evoked activity in RA neurons of sleeping zebra finches. Both these experimental findings reflect a predictive inverse. While predictive inverses have limited usefulness for action imitation, they might provide stability in sequential vocalization. Indeed, Sakata and Brainard report that perturbation of auditory feedback can change song syntax in Bengalese finches (Sakata and Brainard, 2006, 2008; Hanuschkin et al., 2011). By contrast, a causal inverse revealing itself by a large mirroring offset is maximally useful for song imitation. Indeed, preliminary results indicated large non-zero mirroring offsets in LMAN (Giret et al., 2012).

An important element of our theory is the eligibility trace. To endow Hebbian learning with such a trace is necessary in realistic situations in which effects (sensory feedback) follow their cause (motor command) with some non-zero time lag arising from signal propagation delays, from twitch times of muscles, and from sensory and synaptic receptor latencies. In humans such a lag could span up to several hundreds of milliseconds, whereas in birds it may be as short as several tens of milliseconds. Eligibility traces also appear in RL theories (Seung, 2003; Fiete et al., 2007) and seem to be a general prerequisite for learning in the context of delayed feedback or delayed reward. We can imagine that neurons and synapses may hold decaying eligibility traces in terms of dedicated molecules such as calcium. Action potential generation is associated with rapid calcium entry that decays over the time course from several hundreds of milliseconds to a few seconds (McGeown et al., 1996; Wallace et al., 2008). The monotonic decay of intracellular calcium is well-suited to modeling a monotonically decaying eligibility trace. However, a monotonic decay of eligibility harbors both advantages and disadvantages. The disadvantage, as discussed, is the problem associated with stereotyped motor generators that can only hold predictive inverse models; to make inverse models causal, motor variability is required. Another way to guarantee causal inverse models—even under stereotyped motor explorations—would be to consider eligibility traces that do not monotonically decay but that peak at precisely the time delay inherent in closed sensorimotor feedback loops. 
The main caveat of such eligibility traces is that it may be questionable whether different muscles recruited for the same behavior must necessarily be associated with the same sensorimotor delay—and it is presently unclear how such variable delays could be matched to variable eligibility traces across synapses in a way that would ensure the learning of a causal inverse model. Moreover, phenomena such as speech co-articulation make it unlikely that there exists a constant sensorimotor delay across a large range of premotor neurons. The advantage, on the other hand, of a decaying eligibility trace is that sensorimotor contingencies and inverses can be learned regardless of sensorimotor latencies, providing robustness of sensorimotor learning.

Convergence of the sensory to motor synaptic weights toward inverses depends on details of the heterosynaptic competitive term. Heterosynaptic competitive terms have a certain appeal because of the useful normalization they provide (Fiete et al., 2010). In the context of this work, such terms imply locally available information at a single synapse about sensory inputs to other synapses. Though this information need not be provided instantaneously, we can only speculate about possible mechanisms for sharing such information among different synapses onto the same postsynaptic neuron. One possibility is that some form of intracellular signaling conveys this information. Another possibility to be explored is whether there exists an entire class of such competitive terms with a similar effect. For example, provided that motor and sensory codes are sufficiently sparse, it is conceivable that very simple subtractive terms might suffice for inverse formation. Whether other (even simpler) competitive terms result in approximate inverses needs to be further explored. We would like to point out preliminary evidence that inverses can be learned with Hebbian rules that include no heterosynaptic competitive terms (Senn and Pawelzik, pers. communication).

Our Hebbian learning theory has been analyzed so far in linear circuits, but we have indicated ways to overcome linearity by pinpointing extensions of our work to include nonlinear mappings and probabilistic neuron models. Further work will be required to test whether our correlative learning approach is suitable also for inverse model learning employing detailed biophysical models of the avian syrinx.

## **ACKNOWLEDGMENTS**

We acknowledge support by the European Research Council (ERC-Advanced Grant 268911) and the Swiss National Science Foundation (Grant 31003A\_127024), and support from the Swartz, Sloan, and Burroughs-Wellcome foundations, and Defense Advanced Research Projects Agency (DARPA). R. H. R. Hahnloser thanks Walter Senn for helpful discussions on inverse models and S. Ganguli thanks Michael Brainard and Kris Bouchard for useful discussions on birdsong learning, and the transition from causal to predictive inverse models.

#### **REFERENCES**


*Science* 290, 812–816. doi: 10.1126/science.290.5492.812


S. Gazzaniga (Cambridge, London: MIT Press), 469–484.


syntax generation in the Bengalese finch. *J. Comput. Neurosci.* 31, 509–532. doi: 10.1007/s10827-011-0318-z


white-crowned sparrow. *J. Neurosci.* 3, 1039–1057.


Mooney, R. (2012). Motor circuits are required to encode a sensory model for imitative learning. *Nat. Neurosci.* 15, 1454–1459. doi: 10.1038/nn.3206


*J. Neurophysiol.* 104, 2474–2486. doi: 10.1152/jn.00977.2009

Wallace, D. J., Meyer, S., Astori, S., Yang, Y., Bausen, M., Palmer, A. E., et al. (2008). Single-spike detection *in vitro* and *in vivo* with a genetic Ca<sup>2+</sup> sensor. *Nat. Methods* 5, 797–804. doi: 10.1038/nmeth.1242

Williams, H., and Nottebohm, F. (1985). Auditory responses in avian vocal motor neurons: a motor theory for song perception in birds. *Science* 229, 279–282.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; accepted: 15 May 2013; published online: 19 June 2013.*

*Citation: Hanuschkin A, Ganguli S and Hahnloser RHR (2013) A Hebbian learning rule gives rise to mirror neurons and links them to control theoretic inverse models. Front. Neural Circuits 7:106. doi: 10.3389/fncir.2013.00106*

*Copyright © 2013 Hanuschkin, Ganguli and Hahnloser. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDICES**

## **A1. GRADIENT DESCENT DERIVATION OF ELIGIBILITY-WEIGHTED HEBBIAN LEARNING**

We can derive the learning rule in Equation 1 by gradient (steepest) descent on an error function E. The differential change δ**V** in synaptic weight is proportional to the gradient and we can write:

$$
\delta \mathbf{V} = -\frac{\mathrm{dE}}{\mathrm{d} \mathbf{V}}.\tag{A1}
$$

We define the error function E<sub>*i*</sub>(*t*) for neuron *i* as the squared difference between motor activity *m<sub>i</sub>*(*t* − *s*) and postdicted motor activity *m̂<sub>i</sub>*(*t*) = Σ<sub>*k*</sub> V<sub>*ik*</sub> *a<sub>k</sub>*(*t*), weighted by the eligibility associated with the time lag *s*:

$$\mathcal{E}\_i(t) = \frac{1}{2} \int\_0^\infty \mathrm{d}s \left[ m\_i(t-s) - \sum\_k \mathcal{V}\_{ik} \, a\_k(t) \right]^2 \, e(s)$$

The total error is simply the sum of errors over all neurons, E = Σ<sub>*i*</sub> E<sub>*i*</sub>.

Taking the gradient with respect to the (*i*, *j*)th weight only, we find

$$\begin{aligned} \frac{\mathrm{d}\mathcal{E}\_i}{\mathrm{d}\mathcal{V}\_{ij}} &= -\int\_0^\infty \mathrm{d}s \left[ m\_i(t-s) - \sum\_k \mathcal{V}\_{ik} \, a\_k(t) \right] e(s) \, a\_j(t) \\ &= -\int\_0^\infty \mathrm{d}s \left[ e(s) \, m\_i(t-s) \, a\_j(t) - \sum\_k \mathcal{V}\_{ik} \, a\_k(t) \, a\_j(t) \, e(s) \right]. \end{aligned}$$

Assuming a normalized eligibility trace, ∫<sub>0</sub><sup>∞</sup> *e*(*s*) d*s* = 1, this becomes

$$\frac{\mathrm{d}\mathcal{E}\_i}{\mathrm{d}\mathcal{V}\_{ij}} = -\int\_0^\infty \mathrm{d}s \, e(s) \, m\_i(t-s) \, a\_j(t) + \sum\_k \mathcal{V}\_{ik} \, a\_k(t) \, a\_j(t)$$

$$\Rightarrow \delta \mathcal{V}\_{ij} = \int\_0^\infty \mathrm{d}s \, e(s) \, m\_i(t-s) \, a\_j(t) - \sum\_k \mathcal{V}\_{ik} \, a\_k(t) \, a\_j(t). \tag{A2}$$
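The relation between Equation A2 and the error function can be checked numerically. The following is a minimal sketch, assuming a discrete grid of time lags (so the integral over *s* becomes a sum) and randomly drawn values for the weights, activities, and eligibility trace; the function names `error` and `hebbian_update` are hypothetical and introduced only for this illustration. It compares the update of Equation A2 with a finite-difference estimate of the negative gradient.

```python
import numpy as np

def error(V, m_hist, a, e):
    """E = 1/2 * sum_i sum_s e[s] * (m_i(t-s) - sum_k V_ik a_k(t))^2 on a lag grid."""
    pred = V @ a                              # postdicted motor activity, shape (n,)
    resid = m_hist - pred[:, None]            # one column per time lag s
    return 0.5 * np.sum(e[None, :] * resid ** 2)

def hebbian_update(V, m_hist, a, e):
    """Discrete-lag version of Equation A2; assumes sum(e) == 1."""
    m_trace = m_hist @ e                      # eligibility-weighted past motor activity
    return np.outer(m_trace, a) - np.outer(V @ a, a)

rng = np.random.default_rng(0)
n, S = 3, 5                                   # neurons and number of lags (arbitrary)
V = rng.normal(size=(n, n))
m_hist = rng.normal(size=(n, S))              # m_hist[i, s] stands for m_i(t - s)
a = rng.normal(size=n)
e = rng.random(S)
e /= e.sum()                                  # normalized eligibility trace

# Central finite differences of E with respect to each weight V_ij.
grad = np.zeros_like(V)
h = 1e-6
for i in range(n):
    for j in range(n):
        dV = np.zeros_like(V)
        dV[i, j] = h
        grad[i, j] = (error(V + dV, m_hist, a, e) - error(V - dV, m_hist, a, e)) / (2 * h)

# The Hebbian update should equal the negative gradient up to rounding error.
max_mismatch = np.max(np.abs(hebbian_update(V, m_hist, a, e) + grad))
```

Under these assumptions `max_mismatch` is at the level of floating-point rounding, which is what the derivation predicts when the eligibility trace is normalized.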

## *A1.1. Extension to non-linear network*

Note that our linear approach can be extended by introducing a nonlinear function *f* in the auditory to motor mapping in Equation 2:

$$\mathcal{E} = \frac{1}{2} \sum\_{i} \int\_{0}^{\infty} \mathrm{d}s \left[ m\_{i}(t - s) - f \left( \sum\_{k} \mathcal{V}\_{ik} \, a\_{k}(t) \right) \right]^{2} e(s)$$

$$\Rightarrow \delta \mathcal{V}\_{ij} = \int\_{0}^{\infty} \mathrm{d}s \Big[ m\_{i}(t - s) - f\_{i}(t) \Big] f\_{i}^{'}(t) \, a\_{j}(t) \, e(s)$$

$$= \int\_{0}^{\infty} \mathrm{d}s \, m\_{i}(t - s) f\_{i}^{'}(t) \, a\_{j}(t) \, e(s) - f\_{i}(t) f\_{i}^{'}(t) \, a\_{j}(t),$$

where *f<sub>i</sub>*(*t*) = *f*(Σ<sub>*k*</sub> V<sub>*ik*</sub> *a<sub>k</sub>*(*t*)) and *f′<sub>i</sub>*(*t*) is the derivative of *f* with respect to its argument, evaluated at Σ<sub>*k*</sub> V<sub>*ik*</sub> *a<sub>k</sub>*(*t*).
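The nonlinear update can be checked in the same way as the linear one. The sketch below assumes *f* = tanh purely for illustration (the text does not fix a particular nonlinearity), uses a discrete lag grid with a normalized eligibility trace, and takes the derivative of *f* with respect to its argument; the names `nonlinear_error` and `nonlinear_update` are hypothetical.

```python
import numpy as np

def nonlinear_error(V, m_hist, a, e):
    """E = 1/2 * sum_i sum_s e[s] * (m_i(t-s) - f(sum_k V_ik a_k))^2 with f = tanh."""
    resid = m_hist - np.tanh(V @ a)[:, None]
    return 0.5 * np.sum(e[None, :] * resid ** 2)

def nonlinear_update(V, m_hist, a, e):
    """delta V_ij = (sum_s e[s] m_i(t-s) - f_i) * f'_i * a_j, assuming sum(e) == 1."""
    u = V @ a
    f = np.tanh(u)
    f_prime = 1.0 - f ** 2                    # derivative of tanh at u
    m_trace = m_hist @ e
    return np.outer((m_trace - f) * f_prime, a)

rng = np.random.default_rng(0)
n, S = 3, 5
V = 0.5 * rng.normal(size=(n, n))
m_hist = rng.normal(size=(n, S))
a = rng.normal(size=n)
e = rng.random(S)
e /= e.sum()

# Finite-difference gradient of the nonlinear error.
grad = np.zeros_like(V)
h = 1e-6
for i in range(n):
    for j in range(n):
        dV = np.zeros_like(V)
        dV[i, j] = h
        grad[i, j] = (nonlinear_error(V + dV, m_hist, a, e)
                      - nonlinear_error(V - dV, m_hist, a, e)) / (2 * h)

max_mismatch_nl = np.max(np.abs(nonlinear_update(V, m_hist, a, e) + grad))
```

With a linear *f* the factor `f_prime` equals one and the update reduces to Equation A2.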

## *A1.2. Probabilistic derivation of Hebbian learning rule*

We derive a version of the Hebbian learning rule in Equation 1 that is based on the following probabilistic Boltzmann neuron model. For simplicity, we do not include the time dependence in the derivation (*τ* = 0). The auditory feedback response *a* given a motor activation *m* is given by the conditional probability

$$\mathbb{P}\_{\mathbf{Q}}(a|m) = \frac{e^{a^T \mathbf{Q} m}}{Z\_{\mathbf{Q}}(m)},$$

parameterized by the matrix **Q**, which is the motor-sensory mapping (as before) and where

$$Z\_{\mathbf{Q}}(m) = \sum\_{a} e^{a^T \mathbf{Q} m}$$

is the partition function. The posterior probability of *m* is given by

$$\mathbb{P}\_{\mathbf{Q}}(m|a) = \frac{\mathbb{P}\_{\mathbf{Q}}(a|m)\mathbb{P}(m)}{\mathbb{P}(a)}.$$

In a sensory state, auditory responses in motor neurons are driven via synapses **V** according to the probabilistic model:

$$\mathbf{P}\_{\mathbf{V}}(m|a) = \frac{e^{m^T \mathbf{V} a}}{Z\_{\mathbf{V}}(a)}$$

with partition function

$$Z\_{\mathbf{V}}(a) = \sum\_{m} e^{m^T \mathbf{V} a}.$$

The error function in Equation 2 is replaced by the Kullback-Leibler (KL) divergence between P<sub>**Q**</sub>(*m*|*a*) and P<sub>**V**</sub>(*m*|*a*):

$$\begin{aligned} \mathrm{D\_{KL}}\left(\mathrm{P}\_{\mathbf{Q}}(m|a), \mathrm{P}\_{\mathbf{V}}(m|a)\right) &= \sum\_{m} \mathrm{P}\_{\mathbf{Q}}(m|a) \ln \left(\frac{\mathrm{P}\_{\mathbf{Q}}(m|a)}{\mathrm{P}\_{\mathbf{V}}(m|a)}\right) \\ &= \sum\_{m} \mathrm{P}\_{\mathbf{Q}}(m|a) \left[\ln \left(\mathrm{P}\_{\mathbf{Q}}(m|a)\right) - \ln \left(\mathrm{P}\_{\mathbf{V}}(m|a)\right) \right] \\ &= \sum\_{m} \mathrm{P}\_{\mathbf{Q}}(m|a) \left[\ln \left(\mathrm{P}\_{\mathbf{Q}}(m|a)\right) + \ln \left(Z\_{\mathbf{V}}(a)\right) - m^{T}\mathbf{V}a \right] \end{aligned} \tag{A3}$$

Before taking the derivative of DKL we compute the derivative of the partition function:

$$\begin{aligned} \frac{\partial}{\partial \mathbf{V}\_{ij}} Z\_{\mathbf{V}}(a) &= \sum\_{m} e^{m^{T} \mathbf{V} a} \frac{\partial}{\partial \mathbf{V}\_{ij}} m^{T} \mathbf{V} a = \sum\_{m} e^{m^{T} \mathbf{V} a} m\_{i} a\_{j} \\ &= Z\_{\mathbf{V}}(a) \, a\_{j} \sum\_{m} \mathbf{P}\_{\mathbf{V}}(m|a) \, m\_{i} \\ &= Z\_{\mathbf{V}}(a) \, a\_{j} \langle m\_{i} | a \rangle\_{m}, \end{aligned}$$

based on which it follows that

$$\frac{\partial}{\partial \mathbf{V}\_{ij}} \ln(Z\_{\mathbf{V}}(a)) = \frac{1}{Z\_{\mathbf{V}}(a)} \frac{\partial}{\partial \mathbf{V}\_{ij}} Z\_{\mathbf{V}}(a) = a\_{j} \langle m\_{i}|a\rangle\_{m}.$$

Using this relationship and Equation A3 we can calculate the derivative of the KL-divergence with respect to V*ij*:

$$\begin{split} &\frac{\partial}{\partial \mathbf{V}\_{ij}} \mathrm{D\_{KL}}(\mathbf{P}\_{\mathbf{Q}}(m|a), \mathbf{P}\_{\mathbf{V}}(m|a)) \\ &= \frac{\partial}{\partial \mathbf{V}\_{ij}} \sum\_{m} \mathrm{P}\_{\mathbf{Q}}(m|a) \left[ \ln \left( \mathrm{P}\_{\mathbf{Q}}(m|a) \right) + \ln \left( Z\_{\mathbf{V}}(a) \right) - m^{T} \mathbf{V} a \right] \\ &= \sum\_{m} \mathrm{P}\_{\mathbf{Q}}(m|a) \left( \frac{\partial}{\partial \mathbf{V}\_{ij}} \left( \ln \left( Z\_{\mathbf{V}}(a) \right) \right) - m\_{i} a\_{j} \right) \\ &= \sum\_{m} \frac{\mathrm{P}\_{\mathbf{Q}}(a|m) \mathrm{P}(m)}{\mathrm{P}(a)} \left( \frac{\partial}{\partial \mathbf{V}\_{ij}} \left( \ln \left( Z\_{\mathbf{V}}(a) \right) \right) - m\_{i} a\_{j} \right) \\ &\Rightarrow \left\langle -\frac{\partial}{\partial \mathbf{V}\_{ij}} \mathrm{D}\_{\mathrm{KL}} \left( \mathrm{P}\_{\mathbf{Q}}(m|a), \mathrm{P}\_{\mathbf{V}}(m|a) \right) \right\rangle\_{a} \\ &= \sum\_{a,m} \mathrm{P}\_{\mathbf{Q}}(a|m) \mathrm{P}(m) \left( m\_{i} \ a\_{j} - \frac{\partial}{\partial \mathbf{V}\_{ij}} \left( \ln \left( Z\_{\mathbf{V}}(a) \right) \right) \right) \end{split}$$

Thus, gradient descent leads to

$$\Rightarrow \delta \mathbf{V}\_{ij} = m\_i a\_j - \frac{\partial}{\partial \mathbf{V}\_{ij}} \ln \left( Z\_{\mathbf{V}}(a) \right) = \left[ m\_i - \langle m\_i | a \rangle\_m \right] a\_j.$$

This is the probabilistic analog of Equation A2, in which the silent postdictive motor activity *m̂<sub>i</sub>* is replaced by the conditional expectation ⟨*m<sub>i</sub>*|*a*⟩<sub>*m*</sub> of activity in motor neuron *i* given the sensory response *a*.
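To make the conditional expectation concrete, the following sketch assumes binary motor units, *m<sub>i</sub>* ∈ {0, 1}; this is an assumption added for illustration, since the text does not specify the state space. Under that assumption the model P<sub>**V**</sub>(*m*|*a*) factorizes over neurons and ⟨*m<sub>i</sub>*|*a*⟩<sub>*m*</sub> is a logistic function of (**V***a*)<sub>*i*</sub>. The code checks this against a brute-force sum over all motor states; all function names are hypothetical.

```python
import itertools
import math

def conditional_mean_bruteforce(V, a):
    """<m_i | a>_m under P_V(m|a) = exp(m^T V a) / Z_V(a), summing over all binary m."""
    n = len(V)
    drive = [sum(V[i][k] * a[k] for k in range(len(a))) for i in range(n)]  # (V a)_i
    Z = 0.0
    mean = [0.0] * n
    for m in itertools.product((0, 1), repeat=n):
        w = math.exp(sum(m[i] * drive[i] for i in range(n)))  # unnormalized weight
        Z += w
        for i in range(n):
            mean[i] += w * m[i]
    return [x / Z for x in mean]

def conditional_mean_closed_form(V, a):
    """For binary units the expectation is sigmoid((V a)_i)."""
    drive = [sum(V[i][k] * a[k] for k in range(len(a))) for i in range(len(V))]
    return [1.0 / (1.0 + math.exp(-d)) for d in drive]

def delta_V(V, m, a):
    """Probabilistic Hebbian update [m_i - <m_i|a>_m] a_j for one (m, a) pair."""
    mean = conditional_mean_closed_form(V, a)
    return [[(m[i] - mean[i]) * a[j] for j in range(len(a))] for i in range(len(V))]

# Arbitrary example values: three motor neurons, two sensory inputs.
V_example = [[0.5, -1.0], [0.2, 0.3], [1.5, 0.0]]
a_example = [1.0, 2.0]
brute = conditional_mean_bruteforce(V_example, a_example)
closed = conditional_mean_closed_form(V_example, a_example)
update = delta_V(V_example, [1, 0, 1], a_example)
```

The update vanishes on average once the observed motor activity matches the model's conditional expectation, in line with the postdiction interpretation of Equation A2.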

#### **A2. CORRELATION OF MOTOR ACTIVITY DETERMINES AVERAGE SYNAPTIC CHANGE**

The average synaptic change under learning rule Equation A2 satisfies

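$$
\left< \delta \mathbf{V}\_{ij} \right> = \int\_0^\infty \left< m\_i(t-s) \, a\_j(t) \right> e(s) \, \mathrm{d}s - \left< \sum\_k \mathbf{V}\_{ik} \, a\_k(t) \, a\_j(t) \right>
$$

$$
= \int\_0^\infty \left< \sum\_k m\_i(t') \, \mathbf{Q}\_{jk} \, m\_k(t'+s-\tau) \right> e(s) \, \mathrm{d}s
$$

$$
- \left< \sum\_{k,l,n} \mathbf{V}\_{ik} \, \mathbf{Q}\_{kl} \, m\_l(t-\tau) \, \mathbf{Q}\_{jn} \, m\_n(t-\tau) \right>,
$$

where we have used the sensory response *a<sub>j</sub>*(*t*) = Σ<sub>*k*</sub> Q<sub>*jk*</sub> *m<sub>k</sub>*(*t* − *τ*) and substituted *t′* = *t* − *s*. We can write this equation as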

$$
\langle \delta \mathbf{V} \rangle = \left[ \int\_0^\infty e(s) \mathbf{C}(s - \tau) \mathrm{d}s - \mathbf{VQC}(0) \right] \mathbf{Q}^T,\qquad \text{(A4)}
$$

where C<sub>*ij*</sub>(*s*) = ⟨*m<sub>i</sub>*(*t*) *m<sub>j</sub>*(*t* + *s*)⟩ is the cross-correlation matrix of motor activity at time lag *s*. In the following we assume without loss of generality that the delay *τ<sub>s</sub>* of synaptic transmission between auditory and motor neurons is negligibly small.

## **A3. VARIABLE MOTOR CODE**

#### *A3.1. V is a causal inverse*

We assume a motor code with a narrow correlation function that is non-zero only within a short window of width *t*<sub>0</sub>:

$$\mathbf{C}(s) = \begin{cases} \mathbf{1} C\_0 & \text{for } |s| < t\_0/2 \\ 0 & \text{otherwise,} \end{cases}$$

where **1** is the identity matrix and *C*<sub>0</sub> is a positive constant. The steady state solution δ**V** = 0 of Equation A4 leads to

$$\int\_0^\infty e(s)\mathbf{C}(s-\tau)\,\mathrm{ds}-\mathbf{VQC}(0) = 0.$$

Assuming that the eligibility trace is constant over short time intervals of duration *t*<sub>0</sub> (over which the correlation function is non-zero) yields

$$\begin{aligned} \Rightarrow e\left(\tau\right)t\_0 \mathbf{1}C\_0 - \mathbf{V} \mathbf{Q} \mathbf{1}C\_0 &= 0 \\\\ \iff \mathbf{V} \mathbf{Q} &= e(\tau)t\_0 \mathbf{1} \end{aligned}$$

This implies that the auditory to motor mapping **V** is proportional to the inverse of **Q** weighted by the eligibility at time lag *τ* :

$$\mathbf{V} = e(\boldsymbol{\tau}) \boldsymbol{t}\_0 \mathbf{Q}^{-1} \tag{A5}$$

Thus, **V** is a causal inverse, at least when restricted to the image of **Q**.
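The fixed point in Equation A5 can be illustrated by iterating the averaged update of Equation A4 numerically. The sketch below is only an illustration under stated assumptions: the motor-sensory map `Q` is a randomly drawn, well-conditioned matrix, the eligibility trace is taken to be a normalized exponential with decay constant `T_e`, and the learning rate and iteration count are arbitrary. None of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q = np.eye(n) + 0.1 * rng.normal(size=(n, n))    # hypothetical invertible motor-sensory map

tau, t0, C0 = 20.0, 2.0, 1.0                     # sensory delay, correlation width, peak value
T_e = 50.0                                       # assumed decay constant of the trace
ds = 0.01
s = np.arange(0.0, 500.0, ds)
e = np.exp(-s / T_e) / T_e                       # eligibility trace, integrates to about 1

# Integral of e(s) C(s - tau) ds, with C(s) = 1 * C0 for |s| < t0 / 2 and 0 otherwise.
window = np.abs(s - tau) < t0 / 2
drive = C0 * np.sum(e[window]) * ds * np.eye(n)

# Iterate the averaged learning rule <dV> = [drive - V Q C(0)] Q^T with C(0) = 1 * C0.
V = np.zeros((n, n))
for _ in range(5000):
    V += 0.1 * (drive - (V @ Q) * C0) @ Q.T

V_pred = np.exp(-tau / T_e) / T_e * t0 * np.linalg.inv(Q)    # Equation A5
rel_err = np.linalg.norm(V - V_pred) / np.linalg.norm(V_pred)
```

In this setting `rel_err` is small but not zero, because Equation A5 treats the eligibility trace as constant across the correlation window while the simulation integrates it on a grid.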

## *A3.2. Variable motor codes are associated with large mirroring offsets*

We simulate a mirroring experiment in which we cross-correlate, in a given neuron, the motor activity *m<sub>i</sub>*(*t*) and the activity *m<sub>i</sub><sup>a</sup>*(*t*) that results from observation of the motor act (achieved in birds by song playback through a loudspeaker).

The auditory response in motor neuron *i* is given by

$$\begin{aligned} m\_i^a(t) &= \sum\_j \mathbf{V}\_{ij} a\_j(t) \\ &= \sum\_j (\mathbf{V}\mathbf{Q})\_{ij} m\_j(t-\tau) \\ &= e(\tau) t\_0 \sum\_j \delta\_{i,j} m\_j(t-\tau) \\ &= e(\tau) t\_0 \, m\_i(t-\tau), \end{aligned}$$

where *δ<sub>i,j</sub>* is the Kronecker delta (*δ<sub>i,j</sub>* = 1 for *i* = *j* and *δ<sub>i,j</sub>* = 0 otherwise). Note that relative to song (either produced by the bird or played through the loudspeaker) the playback-evoked activity *m<sub>i</sub><sup>a</sup>*(*t*) is shifted with respect to the motor activity *m<sub>i</sub>*(*t* − *τ*) by a time shift *τ* (as illustrated in **Figure 4**). The cross correlation Corr(*s*) between sensory-evoked and motor-generated activity is defined as

$$\text{Corr}(s) = \frac{1}{T'} \int\_0^{T'} m\_i(t) \, m\_i^a(t+s) \, \mathrm{d}t = \left< m\_i(t) \, m\_i^a(t+s) \right>, \tag{A6}$$

where *T′* is the duration of the motor behavior (e.g., the song motif or song). Inserting the expression for the motor activity evoked by the auditory response into Equation A6 yields

$$\begin{aligned} \text{Corr}(s) &= \langle m\_i(t) \, e(\tau) t\_0 \, m\_i(t-\tau+s) \rangle \\ &= \begin{cases} e(\tau) t\_0 C\_0 & \text{for } |s-\tau| < t\_0/2 \\ 0 & \text{otherwise.} \end{cases} \end{aligned} \tag{A7}$$

Thus, the cross correlation is nonzero only in a small time window centered on *τ*. The peak cross-correlation value is set by the eligibility trace at time lag *τ*:

$$\text{CorrPeak} = t\_0 C\_0 e(\tau)$$

Note that our calculations are valid in principle for mean-subtracted *m<sub>i</sub>*(*t*). In case of non-mean-subtracted *m<sub>i</sub>* we should replace the cross correlation in Equation A6 by the cross covariance to obtain the same findings. However, in practice, mean subtraction is not necessary because the peak location (the mirroring offset) is independent of the mean.
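The mirroring offset predicted by Equation A7 can be reproduced with a toy simulation. The sketch below assumes a white-noise motor code (a stand-in for a variable code with a narrow correlation function), a delay of 15 time steps, and a gain of 0.03 standing in for *e*(*τ*)*t*<sub>0</sub>; these values and the helper `cross_correlation` are hypothetical choices for illustration.

```python
import random

def cross_correlation(x, y, lags):
    """Corr(s) = mean over t of x(t) * y(t + s), using only samples where both exist."""
    n = len(x)
    out = {}
    for s in lags:
        ts = range(max(0, -s), min(n, n - s))
        out[s] = sum(x[t] * y[t + s] for t in ts) / len(ts)
    return out

rng = random.Random(0)
n_samples, tau, gain = 2000, 15, 0.03            # gain stands in for e(tau) * t0

# Variable motor code: independent samples, so correlations vanish beyond zero lag.
m = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Playback-evoked activity: scaled copy of the motor activity delayed by tau.
m_a = [gain * m[t - tau] if t >= tau else 0.0 for t in range(n_samples)]

corr = cross_correlation(m, m_a, range(-40, 41))
mirroring_offset = max(corr, key=corr.get)       # lag at which Corr(s) peaks
```

In this toy setting the peak lag equals the delay `tau`, and the peak height is close to `gain` times the variance of the motor activity.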

#### **A4. STEREOTYPED MOTOR CODES**

#### *A4.1. V is a predictive inverse*

We describe the motor activity by a traveling pulse *m<sub>i</sub>*(*t*) = η(*i*/ω − *t*) with speed ω, where

$$\eta(t) = \begin{cases} 1 & \text{for } |t| < t\_0/2 \\ 0 & \text{otherwise} \end{cases}$$

and *t*<sub>0</sub> = 1/ω. The cross-correlation matrix for such a traveling pulse is a triangular pulse of height *t*<sub>0</sub>/*T* and width 2*t*<sub>0</sub>, which we approximate in the following by a square pulse of width *t*<sub>0</sub>, C<sub>*ij*</sub>(*s*) ≃ (*t*<sub>0</sub>/*T*) η((*i* − *j*)/ω + *s*) (illustrated in **Figure 3D**). To facilitate comparison of inverses associated with stereotyped and variable motor codes, we assume their peak correlations are identical, i.e.,

$$C\_0 = \frac{t\_0}{T}.$$

At a steady state δ**V** = 0, Equation A4 yields

$$\int\_0^\infty e(s) \mathbf{C}(s-\tau) \, \mathrm{d}s - \mathbf{VQC}(0) = 0.$$

Assuming again that the eligibility trace is constant over short time intervals of width *t*<sub>0</sub>, we find

$$\Rightarrow e\left(\tau - \frac{i - j}{\omega}\right)t\_0 = (\mathbf{VQ})\_{ij}.$$

Given sensory input *a<sub>j</sub>*(*t*) and the inverse map **V**, the auditory-evoked activity *m<sub>i</sub><sup>a</sup>*(*t*) is proportional to the eligibility trace:

$$\begin{aligned} m\_i^a(t) &= \sum\_j \mathbf{V}\_{ij} a\_j(t) \\ &= \sum\_j (\mathbf{V}\mathbf{Q})\_{ij} m\_j(t - \tau) \\ &= e\left(t - \frac{i}{\omega}\right) t\_0, \end{aligned} \tag{A8}$$

defined for *t* ≥ *i*/ω.

By approximating the eligibility trace only by its maximum value, *e*(*t* − *i*/ω) = *e*(0) η(*t* − *i*/ω), we have that approximately

$$m\_i^a(t) \cong e(0)t\_0 \, m\_i(t),\tag{A9}$$

and so the playback-evoked activity *m<sub>i</sub><sup>a</sup>*(*t*) is roughly identical to motor activity *m<sub>i</sub>*(*t*) (as illustrated in **Figure 4**).

To compute the matrix **V** we use the same approximation for the eligibility trace to obtain

$$\frac{1}{t\_0}(\mathbf{V}\mathbf{Q})\_{ij} \simeq e(0)\eta(i-j-\tau) = e(0)\mathbf{H}^{\tau}\_{ij},$$

where we have set ω = 1, and where **H** is a shifter matrix (also called cyclic permutation or circulant matrix), e.g., for *n* = 4:

$$\mathbf{H} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

With this approximation we find that the synaptic mapping

$$\mathbf{V} \simeq e(0)t\_0 \mathbf{H}^{\tau} \mathbf{Q}^{-1} \tag{A10}$$

is the inverse of the motor map shifted in time by *τ*, i.e., **V** maps sensory activity evoked by motor activity at time *t* onto motor activity at time *t* + *τ*. In other words, the sensory lag is compensated and sensory-evoked motor activity at time *t* predicts motor activity at time *t*. Hence, **V** is a predictive inverse.
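The action of the shifter matrix **H** and of its powers can be made explicit with a few lines of code. This is a small illustration for *n* = 4, the case written out above; the vector `x` is an arbitrary example and the exponent 3 is an arbitrary stand-in for *τ*.

```python
import numpy as np

n = 4
# Ones on the superdiagonal with wrap-around, matching the 4 x 4 example of H.
H = np.roll(np.eye(n, dtype=int), 1, axis=1)

x = np.arange(n)                          # example activity vector, indexed by neuron
shifted_once = H @ x                      # (H x)_i = x_{(i + 1) mod n}

H_tau = np.linalg.matrix_power(H, 3)      # H applied three times: a shift by three steps
shifted_tau = H_tau @ x
```

Because the shift is cyclic, applying **H** *n* times returns the identity, so a shift by *τ* steps is equivalent to a shift by *τ* modulo *n*.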

#### *A4.2. Stereotyped motor codes are associated with small mirroring offsets*

Based on the sensory-evoked activity *m<sub>i</sub><sup>a</sup>*(*t*) derived in Equation A8, we find for the cross-correlation function Corr(*s*) between motor activity *m<sub>i</sub>*(*t*) and sensory-evoked activity *m<sub>i</sub><sup>a</sup>*(*t*):

$$\begin{aligned} \text{Corr}(s) &= \frac{1}{T} \int\_{0}^{T} m\_{i}(t) \, m\_{i}^{a}(t+s) \, \mathrm{d}t \\ &= \frac{t\_{0}}{T} \int\_{0}^{T} \eta\left(\frac{i}{\omega} - t\right) e\left(t - \frac{i}{\omega} + s\right) \mathrm{d}t \\ &= \frac{t\_{0}^{2}}{T} e(s) = t\_{0}C\_{0}e(s). \end{aligned} \tag{A11}$$

Thus, the cross-correlation function is proportional to the eligibility trace. If the eligibility trace is monotonically decaying we find that the peak of Corr(*s*) occurs at *s* = 0 and is given by

$$\text{CorrPeak} = t\_0 C\_0 e(0).$$

In other words, stereotyped neural codes are associated with zero mirroring offsets.

The ratio of peak cross correlation for variable (A7) and stereotyped (A11) motor codes is given by

$$r = \frac{e(\tau)}{e(0)},\tag{A12}$$

implying that the more stereotyped a neural code, the stronger is the observed mirroring effect.
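The small mirroring offset for a stereotyped code can also be illustrated numerically. The sketch below evaluates Equation A8 on a discrete time grid for a single neuron, assuming an exponentially decaying eligibility trace that is zero for negative arguments; the decay constant, time step, pulse speed, and neuron index are hypothetical values. Without the square-pulse approximation used in the derivation, the peak of the cross correlation lands within about one pulse width *t*<sub>0</sub> of zero lag rather than exactly at zero.

```python
import math

dt = 0.1                       # time step (ms)
omega = 0.5                    # pulse speed (neurons per ms), so t0 = 1 / omega = 2 ms
t0 = 1.0 / omega
T = 100.0                      # duration of the motor behavior (ms)
T_e = 20.0                     # assumed decay constant of the eligibility trace (ms)
i = 10                         # neuron index; its pulse is centered at i / omega = 20 ms

def eta(t):
    """Square pulse of width t0 centered at zero."""
    return 1.0 if abs(t) < t0 / 2 else 0.0

def e(s):
    """Assumed eligibility trace: normalized exponential, zero for negative lags."""
    return math.exp(-s / T_e) / T_e if s >= 0 else 0.0

times = [k * dt for k in range(int(T / dt))]
m = [eta(i / omega - t) for t in times]              # motor activity (traveling pulse)
m_a = [t0 * e(t - i / omega) for t in times]         # playback-evoked activity, Equation A8

def corr(lag_steps):
    """Discrete version of Corr(s) = (1/T) * integral of m(t) * m_a(t + s) dt."""
    total = 0.0
    for k in range(len(times)):
        k2 = k + lag_steps
        if 0 <= k2 < len(times):
            total += m[k] * m_a[k2]
    return total * dt / T

lags = range(-100, 301)                              # from -10 ms to +30 ms
peak_lag_ms = max(lags, key=corr) * dt
```

Compared with a sensory delay *τ* of tens of milliseconds in the variable-code case, an offset bounded by *t*<sub>0</sub> is small, which is the qualitative point of this section.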

#### **A5. MIRRORED RESPONSE STRENGTH IN A PROBABILISTIC MODEL**

In the following we define a probabilistic model of a motor neuron that allows us to compute the mirroring strength, i.e., the correlation between motor activity and sensory-evoked activity. We assume a minimal model in which a neuron has only two states *R* = 1 (active) and *R* = 0 (inactive). In addition, we assume two behavioral states *B* = 1 (behavioral feature present), and *B* = 0 (behavioral feature absent). During motor production, the degeneracy of the motor code quantified by the conditional probability of neural activity given that the feature of interest is present during the behavior (e.g., the finger is extended or the song pitch is high) is

$$P\_M(R=1|B=1) = p\_1$$

and the probability that the neuron is active while the behavioral feature is absent (intrinsic noise) is

$$P\_M(R=1|B=0) = p\_2.$$

Hence, the average motor response [for prior *P*(*B* = 1) = 1/2] is given by

$$\langle R \rangle\_{\text{motor}} = \sum\_{i} 1 \times P\_{M}(R=1|B=i)P(B=i) = \frac{1}{2} \left(p\_1 + p\_2\right) = p.$$

In the sensory state (during observation of the behavior), the reliability of a response quantified by the conditional probability of triggering a sensory response given presence of the behavioral feature in the stimulus is given by,

$$P\_S(R=1|B=1) = q\_1$$

and the probability of a sensory response without the behavioral feature (intrinsic noise) is:

$$P\_S(R=1|B=0) = q\_2.$$

The parameters *p*<sub>1</sub>, *p*<sub>2</sub>, *q*<sub>1</sub>, and *q*<sub>2</sub> can be freely chosen in this minimal model, for example *q*<sub>2</sub> = *p*<sub>2</sub> if intrinsic noise in sensory and motor states are assumed to be equal.

The average response in the sensory state is given by

$$
\langle R \rangle\_{\text{sensor}} = \frac{1}{2} \left( q\_1 + q\_2 \right) = q.
$$

The correlation between sensory and motor responses in this cell is

$$
\left\langle R\_{\text{motor}} R\_{\text{sensor}} \right\rangle = \sum\_{i} 1 \times P\_M(R=1|B=i) \, P\_S(R=1|B=i) \, P(B=i) = \frac{1}{2} \left( p\_1 q\_1 + p\_2 q\_2 \right),
$$

and the correlation coefficient between motor- and sensory-evoked responses is

$$\begin{aligned} \text{CorrCoeff} &= \frac{\langle R\_{\text{motor}} R\_{\text{sensor}} \rangle - \langle R\_{\text{motor}} \rangle \langle R\_{\text{sensor}} \rangle}{\sqrt{\left(\langle R\_{\text{sensor}}^2 \rangle - \langle R\_{\text{sensor}} \rangle^2\right)\left(\langle R\_{\text{motor}}^2 \rangle - \langle R\_{\text{motor}} \rangle^2\right)}} \\ &= \frac{\frac{1}{2} (p\_1 q\_1 + p\_2 q\_2) - pq}{\sqrt{p(1 - p)q(1 - q)}}. \end{aligned}$$

We can discuss the following special cases:

• perfect sensory tuning (*q*<sub>1</sub> = 1, *q*<sub>2</sub> = 0, no intrinsic noise in sensory state):

$$\text{CorrCoeff} = \frac{p\_1 - p\_2}{2\sqrt{p(1-p)}}$$

• same tuning and same intrinsic noise in motor and in sensory states (*q*<sub>1</sub> = *p*<sub>1</sub>, *q*<sub>2</sub> = *p*<sub>2</sub>):

$$\text{CorrCoeff} = \frac{\left(p\_1 - p\_2\right)^2}{4p\left(1 - p\right)}$$

In summary, the strength of mirrored responses scales linearly or quadratically with the contrastive probability that neural responses are locked to the behavioral feature vs. spontaneously driven.
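The two special cases can be checked against the general expression with a short calculation. The function name `mirroring_corr_coeff` and the example probabilities (*p*<sub>1</sub> = 0.9, *p*<sub>2</sub> = 0.1) are hypothetical and chosen only for illustration; the formula itself is the general correlation coefficient given above with the prior *P*(*B* = 1) = 1/2.

```python
import math

def mirroring_corr_coeff(p1, p2, q1, q2):
    """Correlation coefficient between motor and sensory-evoked responses."""
    p = 0.5 * (p1 + p2)                      # average motor response
    q = 0.5 * (q1 + q2)                      # average sensory response
    joint = 0.5 * (p1 * q1 + p2 * q2)        # <R_motor R_sensor>
    return (joint - p * q) / math.sqrt(p * (1 - p) * q * (1 - q))

p1, p2 = 0.9, 0.1
p = 0.5 * (p1 + p2)

# Perfect sensory tuning: should equal (p1 - p2) / (2 * sqrt(p * (1 - p))).
perfect = mirroring_corr_coeff(p1, p2, 1.0, 0.0)
perfect_formula = (p1 - p2) / (2 * math.sqrt(p * (1 - p)))

# Same tuning in motor and sensory states: should equal (p1 - p2)^2 / (4 * p * (1 - p)).
same = mirroring_corr_coeff(p1, p2, p1, p2)
same_formula = (p1 - p2) ** 2 / (4 * p * (1 - p))
```

For these example values the first case gives a coefficient of 0.8 and the second 0.64, showing the linear and the quadratic dependence on *p*<sub>1</sub> − *p*<sub>2</sub>.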

## *Mike Skocik<sup>1</sup> and Alexay Kozhevnikov<sup>1,2</sup>\**

*<sup>1</sup> Department of Physics, Pennsylvania State University, University Park, PA, USA*

*<sup>2</sup> Department of Psychology, Pennsylvania State University, University Park, PA, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

*Noah Cowan, Johns Hopkins University, USA*

#### *\*Correspondence:*

*Alexay Kozhevnikov, Department of Physics, Pennsylvania State University, University Park, PA, USA. Department of Psychology, Pennsylvania State University, University Park, PA, USA. e-mail: akozhevn@phys.psu.edu*

Studies of behavioral and neural responses to distorted auditory feedback (DAF) can help shed light on the neural mechanisms of animal vocalizations. We describe an apparatus for generating real-time acoustic feedback. The system can very rapidly detect acoustic features in a song and output acoustic signals if the detected features match the desired acoustic template. The system uses spectrogram-based detection of acoustic elements. It is low-cost and can be programmed for a variety of behavioral experiments requiring acoustic feedback or neural stimulation. We use the system to study the effects of acoustic feedback on birds' vocalizations and demonstrate that such an acoustic feedback can cause both immediate and long-term changes to birds' songs.

**Keywords: acoustic feedback, animal vocalizations, behavioral neuroscience, sensory feedback, real-time data processing**

## **INTRODUCTION**

Distorted auditory feedback (DAF) is used for assessing the effects of auditory input on vocal production. Presenting DAF and assessing its effects on the song and on the neural activity have been used in songbirds to study the mechanisms of song production and learning (Leonardo and Konishi, 1999; Sakata and Brainard, 2006; Andalman and Fee, 2009; Keller and Hahnloser, 2009; Tschida and Mooney, 2012). Human speech is sensitive to certain types of DAF (Lee, 1950; Houde and Jordan, 1998), and DAF is used to study speech mechanisms. It is often desirable to have real-time DAF, i.e., to rapidly (in a few milliseconds or faster) detect the occurrence of specific acoustic elements in vocalization and present an auditory stimulus once the target acoustic element is detected.

In this paper, we describe an automated system for real-time DAF and demonstrate its use to study both the immediate and the long-term effects of DAF on the song of Bengalese finches. The system uses open-source software and, therefore, is extremely flexible and customizable by the user. It has a significantly lower cost than commercial systems.

Songbirds use auditory feedback to learn to sing when they are young and to maintain their songs in adulthood (Konishi, 1965; Brainard and Doupe, 2000). Long-term exposure to DAF has been shown to cause song degradation in songbirds (Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Leonardo and Konishi, 1999). Some bird species' songs exhibit immediate sensitivity to acoustic input. For these birds, DAF can have an immediate effect on the timing and acoustic structure of the song (Sakata and Brainard, 2006). Analyzing the effects of DAF can yield new understanding of the neural organization of the song and the mechanisms of song learning (Brainard and Doupe, 2000). To study the effects of time-localized DAF on birdsong, it is important to deliver DAF with high temporal precision relative to vocalization. This requires rapidly and reliably detecting specific acoustic elements of the bird's song and generating an acoustic output once an element is detected.

Making an acoustic feedback system operate in real time is a challenging technical task. Real-time performance is most easily achieved with analog systems (Cynx and Von Rad, 2001), but digital systems offer significant advantages in terms of convenience and flexibility. These advantages come with the difficulty of keeping processing delays small and constant. The system has to perform analog-to-digital conversion, fast analysis of the recently acquired data and digital-to-analog conversion, and these operations have to take place with reliable timing and concurrently with saving the acquired data. Custom-made DAF systems have been developed and used in behavioral studies (Leonardo and Konishi, 1999; Kao et al., 2005), but their real-time processing characteristics have not been reported. Custom-made systems often have significant and poorly controlled delays, especially those based on PCs running Windows. Commercial systems for real-time acoustic processing are available but are expensive.

We developed a real-time DAF system based on a PC running Linux and the Real-Time eXperiment Interface (RTXI) software (Lin et al., 2010) and a National Instruments A/D card. The system is low-cost (the cost is only the cost of the hardware, the software is free). The system is capable of A/D bandwidth of over 30 kHz with real-time processing of acoustic signals.

## **METHODS**

A PC with an Intel i7 six-core processor (2.66 GHz) and 4 GB of RAM running Ubuntu Linux 2.6.29.4-rtai with RTXI version 1.1.2 and a National Instruments PCIe-6251 A/D card is used. The A/D card receives audio input from a microphone (Audio-Technica PRO-44, used with Behringer Shark DSP110 microphone amplifier). The output is sent to a speaker amplifier (SLA-1, Applied Research and Technology); the output of the amplifier is connected to a speaker.

The Data Recorder software within RTXI is custom modified. The simplified diagram of signal processing is shown in **Figure 1**. The system has two modes of operation—a non-triggered (idle) mode and a triggered (active) mode. At the core of the modified software is the circular buffer that takes data points one-by-one from the data acquisition engine once they become available. In the non-triggered mode [**Figure 1 (top)**], the system continuously (every 1 ms) computes the rms of the last 10 ms of the input signal. If the signal rms exceeds the threshold, the system is switched into triggered mode.

**FIGURE 1 | Block diagram of the acoustic feedback system.** When not triggered **(top)**, the system computes the rms of the input signal. When the rms exceeds the threshold, the system is triggered. When triggered **(bottom)**, the system computes the spectrogram of the most recent 20 ms of signal and computes the correlation coefficient of this spectrogram with the spectrogram of the template sound (e.g., song syllable). The template sound is detected when the correlation coefficient exceeds a threshold value; in this case, acoustic feedback can be generated. Both the input and the acoustic output are saved to the computer hard drive.

In triggered mode, the system does real-time processing of auditory data. In **Figure 1 (bottom)**, we show the processing done for recognizing the song syllable of a Bengalese finch. The FFT of the past 256 data points (∼8.4 ms) is computed every 1 ms and stored in an FFT circular buffer. Every 1 ms, the spectrogram of the most recent 40 ms of the input signal is obtained from the FFT circular buffer. A correlation coefficient between the input signal spectrogram and the spectrogram of the template is computed. If the correlation coefficient exceeds the threshold, the system detects the occurrence of the target song syllable, and acoustic feedback can be generated, or further processing can be done. While generating acoustic feedback, the system keeps going through all of the above steps, but is disallowed from registering another detection to prevent it from triggering on its own output.
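The detection step described above can be sketched offline with NumPy. This is not the RTXI implementation: it recomputes the whole spectrogram instead of maintaining an FFT circular buffer, it assumes a Hann window and a 20-column (about 20 ms) comparison window, and it uses a synthetic frequency sweep in weak noise as a stand-in for a song syllable. The function names, the hop of 30 samples, and all signal parameters are hypothetical; only the sample rate, the 256-point FFT, and the correlation threshold of 0.8 are taken from the text.

```python
import numpy as np

FS = 30300            # sample rate in Hz (from the text)
NFFT = 256            # samples per FFT (about 8.4 ms)
HOP = 30              # samples between FFT columns (about 1 ms at 30.3 kHz)
N_COLS = 20           # spectrogram columns compared with the template (about 20 ms)

def spectrogram(signal):
    """Magnitude FFT of the past NFFT samples, evaluated every HOP samples."""
    window = np.hanning(NFFT)
    ends = range(NFFT, len(signal) + 1, HOP)
    cols = [np.abs(np.fft.rfft(signal[end - NFFT:end] * window)) for end in ends]
    return np.array(cols).T                      # shape: (frequency bins, time columns)

def detect(signal, template_spec, threshold=0.8):
    """Return the column indices at which the template is detected."""
    spec = spectrogram(signal)
    hits = []
    for end in range(N_COLS, spec.shape[1] + 1):
        recent = spec[:, end - N_COLS:end]       # most recent N_COLS columns
        r = np.corrcoef(recent.ravel(), template_spec.ravel())[0, 1]
        if r > threshold:
            hits.append(end - 1)
    return hits

# Synthetic recording: a 2-6 kHz sweep (stand-in for a syllable) embedded in weak noise.
rng = np.random.default_rng(0)
n_syll = NFFT + (N_COLS - 1) * HOP               # samples needed for N_COLS columns
t = np.arange(n_syll) / FS
sweep = np.sin(2 * np.pi * (2000 * t + 0.5 * (4000 / t[-1]) * t ** 2))
recording = rng.normal(0.0, 0.01, FS // 2)
onset = 6000                                     # sample index where the sweep starts
recording[onset:onset + n_syll] += sweep

template = spectrogram(sweep)                    # template spectrogram with N_COLS columns
hits = detect(recording, template)
```

In a real-time setting each new FFT column would be appended to a circular buffer and only the newest correlation coefficient would be computed, which is the role of the FFT circular buffer described in the text.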

The data circular buffer allows very fast access to chunks of the most recent data for processing without affecting the timing of the data acquisition process. The FFT circular buffer likewise allows extremely fast computation of the spectrograms of the sound (computing the spectrogram is a computationally intensive task). This enables the system to recognize complex vocal elements based on their spectrogram (e.g., frequency sweeps) without compromising the timing. While triggered, every 1 s, the system computes the rms of the previous 200 ms of the input signal to check if the acoustic input is still present. If the rms is below a threshold (no signal), the system goes into the idle mode. While triggered, the system continually saves all the data acquired in a separate array and saves the data to the hard drive once it is switched back to idle mode. A more detailed description of this system, along with the source code, is available at http://www.phys.psu.edu/~akozhevn/ac\_feedback/.
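The switching between idle and triggered modes can be summarized in a short sketch. The class below is a hypothetical illustration, not the modified RTXI Data Recorder: it uses a `collections.deque` as the circular buffer, leaves the cadence of the checks (every 1 ms when idle, every 1 s when triggered) to the caller, and uses a single threshold for both transitions, which is an assumption.

```python
import math
from collections import deque

class RmsGate:
    """Idle/triggered switch driven by the rms of the most recent samples."""

    def __init__(self, sample_rate_hz, threshold, on_window_ms=10.0, off_window_ms=200.0):
        self.on_len = int(sample_rate_hz * on_window_ms / 1000)    # samples in 10 ms
        self.off_len = int(sample_rate_hz * off_window_ms / 1000)  # samples in 200 ms
        self.buffer = deque(maxlen=self.off_len)                   # circular buffer
        self.threshold = threshold
        self.triggered = False

    def push(self, sample):
        """Store one new sample; the oldest sample is dropped when the buffer is full."""
        self.buffer.append(sample)

    def _rms(self, n):
        recent = list(self.buffer)[-n:]
        return math.sqrt(sum(x * x for x in recent) / len(recent))

    def check(self):
        """Update the mode from the rms of the last 10 ms (idle) or 200 ms (triggered)."""
        n = self.off_len if self.triggered else self.on_len
        if len(self.buffer) >= n:
            self.triggered = self._rms(n) > self.threshold
        return self.triggered

gate = RmsGate(sample_rate_hz=30300, threshold=0.1)
for _ in range(gate.on_len):
    gate.push(0.0)                   # 10 ms of silence
state_after_silence = gate.check()
for _ in range(gate.on_len):
    gate.push(1.0)                   # 10 ms of loud input
state_after_sound = gate.check()
for _ in range(gate.off_len):
    gate.push(0.0)                   # 200 ms of silence
state_after_quiet = gate.check()
```

Using a longer window for the return to idle mode keeps the system triggered across the short silent gaps between syllables.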

## **RESULTS**

We tested the performance of our DAF system in several tasks which are often needed in behavioral experiments using acoustic feedback. We also used the system to assess the effects of acoustic feedback on the song of Bengalese finch. All animal procedures were carried out in accordance with the locally approved IACUC protocol.

## **DELAY BETWEEN INPUT AND OUTPUT TEST**

A simple task is generating acoustic feedback when the input level exceeds a certain threshold. Although this task may be too simple for most behavioral experiments, the delay in the system between detecting the crossing of the threshold and producing the output is a useful figure for indicating how fast the system can be when it is solely converting A/D and D/A and saving data without any complex data processing.

The system was programmed so that, once the input exceeded a fixed threshold, the acquired input signal was sent to the D/A output with no extra processing. A square wave with an amplitude exceeding the threshold was applied to the input; the delay between the input and the output was measured with a digital oscilloscope. The measured delay between the output and the input was 27 ± 9 µs (mean ± SD, min = 9 µs, max = 43 µs). The sample rate was 30.3 kHz, so the observed delays corresponded to a delay of 1 data point between the input and the output. The observed variations of the delay are due to the difference in timing between the external input signal and the timing of the A/D events. In all cases, however, the delay between the input and the output does not exceed 1 data point. Therefore, the system has real-time capability.
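The timing figures quoted here follow from the sample rate. As a short worked calculation (the variable names are arbitrary), the sample period at 30.3 kHz and the duration of a 256-point FFT window are:

```python
sample_rate_hz = 30_300

# Time between consecutive A/D samples, in microseconds.
sample_period_us = 1e6 / sample_rate_hz

# Duration of the 256-sample window used for each FFT, in milliseconds.
fft_window_ms = 256 / sample_rate_hz * 1e3

mean_delay_us = 27.0                              # mean measured delay from the text
mean_delay_in_samples = mean_delay_us / sample_period_us
```

The sample period is about 33 µs, so the mean measured delay of 27 µs is below one sample period, and the FFT window spans about 8.4 ms.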

#### **DETECTION OF SPECIFIC VOCAL ELEMENTS IN THE BIRD'S SONG**

A typical task in experiments using acoustic feedback is detection of a certain "template" sound. The template can be either a sound of a certain frequency or a more complex combination of frequencies, frequency sweeps, etc. Once the template is detected, the acoustic feedback can be played back to the animal. This task is computationally intensive because one needs to compute the characteristics of the recently acquired input signal, then compare these with the characteristics of the template and, if the input is sufficiently similar to the template, decide that the detection has occurred and generate acoustic output. The computation has to be done fast enough to enable real-time performance and not interfere with the data acquisition process.

Common techniques that have been used for detecting acoustic elements are spectrogram-based techniques (Leonardo and Fee, 2005) and feature-based techniques (Tchernichovski et al., 2000). In a spectrogram-based approach, the spectrogram of the recently acquired signal is computed and compared to the template spectrogram. A common way to accomplish this is to compute the correlation coefficient between the two spectrograms. Detection of the template sound occurs if the correlation coefficient exceeds a threshold value.

We tested the performance of the system for detection of specific syllables in the song of a Bengalese finch. The Bengalese finch song consists of a sequence of syllables separated by silences (inter-syllable gaps) (**Figure 2**). The acoustic structure of the song syllables is fairly stable; the main source of variability from one song to another is the sequence of syllables in each song (Honda and Okanoya, 1999).

To detect a specific song syllable, the system continuously computes the correlation coefficient of the spectrogram of the most recent 20 ms segment of the acquired signal with the spectrogram of a 20-ms syllable template (see Methods, **Figure 1**). The target syllable is detected by the system when the correlation coefficient exceeds the threshold value of 0.8. The value of the threshold was chosen by examining the target syllable detections by the DAF system in a set of about 20 songs and comparing the detected syllable occurrences with the actual occurrences of the target syllables determined by visual examination of the song spectrograms. If the threshold is set too high, the probability of missing the target syllable increases. Setting the threshold too low increases the probability of false positive detections. After the syllable detection, acoustic feedback (either white noise or the song syllable) can be played back to the bird.
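The detection rule described above can be sketched as follows. This is a minimal illustration and not the authors' real-time implementation: the function names (`correlation_coefficient`, `detect`) and the toy spectrograms are assumptions introduced here; only the use of a correlation coefficient between spectrograms and the threshold of 0.8 come from the text.

```python
import math


def correlation_coefficient(spec_a, spec_b):
    """Pearson correlation between two equally shaped spectrograms.

    Each spectrogram is a list of columns (time bins); each column is a
    list of spectral magnitudes (frequency bins).
    """
    a = [v for col in spec_a for v in col]
    b = [v for col in spec_b for v in col]
    if len(a) != len(b):
        raise ValueError("spectrograms must have the same shape")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrogram is treated as "no match"
    return cov / math.sqrt(var_a * var_b)


def detect(recent_spec, template_spec, threshold=0.8):
    """Report a detection when the correlation exceeds the threshold."""
    return correlation_coefficient(recent_spec, template_spec) > threshold


# Toy 3-column, 3-bin spectrograms (hypothetical values).
template = [[0.0, 1.0, 0.0], [0.0, 2.0, 1.0], [1.0, 3.0, 0.0]]
louder_copy = [[2.0 * v + 0.5 for v in col] for col in template]
different = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

hit = detect(louder_copy, template)
miss = detect(different, template)
```

Because the correlation coefficient is invariant to a gain and offset applied to the whole spectrogram, the louder copy of the template is still detected, whereas a spectrogram with a different frequency profile is rejected.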

Typical performance of the system on the real-time syllable recognition task is shown in **Figure 2**. The top spectrogram shows "detection only" mode: the system detects the target syllable in real time, but no playback is generated. The bottom spectrogram shows detection and playback generation: after detecting the target syllable, the system plays back another song syllable to the bird. The vertical red lines indicate the detection times of the target syllable. The zoomed-in spectrogram of the template is shown on the right. The template contains part of the intersyllable interval and the first 20 ms of the target syllable, so the end of the template (detection time) is approximately in the middle of the 40-ms long syllable.

**FIGURE 2 | Top:** spectrogram of the song of a Bengalese finch and the times of occurrence of one of the song syllables. The system was programmed to only detect the occurrences of the target syllable in real time, no acoustic feedback was generated. The detection times are shown as vertical red lines. **Bottom:** the system is detecting the target syllables (vertical red lines) and is generating acoustic feedback after detection. The acoustic feedback waveform is shown below. The feedback signal is one of the birdsong syllables; the acoustic feedback pickup by the microphone is visible on the spectrogram. The zoomed-in spectrogram of the template is shown on the right.

Performance of the system was checked by comparing the results of automatic detections of the system with the manual identification of the target song syllables carried out by off-line examination of the spectrograms. Out of 659 target syllables, 610 were correctly detected and 49 were missed. There were zero false positives. Thus, the system shows robust performance with the real-time syllable recognition task: over 92% of the target syllables were correctly identified.

This demonstrates that the system is capable of real-time detection of target syllables in the song. Note that the syllables occurring after the target syllables in **Figure 2** are frequency sweeps that overlap with the template's frequencies. The system discriminates them from the target syllables because they have a different frequency profile. Such discrimination is an advantage of the spectrogram-based detection; this would not be possible if only instantaneous frequencies were detected.

Additionally, we tested the system on the detection of syllables in the song of a zebra finch—another bird species. We used our dataset of zebra finch songs with known syllable sequences obtained in a previous study (Kozhevnikov and Fee, 2007). Zebra finch songs were played back through the speaker, and the results of the real-time detection by our DAF system were compared to the known occurrences of the target syllable.

A small subset of songs (10 songs) was used as a test set: the threshold value for the syllable detection was adjusted to optimize the percentage of correctly detected syllables in this small test set. After this, the threshold was kept fixed, and the performance of the system was tested on the whole dataset (about 100 songs). Out of 756 target syllables in the dataset, 728 were correctly detected and 28 were missed; there were 4 false positives. The system correctly detected over 96% of the target syllables in the dataset; the probability of a false positive detection was less than 1%.

#### **EFFECTS OF AUDITORY FEEDBACK ON THE TIMING OF THE BIRDSONG**

Auditory feedback has been shown to have immediate effects on some animal vocalizations. For Bengalese finches, DAF has been shown to affect the timing of song syllables. DAF played after the song syllable increases the time interval between that syllable and the next syllable in the song (Sakata and Brainard, 2006). We tested whether our feedback system is effective in causing real-time changes to the Bengalese finch song. The system was programmed to detect one of the song syllables and, once the syllable was detected, to play back the same song syllable with a probability of 0.05. This ensured that the feedback was sufficiently sparse that, in almost all cases, there was only one playback during each song. The feedback and control trials were randomly interleaved. This simplified the analysis of the syllable timing and eliminated any confounding effects from playbacks being too close to one another. The delay between the syllable sung by the bird and the syllable playback was 40 ms.

The playback causes some pickup on the input channel, which can cause difficulties in precise determination of the timing of the syllable that is occurring during the playback. Therefore, the time interval between the target syllable and the following syllable (which is partially overlapped with playback) was computed as one half of the difference between the detection time of the target syllable and the detection time of the second syllable after the target syllable. The same procedure was performed in control trials to ensure consistency in data analysis. Since the distributions of time intervals may not be Gaussian, we use a non-parametric statistical test—two-way Kolmogorov–Smirnov test—to assess the statistical significance of DAF effects on the song timing.
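The interval estimate and the comparison of the two interval distributions can be sketched as below. This is an illustrative sketch only: the function names and the toy interval values are hypothetical, and the code assumes that the test referred to in the text is the standard two-sample Kolmogorov–Smirnov test, whose statistic is the largest absolute difference between the two empirical cumulative distribution functions.

```python
def interval_estimate(t_target_ms, t_second_after_ms):
    """Half the time between the target syllable and the second syllable after it."""
    return 0.5 * (t_second_after_ms - t_target_ms)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (max distance between empirical CDFs)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical intervals (ms) for control and feedback trials.
control = [74.0, 74.5, 75.0, 75.5]
feedback = [75.0, 75.5, 76.0, 76.5]

example_interval = interval_estimate(100.0, 250.0)
d_stat = ks_statistic(control, feedback)
```

Converting the statistic into a *p*-value requires the Kolmogorov distribution; in practice a library routine such as `scipy.stats.ks_2samp` would typically be used for that step.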

**Figure 3** shows the distributions of the time intervals between the target syllable and the following syllable when the feedback is present (blue histogram) and when there is no feedback (red histogram). The widths of the distributions are due to the natural variability of the song timing. In the presence of feedback, the time intervals between the syllables become longer. Without DAF, the mean interval is 74.8 ms (*N* = 637 syllables); in the presence of DAF, the mean interval is 75.7 ms (*N* = 97 syllables). Although the change of the mean duration is small compared to the widths of the distributions, the effect is highly statistically significant (*p* = 0.001, two-way Kolmogorov–Smirnov test). The observed lengthening of the time interval between the song syllables is consistent with previous observations (Sakata and Brainard, 2006). Thus, our acoustic feedback has an immediate effect on the song: DAF immediately and reversibly affects song timing.

#### **LONG-TERM EFFECTS OF DAF ON ACOUSTIC STRUCTURE OF THE SONG**

DAF has been shown to cause long-term changes to animal vocalizations. For songbirds, prolonged repeated presentation of DAF can cause gradual change of the song (Leonardo and Konishi, 1999; Warren et al., 2011). We tested our system on the task of causing long-term changes of the frequency of one of the song syllables.

**Bengalese finch song syllables.** Shown above are the histograms of the time intervals between two subsequent syllables in the song in the presence of DAF (blue) and without DAF (red). The means are: *t*mean = 74.8 ms (control, *N* = 637 syllables) and *t*mean = 75.7 ms (feedback, *N* = 97 syllables), the difference is statistically significant (*p* = 0.001, two-way Kolmogorov–Smirnov test).

The system is programmed to detect the fundamental frequency of one of the song syllables. After the target song syllable is detected, the temporal profile of the pitch (defined as the largest peak in the FFT of the latest 256 points) was computed. The lowest value in the pitch profile in the time window between 3 and 12 ms after the detection time was taken to be the pitch of the syllable. The feedback (white noise) is conditional on the detected pitch of the song syllable. For example, the feedback can be generated if the detected syllable pitch is smaller than a threshold value. Continuous exposure to such feedback has been shown to cause the bird to gradually shift the mean pitch of the syllable so that the feedback is generated less often—the bird adapts its song to avoid hearing DAF (Warren et al., 2011).
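The pitch-conditional feedback rule can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, a plain discrete Fourier transform stands in for the FFT, and the pitch is taken to be the center frequency of the largest non-DC bin. With 256 points at 30.3 kHz the bin spacing is about 118 Hz; how a finer frequency estimate is obtained is not stated in the text and is not attempted here.

```python
import cmath
import math

SAMPLE_RATE_HZ = 30300.0  # A/D rate reported in the text
FFT_POINTS = 256          # window length given in the text


def peak_frequency(window, sample_rate=SAMPLE_RATE_HZ):
    """Frequency (Hz) of the largest non-DC peak in the DFT magnitude of `window`."""
    n = len(window)
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def syllable_pitch(pitch_profile, detection_ms, start_ms=3.0, end_ms=12.0):
    """Lowest pitch in the 3-12 ms window after detection.

    `pitch_profile` is a list of (time_ms, pitch_hz) pairs.
    """
    in_window = [p for t, p in pitch_profile
                 if start_ms <= t - detection_ms <= end_ms]
    return min(in_window) if in_window else None


def should_play_feedback(pitch_hz, threshold_hz, play_if_below=True):
    """Feedback is conditional on which side of the threshold the pitch falls."""
    if pitch_hz is None:
        return False
    if play_if_below:
        return pitch_hz < threshold_hz
    return pitch_hz > threshold_hz


# A pure tone centered on DFT bin 30 (hypothetical test signal).
tone = [math.sin(2.0 * math.pi * 30 * t / FFT_POINTS) for t in range(FFT_POINTS)]
tone_pitch = peak_frequency(tone)

# Hypothetical pitch profile: (time in ms, pitch in Hz), detection at t = 0 ms.
profile = [(0.0, 3600.0), (3.0, 3550.0), (8.0, 3520.0), (12.0, 3540.0), (15.0, 3400.0)]
pitch = syllable_pitch(profile, detection_ms=0.0)
play = should_play_feedback(pitch, threshold_hz=3530.0)
```

The `play_if_below` flag corresponds to the two feedback contingencies used in the experiment: feedback when the pitch is below the threshold, or feedback when it is above.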

We tested whether such conditional DAF could shift the mean syllable pitch in both directions. **Figure 4** shows the long-term effects of DAF which is conditional on the syllable pitch. During days 1–6, the feedback was played back if the pitch was less than 3530 Hz; during days 7–12, the feedback was played back if the pitch was greater than 3530 Hz; during days 13–18, the feedback was played back if the pitch was less than 3510 Hz.

Our feedback is effective in causing gradual changes in the syllable pitch. Playing back DAF when the frequency is lower than the threshold value causes an upward drift of the mean pitch (days 1–6 and 13–18). Presenting feedback when the pitch is higher than the threshold causes a downward drift in pitch (days 7–12). This shows that the system is suitable for studies of long-term effects of DAF on animal vocalizations.

**FIGURE 4 | Prolonged exposure of the bird to DAF causes gradual changes in the song.** The mean pitch of the target song syllable was manipulated by feedback conditional on pitch. Each datapoint is the pitch of the target syllable averaged over all the renditions of the target syllable sung by the bird on a given day. The number of renditions varies from day to day (mean = 208, min = 126, max = 288). Error bars are s.e.m. Dashed lines are the values of the threshold. Arrows indicate the direction in which the syllable frequency was expected to change in response to DAF. Larger error bars on day 12 are mostly due to the smallest number of target syllable renditions (*n* = 126) sung on that day.

### **DISCUSSION**

The described real-time acoustic feedback system is a versatile tool for studies of the effects of auditory feedback on ongoing animal vocalization. The advantage of the system is the flexibility of the processing that can be realized. The circular buffer allows real-time acquisition of arbitrary-length segments of the most recent data without affecting the timing of data acquisition. In addition, having a separate FFT buffer facilitates real-time spectral processing of acquired signals. This feature enables very quick creation of the spectrograms of long (tens or even hundreds of milliseconds) segments of signals at a high rate (the spectrogram is updated every 1 ms).

This capability is very useful for the detection of complex vocal signals. Often, it is not just a certain frequency that needs to be detected, but rather a certain spectrogram pattern, like multiple frequencies or the frequency sweeps frequently seen in birdsong syllables. Since the same frequency can occur in many syllables, it is the whole pattern of the spectrogram that allows real-time detection of the syllable. Our system is very well-suited for rapid spectrogram-based detection of acoustic elements.
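The circular-buffer acquisition mentioned above can be illustrated with a short sketch. The class and method names are hypothetical and the code is not taken from the described system; it only shows how a fixed-size buffer lets a processing routine read the most recent *n* samples at any time while new samples keep overwriting the oldest ones.

```python
class CircularBuffer:
    """Fixed-capacity buffer that always holds the most recent samples."""

    def __init__(self, capacity):
        self._data = [0.0] * capacity
        self._capacity = capacity
        self._next = 0    # index at which the next sample will be written
        self._count = 0   # number of valid samples stored so far

    def push(self, sample):
        """Store one new sample, overwriting the oldest one when full."""
        self._data[self._next] = sample
        self._next = (self._next + 1) % self._capacity
        self._count = min(self._count + 1, self._capacity)

    def latest(self, n):
        """Return the n most recent samples, oldest first."""
        if n > self._count:
            raise ValueError("not enough samples acquired yet")
        start = (self._next - n) % self._capacity
        return [self._data[(start + i) % self._capacity] for i in range(n)]


buf = CircularBuffer(capacity=4)
for sample in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    buf.push(sample)

last_three = buf.latest(3)
last_four = buf.latest(4)
```

Writing a sample costs the same regardless of how long a segment is later read, which is the property that keeps acquisition timing independent of the analysis window length.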

The performance of the system will vary depending on the type of animal vocalization and the nature of the acoustic element being detected for two main reasons. First, there is always a natural variability in the acoustic structure of a vocal element, and the degree of this variability may be different for different vocal elements; this will affect the reliability of detection. For example, a birdsong syllable can possess a more or less stereotyped spectrogram; the detection will be easier for a more stereotypical song syllable. Second, a given vocal element can be more or less similar in its acoustic structure to other vocal elements; reliable detection of a target vocal element will be easier if it is spectrally more dissimilar to other vocal elements. To achieve optimum performance, adjustments to the threshold or detection algorithm may be needed; thus, it is important to have a highly customizable system.

It is worth mentioning that, when the song syllables are detected, data processing is not a time-limiting step, and significantly more complex processing can be done without decreasing the A/D rate. We tested the system with longer templates (60 and 100 ms); they did not affect performance. We also tested the simultaneous detection of two templates, so that, every 1 ms, the system computed the correlation coefficient of the sound spectrogram with two template spectrograms, and that also did not affect the A/D rate. For a template 60 ms long, the computation of the correlation coefficient takes 16 µs; this time scales linearly with the length of the template. The computation of the FFT (to fill the column in the spectrogram buffer) takes 7–8 µs. FFT and correlation coefficient computations are the slowest signal processing steps; all other steps combined take less than 1 µs. Thus, if the spectrogram update interval is kept at 1 ms, the system should be capable of simultaneously detecting over 10 different song syllables. Therefore, fairly complex real-time analysis and detection of multiple vocal elements can be done without compromising the speed of the system.

The system is usually used with 1 output channel and 2 input channels (one input channel for acoustic input and one channel for recording the actual output of the A/D card). It is possible to increase the number of input channels. This way, one could use the data recording capability of the system to collect physiological data (e.g., EEG or neural) during acoustic feedback experiments. However, the process limiting the speed of the system appears to be reading the data from the A/D card and sending the data to the D/A. Thus, increasing the number of channels will slow the system down and decrease the A/D rate. We tested the performance of the system with 3 input channels and 1 output channel. To achieve stable operation, the A/D rate had to be decreased to 20 kHz. Despite this decrease, this is an acceptable rate for many experiments where acoustic and electrophysiological data have to be collected.

Finally, the real-time processing capabilities of the system could be used for neural feedback experiments. The spectrogram-based signal processing capability can be useful for the detection of neural oscillations. The output can be used for targeted microstimulation. The described auditory feedback system is a flexible low-cost tool for behavioral neuroscience research.

#### **ACKNOWLEDGMENTS**

The authors wish to acknowledge Ryan Wendt for his contribution to the initial stages of the acoustic system development and Jonathan Bettencourt and John White for their expert advice on RTXI capabilities and installation. This work was supported by NSF Grant 0827731 and the NSF Research Experience for Undergraduates program at the Pennsylvania State University.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Neural\_Circuits/10.3389/fncir.2012.00111/abstract

#### **REFERENCES**


avian basal ganglia – forebrain circuit to real-time modulation of song. *Nature* 433, 638–643.


interface for biological control applications," in *2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)* (Buenos Aires, Argentina), 4160–4163.


Woolley, S. M. N., and Rubel, E. W. (1997). Bengalese finches *Lonchura striata domestica* depend upon auditory feedback for the maintenance of adult song. *J. Neurosci.* 17, 6380–6390.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 July 2012; accepted: 08 December 2012; published online: 07 January 2013.*

*Citation: Skocik M and Kozhevnikov A (2013) Real-time system for studies of the effects of acoustic feedback on animal vocalizations. Front. Neural Circuits 6:111. doi: 10.3389/fncir.2012.00111*

*Copyright © 2013 Skocik and Kozhevnikov. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Learning and exploration in action-perception loops

#### *Daniel Y. Little<sup>1</sup> and Friedrich T. Sommer<sup>2</sup>\**

*<sup>1</sup> Department of Molecular and Cell Biology, Redwood Center for Theoretical Neuroscience, University of California, Berkeley, CA, USA <sup>2</sup> Redwood Center for Theoretical Neuroscience, Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Richard Hahnloser, ETH University Zurich, Switzerland; Rava Azeredo Da Silveira, Ecole Normale Supérieure, France*

#### *\*Correspondence:*

*Friedrich T. Sommer, Redwood Center for Theoretical Neuroscience, Helen Wills Neuroscience Institute, University of California, 575A Evans Hall, MC #3198, Berkeley, CA 94720-3198, USA. e-mail: fsommer@berkeley.edu*

Discovering the structure underlying observed data is a recurring problem in machine learning with important applications in neuroscience. It is also a primary function of the brain. When data can be actively collected in the context of a closed action-perception loop, behavior becomes a critical determinant of learning efficiency. Psychologists studying exploration and curiosity in humans and animals have long argued that learning itself is a *primary* motivator of behavior. However, the theoretical basis of learning-driven behavior is not well understood. Previous computational studies of behavior have largely focused on the control problem of maximizing acquisition of rewards and have treated learning the structure of data as a *secondary* objective. Here, we study exploration in the absence of external reward feedback. Instead, we take the quality of an agent's learned internal model to be the primary objective. In a simple probabilistic framework, we derive a Bayesian estimate for the amount of information about the environment an agent can expect to receive by taking an action, a measure we term the predicted information gain (PIG). We develop exploration strategies that approximately maximize PIG. One strategy based on value-iteration consistently learns faster than previously developed reward-free exploration strategies across a diverse range of environments. Psychologists believe the evolutionary advantage of learning-driven exploration lies in the generalized utility of an accurate internal model. Consistent with this hypothesis, we demonstrate that agents which learn more efficiently during exploration are later better able to accomplish a range of goal-directed tasks. We will conclude by discussing how our work elucidates the explorative behaviors of animals and humans, its relationship to other computational models of behavior, and its potential application to experimental design, such as in closed-loop neurophysiology studies.

**Keywords: knowledge acquisition, information theory, control theory, machine learning, behavioral psychology, computational neuroscience**

## **1. INTRODUCTION**

Computational models of exploratory behavior have largely focused on the role of exploration in the acquisition of external rewards (Thrun, 1992; Kaelbling et al., 1996; Sutton and Barto, 1998; Kawato and Samejima, 2007). In contrast, a consensus has emerged in behavioral psychology that learning represents the primary drive underlying explorative behaviors (Archer and Birke, 1983; Loewenstein, 1994; Silvia, 2005; Pisula, 2009). The computational principles underlying learning-driven exploration, however, have received much less attention. To address this gap, we introduce here a mathematical framework for studying how behavior affects learning and develop a novel model of learning-driven exploration.

Machine learning techniques for extracting the structure underlying sensory signals have often focused on passive learning systems that cannot directly affect the sensory input. Exploration, in contrast, requires actively pursuing useful information and can only occur in the context of a closed action-perception loop. Learning in closed action-perception loops differs from passive learning both in terms of "what" is being learned as well as "how" it is learned (Gordon et al., 2011). In particular, in closed action-perception loops:


Sensorimotor contingencies refer to the causal role actions play on the sensory inputs we receive, such as the way visual inputs change as we shift our gaze or move our head. They must be taken into account to properly attribute changes in sensory signals to their causes. This tight interaction between actions and sensation is reflected in the neuroanatomy, where sensorimotor integration has been reported at all levels of the brain (Guillery, 2005; Guillery and Sherman, 2011). We often take our implicit understanding of sensorimotor contingencies for granted, but in fact they must be learned during the course of development (the exception being contingencies for which we are hard-wired by evolution). This is eloquently expressed in the explorative behaviors of young infants (e.g., grasping and manipulating objects during proprioceptive exploration and then bringing them into visual view during intermodal exploration) (Rochat, 1989; O'Regan and Noë, 2001; Noë, 2004).

Not only are actions part of "what" we learn during exploration, they are also part of "how" we learn. To discover what is inside an unfamiliar box, a curious child must open it. To learn about the world, scientists perform experiments. Directing the acquisition of data is particularly important for embodied agents whose actuators and sensors are physically confined. Since the most informative data may not always be accessible to a physical sensor, embodiment may constrain an exploring agent and require that it coordinates its actions to retrieve useful data.

In the model we propose here, an agent moving between discrete states in a world has to learn how its actions influence its state transitions. The underlying transition dynamics is governed by a Controllable Markov Chain (CMC). Within this simple framework, various utility functions for guiding exploratory behaviors will be studied, as well as several methods for coordinating actions over time. The different exploratory strategies are compared in their rate of learning and how well they enable agents to perform goal-directed tasks.

#### **2. METHODS**

**2.1. MATHEMATICAL FRAMEWORK FOR EMBODIED ACTIVE LEARNING**

*CMCs* are a simple extension of Markov chains that incorporate a control variable for switching between different transition distributions in each state, e.g., (Gimbert, 2007). Formally, a CMC is a 3-tuple (*S*, *A*, Θ), where *S* is a finite set of *N* states, *A* is a finite set of *M* actions, and Θ is the CMC kernel, a three-dimensional array giving the probability of transitioning to each state *s*′ from each state *s* under each action *a*:

$$p(s'|a, s; \Theta) = \Theta_{ass'}$$

$$\Theta_{as\cdot} \in \Delta_{N-1} \tag{1}$$

Here, Δ<sub>*N*−1</sub> denotes the standard (*N* − 1)-simplex and is used to constrain Θ to describing legitimate probability distributions:

$$\Delta_{N-1} := \left\{ (x_0, x_1, \dots, x_{N-1}) \in \mathbb{R}^N \,\middle|\, \sum_{i=0}^{N-1} x_i = 1 \text{ and } x_i \geq 0 \;\forall i \right\}$$

CMCs provide a simple mathematical framework for modeling exploration in embodied action-perception loops. At each time step, an exploring agent is allowed to select any action *a* ∈ *A*. This action, along with the agent's current state, then determines which transition distribution its next state is drawn from. For this study, we will make the simplifying assumption that the states can be directly observed by the agent, i.e., the system is not hidden. Since we are interested in the role behavior plays in learning about the world, we consider the exploration task of the agent to be the formation of an accurate estimate, or *internal model* Θ̂, of the true CMC kernel Θ that describes its *world*.

This framework captures the two important roles actions play in embodied learning. First, transitions depend on actions, and actions are thus a constituent part of "what" is being learned. Second, an agent's immediate ability to interact with and observe the world is limited by its current state. This restriction models the *embodiment* of the agent, and actions are "how" an agent can overcome this constraint on accessing information. Our primary question will be how action policies can optimize the speed and efficiency of learning in embodied action-perception loops as modeled by CMCs.
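The action-perception loop of a CMC can be made concrete with a small simulation sketch. The names `sample_next_state` and `explore`, the nested-list layout `theta[a][s][s_next]`, and the two-state toy world are assumptions chosen for illustration; they are not part of the original formulation.

```python
import random


def sample_next_state(theta, action, state, rng):
    """Draw the next state from the transition distribution theta[action][state]."""
    probs = theta[action][state]
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def explore(theta, start_state, policy, steps, rng):
    """Run the loop: choose an action, observe the transition, move on.

    Returns the list of observed (action, state, next_state) triples.
    """
    state = start_state
    history = []
    for _ in range(steps):
        action = policy(state, rng)
        next_state = sample_next_state(theta, action, state, rng)
        history.append((action, state, next_state))
        state = next_state
    return history


# Toy deterministic world with N = 2 states and M = 2 actions:
# action 0 keeps the agent in place, action 1 switches the state.
toy_theta = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [1.0, 0.0]],  # action 1
]


def always_switch(state, rng):
    """Hypothetical policy that always selects action 1."""
    return 1


def random_policy(state, rng):
    """Hypothetical policy that selects actions uniformly at random."""
    return rng.randrange(len(toy_theta))


rng = random.Random(0)
switch_history = explore(toy_theta, 0, always_switch, 3, rng)
random_history = explore(toy_theta, 0, random_policy, 5, rng)
```

The agent only ever observes transitions out of the state it currently occupies, which is the embodiment constraint described above: which data can be collected next depends on where earlier actions have taken the agent.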

#### **2.2. INFORMATION-THEORETIC ASSESSMENT OF LEARNING**

Following Pfaffelhuber (1972), we define *missing information* *I*<sub>M</sub> as a measure of the inaccuracy of an agent's internal model. To compute *I*<sub>M</sub>, we first calculate the Kullback–Leibler (KL) divergence of the internal model from the world for each transition distribution:

$$D_{\mathrm{KL}}(\Theta_{as\cdot} \,\|\, \widehat{\Theta}_{as\cdot}) := \sum_{s'=1}^{N} \Theta_{ass'} \log_2 \left( \frac{\Theta_{ass'}}{\widehat{\Theta}_{ass'}} \right) \tag{2}$$

The KL-divergence is an information-theoretic measure of the difference between two distributions. Specifically, Equation (2) gives the expected number of bits that would be lost if observations (following the true distribution) were communicated using an encoding scheme optimized for the estimated distribution (Cover and Thomas, 1991). It is large when the two distributions differ greatly and zero when they are identical. Next, missing information is defined as the unweighted sum of the KL-divergences:

$$I_{\mathrm{M}}(\Theta \,\|\, \widehat{\Theta}) := \sum_{s \in \mathcal{S},\, a \in \mathcal{A}} D_{\mathrm{KL}}(\Theta_{as\cdot} \,\|\, \widehat{\Theta}_{as\cdot}) \tag{3}$$

We will use missing information to assess learning under different explorative strategies. Steeper decreases in missing information over time represent faster learning and thus more efficient exploration. The definition of missing information and those of several other relevant terms that will be introduced later in this manuscript have been compiled into **Table 1** for easy reference.
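Equations (2) and (3) translate directly into code. The sketch below is illustrative: the function names and the nested-list layout `theta[a][s][s_next]` are assumptions, and terms with zero true probability are taken to contribute nothing to the sum, following the usual convention for the KL-divergence.

```python
import math


def kl_divergence_bits(p, q):
    """KL-divergence (in bits) of the estimate q from the true distribution p."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            if q_i == 0.0:
                return math.inf  # the estimate rules out a possible transition
            total += p_i * math.log2(p_i / q_i)
    return total


def missing_information(theta, theta_hat):
    """Unweighted sum of KL-divergences over all (action, state) pairs."""
    return sum(
        kl_divergence_bits(theta[a][s], theta_hat[a][s])
        for a in range(len(theta))
        for s in range(len(theta[a]))
    )


# Toy world with M = 1 action and N = 2 states (hypothetical values).
world = [[[1.0, 0.0], [0.5, 0.5]]]
uninformed_model = [[[0.5, 0.5], [0.5, 0.5]]]

i_m = missing_information(world, uninformed_model)
i_m_perfect = missing_information(world, world)
```

In this toy case the uninformed model is exactly right about the second state and one bit short on the first, so the missing information is one bit; a perfect model has zero missing information.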

#### **2.3. BAYESIAN INFERENCE LEARNING**

As an agent acts in its world, it observes the state transitions and can use these observations to update its internal model Θ̂. Taking a Bayesian approach, we assume the agent models its world Θ as a random variable **Θ** with an initial *prior distribution f* over the space of possible CMC structures, *W* = Δ<sup>*NM*</sup><sub>*N*−1</sub>. There is no standard nomenclature for tensor random variables and we will therefore use a bold upright theta **Θ** to denote the random variable and a regular upright theta Θ to denote an arbitrary realization of this random variable. Thus, *f*(Θ) describes the exploring agent's initial belief that Θ accurately describes its world, i.e., that **Θ** = Θ. By Bayes' theorem, an agent can calculate a posterior belief on the structure of its world from its prior and any data it has collected, d:

$$f(\Theta|\vec{\mathbf{d}}) = \frac{p(\vec{\mathbf{d}}|\Theta)\, f(\Theta)}{p(\vec{\mathbf{d}})} \tag{4}$$

Bayes' theorem decomposes the posterior distribution of the CMC kernel into the likelihood function of the data, *p*(d|Θ), and the prior, *f*(Θ). The normalization factor is calculated by integrating the numerator over *W*:

$$p(\vec{\mathbf{d}}) = \int_{\mathcal{W}} p(\vec{\mathbf{d}}|\Theta)\, f(\Theta)\, d\Theta$$

#### **Table 1 | Table of measures.**

We now formulate a Bayesian estimate by directly calculating the posterior belief in transitioning to state *s*′ from state *s* under action *a*:

$$\begin{split} \widehat{\Theta}_{ass'} := p(s'|a, s, \vec{\mathbf{d}}) &= \int_{\mathcal{W}} p(s', \Theta|a, s, \vec{\mathbf{d}})\, d\Theta \\ &= \int_{\mathcal{W}} p(s'|a, s; \Theta)\, f(\Theta|\vec{\mathbf{d}})\, d\Theta \\ &= \int_{\mathcal{W}} \Theta_{ass'}\, f(\Theta|\vec{\mathbf{d}})\, d\Theta = \mathrm{E}_{\Theta|\vec{\mathbf{d}}} [\Theta_{ass'}] \end{split} \tag{5}$$

For discrete priors the above integrals would be replaced with summations. Equation (5) demonstrates that the Bayesian estimate is simply the expectation of the random variable given the data. While other estimates are possible for inferring world structure, such as Maximum Likelihood, the Bayesian estimate is often employed to avoid over-fitting (Manning et al., 2008). Moreover, as the following theorem demonstrates, the Bayesian estimate is optimal under our minimum missing information objective function.

**Theorem 1.** *Consider a CMC random variable* **Θ** *modeling the ground truth environment and drawn from a prior distribution f. Given a history of observations* d*, the expected missing information between* **Θ** *and an agent's internal model* Φ *is minimized by the Bayesian estimate* Φ = Θ̂*. That is:*

$$\widehat{\Theta} := \mathrm{E}_{\Theta|\vec{\mathbf{d}}}[\Theta] = \operatorname*{arg\,min}_{\Phi} \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ I_{\mathrm{M}}(\Theta \,\|\, \Phi) \right]$$

*Proof.* See Appendix A1.

The exact analytical form for the Bayesian estimate will depend on the prior distribution. We emphasize that the utility of the Bayesian estimate rests on the accuracy of its prior. In the discussion, we will address issues deriving from uncertain or inaccurate prior beliefs, but for now will provide the agents with priors that match the generative process by which we create new worlds for the agents to explore.
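As a concrete illustration of Equation (5), consider a single transition distribution with a symmetric Dirichlet prior. This prior is an assumption made here for illustration (the analytical forms actually used are given in Appendix A2 and are not reproduced in this section). With concentration α = 1 the prior is uniform over the simplex, and the posterior mean after observing transition counts *n* has the closed form (*n*<sub>*s*′</sub> + α) / (Σ*n* + *N*α). The function names below are hypothetical.

```python
def bayesian_estimate(counts, alpha=1.0):
    """Posterior-mean estimate of one transition distribution.

    `counts[s_next]` is the number of observed transitions to s_next from a
    fixed (action, state) pair; `alpha` is the symmetric Dirichlet
    concentration (alpha = 1 gives a uniform prior over the simplex).
    """
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def maximum_likelihood_estimate(counts):
    """Relative frequencies; undefined before any observation is made."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations yet")
    return [c / total for c in counts]


no_data = bayesian_estimate([0, 0, 0, 0])
after_four = bayesian_estimate([3, 1])
ml_after_four = maximum_likelihood_estimate([3, 0, 1])
```

Before any data are collected, the Bayesian estimate equals the prior mean (here uniform). Unlike the maximum likelihood estimate, it never assigns zero probability to an unobserved transition, so the missing information of Equation (3) stays finite.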


#### **2.4. THREE TEST ENVIRONMENTS FOR STUDYING EXPLORATION**

In the course of exploration, the data an agent accumulates will depend on both its behavioral strategy and the structure of its world. We reasoned that studying diverse environments, i.e., CMCs that differ greatly in structure, would allow us to investigate how world structure affects the relative performance of different exploratory strategies and to identify action policies that produce efficient learning under broad conditions. We thus constructed and considered three classes of CMCs: Dense Worlds, Mazes, and 1-2-3 Worlds. Dense Worlds are randomly generated from a uniform distribution over all CMCs with *N* = 10 states and *M* = 4 actions (see **Figure A1** in Appendix). They therefore represent very unstructured worlds. Mazes, in contrast, are highly structured and model moving between rooms of a 6-by-6 maze (see **Figure 1**). The state space in Mazes consists of the *N* = 36 rooms. The *M* = 4 actions correspond to noisy translations in the four cardinal directions. To make the task of learning in Mazes harder, 30 transporters are randomly distributed amongst the walls which lead to a randomly chosen absorbing state (concentric rings in **Figure 1**). While perhaps not typically abundant in mazes, absorbing states, such as at the bottom of a gravity well, are common in real-world dynamics. Finally, 1-2-3 Worlds differ greatly from both Dense Worlds and Mazes in that their transitions are drawn from a discrete distribution rather than a continuous one (see **Figure A2** in Appendix). Since our work is heavily rooted in the Bayesian approach, considering worlds with different priors was an important step toward understanding how an exploration strategy depends on these priors. 1-2-3 Worlds consist of *N* = 10 states and *M* = 3 actions. In a given state, action *a* = 1 moves the agent deterministically to a single target state, *a* = 2 moves the agent with probability 0.5 to one of two target states, and *a* = 3 moves the agent with probability 0.333 to one of three potential target states. The Appendix contains detailed information on the generative distributions used to create examples from each class of environments and also provides the analytical form for the Bayesian estimate in each world (see Appendix A2).
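To make the 1-2-3 World construction concrete, the following is a minimal sketch of how one such transition kernel could be sampled. It assumes that target states are drawn uniformly without replacement, which is an illustrative simplification and not the generative distribution specified in the Appendix. The function name `sample_123_world` and the array layout `theta[a, s, s']` are hypothetical choices made for this example.

```python
import numpy as np

def sample_123_world(n_states=10, rng=None):
    """Sample a 1-2-3 World kernel theta[a, s, s'] (illustrative sketch).

    Assumption: targets are chosen uniformly without replacement; the
    paper's Appendix defines the actual generative distribution.
    """
    rng = np.random.default_rng(rng)
    theta = np.zeros((3, n_states, n_states))
    for a in range(3):                      # index a = 0, 1, 2 stands for actions 1, 2, 3
        for s in range(n_states):
            n_targets = a + 1               # 1, 2, or 3 equiprobable target states
            targets = rng.choice(n_states, size=n_targets, replace=False)
            theta[a, s, targets] = 1.0 / n_targets
    return theta
```

Each row `theta[a, s, :]` is then a probability distribution with exactly one, two, or three nonzero entries, depending on the action.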

## **3. RESULTS**

#### **3.1. ASSESSING THE INFORMATION-THEORETIC VALUE OF PLANNED ACTIONS**

The central question to be addressed in this manuscript is how behavior affects the learning process in embodied action-perception loops. The fast reduction of missing information is taken to be the agent's objective during learning-driven exploration (Equation 3). As discussed in section 2.3, the Bayesian estimate minimizes the expected missing information and thus solves the inference problem. The control problem of choosing actions to learn quickly nevertheless remains to be solved. We now show that Bayesian inference can also be used to predict how much missing information will be removed by an action. We call the decrease in missing information between two internal models the *information gain* (IG). Letting $\widehat{\Theta}$ be a current model derived from data $\mathbf{d}$ and $\widehat{\Theta}^{a,s \to s^*}$ be an updated model after observing a transition from $s$ to $s^*$ under action $a$, the information gain for this observation is:

absorbing state (blue target). The shading of an arrow indicates the probability of the transition (darker color represents higher probability).

$$\begin{split} \mathcal{I}_{G}(a,s,s^*) &:= \mathcal{I}_{M}(\Theta \,\|\, \widehat{\Theta}) - \mathcal{I}_{M}(\Theta \,\|\, \widehat{\Theta}^{a,s \to s^*}) \\ &= \sum_{s'} \Theta_{ass'} \log_2 \frac{\widehat{\Theta}^{a,s \to s^*}_{ass'}}{\widehat{\Theta}_{ass'}} \end{split} \tag{6}$$

An exploring agent cannot compute IG directly because it depends on the true CMC kernel $\Theta$. It also cannot know the outcome $s^*$ of an action until it has taken it. We therefore again take the Bayesian approach introduced in section 2.3 and consider the agent to treat $\Theta$ and $s^*$ as random variables. Then, by calculating the expected value of IG, we show in Theorem 2 that an agent can compute an estimate of information gain from its prior belief on $\Theta$ and the data it has collected. We term this estimate the *predicted information gain* (PIG).

**Theorem 2.** *If an agent is in state s and has previously collected data* d*, then the expected information gain for taking action a is given by:*

$$\begin{split} \text{PIG}(a,s) &:= E_{s^*, \Theta \mid \mathbf{d}} \left[ \mathcal{I}_{G}(a,s,s^*) \right] \\ &= \sum_{s^*} \widehat{\Theta}_{ass^*} \, D_{\text{KL}}\!\left(\widehat{\Theta}^{a,s \to s^*}_{as\cdot} \,\|\, \widehat{\Theta}_{as\cdot}\right) \end{split} \tag{7}$$

*Proof.* See Appendix A3.

PIG has an intuitive interpretation. In a sense, the agent imagines the possible outcomes $s^*$ of taking action $a$ in state $s$. It then determines how each of these results would hypothetically change its internal model to $\widehat{\Theta}^{a,s \to s^*}$. It compares these new hypothetical models to its current model by computing the KL-divergence between them. The larger this difference, the more information the agent would likely gain if it indeed transitioned to state $s^*$. Finally, it averages these hypothetical gains according to the likelihood of observing $s^*$ under its current model.
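The imagine-update-compare-average procedure just described can be sketched in a few lines. The sketch below assumes a symmetric Dirichlet prior with concentration `alpha`, for which the Bayesian estimate is the posterior mean of the transition counts. That choice, and the function names `bayes_estimate`, `kl_bits`, and `predicted_information_gain`, are assumptions made for illustration, and the analytical forms used in the paper are given in its Appendix.

```python
import numpy as np

def bayes_estimate(counts, alpha=1.0):
    """Posterior-mean estimate under an assumed symmetric Dirichlet prior."""
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)

def kl_bits(p, q):
    """KL divergence D_KL(p || q) in bits, skipping zero-probability terms of p."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def predicted_information_gain(counts, alpha=1.0):
    """PIG for one (action, state) pair, given its observed transition counts."""
    current = bayes_estimate(counts, alpha)
    pig = 0.0
    for s_star, prob in enumerate(current):
        hypothetical = np.array(counts, dtype=float)
        hypothetical[s_star] += 1.0                  # imagine observing s -> s_star
        updated = bayes_estimate(hypothetical, alpha)
        pig += prob * kl_bits(updated, current)      # weight by current belief
    return pig
```

Under these assumptions, `predicted_information_gain` shrinks as counts accumulate, which mirrors the diminishing returns on successive observations discussed in the text.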

For each class of environments, **Figure 2** compares the average PIG with the average realized information gain as successive observations are used to update a Bayesian estimate. In accordance with Theorem 2, in all three environments PIG accurately predicts the average information gain. Thus, theoretically and empirically, PIG represents an accurate estimate of the improvement an agent can expect in its internal model if it takes a planned action in a particular state.

Interestingly, the expression on the RHS of Equation (7) has been previously studied in the field of Psychology where it was introduced *ad hoc* to describe human behavior during hypothesis testing (Klayman and Ha, 1987; Oaksford and Chater, 1994; Nelson, 2005). To our knowledge, its equality to the predicted gain in information (Theorem 2) is novel. In a later section, we will compare PIG to other measures proposed in the field of Psychology.

#### **3.2. CONTROL LEARNERS: UNEMBODIED AND RANDOM ACTION**

Before introducing and assessing the performance of different explorative strategies, we first develop positive and negative controls. A naive strategy would be to select actions uniformly at random. Such random policies are often employed to encourage exploration in reinforcement learning models. We will use a *random action* strategy as a negative control exhibiting the baseline learning rate of an undirected explorer.

An *unembodied* agent that achieves an upper bound on expected performance serves as a positive control. Unlike an embodied agent, the unembodied control is allowed, at every time step, to relocate itself to any state it wishes. For such an agent, optimization of learning decomposes into an independent sampling problem (Pfaffelhuber, 1972). Since the PIG for each transition distribution decreases monotonically over successive observations (**Figure 2**), learning by an unembodied agent can be optimized by always sampling from the state and action pair with the highest PIG. Thus, learning can be optimized in a greedy fashion:

$$(a,s)\_{\text{Unemb.}} := \underset{(a,s)}{\text{arg}\,\text{max}} \,\text{PIG}(a,s) \tag{8}$$

Comparing the learning performances of the random action and unembodied control (red and black curves, respectively, in **Figure 3**), we find a notable difference among the three classes of environments. The performance margin between these two controls is significant in Mazes and 1-2-3 Worlds (*p* < 0.001), but not in Dense Worlds (*p* > 0.01). Despite using a naive strategy, the random actor is essentially reaching maximum performance in Dense Worlds, suggesting that exploration of this environment is fairly easy. In contrast, in Mazes and 1-2-3 Worlds, a directed exploration strategy may be necessary to reach learning speeds closer to the unembodied upper bound.

#### **3.3. EXPLORATION STRATEGIES BASED ON PIG**

PIG represents a utility function that can be used to guide exploration. Since greedy maximization of PIG is optimal for the unembodied agent, one might expect a similar strategy to be promising for an embodied agent. Unlike the unembodied control, however, the embodied agent [PIG(greedy)] would only be able to select its action, not its state:

$$a\_{\text{PIG}(\text{greedy})} := \underset{a}{\text{arg}\,\text{max}} \,\text{PIG}(a, s) \tag{9}$$
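The difference between Equations (8) and (9) is only the set over which the maximum is taken. A minimal sketch, assuming the PIG values are stored in a hypothetical array `pig[a, s]`, makes this explicit. The function names are illustrative, and ties are broken by the first maximum, which is an arbitrary choice.

```python
import numpy as np

def unembodied_choice(pig):
    """Equation (8): the (action, state) pair with the highest PIG."""
    a, s = np.unravel_index(np.argmax(pig), pig.shape)
    return int(a), int(s)

def greedy_embodied_choice(pig, state):
    """Equation (9): the best action available in the agent's current state."""
    return int(np.argmax(pig[:, state]))
```

The unembodied control searches the whole table, whereas the embodied agent is restricted to the column of its current state.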

The performance comparison between PIG(greedy) (Equation 9) and the positive control (Equation 8) is of particular interest because they differ only in that one is embodied while the other is not. As shown in **Figure 4**, the performance difference is largest in Mazes, moderate though significant in 1-2-3 Worlds, and smallest in Dense Worlds (*p* < 0.001 for Mazes and 1-2-3 Worlds, *p* > 0.001 for Dense Worlds). To quantify the embodiment constraint faced in a world, we define an *embodiment index* as the relative difference between the areas under the learning curves for PIG(greedy) and the unembodied control. The average embodiment indices for Dense Worlds, Mazes, and 1-2-3 Worlds are 0.02, 2.59, and 1.27, respectively.

**FIGURE 3 | Learning curves for control strategies.** The average missing information is plotted over exploration time for the unembodied positive control and random action baseline control. Standard errors are plotted as dotted lines above and below learning curves (*n* = 200).

We also find that, whereas PIG(greedy) yielded no improvement over random action in Dense Worlds and Mazes (*p* > 0.001), it significantly improved learning in 1-2-3 Worlds (*p* < 0.001), suggesting that this utility function was most beneficial in 1-2-3 Worlds.

Greedy maximization of PIG only accounts for the immediately available information gains and fails to account for the effect an action can have on future utility. In particular, when the potential for information gain is concentrated at remote states in the environment, it may be necessary to coordinate actions over time. Forward estimation of total future PIG is intractable. We therefore employ a back-propagation approach previously developed in the field of economics called *value-iteration* (VI) (Bellman, 1957). The estimation starts at a distant time point (initialized as *τ* = 0) in the future with initial values equal to the PIG for each state-action pair:

$$Q\_0(a,s) := \text{PIG}(a,s).$$

Then propagating backwards in time, we maintain a running total of estimated future value:

$$Q_{\tau-1}(a, s) := \text{PIG}(a, s) + \gamma \sum_{s' \in \mathcal{S}} \widehat{\Theta}_{ass'} \cdot V_{\tau}(s') \tag{10}$$

where $V_{\tau}(s) := \max_{a} Q_{\tau}(a, s)$.

Here, γ is a discount factor, set to 0.95. Such discount factors are commonly employed in value-iteration algorithms to favor more immediate gains over gains further in the future (Bellman, 1957). As discussed later, discounting may also help, in part, to account for the decreasing return on information from successive observations (see **Figure 2**).

Ideally, the true transition dynamics $\Theta$ would be used in Equation (10), but since the agent must learn these dynamics, it employs its internal model $\widehat{\Theta}$ instead. Applying the VI algorithm to PIG, we construct a behavioral policy PIG(VI) that coordinates actions over several time steps toward the approximate maximization of expected information gain:

$$a_{\text{PIG(VI)}} := \underset{a}{\arg\max}\; Q_{-10}(a, s)$$
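The backward recursion above can be sketched as follows, assuming the internal model is stored as a hypothetical array `theta_hat[a, s, s']` and the current PIG values as `pig[a, s]`. The function names are illustrative. As in the text, PIG is treated as fixed during planning even though it would change as new data arrive, so the result is only an approximation of future gains.

```python
import numpy as np

def pig_value_iteration(theta_hat, pig, gamma=0.95, horizon=10):
    """Run `horizon` backward steps of Equation (10), starting from Q_0 = PIG."""
    q = pig.copy()
    for _ in range(horizon):
        v = q.max(axis=0)                      # V_tau(s) = max_a Q_tau(a, s)
        q = pig + gamma * (theta_hat @ v)      # sum over s' of theta_hat[a, s, s'] * V(s')
    return q

def pig_vi_action(theta_hat, pig, state, gamma=0.95, horizon=10):
    """Action maximizing the planned value in the agent's current state."""
    q = pig_value_iteration(theta_hat, pig, gamma, horizon)
    return int(np.argmax(q[:, state]))
```

With `horizon=0` the policy reduces to PIG(greedy). With a longer horizon, the agent will move toward a remote state whose PIG is high even when every action in its current state has zero immediate PIG.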

As shown in **Figure 4**, the use of VI to coordinate actions yielded the greatest gains in Mazes, with moderate gains also seen in 1-2-3 Worlds. Along with the embodiment indices introduced above, these results support the hypothesis that worlds with high embodiment constraints require agents to coordinate their actions over several time steps to achieve efficient exploration.

Bellman showed that VI accurately estimates future gains when the true transition dynamics $\Theta$ are known and when the utility function is stationary (Bellman, 1957). Neither condition holds in our case, and PIG(VI) is therefore only an approximation of future gains. Nevertheless, as we will show, its utility is validated by its superior performance when compared to other previously introduced exploration strategies.

While a learning agent cannot use the true dynamics for VI, we can ascertain how much this impairs its exploration by considering a second positive control, PIG(VI+), which is given the true dynamics for coordinating its actions. That is, this control uses $\Theta$ instead of $\widehat{\Theta}$ in Equation (10) above. The performance of PIG(VI+) only differs from PIG(VI) in Mazes, and this difference is relatively small compared to the gains made over the random or greedy behaviors (**Figure 4**). Altogether, these results suggest that PIG(VI) may be an effective strategy employable by embodied agents for coordinating explorative actions toward learning.

#### **3.4. STRUCTURAL FEATURES OF THE THREE WORLDS**

In the course of exploration, the data an agent accumulates will depend on both its behavioral strategy and the dynamical structure of its world. To elucidate this interaction, we next consider how structural differences in the three classes of environments correlate with an agent's ability to explore. In particular, we consider three structural features of the worlds: their tendency to draw agents into a biased distribution over states, the amount of control a single action provides an agent over its future states, and the average distance between any two states.

#### *3.4.1. State bias*

To assess how strongly a world biases the state distribution of its agents, we quantify the unevenness of the equilibrium distribution $\Psi$ under a random action policy. The equilibrium distribution quantifies the likelihood that an agent will be in a particular state at a distant time-point in the future. To quantify the bias of this distribution, we define a *structure index* (SI) as the relative difference between its entropy $H(\Psi)$ and the entropy of the uniform distribution $H(U)$:

$$SI(\Psi) := \frac{H(U) - H(\Psi)}{H(U)}$$

where:

$$H(p) := -\sum_{s \in \mathcal{S}} p(s) \log_2 p(s).$$
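As an illustration of the structure index, the sketch below approximates the equilibrium distribution by propagating a uniform start distribution for many steps under the random action policy. The uniform start, the number of steps, and the function name `structure_index` are assumptions of this example, and strictly periodic chains may not converge under this simple scheme.

```python
import numpy as np

def structure_index(theta, n_steps=10_000):
    """SI of a kernel theta[a, s, s'] under a uniformly random action policy."""
    n_states = theta.shape[1]
    random_policy = theta.mean(axis=0)              # row-stochastic (N, N) matrix
    start = np.full(n_states, 1.0 / n_states)       # assumed uniform start
    psi = start @ np.linalg.matrix_power(random_policy, n_steps)
    h_uniform = np.log2(n_states)
    nonzero = psi > 0                               # treat 0 * log2(0) as 0
    h_psi = -np.sum(psi[nonzero] * np.log2(psi[nonzero]))
    return float((h_uniform - h_psi) / h_uniform)
```

An index near 0 indicates that random action visits all states about equally often, while an index near 1 indicates that the agent is drawn into very few states.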

In **Figure 5A**, the structure indices for 200 worlds in each class of environment are plotted against the embodiment index (defined in section 3.3). As depicted, the embodiment index correlates strongly with the structure index suggesting that state bias represents a significant challenge embodied agents face during exploration.

#### *3.4.2. Controllability*

To measure the capacity for an agent to control its state trajectory, we computed a control index as the mutual information between a random action $a_0$ and an agent's state $s_t$, *t* time steps in the future, averaged uniformly over possible starting states $s_0$:

$$\begin{aligned} CI(t) &= \sum_{s_0 \in \mathcal{S}} \frac{1}{N} \text{MI}[A_0, S_t \mid s_0] \\ &= \sum_{s_0 \in \mathcal{S}} \frac{1}{N} \left( \sum_{a_0 \in \mathcal{A},\, s_t \in \mathcal{S}} p(a_0, s_t \mid s_0) \log_2 \left( \frac{p(s_t \mid a_0, s_0)}{p(s_t \mid s_0)} \right) \right) \end{aligned}$$

As shown in **Figure 5B**, an action in Mazes or 1-2-3 Worlds has significantly more impact on future states than an action in Dense Worlds. Controllability is required for effective coordination of actions, such as under PIG(VI). In Mazes, where actions can significantly affect states far into the future, agents yielded the largest gains from coordinated actions. 1-2-3 Worlds also revealed high controllability, but only over the more immediate future. Interestingly, 1-2-3 Worlds also showed moderate gains from coordinating actions.
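The control index defined in this subsection can be computed directly from a transition kernel. The sketch below assumes that the first action is uniformly random and that all subsequent actions are also chosen uniformly at random, which is our reading of the definition and not a detail confirmed by the text. The function name `control_index` and the array layout `theta[a, s, s']` are hypothetical.

```python
import numpy as np

def control_index(theta, t):
    """CI(t) in bits for a kernel theta[a, s, s'], with t >= 1."""
    n_actions = theta.shape[0]
    random_policy = theta.mean(axis=0)                       # (N, N)
    later = np.linalg.matrix_power(random_policy, t - 1)     # steps after a_0
    cond = theta @ later                                     # p(s_t | a_0, s_0)
    marg = cond.mean(axis=0)                                 # p(s_t | s_0)
    # Ratio is only needed where cond > 0; elsewhere the term contributes 0.
    ratio = np.divide(cond, marg, out=np.ones_like(cond), where=cond > 0)
    mi_per_start = (cond * np.log2(ratio)).sum(axis=(0, 2)) / n_actions
    return float(mi_per_start.mean())                        # average over s_0
```

For example, in a two-state world where each action deterministically selects the next state, the first action carries one full bit about the next state but no information about the state two steps ahead.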

#### *3.4.3. Mean path length*

**FIGURE 5 | Quantifying the structure of the worlds. (A)** The embodiment index, defined in section 3.3, is plotted against the structure index for each of 200 Dense Worlds, Mazes, and 1-2-3 Worlds. **(B)** For the same CMCs, the average controllability is plotted as a function of the number of time steps the state lies in the future. The error bars depict standard deviations. **(C)** Again for the same CMCs, the learning performance gap between PIG(VI) and PIG(VI+) is plotted against the mean path length between any two states.

To assess the size of each CMC, we calculated the average minimum expected path length between every two states. To do this, we first determined the action policy that would minimize the expected path length to any target state. We then calculated the expected number of time steps it would take an agent to navigate to that target state while employing this optimal policy. The average value of this expected path length taken across start and target states was used as a measure of the extent of the CMC (see Appendix A4 for details). We had previously found that the three classes of CMCs differed in the relative performance between the PIG(VI) explorer and the PIG(VI+) control. Since these two strategies differ only in that the former uses the agent's internal model to coordinate its actions while the latter is allowed to use the true world dynamics, we wondered if the performance gap between the two (the area between their two learning curves) could be related to the path length to a potential source of information. Indeed, comparing this performance gap to the mean path length for each world, we found a strong correlation, as shown in **Figure 5C**. This suggests that coordination of actions may be more dependent on internal model accuracy for spatially extended worlds. Finally, it is interesting to note that the mean path length in Mazes is typically larger than 10 time steps, the planning horizon used in value-iteration. Ten was chosen simply as a round number, and it may be surprising that it works as well as it does in such spatially extended worlds. We believe two factors may contribute to this. First, it is likely that states of high informational value will be close together. Coordinating actions toward a nearby state of high value will therefore likely bring the agent closer to other states of potentially higher value. Second, and we suspect more importantly, since the mean path length is an average, a VI planner can direct its action toward a high information state under the possibility that it might reach that state within 10 time steps even if the expected path length to that location is significantly longer.
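One way to obtain the minimum expected path length to a given target is a fixed-point iteration on expected hitting times. This is offered only as a sketch of the idea, since the procedure actually used is described in Appendix A4. The function name `min_expected_path_length` and the iteration count are assumptions, and the estimate keeps growing with the number of iterations for start states from which the target cannot be reached.

```python
import numpy as np

def min_expected_path_length(theta, target, n_iter=1000):
    """Expected steps to reach `target` from each state under the best policy.

    theta[a, s, s'] is the transition kernel. Iterates
    h(s) = 1 + min_a sum_{s'} theta[a, s, s'] * h(s'), with h(target) = 0.
    """
    n_states = theta.shape[1]
    h = np.zeros(n_states)
    for _ in range(n_iter):
        h = 1.0 + (theta @ h).min(axis=0)   # best action in every state
        h[target] = 0.0                     # already at the target
    return h
```

Averaging the resulting values across start and target states would then give a mean path length in the sense described above.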

#### **3.5. COMPARISON TO PREVIOUS EXPLORATIVE STRATEGIES**

Models of exploration have been previously developed in the field of reinforcement learning (RL). Usually, these models focus on the role of exploration in reward acquisition rather than its direct role in learning world structure. Still, several of the principles developed in the RL field can be implemented in our framework. In this section, we compare these various methods to PIG(VI) under our learning objective. Since no rewards are available, we consider only RL strategies that can be implemented without rewards. Random action is perhaps the most common exploration strategy in RL. As we have already seen, random action is only efficient for exploring Dense Worlds. The following directed exploration strategies have also been developed in the RL literature (learning curves are plotted in **Figure 6**):

*Least Taken Action (LTA):* Under LTA, an agent will always choose the action that it has performed least often in the current state (Sato et al., 1988; Barto and Singh, 1990; Si et al., 2007). Like random action, LTA yields uniform sampling of actions in each state. Across worlds, LTA fails to significantly improve on the learning rates seen under random action (*p* > 0.001 for all three environments).

*Counter-Based Exploration (CB):* Whereas LTA actively samples actions uniformly, CB attempts to induce a uniform sampling across states. To do this, it maintains a count of the occurrences of each state, and chooses its action to minimize the expected count of the resultant state (Thrun, 1992). CB performs even worse than random action in Dense Worlds and 1-2-3 Worlds (*p* < 0.001). It does outperform random action in Mazes but falls far short of the performance seen by PIG(VI) (*p* < 0.001).

*Q-learning on Surprise [PEIG(Q)]:* Storck et al. (1995) developed Surprise as a measure to quantify past changes in an agent's internal model which they used to guide exploration under a Q-learning algorithm (Sutton and Barto, 1998). Interestingly, it can be shown that Surprise as employed by Storck et al. is equivalent to the posterior expected information gain (PEIG), a posterior analog to our PIG utility function (see Appendix A5 and **Table 1**). Q-learning is a model-free approach to maximizing long-term gains of a utility function (Sutton and Barto, 1998). Implementing this strategy, we found that like CB, PEIG(Q) generally performed worse than random action.
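The two count-based heuristics above, LTA and CB, reduce to simple lookups. The sketch below assumes hypothetical arrays `action_counts[a, s]` (how often each action was taken in each state), `state_counts[s]` (how often each state was visited), and an internal model `theta_hat[a, s, s']`. Ties are broken by the first minimum, which is an arbitrary choice not specified in the text.

```python
import numpy as np

def least_taken_action(action_counts, state):
    """LTA: the action performed least often in the current state."""
    return int(np.argmin(action_counts[:, state]))

def counter_based_action(theta_hat, state_counts, state):
    """CB: the action whose expected next-state visit count is lowest."""
    expected_counts = theta_hat[:, state, :] @ state_counts
    return int(np.argmin(expected_counts))
```

Neither heuristic looks at how much the internal model would change, which is the quantity that PIG estimates.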

(Thrun, 1992), and Q-Learning on posterior expected information gain [PEIG(Q)] (Storck et al., 1995). The standard control strategies are also shown. Standard errors are plotted as dotted lines above and below learning curves (*n* = 200).

The results in **Figure 6** show that PIG(VI) outperforms the previous explorative strategies at learning in structured worlds. We note that all of these strategies were originally developed to encourage exploration for the sake of improving reward acquisition, and their poor performance at our learning objective does not conflict with their previously demonstrated utility under the reinforcement learning framework.

#### **3.6. COMPARISON TO UTILITY FUNCTIONS FROM PSYCHOLOGY**

Independent findings in Psychology have suggested that the maximization of PIG can be used to predict human behavior during hypothesis testing (Oaksford and Chater, 1994). Inspired by these results, we investigated two other measures also developed in this context. Like PIG, both are measures of the difference between the current and hypothetical future internal models:

*Predicted mode change (PMC)* predicts the height difference between the modes of the current and future internal models (Baron, 2005; Nelson, 2005):

$$\text{PMC}(a,s) = \sum\_{s^\*} \widehat{\Theta}\_{ass^\*} \left[ \max\_{s'} \widehat{\Theta}\_{ass'}^{a,s \to s^\*} - \max\_{s'} \widehat{\Theta}\_{ass'} \right] \tag{11}$$

*Predicted L1 change (PLC)* predicts the average L1 distance between the current and future internal models (Klayman and Ha, 1987):

$$\text{PLC}(a,s) = \sum_{s^*} \widehat{\Theta}_{ass^*} \left[ \frac{1}{N} \sum_{s'} \left| \widehat{\Theta}_{ass'}^{a,s \to s^*} - \widehat{\Theta}_{ass'} \right| \right] \tag{12}$$
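Both measures follow the same imagine-and-average pattern as PIG and differ only in how the hypothetical and current models are compared. The sketch below evaluates Equations (11) and (12) for a single state-action pair, again assuming a symmetric Dirichlet prior with concentration `alpha` so that the Bayesian estimate is a posterior mean. The prior and the function name `predicted_changes` are illustrative assumptions.

```python
import numpy as np

def predicted_changes(counts, alpha=1.0):
    """Return (PMC, PLC) for one (action, state) pair from its transition counts."""
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    current = (counts + alpha) / (counts.sum() + alpha * n)
    pmc = 0.0
    plc = 0.0
    for s_star in range(n):
        bumped = counts.copy()
        bumped[s_star] += 1.0                                   # imagine s -> s_star
        updated = (bumped + alpha) / (bumped.sum() + alpha * n)
        pmc += current[s_star] * (updated.max() - current.max())    # mode height change
        plc += current[s_star] * np.abs(updated - current).mean()   # mean L1 change
    return float(pmc), float(plc)
```

Under these assumptions, with four possible next states and no observations yet, the predicted mode change is 0.15 and the predicted L1 change is 0.075.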

We tested agents that approximately maximize PMC or PLC using VI. As **Figure 7** reveals, PIG(VI) proved again to be the best performer overall. In particular, PIG(VI) significantly outperforms PMC(VI) in all three environments, and PLC(VI) in 1-2-3 Worlds (*p* < 0.001). Nevertheless, PMC and PLC achieved significant improvements over the baseline control in Mazes and 1-2-3 Worlds, highlighting the benefit of coordinated actions across different utility functions. Interestingly, when performance was measured by an L1 distance instead of missing information, PIG(VI) still outperformed PMC(VI) and PLC(VI) in 1-2-3 Worlds (data not shown).

#### **3.7. GENERALIZED UTILITY OF EXPLORATION**

**FIGURE 7 | Comparison between utility functions.** The average missing information is plotted over time for agents that employ VI to maximize long-term gains in the three objective functions, PIG, PMC, or PLC. The standard control strategies are also shown (*n* = 200).

In considering the causes underlying a behavior such as exploration, psychologists often distinguish between the proximate (or behavioral) causes and the ultimate (or evolutionary) causes (Mayr, 1961; Pisula, 2009). Proximate causes are those factors that act directly on the individual in the control of behavior, while ultimate causes are those factors that contribute to the survival value of a behavior upon which natural selection can act. Thus far we have focused on efficient learning as the major objective because it has been identified by psychologists as the primary proximate cause of exploration (Archer and Birke, 1983; Loewenstein, 1994). We now, however, return to the question of the ultimate cause of exploration, which must lie in improved survival or reproductive fitness. The evolutionary advantage of learning-driven exploration is thought to lie in the general usefulness of possessing an accurate internal model of the world (Kaplan and Kaplan, 1983; Renner, 1988, 1990; Pisula, 2003, 2008). Unlike many models of reward-driven exploration, which focus on learning to optimize reward acquisition in a single context, an accurate internal model derived from learning-driven exploration may hold general utility applicable across a wide range of contexts. To compare the general utility of internal models gained through the various exploration methods, we assessed the ability of our agents to apply their internal models toward solving an array of goal-directed tasks. We note that these studies were performed without any changes to the exploration strategies employed by the agent. In essence, we interrupt an agent's exploration at several benchmark time points. We then ask the agent how it would solve a particular task, given its internal model, before allowing it to continue its exploration. The agent does not actually perform the task. It is simply asked to solve the task using its internal model. We then compare the solution it provides to the optimal solution. We considered two types of tasks, navigation and reward acquisition:

*Navigation:* Given a starting state, the agent has to quickly navigate to a target state.

*Reward Acquisition:* Given a starting state, the agent has to gather as much reward as possible over 100 time steps. Reward values are drawn from a normal distribution and randomly assigned to every state in the CMC. The agent is given the reward value of each state.

After various lengths of exploration, the agent's internal model is assessed for general utility. For each task, we derive the behavioral policy that optimizes performance under the internal model. As a positive control, we also derive an objective optimal policy that maximizes performance given the true CMC kernel. The difference in realized performance between the agent's policy and the control is used as a measure of navigational or reward loss. For detailed methods, please see Appendix A6.

**Figure 8** depicts the average rank in the navigational and reward tasks for the different explorative strategies. In all environments, for both navigation and reward acquisition, PIG(VI) always grouped with the top performers (*p* > 0.001), excepting positive controls. PIG(VI) was the only strategy to do so. Thus, the explorative strategy that optimized learning under the missing information objective function also prepared the agent for accomplishing arbitrary goal-directed tasks.

Our test for generalized utility differs from the standard reinforcement learning paradigm in that it tests an agent across multiple tasks. The agent therefore cannot simply learn habitual sensorimotor responses specific to a single task. Though most reinforcement learning studies consider only a stationary, unchanging reward structure, we wanted to compare PIG(VI) to reward-driven explorers. BOSS is a state-of-the-art model-based reinforcement learning algorithm (Asmuth et al., 2009). To implement reward-driven exploration, we trained a BOSS reinforcement learner to navigate to internally chosen target states. After reaching its target, the BOSS agent would randomly select a new target, updating its model reward structure accordingly. We then assessed the internal model formed by a BOSS explorer under the same navigational and reward acquisition tasks. As can be seen in **Figure 8**, BOSS (black cross) was not as good as PIG(VI) at either class of objectives despite being trained specifically on the navigation task.

## **4. DISCUSSION**

In this manuscript we introduced a parsimonious mathematical framework for studying learning-driven exploration by embodied agents based on information theory, Bayesian inference, and CMCs. We compared agents that utilized different exploration strategies toward optimizing learning. To understand how learning performance depends on the structure of the world, three classes of environments were considered that challenge the learning agent in different ways. We found that fast learning could be achieved in all environments by an exploration strategy that coordinated actions toward long-term maximization of PIG.

## **4.1. CAVEATS**

The optimality of the Bayesian estimate (Theorem 1) and the estimation of information gain (Theorem 2) both require an accurate prior over the transition kernels. For biological agents, such priors could have been learned from earlier exploration of related environments, or may represent hardwired beliefs optimized by evolutionary pressures. Alternatively, an agent could attempt to simultaneously learn a prior while exploring its environment. Indeed, a simple maximum-likelihood estimation of the concentration parameter for Dense Worlds and Mazes is sufficient for an agent to achieve efficient exploration (data not shown). Nevertheless, biological agents may not always have access to an accurate prior for an environment. For such cases, future work is required to understand exploration under false priors and how they could yield sub-optimal but perhaps biologically realistic exploratory behaviors.

**FIGURE 8 | Demonstration of generalized utility.** For each world (*n* = 200), explorative strategies are ranked for average performance on the navigational tasks (averaged across *N* start states and *N* target states) and the reward tasks (averaged across *N* start states and 10 randomly generated reward distributions). The average ranks are plotted with standard deviations. PIG(VI) is depicted as a filled green circle. Strategies lying outside the pair of horizontal green lines differ significantly from PIG(VI) in navigational performance. Strategies lying outside the pair of vertical green lines differ significantly from PIG(VI) in reward performance (*p* < 0.0001). The different utility functions and heuristics are distinguished by color: PIG (green), PEIG (magenta), PMC (dark-blue), PLC (cyan), LTA (orange), and CB (yellow). The different coordination methods are distinguished by symbol: Greedy (squares), VI (circles), VI+ (diamonds), Heuristic Strategies (asterisks). The two standard controls are depicted as points as follows: Unembodied (black), Random (red). The BOSS reinforcement learner is depicted by a black cross.

Another potential limitation of our approach arises from the fact that the VI algorithm is only optimal if the utility function is stationary (i.e., unchanging) (Bellman, 1957). Any utility function, including PIG, that attempts to capture learning progress will necessarily change over time. This caveat may be partially alleviated by the fact that PIG changes only for the sampled distributions. Furthermore, PIG decreases in a monotonic fashion (see **Figure 2**), which can potentially be captured by the discount factor of VI. Interesting future work may lie in accounting for the effect of such monotonic decreases in estimates of future information gains, either through direct estimation or through better approximation by a different choice of discounting mechanism. The problem of accounting for diminishing returns on utility has been previously approached in the field of optimal foraging theory. Modeling the foraging behaviors of animals, optimal foraging theory considers an animal's decision of when it should leave its present feeding area, or patch, in which it has been consuming the available food, and expend energy to seek out a new, undiminished patch (MacArthur and Pianka, 1966). Charnov's Marginal Value Theorem, a pivotal finding in the field, suggests that the decision to transition should be made once the expected utility of the current patch decreases to the average expected utility across all patches, accounting for transition costs (Charnov, 1976). Extending this work to our information-theoretic approach in CMCs may provide the necessary insights to address the challenge of diminishing returns on information gain.

Finally, the VI algorithm scales linearly with the size of the state space, and the calculation of PIG can scale linearly with the square of the size of the state space. This means that as CMCs grow larger, these approaches will become more computationally expensive to perform. For large worlds, clever methods for approximating these approaches or for sparsifying their representation may be necessary. An explicit model of memory may also be necessary to fully capture the limitation on computational complexity that biological organisms face. A wealth of literature from Reinforcement Learning and related fields may offer insights in approaching these challenges, which we reserve for future work.

## **4.2. RELATED WORK IN REINFORCEMENT LEARNING**

CMCs are closely related to Markov Decision Processes (MDPs) commonly studied in Reinforcement Learning. MDPs differ from CMCs in that they explicitly include a stationary reward function associated with each transition (Sutton and Barto, 1998; Gimbert, 2007). RL research on exploration usually focuses on its role in balancing exploitative behaviors during reward maximization. Several approaches for inducing exploratory behavior in RL agents have been developed. One very common approach is the use of heuristic strategies such as random action, least taken action, and counter-based algorithms. While such strategies may be useful in gathering unchanging external rewards, our results show that they are inefficient for learning the dynamics of structured worlds.

Other RL approaches involve reward-driven exploration. In the absence of external rewards, exploration could still be induced under reward-driven strategies by having the agent work through a series of internally chosen reward problems. This is essentially how the described BOSS agent operates. It was nevertheless insufficient to reach the performance accomplished by PIG(VI).

In addition, several RL studies have investigated intrinsically motivated learning. For example, Singh et al. (2010) have demonstrated that RL guided by saliency, an intrinsic motivation derived from changes in stimulus intensity, can promote the learning of reusable skills. As described in section 3.5, Storck et al. introduced the combination of Q-learning and PEIG as an intrinsic motivator of learning (Storck et al., 1995). In their study, PEIG(Q) outperformed random action only over long time scales. At shorter time scales, random action performed better. Interestingly, we found exactly the same trend, initially slow learning with eventual catching-up, when we applied PEIG(Q) to exploration in our test environments (**Figure 6**).

#### **4.3. BETWEEN LEARNING-DRIVEN AND REWARD-DRIVEN EXPLORATION**

While curiosity, as an intrinsic value for learning, is believed to be the primary drive of explorative behaviors, other factors, including external rewards, may play a role either in motivating exploration directly or in shaping the development of curiosity (Archer and Birke, 1983; Loewenstein, 1994; Silvia, 2005; Pisula, 2009). In this manuscript, we wished to focus on a pure learning-based exploration strategy and therefore chose to take an unweighted sum of missing information as a parsimonious objective function (Equation 3). Two points, however, should be noted in considering the relationship of this work to previous work in the literature. First, our objective function considers only the learning of the transition dynamics governing a CMC, as this fully describes such a world. When we incorporate additional features into our framework, such as rewards in MDPs, those features too could be learned and assessed under our missing information objective function. Toward this goal, interesting insights may come from comparing our work with the multi-armed bandits literature. Multi-armed bandits are a special class of single-state MDPs (Gittins, 1979). By considering only a single state, multi-armed bandits remove the embodiment constraint of multi-state CMCs and MDPs. Thus, CMCs and multi-armed bandits represent complementary special cases of MDPs. That is, a CMC is an MDP without reward structure, while a multi-armed bandit is an MDP without transition kernels. Recent research has attempted to decouple the exploration and exploitation components of optimal control in multi-armed bandits (Abbeel and Ng, 2005; Bubeck et al., 2009). These studies aim at minimizing, through exploration, a construct termed regret, the expected reward forgone by a recommended strategy. Regret is similar to the navigational and reward acquisition loss values we calculated for ranking our explorers under goal-directed tasks. Importantly, while our work considered a wide array of goal-directed tasks, these multi-armed bandit approaches typically consider only learning a single fixed reward structure. Understanding these differences will be important if one wishes to shift attention from the unbiased information-theoretic view we take to a directed task-dependent view. Identifying a means, perhaps through information theory, of quantifying uncertainty in which strategy will optimize a task will be an important extension bridging these two approaches.

The idea of directed information brings us to our second consideration in relating our work to previous literature. Psychologists have found that curiosity, or interest, can vary greatly both between and within individuals (Silvia, 2001, 2006). While one should be careful not to conflate the valuation of an extrinsic reward with the emotion of interest, it is possible that such valuations could act to influence the development of interests. By transitioning away from our non-selective measure of missing information toward a weighted objective function that values certain information over others, we may begin to bridge the learning-driven and reward-driven approaches to exploration. One interesting proposal, put forth by Vergassola et al., suggests that information regarding a reward often falls off with distance as an organism moves away from the source of the reward (Vergassola et al., 2007). Accordingly, a greedy local maximization of information regarding the reward may simultaneously bring the individual closer to the desired reward. The resultant "infotaxis" strategy is closely related to our PIG(greedy) strategy but is applied only to a single question of where a particular reward is located.

#### **4.4. RELATED WORK IN PSYCHOLOGY**

In the Psychology literature, PIG, as well as PMC and PLC, were directly introduced as measures of the expected difference between a current and future belief (Baron, 2005; Klayman and Ha, 1987; Oaksford and Chater, 1994; Nelson, 2005). Here, we showed that PIG equals the expected change in missing information (Theorem 2). Analogous theorems do not hold for PMC or PLC. For example, PLC is not equivalent to the expected change in L1 distance with respect to the true world. This might explain why PIG(VI) outperformed PLC(VI) even under an L1 measure of learning.

We applied PIG, PMC, and PLC to the problem of learning a full model of the world. In contrast, the mentioned psychology studies focussed specifically on hypothesis testing and did not consider sequences of actions or embodied action-perception loops. These studies revealed that human behavior during hypothesis testing can be modeled as maximizing PIG, suggesting that PIG may have biological significance (Oaksford and Chater, 1994; Nelson, 2005). However, those results could not distinguish between the different utility functions (PIG, PMC, and PLC) (Nelson, 2005). Our finding that 1-2-3 Worlds give rise to large differences between the three utility functions may help identify new behavioral tasks for disambiguating the role of these measures in human behavior.

To model bottom-up visual saliency and predict gaze attention, Itti and Baldi recently developed an information-theoretic measure closely related to PEIG (Itti and Baldi, 2006, 2009; Baldi and Itti, 2010). In this model, a Bayesian learner maintains a probabilistic belief structure over the low-level features of a video. Attention is believed to be attracted to locations in the visual scene that exhibit high Surprise. Several potential extensions of this work are suggested by our results. First, it may be useful to model the active nature of data acquisition during visual scene analysis. In Itti and Baldi's model, all features are updated for all locations of the visual scene regardless of current gaze location or gaze trajectory. Differences in acuity between the fovea and periphery, however, suggest that gaze location will have a significant effect on which low-level features can be transmitted by the retina (Wässle and Boycott, 1991). Second, our comparison between PIG and PEIG (**Figure 6**) suggests that predicting future changes may be more efficient than focusing attention only on those locations where change has occurred in the past. A model that anticipates Surprise, as PIG anticipates information gain, may be better able to explain some aspects of human attention. For example, if a moving object disappears behind an obstruction, viewers may anticipate the reemergence of the object and attend to that location. Incorporating these insights into new models of visual saliency and attention could be an interesting course of future research.

#### **4.5. INFORMATION-THEORETIC MODELS OF BEHAVIOR**

Recently, information-theoretic concepts have become more popular in computational models of behavior. These approaches can be grouped under three guiding principles. The first principle uses information theory to quantify the complexity of a behavioral policy, with high complexity considered undesirable. Tishby and Polani, for example, considered RL maximization of rewards under such complexity constraints (Tishby and Polani, 2011).

The second principle is to maximize a measure called *predictive information*, which quantifies the amount of information a known (or past) variable contains regarding an unknown (or future) variable (Tishby et al., 1999; Ay et al., 2008; Still, 2009). Predictive information has also been referred to as *excess entropy* (Crutchfield and Feldman, 2003) and should not be confused with PIG. When the controls of a simulated robot were adjusted such that the predictive information between successive sensory inputs was maximized, Ay et al. found that the robot began to exhibit complex and interesting explorative behaviors (Ay et al., 2008). This objective selects for behaviors that cause the sensory inputs to change often but to remain predictable from previous inputs, and we therefore describe the resulting exploration as stimulation-driven. Such exploration generally benefits from a good internal model but, on its own, does not drive fast learning. It is therefore more suitable later in exploration, after a learning-driven strategy, such as PIG(VI), has had a chance to form an accurate model. PIG, in contrast, is most useful in the early stages when the internal model is still deficient. These complementary properties of predictive information and PIG lead us to hypothesize that a simple additive combination of the two objectives may naturally lead to a smooth transition from learning-driven exploration to stimulation-driven exploration, a transition that may indeed be present in human behavior (see section 4.6).

Epsilon machines, introduced by Crutchfield and Young (1989), and the information bottleneck approach, introduced by Tishby et al. (1999), combine these first two principles of maximizing predictive information and constraining complexity. In particular, they maximize the information between a compressed internal variable and the future state progression, subject to a constraint on the complexity of generating the internal variable from sensory inputs. Recently, Still extended the information bottleneck method to incorporate actions (Still, 2009).

Finally, the third information-theoretic principle of behavior is to minimize free energy, an information-theoretic bound on surprise. Friston put forth this Free-Energy (FE) hypothesis as a unified variational principle for governing both the inference of an internal model and the control of actions (Friston, 2009). Under this principle, agents should act to minimize the number of states they visit. This stands in stark contrast to both learning-driven and stimulation-driven exploration. A learning-driven explorer will seek out novel states where missing information is high, while a stimulation-driven explorer actively seeks to maintain high variation in its sensory inputs. Still, reduced state entropy may be valuable in dangerous environments where few states permit survival. The balance between cautionary and exploratory behaviors would be an interesting topic for future research.

#### **4.6. TOWARD A GENERAL THEORY OF EXPLORATION**

With the work of Berlyne (1966), psychologists began to dissect the different motivations that drive exploration. A distinction between play (or diversive exploration) and investigation (or specific exploration) grew out of two competing theories of exploration. As reviewed by Hutt (1970), "curiosity"-theory proposed that exploration is a consummatory response to curiosity-inducing stimuli (Berlyne, 1950; Montgomery, 1953). In contrast, "boredom"-theory held that exploration was an instrumental response for stimulus change (Myers and Miller, 1954; Glanzer, 1958). Hutt suggested that the two theories may be capturing distinct behavioral modes, with "curiosity"-theory underlying investigatory exploration and "boredom"-theory underlying play. In children, exploration often occurs in two stages: inspection to understand what is perceived, followed by play to maintain changing stimulation (Hutt and Bhavnani, 1972). These distinctions nicely correspond to the differences between our approach and the predictive information approach of Ay et al. (2008) and Still (2009). In particular, we hypothesize that our approach corresponds to curiosity-driven investigation, while predictive information à la Ay et al. and Still may correspond to play. Furthermore, the proposed method of additively combining these two principles (section 4.5) may naturally capture the transition between investigation and play seen in children.

For curiosity-driven exploration, there are many varied theories (Loewenstein, 1994). Early theories viewed curiosity as a drive to maintain a specific level of arousal. These were followed by theories interpreting curiosity as a response to intermediate levels of incongruence between expectations and perceptions, and later by theories interpreting curiosity as a motivation to master one's environment. Loewenstein developed an Information Gap Theory and suggested that curiosity is an aversive reaction to missing information (Loewenstein, 1994). More recently, Silvia proposed that curiosity comprises two traits, complexity and comprehensibility (Silvia, 2005). For Silvia, complexity is broadly defined and includes novelty, ambiguity, obscurity, mystery, etc. Comprehensibility appraises whether something can be understood. It is interesting how well these two traits match information-theoretic concepts, complexity being captured by entropy and comprehensibility by information gain (Pfaffelhuber, 1972). Indeed, PIG might be able to explain the dual aspects of curiosity-driven exploration proposed by Silvia. PIG is bounded by entropy, and thus high values require high complexity. At the same time, PIG equals the expected decrease in missing information and thus may be equivalent to expected comprehensibility.

All told, our results add to a bigger picture of exploration in which the theories for its different aspects fit together like pieces of a puzzle. This invites future work for integrating these pieces into a more comprehensive theory of exploration and ultimately of autonomous behavior.

#### **4.7. APPLICATION TOWARD EXPERIMENTAL DESIGN**

In many ways, scientific research itself epitomizes learning-driven exploration. Like our modeled agents, researchers design experiments to maximize their expected gain in information. Recently, there has been growing interest in automated experimental design. While not every experimental paradigm will fit neatly into our CMC framework, our explorative principles may have direct application to closed-loop neurophysiology. Suppose, for example, we are interested in how ongoing activity within a population of neurons affects their receptive fields. To study this, we would want to measure the neurons' responses to different stimuli and determine how those responses are affected by the activity of the neurons just prior to stimulus presentation. Specific sequences of *priming* stimuli may be necessary to drive the neurons into a particular state of ongoing activity in which their responses to a *probe* stimulus could be measured. It may be difficult for a researcher to determine beforehand which sequences of stimuli are interesting, but PIG(VI) might offer an automated way of choosing appropriate stimuli on the fly. The ongoing activity of a population of neurons can be treated as the states of the system, and the choice of stimuli as the actions. A closed-loop electrophysiology system controlled by PIG(VI) could investigate not only how the neurons respond to presented stimuli but also how to use the stimuli to prime the neurons into interesting states of ongoing activity for probing.

## **ACKNOWLEDGMENTS**

This work was supported by the Redwood Center for Theoretical and Computational Neuroscience Graduate Student Endowment to Daniel Y. Little and NSF Grant (IIS-0713657) to Friedrich T. Sommer. The authors wish to thank Tom Griffiths, Reza Moazzezi, and the Redwood Center for Theoretical Neuroscience for useful discussions.


observed: levels of entropy convergence. *Chaos* 13, 25. (Woodbury, NY).


neuroscience and robotics. *Curr. Opin. Neurobiol.* 17, 205–212.


*Control and Computation* (Monticello, IL).

Tishby, N., and Polani, D. (2011). "Models, Architectures, and Hardware, Chapter 19," in *Perception-Action Cycle,* eds V. Cutsuridis, A. Hussain, and J. G. Taylor (New York, NY: Springer), 601–636.

Vergassola, M., Villermaux, E., and Shraiman, B. (2007). 'infotaxis' as a strategy for searching without gradients. *Nature* 445, 406–409.

Wässle, H., and Boycott, B. (1991). Functional architecture of the mammalian retina. *Physiol. Rev.* 71, 447.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 01 September 2012; accepted: 23 February 2013; published online: 22 March 2013.*

*Citation: Little DY and Sommer FT (2013) Learning and exploration in action-perception loops. Front. Neural Circuits 7:37. doi: 10.3389/fncir.2013.00037*

*Copyright © 2013 Little and Sommer. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDIX A**

#### **A1 PROOF OF THEOREM 1**

**Claim.** *Consider a CMC random variable* **Θ** *modeling the ground truth environment and drawn from a prior distribution f. Given a history of observations* $\vec{\mathbf{d}}$*, the expected missing information between* **Θ** *and an agent's internal model* **Φ** *is minimized by the Bayesian estimate* **Φ** = $\widehat{\Theta}$*. That is:*

$$\widehat{\Theta} := \operatorname{E}\_{\Theta \mid \vec{\mathbf{d}}}[\Theta] = \operatorname\*{arg\,min}\_{\Phi} \operatorname{E}\_{\Theta \mid \vec{\mathbf{d}}} \left[ \operatorname{I}\_{\mathsf{M}}(\Theta \|\, \Phi) \right],$$

*Proof.* Minimizing missing information is equivalent to independently minimizing the KL-divergence of each transition kernel.

$$\begin{split} & \operatorname*{arg\,min}_{\Phi_{as\cdot}} \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \mathrm{D}_{\mathrm{KL}} \left( \Theta_{as\cdot} \,\|\, \Phi_{as\cdot} \right) \right] \\ & \quad = \operatorname*{arg\,min}_{\Phi_{as\cdot}} \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \sum_{s'} \Theta_{ass'} \log_2 \left( \frac{\Theta_{ass'}}{\Phi_{ass'}} \right) \right] \\ & \quad = \operatorname*{arg\,min}_{\Phi_{as\cdot}} \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \sum_{s'} \Theta_{ass'} \log_2 \Theta_{ass'} - \Theta_{ass'} \log_2 \Phi_{ass'} \right] \\ & \quad = \operatorname*{arg\,min}_{\Phi_{as\cdot}} \, - \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \sum_{s'} \Theta_{ass'} \log_2 \Phi_{ass'} \right] \\ & \quad = \operatorname*{arg\,min}_{\Phi_{as\cdot}} \, - \sum_{s'} \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \Theta_{ass'} \right] \log_2 \Phi_{ass'} \\ & \quad = \operatorname*{arg\,min}_{\Phi_{as\cdot}} \mathrm{H} \left[ \mathrm{E}_{\Theta|\vec{\mathbf{d}}} \left[ \Theta_{as\cdot} \right]; \ \Phi_{as\cdot} \right] \end{split}$$

Here H denotes cross-entropy (Cover and Thomas, 1991). Finally, by Gibbs' inequality (Cover and Thomas, 1991):

$$\operatorname*{arg\,min}_{\Phi_{as\cdot}} \, \mathrm{H}\left[\mathrm{E}_{\Theta|\vec{\mathbf{d}}}\left[\Theta_{as\cdot}\right]; \Phi_{as\cdot}\right] = \mathrm{E}_{\Theta|\vec{\mathbf{d}}}\left[\Theta_{as\cdot}\right] = \widehat{\Theta}_{as\cdot}.$$
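The claim can be checked numerically. The following is a minimal sketch, assuming a hypothetical Dirichlet posterior over a three-outcome transition distribution; the helper names (`kl_bits`, `sample_dirichlet`, `mean_kl`) and the parameter values are illustrative and do not come from the original study. It compares the average KL-divergence obtained with the sample mean against that of a perturbed candidate model.

```python
import math
import random


def kl_bits(p, q):
    """KL-divergence D_KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def sample_dirichlet(alphas, rng):
    """Draw one sample from Dir(alphas) by normalizing Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def mean_kl(samples, phi):
    """Monte Carlo estimate of E[D_KL(Theta || phi)] over the samples."""
    return sum(kl_bits(theta, phi) for theta in samples) / len(samples)


rng = random.Random(0)
posterior_alphas = [3.0, 1.0, 2.0]  # hypothetical posterior parameters
samples = [sample_dirichlet(posterior_alphas, rng) for _ in range(2000)]

# The sample mean plays the role of the Bayesian estimate for this sample set.
n = len(samples)
theta_hat = [sum(s[i] for s in samples) / n for i in range(3)]

# A competing internal model: shifted away from the mean, still normalized.
phi_other = [theta_hat[0] + 0.05, theta_hat[1] - 0.05, theta_hat[2]]

print(mean_kl(samples, theta_hat), mean_kl(samples, phi_other))
```

For any normalized alternative, the gap between the two printed values equals the KL-divergence from the sample mean to that alternative, which is the Gibbs' inequality step of the proof applied to the empirical distribution.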

#### **A2 GENERATIVE DISTRIBUTIONS AND BAYESIAN ESTIMATES FOR THE 3 CLASSES OF ENVIRONMENTS**

(1) *Dense Worlds* correspond to complete directed probability graphs with *N* = 10 states and *M* = 4 actions. An example is depicted in **Figure A1**. Each transition distribution is independently drawn from a Dirichlet distribution over the standard (*N* − 1)-simplex:

$$f(\Theta_{as\cdot}) = \mathrm{Dir}(\boldsymbol{\alpha}) = \frac{1}{Z(\boldsymbol{\alpha})} \cdot \prod_{s'} \Theta_{ass'}^{\alpha_{s'}-1}$$

The normalizing constant *Z* brings the area under the distribution to 1:

$$Z(\boldsymbol{\alpha}) := \int_{\Delta_{N-1}} \prod_{s'} \Theta_{ass'}^{\alpha_{s'}-1} \, d\Theta_{as\cdot} = \frac{\prod_{s'} \Gamma(\alpha_{s'})}{\Gamma\left(\sum_{s'} \alpha_{s'}\right)}$$

$$\text{where } \Gamma(x) := \int_0^\infty t^{x - 1} e^{-t} \,\mathrm{d}t$$

The mean of a Dirichlet distribution takes on a simple form:

$$\int_{\Delta_{N-1}} \Theta_{as\cdot} \, \frac{\prod_{s'} \Theta_{ass'}^{\alpha_{s'} - 1}}{Z(\boldsymbol{\alpha})} \, d\Theta_{as\cdot} = \frac{\boldsymbol{\alpha}}{\sum_{s'} \alpha_{s'}}$$

We will assume a symmetric prior, setting $\alpha_{s'}$ equal to α for all $s'$. The vector form of the Dirichlet distribution will nevertheless still be useful in deriving the Bayesian estimate. The parameter α determines how much probability weight is centered at the midpoint of the simplex and is known as the *concentration factor*. For Dense Worlds, we use a concentration parameter α = 1, which results in a uniform distribution over the simplex.

To derive an analytic form for the Bayesian estimate of Dense Worlds, we define the matrix *F* such that $F_{ass'}$ is a count of the number of times the transition $a,s \to s'$ has occurred in the data. Since each layer $as\cdot$ of the CMC kernel is independently distributed, its posterior distribution can be computed as follows:

$$\begin{aligned} f(\Theta_{as\cdot}|F) &= \frac{\prod_{s'} \Theta_{ass'}^{F_{ass'}} \cdot \prod_{s'} \Theta_{ass'}^{\alpha - 1} / Z(\boldsymbol{\alpha})}{p(F)} \\ &= \frac{\prod_{s'} \Theta_{ass'}^{F_{ass'} + \alpha - 1}}{Z(\boldsymbol{\alpha})\, p(F)} = \mathrm{Dir}(F_{as\cdot} + \alpha) \end{aligned}$$

Thus, the posterior distribution is also Dirichlet and the Bayesian estimate is simply the mean of the distribution:

$$\widehat{\Theta}_{ass'} = \frac{F_{ass'} + \alpha}{\sum_{s^*} \left(F_{ass^*} + \alpha\right)} = \frac{F_{ass'} + 1}{\sum_{s^*} \left(F_{ass^*} + 1\right)} \tag{13}$$

In this form, we find that the Bayesian estimate for Dense Worlds is simply the relative frequencies of the observed data with the addition of fictitious counts of size α to each bin. The incorporation of this fictitious observation is referred to as Laplace smoothing and is often performed to avoid over-fitting (Manning et al., 2008). The derivation of Laplace smoothing from Bayesian inference over a Dirichlet prior is a well known result (MacKay and Peto, 1995).
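The smoothed-histogram estimate can be written in a few lines. This is a minimal sketch, assuming the counts for one action-state pair are stored in a plain list; the function name `dirichlet_posterior_mean` and the example counts are hypothetical.

```python
def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean of one transition distribution under a symmetric Dir(alpha) prior.

    counts[s2] is the number of times the transition a,s -> s2 was observed.
    Each bin receives a fictitious count of size alpha (Laplace smoothing).
    """
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


# Hypothetical counts for one (action, state) pair in a 4-state world.
counts = [3, 0, 1, 0]
estimate = dirichlet_posterior_mean(counts, alpha=1.0)
print(estimate)  # [0.5, 0.125, 0.25, 0.125]
```

Unobserved transitions keep a nonzero probability, which keeps the KL-divergence between the true and estimated distributions finite.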

(2) *Mazes* consist of *N* = 36 states corresponding to rooms in a randomly generated 6 by 6 maze and *M* = 4 actions corresponding to noisy translations, each biased toward one of the four cardinal directions. An example is depicted in **Figure 1**. Walking into a wall causes the agent to remain in its current location. Thirty transporters, which lead to a randomly chosen absorbing state (concentric rings in **Figure 1**), are randomly distributed among the walls. States that are not one step away from the originating state (either directly, through a transporter, or against a wall) are assumed to have zero probability of resulting from any action. Transition probabilities for states that are one step away are drawn from a Dirichlet distribution with concentration parameter α = 0.25, and the highest probability is assigned to the state corresponding to the preferred direction of the action. The small concentration parameter distributes more probability weight in the corners of the simplex, resulting in less entropic transitions.

Letting *Ns* denote the number of states one-step away from state *s*, the Bayesian estimate for maze transitions is given by:

$$
\widehat{\Theta}_{ass'} = \frac{F_{ass'} + \alpha}{N_s \cdot \alpha + \sum_{s^*} F_{ass^*}} \tag{14}
$$

As with Dense Worlds, the Bayesian estimate (Equation 14) for mazes is a Laplace smoothed histogram.

*Figure caption (fragment):* ... depicted as arrows pointing from the current state to each of the possible resultant states. Arrow color depicts the likelihood of each transition. The absorbing state is depicted in gray.

(3) *1-2-3 Worlds* consist of *N* = 20 states and *M* = 3 actions. In a given state, action *a* = 1 moves the agent deterministically to a single target state, *a* = 2 brings the agent with probability 0.5 to one of two possible target states, and *a* = 3 brings the agent with probability 0.333 to one of three potential target states. The target states are randomly and independently selected for each transition distribution. An absorbing state is formed by universally increasing the likelihood that state 1 is chosen as a target. Explicitly, letting $\Omega_a$ be the set of all admissible transition distributions for action *a*:

$$\Omega\_a := \left\{ \Theta \in \mathbb{R}^N | \sum\_{s'} \Theta\_{s'} = 1 \text{ and } \Theta\_{s'} \in \left\{ 0, \frac{1}{a} \right\} \forall s' \right\}.$$

the transition distributions are drawn from the following distribution:

$$p(\Theta\_{as}) = \begin{cases} 0 & \text{if } \Theta\_{as} \notin \Omega\_a \\\\ \frac{1 - 0.75^a}{\binom{N - 1}{a - 1}} & \text{else if } \Theta\_{as 1} = \frac{1}{a} \\\\ \frac{1 - (1 - 0.75^a)}{\binom{N - 1}{a}} & \text{otherwise} \end{cases} \tag{15}$$
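One way to draw from Equation 15 is to decide first whether state 1 is a target and then fill the remaining targets uniformly. The following is a sketch under that reading; the function name `sample_123_transition`, the zero-based indexing (index 0 stands in for state 1), and the `p_skip` parameter name are assumptions introduced for illustration.

```python
import random


def sample_123_transition(a, n_states, rng, p_skip=0.75):
    """Sample one transition distribution for action a (1, 2, or 3) per Equation 15.

    States are indexed 0..n_states-1; index 0 stands in for "state 1",
    the absorbing state of the text.
    """
    others = list(range(1, n_states))
    if rng.random() < 1.0 - p_skip ** a:
        # State 1 is a target; the other a-1 targets are uniform over the rest.
        targets = [0] + rng.sample(others, a - 1)
    else:
        # State 1 is excluded; all a targets are uniform over the rest.
        targets = rng.sample(others, a)
    theta = [0.0] * n_states
    for t in targets:
        theta[t] = 1.0 / a
    return theta


rng = random.Random(1)
theta = sample_123_transition(a=3, n_states=20, rng=rng)
print(sum(1 for p in theta if p > 0))  # 3
```

Under this scheme, each admissible distribution that includes state 1 has probability $(1 - 0.75^a)/\binom{N-1}{a-1}$ and each one that excludes it has probability $0.75^a/\binom{N-1}{a}$, matching the two nonzero cases of Equation 15.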

Bayesian inference in 1-2-3 Worlds differs greatly from Mazes and Dense Worlds because of its discrete prior. If $a,s \to s'$ has been previously observed, then the Bayesian estimate for $\Theta_{ass'}$ is given by:

$$
\widehat{\Theta}_{ass'} = \frac{1}{a}
$$

If $a,s \to s'$ has not been observed but $a,s \to 1$ has, then the Bayesian estimate is given by:

$$
\widehat{\Theta}_{ass'} = \frac{1 - \frac{T}{a}}{N - T}
$$

Here *T* is the number of target states that have already been observed. Finally, if neither $a,s \to s'$ nor $a,s \to 1$ has been observed, then the Bayesian estimate is:

$$\widehat{\Theta}_{ass'} = \begin{cases} \frac{1}{a} \cdot \frac{1 - 0.75^a}{1 + \left(\binom{a-1}{T} - 1\right) \cdot 0.75^a} & \text{if } s' = 1 \\\\ \frac{1 - \left(\frac{T}{a} + \widehat{\Theta}_{as1}\right)}{N - T - 1} & \text{otherwise} \end{cases}$$

#### **A3 PROOF OF THEOREM 2**

**Claim.** *If an agent is in state s and has previously collected data* d*, then the expected information gain for taking action a is given by:*

$$\begin{split} \mathrm{PIG}(a,s) &:= \mathrm{E}_{s^*, \Theta \mid \vec{\mathbf{d}}} \left[\mathrm{IG}(a,s,s^*)\right] \\ &= \sum_{s^*} \widehat{\Theta}_{ass^*} \, \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{a,s \to s^*} \,\|\, \widehat{\Theta}_{as\cdot}\right) \end{split} \tag{16}$$

*Proof.*

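$$\begin{split} \mathrm{E}_{s^*, \Theta \mid \vec{\mathbf{d}}} \left[\mathrm{IG}(a,s,s^*)\right] &= \mathrm{E}_{s^*, \Theta \mid \vec{\mathbf{d}}} \left[ \sum_{s'} \Theta_{ass'} \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{a,s \to s^*}}{\widehat{\Theta}_{ass'}} \right) \right] \\ &= \mathrm{E}_{s^* \mid \vec{\mathbf{d}}} \left[ \sum_{s'} \mathrm{E}_{\Theta \mid \vec{\mathbf{d}}, s^*} \left[\Theta_{ass'}\right] \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{a,s \to s^*}}{\widehat{\Theta}_{ass'}} \right) \right] \\ &= \mathrm{E}_{s^* \mid \vec{\mathbf{d}}} \left[ \sum_{s'} \widehat{\Theta}_{ass'}^{a,s \to s^*} \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{a,s \to s^*}}{\widehat{\Theta}_{ass'}} \right) \right] \\ &= \mathrm{E}_{s^* \mid \vec{\mathbf{d}}} \left[ \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{a,s \to s^*} \,\|\, \widehat{\Theta}_{as\cdot}\right) \right] \\ &= \sum_{s^*} p(s^* \mid a, s, \vec{\mathbf{d}}) \, \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{a,s \to s^*} \,\|\, \widehat{\Theta}_{as\cdot}\right) \\ &= \sum_{s^*} \widehat{\Theta}_{ass^*} \, \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{a,s \to s^*} \,\|\, \widehat{\Theta}_{as\cdot}\right) \qquad \text{by (5)} \end{split}$$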
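For Dense Worlds, Equation 16 can be evaluated directly from the observation counts, because the Bayesian estimate is the smoothed histogram of Equation 13. The following is a minimal sketch under that assumption; the function names and the example counts are hypothetical.

```python
import math


def posterior_mean(counts, alpha=1.0):
    """Bayesian estimate for one (a, s) pair under a symmetric Dir(alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def kl_bits(p, q):
    """KL-divergence D_KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def predicted_information_gain(counts, alpha=1.0):
    """PIG(a, s) computed from the counts of one (a, s) pair (Equation 16)."""
    current = posterior_mean(counts, alpha)
    pig = 0.0
    for s_star, prob in enumerate(current):
        hypothetical = list(counts)
        hypothetical[s_star] += 1  # pretend a,s -> s_star was observed
        updated = posterior_mean(hypothetical, alpha)
        pig += prob * kl_bits(updated, current)
    return pig


print(predicted_information_gain([0, 0, 0, 0]))    # unexplored pair
print(predicted_information_gain([30, 10, 5, 5]))  # well-sampled pair
```

The unexplored pair yields the larger value, which is the property a PIG-driven explorer relies on when choosing where to sample next.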

#### **A4 DERIVATION OF MEAN PATH LENGTH**

To optimize navigation to a target state $s^*$, we consider modified transition probabilities:

$$p_{\text{navigation}}(s'|a,s) = \begin{cases} \Theta_{ass'} & \text{if } s \neq s^* \\ 1 & \text{if } s = s' = s^* \\ 0 & \text{otherwise} \end{cases}$$

A navigational utility function is then defined as:

$$U\_{\text{navigation}}(s) = \begin{cases} -1 & \text{if } s \neq s^\* \\ 0 & \text{otherwise} \end{cases}$$

An optimal policy π is derived through value-iteration as follows:

$$\begin{aligned} Q_0(a,s) &:= U_{\text{navigation}}(s) \\ Q_{\tau-1}(a,s) &:= U_{\text{navigation}}(s) + \sum_{s' \in \mathcal{S}} p_{\text{navigation}}(s'|a,s) \cdot V_{\tau}(s') \\ \text{where } V_{\tau}(s) &:= \max_a Q_{\tau}(a,s) \end{aligned}$$

Value-iteration is continued until *V* converges, and the optimal policy is then defined as:

$$
\pi(s) = \operatorname*{arg\,max}_{a} \, Q_{\text{convergence}}(a, s)
$$

The expected path length to target $s^*$ is then calculated as:

$$\text{E[steps to } s^\*] = \sum\_s -\frac{1}{N} V\_{\text{convergence}}(s)$$

The mean path length is then taken to be the average of the expected path length over the *N* possible target states.
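The procedure above can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the transition kernel is stored as nested lists `theta[a][s][s2]`, runs a fixed number of sweeps instead of testing convergence, and uses a hypothetical deterministic ring world as the example. Targets that cannot be reached make the values grow with the number of sweeps, so the example uses a world in which every state is reachable.

```python
def expected_steps_to_target(theta, target, n_iter=1000):
    """Value iteration for navigation to `target`.

    theta[a][s][s2] is the probability of moving from s to s2 under action a.
    Returns (policy, steps), where steps[s] = |V(s)| is the expected number of
    steps from s to the target under the greedy policy.
    """
    n_actions, n_states = len(theta), len(theta[0])
    utility = [0.0 if s == target else -1.0 for s in range(n_states)]
    v = list(utility)  # V_0(s) = max_a Q_0(a, s) = U(s)
    q = [[0.0] * n_states for _ in range(n_actions)]
    for _ in range(n_iter):
        for a in range(n_actions):
            for s in range(n_states):
                if s == target:
                    q[a][s] = 0.0  # the target is absorbing and has zero utility
                else:
                    q[a][s] = utility[s] + sum(
                        theta[a][s][s2] * v[s2] for s2 in range(n_states)
                    )
        v = [max(q[a][s] for a in range(n_actions)) for s in range(n_states)]
    policy = [max(range(n_actions), key=lambda a: q[a][s]) for s in range(n_states)]
    return policy, [abs(x) for x in v]  # V is non-positive


def mean_path_length(theta, n_iter=1000):
    """Average expected path length over all start states and all targets."""
    n_states = len(theta[0])
    per_target = []
    for target in range(n_states):
        _, steps = expected_steps_to_target(theta, target, n_iter)
        per_target.append(sum(steps) / n_states)
    return sum(per_target) / n_states


def ring_world(n_states=3):
    """Deterministic toy ring: action 0 steps forward, action 1 steps backward."""
    theta = [[[0.0] * n_states for _ in range(n_states)] for _ in range(2)]
    for s in range(n_states):
        theta[0][s][(s + 1) % n_states] = 1.0
        theta[1][s][(s - 1) % n_states] = 1.0
    return theta


print(mean_path_length(ring_world(4), n_iter=200))
```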

#### **A5 DERIVATION OF PEIG**

**Claim.** *Surprise, as employed by Storck et al. (1995), is equal to the posterior expected information gain. That is, if an agent is in state s and has previously collected data* d*, then the expected information gain for taking action a and observing resultant state s*<sup>∗</sup> *is given by:*

$$\mathrm{Surprise}(a, s, s^*) := \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{\vec{\mathbf{d}} \cup s^*} \,\|\, \widehat{\Theta}_{as\cdot}^{\vec{\mathbf{d}}}\right) = \mathrm{E}_{\Theta \mid \vec{\mathbf{d}} \cup s^*} \left[\mathrm{IG}(a, s, s^*)\right] \tag{17}$$

*Proof.*

$$\begin{split} \mathrm{E}_{\Theta \mid \vec{\mathbf{d}} \cup s^*} \left[\mathrm{IG}(a,s,s^*)\right] &= \mathrm{E}_{\Theta \mid \vec{\mathbf{d}} \cup s^*} \left[ \sum_{s'} \Theta_{ass'} \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}} \cup s^*}}{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}}}} \right) \right] \\ &= \sum_{s'} \mathrm{E}_{\Theta \mid \vec{\mathbf{d}} \cup s^*} \left[\Theta_{ass'}\right] \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}} \cup s^*}}{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}}}} \right) \\ &= \sum_{s'} \widehat{\Theta}_{ass'}^{\vec{\mathbf{d}} \cup s^*} \log_2 \left( \frac{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}} \cup s^*}}{\widehat{\Theta}_{ass'}^{\vec{\mathbf{d}}}} \right) \\ &= \mathrm{D}_{\mathrm{KL}}\left(\widehat{\Theta}_{as\cdot}^{\vec{\mathbf{d}} \cup s^*} \,\|\, \widehat{\Theta}_{as\cdot}^{\vec{\mathbf{d}}}\right) \end{split}$$
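Surprise is computed after an observation has been made, in contrast to PIG, which averages over possible observations beforehand. The following is a minimal sketch for a Dense World, assuming the smoothed-histogram estimate of Equation 13; the function names and the example counts are hypothetical.

```python
import math


def posterior_mean(counts, alpha=1.0):
    """Bayesian estimate for one (a, s) pair under a symmetric Dir(alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def kl_bits(p, q):
    """KL-divergence D_KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def surprise(counts, s_star, alpha=1.0):
    """D_KL between the estimates after and before observing a,s -> s_star."""
    before = posterior_mean(counts, alpha)
    after_counts = list(counts)
    after_counts[s_star] += 1
    after = posterior_mean(after_counts, alpha)
    return kl_bits(after, before)


counts = [9, 0, 0, 0]  # hypothetical history: only outcome 0 seen so far
print(surprise(counts, 0))  # expected outcome, small update
print(surprise(counts, 1))  # unexpected outcome, larger update
```

Weighting each `surprise(counts, s_star)` by the current estimate of observing `s_star` and summing over outcomes gives the PIG of Equation 16.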

#### **A6 METHODS FOR ASSESSING PERFORMANCE IN GOAL-DIRECTED TASKS**

To assess the general utility of an agent's internal model, the agent is first allowed to explore for a fixed number of time steps. After exploring, the agent is asked, for each goal-directed task, to choose a fixed policy that optimizes performance under its learned model:

*Navigation:* To optimize navigation to a target state $s^*$ under internal model $\widehat{\Theta}$, we took an approach analogous to our method for calculating the mean path length of a world (see Appendix A4). We first consider modified transition probabilities:

$$p_{\text{navigation}}(s'|a, s; \widehat{\Theta}) = \begin{cases} \widehat{\Theta}_{ass'} & \text{if } s \neq s^* \\ 1 & \text{if } s = s' = s^* \\ 0 & \text{otherwise} \end{cases}$$

A navigational utility function is then defined as:

$$U\_{\text{navigation}}(s) = \begin{cases} -1 & \text{if } s \neq s^\* \\ 0 & \text{otherwise} \end{cases}$$

An optimal policy π is derived through value-iteration as follows:

$$\begin{aligned} Q_0(a,s) &:= U_{\text{navigation}}(s) \\ Q_{\tau-1}(a,s) &:= U_{\text{navigation}}(s) + \sum_{s' \in \mathcal{S}} p_{\text{navigation}}(s'|a,s; \widehat{\Theta}) \cdot V_{\tau}(s') \\ \text{where } V_{\tau}(s) &:= \max_a Q_{\tau}(a,s) \end{aligned}$$

This process is iterated a number of times, *τ*<sub>convergence</sub> > 1000, sufficient to allow *Q* to converge to within a small fixed margin. An optimal policy is then defined as:

$$\pi\_{\widehat{\Theta}}(s) = \operatorname\*{arg\,max}\_{a} Q\_{-\tau\_{\text{convergence}}}(a, s)$$

The realized performance of π is assessed as the expected number of time steps, capped at 20, it would take an agent employing π to reach the target state. A true optimal policy is calculated as above except using $\Theta$ instead of $\widehat{\Theta}$. For each world and each exploration strategy, navigation is assessed after *t* ∈ {25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000} exploration time steps and compared to the true optimal strategy. The performance difference from true optimal is calculated and averaged over the tested exploration lengths, all starting states, and all target states. The different explorative strategies are then ranked in performance.
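The value-iteration procedure above can be illustrated with a minimal Python sketch. It assumes the learned model is stored as a nested list `model[a][s][s2]` holding the estimated probability of reaching `s2` from `s` under action `a`; the function names, the `ring_world` toy environment, and the default of 1001 iterations (chosen only to satisfy the stated bound of more than 1000) are illustrative assumptions and not part of the original method description.

```python
from typing import List

Model = List[List[List[float]]]  # model[a][s][s2] = estimated P(s2 | a, s)


def navigation_policy(model: Model, target: int, n_iter: int = 1001) -> List[int]:
    """Greedy policy for reaching `target` under the estimated model."""
    n_actions, n_states = len(model), len(model[0])

    def p_nav(a: int, s: int, s2: int) -> float:
        # The target state is made absorbing; elsewhere the learned model is used.
        if s != target:
            return model[a][s][s2]
        return 1.0 if s2 == target else 0.0

    def utility(s: int) -> float:
        # Every step spent away from the target costs one unit.
        return -1.0 if s != target else 0.0

    q = [[utility(s) for s in range(n_states)] for _ in range(n_actions)]
    for _ in range(n_iter):
        v = [max(q[a][s] for a in range(n_actions)) for s in range(n_states)]
        q = [[utility(s) + sum(p_nav(a, s, s2) * v[s2] for s2 in range(n_states))
              for s in range(n_states)] for a in range(n_actions)]
    return [max(range(n_actions), key=lambda a: q[a][s]) for s in range(n_states)]


def ring_world(n_states: int) -> Model:
    """Hypothetical deterministic ring: action 0 steps forward, action 1 backward."""
    model = [[[0.0] * n_states for _ in range(n_states)] for _ in range(2)]
    for s in range(n_states):
        model[0][s][(s + 1) % n_states] = 1.0
        model[1][s][(s - 1) % n_states] = 1.0
    return model


policy = navigation_policy(ring_world(3), target=0)
print(policy)
```

Because the recursion is undiscounted, the values stay bounded only when the target can be reached from every state under the model; the cap of 20 time steps mentioned above applies to the assessment of the realized performance and is not modeled in this sketch.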

*Reward Acquisition:* Policies in reward acquisition tasks are derived as above for navigational tasks except as follows:

$$p\_{\text{reward}}(s'|a, s; \widehat{\Theta}) = \widehat{\Theta}\_{ass'}$$

$$U\_{\text{reward}}(s) \sim \text{Uniform}([-1, 1])$$

$$\pi\_{\widehat{\Theta}}(s) = \operatorname\*{arg\,max}\_{a} Q\_{-100}(a, s)$$

Realized performance is assessed as the expected total rewards accumulated by an agent employing π over 100 time steps.

## Neural control and adaptive neural forward models for insect-like, energy-efficient, and adaptable locomotion of walking machines

#### *Poramate Manoonpong<sup>1</sup>\*, Ulrich Parlitz<sup>2,3</sup> and Florentin Wörgötter<sup>1</sup>*

*<sup>1</sup> Bernstein Center for Computational Neuroscience, The Third Institute of Physics, Georg-August-Universität Göttingen, Göttingen, Germany*

*<sup>2</sup> Max Planck Research Group Biomedical Physics, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany*

*<sup>3</sup> Institute for Nonlinear Dynamics, Georg-August-Universität Göttingen, Göttingen, Germany*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Ralf Der, Max Planck Institute for Mathematics, Germany; Georg Martius, Max Planck Institute for Mathematics in the Sciences, Germany; William Lewinger, University of Surrey, UK*

#### *\*Correspondence:*

*Poramate Manoonpong, Bernstein Center for Computational Neuroscience, III Physikalisches Institut - Biophysik, Georg-August-Universität Göttingen, Friedrich-Hund Platz 1, D-37077 Göttingen, Germany. e-mail: poramate@physik3.gwdg.de; poramate@manoonpong.com*

Living creatures, like walking animals, have found fascinating solutions for the problem of locomotion control. Their movements give an impression of elegance and are versatile, energy-efficient, and adaptable. During the last few decades, roboticists have tried to imitate such natural properties with artificial legged locomotion systems by using different approaches including machine learning algorithms, classical engineering control techniques, and biologically-inspired control mechanisms. However, their levels of performance are still far from the natural ones. By contrast, animal locomotion mechanisms seem to largely depend not only on central mechanisms (central pattern generators, CPGs) and sensory feedback (afferent-based control) but also on internal forward models (efference copies). They are used to a different degree in different animals. Generally, CPGs organize basic rhythmic motions which are shaped by sensory feedback while internal models are used for sensory prediction and state estimations. According to this concept, we present here adaptive neural locomotion control consisting of a CPG mechanism with neuromodulation and local leg control mechanisms based on sensory feedback and adaptive neural forward models with efference copies. This neural closed-loop controller enables a walking machine to perform a multitude of different walking patterns including insect-like leg movements and gaits as well as energy-efficient locomotion. In addition, the forward models allow the machine to autonomously adapt its locomotion to deal with a change of terrain, loss of ground contact during stance phase, stepping on or hitting an obstacle during swing phase, leg damage, and even to promote cockroach-like climbing behavior. Thus, the results presented here show that the employed embodied neural closed-loop system can be a powerful way to develop robust and adaptable machines.

#### **Keywords: efference copy, central pattern generators, sensory feedback, recurrent neural networks, local leg control, walking gait, autonomous robots**

## **1. INTRODUCTION**

Walking animals, like locusts, stick insects, and cockroaches, can traverse diverse terrains in an energy-efficient way. During traversal, their locomotion can also adapt to deal with terrain changes. Furthermore, their movements are elegant and versatile. These capabilities are the result of the coupling of biomechanics (Dickinson et al., 2000) and neural control. For instance, the appropriate biomechanical structures of body and legs of a cockroach (Ritzmann et al., 2004) allow it to walk naturally, deal with minor disturbances while traversing rough terrain, and even climb over relatively high obstacles as compared to its size. While biomechanics allows for such capabilities, neural control, on the other hand, combines information from different sensor modalities and provides coordinated outputs to many motor joints (Büschges, 2005; Grillner, 2006; Cruse et al., 2009; Mulloney and Smarandache, 2010; Fuchs et al., 2011). This process is fast and adaptive, which leads to the generation of locomotion and adaptation.

During the last few decades, roboticists have tried to imitate such natural properties with artificial legged locomotion systems. Several of them have paid attention to the biomechanical design of such systems to have animal-like properties (Cham et al., 2002; Iida and Pfeifer, 2004; Lewinger et al., 2005; Kingsley et al., 2006; Schneider et al., 2012). Others have focused on sensorimotor coordination and control for locomotion and adaptation by using different approaches including machine learning algorithms (Lee et al., 2006; Erden and Leblebicioglu, 2008), classical engineering control techniques (Brooks, 1986; Shkolnik and Tedrake, 2007), and biologically inspired control mechanisms (Beer et al., 1997; Kuo, 2002; Lewis and Bekey, 2002; Dürr et al., 2003; Ekeberg et al., 2004; Cruse et al., 2007; Kimura et al., 2007; Spenneberg and Kirchner, 2007; Amrollah and Henaff, 2010; Daun-Gruhn and Büschges, 2011; Harischandra et al., 2011; Lewinger and Quinn, 2011; von Twickel et al., 2012). With increasing machine complexity, integrating more behaviors, and obtaining adaptability, the control problems become more challenging.

Artificial neural networks (ANNs) appear appropriate for such control problems due to their intrinsically distributed architecture, their capability to integrate new behaviors, as well as synaptic learning (Beer et al., 1997; Dürr et al., 2003; Ekeberg et al., 2004; Cruse et al., 2007; Kimura et al., 2007; Amrollah and Henaff, 2010; Daun-Gruhn and Büschges, 2011; Lewinger and Quinn, 2011; Harischandra et al., 2011; von Twickel et al., 2012). In addition they have a number of excellent properties as follows. They are able to build a controller as a composition of different neural modules to produce desired motor behaviors (von Twickel et al., 2012). And, they are conceptually close to biological systems compared to other solutions. In particular recurrent neural networks (RNNs) exhibit dynamical behavior (oscillatory, hysteresis, chaotic patterns, etc.) for generating basic rhythmic locomotion behavior (Beer et al., 1997; Kimura et al., 2007; Amrollah and Henaff, 2010; Daun-Gruhn and Büschges, 2011; von Twickel et al., 2012). Considering this, here we exploit the features of ANNs to develop locomotion control for walking machines. This is based on a modular structure consisting of different neural modules having main functions that follow three key mechanisms found in animal locomotion (Holst and Mittelstaedt, 1950; Meyrand et al., 1991; Cruse et al., 1998; Katz, 1998; Bläsing and Cruse, 2004; Cruse et al., 2009; Harris-Warrick, 2011): (1) central mechanisms [i.e., central pattern generators (CPGs)] for generating basic rhythmic motions, (2) sensory feedback (i.e., afferent-based control) for shaping the motions, and (3) internal forward models (i.e., efferent-based control) for sensory prediction and walking state estimations. 
While these three key mechanisms are essential for locomotion control as found in biological legged systems, only individual instances of them have been successfully applied to artificial ones (Beer et al., 1997; Ishiguro et al., 2003; Cruse et al., 2007; Kimura et al., 2007; Spenneberg and Kirchner, 2007; Amrollah and Henaff, 2010; Schroeder-Schetelig et al., 2010; Harischandra et al., 2011; Lewinger and Quinn, 2011; Owaki et al., 2012; von Twickel et al., 2012), thereby providing partial solutions. A few studies have applied all these mechanisms to animal-like legged robots to achieve complex behavior and adaptability (Lewis and Bekey, 2002). However, the mechanisms have often been used for active two-legged walking (Lewis and Simo, 2001).

Taking all these mechanisms into account for the design of our adaptive neural locomotion control leads to robust walking behavior in many situations. Furthermore, the controller can generate a multitude of walking patterns (e.g., 20 patterns), insect-like leg movements, and energy-efficient and adaptable locomotion for a biomechanical six-legged walking machine, like the AMOS II<sup>1</sup> robot used here. It also allows AMOS II to cope with leg damage and even promote cockroach-like climbing behavior. Besides the complex behavior generation, the rationales behind this study are also: (1) to give a better understanding of how a CPG mechanism with neuromodulation, sensory feedback, and adaptive internal forward models with efference copies can be combined in artificial legged locomotion systems and (2) to emphasize that the generated behaviors require the coupling of biomechanics (i.e., physical structure) and neural mechanisms with sensory feedback embedded in an embodied neural closed-loop system. The work presented here extends our previous works (Manoonpong et al., 2007, 2008b; Steingrube et al., 2010) by modifying a chaotic CPG (Steingrube et al., 2010) into a CPG with neuromodulation, leading to more gaits and smoother and faster switching between them compared with the chaotic CPG. It also introduces for the first time local leg feedback and adaptive forward models as well as their combination with the CPG in robust walking behaviors.

The following section describes the technical specification of the six-legged walking machine AMOS II used for the experiments, followed by adaptive neural locomotion control. The controller is developed to generate versatile and adaptable locomotion of walking machines. The experimental results are shown in section 3. Discussion is given in section 4.

## **2. MATERIALS AND METHODS**

All the experiments of this work were carried out with the physical six-legged walking machine AMOS II. Thus, the first section describes its biomechanical setup, followed by details of the adaptive neural locomotion controller and its components which are the main contribution of this work. Here, some results are described alongside the introduced components from which they mainly derive because this provides a better understanding of their functionalities.

#### **2.1. THE WALKING MACHINE PLATFORM AMOS II (BIOMECHANICS)**

In order to explore and test the performance of the proposed adaptive neural locomotion control in a physical system, the six-legged walking machine AMOS II is employed (**Figure 1A**). It is an improved version of our previous six-legged walking machine AMOS (Steingrube et al., 2010).

AMOS II has six identical legs. Each leg has three joints (**Figure 1B**): the thoraco-coxal (TC-) joint enables forward (+) and backward (−) movements, the coxa-trochanteral (CTr-) joint enables elevation (+) and depression (−) of the leg, and the femur-tibia (FTi-) joint enables extension (+) and flexion (−) of the tibia (**Figures 1C,D**). The morphology of these multi-jointed legs is modeled on the basis of a cockroach leg (Zill et al., 2004) but the tarsus segments are ignored. Each tibia contains a spring compliant element to substitute part of the function of the tarsus; i.e., absorbing the impact force during touchdown on the ground. In addition, a passive coupling is installed at each joint (**Figure 1B**) in order to yield passive compliance and to protect the motor shaft. The maximum and minimum ranges of the joint movements of the legs are shown in **Figures 1C,D**. In a normal walking condition (e.g., walking on flat terrain), we set the default joint movements so that its body is very close to the ground (i.e., low center of mass) and its body falls to the ground before taking the next step during normal walking. However, for walking over rough terrains, these ranges will be automatically shifted such that AMOS II lifts its body up for better locomotion. This walking strategy is inspired by insect walking, like that of a cockroach (Alexander, 1982; Ritzmann et al., 2004, 2012) and it also ensures stability when confronting leg damage.

The body of AMOS II consists of two segments: a front segment where the two front legs are installed and a central body segment where the two middle and the two hind legs are attached. They are connected by one active backbone joint (BJ) inspired by the invertebrate morphology of the American cockroach's trunk (**Figure A1**). This BJ can rotate around the lateral or transverse axis in a range between −45° (minimum downward position) and +45° (maximum upward position). It stays at zero degrees during walking and it leans upwards and bends downwards while climbing. In total, AMOS II has 19 active joints (three at each leg, one BJ). They are driven by digital servomotors (HSR-5990 TG) delivering a stall torque of 2.9 Nm at 5 V. In addition, the body joint torque is tripled by using a gear to achieve a more powerful body joint motion. Besides the motors, AMOS II has 21 sensors: two ultrasonic sensors (US) at the front body part, six foot contact (FC) sensors in its legs, six infrared reflex (IR) sensors at the front of its legs, one current sensor (CS) and one inclinometer (IM) sensor inside the body, and three light dependent (LD) sensors, one USB camera (CM) and one laser scanner (LS) on the front body part (**Figure 1**). These sensors are used to generate stimulus-induced behavior (like phototropism and obstacle avoidance) as well as versatile, energy-efficient, and adaptable locomotion. The USB camera is used for terrain classification and the LS is used to measure obstacle height in order to distinguish between a wall and a surmountable obstacle.

<sup>1</sup>Advanced MObility Sensor-driven walking device II.

**Figure 1** (caption fragments): … opposite side have the same ranges; i.e., the range of L1 = R1, the range of … left middle leg (L2); TL3, CL3, FL3 = left hind leg (L3); BJ = a backbone joint.

We use a Multi-Servo IO Board (MBoard) installed inside the body to digitize all sensory input signals except the CM and LS signals. We also use it to generate a pulse-width-modulated signal to control the position of the servomotor. For experiments here, the MBoard is connected to a personal computer (PC) where the CM and LS are directly connected and a neural locomotion controller is implemented. The communication between a PC and the MBoard is accomplished via an RS232 interface at 57.6 kb/s. Electrical power supply for all servomotors, the MBoard, and all sensors is given by lithium polymer batteries with a voltage regulator producing a stable 5 V supply.

#### **2.2. ADAPTIVE NEURAL LOCOMOTION CONTROL**

The adaptive neural locomotion control (**Figure 2**) has been developed based on a modular structure. It consists of two main components: CPG-based control and local leg control. The CPG-based control basically coordinates all leg joints of AMOS II, thereby generating insect-like leg movements and a multitude of different behavioral patterns. The patterns include forward/backward walking, turning left and right, and insect-like gaits. These gaits allow for energy-efficient locomotion on different terrains. All these patterns can be autonomously controlled by exteroceptive sensors, like a camera, a LS, and US. While the CPG-based control provides versatile autonomous behaviors, the local leg control using proprioceptive sensory feedback (like FC sensors) adapts the movement of an individual leg of AMOS II to deal with a change of terrain, losing of ground contact during stance phase, or stepping on or hitting an obstacle during swing phase.

Here, the CPG-based control of the entire system has four components: (1) a CPG mechanism with neuromodulation for generating different periodic signals, (2) neural CPG postprocessing for shaping the CPG signals to obtain smooth leg movements, (3) neural motor control consisting of two additional different networks [phase switching network (PSN) and velocity regulating networks (VRNs)] for controlling walking direction (forward/backward and turning), and (4) motor neurons with delay lines for sending final motor commands to all leg joints of AMOS II.

The local leg control has only two components for each leg: (1) an adaptive neural forward model transforming the motor signal (efference copy) generated by the CPG into an expected sensory signal for estimating the walking state and (2) elevation and searching control for adapting leg motion (e.g., extension/flexion and elevation/depression).

All neurons of the control network (**Figures 2**, **A2**) are modeled as discrete-time non-spiking neurons. They are updated with a frequency of approximately 27 Hz. The activity *a<sub>i</sub>* of each neuron develops according to:

$$a\_i(t) = \sum\_{j=1}^{n} W\_{ij} o\_j(t-1) + B\_i, \quad i = 1, \ldots, n,\tag{1}$$

where *n* denotes the number of units, *B<sub>i</sub>* an internal bias term or a stationary input to neuron *i*, and *W<sub>ij</sub>* the synaptic strength of the connection from neuron *j* to neuron *i*. The output *o<sub>i</sub>* of all neurons of the network is calculated by using the hyperbolic tangent (tanh) transfer function, i.e., *o<sub>i</sub>* = tanh(*a<sub>i</sub>*) ∈ [−1, 1], except for the CPG postprocessing neurons using a step function, the motor neurons using piecewise linear transfer functions, and neurons in searching and elevation control using a linear transfer function.
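As an illustration of Equation (1), the following Python sketch performs one synchronous update of a small network with tanh transfer functions. The function name `update_network` and the list-based representation of weights, biases, and outputs are assumptions made for this example; neurons with step, piecewise linear, or linear transfer functions are not covered.

```python
import math
from typing import List, Tuple


def update_network(weights: List[List[float]],
                   biases: List[float],
                   outputs: List[float]) -> Tuple[List[float], List[float]]:
    """One discrete time step of Equation (1) followed by the tanh output.

    weights[i][j] is the synaptic strength from neuron j to neuron i,
    outputs holds o_j(t-1); the function returns (a(t), o(t)).
    """
    activities = [
        sum(w_ij * o_j for w_ij, o_j in zip(row, outputs)) + b_i
        for row, b_i in zip(weights, biases)
    ]
    new_outputs = [math.tanh(a_i) for a_i in activities]
    return activities, new_outputs


# Two neurons: neuron 0 receives input from neuron 1 plus a bias of 0.5.
a, o = update_network([[0.0, 1.0], [0.0, 0.0]], [0.5, 0.0], [0.2, 0.0])
print(a, o)
```

At the stated update frequency of approximately 27 Hz, each call would correspond to one time step of roughly 0.037 s.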

#### **2.3. CPG-BASED CONTROL**

The structure of this control unit is based on our previous sensor-driven CPG-based controller (Steingrube et al., 2010) in which a chaotic CPG is used as a main component. While the chaotic CPG can produce different periodic output signals including a chaotic one, only a small number of gaits (e.g., five different gaits) and a chaotic motion have been realized for hexapod locomotion (Steingrube et al., 2010). Furthermore, switching between these gaits cannot be immediately achieved but requires a few steps and the transition is non-smooth. This is because the system has to switch to a chaotic state first before obtaining a new periodic pattern.

**Figure 3** (caption fragment): … CPG network is updated with a frequency of approximately 27 Hz (i.e., one time step is ≈0.037 s). **(C)** Examples of the asymmetrical periodic outputs of the CPG (top) where *MI* is set to 0.02, 0.08, and 0.16. The signals differ in phase by π/2 and are shaped by neural CPG postprocessing such that smooth ascending and descending signals are obtained for motor control (bottom). This kind of asymmetrical periodic signal is appropriate for walking as found in insects, where swing (ascending slope) and stance (descending slope) phases differ in duration, being intrinsically asymmetric (Wilson, 1966).

Thus to overcome this drawback, in this study we modify the chaotic CPG to a simpler CPG mechanism with neuromodulation. It is inspired by biological findings (Meyrand et al., 1991; Katz, 1998; Harris-Warrick, 2011) (see section 4 for more details). It provides a large number of periodic output patterns including a chaotic one, resulting in a large number of walking patterns (i.e., more than five stable gaits). It also allows fast and smooth switching between patterns. The circuit consists of two neurons *i* ∈ {1, 2}, fully connected (**Figure 3A**). The discrete-time dynamics of the activity states *a<sub>i</sub>* and the output states *o<sub>i</sub>* of the circuit follow Equation (1) and a tanh transfer function, respectively. Their initial states are set to a small positive value, e.g., 0.1. An extrinsic modulatory input *MI* is introduced and projected to the synaptic connections of the neurons (**Figure 3A**), thereby modulating the outputs of the CPG (**Figures 3B,C**). *MI* will be controlled by a sensory signal (see section 3). According to this, the synaptic weights are described as:

$$W\_{11,22} = W\_{d0},\tag{2}$$

$$W\_{12\_m} = W\_{d1} + MI,\tag{3}$$

$$W\_{21\_m} = -(W\_{d1} + MI),\tag{4}$$

where *W*<sub>11,22</sub> are fixed synapses and *W*<sub>12m,21m</sub> are modulated synapses. *W*<sub>d0</sub> and *W*<sub>d1</sub> are the default synaptic weights, which are used to create basic periodic signals. They need to be selected in accordance with the dynamics of the system that generates periodic or quasi-periodic attractors (Pasemann et al., 2003).

We empirically adjust and set the parameters to *W*<sub>d0</sub> = 1.4 and *W*<sub>d1</sub> = 0.18. This parameter setup with *MI* = 0.0 results in a very low frequency of the periodic outputs. Increasing *MI* will increase the frequency of the outputs (see black solid line in **Figure 3B**). The investigation of AMOS II walking on a flat floor using this CPG shows that its walking speed is proportional to the value of *MI*; i.e., increasing *MI* leads to an increase in walking speed (see blue dashed line in **Figure 3B**). However, the walking speed will decrease if *MI* is greater than 0.19. This is because the output frequency is too high such that the motors of AMOS II cannot follow the driving frequency properly<sup>2</sup>. Interestingly, together with neural motor control and a delay line mechanism embedded in the motor neuron module (described below), AMOS II shows different walking patterns at the different values of *MI* (e.g., 20 patterns) where some of these patterns show similar gaits but differ in stepping frequency in the swing and stance phases. **Figure 4** shows examples of six different patterns or gaits: slow wave gait (*MI* = 0.02), fast wave gait (*MI* = 0.04), tetrapod gait (*MI* = 0.06), caterpillar gait (*MI* = 0.09), intermixed gait (*MI* = 0.12), and fast tripod gait (*MI* = 0.19). Some of them are similar to insect gaits (Wilson, 1966) and allow for energy-efficient locomotion on particular terrains (see section 3). Here we use visual information to trigger the most energy-efficient gait while AMOS II traverses different terrains. Visual information is obtained from a terrain classification system consisting of the USB camera of AMOS II (**Figure 1A**) and an online feature-based terrain classification algorithm. The camera acquires terrain images while the classification algorithm (i.e., image processing) extracts local features of the images using Scale Invariant Feature Transform (SIFT) (Lowe, 2004), encodes the features using the Bag of Words (BoW) technique (Zhang et al., 2010), and then classifies the words using Support Vector Machines (SVMs) with a radial basis function kernel (Cortes and Vapnik, 1995). The output of the algorithm provides terrain information used to set *MI* of the CPG, thereby triggering the corresponding pre-mapped energy-efficient gait (see section 3).

<sup>2</sup>Note that this limitation is not because of the CPG but due to the hardware. Applying the CPG to different robots (e.g., lightweight robots with fast actuator speed), one might be able to obtain more than 20 different walking speeds on flat terrain.
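The two-neuron CPG with neuromodulation defined by Equations (1)–(4) can be simulated in a few lines. The Python sketch below uses the parameter values given in the text (*W*<sub>d0</sub> = 1.4, *W*<sub>d1</sub> = 0.18, initial outputs of 0.1); the function names and the zero-crossing count used as a rough frequency measure are assumptions made for this example and do not reproduce the authors' implementation.

```python
import math
from typing import List, Tuple

W_D0 = 1.4   # default self-connection weight (W_11 = W_22)
W_D1 = 0.18  # default cross-connection weight before modulation


def cpg_trace(mi: float, n_steps: int, o_init: float = 0.1) -> List[Tuple[float, float]]:
    """Simulate the two-neuron CPG for n_steps with modulatory input mi."""
    w12 = W_D1 + mi        # Equation (3): connection from neuron 2 to neuron 1
    w21 = -(W_D1 + mi)     # Equation (4): connection from neuron 1 to neuron 2
    o1 = o2 = o_init
    trace = []
    for _ in range(n_steps):
        a1 = W_D0 * o1 + w12 * o2   # Equation (1) without bias terms
        a2 = w21 * o1 + W_D0 * o2
        o1, o2 = math.tanh(a1), math.tanh(a2)
        trace.append((o1, o2))
    return trace


def zero_crossings(signal: List[float]) -> int:
    """Count sign changes between consecutive samples."""
    return sum(1 for x, y in zip(signal, signal[1:]) if x * y < 0)


for mi in (0.02, 0.08, 0.16):
    o1_signal = [o1 for o1, _ in cpg_trace(mi, 2000)]
    print(mi, zero_crossings(o1_signal))
```

A larger number of zero crossings within the same number of time steps indicates a higher output frequency, which is the qualitative effect of increasing *MI* described above.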

Fast and smooth switching between gaits in comparison with our previous chaotic CPG can be seen in **Video S1**. In principle, for the AMOS II system, a transition state from one stable gait to another stable gait using the CPG with neuromodulation requires about 2 s while it needs about 5 s when using the chaotic CPG. This fast switching between gaits is required for situations like escaping from an attack or danger (i.e., fast changing from a slow wave gait to a fast tripod gait). Note that the change of the modulation value occurs instantaneously, so the CPG with neuromodulation immediately switches from one frequency to a new frequency. However, the system requires a longer time for a new gait to emerge because of delay lines (described below) transmitting the CPG signals to the motor neurons.

The outputs of the CPG are passed to motor neurons through two hierarchical subcomponents or modules: neural CPG postprocessing and neural motor control. The neural CPG postprocessing (**Figure 2**), which directly receives the CPG outputs, consists of postprocessing neurons with a threshold value of 0.85 and integrator units (**Figure A2**). Specifically, the neurons are for signal shaping while the integrator units are for obtaining continuous signals with asymmetry of ascending and descending slopes (**Figure 3C**). At first the CPG outputs are transformed by the neurons, which produce step function outputs with a high (+1) or low (−1) value. The time intervals of the high and low outputs are counted. The high and low outputs are converted to continuous signals with ascending and descending slopes, respectively. The conversion is done by dividing the integrated high and low outputs by the corresponding time intervals. Since the time intervals are counted as the signal unfolds, each slope is calculated using the time intervals of the previous period. Finally, the integrator outputs are scaled to the range between −1.0 and 1.0. For different frequencies of the CPG, the time intervals are different, thereby generating different ascending and descending slopes (**Figure 3C**).
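One possible reading of this postprocessing step is sketched below in Python. It thresholds the input at 0.85, counts the length of each high and low run, and ramps the output up or down so that a run of the same length as the previous one spans the full range from −1 to +1. The function name, the clipping, and the choice to hold the output constant until a previous run length is known are assumptions of this sketch; the actual network in **Figure A2** may differ in detail.

```python
from typing import Dict, List, Optional


def postprocess(cpg_output: List[float], threshold: float = 0.85) -> List[float]:
    """Turn a periodic signal into ascending/descending ramps in [-1, 1]."""
    shaped: List[float] = []
    value = 0.0       # integrator state
    level = 0         # current step output (+1 high, -1 low, 0 before start)
    run_length = 0    # number of samples in the current run
    last_length: Dict[int, Optional[int]] = {1: None, -1: None}
    for x in cpg_output:
        new_level = 1 if x > threshold else -1
        if new_level != level:
            if level != 0:
                last_length[level] = run_length  # remember the finished run
            level = new_level
            run_length = 0
        run_length += 1
        duration = last_length[level]
        if duration is not None:
            # Slope derived from the run length of the previous period.
            value += level * 2.0 / duration
            value = max(-1.0, min(1.0, value))
        shaped.append(value)
    return shaped


# Hypothetical input: 4 samples above and 6 samples below the threshold.
signal = ([1.0] * 4 + [0.0] * 6) * 4
print(postprocess(signal)[20:30])
```

With this input, the ascending part takes four time steps and the descending part six, giving the kind of asymmetric saw-tooth signal described in the text.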

Note that the CPG with the neural CPG postprocessing presented here has certain advantages over a classical solution (e.g., constructing CPG signals directly by hand or using a simple wave generator). This is because the CPG, derived from an RNN with two neurons, in principle exhibits various dynamical behaviors (e.g., periodic patterns, chaotic patterns, and hysteresis effects) which can be exploited for locomotion control (Manoonpong et al., 2008a; Steingrube et al., 2010). While the network can generate various output patterns, the neural CPG postprocessing is used only to translate these output signals into smooth continuous signals (e.g., saw-tooth signals) for motor control and does not change the network dynamics. In fact, the CPG and its postprocessing are independent; therefore, one could also apply different postprocessing mechanisms to shape or transform the CPG outputs into other periodic forms if required. In this neural approach, we can simply change the gaits (flexibility) and obtain various patterns including chaotic motions<sup>3</sup> (versatility) by only changing the network parameters (i.e., synaptic weights and bias terms). Furthermore, one could also apply learning mechanisms (with an additional neuron) to the CPG such that the CPG can be entrained by sensory feedback in order to adapt to the feedback pattern and memorize it (Nachstedt et al., 2012). This will lead to the adaptivity of the gaits. Implementing this adaptivity on the AMOS II system is one of our major plans for future work. All these features (flexibility, versatility, and adaptivity) would be difficult to achieve with a classical solution.

The neural motor control, which receives the postprocessed CPG outputs, consists of two different neural networks: one PSN and two VRNs. All neuron outputs of these networks are given by a hyperbolic tangent (tanh) transfer function. The PSN is a generic feedforward network (see **Figure A2** for the network structure). This network is designed by hand and consists of 4 hierarchical layers with 12 neurons. The synaptic weights and bias terms of the network are determined in a way that they do not change the periodic form of input signals (i.e., the postprocessed CPG outputs) and keep the amplitude of the signals as high as possible. Thus, all synaptic weights and bias terms were set to 0.5, which will convert the signals in the linear domain of the transfer function, except the synaptic weights and bias terms of the output neurons. They were set to 3.0 and −1.35, respectively, in order to amplify the signals and to shift the offset of the final output signals such that they have their center at zero. The complete network and parameters (i.e., all synaptic weights and bias terms) are shown in **Figure A2**. As a result, the network can switch the phase of the CPG outputs to lead or lag behind each other by π/2 in phase with respect to a given input for walking sideways [see Steingrube et al. (2010) and Manoonpong et al. (2008b) for more details]. It also provides additional fine tuning of the phase of the CPG outputs to achieve a proper phase shift between the CTr- and FTi-joints leading to insect-like leg movements (**Figure 5**).

The two VRNs are also simple feed-forward networks (see **Figure A2** for the network structure). Each network approximates the multiplication of two values *x*, *y* ∈ [−1, 1]. It was constructed with four hidden neurons, which are connected to an output neuron. The network was trained by using the backpropagation algorithm (Rumelhart et al., 1986). The resulting network parameters (synaptic weights and bias terms) are shown in **Figure A2**. It approximately works as a multiplication operator. Each VRN controls the three ipsilateral TC-joints on one side. Since the VRNs function qualitatively like a multiplication function (Manoonpong et al., 2007), they have the capability to increase or decrease the amplitude of the TC-joint signals and even reverse them with respect to their control inputs. Controlling the TC-joint signals in this way results in various walking directions, like forward/backward, turning left/right, turning in different radians, or curve walking in forward and backward directions [see Manoonpong et al. (2008b) for walking experiments].
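The effect of the VRNs on the TC-joint signals can be illustrated with an idealized stand-in. The Python sketch below replaces each trained network by an exact product, which the real VRN only approximates; the function names and the interpretation of the control inputs (one per body side, in [−1, 1]) are assumptions made for this example.

```python
from typing import Tuple


def vrn(signal: float, control: float) -> float:
    """Idealized VRN: exact multiplication instead of the trained network."""
    return signal * control


def tc_joint_commands(tc_signal: float,
                      left_control: float,
                      right_control: float) -> Tuple[float, float]:
    """Scale or reverse the shared TC-joint signal separately for each side."""
    return vrn(tc_signal, left_control), vrn(tc_signal, right_control)


print(tc_joint_commands(0.5, 1.0, 1.0))    # both sides unchanged
print(tc_joint_commands(0.5, -1.0, -1.0))  # both sides reversed
print(tc_joint_commands(0.5, 1.0, -1.0))   # sides driven in opposite directions
print(tc_joint_commands(0.5, 0.5, 1.0))    # one side with reduced amplitude
```

Equal control inputs leave the two sides symmetric, whereas unequal or opposite inputs produce the asymmetric TC-joint signals that underlie turning and curve walking.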

<sup>3</sup>This CPG will show chaotic dynamics if its synaptic weights are set to *W*<sub>11</sub> = −5.5, *W*<sub>22</sub> = 0.0, *W*<sub>12m</sub> = 1.475, *W*<sub>21m</sub> = −1.65 with additional bias terms (*B*<sub>1</sub> = −5.725, *B*<sub>2</sub> = 0.25) projecting to the neurons *C*<sub>1</sub> and *C*<sub>2</sub>, respectively. The chaotic patterns prove behaviorally useful for self-untrapping from a hole in the ground (Steingrube et al., 2010).

Using exteroceptive sensors, like US (**Figure 1**), together with a neural sensory preprocessing network (see the network *N*<sub>2,3</sub> in **Figure A2**) where the network processes the US and provides a final resulting turning signal to the VRNs, allows AMOS II to autonomously avoid obstacles and to escape from a corner and even a deadlock situation (**Video S3**). Currently the network (**Figure A2**) has fixed synaptic weights resulting in a hard-wired anticipatory behavior with a fixed turning angle in front of the obstacles for avoiding them. Instead one could also apply a learning mechanism [e.g., Hebbian learning and synaptic scaling (Tetzlaff et al., 2011)] to adapt the synaptic weights of the network. This would enable AMOS II to learn to anticipate an obstacle and perform different turning behaviors depending on environmental complexity.

Note that the PSN and VRNs have been developed using a neural approach since this allows for adaptation and the use of standard (neural) learning (e.g., backpropagation) to modify the networks' properties and it is also close to biological systems. For example, there is strong evidence for a phase shifting property found in inter-segmental neurons in the connective elements of a cockroach (Pearson and Iles, 1973). Phase relationships between these neurons can change as would be required for emulating the functionality of our PSN. Studies by Akay et al. (2007) show that in stick insect locomotion motorneuron pools are able to not only drive protractor (swing) and retractor (stance) muscle activities but also reverse their activities leading to the change of locomotion directions (e.g., from walking forward to backward and vice versa). The functionality of these motorneuron pools is directly reproduced by our VRN which controls and reverses motor signals. In addition, another specific functionality of the VRN, namely that of regulating the magnitude of the motor signals allowing for different moving speeds, has been already found in another study (Gabriel and Büschges, 2007). This study suggests that in stick insects there are neurons that receive synaptic input, which modifies their activity according to the walking speed of the animal. This input seems specific to only these neurons and it arises via local pre-motor inter-neurons, which could, thus, represent the VRN interneurons as suggested by our network. In addition to this, the PSN and VRNs are generic and transferable. As suggested by their names, the PSN and VRN serve a general purpose (e.g., "phase switching") largely regardless of the robot's specific embodiment. Due to modularity, the PSN and VRN are typically independent of each other in their functioning and do not influence or become influenced by other components. 
Thus, they can be combined to form controllers of different types of robots (Manoonpong et al., 2007, 2008b; Steingrube et al., 2010; Chadil et al., 2011) where they do not require fine tuning for the specific system in which they are employed.

Finally, the outputs of the PSN and VRNs are sent to the motor neurons through delay lines (**Figure A2**). The ipsilateral lag is determined by a delay τ (i.e., 16 time steps or ≈0.6 s) and the phase shift between both left and right sides is given by a delay τ*<sup>L</sup>* (i.e., 48 time steps or ≈2 s). These delays are independent of the CPG signals. This setup leads to biologically motivated leg coordination since the legs on each side perform phase shifted waves of the same frequency (Wilson, 1966). The frequency of the waves is defined by *MI* of the CPG. The connections to the motor neurons are similar to our previous work (Steingrube et al., 2010) except the ones to the FTi-motor neurons. They are modified here (**Figure A2**) to be more similar to insect-like leg movements (Ekeberg et al., 2004; Cruse et al., 2009). **Figure 5** illustrates all leg movements during forward and backward walking. During forward walking, in the swing phase the FTi-joints of the front and middle legs extend while the ones of the hind legs flex. In the stance phase, the FTi-joints of the front legs gradually flex to pull the body forward while the ones of the hind legs gradually extend to also push it forward. For the middle legs, the FTi-joints combine both actions of the FTi-joints of the front and hind legs. They flex rapidly and early during the stance phase in order to pull the body since in this period the legs are at an anterior position [i.e., positive TC-joint angles (**Figure 1C**)]. Afterwards, they stay flexed and then gradually extend in order to push the body since in this period the legs are at a posterior position [i.e., negative TC-joint angles (**Figure 1C**)]. These biologically-inspired leg movements (Ekeberg et al., 2004; Cruse et al., 2009) provide more propelling force, resulting in an increased walking speed of AMOS II by ≈15% compared with the fixed FTi-joint version (Steingrube et al., 2010). These movements are reversed for backward walking. 
We encourage readers to also see the video showing the leg movements of AMOS II at **Video S4**. Since the generated leg movements are independent of other influences, similar movements exist in all gaits. It is important to note that the leg movements shown here, however, are still not completely similar to insect leg movements. This can be further improved by applying additional components, i.e., muscle models (Xiong et al., 2012), to obtain a smoother foot path and to come closer to insect-like leg movements.
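The delay lines described above can be sketched as a simple ring buffer. This is a minimal illustration, not the original implementation: the class `DelayLine` and the helper `delay_signal` are hypothetical names, and the initial fill value of 0.0 is an assumption. Only the delays (16 and 48 time steps) and the time step (≈0.037 s) are taken from the text.

```python
from collections import deque

TAU = 16      # ipsilateral delay in time steps (from the text)
TAU_L = 48    # left-right delay in time steps (from the text)
DT = 0.037    # approximate duration of one time step in seconds (from the text)


class DelayLine:
    """Fixed delay of `steps` time steps (steps >= 1).

    Outputs `fill` until the buffer has been filled once (assumed start-up behavior).
    """

    def __init__(self, steps, fill=0.0):
        self.buffer = deque([fill] * steps, maxlen=steps)

    def step(self, x):
        delayed = self.buffer[0]  # value that entered `steps` calls ago
        self.buffer.append(x)     # the oldest value is dropped automatically
        return delayed


def delay_signal(signal, steps):
    """Return the whole signal shifted by `steps` time steps."""
    line = DelayLine(steps)
    return [line.step(x) for x in signal]


# Ipsilateral lag in seconds: 16 steps * 0.037 s, i.e. about 0.6 s.
ipsilateral_lag_s = TAU * DT
```

Because each leg receives the same signal through a fixed delay, all legs on one side perform phase-shifted waves of the same frequency, regardless of the value of *MI*.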

#### **2.4. LOCAL LEG CONTROL**

While the CPG-based control can in principle generate a multitude of different behavioral patterns and insect-like locomotion (i.e., leg movements and gaits) without sensory feedback, it cannot adapt an individual leg to deal with a change of terrain, a loss of ground contact during the stance phase, or stepping on or hitting an obstacle during the swing phase. Such adaptable locomotion is necessary for traversing rough terrain or climbing over obstacles. To address this issue, we introduce here local leg control consisting of two components: (1) an adaptive neural forward model and (2) elevation and searching control. These two components are applied to each leg of AMOS II (see **Figures 2**, **A2**).

The adaptive neural forward model serves to estimate the walking state. To do so, it transforms a motor signal (i.e., here the CTr-motor signal<sup>4</sup>, an efference copy) into an expected sensory signal that can be compared with the actual incoming one (i.e., here the FC signal of the leg). The forward model consists of only two neurons (**Figure 6A**). The neuron *F* transforms the motor signal while the neuron *P* performs postprocessing. We construct the neuron *F* as a hysteresis element (Pasemann, 1993) using a single recurrent neuron with synaptic plasticity (described in detail below) and the postprocessing neuron *P* as a standard one (see Equation 1) with a tanh transfer function. Note that this postprocessing neuron *P*, with its large fixed presynaptic weight (i.e., 10.0), basically sharpens the transformed motor signal so that it matches the FC signal.

Due to a delay in the relation between the FC signal and the CTr-motor signal, a simple thresholding method cannot be applied for signal transformation. Therefore, we use the single recurrent neuron instead, since this is a simple neural mechanism providing dynamical properties (e.g., hysteresis effect) that can smooth the motor signal and at the same time provide the delay in the input–output relation required to transform the motor signal into the expected sensory signal. The activation function of this neuron is given by:

$$a\_F(t) = W\_R(t) o\_F(t-1) + W\_I(t)I(t) + B(t), \tag{5}$$

where *I* is the input of the neuron, which is here the CTr-motor signal coming from the CPG-based control. *oF* is the output of the neuron given by the tanh transfer function, i.e., *oF* = tanh(*aF*) ∈ [−1, 1]. *WR*, *WI*, and *B* are the recurrent weight, the presynaptic weight, and the bias term of the neuron, respectively. These parameters need to be adjusted to obtain a proper hysteresis loop for the signal transformation. Therefore, we employ a gradient descent learning rule to adapt them. In principle, the rule attempts to minimize the error *E* between the target output *T* and the actual output *oF* of the neuron through gradient descent. The error is measured as:

$$E(t) = \frac{1}{2}(T(t) - o\_F(t))^2. \tag{6}$$

<sup>4</sup>We use the CTr-motor signal instead of the TC- and FTi-motor signals since its pattern is close to the FC signal.

consisting of recurrent and non-recurrent neurons. **(B)** Changes of the parameters of the model of the right front leg (R1). **(C)** The hysteresis effect between the input and output signals of the forward model of R1 where the converged parameters are used (see **B**). In this situation, the input varies between −1.0 and 1.0. Consequently, the output will gradually show high activation (≈ +1.0) when the input increases to a value above −0.55. The output will show low activation (≈ −1.0) when the input decreases below −0.715. **(D)** The CTr-motor signal of R1 which is the input of the neuron *F*. Its high activation drives the leg to swing (i.e., swing phase) while its low […] output of the postprocessing neuron *P* is used to compare to the foot contact signal for estimating the walking state. **(F)** The output of the neuron *F* or the transformed motor signal. **(G)** The foot contact signal of R1. It is filtered and mapped onto the interval [−1, +1] where +1 means that the leg has no ground contact and −1 means that it touches the ground. Dashed lines are provided for comparison. Note that the parameter changes of the forward models of the other legs show similar patterns. Their convergence was achieved after about eight to twenty walking steps. The parameters converged at slightly different values, resulting in slightly different hysteresis loops. One time step is ≈0.037 s.

In this study, we use the filtered FC sensor signal, linearly mapped onto the interval [−1*,* 1], as the target output. According to the learning rule, the parameters (*WR*, *WI*, and *B*) are updated every time step (≈0.037 s) in proportion to the gradient and given as follows:

$$
\Delta W\_R = -\mu \frac{\partial E}{\partial W\_R} = \mu (T(t) - o\_F(t))(1 - o\_F(t)^2) o\_F(t - 1), \tag{7}
$$

$$
\Delta W\_I = -\mu \frac{\partial E}{\partial W\_I} = \mu (T(t) - o\_F(t))(1 - o\_F(t)^2)I(t), \tag{8}
$$

$$
\Delta B = -\mu \frac{\partial E}{\partial B} = \mu \left( T(t) - o\_F(t) \right) \left( 1 - o\_F(t)^2 \right), \tag{9}
$$

where μ is the learning rate, which is set to a small positive value, e.g., 0.01. For the training process, we initialize the neural activity and output states of the forward model to 0.0 and *WR*, *WI*, and *B* to 1.0. Because this neural system is so simple, the process can run online. We implemented six forward models on AMOS II, one for each leg. Afterwards, we let AMOS II walk in a normal condition (i.e., walking on a floor with a certain gait). The training process stops as soon as the difference between the filtered FC signal and the postprocessed neural output *oP* is smaller than a threshold, e.g., 0.05, over a certain period of time (e.g., 500 time steps). We performed the training process only once and only for the normal walking condition. This walking condition is used as a reference against which other walking conditions on any terrain are compared.
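Equations 5–9 together with the initialization given above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code running on AMOS II: the class and function names are hypothetical, and the postprocessing neuron *P* is assumed to have no bias term (the text only states its presynaptic weight of 10.0 and its tanh transfer function).

```python
import math


class ForwardModelNeuron:
    """Single recurrent neuron F with online gradient descent (Equations 5-9)."""

    def __init__(self, learning_rate=0.01):
        # Initial values as stated in the text: parameters 1.0, output state 0.0.
        self.w_r = 1.0      # recurrent weight W_R
        self.w_i = 1.0      # presynaptic (input) weight W_I
        self.b = 1.0        # bias term B
        self.o_prev = 0.0   # o_F(t - 1)
        self.learning_rate = learning_rate

    def step(self, motor, target=None):
        """Advance one time step; adapt the parameters if a target is given."""
        a = self.w_r * self.o_prev + self.w_i * motor + self.b  # Equation 5
        o = math.tanh(a)
        if target is not None:
            # Common factor mu * (T - o_F) * (1 - o_F^2) of Equations 7-9.
            delta = self.learning_rate * (target - o) * (1.0 - o * o)
            self.w_r += delta * self.o_prev  # Equation 7
            self.w_i += delta * motor        # Equation 8
            self.b += delta                  # Equation 9
        self.o_prev = o
        return o


def postprocess(o_f, weight=10.0):
    """Postprocessing neuron P: tanh with a large fixed weight (bias assumed 0)."""
    return math.tanh(weight * o_f)
```

In use, `motor` would be the CTr-motor signal and `target` the filtered FC signal mapped onto [−1, 1]; calling `step` without a target corresponds to running the converged model after training has stopped.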

**Figure 6B** illustrates the parameter changes of the forward model of, e.g., the right front leg (R1, **Figure 1A**) during training. The training process was set to start after 500 time steps (or around four walking steps) and the parameters (*WR*, *WI*, and *B*) converged after around 1300 steps (or around seven walking steps). The resulting parameters lead to a proper hysteresis loop (**Figure 6C**). Utilizing this hysteresis property together with the neural postprocessing, the CTr-motor signal is finally transformed into the expected FC signal (**Figures 6D–G**). In this example, AMOS II walked with a slow wave gait (i.e., *MI* = 0*.*02). It is important to note that the models of all legs that adapted to this gait can be directly applied to other gaits.

After training, the output of each trained forward model (i.e., the expected FC signal, **Figure 6E**) is compared with the actual incoming FC signal of the leg (**Figure 6G**). The difference Δ (**Figure 6A**) between them determines the walking state, where a positive value (+Δ) means losing ground contact during the stance phase and a negative one (−Δ) means stepping on or hitting obstacles during the swing phase. Thus, we use the positive value for searching control (**Figure 7A**). The value is accumulated through a recurrent neuron *S* with a linear transfer function and always reset to 0.0 at the beginning of the swing phase. The output of this neuron *oS* with significant change (e.g., *oS* > 0.15) controls vertical shifting of the CTr- and FTi-joints. Consequently, these joints are shifted when the positive difference occurs; thereby, the respective leg searches for a foothold. This searching control only occurs in the stance phase. On the other hand, we use the negative value for elevation control (**Figure 7B**). The value is also accumulated through a recurrent neuron *E* with a linear transfer function. The output of this neuron with significant change<sup>5</sup> (e.g., *oE* < −15) shifts the CTr- and FTi-joint movements upwards. At the same time, the TC-joint movement is briefly inhibited. As a consequence, the leg is elevated, thereby avoiding an obstacle or freeing itself from the obstacle. This elevation control only occurs in the swing phase. Note that the IR sensors installed at the legs (**Figure 1B**) can also be used for elevation control. This allows the legs to avoid hitting a large obstacle in front (**Video S5**).
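The error accumulation described above can be sketched as follows. Several details are assumptions made for illustration: the difference is taken as actual minus expected FC signal (inferred from the sign convention, with +1 meaning no ground contact), the recurrent weight of both linear neurons is assumed to be 1.0, and the reset of the elevation neuron at the start of the stance phase is hypothetical, since the text only specifies the reset of the searching neuron. The class name is also hypothetical.

```python
class LegErrorAccumulator:
    """Accumulates positive and negative forward-model errors for one leg."""

    def __init__(self, search_threshold=0.15, elevation_threshold=-15.0):
        self.search_threshold = search_threshold        # example value from the text
        self.elevation_threshold = elevation_threshold  # example value from the text
        self.o_s = 0.0  # output of the searching neuron S
        self.o_e = 0.0  # output of the elevation neuron E

    def step(self, expected_fc, actual_fc, swing_started=False, stance_started=False):
        """Update both accumulators; return (searching, elevating) flags."""
        if swing_started:
            self.o_s = 0.0  # reset stated in the text
        if stance_started:
            self.o_e = 0.0  # assumed reset, not stated in the text
        delta = actual_fc - expected_fc  # assumed sign convention
        self.o_s += max(delta, 0.0)      # positive part drives searching control
        self.o_e += min(delta, 0.0)      # negative part drives elevation control
        searching = self.o_s > self.search_threshold
        elevating = self.o_e < self.elevation_threshold
        return searching, elevating
```

With these example thresholds, a single time step without expected ground contact already triggers searching, whereas elevation requires the negative error to persist over several time steps, in line with the high elevation threshold motivated in the footnote.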

To illustrate the functionality of the searching control and clearly observe leg motion, we activated one leg [e.g., right middle leg (R2)] and fixed the other legs at a certain position. Afterwards, we changed the ground level during the stance phase. Changing it causes different positive errors (+Δ) due to the mismatch between the expected FC signal and the actual incoming one. The error is accumulated through the recurrent neuron *S*. If the accumulated error (**Figure 7C**) is higher than the threshold, the searching controller drives the CTr- and FTi-joints to depress the leg and at the same time extend the tibia, respectively. This results in searching for a foothold. Note that the TC-joint motion is not influenced. All joint angles of the leg in this experiment are shown in **Figures 7D–F**. We encourage readers to also see the video of this experiment at **Video S6**.

To illustrate the functionality of the elevation control and clearly observe leg motion, we also activated only one leg [e.g., right middle leg (R2)] and fixed the other legs at a certain position. In addition, we inhibited the searching control such that the leg could not search for a foothold. This makes the changes of the joint angles easier to see and understand. To force elevation of the leg, we made the foot touch an obstacle during the swing phase. This causes negative errors (−Δ) that are accumulated through the recurrent neuron *E*. If the accumulated error (**Figure 7G**) falls below the negative threshold, the elevation controller inhibits the TC-joint to stop the forward motion of the leg and at the same time drives the CTr- and FTi-joints to elevate the leg and fully extend the tibia, respectively. This results in the elevation of the leg, thereby freeing it from the obstacle during the swing phase. After the leg is freed from the obstacle, the TC-, CTr-, and FTi-joints immediately return to their unaltered positions. Since the process occurs in a very short time, the gait does not break down (see section 3). All joint angles of the leg in this experiment are shown in **Figures 7H–J**. We encourage readers to also see the video of this experiment at **Video S5**.

## **3. RESULTS**

In the previous sections, we showed the individual functionalities and performances of the CPG-based control and the local leg control in part. Here, we present experiments carried out to assess the ability of their combination (i.e., adaptive neural locomotion control, **Figure 2**). The first experiment investigated energy-efficient gaits for different terrains. To do so, we categorized terrains into four different groups: hard terrain (e.g., floor, pavement), loose terrain (e.g., fine gravel), rough terrain (e.g., gravel), and vegetated terrain (e.g., grass).

For each of these terrain groups, we let AMOS II walk from slow to fast gaits by manually increasing *MI* of the CPG. During locomotion, the local leg control autonomously adapted the legs for a foothold. Thus, in this experiment, the CPG-based control and the local leg control function as open-loop control and closed-loop control, respectively. We calculate the electric energy consumption of each walking pattern as:

$$E = IVt, \tag{10}$$

where *I* is the average electric current in amperes used by the motors while walking 1 m. It is measured using the Zap 25 CS installed inside AMOS II. *V* is the voltage (here 5 V) and *t* is the time in seconds for the travel distance (here 1 m). **Figure 8** shows the energy consumptions measured in these four terrain groups, where the measurement of each group was repeated five times.

<sup>5</sup>Here, we use a high threshold value for controlling the elevation since a minor disturbance can be handled by passive mechanisms (spring and passive couplings) installed at the leg. Using a small threshold value might lead to an unnecessary elevation of the leg resulting in unstable motion.

**Figures 8A,B** suggest using the *MI* values of 0.04 and 0.06 which generate a fast wave gait and a tetrapod gait on loose and rough terrains, respectively. **Figures 8C,D** suggest using the *MI* value of 0.19 which produces a fast tripod gait on hard and vegetated terrains. Note that AMOS II started to slip when the value of *MI* was higher than 0.19 for hard and vegetated terrains and it got stuck most of the time when the *MI* values were higher than 0.16 and 0.10 for rough and loose terrains, respectively. This experimental result reveals that each terrain group requires a specific gait which leads to the lowest energy consumption. This allows mapping the four terrain groups to the energy-efficient gaits.
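Equation 10 and the selection of the gait with the lowest energy consumption can be sketched as follows. The function names and the structure of the `measurements` dictionary are hypothetical; only the formula and the supply voltage of 5 V are taken from the text, and no measured values from **Figure 8** are reproduced here.

```python
def electric_energy(current_a, voltage_v, time_s):
    """Equation 10: E = I * V * t, in joules for amperes, volts, and seconds."""
    return current_a * voltage_v * time_s


def most_efficient_mi(measurements, voltage_v=5.0):
    """Return the MI value with the lowest energy for walking 1 m.

    `measurements` maps each tested MI value to a pair
    (average current in A, time in s needed for 1 m).
    """
    return min(
        measurements,
        key=lambda mi: electric_energy(measurements[mi][0], voltage_v, measurements[mi][1]),
    )
```

A faster gait draws more current but covers the distance in less time, so the minimum of the product depends on the terrain; applying `most_efficient_mi` to the measurements of one terrain group yields the *MI* value to which that terrain is mapped.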

The second experiment employed the identified energy-efficient gaits together with the visual terrain classification system (described in section 2.3) to allow AMOS II to autonomously perform energy-efficient locomotion while traversing the different terrains. The output of the visual terrain classification system provides terrain information. This information was used as the preprocessed sensory input to set *MI* of the CPG, thereby triggering the corresponding pre-mapped energy-efficient gait. In this way, the experiment reflects a complete neural closed-loop system (**Figure 2**). The experimental result is shown in **Figure 9**.
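The mapping from the classified terrain to the modulatory input can be sketched as a lookup table. The *MI* values are those reported for the first experiment; the terrain labels, the names `ENERGY_EFFICIENT_MI` and `select_mi`, and the fallback of keeping the current value for an unknown label are assumptions made for illustration.

```python
# MI values reported for the four terrain groups in the first experiment.
ENERGY_EFFICIENT_MI = {
    "loose": 0.04,      # fine gravel, fast wave gait
    "rough": 0.06,      # gravel, tetrapod gait
    "hard": 0.19,       # floor or pavement, fast tripod gait
    "vegetated": 0.19,  # grass, fast tripod gait
}


def select_mi(terrain_label, current_mi):
    """Return the pre-mapped MI for a terrain label.

    Keeping the current MI for an unknown label is an assumed fallback.
    """
    return ENERGY_EFFICIENT_MI.get(terrain_label, current_mi)
```

For example, a classifier output of `"rough"` would set *MI* to 0.06 and thereby switch the CPG to the tetrapod gait.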

It can be seen that at the beginning AMOS II walked with a fast wave gait (photo 1) since it detected fine gravel (loose terrain) using its visual system. Afterwards, it changed from the wave gait to a tetrapod gait (photo 2) since it detected gravel (rough terrain). Finally, it used a fast tripod gait (photo 3) on the floor (hard terrain). While traversing the different terrains, AMOS II adapted its legs individually to deal with a change of terrain. That is, it depressed its leg and extended its tibia to search for a foothold when losing ground contact during the stance phase. The loss of ground contact is detected by a significant change of the positive accumulated error (*oS*, see black line in **Figure 9C**). However, during the swing phase no leg elevation was observed (i.e., no significant change of the negative accumulated error *oE*, see red line in **Figure 9C**) since only minor perturbations occurred, which were handled by the passive components of the leg. We encourage readers to see the video of this experiment at **Video S7**. Another test in an outdoor environment where AMOS II walked from gravel to grass can be seen in **Figure A4**. In addition to the energy-efficient and adaptable locomotion emphasized in this experiment, the basic leg movements of AMOS II and the gaits used follow insect locomotion. Thus, this experiment demonstrates insect-like, energy-efficient, and adaptable locomotion of walking machines like AMOS II.

The third experiment focused on both leg elevation and foothold searching of AMOS II to deal with small obstacles. In this scenario, we let AMOS II walk with a certain pattern [e.g., a slow wave gait (*MI* = 0.02)] and placed small obstacles (≈2.5 cm height) on its path. The experimental result is shown in **Figure 10**. It can be seen that, while walking forward, the foot of the right front leg (R1) of AMOS II hit an obstacle during the swing phase (photo 1), thereby preventing the leg from completing the phase. This leads to a significant change of the negative accumulated error *oE* (**Figure 10A**). As a consequence, AMOS II elevated the leg to free it from the obstacle (photo 2). Afterwards, it placed the leg on top of the obstacle without getting stuck (photo 3). Due to the difference in ground level, this causes a significant change of the positive accumulated error *oS* (**Figure 10B**). AMOS II then lowered the leg further to ensure ground contact. After a few steps, the leg again lost ground contact during the stance phase (photo 4), resulting in searching for a foothold (photo 5). Finally, AMOS II successfully walked away from the obstacles. This experiment reveals that, using this leg adaptation mechanism, AMOS II can effectively locomote on terrain with small obstacles without getting stuck. We encourage readers to also see the video of this experiment at **Video S8**.

**FIGURE 9 | Real-time data of energy-efficient and adaptable locomotion on three different terrains. (A)** The output of the online terrain classification system which is a preprocessed visual sensory signal. **(B)** The modulatory input *MI* of the CPG which is directly controlled by the sensory signal. It was set to 0.04 (fast wave gait), then 0.06 (tetrapod gait), and finally 0.19 (fast tripod gait). **(C)** The positive (*oS*) and negative (*oE*) accumulated errors (**Figures 7A,B**). They control leg adaptation to deal with different terrains. **(D–F)** The TC-, CTr-, and FTi-joint angles of the right middle leg (R2) during walking from fine gravel (loose terrain) to gravel (rough terrain) to floor (hard terrain). They represent the leg movement including adaptation. **(G)** Gait diagram showing the different energy-efficient gaits of AMOS II while traversing the terrains. Black boxes indicate swing phase while white areas between them indicate stance phase. For abbreviations, see **Figure 1**. The upper pictures show snap shots from the camera on AMOS II used for the terrain classification while walking. The lower pictures show snap shots of locomotion of AMOS II during the experiment. Note that one time step is ≈0.037 s.

angles of the right front leg (R1) during walking on the floor with small obstacles (≈2.5 cm height). They represent the leg movement including […] and searching actions, respectively. Note that one time step is ≈0.037 s.

The fourth experiment was to show that the adaptive neural locomotion control not only generates insect-like, energy-efficient, and adaptable locomotion of AMOS II (as shown above) but also allows it, with the help of its BJ, to climb over a large obstacle. To do so, we placed AMOS II on rough terrain (i.e., soil with stones) with an 11 cm high obstacle in front. The task of AMOS II was to move forward and climb over the obstacle. For this experiment, the CPG-based control generated a basic walking pattern [e.g., a slow wave gait (*MI* = 0.02)] while the local leg control adapted the legs individually for foothold searching and elevation, thereby enabling effective locomotion and supporting the body of AMOS II during climbing. Note that the slow wave gait was used in this experiment because it is the most effective gait for climbing, which allows AMOS II to negotiate the highest climbable obstacle (13 cm height which equals 75% of its leg length) [see Goldschmidt et al. (2012) for details]. In addition to the locomotion control, reactive BJ control was also applied to control the BJ for climbing [see Goldschmidt et al. (2012) for details]. The controller produces an abstraction of body flexion observed in cockroach climbing. It controls the BJ to lean upwards to surmount obstacles and to bend downwards for stable climbing. The downward motion also appears in cockroach climbing, whereas the upward motion does not. Instead of leaning the body flexion joint upwards as AMOS II does, a cockroach extends its front and middle legs to raise its reaching height to surmount obstacles, thereby rearing its entire body to a taller pose. Here, we used the US at the front body part of AMOS II (**Figure 1A**) for obstacle detection and BJ control. **Figure 11** presents the experimental result.

In the first period (0–500 time steps), the local leg control was deactivated. Due to the rough terrain, the feet could not properly touch the ground during the stance phase; thus, AMOS II could not move forward (photo 1). After 500 time steps, the local leg control was activated. It allows for foothold searching, thereby adapting locomotion to the terrain. As a result, AMOS II moved forward. As AMOS II approached the obstacle, the US detection activated the BJ control such that the BJ leant upwards (photo 2). Due to a time-out period after leaning upwards, the BJ moved downwards to ensure stability while climbing (photo 3). During climbing, a hind leg [e.g., left hind leg (L3), photo 4] lowered downwards, showing leg extension, to support the body. Finally, AMOS II successfully locomoted on rough terrain and surmounted the 11 cm high obstacle (photo 5). We encourage readers to also see the video of this experiment at **Video S9**. Besides this experimental result, it is important to note that both the adaptive locomotion and reactive BJ controllers have a distributed implementation, but they are indirectly coupled by sensory feedback and the physical components of AMOS II. In this way, the combined neural control network driven by the sensor signals synchronizes leg and BJ movements for stable walking and climbing.

during walking. It leant upwards and then bent downwards during climbing. **(C–E)** The TC-, CTr-, and FTi-joint angles of the left hind leg (L3) […] show snap shots of the locomotion of AMOS II during the experiment. Note that one time step is ≈0.037 s.

The final experiment was to illustrate that the adaptive neural locomotion controller can adapt the remaining legs to deal with leg damage. In this experiment we let AMOS II walk with a slow wave gait (*MI* = 0.02) and then disconnected the power connector of the motor of a leg joint such that the joint became inactive (i.e., uncontrollable). This simulates leg damage. After the damage, we placed AMOS II on top of an object to observe the adaptation of the remaining legs that allows AMOS II to continue moving forward. **Figure 12** presents the experimental result.

As shown in **Figure 12**, AMOS II walked in a normal walking condition at the beginning (photo 1). During walking, we disconnected the motor power connector of the FTi-joint of the left middle leg (photo 2) such that the joint became inactive. Then we also tilted the tibia upward so that the foot could not touch the ground properly. This results in the leg adaptation to search for a foothold (photo 3). Afterwards, we placed AMOS II on top of a 3.5 cm high object (photo 4). Since AMOS II was on the object, its legs lost ground contact. AMOS II adapted its legs to search for a foothold (see, e.g., the FTi- and CTr-joint signals of the right middle leg in **Figures 12E,F**). As a result, it successfully climbed down from the object and continued walking forward (photo 5). The ability of leg adaptation was mainly achieved by the local leg control mechanisms. These mechanisms even allow AMOS II to climb down from an object 7 cm high. Without them, AMOS II got stuck on the object. We encourage readers to see the video of this experiment at **Video S10**. This experimental result reveals that the developed adaptive neural locomotion controller can not only generate versatile locomotion behaviors including climbing (shown in the other experiments) but also give robustness to the system by allowing it to cope with damage.

**(A)** The filtered foot contact (FC) signal of the left middle leg (L2) where +1 means that the leg has no ground contact and −1 means that the leg touches the ground. **(B–D)** The TC-, CTr-, and FTi-joint angles of L2. **(E,F)** The CTr- and FTi-joint angles of the right middle leg (R2). The joint adaptation was controlled by the negative (*oE*) and positive (*oS*) accumulated errors (**Figures 7B,A**). The changes of the […] slow wave gait (*MI* = 0.02, **Figure 10F**). The lower pictures show snap shots of the locomotion of AMOS II during the experiment. The dashed line indicates the time at which the motor power connector of the FTi-joint of L2 was disconnected. The red area indicates the time during which AMOS II was on a 3.5 cm high object. Note that one time step is ≈0.037 s.

## **4. DISCUSSION**

Here, we briefly discuss some remaining issues concerning the six-legged walking machine AMOS II and its controller, because most of the relevant discussion points have been treated in the above sections.

AMOS II was used as an experimental platform and represents an embodied neural closed-loop system with many degrees of freedom. It was designed with a morphology analogous to a cockroach. It was constructed in a straightforward way as a biomechatronic system consisting of several sensors and actuators. Extra rubber coupling elements and springs integrated into the joints and tibiae of AMOS II yield passive compliance, allowing AMOS II to deal with minor disturbances during locomotion over rough terrain (as described in the second experiment). The joint compliance also enables AMOS II to passively flex its legs to avoid damage when the environment changes (**Video S11**). Besides the physical components of AMOS II that follow the biomechanics of walking animals, another special trait of AMOS II is that we configured the ranges of the joint movements such that it has a very low center of mass (i.e., low ground clearance) and its body falls to the ground before taking the next step during normal walking. When negotiating a large obstacle, AMOS II uses its BJ together with additional reactive BJ control (Goldschmidt et al., 2012) to climb over it while its leg movements automatically adapt accordingly (**Video S12**).

In fact, the advantage of low ground clearance is evident in the case of leg damage. In this situation, a robot with high ground clearance will tip over or drop considerably (**Figure A5A**), leading to unstable locomotion, and the remaining legs need to carry more load. Thus, the motors need to produce high torque to carry the load, resulting in high power consumption (**Figure A5B**). Furthermore, the legs might have difficulty swinging during the swing phase (**Figures A5C,D**); thereby, the robot will not move forward properly (**Figure A5G** and **Video S10**). In contrast, with low ground clearance the robot will not drop much (**Figure A5A**) since its body is already close to the ground, and the remaining legs need not carry much more load, leading to lower power consumption compared to the high ground clearance case (**Figure A5B**); they are also able to swing during the swing phase (**Figures A5E,F**). As a result, the robot can still move better along a straight path (**Figure A5H** and **Video S10**). However, the drawback of low ground clearance is that the robot could often get stuck when walking on non-flat terrain. Accordingly, while walking over rough terrain AMOS II lifts its body up to obtain higher ground clearance such that it does not get stuck. Lifting the body up is done automatically by shifting the center of the CTr-joint angles downwards (more depression) and the center of the FTi-joint angles upwards (more extension); these are the default joint movements for rough terrain. By contrast, most walking machines (Lee et al., 2006; Spenneberg and Kirchner, 2007; Lewinger and Quinn, 2009) always perform locomotion with high ground clearance (**Video S12**). Although such a high ground clearance walking strategy could make it simpler for the controller to deal with different terrains, it might lead to instability of the system (as described above) unless additional control mechanisms are applied (Spenneberg et al., 2004).
In fact, the biologically-inspired locomotion strategy of AMOS II arises not only from biomechanics but is a combination of its biomechanics and adaptive neural locomotion control. While the biomechanics allows for leg and body movements as well as provides some degree of disturbance rejection, the adaptive neural locomotion controller generates versatile motions and adaptation.

The controller consists of two main parts: CPG-based control and local leg control. The CPG-based control is an improved version of our original chaotic CPG-based controller [compare **Figure A2** in Steingrube et al. (2010) with **Figure A2** of this paper]. Two main components of the controller have been modified here while the other parts remain unchanged. First, we replaced the chaotic CPG by a simpler CPG mechanism with neuromodulation. As a consequence, by exploiting the neural dynamics of the new CPG mechanism, we can generate a multitude of walking patterns (e.g., 20 patterns). Some of these patterns are comparable to insect gaits (Wilson, 1966) and allow for energy-efficient locomotion on different terrains, such as fine gravel (loose terrain), gravel (rough terrain), and grass (vegetated terrain). The new CPG also switches between the patterns faster than the chaotic CPG. Second, for the motor connections, we modified the connections to the FTi-motor neurons such that the FTi-joints are activated during walking, while in the previous work these joints were inhibited; i.e., they stayed in a flexed position. The introduced FTi-joint movements are inspired by insect leg movements (Ekeberg et al., 2004; Cruse et al., 2009). During the stance phase of forward walking, the FTi-joints of the front legs flex inward, those of the hind legs extend outward, and those of the middle legs combine these two movements by first flexing and then extending. As a consequence, the front, hind, and middle legs pull, push, and pull and push the body forward, respectively. This results in faster walking speed compared with the fixed FTi-joint version. This CPG-based control coordinating all joints can be considered as open-loop control since in principle it does not require any sensory feedback for locomotion generation (i.e., multiple patterns and insect-like leg movements). However, the loop can simply be closed by using, e.g., exteroceptive sensory feedback to generate stimulus-induced behavior (such as phototropism and obstacle avoidance) as well as to autonomously select an energy-efficient gait with respect to the terrain.
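As an illustration of how a single modulatory input can change the rhythm of a small recurrent oscillator, the following minimal sketch simulates a two-neuron network with tanh transfer functions in which *MI* is added to the cross-coupling weights. The self-connection weight (1.4), the base coupling (0.18), the zero bias, and the initial state are assumptions chosen for this sketch and are not taken from this section; the function names are hypothetical.

```python
import numpy as np


def simulate_cpg(mi, steps=3000, w_self=1.4, w_base=0.18):
    """Simulate a two-neuron tanh oscillator; `mi` scales the cross-coupling.

    All weight values are illustrative assumptions, not the published ones.
    """
    w_cross = w_base + mi
    weights = np.array([[w_self, w_cross],
                        [-w_cross, w_self]])
    state = np.array([0.1, 0.1])  # small nonzero start (assumed)
    outputs = np.empty((steps, 2))
    for t in range(steps):
        state = np.tanh(weights @ state)
        outputs[t] = state
    return outputs


def count_cycles(signal):
    """Count upward zero crossings as a proxy for completed cycles."""
    return int(np.sum((signal[:-1] < 0) & (signal[1:] >= 0)))


slow = count_cycles(simulate_cpg(0.05)[:, 0])
fast = count_cycles(simulate_cpg(0.19)[:, 0])
print("cycles at MI=0.05:", slow, "| cycles at MI=0.19:", fast)
```

In this sketch a larger *MI* rotates the network state faster, so more cycles are completed in the same number of time steps; this mirrors, in a simplified way, the frequency change used for gait selection.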

In contrast to the CPG-based control, the local leg control introduced here for the first time employs proprioceptive sensory feedback (i.e., here only FC sensors) for adaptable locomotion. Thus, it can be considered as closed-loop control. It has two components applied independently to each leg of AMOS II: (1) an adaptive forward model with efference copy and (2) searching and elevation control. The forward model is constructed by using a simple hysteresis neuron with a recurrent connection. It can learn online to transform the CTr-motor signal (efference copy) into the expected FC signal. While the forward model is minimal and sufficient here, one could combine several of them to obtain different forward models for different purposes, e.g., sensory noise cancelation and slope detection (Manoonpong and Wörgötter, 2009), or use them for designing non-linear filters (Manoonpong et al., 2010). Because our controller is modular, one could, if desired, replace this simple hysteresis neuron by more complex neural networks [e.g., reservoir computing networks (Dasgupta et al., 2012)] for transforming motor commands into complex expected sensory signals.

Our forward model presented here can be considered as an adaptive predictor that can learn to predict the sensory consequences (expected sensory feedback) of motor commands (efference copy) (Kawato, 1999). The expected sensory feedback (or transformed motor command) is then compared with the actual FC signal for the walking state estimation. The sensory prediction error enables AMOS II to determine whether its leg loses ground contact during the stance phase or hits or steps on an obstacle during the swing phase. Afterwards, this information is used to adapt the leg accordingly through the searching and elevation control. The adaptive leg motions (i.e., searching and elevation motions) follow the locomotion observed in certain insects, such as locusts (Pearson and Franklin, 1984), cockroaches (Tryba and Ritzmann, 2000), and stick insects (Fischer et al., 2001), during walking on rough terrain. As a result, employing closed-loop local leg control mechanisms with the forward models allows AMOS II not only to successfully traverse rough terrain and climb over large obstacles, but also to cope with leg damage.
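The comparison between expected and actual foot contact can be sketched as follows. This is a simplified illustration, not the neural implementation of the paper: the foot contact signals are assumed to be 1 for contact and 0 for no contact, the error is taken as expected minus actual, and the accumulators use a hypothetical recurrent weight of 0.9. The function name `leg_adaptation` is likewise an assumption.

```python
def leg_adaptation(expected_fc, actual_fc, recurrent_weight=0.9):
    """Accumulate foot contact prediction errors for one leg.

    A positive error (contact expected, none sensed) drives searching;
    a negative error (contact sensed, none expected) drives elevation.
    The recurrent weight is an assumed value for this sketch.
    """
    search, elevate = 0.0, 0.0
    search_trace, elevate_trace = [], []
    for expected, actual in zip(expected_fc, actual_fc):
        error = expected - actual
        search = recurrent_weight * search + max(error, 0.0)
        elevate = recurrent_weight * elevate + max(-error, 0.0)
        search_trace.append(search)
        elevate_trace.append(elevate)
    return search_trace, elevate_trace


# Expected contact: four stance steps followed by four swing steps.
expected = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# Case 1: the leg loses ground contact for two steps during stance.
lost_ground = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
search_1, elevate_1 = leg_adaptation(expected, lost_ground)

# Case 2: the leg touches an obstacle for two steps during swing.
hit_obstacle = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0]
search_2, elevate_2 = leg_adaptation(expected, hit_obstacle)

print("max search (lost ground):", max(search_1))
print("max elevation (hit obstacle):", max(elevate_2))
```

In the first case only the searching accumulator grows, and in the second only the elevation accumulator grows, which is the separation of the two adaptive leg motions described above.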

Besides the special features described above, our adaptive neural locomotion controller also combines three key aspects found in animal locomotor control: a central mechanism (CPGs) (Meyrand et al., 1991; Katz, 1998; Harris-Warrick, 2011), sensory feedback (afferent-based control) (Cruse et al., 2009), and internal forward models with efference copies (efferent-based control) (Holst and Mittelstaedt, 1950; Cruse et al., 1998; Bläsing and Cruse, 2004). In particular, our CPG-based control or central mechanism for versatile locomotion generation relies on a CPG mechanism with neuromodulation that is inspired by the function of neural CPG circuits found in lobsters (Selverston et al., 1993; Pulver and Marder, 2002) and the mollusc *Tritonia diomedea* (Katz et al., 1994). These biological findings suggest that extrinsic and intrinsic neuromodulatory inputs to the CPG circuits can alter the cellular and synaptic properties of neurons in the circuits. Thereby, these inputs modify the output of the CPG, leading to behavioral flexibility and different locomotion modes. This process can be achieved on the fly, resulting in the ongoing adaptation of behavior to environmental changes. Our local leg control mechanisms based on sensory feedback (afferent-based control) and adaptive neural forward models with efference copies (efferent-based control) for state estimation and adaptable locomotion follow the evidence of forward model predictions with sensory feedback in the stick insect *Aretaon asperrimus*. This evidence shows that, during climbing over very large gaps, the stick insects perform an immediate change in the stepping pattern of the legs when losing ground contact at the end of the swing phase (Bläsing and Cruse, 2004). This would reflect an expectation of regular ground contacts. Other results supporting the idea of forward model predictions (Cruse et al., 1998) indicate that, during the swing phase of the stick insects, reactions to obstacles depend on an internal state.

While these three key aspects are essential for locomotion control, only some works have taken them into account for developing locomotion control in simulation (Kuo, 2002; Dürr et al., 2003). Only a few have successfully applied them to a real system, but with small numbers of inputs and outputs and behavioral restrictions (Lewis and Simo, 2001; Lewis and Bekey, 2002), thereby reducing the sensor-motor coordination problem substantially. Most studies use a combination of several CPGs and sensory feedback to generate different walking behaviors (Beer et al., 1997; Harischandra et al., 2011) including reflexes (Kimura et al., 2007; Spenneberg and Kirchner, 2007; Lewinger and Quinn, 2011; von Twickel et al., 2012). The reflexes driven only by sensory feedback result in searching and elevation actions when losing ground contact and hitting an obstacle, respectively. However, due to the lack of forward model predictions (internal state), this control approach has difficulty generating reactions that let walking machines avoid an obstacle when stepping on it during the swing phase, as stick insects do. Another interesting approach, "Walknet" (Cruse et al., 2007), has no central control unit. Instead, it uses a decentralized control architecture with local coordination rules that depend strongly on different types of proprioceptive sensory feedback, e.g., FC, joint angle, and joint angular velocity signals, to determine an internal state and generate basic locomotion and adaptation. However, this mechanism malfunctions when the sensory information is lost; it is therefore less robust.

In contrast to this, our adaptive neural locomotion controller based on a modular structure is robust and has fault tolerance capabilities. Damage to a part of the system can result in a loss of some of the abilities of the system, but the whole system can still function partially (see the leg damage experiment in **Figure 12**). Its modules (**Figures 2**, **A2**) generally have a simpler structure than the network as a whole. Thus, their functions and dynamics can be analyzed by observing the input/output relationship of an individual module (Manoonpong et al., 2007, 2008b). Its individual modules have been used in earlier studies and successfully provided partial solutions for different walking machines (Manoonpong et al., 2007, 2008b). Furthermore, the controller, using a single CPG, sensory feedback, and forward model predictions providing an internal state, can generate a multitude of walking patterns (e.g., 20 walking patterns), insect-like leg movements, energy-efficient locomotion, and adaptable locomotion (such as searching and elevation actions including reactions when stepping on an obstacle during the swing phase). It can also handle leg damage and even generate cockroach-like climbing behavior (**Video S12**) when additional reactive BJ control is applied (Goldschmidt et al., 2012). The controller can also be simply transferred to another six-legged walking machine having a different morphology but leg lengths of similar proportion to AMOS II. In this case, the internal network structure and parameters of its CPG-based control (**Figure 2**, left) remain unchanged. We set *MI* of the CPG to 0.15; thereby, the controller generates a tripod gait with a walking frequency of approximately 0.8 Hz for the machine (**Video S13**). Only the maximum and minimum ranges of the joint movements of the legs and the neural parameters of the adaptive forward models (**Figure 2**, right) are different. The neural parameters are adapted to the new system by using the online learning mechanism (Equations 8–9). In principle, when applying the controller to other walking machines, it might also be necessary to adjust the generated walking frequency (i.e., the operating range of *MI* of the CPG). To the best of our knowledge, the capability of the controller, which combines the key aspects of biological locomotion systems to achieve a very rich behavioral repertoire in an autonomous fashion, has not been achieved in other walking machine systems so far.

Taken together, this work suggests how a CPG mechanism with neuromodulation, sensory feedback, and internal forward models with efference copies can be used for controlling complex robots. It further confirms that this combination plays an important role for locomotion in biological as well as artificial systems. The results presented here show that the employed embodied neural closed-loop system can be an option for developing robust and adaptable machines, thereby bringing the goal of approaching living creatures in their levels of performance a little bit closer. As the controller is modular, it is flexible and offers the future possibility of integrating joint angle and joint CS signals as feedback together with additional entrainment and reflexive mechanisms (Takemura et al., 2005; Cruse et al., 2009; Nachstedt et al., 2012) to avoid leg slipping, which currently occurs when the legs work partially against each other. The controller can also be extended to multiple CPGs (Ren et al., 2012) in order to adjust the frequency of each leg individually for some situations like gap crossing (Bläsing, 2006) or damage compensation (Ren et al., 2012). It can even be combined with other neural modules like short-term motor memory (Dasgupta et al., 2012) and muscle models (Xiong et al., 2012). This will enable the robotic system to navigate in complex environments with a certain degree of memory-guided behavior while performing more natural movements with active compliance.

#### **ACKNOWLEDGMENTS**

This research was supported by the Emmy Noether Program of the Deutsche Forschungsgemeinschaft (DFG, MA4464/3-1), the Federal Ministry of Education and Research (BMBF) by a grant to the Bernstein Center for Computational Neuroscience II Göttingen (01GQ1005A, project D1), and the European Community's Seventh Framework Programme FP7/2007–2013 (Specific Programme Cooperation, Theme 3, Information and Communication Technologies) under grant agreement no. 270273, Xperience. We thank Steffen Zenker, Eren Erdal Aksoy, Xiaofeng Xiong, Eduard Grinke, and Dennis Goldschmidt for technical assistance and Frank Hesse for discussions.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Neural\_Circuits/10.3389/fncir.2013.00012/abstract

**Video S1 | Comparison of gait switching using a CPG with neuromodulation and a chaotic CPG.** Using a CPG with neuromodulation, AMOS II shows fast and smooth switching between gaits, while the switching is slower and less smooth when using our previous chaotic CPG. (http://manoonpong.com/Frontiers/SupplementaryVideo1.wmv)

**Video S2 | Examples of 20 different walking patterns.** AMOS II walks with different patterns from slow to fast speed with respect to *MI* = 0.0, 0.01, ..., 0.19. (http://manoonpong.com/Frontiers/SupplementaryVideo2.wmv)

**Video S3 | Turning behavior.** AMOS II autonomously turns to avoid obstacles and escape from a corner and even a deadlock situation. It detects obstacles and a corner by using its ultrasonic sensors installed at front. (http://manoonpong.com/Frontiers/SupplementaryVideo3.wmv)

**Video S4 | Insect-like leg movements.** To clearly observe the insect-like leg movements of AMOS II, we place it on a box and let it perform forward and backward walking. (http://manoonpong.com/Frontiers/SupplementaryVideo4.wmv)

**Video S5 | Leg elevation.** To clearly observe the leg elevation of AMOS II, we place it on a box and make the foot touch an obstacle during the swing phase. Due to mismatch between the expected foot contact signal, generated by the adaptive forward model, and the actual one, AMOS II can immediately elevate its leg (here right middle leg) to free the leg from the obstacle. In addition, we show that using an IR sensor at the leg also allows AMOS II to elevate its leg in order to avoid hitting the obstacle. The first part of this video corresponds to the result shown in **Figures 7G–J** of the manuscript.

(http://manoonpong.com/Frontiers/SupplementaryVideo5.wmv)

**Video S6 | Searching for a foothold.** To clearly observe how AMOS II searches for a foothold, we place it on a box and change the ground level during the stance phase. Due to the mismatch between the expected foot contact signal, generated by the adaptive forward model, and the actual one, AMOS II can immediately lower its leg (here the right middle leg) to search for a foothold. This video corresponds to the result shown in **Figures 7C–F** of the manuscript. (http://manoonpong.com/Frontiers/SupplementaryVideo6.wmv)

**Video S7 | Energy-efficient and adaptable locomotion on different terrains.** First test shows that AMOS II walks with a fast wave gait since it detects fine gravel (loose terrain) using its visual system. Afterward, it changes from the wave gait to a tetrapod gait since it detects gravel (rough terrain). Finally, it uses a fast tripod gait on the floor (hard terrain). Another test in an outdoor environment shows that AMOS II walks with a tetrapod gait since it detects gravel (rough terrain). Afterward, it changes from the tetrapod gait to a tripod gait since it detects grass (vegetated terrain). Note that during traversing the different terrains, AMOS II adapts its legs individually to the terrains. The first part of this video corresponds to the result shown in **Figure 9** of the manuscript and the second part of this video corresponds to the result shown in **Figure A4**. (http://manoonpong.com/Frontiers/SupplementaryVideo7.wmv)

**Video S8 | Adaptable locomotion on terrain with small obstacles.** First test shows that AMOS II can free its right front leg after the leg hits an obstacle during the swing phase. Due to the difference of the ground level, AMOS II also adapts its legs by lowering them more downward to ensure ground contact during the stance phase. Other tests also show this kind of adaptable locomotion of AMOS II. The first part of this video corresponds to the result shown in **Figure 10** of the manuscript. (http://manoonpong.com/Frontiers/SupplementaryVideo8.wmv)

**Video S9 | Climbing over a large obstacle in an outdoor environment.** AMOS II walks on rough terrain and then climbs over an 11 cm high obstacle. This video corresponds to the results shown in **Figure 11** of the manuscript. (http://manoonpong.com/Frontiers/SupplementaryVideo9.wmv)

**Video S10 | Adaptable locomotion during leg damage.** While AMOS II is walking, we disconnect the motor power connector of the FTi-joint of its left middle leg to simulate leg damage. Local leg control allows AMOS II to adapt its legs to deal with the leg damage. As a result, it could still move forward without problem. This first part of the video corresponds to the results shown in **Figure 12** of the manuscript. Another test is shown in the second part of the video. The third part of the video shows that AMOS II fails to cope with leg damage if local leg control is not activated. The last part of the video shows walking behaviors with low and high ground clearance during leg damage. (http://manoonpong.com/Frontiers/SupplementaryVideo10.wmv)

**Video S11 | Passive compliances of the joints and legs of AMOS II.** The joint compliance enables AMOS II to passively flex its legs to avoid damages when the environment changes. In addition, its leg compliance allows it to absorb external (ground reaction) forces. (http://manoonpong.com/Frontiers/SupplementaryVideo11.wmv)

**Video S12 | Walking and climbing like a cockroach.** AMOS II keeps its body very close to the ground during walking. While climbing, it uses its active backbone joint and leg adaptation. The video also compares its locomotion with cockroaches and other walking machines [four legs (Quadruped robot of the Stanford AI Lab, http://ai.stanford.edu), six legs (BILL-Ant-a robot of Case Biorobotics Lab, http://biorobots.cwru.edu/), eight legs (Scorpion robot of DFKI Bremen - Robotics Innovation Center, http://robotik.dfki-bremen.de/)]. Note that cockroach videos are referred to (Ritzmann et al., 2004; Abbott, 2007; Lewinger and Quinn, 2009) while the walking machine videos are referred to (Lee et al., 2006; Spenneberg and Kirchner, 2007; Lewinger and Quinn, 2009). (http://manoonpong.com/Frontiers/SupplementaryVideo12.wmv)

**Video S13 | Testing the adaptive neural locomotion controller on another six-legged walking machine.** We transfer the adaptive neural locomotion controller to another walking machine. We set *MI* of the CPG to 0.15; thereby, the controller generates a tripod gait with a walking frequency of approximately 0.8 Hz for the machine. As a result, the controller allows the machine to perform foothold searching when its leg loses ground contact, to adapt its locomotion to deal with irregular terrain or different ground levels, and to climb over a 7 cm high obstacle. For climbing, additional reactive active backbone joint control is also applied. (http://manoonpong.com/Frontiers/SupplementaryVideo13.wmv)

## **REFERENCES**

insect walking. *Arthropod Struct. Dev.* 33, 287–300.


neurons of different networks. *Nature* 351, 60–63.


robots. *Arthropod Struct. Dev.* 33, 361–379.


T. (2005). Slip-adaptive walk of quadruped robot. *Robot. Auton. Syst.* 53, 124–141.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; accepted: 21 January 2013; published online: 13 February 2013.*

*Citation: Manoonpong P, Parlitz U and Wörgötter F (2013) Neural control and adaptive neural forward models for insect-like, energy-efficient, and adaptable locomotion of walking machines. Front. Neural Circuits 7:12. doi: 10.3389/ fncir.2013.00012*

*Copyright © 2013 Manoonpong, Parlitz and Wörgötter. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDIX**

## **THE WALKING MACHINE PLATFORM AMOS II (BIOMECHANICS)**

The most important specifications of AMOS II are presented in the main text of the manuscript. Therefore, we only provide here a clear picture of the active backbone joint (BJ) of AMOS II including its angle range (see **Figure A1**, left). Its minimum downward position (−45°) is comparable to the one observed in a cockroach (see **Figure A1**, right). The mechanical design of the BJ also allows the joint to lean upwards to a maximum position of 45°. The upward and downward leaning motions are used for climbing over a large obstacle with a height of up to 13 cm, or 75% of the leg length of AMOS II.

#### **COMPLETE NEURAL CIRCUIT**

**Figure A2** shows the complete neural circuit of the adaptive locomotion controller. The controller generates versatile locomotion behavior of AMOS II by means of CPG-based control and local leg control (see main text for details). In total the controller has six neural modules where the modules I–IV belong to the CPG-based control and the modules V, VI belong to the local leg control.

**Module I (CPG with neuromodulation)**: *MI* = modulatory input; *C*<sub>1,2</sub> = output neurons of the CPG. We use a hyperbolic tangent (tanh) transfer function for the CPG neurons.

**Module II (neural CPG postprocessing)**: *CP*<sub>1,2</sub> = postprocessing neurons with a step function; *Int*<sub>1,2</sub> = integrator units.

**Module III (neural motor control)**: *I*<sub>1,...,4</sub> = neural control parameters for generating different walking directions and stopping motion; *H*<sub>1,...,14</sub> = interneurons of the phase switching network (PSN); *H*<sub>15,...,28</sub> = interneurons of the velocity regulating networks (VRNs). We use a tanh transfer function for the interneurons. Parameters are *A* = 1.7246, *B* = −2.48285, and *C* = −1.7246.

**Module IV (motor neurons)**: *M*<sub>1,...,5</sub> = premotor neurons; *TR*<sub>1</sub>, *CR*<sub>1</sub>, *FR*<sub>1</sub> = TC-, CTr- and FTi-motor neurons of the right front leg (R1); *TR*<sub>2</sub>, *CR*<sub>2</sub>, *FR*<sub>2</sub> = right middle leg (R2); *TR*<sub>3</sub>, *CR*<sub>3</sub>, *FR*<sub>3</sub> = right hind leg (R3); *TL*<sub>1</sub>, *CL*<sub>1</sub>, *FL*<sub>1</sub> = left front leg (L1); *TL*<sub>2</sub>, *CL*<sub>2</sub>, *FL*<sub>2</sub> = left middle leg (L2); *TL*<sub>3</sub>, *CL*<sub>3</sub>, *FL*<sub>3</sub> = left hind leg (L3); *BJ* = a backbone motor neuron which is controlled by reactive BJ control [not shown here but see Goldschmidt et al. (2012)]; τ = ipsilateral lag (i.e., 16 time steps or ≈0.6 s); τ<sub>*L*</sub> = the phase shift between both left and right sides (i.e., 48 time steps or ≈2 s). We use piecewise linear transfer functions for the premotor and motor neurons.

**Module V (adaptive neural forward models)**: *F*<sub>1,...,6</sub> = adaptive hysteresis neurons for motor signal transformation; *WI*, *WR*, *B* = learning parameters; *P*<sub>1,...,6</sub> = postprocessing neurons; Δ = an error between the expected foot contact (FC) signal and the actual one. We use a tanh transfer function for the hysteresis and postprocessing neurons.

**Module VI (searching and elevation control)**: *PD*<sub>1,...,6</sub> = preprocessing neurons which provide only a positive error (+Δ); *ND*<sub>1,...,6</sub> = preprocessing neurons which provide only a negative error (−Δ); *E*<sub>1,...,6</sub>, *S*<sub>1,...,6</sub> = recurrent neurons (i.e., accumulators). We use piecewise linear transfer functions for the preprocessing neurons and use a linear transfer function for the recurrent neurons.

Note that in all modules, all numbers are synaptic weights and the ones marked with subscript "B" refer to fixed bias terms.
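The role of the two delays τ and τ<sub>*L*</sub> in Module IV can be illustrated with a simple delay line. The sketch below is only a schematic reading of the description above: the delay values (16 and 48 time steps) come from the text, while the propagation order (hind to front on each side, with the left side shifted by τ<sub>*L*</sub> relative to the right) and all function names are assumptions made for illustration.

```python
import numpy as np

TAU = 16    # ipsilateral lag in time steps (from the module description)
TAU_L = 48  # left/right phase shift in time steps (from the module description)


def delay(signal, steps):
    """Shift a signal later in time by `steps`, holding its first value."""
    if steps == 0:
        return signal.copy()
    return np.concatenate([np.full(steps, signal[0]), signal[:-steps]])


def distribute(cpg_signal):
    """Derive six leg signals from one rhythmic signal by pure delays.

    The hind-to-front order and the placement of TAU_L are assumed here.
    """
    legs = {}
    for i, name in enumerate(["R3", "R2", "R1"]):
        legs[name] = delay(cpg_signal, i * TAU)
    for i, name in enumerate(["L3", "L2", "L1"]):
        legs[name] = delay(cpg_signal, TAU_L + i * TAU)
    return legs


signal = np.sin(np.linspace(0.0, 20.0, 400))
legs = distribute(signal)
print({name: round(float(values[100]), 3) for name, values in legs.items()})
```

Because every leg receives the same waveform with a fixed lag, changing the frequency of the source signal changes the relative phases between legs, which is one way a single rhythm generator can yield different stepping patterns.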

Different exteroceptive and proprioceptive sensors are used here as inputs to the adaptive controller to generate stimulus-induced behavior, energy-efficient gaits, and adaptable locomotion. The sensors are: left and right ultrasonic sensors (US), six FC sensors (FC<sub>1,...,6</sub>), one USB camera (CM), six infrared reflex (IR) sensors (IR<sub>1,...,6</sub>), one current sensor (CS), and left and right light dependent sensors (LD). All raw sensory signals are preprocessed using neural preprocessing except the visual signal, which is processed by an online feature-based terrain classification algorithm. This algorithm is briefly described in the main text of the manuscript.

**FIGURE A3 (caption fragment) |** [...] contact or stance phase and gray areas refer to no ground contact during swing phase. As frequency increases, some legs step in pairs (dashed enclosures). One time step is ≈0.037 s.

**FIGURE A4 (caption fragment) |** [...] controlled by the sensory signal. It was set to 0.06 (tetrapod gait) and then 0.19 (fast tripod gait). **(C)** The positive (*oS*) and negative (*oE*) accumulated errors of the expected foot contact signal and the actual one (cf. **Figures 7A,B** of the manuscript). They control leg adaptation to deal with different terrains. **(D–F)** The TC-, CTr-, and FTi-joint angles of the right middle [...] terrains. Black boxes indicate swing phase while white areas between them indicate stance phase. Abbreviations are referred to **Figure 1** of the manuscript. Above pictures show snap shots from the camera on AMOS II used for the terrain classification while walking. Below pictures show snap shots of locomotion of AMOS II during the experiment. Note that one time step is ≈0.037 s.

**FIGURE A5 (caption fragment) |** [...] The foot contact signals of R1 and L2 for the high ground clearance case. [...] clearance, respectively.

We use a hysteresis neuron (*N*<sub>1</sub>) with a tanh transfer function for preprocessing the CS signal. The hysteresis principle (Pasemann, 1993) leads to a non-linear transition between two output states (low and high activations). Thus, the hysteresis neuron can effectively filter sensory noise (Manoonpong et al., 2008b). The preprocessed CS signal, which provides an energy level, is used to inhibit all joint movements, thereby stopping robot motion, when the system has low power.
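The hysteresis effect of a single recurrent tanh neuron can be made concrete with a short sweep experiment. In the sketch below, the self-connection weight of 2.0, the sweep range, and the number of settling iterations per input value are illustrative assumptions, not the parameters given in **Figure A2**; the function name is hypothetical.

```python
import numpy as np


def hysteresis_sweep(inputs, w_self=2.0, settle_steps=50, start=-1.0):
    """Drive a self-excitatory tanh neuron with a slowly changing input.

    For each input value the neuron is iterated `settle_steps` times so
    that its output can settle before the input changes again.
    """
    out = start
    trace = []
    for x in inputs:
        for _ in range(settle_steps):
            out = np.tanh(w_self * out + x)
        trace.append(out)
    return np.array(trace)


ramp = np.linspace(-1.0, 1.0, 201)
up = hysteresis_sweep(ramp, start=-1.0)          # input rising
down = hysteresis_sweep(ramp[::-1], start=1.0)   # input falling

# Input value at which the output first changes sign in each direction.
switch_up = ramp[np.argmax(up > 0)]
switch_down = ramp[::-1][np.argmax(down < 0)]
print("switches high at input:", switch_up)
print("switches low at input:", switch_down)
```

The output jumps to the high state only once the rising input exceeds a positive threshold and returns to the low state only once the falling input drops below a negative threshold. Input fluctuations that stay between the two thresholds therefore cannot flip the output, which is the noise filtering property referred to above.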

We use four neurons (*N*<sub>2,...,5</sub>) with a tanh transfer function to form the neural preprocessing network of the left and right LD signals and the left and right US signals. The network is developed based on a minimal recurrent controller (MRC) structure (Pasemann et al., 2003) which allows balancing positive (LD) and negative (US) tropisms. The network outputs (i.e., outputs of *N*<sub>2,3</sub>) provide orienting control signals which are transmitted to *I*<sub>3,4</sub> of the neural motor control module. As a result, AMOS II can effectively perform an appropriate turning angle to avoid obstacles or corners as well as turn toward a light source.

We simply use neurons (*N*<sub>6,...,17</sub>) with a tanh transfer function for preprocessing the FC<sub>1,...,6</sub> and IR<sub>1,...,6</sub> signals. This is because the sensor signals contain small noise which can be eliminated by the non-linearity of the neuron. The preprocessed sensor signals are used for local leg control (described in the main text of the manuscript). All neural preprocessing parameters, e.g., synaptic strengths and bias terms (see **Figure A2**) were obtained by experiments [see Manoonpong et al. (2008a,b) for more details of the neural preprocessing parameters].

#### **ADDITIONAL EXPERIMENTAL RESULTS**

Here we present three more experimental results that complement those shown in the main text of the manuscript.

**Figure A3** shows 20 walking patterns with different speeds of AMOS II. These patterns are mainly controlled by the CPG-based controller. With the modulatory input *MI* of the CPG set to 0.0, each leg steps in a wave on each side of the body with overlap. As *MI* increases, the stepping frequency increases and some legs step in pairs (see dashed enclosures). This results in a variety of patterns (or gaits) including insect-like gaits and intermixed gaits. For example, one observes wave gaits with different frequencies (*MI* = 0.01–0.04), tetrapod gaits with different frequencies (*MI* = 0.05–0.06), caterpillar gaits with different frequencies (*MI* = 0.07–0.10), and tripod gaits with different frequencies (*MI* = 0.15–0.19). Legs are labeled from front to back as numbers 1–3 and the left and right sides are L and R, respectively. Note that when increasing *MI* beyond 0.19, we found only two different gaits, comparable to the tripod gait (e.g., *MI* = 0.19) and the caterpillar gait (e.g., *MI* = 0.10).

**Figure A4** shows autonomous selection of energy-efficient gaits while traversing from gravel to grass in an outdoor environment. It can be seen that at the beginning AMOS II walked with a tetrapod gait (photos 1,2) since it detected gravel (rough terrain) using its visual system. Afterward, it changed from the tetrapod gait to a tripod gait (photo 3) since it detected grass (vegetated terrain). During traversing the different terrains, AMOS II adapted its legs individually to deal with a change of terrain. That is, it depressed its leg and extended its tibia to search for a foothold when losing a ground contact during the stance phase where this information is detected by a significant change of the positive accumulated error *oS*, see black line in **Figure A4C**. However, during the swing phase no leg elevation was observed since only minor perturbation occurred (i.e., no significant change of the negative accumulated error *oE*, see red line in **Figure A4C**). We encourage readers to see the video of this experiment at **Video S7**.
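The relation between *MI* and the named gaits, and the terrain-dependent choice of *MI*, can be summarized as a small lookup. The *MI* ranges below are the ones listed for **Figure A3**, and the values 0.06 (gravel, tetrapod gait) and 0.19 (grass, fast tripod gait) follow the settings reported for the outdoor experiment. The value used for fine gravel, the label returned for *MI* values outside the listed ranges, and all identifiers are assumptions of this sketch.

```python
# MI ranges in hundredths, as listed for the gaits of Figure A3.
GAIT_RANGES = [
    (1, 4, "wave"),
    (5, 6, "tetrapod"),
    (7, 10, "caterpillar"),
    (15, 19, "tripod"),
]

# Terrain to MI; the fine gravel value is an assumed "fast wave" setting.
TERRAIN_TO_MI = {
    "fine gravel": 0.04,
    "gravel": 0.06,
    "grass": 0.19,
}


def gait_for_mi(mi):
    """Return the gait name for a modulatory input value."""
    step = round(mi * 100)  # compare in integer hundredths to avoid float issues
    for low, high, name in GAIT_RANGES:
        if low <= step <= high:
            return name
    return "intermixed or unlabeled"


for terrain, mi in TERRAIN_TO_MI.items():
    print(terrain, "->", mi, "->", gait_for_mi(mi))
```

In the robot, the terrain label comes from the online terrain classification; here it is simply a dictionary key.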

**Figure A5** shows walking behaviors with high and low ground clearance when legs are damaged. In this test, AMOS II was driven only by the CPG-based control described in section 2.3 of the manuscript. We let AMOS II walk with a slow wave gait (*MI* = 0.02) and then disconnected the motor power connectors of the CTr- and FTi-joints of the right (R3) and left (L3) hind legs and the left front leg (L1). The joints became inactive (i.e., uncontrollable). This is to simulate leg damage. It can be seen that AMOS II with high ground clearance had a large body inclination (≈ −18°, **Figure A5A**), leading to unstable locomotion, and the remaining legs needed to carry more load. Thus, the motors needed to produce high torque to carry the load, resulting in high power consumption (**Figure A5B**). Furthermore, the legs could not swing properly during the swing phase (**Figures A5C,D**). In this case, the left middle leg (L2) always stayed on the ground; thereby, the robot turned to the left (**Figure A5G** and **Video S10**). In contrast, with low ground clearance AMOS II dropped only a little (**Figure A5A**) since its body was already close to the ground, and the remaining legs did not need to carry more load, leading to lower power consumption compared to the high ground clearance case (**Figure A5B**). The remaining legs (R1, R2, and L2) were able to swing during the swing phase (**Figures A5E,F**). As a result, it could still move in a straighter line compared to the high ground clearance case (**Figure A5H** and **Video S10**).



# **NEURAL CIRCUITS**

## Control of breathing by interacting pontine and pulmonary feedback loops

#### *Yaroslav I. Molkov<sup>1,2</sup>, Bartholomew J. Bacak<sup>1</sup>, Thomas E. Dick<sup>3</sup> and Ilya A. Rybak<sup>1</sup>\**

*<sup>1</sup> Department of Neurobiology and Anatomy, Drexel University College of Medicine, Philadelphia, PA, USA*

*<sup>2</sup> Department of Mathematical Sciences, Indiana University – Purdue University, Indianapolis, IN, USA*

*<sup>3</sup> Departments of Medicine and Neurosciences, Case Western Reserve University, Cleveland, OH, USA*

#### *Edited by:*

*Eberhard E. Fetz, University of Washington, USA*

#### *Reviewed by:*

*Ansgar Buschges, University of Cologne, Germany; Deborah Baro, Georgia State University, USA*

#### *\*Correspondence:*

*Ilya A. Rybak, Department of Neurobiology and Anatomy, Drexel University College of Medicine, 2900 Queen Lane, Philadelphia, PA 19129, USA. e-mail: ilya.rybak@drexelmed.edu*

The medullary respiratory network generates respiratory rhythm via sequential phase switching, which in turn is controlled by multiple feedbacks including those from the pons and nucleus tractus solitarii; the latter mediates pulmonary afferent feedback to the medullary circuits. It is hypothesized that both pontine and pulmonary feedback pathways operate via activation of medullary respiratory neurons that are critically involved in phase switching. Moreover, the pontine and pulmonary control loops interact, so that pulmonary afferents control the gain of pontine influence on the respiratory pattern. We used an established computational model of the respiratory network (Smith et al., 2007) and extended it by incorporating pontine circuits and pulmonary feedback. In the extended model, the pontine neurons receive phasic excitatory activation from, and provide feedback to, medullary respiratory neurons responsible for the onset and termination of inspiration. The model was used to study the effects of: (1) "vagotomy" (removal of pulmonary feedback), (2) suppression of pontine activity attenuating pontine feedback, and (3) these perturbations applied together on the respiratory pattern and durations of inspiration (*TI*) and expiration (*TE*). In our model: (a) the simulated vagotomy resulted in increases of both *TI* and *TE*, (b) the suppression of pontine-medullary interactions led to the prolongation of *TI* at relatively constant, but variable *TE*, and (c) these perturbations applied together resulted in "apneusis," characterized by a significantly prolonged *TI*. The results of modeling were compared with, and provided a reasonable explanation for, multiple experimental data. The characteristic changes in *TI* and *TE* demonstrated with the model may represent characteristic changes in the balance between the pontine and pulmonary feedback control mechanisms that may reflect specific cardio-respiratory disorders and diseases.

**Keywords: respiratory central pattern generator, brainstem, ventrolateral respiratory column, pre-Bötzinger complex, pontine-medullary interactions, pulmonary feedback, control of breathing, apneusis**

## **INTRODUCTION**

The respiratory rhythm and motor pattern controlling breathing in mammals are generated by a respiratory central pattern generator (CPG) located in the lower brainstem (Cohen, 1979; Bianchi et al., 1995; Richter, 1996; Richter and Spyer, 2001). The pre-Bötzinger complex (pre-BötC), located within the ventrolateral respiratory column (VRC) in the medulla, contains mostly inspiratory neurons (Smith et al., 1991; Rekling and Feldman, 1998; Koshiya and Smith, 1999). The pre-BötC, interacting with the adjacent Bötzinger complex (BötC), containing mostly expiratory neurons (Cohen, 1979; Ezure, 1990; Jiang and Lipski, 1990; Bianchi et al., 1995; Tian et al., 1999; Ezure et al., 2003), represents a core of the respiratory CPG (Bianchi et al., 1995; Tian et al., 1999; Rybak et al., 2004, 2007, 2008, 2012; Smith et al., 2007, 2009; Rubin et al., 2009; Molkov et al., 2010, 2011). This core circuitry generates primary respiratory oscillations defined by the intrinsic biophysical properties of respiratory neurons, the architecture of network interactions within and between the pre-BötC and BötC, and the inputs and drives from other brainstem compartments, including the pons, retrotrapezoid nucleus (RTN), raphé, and nucleus tractus solitarii (NTS). It has been suggested (Rybak et al., 2007, 2008; Smith et al., 2007) that these external inputs and drives may have a specific spatial mapping onto respiratory neural populations within the pre-BötC/BötC core network, so that changes in these inputs or drives can alter the balance in excitation between key populations within the core network, thereby affecting their interactions and producing specific changes in the respiratory motor patterns observed under different conditions.

Most CPGs controlling rhythmic motor behaviors in invertebrates and vertebrates operate under the control of multiple afferent feedbacks and often provide feedback to the sources of their descending and afferent inputs, hence allowing feedback regulation of the descending and afferent control signals (Dubuc and Grillner, 1989; Ezure and Tanaka, 1997; Blitz and Nusbaum, 2008; Buchanan and Einum, 2008). This regulation often operates via presynaptic inhibition (Nusbaum et al., 1997; Ménard et al., 2002; Côté and Gossard, 2003; Blitz and Nusbaum, 2008).

As in other CPGs, afferent feedbacks are involved in the control of the mammalian respiratory CPG and the generation and shaping of the breathing pattern. Many peripheral mechano- and chemo-sensory afferents, including those from the lungs, tracheobronchial tree and carotid bifurcation, provide feedback signals involved in the homeodynamic control of breathing, cardiovascular function, and different types of motor behaviors coordinated with breathing, such as coughing (see Loewy and Spyer, 1990, for review). The NTS is the major integrative site of these afferent inputs. The present study focuses on the mechanoreceptor feedback mediated by pulmonary stretch receptors (PSRs). These mechanoreceptors respond to mechanical deformations of the lungs, trachea, and bronchi, and produce a burst of action potentials during each breath, thereby providing the central nervous system with feedback regarding rate and depth of breathing (see Kubin et al., 2006, for review). Activation of PSRs elicits reflex effects including inspiratory inhibition or expiratory facilitation (representing the so-called Hering-Breuer reflex), enhancement of early inspiratory effort, bronchodilatation, and tachycardia. PSR axons travel within the vagus nerve and form excitatory synapses on NTS pump cells (Averill et al., 1984; Backman et al., 1984; Berger and Dick, 1987; Bajic et al., 1989; Anders et al., 1993; Kubin et al., 2006). Pharmacological microinjection and lesion studies (McCrimmon et al., 1987; Ezure et al., 1991, 1998; Ezure and Tanaka, 1996, 2004; Kubin et al., 2006) suggest that NTS pump cells mediate the Hering-Breuer reflex (lung-inflation-induced termination of inspiration). Through pump cells, PSR-originating information alters the activity of CPG neurons in manners consistent with their proposed roles in rhythm generation.

The other feedback loop, important for the respiratory CPG operation, involves multiple pontine-medullary interactions. The pons (Kölliker-Fuse nucleus, parabrachial nucleus, A5 area, etc.) contains neurons expressing inspiratory (I)-, inspiratory-expiratory (IE)-, or expiratory (E)-modulated activity, especially in vagotomized animals (Bertrand and Hugelin, 1971; Feldman et al., 1976; Cohen, 1979; Bianchi and St. John, 1982; St. John, 1987, 1998; Shaw et al., 1989; Dick et al., 1994, 2008; Jodkowski et al., 1994; Song et al., 2006; Segers et al., 2008; Dutschmann and Dick, 2012). This modulation is probably based on reciprocal connections between medullary and pontine respiratory regions, which were described in a series of morphological studies (Cohen, 1979; Bianchi and St. John, 1982; Nunez-Abades et al., 1993; Gaytan et al., 1997; Zheng et al., 1998; Ezure and Tanaka, 2006; Segers et al., 2008). The principal source of pontine influence on the medulla is thought to be the Kölliker-Fuse region in the dorsolateral pons, although other areas, including those from the ventrolateral pons, are also involved (Bianchi and St. John, 1982; Chamberlin and Saper, 1994, 1998; Dick et al., 1994; Fung and St. John, 1994a,b,c; Jodkowski et al., 1994, 1997; Morrison et al., 1994; St. John, 1998; Rybak et al., 2004; Dutschmann and Herbert, 2006; Mörschel and Dutschmann, 2009; Dutschmann and Dick, 2012). Pontine activity contributes to the regulation of phase duration as demonstrated by stimulation and lesion studies (Cohen et al., 1993; Jodkowski et al., 1994, 1997; Okazaki et al., 2002; Cohen and Shaw, 2004; Rybak et al., 2004; Dutschmann and Herbert, 2006; Mörschel and Dutschmann, 2009; Dutschmann and Dick, 2012). Stimulation of the Kölliker-Fuse or medial parabrachial nuclei induced a premature termination of inspiration (I-E transition) and an extended expiratory phase. These effects were similar to the effects of vagal stimulation (Cohen, 1979; Hayashi et al., 1996). Also, the effects of both vagal and pontine stimulation appear to be mediated by the same medullary circuits that control onset and termination of inspiration (Haji et al., 1999; Okazaki et al., 2002; Rybak et al., 2004; Mörschel and Dutschmann, 2009; Dutschmann and Dick, 2012). Finally, the respiratory pattern in vagotomized animals with an intact pons is similar to that in animals lacking the pons but with the vagi intact. The above observations support the idea that the pontine nuclei mediate a function similar to that of the Hering-Breuer reflex.

Bilateral injections of NMDA antagonists (MK-801 and AP-5) into the rostral pons reversibly increase the duration of inspiration in vagotomized rats, and this increase is dose-dependent (Fung et al., 1994). This suggests that the rostral pons contains neurons with NMDA receptors participating in the inspiratory off-switch mechanism. Morrison et al. (1994) showed that lesions of the parabrachial nuclei in the decerebrate, vagotomized, unanesthetized rat produced a significant (4-fold) increase in the duration of inspiration and a doubling of the duration of expiration, supporting a role for this pontine area in the regulation of the timing of the phases of respiration. This abnormal breathing pattern is known as apneusis. Administration of MK-801 into the rostral dorsolateral pons was shown to induce apneusis in vagotomized ground squirrels (Harris and Milsom, 2003). Systemic injection of MK-801 increases the inspiratory duration or results in apneustic-like breathing in vagotomized and artificially ventilated rats (Foutz et al., 1989; Monteau et al., 1990; Connelly et al., 1992; Pierrefiche et al., 1992, 1998; Fung et al., 1994; Ling et al., 1994; Borday et al., 1998). Similarly, Jodkowski et al. (1994) showed that electrical and chemical lesions in the ventrolateral pons produced apneustic breathing in vagotomized rats. At the same time, apneustic breathing does not usually develop if the vagi remain intact and can be reversed by vagal stimulation, suggesting that NMDA receptors are not involved in the pulmonary (vagal) feedback mechanism.

Feldman et al. (1976) recorded cells in the rostral pons that exhibited respiratory modulation only when lung inflation, via a cycle-triggered pump, was stopped. The emergence of this respiratory-modulated activity suggests that afferent vagal input may have an inhibitory effect on the respiratory modulated cells in the pons (see also Feldman and Gautier, 1976; Cohen and Feldman, 1977). In the same work, it was noticed that this activity had no apparent influence on the tonic discharge of pontine neurons, suggesting that this inhibition might be presynaptic. Dick et al. (2008) recorded several hundred cells in the dorsolateral pons of decerebrate cats, artificially ventilated by a cycle-triggered pump before and after vagotomy. In their experiments, vagotomy led to either an emergence or facilitation of respiratory modulation in the pons. Sustained electrical stimulation of the vagus nerve elicited the classic Hering-Breuer reflex. Systemic or local blockade of NMDA receptors can result in an apneustic breathing pattern (Foutz et al., 1989; Connelly et al., 1992; Pierrefiche et al., 1992, 1998; Fung et al., 1994; Ling et al., 1994; Borday et al., 1998) similar to that demonstrated by pontine lesions or transections.

A specific feature of feedback control in the brainstem respiratory CPG is that the CPG operates under the control of two loops (pulmonary and pontine), both of which regulate key neural interactions within the CPG, thereby affecting the respiratory rate, respiratory phase durations, and breathing pattern, and, at the same time, interact with each other so that either of them may dominate in the control of breathing depending on the conditions and/or the state of the system. Such feedback interactions and state-dependent feedback control of the CPG may have broader implications for other CPGs in vertebrates and/or invertebrates.

Specifically, our study focuses on the following major feedback loops involved in the control of breathing (**Figure 1A**): (1) the peripheral, pulmonary (vagal) loop that controls the medullary rhythm-generating kernel via afferent inputs from PSRs mediated by the NTS circuits, and (2) the pontine control loop that provides pontine control of the respiratory rhythm and pattern. Our central hypothesis is that both the peripheral afferent and pontine-medullary loops control the respiratory frequency and phase durations via key medullary circuits responsible for the respiratory phase transitions (onset of inspiration, E-I, and inspiratory off-switch, I-E, see **Figure 1A**). In addition, these loops interact, changing, balancing, and adjusting their control gains via interactions of the NTS with the VRC and pontine circuits. To investigate the involvement and potential roles of these feedback loops and their interactions with the medullary respiratory circuits, we simulated the effects of suppression/elimination of each of these feedbacks, and of both together, on the respiratory pattern and respiratory phase durations. The results of simulations were compared with the related experimental data and showed good qualitative correspondence, hence providing important insights into feedback control of breathing.

**FIGURE 1 | The medullary respiratory network with pulmonary and pontine feedbacks. (A)** A general schematic diagram representing the respiratory network with two interacting feedbacks. See text for details. **(B)** The detailed model schematic showing interactions between different populations of respiratory neurons within major brainstem compartments involved in the control of breathing (pons, BötC, pre-BötC, and rVRG) and the organization of pulmonary and pontine feedbacks. Each neural population (shown as a sphere) consists of 50 single-compartment neurons described in the Hodgkin-Huxley style. The model includes 3 sources of tonic excitatory drive located in the pons, RTN, and raphé, all shown as green triangles. These drives project to multiple neural populations in the model (green arrows; the particular connections to target populations are not shown for simplicity, but are specified in **Table A3** in the Appendix). See text for details. *Abbreviations:* AP-5, amino-5-phosphonovaleric acid, NMDA receptor antagonist; BötC, Bötzinger complex; e, excitatory; E, expiratory or expiration; i, inhibitory; I, inspiratory or inspiration; IE, inspiratory-expiratory; KF, Kölliker-Fuse nucleus; MK801, dizocilpine maleate, NMDA receptor antagonist; NTS, Nucleus Tractus Solitarii; P, pump cells; PBN, ParaBrachial Nucleus; PN, Phrenic Nerve; pre-BötC, pre-Bötzinger Complex; PSRs, pulmonary stretch receptors; RTN, retrotrapezoid nucleus; r, rostral; VRC, ventral respiratory column; VRG, ventral respiratory group.

## **METHODS**

#### **SIMULATION PACKAGE**

All simulations in this study were performed using the neural simulation package NSM-3.0, developed at Drexel by Drs. Markin, Shevtsova, and Rybak and ported to high-performance computer cluster systems running OpenMPI by Dr. Molkov. This simulation environment has been specifically developed and used for multiscale modeling and computational analysis of cross-level integration of: (a) the intrinsic biophysical properties of single respiratory neurons (at the level of ionic channel kinetics, dynamics of ion concentrations, synaptic processes, etc.); (b) population properties (synaptic interactions between neurons within and between populations with random distributions of neuronal parameters); (c) network properties (connectivity strength and type of synaptic interactions, with user-defined or random distribution of connections); and (d) morpho-physiological structure (organization of interacting modules/compartments) (see Rybak et al., 2003, 2004, 2007, 2012; Smith et al., 2007; Baekey et al., 2010; Molkov et al., 2010, 2011). NSM-3.0 has special tools for simulation of various *in vivo* and *in vitro* experimental approaches, including suppression of specific ionic channels or synaptic transmission systems, various lesions/transections, application of various pharmacological, electrical and other stimuli to particular neurons or neural populations, etc.

#### **MODELING BASIS: NEURONAL PARAMETERS AND IONIC CHANNEL KINETICS**

The model presented in this paper continues a previously published series of models of neural control of respiration (Rybak et al., 2004, 2007; Smith et al., 2007; Baekey et al., 2010; Molkov et al., 2010, 2011) and, specifically, represents an extension of the Smith et al. (2007) model. Following that model, each neuron type in the present model was represented by a population of 20–50 neurons. Each neuron was modeled as a single-compartment neuron described in the Hodgkin-Huxley (HH) style. These neuron models incorporated the currently available data on ionic channels in the medullary neurons and their characteristics. Specifically, the kinetic and voltage-dependent characteristics of fast (Na) and persistent (NaP) sodium channels in the respiratory brainstem were based on the studies of the isolated pre-BötC neurons in rats (Rybak et al., 2003). The kinetics and steady-state characteristics of activation and inactivation of high-voltage activated (CaL) calcium channels were based on the earlier studies performed *in vitro* (Elsen and Ramirez, 1998) and *in vivo* (Pierrefiche et al., 1999). Temporal characteristics of intracellular calcium kinetics in respiratory neurons were drawn from studies of Frermann et al. (1999). Other descriptions of channel kinetics were derived from previous models (Rybak et al., 2007; Smith et al., 2007).

Heterogeneity of neurons within each population was set by a random distribution of some neuronal parameters and initial conditions to produce physiological variations of baseline membrane potential levels, calcium concentrations, and channel conductances. A full description of the model and its parameters can be found in the Appendix. All simulations were performed using the simulation package NSM-3.0 (see above). Differential equations were solved using the exponential Euler integration method with a step of 0.1 ms. We utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, MD (http://biowulf.nih.gov).
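As a reminder of what the exponential Euler method does, the following minimal sketch applies it to a single first-order relaxation equation of the kind that governs Hodgkin-Huxley-style gating variables, dx/dt = (x_inf − x)/τ. It is not taken from NSM-3.0: the function names, the time constant, and the target value are hypothetical and chosen only for illustration. The scheme treats x_inf and τ as constant within a step and uses the exact solution of the resulting linear equation over that step.

```python
import math

def exp_euler_step(x, x_inf, tau, dt):
    """One exponential Euler step for dx/dt = (x_inf - x) / tau.

    x_inf and tau are held fixed during the step, so the update is the
    exact solution of the linear equation over the interval dt.
    """
    return x_inf + (x - x_inf) * math.exp(-dt / tau)

def integrate_gate(x0, x_inf, tau_ms, t_end_ms, dt_ms=0.1):
    """Integrate a gating variable with constant x_inf and tau (illustrative values).

    The default step of 0.1 ms matches the step size quoted in the text.
    """
    n_steps = int(round(t_end_ms / dt_ms))
    x = x0
    for _ in range(n_steps):
        x = exp_euler_step(x, x_inf, tau_ms, dt_ms)
    return x

# Hypothetical example: a gate relaxing from 0 toward 1 with tau = 5 ms for 10 ms.
x_final = integrate_gate(x0=0.0, x_inf=1.0, tau_ms=5.0, t_end_ms=10.0)
```

In a full neuron model x_inf and τ depend on the membrane potential and are re-evaluated at every step; the sketch keeps them constant, in which case the scheme reproduces the analytical solution up to rounding error.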

## **MODEL ARCHITECTURE AND OPERATION IN NORMAL CONDITIONS**

The main objective of this study was to investigate the mechanisms underlying control of the mammalian breathing pattern that is generated in the respiratory CPG circuits in the medulla and modulated by two major feedback loops, one involving interactions of medullary respiratory circuits with the lungs, and the other resulting from interactions of these circuits with the pontine circuits contributing to control of breathing (**Figure 1A**). We used an explicit computational modeling approach and focused on investigating the anticipated changes in the motor output (activity of the phrenic nerve, PN), specifically the changes in the duration of the inspiratory and expiratory phases under conditions of removal or suppression of the above feedback interactions (**Figure 1A**). The full schematic of our model is shown in **Figure 1B**. In developing this model, we used as a basis and extended the well-known large-scale computational model of the brainstem respiratory network developed by Smith et al. (2007). This basic model focused on the interactions among respiratory neuron populations within the medullary VRC. Similar to that model, the medullary respiratory populations in the present model (see **Figure 1B**) include (right-to-left): a ramp-inspiratory (ramp-I) population of premotor bulbospinal inspiratory neurons and an inhibitory early-inspiratory [early-I(2)] population, both in the rostral ventral respiratory group (rVRG); a pre-inspiratory/inspiratory (pre-I/I) population and an inhibitory early-inspiratory [early-I(1)] population of the pre-BötC; and an inhibitory augmenting-expiratory (aug-E) population and inhibitory (post-I) and excitatory (post-Ie) post-inspiratory populations in the BötC. As suggested in the previous modeling studies (Rybak et al., 2004, 2007; Smith et al., 2007), these populations interact within and between the pre-BötC and BötC compartments and form a core circuitry of the respiratory CPG. 
In addition, multiple inputs and drives from other brainstem components, including the pons, RTN, NTS, and raphé affect interactions within this core circuitry and regulate its dynamic behavior and the motor output expressed in the activity of phrenic nerve (PN).
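Because the analysis focuses on the durations of inspiration (*TI*) and expiration (*TE*) read from the PN output, it may help to see how such durations can be extracted from a sampled activity trace. The sketch below is only an illustration under stated assumptions, not the procedure used in the study: the function name `phase_durations`, the thresholding rule, and the synthetic trace are all hypothetical.

```python
def phase_durations(pn, dt, threshold):
    """Estimate inspiratory (TI) and expiratory (TE) durations from a PN trace.

    pn        : sequence of sampled PN activity values
    dt        : sampling interval (the returned durations use the same unit)
    threshold : samples at or above this value are counted as inspiration
                (an assumed, user-chosen criterion)

    Only complete phases, bounded by a transition on both sides, are returned.
    """
    active = [value >= threshold for value in pn]
    # Sample indices at which the trace switches between phases.
    switches = [i for i in range(1, len(active)) if active[i] != active[i - 1]]
    ti, te = [], []
    for start, end in zip(switches, switches[1:]):
        duration = (end - start) * dt
        if active[start]:
            ti.append(duration)   # phase that began with PN onset
        else:
            te.append(duration)   # phase that began with PN offset
    return ti, te

# Synthetic square-wave "PN" trace: 0.2 s bursts separated by 0.4 s pauses, dt = 0.01 s.
trace = [0.0] * 5 + ([1.0] * 20 + [0.0] * 40) * 3 + [1.0] * 5
ti, te = phase_durations(trace, dt=0.01, threshold=0.5)
```

Averaging the returned lists gives mean phase durations with their spread, which is the form in which *TI* and *TE* are reported for the simulations.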

Respiratory oscillations in the basic and present models emerge within the BötC/pre-BötC core due to the dynamic interactions among: (1) the excitatory neural population, located in the pre-BötC and active during inspiration (pre-I/I); (2) the inhibitory population in the pre-BötC providing inspiratory inhibition within the network [early-I(1)]; and (3) the inhibitory populations in the BötC generating expiratory inhibition (post-I and aug-E). A full description of these interactions leading to the generation of the respiratory pattern can be found in previous publications (Rybak et al., 2004, 2007; Smith et al., 2007). Specifically, during expiration the activity of the inhibitory post-I neurons in BötC decreases because of their intrinsic adaptation properties (defined by the high-threshold calcium and calcium-dependent potassium currents) and augmenting inhibition from the aug-E neurons (**Figures 1B** and **2A,B**). At some moment, the pre-I/I neurons of pre-BötC are released from the decreasing post-I inhibition and start firing (**Figure 2**), providing excitation to the inhibitory early-I(1) population of pre-BötC and the premotor excitatory ramp-I population of rVRG (**Figure 1B**). The early-I(1) population inhibits all post-inspiratory and expiratory activity in the BötC, leading to the disinhibition of all inspiratory populations including the ramp-I, hence completing the onset of inspiration (E-I transition). During inspiration, early-I(1) inhibition of BötC expiratory neurons decreases due to intrinsic adaptation properties defined by the high-threshold calcium and calcium-dependent potassium currents (**Figure 2**). This decrease of inspiratory inhibition leads to the onset of expiration and termination of inspiration (inspiratory off-switch) (**Figure 2**). In the rVRG, the premotor ramp-I neurons receive excitation from the pre-I/I neurons and drive phrenic motoneurons and PN activity. The early-I(2) population shapes the augmenting pattern of ramp-I neurons and PN. The PN projects to the diaphragm (**Figure 1B**), hence controlling the changes in lung volume (inflation/deflation) that provide breathing.

The architecture of network interactions within the medullary VRC column (i.e., within and between the BötC, pre-BötC and rVRG compartments) in the present model is the same as in the preceding model of Smith et al. (2007). The extension of the basic model in the present study includes: (1) a more detailed simulation of the pontine compartment (in the Smith et al. model, the pontine compartment did not have neuron populations but simply provided tonic drive to medullary respiratory populations), (2) incorporation of suggested interactions between the pontine and medullary populations that form the pontine control loop in the model (**Figures 1A,B**), and (3) incorporation of the pulmonary (vagal) control loop that included models of the lungs and pump cells in the NTS (**Figures 1A,B**).

#### **PONTINE FEEDBACK LOOP**

As shown in multiple studies in cats and rats, many pontine neurons (including those in the Kölliker-Fuse and parabrachial nuclei) exhibit respiratory-modulated activity, specifically with I-, IE-, E-, or EI-related activity (Bertrand and Hugelin, 1971; Feldman et al., 1976; Cohen, 1979; Bianchi and St. John, 1982; St. John, 1987, 1998; Shaw et al., 1989; Dick et al., 1994, 2008; Jodkowski et al., 1994; Song et al., 2006; Segers et al., 2008; Dutschmann and Dick, 2012). These neurons may have respiratory-modulated activity superimposed on background tonic firing or may express a pure phasic respiratory activity (especially in rats, e.g., see Ezure and Tanaka, 2006; Song et al., 2006). These pontine respiratory-modulated activities are probably based on specific axonal projections and synaptic inputs from the corresponding medullary respiratory neurons (Cohen, 1979; Bianchi and St. John, 1982; Nunez-Abades et al., 1993; Gaytan et al., 1997; Zheng et al., 1998; Ezure and Tanaka, 2006; Segers et al., 2008). In turn, pontine neurons (including those in the Kölliker-Fuse and parabrachial nuclei) project back to the medullary respiratory neurons contributing to the control of the respiratory phase durations and phase switching (Okazaki et al., 2002; Cohen and Shaw, 2004; Rybak et al., 2004; Dutschmann and Herbert, 2006; Mörschel and Dutschmann, 2009; Dutschmann and Dick, 2012). These mutual interactions between pontine and medullary respiratory neurons form what we refer to as a pontine (or pontine-medullary) control loop.

To simulate the pontine feedback loop, we incorporated in the pontine compartment of the model the following populations (see **Figure 1B**): the excitatory populations of neurons with inspiratory-modulated (I), inspiratory-expiratory-modulated (IEe) and expiratory-modulated (E) activities, and the inhibitory population of neurons with an inspiratory-expiratory-modulated (IEi) activity. As described above, pontine neurons with such types of modulated activity were found in both rat and cat. However, the existing experimental data on intrapontine and pontine-medullary interactions are insufficient and do not provide exact information on the specific connections between these neuron types; they only suggest general ideas and principles for organization of these interactions, such as the possible reciprocal interconnections between the pontine and medullary neurons with similar respiratory-related patterns (see references in the previous paragraph) and the existence of pontine projections to key medullary neurons involved in the respiratory phase switching (such as post-I, see references above). Therefore in the model, respiratory modulation of neuronal activity in pontine populations was provided by excitatory inputs from the medullary respiratory neurons with the corresponding phases of activity within the respiratory cycle. Specifically, the inspiratory modulation of activity in the pontine I population was provided by excitatory inputs from the medullary ramp-I population, the IE modulation in the pontine IEe and IEi populations resulted from excitatory inputs from the medullary ramp-I and post-Ie populations, and the expiratory modulation in the pontine E population was provided by inputs from the medullary post-Ie population. In addition, to simulate the presence of neurons with respiratory-modulated phasic and tonic activities, each of the above four populations was split into two equal subpopulations of neurons having the same properties and neuronal connections but differing in tonic drive, which was received only by the tonically active subpopulations (not shown in **Figure 1B**).

In turn, the pontine feedback in the model included (see **Figure 1B**): (1) excitatory inputs from the pontine I neurons (from both tonic and phasic subpopulations) to the medullary pre-I/I and ramp-I populations; (2) excitatory inputs from the pontine IEe neurons (both tonic and phasic subpopulations) to the medullary post-I population; (3) inhibitory inputs from the pontine IEi neurons (again both subpopulations) to the medullary early-I(1) population; and (4) excitatory inputs from the pontine E neurons (both subpopulations) to the medullary post-I, post-Ie, and aug-E populations. These neuronal connections from pons to medulla (especially pontine inputs to the medullary post-I and pre-I/I populations) allowed the pontine feedback to control operation of the respiratory network in the BötC/pre-BötC core and specifically to control the durations of the respiratory phases and phase switching. Specifically, the connection weights in the model were tuned so that (a) the durations of inspiration (*TI*) and expiration (*TE*) in the model without vagal feedback would be within the corresponding physiological ranges for the vagotomized rat *in vivo* (*TI* = 0.2–0.55 s and *TE* = 0.8–1.7 s, e.g., see Monteau et al., 1990; Connelly et al., 1992) and (b) after full suppression or removal of the pons, the value of *TI* would dramatically increase (3–4 times or more) to be consistent with apneusis (Jodkowski et al., 1994; Morrison et al., 1994; Fung and St. John, 1995; St. John, 1998).

#### **PULMONARY (VAGAL) FEEDBACK LOOP**

The bursting activity of phrenic motoneurons produces rhythmic inflation/deflation of the lungs, which in turn causes rhythmic activation of PSRs projecting back to the medullary respiratory network within the vagus nerve and hence providing pulmonary (vagal) feedback. The activity of pulmonary afferents in the medulla is relayed by the NTS pump (P) cells. To simulate the pulmonary feedback loop, we incorporated simplified models of the lungs and PSRs, so that changes in the lung volume were driven by the activity of PN (see **Figures 1A,B**). The resultant lung inflation activated PSRs, which projected back to activate the excitatory (Pe) and inhibitory (Pi) pump cell populations in the NTS. The latter in turn projected to the VRC and pons (**Figure 1B**). Hence in the model, both Pe and Pi populations were involved in the Hering-Breuer reflex preventing over-inflation of the lungs. Specifically (**Figure 1B**), the Pe population excited the post-I population, which was based on previous experimental data showing that both lung inflation and electrical stimulation of the vagus nerve produced an additional activation of decrementing expiratory neurons (Hayashi et al., 1996). Following the previous model (Rybak et al., 2004), we suggested that vagal feedback inhibits the early-I(1) population (in this model, via the Pi population). Both these interactions produced a premature termination of inspiration with switching to expiration and a prolongation of expiration.
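The chain PN activity → lung volume → PSR activity can be pictured with a deliberately simple toy model. The sketch below is an assumption made purely for illustration and should not be read as the lung and PSR equations of the present model, which are specified in the Appendix: lung volume is treated as a first-order low-pass filter of PN activity, and PSR output as the volume clipped at zero. The function name, time constant, and gain are hypothetical.

```python
import math

def simulate_lung(pn_trace, dt, tau=0.1, gain=1.0):
    """Toy lung/PSR model (illustrative assumption, not the published equations).

    Lung volume V follows dV/dt = (gain * PN - V) / tau and is advanced with
    an exponential Euler update; PSR output is V clipped at zero.
    dt and tau share the same time unit (seconds here).
    """
    decay = math.exp(-dt / tau)
    volume = 0.0
    psr = []
    for pn in pn_trace:
        target = gain * pn                      # volume the lungs relax toward
        volume = target + (volume - target) * decay
        psr.append(max(0.0, volume))            # stretch receptors signal inflation only
    return psr

# A 2 s PN burst followed by 2 s of silence, sampled at 10 ms.
psr = simulate_lung([1.0] * 200 + [0.0] * 200, dt=0.01)
```

In this toy version PSR activity rises while PN is active and decays once it stops, so the afferent signal grows during inspiration, which is the property the Hering-Breuer reflex relies on.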

#### **INTERACTIONS BETWEEN THE LOOPS**

As mentioned in the section "Introduction," the respiratory-modulated activity in the pons is usually much stronger in the absence of lung inflation and in vagotomized animals (e.g., see Feldman et al., 1976; Dick et al., 2008). One explanation for these effects is that the respiratory-modulated activity in the pons is suppressed by vagal afferents via NTS neurons projecting to the pons. There is indirect evidence that this suppression is based on presynaptic inhibition (Feldman and Gautier, 1976; Dick et al., 2008). Therefore in our model, this presynaptic inhibition is provided by the Pi population of NTS and affects all excitatory synaptic inputs from medullary to pontine neural populations (**Figure 1B**). As a result, this presynaptic inhibition suppresses the respiratory modulation in the activities of pontine neurons and reduces the influence of pontine feedback on the medullary respiratory network operation and the respiratory pattern generated. Because of the lack of specific data, the synaptic weights of connections from both pump cell populations (Pe and Pi) were set to (a) significantly reduce the respiratory modulation in all types of pontine neurons and (b) keep the durations of inspiration and expiration in simulations with vagal feedback intact within their physiological ranges for the rat *in vivo* (*TI* = 0.17–0.3 s and *TE* = 0.3–0.5 s, e.g., see Connelly et al., 1992).
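The gain-control idea in this section can be illustrated with a small sketch. The multiplicative form used here, the parameter values, and the function names are assumptions made only for illustration; the actual formulation of presynaptic inhibition in the model is given in the Appendix. The sketch scales each excitatory medullary input to a pontine population by a factor that shrinks as Pi activity grows.

```python
def presynaptic_gain(pi_activity, strength):
    """Fraction of an excitatory input that survives presynaptic inhibition.

    Assumed multiplicative form 1 - strength * pi_activity, clipped to [0, 1].
    """
    return min(1.0, max(0.0, 1.0 - strength * pi_activity))

def pontine_input(medullary_activity, weight, pi_activity, strength=0.8):
    """Effective medullary drive to a pontine population (hypothetical units)."""
    return weight * medullary_activity * presynaptic_gain(pi_activity, strength)

# With Pi silent (e.g., after removal of PSR input) the full drive reaches the pons.
drive_without_pi = pontine_input(medullary_activity=1.0, weight=0.5, pi_activity=0.0)
# With Pi fully active the same medullary activity produces a much weaker drive.
drive_with_pi = pontine_input(medullary_activity=1.0, weight=0.5, pi_activity=1.0)
```

Under these assumptions, silencing Pi restores the full medullary drive to the pons, which is the sense in which pulmonary afferents control the gain of the pontine loop.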

#### **SIMULATION OF VAGOTOMY (PULMONARY FEEDBACK REMOVAL)**

Under normal conditions the "intact" model generated the respiratory pattern with the duration of inspiration *TI* = 0.189 ± 0.046 s and the duration of expiration *TE* = 0.388 ± 0.064 s (**Figures 2**, **3A**, **4A**, and **5A**). "Vagotomy" was simulated by breaking the pulmonary feedback, specifically by removing the afferent inputs from PSRs to the pump cells in the NTS (**Figure 1A**). The resultant changes in the activity of different neural populations and in the output respiratory pattern of the model after simulated vagotomy are shown in **Figures 3B** and **4B**. As a result of vagotomy, the pump cells (Pi and Pe populations) become silent (only the activity of Pi is shown in **Figures 3B** and **4B**; the Pe population likewise becomes silent). This eliminates the excitatory effect of lung inflation (PSR) on the post-I population (and post-Ie, pre-I/I, and ramp-I), mediated by Pe, and its inhibitory effect on the aug-E population, provided by Pi (**Figure 1B**). This also eliminates the pulmonary (vagal) control of respiratory phase switching and phase durations. However, breaking the pulmonary feedback also removes the presynaptic inhibition of all medullary inputs to pontine neural populations (provided in the intact case by the Pi population of the NTS), hence increasing respiratory-modulated activities in the pontine neurons involved in the feedback control of the respiratory network operation (**Figures 1A,B**). This in turn increases the gain of pontine feedback and its role in the control of respiratory phase switching and phase durations. **Figure 3** shows that vagotomy resulted in increases in the respiratory-modulated activity of pontine populations, a prolongation of inspiration (*TI* = 0.277 ± 0.108 s), and a dramatic increase in the expiratory phase duration (*TE* = 0.938 ± 0.065 s). **Figure 4** shows that the applied vagotomy produced a significant increase of inspiratory (I), inspiratory-expiratory (IE), and expiratory (E) modulation in the activity of the corresponding pontine neurons with tonic activity and released firing in the pontine neurons with phasic I, IE, and E activities, which were not active in the intact case.
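The reported phase durations can be turned into an approximate breathing rate. The short sketch below does this arithmetic for the intact and vagotomized cases, using only the mean *TI* and *TE* values quoted above. The function name is illustrative, and a rate computed from mean durations is only an approximation of the mean instantaneous rate.

```python
def breaths_per_minute(t_i: float, t_e: float) -> float:
    """Approximate respiratory rate from mean phase durations (seconds)."""
    return 60.0 / (t_i + t_e)

# Mean phase durations quoted in the text (seconds).
intact = {"TI": 0.189, "TE": 0.388}
vagotomy = {"TI": 0.277, "TE": 0.938}

f_intact = breaths_per_minute(intact["TI"], intact["TE"])
f_vagotomy = breaths_per_minute(vagotomy["TI"], vagotomy["TE"])

print(f"intact:   {f_intact:.1f} breaths/min")
print(f"vagotomy: {f_vagotomy:.1f} breaths/min")
```

Under these assumptions the simulated vagotomy roughly halves the breathing rate, mostly through the longer expiratory phase.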

#### **SIMULATION OF PONTINE FEEDBACK SUPPRESSION WITH AND WITHOUT PULMONARY FEEDBACK**

A complete removal of the pons (i.e., a removal of pontine feedback) in the model with an intact pulmonary feedback produced a prolongation of inspiration (*TI* = 0.337 ± 0.052 s) and an expiratory duration that was slightly reduced on average (in comparison to the intact model) but highly variable (*TE* = 0.353 ± 0.159 s), characterized by occasional deletions of aug-E bursts (see **Figures 5B** and **6A**). Existing experimental data on pontine suppression were obtained with local injections of MK-801, a blocker of NMDA receptors, which might not completely suppress excitatory synaptic transmission in the pontine neurons and their activity. To compare our simulations with these data, we also simulated a partial suppression of excitatory synaptic weights in the pontine compartment (e.g., by 25%; see **Figure 6A**). Such partial suppression produced a visible prolongation of inspiration (*TI* = 0.262 ± 0.028 s with *TE* = 0.297 ± 0.028 s at 25% suppression, **Figure 6A**).

In contrast to pontine suppression with the intact pulmonary feedback, the same procedures after vagotomy led to a dramatic increase in the average duration of inspiration (making the inspiratory duration highly variable) at a relatively constant duration of expiration (**Figures 5C** and **6A**). This prolongation of inspiration after vagotomy increased with the degree of pontine suppression (i.e., with the reduction of the weights of excitatory synaptic inputs to pontine neurons) (**Figure 6A**) and was accompanied by a suppression or full elimination of post-I activity and a reduced amplitude of integrated PN (**Figure 5C**). Both these features are typical for apneusis (see Cohen, 1979; Wang et al., 1993; Jodkowski et al., 1994; Morrison et al., 1994; Fung and St. John, 1995; St. John, 1998). The durations of inspiration and expiration after vagotomy at different degrees of pontine suppression were the following: *TI* = 0.437 ± 0.143 s with *TE* = 0.433 ± 0.030 s at 25% suppression; *TI* = 0.885 ± 0.339 s with *TE* = 0.417 ± 0.004 s at 75% suppression; and *TI* = 571 ± 0.310 s with *TE* = 0.431 ± 0.003 s at 100% suppression.

The results of our simulations, reflecting changes in *TI* and *TE* following different combinations of vagotomy with pontine suppression at different degrees, are shown together in **Figure 6A**. Our general conclusions from these simulations are the following. (1) A suppression of pontine activity with the intact pulmonary feedback leads to a moderate prolongation of inspiration, a slight shortening of expiration, and an increase in the variability of *TE* (with 100% pontine suppression). (2) The simulated vagotomy (with the intact pontine-medullary interactions) causes a moderate prolongation of inspiration with an increase in the variability of *TI* and a strong prolongation of expiration. (3) A combination of both perturbations does not produce visible effects on *TE*, but leads to a significant prolongation of inspiration (increasing with the degree of pontine suppression), an increase in *TI* variability, and other typical characteristics of apneusis (suppressed post-I activity and reduced PN amplitude).

#### **COMPARISON WITH EXPERIMENTAL DATA**

To test our model, we performed simulations with 25%, 75%, and 100% suppression of the pontine control loop before and after simulated vagotomy (removal of the pulmonary feedback). The resultant changes in *TI* and *TE* are shown in **Figure 6A**. To compare these simulation results with the related experimental data, we built similar diagrams from the early study of Connelly et al. (1992), which examined spontaneous breathing in Wistar rats during the administration of the NMDA blocker MK-801 before and after vagotomy (**Figure 6B**). In this study, the experiments on Wistar rats (in contrast to the Sprague-Dawley strain) did not end with apneusis, due (in our opinion) to an insufficient suppression of the pontine feedback by the MK-801 injections performed. Nevertheless, the effects of vagotomy and MK-801 administration on *TI* and *TE* before and after vagotomy reported in the Connelly et al. study are qualitatively similar to our simulations with 25% suppression of pontine feedback (see **Figures 6A,B**). Specifically, the 25% pontine suppression in our simulations and the administration of MK-801 in the Connelly et al. experiments result in an increase of *TI* and a slight reduction of *TE* before vagotomy and in a significant prolongation of inspiration after vagotomy. In addition, vagotomy alone without other perturbations in both cases results in an increase of *TI* and a significant prolongation of *TE* (see **Figures 6A,B**). Moreover, the changes in the respiratory frequency and the shape and amplitude of integrated phrenic activity after vagotomy and/or pontine suppression in our model are similar to those in the experimental studies with MK-801 administration (**Figure 7**). A further comparison of our simulations was made with the experimental study of Monteau et al. (1990), performed in anaesthetized vagotomized rats using MK-801 administration, the results of which are summarized in **Figure 6C**. This study did demonstrate that MK-801 application after vagotomy produced switching from a normal breathing pattern to typical apneusis. The relationships between *TI* and *TE* in our simulation after vagotomy and their changes following 100% pontine suppression (apneusis) are similar to those in the Monteau et al. study (see **Figures 6A,C**).

## **DISCUSSION**

The results of our simulations support the concept that both pulmonary and pontine feedback loops contribute to the control of the respiratory pattern and, specifically, the durations of inspiration (*TI*) and expiration (*TE*). Furthermore, our modeling results are consistent with the previous suggestion of specific interactions between these feedback loops, in particular that the PSR afferents involved in the pulmonary control of *TI* and *TE* attenuate the gain of the pontine control of these phase durations (via the presynaptic inhibition of excitatory inputs from medullary to pontine populations) (Feldman and Gautier, 1976; Feldman et al., 1976; Cohen and Feldman, 1977; Cohen, 1979; Mörschel and Dutschmann, 2009). Nevertheless, according to our simulations, pontine activity still plays a role in the control of inspiration and expiration even when the pulmonary feedback is intact, although the gain of this pontine control is significantly reduced by the presynaptic inhibition. This presynaptic inhibition is expected to suppress the respiratory modulation in the activity of pontine neurons expressing either tonic or phasic firing patterns (Feldman and Gautier, 1976; Feldman et al., 1976; Cohen and Feldman, 1977; Cohen, 1979; St. John, 1987, 1998; Shaw et al., 1989; Dick et al., 1994, 2008; Song et al., 2006; Segers et al., 2008), which is reproduced by our model (**Figure 4**). Also, the model offers a plausible mechanistic explanation for the previous experimental findings that injection of NMDA antagonists in the dorsolateral pons (specifically in the Kölliker-Fuse area) leads to a prolongation of inspiration and, in the absence of pulmonary feedback, to apneusis (Foutz et al., 1989; Connelly et al., 1992; Pierrefiche et al., 1992, 1998; Fung et al., 1994; Ling et al., 1994; Bianchi et al., 1995; Borday et al., 1998; St. John, 1998).

In contrast to previous suggestions and models (Okazaki et al., 2002; Cohen and Shaw, 2004; Rybak et al., 2004; Dutschmann and Herbert, 2006; Mörschel and Dutschmann, 2009; Dutschmann and Dick, 2012), the mechanisms of action of the two feedbacks considered in the current model are not exactly symmetric. Excitatory inputs from both these feedbacks (from PSRs via the Pe cells of the NTS, and from the pontine I, IEe, and E populations) activate the ramp-I, pre-I/I, post-Ie, and post-I medullary populations (see **Figure 1B**). The majority of these excitatory connections are the ones activating the inhibitory post-I population that controls the inspiratory off-switching, i.e., the timing of inspiratory phase termination and *TI*, and those activating the excitatory pre-I/I population which, in a balance with the inputs to post-I, control the onset of inspiration (and *TE*). However, the effect of these excitatory inputs from the two feedbacks on the medullary circuitry is not identical and depends on the particular synaptic weights and on the activity pattern of the inhibitory Pi cells of the NTS, which provide presynaptic inhibition of medullary inputs to the pontine neurons (**Figure 1B**). The organization of the inhibitory inputs of these feedbacks to the medullary populations in the model also differs. While the pulmonary feedback inhibits the aug-E population (via PSRs and Pi cells), causing a complex effect on the respiratory pattern, the pontine IEi population inhibits the early-I(1) population, hence promoting expiration, which is clearly seen after vagotomy (**Figure 1B**).

**Figure 6 |** (continued) **(B)** Changes in *TI* and *TE* in the study of Connelly et al. (1992): diagrams are built for spontaneously breathing Wistar rats under control conditions and after administration of NMDA blocker MK-801 before and after vagotomy. **(C)** Changes in *TI* and *TE* in the study of Monteau et al. (1990) performed in anaesthetized vagotomized rats using MK-801 administration.

It is important to mention that the model of the medullary core respiratory circuits in the VRC (including the BötC, pre-BötC, and rVRG) used here was derived from the model of Smith et al. (2007) without significant changes. Starting with that first publication, this basic model (with necessary additions) was able to reproduce multiple experimental results, including the characteristic changes of the respiratory pattern following a series of pontine and medullary transections and the effect of riluzole (a persistent sodium current blocker) on the intact and sequentially reduced *in situ* preparation (Rybak et al., 2007; Smith et al., 2007), the emergence of additional late-expiratory oscillations in the RTN/parafacial respiratory group (RTN/pFRG) during hypercapnia and interactions between the BötC/pre-BötC and RTN/pFRG oscillators (Abdala et al., 2009; Molkov et al., 2010), the effects of baroreceptor stimulation and the respiratory-sympathetic coupling, including that following intermittent hypoxia (Baekey et al., 2010; Molkov et al., 2011; Rybak et al., 2012), etc. The extended model described here was also able to reproduce the above behaviors, including the biologically plausible changes of membrane potentials and firing patterns of different respiratory neurons (**Figure 2B**). The ability of the extended model to reproduce the experimentally observed effects of the two feedback loops provides additional support for the model of the core respiratory circuits used in all these previous models.

The exact mechanisms of pontine control of breathing are not well-understood and the pontine-medullary connections incorporated in the model are currently speculative. However, the general importance of the pons in the control of the respiratory pattern is well-recognized (see Dutschmann and Dick, 2012, for review). Studies utilizing the classic neurophysiological approaches of lesioning, stimulating and recording neurons have established that the lateral pons influences not only phase duration, phrenic amplitude, and response to afferent stimulation, but also the dynamic changes in respiratory pattern associated with persistent stimuli. For instance, blocking neural activity in the dorsolateral pons not only prolongs inspiration but also blocks the adaptation to vagal stimulation (Siniaia et al., 2000), and the shortening of expiration associated with repeated lung inflation (Dutschmann et al., 2009). Thus, the pons is not only intimately involved in the initial response to various stimuli, but also in the complex processes of accommodation and habituation. In the cardiovascular control system, parabrachial stimulation attenuates the NTS response to carotid sinus nerve stimulation by inhibition of NTS neurons receiving these inputs (Felder and Mifflin, 1988).

With normally operating pontine-medullary interactions, the simulated vagotomy results in a prolongation of inspiration and a significant increase of the expiratory duration (**Figures 3B** and **6A**). However, despite these changes, the breathing pattern after vagotomy remains similar to that in eupnea (**Figure 3**). This maintenance of the eupneic breathing pattern occurs because the control performed by the pulmonary loop is now partly mimicked by the pontine loop, whose gain increases after vagotomy, as the latter removes the presynaptic inhibition of medullary inputs to pontine neurons (**Figure 1B**). Our model suggests that the pulmonary feedback still performs the major function in the control of respiratory phase transitions and phase durations, and that a removal of this control loop places the full responsibility for this control on the pontine feedback loop.

The complementary role of the pontine and pulmonary feedbacks in control of phase duration (especially *TI*) in our model is consistent with the classical interpretation of their function in respiratory control (see Dutschmann and Dick, 2012, for review). In particular, a premature termination of inspiration and switching to expiration can be elicited by stimulation of either the rostral pons or the pulmonary afferents (Bertrand and Hugelin, 1971; Cohen, 1979; Oku and Dick, 1992; Wang et al., 1993; St. John, 1998; Haji et al., 1999; Okazaki et al., 2002; Rybak et al., 2004; Dutschmann and Herbert, 2006). This observation was explained by their common excitatory input on the post-inspiratory neurons in the medullary VRC which are critically involved in this phase transition (Okazaki et al., 2002; Rybak et al., 2004; Dutschmann and Herbert, 2006; Mörschel and Dutschmann, 2009).

Alternatively, our results suggest that the pontine-medullary feedback does not simply function as an "internal pulmonary feedback," performing a redundant function and compensating for the potential loss of vagal input. The specific increase in the variability of *TE* with the suppression of pontine activity and the significant prolongation of *TE* after vagotomy (**Figure 6A**) indicate that the pontine and pulmonary feedbacks differ in the control of *TE*. Indeed, our modeling results show that these control loops may complement each other in differential control of phase duration and breathing pattern variability. For example, an increase of *TE* variability with pontine suppression, as seen in **Figures 5B** and **6A**, may occur during various breathing disorders, such as sleep apnea or ventilator weaning (Tobin et al., 2012). In this connection, the stability of *TE* can be critically important and is primarily controlled by the pons. Moreover, the Kölliker-Fuse area of the dorsolateral pons was explicitly identified as contributing to breathing disorders in a mouse model of a neurodevelopmental disease called Rett syndrome (Stettner et al., 2007; Abdala et al., 2010).

Consistent with earlier and more recent experimental data from cats and rats (Lumsden, 1923; Cohen, 1979; Wang et al., 1993; Jodkowski et al., 1994; Morrison et al., 1994; St. John, 1998), our simulations show that a strong pontine suppression (e.g., 75%) or its removal after vagotomy leads to apneusis, characterized by a significant increase of inspiratory duration and its variability (**Figures 5C** and **6A**). The other specific characteristics of apneusis are a lack of post-inspiratory activity and a reduction of phrenic amplitude during inspiration (Cohen, 1979; Wang et al., 1993; Jodkowski et al., 1994; Morrison et al., 1994; Fung and St. John, 1995; St. John, 1998), which were reproduced in our simulations (**Figure 5C**).

Our understanding of interactions between individual components of complex systems is often insufficient to explain emergent properties of these systems. The present study elucidates the important role of two major feedback loops, and of the interactions between them, in the regulation of the respiratory rate and breathing pattern, allowing the brainstem respiratory network to maintain the system's homeostasis and adjust breathing to various metabolic and physiologic demands.

## **ACKNOWLEDGMENTS**

This study was supported by the National Institutes of Health: grants R33 HL087377, R33 HL087379, R01 NS057815, and R01 NS069220.

### **REFERENCES**

respiratory neuronal subsets of the rat. *Brain Res. Bull.* 42, 323–334.


Rybak, I. A. (2010). Late-expiratory activity: emergence and interactions with the respiratory CPG. *J. Neurophysiol.* 104, 2713–2729.


Synaptic potentials in respiratory neurones during evoked phase switching after NMDA receptor blockade in the cat. *J. Physiol.* 508(Pt 2), 549–559.


Modeling the ponto-medullary respiratory network. *Respir. Physiol. Neurobiol*. 143, 307–319.


dysfunctions associated with impaired control of postinspiratory activity in Mecp2-/y knockout mice. *J. Physiol.* 579, 863–876.


NMDA and biocytin. *Brain Res.* 782, 113–125.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 October 2012; accepted: 24 January 2013; published online: 13 February 2013.*

*Citation: Molkov YI, Bacak BJ, Dick TE and Rybak IA (2013) Control of breathing by interacting pontine and pulmonary feedback loops. Front. Neural Circuits 7:16. doi: 10.3389/fncir.2013.00016*

*Copyright © 2013 Molkov, Bacak, Dick and Rybak. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **APPENDIX**

#### **SINGLE NEURON MODEL**

All neurons were modeled in the Hodgkin-Huxley style as single-compartment models:

$$C \cdot \frac{dV}{dt} = -I_{\rm Na} - I_{\rm NaP} - I_{\rm K} - I_{\rm CaL} - I_{\rm K,Ca} - I_L - I_{\rm SynE} - I_{\rm SynI}, \tag{A1}$$

where *V* is the membrane potential, *C* is the membrane capacitance, and *t* is time. The terms on the right-hand side of this equation represent ionic currents: *INa*, fast sodium (with maximal conductance *ḡNa*); *INaP*, persistent (slowly inactivating) sodium (with maximal conductance *ḡNaP*); *IK*, delayed rectifier potassium (with maximal conductance *ḡK*); *ICaL*, high-voltage activated calcium (with maximal conductance *ḡCaL*); *IK,Ca*, calcium-dependent potassium (with maximal conductance *ḡK,Ca*); *IL*, leakage (with constant conductance *gL*); and *ISynE* (with conductance *gSynE*) and *ISynI* (with conductance *gSynI*), the excitatory and inhibitory synaptic currents, respectively.

Currents are described as follows:

$$\begin{aligned} I_{\rm Na} &= \bar{g}_{\rm Na} \cdot m_{\rm Na}^3 \cdot h_{\rm Na} \cdot (V - E_{\rm Na}); \\ I_{\rm NaP} &= \bar{g}_{\rm NaP} \cdot m_{\rm NaP} \cdot h_{\rm NaP} \cdot (V - E_{\rm Na}); \\ I_{\rm K} &= \bar{g}_{\rm K} \cdot m_{\rm K}^4 \cdot (V - E_{\rm K}); \\ I_{\rm CaL} &= \bar{g}_{\rm CaL} \cdot m_{\rm CaL} \cdot h_{\rm CaL} \cdot (V - E_{\rm Ca}); \\ I_{\rm K,Ca} &= \bar{g}_{\rm K,Ca} \cdot m_{\rm K,Ca}^2 \cdot (V - E_{\rm K}); \\ I_L &= g_L \cdot (V - E_L); \\ I_{\rm SynE} &= g_{\rm SynE} \cdot (V - E_{\rm SynE}); \\ I_{\rm SynI} &= g_{\rm SynI} \cdot (V - E_{\rm SynI}), \end{aligned} \tag{A2}$$

where *ENa*, *EK*, *ECa*, *EL*, *ESynE*, and *ESynI* are the reversal potentials for the corresponding channels.

Variables *mi* and *hi* with indexes indicating ionic currents represent, respectively, the activation and inactivation variables of the corresponding ionic channels. Kinetics of activation and inactivation variables is described as follows:

$$\begin{aligned} \tau_{mi}(V) \cdot \frac{d}{dt} m_i &= m_{\infty i}(V) - m_i; \\ \tau_{hi}(V) \cdot \frac{d}{dt} h_i &= h_{\infty i}(V) - h_i. \end{aligned} \tag{A3}$$

The expressions for steady state activation and inactivation variables and time constants are shown in **Table A1**. The values of maximal conductances for all neuron types are shown in **Table A2**.
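Equation (A3) is a first-order relaxation of each gating variable toward its steady state. As a minimal sketch, assuming the membrane potential (and therefore the steady-state value and time constant) is held fixed over one integration step, the update can be written in closed form. The function name and the numerical values below are illustrative and are not taken from Table A1.

```python
import math

def relax(x: float, x_inf: float, tau: float, dt: float) -> float:
    """Advance tau * dx/dt = x_inf - x by one step dt (x_inf, tau held fixed)."""
    return x_inf + (x - x_inf) * math.exp(-dt / tau)

# Illustrative values only: a gate starting closed, relaxing toward 0.8
# with a 10 ms time constant, stepped in 0.1 ms increments for 10 ms.
m = 0.0
for _ in range(100):
    m = relax(m, x_inf=0.8, tau=10.0, dt=0.1)

print(round(m, 4))
```

This exponential update is one common choice for Hodgkin-Huxley gating variables because it stays bounded for any step size. The integration scheme actually used by the authors is not stated in this section.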

The kinetics of intracellular calcium concentration *Ca* is described as follows (Rybak et al., 1997):

$$\frac{d}{dt}Ca = -k_{\rm Ca} \cdot I_{\rm CaL} \cdot (1 - P_B) + (Ca_0 - Ca)/\tau_{\rm Ca}, \tag{A4}$$

where the first term constitutes influx (with the coefficient *kCa*) and buffering (with the probability *PB*), and the second term describes pump kinetics with the resting level of calcium concentration *Ca*<sub>0</sub> and time constant τ*Ca*.

$$P_B = B/(Ca+B+K),\tag{A5}$$

where *B* is the total buffer concentration and *K* is the rate parameter.

**Table A1 | Steady state activation and inactivation variables and time constants for different ionic channels.**

**Table A2 | Maximal conductances of ionic channels in different neuron types.**

The calcium reversal potential is considered a variable and is a function of *Ca*:

$$E_{\rm Ca} = 13.27 \cdot \ln(4/Ca) \quad (\text{at rest } Ca = Ca_0 = 5 \times 10^{-5}\,\text{mM and } E_{\rm Ca} = 150\,\text{mV}). \tag{A6}$$
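The calcium subsystem of Equations (A4) to (A6) can be sketched as three small functions. The constants are the values given in the parameter list of this appendix. The function names are illustrative, and the units of the calcium current are left to the caller because they are not specified in this section.

```python
import math

CA0 = 5e-5       # mM, resting calcium concentration
K_CA = 2e-5      # mM/C, influx coefficient
TAU_CA = 250.0   # ms, pump time constant
B_TOTAL = 0.030  # mM, total buffer concentration
K_RATE = 0.001   # mM, rate parameter

def buffer_probability(ca: float) -> float:
    """Equation (A5): probability that incoming calcium is buffered."""
    return B_TOTAL / (ca + B_TOTAL + K_RATE)

def dca_dt(ca: float, i_cal: float) -> float:
    """Equation (A4): influx through I_CaL minus buffering, plus pump term."""
    influx = -K_CA * i_cal * (1.0 - buffer_probability(ca))
    pump = (CA0 - ca) / TAU_CA
    return influx + pump

def e_ca(ca: float) -> float:
    """Equation (A6): calcium reversal potential in mV."""
    return 13.27 * math.log(4.0 / ca)

print(round(e_ca(CA0), 1))  # close to the 150 mV quoted for rest
```

At rest with no calcium current the derivative is zero, and an inward (negative) current raises the concentration, which matches the sign convention of Equation (A4).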

The excitatory (*gSynE*) and inhibitory synaptic (*gSynI*) conductances are equal to zero at rest and may be activated (opened) by the excitatory or inhibitory inputs respectively:

$$\begin{aligned} g_{{\rm SynE}i}(t) &= \bar{g}_E \cdot F_i^{\rm presyn} \cdot \sum_j S\{w_{ji}\} \cdot \sum_{t_{kj} < t} \exp\left(-\left(t - t_{kj}\right)/\tau_{\rm SynE}\right) + \bar{g}_{Ed} \cdot \sum_m S\{w_{dmi}\} \cdot d_{mi}; \\ g_{{\rm SynI}i}(t) &= \bar{g}_I \cdot \sum_j S\{-w_{ji}\} \cdot \sum_{t_{kj} < t} \exp\left(-\left(t - t_{kj}\right)/\tau_{\rm SynI}\right) + \bar{g}_{Id} \cdot \sum_m S\{-w_{dmi}\} \cdot d_{mi}, \end{aligned} \tag{A7}$$

where the function *S*{*x*} = *x* if *x* ≥ 0, and 0 if *x* < 0. In Equation (A7), each of the excitatory and inhibitory synaptic conductances has two terms. The first term describes the integrated effect of inputs from other neurons in the network (excitatory or inhibitory). The second term describes the integrated effect of inputs from external drives *dmi*. Each spike arriving at neuron *i* from neuron *j* at time *tkj* increases the excitatory synaptic conductance by *ḡE* · *wji* if the synaptic weight *wji* > 0, or increases the inhibitory synaptic conductance by −*ḡI* · *wji* if the synaptic weight *wji* < 0. *ḡE* and *ḡI* are the parameters defining an increase in the excitatory or inhibitory synaptic conductance, respectively, produced by one arriving spike at |*wji*| = 1. τ*SynE* and τ*SynI* are the decay time constants for the excitatory and inhibitory conductances, respectively. In the second terms of Equation (A7), *ḡEd* and *ḡId* are the parameters defining the increase in the excitatory or inhibitory synaptic conductance, respectively, produced by external input drive *dmi* = 1 with a synaptic weight of |*wdmi*| = 1. All drives were set to 1.
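Equation (A7) can be evaluated directly from lists of spike times. The sketch below is a literal, unoptimized reading of the formula for a single target neuron. The function and argument names are illustrative, the default conductance of 1.0 nS and the time constants follow the parameter list of this appendix, and the weights in the usage example are made up for demonstration.

```python
import math

def rectify(x: float) -> float:
    """S{x}: pass non-negative values, clip negative values to zero."""
    return x if x >= 0 else 0.0

def spike_trace(t: float, spikes, tau: float) -> float:
    """Sum of exponentially decaying kernels for all spikes before time t."""
    return sum(math.exp(-(t - tk) / tau) for tk in spikes if tk < t)

def g_syn(t, weights, spike_trains, drive_weights, drives, *,
          sign, tau, g_bar=1.0, g_bar_d=1.0, f_presyn=1.0):
    """One line of Equation (A7); sign=+1 excitatory, sign=-1 inhibitory."""
    network = sum(rectify(sign * w) * spike_trace(t, s, tau)
                  for w, s in zip(weights, spike_trains))
    external = sum(rectify(sign * wd) * d
                   for wd, d in zip(drive_weights, drives))
    return g_bar * f_presyn * network + g_bar_d * external

# Hypothetical inputs: one excitatory and one inhibitory source, each
# firing once at t = 0 ms, plus one excitatory tonic drive.
weights = [0.5, -0.2]
spike_trains = [[0.0], [0.0]]
g_e = g_syn(5.0, weights, spike_trains, [0.3], [1.0], sign=+1, tau=5.0)
g_i = g_syn(15.0, weights, spike_trains, [0.3], [1.0], sign=-1, tau=15.0)
print(round(g_e, 4), round(g_i, 4))
```

The presynaptic factor is passed only to the excitatory conductance, as in Equation (A7). A production simulator would update the decaying traces incrementally instead of re-summing over all past spikes.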

Presynaptic inhibition is simulated as an attenuator of excitatory synapses by means of a factor *Fpresyn* ≤ 1. This factor is calculated according to the following equation:

$$F_i^{\rm presyn} = \left(1 + \sum_j S\left\{-w_{ji}^{p}\right\} \cdot \sum_{t_{kj} < t} \exp\left(-\left(t - t_{kj}\right)/\tau_{\rm SynI}\right)\right)^{-1}, \tag{A8}$$

where *w*<sup>p</sup><sub>ji</sub> ≤ 0 is the weight of the presynaptic inhibitory connection that synapse *i* receives from neuron *j*. If a synapse *i* does not receive any presynaptic inhibition, then *w*<sup>p</sup><sub>ji</sub> = 0 for all *j*, and hence for this synapse *F*<sup>presyn</sup><sub>i</sub> = 1.

The relative weights of synaptic connections (*wji*, *w*<sup>p</sup><sub>ji</sub>, and *wdmi*) are shown in **Table A3**.
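The attenuation factor of Equation (A8) is straightforward to compute from the spike times of the presynaptically inhibiting neurons. The following sketch assumes the inhibitory decay time constant of 15 ms from the parameter list. The function names and the weight used in the example are illustrative.

```python
import math

def rectify(x: float) -> float:
    """S{x}: pass non-negative values, clip negative values to zero."""
    return x if x >= 0 else 0.0

def presyn_factor(t: float, weights_p, spike_trains, tau_syn_i: float = 15.0) -> float:
    """Equation (A8): attenuation of excitatory synapses, between 0 and 1."""
    inhibition = sum(
        rectify(-w) * sum(math.exp(-(t - tk) / tau_syn_i) for tk in spikes if tk < t)
        for w, spikes in zip(weights_p, spike_trains)
    )
    return 1.0 / (1.0 + inhibition)

# No presynaptic inhibition: the factor is exactly 1.
print(presyn_factor(10.0, [0.0], [[0.0]]))

# Hypothetical weight of -1 and a single spike 15 ms ago.
print(round(presyn_factor(15.0, [-1.0], [[0.0]]), 4))
```

Because the factor is a reciprocal, sustained firing of the inhibiting population drives it toward zero without ever reversing the sign of the excitatory input.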

The following neuronal and synaptic parameters were used:

$$\begin{aligned} &C = 36\,\text{pF};\ E_{\rm Na} = 55\,\text{mV};\ E_{\rm K} = -94\,\text{mV};\ E_{\rm SynE} = -10\,\text{mV};\ E_{\rm SynI} = E_{\rm Cl} = -75\,\text{mV}; \\ &\bar{g}_E = \bar{g}_I = \bar{g}_{Ed} = \bar{g}_{Id} = 1.0\,\text{nS};\ \tau_{\rm SynE} = 5\,\text{ms};\ \tau_{\rm SynI} = 15\,\text{ms}; \\ &Ca_0 = 5 \times 10^{-5}\,\text{mM};\ k_{\rm Ca} = 2 \times 10^{-5}\,\text{mM/C};\ \tau_{\rm Ca} = 250\,\text{ms}; \\ &B = 0.030\,\text{mM};\ K = 0.001\,\text{mM}. \end{aligned}$$

**Table A3 | Weights of synaptic connections in the network.**


*Values in brackets represent relative weights of synaptic inputs from the corresponding source populations;*

*<sup>p</sup>presynaptic inhibition.*

#### **MODELING NEURAL POPULATIONS**

Each functional type of neuron in the model was represented by a population of 50 neurons. Connections between the populations were established so that, if a population A was assigned to receive an excitatory or inhibitory input from a population B or external drive D, then each neuron of population A received the corresponding excitatory or inhibitory synaptic input from each neuron of population B or from drive D, respectively. The pontine I, IEi, IEe, and E populations represent an exception: only half of each population (the tonic subpopulation) receives tonic drive (see the section "Pontine Feedback Loop"). To provide heterogeneity of neurons within neural populations, the value of *EL* was randomly assigned from normal distributions using average value ± SD. The leakage reversal potential for all neurons (except for the pre-I ones) was *EL* = −60 ± 1.2 mV; for pre-I neurons *EL* = −68 ± 1.36 mV.
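The heterogeneity described above amounts to drawing one leakage reversal potential per neuron from a normal distribution. A minimal sketch, assuming an arbitrary fixed seed for reproducibility (the random number generator and seed used by the authors are not given):

```python
import random

POPULATION_SIZE = 50  # neurons per population, as stated in the text

def sample_leak_reversals(mean_mv: float, sd_mv: float,
                          n: int = POPULATION_SIZE, seed: int = 0):
    """Draw one leakage reversal potential (mV) per neuron."""
    rng = random.Random(seed)  # seed is an arbitrary illustrative choice
    return [rng.gauss(mean_mv, sd_mv) for _ in range(n)]

e_l_default = sample_leak_reversals(-60.0, 1.2)   # all non-pre-I neurons
e_l_pre_i = sample_leak_reversals(-68.0, 1.36)    # pre-I neurons

print(len(e_l_default), min(e_l_default), max(e_l_default))
```

Each neuron in a population then shares all other parameters and differs only in this one value, which is enough to desynchronize spiking within the population.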

#### **MODELING OF LUNGS, PN, AND PSR**

The phrenic motoneuron population and phrenic nerve (*PN*) were not modeled. Integrated activity of the ramp-I population was considered as the PN motor output. An increase in lung volume (lung inflation) *V* was modeled as a low-pass filter of PN activity:

$$\tau_V \cdot \frac{dV}{dt} = -V + w_{{\rm PN} \to V} \cdot PN, \tag{A9}$$

where τ*<sub>V</sub>* = 100 ms is a lung time constant. The PSR output was considered proportional to the lung inflation *V*.
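The lung model of Equation (A9) is a first-order low-pass filter, so it can be stepped exactly when PN activity is treated as constant within each time step. In the sketch below the 100 ms time constant comes from the text, while the unit weight, the square-pulse PN trace, and the PSR gain of 0.5 are hypothetical values chosen only for illustration.

```python
import math

TAU_V = 100.0  # ms, lung time constant from the text

def lung_step(v: float, pn: float, dt: float, w_pn_v: float = 1.0) -> float:
    """Advance Equation (A9) by dt with PN held constant over the step."""
    target = w_pn_v * pn
    return target + (v - target) * math.exp(-dt / TAU_V)

def simulate_lung(pn_trace, dt: float, v0: float = 0.0, w_pn_v: float = 1.0):
    """Return the lung volume after each sample of the PN trace."""
    v = v0
    volumes = []
    for pn in pn_trace:
        v = lung_step(v, pn, dt, w_pn_v)
        volumes.append(v)
    return volumes

# Hypothetical PN trace: 300 ms of constant activity, then 300 ms of silence.
pn_trace = [1.0] * 300 + [0.0] * 300
volumes = simulate_lung(pn_trace, dt=1.0)

# PSR output proportional to inflation; the gain is an assumed placeholder.
psr = [0.5 * v for v in volumes]
print(round(volumes[99], 3), round(volumes[299], 3), round(volumes[-1], 3))
```

With these settings the volume reaches about 63% of its asymptote after one time constant and decays back toward zero once PN activity stops, which is the delayed inflation signal that the PSRs return to the NTS.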

## The response clamp: functional characterization of neural systems using closed-loop control

## *Avner Wallach\**

*Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Steve M. Potter, Georgia Institute of Technology, USA Michele Giugliano, University of Antwerpen, Belgium*

#### *\*Correspondence:*

*Avner Wallach, Department of Neurobiology, Weizmann Institute of Science, 234 Herzl Street, Rehovot 76100, Israel. e-mail: avnerw@tx.technion.ac.il*

The voltage clamp method, pioneered by Hodgkin, Huxley, and Katz, laid the foundations of neurophysiological research. Its core rationale is the use of closed-loop control as a tool for system characterization. A recently introduced method, the *response clamp*, extends the voltage clamp rationale to the functional, phenomenological level. The method consists of on-line estimation of a response variable of interest (e.g., the probability of response or its latency) and a simple feedback control mechanism designed to tightly converge this variable toward a desired trajectory. In the present contribution I offer a perspective on this novel method and its applications in the broader context of system identification and characterization. First, I demonstrate how internal state variables are exposed using the method, and how the use of several controllers may allow for a detailed, multi-variable characterization of the system. Second, I discuss three different categories of applications of the method: (1) exploration of intrinsically generated dynamics, (2) exploration of extrinsically generated dynamics, and (3) generation of input–output trajectories. The relation of these categories to similar uses in the voltage clamp and other techniques is also discussed. Finally, I discuss the method's limitations, as well as its possible synthesis with existing complementary approaches.

**Keywords: response clamp, control, closed-loop, physiology, psychophysics**

## **MOTIVATION**

A paramount goal of any neurophysiological study is to identify and characterize the function of neural systems. What kind of methodology can one employ in order to achieve this goal? A compelling option is to use the framework of control theory and signal processing, which engineers utilize to characterize artificial systems. The first step in this methodology is to define the system's input and output variables. Then, a set of signals is selected (e.g., step, pulse, or harmonic functions) and is applied to the system's input, while the output signal corresponding to each input is observed. Finally, the system is characterized in terms of its *input–output relations*, namely the conversion laws that dictate what kind of output arises in response to any given input (including novel, untested stimuli). Another realization of this approach is to use *noise* as input and to deduce the input–output relations using reverse correlations. This "open-loop" methodology is very efficient when simple systems are involved: a linear time-invariant (LTI) system (e.g., a classical resistor-capacitor circuit) may be fully characterized based solely on its response to a single step function; simple non-linear elements, such as analog transistors in their "linear" regime, may also be studied using "small signal" (i.e., harmonic) analysis.
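As a concrete instance of the open-loop approach, the single parameter of a first-order resistor-capacitor circuit can be recovered from one step response. The sketch below generates a noiseless step response and estimates the time constant from the time at which the output reaches about 63.2% of its final value. All names and numbers are illustrative, and real recordings would also require estimating the final value and handling noise.

```python
import math

def rc_step_response(t: float, tau: float, v_in: float = 1.0) -> float:
    """Output of a first-order RC low-pass filter to a step applied at t = 0."""
    return v_in * (1.0 - math.exp(-t / tau))

def estimate_tau(times, outputs, v_final: float) -> float:
    """Time at which the output first reaches (1 - 1/e) of its final value."""
    target = (1.0 - math.exp(-1.0)) * v_final
    samples = list(zip(times, outputs))
    for (t0, y0), (t1, y1) in zip(samples, samples[1:]):
        if y0 < target <= y1:
            # Linear interpolation between the two bracketing samples.
            return t0 + (target - y0) / (y1 - y0) * (t1 - t0)
    raise ValueError("output never reaches 63.2% of v_final")

# Simulated measurement: tau = 2.0 (arbitrary time units), sampled every 0.01.
times = [0.01 * k for k in range(1001)]
outputs = [rc_step_response(t, tau=2.0) for t in times]
tau_hat = estimate_tau(times, outputs, v_final=1.0)
print(round(tau_hat, 3))
```

Once the time constant is known, the response of this system to any other input follows by convolution, which is exactly the property that fails for the non-linear, history-dependent systems discussed next.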

The application of such tools to biological systems, however, is severely limited. First, these systems are invariably composed of non-linear elements which exhibit sharp threshold phenomena, i.e., small changes in their input may cause abrupt and significant changes in their output. Second, biological systems are stochastic, with a response variance which is often comparable in magnitude to the response mean (Arieli et al., 1996). Finally, time and activity-dependent processes continuously change the properties of the system; such changes are referred to as inactivation, adaptation, habituation, learning, etc. Therefore, the history of activity impacts on the system's internal variables, which in turn affect future activity—and so forth. This internal feedback results in dynamic instabilities that are manifested in complex trajectories of the system's output.

This challenge was confronted by Hodgkin et al. (1952) in their analysis of the mechanisms underlying the generation of action potentials. There, too, the dynamics of non-linear and history-dependent internal variables (in this case, membrane conductances) result in a complex voltage trajectory. The breakthrough in that study came with the development of a closed-loop technique called the *voltage clamp*, in which the system's output is stabilized by applying feedback control. Once the voltage was controlled, the dynamics of the membrane conductances were significantly simplified and could be measured by analyzing the control signal (i.e., the feedback current). This enabled comprehensive study and quantitative modeling of the system.

The essence of the clamp rationale, therefore, is to use control as a tool for system characterization; it inverts the experimental approach, determining the output of the system and observing the input signal required in order to obtain this desired output. One might expect this inverted system to simply reflect the behavior of the open-loop (i.e., current clamp) scenario, yet this is seldom the case in the non-linear, time-variant systems ubiquitous in physiology. The voltage clamp and other methods that emanated from it were extremely instrumental in elucidating the mechanisms of excitability. They did not, however, directly target the functionality of neural systems beyond the molecular level as the object of control.

The rationale of the voltage clamp technique was generalized to the study of neural systems at the functional, phenomenological level in a recently introduced method called the *response clamp* (Wallach et al., 2011). The current contribution aims at offering a comprehensive perspective of this method, its possible applications and extensions, as well as of its limitations. Note that, while termed a Review, this article does not attempt to provide an expansive outlook on the field of closed-loop methodology (for such a review of closed-loop physiology, for instance, see Arsiero et al., 2007).

#### **THE RESPONSE CLAMP EXPOSES FUNCTIONAL STATE VARIABLES OF NEURAL SYSTEMS**

The response clamp method utilizes a simple control procedure which allows robust manipulation of the system's response dynamics. First, a selected response feature is either directly measured or estimated from the system's activity. Then, a *Proportional-Integral-Derivative* (PID) controller (Levine et al., 1996) adjusts a stimulation parameter related to that feature in closed loop, so that the system's behavior converges to some desired pattern. The procedure eliminates the feedback from the system's response dynamics to its internal state dynamics. Moreover, these internal (and otherwise hidden) state dynamics are exposed to continuous measurement by analysis of the control signal. To demonstrate this, let us use the example of clamping the response *probability*, which served in previously published studies (Marom and Wallach, 2011; Wallach and Marom, 2012).

Many excitable systems are characterized by an "all-or-none" response to external perturbation. While the responses themselves, once evoked, are stereotypical and uniform, the *probability* of evoking these responses is graded and depends on various stimulation parameters, as well as on the present state of the system. Some qualitative understanding of this dependence is required in order to establish control over the response probability; the easiest case is when the probability is monotonically related to some stimulus feature (e.g., intensity or contrast relative to the background). The most abundant form of such monotonic relationships in physiology and psychophysics is the *sigmoidal curve*, which exhibits threshold and saturation phenomena; several mathematical functions were used to model such sigmoidal relations, e.g., the *error function*, the *hyperbolic tangent* and the *logistic curve*. Let us consider the latter, which is characterized by a *threshold* parameter θ and a *dynamic range* parameter σ,

$$P(x;t) = \frac{1}{1 + e^{-\frac{x(t) - \theta(t)}{\sigma(t)}}},\tag{1}$$

where *P* is the response probability and *x* is the stimulation intensity (**Figure 1**). Note that small values of σ signify a steep sigmoid and therefore high sensitivity to changes in stimulation intensity [the maximal slope being (4σ)<sup>−1</sup>].

**FIGURE 1 | Sigmoidal input–output relations.** The response probability's dependence on stimulation intensity follows a sigmoidal function with two parameters (state variables): the threshold θ and the dynamic range σ (see Equation 1). When two response clamps are used, one controller may clamp to 0.75 and the other to 0.25, thus yielding measurements of two distinct loci on the response curve (denoted *x*<sub>75</sub> and *x*<sub>25</sub>, respectively). The mean of these measurements is the threshold, while their difference is proportional to the dynamic range.

Due to the monotonic nature of this relationship, the response probability may be controlled by continuously adjusting the stimulation intensity using a negative feedback loop; the PID algorithm of the response clamp is a simple and efficient way to realize this loop. If we choose 50% as our target response probability, it is readily apparent that the stimulation sequence produced by the controller must satisfy at all times *x(t)* = θ*(t)*, i.e., the control signal in fact reflects the instantaneous threshold, a key functional state variable of the system. In practice this measurement contains some degree of inherent noise, since the system is stochastic and the response probability must be estimated using a finite number of samples.
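The feedback loop described above can be illustrated with a minimal simulation sketch. It is not the implementation used in the cited studies: the running-average estimator, its time constant `tau`, the gains `kp`, `ki`, `kd` (the derivative gain defaults to zero) and the function names are all assumptions chosen for illustration. A stochastic all-or-none unit obeying Equation (1) is stimulated repeatedly, the response probability is estimated on-line, and a PID rule adjusts the stimulation intensity so that the estimate converges to the target.

```python
import numpy as np

def logistic(x, theta, sigma):
    """Response probability of Equation (1)."""
    return 1.0 / (1.0 + np.exp(-(x - theta) / sigma))

def response_clamp(theta_t, sigma, target=0.5, kp=0.5, ki=0.05, kd=0.0,
                   tau=20.0, x0=0.0, seed=0):
    """Clamp the response probability of a simulated stochastic unit.

    theta_t : hidden threshold at each stimulus (array)
    Returns the control signal x and the on-line probability estimate.
    All gains and the estimator time constant are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n = len(theta_t)
    x = np.empty(n)
    p_est = np.empty(n)
    p_hat, integral, prev_err = target, 0.0, 0.0
    xi = x0
    for i in range(n):
        x[i] = xi
        # all-or-none response drawn with probability P(x; t)
        r = float(rng.random() < logistic(xi, theta_t[i], sigma))
        # on-line estimate: exponentially weighted running mean of responses
        p_hat += (r - p_hat) / tau
        p_est[i] = p_hat
        # PID update of the stimulation intensity (negative feedback)
        err = target - p_hat
        integral += err
        xi = x0 + kp * err + ki * integral + kd * (err - prev_err)
        prev_err = err
    return x, p_est
```

With a 50% target and a slowly drifting threshold (for example `theta_t = np.linspace(5.0, 7.0, 5000)` and `sigma = 1.0`), the returned control signal `x` should fluctuate around the hidden threshold once the initial transient has passed.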

Thus, using one response clamp controller, one locus on the input–output curve is tracked, providing a single-parameter characterization of these relations. A more detailed characterization is possible using multiple controllers, each clamping to a different value. The controllers take turns in stimulating the system (i.e., they are "time-multiplexed"), each using only the responses to its own stimuli in the control algorithm. This configuration provides non-simultaneous, mutually independent measurements of the system [see Wallach and Marom (2012) for details]. Thus, a multiple clamp set-up consisting of *n* controllers tracks *n* points in the input–output curve. Producing the state variables of interest from this set of measurements might require some "coordinate transform,"

$$
\vec{S} = f\left(\vec{X}\right),
\tag{2}
$$

where *X* is the vector of measurements (i.e., the set of *n* control signals), *f* is some function and *S* is a vector of *m* state variables of interest (*m* ≤ *n*). If, for example, the transform is linear, then

$$
\vec{S} = \mathbf{T} \cdot \vec{X},
\tag{3}
$$

where *T* is some *m* × *n* matrix.

In the example of the sigmoid relations presented in Equation (1), for instance, using two controllers enables tracking both the threshold and the dynamic range variables: the two controllers are used alternately, one clamping to 25% and the other to 75% response probability. Thus, the overall response is clamped to a constant 50% of the total stimulation and two distinct loci on the response curve, denoted *x*<sub>25</sub> and *x*<sub>75</sub>, are measured (see **Figure 1**). The two state variables θ*(t)* and σ*(t)* are produced using the linear mapping<sup>1</sup>

$$\theta(t) = \frac{x_{75}(t) + x_{25}(t)}{2}, \qquad \sigma(t) = \frac{x_{75}(t) - x_{25}(t)}{\ln 9}. \tag{4}$$

**Figure 2A** demonstrates typical recordings obtained using the double clamp procedure on isolated neurons *in vitro*. The state variables θ*(t)* and σ*(t)* (in this case they are both expressed in mV), obtained using Equation (4), are presented in **Figures 2B,C**.
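The mapping of Equation (4) can be checked numerically. The sketch below is illustrative only (the function names are hypothetical): it builds the matrix of the linear transform, computes the intensities at which the logistic curve of Equation (1) equals 75% and 25%, and recovers θ and σ from them. The factor ln 9 arises because the two loci lie at θ ± σ ln 3, so their difference is 2σ ln 3 = σ ln 9.

```python
import numpy as np

LN9 = np.log(9.0)

# Linear map of Equation (3) for the 75%/25% double clamp
T = np.array([[0.5, 0.5],
              [1.0 / LN9, -1.0 / LN9]])

def state_from_clamps(x75, x25):
    """Return (theta, sigma) from the two control signals (Equation 4)."""
    X = np.vstack([np.asarray(x75, float), np.asarray(x25, float)])
    theta, sigma = T @ X
    return theta, sigma

def clamp_level(p, theta, sigma):
    """Intensity at which the logistic curve of Equation (1) equals p."""
    return theta + sigma * np.log(p / (1.0 - p))

# Round trip: known state variables -> clamp loci -> recovered state variables
theta_true = np.array([10.0, 12.0])
sigma_true = np.array([1.0, 2.0])
x75 = clamp_level(0.75, theta_true, sigma_true)
x25 = clamp_level(0.25, theta_true, sigma_true)
theta_rec, sigma_rec = state_from_clamps(x75, x25)
```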

The use of multiple clamps, therefore, allows for a detailed, multiple-variable characterization of the system. This comes at the price of decreasing the temporal resolution of the measurements, since the total stimulation rate must be distributed between a number of controllers. Finally, multiple clamps may also be used to study structured or modular systems, e.g., the dynamics of coupling between interrelated systems or the integration of different inputs within the same system.

<sup>1</sup>This is a two-dimensional case of Equation (3), where

$$\vec{X} = \begin{pmatrix} x_{75} \\ x_{25} \end{pmatrix}, \qquad \vec{S} = \begin{pmatrix} \theta \\ \sigma \end{pmatrix} \qquad \text{and} \qquad \mathbf{T} = \begin{pmatrix} 0.5 & 0.5 \\ (\ln 9)^{-1} & -(\ln 9)^{-1} \end{pmatrix}.$$

#### **APPLICATIONS OF THE RESPONSE CLAMP**

The possible applications of the method may be classified into three categories: (1) exploration of intrinsically generated dynamics, (2) exploration of extrinsically generated dynamics, and (3) generation of input–output trajectories.

#### **EXPLORING INTRINSICALLY GENERATED DYNAMICS**

Let us return once more to our main source of inspiration, the voltage clamp technique. The studies pioneering this method, and many that followed, applied it to isolated systems (e.g., the squid giant axon) to observe the internal dynamics of ionic conductances at different voltage levels. Later, current fluctuations in microscopic, voltage-clamped membrane patches were analyzed to study the same issue at the molecular level, a technique termed "patch clamp" (Neher et al., 1978). In both cases, therefore, the voltage clamp was used to study the dynamics of the state variables in relation to changes in the clamped variable itself. Let us call such dynamics "intrinsic," as they are not related to some event occurring outside of the clamped system.

Similarly, the response clamp may be used to study the intrinsic dynamics of a system's state variables at the functional level. Such was the application of the response clamp in behavioral psychophysics (Marom and Wallach, 2011), where the subjects' response fluctuations in the clamped and unclamped scenarios were compared. The results suggest that these fluctuations are markedly restrained in closed-loop conditions, namely when the subject's actions have some (unconscious) effect on future stimuli. This led the authors to postulate that the well-documented trial-to-trial variability does not reflect, as previously suggested, an intrinsic "noise" process; rather, it stems from the unnatural open-loop experimental paradigm. The response clamp may be used in a similar manner to investigate the relations between psychophysical dynamics and brain activity (Monto et al., 2008), or to study the factors driving threshold fluctuations at the cellular, synaptic or network levels.

**FIGURE 2 | Neuronal threshold and range dynamics. (A)** Measurement of *x*<sub>75</sub> (yellow) and *x*<sub>25</sub> (purple) during 1 h of double clamping an isolated neuron *in vitro* (see Methods in Wallach et al., 2011). The two measurements are highly correlated. The neuronal threshold θ **(B)** and the dynamic range σ **(C)**, were computed using Equation (4) (blue line in both). **(D)** When the measurements are displayed in the threshold/range state plane, the significant correlation between them is evident. Fitting with a linear relation [Equation 5, black line in panel **(D)**, *R*<sup>2</sup> = 0.52] enables the estimation of the dynamic range based on the instantaneous threshold [black line in panel **(C)**]. **(E)** Examples of the instantaneous I/O relations (Equation 6) at three different points in time [marked with colored arrowheads in panels **(B)** and **(C)**]. The curve becomes stretched as the threshold increases. Note that as the threshold approaches the minimal value θ<sub>0</sub>, the curve approaches a step-function, i.e., the neuron becomes a deterministic element.

To further demonstrate how closed-loop analyses provide new insights into functional properties of a system, let us apply the double response probability clamp experiment discussed above to study intrinsic response dynamics of isolated neurons (Gal et al., 2010). By plotting the two derived state variables, θ and σ, against each other (**Figure 2D**), it becomes evident that the long term fluctuations of the dynamic range are highly correlated with those of the threshold. The relations between these two variables may thus be approximated using a linear expression, namely,

$$
\sigma(t) = \alpha \left( \theta(t) - \theta_0 \right). \tag{5}
$$

Equation (5) may be used to produce a smoothed estimate of σ based on the values of θ (black line in **Figure 2C**). The instantaneous input–output relations are therefore simplified in this case to a single state variable expression,

$$P(\bar{x}, \bar{\theta}(t)) = \frac{1}{1 + e^{-\left(\bar{x} - \bar{\theta}(t)\right) / \left(\alpha \bar{\theta}(t)\right)}},\tag{6}$$

where *x̄* = *x* − θ<sub>0</sub> and θ̄ = θ − θ<sub>0</sub> are the *relative* stimulation amplitude and threshold, respectively. **Figure 2E** visualizes such instantaneous I/O relations at three different instants during the recording (marked with arrowheads in **Figures 2B,C**). This result is, in fact, quite expected, as scaling of sensitivity to changes with stimulation magnitude is a ubiquitous phenomenon in both physiology (Abbott et al., 1997) and psychophysics (e.g., the Weber–Fechner law, see Carterette and Friedman, 1974). Note that the offset parameter θ<sub>0</sub> has a biophysical significance: it is the threshold value at which the dynamic range becomes zero, i.e., the neuron becomes *deterministic* (the I/O relations become a step function). θ<sub>0</sub>, therefore, is the minimal stimulation amplitude required to generate a spike at the maximal excitable state of the neuron. This is an example of how analyses of intrinsic fluctuations of the system's state variables (reflected in the response clamps' control signals) yield novel findings as to the functional properties of the system.
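The reduction from two state variables to one can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code of the original study: an ordinary least-squares line (via `numpy.polyfit`) stands in for whatever fitting procedure was actually used, and the function names are hypothetical. The fit yields α and θ<sub>0</sub> of Equation (5), which then parameterize the single-variable I/O relation of Equation (6).

```python
import numpy as np

def fit_range_vs_threshold(theta, sigma):
    """Least-squares fit of sigma = alpha * (theta - theta0) (Equation 5)."""
    # polyfit returns [slope, intercept] for sigma = slope * theta + intercept
    alpha, intercept = np.polyfit(theta, sigma, 1)
    theta0 = -intercept / alpha
    return alpha, theta0

def response_probability(x, theta, alpha, theta0):
    """Single-state-variable I/O relation of Equation (6).

    Valid only for theta > theta0, where the dynamic range is positive.
    """
    x_rel = x - theta0
    theta_rel = theta - theta0
    return 1.0 / (1.0 + np.exp(-(x_rel - theta_rel) / (alpha * theta_rel)))

# Synthetic, noise-free example with alpha = 0.1 and theta0 = 10 (made-up values)
theta = np.linspace(12.0, 20.0, 50)
sigma = 0.1 * (theta - 10.0)
alpha_fit, theta0_fit = fit_range_vs_threshold(theta, sigma)
```

In this form the steepness of the curve is no longer a free parameter: as the threshold approaches `theta0`, the term `alpha * theta_rel` shrinks and the curve approaches a step function.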

#### **EXPLORING EXTRINSICALLY GENERATED DYNAMICS**

While much can be learned by studying isolated systems, neural systems are invariably embedded in networks and environments, where they interact with many external factors; any neuron, for instance, is affected by the activity of its peers via synaptic inputs converging onto it. The voltage clamp proved very beneficial in investigating the mechanisms of this communication by providing measurements of the *post synaptic currents* (Hagiwara and Tasaki, 1958): the membrane potential is held constant at some desired value, and changes in the feedback current *due to an external event* (e.g., an action potential generated in a neighboring cell) are measured. Using this application of the clamp, one may isolate individual input components (i.e., by clamping to a specific reversal potential) and separate them from the dynamics of the system itself (by preventing the generation of action potentials).

Similarly, the response clamp may be used to study changes in the functional behavior of systems due to interactions with their external environment. In a recently published paper (Wallach and Marom, 2012), the long-term effects of network events (brief episodes of synchronous, network-wide activity, also called "bursts" or "population spikes") on neuronal threshold were analyzed. Since the measurements are inherently noisy, the effect of a single event was usually too small to observe, and event-triggered averaging was applied. Using this procedure it was shown that network synchronous events induce a long-lasting, bi-phasic deflection of the neuronal threshold. The results demonstrate interrelations between the dynamics at the two levels: the magnitude of the network event is reflected in the amplitude of the neuronal threshold deflection, while the relaxation of the threshold back to baseline is correlated with the recovery dynamics of network excitability.
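Event-triggered averaging, as mentioned above, can be sketched in a few lines. The function below is a generic illustration, not the procedure of the cited paper; the window sizes `pre` and `post` and the policy of skipping events near the edges of the record are assumptions. Segments of the control signal are aligned on the event times and averaged, so that noise uncorrelated with the events cancels while the event-locked deflection remains.

```python
import numpy as np

def event_triggered_average(signal, event_idx, pre, post):
    """Average signal segments aligned on event indices.

    signal    : 1-D array (e.g., the threshold estimate over time)
    event_idx : sample indices at which events occurred
    pre, post : number of samples kept before and after each event
    Events too close to the edges of the record are skipped.
    Returns (lags, mean_segment, n_used).
    """
    signal = np.asarray(signal, float)
    keep = [i for i in event_idx if i - pre >= 0 and i + post < len(signal)]
    if not keep:
        raise ValueError("no event has a complete window inside the record")
    segments = np.array([signal[i - pre:i + post + 1] for i in keep])
    lags = np.arange(-pre, post + 1)
    return lags, segments.mean(axis=0), len(keep)
```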

These results demonstrate how the response clamp could be applied to the study of such extrinsically generated dynamics. Any measurable external influence on the clamped system (either subject to experimental control or autonomous) may be analyzed in a similar manner; the effects of various chemical compounds (such as neuromodulators or toxins) on overall cellular excitability, for instance, may be thus quantified. Similarly, the method may be implemented to study the interactions between different inputs to the same system: the response to one source may be clamped, and changes in the control signal due to activation of the second source may be recorded.

#### **GENERATING INPUT–OUTPUT TRAJECTORIES**

Like many other closed-loop stimulation techniques (e.g., Wagenaar et al., 2005; Arsiero et al., 2007; Rolston et al., 2010), the response clamp offers the capability to control the activity patterns of neural systems. This capability may, in and by itself, be useful in different experimental scenarios. In such cases, the control signal is not used for analysis; rather, the effect of the produced dynamics on other (non-clamped) variables is explored. The most notable derivative of the voltage clamp technique in this context is the *dynamic clamp* (Sharp et al., 1993), in which the current injected in a closed-loop effectively adds or removes conductance components to the cell; the contribution of these conductances to the overall system behavior, and not the injected current, is the subject of analysis in this method.

In the response clamp, it is this "overall system behavior" that is manipulated. For instance, let us assume that an isolated system is repetitively stimulated at rate *f*<sub>in</sub>, and the response probability to this stimulation is clamped to some value *p*. The activity rate of this system, *f*<sub>out</sub>, is therefore also clamped, since

$$f_{\rm out} = f_{\rm in} \cdot p.\tag{7}$$

Thus, by maintaining *f*<sub>in</sub> constant and varying *p*, one may precisely produce desired activity patterns (see Figure 5 in Wallach et al., 2011). This may be useful if the clamped system serves as an input stage for downstream systems. Interestingly, Toettcher et al. (2011) recently suggested a similar approach (also using a PID-based algorithm) to control the dynamics of intracellular signaling pathways, thus generating a well-controlled, repeatable input to downstream components in the pathway.

Yet one may use this tool to do more than just control the output dynamics: by controlling both the clamped response and the input, regions of the *input–output space* may be efficiently covered. For instance, by jointly altering *f*<sub>in</sub> and *p* (in opposite directions), one may observe the system's behavior at different input levels, while maintaining the output level (*f*<sub>out</sub>) constant. Exploring various input–output combinations may elucidate the contributions of input-dependent effects (i.e., direct effects of stimulation) and activity-dependent effects to the overall behavior.
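Both uses of Equation (7) reduce to simple arithmetic, illustrated in the sketch below. The function name and the numerical rates are hypothetical and serve only as an example: the first case keeps the input rate fixed and derives the clamp target that produces a desired output-rate trajectory, and the second keeps the output rate fixed while the input rate is varied.

```python
import numpy as np

def clamp_target_for_rate(f_out, f_in):
    """Response probability p needed for a desired output rate (Equation 7)."""
    p = np.asarray(f_out, float) / np.asarray(f_in, float)
    if np.any((p < 0) | (p > 1)):
        raise ValueError("requires 0 <= f_out <= f_in")
    return p

# Case 1: constant input rate (10 Hz), time-varying desired output rate
t = np.linspace(0.0, 1.0, 5)
f_out_desired = 5.0 + 3.0 * np.sin(2 * np.pi * t)   # between 2 and 8 Hz
p_trajectory = clamp_target_for_rate(f_out_desired, 10.0)

# Case 2: constant output rate (4 Hz) at different input rates
f_in_levels = np.array([5.0, 10.0, 20.0])
p_levels = clamp_target_for_rate(4.0, f_in_levels)
```

In the second case the clamp target falls as the input rate rises, which is the "opposite directions" adjustment described in the text.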

## **LIMITATIONS OF THE RESPONSE CLAMP AND COMPARISON WITH OTHER APPROACHES**

#### **COVERAGE OF THE TIME-SCALE SPECTRUM**

The voltage clamp served as a source of inspiration and as a reference methodology for the development of the response clamp and its applications. However, a different range of time-scales is accessible in each of these two techniques. In voltage clamp, the controller (the feedback amplifier) is both extremely fast (i.e., its time-constant is much shorter than that of the clamped membrane) and powerful (i.e., the feedback gain is high) so that the clamp process is, for all practical purposes, instantaneous. Thus, the voltage clamp enables the investigation of even the fastest processes in the membrane (e.g., fast activation of sodium channels). Application of the voltage clamp to the study of extremely slow processes, however, is limited in several respects. First, voltage clamp is presently performed using physically invasive intracellular electrodes, a procedure which sets a practical upper bound to the duration of recordings. This technical limitation may be theoretically circumvented if a non-invasive realization of the method is invented (e.g., by harnessing optical techniques for both voltage measurements and current injection). However, voltage clamp is "invasive" in a different, more fundamental sense: as long as the cell is clamped, its natural behavior (i.e., emitting action potentials) is completely shut down. Thus, even if a non-invasive voltage clamp did exist, the results obtained using this technique would have little to do with natural long-term dynamics of excitability.

The response clamp provides access to a range of time-scales which is complementary to those covered by the voltage clamp. On one hand, access to the very fast time-scales may be limited due to stimulation constraints (e.g., maximal possible stimulation intensity or rate) and to the time-scale of the control algorithm (determined by the various control parameters). On the other hand, the straightforward realization of the method using non-invasive means of stimulation and recording (e.g., extracellular electrodes), and the fact that the cell's natural spiking behavior remains intact, extends the experimental access into extremely long-term processes. By determining the time-scale of the clamped dynamics, the response clamp provides an experimental tool to separate processes of different time-scales governing the behavior.

#### **APPLICABILITY TO CONTROLLABLE SYSTEMS**

A prerequisite to any application of the response clamp is to establish reliable control of the response feature of interest by manipulation of some input parameter. In the systems studied so far, establishing this control was particularly straightforward since the input–output relations were monotonically non-decreasing (e.g., the sigmoidal curve in Equation 1). In systems where these relations are of a more complex nature (e.g., bell shaped or multimodal), a more elaborate control algorithm is required (Astrom and Wittenmark, 1994). Moreover, in some systems the relevant input feature (the so called "receptive field") may be unknown. In such cases, some algorithm that finds this relevant input feature within the space of all possible inputs must be instated, in order for the clamp to be applicable. Such a combined solution is discussed in the next section.

#### **REVERSE CORRELATION AND WHITE-NOISE ANALYSES**

The response clamp demonstrates how closed-loop control may be used as a tool for system characterization. An important open-loop alternative which was already mentioned above is the *reverse correlation* approach. In this method the input–output relations of a system are exposed by computing various weighted statistics of the input variable, with the assigned weights derived from the output variable. The most common of these techniques is the *Spike-Triggered Average* (STA) and its extensions, which were used extensively in order to estimate the receptive fields of various neurons [see Simoncelli et al. (2004), and references therein]. When the input is under experimental control, approximated white-noise is usually applied, so that equal energy is applied across a broad range of time-scales (alternatively, whitening procedures may be used). This was shown to guarantee (under some additional restrictive conditions, see Paninski, 2003) that the estimation is unbiased.
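A minimal sketch of the spike-triggered average is given below, together with a toy simulated neuron on which to try it. Everything in the simulation is an assumption made for illustration (a static linear filter followed by an exponential nonlinearity and Poisson spike generation, with an arbitrary offset of −3 setting the baseline rate); it is not a model taken from the cited references.

```python
import numpy as np

def spike_triggered_average(stimulus, spikes, n_lags):
    """Mean stimulus segment preceding each spike.

    stimulus : 1-D array, one sample per time bin
    spikes   : 1-D array of spike counts per bin (same length)
    Returns an array of length n_lags; element 0 is the oldest sample
    and the last element is the sample in the bin of the spike.
    """
    stimulus = np.asarray(stimulus, float)
    spikes = np.asarray(spikes)
    sta = np.zeros(n_lags)
    total = 0
    for t in np.nonzero(spikes)[0]:
        if t >= n_lags - 1:                      # need a complete window
            sta += spikes[t] * stimulus[t - n_lags + 1:t + 1]
            total += spikes[t]
    return sta / total if total > 0 else sta

def simulate_toy_neuron(stimulus, kernel, seed=0):
    """Toy static neuron: linear filter, exponential nonlinearity, Poisson spikes."""
    rng = np.random.default_rng(seed)
    # causal filtering: drive[t] = sum_k kernel[k] * stimulus[t - k]
    drive = np.convolve(stimulus, kernel)[:len(stimulus)]
    rate = np.exp(drive - 3.0)                   # expected spikes per bin
    return rng.poisson(rate)

# Gaussian white-noise stimulus and a hypothetical filter
rng = np.random.default_rng(1)
stim = rng.standard_normal(50000)
kernel = np.array([0.0, 0.5, 1.0, 0.5, 0.0, -0.3, -0.2, 0.0])
spk = simulate_toy_neuron(stim, kernel)
sta = spike_triggered_average(stim, spk, len(kernel))
```

Because the STA is indexed from the oldest sample to the most recent one, it should resemble the time-reversed filter (`kernel[::-1]`) for this toy neuron. Note that the toy neuron has no refractoriness or threshold dynamics, so it cannot exhibit the output-dependent effects that the response clamp is designed to expose.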

There are several limitations to the use of reverse correlation methods. First, if the stimulus space is multi-dimensional, unbiased coverage of this space is very difficult experimentally, as the number of stimuli needed increases exponentially with each additional dimension [Benda et al. (2007) already proposed closed-loop stimulation as a method to efficiently sample this space when systematic, open-loop coverage is impractical]. More importantly, the underlying (and often unstated) assumption in these methods is that the system is feed-forward and static; the receptive field derived using STA is tightly related to the linear stage of *Linear-Nonlinear-Poisson* neuronal models, which do not account for refractoriness, output-dependent processes or threshold dynamics. Moreover, since the space of all possible output dynamics is not necessarily covered (e.g., high firing rates are rarely reached), such output-dependent effects may not be fully expressed in white-noise perturbation.

White-noise analysis, however, holds the considerable advantage of enabling identification of the input features to which the system is sensitive using very limited *a-priori* knowledge (e.g., the relevant modality). As it happens, this advantage precisely addresses the above mentioned impediment to the implementation of the response clamp, namely the need to identify a relevant and effective control variable. Thus, STA and the response clamp may be used in tandem, each method complementing the other: first, STA is implemented to expose the "static" or "baseline" receptive field; then, the response clamp uses this receptive field to produce the control variable, in order to expose the dynamic and output-dependent processes of the system.

#### **COMPARISON WITH OTHER CLOSED-LOOP TECHNIQUES**

Closed-loop control is, in itself, a widely accepted tool in many fields of research. Physiologists have employed closed-loop techniques and protocols to control aspects of neuronal activity at all levels of biological organization [see Arsiero et al. (2007) and references therein]. Already in the late 1960's, Eberhard E. Fetz showed that the activity of a single cortical neuron may be reinforced by applying closed-loop control of food pellets delivery (Fetz, 1969).

In psychophysics, a variety of procedures were developed over the past few decades in order to measure the psychometric threshold in closed-loop (Treutwein, 1995). The underlying assumption in all these procedures is that, in a given experiment, the threshold is *static*, and hence the procedure is stopped once it "converges" to a reliable estimate of this threshold. The key novelty in applying the response clamp to psychophysical investigations (Marom and Wallach, 2011), therefore, is in the analysis of post-convergence fluctuations of the threshold.

The fundamental difference between the voltage- and response-clamp methods on the one hand and other closed-loop techniques on the other is that the control signal in the latter (be it food-pellet delivery rate, stimulation amplitude, etc.) is seldom used in order to gain access to the dynamics of hidden state variables. An interesting exception worth mentioning is a clinical method called *glucose clamp* (DeFronzo et al., 1979), developed in order to diagnose insulin secretion and resistance by analysis of the control signals (rates of glucose/insulin perfusion or infusion). This method, though rarely used in clinical practice, is considered the "gold standard" in the diagnosis of diabetes.

It should be stressed that the use of the PID control algorithm is, in and by itself, of no fundamental importance to the realization of the response clamp. This algorithm was chosen owing to its simplicity and generality; the PID is a pure-feedback, *model free* algorithm, and therefore little *a-priori* knowledge of the controlled system is required in order to implement it. Any other algorithm which efficiently clamps the system's response may be used. One might expect, for instance, that using other adaptive psychophysical protocols would yield similar results to those of Marom and Wallach (2011); rigorous examination of this prediction, however, is yet to be performed.

## **CONCLUDING REMARKS**

The voltage clamp revolutionized the way physiologists study the mechanisms of excitability and synaptic communication. The response clamp method extends the clamp rationale toward functional characterization of neural systems. It offers a general framework for closed-loop exploration that may be implemented at any level of organization, using any available technique of measurement or perturbation. It may also be combined with complementary, open-loop approaches such as white-noise analysis. Finally, one may envision a paradigm in which voltage- or dynamic-clamp "command" is controlled in closed-loop by the response clamp algorithm. Such a multi-layered clamp set-up may aid in bridging the gap between mechanistic and functional characterization of neural systems.

#### **ACKNOWLEDGMENTS**

I am grateful to Dani Dagan and Shimon Marom for their valuable advice and assistance. The research leading to this review has received funding from the European Union Seventh Framework Program FP7 under grant agreement 269459 and was also supported by a grant of the Ministry of Science and Technology of the State of Israel and MATERA grant agreement 3-7878.

#### **REFERENCES**


to stimulus: adaptive sampling in sensory physiology. *Curr. Opin. Neurobiol.* 17, 430–436.


impulse transmission across the giant synapse of the squid. *J. Physiol.* 143, 114–137.


responses with stochastic stimuli. *Cogn. Neurosci.* 3, 327–338.


bursting in cortical cultures with closed-loop multi-electrode stimulation. *J. Neurosci.* 25, 680.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 September 2012; accepted: 09 January 2013; published online: 30 January 2013.*

*Citation: Wallach A (2013) The response clamp: functional characterization of neural systems using closed-loop control. Front. Neural Circuits 7:5. doi: 10.3389/fncir.2013.00005*

*Copyright © 2013 Wallach. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## A translational platform for prototyping closed-loop neuromodulation systems

*Pedram Afshar\*, Ankit Khambhati, Scott Stanslaski, David Carlson, Randy Jensen, Dave Linde, Siddharth Dani, Maciej Lazarewicz, Peng Cong, Jon Giftakis, Paul Stypulkowski and Tim Denison\**

*Medtronic Neuromodulation, Minneapolis, MN, USA*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Thomas DeMarse, University of Florida, USA Douglas J. Bakkum, Georgia Institute of Technology, USA*

#### *\*Correspondence:*

*Pedram Afshar and Tim Denison, Medtronic Neuromodulation, 7000 Central Avenue North, Minneapolis, MN 55432, USA. e-mail: timothy.denison@medtronic.com; pedram.afshar@medtronic.com*

While modulating neural activity through stimulation is an effective treatment for neurological diseases such as Parkinson's disease and essential tremor, an opportunity for improving neuromodulation therapy remains in automatically adjusting therapy to continuously optimize patient outcomes. Practical issues associated with achieving this include the paucity of human data related to disease states, poorly validated estimators of patient state, and unknown dynamic mappings of optimal stimulation parameters based on estimated states. To overcome these challenges, we present an investigational platform including: an implanted sensing and stimulation device to collect data and run automated closed-loop algorithms; an external tool to prototype classifier and control-policy algorithms; and real-time telemetry to update the implanted device firmware and monitor its state. The prototyping system was demonstrated in a chronic large animal model studying hippocampal dynamics. We used the platform to find biomarkers of the observed states and transfer functions of different stimulation amplitudes. Data showed that moderate levels of stimulation suppress hippocampal beta activity, while high levels of stimulation produce seizure-like after-discharge activity. The biomarker and transfer function observations were mapped into classifier and control-policy algorithms, which were downloaded to the implanted device to continuously titrate stimulation amplitude for the desired network effect. The platform is designed to be a flexible prototyping tool and could be used to develop improved mechanistic models and automated closed-loop systems for a variety of neurological disorders.

**Keywords: automation, closed-loop, neuromodulation, prototyping, hippocampus, seizure**

## **INTRODUCTION**

Neuromodulation devices for deep brain stimulation (DBS) deliver targeted electrical stimulation to treat symptoms of diseases such as Parkinson's disease, essential tremor, and dystonia. To ensure benefit, these therapies require not only accurate placement of the stimulating electrode within neural tissue, but also proper selection of stimulation parameters (e.g., amplitude, pulse width, and frequency). These parameters can be used to mitigate side effects including hemiballism, gait and speech disturbances, and dyskinesias (Limousin et al., 1996, 1998; Hamani et al., 2005; Yu and Neimat, 2008; Bronstein et al., 2011). While many patients benefit from DBS, the parameter selection process is largely heuristic, and reprogramming sessions may be weeks or months apart.

Effort has been applied for more than a decade to build automated systems (**Figure 1**) that use patient state to adjust stimulation parameters, thereby reducing the delay between stimulation updates by many orders of magnitude compared to human intervention. Realizing these systems requires development of sensors to measure patient data and algorithms to translate the data to the appropriate stimulation parameters (Priori et al., 2012). Complexity in the nervous system motivates partitioning the algorithm into two components: one that translates sensor data into estimates of state (i.e., a classifier algorithm) and another that translates the state estimate into a stimulation parameter update (i.e., a control-policy algorithm). In this work, state is left intentionally ambiguous because its meaning depends on the application: examples include seizure versus non-seizure; Parkinson's ON versus OFF; asleep versus awake; or others. Regardless of the application, dividing the algorithm provides the following benefits:


the desired trade-offs between performance, latency, and power consumption can be committed to embedded firmware for untethered operation.
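The two-part partition described above can be sketched in a few lines. This is a minimal illustration, not the platform's firmware: the function names, the single band-power feature, the threshold of 10.0, and the 0.05 V step are all hypothetical and chosen only to show that the classifier and the control policy can be swapped independently.

```python
from typing import Callable

# Hypothetical type aliases: the paper leaves "state" intentionally open.
Classifier = Callable[[float], str]            # sensor feature -> state label
ControlPolicy = Callable[[str, float], float]  # (state, amplitude) -> new amplitude


def threshold_classifier(beta_power: float) -> str:
    """Toy classifier: label the state from a single band-power feature."""
    return "high_beta" if beta_power > 10.0 else "low_beta"  # illustrative threshold


def step_policy(state: str, amplitude_v: float) -> float:
    """Toy control policy: nudge amplitude up or down depending on the state."""
    return amplitude_v + 0.05 if state == "high_beta" else amplitude_v - 0.05


def closed_loop_step(feature: float, amplitude_v: float,
                     classify: Classifier, policy: ControlPolicy) -> float:
    """One pass through the loop: estimate state, then update stimulation."""
    return policy(classify(feature), amplitude_v)
```

Because `closed_loop_step` only sees the two callables, either one can be replaced without touching the other.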

The "agent-environment" model from artificial intelligence research is one model for describing the relationship between the physician and the automated neuromodulation system in learning and implementing algorithms (**Figure 2**). The goal of the agent is to develop a performance element (i.e., algorithm) to model the relationship between environmental percepts and actions taken by effectors. The informed critic (i.e., clinician-researcher) updates the performance element by learning from its input data (sensors) and intermediate processing (knowledge) to develop new problems or hypotheses regarding the algorithm. Iterative testing allows the critic to simultaneously learn about the environment and develop the best performance element to modulate it.

The agent-environment model is suitable for the development of neuromodulation systems for several reasons. The model:


The translation of automated closed-loop systems has been helped by the development of more sophisticated neural sensors as well as improved understanding of the neural signals that underlie disease. Neurochip-2 (Zanos et al., 2011) and Hermes-D (Miranda et al., 2010) are two examples of technology to measure from the network. Neurochip-2 provides three channels of sensing and stimulation and allows for fast response loop closure to explore concepts like neural plasticity. The Hermes-D system allows for wireless, larger scale measurement (32 channels) of activity, but lacks stimulation capability. Both systems have the advantage of higher bandwidth, which allows for measurement of single unit activity, but draw greater than 1000× more power for operation than a typical DBS implant, giving them longevity of at most a few days between recharges. Moreover, the limited biomarkers and control variables currently known for neurological diseases motivate the development of platform technologies to enable improved first-principles understanding, which may lead to more rapid clinical translation. A critical step in developing this understanding is the ability to provide simultaneous neural recording and therapeutic stimulation, which is lacking in many research tools today. This capability is needed to understand the system transfer function, which we define as the relationship between stimulation and network behavior.

The study of biopotential biomarkers has shown spectral power in local field potentials (LFP) to be a disease-relevant indicator in a variety of settings (Schnitzler and Gross, 2005; Uhlhaas and Singer, 2006). In particular, these signals are useful in studying networks of thalamo-cortical structures and their dynamic inter-relationships, where abnormal neural synchrony is believed to be a hallmark of disease states (Llinas and Ribary, 2001; Siegel et al., 2012). Furthermore, quantified differences in neural synchrony, which can be measured by calculating power (uV/rtHz)<sup>2</sup> in a particular frequency band (for example, "beta"), have been shown to correlate with symptom severity. For instance, power in the beta band (15–30 Hz) has been found to be related to cardinal Parkinson's symptoms such as bradykinesia and rigidity (Hammond et al., 2007; Eusebio and Brown, 2009; Kühn et al., 2009; Priori et al., 2012). Characteristic changes in power at the theta tremor frequency (Hellwig et al., 2001) and coherent activity in the 6–15 Hz frequency band (Raethjen et al., 2002) have also been found in essential tremor. Synchronization in even lower frequencies (alpha and theta range) has been found in dystonia (Liu et al., 2002; Silberstein et al., 2003; Kühn et al., 2009; Sharott et al., 2008; Singh et al., 2011). Correlations between power in frequency bands as low as alpha (Zumsteg et al., 2006) and as high as 500 Hz (Blanco et al., 2011) have been reported in patients with epilepsy. Equally importantly, it has been shown that the effect of therapy can be correlated with LFP signals both in DBS (Eusebio et al., 2012; Priori et al., 2012) and levodopa therapy (Rossi et al., 2008). In aggregate, these studies suggest that LFP is a promising sensor input for automated systems treating a variety of neurological disorders.
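As a rough illustration of what "power in a frequency band" means computationally, the sketch below averages the squared magnitude of a discrete Fourier transform over the bins that fall inside a band. It is an unnormalized, textbook estimate written for clarity, not the spectral-extraction method of the device, and the 200 Hz sampling rate and the synthetic test tones are assumptions made for the example.

```python
import math


def band_power(samples, fs_hz, f_lo_hz, f_hi_hz):
    """Mean squared DFT magnitude over bins whose frequency lies in [f_lo, f_hi]."""
    n = len(samples)
    total, bins = 0.0, 0
    for k in range(n // 2 + 1):
        freq = k * fs_hz / n
        if f_lo_hz <= freq <= f_hi_hz:
            re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(samples))
            im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(samples))
            total += (re * re + im * im) / (n * n)
            bins += 1
    return total / bins if bins else 0.0


fs = 200.0  # illustrative sampling rate, not the device's
t = [i / fs for i in range(400)]
beta_like = [math.sin(2 * math.pi * 20.0 * ti) for ti in t]   # 20 Hz tone
theta_like = [math.sin(2 * math.pi * 6.0 * ti) for ti in t]   # 6 Hz tone

beta_in_band = band_power(beta_like, fs, 15.0, 30.0)
theta_in_band = band_power(theta_like, fs, 15.0, 30.0)
```

With these inputs the 20 Hz tone contributes clearly to the 15–30 Hz estimate while the 6 Hz tone contributes essentially nothing, which is the property a band-power biomarker relies on.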

In this work, we describe a platform for investigating these neural signals toward the development of an automated, closed-loop bioelectronic neuromodulation system. The platform comprises tools and a process flow to map the general learning agent to neuromodulation research and enables rapid prototyping of these tools in an implantable neuromodulation device. We use a preclinical, *in vivo* nervous system model to demonstrate the functional components of the system: collection of neural data, identification of relevant features (i.e., biomarkers), development of the algorithm, and consolidation of the algorithm into an implanted device.

## **SYSTEM STRATEGY AND INFRASTRUCTURE**

To implement this system we mapped the general learning agent functional blocks into the neuromodulation domain (**Figure 3**). The interface is bi-directional, extracting measures of neural state through percepts and actuating states in the nervous system through effectors. Percepts are received through a combination of sensors that include bioelectrical sensing from electrodes (e.g., ECG, EMG, and LFP) and inertial sensing (e.g., posture and activity). The effector pathway is defined by electrical stimulation pulse patterns, with parameters similar to approved therapy devices.

The challenge in designing the performance element is that characteristics of both percepts and effectors are still evolving. The algorithm addresses this ambiguity through the use of a classifier portion that maps sensed signals to estimates of state and a control-policy portion that maps state estimates into a desired stimulation.

We have implemented the learning system using an implantable research device and external application tool coupled with real-time telemetry; the system is illustrated in **Figure 4**. We call this partition of external learning elements that can be transferred to the implantable device performance element a "hybrid" design approach. The goal is to construct a complete platform (combining hardware, software, and firmware) for the learning procedure. The learning protocol includes four main steps from initial exploration to a chronic prototype for validation: collection of sensed neural data; design of the performance element's classifiers based on biomarkers; development of the performance element control policy based on measured neural system identification; and embedding of the performance element into the device for chronic validation.

To do this, we designed a system with the following features:

	- Bioelectric sensing with 4 bipolar sensing channels with 150 nV/rtHz noise floor without stimulation and 300 nV/rtHz noise floor with stimulation (nb: Stanslaski et al., 2012 describes constraints of sensing during stimulation).
	- Inertial sensing with a custom three-axis accelerometer with a 10 mg-rms resolution floor drawing under 600 nW/axis (Denison et al., 2007).
	- Stimulation using a commercially available neural stimulator system with accepted therapy.
	- Embedded algorithm with independently modifiable classifier and control-policy algorithms.

	- Save, parse, and annotate data collected from implantable device.
	- Implement, prototype, and compare machine learning algorithms.
	- Develop and test classifiers for the implanted system.
	- Stream data directly from the implantable device to an external processor with latency less than 1 s (0.5 s typical).
	- Send stimulation parameter updates to the implantable device with latency from command to stimulation at the electrode in less than 1 s (0.5 s typical).
	- Monitor state transitions in classifier and control-policy algorithms.

The key for this system is to integrate all necessary elements to provide a complete platform for an accelerated learning procedure amenable to rapid-prototyping and clinical translation. The details of these steps follow below.

#### **COLLECTION OF SENSED NEURAL DATA**

The design of the performance element starts with data collection. While there are many methods to sense biopotential data, fully implanted devices offer the advantage of higher signal fidelity than fully external devices (e.g., EEG), reduced infection risk, and improved chronic, ambulatory data collection capability compared with implanted devices with external components (e.g., externalizing leads during DBS surgery or the Hermes-D system). We have previously described the design and implementation of our fully implanted, bi-directional neural interface (Rouse et al., 2011). In brief, the device contains both sensing and stimulation components. The stimulation feature embodies the capability of a commercial DBS system. Biopotential sensing is enabled with a custom-integrated interface chip that allows for measurements of biopotentials including EMG, ECoG, LFP, and ECG (Avestruz et al., 2008), with noise floor of 150 nV/rtHz without stimulation and 300 nV/rtHz with stimulation (nb: Stanslaski et al., 2012, gives details and constraints of sensing during stimulation). The custom integrated circuit (IC) provides data analysis for up to four bipolar channels, which are selectable between Nyquist-rate waveforms (i.e., time channels) and spectral power at specific frequency bands of interest (i.e., power channels). The time channels provide complete spectral information; however, they incur the penalty of much higher power consumption. Power channels, on the other hand, extract a power envelope that is down-sampled to 5 Hz prior to digital signal processing. The reduction of signal dynamic range prior to digitization is a common technique for saving energy in micropower systems. The design model is to use the time channels for neural system identification, including identifying biomarkers, and then to transfer to the power channels to optimize efficiency chronically. The inertial element uses a micromachined three-axis accelerometer that transduces capacitive fluctuations to a voltage output. The resolution floor of the inertial element is 10 mg rms, in a 20-Hz band of detection. The sensor draws a total of 2 uW during normal operation, which minimizes the impact on device longevity (Denison et al., 2007). The sensor inputs from bioelectric and inertial sensors can be fused together in the algorithm, if desired.

Data acquisition also provides an opportunity for optimizing efficiency. While the device supports streaming telemetry for time and power channels, it is limited to environments in which the subject is close to a telemetry system, and desired data sampling frequency is low. Event triggered recordings allow for timed segments of high sampling frequency data when the subject is ambulatory. Triggers include user programmable, timer-driven intervals; embedded classifiers; external subject button presses; or combinations thereof. For a typical event structure like motion or seizure onsets, an 8-s loop recording could be applied for two recording channels. With a typical data rate of 422 Hz, approximately 200 recordings can be stored by the embedded SRAM until it needs to be downloaded and cleared. To organize and manage the resulting number of files gathered over a longitudinal study, a file system was developed to provide data structure to researchers. Information such as event time stamps, parameter settings, and event type is embedded in the data during recording and automatically extracted as a companion file to the data. The combination of the custom integrated hardware, signal processing strategy, and data gathering infrastructure facilitates the design of the performance element.
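The loop-recording figures quoted above imply a simple sample budget, worked through below. Only the sample counts are derived; the number of bits per sample is not stated in the text, so no byte total is assumed.

```python
LOOP_SECONDS = 8        # loop-recording length from the text
SAMPLE_RATE_HZ = 422    # typical data rate from the text
CHANNELS = 2            # recording channels from the text
RECORDINGS = 200        # approximate capacity from the text

# Samples captured by one triggered event across both channels.
samples_per_recording = LOOP_SECONDS * SAMPLE_RATE_HZ * CHANNELS

# Samples held in memory before a download is needed.
total_samples = samples_per_recording * RECORDINGS
```

Each event therefore holds 6752 samples, and roughly 1.35 million samples accumulate before the memory has to be downloaded and cleared.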

## **LEARNING → PERFORMANCE ELEMENT I: CLASSIFICATION**

The first subsystem of the performance element is a classifier to estimate the state of the nervous system from the sensed LFP biopotentials. Following the hybrid approach of our platform, we implement the classifier as both an internal function of the implantable device and as an external tool for learning and problem generation; the functional flow of the tool is illustrated in **Figure 5**. The external tool allows users to visualize time domain and spectral data, graphically annotate biomarkers of interest, and automatically generate classifiers using supervised machine-learning algorithms. In addition, classifier sensitivity and specificity can be adjusted manually to obtain the desired performance. The resulting classifiers can be stored and compared using automatically computed detection statistics. Beyond data manipulation, the key value of the tool is its relationship with the implanted device; the tool:


The default on-board classifier algorithm is a linear discriminant using a modified Fisher-discriminant approach; it is a linear decision boundary in a user-selectable feature space that distinguishes an event signal sample from other samples. The algorithm was designed using reduced set methods as described in Shoeb et al. (2009). The use of the multi-dimensional linear boundary was found to optimize trade-offs in power consumption, latency, sensitivity, and specificity. Recent work by Lee describes a similar trade-off calculation and supports our design choices (Lee Kyong et al., 2012). The on-board algorithm can be used for detecting events, which are time-stamped and used to trigger recordings while the subject is ambulatory, thereby reducing current drain nearly 100-fold and reducing classification latency 5-fold, from ∼1 s to ∼200 ms. If the biomarker's characteristics warrant a more complex classifier or shorter latency, the algorithm can be updated, trading off power consumption.
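To make the idea of a linear decision boundary in feature space concrete, the sketch below fits a plain two-class Fisher discriminant on two features. It is the textbook method, not the modified variant or the reduced-set design used on the device, and the feature values (for example, power in two bands) are invented for illustration.

```python
def mean(vectors):
    n = len(vectors)
    return [sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n]


def scatter(vectors, mu):
    """2x2 within-class scatter matrix, returned as (sxx, sxy, syy)."""
    sxx = sum((v[0] - mu[0]) ** 2 for v in vectors)
    sxy = sum((v[0] - mu[0]) * (v[1] - mu[1]) for v in vectors)
    syy = sum((v[1] - mu[1]) ** 2 for v in vectors)
    return sxx, sxy, syy


def fit_fisher(event, background):
    """Return (w, b) so that w.x + b > 0 labels a sample as an event."""
    mu_e, mu_b = mean(event), mean(background)
    exx, exy, eyy = scatter(event, mu_e)
    bxx, bxy, byy = scatter(background, mu_b)
    sxx, sxy, syy = exx + bxx, exy + bxy, eyy + byy
    det = sxx * syy - sxy * sxy
    dx, dy = mu_e[0] - mu_b[0], mu_e[1] - mu_b[1]
    # w = Sw^-1 (mu_event - mu_background), using the closed-form 2x2 inverse.
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    # Place the boundary midway between the two class means.
    mid = [(mu_e[0] + mu_b[0]) / 2, (mu_e[1] + mu_b[1]) / 2]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b


def is_event(sample, w, b):
    return w[0] * sample[0] + w[1] * sample[1] + b > 0


# Invented training features: (band-1 power, band-2 power).
event_samples = [(9.0, 2.0), (10.0, 3.0), (11.0, 2.5), (10.5, 1.5)]
background_samples = [(2.0, 2.0), (3.0, 3.0), (2.5, 1.0), (3.5, 2.5)]
w, b = fit_fisher(event_samples, background_samples)
```

Once `w` and `b` are fixed, classifying a new sample costs one multiply-accumulate per feature and a comparison, which is the kind of workload that suits a low-power embedded detector.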

## **LEARNING → PERFORMANCE ELEMENT II: CONTROL POLICY**

The second algorithm subsystem is the control policy that maps the state estimate into an optimal stimulation sequence. Like the classifier algorithm, we implemented the control-policy algorithm both internal to the device and as an external system for learning and problem generation. Non-linearities in network dynamics heighten the need to sample many input–output pairs for system identification. This can be accomplished in two ways:

First, the external tool may be used to sweep any stimulation parameter (e.g., amplitude or frequency) while the implantable device senses and saves biopotential data to the internal memory. Once retrieved from the device, system identification is performed by measuring the relationship between the stimulation parameters and biopotentials.

Second, the control policy may be adjusted in real-time on a researcher's device using an external device to wirelessly transfer data: sensed data is passed to the researcher's device and control-policy output is passed to the implantable device. This capability enables prototyping algorithms including the use of tapped-delay lines and time synchronizing with other sensors and hardware, and deriving a variety of signal features (e.g., phase amplitude coupling). The external device ensures data integrity in both directions through cyclic-redundancy checks and ensures patient safety by returning the device to a safe, pre-programmed stimulation state should the researcher's control policy behave unexpectedly. Additional safety is ensured by allowing the control policy to select only among stimulation parameter boundaries that have been predetermined by the researcher.
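The two safeguards described above, an integrity check on each transfer and a fallback to a safe setting, can be sketched as follows. The packet layout, the use of CRC-32 from Python's `zlib`, the safe amplitude, and the amplitude bounds are all assumptions for illustration; the text does not specify the telemetry format or the checksum the device uses.

```python
import struct
import zlib

SAFE_AMPLITUDE_V = 0.0            # hypothetical pre-programmed safe setting
AMPLITUDE_BOUNDS_V = (0.0, 1.7)   # hypothetical researcher-set limits


def encode_update(amplitude_v: float) -> bytes:
    """Pack an amplitude command followed by its CRC-32."""
    payload = struct.pack("<f", amplitude_v)
    return payload + struct.pack("<I", zlib.crc32(payload))


def apply_update(packet: bytes) -> float:
    """Return the amplitude to deliver, or the safe setting if anything looks wrong."""
    if len(packet) != 8:
        return SAFE_AMPLITUDE_V
    payload = packet[:4]
    (crc,) = struct.unpack("<I", packet[4:])
    if zlib.crc32(payload) != crc:
        return SAFE_AMPLITUDE_V           # corrupted in transit
    (amplitude_v,) = struct.unpack("<f", payload)
    lo, hi = AMPLITUDE_BOUNDS_V
    if not (lo <= amplitude_v <= hi):     # this comparison also rejects NaN
        return SAFE_AMPLITUDE_V           # policy asked for an out-of-bounds value
    return amplitude_v
```

In this sketch an out-of-bounds request falls back to the safe setting rather than being clipped to the nearest limit; either choice is defensible, and the source does not say which the platform makes.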

For the platform design, particular attention was paid to the latency in the telemetry links, which is a key factor in studying the dynamics effectively. In the first generation of development, we required that total latency through the channel be constrained to 1 s or less, and typically under 0.5 s. This degree of latency is suitable for many closed-loop algorithms that operate on timescales of seconds, hours, or days. The inherent latency of the links was dominated by two factors: the first is the data packet format and error correction handshakes using the 175 kHz ISM band, and the second is the internal packet transfer within the bioelectronic device, which, for safety reasons, has a lower interrupt priority than the therapeutic stimulation. Although the latency can be much improved by running the algorithm in the device, doing so limits flexibility during the initial learning phase. Therefore, for most cases, the new stimulation parameters are generated externally, where algorithms can be made arbitrarily complex and rapidly evaluated to see if they capture the desired behavior of the neural system. Because of regulatory constraints and requirements, it is highly desirable to validate the behavior before committing to verification of embedded firmware. For example, the platform can implement arbitrary control paradigms such as simple bang–bang controllers (modeled from early cardiac defibrillators) or more sophisticated proportional-integral-derivative and linear-quadratic-Gaussian controllers for achieving the optimal path to the desired state maintenance.
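Two of the control paradigms named above can be written compactly. The sketch below gives a textbook bang–bang rule and a textbook discrete proportional-integral-derivative (PID) controller acting on a biomarker reading. The gains, target, and output levels are placeholders, and nothing here is taken from the platform's own controllers.

```python
def bang_bang(measured: float, target: float,
              on_v: float = 1.0, off_v: float = 0.0) -> float:
    """Stimulate at a fixed level whenever the biomarker exceeds the target."""
    return on_v if measured > target else off_v


class PID:
    """Textbook discrete PID controller acting on error = measured - target."""

    def __init__(self, kp: float, ki: float, kd: float, dt_s: float):
        self.kp, self.ki, self.kd, self.dt_s = kp, ki, kd, dt_s
        self.integral = 0.0
        self.prev_error = None

    def update(self, measured: float, target: float) -> float:
        error = measured - target
        self.integral += error * self.dt_s
        # No derivative term on the first sample, when there is no history yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt_s
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A time step of about 0.5 s, for example `PID(kp=0.1, ki=0.0, kd=0.0, dt_s=0.5)`, would match the typical telemetry latency quoted in the text when the controller runs externally.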

#### **COMMITTING THE PERFORMANCE ELEMENT TO THE EMBEDDED DEVICE FOR VALIDATION**

After learning and prototyping the classifier and control policy, the algorithm can be validated by embedding onto the implantable device firmware using telemetry. The firmware uses a dedicated boot loader that allows for a new series of code to be flashed to non-volatile memory inside the device in a few minutes. The firmware in the device is partitioned such that the classifier and control policy can be updated independently of the therapy code, thereby keeping the interaction to that necessary for real-time classification and closed-loop operation. To assist in validation, the firmware is capable of streaming out the classifier and control-policy states in addition to sensed signals in real-time, so that the user has visibility into the algorithm operation. For chronic operation, the state transition information is included in the data log for validation.

## **METHODS: DEMONSTRATION OF THE LEARNING AGENT ARCHITECTURE**

As a demonstration of the capabilities of our method and tools, we used the system to investigate, characterize and dynamically modulate the hippocampal dynamics within the circuit of Papez. The circuit of Papez is a thalamo-cortical circuit implicated in temporal lobe epilepsy and contains a reentrant loop involving the hippocampus (HC) and thalamus. The goal was to design from first principles a demonstrative "homeostatic" feedback loop, which would titrate stimulation dynamically to maintain network activity reflected in the field potentials; the intention was to show the capabilities of the technology, as opposed to demonstrating or claiming a therapeutic algorithm *per se*. Design of the loop required that we address many issues of neuromodulation design: testing in an awake and freely moving subject, consideration for reliability and repeatability, and chronic implant stability and safety. Methods are detailed from the physiological preparation and technology points of view. The focus of this effort was on exploring the bioelectrical properties of the network and building up a closed-loop system; the conceptual schema for developing inertial-based systems, classifiers and control policies was previously demonstrated with this architecture (Schultz et al., 2012).

## **PHYSIOLOGICAL METHODS**

The device was chronically implanted *in vivo* in an ovine animal model under an IACUC-approved protocol; the procedure is described in Stypulkowski et al. (2011) and is summarized here. Following anesthesia, 1.5T MRIs were collected and transferred to a surgical planning station. Trajectories for a unilateral anterior nucleus (AN) DBS lead (Medtronic model 3389) and unilateral HC lead (Medtronic model 3387) were planned, and leads implanted using a frameless stereotactic system (NexFrame from Medtronic, Inc.). Once lead placement was confirmed based upon electrophysiological measures, Medtronic model 37083 extensions were connected to the DBS leads, tunneled to a post-scapular pocket, and connected to the prototype chronic implantable device. **Figure 6** illustrates the overall system placement and setup. Following closure of all incisions, anesthesia was discontinued, and the animal was transferred to surgical recovery.

All sensing and stimulation documented here were conducted in a single, awake sheep resting in a sling. In this particular work, all reported data were recorded from the HC with bipolar montage using contacts surrounding a monopolar stimulation contact (square, biphasic 300µs pulse width on E1 with far-field return) to mitigate artifacts via common-mode rejection during stimulation (Stanslaski et al., 2012); functional network data from thalamic stimulation and sensing are not shown, but can be found in Stypulkowski et al. (2011). Neural data, stimulus trains and classifier detections were recorded and saved by PC software via wireless telemetry. Data were gathered over 15 months and represent over 18 months of operation with the device completely implanted.

As background to the analysis that follows, our physiological system relies on three qualitatively discernible states in the biological system:


## **LEARNING FLOW METHODOLOGY**

The system was deployed on the physiological preparation to develop an embedded closed-loop algorithm using our tools and processes. The technical methods applied the design flow outlined in the system architecture to the physiological preparation:

• **Collection of sensed neural data**

Using the bi-directional telemetry link and embedded data gathering capabilities, we gathered baseline training data on background network activity. We also used the stimulator and sensing functionality to identify useful biomarkers and understand system transfer functions required for closing the feedback loop.

• **Learning → Performance element I: design of classifier algorithm**

The software algorithm tool was used to develop classifiers to support the after-discharge detection and verify suppression levels, which were validated using the real-time telemetry link.

• **Learning → Performance element II: development of the control policy**

After development of the classifiers, the auto-detection of after-discharges and therapy titration was validated using off-line, real-time processing with the bi-directional telemetry link. Key parameters were verified to be acceptable for timing latency. An additional algorithm (data not shown) was tested to show the system could automatically search the parameter space to find acceptable suppression behavior.

• **Committing the performance element to the embedded device for validation**

The final embedded algorithm combined three sub-algorithms into a single state machine: after-discharge (AD) detection and mitigation; suppression detector; and parameter search. The code was then downloaded to the device through wireless telemetry, error checked for complete flash writes, and the implant was then activated with the closed-loop algorithm. All states were exercised in the algorithm routine to validate operation. State transitions were also recorded in the device data records for automated annotation of files, allowing for observational validation and algorithm refinement.

#### **RESULTS**

#### **COLLECTION OF SENSED NEURAL DATA: IDENTIFICATION OF BIOMARKERS AND ALGORITHMS**

We aimed to explore the states of the system to find relevant control-variable biomarkers *in vivo*. Analysis of the post-stimulation data showed decreasing mean beta band power with increasing stimulation amplitude, suggesting suppression of activity, at least locally to the HC (**Figure 8**, right). To determine whether the network was truly suppressed, we performed a second series of transfer function experiments which measured the pre-stimulation baseline beta band power level and then delivered a high-amplitude (≥1.50 V) "probe pulse" capable of inducing AD. Because our experimental setup was not a seizure model, we used the post-stimulation AD duration as the desired output for assessment of network effect. Through spectral analysis of the data, we observed a potential control variable in the 20 ± 2.5 Hz band (approximately the beta band) that seemed correlated to the qualitatively observed states:


We characterized the biomarker over 15 months of data collection. For data shown, the units of spectral power in all data figures are (uV/rtHz)<sup>2</sup>, with an arbitrary scale referred to as least significant bit (LSB). Results showed that AD generation was a probabilistic function of stimulation amplitude: stimulation below 1.5 V did not result in any AD, stimulation between 1.5 and 1.7 V resulted in occasional ADs, and stimulation above 1.7 V always resulted in ADs (data not shown). Furthermore, AD duration appeared to be a function of the beta band pre-stimulation state; the greater the pre-stimulation beta band power above the defined suppressed state, the greater the AD duration (**Figure 7**). These observations were robust: the suppressed state beta band power varied by less than 2 LSB over the entire duration of the experiment. These results imply that spectral beta band power could be a control variable of interest when modulating network state.

To further understand the dynamic state of the system, we aimed to characterize the transfer function between our proposed biomarker—spectral beta band power—and stimulation patterns. To characterize the response of the biomarker to stimulation, we ran several titration sweeps. The recorded biomarker signals were captured at rest, during AD events, and during delivery of stimulation at several amplitudes (0.75–1.7 V) and frequencies (50, 120 Hz) in order to determine a reference value to discriminate both suppression and ADs. **Figure 8** shows the network response during stimulation (25 s, red) and between stimulation periods (25 s, blue). Importantly, the detection of AD induction required sensing neural activity in the presence of stimulation (Stanslaski et al., 2012) and would have been lost if channel blanking were employed.

The titration sweep for determining network-state response to stimulation is a critical step in designing the neural control algorithm. The data suggest that stimulation can have different effects on the network: while low and moderate stimulation amplitude appears to suppress the network excitability, high stimulation amplitude can induce an AD. Based on these results, we wanted to use our platform to implement a performance element with two key features: (1) change stimulation amplitude to keep the network at the balance point of suppression and induction of AD and (2) due to the probabilistic nature of AD induction, allow for the detection of AD in real-time to abort stimulation and adjust the stimulation levels lower. To do this, we designed the performance element in two parts: state classification and control policy implementation.

## **LEARNING → PERFORMANCE ELEMENT I: DESIGN OF CLASSIFIER ALGORITHMS**

To automate a control loop, we translated the observed qualitative correlations into a quantitative algorithm that detects the AD in real-time, using a classifier constructed with the external classifier tool. To help mitigate stimulation artifact, we also used a second spectral band (approximately 70 Hz) to capture stimulation energy in the network without being confounded by observable changes in neural physiology. This measure of stimulation artifact served as a feature input within the algorithm to distinguish stimulation from non-stimulation conditions, as described in Stanslaski et al. (2012). We include the two power channel outputs in **Figure 9** for demonstration purposes, showing correlation between the amplitude of beta band power and AD in **Figures 9A**,**B**.

After annotation was supplied to the training data sets, we used the tool to develop a linear, binary classifier to detect AD with and without stimulation. The detection probability density plot, receiver operating characteristic (ROC) curves, and detection cross-validation result, which are directly generated by the software tool, are presented in **Figures 9C**,**D**, and **E**, respectively. The detection probability histogram (**C**) represents the distance of the state from the boundary, allowing multiple dimensions of data to collapse into a single graph of biomarker separation. The detection probabilities graph (**D**) provides an estimate of the true-positive and false-positive rates based on the derived classifier. The filtered detection summary graph (**E**) allows the user to set onset and termination duration constraints (i.e., a minimum duration in a classified state before detection is determined) to help improve specificity at the expense of classifier latency. Graph (**E**) shows an overlay of the classification state over the data. We downloaded and embedded into the implanted device the classifier that optimized sensitivity, specificity, and latency trade-offs.
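The onset and termination duration constraints mentioned above amount to debouncing the raw classifier output. The sketch below is one simple way to do this, assuming a boolean per-sample classifier output and sample-count thresholds; the function name and the counting rule are illustrative and are not taken from the software tool.

```python
def filter_detections(raw, onset_n, termination_n):
    """Debounce a boolean per-sample classifier output.

    A detection starts only after `onset_n` consecutive positive samples and
    ends only after `termination_n` consecutive negative samples.
    """
    detected, run, out = False, 0, []
    for positive in raw:
        if positive != detected:
            run += 1                      # evidence for a state change
            needed = termination_n if detected else onset_n
            if run >= needed:
                detected, run = positive, 0
        else:
            run = 0                       # evidence broken; start counting again
        out.append(detected)
    return out


raw_output = [False, True, False, True, True, True,
              False, True, False, False, False]
filtered = filter_detections(raw_output, onset_n=3, termination_n=2)
```

The isolated positive at index 1 is ignored, which improves specificity, while the true onset at index 3 is only reported at index 5, which is the latency cost the text describes.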

In addition, we used the tool to develop a separate classifier that could detect the presence of the suppression state based on the beta signal. This was also tested and similarly embedded in the implanted device. Thus, with these classifiers, the state of the neural system could be quantitatively classified on-line as suppression, AD, or resting.

#### **LEARNING → PERFORMANCE ELEMENT II: DEVELOPMENT OF THE CONTROL POLICY**

With the classifier in place, we next determined the control policy. Given the unknown neural dynamic requirements and algorithm parameters, the control policy was first prototyped using the hybrid development partition to determine the stimulation amplitudes and changes that would be used for each state. **Figure 10** illustrates an example of this testing to show that stimulation can induce both the AD state and the suppression state. In this test, the controller logic uses two stimulation programs. In the cycle stim (CS) program, high amplitude stimulation (1.50 V) capable of inducing an AD is cycled on and off, while spectral power in critical bands and classifier state is continuously telemetered out of the device. If the classifier does not detect an AD, stimulation continues to cycle.

When the AD state is detected, stimulation is stopped and an alternate setting is applied. The decreased network excitability (DNE) program delivers a lower stimulation level (1.25 V) after a programmed delay for one cycle, then returns to the CS program.
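The two-program logic described above can be sketched as a small per-cycle state machine. This is an illustrative sketch only: the amplitudes come from the text, but the function name is hypothetical, the programmed delay is not modeled, and the sketch assumes the classifier output is ignored during the single DNE cycle.

```python
CS_AMPLITUDE_V = 1.50   # cycle stim (CS) program amplitude, from the text
DNE_AMPLITUDE_V = 1.25  # decreased network excitability (DNE) amplitude

def run_policy(ad_detected_per_cycle):
    """Return the (program, amplitude) applied on each stimulation cycle.

    `ad_detected_per_cycle[i]` is True if the classifier reported an AD
    during cycle i. The delay before the DNE cycle is omitted (assumption).
    """
    program = "CS"
    schedule = []
    for ad_detected in ad_detected_per_cycle:
        amplitude = CS_AMPLITUDE_V if program == "CS" else DNE_AMPLITUDE_V
        schedule.append((program, amplitude))
        if program == "CS" and ad_detected:
            program = "DNE"   # AD detected: switch to the lower setting
        else:
            program = "CS"    # DNE runs for one cycle, then CS resumes
    return schedule
```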

**Figure 10** (bottom) shows typical results achieved with the hybrid algorithm. We ensured no false-positive detections occurred in both open-loop and closed-loop cases by examining the time-domain data. Our results demonstrate that open-loop stimulation leads to sustained ADs *post-stimulation* roughly 50% of the time when the cycle stimulation is applied without the algorithm enabled, whereas with the algorithm enabled, the sustained AD probability drops to 0% [*N* = 12, three monitor sessions, 15 months].

## **COMMITTING THE PERFORMANCE ELEMENT TO THE EMBEDDED DEVICE FOR VALIDATION**

As a final prototyping phase, we desired a system capable of embedded operation to enable chronic, ambulatory data collection for long-term validation as well as improved response latency compared to subjects or other observers (e.g., researchers, caregivers).

Based on findings with the hybrid system, the device was enabled to run a multi-branch algorithm for hippocampal network dynamics. The algorithms developed for the embedded detector were merged into a common state machine. As shown in **Figure 11**, this included the three critical loops for the algorithm corresponding to the states of the system, all of which share a common *stimulation sequence* forward loop. The beta band power threshold for determining the state classification was determined using the classifier. In addition, we prescribed an increment of 0.05 V and decrement of 0.1 V for stimulation controllers—i.e., slow attack, fast recovery for attempting to maximize safe searches of the parameter space.


• *Resting loop*—detects resting state and increments stimulation amplitude to verify the ceiling is still valid; this loop is activated when suppression is no longer being achieved with the suppression loop to counteract slowly changing behavior such as circadian patterns, medication dosing, etc.

Note: Additional parameters such as initialization variables and counters are also programmable through telemetry and could be refined as needed.

The algorithm firmware was downloaded into the device and validated with cyclic-redundancy checking.

The embedded algorithm was then evaluated with on-line processing in the ovine model. **Figure 12** presents a typical outcome of the standalone implantable device with the algorithm embedded; we demonstrate all possible states of the control policy in this data sample. We start by stimulating at an amplitude known to generate AD, resulting in appropriate stimulation shut-off. Then, stimulation is ON with reduced stimulation amplitude (from 1.7 to 1.6 V). Stimulation at this level produces suppression for one cycle, leading to maintenance of this stimulation level for one cycle. On the next cycle, however, suppression is not detected, resulting in stimulation increase to 1.65 V and then again to 1.7 V. At 480 s, the 1.7 V stimulation again leads to an AD. The stimulation is again turned off due to the AD detection and the stimulation level is returned to 1.6 V. This testing showed that the learning procedure could result in a fully embedded solution, from initial identification of biomarkers and transfer functions to a fully-embedded control policy operating *in vivo*.
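The amplitude sequence in this walk-through follows from the 0.05 V increment and 0.1 V decrement rule. The sketch below reproduces only that amplitude bookkeeping, as an illustration rather than the embedded firmware: the function names, the state labels, and the floor and ceiling bounds are assumptions, and stimulation shut-off on AD detection is not modeled. Amplitudes are held as integer millivolts to avoid floating-point drift.

```python
INCREMENT_MV = 50    # 0.05 V step up ("slow attack")
DECREMENT_MV = 100   # 0.1 V step down ("fast recovery")

def update_amplitude(amplitude_mv, state, floor_mv=0, ceiling_mv=1700):
    """Return the next stimulation amplitude for a classified state.

    floor_mv and ceiling_mv are hypothetical bounds standing in for the
    programmable interlocks; they are not values from the paper.
    """
    if state == "AD":
        return max(floor_mv, amplitude_mv - DECREMENT_MV)
    if state == "resting":          # suppression not detected: step up
        return min(ceiling_mv, amplitude_mv + INCREMENT_MV)
    if state == "suppression":      # suppression achieved: hold the level
        return amplitude_mv
    raise ValueError(f"unknown state: {state!r}")

def trace(start_mv, states):
    """Apply update_amplitude over a sequence of classified states."""
    amplitudes = []
    amplitude = start_mv
    for state in states:
        amplitude = update_amplitude(amplitude, state)
        amplitudes.append(amplitude)
    return amplitudes
```

Starting at 1700 mV with the state sequence AD, suppression, resting, resting, AD, the sketch steps through 1600, 1600, 1650, 1700, and 1600 mV, matching the levels reported above.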

Several practical points are also worth noting. First, the algorithm is power efficient: it runs reliably with a total power drain of less than 20 µW with the addition of sensing and algorithm control. This represents roughly 10% of the nominal therapy power used in movement disorder neuromodulation systems. Second, the algorithm shows robustness because the signal power channel baseline is stable over 15 months with variation within 2 LSB, which is more than 20 times smaller than the AD detection threshold. Finally, the control policy is restricted to a bounded set of stimulation parameters with programmable inter-locks, thereby helping to ensure tolerability and safety.

## **DISCUSSION**

Automated closed-loop control systems may potentially improve neuromodulation therapies by reducing latency for therapy adjustments and personalizing therapies to improve patient health. These approaches rely on improved understanding of the nervous system dynamics and how they drive the mechanisms of action for neuromodulation. Mapping these concepts to a learning agent framework helps define key components that can lead to better characterization of the system: sensors for chronically collecting data; effectors for modulating the network; and algorithms for translating data into stimulation parameters. The investigational platform described here fills a gap in current technology by enabling a process methodology for designing and prototyping these algorithms and embedding them in an automated closed-loop neuromodulation device.

In this work, we demonstrated a platform consisting of an implantable device integrated with external tools for developing classifier and control-policy algorithms. We tested the platform in a system that exhibited contrasting behavior with respect to stimulation amplitude, motivating our algorithm design to find the fine balance point between over- and under-stimulation. One of our significant findings was a potentially non-monotonic relationship between stimulation amplitude and system response: beta band power was reduced from baseline at low stimulation amplitudes, while it was increased at higher stimulation amplitudes, resulting in occasional AD. These results imply that neural feedback may be an important consideration in determining the optimal stimulation amplitude.

While we performed our experiments in an *in vivo* ovine model, our investigational approach could be applied to the study of other disease states, such as Parkinson's disease, essential tremor, epilepsy, or other neurological conditions. Preliminary exploration of the automated algorithm supports the design of other closed-loop systems using similar control policies to those described here (Eusebio and Brown, 2009; Priori et al., 2012). Furthermore, our system is not limited to neural biopotentials; we can theoretically record any biopotential of sufficient amplitude (e.g., EMG). These biopotentials, along with other sensor data, may be useful in prototyping and validating algorithms for future automated closed-loop systems (Yamamoto et al., 2012).

Our design involved several practical considerations. Perhaps most importantly, we designed the generalized learning system on a chassis that has received prior approval for select therapies. Building off an established foundation helps to lower the translational barriers to exploring advanced systems. An additional key design element is the ability to sense activity in the presence of stimulation (also described in Priori et al., 2012). Our results demonstrate the potential importance of network phenomena that occur while the network is being modulated, especially while characterizing transfer functions of the nervous system that might underlie mechanisms of action. In this work, this capability allowed us to monitor for evidence of AD during the stimulation and to dynamically adjust the stimulation ceiling as a function of suppression state. These phenomena may be missed by neural sensing architectures that blank out the signal chain during the stimulation (Sun et al., 2008).

Another practical consideration is that the learning pathway is amenable to chronic embedded algorithm operation, particularly in light of the trade-offs between complexity and performance versus simplicity and power consumption (Lee Kyong et al., 2012). The offline analysis and hybrid design approach allow for rapid prototyping of concepts before commitment to embedded firmware. Once embedded, the power draw with our system could be reduced to 20 µW, below 10% of existing nominal therapy power for Parkinson's disease, and latency can be reduced to approximately 200 ms. In the future, use of complementary sensors such as accelerometers and patient feedback may enable algorithms to maintain simplicity and efficiency without sacrificing performance. Ultimately, the ability to titrate stimulation to therapy using responsive algorithms (such as the suppression loop) could potentially yield a net energy savings of chronic responsive systems.

Finally, the experiments allowed us to observe overall reliability of the system. Observed signals of network states were stable over the course of the 15-month experiment, providing evidence of robustness in our detection algorithms (*>*20-fold margin) to detect state changes. This finding, combined with other results (Stypulkowski et al., 2011), provides initial confidence in the reliability of the system in an *in vivo* environment. In addition, our control-policy implementation used bounded stimulation parameters to ensure tolerability and safety. The chronic reliability and means of ensuring safety provide both a mechanism for longitudinal learning to occur within one subject and chronic validation of the methods, thereby greatly increasing the likelihood of clinical translation.

The study does suffer from limitations, mostly tied to the choice of animal model used for validation. First, the validation is tied to physiology measures and not a true disease model. The ultimate therapeutic utility of the algorithm will require additional testing in animal and clinical models which might drive refinement of the algorithm. In addition, the hybrid system is limited by telemetry latency. Future investigations characterizing the latency of the feedback loop may be needed to better understand this impact vis a vis neural dynamics. System latency may be particularly relevant when stimulating multiple neural regions, such as in stimulating pairs of neural targets or in functional electrical stimulation of muscle in response to sensed neural signals. Ultimately this latency is addressed when embedded in the system, but might limit the broader application of the hybrid design process.

In summary, we believe increased understanding of the nervous system with such platform systems may lead to improved technical capability to modulate the nervous system to address pathophysiology. As these systems mature, they can be embedded into devices to augment and potentially correct for a malfunctioning nervous system.

#### **REFERENCES**

J., et al. (2001). Tremor-correlated cortical activity in essential tremor. *Lancet* 357, 519–523.


involuntary movements induced by subthalamic nucleus stimulation in parkinsonian patients. *Mov. Disord.* 11, 231–235.


brain stimulation (aDBS) controlled by local field potential oscillations. *Exp. Neurol.* pii: S0014–4886. doi: 10.1016/j.expneurol.2012.09.013. [Epub ahead of print].


et al. (2008). Is the synchronization between pallidal and muscle activity in primary dystonia due to peripheral afferance or a motor drive? *Brain* 131, 473–484.


Zumsteg, D., Lozano, A., and Wennberg, R. (2006). Rhythmic cortical EEG synchronization with low frequency stimulation of the anterior and medial thalamus for epilepsy. *Clin. Neurophysiol.* 117, 2272–2278.

**Conflict of Interest Statement:** The authors are all employees of Medtronic; Paul Stypulkowski, Scott Stanslaski, Randy Jensen, David Carlson, and Tim Denison are also Stockholders in Medtronic.

*Received: 03 September 2012; accepted: 18 December 2012; published online: 22 January 2013.*

*Citation: Afshar P, Khambhati A, Stanslaski S, Carlson D, Jensen R, Linde D, Dani S, Lazarewicz M, Cong P, Giftakis J, Stypulkowski P and Denison T (2013) A translational platform for prototyping closed-loop neuromodulation systems. Front. Neural Circuits 6:117. doi: 10.3389/fncir.2012.00117*

*Copyright © 2013 Afshar, Khambhati, Stanslaski, Carlson, Jensen, Linde, Dani, Lazarewicz, Cong, Giftakis, Stypulkowski and Denison. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Dynamic control of modeled tonic-clonic seizure states with closed-loop stimulation

#### *Bryce Beverlin II <sup>1</sup> and Theoden I. Netoff <sup>2</sup> \**

*<sup>1</sup> Department of Physics, University of Minnesota, Minneapolis, MN, USA*

*<sup>2</sup> Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN, USA*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*John M. Beggs, Indiana University, USA*

*John D. Rolston, Emory University, USA*

*Robert E. Gross, Emory University School of Medicine, USA*

#### *\*Correspondence:*

*Theoden I. Netoff, Department of Biomedical Engineering, University of Minnesota, 7-105 Nis Hasselmo Hall, 312 Church St SE, Minneapolis, MN 55455, USA. e-mail: tnetoff@umn.edu*

Seizure control using deep brain stimulation (DBS) provides an alternative therapy to patients with intractable and drug-resistant epilepsy. This paper presents novel DBS stimulus protocols to disrupt seizures. Two protocols are presented: open-loop stimulation and a closed-loop feedback system utilizing measured firing rates to adjust stimulus frequency. Stimulation suppression is demonstrated in a computational model using 3000 excitatory Morris–Lecar (M–L) model neurons connected with depressing synapses. Cells are connected using second order network topology (SONET) to simulate network topologies measured in cortical networks. The network spontaneously switches from tonic to clonic as synaptic strengths and tonic input to the neurons decrease. To this model we add periodic stimulation pulses to simulate DBS. Periodic forcing can synchronize or desynchronize an oscillating population of neurons, depending on the stimulus frequency and amplitude. Therefore, it is possible to either extend or truncate the tonic or clonic phases of the seizure. Stimuli applied at the firing rate of the neuron generally synchronize the population while stimuli slightly slower than the firing rate prevent synchronization. We present an adaptive stimulation algorithm that measures the firing rate of a neuron and adjusts the stimulus to maintain a relative stimulus frequency to firing frequency and demonstrate it in a computational model of a tonic-clonic seizure. This adaptive algorithm can affect the duration of the tonic phase using much smaller stimulus amplitudes than the open-loop control.

**Keywords: seizure model, deep brain stimulation, tonic-clonic, synchrony**

#### **INTRODUCTION**

Approximately one third of patients with epilepsy do not have sufficient control of their seizures even with the use of antiepileptic drugs. The use of deep brain stimulation (DBS) to suppress or truncate seizures is an alternative approach for controlling seizures in drug refractory patients. However, DBS for seizure suppression has had mixed clinical success (Loddenkemper et al., 2001). The SANTE trial, a multi-center clinical trial, used open-loop DBS, and demonstrated a 35% reduction in seizures (significantly more than in the control group), but with very few seizure-free patients (Fisher et al., 2010). Neuropace has developed a closed-loop stimulator that has been tested in multi-center clinical trials, resulting in a 37.9% decrease in seizures, which is also significant compared to a control group. Although some patients are reluctant to have a device implanted in their brain (Arthurs et al., 2010), there exists a population of patients who have exhausted other medical options and are willing to take surgical risks for any reduction in seizures. There is therefore a need to improve the efficacy of DBS.

We presume that DBS may be more effective if the stimulation parameters could be optimally tuned for each patient. However, improving the efficacy of DBS by tuning the stimulus parameters is difficult, particularly as the mechanism by which DBS suppresses seizures is poorly understood (Vonck et al., 2003). With a better understanding of the mechanisms by which DBS functions, we may be able to design and optimize stimulus parameters and develop a closed-loop stimulator that tunes the stimulus parameters. This paper illustrates how DBS stimulus parameters can be selected based on the dynamics of neurons within the targeted brain area in order to affect synchrony in different stages of a seizure.

There are several different working hypotheses about the underlying mechanism by which DBS is able to suppress seizures. In animal models indirect evidence suggests that stimulation in the anterior thalamic nuclear complex can induce a release of the inhibitory neurotransmitter GABA, which presumably depresses the activity of neurons and results in the observed increase of seizure threshold (Mirski et al., 1997). In brain slice experiments it is possible to directly measure the effect of DBS stimuli in neurons. It has been shown that high frequency stimulation can cause neurons to go into a depolarization blockade, where cells are unable to fire, that will truncate the seizure (Bikson et al., 2001). DC electric fields can be used to hyperpolarize neurons, in order to change the neuron's excitability and suppress seizures (Gluckman et al., 1996). It has also been suggested that the stimulation may prevent neuronal synchronization; under this hypothesis DBS stimuli with a Poisson train of pulses at the same frequency as the high frequency stimulation have been shown to suppress seizures (Wyckhuys et al., 2010).

#### **TONIC-CLONIC SEIZURES**

Grand-mal epileptic seizures consist of two major stages: the tonic and clonic phases. In the tonic phase patients lose consciousness and their muscles tense up, while in the clonic phase the patients begin to jerk (Fisch and Olejniczak, 2006; Bragin et al., 2010). High frequency oscillations (HFOs) (oscillations above 150 Hz) in the intracranial electroencephalogram (EEG) recordings are observed during these seizures (Schindler et al., 2007b) as well as using magnetoencephalogram (MEG) (Garcia Dominguez et al., 2005; Perez Velazquez et al., 2007). In human studies, it has been shown that firing rates at the onset of the seizure are very high and decrease over the course of the tonic-clonic seizure (Ward, 1961). EEG measurements suggest that population amplitude and coherence is greater in the clonic phase than the tonic phase (Quian Quiroga et al., 1997). Frequency sweeps observed in EEG during seizures are a biomarker that can be used to detect seizures (Schiff et al., 2000).

We have recently proposed a model explaining (1) a mechanism for the transition from tonic to clonic phases by slowing of firing of neurons over the seizure and (2) the differing ability of neurons to synchronize at high firing rates and low firing rates (Beverlin et al., 2011). The change in the firing rate in the model is due to synaptic depression of the neurons and spike rate adaptation of the neurons, both of which occur at high firing rates (Abbott et al., 1997; Manor and Nadim, 2001). When analyzing EEG, determining the transition from tonic to clonic is somewhat subjective to the epileptologist. In this paper we will simply define the tonic phase of the seizure as network activity with high firing rate and low synchrony, while the clonic phase is high firing rate with high synchrony.

#### **FIRING RATE AND NETWORK SYNCHRONY**

In previous brain slice experiments, it was found that while the firing rates of neurons were high during the tonic phase of the seizure, neurons exhibited a low degree of correlation. During the clonic phase the firing rate decreased and the population became highly synchronous (Netoff and Schiff, 2002). This transition from the tonic to clonic phases may be integral to the evolution of the seizure and its eventual termination. Based on this hypothesis, it has been shown that synchronizing populations with DBS pulses may promote seizure termination and truncate the seizure (Schindler et al., 2007a). We have recently developed a computational model that illustrates a mechanism by which synchrony changes during a seizure (Beverlin et al., 2011).

In our model, seizures start by the failure of inhibition. Without inhibition, the excitatory neurons increase their firing rate and excitatory drive within the network increases in a positive feedback loop resulting in very high firing rates. Over time, the firing rate slows down, and the network transitions to a synchronous high amplitude clonic phase of seizure. In the model the transition from tonic to clonic phases is caused by a change in the sensitivity of neurons to synaptic inputs as their firing rate slows; this leads to a shift in synchrony. We demonstrate how the transition occurs in a network of model neurons and explain the mechanisms using pulse-coupled oscillator theory. There are several ways in which network synchrony may change *in vivo*, including the reintroduction of activity from the inhibitory population (provided they have entered depolarization block at the seizure onset) (Ziburkus et al., 2006), synaptic depression, and vesicle depletion. In our model, the change in firing rate is produced by including synaptic depression and a gradually decreasing input current to the model neurons, to simulate spike rate adaptation of the neurons.

#### **PERIODIC STIMULATION IN EPILEPSY**

DBS has been tested in models of epilepsy in order to disrupt seizures (Good et al., 2009; Fisher et al., 2010; Nelson et al., 2011; Rajdev et al., 2011). Stimuli designed to increase synchrony have been shown to effectively truncate seizures (Schindler et al., 2007a) and DBS has been employed in clinical trials with reasonable success (Morrell, 2006; Fisher et al., 2010; Morrell and On behalf of the RNS System in Epilepsy Study Group, 2011).

During a seizure the firing rate of neurons changes as the phases of the seizure progress. Therefore, we hypothesize that the influence of DBS on population synchrony will change if the stimulus does not adapt to the firing rate of the neuron. In this paper we first estimate the effects of periodic DBS on neuronal populations that are firing at high rates during the tonic phase, and then on the low firing rates during the clonic phase. To study the effects at each phase of the seizure, we fix the synaptic strengths, apply constant current, and vary the DBS frequency and amplitude, measuring the resulting increase or decrease in synchrony. We find that independent of firing rate, there are ratios of stimulus frequency to neuronal frequency that can either synchronize or desynchronize. Then, in the full model with changing firing rates, we demonstrate an adaptive algorithm that measures the firing rate of a neuron to adjust the stimulus frequency to maintain stimulation in the regime that promotes or decreases synchrony over the entire duration of the seizure. This exemplifies how an adaptive stimulus algorithm may be designed to disrupt synchrony in a population where the population oscillation is changing.
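The core of such an adaptive scheme is holding the stimulus period at a fixed ratio to the neuron's current firing period. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: the function names, the averaging window `n_intervals`, and the ratio values used in the example are hypothetical.

```python
def mean_recent_interval(spike_times_ms, n_intervals=3):
    """Estimate the firing period from the most recent inter-spike intervals."""
    if len(spike_times_ms) < 2:
        raise ValueError("need at least two spikes to estimate a period")
    recent = spike_times_ms[-(n_intervals + 1):]
    intervals = [later - earlier for earlier, later in zip(recent, recent[1:])]
    return sum(intervals) / len(intervals)

def adaptive_stimulus_period(spike_times_ms, period_ratio, n_intervals=3):
    """Stimulus period that keeps a fixed ratio to the measured firing period.

    period_ratio is stimulus period divided by firing period; a value of 1.0
    stimulates at the firing rate, and a value slightly above 1.0 stimulates
    slightly slower than the firing rate.
    """
    return period_ratio * mean_recent_interval(spike_times_ms, n_intervals)
```

As the neuron slows during the seizure, re-evaluating `adaptive_stimulus_period` on the latest spike times lengthens the stimulus period in proportion, so the stimulus stays in the same relative regime.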

## **METHODS**

We investigate the effectiveness of DBS within an epileptic model using computational simulations of excitatory neuronal networks. The neuron model captures the dynamics of a real neuron's sensitivity to synaptic inputs, current inputs, and periodic forcing from applied stimuli. Synaptic depression variables change the recurrent excitatory drive amongst the population, which changes the firing rates of the neurons. As the neuron's firing rate changes, the sensitivity to synaptic inputs also changes, allowing them to synchronize at slow firing rates, but not at high firing rates. Networks of neurons are connected using a second order network (SONET) that keeps the neurons at the edge of synchrony at the high firing rate.

#### **MORRIS–LECAR MODEL NEURON**

We use a modified version of the Morris–Lecar (M–L) model neuron (Morris and Lecar, 1981; Izhikevich, 2007), a 2-D reduction of the Hodgkin–Huxley model (Rinzel, 1985). DBS stimulation is simulated by applying periodic pulses of current input of varying strength and frequency, depending on the stimulation protocol. The conductance based M–L model calculates the change in voltage as a function of the membrane's ionic currents as described by the following equations:

$$\begin{aligned} C\dot{V} ={}& I_{\text{input}} + I_{\text{DBS}} + I_{\text{noise}} - g_L (V - E_L) \\ & - g_{\text{Na}}\, m_{\infty}(V)(V - E_{\text{Na}}) - g_{\text{K}}\, n\, (V - E_{\text{K}}) \\ & - D(S - F)(V - E_{\text{syn}}), \end{aligned}$$

$$\dot{n} = \frac{n_{\infty}(V) - n}{\tau(V)}$$

$$m_{\infty}(V) = \frac{1}{1 + e^{\frac{V_{1/2} - V}{k}}}$$

$$\tau(V) = C\, e^{\frac{-(V_{\max} - V)^2}{\sigma^2}}$$

$$\dot{S} = -\frac{S}{\tau_s} + \sum_{j=1}^{M} \delta \left( t - t_{\text{syn}}^j \right)$$

$$\dot{F} = -\frac{F}{\tau_f} + \sum_{j=1}^{M} \delta \left( t - t_{\text{syn}}^j \right)$$

where *C* is the membrane capacitance, *V* is the membrane voltage, *I*<sub>input</sub> is an input current common to all neurons, *I*<sub>noise</sub> is a white noise input, independent for each neuron and proportional to the square root of the time step, *g* are the maximal conductances of each current source, *E* are the reversal potentials for each ion, *m* and *n* are the ionic gating variables, *m*<sub>∞</sub> and *n*<sub>∞</sub> are the steady-state activations for a given voltage, *V*<sub>1/2,*m*</sub> satisfies *m*<sub>∞</sub>(*V*<sub>1/2,*m*</sub>) = 0.5, *V*<sub>max</sub> is the value of *V* at the maximum value of *m*, *k* is the degree of slope at *V*<sub>1/2</sub>, τ is the voltage-dependent time constant of the inactivation variable, σ determines the sensitivity of the time constant to *V*, *S* represents the slow variable of the synaptic input shape, with time constant τ<sub>*s*</sub>, and *F* represents the fast variable, with time constant τ<sub>*f*</sub>. At times of synaptic input, 1 is added to both the *S* and *F* state variables for each presynaptic event at time *t*<sub>syn</sub> for all *M* events. Synaptic depression, *D*, is defined as *D*<sub>*i*+1,*j*</sub> = *D*<sub>*i*,*j*</sub>*d*, updated for cell *j* after a synaptic input *i* as described in Varela et al. (1997), where the strength of depression is controlled by *d*.
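The synaptic variables can be illustrated with a short sketch of one depressing synapse. This is a hedged illustration rather than the published Matlab code: the class and method names are hypothetical, the initial value *D* = 1, the example depression factor, and the absence of any recovery of *D* are assumptions, and the time constants are the τ<sub>*s*</sub> and τ<sub>*f*</sub> values listed in the model parameters. Between events the sketch applies the exact exponential decay implied by the equations for *S* and *F*.

```python
import math

TAU_S_MS = 0.5    # slow synaptic time constant (ms), from the parameter list
TAU_F_MS = 0.25   # fast synaptic time constant (ms), from the parameter list

class DepressingSynapse:
    """Synaptic drive D * (S - F) for one postsynaptic cell (illustrative)."""

    def __init__(self, depression_factor):
        self.S = 0.0                 # slow synaptic variable
        self.F = 0.0                 # fast synaptic variable
        self.D = 1.0                 # assumed undepressed starting value
        self.d = depression_factor   # hypothetical value of d, 0 < d <= 1

    def decay(self, dt_ms):
        """Advance time with no incoming events."""
        self.S *= math.exp(-dt_ms / TAU_S_MS)
        self.F *= math.exp(-dt_ms / TAU_F_MS)

    def receive_event(self):
        """Presynaptic event: add 1 to S and F, then depress D by factor d."""
        self.S += 1.0
        self.F += 1.0
        self.D *= self.d

    def drive(self):
        """Factor multiplying (V - E_syn) in the voltage equation."""
        return self.D * (self.S - self.F)
```

Because *S* and *F* both jump by 1 at an event, the drive is zero at the instant of the event and then rises and falls as *F* decays faster than *S*.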

The model is explained in more detail in our recent seizure model paper (Beverlin et al., 2011). The parameters of the M–L model were chosen so that the phase response curve (PRC) is similar to PRCs we have measured in hippocampal pyramidal neurons (Netoff et al., 2005); they are as follows: *C* = 1.0 μF, *g*<sub>L</sub> = 8 nS, *E*<sub>L</sub> = −53.24 mV, *g*<sub>Na</sub> = 18.22 nS, *E*<sub>Na</sub> = 60 mV, *g*<sub>K</sub> = 4 nS, *E*<sub>K</sub> = −95.52 mV, *V*<sub>1/2,*m*</sub> = −7.37 mV, *k*<sub>*m*</sub> = 11.97 mV, *V*<sub>1/2,*n*</sub> = −16.35 mV, *k*<sub>*n*</sub> = 4.21 mV, τ = 1 ms, spikeWidth = 0.03, *E*<sub>syn</sub> = 0, τ<sub>*f*</sub> = 0.25 ms, τ<sub>*s*</sub> = 0.5 ms. Matlab code for this model is available at http://neuralnetoff.umn.edu/public/TonicClonicControl and from the ModelDB website (http://senselab.med.yale.edu/modeldb).
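To make the role of these parameters concrete, the following sketch evaluates the right-hand sides of the voltage and gating equations. It is an illustration under stated assumptions, not the authors' Matlab code: the function and dictionary names are hypothetical, *n*<sub>∞</sub> is assumed to share the Boltzmann form given for *m*<sub>∞</sub> (the parameter list supplies a half-activation voltage and slope for *n*), the time constant is fixed at the listed τ = 1 ms instead of the voltage-dependent form, and parameter values are used at face value without reconciling units.

```python
import math

# Parameter values as listed in the text; units are taken at face value.
PARAMS = {
    "C": 1.0, "g_L": 8.0, "E_L": -53.24, "g_Na": 18.22, "E_Na": 60.0,
    "g_K": 4.0, "E_K": -95.52, "V_half_m": -7.37, "k_m": 11.97,
    "V_half_n": -16.35, "k_n": 4.21, "tau_n": 1.0, "E_syn": 0.0,
}

def boltzmann(V, V_half, k):
    """Steady-state activation: 1 / (1 + exp((V_half - V) / k))."""
    return 1.0 / (1.0 + math.exp((V_half - V) / k))

def derivatives(V, n, I_total, syn_drive=0.0, p=PARAMS):
    """Return (dV/dt, dn/dt) for one model neuron.

    I_total stands for I_input + I_DBS + I_noise, and syn_drive stands for
    D * (S - F). Using a constant tau_n is a simplifying assumption.
    """
    m_inf = boltzmann(V, p["V_half_m"], p["k_m"])
    n_inf = boltzmann(V, p["V_half_n"], p["k_n"])  # assumed same form as m_inf
    dV = (I_total
          - p["g_L"] * (V - p["E_L"])
          - p["g_Na"] * m_inf * (V - p["E_Na"])
          - p["g_K"] * n * (V - p["E_K"])
          - syn_drive * (V - p["E_syn"])) / p["C"]
    dn = (n_inf - n) / p["tau_n"]
    return dV, dn
```

A forward Euler step would then be `V += dt * dV` and `n += dt * dn`, with the DBS pulses entering through `I_total`.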

#### **NETWORK STRUCTURE**

Directional networks of 3000 cells were generated with an average of 30 out-going excitatory synaptic connections using a second order network topology (SONET), which adds correlated structure to random networks (Zhao et al., 2011). The specific network structure is determined by specifying the average connectivity (first order statistic) as well as the additional prevalence of two edge motifs, thus referred to as second order motifs. These second order structures are reciprocal, convergent, divergent, and chain connections, as illustrated in **Figure 1**. We generate large networks by specifying the first and second order statistics. It has been found that the prevalence of chains and convergent connections has a strong effect on the synchronizability of the network. Here we choose the network statistics which allow a network to both synchronize and desynchronize, depending on input current and firing rate. The network we use has statistics similar to those measured in rat visual cortex (Song et al., 2005). The specific SONET was chosen out of 186 candidates discussed in recently published results (Zhao et al., 2011). This network, which had 4 times the prevalence of reciprocal connections, 1.4 times the convergent connections, 1.3 times the divergent connections, and 1.2 times the chain connections compared to a random graph, was the closest to measured cortical networks in a rat model.
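The four second order motifs can be counted directly from a directed edge list, which is one way to check a generated network against its target statistics. The sketch below is illustrative only and does not generate a SONET: the function name is hypothetical, and the counting conventions are assumptions (no self-connections, reciprocal pairs counted once, and chains defined over three distinct cells).

```python
from math import comb

def count_second_order_motifs(edges):
    """Count two-edge motifs in a directed graph given as (source, target) pairs."""
    edge_set = set(edges)
    in_deg, out_deg = {}, {}
    for src, dst in edge_set:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
    nodes = set(in_deg) | set(out_deg)

    # Reciprocal: a -> b and b -> a, with each unordered pair counted once.
    reciprocal = sum(1 for (a, b) in edge_set if a < b and (b, a) in edge_set)
    # Convergent: two different cells projecting onto the same target.
    convergent = sum(comb(in_deg.get(v, 0), 2) for v in nodes)
    # Divergent: one cell projecting onto two different targets.
    divergent = sum(comb(out_deg.get(v, 0), 2) for v in nodes)
    # Chain: a -> b -> c with a != c. Every reciprocal pair contributes two
    # two-step paths that return to their start, so those are removed.
    two_step_paths = sum(in_deg.get(v, 0) * out_deg.get(v, 0) for v in nodes)
    chain = two_step_paths - 2 * reciprocal

    return {"reciprocal": reciprocal, "convergent": convergent,
            "divergent": divergent, "chain": chain}
```

Dividing each count by its expected value in a random graph with the same size and mean connectivity would give prevalence ratios comparable to the factors quoted above.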

#### **NETWORK SYNCHRONY MEASURE**

Network synchrony is quantified using the Kuramoto order parameter (*r*) which ranges from 0 (neurons evenly distributed in phase) to 1 (neurons in coherent phase) and calculated as follows:

$$re^{i\phi} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j}$$

where the phases of neurons (θ<sub>*j*</sub>) are summed to create a population vector with magnitude (*r*) (Kuramoto, 1984; Strogatz, 2000).

**Figure 1** (caption, partially recovered): … combinations with two directional connections. The motifs are reciprocal, convergent, divergent, and chain motifs. The prevalence of these motifs within a larger network can be specified when generating the network.
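The Kuramoto order parameter defined in this section can be computed in a few lines. This is a minimal sketch with a hypothetical function name; how each neuron's phase θ<sub>*j*</sub> is extracted from its spike times is not shown and is left as an input.

```python
import cmath
import math

def kuramoto_order(phases):
    """Return (r, phi), the magnitude and angle of the mean phase vector.

    phases is a sequence of neuron phases in radians.
    """
    if not phases:
        raise ValueError("at least one phase is required")
    mean_vector = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(mean_vector), cmath.phase(mean_vector)
```

Identical phases give *r* close to 1, while phases spread evenly around the circle give *r* close to 0, matching the two limits stated for the measure.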

## **RESULTS**

The influence of DBS was tested in a network model that reproduces a tonic to clonic shift in network synchrony as a function of the firing rate of the neurons (Beverlin et al., 2011). In the simulations, at the seizure onset the firing rates of the neurons are very high, as might be expected with runaway excitation, and then over the duration of the seizure, the firing rate of the neurons slowly decreases, eventually bringing about a transition to the clonic phase of the seizure, seen in **Figure 2**. The tonic-clonic transition model reproduces the shift in synchrony observed in EEG. In this model, the firing rate was modulated by a combination of changes in tonic drive to all the neurons, representing drive of exogenous sources, and synaptic depression from neurons within the network. Simulations included 3000 M–L neurons connected using a second order network designed to be at the edge of synchrony when neurons were in the tonic phase of the seizure. Over the duration of the seizure we decreased the tonic drive to represent depression from the exogenous inputs, and the synapses within the network depress during the seizure due to the modeled synaptic depression. Decreased input from both the exogenous and endogenous sources results in a decrease in firing rate over the duration of the seizure.

In this paper we apply periodic stimulation to the seizure model. All cells receive the same stimulus input for a given set of stimulus parameters, assuming that the population is uniformly distributed from the electrode. To analyze the effects of the stimulation in each phase, we hold the applied current in the neurons constant and freeze the synaptic plasticity to study the effects of stimulation at each phase of the seizure separately. We analyze and model the effects at a high firing rate during the tonic phase and then again at a low firing rate during the clonic phase. Then, we restore the changing exogenous current and plasticity back into the model to measure the effects of periodic stimulation to the duration of the tonic and clonic phases.

#### **OPEN-LOOP PERIODIC STIMULATION WITH FIXED DRIVE TO NEURONS**

First, periodic stimulation was applied to a network simulation driven with high current input (6 nA), to model the tonic phase of the seizure. At this high firing rate the unstimulated network does not synchronize. Results of stimulation applied to all cells of the network at 5.5 ms intervals are shown in **Figure 3**. Stimulation at this interval increases synchrony during the tonic phase.

network simulation. **Left**, unstimulated cells have low synchrony. **Right**, network stimulated with 5.5 ms pulses becomes coherent.

Simulations were repeated while varying the stimulation frequency and amplitude. Synchrony was measured using the Kuramoto order parameter, averaged over the last one quarter of the simulation to estimate the steady-state synchrony in the network. These simulations were repeated over a range of stimulus amplitudes and frequencies; the results are shown in **Figure 4**. Darker areas indicate stimulus parameters that entrain the neurons, resulting in a synchronized population. These entrained regions are known as *Arnold Tongues* (Milton and Jung, 2003). These "tongues" of entrainment occur at integer ratios of stimulus period to the natural period of oscillation. The lightly shaded points on the map indicate parameters for which the network remains desynchronized.
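The synchrony measure described above can be illustrated with a minimal sketch. It assumes that a phase (in radians) has already been assigned to every cell at every time step; how phases are extracted from the simulated voltage traces is not specified here, and the function names `kuramoto_order` and `steady_state_synchrony` are hypothetical.

```python
import numpy as np

def kuramoto_order(phases):
    """Kuramoto order parameter R(t) for each time step.

    phases: array of shape (n_steps, n_cells), in radians (assumed input).
    Returns an array of shape (n_steps,) with values between 0 and 1.
    """
    phases = np.asarray(phases, dtype=float)
    # Magnitude of the population-averaged unit phasor.
    return np.abs(np.exp(1j * phases).mean(axis=1))

def steady_state_synchrony(phases, tail_fraction=0.25):
    """Average R(t) over the last `tail_fraction` of the simulation."""
    r = kuramoto_order(phases)
    start = int(len(r) * (1.0 - tail_fraction))
    return float(r[start:].mean())

# Two toy populations: identical phases and evenly spread phases.
n_steps, n_cells = 100, 50
synchronized = np.zeros((n_steps, n_cells))
dispersed = np.tile(
    np.linspace(0.0, 2.0 * np.pi, n_cells, endpoint=False), (n_steps, 1)
)
```

A value near 1 indicates a tightly synchronized population, while evenly dispersed phases give a value near 0.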

The simulations were then repeated while applying a −2 nA current, in order to simulate a network during the clonic seizure phase, shown in **Figure 5**, where the unstimulated network would spontaneously synchronize. The network is then periodically stimulated with a 2 ms period, shown as dots along the top curve in **Figure 5**. This stimulation reduces the synchrony compared to the unstimulated simulation.

Simulations were run for a range of stimulus amplitudes and frequencies while driving the network at −2 nA. A synchrony map for these results is shown in **Figure 6**. One notable difference is that the region of entrainment around the natural period has shifted from 5.5 ms when the system is driven with 6 nA to 8.5 ms when it is driven at −2 nA. Because the low current network synchronizes spontaneously, a wider range of stimulus parameters synchronizes the network. There are several windows that desynchronize the population. In the example shown in **Figure 5**, we use a 2 ms stimulation period, but periods of 4 ms or about 12.5 ms, for example, could also be used, as indicated by the light bands in **Figure 6**.

#### **CONTINUOUS CONTROL OF SEIZURES WITH VARIABLE STIMULUS FREQUENCY**

Ultimately, control of seizure states may be most effectively achieved by implementing a closed-loop feedback system, in order to select the stimulus frequency from the measured neuronal frequency (Nelson et al., 2011). We have noticed that the stimulus frequency that entrains the neurons occurs at frequencies just slightly faster than the firing rate of the neurons. Stimulus regimes that desynchronize the population are found to be slightly slower than the firing rate of the neurons. Based on this observation, we developed a simple feedback system that modulates the stimulus period depending on the firing rate of one cell within the network. All other parameters are the same as previously used in the unstimulated tonic-clonic model, including the current ramp and network topology. Here, the feedback system selects the stimulus frequency based on a user chosen ratio of stimulus to measured frequencies. This ratio can be selected from the entrainment maps. A choice of 1:1, for example, would entrain a population, while a choice of 1:1.14 (stimulated to measured frequency ratio) may desynchronize a population.

**FIGURE 4 | Synchrony map of stimulated tonic networks.** Current input of 6 nA applied to all cells. Grayscale indicates calculated synchrony as the Kuramoto order parameter averaged over the last 200 ms of individual simulation for a range of stimulus amplitudes and periods.

**FIGURE 5 | Desynchronizing the clonic phase.** Computational simulation of network activity with current set to −2 nA to simulate clonic phase of seizure. See **Figure 3** for general figure description. Synchrony of the unstimulated clonic network increases to a strong value near 0.8 (gray). When applying the 2 ms periodic DBS pulse, the network activity is desynchronized. Bottom Left, unstimulated cells have high synchrony. Bottom Right, network stimulated with 2 ms pulses. Less synchronous activity is observed in the stimulated network.

**Figure 7** shows the response of the network while stimulating at intervals 1.14 times the interspike intervals of neurons in the population. Synchrony emerges later in this stimulated case than in the unstimulated network, prolonging the tonic phase. Eventually, the network slows sufficiently that synchrony takes over, despite the dispersive effects of the stimulus. Conversely, by applying the stimulus at the same frequency as the firing rate of the neurons (1:1 ratio) we were able to bring about synchrony in much less time than in the unstimulated case, truncating the tonic phase, as shown in **Figure 8**. In both the synchronizing and desynchronizing closed-loop feedback experiments we used a stimulus amplitude of 10 nA, one quarter of the amplitude used in the open-loop conditions to achieve a similar effect.
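The feedback rule described above (a stimulus period held at a fixed ratio to the measured interspike interval) can be sketched as follows. This is only an illustration under stated assumptions: the function name `next_stimulus_period`, the use of the last few interspike intervals of the monitored cell, and the example spike times are all hypothetical and are not taken from the simulation code.

```python
def next_stimulus_period(spike_times_ms, period_ratio=1.14, n_isi=1):
    """Return the next stimulus period in ms.

    spike_times_ms: recent spike times of the single monitored cell (assumed input).
    period_ratio:   stimulus period divided by the interspike interval.
                    1.0 corresponds to the 1:1 (entraining) choice,
                    1.14 to the desynchronizing choice discussed in the text.
    n_isi:          number of most recent intervals to average (an assumption).
    """
    if len(spike_times_ms) < n_isi + 1:
        raise ValueError("not enough spikes to estimate the interspike interval")
    recent = spike_times_ms[-(n_isi + 1):]
    intervals = [b - a for a, b in zip(recent[:-1], recent[1:])]
    mean_isi = sum(intervals) / len(intervals)
    return period_ratio * mean_isi

# Hypothetical spike times of the monitored cell, firing every 8.5 ms.
spikes = [0.0, 8.5, 17.0]
entrain_period = next_stimulus_period(spikes, period_ratio=1.0)
disperse_period = next_stimulus_period(spikes, period_ratio=1.14)
```

Because the period is recomputed from the measured interval, the ratio stays fixed while the firing rate drifts over the course of the seizure.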

## **DISCUSSION**

Tonic-clonic seizures can be devastating to a patient with epilepsy. While there is evidence that DBS can reduce seizures, no clinical application has been found to be fully effective in truncating seizures. It is well known in oscillatory models that periodic forcing of a network of oscillators can synchronize or phase disperse the oscillators (Glass and Mackey, 1988; Elbert et al., 1994; Kaplan et al., 1996). It has previously been proposed that this may be used to control seizures (Milton and Jung, 2003). In a recent paper, we proposed that this may be involved in treating Parkinsonian symptoms (Wilson et al., 2011). In this paper we use similar periodic stimulation theory to affect the tonic and clonic phases of a seizure in a computational model we have recently developed (Beverlin et al., 2011). Recently we proposed that the shift from the desynchronized tonic phase to the synchronous clonic phase occurs as the neuronal firing rate adapts over the duration of the seizure. At the high firing rates, the model neurons do not synchronize, but as the firing rates slow down, the cells become more sensitive to synaptic inputs and the network synchronizes. The change in spike rate is modeled by gradually decreasing the current drive to the neurons along with depressing synapses.

In this paper, we have added periodic stimulation to the tonic-clonic model to determine if periodic stimulation could be used to affect the duration of the seizure phases. We analyzed the effects of stimulus frequency and amplitude on the population synchrony at the tonic phase and again at the clonic phase. Depending on the stimulus frequency we were able to synchronize neurons during the asynchronous tonic phase, or desynchronize neurons in the synchronous clonic phase. Periodic stimulation at integer ratios of the stimulus frequency to the natural frequency was found to entrain and thereby synchronize the population. Conversely, periodic stimulation just slightly slower than the firing rates (and at some frequencies, faster than the firing rates of the neurons) could desynchronize the population. Our findings can be explained with PRC theory, which we previously used to explain the effects of the stimulus at different frequencies and amplitudes on population synchrony (Beverlin et al., 2011). The effect of firing rate shifting the peak of the PRC to the left in response to excitatory inputs is generally true and should therefore not be heavily model dependent (Gutkin et al., 2005; Fink et al., 2011). We chose the M–L model because it is one of the simplest conductance-based neuronal models that can demonstrate this effect.

**FIGURE 8 | Tonic phase truncation.** Network of cells stimulated at 1:1 frequency ratio compared to one measured cell in the network. Bottom: Transition to synchronous clonic phase is earlier (black line) than in the unstimulated model (gray line). Here the clonic phase is extended. Graphs labeled the same as **Figure 3**.

Periodic stimulation of a network during a seizure with a fixed period would have a mixed effect, synchronizing at some phases of the seizure and desynchronizing at others, because the neurons are constantly changing their firing rate. However, the ratio of stimulus frequency to neuronal firing rate that entrains or desynchronizes the population is relatively consistent. Therefore, we created a closed-loop control system that adjusts the stimulus frequency to desynchronize or synchronize the population, holding the stimulus at a fixed rate relative to the neuronal firing rate. In this case the tonic phase of the seizure could effectively be shortened by applying a stimulus at the same frequency as the firing rate of the neurons, while the tonic phase could be prolonged by applying a stimulus frequency that is slightly slower than the firing rate of the neurons, effectively preventing the synchronization of the population.

This model illustrates the principle that periodic stimulation at certain ratios to the measured firing rate of neurons can be used to promote or decrease synchrony and this principle may be used in a closed-loop feedback system for seizure suppression. We are not suggesting that this model is an accurate model of the actual physiology in the brain. Instead, if PRCs can be measured during seizures, our theory may be tested experimentally. We plan to test these hypotheses in brain slice experiments in the near future.

In addition, the complicated structure and function of real neurons in real tissue are beyond the scope of this paper. Here we have investigated the applicability of DBS in a model network; naturally, there may be real world complications when implementing these protocols depending on the location of the electrode(s) and stimulus parameters. In addition, clinical applications of DBS thus far typically use stimulation frequencies below 200 Hz. For example, the SANTE trials studying the treatment of refractory epilepsy used a stimulation frequency of 145 Hz (Fisher et al., 2010). Some of the frequencies in the model presented here exceed these typical frequencies, but the relative frequency between the stimulus and the neuronal firing rate is what we consider important. Our model is not designed to produce realistic firing rates, so we do not suggest based on this model that these are realistic stimulation frequencies that should be used clinically for all brain regions.

There are many aspects of this simulation that are not physiologically realistic and could be improved in future studies. First, the neurons are modeled as oscillators. Generally, neurons do not fire periodically. However, at the onset of a seizure, with a high rate of asynchronous synaptic inputs, neurons may fire close to periodically. All the neurons are also modeled as oscillators with the same parameters and the same firing rate, while it would be more realistic to model the neurons with a distribution of parameters and firing rates. Furthermore, in this model the stimulus was applied uniformly to all the neurons. In a real neuronal network there is geometry to the position of the neurons and a stimulus electrode will not uniformly stimulate all the neurons. All of these aspects of the model could be improved to make it more realistic, and will be the focus of further investigation, but we do not feel they will change the fundamental approach we present here to desynchronizing populations.

How might this algorithm be implemented in practice, such as in a brain slice model of seizures and eventually in humans? First, a stimulation electrode and a recording electrode are needed. Then, it is necessary to determine the optimal stimulus frequency ratio with respect to the neuronal frequency. This can be determined from the neuron's PRC to the stimulus. The PRC is measured by open-loop stimulation at random intervals that are on average much longer than the period of the neuron. The phase of the oscillation is measured before and after the stimulation to estimate the phase advance caused by each stimulus. Generally, some model representing the phase advance as a function of the stimulus phase is fit to the resulting data. PRCs would need to be measured at different firing rates or phases of the seizure. From the measured PRCs, the Lyapunov Exponents (LEs) of the population response are estimated at different stimulus amplitudes and frequencies (Wilson et al., 2011). Stimulus parameters are selected that maximize the LE to desynchronize the population, or minimize the LE to synchronize. To implement the algorithm, the recording electrode would be used to measure the firing rate of neurons in the population; the measured firing rate would then be used to modulate the frequency of the stimulating electrode.
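As a rough illustration of the last steps, the sketch below estimates the LE of a periodically stimulated phase oscillator from a fitted PRC. It assumes the standard one-dimensional phase map, in which each stimulus shifts the phase by the PRC value and the phase then advances by the ratio of stimulus period to natural period. The sinusoidal PRC, the function names, and all parameter values are hypothetical placeholders; this is not the M–L PRC and not necessarily the procedure of Wilson et al. (2011).

```python
import numpy as np

def lyapunov_exponent(prc, prc_slope, period_ratio,
                      n_iter=5000, n_discard=500, phase0=0.2):
    """Estimate the LE of the map phase -> phase + prc(phase) + period_ratio (mod 1).

    prc, prc_slope: callables giving the phase advance and its derivative
                    as functions of phase in [0, 1) (assumed fitted model).
    period_ratio:   stimulus period divided by the natural period.
    A negative LE means nearby phases converge (synchronizing stimulus);
    a positive LE means they diverge (desynchronizing stimulus).
    """
    phase = phase0
    log_stretch = []
    for k in range(n_iter):
        if k >= n_discard:  # skip the initial transient
            log_stretch.append(np.log(abs(1.0 + prc_slope(phase))))
        phase = (phase + prc(phase) + period_ratio) % 1.0
    return float(np.mean(log_stretch))

def make_sine_prc(amplitude):
    """Hypothetical sinusoidal PRC and its derivative."""
    prc = lambda p: -amplitude * np.sin(2.0 * np.pi * p)
    slope = lambda p: -2.0 * np.pi * amplitude * np.cos(2.0 * np.pi * p)
    return prc, slope

# LE for 1:1 stimulation with a weak stimulus (illustrative values only).
prc, slope = make_sine_prc(0.1)
le_one_to_one = lyapunov_exponent(prc, slope, period_ratio=1.0)
```

Scanning `period_ratio` and the PRC amplitude over a grid and recording the sign of the LE would give a map analogous in spirit to the synchrony maps, with the caveat that the result depends entirely on the PRC that is supplied.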

An interesting finding is that the closed-loop controller could affect the duration of the tonic phase with equal efficacy at one quarter of the stimulus amplitude used in open-loop control. This indicates that a simple measure of the neuronal firing rate may significantly improve the efficacy of DBS.

It is important to note that we do not propose that it is best to synchronize and shorten the duration of the tonic phase of the seizure, or to prolong it. We consider that the restructuring of the neuronal network through synaptic plasticity induced by the high firing rates of neurons during seizures may ultimately be the long term deleterious effect of seizures. The goal of the therapy may be to minimize the plasticity changes during a seizure. If neurons fire synchronously, plasticity may be greater than when neurons fire asynchronously. In this case, maximizing the tonic phase of the seizure and minimizing the clonic phase may result in fewer plasticity changes. However, if the synchronization of the population is integral to the termination of the seizure, promoting synchrony may terminate the seizure earlier (Schindler et al., 2007a). For example, if seizures are sustained by recurrent excitation, increasing synchrony may decrease the excitable pool of neurons, thereby decreasing the likelihood of re-entry and terminating seizures earlier. Using a stimulus that can modulate the duration of the tonic phase may help us determine whether synchrony is just a network behavior that occurs at the termination of the seizure or whether it is integral to the termination.

HFOs are population oscillations that are seen between seizures. Suppressing these oscillations may be considered a target for DBS. The hope would be that disrupting these pathological oscillations may suppress epileptogenesis. The same approach used in this paper might be used to design a stimulus to suppress HFOs. HFOs might be a good target because they are observed to increase prior to a seizure in human and animal models (Worrell et al., 2004), and are thought to arise from synchronous bursts of neurons that occur in an epileptic focus (Bragin et al., 1999, 2010; Ibarz et al., 2010). There is also strong experimental evidence that synchrony amongst cortical regions is increased in epileptic patients (Bullock et al., 1995; Towle et al., 1999; Ben-Jacob et al., 2007; Schevon et al., 2007; Prusseit and Lehnertz, 2008; Zaveri et al., 2009) and that this synchrony changes in the lead up to a seizure (Lehnertz and Elger, 1995; Chavez et al., 2003; Le Van Quyen et al., 2005). In contrast, other evidence suggests that synchrony may decrease prior to a seizure (Mormann et al., 2003). We hypothesize that tuning DBS stimulators to desynchronize, interictally, prominent pathological oscillations relevant to the generation of seizures may suppress seizures. However, we are not aware of any direct evidence that DBS affects these oscillations.

## **CONCLUSION**

This work proposes a novel method to alter seizures using DBS. In a computational model we have demonstrated that the duration of the tonic-phase of a seizure may be extended or shortened by promoting synchrony using periodic stimulation. Promoting or decreasing synchrony depends on the relative frequency of the stimulation to the firing rate of the neurons. By using a closed-loop feedback to adjust the stimulation frequency dependent on the firing rate of the neurons, we are able to extend or decrease the duration of the tonic phase with much weaker stimulus pulses than was necessary in open-loop stimulation.

## **REFERENCES**

Martinerie, J. (2003). Spatiotemporal dynamics prior to neocortical seizures: amplitude versus phase couplings. *IEEE Trans. Biomed. Eng.* 50, 571–583.

bistability in neuronal networks with recurrent inhibitory connectivity. *J. Neurosci.* 21, 9460–9470.

experimental seizures. *J. Neurosci.* 22, 7297–7307.

primary visual cortex. *J. Neurosci.* 17, 7926–7940.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 October 2012; accepted: 27 December 2012; published online: 06 February 2013.*

*Citation: Beverlin B II and Netoff TI (2013) Dynamic control of modeled tonic-clonic seizure states with closed-loop stimulation. Front. Neural Circuits 6:126. doi: 10.3389/fncir.2012.00126*

*Copyright © 2013 Beverlin and Netoff. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Assisted closed-loop optimization of SSVEP-BCI efficiency

## *Jacobo Fernandez-Vargas, Hanns U. Pfaff, Francisco B. Rodríguez and Pablo Varona\**

*Grupo de Neurocomputación Biológica, Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain*

#### *Edited by:*

*Steve M. Potter, Georgia Institute of Technology, USA*

#### *Reviewed by:*

*Attila Szücs, Balaton Limnological Research Institute HAS, Hungary*

*Pablo F. Diez, Universidad Nacional de San Juan, Argentina*

#### *\*Correspondence:*

*Pablo Varona, Grupo de Neurocomputación Biológica, Departamento de Ingeniería Informática, Universidad Autónoma de Madrid, Calle Francisco Tomás y Valiente, 11, 28049 Madrid, Spain. e-mail: pablo.varona@uam.es*

We designed a novel *assisted closed-loop optimization protocol* to improve the efficiency of brain-computer interfaces (BCI) based on steady state visually evoked potentials (SSVEP). In traditional paradigms, the control over the BCI-performance completely depends on the subjects' ability to learn from the given feedback cues. By contrast, in the proposed protocol *both* the subject and the machine share information and control over the BCI goal. Generally, the innovative *assistance* consists in the delivery of online information together with the online adaptation of BCI stimuli properties. In our case, this adaptive optimization process is realized by (1) a *closed-loop search* for the best set of *SSVEP* flicker frequencies and (2) feedback of actual *SSVEP* magnitudes to both the subject and the machine. These closed-loop interactions between subject and machine are evaluated in *real-time* by continuous measurement of their efficiencies, which are used as online criteria to adapt the BCI control parameters. The proposed protocol aims to compensate for variability in possibly unknown state and trait dimensions of the subjects. In a study with *N* = 18 subjects, we found significant evidence that our protocol *outperformed* classic SSVEP-BCI control paradigms. Evidence is presented that it indeed takes interindividual variability into account: e.g., under the new protocol, baseline resting state EEG measures predict subjects' BCI performances. This paper illustrates the promising potential of *assisted closed-loop* protocols in BCI systems. Their applicability might be expanded to innovative uses, e.g., as possible new diagnostic/therapeutic tools for clinical contexts and as new paradigms for basic research.

**Keywords: brain-computer interface, brain-machine interface, activity-dependent stimulation, resting state EEG, resting state network, individual alpha frequency, BCI illiteracy, BCI performance predictor**

## **INTRODUCTION**

The use of closed-loop interaction with biological nervous systems for observation and control purposes goes back to the beginnings of electrophysiology in the 1940s when the *voltage clamp* technique was developed (Marmont, 1949; Cole, 1955). Later on, the *dynamic clamp* technology to implement artificial membrane or synaptic conductances (Robinson and Kawai, 1993; Sharp et al., 1993) has produced many examples of successful closed-loop interactions with neural systems at the cellular and circuit levels (for reviews see Prinz et al., 2004; Goaillard and Marder, 2006; Destexhe and Bal, 2009; Economo et al., 2010).

We recently proposed a generalization of the dynamic clamp concept in electrophysiology and animal ethology to design closed-loop interactions with biological nervous systems beyond electrical stimulation and recording. In particular, we investigated in our previous work goal-driven real-time closed-loop interactions with drug microinjectors, mechanical stimulation devices and video event driven stimulators (Muniz et al., 2008, 2011; Chamorro et al., 2009, 2012). These examples illustrate that modern activity-dependent stimulation protocols can reveal dynamics otherwise hidden under traditional stimulation techniques, provide control of regular and pathological states, induce learning processes, bridge between distinct levels of analysis and lead to a further automation of experiments. In this paper, we propose the same assisted closed-loop approach described in our previous work to optimize the efficiency of steady state visually evoked potentials (SSVEP) based brain-computer interfaces (BCI) which might have a large impact for applied uses, such as computer control and biomedical or prosthetic uses, but also as novel paradigms for basic research. Generally, the innovative assistance consists in the delivery of online information with regard to the control over the given BCI goal both to the human subject and to the system, together with the online adaptation of BCI stimuli properties.

BCIs use measures of brain activity, typically real-time human EEG recordings, usually in order to interact with devices such as virtual keyboards (for recent reviews see e.g., Birbaumer, 2006; Van Gerven et al., 2009; Nicolas-Alonso and Gomez-Gil, 2012). Among the most successful BCIs are those which rely on SSVEPs, a type of event related potential (ERP) generated by the nervous system in response to repetitive visual stimulation (flicker) by linear superposition of transient visually evoked potentials (VEPs) (Capilla et al., 2011) up to 90 Hz (Herrmann, 2001): apart from smaller responses at higher harmonic frequencies, the brain mainly generates electrical activity at the same fundamental frequency as the visual flicker to which its visual system is exposed. SSVEPs are frequently used in basic and applied research because of their relatively large magnitudes, which lead to superior signal-to-noise ratios (SNRs) and make them relatively stable against artifacts as compared to other ERPs (Vialatte et al., 2010).

SSVEP-BCIs make use of the physiological property that SSVEP magnitudes can be modulated by visual-spatial selective attention (e.g., Morgan et al., 1996). Thus, SSVEP based BCIs employ multiple visual stimuli (e.g., LEDs or regions on a screen) flickering at different frequencies. Apart from these intraindividual state changes due to attention, SSVEP magnitudes further depend both on extrinsic variables such as the spatial and temporal frequencies of the stimulus, and on other intrinsic intra- and interindividual dimensions of the subjects themselves (Ding et al., 2006; Lopez-Gordo et al., 2011). The optimal spatial frequency of a structured stimulus is related to individual traits such as visual acuity or age (Vialatte et al., 2010). There is also a significant difference in the magnitude of SSVEPs between flicker stimulation of the center (fovea centralis) vs. the periphery of the visual field. Environmental conditions (e.g., screen brightness and frequency, distance to the screen, etc.) also influence the performance of the BCI. Although determined by multiple factors, SSVEP magnitudes are modulated by the subjects' states of attention. Hence, online monitoring of SSVEP magnitudes elicited by arrays of multiple flickering light sources allows BCI systems to detect which flicker source the subject is attending to at a given moment. Taken altogether, these aspects call for automated mechanisms to optimize parameters of the stimuli and of the BCI control, aiming toward flexible adaptiveness to specific individual and contextual situations of SSVEP-BCI use.

Commonly, SSVEP-BCIs use only one prefixed set of flicker frequencies, but nonetheless there are studies employing two different prefixed sets (e.g., Volosyak et al., 2009, 2011) which lead to remarkably different results. Those findings imply that BCI efficiency may crucially depend on flicker frequency selection. Following this idea, we created an assisted closed-loop adaptive algorithm to search for the best frequencies for each subject and for each particular time point/situation of use. The adaptive and informative nature of this novel online approach aims to improve the BCI efficiency as compared to traditional paradigms (see **Figure 1**). Firstly, this optimization process is realized by performing a real-time closed-loop search for the best set of frequencies to achieve the given BCI goal. The number of stimuli and their effectiveness with regard to the BCI goal modulate this real-time search strategy. The closed-loop search is evaluated in real-time by a continuous measurement of the actual BCI efficiency (see section "Efficiency Measures"), which is used as an online criterion to select the BCI control parameters. Secondly, the SSVEP online recording is processed, on the one hand, to an online auditory feedback to inform the subject and, on the other, is used to inform the system to select the best flicker frequencies. This shared information constitutes the assisted part of the closed-loop. The proposed protocol aims to address the problems which arise from different hardware configurations, subjects' intra- and inter-individual variabilities, e.g., in neuropsychological dimensions of executive functioning (see e.g., Funahashi, 2001) etc., and other sources of variability in experimental settings and intrinsic dimensions.

The paper is organized as follows: in section "Materials and Methods" the new assisted closed-loop system is described; in section "Results" analyses of efficiency and its correlates, as compared with traditional BCI paradigms, are presented; finally, in section "Discussion" we discuss the generalization and applicability of the proposed novel protocol.

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

A convenience non-probability sample of *N* = 18 healthy subjects from our department was used, applying the exclusion criteria of self-reported chronic medication/substance intake and neurological diseases such as epilepsy. Our sample consisted of 6 females and 12 males with age *Mdn* = 26.00 years (25th percentile = 23.00, 75th = 35.75), range = 18–59. Subjects had normal or corrected-to-normal vision and were right-handed. Permission of the ethics committee of Autonomous University of Madrid was obtained; all subjects participated voluntarily in the sense of an *informed consent* without receiving any incentives. Participants were informed that they could leave the experiments at any time without giving any explanation.

**system (about the specificities of the given subject).** In our example, the (loudspeaker symbol, right).

### **SSVEP BCI SYSTEM**

#### *Stimulation device*

We constructed a stimulation panel with four white color LEDs (manufacturer *Seoul Semiconductor, white lamp LED LW500AM*, ∅ 5 mm, viewing angle 100°), using a 100 Ω series resistor to the digital +5 V output of the acquisition board (see below), which results in a luminous intensity output $I_V$ ≈ 700 mcd for each LED.

On a black background panel, each LED was mounted into a reflector with a ∅ 40 mm diffuser cap carrying a protruding non-transparent cylindrical black screen of 45 mm length; the spatial organization is illustrated in **Figure 2**. Below each white flicker light source we placed a standard green signaling LED to instruct the subject where to look during the BCI task. The distance of the LED stimulation panel to the subject was kept at ∼60 cm, resulting in a visual angle of ∼3.8° for every light source.
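The stated visual angle can be checked with a short calculation. The sketch below assumes that the visible size of each light source is the 40 mm diffuser cap diameter and that the viewing distance is exactly 600 mm; the function name `visual_angle_deg` is hypothetical.

```python
import math

def visual_angle_deg(size_mm, distance_mm):
    """Visual angle (degrees) subtended by an object of a given size
    viewed centrally from a given distance."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))

# 40 mm diffuser cap viewed from about 60 cm (values taken from the text).
angle = visual_angle_deg(40.0, 600.0)
```

Under these assumptions the result is about 3.8°, in agreement with the value given above.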

## *BCI task*

The BCI task consisted in subjects trying to follow a prefixed sequence of 16 steps by focusing their vision onto a specific flickering white light source out of the four possible ones at each step, as continuously indicated by the smaller green signaling LEDs below. This sequence was identical for all subjects. A brief beep sound confirmed the indicated flickering light source as correctly detected.

#### **STIMULATION**

We compared the BCI efficiency under three conditions of flicker frequency selection: (i) by the assisted closed-loop (*ACL*) protocol, (ii) by a standard protocol with stimulation frequencies *prefixed* at 27, 28, 29, and 30 Hz (because 1 Hz distances are commonly employed in SSVEP-BCIs e.g., Herrmann, 2001; Diez et al., 2011; Volosyak et al., 2011), and (iii) by a protocol which used a selection of *top* frequencies for each subject (see section "ACL Algorithm"). In order to compensate for possible presentation order effects, the order of (i), (ii), and (iii) was permuted over the subjects.

**Figure 3** shows the timeline of the experiment. The first phase of the experiment consisted in the measurement of the individual EEG *baseline* and the *frequency scanning phase* to select a set of flicker stimulation frequencies for each subject (the number of frequencies in this set is specific for each participant—see below). The second phase is the BCI phase with its three conditions (i), (ii), and (iii) mentioned above.

#### **SIGNAL ACQUISITION AND PREPROCESSING**

The signal acquisition and preprocessing steps are summarized in **Figure 4**. The EEG signal was recorded at 1024 Hz with eight sintered Ag/AgCl electrodes mounted into an "*Aegis Array*" stretch lycra cap (*Sands Research Inc.,* Texas/USA) using a "*BRAINBOX® EEG-1166*" 64 channel EEG amplifier (*Braintronics B.V,* Almere/Netherlands) with in-house software written in *C*. Vertical and horizontal EOG was recorded bipolarly by an in-house battery driven analog amplifier following a circuitry of Usakli and Gurkan (2010) with sintered Ag/AgCl electrodes fixed by adhesive rings above/below the left eye vs. at left/right *epicanthus* connected to a data acquisition board (*NI-PCI-6251, National Instruments*) at 1024 Hz. The eight standard 10–20 positions were FPz, F3, Fz, F4, Cz, Pz, POz, and Oz (Jasper, 1958). For *online* SSVEP detection as BCI input only POz and Oz were used, while for later *offline* studies the signals from *all* eight mentioned electrodes were analyzed. The EEG reference electrode was placed at the nose tip and the EOG ground electrode at the *glabella*; impedances were kept <10 kΩ.

**signal acquisition/stimulation system.** The flickering frequency was controlled by software driving the digital output of a National Instruments data acquisition (DAQ) board (model *NI-PCI-6251*) directly connected to the white colored LEDs, generating 0/+5 V *off* vs. *on* signals according to the

light source independently by a photodiode connected to a digital oscilloscope. Luminous intensity output is $I_V$ ≈ 700 mcd for each white LED. Smaller green color standard signaling LEDs were placed below to instruct subjects where to look during the BCI task.

subject individually, while those below a predefined threshold are excluded

To improve SSVEP detection, we used the online computed difference signal between Oz and POz (bipolar montage) as the only input signal to our BCI system. This reduces both EOG/EMG artifacts and EEG activity not related to the visual cortex because this montage implements a simple and computationally inexpensive *spatial high pass filter* (see **Figure 5**). Thus, the *SNR* for the SSVEP detection is increased as compared to unipolar montages (Diez et al., 2010). In a time window of 2 s, this difference signal was linearly detrended, multiplied by a *Hann* window and then converted into the *frequency domain* by Fast Fourier Transform (FFT) with a window length of 2048 sample points. The chosen *Hann window function* has a quite narrow main lobe, which gives good frequency resolution, and reasonable side lobe suppression (Harris, 1978). Only those FFT coefficients matching the exact flicker frequencies were used, one single coefficient for each flicker frequency. Thus, 20 real numbers were obtained and squared to represent the *power spectral densities* (PSDs) in the flicker range 20–39 Hz (see **Figure 4**). This procedure was developed following Diez et al. (2011). The described analysis was continuously repeated as *sliding windows* with a displacement of 250 ms, resulting in 87.5% overlap. With all four LEDs emitting steady light, magnitudes of baseline EEG activities *Bf* were measured over 30 s at each future flicker stimulation frequency, determined as *M*PSD by the described procedure (5 sets of 6 s with 2 s resting periods in between, see **Figure 3** *Baseline*). Subjects were instructed to use only the resting periods in between for eye blinks/relaxation and otherwise to keep their eyes quietly open, trying to avoid jaw and tongue movements to reduce EOG/EMG artifacts.
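The preprocessing chain described above can be illustrated with a minimal sketch. It is not the authors' in-house C implementation: the function names are hypothetical, and a direct single-bin DFT stands in for the FFT because only one coefficient per flicker frequency is needed. With a 2 s window at 1024 Hz the bin spacing is 0.5 Hz, so an integer flicker frequency *f* falls exactly on bin 2*f*.

```python
import cmath
import math

FS = 1024        # sampling rate in Hz (from the text)
N_FFT = 2048     # 2 s analysis window at 1024 Hz (from the text)
STEP = FS // 4   # 250 ms displacement = 256 samples, i.e. 87.5% overlap


def linear_detrend(x):
    """Subtract the least-squares straight line from a sequence."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    num = sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
    den = sum((i - t_mean) ** 2 for i in range(n))
    slope = num / den
    return [v - (x_mean + slope * (i - t_mean)) for i, v in enumerate(x)]


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def flicker_powers(oz, poz, freqs=range(20, 40), fs=FS):
    """Squared DFT magnitude of the Oz-POz difference at each flicker frequency."""
    diff = [a - b for a, b in zip(oz, poz)]          # bipolar montage
    n = len(diff)
    x = [v * w for v, w in zip(linear_detrend(diff), hann(n))]
    powers = {}
    for f in freqs:
        k = f * n / fs                               # bin index (2*f for n = 2048)
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x))
        powers[f] = abs(coeff) ** 2                  # one real number per frequency
    return powers
```

Calling `flicker_powers` on successive windows shifted by `STEP` samples would reproduce the sliding-window scheme; dividing each value by the corresponding baseline value gives the ratio of Equation (1).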

and *gray* baseline recording; in each box durations are reported.

For the *frequency scanning phase* of the experiment an identical measurement procedure was used, but with time windows for flicker stimulation of 4 s in each frequency *f* of the 20–39 Hz range resulting in magnitudes of SSVEPs as response, *Rf*. Each stimulation epoch is followed by a 2 s resting period. In the *BCI phase* of the experiment, the same procedure is used for the selected stimulation frequencies in a single measurement window of 2 s.

SSVEP PSD magnitudes were normalized to EEG baseline activity in a given frequency *f* as dimensionless *signal-to-noise* ratios:

$$\mathbf{S}\_{\mathbf{f}} = \mathbf{R}\_{\mathbf{f}} / \mathbf{B}\_{\mathbf{f}} \tag{1}$$

In order to minimize fatigue, we tried to keep the baseline and frequency scanning phase as short as possible, 40 s in total for the baseline and 160 s for frequency scanning.

## **ACL ALGORITHM**

#### *Selection of the top frequencies for each subject*

A closed-loop approach is used to select, for each subject and in the given experimental context, the set of the four *top* stimulation frequencies by compatibility. As a first step, the specified range is scanned, which results in an a-priori score for each frequency. Stimulation frequencies are defined as valid if their *Sf* exceeds a prefixed threshold (set to 10) at any time during the ongoing flicker stimulation. For *N* valid frequencies, the frequency corresponding to the largest *Sf* gets an initial score of *s*1(0) = *N*, the second best *s*2(0) = *N* − 1, etc. The frequency corresponding to the lowest *Sf* gets a score of *sN*(0) = 1. Finally, the four best scores define the selection of the four *top* stimulation frequencies.
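The initial ranking can be sketched as follows. The function names are hypothetical, and the sketch assumes that each frequency is summarized by a single *Sf* value (for instance its maximum during stimulation); the text does not state which summary value is ranked.

```python
def initial_scores(snr_by_freq, threshold=10.0):
    """Rank the valid frequencies: the best one gets N, the worst one gets 1."""
    valid = {f: s for f, s in snr_by_freq.items() if s > threshold}
    ranked = sorted(valid, key=valid.get, reverse=True)
    n = len(ranked)
    return {f: n - i for i, f in enumerate(ranked)}


def top_frequencies(scores, count=4):
    """Return the `count` frequencies with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:count]
```

For example, with `{20: 35.0, 21: 8.0, 22: 12.0, 23: 50.0, 24: 11.0, 25: 20.0}` the frequency 21 Hz is discarded, *N* = 5, and 23 Hz receives the score 5.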

#### *First closed-loop in the ACL-algorithm: iterative selection of the most compatible frequencies*

The previous procedure provides initial scores for each frequency, *s*1(0), *s*2(0), ..., *sN*(0), which depend on subjects' intra- and interindividual *state* and *trait* dimensions and on the extrinsic conditions in which the BCI is used. The selection of the four stimulation frequencies is then further optimized in an iterative approach attending to their compatibility. Thus, as the next step, we calculate the following compatibility measure between all possible pairs of frequencies *x* and *y* taking into account a measure of their distance and their scores:

$$c\_{\rm xy}(t) = \alpha \cdot \left(s\_{\rm x}(t) + s\_{\rm y}(t)\right) + \beta \cdot d\_{\rm xy} \tag{2}$$

Here *t* represents the iteration number and *dxy* is a measure of the distance between the frequencies, which we define below. We assigned the weights α = 1.5 to the scores and β = 1 to the distance, as they enter Equation (2). The values for α and β were set empirically based on several trials. Because four frequencies are used simultaneously in our specific BCI implementation, the most compatible four frequencies have to be selected out of the *N* valid frequencies determined by the protocol described above. The first step is to identify pairs of frequencies with optimal compatibility ("2 freq." search in the ACL branch in **Figure 2**). This search consists of 3*N*/4 iterations (see below), each of them divided into 16 steps with a resting period at its end. The ACL starts from the scores calculated in the scanning procedure, *s*1(0), *s*2(0), ..., *sN*(0), which are modified in the successive iterations to search for the best compatibility.

In each iteration, the subject has to follow a sequence of flicker light sources by focusing upon them, as continuously indicated by the location of the green light. The flicker frequencies are chosen by selecting the pair (*x*, *y*) with the maximal compatibility *cxy* at the end of the iteration. To update the scores, we take into account both the success rate and the time as:

$$s\_{\mathbf{x}}(t) = s\_{\mathbf{x}}(t-1) \cdot (\delta \cdot \text{SR} - \,\,\gamma \cdot T) \tag{3}$$

where SR is the success rate (correct SSVEP detections divided by 16, the number of possible detections) and δ and γ are parameters of the ACL algorithm which were set to δ = 1.2 and γ = 0.02. *T* is the duration of the detection in seconds. The values for δ and γ were chosen based upon the range of SR and *T* and several simulations.
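As a small worked illustration of Equation (3), the update can be written as a one-line function. The function and parameter names are hypothetical; whether *T* denotes a mean or a total detection time is not specified in the text, so the value is simply passed through.

```python
DELTA = 1.2    # weight of the success rate (from the text)
GAMMA = 0.02   # weight of the detection time (from the text)


def update_score(prev_score, successes, duration_s, steps=16,
                 delta=DELTA, gamma=GAMMA):
    """Equation (3): s(t) = s(t-1) * (delta * SR - gamma * T)."""
    sr = successes / steps            # success rate over the 16 steps
    return prev_score * (delta * sr - gamma * duration_s)
```

With 16 of 16 detections and *T* = 2 s, a score of 5 becomes 5 · (1.2 − 0.04) = 5.8, whereas 8 of 16 detections and *T* = 4 s reduce it to 5 · (0.6 − 0.08) = 2.6.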

In this first part of the algorithm, the distance between two specific frequencies *fx* and *fy* for Equation (2) is calculated as:

$$d\_{\mathbf{x}\mathbf{y}} = \left| f\_{\mathbf{x}} - f\_{\mathbf{y}} \right| \tag{4}$$
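Equations (2) and (4) together can be sketched as below. The function names are hypothetical, and scores are held in a dictionary keyed by frequency in Hz, which is an assumption of this sketch.

```python
from itertools import combinations

ALPHA = 1.5   # weight of the summed scores in Equation (2)
BETA = 1.0    # weight of the distance in Equation (2)


def compatibility(scores, alpha=ALPHA, beta=BETA):
    """c_xy for every unordered pair of frequencies (Equations 2 and 4)."""
    return {
        (fx, fy): alpha * (scores[fx] + scores[fy]) + beta * abs(fx - fy)
        for fx, fy in combinations(sorted(scores), 2)
    }


def best_pair(scores):
    """Pair of frequencies with the maximal compatibility."""
    c = compatibility(scores)
    return max(c, key=c.get)
```

For the scores `{23: 5, 20: 4, 25: 3, 22: 2, 24: 1}` the pair (20, 23) wins with 1.5 · 9 + 3 = 16.5: both frequencies score highly and they lie 3 Hz apart.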

Each *cxy*(*t*) is updated by the new scores after each iteration. Once this procedure has run *p* = 3*N*/4 times, the pair with the highest *cxy*(*p*) is selected and a new set is created as the union of both frequencies. Next, the pair with the next highest *cxy*(*p*) that is disjoint from the previous set is chosen and a new set is constructed. This is repeated *N*/2 times because this is the total number of possible disjoint pairs. It is ensured that each set is disjoint from all others. *p* = 3*N*/4 is chosen to test 3*N*/2 frequencies, so that the best frequencies are tested more than once. It is important to note that the duration of the frequency tests has to be restricted.
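The construction of disjoint pairs reads like a greedy selection, which can be sketched as follows. The function name is hypothetical, and the handling of ties is not specified in the text.

```python
def disjoint_pairs(compat):
    """Greedily pick the highest-compatibility pairs that share no frequency."""
    used = set()
    chosen = []
    ordered = sorted(compat.items(), key=lambda kv: kv[1], reverse=True)
    for pair, value in ordered:
        if used.isdisjoint(pair):     # keep every new set disjoint from all others
            chosen.append((pair, value))
            used.update(pair)
    return chosen
```

The returned list is ordered by decreasing compatibility, which is the order needed to assign the scores of the two-frequency sets in the second part of the algorithm.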

Afterwards, the second part of the algorithm is performed, the selection of four frequencies. The same procedure as in the first part is employed, but instead of single frequencies, sets of two frequencies are used. The score *s*(*p* + 1) of each set *x* ∪ *y* is assigned according to its value *cxy*(*p*). In this way, the set with the highest value gets *s*1(*p* + 1) = *N*/2, the second best *s*2(*p* + 1) = *N*/2 − 1 and so on. The last one gets *s*<sub>*N*/2</sub>(*p* + 1) = 1. From this point of the algorithm on, these sets are indivisible.

Using the same procedure performed with two frequencies, the process is repeated with four of them. The compatibility and the score update rules are still the same. The only difference is the distance measure for Equation (2), calculated as:

$$d\_{\rm xy} = \frac{\sum\_{i=1}^{2k} \sum\_{j=1}^{2k} |f\_i - f\_j|}{2k \cdot (2k - 1)} \tag{5}$$

where *k* is the number of frequencies of each set (in this case 2), and *fi* and *fj* are the individual frequencies taken from the union of the sets *x* and *y*. Note that here *x* and *y* refer to sets of two frequencies while in Equation (4) *x* and *y* referred to individual frequencies. This distance expresses the arithmetic mean of the absolute frequency differences over all possible pairs in the set resulting from the union of the initial sets *x* and *y*. Note that for *k* = 1, this distance measure is exactly the same distance (Equation 4) as used in the first part of the algorithm. In this second part 3*N*/8 iterations are performed, which is *N*/2 (the number of disjoint sets) times 3/4 (see above).
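A short sketch of Equation (5) makes the reduction to Equation (4) for *k* = 1 easy to check. The function name is hypothetical.

```python
def set_distance(set_x, set_y):
    """Equation (5): mean absolute difference over all pairs in the union."""
    freqs = list(set_x) + list(set_y)
    m = len(freqs)                    # m = 2k frequencies in the union
    total = sum(abs(fi - fj) for fi in freqs for fj in freqs)
    return total / (m * (m - 1))      # each unordered pair is counted twice
```

For the single frequencies 20 and 23 Hz the double sum is 2 · 3 = 6 and the denominator is 2 · 1, giving 3 Hz as in Equation (4). For the sets {20, 23} and {24, 25} the six pairwise differences sum to 16, so the distance is 16/6 ≈ 2.67 Hz.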

#### *Second closed-loop in the ACL-algorithm: online auditory feedback of SSVEP magnitudes*

In order to offer additional dynamic information to the subject related to his/her brain activity beyond the SSVEP detection confirmation cue, we provide a continuous online auditory feedback during the trials which represents the distance between the actual state and the pre-defined goal. The feedback signal consists of 20 possible sinusoids with a range between 100 and 575 Hz which are updated every 0.25 s. The represented distance measure is defined as the difference between the EEG-SSVEP *signal to noise ratio* for the target frequency (*S*<sub>*f*</sub><sup>target</sup>) and the threshold. Once *S*<sub>*f*</sub><sup>target</sup> has reached this threshold level, the auditory feedback is muted. Beforehand, subjects are instructed that their goal is to raise the pitch of the sinusoids as high as possible, and that after possible success their further goal would be trying to keep the sounds muted for 1.75 s; after this silence, the program automatically proceeds to the trial's next step. This kind of continuous auditory feedback aims to help subjects to learn to gain control *in their particular way* over SSVEP magnitudes by attracting their attentional resources to these voluntary attempts to increase self-regulation of their resonating brain states.
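The mapping from the distance to the threshold onto the 20 tones is not given in the text. The following sketch therefore rests on two explicit assumptions: the tones are equally spaced (100, 125, ..., 575 Hz), and the ratio of the current SNR to the threshold is quantized linearly into the 20 levels. Both the function name and the mapping are hypothetical.

```python
# Assumed equal spacing: 20 tones from 100 Hz to 575 Hz in 25 Hz steps.
TONES_HZ = [100 + 25 * i for i in range(20)]


def feedback_tone(s_target, threshold=10.0):
    """Return the tone in Hz for the current SNR, or None when muted."""
    if s_target >= threshold:
        return None                              # goal reached: feedback is muted
    frac = max(s_target, 0.0) / threshold        # assumed linear mapping
    idx = min(int(frac * len(TONES_HZ)), len(TONES_HZ) - 1)
    return TONES_HZ[idx]
```

Under these assumptions the pitch rises as the SNR approaches the threshold, consistent with the instruction to raise the pitch as high as possible, and the function would be evaluated once per 0.25 s update.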

Concluding, there are two assisted closed loops in our system: the first one operates over the stimulation frequency set with the aim to directly improve the ITRs of each subject. This closed-loop informs the system about subject and environment specificities. The second one informs the subject about his/her brain activity in relation to the use of the interface and helps him/her to do so faster and more accurately. This closed loop works several times for each step of a trial.

#### **SSVEP DETECTION**

In order to reduce the experiment's complexity in terms of a reductionistic paradigm, we chose a simple SSVEP detection strategy in our study. During the *top* and *prefixed* frequency stimulation, the *S*<sub>*f*</sub><sup>target</sup> value is calculated every 0.25 s. If this value exceeds the threshold for 1.75 consecutive seconds, then this SSVEP is defined as "detected." The threshold value was set to 10, which reflects the observed noise floor (see **Figure 5**). To avoid longer waiting periods when the subject is unable to exceed the threshold, a time limit of 4 s is used, after which that step is considered a failure.

During the ACL, to favor SSVEP detection in cases where the subject exceeds the threshold but needs more time than the 1.75 s to be classified as "detected," there is a small modification of this protocol to allow adaptive time extensions. When *S*<sub>*f*</sub><sup>target</sup> exceeds the threshold in a given 0.25 s time step, the time limit is increased by another 0.25 s.
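The detection rule and its adaptive variant can be sketched as a small state machine. The function name is hypothetical, and the sketch assumes that 1.75 consecutive seconds correspond to seven consecutive 0.25 s updates above the threshold.

```python
def detect(snr_steps, threshold=10.0, step=0.25, hold=1.75, limit=4.0,
           adaptive=False):
    """Return (detected, elapsed_seconds) for SNR values sampled every `step`."""
    needed = round(hold / step)       # 7 consecutive updates above threshold
    run = 0
    elapsed = 0.0
    for s in snr_steps:
        if elapsed >= limit:
            break                     # time limit reached: the step fails
        elapsed += step
        if s > threshold:
            run += 1
            if adaptive:
                limit += step         # ACL variant: extend the time limit
            if run >= needed:
                return True, elapsed
        else:
            run = 0                   # the run must be consecutive
    return False, elapsed
```

With 2.5 s below threshold followed by values above it, the fixed 4 s limit expires after six qualifying updates, whereas the adaptive variant grants the extra 0.25 s and reports a detection at 4.25 s.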

#### **EFFICIENCY MEASURES**

After each iteration of the algorithm, both the success rate and time needed are saved. For the *prefixed* and *top* frequencies, *standard Information Transfer Rate* (ITR) is calculated:

$$\text{ITR}(\text{SR}, t) = \left(\log\_2(N) + \text{SR} \cdot \log\_2(\text{SR}) + (1 - \text{SR}) \cdot \log\_2\left(\frac{1 - \text{SR}}{N - 1}\right)\right) \cdot \frac{\text{Norm}}{t} \tag{6}$$

where *N* is the number of targets (*N* = 4 in our case). The value *SR* represents the success rate and *t* is the time taken in minutes. *Norm* is a normalization value set to 960 (60 s times 16 steps in each iteration). Note that if *SR* ≤ 1/*N,* then ITR(*SR, t*) = 0.
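Equation (6) can be evaluated with the short sketch below. The function name is hypothetical; the value of *t* is used exactly as passed in, without any unit conversion, and the term (1 − SR) · log2(·) is taken as 0 for SR = 1 (the usual limit convention).

```python
import math


def itr(sr, t, n_targets=4, norm=960.0):
    """Equation (6); returns 0 when SR is at or below chance level 1/N."""
    if sr <= 1.0 / n_targets:
        return 0.0
    bits = math.log2(n_targets) + sr * math.log2(sr)
    if sr < 1.0:                      # avoid log2(0); the term vanishes at SR = 1
        bits += (1.0 - sr) * math.log2((1.0 - sr) / (n_targets - 1))
    return bits * norm / t
```

For a perfect success rate with four targets the bracket equals log2(4) = 2 bits per selection, so the result is 2 · Norm/*t*.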

In contrast to the conditions *prefixed* and *top,* ITR is measured *several* times during the ACL. Thus, for further a-posteriori analyses these ITR distributions have to be represented by descriptive statistics: for condition *ACL* therefore *M* and *Mdn* of success rates and needed times are used to calculate ITRMean and ITRMedian, completed by maximum ITR (ITRMax).

#### **CONVERGENCE MEASURE**

For a-posteriori analyses, a *convergence measure* for the algorithm in terms of the stimulus frequency exploration was defined: the duration of the 2 freq. search of the algorithm is divided into two parts. For each part, the numbers of explored frequencies are determined and divided by the maximal number of possible frequencies which could be explored (twice the number of iterations). The decrease comparing this measure in the second part vs. in the first part is a sign for how much the frequency exploration is converging. As can be seen in **Table 1**, the number of iterations varies over the subjects. The convergence measure is not reported for the first part because in our sample all subjects had the same maximal value 1, i.e., all possible frequencies were explored. We will use this measure to discuss how the ACL algorithm seems to adapt to subjects' interindividual differences.
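A possible reading of this measure is sketched below. It assumes that the two parts are formed by splitting the list of iterations in half (with an odd count, the extra iteration goes to the second part) and that each iteration tests one pair of frequencies; the function names are hypothetical.

```python
def exploration_ratio(pairs):
    """Distinct frequencies explored divided by the maximum (2 per iteration)."""
    explored = {f for pair in pairs for f in pair}
    return len(explored) / (2 * len(pairs))


def convergence(pairs_per_iteration):
    """Exploration ratios of the first and second part of the 2 freq. search."""
    half = len(pairs_per_iteration) // 2
    first = pairs_per_iteration[:half]
    second = pairs_per_iteration[half:]
    return exploration_ratio(first), exploration_ratio(second)
```

A first-part value of 1 followed by a lower second-part value indicates that the search has narrowed onto fewer frequencies.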

#### **STUDY DESIGN**

A balanced *within*-subjects design with three conditions (*ACL, top, prefixed*) was employed. Each of the six possible presentation orders (ABC, ACB, BAC, BCA, CAB, CBA) was used three times, with random assignment of subjects, resulting in *N* = 18.
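Such a counterbalanced assignment could be generated as in the following sketch. The function name and the seed parameter are hypothetical, and the mapping of the letters A, B, and C to the three conditions is not specified in the text.

```python
import random
from itertools import permutations


def assign_orders(n_subjects=18, conditions="ABC", seed=None):
    """Randomly assign subjects to presentation orders, each used equally often."""
    orders = ["".join(p) for p in permutations(conditions)]   # 6 orders
    repeats, remainder = divmod(n_subjects, len(orders))
    if remainder:
        raise ValueError("n_subjects must be a multiple of the number of orders")
    schedule = orders * repeats                                # 6 x 3 = 18 slots
    random.Random(seed).shuffle(schedule)
    return schedule
```

The entry at index *i* of the returned list is the presentation order for subject *i* + 1.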

#### **BASELINE RESTING STATE EEG MEASURES AS POSSIBLE INTERINDIVIDUAL CORRELATES OF ITR PERFORMANCES**

Aiming to investigate possible correlations between baseline resting state EEG measures and the variables of the experiment, the 30 s baseline EEG (see **Figure 3**) at all eight electrodes reported above was manually cleaned from artifacts, leaving artifact-free epochs of *M* = 20.02 s (*SD* = 5.54). Under *MATLAB 7.11.0.584 win64*, EEG signals were preprocessed in a first step by linear detrending followed by an 8th order *Butterworth* 1.5–70 Hz band pass filter and finally by an 8th order *Butterworth* 45–55 Hz notch filter against 50 Hz power line electromagnetic interference. Then, preprocessed EEG signals were converted into the *frequency domain* by a sliding window FFT of 2 s window length (2048 sample points) with 3.906 ms displacement (4 sample points, which correspond to a 256 Hz sample frequency in the resulting frequency domain signals), after linear detrending and treatment by a *Hann* window function. Obtained FFT coefficients were squared to obtain the power spectrum and then normalized by dividing by 2048 sample points. In order to obtain *absolute PSD*s for the defined EEG frequency bands of interest, corresponding coefficients were summed: *thetaLow* (3.5–6.5 Hz), *thetaHigh* (6.5–7.5 Hz); *alphaLow* (7.5–9 Hz), *alphaHigh* (9–12.5 Hz); *betaLow* (12.5–18 Hz), *betaMid* (18–24 Hz), *betaHigh* (18–30 Hz); *totalSpectrum* (0.5–70 Hz). In a first step, those absolute frequency domain *PSD* signals were normalized by dividing every sample point by the corresponding one of *totalSpectrum*, which resulted in dimensionless ratios. These ratios indicate for each of the 256 time points per second the *relative* energy contribution of the frequency band of interest to the EEG total energy at this particular moment. 
In a last step, in order to represent EEG baseline resting state activities in the analyzed artifact free epochs by one single value for every frequency band, means of these normalized signals were computed over all corresponding time points. Thus, finally we obtained the desired baseline resting state EEG measures as *relative mean PSDs* for further correlational analyses*,* single values for every frequency band over all subjects.
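The normalization to relative band power can be sketched for a single analysis window as follows. The names are hypothetical, only the two bands that show effects later in the text are listed, and the treatment of band edges (lower edge included, upper edge excluded) is an assumption, since the text does not specify it. With a 2 s window the bin spacing is 0.5 Hz.

```python
BANDS_HZ = {
    "thetaHigh": (6.5, 7.5),
    "betaMid": (18.0, 24.0),
}
TOTAL_HZ = (0.5, 70.0)    # totalSpectrum range from the text


def band_sum(power, lo, hi, df=0.5):
    """Sum of the power coefficients whose bin frequency lies in [lo, hi)."""
    return sum(p for k, p in enumerate(power) if lo <= k * df < hi)


def relative_band_power(power, band, df=0.5):
    """Band power divided by total-spectrum power for one analysis window."""
    lo, hi = BANDS_HZ[band]
    return band_sum(power, lo, hi, df) / band_sum(power, *TOTAL_HZ, df)
```

Averaging these ratios over all windows of the artifact-free epochs would give one relative mean PSD per band and electrode.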

Another measure of interindividual EEG variability is the resting state *individual alpha frequency* (IAF), because it has been found to be remarkably stable *within* subjects, but relatively variable *between* subjects (Kondacs and Szabó, 1999). In order to determine IAF in our experiment, coefficients of PSDs corresponding to the frequency band 8–13 Hz at *Oz* were normalized by *totalSpectrum* PSDs and averaged over all sliding windows in the artifact free baseline resting state epochs. In this averaged and normalized power spectrum the *alpha* frequency with the highest PSD was manually measured and defined as IAF (*peak frequency method*).

#### **STATISTICAL ANALYSES**

All statistical analyses were computed using SPSS 17.0 and STATISTICA 6.0. Previously, *Shapiro–Wilk* tests were calculated to check each of the three conditions for normal distribution in the underlying populations. If one or more conditions showed significant departures from normality, *non-parametric* tests were preferred for further analyses: a *Friedman* test was performed as an *omnibus* test to investigate whether the *central tendencies* of one or more conditions differed significantly from the rest. In case of such a significant result, *post hoc pairwise comparisons* were performed in order to find out what conditions exactly differed significantly from each other, based upon comparison of *mean rank differences* using as significance criteria the *critical rank differences* proposed by the more progressive approach of Conover (1980) vs. the more conservative of Schaich and Hamerle (1984).

*Note: Information transfer rates (ITRs) in bits/min as measures of individual BCI performances under the different experimental conditions and all Mdn values are highlighted in bold for further analyses.*

*N trials refers to the number of iterations in the first part of ACL (using two flicker LEDs).*

*Convergence measure first half is not reported in the table because all subjects had the same value 1.*

*SNR SSVEPs in Scanning phase are means over all used 20 flicker frequencies.*

In order to quantify the *effect sizes* of those *post hoc* pairwise comparisons which resulted in significant differences, we used the probability of superiority of dependent scores, *PSdep*, recommended by Grissom and Kim (2012) and developed in Grissom (1994). It expresses the probability that in a randomly sampled matched pair the value from the condition containing the higher scores is indeed larger than that from the one containing lower scores. *PSdep* is calculated by dividing the number of *positive* differences between the condition *containing the higher scores* minus the condition *containing the lower scores* by the total number of matched pairs. For classifying *PSdep* into small, medium and large effect sizes based upon the standards of Cohen (1988), the *cut-off* values reported by Grissom (1994) are used: *small* 0.56, *medium* 0.64, and *large* 0.71. The same author offers a table to directly convert *PS* into equivalent *Cohen's d*. Thus, as effect size measures both *PSdep* and *Cohen's d* are reported with standards *small* = 0.20, *medium* = 0.50 and *large* = 0.80 (Cohen, 1988).
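The calculation of *PSdep* described above amounts to a simple count, sketched here with a hypothetical function name. Ties are not counted as positive differences, following the wording of the definition.

```python
def ps_dep(higher, lower):
    """Share of matched pairs in which the 'higher' condition is larger."""
    if len(higher) != len(lower):
        raise ValueError("conditions must contain matched pairs")
    positives = sum(1 for h, l in zip(higher, lower) if h - l > 0)
    return positives / len(higher)
```

For the matched scores (5, 6, 7, 8) and (4, 7, 6, 5), three of the four differences are positive, so *PSdep* = 0.75, which exceeds the cut-off 0.71 for a large effect.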

In order to check whether significant differences over all six possible permutations of the presentation order might be found, a *mixed-design repeated measures ANOVA* was computed with *stimulation condition* as repeated *within-*subjects factor with three levels (i) ACL algorithm represented as ITR*Median*, (ii) prefixed and (iii) top and *presentation order* as *between-*subjects factor with the six possible permutations as levels (ABC, ACB, BAC etc.). Previously, *Levene*'*s* tests were performed in order to check for homogeneities of error variance. Moreover, the assumption of *sphericity* of the covariance matrix was verified previously by a *Mauchly*'*s sphericity test* in order to assure that the *F* ratios match an *F* distribution. If there was a significant departure from sphericity*, Greenhouse-Geisser* estimates were used to correct degrees of freedom which results in fractions instead of usual integers. Although data may not follow a normal distribution, *ANOVA* has been demonstrated to be relatively robust against moderate deviations from normality (see e.g., Khan and Rayner, 2003). Univariate analyses were used to examine whether there is a significant *between*-subjects main effect of *presentation order* and further if there is a significant *interaction* effect between *presentation order* × *stimulation condition*. Analyses were repeated representing condition (i) ACL algorithm also as ITRMean vs. ITRMax.

For the investigation of linear correlational relationships, *Spearman's rank order correlation coefficient Rho* was additionally used apart from the common *Pearson product-moment correlation coefficient r* due to its relative robustness firstly against outliers, but also against other than linear, but still monotonic relationships and against departures from normality or homoscedasticity. Whenever relevant influence of outliers was suspected, *Spearman's rank correlation coefficient Rho* was preferred.

A-priori *statistical test power* analyses with the program *G\*Power 3* (Faul et al., 2007) show that *Pearson* correlation significance tests with the employed sample size of *N* = 18 and the standard significance level α = 0.05 reach test powers (1 − β) ≥ 0.80, as recommended by Cohen (1988), when the effect size in the underlying population is ρ ≥ 0.60, as compared to *H*<sub>0</sub>: ρ = 0.00. For ρ = 0.50 test power is (1 − β) ≥ 0.60, for ρ = 0.40 (1 − β) = 0.40 and for ρ = 0.30 (1 − β) ≈ 0.20. Thus, although the employed sample size *N* = 18 is relatively small, hypothesis testing of *Pearson* correlations with full recommended strictness is definitely possible at the level of assumed large effect sizes.

#### **RESULTS**

**Table 1** reports the data for all *N* = 18 subjects under the three experimental conditions, representing (i) *ACL algorithm* as ITRMean, ITRMedian and ITRMax. Inferential statistical hypotheses testing that (i) outperformed the other two flicker stimulation conditions is reported below.

**Figure 6** shows the SSVEP frequency-response curves in our experiments. For all subjects, the 20 flicker frequencies in the scanning phase were presented in the same order: 23, 37, 30, 31, 36, 22, 29, 33, 39, 24, 35, 21, 25, 27, 32, 34, 28, 20, 26, and 38 Hz. Sequential randomness of this order is confirmed with *Z* = −0*.*230 and *p*exact = 0*.*828 (*Wald–Wolfowitz* runs test after *Mdn* split dichotomization). Our findings that in the 20–39 Hz range, lower flicker frequencies *over all subjects* (**Figure 6A**) evoke higher SSVEP magnitudes are in line with other studies which reported a global maximum SSVEP amplitude around 10 Hz with additional local maxima around 20, 40, and 80 Hz (Regan, 1989; Herrmann, 2001; Bayram et al., 2011). In our sample, we found that SSVEP frequency-response curves differed remarkably *between subjects* (**Figure 6B**) probably due to trait and state variabilities which justifies that they are determined in our experiment in the scanning phase for every subject *individually.*

Analyzing **Figure 6C**, higher frequencies ≥30 Hz lead to higher correlations; no relevant differences can be seen comparing the three experimental conditions. Interestingly, following e.g., Zschocke and Hansen (2012), 30 Hz is the upper boundary of *beta* activity observable in scalp EEGs by conventional amplifiers.

### **SIGNIFICANT AND LARGE IMPROVEMENT OF SSVEP-BCI EFFICIENCY BY THE NOVEL ACL ALGORITHM**

Analyzing the differences in the *central tendencies* between the three experimental conditions (i) *ACL algorithm*, (ii) *prefixed*, and (iii) *top*, we represented condition (i) by three different descriptive statistics, (a) ITRMean, (b) ITRMedian, and (c) ITRMax (see section "Materials and Methods" and **Table 1**). Applying nonparametric inferential statistics we found a very significant and very large superiority of condition (i) *ACL algorithm* over the other two, (ii) and (iii), which is independent of its three types of representation (a), (b), and (c), while there is no significant difference between (ii) and (iii). The statistical methods and measures used for the following results are described in section "Statistical Analyses."

(a) A *Friedman* omnibus test comparing the ITRs between the three experimental conditions (i) *ACL algorithm* **represented as ITRMean**, (ii) *prefixed* and (iii) *top* shows a significant overall difference with χ²(2) = 10.116, *p* = 0.006.

*Post-hoc pairwise comparisons* based upon *critical mean rank differences* 0.82 (Schaich and Hamerle, 1984) vs. 0.58 (Conover, 1980) indicate that ITRs are significantly higher in (i) *ACL algorithm* as compared to (ii) *prefixed* (*mean rank difference* = 1.03, very large effect size *PSdep* = 0.83, Cohen's *d* = 1.37) and also as compared to (iii) *top* (*mean rank difference* = 0.64, large effect size *PSdep* = 0.72, *d* = 0.83), the latter only when applying the less conservative criterion of Conover (1980). Comparison of (ii) *prefixed* with (iii) *top* results in a non-significant difference (*mean rank difference* = 0.39).

(b) A *Friedman* omnibus test comparing the ITRs between the three experimental conditions (i) *ACL algorithm* **represented as ITRMedian**, (ii) *prefixed* and (iii) *top* shows a significant overall difference with χ²(2) = 9.262, *p* = 0.01.

*Post-hoc pairwise comparisons* based upon *critical mean rank differences* 0.82 (Schaich and Hamerle, 1984) vs. 0.57 (Conover, 1980) indicate that ITRs are significantly higher in (i) *ACL algorithm* as compared to (ii) *prefixed* (*mean rank difference* = 0.94, very large effect size *PSdep* = 0.81, Cohen's *d* = 1.25) and also as compared to (iii) *top* (*mean rank difference* = 0.64, very large effect size *PSdep* = 0.76, *d* = 1.21) applying the less conservative criterion of Conover (1980). Comparison of (ii) *prefixed* with (iii) *top* results in a non-significant difference (*mean rank difference* = 0.31).

(c) A *Friedman* omnibus test comparing the ITRs between the three experimental conditions (i) *ACL algorithm* **represented as ITRMax**, (ii) *prefixed* and (iii) *top* shows a significant overall difference with χ²(2) = 22.986, *p* = 0.00001.

*Post-hoc pairwise comparisons* based upon *critical mean rank differences* 0.82 (Schaich and Hamerle, 1984) vs. 0.41 (Conover, 1980) indicate that ITRs are significantly higher in (i) *ACL algorithm* as compared to (ii) *prefixed* (*mean rank difference* = 1.47, extremely large effect size *PSdep* = 0.94, Cohen's *d* = 2.25) and also as compared to (iii) *top* (*mean rank difference* = 1.19, extremely large effect size *PSdep* = 0.94, *d* = 2.25). Comparison of (ii) *prefixed* with (iii) *top* results in a non-significant difference (*mean rank difference* = 0.28).

## **THE ACL ALGORITHM SEEMS TO ADAPT TO SUBJECTS' INTERINDIVIDUAL DIFFERENCES**

*NTrials* in condition (i) *ACL algorithm* using two flicker LEDs (see **Table 1**) is deterministically given by 3/4 of the number of flicker frequencies, out of the 20 presented in the scanning phase of the experiment, whose SSVEP-*SNR* responses had exceeded the defined threshold value of 10 (*suitable* frequencies), see *ACL Algorithm* of section "Materials and Methods." Thus, in order to make the investigation of possible interindividual associations between the SSVEP-*SNR* magnitudes and the convergence measure second half (see section "Materials and Methods") relatively independent from *NTrials*, all subjects with *NTrials* < 25th percentile (8.75 ≈ 9) were excluded, i.e., subjects 6, 8, 9, 12, 13, and 16. The remaining *N* = 12 subjects showed a relatively small variability, with *NTrials* ranging between 11 and 15. The measure SSVEP-*SNR* mean magnitudes in the scanning phase of the experiment (a) over all flicker frequencies from 20 to 39 Hz was split into two measures, one for (b) *lower* frequencies from 20 to 29 Hz and the other for (c) *higher* frequencies from 30 to 39 Hz. In this subsample, convergence measure second half shows large and highly significant correlations with (a) of *r* = 0.839, *p* = 0.001, with (b) of *r* = 0.843, *p* = 0.001 and with (c) of *r* = 0.763, *p* = 0.004. Checking these relationships against the remaining variability of *NTrials* and age as controlled third variables in *partial correlation* analyses, no changes are observed; the relationships found can be considered as linearly independent from *NTrials* and age. Hence, these findings show that the convergence of the *ACL* algorithm highly depends on the subjects' *trait* ability to generate higher SSVEP-*SNR* magnitudes, with no relevant differences observed between *lower* vs. *higher* flicker frequencies: focusing on a subsample with a more or less constant number of *suitable* frequencies, the ACL algorithm explored more distinct frequencies in those subjects who displayed the *larger* SSVEP-*SNR* magnitudes in the scanning phase of the experiment.

In conclusion, these findings imply that the ACL algorithm shows a distinct exploration behavior for different subjects and thus indeed is able to adapt to subjects' interindividual differences. Whether this adaptation is the *cause* for the ACL algorithm's outperformance of (ii) *prefixed* and (iii) *top* cannot be examined in depth with the employed experimental design and has to be investigated in further studies.

#### **BASELINE RESTING STATE EEG MEASURES AS CORRELATES OF INTERINDIVIDUAL DIFFERENCES**

Searching for significant and relevant associations between interindividual variabilities of ITR performances under the three experimental conditions and of baseline resting state EEG relative mean PSDs in all computed frequency bands at all eight used electrodes, effects were only found in *thetaHigh* (6.5–7.5 Hz) and *betaMid* (18–24 Hz). In all the other bands no associations could be observed.

Whereas *Pearson* correlations showed no relationships between the resting state relative mean *thetaHigh* PSDs at *Oz* vs. ITRs in conditions (ii) *prefixed* (*r* = 0.034, *p* = 0.894) and (iii) *top* (*r* = 0.196, *p* = 0.436), a significant positive correlation with condition (i) *ACL algorithm* was found (*r* = 0.467, *p* = 0.048) representing the performance as ITRMedian. Searching for similar relationships in the other seven used electrodes, no associations were observed; these effects exclusively occur at *Oz* in our sample. Following the effect size classifications of Cohen (1988), this correlation is to be considered as *moderate*. *Partial correlation* analyses confirmed that this correlation is linearly independent of age and of all means of SSVEP-*SNRs* in the previous scanning phase of the experiment over (a) *all* 20 flicker frequencies, (b) the *lower* frequencies 20–29 Hz and (c) the *higher* frequencies 30–39 Hz.

At least in the examined sample, interindividual variability in relative mean *thetaHigh PSD* at *Oz* seems to differentiate between *ACL algorithm* and the other two conditions: the larger the observed relative mean PSDs among subjects in the baseline resting state are, the better will be their later SSVEP-BCI performance exclusively under the use of *ACL algorithm.*

At first sight, analyzing baseline resting state relative mean *betaMid* PSDs, an exclusive relationship with only the ITRs in condition (iii) *top* was found for *F3* (*r* = 0.484, *p* = 0.042), although its neighbor electrodes also showed relationships not very far from significance, probably due to the small sample size: *F4* with *r* = 0.425, *p* = 0.117 and *Fz* with *r* = 0.410, *p* = 0.091. All the other electrodes showed no associations. After further graphic inspection of relevant scatterplots and *Box-Whisker-Plots,* a possible negative relationship between baseline resting state relative mean *betaMid* PSDs at *Oz* and ITRMean in condition (i) *ACL algorithm* was suspected, hidden by outliers. *Box-Whisker-Plots* suggested cases 15 and 11 as outliers, so for further analysis *Mahalanobis distances* were computed in a linear regression analysis with the ITRMean of condition (i) *ACL algorithm* as criterion variable and baseline resting state relative mean *betaMid* PSDs at *Oz* as predictor variable. The inspection of *Mahalanobis distances* and the scatterplot (see **Figure 7**) suggests that subjects 15 and 11 might be considered as outliers. Excluding them changes the correlation from *r* = −0.262, *p* = 0.294 to significant *r* = −0.530, *p* = 0.042. *Partial correlation* analyses confirmed that this correlation is linearly independent of age and of all means of SSVEP-*SNRs* in the previous scanning phase of the experiment (a), (b), and (c) mentioned above.

*distances* **in brackets calculated in a linear regression analysis with the**

Interestingly, excluding case 15 and 11, baseline resting state relative mean PSDs *betaMid* vs. *thetaHigh* both at *Oz* show an almost significant correlation over the subjects with *r* = −0*.*482 and *p* = 0*.*059, probably due to the small sample size, which is stable against the third variables age and all SSVEP-*SNRs* in the previous scanning phase of the experiment (a), (b), and (c), mentioned above.

*r* = −0*.*262, *p* = 0*.*294 to significant *r* = −0*.*510, *p* = 0*.*043.
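The outlier screening with Mahalanobis distances described above can be illustrated with a minimal sketch. It assumes a bivariate variant in which each (predictor, criterion) pair is measured against the sample centroid; statistics packages often compute the distance in predictor space only, so this is not necessarily the computation used in the study. The function name and the data values in the usage note are hypothetical.

```python
from statistics import mean


def mahalanobis_sq(xs, ys):
    """Squared Mahalanobis distance of each (x, y) pair from the sample centroid."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    # Unbiased sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^-1 [dx dy]^T with the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances
```

For example, with the made-up values `xs = [1, 2, 3, 4, 5, 6, 7, 20]` and `ys = [2.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 1.0]`, the last pair receives the largest distance and would be the first candidate for inspection in a scatterplot.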

In conclusion, baseline resting state relative mean *betaMid* PSDs seem to predict ITR performances under (i) *ACL algorithm* vs. (iii) *top* in an opposed fashion depending on the electrodes: the *lower* the baseline resting state relative mean *betaMid* PSDs are at *Oz*, the *higher* will be the ITRs under condition (i); and the *higher* the baseline resting state relative mean *betaMid* PSDs are at frontal electrodes (*F3*, *Fz*, *F4*), the *higher* will be the ITRs under condition (iii). In addition to these findings in *betaMid*, the higher the baseline resting state relative mean *thetaHigh* PSDs at *Oz* are, the higher will be the ITRs exclusively under condition (i).

Returning to the above described subsample of *N* = 12 obtained by excluding all subjects with *NTrials* < 25th percentile (8.75 ≈ 9), an interesting observation was made: IAF shows differentiating relationships with ITR performances. A significant correlation of *r* = 0.577, *p* = 0.0496 was found only with ITRs under (i) *ACL algorithm* (see scatterplot in **Figure 8**), but neither under (ii) *top* (*r* = 0.394, *p* = 0.205) nor under (iii) *prefixed* (*r* = 0.283, *p* = 0.373). The higher the subjects' IAFs are in the subsample, the better their ITR performance will be, exclusively under the ACL algorithm. *Partial correlation* analyses confirmed that this association is linearly independent of age. Repeating this analysis for the entire sample of *N* = 18, *no* significant correlations between individual alpha frequency (IAF) and ITR performances under the three experimental conditions became apparent: (i) *r* = 0.282, *p* = 0.257, (ii) *r* = 0.198, *p* = 0.432 and (iii) *r* = 0.243, *p* = 0.332. These findings imply that subjects with *low* ITRs in all three conditions might represent a different population from the rest. Further studies may try to replicate these findings and identify dimensions which discriminate between these two possible populations. Moreover, these findings could be relevant for the understanding of the so-called *BCI illiteracy* phenomenon (Blankertz et al., 2010; Vidaurre and Blankertz, 2010; Volosyak et al., 2011), see section "Discussion."

**[Figure 8 caption, partially recovered]** [...] **as continuous line, 95% confidence regression bands as dotted lines).** A significant *Pearson* correlation with *r* = 0.577, *p* = 0.0496 was found in the remaining subsample of *N* = 12 (blue points), removing subjects with *NTrials* < 25th percentile (8.75 ≈ 9) (red points), while over the entire sample of *N* = 18 the correlation is hidden with *r* = 0.282, *p* = 0.257 (all points). This relationship seems to exist exclusively for condition (i) *ACL algorithm*: the higher the subjects' IAFs are in this subsample, the better will be their ITRMean performance exclusively under (i). *Partial correlation* analyses confirmed that this association is linearly independent of age.

Inspired by the findings of Koch et al. (2008), who found correlations of IAF with both magnitudes of *visually evoked potentials* (VEPs) and cortical oxygenation measured by *near-infrared spectroscopy* (NIRS), *Spearman* rank order correlations were computed between IAF and means of SSVEP-*SNR* magnitudes in the scanning phase of the experiment (a) over *all* 20 used flicker frequencies 20–39 Hz, (b) over the *lower* frequencies 20–29 Hz and (c) over the *higher* frequencies 30–39 Hz in the described subsample of *N* = 12. Although significance was not fully reached, probably due to the relatively small sample size, an interesting pattern was found: IAF vs. (a) with *rho* = 0.561, *p* = 0.058, IAF vs. (b) with *rho* = 0.183, *p* = 0.568 and IAF vs. (c) with *rho* = 0.557, *p* = 0.060. Interindividual differences in SSVEP-*SNR* magnitudes under the employed *higher* flicker frequencies thus seem to show a tendency toward a positive association with higher IAFs, while this relationship might not exist for stimulation with the *lower* frequencies (or, if it does, it is presumably weaker). These findings motivated a re-analysis of the relationship shown in **Figure 8** by *partial correlations* to test whether it is linearly independent of SSVEP-*SNR* magnitudes in the scanning phase of the experiment (a), (b) and (c) as described above. While (a) and (b) showed no relevant influence on this relationship, controlling for (c) resulted in a reduction from the former *Pearson r* = 0.577, *p* = 0.0496 to *r* = 0.396, *p* = 0.228. Hence, these findings imply that IAF and (c) the magnitude of SSVEP responses to only the employed *higher* flicker frequencies share remarkable amounts of common interindividual variability while explaining variability of ITRMean under the ACL algorithm.
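The partial correlations used above remove the linear influence of a third variable from a correlation between two others. A minimal sketch of the first-order case, using the standard formula based on the three pairwise *Pearson* coefficients, is given below. The function names are hypothetical, and the sketch handles a single control variable only; it is not the statistics package used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear influence of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

If `x` and `y` are related only through `z`, the raw correlation can be sizeable while the partial correlation drops toward zero, which is the pattern of reduction reported above when controlling for (c).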

#### **EFFECTS OF THE PERMUTATION OF PRESENTATION ORDER**

Investigating possible effects of the permutation of presentation order, a *mixed-design repeated measures ANOVA* was computed with *stimulation condition* as repeated *within*-subjects factor with three levels, (i) ACL algorithm represented as ITRMedian, (ii) prefixed and (iii) top, and *presentation order* as *between*-subjects factor with the six possible permutations as levels (ABC, ACB, BAC etc.). *Levene's* tests showed homogeneity of error variances. There was no significant *between*-subjects main effect of *presentation order*, with *F*(5, 18) = 2.26, *p* = 0.115, η<sub>p</sub><sup>2</sup> = 0.485. Because *Mauchly's sphericity* test indicated a significant departure from the assumption of *sphericity*, with χ<sup>2</sup>(2) = 6.54, *p* = 0.038, *Greenhouse-Geisser* estimates were used to correct degrees of freedom (ε = 0.691). There was no significant *interaction* between *presentation order* × *stimulation condition*, with *F*(10, 18) = 0.67, *p* = 0.738, η<sub>p</sub><sup>2</sup> = 0.219. *ANOVA* analyses were also repeated for condition (i) ACL algorithm represented as ITRMean and ITRMax, which resulted in similar findings. In conclusion, neither significant main effects nor significant interactions could be found over all six possible permutations of presentation order. Hence, the effects in the *central tendencies* reported above with regard to all ITR performances can be considered independent of possible presentation order effects.

## **DISCUSSION**

Although electrophysiology-based closed-loop interactions with biological nervous systems have been used since the 1940s, modern computers and online software control techniques allow a wide variety of novel activity-dependent protocols in neuroscience research and related applications. Current BCIs raise a number of problems related to relatively long prior training times and still relatively low efficiencies (ITRs). This calls for novel techniques which can also address context and subject specificities, e.g., adaptive detection of SSVEPs (e.g., Krauledat et al., 2008).

In this paper we described an *assisted closed-loop protocol* which enhances BCI efficiency, as compared to classic BCI protocols, by providing both the subject and the system with online information which helps them to reach the BCI goal in their interaction. We used a reductionistic paradigm to constrain the inherent complexity of closed-loop exploration: four simultaneous frequencies, a basic SSVEP detection strategy and a relatively simple task to be accomplished by the user. More complex BCI systems might further benefit from the described approach. Our paradigm calls for many possible improvements, ranging from advanced SSVEP detection algorithms, stimuli which inform the user more effectively, up to a more adaptive online control of the interface itself by measuring and exploring additional dimensions (multimodality).

The literature on SSVEP-BCIs does not report general recommendations for the selection of the properties of the visual stimuli (Wu et al., 2008; Zhu et al., 2010), although it is known that the SSVEP magnitudes depend on extrinsic and intrinsic dimensions (Ding et al., 2006; Lopez-Gordo et al., 2011). Our study shows that a closed-loop subject-specific selection of the stimulation frequencies, together with the closed-loop auditory feedback, leads to increased BCI ITR performance which outperformed the employed control conditions.

Although *assisted closed-loop protocols* seem to enhance BCI efficiency, their use is limited by the additional time needed for the exploration process. In the protocol discussed in this paper, the average time to perform the experiment was around half an hour, and flicker frequency selection took most of this time. Due to time restrictions, the parameter space can never be explored completely, so BCI efficiency improvement might remain suboptimal. Thus, there is some unknown *trade-off* between improvement and time needed, which should be explored in further studies. Furthermore, the question of how replicable the found flicker frequencies are in the same subjects over multiple follow-up time points could be explored. Observing this stability over time (e.g., *test-retest reliability*) may help to discover important trait vs. state dimensions related to variability of BCI performance. Another limitation due to the SSVEP physiology is that the time window for the auditory feedback is relatively short, so subjects have to establish control over the BCI goal within a few seconds. This implies possible interactions with subjects' traits and states related to cognitive *processing speed* and dimensions of learning abilities.

ACL algorithms offer new possibilities as compared to traditional open-loop paradigms, but require additional decisions and new perspectives for their design and analysis, e.g., with regard to online measurement of actual states and performance, parameter search responding to the particular dynamic behavior of the system, properties of the feedback stimuli, *actuation laws*, etc. However, our findings imply that this additional effort can improve BCI efficiency and help reveal dynamics of the nervous system which would remain hidden under traditional paradigms. Because our analyses showed that EEG resting state measures can predict assisted closed-loop SSVEP-BCI performance, our novel approach seems to flexibly adapt to and interact with interindividual cerebral variabilities. Although found in the context of a *sensory motor rhythms* (SMRs) based BCI, other recent work also demonstrated that EEG resting state measures can be relevant predictors of BCI performance (Blankertz et al., 2010). In this emerging field, it could be fruitful to identify possible EEG resting state measures which can differentiate between and predict BCI performances based on biosignals originating from distinct physiological mechanisms: SSVEPs, P300, SMRs, *slow cortical potentials* (SCPs), *electrocorticogram* (ECoG), *magnetoencephalography* (MEG), NIRS or *blood-oxygen-level-dependent* (BOLD). Apart from these biosignals reflecting *brain* activity, *peripheral* psychophysiological measures have been investigated in the context of BCIs, especially as performance predictors, such as *parasympathetic/vagal* parameters of resting state *heart rate variability* (HRV) (Kaufmann et al., 2011).

Our proposed approach of new adaptive-interactive paradigms might offer innovative ways to address the problem of the so-called *BCI illiteracy*, i.e., the incapacity of some subjects to achieve control of BCIs (Blankertz et al., 2010; Vidaurre and Blankertz, 2010; Volosyak et al., 2011). It might be fruitful to explore the possibly different impact of ACL algorithms in BCIs based on the mentioned distinct physiological mechanisms, especially with regard to their specific BCI illiteracies.

As mentioned in section "Baseline Resting State EEG Measures as Possible Interindividual Correlates of ITR Performances," the IAF is a measure of *interindividual* EEG variability because it is remarkably stable *within* subjects, but relatively variable *between* subjects (Kondacs and Szabó, 1999). IAF seems to be highly heritable, e.g., Posthuma et al. (2001) found in a study comparing mono- vs. dizygotic twins, analyzing a large representative sample of healthy Dutch adults (*N* = 688), that 71–83% of total IAF variance could be ascribed to genetic variances. Thus, IAF may be considered an *endophenotype* following the definition of Gottesman and Gould (2003). Klimesch (1997) found in a sample of age-matched subjects that the IAF of good working memory performers is about 1 Hz higher than that of bad performers. Jin et al. (2006) found that IAF is positively correlated with conflict reaction time. Severity of *Alzheimer's* disease is positively related to the extent of typical IAF slowing in this pathology (Rodriguez et al., 1999). On the neurophysiological level, Steriade et al. (1990) reported that IAF depends on membrane properties of the thalamic neurons which project to the cortex, implying thalamo-cortical feedback loops as one of the important generators of alpha activity (Lopes da Silva, 1991). Mayer et al. (2007) successfully modeled the synchronization of locally coupled bistable thalamic oscillators as controlled by the influence of corticothalamic projections, probably responsible for widespread spindle oscillations in the thalamus. Given these findings, IAF might be understood as a positive correlate of thalamo-cortical information processing speed. With regard to possible correlations of IAF with SSVEP magnitudes, Koch et al. (2008) found interesting correlations of IAF with both magnitudes of VEPs and cortical oxygenation measured by NIRS. In conclusion, IAF seems to open new insights into the understanding of the neural circuits underlying BCI performance and thus should be considered a promising predictor for further studies.

In this study, only eight EEG electrodes were used to investigate EEG resting state measures as performance predictors, but further work might use more electrodes of the 10–20 system to allow a-posteriori offline analyses of *scalp maps* and the use of *source localization* techniques, e.g., *LORETA* (for a review see Grech et al., 2008). Findings of research concerning the cerebral *resting-state networks* call for further studies which use simultaneous EEG/fMRI recordings (for reviews see e.g., Fox and Raichle, 2007; Van den Heuvel and Hulshoff-Pol, 2010; for typical studies see e.g., Damoiseaux et al., 2006; Van den Heuvel et al., 2009; Yuan et al., 2012).

Opening the scope to other uses, the demonstrated advantage of our adaptive-interactive BCI protocol can be expanded conceptually, e.g., to innovative applications such as diagnostic/therapeutic tools in clinical contexts: exploring the subject-specific dynamical trajectory of machine-subject interaction could extract information which otherwise would remain undiscovered. Thus, far beyond an engineering focus, the proposed approach might be employed as a new paradigm for basic neuroscientific and biomedical research.

## **ACKNOWLEDGMENTS**

We thank Víctor Bonilla for technical help. This work was supported by UAM CEMU 2012-004, MINECO TIN2012-30883 and TIN-2010-19607.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 October 2012; paper pending published: 22 December 2012; accepted: 06 February 2013; published online: 25 February 2013. Citation: Fernandez-Vargas J, Pfaff HU, Rodríguez FB and Varona P (2013) Assisted closed-loop optimization of SSVEP-BCI efficiency. Front. Neural Circuits 7:27. doi: 10.3389/fncir.2013.00027*

*Copyright © 2013 Fernandez-Vargas, Pfaff, Rodríguez and Varona. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects

#### *Armin Walter<sup>1</sup>\*, Ander R. Murguialday<sup>2,3</sup>, Wolfgang Rosenstiel<sup>1</sup>, Niels Birbaumer<sup>2,4</sup> and Martin Bogdan<sup>1,5</sup>*

*<sup>1</sup> Department of Computer Engineering, Wilhelm-Schickard-Institute, Eberhard Karls Universität Tübingen, Tübingen, Germany*

*<sup>2</sup> Institute of Medical Psychology and Behavioural Neurobiology, University Hospital Tübingen, Tübingen, Germany*

*<sup>3</sup> Health Technologies Department, TECNALIA, San Sebastian, Spain*

*<sup>4</sup> Ospedale San Camillo, IRCCS, Venice, Italy*

*<sup>5</sup> Department of Computer Engineering, University of Leipzig, Leipzig, Germany*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Yang Dan, University of California, USA*

*Pratik Y. Chhatbar, Medical University of South Carolina, USA*

#### *\*Correspondence:*

*Armin Walter, Department of Computer Engineering, Wilhelm-Schickard-Institute, Eberhard Karls Universität Tübingen, Sand 14, 72076 Tübingen, Germany. e-mail: armin.walter@uni-tuebingen.de*

Brain-state-dependent stimulation (BSDS) combines brain-computer interfaces (BCIs) and cortical stimulation into one paradigm that allows the online decoding for example of movement intention from brain signals while simultaneously applying stimulation. If the BCI decoding is performed by spectral features, stimulation after-effects such as artefacts and evoked activity present a challenge for a successful implementation of BSDS because they can impair the detection of targeted brain states. Therefore, efficient and robust methods are needed to minimize the influence of the stimulation-induced effects on spectral estimation without violating the real-time constraints of the BCI. In this work, we compared four methods for spectral estimation with autoregressive (AR) models in the presence of pulsed cortical stimulation. Using combined EEG-TMS (electroencephalography-transcranial magnetic stimulation) as well as combined electrocorticography (ECoG) and epidural electrical stimulation, three patients performed a motor task using a sensorimotor-rhythm BCI. Three stimulation paradigms were varied between sessions: (1) no stimulation, (2) single stimulation pulses applied independently (open-loop), or (3) coupled to the BCI output (closed-loop) such that stimulation was given only while an intention to move was detected using neural data. We found that removing the stimulation after-effects by linear interpolation can introduce a bias in the estimation of the spectral power of the sensorimotor rhythm, leading to an overestimation of decoding performance in the closed-loop setting. We propose the use of the Burg algorithm for segmented data to deal with stimulation after-effects. This work shows that the combination of BCIs controlled with spectral features and cortical stimulation in a closed-loop fashion is possible when the influence of stimulation after-effects on spectral estimation is minimized.

**Keywords: brain-computer interfaces, cortical stimulation, spectral estimation, brain-state-dependent stimulation, autoregressive models**

#### **1. INTRODUCTION**

Cortical stimulation is being used to study cortical function (e.g., Matsumoto et al., 2007). In clinical settings, it is employed for surgical planning (Lefaucheur and de Andrade, 2009) and therapy (Tsubokawa et al., 1991). Furthermore, preliminary studies have been conducted on the use of cortical stimulation for stroke rehabilitation, in which stimulation was combined with physiotherapy in order to modulate cortical excitability (Brown et al., 2008; Levy et al., 2008). Taking the current brain activity of the patient into account when selecting stimulation parameters has been proposed as a possible improvement (Plow et al., 2009). Such an activity-dependent stimulation paradigm has been used by Jackson et al. (2006), who were able to show that cortical microstimulation associated in time with brain activity during a motor task can induce neural reorganization lasting for several days after stimulation in primates.

The effects of transcranial magnetic stimulation (TMS) also depend on the brain state of the stimulated person (Mitchell et al., 2007). Recently, Bergmann et al. (2012) applied TMS coupled to electroencephalography (EEG) to investigate the dependency of stimulation effects on the phase of slow EEG oscillations during sleep. In general, such activity-dependent or brain-state-dependent stimulation (BSDS) paradigms allow the investigation of cortical networks at specific activation levels, making BSDS a potentially useful tool in cognitive neuroscience (Jensen et al., 2011) as well as in clinical studies, where it may improve the consistency of the stimulation effects (Plow et al., 2009).

For effective BSDS, reliable decoding of the brain-state from the ongoing brain activity is necessary. Over the last decades in the field of brain-computer interfaces (BCIs) several different strategies were investigated (Birbaumer et al., 1999; Birbaumer and Cohen, 2007). Especially in the case of movement-related brain states during active or imagined limb movements, spectral power has been shown to be useful for their decoding. In particular, event-related (de-) synchronization of sensorimotor rhythms is an informative measure for discriminating movement and resting states (Wolpaw et al., 2002). Therefore, if one wants to combine BSDS with a movement task, one has to minimize the interference of the stimulation on the estimation of the spectral features to detect the brain-state properly.

The stimulation effects involve problems with spectral estimation caused by the stimulation artefact and the evoked neural activity. A stimulation pulse evokes an artefact in the signal (**Figure 1A**) with an amplitude in the range of several hundred millivolts or even volts, thus often exceeding the dynamic range of the amplifier (Veniero et al., 2009). In the vicinity of stimulation, evoked potentials are recorded (**Figure 1B**) which can reach amplitudes of several hundred microvolts (Matsumoto et al., 2007). Thus, if an analyzed window contains a stimulation pulse, the estimation of the spectrum of this window is difficult, because the signal within it is not stationary. This is evident in **Figure 1C**, showing that each stimulation pulse results in strong jumps in the estimated spectral power. Waiting long enough after the pulse is one solution. This approach results in non-continuous brain-state decoding with waiting periods after a stimulus of at least several hundred milliseconds. It dictates a longer inter-stimulus interval (ISI), because a robust estimate of the brain-state is needed before the next pulse can be applied.

**[Figure 1 caption, partially recovered]** [...] components are covered. **(C)** Time-course of the power at 12 Hz of the signal displayed in **(A)**, resulting from a time-frequency analysis with auto-regressive models (order 16) when a window of 500 ms is shifted in 40 ms steps over the data. Hence, a single stimulation pulse distorts the spectrum for the next 500 ms because it remains in the data window. **(D)** Zoom on the region of the last stimulation pulse. Power at 12 Hz without stimulus processing (solid line) and when the gap is defined as in **(B)** and either MEMgap (dashed line) or AR modeling with order 16 (dashed-and-dotted line) are applied to deal with it.

If small ISIs and/or continuous decoding of the brain-state is necessary, methods that enable spectral estimation of data containing stimulation after-effects are mandatory. One potential solution for this, which has been used mainly in offline studies (no BSDS), is to separate the stimulation effects from the signal, as for example in Litvak et al. (2007). This places restrictions on the recording setup, such as the need for an amplifier with high dynamic range to cover the entire amplitude of the artefact and it is unclear whether such a procedure can be performed online without resulting in residual artefacts which would still lead to distortions of the spectrum. We present in this paper another solution suitable for online BSDS: we ignore the short segment of data dominated by the after-effects of stimulation when estimating the spectrum, leaving us with the challenge to estimate the spectrum when portions of the data are missing from a continuous data flow. We term such an excluded data segment a *gap*. In online experiments, using either signals synchronized with the stimulator or a peak detection algorithm, one can mark a sample before the stimulation pulse as the beginning of the gap. The number of following samples marked as belonging to the gap (i.e., the *gap size*) should be chosen in advance such that the gap, ideally, encloses just the stimulation artefact, and the largest evoked components. The dashed line and the dashed-and-dotted line in **Figure 1D** show the results of two approaches introduced in this work to extract the spectral power when the artefacts are masked by the gap shown in **Figure 1B**. They are much closer to the power before and after the stimulus, compared to the power without any processing of the stimulus (solid line).
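The marking of gaps described above can be sketched as follows. This is only an illustration under stated assumptions: pulse onsets are given as sample indices (for instance from a stimulator trigger), the gap starts a fixed number of samples before each onset, and `gap_size` counts all samples of the gap including the first one. The function name and the parameter `pre_samples` are hypothetical.

```python
def make_gap_mask(n_samples, pulse_onsets, gap_size, pre_samples=1):
    """Return a list of 0/1 flags; 1 marks a sample that belongs to a gap."""
    mask = [0] * n_samples
    for onset in pulse_onsets:
        start = max(0, onset - pre_samples)      # begin just before the pulse
        stop = min(n_samples, start + gap_size)  # fixed gap size chosen in advance
        for i in range(start, stop):
            mask[i] = 1
    return mask
```

With `make_gap_mask(10, [4], 3)`, samples 3 to 5 are flagged and all others remain clean. Choosing `gap_size` too small leaves residual after-effects in the clean data, while choosing it too large discards usable signal.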

In this paper we compare different online brain-state decoding methods on their suitability to perform spectral estimation using autoregressive (AR) models on data containing stimulation pulses and gaps. We consider here only stimulation paradigms with pulsed stimuli and restrict ourselves to data acquired with EEG or electrocorticography (ECoG) and stimulation performed using TMS or epidural electrodes. First, we introduce the methods for spectral estimation in the presence of gaps and investigate the effects of parameter estimates such as AR model order and gap size on the resulting spectrum. We present results from a simulation study in which gaps are artificially inserted into a BCI data set recorded without stimulation. We then show the different results of the algorithms on short data segments of two BCI training experiments, one with simultaneous TMS and one with simultaneous epidural electrical stimulation, to illustrate the effects of cortical stimulation on spectral estimation and the results of correcting stimulation after-effects. Finally, we investigate the separability of intended hand movement and rest for different experimental paradigms (no stimulation, open-loop, or closed-loop stimulation) using non-invasive and invasive data during BCI experiments in three chronic stroke patients.

## **2. METHODS**

#### **2.1. PARTICIPANTS**

Data was recorded from three chronic stroke patients (**Table 1**) suffering from paresis of the left hand. None of the patients was able to produce voluntary finger movements with the left hand. All procedures were approved by the local ethics committee of the medical faculty of the university hospital in Tübingen. Each stroke patient was implanted with 16 epidural platinum iridium disk electrodes (Resume II, Medtronic, Fridley, USA) with a contact diameter of 4 mm placed over the ipsilesional S1, M1, and pre-motor cortex on four strips with an inter-electrode center-to-center distance of 10 mm. They were arranged in a 4 × 4 grid-like pattern (**Figure 2**). During pre-surgical evaluation, all subjects completed the task described below with combined EEG-TMS (*non-invasive case*) and repeated the same task after the surgery using electrical epidural stimulation and recordings from the implanted electrodes (*invasive case*). The BCI and stimulation experiments were conducted during a period of 4 weeks following the implantation.

#### **2.2. TASK**

The patient was facing a 19-inch monitor. The left upper limb of the patient was fixed using two straps, one at the forearm and one around the wrist, and magnets fixed the fingertips to the actuators of a mechatronic hand orthosis (Tyromotion Amadeo HTS, Graz, Austria). This device was controlled by a BCI and moved the fingers of the paralyzed hand between an opened and a closed position. The range of the movement was adjusted in each session (Ramos-Murguialday et al., 2012) because it was limited by the spasticity of the patient. Each trial of the task consisted of three phases: preparation (2 s), feedback (6 s), and rest (8 s). During preparation, the subject received an auditory cue ("Left Hand") but was instructed to wait with the execution until the next auditory command ("Go!") was given at the start of the feedback phase. During the feedback phase, starting with a closed position of the left hand, the patient had to try to open the left hand until the end of the feedback phase. At that point, another auditory cue ("Relax!") was given. During the rest period, the left hand of the patient was returned to its original closed position (2–3 s) and the patient was instructed to relax. An experimental session was divided into 4–16 runs, each consisting of 11 trials. Runs with clear non-stimulation-related artefacts (e.g., amplifier saturation) on the analyzed channels were excluded from further analysis, resulting in a minimum of three runs per session for analysis and an average of 8.7 ± 4.3.

**[Figure 2 caption, partially recovered]** [...] **post-surgical CT for the three patients.** From left to right: P1–P3.
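The trial structure described above (preparation, feedback, rest) can be written down as a simple cue schedule. The sketch below only restates the phase durations and auditory cues given in the text; it assumes that trials follow each other without additional pauses, which the text does not state, and all names are hypothetical.

```python
# (phase name, duration in seconds, auditory cue given at phase onset)
PHASES = (
    ("preparation", 2.0, "Left Hand"),
    ("feedback", 6.0, "Go!"),
    ("rest", 8.0, "Relax!"),
)


def run_schedule(n_trials=11):
    """Return the cue events of one run and the total run duration in seconds."""
    events, t = [], 0.0
    for trial in range(n_trials):
        for phase, duration, cue in PHASES:
            events.append((t, trial, phase, cue))  # cue is played at phase onset
            t += duration
    return events, t
```

Under these assumptions one trial lasts 16 s, so a run of 11 trials takes 176 s.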

#### **2.3. ELECTROPHYSIOLOGICAL RECORDING**

Both EEG and ECoG were recorded with monopolar 32 channel amplifiers (BrainAmp MR plus, BrainProducts, Munich, Germany) with a sampling rate of 1000 Hz. The data was acquired in a packet-wise fashion, where the recording computer received one packet of data consisting of 40 samples per channel every 40 ms. The same behavior was modeled in our simulations of an online BCI. A high-pass filter with a cutoff frequency at 0.16 Hz and a low-pass filter with a cutoff frequency at 1000 Hz were applied. We recorded 32 channels of EEG in the standard 10–10 system, referenced to FCz, using circular Ag-AgCl electrodes. ECoG data was referenced to an electrode at the medio-frontal corner of the electrode grid over pre-motor cortex. Signal acquisition, signal processing and control of the orthosis and (if present in the experiment) the TMS or electrical stimulator were performed using the general-purpose BCI framework BCI2000 (http://www.bci2000.org) (Schalk et al., 2004) extended with custom-developed features for the control of these devices.
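The packet-wise acquisition described above can be mimicked offline with a small buffer that keeps the most recent samples of one channel. The following sketch assumes a 500-sample analysis window (500 ms at 1000 Hz, the window length mentioned in the caption of Figure 1) that is updated with every 40-sample packet; the class name is hypothetical and this is not BCI2000 code.

```python
from collections import deque


class SlidingWindow:
    """Keep the most recent samples of one channel, updated packet by packet."""

    def __init__(self, window_samples=500):
        self.buf = deque(maxlen=window_samples)

    def push(self, packet):
        """Append one packet; return the current window once it is full, else None."""
        self.buf.extend(packet)  # oldest samples drop out automatically
        if len(self.buf) == self.buf.maxlen:
            return list(self.buf)
        return None
```

Because 500 is not a multiple of 40, the first complete window becomes available after the 13th packet. From then on, every new packet shifts the window by 40 ms.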

#### **2.4. STIMULATION**

We applied stimulation in the non-invasive case over the hotspot for *extensor digitorum communis* (EDC) activity, identified by a standard mapping paradigm (Wassermann et al., 2008). TMS was applied with a figure-of-eight coil (NeXstim, Helsinki, Finland) with single biphasic pulses (sinusoidal coil current, positive phase first, pulse width 280 μs) and an intensity of 110% of the resting motor threshold. The ISI of successive pulses was set to 3 s.

For epidural electrical stimulation we used single biphasic anodal square-wave pulses with a length of 500 μs. Stimulation intensity was selected individually per patient and session and chosen to reliably evoke MEPs on the paretic upper limb of the patient. The minimum ISI was set to 2 s in most experiments, except when stimulation was applied coupled with the BCI output. In this case, a minimum ISI of 500 ms was chosen. The pulses were applied using a constant current stimulator (STG4008, Multichannel systems, Reutlingen, Germany) with the anode being the epidural electrode that evoked the strongest MEPs on the left upper limb and the cathode being a 50 × 90 mm adhesive electrode placed on the left clavicle of the patient. Due to a software error, the current source of the stimulator was switched off 2 s after the last stimulation pulse if no other pulse had been triggered before, leading to a small but visible step in the recorded signal (**Figure 1B**).
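The coupling of stimulation to the BCI output with a minimum ISI can be pictured as a simple gate that is evaluated whenever a new decoding result arrives. The sketch below is an assumption about how such a rule could look, not the implementation used in the experiments: a pulse is released only if movement intention is currently detected and at least 500 ms have passed since the previous pulse. The class and parameter names are hypothetical, and time is kept in integer milliseconds.

```python
class StimulationGate:
    """Release a pulse only during detected movement intention and after the minimum ISI."""

    def __init__(self, min_isi_ms=500):
        self.min_isi_ms = min_isi_ms
        self.last_pulse_ms = None  # time of the most recent pulse, if any

    def update(self, now_ms, movement_detected):
        """Return True if a pulse should be triggered at time now_ms."""
        if not movement_detected:
            return False
        if self.last_pulse_ms is not None and now_ms - self.last_pulse_ms < self.min_isi_ms:
            return False  # too soon after the previous pulse
        self.last_pulse_ms = now_ms
        return True
```

Calling `update` every 40 ms with the current decoder output would then produce pulses at most every 500 ms while the intention to move is detected, and none during rest.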

#### **2.5. AUTOREGRESSIVE (AR) MODELS**

A popular choice for spectral estimation in BCI research is to use an AR model for which the coefficients are estimated with the maximum entropy method (Krusienski et al., 2006; McFarland and Wolpaw, 2008). An AR model can be viewed as a linear predictor of the signal samples *x(t<sub>k</sub>)*, defined as:

$$\mathbf{x}(t\_k) = \sum\_{i=1}^{p} c\_i \mathbf{x}(t\_{k-i}) + e$$

where *p* is the order of the model and *e* a sample of a white noise process. If one uses a continuous window of length *N* with *N* ≫ *p* consisting of samples *x(t<sub>0</sub>)* to *x(t<sub>N−1</sub>)*, one could solve the following equations with a least-squares procedure to get the coefficients *c<sub>i</sub>*:

$$\mathbf{x}(t\_k) = \sum\_{i=1}^{p} c\_i \mathbf{x}(t\_{k-i}) \tag{1}$$

$$\mathbf{x}(t\_{k-p}) = \sum\_{i=1}^{p} c\_i \mathbf{x}(t\_{k-p+i}) \text{ for all } k = p, \dots, N-1$$

However, the resulting coefficients do not guarantee a stable AR model (de Waele and Broersen, 2000). Burg proposed a recursive algorithm for the solution of this system that provides stable models with less variance compared to least squares solutions and the Yule-Walker algorithm (Kay and Marple, 1981; de Waele and Broersen, 2000). The Burg algorithm computes the AR coefficients in *p* steps by evaluating in the *i*-th step the residuals of forward and backward prediction of the samples using the coefficients obtained in the *(i − 1)*-th step. It is described in Appendix 1.1 in more detail. Spectral estimation with AR models is briefly introduced in Appendix 1.2.
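The recursion described above can be illustrated with a minimal sketch of the textbook Burg algorithm, followed by the usual AR power spectrum. This is not the implementation of Appendix 1.1 or of BCI2000: the function names `burg`, `ar_power` and `simulate_ar` are hypothetical, the sign convention follows the predictor equation above (*c<sub>i</sub>* multiplies past samples), and the spectral scaling (noise variance divided by the sampling rate) is an assumption.

```python
import cmath
import math
import random


def simulate_ar(c, n, seed=0):
    """Generate n samples of an AR process with unit-variance white noise (test helper)."""
    rng = random.Random(seed)
    x = [0.0] * len(c)
    for _ in range(n):
        pred = sum(ci * x[-i] for i, ci in enumerate(c, start=1))
        x.append(pred + rng.gauss(0.0, 1.0))
    return x[len(c):]


def burg(x, p):
    """Return (c, noise_var): coefficients c_1..c_p and the white-noise variance."""
    n = len(x)
    f = list(x)  # forward prediction errors
    b = list(x)  # backward prediction errors
    a = [1.0]    # prediction-error filter; c_i = -a[i]
    noise_var = sum(v * v for v in x) / n
    for m in range(1, p + 1):
        # Reflection coefficient minimizing forward plus backward error power.
        num = sum(f[k] * b[k - 1] for k in range(m, n))
        den = sum(f[k] ** 2 + b[k - 1] ** 2 for k in range(m, n))
        k_m = -2.0 * num / den if den > 0.0 else 0.0
        # Levinson recursion for the filter coefficients.
        ext = a + [0.0]
        a = [ext[i] + k_m * ext[m - i] for i in range(m + 1)]
        # Update residuals in place; descending k keeps b[k - 1] untouched.
        for k in range(n - 1, m - 1, -1):
            fk, bk = f[k], b[k - 1]
            f[k] = fk + k_m * bk
            b[k] = bk + k_m * fk
        noise_var *= 1.0 - k_m * k_m
    return [-ai for ai in a[1:]], noise_var


def ar_power(c, noise_var, freq, fs):
    """AR power spectral density at freq (Hz); the 1/fs scaling is an assumption."""
    z = sum(ci * cmath.exp(-2j * math.pi * freq * i / fs)
            for i, ci in enumerate(c, start=1))
    return noise_var / (fs * abs(1 - z) ** 2)


if __name__ == "__main__":
    x = simulate_ar([1.5, -0.75], 5000, seed=0)
    c, noise_var = burg(x, 2)
    print(c, noise_var, ar_power(c, noise_var, 21.0, 1000.0))
```

Because the magnitude of each reflection coefficient cannot exceed 1, the resulting model is stable, which is the property that motivates the choice of Burg's method over plain least squares.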

The Burg algorithm requires that the input data is sampled continuously without gaps, a condition which is shared by most of the other algorithms for AR model estimation. Therefore, we need to either fill or remove the gaps before applying one of these algorithms to our data or modify the AR model estimation algorithms to be usable for data with gaps.

#### **2.6. SPECTRAL ESTIMATION IN THE PRESENCE OF GAPS**

This section contains a short description of the different algorithms we compare in this paper that deal with the pre-processing of data containing gaps for spectral estimation with AR models. The input for these algorithms is a segment of data and a vector that contains for each sample in the segment either a 1 (the sample belongs to a gap and has to be excluded from spectral estimation) or a 0 (the sample is "clean").

Four methods for dealing with gaps in the data are described below: (1) *linear interpolation* and (2) *AR modeling*, which both fill the gap with generated data, (3) the *joining of data segments*, which removes the gap, and (4) a *modified Burg algorithm for segmented data*. After application of methods (1)–(3), the standard Burg algorithm is used to estimate the AR model and the spectrum.

#### *2.6.1. Linear interpolation*

We can bridge gaps in the data by linear interpolation between the last sample before and the first sample after the gap:

$$\hat{\mathbf{x}}(t\_{g+k}) = \mathbf{x}(t\_{g-1}) + \frac{k+1}{l+1} \cdot \left(\mathbf{x}(t\_{g+l}) - \mathbf{x}(t\_{g-1})\right), \ 0 \le k \le l-1 \tag{2}$$

where *x* are the signal samples recorded at times *t<sub>i</sub>*, *l* is the length of the gap in samples and *t<sub>g−1</sub>* is the index of the last sample before the gap.

While this might work for offline analysis of a data set, in the case of online analysis during a BCI experiment, in which data is received sample- or packet-wise, one might not yet have received the first clean sample after the gap when trying to produce an estimate for *x(t<sub>g+k</sub>)* within the gap. We used a simple approach to solve this problem: the gap is filled with the value of the last sample before the gap (*x̂(t<sub>g+k</sub>)* = *x(t<sub>g−1</sub>)*) as long as the packet containing the end of the gap has not been received, and linear interpolation is used for the rest of the gap once it has. We term this approach *on-line compatible linear interpolation*.
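The on-line compatible variant can be sketched as a packet-wise routine. The following is an illustration only, not the code used in the experiments: the function name `fill_packet` is hypothetical, and it assumes that at least one already processed sample precedes the packet and that the gap mask uses 1 for gap samples as described in section 2.6.

```python
def fill_packet(history, packet, in_gap):
    """Fill gap samples of one packet with on-line compatible linear interpolation.

    history: previously processed samples (non-empty); its last value is either
             the last clean sample or the value held during an open gap.
    packet:  newly received samples; in_gap[i] == 1 marks a gap sample.
    """
    out = []
    prev = history[-1]
    i = 0
    n = len(packet)
    while i < n:
        if not in_gap[i]:
            prev = packet[i]
            out.append(prev)
            i += 1
            continue
        j = i
        while j < n and in_gap[j]:
            j += 1
        length = j - i
        if j < n:
            # End of the gap is inside this packet: Equation 2 from prev to packet[j].
            step = (packet[j] - prev) / (length + 1)
            out.extend(prev + (k + 1) * step for k in range(length))
        else:
            # End of the gap not received yet: hold the last value before the gap.
            out.extend([prev] * length)
        i = j
    return out


if __name__ == "__main__":
    first = fill_packet([0.0], [2.0, 9.0, 9.0], [0, 1, 1])
    second = fill_packet(first, [9.0, 5.0], [1, 0])
    print(first, second)
```

When a gap spans two packets, the samples of the first packet keep the held value and only the remainder of the gap is interpolated, which is what distinguishes this variant from the offline form of Equation 2.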

#### *2.6.2. AR modeling*

As a somewhat more sophisticated technique compared to linear interpolation, we generated data from an AR model to fill the gap. For this we used the coefficients *c<sub>i</sub>* of the AR model estimated for the data window directly before the gap to predict the missing samples *x̂*:

$$\hat{\mathbf{x}}(t\_{g+k}) = \sum\_{i=1}^{p} c\_i \mathbf{x}'(t\_{g+k-i}) + \sigma \cdot e(t\_{g+k}), \ 0 \le k \le l-1,\tag{3}$$

$$\mathbf{x}'(t\_{g+j}) = \begin{cases} \mathbf{x}(t\_{g+j}) & \text{if } j < 0\\ \hat{\mathbf{x}}(t\_{g+j}) & \text{otherwise} \end{cases}$$

*x′* refers either to actually recorded samples before the gap or to samples estimated by the AR procedure. σ is the standard deviation of the white noise component in the estimated AR model and *e(t)* one value of a white noise process. While this approach generates data for the gap that is consistent with the previously measured data, one might prefer to use a mixture of AR modeling and linear interpolation for the online case. This would avoid jumps in the data when merging generated data within the gap with new samples acquired after the gap. These jumps occur for all AR model orders we have tested in our simulations (see Appendix 1.4 for details). We have used this combination here by performing AR extrapolation when information about the first sample after the gap was not available and using linear interpolation otherwise. The signal was received in packets with a length of 40 ms and for each packet, one of three actions was taken: (1) If a packet contained the start and the end of a gap, then linear interpolation was used to fill the gap. (2) If it contained only the start or if the whole packet was part of the gap, then the AR model was used as a linear predictor to fill the gap. (3) If it contained only the end of the gap, then the last sample of the last packet and the first sample after the gap were connected by linear interpolation.
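Equation 3 and the packet-wise choice between the three actions can be sketched as follows. This is a hedged illustration: `ar_fill` and `packet_action` are hypothetical names, Gaussian white noise is assumed for *e(t)*, and `packet_action` assumes at most one gap per packet.

```python
import random


def ar_fill(x_before, c, sigma, length, rng=None):
    """Extrapolate `length` samples after x_before using Equation 3.

    x_before must hold at least len(c) clean samples recorded before the gap.
    """
    rng = rng or random.Random()
    buf = list(x_before[-len(c):])  # x': recorded samples, later also estimates
    out = []
    for _ in range(length):
        pred = sum(ci * buf[-i] for i, ci in enumerate(c, start=1))
        value = pred + sigma * rng.gauss(0.0, 1.0)
        out.append(value)
        buf.append(value)
    return out


def packet_action(in_gap, gap_open):
    """Choose the fill action for one packet (assumes at most one gap per packet).

    in_gap:   mask of the packet, 1 for gap samples.
    gap_open: True if the previous packet ended inside a gap.
    """
    if not any(in_gap):
        return "none"
    if gap_open and in_gap[0]:
        # Whole packet inside the gap: keep predicting; otherwise the gap ends here.
        return "ar" if all(in_gap) else "bridge"
    # Gap starts in this packet: interpolate only if its end is also known.
    return "ar" if in_gap[-1] else "interpolate"


if __name__ == "__main__":
    print(ar_fill([1.0, 2.0], [2.0, -1.0], 0.0, 3))
    print(packet_action([0, 1, 1, 0], False))
```

Setting `sigma` to 0 turns the procedure into the pure linear predictor mentioned for action (2).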

#### *2.6.3. Joining two segments*

If one chooses to ignore the information of the gap altogether when estimating the model, one might consider simply joining the two segments around the gap, thereby sacrificing information about the timing in the vicinity of the gap. In practice, this means that we update the data window only with those samples from a newly acquired data packet that do not belong to a gap. To keep the window size for spectral estimation constant, this method therefore uses older samples to compute the spectrum than the other algorithms do.
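The window update for the joining method can be sketched with a fixed-length buffer. The name `update_window` is hypothetical; the buffer length would be 500 samples for the 500 ms window at 1000 Hz used in this paper, but a short buffer is used here for readability.

```python
from collections import deque


def update_window(window, packet, in_gap):
    """Append only the clean samples of a packet; the deque drops the oldest ones."""
    for sample, gap in zip(packet, in_gap):
        if not gap:
            window.append(sample)
    return window


if __name__ == "__main__":
    window = deque([1.0, 2.0, 3.0, 4.0], maxlen=4)
    update_window(window, [5.0, 6.0, 7.0], [0, 1, 0])
    print(list(window))  # the gap sample 6.0 is skipped
```

Because gap samples never enter the buffer, a window that contains a gap reaches further back in time than a gap-free window of the same length.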

#### *2.6.4. Burg algorithm for segments (MEMgap)*

For standard algorithms that compute the AR coefficients, the samples within the data window need to be continuous. We can make the least-squares estimation of the AR coefficients compatible with data containing gaps by eliminating all equations from (Equation 1) that contain samples from within a gap and then solving the rest of the equations for the coefficients *c<sub>i</sub>*. As the Burg algorithm (see Appendix 1.1) yields more stable AR models than the least-squares estimation, we modified it to work with gaps based on the Burg algorithm for segmented data proposed in de Waele and Broersen (2000). This was achieved by limiting the computation of forward and backward prediction errors in each step of the algorithm to those samples that are far enough away from a gap. In the remainder of this paper, this algorithm is called MEMgap (Maximum Entropy Method for data with gaps) for brevity. A detailed description of the algorithm is given in Appendix 1.3.
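The idea of restricting the prediction errors to samples away from a gap can be sketched as a Burg recursion that pools the error sums over the clean segments. This is a simplified illustration in the spirit of the segment-based Burg algorithm, not the procedure of Appendix 1.3; the names `clean_segments`, `burg_segments` and `simulate_ar` are hypothetical.

```python
import random


def simulate_ar(c, n, seed=0):
    """Generate n samples of an AR process with unit-variance white noise (test helper)."""
    rng = random.Random(seed)
    x = [0.0] * len(c)
    for _ in range(n):
        pred = sum(ci * x[-i] for i, ci in enumerate(c, start=1))
        x.append(pred + rng.gauss(0.0, 1.0))
    return x[len(c):]


def clean_segments(x, in_gap):
    """Split x into contiguous runs of clean samples (in_gap[i] == 0)."""
    segments, current = [], []
    for sample, gap in zip(x, in_gap):
        if gap:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(sample)
    if current:
        segments.append(current)
    return segments


def burg_segments(segments, p):
    """Burg recursion with error sums pooled over segments; returns (c, noise_var)."""
    f = [list(s) for s in segments]  # forward errors per segment
    b = [list(s) for s in segments]  # backward errors per segment
    a = [1.0]
    total = sum(len(s) for s in segments)
    noise_var = sum(v * v for s in segments for v in s) / total
    for m in range(1, p + 1):
        num = den = 0.0
        for fs, bs in zip(f, b):
            # Only samples at least m steps inside a segment contribute,
            # so no prediction error ever reaches across a gap.
            for k in range(m, len(fs)):
                num += fs[k] * bs[k - 1]
                den += fs[k] ** 2 + bs[k - 1] ** 2
        k_m = -2.0 * num / den if den > 0.0 else 0.0
        ext = a + [0.0]
        a = [ext[i] + k_m * ext[m - i] for i in range(m + 1)]
        for fs, bs in zip(f, b):
            for k in range(len(fs) - 1, m - 1, -1):
                fk, bk = fs[k], bs[k - 1]
                fs[k] = fk + k_m * bk
                bs[k] = bk + k_m * fk
        noise_var *= 1.0 - k_m * k_m
    return [-ai for ai in a[1:]], noise_var


if __name__ == "__main__":
    x = simulate_ar([1.5, -0.75], 6000, seed=1)
    mask = [1 if i % 500 >= 450 else 0 for i in range(len(x))]
    print(burg_segments(clean_segments(x, mask), 2))
```

With a single segment the recursion reduces to the standard Burg algorithm. Each additional segment costs up to *p* usable samples at its start, which is consistent with the loss of samples around each gap discussed for high model orders in section 3.2.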

#### **2.7. SIMULATIONS ON CLEAN DATA**

To study empirically the influence of gaps on the estimated spectrum, we performed simulations on 12 data sets that were recorded without stimulation by artificially inserting gaps and then applying the methods described above to estimate the spectrum. The results of the different methods were compared with a reference time-frequency analysis obtained from the original data set without gaps. Each data set has a length of 182 s. These data sets, each containing 11 trials, were recorded with ECoG in patient P1 in one experimental session. For clarity, we restrict ourselves to one channel (an electrode over right M1). For spectral computation we kept the length of the window constant at 500 ms and the update rate at 25 Hz (one update every 40 samples). We estimated the power at frequencies between 5 and 99 Hz in 2 Hz increments and varied for each method the gap size (0–100 ms in steps of 5 ms) and the model order (values: 16, 32, and 64). We computed the normalized bias, root mean squared error (RMSE) and variance (var) of the stimulus processing algorithms as follows:

$$\text{bias}(f) = \frac{\frac{1}{n} \sum\_{i} (P(f, i) - P\_0(f, i))}{\overline{P\_0(f)}}$$

$$\text{RMSE}(f) = \frac{\sqrt{\frac{1}{n} \sum\_{i} (P(f, i) - P\_0(f, i))^2}}{\overline{P\_0(f)}}$$

$$\text{var}(f) = \text{Var}\left(\frac{P(f) - P\_0(f)}{\overline{P\_0(f)}}\right)$$

*P(f, i)* is the spectral power of data window *i* for frequency bin *f*, *P<sub>0</sub>(f, i)* is the power of the original data window without gaps and *P̄<sub>0</sub>(f)* is the average power of the full original recording without gaps for frequency bin *f*. *n* is the number of data windows that are affected by gaps (i.e., data windows where *P(f, i)* − *P<sub>0</sub>(f, i)* is not zero). var*(f)* is the variance of the difference between the power values of the original data and the power values of the data with gaps for all data windows affected by gaps and frequency bin *f*, divided by the average power for frequency bin *f* in the data set without gaps. For example, a normalized bias of −0.1 means that the estimated power after application of the stimulus processing algorithm is on average 10% smaller than the power of the original data set if a gap is present.
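For one frequency bin, the three error measures can be computed as in the following sketch. The function name `gap_error_metrics` is hypothetical, and the use of the sample variance (division by *n* − 1) is an assumption, since the text does not specify the estimator.

```python
import math


def gap_error_metrics(power, power_ref, mean_ref):
    """Return (bias, rmse, var) for one frequency bin.

    power:     P(f, i), power per data window estimated from data with gaps.
    power_ref: P0(f, i), power of the same windows without gaps.
    mean_ref:  average power of the full gap-free recording for this bin.
    """
    # Only windows affected by a gap enter the statistics.
    diff = [p - p0 for p, p0 in zip(power, power_ref) if p != p0]
    n = len(diff)
    if n == 0:
        return 0.0, 0.0, 0.0
    bias = sum(diff) / n / mean_ref
    rmse = math.sqrt(sum(d * d for d in diff) / n) / mean_ref
    rel = [d / mean_ref for d in diff]
    mean_rel = sum(rel) / n
    # Sample variance (n - 1) is an assumption, not taken from the paper.
    var = sum((r - mean_rel) ** 2 for r in rel) / (n - 1) if n > 1 else 0.0
    return bias, rmse, var


if __name__ == "__main__":
    print(gap_error_metrics([2.0, 4.0, 5.0], [1.0, 2.0, 5.0], 2.0))
```

In this toy example the third window is unaffected by the gap and is ignored, so the bias of 0.75 is computed from the first two windows only.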

The statistical evaluation of the spectral bias results in the simulations was performed as follows: we obtained the bias for each data set, resulting in 12 values, and performed a nonparametric Wilcoxon signed-rank test for zero median. If the *p*-value for this test was below 0.01, we regarded the bias as significant.
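With only 12 bias values per condition, a signed-rank test can be evaluated exactly by enumerating all sign patterns. The sketch below illustrates the principle; it is not necessarily the variant used for the reported results, and the name `wilcoxon_signed_rank_p` is hypothetical. Zero differences are dropped and tied magnitudes receive average ranks, both of which are assumptions.

```python
from itertools import product


def wilcoxon_signed_rank_p(values):
    """Exact two-sided p-value of the Wilcoxon signed-rank test for zero median."""
    d = [v for v in values if v != 0]
    n = len(d)
    order = sorted(range(n), key=lambda idx: abs(d[idx]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and abs(d[order[j]]) == abs(d[order[i]]):
            j += 1
        average = (i + j + 1) / 2.0  # mean of ranks i + 1 .. j
        for t in range(i, j):
            ranks[order[t]] = average
        i = j
    w_obs = sum(r for r, v in zip(ranks, d) if v > 0)
    center = sum(ranks) / 2.0
    deviation = abs(w_obs - center)
    # Enumerate all 2**n sign assignments (4096 for n = 12).
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= deviation - 1e-12:
            hits += 1
    return hits / 2 ** n


if __name__ == "__main__":
    biases = [0.01 * (k + 1) for k in range(12)]  # hypothetical, all positive
    print(wilcoxon_signed_rank_p(biases) < 0.01)
```

If all 12 values share the same sign, only two of the 4096 sign patterns are as extreme as the observed one, so the smallest attainable two-sided *p*-value is 2/4096, well below the threshold of 0.01.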

## **3. RESULTS**

First we show results of simulated gaps on data without stimulation to assess the influence of gaps and the stimulation-processing algorithms on the estimated spectrum. Then we illustrate the influence of real single TMS and epidural stimulation pulses on the spectrum if they are left untreated and how the methods of this paper deal with their after-effects. Finally, we apply the algorithms to data sets of BCI experiments with open-loop or closed-loop stimulation and investigate the effect of each method on the discrimination between the brain states during intended movements and rest.

#### **3.1. GAP SIZE**

**Figures 3A**–**C** show the influence of gap sizes between 5 and 100 ms on the error in spectral estimation for three particular frequencies (9, 21, and 81 Hz) and a model order of 32. We find that the RMSE increases with the gap size for all methods. This happens because the information in the samples excluded by the gap is missing for the AR estimation, leading to a greater deviation from the AR coefficients without gaps as the gap size increases. The linear interpolation methods exhibit a negative bias and the AR prediction shows a positive bias (**Figures 3D**–**F** and **5**). The negative bias of the linear interpolation methods occurs because a section of the data window is reduced to a straight line which has a power of almost 0 for higher frequencies, leading to a decrease in the estimated power for these frequencies. This effect increases with longer gaps. AR modeling can lead to jumps in the data, because the extrapolated signal from the start of the gap is not necessarily connected to the actual recorded signal at the end of the gap. Such jumps result in higher estimated power across all frequencies and thus a positive bias. For longer gaps this bias increases because the potential deviation from the true values after the gap (the jumps) becomes larger. The bias of the mixture of linear interpolation and AR prediction is in general closer to 0 than that of the other two, but its sign depends on data packet size and gap size. The joining and MEMgap algorithms exhibit a bias close to zero, but the RMSE is smaller for MEMgap than for joining. The variance (**Figures 3G**–**I**) also scales with gap size, but there are strong differences between the methods, with MEMgap and the linear interpolation methods having the lowest variance.

#### **3.2. MODEL ORDER**

Variations of the model order have the largest effect on the AR modeling and the MEMgap algorithm. While AR modeling exhibits a significant positive bias at 21 Hz for gaps longer than 60 ms at a model order of 16 (**Figure 4D**), it is not significantly biased for a model order of 32 and 64 (**Figures 4E**,**F**). As shown in section 3.3 and **Figure 5**, this is due to the frequency-dependency of the bias for AR modeling, which has a global minimum around 20 Hz for model orders 32 and 64. For MEMgap we find no significant bias for any model order (**Figures 4D**–**F**), but the absolute error of the power estimation, captured by the normalized RMSE, as well as the variance, increases rapidly with increasing model order (**Figures 4A**–**C**,**G**–**I**). This is probably due to the lower number of samples fully available for AR estimation with MEMgap compared to the standard Burg algorithm: for MEMgap, forward or backward prediction errors cannot be calculated for up to 2*p* samples around each gap, where *p* is the model order. Higher values of *p* only increase this difference, leaving MEMgap with fewer and fewer samples for AR estimation, thus probably leading to greater errors. In general, MEMgap has the lowest RMSE for orders 16 and 32 and gaps longer than 30 ms and the lowest RMSE of all methods with a bias close to 0 at an order of 64.

#### **3.3. FREQUENCY**

In **Figure 3**, we show the results for low and high frequencies with 9 and 81 Hz as parts of the μ and high γ bands, respectively, in addition to the "intermediate" frequency of 21 Hz as part of the β-band. For 81 Hz, the linear interpolation methods already show a significant negative bias for gaps of 5 ms, whereas for 9 Hz this only becomes significant for gaps greater than 35 ms. This is easily understandable considering that one cycle of a 9 Hz oscillation lasts for more than 100 ms, so linear interpolation over a gap of 10–20 ms would be fairly consistent with the real shape of the undisturbed signal. The bias of MEMgap is not significant for any frequency (**Figures 3D**–**F**). The joining method, on the other hand, exhibits a negative bias for 9 Hz and gaps smaller than 40 ms and a significant positive bias for 81 Hz. For 21 Hz, the bias is significant only for gaps smaller than 10 ms. In terms of RMSE and variance (**Figures 3A**–**C**,**G**–**I**), MEMgap always displays the lowest values for gap sizes greater than 50 ms.

The results in **Figures 3D**–**F**, especially for AR modeling and joining, suggest that the bias might be frequency-dependent. In **Figure 5**, the bias is shown relative to the frequency bin for model orders of 16, 32, and 64 for a gap size of 100 ms, where it should be most pronounced. We find that for the joining method, the bias is negative, although non-significant, for frequencies lower than 25 Hz and positive otherwise (significant for most frequencies >60 Hz). For AR modeling, the bias is in general positive (significantly for all frequencies for model order 16 and above 55 Hz for 32) and increases with frequency, has a minimum around 20 Hz for a model order of 32 and 64 and is also increased for lower frequencies. For linear interpolation, there is a bias close to −0.2, indicating a reduction in power of about 20%, for frequencies higher than 20 Hz. This value can be explained by the fact that 20% (100 ms of a 500 ms window) of the data had to be filled by linear interpolation, which removes the high-frequency content. MEMgap exhibits no significant bias across all frequencies and model orders, except for very low frequencies and high model orders where all methods show a positive bias. Although the bias for the mixture of AR modeling and interpolation is also not significant for most frequencies above 10 Hz and higher model orders, this is due to the interaction between gap size and packet size for this particular method. As seen in **Figures 3D** or **F**, for example, a gap size of 80 ms would lead to a positive bias.

#### **3.4. APPLICATION ON DATA WITH STIMULATION**

In our experiments, we received the data in packets with a length of 40 ms. This leads to the jumps seen in the bias relative to the gap size for the combination of AR modeling and linear interpolation (e.g., **Figure 3F**, magenta line), as either linear interpolation or AR modeling dominates the outcome. The packet length might be different for other recordings, so we excluded this method from the rest of the experiments, as the conclusions would be very specific to our setup. Further experiments are needed to investigate the influence of this specific parameter. As the simulation results of the two versions of linear interpolation did not differ much, we restricted ourselves to four of the six methods for the remainder of the paper: online-compatible linear interpolation, AR modeling, joining, and MEMgap.

A model order of 16 was chosen for spectral estimation and AR extrapolation, as this section is mostly illustrative in nature and serves to examine whether the results from the simulations are transferable to data with actual stimulation. In terms of the estimation bias, we found the clearest effects for a model order of 16: a negative bias for linear interpolation and a positive bias for AR modeling. The latter bias was not present for higher model orders around the studied frequency bin of 21 Hz.

**Figure 6** illustrates the effect of epidural stimulation on the recorded ECoG activity, the evoked activity after stimulation and their influence on spectral estimation for one representative stimulation pulse. **Figure 6A** shows the raw trace of data with a single pulse of electrical epidural stimulation occurring at time point 0. **Figure 6B** displays a zoom on the first 100 ms after the pulse. The stimulation artefact itself is contained within the first 10 ms after the pulse. After that, one can find evoked activity with its peak occurring 13 ms after the pulse and an amplitude of 240 μV. This is much higher than the short-term amplitude fluctuations found in our ECoG data without stimulation.

**Figure 6C** demonstrates the importance of adjusting the length of the gap to the actual stimulation effects on the signal. Applying a gap of 10 ms to the data might be enough to cover the stimulation artefact itself, but the spectrum then still shows a clear positive bias due to the influence of the evoked activity. The results for short gap sizes are very similar to those without any gap. Only gaps greater than 20 ms cover the extent of the artefact and the initial evoked activity, leading to power values that are similar to those obtained for data windows without the stimulation event (windows 16–26). There is no clear difference in the outcome of the gaps greater than 20 ms.

differences between the power at a gap size of 0 and 50 and above. Window numbers correspond to the brackets shown in **(A)**, where the first one is 1, the second one (shifted by 40 ms) is 2 and so on. The computation of windows −1, 0, and 15–26 used data that is outside the margins of **(A)**. **(D)** Comparison of linear interpolation (red), AR modeling (green), joining (blue), and MEMgap (gray) with gap sizes of 10 (dashed) and 50 (solid) applied on the data in **(A)**. The solid black line with circles in **(C)** and **(D)** shows the result of spectral estimation without processing of the stimulation after-effect (gap size = 0).

In **Figure 6D**, linear interpolation, joining the data segments, MEMgap, and AR modeling are compared when applied to the stimulation event both for short and long gaps. All methods perform poorly for a gap size of 10 ms, but there are differences for 50 ms. Applying joining and AR modeling results in higher power values than linear interpolation and MEMgap with a clear difference in estimated power between the windows with and without stimulation. Assuming perfect exclusion of all stimulus-related effects, we expect that the power does not differ strongly between e.g., window 15 (which includes a small portion of the gap) and window 16 (without the gap), therefore the result of MEMgap and linear interpolation is more realistic than the output of the other methods. At least for the AR modeling method, the increased estimate of the power compared to, for example, linear interpolation is consistent with the positive bias shown in **Figure 3B**. A reason for the positive bias of the joining method for this example data set might be that drifts of the signal after a stimulation on epidural electrodes are common. If we take a data segment with post-stimulus drifts, exclude the gap and join the data before and after the gap into one window, it will contain a sharp discontinuity and have a comparatively high spectral power. With linear interpolation, the discontinuity will be less severe and have a smaller impact on the signal power. For MEMgap it does not play a role as data before and after the gap is always separated during estimation of the AR coefficients.

Stimulation artefacts and evoked activity are found for combined EEG and TMS in a similar way as for stimulation over implanted electrodes with the strength of the evoked activity depending on the distance to the stimulation site. We illustrate this in **Figure 7** with the result of a TMS pulse on the activity recorded on a distant EEG channel. There is no strong evoked potential visible after the stimulation, therefore, as is evident in **Figures 7C**,**D**, a short window of 10 ms is already sufficient to cover the artefact and to produce an estimation of spectral power that is similar in value compared to that resulting for data windows long after the stimulation when using either linear interpolation, joining or MEMgap to correct for the gap.

#### **3.5. INFLUENCE ON DECODING PERFORMANCE**

The stimulation-processing algorithm can bias the estimated spectrum, or will at least produce deviations from the original spectral power without gaps. This raises the question of how strongly these errors influence the actual brain-state decoding during a BCI experiment. For example, if the bias of linear interpolation toward underestimation of the signal power directly influences how well we can differentiate data packets obtained during a movement from those recorded during rest, then this algorithm is not suitable for BSDS because it might induce a bias in the subject's performance in an online experiment.

To investigate this, we used data sets with different stimulation paradigms and recording methods (EEG and ECoG) to assess the influence of the algorithms and gap size on the decoding abilities of a BCI system. The patients always performed the same cued attempted hand movements, but we varied the stimulation paradigm between no stimulation, stimulation with fixed ISI and stimulation coupled to the output of the BCI (i.e., BSDS). In the last paradigm, the stimulation pulses were only applied while the BCI detected an intention to move from modulations of the power in the β-band and therefore moved the orthosis. If stimulation was used, stimulation artefacts were identified online with a peak detector that responded when the voltage of two consecutive samples differed by more than 1 mV. The start of the gap was set 2 ms before this artefact and the gap size was adjusted for each patient and session depending on the length of the evoked activity, as determined by several test stimuli applied before the start of the session. This typically resulted in a gap length between 30 and 70 ms. Stimulus processing was performed during recording with the online-compatible linear interpolation method.
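The artefact detection and gap placement described above can be sketched as follows. The function name `gap_mask` and its parameters are hypothetical. The sketch assumes a signal in μV sampled at 1000 Hz, takes the sample after the voltage jump as the artefact position, and measures the gap length from the start of the gap; these conventions are assumptions and not taken from the recording software.

```python
def gap_mask(signal_uv, fs=1000, threshold_uv=1000.0, pre_ms=2, gap_ms=50):
    """Return a 0/1 mask marking a gap around every detected stimulation artefact."""
    n = len(signal_uv)
    mask = [0] * n
    pre = round(pre_ms * fs / 1000)     # samples masked before the artefact
    length = round(gap_ms * fs / 1000)  # total gap length in samples
    i = 1
    while i < n:
        if abs(signal_uv[i] - signal_uv[i - 1]) > threshold_uv:
            start = max(0, i - pre)
            end = min(n, start + length)
            for j in range(start, end):
                mask[j] = 1
            # Resume after the gap so that the artefact does not retrigger.
            i = max(end, i) + 1
        else:
            i += 1
    return mask


if __name__ == "__main__":
    signal = [0.0] * 10 + [5000.0] + [0.0] * 20
    print(gap_mask(signal, gap_ms=5))
```

The mask uses the same convention as in section 2.6 (1 for samples inside a gap), so it can be passed directly to the gap-handling methods.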

In the offline analysis, we applied the four methods (joining, linear interpolation, AR modeling, and MEMgap) to these data sets and varied the gap size between 0, 10, 50, and 100 ms. We simulated the two different stimulation conditions on the data without stimulation by varying the phases of the trial in which gaps were placed: in the uncoupled condition the whole trial was valid, so the placement of gaps was independent of the activity and brain state of the patient. For the coupled condition, only time points within the movement phase were used as gaps, thus simulating a BSDS paradigm. In both cases the ISI was fixed at 2 s.

After applying the respective stimulation processing algorithm, we computed the spectral power between 16 and 22 Hz on channels located over the right motor cortex. For EEG measurements we used FC4, C4, and CP4 as defined by the 10–10 system (Society, 2006), whereas for ECoG measurements the electrodes were selected individually per patient based on the results of a screening session. We used a window size of 500 ms and a model order of 16 for spectral estimation. These were the same parameters, channels and frequencies that had been used during the online feedback experiments in which the data was recorded. Furthermore, our simulations showed a positive estimation bias for AR modeling at 16–22 Hz only for a model order of 16, not for 32 or 64. Thus, we only used an order of 16 for the simulations on data without stimulation. To investigate whether higher model orders have a substantial effect on the processing of real stimuli, we used model orders of 16, 32, and 64 on the data with open-loop and closed-loop stimulation. For each run (consisting of 11 trials), we calculated the area under the ROC curve (AUC) for the sum of the logarithm of the power values within each data window in the movement phase versus those in the rest phase. We used this as a measure of the separability of these phases on a single-packet level. Taken together from all three patients, we analyzed 87 runs of EEG recordings without stimulation, 24 runs with uncoupled EEG-TMS, 131 ECoG runs without stimuli, 51 runs of ECoG with uncoupled, and 82 runs of ECoG with coupled stimulation. For each recording and stimulation condition, algorithm and gap size, this resulted in a distribution of AUC scores, one per run.
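The separability measure can be sketched with a rank-based AUC. The names `log_power_feature` and `auc_rest_vs_move` are hypothetical. The orientation (AUC above 0.5 when the feature is larger during rest than during movement) is an assumption inferred from the interpretation of AUC values below 0.5 in the results on coupled stimulation.

```python
import math


def log_power_feature(powers):
    """Sum of the logarithm of the power values of one data window."""
    return sum(math.log(p) for p in powers)


def auc_rest_vs_move(rest, move):
    """Probability that a rest window has a larger feature than a movement window.

    Equivalent to the area under the ROC curve; ties count as one half.
    """
    wins = 0.0
    for r in rest:
        for m in move:
            if r > m:
                wins += 1.0
            elif r == m:
                wins += 0.5
    return wins / (len(rest) * len(move))


if __name__ == "__main__":
    # Hypothetical power values (channels x frequency bins flattened per window).
    rest = [log_power_feature(w) for w in ([4.0, 5.0], [6.0, 3.0])]
    move = [log_power_feature(w) for w in ([1.0, 2.0], [2.0, 2.0])]
    print(auc_rest_vs_move(rest, move))
```

Under this convention, a value of 0.5 corresponds to chance level, and event-related desynchronization during attempted movement pushes the score toward 1.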

The conditions without stimulation allowed us to test for the bias and absolute error introduced by the gaps into the AUC scores. Thus, we computed the pair-wise differences between the AUC scores of a gap size of 0 and those of all combinations of algorithms and gap size for these conditions. Using Kruskal–Wallis tests, Bonferroni-corrected for multiple comparisons, we tested which algorithm leads to the smallest absolute differences in AUC scores. We also applied Wilcoxon signed-rank tests to assess whether the median of the differences deviates significantly from 0, indicating a systematic bias in the AUC scores. As there is no "true" reference distribution of the AUC scores possible for data with stimulation, we used Bonferroni-corrected non-parametric Friedman tests, which account for possible effects of using the same sessions in all conditions, to test whether gap sizes greater than 0 lead to different AUC scores compared to a gap size of 0 and whether there is a difference between the algorithms at a certain gap size.

**Figure 8** shows data without any actual stimulation, so ideally the difference in AUC scores between a gap size of 0 and of 100 should be zero for all runs. In **Figures 8A**,**B**, stimuli were simulated throughout the trial, thus independent of the task or the output of the BCI. Session-wise comparison of the AUC values with Friedman tests for each gap size shows significant differences between MEMgap and the other algorithms only for long gaps. There is a slight decrease in the average AUC value for all algorithms for a gap size of 100 compared to 0, and MEMgap and joining yield significantly smaller absolute differences in AUC scores compared to AR modeling and interpolation (*p* < 0.000001). In **Figures 8C**,**D**, the stimuli were simulated only throughout the movement phase. We find a significant decrease in AUC values for AR modeling and a significant increase for linear interpolation. This means that linear interpolation artificially "improves" the decoding power. As shown in the simulation studies, linear interpolation of large gaps leads to a decrease of the power between 16 and 22 Hz (negative bias), which increases the event-related desynchronization effect of sensorimotor rhythms during attempted movements (Wolpaw et al., 2002). The MEMgap method shows a significantly smaller deviation of the AUC values at gap sizes of 50 and 100 from the AUC values without gaps than all the other methods (*p* < 0.000001). In contrast to the other algorithms, the median of the AUC differences after MEMgap never differs significantly from 0 for the BSDS condition, except for the ECoG data set with a gap size of 100 (*p* = 0.005). The other algorithms differ significantly from MEMgap in almost all cases of the coupled conditions.

Patients in the data sets shown in **Figures 9A**,**B** were stimulated independent of the task. We show only a model order of 16, because the results for an order of 32 and 64 are very similar. It is evident from gap sizes of 0 and 10 that untreated stimulation after-effects are detrimental for decoding. Online decoding will be more successful if enough samples are excluded after a stimulus (in these examples, a gap size of 50 ms seems to work well, although this varies between patients). Using Friedman tests for session-wise comparison of the AUC scores, significant differences between the algorithms are found, although the mean absolute differences are very small (≤0.01). In the case of the uncoupled ECoG condition, AUC scores with untreated stimulation after-effects are significantly lower than AUC scores for gap sizes of 50 and 100, independent of the applied algorithm (*p* < 0.01). This effect is due to residual stimulation after-effects for small or absent gaps that lead to a very high power of data windows that contain electrical or magnetic pulses. In particular, such data windows in the movement phase will be classified incorrectly. If the strong after-effects are removed by longer gaps, the classifier is more likely to produce a correct result, which is reflected in the increased AUC score for gaps of at least 50 ms.

Finally, in **Figure 9C**, stimulation was given only during the movement phase. The average AUC value for a gap size of 0 is smaller than 0.5, indicating a higher power during movement than during rest, as opposed to the expected event-related desynchronization. This is due to the task-dependent existence of the stimulation effects: the large stimulation after-effects that occur only during the movement phase lead to a very high spectral power of this phase. Thus, the spectral power of the movement phase is very well separable from the power of the rest phase for a gap size of 0. For a gap size of 10, there is a large variability in the AUC scores. This is because for one of the three patients, a gap of 10 ms was not sufficient to cover all artefact-related jumps in the recording, resulting in AUC scores lower than 0.5. If the after-effects are dealt with by using a gap size of 50, the relationship between the power during rest and feedback reverses and resembles the expected ERD/ERS pattern. For a gap size of 100, we find in **Figure 9C** that the largest average AUC value is reached for linear interpolation and the smallest one for AR modeling, both differing significantly from the AUC values for MEMgap (*p* < 0.000001). This relationship is found for all tested model orders, where joining and AR modeling are on average worse than MEMgap by more than 0.02 and 0.05, respectively, while linear interpolation yields higher scores by at least 0.01. This is consistent with the simulation results in **Figures 8C**,**D** indicating an artificial over- and under-estimation of class separability by these methods. It supports the hypothesis that MEMgap is probably best suited to deal with large gaps in the data, especially for BSDS, because based on the simulation studies the deviation from the true AUC value is significantly smaller than for the other methods.

## **4. DISCUSSION**

One challenge when trying to combine online brain-state decoding from spectral data and direct cortical stimulation is that the after-effects of stimulation such as artefacts (Taylor et al., 2008) or evoked activity (Matsumoto et al., 2007) can have a much higher amplitude than the background brain signals. Therefore, estimation of the brain-state from a segment of data is unreliable if such stimulation after-effects are contained in this segment. This leaves us with three options: we can (1) use only data segments for decoding that are free of any after-effects, (2) attempt to separate stimulation after-effects from background brain activity, e.g., by fitting a template of the expected shape of the effects to the recording, or (3) isolate the portions of the data segment that are "contaminated" by stimulation effects and use only the "clean" parts for decoding.

level for a purely random classifier (Fawcett, 2006). **(A)** Average AUC values for the separation of movement and rest from experiments with epidural

In earlier studies combining TMS and EEG without BSDS (i.e., without the necessity to perform real-time brain-state decoding from the EEG), options (1) and (2) have been used. In such studies, either a fixed-length window around the stimulus was removed offline (Fuggetta et al., 2006), a decomposition into artefact-free and contaminated data was attempted in postprocessing (Litvak et al., 2007; Morbidi et al., 2007; Erez et al., 2010) or a sample-and-hold circuit was used during recording to fix the amplifier output at a constant level during the pulse (Ilmoniemi et al., 1997). The latter method is especially helpful for amplifiers that recover from TMS pulses only after a delay of several hundred milliseconds (Ilmoniemi and Kičić, 2010), although some current amplifiers are able to keep this delay lower than 10 ms (Veniero et al., 2009). The drawback of the sample-and-hold approach is that information on the brain signal directly after the pulse is invariably lost and that the signal contains gaps.

Option (1) has also been used by Bergmann et al. (2012) in their study on EEG-guided TMS, making a waiting period of several seconds between stimulation pulses necessary. If the brain-state is decoded from spectral features and, for example, 500 ms of data is needed to estimate these features robustly, one has to wait for 500 ms plus the expected duration of the stimulation after-effects before making the first estimate of the brain-state after a stimulation pulse. This duration is therefore also the absolute minimum ISI in this scenario. Removal of the after-effects by template subtraction, i.e., option (2), is only possible if several constraints are met. First, the full amplitude range of the stimulation effects has to be within the dynamic range of the amplifier, as portions of the data in which the amplifier is in saturation cannot be recovered with this method, resulting in the necessity to correct for gaps in the signal as in option (3). Second, the recorded effects have to be sufficiently stable; otherwise, attempting to remove them will lead to residuals in the signal. Like the original after-effects, these residuals can have a detrimental effect on the quality of the estimated spectrum and, thus, the decoding process. Third, the employed removal algorithms need to be suitable for an online BCI, so they need to work on a single-trial level and should not be too computationally demanding.
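As a worked instance of the minimum ISI for option (1): with the 500 ms estimation window from the example above and an after-effect duration of 100 ms (a purely illustrative value, chosen to equal the largest gap size tested here), the bound becomes

$$T_{\mathrm{ISI,min}} = T_{\mathrm{window}} + T_{\mathrm{after}} = 500\,\mathrm{ms} + 100\,\mathrm{ms} = 600\,\mathrm{ms}.$$

The symbols $T_{\mathrm{window}}$ and $T_{\mathrm{after}}$ are introduced only for this example.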

for this algorithm and gap size differ significantly from MEMgap (*p* < 0.05, Friedman test, Bonferroni-corrected).

We have chosen approach (3) for this work: the deliberate introduction of gaps into the signal that cover the strongest after-effects of stimulation, combined with a correction for these gaps during spectral estimation. This allows continuous decoding without influence of the stimulation after-effects, as long as the duration of the after-effects is estimated properly. To apply this approach, methods are needed that do not depend on continuous data segments for brain-state decoding and can deal with gaps in the data.

In our experiments, we analyzed the spectral power in the μ- and/or β-band to detect the patient's intention to move the paralyzed hand. We compared different approaches (linear interpolation, AR modeling, joining of data segments, and the Burg algorithm adapted for segmented data) on their ability to estimate the spectrum with gaps in the data. To this end, we used an ECoG BCI training data set and analyzed the normalized RMSE, bias and variance of the difference between the estimated spectrum with and without gaps. The RMSE increased with the gap size, although the slope of the error increase was smaller for MEMgap and joining than for algorithms that fill the gap with artificial data (linear interpolation and AR modeling). We found a clear systematic negative bias for linear interpolation and a systematic positive bias for AR modeling. We studied the frequency range between 16 and 22 Hz in most detail, where the bias of AR modeling was only apparent for a model order of 16, but a clear bias of AR modeling can be found for other frequencies at higher model orders, making this method also potentially unreliable. The joining method produces a bias close to 0 around a frequency of 20 Hz, but can lead to a positive bias for higher frequencies, whereas the MEMgap method always results in a bias close to 0. For gaps smaller than 40 ms, linear interpolation typically has the smallest absolute deviation from the true power values while MEMgap outperforms the other methods for longer gaps.
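The two simplest gap-handling strategies compared above can be sketched as follows. This is a minimal illustration under the assumption that "linear interpolation" draws a straight line between the last sample before and the first sample after the gap, and that "joining" simply concatenates the clean data on both sides; the function names and the single-gap interface are hypothetical.

```python
def fill_linear(x, gap_start, gap_len):
    """Replace x[gap_start:gap_start + gap_len] by a straight line between its neighbours."""
    gap_end = gap_start + gap_len          # index of the first clean sample after the gap
    if gap_start < 1 or gap_end >= len(x):
        raise ValueError("the gap needs a clean sample on both sides")
    left, right = x[gap_start - 1], x[gap_end]
    step = (right - left) / (gap_len + 1)
    filled = [left + step * (j + 1) for j in range(gap_len)]
    return list(x[:gap_start]) + filled + list(x[gap_end:])


def join_segments(x, gap_start, gap_len):
    """Drop the gap and concatenate the data before and after it."""
    return list(x[:gap_start]) + list(x[gap_start + gap_len:])
```

The interpolated stretch contains no oscillatory activity, which is in line with the negative power bias reported for linear interpolation, while joining shortens the window and can introduce a discontinuity at the seam.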

As our simulations show, the RMSE for linear interpolation is smaller than for MEMgap, thus at first glance making linear interpolation superior to MEMgap for large model orders and/or small gaps. However, in the context of a continuous BCI decoding for BSDS, the negative bias exhibited by the linear interpolation methods will bias the output of the BCI in favor of ERD, thus distorting the real performance of the participant to some extent if stimulation is coupled to the detected brain-state. We therefore think that MEMgap is most suited for BSDS as it is superior or at least equal to the other methods in terms of RMSE and variance, does not introduce a systematic bias and outperforms the other methods in minimizing the stimulation after-effects in our BCI paradigm.
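The distinction between a small RMSE and a systematic bias can be illustrated with a short sketch. It assumes that the error is the difference between the power estimated with gaps and the power estimated without gaps, normalized by the latter; this normalization and the function name are assumptions made for the example only.

```python
import math


def spectral_error_stats(est_power, ref_power):
    """Return (bias, variance, rmse) of the normalized error (est - ref) / ref."""
    diffs = [(e - r) / r for e, r in zip(est_power, ref_power)]
    n = len(diffs)
    bias = sum(diffs) / n                                # systematic over- or under-estimation
    variance = sum((d - bias) ** 2 for d in diffs) / n   # spread around the bias
    rmse = math.sqrt(sum(d * d for d in diffs) / n)      # total error
    return bias, variance, rmse
```

Because the squared RMSE equals the squared bias plus the variance, an estimator can have a small RMSE and still shift every estimate in the same direction, which is the property that matters when stimulation is coupled to the decoded brain-state.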

Whether this approach of identifying and ignoring the segments of data dominated by stimulation after-effects is feasible in any given experimental setting depends on the duration of stimulation-evoked potentials after the pulse. As we showed here in the simulation studies, if the strongest evoked activity is contained within the first 50–100 ms after the pulse, then a decoding approach using MEMgap is feasible. If no strong evoked activity is observed, e.g., in the case of a remote recording location as illustrated in **Figure 7**, then a short gap of 10 ms covering the stimulus artefact together with linear interpolation or MEMgap would be sufficient. In Ferreri et al. (2011), evoked EEG activity following single pulse TMS was found for up to 300 ms after the pulse with amplitude fluctuations of less than 20 μV for late components. Although we do not expect that such small potentials would have a large impact on the estimated spectrum, especially compared to the stimulation artefact itself or early evoked activity, for every experiment of BSDS with continuous decoding, the size and shape of the evoked activity should be carefully studied to get a proper estimate of the duration of strong after-effects. As was shown by Casarotto et al. (2010), the after-effects depend on a number of parameters, such as stimulation intensity, location and (in the case of TMS) coil orientation. If gaps longer than the 100 ms tested here are necessary to cover all stimulation-related activity, one should either wait long enough until all effects have ceased before making the next brain-state decoding attempt, or increase the size of the data window on which the spectrum is estimated to ensure that it contains enough clean samples to compute a valid estimate.

In conclusion, we have shown that the application of cortical stimulation coupled to the output of an online brain-state decoder based on spectral features is feasible as long as the employed algorithms remove both the stimulation artefact and large early components of evoked activity and allow spectral estimation on non-continuous data. Especially if closed-loop BSDS is used, algorithms that do not introduce a strong bias into the estimated spectrum such as MEMgap are to be preferred over biased methods like linear interpolation to ensure a reliable decoding of the brain-state. In general, the methods investigated here are not restricted to applications with cortical stimulation but can be employed whenever spectral estimation has to be performed on non-continuous data sets with missing blocks of samples.

#### **ACKNOWLEDGMENTS**

This work was funded by the European Research Council (ERC) grant 227632 "BCCI—A Bidirectional Cortical Communication Interface" and the Reinhardt-Koselleck Award of the Deutsche Forschungsgemeinschaft (DFG). We thank Alireza Gharabaghi, Maria Teresa Leão, Georgios Naros, Alexandros Viziotis, and Martin Spüler for insightful discussions and their help in data acquisition and the two reviewers for helpful comments on the manuscript.

## **REFERENCES**

Kotchoubey, B., Kübler, A., et al. (1999). A spelling device for the paralysed. *Nature* 398, 297–298.

*ONE* 5:e10281. doi: 10.1371/journal.pone.0010281


G., et al. (2011). Human brain connectivity during single and paired pulse transcranial magnetic stimulation. *Neuroimage* 54, 90–102.


Neuronal responses to magnetic stimulation reveal cortical reactivity and connectivity. *Neuroreport* 8, 3537–3540.


hemiparetic stroke: a multicenter feasibility study of safety and efficacy. *J. Neurosurg.* 108, 707–714.


*Handbook of Transcranial Stimulation (Oxford Handbooks)*. New York, NY: Oxford University Press.

Wolpaw, J. R., Birbaumer, N., McFarland, D. J., Pfurtscheller, G., and Vaughan, T. M. (2002). Brain-computer interfaces for communication and control. *Clin. Neurophysiol.* 113, 767–791.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 August 2012; accepted: 29 October 2012; published online: 16 November 2012.*

*Citation: Walter A, Murguialday AR, Rosenstiel W, Birbaumer N and Bogdan M (2012) Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects. Front. Neural Circuits 6:87. doi: 10.3389/fncir.2012.00087*

*Copyright © 2012 Walter, Murguialday, Rosenstiel, Birbaumer and Bogdan. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## **1. APPENDIX**

#### **1.1. THE BURG ALGORITHM**

The Burg algorithm is used for the estimation of the coefficients $c_i$ of an autoregressive (AR) model

$$x(t_k) = \sum_{i=1}^{p} c_i\, x(t_{k-i}) + e$$

with order *p* for samples $x(t_k)$, $0 \le k < N$, where *e* is a sample from a white noise sequence. The algorithm needs *p* recursive steps, and in each step *j* the coefficients $c_{j,i}$ for an autoregressive model of order *j* are computed by the following procedure:

An initial estimation of the power of the white noise component in the AR model is obtained by

$$P_0 = \frac{1}{N} \sum_{k=0}^{N-1} |x(t_k)|^2.$$

Each new coefficient $c_{i,i}$ is computed by minimizing the forward and backward prediction errors

$$f_{p,k} = x(t_k) - \sum_{i=1}^{p} c_{p,i}\, x(t_{k-i}) \quad \text{with } k = p, \dots, N-1$$

$$b_{p,k} = x(t_{k-p}) - \sum_{i=1}^{p} c_{p,i}\, x(t_{k-p+i}) \quad \text{with } k = p, \dots, N-1$$

with the formula

$$c_{i,i} = \frac{-2\sum_{k \in I_i} f_{i-1,k} \cdot b_{i-1,k-1}}{\sum_{k \in I_i} \left( |f_{i-1,k}|^2 + |b_{i-1,k-1}|^2 \right)}, \quad I_i = \{i+1, \dots, N-1\}. \tag{A1}$$

Each previously computed coefficient $c_{i-1,k}$ is then adjusted by

$$c_{i,k} = c_{i-1,k} + c_{i,i} \cdot c_{i-1,i-k}$$

and we update the power estimation to

$$P_i = (1 - |c_{i,i}|^2) \cdot P_{i-1}$$

and the forward and backward prediction errors:

$$f_{i,k} = f_{i-1,k} + c_{i,i} \cdot b_{i-1,k-1}$$

$$b_{i,k} = b_{i-1,k-1} + c_{i,i} \cdot f_{i-1,k}$$

After *p* steps, this results in the AR coefficients $c_i = c_{p,i}$, $i = 1, \dots, p$.
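The recursion can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in this work. It makes two assumptions that should be checked against the intended conventions: the sums run over all 0-based indices *k* = *i*, ..., *N* − 1 for which both prediction errors are defined, and the recursion with the leading minus sign in Equation (A1) is read as producing the coefficients of the prediction-error filter, so the sign is flipped once at the end to match the model equation $x(t_k) = \sum_i c_i x(t_{k-i}) + e$. The helper `simulate_ar` is hypothetical and only serves to generate test data.

```python
import random


def burg(x, p):
    """Estimate AR(p) coefficients with the Burg recursion.

    Returns (c, noise_power) for the model x[k] = sum_i c[i-1] * x[k-i] + e.
    """
    n = len(x)
    if not 0 < p < n:
        raise ValueError("model order must satisfy 0 < p < len(x)")
    f = list(x)                      # forward prediction errors, order 0
    b = list(x)                      # backward prediction errors, order 0
    a = []                           # prediction-error-filter coefficients
    power = sum(v * v for v in x) / n            # P_0
    for i in range(1, p + 1):
        ks = range(i, n)             # all k where f[k] and b[k-1] are defined
        num = sum(f[k] * b[k - 1] for k in ks)
        den = sum(f[k] ** 2 + b[k - 1] ** 2 for k in ks)
        refl = -2.0 * num / den      # Equation (A1)
        # Adjust the previously computed coefficients, then append the new one.
        a = [a[m] + refl * a[i - 2 - m] for m in range(i - 1)] + [refl]
        power *= 1.0 - refl * refl   # P_i = (1 - |c_ii|^2) * P_{i-1}
        f_new, b_new = f[:], b[:]
        for k in ks:
            f_new[k] = f[k] + refl * b[k - 1]
            b_new[k] = b[k - 1] + refl * f[k]
        f, b = f_new, b_new
    # Assumption: flip the sign so the result fits x[k] = sum_i c_i x[k-i] + e.
    return [-v for v in a], power


def simulate_ar(c, n, seed=0):
    """Hypothetical helper: n samples of an AR process driven by unit-variance noise."""
    rng = random.Random(seed)
    x = [0.0] * len(c)
    for _ in range(n):
        x.append(sum(ci * x[-(i + 1)] for i, ci in enumerate(c)) + rng.gauss(0.0, 1.0))
    return x[len(c):]
```

For example, `burg(simulate_ar([1.5, -0.75], 5000), 2)` should return coefficients close to 1.5 and −0.75 and a noise power close to 1, up to estimation error.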

#### **1.2. ESTIMATING THE SPECTRUM FROM AN AR MODEL**

An AR model can be interpreted as an all-pole infinite-impulse-response filter with order *p* and coefficients $c_i$ which is applied to a white noise process with a power of $P_p$ (Pardey et al., 1996). Thus, after finding the *p* autoregressive coefficients $c_i$, one can estimate the spectrum by evaluating the transfer function

$$H(z) = \sqrt{P_p} \left(1 - \sum_{k=1}^{p} c_k z^{-k} \right)^{-1}$$

of the filter on the unit circle, $z = e^{j\omega}$, to find power values

$$P(\omega) = \frac{P_p}{\left|1 - \sum_{k=1}^{p} c_k e^{-jk\omega}\right|^2}$$

at (normalized) frequencies ω.
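Evaluating this expression is straightforward once the coefficients are known. The sketch below is a minimal example; the function names, the averaging over evenly spaced frequencies and the sampling-rate argument `fs` are assumptions introduced for illustration.

```python
import cmath
import math


def ar_power(c, noise_power, omega):
    """P(omega) = P_p / |1 - sum_k c_k exp(-j k omega)|^2, omega in radians per sample."""
    denom = 1 - sum(ck * cmath.exp(-1j * (k + 1) * omega) for k, ck in enumerate(c))
    return noise_power / abs(denom) ** 2


def band_power(c, noise_power, f_lo, f_hi, fs, n_points=25):
    """Hypothetical helper: mean AR power between f_lo and f_hi (Hz) at sampling rate fs."""
    freqs = [f_lo + (f_hi - f_lo) * j / (n_points - 1) for j in range(n_points)]
    return sum(ar_power(c, noise_power, 2 * math.pi * f / fs) for f in freqs) / n_points
```

A call such as `band_power(c, power, 16.0, 22.0, fs)` would then give a single feature for the 16 to 22 Hz range discussed in the main text, with `fs` set to the sampling rate of the recording.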

#### **1.3. THE MEMgap ALGORITHM**

Assume that a sequence *g* of length *N* exists with $g(n) = 1$ if the corresponding sample $x(t_n)$ is part of a gap in the data and $g(n) = 0$ otherwise. We then just have to make sure that none of the samples $x(t_n)$ with $g(n) = 1$ influences the estimation of the model coefficients. The Burg algorithm computes the AR coefficients for order *p* in *p* steps, yielding in the *i*-th step the coefficients of an AR model with order *i*. If, in the *i*-th step, we use only those samples fully for the computation of the AR coefficients that are at least *i* + 1 time steps away from a sample with $g(n) = 1$, we achieve the desired effect. To be more precise, the coefficients are computed by evaluating forward and backward prediction errors (see Appendix 1.1). In the MEMgap algorithm, forward prediction errors are only computed for samples that are at least *i* + 1 time steps after a gap, backward prediction errors only for those at least *i* + 1 time steps before a gap. Formally, this is done by modifying $I_i$ in Equation (A1) to the set

$$I_i = \{k \mid g(k) = 0 \,\land\, i < k < N \,\land\, (k - n < 0 \lor k - n > i) \;\forall n \text{ with } g(n) = 1\}.$$

This set can also be computed iteratively in each step of the Burg algorithm as $I_i = I_{i-1} \cap I'_{i-1}$, where $I'_{i-1}$ is the set $I_{i-1}$ with each entry incremented by 1 and $I_0 = \{k \mid g(k) = 0 \land 0 < k < N\}$. This resembles a "forbidden zone" that initially contains only the gaps but grows in each step of the algorithm by one sample. The estimation of the white noise power $P_0$ has to be calculated only with samples outside of gaps:

$$P_0 = \frac{1}{|I_0|} \sum_{k \in I_0} |x(t_k)|^2.$$

The rest of the algorithm works as the standard Burg algorithm described in Appendix 1.1.
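The growing "forbidden zone" translates into a few lines of code. The following sketch is a self-contained illustration of the idea, not the authors' implementation. It assumes 0-based indexing with all clean samples in the initial set, and it applies the same sign handling as a standard prediction-error-filter recursion, flipping the sign at the end so that the result fits the model equation of Appendix 1.1. The helper `demo_data` and its artefact value of 500 are hypothetical.

```python
import random


def memgap(x, g, p):
    """Burg recursion that skips every prediction error touching a gap (g[k] == 1)."""
    valid = {k for k in range(len(x)) if not g[k]}    # I_0: samples outside gaps
    if not valid:
        raise ValueError("no clean samples")
    power = sum(x[k] ** 2 for k in valid) / len(valid)    # P_0 from clean samples only
    f, b, a = list(x), list(x), []
    for i in range(1, p + 1):
        # I_i = I_{i-1} intersected with (I_{i-1} shifted by one sample)
        valid = {k for k in valid if k - 1 in valid}
        if not valid:
            raise ValueError("no clean segment is longer than the model order")
        num = sum(f[k] * b[k - 1] for k in valid)
        den = sum(f[k] ** 2 + b[k - 1] ** 2 for k in valid)
        refl = -2.0 * num / den
        a = [a[m] + refl * a[i - 2 - m] for m in range(i - 1)] + [refl]
        power *= 1.0 - refl * refl
        f_new, b_new = f[:], b[:]
        for k in valid:               # errors outside I_i are never read again
            f_new[k] = f[k] + refl * b[k - 1]
            b_new[k] = b[k - 1] + refl * f[k]
        f, b = f_new, b_new
    # Assumption: sign flip so that x[k] = sum_i c_i x[k-i] + e.
    return [-v for v in a], power


def demo_data(n=4000, gap_len=50, period=400, seed=1):
    """Hypothetical helper: AR(2) test signal with artefact-filled, flagged gaps."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(1.5 * x[-1] - 0.75 * x[-2] + rng.gauss(0.0, 1.0))
    g = [0] * len(x)
    for start in range(period // 2, len(x) - gap_len, period):
        for k in range(start, start + gap_len):
            x[k], g[k] = 500.0, 1     # large artefact inside the gap
    return x, g
```

On the demo signal, `memgap(*demo_data(), 2)` should recover coefficients near 1.5 and −0.75 although the gaps contain values far outside the range of the clean data, because none of the flagged samples enters a prediction error.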

Obviously, this method can only work if the continuous segments between gaps are long enough. Therefore, there needs to be at least one segment of samples with a length that is at least equal to the model order *p* in order to make an estimation of the spectrum with this method. In practice, it is preferable if the number of samples in such a segment is several times higher than the model order in order to reduce the bias and variance of the estimator. If there are $n_s$ segments of clean data with a length greater than *p* and the total number of samples in these segments equals *M*, then only $M - 2 n_s p$ forward and backward prediction errors are available in the *p*-th step of the MEMgap algorithm, although all *M* samples are evaluated to compute these errors. This means that even if the total number of samples within gaps is the same, one can expect that the variance of the spectral estimation will be smaller if there are only a few large gaps in the data compared to having many small gaps, because fewer samples contribute fully in the second case. According to de Waele and Broersen (2000), the same holds for the estimation bias, which is inversely proportional to the number of available samples.
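A small numerical example makes the comparison between few large and many small gaps tangible. The sketch below evaluates the expression $M - 2 n_s p$ for a gap mask; the function names and the two example masks are illustrative assumptions.

```python
def clean_segment_lengths(g):
    """Lengths of the consecutive runs of clean samples (g[k] == 0)."""
    lengths, run = [], 0
    for flag in g:
        if flag:
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    if run:
        lengths.append(run)
    return lengths


def fully_contributing(g, p):
    """Evaluate M - 2 * n_s * p over all clean segments longer than the model order p."""
    usable = [n for n in clean_segment_lengths(g) if n > p]
    return sum(usable) - 2 * len(usable) * p


# Both masks flag 100 of 1000 samples as gap.
one_large_gap = [0] * 450 + [1] * 100 + [0] * 450
ten_small_gaps = ([0] * 80 + [1] * 10) * 10 + [0] * 100
```

For a model order of 16, the single large gap leaves two segments and 900 − 2·2·16 = 836 fully contributing samples, whereas the ten small gaps leave eleven segments and only 900 − 2·11·16 = 548.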

#### **1.4. AR MODELING INDUCES JUMPS IN THE SIGNAL**

The approach to fill the gap with samples that were extrapolated by an AR model fitted to the data before the gap can be problematic, if the extrapolation diverges strongly from the measured data. When actually measured samples are added to the data buffer after the gap, there can be a large amplitude difference between the last (extrapolated) sample within the gap and the first measured sample after the gap (**Figure A1A**). Such a "jump" in the signal results in high power across all frequencies, thus distorting the spectrum. To assess the influence of the model order and the gap size on the jumps, we ran simulations on the data from the ECoG recordings without stimulation used in **Figure 8A**: in total, 11266 stimuli were simulated on 397 minutes of data, the gap size was varied between 0 and 100 ms in steps of 5 ms and model orders 16, 32, and 64 were tested.

We applied AR modeling to deal with the gaps and measured the jump as the absolute voltage difference between the last sample generated by the AR model at the end of the gap and the first sample after the gap. The results are shown in **Figure A1B**. For comparison, the average absolute voltage difference between neighboring samples of the original ECoG data without any gaps or stimuli is 1.68 ± 0.43 μV (mean ± std).

We found that for all model orders, the average height of the jump increases sharply up to a gap size of 10 ms, yielding 12.35 ± 5.0 μV for order 16, 11.82 ± 4.43 μV for 32 and 11.29 ± 4.37 μV for 64. For further increasing gap size, the average jump height increases more slowly for higher model orders than for lower ones. For a gap size of 100 ms we find average jump heights of 28.08 ± 14.94 μV for order 16, 23.45 ± 12.51 μV for 32 and 20.23 ± 11.94 μV for 64. Thus, while the jump height at the end of the gap is significantly smaller for a model order of 64 compared to 32 and 16 (gap size = 100, paired Wilcoxon signed rank tests, both *p* < 10<sup>−17</sup>), it is still vastly higher than the average sample-to-sample difference for ongoing ECoG activity. Therefore, we cannot conclude that higher model orders prevent jumps after the gap.

Furthermore, if the gaps are used to cover the effects of real stimulation, there has to be a jump at the end if the gap is filled by AR modeling and evoked activity is present. AR modeling attempts to extrapolate the pre-stimulus signal, which almost certainly differs in its time course from the stimulation-evoked activity; therefore, extrapolation cannot work perfectly, regardless of the model order.

**Figure A1.** … and 600 ms. An AR model of order 32 is estimated from the 500 ms before the gap and applied as a linear predictor to generate 100 samples to fill the gap (red). The original ECoG samples within … gap is the jump height. **(B)** Average jump height after filling the gap with AR modeling for model orders 16 (black), 32 (red), and 64 (green).
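The jump measure used in this appendix can be written down compactly. The sketch below assumes that AR coefficients `c` for the model $x(t_k) = \sum_i c_i x(t_{k-i}) + e$ have already been fitted to the data before the gap by any estimator; the function names and the single-gap interface are hypothetical.

```python
def extrapolate(history, c, n_samples):
    """Continue a signal with a linear predictor: next = sum_i c[i] * (i+1)-th latest sample."""
    if len(history) < len(c):
        raise ValueError("history must contain at least len(c) samples")
    buf = list(history[-len(c):])
    out = []
    for _ in range(n_samples):
        nxt = sum(ci * buf[-(i + 1)] for i, ci in enumerate(c))
        out.append(nxt)
        buf.append(nxt)               # predicted samples feed later predictions
    return out


def jump_height(x, c, gap_start, gap_len):
    """Absolute difference between the last extrapolated and the first measured sample."""
    if gap_len < 1 or gap_start + gap_len >= len(x):
        raise ValueError("need a non-empty gap followed by a measured sample")
    filled = extrapolate(x[:gap_start], c, gap_len)
    return abs(filled[-1] - x[gap_start + gap_len])
```

Since every predicted sample is fed back into the predictor, deviations from the true signal accumulate over the gap, which fits the observation that the jump height grows with the gap size.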

## Erratum: Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects

*Armin Walter<sup>1</sup>\*, Ander Ramos Murguialday<sup>2,3</sup>, Martin Spüler<sup>1</sup>, Georgios Naros<sup>4,5</sup>, Maria Teresa Leão<sup>4,5</sup>, Alireza Gharabaghi<sup>4,5</sup>, Wolfgang Rosenstiel<sup>1</sup>, Niels Birbaumer<sup>2,6</sup> and Martin Bogdan<sup>1,7</sup>*

*<sup>1</sup> Department of Computer Engineering, Wilhelm-Schickard-Institute, Eberhard Karls University Tübingen, Tübingen, Germany*

*<sup>2</sup> Institute of Medical Psychology and Behavioural Neurobiology, University Hospital Tübingen, Tübingen, Germany*

*<sup>3</sup> Health Technologies Department, TECNALIA, San Sebastian, Spain*

*<sup>4</sup> Department of Neurosurgery, University Hospital Tübingen, Tübingen, Germany*

*<sup>5</sup> Department of Neuroprosthetics, Centre for Integrative Neuroscience, Eberhard Karls University Tübingen, Tübingen, Germany*

*<sup>6</sup> IRCCS, Ospedale San Camillo, Venice, Italy*

*<sup>7</sup> Department of Computer Engineering, University of Leipzig, Leipzig, Germany*

*\*Correspondence: armin.walter@uni-tuebingen.de*

#### *Edited by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### *Reviewed by:*

*Ahmed El Hady, Max Planck Institute for Dynamics and Self Organization, Germany*

#### **A commentary on**

#### **Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects**

*by Walter, A., Murguialday, A. R., Rosenstiel, W., Birbaumer, N., and Bogdan, M. (2012). Front. Neural Circuits 6:87. doi: 10.3389/fncir.2012.00087*

The article "Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects" (Walter et al., 2012) incorrectly omitted four authors. The list of authors should have been: Armin Walter, Ander Ramos Murguialday, Martin Spüler, Georgios Naros, Maria Teresa Leão, Alireza Gharabaghi, Wolfgang Rosenstiel, Niels Birbaumer, Martin Bogdan.

All further references to this paper should cite the corrected author list as displayed above.

Furthermore, the text of the "Acknowledgments" section needs to be changed to: This work was funded by the European Research Council (ERC) grant 227632 "BCCI—A Bidirectional Cortical Communication Interface" and the Reinhardt-Koselleck Award of the Deutsche Forschungsgemeinschaft (DFG). We thank Alexandros Viziotis for his work with the patients and the two reviewers for helpful comments on the manuscript.

#### **REFERENCES**

Walter, A., Murguialday, A. R., Rosenstiel, W., Birbaumer, N., and Bogdan, M. (2012). Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects. *Front. Neural Circuits* 6:87. doi: 10.3389/fncir.2012.00087

*Received: 09 January 2013; accepted: 09 January 2013; published online: 11 February 2013.*

*Citation: Walter A, Murguialday AR, Spüler M, Naros G, Leão MT, Gharabaghi A, Rosenstiel W, Birbaumer N and Bogdan M (2013) Erratum: Coupling BCI and cortical stimulation for brain-state-dependent stimulation: methods for spectral estimation in the presence of stimulation after-effects. Front. Neural Circuits 7:6. doi: 10.3389/fncir.2013.00006*

*Copyright © 2013 Walter, Murguialday, Spüler, Naros, Leão, Gharabaghi, Rosenstiel, Birbaumer and Bogdan. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*