# NONLINEAR ANALYSIS IN NEUROSCIENCE AND BEHAVIORAL RESEARCH

EDITED BY: Tobias A. Mattei PUBLISHED IN: Frontiers in Computational Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-996-9 DOI 10.3389/978-2-88919-996-9

# About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

# Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

# Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

# What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# **NONLINEAR ANALYSIS IN NEUROSCIENCE AND BEHAVIORAL RESEARCH**

Topic Editor: **Tobias A. Mattei,** Eastern Maine Medical Center, USA

Three-dimensional phase space representation of a non-linear system with a trajectory involving a strange attractor. Image by Nicolas Desprez, available under a CC BY-SA 3.0 license at: https://en.wikipedia.org/wiki/Attractor#/media/File:Atractor_Poisson_Saturne.jpg

Although nonlinear dynamics has long been mastered by physicists and mathematicians (as most physical systems are inherently nonlinear in nature), the recent successful application of nonlinear methods to modeling and predicting several evolutionary, ecological, physiological, and biochemical processes has generated great interest and enthusiasm among researchers in computational neuroscience and cognitive psychology. Additionally, in recent years it has been demonstrated that nonlinear analysis can successfully model not only basic cellular and molecular data but also complex cognitive processes and behavioral interactions.

The theoretical features of nonlinear systems (such as unstable periodic orbits, period-doubling bifurcations, and phase space dynamics) have already been successfully applied by several research groups to analyze the behavior of a variety of neuronal and cognitive processes. Additionally, the concept of strange attractors has led to a new understanding of information processing which considers higher cognitive functions (such as language, attention, memory, and decision making) as complex systems emerging from the dynamic interaction between parallel streams of information flowing between highly interconnected neuronal clusters organized in a widely distributed circuit and modulated by key central nodes. Furthermore, the paradigm of self-organization derived from nonlinear dynamics theory has offered an interesting account of the emergence of new complex cognitive structures from random, non-deterministic patterns, similar to what has previously been observed in nonlinear studies of fluid dynamics.

Finally, the challenge of handling the massive amounts of data on brain function generated by new research fields in experimental neuroscience (such as magnetoencephalography, optogenetics, and single-cell intra-operative recordings of neuronal activity) has created the need for new research strategies which incorporate complex pattern analysis as an important feature of their algorithms.

To date, nonlinear dynamics has been successfully employed to model both basic single- and multi-neuron activity (such as single-cell firing patterns, neural network synchronization, autonomic activity, electroencephalographic measurements, and noise modulation in the cerebellum) and higher cognitive functions and complex psychiatric disorders. Similarly, previous experimental studies have suggested that several cognitive functions can be successfully modeled on the basis of the transient activity of large-scale brain networks in the presence of noise. Such studies have demonstrated that it is possible to represent typical decision-making paradigms of neuroeconomics by dynamic models governed by ordinary differential equations with a finite number of possibilities at the decision points and basic heuristic rules which incorporate variable degrees of uncertainty.

This e-book includes frontline research in computational neuroscience and cognitive psychology involving applications of nonlinear analysis, especially regarding the representation and modeling of complex neural and cognitive systems. Several expert teams around the world have provided frontline theoretical and experimental contributions (as well as reviews, perspectives, and commentaries) in the fields of nonlinear modeling of cognitive systems, chaotic dynamics in computational neuroscience, fractal analysis of biological brain data, nonlinear dynamics in neural network research, nonlinear and fuzzy logics in complex neural systems, nonlinear analysis of psychiatric disorders, and dynamic modeling of sensorimotor coordination.

Rather than a comprehensive compilation of the possible topics in neuroscience and cognitive research to which non-linear analysis may be applied, this e-book intends to provide some illustrative examples of the broad range of fields in which the powerful tools of non-linear analysis can be successfully employed. We sincerely hope that these articles will stimulate readers to deepen their interest in the topic of non-linear analysis in neuroscience and the cognitive sciences, paving the way for future theoretical and experimental research in this rapidly evolving and promising field.

**Citation:** Mattei, T. A., ed. (2016). Nonlinear Analysis in Neuroscience and Behavioral Research. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-996-9

# Table of Contents

*07 Unveiling complexity: non-linear and fractal analysis in neuroscience and cognitive psychology*

Tobias A. Mattei

*09 Low-dimensional attractor for neural activity from local field potentials in optogenetic mice*

Sorinel A. Oprisan, Patrick E. Lynn, Tamas Tompa and Antonieta Lavin

*28 A pooling-LiNGAM algorithm for effective connectivity analysis of fMRI data*

Lele Xu, Tingting Fan, Xia Wu, KeWei Chen, Xiaojuan Guo, Jiacai Zhang and Li Yao

*37 EEG entropy measures in anesthesia*

Zhenhu Liang, Yinghua Wang, Xue Sun, Duan Li, Logan J. Voss, Jamie W. Sleigh, Satoshi Hagihira and Xiaoli Li

*54 Detection of subjects and brain regions related to Alzheimer's disease using 3D MRI scans based on eigenbrain and machine learning*

Yudong Zhang, Zhengchao Dong, Preetha Phillips, Shuihua Wang, Genlin Ji, Jiquan Yang and Ti-Fei Yuan

*69 On the distinguishability of HRF models in fMRI*

Paulo N. Rosa, Patricia Figueiredo and Carlos J. Silvestre

*82 Detection of epileptiform activity in EEG signals based on time-frequency and non-linear analysis*

Dragoljub Gajic, Zeljko Djurovic, Jovan Gligorijevic, Stefano Di Gennaro and Ivana Savic-Gajic

*98 Input-output relation and energy efficiency in the neuron with different spike threshold dynamics*

Guo-Sheng Yi, Jiang Wang, Kai-Ming Tsang, Xi-Le Wei and Bin Deng

*112 Linear stability in networks of pulse-coupled neurons*

Simona Olmi, Alessandro Torcini and Antonio Politi

*126 Macroscopic complexity from an autonomous network of networks of theta neurons*

Tanushree B. Luke, Ernest Barreto and Paul So

*137 Multiscale entropy analysis of biological signals: a fundamental bi-scaling law*

Jianbo Gao, Jing Hu, Feiyan Liu and Yinhe Cao

*146 A three-dimensional mathematical model for the signal propagation on a neuron's membrane*

Konstantinos Xylouris and Gabriel Wittum

*155 Membrane current series monitoring: essential reduction of data points to finite number of stable parameters*

Raoul R. Nigmatullin, Rashid A. Giniatullin and Andrei I. Skorinkin

*167 Fast monitoring of epileptic seizures using recurrence time statistics of electroencephalography*

Jianbo Gao and Jing Hu

*175 Astronomical apology for fractal analysis: spectroscopy's place in the cognitive neurosciences*

Damian G. Kelty-Stephen


David Kronemyer and Alexander Bystritsky

*214 Characterizing psychological dimensions in non-pathological subjects through autonomic nervous system dynamics*

Mimma Nardelli, Gaetano Valenza, Ioana A. Cristea, Claudio Gentili, Carmen Cotet, Daniel David, Antonio Lanata and Enzo P. Scilingo

*226 What is the mathematical description of the treated mood pattern in bipolar disorder?*

Fatemeh Hadaeghi, Mohammad R. Hashemi Golpayegani and Shahriar Gharibzadeh


Maryam Beigzadeh, Seyyed Mohammad R. Hashemi Golpayegani and Shahriar Gharibzadeh

*234 Bifurcation analysis of "synchronization fluctuation": a diagnostic measure of brain epileptic states*

Fatemeh Bakouie, Keivan Moradi, Shahriar Gharibzadeh and Farzad Towhidkhah

*236 A more realistic quantum mechanical model of conscious perception during binocular rivalry*

Mohammad Reza Paraan, Fatemeh Bakouie and Shahriar Gharibzadeh

*238 A hypothesis on the role of perturbation size on the human sensorimotor adaptation*

Fatemeh Yavari, Farzad Towhidkhah and Mohammad Darainy

*241 Artificial neural networks: powerful tools for modeling chaotic behavior in the nervous system*

Malihe Molaie, Razieh Falahian, Shahriar Gharibzadeh, Sajad Jafari and Julien C. Sprott

*244 Synchrony analysis: application in early diagnosis, staging and prognosis of multiple sclerosis*

Zahra Ghanbari and Shahriar Gharibzadeh

*246 The hypothetical cost-conflict monitor: is it a possible trigger for conflict-driven control mechanisms in the human brain?*

Sareh Zendehrouh, Shahriar Gharibzadeh and Farzad Towhidkhah

*249 Modeling studies for designing transcranial direct current stimulation protocol in Alzheimer's disease*

Shirin Mahdavi, Fatemeh Yavari, Shahriar Gharibzadeh and Farzad Towhidkhah

*251 Does our brain use the same policy for interacting with people and manipulating different objects?* Fatemeh Yavari

*255 Stochastic non-linear oscillator models of EEG: the Alzheimer's disease case* Parham Ghorbanian, Subramanian Ramakrishnan and Hashem Ashrafiuon

*269 Multisensory integration using dynamical Bayesian networks* Taher Abbas Shangari, Mohsen Falahi, Fatemeh Bakouie and Shahriar Gharibzadeh

# *Tobias A. Mattei\**

*Department of Neurological Surgery, The Ohio State University Medical Center, Columbus, OH, USA \*Correspondence: tobias.mattei@osumc.edu*

#### *Edited by:*

*Misha Tsodyks, Weizmann Institute of Science, Israel*

**Keywords: non-linear analysis, complex systems, fractal analysis, cognitive psychology, neurosciences**

Although non-linear dynamics has long been mastered by physicists and mathematicians, as most physical systems are inherently non-linear in nature (Kirillov and Dmitry, 2013), the more recent successful application of non-linear and fractal methods to the modeling and prediction of several evolutionary, ecologic, genetic, and biochemical processes (Avilés, 1999) has generated great interest and enthusiasm for this type of approach among researchers in neuroscience and cognitive psychology.

After initial works in this emerging field, it became clear that multiple aspects of brain function, viewed from different perspectives and scales, exhibit nonlinear behavior, with a complex phase space composed of multiple equilibrium points, limit cycles, stability regions, and trajectory flows, as well as dynamics which include unstable periodic orbits, period-doubling bifurcations, and other features typical of chaotic systems (Birbaumer et al., 1995). Moreover, it was also demonstrated that non-linear dynamics can explain several unique features of the brain such as plasticity and learning (Freeman, 1994).

More recently, the concept of strange attractors has led to a new understanding of information processing in the brain which, instead of the old "localizationist" approaches (Wernicke, 1970), considers higher cognitive functions (such as language, attention, memory, and decision-making) as systemic properties which emerge from the dynamic interaction between parallel streams of information flowing between highly interconnected neuronal clusters that are organized in a widely distributed circuit modulated by key central nodes (Mattei, 2013a,b). Within this paradigm, the concept of self-organization has been able to offer a proper account of the evolutionary emergence of new complex cognitive structures from non-deterministic random patterns, similar to what has previously been observed in nonlinear studies of fluid dynamics (Dixon et al., 2012).

Additionally, the challenges of interpreting the massive amounts of information about brain function generated by emerging research fields in experimental neuroscience (such as functional MRI, magnetoencephalography, optogenetics, and single-cell intra-operative recordings) have created the need for new methods which incorporate complex pattern analysis as an important feature of their algorithms (Turk-Browne, 2013).

To date, nonlinear methods have been successfully employed to describe and model (among many other examples) single-cell firing patterns (Thomas et al., 2013), neural network synchronization (Yu et al., 2011), autonomic activity (Tseng et al., 2013), electroencephalographic data (Abásolo et al., 2007), and noise modulation in the cerebellum (Tokuda et al., 2010), as well as higher cognitive functions and complex psychiatric disorders (Bystritsky et al., 2012). Additionally, fractal analysis has been extensively explored not only in the description of the temporal aspects of neuronal dynamics, but also in the evaluation of key structural patterns of cellular organization in both normal and pathological histologic brain samples (Mattei, 2013a,b).

Finally, recent studies have demonstrated that several cognitive functions can be successfully modeled on the basis of the transient activity of large-scale brain networks in the presence of noise (Rabinovich et al., 2008). In fact, it has already been suggested that the observed pervasiveness of 1/*f* scaling (also called 1/*f* noise, fractal time, or pink noise) in both neural and cognitive functions may bear a very close (if not causal) relationship to the phenomenon of metastability of brain states (Kello et al., 2008). Other studies in the emerging field of neuroeconomics have shown that it is possible to represent typical decision-making paradigms by dynamic models governed by ordinary differential equations with a finite number of possibilities at the decision points as well as basic rules to address uncertainty (Holmes et al., 2004).
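The class of models just mentioned can be made concrete with a small sketch: two mutually inhibiting noisy accumulators integrated with the Euler-Maruyama method, with the decision taken when either one reaches a threshold. This is a generic leaky-competing-accumulator toy in the spirit of the models cited (Holmes et al., 2004), not a model from any of the referenced papers; all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def race_to_decision(drift=(1.2, 1.0), leak=0.5, inhibition=0.8,
                     noise=0.2, threshold=1.0, dt=1e-3, max_t=5.0,
                     rng=None):
    """Euler-Maruyama integration of two mutually inhibiting noisy
    accumulators, dx_i = (drift_i - leak*x_i - inhibition*x_j) dt + noise dW.
    The decision is the index of the first accumulator to reach the
    threshold, giving a finite set of outcomes at the decision point.
    All parameter values are illustrative, not taken from the literature."""
    rng = np.random.default_rng() if rng is None else rng
    drift = np.asarray(drift, dtype=float)
    x = np.zeros(2)
    for step in range(int(max_t / dt)):
        dw = rng.standard_normal(2) * np.sqrt(dt)
        dx = (drift - leak * x - inhibition * x[::-1]) * dt + noise * dw
        x = np.maximum(x + dx, 0.0)   # firing rates stay non-negative
        if x.max() >= threshold:
            return int(np.argmax(x)), (step + 1) * dt
    return None, max_t                # no decision reached within max_t
```

With the noise term switched off the model reduces to a deterministic ODE, which makes its behavior easy to check: the accumulator with the larger drift wins, crossing the threshold in well under the allotted time.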

In this special edition of Frontiers in Computational Neuroscience dedicated to the topic of Non-linear and Fractal Analysis in Neuroscience and Cognitive Psychology, articles from several frontline research groups around the world were carefully selected in order to provide a representative sample of the different research fields in neuroscience and cognitive psychology in which non-linear and fractal analysis may be successfully applied.

The selected articles include both classical problems in which non-linear methods have traditionally been employed (such as EEG data analysis) and newer research fields in which non-linear analysis has been shown to be useful not only for modeling normal brain dynamics but also for the diagnosis of neurological and psychiatric disorders, the monitoring of their natural history, and the evaluation of the effects of different therapeutic strategies.

Overall, both theoretical and experimental works in the field seem to demonstrate that the advanced tools of non-linear analysis can much more accurately describe and represent the complexity of brain dynamics than traditional mathematical and computational methods based on linear and deterministic analysis.

Although it seems quite unquestionable that future attempts to model complex brain and cognitive functions will significantly benefit from non-linear methods, exactly which cognitive and neuronal variables exhibit significantly chaotic patterns remains an open question. However, taking into account the pervasiveness of non-linear behavior in the brain, which has already been demonstrated by such an extensive literature across so many fields of neuroscience and cognitive psychology (as well as the remarkable progress achieved by the application of non-linear and fractal analysis in these research areas), perhaps the burden of proof should be on the other side. Perhaps the real question to be answered is: which areas of neuroscience and cognitive psychology would not benefit from the advantages that non-linear and fractal analysis has to offer?

# **REFERENCES**


*Received: 05 February 2014; accepted: 05 February 2014; published online: 21 February 2014.*

*Citation: Mattei TA (2014) Unveiling complexity: non-linear and fractal analysis in neuroscience and cognitive psychology. Front. Comput. Neurosci. 8:17. doi: 10.3389/ fncom.2014.00017*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Mattei. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Low-dimensional attractor for neural activity from local field potentials in optogenetic mice

Sorinel A. Oprisan<sup>1</sup>\*, Patrick E. Lynn<sup>2</sup>, Tamas Tompa<sup>3, 4</sup> and Antonieta Lavin<sup>3</sup>

<sup>1</sup> Department of Physics and Astronomy, College of Charleston, Charleston, SC, USA, <sup>2</sup> Department of Computer Science, College of Charleston, Charleston, SC, USA, <sup>3</sup> Department of Neuroscience, Medical University of South Carolina, Charleston, SC, USA, <sup>4</sup> Department of Preventive Medicine, Faculty of Healthcare, University of Miskolc, Miskolc, Hungary

We used optogenetic mice to investigate possible nonlinear responses of the medial prefrontal cortex (mPFC) local network to light stimuli delivered by a 473 nm laser through an optical fiber. Every 2 s, a brief 10 ms light pulse was applied and the local field potentials (LFPs) were recorded at a 10 kHz sampling rate. The experiment was repeated 100 times, and we only retained and analyzed data from six animals that showed stable and repeatable responses to optical stimulation. The presence of nonlinearity in our data was checked against the null hypothesis that the data were linearly correlated in the temporal domain, but random otherwise. For each trial, 100 surrogate data sets were generated, and both time reversal asymmetry and false nearest neighbors (FNN) were used as discriminating statistics for the null hypothesis. We found that nonlinearity is present in all LFP data. The first 0.5 s of each 2 s LFP recording was dominated by the transient response of the network. For each trial, we used the last 1.5 s of steady activity to measure the phase resetting induced by the brief 10 ms light stimulus. After correcting the LFPs for the effect of phase resetting, additional preprocessing was carried out using dendrograms to identify "similar" groups among LFP trials. We found that the steady dynamics of the mPFC in response to light stimuli could be reconstructed in a three-dimensional phase space with topologically similar "8"-shaped attractors across different animals. Our results also open the possibility of designing a low-dimensional model for optical stimulation of the mPFC local network.

### Edited by:

Tobias Alecio Mattei, Kenmore Mercy Hospital, USA

### Reviewed by:

Todd Troyer, University of Texas, USA Joaquín J. Torres, University of Granada, Spain Xin Tian, Tianjin Medical University, China

#### \*Correspondence:

Sorinel A. Oprisan, Department of Physics and Astronomy, College of Charleston, 66 George Street, Charleston, SC 29424, USA oprisans@cofc.edu

Received: 11 June 2015 Accepted: 18 September 2015 Published: 02 October 2015

#### Citation:

Oprisan SA, Lynn PE, Tompa T and Lavin A (2015) Low-dimensional attractor for neural activity from local field potentials in optogenetic mice. Front. Comput. Neurosci. 9:125. doi: 10.3389/fncom.2015.00125

Keywords: optogenetics, medial prefrontal cortex, electrophysiology, delay-embedding, nonlinear dynamics

# 1. Introduction

Synchronization of neural oscillators across different areas of the brain is involved in memory consolidation, decision-making, and many other cognitive processes (Oprisan and Buhusi, 2014). In humans, sustained theta oscillations were detected when subjects navigated through a virtual maze by memory alone, relative to when they were guided through the maze by arrow cues (Kahana et al., 1999), and the duration of sustained theta activity is proportional to the length of the maze. However, the theta rhythm does not seem to correlate with decision-making processes; rather, the duration of the gamma rhythm is proportional to the decision time. Gamma oscillations showed strong coherence across different areas of the brain during associative learning (Miltner et al., 1999), and a similarly strong coherence in the gamma band was found between frontal and parietal cortex during successful recollection (Burgess and Ali, 2002). Cross-frequency coupling between brain rhythms is essential in the organization and consolidation of working memory (Oprisan and Buhusi, 2013). Such cross-frequency coupling between gamma and theta oscillations is believed to code multiple items in an ordered way in the hippocampus, where spatial information is represented in different gamma subcycles of a theta cycle (Kirihara et al., 2012; Lisman and Jensen, 2013). It is believed that the alpha rhythm suppresses task-irrelevant information, gamma oscillations are essential for memory maintenance, whereas theta rhythms drive the organization of sequentially ordered items (Roux and Uhlhaas, 2014). Synchronization of neural activity is also critical, for example, in the encoding and decoding of odor identity and intensity (Stopfer et al., 2003; Broome et al., 2006).

Gamma rhythm involves the reciprocal interaction between interneurons, mainly parvalbumin (PV+) fast spiking interneurons (FS PV+) and principal cells (Traub et al., 1997). The predominant mechanism for neuronal synchronization is the synergistic excitation of glutamatergic pyramidal cells and GABAergic interneurons (Parra et al., 1998; Fujiwara-Tsukamoto and Isomura, 2008).

Nonlinear time series analysis has been successfully applied, for example, to extract quantitative features from recordings of brain electrical activity that may serve as diagnostic tools for different pathologies (Jung et al., 2003). In particular, the large-scale synchronization of activity among neurons that leads to epileptic processes has been extensively investigated with the tools of nonlinear dynamics, both for the early detection of seizures (Jerger et al., 2001; Iasemidis, 2003; Iasemidis et al., 2003; Paivinen et al., 2005) and for exploiting the nonlinearity of the neural network response to reset the phase of the underlying synchronous activity of large neural populations, in order to disrupt the synchrony and re-establish normal activity (Tass, 2003; Greenberg et al., 2010). A series of nonlinear parameters showed significant changes during the ictal period as compared to the interictal period (Babloyantz and Destexhe, 1986; van der Heyden et al., 1999), reflecting spatiotemporal changes in signal complexity. It has also been suggested that differences in therapeutic responsiveness may reflect underlying distinct dynamic changes during epileptic seizures (Jung et al., 2003).

The present study performed nonlinear time series analysis of LFP recordings from PV+ neurons: (1) to determine whether nonlinearity is present, using time reversal asymmetry and FNN statistics to compare the original signal with surrogate data; (2) to measure the phase shift (resetting) induced by brief light stimuli; and (3) to compute the delay (lag) time and embedding dimension of the LFP data.
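The article does not spell out the formula behind its discriminating statistics, but a common form of the time reversal asymmetry measure is the normalized third moment of lagged differences, which is zero in expectation for any time-reversible (in particular, linear Gaussian) process; when the value computed on the data falls far outside the distribution of values computed on the surrogates, the null hypothesis is rejected. A hedged sketch of that version (function name is ours, not the authors'):

```python
import numpy as np

def time_reversal_asymmetry(x, tau=1):
    """Normalized third moment of lagged differences,
    <(x_{n+tau} - x_n)^3> / <(x_{n+tau} - x_n)^2>^(3/2).
    Zero in expectation for time-reversible (e.g., linear Gaussian)
    processes; a value far outside the surrogate distribution
    is evidence of nonlinearity."""
    d = x[tau:] - x[:-tau]
    return np.mean(d**3) / np.mean(d**2) ** 1.5
```

For a time-symmetric signal such as Gaussian white noise the statistic stays near zero, while a sawtooth wave, whose slow rises and abrupt drops are not reversible in time, yields a large negative value.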

We investigated the response of the local neural network in the mPFC activated by light stimuli and determined the number of degrees of freedom necessary for a quantitative, global, description of the steady activity of the network, i.e., long after the light stimulus was switched off. Although each neuron is described by a relatively large number of parameters, using nonlinear dynamics (Oprisan, 2002) it is possible to capture some essential features of the system in a low-dimensional space (Oprisan and Canavier, 2006; Oprisan, 2009). One possible approach to low-dimensional modeling is by using the method of phase resetting, which reduces the complexity of a neural oscillator to a lookup table that relates the phase of the presynaptic stimulus with a reset in the firing phase of the postsynaptic neuron (Oprisan, 2013).

We recently applied delay embedding to investigate the possibility of recovering phase resetting from single-cell recordings (Oprisan and Canavier, 2002; Oprisan et al., 2003). Although techniques for eliminating nonessential degrees of freedom through time scale separation have been used extensively (Oprisan and Canavier, 2006; Oprisan, 2009), the novelty of our approach is that we used the phase resetting induced by the light stimulus to quickly identify similar activity patterns for the purpose of applying the delay embedding technique.
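The delay embedding step itself is the standard Takens-style reconstruction: the scalar LFP series is unfolded into vectors of lagged samples, with the lag typically chosen from the autocorrelation or mutual information and the dimension from the FNN criterion mentioned above. A minimal sketch of the reconstruction (names are our choosing, not the authors' code):

```python
import numpy as np

def delay_embed(x, dim, lag):
    """Stack delay vectors [x(t), x(t + lag), ..., x(t + (dim-1)*lag)]
    row by row, reconstructing a trajectory in a dim-dimensional
    phase space from a scalar time series."""
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("series too short for this (dim, lag) pair")
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])
```

For example, embedding a 10-sample series with `dim=3` and `lag=2` yields six 3-dimensional points, the first being `(x[0], x[2], x[4])`.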

# 2. Materials and Methods

# 2.1. Human and Animal Research

All procedures were done in accordance with the National Institutes of Health guidelines as approved by the Medical University of South Carolina Institutional Animal Care and Use Committee.

# 2.2. Experimental Protocol

Male PV-Cre mice (B6;129P2-Pvalb<sup>tm1(Cre)Arbr</sup>/J, Jackson Laboratory, Bar Harbor, ME, USA) were infected with the viral vector [AAV2/5.EF1a.DIO.hChR2(H134R)-EYFP.WPRE.hGH, Penn Vector Core, University of Pennsylvania] delivered to the mPFC as described in detail in Dilgen et al. (2013).

Electrophysiological data were recorded using an optrode positioned with a Narishige (Japan) hydraulic microdrive. Extracellular signals were amplified by a Grass amplifier (Grass Technologies, West Warwick, RI, USA), digitized at 10 kHz by a 1401plus data acquisition system, visualized using Spike2 software (Cambridge Electronic Design, LTD., Cambridge, UK) and stored on a PC for offline analysis. Line noise was eliminated by using a HumBug 50/60 Hz Noise Eliminator (Quest Scientific Inc., Canada). The signal was band-pass filtered online between 0.1 and 10 kHz for single- or multi-unit activity, or between 0.1 and 130 Hz for local field potentials (LFP) recordings.

Light stimulation was generated by a 473 nm laser (DPSS Laser System, OEM Laser Systems Inc., East Lansing, MI, USA), controlled via a 1401plus digitizer and Spike2 software (Cambridge Electronic Design LTD., Cambridge, UK). Light pulses were delivered via the 50 µm diameter optical fiber glued to the recording electrode (Thorlabs, Inc., Newton, NJ, USA).

At the top of the recording track, the efficacy of optical stimulation was assessed by monitoring single-unit or multi-unit responses to light pulses of various durations (10–250 ms). High firing rate action potentials with low half-width amplitude during the light stimulation (presumably from PV-positive interneurons) and/or the inhibition of regular spiking units were considered confirmation of optical stimulation of ChR2-expressing PV+ interneurons. The optrode was repositioned along the dorsal-ventral axis if no response was found. Upon finding a stable response, filters were changed to record field potentials (0.1–100 Hz). Two different optical stimulations were delivered: (1) a 40 Hz 10-pulse train that lasted 250 ms, with 10 ms pulse duration followed by a 15 ms break, and (2) a single pulse of 10 ms duration. In both cases, the recording lasted for 2 s from the beginning of the optical stimulus. Local field potential (LFP) activity was monitored for a minimum of 10 min while occasionally stimulating at 40 Hz to ensure the stability of the electrode placement and the ability to induce the oscillation. Additionally, LFP activity was monitored as a tertiary method of assessing anesthesia levels. Several animals were excluded from analysis due to fluctuating levels of LFP activity resulting from titration of anesthesia levels during the experiment.

# 3. Data Analysis

For each of the six animals, we analyzed 100 different trials, each with a duration of 2 s measured from the onset of a brief 10 ms stimulus until the next stimulus. Each 2 s long LFP recording has two regions of interest: the first approximately 0.5 s that follows the stimulus, which contains the transient response of the neural network, and the last 1.5 s of the recording, which contains the steady activity of the network. The transient response is essential in the subsequent analysis of the steady response since it determines the amount of phase resetting induced by the optical stimulus (see Section 3.2 for a detailed description of the procedure employed to determine the phase resetting induced by a light stimulus). The steady activity of the network was investigated to determine if there is any low-dimensional attractor that may explain the observed dynamics.

# 3.1. Tests for Nonlinearity

Detection of nonlinearity is the first step before any nonlinear analysis. The test is necessary since noisy data and an insufficient number of observations may falsely suggest nonlinearity in an otherwise purely stochastic time series (see, for example, Osborne and Provenzale, 1989). There are at least two widely-used methods for testing time series nonlinearities: surrogate data (Theiler et al., 1992; Small, 2005) and the bootstrap (Efron, 1982). The surrogate data technique, a statistical approach, is the more commonly used of the two; the bootstrap method instead extracts explicit parametric models from the data (Efron, 1982).

In the following, we will only use the surrogate data method. Testing for nonlinearity with surrogate data requires an appropriate null hypothesis, e.g., that the data are linearly correlated in the temporal domain, but are random otherwise. Once a null hypothesis is selected, surrogate data are generated from the original series by preserving the linear correlations within the original data while destroying any nonlinear structure by randomizing the phases of the Fourier transform of the data (Theiler et al., 1992).

From the surrogates, the quantity of interest, e.g., the time reversal asymmetry, is estimated for each realization. Next, a distribution of the estimates is compiled and appropriate statistical tests are carried out with the purpose of determining if the observed data are likely to have been generated by the process specified by the null hypothesis. If the selected measure(s) of suspected nonlinearity does not change significantly between the original and the surrogate data, then the null hypothesis cannot be rejected; otherwise, the null hypothesis is rejected.
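The phase-randomization step described above can be sketched in a few lines of NumPy. This is only an illustration of the basic Fourier-transform surrogate idea (Theiler et al., 1992); the analysis in this paper used the Tisean routine surrogate, whose algorithm differs in detail, and the function name below is ours:

```python
import numpy as np

def phase_randomized_surrogate(x, rng=None):
    """One Fourier-transform surrogate of the 1-D series `x`:
    the amplitude spectrum (linear correlations) is preserved,
    while the Fourier phases are randomized."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spec))
    # DC (and Nyquist, for even n) must stay real for a real-valued surrogate
    phases[0] = 0.0
    if n % 2 == 0:
        phases[-1] = 0.0
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=n)
```

By construction the surrogate has the same power spectrum, and hence the same autocorrelation, as the original series, so any significant difference in a nonlinear statistic can be attributed to structure beyond linear correlations.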

The number of surrogates to be generated depends on the rate of false rejections of the null hypothesis (Jung et al., 2003). For example, if a significance level of l = 0.05 is desired, then at least n = 1/l = 20 surrogates need to be generated (Jung et al., 2003; Yuan et al., 2004). A set of values λ<sub>i</sub> (with i = 1, . . . , n) of the discriminating statistics is then computed from the surrogates and compared against the value λ<sub>0</sub> for the original time series. Rejecting the null hypothesis can be done using: (1) rank ordering or significance testing, (2) the average method (Yuan et al., 2004), or (3) the coefficient of variation method (Theiler et al., 1992; Kugiumtzis, 2002; Jung et al., 2003).

In rank ordering, λ<sub>0</sub> must occur either first or last in the ordered list of all values of the discriminating statistics in order to reject the null hypothesis (see the null hypothesis rejection using FNN in Section 4.2).

In the average statistical method, a score γ (sometimes called a Z-score) is derived as follows:

$$
\gamma = \left| \frac{\lambda\_0}{\bar{\lambda}} - 1 \right|,
$$

where $\bar{\lambda} = \frac{1}{n}\sum\_{i=1}^{n} \lambda\_i$ is the mean value of the discriminating statistics over all surrogates. If the score γ is much less than 1, then the relative discrepancy can be considered negligible. If γ is greater than 1, then the original data and the surrogates are significantly different and the null hypothesis is rejected.

In the coefficient of variation statistical method, a score γ is derived as follows:

$$
\gamma = \left| \frac{\bar{\lambda} - \lambda\_0}{\sigma\_\lambda} \right|, \tag{1}
$$

where σ<sub>λ</sub> is the standard deviation of the discriminating statistics over all surrogates. If the values λ<sub>i</sub> are fairly normally distributed, rejection of the null hypothesis requires a γ value of about 1.96 at a 95% confidence level (Stam et al., 1998; Jung et al., 2003).
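Given the discriminating statistic λ<sub>0</sub> of the original series and the values λ<sub>i</sub> from the surrogates, both scores reduce to a single line each; a minimal sketch (function names are ours):

```python
import numpy as np

def average_method_score(lam0, lam_surr):
    """Average (Z-score) method: gamma = |lam0 / mean(lam_surr) - 1|."""
    return abs(lam0 / np.mean(lam_surr) - 1.0)

def cv_method_score(lam0, lam_surr):
    """Coefficient-of-variation method, Equation (1):
    gamma = |mean(lam_surr) - lam0| / std(lam_surr)."""
    return abs(np.mean(lam_surr) - lam0) / np.std(lam_surr, ddof=1)
```

With roughly normally distributed surrogate statistics, a cv_method_score above about 1.96 rejects the null hypothesis at the 95% confidence level.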

For every trial and every animal we generated n = 100 surrogates and used two different discriminating statistics to detect potential nonlinearity in our data. The first γ score was based on the reversibility of the time series. The second discriminating statistics was based on the percentage of false nearest neighbors (see Section 4.2).

A time series is said to be reversible if its probabilistic properties are invariant with respect to time reversal (Diks et al., 1995). Time irreversibility is a strong signature of nonlinearity (Schreiber and Schmitz, 2000), and rejection of the null hypothesis implies that the time series cannot be described by a linear Gaussian random process (Diks et al., 1995). We used the Tisean function timerev to compute the time reversal asymmetry statistics for both the original and the surrogate data (Hegger et al., 1999; Schreiber and Schmitz, 2000). The 100 surrogate data files for each of the 100 trials were generated using the Tisean function surrogate (Hegger et al., 1999; Schreiber and Schmitz, 2000).
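For illustration, one widely used form of the time reversal asymmetry statistic is the third moment of the τ-step increments normalized by their second moment (cf. Schreiber and Schmitz, 2000); we do not claim this matches the exact normalization used by the Tisean routine timerev:

```python
import numpy as np

def time_reversal_asymmetry(x, tau=1):
    """Time reversal asymmetry of the tau-step increments of `x`.
    The statistic vanishes (in expectation) for a time-reversible
    series and changes sign exactly when the series is reversed."""
    d = x[tau:] - x[:-tau]
    return np.mean(d**3) / np.mean(d**2)
```

For example, a rising sawtooth yields a negative value because its rare, large downward jumps dominate the third moment, whereas a sinusoid yields a value near zero.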

**Figure 1A** shows one of the original time series (continuous blue line) together with one of its 100 surrogates (dashed red line). Although the two data sets might look similar, the time reversal asymmetry value for the original data was λ<sub>0</sub> = 0.1893, whereas for the surrogate shown in **Figure 1A** it was λ = 2.4948. The fact that the surrogates are significantly different from the original data means, for example, that the delay embedding dimension for the surrogates differs from that of the original data. Indeed, we found that the embedding dimension is higher for surrogates (see **Figure 6C**). It also means that the surrogates do not unfold correctly in the lower-dimensional embedding space of the original data (see Supplementary Materials). We used the coefficient of variation statistical method to compute a γ score from Equation (1). **Figure 1B** shows all γ scores for the first animal. The statistics were computed over groups of original data lumped together based on their "similarity," as determined after correcting for the phase resetting induced by the light stimulus (see Section 3.2 below for details) and using the dendrogram (see Section 3.3 below for details). The average γ score of the time reversal asymmetry statistics, computed from the individual λ<sub>i</sub> values for each trial in the third group, was less than 1.96. Therefore, the null hypothesis that the data had been created by a stationary Gaussian linear process could not be rejected for this group of LFPs. For all the other groups of original data formed out of the 100 trials the γ score was above 1.96, and we therefore rejected the null hypothesis. Although the time reversal asymmetry discriminating statistics seems to exclude the third group of data, we also used the FNN discriminating statistics for all data (see Section 4.2). The FNN reflects the degree of determinism in the original data and therefore serves as a good choice for a discriminating statistic (Hegger et al., 1999; Yuan et al., 2004).
Briefly, for the third group of data, for which the time reversal asymmetry statistics failed to reject the null hypothesis, we found that the percentage of FNN for all 100 surrogates computed for all trials in the respective group was always larger than for the original data (see **Figure 6C**). Therefore, based on both discriminating statistics, it is likely that nonlinearity is present in all our data.

FIGURE 1 | … for each of their surrogates over all five groups showed that only the third group does not meet the nonlinearity criterion (the horizontal continuous line) since its γ score is less than 1.96. For all the other four groups the null hypothesis can be rejected.

# 3.2. Phase Resetting of LFP

LFPs are weighted sums of activities produced by neural oscillators in the proximity of the recording electrode (Ebersole and Pedley, 2003). In order to better understand the effect of a stimulus, such as a brief laser pulse, on a neural network, we used a simplified neural oscillator model (see **Figure 2A**) that produced rhythmic activity. We used a Morris-Lecar (ML) model neuron (Morris and Lecar, 1981). When a noise-free oscillator with intrinsic firing period P<sub>i</sub> (see **Figure 2A**) is perturbed, e.g., by applying a brief rectangular current stimulus, the effect is a transient change in its intrinsic period. For example, a perturbation delivered at phase 0.3, measured from the most recent membrane potential peak, produces a delay of the next peak of activity (continuous blue trace in **Figure 2A**). On the other hand, an identical perturbation delivered to the same free-running oscillator at a phase of 0.5 produces a significant advance of the next peak of activity (dashed red trace in **Figure 2A**). As we notice from **Figure 2A**, the cycles after the perturbation return quickly to the intrinsic activity of the cell, i.e., the most significant effect of the perturbation is concentrated in the cycle that contains the perturbation. The induced phase resetting, i.e., the permanent phase shift of the post-stimulus activity compared to the pre-stimulus phase, depends not only on the strength and duration of the perturbation, but also on its timing (or phase).

One approach often used for reducing the noise is averaging over multiple trials. How should a meaningful average be carried out to both reduce the noise and preserve the characteristics of the rhythmic pattern, such as amplitude, phase, and frequency? One possibility is to align all action potentials at stimulus onset and add them up (see the thick black trace in **Figure 2B**) to generate an LFP. In **Figure 2B** we also added uniform noise to the neural oscillator's bias current, so that the individual traces are quite rugged. The effect of noise is especially visible on the dashed and dashed-dotted traces in **Figure 2B** during the slow hyperpolarization. By adding 100 noisy action potential traces produced by resetting the neural oscillator at 100 equally spaced phases, we produced a smooth average (see the thick black trace in **Figure 2B**). Therefore, on the positive side, we could use a (weighted) sum of noisy traces to reduce the noise in our data. The other positive outcome is that the (weighted) sum retains some of the characteristics of the individual traces, such as the intrinsic firing frequency. However, we also notice from **Figure 2B** that the shape of the (weighted) average is quite different from any of its constituents, which raises the question: is this averaging procedure the right way of computing a (weighted) average from individual trials? Based on **Figures 2A,B**, we can conclude that the mismatch between the average (thick black line) and the individual trials (blue and red traces) is due to the fact that the periodically delivered stimulus found the background oscillatory activity of the neuron at different phases and, therefore, produced different phase resettings. Without correcting for the stimulus-induced phase resetting on each trial, simply adding all individual traces loses the phase and amplitude information. We noticed the same effects when attempting to remove the noise in our LFP data by averaging all trials aligned at the onset of the light stimulus (see **Figure 2C**). As a result, whenever averaging noisy rhythmic patterns for the purpose of reducing the noise, the individual traces must first be corrected for the phase resetting induced by the external stimulus.

FIGURE 2 | Phase resetting of LFPs. The free-running neural oscillator was perturbed at different phases and, as a result, its phase was reset due to a transient change in the length of the current cycle during which the perturbation was active (A). The intrinsic firing period P<sub>i</sub> (see black bar on top of the third cycle, which contains the perturbation) was shortened by a perturbation applied at phase φ = 0.5 (see dashed red trace and the corresponding red bar on top of the third cycle). The same perturbation applied at phase φ = 0.3 (measured from the peak of the action potential; see vertical dotted lines) lengthened the current cycle (see continuous blue trace and the corresponding blue bar on top of the third cycle). (B) The average membrane potential of 100 noisy traces (thick black line) perturbed at 100 equally spaced phases during the third cycle is less noisy and retains some low frequency oscillations present in all individual traces. All traces were aligned at stimulus onset and only two of them are shown (red dashed and blue dashed-dotted). (C) LFP recordings, also aligned at laser stimulus onset, show an average LFP trace (thick black trace) that is almost noise free and retains some spectral characteristics of its components. At the same time, the shape of the average LFP trace is significantly different from any individual trace.

After dropping the 0.5 s transient, we noticed that even very similar LFP traces, such as those shown in **Figure 3A**, do not overlap perfectly due to the phase resetting (or the permanent phase shift) induced by light stimuli that arrived at different phases of the LFP activity.

In order to correct the LFP recordings for the phase resetting induced by the brief laser pulse, we performed a circular shift of each LFP trace with respect to one arbitrarily selected trace that was considered the "reference" LFP. The shift was chosen to maximize the coefficient of correlation between each trial and the arbitrary "reference" (see **Figure 3B**). As a result of the circular shift, the coefficient of correlation increased significantly, from an average of 0.0143 ± 0.055 (red trace in **Figure 3C**) to 0.5854 ± 0.1383 (blue trace in **Figure 3C**). Additionally, the root-mean-square (rms) error, i.e., the Euclidean norm of the difference between each 1.5 s long trial and the "reference" trial, was computed (see **Figure 3D**). The rms error before circularly shifting the trials was 13.4 ± 2.9. By circularly shifting the trials to remove the effect of the phase resetting induced by the light stimulus, we were able to decrease the rms error to 8.5 ± 1.8 (see green curve with squares in **Figure 3D**).
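The alignment can be done efficiently by evaluating the correlation for every possible circular shift at once through FFT-based circular cross-correlation; a minimal sketch of the idea (not the authors' code):

```python
import numpy as np

def align_circular(trial, reference):
    """Circularly shift `trial` so that its correlation with
    `reference` is maximized; returns (shifted trial, shift)."""
    a = trial - trial.mean()
    b = reference - reference.mean()
    # xcorr[s] = sum_n a[n] * b[n + s] (circular), computed for all shifts
    # at once; maximizing it maximizes corr(np.roll(trial, s), reference).
    xcorr = np.fft.irfft(np.fft.rfft(b) * np.conj(np.fft.rfft(a)), n=len(a))
    shift = int(np.argmax(xcorr))
    return np.roll(trial, shift), shift
```

A naive loop over all shifts gives the same answer in O(N²) time; the FFT version is O(N log N), which matters when aligning many long trials.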

# 3.3. Dendrograms of Phase Shifted LFPs

The circular shift performed in the previous section, with the purpose of maximizing the coefficient of correlation between each trial and an arbitrary "reference," helps correctly define the relative phase of the trials with respect to each other. Another helpful step in the process of automatic data classification, before attempting a delay embedding reconstruction, was to separate the trials into "similar"-looking groups. Since we were interested in finding out whether there is any attractor of the network's steady activity, the phase space traces of different trials would be expected to remain close to each other at all times. This implies that individual recordings present some "similarities" that could be detected using dendrograms, e.g., for the purpose of separating clean data from artifacts (due to malfunction of the laser trigger, etc.). We used dendrograms to find the similarity trees of all 1.5 s long, phase-corrected trials, which allowed us to further decrease the rms error relative to an arbitrarily selected "reference" from the same group (see blue solid circles in **Figure 3D**). The dendrogram in **Figure 4A** used the Euclidean distance to measure similarities between the phase-shifted LFP trials.
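Grouping the phase-aligned trials by cutting a dendrogram at a fixed height amounts to agglomerative clustering with a distance cutoff. As a rough, NumPy-only stand-in (in practice a library routine such as SciPy's scipy.cluster.hierarchy.linkage and fcluster would be used), here is a minimal single-linkage sketch:

```python
import numpy as np

def cluster_trials(trials, cutoff):
    """Minimal single-linkage agglomerative clustering with a distance
    cutoff (a stand-in for cutting a dendrogram at that height).
    `trials` is an (n_trials, n_samples) array; returns cluster labels."""
    n = len(trials)
    # pairwise Euclidean distances between trials
    diff = trials[:, None, :] - trials[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    labels = np.arange(n)
    # repeatedly merge the two closest distinct clusters
    # until the smallest remaining gap exceeds the cutoff
    while True:
        best, pair = np.inf, None
        for i in range(n):
            for j in range(i + 1, n):
                if labels[i] != labels[j] and dist[i, j] < best:
                    best, pair = dist[i, j], (i, j)
        if pair is None or best > cutoff:
            break
        labels[labels == labels[pair[1]]] = labels[pair[0]]
    return labels
```

The naive O(n³) loop is perfectly adequate for the 100 trials per animal considered here.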

The dendrogram could be used, for example, to separate groups of trials based on an arbitrary selection of the cutoff distance along the dendrogram's trees. For example, by selecting a cluster distance larger than 40 (see **Figure 4A**), all 100 trials belong to just one group. As already discussed, lumping all trials in one group may inadvertently lump together low-dimensional attractors with data affected by various equipment malfunctions. Such an approach would make the task of identifying any phase space attractor to which all trajectories remain close at all times more computationally intensive. By decreasing the cluster distance threshold, we could form two or more groups. In the following, we used a cutoff cluster distance close to 20 and obtained five dendrogram-based groups (see the shaded rectangles in **Figure 4A**). The plots of the LFPs for each of the first three groups (**Figure 4B**) show quite similar waveforms, which differ markedly from those of the last two groups of the dendrogram (see **Figure 4C**). Therefore, it may be easier to visually identify an attractor (if one exists) by looking at the reconstructed attractors of individual trials from the same group, for example by comparing traces from group 1 against each other (see **Figure 7B1**). The same is true when comparing trials from group 5 against each other (see **Figure 7B5**). It is unlikely that we would be able to find any trials from group 1 that remain close to any trials from group 5, a fact that we learned during the data preprocessing stage using dendrograms.

FIGURE 3 | … correlation between trials without phase resetting correction (red line), to 8.5 ± 2.1 after phase-shifting all trials to correct for phase resetting (blue line), to 6.9 ± 1.8 for phase-shifted, dendrogram-based correlation (green solid circles).

The same numerical procedure was applied to all data from six animals of which we only show one detailed example.

# 4. Delay Embedding Method

Given the complexity of a single pyramidal neuron and the intricacy of synaptic coupling in the mPFC (Schnitzler and Gross, 2005), we would expect a rather high-dimensional delay embedding for our LFP recordings.

In electrophysiology, we record the membrane potential time series, which is just one of many independent variables required for a full characterization of neural network activity. Even though we have direct access to only one variable of the d−dimensional dynamical system, i.e., the light-activated local network, it is still possible to faithfully recover, or reconstruct, the phase space dynamics through the delay embedding method (Abarbanel, 1996; Kantz and Schreiber, 1997; Schuster and Just, 2005; Kralemann et al., 2008). For a time series x<sub>i</sub> = x(iΔt), with i = 1, 2, . . . , N, where N is the number of data points and Δt is the (uniform) sampling time, a d−dimensional embedding vector is defined as

$$\mathbf{x}\_{i} = (x\_{i}, x\_{i+n}, \dots, x\_{i+(d-1)n}),$$

where τ = nΔt is the delay, or lag, time (Packard et al., 1980; Takens, 1981).
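In code, building the delay vectors from a scalar series is a one-liner per coordinate; a minimal sketch:

```python
import numpy as np

def delay_embed(x, dim, lag):
    """Delay-embed the 1-D series `x` into `dim` dimensions with
    integer delay `lag` (in samples): row i is
    (x[i], x[i + lag], ..., x[i + (dim - 1) * lag])."""
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[k * lag : k * lag + n] for k in range(dim)])
```

Each row is one reconstructed phase space point; note that the number of points shrinks by (dim − 1) · lag relative to the original series.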

FIGURE 4 | … similar (B). The average LFPs for the last two groups are also very similar (C), but quite different from the previous three groups. The resulting rms error of a trial with respect to its corresponding group average is definitely an improvement over the simple averaging of all trials (see Figure 3D). Even though the group average could be pretty close to capturing features of individual LFP trials from the respective group, the delay time for phase space reconstruction has a wide range of values (D) and there is no obvious group correlation.

Two parameters are essential for a correct delay embedding reconstruction of the phase space: the lag time τ and the embedding dimension d<sub>E</sub>. The delay, or lag, time τ is the time interval between successive components of the embedded vector. Although we assumed that the same delay time applies to each component of the embedded vector, the delay embedding method also allows for different delays along different directions of the phase space (Vlachos and Kugiumtzis, 2010).

# 4.1. Lag Time

The quality of the phase space reconstruction is affected, among other factors, by the amount of noise, the length of the time series, and the choice of the delay time. For example, a too small delay time τ leads to embedded vectors with highly correlated, or indistinguishable, components. Geometrically, this means that all trajectories lie near the diagonal of the embedding space and the attractor has a dimension close to one irrespective of its complexity. To avoid such redundancy, the delay time τ should be large enough to make the components of the embedded vector independent of each other. However, a too large delay time completely de-correlates the components of the embedded vector. Geometrically, this means that phase space points fill the entire embedding space randomly and the attractor has a dimension close to the embedding space dimension. Although there is no universal method for selecting the "right" delay time, in practice a few different approaches are used together to avoid both the redundancy due to a too short delay time and the irrelevance due to a too large delay time (Casdagli et al., 1991).

One of the methods often used for estimating the lag time τ is the autocorrelation of the time series. Although researchers agree that the autocorrelation can provide a good estimate of the time lag, there is no consensus regarding the specifics. For example, Zeng et al. (1991) considered that τ is the time at which the autocorrelation decays to e<sup>−1</sup>, Schiff and Chang (1992) considered the first time the autocorrelation is not significantly different from zero, Schuster (Schuster and Just, 2005) suggested using the first zero of the autocorrelation function to ensure linear independence of the coordinates, King et al. (1987) considered the time of the first inflection of the autocorrelation, and Holzfuss and Mayer-Kress (1986) considered the first time the autocorrelation reaches a minimum.

In addition to autocorrelation, Fraser and Swinney (1986) suggested using the first local minimum of the average mutual information (AMI) to estimate the time lag. Their method measures the mutual dependence between x<sub>i</sub> and x<sub>i+n</sub> with variable lag time nΔt (see also Kantz and Schreiber, 1997; Hegger et al., 1999).
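Both lag-time heuristics are straightforward to estimate numerically. The sketch below finds the first zero crossing of the autocorrelation and the first local minimum of a simple histogram-based AMI estimate; the bin count and the exact AMI estimator are our choices, not the paper's (Tisean's mutual uses its own binning):

```python
import numpy as np

def first_zero_autocorr(x):
    """Lag (in samples) of the first zero crossing of the autocorrelation."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf = acf / acf[0]
    for lag in range(1, len(acf)):
        if acf[lag] <= 0.0:
            return lag
    return None

def first_min_ami(x, max_lag, bins=16):
    """Lag of the first local minimum of the average mutual information,
    estimated from a joint histogram (cf. Fraser & Swinney, 1986)."""
    def ami(lag):
        joint, _, _ = np.histogram2d(x[:-lag], x[lag:], bins=bins)
        p = joint / joint.sum()
        px = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        nz = p > 0
        return (p[nz] * np.log(p[nz] / (px @ py)[nz])).sum()
    vals = [ami(lag) for lag in range(1, max_lag + 1)]
    for k in range(1, len(vals) - 1):
        if vals[k] < vals[k - 1] and vals[k] <= vals[k + 1]:
            return k + 1  # vals[k] corresponds to lag k + 1
    return None
```

For a quasi-sinusoidal series both estimates typically land near a quarter of the dominant period, which is why the two methods usually agree to within a few samples.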

Additionally, the total time spanned by each embedded vector (Broomhead and King, 1986), i.e., t<sub>w</sub> = (d − 1)τ, is a significant measure of the potential crossover between temporal correlations that could induce spurious spatial, or geometrical, correlations between phase space points (Theiler, 1990).

# 4.2. Embedding Dimension

The embedding dimension was selected based on Takens's theorem (Takens, 1981), which ensures a faithful reconstruction of a d−dimensional attractor in an embedding space with at most 2d + 1 dimensions. For a dissipative system, the Hausdorff dimension can be estimated from a time series and used as the dimension of the attractor (Holzfuss and Mayer-Kress, 1986; Kennel et al., 1992; Provenzale et al., 1992). Good estimators of the Hausdorff dimension are the correlation dimension (Grassberger and Procaccia, 1983) or the Lyapunov dimension (Kaplan and Yorke, 1979). Once the range (d ≤ d<sub>E</sub> ≤ 2d + 1) of embedding dimensions is known, additional tests can determine the optimum embedding dimension d<sub>E</sub>.

Kennel et al. (1992) introduced the false nearest neighbors (FNN) procedure to obtain the optimum embedding dimension (see also Hegger et al., 1999; Sen et al., 2007). The idea behind the FNN approach is to estimate the number of points in the neighborhood of every given point for a fixed embedding dimension. High-dimensional attractors projected onto a too low-dimensional embedding space show a significant number of false neighbors, i.e., phase space points that look close to each other although in the true attractor space they are far apart. The FNN method compares the Euclidean distance R<sub>d</sub> between two neighbors x<sub>i</sub> and x<sub>j</sub> computed in a d−dimensional space against the distance R<sub>d+1</sub> in a (d + 1)−dimensional embedding space (Kennel et al., 1992). If the ratio of relative distances between neighbors in the two embedding spaces, i.e.,

$$
r\_f = \sqrt{\frac{R\_{d+1}^2 - R\_d^2}{R\_d^2}},
$$

is larger than a predefined value f, then the two points x<sub>i</sub> and x<sub>j</sub> are false neighbors, i.e., the points are neighbors because of a too low-dimensional projection and not because of the true dynamics. The threshold f is usually set between 1.5 and 15 (Kennel et al., 1992; Abarbanel, 1996; Kantz and Schreiber, 1997). Additionally, if the distance R<sub>d+1</sub> is larger than the coefficient of variation σ/x̄ of the data, then the two points are false neighbors. The reason is that σ is a measure of the size of the attractor, and two points that are false neighbors will indeed be stretched to the extremities of the attractor in dimension d + 1. Abarbanel (1996) found that for many nonlinear systems the value of f approaches 15, but the range is quite wide, from 9 to 17 (Konstantinou, 2002). By successively computing the fraction of FNNs in different embedding dimensions, it is possible to estimate an optimum embedding. Some algorithms that take into account the temporal window t<sub>w</sub> = (d − 1)τ spanned by the embedded vectors allow simultaneous estimation of both the embedding dimension and the lag time (see Stefánsson et al., 1997).
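A minimal sketch of the first Kennel criterion alone (the distance-ratio test, without the attractor-size test or a Theiler window); the implementation choices are ours:

```python
import numpy as np

def fnn_fraction(x, dim, lag, f=10.0):
    """Fraction of false nearest neighbors when going from embedding
    dimension `dim` to `dim + 1` (first criterion of Kennel et al., 1992)."""
    n = len(x) - dim * lag              # points that survive in dim + 1
    emb = np.column_stack([x[k * lag : k * lag + n] for k in range(dim)])
    extra = x[dim * lag : dim * lag + n]    # the (dim + 1)-th coordinate
    false = 0
    for i in range(n):
        d = np.linalg.norm(emb - emb[i], axis=1)
        d[i] = np.inf                   # exclude the point itself
        j = int(np.argmin(d))           # nearest neighbor in dimension dim
        r_d = d[j]
        if r_d == 0.0:
            continue
        # sqrt((R_{d+1}^2 - R_d^2) / R_d^2) reduces to |Δ extra| / R_d
        if abs(extra[i] - extra[j]) / r_d > f:
            false += 1
    return false / n
```

Sweeping dim upward, the optimum embedding dimension is taken where this fraction first drops to (near) zero.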

# 5. Results

## 5.1. Experimental Data

Since we were interested in uncovering any possible attractor of the phase space trajectories, we only considered the last 1.5 s of each 2 s long recording. We first performed a phase shift of every 1.5 s long LFP recording to correct for the phase resetting due to the light stimulus (see **Figure 3B** for two similar-looking LFP traces that were phase-shifted with respect to each other to maximize the correlation coefficient and correct for the phase resetting effect).

#### 5.2. Lag Time

As described in Section 4.1, we used two different approaches to estimating the lag time τ: (1) the autocorrelation function (Casdagli et al., 1991), and (2) the AMI method (Fraser and Swinney, 1986). The first zero crossing of the autocorrelation function is the time τ beyond which x(t + τ) is completely de-correlated from x(t). However, the first zero crossing of the autocorrelation function takes into account only linear correlations in the data (Abarbanel, 1996). The first minimum of the nonlinear autocorrelation function called average mutual information (AMI) (Fraser and Swinney, 1986) is considered a more suitable choice, since this is the time when x(t + τ) adds maximum information to the knowledge we have from x(t) (Kantz and Schreiber, 1997). In most practical applications the two methods are used together, and they usually give similar estimates of the lag time.

We computed the lag times for individual trials (see **Figure 4D** for the distribution of all lag times for animal # 1) and also for the group averages (see **Table 1**). Although only the autocorrelation-based lag times are shown in **Figure 4D** and **Table 1**, the AMI-based lag time values (not shown) were within 10% of those obtained with the autocorrelation.



In **Table 1**, the second column (called Avg.) and the third column (called Std.) represent the average and the standard deviation, respectively, of the corresponding lag time distributions, such as the one shown in **Figure 4D** for animal # 1. The remaining columns in **Table 1** represent the lag times of the dendrogram-based group averages.

For example, for the first animal, the first zero crossing of the autocorrelation function for the dendrogram-based average LFP of group 1 is around τ ≈ 1500Δt (see **Figure 5A**), whereas the first minimum of the AMI is around τ ≈ 2000Δt (see **Figure 5B**).

Our data were stored as single-column text files representing the LFP recordings with a sampling time of Δt = 10<sup>−4</sup> s. The Tisean command for estimating the lag time from the autocorrelation function was autocor dataFile.txt -p -o, where the option -p specified periodic continuation of the data and -o specified that the output be written to a file named dataFile.txt.co, which is plotted in **Figure 5A**.

The Tisean command for estimating the lag time from the AMI was mutual dataFile.txt -D10000 -o, where the option -D10000 specified the range of lag times for which the AMI was computed and stored in the file dataFile.txt.mut, which is plotted in **Figure 5B**.

### 5.3. Embedding Dimension

The method of false nearest neighbors (FNN) estimates the embedding dimension d<sub>E</sub> by repeatedly increasing the embedding dimension until the orbits of the phase space flow no longer intersect or overlap with each other. We used a lag time τ = 2200Δt and estimated the embedding dimension using the FNN method with ratios f between 2 and 20 (see **Figure 6A**). As expected, for large ratios of distances, e.g., f > 7, the percentage of FNNs drops to almost zero for an embedding dimension d<sub>E</sub> = 3.

The actual Tisean routine used was false\_nearest dataFile.txt -f2 -d2200 -o, which calculated the percentage of FNNs with a ratio f ≥ 2 and a lag time of 2200Δt, with the default phase space dimensions from 1 to 5 (see **Figure 6A**). **Figure 6A** clearly indicates that an embedding dimension d<sub>E</sub> = 3 is sufficient.

FIGURE 5 | Time lag estimation. The first zero crossing of the autocorrelation function is around τ ≈ 1500Δt (A) and the first minimum of the average mutual information is around τ ≈ 2000Δt (B), with Δt = 10<sup>−4</sup> s.

FIGURE 6 | Percentage of false nearest neighbors. (A) For a too small ratio f < 7 of distances between neighboring points in different embedding dimensions, the percentage of false nearest neighbors is high and only drops near zero for very large embedding dimensions. For larger ratios f > 7, all percentages drop to almost zero false nearest neighbors at an embedding dimension of d<sub>E</sub> = 3. This suggests that an optimum ratio is above f = 7, in agreement with results from others (Abarbanel, 1996; Konstantinou, 2002). (B) To avoid spurious spatial correlations due to the inherent temporal correlation between too closely spaced points in a time series, the percentage of FNN was estimated with a variable Theiler window (t). (C) The percentage of FNN is also a good discriminating statistic. For the third group of data from the first animal, the logarithmic plot shows that the percentage of FNN for the original data (solid squares) is always smaller than for any of the 100 surrogates. Only the envelopes of the minimum (solid circles) and maximum (solid triangles) values of FNN are shown.

Any estimate of dimension, especially when it is based on correlation among data points, assumes that pairs of points are drawn randomly and independently according to the scale invariant measure of the attractor. However, points occurring close in time are not independent and lead to spuriously low estimates of embedding dimension. To avoid this issue, points closer than some minimum time (called the Theiler window) can be excluded from calculations (Grassberger, 1987; Theiler, 1990). Heuristic examples of estimates of Theiler window are three times the correlation time (Heath, 2000), (d − 1)τ , or other ad hoc values based on space-time separation plots (Provenzale et al., 1992).

In our estimation of the embedding dimension with the FNN method, we also tested a wide range of Theiler windows, from 100 to 8000 sampling times (**Figure 6B**), in order to make sure that no spurious temporal correlation among data points led us to a too low estimate of the embedding dimension. All plots of the fraction of FNNs indicated that d<sub>E</sub> = 3 is still a good choice of the embedding dimension. The actual Tisean routine was false\_nearest dataFile.txt -f20 -d2200 -t100 -o, which calculates the percentage of FNNs with a ratio greater than f = 20, a lag time of 2200Δt, and a Theiler window (-t) of 100Δt for all embedding dimensions from 1 to 5 (see **Figure 6B**).

The attractors were reconstructed (see **Figure 7**) using the time lag τ and embedding dimension d<sup>E</sup> determined above. As seen from **Figures 7A1–A5**, the dendrogram-based preprocessing separated the LFP waveforms quite well into "similar" groups, such that randomly selected LFPs from the same group remained close to each other at all times (see red and green traces in **Figures 7B1–B5**). The reconstruction of individual trials was performed with their corresponding delay (lag) times (see **Figure 4D** for the distribution of all delay times for the first animal). We also showed the reconstructed group average (blue thick trace in **Figures 7B1–B5**), not because it represents the "true" attractor, but rather as a visual cue to help us gauge whether the phase space trajectories of the individual trials remained close to each other at all times. As expected from the dendrogram-based preprocessing, the first three groups gave very similar reconstructed attractors. The shape of the attractors from the first three groups could be roughly described as a continuous circular loop twisted into an "8"-shaped object (see **Figures 7B1–B3**). Since the group average (blue thick line) is less noisy than the individual trials (red and green lines), it serves as a visual aid toward identifying the shape of the attractor suggested by the individual trials. The shape of the first group's attractor (**Figures 7B1–B3**) could be viewed as an "8"-shaped loop bent around its midpoint (see also Supplementary Materials Video). However, by increasing the lag time, the "8"-shaped attractor can be "untangled" such that the two loops look more like the circles shown in **Figures 7B2,B3**. For example, in **Figure 8** we showed two examples of the same trials (red and green lines) together with their corresponding group average (thick black trace), reconstructed in the three dimensional phase space using different delay times.
In **Figure 8A**, for τ = 1900, we clearly notice the twisted "8"-shaped attractor, which appears untangled in **Figure 8B** for a delay time of τ = 2200. Therefore, all attractors in **Figures 7B1–B3** are topologically identical (up to some microscale details), since any of them could be morphed into another by a (circular) phase shift. Furthermore, a close inspection of the fourth group's attractor shows that it is close to the previous three and quite different from the fifth attractor.

The detailed procedure described above was also applied to the other five data sets from different animals. The results are summarized in **Figures 9**–**13**. For all six animals that were retained and analyzed, the zero crossings of the autocorrelation and the minimum of the AMI gave consistent lag time estimations (see **Table 1**).

We found that for all six animals the optimum delay embedding dimension was d<sup>E</sup> = 3. We found topologically identical attractors in the first four LFP dendrogram-based groups for animal #1 (see **Figures 7B1–B4**), which covered 90% of the recordings. The attractor is "8"-shaped and is topologically equivalent (after appropriate phase shifting) to an "untangled" attractor (see **Figure 8**).

For animal #2, all attractors belonged to the same "8"-shaped class or its topologically identical counterparts (see **Figures 9B1–B5**), although the fifth group presented very large variability.

For animal #3, there were three topologically identical dendrogram-based LFP groups that gave an "8"-shaped attractor (see **Figures 10B1–B3**), which covered 84% of recordings.

For animal #4, all attractors were topologically identical and belonged to the "8"-shaped class (see **Figures 11B1–B4**), although the fourth group presented very large variability.

For animal #5, there were again three topologically identical dendrogram-based LFP groups that belonged to the "8"-shaped attractor (see **Figures 12B1–B3**), which covered 74% of recordings.

For animal #6, there were two topologically identical dendrogram-based LFP groups that belonged to the "8"-shaped attractor (see **Figures 13B1,B2**), which covered 34% of recordings.

An important characteristic of the attractors not included in the above category of "8"-shaped attractors or their topological equivalents is that all of them showed relatively low amplitude oscillations of the LFP. For example, while the peak-to-peak amplitude of LFP oscillations for the four topologically equivalent attractors shown in **Figures 7B1–B4** was between −0.15 and +0.25 arb. units, the amplitude of the LFP for the last group was between −0.075 and 0.075 arb. units, a decrease by a factor of 2.6. Similarly, for animal #3, the range of the LFP for the "8"-shaped attractor and its topological equivalents (see **Figures 10B1–B3**) was between −0.4 and +0.7 arb. units, whereas for the only dissimilar group the LFP amplitude was between −0.1 and +0.1, a decrease in LFP amplitude by a factor of 5.5. For animal #5 the LFP amplitude decreased only by a factor of 1.5, and for animal #6 the factor was 2.5. One possible explanation could be an intermittent malfunction of the laser's trigger. The dendrogram method helped us automatically sort the data set into "similar" groups before performing a delay embedding. As a result, we decreased the computational time by eliminating pair comparisons of all reconstructed attractors to determine which trials remain close to each other.

# 6. Discussion

Accurate quantification of the dynamic structure of LFPs can provide insight into the characteristics of the underlying neurophysiological processes that generated the data. In the present study, we first determined that nonlinearity is present in our LFP data using the surrogates method and two different discriminating statistics: (1) time reversal asymmetry, and (2) percentage of FNN. Time reversal asymmetry is a robust method for detecting irreversibility, which implies nonlinearity, even in the presence of a large amount of noise in the time series (Diks et al., 1995). The time reversal asymmetry statistic revealed clear differences between the original data and the surrogates, with the exception of one group of data out of five for the first animal. For each of the six animals we had one group of original data for which we could not reject, at a significance level of 5%, the null hypothesis that the time series could be produced by linearly filtered noise (Stam et al., 1998).
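A commonly used form of the time reversal asymmetry statistic is the normalized third moment of lagged differences, which vanishes (up to sampling error) for time-reversible processes such as linearly filtered Gaussian noise. The sketch below is illustrative only, with invented toy signals, and is not the exact statistic or data used in the paper:

```python
import numpy as np

def trev(x, lag=1):
    """Time reversal asymmetry: <(x_{t+lag} - x_t)^3> / <(x_{t+lag} - x_t)^2>^(3/2).
    Zero (up to sampling error) for time-reversible processes."""
    d = x[lag:] - x[:-lag]
    return np.mean(d**3) / np.mean(d**2) ** 1.5

rng = np.random.default_rng(1)
# Linear Gaussian AR(1) process: statistically time-reversible.
ar = np.zeros(50000)
for t in range(1, ar.size):
    ar[t] = 0.9 * ar[t - 1] + rng.standard_normal()
# Sawtooth-like wave (slow rise, fast fall): strongly irreversible.
saw = np.tile(np.r_[np.linspace(0.0, 1.0, 20), np.linspace(1.0, 0.05, 4)], 500)
print(abs(trev(ar)), abs(trev(saw)))  # small vs. order one
```

In a surrogate test, the statistic of the original series is compared against the distribution of the statistic over many phase-randomized surrogates, as done in the study.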

We also performed an FNN-based nonlinearity test and found that for all LFPs the percentage of FNN is always smaller for the original data trials than for any of their surrogates. For example, any of the individual trials from the group of data for which we could not reject the null hypothesis based on the time reversal asymmetry criterion had a smaller percentage of FNN than any of its 100 surrogates (see **Figure 6C**). As a result, we concluded that nonlinearity is likely present in all our data sets.

We performed two important data preprocessing steps that helped us reduce the computational time required for attractor identification: (1) phase shifting the LFPs to correct for the phase resetting induced by the light stimulus, and (2) grouping the shifted LFPs into similar patterns of activity using a dendrogram (see **Figure 4A**).

Since the light stimulus was applied every 2 s, it encountered the rhythmic LFP activity at different phases. As a result, it produced significantly different permanent phase shifts of the LFPs from trial to trial (see the two out-of-phase red and blue LFP recordings in **Figure 2A**). We determined the amount of phase resetting by circularly shifting the recordings (for example, compare the out-of-phase traces in **Figure 3A** against the better overlap of LFPs in **Figure 3B**). Phase resetting in neural networks is of paramount importance for large-scale neural network synchronization. For example, in deep brain stimulation (DBS) procedures an electrical pulse is applied through an electrode to a brain region with the purpose of disrupting synchronous activity, e.g., during epileptic seizures (Varela et al., 2001; Tass, 2003; Greenberg et al., 2010). For this purpose, stimuli are carefully designed with appropriate amplitude and duration and are precisely delivered during DBS procedures (Tass, 2003; Greenberg et al., 2010). Such procedures are based on precise measurements of phase resetting. Although we did not use electrical stimuli as in DBS, we also produced large phase resetting in the background activity of the mPFC. Using correlation maximization criteria, we were able to estimate quantitatively the amount of phase resetting. To our knowledge, phase shifting LFPs to maximize their pair correlation had not previously been used in the context of measuring the amount of phase resetting in optogenetic experiments.
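The correlation maximization step can be sketched as a search over all circular shifts at once via FFT-based circular cross-correlation; the toy signals and parameters below are invented for illustration and are not the authors' data or code:

```python
import numpy as np

def best_circular_shift(ref, sig):
    """Return the circular shift s that maximizes the correlation between
    ref and np.roll(sig, s), using FFT-based circular cross-correlation."""
    a = ref - ref.mean()
    b = sig - sig.mean()
    # ifft(fft(a) * conj(fft(b)))[s] = sum_t a[t] * b[t - s]
    xcorr = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real
    return int(np.argmax(xcorr))

# Toy "LFP" pair: `sig` is `ref` advanced by 30 samples (a phase reset).
rng = np.random.default_rng(2)
t = np.arange(1000)
ref = np.sin(2 * np.pi * t / 100) + 0.1 * rng.standard_normal(t.size)
sig = np.roll(ref, -30)
print(best_circular_shift(ref, sig))  # → 30
```

The recovered shift is the estimate of the stimulus-induced phase reset; applying `np.roll(sig, s)` realigns the trial with the reference.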

Although dendrogram grouping is not absolutely necessary for attractor identification, it reduced the computational time required for data analysis. For example, for N = 100 trials we would have had to perform N(N + 1)/2 ≈ 5000 pair comparisons to find whether, and which, reconstructed phase space trajectories remained close to each other, thus hinting at a possible attractor. Instead, we only checked whether the individual trials from the same group remained close to each other (see red and green traces in **Figures 7B1–B5**). By analyzing all possible pairs of trials we would have eventually reached the same conclusion, i.e., that the individual trials from group 1 (**Figure 7B1**) do not remain close to the reconstructed trajectories from group 5 (see **Figure 7B5**).

We showed that the recorded LFPs from the mPFC of ChR2-expressing PV+ interneurons could be successfully embedded in a three dimensional space. For this purpose, we presented a detailed analysis of the delay embedding procedure for LFPs in response to a brief 10 ms light pulse. Both the autocorrelation and the AMI gave consistently close estimations of the delay, or lag, time (see **Table 1**). We found that a sufficient embedding dimension was d<sup>E</sup> = 3 for all six animals. The embedding dimension estimation based on the FNN method was stable for a broad range of lag times around the optimally predicted values. We also considered a wide range of values both for the ratio of the distances between neighbors in successively larger phase spaces (parameter f in the FNN routine; see Section 4.2) and for the Theiler window (parameter t in the FNN routine).
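The delay embedding step itself reduces to assembling lagged copies of the series into phase space vectors; a minimal numpy sketch with a toy signal and assumed parameters:

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Phase-space vectors [x(t), x(t+tau), ..., x(t+(dim-1)*tau)] as rows."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

x = np.sin(2 * np.pi * np.arange(5000) / 2000.0)   # toy oscillation
traj = delay_embed(x, dim=3, tau=500)              # quarter-period lag
print(traj.shape)  # → (4000, 3)
```

Each row of `traj` is one reconstructed phase space point; plotting the rows in sequence traces out the reconstructed attractor.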

We found the same "8"-shaped attractor, or its topologically equivalent counterparts after appropriate phase shifting, in all six animals, covering over 80% of the recorded data.

All the other attractors were produced by low-amplitude and higher-frequency oscillations of the LFPs, which led to a more complex structure of the attractor. One possible reason for such a clear separation into two classes of attractors across all animals could be neural network bistability, i.e., depending on the phase of the light stimulus, the network's activity could lead to one attractor (the "8"-shaped one) or to a more complex geometry. Another possible, much simpler, explanation could be that the recording quality was intermittently degraded by unknown factors, such as a laser trigger malfunction. Future LFP recordings are required to test such hypotheses.


Additionally, the low-dimensional attractor that we identified opens the possibility of fitting the experimental data to a three-dimensional model for the purpose of better understanding the dynamics of the network, e.g., through the bootstrap method (Efron, 1982).

# 7. Conclusions

The activity of the medial prefrontal cortex of six optogenetic mice was periodically perturbed with brief laser pulses. The pair correlations between recorded LFPs were enhanced by appropriately phase shifting them to account for the light-induced phase resetting of network activity. The phase space dynamics were reconstructed using the delay embedding method. We found that the reconstructed attractors are three dimensional and have similar shapes across different animals.

# Author Contributions

SO tested data nonlinearity, corrected data for phase resetting using crosscorrelation, computed dendrogram-based statistics, carried out numerical simulations for delay-embedding, and wrote the manuscript. PL contributed to delay-embedding numerical simulations. TT and AL performed the experiments and reviewed the manuscript.

# Funding

SO acknowledges support for this research from NSF-CAREER award IOS 1054914 and MUSC bridge funding (AL).

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom.2015.00125


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Oprisan, Lynn, Tompa and Lavin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A pooling-LiNGAM algorithm for effective connectivity analysis of fMRI data

# *Lele Xu1, Tingting Fan1, Xia Wu1,2,3,4\*, KeWei Chen5, Xiaojuan Guo1, Jiacai Zhang1 and Li Yao1,3,4*

*<sup>1</sup> College of Information Science and Technology, Beijing Normal University, Beijing, China*

*<sup>2</sup> State Key Laboratories of Transducer Technology, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai, China*

*<sup>3</sup> State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China*

*<sup>4</sup> Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China*

*<sup>5</sup> Department of Mathematics and Statistics, Banner Good Samaritan PET Center, Banner Alzheimer's Institute, Arizona State University, Phoenix, AZ, USA*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Antonio Politi, Consiglio Nazionale delle Ricerche, Italy*

*Le Wang, Boston University, USA*

#### *\*Correspondence:*

*Xia Wu, College of Information Science and Technology, Beijing Normal University, No. 19 Xin Jie Kou Wai Da Jie, Beijing, 100875, China*

*e-mail: wuxia@bnu.edu.cn*

The Independent Component Analysis (ICA)-based linear non-Gaussian acyclic model (LiNGAM), an algorithm that can be used to estimate the causal relationship among non-Gaussian distributed data, has potential value for detecting the effective connectivity of human brain areas. Under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) the data have non-Gaussian distributions, LiNGAM can be used to discover the complete causal structure of the data. Previous studies reveal that the algorithm performs well when the number of data points being analyzed is relatively large. However, there are too few data points in most neuroimaging recordings, especially functional magnetic resonance imaging (fMRI), to allow the algorithm to converge. Smith's study speculates that pooling data points across subjects may be useful to address this issue (Smith et al., 2011). Thus, this study focuses on validating Smith's proposal of pooling data points across subjects for use with LiNGAM; this method is named pooling-LiNGAM (pLiNGAM). Using both simulated and real fMRI data, our current study demonstrates the feasibility and efficiency of pLiNGAM for effective connectivity estimation.

**Keywords: effective connectivity, causal structure, group analysis, functional magnetic resonance imaging (fMRI), linear non-Gaussian acyclic model (LiNGAM), pooling-LiNGAM (pLiNGAM)**

# **INTRODUCTION**

Functional connectivity and effective connectivity analyses have been widely used in the neuroimaging communities (Friston, 1994; Biswal et al., 1995; Greicius et al., 2003). Functional connectivity reflects the temporal correlations between spatially remote brain regions (Friston et al., 1993), and effective connectivity evaluates the influence that one brain region exerts on others (Friston, 1994). With the ability to describe the directionality of information transferred within a brain network, effective connectivity has become a hot topic in cognitive neuroscience research.

A variety of analysis methods have been developed for estimating effective connectivity, such as Structural Equation Modeling (McIntosh and Gonzalez-Lima, 1994), Dynamic Causal Modeling (Friston et al., 2003), Granger Causality Mapping (Goebel et al., 2003), and Bayesian Networks (Zheng and Rajapakse, 2006). In a number of functional magnetic resonance imaging (fMRI) effective connectivity studies, the Gaussian assumption is usually made (Geiger and Heckerman, 1994; Bollen, 1998); however, most fMRI data possess non-Gaussian distributions. Structural Equation Modeling and Dynamic Causal Modeling are model-driven methods and may not be suitable for resting-state fMRI data (Heckerman, 2008) or for situations where prior knowledge is insufficient. The Bayesian Network approach is data-driven but requires the data to be Gaussian-distributed (Shachter and Kenley, 1989; Baker et al., 1994; Wu and Lewin, 1994). Granger Causality Mapping uses a vector autoregressive model to estimate the effective connectivity among regions. It is also data-driven and only requires the data to be wide-sense stationary and to have zero mean (Goebel et al., 2003). However, Granger Causality Mapping is sensitive to noise and downsampling, and thus it may generate spurious causality under some circumstances (Geiger and Heckerman, 1994; Chen et al., 2006; Shimizu et al., 2006).

A new method, the linear non-Gaussian acyclic model (LiNGAM) algorithm, was proposed by Shimizu et al. (2006) and suggested to be a promising tool for estimating the causal relationship among non-Gaussian distributed data. The fundamental difference of LiNGAM from most classical effective connectivity methods is the assumption of non-Gaussian distributions. The LiNGAM algorithm utilizes higher-order distributional statistics [via Independent Component Analysis (ICA)] to estimate causal relations (Shimizu et al., 2006). The algorithm is data-driven and relies on the following assumptions: (a) the data generating process is linear, (b) no unobserved confounders are present, and (c) the disturbance variables follow non-Gaussian distributions. In a linear, non-Gaussian setting, LiNGAM can estimate the full causal model without undetermined parameters (Shimizu and Kano, 2008), whereas methods assuming Gaussian data need more information to work, such as the causal ordering of the variables (Shimizu et al., 2006).

The LiNGAM algorithm performs more stably on simulated data with more data points, e.g., ≥1000 data points (Smith et al., 2011). However, the number of data points is fairly small (usually no more than 300) in most fMRI experiments. One viable strategy to address this issue is to pool data points across subjects; in this way, a larger number of data points can be submitted to the LiNGAM algorithm. In this study, this method is called pooling-LiNGAM (pLiNGAM), and the pooled subject is termed the virtual subject (V-subject).

The pooling of data points from multiple subjects belongs to the family of group analysis methods. There are mainly three categories of group analysis techniques: the "virtual-typical-subject" (VTS) method, the "individual-structure" (IS) method, and the "common-structure" (CS) method. The VTS method assumes that every subject within a group performs the same function and has the same connectivity network, and it does not consider inter-subject variability (Li et al., 2008). The IS method learns a network for each subject separately and then performs group analysis on the individually learned networks (Goncalves et al., 2001; Li et al., 2007). It considers inter-subject variability but may not integrate group data tightly enough (Li et al., 2008). The CS method imposes the same network structure on each subject while allowing different parameters across subjects (Mechelli et al., 2002; Kim et al., 2007). It considers group similarity at the structural level and inter-subject variability at the parameter level (Li et al., 2008). Each technique has its own advantages. Specifically, the VTS approach fits the data when inter-subject variability is assumed minimal, for example in healthy subjects; the IS approach fits data with large inter-subject variability, such as patients with widely ranging clinical scores; and the CS approach suits cases in between (Li et al., 2008). The pLiNGAM used in this paper belongs to the VTS technique; thus our current study only considered the case where inter-subject variability is low, such as a group of healthy subjects.

In this paper, we aimed to demonstrate the feasibility of pLiNGAM for the estimation of effective connectivity by pooling data points across subjects. First, in order to examine the validity of pLiNGAM, the simulated fMRI data described in Smith's study (Smith et al., 2011) were adopted. Then, to verify the practicability of the pLiNGAM algorithm, real fMRI data were further used.

#### **MATERIALS AND METHODS**

#### **METHODS**

In this section, the original LiNGAM theory and the proposed pLiNGAM theory will be introduced.

#### *LiNGAM theory*

The LiNGAM algorithm has the following properties:

(a) Suppose the observed variables *xi* (*i* ∈ {1,. . . , *m*}) can be arranged in their causal order *k*(*i*). For example, as in Gaussian Bayesian theory, if there are two observed variables *x* and *y* and *x* is the parent node of *y*, then the causal orders of *x* and *y* satisfy the relation *k(x) < k(y)*. The generating process of the variables *xi* is recursive (Shimizu and Kano, 2008) and can be represented graphically by a directed acyclic graph (Pearl, 2000; Spirtes et al., 2000).

(b) Each variable *xi* is a linear function of the preceding/parent variables, a "disturbance" term *ei*, and an optional constant term *ci*, that is

$$\mathbf{x}\_{i} = \sum\_{k(j) < k(i)} b\_{ij} \mathbf{x}\_{j} + e\_{i} + c\_{i} \tag{1}$$

where *bij* is the weight coefficient, *k*(*i*) is the causal order for each variable.

(c) The disturbances *ei* are non-Gaussian distributions, non-zero variances, and independent of each other.

After subtracting the mean from each variable *xi* and rewriting the equation in a matrix form, the following equation can be obtained:

$$\mathbf{x} = B\mathbf{x} + \mathbf{e} \tag{2}$$

where *x* is data vector containing the component *xi*, *B* is the weight coefficients matrix and can be permuted to a strict lower triangular matrix if the causal ordering of variables is known (strict lower triangular matrix is defined as the lower triangular matrix with all zeros on the diagonal) and *e* is a disturbance term. Then, we can have:

$$\mathbf{x} = A\mathbf{e} \tag{3}$$

where *A* = (*I* − *B*)<sup>−1</sup>. Matrix *A* can be permuted to lower triangular form (with all diagonal elements non-zero). For Equation (3), the independence and non-Gaussianity of *e* define the special ICA model.
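A quick numerical check of Equations (2) and (3): with a strictly lower triangular *B*, samples generated as x = Ae also satisfy x = Bx + e. The coefficients below are toy values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
# Strictly lower triangular B encodes the acyclic ordering x1 -> x2 -> x3.
B = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.3, -0.5, 0.0]])
e = rng.uniform(-1.0, 1.0, size=(3, 10000))   # non-Gaussian disturbances
A = np.linalg.inv(np.eye(3) - B)              # A = (I - B)^{-1}
x = A @ e                                     # Equation (3)
print(np.allclose(x, B @ x + e))              # Equation (2) holds: → True
```

Note that *A* here is lower triangular with unit diagonal, matching the structure that ICA must recover up to permutation and scaling.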

ICA is commonly used to discover hidden sources from a set of observed data when the sources are non-Gaussian and maximally independent. In this algorithm, FastICA (Hyvärinen and Oja, 1997) is chosen to estimate the sources *e* and the weight coefficient matrix *B*. However, there are two essential indeterminacies that ICA cannot solve: the order of the independent components and the scaling of the independent component amplitudes (Comon, 1994). In the LiNGAM algorithm, the first indeterminacy is solved by reordering the components following the rule that matrix *B* is a strict lower triangular matrix. If the results cannot be reordered to lower triangular form, approaches exist to set the upper triangular elements to zero while changing the matrix as little as possible (Goebel et al., 2003). The second indeterminacy is usually handled by fixing the weights of the corresponding observed variables to unity. To assess the significance of the connectivity estimated by the LiNGAM algorithm, three statistical tests are usually performed to prune the edges of the estimated network: (a) the Wald test, testing the significance of *bij*; (b) the chi-square test, examining the overall fit of the model assumptions; and (c) the difference chi-square test, comparing nested models (Shimizu et al., 2006).
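The role of non-Gaussianity can be illustrated with a pairwise toy example. This is a simplified residual-independence check, not the full ICA-LiNGAM procedure: in the correct causal direction the regression residual is independent of the cause, so its covariance with a higher-order moment of the cause vanishes, while in the reverse direction it does not. All names and values are invented:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000
x = rng.uniform(-1.0, 1.0, n)    # non-Gaussian cause
e = rng.uniform(-1.0, 1.0, n)    # non-Gaussian disturbance, independent of x
y = 0.8 * x + e                  # true model: x -> y

def dependence(cause, effect):
    """|cov(cause^3, residual)| after regressing effect on cause.
    Near zero iff the residual is independent of the putative cause."""
    b = np.cov(cause, effect)[0, 1] / np.var(cause)
    resid = effect - b * cause
    c = cause - cause.mean()
    return abs(np.mean(c**3 * (resid - resid.mean())))

print(dependence(x, y), dependence(y, x))  # correct direction scores lower
```

With Gaussian variables both directions would score near zero, which is precisely why LiNGAM requires non-Gaussian disturbances to identify the causal ordering.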

#### *pooling-LiNGAM (pLiNGAM) theory*

To avoid subject fatigue and ensure data quality, researchers often conduct relatively short fMRI experiments. The length of data acquisition in these experiments is usually limited, e.g., to 480 s (8 min), which may result in unstable LiNGAM results. To address this issue, the pLiNGAM approach of pooling data over multiple subjects is proposed (Smith et al., 2011).

In this method, a sufficiently long series of fMRI data points is obtained for an artificial subject, referred to as the "V-subject," by pooling several single subjects. As a V-subject is constructed from more than one single subject, it is assumed that the inter-subject variability can be ignored. Here we provide the formulated form of the extended LiNGAM, i.e., pLiNGAM. Suppose there are *n* subjects; then each variable *x* = (*xi*1*, xi*2*,..., xin*) (*i* ∈ {1*,..., m*}) is a linear function of the preceding/parent variables, a "disturbance" term *e* = (*ei*1*,ei*2*,...,ein*), and an optional constant term *c* = (*ci*1*,ci*2*,...,cin*), that is

$$(\mathbf{x}\_{i1}, \mathbf{x}\_{i2}, \dots, \mathbf{x}\_{in}) = \sum\_{k(j) < k(i)} b'\_{ij}(\mathbf{x}\_{j1}, \mathbf{x}\_{j2}, \dots, \mathbf{x}\_{jn}) + (e\_{i1}, e\_{i2}, \dots, e\_{in}) + (c\_{i1}, c\_{i2}, \dots, c\_{in}) \tag{4}$$

where *b*′*ij* is the weight coefficient, *k*(*i*) is the causal order, and the disturbances *e* = (*ei*1*, ei*2*,...,ein*) have non-Gaussian distributions and non-zero variances and are independent of each other.

Then the mean is subtracted from each variable *x* = (*xi*1*, xi*2*,..., xin*), and the equation can be rewritten in matrix form as:

$$
\begin{bmatrix}
\mathbf{x}\_{11} & \mathbf{x}\_{12} & \cdots & \mathbf{x}\_{1n} \\
\vdots & \vdots & & \vdots \\
\mathbf{x}\_{m1} & \mathbf{x}\_{m2} & \cdots & \mathbf{x}\_{mn}
\end{bmatrix} = \begin{bmatrix}
b\_{11} & b\_{12} & \cdots & b\_{1m} \\
\vdots & \vdots & & \vdots \\
b\_{m1} & b\_{m2} & \cdots & b\_{mm}
\end{bmatrix} \begin{bmatrix}
\mathbf{x}\_{11} & \mathbf{x}\_{12} & \cdots & \mathbf{x}\_{1n} \\
\vdots & \vdots & & \vdots \\
\mathbf{x}\_{m1} & \mathbf{x}\_{m2} & \cdots & \mathbf{x}\_{mn}
\end{bmatrix} + \begin{bmatrix}
\mathbf{e}\_{11} & \mathbf{e}\_{12} & \cdots & \mathbf{e}\_{1n} \\
\vdots & \vdots & & \vdots \\
\mathbf{e}\_{m1} & \mathbf{e}\_{m2} & \cdots & \mathbf{e}\_{mn}
\end{bmatrix} \tag{5}
$$

Abbreviating the matrices, Equation (5) can be expressed as:

$$\mathbf{x}' = B'\mathbf{x}' + e'\tag{6}$$

where *x*′ denotes the variable matrix and *B*′ is the weight coefficient matrix, which can be permuted to a strict lower triangular matrix according to the causal ordering of the variables. Equation (6) then has the same form as Equation (2).

Based on Equation (6), we can also obtain Equation (7), which defines the special ICA model:

$$\mathbf{x}' = A'\mathbf{e}'\tag{7}$$

where *A*′ = (*I* − *B*′)<sup>−1</sup>.

pLiNGAM based on V-subjects consists of the following steps:

(1) Generate V-subjects. First, randomly select *m* (1 ≤ *m* ≤ *n*) subjects (the length of a single subject's series is *Ls*) from the total *n* subjects. Then, the *m* subjects' data are pooled into one V-subject in a random order. The length of each V-subject is therefore *Lm* = *m* × *Ls*. **Figure 1** illustrates the procedure.

**FIGURE 1 |** Subject 1 *... n* stands for the total *n* single subjects, which have few data points. Subject *j*<sup>1</sup> *... jm* stands for the *m* subjects selected from the total *n* subjects. The V-subject is the pooled data of the *m* selected subjects in a random order.

(2) Apply the LiNGAM algorithm to the V-subjects. Default parameters of the ICA-LiNGAM algorithm are used, except that the "*skew*" nonlinearity is used instead of "*tanh*" because the "*skew*" nonlinearity gives better results (Smith et al., 2011).

The error of the pLiNGAM algorithm is measured by the false positive ratio (FPR), the false negative ratio (FNR), the false direction ratio (FDR), and the sum of FPR, FNR, and FDR. FPR stands for the ratio of the number of falsely added edges to all possible edges, FNR denotes the ratio of the number of falsely missed edges to all possible edges, and FDR is the ratio of the number of edges whose direction is wrongly identified to all possible edges. Furthermore, the sum of FPR, FNR, and FDR is calculated to represent the total error of pLiNGAM.
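A minimal sketch of these error measures, assuming edges are stored in a coefficient matrix with *B*[*i*, *j*] ≠ 0 encoding an edge from node *j* to node *i*, and taking "all possible edges" to mean all node pairs (both encodings are assumptions, not the authors' exact definitions):

```python
import numpy as np

def edge_errors(true_B, est_B):
    """FPR, FNR, FDR over all node pairs; B[i, j] != 0 encodes an edge j -> i."""
    m = true_B.shape[0]
    possible = m * (m - 1) // 2
    fp = fn = fd = 0
    for i in range(m):
        for j in range(i + 1, m):
            t = (true_B[i, j] != 0, true_B[j, i] != 0)
            s = (est_B[i, j] != 0, est_B[j, i] != 0)
            if any(s) and not any(t):
                fp += 1            # falsely added edge
            elif any(t) and not any(s):
                fn += 1            # falsely missed edge
            elif any(t) and any(s) and t != s:
                fd += 1            # edge found but direction wrong
    return fp / possible, fn / possible, fd / possible

true_B = np.array([[0, 0, 0],      # ground truth: x1 -> x2 -> x3
                   [1, 0, 0],
                   [0, 1, 0]])
est_B = np.array([[0, 1, 0],       # x1 -> x2 reversed, x1 -> x3 falsely added
                  [0, 0, 0],
                  [1, 1, 0]])
print(edge_errors(true_B, est_B))
```

For this toy estimate, one of the three pairs is a false positive and one has a wrong direction, giving FPR = FDR = 1/3 and FNR = 0.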

#### **SIMULATED fMRI DATA**

The simulated data are from Smith et al.'s 2011 publication (Smith et al., 2011) and have been widely used in fMRI studies (Cole et al., 2010; Smith et al., 2011). The simulations are generated using the Dynamic Causal Modeling fMRI forward model (Friston et al., 2003), in which Dynamic Causal Modeling uses a nonlinear balloon model (Buxton et al., 1998) for the vascular dynamics. These data provide 28 simulations; we selected the No. 7 simulation set, which has 5000 data points, because it has more than enough data points for the purpose of our study. The No. 7 simulation set contains 5 nodes with 250 min of data at a repetition time of 3 s. The total number of data points is 5000 (scans) for each of the 50 simulated subjects. The coefficient matrices used to generate these 50 subjects' data have the same structure with slightly different coefficients.

#### **REAL fMRI DATA**

#### *Participants*

Twelve healthy right-handed young students, including 5 males and 7 females (mean age: 21 years), participate in our study. This study is supported by the Beijing Normal University Imaging Center. All subjects have provided written informed consent.

## *Data acquisition*

Images are acquired using a Siemens Trio 3-Tesla scanner (Siemens, Erlangen, Germany) in the National Key Laboratory for Cognitive Neuroscience and Learning, Beijing Normal University. Participants are instructed to remain motionless and close their eyes but stay awake during the entire scanning procedure, which lasts 8 min. All of the functional data are acquired using an echo-planar imaging sequence with the following parameters: 33 axial slices, *TR* = 2000 ms, *TE* = 30 ms, acquisition voxel size = 3*.*13 × 3*.*13 × 3*.*60 mm<sup>3</sup>, in-plane matrix = 64 × 64, 240 volumes.

# *Data analyses*

*Data preprocessing.* The first five volumes of the total 240 volumes in the functional fMRI data are removed to make the signal more stable. Image preprocessing, including slice timing, realignment, normalization, and smoothing (FWHM = 8 mm), is conducted using the SPM8 software (http://www.fil.ion.ucl.ac.uk/spm).

*Default mode network (DMN) and regions of interest (ROIs).* Group ICA is performed on the preprocessed data using the fMRI toolbox (http://mialab.mrn.org/software/#gica) to determine the default mode network (DMN). In recent years, ICA has been widely used to identify low-frequency neural networks during resting-state or cognitively undemanding fMRI scans (Calhoun et al., 2001; Greicius and Menon, 2004; van de Ven et al., 2004). The Group ICA includes two rounds of principal component analysis, ICA separation, and back-reconstruction. In the ICA separation, the Extended Infomax algorithm is used (Lee et al., 1999). To select the independent component that best matches the DMN, a DMN template is developed based on the regions reported by Greicius et al. (Greicius and Menon, 2004). Subsequently, the DMN at the single-subject level is acquired, and a one-sample *t*-test (*p <* 0*.*05, false discovery rate corrected) is performed (**Figure 2**). **Figure 2** shows the regions with significant connectivity at resting state, including the medial prefrontal cortex (mPFC), posterior cingulate cortex (PCC), left/right inferior parietal cortex (lIPC/rIPC), left/right lateral and inferior temporal cortex (lITC/rITC), and left/right (para)hippocampus (lHC/rHC). These eight core DMN regions are selected as nodes (ROIs) for the LiNGAM analysis. The coordinates of the eight maximally activated voxels in the core DMN ROIs are given in **Table 1**, and each ROI is generated as a sphere with a 6 mm radius centered at the voxel with the local maximum *T*-value. The data points of each ROI are then extracted with the REST software (http://restfmri.net/forum/index.php).

*pLiNGAM on the real fMRI data.* Before applying pLiNGAM to the real fMRI data, the distribution of the V-subject obtained from the data is examined with the One-Sample Kolmogorov–Smirnov Test. If the distribution is non-Gaussian, LiNGAM is applied to the V-subject to estimate the effective connectivity network among the eight core DMN ROIs.
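As a minimal illustration of this normality check, the One-Sample Kolmogorov–Smirnov Test can be applied to a pooled time series with SciPy (a toy sketch on synthetic data; the uniform stand-in series and the z-scoring step are our assumptions, not part of the original pipeline):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical V-subject time series for one ROI (clearly non-Gaussian,
# uniformly distributed values standing in for pooled BOLD data).
v_subject = rng.uniform(-1.0, 1.0, size=2000)

# One-sample Kolmogorov-Smirnov test against a standard normal distribution.
# The series is z-scored first so the comparison is about shape, not scale.
z = (v_subject - v_subject.mean()) / v_subject.std()
ks_stat, p_value = stats.kstest(z, "norm")

# A small p-value rejects Gaussianity, the precondition for applying LiNGAM.
is_non_gaussian = p_value < 0.05
print(ks_stat, p_value, is_non_gaussian)
```

Only if the test rejects Gaussianity would the LiNGAM estimation proceed on the V-subject.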

**FIGURE 2 | DMN identified by group ICA (***p <* **0***.***05, false discovery rate corrected).**

**Table 1 | The coordinates of all the ROIs for real fMRI data (***p <* **0***.***05, false discovery rate corrected).**


*BA, Brodmann's area; mPFC, medial prefrontal cortex; PCC, posterior cingulate cortex; lIPC/rIPC, left/right inferior parietal cortex; lITC/rITC, left/right lateral and inferior temporal cortex; lHC/rHC, left/right (para) hippocampus.*

# **RESULTS**

# **SIMULATED VALIDATION**

To verify the feasibility of pLiNGAM for estimating the effective connectivity of fMRI data, several simulation validations are performed: the number of data points needed to make the results of LiNGAM stable, the feasibility of pooling data points across multiple subjects, the effectiveness of V-subjects in pLiNGAM, and the influence of the pooling order on pLiNGAM.

# *Desired number of data points of LiNGAM*

The simulated data are used to investigate the number of data points needed to make the LiNGAM algorithm stable. Subsets of the total data points (5000 per subject) are applied to LiNGAM; each subset is taken from the beginning of the series, with lengths ranging from 200 to 5000 points. To avoid the influence of differences between subjects, the LiNGAM algorithm is applied to 50 subjects, and the FPR, FNR, and FDR are calculated by averaging the fifty results. The average FPR, FNR, and FDR, together with their sum, are shown in **Figure 3** (FDR is 0, so it is not shown in the figure). Three statistical tests, the Wald test, the chi-square test, and the difference chi-square test (*p* = 0*.*05), are performed to prune the edges of the estimated network. **Figure 3** illustrates that both FPR and FNR decrease consistently as the number of data points increases. The sum of FPR and FNR falls to approximately 7% when the length reaches 5000 data points. Because of the limited number of total data points, the algorithm is not tested with longer series.
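For reference, the three error rates can be computed from a true and an estimated adjacency matrix as follows (a sketch using one common convention, FPR = FP/(FP+TN), FNR = FN/(FN+TP), FDR = FP/(FP+TP) over directed off-diagonal edges; the paper does not spell out its exact definitions, so these formulas are our assumption):

```python
import numpy as np

def edge_error_rates(true_net, est_net):
    """Error rates for a recovered directed network (one common convention):
    FPR = FP / (FP + TN), FNR = FN / (FN + TP), FDR = FP / (FP + TP)."""
    true_net = np.asarray(true_net, dtype=bool)
    est_net = np.asarray(est_net, dtype=bool)
    # Off-diagonal entries only: self-loops are excluded in an acyclic model.
    mask = ~np.eye(true_net.shape[0], dtype=bool)
    t, e = true_net[mask], est_net[mask]
    fp = np.sum(~t & e)
    fn = np.sum(t & ~e)
    tp = np.sum(t & e)
    tn = np.sum(~t & ~e)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    fdr = fp / (fp + tp) if fp + tp else 0.0
    return fpr, fnr, fdr

# Toy 5-node example: the estimate misses one true edge and adds a spurious one.
truth = np.zeros((5, 5), dtype=bool)
truth[0, 1] = truth[1, 2] = truth[2, 3] = True
est = truth.copy()
est[2, 3] = False       # missed edge -> false negative
est[3, 4] = True        # spurious edge -> false positive
print(edge_error_rates(truth, est))
```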

# *Feasibility of subject pooling*

To confirm that pooling data across subjects is feasible, the following two validations are performed.


**Table 2 | The** *p***-value [mean (STD)] of One-Sample Kolmogorov–Smirnov Test of 5 ROIs for the simulated fMRI data.**


result of pooling subjects is better than that of a single subject. Three groups of data are modeled: single-subject\_2000 (G1), V-subject\_2000 (G2), and single-subject\_200 (G3). More specifically, the single-subject\_2000 group consists of 10 subjects, each with 2000 data points. The V-subject\_2000 group is one V-subject with 2000 data points, pooled from 10 single subjects with 200 data points each. The single-subject\_200 group consists of 10 single subjects, each with 200 data points. The 10 subjects are randomly selected from the total 50 subjects, and the pooling order is random.

Then, the FPR, FNR, and FDR of these three groups are calculated, and the sum of FPR, FNR, and FDR for the three groups is shown in **Figure 4A**. The results clearly show that the G1 group has a smaller sum of FPR, FNR, and FDR than the other two groups, and that the G2 group has a smaller sum than the G3 group. Furthermore, a one-sample *t*-test is performed on G1 and G3 respectively to verify whether their means differ significantly from G2. The results are encouraging (*T* = −4*.*291, *p* = 0*.*002 for G1; *T* = 3*.*973, *p* = 0*.*003 for G3). These statistics show that G1 performs better than both G2 and G3, and that G2 performs better than G3, indicating that subject pooling is feasible for the LiNGAM algorithm and that pLiNGAM offers better results when the data points of single subjects are few. Furthermore, to test whether the error rate of the G2 group is stable across different subsets of 10 single subjects, 50 V-subject\_2000 groups are constructed by randomly selecting 10 single subjects each. The sum of FPR, FNR, and FDR of these V-subject\_2000 groups is then calculated, and the results show that the error rate of the G2 group is stable across different selections of the 10 subjects (**Figure 4B**).

#### *pLiNGAM with V-subjects*

To explore the FPR, FNR, and FDR estimated by pLiNGAM with V-subjects, the V-subjects are constructed according to the schematic shown in **Figure 1**. Each V-subject is pooled from several (ranging from 1 to 25) single subjects with 200 data points each. For example, 200 data points are selected at the beginning of each subject's series, and 6 such single subjects are combined to form one V-subject with 1200 data points. The length of each V-subject thus ranges from 200 to 5000 data points. To ensure the reliability of the results, 50 V-subjects are constructed for each length. **Figure 5** demonstrates that when a V-subject contains more than 2000 data points, the sum of FPR, FNR, and FDR falls to 15%, which is better than most other effective connectivity methods (Cole et al., 2010; Smith et al., 2011).
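The construction of a V-subject by pooling data points across subjects can be sketched in a few lines (synthetic data; the per-subject standardization before concatenation is our assumption, since the text does not specify how subject-level differences in scale are handled):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 6 single subjects, each with 200 time points for 5 ROIs.
n_subjects, n_points, n_rois = 6, 200, 5
subjects = [rng.standard_normal((n_points, n_rois)) for _ in range(n_subjects)]

def make_v_subject(subject_data):
    """Pool single subjects into one V-subject by concatenating their time
    series along the temporal axis. Each subject is z-scored first so that
    no single subject dominates the pooled distribution (our assumption)."""
    standardized = [(d - d.mean(axis=0)) / d.std(axis=0) for d in subject_data]
    return np.concatenate(standardized, axis=0)

v_subject = make_v_subject(subjects)
print(v_subject.shape)   # 6 subjects x 200 points -> (1200, 5)
```

The pooled array can then be passed to LiNGAM exactly as a single subject's series would be.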

### *Influence of the pooling order*

To determine whether the order of pooling subjects affects the estimated network, the following test is conducted. Ten single subjects are randomly selected from the total 50 subjects. Among the 3,628,800 possible orders, 3000 are randomly selected to examine this effect. For each of the 3000 pooling orders, a V-subject is generated. The pLiNGAM algorithm is then applied to these V-subjects and the FPR, FNR, and FDR are calculated. Our results show that the estimated network does not depend on the pooling order, which is consistent with the fact that a major advantage of concatenating data points across subjects in ICA is that the components of different subjects are ordered in the same way (Calhoun et al., 2001).

### **REAL fMRI VALIDATION**

The distribution of the V-subject from the real fMRI data is non-Gaussian according to the One-Sample

**Table 3 | The result of One-Sample Kolmogorov–Smirnov Test of 8 ROIs for real fMRI data.**


*"Sig" represents the asymptotic significance. When the two-tailed asymptotic significance of an ROI is less than 0.05, the test distribution is not normal.*

Kolmogorov–Smirnov Test (shown in **Table 3**). Thus, pLiNGAM is applicable to the real fMRI data.

In this section, the stability of the causal network estimated by pLiNGAM is tested on the real fMRI data, and the resulting effective connectivity is also presented.

### *The stability of effective connectivity on real fMRI data*

pLiNGAM is tested with different subsets of subjects from the real fMRI data to validate the robustness and stability of the results. Several subjects, e.g., *n* = 3, are randomly selected from all 12 subjects to construct a V-subject, and this selection is repeated 100 times (subsets of 3, 4, 5, 6, 7, 8, and 9 subjects are tested; subsets of 1, 2, 10, 11, and 12 subjects cannot be randomly drawn 100 distinct times and are therefore not used). For each number of subjects, the causal network is estimated 100 times, and the common structure of the 100 causal networks is taken as a baseline for calculating the FPR, FNR, and FDR of each network. The average of the sum of FPR, FNR, and FDR is then taken as the variability of the results. As shown in **Figure 7**, the variability across different subsets of subjects is not high (about 0.26 for every number of subjects). This variability is comparable with the results of many algorithms examined in Smith et al. (2011), such as Granger causality and Bayes nets. Furthermore, the variability remains stable (decreasing slightly) as the number of subjects increases. These results indicate the stability and robustness of the causal networks obtained by pLiNGAM.
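The common-structure baseline used here can be illustrated on synthetic binary networks (a toy sketch; the backbone-plus-noise generator and the simplified variability measure, the mean fraction of differing edges, are our assumptions standing in for the FPR/FNR/FDR sum):

```python
import numpy as np

rng = np.random.default_rng(7)
n_roi = 8

# Hypothetical stand-in for 100 networks estimated from random subject
# subsets: a fixed backbone of true edges plus estimation noise per run.
# Upper-triangular matrices keep the networks acyclic.
backbone = np.triu(rng.random((n_roi, n_roi)) < 0.25, k=1)
nets = np.array([backbone | np.triu(rng.random((n_roi, n_roi)) < 0.05, k=1)
                 for _ in range(100)])

# Baseline: the common structure, i.e., edges present in all 100 estimates.
common = nets.all(axis=0)

# Variability of each run relative to the baseline (here simply the mean
# fraction of differing edges, a simplification of the FPR+FNR+FDR sum).
variability = float(np.mean([np.mean(n != common) for n in nets]))
print(int(common.sum()), round(variability, 3))
```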

#### *The results of effective connectivity for real fMRI data*

**Figure 6** shows the effective connectivity model of the DMN during the resting state, estimated by the pLiNGAM algorithm using all 12 subjects. From **Figure 6**, we identify the following connections: *m*PFC→*r*HC/*r*IPC/*l*ITC/PCC/*r*ITC/*l*IPC/*l*HC, *r*IPC→PCC/*r*HC/*l*HC/*l*ITC, *r*ITC→*r*IPC/*l*IPC/PCC/*l*ITC/*l*HC/*r*HC, *l*ITC→PCC/*l*HC/*r*HC, *l*IPC→PCC/*r*IPC/*r*HC/*l*HC/*l*ITC (*p <* 0*.*05, Wald statistics). Seven direct connections are detected between *m*PFC, *r*HC, *r*ITC, *l*HC, *l*IPC, PCC, and the other ROIs. Interestingly, all links associated with *m*PFC are out-going connections, and all links associated with *r*HC are in-going connections. Furthermore, six of the seven links associated with *r*ITC are out-going, and six of the seven links associated with *l*HC are in-going. In addition, five of the seven links associated with *l*IPC are out-going, and five of the seven links associated with PCC are in-going.

# **DISCUSSION**

This study employs the pLiNGAM algorithm to explore the effective connectivity of fMRI data with the V-subject. The results demonstrate that the pLiNGAM is feasible for both simulated and real fMRI data.

The pLiNGAM algorithm has several advantages in estimating the effective connectivity of brain areas. First, the simulated

fMRI data demonstrate that pLiNGAM produces a more robust effective connectivity model with the V-subject than with the original single subjects. With a small number of data points, however, the computational stability of pLiNGAM cannot be guaranteed, because in ICA estimation the weight matrix *B* often converges to different values when there are not enough data points (Goebel et al., 2003). Second, the algorithm is based on the assumptions of non-Gaussianity of the disturbance variables, linearity, and an acyclic model, which allow identification of the full causal model. Previous methods (Pearl, 2000; Shimizu and Kano, 2008) based on the assumption of Gaussianity require additional information (such as the causal order of the variables) to obtain a full causal model (Shimizu et al., 2006). Third, a V-subject composed of more than one subject provides more information than a single subject. Fourth, the sum of FPR, FNR, and FDR for the V-subjects can fall to 15% (**Figure 5**), which is smaller than that of most other approaches (about 45%) (Cole et al., 2010; Smith et al., 2011).

Our simulation results show that the sum of FPR, FNR, and FDR reduces only to approximately 7%, not 0%, even with a sufficient number of data points (shown in **Figure 5**), indicating that a perfect network cannot be recovered from the simulated data no matter how long the series is. This is explainable: a sampling step was performed when generating the simulated data (Smith et al., 2011), which may discard information, and noise was also added to the simulated data (Smith et al., 2011). Together, these processes may account for the imperfect performance of pLiNGAM even when the data series are long.

Subject pooling has been verified to be a reasonable method on the simulated fMRI data. The method is then applied to the real fMRI data, and the results show that the causal network is reliable and stable across different subsets of subjects, further indicating that pLiNGAM is applicable when inter-subject variability is low. Furthermore, most of the links associated with the PCC are in-going connections, suggesting that the PCC acts as a confluent node; similar conclusions have been reached in previous studies (Li et al., 2012; Yan et al., 2013). In addition, the links associated with *m*PFC show good consistency, because all of them are out-going connections. Li et al.'s (2012) study also supports this result.

The variability in **Figure 7** for the real fMRI data does not decrease markedly (only slightly) as the number of subjects increases, unlike the results for the V-subjects in **Figure 5**. This may be because the variability of the real fMRI data is more stable than that of the simulated data and has already reached the flat tail portion, like that in **Figure 5**. To a certain extent, this stability of the variability (decreasing slightly with the number of subjects) again indicates the stability and robustness of the causal networks obtained by pLiNGAM. In any case, more detailed explorations of this issue are needed in future studies.

While it has many merits, the pLiNGAM method still has several limitations. First, it performs well only when the inter-subject variability is low. pLiNGAM is a form of the "VTS" technique (Li et al., 2008), which assumes that every subject within a group performs the same function and has the same connectivity network. Other group-analysis methods based on LiNGAM, such as the algorithm proposed in Shimizu (2012), assume that the subjects share a causal ordering but have different connection strengths, similar to the "CS" approach (Li et al., 2008). The algorithm in Shimizu (2012) may therefore perform worse than pLiNGAM when inter-subject variability is low (e.g., in a healthy-subject group), but better when inter-subject variability is somewhat larger (e.g., in a patient group). More effort is thus needed to make pLiNGAM applicable to more general situations. Second, V-subjects have more data points, which may lengthen the calculation time; the calculation time also depends on the group size and the number of ROIs (Hyvärinen and Oja, 1997). Third, the assumption of an acyclic model may be a limitation for fMRI data: it implies that information can be transmitted from one ROI to another but not back. However, feedback is an important feature of biological systems, for example in cortico-subcortical loops (Lynch and Tian, 2006). In any case, further exploration is needed to improve the pLiNGAM algorithm.

#### **ACKNOWLEDGMENTS**

This work was supported by the Key Program of National Natural Science Foundation of China (91320201), the Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (61210001), the Excellent Young Scientist Program of China (61222113), and Program for New Century Excellent Talents in University (NCET-12-0056).

#### **REFERENCES**


Bollen, K. A. (1998). *Structural Equation Models*. John Wiley & Sons, Ltd.

Buxton, R. B., Wong, E. C., and Frank, L. R. (1998). Dynamics of blood flow and oxygenation changes during brain activation: the balloon model. *Magn. Reson. Med.* 39, 855–864. doi: 10.1002/mrm.1910390602


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 July 2014; paper pending published: 29 July 2014; accepted: 17 September 2014; published online: 06 October 2014.*

*Citation: Xu L, Fan T, Wu X, Chen K, Guo X, Zhang J and Yao L (2014) A pooling-LiNGAM algorithm for effective connectivity analysis of fMRI data. Front. Comput. Neurosci. 8:125. doi: 10.3389/fncom.2014.00125*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Xu, Fan, Wu, Chen, Guo, Zhang and Yao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# EEG entropy measures in anesthesia

# *Zhenhu Liang1, Yinghua Wang2,3, Xue Sun1, Duan Li 4, Logan J. Voss 5, Jamie W. Sleigh5, Satoshi Hagihira6 and Xiaoli Li 2,3\**

*<sup>1</sup> Institute of Electrical Engineering, Yanshan University, Qinhuangdao, China*


#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Raoul Rashid Nigmatullin, Kazan Federal University, Russia*

*Fengyu Cong, Dalian University of Technology, China*

#### *\*Correspondence:*

*Xiaoli Li, State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research; Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China e-mail: xiaoli@bnu.edu.cn*

# **Highlights:**


**Objective:** Entropy algorithms have been widely used in analyzing EEG signals during anesthesia. However, a systematic comparison of these entropy algorithms in assessing the effects of anesthetic drugs is lacking. In this study, we compare the capability of 12 entropy indices for monitoring depth of anesthesia (DoA) and detecting the burst suppression pattern (BSP) in anesthesia induced by GABAergic agents.

**Methods:** Twelve indices were investigated, namely Response Entropy (RE) and State entropy (SE), three wavelet entropy (WE) measures [Shannon WE (SWE), Tsallis WE (TWE), and Renyi WE (RWE)], Hilbert-Huang spectral entropy (HHSE), approximate entropy (ApEn), sample entropy (SampEn), Fuzzy entropy, and three permutation entropy (PE) measures [Shannon PE (SPE), Tsallis PE (TPE) and Renyi PE (RPE)]. Two EEG data sets from sevoflurane-induced and isoflurane-induced anesthesia respectively were selected to assess the capability of each entropy index in DoA monitoring and BSP detection. To validate the effectiveness of these entropy algorithms, pharmacokinetic/pharmacodynamic (PK/PD) modeling and prediction probability (*Pk* ) analysis were applied. The multifractal detrended fluctuation analysis (MDFA) was also compared as a non-entropy measure.

**Results:** All the entropy and MDFA indices could track the changes in EEG pattern during different anesthesia states. Three PE measures outperformed the other entropy indices, with less baseline variability, higher coefficient of determination (*R*2) and prediction probability, and RPE performed best; ApEn and SampEn discriminated BSP best. Additionally, these entropy measures showed an advantage in computation efficiency compared with MDFA.

**Conclusion:** Each entropy index has its advantages and disadvantages in estimating DoA. Overall, the RPE index is suggested to be a superior measure. Investigating the advantages and disadvantages of these entropy indices could help improve current clinical indices for monitoring DoA.

**Keywords: EEG, anesthesia, entropy, pharmacokinetic/pharmacodynamic modeling, depth of anesthesia monitoring**

# **INTRODUCTION**

In the operating room, general anesthesia is important to guarantee successful surgery and to ensure patients' safety and comfort. Reliable monitoring of anesthetic drug effects on the brain is a central clinical concern for anesthesiologists (Monk et al., 2005). The central nervous system (CNS) is the main target of anesthetic drugs. Originating in the CNS, the electroencephalogram (EEG) reflects the neural activity of the brain and has been widely used as a surrogate parameter to quantify anesthetic drug effects (Rampil, 1998; Bruhn et al., 2006; Jameson and Sloan, 2006). However, only limited information can be obtained from EEG signals purely by waveform observation. With the development of signal processing, various methods have been applied to analyze, identify, or detect mental disorders and consciousness mechanisms from EEG signals (Okogbaa et al., 1994; Natarajan et al., 2004; Abásolo et al., 2006), as well as to evaluate the effects of anesthesia.

In recent decades, numerous attempts have been made to develop an index for describing anesthetic drug effects on the brain, including the zero-crossing frequency, spectral edge, wavelet analysis, and high-order spectral analysis. These studies laid the foundation for commercial EEG-based monitors of depth of anesthesia (DoA), such as BIS (Aspect Medical Systems, Newton, MA) (Bruhn et al., 2006; Ellerkmann et al., 2010) and M-entropy (GE Healthcare, Helsinki, Finland) (Viertiö-Oja et al., 2004; Bruhn et al., 2006). Many of these methods derive from linear theory. However, various studies have shown that the EEG is a non-stationary signal that exhibits non-linear or chaotic behaviors (Elbert et al., 1994; Pritchard et al., 1995; Zhang et al., 2001; Natarajan et al., 2004). This has prompted many researchers to adopt non-linear analysis methods in anesthesia studies, for example the largest Lyapunov exponent (Fell et al., 1996), the Hurst exponent (Alvarez-Ramirez et al., 2008), fractal analysis (Klonowski et al., 2006; Gifani et al., 2007; Liang et al., 2012), detrended fluctuation analysis (DFA) (Jospin et al., 2007; Nguyen-Ky et al., 2010b), recurrence analysis (Huang et al., 2006), and non-linear entropies (Bruhn et al., 2001; Li et al., 2008a). In particular, non-linear entropy methods, which describe the complexity of EEG signals, have received considerable attention.

The word "entropy" was first proposed as a thermodynamic principle by Clausius (1867). It describes the distribution probability of molecules of gaseous or fluid systems. In 1949, Claude E. Shannon introduced entropy into information theory to describe the distribution of signal components (Shannon and Weaver, 1949). So far, numerous entropy algorithms have been proposed and used to quantify DoA, covering Spectral entropy [which includes Response Entropy (RE) and State entropy (SE)] (Viertiö-Oja et al., 2004; Klockars et al., 2012), Approximate entropy (ApEn) (Bruhn et al., 2000), Sample entropy (SampEn) (Richman and Moorman, 2000), Fuzzy entropy (FuzzyEn) (Chen et al., 2007), Shannon Permutation entropy (SPE) (Li et al., 2008a, 2012), Shannon Wavelet entropy (SWE) (Särkelä et al., 2007), and Hilbert-Huang spectral entropy (HHSE) (Li et al., 2008b).

Spectral Entropy is the method applied in the commercial M-Entropy Module (Viertiö-Oja et al., 2004). It consists of two parameters: Response Entropy (RE) and State Entropy (SE). SE primarily covers the spectrum of the EEG signal from 0.8 to 32 Hz, while RE also includes electromyogram activity, extending from 0.8 to 47 Hz (Viertiö-Oja et al., 2004). Shannon Wavelet entropy (SWE) is the Shannon entropy in the wavelet domain, which indicates the signal variation at each frequency scale (Rosso et al., 2001). The Hilbert–Huang spectral entropy (HHSE) is the Shannon entropy based on the Hilbert–Huang transform proposed by Huang et al. (1998), and it has been successfully applied to anesthetic EEG signals (Li et al., 2008b).

The above methods are based on the frequency spectrum, whereas many other entropy methods are based on time-series and phase-space analysis. ApEn is an algorithm derived from the Kolmogorov-Sinai entropy (Pincus, 1991); it quantifies the predictability of subsequent amplitude values of a signal. A previous investigation showed that ApEn correlates well with the concentration of desflurane (Bruhn et al., 2000). However, ApEn lacks relative consistency and is highly dependent on data length; SampEn was proposed to overcome these limitations by removing self-matching and thus relieving the bias (Richman and Moorman, 2000). SampEn has been used for analyzing EEG signals (Montirosso et al., 2010; Yoo et al., 2012). FuzzyEn, proposed by Chen et al. (2007), uses fuzzy membership functions to define the similarity of vectors; the soft, continuous boundaries of the fuzzy functions ensure the continuity and validity of its definition (Chen et al., 2009). SPE was introduced by Bandt and Pompe (2002) as a complexity measure based on symbolic dynamics. Because of its simple concept and fast computation, SPE has been widely used in EEG signal analysis (Cao et al., 2004; Li et al., 2007, 2008a). Furthermore, its derivatives, multi-scale permutation entropy (Li et al., 2010) and the composite permutation entropy index (Olofsen et al., 2008), have been successfully applied to EEG signals recorded during anesthesia.
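Among these time-series measures, SPE is particularly compact. A normalized Shannon permutation entropy in the spirit of Bandt and Pompe (2002) can be sketched as follows (the parameter names and defaults are our choices):

```python
import math
from itertools import permutations
import numpy as np

def permutation_entropy(x, m=3, tau=1):
    """Normalized Shannon permutation entropy (Bandt & Pompe, 2002):
    the entropy of the distribution of ordinal patterns of length m,
    divided by log(m!) so the result lies in [0, 1]."""
    x = np.asarray(x)
    patterns = {p: 0 for p in permutations(range(m))}
    for i in range(len(x) - (m - 1) * tau):
        window = x[i:i + m * tau:tau]
        patterns[tuple(np.argsort(window))] += 1
    counts = np.array([c for c in patterns.values() if c > 0], dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(m)))

rng = np.random.default_rng(2)
# A monotone ramp has a single ordinal pattern (entropy 0), while white
# noise uses nearly all m! patterns uniformly (entropy near 1).
pe_ramp = permutation_entropy(np.arange(100.0))
pe_noise = permutation_entropy(rng.standard_normal(5000))
print(pe_ramp, pe_noise)
```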

However, "No one knows what entropy really is, so in a debate you will always have the advantage." This statement still rings true for EEG analysis today (Ferenets et al., 2006). Each entropy index has its own advantages and disadvantages, but how do they compare when evaluating the effect of anesthesia on brain activity? To this end, some researchers have compared the performance of different entropy methods for anesthesia monitoring (Sleigh et al., 2001, 2005; Bein, 2006). Unfortunately, these articles analyzed no more than three entropies each, and to our knowledge a systematic comparison of their performance in assessing anesthetic drug effects is lacking. In this study, we aim to compare the capability of several commonly used entropy indices for monitoring DoA.

We note that the definitions of all the above entropies are based on Shannon information theory, which is a short-range, or extensive, concept. However, physical systems, and especially biomedical systems, are often characterized by long-range interactions, long-term memories, or multifractality (Zunino et al., 2008). To describe these characteristics, two generalized forms of entropy were proposed: Renyi entropy (Renyi, 1970) and Tsallis entropy (*q*-entropy) (Tsallis et al., 1998). For example, Tsallis entropy has a non-extensity parameter *q*. If *q* > 1, the entropy is more sensitive to events that occur often, whereas if 0 < *q* < 1 it is more sensitive to events that occur seldom (Maszczyk and Duch, 2008). In the limit *q* → 1, it coincides with Shannon entropy. These generalized entropies can provide additional information about the importance of specific events, such as outliers or rare events. The two classes of entropies, and their combinations with current signal processing methods, have already been applied to EEG analysis (Bezerianos et al., 2003; Tong et al., 2003; Inuso et al., 2007) and have often proved advantageous over the Shannon versions (Zunino et al., 2008; Arefian et al., 2009). To make the research more instructive, we believe it useful to investigate these non-extensive entropy measures alongside the extensive Shannon entropies in DoA monitoring. In this study, we include the Tsallis wavelet entropy (TWE) and Renyi wavelet entropy (RWE) proposed by Rosso et al. (2003, 2006), as well as the Tsallis permutation entropy (TPE) proposed by Zunino et al. (2008) and a new Renyi permutation entropy (RPE).
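The *q* → 1 (and *a* → 1) limits mentioned above are easy to verify numerically (a small sketch using the textbook definitions of the three entropies for a discrete distribution):

```python
import numpy as np

def shannon(p):
    """Shannon entropy: H = -sum p_i log p_i (natural log)."""
    p = np.asarray(p, dtype=float)
    return float(-(p * np.log(p)).sum())

def tsallis(p, q):
    """Tsallis q-entropy: S_q = (1 - sum p_i^q) / (q - 1)."""
    p = np.asarray(p, dtype=float)
    return float((1.0 - (p ** q).sum()) / (q - 1.0))

def renyi(p, a):
    """Renyi entropy: S_a = log(sum p_i^a) / (1 - a)."""
    p = np.asarray(p, dtype=float)
    return float(np.log((p ** a).sum()) / (1.0 - a))

p = np.array([0.5, 0.25, 0.125, 0.125])
s_sh = shannon(p)
# Both generalized entropies recover Shannon entropy as q (or a) -> 1.
print(s_sh, tsallis(p, 1.0001), renyi(p, 1.0001))
```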

For illustrative purposes, we divide the entropies into two families:


In this work, their performance in monitoring DoA was compared. Using data sets obtained during sevoflurane and isoflurane anesthesia, we quantified for each index the responsiveness to loss of consciousness, the computational complexity, and the ability to detect BSP. Pharmacokinetic/pharmacodynamic (PK/PD) modeling and prediction probability statistics were applied to evaluate the efficiency of each index in tracking anesthetic concentration. Additionally, to prove the efficiency of the entropy approaches, two non-linear dynamic methods, DFA (Jospin et al., 2007) and multifractal DFA (MDFA) (Kantelhardt et al., 2002), were compared.

# **ENTROPY INDICES**

The computation of each entropy index is briefly described as follows.

#### **SPECTRAL ENTROPY (RE AND SE)**

Spectral Entropy quantifies the probability density function (PDF) of the signal power spectrum in the frequency domain. Details of the Spectral Entropy algorithm can be found in Inouye et al. (1991) and Rezek and Roberts (1998). Spectral Entropy consists of the RE and the SE: RE is computed over the frequency range 0.8–47 Hz, while SE is computed over the range 0.8–32 Hz. The normalization steps for RE and SE are defined as follows:

$$RE = \frac{H\_{sp\_{0.8-47}}}{\log\left(N\_{0.8-47}\right)}\tag{1}$$

$$SE = \frac{H\_{sp\_{0.8-32}}}{\log\left(N\_{0.8-32}\right)}\tag{2}$$

where $H\_{sp\_{0.8-47}}$ and $H\_{sp\_{0.8-32}}$ denote the sum of spectral power between 0.8 and 47 Hz, and between 0.8 and 32 Hz, respectively, and $N\_{0.8-47}$ and $N\_{0.8-32}$ equal the total number of frequency components in the corresponding ranges. Spectral Entropy describes the degree of skewness in the frequency distribution. For example, in the normalized case, the Spectral Entropy of a pure sine wave with a single spectral peak is 0, while that of white noise is 1.
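A simplified band-limited spectral entropy in this spirit can be sketched as follows (a toy FFT-based version; the commercial RE/SE computation involves windowing and time-frequency balancing not reproduced here):

```python
import numpy as np

def spectral_entropy(x, fs, f_lo, f_hi):
    """Normalized spectral entropy over [f_lo, f_hi]: Shannon entropy of the
    normalized power spectrum, divided by log(N) where N is the number of
    frequency bins in the band. A single spectral peak gives ~0; white
    noise gives a value near 1."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= f_lo) & (freqs <= f_hi)
    p = power[band] / power[band].sum()
    h = -(p * np.log(p + 1e-300)).sum()   # tiny offset avoids log(0)
    return h / np.log(band.sum())

fs = 128.0
t = np.arange(0, 8, 1.0 / fs)
sine = np.sin(2 * np.pi * 10 * t)                         # single peak
noise = np.random.default_rng(4).standard_normal(len(t))  # flat spectrum
se_sine = spectral_entropy(sine, fs, 0.8, 32.0)
se_noise = spectral_entropy(noise, fs, 0.8, 32.0)
print(se_sine, se_noise)
```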

#### **WAVELET ENTROPY (SWE, TWE, AND RWE)**

WE differentiates specific brain states under spontaneous or stimulus-related conditions and recognizes the time localization of dynamic processes. To calculate wavelet entropy, the wavelet energy $E\_j$ of the signal is determined at each scale *j* as follows:

$$E\_j = \sum\_{k=1}^{L\_j} d\_j(k)^2 \tag{3}$$

where *k* and $L\_j$ are the summation index and the number of coefficients at scale *j* within a given epoch, respectively. The total energy over all scales is obtained by:

$$E\_{\text{total}} = \sum\_{j} E\_{j} = \sum\_{j} \sum\_{k=1}^{L\_{j}} d\_{j}(k)^{2} \tag{4}$$

Then wavelet energy is divided by total energy to obtain the relative wavelet energy at each scale *j*:

$$p\_j = \frac{E\_j}{E\_{\text{total}}} = \frac{E\_j}{\sum\_{j} E\_{j}} = \frac{\sum\_{k=1}^{L\_j} d\_j(k)^{2}}{\sum\_{j} \sum\_{k=1}^{L\_j} d\_j(k)^{2}} \tag{5}$$

SWE is calculated as the Shannon entropy of the $p\_j$ distribution across scales:

$$S^{(S)} = -\sum\_{j} p\_j \log p\_j \tag{6}$$

Details of the algorithm used in this study can be found in Särkelä et al. (2007).

And the TWE is defined as,

$$S\_q^{(T)} = \frac{1}{q-1} \sum\_{j} \left[ p\_j - \left( p\_j \right)^q \right] \tag{7}$$

where *q* is a non-extensity parameter.

Based on the definition of Renyi entropy (Renyi, 1970), the RWE is defined following Rosso et al. (2006):

$$S\_a^{(R)} = \frac{1}{1 - a} \log \left[ \sum\_{j} \left( p\_j \right)^a \right] \tag{8}$$

For $S^{(S)}$, the normalized SWE is

$$SWE = S^{(S)} / \log N\_J \tag{9}$$

where $N\_J$ is the number of wavelet resolution levels.

And $S\_q^{(T)}$ is normalized by dividing by $\left[1 - N\_J^{1-q}\right]/(q-1)$, as defined by Rosso et al. (2003):

$$TWE = \frac{S\_q^{(T)}}{\left[1 - N\_J^{1 - q}\right]/(q - 1)}\tag{10}$$

Further, the normalized $S\_a^{(R)}$ is defined following Maszczyk and Duch (2008):

$$RWE = \frac{S\_a^{(R)}}{\log N\_J} \tag{11}$$

The values of the three WE measures depend on the wavelet basis function, the number of decomposition levels (*n*), and the data length (*N*). Furthermore, TWE and RWE also depend on the parameters *q* and *a*, respectively. Among these, the wavelet basis function is the most important. Because there is no fixed criterion, it is difficult to select an appropriate wavelet basis function in practical applications, and many studies choose one empirically. The details of the selection process in this study can be found in Supplement Material 1.
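Equations (5)–(11) can be illustrated end to end with a plain Haar decomposition (our choice of basis, since the text leaves the wavelet selection open; the normalizations follow the equations above):

```python
import numpy as np

def haar_detail_energies(x, levels):
    """Energy of Haar wavelet detail coefficients at each scale (Eq. 3),
    computed with a plain numpy Haar transform (no wavelet library)."""
    x = np.asarray(x, dtype=float)
    energies = []
    approx = x
    for _ in range(levels):
        a = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
        d = (approx[0::2] - approx[1::2]) / np.sqrt(2.0)
        energies.append(np.sum(d ** 2))
        approx = a
    return np.array(energies)

def wavelet_entropies(x, levels=5, q=2.0, a=2.0):
    """SWE, TWE, RWE from the relative wavelet energies p_j (Eqs. 5-11),
    each normalized so the values lie in [0, 1]."""
    E = haar_detail_energies(x, levels)
    p = E / E.sum()
    n_j = len(p)
    swe = -(p * np.log(p + 1e-300)).sum() / np.log(n_j)
    twe = ((p - p ** q).sum() / (q - 1.0)) / ((1.0 - n_j ** (1.0 - q)) / (q - 1.0))
    rwe = (np.log((p ** a).sum()) / (1.0 - a)) / np.log(n_j)
    return swe, twe, rwe

x = np.random.default_rng(5).standard_normal(4096)
swe, twe, rwe = wavelet_entropies(x)
print(swe, twe, rwe)
```

Note that even for white noise the values stay below 1, because the number of detail coefficients (and hence the energy) halves at each coarser scale.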

#### **HILBERT-HUANG SPECTRAL ENTROPY (HHSE)**

HHSE is based on the Hilbert-Huang transform and applies the Shannon entropy concept to the Hilbert-Huang spectrum. Details of the algorithm can be found in Li et al. (2008b). For a given non-stationary signal *x*(*t*), the EMD method decomposes the signal into a series of intrinsic mode functions (IMFs), $C\_i$ ($i = 1, 2, \dots, M$), where *M* is the number of IMFs. The signal *x*(*t*) can be written as:

$$x(t) = \sum\_{i=1}^{M} \operatorname{imf}\_i(t) + r\_M(t) \tag{12}$$

Apply the Hilbert transform to the IMF components,

$$Z\left(t\right) = \operatorname{imf}\left(t\right) + iH\left[\operatorname{imf}(t)\right] = a\left(t\right)e^{i\int \omega\left(t\right)dt} \tag{13}$$

in which $a(t) = \sqrt{\operatorname{imf}^2(t) + H^2[\operatorname{imf}(t)]}$ and $\omega(t) = \frac{d}{dt}\arctan\left(H[\operatorname{imf}(t)]/\operatorname{imf}(t)\right)$, where $\omega(t)$ and $a(t)$ are the instantaneous frequency and amplitude of the IMFs, respectively.

The Hilbert-Huang marginal spectrum is defined by:

$$h\left(\omega\right) = \int H\left(\omega, t\right) dt\tag{14}$$

To simplify the representation, the Hilbert-Huang spectrum is denoted as a function of frequency (*f*) instead of angular frequency (ω). The marginal spectrum is normalized by:

$$\hat{h}\left(f\right) = h(f) / \sum\_{f} h(f) \tag{15}$$

Next, the Shannon entropy concept is applied to the Hilbert-Huang spectrum, and Hilbert-Huang spectral entropy is obtained by:

$$HHSE = -\sum\_{f} \hat{h}\left(f\right) \log\left(\hat{h}\left(f\right)\right) \tag{16}$$

The HHSE values are mainly affected by the frequency resolution and the data length (*N*). For accurate computation, the frequency resolution was chosen as 0.1 Hz. *N* directly influences the EMD. In general, boundary effects may arise if *N* is too large or too small, which can contaminate the data and distort the power spectrum. The selection of *N* in this study is given in Supplement Material 1.
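The marginal-spectrum steps of Eqs. (13)–(16) can be sketched as follows. The EMD step is not reproduced here; the sketch assumes the IMFs are already available, and uses a naive O(N²) DFT so the example stays dependency-free. The 0.5 Hz bin width is an illustrative stand-in for the 0.1 Hz resolution used in the study.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal z(t) = x(t) + iH[x(t)] via a naive O(N^2) DFT (N even)."""
    n = len(x)
    X = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
         for k in range(n)]
    gain = [1.0] + [2.0] * (n // 2 - 1) + [1.0] + [0.0] * (n // 2 - 1)
    Z = [X[k] * gain[k] for k in range(n)]   # suppress negative frequencies
    return [sum(Z[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def hhse(imfs, fs, df=0.5):
    """Shannon entropy of the normalized Hilbert-Huang marginal spectrum, Eqs. (14)-(16)."""
    h = {}                                   # marginal spectrum h(f), binned by df
    for imf in imfs:
        z = analytic_signal(imf)
        for t in range(1, len(z)):
            dphi = cmath.phase(z[t]) - cmath.phase(z[t - 1])
            while dphi <= -math.pi:          # unwrap the phase increment
                dphi += 2.0 * math.pi
            while dphi > math.pi:
                dphi -= 2.0 * math.pi
            f = abs(dphi) * fs / (2.0 * math.pi)  # instantaneous frequency
            b = int(round(f / df))
            h[b] = h.get(b, 0.0) + abs(z[t])      # accumulate amplitude a(t), Eq. (14)
    total = sum(h.values())
    p = [v / total for v in h.values()]           # normalization, Eq. (15)
    return -sum(pi * math.log(pi) for pi in p if pi > 0)  # Eq. (16)
```

A pure tone occupies a single frequency bin and gives an entropy near zero, while signals whose IMFs spread energy over several bins give larger values.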

#### **APPROXIMATE ENTROPY (ApEn)**

ApEn, introduced by Pincus (1991), is derived from the Kolmogorov entropy. It can be applied to a finite-length signal to describe its unpredictability or randomness. Its computation involves embedding the signal into the phase space and estimating the rate at which the number of similar phase-space patterns (within a predefined tolerance *r*) changes as the embedding dimension increases from *m* to *m* + 1.

For a time series *x*(*i*), 1 ≤ *i* ≤ *N*, of finite length *N*, construct the *N* − *m* + 1 vectors *Xm*(*i*) of the form:

$$X\_m(i) = \left\{ \mathbf{x}(i), \mathbf{x}(i+1), \dots, \mathbf{x}(i+m-1) \right\},$$

$$i = 1, 2, \dots, N - m + 1 \tag{17}$$

where *m* is the embedding dimension.

Let $C\_i^m(r)$ be the probability that any vector $X\_m(j)$ is within distance *r* of $X\_m(i)$, defined as:

$$C\_i^m(r) = \frac{1}{N - m + 1} \sum\_{j=1}^{N-m+1} \Theta \left( r - d\_{ij}^m \right);$$

$$i, j = 1, 2, \dots, N - m + 1 \tag{18}$$

where $d\_{ij}^m$ is the distance between the vectors $X\_m(i)$ and $X\_m(j)$, defined as:

$$d\_{ij}^{m} = d\left[X\_i^m, X\_j^m\right] = \max\left(\left|x\left(i+k\right) - x(j+k)\right|\right),$$

$$k = 0, 1, \ldots, m-1 \tag{19}$$

and $\Theta(\cdot)$ is the Heaviside function.

After that, define a parameter $\Phi^m(r)$:

$$\Phi^{m}(r) = (N - m + 1)^{-1} \sum\_{i=1}^{N-m+1} \ln C\_i^{m}(r) \tag{20}$$

Next, when the dimension changes to *m* + 1, the above process is repeated.

$$\Phi^{m+1}(r) = (N-m)^{-1} \sum\_{i=1}^{N-m} \ln C\_i^{m+1}(r) \tag{21}$$

Finally, the approximate entropy is defined by:

$$ApEn\left(m, r, N\right) = \Phi^m\left(r\right) - \Phi^{m+1}\left(r\right) \tag{22}$$

The detailed algorithm can be found in Bruhn et al. (2000). The ApEn index is influenced by the data length (*N*), tolerance (*r*) and embedding dimension (*m*). According to Pincus (1991) and Bruhn et al. (2000), *N* is recommended to be 1000, *r* to be 0.1–0.25 times the standard deviation of the signal, and *m* to be 2–3. The selection of these parameters is described in Supplement Material 1.
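The steps in Eqs. (17)–(22) can be sketched directly; self-matches are included (j = i allowed), as in the original ApEn definition. The default tolerance of 0.2 × SD lies within the 0.1–0.25 range recommended above, but all parameter choices here are illustrative.

```python
import math

def apen(x, m=2, r=None):
    """Approximate entropy, Eqs. (17)-(22). Self-matches are counted."""
    n = len(x)
    if r is None:
        mu = sum(x) / n
        r = 0.2 * math.sqrt(sum((v - mu) ** 2 for v in x) / n)  # 0.2 x SD
    def phi(dim):
        # Template vectors X_dim(i), Eq. (17)
        T = [x[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for ti in T:
            # Chebyshev distance, Eq. (19); Theta(r - d) counts d <= r, Eq. (18)
            c = sum(1 for tj in T
                    if max(abs(a - b) for a, b in zip(ti, tj)) <= r)
            total += math.log(c / len(T))
        return total / len(T)                 # Eqs. (20)-(21)
    return phi(m) - phi(m + 1)                # Eq. (22)
```

A perfectly periodic signal yields ApEn near zero, since every length-*m* match continues to a length-(*m* + 1) match; irregular signals lose matches as *m* grows and yield larger values.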

#### **SAMPLE ENTROPY (SampEn)**

The SampEn proposed by Richman and Moorman (2000) is based on ApEn but differs from it in three ways to remove bias:

(1) SampEn excludes self-matches when counting similar patterns.

(2) SampEn takes the logarithm of the ratio of the total template-match counts, rather than averaging the logarithm of each template's match probability as ApEn does.

(3) In order to have an equal number of patterns for both embedding dimensions *m* and *m* + 1, the time series reconstitution in SampEn has *N* − *m* rows instead of the *N* − *m* + 1 used in ApEn for embedding dimension *m*.

The first step of calculating SampEn is the same as ApEn. When the embedding dimension is *m*, the total number of template matches is:

$$B^m(r) = (N - m)^{-1} \sum\_{i=1}^{N-m} C\_i^m(r) \tag{23}$$

Similarly, when the embedding dimension is *m* + 1, the total number of template matches is:

$$A^m(r) = (N - m)^{-1} \sum\_{i=1}^{N-m} C\_i^{m+1}(r) \tag{24}$$

Finally, the SampEn of the time series is estimated by:

$$SampEn\left(r, m, N\right) = -\ln \frac{A^{m}(r)}{B^{m}(r)} \tag{25}$$

SampEn is based on ApEn, so its parameter selection procedure is similar to that of ApEn (see Supplement Material 1).
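Following the three modifications listed above, Eqs. (23)–(25) reduce to a compact sketch: self-matches are excluded, *N* − *m* templates are used for both dimensions, and a single logarithm is taken of the count ratio. Parameter defaults are illustrative.

```python
import math

def sampen(x, m=2, r=None):
    """Sample entropy, Eqs. (23)-(25): -ln(A^m(r) / B^m(r))."""
    n = len(x)
    if r is None:
        mu = sum(x) / n
        r = 0.2 * math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    def matches(dim):
        # N - m templates for both dim = m and dim = m + 1 (bias removal (3))
        T = [x[i:i + dim] for i in range(n - m)]
        hits = 0
        for i in range(len(T)):
            for j in range(i + 1, len(T)):    # j > i: self-matches excluded (1)
                if max(abs(a - b) for a, b in zip(T[i], T[j])) <= r:
                    hits += 1
        return hits
    # single logarithm of the count ratio (2)
    return -math.log(matches(m + 1) / matches(m))
```

As with ApEn, a periodic signal gives values near zero and irregular signals give larger values, but SampEn is independent of the self-match bias.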

### **FUZZY ENTROPY (FuzzyEn)**

Zadeh introduced the concept of the "fuzzy set" (Zadeh, 1965). A fuzzy set provides a mechanism for measuring the degree to which a pattern belongs to a given class, through the concept of a "membership degree" given by a fuzzy function $u\_C(x)$. The nearer the value of $u\_C(x)$ is to unity, the higher the membership grade of *x* in the set *C*. Inspired by this, Chen et al. (2007) developed FuzzyEn based on SampEn. FuzzyEn uses the fuzzy membership function $u(d\_{ij}^m, r)$ instead of the Heaviside function to obtain the similarity between $X\_i^m$ and $X\_j^m$.

FuzzyEn is based on SampEn, so its parameter selection is similar to that of SampEn (see Supplement Material 1).
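To show only the piece that changes relative to SampEn, the following sketch replaces the Heaviside step with an exponential membership function exp(−d^p/r) and removes each template's own mean, following the general scheme of Chen et al. (2007); the parameter values and the exact membership form are illustrative assumptions.

```python
import math

def fuzzyen(x, m=2, r=0.2, p=2):
    """FuzzyEn sketch: SampEn with Theta(r - d) replaced by the fuzzy
    membership u(d, r) = exp(-(d ** p) / r)."""
    n = len(x)
    def mean_similarity(dim):
        T = []
        for i in range(n - m):
            seg = x[i:i + dim]
            mu = sum(seg) / dim
            T.append([v - mu for v in seg])       # remove each template's baseline
        total, pairs = 0.0, 0
        for i in range(len(T)):
            for j in range(i + 1, len(T)):
                d = max(abs(a - b) for a, b in zip(T[i], T[j]))
                total += math.exp(-(d ** p) / r)  # graded similarity instead of 0/1
                pairs += 1
        return total / pairs
    return math.log(mean_similarity(m)) - math.log(mean_similarity(m + 1))
```

The graded membership makes the similarity count vary continuously with *r*, which is why FuzzyEn degrades more gracefully than ApEn/SampEn when the tolerance is near the noise level.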

#### **PERMUTATION ENTROPY (SPE, TPE, AND RPE)**

There are three types of PE measures involved in this study. PE is an ordinal analysis method, in which a given time series is divided into a series of ordinal patterns for describing the order relations between the present and a fixed number of equidistant past values (Bandt, 2005). The advantage of this method is its simplicity, robustness and low computational complexity (Li et al., 2007).

For an *N*-point normalized time series {*x*(*i*) : 1 ≤ *i* ≤ *N*}, firstly the time series is reconstructed:

$$X\_i = \{ \mathbf{x}(i), \mathbf{x}(i+\tau), \dots, \mathbf{x}(i+(m-1)\tau) \},$$

$$i = 1, 2, \dots, N - (m-1)\tau \tag{26}$$

where τ is the time delay and *m* is the embedding dimension.

Then, rearrange the elements of *Xi* in increasing order:

$$x\left(i + \left(j\_1 - 1\right)\tau\right) \leq x\left(i + \left(j\_2 - 1\right)\tau\right) \leq \cdots \leq x\left(i + \left(j\_m - 1\right)\tau\right) \tag{27}$$

There are *m*! permutations for *m* dimensions. Each vector *Xi* can be mapped to one of the *m*! permutations.

Next, the probability *pj* of the *j*th permutation occurring can be defined as:

$$p\_j = \frac{n\_j}{\sum\_{j=1}^{m!} n\_j} \tag{28}$$

where *nj* is the number of times the *j*th permutation occurs.

Based on the probability of the *j*th permutation *pj*, we define SPE, TPE and RPE as follows.

SPE is just the Shannon entropy associated with the probability distribution *pj*:

$$S\_1^{(s)} = -\sum\_{j=1}^{m!} p\_j \log p\_j \tag{29}$$

And the normalized SPE is:

$$SPE\_n = \frac{S\_1^{(S)}}{S\_{1,\max}^{(S)}} = \frac{-\sum\_{j=1}^{m!} p\_j \log p\_j}{\log(m!)} \tag{30}$$

Based on the definition of Tsallis entropy, Zunino et al. (2008) proposed the normalized TPE, defined as:

$$TPE = \frac{\sum\_{j=1}^{m!} \left(p\_j - p\_j^q\right)}{1 - (m!)^{1-q}} \tag{31}$$

Furthermore, the normalized RPE measure based on the Renyi entropy and permutation probability distribution *pj* is:

$$RPE\_n = \frac{\log \sum\_{j=1}^{m!} p\_j^a}{(1-a)\log(m!)} \tag{32}$$

In Li et al. (2008a, 2010, 2012), SPE was used to evaluate the effect of sevoflurane and isoflurane anesthesia on the brain. In this study, the parameters *m* = 6 and τ = 1 are selected for sevoflurane anesthesia, as proposed in Li et al. (2008a). The SPE parameters for isoflurane anesthesia are the same as those proposed by Li et al. (2012). TPE and RPE are used here for DoA measurement for the first time, so their parameters were selected based on experiments. The details of the selection process are shown in Supplement Material 1.
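Eqs. (26)–(32) reduce to counting ordinal patterns; the following sketch computes the pattern distribution and the three normalized permutation entropies. The values *m* = 3, *q* = 2 and *a* = 2 are illustrative only (the study uses *m* = 6 with experimentally selected *q* and *a*).

```python
import math

def ordinal_probs(x, m=3, tau=1):
    """Probabilities p_j of the observed ordinal patterns, Eqs. (26)-(28)."""
    counts = {}
    n_vec = len(x) - (m - 1) * tau
    for i in range(n_vec):
        vec = [x[i + k * tau] for k in range(m)]
        pattern = tuple(sorted(range(m), key=lambda k: vec[k]))  # rank order, Eq. (27)
        counts[pattern] = counts.get(pattern, 0) + 1
    return [c / n_vec for c in counts.values()]

def spe(p, m):
    """Normalized Shannon permutation entropy, Eq. (30)."""
    return -sum(pj * math.log(pj) for pj in p) / math.log(math.factorial(m))

def tpe(p, m, q=2.0):
    """Normalized Tsallis permutation entropy, Eq. (31)."""
    return sum(pj - pj ** q for pj in p) / (1 - math.factorial(m) ** (1 - q))

def rpe(p, m, a=2.0):
    """Normalized Renyi permutation entropy, Eq. (32)."""
    return math.log(sum(pj ** a for pj in p)) / ((1 - a) * math.log(math.factorial(m)))
```

A monotone series produces a single ordinal pattern and all three entropies are zero; a white-noise series visits all *m*! patterns nearly uniformly and all three approach 1.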

### **MATERIALS AND STATISTICAL METHODS**

#### **SUBJECTS AND EEG RECORDINGS**

### *EEG data set during sevoflurane-induced anesthesia*

In this study, the first data set we used was from a previous study (McKay et al., 2006), in which 19 patients aged 18–63 years were recruited from Waikato Hospital, Hamilton, New Zealand. The subjects were scheduled for elective gynecologic, general, or orthopedic surgery. All patients fasted for at least 6 h before anesthesia and received no premedication. Patients were American Society of Anesthesiologists physical status I or II and signed written informed consent following approval by the Waikato Hospital ethics committee.

Before application of the Ag/AgCl electrodes, the skin was carefully cleaned with an alcohol swab to ensure an electrode-skin impedance of less than 7.5 kΩ. A composite electrode, the Entropy™ Sensor, composed of a self-adhering flexible band holding three electrodes, was used to record the EEG signals between the forehead and temple (active = FpZ, earth = Fp1, and reference = F8). RE and SE were measured every 5 s with a plug-in M-Entropy S/5 Module (Datex-Ohmeda). The sevoflurane concentration was measured at the mouth at 100 samples/s (McKay et al., 2006). All data were recorded and stored on a laptop computer. Off-line analysis was performed using MATLAB (version 8, MathWorks Inc.).

### *EEG data set during isoflurane-induced anesthesia*

The second data set contains 29 patients (9 men and 20 women, aged 33–77 years) receiving elective abdominal surgery under combined isoflurane general anesthesia and epidural anesthesia (Hagihira et al., 2002). These patients had no neurologic or psychiatric disorders and were not receiving any medication known to influence anesthesia. The recordings were approved by Osaka Prefectural Habikino Hospital and all patients gave written informed consent.

Each patient was injected intramuscularly with 0.5 mg atropine before entering the operating room. Initially, an epidural catheter was placed at the appropriate spinal location. Then, after confirming the effect of epidural analgesia, 3 mg/kg thiopental was used to induce anesthesia. Anesthesia was subsequently maintained with isoflurane, oxygen, and nitrogen after tracheal intubation. Vecuronium was given as required. Lidocaine 1% (80–110 mg/h; initial dose, 90–100 mg) was administered epidurally. Patients received controlled ventilation to maintain adequate oxygenation and normocapnia. To keep mean blood pressure at 60 mmHg, dopamine was administered as required at a dose of 2–5 µg/(kg·min).

Before induction of anesthesia, five EEG electrodes (A1, A2, FP1, FP2, and FPz) were attached to the patients according to the International 10–20 System. FPz was used as the ground electrode. The EEG signal used was recorded from a unipolar lead (FP1-A1) through a 514 X-2 EEG telemetry system (GE Marquette, Tokyo, Japan) with a sampling frequency of 512 Hz (the other FP2-A2 channel was not analyzed). Isoflurane was initially increased to 1.5% and then stepped down to 0.7%. The end-tidal concentration of isoflurane was purposely maintained at set levels (1.5, 1.3, 1.1, 0.9, and 0.7%) for 30 min at each level. The EEG recordings at 0.3 and 0.5% isoflurane were collected immediately after the operation. The concentration of isoflurane was continuously monitored and recorded by Capnomac (Datex, Helsinki, Finland). A burst suppression pattern (BSP) was evident in six of the 29 EEG recordings.

Both data sets can be obtained on request from the authors of the corresponding original papers.

#### **EEG PREPROCESSING**

All the EEG recordings were preprocessed following the steps outlined in Li et al. (2010) before further analysis. Firstly, data points whose amplitude exceeded a threshold determined by mean and standard deviation (SD) statistics were removed as outliers. Secondly, the filter function filter.m was used to remove frequency components above 60 Hz; this FIR filter ensures that phase information is not distorted. Thirdly, the stationary wavelet transform was used to reduce electro-oculogram (EOG) artifacts. Finally, an inverse filter was used to detect and remove EMG and other high-amplitude transient artifacts.

# **PHARMACOKINETIC/PHARMACODYNAMIC MODELING**

To derive the relationship between the effect-site anesthetic drug concentration and the measured EEG index, PK/PD modeling was used. This approach has been successfully used to evaluate proposed EEG indices (Li et al., 2008a; Olofsen et al., 2008). It describes the relationship between drug dose and drug effect through two successive physiological processes (McKay et al., 2006). The pharmacokinetic (PK) side of the model describes the changes in blood concentration of the drug over time, while the pharmacodynamic (PD) side describes the relation between the concentration of the drug at its effect site and its measured effect. The simplest effect-site model is a first-order model, defined as:

$$dC\_{\rm eff}/dt = k\_{\rm eo}(C\_{\rm et} - C\_{\rm eff})\tag{33}$$

where *C*eff denotes the effect-site concentration, *k*eo is the first-order rate constant for efflux from the effect compartment and *C*et is the end-tidal concentration.

In addition, a non-linear inhibitory sigmoid *E*max model was used to describe the relationship between the estimated *C*eff and the measured EEG indices.

$$\text{Effect} = E\_{\text{max}} - (E\_{\text{max}} - E\_{\text{min}}) \times \frac{C\_{\text{eff}}^{\gamma}}{EC\_{50}^{\gamma} + C\_{\text{eff}}^{\gamma}} \tag{34}$$

where Effect is the processed EEG measure, *E*max and *E*min are, respectively, the maximum and minimum Effect for each individual, *EC*50 is the drug concentration that causes 50% of the maximum Effect, and γ is the slope of the concentration–response relationship.

The coefficient of determination *R*<sup>2</sup> is calculated by:

$$R^2 = 1 - \frac{\sum\_{i=1}^{n} \left(y\_i - \hat{y}\_i\right)^2}{\sum\_{i=1}^{n} \left(y\_i - \overline{y}\right)^2} \tag{35}$$

where $y\_i$ is the measured Effect at a given time and $\hat{y}\_i$ is the corresponding modeled Effect.

*C*eff is estimated by iteratively running the above model with a series of *k*eo values; the optimal *k*eo is the one yielding the greatest *R*<sup>2</sup> for each patient.
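The PK/PD chain of Eqs. (33)–(35) can be sketched numerically: a forward-Euler discretization of the effect-site compartment, the inhibitory sigmoid Emax model, and the coefficient of determination. The *k*eo grid search described above would simply loop `effect_site` over candidate values; all numbers below are illustrative.

```python
def effect_site(cet, keo, dt):
    """Forward-Euler discretization of Eq. (33): dCeff/dt = keo * (Cet - Ceff)."""
    ceff, out = 0.0, []
    for c in cet:
        ceff += keo * (c - ceff) * dt
        out.append(ceff)
    return out

def sigmoid_effect(ceff, e_max, e_min, ec50, gamma):
    """Inhibitory sigmoid Emax model, Eq. (34)."""
    return [e_max - (e_max - e_min) * c ** gamma / (ec50 ** gamma + c ** gamma)
            for c in ceff]

def r_squared(y, y_hat):
    """Coefficient of determination, Eq. (35)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot
```

With a constant end-tidal concentration, the simulated *C*eff rises exponentially toward *C*et with rate *k*eo, and at *C*eff = *EC*50 the sigmoid model returns the midpoint between *E*max and *E*min.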

### **MDFA EXPONENT**

Kantelhardt et al. (2002) proposed the MDFA method for describing non-stationary time series as a generalization of the DFA method. Nguyen-Ky et al. (2010a) used the moving-average DFA method to monitor DoA, and the results showed that DFA could accurately estimate a patient's hypnotic state.

For a time series *x*(*t*) of length *N*, the main computation procedure of MDFA consists of three steps.

Step 1. Construct the profile as the equation showed below,

$$y\left(j\right) = \sum\_{i=1}^{j} \left[x\left(i\right) - \left\langle x\right\rangle\right] \tag{36}$$

where ⟨*x*⟩ represents the average value of *x*(*t*).

Step 2. Divide the profile *y*(*j*) into *Ns* = *N*/*s* non-overlapping segments of equal length *s*. Since the record length *N* may not be a multiple of the time scale *s*, a short part at the end of the profile will usually remain. In order not to disregard this part of the record, the same procedure is repeated starting from the other end of the profile. Thus, 2*Ns* segments are obtained altogether.

Step 3. Calculate the local trend for each segment by a least-squares fit of the data and calculate the variance *F*<sup>2</sup>(*s*, *v*). The *q*th-order fluctuation function is then calculated as follows:

$$F\_q\left(s\right) = \left\{\frac{1}{2N\_s} \sum\_{\nu=1}^{2N\_s} \left[F^2\left(s,\nu\right)\right]^{q/2}\right\}^{1/q} \tag{37}$$

If *q* = 0, then

$$F\_0\left(s\right) = \exp\left\{\frac{1}{4N\_s} \sum\_{\nu=1}^{2N\_s} \ln\left[F^2(s,\nu)\right]\right\} \tag{38}$$

It is obvious that when *q* = 2, we have the standard DFA procedure.

MDFA characterizes the evolution of *Fq*(*s*) as a function of the segment length *s*. For fluctuations exhibiting power-law behavior, *Fq*(*s*) ∝ *s*<sup>*h*(*q*)</sup>, where *h*(*q*) is the generalized Hurst exponent.

For a multifractal time series, the scaling behavior is sensitive to the parameter *q*. For positive *q*, *h*(*q*) describes the scaling behavior of the segments with large fluctuations; conversely, for negative *q*, *h*(*q*) is sensitive to small fluctuations. For more detail on the MDFA method, see Kantelhardt et al. (2002).

In this study, we only considered the influence of *q* on the MDFA measure. The selection of this parameter is described in Supplement Material 1.
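The three steps above can be sketched as a minimal MDFA with linear (order-1) detrending, returning *Fq*(*s*) for a list of scales; extracting *h*(*q*) would then be a log-log regression of *Fq*(*s*) on *s*. This is an illustrative reduction, not the full implementation of Kantelhardt et al. (2002).

```python
import math

def mdfa_fq(x, scales, q):
    """q-th order fluctuation function F_q(s), Eqs. (36)-(38), linear detrending."""
    n = len(x)
    mean = sum(x) / n
    y, acc = [], 0.0
    for v in x:                       # Step 1: profile, Eq. (36)
        acc += v - mean
        y.append(acc)
    out = []
    for s in scales:
        ns = n // s
        # Step 2: 2*Ns segments, taken from both ends of the profile
        starts = list(range(0, ns * s, s)) + list(range(n - ns * s, n, s))
        variances = []
        for start in starts:          # Step 3: local linear trend per segment
            seg = y[start:start + s]
            tb, sb = (s - 1) / 2.0, sum(seg) / s
            denom = sum((t - tb) ** 2 for t in range(s))
            slope = sum((t - tb) * (seg[t] - sb) for t in range(s)) / denom
            resid = [seg[t] - (sb + slope * (t - tb)) for t in range(s)]
            variances.append(sum(e * e for e in resid) / s)   # F^2(s, v)
        if q == 0:                    # Eq. (38)
            out.append(math.exp(sum(math.log(v) for v in variances)
                                / (2 * len(variances))))
        else:                         # Eq. (37)
            out.append((sum(v ** (q / 2) for v in variances)
                        / len(variances)) ** (1.0 / q))
    return out
```

For white noise the profile is a random walk, so *F*2(*s*) grows roughly as *s*<sup>0.5</sup>, i.e., *h*(2) ≈ 0.5.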

#### **STATISTICAL ANALYSIS**

To further evaluate the correlation between the measured EEG index and the underlying anesthetic drug effect, prediction probability (*Pk*) statistics were applied, as described in Smith et al. (1996). Given two random data points with different *C*eff, *Pk* is the probability that the measured EEG index correctly predicts which point has the higher *C*eff. It is defined as:

$$P\_k = \frac{P\_\mathcal{c} + P\_\text{tx}/2}{P\_\mathcal{c} + P\_d + P\_\text{tx}} \tag{39}$$

where *Pc*, *Pd* and *Ptx* are, respectively, the probabilities that two data points drawn at random, independently and with replacement from the population, are a concordance, a discordance, or an x-only tie. A value of 1 means that the EEG index is perfectly concordant with *C*eff, whereas a value of 0.5 means the EEG index performs no better than chance. When the monotonic relation between the drug concentration and the EEG index is negative, the resultant *Pk* value is replaced by 1 − *Pk*.
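Eq. (39) can be computed directly from all distinct pairs of (concentration, index) points: pairs tied in concentration are excluded, and pairs tied only in the index count as x-only ties. A minimal sketch:

```python
def prediction_probability(conc, index):
    """Pk statistic, Eq. (39), from concordances, discordances and x-only ties."""
    pc = pdis = ptx = 0
    n = len(conc)
    for i in range(n):
        for j in range(i + 1, n):
            dc = conc[i] - conc[j]
            di = index[i] - index[j]
            if dc == 0:
                continue          # pairs tied in concentration are excluded
            if di == 0:
                ptx += 1          # tie in the EEG index only
            elif dc * di > 0:
                pc += 1           # concordant pair
            else:
                pdis += 1         # discordant pair
    return (pc + ptx / 2.0) / (pc + pdis + ptx)
```

A perfectly concordant index yields *Pk* = 1, a perfectly anti-monotonic one yields 0 (reported as 1 − *Pk*, as noted above), and a constant index yields 0.5.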

In addition, the Kolmogorov–Smirnov test was used to determine whether the data sets were normally distributed. To assess index stability during the awake state and sensitivity to the induction process, the relative coefficient of variation (CV) (Li et al., 2008a) was used. The Kruskal-Wallis test was used to determine significant differences in index values between the awake, induction, anesthesia, and recovery states.

# **RESULTS**

First, we applied these entropy measures to EEG data from sevoflurane anesthesia. **Figure 1A** shows a preprocessed EEG recording during the whole sevoflurane anesthesia process, from awake to induction, then to deep anesthesia, and finally to recovery. With deepening anesthesia, the mean amplitude of the EEG gradually increased, and the amplitude then decreased during recovery. The concurrent end-tidal sevoflurane concentration is represented by the black line in **Figure 1B**. It can be regarded as the drug concentration in blood, derived from the recorded sevoflurane concentration at the mouth (gray line). The changes in RE, SE, SWE, TWE, RWE, HHSE, ApEn, SampEn, FuzzyEn, SPE, TPE, RPE, and MDFA corresponding to the EEG recording are given successively in **Figures 1C–K**. As can be seen, all the entropy indices generally followed the changes in EEG pattern as the drug concentration increased and decreased, while MDFA showed the opposite trend to the entropy indices.

We then analyzed the EEG recordings during isoflurane anesthesia using the same entropy algorithms and MDFA method. **Figures 2A,B** show the EEG recording and the isoflurane end-tidal concentration, respectively. It can be seen that the drug concentration increased and then decreased. **Figures 2C–K** show the same entropy and MDFA indices as **Figures 1C–K** and demonstrate equivalent trends, in line with the changes in drug concentration.

Loss of consciousness (LOC) is the most important clinical time point during anesthesia. We investigated the ability of these entropies to track LOC. **Figure 3** shows the changes in each index around LOC, from LOC − 30 s to LOC + 30 s, for all subjects during sevoflurane anesthesia. For these plots, index values were normalized to between 0 and 1. It can be seen in **Figures 3A–N** that MDFA(−8) decreased most rapidly, followed by SWE. Thus, MDFA with *q* = −8 appeared to be the most sensitive to LOC. To verify this, we calculated the absolute slope values (mean ± SD) of the linear-fitted polynomials vs. time for these indices, as shown in **Figure 3O**. As can be seen, the absolute slope value for MDFA(−8) (0.44 ± 0.22) was the largest, followed by SWE (0.43 ± 0.23).

To further compare the ability of the indices to distinguish different anesthesia states, the sevoflurane anesthesia procedure was divided into four states, i.e., awake, induction, deep anesthesia, and recovery. For each index, a box plot is given in **Figure 4**. The data were not normally distributed, so the statistics for the 19 patients undergoing sevoflurane anesthesia are expressed as median (min–max), as shown in **Table 1**. All the entropy indices monotonically decreased as anesthesia deepened, then increased

**FIGURE 1 | An EEG recording from a patient undergoing sevoflurane anesthesia and corresponding entropy indices vs. time. (A)** Preprocessed EEG recording. **(B)** Sevoflurane concentration recorded at the mouth (gray line) and the derived end-tidal sevoflurane concentration (black line). **(C–J)**

The time course of the studied EEG indices. The indices are calculated over a window of 10 s with an overlap of 75%. **(K)** The time course of MDFA at *q* = 2 [MDFA(2)] and *q* = −8 [MDFA(−8)]. The window and overlap selection are similar to those of the entropy measures.

**FIGURE 2 | An EEG recording from a patient in isoflurane anesthesia and calculated indices. (A)** Preprocessed EEG recording, re-sampled at 128 Hz. **(B)** Recording of the isoflurane end-tidal concentration. **(C–J)**

Time course of the entropy indices, with a time interval of 10 s and 5 s overlap. **(K)** Time course of the MDFA measures with a time interval of 10 s and 5 s overlap.

**FIGURE 3 | Entropy and MDFA analysis around the time of LOC for subjects undergoing sevoflurane anesthesia (***n* **= 19). (A–N)** The normalized indices around LOC (from LOC − 30 s to LOC + 30 s) for all subjects. The red plus sign denotes the point of LOC. **(O)**

Statistical analysis of the absolute slope of the linear-fitted polynomials vs. time for studied indices. Bar height indicates the mean value, and the lower and upper line are the 95% confidence interval of each index.

during recovery. The MDFA indices showed the opposite trend to the entropy measures. These results are consistent with those in **Figure 1**. The overlap of the three types of PE (SPE, TPE, and RPE) values between the awake and deep anesthesia states was smaller than for the other indices. This means that PE has a better ability to separate these states and greater robustness to individual differences.

To estimate the baseline variability and the sensitivity of each index to the induction process, the CV values of all the indices for the sevoflurane data set were computed; the results are given in **Table 2**. During the awake state, the CV value of SampEn was 0.095, the highest; the CV value of TPE was 0.003, significantly lower than MDFA(2) (0.240), MDFA(−8) (0.125), and the other indices. The CV values of SPE and RPE were lower



*RE, response entropy in the M-entropy module; SE, state entropy; SWE, Shannon wavelet entropy; TWE, Tsallis wavelet entropy; RWE, Renyi wavelet entropy; HHSE, Hilbert-Huang spectral entropy; ApEn, approximate entropy; SampEn, sample entropy; FuzzyEn, fuzzy entropy; SPE, Shannon permutation entropy; TPE, Tsallis permutation entropy; RPE, Renyi permutation entropy; MDFA(2), Multifractal detrended fluctuation analysis with q* = 2*; MDFA(-8), Multifractal detrended fluctuation analysis with q* = −8.



than other indices as well. The lower CV value of PE illustrates that PE measures were less sensitive to noise, while MDFA methods were least robust against noise. During induction, the CV of SWE (0.338) was the highest. This demonstrates that SWE had a faster response speed compared to the other indices.

To verify the performance of all the indices in monitoring DoA and detecting the burst suppression state, we analyzed the isoflurane anesthesia data set, in which some subjects entered the burst suppression state during deep anesthesia. The results are given in histogram form in **Figure 5**. All the indices except SE and MDFA decreased with increasing isoflurane concentration. During burst suppression, only ApEn and SampEn continued to decrease. This means that the ApEn and SampEn algorithms could be used to evaluate DoA, including detection of the burst suppression state, without the need for supplementary methods. The tabulated results for each index at the different isoflurane concentrations and BSP are presented in **Table 3**. The CV values of the indices show that PE (0.033) outperformed the others in the awake state (0% concentration) (see **Table 4**), while the CVs of the two MDFA measures were relatively high in the awake state, indicating that the MDFA algorithms were no better than some entropy measures in noise robustness.

To further compare the performance of the studied indices, PK/PD modeling was performed to describe the relationship between the index values and the estimated sevoflurane and isoflurane effect-site concentrations. **Tables 5, 6** give these parameters for sevoflurane and isoflurane anesthesia respectively, in which the maximum coefficient of determination (*R*<sup>2</sup>) gives the correlation between the index values and the anesthetic effect-site concentration. **Figures 6A,B** show the *R*<sup>2</sup> values of the indices for the two data sets. **Figure 6A** shows the *R*<sup>2</sup> values for sevoflurane; *R*<sup>2</sup> for TPE (0.95, 95% confidence interval 0.92–0.98) was significantly higher than for the other entropy indices. **Figure 6B** shows the *R*<sup>2</sup> values for isoflurane; again, *R*<sup>2</sup> for SPE (0.81) was higher than for the other entropy indices. Although *R*<sup>2</sup> for MDFA with *q* = 8 was relatively high for sevoflurane anesthesia, its value for isoflurane anesthesia was lower. The statistical analysis also shows that, for the same entropy algorithm, the mean *R*<sup>2</sup> value for sevoflurane was significantly higher than for isoflurane.

To assess the ability of the indices to correctly predict drug effect-site concentrations, we evaluated the prediction probability *Pk* of all the indices from the PK/PD modeling for all subjects, as shown in **Figures 7A,B**; the statistical results are shown in **Table 7**. Overall, most *Pk* values for sevoflurane were higher than for isoflurane. For sevoflurane, the *Pk* values of RPE and MDFA were equal (0.87; 95% confidence intervals 0.83–0.90 and 0.83–0.92, respectively), slightly higher than RWE (0.85) and TWE (0.81; 95% confidence interval 0.79–0.84). Also, *Pk* of

**Table 3 | The statistics of the studied indices at different isoflurane concentrations [median (min-max)].**


RPE was higher than that of TPE and SPE. Similarly, the *Pk* of RWE was the highest of the three WE methods. This means that Renyi entropy performed better at predicting drug effect-site concentrations than Shannon entropy and Tsallis entropy. The differences between RPE and the other indices were statistically significant (all *p* < 0.05, paired *t*-test), except for MDFA(−8). The differences between RPE and TPE and between RPE and SPE were statistically significant (*p* = 0.03 and 0.01, respectively, paired *t*-test), which means that RPE had a stronger ability to track the sevoflurane effect-site concentration during anesthesia. For a more intuitive comparison, the best curve fits of all indices against effect-site concentration are shown for both sevoflurane (**Figure 8**) and isoflurane (**Figure 9**).

To compare the computational speed of each index in tracking DoA, we recorded the computing time of each index for the same subjects: 20 EEG recordings from the two data sets were selected. The epoch length (*N*) for each algorithm was 10 s, with an overlap of 5.0 s. The computing times for 1 min of EEG data are compared for each index in **Table 8**. The fastest index was WE (0.025 ± 0.001 s). The RE/SE and PE computation times were 0.096 ± 0.008 s and 0.545 ± 0.016 s, respectively. MDFA (16.338 ± 0.280 s) was the slowest. The desktop computer used for this test had the following configuration: Intel Core i3 CPU, 4 cores at 2.93 GHz, 2 GB of RAM, running the Windows XP Professional operating system.

# **DISCUSSION AND CONCLUSION**

In this study, we investigated the performance of 12 entropy algorithms to assess the effect of GABAergic anesthetic agents on EEG activity, including RE, SE, SWE, TWE, RWE, HHSE, ApEn, SampEn, FuzzyEn, SPE, TPE, and RPE. Two data sets including sevoflurane and isoflurane anesthesia were employed as the test samples for evaluating the entropy algorithms. We compared their performance in estimating the DoA and detecting the burst suppression pattern. PK/PD modeling and prediction probability

**Table 4 | The CV of indices for different isoflurane concentrations.**


#### **Table 5 | The PK/PD modeling parameters for sevoflurane.**

statistics were applied to assess their effectiveness. In addition, we compared the MDFA measure with all entropy indices to test the efficiency of entropy approach.

The twelve entropy measures can be divided into two classes: time-domain and time-frequency-domain analyses. On the one hand, ApEn, SampEn, FuzzyEn, and PE are time-domain analysis methods. All of these entropy algorithms are based on non-linear theories, and the first three are phase-space analytical methods (Chen et al., 2009). PE is based on ordinal pattern analysis of the time series (Bandt, 2005). Considering that the EEG has non-linear characteristics, these four methods have their advantages. For example, FuzzyEn and PE are less sensitive to signal quality and calculation length (Pincus, 1991; Li et al., 2008a). Relative to ApEn and SampEn, FuzzyEn can resolve more detail in the time series and has a more accurate theoretical definition (Chen et al., 2009). On the other hand, the RE, SE, WE, and HHSE indices are based on the time-frequency domain. The starting point of RE and SE is the spectral entropy, which has the particular advantage that the contributions to entropy from any particular frequency range are explicitly separated. In order to achieve an optimal response time, RE and SE adopt a variable time window for each particular frequency, called time-frequency balanced spectral entropy (Viertiö-Oja et al., 2004). Compared to the variable time windows of RE and SE, the window function of WE is variable in both the time and frequency domains. The HHSE algorithm is based on the EMD and the Hilbert transform (Li et al., 2008b). The advantage of this method is that it can estimate the instantaneous amplitude and phase/frequency. It can also break down a complicated signal, without a basis function (such as sine or wavelet functions), into the several oscillatory modes embedded in it. The marginal spectrum gives a more accurate and nearly continuous distribution of EEG energy, which is completely different from the Fourier spectrum (Li et al., 2008b).


*t*<sub>1/2</sub>*k*eo*, blood effect-site equilibration half-time;* γ*, slope parameter of the concentration-response relation; E*max*, EEG parameter value corresponding to the maximum drug effect; E*min*, EEG parameter value corresponding to the minimum drug effect; EC*50*, concentration that causes 50% of the maximum effect; R*<sup>2</sup>*, maximum coefficient of determination.*

#### **Table 6 | Parameters of PK/PD models for isoflurane.**


**(A)** The *R*<sup>2</sup> value of the entropy indices for sevoflurane anesthesia (*n* = 19). For comparison, the *R*<sup>2</sup> values for each index are expressed by a different sign and color. **(B)** The *R*<sup>2</sup> value of the same entropy indices for isoflurane anesthesia (*n* = 20).

Although each entropy algorithm has theoretical advantages with respect to the characterization of EEG recordings during GABAergic anesthesia, we still need to assess the practical performance from several perspectives. In qualitative terms, all the indices are effective at tracking changes in drug concentration through EEG analysis. As demonstrated in the presented figures and tables, all the entropies decreased with deepening anesthesia. However, there are quantitative differences between indices for different anesthesia states, because the principles underlying each algorithm are entirely different. Entropies based on the time domain, ApEn for example, measure the predictability of future amplitude values of the electroencephalogram based on the knowledge of one or two previous amplitude values. With increasing GABAergic anesthetic drug concentration, the EEG signals become more regular, which leads to a reduction in the ApEn value. Entropies based on the time-frequency domain, such as RE and SE, also decrease with increasing DoA because the EEG shifts to a simpler frequency pattern as the anesthetic dose increases (Rampil, 1998).
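The regularity idea behind ApEn can be sketched as follows (a minimal Python illustration, not the authors' code; the tolerance heuristic r = 0.2 × SD is a common convention, not a parameter taken from this study):

```python
import numpy as np

def apen(x, m=2, r=None):
    """Approximate entropy (Pincus, 1991): lower values = more regular signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * x.std()                    # common heuristic, not from this study
    def phi(mm):
        # all embedding vectors of length mm
        emb = np.array([x[i:i + mm] for i in range(n - mm + 1)])
        # Chebyshev distance between every pair of embedding vectors
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
        frac = np.mean(dist <= r, axis=1)    # fraction of templates within r
        return np.mean(np.log(frac))
    return phi(m) - phi(m + 1)
```

With deepening anesthesia the EEG grows more regular and ApEn falls; in a toy check, a sine wave scores far lower than white noise of the same length.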

Of the 12 entropy measures, TWE and TPE are based on Tsallis entropy, and RWE and RPE on Renyi entropy. Both Tsallis entropy and Renyi entropy are considered generalized concepts of entropy relative to Shannon entropy. Similar to Renyi entropy, Tsallis entropy uses the non-extensive parameter *q* to weight the information of specific events. The results showed that TPE and RPE were better than SPE in assessing the effect of anesthesia; similar results can be seen for TWE, RWE, and SWE. No previous studies have used TPE or RPE in DoA monitoring, so their excellent performance here indicates their potential usefulness in anesthesia analysis.

**Table 7 | The** *Pk* **statistics for sevoflurane and isoflurane anesthesia for each entropy and MDFA index.**
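For reference, the two generalized entropies can be written directly from a probability distribution p and the non-extensive parameter q (a minimal sketch following the standard textbook definitions, not a specific formula from this paper):

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1); -> Shannon as q -> 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if q == 1:
        return -np.sum(p * np.log(p))        # Shannon limit
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def renyi_entropy(p, q):
    """Renyi entropy H_q = log(sum p_i^q) / (1 - q); -> Shannon as q -> 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if q == 1:
        return -np.sum(p * np.log(p))        # Shannon limit
    return np.log(np.sum(p ** q)) / (1.0 - q)
```

Both reduce to Shannon entropy as q → 1; for a uniform distribution over n outcomes, the Renyi entropy equals log n for every q.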

Furthermore, the coefficient of determination and prediction probability statistics were used to assess the correlation of each index with the anesthetic drug effect-site concentration. The three PE measures had higher *Pk* and *R*<sup>2</sup> values than the other indices. MDFA at *q* = 2 also had relatively high *Pk* and *R*<sup>2</sup> values among all the indices. Comparing the anesthetic drugs, the *R*<sup>2</sup> values for sevoflurane anesthesia were higher than for isoflurane anesthesia, while the *Pk* values were similar (see **Figures 5**, **6** and **Table 3**). This means that the entropy measures were better able to track sevoflurane than isoflurane effect-site concentration.

Four additional measures were considered for the evaluation of each entropy index. First, the CV was used to evaluate the sensitivity of each index to artifacts during the awake state (Li et al., 2008b, 2010). The results showed that PE outperformed the other indices in this respect. Among all the entropy measures, SWE had the highest CV during anesthesia induction, indicating that this index was superior at discriminating between the awake and anesthetized states. Second, the performance in estimating the point of LOC was considered. Although all the entropy measures could distinguish between awake and anesthetized states (see **Figure 4**), the speed of transition (slope) between the two states was fastest

**FIGURE 8 | Dose-response curves between the RE (A), SE (B), SWE (C), TWE (D), RWE (E), HHSE (F), ApEn (G), SampEn (H), FuzzyEn (I), SPE (J), TPE (K), RPE (L), MDFA(2) (M), MDFA(−8) (N) and the sevoflurane** *C***eff for the best fit, with the greatest value of** *R***<sup>2</sup> shown above each panel.** The dots denote the measured EEG index values. The solid lines denote the PK/PD modeled EEG index values.

**Table 8 | The computing time for different entropy and MDFA indices for 1 min data length.**


for SWE, while SE had the slowest transition. Third, the performance in discriminating different drug concentrations was considered, especially the ability to distinguish the burst suppression state. The mean ± SD values of the indices showed that all the entropy measures could distinguish different drug concentrations, while only ApEn and SampEn were able to distinguish burst suppression from the other states. This means that, if PE were used as a DoA index, an additional method for detecting the burst suppression pattern would need to be incorporated, such as the Nonlinear Energy Operator (NLEO) (Särkelä et al., 2002). These results are in accordance with the findings during desflurane anesthesia for ApEn (Bruhn et al., 2000) and sevoflurane anesthesia for PE and HHSE (Li et al., 2008b, 2010). Finally, the computing time was used to assess algorithm complexity. The results showed that the WE index is the fastest of all the entropy algorithms tested. HHSE was the slowest: its computing time for the same data length was about 580 times longer than that for WE. To improve computational efficiency, a parallelized method based on graphics processing units has been proposed (Chen et al., 2010).

The efficiency of these entropy measures was also compared with two other non-linear dynamic measures, MDFA with *q* = 2 and *q* = −8, where MDFA with *q* = 2 is the standard DFA measure. The results and statistics show that MDFA was better in some respects than some of the entropy measures, with a sharper slope at LOC and higher *Pk* and *R*<sup>2</sup> for sevoflurane (almost equal to the RPE measure). However, the MDFA measures have several shortcomings. First, the CVs of MDFA in the awake state were higher than those of the entropy indices. Second, MDFA could not distinguish the burst suppression state from other states. Most importantly, the computing time of MDFA was the longest of all the algorithms, even longer than HHSE, which means that the MDFA algorithms are not suitable for real-time DoA monitoring. Therefore, entropy approaches are capable of monitoring the EEG changes in anesthesia, and are often advantageous in computational efficiency.
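The standard DFA measure (MDFA with q = 2) can be sketched as follows (an illustrative Python version with arbitrary window scales, not the authors' implementation):

```python
import numpy as np

def dfa(x, scales=(4, 8, 16, 32, 64)):
    """Standard detrended fluctuation analysis (MDFA with q = 2).
    Returns the scaling exponent alpha (~0.5 for white noise)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                # integrated profile
    flucts = []
    for n in scales:
        n_seg = len(y) // n
        f2 = []
        for i in range(n_seg):
            seg = y[i * n:(i + 1) * n]
            t = np.arange(n)
            coef = np.polyfit(t, seg, 1)       # linear detrend per window
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        flucts.append(np.sqrt(np.mean(f2)))    # RMS fluctuation at scale n
    # alpha = slope of log F(n) vs. log n
    return np.polyfit(np.log(scales), np.log(flucts), 1)[0]
```

White noise yields α ≈ 0.5, while its cumulative sum (a random walk) yields α ≈ 1.5, illustrating how the exponent separates signal classes.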

Although this study covers a number of entropy methods and two types of anesthesia, it has limitations. For instance, errors caused by individual variability (e.g., age, physical wellness, and intraoperative tolerance) are hard to control because of the difficulty of data collection in clinical practice. In addition, interactions between EEG activity and drug concentration could be studied using a finer-grained paradigm, for instance by increasing the drug concentration in a stepwise pattern. Additionally, optimal parameters for each entropy measure may not have been achieved and need further investigation.

This study does not provide an absolute measure of the "depth" of clinical anesthesia, nor of consciousness for the prevention of intra-operative recall; rather, it focuses on understanding the inner workings of each entropy index and explores whether these indices correlate with GABAergic drug effect. A good understanding of the strengths and weaknesses of each measure is necessary before applying them in a clinical context.

In conclusion, each entropy measure has its advantages, and several indices show promise as simple open-source methods for quantifying the brain effects of GABAergic drugs. In particular, the PE indices, especially RPE, perform better than the other entropy indices as EEG derivatives in several respects. However, further work is required to accurately quantify the burst suppression pattern. Also, to be useful as a clinical measure, each algorithm still needs further parameter and computational efficiency optimization.

### **ACKNOWLEDGMENTS**

This research was supported by the National Natural Science Foundation of China (Nos. 61304247, 61203210, and 61271142), the China Postdoctoral Science Foundation (2014M551051), and the Applied Basic Research Project of Hebei Province (No. 12966120D).

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fncom.2015.00016/abstract

### **REFERENCES**


Cao, Y., Tung, W., Gao, J., Protopopescu, V., and Hively, L. (2004). Detecting dynamical changes in time series using the permutation entropy. *Phys. Rev. E* 70:046217. doi: 10.1103/PhysRevE.70.046217


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 August 2014; accepted: 28 January 2015; published online: 18 February 2015.*

*Citation: Liang Z, Wang Y, Sun X, Li D, Voss LJ, Sleigh JW, Hagihira S and Li X (2015) EEG entropy measures in anesthesia. Front. Comput. Neurosci. 9:16. doi: 10.3389/fncom.2015.00016*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2015 Liang, Wang, Sun, Li, Voss, Sleigh, Hagihira and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Detection of subjects and brain regions related to Alzheimer's disease using 3D MRI scans based on eigenbrain and machine learning

Yudong Zhang<sup>1</sup>\*, Zhengchao Dong<sup>2</sup>, Preetha Phillips<sup>3</sup>, Shuihua Wang<sup>1,4</sup>, Genlin Ji<sup>1,5</sup>, Jiquan Yang<sup>5</sup> and Ti-Fei Yuan<sup>6</sup>\*

<sup>1</sup> School of Computer Science and Technology, Nanjing Normal University, Nanjing, China, <sup>2</sup> Division of Translational Imaging and MRI Unit, New York State Psychiatric Institute, Columbia University, New York, NY, USA, <sup>3</sup> School of Natural Sciences and Mathematics, Shepherd University, Shepherdstown, WV, USA, <sup>4</sup> School of Electronic Science and Engineering, Nanjing University, Nanjing, China, <sup>5</sup> Jiangsu Key Laboratory of 3D Printing Equipment and Manufacturing, Nanjing, China, <sup>6</sup> School of Psychology, Nanjing Normal University, Nanjing, China

### Edited by:

Tobias Alecio Mattei, Brain and Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA

#### Reviewed by:

Fahad Sultan, University Tübingen, Germany Petia D. Koprinkova-Hristova, Bulgarian Academy of Sciences, Bulgaria

#### \*Correspondence:

Yudong Zhang, School of Computer Science and Technology, Nanjing Normal University, 1 Wenyuan, Nanjing, Jiangsu 210023, China zhangyudong@njnu.edu.cn; Ti-Fei Yuan, School of Psychology, Nanjing Normal University, 22 Ninghai Rd., Nanjing, Jiangsu 210008, China ytf0707@126.com

> Received: 12 February 2015 Accepted: 17 May 2015 Published: 02 June 2015

#### Citation:

Zhang Y, Dong Z, Phillips P, Wang S, Ji G, Yang J and Yuan T-F (2015) Detection of subjects and brain regions related to Alzheimer's disease using 3D MRI scans based on eigenbrain and machine learning. Front. Comput. Neurosci. 9:66. doi: 10.3389/fncom.2015.00066

Purpose: Early diagnosis or detection of Alzheimer's disease (AD) versus normal elderly controls (NC) is very important. However, computer-aided diagnosis (CAD) is not widely used, and classification performance has not reached the standard of practical use. We proposed a novel CAD system for MR brain images based on eigenbrains and machine learning, with two goals: accurate detection of both AD subjects and AD-related brain regions.

Method: First, we used maximum inter-class variance (ICV) to select key slices from the 3D volumetric data. Second, we generated an eigenbrain set for each subject. Third, the most important eigenbrain (MIE) was obtained by Welch's t-test (WTT). Finally, kernel support vector machines with different kernels, trained by particle swarm optimization, were used to make an accurate prediction of AD subjects. Coefficients of the MIE with values higher than the 0.98 quantile were highlighted to obtain the discriminant regions that distinguish AD from NC.

Results: The experiments showed that the proposed method can predict AD subjects with performance competitive with existing methods; in particular, the accuracy of the polynomial kernel (92.36 ± 0.94) was better than that of the linear kernel (91.47 ± 1.02) and the radial basis function (RBF) kernel (86.71 ± 1.93). The proposed eigenbrain-based CAD system detected 30 AD-related brain regions (Anterior Cingulate, Caudate Nucleus, Cerebellum, Cingulate Gyrus, Claustrum, Inferior Frontal Gyrus, Inferior Parietal Lobule, Insula, Lateral Ventricle, Lentiform Nucleus, Lingual Gyrus, Medial Frontal Gyrus, Middle Frontal Gyrus, Middle Occipital Gyrus, Middle Temporal Gyrus, Paracentral Lobule, Parahippocampal Gyrus, Postcentral Gyrus, Posterior Cingulate, Precentral Gyrus, Precuneus, Subcallosal Gyrus, Sub-Gyral, Superior Frontal Gyrus, Superior Parietal Lobule, Superior Temporal Gyrus, Supramarginal Gyrus, Thalamus, Transverse Temporal Gyrus, and Uncus). The results were consistent with the existing literature.

Conclusion: The eigenbrain method was effective in AD subject prediction and discriminant brain-region detection in MRI scanning.

Keywords: Alzheimer's disease, Welch's t-test, magnetic resonance imaging, machine learning, machine vision, eigenbrain, support vector machine, particle swarm optimization

# Introduction

Alzheimer's disease (AD) is not a normal part of aging. It is a type of dementia that causes problems with memory, thinking, and behavior. Symptoms usually develop slowly and worsen over time; they may become severe enough to interfere with daily life and lead to death (Hahn et al., 2013). There is no cure for this disease. In 2006, 26.6 million people worldwide suffered from it. AD is predicted to affect 1 in 85 people globally by 2050, and at least 43% of prevalent cases will need a high level of care (Brookmeyer et al., 2007). As the world evolves into an aging society, the burdens and impacts of AD on families and society have also increased significantly. In the US, healthcare for people with AD currently costs roughly \$100 billion per year and is predicted to cost \$1 trillion per year by 2050 (Miller et al., 2012).

Early and accurate detection of AD is beneficial for the management of the disease (Han et al., 2011). Presently, a multitude of neurologists and medical researchers are dedicating considerable time and energy toward this goal, and promising results have been continually springing up (Xinyun et al., 2011). Magnetic resonance imaging (MRI) is an imaging technique that produces high-quality images of the anatomical structures of the human body, especially the brain, and provides rich information for clinical diagnosis and biomedical research (Shamonin et al., 2014). The diagnostic value of MRI is greatly enhanced by automated and accurate classification of the MR images (Goh et al., 2014; Zhang et al., 2015a,b), which already plays an important role in detecting AD subjects among normal elderly controls (NC) (Angelini et al., 2012; Smal et al., 2012; Nambakhsh et al., 2013; Hamy et al., 2014; Jeurissen et al., 2014).

In earlier work, most diagnosis was done by measuring, manually or semi-manually, a priori regions of interest (ROIs) in magnetic resonance (MR) images, based on the fact that AD patients suffer more cerebral atrophy than NCs (Kubota et al., 2006; Anagnostopoulos et al., 2013). Most of these ROI-based analyses focused on shrinkage of the hippocampus and cortex, and enlarged ventricles (Pennanen et al., 2004). However, ROI-based methods suffer from some limitations. First, methods that focus on ROIs need prior knowledge. Second, the accuracy of early detection depends heavily on the experience of the examiners. Third, the mutual information among voxels is difficult to exploit (Xinyun et al., 2011; Lee et al., 2013). Finally, there is no evidence that regions other than the hippocampus and entorhinal cortex provide no AD-related information. Also, automatic segmentation of ROIs is not feasible in practice, so examiners tend to segment the brain manually.

On the other hand, multivariate approaches that consider all the voxels in a scan as one observation offer an alternative to ROI-based methods. The advantages of multivariate approaches are that they are data driven, which means that the analyses are based fully on the data without any prior knowledge, and that the interactions among voxels and error effects are assessed statistically. However, multivariate approaches suffer from the curse of dimensionality, small sample sizes, or a limited capability to make statistical inferences about regionally specific changes (Álvarez et al., 2009b).

The eigenbrain is an excellent multivariate approach that addresses both the curse of dimensionality and the small-sample-size problem. It was proposed by Alvarez et al. (2009a) and Lopez et al. (2009), and was applied to Single Photon Emission Computed Tomography (SPECT) images. In their research, the eigenbrain approach was shown to efficiently reduce the feature space from ∼5 × 10<sup>5</sup> to only ∼10<sup>2</sup>, and was therefore able to achieve excellent classification accuracy. In this study, we tentatively apply eigenbrains to MRI scans for AD detection.

Support vector machines (SVMs) are arguably regarded as among the most excellent classification methods in machine learning (Zhang and Wu, 2012a). Original SVMs are linear classifiers and do not perform well on nonlinear data. Hence, we introduce kernel SVMs (KSVMs), which extend the original linear SVMs to nonlinear SVM classifiers by applying a kernel function to replace the dot product in the original SVMs (Gomes et al., 2012). Compared with the original plain SVM, KSVMs allow one to fit the maximum-margin hyperplane in a transformed feature space (Garcia et al., 2010). The transformation may be nonlinear and the transformed space high dimensional; thus, although the classifier is a hyperplane in the high-dimensional feature space, it may be nonlinear in the original input space (Hable, 2012).

The aim of our study was to develop a novel classification system based on eigenbrains and machine learning, in order to build a computer-aided diagnosis (CAD) system for the early detection of AD subjects and AD-related brain regions. Our goal was not to replace clinicians, but to provide an assisting tool. The rest of the paper is organized as follows: the next section reviews related literature from two aspects: extracted features and classification methods. Section The Proposed Method describes the methodology of the proposed CAD. Section Experiments and Results contains the experiments and results. Section Discussion analyzes the reasons behind the experimental results. Finally, Section Conclusion and Future Research is devoted to conclusions and future research. For ease of reading, the acronyms used in this study and their meanings are listed in Table 12 in the appendix.

The **contributions** of the paper fall within the following five aspects: (i) we generalized the eigenbrain to MR images, and proved its effectiveness; (ii) we proposed a hybrid eigenbrain-based CAD system that can not only detect AD from NC, but also detect brain regions related to AD; (iii) we showed that the proposed method had classification accuracy comparable to state-of-the-art methods, and that the detected brain regions were in line with 16 existing publications; (iv) we used inter-class variance (ICV) and Welch's t-test (WTT) to reduce redundant data; (v) we found that the POL kernel was better than the linear and RBF kernels for this study.

# Literature Review

By common convention, automatic classification consists of two stages: feature extraction and classifier construction. We reviewed more than ten publications, and analyzed them through these two stages.

# Features of MR Images

Scholars have proposed numerous methods to extract various features<sup>1</sup>. Chaplot et al. (2006) used the approximation coefficients obtained by the discrete wavelet transform (DWT). Maitra and Chatterjee (2006) employed the Slantlet transform, an improved version of the DWT; their feature vector for each image was created from the magnitudes of Slantlet transform outputs corresponding to six spatial positions chosen according to a specific logic. El-Dahshan et al. (2010) extracted the approximation and detail coefficients of a 3-level DWT. Plant et al. (2010) used brain region clusters (BRC); they suggested using information gain (IG) to rate the interestingness of a voxel, and applied a clustering algorithm to identify groups of adjacent voxels with high discriminatory power. Zhang et al. (2011) exclusively used the approximation coefficients of a 3-level decomposition, and used PCA to reduce the features. Ramasamy and Anandhakumar (2011) used the fast Fourier transform (FFT) as features. Saritha et al. (2013) proposed the novel feature of wavelet-entropy, and employed spider-web plots to further reduce features. Zhang et al. (2013) employed the digital wavelet transform to extract features, then used principal component analysis (PCA) to reduce the feature space. Savio and Grana (2013) proposed deformation-based morphometry (DBM) techniques with five features: the Jacobian map, modulated GM (MGM), trace of the Jacobian matrix (TJM), magnitude of the displacement field, and geodesic anisotropy (GEODAN); in addition, they suggested using Pearson's correlation (PEC), the Bhattacharyya distance (BD), and WTT to measure the significance of a voxel site. Das et al. (2013) suggested the Ripplet transform, followed by PCA to reduce features. Kalbkhani et al. (2013) modeled the detail coefficients of a 2-level DWT with the generalized autoregressive conditional heteroscedasticity (GARCH) statistical model, and the parameters of the GARCH model were considered as the primary feature vector. Zhang et al. (2014) used an undersampling (US) technique on the volumetric image, followed by singular value decomposition (SVD) to select features. El-Dahshan et al. (2014) proposed adding a preprocessing step that used a pulse-coupled neural network (PCNN) for image segmentation. Zhou et al. (2015) used wavelet-entropy as the feature space. Zhang et al. (2015a) used the discrete wavelet packet transform (DWPT), and harnessed Tsallis entropy to obtain features from the DWPT coefficients. Yang et al. (2015) selected wavelet-energy as the features.

From the reviewed literature, DWT-based features have proven efficient. In this study, we propose the novel feature of the eigenbrain, which had been used for SPECT images but never for MR images.

# Classification Model in MRI

There are numerous classification models, but only a few are suitable for MR images. Chaplot et al. (2006) employed the self-organizing map (SOM) neural network and SVM. Maitra and Chatterjee (2006) used a common artificial neural network (ANN). El-Dahshan et al. (2010) used ANN and K-nearest neighbor (KNN) classifiers. Plant et al. (2010) used SVM, Bayes statistics, and voting feature intervals (VFI) to derive a quantitative index of pattern matching. Zhang et al. (2011) suggested using an ANN whose weights were trained by the scaled-conjugate-gradient method. Ramasamy and Anandhakumar (2011) proposed the expectation-maximization Gaussian mixture model (EM-GMM) algorithm. Saritha et al. (2013) used the probabilistic neural network (PNN). Zhang et al. (2013) constructed a kernel SVM with an RBF kernel, using particle swarm optimization (PSO) to optimize the parameters C and sigma. Savio and Grana (2013) chose SVM, and used grid search to tune its parameters. Das et al. (2013) used least-squares SVM, and their 5 × 5 CV showed high classification accuracy. Kalbkhani et al. (2013) tested KNN and SVM models. Zhang et al. (2014) proposed combining KSVM with a decision tree, a method dubbed KSVM-DT. El-Dahshan et al. (2014) used a feed-forward back-propagation neural network (FFBPNN). Zhou et al. (2015) used a naive Bayes classifier (NBC) as the classification method. Zhang et al. (2015a) used a generalized eigenvalue proximal SVM (GEPSVM) with an RBF kernel. Yang et al. (2015) used SVM as the classifier, employing biogeography-based optimization (BBO) to train it.

After reviewing the latest literature related to classifiers, we found that SVMs have significant advantages in accuracy, elegant mathematical tractability, and direct geometric interpretation compared with other classification methods (Collins and Pape, 2011). In addition, SVMs do not need a large number of training samples to avoid overfitting (Li et al., 2010). The kernel technique further enhances the performance of SVM. Therefore, KSVM was harnessed in this study.

# The Proposed Method

# Preprocessing on Volumetric Data

For each individual, all available 3 or 4 volumetric 3D MR brain images were motion-corrected and coregistered to form an averaged 3D image. Then, those 3D images were spatially normalized to the Talairach coordinate space and brain-masked. The clinical dementia rating (CDR) was interpreted as the target (label). It is a numeric scale quantifying the severity of dementia symptoms (Williams et al., 2013). The patient's cognitive and functional performance was assessed in six areas: memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care. In this study, we chose two types of CDR: subjects with a CDR of 0 were considered NC, and subjects with a CDR of 1 were considered AD (Marcus et al., 2007).

<sup>1</sup> Some abbreviations are modified to avoid conflict within this paper.

Calculating eigenbrains on the entire brain was difficult. Instead, we proposed a simplified method that selects several key slices capturing the structures that distinguish AD from NC. The procedure was as follows: we defined the ICV v as

$$\nu\left(k\right) = \left\| \mu_{\mathrm{AD}}\left(\text{Slice}=k\right) - \mu_{\mathrm{NC}}\left(\text{Slice}=k\right) \right\|^2 \tag{1}$$

where k was the index of the key slice, µAD and µNC represented the means of the gray-level values of the kth slice over the AD and NC subjects, respectively, and ||·||<sup>2</sup> denoted the squared l2-norm. Then, we selected the key slices with ICV larger than 50% of the maximum ICV, with a 10× undersampling factor (i.e., keeping every 10th slice).
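The key-slice selection of equation (1) might be sketched as follows (the array shapes and the helper name are hypothetical; the 50% threshold and 10× undersampling follow the text):

```python
import numpy as np

def select_key_slices(ad_scans, nc_scans, step=10):
    """Hypothetical sketch of Eq. (1): inter-class variance per slice.
    ad_scans, nc_scans: arrays of shape (subjects, slices, H, W)."""
    mu_ad = ad_scans.mean(axis=0)          # mean brain of the AD group, per slice
    mu_nc = nc_scans.mean(axis=0)          # mean brain of the NC group, per slice
    # squared l2-norm of the group difference for each slice k
    v = np.sum((mu_ad - mu_nc) ** 2, axis=(1, 2))
    candidates = np.where(v > 0.5 * v.max())[0]   # ICV > 50% of maximum
    return candidates[::step]                      # 10x undersampling
```

On synthetic data in which only one slice differs between the groups, the function returns exactly that slice index.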

In addition, the slice direction can be chosen as axial, sagittal, or coronal. Usually the coronal direction gives a clearer view than the other two. **Figure 1** shows that the coronal slice has the advantage of covering, within one slice, three of the most important tissues seen as indicative of AD: the cerebral cortex, the ventricles, and the hippocampus. If we used an axial or sagittal slice, we might need two or more slices to cover those tissues. Therefore, we chose the coronal direction for key-slice selection, with the aim of recording only one slice.

# Eigenbrain

AD brains have different physical structures from NC brains. **Figure 1** indicates that AD subjects have severe atrophy of the cerebral cortex (region i), severely enlarged ventricles (region ii), and extreme shrinkage of the hippocampus (region iii). The eigenbrain therefore tries to capture these characteristic differences in anatomical structure between AD and NC.

The labeled three regions are (i) cerebral cortex (ii) ventricle, and (iii) hippocampus.

Eigenbrain computation is carried out by PCA, a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components (PCs). For 2D images, the PCs extend naturally to 2D eigenbrains.

Suppose **X** is a given data matrix of size N × A, where N represents the number of samples and A the number of attributes (for a 256 × 256 image, we vectorize it to a 1 × 65536 vector, hence A = 65536). First, we normalized the data matrix **X** so that each sample in the normalized matrix **Z** was mean-centered and unit-variance scaled, by subtracting its mean value and dividing the difference by its standard deviation.

$$\mathbf{Z} \leftarrow \frac{\mathbf{X} - \mu\left(\mathbf{X}\right)}{\sigma\left(\mathbf{X}\right)}\tag{2}$$

Next, we estimated the covariance matrix **C** with size of A × A by

$$\mathbf{C} \leftarrow \frac{1}{N-1} \mathbf{Z}^T \mathbf{Z} \tag{3}$$

Here we used N − 1 instead of N in order to produce an unbiased estimator of the variance (See Bessel's correction (Russell and Cohn, 2012) for details).

Third, we perform the eigendecomposition of **C**:

$$\mathbf{C} = \mathbf{U} \mathbf{\Lambda} \mathbf{U}^{-1} \tag{4}$$

where **U** is an A × (N − 1) matrix whose columns are the eigenvectors of the covariance matrix **C**, and **Λ** is an (N − 1) × (N − 1) diagonal matrix whose diagonal elements are the eigenvalues of **C**, each corresponding to an eigenvector of **C**. It is common to sort the eigenvalue matrix **Λ** and the eigenvector matrix **U** in order of decreasing eigenvalue λ1 > λ2 > . . . > λN. To view the ith eigenbrain u(i), the ith column of **U** was reshaped to an image. Suppose the ith column of **U** contains 65536 elements; then the reshaped image is 256 × 256.

$$
u\left(i\right) = \text{reshape}\left(\mathbf{U}\left(:, i\right)\right)\tag{5}
$$

Note that in our situation (N ∼ 10<sup>2</sup> and A ∼ 10<sup>4</sup>, where ∼ denotes the order of magnitude), the computational burden of the eigendecomposition in equation (4) is enormous. It can be reduced by replacing **C** in equation (3) with **C**′, since N << A.

$$\mathbf{C}' \leftarrow \frac{1}{N-1} \mathbf{Z} \mathbf{Z}^T \tag{6}$$

The size of **C**′ is N × N, which significantly reduces the computational burden. Using Matlab, the eigenbrains can be computed with the built-in "pca" command, which handles these issues internally. The flowchart for calculating eigenbrains is shown in **Figure 2**.
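The computational trick of equation (6) — diagonalizing the small N × N matrix and mapping its eigenvectors back to the A-dimensional brain space — can be sketched as follows (an illustrative NumPy version; the function name is ours, not the authors'):

```python
import numpy as np

def eigenbrains_snapshot(Z):
    """Eigendecompose the small N x N matrix C' = Z Z^T / (N-1) (Eq. 6),
    then map its eigenvectors back to the A-dimensional space."""
    n = Z.shape[0]
    c_small = Z @ Z.T / (n - 1)                 # N x N, cheap to diagonalize
    vals, vecs = np.linalg.eigh(c_small)
    order = np.argsort(vals)[::-1]              # sort by decreasing eigenvalue
    vals, vecs = vals[order], vecs[:, order]
    U = Z.T @ vecs                              # columns become eigenvectors of Z^T Z/(N-1)
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    return vals, U                              # each column of U is one eigenbrain
```

The identity behind the trick: if C′v = λv, then C(Zᵀv) = λ(Zᵀv) for the large covariance C = ZᵀZ/(N − 1), so the nonzero eigenvalues coincide and only an N × N problem ever needs solving.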

The eigenvalues represent the distribution of the energy of the source data among the eigenbrains, which form a basis for the original data.

To further select the eigenbrain that is the most statistically significant, we employed a two-sample location test. Saritha et al. (2013) selected Student's t-test, which assumes that the variances of the two samples are equal. This equal-variance assumption is not necessary and can be dropped; what matters is testing whether the two populations have equal means. Therefore, we used WTT, an adaptation of Student's t-test that tests for equal means without assuming equal variances.

The null hypothesis is that the eigenvalues of AD and NC have equal means, without assuming equal variances; the alternative hypothesis is that the means are unequal. WTT was carried out at the 95% confidence level. The eigenvalues of the selected most important eigenbrain (MIE) were used as input features for the subsequent classification.
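A minimal sketch of the Welch statistic used here (illustrative only; in practice a library routine such as scipy.stats.ttest_ind with equal_var=False supplies the p-value as well):

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.
    Compares means without assuming equal variances."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    va = a.var(ddof=1) / len(a)               # squared standard error, sample a
    vb = b.var(ddof=1) / len(b)               # squared standard error, sample b
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

For two equal-variance samples the Welch degrees of freedom reduce to the pooled value n1 + n2 − 2, which makes a quick hand check possible.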

# Region Detection

We proposed a visual interpretation method for eigenbrains to detect the regions that distinguish AD from NC, which was not reported by Alvarez et al. (2009a) or Lopez et al. (2009). The four-stage interpretation process is listed in **Table 1**.

# Classifier

An SVM was used as the classifier, and sequential minimal optimization (SMO) was chosen to train it for its simplicity and speed (Zhang and Wu, 2012b). Traditional linear SVMs cannot separate intricately distributed data. To generalize SVMs to nonlinear decision boundaries, the kernel trick is applied. KSVMs allow us to fit the maximum-margin hyperplane in a transformed feature space (Liu et al., 2014). The transformation may be nonlinear and the transformed space higher dimensional; though the classifier is a hyperplane in the higher-dimensional feature space, it may be nonlinear in the original input space.

#### TABLE 1 | Four-stage region detection method.

#### Region detection

Step 1 We selected the most important eigenbrain (MIE).

Step 2 We performed an absolute-value operation on the MIE, since the MIE matrix contains both positive and negative elements.

Step 3 We highlighted voxels with values higher than the 0.98 quantile (i.e., the 98th percentile).

Step 4 We output the anatomical labels of the selected voxels using the Talairach Daemon software, whose output contains five levels: hemisphere, lobe, gyrus, tissue, and cell.
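The thresholding in Steps 2–3 can be sketched as follows (an illustrative NumPy version; `discriminant_voxels` is a hypothetical helper name):

```python
import numpy as np

def discriminant_voxels(mie, q=0.98):
    """Steps 2-3 of the region-detection procedure: take absolute values
    of the most important eigenbrain (MIE) and keep voxels above the
    q-th quantile.  Returns a boolean mask of highlighted voxels."""
    mag = np.abs(mie)                  # Step 2: magnitudes of coefficients
    thr = np.quantile(mag, q)          # Step 3: 0.98 quantile threshold
    return mag > thr
```

The resulting mask would then be passed to an anatomical-labeling tool (Step 4).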

#### TABLE 2 | Assessment of classification performance.


#### TABLE 3 | Pseudocode of proposed method.

Step 1 Input 3D MRI data and corresponding CDR labels.

Step 2 Select key slices by ICV larger than 50% of maximum, with 10× undersampling factor.

Step 3 Generate eigenbrain set for each key slice.

Step 4 Select the MIE by WTT with 95% confidence interval.

Step 5 (Output 1): Submit eigenvalues of MIE to the classifier, and report its performance based on 50 × 10 CV.

Step 6 (Output 2): Report the discriminant regions by the absolute coefficient values higher than 0.98 quantile.

The radial basis function (RBF) kernel is one of the most widely used kernels, with the form given in Zhang and Wu (2012b):

$$\kappa \left( \mathbf{x}\_m, \mathbf{x}\_n \right) = \exp \left( -\frac{\|\mathbf{x}\_m - \mathbf{x}\_n\|^2}{2\sigma^2} \right) \tag{7}$$

where κ is the kernel function, σ the scaling factor, and **x**<sub>m</sub> and **x**<sub>n</sub> are vectors in the input space.

Another commonly used kernel is polynomial (POL) kernel defined as

$$\kappa \left( \mathbf{x}\_m, \mathbf{x}\_n \right) = \left( \mathbf{x}\_m^T \mathbf{x}\_n + c \right)^d \tag{8}$$

where d is the degree of polynomial, and c a soft margin constant trading off the influence of higher-order vs. lower-order terms in the polynomial.
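Both kernels are straightforward to implement; the following NumPy sketch mirrors Equations (7) and (8) (function names and default parameters are our assumptions):

```python
import numpy as np

def rbf_kernel(xm, xn, sigma=1.0):
    """RBF kernel, Eq. (7): exp(-||xm - xn||^2 / (2 * sigma^2))."""
    d = np.asarray(xm, float) - np.asarray(xn, float)
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def poly_kernel(xm, xn, c=1.0, d=2):
    """Polynomial kernel, Eq. (8): (xm^T xn + c)^d."""
    return (np.dot(xm, xn) + c) ** d
```

An SMO-trained KSVM evaluates one of these functions for each pair of support vector and input sample.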

Based on the two kernels, we tested RBF-KSVM and POL-KSVM for our models. To obtain the best kernel parameters (the scaling factor σ of RBF, or the degree d and soft margin constant c of POL), PSO was employed, since it has been used successfully to tune KSVM parameters in various problems (Aich and Banerjee, 2014; Khazaee and Zadeh, 2014; Xue et al., 2014).

#### TABLE 4 | Subject demographics status.


#### TABLE 5 | Preprocessing of a specified subject.

K-fold CV was employed with K = 10, the best compromise between computational cost and reliable estimates: the dataset is randomly divided into 10 mutually exclusive subsets of approximately equal size, of which 9 are used for training and the remaining one for validation. This procedure is repeated 10 times, so that each subset is used once for validation, and the K results from the K folds are combined to yield a single estimate over the whole dataset.

The K-fold CV was repeated 50 times, i.e., we carried out a 50 × 10-fold CV. For each run, we used four measures, accuracy, sensitivity, specificity, and precision (**Table 2**), to assess performance. Here TP, FP, TN, and FN represent the number of true positive, false positive, true negative, and false negative instances, respectively. Following common convention, a correctly identified AD case counts as a true positive. Summarizing the 50 repetitions, we report the mean and standard deviation (SD) of the four measures.
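The four measures in **Table 2** follow directly from the confusion-matrix counts; a minimal sketch, assuming AD is the positive class (the function name is ours):

```python
def performance(tp, fp, tn, fn):
    """Classification measures of Table 2, with AD as the positive class."""
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "precision":   tp / (tp + fp),
    }
```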

# Implementation

The purpose of the proposed method is two-fold: (i) to find discriminant voxels that distinguish AD from NC; and (ii) to develop a CAD system and report its performance. The pseudocode is listed in **Table 3**.

# Experiments and Results

The programs were developed in-house using Matlab 2014a and run on an IBM laptop with a 3 GHz Intel i3 dual-core processor and 8 GB RAM. Readers can reproduce our results on any machine where Matlab is available.

#### TABLE 6 | Difference between NC and AD on key-slices.


# Data Source

We downloaded the dataset from the Open Access Series of Imaging Studies (OASIS) (Ardekani et al., 2013, 2014). We chose the cross-sectional dataset, corresponding to MRI scans of individuals at a single time point (Bin Tufail et al., 2012). The OASIS dataset consists of 416 right-handed subjects aged 18–96. We excluded subjects under 60 years old and those with missing records, and then picked 126 subjects (98 NC and 28 AD) from the remainder. The demographics of the included subjects are summarized in **Table 4**. Here SES, CDR, and MMSE represent socioeconomic status, clinical dementia rating, and mini-mental state examination, respectively.

# Preprocessing

**Table 5** shows an example of combining 3 individual scans of a subject. The resolution is 1 × 1 × 1.25 mm. Preprocessing performed motion correction on the 3D MR images, registered them to form a combined image in the native acquisition space, and resampled this image to 1 × 1 × 1 mm. Afterwards, the combined image was spatially normalized to Talairach coordinate space and brain-extracted (**Table 5**).

# Key-slice Selection by ICV

The curve of ICV against slice index is shown in **Figure 3A**. We selected 10 coronal slices (60, 70, 80, 90, 100, 110, 120, 130, 140, and 150), whose corresponding ICVs were all higher than 50% of the maximum. **Figures 3B,C** show the axial and sagittal views of the 10 key-slices. **Table 6** shows the comparison between NC and AD on the selected 10 key-slices.
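The selection rule can be sketched as follows (an assumption-laden illustration; the paper's actual slice indices come from its own ICV curve, and the function name is ours):

```python
def key_slices(icv, frac=0.5, step=10):
    """Key-slice rule: every `step`-th coronal slice whose intracranial
    volume (ICV) exceeds `frac` of the maximum ICV over all slices."""
    thr = frac * max(icv)
    return [i for i in range(0, len(icv), step) if icv[i] > thr]
```

With a 10× undersampling factor this keeps roughly one tenth of the slices passing the 50%-of-maximum ICV test.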

# Eigenbrains

**Table 7** shows the eigenbrain results obtained by running PCA on the slices of all subjects. For each slice, we had a set of 125 eigenbrains in total; due to the page limit, we list only the first 6. The eigenbrains are sorted in order of decreasing eigenvalue, so, in general, eigenbrains in earlier columns are more important than those in later columns.

# Most Important Eigenbrain

WTT was conducted to give quantified proof of why the first eigenbrain was the MIE. We performed WTT on the eigenvalues of the first six eigenbrains of all key-slices, comparing the AD and NC groups. The results were


#### TABLE 7 | Eigenbrain results.

Frontiers in Computational Neuroscience | www.frontiersin.org June 2015 | Volume 9 | Article 66 |


shown in **Table 8**, with p-values less than 0.05 marked in bold. Only the first eigenvalues of all slices were below 0.05; therefore, the first eigenbrain was indeed the MIE, and we fed the eigenvalues of the MIE of all 10 key-slices (namely, 10 × 1 = 10 features per subject) to the classifier.

# Classification Comparison

The two classes, in order, were AD and NC, following common convention. Here we designed three tasks: the first did not use the kernel technique, i.e., the basic linear SVM; the second used RBF-KSVM; and the third used POL-KSVM. The kernel parameters and error penalty were optimized by the PSO method. The classification results are listed in **Table 9**, along with the results of state-of-the-art methods.

# Region Detection

We carried out the region detection procedure on the MIE as described in Section Region Detection. **Table 10** shows the result, in which the green points represent the discriminant voxels.

Here we reported the discriminative regions interpreted by eigenbrain in **Table 11**, where BA represented Brodmann area.

# Discussion

It is clearly observed in **Table 6** that the selected coronal slices are significant in detecting AD from NC. In particular, the AD subjects show cerebrospinal fluid (CSF) in areas occupied by brain matter in the NC subjects. We conclude that the 10× factor is reasonable for the following three reasons: (1) the 10× key-slice undersampling (i.e., selecting one slice out of every 10 consecutive slices) yields a coarser brain while still capturing most tissues (compare **Table 6** with **Figure 1**); (2) it is very hard to define a fitness (optimization) function for finding the optimal undersampling factor; and (3) the classification system has good accuracy in distinguishing AD from NC and detects the correct AD-related brain regions (see **Tables 9**, **11**). Since neighboring coronal slices are spatially redundant, the undersampling removes much of this redundancy.

Overall, the eigenbrains in **Table 7** capture both similarities and differences in structural features between AD and NC. The first eigenbrain captures the features that distinguish AD from NC, while the second and subsequent eigenbrains capture general brain structure. Revisiting the hippocampal part of the first eigenbrain of all key-slices, it is easily perceived that the body of the lateral ventricles is highlighted in AD, which is indeed a distinct attribute between AD and NC. Our experiment extends the eigenbrain approach from the SPECT images of Alvarez et al. (2009a) and Lopez et al. (2009) and shows that the eigenbrain is also suitable for MRI scans.

The p-values in **Table 8** show that the first eigenvalue λ<sub>1</sub> is less than 0.05 for all key-slices, indicating that the mean values of λ<sub>1</sub> for AD and NC are significantly different. Hence, the most dominating eigenvalue characterizing AD vs. NC is the one corresponding to the first eigenbrain. For the other eigenvalues, merely 1 of 10 p-values is less than 0.05, which indicates that those eigenbrains are not dominating features for separating AD from NC. Therefore, the first eigenbrain is the MIE and was selected.

Classification results in **Table 9** compare the three proposed classifiers with state-of-the-art methods. Zhang's results (Table 7 in Zhang et al., 2014) were calculated through a single K-fold CV experiment; Plant's results (Task 1 in Table 3 of Plant et al., 2010) give the means with 95% confidence intervals; Savio's results (Table 5 of Savio and Grana, 2013) give the means with SD. For the proposed methods, it is **unexpected** that POL-KSVM produces better classification accuracy (92.36 ± 0.94) than linear SVM (91.47 ± 1.02) and RBF-KSVM (86.71 ± 1.93), since RBF is reported as the most widely used kernel. Our results are better than or comparable to other approaches to predicting AD from MR brain images, e.g., US + SVD-PCA + SVM-DT at 90% (Zhang et al., 2014), BRC + IG + SVM at 90% (Plant et al., 2010), BRC + IG + Bayes at 92% (Plant et al., 2010), MGM + PEC + SVM at 92.07% (Savio and Grana, 2013), GEODAN + BD + SVM at 92.09% (Savio and Grana, 2013), and TJM + WTT + SVM at 92.83% (Savio and Grana, 2013). Many other methods (Gray et al., 2012; Arbizu et al., 2013; Chaves et al., 2013; Dukart et al., 2013; Cohen and Klunk, 2014) have been proposed for detecting AD from NC; however, they used images from other modalities (such as SPECT and PET), so it is not appropriate to compare the proposed methods with them directly. We will test our methods on SPECT and PET images in the future.

#### TABLE 8 | WTT of the first six eigenvalues of 10 key-slices.

P-values less than 0.05 are in bold.

#### TABLE 9 | Comparison of classification results.

**Table 11** shows that the eigenbrains interpret discriminant voxels involving the following regions reported in the existing literature: Anterior Cingulate (BA-24, BA-32) (Schultz et al., 2014), Caudate Nucleus (head, body, and tail) (Möller et al., 2015), Cerebellum (Colloby et al., 2014), Cingulate Gyrus (BA-23, BA-24, BA-31) (Yu et al., 2014), Claustrum (De Reuck et al., 2014), Inferior Frontal Gyrus (BA-47) (Eliasova et al., 2014), Inferior Parietal Lobule (BA-40) (Wang et al., 2015), Insula (BA-13) (He et al., 2015), Lateral Ventricle (Voevodskaya et al., 2014), Lentiform Nucleus (Möller et al., 2015), Lingual Gyrus (Lehmann et al., 2013), Medial Frontal Gyrus (BA-10, BA-11, BA-25, BA-6) (Kang et al., 2013), Middle Frontal Gyrus (BA-11) (Schultz et al., 2014), Middle Occipital Gyrus (Lehmann et al., 2013), Middle Temporal Gyrus (Aubry et al., 2015), Paracentral Lobule (BA-3, BA-4, BA-5, BA-6, BA-7) (Kang et al., 2013), Parahippocampal Gyrus (Amygdala, BA-28, BA-35, Hippocampus) (Eskildsen et al., 2015), Postcentral Gyrus (BA-5) (Kang et al., 2013), Posterior Cingulate (Shinohara et al., 2014), Precentral Gyrus (BA-4) (Kang et al., 2013), Precuneus (BA-7, BA-31) (Kang et al., 2013), Subcallosal Gyrus (BA-25, BA-34, BA-47) (Paakki et al., 2010), Sub-Gyral (BA-40, Corpus Callosum, Hippocampus) (Streitburger et al., 2012), Superior Frontal Gyrus (Chen et al., 2014), Superior Parietal Lobule (Quiroz et al., 2013), Superior Temporal Gyrus (BA-38) (Paakki et al., 2010), Supramarginal Gyrus (Quiroz et al., 2013), Thalamus (Medial Geniculum Body, Pulvinar, Ventral Lateral Nucleus) (He et al., 2015), Transverse Temporal Gyrus (BA-41) (Kim et al., 2012), and Uncus (BA-28) (Bangen et al., 2014).

#### TABLE 10 | Discriminant voxels.

#### TABLE 11 | Regions found by Eigenbrain.

Nevertheless, some regions reported to be associated with AD, such as the subthalamic nucleus (De Reuck et al., 2014), are not interpreted by the eigenbrain. The reason may lie in three aspects. First, the quantile used by our method is set to 0.98, which is high; reducing it may include more regions. Second, some studies used other advanced imaging modalities, such as MRSI and fMRI, for metabolism detection and functional analysis. Third, the key-slice selection procedure may miss important regions.

From another point of view, **Table 11** demonstrates the power of the eigenbrain. Our study uses only one feature (the eigenbrain) on 10 key-slices of a plain 3D structural MR image; nevertheless, our findings cover 30 related regions reported in over twenty publications that used various feature extraction methods and advanced imaging technologies.

The **contributions** of this paper fall within the following five aspects: (i) we generalize the eigenbrain to MR images and prove its effectiveness; (ii) we propose a hybrid eigenbrain-based CAD system that can not only detect AD from NC, but also detect brain regions related to AD; (iii) we show that the proposed method has classification accuracy comparable to state-of-the-art methods, and that the detected brain regions are in line with 16 existing publications; (iv) we use ICV and WTT to reduce redundant data; and (v) we find that the POL kernel performs better than the linear and RBF kernels for this study.

In conclusion, the advantages of the eigenbrain are three-fold: (i) it reaches very high classification accuracy, better than or competitive with state-of-the-art methods (Plant et al., 2010; Savio and Grana, 2013; Zhang et al., 2014); (ii) it can directly find discriminant voxels/regions within the whole brain; (iii) it can be combined with other features to increase classification performance. On the other hand, the eigenbrain also has disadvantages: (i) it is essentially two-dimensional, so it does not reduce redundancy along the slice direction; (ii) it requires spatial registration as preprocessing, which costs a large amount of computational resources.

For policy-makers, this study suggests that the eigenbrain technique can achieve results comparable to traditional methods, and that the combination of eigenbrain and machine learning may offer a promising route to AD diagnosis. This preclinical study suggests that hospitals and medical laboratories enroll more computer scientists and engineers, with the aim of developing efficient AD diagnosis and region detection systems.

# Conclusion and Future Research

We presented an automated and accurate classification method based on eigenbrains and machine learning, in order to detect AD subjects and AD-related brain regions in 3D MR images. The results showed that the proposed POL-KSVM method achieved 92.36% accuracy, which was competitive with state-of-the-art methods.

In the future, we will focus our research on the following aspects: (i) we shall generalize the eigenbrain to three dimensions, so that the key-slice selection procedure can be removed; (ii) we shall test other kernels for SVM and try to replace KSVM with other advanced pattern recognition tools; (iii) the eigenbrain can be combined with DWT-based and other features, with an expected increase in classification accuracy.

# Acknowledgments

This work was supported by NSFC (610011024, 61273243, 51407095), Program of Natural Science Research of Jiangsu Higher Education Institutions (13KJB460011, 14KJB520021), Jiangsu Key Laboratory of 3D Printing Equipment and Manufacturing (BM2013006), Key Supporting Science and Technology Program (Industry) of Jiangsu Province (BE2012201, BE2014009-3, BE2013012-2), Special Funds for Scientific and Technological Achievement Transformation Project in Jiangsu Province (BA2013058), and Nanjing Normal University Research Foundation for Talented Scholars (2013119XGQ0061, 2014119XGQ0080). The authors express their gratitude to the OASIS dataset, which was supported by NIH grants P50 AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, and R01 MH56584.

# References

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom. 2015.00066/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Zhang, Dong, Phillips, Wang, Ji, Yang and Yuan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# On the distinguishability of HRF models in fMRI

#### Paulo N. Rosa<sup>1</sup> \*, Patricia Figueiredo<sup>2</sup> and Carlos J. Silvestre<sup>3</sup>

<sup>1</sup> Flight Systems Business Unit, Aerospace, Defense & Systems Department, Deimos Engenharia, Lda., Lisboa, Portugal, <sup>2</sup> Institute for Systems and Robotics and Department of Bioengineering, Instituto Superior Técnico, Universidade de Lisboa, Portugal, <sup>3</sup> Department of Electrical and Computer Engineering, Faculty of Science and Technology, University of Macau, Taipa, Macau, China

Modeling the Hemodynamic Response Function (HRF) is a critical step in fMRI studies of brain activity, and it is often desirable to estimate HRF parameters with physiological interpretability. A biophysically informed model of the HRF can be described by a non-linear time-invariant dynamic system. However, the identification of this dynamic system may leave much uncertainty on the exact values of the parameters. Moreover, the high noise levels in the data may hinder the model estimation task. In this context, the estimation of the HRF may be seen as a problem of model falsification or invalidation, where we are interested in distinguishing among a set of eligible models of dynamic systems. Here, we propose a systematic tool to determine the distinguishability among a set of physiologically plausible HRF models. The concept of absolutely input-distinguishable systems is introduced and applied to a biophysically informed HRF model, by exploiting the structure of the underlying non-linear dynamic system. A strategy to model uncertainty in the input time-delay and magnitude is developed and its impact on the distinguishability of two physiologically plausible HRF models is assessed, in terms of the maximum noise amplitude above which it is not possible to guarantee the falsification of one model in relation to another. Finally, a methodology is proposed for the choice of the input sequence, or experimental paradigm, that maximizes the distinguishability of the HRF models under investigation. The proposed approach may be used to evaluate the performance of HRF model estimation techniques from fMRI data.

### Edited by:

Tobias Alecio Mattei, Brain & Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA

#### Reviewed by:

Nelson Jesús Trujillo-Barreto, University of Manchester, UK Daniele Marinazzo, University of Ghent, Belgium

#### \*Correspondence:

Paulo N. Rosa, Flight Systems Business Unit, Aerospace, Defense & Systems Department, Deimos Engenharia, Av. D. Joao II, Lt 1.17.01, 10th floor, 1998-023 Lisboa, Portugal paulo.rosa@deimos.com.pt

> Received: 22 July 2014 Accepted: 24 April 2015 Published: 19 May 2015

#### Citation:

Rosa PN, Figueiredo P and Silvestre CJ (2015) On the distinguishability of HRF models in fMRI. Front. Comput. Neurosci. 9:54. doi: 10.3389/fncom.2015.00054

Keywords: HRF, fMRI, BOLD fMRI, distinguishability, model selection, experimental paradigm

# Introduction

The hemodynamic response function (HRF) describes the local changes in cerebral blood flow, volume, and oxygenation associated with neuronal activity, and it is extensively used to model Blood Oxygen Level Dependent (BOLD) signals measured using functional Magnetic Resonance Imaging (fMRI) (Logothetis and Wandell, 2004). In general, fMRI experiments are used to map networks of brain activity that are associated with a specific stimulus or task, or that are functionally correlated during rest. Mapping of stimulus/task-related BOLD changes is most frequently achieved by fitting a general linear model (GLM) to the data, consisting on the stimulus/task time course convolved with a pre-specified HRF model (Friston et al., 1994), assuming a linear time invariant system (Boynton et al., 1996). Although the exact mechanisms underlying the HRF are not yet completely known, the consistency of its observed shape allowed for canonical (parameterized) HRF models to be derived (Friston et al., 1998). In particular, double-gamma HRF models are commonly employed in fMRI analysis. Nevertheless, extensive HRF variability has been reported across brain regions (Handwerker et al., 2004), scanning sessions (Aguirre et al., 1998), tasks (Cohen and Ugurbil, 2002), physiological modulations (Liu et al., 2004), subjects (Handwerker et al., 2004), and populations (D'Esposito et al., 2003), which may hinder or confound the measurement of BOLD changes associated with brain activity, limiting the interpretability of fMRI studies.

Common approaches attempting to take into account HRF variability allow for greater flexibility in the HRF shape and dynamics by describing it through a set of basis functions in a GLM framework. They include using the partial derivatives with respect to time and dispersion of a canonical HRF (Friston et al., 1998), finite impulse response (FIR) basis sets (Glover, 1999), and specially designed basis functions (Woolrich et al., 2004). An approach that also takes into account the spatial localization of the HRF was very recently proposed in Vincent et al. (2014). While a small number of basis functions cannot accurately model the whole range of HRF shapes and delays, at the other extreme, deconvolution of the BOLD response is a very noisy process. Critically, these approaches do not provide a biophysical foundation for the HRF model, hence limiting the physiological interpretability of the associated parameters. Moreover, they do not explain empirically observed non-linearities in the BOLD responses (Birn et al., 2001).

Biophysically informed non-linear models of the HRF have been proposed, based on the combination of the Balloon model, describing the dynamic changes in deoxyhemoglobin content as a function of blood oxygenation and blood volume (Buxton et al., 1998), with a model of the blood flow dynamics during brain activation, where neuronal activity is approximated by the stimulus/task input scaled by a factor called neural efficiency (Friston et al., 2000). In the original work that proposed this model, the associated parameters were estimated by using a Volterra kernel expansion to characterize the system dynamics (Friston et al., 2000). Later, a Bayesian estimation framework was introduced, allowing for the use of a priori distributions of the parameter values and the production of the respective posterior probability distributions given the data by using Expectation-Maximization methods (Friston, 2002). This HRF model and respective estimation procedure have further been incorporated in Dynamic Causal Models (DCM) developed to study effective connectivity among networks of brain regions from fMRI data (Friston et al., 2003). More recently, the methods of dynamic expectation maximization, variational filtering, and generalized filtering have also been proposed for model inversion (estimation) in this context (Friston et al., 2008).

Several extensions of the Balloon model have since been considered (Buxton et al., 2004), as well as a metabolic/hemodynamic model that takes the metabolic dynamics into account in order to incorporate the separate roles played by excitatory and inhibitory neuronal activities in the generation of the BOLD signal (Sotero and Trujillo-Barreto, 2007). A few alternative approaches for the estimation of these HRF models and related extensions have also been proposed (Riera et al., 2004). In Riera et al. (2004), a fully stochastic model was presented in order to include physiological noise in the hemodynamic states, in addition to the measurement noise in the observations. A local linearization filter was used for estimating the hemodynamic states as well as the model parameters. In Sotero et al. (2009), a similar approach was used for estimating the metabolic/hemodynamic model proposed by the same group. In contrast to these linearization-based approaches, Johnston et al. (2008) used particle filters so as to truly accommodate the model non-linearities. More recently, Havlicek et al. (2011) proposed non-linear cubature Kalman filtering as a means to invert models of coupled dynamical systems, which furnishes posterior estimates of both the hidden states and the parameters of the system, including any unknown exogenous input.

In fMRI experiments, the system input is given by the stimulus/task time course, which is generally designed as a series of events alternating with baseline periods at specified interstimulus intervals (ISIs). A number of studies have addressed the problem of systematically assessing the quality of fMRI experimental designs, both in terms of the ability to detect stimulus/task-related BOLD activation (detection power) and the ability to estimate the HRF model (estimation efficiency) in a given amount of imaging time (Dale, 1999; Liu et al., 2001). Different methodologies have been proposed to determine the optimal design of fMRI experiments for maximal estimation efficiency (Buracas and Boynton, 2002; Wager and Nichols, 2003; Maus et al., 2012), and a few studies have compared different HRF models and the associated estimation efficiency, focusing on specific parameters of interest such as the response latency and duration (Lindquist and Wager, 2007; Lindquist et al., 2009). Importantly, the authors were concerned with the physiological plausibility of the estimated HRF parameters and with their independence, such that differences in one parameter are not confounded with differences in another parameter. However, these studies were based on parameterized HRF models with no direct biophysical groundings, which severely limited the desired physiological interpretability. To our knowledge, no study has so far investigated the effect of experimental design on the estimation of biophysically informed models of the HRF.

When the HRF model is expressed as a dynamic system, the identifiability of this system must be established in order to guarantee that the HRF models inferred from the input/output data are physiologically plausible. It has been shown that the sensitivity of the HRF system input/output behavior to the model parameters is in general small, which means that, when many parameters are estimated together, their values can be varied over a large range with only small changes in the system output (Deneux and Faugeras, 2006). In these cases, the problem of model estimation may be treated as a model falsification (or invalidation) problem, in which we are interested in distinguishing among a set of eligible dynamic systems (Silvestre et al., 2010a). The simplest model falsification problem one can think of is that of stating whether or not a given model is compatible with the current observed input/output data. However, it is important to notice that a model can never be validated in practice. Indeed, the model being compatible with the input/output data up to time t does not imply that it should be compatible at time t + δ where δ > 0. Therefore, one can only say that a given model is not falsified (or invalidated) by the current input/output data. On the other hand, a model is obviously invalidated or falsified once it is not compatible with the observations. Hence, we usually refer to model falsification rather than model validation, since the latter is not achievable in practice. The related problem of model (in)distinguishability arises in a wide range of decision architectures, especially in those that are used in noisy and/or uncertain environments, where more than a single eligible model is compatible with the observed input/output dataset. The distinguishability of two models is in general affected by the input signals, particularly by the uncertainty on the input time-delay and on its magnitude. 
In fact, model invalidation requires a kind of persistence-of-excitation condition on the exogenous inputs, so that the magnitude of the system output signal is large enough compared to the noise level of the data acquisition process; see Grewal and Glover (1976), Walter et al. (1984), and references therein.

In this paper, we extend the results in Silvestre et al. (2010b) by first introducing the concept of absolutely input-distinguishable systems and showing that, for systems with forced responses, the distinguishability between two models can be significantly affected by the shape and magnitude of the external input signals. Moreover, several types of uncertainty, such as unknown input time-delays and uncertain magnitudes of the input signal, can also be adverse to model invalidation. We then exploit the concept of absolutely input-distinguishable systems in order to optimize the estimation efficiency of fMRI experimental designs through the maximization of the distinguishability among a set of physiologically plausible HRF models. It is stressed that one of the main motivations for the work described herein is the development of a technique that helps define an optimal sequence of stimuli, so that the differences between the models in the set of plausible HRFs become apparent. Hence, the methodology proposed in this paper provides a first step toward so-called experimental paradigm design, while also shedding light on the intrinsic limitations of HRF parameter estimation based on fMRI.

# Methods

The Balloon Model proposed by Buxton et al. (1998), further analyzed and complemented with flow dynamics by Friston et al. (2000), consists of non-linear differential equations describing the dynamics of normalized values of the blood flow b<sub>f</sub>, with s being the vasodilatory, activity-dependent signal that increases the flow b<sub>f</sub>, the venous deoxyhemoglobin content q, and the venous blood volume v, which are considered equal to 1 at rest. This non-linear dynamic system can be described by

$$\begin{cases} \dot{s} = \varepsilon u - k_s s - k_f (b_f - 1) & \stackrel{\Delta}{=} F_1 \\ \dot{b}_f = s & \stackrel{\Delta}{=} F_2 \\ \dot{v} = \frac{1}{\tau}\left(b_f - v^{1/\alpha}\right) & \stackrel{\Delta}{=} F_3 \\ \dot{q} = \frac{1}{\tau}\left(b_f \frac{1 - (1 - E_o)^{1/b_f}}{E_o} - v^{\left(\frac{1}{\alpha}-1\right)} q\right) & \stackrel{\Delta}{=} F_4 \\ y = V_o\left[k_1(1-q) + k_2\left(1 - \frac{q}{v}\right) + k_3(1-v)\right] \end{cases} \tag{1}$$

where $x = [x_1, x_2, x_3, x_4]^T = [s, b_f, v, q]^T$, and the four state equations are compactly denoted by $\dot{x} = F(x, \theta, u)$, with $F = [F_1, F_2, F_3, F_4]^T$. Here, $E_o$ is the resting net oxygen extraction fraction by the capillary bed, $\varepsilon$ is the efficacy with which neuronal activity causes an increase in signal, $1/k_s$ and $1/k_f$ are time constants, $\tau$ is the mean transit time, and $\alpha$ is a stiffness exponent that specifies the flow-volume relationship of the venous balloon. The output of this model, $y(t)$, is the BOLD signal: a complex response controlled by different parameters, ranging from blood oxygenation to cerebral blood flow and cerebral blood volume, which reflects the regional increase in metabolism due to enhanced neural activity. In the output equation, $V_o$ is the resting blood volume fraction, and $k_1$, $k_2$, and $k_3$ are constants.

The response of the system described by Equation (1), with the parameters in **Table 1** and with initial state $x^T(0) = [0\;\;1\;\;1\;\;1]$, to a rectangular input signal is depicted in **Figure 1**, for different integration periods.
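As an illustration, the non-linear system in Equation (1) can be integrated numerically. The sketch below uses SciPy; since the contents of Table 1 are not reproduced here, the parameter values are assumptions, following commonly cited choices from Friston et al. (2000).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed parameter values (illustrative; Table 1 is not reproduced here).
eps, k_s, k_f = 0.5, 0.65, 0.41          # efficacy, signal decay, flow feedback
tau, alpha, E0, V0 = 0.98, 0.32, 0.34, 0.02
k1, k2, k3 = 7.0 * E0, 2.0, 2.0 * E0 - 0.2

def balloon_rhs(t, x, u):
    """Right-hand side of Equation (1); state x = [s, b_f, v, q]."""
    s, bf, v, q = x
    ds = eps * u(t) - k_s * s - k_f * (bf - 1.0)
    dbf = s
    dv = (bf - v ** (1.0 / alpha)) / tau
    dq = (bf * (1.0 - (1.0 - E0) ** (1.0 / bf)) / E0
          - v ** (1.0 / alpha - 1.0) * q) / tau
    return [ds, dbf, dv, dq]

def bold(x):
    """Output equation of (1): BOLD signal y from the state."""
    _, _, v, q = x
    return V0 * (k1 * (1.0 - q) + k2 * (1.0 - q / v) + k3 * (1.0 - v))

# Rectangular stimulus of 1 s duration starting at t = 1 s.
u = lambda t: 1.0 if 1.0 <= t < 2.0 else 0.0
sol = solve_ivp(balloon_rhs, (0.0, 30.0), [0.0, 1.0, 1.0, 1.0],
                args=(u,), max_step=0.05)
y = np.array([bold(sol.y[:, i]) for i in range(sol.y.shape[1])])
```

The BOLD output starts at zero (the resting state is an equilibrium), rises after the pulse, and relaxes back toward baseline.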

The linear approximation of the model leads to pronouncedly different responses when compared to the non-linear system. An alternative, as described in the sequel, is to consider a so-called bilinear model, which accurately mimics the non-linear behavior for sufficiently small integration periods.

### Linearization and Discretization of the Model

The model described by Equation (1) is highly non-linear and parameter-dependent, thus barely allowing any systematic analysis of the associated expected behavior. Hence, to make the problem tractable from a mathematical point of view, the (bi)linearization of the HRF is considered in this paper. This approach allows the use of a widespread framework for analysis, namely that of linear time-varying systems. **Figure 1** shows that a close match of the HRF can be obtained by using a bilinear approximation (linear in the state, if the input is fixed, and linear in the input, if the state is fixed). Therefore, in this subsection, a (bi)linearization is derived that approximates the non-linear model locally and that is able to describe the state of the system at a given time, $x(kT_s)$, as a function of the state several sampling periods before, $x((k-N)T_s)$.

In particular, linearizing Equation (1) around x(·) = x ∗ and u(·) = 0, i.e., writing the associated Taylor expansion and truncating it at the linear term, one obtains (omitting the time-dependence of the variables, for the sake of readability):

$$\begin{split} \dot{x} & \approx F(x^*, \theta, \mathbf{0}) + \left. \frac{\partial F(x, \theta, u)}{\partial x} \right|_{x^*, \theta, \mathbf{0}} (x - x^*) \\ & + \sum_i u_i \left( \left. \frac{\partial^2 F(x, \theta, u)}{\partial x\, \partial u_i} \right|_{x^*, \theta, \mathbf{0}} (x - x^*) + \left. \frac{\partial F(x, \theta, u)}{\partial u_i} \right|_{x^*, \theta, \mathbf{0}} \right), \end{split}$$

where

$$
\frac{\partial F}{\partial x} = \begin{bmatrix}
-k_s & -k_f & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & \frac{1}{\tau} & -\frac{v^{\left(\frac{1}{\alpha}-1\right)}}{\alpha\tau} & 0 \\
0 & \frac{\partial F_4}{\partial x_2} & \frac{\partial F_4}{\partial x_3} & -\frac{v^{\left(\frac{1}{\alpha}-1\right)}}{\tau}
\end{bmatrix},\tag{2}
$$

TABLE 1 | Parameters for the non-linear model described by Equation (1).

and with the gradient of the output equation given by

$$\frac{\partial y}{\partial x} = \begin{bmatrix} 0, & 0, & -k_3 V_o + k_2 V_o q v^{-2}, & -k_1 V_o - k_2 V_o v^{-1} \end{bmatrix}.$$

Moreover, given that $F_1$ depends linearly upon $u$, we have that $\frac{\partial^2 F}{\partial x\, \partial u_i} = 0$.

Using the transformation proposed in Friston et al. (2000), one finally obtains the following dynamics:

$$
\dot{\tilde{x}} = A\tilde{x} + \sum_i u_i \, E_i \tilde{x}, \tag{3}
$$

where $\tilde{x} = [1 \;\; x^T]^T$,

$$A \stackrel{\Delta}{=} \begin{bmatrix} 0 & \mathbf{0} \\ F(x^*, \theta, \mathbf{0}) - \frac{\partial F(x^*, \theta, \mathbf{0})}{\partial x} x^* & \frac{\partial F(x^*, \theta, \mathbf{0})}{\partial x} \end{bmatrix}, \qquad E_i \stackrel{\Delta}{=} \begin{bmatrix} 0 & \mathbf{0} \\ \frac{\partial F(x^*, \theta, \mathbf{0})}{\partial u_i} & \mathbf{0} \end{bmatrix},$$

and $\frac{\partial F(x^*, \theta, \mathbf{0})}{\partial u_i} = \begin{bmatrix} \varepsilon & 0 & 0 & 0 \end{bmatrix}^T$.
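A minimal numerical sketch of this construction: the Jacobian $\partial F/\partial x$ is approximated by central finite differences at the resting state, and the augmented matrices $A$ and $E_1$ of Equation (3) are assembled. The parameter values are the same illustrative assumptions as before. A quick consistency check is that $\tilde{x} = [1 \; x^{*T}]^T$ is an equilibrium of the bilinear dynamics for $u = 0$.

```python
import numpy as np

# Illustrative parameter values (assumed; Table 1 is not reproduced here).
eps, k_s, k_f = 0.5, 0.65, 0.41
tau, alpha, E0 = 0.98, 0.32, 0.34

def F(x, u):
    """State dynamics F(x, theta, u) of Equation (1); x = [s, b_f, v, q]."""
    s, bf, v, q = x
    return np.array([
        eps * u - k_s * s - k_f * (bf - 1.0),
        s,
        (bf - v ** (1.0 / alpha)) / tau,
        (bf * (1.0 - (1.0 - E0) ** (1.0 / bf)) / E0
         - v ** (1.0 / alpha - 1.0) * q) / tau,
    ])

x_star = np.array([0.0, 1.0, 1.0, 1.0])   # resting state

# Central finite-difference Jacobian dF/dx at (x*, u = 0).
h = 1e-6
J = np.column_stack([
    (F(x_star + h * e, 0.0) - F(x_star - h * e, 0.0)) / (2 * h)
    for e in np.eye(4)])

# Augmented matrices of Equation (3), acting on x~ = [1, x^T]^T.
A = np.zeros((5, 5))
A[1:, 0] = F(x_star, 0.0) - J @ x_star
A[1:, 1:] = J
E1 = np.zeros((5, 5))
E1[1:, 0] = F(x_star, 1.0) - F(x_star, 0.0)   # dF/du = [eps, 0, 0, 0]^T

x_tilde = np.r_[1.0, x_star]
```

Since $F(x^*, \theta, 0) = 0$ at rest, $A\tilde{x}$ evaluates to zero there, confirming the equilibrium.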

#### Uncertain Dynamic Model Description


It should be noticed that the dynamics in Equation (3) are bilinear in the state and input variables. This non-linear term hinders the distinguishability analysis proposed in Rosa and Silvestre (2011); thus, a more suitable description is derived herein.

For the sake of simplicity, we start by redefining $x(t) \stackrel{\Delta}{=} \tilde{x}(t)$ and $x^*(t) \stackrel{\Delta}{=} [1 \;\; x^*(t)^T]^T$. It was previously shown that the continuous-time dynamic model of the HRF, for a single input, can be approximated by

$$\begin{cases} \dot{x}(t) = \left( A(t) + B_o(t)u(t) + \Delta(t)B_1(t)u(t) \right) x(t), & x(0) = x^*(0), \\ y(t) = g(x(t)), \end{cases} \tag{4}$$

with $t \ge 0$, and where $\Delta: \mathbb{R}^+ \to \mathbb{R}$ was also included to represent an input uncertainty subject to $|\Delta(t)| \le 1$ for all $t \ge 0$, and where $B_o = E_1$. This input uncertainty can be seen as a surrogate for uncertainty in the stimulation signal. The initial state is denoted by $x(0) \in \mathbb{R}^n$, where $n$ is the number of states of the system. Moreover, we assume that

$$B\_1(t) = \eta B\_o(t),$$

with known $\eta \in \mathbb{R}$. We also define $B(t) = B_o(t) + \Delta(t)B_1(t)$.

To proceed with the derivation of a discrete-time description of the HRF model in Equation (4), for a given sampling period, $T_s$, the following assumptions are posed:

**Assumption 1**: The input signal, $u(\cdot)$, is constant during sampling periods, i.e., $u(t) = u(kT_s)$, for all $t \in [kT_s, (k+1)T_s[$.

**Assumption 2**: The input uncertainty, $\Delta(\cdot)$, is constant during sampling periods, i.e., $\Delta(t) = \Delta(kT_s)$, for all $t \in [kT_s, (k+1)T_s[$.

**Assumption 3**: The maps $A(\cdot)$, $B_o(\cdot)$, and $B_1(\cdot)$ are constant during sampling periods, i.e., $A(t) = A(kT_s)$, $B_o(t) = B_o(kT_s)$, and $B_1(t) = B_1(kT_s)$, for all $t \in [kT_s, (k+1)T_s[$.

Under these assumptions, the system in Equation (4) can be rewritten as

$$\begin{cases} \dot{x}(t) = \tilde{A}\left(k, \Delta(k)\right) x(t), & x(0) = x^*(0), \\ y(t) = g(x(t)), \end{cases} \tag{5}$$

for $t \in [kT_s, (k+1)T_s[$, and where

$$
\tilde{A}\left(k, \Delta(k)\right) = A\_o(k) + \Delta(kT\_s)A\_1(k),
$$

with

$$A_o(k) = A(kT_s) + B_o(kT_s)u(kT_s),$$

and

$$A_1(k) = B_1(kT_s)u(kT_s).$$

In the sequel, we will abbreviate $x(k) = x(kT_s)$, for the sake of simplicity. We are now in a position to state the following proposition:

**Proposition 1:** Define

$$I^\* = \begin{bmatrix} \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \mathbf{0} \\ \mathbf{0} & \cdots & \mathbf{0} & 1 \end{bmatrix},$$

and

$$\phi(k) = V(k)\Lambda^*(k)V^{-1}(k)e^{A(kT_s)T_s} - V(k)\Lambda^*(k)V^{-1}(k) - I^*,$$

where $V(k)\Lambda(k)V^{-1}(k) = A(kT_s)T_s$ is the spectral decomposition of $A(kT_s)T_s$, with $\Lambda(k)$ diagonal and $\Lambda_{11}(k) = 0$, and

$$
\Lambda^*_{ij}(k) = \begin{cases} \frac{1}{\Lambda_{ij}(k)}, & \text{if } i = j \text{ and } \Lambda_{ij}(k) \neq 0, \\ 0, & \text{otherwise.} \end{cases}
$$

Furthermore, let

$$\begin{aligned} G_o(k) &= e^{A(kT_s)T_s} + B_o(k)u(k) + \phi(k)B_o(k)u(k) \quad \text{and} \\ G_1(k) &= B_1(k)u(k) + \phi(k)B_1(k)u(k). \end{aligned}$$

Then, the system in Equation (5) is described by

$$\begin{cases} x(k+1) = G\left(k, \Delta(k)\right)x(k), & x(0) = x^*(0), \\ y(k) = h(x(k)), \end{cases} \tag{6}$$

where

$$G\left(k, \Delta(k)\right) = G\_o(k) + \Delta(k)G\_1(k),$$

with $x(k) = x(kT_s)$.

Proof: See Appendix A in Supplementary Material.

Notice that Equation (6), with $G(k, \Delta(k)) = G_o(k) + \Delta(k)G_1(k)$, associated with the linearization of the output map, $g$, is a full description of the HRF dynamics by means of a linear model with known matrices, $G_o(k)$ and $G_1(k)$, and an uncertain parameter, $\Delta(k)$. This description, however, is bilinear in the state, $x(k)$, and model uncertainty, $\Delta(k)$, so the affine dependence on $\Delta$ would seem to be lost once we describe the state $x(k+1)$ as a function of $x(k-1)$. Nevertheless, notice that

$$\left(G_o(k+1) + \Delta G_1(k+1)\right)\left(G_o(k) + \Delta G_1(k)\right) = G_o(k+1)G_o(k) + \Delta\left(G_1(k+1)G_o(k) + G_o(k+1)G_1(k)\right),$$

since $G_1(k+1)G_1(k) = 0$ and where, for the time being, we consider that $\Delta$ is constant (but unknown), i.e., $\Delta(k) = \Delta$ for all $k$. To see this, notice that

$$\begin{aligned} G_1(k+1)G_1(k) &= \left(B_1(k+1) + \phi(k+1)B_1(k+1)\right)\left(B_1(k) + \phi(k)B_1(k)\right) \\ &= \underbrace{B_1(k+1)B_1(k)}_{=0} + B_1(k+1)\phi(k)B_1(k) \\ &\quad + \phi(k+1)\underbrace{B_1(k+1)B_1(k)}_{=0} + \phi(k+1)B_1(k+1)\phi(k)B_1(k), \end{aligned}$$

and that B1(k + 1)φ(k)B1(k) = 0, due to the fact that the first row of φ is zero, and that all but the first column of B<sup>1</sup> are also zero.

By proceeding in a similar manner, we conclude that

$$\begin{aligned} \left(G\_o(k+m) + \Delta G\_1(k+m)\right) \cdots \left(G\_o(k) + \Delta G\_1(k)\right) \\ = \Psi\_o(k+m) + \Delta \Psi\_1(k+m), \end{aligned}$$

where

$$
\Psi\_o(k+m) = G\_o(k+m) \cdots G\_o(k),
$$

and

$$\begin{cases} \Psi_1(k) &=& G_1(k), \\ \Psi_1(k+m) &=& G_o(k+m)\Psi_1(k+m-1) + G_1(k+m)\Psi_o(k+m-1). \end{cases}$$

Hence, the state x(k + m + 1) can be written as

$$x(k+m+1) = \left(\Psi_o(k+m) + \Psi_1(k+m)\Delta\right)x(k). \tag{7}$$
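The vanishing-product argument above can be checked numerically. The sketch below generates random matrices with the structure stated in the text — $B_1$ non-zero only in its first column with a zero top entry (since $B_1 = \eta B_o$ and $B_o = E_1$), $\phi$ with a zero first row, and $G_o$ with first row $[1, 0, \dots, 0]$, as inherited from the augmented dynamics — and verifies that the $m$-step product is affine in $\Delta$, matching the $\Psi$ recursion. The dimensions and values are illustrative assumptions.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)
n, m, Delta = 5, 4, 0.3

def make_G1():
    # B_1: only the first column is non-zero, with a zero top entry.
    B1 = np.zeros((n, n))
    B1[1:, 0] = rng.standard_normal(n - 1)
    # phi: zero first row (see the definition of phi above).
    phi = rng.standard_normal((n, n))
    phi[0, :] = 0.0
    return (np.eye(n) + phi) @ B1

def make_Go():
    # G_o inherits the first row [1, 0, ..., 0] from e^{A T_s} (A has a
    # zero first row in the augmented coordinates, and so does B_o).
    M = rng.standard_normal((n, n))
    M[0, :] = 0.0
    M[0, 0] = 1.0
    return M

G_o = [make_Go() for _ in range(m)]
G_1 = [make_G1() for _ in range(m)]

# Psi recursion from the text.
Psi_o, Psi_1 = G_o[0], G_1[0]
for j in range(1, m):
    Psi_1 = G_o[j] @ Psi_1 + G_1[j] @ Psi_o
    Psi_o = G_o[j] @ Psi_o

# Direct product of the uncertain factors: affine in Delta, as claimed,
# because every term with two or more G_1 factors vanishes structurally.
direct = reduce(lambda acc, j: (G_o[j] + Delta * G_1[j]) @ acc,
                range(1, m), G_o[0] + Delta * G_1[0])
```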

Furthermore, the non-linear output equation of (1) can be linearized as

$$y(x) = y(x^*) + \left. \frac{\partial y}{\partial x} \right|_{x^*} (x - x^*), \tag{8}$$

which, in turn, can alternatively be written as:

$$z = y(x) - y(x^*) + \left. \frac{\partial y}{\partial x} \right|_{x^*} x^* = \left. \frac{\partial y}{\partial x} \right|_{x^*} x, \tag{9}$$

where z(t) can be seen as the measurement for the linear time-varying system obtained by the linearization of Equation (1).

#### Absolutely Distinguishable Systems

The problem of indistinguishability typically arises from large amplitudes of the measurement noise, small intensity of the input excitation signals, model uncertainty, and uncertain initial conditions. In particular, if the Signal-to-Noise Ratio (SNR) of the measurements is not sufficiently large, one may be able to explain the observed variables by using more than a single dynamic model, from the set of eligible models. A similar conclusion applies if the intensity of the input signal is not sufficient to excite the dynamics of the system.

This section will therefore propose a methodology to systematically derive conditions that guarantee the distinguishability of a set of dynamic models, regardless of the noise sequences and initial states.

#### Systems with Uncertain Initial State

We start by analyzing the case where the dynamics of the system are known, although the initial state is uncertain and the measured variables are corrupted by bounded noise. Using Equation (8), we have that

$$y(k) = \underbrace{y(x^*(k)) - C(k)x^*(k)}_{\bar{y}(k)} + C(k)x(k) + n(k), \tag{10}$$

where

$$C(k) = \left. \frac{\partial y}{\partial x} \right|_{x^*(k)},$$

and where $n(k)$ is the measurement noise. Consider that a given input sequence, $u(0), \cdots, u(N)$, feeds the inputs of systems $S_A$ and $S_B$, respectively described by

$$\begin{array}{rcl} \mathsf{S}\_{A}: \left\{ \begin{array}{rcl} \mathsf{x}\_{A}(k+1) &=& G\_{A}(k)\mathsf{x}\_{A}(k), \\ \mathsf{y}\_{A}(k) &=& \bar{\mathsf{y}}\_{A}(k) + \mathsf{C}\_{A}(k)\mathsf{x}\_{A}(k) + \mathsf{n}\_{A}(k), \end{array} \right. \\\\ \mathsf{S}\_{B}: \left\{ \begin{array}{rcl} \mathsf{x}\_{B}(k+1) &=& G\_{B}(k)\mathsf{x}\_{B}(k), \\ \mathsf{y}\_{B}(k) &=& \bar{\mathsf{y}}\_{B}(k) + \mathsf{C}\_{B}(k)\mathsf{x}\_{B}(k) + \mathsf{n}\_{B}(k), \end{array} \right. \end{array}$$

where $y_A$ and $y_B$ are defined as in Equation (10), and $|n_A(k)| \le \frac{\bar{n}}{2}$, $|n_B(k)| \le \frac{\bar{n}}{2}$. Moreover, we assume that $x_A(0) \in X_o$ and $x_B(0) \in X_o$, where $X_o \subset \mathbb{R}^n$ is a convex polytope. Let $\phi_i = [n_i^T, u_i^T]^T$ denote the measurement noise, $n_i \in W \subseteq \mathbb{R}^{n_n}$, and input signal, $u_i \in U \subseteq \mathbb{R}^{n_u}$, at time instant $i$.

**Definition 1**: Systems $S_A$ and $S_B$ are said to be absolutely $(X_o, U, W)$-input distinguishable in $N$ sampling times if, for any non-zero

$$\left(x_A(0), x_B(0), \phi_1, \phi_2, \dots, \phi_N\right) \in X_o \times X_o \times \overbrace{\Phi \times \cdots \times \Phi}^{N \text{ times}},$$

where $\phi_i \in W \times U =: \Phi \subseteq \mathbb{R}^{n_n + n_u}$ for $i = 0, 1, \cdots, N$, there exists $k \in \{0, 1, \cdots, N\}$ such that

$$
y_A(k) \neq y_B(k).
$$

Moreover, two systems are said to be absolutely $(X_o, U, W)$-input distinguishable if there exists $N \ge 0$ such that they are absolutely $(X_o, U, W)$-input distinguishable in $N$ sampling times.

Let $U = \left(u(0), u(1), \dots, u(N)\right)$ and

$$W = \left\{ \left( n(0), n(1), \dots, n(N) \right) : |n(k)| \le \frac{\bar{n}}{2} \text{ for } 0 \le k \le N \right\}.$$

The following proposition can be used to state whether a pair of systems is distinguishable or not.

**Proposition 2**: Systems S<sup>A</sup> and S<sup>B</sup> are absolutely (Xo, U, W) input distinguishable in N sampling times if and only if a solution to the following linear problem does not exist:

$$
\begin{bmatrix}
C_A(0) & -C_B(0) \\
-C_A(0) & C_B(0) \\
C_A(1)G_A(0) & -C_B(1)G_B(0) \\
-C_A(1)G_A(0) & C_B(1)G_B(0) \\
\vdots & \vdots \\
C_A(N)G_A(N-1)\cdots G_A(0) & -C_B(N)G_B(N-1)\cdots G_B(0) \\
-C_A(N)G_A(N-1)\cdots G_A(0) & C_B(N)G_B(N-1)\cdots G_B(0) \\
M_o & \mathbf{0} \\
\mathbf{0} & M_o
\end{bmatrix}
\begin{bmatrix}
x_A(0) \\ x_B(0)
\end{bmatrix}
\le
\begin{bmatrix}
\bar{n} - \bar{y}_A(0) + \bar{y}_B(0) \\
\bar{n} + \bar{y}_A(0) - \bar{y}_B(0) \\
\bar{n} - \bar{y}_A(1) + \bar{y}_B(1) \\
\bar{n} + \bar{y}_A(1) - \bar{y}_B(1) \\
\vdots \\
\bar{n} - \bar{y}_A(N) + \bar{y}_B(N) \\
\bar{n} + \bar{y}_A(N) - \bar{y}_B(N) \\
m_o \\
m_o
\end{bmatrix},
\tag{11}
$$

where $X_o$ is defined so that $x \in X_o \Leftrightarrow M_o x \le m_o$, which can be written as $X_o = \mathrm{Set}(M_o, m_o)$.

Proof: See Appendix B in Supplementary Material.
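Proposition 2 reduces distinguishability to the infeasibility of a linear program. The toy sketch below illustrates this on two hypothetical scalar systems (not the paper's HRF construction), using `scipy.optimize.linprog`: if no initial states within the polytope can keep the output difference below the noise bound at every sample, the LP is infeasible and the systems are absolutely distinguishable.

```python
import numpy as np
from scipy.optimize import linprog

# Toy scalar instance (hypothetical numbers):
# x_i(k+1) = g_i x_i(k), y_i(k) = c_i x_i(k) + n_i(k), |n_i(k)| <= nbar/2,
# with x_A(0), x_B(0) in the interval (polytope) [0.5, 1.5].
g_A, c_A = 0.9, 1.0
g_B, c_B = 0.5, 1.0

def distinguishable(nbar, N):
    """Infeasible LP  <=>  absolutely distinguishable in N sampling times."""
    rows, rhs = [], []
    for k in range(N + 1):
        a, b = c_A * g_A**k, c_B * g_B**k
        rows += [[a, -b], [-a, b]]       # |y_A(k) - y_B(k)| <= nbar
        rhs += [nbar, nbar]
    res = linprog(c=[0.0, 0.0], A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(0.5, 1.5), (0.5, 1.5)], method="highs")
    return res.status == 2               # status 2: problem is infeasible

print(distinguishable(0.05, 10))   # small noise bound: distinguishable
print(distinguishable(2.0, 10))    # large noise bound: not distinguishable
```

For a small noise bound the constraints cannot all hold (the responses separate quickly, since $g_A \neq g_B$), while a large noise bound masks the difference between the two systems.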

#### Systems with Uncertain Model

We now consider the case where the system dynamics are uncertain and described by

$$\begin{array}{rcl} S_A: \left\{ \begin{array}{rcl} x_A(k+1) &=& \left( G_o^A(k) + \Delta_A G_1^A(k) \right) x_A(k), \\ y_A(k) &=& \bar{y}_A(k) + C_A(k) x_A(k) + n_A(k), \end{array} \right. \\\\ S_B: \left\{ \begin{array}{rcl} x_B(k+1) &=& \left( G_o^B(k) + \Delta_B G_1^B(k) \right) x_B(k), \\ y_B(k) &=& \bar{y}_B(k) + C_B(k) x_B(k) + n_B(k), \end{array} \right. \end{array}$$

where $y_A$ and $y_B$ are defined as in Equation (10), and $|n_A(k)| \le \frac{\bar{n}}{2}$, $|n_B(k)| \le \frac{\bar{n}}{2}$. We also assume that $|\Delta_A| \le 1$ and $|\Delta_B| \le 1$. Moreover, for this case we assume that $X_o$ is a singleton, thus removing the uncertainty in the initial state. In this case, $S_A$ and $S_B$ denote families of systems, due to the uncertainties $\Delta_A$ and $\Delta_B$. Therefore, the introduction of the following definition is required.

**Definition 2**: The families of systems $S_A$ and $S_B$ are said to be absolutely $(X_o, U, W)$-input distinguishable in $N$ sampling times if, for any pair of realizations $(S_1, S_2) \in S_A \times S_B$, the systems $S_1$ and $S_2$ are absolutely $(X_o, U, W)$-input distinguishable in $N$ sampling times.

Hence, we are now in a position to state the following proposition:

**Proposition 3**: The families of systems S<sup>A</sup> and S<sup>B</sup> are absolutely (Xo, U, W)-input distinguishable in N sampling times if and only if there does not exist a solution to the following linear problem:

$$
\Theta\_N \left[ \begin{array}{c} \Delta\_A \\ \Delta\_B \end{array} \right] \le \theta\_N,\tag{12}
$$

where

$$
\Theta_N = \begin{bmatrix}
0 & 0 \\
0 & 0 \\
C_A(1)\Psi_1^A(0)x_A(0) & -C_B(1)\Psi_1^B(0)x_B(0) \\
-C_A(1)\Psi_1^A(0)x_A(0) & C_B(1)\Psi_1^B(0)x_B(0) \\
\vdots & \vdots \\
C_A(N)\Psi_1^A(N-1)x_A(0) & -C_B(N)\Psi_1^B(N-1)x_B(0) \\
-C_A(N)\Psi_1^A(N-1)x_A(0) & C_B(N)\Psi_1^B(N-1)x_B(0)
\end{bmatrix}
$$

and

$$
\theta_N = \begin{bmatrix}
\bar{n} - \bar{y}_A(0) + \bar{y}_B(0) - C_A(0)x_A(0) + C_B(0)x_B(0) \\
\bar{n} + \bar{y}_A(0) - \bar{y}_B(0) + C_A(0)x_A(0) - C_B(0)x_B(0) \\
\bar{n} - \bar{y}_A(1) + \bar{y}_B(1) - C_A(1)\Psi_o^A(0)x_A(0) + C_B(1)\Psi_o^B(0)x_B(0) \\
\bar{n} + \bar{y}_A(1) - \bar{y}_B(1) + C_A(1)\Psi_o^A(0)x_A(0) - C_B(1)\Psi_o^B(0)x_B(0) \\
\vdots \\
\bar{n} - \bar{y}_A(N) + \bar{y}_B(N) - C_A(N)\Psi_o^A(N-1)x_A(0) + C_B(N)\Psi_o^B(N-1)x_B(0) \\
\bar{n} + \bar{y}_A(N) - \bar{y}_B(N) + C_A(N)\Psi_o^A(N-1)x_A(0) - C_B(N)\Psi_o^B(N-1)x_B(0)
\end{bmatrix}.
$$

Proof: See Appendix C in Supplementary Material.


**Figure 2A** depicts the impulse and step responses of the HRF model with the parameters of **Table 1**, with an uncertainty of 10% in the input signal. It should be noticed that this type of uncertainty mainly affects the amplitude of the responses of the system. Thus, the rise- and fall-times are not significantly influenced by small variations on the amplitude of the input signal.


#### Systems with Uncertain Input Time-Delays

In this subsection, a strategy to model uncertain input time-delays is developed. The approach presented in the sequel amounts to rewriting these uncertain input time-delays as model uncertainty.

Consider that the input signal, at sampling time k, is given by

$$
u(k) = \tilde{u}(k - k\_d),$$

where $k_d$ is an integer (the uncertain delay) satisfying $|k_d| \le \bar{k}_d$, with known $\bar{k}_d$. The value of $\tilde{u}(k)$, for each $k \ge 0$, is also assumed known and bounded. Thus, we have, for each $k \ge 0$,

$$
\underline{u}(k) \le u(k) \le \bar{u}(k), \tag{13}
$$

where $\bar{u}(k) = \max_{|m| \le \bar{k}_d} \tilde{u}(k-m)$ and $\underline{u}(k) = \min_{|m| \le \bar{k}_d} \tilde{u}(k-m)$. Therefore, Equation (13) can be rewritten as

$$
u(k) = u_o(k) + \Delta_u(k)u_1(k),
$$

where $|\Delta_u(k)| \le 1$, $u_o(k) = \frac{\bar{u}(k) + \underline{u}(k)}{2}$, and $u_1(k) = \frac{\bar{u}(k) - \underline{u}(k)}{2}$.

Hence, unknown but bounded time-delays on the input can be treated as uncertainty on the $B$ matrix. The impulse and step responses of the HRF model with the parameters of **Table 1**, with an uncertain input time-delay, $k_d$, bounded by $|k_d| \le 3$, are depicted in **Figure 2B**. As seen in the figure, the uncertainty in the input time-delay enlarges the uncertainty in the rise- and fall-times of the output.
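The bounds in Equation (13) are simple to compute from a known stimulus sequence; a sketch (with a hypothetical pulse train):

```python
import numpy as np

def delay_bounds(u_tilde, kd_max):
    """Rewrite an uncertain input delay |k_d| <= kd_max as input uncertainty:
    u(k) = u_o(k) + Delta_u(k) u_1(k) with |Delta_u(k)| <= 1 (Equation 13)."""
    N = len(u_tilde)
    u_hi, u_lo = np.empty(N), np.empty(N)
    for k in range(N):
        lo, hi = max(0, k - kd_max), min(N, k + kd_max + 1)
        window = u_tilde[lo:hi]          # all values u~(k - m) with |m| <= kd_max
        u_hi[k], u_lo[k] = window.max(), window.min()
    return (u_hi + u_lo) / 2.0, (u_hi - u_lo) / 2.0   # u_o, u_1

# A single rectangular pulse: the edges acquire uncertainty, the plateau
# and the baseline far from the pulse do not.
u = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0], dtype=float)
u_o, u_1 = delay_bounds(u, kd_max=1)
```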

#### Systems with Uncertain Model and Input Time-Delays

For the sake of completeness, in this subsection we analyze the effects of simultaneous uncertainty on the model and on the input time-delays. The results for this scenario are depicted in **Figure 2C**. As expected, the uncertainty on the model chiefly affects the amplitude of the responses, while the uncertainty on the input time-delay changes the corresponding rise- and fall-times.

#### Systems with Uncertain Model and Uncertain Initial State

We now consider the case where both the system dynamics and the initial state are uncertain. The problem is set to that of concluding whether the following two families of systems are distinguishable:

$$\begin{array}{rcl} S_A: \left\{ \begin{array}{rcl} x_A(k+1) &=& \left( G_o^A(k) + \Delta_A(k) G_1^A(k) \right) x_A(k), \\ y_A(k) &=& \bar{y}_A(k) + C_A(k) x_A(k) + n_A(k), \end{array} \right. \\\\ S_B: \left\{ \begin{array}{rcl} x_B(k+1) &=& \left( G_o^B(k) + \Delta_B(k) G_1^B(k) \right) x_B(k), \\ y_B(k) &=& \bar{y}_B(k) + C_B(k) x_B(k) + n_B(k), \end{array} \right. \end{array}$$

where $y_A$ and $y_B$ are defined as in Equation (10), and $|n_A(k)| \le \frac{\bar{n}}{2}$, $|n_B(k)| \le \frac{\bar{n}}{2}$. We also assume that $|\Delta_A(k)| \le 1$ and $|\Delta_B(k)| \le 1$. Moreover, for this case we assume that $X_o$ is a convex polytope.

**Proposition 4:** Let $e_1 = [1\;\;0\;\;0\;\;0\;\;0]^T$. The families of systems $S_A$ and $S_B$ are absolutely $(X_o, U, W)$-input distinguishable in $N$ sampling times if and only if there does not exist a solution to the following linear problem:


where the unknown variables are $x_A(0), \cdots, x_A(N)$, $x_B(0), \cdots, x_B(N)$, $z_A(0), \cdots, z_A(N-1)$, and $z_B(0), \cdots, z_B(N-1)$.

Proof: See Appendix D in Supplementary Material.

**Figure 3A** depicts the maximum amplitude of the measurement noise that guarantees the absolute distinguishability of two particular families of HRF models, as a function of the uncertainty in the input signal and in the corresponding time-delay. As expected, the maximum level of sensor noise for which the two families of models are absolutely distinguishable decreases with both types of uncertainty.

### Pre-Processing of fMRI Time Series

We stress that the assumption that the additive noise in the measured signal is bounded is not restrictive in practice, since outliers and other unbounded behavior can, in general, be tackled during pre-processing, i.e., before performing the main analysis of the signals. This can be done, in particular, by low-pass filtering the signal, so that high-frequency noise is significantly attenuated.

Additionally, the following pre-processing steps are commonly applied to fMRI time series data before submitting them to statistical analysis (Jezzard et al., 2001): (i) normalization of the whole 4D fMRI dataset by scaling each volume by a single (common) scaling factor, so that subsequent analyses are valid; (ii) motion correction by alignment of all fMRI volumes to a reference volume in the time series, usually performed by applying rigid-body transformations, in order to reduce the effect of subject head motion during the experiment; and (iii) high-pass temporal filtering, usually using a local fit of a straight line (Gaussian-weighted within the line to give a smooth response), in order to remove low-frequency artifacts such as signal drifts or physiological fluctuations.
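A minimal sketch of the temporal-filtering steps on a synthetic voxel time series; Butterworth filters are used here as a generic stand-in for the Gaussian-weighted running-line fit mentioned above, and the repetition time, cutoff frequencies, and signal composition are assumed values:

```python
import numpy as np
from scipy.signal import butter, filtfilt

TR = 2.0                                  # assumed repetition time (s)
t = np.arange(0.0, 200.0, TR)
rng = np.random.default_rng(0)

# Synthetic voxel time series: slow drift + task-band signal + noise.
drift = 0.5 * t / t[-1]
signal = 0.2 * np.sin(2 * np.pi * t / 24.0)   # ~0.04 Hz component
series = drift + signal + 0.05 * rng.standard_normal(t.size)

fs = 1.0 / TR
# High-pass at 0.01 Hz removes the drift; low-pass at 0.15 Hz attenuates
# high-frequency noise while preserving the task-band signal.
b_hp, a_hp = butter(2, 0.01 / (fs / 2), btype="highpass")
b_lp, a_lp = butter(2, 0.15 / (fs / 2), btype="lowpass")
clean = filtfilt(b_lp, a_lp, filtfilt(b_hp, a_hp, series))
```

Zero-phase filtering (`filtfilt`) is used so that the filters do not shift the response latencies, which matters when the HRF timing itself is the quantity of interest.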

# Results

In this section, we study the influence of the choice of the input signal on the distinguishability of a set of HRF models. A methodology to optimize the fMRI experimental design that takes advantage of this knowledge is also presented.

Throughout the remainder of this paper, we are going to refer to the families of HRF models A and B, described by the dynamics in Equation (1), with the physiologically plausible parameters presented in **Table 2**. Model family B displays a pronounced undershoot and the presence of an initial dip, in stark contrast to model family A.

The response of the nominal HRF models, for the parameter configurations of **Table 2**, with initial state $x^T(0) = [0\;\;1\;\;1\;\;1]$, to a rectangular input signal of duration 1 s and unit magnitude, is depicted in **Figure 4**.

In general, the input signal is composed of a series of rectangular pulses (events) of duration $t_{high}$ alternating with baseline periods of duration $t_{low}$, with a total duration of 200 s (see **Figure 5**).
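Such a block design can be generated directly; a sketch (the time step and vectorized construction are implementation choices, not from the paper):

```python
import numpy as np

def block_design(t_high, t_low, total=200.0, dt=0.1):
    """u(t) = 1 during events of duration t_high (s), 0 during baselines of
    duration t_low (s), repeated over the total duration."""
    t = np.arange(0.0, total, dt)
    u = ((t % (t_high + t_low)) < t_high).astype(float)
    return t, u

t, u = block_design(t_high=12.0, t_low=12.0)
```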

In order to illustrate the characteristic behavior of HRF model family A, its responses to rectangular input signals of duration 5 and 20 s and unit magnitude, with an uncertain input time-delay, $k_d$, bounded by $|k_d| \le 3$ s, and an input uncertainty of 10%, are depicted in **Figure 6**. The uncertainty in the input time-delay enlarges the uncertainty in the rise- and fall-times of the output, while the uncertainty in the input mainly affects the amplitude of the responses of the system.

**Figure 3B** depicts the maximum amplitude of the measurement noise that guarantees the absolute distinguishability of the families of models A and B, for an input signal with $t_{low} = 12$ s and $t_{high} = 12$ s, as a function of the uncertainty in the magnitude of the input signal and in the corresponding time-delay. As expected, the maximum level of measurement noise for which the families of models A and B are absolutely distinguishable decreases with both types of uncertainty.

Furthermore, we considered a stochastic input signal, composed of a series of rectangular pulses with mean duration $E(t_{high}) = 12$ s and mean baseline period $E(t_{low}) = 12$ s, both drawn from a uniform distribution of width 12 s. In line with results in the literature (see, for instance, Josephs et al., 1997; Miezin et al., 2000), we observe that small random variations in $t_{high}$ and $t_{low}$ explore alternative trajectories of the non-linear model of Equation (1), which in turn improves the identifiability of the models, as depicted in **Figure 3C**.
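The stochastic design can be sketched by drawing each event and baseline duration from $U[6, 18]$ s (mean 12 s, width 12 s); the sampling step and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(7)

def jittered_design(rng, mean=12.0, width=12.0, total=200.0, dt=0.1):
    """Alternating event/baseline periods with durations drawn uniformly
    from [mean - width/2, mean + width/2] seconds."""
    t = np.arange(0.0, total, dt)
    u = np.zeros_like(t)
    pos, on = 0.0, True                   # start with an event
    while pos < total:
        dur = rng.uniform(mean - width / 2.0, mean + width / 2.0)
        if on:
            u[(t >= pos) & (t < pos + dur)] = 1.0
        pos += dur
        on = not on
    return t, u

t, u = jittered_design(rng)
```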

We now analyze the effect of different experimental designs on the distinguishability of the families of models at hand. At this point, our goal is to find the combination of values of $t_{low}$ and $t_{high}$ such that the absolute distinguishability of two or more families of models is guaranteed for the highest upper bound on the amplitude of the measurement noise. We denote this optimal combination of values by $(t^*_{low}, t^*_{high})$. The advantage of using an input signal with parameters $(t^*_{low}, t^*_{high})$ stems from the fact that we can allow for the highest amplitude of measurement noise while still guaranteeing the distinguishability of the families.

**Figure 7** depicts the results obtained, considering no time-delay or magnitude uncertainty. As expected, input signals with very small values of $t_{high}$ and large values of $t_{low}$ do not have the power required to significantly stimulate the system. On the other hand, input signals with very small values of both $t_{high}$ and $t_{low}$ are faster than the dynamics of the system, and hence do not produce noticeable changes in the output of the plant. As a final remark, the optimal value for both $t_{low}$ and $t_{high}$ is 10 s, i.e., $t^*_{low} = 10$ s and $t^*_{high} = 10$ s.
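The grid search over $(t_{low}, t_{high})$ can be sketched as follows. Everything here is a simplified stand-in: the two surrogate HRFs are hypothetical gamma kernels (not the model families of Table 2), and the noise-free peak output separation replaces the LP-based noise bound of Propositions 2-4 as the design score.

```python
import numpy as np

dt = 0.1
t_k = np.arange(0.0, 30.0, dt)

def gamma_kernel(shape, scale):
    """Normalized gamma-shaped kernel, a generic HRF surrogate."""
    h = t_k ** (shape - 1.0) * np.exp(-t_k / scale)
    return h / h.sum()

# Two hypothetical surrogates standing in for model families A and B.
h_A = gamma_kernel(6.0, 1.0)
h_B = gamma_kernel(6.0, 1.4)

def margin(t_high, t_low, total=200.0):
    """Peak separation of the two (linear, noise-free) responses to a block
    design: a proxy for the largest admissible noise bound."""
    t = np.arange(0.0, total, dt)
    u = ((t % (t_high + t_low)) < t_high).astype(float)
    y_A = np.convolve(u, h_A)[: t.size]
    y_B = np.convolve(u, h_B)[: t.size]
    return np.abs(y_A - y_B).max()

grid = [4.0, 6.0, 8.0, 10.0, 12.0, 16.0, 20.0]
best = max((margin(th, tl), th, tl) for th in grid for tl in grid)
_, t_high_opt, t_low_opt = best
```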

# Discussion

We have addressed the problem of the distinguishability of HRF models in the analysis of fMRI data of brain activation, based on the biophysically informed description of the HRF as a non-linear time-invariant input-state-output dynamic system. We first introduced the concept of absolutely input-distinguishable systems and then showed that the distinguishability between two HRF models, and hence system identification, is significantly affected by the external input (stimulus/task) signals. In particular, uncertainty in the input time-delays and magnitude may adversely affect model identification, by reducing the maximum noise level below which model distinguishability is guaranteed. We then applied the concept of absolutely input-distinguishable systems to the development of a methodology for assessing the HRF estimation efficiency of fMRI experimental designs, through the maximization of the distinguishability level among a set of physiologically plausible HRF models.

The main contribution of this paper is therefore twofold. On the one hand, we show that the distinguishability of two HRF models depends on the level of the measurement noise as well as on the characteristics of the input signal. On the other hand, we develop a methodology to optimize fMRI experimental designs for HRF estimation, which maximizes the allowable noise amplitude that does not impair the distinguishability of a set of a priori admissible dynamic systems.

In this paper, it is assumed that the system inputs can be selected or, at least, measured. This assumption is verified in a straightforward manner when external inputs are present, such as sensory stimuli or cognitive tasks. Although no explicit external inputs exist in resting-state fMRI acquisitions, it has been observed that discrete neuronal events do occur (Deco and Jirsa, 2012). Most interestingly, it has recently been suggested that such events can be identified as peaks of relatively large BOLD signal amplitude (Tagliazucchi et al., 2011), and resting-state fMRI data can then be seen as "spontaneous event-related" data (Wu et al., 2013).

### Significance of HRF Estimation

The importance of estimating the HRF in fMRI experiments is based on the extensively observed variability of its shape and dynamics across brain regions, conditions, subjects, and populations, with critical consequences in the analysis of fMRI data. In fact, one direct consequence of HRF variability is that the deviation of the real HRF from the pre-specified HRF leads to a poorer model of the observed BOLD signal and hence reduces the sensitivity to detect BOLD changes (Handwerker et al., 2004). Another consequence is the potential detection of a group effect due to a systematic HRF difference, which would then be incorrectly interpreted as a neuronal effect. Moreover, when attempting to infer causality within brain networks from BOLD data, differences in HRF latency across brain regions can potentially confound the directionality of information flow (David et al., 2008; Smith et al., 2011; Murta et al., 2012; Jorge et al., 2014). On the other hand, HRF variability may be an object of interest on its own, potentially reflecting physiological changes associated with the effects of drugs, aging or pathology, for example (Iadecola, 2004). Additionally, there is a growing interest in studying, not only the amplitude of BOLD activation, but also its dynamics, namely its latency and duration, which are reflected in the HRF (Bellgowan et al., 2003). In these cases, it would be desirable to estimate the actual HRF model underlying the BOLD signal measured in each voxel, experiment, subject or population, or otherwise account for its variability.

Despite the acknowledged need for modeling the HRF underlying fMRI BOLD data, and although different approaches have been continuously proposed in the literature for this purpose, our ability to understand HRF variability remains poor (Handwerker et al., 2012). Critically, most studies have focused on parameterized HRF models in a linear framework, while the estimation of physiologically plausible non-linear HRF models with direct biophysical interpretability has been very limited. In particular, no previous study has investigated the optimal fMRI experimental design for the estimation of such biophysical HRF models. We believe that our work therefore makes an important contribution for understanding how a biophysically informed model of the HRF may be inferred from fMRI data, as a function of experimental design and measurement noise.

# Biophysically Informed HRF Modeling

Using a biophysically informed model of the HRF not only allows for a physiologically plausible interpretation of the results, but also more accurately explains empirical BOLD data, particularly regarding commonly observed non-linearities. Importantly, in contrast to parameterized HRF models, biophysical models described by dynamic systems can account for the detailed dynamics of BOLD responses through a reduced number of parameters, while constraining them to be physiologically plausible. For example, the post-stimulus undershoot and the initial dip are two features of observed BOLD responses that naturally emerge from such a dynamic system under slightly different combinations of a limited number of parameters. Although using such dynamic systems represents an additional computational effort compared with more straightforward linear methods, this may nevertheless become the approach of choice in studies where a detailed characterization of BOLD temporal dynamics is desirable. In particular, the combination of EEG with fMRI may greatly benefit from such approaches (Riera et al., 2005). On the other hand, important complementary information may be gained for HRF model estimation by combining BOLD recordings with the acquisition of blood flow data using arterial spin labeling (ASL) or near-infrared spectroscopy (NIRS) (Huppert et al., 2006). Despite the potential advantages of such a biophysically informed dynamic system approach to fMRI data analysis, only a few studies have been dedicated to the associated problem of system identification/model estimation (Friston, 2002; Riera et al., 2004). Our study therefore makes a significant contribution to this limited body of literature, by introducing the concept of input-distinguishability of HRF models in order to inform model selection in this context.
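The paper's specific model equations are not reproduced in this excerpt. As an illustration of the kind of biophysical dynamic system discussed above, the sketch below integrates a balloon/windkessel-type hemodynamic model with simple Euler steps; the parameter values (signal decay κ, feedback γ, transit time τ, stiffness α, resting oxygen extraction E0, resting blood volume V0) are typical values from the literature, not those estimated in this study, and the whole block should be read as a hedged sketch rather than the authors' implementation.

```python
import numpy as np

def balloon_bold(u, dt=0.01, kappa=0.65, gamma=0.41, tau=0.98,
                 alpha=0.32, E0=0.34, V0=0.02):
    """Euler integration of a balloon/windkessel-type hemodynamic model.

    States: vasodilatory signal s, blood inflow f, blood volume v,
    deoxyhemoglobin content q. Input u(t) is the neuronal drive.
    """
    s, f, v, q = 0.0, 1.0, 1.0, 1.0          # resting equilibrium
    k1, k2, k3 = 7 * E0, 2.0, 2 * E0 - 0.2   # common BOLD signal constants
    bold = np.empty(len(u))
    for t, ut in enumerate(u):
        ds = ut - kappa * s - gamma * (f - 1)           # vasodilatory signal
        df = s                                          # inflow follows signal
        dv = (f - v ** (1 / alpha)) / tau               # volume (windkessel)
        dq = (f * (1 - (1 - E0) ** (1 / f)) / E0        # oxygen extraction
              - v ** (1 / alpha - 1) * q) / tau         # minus outflow of dHb
        s, f, v, q = s + dt * ds, f + dt * df, v + dt * dv, q + dt * dq
        bold[t] = V0 * (k1 * (1 - q) + k2 * (1 - q / v) + k3 * (1 - v))
    return bold

# 2 s stimulus followed by 38 s of rest, sampled at dt = 10 ms
u = np.zeros(4000)
u[:200] = 1.0
h = balloon_bold(u)   # positive BOLD response with a post-stimulus undershoot
```

With these typical parameters, the response rises to a positive peak of a few tenths of a percent, dips below baseline after the stimulus (the post-stimulus undershoot mentioned above), and then returns to rest.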

# Optimization of the Experimental Design

Previous studies systematically assessing the quality of fMRI experimental designs have likewise focused on parameterized HRF models within a linear framework (Dale, 1999; Liu et al., 2001). They found that optimal estimation efficiency is obtained, at the cost of reduced detection power, by employing randomized rapid event-related designs. In fact, it was shown that, if the ISI is properly jittered or randomized from trial to trial, the efficiency improves monotonically with decreasing mean ISI (Dale, 1999). In general, a trade-off exists between detection power and estimation efficiency, with block designs being optimal for the former and event-related designs optimal for the latter (Liu et al., 2001). Nevertheless, a recent report established the feasibility and test-retest reliability of estimating HRF parameters from block design fMRI data (Shan et al., 2014). In our work, we used a randomized design by introducing uncertainty in the ISI, and we showed that smaller uncertainty leads to better distinguishability for the same noise level. Our results are therefore consistent with the literature.

# Limitations

The framework adopted in this work resorts to deterministic concepts and therefore imposes certain assumptions on the signals acting on the system, in particular on their maximum amplitudes. Stochastic approaches are more flexible in that sense, but require knowledge of the statistical properties of those signals, which may not be trivial to obtain, or which may be violated in practice. A compromise between these two alternative frameworks—deterministic and stochastic—for the distinguishability of HRF models therefore remains a subject for further research.

# Conclusion

In summary, in this paper we proposed a novel approach to assess distinguishability among a set of physiologically plausible, biophysically informed HRF models, and to design fMRI experiments for optimal estimation efficiency of such HRF models, with a potentially great impact on further understanding HRF variability and its physiological meaning.

# Acknowledgments

We acknowledge financial support by the Portuguese Science Foundation through Projects PTDC/SAU-ENB/112294/2009, PTDC/BBB-IMG/2137/2012 and FCT [UID/EEA/50009/2013], and project MYRG117(Y1-L3)-FST12-MKM of the University of Macau. We also thank the reviewers for their insightful comments and corrections.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom.2015.00054/abstract

# References




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Rosa, Figueiredo and Silvestre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Detection of epileptiform activity in EEG signals based on time-frequency and non-linear analysis

Dragoljub Gajic<sup>1, 2</sup>\*, Zeljko Djurovic<sup>1</sup>, Jovan Gligorijevic<sup>3</sup>, Stefano Di Gennaro<sup>2</sup> and Ivana Savic-Gajic<sup>4</sup>

<sup>1</sup> Department of Signals and Systems, School of Electrical Engineering, University of Belgrade, Belgrade, Serbia, <sup>2</sup> Center of Excellence DEWS, University of L'Aquila, L'Aquila, Italy, <sup>3</sup> Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia, <sup>4</sup> Faculty of Technology, University of Nis, Leskovac, Serbia

We present a new technique for the detection of epileptiform activity in EEG signals. After preprocessing of the EEG signals, we extract representative features in the time, frequency and time-frequency domains, as well as by non-linear analysis. The features are extracted in a few frequency sub-bands of clinical interest, since these sub-bands showed much better discriminatory characteristics compared with the whole frequency band. We then optimally reduce the dimension of the feature space to two using scatter matrices. A decision about the presence of epileptiform activity in EEG signals is made by quadratic classifiers designed in the reduced two-dimensional feature space. The accuracy of the technique was tested on three sets of electroencephalographic (EEG) signals recorded at the University Hospital Bonn: surface EEG signals from healthy volunteers, intracranial EEG signals from epilepsy patients recorded within the seizure focus during the seizure-free interval, and intracranial EEG signals of epileptic seizures, also recorded within the seizure focus. An overall detection accuracy of 98.7% was achieved.

### Edited by:

Tobias Alecio Mattei, Brain and Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA

### Reviewed by:

Peter König, University of Osnabrück, Germany
Germán Mato, Centro Atomico Bariloche, Argentina

#### \*Correspondence:

Dragoljub Gajic, Department of Signals and Systems, School of Electrical Engineering, University of Belgrade, Bulevar Kralja Aleksandra 73, Belgrade 11000, Serbia dragoljubgajic@gmail.com

> Received: 01 January 2015 Accepted: 08 March 2015 Published: 24 March 2015

#### Citation:

Gajic D, Djurovic Z, Gligorijevic J, Di Gennaro S and Savic-Gajic I (2015) Detection of epileptiform activity in EEG signals based on time-frequency and non-linear analysis. Front. Comput. Neurosci. 9:38. doi: 10.3389/fncom.2015.00038

Keywords: seizure detection, epileptiform activity, non-linear analysis, scatter matrices, quadratic classifiers

# Introduction

According to World Health Organization estimates, around 50 million people worldwide suffer from epilepsy, the most common disorder of brain activity (World Health Organization, 2012). It is characterized by sudden and recurrent seizures, which are the result of an excessive and synchronous electrical discharge of a large number of neurons. Epileptic seizures can be divided by their clinical manifestation into two main classes, partial and generalized (Tzallas et al., 2007). Partial or focal epileptic seizures involve only a circumscribed region of the brain (the epileptic focus) and remain restricted to this region, while generalized epileptic seizures involve almost the entire brain. Both classes of epileptic seizures can occur at all ages. Epileptiform activity in EEG signals, including spikes, sharp waves, or spike-and-wave complexes, can be evident not only during a seizure (the ictal period) but also shortly before it (the preictal period) as well as between seizures (the interictal period). Consequently, EEG signals have been the most widely utilized in clinical assessments of the brain state, including both prediction and detection of epileptic seizures (Waterhouse, 2003; Casson et al., 2010). However, the detection of epileptiform activity by visual scanning of EEG recordings, usually collected over a few days, is a tedious and time-consuming process. In addition, it requires a team of experts to analyze the entire length of the EEG recordings in order to detect epileptiform activity. A reliable technique for the detection of epileptiform activity in EEG signals would ensure objective assessment, facilitate the treatment of patients, and thus improve the diagnosis of epilepsy. Furthermore, it would also enable automated prediction and/or detection of epileptic seizures in real time by a system implanted in the head of epileptic patients (Jerger et al., 2001). Such a system would significantly improve the quality of life of people suffering from epilepsy. Most of the techniques for automated detection of epileptiform activity that have emerged in recent years consist of two key successive steps: extraction of features from the EEG signals, followed by classification of the extracted features for the detection of epileptiform activity.

The feature extraction, as the first step, has a direct influence on both the precision and the complexity of the entire technique. The most common statistical features in the time domain, such as the mean, the variance, the coefficient of variation and the total variation, are by themselves not sufficient for a reliable detection of epileptiform activity, and are thus mostly used as statistical measures for features in other domains. The variance and the total variation are considered to have better discriminatory capabilities than the mean, since they are able to capture the magnitude of change in a signal over time. Even though a certain periodicity and synchronization between EEG signals from different electrodes can be noted, neither the autocorrelation nor the cross-correlation has proved to be a reliable feature for the detection of epileptiform activity. This is especially true in the case of cortical EEG, where the recording electrodes are so close to each other that synchronization can be noted even when there is no seizure. Nevertheless, several applications of these two features can still be found in the literature (Niederhauser et al., 2003; Jerger et al., 2005).

Unlike the previous features, the spectral features of EEG signals obtained through the Fourier transform have found wide application in the field (Polat and Gunes, 2007; Mousavi et al., 2008). Namely, all the research carried out to date clearly indicates that it is much better to identify and extract the features of interest in the frequency domain than in the time domain, even though both domains contain identical information. Analysis in the time-frequency domain gives even better results, considering that it retains, in addition to frequency, the temporal component of the signal, which is lost in the Fourier transform. The literature mainly contains techniques based on the wavelet transform (Subasi, 2007a,b; Wang et al., 2011; Gajic et al., 2014), which has also been used in research related to other brain disorders, such as schizophrenia (Hazarika et al., 1997) and Alzheimer's disease (Adeli and Ghosh-Dastidar, 2010). The detection of epileptiform activity based on non-linear analysis, i.e., extraction of the correlation dimension and the Lyapunov exponents as non-linear features, can also be noted in some research studies (Iasemidis et al., 2003; Srinivasan et al., 2007; Adeli and Ghosh-Dastidar, 2010).

A precise classification, as the second key step, directly depends on the previously extracted features. That is, no classifier can in any way make up for the shortcomings that are a consequence of the information lost during feature extraction. As with feature extraction, a very wide range of classifiers can be found, starting from the simplest ones based on thresholds (Altunay et al., 2010) or rules (Gotman, 1999), through linear classifiers (Liang et al., 2010; Iscan et al., 2011), all the way to more complex ones based on fuzzy logic and artificial neural networks (Gajic, 2007; Subasi, 2007a; Tzallas et al., 2007). The use of other classification techniques can also be noted, based on k nearest neighbors (Guo et al., 2011; Orhan et al., 2011), decision trees (Tzallas et al., 2009), expert models (Ubeyli, 2007; Ubeyli and Guler, 2007) and Bayes classifiers (Tzallas et al., 2009; Iscan et al., 2011). Considering that feature extraction, as the higher-priority process, can be computationally very demanding, it is always more desirable to use simpler classifiers so that the entire decision-making system can ideally work in real time.

In this paper we present an automated technique for the detection of epileptiform activity in EEG signals. In contrast to existing techniques, which are mainly based on features from a single domain of interest, our new technique optimally integrates features from several domains and frequency sub-bands of clinical interest in order to increase its robustness and accuracy. We extract features in the time and frequency domains as well as the time-frequency domain, using the discrete wavelet transform, which has already been recognized as a very good linear technique for the analysis of non-stationary signals such as EEG signals. In addition, by non-linear analysis we extract the correlation dimension and the largest Lyapunov exponent as much better measures of EEG signal non-linearity, which is only approximated by linear techniques such as the fast Fourier transform (FFT) and the discrete wavelet transform (DWT). After the feature extraction we optimally reduce the feature space dimension to two using scatter matrices and then perform classification in the reduced feature space with quadratic classifiers, which are known to be very robust solutions for the classification of random feature vectors.

# Materials and Methods

# Materials

The EEG signals used to design and test the new technique were recorded at the University Hospital Bonn, Germany, with the same 128-channel amplifier system (Andrzejak et al., 2001). After 12-bit analog-to-digital conversion, the EEG signals were saved in a data acquisition system at a sampling rate of 173.61 Hz. The amplifier range was adjusted so that the recordings made full use of the 12-bit resolution. The recorded EEG signals were further passed through a low-pass filter with a finite impulse response and a bandwidth of 0–60 Hz. Frequencies higher than 60 Hz mostly represent noise and constitute a very small part of the total signal energy in the frequency band up to 86.8 Hz saved by the acquisition system. We used 100 segments of epileptic and 200 segments of non-epileptic EEG signals to design and test our new technique. The epileptic EEG signals were recorded using cortical electrodes from 5 epileptic patients during seizures, from within the seizure focus, i.e., the region of unhealthy brain tissue that was later removed by surgery. The first 100 segments of non-epileptic EEG signals were also recorded using cortical electrodes from the same epileptic patients and the same unhealthy brain tissue, but during the seizure-free interval. The remaining 100 segments of non-epileptic EEG signals were recorded using scalp electrodes from 5 healthy volunteers, i.e., from healthy brain tissue. There was thus a total of three groups with 100 segments of EEG signals each. All segments have a duration of 4096 samples, i.e., 23.6 s, and were additionally tested for weak stationarity (Andrzejak et al., 2001) in order to allow non-linear analysis. Since the EEG signals were recorded from different patients and with different electrodes, all extracted EEG signal segments were additionally normalized to zero mean and unit variance, as shown in **Figure 1**. In this way, we aimed to design a detection technique that does not depend on the patient or on the EEG recording system.

# Methods

There are five broad sub-bands of the EEG signal which are generally of clinical interest: delta (0–4 Hz), theta (4–8 Hz), alpha (8–16 Hz), beta (16–32 Hz), and gamma waves (32–64 Hz). Higher frequencies are often more common in abnormal brain states such as epilepsy, i.e., there is a shift of EEG signal energy from lower to higher frequency bands before and during a seizure (Gajic et al., 2014). These five frequency sub-bands provide more accurate information about the neuronal activities underlying the problem. Consequently, some changes in the EEG signal, which are not so obvious in the original full-spectrum signal, can be amplified when each sub-band is considered independently. Thus, we extract features from each sub-band separately, in the time, frequency and time-frequency domains as well as by non-linear analysis. After the feature extraction we reduce the dimension of the feature space to two. Finally, two quadratic classifiers able to separate all three groups of the EEG signals from each other are designed. The entire structure of the technique is shown in **Figure 2**.
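The two final stages of the pipeline above—reduction of the feature space to two dimensions via scatter matrices, followed by quadratic classification—can be sketched as follows. This is an illustrative NumPy implementation on synthetic feature vectors, assuming a standard Fisher-criterion projection (eigenvectors of S<sub>w</sub><sup>-1</sup>S<sub>b</sub>) and Gaussian quadratic discriminants; the authors' exact criterion and classifier details are not given in this excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)

def scatter_projection(X, y, out_dim=2):
    """Project features onto the top eigenvectors of Sw^-1 Sb (Fisher criterion)."""
    mu = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))   # within-class scatter
    Sb = np.zeros_like(Sw)                    # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:out_dim]]

def fit_quadratic(X, y):
    """Per-class Gaussian parameters (mean, covariance) for a quadratic classifier."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c].T))
            for c in np.unique(y)}

def predict_quadratic(model, X):
    scores = []
    for c, (m, S) in model.items():
        d = X - m
        Si = np.linalg.inv(S)
        # quadratic (Gaussian) discriminant, up to class-independent terms
        g = -0.5 * np.einsum('ij,jk,ik->i', d, Si, d) \
            - 0.5 * np.log(np.linalg.det(S))
        scores.append(g)
    return np.array(list(model))[np.argmax(scores, axis=0)]

# three synthetic "EEG feature" classes in a 10-D feature space
X = np.vstack([rng.normal(m, 1.0, size=(100, 10)) for m in (0, 3, 6)])
y = np.repeat([0, 1, 2], 100)
W = scatter_projection(X, y)   # 10-D -> 2-D projection matrix
Z = X @ W
acc = (predict_quadratic(fit_quadratic(Z, y), Z) == y).mean()
```

On these well-separated synthetic classes the quadratic classifiers in the reduced 2-D space recover the labels almost perfectly; real EEG features are, of course, far less separable.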

### Time, Frequency and Time-Frequency Domain Analysis

Since the segments of the EEG signals have already been normalized and all have zero mean and unit variance, additional extraction of these two features, as well as of the coefficient of variation, which is a function of the mean and variance, does not make any sense. However, we extracted the total variation as another measure of signal variability in the time domain, which remains informative even after normalization since it captures the frequency and magnitude of changes in the signal. In the case of a signal segment x[n] of N samples, i.e., n = 1, 2, · · · , N, the total variation is given by:

$$\nu\_{x} = \frac{1}{N-1}\, \frac{\sum\_{n=2}^{N} \left| x[n] - x[n-1] \right|}{\left( x\_{\max} - x\_{\min} \right)} \tag{1}$$

where the signal is essentially normalized by the difference between its maximum and minimum values in the segment of interest. The value of the total variation thus lies between 1/(N − 1), for slowly varying signals, and 1, for signals with very large and frequent changes.
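Equation (1) and the two limiting values above can be checked directly; a minimal NumPy sketch on synthetic signals (a monotonic ramp and an alternating signal, the two extreme cases):

```python
import numpy as np

def total_variation(x):
    """Normalized total variation of Equation (1)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # sum of absolute sample-to-sample differences, normalized by
    # (N - 1) times the signal's amplitude range
    return np.abs(np.diff(x)).sum() / ((N - 1) * (x.max() - x.min()))

t = np.arange(64)
slow = t / 63.0        # monotonic ramp: attains the minimum value 1/(N-1)
fast = (-1.0) ** t     # alternating signal: attains the maximum value 1
```

For the ramp, `total_variation(slow)` equals 1/63 = 1/(N − 1); for the alternating signal it equals 1, matching the stated bounds.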

EEG signals, as the outcome of events with different repetition periods, contain components at different frequencies that cannot be identified in the time domain, where they appear superimposed. Thus, a transformation from the time domain to the frequency domain is necessary, which in the case of a signal segment x[n] of N samples is achieved using the fast Fourier transform (FFT) defined by:

$$\text{fft}\left[\omega\right] = \sum\_{n=1}^{N} x[n]\, e^{-i\omega n}, \quad \omega = \frac{2\pi m}{N}, \ 0 \le m \le N-1 \tag{2}$$

where ω = 2πf/f<sub>s</sub> represents the angular frequency discretized in N samples (Proakis and Manolakis, 1996). In order to avoid discontinuities between the end and beginning of the segments, and thus spurious spectral components, the beginning of each segment was chosen in such a way that the amplitude difference of the last and first data points was within the range of amplitude differences of consecutive data points, and the slopes at the end and beginning of each segment had the same sign. This procedure reduces edge effects that result in spectral leakage in the FFT spectrum. In order to further minimize spectral leakage, the signal segments are windowed with a Hamming window (the sum of a rectangle and a Hann window) before application of the FFT. Considering that transforming the signal into the frequency domain loses no original information from the time domain, the signal can be completely reconstructed using the inverse Fourier transform:

$$x[n] = \frac{1}{N} \sum\_{\omega=0}^{2\pi(N-1)/N} \text{fft}[\omega]\, e^{i\omega n}, \quad 1 \le n \le N \tag{3}$$

Clearly, the longer the segment x[n], i.e., the larger N, the greater the frequency resolution.
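Equations (2) and (3) correspond to the standard forward and inverse DFT; in practice they are computed with an FFT library (here NumPy, which indexes frequencies from 0), and the segment here is synthetic noise rather than EEG:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4096)        # one 23.6 s segment at 173.61 Hz

X = np.fft.fft(x)                    # Equation (2): x[n] -> fft[w]
x_rec = np.fft.ifft(X).real          # Equation (3): lossless reconstruction

fs = 173.61
freqs = np.fft.fftfreq(len(x), d=1 / fs)   # frequency (Hz) of each FFT bin
df = fs / len(x)                           # resolution: larger N, finer df
```

The round trip `ifft(fft(x))` recovers the segment to machine precision, illustrating that no information is lost, and the bin spacing f<sub>s</sub>/N shows directly why longer segments give greater frequency resolution.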

Power spectral density is also one of the most important features of the signal in the frequency domain and represents the contribution of each individual frequency component to the power of the whole signal segment x[n]. In practice, power spectral density is usually estimated using the coefficients of the fast Fourier transform, i.e., the periodogram (Welch, 1967) given by:

$$\text{per}\left[\omega\right] = \frac{1}{N} \left|\text{fft}\left[\omega\right]\right|^{2} \tag{4}$$

which is an asymptotically unbiased but inconsistent estimator: as the length of the signal segment increases, the mean of the estimate tends toward the actual value of the power spectral density, which is an advantage, but its variance is not reduced, i.e., it does not tend toward zero with increasing segment length. A periodogram can be further normalized by the total signal power, i.e.,:

$$\text{per}\_{norm}\left[\omega\right] = \frac{1}{N} \left|\text{fft}\left[\omega\right]\right|^{2} \Big/ \sum\_{\omega=0}^{2\pi(N-1)/N} \text{per}\left[\omega\right] \tag{5}$$

where we obtain the relative contribution of each frequency component to the total power of the signal. If the original signal segment x[n] is further divided into P sub-segments of N/P samples each, the periodogram can be calculated as follows:

$$\text{per}\left[\omega\right] = \frac{1}{P} \sum\_{p=0}^{P-1} \frac{P}{N} \left|\text{fft}\_{p}[\omega]\right|^{2} \tag{6}$$

where fft<sub>p</sub>[ω] is the fast Fourier transform of each of the sub-segments of N/P samples. In this way, the periodogram is actually an averaged one with a smaller variance, but clearly with a lower resolution in the frequency domain. Based on the periodogram we extracted the relative power of all five previously mentioned sub-bands, i.e., delta (0–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz), and gamma (30–60 Hz), as features of interest in the frequency domain.
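The averaged periodogram of Equation (6) and the relative sub-band powers can be sketched as follows (NumPy); the Hamming windowing and segment-alignment procedure described above are omitted for brevity, the choice P = 8 is illustrative, and the segment is synthetic noise rather than EEG:

```python
import numpy as np

fs = 173.61
rng = np.random.default_rng(2)
x = rng.standard_normal(4096)        # one normalized segment

def averaged_periodogram(x, P):
    """Equation (6): average of P periodograms over N/P-sample sub-segments."""
    subs = x.reshape(P, -1)                     # P sub-segments of N/P samples
    L = subs.shape[1]
    return np.mean(np.abs(np.fft.rfft(subs, axis=1)) ** 2 / L, axis=0)

per = averaged_periodogram(x, P=8)
freqs = np.fft.rfftfreq(4096 // 8, d=1 / fs)    # bin frequencies in Hz

bands = {'delta': (0, 4), 'theta': (4, 8), 'alpha': (8, 12),
         'beta': (12, 30), 'gamma': (30, 60)}
rel_power = {name: per[(freqs >= lo) & (freqs < hi)].sum() / per.sum()
             for name, (lo, hi) in bands.items()}
```

Each value in `rel_power` is the relative contribution of one clinical sub-band to the total power; the five values sum to less than 1 because the recorded band extends to 86.8 Hz.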

By analyzing the EEG signals solely in the time domain, the extracted features do not contain any information on frequencies, which are, as we will show later, also very important for the proper detection of epileptic EEG signals. On the other hand, by transforming the signals from the time into the frequency domain, any information on time is completely lost, except of course when the transform is applied sequentially to sufficiently short and stationary sub-segments; this in turn has the disadvantage of requiring a careful choice of sub-segment length so as to simultaneously achieve the desired resolution in both domains. In addition, once selected, the sub-segment length, i.e., the resolution in the time domain, remains fixed throughout the entire frequency band and cannot be adjusted to the dominant signal frequencies at a specific time. Signal processing using wavelets very accurately resolves this deficiency and yields sufficient information on non-stationary signals in both the time and frequency domains.

A signal can be represented as a linear combination of basis functions. The unit impulse function, whose power is limited and whose mean differs from zero, is the basis function of the signal in the time domain, whereas in the frequency domain this role is played by the sinusoidal function, which has infinite power and a zero mean. In the time-frequency domain, the basis function is the wavelet, which is a function of limited power, i.e., duration, and a zero mean (Rao and Bopardikar, 1998), and for which the following holds:

$$\sum\_{n=-\infty}^{\infty} \left| \psi[n] \right|^2 < \infty, \sum\_{n=-\infty}^{\infty} \psi\left[n\right] = 0. \tag{7}$$

The wavelet translated in time by b samples and scaled by the so-called dilation parameter a is given by:

$$\psi\_{ab}[n] = \frac{1}{\sqrt{a}}\, \psi\left[\frac{n-b}{a}\right] \tag{8}$$

By changing the dilation parameter, the basic wavelet (a = 1) changes its width, that is, it spreads (a > 1) or contracts (0 < a < 1) in the time domain. In the analysis of non-stationary signals, the possibility of changing the width of the wavelet represents a significant advantage of this technique, since wider wavelets can be used to extract slower changes, i.e., lower signal frequencies, and narrower wavelets to extract faster changes, i.e., higher frequencies. Following the selection of the values of parameters a and b, it is possible to transform a signal segment x[k] of N samples, that is, to calculate the wavelet transform coefficients in the following way:

$$W\_{ab}[n] = \sum\_{k=1}^{N} x[k]\, \psi\_{ab}\left[n-k\right], \ 1 \le n \le N \tag{9}$$

Thus, what is actually extracted from the signal are only those frequencies that lie within the frequency band of the wavelet ψab[n], i.e., the signal is filtered by the wavelet ψab[n]. As previously indicated, based on the coefficients obtained in this way, the original signal can be reconstructed using an inverse wavelet transform. If necessary, it is also possible to independently reconstruct the part of the signal that is retained as well as the part that is rejected by the wavelet ψab[n], on the basis of the so-called detail coefficients and approximation coefficients respectively, which are in turn a function of the transform coefficients Wab[n].

Parameters a and b can change continuously, which is not very practical, especially bearing in mind that the signal can be completely and accurately transformed and reconstructed using a smaller, finite number of wavelets, that is, a limited number of discrete values of parameters a and b; this is known as the discrete wavelet transform (DWT). In this case, parameters a and b are powers of 2, which gives a dyadic orthogonal wavelet network with frequency bands that do not overlap. The dilation parameter a, as a power of 2, doubles at each subsequent level of the transformation, which means that the wavelet becomes twice as wide in the time domain while its frequency band becomes half as wide and twice as low. This halves the resolution of the transformed signal in the time domain while doubling it in the frequency domain. Thus, at every level the signal frequency band from the previous level is split into two halves: a higher band, which contains higher frequencies and describes the finer changes, or details, and a lower band, which contains lower frequencies and actually represents an approximation of the signal from the previous level. This technique is also known as wavelet decomposition of the signal.

Before applying the DWT, it is necessary to choose the type of the basic wavelet as well as the number of levels into which the signal will be decomposed. After analysis of several types of basic wavelets, the fourth-order Daubechies wavelet (Rao and Bopardikar, 1998) was selected for further analysis within this work, since it has good localizing properties in both the time and frequency domains (Kalayci and Özdamar, 1995; Petrosian et al., 2000). Due to its shape and smoothing feature, this type of basic wavelet has already shown good capabilities in the field of EEG signal processing. The discrete wavelet decomposition was performed at four levels, which resulted in five sub-bands of clinical interest. The standard deviation and the average relative power of the DWT coefficients in each of the sub-bands were extracted as representative features in the time-frequency domain.
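As a self-contained illustration of the four-level dyadic decomposition and the two features extracted per sub-band, the sketch below uses the Haar wavelet, whose filters are trivial to write inline, instead of the fourth-order Daubechies wavelet used in the paper; in practice a wavelet library such as PyWavelets would supply the db4 filters. The halving of the sampled band at each level and the five resulting sub-bands are the same in either case.

```python
import numpy as np

def haar_dwt_multilevel(x, levels=4):
    """Dyadic wavelet decomposition (Haar wavelet for brevity; the paper uses db4).
    Returns [cA4, cD4, cD3, cD2, cD1]: one approximation + four detail sub-bands."""
    coeffs = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        d = (a[0::2] - a[1::2]) / np.sqrt(2)   # detail: upper half-band
        a = (a[0::2] + a[1::2]) / np.sqrt(2)   # approximation: lower half-band
        coeffs.append(d)
    return [a] + coeffs[::-1]

rng = np.random.default_rng(3)
x = rng.standard_normal(4096)                  # one normalized 23.6 s segment
bands = haar_dwt_multilevel(x)

# per sub-band features: standard deviation and relative power of coefficients
total = sum((c ** 2).sum() for c in bands)
features = [(c.std(), (c ** 2).sum() / total) for c in bands]
```

Because the transform is orthogonal, the total energy of the five coefficient sets equals that of the original segment, so the relative powers sum to 1.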

#### Non-linear Analysis

EEG signals, as the result of the activities of an extremely complex and non-linear system, can also be analyzed using non-linear techniques in addition to the fairly well-known linear techniques described above. Linear techniques only approximate any non-linearity present in the signal, which can result in the loss of potentially relevant information. In such cases the use of non-linear techniques is preferred, since they are more reliable for non-linear analyses, although they assume weak signal stationarity (Varsavsky et al., 2011) and need somewhat longer segments, which makes them computationally more demanding than linear techniques.

Let x[n] again represent the signal segment to be analyzed, where n = 1, · · · , N. Also, let m denote the lag, for which we can define two new sub-segments of x[n]: the first, x<sub>k</sub>, containing samples from k up to N − m, and the second, x<sub>k+m</sub>, with samples from k + m to N. Both of these sub-segments contain N − k − m + 1 samples and can be represented against one another in the phase space with lag m and the so-called embedding dimension 2. In the case of three sub-segments, x<sub>k+2m</sub>, x<sub>k+m</sub> and x<sub>k</sub>, the embedding dimension of the phase space would be 3. The lagged phase space provides a completely different view of signal evolution in time, where we can note that the signal gravitates to a certain part of the phase space, known as the attractor. With the aim of constructing the lagged phase space, i.e., the signal attractor, it is necessary to first define the values of the lag and the embedding dimension, which, although significantly smaller than the real dimension of the non-linear system space, provides an approximation of the signal complexity and non-linearity (Andrzejak et al., 2001). The lag m should be large enough that the sub-segments overlap as little as possible, that is, share as little mutual information as possible, but at the same time small enough that the sub-segments remain long enough for further useful analysis. An optimal lag is obtained by determining the mutual information coefficient between the sub-segments for different values of the lag m. The mutual information coefficient is defined by Williams (1997):

$$\text{Info}\_{m} = \sum\_{i=1}^{N\_s} \sum\_{j=1}^{N\_s} p\left(x\_k[i], x\_{k+m}[j]\right) \log\_2 \frac{p\left(x\_k[i], x\_{k+m}[j]\right)}{p\left(x\_k[i]\right) p\left(x\_{k+m}[j]\right)} \tag{10}$$

where N_s represents the number of areas into which the signal is discretized based on amplitude and p is the corresponding probability that the sub-segment belongs to a certain area. The first local minimum in the graph of the mutual information coefficient as a function of the lag determines the optimal lag m_o.
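The lag-selection procedure above can be sketched in code. The following is a minimal illustration (not the authors' implementation), assuming NumPy is available; Equation (10) is estimated with a joint amplitude histogram, and the function names and the bin count `n_bins` are arbitrary choices:

```python
import numpy as np

def mutual_information(x, m, n_bins=16):
    """Mutual information (bits) between x[n] and x[n+m],
    estimated from a joint amplitude histogram with n_bins areas."""
    a, b = x[:-m], x[m:]
    joint, _, _ = np.histogram2d(a, b, bins=n_bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0                      # only occupied cells contribute
    return np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz]))

def optimal_lag(x, max_lag=30, n_bins=16):
    """First local minimum of the mutual information vs. lag curve."""
    info = [mutual_information(x, m, n_bins) for m in range(1, max_lag + 1)]
    for m in range(1, len(info) - 1):
        if info[m] < info[m - 1] and info[m] < info[m + 1]:
            return m + 1               # lags are 1-based
    return int(np.argmin(info)) + 1    # fall back to the global minimum
```

For a sinusoid-like signal the curve typically dips near a quarter of the dominant period, which is where the sub-segments share the least information.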

After determining the optimal lag, the minimum embedding dimension of the lagged phase space is estimated using Cao's technique (Cao, 1997). In the phase space with lag m_o and embedding dimension d, the original segment is represented by its phase portraits, which together make up the attractor defined by the following points in the lagged phase space:

$$y\_d[i] = \left[ x[i] \;\; x[i + m\_o] \;\; \cdots \;\; x[i + m\_o(d - 1)] \right] \tag{11}$$

where i = 1, 2, ··· , N − m_o(d − 1). According to the technique developed by Cao, if d is the right dimension, then two points that are close to each other in the phase space of dimension d remain close in the phase space of dimension d + 1 and are referred to as real neighbors (Cao, 1997). The dimension is increased gradually until the number of false neighbors reaches zero, that is, until Cao's embedding function, defined by:

$$e\_{d} = \frac{1}{N - m\_{o}d} \sum\_{i=1}^{N - m\_{o}d} \frac{\left\| y\_{d+1}[i] - y\_{d+1}[n\_{i,d}] \right\|}{\left\| y\_{d}[i] - y\_{d}[n\_{i,d}] \right\|} \tag{12}$$

becomes constant, where i = 1, 2, ··· , N − m_o d and y_d[n_{i,d}] represents the nearest neighbor of y_d[i] in the d-dimensional phase space with lag m_o. In fact, the minimum embedding dimension d_min is determined when the ratio e_{d+1}/e_d approaches the value of 1. Since this ratio may also approach 1 in some other cases, e.g., for completely random signals, an additional check is carried out in which Cao's embedding function is redefined as:

$$e\_d^\* = \frac{1}{N - m\_o d} \sum\_{i=1}^{N - m\_o d} \left| x\left[ i + m\_o d \right] - x\left[ n\_{i,d} + m\_o d \right] \right| \tag{13}$$

where x[n_{i,d} + m_o d] is the nearest neighbor of x[i + m_o d]. A constant value of the ratio e∗_{d+1}/e∗_d for different values of the embedding dimension indicates that we are dealing with a random signal. The signal is not random, i.e., it is deterministic, if this ratio differs from 1 for at least one value of the embedding dimension, which in that case is also the minimum value.
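Cao's criterion (Equations 12–13) can be sketched as follows. This is an illustrative brute-force version with hypothetical function names, not the authors' code; it assumes NumPy, uses the maximum norm for neighbour distances, and takes saturation of E1(d) = E(d+1)/E(d) as the stopping rule:

```python
import numpy as np

def delay_embed(x, d, m):
    """Points y_d[i] = (x[i], x[i+m], ..., x[i+m(d-1)]) of Eq. (11)."""
    n = len(x) - m * (d - 1)
    return np.column_stack([x[i * m : i * m + n] for i in range(d)])

def cao_E(x, d, m):
    """Mean growth of nearest-neighbour distances when the embedding
    dimension is increased from d to d+1 (maximum norm)."""
    yd, yd1 = delay_embed(x, d, m), delay_embed(x, d + 1, m)
    n = len(yd1)                       # points present in both embeddings
    yd = yd[:n]
    ratios = []
    for i in range(n):
        dist = np.max(np.abs(yd - yd[i]), axis=1)
        dist[i] = np.inf               # exclude the point itself
        j = np.argmin(dist)
        if dist[j] > 0:
            ratios.append(np.max(np.abs(yd1[i] - yd1[j])) / dist[j])
    return np.mean(ratios)

def minimum_embedding_dimension(x, m, d_max=10, tol=0.05):
    """Smallest d at which E1(d) = E(d+1)/E(d) saturates (changes < tol)."""
    E = {d: cao_E(x, d, m) for d in range(1, d_max + 2)}
    E1 = {d: E[d + 1] / E[d] for d in range(1, d_max + 1)}
    for d in range(1, d_max):
        if abs(E1[d + 1] - E1[d]) < tol:
            return d + 1
    return d_max
```

The O(N²) neighbour search is kept deliberately simple; a k-d tree would be used for long segments.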

The correlation dimension is a measure of the complexity of the signal attractor in the lagged phase space. Unlike most other better-known dimensions, it may take a fractional value and can thus characterize the dimension, that is, the complexity, of the attractor more precisely than the embedding dimension; however, it is always less than or equal to the embedding dimension.

Let C_ε be the correlation sum of the signal segment with N samples within the radius ε in its phase space with lag m_o and minimum embedding dimension d_min, i.e., over the M = N − m_o d_min points y_{d_min}, given by Williams (1997):

$$C\_{\varepsilon} = \lim\_{M \to \infty} \frac{1}{M^2} \sum\_{i=1}^{M} \sum\_{j=1}^{M} H\left(\varepsilon - \left\| y\_{d\_{\min}}[i] - y\_{d\_{\min}}[j] \right\|\right) \tag{14}$$

where H is the Heaviside step function, which equals 1 if y_{d_min}[j] is within the radius ε of y_{d_min}[i], i.e.,:

$$\varepsilon - \left\| y\_{d\_{\min}}[i] - y\_{d\_{\min}}[j] \right\| > 0 \tag{15}$$

and 0 otherwise. The correlation dimension d_corr is the approximated slope of the natural logarithm of the correlation sum as a function of the natural logarithm of ε. Given that the total number of possible distances between two points in a lagged phase space equals M(M − 1)/2, the correlation dimension can be obtained directly by the Takens estimator (Takens, 1981; Cao, 1997) using:

$$d\_{corr} = -\left[\frac{2}{M\left(M-1\right)}\sum\_{i=1}^{M}\sum\_{j=i+1}^{M}\log\left(\frac{\left\| y\_{d\_{\min}}\left[i\right] - y\_{d\_{\min}}\left[j\right] \right\|}{\varepsilon}\right)\right]^{-1} \tag{16}$$
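The correlation sum and the Takens estimator can be sketched as follows; this assumes NumPy, uses plain Euclidean pair distances, restricts the estimator to the pairs closer than ε (the pairs the estimator is built on), and all function names are illustrative:

```python
import numpy as np

def pair_distances(Y):
    """Euclidean distances between all M(M-1)/2 pairs of attractor points."""
    diff = Y[:, None, :] - Y[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(Y), k=1)
    return D[i, j]

def correlation_sum(Y, eps):
    """C_eps of Eq. (14): fraction of point pairs closer than eps."""
    r = pair_distances(Y)
    return np.count_nonzero(r < eps) / len(r)

def takens_dcorr(Y, eps):
    """Takens estimator (Eq. 16) from the pair distances below eps."""
    r = pair_distances(Y)
    r = r[(r > 0) & (r < eps)]
    return -1.0 / np.mean(np.log(r / eps))
```

Because every retained distance satisfies r < ε, each log term is negative and the estimator comes out positive, as a dimension should.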

The largest Lyapunov exponent λ_max represents a measure of both the chaotic behavior of the attractor and the divergence of the trajectories in phase space, i.e., the predictability of the signal. Attractor divergence is the distance between two closely positioned points in a phase space after a certain period of time of k samples, which is also known as the prediction length. Based on chaos theory, i.e., the so-called butterfly effect, two points close in the phase space of a chaotic system may have completely different trajectories. Thus, divergence of the trajectories implies a chaotic system, and vice versa. The Lyapunov exponent characterizes the exponential growth of that divergence. The number of Lyapunov exponents is equal to the embedding dimension, and each of them represents the rate of a contracting (λ < 0) or expanding (λ > 0) attractor in a certain direction of the phase space. In the case of a chaotic system, the trajectories must diverge in at least one dimension, which means that at least one Lyapunov exponent must be greater than zero; it is then, at the same time, the largest Lyapunov exponent. If several Lyapunov exponents are positive, the largest among them indicates the direction of the maximum expansion of the attractor and its chaotic behavior. The mean trajectory divergence after k samples with a sampling period T_s can be calculated by Wolf's technique (Wolf et al., 1985; Rosenstein et al., 1993) using:

$$d\_{T} = \frac{1}{M - k} \sum\_{i=1}^{M-k} \frac{\left\| y\_{d\_{\min}}[i+k] - y\_{d\_{\min}}[n\_{i}+k] \right\|}{\left\| y\_{d\_{\min}}[i] - y\_{d\_{\min}}[n\_{i}] \right\|} \tag{17}$$

where y_{d_min}[i] and y_{d_min}[n_i] represent two close points on different trajectories in the phase space. The largest Lyapunov exponent λ_max is in this case an approximation of the slope of the natural logarithm of the trajectory divergence as a function of the number of samples k, i.e., d_T = d_0 e^{kT_s λ_max}, where d_0 stands for the initial divergence. In addition, there is another, very similar but more practical technique for the evaluation of the largest Lyapunov exponent, proposed by Sato et al., in which we first calculate the prediction error for several different values of the number of samples k using:

$$p\_k = \frac{1}{M-k} \sum\_{i=1}^{M-k} \log\_2 \frac{\left\| y\_{d\_{\min}}\left[i + k\right] - y\_{d\_{\min}}[n\_i + k]\right\|}{\left\| y\_{d\_{\min}}[i] - y\_{d\_{\min}}[n\_i]\right\|} \tag{18}$$

after which λ_max is determined as the slope of the middle, approximately linear part of the prediction error p_k as a function of kT_s.
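Sato's procedure can be sketched as below, assuming NumPy; the Theiler window `min_sep` that excludes temporally adjacent neighbours is an added practical detail, not part of the text, and the names are illustrative. Since p_k is in bits, the slope is multiplied by ln 2 to express λ_max in nats:

```python
import numpy as np

def sato_prediction_error(Y, k_max, min_sep=10):
    """Sato's prediction error p_k (Eq. 18): mean log2 divergence, after k
    samples, of each point and its nearest (temporally separated) neighbour."""
    M = len(Y)
    idx = np.arange(M - k_max)
    nn = np.empty(len(idx), dtype=int)
    for i in idx:
        dist = np.linalg.norm(Y - Y[i], axis=1)
        dist[max(0, i - min_sep): i + min_sep + 1] = np.inf  # Theiler window
        dist[M - k_max:] = np.inf    # neighbour must survive k_max more steps
        nn[i] = np.argmin(dist)
    d0 = np.linalg.norm(Y[idx] - Y[nn], axis=1)
    good = d0 > 0
    p = np.empty(k_max)
    for k in range(1, k_max + 1):
        dk = np.linalg.norm(Y[idx + k] - Y[nn + k], axis=1)
        ok = good & (dk > 0)
        p[k - 1] = np.mean(np.log2(dk[ok] / d0[ok]))
    return p

def largest_lyapunov(p, Ts, k_lo, k_hi):
    """Slope of the approximately linear middle part of p_k vs. k*Ts,
    converted from bits to nats."""
    k = np.arange(k_lo, k_hi + 1)
    return np.polyfit(k * Ts, p[k_lo - 1: k_hi], 1)[0] * np.log(2)
```

On a known chaotic system such as the logistic map, the fitted slope recovers a value close to the true exponent before the divergence saturates at the attractor size.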

We extract both the correlation dimension and the largest Lyapunov exponent as features that describe the complexity and chaotic behavior of the attractor in the lagged phase space. By choosing the radius ε, the phase space is divided into regions of size ε. While the correlation dimension shows how many points can be found in the surrounding regions of the phase space, the Lyapunov exponent describes the distance between trajectories that start from the same part of the phase space but terminate in different parts. In other words, both of these features give us an idea of how complex and predictable the EEG signal is, each interpreting and quantifying it in its own characteristic way.

#### Dimension Reduction in Feature Space

Let an n-dimensional random vector X be transformed through a linear transformation into an n-dimensional random vector Y = A^T X, where A is the transformation square matrix of dimension n. Then the mean vector and the covariance matrix of the random vector Y are M_Y = A^T M_X and Σ_Y = A^T Σ_X A. Based on that, the distance function is:

$$d\_Y^2(Y) = (Y - M\_Y)^T \Sigma\_Y^{-1} (Y - M\_Y) = (X - M\_X)^T \Sigma\_X^{-1} (X - M\_X) = d\_X^2(X) \tag{19}$$

that is, the distance function does not change under the linear transformation. If we were to translate the coordinate system by the mean vector M_X, we would obtain the random vector Z = X − M_X, whose mean vector is zero and whose covariance matrix is the same as Σ_X. If we wanted to determine the random vector Z which maximizes the distance function d_Z^2(Z) = Z^T Σ^{-1} Z under the condition that Z^T Z = 1, it would be necessary to optimize the following criterion:

$$J = Z^T \Sigma^{-1} Z - \mu \left( Z^T Z - 1 \right) \tag{20}$$

where µ is the Lagrange multiplier. By taking the partial derivative ∂J/∂Z and equating it with zero, we obtain the following:

$$
\partial J/\partial Z = 2\Sigma^{-1} Z - 2\mu Z = 0 \implies \Sigma Z = \lambda Z \tag{21}
$$

where λ = 1/µ. With the aim of obtaining a non-zero solution which satisfies the equation:

$$
\Sigma Z = \lambda Z \iff (\Sigma - \lambda I)Z = 0 \tag{22}
$$

it is further necessary to find a parameter λ which satisfies the following so-called characteristic equation of the matrix Σ:

$$|\Sigma - \lambda I| = 0 \tag{23}$$

Every λ which satisfies this characteristic equation is known as an eigenvalue of the matrix Σ, while the vector Z related to a specific eigenvalue is known as an eigenvector. When Σ is a symmetric n × n matrix, there are n real eigenvalues λ_1, λ_2, ... , λ_n and n real eigenvectors Φ_1, Φ_2, ... , Φ_n which are mutually orthogonal and for which ΣΦ = ΦΛ and Φ^T Φ = I, where Φ = [Φ_1 Φ_2 ··· Φ_n] is the square matrix of the eigenvectors and Λ the diagonal matrix of the eigenvalues:

$$
\Lambda = \begin{bmatrix}
\lambda\_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \lambda\_n
\end{bmatrix} \tag{24}
$$

while I is the identity matrix.

If the matrix Φ is used as the transformation matrix in the linear transformation Y = Φ^T X, then the covariance matrix of the random vector Y will be Σ_Y = Φ^T Σ_X Φ = Λ. This transformation is orthonormal, since the transformation matrix Φ satisfies Φ^T Φ = I. In addition, under all such orthonormal transformations the Euclidean distance does not change, that is, ‖Y‖² = Y^T Y = X^T Φ Φ^T X = X^T X = ‖X‖².

Let X be an n-dimensional random vector of the extracted features which can be represented using n linearly independent vectors in the following way:

$$X = \sum\_{i=1}^{n} y\_i \Phi\_i = \Phi Y \tag{25}$$

where Φ = [Φ_1 Φ_2 ··· Φ_n] and Y = [y_1 y_2 ··· y_n]^T, that is, Φ_i are the basis vectors of the new n-dimensional space, and the new coordinates y_i are the scalar products of the basis vectors Φ_i and the random vector X. Assuming that the columns of the matrix Φ, i.e., the basis vectors Φ_i, are orthonormal, the coordinates of the random vector X in the new space can be obtained in the following way:

$$y\_i = \Phi\_i^T X.\tag{26}$$

Thus, Y represents a mapped random vector and the orthonormal transformation of the original random vector X. The random vector X approximated using only the m (m < n) basis vectors, i.e., the mapped features, could be represented in the following way:

$$\widehat{X}(m) = \sum\_{i=1}^{m} y\_i \Phi\_i + \sum\_{i=m+1}^{n} b\_i \Phi\_i \tag{27}$$

where the approximation error becomes:

$$
\Delta X(m) = X - \hat{X}(m) = \sum\_{i=m+1}^{n} (y\_i - b\_i)\Phi\_i \tag{28}
$$

and the mean squared error:

$$\overline{\varepsilon}^2(m) = E\left\{ \left\| \Delta X(m) \right\|^2 \right\} = \sum\_{i=m+1}^n E\left\{ \left( y\_i - b\_i \right)^2 \right\} \tag{29}$$

has its minimal value for b_i = E{y_i} = Φ_i^T E{X}. The optimal mean squared error can then be presented in the following form:

$$\begin{aligned} \overline{\varepsilon}\_{opt}^2(m) &= \sum\_{i=m+1}^n E\left\{ \left( y\_i - E\left\{ y\_i \right\} \right)^2 \right\} \\ &= \sum\_{i=m+1}^n \Phi\_i^T E\left\{ (X - E\left\{X\right\})(X - E\left\{X\right\})^T \right\} \Phi\_i \\ &= \sum\_{i=m+1}^n \Phi\_i^T \Sigma\_X \Phi\_i = \sum\_{i=m+1}^n \lambda\_i \end{aligned} \tag{30}$$

where Σ_X is the covariance matrix of the random vector X and λ_i are its eigenvalues. Thus, the minimal mean squared error of approximation is equal to the sum of the eigenvalues of the left-out coordinates, which means that we should leave out the coordinates with the smallest eigenvalues. The mapping of the random vector X into the space made up of the eigenvectors of its covariance matrix Σ_X is known as the Karhunen-Loeve (KL) expansion. When reducing the dimension of the feature space using the KL expansion technique, we should bear in mind that the performance of each feature is characterized by its eigenvalue. Thus, when rejecting features we should first reject those with the smallest eigenvalues, i.e., with the smallest variance in the new feature space. For example, in the case of dimension reduction from two to one shown in **Figure 3**, the feature y_2 would be rejected as less informative even though it has better discriminatory potential than y_1. Also, the coordinates y_i are mutually uncorrelated, considering that the covariance matrix of the random vector Y is diagonal, i.e.,:

$$
\Sigma\_Y = \Phi^T \Sigma\_X \Phi = \Lambda = \text{diag}\left\{ \lambda\_1 \lambda\_2 \cdots \lambda\_n \right\}.\tag{31}
$$
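Equation (30) implies that KL reduction amounts to an eigendecomposition of the covariance matrix, keeping the eigenvectors with the largest eigenvalues. A compact sketch, assuming NumPy and sample estimates in place of expectations (the function name is illustrative):

```python
import numpy as np

def kl_expansion(X, m):
    """KL expansion: project the rows of X onto the m eigenvectors of the
    sample covariance matrix with the largest eigenvalues. Returns the
    mapped features and the minimal MSE of Eq. (30)."""
    Xc = X - X.mean(axis=0)
    lam, Phi = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(lam)[::-1]          # eigenvalues, descending
    lam, Phi = lam[order], Phi[:, order]
    Y = Xc @ Phi[:, :m]                    # mutually uncorrelated coordinates
    return Y, lam[m:].sum()                # error = sum of left-out eigenvalues
```

The diagonality of the covariance of Y (Equation 31) falls out automatically, since the projection directions are eigenvectors of Σ_X.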

Unlike the previously outlined method, dimension reduction based on scatter matrices (Fukunaga, 1990; Djurovic, 2006) is of special significance for the new detection technique, since it takes into consideration the very purpose of the reduction, that is, the classification of the random vectors. Let L be the number of classes to be classified and M_i and Σ_i, i = 1 ··· L, the mean vectors and the covariance matrices of these classes, respectively. Then the within-class scatter matrix can be defined by:

$$S\_W = \sum\_{i=1}^{L} P\_i E\left\{ \left( X - M\_i \right) \left( X - M\_i \right)^T / \omega\_i \right\} = \sum\_{i=1}^{L} P\_i \Sigma\_i \tag{32}$$

and the between-class scatter matrix as:

$$S\_B = \sum\_{i=1}^{L} P\_i \left(M\_i - M\_0\right) \left(M\_i - M\_0\right)^T \tag{33}$$

where M_0 is the joint vector of mathematical expectation of all the classes together, that is:

$$M\_0 = E\left\{X\right\} = \sum\_{i=1}^{L} P\_i M\_i. \tag{34}$$

In addition the mixed scatter matrix can be defined by:

$$\mathcal{S}\_M = E\left\{ \left( X - M\_0 \right) \left( X - M\_0 \right)^T \right\} = \mathcal{S}\_W + \mathcal{S}\_B. \tag{35}$$

The problem of dimension reduction then reduces to the identification of the n × m transformation matrix A which maps the random vector X of dimension n onto the random vector Y = A^T X of dimension m and at the same time maximizes the criterion J = tr(S_W^{-1} S_B). This criterion is invariant to non-singular linear transformations and results in a transformation matrix of the following form:

$$A = \begin{bmatrix} \Psi\_1 \ \Psi\_2 \ \dots \ \Psi\_m \end{bmatrix} \tag{36}$$

where Ψ_i, i = 1, ... , m, are the eigenvectors of the matrix S_W^{-1} S_B which correspond to the greatest eigenvalues, i.e., (S_W^{-1} S_B)Ψ_i = λ_i Ψ_i, i = 1, ... , n, λ_1 ≥ λ_2 ≥ ··· ≥ λ_n. Dimension reduction based on scatter matrices applied to the case shown in **Figure 3** would result in the selection of the feature y_2, which is a much better choice than the feature y_1 selected by the KL expansion technique, of course in terms of more accurate classification.
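The scatter-matrix reduction can be sketched as below, under the assumptions that the priors P_i are estimated by class proportions, expectations by sample means, and NumPy is available (the function name is illustrative):

```python
import numpy as np

def scatter_reduction(classes, m):
    """Dimension reduction maximizing J = tr(S_W^-1 S_B): the transformation
    matrix A (Eq. 36) collects the m eigenvectors of S_W^-1 S_B with the
    largest eigenvalues. `classes` is a list of (N_i, n) sample arrays."""
    N = sum(len(X) for X in classes)
    P = [len(X) / N for X in classes]
    M = [X.mean(axis=0) for X in classes]
    M0 = sum(p * mu for p, mu in zip(P, M))               # Eq. (34)
    S_W = sum(p * np.cov(X, rowvar=False) for p, X in zip(P, classes))
    S_B = sum(p * np.outer(mu - M0, mu - M0) for p, mu in zip(P, M))
    lam, Psi = np.linalg.eig(np.linalg.solve(S_W, S_B))   # S_W^-1 S_B
    order = np.argsort(lam.real)[::-1]
    A = np.real(Psi[:, order[:m]])
    return A, [X @ A for X in classes]
```

Unlike the KL expansion, the retained directions here are chosen for class separation rather than for variance.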

#### Design of Quadratic Classifiers

Quadratic classifiers are known to be very good, robust solutions to problems of classification of random vectors whose statistical features are either unknown or change over time. Additionally, quadratic classifiers allow visual insight into the classification results. We design a piecewise quadratic classifier for the detection of epileptiform activity, i.e., two quadratic classifiers able to separate all three classes of the EEG signals of interest, as shown in **Figure 2**. Both quadratic classifiers have the same structure, defined by the following equation:

$$h(Y) = Y^T Q Y + V^T Y + \nu\_0 = \begin{bmatrix} y\_1 & y\_2 \end{bmatrix} \begin{bmatrix} q\_{11} & q\_{12} \\ q\_{21} & q\_{22} \end{bmatrix} \begin{bmatrix} y\_1 \\ y\_2 \end{bmatrix} + \begin{bmatrix} \nu\_1 & \nu\_2 \end{bmatrix} \begin{bmatrix} y\_1 \\ y\_2 \end{bmatrix} + \nu\_0 \tag{37}$$

where y_1 and y_2 are the two features in the reduced feature space. The matrix Q, the vector V, and the scalar ν_0 are the unknowns which need to be determined optimally. The quadratic equation (37) can be represented in a linear form as:

$$h(Y) = \begin{bmatrix} q\_{11} & q\_{12} & q\_{22} & \nu\_1 & \nu\_2 \end{bmatrix} \begin{bmatrix} y\_1^2 \\ 2y\_1 y\_2 \\ y\_2^2 \\ y\_1 \\ y\_2 \end{bmatrix} + \nu\_0 = V\_z^T Z + \nu\_0. \tag{38}$$

In order also to achieve the largest possible between-class and smallest possible within-class scatter during the dimension reduction in the feature space, we selected the following function as the optimization criterion (Fukunaga, 1990):

$$f = \frac{P\_1 \eta\_1^{\,^2} + P\_2 \eta\_2^{\,^2}}{P\_1 \sigma\_1^{\,^2} + P\_2 \sigma\_2^{\,^2}}\tag{39}$$

where P_1 and P_2 are the class probabilities and

$$\eta\_l = E\left\{ h(Z)/\omega\_l \right\} = E\left\{ V\_z^T Z + \nu\_0/\omega\_l \right\} = V\_z^T M\_l + \nu\_0 \tag{40}$$

$$\sigma\_l^2 = \text{var}\left\{ h(Z)/\omega\_l \right\} = \text{var}\left\{ V\_z^T Z + \nu\_0/\omega\_l \right\} = V\_z^T \Sigma\_l V\_z. \tag{41}$$

M_l and Σ_l are the mean vectors and covariance matrices, respectively, of the random vector Z for each of the two classes l that need to be classified. By optimizing the function f, for the optimal vector V_z, i.e., the matrix Q and vector V from Equation (37), we have:

$$V\_z = \begin{bmatrix} q\_{11} \\ q\_{12} \\ q\_{22} \\ v\_1 \\ v\_2 \end{bmatrix} = [P\_1 \Sigma\_1 + P\_2 \Sigma\_2]^{-1} (M\_2 - M\_1) \tag{42}$$

and for the optimal scalar:

$$\nu\_0 = -V\_z^T \left( P\_1 M\_1 + P\_2 M\_2 \right) \tag{43}$$

which finishes the design of the quadratic classifiers as well as the new technique for detection of epileptiform activity.
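Equations (37)–(43) can be condensed into a short design routine. The following is a sketch with illustrative names, assuming NumPy and estimating M_l and Σ_l from training samples of the two classes:

```python
import numpy as np

def z_map(Y):
    """Z = [y1^2, 2*y1*y2, y2^2, y1, y2] of Eq. (38), row-wise."""
    y1, y2 = Y[:, 0], Y[:, 1]
    return np.column_stack([y1**2, 2.0 * y1 * y2, y2**2, y1, y2])

def design_quadratic(Y1, Y2, P1=0.5, P2=0.5):
    """Optimal V_z (Eq. 42) and v_0 (Eq. 43) from the two training classes."""
    Z1, Z2 = z_map(Y1), z_map(Y2)
    M1, M2 = Z1.mean(axis=0), Z2.mean(axis=0)
    S = P1 * np.cov(Z1, rowvar=False) + P2 * np.cov(Z2, rowvar=False)
    Vz = np.linalg.solve(S, M2 - M1)
    v0 = -Vz @ (P1 * M1 + P2 * M2)
    return Vz, v0

def h(Y, Vz, v0):
    """Discriminant of Eq. (38); negative for class 1, positive for class 2."""
    return z_map(Y) @ Vz + v0
```

With this sign convention, h(Y) < 0 assigns a sample to the first class; the piecewise classifier of the paper chains two such discriminants.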

Statistical performances such as the sensitivity, specificity, and accuracy of the designed piecewise quadratic classifier, i.e., the new technique for detection of epileptiform activity, are estimated based on the classification results. The sensitivity is defined as the ratio between the number of correctly classified segments and the total number of segments, for each of the classes separately. The specificity is also calculated for each of the three classes separately and represents the ratio between the number of correctly classified segments of the other two classes and the total number of segments of those two classes. The accuracy is calculated as the ratio between the total number of correctly classified segments and the total number of segments in all three classes together.
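These three statistics follow directly from a confusion matrix. A sketch, assuming NumPy and the (assumed) convention that rows hold the true class and columns the assigned class:

```python
import numpy as np

def sens_spec_acc(C):
    """Per-class sensitivity and specificity plus overall accuracy
    from an L x L confusion matrix C (rows: true, columns: assigned)."""
    C = np.asarray(C, dtype=float)
    total = C.sum()
    sens = np.diag(C) / C.sum(axis=1)      # correct / all of that class
    spec = np.array([                      # correctly kept out of class i
        (total - C[i].sum() - C[:, i].sum() + C[i, i]) / (total - C[i].sum())
        for i in range(len(C))
    ])
    acc = np.trace(C) / total
    return sens, spec, acc
```

For example, a matrix with 40, 50, and 45 correct segments out of 50 per class gives sensitivities 0.8, 1.0, and 0.9 and an overall accuracy of 0.9.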

# Results

# Feature Extraction

In total, 30 features were extracted for each of the 300 analyzed segments of the EEG signals. All the features, together with their mean values and standard deviations for all three classes of EEG signals of interest, are presented in **Table 1**. The extracted features refer to the adequate clinical sub-bands, since these sub-bands had better discrimination characteristics than the whole frequency band between 0 and 60 Hz. The separability index, as a measure of the discriminatory potential, was also calculated for all the extracted features. In this case, the separability index is the criterion J = tr(S_W^{-1} S_B), where S_W and S_B are the previously defined within- and between-class scatter matrices, respectively. Based on these matrices, a higher separability index corresponds to better separability between the different classes of EEG signals. Based on these 30 features, each original segment of the EEG signals in the time domain can now be represented by its feature vector X = [x_1 x_2 ··· x_30]^T, i.e., by a point in the feature space of dimension 30.

The total variation is the only feature that we extracted in the time domain. In **Table 1**, it can be noticed that the total variation has a certain potential for the detection of epileptiform activity in EEG signals. However, the total variation is not particularly reliable, despite being fairly well estimated given the duration of each of the analyzed segments.

The periodogram represents a very important feature of the signal in the frequency domain, given that from it we can obtain the relative contribution of any individual frequency, or of a specific frequency band, to the total power of the analyzed signal. The periodograms of one epileptic and two non-epileptic (from both unhealthy and healthy tissue) segments of the EEG signals are shown in **Figure 4**, where it can be noticed that the EEG signal power shifts from lower to higher frequencies in the presence of epileptiform activity.

Using the discrete wavelet transform (DWT) we can completely and independently extract higher and lower frequencies from the signal. All that can be done with different resolution in the time domain, i.e., higher resolution in the time domain for higher frequencies and lower resolution in the time domain for lower frequencies. The EEG signal segments were analyzed at four levels, i.e., the discrete wavelet decomposition was performed at four levels as presented in **Figure 5**. At the first level of decomposition, the original frequency band of the EEG signals (0–60 Hz) was divided into its higher (30–60 Hz) and lower part (0–30 Hz), i.e., the details and the approximation of the signals at the first decomposition level, respectively. Then at the second decomposition level, the frequency band of the approximation from the first level was additionally divided into its higher (15–30 Hz) and lower (0–15 Hz) part, i.e., the



details and the approximation of the signals at the second decomposition level, respectively. After all four decomposition levels, the original band was divided into five sub-bands, i.e., four sub-bands with the details and one sub-band with the approximation. All five sub-bands approximately correspond to the previously defined clinical sub-bands. The power distribution of the EEG signals in the time-frequency domain is quite well described by the DWT coefficients. However, in order to reduce the dimension of the problem and ease further classification, we calculated certain statistics of these coefficients in each sub-band, such as the standard deviation and the average relative power, i.e., the square of the absolute values of the DWT coefficients.
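The level-by-level halving of the band and the sub-band statistics can be illustrated with the simplest (Haar) wavelet. The wavelet actually used in the paper is not specified in this excerpt, so the following, assuming NumPy, is only a sketch of the decomposition scheme with illustrative names:

```python
import numpy as np

def haar_dwt(x, levels=4):
    """`levels`-level Haar DWT: returns [d1, d2, ..., dL, aL], the details
    of each level plus the final approximation (each level halves the band)."""
    a = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if len(a) % 2:
            a = np.append(a, a[-1])            # pad to even length
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)  # detail (high-pass half)
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)  # approximation (low-pass half)
        details.append(d)
    return details + [a]

def subband_features(x, levels=4):
    """Per sub-band standard deviation and average relative power."""
    bands = haar_dwt(x, levels)
    energy = [(b ** 2).sum() for b in bands]
    total = sum(energy)
    return [(b.std(), e / total) for b, e in zip(bands, energy)]
```

Because the Haar transform is orthonormal, the sub-band energies sum to the signal energy, so the relative powers of the five sub-bands sum to one.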

Given that the EEG signal also roughly represents the dynamics of a very complex non-linear system such as the brain, non-linear analysis based on chaos theory was used in order to extract information that could not be extracted by any of the previously described linear techniques. Interestingly, unlike for the other feature extraction techniques in the field, complete agreement on whether and how to perform a non-linear analysis of the EEG signals has not yet been reached. Thus, it is quite often possible to find contradictory results of such experiments in the literature. For example, the correlation dimension and the largest Lyapunov exponent have completely different values in Hively et al. (1999), Adeli and Ghosh-Dastidar (2010), and Iasemidis and Sackellares (1991). The feature extraction techniques and non-linear analysis implemented and used in this research are based exclusively on the chaos theory described in the Methods section. In addition, no further subjective adjustments were applied to the EEG signals, which provides a high level of reproducibility of the obtained results at any time.

At first, the optimal lag and the embedding dimension were determined in order to reconstruct each segment of the EEG signals in its own lagged phase space. The optimal lag m_o was obtained as the first local minimum of the function of the mutual information coefficients. The value of the optimal lag for most of the analyzed segments varied between 5 and 7. The minimum embedding dimension d_min was determined using Cao's technique, i.e., based on the saturation of the embedding function e_d, as presented in **Figure 6** for one segment; in other words, when a further increase in the embedding dimension does not result in more than a 5% increase in the embedding function. The value of the embedding function approached 1 for all 300 processed segments. In fact, this confirms that there is a certain level of chaos present in the segments of the EEG signals. That chaos is not random but deterministic, given that the value of the redefined embedding function e∗_d is not constant for all values of the embedding dimension, as can be seen in **Figure 6**.

The value of the minimum embedding dimension varied between 4 and 10.

After reconstruction of the EEG signals in the lagged phase space, the correlation dimension of the attractor was estimated using

FIGURE 6 | Embedding function e_d (upper), which approaches 1 and thus confirms the presence of a certain level of chaos in EEG signals, and redefined embedding function e∗_d (lower), which is not constant for all values of the embedding dimension, confirming that the chaos is not random but deterministic.

the Takens estimator. After a few tests, the value of the radius ε in the phase space was set to 5% of the total size of the attractor, since higher values resulted in too many points, and smaller ones in an insufficient number of points, for a good estimation of the correlation dimension. From **Table 1**, it can be concluded that the correlation dimension as a non-linear feature has a potential for detection of epileptiform activity in EEG signals. It is also obvious that the attractor complexity, i.e., the chaotic behavior of the EEG signals, is lower in the presence of epileptiform activity. The values of the correlation dimension were in all cases lower than the embedding dimension of the lagged phase space, which is in accordance with chaos theory.

The largest Lyapunov exponent, as a measure of signal predictability, was estimated using Sato's technique. At first, the prediction error as a function of the number of samples k was determined, as shown in **Figure 7** for one segment. Then, the largest Lyapunov exponent was estimated from the slope of the middle, approximately linear part of this function. As can be seen in **Table 1**, the largest Lyapunov exponent has a smaller discrimination ability than the correlation dimension. Additionally, it can also be noticed that the presence of epileptiform activity reduces the predictability of the EEG signals, since the largest Lyapunov exponent is slightly higher in that case.

# Dimension Reduction in Feature Space

After the feature extraction from all the segments of the EEG signals, it is obvious that none of the individually extracted features is sufficiently reliable for detection of epileptiform activity in EEG signals. This fact is the main reason to perform the feature extraction in a few different domains of interest, i.e., the time, frequency, and time-frequency domains, and via non-linear analysis. The assumption is that each of them contains some new information about the EEG signal, i.e., information which is not present in any other domain and thus later contributes to more accurate classification and detection. Therefore, better separability between the classes of epileptic and non-epileptic segments is expected after an optimal combination of the features from different domains than when using only features from one domain, as is the case in almost all the literature in the field.

Both the KL expansion technique and the dimension reduction technique based on the scatter matrices were tested on the features from all the domains. The obtained results, i.e., the corresponding separability indexes before and after the dimension reduction in the feature space, are presented in **Table 2**. The reduction technique based on the scatter matrices gives better results in all the domains of interest and also results in a separability index that is, as expected, greater than any individual separability index given in **Table 1**.

In **Table 2**, one can see that, out of all the analyzed features, the features obtained in the time-frequency domain after the DWT have the highest separability index and the best discrimination characteristics between epileptic and non-epileptic segments. However, the other features, despite their lower separability indexes, are also useful for later classification, as concluded from an additional analysis whose results are presented in **Table 3**. It can be noticed that, starting from the features in the time domain, the separability index increases with the gradual inclusion of features from other domains.

Unlike the previous figures, **Figure 8** shows 50 original nineteen-dimensional feature vectors X, corresponding to 50 segments from each of the three classes of EEG signals, mapped into their new reduced two-dimensional feature space. All these 150 two-dimensional vectors Y will be used in the next section for the design of appropriate classifiers, while the remaining 150 segments and their corresponding feature vectors will be used to test the performance of the designed classifiers as well as the total accuracy of the new technique for detection of epileptiform activity in EEG signals.

TABLE 2 | Separability indexes after application of two different techniques for dimension reduction in feature space.


TABLE 3 | Separability indexes after the reduction based on the scatter matrices and gradual involvement of features from different domains.



# Classification

After the reduction of the feature space dimension to two, the next step is the design of appropriate classifiers that can separate epileptic from non-epileptic segments of the EEG signals in the reduced feature space shown in **Figure 8**. This represents the last step in the design of the new technique for detection of epileptiform activity in EEG signals. Having in mind the nature of the EEG signals and possible changes in their statistical properties, it is highly desirable to use robust classifiers. Based on **Figure 8**, it can be concluded that quadratic classifiers represent quite a logical choice for classification, even though the three classes of EEG signals are also piecewise linearly separable, but with a much higher classification error. In total, two quadratic classifiers were designed following the procedure described in Section Design of Quadratic Classifiers.

As can be seen in **Figure 9**, the first classifier separates the non-epileptic segments of the EEG signals of healthy brain tissue (in green) from the non-epileptic segments of unhealthy tissue (in blue) as well as from the epileptic segments (in red). This classifier is defined by the following equation:

$$h(Y) = \sum\_{i=1}^{2} \sum\_{j=1}^{2} q\_{ij} y\_i y\_j + \sum\_{i=1}^{2} \nu\_i y\_i + \nu\_0 \tag{44}$$

where the parameters are q11 = −4870.8, q12 = q21 = −239.9, q22 = −174.9, ν1 = −29.2, ν2 = −174.9 and ν0 = −2.3. After that, the second classifier, which separates the two remaining classes, i.e., the epileptic and the non-epileptic segments of unhealthy brain tissue, was designed. The parameters of Equation (44) for this classifier are q11 = −436.7, q12 = q21 = −128.2, q22 = 444.6, ν1 = −237.9, ν2 = −57.2 and ν0 = 0.5, while the classifier itself is shown in **Figure 10**.

The performance of the designed classifiers and thus the new technique for detection of epileptiform activity in EEG signals was tested by classification of the remaining 150 segments which were not previously used during the design procedure. The obtained results are presented in **Figure 11**, where the piecewise quadratic classifier is just a combination of two quadratic classifiers.
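The two-stage decision rule described above can be sketched as follows. The mapping between the sign of h(Y) and the class labels is an assumption here, since the text reports only the parameter values:

```python
import numpy as np

def quad_disc(y, Q, v, v0):
    """Quadratic discriminant of Equation (44): y^T Q y + v^T y + v0."""
    y = np.asarray(y, dtype=float)
    return float(y @ Q @ y + v @ y + v0)

# Parameters reported in the text for the two classifiers.
Q1 = np.array([[-4870.8, -239.9], [-239.9, -174.9]])
v1 = np.array([-29.2, -174.9]); v01 = -2.3

Q2 = np.array([[-436.7, -128.2], [-128.2, 444.6]])
v2 = np.array([-237.9, -57.2]); v02 = 0.5

def classify(y):
    # Piecewise rule: classifier 1 splits off healthy non-epileptic
    # segments; classifier 2 then separates epileptic from unhealthy
    # non-epileptic segments. Which sign corresponds to which class
    # is an assumption.
    if quad_disc(y, Q1, v1, v01) > 0:
        return "non-epileptic (healthy)"
    if quad_disc(y, Q2, v2, v02) > 0:
        return "epileptic"
    return "non-epileptic (unhealthy)"
```

At the origin of the reduced feature space the discriminants reduce to their constant terms ν0, which makes the piecewise logic easy to check.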

The classification results can also be represented by the confusion matrix given in **Table 4**, where each cell contains the number of classified segments for each combination of the three classes of EEG signal segments. Based on the confusion matrix and **Figure 11**, it can be concluded that all the non-epileptic segments of healthy tissue were correctly classified.

FIGURE 9 | The first quadratic classifier which separates non-epileptic EEG signals of healthy tissue (in green) from non-epileptic (in blue) and epileptic EEG signals of unhealthy tissue (in red) during the design and training phase.

FIGURE 10 | The second quadratic classifier which separates epileptic from non-epileptic EEG signals of unhealthy tissue during the design and training phase.

FIGURE 11 | The piecewise quadratic classifier which separates epileptic (in red) from non-epileptic (unhealthy in blue and healthy in green) EEG signals of the test set.

#### TABLE 4 | Confusion matrix.


#### TABLE 5 | Statistical performances.


However, the remaining two classes each contained one segment that was incorrectly classified, i.e., assigned to the other class. The statistical performance measures, such as sensitivity, specificity and accuracy, of the designed piecewise quadratic classifiers are presented in **Table 5**. As can be seen, the total accuracy of the new technique for the detection of epileptiform activity in EEG signals is 98.7%. Typically, quadratic classifiers are robust and do not exhibit overtraining when the number of parameters to be estimated is much smaller than the number of samples, as in this case. Nevertheless, it is good practice to cross-validate this piecewise classifier in order to ensure its stability. A fivefold cross-validation was performed, resulting in a cross-validation loss, i.e., the error on the out-of-fold samples, of 1.7%. Even though this is slightly higher than the classification error of 1.3%, it gives confidence that the classifier is reasonably accurate.
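The k-fold cross-validation loss used above can be estimated with a generic routine like the following sketch; the nearest-centroid classifier is a purely illustrative stand-in for the paper's quadratic classifiers.

```python
import numpy as np

def centroid_fit(X, y):
    # Toy stand-in classifier: one centroid per class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def centroid_predict(model, X):
    classes = sorted(model)
    d = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[np.argmin(d, axis=0)]

def kfold_loss(X, y, fit, predict, k=5, seed=0):
    # k-fold cross-validation loss: fraction of misclassified
    # out-of-fold samples, as used to check classifier stability.
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    errors = 0
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        errors += int(np.sum(predict(model, X[test]) != y[test]))
    return errors / len(X)
```

For two well-separated synthetic clusters the out-of-fold loss is zero, mirroring how a small cross-validation loss indicates a stable classifier.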

# Discussion

Considering the results of the other techniques available in the literature, presented in **Table 6** and tested on the identical segments of the EEG signals, the new technique demonstrated very good performance. The accuracy of the other techniques varied between 85 and 99%. In addition to the high accuracy achieved, it should also be emphasized that all the segments of the analyzed EEG signals were normalized before the feature extraction. In that way, we managed to overcome one of the main disadvantages of the techniques from **Table 6** in terms of real clinical application, namely their reliance on the amplitude of the EEG signals as one of the key discriminatory features. The EEG signal amplitude has been found to be unreliable in real clinical applications, since it varies significantly even among healthy individuals, depending on other brain activities as well as other activities of the human body. Other undesired effects on the detection technique, e.g., those due to different recording electrodes or different patients and their brain tissues, have also been removed by normalization. Unlike the techniques from **Table 6**, which are mainly based on features from only one of the domains, the new technique relies on carefully extracted features from all the domains of interest, including non-linear analysis. Because of that, this technique is more robust and less sensitive to changes in the EEG signals that dominantly impact the features from one or two domains while remaining invisible in the other domains and having no relation to the presence of the epileptiform activity to be detected.

In order to further increase the detection accuracy of the new technique in real clinical application, prior elimination of artifacts is highly desirable immediately after acquisition of the EEG signals, i.e., before any further processing and feature extraction. Artifact removal can be performed very reliably using some of the already developed and available techniques (Hyvarinen et al., 2001; Rosso et al., 2002). In addition, a certain compromise is necessary in terms of the duration of the segments to be sequentially analyzed in real time. The segment duration should be subsequently adjusted depending on both the application and the patient. Special attention has been paid to the robustness of the detection technique, not only during the feature extraction and the dimension reduction in the feature space, but also during the design of the classifiers. This resulted in the choice of quadratic classifiers, which, in addition to their simplicity, are known for a high level of robustness in applications of this type. Quadratic classifiers have one more important feature, namely the possibility of visualizing the classification results in two-dimensional space. Although the mapped features y1 and y2, as linear combinations of the original features xi extracted from the different domains, can no longer be associated with particular properties of the EEG signals, they can still provide some useful insights. For example, in **Figure 11** it can be noticed that the feature y1 can help in determining the damage level of the brain tissue, while the feature y2 indicates the presence or absence of an epileptic EEG signal.

As part of our future work, we plan additional testing on other, bigger, and mainly commercially available databases of EEG signals (e.g., http://epilepsy-database.eu) containing much

#### TABLE 6 | Other techniques for detection of epileptic EEG signals.


more interictal, preictal and ictal EEG data, with the aim of further developing and adapting the new technique for use in a real clinical environment. We will also try to assess its potential in the field of emotion detection (e.g., happiness, sadness, depression, alertness, etc.) as well as in the detection of abnormal activities associated with other brain disorders such as Alzheimer's disease and schizophrenia.

# References


# Acknowledgments

The support from the Marie Curie FP7-ITN InnHF, Contract No: PITN-GA-2011- 289837 and the Erasmus Mundus Action II EUROWEB Project, Contract No: 204625-1-2011-1-SE-ERA MUNDUS-EMA21 is gratefully acknowledged.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Gajic, Djurovic, Gligorijevic, Di Gennaro and Savic-Gajic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Input-output relation and energy efficiency in the neuron with different spike threshold dynamics

Guo-Sheng Yi <sup>1</sup> , Jiang Wang<sup>1</sup> \*, Kai-Ming Tsang<sup>2</sup> , Xi-Le Wei <sup>1</sup> and Bin Deng<sup>1</sup>

*<sup>1</sup> School of Electrical Engineering and Automation, Tianjin University, Tianjin, China, <sup>2</sup> Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong, China*

Neurons encode and transmit information by generating sequences of output spikes, which is an energy-consuming process. A spike is initiated when membrane depolarization reaches a threshold voltage. In many neurons, the threshold is dynamic and depends on the rate of membrane depolarization (*dV*/*dt*) preceding a spike. Identifying the metabolic energy involved in neural coding and its relationship to threshold dynamics is critical to understanding neuronal function and evolution. Here, we use a modified Morris-Lecar model to investigate the neuronal input-output property and energy efficiency associated with different spike threshold dynamics. We find that neurons with a dynamic threshold sensitive to *dV*/*dt* generate a discontinuous frequency-current curve and a type II phase response curve (PRC) through a Hopf bifurcation, and that weak noise can prohibit spiking when the bifurcation has just occurred. A threshold insensitive to *dV*/*dt*, instead, results in a continuous frequency-current curve, a type I PRC and a saddle-node on invariant circle bifurcation, and weak noise cannot inhibit spiking. We also show that the bifurcation, frequency-current curve and PRC type associated with different threshold dynamics arise from distinct subthreshold interactions of membrane currents. Further, we observe that the energy consumption of the neuron is related to its firing characteristics. The depolarization of the spike threshold improves neuronal energy efficiency by reducing the overlap of Na<sup>+</sup> and K<sup>+</sup> currents during an action potential. High energy efficiency is achieved at a more depolarized spike threshold and high stimulus current. These results provide a fundamental biophysical connection that links spike threshold dynamics, input-output relation, energetics and spike initiation, which could contribute to uncovering the neural encoding mechanism.

Keywords: spike threshold dynamic, input-output relation, energy efficiency, biophysical connection, spike initiation

# Introduction

Neurons, as the basic information-processing unit of the nervous system, can accurately represent and transmit various spatiotemporal patterns of sensory input in the form of sequences of output spikes (Koch, 1999; Dayan and Abbott, 2005; Klausberger and Somogyi, 2008). The generation and conduction of action potentials need to consume a lot of energy, which would have a great impact on neural codes and circuits (Niven and Laughlin, 2008; Alle et al., 2009; Sengupta et al., 2010, 2013, 2014; Moujahid et al., 2011). Characterizing energy efficiency associated with different

#### Edited by:

*Tobias Alecio Mattei, Kenmore Mercy Hospital, USA*

#### Reviewed by:

*Abdelmalik Moujahid, University of the Basque Country, Spain Ramesh Kandimalla, Texas Tech University, USA*

#### \*Correspondence:

*Jiang Wang, School of Electrical Engineering and Automation, Tianjin University, No. 92 Weijin Road, Nankai District, Tianjin 300072, China jiangwang@tju.edu.cn*

> Received: *14 March 2015* Accepted: *08 May 2015* Published: *27 May 2015*

#### Citation:

*Yi G-S, Wang J, Tsang K-M, Wei X-L and Deng B (2015) Input-output relation and energy efficiency in the neuron with different spike threshold dynamics. Front. Comput. Neurosci. 9:62. doi: 10.3389/fncom.2015.00062*

input-output relations is an essential step toward capturing the full strategies used by the neuron to encode stimulus. Previous experimental and modeling studies (Koch, 1999; Dayan and Abbott, 2005; Klausberger and Somogyi, 2008; Niven and Laughlin, 2008; Prescott et al., 2008a; Alle et al., 2009; Carter and Bean, 2009; Sengupta et al., 2010, 2013, 2014) have reported that both the input-output relation and the energy efficiency of neurons depend not only on input spatiotemporal properties but also on neuronal intrinsic characteristics.

One basic intrinsic property of all spiking neurons is the spike threshold, a special membrane potential that distinguishes subthreshold responses from spikes (Izhikevich, 2005; Goldberg et al., 2008). A small depolarization of the membrane potential below this value is subthreshold and decays to the resting potential, while a large depolarization above it is suprathreshold and results in an action potential (Izhikevich, 2005; Prescott et al., 2008a; Wester and Contreras, 2013). That is, a spike is initiated only when membrane depolarization reaches this threshold potential. In vivo, the spike threshold is dynamic and varies with input properties as well as spiking history. In particular, it is inversely correlated with the preceding rate of membrane depolarization (i.e., dV/dt) prior to spike initiation (Azouz and Gray, 2000, 2003; Henze and Buzsáki, 2001; Ferragamo and Oertel, 2002; Escabí et al., 2005; Wilent and Contreras, 2005; Kuba et al., 2006; Goldberg et al., 2008; Priebe and Ferster, 2008; Cardin et al., 2010; Higgs and Spain, 2011; Platkiewicz and Brette, 2011; Wester and Contreras, 2013; Fontaine et al., 2014). A dynamic threshold plays a critically important role in spike generation and profoundly influences neuronal input-output properties (Azouz and Gray, 2000, 2003; Henze and Buzsáki, 2001; Ferragamo and Oertel, 2002; Escabí et al., 2005; Wilent and Contreras, 2005; Kuba et al., 2006; Priebe and Ferster, 2008; Cardin et al., 2010; Platkiewicz and Brette, 2011). For instance, a neuron with a dynamic threshold is more capable of filtering out synaptic inputs (Higgs and Spain, 2011) and regulating its response sensitivity (Azouz and Gray, 2000, 2003; Ferragamo and Oertel, 2002; Wilent and Contreras, 2005; Cardin et al., 2010).
Further, the dynamic threshold could also effectively enhance feature selectivity (Azouz and Gray, 2003; Escabí et al., 2005; Wilent and Contreras, 2005; Priebe and Ferster, 2008), contribute to coincidence detection and gain modulation (Azouz and Gray, 2000, 2003; Platkiewicz and Brette, 2011), as well as facilitate precise temporal coding (Kuba et al., 2006; Higgs and Spain, 2011).

The spike threshold dynamics can be modulated by the biophysical properties of intrinsic membrane currents (Hodgkin and Huxley, 1952; Azouz and Gray, 2000, 2003; Wilent and Contreras, 2005; Guan et al., 2007; Goldberg et al., 2008; Higgs and Spain, 2011; Platkiewicz and Brette, 2011; Wester and Contreras, 2013; Fontaine et al., 2014). Two especially relevant biophysical mechanisms are Na<sup>+</sup> inactivation and K<sup>+</sup> activation, which were originally identified by Hodgkin and Huxley (1952). Because Na<sup>+</sup> inactivation specifically affects spike initiation (Platkiewicz and Brette, 2011), it is usually regarded as the fundamental mechanism regulating the threshold (Azouz and Gray, 2000, 2003; Henze and Buzsáki, 2001; Wilent and Contreras, 2005; Platkiewicz and Brette, 2011; Wester and Contreras, 2013; Fontaine et al., 2014). Recently, a growing number of studies have found that the outward K<sup>+</sup> channels, especially those activated at subthreshold potentials, can also powerfully regulate the spike threshold (Storm, 1988; Bekkers and Delaney, 2001; Dodson et al., 2002; Guan et al., 2007; Goldberg et al., 2008; Higgs and Spain, 2011; Wester and Contreras, 2013). Blocking them (Storm, 1988; Bekkers and Delaney, 2001; Dodson et al., 2002; Guan et al., 2007; Goldberg et al., 2008) or depolarizing their activation voltage so that they remain unactivated prior to spike initiation (Wester and Contreras, 2013) can both result in a loss of the inverse correlation between the spike threshold and dV/dt.

In addition to modulating the threshold dynamics, the biophysical properties of membrane currents also control neuronal spike initiation (Koch, 1999; Izhikevich, 2005; Prescott and Sejnowski, 2008; Prescott et al., 2008a,b; Yi et al., 2014a,b). It has been shown that if the K<sup>+</sup> current that flows out of the cell is absent or unactivated at the potentials around the spike threshold, i.e., perithreshold potentials, the neuron generates a continuous frequency-current curve through a saddle-node on invariant circle (SNIC) bifurcation, i.e., Hodgkin class 1 excitability (Izhikevich, 2005; Prescott et al., 2008a,b; Yi et al., 2014a). On the contrary, if the outward K<sup>+</sup> current is already activated at perithreshold potentials, the neuron generates a discontinuous frequency-current curve through a Hopf bifurcation, i.e., Hodgkin class 2 excitability (Izhikevich, 2005; Prescott et al., 2008a,b; Yi et al., 2014a). Furthermore, Rothman and Manis (2003a,b,c) found that a high density of low-threshold K<sup>+</sup> current in the ventral cochlear nucleus is responsible for phasic firing of class 2 excitability, while a lower density promotes regular firing of class 1 excitability. These reports suggest that membrane biophysics is able to further determine neuronal input-output relations. The dynamics of the spike threshold should then also be related to the input-output properties. Uncovering the biophysical connection between them is crucial for explaining how biophysical properties contribute to neural coding. Meanwhile, it could also provide a deeper insight into the mechanism of neural coding than a purely phenomenological description of the input-output relation. However, the relevant studies are still lacking.

In fact, the biophysical properties of membrane currents not only affect the spike threshold dynamics and the input-output relation, but also influence neuronal energetics. During the generation of an action potential, different ions flow across their voltage-gated ionic channels, such as the influx of Na<sup>+</sup> and the efflux of K<sup>+</sup>. In this process, significant quantities of energy must be expended for the ions to permeate the cell membrane against their concentration gradients (Attwell and Laughlin, 2001; Niven and Laughlin, 2008; Alle et al., 2009; Carter and Bean, 2009; Sengupta et al., 2010, 2013, 2014; Moujahid et al., 2011, 2014; Moujahid and D'Anjou, 2012). The influx or efflux of ions, i.e., the inward or outward ionic currents, dominates and makes a significant contribution to neuronal energy consumption (Attwell and Laughlin, 2001; Alle et al., 2009; Sengupta et al., 2010, 2013, 2014). Previous studies (Alle et al., 2009; Carter and Bean, 2009; Sengupta et al., 2010, 2013; Moujahid and D'Anjou, 2012; Moujahid et al., 2014) have shown that adjusting the biophysical properties of the voltage-gated Na<sup>+</sup> and K<sup>+</sup> currents, such as the channel conductance or the activation/inactivation time constant, can modulate the energy efficiency of the neuron. A critical question then arises as to how the spike threshold dynamics, a basic property of the neuron, influence its energy consumption. Until now, there has been no relevant research on this issue.

Here, we systematically characterize the input-output property and energy efficiency of the neuron with different spike threshold dynamics. To achieve this goal, we first adopt a two-dimensional biophysical model and vary its parameter that controls the voltage-dependency of K<sup>+</sup> current to produce different relationships between spike threshold and dV/dt. Then, we investigate how the minimal neuron responds to external stimulus as well as its relevant biophysical mechanism in the case of different threshold dynamics. Finally, we deduce the energy functions involved in the dynamics of neuron model, and determine the energy efficiency associated with each threshold dynamic.

# Materials and Methods

## Two-Dimensional Neuron Model

A two-dimensional biophysical model proposed by Prescott et al. (2008a) is adopted in the present study to explore how spike threshold dynamics modulate the neuronal input-output relation and metabolic energy. It is a modified version of the Morris-Lecar model, which incorporates three ionic currents, i.e., a fast Na<sup>+</sup> current INa, a delayed rectifying K<sup>+</sup> current IK, and a leak current IL. The model is given by the following differential equations (Prescott et al., 2008a)

$$C\frac{dV}{dt} = I\_{in} + I\_{noise} - \overline{g}\_K n(V - V\_K) - \overline{g}\_{Na} m\_{\infty}(V)(V - V\_{Na}) - g\_L(V - V\_L) \tag{1}$$

$$\frac{dn}{dt} = \varphi\_n \frac{n\_{\infty}(V) - n}{\tau\_n(V)}\tag{2}$$

where V is the membrane voltage and n is the activation gating variable for IK. The three terms on the right side of Equation (1), i.e., gKn(V − VK), gNam∞(V)(V − VNa) and gL(V − VL), respectively denote the slow outward IK, the fast inward INa and the outward IL. m∞(V) = 0.5{1 + tanh[(V − βm)/γm]} and n∞(V) = 0.5{1 + tanh[(V − βn)/γn]} are the steady-state voltage-dependent activation functions for INa and IK, and τn(V) = 1/cosh[(V − βn)/2γn] is the voltage-dependent time constant of the K<sup>+</sup> channel. The kinetics of the inward INa are controlled by the parameters βm and γm, and the kinetics of the outward IK by βn and γn. In a previous modeling study, Wester and Contreras (2013) showed that hyperpolarizing the K<sup>+</sup> activation voltage, even in the absence of Na<sup>+</sup> inactivation, is sufficient to produce a dynamic spike threshold that varies inversely with the preceding dV/dt. We therefore vary the parameter βn from −5 to −15 mV in steps of −2 mV to produce different sensitivities of the spike threshold to dV/dt in our simulation. These values of βn span the different spike initiation dynamics of the model (Prescott et al., 2008a). **Table 1** gives the numerical values

TABLE 1 | Parameters in two-dimensional model (Prescott et al., 2008a).


and the corresponding neural functions of the parameters in the two-dimensional model, which are the same as those described in Prescott et al. (2008a).
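As an illustration, Equations (1)-(2) can be integrated numerically as below. Since Table 1 is not reproduced in this excerpt, the parameter values are assumptions taken from typical published versions of the Prescott et al. (2008a) model and should be checked against the original table.

```python
import numpy as np

# Assumed parameter values (Table 1 is missing from this excerpt):
C = 2.0                            # membrane capacitance, uF/cm^2
gNa, gK, gL = 20.0, 20.0, 2.0      # maximal conductances, mS/cm^2
VNa, VK, VL = 50.0, -100.0, -70.0  # reversal potentials, mV
beta_m, gamma_m = -1.2, 18.0       # INa activation kinetics
gamma_n, phi_n = 10.0, 0.15        # IK activation kinetics

def m_inf(V):
    return 0.5 * (1.0 + np.tanh((V - beta_m) / gamma_m))

def n_inf(V, beta_n):
    return 0.5 * (1.0 + np.tanh((V - beta_n) / gamma_n))

def tau_n(V, beta_n):
    return 1.0 / np.cosh((V - beta_n) / (2.0 * gamma_n))

def simulate(I_in, beta_n=-13.0, T=500.0, dt=0.05, V0=-70.0):
    """Forward-Euler integration of Equations (1)-(2) without noise."""
    steps = int(T / dt)
    V, n = V0, n_inf(V0, beta_n)
    trace = np.empty(steps)
    for k in range(steps):
        dV = (I_in - gK * n * (V - VK)
              - gNa * m_inf(V) * (V - VNa)
              - gL * (V - VL)) / C
        dn = phi_n * (n_inf(V, beta_n) - n) / tau_n(V, beta_n)
        V += dt * dV
        n += dt * dn
        trace[k] = V
    return trace
```

With no injected current the membrane settles near its resting potential, as expected for a subthreshold regime.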

Iin is the injected current used to stimulate the neuron, which can be either steps or ramps in our study. Inoise replicates synaptic noise and is modeled as an Ornstein-Uhlenbeck process (Uhlenbeck and Ornstein, 1930)

$$\frac{dI\_{noise}}{dt} = -\frac{I\_{noise}}{\tau\_{noise}} + \sigma N(t) \tag{3}$$

where N(t) is a random variable drawn from a Gaussian distribution with zero mean and unit variance. The amplitude of the weak noise Inoise is controlled by the scaling parameter σ (Destexhe et al., 2001; Prescott and Sejnowski, 2008; Prescott et al., 2008a,b), which varies from 0 µA/cm<sup>2</sup> to 3 µA/cm<sup>2</sup> in our study. The time constant is τnoise = 5 ms (Prescott and Sejnowski, 2008; Prescott et al., 2008b). When we determine the spike threshold, the phase response curve (PRC) and the bifurcation patterns, the noisy current is removed from the neuron.
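A minimal sketch of the noise process of Equation (3), assuming the standard Euler-Maruyama discretization in which the Gaussian increment is scaled by sqrt(dt):

```python
import numpy as np

def ou_noise(T, dt, tau_noise=5.0, sigma=0.5, seed=0):
    """Euler-Maruyama integration of Eq. (3): dI/dt = -I/tau + sigma*N(t).

    The sqrt(dt) scaling of the Gaussian increment is the usual white-noise
    convention and is an assumption about how N(t) is discretized.
    """
    rng = np.random.default_rng(seed)
    steps = int(T / dt)
    I = np.zeros(steps)
    for k in range(1, steps):
        I[k] = (I[k - 1] - dt * I[k - 1] / tau_noise
                + sigma * np.sqrt(dt) * rng.standard_normal())
    return I
```

The resulting trace can be added to Iin in the voltage equation; with σ = 0 the process stays identically zero.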

### Method to Calculate Spike Threshold

The spike threshold for different values of dV/dt is determined by a novel approach proposed by Wester and Contreras (2013). According to their description, we use Iin to produce a cluster of ramps to stimulate the neuron, so

$$I\_{in} = \begin{cases} Kt & 0 \le t \le t\_0 \\ 0 & t > t\_0 \end{cases} \tag{4}$$

The ramp slope K controls the value of dV/dt leading to spike initiation. With a larger value of K, the membrane potential V is forced to approach the threshold potential at a faster speed, which corresponds to a larger value of dV/dt. The stimulation duration is controlled by t0. For a given slope K, the membrane potential V gradually approaches the threshold as t0 increases. When the membrane potential V is around the threshold potential, we stepwise extend the ramp duration t0 so that each step results in approximately an additional 0.1 mV of depolarization in V, until an action potential is initiated in the neuron. In this way, if V is driven across the spike threshold by the time of ramp offset, a spike is generated after the ramp is removed (i.e., t > t0). Conversely, the neuron fails to initiate a spike if V does not reach the threshold potential by the time of ramp offset. We thus empirically increase the ramp duration t0 to seek the special membrane potential V*: 0.1 mV hyperpolarized to V* is subthreshold and the neuron fails to initiate a spike at ramp offset, whereas 0.1 mV depolarized to V* is suprathreshold and the neuron initiates a spike at ramp offset. We define this special membrane potential V* as the spike threshold of the neuron. In this manner, the upstroke of the spike is purely due to sufficient activation of the Na<sup>+</sup> current and has nothing to do with the current ramp. This method allows us to measure the spike threshold with a precision better than 0.1 mV.
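The stepwise search over the ramp duration t0 can be expressed generically. Here `fires_after_ramp` and `v_at_offset` are hypothetical callbacks wrapping a neuron simulation, and the linear toy membrane used below is purely illustrative.

```python
def spike_threshold(fires_after_ramp, v_at_offset, t0_start, dt0):
    # Stepwise extension of the ramp duration t0: lengthen the ramp
    # until a spike is initiated after ramp offset; the membrane
    # potential at the offset of the first suprathreshold ramp
    # approximates the spike threshold V*.
    t0 = t0_start
    while not fires_after_ramp(t0):
        t0 += dt0
    return v_at_offset(t0)

# Purely illustrative toy membrane: the offset voltage rises linearly
# with t0 and the cell "fires" once it reaches -55 mV (hypothetical).
v_toy = lambda t0: -70.0 + 0.5 * t0
fires_toy = lambda t0: v_toy(t0) >= -55.0
V_star = spike_threshold(fires_toy, v_toy, 0.0, 0.2)  # 0.2 ms -> 0.1 mV steps
```

Choosing dt0 so that each extension adds about 0.1 mV of depolarization reproduces the sub-0.1 mV precision described above.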

## Phase Response Curve Calculation

The PRC measures the phase shift of a periodically oscillating neuron in response to a brief current pulse delivered at different phases of the oscillation cycle (Ermentrout, 1996; Izhikevich, 2005; Smeal et al., 2010; Fink et al., 2011; Schultheiss et al., 2012). The PRC of the neuron can be defined as (Ermentrout, 1996; Izhikevich, 2005; Smeal et al., 2010; Schultheiss et al., 2012)

$$PRC(\vartheta) = 1 - T'(\vartheta)/T \tag{5}$$

where T is the oscillation period of the neuron without perturbation (i.e., 1/T represents the natural oscillation frequency), and T′(ϑ) is the oscillation period when the neuron is stimulated at phase ϑ. A positive value of the PRC indicates a phase advance, and a negative value indicates a phase delay. If the amplitude of the current pulse is sufficiently small and its duration is sufficiently brief, the PRC becomes the infinitesimal PRC, which reflects the intrinsic dynamics of the oscillator (Ermentrout, 1996; Smeal et al., 2010; Fink et al., 2011; Schultheiss et al., 2012). In the following, we use "PRC" to refer to the infinitesimal PRC. Further, the PRCs of neural oscillators have often been classified into two categories: Type I, which respond with only phase advances to excitatory stimuli, and Type II, which display both phase advances and delays (Hansel et al., 1995; Smeal et al., 2010; Fink et al., 2011).
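Given measured perturbed periods, Equation (5) and the Type I/Type II distinction reduce to a few lines; the sample periods used below are hypothetical.

```python
import numpy as np

def prc_from_periods(T, perturbed_periods):
    """Equation (5): PRC(theta_i) = 1 - T'(theta_i)/T.

    Positive values are phase advances, negative values phase delays;
    a PRC with only advances is Type I, otherwise Type II.
    """
    prc = 1.0 - np.asarray(perturbed_periods, dtype=float) / T
    kind = "Type I" if np.all(prc >= 0.0) else "Type II"
    return prc, kind

# Hypothetical perturbed periods for an oscillator with T = 10 ms:
prc1, kind1 = prc_from_periods(10.0, [9.5, 9.0, 9.8])   # advances only
prc2, kind2 = prc_from_periods(10.0, [10.5, 9.0])       # delay + advance
```

In practice T′(ϑ) would be measured by delivering a brief, weak pulse at each phase of the limit cycle and recording the perturbed spike time.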

# Method to Determine Energy Consumption in Two-Dimensional Model

We use the method proposed by Moujahid et al. (2011, 2014) and Moujahid and D'Anjou (2012) to determine the electrochemical energy involved in the modified Morris-Lecar model. The model in Equation (1) can be regarded as an electrical circuit, which consists of membrane capacitance C, Na+, K<sup>+</sup> and leak ionic channels. According to the description by Moujahid et al. (2011, 2014) and Moujahid and D'Anjou (2012), the total electrical energy accumulated in this circuit at a given time can be expressed by

$$E(t) = \frac{1}{2}CV^2 + E\_{Na} + E\_K + E\_L \tag{6}$$

Here, (1/2)CV<sup>2</sup> is the electrical energy accumulated in the membrane capacitance. ENa, EK, and EL are the energies in the batteries needed to create the concentration jumps in Na<sup>+</sup>, K<sup>+</sup> and chloride, respectively. These energies can be supplied by external stimuli, i.e., Iin or Inoise. The first-order derivative of Equation (6) with respect to time is

$$\frac{dE}{dt} = CV\frac{dV}{dt} + I\_{Na}V\_{Na} + I\_{K}V\_{K} + I\_{L}V\_{L} \tag{7}$$

Substituting dV/dt from Equation (1), the energy rate δ (i.e., dE/dt) in the circuit can be written as

$$\delta = (I\_{in} + I\_{noise})V - I\_{Na}(V - V\_{Na}) - I\_K(V - V\_K) - I\_L(V - V\_L) \tag{8}$$

where (Iin+Inoise)V is the energy power supplied by stimulus. The last three terms on the right hand of Equation (8) represent the energy consumption rate of the ionic channels. If we substitute INa, IK, and I<sup>L</sup> with their expressions, we can deduce the energy rate of each ionic channel

$$\delta\_{Na} = \overline{g}\_{Na} m\_{\infty}(V)(V - V\_{Na})^2 \tag{9}$$

$$\delta\_K = \overline{g}\_K n(V - V\_K)^2 \tag{10}$$

$$\delta\_L = g\_L(V - V\_L)^2\tag{11}$$

It is easy to see that this method is not based on the stoichiometry of the ions. Thus, it requires no hypothesis about the overlap between Na<sup>+</sup> and K<sup>+</sup> fluxes, and therefore avoids overestimating the energy (Moujahid et al., 2011, 2014; Moujahid and D'Anjou, 2012).
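A direct transcription of Equations (9)-(11); the conductances and reversal potentials below are assumptions, since Table 1 is not reproduced in this excerpt.

```python
import numpy as np

# Assumed values (Table 1 missing): conductances in mS/cm^2, potentials in mV.
gNa, gK, gL = 20.0, 20.0, 2.0
VNa, VK, VL = 50.0, -100.0, -70.0
beta_m, gamma_m = -1.2, 18.0

def m_inf(V):
    return 0.5 * (1.0 + np.tanh((V - beta_m) / gamma_m))

def channel_energy_rates(V, n):
    """Instantaneous per-channel energy consumption rates, Eqs. (9)-(11)."""
    d_na = gNa * m_inf(V) * (V - VNa) ** 2  # Eq. (9)
    d_k = gK * n * (V - VK) ** 2            # Eq. (10)
    d_l = gL * (V - VL) ** 2                # Eq. (11)
    return d_na, d_k, d_l
```

Integrating these rates over a simulated action potential (e.g., with `np.trapz` over the voltage trace) gives the per-spike energy cost of each channel; note the quadratic form guarantees non-negative rates.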

# Numerical Simulation

The differential equations of the entire system are numerically integrated with MATLAB. The bifurcation analysis is performed with XPPAUT (Ermentrout, 2002) following the standard procedures. In bifurcation analysis, we use Iin to produce step currents to stimulate the neuron and systematically vary its intensity to determine at what point the neuron qualitatively changes its dynamical behavior, such as, starting or ceasing repetitive spiking. This special point corresponds to a bifurcation. Further, the PRC is also calculated by XPPAUT.

# Results

In this section, we first adjust the parameter βn that controls the half-activation voltage of the K<sup>+</sup> channel to produce spike thresholds with different sensitivities to the preceding dV/dt, as shown in **Figures 1A,B**. One can see that the spike threshold becomes more depolarized as we shift βn from −5 to −15 mV in steps of −2 mV (**Figure 1B**). For the three cases βn = −5, −7, and −9 mV, the spike threshold is insensitive to dV/dt, and there is no inverse relationship between spike threshold and dV/dt. On the contrary, the spike threshold shows relatively large variations and becomes sensitive to dV/dt for βn = −11, −13, and −15 mV. In these three cases, the spike threshold varies inversely with the preceding dV/dt, and the inverse relationship becomes more pronounced as βn decreases. The range of dV/dt in **Figure 1B** is from 0.45 to 4.5 mV/ms, which is achieved by increasing the ramp slope K in Equation (4). This range is selected in accordance with previous modeling (Wester and Contreras, 2013) and experimental (Wilent and Contreras, 2005) studies. In the following, we explore the neuronal input-output relation and energy efficiency in each of these six cases.

# Input-Output Property of the Neuron with Different Threshold Dynamics

For each sensitivity of the spike threshold to dV/dt, we investigate how the neuron responds to constant current in the cases of no noise (σ = 0 µA/cm<sup>2</sup>), low noise (σ = 0.5 µA/cm<sup>2</sup>) and high noise (σ = 3 µA/cm<sup>2</sup>). To achieve this goal, we use Iin to produce a step current to stimulate the neuron and systematically alter its intensity to determine the neuronal spike frequency f.

**Figure 1C** gives the neuronal spike frequency f as a function of the input current Iin (i.e., the f − Iin curve) in the six cases of threshold dynamics. For all three levels of noise, one can observe that the depolarization of the spike threshold slightly reduces the slope of the f − Iin curve at low firing rates and clearly shifts the curve to the right, which corresponds to increasing the minimal current intensity required to trigger repetitive spiking (i.e., the current threshold). If the spike threshold is insensitive to dV/dt (i.e., βn = −5, −7, and −9 mV), the neuron can spike repetitively at very low frequencies at all levels of noise, which endows it with a continuous f − Iin curve. However, when the spike threshold is sensitive to dV/dt (i.e., βn = −11, −13, and −15 mV), the neuron is unable to maintain repetitive spiking at low rates and produces a discontinuous f − Iin curve in the cases of no or low noise (**Figure 1C**). This discontinuous f − Iin curve can be switched to a continuous one by a high level of noise.
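The spike frequency f used for an f − Iin curve can be estimated from a simulated voltage trace by counting upward crossings of a voltage threshold (0 mV here); this is a common convention, not necessarily the authors' exact procedure.

```python
import numpy as np

def firing_rate(V_trace, dt_ms, thresh=0.0):
    """Spike frequency f (Hz) from upward threshold crossings of a
    voltage trace sampled every dt_ms milliseconds."""
    v = np.asarray(V_trace, dtype=float)
    crossings = np.sum((v[1:] > thresh) & (v[:-1] <= thresh))
    return crossings / (len(v) * dt_ms / 1000.0)
```

Sweeping Iin through a range of step amplitudes and plotting `firing_rate` against Iin reproduces the shape of the f − Iin curve.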

Since noise is another ubiquitous feature of the nervous system, with myriad effects on neural coding (Tuckwell, 1989; Gerstner and Kistler, 2002; Tuckwell et al., 2009; Tuckwell and Jost, 2010), we further investigate how noise modulates the spike trains of the neuron under different spike threshold dynamics, as shown in **Figures 2**, **3**. We observe that, regardless of whether there is an inverse relationship between spike threshold and dV/dt, the spike number always increases monotonically from 0 as the noise amplitude σ increases when I<sub>in</sub> is less than the bifurcation value I*<sub>in</sub>. For I<sub>in</sub> just beyond I*<sub>in</sub>, noise can inhibit or even terminate the repetitive spiking of the neuron when its spike threshold is sensitive to dV/dt (**Figures 2D–F**). In this case, the neuron is able to generate repetitive spiking without noise (i.e., σ = 0 µA/cm<sup>2</sup>), since I<sub>in</sub> has already exceeded the bifurcation value I*<sub>in</sub>. Introducing synaptic noise makes the spike trains irregular. Unexpectedly, weak noise (such as σ = 0.2 µA/cm<sup>2</sup>) has a marked inhibitory effect on neuronal spiking behavior, even terminating repetitive spiking for a long time. When the noise amplitude is increased to σ = 1.5 µA/cm<sup>2</sup> or higher, more spikes are evoked again. That is, when I<sub>in</sub> is in the vicinity of I*<sub>in</sub>, small noise can noticeably inhibit neuronal spiking, and there is a minimum in the mean spike number as σ goes up (**Figures 3D–F**). Meanwhile, as the inverse relationship between spike threshold and dV/dt gets more pronounced, the inhibitory effect induced by small noise becomes stronger. However, this inhibitory effect does not appear in the

neuron whose spike threshold is insensitive to dV/dt (left panels, **Figures 2**, **3**). In this case, noise only perturbs the spike trains and makes them irregular; it is unable to terminate repetitive spiking (**Figures 2A–C**).

# Phase Response Curves of the Neuron with Different Threshold Dynamics

In the previous section, we found that different sensitivities of the spike threshold to dV/dt can result in distinct (i.e., discontinuous or continuous) f − I<sub>in</sub> curves in the cases of no or low noise. In this section, we use PRC theory to further characterize the neuronal response properties under the different threshold dynamics.

**Figure 4** displays the PRCs of the neuron model in the six cases of spike threshold dynamics. We find that the PRC depends on the natural oscillation frequency of the neuron, and that increasing this frequency attenuates the amplitude of the phase shift. When the spike threshold is insensitive to dV/dt, the neuron generates a type I

FIGURE 3 | Mean number of spikes as a function of noise amplitude for each threshold dynamic. **(A–F)** give the mean spike number *N* (40 trials) as the noise amplitude σ is increased, over a 1000 ms interval, for different values of β<sub>n</sub>. The value of I<sub>in</sub> indicated by the blue line is below the bifurcation point I*<sub>in</sub>, and no repetitive spiking is generated in the neuron without noise, while the values of I<sub>in</sub> indicated by the three other colors are above the bifurcation point I*<sub>in</sub>.

PRC, which exclusively displays phase advances (i.e., positive values) in response to a brief excitatory pulse (**Figure 4A**). However, when the spike threshold has a clear inverse relation with dV/dt, the neuron shows phase delays (i.e., negative values) at earlier phases and phase advances at later phases (**Figure 4B**), which manifests as a type II PRC. It has been proposed that a type I PRC corresponds to a continuous f − I<sub>in</sub> curve and a type II PRC to a discontinuous f − I<sub>in</sub> curve (Ermentrout, 1996; Izhikevich, 2005; Smeal et al., 2010; Fink et al., 2011). Our simulation results in **Figures 1C**, **4** are in accordance with this proposal. Further, it is worth pointing out that there are very small negative regions at the earlier phases of the type I PRCs (**Figure 4A**). This is because the action potentials generated in Morris-Lecar-like models consume a much larger portion of the interspike interval than in other models (Rinzel and Ermentrout, 1998; Fink et al., 2011). Following Fink et al. (2011), we ignore these small early phase delays in the type I PRCs.
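The direct way to measure a PRC numerically is the perturbation protocol implied here: let the neuron fire periodically, inject a brief pulse at a chosen phase, and record how the next spike time shifts. A minimal, self-contained Python illustration, using the quadratic integrate-and-fire neuron (the canonical type I oscillator) rather than the present model; the reset and threshold values are illustrative assumptions:

```python
import numpy as np

def qif_spike_time(I=1.0, v_reset=-50.0, v_th=50.0, dt=5e-4,
                   kick_time=None, kick=0.0):
    """Integrate dV/dt = V^2 + I from reset to threshold; optionally add a
    brief depolarizing kick of size `kick` at time `kick_time`.
    Returns the spike time."""
    v, t = v_reset, 0.0
    kicked = kick_time is None
    while v < v_th:
        v += dt * (v * v + I)
        t += dt
        if not kicked and t >= kick_time:
            v += kick
            kicked = True
    return t

def measure_prc(n_phases=19, eps=0.01):
    """PRC(phi) = (T0 - T_perturbed) / eps for phases phi in (0, 1)."""
    t0 = qif_spike_time()
    phases = np.linspace(0.05, 0.95, n_phases)
    shifts = [(t0 - qif_spike_time(kick_time=p * t0, kick=eps)) / eps
              for p in phases]
    return phases, np.array(shifts)
```

For the QIF neuron the measured curve is non-negative and peaks mid-cycle, matching the analytically known type I PRC proportional to 1/(V² + I); a type II model would instead show a negative lobe at early phases.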

# Biophysical Basis of the Spike Initiation Associated with Different Threshold Dynamics

By varying the parameter β<sub>n</sub>, we have identified the input-output property associated with each spike threshold dynamic. Our next step is to explore why the neuron with distinct threshold dynamics produces different input-output properties. It is known that membrane currents with opposite directions play different roles in spike generation. The currents flowing into

the cell mainly depolarize the membrane voltage to produce the rapid upstroke of the spike (i.e., positive feedback), whereas the currents flowing out of the cell mainly hyperpolarize the membrane voltage; they are responsible for repolarization and produce the downstroke of the spike (i.e., negative feedback) (Izhikevich, 2005; Prescott et al., 2008a,b; Yi et al., 2014a). Here, we investigate how these opposing currents interact at perithreshold potentials to determine the neuronal response property in the six cases of spike threshold dynamics.

Reducing the parameter β<sub>n</sub> from −5 to −15 mV results in a hyperpolarizing shift in the half-activation voltage of the outward K<sup>+</sup> current I<sub>K</sub> (**Figure 1A**), which causes I<sub>K</sub> to be more strongly activated by perithreshold depolarization (**Figure 5A**). In the three cases where the spike threshold is insensitive to dV/dt (i.e., β<sub>n</sub> = −5, −7, and −9 mV), the outward I<sub>K</sub> activates at a higher potential than the inward I<sub>Na</sub> (**Figure 5A**), which indicates that the slow outward current I<sub>K</sub> does not become activated until after the spike is initiated. In these three cases, the relationship between the steady-state net membrane current I<sub>SS</sub> and the membrane voltage V (i.e., the I<sub>SS</sub> − V curve) is always non-monotonic (**Figure 5B**), with a region of negative slope. At the local maximum of the I<sub>SS</sub> − V curve, the inward I<sub>Na</sub> balances the outward unactivated I<sub>K</sub> and the outward I<sub>L</sub>. Any further depolarization then results in the progressive activation of I<sub>Na</sub>, which becomes self-sustaining and generates the upstroke of the spike. In other words, the bifurcation occurs at this voltage, i.e., where ∂I<sub>SS</sub>/∂V = 0. Since the depolarizing current I<sub>Na</sub> faces no restraint from hyperpolarizing currents at perithreshold potentials, the membrane potential V can be driven to pass slowly through the spike threshold. Thus, the neuron is able to spike repetitively at low frequencies and produce a continuous f − I<sub>in</sub> curve. This continuous input-output property is generated through a SNIC bifurcation (**Figure 5C**), which corresponds to a non-monotonic I<sub>SS</sub> − V curve (Izhikevich, 2005; Prescott et al., 2008a,b; Yi et al., 2014a). Further, because the inward I<sub>Na</sub> dominates spike initiation without the restraint of I<sub>K</sub> at perithreshold potentials, a brief excitatory stimulus only advances the oscillation cycle and produces positive phase shifts, which corresponds to a type I PRC.

FIGURE 5 | (A) ... more strongly activated by perithreshold depolarization. (B) The relationship between the steady-state net membrane current I<sub>SS</sub> and the membrane potential V (i.e., the I<sub>SS</sub> − V curve); I<sub>SS</sub> is computed as the sum of the three individual currents, I<sub>SS</sub> = I<sub>Na</sub> + I<sub>K</sub> + I<sub>L</sub>. (C,D) Bifurcation diagrams associated with each spike threshold dynamic. Stable equilibria are indicated by orange solid lines and unstable equilibria by orange dotted lines; stable limit cycles are indicated by green solid lines and unstable limit cycles by purple dotted lines.
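The monotonicity test described here is easy to perform directly on the steady-state current. A Python sketch, using illustrative Morris-Lecar-type activation curves with parameters adapted from Prescott et al. (2008) (assumptions, not the exact values of the present model), in which the half-activation voltage `beta_n` plays the role of the paper's β<sub>n</sub>:

```python
import numpy as np

# Illustrative Morris-Lecar-type parameters (Prescott et al., 2008 style)
G_NA, G_K, G_L = 20.0, 20.0, 2.0          # mS/cm²
E_NA, E_K, E_L = 50.0, -100.0, -70.0      # mV
BETA_M, GAMMA_M, GAMMA_N = -1.2, 18.0, 10.0

def i_ss(v, beta_n):
    """Steady-state net membrane current I_SS = I_Na + I_K + I_L."""
    m_inf = 0.5 * (1.0 + np.tanh((v - BETA_M) / GAMMA_M))
    n_inf = 0.5 * (1.0 + np.tanh((v - beta_n) / GAMMA_N))
    return (G_NA * m_inf * (v - E_NA)
            + G_K * n_inf * (v - E_K)
            + G_L * (v - E_L))

def has_negative_slope(beta_n, v_lo=-70.0, v_hi=0.0):
    """True if the I_SS - V curve is non-monotonic over [v_lo, v_hi],
    i.e., has a region of negative slope (SNIC-like regime)."""
    v = np.arange(v_lo, v_hi, 0.1)
    return bool(np.any(np.diff(i_ss(v, beta_n)) < 0.0))
```

With these assumed parameters, a right-shifted half-activation voltage (e.g., `beta_n = -5.0`) yields a non-monotonic curve with a local maximum, while a strongly left-shifted one (e.g., `beta_n = -13.0`) yields a monotonically increasing curve, mirroring the SNIC/Hopf distinction drawn in the text.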

In the other three cases, where the spike threshold is sensitive to dV/dt (i.e., β<sub>n</sub> = −11, −13, and −15 mV), the outward I<sub>K</sub> activates at roughly the same V as the inward I<sub>Na</sub>, or at a slightly lower V (**Figure 5A**). The activation of I<sub>K</sub> at low potentials makes the outward currents so strong that the inward I<sub>Na</sub> is unable to balance them at perithreshold potentials, which results in a monotonic I<sub>SS</sub> − V curve without a local maximum (**Figure 5B**). To initiate action potentials, the inward I<sub>Na</sub> must exploit its fast kinetics to activate faster than the slow outward I<sub>K</sub>, and drive V through the threshold potential with sufficient speed that the outward I<sub>K</sub> cannot catch up. Only in this way can the positive feedback outrun the negative feedback to produce the upstroke of the spike. Since the V trajectory between two spikes must be faster than I<sub>K</sub>, the neuron is unable to spike repetitively at low frequencies, which endows it with a discontinuous f − I<sub>in</sub> curve. This discontinuous input-output property is generated through a Hopf bifurcation (**Figure 5D**), which corresponds to a monotonic I<sub>SS</sub> − V curve (Izhikevich, 2005; Prescott et al., 2008a,b; Yi et al., 2014a). Further, in this case there is a special subthreshold region where the activation of the low-threshold I<sub>K</sub> is greater than that of the inward I<sub>Na</sub>. When the voltage trajectory passes through this region, an excitatory pulse evokes a larger response from the outward I<sub>K</sub> than from the inward I<sub>Na</sub>, which leads to negative PRC values at early phases. At higher membrane potentials later in the cycle, beyond this subthreshold region, the fast-activating I<sub>Na</sub> dominates the neuronal response to a brief excitatory pulse, which leads to positive PRC values at later phases. The neuron therefore generates a type II PRC, with both phase delays and phase advances, in these three cases.

Further, as the spike threshold gets depolarized, the outward I<sub>K</sub> becomes more strongly activated at perithreshold potentials, which increases the net current I<sub>SS</sub> and makes it reach a higher outward level prior to spike initiation. Since the outward current hyperpolarizes the membrane potential V and opposes action potential generation, a stronger step current I<sub>in</sub> is required to counteract the outward current and activate the inward I<sub>Na</sub> to generate a spike. Consequently, the current threshold for triggering repetitive spiking increases as the spike threshold gets depolarized.

Finally, when the Hopf bifurcation occurs (i.e., when the spike threshold is sensitive to dV/dt), there is a narrow bistable region in the vicinity of the bifurcation, where a stable resting state and a stable limit cycle coexist (**Figure 5D**). Synaptic noise can then switch the voltage trajectory from one attractor, the stable limit cycle, to the other, the stable resting point (Tuckwell et al., 2009; Tuckwell and Jost, 2010, 2011, 2012; Guo, 2011). This is the basis of the inhibitory effect of weak noise on spiking behavior. Meanwhile, the bistable region widens as the relationship between spike threshold and dV/dt gets more pronounced, which causes the inhibitory effect of weak noise on repetitive spiking to become stronger. In contrast, there is no bistable region in the case of the SNIC bifurcation (**Figure 5C**), so noise is unable to inhibit or terminate neuronal spiking in that case, i.e., when the spike threshold is insensitive to dV/dt.
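The attractor-switching mechanism invoked here can be illustrated with a deliberately minimal toy system rather than the full neuron model: a particle in a double well, dx = (x − x³)dt + σ dW, where the two wells stand in for the coexisting stable limit cycle and stable resting state. Without noise the trajectory stays in the well it starts in; weak noise eventually drives it across the barrier, the analog of noise terminating repetitive spiking. All parameter values below are illustrative assumptions.

```python
import numpy as np

def min_excursion(sigma, x0=1.0, T=1000.0, dt=0.01, seed=42):
    """Euler-Maruyama integration of dx = (x - x^3) dt + sigma dW.
    Starting in the right well (x0 = 1), returns the minimum value of x
    reached; a value below 0 means noise switched the attractor."""
    rng = np.random.default_rng(seed)
    x, xmin = x0, x0
    for _ in range(int(T / dt)):
        x += (x - x**3) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        xmin = min(xmin, x)
    return xmin
```

With no noise, `min_excursion(0.0)` stays at 1.0; a moderate `sigma` (e.g., 0.5) reliably produces a barrier crossing within the simulated window. In the neuron the analogous crossing carries the trajectory from the spiking limit cycle into the basin of the resting point.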

# Energy Efficiency in the Neuron with Different Threshold Dynamics

We have identified the input-output property and spike initiation mechanism associated with each threshold dynamic. Here, we characterize the energy consumption and efficiency of the neuron in the six cases of threshold dynamics.

We first describe how the ionic currents and their energy consumption evolve during the generation of a spike. **Figure 6A** shows an action potential generated by the neuron with β<sub>n</sub> = −5 mV in response to I<sub>in</sub> = 37.5 µA/cm<sup>2</sup> in the case of no noise (i.e., σ = 0 µA/cm<sup>2</sup>). At these values of I<sub>in</sub> and σ, the neuron spikes repetitively at about 23.5 Hz. **Figure 6B** gives the Na<sup>+</sup>, K<sup>+</sup>, and leak currents corresponding to the spike waveform in **Figure 6A**. The Na<sup>+</sup> current flows into the cell and has a negative sign, but we plot it with a positive sign for better visualization of the overlap between the Na<sup>+</sup> and K<sup>+</sup> currents. During the upstroke, the Na<sup>+</sup> current activates first and drives the membrane voltage to depolarize quickly. Then the outward K<sup>+</sup> current activates, hyperpolarizing the membrane voltage and producing the downstroke. The energy consumption rates of the three ionic currents are shown in **Figure 6C**; they are computed according to Equations (9)–(11) and represent the instantaneous energy consumed per second by the corresponding ionic channel, which is always positive. One can observe overlaps between the Na<sup>+</sup> and K<sup>+</sup> energy rates, especially during the downstroke (**Figure 6C**). **Figure 6D** gives the total energy rate δ consumed by all the ionic currents to generate the action potential in **Figure 6A**. To maintain the spiking activity of the neuron, this energy consumption must be replenished by the ion pumps and supplied metabolically by the hydrolysis of ATP molecules.
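In the Moujahid et al. formulation adopted here, the per-channel energy consumption rate is the product of each ionic current and its electrochemical driving force, P<sub>i</sub> = I<sub>i</sub> (V − E<sub>i</sub>), which is non-negative for every channel (Equations 9–11 are not reproduced in this excerpt). A Python sketch with an illustrative Morris-Lecar-type neuron (parameters adapted from Prescott et al., 2008; an assumption, not the paper's exact model):

```python
import numpy as np

def energy_per_spike(I_in=60.0, beta_n=0.0, T=1000.0, dt=0.02):
    """Simulate an illustrative Morris-Lecar-type neuron, accumulate the
    instantaneous channel powers P_i = I_i * (V - E_i) >= 0, and return
    (energy per spike in nJ/cm², spike count) over the post-transient window."""
    C, g_na, g_k, g_l = 2.0, 20.0, 20.0, 2.0
    e_na, e_k, e_l = 50.0, -100.0, -70.0
    beta_m, gamma_m, gamma_n, phi = -1.2, 18.0, 10.0, 0.15
    v, w = -70.0, 0.0
    energy, spikes, above = 0.0, 0, False
    for i in range(int(T / dt)):
        m_inf = 0.5 * (1.0 + np.tanh((v - beta_m) / gamma_m))
        w_inf = 0.5 * (1.0 + np.tanh((v - beta_n) / gamma_n))
        tau_w = 1.0 / np.cosh((v - beta_n) / (2.0 * gamma_n))
        i_na = g_na * m_inf * (v - e_na)
        i_k = g_k * w * (v - e_k)
        i_l = g_l * (v - e_l)
        if i * dt > 200.0:                       # skip the initial transient
            p = (i_na * (v - e_na) + i_k * (v - e_k)
                 + i_l * (v - e_l))              # µA·mV/cm², always >= 0
            energy += p * dt                     # µA·mV·ms/cm² = pJ/cm²
            if v > 0.0 and not above:
                spikes += 1
        above = v > 0.0
        v += dt * (I_in - (i_na + i_k + i_l)) / C
        w += dt * phi * (w_inf - w) / tau_w
    return energy / max(spikes, 1) / 1000.0, spikes
```

Sweeping `I_in` and dividing the accumulated energy by the spike count mirrors the right-panel computation of Figure 7; keeping the three channel powers separate in the same loop yields curves analogous to Figure 6C.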

The left panels in **Figure 7** give the average energy consumption rate δ as a function of the input current I<sub>in</sub> (i.e., the δ − I<sub>in</sub> curve) in the six cases of spike threshold dynamics for three levels of noise. The energy consumption rate δ in the quiescent state is much lower than that in the spiking state. This is because increasing the energy supplied to the neuron, i.e., increasing the step current, promotes the flow of ions through the cell membrane and makes them consume more energy. When the spike threshold is insensitive to dV/dt (i.e., β<sub>n</sub> = −5, −7, and −9 mV), the δ − I<sub>in</sub> curve is continuous at all three levels of noise. However, if there is a clear inverse relation between threshold and dV/dt (i.e., β<sub>n</sub> = −11, −13, and −15 mV), the δ − I<sub>in</sub> curve is discontinuous in the cases of no or low noise and continuous at a high level of noise. Thus, the energy consumption rate of the neuron during the transition from the quiescent state to the spiking regime depends on its firing rate, as displayed in **Figure 1C**. As the spike threshold gets depolarized, the δ − I<sub>in</sub> curve in the firing regime shifts to the right and the corresponding average energy consumption rate δ of the neuron decreases.

The right panels in **Figure 7** show the total energy consumption in nJ per cm<sup>2</sup>, calculated as the integral over a long period of time of the area under the instantaneous ionic-channel energy curve [i.e., the sum of the energy rates given by Equations (9)–(11)] divided by the number of spikes, which gives the energy consumption of a single spike. As the step current I<sub>in</sub> increases, the energy consumed per spike first decreases quickly and then increases very slightly (about 0.1 nJ/cm<sup>2</sup> per 1 µA/cm<sup>2</sup>). As the threshold gets depolarized, the energy consumption per action potential becomes larger at some low I<sub>in</sub> values, and synaptic noise clearly increases this consumption. However, at high values of I<sub>in</sub>, the energy demand for a spike gets smaller

as the spike threshold depolarizes, and increasing synaptic noise has little effect on this consumption. That is, depolarizing the spike threshold increases the energy utilization efficiency of the neuron at high firing rates. The lowest values of energy consumption per spike are achieved at the most depolarized spike thresholds and high stimulus currents.


From the results in **Figures 6B,C**, it can be seen that the Na<sup>+</sup> and K<sup>+</sup> currents overlap during an action potential. These two positive charge carriers flow in opposite directions as they pass through the cell membrane, so they neutralize each other during the overlap. The overlap charge can be computed as the integral of the Na<sup>+</sup> current during the hyperpolarizing phase of the spike (Moujahid et al., 2011, 2014; Moujahid and D'Anjou, 2012); it is the inward Na<sup>+</sup> that is counterbalanced by outward K<sup>+</sup>. Previous studies (Alle et al., 2009; Carter and Bean, 2009; Sengupta et al., 2010, 2013; Moujahid and D'Anjou, 2012; Moujahid et al., 2014) have shown that reducing this overlap load decreases the energy demands of spike generation. From **Figure 8A**, one can see that the overlap Na<sup>+</sup> indeed decreases as the spike threshold gets depolarized in the case of high I<sub>in</sub> values. This more efficient use of the inward Na<sup>+</sup> decreases the energy consumption of an action potential and enhances the energy efficiency of the neuron (**Figure 8B**).
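The overlap load can be estimated from a simulated spike train by integrating the inward Na<sup>+</sup> current over the repolarizing (falling-voltage) part of the trajectory, where it is counteracted by the outward K<sup>+</sup> current. A Python sketch, again with an illustrative Morris-Lecar-type neuron (Prescott-style parameters; an approximation of the Moujahid et al. procedure, not the paper's exact computation):

```python
import numpy as np

def na_charge_split(I_in=60.0, beta_n=0.0, T=1000.0, dt=0.02):
    """Return (overlap_charge, total_charge): inward Na+ charge transferred
    while V is falling vs. in total (nC/cm²), after a 200 ms transient."""
    C, g_na, g_k, g_l = 2.0, 20.0, 20.0, 2.0
    e_na, e_k, e_l = 50.0, -100.0, -70.0
    beta_m, gamma_m, gamma_n, phi = -1.2, 18.0, 10.0, 0.15
    v, w = -70.0, 0.0
    overlap = total = 0.0
    for i in range(int(T / dt)):
        m_inf = 0.5 * (1.0 + np.tanh((v - beta_m) / gamma_m))
        w_inf = 0.5 * (1.0 + np.tanh((v - beta_n) / gamma_n))
        tau_w = 1.0 / np.cosh((v - beta_n) / (2.0 * gamma_n))
        i_na = g_na * m_inf * (v - e_na)          # inward current: negative
        i_k = g_k * w * (v - e_k)
        i_l = g_l * (v - e_l)
        dv = dt * (I_in - (i_na + i_k + i_l)) / C
        if i * dt > 200.0:
            total += -i_na * dt                   # µA·ms/cm² = nC/cm²
            if dv < 0.0:                          # repolarizing phase
                overlap += -i_na * dt
        v += dv
        w += dt * phi * (w_inf - w) / tau_w
    return overlap, total
```

Comparing the overlap fraction `overlap / total` across threshold dynamics mimics the comparison underlying Figure 8A: the smaller the fraction, the less inward Na<sup>+</sup> is wasted against the K<sup>+</sup> current.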

# Discussion

Our results demonstrate a fundamental connection between spike threshold dynamics and neuronal input-output properties. When the spike threshold is insensitive to dV/dt, the f − I<sub>in</sub> curve is continuous and weak noise is unable to inhibit spiking rhythms. In this case, the neuron generates a type I PRC that exclusively displays phase advances. However, when the spike threshold is sensitive to dV/dt, the neuron generates a discontinuous f − I<sub>in</sub> curve and a type II PRC in the cases of no or low noise. Increasing the noise amplitude switches the f − I<sub>in</sub> curve from discontinuous to continuous. At the same time, weak synaptic noise markedly suppresses spiking rhythms when I<sub>in</sub> is near and above the bifurcation point I*<sub>in</sub>. In this case, as the inverse relationship between spike threshold and dV/dt gets more pronounced, the inhibitory effect of weak noise on spiking rhythms and the discontinuity of the f − I<sub>in</sub> curve both become more significant. Further, depolarization of the spike threshold shifts the f − I<sub>in</sub> curve to the right, alters its slope at low spike rates, and increases the current threshold for evoking repetitive spiking. These results indicate that spike threshold properties, such as whether the threshold is sensitive to dV/dt, the degree of its inverse dependence on dV/dt, or even the value of the threshold potential, can all markedly influence neuronal input-output relations.

All these input-output properties associated with each spike threshold dynamic derive from the distinct nonlinear interactions between the inward (depolarizing) and outward (hyperpolarizing) currents at perithreshold potentials. When the spike threshold is insensitive to dV/dt, the outward I<sub>K</sub> does not activate before spike threshold, which allows the inward I<sub>Na</sub> to dominate spike initiation without the restraint of I<sub>K</sub>. In the absence of outward I<sub>K</sub>, the inward I<sub>Na</sub> is able to balance the weak outward currents at perithreshold potentials, which results in a non-monotonic I<sub>SS</sub> − V curve, a type I PRC, and a SNIC bifurcation. Under these conditions, V can be forced to pass slowly through the threshold potential and the neuron is able to spike at low frequencies, thus producing a continuous f − I<sub>in</sub> curve. Since the SNIC bifurcation lacks a bistable region, the inhibitory effect of weak noise on spiking rhythms is absent in this case. When the spike threshold is sensitive to dV/dt, the outward I<sub>K</sub> is able to activate at subthreshold potentials and can become sufficiently strong prior to spike initiation.

The inward I<sub>Na</sub> is then unable to balance it at perithreshold potentials, which leads to a monotonic I<sub>SS</sub> − V curve, a type II PRC, and a Hopf bifurcation. The action potential can still be initiated because the inward I<sub>Na</sub> activates quickly, driving V through threshold with sufficient speed that the slow outward I<sub>K</sub> cannot overtake it. This means the neuron is unable to spike at low rates, which corresponds to a discontinuous f − I<sub>in</sub> curve. Since the neuron has a narrow bistable region when the Hopf bifurcation occurs, weak noise can convert its state from the stable limit cycle to rest and thereby suppress repetitive spiking. Further, the increase in the current threshold for evoking repetitive spiking is also due to the net outward current becoming stronger as the threshold gets depolarized.

The biophysical explanation of how the activation properties of intrinsic membrane currents contribute to the dependence of the spike threshold on the preceding dV/dt has been reported in many experimental and modeling studies (Hodgkin and Huxley, 1952; Storm, 1988; Azouz and Gray, 2000, 2003; Bekkers and Delaney, 2001; Henze and Buzsáki, 2001; Dodson et al., 2002; Wilent and Contreras, 2005; Guan et al., 2007; Goldberg et al., 2008; Higgs and Spain, 2011; Wester and Contreras, 2013; Fontaine et al., 2014). Likewise, the biophysical basis of how different dynamical mechanisms of spike initiation (i.e., SNIC and Hopf bifurcations) generate distinct input-output relations, such as Hodgkin class 1 and class 2 excitability (Koch, 1999; Izhikevich, 2005; Prescott and Sejnowski, 2008; Prescott et al., 2008a,b; Yi et al., 2014a) or type I and type II PRCs (Ermentrout,

1996; Smeal et al., 2010; Fink et al., 2011), has also been well established. However, none of these studies has explored how spike threshold dynamics modulate the neuronal input-output relation. With a simple biophysical model, we have identified a fundamental connection between spike threshold dynamics and input-output properties. We have also provided a biophysical interpretation of how the nonlinear interactions between inward and outward currents at perithreshold potentials contribute to this connection. The predictive power of subthreshold biophysical properties is further attested in our work, which may encourage their future application in neural coding.

Since stochasticity is a prominent feature of the nervous system (Tuckwell, 1989; Gerstner and Kistler, 2002; Tuckwell and Jost, 2010), much effort has been devoted to exploring the effects that noise may produce on neuronal activity. Many modeling and experimental studies have reported that noise can enrich neuronal stochastic dynamics and trigger complex behaviors near different bifurcation points. For example, it may induce stochastic firing patterns and enhance neuronal information transmission through coherence resonance near a SNIC bifurcation (Gu et al., 2011; Jia et al., 2011; Jia and Gu, 2012), inhibit repetitive spiking through inverse stochastic resonance near a Hopf bifurcation (Paydarfar et al., 2006; Tuckwell et al., 2009; Tuckwell and Jost, 2010, 2011, 2012; Guo, 2011), or completely destroy the bifurcation scenario and make the neuronal response reliable (Tateno and Pakdaman, 2004). However, most of these studies focus on a phenomenological description of how noise impacts spiking behavior and do not provide a satisfying explanation of the relation between intrinsic neuronal properties and noise effects. In contrast, the present study links the effects of noise on spiking rhythms to the neuron's intrinsic threshold dynamics. Moreover, we provide a plausible biophysical interpretation of the observed noise effects by relating them to the dynamical mechanism of spike initiation. These investigations provide insight into how noise participates in neural coding.

In addition, we adopt a novel approach proposed by Moujahid et al. (2011, 2014) and Moujahid and D'Anjou (2012) to characterize the electrochemical energy of the neuron under different spike threshold dynamics. This approach is based on biophysical considerations about the nature of the neuron model, which allow one to deduce an analytical expression for the electrochemical energy involved in the dynamics of the model. Contrary to the ion-counting approach, this method does not need to count the Na<sup>+</sup> ions required to depolarize the membrane when estimating energy consumption, and it requires no hypothesis about the extent of the overlap between Na<sup>+</sup> and K<sup>+</sup> (Moujahid et al., 2011, 2014; Moujahid and D'Anjou, 2012). Thus, it avoids the overestimation of energy that results from ion-counting methods (Attwell and Laughlin, 2001; Alle et al., 2009; Hertz et al., 2013). With this approach, we have found a basic link between spike threshold, energy efficiency, and spiking frequency. The average energy consumption rate increases with spiking frequency and can detect the transition of the neuron from the quiescent to the firing state, whereas the energy demand of a single spike decreases with spiking frequency. This relation between energy consumption and spiking frequency is consistent with that observed in the neocortex, hippocampus, thalamus, and squid axon (Moujahid and D'Anjou, 2012; Moujahid et al., 2014). As the spike threshold gets depolarized, the average energy consumption rate gets smaller. Meanwhile, the energy demand for generating an action potential in the case of high stimulus also decreases. This demonstrates that depolarizing the spike threshold can increase the energy efficiency of the neuron. We further show that the more efficient use of electrochemical energy at more depolarized thresholds is mainly due to the reduced overlap load between the inward Na<sup>+</sup> and outward K<sup>+</sup> currents. Previous reports (Alle et al., 2009; Carter and Bean, 2009; Sengupta et al., 2010, 2013; Moujahid and D'Anjou, 2012; Moujahid et al., 2014) have proposed that if the Na<sup>+</sup> and K<sup>+</sup> currents have a substantially reduced overlap, the corresponding action potential is more energy efficient. Our simulation results are consistent with this proposal. All these experimental and modeling observations suggest that the interactions between inward and outward currents also determine the electrochemical energy required by the neuron to generate action potentials.



# Conclusion

A dynamic spike threshold that depends on dV/dt plays a vital role in neural coding and spike initiation, and maintaining it requires a considerable amount of metabolic energy. In this work, we have used a modified Morris-Lecar model to systematically investigate the input-output properties and energy efficiency of the neuron under different spike threshold dynamics. To the best of our knowledge, this is the first study to link spike threshold dynamics, biophysical properties, spike initiation, input-output relations, and energy efficiency together. The predictions and the corresponding mechanistic explanations could be tested by intracellular recording in vivo, and more biophysically realistic simulations will be required to replicate these biological effects more accurately. The systematic investigation of how spike threshold dynamics modulate neuronal input-output properties and energy efficiency is a useful stepwise method for exploring how the spike threshold participates in neural coding. Moreover, translating phenomenological descriptions into biophysical interpretations is crucial for revealing how membrane biophysics impacts neural coding. Thus, our simulations could contribute to uncovering the functional significance of the spike threshold, as well as of biophysical properties, in neural coding.

# Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants 61471265, 61401312, 61372010, and 61172009, and Tianjin Municipal Natural Science Foundation under Grants 12JCZDJC21100, 13JCZDJC27900, and 13JCQNJC03700.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Yi, Wang, Tsang, Wei and Deng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Linear stability in networks of pulse-coupled neurons

*Simona Olmi<sup>1,2</sup>, Alessandro Torcini<sup>1,2</sup>\* and Antonio Politi<sup>1,3</sup>*

*<sup>1</sup> Consiglio Nazionale delle Ricerche, Istituto dei Sistemi Complessi, Sesto Fiorentino, Italy*

*<sup>2</sup> INFN—Sezione di Firenze and CSDC, Sesto Fiorentino, Italy*

*<sup>3</sup> SUPA and Institute for Complex Systems and Mathematical Biology, King's College, University of Aberdeen, Aberdeen, UK*

#### *Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Marc Timme, Max Planck Institute for Dynamics and Self Organization, Germany*

*Tobias A. Mattei, Ohio State University, USA*

#### *\*Correspondence:*

*Alessandro Torcini, Consiglio Nazionale delle Ricerche, Istituto dei Sistemi Complessi, Via Madonna del Piano 10, I-50019 Sesto Fiorentino, Italy*

*e-mail: alessandro.torcini@cnr.it*

In a first step toward the comprehension of neural activity, one should focus on the stability of the possible dynamical states. Even the characterization of an idealized regime, such as that of a perfectly periodic spiking activity, reveals unexpected difficulties. In this paper we discuss a general approach to linear stability of pulse-coupled neural networks for generic phase-response curves and post-synaptic response functions. In particular, we present: (1) a mean-field approach developed under the hypothesis of an infinite network and small synaptic conductances; (2) a "microscopic" approach which applies to finite but large networks. As a result, we find that there exist two classes of perturbations: those which are perfectly described by the mean-field approach and those which are subject to finite-size corrections, irrespective of the network size. The analysis of perfectly regular, asynchronous, states reveals that their stability depends crucially on the smoothness of both the phase-response curve and the transmitted post-synaptic pulse. Numerical simulations suggest that this scenario extends to systems that are not covered by the perturbative approach. Altogether, we have described a series of tools for the stability analysis of various dynamical regimes of generic pulse-coupled oscillators, going beyond those that are currently invoked in the literature.

**Keywords: linear stability analysis, splay states, synchronization, neural networks, pulse coupled neurons, Floquet spectrum**

# **1. INTRODUCTION**

Networks of oscillators play an important role in both biological (neural systems, circadian rhythms, population dynamics) (Pikovsky et al., 2003) and physical contexts (power grids, Josephson junctions, cold atoms) (Hadley and Beasley, 1987; Filatrella et al., 2008; Javaloyes et al., 2008). It is therefore comprehensible that many studies have been and are still devoted to understanding their dynamical properties. Since the development of sufficiently powerful tools and the resulting discovery of general laws is an utterly difficult task, it is convenient to start from simple setups.

The first issue to consider is the model structure of the single oscillators. Since phases are typically more sensitive than amplitudes to mutual coupling, they are likely to provide the most relevant contribution to the collective evolution (Pikovsky et al., 2003). Accordingly, here we restrict our analysis to oscillators characterized by a single, phase-like, variable. This is typically done by reducing the neuronal dynamics to the evolution of the membrane potential and introducing the corresponding *velocity field* which describes the single-neuron activity. Equivalently, one can map the membrane potential onto a phase variable and simultaneously introduce a phase-response curve (PRC) [Upon changing variables, the velocity field can be made independent of the local variable (as intuitively expected for a true phase). When this is done, the phase dependence of the velocity field is moved to the coupling function, i.e., to the PRC] to take into account the dependence of the neuronal response on the current value of the membrane potential (i.e., the phase). In this paper we adopt the first point of view, with a few exceptions, when the second one is mathematically more convenient.

As for the coupling, two mechanisms are typically invoked in the literature, diffusive and pulse-mediated. While the former mechanism is pretty well understood [see e.g., the very many papers devoted to Kuramoto-like models (Acebrón et al., 2005)], the latter one, more appropriate in neural dynamics, involves a series of subtleties that have not yet been fully appreciated. This is why here we concentrate on pulse-coupled oscillators.

Finally, for what concerns the topology of the interactions, it is known that they can heavily influence the dynamics of the neural systems leading to the emergence of new collective phenomena even in weakly connected networks (Timme, 2006), or of various types of chaotic behavior, ranging from weak chaos for diluted systems (Popovych et al., 2005; Olmi et al., 2010) to extensive chaos in sparsely connected ones (Monteforte and Wolf, 2010; Luccioli et al., 2012). We will, however, limit our analysis to globally coupled identical oscillators, which provide a much simplified, but already challenging, test bed. The high symmetry of the corresponding evolution equations simplifies the identification of the stationary solutions and the analysis of their stability properties. The two most symmetric solutions are: (1) the fully synchronous state, where all oscillators follow exactly the same trajectory; (2) the splay state (also known as "ponies on a merry-go-round," antiphase state or rotating waves) (Hadley and Beasley, 1987; Ashwin et al., 1990; Aronson et al., 1991), where the oscillators still follow the same periodic trajectory, but with different (evenly distributed) time shifts. The former solution is the simplest representative of the broad class of clustered states (Golomb and Rinzel, 1994), where several oscillators behave in the same way, while the latter is the prototype of asynchronous states, characterized by a smooth distribution of phases (Renart et al., 2010).

In spite of the many restrictions on the mathematical setup, the stability of the synchronous and splay states still depends significantly on additional features such as the synaptic response function, the velocity field, and the presence of delay in the pulse transmission. As a result, one can encounter splay states that are either strongly stable along all directions, or that present many almost-marginal directions, or, finally, that are marginally stable along various directions (Nichols and Wiesenfeld, 1992; Watanabe and Strogatz, 1994). Several analytic results have been obtained in specific cases, but a global picture is still missing: the goal of this paper is to recompose the puzzle, by exploring the role of the velocity field (or, equivalently, of the phase-response curve) and of the shape of the transmitted post-synaptic potentials. Although we are going to discuss neither the role of delay nor that of the network topology, it is useful to recall the stability analysis of the synchronous state in the presence of delayed δ-pulses and for arbitrary topology, performed by Timme and Wolf in Timme and Wolf (2008). There, the authors show that even the complete knowledge of the spectrum of the linear operator does not suffice to address the stability of the synchronized state.

The stability analysis of the fully synchronous regime is far from being trivial even for a globally coupled network of oscillators with no delay in the pulse transmission: in fact, the pulse emission introduces a discontinuity which requires separating the evolution before and after such event. Moreover, when many neurons spike at the same time, the length of some interspike intervals is virtually zero but cannot be neglected in the mathematical analysis. In fact, the first study of this problem was restricted to excitatory coupling and δ-pulses (Mirollo and Strogatz, 1990). In that context, the stability of the synchronous state follows from the fact that when the phases of two oscillators are sufficiently close to one another, they are instantaneously reset to the same value (as a result of a non-physical lack of invertibility of the dynamics). The first, truly linear stability analyses have been performed later, first in the case of two oscillators (van Vreeswijk et al., 1994; Hansel et al., 1995) and then considering δ-pulses with continuous PRCs (Goel and Ermentrout, 2002). Here, we extend the analysis to generic pulse-shapes and discontinuous PRCs [such as for leaky integrate and fire (LIF) neurons].

As for the splay states, their stability can be assessed in two ways: (1) by assuming that the number of oscillators is infinite (i.e., taking the so-called thermodynamic limit) and thereby studying the evolution of the distribution of the membrane potentials—this approach is somehow equivalent to dealing with (macroscopic) Liouville-type equations in statistical mechanics; (2) by dealing with the (microscopic) equations of motion for a large but finite number *N* of oscillators. As shown in some pioneering works (Kuramoto, 1991; Treves, 1993), the former approach corresponds to developing a mean-field theory. The resulting equations were first solved in Abbott and van Vreeswijk (1993) for pulses composed of two exponential functions, in the limit of a small effective coupling [a small effective coupling can arise also when the PRC has a very weak dependence on the phase (see section 3)]. Here, following Abbott and van Vreeswijk (1993), we extend the analysis to generic pulse-shapes, finding that substantial differences exist among δ, exponential and the so-called α-pulses (see the next section for a proper definition).

Direct numerical studies of the linear stability of finite networks suggest that the eigenfunctions of the (Floquet) operator can be classified according to their wavelength or, equivalently, according to a wavenumber φ defined with respect to the neuronal phase (see section 4.1 for a precise definition). In finite systems, it is convenient to distinguish between long (LW) and short (SW) wavelengths. Upon considering that φ is proportional to *n*/*N* (1 ≤ *n* ≤ *N*), LW modes can be identified as those for which *n* ≪ *N*, while SW modes correspond to larger *n* values. Numerical simulations suggest also that the time scale of a LW perturbation typically increases upon increasing its wavenumber, starting from a few milliseconds (for small *n* values) up to much longer values (when *n* is on the order of the network size *N*) which depend on "details" such as the continuity of the velocity field, or the pulse shape. On the other hand, SW perturbations are characterized by a slow size-dependent dynamics.

For instance, in LIF neurons coupled via α-pulses, it has been found (Calamai et al., 2009) that the Floquet exponents of the LW modes decrease as 1/*n*<sup>2</sup> (for large *n*), while the time scale of the SW component is on the order of *N*<sup>2</sup>. In practice, the LW spectral component as determined from the finite-*N* analysis coincides with the one obtained with the mean-field approach (i.e., taking first the thermodynamic limit). As for the SW component, it cannot be quantitatively determined by the mean-field approach, but it is nevertheless possible to infer the correct order of magnitude of its time scale. In fact, upon combining the 1/*n*<sup>2</sup> decay (predicted by the mean-field approach) with the observation that the minimal wavelength is 1/*N* (i.e., *n* is at most of order *N*), it naturally follows that the SW time scale is *N*<sup>2</sup>, as analytically proved in Olmi et al. (2012). Furthermore, it has been found that the two spectral components smoothly connect to each other and the predictions of the two theoretical approaches coincide in the crossover region.

It is therefore important to investigate whether the same agreement extends to more generic pulse shapes and velocity fields. The finite-*N* approach can, in principle, be generalized to arbitrary shapes, but the analytic calculations would be quite lengthy, due to the need of distinguishing between fast and slow scales and of accounting for higher-order terms. For this reason, here we limit ourselves to giving a positive answer to this question with the help of numerical studies.

The only, important, exception to this scenario is obtained for quasi δ-like pulses (Zillmer et al., 2007), i.e., for pulses whose width is smaller than the average time separation between any two consecutive spikes, in which case all the SW eigenvalues remain finite for increasing *N*.

In section 2 we introduce the model and derive the corresponding event-driven map, a necessary step before undertaking the analytic calculations. Section 3 is devoted to a perturbative stability analysis of the splay state in the infinite-size limit for generic velocity fields and pulse shapes. The following section 4 reports a discussion of the stability in finite networks. There we briefly recall the main results obtained in Olmi et al. (2012) for the splay state and we extensively discuss the method to quantify the stability of the fully synchronous regime. The following two sections are devoted to a numerical analysis of various setups. In section 5 we study splay states in finite networks for generic velocity fields and three different classes of pulses, namely, with finite, vanishing (≈1/*N*), and zero width. In section 6 we study periodically forced networks. Such studies show that the scaling relations derived for the splay states apply also to such a microscopically quasi-periodic regime. A brief summary of the main results together with a recapitulation of the open problems is finally presented in section 7. In the first appendix we derive the Fourier components needed to assess the stability of a splay state for a generic PRC. In the second appendix the evaporation exponent is determined for the synchronous state in LIF neurons.

# **2. THE MODEL**

The general setup considered in this paper is a network of *N* identical pulse-coupled neurons (rotators), whose evolution is described by the equation

$$\dot{X}^{j} = F(X^{j}) + g E(t), \quad j = 1, \dots, N \tag{1}$$

where *X<sup>j</sup>* represents the membrane potential, *g* is the coupling constant, and the *mean field E*(*t*) denotes the synaptic input, common to all neurons in the network. When *X<sup>j</sup>* reaches the threshold value *X<sup>j</sup>* = 1, it is reset to *X<sup>j</sup>* = 0 and a spike contributes to the mean field *E* in a way that is described here below. The resetting procedure is an approximation of the discharge mechanism operating in real neurons. The function *F*(*X*) (the velocity field) is assumed to be everywhere positive, thus ensuring that the neuron fires repetitively. For *F*(*X*) = *a* − *X* the model reduces to the well-known case of LIF neurons.
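For the uncoupled LIF case the dynamics can be solved in closed form, which is convenient for checking numerical schemes: with *g* = 0, *X*(*t*) = *a*(1 − e<sup>−*t*</sup>), so the interspike interval is *T*<sub>0</sub> = ln[*a*/(*a* − 1)]. The sketch below (an illustration with an arbitrarily chosen *a*, not a value taken from the paper) cross-checks this against a forward-Euler integration:

```python
import math

# Uncoupled LIF neuron (g = 0): X' = a - X, integrated from the reset X = 0
# up to the threshold X = 1.  The closed form X(t) = a(1 - exp(-t)) gives
# the period T0 = ln(a/(a-1)).  The value a = 1.3 is illustrative.
a = 1.3
T0_exact = math.log(a / (a - 1.0))

dt, t, X = 1e-5, 0.0, 0.0
while X < 1.0:
    X += dt * (a - X)   # forward-Euler step of the membrane equation
    t += dt

print(t, T0_exact)      # the two values agree to O(dt)
```

Note that the condition *F*(*X*) > 0 over [0, 1] requires *a* > 1, which is what makes *T*<sub>0</sub> finite.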

The mean field *E*(*t*) arises from the linear superposition of the pulses emitted by the single neurons. In order to describe its time evolution, it is sufficient to introduce a suitable ordinary differential equation (ODE), such that its Green function reproduces the expected pulse shape,

$$E^{(L)} = \sum\_{i}^{L-1} a\_i E^{(i)} + \frac{K}{N} \sum\_{n|t\_n < t} \delta(t - t\_n), \tag{2}$$

where the superscript (*i*) denotes the *i*th time derivative, *L* is the order of the differential equation, and *K* = ∏<sub>*i*</sub> α<sub>*i*</sub> (−α<sub>*i*</sub> being the poles of the differential equation), so as to ensure that the single pulses have unit area (for *N* = 1). The δ-functions appearing on the right-hand side of Equation (2) correspond to the spikes emitted at times {*t<sub>n</sub>*}: each time a spike is emitted, the term *E*<sup>(*L*−1)</sup> has a finite jump of amplitude *K*/*N*. Therefore *L* controls the smoothness of the pulses: *L* − 1 is the order of the lowest derivative that is discontinuous. *L* = 0 corresponds to the extreme case of δ-pulses with no field dynamics; *L* = 1 corresponds to discontinuous exponential pulses; *L* = 2 (with α<sub>1</sub> = α<sub>2</sub> ≡ α) to the so-called α-pulses (*E<sub>s</sub>*(*t*) = α<sup>2</sup>*t*e<sup>−α*t*</sup>). Since α-pulses will often be referred to, it is worth being a little more specific. In this case, Equation (2) reduces to

$$
\ddot{E}(t) + 2\alpha \dot{E}(t) + \alpha^2 E(t) = \frac{\alpha^2}{N} \sum_{n|t_n < t} \delta(t - t_n), \tag{3}
$$

and it is convenient to transform this equation into a system of two ODEs, namely

$$\dot{E} = P - \alpha E, \quad \dot{P} + \alpha P = \frac{\alpha^2}{N} \sum\_{n|t\_n < t} \delta(t - t\_n), \tag{4}$$

where we have introduced, for the sake of simplicity, the auxiliary variable *P* ≡ α*E* + *E*˙.
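As a consistency check of this construction, one can integrate the two ODEs of Equation (4) after a single spike at *t* = 0 (with *N* = 1): the δ-function kicks *P* to α<sup>2</sup>, and the resulting field must reproduce the α-pulse *E<sub>s</sub>*(*t*) = α<sup>2</sup>*t*e<sup>−α*t*</sup>. A minimal sketch (the value of α and the variable names are our own illustrative choices):

```python
import math

# Integrate E' = P - alpha*E, P' = -alpha*P after a single spike at t = 0
# (N = 1): initial conditions E(0) = 0, P(0) = alpha**2, since the
# delta-function kicks P by alpha**2/N.  The solution must coincide with
# the alpha-pulse E_s(t) = alpha**2 * t * exp(-alpha*t).
alpha = 5.0
dt = 1e-3

def rhs(E, P):
    return P - alpha * E, -alpha * P

E, P, t = 0.0, alpha**2, 0.0
max_err = 0.0
for _ in range(int(2.0 / dt)):
    # classical fourth-order Runge-Kutta step for the coupled pair (E, P)
    k1E, k1P = rhs(E, P)
    k2E, k2P = rhs(E + 0.5 * dt * k1E, P + 0.5 * dt * k1P)
    k3E, k3P = rhs(E + 0.5 * dt * k2E, P + 0.5 * dt * k2P)
    k4E, k4P = rhs(E + dt * k3E, P + dt * k3P)
    E += dt * (k1E + 2 * k2E + 2 * k3E + k4E) / 6.0
    P += dt * (k1P + 2 * k2P + 2 * k3P + k4P) / 6.0
    t += dt
    exact = alpha**2 * t * math.exp(-alpha * t)   # the alpha-pulse E_s(t)
    max_err = max(max_err, abs(E - exact))

print(max_err)   # integration error, far below the pulse peak alpha/e
```

The pulse peaks at *t* = 1/α with height α/e, so the residual printed above quantifies how accurately the Green function of Equation (4) reproduces *E<sub>s</sub>*(*t*).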

### **2.1. EVENT-DRIVEN MAP**

By following Zillmer et al. (2006) and Calamai et al. (2009), it is convenient to pass from a continuous—to a discrete-time evolution rule, by deriving the event-driven map which connects the network configuration at consecutive spike times. For the sake of simplicity, in the following part of this section we refer to α-pulses, but there is no conceptual limitation in extending the approach to *L* > 2.

By integrating Equation (4), we obtain

$$E\_{n+1} = E\_n \mathbf{e}^{-\alpha T\_n} + P\_n T\_n \mathbf{e}^{-\alpha T\_n} \tag{5}$$

$$P\_{n+1} = P\_n e^{-\alpha T\_n} + \frac{\alpha^2}{N},\tag{6}$$

where we have taken into account the effect of the incoming pulse (see the term α<sup>2</sup>/*N* in the second equation), while *T<sub>n</sub>* = *t*<sub>*n*+1</sub> − *t<sub>n</sub>* is the interspike interval; *t*<sub>*n*+1</sub> corresponds to the time when the neuron with the largest membrane potential reaches the threshold.

Since all neurons follow the same first-order differential equation (this is a mean-field model), the ordering of their membrane potentials is preserved [neurons "rotate" around the circle [0, 1] without overtaking each other (Jin, 2002)]. It is, therefore, convenient to order the potentials from the largest to the smallest one and to introduce a co-moving reference frame, i.e., to shift backward the label *j*, each time a neuron reaches the threshold. By formally integrating Equation (1),

$$X_{n+1}^{j} = \mathcal{F}(X_n^{j+1}, T_n) + g\, \frac{e^{-T_n} - e^{-\alpha T_n}}{\alpha - 1} \left( E_n + \frac{P_n}{\alpha - 1} \right) - g\, \frac{T_n e^{-\alpha T_n}}{\alpha - 1}\, P_n. \tag{7}$$

Moreover, since *X*<sup>1</sup><sub>*n*</sub> is always the largest potential, the interspike interval is defined by the threshold condition

$$X\_n^1(\mathcal{T}\_n, E\_n, P\_n) \equiv 1.\tag{8}$$

Altogether, the model now reads as a discrete-time map, involving *N* + 1 variables, *E<sub>n</sub>*, *P<sub>n</sub>*, and *X*<sup>*j*</sup><sub>*n*</sub> (1 ≤ *j* < *N*), since one degree of freedom has been eliminated as a result of having taken the Poincaré section (*X*<sup>*N*</sup><sub>*n*</sub> ≡ 0 due to the resetting mechanism). The advantage of the map description is that we no longer have to deal with δ-like discontinuities, or with formally infinite sequences of past events.

In this framework, the splay state is a fixed point of the event-driven map. Its coordinates can be determined in the following way. From Equations (5) and (6), one can express *P*˜ and *E*˜ as functions of the yet unknown interspike interval *T*,

$$
\tilde{P} = \frac{\alpha^2}{N} (1 - \mathbf{e}^{-\alpha T})^{-1} \quad \tilde{E} = T \tilde{P} (\mathbf{e}^{\alpha T} - 1)^{-1} . \tag{9}
$$

The values of the membrane potentials *X*˜<sup>*j*</sup> are then obtained by iterating Equation (7) backward in *j* (the *n* dependence is dropped at the fixed point), starting from the initial condition *X*˜<sup>*N*</sup> = 0. The interspike interval *T* is finally obtained by imposing the threshold condition *X*˜<sup>0</sup> = 1. In practice, the computational difficulty amounts to finding the zero of a one-dimensional function and, even though *F*(*X*<sup>*j*+1</sup>, *T*) can, in most cases, be obtained only through numerical integration, the final error can be kept well under control.
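The procedure just described can be sketched numerically for LIF neurons coupled via α-pulses, where the evolution over one interspike interval is available in closed form (Equation 7). The helper names `evolve` and `residual` below are our own, and the parameter values are illustrative rather than taken from the paper:

```python
import math

# Minimal sketch: locating the splay-state fixed point of the event-driven
# map (Equations 5-9) for LIF neurons, F(X) = a - X, with alpha-pulses.
a, g, alpha, N = 1.3, 0.2, 9.0, 20

def evolve(x, T, E, P):
    """Equation (7): membrane potential after a time T, given the field (E, P)."""
    eT, eaT = math.exp(-T), math.exp(-alpha * T)
    return (x * eT + a * (1.0 - eT)
            + g * (eT - eaT) / (alpha - 1.0) * (E + P / (alpha - 1.0))
            - g * T * eaT / (alpha - 1.0) * P)

def residual(T):
    """Backward iteration of Eq. (7) from X^N = 0; returns X^0 - 1 (threshold)."""
    P = alpha**2 / (N * (1.0 - math.exp(-alpha * T)))   # Equation (9)
    E = T * P / (math.exp(alpha * T) - 1.0)
    x = 0.0                                             # X^N = 0 (reset value)
    for _ in range(N):                                  # X^{N-1}, ..., X^0
        x = evolve(x, T, E, P)
    return x - 1.0

# Bisection on the network interspike interval T; the uncoupled value
# T0/N (with T0 = ln(a/(a-1)) the single-neuron period) brackets the root.
T0 = math.log(a / (a - 1.0))
lo, hi = 0.2 * T0 / N, 2.0 * T0 / N
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if residual(mid) < 0.0:
        lo = mid
    else:
        hi = mid
T = 0.5 * (lo + hi)
print(T)   # splay-state interspike interval; here below T0/N (excitation)
```

This is exactly the "zero of a one-dimensional function" mentioned above; for a generic velocity field, only `evolve` would have to be replaced by a numerical integration of Equation (1).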

# **3. THEORY (***N* **= ∞)**

The stability of a dynamical state can be assessed by either first taking the infinite-time limit and then the thermodynamic limit, or vice versa. In general it is not obvious whether the two methods yield the same result and this is particularly crucial for the splay state, as many eigenvalues tend to 0 for *N* → ∞. In this section we discuss the scenarios that have to be expected when the thermodynamic limit is taken first. We do that by following Abbott and van Vreeswijk (1993).

As a first step, it is convenient to introduce the phase-like variable

$$y^i = \int_0^{X^i} \frac{dx}{G(x)}, \qquad 0 \le y^i \le 1 \tag{10}$$

where, for later convenience, we have defined *G*(*X*) ≡ *g* + *T*0*F*(*X*), *T*<sup>0</sup> = *NT* being the period of the splay state (i.e., the single-neuron interspike interval). The phase *y<sup>i</sup>* evolves according to the equation

$$\frac{dy^i}{dt} = \tilde{E} + \frac{g\,\varepsilon(t)}{G(X(y^i))} \tag{11}$$

where *E*˜ = 1/*T*<sup>0</sup> is the amplitude of the field in the splay state, ε(*t*) = *E*(*t*) − *E*˜. In the splay state, since ε = 0, *y<sup>i</sup>* grows linearly in time, as indeed expected for a well-defined phase. In the thermodynamic limit, the evolution is ruled by the continuity equation

$$\frac{\partial \rho}{\partial t} = -\frac{\partial J}{\partial y} \tag{12}$$

where ρ(*y*, *t*)*dy* is the fraction of neurons whose phase *y<sup>i</sup>* lies in (*y*, *y* + *dy*) at time *t*, and

$$J(y, t) = \left[\tilde{E} + \frac{g\,\varepsilon(t)}{G(X(y))}\right] \rho(y, t) \tag{13}$$

is the corresponding flux. As the resetting implies that the outgoing flux *J*(1, *t*) (which coincides with the firing rate) equals the incoming flux at the origin, the above equation has to be complemented with the boundary condition *J*(0, *t*) = *J*(1, *t*). Finally, in this macroscopic representation, the field equation writes

$$\varepsilon^{(L)} = \sum_{i=0}^{L-1} a_i \varepsilon^{(i)} + K\left(J(1, t) - \tilde{E}\right), \tag{14}$$

while the splay state corresponds to the fixed point ρ = 1, ε = 0, *J* = *E*˜. The smoothness of the splay state justifies the use of a partial differential equation such as Equation (12). Its stability can be studied by introducing the perturbation *j*(*y*, *t*)

$$j(y, t) = J(y, t) - \tilde{E}, \tag{15}$$

and linearizing the continuity equation,

$$\frac{\partial j}{\partial t} = \frac{g}{G(X(y))}\, \frac{\partial \varepsilon}{\partial t} - \tilde{E}\, \frac{\partial j}{\partial y}, \tag{16}$$

while the field equation simplifies to

$$\varepsilon^{(L)} = \sum_{i=0}^{L-1} a_i \varepsilon^{(i)} + K\, j(1, t). \tag{17}$$

By now introducing the Ansatz

$$j(y, t) = j_f(y)\, e^{\lambda t}, \qquad \varepsilon(t) = \varepsilon_f\, e^{\lambda t}, \tag{18}$$

in Equations (16) and (17) and, thereby solving the resulting ODE, one can obtain an implicit expression for *jf*(*y*),

$$j_f(y) = e^{-\lambda y/\tilde{E}} \left[ 1 + \frac{gK\lambda}{\tilde{E} \prod_{k=1}^{L} (\lambda + \alpha_k)} \int_0^{y} dz\, \frac{e^{\lambda z/\tilde{E}}}{G(X(z))} \right],$$

where −α<sub>*k*</sub> and *K* are defined below Equation (2). By imposing the boundary condition for the flux, *j<sub>f</sub>*(1) = *j<sub>f</sub>*(0) = 1, one finally obtains the eigenvalue equation (Abbott and van Vreeswijk, 1993),

$$\left(e^{\lambda/\tilde{E}} - 1\right) \prod_{k=1}^{L} \left(\lambda + \alpha_{k}\right) = \frac{gK\lambda}{\tilde{E}} \int_{0}^{1} dy\, \frac{e^{\lambda y/\tilde{E}}}{G(X(y))}. \tag{19}$$

In the case of a constant *G*(*X*(*y*)) = σ, *L* eigenvalues correspond to the zeroes of the following polynomial equation

$$\prod\_{k=1}^{L} (\lambda + \alpha\_k) = \frac{\mathcal{g}K}{\sigma}.\tag{20}$$

For *g* = 0 such solutions are the poles −α<sub>*k*</sub> which define the field dynamics, while for *g* = σ, λ = 0 is a solution: this corresponds to the maximal value of the (positive) coupling strength beyond which the model no longer supports stationary states, as the feedback induces an unbounded growth of the spiking rate. Besides these *L* solutions, the spectrum is composed of an infinite set of purely imaginary eigenvalues,

$$
\lambda = 2\pi i n \tilde{E} = \frac{2\pi i n}{T_0}, \quad n \neq 0. \tag{21}
$$

The existence of such marginally stable directions reflects the fact that all *y<sup>i</sup>* phases experience the same velocity field, independently of their current value (see Equation 11), so that no effective interaction is present among the oscillators. In the limit of small variations of *G*(*X*(*y*)), one can develop a perturbative approach. Here below, we proceed under the more restrictive assumption that the coupling constant *g* is itself small: we have checked that this restriction does not change the substance of our conclusions, while requiring a simpler algebra.

A small *g* value implies that λ is close to 2πi*n**E*˜, so that one can expand the exponential in Equation (19). Up to first order, we find

$$\lambda_n = 2\pi i n \tilde{E}\left[1 + \frac{gK(A_n + iB_n)}{\prod_{k=1}^{L}\left(2\pi i n \tilde{E} + \alpha_k\right)}\right] \tag{22}$$

where

$$(A_n + iB_n) = \int_0^1 dy\, \frac{e^{i2\pi n y}}{G(X(y))} \tag{23}$$

are the Fourier components of the phase-response curve 1/*G*(*X*(*y*)).

In order to estimate the leading terms of the real part of λ*<sup>n</sup>* in the large *n* limit, let us rewrite Equation (22) as

$$\lambda\_n = i\gamma\_n + gK\gamma\_n \frac{-B\_n + iA\_n}{\prod\_{k=1}^L (\alpha\_k^2 + \gamma\_n^2)} \prod\_{k=1}^L (\alpha\_k - i\gamma\_n) \tag{24}$$

where γ<sub>*n*</sub> = 2π*n**E*˜ = 2π*n*/*T*<sub>0</sub>. Since γ<sub>*n*</sub> is proportional to *n*, the leading terms in the product in the numerator of Equation (24) are

$$\prod_{k=1}^{L} (\alpha_k - i\gamma_n) \sim (-i)^L \gamma_n^L + S(-i)^{L-1} \gamma_n^{L-1}, \tag{25}$$

where *S* = ∑<sub>*k*=1</sub><sup>*L*</sup> α<sub>*k*</sub>, while the leading term in the denominator of Equation (24) is γ<sub>*n*</sub><sup>2*L*</sup>. Accordingly, the main contribution to the real part of the eigenvalues is, in the case of even *L*,

$$\operatorname{Re}\{\lambda_n\} \sim gK(-1)^{L/2} \left[ \frac{S A_n}{\gamma_n^L} - \frac{B_n}{\gamma_n^{L-1}} \right] \tag{26}$$

and, for odd *L*,

$$\operatorname{Re}\{\lambda_n\} \sim gK(-1)^{(L+3)/2} \left[ \frac{A_n}{\gamma_n^{L-1}} + \frac{S B_n}{\gamma_n^L} \right]. \tag{27}$$

An exact expression for the Fourier components *A<sub>n</sub>* and *B<sub>n</sub>* appearing in Equation (23) can be derived in the large-*n* limit. In particular, the integral over the interval [0, 1] appearing in Equation (23) can be rewritten as a sum of integrals, each performed on a sub-interval of vanishingly small length 1/*n*. Furthermore, since the phase response 1/*G* has a limited variation within each sub-interval, it can be replaced by its polynomial expansion up to second order. Finally, as shown in Appendix A, the following expressions are obtained at leading order in 1/*n* for a discontinuous *F*(*X*):

$$A\_n \simeq \frac{-T\_0}{4\pi^2 n^2} \left[ \frac{F'(1)}{G(1)^2} - \frac{F'(0)}{G(0)^2} \right],\tag{28}$$

$$B\_n \simeq \frac{T\_0}{2\pi n} \left[ \frac{F(1) - F(0)}{G(1)G(0)} \right]. \tag{29}$$

Therefore, for even *L*, the leading term for *n* → ∞ is

$$\operatorname{Re}\{\lambda_n\} = \frac{gK T_0^L(-1)^{L/2}\, \left(F(0) - F(1)\right)}{(2\pi n)^L\, G(1) G(0)}. \tag{30}$$

For even *L*, the stability of the short-wavelength modes (large *n*) is controlled by the sign of (*F*(0) − *F*(1)): for even (odd) *L*/2 and excitatory coupling, i.e., *g* > 0, the splay state is stable whenever *F*(1) > *F*(0) (*F*(1) < *F*(0)). Obviously the stability is reversed for inhibitory coupling.

Notice that for *L* = 0, i.e., δ-spikes, the eigenvalues do not decrease with *n*, as previously observed in Zillmer et al. (2007). This is the only case where all modes exhibit a finite stability even in the thermodynamic limit.

For odd *L*, the real part of the eigenvalues is

$$\operatorname{Re}\{\lambda_n\} = \frac{gK T_0^L(-1)^{(L+1)/2}}{(2\pi n)^{L+1}} \left\{\frac{F'(1)}{G(1)^2} - \frac{F'(0)}{G(0)^2} - S\, T_0\, \frac{F(1) - F(0)}{G(1)G(0)}\right\}, \tag{31}$$

in this case the values of *F*(*X*) and of its derivative *F*′(*X*) at the extrema mix in a non-trivial way.

Finally, as for the scaling behavior of the leading terms we observe that

$$\operatorname{Re}\{\lambda\_n\} \sim n^{-q}, \quad q = 2\left\lfloor \frac{L+1}{2} \right\rfloor \tag{32}$$

where ⌊·⌋ denotes the integer part. Therefore, the scaling of the short-wavelength modes for discontinuous *F*(*X*) is dictated by the post-synaptic pulse profile.

For a continuous but non-differentiable *F*(*X*) (i.e., *F*(1) = *F*(0) but *F*′(1) ≠ *F*′(0)), if *L* is even, it is necessary to go two orders beyond in the estimate of the Fourier coefficients (see Appendix A). As a result, the eigenvalues scale as

$$\operatorname{Re}\{\lambda\_n\} \propto n^{-(L+2)}.\tag{33}$$

For odd *L*, it is instead sufficient to assume *F*(0) = *F*(1) in Equation (31).

Altogether, we have seen that the non-smoothness of both the post-synaptic pulse and the velocity field (or, equivalently, the phase-response curve) plays a crucial role in determining the degree of stability of the splay state. The smoother such functions are, the slower short-wavelength perturbations decay, although the changes occur in steps which depend on the parity of the order of the discontinuity (at least for the pulse structure). Moreover, the overall stability of the spectral components depends in a complicated way on the sign of the discontinuity itself.
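As an illustrative cross-check of Equation (29), note that for the LIF velocity field *F*(*X*) = *a* − *X* the phase response can be written in closed form: *G*(*X*(*y*)) = *C*e<sup>−*T*<sub>0</sub>*y*</sup> with *C* = *g* + *T*<sub>0</sub>*a*, the normalization *y*(1) = 1 fixing *T*<sub>0</sub> through *T*<sub>0</sub> = ln[*C*/(*C* − *T*<sub>0</sub>)]. The Fourier coefficient *B<sub>n</sub>* of Equation (23) can then be evaluated by direct quadrature and compared with the asymptotic prediction. A hedged sketch (our own helper names, illustrative parameters):

```python
import math

# Check of Equation (29) for LIF neurons, F(X) = a - X, where
# G(X) = g + T0*F(X) so that G(X(y)) = C*exp(-T0*y) with C = g + T0*a,
# and the normalization y(X=1) = 1 fixes T0 via T0 = ln(C/(C - T0)).
a, g = 1.3, 0.1

def h(T0):
    C = g + T0 * a
    return math.log(C / (C - T0)) - T0

lo, hi = 0.5, 3.0            # h changes sign in this bracket for these a, g
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if h(mid) > 0.0:
        lo = mid
    else:
        hi = mid
T0 = 0.5 * (lo + hi)
C = g + T0 * a

# Fourier coefficient B_n of 1/G(X(y)) (Equation 23), by trapezoidal rule.
n, M = 10, 200000
B = 0.0
for k in range(1, M):
    y = k / M
    B += math.sin(2 * math.pi * n * y) * math.exp(T0 * y) / C
B /= M           # endpoint contributions vanish: sin(0) = sin(2*pi*n) = 0

# Leading-order prediction of Equation (29): G(0) = C, G(1) = C - T0,
# and F(1) - F(0) = -1 for the LIF field.
B_pred = T0 / (2 * math.pi * n) * (-1.0) / ((C - T0) * C)
print(B, B_pred)   # agree up to relative corrections of order (T0/2*pi*n)^2
```

The negative sign of *B<sub>n</sub>* is what makes, via Equation (30) with *L* = 2, the short-wavelength modes of the excitatorily coupled LIF network stable.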

# **4. THEORY (FINITE** *N***)**

#### **4.1. THE SPLAY STATE**

The stability for finite *N* can be investigated by linearizing Equations (5–7). A thorough analysis has been developed in Olmi et al. (2012); here we limit ourselves to review the key ideas as a guide for the numerical analysis.

We start by introducing the vector *W* = ({*x<sup>j</sup>*}, ε, *p*) (*j* = 1, …, *N* − 1), whose components represent the infinitesimal perturbations of the solution {*X<sup>j</sup>*}, *E*, *P*. The Floquet spectrum can be determined by constructing the matrix **A** which maps the initial vector *W*(0) onto *W*(*T*),

$$W(T) = \mathbf{A}\, W(0) \tag{34}$$

where *T* corresponds to the time separation between two consecutive spikes. This is done in two steps, the first of which corresponds to evolving the components of a Cartesian basis according to the equations obtained from the linearization of Equations (1, 4) (in the comoving reference frame),

$$
\dot{x}^j = \left.\frac{dF}{dX}\right|_{X^{j+1}} x^{j+1} + g\,\epsilon, \quad j = 1, \dots, N-1, \qquad \dot{x}^N \equiv 0,
$$

$$
\dot{\epsilon} = p - \alpha \epsilon, \qquad \dot{p} = -\alpha p. \tag{35}
$$

The second step consists in accounting for the spike emission, which amounts to adding the vector

$$U = \{\dot{X}^j(T), \dot{E}(T), \dot{P}(T)\}\,\tau \tag{36}$$

where τ is obtained from the linearization of the threshold condition (8),

$$\tau = -\left(\frac{\partial X^1}{\partial E}\,\epsilon + \frac{\partial X^1}{\partial P}\,p\right)\frac{1}{\dot{X}^1}. \tag{37}$$

The diagonalization of the resulting matrix **A** gives *N* + 1 Floquet eigenvalues μ<sub>*k*</sub>, which we express as

$$
\mu\_k = e^{i\phi\_k} e^{T\_0(\lambda\_k + i\omega\_k)/N},\tag{38}
$$

where φ<sub>*k*</sub> = 2π*k*/*N*, *k* = 1, …, *N* − 1, and φ<sub>*N*</sub> = φ<sub>*N*+1</sub> = 0, while λ<sub>*k*</sub> and ω<sub>*k*</sub> are the real and imaginary parts of the Floquet exponents. The variable φ<sub>*k*</sub> plays the role of the wavenumber *k* in the linear stability analysis of spatially extended systems.

Previous studies (Olmi et al., 2012) have shown that the spectrum can be decomposed into two components: (1) *k* ∼ *O*(1); (2) *k*/*N* ∼ *O*(1). The former one is the LW component and can be directly obtained in the thermodynamic limit (see the previous section). For *L* = 2 and α<sub>1</sub> = α<sub>2</sub> (i.e., for α-pulses), it has been found that the results reported in Abbott and van Vreeswijk (1993) match those obtained for 1 ≪ *k* ≪ *N* in Olmi et al. (2012). The latter one corresponds to the SW component: it depends on the system size and cannot, indeed, be derived from the mean-field approach discussed in the previous section. In the next section, we illustrate some examples that go beyond the analytic studies carried out in Olmi et al. (2012).
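The construction of the matrix **A** of Equation (34) can also be carried out fully numerically, by finite differences of the event-driven map around the splay-state fixed point. The sketch below does this for LIF neurons with α-pulses; `evolve`, `spike_map`, `splay`, and the parameter values are our own illustrative choices, not quantities taken from the paper:

```python
import math
import numpy as np

# Numerical sketch of Section 4.1: the matrix A of Equation (34) is built by
# finite differences of the event-driven map, for LIF neurons (F(X) = a - X)
# coupled via alpha-pulses; its eigenvalues are the Floquet multipliers.
a, g, alpha, N = 1.3, 0.2, 9.0, 10

def evolve(x, T, E, P):
    """Equation (7): membrane potential after a time T under the field (E, P)."""
    eT, eaT = math.exp(-T), math.exp(-alpha * T)
    return (x * eT + a * (1.0 - eT)
            + g * (eT - eaT) / (alpha - 1.0) * (E + P / (alpha - 1.0))
            - g * T * eaT / (alpha - 1.0) * P)

def spike_map(v):
    """One step of the event-driven map; v = (X^1, ..., X^{N-1}, E, P)."""
    X, E, P = v[:N - 1], v[N - 1], v[N]
    lo, hi = 1e-12, 1.0                 # bisection for the interspike interval
    for _ in range(100):
        T = 0.5 * (lo + hi)
        if evolve(X[0], T, E, P) < 1.0:
            lo = T
        else:
            hi = T
    T = 0.5 * (lo + hi)
    Xold = np.append(X[1:], 0.0)        # comoving shift; the reset X^N = 0
    Xnew = np.array([evolve(x, T, E, P) for x in Xold])
    Enew = (E + P * T) * math.exp(-alpha * T)          # Equation (5)
    Pnew = P * math.exp(-alpha * T) + alpha**2 / N     # Equation (6)
    return np.concatenate([Xnew, [Enew, Pnew]])

def splay(T):
    """Fields from Eq. (9) and potentials from backward iteration of Eq. (7)."""
    P = alpha**2 / (N * (1.0 - math.exp(-alpha * T)))
    E = T * P / (math.exp(alpha * T) - 1.0)
    xs, x = [], 0.0
    for _ in range(N):
        x = evolve(x, T, E, P)
        xs.append(x)
    return xs, E, P

# Splay-state fixed point: bisection on T until X^0 reaches the threshold 1.
T0 = math.log(a / (a - 1.0))
lo, hi = 0.2 * T0 / N, 2.0 * T0 / N
for _ in range(200):
    T = 0.5 * (lo + hi)
    if splay(T)[0][-1] < 1.0:
        lo = T
    else:
        hi = T
T = 0.5 * (lo + hi)
xs, E, P = splay(T)
v0 = np.concatenate([xs[-2::-1], [E, P]])   # (X^1 > X^2 > ... > X^{N-1}, E, P)
f0 = spike_map(v0)                          # fixed point: f0 ~ v0

# Finite-difference Jacobian = matrix A of Equation (34).
h = 1e-7
A = np.zeros((N + 1, N + 1))
for k in range(N + 1):
    dv = np.zeros(N + 1)
    dv[k] = h
    A[:, k] = (spike_map(v0 + dv) - spike_map(v0 - dv)) / (2.0 * h)

mu = np.linalg.eigvals(A)   # the N+1 Floquet multipliers of Equation (38)
print(np.sort(np.abs(mu)))
```

Expressing the multipliers as in Equation (38) then yields the finite-*N* exponents λ<sub>*k*</sub>, whose LW portion can be compared with the mean-field predictions of section 3.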

#### **4.2. THE SYNCHRONIZED STATE**

In this section we address the problem of measuring the stability of the fully synchronized state for a generic oscillator dynamics *F*(*x*). The task is non-trivial because of the resetting mechanism, which acts simultaneously on all neurons. On the one hand, we extend the results obtained in Goel and Ermentrout (2002), which are restricted to a continuous PRC; on the other hand, we extend the results of Mirollo and Strogatz (1990), which refer to excitatory coupling and δ-pulses. In order to make the analysis easier to understand, we start by considering α-pulses. Other cases are discussed afterward.

The starting point amounts to writing the event driven map in a comoving frame,

$$X\_{n+1}^{j} = \mathcal{F}\left(X\_n^{j+1}, E\_n, P\_n, \mathcal{T}\_n\right) \tag{39}$$

$$E\_{n+1} = E\_n e^{-\alpha T\_n} + P\_n T\_n e^{-\alpha T\_n},\tag{40}$$

$$P\_{n+1} = P\_n e^{-\alpha T\_n} + \frac{\alpha^2}{N},\tag{41}$$

where the function *F* is obtained by formally integrating the equations of motion over the time interval *Tn*. Notice that the field dynamics has been, instead, explicitly obtained from the exact integration of the equations of motion [compare with Equations (3, 4)]. The interspike time interval *T<sup>n</sup>* is finally determined by solving the implicit equation

$$\mathcal{F}(X\_n^1, E\_n, P\_n, \mathcal{T}\_n) = 1. \tag{42}$$

In order to determine the stability of the synchronized state, it is necessary to assume that the neurons have infinitesimally different membrane potentials, even though they formally coincide with one another. As a result, the full period must be broken into *N* steps. In the first one, of length *T*, all neurons start at *X* = 0 and approach the threshold, but only the "first" one reaches it; in the following *N* − 1 steps, of zero length, one neuron after the other passes the threshold and is accordingly reset to 0.

With this scheme in mind we proceed to linearize the equations, writing the evolution equations for the infinitesimal perturbations *x*<sup>*j*</sup><sub>*n*</sub>, ε<sub>*n*</sub>, *p<sub>n</sub>*, and τ<sub>*n*</sub> around the synchronous solution. From Equations (39–41) we obtain,

$$x_{n+1}^{j} = \mathcal{F}_X(j+1)\, x_n^{j+1} + \mathcal{F}_E(j+1)\,\epsilon_n + \mathcal{F}_P(j+1)\, p_n + \mathcal{F}_T(j+1)\,\tau_n, \quad 1 \le j < N, \tag{43}$$

$$\epsilon_{n+1} = e^{-\alpha T} \epsilon_n + T e^{-\alpha T} p_n - \left(\alpha \tilde{E} - P_n e^{-\alpha T}\right) \tau_n \tag{44}$$

$$p_{n+1} = e^{-\alpha T} p_n - \alpha P_n e^{-\alpha T} \tau_n, \tag{45}$$

with the boundary condition *x*<sup>*N*</sup><sub>*n*+1</sub> = 0 (due to the reset mechanism), and where the subscripts *X*, *E*, *P*, and *T* denote a partial derivative with respect to the given variable. Moreover, the dependence on *j* + 1 is a shorthand notation to remind that the various derivatives depend on the membrane potential of the (*j* + 1)st neuron. Finally, we have kept the *n*-dependence in the variable *P*, as it changes (in steps of α<sup>2</sup>/*N*, while the neurons progressively cross the threshold), while *E*˜ refers to the field amplitude, which instead stays constant.

The above equations must be complemented by the condition

$$
\tau_n = -\mathcal{T}_X x_n^1 + \mathcal{T}_E \epsilon_n + \mathcal{T}_P p_n,\tag{46}
$$

where $\mathcal{T}_Z = \mathcal{F}_Z(1)/\mathcal{F}_T(1)$ ($Z = X, E, P$). Equation (46) is obtained by differentiating Equation (42), which defines the period of the splay state.

We now proceed to build the Jacobian for each of the *N* steps, starting from the first one. In order not to overload the notation, from now on the time index *n* corresponds to the step of the procedure. It is convenient to order all the variables, starting from $x^j$ ($j = 1, \ldots, N-1$), and then including $\epsilon$ and $p$, into a single vector, so that the evolution is described by an (*N* + 1) × (*N* + 1) matrix with the following structure,

$$\mathcal{N}(n) = \begin{pmatrix} \Gamma(n) & \mathbf{0} \\ \Psi(n) & \Omega(n) \end{pmatrix},\tag{47}$$

where **0** is an (*N* − 1) × 2 null matrix; $\Gamma(n)$ is a square (*N* − 1) × (*N* − 1) matrix, whose only non-zero elements are those in the first column and along the supradiagonal; $\Psi(n)$ is a 2 × (*N* − 1) matrix whose elements are all zero except for those in the first column; finally, $\Omega(n)$ is a 2 × 2 matrix.

Since in the first step all neurons start from the same position *X* = 0, one can drop the *j* dependence in $\mathcal{F}$. With the help of Equations (43, 46),

$$
\Gamma(1)_{j,1} = -\mathcal{F}_X
$$

$$
\Gamma(1)_{j,j+1} = \mathcal{F}_X \tag{48}
$$

Moreover, with the help of Equations (44–46)

$$
\Psi(1)_{11} = -\left(\alpha \tilde{E} - \tilde{P} e^{-\alpha T}\right) \mathcal{T}_X
$$

$$
\Psi(1)_{12} = -\alpha \tilde{P} e^{-\alpha T}\,\mathcal{T}_X,\tag{49}
$$

where we have also made use of the fact that $P_1 = \tilde{P}$. Finally,

$$\begin{aligned} \Omega(1)_{11} &= e^{-\alpha T} - \left(\alpha \tilde{E} - \tilde{P} e^{-\alpha T}\right) \mathcal{T}_E, \\ \Omega(1)_{12} &= T e^{-\alpha T} - \left(\alpha \tilde{E} - \tilde{P} e^{-\alpha T}\right) \mathcal{T}_P, \\ \Omega(1)_{21} &= -\alpha \tilde{P} e^{-\alpha T} \mathcal{T}_E, \\ \Omega(1)_{22} &= e^{-\alpha T} - \alpha \tilde{P} e^{-\alpha T} \mathcal{T}_P. \end{aligned} \tag{50}$$

In the next steps, $\mathcal{T}_n$ vanishes, so that $\mathcal{F}_E = \mathcal{F}_P = 0$, while $\mathcal{F}_X = 1$ and $\mathcal{F}_T(1) = F(1) + g\tilde{E} := V^1$. Moreover, $\mathcal{F}_T(j)$ depends on whether the *j*th neuron has passed the threshold or not. In the former case $\mathcal{F}_T(j+1) = F(0) + g\tilde{E} := V^0$, otherwise $\mathcal{F}_T(j+1) = V^1$. As a result,

$$
\Gamma(n)_{j,1} = -V^j/V^1
$$

$$
\Gamma(n)_{j,j+1} = 1\tag{51}
$$

where $V^j = V^0$ if $j < n$ and $V^j = V^1$ otherwise. At the same time, from the equations for the field variables, we find that

$$
\Psi(n)_{11} = \frac{\alpha \tilde{E} - \left(\tilde{P} + (n - 1)\frac{\alpha^2}{N}\right)}{V^1}
$$

$$
\Psi(n)_{12} = \frac{\alpha \left(\tilde{P} + (n - 1)\frac{\alpha^2}{N}\right)}{V^1},
\tag{52}
$$

while $\Omega(n)$ reduces to the identity matrix.

From the multiplication of all matrices, we find that the structure is preserved, namely

$$
\mathcal{N}(N) \cdots \mathcal{N}(2)\, \mathcal{N}(1) = \begin{pmatrix} \Lambda & \mathbf{0} \\ \bar{\Psi} & \Omega(1) \end{pmatrix}, \tag{53}
$$

where $\bar{\Psi}$ is a 2 × (*N* − 1) matrix, whose elements are all zero except for those of the first column, namely

$$
\bar{\Psi}_{11} = \Psi(1)_{11} + \Psi(n)_{11}
$$

$$
\bar{\Psi}_{12} = \Psi(1)_{12} + \Psi(n)_{12}
$$

Furthermore, Λ is a diagonal matrix, with

$$\Lambda_{jj} = \mathcal{F}_X \frac{V^0}{V^1} = \frac{F(0) + g\tilde{E}}{F(1) + g\tilde{E}} \exp\left[\int_0^T dt\, F'(X(t))\right] \tag{54}$$

Therefore, the stability of the orbit is measured by the diagonal elements $\Lambda_{jj}$, together with the eigenvalues of $\Omega(1)$, which are associated with the pulse structure. In practice, $\mathcal{F}_X$ corresponds to the expansion rate from *X* = 0 to *X* = 1 under the action of the mean field *E*, and we recover a standard result for globally coupled identical oscillators: the spectrum is degenerate, all eigenvalues being equal and independent of the network size. The result is, however, not obvious in this context, because of the care needed to account for the various discontinuities. We have separately verified that the same conclusion holds for exponential spikes.

The stability of the synchronized state can also be addressed by determining the evaporation exponent $\Lambda_e$ (van Vreeswijk, 1996; Pikovsky et al., 2001), which measures the stability of a probe neuron subject to the mean field generated by the synchronous neurons, with no feedback toward them. By implementing this approach for a negative perturbation, van Vreeswijk found that $\Lambda_e$ is equal to $\Lambda_{jj}$ (for α-functions). By further assuming that $F' < 0$, he was able to prove that the synchronized state is stable for inhibitory coupling and sufficiently small α-values. The situation is more delicate for exponential pulse-shapes. As shown in di Volo et al. (2013), $\Lambda_e > 0$ ($\Lambda_e < 0$) depending on whether the perturbation is positive (negative). In this case, the Floquet exponent reported in Equation (54) coincides with the evaporation exponent estimated for negative perturbations. In Appendix B we show that the difference between left and right stability is to be attributed to the discontinuous shape of the pulse: no anomaly is expected for α pulses.

## **5. NUMERICAL ANALYSIS**

The theoretical approaches discussed in the previous sections allow determining: (1) the SW components of the Floquet spectrum for discontinuous velocity fields; (2) the leading LW exponents, directly in the thermodynamic limit, for generic velocity fields and pulse shapes in the weak-coupling limit. It would be possible to extend the finite-*N* results to other setups, but we do not think the effort would be worthwhile, given the large number of technicalities involved. We thus prefer to illustrate the expected behavior with the help of some simulations which, incidentally, cover a wider range than is accessible to the analytics.

More precisely, in this and the following section we study the models listed in **Table 1** in a standard setup (splay states) and under the effect of periodic external perturbations.

#### **5.1. FINITE PULSE WIDTH**

Here, we discuss the stability of the splay state for different degrees of smoothness of the velocity field at the borders of the unit interval for post-synaptic pulses of α-function type.

We start from discontinuous velocity fields. They have been the subject of an analytic study which proved that the SW component scales as 1/*N*² (Olmi et al., 2012). The data reported in **Figure 1A** for $F_1(X)$ confirm the expected scaling: the agreement with the theoretical curve derived in Olmi et al. (2012) is impressive over the entire spectral range, while the mean-field Equation (30) gives a very good estimate of the spectrum except at the shortest wavelengths, where it overestimates the numerical data. The mean-field approximation turns out to be more accurate for continuous velocity fields (with a discontinuity of the first derivative at the

**Table 1 | The first column reports the list of velocity fields** *F***(***X***) analyzed in the paper. All the considered fields are everywhere positive within the definition interval** *X* **∈ [0,1], thus ensuring that the neuron is supra-threshold. The second column refers to the continuity properties of the fields within the interval [0,1].**


*The function is labeled as discontinuous if F(0) ≠ F(1); it is C*⁽⁰⁾ *if F(0) = F(1) but F′(0) ≠ F′(1), and C*⁽¹⁾ *if F(0) = F(1) and F′(0) = F′(1). F(X) is C*⁽∞⁾ *if it is infinitely differentiable and each derivative is continuous at the extrema of the definition interval.*

borders of the definition interval). Indeed the agreement between the theoretical expression Equation (A10) and the numerical data is very good for the entire range [see **Figure 1B** which refers to *F*4(*X*)].

The numerical Floquet spectra for fields that are $C^{(0)}$, but not $C^{(1)}$ ($F(0) = F(1)$, $F'(0) \neq F'(1)$), are reported in **Figure 2** [the curves in panels (**A**, **B**) refer to $F_2(X)$ and $F_4(X)$, respectively]. For these velocity fields, we have also verified that the spectra scale as 1/*N*⁴, confirming the observation reported in Calamai et al. (2009) for a different velocity field with the same analytical properties. The data displayed in **Figures 2A,B** refer to the LW components: they are indeed independent of the system size and scale as 1/*k*⁴ (see the dashed line), as predicted by the perturbative theory discussed in Section 3.
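The finite-size scalings quoted above (1/*N*² for discontinuous fields, 1/*N*⁴ for $C^{(0)}$ ones) can be extracted from spectra computed at two system sizes through a log-log slope. A minimal sketch, with synthetic eigenvalues standing in for measured ones:

```python
import math

# Estimating the finite-size scaling exponent of a Floquet eigenvalue from
# two system sizes via a log-log slope. The eigenvalues below are synthetic
# stand-ins obeying |lambda| = c/N^4; they are not data from the paper.
def scaling_exponent(lam_small, lam_large, n_small, n_large):
    """Exponent q such that |lambda| ~ N^(-q)."""
    return math.log(lam_small / lam_large) / math.log(n_large / n_small)

c = 0.7
lam_64, lam_128 = c / 64.0 ** 4, c / 128.0 ** 4
q = scaling_exponent(lam_64, lam_128, 64, 128)   # recovers q close to 4
```

In practice one fits the slope over several sizes rather than just two, but the two-point estimate already discriminates between the 1/*N*² and 1/*N*⁴ classes.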

The spectra reported in the other two panels refer to analytic velocity fields: in all cases the initial part of the Floquet spectra is again independent of *N* and scales approximately exponentially with *k*, confirming that the scaling behavior of the exponents is related to the analyticity of the velocity field. The fluctuating background of approximate height $10^{-12}$ is just a consequence of the finite numerical accuracy. This is why we did not attempt to estimate the SW components, which would be exceedingly small.

#### **5.2. VANISHING PULSE-WIDTH**

Here, we analyze the intermediate case between finite pulse-width and δ-like impulses. Similarly to what was done in Zillmer et al. (2007) for the LIF, we consider α pulses with α = β*N*, where β is independent of *N*.

In **Figure 3A** we report the spectra for a discontinuous velocity field, $F_1(X)$. In this case the Floquet spectra remain finite, so that the corresponding states remain robustly stable even in the thermodynamic limit. Also in this case the agreement with the theoretical expression reported in Equation (7) of Olmi et al. (2012) is extremely good, while Equation (30) overestimates the spectra for large phases. The field considered in panel (**B**) ($F_2(X)$) is $C^{(0)}$ but not $C^{(1)}$. In this case, the Floquet spectra scale as 1/*N*: this scaling is predicted by the analysis reported in Section 3, and the whole spectrum is very well reproduced by Equation (A10).

**FIGURE 1 | Floquet spectra for α-pulses for (A) a discontinuous field** *F***1(***X***) and (B) a continuous field** *F***4(***X***).** The orange dotted line in **(A)** represents the theoretical curve estimated by using Equation (7) in Olmi et al. (2012), while the dashed maroon curve represents the theoretical curve estimated by using Equation (30) in section 3. In **(B)** the dashed maroon curve is calculated by using Equation (A10). All data refer to *a* = 1.3 and α = 3.

**FIGURE 2 | Floquet spectra for α-pulses for two continuous sinusoidal fields, namely** *F***2(***X***) (A) and** *F***4(***X***) (B); and two analytic fields, namely** *F***6(***X***) (C) and** *F***7(***X***) (D).** The dashed blue line in **(B)** indicates a scaling 1/*k*4. All data refer to *a* = 1.3 and α = 3.

Last but not least, we have studied an analytic field, namely $F_7(X)$. In this case the Floquet spectra appear to scale exponentially to zero with the wavevector *k*, similarly to what was observed for the finite pulse width, as shown in **Figure 4**.

#### **5.3. δ PULSES**

Finally, we considered the case of δ-pulses: whenever the potential $X^j$ reaches the threshold value, it is reset to zero and a spike is sent to, and *instantaneously* received by, all neurons. We studied just two cases: (1) the analytic field $F_7(X)$; (2) a leaky integrate-and-fire neuron model with $F_0(X)$. The results, obtained for inhibitory coupling [since the splay state is known to be stable only in such a case (van Vreeswijk, 1996; Zillmer et al., 2006)], are consistent with the expectation for the β model.

In particular, we found, in the analytic case (1), that the Floquet spectra decay exponentially to zero. The exponential scaling is not altered if a phase shift ζ is introduced in the velocity field [i.e., for $F(X) = a - 1 + e^{2\sin(2\pi X + \zeta)}$]. In the case of the LIF model ($F_0$),

we already know that, in the δ-pulse limit, the Lyapunov spectrum tends to (Zillmer et al., 2007)

$$\lim_{\beta \to \infty} \lambda_n = -1 + \frac{1}{T_0} \ln \left( \frac{a}{a-1} \right). \tag{55}$$

This result is confirmed by our simulations which also reveal that the splay state is stable even for small, excitatory coupling values, extending previous results limited to inhibitory coupling (Zillmer et al., 2006).

### **6. PERIODIC FORCING**

In this section we numerically investigate the scaling behavior of the Floquet spectrum in the presence of a periodic forcing, to test the validity of the previous analysis in a more general context. We have restricted our studies to splay-state-like regimes, where it is important to predict the behavior of the many almost marginally stable directions. Moreover, we have considered only the smooth α-pulses. In this case, the dynamical equations read

$$\begin{aligned} \dot{X}^j &= F(X^j) + gE + A\cos(\varphi), \qquad j = 1, \ldots, N, \\ \dot{E} &= P - \alpha E, \\ \dot{P} &= -\alpha P, \\ \dot{\varphi} &= \omega. \end{aligned} \tag{56}$$

They have been written in an autonomous form, since it is more convenient to perform the Poincaré section according to the spiking times, rather than introducing a stroboscopic map. The interspike interval is determined by the equation

$$\mathcal{T} = \int_{X_{\text{old}}}^{1} \frac{dX^{1}}{F(X^{1}) + gE + A\cos(\varphi)},\tag{57}$$

where $X^1$ is the membrane potential of the first neuron (the closest to threshold), and $X_{\text{old}}$ is its initial value.

We analyzed only those setups where the unperturbed splay state is stable. More precisely: the two discontinuous fields $F_0(X)$ and $F_1(X)$, the two $C^{(0)}$ fields ($F_2(X)$ and $F_3(X)$), and the analytic field $F_7(X)$. In all cases the external modulation induces a periodic modulation of the mean field *E* with a period $T_a = 2\pi/\omega$ equal to the period of the modulation. At the same time, we have verified that, although the forcing term has zero average (i.e., it does not change the average input current), the average interspike interval is slightly self-adjusted and, more importantly, there is no evidence of locking between the modulation and the frequency of the single neurons. In other words, the behavior is similar to the spontaneous partial synchronization observed in van Vreeswijk (1996) (where the modulation is self-generated).

Because of the unavoidable oscillations of the interspike intervals, it is necessary to identify the spike times with great care. In practice, we integrate Equation (56) with a fixed time step Δ*t*, employing a standard fourth-order Runge–Kutta integration scheme. At each time step we check whether $X^1 > 1$, in which case we go one step back and adopt the Hénon trick, which amounts to exchanging *t* and $X^1$ in the role of independent variable (Hénon, 1982).
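The event-location step can be sketched as follows; the single uncoupled LIF neuron used here ($\dot{X} = a - X$, threshold at 1) is an illustrative assumption, chosen because its crossing time, ln[*a*/(*a* − 1)], is known in closed form and can be used to validate the procedure:

```python
import math

def rk4_step(f, s, y, h):
    """One fourth-order Runge-Kutta step for dy/ds = f(s, y)."""
    k1 = f(s, y)
    k2 = f(s + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(s + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(s + h, y + h * k3)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

a = 1.3
def F(x):
    return a - x          # single LIF neuron, dX/dt = a - X (illustrative)

def spike_time(dt=1e-2, threshold=1.0):
    t, X = 0.0, 0.0
    while True:
        X_new = rk4_step(lambda s, x: F(x), t, X, dt)
        if X_new <= threshold:
            t, X = t + dt, X_new
            continue
        # Henon trick: discard the overshooting step and exchange the roles
        # of t and X, integrating dt/dX = 1/F(X) from the last sub-threshold
        # potential exactly up to the threshold X = 1.
        return rk4_step(lambda x, tt: 1.0 / F(x), X, t, threshold - X)

T_num = spike_time()
T_exact = math.log(a / (a - 1.0))   # analytic crossing time for this LIF
```

The same exchange of variables applies verbatim to the full system (56), with $X^1$ playing the role of the new independent variable over the final fractional step.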

The linear stability analysis can be performed by linearizing the system (56), to obtain

$$\begin{aligned} \dot{x}^j &= \frac{dF(X^j)}{dX^j}\, x^j + g\epsilon - A \sin(\varphi)\, \delta\varphi, \qquad j = 1, \ldots, N, \\ \dot{\epsilon} &= p - \alpha \epsilon, \\ \dot{p} &= -\alpha p, \\ \delta\dot{\varphi} &= 0; \end{aligned}$$

and thereby estimating the corresponding Lyapunov spectrum.
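In practice, the spectrum can be obtained with the standard Benettin procedure: a set of tangent vectors is evolved alongside the trajectory and periodically re-orthonormalized, accumulating the logarithms of their norms. A self-contained sketch on a linear two-dimensional test flow (chosen, as an assumption, so that the exponents −0.5 and −1.0 are known exactly):

```python
import math

def rk4(f, y, h):
    """One RK4 step for the autonomous vector field f acting on the list y."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + (h / 6.0) * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# Linear test flow dv/dt = diag(-0.5, -1.0) v: Lyapunov exponents -0.5, -1.0
def tangent_flow(v):
    return [-0.5 * v[0], -1.0 * v[1]]

def lyapunov_spectrum(t_total=50.0, dt=0.01, renorm_every=10):
    v = [[1.0, 0.0], [0.0, 1.0]]          # two tangent vectors
    sums = [0.0, 0.0]
    steps = int(t_total / dt)
    for n in range(1, steps + 1):
        v = [rk4(tangent_flow, vi, dt) for vi in v]
        if n % renorm_every == 0:
            # Gram-Schmidt: orthogonalize v[1] against v[0], normalize both,
            # and accumulate the logarithms of the norm growth factors
            n0 = math.hypot(v[0][0], v[0][1])
            dot = (v[0][0] * v[1][0] + v[0][1] * v[1][1]) / n0 ** 2
            v[1] = [v[1][i] - dot * v[0][i] for i in range(2)]
            n1 = math.hypot(v[1][0], v[1][1])
            sums[0] += math.log(n0)
            sums[1] += math.log(n1)
            v[0] = [x / n0 for x in v[0]]
            v[1] = [x / n1 for x in v[1]]
    return [s / (steps * dt) for s in sums]

lam = lyapunov_spectrum()
```

For the full network the tangent flow is the linearized system above, evaluated along the simulated trajectory, and one tangent vector is carried per exponent of interest.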

In the case of $F_0$ and $F_1$, we have always found that the Lyapunov spectrum scales as 1/*N*², as theoretically predicted in the absence of external modulation (see **Figure 5** for one instance of each of the two velocity fields).

A similar agreement is also found for $F_3$, where the Lyapunov spectrum scales as 1/*N*⁴, exactly as in the absence of external forcing (see **Figure 6**). Analogous results have been obtained for the other velocity fields (data not shown), confirming that the validity of the previous analysis extends to more complex dynamical regimes, as long as the membrane potentials are smoothly distributed.

# **7. SUMMARY AND OPEN PROBLEMS**

In this paper we have discussed the linear stability of both fully synchronized and splay states in pulse-coupled networks of identical oscillators. By following Abbott and van Vreeswijk (1993), we have obtained analytic expressions for the long-wavelength components of the Floquet spectra of the splay state for generic velocity fields and post-synaptic potential profiles. The structure of the spectra depends on the smoothness of both the velocity field and the transmitted pulses. The smoother they are, the faster the eigenvalues decrease with the wavelength of the corresponding eigenvectors. In practice, while splay states arising in LIF neurons with δ-pulses have a finite degree of (in)stability along all directions, those emerging in analytic velocity fields have many exponentially small eigenvalues. These results have been derived in the mean field framework, where the system is assumed to be infinite. Although realistic neural networks are finite, the present

**FIGURE 5 | Lyapunov spectra for neurons forced by an external periodic signal, we observe the scaling 1/***N***<sup>2</sup> for the discontinuous velocity fields (A)** *F***0(***X***) and (B)** *F***1(***X***).** In both cases *A* = 0.1, *Ta* = 2.

analysis predicts correctly, even for finite systems, the stability of the eigenmodes associated to the fastest scales and the order of magnitude of the eigenvalues corresponding to slower time scales. Interestingly, the scaling behavior of the eigenvalues carries over to that of the Lyapunov exponents, when the network is periodically forced, suggesting that our results have a relevance that goes beyond the highly symmetric solutions studied in this paper.

Finally, we derived an analytic expression for the Floquet spectra for the fully synchronous state. In this case the exponents associated to the dynamics of the membrane potentials are all identical, as it happens for the diffusive coupling, but here the result is less trivial, due to the fact that one must take into account that arbitrarily close to the solution, the ordering of the neurons may be different. Moreover, the value of the (degenerate) Floquet exponent coincides with the evaporation exponent (van Vreeswijk, 1996; Pikovsky et al., 2001) whenever the pulses are sufficiently smooth, while for discontinuous pulses (like exponential and δ-spikes) the equivalence is lost (see also di Volo et al., 2013).

For discontinuous velocity fields, another important property that has been confirmed by our analysis is the role of the ratio *R* = *N*/(*T*0α) between the width of the single pulse (1/α) and the average interspike interval of the whole network (*T* = *T*0/*N*). In fact, it turns out that the asynchronous regimes can be strongly stable along all directions only when *R* remains finite in the thermodynamic limit (and is possibly small). This includes the idealized case of δ-like pulses, but also setups where the single pulses are so short that they can be resolved by the single neurons. Mathematically speaking, this result implies that the thermodynamic limit does not commute with the limit of a zero pulse-width. It would be interesting to check to what extent this property extends to more realistic models. A first confirmation is contained in Pazó and Montbrió (2013), where the authors find a similar property in a network of Winfree oscillators.

Among the possible extensions of our analysis, one should definitely mention the inclusion of delay in the pulse transmission. This generalization is far from trivial, as it modifies the phase diagram of the possible states (see Bär et al., 2012 for a recent brief overview of the possible scenarios) and it complicates noticeably the stability analysis of the synchronized phase. An analytic treatment of this latter case is reported in Timme et al. (2002) for generic velocity fields and excitatory δ-pulses.

#### **ACKNOWLEDGMENTS**

We thank David Angulo Garcia for the help in the use of symbolic algebra software. Alessandro Torcini acknowledges financial support from the European Commission through the Marie Curie Initial Training Network "NETT," project N. 289146, as well as from the Italian Ministry of Foreign Affairs for the activity of the Joint Italian-Israeli Laboratory on Neuroscience. Simona Olmi and Alessandro Torcini thank the Italian MIUR project CRISIS LAB PNR 2011–2013 for economic support and the German Collaborative Research Center SFB 910 of the Deutsche Forschungsgemeinschaft for the kind hospitality at Physikalisch-Technische Bundesanstalt in Berlin during the final write-up of this manuscript.

#### **REFERENCES**

Abbott, L. F., and van Vreeswijk, C. (1993). Asynchronous states in networks of pulse-coupled oscillators. *Phys. Rev. E* 48, 1483. doi: 10.1103/PhysRevE.48.1483


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 November 2013; paper pending published: 13 December 2013; accepted: 13 January 2014; published online: 04 February 2014.*

*Citation: Olmi S, Torcini A and Politi A (2014) Linear stability in networks of pulsecoupled neurons. Front. Comput. Neurosci. 8:8. doi: 10.3389/fncom.2014.00008 This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Olmi, Torcini and Politi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **APPENDICES**

## **A. FOURIER COMPONENTS OF THE PHASE RESPONSE CURVE**

In this appendix we briefly outline how the explicit expressions of $A_n$ and $B_n$, defined in Equation (23), can be derived in the large-*n* limit for a velocity field *F*(*X*) that is either discontinuous, or continuous with discontinuous first derivative at the borders of the definition interval.

The integration interval [0, 1] appearing in Equation (23) is split into *n* sub-intervals of length 1/*n*, and the original equation can be rewritten as

$$A_n + iB_n = \sum_{k=1}^n \int_{(k-1)/n}^{k/n} dy\, \frac{e^{i2\pi n y}}{G(y)}.\tag{A1}$$

For *n* sufficiently large, we can assume that the variation of 1/*G*(*y*) is quite limited within each sub-interval, and we can approximate the function as follows, up to second order,

$$\begin{split} \frac{1}{G(y)} &= \frac{1}{g + T_0 F(y_0)} \left\{ 1 - \frac{T_0 F'(y_0)}{g + T_0 F(y_0)} (y - y_0) \right. \\ &\left. + \left[ \left( \frac{T_0 F'(y_0)}{g + T_0 F(y_0)} \right)^2 - \frac{T_0 F''(y_0)}{2(g + T_0 F(y_0))} \right] (y - y_0)^2 \right\} \end{split}$$

where $y_0 = (k-1)/n$ is the lower extremum of the *k*th sub-interval.

By inserting these expansions into Equation (A1) and performing the integration over the *n* sub-intervals, we can determine approximate expressions for $A_n$ and $B_n$. The estimate of $A_n$ involves integrals containing cos(2π*ny*); it is easy to show that the integral over each sub-interval is zero if the integrand multiplying the cosine term is constant or linear in *y*; therefore the only non-zero terms are

$$\int_{(k-1)/n}^{k/n} dy\, \cos(2\pi ny)\, y^2 = \frac{1}{2\pi^2 n^3}.\tag{A2}$$

This allows us to rewrite

$$A_n = \frac{1}{2\pi^2 n^2} \sum_{k=1}^n H_2\left(\frac{k-1}{n}\right) \frac{1}{n} = \frac{1}{2\pi^2 n^2} \left[ \int_0^1 dx\, H_2(x) \right] + \mathcal{O}\left(\frac{1}{n^3}\right) \tag{A3}$$

where

$$H_2(x) = \frac{(T_0 F'(x))^2}{(g + T_0 F(x))^3} - \frac{T_0 F''(x)}{2(g + T_0 F(x))^2}. \tag{A4}$$

It is easy to verify that $H_2(x)$ admits an exact primitive, so that the integral appearing in Equation (A3) can be performed explicitly, leading to the expression reported in Equation (28).

The estimate of $B_n$ is more delicate, since integrals containing sin(2π*ny*) are now involved. The only vanishing integrals over the sub-intervals are those with a constant integrand multiplying the sine term, and therefore the estimate of $B_n$ reduces to

$$\begin{aligned} B_n &= \sum_{k=1}^n H_1 \left( \frac{k-1}{n} \right) \int_{(k-1)/n}^{k/n} dy\, \sin(2\pi n y)\, y \\ &+ \sum_{k=1}^n H_2 \left( \frac{k-1}{n} \right) \int_{(k-1)/n}^{k/n} dy\, \sin(2\pi n y) \left( y^2 - 2y \frac{k-1}{n} \right) \end{aligned}$$

where

$$H_1(x) = -\frac{T_0 F'(x)}{(g + T_0 F(x))^2},\tag{A5}$$

and the non-zero integrals are

$$\int_{(k-1)/n}^{k/n} dy\, \sin(2\pi n y)\, y = -\frac{1}{2\pi n^2},\tag{A6}$$

and

$$\int_{(k-1)/n}^{k/n} dy\, \sin(2\pi n y)\, y^2 = \frac{1-2k}{2\pi n^3}.\tag{A7}$$

This allows us to rewrite $B_n$ as

$$B_n = -\frac{1}{2\pi n} \sum_{k=1}^n H_1\left(\frac{k-1}{n}\right) \frac{1}{n} - \frac{1}{2\pi n^2} \sum_{k=1}^n H_2\left(\frac{k-1}{n}\right) \frac{1}{n}.\tag{A8}$$

We can then return to a continuous variable by rewriting (A8), up to $\mathcal{O}(1/n^3)$ corrections, as

$$B_n = -\frac{1}{2\pi n} \left[ \int_0^1 H_1(x) dx - \frac{H_1(1) - H_1(0)}{2n} \right] - \frac{1}{2\pi n^2} \int_0^1 H_2(x) dx.\tag{A9}$$

The expression Equation (29) is finally obtained by noticing that the primitive of *H*2(*x*) is *H*1(*x*)/2, and that

$$\int_0^1 H_1(x)dx = \frac{1}{(g + T_0 F(1))} - \frac{1}{(g + T_0 F(0))}.$$
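These asymptotic expressions can be checked numerically by comparing the brute-force Fourier integrals of Equation (A1) with the leading-order estimates $A_n \simeq \frac{1}{2\pi^2 n^2}\int_0^1 H_2\,dx$ and $B_n \simeq -\frac{1}{2\pi n}\int_0^1 H_1\,dx$. The field used below ($g = 0.5$, $T_0 = 1$, $F(y) = 1 + y$, hence discontinuous at the borders) is an illustrative assumption, not one of the fields of Table 1:

```python
import math

# Numerical check of the large-n asymptotics: compare the brute-force Fourier
# integrals of Eq. (A1) against the leading-order estimates built from H1 and
# H2. The field below (g = 0.5, T0 = 1, F(y) = 1 + y, so F(0) != F(1)) is an
# illustrative assumption.
g, T0 = 0.5, 1.0
def F(y):   return 1.0 + y
def Fp(y):  return 1.0          # F'
def Fpp(y): return 0.0          # F''
def G(y):   return g + T0 * F(y)

def H1(x):  # Eq. (A5)
    return -T0 * Fp(x) / G(x) ** 2

def H2(x):  # Eq. (A4)
    return (T0 * Fp(x)) ** 2 / G(x) ** 3 - T0 * Fpp(x) / (2.0 * G(x) ** 2)

def trapz(f, lo, hi, m):
    """Composite trapezoidal quadrature with m sub-intervals."""
    h = (hi - lo) / m
    return h * (0.5 * f(lo) + 0.5 * f(hi)
                + sum(f(lo + i * h) for i in range(1, m)))

n = 32
# Fourier components of 1/G, Eq. (A1), by direct quadrature
A_n = trapz(lambda y: math.cos(2 * math.pi * n * y) / G(y), 0.0, 1.0, 200000)
B_n = trapz(lambda y: math.sin(2 * math.pi * n * y) / G(y), 0.0, 1.0, 200000)
# Leading-order asymptotic predictions
A_pred = trapz(H2, 0.0, 1.0, 2000) / (2.0 * math.pi ** 2 * n ** 2)
B_pred = -trapz(H1, 0.0, 1.0, 2000) / (2.0 * math.pi * n)
```

For this discontinuous field the relative deviations are already at the per-cent level at *n* = 32, consistent with the $\mathcal{O}(1/n)$ corrections of the expansion.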

For continuous velocity fields, $B_n$ vanishes at this order, so that we can derive from Equation (26) an exact expression for the real part of the Floquet spectrum in the case of even *L* (for odd *L* the equivalent expression is given by Equation (31))

$$\operatorname{Re}\{\lambda_n\} = \frac{g K S T_0^{L+1}(-1)^{L/2}}{(2\pi n)^{(L+2)}} \frac{F'(0) - F'(1)}{G(1)^2}.\tag{A10}$$

A rigorous validation of the above formula would require going one order further in the 1/*n* expansion of $B_n$, a task that is utterly complicated. In the specific case of the quadratic integrate-and-fire neuron (or θ-neuron), $F(X) = a - X(X - 1)$, it can, however, be analytically verified that $B_n$ is exactly zero. Moreover, Equation (A10) is in very good agreement with the numerically estimated Floquet spectra for two other continuous velocity fields, namely $F_4(X)$ and $F_2(X)$, as shown in **Figures 1**, **3**, respectively. As a consequence, it is reasonable to conjecture that Equation (29) is correct up to order $\mathcal{O}(1/n^4)$.

## **B. EVAPORATION EXPONENT FOR THE LIF MODEL**

In this appendix we determine the (left and right) evaporation exponent for a synchronous state of a network of LIF neurons. This is done by estimating how the potential of a probe neuron, forced by the mean field generated by the network activity, converges toward the synchronized state. The stability analysis is performed by following the evolution of a perturbed probe neuron. Let us first consider an initial condition where the synchronized cluster has just reached the threshold ($X_c = 1$), while the probe neuron is lagging behind at a distance $\delta_i$. Such a distance is equivalent to a delay $t_d$

$$t_d = \frac{\delta_i}{F^+(1)},\tag{A11}$$

where the subscript "+" means that the velocity field is estimated just after the pulses have been emitted. Over the time *td*, the potential of the cluster increases from the reset value 0 to

$$
\delta_{\epsilon} = F^{+}(0)\, t_d = \frac{F^{+}(0)}{F^{+}(1)}\,\delta_i.\tag{A12}
$$

From now on (in LIF neurons), the distance decreases exponentially, reaching the value

$$
\delta_f = \delta_\epsilon e^{-T},\tag{A13}
$$

after a period *T*. As a result,

$$\frac{\delta_f}{\delta_i} = \frac{F^+(0)}{F^+(1)}e^{-T} = \frac{a + gE^+}{a - 1 + gE^+}\, e^{-T}.\tag{A14}$$

The logarithm of the expansion factor gives the left evaporation exponent

$$
\Lambda_e^l = \ln \left( \frac{a + gE^+}{a - 1 + gE^+} \right) - T. \tag{A15}
$$

Let us now consider a probe neuron which precedes the synchronized cluster by an amount $\delta_i$. After a time *T* the distance becomes

$$
\delta_{\epsilon} = \delta_i e^{-T} \tag{A16}
$$

since no reset event has meanwhile occurred. Such a distance corresponds to a delay

$$t_d = \frac{\delta_\epsilon}{F^-(1)},\tag{A17}$$

where the subscript "−" means that the velocity has now to be estimated just before the pulse emission. By proceeding as before one obtains,

$$\frac{\delta_f}{\delta_i} = \frac{F^-(0)}{F^-(1)} e^{-T}, \tag{A18}$$

so that the right evaporation exponent reads

$$
\Lambda_e^r = \ln \left( \frac{a + gE^-}{a - 1 + gE^-} \right) - T. \tag{A19}
$$

It is easy to see that the left and right exponents differ if and only if $E^- \neq E^+$, i.e., if the pulses themselves are not continuous: this is, for instance, the case of exponential and δ pulses.
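The asymmetry can be made concrete by evaluating Equations (A15) and (A19) for field values that jump across the spike; the parameter values below are illustrative assumptions:

```python
import math

# Left and right evaporation exponents of Eqs. (A15) and (A19) for the LIF
# model. The parameter values (a, g, T, E+, E-) are illustrative assumptions.
def evaporation_exponent(a, g, E, T):
    """ln[(a + gE)/(a - 1 + gE)] - T, evaluated at the field value E seen by
    the probe neuron on the relevant side of the spike."""
    return math.log((a + g * E) / (a - 1.0 + g * E)) - T

a, g, T = 1.3, -0.1, 2.0
# Smooth (alpha) pulses: the field is continuous across the spike, E- = E+
lam_l = evaporation_exponent(a, g, 0.8, T)   # left exponent, uses E+
lam_r = evaporation_exponent(a, g, 0.8, T)   # right exponent, uses E-
# Discontinuous (exponential or delta) pulses: E jumps across the spike
lam_r_jump = evaporation_exponent(a, g, 0.4, T)
```

With continuous pulses the two exponents coincide, while any jump of *E* across the spike separates them, which is the anomaly discussed above.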

# Macroscopic complexity from an autonomous network of networks of theta neurons

# *Tanushree B. Luke, Ernest Barreto and Paul So\**

*School of Physics, Astronomy, and Computational Sciences and The Krasnow Institute for Advanced Study, George Mason University, Fairfax, VA, USA*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Jianbo Gao, Wright State University, USA Yasuhiro Tsubo, Ritsumeikan University, Japan*

*\*Correspondence: Paul So, George Mason University, Mail Stop 2A1, Fairfax, VA 22030, USA. e-mail: paso@gmu.edu*

We examine the emergence of collective dynamical structures and complexity in a network of interacting populations of neuronal oscillators. Each population consists of a heterogeneous collection of globally-coupled theta neurons, which are a canonical representation of Type-1 neurons. For simplicity, the populations are arranged in a fully autonomous driver-response configuration, and we obtain a full description of the asymptotic macroscopic dynamics of this network. We find that the collective macroscopic behavior of the response population can exhibit equilibrium and limit cycle states, multistability, quasiperiodicity, and chaos, and we obtain detailed bifurcation diagrams that clarify the transitions between these macrostates. Furthermore, we show that despite the complexity that emerges, it is possible to understand the complicated dynamical structure of this system by building on the understanding of the collective behavior of a single population of theta neurons. This work is a first step in the construction of a mathematically-tractable network-of-networks representation of neuronal network dynamics.

**Keywords: theta neuron, type-I neuron, hierarchical network, neural field, macroscopic behavior, coherence, synchrony, chaos**

# **1. INTRODUCTION**

The brain is a complex hierarchical network of networks (Zhou et al., 2006; Bullmore and Sporns, 2009; Meunier et al., 2010). Neurons are organized into different neuronal assemblies, and these neuronal assemblies interact with each other, forming larger assemblies (Sherrington, 1906; Hebb, 1949; Harris, 2005). But while there is a wealth of knowledge on the microscopic scale regarding the dynamics of individual neurons, the macroscopic behavior of such interacting populations of neurons is not well understood. Indeed, the functional and information-processing activity of the brain, from perception to consciousness, is thought to result from the emergent collective behavior of these assemblies.

In recent years, the mathematical study of networks of this kind, based on globally-coupled populations of simple phase oscillators, has advanced significantly. This is in large part due to new analytical techniques (Ott and Antonsen, 2008, 2009; Marvel et al., 2009; Ott et al., 2011; Pikovsky and Rosenblum, 2011). These techniques enable the derivation of low-dimensional dynamical systems that reveal the collective emergent behavior of the full discrete population (in the limit of an infinite number of interacting elements). In the context of computational neuroscience, these methods were applied to autonomous globally-coupled networks of canonical Type-I neurons (i.e., theta neurons) by Luke et al. (2013), and to non-autonomous theta neuron networks by So et al. (2014). More recently, Laing (2014) extended these results to include space-dependent coupling. A similar approach, based on phase-response curves, was pursued by Pazó and Montbrió (2014).

Of course, such networks lack the intricate connectivity found in real biological networks. Nevertheless, they are ideal building blocks for the construction of a more realistic, yet mathematically tractable, network-of-networks representation of the brain. In the current study, we consider the simplest hierarchical structure as a first step in this process. Using two globally-coupled networks of theta neurons, we arrange for the activity of one population to drive the second population. Thus, the overall network has an autonomous driver-response configuration. We demonstrate that even in this simplest network-of-networks, the collective behavior of the response network can exhibit a full range of complex behavior, from simple collective rhythms to temporally chaotic dynamics. Most importantly, we provide a complete non-linear dynamical analysis of this system, including predictive bifurcation diagrams for the behavior of the response population in terms of the driver's dynamics and the network characteristics.

# **2. RECAP OF SINGLE POPULATION RESULTS**

### **2.1. THE THETA NEURON**

Neurons are typically classified into two types, based on the nature of the onset of spiking as a constant injected current exceeds an effective threshold (Hodgkin, 1948; Ermentrout, 1996; Izhikevich, 2007). Type-I neurons begin to spike at an arbitrarily low rate, whereas Type-II neurons spike at a non-zero rate as soon as the threshold is exceeded. Neurophysiologically, excitatory pyramidal neurons are often of Type-I, and fast-spiking inhibitory interneurons are often of Type-II (Nowak et al., 2003; Tateno et al., 2004). Near the onset of spiking, Type-I neurons can be represented by a canonical phase model that features a saddle-node bifurcation on an invariant cycle, or SNIC bifurcation (Ermentrout and Kopell, 1986; Ermentrout, 1996). This model has come to be known as the theta neuron, and is given by

$$\dot{\theta} = (1 - \cos \theta) + (1 + \cos \theta)\eta,\tag{1}$$

where θ is a phase variable on the unit circle and η is a bifurcation parameter related to the injected current. For η < 0, the neuron is attracted to a stable equilibrium which represents the resting state. An unstable equilibrium is also present, representing the threshold. If an external stimulus pushes the neuron's phase across the unstable equilibrium, θ will move around the circle and approach the resting equilibrium from the other side. When θ crosses θ = π, the neuron is said to have spiked. Thus, for η < 0, the neuron is excitable. As the parameter η increases, these equilibria approach each other and merge via the SNIC bifurcation at η = 0. At this point, the equilibria disappear, leaving a limit cycle. The neuron spikes regularly for η > 0. In the following, we call η the "excitability parameter."
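The two regimes of Equation (1) can be made concrete with a short numerical sketch. The code below is our own illustration (not from the paper): it integrates the theta neuron with a fixed-step RK4 scheme and counts spikes as upward crossings of θ = π.

```python
import numpy as np

def simulate_theta(eta, theta0=0.0, T=100.0, dt=0.01):
    """Integrate the theta neuron, Eq. (1), with fixed-step RK4.
    Returns the unwrapped phase trajectory."""
    def f(th):
        return (1.0 - np.cos(th)) + (1.0 + np.cos(th)) * eta
    n = int(T / dt)
    theta = np.empty(n + 1)
    theta[0] = theta0
    for i in range(n):
        th = theta[i]
        k1 = f(th)
        k2 = f(th + 0.5 * dt * k1)
        k3 = f(th + 0.5 * dt * k2)
        k4 = f(th + dt * k3)
        theta[i + 1] = th + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return theta

def count_spikes(theta):
    """Count upward crossings of theta = pi (mod 2*pi), i.e., spikes."""
    branch = np.floor((theta - np.pi) / (2 * np.pi))
    return int(np.sum(branch[1:] > branch[:-1]))
```

For η = 0.1 the neuron fires regularly with period π/√η ≈ 9.9, so roughly ten spikes occur in 100 time units; for η = −0.1 the phase relaxes to the stable resting equilibrium and no spike occurs.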

#### **2.2. A NETWORK OF THETA NEURONS**

We formulate a single population of *N* theta neurons as follows:

$$\dot{\theta}\_j = \left(1 - \cos \theta\_j\right) + \left(1 + \cos \theta\_j\right)\left[\eta\_j + I\_{\text{syn}}\right],\tag{2}$$

where *j* = 1,..., *N* is the index for the *j*-th neuron. The neurons are coupled via a pulse-like synaptic current

$$I\_{\text{syn}} = \frac{k}{N} \sum\_{i=1}^{N} P\_n(\theta\_i), \tag{3}$$

where *Pn*(θ) = *an*(1 − cos θ)<sup>*n*</sup>, *n* ∈ ℕ, and *an* is a normalization constant<sup>1</sup> such that

$$\int\_0^{2\pi} P\_n(\theta)d\theta = 2\pi.$$

The parameter *n* defines the sharpness of the pulse-like synapse in that *Pn*(θ) becomes more and more sharply peaked as *n* increases. We assume that the synaptic strength *k* is the same for all neurons.

Note that the connectivity described by Equations (2) and (3) includes self-coupling terms. These have negligible effect on the collective network dynamics (data not shown), which is to be expected since they represent only one out of *N* inputs to any given neuron. Nevertheless, we note that these self-connections have real-world analogs in "autapses," which have been found in several regions of the brain (e.g., Bacci et al., 2003; Bekkers, 2003).

Neurons in real biological networks exhibit a range of different intrinsic dynamics. We model this by taking the excitability parameter η*<sup>j</sup>* of each neuron to be different, with each η*<sup>j</sup>* being drawn randomly from a distribution *g*(η). In the following analysis, we assume a Lorentzian distribution,

$$g(\eta) = \frac{1}{\pi} \frac{\Delta}{(\eta - \eta\_0)^2 + \Delta^2},\tag{4}$$

$$\,^1 a\_n = 2\pi \Big/ \int\_{-\pi}^{\pi} (1 - \cos x)^n \, dx = n!/(2n-1)!!$$

where η<sup>0</sup> is the center of the distribution, and Δ, the half-width at half-maximum, describes the degree of heterogeneity in the population.

#### **2.3. REDUCTION AND ASYMPTOTIC STATES OF THE SINGLE POPULATION**

The macroscopic behavior of our network can be quantified by the "macroscopic mean field," or order parameter, defined as

$$\tilde{z}(t) = \frac{1}{N}\sum\_{j=1}^{N} e^{i\theta\_j},\tag{5}$$

where the tilde indicates that the sum is over a finite population of *N* oscillators. (Below we will drop the tilde in the case of an infinite network.) The magnitude of the order parameter |*z*˜(*t*)| ∈ [0, 1] quantifies the degree of synchronization present at time *t*.
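The finite network of Equations (2)–(4) and its order parameter (Equation 5) can be simulated directly. The sketch below is our own illustration, not the authors' code: it uses simple Euler stepping and, as a pragmatic numerical choice on our part, truncates the heavy tails of the Lorentzian sample so that a fixed step size remains adequate. The default parameters correspond to the driver's PSR values used later in Section 4.1 (η<sup>0</sup> = −0.2, Δ = 0.1, *k* = −2).

```python
import numpy as np

def lorentzian_sample(eta0, delta, size, rng):
    """Inverse-CDF sample of the Lorentzian distribution g(eta), Eq. (4)."""
    eta = eta0 + delta * np.tan(np.pi * (rng.random(size) - 0.5))
    # Truncate the heavy tails so a fixed-step integrator stays accurate
    # (a practical simplification on our part, not part of the model).
    return np.clip(eta, eta0 - 50.0 * delta, eta0 + 50.0 * delta)

def simulate_network(N=1000, n=2, k=-2.0, eta0=-0.2, delta=0.1,
                     T=30.0, dt=0.005, seed=0):
    """Euler integration of the finite network, Eqs. (2)-(3); returns the
    order-parameter time series of Eq. (5)."""
    rng = np.random.default_rng(seed)
    a_n = {1: 1.0, 2: 2.0 / 3.0, 3: 2.0 / 5.0}[n]   # a_n = n!/(2n-1)!!
    eta = lorentzian_sample(eta0, delta, N, rng)
    theta = rng.uniform(-np.pi, np.pi, N)
    z = np.empty(int(T / dt), dtype=complex)
    for i in range(len(z)):
        I_syn = (k / N) * np.sum(a_n * (1.0 - np.cos(theta)) ** n)
        theta = theta + dt * ((1.0 - np.cos(theta))
                              + (1.0 + np.cos(theta)) * (eta + I_syn))
        z[i] = np.mean(np.exp(1j * theta))
    return z
```

By construction |z̃(*t*)| never exceeds one, since it is the mean of unit-modulus phasors.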

In Luke et al. (2013), we used the Ott-Antonsen method (Ott and Antonsen, 2008, 2009; Ott et al., 2011) to derive a low-dimensional dynamical system whose asymptotic dynamics can be shown to coincide with that of the order parameter of the single-population network defined above (Equations 2–4), in the limit *N* → ∞. This reduced dynamical system is

$$\dot{z} = -i\frac{(z-1)^2}{2} + \frac{(z+1)^2}{2}\left\{-\Delta + i\left[\eta\_0 + kH\_n(z)\right]\right\},\qquad(6)$$

where

$$H\_n(z) = I\_{\rm syn}/k = a\_n \left( A\_0 + \sum\_{q=1}^n A\_q (z^q + z^{\*q}) \right), \tag{7}$$

$$A\_q = \sum\_{j,m=0}^{n} \delta\_{j-2m,q} Q\_{jm},\tag{8}$$

and

$$Q\_{jm} = \frac{(-1)^{j-2m} n!}{2^j m! (n-j)! (j-m)!}. \tag{9}$$

In these equations, *z*<sup>∗</sup> denotes the complex conjugate of *z*, and δ*i*,*<sup>j</sup>* is the Kronecker delta on the indices (*i*, *j*). Note that *Hn*(*z*) = *H*<sup>∗</sup>*<sup>n</sup>*(*z*), so *Hn*(*z*) is a real-valued function.
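The coefficients in Equations (7)–(9) are straightforward to evaluate numerically. The sketch below is our own check, not the authors' code; for *n* = 2 it yields *A*<sup>0</sup> = 3/2, *A*<sup>1</sup> = −1, *A*<sup>2</sup> = 1/4, and *a*<sup>2</sup> = 2/3. Two limiting cases serve as sanity checks: the incoherent state *z* = 0 gives *Hn* = 1 (the normalized pulse average), while full synchrony at θ = 0, i.e., *z* = 1, gives *Hn* = 0, since *Pn*(0) = 0.

```python
import math
import numpy as np

def Q(n, j, m):
    """Q_jm of Eq. (9); zero outside the admissible index range."""
    if not (0 <= m <= j <= n):
        return 0.0
    return ((-1) ** (j - 2 * m) * math.factorial(n)
            / (2 ** j * math.factorial(m)
               * math.factorial(n - j) * math.factorial(j - m)))

def A(n, q):
    """A_q of Eq. (8): sum of Q_jm over all (j, m) with j - 2m = q."""
    return sum(Q(n, j, m)
               for j in range(n + 1) for m in range(n + 1) if j - 2 * m == q)

def H(n, z):
    """H_n(z) of Eq. (7), with a_n = n!/(2n-1)!! (footnote 1)."""
    a_n = math.factorial(n) / math.prod(range(1, 2 * n, 2))
    s = A(n, 0) + sum(A(n, q) * (z ** q + np.conj(z) ** q)
                      for q in range(1, n + 1))
    return a_n * float(np.real(s))
```

Since each term *z<sup>q</sup>* + *z*<sup>∗</sup>*<sup>q</sup>* is real, the realness of *Hn* is automatic in this form.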

The analysis of Equations (6–9) reported in Luke et al. (2013) showed that the theta neuron network can exhibit three types of asymptotic states. These correspond to a node, a focus, and a limit cycle in the order parameter. A complete bifurcation analysis describing how these states change as the parameters *k*, η0, and Δ change was also reported. For our purposes in the current work, we now briefly describe the three possible collective macroscopic states.

We called the node, focus, and limit cycle solutions the "Partially Synchronous Rest" (PSR), "Partially Synchronous Spiking" (PSS), and "Collective Periodic Wave" (CPW) states, respectively. In the PSR state, most neurons remain at rest, while in the PSS state, most neurons spike continuously. Nevertheless, in both these states, the macroscopic mean field (or order parameter) sits at an equilibrium. In contrast, the CPW state corresponds to periodic oscillations of the complex order parameter, and typically, both |*z*(*t*)| and arg (*z*) oscillate in time indicating that the individual neurons clump together and spread apart in a periodic fashion. We refer the interested reader to Luke et al. (2013) for further details, including movies that illustrate both the microscopic and macroscopic behaviors of these collective states.

# **3. FORMULATION OF THE DRIVER-RESPONSE NETWORK**

In this work, we are interested in the dynamics exhibited by a network of two coupled populations of theta neurons. We formulate the general case, but restrict analysis to the simplest such configuration: a driver-response network.

### **3.1. GENERAL TWO-POPULATION MODEL**

Extending the model described above, a general formulation of a pair of interacting populations of theta neurons can be expressed as follows:

$$\begin{aligned} \dot{\theta}\_{1,j} &= 1 + \eta\_{1,j} - (1 - \eta\_{1,j})\cos\theta\_{1,j} + a\_n(1 + \cos\theta\_{1,j}) \\ &\left[\frac{k\_{11}}{N\_1} \sum\_{p=1}^{N\_1} (1 - \cos\theta\_{1,p})^n + \frac{k\_{12}}{N\_2} \sum\_{q=1}^{N\_2} (1 - \cos\theta\_{2,q})^n\right], \\ \dot{\theta}\_{2,j} &= 1 + \eta\_{2,j} - (1 - \eta\_{2,j})\cos\theta\_{2,j} + a\_n(1 + \cos\theta\_{2,j}) \\ &\left[\frac{k\_{21}}{N\_1} \sum\_{p=1}^{N\_1} (1 - \cos\theta\_{1,p})^n + \frac{k\_{22}}{N\_2} \sum\_{q=1}^{N\_2} (1 - \cos\theta\_{2,q})^n\right], (10) \end{aligned}$$

where θ1,*<sup>j</sup>* and θ2,*<sup>j</sup>* denote the *j*th neuron in the first and second populations, respectively, and the extension to any number of interacting populations is straightforward. The excitability parameters η1,*<sup>j</sup>* and η2,*<sup>j</sup>* are randomly drawn from two independent Lorentzian distributions as in Equation (4), with medians η1, η<sup>2</sup> and widths Δ<sup>1</sup>, Δ<sup>2</sup>, respectively. We take the sharpness parameter of the pulse-like synaptic interaction, *n*, to be the same for both populations. Macroscopic mean field parameters *z*˜1(*t*), *z*˜2(*t*) can be defined for each population by analogy with Equation (5).

Adapting the procedures described in Luke et al. (2013), we derived the Ott-Antonsen reduction of the coupled networks of Equation (10). This resulted in the following dynamical system:

$$\begin{aligned} \dot{z}\_1 &= -i\frac{(z\_1 - 1)^2}{2} + \frac{(z\_1 + 1)^2}{2}\left\{-\Delta\_1 + i\left[\eta\_1 + k\_{11}H\_n(z\_1) + k\_{12}H\_n(z\_2)\right]\right\},\\ \dot{z}\_2 &= -i\frac{(z\_2 - 1)^2}{2} + \frac{(z\_2 + 1)^2}{2}\left\{-\Delta\_2 + i\left[\eta\_2 + k\_{21}H\_n(z\_1) + k\_{22}H\_n(z\_2)\right]\right\}. \end{aligned}\tag{11}$$

with *Hn*(*z*) defined as in Equations (7–9). As before, the asymptotic dynamics of Equation (11) can be shown to coincide with that of the order parameters of the populations in the network of Equation (10), in the limit *N*1, *N*<sup>2</sup> → ∞.

We showed in Luke et al. (2013) that the dynamical structure of the single population depends rather weakly on the synaptic sharpness parameter *n*. Furthermore, we argued that a modest sharpness is more biophysically plausible than the δ-function coupling obtained in the limit *n* → ∞. Thus, from here on, we fix *n* = 2 and drop the subscript on *Hn* to ease notation.
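A minimal integrator for the reduced system, Equation (11) with *n* = 2, can be sketched as follows. This is our own illustration, not the authors' code; the parameter values below are the driver-response configuration of Section 3.2 (*k*<sup>12</sup> = 0) with the driver in its PSR state.

```python
import numpy as np

def H(z):
    # H(z) for n = 2: (2/3) * (3/2 - (z + z*) + (z^2 + z*^2)/4)
    return (2.0 / 3.0) * (1.5 - 2.0 * z.real + 0.5 * (z * z).real)

def rhs(z1, z2, p):
    """Right-hand side of the reduced two-population system, Eq. (11)."""
    dz1 = (-1j * (z1 - 1) ** 2 / 2
           + (z1 + 1) ** 2 / 2
           * (-p['d1'] + 1j * (p['eta1'] + p['k11'] * H(z1) + p['k12'] * H(z2))))
    dz2 = (-1j * (z2 - 1) ** 2 / 2
           + (z2 + 1) ** 2 / 2
           * (-p['d2'] + 1j * (p['eta2'] + p['k21'] * H(z1) + p['k22'] * H(z2))))
    return dz1, dz2

def integrate(p, z1=0.1 + 0j, z2=0.1 + 0j, T=150.0, dt=0.01):
    """Fixed-step RK4; returns the trajectories of both order parameters."""
    n = int(T / dt)
    traj = np.empty((n, 2), dtype=complex)
    for i in range(n):
        a1, a2 = rhs(z1, z2, p)
        b1, b2 = rhs(z1 + 0.5 * dt * a1, z2 + 0.5 * dt * a2, p)
        c1, c2 = rhs(z1 + 0.5 * dt * b1, z2 + 0.5 * dt * b2, p)
        d1, d2 = rhs(z1 + dt * c1, z2 + dt * c2, p)
        z1 = z1 + (dt / 6.0) * (a1 + 2 * b1 + 2 * c1 + d1)
        z2 = z2 + (dt / 6.0) * (a2 + 2 * b2 + 2 * c2 + d2)
        traj[i] = (z1, z2)
    return traj

# Driver-response configuration of Section 3.2: k12 = 0, driver in a PSR state.
params = dict(eta1=-0.2, d1=0.1, k11=-2.0, k12=0.0,
              eta2=-10.0, d2=0.5, k21=2.0, k22=9.0)
```

Since the unit disk is invariant under the Ott-Antonsen dynamics, both trajectories remain inside |*z*| ≤ 1, and at these parameters the driver settles onto its PSR equilibrium.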

#### **3.2. THE DRIVER-RESPONSE SYSTEM**

To put our network in the driver-response form, we set *k*<sup>12</sup> = 0, so that population 1 receives no input from population 2. Therefore, the macrostates and bifurcations of population 1 are identical to those explored in Luke et al. (2013), described above. However, we allow *k*<sup>21</sup> ≠ 0. Our goal is to examine the consequences of the influence of population 1 on population 2. We call population 1 the "driver" and population 2 the "response" system. See **Figure 1**.

Writing the governing equation of population 2 as

$$\dot{z}\_2 = -i\frac{(z\_2 - 1)^2}{2} + \frac{(z\_2 + 1)^2}{2} \left\{-\Delta\_2 + i\left[\eta\_{\text{eff}} + k\_{22}H(z\_2)\right] \right\}\tag{12}$$

with

$$
\eta\_{\rm eff} \equiv \eta\_2 + k\_{21} H(z\_1), \tag{13}
$$

and comparing to Equation (6), we see that the behavior of population 2 is the same as that of a single population of theta neurons with an effective median excitability parameter η*eff* . This effective parameter depends on the median excitability parameter intrinsic to population 2, η2; the inter-population coupling, *k*21; and the state of the driver, *z*1.

Note that η*eff* depends linearly on both η<sup>2</sup> and *k*<sup>21</sup> and nonlinearly on the driver's state *z*<sup>1</sup> through *H*(*z*1). Additionally, η*eff*

may be time-dependent if population 1 exhibits a CPW state, since in that case *z*<sup>1</sup> oscillates periodically. In the following, we will examine all these cases.

# **4. RESULTS**

We will examine the behavior of population 2 as various parameters are varied. We organize the presentation of our results by first considering the case in which the driver population exhibits an equilibrium state. Later, we consider the case in which the driver population exhibits periodic behavior.

We will mainly consider two configurations of the response system: the "excitatorily coupled" response system has *k*<sup>22</sup> > 0, while the "inhibitorily coupled" response system has *k*<sup>22</sup> < 0.

**FIGURE 2 | (A)** A two-dimensional bifurcation diagram of the excitatorily-coupled response system. The heavy black lines are saddle-node (SN) bifurcation curves, and the solid dot denotes the parameters of the response system when decoupled from the driver. In the cases considered in the main text, the driver causes η*eff* to vary along the horizontal dotted line. The parameters are: η<sup>1</sup> = −0.2, Δ<sup>1</sup> = 0.1, *k*<sup>11</sup> = −2, and *k*<sup>22</sup> = 9. **(B)** The one-dimensional bifurcation diagram showing the asymptotic values of *y*<sup>2</sup> = Im(*z*2) vs. *k*21. Solid and dashed curves indicate stable and unstable equilibria, respectively, corresponding to partially synchronous spiking (PSS) and partially synchronous resting (PSR) states. The parameters are as in **(A)**, with η<sup>2</sup> = −10 and Δ<sup>2</sup> = 0.5.

The bifurcation diagrams that appear below in **Figures 2**, **3**, **4B**, **5B**, **8C** were obtained using XPPAUT (Ermentrout, 2002). Data for all other figures were generated using custom-designed code.

### **4.1. DRIVER ON A MACROSCOPIC EQUILIBRIUM**

We begin by fixing the driving population's parameters at η<sup>1</sup> = −0.2, Δ<sup>1</sup> = 0.1, and *k*<sup>11</sup> = −2, which corresponds to a PSR state. Thus, *z*<sup>1</sup> remains fixed at a constant value. We examine the behavior of the two response system configurations as we vary the inter-population coupling parameter, *k*21. From Equation (13), η*eff* varies linearly with respect to *k*21.

**FIGURE 3 | (A)** The two-dimensional bifurcation diagram of the inhibitorily-coupled response system. The heavy black lines are saddle-node (SN) bifurcation curves, green is a homoclinic (HC) bifurcation curve, and red is an Andronov-Hopf (AH) bifurcation curve. The latter two curves emerge from a Bogdanov-Takens (BT) point. The solid dot denotes the parameters of the response system when decoupled from the driver. In the cases considered in the main text, the driver causes η*eff* to vary along the horizontal dotted line. The parameters are: η<sup>1</sup> = −0.2, Δ<sup>1</sup> = 0.1, *k*<sup>11</sup> = −2, and *k*<sup>22</sup> = −9. **(B)** The one-dimensional bifurcation diagram showing the asymptotic value of *x*<sup>2</sup> = Re(*z*2) vs. *k*21. Solid curves denote stable equilibria; dashed black curves are unstable equilibria. Green represents the maxima and minima of a collective periodic wave (CPW) limit cycle. The parameters are as in **(A)**, with η<sup>2</sup> = 5 and Δ<sup>2</sup> = 0.5.

**FIGURE 4 | (A)** The non-linear behavior of η*eff* as a function of *k*<sup>11</sup> for the excitatorily-coupled response system. η*eff* is plotted horizontally to facilitate comparison with **Figure 2A**. The parameters are: η<sup>1</sup> = −0.05, Δ<sup>1</sup> = 0.05, η<sup>2</sup> = −10, with the inter-population coupling fixed at *k*<sup>21</sup> = 2.0. **(B)** The one-dimensional bifurcation diagram showing the asymptotic value of *y*<sup>2</sup> = Im(*z*2) vs. *k*11. Solid and dashed curves indicate stable and unstable equilibria, respectively. The parameters are as in **(A)**, with Δ<sup>2</sup> = 0.5 and *k*<sup>22</sup> = 9.

**FIGURE 5 | (A)** The non-linear behavior of η*eff* as a function of *k*<sup>11</sup> for the inhibitorily-coupled response system. η*eff* is plotted horizontally to facilitate comparison with **Figure 3A**. **(B)** The one-dimensional bifurcation diagram showing the asymptotic value of *x*<sup>2</sup> = Re(*z*2) vs. *k*11. Solid and dashed black curves indicate stable and unstable equilibria, respectively, and green represents the maxima and minima of a CPW limit cycle state. The parameters are: η<sup>1</sup> = −0.05, Δ<sup>1</sup> = 0.05, η<sup>2</sup> = 5, Δ<sup>2</sup> = 0.5, and *k*<sup>22</sup> = −9. The inter-population coupling is fixed at *k*<sup>21</sup> = 3.5.

#### *4.1.1. Excitatorily-coupled response system*

We set the response system's internal coupling to *k*<sup>22</sup> = 9, and show in **Figure 2A** the two-parameter bifurcation diagram of the response system with respect to Δ<sup>2</sup> and η*eff* . Two saddle-node bifurcation curves which meet at a cusp are seen. To the left of these curves, the response network exhibits a PSR state, and to the right, a PSS state. These states coexist inside the approximately triangular region.

We set the remaining parameters of the response system to η<sup>2</sup> = −10 and Δ<sup>2</sup> = 0.5. Thus, for *k*<sup>21</sup> = 0, η*eff* = η2, and the response system is situated at the solid black point marked in **Figure 2A**. As *k*<sup>21</sup> increases from zero, η*eff* increases linearly along the dotted line in **Figure 2A**, starting from the black point. In so doing, it traverses the SN bifurcation curves. **Figure 2B** shows how the imaginary part of the response's asymptotic macroscopic mean field [*y*<sup>2</sup> = Im(*z*2)] changes with respect to *k*21, illustrating the coexistence of the stable PSR and PSS states, along with an unstable PSR state (uPSR).

The point marked "SN/NF" in **Figure 2B** indicates that as *k*<sup>21</sup> increases, a saddle-node bifurcation is encountered, corresponding to the left SN curve in **Figure 2A**. This creates a stable and an unstable PSS state. However, the unstable PSS state converts into an unstable PSR state at a value of *k*<sup>21</sup> very slightly beyond the SN bifurcation. That is, the node corresponding to the unstable PSS state becomes an unstable PSR focus, a transition we called a Node-Focus (NF) transition in Luke et al. (2013). These two events lie so close together that they cannot be distinguished in the figure.

#### *4.1.2. Inhibitorily-coupled response system*

We performed a similar analysis for the case in which the response system's internal coupling is *k*<sup>22</sup> = −9, i.e., inhibitory, and η<sup>2</sup> = 5. The remaining parameters were unchanged. The results are shown in **Figure 3**. In this case, the two-dimensional bifurcation diagram of the response system with respect to <sup>2</sup> and η*eff* (**Figure 3A**) shows a similar (but mirror-image) cusp of saddlenode curves. A new feature is the occurrence of a codimension-2 Bogdanov-Takens (BT) point on the left SN curve, and the emergence of homoclinic (HC; green) and Andronov-Hopf (AH; red) bifurcation curves from the BT point.

**Figure 3B** shows how the real part of the response's asymptotic macroscopic mean field [*x*<sup>2</sup> = Re(*z*2)] changes with respect to *k*21. As before, η*eff* increases linearly as *k*<sup>21</sup> increases, starting from the black solid point in **Figure 3A** and moving toward the right, traversing the various bifurcation curves along the dotted line. Note the presence of the attracting limit cycle CPW state in **Figure 3B**, which emerges at the HC bifurcation and terminates at the AH bifurcation as *k*<sup>21</sup> increases.

It is interesting to note that in both cases described above, the same bifurcation structure would be encountered if, instead of varying *k*<sup>21</sup> with a fixed value of η2, we varied η<sup>2</sup> with a fixed value of *k*21. While this is obvious from Equation (13) since *H*(*z*1) is constant in these cases, it leads to the non-obvious conclusion that by modifying either the inter-population coupling or the intrinsic median excitability of the response population—two rather different system characteristics—one obtains identical transitions in the response network.

### *4.1.3. Variation of the driver's macroscopic equilibrium*

In the cases we considered previously, η*eff* changed linearly with respect to the inter-population coupling *k*21. We now turn our attention to the effects incurred by altering the value of the driver influence function *H*(*z*1) in Equation (13). We do this by varying the driver's internal coupling strength *k*11, thus causing the driver's asymptotic macroscopic mean field *z*<sup>1</sup> to change. This manipulation has the effect of changing η*eff non-linearly* with respect to *k*11.

For simplicity, we only consider a range of *k*<sup>11</sup> such that the driver always remains on a macroscopic equilibrium state, and we fix the inter-population coupling at *k*<sup>21</sup> = 2.

We begin with the case of the excitatorily-coupled response system considered above, with η<sup>2</sup> = −10, Δ<sup>2</sup> = 0.5, and *k*<sup>22</sup> = 9, and choose the remaining driver parameters to be η<sup>1</sup> = −0.05 and Δ<sup>1</sup> = 0.05. **Figure 4A** shows the non-linear behavior of η*eff* as *k*<sup>11</sup> is varied. Even though we are considering *k*<sup>11</sup> to be the independent parameter, we plot η*eff* horizontally so that it may be easily compared to **Figure 2A**; recall that this shows the two-dimensional bifurcation diagram of the response system. Now, as *k*<sup>11</sup> changes, η*eff* moves back and forth along the dotted line non-linearly. In particular, **Figure 4A** shows that for very negative values of *k*11, η*eff* is near −5, which corresponds to a point in **Figure 2A** to the right of the SN curves. As *k*<sup>11</sup> increases, η*eff* decreases to approximately −10, thus crossing both SN curves in **Figure 2A** from right to left in the process. η*eff* subsequently increases, and goes back across the SN curves from left to right. Note that **Figure 4A** includes vertical lines marking the position of the SN bifurcations (i.e., the values of η*eff* at which the horizontal line at Δ<sup>2</sup> = 0.5 in **Figure 2A** crosses the SN curves).

**Figure 4B** shows the behavior of the asymptotic state of the response system [*y*<sup>2</sup> = Im(*z*2)] as a function of *k*11. This shows that as *k*<sup>11</sup> increases, the response system passes through two separate regions of bistability, corresponding to the two traversals of the triangular bistable region in **Figure 2A**. Thus, **Figure 4B** is qualitatively similar to two copies of **Figure 2B**, with the structure for *k*<sup>11</sup> < 0 reversed. Note that the two regions are not symmetrical. This is due to the non-symmetric behavior of η*eff* as *k*<sup>11</sup> changes.

Next, we examine how the same manipulation of the driver system affects the inhibitorily-coupled response system. The parameters are as above, but with η<sup>2</sup> = 5 and *k*<sup>22</sup> = −9. **Figure 5A** shows how η*eff* changes as *k*<sup>11</sup> is varied, again plotted with η*eff* on the horizontal axis for ease of comparison with **Figure 3A**. Note the vertical lines in **Figure 5A** marking the SN, HC, and AH bifurcations.

The one-dimensional bifurcation diagram depicting the asymptotic state of the response system as a function of *k*<sup>11</sup> is shown in **Figure 5B**. A situation similar to the previous case is observed. Two distorted versions of the structure of **Figure 3B**, with the features for *k*<sup>11</sup> < 0 being reversed, are seen. Again, this is due to the non-linear and asymmetric behavior of η*eff* as it traverses the bifurcations in **Figure 3A** twice: first right to left, and then left to right, as *k*<sup>11</sup> is increased. Note also the presence of an attracting limit cycle CPW state in intervals of both positive and negative *k*11.

# **4.2. DRIVER ON A MACROSCOPIC LIMIT CYCLE**

We now focus on the behavior of the response population when the driver is on a CPW state, which is a limit cycle of the driver's macroscopic mean field (or order parameter). Throughout this section, we fix the driver parameters at η<sup>1</sup> = 10.75, *k*<sup>11</sup> = −9, and Δ<sup>1</sup> = 0.5, which results in a CPW driver state for which *H*(*z*1) oscillates periodically in time. In particular, we have *H*(*z*1) > 0 for all time. Thus, according to Equation (13), η*eff* also oscillates periodically for *k*<sup>21</sup> ≠ 0, and both the centroid and the amplitude of the η*eff* oscillation increase as *k*<sup>21</sup> increases.

We show below that in this configuration, the response population can exhibit periodic, multistable, chaotic, and/or quasiperiodic behavior, depending on the response system's parameters and the inter-population coupling strength *k*21.

# *4.2.1. Periodic behavior in the response system*

We begin by considering the excitatorily coupled response system, with Δ<sup>2</sup> = 0.5 and *k*<sup>22</sup> = 9, but with η<sup>2</sup> = −20. When decoupled from the driver, this places the response system at a point well to the left in the parameter space of **Figure 2A**. Thus, the response system in isolation asymptotes to a PSR state. As *k*<sup>21</sup> is increased from zero to eight, η*eff* oscillates back and forth along the horizontal line in **Figure 2A** at Δ<sup>2</sup> = 0.5, but always stays to the left of the SN curves shown in that figure. Thus, the driver simply pushes the response system's PSR state back and forth, avoiding any bifurcations. The result is simple periodic behavior in the driven response system. **Figure 6A** shows a plot of the maximum and minimum of *x*<sup>2</sup> = Re(*z*2) vs. *k*21. As *k*<sup>21</sup> increases, the amplitude of this simple periodic behavior increases. We observe that the frequency of the response system's oscillation is the same as that of the driver.

**FIGURE 6 | (A)** Simple periodic behavior in the response system driven by a CPW state of the driver as a function of the inter-population coupling strength *k*21. The curves are local maxima and minima of *x*<sup>2</sup> = Re(*z*2). The driver parameters are η<sup>1</sup> = 10.75, Δ<sup>1</sup> = 0.5, and *k*<sup>11</sup> = −9, and the response parameters are η<sup>2</sup> = −20, Δ<sup>2</sup> = 0.5, and *k*<sup>22</sup> = 9. **(B)** Slightly more complicated periodic behavior obtained at the same parameters, except with η<sup>2</sup> = −5. The curves are local maxima and minima of *y*<sup>2</sup> = Im(*z*2).

We now change the response system such that η<sup>2</sup> = −5, and leave all other parameters the same as above. This change places the response system at a point to the right of the SN curves in **Figure 2A**, and for these parameters, the uncoupled response system asymptotes to a PSS state. Once again, as *k*<sup>21</sup> increases, η*eff* oscillates back and forth along the Δ<sup>2</sup> = 0.5 line in **Figure 2A**, but this time it does so always staying to the right of the SN curves.

The result is multi-frequency periodic behavior in the response system that is more complicated than in the previous example. **Figure 6B** shows a plot of the *local* minima and maxima of *y*<sup>2</sup> = Im(*z*2) vs. *k*21. **Figure 7** shows *y*<sup>2</sup> vs. *x*<sup>2</sup> plots of the periodic orbits at *k*<sup>21</sup> = 6 (upper panels) and *k*<sup>21</sup> = 10 (lower panels). As *k*<sup>21</sup> increases from zero, a periodic orbit with winding number two emerges (similar to that shown in **Figure 7A**) and grows in amplitude, peaking near *k*<sup>21</sup> ≈ 2.5. The amplitude subsequently decreases to a minimum near *k*<sup>21</sup> ≈ 7.2, and then slowly increases again. Note that the four curves in **Figure 6B** for *k*<sup>21</sup> ∈ [0, 7.2] correspond to two pairs of alternating local maxima and minima in the time series of *y*2, as shown in **Figure 7B**.

Interestingly, near *k*<sup>21</sup> ≈ 7.2, an additional loop appears in the orbit, as shown in **Figure 7C**. This is reflected in the additional inner curves in **Figure 6B** that appear for *k*<sup>21</sup> ≳ 7.2, and the two additional local maxima and minima in the time series of *y*<sup>2</sup> in **Figure 7D**.

#### *4.2.2. Multistability in the response system*

Continuing with the excitatorily coupled response system (with *k*<sup>22</sup> = 9 > 0), we set η<sup>2</sup> = −10 and leave all other parameters unchanged. In this case the uncoupled response system is at a point just to the left of the left SN curve in **Figure 2A**, and as *k*<sup>21</sup> increases, η*eff* again sweeps back and forth along the horizontal line at Δ<sup>2</sup> = 0.5. However, now this sweeping cuts across both SN curves. Thus, the response system sweeps back and forth across the approximately triangular multistable region bounded by the SN curves.

**Figure 8A** shows the maxima and minima of *y*<sup>2</sup> vs. *k*<sup>21</sup> for this case. The first feature to emerge as *k*<sup>21</sup> increases from zero is a simple periodic orbit whose amplitude increases, similar to the example in **Figure 6A**. At *k*<sup>21</sup> ≈ 0.5, a new and separate coexisting limit cycle appears, as indicated by the upper curves that emerge in **Figure 8A**. **Figure 8B** shows the *y*<sup>2</sup> vs. *x*<sup>2</sup> plots of these two limit cycles at *k*<sup>21</sup> = 1.5, where the larger orbit corresponds to the upper two curves in **Figure 8A**. In this bistable region, the macroscopic dynamics of the response system approaches one or the other of these periodic states, depending on the initial conditions.

**Figure 8C** shows, in black, the asymptotic states of *y*<sup>2</sup> vs. η*eff* for *fixed* values of η*eff* , with *k*<sup>21</sup> = 1.5. These curves show that for a large interval of η*eff* , a stable PSR coexists with a stable PSS and an unstable PSR state for the frozen (i.e., η*eff* fixed) system. With the driver on the CPW state, η*eff* sweeps from approximately −9.1 to −7.6 and back again, a range which is well within the bistable region. Superimposed in green in **Figure 8C** are projections of the two coexisting limit cycles onto this space, showing that the lower limit cycle is a simple periodic perturbation of the response system's underlying PSR state, and the upper limit cycle is a periodic perturbation of the underlying PSS state.

#### *4.2.3. Chaos in the response system*

We now switch to the inhibitorily coupled response system, with parameters η<sup>2</sup> = 5, Δ<sup>2</sup> = 0.5, and *k*<sup>22</sup> = −9. The parameter space of this system is shown in **Figure 3A**, and the uncoupled response system resides at the solid black dot in that figure, to the left of all the bifurcations. As the inter-population coupling strength *k*<sup>21</sup> increases, η*eff* sweeps across the same horizontal line at Δ<sup>2</sup> = 0.5 with increasing amplitude and centroid, initially crossing just the left SN bifurcation curve. At *k*<sup>21</sup> ≈ 5.2, η*eff* begins sweeping across the homoclinic and the Andronov-Hopf bifurcation curves. Eventually, for sufficiently large *k*21, η*eff* sweeps across all four bifurcation curves (SN, AH, HC, and SN).

**Figure 9A** shows the local maxima and minima of *x*<sub>2</sub> = Re(*z*<sub>2</sub>) vs. *k*<sub>21</sub>. We initially see the emergence of a simple periodic orbit that grows slowly in amplitude. However, at *k*<sub>21</sub> ≈ 5.2, chaos suddenly emerges through a crisis. **Figure 9B** shows a magnification of this region, with a plot of the two largest Lyapunov exponents. We see that there are significant intervals of *k*<sub>21</sub> for which there is a positive Lyapunov exponent, indicating the presence of macroscopic chaos.

As *k*<sub>21</sub> increases, the first chaotic band, beginning at *k*<sub>21</sub> ≈ 5.28, coexists with the simple periodic loop that was present for smaller *k*<sub>21</sub> (this coexistence is not apparent in the figure). Beyond this band, there is a window dominated by periodic behavior of rather high period. A second chaotic band appears at approximately *k*<sub>21</sub> = 5.48. This second band terminates at approximately *k*<sub>21</sub> = 5.65, after which a series of reverse period-doubling cascades is seen.

The *y*<sub>2</sub> vs. *x*<sub>2</sub> plot of the chaotic attractor present at *k*<sub>21</sub> = 5.296, for which the largest Lyapunov exponent is approximately 0.2118, is shown in **Figure 10A**.


#### *4.2.4. Quasiperiodicity in the response system*

Finally, we consider the case in which the response system exhibits a CPW state when uncoupled from the driver, and ask what happens when it is driven by another CPW state in the driver. We use the same drive system parameters as above, and set the response system's parameters to be the same except for Δ<sub>2</sub> = 0.3. As the inter-population coupling strength *k*<sub>21</sub> is increased, various phase-locked and quasiperiodic states are seen. An example of quasiperiodic behavior in the response system for *k*<sub>21</sub> = 0.1 is shown in **Figure 10B**.


#### **5. DISCUSSION**

In this work, we have taken the first step toward designing a mathematically tractable modular network-of-networks representation of neuronal systems. Our approach is based on dynamical analysis techniques that enable a complete description of the emergent macroscopic behavior of large, heterogeneous discrete networks of globally-coupled phase oscillators. Building on previous results (Luke et al., 2013) in which we used these techniques to show that the collective dynamics of a single such population of theta neurons is relatively simple (exhibiting just equilibria and limit cycle states), we constructed the next simplest hierarchical structure: a driver-response configuration of theta neuron populations. Our results show that even in this simplest of configurations, the response system (and hence, the network as a whole) can exhibit a full range of dynamical behaviors and surprising complexity. A notable strength of our work is that despite the complexity that emerges from this arrangement, the behavior can be understood and explained in terms of what is known about a single population's dynamics and bifurcation structure.

With the driving system on a fixed equilibrium, we showed that the response system is equivalent to a single population with a simple shift in one parameter. Specifically, this parameter is the median of the distribution of excitability parameters in the response system, which indicates whether the response population is dominated by excitable or intrinsically-spiking neurons. Although this arrangement does not introduce any new dynamical features, we showed that the response system can nevertheless still exhibit an interesting bifurcation structure involving macroscopic equilibria, limit cycles, and multistability as the strength of the inter-population coupling varies. More interestingly, we found that the inter-population coupling strength is effectively equivalent to the response system's median intra-population excitability. By this we mean that changes in either of these rather different network parameters lead to identical bifurcation scenarios. This surprising result follows from the drive-response network configuration in particular.

The first level of additional complication arose when modestly altering an internal parameter of the drive system. This effectively led to a *non-linear* change in the response system's median excitability, causing a dramatic change in the response's bifurcation structure. Such bifurcation structures might be difficult to understand if encountered blindly, as might be the case when studying the dynamics of a network without knowledge of its internal structure. Experimental studies of neuronal networks often take a similar "black box" approach out of necessity, since detailed knowledge of connectivity (i.e., the "connectome") is rarely available. In our case, however, we showed that knowledge of the non-linearity, along with knowledge of the bifurcation structure of a single network, leads to a natural explanation of the additional features that arise due to the network-of-networks structure. In our particular case studies, we observed multiple distorted and reversed copies of the bifurcation structure that is associated with a single population of theta neurons. We therefore speculate that in "black box" investigations, the observation of such repeated and/or distorted bifurcation structures might be indicative of driver-response-type connectivity in the network of study.

Finally, we investigated the consequences of placing the driver system on a collective rhythmic state (i.e., a macroscopic periodic orbit). Our results were consistent with previous results that studied non-autonomous phase oscillator (So and Barreto, 2011) and theta neuron systems (So et al., 2014). In those investigations, it was shown that networks of oscillators subjected to a sinusoidal variation of a network parameter led to complicated dynamics including quasiperiodicity and macroscopic chaos. Here, our driver-response arrangement of two separate interacting populations of theta neurons leads to an overall autonomous system, but with the response system being subjected to a periodic driving signal from the driver. Such arrangements might be found in real neuronal systems at the early stages of sensory input processing. For example, the lateral geniculate nucleus may be driven by a periodic visual signal delivered to the retina. Another candidate might be the trisynaptic circuit of the dentate gyrus and the CA3 and CA1 regions of the hippocampus (Kandel et al., 2000). More generally, the information-processing capabilities of the brain are thought to be regulated by collective rhythms, notably theta and gamma oscillations, which arise in various areas and periodically drive other areas (Buzsáki, 2006).

Our results may also have implications for populations of bursting neurons (So et al., 2014). Neuronal bursting in individual neurons is commonly understood to arise from the interplay between a slowly oscillating neuronal parameter (or "slow variable") and the neuron's fast spiking dynamics. Bursting arises if the slow parameter sweeps back and forth across bifurcations, and Rinzel and Ermentrout (1989) classified bursters as square, parabolic, or elliptic based on the bifurcations encountered in this process. It has also been demonstrated that slowly oscillating intra- and extra-cellular ion concentrations can lead to a wide range of neuronal bursting behaviors (Cressman et al., 2009, 2011; Barreto and Cressman, 2011).

Finally, we note that our explorations in this work were limited to cases in which the driver population's parameters were either fixed or were varied only modestly. In the latter case, we changed the driver's median excitability parameter only to the extent that its collective equilibrium state was displaced but not altered. Significantly greater complexity in the response's dynamics would arise if the collective state of the driver were pushed across its own bifurcations, possibly resulting in topological changes and hysteretic effects in the driver's macroscopic state. As discussed above, such complexity would be difficult to understand if encountered in a "black box"-type investigation. Nevertheless, if it is known that the network of interest has a driver-response structure, it may be possible to comprehend the origin of such complexity in the manner that we have outlined here.

This study constitutes an initial attempt at building a mathematically tractable model to understand the collective behavior of a hierarchical "network-of-networks" arrangement of model neurons. In future work we plan to consider networks of networks that include feedback connections and additional populations in an effort to understand the emergence of macroscopic dynamical complexity in more realistic networks.

# **AUTHOR CONTRIBUTIONS**

Tanushree B. Luke, Ernest Barreto, and Paul So conceived and designed the investigation, analyzed the data, and wrote the paper. Tanushree B. Luke and Paul So performed the numerical computations.

# **ACKNOWLEDGMENT**

Publication of this article was funded by the George Mason University Libraries Open Access Publishing Fund.

# **REFERENCES**


Buzsáki, G. (2006). *Rhythms of the Brain*, 1st Edn. New York, NY: Oxford University Press. doi: 10.1093/acprof:oso/9780195301069.001.0001


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 August 2014; accepted: 27 October 2014; published online: 18 November 2014.*

*Citation: Luke TB, Barreto E and So P (2014) Macroscopic complexity from an autonomous network of networks of theta neurons. Front. Comput. Neurosci. 8:145. doi: 10.3389/fncom.2014.00145*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Luke, Barreto and So. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Multiscale entropy analysis of biological signals: a fundamental bi-scaling law

#### Jianbo Gao<sup>1, 2</sup>\*, Jing Hu<sup>2</sup>, Feiyan Liu<sup>1, 3</sup> and Yinhe Cao<sup>1, 2</sup>

*<sup>1</sup> Institute of Complexity Science and Big Data Technology, Guangxi University, Nanning, China, <sup>2</sup> PMB Intelligence LLC, Sunnyvale, CA, USA, <sup>3</sup> School of Management, University of Chinese Academy of Sciences, Beijing, China*

Since its introduction in the early 2000s, multiscale entropy (MSE) has found many applications in biosignal analysis and has been extended to multivariate MSE. So far, however, no analytic results for MSE or multivariate MSE have been reported. This has severely limited our basic understanding of MSE. For example, it has not been established whether MSE estimated using default parameter values and short data sets is meaningful. Nor is it known whether MSE has any relation to other complexity measures, such as the Hurst parameter, which characterizes the correlation structure of the data. To overcome this limitation, and more importantly, to guide more fruitful applications of MSE in various areas of the life sciences, we derive a fundamental bi-scaling law for fractal time series: one scaling for the scale in phase space, the other for the block size used for smoothing. We illustrate the usefulness of the approach by examining two types of physiological data. One is heart rate variability (HRV) data, for the purpose of distinguishing healthy subjects from patients with congestive heart failure, a life-threatening condition. The other is electroencephalogram (EEG) data, for the purpose of distinguishing epileptic seizure EEG from normal healthy EEG.

# Keywords: scaling law, multiscale entropy analysis, fractal signal, heart rate variability (HRV), adaptive filtering

# 1. Introduction

Biological systems provide definitive examples of highly integrated systems functioning at multiple time scales. Neurons function on a time scale of milliseconds, circadian rhythms operate on a time scale of hours, reproductive cycles occur on a time scale of weeks, and bone remodeling involves time scales of months. As an integrated system, each process interacts with faster and slower processes. Consequently, biosignals are often multiscaled (Gao et al., 2007): depending upon the scale at which the signals are examined, they may exhibit different behaviors (e.g., nonlinearity, sensitive dependence on small disturbances, long memory, extreme variations, and nonstationarity), just as a great painting may exhibit various details and arouse a multitude of aesthetic feelings when appreciated at different distances, from different angles, under different illuminations, and in different moods.

#### Edited by:

*Tobias Alecio Mattei, Brain* & *Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA*

#### Reviewed by:

*Guillaume Lajoie, Max Planck Institute for Dynamics and Self-Organization, Germany; Bailu Si, Chinese Academy of Sciences, China; Xiaoli Li, Beijing Normal University, China*

#### \*Correspondence:

*Jianbo Gao, Institute of Complexity Science and Big Data Technology, Guangxi University, 100 Daxue Road, Nanning, Guangxi 530005, China jbgao.pmb@gmail.com*

> Received: *14 December 2014* Accepted: *14 May 2015* Published: *02 June 2015*

#### Citation:

*Gao J, Hu J, Liu F and Cao Y (2015) Multiscale entropy analysis of biological signals: a fundamental bi-scaling law. Front. Comput. Neurosci. 9:64. doi: 10.3389/fncom.2015.00064*

With the rapid advance of sensing technology, complex data have been accumulating exponentially in all areas of the life sciences. To better cope with such complex data, Costa et al. (2005) introduced an interesting method, multiscale entropy (MSE) analysis. MSE has found numerous applications in various types of biosignal analysis, including fetal heart rate monitoring (Cao et al., 2006), assessment of EEG dynamical complexity in Alzheimer's disease (Mizuno et al., 2010), classification of surface EMG of neuromuscular disorders (Istenic et al., 2010), heart rate analysis for predicting hospital mortality (Norris et al., 2008), and analysis of heart beat intervals and blood flow for characterizing psychological dimensions in non-pathological subjects (Nardelli et al., 2015). MSE has also been extended to multivariate MSE (Ahmed and Mandic, 2011) and multiscale permutation entropy (Li et al., 2010). So far, however, no analytic analyses of MSE or multivariate MSE have been carried out. This has severely limited our basic understanding of MSE. For example, it has not been known whether MSE estimated using default parameter values and short data sets is meaningful. Nor is it known whether MSE has any relation to other complexity measures, such as the Hurst parameter, which characterizes the correlation structure of the data.

To help gain insights into the above questions, and to guide more fruitful applications of MSE in diverse fields of the life sciences, in this work we report a fundamental bi-scaling law for MSE of the most popular model of biosignals, the fractal 1/f-type time series. As example applications, we analyze heart rate variability (HRV) and electroencephalogram (EEG) data. With HRV, we focus on distinguishing healthy subjects from patients with congestive heart failure (CHF), a life-threatening condition, as well as resolving an interesting debate (Wessel et al., 2003; Nikulin and Brismar, 2004) regarding the usefulness of MSE in distinguishing the HRV of healthy subjects from that of patients with certain cardiac diseases. With EEG, we focus on distinguishing epileptic seizure EEG from normal healthy EEG.

# 2. Materials and Methods

# 2.1. Data

To illustrate the use of scaling analysis of MSE, in this paper, we analyze two types of data, heart rate variability (HRV), for the purpose of distinguishing healthy subjects from patients with congestive heart failure (CHF), and EEG, for the detection of epileptic seizures.

We downloaded two types of HRV data from PhysioNet (the MIT-BIH Normal Sinus Rhythm Database and the BIDMC Congestive Heart Failure Database, available at http://www.physionet.org/physiobank/database/#ecg), one for healthy subjects and the other for subjects with CHF. The latter includes long-term ECG recordings from 15 subjects (11 men, aged 22 to 71, and 4 women, aged 54 to 63) with severe CHF (NYHA class 3–4). This group of subjects was part of a larger study group receiving conventional medical therapy prior to receiving the oral inotropic agent milrinone. Further details about the larger study group can be found at PhysioNet. The individual ECG recordings are each about 20 h in duration, and contain two ECG signals, each sampled at 250 samples per second with 12-bit resolution over a range of ±10 millivolts. The other database is for 18 normal subjects; the individual recordings are each about 25 h in duration, each sampled at 128 samples per second. The HRV data analyzed here are the R-R intervals (in units of seconds) derived from the ECG recordings.

The EEG database was downloaded from http://www.meb.uni-bonn.de/epileptologie/science/physik/eegdata.html. The database consists of three groups, H (healthy), E (epileptic subjects during a seizure-free interval), and S (epileptic subjects during seizure); each group contains 100 data segments, each of length 4097 data points, with a sampling frequency of 173.61 Hz. These data have been carefully examined by adaptive fractal analysis (Gao et al., 2011c) and the scale-dependent Lyapunov exponent (Gao et al., 2006b, 2011b, 2012), for the same purpose of distinguishing epileptic seizure EEG from normal healthy EEG.

# 2.2. Methods

Entropy characterizes the creation of information in a dynamical system. To facilitate the derivation of a fundamental scaling law for MSE, we first rigorously define MSE and all related concepts.

Suppose that the F-dimensional phase space is partitioned into boxes of size ε<sup>F</sup>. Suppose that there is an attractor in phase space and consider a transient-free trajectory **x**(t). The state of the system is measured at intervals of time τ. Let p(i<sub>1</sub>, i<sub>2</sub>, · · · , i<sub>d</sub>) be the joint probability that **x**(t = τ) is in box i<sub>1</sub>, **x**(t = 2τ) is in box i<sub>2</sub>, · · · , and **x**(t = dτ) is in box i<sub>d</sub>. Let us now introduce the block entropy,

$$H\_d(\varepsilon, \tau) = -\sum\_{i\_1, \dots, i\_d} p(i\_1, \dots, i\_d) \ln p(i\_1, \dots, i\_d), \tag{1}$$

take the difference between H<sub>d+1</sub>(ε, τ) and H<sub>d</sub>(ε, τ), and normalize it by τ,

$$h\_d(\varepsilon, \tau) = \frac{1}{\tau}\left[H\_{d+1}(\varepsilon, \tau) - H\_d(\varepsilon, \tau)\right].\tag{2}$$

Let

$$h(\varepsilon, \tau) = \lim\_{d \to \infty} h\_d(\varepsilon, \tau) \tag{3}$$

It is called the (ε, τ )-entropy (Gaspard and Wang, 1993). Taking limits, we obtain the Kolmogorov-Sinai (K-S) entropy,

$$\begin{aligned} K &= \lim\_{\tau \to 0} \lim\_{\varepsilon \to 0} h(\varepsilon, \tau) \\ &= \lim\_{\tau \to 0} \lim\_{\varepsilon \to 0} \lim\_{d \to \infty} \frac{1}{\tau}\left[H\_{d+1}(\varepsilon, \tau) - H\_d(\varepsilon, \tau)\right] \end{aligned} \tag{4}$$
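As a concrete illustration of Equations (1)–(3), the block entropies H<sub>d</sub> of an already-symbolized sequence can be estimated by counting d-blocks. The sketch below is a minimal illustration, assuming a pre-partitioned (symbolic) sequence with τ = 1, so the phase-space boxes are implicit:

```python
import numpy as np

def block_entropy(symbols, d):
    """H_d = -sum p(i1,...,id) ln p(i1,...,id), estimated from observed d-blocks."""
    blocks = np.array([symbols[i:i + d] for i in range(len(symbols) - d + 1)])
    _, counts = np.unique(blocks, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(0)
coin = rng.integers(0, 2, 100_000)   # fair coin: true entropy rate is ln 2 per step

# h_d = H_{d+1} - H_d with tau = 1 (Equation 2)
h2 = block_entropy(coin, 3) - block_entropy(coin, 2)
print(h2)   # approaches ln 2 ≈ 0.693 as the sequence grows
```

For this memoryless source, h<sub>d</sub> is already close to the true entropy rate at small d; for correlated sources, larger d (and far more data) is needed before the limit in Equation (3) is approached.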

We now consider computation of the (ε, τ)-entropy from a time series of length N, x<sub>1</sub>, x<sub>2</sub>, · · · , x<sub>N</sub>. As is well known, the first step is to use time delay embedding to construct vectors of the form:

$$V\_i = [x\_i, x\_{i+L}, \dots, x\_{i+(m-1)L}],\tag{5}$$

where m, the embedding dimension, and L, the delay time, can be chosen according to certain optimization criteria (Gao et al., 2007). One can then employ the Cohen-Procaccia algorithm (Cohen and Procaccia, 1985) to estimate the (ε, τ)-entropy. In particular, when the entropy is evaluated at a fixed finite scale ε̂, the result is called the approximate entropy. To get better statistics from a finite time series, one may compute K<sub>2</sub>(ε) using the Grassberger-Procaccia algorithm (Grassberger and Procaccia, 1983):

$$K\_2(\varepsilon) = \lim\_{m \to \infty} \frac{\ln C^{(m)}(\varepsilon) - \ln C^{(m+1)}(\varepsilon)}{mL\,\delta t} \tag{6}$$

where δt is the sampling time, and C<sup>(m)</sup>(ε) is the correlation integral based on the m-dimensional reconstructed vectors V<sub>i</sub> and V<sub>j</sub>,

$$C^{(m)}(\varepsilon) = \lim\_{N\_v \to \infty} \frac{2}{N\_v(N\_v - 1)} \sum\_{i=1}^{N\_v-1} \sum\_{j=i+1}^{N\_v} H(\varepsilon - \|V\_i - V\_j\|), \tag{7}$$

where N<sub>v</sub> = N − (m − 1)L is the number of reconstructed vectors, and H(y) is the Heaviside function (1 if y ≥ 0 and 0 if y < 0). C<sup>(m+1)</sup>(ε) can be computed similarly based on the (m + 1)-dimensional reconstructed vectors. When we evaluate K<sub>2</sub>(ε) at a finite fixed scale ε̂, we obtain the sample entropy S<sub>e</sub> (Richman and Moorman, 2000).
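To make Equations (5)–(7) concrete, the following sketch builds the delay vectors of Equation (5), estimates C<sup>(m)</sup>(ε̂) as the observed fraction of distinct vector pairs within distance ε̂ (max norm), and returns S<sub>e</sub> = ln C<sup>(m)</sup>(ε̂) − ln C<sup>(m+1)</sup>(ε̂). The choices m = 2, L = 1, and ε̂ = 0.2 × std are common defaults, not values prescribed by the text:

```python
import numpy as np

def delay_vectors(x, m, L=1):
    """V_i = [x_i, x_{i+L}, ..., x_{i+(m-1)L}] (Equation 5); shape (N_v, m)."""
    n_v = len(x) - (m - 1) * L
    return np.column_stack([x[i * L:i * L + n_v] for i in range(m)])

def corr_integral(V, eps):
    """C^(m)(eps): fraction of distinct vector pairs within eps (max norm)."""
    d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=-1)
    iu = np.triu_indices(len(V), k=1)      # j > i, as in Equation (7)
    return np.mean(d[iu] <= eps)

def sample_entropy(x, m=2, r=0.2, L=1):
    """S_e = ln C^(m)(eps) - ln C^(m+1)(eps), at the fixed scale eps = r * std(x)."""
    x = np.asarray(x, dtype=float)
    eps = r * x.std()
    n_v = len(x) - m * L                   # same vector count at both dimensions
    Vm = delay_vectors(x[:n_v + (m - 1) * L], m, L)
    Vm1 = delay_vectors(x, m + 1, L)
    return np.log(corr_integral(Vm, eps)) - np.log(corr_integral(Vm1, eps))

rng = np.random.default_rng(1)
se = sample_entropy(rng.standard_normal(1000))
print(se)   # for Gaussian white noise, roughly -ln P(|X - Y| <= 0.2 sigma)
```

This estimator replaces the N<sub>v</sub> → ∞ limit of Equation (7) by the finite-sample pair fraction; self-pairs are excluded via the upper-triangular indices.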

MSE analysis is based on the sample entropy S<sub>e</sub>. The procedure is as follows. Let X = {x<sub>t</sub> : t = 1, 2, . . .} be a covariance stationary stochastic process with mean µ, variance σ<sup>2</sup>, and autocorrelation function r(k), k ≥ 0. Construct a new covariance stationary time series

$$X^{(b\_s)} = \{x\_t^{(b\_s)} : t = 1, 2, 3, \dots\}, \quad b\_s = 1, 2, 3, \dots$$

by averaging the original series X over non-overlapping blocks of size b<sub>s</sub>,

$$x\_t^{(b\_s)} = \left(x\_{tb\_s - b\_s + 1} + \dots + x\_{tb\_s}\right)/b\_s, \quad t \ge 1. \tag{8}$$
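The coarse-graining of Equation (8) is a one-line operation in practice; a minimal sketch (our convention here is to drop any incomplete trailing block):

```python
import numpy as np

def coarse_grain(x, b_s):
    """Average x over non-overlapping blocks of size b_s (Equation 8)."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // b_s) * b_s        # discard the incomplete trailing block
    return x[:n].reshape(-1, b_s).mean(axis=1)

x = np.arange(1.0, 11.0)             # 1, 2, ..., 10
print(coarse_grain(x, 3))            # [2. 5. 8.]
```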

MSE analysis involves (i) choosing a finite scale ε̂ in phase space, and (ii) computing S<sub>e</sub> from the original data X and the smoothed data X<sup>(b<sub>s</sub>)</sup> at the chosen scale ε̂. For convenience of later discussion, we write K<sub>2</sub><sup>(b<sub>s</sub>)</sup>(ε) for the correlation entropy of the smoothed data. When b<sub>s</sub> = 1, it is the correlation entropy of the original data, denoted simply K<sub>2</sub>(ε).

We emphasize that the length of the smoothed time series is only 1/b<sub>s</sub> of the original one. To fully resolve the scaling behavior of K<sub>2</sub>(ε), the requirement on data length is quite stringent. A fundamental question is therefore whether MSE calculated from short noisy data is meaningful.

# 3. Results

#### 3.1. Scaling for the MSE of Fractal Time Series

Among the most widely used models for biological signals, including HRV, EEG, and posture (Gao et al., 2011a), is the fractal time series with long memory, the so-called 1/f<sup>α</sup> process with α = 2H − 1 (i.e., 1/f<sup>2H−1</sup>), where 0 < H < 1 is the Hurst parameter, whose value determines the correlation structure of the data (Gao et al., 2006a, 2007): when H = 1/2, the process is like the independent steps of standard Brownian motion; when H < 1/2, the process has antipersistent correlations; when H > 1/2, the process has persistent correlations. Two special cases, white noise with H = 0.5 and the 1/f process with H = 1, have been extensively used in the development of multivariate MSE (Ahmed and Mandic, 2011). In this subsection, we derive fundamental scalings for the MSE of the ubiquitous 1/f<sup>2H−1</sup> noise.

A covariance stationary stochastic process X = {X<sub>t</sub> : t = 0, 1, 2, . . .}, with mean µ, variance σ<sup>2</sup>, and autocorrelation function r(w), w ≥ 0, is said to have long range correlation if r(w) is of the form (Cox, 1984)

$$r(w) \sim w^{2H-2}, \text{ as } w \to \infty,\tag{9}$$

where 0 < H < 1 is the Hurst parameter. When 1/2 < H < 1, ∑<sub>w</sub> r(w) = ∞, leading to the term long range correlation. Note that the X time series has a power spectral density (PSD) of 1/f<sup>2H−1</sup>. Its integration {y<sub>t</sub>}, where y<sub>t</sub> = ∑<sup>t</sup><sub>i=1</sub> x<sub>i</sub>, is called a random walk process, which is nonstationary with PSD 1/f<sup>2H+1</sup>. Being 1/f processes, they cannot be aptly modeled by Markov processes or ARIMA models (Box and Jenkins, 1976), since the PSDs of those processes are distinctly different from 1/f. To adequately model 1/f processes, fractional order processes have to be used. The most popular is the fractional Brownian motion model (Mandelbrot, 1982), whose increment process is called the fractional Gaussian noise (fGn). The importance and popularity of fGn in modeling various types of noise in science and engineering motivates us to focus our analysis on it when deriving the bi-scaling law.
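For numerical experiments, a 1/f<sup>2H−1</sup>-type noise can be approximated by shaping the Fourier amplitudes of random phases. The sketch below is only a rough spectral approximation (exact fGn synthesis would use, e.g., a circulant-embedding covariance method), and the normalization to unit variance is our choice:

```python
import numpy as np

def spectral_noise(n, H, rng):
    """Approximate 1/f^(2H-1) noise: PSD ~ f^(1-2H) => amplitude ~ f^((1-2H)/2)."""
    freqs = np.fft.rfftfreq(n, d=1.0)
    amp = np.zeros_like(freqs)                     # amp[0] = 0 removes the mean
    amp[1:] = freqs[1:] ** ((1.0 - 2.0 * H) / 2.0)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(freqs))
    x = np.fft.irfft(amp * np.exp(1j * phases), n=n)
    return x / x.std()                             # normalize to unit variance

x = spectral_noise(2 ** 12, H=0.75, rng=np.random.default_rng(2))
print(len(x), round(x.std(), 6))  # 4096 1.0
```

With H = 0.5 the shaped spectrum is flat and the output behaves like white noise; H > 0.5 yields persistent, H < 0.5 antipersistent samples.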

1/f<sup>2H−1</sup> noises are self-similar, with the autocorrelation of the original data and of the smoothed data (defined by Equation 8) being the same (Gao et al., 2006a, 2007). This signifies that there must exist a simple relation between K<sub>2</sub><sup>(b<sub>s</sub>)</sup>(ε) and K<sub>2</sub>(ε). To find this relation, we note that the variance var(X<sup>(b<sub>s</sub>)</sup>) of the smoothed data and the variance σ<sup>2</sup> of the original data are related by the following simple and elegant scaling law (Gao et al., 2006a, 2007),

$$\mathrm{var}(X^{(b\_s)}) = \sigma^2 b\_s^{2H-2} \tag{10}$$

Equation (10) states that the scale ε for the original data is transformed to a smaller scale b<sub>s</sub><sup>H−1</sup>ε for the smoothed data. Using the self-similarity property of the 1/f<sup>2H−1</sup> noise, we therefore obtain,

$$K\_2^{(b\_s)}\left(b\_s^{H-1}\varepsilon\right) = K\_2(\varepsilon) \tag{11}$$

Since for stationary random processes K<sub>2</sub>(ε) diverges when ε → 0, Equation (11) states that K<sub>2</sub><sup>(b<sub>s</sub>)</sup>(b<sub>s</sub><sup>H−1</sup>ε) can be obtained from K<sub>2</sub>(ε) by shifting the curve for K<sub>2</sub>(ε) downward. How much K<sub>2</sub>(ε) should be shifted depends on the functional form of K<sub>2</sub>(ε), which we shall find out momentarily.
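Equation (10) is easy to check numerically. For white noise (H = 1/2) it predicts var(X<sup>(b<sub>s</sub>)</sup>) = σ²/b<sub>s</sub>; a quick sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(1_000_000)     # white noise: sigma^2 = 1, H = 1/2

for b_s in (2, 4, 8, 16):
    n = (len(x) // b_s) * b_s
    smoothed = x[:n].reshape(-1, b_s).mean(axis=1)   # Equation (8)
    # Equation (10): var = sigma^2 * b_s^(2H-2) = 1 / b_s when H = 1/2
    print(b_s, round(smoothed.var(), 4), round(1.0 / b_s, 4))
```

The empirical and predicted variances agree to within sampling error; for persistent noise (H > 1/2) the variance would decay more slowly than 1/b<sub>s</sub>.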

First we note that for 1-D independent random variables, which correspond to H = 1/2, h(ε, τ) ∼ − ln ε (Gaspard and Wang, 1993). Therefore, K<sub>2</sub>(ε) ∼ − ln ε. In fact, for any stationary noise process, irrespective of its correlation structure, we always have C<sup>(m)</sup>(ε) ∼ ε<sup>m</sup> as ε → 0; therefore,

$$K\_2(\varepsilon) \sim -\ln \varepsilon, \quad \varepsilon \to 0 \tag{12}$$

Equation (12) is, however, not adequate for understanding the scaling of K<sub>2</sub>(ε) on finite scales. To gain more insight, we resort to the rate distortion function, or the Shannon-Kolmogorov (SK) entropy (Berger, 1971; Gaspard and Wang, 1993). It is thought to diverge with ε in the same way as the (ε, τ)-entropy and K<sub>2</sub>(ε) (Gaspard and Wang, 1993).

Suppose we wish to approximate the random signal X(t) by Z(t) according to

$$\rho(X, Z) = \lim\_{T \to \infty} \frac{1}{T} \int\_0^T \left\langle [X(t) - Z(t)]^2 \right\rangle dt \le \varepsilon^2 \tag{13}$$

where ⟨·⟩ denotes ensemble averaging. Equation (13) may be considered a partition of the phase space containing the random signal X(t) by centering around X(t). Denote the conditional probability density for Z given x by q(z|x). The mutual information I(q) between X and Z is a functional of q(z|x),

$$I(q) = \int \int d\mathbf{x} dz \, p(\mathbf{x}) q(\mathbf{z}|\mathbf{x}) \ln[q(\mathbf{z}|\mathbf{x})/q(\mathbf{z})].\tag{14}$$

The SK (ε, τ )-entropy is

$$H\_{\text{SK}}(\varepsilon, \tau, T) = \inf\_{q \in Q(\varepsilon)} I(q) \tag{15}$$

where Q(ε) is the set of all conditional probabilities q(z|x) such that Condition (13) is satisfied. The SK (ε, τ )-entropy per unit time is then

$$h\_{\text{SK}}(\varepsilon, \tau) = \lim\_{T \to \infty} H\_{\text{SK}}(\varepsilon, \tau, T)/T \tag{16}$$

For stationary Gaussian processes, h<sub>SK</sub>(ε, τ) can be readily computed by the Kolmogorov formula (Berger, 1971; Kolmogorov, 1956). In the case of a discrete-time process, it reads

$$\varepsilon^2 = \frac{1}{2\pi} \int\_{-\pi}^{\pi} \min[\theta, \Phi(\omega)]\, d\omega \tag{17}$$

$$h\_{\text{SK}}(\varepsilon, \tau) = \frac{1}{4\pi} \int\_{-\pi}^{\pi} \max\{0, \ln[\Phi(\omega)/\theta]\}\, d\omega \tag{18}$$

where Φ(ω) is the PSD of the process and θ is an intermediate variable.

We now evaluate the SK entropy for a popular model of 1/f<sup>2H−1</sup> noise, the fractional Gaussian noise (fGn). It is a stationary Gaussian process with PSD 1/ω<sup>2H−1</sup>. Since we are primarily interested in small ε, we may choose the intermediate variable θ ≤ Φ(ω). Let us denote Φ(ω) = B(H)ω<sup>1−2H</sup>, where B(H) is a factor depending on H. When H = 1/2, it equals the variance of the noise, σ<sup>2</sup><sub>H=1/2</sub>. Using Equations (17) and (18), we immediately have

$$h\_{\text{SK}}(\varepsilon) = A(H) - \ln \varepsilon \tag{19}$$

where

$$A(H) = \frac{1 - 2H}{2}(\ln \pi - 1) + \frac{1}{2}\ln B(H) \tag{20}$$

If we assume fGn of different H to have the same variance, then ∫<sub>0</sub><sup>π</sup> Φ(ω)dω is a constant independent of H. A(H) can then be written as

$$A(H) = \frac{1}{2} \ln \sigma\_{H=1/2}^2 + \frac{1}{2} \left[ \ln(2 - 2H) - (1 - 2H) \right] \tag{21}$$

A(H) is maximal when H = 1/2. However, when H is not close to 0 or 1, the term (1/2)[ln(2 − 2H) − (1 − 2H)] is negligibly small, signifying that h<sub>SK</sub>(ε) cannot readily classify fGn of different H.

Since h<sub>SK</sub>(ε) and K<sub>2</sub>(ε) diverge in the same fashion (Gaspard and Wang, 1993), using Equation (12) to determine the prefactor, we have, for finite ε, the scaling

$$K\_2(\varepsilon) \sim -\ln \varepsilon \tag{22}$$

Combining Equations (22) and (11), we arrive at a fundamental bi-scaling law for K<sub>2</sub><sup>(b<sub>s</sub>)</sup>(ε) for fractal time series:

$$K\_2^{(b\_s)}(\varepsilon) \sim (H - 1) \ln b\_s - \ln \varepsilon \tag{23}$$

To verify the above bi-scaling law, and more importantly, to gain insight into the relative importance of the two scale parameters b<sub>s</sub> and ε in MSE analysis, we numerically performed MSE analysis of fGn processes with different H. A few examples are shown in **Figures 1**, **2**. The computations were done with 2<sup>14</sup> points and m = 2. We observe excellent bi-scaling relations, thus verifying Equation (23). Recalling our earlier comment that K<sub>2</sub>(ε) itself is not very useful for distinguishing fGn of different H, **Figure 2** clearly shows that the scaling K<sub>2</sub><sup>(b<sub>s</sub>)</sup>(ε) ∼ (H − 1) ln b<sub>s</sub> can aptly separate fGn processes of different H. In fact, the H values estimated from **Figure 2** are fully consistent with the values of H chosen in simulating the fGn processes. This analysis thus demonstrates the major advantage of the scale parameter b<sub>s</sub> over ε for the study of fGn processes using MSE. It also makes clear that MSE is a highly non-trivial extension of the sample entropy, and more generally, of the correlation entropy K<sub>2</sub>(ε).
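The b<sub>s</sub>-dependence predicted by Equation (23) can be reproduced with a compact MSE pipeline: coarse-grain per Equation (8), then estimate S<sub>e</sub> at one fixed phase-space scale ε̂. For white noise (H = 1/2) the entropies should fall off roughly as −(1/2) ln b<sub>s</sub>. The values N = 2048, m = 2, and ε̂ = 0.3 × std below are our illustrative choices, not those used for the figures:

```python
import numpy as np

def sampen(x, m, eps):
    """ln C^(m)(eps) - ln C^(m+1)(eps) at a fixed scale eps (max norm, L = 1)."""
    x = np.asarray(x, dtype=float)
    n_v = len(x) - m
    frac = []
    for mm in (m, m + 1):
        V = np.column_stack([x[i:i + n_v] for i in range(mm)])
        d = np.zeros((n_v, n_v))
        for col in V.T:                  # running max keeps memory modest
            np.maximum(d, np.abs(col[:, None] - col[None, :]), out=d)
        iu = np.triu_indices(n_v, k=1)
        frac.append(np.mean(d[iu] <= eps))
    return np.log(frac[0]) - np.log(frac[1])

rng = np.random.default_rng(4)
x = rng.standard_normal(2048)
eps = 0.3 * x.std()                      # one fixed scale for every b_s

mse = []
for b_s in (1, 2, 4, 8):
    n = (len(x) // b_s) * b_s
    xb = x[:n].reshape(-1, b_s).mean(axis=1)   # Equation (8)
    mse.append(sampen(xb, m=2, eps=eps))
print(mse)   # decreases with b_s, with slope close to -1/2 per ln b_s
```

Note that ε̂ is computed once from the original series and held fixed across scales, as Equation (23) requires; letting ε̂ track the shrinking standard deviation of the smoothed data would hide the b<sub>s</sub>-dependence.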

While Equation (23) is fundamental for MSE, it can also help us better understand the behavior of multivariate MSE, which has been shown in numerical simulations to be almost constant for 1/f processes with H = 1, and to decay in a well-defined fashion for white noise, where H = 1/2, and for some randomized data derived from experimental data possibly with correlations (Ahmed and Mandic, 2011). The reason is now clear. For a 1/f process, H = 1, and therefore MSE or multivariate MSE does not vary with the scale parameter b<sub>s</sub>. For white noise or derived randomized data, H = 1/2, and therefore MSE or multivariate MSE decays with the scale parameter b<sub>s</sub> in a well-defined fashion,

$$K\_2^{(b\_s)}(\varepsilon) \sim -\frac{1}{2} \ln b\_s, \quad \text{or} \quad b\_s \sim e^{-2K\_2(\varepsilon)}.\tag{24}$$

One can readily check that the MSE curve for white noise shown in Ahmed and Mandic (2011) is fully consistent with the formula derived here.

#### 3.2. Heart Rate Variability Data Analysis

As an important application of MSE, we analyze HRV data for the purpose of distinguishing healthy subjects from patients with CHF, a life-threatening condition. This is an important issue; we refer to Hu et al. (2009, 2010) and references therein for the background. Note that part of the data examined here were analyzed in prior work (Ivanov et al., 1999; Barbieri and Brown, 2006) for the same purpose. We analyze all 33 datasets here. For ease of comparison, we take the first 3 × 10<sup>4</sup> points of both groups of HRV data for analysis. Note that, based on varying the b<sub>s</sub> parameter, MSE was not very good at separating the two groups (Hu et al., 2010). This instigated a debate on whether MSE is useful for analyzing HRV (Wessel et al., 2003; Nikulin and Brismar, 2004). To resolve this debate, and more importantly, to satisfactorily separate the two groups of HRV data, we shall focus on the dependence of MSE on the scale parameter ε in the following discussion.

Since earlier studies find HRV data to be nonstationary, having a 1/f spectrum with anti-persistent long-range correlations and multifractality (see Ivanov et al., 1999 and references therein), we analyze the increment processes of the HRV data. **Figure 3** shows K<sub>2</sub>(ε) vs. ln ε curves for the two groups of HRV data. We observe: (i) On small scales, the K<sub>2</sub>(ε) vs. ln ε curves for both groups of HRV data show good scaling behavior. As a consequence, one can expect a scaling relation between $K_2^{(b_s)}(\epsilon)$ and ln b<sub>s</sub> (Equation 23). This is indeed so; the results, being very similar to those shown in **Figure 2**, are not shown here. (ii) The scaling of K<sub>2</sub>(ε) vs. ln ε is better and extends over a longer range for the normal HRV data. (iii) As indicated by ε<sup>∗</sup> in the figure, the smallest scale resolvable by the HRV data of the healthy subjects is much larger than that of the diseased subjects.

We now discuss how to use MSE to distinguish the healthy subjects from patients with CHF. We have found: (i) The curves $K_2^{(b_s)}(\epsilon)$ vs. b<sub>s</sub> averaged over all the subjects within the two groups are different, just as reported in Costa et al. (2005). However, such curves are not very useful as a diagnostic tool for separating the two groups, as pointed out in Nikulin and Brismar (2004). The fundamental reason, of course, is that the Hurst parameter H is not very effective in distinguishing healthy subjects from patients with CHF, as quantitatively analyzed in Hu et al. (2010). (ii) The smallest resolvable scale, ε<sup>∗</sup>, completely separates the healthy subjects from patients with CHF, as shown by **Figure 3**. Note that the scale parameter ε is a generalization of the concept of variance (or standard deviation). The observation made by Nikulin and Brismar (2004), that a variance-like parameter is better than MSE with varying block size parameter b<sub>s</sub> in distinguishing healthy subjects from patients with CHF, is most appropriately interpreted as follows: the parameter b<sub>s</sub> is less important than the scale parameter ε. This is somewhat the opposite of the case of the 1/f noise analyzed in the last section.

To see more clearly how much more advantageous ε is than b<sub>s</sub> in distinguishing healthy subjects from patients with CHF, we examine how the scaling K<sub>2</sub>(ε) ∼ − ln ε can be used for this purpose. We have found that the errors obtained by linearly fitting the K<sub>2</sub>(ε) vs. ln ε curves of **Figure 3** are much smaller for the normal HRV data than for those of CHF patients, and also can completely separate the healthy subjects from patients with CHF. This is shown in **Figure 4**. Therefore, the scale parameter ε is indeed more important than b<sub>s</sub>.
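The goodness-of-fit feature just described can be sketched on synthetic curves (illustrative code only; the curve shapes and the helper name `scaling_fit_error` are assumptions, and no real HRV data is used). A curve that obeys K<sub>2</sub>(ε) ∼ − ln ε well has a small linear-fit residual, while a curved one does not:

```python
import numpy as np

def scaling_fit_error(eps, k2):
    """RMS residual of a linear fit of K2 against ln(eps)."""
    x = np.log(eps)
    coef = np.polyfit(x, k2, 1)
    return float(np.sqrt(np.mean((np.polyval(coef, x) - k2) ** 2)))

eps = np.geomspace(0.05, 1.0, 20)
k2_good = -np.log(eps) + 0.5                       # clean K2(eps) ~ -ln(eps) scaling
k2_broken = -np.log(eps) + 0.4 * np.log(eps) ** 2  # curved: the scaling breaks down

print(scaling_fit_error(eps, k2_good))    # ~0: small residual (normal-like curve)
print(scaling_fit_error(eps, k2_broken))  # clearly larger residual (CHF-like curve)
```

Thresholding such a residual is one simple way to turn the scaling quality into a classifier feature.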

FIGURE 3 | K<sub>2</sub>(ε) vs. **ln** ε curves for the HRV data of (A) 18 normal subjects and (B) 15 patients with CHF. Each curve corresponds to one subject. The computations were done with 3 × 10<sup>4</sup> points and *m* = 5. ε<sup>∗</sup> indicates the smallest scale resolvable by the data.

#### 3.3. Epileptic Seizure Detection Through MSE of EEG

Epilepsy is a common and debilitating brain disorder characterized by intermittent seizures, during which the normal activity of the central nervous system is disrupted. Concrete symptoms include abnormal running/bouncing fits, clonus of face and forelimbs, or tonic rearing movements, as well as the simultaneous occurrence of transient EEG signals such as spikes, spike-and-slow-wave complexes, or rhythmic slow wave bursts. Clinical effects may include motor, sensory, affective, cognitive, autonomic, and physical symptomatology. To make medications effective, timely detection of seizures is very important. In the past several decades, considerable efforts have been made to detect/predict seizures through nonlinear analysis of EEGs. For a list of the major nonlinear methods proposed for seizure detection, we refer to Gao and Hu (2013) and references therein. In particular, the three groups of EEG data analyzed here, H (healthy), E (epileptic subjects during a seizure-free interval), and S (epileptic subjects during seizure), were examined by adaptive fractal analysis (Gao et al., 2011c) and the scale-dependent Lyapunov exponent (Gao et al., 2012), and excellent classification was achieved.

To examine how well MSE characterizes the three groups of EEG data, we have plotted in **Figure 5** the mean MSE curves for the three groups, for two values of the phase-space scale parameter ε. We observe that they separate very well; indeed, a statistical test shows that the separations are significant. In particular, for ε = 0.2, the MSE curve for the S group lies well below the other two curves. One may be tempted to interpret this as smaller complexity of the seizure EEG. However, such an interpretation is informative only relative to the specific ε chosen, here 0.2. When ε = 0.05, the red curve for seizure EEG actually lies above the other two curves for larger b<sub>s</sub>. In fact, on reflection, one realizes that such interpretations are not too helpful for clinical applications, since MSE can vary substantially within and across the groups.

We have tried to use MSE at specific b<sub>s</sub> values to classify the three groups of EEG. Guided by the mean MSE curves in **Figure 5**, we have found that when ε = 0.2, if only two b<sub>s</sub> values can be used, then b<sub>s</sub> = 2 and 15 are the optimal choices. The result of the classification is shown in **Figure 6A**. We observe that there are some overlaps between groups H (healthy) and E (epileptic subjects during a seizure-free interval), as well as between E and S (epileptic subjects during seizure). Intuitively, this is reasonable. Overall, the classification is not very satisfactory. How may we improve its accuracy?

Recall that in fractal scaling analysis of EEG, EEG data are found to be equivalent to random walk processes, not to noise or increment processes (Gao et al., 2011c); the latter amount to a differentiation of the random walk processes. Since the basic scaling law derived here is for noise or increment processes, not for random walk processes, this suggests computing MSE from the differenced EEG data, defined by y<sub>i</sub> = x<sub>i</sub> − x<sub>i−1</sub>, where x<sub>i</sub> is the original EEG signal. The mean MSE curves for the differenced EEG data are shown in **Figure 7**, again for two ε values. We observe that the separation between the mean MSE curves becomes wider. Indeed, classification of the 3 EEG groups now is much improved, as shown in **Figure 6B**. It should be noted, however, that the accuracy of the classification is still slightly worse than that of other methods, such as adaptive fractal analysis (Gao et al., 2011c) and the scale-dependent Lyapunov exponent (Gao et al., 2012).
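The differencing step above is elementary but worth making concrete. A minimal sketch (synthetic data only; the variable names are illustrative) builds a random-walk-like signal and shows that first differencing recovers a stationary noise process, the kind of process for which the bi-scaling law was derived:

```python
import numpy as np

rng = np.random.default_rng(1)
increments = rng.standard_normal(4097)   # stationary noise (increment) process
eeg_like = np.cumsum(increments)         # random-walk-like signal, as EEG is reported to be

diffed = np.diff(eeg_like)               # y_i = x_i - x_{i-1}

print(np.allclose(diffed, increments[1:]))  # True: differencing recovers the noise process
print(np.std(diffed))                       # close to the increments' unit standard deviation
```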

# 4. Conclusion and Discussion

To better understand MSE, we have derived a fundamental bi-scaling relation for MSE analysis. While MSE analysis normally focuses only on the scale parameter b<sub>s</sub>, with ε more or less arbitrarily chosen, our analysis of fGn and HRV data clearly demonstrates that both scale parameters are important: in the case of HRV analysis, ε is more important, while in the case of 1/f noise, b<sub>s</sub> is more important. In fact, we have shown (Hu et al., 2010) that MSE, when used with ε fixed, is not very effective in distinguishing healthy subjects from patients with CHF. The accuracy achieved when we focus on the scaling K<sub>2</sub>(ε) ∼ − ln ε is not only much higher, but also comparable to that obtained using the scale-dependent Lyapunov exponent (SDLE) (Gao et al., 2006a, 2007, 2013), as reported in Hu et al. (2010). The fundamental reason, of course, is that SDLE has a scaling similar to K<sub>2</sub>(ε) ∼ − ln ε.

We have also computed MSE for the original as well as the differenced data of the three EEG groups, H (healthy), E (epileptic subjects during a seizure-free interval), and S (epileptic subjects during seizure), and found that the mean MSE curves for the three groups are well separated. The classification of the 3 EEG groups using MSE at two specific scale parameters b<sub>s</sub> is reasonably good, and is better for the differenced data than for the original EEG data. This strongly suggests that EEG data are like random walk processes. However, even with the differenced EEG data, the classification is still not as accurate as with adaptive fractal analysis (Gao et al., 2011c) and the scale-dependent Lyapunov exponent (Gao et al., 2011a). One reason for this inferiority lies in the difference in the range of scales covered by these three multiscale methods. Adaptive fractal analysis and the scale-dependent Lyapunov exponent both cover the entire range of scales present in the EEG data. However, with the length of the EEG data, which is only 4097 points for each data set, MSE can only cover a moderate range

of scales, with the largest b<sub>s</sub> only around 20, since with b<sub>s</sub> = 20 the smoothed data are already only about 200 points long. Our analysis here raises an important question: how do we use MSE to analyze short data? We conjecture that it may be beneficial to focus on the scaling K<sub>2</sub>(ε) ∼ − ln ε, or to develop new smoothing schemes by introducing a parameter equivalent to 1/b<sub>s</sub> but without sacrificing the length of the smoothed data.
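The scale budget just described can be stated as a two-line computation (the helper names and the minimum-length cutoff of 200 points are illustrative assumptions, not part of the paper):

```python
def coarse_grained_length(n, bs):
    """Points remaining after non-overlapping averaging at scale bs."""
    return n // bs

def largest_scale(n, min_points=200):
    """Largest bs that still leaves at least min_points coarse-grained samples."""
    bs = 1
    while coarse_grained_length(n, bs + 1) >= min_points:
        bs += 1
    return bs

print(coarse_grained_length(4097, 20))  # 204: the roughly 200 points mentioned above
print(largest_scale(4097))              # 20
```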

# Acknowledgments

One of the authors (JG) is grateful for the generous support of the National Institute for Mathematical and Biological Synthesis (NIMBIOS) at the University of Tennessee to attend the Heart Rhythm Disorders Investigative Workshop.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Gao, Hu, Liu and Cao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A three-dimensional mathematical model for the signal propagation on a neuron's membrane

### Konstantinos Xylouris \* and Gabriel Wittum

*Department of Simulation and Modeling, Faculty of Informatics, Goethe Center for Scientific Computing, Goethe University Frankfurt, Frankfurt am Main, Germany*

In order to examine the extracellular potential's influence on network activity and to better understand dipole properties of the extracellular potential, we present and analyze a three-dimensional formulation of the cable equation which facilitates numerical simulations. When the neuron's intra- and extracellular spaces are assumed to be purely resistive (i.e., to contain no free charges), the balance law of electric fluxes leads to the Laplace equation for the distribution of the intra- and extracellular potential. Moreover, the flux across the neuron's membrane is continuous. This observation already delivers the three-dimensional cable equation. The coupling of the intra- and extracellular potential across the membrane is not trivial. Here, we present a continuous extension of the extracellular potential to the intracellular space and combine the resulting equation with the intracellular problem. This approach makes the system numerically accessible. On the basis of the assumed purely resistive intra- and extracellular spaces, we conclude that a cell's out-flux balances out completely. As a consequence, neurons do not possess any current monopoles. We present a rigorous analysis with spherical harmonics for the extracellular potential by approximating the neuron's geometry by a sphere. Furthermore, we show with first numerical simulations under idealized conditions that the extracellular potential can have a decisive effect on network activity through ephaptic interactions.

Keywords: models, theoretical, ephaptic coupling, dipole effect, detailed 3D-modeling, 3D-modeling, cable equation

# Introduction

The membrane potential is among the most important quantities of a neuron. As a function of time and space, it describes neuronal activity. It is a voltage across the membrane, defined as the difference between the intra- and extracellular potential.

Since the neuron is embedded in ionic milieus, potential gradients in the off-membrane spaces result in electric fluxes, which are conserved according to first principles. This conservation law is the basis of the standard cable equation, which describes the unfolding and propagation of an action potential very efficiently (Rall, 1962, 1964; Scott, 1975). The standard cable equation maps a neuron onto a tree of lines, each of which corresponds to a cylindrical compartment with mean diameter. On these structures, it computes the evolution of the membrane potential according to its diffusion equation.

#### Edited by:

*Tobias Alecio Mattei, Brain & Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA*

#### Reviewed by:

*Ingo Bojak, University of Reading, UK Le Wang, Boston University, USA Xin Tian, Tianjin Medical University, China Mikhail Katkov, Weizmann Institute of Science, Israel*

#### \*Correspondence:

*Konstantinos Xylouris, Department of Simulation and Modeling, Faculty of Informatics, Goethe Center for Scientific Computing, Goethe University Frankfurt, Kettenhofweg 139, Frankfurt am Main 60325, Germany konstantinos.xylouris@ gcsc.uni-frankfurt.de*

> Received: *17 March 2015* Accepted: *02 July 2015* Published: *17 July 2015*

#### Citation:

*Xylouris K and Wittum G (2015) A three-dimensional mathematical model for the signal propagation on a neuron's membrane. Front. Comput. Neurosci. 9:94. doi: 10.3389/fncom.2015.00094*

The resulting extracellular potentials can be theoretically computed with the line source method (Holt and Koch, 1999; Gold et al., 2006), once the transmembrane currents have been determined with the aid of the cable equation's solution.

These extracellular potentials in turn can be exploited to examine ephaptic feedbacks on other neurons (Holt and Koch, 1999). Indeed, the distribution of the extracellular potential can elicit transmembrane currents which may have decisive effects on the membrane potential of neighboring cells (Anastassiou et al., 2011; Buzsáki et al., 2012).

The goal of the current paper is to develop and implement an integrated three-dimensional model which synchronously captures both quantities, the membrane potential and the extracellular potential, during activity, and which uses the neuron's geometry as it is instead of reducing it to cylindrical compartments. The aim of such a model is to deepen the knowledge of signal processing and to carry out simulations on small networks of realistic neurons with all these influences in action.

The work of Voßen et al. (2007) took a first step in the development of a generalized cable equation, built on the principle of the continuity of electric fluxes. Although the core model with the intra-, extracellular, and membrane potential was correctly derived, the subsequent approach used to couple these unknowns and to solve them numerically ran into major difficulties. The limit case of the standard cable equation posed further challenges, and the simulations themselves were restricted to a very short time period of hundreds of microseconds on a small part of a passive membrane.

The study of Xylouris et al. (2010) used a more direct approach for the coupling and generalized the existing model to active membranes. Nonetheless, although it was capable of reproducing action potentials, it still failed to capture many characteristics of signal processing, such as the width of the propagating signal, the waveform of the extracellular potential during activity, and computation on more complicated geometries; indeed, computations on more complicated geometries diverged numerically. Furthermore, the membrane potential's defining equation in Xylouris et al. (2010) was the transmembrane current, which contains the time derivative of the membrane's capacitive property as its only differential operator. The membrane potential's propagation was provided indirectly through the difference between the intra- and extracellular potential, making it hard to expect correct results for the spatial distribution. Moreover, as a consequence, it produced vanishing transmembrane currents, causing zero extracellular potentials and zero ephaptic interactions. This is why the solving procedure with this direct coupling was of little use.

This paper introduces a completely new coupling of the unknowns, in which the defining equation for the membrane potential contains its own spatial differential operator. For the first time, we could carry out simulations on three-dimensionally resolved ideal neurons and on a small network of cells. This description, furthermore, allows for a proof that the extracellular potential distributes in the extracellular space like a current multipole. We will show that the only current monopole for a neuron exists at rest.

# Model

# Three-Dimensional Cable Equation

Let Ω<sub>in</sub> and Ω<sub>out</sub> be domains in R<sup>3</sup> denoting the neuron's intra- and extracellular space, respectively, and let $\bar{\Omega}_{\rm in} \cap \bar{\Omega}_{\rm out} = \Gamma$ be the membrane, a two-dimensional manifold embedded in R<sup>3</sup>. Let Ω = Ω<sub>in</sub> ∪ Ω<sub>out</sub> = R<sup>3</sup> be the whole space. Let, furthermore, Φ<sub>in</sub>, Φ<sub>out</sub>, and V<sub>m</sub> be the intra-, extracellular, and membrane potential, respectively. Φ will represent either Φ<sub>in</sub> or Φ<sub>out</sub>.

The quantities σ<sub>in</sub> and σ<sub>out</sub> denote the intra- and extracellular conductivities, respectively. The normal n<sub>in→out</sub> is the normal on the membrane Γ pointing from the intracellular space to the extracellular one. We will need these quantities in order to define the fluxes. For the active transmembrane flux, we will consider just the Hodgkin–Huxley model, for simplicity of notation. There we have the sodium conductivity g<sub>Na<sup>+</sup></sub>, the potassium conductivity g<sub>K<sup>+</sup></sub>, and the leakage conductivity g<sub>L</sub>. The quantities E<sub>Na<sup>+</sup></sub>, E<sub>K<sup>+</sup></sub>, and E<sub>L</sub> denote the reversal potentials of the indexed ions. The gating parameters n, m, h obey ordinary differential equations (Hodgkin and Huxley, 1952) and calibrate how much of the maximal possible ionic flux passes through the channel.

Considering the non-membrane conductivity (≈ 3 mS/cm) (López-Aguado et al., 2001) and the dielectricity of water (≈ 1), Gary Holt demonstrated in his Ph.D. thesis (Holt, 1997) that a possible non-membrane capacitor would discharge with a time constant of approximately 3 ns. Because this time scale is much faster than that of the phenomena considered (even the fast channel dynamics react on a µs time scale), it is a good approximation to assume no capacitive properties for the non-membrane spaces (ρ = 0 in Ω<sub>in</sub> and Ω<sub>out</sub>). Indeed, this is the basis of the derivation of the three-dimensional cable equation. In addition, we will assume time-invariant magnetic fields ($d\vec{B}/dt = 0$). Then Gauß's and Faraday's laws hold in the intra- and extracellular space, so that the conservative electric field can be expressed with the aid of a potential gradient. Combining this gradient with Gauß's law immediately leads to the Laplace equation for the potentials in the non-membrane spaces.

$$\nabla \cdot \vec{E} = \frac{\rho}{\epsilon \epsilon\_0} \stackrel{(\rho=0)}{=} 0,\tag{1}$$

$$\nabla \times \vec{E} = -\frac{d\vec{B}}{dt} \stackrel{!}{=} \mathbf{0},\tag{2}$$

$$
\Rightarrow \vec{E} = -\nabla \Phi,\tag{3}
$$

$$
\Rightarrow -\Delta \Phi = 0.\tag{4}
$$

The constants ǫ<sub>0</sub> and ǫ are the dielectricities in vacuum and in the material, respectively.
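As a small illustrative check of Equation (4) (not part of the paper's numerics; the sample point and step size are arbitrary choices), the point-source potential Φ = 1/r is harmonic away from the source, which a central-difference Laplacian confirms:

```python
import numpy as np

def phi(p):
    """Point-source potential, up to constants: 1/r (defined for r > 0)."""
    return 1.0 / float(np.linalg.norm(p))

def fd_laplacian(f, p, h=1e-2):
    """Second-order central-difference approximation of the Laplacian at point p."""
    p = np.asarray(p, dtype=float)
    lap = 0.0
    for k in range(3):
        e = np.zeros(3)
        e[k] = h
        lap += (f(p + e) - 2.0 * f(p) + f(p - e)) / h ** 2
    return lap

# away from the source, -ΔΦ = 0 holds, as in Equation (4)
print(abs(fd_laplacian(phi, [1.0, 0.3, -0.2])) < 1e-2)   # True
```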

By flux continuity, the flux across the membrane must correspond to the flux emerging from the membrane dynamics [denoted j<sub>all</sub>(V<sub>m</sub>)]. Hence,

$$-\sigma\_{\rm in} \nabla \Phi\_{\rm in} \cdot n\_{\rm in\to out} = -\sigma\_{\rm out} \nabla \Phi\_{\rm out} \cdot n\_{\rm in\to out} = j\_{\rm all} \quad \text{on } \Gamma. \tag{5}$$

With this boundary condition in mind, we arrive at the three-dimensional cable equation (**Figure 1**):

$$-\Delta\Phi\_{\rm out} = 0 \qquad\qquad\qquad\qquad\text{in }\Omega\_{\rm out},\tag{6}$$

$$-\Delta \Phi\_{\rm in} = 0 \qquad\qquad\qquad\qquad\text{in } \Omega\_{\rm in},\tag{7}$$

$$V\_m = \Phi\_{\rm in} - \Phi\_{\rm out} \qquad \qquad \qquad \text{on} \quad \Gamma. \tag{8}$$

The flux j<sub>all</sub> contains all fluxes passing through the membrane. Considering just the Hodgkin–Huxley model and some additional stimulus, it reads:

$$j\_{\rm all} = c\_m \frac{dV\_m}{dt} + m^3 h g\_{\rm Na^+} (V\_m - E\_{\rm Na^+}) + n^4 g\_{\rm K^+} (V\_m - E\_{\rm K^+})$$

$$+ g\_{\rm L} (V\_m - E\_{\rm L}) + j\_{\rm Stm}.\tag{9}$$

Since it is possible to have different dynamics on each region of the neuronal membrane, we furthermore introduce the following δ-functions

$$\begin{aligned} \delta\_{\text{dend}}(\mathbf{x}) &= \left\{ \begin{array}{ll} 1 & \text{on the dendrite} \\ 0 & \text{else} \end{array} \right\}, \\ \delta\_{\text{active}}(\mathbf{x}) &= \left\{ \begin{array}{ll} 1 & \text{on the soma or nodes of Ranvier} \\ 0 & \text{else} \end{array} \right\}, \\ \delta\_{\text{syn}}(\mathbf{x}) &= \left\{ \begin{array}{ll} 1 & \text{on the postsynaptic density} \\ 0 & \text{else} \end{array} \right\}, \\ \delta\_{\text{stim}}(\mathbf{x}) &= \left\{ \begin{array}{ll} 1 & \text{on the stimulation area} \\ 0 & \text{else} \end{array} \right\}. \end{aligned}$$

Assuming pure resistivity for the non-membrane spaces, the respective potentials distribute according to the Laplace equation therein. These Laplace problems satisfy at the membrane an interface condition which complies with the conservation of fluxes: the flux emerging from the potential equals the total transmembrane flux, denoted *j*<sub>all</sub>, in which all transmembrane currents are accumulated: capacitive, channel, and any stimulation or synaptic currents.

With the help of these δ-functions, we can define a more refined transmembrane flux considering where it precisely occurs.

We define

$$\begin{split} j\_{\rm HH}(n,m,h,V\_{m}) &= m^3 h g\_{\rm Na^+}(V\_m - E\_{\rm Na^+}) + n^4 g\_{\rm K^+}(V\_m - E\_{\rm K^+}) \\ &+ g\_{\rm L}(V\_m - E\_{\rm L}). \end{split} \tag{10}$$

The synaptic activity is simply modeled with the aid of a modified Heaviside function H(x, t). This function should become one as soon as the membrane potential at the pre-synapse exceeds a certain value, say 2 mV, and it remains one for the time the synapse is active, regardless of the presynaptic membrane potential. Additional activation at the pre-synapse should be incorporated by the synaptic function α(V<sub>m</sub>|<sub>pre</sub>, t):

$$j\_{\rm syn}(V\_m|\_{\rm pre}, t) = H(V\_m|\_{\rm pre}, t) \cdot \alpha(V\_m|\_{\rm pre}, t),\tag{11}$$

where Vm|pre is the membrane potential at the presynaptic terminal.
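The latching behavior of H can be sketched as follows (an illustrative model only; the class name, the 2 mV threshold default, and the fixed active `duration` are assumptions, since the paper does not specify how long the synapse stays active):

```python
class SynapticGate:
    """Modified Heaviside gate: latches on when the presynaptic potential
    crosses `threshold` (mV) and stays on for `duration` (ms), regardless
    of the presynaptic potential afterwards."""

    def __init__(self, threshold=2.0, duration=1.0):
        self.threshold = threshold
        self.duration = duration
        self.on_until = None        # time until which the synapse is active

    def H(self, v_pre, t):
        if self.on_until is None and v_pre > self.threshold:
            self.on_until = t + self.duration   # latch: activation is remembered
        return 1.0 if (self.on_until is not None and t <= self.on_until) else 0.0

gate = SynapticGate()
print(gate.H(0.0, 0.0))   # 0.0: below threshold
print(gate.H(5.0, 0.1))   # 1.0: crossed 2 mV, synapse switches on
print(gate.H(0.0, 0.5))   # 1.0: stays on although V_pre dropped again
print(gate.H(0.0, 2.0))   # 0.0: active period over
```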

Then the refined total transmembrane current has the form:

$$\begin{aligned} j\_{\text{all}}(\mathbf{x}, V\_m) &= c\_m \frac{dV\_m}{dt} + \delta\_{\text{active}}(\mathbf{x}) j\_{\text{HH}}(n, m, h, V\_m) \\ &+ \delta\_{\text{stim}}(\mathbf{x}) j\_{\text{Stm}}(t) + \delta\_{\text{syn}}(\mathbf{x}) j\_{\text{syn}}(V\_m|\_{\text{pre}}, t) . \end{aligned} \tag{12}$$
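The ionic part j<sub>HH</sub> of Equation (10) can be sketched directly (a minimal illustration, not the paper's code; the conductances and reversal potentials are textbook Hodgkin–Huxley values assumed here for concreteness):

```python
def j_hh(v, n, m, h,
         g_na=120.0, g_k=36.0, g_l=0.3,      # mS/cm^2, textbook HH values (assumed)
         e_na=50.0, e_k=-77.0, e_l=-54.4):   # mV, reversal potentials (assumed)
    """Ionic transmembrane current density of Equation (10)."""
    return (m ** 3 * h * g_na * (v - e_na)
            + n ** 4 * g_k * (v - e_k)
            + g_l * (v - e_l))

# at a channel's reversal potential that channel carries no current:
# with the Na+ and K+ gates closed, only the leak term remains, and it
# vanishes exactly at v = E_L
print(j_hh(v=-54.4, n=0.0, m=0.0, h=0.0))   # -> 0.0
```

Combining this with the δ-masks of Equation (12) is then a matter of multiplying by indicator values per membrane region.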

#### Numeric Model

The three-dimensional cable equation (Equations 6–8) is a non-symmetric system of PDEs (Φ<sub>in</sub> does not couple with Φ<sub>out</sub> the same way as Φ<sub>out</sub> with Φ<sub>in</sub>) which couples two Laplace equations in the intra- and extracellular space through the transmembrane flux. This flux depends on the membrane potential. One difficulty in solving this system is the coupling of the membrane potential, which lives on a lower-dimensional manifold, with the quantities which live in the full space. Since the discretization of this system is carried out with the help of integrals, the lower-dimensional quantity cannot be measured the same way as the quantities in space (the space integrals do not see it at all). In order to remove this particularity, we extend the membrane potential, which is defined by the difference between the intra- and extracellular potential (V<sub>m</sub> = Φ<sub>in</sub> − Φ<sub>out</sub>) on the membrane, to the intracellular space. To that end, we extend the extracellular potential to the intracellular space and combine its extension with the intracellular potential equation. So we arrive at a problem for the membrane potential in the intracellular space.

Because V<sub>m</sub> = Φ<sub>in</sub> − Φ<sub>out</sub> on the membrane Γ, we will extend Φ<sub>out</sub> continuously to the intracellular space so that the following identity holds. Let this extension be denoted by Φ<sub>out</sub><sup>IN</sup>:

$$V\_m = \Phi\_{\rm in} - \Phi\_{\rm out} = \Phi\_{\rm in} - \Phi\_{\rm out}^{IN} \qquad \text{on} \quad \Gamma,\tag{13}$$

$$
\Rightarrow \Phi\_{\text{out}} = \Phi\_{\text{out}}^{IN} \tag{14}
$$

At this point we have some freedom in choosing the right-hand side of the extracellular potential extension equation. We choose it to be zero. Then it can easily be combined with the intracellular problem (Equation 7), which is a Laplace problem, too. We have

$$\begin{aligned} -\,\Delta\Phi\_{\text{out}}^{IN} &= \,\mathbf{0} & \quad \text{in }\Omega\_{\text{in}},\\ \Phi\_{\text{out}}^{IN} &= \,\Phi\_{\text{out}} & \quad \text{on }\Gamma, \end{aligned} \tag{15}$$

$$\begin{aligned} \Rightarrow -\Delta(\Phi\_{\text{in}} - \Phi\_{\text{out}}^{IN}) = -\Delta V\_m &= 0 & \text{in } \Omega\_{\text{in}},\\ -\sigma\_{\text{in}} \nabla V\_m \cdot n\_{\text{in}\to\text{out}} &= j\_{\text{all}}(V\_m) + \sigma\_{\text{in}} \nabla \Phi\_{\text{out}}^{IN} \cdot n\_{\text{in}\to\text{out}} & \text{on } \Gamma. \end{aligned} \tag{16}$$

Thus, instead of solving the system (Equations 6–8) we solve (**Figure 2**):

$$\begin{aligned} -\Delta \Phi\_{\text{out}} &= 0 & \text{in } \Omega\_{\text{out}}, \\ -\sigma\_{\text{out}} \nabla \Phi\_{\text{out}} \cdot n\_{\text{in}\to\text{out}} &= j\_{\text{all}}(V\_m) & \text{on } \Gamma, \end{aligned} \tag{17}$$

$$\begin{aligned} -\Delta \Phi\_{\text{out}}^{IN} &= 0 & \text{in } \Omega\_{\text{in}}, \\ \Phi\_{\text{out}}^{IN} &= \Phi\_{\text{out}} & \text{on } \Gamma, \end{aligned} \tag{18}$$

$$\begin{aligned} -\Delta V\_m &= 0 & \text{in } \Omega\_{\text{in}}, \\ -\sigma\_{\text{in}} \nabla V\_m \cdot n\_{\text{in}\to\text{out}} &= j\_{\text{all}}(V\_m) + \sigma\_{\text{in}} \nabla \Phi\_{\text{out}}^{IN} \cdot n\_{\text{in}\to\text{out}} & \text{on } \Gamma. \end{aligned} \tag{19}$$

For referencing purposes, we will call the additional current that appears in the boundary condition of the membrane potential equation (Equation 19) the ephaptic current:

FIGURE 2 | By extending the extracellular potential to the intracellular space (green) continuously, an extension of the membrane potential into the intracellular space is established. By means of this trick we obtain a coupling between the extracellular and the membrane potential which can be directly used for numerics and simulations.

$$j\_{\rm eph} := \sigma\_{\rm in} \nabla \Phi\_{\rm out}^{IN} \cdot n\_{\rm in\rightarrow\rm out}. \tag{20}$$

### Numeric Discretization and Procedures

In space, we discretize this system (Equations 17–19) with the finite volume method (Versteeg and Malalasekera, 2007). This method guarantees the local conservation of fluxes, which is necessary because the model has been derived from this principle; furthermore, important characteristics of the solution, as we will see in the following section, depend on this conservation. In time, an implicit method is used, while the nonlinearity is resolved with Newton's method.

Similarly to the finite element method, we discretize the domain with volume elements, for example tetrahedra, whose vertices and edges form the grid Ω<sub>h</sub>, and we approximate the unknown functions (in our case V<sub>m</sub>, Φ<sub>out</sub>, and Φ<sub>out</sub><sup>IN</sup>) with a linear combination of shape functions. Our shape functions b<sub>j</sub>(x) are continuous and linear on each element (j = 0, ..., N). There are as many of them as there are grid points (#Ω<sub>h</sub> = N), and they are uniquely determined by the following defining conditions:

$$b\_j(\mathbf{x}\_k) = \delta\_{jk}, \quad \mathbf{x}\_k \in \Omega\_h, \tag{21}$$

$$b\_j(\mathbf{x})\text{ is continuous and linear on each element}\qquad(22)$$

We represent our unknown functions in terms of these shape functions:

$$V\_m(\mathbf{x}, t) = \sum\_{j=0}^{N} \nu\_{m\_j}^t b\_j(\mathbf{x}), \tag{23}$$

$$\Phi\_{\rm out}(\mathbf{x}, t) = \sum\_{j=0}^{N} \phi\_{\rm out\_j}^t b\_j(\mathbf{x}), \tag{24}$$

$$\Phi\_{\rm out}^{IN}(\mathbf{x}, t) = \sum\_{j=0}^{N} \phi\_{\rm out\_j}^{IN, t} b\_j(\mathbf{x}). \tag{25}$$

The purpose of the discretization scheme is to derive, from the differential Equations (17–19), linear systems which uniquely determine the unknown coefficients $v^t_{m_j}$, $\phi^t_{{\rm out}_j}$, $\phi^{IN,t}_{{\rm out}_j}$ of these linear combinations. The upper index t indicates that these coefficients are time dependent.

For the finite volume method, we need to construct a so-called dual grid, which arises from the domain discretization and which is used to discretize the differential space operators. We call the elements of the dual grid control volumes. The volume elements of the dual grid are defined by the edge points, which correspond to the barycenters of the initial tetrahedra, and the barycenters of their sides and edges. By this construction, we create as many control volumes as there are nodes in the grid Ω<sub>h</sub>. Let B<sub>k</sub> be the control volume of the k-th grid node. We integrate the differential equations over this control volume and apply Gauß's integral theorem:

$$-\Delta\Phi\_{\rm out}(x,t) = 0\tag{26}$$

$$\int\_{B\_k} -\Delta \sum\_{j=0}^N \phi\_{\text{out}j}^t b\_j(\mathbf{x}) d\mathbf{x} = -\int\_{B\_k} \sum\_{j=0}^N \phi\_{\text{out}j}^t \Delta b\_j(\mathbf{x}) d\mathbf{x} \qquad (27)$$

$$= \int\_{\partial B\_k} \sum\_{j=0}^N \phi\_{\text{out}j}^t \nabla b\_j(\mathbf{x}) \cdot \vec{n}(\mathbf{x}) dS(\mathbf{x})$$

$$= \sum\_{j=0}^N \phi\_{\text{out}j}^t \int\_{\partial B\_k} \nabla b\_j(\mathbf{x}) \cdot \vec{n}(\mathbf{x}) dS(\mathbf{x})$$

$$= \sum\_{j=0}^N \phi\_{\text{out}j}^t a\_{kj}.$$

Because ∂B<sub>k</sub> is a polyhedron and b<sub>j</sub>(x) is analytically known, the integrals $\int_{\partial B_k} \nabla b_j(\mathbf{x}) \cdot \vec{n}(\mathbf{x})\, dS(\mathbf{x}) = a_{kj}$ can be computed analytically. Furthermore, on the membrane these integrals equal the transmembrane flux (Equation 12), which in general can also depend on other unknowns, such as the gating variables or the membrane potential. This is also the term which includes the time operator d/dt. We discretize our equation fully implicitly, and because this flux is nonlinear, we apply Newton's method to solve the emerging equations for each time step. Therein, the Jacobian of the system needs to be inverted, which we accomplish with highly efficient iterative solvers; more precisely, we use a parallel ILU-preconditioned BiCGstab method (Barrett et al., 1987). All of this has been implemented using the C++ library ug4 (Vogel et al., 2012), which provides flexible numerical tools for these purposes.
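The structure of the resulting linear system can be illustrated with a one-dimensional analog (a sketch under simplifying assumptions, not the paper's 3D implementation): for piecewise-linear shape functions with vertex-centered control volumes in 1D, the coefficients a<sub>kj</sub> reduce to the familiar (−1, 2, −1)/h stencil, and solving the discrete Laplace problem with Dirichlet data reproduces the exact linear solution.

```python
import numpy as np

# 1D sketch of a vertex-centered finite volume discretization of -u'' = 0
# on [0, 1] with u(0) = 0, u(1) = 1 (names and setup are illustrative)
N = 11                       # grid nodes
h = 1.0 / (N - 1)

# a_kj = ∫_{∂B_k} ∇b_j · n dS: for linear shape functions on interior
# control volumes this yields the stencil (-1, 2, -1)/h
A = np.zeros((N, N))
for k in range(1, N - 1):
    A[k, k - 1], A[k, k], A[k, k + 1] = -1.0 / h, 2.0 / h, -1.0 / h
A[0, 0] = A[-1, -1] = 1.0    # Dirichlet rows
b = np.zeros(N)
b[-1] = 1.0

u = np.linalg.solve(A, b)
print(np.allclose(u, np.linspace(0.0, 1.0, N)))  # True: exact linear solution
```

In 1D this finite volume matrix coincides with the finite element stiffness matrix; in 3D the a<sub>kj</sub> are surface integrals over the polyhedral control-volume boundaries, as described above.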

# Results

The intracellular problem (Equation 7) is a Laplace problem with a Neumann boundary. This follows from the approximation of purely resistive non-membrane spaces (i.e., the intra- and extracellular space do not contain any free charges). Thus, the driving force of the intracellular potential is given by its Neumann flux on the boundary (i.e., the membrane). Integrating the Laplace equation over the whole neuron and applying Gauß's theorem now yields an important constraint for the transmembrane currents: the fluxes balance out over the whole membrane at each point in time!

$$-\Delta \Phi\_{\rm in} = 0\tag{28}$$

$$\begin{split} \Rightarrow \int\_{\Omega\_{\text{in}}} -\Delta \Phi\_{\text{in}} d\mathbf{x} &= \int\_{\Gamma} -\sigma\_{\text{in}} \nabla \Phi\_{\text{in}} \cdot \boldsymbol{n}\_{\text{in}\to\text{out}} dS(\mathbf{x}) \\ &= \int\_{\Gamma} j\_{\text{all}}(V\_{m}) dS(\mathbf{x}) \stackrel{!}{=} \mathbf{0} \end{split} \tag{29}$$

There are at least two important implications of this situation. First, an influx at some point of the membrane necessarily leads to an out-flux of the same total amount of current at some other point of the membrane. Moreover, this must happen simultaneously, since otherwise the condition would be violated.

Second, the extracellular potential distributes like a multipole in the extracellular space.

# Dipole-like Distribution of the Extracellular Potential for an Idealized Spherical Neuron

Regardless of the neuron's shape, the extracellular potential equation (Equation 17) shows that its only source is the transmembrane flux, expressed through its boundary condition. A current monopole of the extracellular potential would be generated by the overall transmembrane flux. Yet, this flux is always zero, as shown before (Equation 29). Thus, there is no monopole component, and the extracellular potential distributes in space like a current multipole. To get a quantitative idea of its distribution, we approximate the neuron's geometry by a sphere. Then we are able to express the extracellular potential with a generalized Fourier series of spherical harmonics.

Let $\Omega_{\text{in}} = B_R$ be a sphere with radius R and $\Gamma = \partial B_R$ its boundary. The spherical harmonics $Y_l^m(\theta, \phi)$ satisfy the Laplace problem on this geometry:

$$-\Delta Y\_l^m = \mathbf{0} \tag{30}$$

$$\Phi\_{\rm out}(r,\theta,\phi) = \sum\_{l\geq 0} \sum\_{m=-l}^{l} b\_{lm}\, r^{-(l+1)}\, Y\_{l}^{m}(\theta,\phi) \tag{31}$$

$$\Rightarrow -\Delta\Phi\_{\text{out}} = 0.\tag{32}$$

The solution Φ<sub>out</sub> is determined by the coefficients b<sub>lm</sub>. These are fixed by the transmembrane flux j<sub>all</sub>(V<sub>m</sub>):

$$\frac{\partial \Phi\_{\rm out}}{\partial r}\Big|\_{r=R} = \sum\_{l\geq 0} \sum\_{m=-l}^{l} -(l+1) \frac{1}{R^{l+2}}\, b\_{lm}\, Y\_{l}^{m}(\theta,\phi) = j\_{\rm all}(V\_{m}) \tag{33}$$

$$\Rightarrow b\_{kn} = -\frac{R^{k+2}}{k+1} \int\_0^{\pi} \int\_0^{2\pi} \sin(\theta)\, j\_{\text{all}}(V\_m)\, Y\_k^n(\theta, \phi)\, d\theta\, d\phi. \tag{34}$$

In particular, we obtain for the first coefficient b<sub>00</sub>, which corresponds to the potential of a monopole:

$$b\_{00} = -R^{2} \int\_0^\pi \int\_0^{2\pi} \sin(\theta)\, j\_{\text{all}}(V\_m) \frac{1}{\sqrt{4\pi}}\, d\theta\, d\phi$$

$$= -\frac{1}{\sqrt{4\pi}} \int\_\Gamma j\_{\text{all}}(V\_m)\, dS(\mathbf{x}) = 0. \tag{35}$$

Thus, the solution for the extracellular potential does not contain any monopole part and behaves like a multipole, decaying in space with higher powers of the distance.
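This can be verified numerically: projecting a transmembrane flux with vanishing surface integral onto $Y_0^0 = 1/\sqrt{4\pi}$ (Equation 35) yields a zero monopole coefficient. A small sketch with a hypothetical dipole-like flux $j_{\text{all}} \propto \cos\theta$ and simple midpoint quadrature (all values illustrative):

```python
import math

def b00(j_all, R=1.0, n=200):
    """Monopole coefficient b_00 = -(R^2/sqrt(4*pi)) *
    integral of sin(theta)*j_all(theta, phi) dtheta dphi,
    approximated by midpoint quadrature on an n x n grid."""
    dth = math.pi / n
    dph = 2 * math.pi / n
    s = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for k in range(n):
            ph = (k + 0.5) * dph
            s += math.sin(th) * j_all(th, ph) * dth * dph
    return -R**2 / math.sqrt(4 * math.pi) * s

# Dipole-like flux: influx on one hemisphere, equal out-flux on the other.
dipole = lambda th, ph: math.cos(th)
monopole_vanishes = abs(b00(dipole)) < 1e-10   # balanced flux: b_00 ~ 0
```

For any flux that violates the balance condition (e.g., a constant influx), the same projection returns a nonzero monopole coefficient, so the check is discriminating.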

# Numerical Error Analysis and Verification by a Comparison with NEURON

NEURON (Hines and Carnevale, 1997) is a highly sophisticated simulation environment for modeling a wide range of neuronal networks with the aid of the standard cable equation. Since the current three-dimensional model generalizes the one-dimensional cable equation, and since there are no non-trivial analytic solutions of an active neuron for our equations, we use this software environment to verify both our model and our implementation. Our results should be very similar to those of NEURON for comparable computational domains. In order to keep the three-dimensional computation fast and to be able to create suitable three-dimensional computational domains, we carry out this comparison on a very long cylinder (l = 9.9 mm) with a diameter (d = 200 µm) that is small in relation to its length (d/l ≈ 2 · 10<sup>−2</sup>). Such cases approximately comply with the assumption of the one-dimensional model (of infinite cylinders). No significant differences in the rise and propagation of an arising action potential should be visible.

We use ProMesh (Reiter, 2014) to construct the three-dimensional cylindrical soma with a length of 9.8 mm and a diameter of 200 µm (**Figure 3**).

We now use this test domain, first, to verify the correct implementation of our discretization scheme and, second, to see that we indeed obtain almost identical solutions in comparison with those produced by NEURON.

The first point is verified if the computed solution converges as the computational grid is refined. In order to assess the second point, we have to compare the one-dimensional solution of NEURON with the three-dimensional solution of our model. By construction of the one-dimensional cable equation, each quantity, although computed on every point of a line, actually represents a volumetric quantity. Thus, the one-dimensional model assumes all quantities to be radially symmetric and iso-potential on cross-sections of a three-dimensional cylinder. Considering this particularity, we can either blow up the solution of NEURON to a three-dimensional solution and compare it with the solution of our model, or we can compare NEURON's solution with our solution recorded on the cylinder axis. For the sake of simplicity, we use the second way, considering that its difference from the volumetric comparison is just a factor of the cross-section area.

Because domains have to be discretized for three-dimensional numerical computations, even simple cylinders never correspond to the ideal cylinders which form the basis of the one-dimensional model. Thus, we will always expect small quantitative differences in such a comparison and are therefore already satisfied to evaluate the differences with NEURON with the aid of an Euclidean integral norm

$$||f||\_{L^{2}([a,b])} = \sqrt{\int\_{a}^{b} |f|^{2}\, dt},\tag{36}$$

where the interval [a, b] corresponds to the time interval of the simulation. Furthermore, in order to make this measure dimensionless, we consider the relative error between the solution of NEURON, $V_{m_{\text{NEURON}}}$, and the solution computed at refinement level x, denoted by $V_{m_{\text{Level }x}}$, over the interval [0, T]:

$$\frac{||V\_{m\_{\text{NEURON}}} - V\_{m\_{\text{Level }x}}||\_{L^2([0,T])}}{||V\_{m\_{\text{Level }x}}||\_{L^2([0,T])}}.\tag{37}$$

Yet, qualitative measures like propagation speed and signal width should be identical.
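As a sketch, the error measure of Equations (36, 37) amounts to the following computation on uniformly sampled traces (trapezoidal quadrature); the two traces below are synthetic stand-ins rather than actual NEURON output:

```python
import math

def l2_norm(f, dt):
    """Discrete L2([0,T]) norm of a uniformly sampled trace (trapezoidal rule)."""
    s = 0.5 * (f[0]**2 + f[-1]**2) + sum(v**2 for v in f[1:-1])
    return math.sqrt(s * dt)

def relative_error(v_ref, v_test, dt):
    """|| v_ref - v_test || / || v_test ||  as in Equation (37)."""
    diff = [a - b for a, b in zip(v_ref, v_test)]
    return l2_norm(diff, dt) / l2_norm(v_test, dt)

# Synthetic example: two nearly identical "action potential" traces,
# the second shifted by 0.02 time units.
dt = 0.01
t = [i * dt for i in range(1000)]
v1 = [100 * math.exp(-((x - 5.0) ** 2)) - 65 for x in t]
v2 = [100 * math.exp(-((x - 5.02) ** 2)) - 65 for x in t]
err = relative_error(v1, v2, dt)   # small but nonzero relative error
```

Note that this scalar measure is insensitive to where along the trace the difference occurs, which is why propagation speed and signal width are reported separately in the comparison.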

Concerning the numerical convergence under grid refinement, we computed the solution on our cylinder, composed of a tetrahedral grid, at two levels of refinement and observed the desired convergence (**Figure 4**). This behavior serves as a benchmark for the correct implementation of the finite volume discretization scheme.

The solutions of the standard cable equation and the three-dimensional model are qualitatively indistinguishable (**Figure 3**). The small numerical differences (**Table 1**) are due to the aforementioned reasons: the cylinder in the computation is a discretization of an ideal one, and the cylinder's length is finite (the standard cable equation assumes infinite cylinders). Moreover, the three-dimensional model additionally considers the coupling of the extracellular potential on the membrane, so that some subtle differences in the solutions are always to be expected; these are reflected in **Table 1**.

However, as regards the emergence of the action potential (**Table 1**, **Figure 4**), the propagation speed of 5 m/s, and the signal width (**Table 1**, **Figure 4**), we obtain identical results.

# Simulation on a Small Network of Four Idealized Neurons

In a computationally quite demanding simulation, we also solve Equations (17–19) on a more complicated geometry representing four idealized neurons with chemical synapses (**Figure 5**).

The simulation is demanding because we have a nonlinear, time-dependent domain problem in three dimensions. This means we solve several huge linear systems in each time step within Newton's method. Thereby, the time step is constrained by the fast dynamics of the active membrane's gating variables; in our case it is chosen as 10 µs, while we aim to simulate a time period of 14 ms. This means we need to compute the solution for 1400 time steps, which is time-consuming despite parallel procedures, due to the geometry's complexity.

We constructed the computational domain, given by a small network of four neurons, with the help of an algorithm developed in Niklas Antes' master thesis (Antes, 2009). Each cell consists of a myelinated axon (diameter d ≈ 5 µm), a soma (d ≈ 20 µm) and dendrites (d ≈ 10 µm). The cells are separated from each other by several hundred micrometers.

As regards the transmembrane current j<sub>all</sub>(x, V<sub>m</sub>) (Equation 12) for the different cell parts, we considered just passive properties on the dendrites, while an active membrane reflecting Hodgkin–Huxley dynamics was used for the soma as well as for the

NEURON. (A) Computational domain with the marked areas (B–D) where the membrane potential is recorded. (B–D) Time courses of the membrane potential at the corresponding areas. The solution of the three-dimensional model at refinement level 0 is the blue line. After two refinements the solution converges; the red line represents the solution at refinement level 1.

TABLE 1 | Relative error of the computed solution in comparison with NEURON.


*The relative error between the solution computed with NEURON, VmNEURON, and the solution computed on refinement level x, denoted by VmLevel x, is very small. This implies that qualitative characteristics like propagation speed and signal width are very similar as well. The small differences measured here can be explained by the nature of the three-dimensional model, which automatically considers the extracellular potential in the signal processing and which works with discretized and finite domains (in this case cylinders, which are supposed to be ideal and infinite for the standard cable equation).*

nodes of Ranvier. On the myelinated sheaths, the transmembrane current j<sub>all</sub>(x, V<sub>m</sub>) is composed of the first term in Equation (12) only, the capacitive current. Furthermore, two of the cells (cell 1 and cell 4, see **Figure 5**) possess external input areas by which the network can be stimulated.

Because we simulate the relatively small time period of 14 ms, we let the synapses act as pre-defined strong post-synaptic current pulses of some nA, which are triggered as soon as the membrane potential at the pre-synapse indicates that an action potential has arrived. This is assumed to happen when the membrane potential at the pre-synapse exceeds the value of 5 mV.
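A minimal sketch of such a threshold-triggered synapse; the function name, the 2 nA amplitude, and the 1 ms pulse duration are illustrative assumptions, not the values used in the actual implementation:

```python
def synaptic_current(v_pre, t, state, threshold=5.0, amplitude=2e-9, duration=1e-3):
    """Pre-defined post-synaptic current pulse, triggered once the
    pre-synaptic membrane potential exceeds the threshold (5 mV).
    v_pre in mV, t in seconds; `state` records the trigger time.
    Returns the post-synaptic current in amperes."""
    if state.get("t_on") is None and v_pre > threshold:
        state["t_on"] = t                     # arriving action potential detected
    if state.get("t_on") is not None and state["t_on"] <= t < state["t_on"] + duration:
        return amplitude                      # fixed pulse of some nA
    return 0.0

state = {"t_on": None}
i0 = synaptic_current(-65.0, 0.0, state)      # resting pre-synapse: no current
i1 = synaptic_current(30.0, 2e-3, state)      # spike crosses 5 mV: pulse on
```

After the pulse duration has elapsed, the function returns zero again regardless of the pre-synaptic potential, matching the "pre-defined pulse" behavior described above.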

For the sake of simplicity, we choose constant intra- and extracellular conductivities σ<sub>in</sub> = 2 mS/cm, σ<sub>out</sub> = 20 mS/cm.

We activate the network by stimulating cell number one (see **Figure 5**) with approximately 30 pA at each of its input areas over the whole simulation period of 14 ms. At 8 ms, we then stimulate cell number four with a current pulse of approximately 0.5 nA over 20 µs. Although this stimulation of the fourth cell alone is not enough to generate an action potential, within the regime of this network and with the ephaptic current activated (Equation 20), an action potential arises (see **Figure 6**). This demonstrates that ephaptic interactions can have a decisive effect on whether a neuron fires.

discretization scheme. We see that the solution produced by the three-dimensional model (dotted green line) is almost the same as the solution produced by NEURON (purple line). The small differences are due to the nature of the three-dimensional modeling procedure (see text).

The model integrates the impact of the extracellular potential into the signal processing. Though this impact is rather small, it can still have a significant effect when combined with the right stimulation at the right time. Action potentials can arise which otherwise would not show up (**Figure 6**).

# Discussion

The three-dimensional passive model of Voßen et al. (2007) has been extended to a model with active membrane dynamics and has been reformulated mathematically with the aid of an extension of the membrane potential into the intracellular space. This reformulation, for the first time, facilitated numerical simulations of neuronal activity on three-dimensionally resolved idealized neurons, generalizing the one-dimensional cable equation by fully incorporating the three-dimensional extension of the neurons' geometry and by automatically considering the extracellular potential's influence on the membrane. As shown, the latter influence, though quite small, can in combination with additional stimulation at the right timing lead to an action potential which otherwise would not have arisen.

To verify the correct implementation of this model, and because it should deliver similar results to the one-dimensional cable equation in the limit case of long and thin cylinders, we carried out a comparison with NEURON and obtained very good agreement between the two models.

Based on the assumption of charge-free non-membrane spaces (an assumption also used in the derivation of the standard cable equation), we could provide strong theoretical evidence with the aid of the three-dimensional model (to our knowledge for the first time) that there are no current monopoles, as the overall out-flux across the membrane balances out. A significant consequence of this behavior is that the leading term of the extracellular potential's multipole expansion vanishes, so that the potential decays in space with higher powers of the distance to the transmembrane current source. In the work of Lindén et al. (2011), this very assumption was applied to the extracellular potential in order to arrive at converging LFPs. The authors of Lindén et al. (2011) showed that a monopole behavior would lead to a diverging LFP.

We consider the ability to carry out realistic simulations with the cable equation on three-dimensionally resolved idealized neurons as an important step and milestone on the way toward refining and generalizing existing models of neuronal activity. This three-dimensional model facilitates a better understanding of all the processes involved in the signal processing, especially the influence of the extracellular potential on the membrane and the impact of the precise three-dimensional shape of the neuron's geometry. Concerning ephaptic communication, it would be interesting to further investigate its influence on synchronous firing within networks. This direction also seems very promising, since large amounts of precise experimental geometric data are being produced. Questions connecting function with geometry can be tackled directly with this model.

simulation (right column) in which it is included. Both simulations are carried out with the same stimulation paradigm. An initial signal spreads through the network. Additionally, around the moment of 8 ms, the fourth cell (the upper cell of this network) is slightly activated with a current pulse so that it depolarizes to just below the threshold for an action potential. Although the effects of ephaptic interactions are very small, we see that they can determine whether a neuron activates in particular circumstances.

However, there is still a long way to go on this path, as the biggest challenge for our model at the moment is its computational demand. Further algorithmic and computational analysis needs to be invested in order to make cutting-edge solvers for linear systems arising from partial differential equations, such as algebraic multigrid methods, applicable on highly parallel machines, even on graphics card clusters. As next steps, we want to focus on these improvements.

On the other hand, computational efficiency is a big advantage of the standard one-dimensional cable equation. Once we have accomplished comparable efficiency for the three-dimensional model, there are still lots of interesting applications which we wish to address, especially concerning backward modeling, with questions such as: which underlying network properties reproduce a given wave of extracellular potential activity?

Furthermore, we see the need for a deeper theoretical analysis of this model, with the purpose of providing a mathematical proof that it converges to the standard cable equation in the limit case of infinite cylinders and vanishing extracellular resistivity.

Our long-range goal is to generalize this model with homogenization and multi-scale techniques, so as to be able to simulate the activity of bigger clusters of neuronal networks while still considering the details of the processing on the small scale.

Steps realized on this path will hopefully be the subject of future publications.

# Acknowledgments

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement number 650003 (Human Brain Project).

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom.2015.00094


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Xylouris and Wittum. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Membrane current series monitoring: essential reduction of data points to finite number of stable parameters

# *Raoul R. Nigmatullin1, Rashid A. Giniatullin2,3 and Andrei I. Skorinkin4,5,6\**

*<sup>1</sup> Theoretical Physics Department, Institute of Physics, Kazan Federal University, Kazan, Russia*

*<sup>2</sup> Department of Neurobiology, A.I. Virtanen Institute, University of Eastern Finland, Kuopio, Finland*

*<sup>3</sup> Laboratory of Neurobiology, Department of Physiology, Kazan Federal University, Kazan, Russia*

*<sup>4</sup> Department of Radioelectronics, Institute of Physics, Kazan Federal University, Kazan, Russia*

*<sup>5</sup> Department of Biophysics of Synaptic Processes, Kazan Institute of Biochemistry and Biophysics Russian Academy of Sciences, Kazan, Russia*

*<sup>6</sup> Department of Bioinformatics, Institute of Informatics, Kazan, Russia*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Jianbo Gao, Wright State University, USA Abdelmalik Moujahid, University of the Basque Country UPV/EHU, Spain*

#### *\*Correspondence:*

*Andrei I. Skorinkin, Radioelectronics Department, Institute of Physics, Kazan Federal University, 16 Kremliovskaja Str., Room 102, Kazan 420008, Russia e-mail: askorink@yandex.ru*

In traditional studies of changes in cell membrane potential or trans-membrane currents, a large part of the recorded data presents "pure noise." This noise results mainly from the random openings of membrane ionic channels. Different types of stationary or non-stationary noise analysis have been used in electrophysiological experiments for the identification of channel kinetic states. But these methods have limited power and often cannot answer the main question of the experimental study: do external factors induce a significant change in channel kinetics? The new method suggested in the current study is based on the scaling properties of the beta-distribution function, which allows reducing series containing 200,000 and more data points to the analysis of only 10–20 stable parameters. The subsequent clusterization using the generalized Pearson correlation function makes it possible to take the influence of an external factor into account and to combine/separate different parameters of interest into a statistical cluster with respect to the influential parameter. This method, which we call BRC (Beta distribution-Reduction-Clusterization), opens new possibilities for creating a largely reduced database while extracting specific fingerprints of long-term series. The BRC method was validated using patch-clamp current recordings containing 250,000 data points obtained from living cells and from an open-tip electrode. A numerical distinction between these two series in terms of the reduced parameters was obtained.

**Keywords: noise analysis, detrended fluctuation analysis, fluctuation spectroscopy based on beta-distribution, sequence of the ranged amplitudes, membrane currents of neurons**

# **INTRODUCTION**

During electrophysiological studies it is common to record rather long tracks of signals. These signals are registered as temporal variations of cell membrane potential or trans-membrane currents induced by the opening of ligand- or voltage-gated or even chaotic ionic channels. Usually the principal aim of such a study is the registration of some macroscopic signals, evoked or spontaneous, and the change of parameters of these signals characterizes the total effect of some actions located in the experimental object. But a large part of the record forms a so-called "empty track" containing "pure noise" only. It is well known that this noise mainly reflects the random openings of transmembrane ionic channels. Different types of stationary or non-stationary noise analysis have been used for identification of these channels' states (Neher and Sakmann, 1976; Sigworth, 1980, 1985, 1986; Läuger, 1985; Traynelisa and Jaramilloa, 1998; Alvarez et al., 2002; Venkataramanan and Sigworth, 2002).

Unfortunately, these methods have not come into widespread use among physiologists, since they often cannot answer the main question of the study: does this drug or this change of the environment induce a reliable change of channel behavior or not?

Thus, there is an urgent need to develop a special language that is compact and reliable in order to describe accurately very long current streams (long-time series) with hidden signals and noise in terms of a finite and statistically understandable set of reduced parameters. In this paper we want to show *how* to develop this special language based on an example of the analysis of signals recorded in rat spinal cord slices. Besides this problem, we want to show how to detect the presence of the biological object inside the experimental set. For this purpose we also recorded data representing the dependence of the current vs. time when the biological object is absent. Examples of currents recorded in a living cell and with empty electrodes are shown in **Figure 1**. These two signals are apparently very similar. Even though they are generally distinguishable by an experienced observer, the reliability of these differences cannot be numerically evaluated without special analytic methods.

To the authors' best knowledge, one method is basically suitable for quantitative analysis of different long-time series. This method was introduced by Peng et al. (1994) and nowadays is known as detrended fluctuation analysis (DFA). It was well described in the literature by its creators (Ossadnik et al., 1994; Peng et al., 1995) and found its application in the analysis of biomedical (Penzel et al., 2003; Jospin et al., 2007; Burr et al., 2008) and other (Hausdorff et al., 1995, 1996) data. But it is necessary to note that the DFA algorithm works well only for certain types of non-stationary time series (especially those having slowly varying trends); it is not designed to handle all possible non-stationarities in real-world data. This algorithm is also *not* free from uncontrollable errors that are associated with the approximate fitting of detrended fluctuations by segments of straight lines or by parabolic or higher-order polynomials (Kantelhardt et al., 2001). The final straight line with power-law exponent α<sub>DFA</sub> is obtained as a slope in a double-log scale as a result of the fitting procedure and contains the fitting error, which also depends on the type of segmentation of the initial series considered. These uncontrollable errors (usually not properly analyzed in the literature) can lead to different results in the calculation of the desired value of α<sub>DFA</sub> and other associated fitting parameters in the analysis of the *same* long-time series.
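For orientation, the basic first-order DFA procedure discussed here (integrate the mean-free series, split the profile into segments, detrend each segment with a least-squares straight line, and read α<sub>DFA</sub> off the slope of the fluctuation function in a double-log scale) can be sketched as follows; this is a minimal illustrative version, not a tool tuned for real data:

```python
import math, random

def linfit_residual_ms(y):
    """Mean squared residual of a least-squares straight-line fit to y."""
    n = len(y)
    mx = (n - 1) / 2
    my = sum(y) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (v - my) for x, v in enumerate(y))
    b = sxy / sxx
    a = my - b * mx
    return sum((v - (a + b * x)) ** 2 for x, v in enumerate(y)) / n

def dfa_exponent(series, scales=(8, 16, 32, 64, 128)):
    # 1) integrated, mean-free profile
    profile, s = [], 0.0
    mean = sum(series) / len(series)
    for v in series:
        s += v - mean
        profile.append(s)
    # 2) per scale: segment, detrend, average the residual variance
    logn, logf = [], []
    for n in scales:
        segs = len(profile) // n
        ms = sum(linfit_residual_ms(profile[i*n:(i+1)*n]) for i in range(segs)) / segs
        logn.append(math.log(n))
        logf.append(0.5 * math.log(ms))   # log of rms fluctuation F(n)
    # 3) slope of log F(n) vs log n = alpha_DFA
    k = len(logn)
    mx, my = sum(logn) / k, sum(logf) / k
    return sum((x - mx) * (y - my) for x, y in zip(logn, logf)) / \
           sum((x - mx) ** 2 for x in logn)

random.seed(1)
white = [random.gauss(0, 1) for _ in range(4096)]
alpha = dfa_exponent(white)   # expected near 0.5 for white noise
```

For uncorrelated white noise the estimated exponent comes out near the theoretical value α<sub>DFA</sub> = 0.5; the fitting and segmentation choices mentioned above shift the estimate, which is exactly the kind of uncontrollable error discussed in the text.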

A technique called the scale-dependent Lyapunov exponent (SDLE, see Gao et al., 2006, 2012b, 2013; Hu et al., 2010) provides a more comprehensive characterization of complex time series. Some of DFA's limitations have also been overcome recently by a new method called adaptive fractal analysis (AFA, see Gao et al., 2010, 2011, 2012a; Riley et al., 2012; Kuznetsov et al., 2013). AFA has been shown to be able to determine global trends, remove noise, perform fractal analysis and multiscale decomposition, and present data as a curve. However, new tools could be developed that are specifically designed to show and estimate even mild differences between two long-time series.

Thus, it would be desirable to have a new method with "high resolution" (10–20 significant parameters) to distinguish more accurately between experimental data and the effect of treatments. In this paper we demonstrate such a method based on some invariant properties of the beta-distribution function; furthermore, this method admits a procedure that controls the error at each stage of its application. From our point of view, the effectiveness of the new approach is based on the monotone behavior of the primary fitting parameters, which admits a secondary fit. This peculiarity allows compressing the initial fitting parameters with the help of the secondary fit and presenting the initial data set in a more compact form.

The four fitting parameters (*A*, *B*, α, β) of the beta-distribution can be interpreted and used for quantitative *reading* of fluctuations arising on different scales of the long-time series considered. In previous papers (Nigmatullin, 2010; Nigmatullin et al., 2012), based on the principle of the strong correlation of random sequences, it was shown that the cumulative (integral) curve obtained from the sequence of the ranged amplitudes (SRA) can be described with high accuracy by means of the beta-distribution function. In other words, any *detrended* random sequence transformed to the SRA (when all amplitudes of the initial sequence are sorted and located in descending order *y*<sub>1</sub> > *y*<sub>2</sub> > ... > *y<sub>N</sub>*), after elimination of its mean value and subsequent integration, forms a bell-like curve *J*(*x*) that can be fitted (with controllable relative error) by the function:

$$J(\mathbf{x}) \stackrel{\sim}{=} Jb(\mathbf{x}) = A \left(\mathbf{x} - \mathbf{x}\_{0}\right)^{\alpha} \left(\mathbf{x}\_{N} - \mathbf{x}\right)^{\beta} + B. \tag{1}$$

Here the limiting values *x*<sub>0</sub> < *x<sub>N</sub>* define the ends of the location interval of the random sequence considered. In many cases the parameters *x*<sub>0</sub>, *x<sub>N</sub>* are known. The other quantitative parameters (*A*, *B*, α, β) should be found from the procedure of fitting the curve *Jb*(*x*) to the function *J*(*x*). The power-law exponents (α, β) reflect the *fractal* properties of the random sequence considered and the presence of memory, which is expressed in the behavior of the corresponding SRAs. The criterion for verifying the presence of memory in two random sequences which are compared is as follows: if one SRA, plotted with respect to another one, forms a curve close to a straight line, then these two random curves are defined as having a relative memory and can be considered *strongly correlated*. This important property allows transforming any segment of a random sequence to a beta-distribution function and "reading" this segment in terms of four fitting parameters (*A*, *B*, α, β). By such a transformation, 30–50 or even more initial points belonging to a random sequence can be read in terms of these four parameters only. This allows us to suggest a new type of spectroscopy based on some scaling properties of the beta-distribution. This transformation is called Fluctuation Spectroscopy based on the Beta-Distribution (FSBD). In general we suggest a method which we call BRC (Beta distribution-Reduction-Clusterization). The basic problem solved in this paper by the BRC method can be formulated as follows: *Is it possible to suggest a reliable method with controllable error that has a wide range of applicability and a flexible small set (10–20) of statistically understandable parameters for quantitative characterization of the differences between long-time series?*
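The SRA construction described above is straightforward to reproduce: sort the amplitudes in descending order, eliminate the mean value, and integrate. A minimal sketch (the beta-distribution fit of Equation 1 itself is omitted):

```python
import random

def sra_bell_curve(series):
    """Sequence of ranged amplitudes -> mean-free cumulative curve J."""
    sra = sorted(series, reverse=True)   # y1 > y2 > ... > yN
    mean = sum(sra) / len(sra)
    J, s = [], 0.0
    for y in sra:
        s += y - mean                    # eliminate the mean, then integrate
        J.append(s)
    return J

random.seed(7)
noise = [random.gauss(0.0, 1.0) for _ in range(10000)]
J = sra_bell_curve(noise)
# J starts and ends near zero and has a single interior maximum.
```

By construction, J rises while the ranged amplitudes stay above the mean and falls afterwards, producing the bell-like curve that Equation (1) is then fitted to.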

### **MATERIALS AND METHODS PREPARATION OF SPINAL CORD SLICES**

Ten- to twenty-day-old Wistar rats were deeply anesthetized with diethyl ether and killed by decapitation. After laminectomy, the spinal cord was excised and immediately immersed in cold (0–4°C) artificial cerebrospinal fluid containing (in mM): 126 NaCl, 26 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 2 CaCl2, 2 MgCl2, and 10 glucose (bubbled with 95% O2 and 5% CO2; pH 7.3; 310 mOsm measured). Several transverse slices (250-μm thick) were prepared from the lumbosacral enlargement (L4–6) with a vibratome (VT1000S, Leica, Nussloch, Germany).

#### **WHOLE-CELL RECORDINGS**

Slices were transferred to a recording chamber (300–400 μl volume) and continuously superfused with oxygenated artificial cerebrospinal fluid at 3 ml/min and 22–24°C. Interneurons were visualized with an upright interference contrast microscope and a ×40 water immersion objective (Axioscope FS, Carl Zeiss, Oberkochen, Germany). Patch pipettes (tip resistance 5–7 MΩ) were prepared with a puller (Flaming-Brown P97; Sutter, Novato, CA, USA) from borosilicate capillaries and were filled with intracellular solution consisting of (in mM): potassium gluconate 140, NaCl 10, MgCl2 3, HEPES 10, EGTA 11; pH 7.3 adjusted with KOH; 300 mOsm measured.

Interneurons were voltage-clamped at −65 mV in the whole-cell configuration after obtaining GΩ seals (usually not less than 2 GΩ) by means of a patch-clamp amplifier (Axopatch 200B; Molecular Devices, Sunnyvale, CA, USA). Compensation of capacitance (Cm) and series resistance (Rs) was achieved with the inbuilt circuitry of the amplifier. Series resistance was compensated by 40–70% and did not change appreciably from the beginning to the end of the experiments, indicating stable recording conditions. The tracks used for comparison were recorded by immersing filled patch pipettes in artificial cerebrospinal fluid; these patch pipettes were also voltage-clamped at −65 mV.

All data were then sampled at 10 kHz and stored online on a PC using the pClamp 10.0/Clampex 10.0 software package (Molecular Devices).

#### **SCALING PROPERTIES OF THE BETA-DISTRIBUTION AND DESCRIPTION OF THE TREATMENT PROCEDURE**

In this section we demonstrate the scaling properties of Expression (1). We subject *x*, *x*<sub>0</sub>, and *x<sub>N</sub>* in Expression (1) to the following scaling transformations, keeping the power-law exponents α and β fixed: *x* = ξ · *x*′ + *b*, *x*<sub>0</sub> = ξ · *x*′<sub>0</sub> + *b*, *x<sub>N</sub>* = ξ · *x*′*<sub>N</sub>* + *b*, which gives the following transformed beta-distribution:

$$Jb(x) \to Jb(x') = A' \left(x' - x'_0\right)^{\alpha} \cdot \left(x'_N - x'\right)^{\beta},\tag{2}$$

where *A*′ = *A* · ξ<sup>(α + β)</sup>. This is the exact mathematical result that follows from the scaling transformation of the initial coordinates.

In order to have a simple criterion for the comparison of two beta-distributions, let us calculate the values of the two extreme points *x*¯ and *x*¯′ belonging to the functions *Jb*(*x*) and *Jb*(*x*′), respectively:

$$\begin{aligned} \bar{x} &= w_1 x_0 + w_2 x_N, \quad \bar{x}' = w_1 x'_0 + w_2 x'_N, \\ w_1 &= \frac{\beta}{\alpha + \beta} = \frac{x_N - \bar{x}}{\Delta}, \quad w_2 = 1 - w_1, \quad \Delta = x_N - x_0, \quad \Delta' = \frac{1}{\xi}\Delta, \\ \bar{H} &= Jb\left(\bar{x}\right) = A\, w_1^{\beta} w_2^{\alpha} \Delta^{\alpha+\beta} + B, \\ \bar{H}' &= Jb\left(\bar{x}'\right) = A'\, w_1^{\beta} w_2^{\alpha} \left(\Delta'\right)^{\alpha+\beta} + B, \end{aligned}$$

$$\bar{H}' = Jb(\bar{x}') = A\, w_1^{\beta} w_2^{\alpha}\, \xi^{\alpha+\beta} \left(\frac{1}{\xi}\right)^{\alpha+\beta} \Delta^{\alpha+\beta} + B \equiv \bar{H}.\tag{3}$$

From Expressions (3) it follows that, under the scaling transformation (2), the heights *H*¯ and *H*¯′ of the extreme points of the two bell-like distributions at fixed values of the power-law exponents α and β and parameter *B* should *coincide* with each other.

Besides this criterion it is necessary to take into account the scaling relationship between the heights *H*¯ and *H*¯′. If the two power-law exponents α and β are subjected to the scaling transformation at a fixed value of the length Δ = *x<sub>N</sub>* − *x*<sub>0</sub>:

$$
\alpha' = \theta \alpha, \quad \beta' = \theta \beta, \tag{4}
$$

then simple manipulations lead to the second scaling relationship:

$$\frac{\bar{H}'}{A'} = \left(\frac{\bar{H}}{A}\right)^{\theta}.\tag{5}$$

Here the amplitudes *A* and *A*′ are defined by relationships (1) and (2), respectively. The consideration of the scaling properties of the beta-distribution allows one to suggest the following steps.
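The two scaling relationships (3) and (5) can be verified numerically. A minimal sketch with illustrative parameter values (these numbers are ours, not taken from the recordings; *B* = 0 is assumed so that relationship (5) holds exactly in the form given above):

```python
import numpy as np

def Jb(x, A, x0, xN, alpha, beta, B):
    # Beta-distribution of Expression (1)
    return A * (x - x0)**alpha * (xN - x)**beta + B

# Illustrative (assumed) parameters
alpha, beta, A, B = 0.6, 0.8, 1.3, 0.0   # B = 0 so that (5) holds exactly
x0, xN = 0.0, 1.0
w1 = beta / (alpha + beta)               # weights of Expression (3)
w2 = 1.0 - w1
xbar = w1 * x0 + w2 * xN                 # extreme point
H = Jb(xbar, A, x0, xN, alpha, beta, B)

# Coordinate scaling (2): x = xi*x' + b, hence A' = A * xi**(alpha + beta)
xi, b = 2.5, 0.7
x0p, xNp = (x0 - b) / xi, (xN - b) / xi
Ap = A * xi**(alpha + beta)
Hp = Jb(w1 * x0p + w2 * xNp, Ap, x0p, xNp, alpha, beta, B)
height_invariant = abs(H - Hp) < 1e-12   # Expression (3): heights coincide

# Exponent scaling (4): alpha' = theta*alpha, beta' = theta*beta, fixed length
theta, A2 = 1.7, 0.9
H2 = Jb(xbar, A2, x0, xN, theta * alpha, theta * beta, B)
second_relation = abs(H2 / A2 - (H / A)**theta) < 1e-12  # Expression (5)
```

Both checks come out true: the height coincidence (3) holds for any *B*, while (5) is checked here in its *B* = 0 form.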

*Step 1*. This step consists in the formation of the sequence of the range amplitudes (SRA): all amplitudes located on the fixed length Δ = *x<sub>N</sub>* − *x*<sub>0</sub> are ordered in descending order, *y*<sub>1</sub>(*x*<sub>0</sub>) > *y*<sub>2</sub> > ... > *y*<sub>Δ</sub>(*x<sub>N</sub>*).

*Step 2*. Numerical integration of the SRA with respect to its mean value and subsequent fitting to the function (1).

**Figure 2** illustrates the transformation realized by these two steps.

In Step 1 each sub-segment of equal length is transformed to its SRA (**Figure 2A**); in Step 2 the integration of the SRA with respect to its subtracted mean value finally gives the desired bell-like curve that can be fitted to Expression (1). Mathematically these two steps are expressed, respectively, as:

$$\text{SRA}(y(x_j)) = \text{sort}(y(x_j)) \to \Delta \text{SRA}(y(x_j))$$

$$= \text{SRA}(y(x_j)) - \frac{1}{\Delta} \sum_{j=1}^{\Delta} \text{SRA}(y(x_j))$$

$$\equiv \text{SRA}(y(x_j)) - \langle \dots \rangle . \tag{6a}$$

Here the integer index *j* (*j* = 1, 2, . . . , *N*) enumerates the data points in the fixed segment Δ = *x<sub>N</sub>* − *x*<sub>0</sub> containing initially 30–50 data points.

$$J(x_j) = J(x_{j-1}) + \frac{1}{2} \left(x_j - x_{j-1}\right) \cdot \left(\Delta \text{SRA}(y(x_j)) + \Delta \text{SRA}(y(x_{j-1}))\right), \quad J_0 = 0. \tag{6b}$$
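Steps 1–2 [Expressions (6a) and (6b)] can be sketched in a few lines of numpy. This is a minimal version operating on a synthetic segment; unit spacing between the *x<sub>j</sub>* is assumed:

```python
import numpy as np

def bell_curve(segment):
    """Step 1: form the SRA (amplitudes sorted in descending order);
    Step 2: subtract the mean (6a) and integrate by the trapezoid rule (6b)."""
    sra = np.sort(np.asarray(segment, dtype=float))[::-1]  # descending order
    dsra = sra - sra.mean()                                # Expression (6a)
    # Expression (6b) with unit spacing x_j - x_{j-1} = 1 and J_0 = 0
    J = np.concatenate(([0.0], np.cumsum(0.5 * (dsra[1:] + dsra[:-1]))))
    return J

rng = np.random.default_rng(0)
segment = rng.normal(size=30)   # a synthetic 30-point sub-segment
J = bell_curve(segment)         # bell-like curve, to be fitted to Expression (1)
```

Because the mean-subtracted SRA is a descending sequence crossing zero once, the integral rises and then falls, which is exactly the bell-like shape fitted to the beta-distribution.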

**Figure 2** demonstrates the realization of these two steps [with the use of Expression (6)] on a short segment belonging to the initial membrane-current time series (containing 250,000 data points). We should notice that the mean value ⟨...⟩ of the chosen segment should be subtracted and the integration procedure [the last row in (6)] should be realized with the help of the trapezoid method. As a result of the calculation of Expression (6) we obtain the desired bell-like curve *J*(*x<sub>j</sub>*).

**FIGURE 2 | (A)** The SRA of the chosen segment, sorted in descending order and marked by gray stars; on the vertical axes the values of the current are given in picoamperes. **(B)** The bell-like curve (marked by crossed stars) obtained by integration of SRA_1 [shown in **(A)** by gray stars] and its fit marked by the bold solid line. The fitting parameters of this curve are given inside the figure. As follows from this figure, 30 data points are sufficient to provide an acceptable fit with a relative error close to 3.5%.

**Figure 2B** shows the quality of the fit of the obtained bell-like curve to the beta-distribution. In order for the relative error:

$$\text{RelErr} = \left(\frac{\text{stdev}(f(x) - fb(x))}{\text{mean}(f(x))}\right) \cdot 100\%,$$

$$\text{where } \text{stdev}(f(x)) = \left[\frac{1}{N_{\Delta}} \sum_{j=1}^{N_{\Delta}} \left(f\left(x_j\right) - \text{mean}\left(f(x)\right)\right)^{2}\right]^{1/2},$$

$$\text{mean}\left(f(x)\right) = \frac{1}{N_{\Delta}} \sum_{j=1}^{N_{\Delta}} f(x_j),\tag{7}$$

to be limited to a few percent (2–5%), we should choose the length of the minimal segment Δ<sub>min</sub> of the initial series to contain 30–50 data points. In Expression (7) the value *N*<sub>Δ</sub> defines the number of data points that enter the segment of length Δ. Thus, the first reduction criterion can be written as:

$$
\Delta_{\min} \cdot \xi^{K} = N_{\text{total}} \tag{8}
$$

Here the scaling parameter ξ has the same meaning as in Expression (2).
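As a side note, the relative-error criterion of Expression (7) is straightforward to implement; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def rel_err(f, fb):
    """Expression (7): RelErr = stdev(f - fb) / mean(f) * 100%."""
    r = np.asarray(f, dtype=float) - np.asarray(fb, dtype=float)
    stdev = np.sqrt(np.mean((r - r.mean())**2))  # population standard deviation
    return stdev / np.mean(f) * 100.0
```

A fit of the bell-like curve is accepted when `rel_err` stays within the few-percent (2–5%) band quoted above.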

This requirement allows one to consider a long-time series containing the total number of data points (*j* = 1, 2, . . . , *N*<sub>total</sub>) in terms of the reduced parameters of the beta-distribution (*A*, *B*, α, β) depending on the parameter *k*. Further, it is convenient to rewrite condition (8) in the following form, changing the numeration of the current parameter *k*:

$$
\Delta\_k = \frac{N\_{\text{total}}}{\xi^{K+1-k}}, \quad k = 1, 2, \dots, K+1,\tag{9}
$$

where [in comparison with (8)] the value Δ<sub>1</sub> should coincide with the minimal value 30 < Δ<sub>min</sub> < 50, which gives the condition for finding the limiting value of *K* (the total number of segments equals *K* + 1). At the other extreme, the value Δ<sub>*K*+1</sub> should give the maximal length, coinciding with the value *N*<sub>total</sub>. As a result of this reduction procedure one can transform *N*<sub>total</sub> data points into 4·(*K* + 1) parameters. But this step is *not* sufficient. If the functions *A<sub>k</sub>*, *B<sub>k</sub>*, (α + β)<sub>*k*</sub> have monotonic behavior, one can realize a further reduction to the *primary* set of fitting parameters describing these functions.
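The reduction scheme of Expressions (8)–(9) is easy to reproduce; a sketch with the values used later in the text (*N*<sub>total</sub> = 250,000, ξ = 2, Δ<sub>min</sub> near 30):

```python
import math

N_total = 250_000          # length of one membrane-current record
xi = 2                     # scaling parameter of Expression (2)
d_min = 30                 # minimal segment length, 30-50 data points

# Expression (8): d_min * xi**K = N_total  =>  K = log(N_total/d_min)/log(xi)
K = round(math.log(N_total / d_min) / math.log(xi))
# Expression (9): Delta_k = N_total / xi**(K + 1 - k), k = 1 .. K+1
deltas = [N_total / xi**(K + 1 - k) for k in range(1, K + 2)]
```

With these values K = 13, Δ<sub>1</sub> ≈ 30.5 (inside the 30–50 band), and Δ<sub>*K*+1</sub> = *N*<sub>total</sub>, so the record is reduced to 4·(*K* + 1) = 56 parameters at this stage.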

Now it is necessary to explain why the sum of the parameters (α + β) is selected instead of considering each power-law exponent separately. This selection is based on the comparison of these exponents with the single power-law exponent α<sub>DFA</sub> figuring as the basic parameter in the DFA. It is easy to see that the relationship α + β = 1 with α ≈ β ≈ 0.5 (in this case the beta-distribution looks like a semicircle) corresponds to a distribution with the absence of power-law correlations in the time series; on the other side, this case gives α*<sub>DFA</sub>* = 0.5. Comparison with these two power-law exponents leads us to the following approximate expression:

$$
\alpha\_{\rm DFA} \cong \frac{1}{2} \left( \alpha + \beta \right). \tag{10}
$$

One can notice also that Expression (10) does not contradict other well-known power-law exponents (Hausdorff et al., 1995; Burr et al., 2008): β*<sub>f</sub>* = 2α*<sub>DFA</sub>* − 1, which is used for the description of the power-law spectrum *S*(*f*) ∼ *f*<sup>−β*f*</sup>, and the decay of the autocorrelation function *C*(*t*) = ⟨*x<sub>i</sub>* *x<sub>i+t</sub>*⟩ ∼ *t*<sup>−γ</sup> with γ = 2 − 2α*<sub>DFA</sub>*. From the requirements β*<sub>f</sub>*, γ > 0 it follows that:

$$1 \le (\alpha + \beta) = 2\alpha_{\text{DFA}} \le 2.\tag{11}$$

We want to stress here that this requirement is *approximate* and can serve as an indication for distinguishing long-time series with fractal structure, because it does not contradict the well-known inequalities established earlier for series with self-similar structure.

The left-hand inequality follows from the requirement β*<sub>f</sub>* > 0 and does not contradict numerical results obtained in other papers (Penzel et al., 2003; Jospin et al., 2007; Burr et al., 2008). We should also note that the equality (α + β) = 2 corresponds to a *uniform* amplitude distribution. The uniform distribution leads to the degeneration of the corresponding SRA to a straight line (Nigmatullin, 2010); the beta-distribution in this case is described by a parabolic curve. If one of the power-law exponents tends to zero (say α → 0), then the position of the extreme point *x*¯ → *x*<sub>0</sub> and, because of the normalization *w*<sub>1</sub> + *w*<sub>2</sub> = 1, *w*<sub>1</sub> → 1. This statement is valid also in the opposite case, when β → 0 and *x*¯ → *x<sub>N</sub>*. So, the last relationship (11) can be considered as a *specific fractal* test in our further calculations. Here we should also note that in practical applications the interval 0 < α + β < 1 and the inequality α + β > 2 are also *possible*. In the first case, for small values of α and β the beta-distribution degenerates to a rectangle-like curve. In the second case the derivatives at the ends (*x*<sub>0</sub>, *x<sub>N</sub>*) of the beta-distribution have zero values. These two cases correspond to the degeneration of the fractal properties of the time series analyzed. The verification of relationship (11) on the

Weierstrass-Mandelbrot function, which is a self-affine function (see its definition in Feder, 1988), confirms this relationship. So, for practical purposes it is useful to work with the combination (α + β). The statistical and geometrical meaning of the other parameters

entering (1) can be explained as follows. The value of the amplitude *A*, together with the height *H* of the beta-distribution, is associated with the intensity of the fluctuations analyzed. As one can see from **Figure 3A**, the angle of the SRA slope counted off from the zero point (after elimination of its mean value) is proportional to the height of the corresponding fluctuation, which is expressed in the form of a beta-distribution in **Figure 3B**. If this angle approaches the vertical axis, the height of the distribution becomes large; in the opposite case, when this angle tends to zero, the height of the distribution is small (see **Figure 3B**, where the first 14 beta-distributions are shown). The measure of asymmetry can be connected with the parameter *B* and the values of the weight factors *w*<sub>1,2</sub> that are defined by Expression (3). The value *w*<sub>1</sub> = 0.5 corresponds to complete symmetry of the distribution in the horizontal direction. Any shift of this parameter to the left (*w*<sub>1</sub> < 0.5) or to the right (*w*<sub>1</sub> > 0.5) reflects the *horizontal asymmetry* of the distribution. A small asymmetry of the distribution in the vertical direction is controlled by the parameter *B*.

*Step 3*. After selection of the scaling parameter ξ and the limiting value *K* from Expression (9), one can obtain a family of bell-like curves that can be fitted to Expression (1), yielding the fitting parameters *A<sub>k</sub>*, α*<sub>k</sub>*, β*<sub>k</sub>*, *B<sub>k</sub>*, *k* = 1, 2, ..., *K* + 1. The set of these bell-like curves and the corresponding fitting parameters forms the total fluctuation spectrum based on the beta-distribution (FSBD). Each part of this FSBD contains the corresponding beta-distribution:

$$Jb_k(x_j) = A_k \left(x_j - x_{0,k}\right)^{\alpha_k} \left(x_{N,k} - x_j\right)^{\beta_k} + B_k. \tag{12}$$

*Step 4*. In order to subject them to the scale-invariant properties described above, it is necessary to average this family of distributions and consider only one weighted distribution:

$$
\langle Jb_k(x_j) \rangle = \frac{1}{NBd_k} \sum_{j=1}^{NBd_k} Jb_k(x_j), \quad j = 1, 2, \dots, NBd_k,
$$

$$
NBd_k = \frac{N_{\text{total}}}{\Delta_k},
\tag{13}
$$

located in the given interval Δ<sub>*k*</sub>. Here the parameter *NBd<sub>k</sub>* coincides with the number of beta-distributions calculated for the given Δ<sub>*k*</sub>. **Figure 4** shows the averaged beta-distribution obtained for cell number 3. If *N*<sub>total</sub> = 250,000, then from condition (9) at the given Δ<sub>1</sub> = 30 and ξ = 2 we obtain *K* = 13, so the total number of beta-distributions is *NBd*<sub>1</sub> = *N*<sub>total</sub>/Δ<sub>1</sub> = 8333. The first 14 distributions belonging to this family are shown in **Figure 3B**.

*Step 5*. Further calculations are reduced to the analysis of the functional dependencies *A<sub>k</sub>*, α*<sub>k</sub>*, β*<sub>k</sub>*, *B<sub>k</sub>* (*k* = 1, 2, ..., *K* + 1) with respect to the variable *k*. We define them as the *primary* fitting parameters characterizing the averaged distribution (13). Further analysis shows that the amplitude *A<sub>k</sub>* has monotonic behavior and can be described by a simple exponential function:

$$
\langle A\_k \rangle = A\_1 \cdot \exp \left( \lambda\_a \cdot k \right) + A\_0. \tag{14}
$$

Preliminary calculations show that this monotonic behavior is conserved for long-time series without any trend; the presence of a trend distorts this behavior.

This dependence follows after substitution of Expression (9) into relationship (2) for the amplitudes. The perfect fit of this monotonic curve is shown in **Figure 5A**. The other dependencies are not so simple, but nevertheless they can be identified from simple power-law and exponential hypotheses with the help of the eigen-coordinates (ECs) method (Baleanu et al., 2011; Ciurea et al., 2011). The dependences ⟨(α + β)<sub>*k*</sub>⟩ ≡ *S<sub>k</sub>*(αβ) and ⟨*B<sub>k</sub>*⟩ also have monotonic character and can be fitted by means of two simple functions:

$$S_k\left(\alpha\beta\right) \cdot k^{\nu} = A_{pl} \cdot k + B_{pl},$$

$$\langle B_k \rangle = B_1 \cdot \exp\left(\lambda_B \cdot k\right) + B_0 \tag{15}$$

These functions are shown, respectively, in **Figures 5B,C**. So, finally we obtain 10 fitting parameters: the 9 parameters figuring in Expressions (14) and (15), [λ<sub>a</sub>, *A*<sub>1</sub>, *A*<sub>0</sub>], [ν, *A<sub>pl</sub>*, *B<sub>pl</sub>*], [λ*<sub>B</sub>*, *B*<sub>1</sub>, *B*<sub>0</sub>], combined with the limiting value of the parameter *w*<sub>1,*K*+1</sub>. The behavior of this weight factor is shown in **Figure 5D**.
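For noiseless, equally spaced data the three parameters of the exponential dependence (14) can even be recovered in closed form, since consecutive differences of an exponential form a geometric progression. A sketch on synthetic data with assumed (illustrative) parameter values; for real, noisy data a nonlinear least-squares fit or the ECs method would be used instead:

```python
import numpy as np

# Synthetic <A_k> generated from assumed parameters of Expression (14)
A1_true, lam_true, A0_true = 2.0, -0.35, 0.4
k = np.arange(1, 15)                       # k = 1 .. K+1 with K = 13
Ak = A1_true * np.exp(lam_true * k) + A0_true

# d_k = A_{k+1} - A_k = A1 * exp(lam*k) * (exp(lam) - 1) is geometric,
# so the ratio of consecutive differences recovers exp(lam)
d = np.diff(Ak)
lam = np.log(np.median(d[1:] / d[:-1]))
A1 = d[0] / (np.exp(lam * k[0]) * (np.exp(lam) - 1.0))
A0 = float(np.mean(Ak - A1 * np.exp(lam * k)))
```

The recovered triple (λ<sub>a</sub>, *A*<sub>1</sub>, *A*<sub>0</sub>) matches the generating values to machine precision on noiseless input.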

These ten parameters can be used as the *primary* set of fitting parameters for the creation of a specific "fingerprint" of the long-time series considered. The idea of clusterization of these parameters is discussed in the Results Section. Further analysis shows that the distributions of the heights and mean values of the SRAs obtained for the family of distributions at Δ<sub>1</sub> also form two other beta-distributions. These distributions are also *important* for clusterization purposes, because initially the information about the secondary distribution of the heights of the initially formed beta-distribution family and the mean values of the corresponding SRAs was *not* taken into account. The distributions of the heights and mean values, together with their beta-distributions, are shown in **Figures 6**, **7**, correspondingly. After fitting these two distributions one can obtain 5 additional significant parameters characterizing each beta-distribution separately:

$$\left[A_H, (\alpha + \beta)_H, \ w_{1,H}, \ \max\left(Bd_H\right), \ \text{mean}(SRA_H)\right],$$

$$\left[A_{mn}, (\alpha + \beta)_{mn}, \ w_{1,mn}, \ \max\left(Bd_{mn}\right), \ \text{mean}(SRA_{mn})\right]. \tag{16}$$

These ten additional parameters we define as the *secondary* fitting parameters. The statistical meaning of these parameters is the following. The parameters *A<sub>H</sub>*, *A<sub>mn</sub>* characterize the amplitudes of the beta-distributions referring, correspondingly, to the heights (*H*) and mean values (*mn*). The sums (α + β)*<sub>H</sub>*, (α + β)*<sub>mn</sub>* contain the information about their power-law exponents; *w*<sub>1,*H*</sub>, *w*<sub>1,*mn*</sub> give information about their asymmetry; max(*Bd<sub>H</sub>*), max(*Bd<sub>mn</sub>*) signify their heights; and the fifth pair of parameters, mean(*SRA<sub>H</sub>*), mean(*SRA<sub>mn</sub>*), contains information about the mean values of these two additional distributions.

From our point of view, these 20 (10 primary and 10 secondary) significant parameters [figuring in Expressions (14)–(16)], combined together, can completely characterize the behavior of fluctuations associated with the long-time series analyzed, containing *N*<sub>total</sub> = 2.5·10<sup>5</sup>–10<sup>6</sup> and even more data points.

#### **CLUSTERIZATION OF FINAL PARAMETERS BASED ON THE GENERALIZED PEARSON CORRELATION FUNCTION**

For clusterization purposes one can suggest a more accurate selection of similar sequences based on *internal* correlations. For this aim we introduce the generalized Pearson correlation function (GPCF) (Nigmatullin, 2010; Nigmatullin et al., 2012):

$$GPCF_p = \frac{GMV_p(s_1, s_2)}{\sqrt{GMV_p(s_1, s_1) \cdot GMV_p(s_2, s_2)}},\tag{17}$$

where the expression:

$$GMV_p(s_1, s_2, \dots, s_K) = \left( \frac{1}{N} \sum_{j=1}^{N} \left| nm_j(s_1) \cdot nm_j(s_2) \cdot \dots \cdot nm_j(s_K) \right|^{mom_p} \right)^{1/mom_p}, \tag{18}$$

determines the generalized mean value (GMV) function of the *K*-th order; it determines the mean value over the whole range of moments [see Expression (19) below]. The set of parameters (*s*<sub>1</sub>, *s*<sub>2</sub>, ..., *s<sub>K</sub>*) determines the random sequences compared. The *GPCF<sub>p</sub>* determined by Expression (17) coincides with the conventional definition of the Pearson correlation coefficient at *mom<sub>p</sub>* = 1. The set of moments is determined by the following expression:

$$mom_p = \exp\left(Ln_p\right), \quad Ln_p = mn + \left(\frac{p}{P}\right) \cdot (mx - mn), \quad p = 0, 1, \ldots, P. \tag{19}$$

The value *mom<sub>p</sub>* in (19) corresponds to the current moment from the interval [0, *P*]. The value *P* determines the final value of the linear function *Ln<sub>p</sub>* located in the interval [*mn*, *mx*]. The values *mn* and *mx* define, correspondingly, the limits of the moments on a uniform logarithmic scale. In many practical cases these values are chosen as *mn* = −15, *mx* = 15, and *P* is chosen as an integer value located in the interval [50, 100]. This empirical choice is related to the fact that the transition region of the random sequences considered, expressed in the form of the *GMV*-functions, is usually concentrated in the interval *Ln<sub>p</sub>* ∈ [−5, 5]. The extended interval [−15, 15] is usually taken for calculation of the limiting values of this function in the space of the fractional moments. The initial sequences are chosen in such a way that the minimum of the GMV-function coincides with zero while its upper value coincides with the maximal value of the random sequence considered. In formula (18) the random sequence is normalized to the unit interval in accordance with Expressions (20a) and (20b):

$$(A)\ nm_j(y) = \frac{y_j^{(+)}}{\max\left(y_j^{(+)}\right)} - \frac{y_j^{(-)}}{\min\left(y_j^{(-)}\right)}, \quad y_j^{(\pm)} = \frac{1}{2}\left(y_j \pm |y_j|\right),\tag{20a}$$

$$(B)\ nm_j(y) = \frac{\Delta y_j}{\max\left(\Delta y_j\right)}, \quad \Delta y_j = y_j - \min\left(y_j\right), \quad j = 1, 2, \dots, N, \ 0 < nm_j(y) < 1. \tag{20b}$$

**FIGURE 5 | (continued)** The slope and intercept of this line are given above the figure. **(C)** The fit of the monotonically decreasing function ⟨*B<sub>k</sub>*⟩ defined by Expression (15); the three fitting parameters of this function can be added to the previous ones for characterization of the given long-time series. **(D)** The behavior of the weight factors with respect to the parameter *k*. As the significant factor characterizing the behavior of the long-time sequence we use the maximal value max(*w*<sub>1</sub>) = 0.5027. So, from the analysis of **Figure 4** and this figure we can extract 10 *primary* fitting parameters.
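The two normalizations (20a) and (20b) can be written as one-liners; a minimal numpy sketch (the function names are ours; `nm` matches the normalized sequence entering Expression (18)):

```python
import numpy as np

def nm_signed(y):
    """Expression (20a): normalization for a sequence containing both signs."""
    y = np.asarray(y, dtype=float)
    yp = 0.5 * (y + np.abs(y))       # positive part y_j^(+)
    ym = 0.5 * (y - np.abs(y))       # negative part y_j^(-)
    return yp / yp.max() - ym / ym.min()

def nm_positive(y):
    """Expression (20b): normalization for a non-negative sequence."""
    dy = np.asarray(y, dtype=float) - np.min(y)
    return dy / dy.max()
```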

Here, as above, the set *y<sub>j</sub>* defines an initial random sequence that can contain a trend or can be compared with another trendless sequence. The symbol | ... | and the index *j* (*j* = 1, 2, ..., *N*) denote the absolute value and the number of the measured points, correspondingly. The second case (20b) corresponds to an initial sequence that is positive. If the limits *mn* and *mx* in (19) have opposite signs and take sufficiently large values, then the GPCF has two plateaus: it equals unity at small values of *mn* (i.e., *GPCF<sub>mn</sub>* = 1), while the other limiting value *GPCF<sub>mx</sub>* depends on the degree of internal correlation between the two random sequences compared. This right-hand limit (defined as *Lm*) is located between two values:

$$M \equiv \min \left( GPCF_p \right) \le Lm \equiv GPCF_{mx} \le 1. \tag{21}$$

**FIGURE 7 | The distribution of the mean values of 8333 beta-distributions (each distribution occupying only 30 data points) that were calculated in the initial analysis. (A)** Subtracting the mean value of this distribution (mean(mn) = 3.036·10<sup>−3</sup>) one can obtain again a bell-like curve, which can be fitted to the secondary beta-distribution corresponding to the distribution of mean values. **(B)** The fit to the beta-distribution function corresponding to the fluctuations of the mean values; this information was lost in the preliminary analysis. The five fitting parameters of this distribution (shown inside the figure) can be used as statistically significant parameters for characterizing the long-time series considered. So, as the result of this complete analysis one can obtain 20 statistically significant parameters that can be used for the detailed classification of long-time series containing 2.5·10<sup>5</sup>–10<sup>6</sup> data points.

The appearance of two plateaus implies that all information about possible correlations is complete and a further increase of the limiting numbers (*mx*, *mn*) figuring in (19) is *useless*. Numerous tests showed that a high degree of correlation between two random sequences is achieved when *Lm* = 1, while the lowest correlations are observed when *Lm* = *M*. This empirical observation, having a general character for all random sequences, allows us to introduce a new correlation parameter, the *CC* (complete correlation) factor, which is determined as:

$$CC = M \cdot \left(\frac{Lm - M}{1 - M}\right). \tag{22}$$

We would like to stress that this factor is determined on the *total* set of fractional moments located between exp(*mn*) and exp(*mx*). As mentioned above, in practical calculations it is sufficient in many cases to put *mn* = −15 and *mx* = +15. The CC-factor takes values close to unity when the degree of correlation is high, while the case *Lm* = *M* corresponds to the lowest (remnant) degree of correlation that can be observed between the compared random sequences. In addition, we want to stress the following fact: this CC-factor does *not* depend on the amplitudes of the random sequences, since the pair of random sequences compared is normalized to the interval 0 ≤ *y<sub>j</sub>* ≤ 1. It reflects the *internal* structure of correlations of the compared random sequences, based presumably on the similarity of their probability distribution functions, which in many cases are *not* known. A recent example of the application of the statistics of the fractional moments was considered in (Nigmatullin et al., 2012). So, the CC-factor (22) can be used for clusterization of the significant parameters based on the following idea: for a set of significant parameters referring to one qualitative factor one can calculate the limits of the CC-factor:

$$cf\_{\min} \le CC \le 1.\tag{23}$$

Here the low correlation limit *cf<sub>min</sub>* is determined by the sampling volume and the conditions of the experiment, which should be almost the same for the two qualitative factors compared (control vs. influence of another qualitative factor).
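The whole chain (17)–(19) and (22) admits a compact implementation; a sketch on synthetic sequences, with Expression (18) evaluated in the log domain so that the extreme fractional moments (up to e<sup>±15</sup>) neither underflow nor overflow (the sequence values and lengths here are illustrative):

```python
import numpy as np

def gmv(s1, s2, mom):
    """Expression (18) for K = 2, computed in the log domain for stability."""
    L = mom * np.log(np.abs(s1 * s2))
    Lmax = L.max()
    return np.exp((Lmax + np.log(np.mean(np.exp(L - Lmax)))) / mom)

def gpcf(s1, s2, moments):
    """Expression (17) on the whole set of fractional moments."""
    return np.array([gmv(s1, s2, m) / np.sqrt(gmv(s1, s1, m) * gmv(s2, s2, m))
                     for m in moments])

# Expression (19): moments uniform on a log scale between exp(mn) and exp(mx)
mn, mx, P = -15.0, 15.0, 100
moments = np.exp(mn + np.arange(P + 1) / P * (mx - mn))

rng = np.random.default_rng(1)
s1 = 0.001 + 0.999 * rng.random(1000)   # sequences normalized into (0, 1]
s2 = 0.001 + 0.999 * rng.random(1000)
g = gpcf(s1, s2, moments)

# Expression (22): CC-factor from the right plateau Lm and the minimum M
M, Lm = g.min(), g[-1]
CC = M * (Lm - M) / (1.0 - M)
```

Comparing a sequence with itself gives *GPCF<sub>p</sub>* = 1 for every moment, as the definition (17) requires.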

# **RESULTS**

#### **PROCESSING OF THE LONG-TIME MEMBRANE CURRENT SERIES**

In the previous section we described in detail the basic steps (Steps 1–5) of the treatment of an arbitrary long-time series. Here we want to make some general remarks related to this procedure. If the long-time series considered contains a clearly expressed but random trend, then its random behavior can disturb the monotonic behavior of the 9 primary parameters figuring in the fitting functions (14) and (15). In such cases one can recommend applying the POLS (procedure of the optimal linear smoothing) described in the papers (Baleanu et al., 2011; Ciurea et al., 2011; Nigmatullin et al., 2012) or simple numerical differentiation. These two procedures help to suppress the hidden random trend and obtain monotonic behavior for the 9 parameters figuring in (14) and (15). In the figures shown we used the scaling factor ξ = 2. For rational values of ξ from the interval (1, 2), Expression (9) can be modified as:

$$\Delta\_{k} = \frac{N\_{\text{tot}}}{\exp\left[ (K+1-k)\ln\left(2\right) \cdot \mu\right]}, \text{ } \mu = \frac{\ln\left(\xi\right)}{\ln\left(2\right)}.\tag{24}$$

So, numerical calculations realized at ξ = 1.5 show that the results do *not* change essentially; only the integer variable *k* in Expressions (14) and (15) is replaced as *k* → μ*k*. We think that this method has a wide range of applicability, and these two modifications can be taken into account in order to express a long-range time series in terms of 20 significant parameters. In the same manner as the membrane currents of the randomly taken interneuron-3 were treated, one can treat the long-time series related to the other interneurons (1, 2, 4, 5, 6, 7). Besides, in order to differentiate these random sequences from records made *without* the presence of a biological object, we treated in the same manner 6 random sequences corresponding to an empty electrode.
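The modified segment-length rule (24) is numerically identical to Expression (9) with ξ moved inside the exponent; a quick check at ξ = 1.5, with *K* recomputed from Expression (8) for this ξ (Δ<sub>min</sub> = 30 assumed, as above):

```python
import math

N_total, xi = 250_000, 1.5
mu = math.log(xi) / math.log(2)                     # Expression (24)
K = round(math.log(N_total / 30) / math.log(xi))    # Expression (8), d_min = 30
# Expression (24) versus the original Expression (9)
deltas_24 = [N_total / math.exp((K + 1 - k) * math.log(2) * mu)
             for k in range(1, K + 2)]
deltas_9 = [N_total / xi**(K + 1 - k) for k in range(1, K + 2)]
```

Both rules coincide term by term; the substitution only amounts to relabeling *k* → μ·*k* in the fitted dependencies.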

The next problem is associated with finding a criterion of clusterization that helps to combine these "control" membrane currents into one strongly-correlated cluster based on the values of the significant parameters. For each cell these parameters are collected in **Table 1**. For the 6 files corresponding to pure solution (without the presence of a cell) the results are collected in **Table 2**. How can one differentiate these 20 quantitative parameters (in this case the qualitative factor is associated with the presence/absence of a biological cell) from each other? The simplest classification can be related to the calculation of the mean value and standard deviation of each calculated significant parameter in each row. But a more effective scheme of clusterization, based on the statistics of the fractional moments and the usage of the complete correlation factor, is considered in the next section.

For the clusterization of the final parameters we have used the new correlation parameter *CC* described in the "Materials and Methods" section, Expression (22). The calculation of the CC-factor (in our case based on a set of membrane currents associated with 3 "control" measurements for each chosen cell from the total set of currents representing the 7 biological cells), considered as the correlation matrix (see **Table 3**) having minimal dimension (7 × 7), leads to the minimal value *cf<sub>min</sub>* = 0.9238. The result does not change essentially if one calculates numerically the corresponding integrals with respect to the normalized significant parameters and then considers their CC-factors: the tendency of strong correlations between the columns of **Table 1** is conserved, only the boundary of the correlation interval is slightly increased, achieving the value *Jcf<sub>min</sub>* = 0.9736. So, using the method of clusterization based on the statistics of the fractional moments and Expression (22), one can say that all "control" currents measured for the sampling 7 × 7 = 49 form a strongly-correlated cluster with limits [0.9238, 1] for the initial set of significant parameters (20 parameters for each sampling) and [0.9736, 1] for the corresponding integrals obtained by the direct trapezoid method from the normalized significant parameters. In accordance with this method of clusterization one can make the following conclusion: if any other series having 20 significant parameters gives a CC-factor located in the interval [0.9238, 1], then it can be considered as a "friend" file belonging to this cluster; in the opposite case it can be considered as a "strange" file. For more reliable identification, the statement above can be referred to the integrated columns formed from the 20 normalized significant parameters. In the same manner we treated the files corresponding to the electrode currents recorded in normal saline solution without the presence of a biological object.
The 20 desired parameters for the 6 files are collected in **Table 2**. Their correlation matrix, presented in **Table 4**, forms another cluster. But the attempt to combine the currents corresponding to the living cells with the currents corresponding to empty electrodes located in saline solution is *unsuccessful*. If we compare the correlation matrix of **Table 5** with the previous ones (**Tables 3**, **4**), we notice that the last matrix is *uncorrelated* (all elements are close to zero). It means that the presence of the biological cell completely changes the statistical structure of the current, and from a qualitative point of view the long-time random sequences of currents recorded in the two cases (presence/absence of a biological cell) are *different*.

So, the new clusterization method helps to express quantitatively an internal factor such as the presence/absence of the living cell (compare this statement with the series shown in **Figure 1**, where the corresponding currents look similar to each other). Definitely, more accurate measurements are needed in order to single out, from the many mixed factors that form a time series for biological and non-biological objects, the specific *predominant* factor that plays an essential role in this differentiation. But this problem merits separate research.

### **DISCUSSION**

It is well known that the cellular membrane is the element which largely provides cell functioning. The cell membrane has so many functions that it is difficult even to list them; one can find them in any textbook on cell biology. In general, the membrane provides all interaction of the cell with the external environment, including the perception of the effects of active substances.


**Table 1 | The collection of 20 significant parameters calculated for 7 cells based on calculation of registered membrane currents.**

*Each column describing the chosen cell is obtained in the result of the averaging of three membrane currents with the length 250,000 data points. The first 10 primary parameters are marked by a double line. The minimal and maximal values of each significant parameter in each row are bolded.*



**Table 2 | The collection of 20 significant parameters calculated for 6 records of the electrode current in normal saline solution without a biological object.**

*The first 10 primary parameters are marked by a double line. The minimal and maximal values of each significant parameter in each row are bolded.*


**Table 3 | The correlation matrix of the calculated CC-factors [Expression (22)] for 20 parameters characterizing 7 neurons collected in the Table 1.**

*The maximal and minimal values of correlations in each row are bolded.*

**Table 4 | The correlation matrix of the calculated CC-factors for 20 parameters characterizing 6 empty electrode records collected in the Table 2.**


*The maximal and minimal values of correlations in each row are bolded.*



the membrane comprises many elements which produce the so-called "membrane noise" (rather small variations of the membrane potential or trans-membrane current); mainly these are different types of ion channels, transporters and pumps. Many active substances affect the operation of these elements, so the action of these substances can actually be detected by analyzing the membrane noise. Even if a substance does not affect channels, transporters or pumps directly, its action can often be detected by noise analysis too: for example, if the substance affects G protein-coupled receptors or the state of membrane lipids, in many cases this leads to changes in the functioning of ion channels (Tillman and Cascio, 2003; Inanobe and Kurachi, 2014) and, accordingly, to noise changes. Thus, the analysis of long-time series of noise can help to detect the action of many substances whose action cannot be detected otherwise.

For the analysis of long-time series we applied the new BRC method based on the beta-distribution function. The four parameters of the beta-distribution function can be used to describe the local fluctuations, and the averaged beta-distributions can be applied to the *quantitative* reading of series containing a large number of data points. The fluctuation spectroscopy based on the beta distribution realizes an essential reduction of (2.5–10)·10<sup>5</sup> data points to *only* 20 quantitative parameters [see Expressions (14)–(16)] that contain the basic information calculated from three basic beta-distributions: (1) the distribution over different segments (scales), (2) the secondary beta-distributions over their heights, and (3) the distributions over mean values. This reduction becomes possible thanks to the invariant properties expressed by formulae (3) and (5). We suppose that this approach can be applied successfully to the unified additional analysis of fluctuations of different long-time series presenting the results of monitoring of biological, medical and other data that reflect the response of a complex system to some external factor. In particular, the BRC method is applicable to testing the action of antagonists of receptors and ion channels, where the modification is based on different types of interaction (with a binding site, or with the open channel, with different kinetics). In such experiments, in order to understand the mechanism of action of some new substance, one only needs to compare the FSBD parameter changes caused by this substance with the typical changes stored in a database.

# **FUNDING**

This work was partially supported (Andrei I. Skorinkin) by RF grant "Leading Scientific School" and RFBR grant.

# **ACKNOWLEDGMENTS**

We are grateful to Professor Peter Illes (Leipzig University, Germany) for the opportunity to obtain the experimental data used here in his laboratory; we also thank Professor Sverre Holm (University of Oslo, Norway) for useful discussions. The work is performed according to the Russian Government Program of Competitive Growth of Kazan Federal University.

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 June 2014; paper pending published: 11 July 2014; accepted: 08 September 2014; published online: 26 September 2014.*

*Citation: Nigmatullin RR, Giniatullin RA and Skorinkin AI (2014) Membrane current series monitoring: essential reduction of data points to finite number of stable parameters. Front. Comput. Neurosci. 8:120. doi: 10.3389/fncom.2014.00120*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Nigmatullin, Giniatullin and Skorinkin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Fast monitoring of epileptic seizures using recurrence time statistics of electroencephalography

# *Jianbo Gao<sup>1,2</sup>\* and Jing Hu<sup>1,2</sup>*

*<sup>1</sup> Institute of Complexity Science and Big Data Technology, Guangxi University, Nanning, China <sup>2</sup> PMB Intelligence LLC, Sunnyvale, CA, USA*

#### *Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Tobias A. Mattei, Ohio State University, USA Paul Rapp, Uniformed Services University of the Health Sciences, USA*

#### *\*Correspondence:*

*Jianbo Gao, Institute of Complexity Science and Big Data Technology, Guangxi University, 100 Daxue Road, Nanning 530005, China e-mail: jbgao.pmb@gmail.com*

Epilepsy is a relatively common brain disorder which may be very debilitating. Currently, determination of epileptic seizures often involves tedious, time-consuming visual inspection of electroencephalography (EEG) data by medical experts. To better monitor seizures and make medications more effective, we propose a recurrence time based approach to characterize brain electrical activity. Recurrence times have a number of distinctive properties that make them very effective for forewarning of epileptic seizures as well as for studying the propagation of seizures: (1) recurrence times amount to the periods of periodic signals, (2) recurrence times are closely related to the information dimension, Lyapunov exponent, and Kolmogorov entropy of chaotic signals, (3) recurrence times embody the Shannon and Renyi entropies of random fields, and (4) recurrence times can readily detect bifurcation-like transitions in dynamical systems. In particular, property (4) dictates that, unlike many other non-linear methods, the recurrence time method does not require the EEG data to be chaotic and/or stationary. Moreover, the method contains only a few parameters that are largely signal-independent, and hence is very easy to use. The method is also very fast: it can process multi-channel EEG data on-line with a typical PC. Therefore, it has the potential to be an excellent candidate for real-time monitoring of epileptic seizures in a clinical setting.

**Keywords: EEG, recurrence time, seizure detection, seizure propagation, brain complexity**

# **1. INTRODUCTION**

Epilepsy is a relatively common brain disorder which may be very debilitating. It affects approximately 1% of the world population (Jallon, 1997) and three million people in the United States alone. It is characterized by intermittent seizures, during which the normal activity of the central nervous system is disrupted. The concrete symptoms include abnormal running/bouncing fits, clonus of the face and forelimbs, or tonic rearing movements, as well as the simultaneous occurrence of transient EEG signals such as spikes, spike-and-slow-wave complexes or rhythmic slow-wave bursts. Clinical effects may include motor, sensory, affective, cognitive, autonomic and physical symptomatology. Although epilepsy can be treated effectively in many instances, severe side effects may result from constant medication. Even worse, some patients may become drug-resistant not long after treatment. To make medications more effective, timely detection of seizures is very important.

In the past several decades, considerable efforts have been made to detect/predict seizures through non-linear analysis of EEGs (Kantz and Schreiber, 1997; Gao et al., 2007). Representative non-linear methods proposed for seizure prediction/detection include approaches based on correlation dimension (Lehnertz and Elger, 1995, 1997; Martinerie et al., 1998; Aschenbrenner-Scheibe et al., 2003), Kolmogorov entropy (van Drongelen et al., 2003), permutation entropy (Cao et al., 2004), short time largest Lyapunov exponent (STLmax) (Iasemidis et al., 1990; Lai et al., 2003), dissimilarity measures (Protopopescu et al., 2001; Quyen et al., 2001), long-range correlation (Hwa and Ferree, 2002; Gao et al., 2006b, 2007, 2011b; Valencia et al., 2008), power-law sensitivity to initial conditions (Gao et al., 2005b), scale-dependent Lyapunov exponent (SDLE) (Gao et al., 2006a, 2012a,b), and synthesis of linear/non-linear methods by using neural networks (Adeli et al., 2007). Readers interested in "what is epilepsy, where, when, and why (how) do seizures occur?" are referred to the April, 2007 issue of *Journal of Clinical Neurophysiology*.

Note that most of the proposed methods assume that EEG signals are chaotic and stationary. As a result, they tend to have performances that are signal- and patient-dependent, owing to the noisy and non-stationary nature of the EEG within and across patients. In addition, they are computationally expensive. Consequently, studies of epilepsy still heavily involve visual inspection of multichannel EEG signals by medical experts. Visual inspection of long (e.g., tens of hours or days) EEG recordings is, however, tedious, time-consuming, and inefficient. Therefore, it is important to develop new non-linear seizure monitoring approaches.

In this paper, we explore recurrence time based analysis of EEG (Gao, 1999, 2001; Gao and Cai, 2000; Gao et al., 2003), with the goal of potentially monitoring the occurrence and propagation of seizures on-line. The method does not assume that the underlying dynamics of EEG are chaotic or stationary. More importantly, it has been shown to readily detect very subtle changes in signals (Gao, 2001; Gao et al., 2003).

When developing a new method, it is important to compare its performance with that of existing methods. For seizure detection, such a task has been greatly simplified by our recent studies (Gao et al., 2011a, 2012a). By comparing seizure detection using a variety of complexity measures from deterministic chaos theory, random fractal theory, and information theory, we have found that the variations of those complexity measures with time have two patterns—either similar or reciprocal (Gao et al., 2011a). More importantly, we have gained fundamental understanding about the connections among different complexity measures through a new multiscale complexity measure, the SDLE. These results are recapitulated in **Figure 1**. While we leave the details to our prior works (Gao et al., 2006a, 2007, 2012a,b), these results suggest that it would be sufficient for us to compare the performance of the recurrence time based method for seizure detection with the performance of any of the existing complexity measures. Since some of the EEG data examined here had also been analyzed by the STLmax method and documented results exist, we shall compare our recurrence time method with the STLmax method. We shall show that the recurrence time method is both more accurate and faster than the STLmax method in detecting seizures from EEG.

The remainder of the paper is organized as follows. In section 2, we describe the data used here and the recurrence time method and the STLmax method for seizure detection. In

section 3, we compare the performance of the recurrence time and STLmax method for seizure detection, as well as study seizure propagation. In section 4, we make a few concluding remarks.

# **2. MATERIALS AND METHODS**

In this section, we first describe EEG data used here, then describe the recurrence time method and the short-time Lyapunov exponent (STLmax) method.

# **2.1. DATA**

The EEG signals analyzed here are human EEG, recorded intracranially with approved clinical equipment at the Shands Hospital of the University of Florida, with a sampling frequency of 200 Hz. **Figure 2** shows our typical 28-electrode montage used for subdural and depth recordings.

Intracranial EEG is also called depth EEG, and is considered less contaminated by noise or motion artifacts. However, the clinical equipment used to measure the data has a pre-set, unadjustable maximal amplitude of around 5300 μV. This causes clipping of the signals whenever the signal amplitude exceeds this threshold, which is often the case during seizure episodes, especially for certain electrodes. To a certain extent, clipping complicates seizure detection, since certain seizure signatures may not be captured by the measuring equipment. Nevertheless, we did not apply any filtering or conditioning to preprocess the raw EEG signals when using our recurrence time method. The good results presented below thus suggest that the method is very reliable.

Altogether we have data of seven patients. The total duration of the measurement for each patient was up to about 3 days, as shown in the 2nd column of **Table 1**. There were only one or a few

**FIGURE 2 |** Approximate location of depth electrodes, oriented along the anterior–posterior plane in the hippocampi (RTD, right temporal depth; LTD, left temporal depth), and subdural electrodes located beneath the orbitofrontal and subtemporal cortical surfaces (ROF, right orbitofrontal; LOF, left orbitofrontal; RST, right subtemporal; LST, left subtemporal).


**Table 1 | Performance of the** *T***<sup>2</sup> and the STLmax method for seven patients' data.**

*The total number of seizures was determined by examining clinical symptoms and all 28-channel video-EEG data by medical experts. Note the five missed seizures for patient P93 are all subclinical seizures, whose information does not appear to be reflected by the EEG dynamics.*

seizures for some patients, while there were several tens of seizures for others, as shown in the 3rd column of **Table 1**. Some of the seizures were considered subclinical, i.e., not manifested in the EEG signals. Sometimes the EEG signals may contain signatures distinctly different from background non-seizure signals, due, for example, to the patient eating or drinking. Such non-seizure signatures may also be picked up by a seizure monitoring method. In this study, we focus on the behavior of the recurrence time and STLmax methods in detecting seizures using only three channels of EEG data without any preprocessing. As we shall see later, reliable decisions can be made based on single-channel EEG data; there appears to be no need to combine data from multiple channels.

### **2.2. RECURRENCE TIME BASED METHOD FOR SEIZURE DETECTION**

The method involves first partitioning a long EEG signal into (overlapping or non-overlapping) blocks of data of short length *k*, and then computing the so-called mean recurrence time of the 2nd type, *T*2(*r*), for each data subset. For non-stationary and transient time series, it has been found (Gao, 1999, 2001; Gao and Cai, 2000; Gao et al., 2003) that *T*2(*r*) differs between different blocks of data.

Let us first define the recurrence time of the 2nd type. Suppose we are given a scalar time series {*x*(*i*), *i* = 1, 2,...}. We first construct vectors of the form *Xi* = [*x*(*i*), *x*(*i* + *L*), . . . , *x*(*i* + (*m* − 1)*L*)], with *m* being the embedding dimension and *L* the delay time (Packard et al., 1980; Takens, 1981; Sauer et al., 1991). {*Xi*, *i* = 1, 2,..., *N*} then represents a certain trajectory in an *m*-dimensional space. Next, we arbitrarily choose a reference point *X*0 on the reconstructed trajectory, and consider recurrences to its neighborhood of radius *r*, *Br*(*X*0) = {*X* : ‖*X* − *X*0‖ ≤ *r*}. The recurrence points of the 2nd type are defined as the set comprising, for each visit, the first trajectory point entering the neighborhood from outside; these are denoted by the dark solid circles in **Figure 3**. The trajectory may stay inside the neighborhood for a while, thus generating a sequence of points, designated by the open circles in **Figure 3**; these are called sojourn points (Gao, 1999). Clearly, there will be more such points when the size of the neighborhood gets larger as well as when the trajectory is

sampled more densely. The union of the recurrence points of the 2nd type and the sojourn points constitutes the recurrence points of the 1st type. These are often called the nearest neighbors of the reference point *X*0, and have been used by all other chaos theory-based non-linear methods.

Let us be more precise mathematically. We denote the recurrence points of the 1st type by *S*1 = {*X*<sub>*t*1</sub>, *X*<sub>*t*2</sub>, . . .}, and the corresponding Poincaré recurrence times of the 1st type by {*T*1(*i*) = *t*<sub>*i*+1</sub> − *t*<sub>*i*</sub>, *i* = 1, 2, . . .}. Note the time is computed based on successive returns, not between a returning point and the reference point. Also note that *T*1(*i*) may equal 1 (for continuous-time systems, one unit of the sampling time) for some *i*; this occurs when there is at least one sojourn point. The existence of such points makes further quantitative analysis difficult. Thus, we remove the sojourn points from the set *S*1 (which is easily achieved by monitoring whether the recurrence times of the 1st type equal one). Denote the remaining set by *S*2 = {*X*<sub>*t*′1</sub>, *X*<sub>*t*′2</sub>, . . .}; *S*2 then defines the time sequence {*T*2(*i*) = *t*′<sub>*i*+1</sub> − *t*′<sub>*i*</sub>, *i* = 1, 2, . . .}. These are called the recurrence times of the 2nd type.
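In code, the construction just described (delay embedding, detection of entries into the neighborhood from outside, removal of sojourn points, and differencing of the entry times) can be sketched as follows. This is a minimal illustration of ours, not the authors' implementation; the function names and the random choice of reference points are our own assumptions.

```python
import numpy as np

def embed(x, m, L):
    """Time-delay embedding: rows are X_i = [x(i), x(i+L), ..., x(i+(m-1)L)]."""
    n = len(x) - (m - 1) * L
    return np.column_stack([x[j * L : j * L + n] for j in range(m)])

def recurrence_times_2nd(X, x0, r):
    """Recurrence times of the 2nd type to the ball of radius r around x0.

    Keeping only the first point of each visit (predecessor outside the
    ball) discards the sojourn points; differencing the entry times then
    yields the T_2 sequence."""
    inside = np.linalg.norm(X - x0, axis=1) <= r
    entries = np.where(inside[1:] & ~inside[:-1])[0] + 1
    if inside[0]:
        entries = np.insert(entries, 0, 0)
    return np.diff(entries)

def mean_T2(x, m, L, r, n_ref=50, rng=None):
    """Mean recurrence time of the 2nd type, averaged over reference points."""
    rng = np.random.default_rng(rng)
    X = embed(np.asarray(x, float), m, L)
    times = []
    for i in rng.choice(len(X), size=min(n_ref, len(X)), replace=False):
        times.extend(recurrence_times_2nd(X, X[i], r))
    return np.mean(times) if times else np.nan
```

Applied to a sinusoid with a period of 20 samples (property (1) below), `mean_T2` returns 20, i.e., the recurrence time recovers the period of the motion.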

*T*2(*i*) has a number of interesting properties: (1) For periodic motions, so long as the size of the neighborhood is not too large, *T*2(*i*) accurately estimates the period of the motion. (2) For discrete sequences, the entire Renyi entropy spectrum can be computed from the moments of *T*2 (Gao et al., 2005a). (3) For chaotic motions, *T*2(*i*) is closely related to the Lyapunov exponent, and hence, Kolmogorov entropy (Gao and Cai, 2000). (4) For chaotic motions, *T*2(*i*) is related to the information dimension *d*1 by a simple scaling law (Gao, 1999; Gao et al., 2003),

$$T_2(r) \sim r^{-(d_1 - \alpha)}, \tag{1}$$

where α takes the value 0 or 1, depending on whether the sojourn points form only a few isolated points inside the neighborhood *Br*(*X*0), thus contributing dimension 0, or form a smooth curve inside *Br*(*X*0), thus contributing dimension 1. These properties make the recurrence time based method very versatile and powerful in detecting signal transitions.

We now explain how the mean recurrence time of the 2nd type can be computed: we simply evaluate this quantity for every reference point in a window, then take the mean of those times. This calculation is carried out for all the data subsets, resulting in a curve describing how *T*2(*r*) varies with time. It has been observed (Gao, 1999, 2001; Gao and Cai, 2000; Gao et al., 2003) that the variations of *T*2(*r*) coincide very well with sudden changes in the signal dynamics, such as bifurcations or transitions from regular to chaotic motion in non-stationary data, and vice versa. An example is shown in **Figure 4** using the transient logistic map described by

$$x(n+1) = a(n)x(n)[1-x(n)], \quad a(n) = a(n-1) + 10^{-5} \tag{2}$$

We observe from **Figure 4** that the method not only detects all the bifurcations in the signal, but also gives the exact periods of periodic signals. Note that some changes in a signal may be difficult to detect visually (Gao, 2001).
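The detection of bifurcations in the transient logistic map of Equation (2) can be reproduced in miniature. The sketch below is ours, not the computation behind **Figure 4**; for brevity it uses a plain scalar (*m* = 1) version of the recurrence construction, and the window positions and radius *r* are illustrative choices. On windows taken well inside the period-2 and period-4 regimes, the mean recurrence time of the 2nd type recovers the period.

```python
import numpy as np

def t2_period(x, r=0.01):
    """Mean recurrence time of the 2nd type for a scalar window (m = 1):
    entries into the ball |x - x[0]| <= r, with sojourn points removed."""
    inside = np.abs(x - x[0]) <= r
    entries = np.where(inside[1:] & ~inside[:-1])[0] + 1
    if inside[0]:
        entries = np.insert(entries, 0, 0)
    d = np.diff(entries)
    return d.mean() if len(d) else np.inf

# Transient logistic map, Equation (2): a(n) drifts upward by 1e-5 per step.
a, x = 2.9, 0.5
traj = []
for _ in range(100_000):
    x = a * x * (1.0 - x)
    traj.append(x)
    a += 1e-5
traj = np.array(traj)

w2 = traj[30_000:32_000]   # a in [3.20, 3.22]: period-2 regime
w4 = traj[60_000:62_000]   # a in [3.50, 3.52]: period-4 regime
```

Here `t2_period(w2)` evaluates to approximately 2 and `t2_period(w4)` to approximately 4, mirroring the exact periods reported for **Figure 4**.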

Since there are altogether four parameters involved, namely, the embedding dimension *m* and delay time *L*, the window length *k* for the data subsets, and the neighborhood size *r*, how shall we select them properly? To better illustrate the ideas, we postpone the discussion to section 3.1.1.

#### **2.3. STLmax METHOD FOR SEIZURE DETECTION**

The basic idea is to compute the largest positive Lyapunov exponent for each window of the EEG signal using the algorithm of Wolf et al. (1985) or its simple variants. Therefore, it is sufficient to describe the Wolf et al. (1985) algorithm and point out how it can be modified.

To apply the Wolf et al. (1985) algorithm, one selects a reference trajectory and follows the divergence of a neighboring trajectory from it. Denote the reference and the neighboring trajectories by *Xi* = [*x*(*i*), *x*(*i* + *L*), . . . , *x*(*i* + (*m* − 1)*L*)], *i* = 1, 2, . . . , and *Xj* = [*x*(*j*), *x*(*j* + *L*), . . . , *x*(*j* + (*m* − 1)*L*)], *j* = *K*, *K* + 1, . . . , respectively. At the starting time (which corresponds to *i* = 1), *XK* is usually taken as the nearest neighbor of *X*1; that is, *j* = *K* minimizes the distance between *Xj* and *X*1. When time

evolves, the distance between *Xi* and *Xj* also changes. Let the spacing between the two trajectories at times *t*<sub>*i*</sub> and *t*<sub>*i*+1</sub> be *d*′<sub>*i*</sub> and *d*<sub>*i*+1</sub>, respectively. Assuming *d*<sub>*i*+1</sub> ∼ *d*′<sub>*i*</sub> *e*<sup>λ1(*t*<sub>*i*+1</sub> − *t*<sub>*i*</sub>)</sup>, the rate of divergence of the trajectory, λ1, over the time interval *t*<sub>*i*+1</sub> − *t*<sub>*i*</sub> is then

$$\frac{\ln(d_{i+1}/d'_i)}{t_{i+1}-t_i}.$$

To ensure that the separation between the two trajectories is always small, when *d*<sub>*i*+1</sub> exceeds a certain threshold value, it has to be renormalized: a new point in the direction of the vector of *d*<sub>*i*+1</sub> is picked so that the new spacing *d*′<sub>*i*+1</sub> is very small compared to the size of the attractor. After *n* repetitions of stretching and renormalizing the spacing, one obtains the following formula:

$$\lambda_1 = \sum_{i=1}^{n-1} \left[ \frac{t_{i+1} - t_i}{\sum_{i=1}^{n-1} (t_{i+1} - t_i)} \right] \left[ \frac{\ln(d_{i+1}/d'_i)}{t_{i+1} - t_i} \right] = \frac{\sum_{i=1}^{n-1} \ln(d_{i+1}/d'_i)}{t_n - t_1}. \tag{3}$$

Note that this algorithm assumes but does not verify exponential divergence. In fact, the algorithm can yield a positive value of λ1 for any type of noisy process so long as all the distances involved are small. The reason is that when *d*′<sub>*i*</sub> is small, evolution will move *d*′<sub>*i*</sub> toward the most probable spacing, which is typically much larger than *d*′<sub>*i*</sub>. Then *d*<sub>*i*+1</sub>, being at an intermediate step of this evolution, will also be larger than *d*′<sub>*i*</sub>; therefore, a quantity calculated by Equation (3) will be positive. This argument makes it clear that the algorithm cannot distinguish chaos from noise. In other words, even if the algorithm returns a positive λ1 from EEG data, one cannot conclude that the data are chaotic.

It is worth noting that in practice, to simplify the implementation of the algorithm, one may replace the renormalization procedure described above by requiring that *d*′<sub>*i*+1</sub> be constructed whenever *t*<sub>*i*+1</sub> = *t*<sub>*i*</sub> + *T*, where *T* is a small time interval. Such a procedure may be called periodic renormalization; in contrast, the original version of the algorithm uses aperiodic renormalization.
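For concreteness, here is a compact sketch (ours, not the STLmax implementation of Iasemidis et al.) of the Wolf-type estimate with periodic renormalization, using *T* = 1 sample and *m* = 1 on a scalar map: at each step the nearest neighbor is re-chosen, both points are evolved one step, and ln(*d*<sub>*i*+1</sub>/*d*′<sub>*i*</sub>) is accumulated as in Equation (3). Applied to the fully developed logistic map *x*(*n* + 1) = 4*x*(*n*)[1 − *x*(*n*)], whose largest Lyapunov exponent is ln 2 ≈ 0.693, it recovers a nearby value.

```python
import numpy as np

def wolf_lambda1_1d(x, n_steps=5000, seed=0):
    """Wolf-style largest-Lyapunov estimate for a scalar series (m = 1)
    with periodic renormalization, T = 1 sample."""
    x = np.asarray(x, float)
    order = np.argsort(x[:-1])      # sorted view for neighbor queries
    xs = x[:-1][order]
    rng = np.random.default_rng(seed)
    total, count = 0.0, 0
    for i in rng.integers(0, len(x) - 1, size=n_steps):
        # nearest neighbor of x[i] in value, excluding x[i] itself
        pos = np.searchsorted(xs, x[i])
        cand = [order[k] for k in (pos - 1, pos, pos + 1) if 0 <= k < len(xs)]
        cand = [j for j in cand if j != i]
        j = min(cand, key=lambda j: abs(x[j] - x[i]))
        d0 = abs(x[j] - x[i])            # d'_i: spacing after renormalization
        d1 = abs(x[j + 1] - x[i + 1])    # d_{i+1}: spacing one step later
        if d0 > 0 and d1 > 0:
            total += np.log(d1 / d0)
            count += 1
    return total / count

# Fully developed logistic map: lambda_1 = ln 2
x = np.empty(20_000)
x[0] = 0.3
for n in range(len(x) - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])
```

Note that, exactly as the text warns, this estimator would also return a positive value for a purely random series; a positive output alone says nothing about chaoticity.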

# **3. RESULTS**

#### **3.1. SEIZURE DETECTION USING RECURRENCE TIME METHOD**

As we pointed out earlier, the method contains four parameters: the embedding dimension *m* and delay time *L*, the window length *k* for the data subsets, and the neighborhood size *r*. In this subsection, we first discuss how to choose these four parameters properly. Then we evaluate the effectiveness of the method for detecting epileptic seizures. For ease of presentation, we assume that the data have been normalized to the unit interval [0, 1] before further analysis.

#### *3.1.1. Parameter selection*

First, we consider the window length *k* for the data subsets. Since our purpose is to find transitions in the signal dynamics, the data subset has to be small. To estimate the statistics of interest reliably, a rule of thumb is that a data subset containing several periods of "oscillations" is sufficient (assuming the motion defines certain periodicity-like time scales). For our EEG sampled at 200 Hz, we have found that *k* in the range of 500–2000 works well. **Figures 5A,B** show two examples, for *k* = 1000 and 2000, respectively. Clearly, in both cases, the two seizures have been detected correctly.

Next, we consider the size *r* of the neighborhood. It can be readily appreciated that when *r* is large there will be many recurrences, while when *r* is small recurrences will be rather rare. This means *T*2(*r*) will be large for small *r* but small for large *r*. Such expectations have been extensively confirmed in practice. For EEG signals, we have found that although the values of *T*2(*r*) may vary with *r*, the pattern of the variation remains basically the same over a wide range of *r*. Two examples are shown in **Figures 5B,C**, where *r* differs by a factor of 2. Our experience is that the choice of this parameter is not very critical, insofar as seizure monitoring is concerned.

Finally, we consider the embedding parameters. As is well known, the embedding parameters critically control the geometrical structure formed by the reconstructed vectors. Because of this, optimal embedding is a critical issue, especially when geometrical or dynamical quantities of the dynamics are concerned, such as the fractal dimension, Lyapunov exponents, and Kolmogorov entropy. For an in-depth discussion of this issue, we refer to Gao et al. (2007). Here, we wish to point out that the time scales associated with the motion are typically much less sensitive to the embedding parameters than quantities such as the fractal dimension, Lyapunov exponents, and Kolmogorov entropy. To appreciate this feature, we have schematically shown in **Figure 6** two different embeddings. It is clear that the reconstructed trajectory shown in **Figure 6A** is fairly uniform, while that in **Figure 6B** is less so; one can readily conceive that when **Figure 6B** is further squeezed, the embedding quality becomes even worse. Judged by most optimal embedding criteria, the embedding shown in **Figure 6A** would be considered much better than that shown in **Figure 6B**. However, it can be readily seen that *T*2(*r*) is more or less the same for both **Figures 6A,B**. This means that the selection of *m* and *L* for computing *T*2(*r*) is much less critical than for computing other dynamical quantities. One good rule of thumb is that as long as the geometrical structure formed by the vectors is reasonably

**FIGURE 5 | Dependence of** *T***<sup>2</sup> on the parameters of the algorithm. (A–F)** Correspond to (*k*, *m*, *L*, *r*) = (1000, 4, 4, 2−4), (2000, 4, 4, 2−4), (2000, 4, 4, 2−3), (2000, 3, 4, 2−4), (2000, 4, 2, 2−4), and (2000, 4, 6, 2−4), respectively.

space-filling, the embedding is considered fine. Our experience with computing *T*2(*r*) from EEG is that 3 ≤ *m* ≤ 6 all work well and, with a sampling frequency of 200 Hz, *L* may be chosen between 2 and 6. This discussion may be better appreciated by comparing **Figures 5B,D–F**, where four sets of (*m*, *L*) are illustrated. Clearly, all the parameter combinations have detected the two seizures accurately.

To summarize, the recurrence time method is much less sensitive to its parameters than other non-linear methods, where the embedding and other parameters have to be chosen carefully, and have to be specifically adapted to each patient for good results. For our recurrence time method, by contrast, we have used the same parameter combination (*k*, *m*, *L*, *r*) = (2000, 4, 4, 2−4) for all seven patients' data.

#### *3.1.2. Performance evaluation of the method*

To illustrate the idea, we shall arbitrarily pick three channels of EEG data<sup>1</sup> from one patient, and compare the patterns

<sup>1</sup>In fact, the three chosen channels of EEG data may not correspond to where a seizure was localized. This further indicates the robustness of our method.

of variation of *T*2(*r*) with that of STLmax. One typical result is shown in **Figure 7**. Vertical dotted lines indicate the seizure occurrence times determined by medical experts by viewing videotapes as well as the EEG signals. There are three seizures in **Figure 7** during the period of time plotted. We observe that the *T*2(*r*) curves cleanly and accurately detect all the seizures that occurred. In fact, if one ignores the propagation-related slight timing differences (on the order of a few seconds up to 1 min; this will be further discussed later) among the different electrodes, most of the channels can be considered equivalent. In other words, a decision can be based on single-channel EEG data. This feature makes automatic detection of seizures by thresholding almost trivial. In contrast, the STLmax curves are much noisier than the *T*2(*r*) curves. Although the STLmax curves can be further post-processed to better reveal seizure information (Iasemidis et al., 2003), those features are still much weaker than those revealed by the recurrence time method.

To compare more systematically the performance of the two methods in detecting seizures, we have computed the positive detection rate (or equivalently, sensitivity) and the false alarms per hour for both methods. Positive detection is defined as the ratio between the number of seizures correctly detected and the total number of seizures. False alarms per hour is simply the number of falsely detected seizures divided by the total time period. **Table 1** summarizes the results. Clearly, the recurrence time method is more accurate than the STLmax method. This accuracy becomes even more attractive when one notices that the recurrence time method only involves simple thresholding, while the STLmax method involves considerable further analysis (Iasemidis et al., 2003).
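Once a detection rule is fixed (e.g., simple thresholding of the *T*2(*r*) curve), both scores are straightforward to compute. The sketch below is ours: the function names, the assumption that seizures drive *T*2 below a threshold, the 120-s merge gap, and the ±60-s matching tolerance are illustrative choices, not values taken from the paper.

```python
import numpy as np

def detect_events(t2, times, threshold, min_gap_s=120.0):
    """Threshold the T2(r) curve (numpy arrays of equal length): an event
    starts where T2 first drops below the threshold; detections closer
    than min_gap_s are merged into one."""
    below = t2 < threshold
    starts = times[np.where(below[1:] & ~below[:-1])[0] + 1]
    if below[0]:
        starts = np.insert(starts, 0, times[0])
    merged = []
    for s in starts:
        if not merged or s - merged[-1] > min_gap_s:
            merged.append(s)
    return merged

def score(detections, true_seizures, total_hours, tol_s=60.0):
    """Sensitivity and false alarms per hour against expert-marked times."""
    hits = sum(any(abs(d - s) <= tol_s for d in detections)
               for s in true_seizures)
    false_alarms = sum(all(abs(d - s) > tol_s for s in true_seizures)
                       for d in detections)
    return hits / len(true_seizures), false_alarms / total_hours
```

On a synthetic hour of data with two marked seizures and three threshold crossings, this yields a sensitivity of 1.0 and one false alarm per hour, matching the definitions above.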

**FIGURE 8 |** *T*2(*r*) curves for EEG signals measured by three electrodes. The dashed vertical line indicates the seizure starting position around 200 s. The seizure lasted for about 2 min. Note that from **(A)** LTD1 to **(B)** LTD3, the seizure activity is delayed about 10 s, while from **(A)** LTD1 to **(C)** LTD5, the seizure activity is delayed about 30–40 s.

### *3.1.3. Computational cost*

The recurrence time method is very fast. With an ordinary PC (CPU speed less than 2 GHz), computation of *T*2(*r*) from one channel of EEG data of 1-h duration with a sampling frequency of 200 Hz takes about 1 min of CPU time. Computation of STLmax, on the other hand, takes more than 10 min. Hence, the recurrence time based method is much faster than the STLmax method. In fact, even with an ordinary PC, one is able to process all 28 channels of 1-h EEG data in about half an hour, i.e., faster than the data are being collected. With a more powerful PC, of course, the speed becomes even higher. Such a speed implies that the method can be used to process all channels of continuously collected EEG data on-line and in real time. From an engineering perspective, this speed is a decisive advantage of the recurrence time statistics.

## **3.2. PROPAGATION OF EPILEPTIC SEIZURES IN THE BRAIN**

Formation and propagation of epileptic seizures in the brain is an outstanding example of complex spatio-temporal pattern formation. One of the most desirable ways of studying these problems is to understand how and when information flows from one region of the system to others. To resolve this issue, it is critical to provide accurate timing information for the events of interest occurring in the system. With exact timing information, one can then use concepts such as cross-correlation and cross-spectrum, mutual information, or measures from chaos theory, such as those related to cross recurrence plots, to characterize the spatio-temporal patterns more fully. The recurrence time method can effectively provide such timing information. To illustrate this point, we show in **Figure 8** an example of the analysis of multi-channel EEG signals using the recurrence time method. For the specific seizure studied, it was known that the seizure occurred around 200 s and lasted about 2 min. While the recurrence time method accurately detected the seizure, we note that the seizure activity recorded by electrodes LTD3 and LTD5 lagged that indicated by electrode LTD1 by about 10 and 40 s, respectively. Hence, the recurrence time method not only accurately detects the seizure, but also provides invaluable timing information about the development of the seizure.

# **4. CONCLUSIONS**

Motivated by the desire to develop a non-linear method free of the assumptions that EEG signals are chaotic and stationary, we have proposed a recurrence time based approach to characterizing brain electrical activity. The method is very easy to use, as it contains only a few parameters that are largely signal-independent. It detects epileptic seizures from EEG signals very accurately. Most critically, the method is very fast—fast enough to process multi-channel EEG data on-line, in real time, on a typical PC. Therefore, it is a promising candidate for real-time monitoring of epileptic seizures in a clinical setting.

The recurrence time method can also accurately provide the timing information critical for understanding seizure propagation. Therefore, it may help characterize epilepsy type, lateralization, and seizure classification (Holmes, 2008; Napolitano and Orriols, 2008; Plummer et al., 2008). To understand the capabilities of the recurrence time method in characterizing seizure propagation more thoroughly, it would be desirable to combine recurrence time analysis of EEG with studies based on MEG and MRI exams.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 July 2013; accepted: 04 September 2013; published online: 01 October 2013.*

*Citation: Gao J and Hu J (2013) Fast monitoring of epileptic seizures using recurrence time statistics of electroencephalography. Front. Comput. Neurosci. 7:122. doi: 10.3389/fncom.2013.00122*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Gao and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Astronomical apology for fractal analysis: spectroscopy's place in the cognitive neurosciences

# *Damian G. Kelty-Stephen\**

*Department of Psychology, Grinnell College, Grinnell, IA, USA \*Correspondence: keltysda@grinnell.edu*

#### *Edited and reviewed by:*

*Tobias A. Mattei, Ohio State University, USA*

**Keywords: spectroscopy, fractal, power law, time series, molecular cloud, stars, perception, perception-action**

Fractal structure offers new leverage for understanding cognition (Dixon et al., 2012; Kelty-Stephen and Dixon, 2012, 2013). A minority in neuroscience feels very strongly about this point, finding it either crucial (Friston et al., 2012; Van Orden et al., 2012) or patently absurd (e.g., Wagenmakers et al., 2012). The majority remain understandably mystified or bored by opaque math and ponderous debate. I propose to re-present the point through analogy to a field far removed from neuroscience, namely, astronomy, in the hopes of making the common threads clearer and less threatening. One field gazes deep into the brain; the other gazes up and away from anything on Earth. However, both kinds of scientists seek physicochemical accounts of comparably high-dimensional systems (Mesulam, 2008). They must take imperfect measurements and use elegant strategies to probe these measurements for what is not plainly obvious to the naked eye.

Fractal structure (or its absence) and its implication in cognition grows rather inoffensively out of spectral methods (i.e., "spectroscopy") that elevated astronomy from guesswork to extremely sophisticated inquiry. The comparison of 20 years of neuroscience exploring fractal structure in cognition (e.g., Gilden et al., 1995) to 200 years of spectroscopy in astronomy is humblingly instructive (see Hearnshaw, 2010). Far from undermining physicochemical accounts of the heavens, since its recognition in astronomy (e.g., de Vaucouleurs, 1970; Mandelbrot, 1977), fractal structure has supported physicochemical accounts of star formation in ways non-fractal models could not (e.g., Larson, 2005). Comparing our 20 years with astronomy's 200, I am prepared not to live to see the fruition of similar attempts in neuroscience. I hope only to illustrate that neuroscience might learn a lot from astronomy's cosmopolitan views of spectroscopy.

We forget easily that modern astronomy was not always the scientific success we know today. Despite unresolved questions, we are awash in precise physical and chemical information about the 10<sup>11</sup> stars living for billions of years in each of some 10<sup>11</sup> galaxies (Geach, 2011; Tolstoy, 2011). Roughly 180 years ago, Comte (1835) predicted that we would never know the physicochemical details of the heavens: astronomy was only as good as the telescopes with the strongest magnification, and it would never be more than guesswork projected onto the kinematics of magnified dots and smears. Comte's words reflected ignorance of the initial evidence from a new method called "spectroscopy." And it was the subsequent development of spectroscopy that allowed astronomers to bury Comte's disparaging assessment.

What Comte didn't know about spectroscopy was that astronomical measures of celestial dots and smears carry richly patterned optical information (e.g., Fraunhofer, 1817). The full spectrum of electromagnetic radiation reached Earth only incompletely. Between star and telescope lay rich molecular clouds of dust and gas. Decomposing this radiation into a spectrum of oscillations at different scales revealed the composition of the molecular clouds, because specific configurations of electrons absorbed and emitted light from specific ranges of the electromagnetic spectrum. For instance, Lockyer (1869) and Janssen (1869) identified the element later known as helium based on its absorbing and emitting light waves of length 587.6 nanometers—or, equivalently, light waves oscillating at a frequency of 5.1 × 10<sup>14</sup> Hz. Specific elements composing the universe absorbed energy at specific scales of space and time. Here was the key to the universe's composition and to quashing Comte's prophecy of ignorance.

Spectroscopy denotes the broad class of analyses depicting an observable's distribution over a wide range of measurement scales. Different kinds of spectra entail different sorts of axis labels. "Power" spectra plot oscillatory power (i.e., amplitude squared) against oscillatory wavelength or, inversely, frequency. "Energy" and "mass" spectra plot quantity across spatial scales. Scientists care about spectroscopy because, as with light through celestial molecular clouds, the distribution of observables varies with scale, and this relationship usually provides insight into the processes underlying phenomena we care about. Sometimes these processes exhibit selective responses at characteristic scales, as in helium's emission spectra. Other measurements exhibit responses over a continuous range of scales, and these responses can increase or decrease with scale. Fractal structure is nothing but an extremely specific example of this latter case, namely, a spectrum exhibiting power-law (and thus scale-invariant) growth or decay across scales. Here we encounter a rather large fact that often goes unmentioned in the debates: there are truly no "fractal analyses"—only fractal or non-fractal patterns revealed by spectroscopic methods.

Neuroscience has a fondness for characteristic scales. For instance, event-related potential (ERP) data suggest that cortical activity exhibits different voltage profiles across time depending on the engagement of separate neural/cognitive mechanisms. A peak of negative voltage at 400 ms (i.e., the "N400") after visual presentation of a letter string indicates recognition that the letter string is pronounceable (e.g., Rossi et al., 2011). Whereas absorption/emission of light at 5.1 × 10<sup>14</sup> Hz was the astronomers' first glimpse of helium, perhaps the N400 at (400 ms)<sup>−1</sup> = 2.5 Hz is a glimpse of a similarly elemental mechanism in cognitive processes. Neuroscience instead focuses its spectroscopic strategies on molecular details of blood flow and metabolites (Minati et al., 2007; Murkin and Arango, 2009). However, these molecular details alone don't address the flexible, task-sensitive operation of cognitive processes of language comprehension (White et al., 2012). So long as these mechanisms are known by their characteristic time scales, why hasn't neuroscience situated the N400 on a spectrum too?

One obstacle is that spectroscopy needs long, densely sampled time series. Any single stream of ERP data is so noisy that observing N400s in single-participant data requires averaging over at least 45 trials (e.g., Niedeggen et al., 1999). Otherwise, we might collect a prolonged series of ERP data from a participant viewing continuous text of pronounceable letter strings. Reading pace is ∼250 ms/word (Rayner and Clifton, 2009). Let us imagine the resulting ERP signal: N400 peaks for each string, spaced 250 ms apart over time. The emission line in a power-spectral analysis of this ERP signal would appear at (250 ms)<sup>−1</sup> = 4 Hz. Dyslexic readers take 500 ms/word longer (Russeler et al., 2007), and their N400 peaks might be spaced by (250 + 500 =) 750 ms, producing a peak in a spectrum of ERP data at (750 ms)<sup>−1</sup> ≈ 1.33 Hz. Just as a peak voltage at 400 ms might signify a phonotactic mechanism's characteristic scale, the gap between 1.33 and 4 Hz should indicate the difference in reading mechanisms between dyslexic and typical readers. After all, wasn't it a similar spectral difference that helped astronomers distinguish helium from sodium?

Results from reading reaction times tell a different story. Over the course of reading a 14,000-word story, reading time per word decreases according to Newell and Rosenbloom's (1981) ubiquitous power law of learning (Wallot et al., 2013). Also, rather than looking at the power spectrum of ERP signals, we might examine the power spectrum of trial-by-trial reading times. Whereas our ERP series above are imaginary, the latter power spectra have been empirically recorded and presented many times over (e.g., Van Orden et al., 2003; Holden et al., 2009; Wallot and Van Orden, 2011). These spectra show that fluctuations in reading-time series resemble 1/f noise, an inverse power-law relationship between oscillatory power and frequency. Rather than showing cleanly individuated peaks (i.e., characteristic time scales) like emission spectra, the power spectra from these reading-time series show a continuous, slow decrease in oscillatory power with greater frequency, similar across all scales. Though often hotly contested as statistical artifacts of "simpler" behavior of cognitive processes at characteristic scales, these patterns have survived statistical rigors (Delignières and Marmelat, 2012).
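The contrast between an emission-line spectrum and 1/f noise can be made concrete with a little code. The sketch below (illustrative, not taken from any of the cited studies) synthesizes a signal whose oscillatory power falls off as 1/f, computes its power spectrum with a naive discrete Fourier transform, and recovers the power-law exponent by regressing log power on log frequency; an emission-line signal would instead show an isolated peak, and white noise a slope near zero:

```python
import cmath
import math
import random

def power_spectrum(x):
    """Squared DFT magnitude at frequency bins 1 .. n/2 - 1 (naive O(n^2) DFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))) ** 2
            for f in range(1, n // 2)]

def loglog_slope(power):
    """Least-squares slope of log power vs. log frequency (bin 1 upward)."""
    pts = [(math.log(f), math.log(p)) for f, p in enumerate(power, start=1)]
    k = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    return (k * sxy - sx * sy) / (k * sxx - sx * sx)

# Synthesize a 1/f ("pink") signal: random-phase sinusoids with power ~ 1/f.
random.seed(42)
n = 512
phases = [random.uniform(0, 2 * math.pi) for _ in range(1, n // 2)]
signal = [sum(math.cos(2 * math.pi * f * t / n + phases[f - 1]) / math.sqrt(f)
              for f in range(1, n // 2)) for t in range(n)]
```

For this synthetic series, `loglog_slope(power_spectrum(signal))` comes out very close to −1, the signature of 1/f noise.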

Statistical rigor notwithstanding, the origins and relevance of fractal patterns in neuroscience remain hotly contested. My own view aligns with one expressed in the astronomical literature: fractal patterns reflect cascade dynamics both supported by and giving rise to structures at many scales (Larson, 2005). Astronomy and neuroscience alike have grappled with the realization that structures must somehow embody stability but also flexibility. Stars are not static, homogeneous objects distinct from their contexts—no matter the convenience of this notion for brief measurement and modeling. Stars condense out of clouds, undergo developmental phases, collapse or explode, and so on. Structures exhibiting characteristic scales demand reconciliation with the fractal patterns inherited from the Big Bang (Mohaved et al., 2011). Similarly, independent mechanisms underpinning cognition are no more static or distinct. Brain structures and cognitive structures reflect relatively stable configurations of neural dynamics within contexts structured at multiple scales (Buzsáki, 2006). They exhibit relatively stable short-range functions, but this stability is relative to longer-term variation across the time scales of learning, the life span, and species evolution. The hierarchical nesting of these multiple scales engenders cascades giving rise to structure, and these cascades are no less valid a factor in a physicochemical account than electron configurations. In this light, fractal results that can be (rigorously!) demonstrated to reflect cascade dynamics support a physicochemical account of structure, in astronomy and neuroscience alike.

Spectroscopic work relating fractal patterns to changes in the organization of observed structures supports the foregoing proposals. Fractal modeling of cloud dispersion predicts galactic emission spectra (Bottorff and Ferland, 2001) as well as temperature changes associated with star formation (Pan and Padoan, 2009). In cognitive tasks, bodily movements (e.g., of eye-gaze, hand, foot, or posture) incident to exploring task environments exhibit fractal power spectra. The power-law exponents describing these spectra serve to predict the flexibility of cognitive performance in the same tasks. That is, fractal fluctuations in the human body support the ability of cognitive systems to fine-tune their perceptual judgments (Stephen and Hajnal, 2011; Palatinus et al., 2013) or to discover new representations of problem-solving tasks (Stephen and Dixon, 2009; Stephen et al., 2009). Moreover, these effects of fractal patterning in exploratory behaviors may predict individual-trial performance above and beyond average differences in reaction times due to traditional cognitive processes (Stephen and Anastas, 2011).

The central appeal of fractal results in cognition and neuroscience, in my view, is that they may offer us a framework for aligning physicochemical accounts of neural and cognitive phenomena with physicochemical accounts pursued in other domains. A relatively more generic physicochemical framework, in which insights from different domains might be mutually relevant and compatible, interests me; moreover, it strikes me as an ideal way of grounding our tests of physicochemical guesses for neuroscience upon stronger physicochemical foundations. Evidence of fractality in domains beyond cognition and neuroscience is a reason some neuroscientists cite for being unimpressed: for instance, the fact that many more systems are found to exhibit fractal fluctuations than are agreed to be "cognitive" is taken to entail that fractality is not important to cognition (Botvinick, 2012). This logic seems to presume that welcome causal players in cognitive theory include only those that maintain the (pre-theoretical) distinction between cognitive systems and noncognitive ones. Cognitive neuroscience sometimes takes great comfort in asserting the fundamental difference of cognitive systems from all others (Wagenmakers et al., 2012).

Perhaps similarity between cognitive neuroscience and other physicochemically-oriented fields is unwelcome. I find declaring one's own scientific field to require special and different explanation from other scientific fields no more compelling than Comte (1835) found pre-spectroscopic astronomy's guesswork at dots and smears in telescope images. We already have one Big Bang from which to weave cosmological history, and the simple assertion that cognitive systems are fundamentally different from everything else post-Big Bang will require another. Any such cognitive Big Bang (e.g., "when something might have had the first thought") seems less like compelling explanation and more like reluctance to face what may be humbling physicochemical realities. I remain cautiously confident that spectroscopy should be as valuable to cognitive neuroscience as it has been to astronomy in discerning common explanatory ground with other physicochemical disciplines.

Fractal and non-fractal results from spectroscopy appear important to me because they make falsifiable the interesting physicochemical hypothesis that development of structure in nervous systems depends on cascades. When this hypothesis fails to be interesting, I will oblige my critics and stop worrying about fractals.

# **ACKNOWLEDGMENTS**

I would like to thank Zsolt Palatinus and Emma Kelty-Stephen for their kind, patient feedback, and I would also like to thank Tobias Mattei for inviting this submission.



*Received: 30 January 2014; accepted: 03 February 2014; published online: 25 February 2014.*

*Citation: Kelty-Stephen DG (2014) Astronomical apology for fractal analysis: spectroscopy's place in the cognitive neurosciences. Front. Comput. Neurosci. 8:16. doi: 10.3389/fncom.2014.00016*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Kelty-Stephen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Chunking dynamics: heteroclinics in mind

#### *Mikhail I. Rabinovich1, Pablo Varona2 \*, Irma Tristan1 and Valentin S. Afraimovich3*

*<sup>1</sup> BioCircuits Institute, University of California, San Diego, La Jolla, CA, USA*

*<sup>2</sup> Grupo de Neurocomputación Biológica, Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain <sup>3</sup> Instituto de Investigación en Comunicación Óptica, Universidad Autónoma de San Luis Potosí, San Luis Potosí, México*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Maurizio Mattia, Istituto Superiore di Sanità, Italy Hiroshi Okamoto, RIKEN Brain Science Institute, Japan*

#### *\*Correspondence:*

*Pablo Varona, Grupo de Neurocomputación Biológica, Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, C/Francisco Tomás y Valiente, 11, 28049 Madrid, Spain e-mail: pablo.varona@uam.es*

Recent results from imaging technologies and non-linear dynamics make it possible to relate the structure and dynamics of functional brain networks to different mental tasks and to build theoretical models for the description and prediction of cognitive activity. Such models are non-linear dynamical descriptions of the interaction of the core components—brain modes—participating in a specific mental function. The dynamical images of different mental processes depend on their temporal features. The dynamics of many cognitive functions are transient; they are often observed as a chain of sequentially changing metastable states. A stable heteroclinic channel (SHC), consisting of a chain of saddles—metastable states—connected by unstable separatrices, is a mathematical image for robust transients. In this paper we focus on hierarchical chunking dynamics that can represent several forms of transient cognitive activity. Chunking is a dynamical phenomenon that nature uses to process long sequences of information by dividing them into shorter items. Chunking, for example, makes the use of short-term memory more efficient by breaking up long strings of information (as in language, where a novel is divided into chapters, paragraphs, sentences, and finally words). Chunking is important in many processes of perception, learning, and cognition in humans and animals. Based on anatomical information about the hierarchical organization of functional brain networks, we propose a cognitive network architecture that hierarchically chunks and super-chunks switching sequences of metastable states produced by winnerless competitive heteroclinic dynamics.

**Keywords: cognitive dynamics, stable heteroclinic channel, transient dynamics, low dimensionality of brain activity, hierarchical sequences, chunking and superchunking, cognition modeling principles**

# **INTRODUCTION**

Chunking is a dynamical phenomenon that the brain uses for processing long informational sequences. The concept of the chunk was introduced by Miller (1956). His key notion is that short-term storage is not rigid but amenable to strategies, such as chunking, that can expand its capacity. Miller's work drew considerable attention to the concept of short-term memory and its functional characteristics. Chunking involves two processes: concatenation of units into a block and segmentation of the blocks. In general, chunking is related to the hierarchical organization of perceptual, cognitive, or behavioral sequential activity. In particular, in motor control (see Rosenbaum et al., 1983), sequences can consist of subsequences, which can in turn consist of sub-sub-sequences, etc. The natural hierarchical organization of long sequences results from the activity of specific brain functional networks. Such networks include many different brain areas, some of which are themselves organized in a hierarchical manner. A well-known example is Broca's area, which has been suggested to act as a "supramodal syntactic processor" able to process any type of hierarchically organized sequence (Grossman, 1980; Tettamanti and Weniger, 2006). This hypothesis is based on findings that the region is involved not only in processing language syntax (Musso et al., 2003), but also in syntax-like aspects of non-linguistic tasks, such as the performance of specific movements and music (Fadiga et al., 2009), as several fMRI studies seem to confirm (Bahlmann et al., 2008, 2009). Clerget et al. hypothesize that motor behavior shares some similarities with language (Clerget et al., 2013), namely that a complex action can be viewed as a chain of subordinate movements that must be combined according to certain rules in order to reach a given goal (Dehaene and Changeux, 1997; Dominey et al., 2003; Botvinick, 2008).

What are the mechanisms that transform the extremely complex, noisy, and high-dimensional activity of the brain into rather regular, low-dimensional, and even predictable cognitive behavior, i.e., what are the mechanisms underlying the dynamics of the mind, including chunking? This is one of the most challenging questions in today's neuroscience and cognitive science. Continuous advances in non-invasive brain imaging now make it possible to assess in detail the structural connectivity of the brain and the corresponding evolution of its spatio-temporal activity.

In our view, metastability is a key element of the transient cognitive dynamics participating in chunking processes. The idea of the spatiotemporal organization of brain activity through transient, metastable states emerged more than 15 years ago (Kelso, 1995; Friston, 1997). According to this scenario, such dynamics can be represented as sequential switching between different metastable states (for a description of the mathematical basis of this scenario, see Rabinovich et al., 2008a,b). Metastable transient dynamics represent a balance between the segregation of focused cognitive processing and the flexible integration of distributed brain areas; such integration is necessary for the performance of a specific cognitive function (Bressler and Kelso, 2001; Meehan and Bressler, 2012). The existence of connections that prevail over long periods of time supports the well-regarded concept of a hierarchical organization of neural processing (Engel et al., 2001), which is the basis for understanding the origin of chunking dynamics. Because the dimensionality of cognition depends on the number of activated (as opposed to potentially observable) metastable states, it is important to remember that the brain selects the necessary metastable states and suppresses those irrelevant to the goal of the cognitive process, resulting in reduced dimensionality. The low dimensionality of cognitive brain dynamics rests on two points: first, the manner of cognitive task encoding—an external or internal stimulus determining a specific cognitive task excites the set of elements of the community networks responsible for performing that cognitive activity; and second, the existence of a specific hierarchical organization of the global brain networks that perform a specific cognitive task using a moderate number of brain modes.

Based on experimental data suggesting that sequential cognitive activity is implemented in the brain by spatiotemporal pattern dynamics (see also Sahin et al., 2009), we build here a general dynamical model that produces hierarchical chunking of sequences and suggests a plausible neural mechanism for chunking dynamics in the brain. This model is reasonably low-dimensional, which allows a detailed dynamical analysis.

### **MATERIALS AND METHODS**

A top-down approach to modeling transient cognitive dynamics, taking into account the experimental observations described in the Introduction, is to use kinetic equations to describe spatiotemporal mental modes that contain the discussed metastable states as equilibrium points. The set of brain patterns that sequentially change during the performance of a cognitive task determines the spatial structure of the modes and the associated connection matrix among them. Using such models, we can integrate our knowledge of brain activity with these new ideas about heteroclinic sequences and their interactions, i.e., heteroclinic networks.

As a top-down departing point, we need a mathematical object that can describe robust transient dynamics and their associated information processing. Once we have this object, we can implement it through a set of canonical equations that can be used to study transient activity at different levels of brain description, and in particular to address chunking dynamics. A mathematical image of robust transient sequential dynamics must have two principal features. First, it must be resistant to noise and reliable even under small variations in initial conditions, so that the succession of states visited by the system (its trajectory, or transient) is stable. Second, the transients must be input-specific, so as to contain information about what caused them. These two requirements appear fundamentally contradictory for a transient-dynamics description of brain activity: transient dynamics are inherently unstable, since any transient depends on initial conditions and cannot be reproduced from arbitrary initial conditions, while dynamical robustness in principle prevents sensitivity to informative perturbations. This contradiction can be resolved through the concept of metastability, which was introduced into cognitive science at the end of the last century (Kelso, 1995; Friston, 1997, 2000; Fingelkurts and Fingelkurts, 2006; Oullier and Kelso, 2006; Gros, 2007; Ito et al., 2007).

A stable heteroclinic channel (SHC) is a mathematical object that meets the requirements discussed above and can implement such stable transients. An SHC is defined by a sequence of successive metastable "saddle" states connected by separatrices. Under proper conditions, all trajectories in the neighborhood of the saddle metastable states that form the chain remain in the channel, ensuring robustness and reproducibility over a wide range of control parameters (Rabinovich et al., 2008b). The stability of a channel means that trajectories in the channel do not leave it until its end is reached.

A simple model to implement SHCs is a generalized Lotka–Volterra equation with *N* interacting elements:

$$\frac{dA\_i(t)}{dt} = A\_i(t)\,F\!\left(\sigma\_i(S\_k) - \sum\_{j=1}^{N} \rho\_{ij}\,A\_j(t)\right) + A\_i(t)\,\eta\_i(t), \qquad i = 1, \dots, N \tag{1}$$

where *Ai*(*t*) ≥ 0 is the activity rate of element *i*, σ*<sup>i</sup>* is the gain function that controls the impact of the stimulus, *Sk* is an environmental stimulus, ρ*ij* determines the interaction between the variables, η*<sup>i</sup>* represents the noise level, and *F* is a function, in the simplest case a linear function. The state portrait of the system often contains a heteroclinic sequence linking saddle points. These saddles can be interpreted as successive and temporary winners in a never-ending competitive game, i.e., winnerless competition (WLC) dynamics (Rabinovich et al., 2001, 2006). In neural systems, because a representative model must produce sequences of connected neuronal population states (the saddle points), the neural connectivity ρ*ij* must be asymmetric, as determined by the theoretical examination of this model (Huerta and Rabinovich, 2004). Although many connection statistics probably work for stable heteroclinic-type dynamics, it is likely that connectivity within biological networks is, to some extent at least, the result of optimization by evolution and synaptic plasticity. It is important to emphasize that Equation (1) is just an elementary building block for different levels of the chunking hierarchy that we will describe below.

Models like the generalized Lotka–Volterra equations make it possible to establish the conditions necessary for transient stability, and they display stable, sequential, and cyclic activation of their components, the simplest variant of WLC. A network with several degrees of freedom and asymmetric connections can generate structurally stable sequences—transients, each shaped by one input. Asymmetric inhibitory connectivity helps to solve the apparent paradox that sensitivity and reliability can coexist in a network (Huerta and Rabinovich, 2004; Nowotny and Rabinovich, 2007; Rabinovich et al., 2008b; Rabinovich and Varona, 2011). The neurons or modes participating in an SHC are assigned by the stimulus, by virtue of their direct and/or indirect input from the neurons activated by that stimulus. The joint action of the external input and a stimulus-dependent connectivity matrix defines the stimulus-specific heteroclinic channel. In addition, asymmetric inhibition coordinates the sequential activity and keeps a heteroclinic channel stable.
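A minimal numerical sketch of Equation (1) illustrates this winnerless competition. With linear *F*, three modes, and an asymmetric inhibitory matrix ρ of the kind discussed above (the specific values below are illustrative, not taken from the paper), Euler integration produces cyclic switching of the dominant mode from one metastable saddle to the next; a tiny positive noise floor stands in for η and lets the trajectory escape each saddle:

```python
import random

def simulate_wlc(rho, sigma, a0, steps=20000, dt=0.01, noise=1e-6, seed=1):
    """Euler integration of the generalized Lotka-Volterra model (linear F).

    Returns the index of the dominant ("winner") mode at every step.
    """
    rng = random.Random(seed)
    a = list(a0)
    n = len(a)
    winners = []
    for _ in range(steps):
        new_a = []
        for i in range(n):
            growth = sigma[i] - sum(rho[i][j] * a[j] for j in range(n))
            # a small positive noise floor keeps activities from collapsing to 0,
            # so the trajectory keeps moving along the heteroclinic chain
            new_a.append(max(a[i] + dt * a[i] * growth + dt * abs(rng.gauss(0.0, noise)),
                             1e-12))
        a = new_a
        winners.append(max(range(n), key=lambda i: a[i]))
    return winners

# Asymmetric inhibition chosen so each saddle has one unstable direction:
# the winner cycles 0 -> 1 -> 2 -> 0 (winnerless competition).
rho = [[1.00, 1.33, 0.25],
       [0.25, 1.00, 1.33],
       [1.33, 0.25, 1.00]]
winners = simulate_wlc(rho, sigma=[1.0, 1.0, 1.0], a0=[0.9, 0.01, 0.01])
```

Compressing `winners` into its run sequence shows repeated passes through all three metastable states, the elementary switching that the chunking hierarchy described below groups into blocks.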

The WLC concept is directly related to the sequential dynamics of metastable states that are activated by inputs that do not destroy the origin of a competitive process. This paradigm can explain and predict many dynamical phenomena in neural networks with excitatory and inhibitory synaptic connections. Based on the requirement of the stability, this formalism has been used (i) to assess the dynamical origin of finite working memory (WM) capacity based upon WLC amongst available informational items (Bick and Rabinovich, 2009; Rabinovich et al., 2012); (ii) to build a dynamical model of information binding for transients that can describe the interaction of different sensory information flows that are generated concurrently (Rabinovich et al., 2010a); (iii) to model the sequential interaction between emotion and cognition (Rabinovich et al., 2010b); (iv) to represent attention dynamics (Rabinovich et al., 2013); and (v) to assess the dynamics of pathological states in mental disorders (Bystritsky et al., 2012; Rabinovich et al., 2013). Here we focus on a model of hierarchical chunking dynamics that can represent several forms of cognitive activity such as WM and speech construction.

As we discussed in the Introduction, chunking is the grouping or categorizing of related items or information into smaller, more meaningful, and more compact units. Think about how hard it would be to read a long review paper without chapters, subchapters, paragraphs, and separated sentences. Chunking is a naturally occurring process that can be actively used to break down problems so that thinking, understanding, and improvising become more efficient. This is because it is easier to process chunked tasks or perceptual data; in particular, it is much easier to learn and recall such data. Mathematically, the "chunking principle" can be viewed as the transformation of a chain of metastable states along a transient process into a chain of groups of such states. It is a key dynamical idea that nature may use to make cognitive information processing more effective in the context of a complex environment.

Chunking processes in human perception, learning, and performance of a cognitive task can be both automatic and directly linked to the environmental stimuli, and controllable by a goal-oriented intrinsic signal (Gobet et al., 2001). It is important to note that chunking is a strategy that supports increasing speed and accuracy through the formation of hierarchical memory structures and complex task-dependent behavioral sequences. Two competitive processes form temporal chunking sequences: one separates long sequences into shorter groups of information items that can be easily performed, and the second connects them to express a long sequence as a unified thought or behavioral action (Friederici et al., 2011; Chekaf and Matha, 2012).

Hierarchical chunking dynamics can be implemented in a model of cognitive networks whose information processing relies on SHCs. **Figure 1** illustrates a chunking heteroclinic cognitive network for two hierarchical informational groups—elementary items and chunking (integrated) informational items including many elementary units interacting through dynamical connections.

**FIGURE 1 |** Each level of the hierarchy is described by its own Lotka–Volterra type equations [see Equations (2)–(6)] with connection matrices **ρ**, **ξ**, and **ς**. Black circles represent inhibitory connections; triangles represent excitatory connections responsible for the choosing of the informational items. Spheres represent the informational items or units (metastable states). Different colors indicate different chunks. All connections inside the elementary items are inhibitory.

It is reasonable to hypothesize that functionally there are two different cognitive networks from at least two different hierarchical levels that are responsible for: (i) the organization of the sequence of items inside chunks, and (ii) the formation of the chunk sequence. In particular, this hypothesis is supported by an experiment with chunking during visuomotor sequence learning (Sakai et al., 2003). It has been shown that each motor cluster is processed as a single memory unit—a chunk. A learned visuomotor sequence is a sequence of chunks that contains several elementary movements. The authors of this work have shown that a key role in the process of chunking formation is played by a brain network including the dominant parietal area, the basal ganglia, and the presupplementary motor area (see also Ribas-Fernandes et al., 2011 and Bor and Seth, 2012, where the authors discuss the chunking structure of conscious processes).

Below we suggest a three-level hierarchical model for the description of chunking dynamics. Inhibition plays a key role in this model, as it is responsible for the execution of three functions: (i) competition between elementary informational items in order to produce stable sequences of metastable states, (ii) generation of the chunking sequence, and (iii) control of the performance of the sequential task. In recent years, the investigation of the hierarchical control between different levels of representation and information processing has become one of the hot subjects in cognitive science. This issue is important for understanding how the mind controls behavior and itself. In particular, the relationship between chunking (a sequence-level process) and task-set inhibition (a task-level process) in the performance of task sequences was investigated in Koch et al. (2006), Schneider (2007), and Li et al. (2010); for a description of "chunks of chunks"—"superchunks"—see Rosenberg and Feigenson (2013).

To understand the emergence of hierarchical chunking dynamics in a model, we need to generalize Equation (1) in the following direction (cf. **Figure 1**):

$$\dot{X}\_i^{lk} = X\_i^{lk} \left( \sigma\_i^{lk}(S, C) \cdot Y^{lk} - \sum\_{j=1}^{N^{lk}} \rho\_{ij}^{lk}(S, C)\, X\_j^{lk} \right) \tag{2}$$

$$\tau \dot{Y}^{lk} = Y^{lk} \left( \left( V^l - \beta(C) \sum\_{i=1}^{N^{lk}} X\_i^{lk} \right) - Z^{lk} \right) \tag{3}$$

$$\theta(C)\, \dot{Z}^{lk} = \sum\_{m=1}^{M^l} \xi\_l^{km}(S, C)\, Y^{lm} - Z^{lk} \tag{4}$$

$$T\, \dot{V}^l = V^l \left( \left( 1 - \delta(C) \sum\_{j=1}^{M^l} Y^{lj} \right) - W^l \right) \tag{5}$$

$$\Theta(C)\, \dot{W}^l = \sum\_{q=1}^{P} \varsigma^{lq}(S, C)\, V^q - W^l \tag{6}$$

Here *X<sub>i</sub><sup>lk</sup>* characterizes the *i*-th informational item associated with the *k*-th chunk and *l*-th superchunk, σ<sub>*i*</sub><sup>*lk*</sup>(*S*, *C*) is the growth rate for each informational item determined by the stimulus *S* and the cognitive task *C*, and ρ<sub>*ij*</sub><sup>*lk*</sup>(*S*, *C*) is the matrix of inhibitory connections among basic informational items. In this model *Y<sup>lk</sup>* characterizes the *k*-th chunk associated to the *l*-th superchunk *V<sup>l</sup>*, with corresponding characteristic times τ and *T*, respectively; β(*C*) represents the strength of the inhibition between the informational items and the chunk, and δ(*C*) between the chunks and the superchunk. Also, *Z<sup>lk</sup>* describes the synaptic dynamics for the *k*-th chunk associated to the *l*-th superchunk, with ξ<sub>*l*</sub><sup>*km*</sup>(*S*, *C*) the matrix of inhibitory connections between chunks (black circles in **Figure 1**); and *W<sup>l</sup>* describes the synaptic dynamics for the *l*-th superchunk, with ς<sup>*lq*</sup>(*S*, *C*) the matrix of inhibitory connections between superchunks; the corresponding characteristic times are θ(*C*) and Θ(*C*). In this model, β(*C*) and δ(*C*) are adaptation parameters that determine the timing relationship between a basic informational chain and the chunking and superchunking modulation. The chunking variables also satisfy generalized Lotka–Volterra canonical equations, which allows them to form a stable sequence; because of this, the chunking variables in fact play the role of cognitive controllers. The parameters for Equations (3)–(5) in the simulations below were chosen with this scope. Since chunking dynamics has to take into account the characteristic time of chunk formation, the competition between different chunks has to be delayed—we implemented this delay with an inhibition described by a first-order kinetic model. At the same time, the competition among elementary informational items is implemented by fixed-weight ρ<sub>*ij*</sub> instantaneous synapses.
The same logic has been applied for the description of the highest level of the hierarchy—the superchunks.

# **RESULTS: HIERARCHICAL SEQUENCES—CHUNKING AND SUPER-CHUNKING**

Let us first represent the phase portrait of a simple two-level chunking dynamics. We carried out numerical simulations of the model for the dynamics within chunks of informational items for the following parameters: *N<sup>k</sup>* = 3, *M* = 3 (number of "chunks" or "episodes"), σ<sup>1</sup> = [7.24, 5.85, 8.30], σ<sup>2</sup> = [9.93, 6.00, 5.18], σ<sup>3</sup> = [8.29, 7.86, 9.16], and, given these values,

$$\rho\_{ii}^{k} = 1.0, \quad \rho\_{i\_{n-1}i\_{n}}^{k} = \frac{\sigma\_{i\_{n-1}}^{k}}{\sigma\_{i\_{n}}^{k}} + 0.5, \quad \rho\_{i\_{n+1}i\_{n}}^{k} = \frac{\sigma\_{i\_{n+1}}^{k}}{\sigma\_{i\_{n}}^{k}} - 0.5,$$

*i* = 1, ..., *N<sup>k</sup>*, *k* = 1, ..., *M*, as well as the parameters for the synaptic dynamics described by Equations (3) and (4): τ = 0.7, θ = 2.0, ξ<sup>*kk*</sup> = 1.0, ξ<sup>*k<sub>n</sub>k<sub>n+1</sub>*</sup> = 1.4, ξ<sup>*k<sub>n</sub>k<sub>n−1</sub>*</sup> = 0.5, *k* = 1, ..., *M*, and β = 0.01. The results of these simulations are shown in **Figures 2**, **3**.

**Figure 2** shows the phase portrait of the chunking dynamics when the superchunk formation is absent: the system is described by Equations (2)–(4) with *V* = 1. This example illustrates a closed chunking sequence (green) that consists of several heteroclinic cycles representing the elementary chunks (blue). In general, the number of elementary items in each chunk can be different, and the chunking sequence can be open.
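As a complement to the phase portrait, the two-level system [Equations (2)–(4) with *V<sup>l</sup>* = 1] can be integrated directly. The sketch below uses the parameter values quoted in the text (τ = 0.7, θ = 2.0, β = 0.01, the three σ vectors, and the ξ rule); the initial conditions, the Euler step, and the small activity floor are our own choices for the demonstration.

```python
import numpy as np

# Two-level chunking model, Equations (2)-(4) with V^l = 1 (no superchunks):
# M = 3 chunks of N = 3 items, growth rates and time constants from the text.
M, N = 3, 3
sigma = np.array([[7.24, 5.85, 8.30],
                  [9.93, 6.00, 5.18],
                  [8.29, 7.86, 9.16]])
tau, theta, beta = 0.7, 2.0, 0.01

def asym_inhibition(rates):
    """rho^k: rho_ii = 1, successor entry -0.5, predecessor entry +0.5."""
    n = len(rates)
    rho = np.ones((n, n))
    for k in range(n):
        rho[(k + 1) % n, k] = rates[(k + 1) % n] / rates[k] - 0.5
        rho[(k - 1) % n, k] = rates[(k - 1) % n] / rates[k] + 0.5
    return rho

rho = [asym_inhibition(sigma[k]) for k in range(M)]
xi = np.ones((M, M))                 # xi^{kk} = 1.0
for k in range(M):
    xi[k, (k + 1) % M] = 1.4         # xi^{k_n k_{n+1}} = 1.4
    xi[k, (k - 1) % M] = 0.5         # xi^{k_n k_{n-1}} = 0.5

X = np.full((M, N), 0.1); X[0, 0] = 5.0   # items start near the first saddle
Y = np.array([0.9, 0.1, 0.1])             # chunk envelopes
Z = np.zeros(M)                           # delayed chunk-level inhibition
dt, steps = 0.001, 40000
hist = np.empty((steps, M))
for t in range(steps):
    dX = np.array([X[k] * (sigma[k] * Y[k] - rho[k] @ X[k]) for k in range(M)])
    dY = Y * ((1.0 - beta * X.sum(axis=1)) - Z) / tau
    dZ = (xi @ Y - Z) / theta
    X = np.maximum(X + dt * dX, 1e-9)     # tiny floor regularizes the saddles
    Y = np.maximum(Y + dt * dY, 1e-9)
    Z = Z + dt * dZ
    hist[t] = Y
```

The array `hist` records the chunk envelopes *Y<sup>k</sup>*; plotting it against time visualizes the slow chunk-level modulation riding on top of the fast item-level switching.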

**Figure 3** illustrates the timing between chunks along the sequence. The emergence of the chunking sequence shown in **Figure 2** is the result of a modulational instability in the two-level hierarchical network whose dynamics is described by Equations (2)–(4). This instability is oscillatory, and the characteristic period of the oscillation is T. An analytical investigation of the dependence of T on the control parameters τ, θ, β and the connection matrices **ρ**, **ξ** is not realistic because of the non-linear feedback between the dynamical variables *X* and *Y*. However, it is reasonable to think that the key parameter in this problem is

**FIGURE 2 | The projection of a nine-dimensional phase portrait of a two-level chunking hierarchical dynamics in the space of the three-dimensional auxiliary variables [see Equations (2)–(4)]** *J<sup>k</sup>* **=** *Y<sup>k</sup>* **+ 0.04 · (***X*<sub>1</sub><sup>*k*</sup> **+** *X*<sub>2</sub><sup>*k*</sup> **+** *X*<sub>3</sub><sup>*k*</sup>**),** *k* **= 1, 2, 3.** Blue represents the elementary informational item activity—individual chunk. Green represents the chunking sequence.

β, which determines the level of excitability of the variable *Y* and, through the feedback term σ<sub>*i*</sub><sup>*lk*</sup>(*S*, *C*) · *X<sub>i</sub><sup>lk</sup>* · *Y<sup>lk</sup>* in the right-hand side of Equation (2), also controls the excitability of *X*. In **Figure 3** we represent the numerical analysis of the dependence of T on the parameter β: increasing β, i.e., decreasing the excitability, decreases the timing interval T.

We also carried out numerical simulations of a high-dimensional model that describes the dynamics of chunk and super-chunk formation with the following parameters: *N<sup>lk</sup>* = 6, *M<sup>l</sup>* = 6 (number of chunks), *P* = 3 (number of superchunks), σ<sup>*l*1</sup> = [6.94, 5.11, 8.94, 5.86, 8.33, 9.62], σ<sup>*l*2</sup> = [5.48, 5.66, 5.39, 9.89, 9.99, 5.82], σ<sup>*l*3</sup> = [7.65, 8.98, 9.21, 6.02, 5.71, 5.12], σ<sup>*l*4</sup> = [7.61, 7.73, 5.62, 7.93, 5.80, 5.39], σ<sup>*l*5</sup> = [5.11, 9.99, 5.52, 5.66, 5.50, 8.21], σ<sup>*l*6</sup> = [5.84, 9.39, 7.08, 5.16, 8.37, 6.87], and, given these values,

$$\rho\_{ii}^{lk} = 1.0, \quad \rho\_{i\_{n-1}i\_{n}}^{lk} = \frac{\sigma\_{i\_{n-1}}^{lk}}{\sigma\_{i\_{n}}^{lk}} + 0.5, \quad \rho\_{i\_{n+1}i\_{n}}^{lk} = \frac{\sigma\_{i\_{n+1}}^{lk}}{\sigma\_{i\_{n}}^{lk}} - 0.5,$$

*i* = 1, ..., *N<sup>lk</sup>*, *k* = 1, ..., *M<sup>l</sup>*, *l* = 1, ..., *P*, and

$$\rho\_{ii\_{n}}^{lk} = \rho\_{i\_{n-1}i\_{n}}^{lk} + \frac{\sigma\_{i}^{lk} - \sigma\_{i\_{n-1}}^{lk}}{\sigma\_{i\_{n}}^{lk}} + 2, \quad i \notin \{i\_{n-1}, i\_{n}, i\_{n+1}\},$$

as well as the parameters for the synaptic dynamics between chunks: τ = 0.8, θ = 2.0, ξ<sub>*l*</sub><sup>*kk*</sup> = 1.0, ξ<sub>*l*</sub><sup>*k<sub>n</sub>k<sub>n−1</sub>*</sup> = 0.5, ξ<sub>1</sub><sup>*k<sub>n</sub>k<sub>n+1</sub>*</sup> = 1.4, ξ<sub>2</sub><sup>*k<sub>n</sub>k<sub>n+1</sub>*</sup> = 1.3, ξ<sub>3</sub><sup>*k<sub>n</sub>k<sub>n+1</sub>*</sup> = 1.5, *k* = 1, ..., *M<sup>l</sup>*, *l* = 1, ..., *P*, ξ<sub>*l*</sub><sup>*kk<sub>n</sub>*</sup> = ξ<sub>*l*</sub><sup>*k<sub>n−1</sub>k<sub>n</sub>*</sup> + 2 for *k* ∉ {*k<sub>n−1</sub>*, *k<sub>n</sub>*, *k<sub>n+1</sub>*}, and β = 0.01. Finally, the parameters for the synaptic dynamics between superchunks were *T* = 5, Θ = 10, ς<sup>*ll*</sup> = 1.0, ς<sup>*l<sub>n</sub>l<sub>n−1</sub>*</sup> = 0.5, ς<sup>*l<sub>n</sub>l<sub>n+1</sub>*</sup> = 1.4, *l* = 1, ..., *P*, and δ = 0.01.
The results of these simulations are displayed in **Figure 4**, which shows the three levels of the information hierarchy: the original informational chain (lower panel), the chunked chain (middle panel), and the superchunking chain (upper panel).

As illustrated in **Figure 2**, the sequence of chunks can be considered as a heteroclinic cycle of metastable states where each metastable state itself is a heteroclinic cycle of elementary informational items. Based on this self-similarity, we can expect that

the chunking chain, as a result of a second heteroclinic instability, generates the next level of modulation—the superchunk sequence. Our expectation is confirmed in **Figure 4**, which shows the time series of the three-level network (2)–(6) (cf. **Figure 1**). In this figure, one can see the generation of sequences of superchunks. Altogether, the sequences of informational items, chunks, and superchunks can be interpreted as "words," "sentences," and "paragraphs."

For the sake of simplicity we have illustrated here the phenomenon of stability just for a closed-loop clustered chunking-superchunking sequence. In the general case of an open sequence, it is possible to formulate sufficient conditions for the existence and stability of the non-closed channel based on the estimation of the saddle values of the metastable states (elementary items): the channel is stable if all of them are larger than one in absolute value (Afraimovich et al., 2004; Bick and Rabinovich, 2010). The formulation of necessary conditions is a more complex problem and is still under consideration. As computer experiments have confirmed, the imposed stability conditions determine the behavior of the trajectories inside the neighborhood of the heteroclinic network independently of the initial conditions (Afraimovich et al., 2004; Bick and Rabinovich, 2010).
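The saddle-value condition can be verified numerically for a Lotka–Volterra network: at the saddle σ<sub>*k*</sub>**e**<sub>*k*</sub> the Jacobian is triangular, so its eigenvalues are explicit, and the saddle value is the ratio of the weakest contraction rate to the single expansion rate. The connectivity increments below (+2.0 / −0.3) are our own illustrative choice, made so that every saddle value is strictly larger than one (with the ±0.5 rule of the simulations above the saddle values equal exactly one, the marginal case).

```python
import numpy as np

def glv_jacobian(sigma, rho, k):
    """Jacobian of dA_i/dt = A_i (sigma_i - sum_j rho_ij A_j)
    evaluated at the saddle A = sigma_k * e_k."""
    n = len(sigma)
    J = np.zeros((n, n))
    for i in range(n):
        if i == k:
            J[k] = -sigma[k] * rho[k]                  # row k: -sigma_k * rho_kj
        else:
            J[i, i] = sigma[i] - rho[i, k] * sigma[k]  # transverse growth rates
    return J

sigma = np.array([7.24, 5.85, 8.30])
n = len(sigma)
rho = np.ones((n, n))
for k in range(n):
    rho[(k + 1) % n, k] = sigma[(k + 1) % n] / sigma[k] - 0.3  # weak escape
    rho[(k - 1) % n, k] = sigma[(k - 1) % n] / sigma[k] + 2.0  # strong contraction

saddle_values = []
for k in range(n):
    lam = np.linalg.eigvals(glv_jacobian(sigma, rho, k)).real
    expansion = lam[lam > 0]
    contraction = lam[lam < 0]
    assert expansion.size == 1        # exactly one unstable (escape) direction
    saddle_values.append(np.abs(contraction).min() / expansion[0])
```

Here every saddle value equals 1/0.3 ≈ 3.33 > 1, so the corresponding heteroclinic channel satisfies the stability condition.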

The numerical results described above can be justified by an analytical study of the system

$$\begin{cases} \dot{X}\_i^k = X\_i^k \left( \sigma\_i^k \cdot Y^k - \sum\_{j=1}^{N^k} \rho\_{ij}^k X\_j^k \right), \\\\ \tau \dot{Y}^k = Y^k \left( 1 - \beta \sum\_{i=1}^{N^k} X\_i^k - Z^k \right), \\\\ \theta \dot{Z}^k = \sum\_{m=1}^M \xi^{km} Y^m - Z^k \end{cases} \tag{7}$$

*i* = 1, ..., *N<sup>k</sup>*, *k* = 1, ..., *M*. For the sake of simplicity, let us assume that τ = θ << 1, so one can apply geometric singular perturbation theory (see, for instance, Jones, 1995; Hek, 2010 and references therein). To avoid confusion, note that the assumption τ = θ << 1 implies that, in contrast to the dynamics of *X*, the chunking dynamics is a composition of fast and slow motions. The fast motions lead the variables *Y* and *Z* to a neighborhood of the slow manifold in the phase space. The evolution of the chunk variables on this manifold in the vicinity of the metastable states is much slower than that of the *X* variables. This corresponds to the intuitively clear fact that the "enveloping" variables mimic the averaged dynamics of *X*. Computer experiments confirm this explanation (see **Figure 4**).

The limit slow manifold is given by the equations

$$Y^k \left( 1 - \beta \sum\_{i=1}^{N^k} X\_i^k - Z^k \right) = 0, \qquad \sum\_{m=1}^{M} \xi^{km} Y^m - Z^k = 0;$$

thus, on the branch with *Y<sup>k</sup>* ≠ 0,

$$\sum\_{m=1}^{M} \xi^{km} Y^m = 1 - \beta \sum\_{i=1}^{N^k} X\_i^k.$$

Denote by ξ the *M* × *M* matrix (ξ<sup>*km*</sup>). If det ξ ≠ 0, we find

$$Y^k = \frac{1}{\det\xi} \left( \sum\_{m=1}^M \eta^{mk} - \beta \sum\_{m=1}^M \eta^{mk} \sum\_{i=1}^{N^m} X\_i^m \right) \tag{8}$$

#### **Table 1 | Sequential dynamics in neural and cognitive systems.**


where η<sup>*mk*</sup> is the cofactor of the entry ξ<sup>*mk*</sup> of the matrix ξ. Substituting this expression into the first equation of the system (7), we obtain the system

$$\dot{X}\_i^k = X\_i^k \left( \frac{\sigma\_i^k}{\det \xi} \sum\_{m=1}^M \eta^{mk} - \sum\_{j=1}^{N^k} \rho\_{ij}^k X\_j^k - \frac{\beta\, \sigma\_i^k}{\det \xi} \sum\_{m=1}^M \eta^{mk} \sum\_{i'=1}^{N^m} X\_{i'}^m \right) \tag{9}$$

*i* = 1, ..., *N<sup>k</sup>*, *k* = 1, ..., *M*, which is similar to the binding model described in Rabinovich et al. (2010a). In particular, the "in-chunk" dynamics in (9) corresponds to the dynamics in the modality subspace in Rabinovich et al. (2010a). The main peculiarity of the system (9) is that the coupling coefficients between different chunks have the common factor β, so if β = 0 then the interaction between different chunks is absent. Similarly to the study in Rabinovich et al. (2010a), one can impose conditions under which there exists a heteroclinic cycle for each chunk and successive heteroclinic connections between saddle points in different cycles. The latter claim takes the form β > β<sub>*cr*</sub>, where β<sub>*cr*</sub> depends on the parameters of the system (9). If τ is small, then by geometric singular perturbation theory the imposed conditions guarantee the existence of a heteroclinic network in the original system (7) corresponding to the "in-chunk" and "inter-chunk" dynamics.
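Equation (8) is simply Cramer's rule applied to the slow-manifold relation ξ**Y** = 1 − β∑*X*. The sketch below checks the cofactor formula against a direct linear solve; the item sums *S<sup>m</sup>* are illustrative values of our own choosing.

```python
import numpy as np

# Numerical check of Equation (8): solve xi @ Y = 1 - beta * S
# (S^m = sum_i X_i^m) by cofactors eta^{mk}, then compare with a
# direct linear solve. The matrix entries are the chunk-level
# connections used in the simulations above.
xi = np.array([[1.0, 1.4, 0.5],
               [0.5, 1.0, 1.4],
               [1.4, 0.5, 1.0]])
beta = 0.01
S = np.array([3.2, 0.4, 1.1])           # illustrative per-chunk item sums
rhs = 1.0 - beta * S

det = np.linalg.det(xi)
assert abs(det) > 1e-12                 # Equation (8) requires det(xi) != 0
# cofactor eta[m, k] of entry xi[m, k]; adj(xi) = eta.T and xi^{-1} = eta.T / det
eta = np.array([[(-1) ** (m + k) *
                 np.linalg.det(np.delete(np.delete(xi, m, 0), k, 1))
                 for k in range(3)] for m in range(3)])
# Equation (8): Y^k = (1/det) * sum_m eta^{mk} * (1 - beta * S^m)
Y_cofactor = (eta * rhs[:, None]).sum(axis=0) / det
Y_direct = np.linalg.solve(xi, rhs)
assert np.allclose(Y_cofactor, Y_direct)
```

The agreement confirms that the cofactor expansion in Equation (8) is just the explicit inverse of the chunk-connection matrix applied to the slow-manifold right-hand side.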

Observations on the temporal chunk signal have focused on the use of pauses in behavior to probe chunk structures in WM. On the basis of some of these studies, a hierarchical process model has been proposed, which consists of four hierarchical levels describing different kinds of pauses. The lowest level consists of pauses between strokes within letters; on higher levels there are pauses between letters, words, and phrases. Each level is associated with a larger amount of processing when retrieving these chunks from memory (Cheng and Rojas-Anaya, 2006). Writing may be an effective approach to the study of cognitive phenomena that involve the processing of chunks. In Cheng and Rojas-Anaya (2003), it was demonstrated that in the writing of simple number sequences the duration of pauses between written elements (digits) within a chunk is shorter than the duration of pauses between elements across chunk boundaries. This temporal signal is apparent in un-aggregated data for individual participants in single trials. Mathematically, the time intervals between chunks and super-chunks are controlled by the parameter β (see Equation 3).
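The pause signal lends itself to a very simple boundary detector. The sketch below, with hypothetical pause durations of our own invention, recovers the chunk structure of a written digit sequence by splitting wherever an inter-digit pause clearly exceeds the typical within-chunk gap.

```python
# Toy illustration of the pause signal: inter-digit pause durations (ms,
# hypothetical values) while writing the sequence 314 159 265; pauses at
# chunk boundaries are markedly longer than pauses within a chunk.
pauses = [180, 170, 520, 190, 160, 480, 175, 185]  # 8 gaps between 9 digits
digits = "314159265"

# split wherever a pause clearly exceeds the typical within-chunk gap
threshold = 1.5 * sorted(pauses)[len(pauses) // 2]  # 1.5 x median pause
chunks, start = [], 0
for i, p in enumerate(pauses):
    if p > threshold:               # pause i separates digits[i] and digits[i+1]
        chunks.append(digits[start:i + 1])
        start = i + 1
chunks.append(digits[start:])
print(chunks)  # -> ['314', '159', '265']
```

The same thresholding idea applies one level up: still longer pauses would mark super-chunk boundaries.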

# **DISCUSSION**

In this paper we have shown how the architecture of hierarchical mental model networks affects their associated functions. The discussed examples illustrate that networks with metastable states having several unstable separatrices exhibit very diverse cognitive functions (behaviors). Complex heteroclinic networks allow completely new dynamical phenomena, and one of the primary challenges is the assessment of the existence and stability of hierarchical chunking processes that can represent cognitive activity.

It is important to recall that the modeling of cyclic and sequential dynamics in behavior and cognition has a long history (see several representative efforts in **Table 1**). Most of these models are based on Hopfield-type networks, where the main problem is keeping the recalled sequences stable against noise.

The results of chunking dynamics reported in this paper can be viewed as relevant to the description of different cognitive tasks. For example, in WM, humans encode items and synthesize them; with that, we give meaning to ideas and find a relevant place for them in our cognitive world. In these actions the interaction between WM and chunking is reciprocal: on the one hand, WM is the "engine" of chunking; on the other hand, chunking increases WM capacity.

The model of chunking dynamics discussed in this paper relies on heteroclinic dynamics. It is important to emphasize that the main features of the SHC do not depend on the specific model used. The conditions of existence and the dynamical features of SHCs can be implemented in a wide variety of models: from simple Lotka–Volterra descriptions to complex Hodgkin–Huxley models, and from small networks to large ensembles of many elements (Varona et al., 2002; Venaille et al., 2005; Nowotny and Rabinovich, 2007; Rabinovich et al., 2012). The intrinsic hierarchical nature of the SHC at different temporal and spatial scales allows implementing many types of cognitive dynamics. Within this framework, brain networks can be viewed as nonequilibrium systems and their associated computations as unique patterns of transient activity, controlled by incoming input. The results of these computations can be reproducible, robust against noise, and easily decoded. Using asymmetric inhibition appropriately, the space of possible states of large neural systems can be restricted to connected saddle points, forming SHCs. These channels can be thought of as underlying reliable transient brain dynamics. **Table 2** summarizes four types of heteroclinic networks that can describe different aspects of sequential dynamics in cognitive processes: (i) A canonic heteroclinic network that produces reproducible sequential switching from one metastable state to another inside one modality (like in a simple WM task); (ii) A network displaying inhibitory-based heteroclinic binding dynamics that is responsible for the stable perception of a subject based on three different modalities; (iii) Two different modalities dynamically coordinated by excitatory connections; (iv) A chunking heteroclinic network that controls the grouping of elements of sequential behavior.

Mathy and Feldman have recently suggested using Kolmogorov complexity and compressibility (Mathy and Feldman, 2012) for the definition of a "chunk": a chunk is a unit in a maximally compressed code. The authors presented a series of experiments in which they manipulated the compressibility of stimulus sequences by introducing sequential patterns of variable length. To explore the influence of chunking on the capacity limits of WM, and departing from Bick and Rabinovich (2009), the authors of Li et al. (2013) suggested a model for chunking in sequential WM. This model also uses hierarchical bidirectional inhibition-connected neural networks with WLC. Assuming no interaction between a basic sequence and a chunked sequence, and the existence of an upper bound on the inhibitory weights of the network, the authors show that chunking increases the number of memorized items in WM from the "magical number" 7 up to 16 items. The optimal number of chunks and the number of memorized items in each chunk correspond to the "magical number 4."
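The compressibility view can be made concrete with a toy measure. Run-length encoding below is a crude, computable stand-in for Kolmogorov complexity (which is uncomputable), and the sequences are our own examples: a patterned sequence collapses into few chunks, while an unpatterned one of the same length does not.

```python
# Toy version of the compressibility view of chunking: a chunk is a unit
# in a compressed code. Run-length encoding serves as a crude stand-in
# for Kolmogorov complexity: patterned sequences compress into fewer
# chunks, so they should load WM less.
def run_length_chunks(seq):
    """Group a sequence into (symbol, count) runs."""
    chunks = []
    for s in seq:
        if chunks and chunks[-1][0] == s:
            chunks[-1][1] += 1
        else:
            chunks.append([s, 1])
    return [(s, c) for s, c in chunks]

patterned = "AAABBBCCC"   # highly compressible: 3 runs
random_ish = "ABCACBACB"  # incompressible by runs: 9 runs
print(len(run_length_chunks(patterned)), len(run_length_chunks(random_ish)))  # -> 3 9
```

Both sequences have nine items, but the patterned one needs only three chunks, mirroring the experimental manipulation of sequence compressibility.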

#### **Table 2 | Heteroclinics in mind.**


*\*See the definition of the variables and parameters in the text.*

Recent experiments have confirmed the existence of three levels of cognitive hierarchy (see Rosenberg and Feigenson, 2013). In that paper the authors reported that infants can unify the representation of chunks into *"super-chunks."*

The chunking models discussed above can be generalized to more complex cases. In particular, by adding attention control to the network hierarchy, it is possible to analyze the binding of sequences of chunks. The brain could use such binding to perform many cognitive functions, like the coordination of visual perception with speech comprehension, or the coordination of music chunks and word chunks in singing. It is well-known that viewing a speaker's articulatory movements substantially improves a listener's ability to understand spoken words, especially under noisy environmental conditions like a crowded cocktail party. Ross and coauthors claimed that this effect is most pronounced when the auditory input is weakest: as a result of attentional binding (multisensory integration), a substantial gain in multisensory speech enhancement is achieved at even the lowest signal-to-noise ratios (Ross et al., 2007).

The dynamics of hierarchical heteroclinic networks is also able to explain and predict the coordination of behavioral elements with different time scales (for a study of the coordination of sensorimotor dynamics see Jantzen and Kelso, 2007). Functionally, this kind of synchronization can be the result of learning—changing the strength of inhibitory connections between agents at the different levels of the hierarchy in order to coordinate dynamics with different time scales (see **Figure 3**). Additionally, it is important to note that the winnerless competitive learning process itself can be chaotic (Komarov et al., 2010), which provides wider possibilities for adaptability.

# **ACKNOWLEDGMENTS**

Mikhail I. Rabinovich acknowledges support from ONR grant N00014310205. Pablo Varona was supported by MINECO TIN2012-30883. Irma Tristan acknowledges support from the UC-MEXUS-CONACYT Fellowship. Valentin S. Afraimovich was partially supported by Ohio University Glidden Professorship program.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 October 2013; accepted: 10 February 2014; published online: 14 March 2014.*

*Citation: Rabinovich MI, Varona P, Tristan I and Afraimovich VS (2014) Chunking dynamics: heteroclinics in mind. Front. Comput. Neurosci. 8:22. doi: 10.3389/fncom. 2014.00022*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Rabinovich, Varona, Tristan and Afraimovich. This is an openaccess article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A non-linear dynamical approach to belief revision in cognitive behavioral therapy

# *David Kronemyer\* and Alexander Bystritsky*

*Anxiety and Related Disorders Program, David Geffen School of Medicine, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA*

#### *Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Tobias A. Mattei, Ohio State University, USA Fatemeh Bakouie, Amirkabir University of Technology, Iran*

#### *\*Correspondence:*

*David Kronemyer, Anxiety and Related Disorders Program, David Geffen School of Medicine, Semel Institute for Neuroscience and Human Behavior, University of California, 300 UCLA Medical Plaza, Room 2330, Los Angeles, CA 90095-6968, USA e-mail: dkronemyer@ mednet.ucla.edu*

Belief revision is the key change mechanism underlying the psychological intervention known as cognitive behavioral therapy (CBT). It both motivates and reinforces new behavior. In this review we analyze and apply a novel approach to this process based on the AGM theory of belief revision, named after its proponents, Carlos Alchourrón, Peter Gärdenfors and David Makinson. AGM is a set-theoretical model. We reconceptualize it as describing a non-linear, dynamical system that occurs within a semantic space, which can be represented as a phase plane comprising all of the brain's attentional, cognitive, affective and physiological resources. Triggering events, such as anxiety-producing or depressing situations in the real world, or their imaginal equivalents, mobilize these assets so they converge on an equilibrium point. A preference function then evaluates and integrates evidentiary data associated with individual beliefs, selecting some of them and combining them into a belief set, which is a metastable state. Belief sets evolve in time from one metastable state to another. In the phase space, this evolution creates a heteroclinic channel. AGM regulates this process and characterizes the outcome at each equilibrium point. Its objective is to define the necessary and sufficient conditions for belief revision by simultaneously minimizing the set of new beliefs that have to be adopted, and the set of old beliefs that have to be discarded or reformulated. Using AGM, belief revision can be modeled using three (and only three) fundamental syntactical operations performed on belief sets: expansion, revision, and contraction. Expansion is like adding a new belief without changing any old ones. Revision is like adding a new belief and changing old, inconsistent ones. Contraction is like changing an old belief without adding any new ones. We provide operationalized examples of this process in action.

#### **Keywords: AGM theory, belief revision, cognitive behavioral therapy, cognitive restructuring, exposure/response prevention, non-linear dynamical psychiatry, systematic desensitization**

Non-linear dynamical psychiatry recently has taken two different directions. The first is the granular description of neurological systems from a bottom-up, micro level, in order to characterize a cognitive phenotype such as emotion or attention (illustrative is Rabinovich et al., 2010a). The second is the functional description of psychopathology and corollary intervention strategies from a top-down, macro level, in order to characterize the course and progression of psychiatric disorders (illustrative is Bystritsky et al., 2012). Drawing on both, in this review we set forth a theory of belief revision for the intervention strategy known as cognitive behavioral therapy (CBT). CBT postulates that psychiatric disorders such as anxiety and depression are not caused by acts, transactions, events or circumstances in the real world, or by one's imaginal reconstruction of them. Rather, they result from one's attitude, orientation or outlook toward them. Persons who are anxious or depressed hold dysfunctional beliefs about themselves, others, their environment and the future. Dysfunctional beliefs are caused by an invalidating environment, deficient information-gathering practices and breakdowns in one's belief formation system (Warman et al., 2007). They often are accompanied by dysregulated emotions (Linehan, 1993). As a result, persons holding them engage in problematic or undesired behavior that is personally distressful or socially maladaptive, for example, anger, impulsivity, self-harm, self-isolation or substance abuse ("target behavior").

Belief revision is the primary therapeutic technology underlying CBT. As we will explain, it comes in two types. The first, called "cognitive restructuring," reformulates old beliefs and changes them into new ones. As a result, one is able to reregulate one's emotions and modify or abandon target behavior. The second results from behavioral change through a process called "systematic desensitization" or "exposure/response prevention." It extinguishes old, conditioned target behavior and introduces new, more flexible, adaptive behavior. This in turn reformulates or discards old beliefs and reregulates emotions, reinforcing the newly-learned behavior. In both cases, the new behavior then stabilizes, consolidates and strengthens the new beliefs. Both are forms of belief revision: the former, more cognitively-based than behavioral; and the latter, more behaviorally-based than cognitive. Belief revision also reduces the intensity of interoceptive alarms activated by the sympathetic nervous system when stressed, such as those characteristic of panic (Khalsa et al., 2009; Domschke et al., 2010). CBT widely is regarded as the paradigm of an empirically-supported therapy (EST) (Butler et al., 2006), which should make it particularly amenable to a cognitive science-based approach.

Our central premise is that belief revision in CBT is an integral component of a non-linear dynamical process of psychological change as conceptualized, for example, by Bystritsky et al. (2013). Anxiety and mood disorders have three essential components, which are alarms, beliefs and coping strategies (A-B-C). Alarms can be evaluated using conventional metrics such as their frequency, intensity, duration and onset. Coping strategies–a form of behavior–can be evaluated by whether they are distressful, maladaptive, or effective in down-regulating the incidence of target behavior and the intensity of correlative alarms. Beliefs are more difficult to integrate into a theory of non-linear dynamical systems. They have several unique characteristics as cognitive phenotypes, which prevent them from fitting well into the canonical model. One might not even notice one has beliefs to begin with, unless and until they are activated by environmental triggers, interoceptive sensations or undesired behavioral consequences.

Alternatively, we propose and demonstrate a set-theoretical, semantically-based approach to belief revision known as AGM theory, and show how it is the most plausible candidate to perform belief revision within a non-linear, dynamical framework. AGM is an acronym of the last names of its inventors, Alchourrón et al. (1985). It sets forth the requirements that non-delusional belief change in light of new evidence, and the resulting updated knowledge base, must meet in order to remain intuitively appealing (Carnota and Rodríguez, 2011, p. 2). As we discuss at §3, AGM operationalizes the cognitive component of CBT. Its objective is to define the necessary and sufficient conditions for belief revision by simultaneously minimizing the set of new beliefs that have to be adopted, and the set of old beliefs that have to be discarded or reformulated. Using AGM, belief revision can be modeled using three (and only three) fundamental syntactical operations performed on belief sets: expansion, revision, and contraction. Expansion is like adding a new belief without changing any old ones. Revision is like adding a new belief and changing old, inconsistent ones. Contraction is like changing an old belief without adding any new ones.
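The three operations can be sketched in code. What follows is a minimal illustration rather than AGM proper: beliefs are modeled as plain string literals with a tilde marking negation, the logical closure and selection-function machinery of the full theory is omitted, and all function names are our own.

```python
# Minimal sketch of AGM's three syntactical operations on a belief set.
# Beliefs are plain string literals; "~x" stands for the negation of "x".
# Real AGM operates on logically closed theories with a selection
# function; both are deliberately omitted here.

def negate(belief):
    """Syntactic negation of a literal."""
    return belief[1:] if belief.startswith("~") else "~" + belief

def expand(beliefs, new):
    """Expansion: add a new belief without changing any old ones."""
    return beliefs | {new}

def contract(beliefs, old):
    """Contraction: give up an old belief without adding any new ones."""
    return beliefs - {old}

def revise(beliefs, new):
    """Revision (Levi identity): first contract the negation of the
    incoming belief, then expand with it, keeping the set consistent."""
    return expand(contract(beliefs, negate(new)), new)

k1 = {"I always fail", "effort is pointless"}
k2 = revise(k1, "~I always fail")  # new evidence contradicts an old belief
# k2 == {"effort is pointless", "~I always fail"}
```

Note that `revise` is built from the other two operations via the Levi identity: revising by a new belief amounts to contracting the negation of that belief and then expanding with it.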

# **SOME RELEVANT CONSIDERATIONS ABOUT BELIEF**

The nature of belief and what it is to believe in something (a doxastic state) both long have been central preoccupations of psychology and epistemology (Schwitzgebel, 2010). It is beyond the scope of this review to discuss exhaustively the voluminous literature on belief, which has accumulated relentlessly since antiquity. We will, however, briefly develop several characteristics of belief pertinent to its integration into a theory of non-linear dynamical systems, which any theory of belief revision must take into account<sup>1</sup>.

A consensus definition is that beliefs are "states of mind that have the property of being about things–things in the world, as well as abstract things, events in the past and things only imagined" (Churchland and Churchland, 2013, p. 1). Russell (1921/2005) and colleagues famously developed a theory of propositions and propositional attitudes. What beliefs are about is their substantive propositional content, i.e., (that "*x*"). Belief is an attitude, orientation or outlook toward that propositional content, i.e., BEL("*x*"). The set of all of one's beliefs at time *t*1 is one's knowledge base *k*1. Beliefs are different from simple reference to people, places or things; informal or colloquial uses (Grice, 1975); as well as other modes of discourse such as performatives (Austin, 1962)<sup>2</sup>. While all of its individual elements are controversial in various respects, for our purposes, **Figure 1** depicts the standard model of belief, with components including perceptual, cognitive, emotional, linguistic and behavioral processing.

## **BELIEFS ARE BASED ON EVIDENCE**

Evidence is a set of epistemological claims adduced to support a belief set. Relevant evidence enables one to devise and then test various hypotheses the belief set generates (Glymour, 1975; Hartmann and Sprenger, 2010). One is justified in believing that "*x*" to the extent one has good evidence for "*x*" (Feldman and Conee, 1985; Joyce, 2011). In the case of psychiatric disorders such as anxiety or depression, evidentiary data are things one might cite or rely on to support a contention that what one is *afraid* will occur, actually *will* occur. The feared outcome or consequence does not *actually* have to occur, rather, the evidence gives credence to the belief or prediction that it will.

From a clinical standpoint, the client is not responding to an object of fear, but to an internal symbolic representation of it, which (among other properties) has a compelling sense of reality. The client's behavioral expressions and coping strategies in turn are not a reaction to the feared object, but rather to the set of beliefs surrounding it, comprising the client's vision of what the feared object is, or might be. Under these circumstances, evidence is nothing more than the way things seem. One is "right to believe everything he believes as strongly as he believes it until it is rendered improbable by something else he believes" (Swinburne, 2011, p. 202). This support function often is conditional (Joyce, 2003). A conditional belief is one with the form

<sup>1</sup>Some of the other issues affecting beliefs that are beyond the scope of this review include (for starters): the subjective, phenomenological experience of belief; taxonomies of different types of beliefs; the relationship between beliefs and emotions; the role of memory; subjective probability theory; Bayesian epistemology; Dempster-Shafer theory; theories of reasoning; and rationality. In addition we do not here address objections such as logical omniscience, monotonicity and whether language (and beliefs) can be analyzed using a logical structure, to begin with.

<sup>2</sup>In linguistics, the use of expressions whose reference depends on the context of utterance is known as deixis (Brisard, 2011). Deixis is an example of how one's environment pragmatically imposes itself on one's beliefs. Although a word's semantic meaning may be fixed, what it actually means can vary with a number of factors, such as person, place and time. All of these are susceptible to ambiguous reference if viewed in isolation. It may not be clear, for example, who is designated by a pronoun. Spatial locutions such as "here" or "there" may designate more than one location, and temporal ones such as "now" and "then" might apply to different times (Corazza, 2011; Hanks, 2011). By constraining the limits of potential communication systems, ambiguity in natural languages actually may be adaptive (Piantadosi et al., 2012; Solé and Seoane, 2014). Deictic reference is a sub-category of indexical reference, which expands these principles to any context-sensitive use. Example: a vague expression with a hidden or latent variable, or one that has a particular meaning unique to a local community (such as "urban slang"), which often is uninterpretable absent specialized knowledge (Braun, 2007).

BEL(*x*)|{EVID1, EVID2,... EVID*n*}, which reads "BEL(that "*x*") assuming {EVID1, EVID2,... EVID*n*}" (Arlo-Costa, 2007).

In psychiatry, evidence often is clinical observations of patient behavior or patient reports of symptoms set forth in the Diagnostic and Statistical Manual (DSM-5) (American Psychiatric Association, 2013). An example of the former: BEL("This person is depressed") | EVID("She has insomnia or hypersomnia nearly every day and significant weight loss when not dieting or weight gain, or decrease or increase in appetite nearly every day"). An example of the latter: BEL("I'm depressed") | EVID("I have markedly diminished interest or pleasure in all, or almost all, activities most of the day, nearly every day; and I have feelings of worthlessness or excessive or inappropriate guilt nearly every day"). Evidence also can be third-person observations or patient reports of them. Example: EVID("She always is fighting with her friends") or EVID("My parents always told me so"). Persons also may have corollary beliefs about their beliefs (Paulus and Stein, 2010). For example, one might BEL("Therapy/pharmacology doesn't help") or BEL("I'm going to have this for the rest of my life"). They also might be reflexive, as in BEL("I'm afraid of experiencing the symptoms of panic disorder").
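The conditional form BEL(*x*) | {EVID1,... EVID*n*} can be given a simple machine-readable encoding. The `ConditionalBelief` class and its `supported` method below are hypothetical labels of our own, intended only to illustrate how a belief can be paired with the evidence set conditioning it:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConditionalBelief:
    """Hypothetical encoding of BEL("x") | {EVID1, ..., EVIDn}:
    a belief held conditionally on a set of evidentiary claims."""
    content: str
    evidence: frozenset

    def supported(self, accepted):
        # The belief remains supported only while every piece of its
        # conditioning evidence is still among the accepted claims.
        return self.evidence <= set(accepted)

# A DSM-style example in the spirit of the text.
depressed = ConditionalBelief(
    content="I'm depressed",
    evidence=frozenset({
        "markedly diminished interest most of the day, nearly every day",
        "feelings of worthlessness nearly every day",
    }),
)
```

On this encoding, withdrawing any element of the evidence set withdraws support for the belief, which anticipates the conditioning requirement discussed later in this review.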

## **REFERENTIAL OPACITY**

A sentence's reference is what it designates. Sentences about beliefs are referentially "opaque" in that co-designating terms are not intersubstitutable (Quine, 1953/1980). To use a famous example, Oedipus married Jocasta; Oedipus believed Jocasta was his girlfriend; Oedipus didn't know Jocasta was his mother. This reads as follows: there was a time (*t*1) when Oedipus believed "Jocasta was his girlfriend" (BEL1) given the supply of evidentiary data {EVID1, EVID2,... EVID*n*} then available to him. Even though true, Oedipus didn't believe at *t*1 "Jocasta was his mother" (BEL2), i.e., BEL2 ∉ *k*1. He discovered this only at *t*2, when (to his consternation) his knowledge base was *k*2.

It follows that sentences about beliefs are informative in a way that "the sum of the angles of a triangle is 180°" is not. Another famous example from Gottlob Frege: one believes the morning star rises in the east; one also believes the evening star sets in the west; one doesn't know both are the planet Venus. Even though both sentences refer to the same thing, their meanings or "senses" are different (Zalta, 2012). Failures of reference do not require one to postulate intentional conduct. They may be due to something as simple as accident or mistake (Austin, 1956/1970)<sup>3</sup>. The main

<sup>3</sup>A related concept is intensionality, later developed by Rudolf Carnap (1947/1988). Intension roughly is the same thing as meaning or sense. It contrasts with extension, which roughly is the same thing as reference. For Carnap, two phrases or sentences have the same extension if they designate the same thing, i.e., they both are true or false with regards to it, so that one can be substituted for the other. Intensional ones fail this test, at least for our actual world. There is, however, a possible world or state-description with different conditions, in which there is substitutability of identity. That possible world could be our actual world at a different point in time, or even the knowledge base of different persons. Beliefs, according to Carnap, are neither extensional nor intensional, because one can believe *x* but not *y* or *z* without realizing they

problem with belief reports is that they rely on a client's interpretation of her subjective phenomenological experience (Dattilio et al., 2010).

## **BELIEFS ARE SUBJECTIVE**

Referential opacity is a set-theoretical way of saying that beliefs are inherently subjective. As *homo credens*, people are infinitely capable of believing any number of different things (Shermer, 2012). One might believe in unicorns, global warming, conspiracy theories, that the sun revolves around the earth, or that one is the present King of France. It is not our intention to restrict the content of different beliefs, or the types of evidence that may be adduced to support them.

Psychiatrists and psychologists have devised numerous ways to find out *what* people believe, including observing them, testing them and asking them. In this sense, beliefs are "epistemically objective." Implausible as it may seem, in the near future, it might even be possible to read a person's mind using neurotechnologies such as fMRI (Harris et al., 2008; Poldrack et al., 2011); neuropsychiatric phenomics (Bilder et al., 2009a,b); connectionist-type principles (Sporns et al., 2005); or interactionist-type principles (Stumpf et al., 2008)<sup>4</sup>.

One of the perennial issues in cognitive science is whether these methods ever will be sufficient to account for belief's phenomenological texture. There is something unsatisfying about the neuromaterialistic/neurodeterministic program of extracting the substantive propositional content of a belief from neurological events. The reason is that beliefs are underdetermined neurophysiologically; a single neurological state potentially could give rise to any number of different beliefs (they are "multiply realizable," Davidson, 1970, 1974; there is an "explanatory gap" between the two, Levine, 1983, 1999). Further, they only can be held by the person who believes them. In this sense they are "ontologically subjective," as features or ascriptive predicates attributable only to that person (Dehaene, 2014, p. 9; Searle, 1995, pp. 7–9)<sup>5</sup>. From a clinical standpoint, there is no such thing as a standardized set of beliefs. Any approach to psychometric assessment that attempts to construct a taxonomy of typical beliefs, whether normative or pathological, most likely will not be successful, because beliefs fundamentally are distinctive, unique and personal. The clinician and the client must become co-investigators to identify them and the evidence ostensibly supporting them.

## **BELIEFS ARE MEDIATED AND MODERATED**

Beliefs are mediated and moderated by any number of different factors such as background, upbringing, life experiences, information processing strategies, temperament, attributional style, other beliefs, context, culture, motivation, and the presence of environmental cues and situational primes (Hope et al., 2010). They may be teleological or subject to confirmation bias. People deploy a variety of heuristic reasoning strategies to arrive at the beliefs they hold, including hypothesis formation, generalization and anomaly resolution. Reasoning has a rational basis rooted in probabilistic approaches to problem-solving (Kahneman and Tversky, 1979; Tversky and Kahneman, 1983; Oaksford and Chater, 2007). These strategies have evolved over time to facilitate our ability to make decisions in situations with incomplete information as to potential outcomes (Kahneman et al., 1982; Shafer and Tversky, 1985; Kahneman, 2003; Michalewicz and Fogel, 2004). They include everything from educated guesses to intuitive judgments and common sense. Induction is an important aspect of human reasoning (Heit and Rotello, 2010; Johnson-Laird, 2010), as are techniques to evaluate the evidence in support of individual beliefs such as Bayesian reasoning and Dempster-Shafer theory (Curley, 2007; Zhao and Osherson, 2010; Zhao et al., 2012). There also is a complex relationship between cognition and emotion (§2.1.4, below; Pessoa, 2008, 2014). Beliefs are thought; emotions are felt. Just as one can have beliefs about one's emotions, so does one's emotional state affect one's belief-generating system. As with the subjective nature of beliefs (§1.3, above), while all of these are controversial in various respects, it is not our intention to restrict the nature, scope and extent of potential belief influencers.

## **CONDITIONS OF SATISFACTION**

A proposition has the property that it is true or false in the real world (McGrath, 2012). Beliefs, on the other hand, have conditions of satisfaction–what happens when things are the way one believes them to be. BEL("It's raining") is satisfied if in fact it is raining. Under those circumstances, we say the belief is "true." Beliefs have a "mind-to-world" direction of fit, in that the belief corresponds, to some extent, with reality (Searle, 1983).

## **PSYCHOPATHOLOGY DISRUPTS THE ENTIRE BELIEF TEMPLATE**

One of the best ways to consider belief as a psychological construct is to examine counterfactual cases (Langdon and Connaughton, 2013). Persons who are anxious or depressed have beliefs that are dysfunctional and experienced as negative and invalidating (Bernstein et al., 2010, 2013). Example: BEL("If I try to do this, I'm going to fail").

The main problem with dysfunctional beliefs is they cannot be assigned a truth value, as in BEL ("The cat is on the mat" | There is a creature of the genus and species *felis catus* lying prone upon a rectangle of flooring material). Rather, one *thinks* conditions of satisfaction have been met, or thinks *others* think they have, when in fact they have not. Example: BEL("I'm a terrible person") does not imply one in fact is a terrible person (under some plausible

all refer to the same thing. Phrases or sentences are "intensionally isomorphic" if in fact this intersubstitutability relationship nonetheless exists.

<sup>4</sup>The Human Connectome Project was established in September, 2010 by the U.S. National Institutes of Health (Vance, 2010). In April, 2013, the U.S. announced its BRAIN Initiative, a \$1 billion connectionist-type project. It joined a similar €1 billion venture, the Human Brain Project, announced in January, 2013 by the E.U. (Abbott, 2013; Reardon, 2014). Organizations such as the Allen Institute for Brain Science (Carey, 2012); Google (Markoff, 2012); and Vicarious (Albergotti, 2014) have similar objectives. Because connectionism results in something akin to a static, point-in-time wiring diagram, it is the opposite of non-linear dynamical psychiatry, see §4. Connectionism has obvious applications to artificial intelligence (AI), beyond the scope of this review to investigate further.

<sup>5</sup>Eliminative materialists such as Churchland et al. necessarily are committed to a theory that psychological disorders are a result of brain malfunction, for example, defective or impaired neurochemistry (Matthews, 2013).

consensus definition of what that means), or that others think so. Initially, negatively-valenced beliefs arise from misinterpretation of exteroceptive and interoceptive evidence and from information processing deficits (Paulus and Stein, 2010; Boden et al., 2012). Misevaluation of conditions of satisfaction then causes one to misjudge the evidence supporting the feared outcomes ("cost biases") (Nelson et al., 2010a,b).

Normatively, we are inclined to impose certain minimum requirements on a set of beliefs in order to maximize the likelihood there will be a match between beliefs and conditions of satisfaction. These include conformity, conditioning and coherence (Howson, 2009).

### **CONFORMITY**

Conformity disregards the substantive propositional content ("*x*") of BEL("*x*") and requires only that one not endorse ("-*x*") simultaneously. Actual human reasoning might not be quite that simple. Research shows that people deal with inconsistencies not by attempting to refute one of the premises, but rather by trying to explain their origins, which has the side effect of revising their beliefs (Khemlani and Johnson-Laird, 2011).

### **CONDITIONING**

Conditioning means that one should hold BEL("*x*") only for so long as {EVID1, EVID2,... EVID*n*} support (*x*) and that one must update (*x*) in light of new, incoming EVID. Such an update may involve modifications to the belief's conditions of satisfaction. Acquiring, maintaining and using new evidence in order to revise and update beliefs is a crucial human survival strategy (Patterson and Barbey, 2013). When incorrect or obsolete, conceptual knowledge must be repaired by integrating and explaining new material (Friedman and Forbus, 2011).

### **COHERENCE**

Coherence means that only tautological falsehoods qualify for a probability assignment of *p*(*x*) = 0 and only tautological truths qualify for *p*(*x*) = 1. Thus one should not assign *p*(BEL) = 0 to (BEL = "the sum of the angles of a triangle is 180°"), §1.2, above. Rather, one should assign it *p*(BEL) = 1.
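The coherence requirement can be illustrated with a toy truth-table check. This assumes, purely for illustration, that a belief's content can be expressed as a Boolean formula; the function names below are ours:

```python
from itertools import product

def classify(formula, variables):
    """Exhaustively evaluate a Boolean formula over all assignments and
    classify it as a tautology, a contradiction, or contingent."""
    rows = [formula(dict(zip(variables, values)))
            for values in product([False, True], repeat=len(variables))]
    if all(rows):
        return "tautology"
    if not any(rows):
        return "contradiction"
    return "contingent"

def coherent(formula, variables, p):
    """Coherence: only contradictions may receive p = 0, only
    tautologies p = 1; everything else must lie strictly between."""
    kind = classify(formula, variables)
    if p == 1.0:
        return kind == "tautology"
    if p == 0.0:
        return kind == "contradiction"
    return kind == "contingent" and 0.0 < p < 1.0

# Modus ponens, ((a -> b) and a) -> b, is a tautology, so coherence
# demands p = 1 for it.
modus_ponens = lambda v: not ((not v["a"] or v["b"]) and v["a"]) or v["b"]
```

On this sketch, `coherent` rejects any assignment that gives a tautology less than certainty, mirroring the requirement just stated.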

Although they seem sensible, these axioms often do not apply to psychopathological states, because cognitive processing systems are impaired and emotion processing systems are dysregulated. Persons holding dysfunctional beliefs also may not be able to reason normatively. For example, they may disbelieve a set of propositions (e.g., evolution, global warming), which (most) everybody else believes (Perring, 2010). They may be indifferent to antecedent beliefs and stored knowledge; misunderstand inferential relationships; prioritize anomalous perceptual experiences; and lack a coherent theory of mind (Davies and Coltheart, 2000). It also makes sense to think of sentences expressing the ideations of persons with psychiatric disorders (§1.2, above) as ultra-opaque, thus even less amenable to substitutability of identity.

Their ability to evaluate evidence also may be impaired. Normatively, one relies on evidence to support a belief that what one *thinks* will occur, actually *does* occur. The evidence does not contradict, and in fact supports, the belief. In problematic cases, though, one does not have to believe a feared outcome or consequence actually *will* occur. Rather, all one has to believe is that the evidence supports the *belief* that it will, regardless of whether it happens or not (Joyce, 2011; §1.1, above). In such cases, the evidence supporting the belief is misaligned with reality (Warman et al., 2007; Möller, 2012). Clearly this is a slippery slope. If people can believe whatever they want, then what's to stop them, particularly if they have a mental disorder?

## **SUBJECTIVE PROBABILITY THEORY**

There are two modern epistemic interpretations of probability, which are logicism and subjectivism (Galavotti, 2011). Logicism contends that probability is a person-independent, normative relationship between real-world facts or events. Subjectivism is the theory that probability is one's degrees of belief (Hájek, 2011). Under the logicist interpretation, a tautological statement (such as *A* → *B*; *A*; ∴ *B*) is certain regardless of what people may think about it. Its probability *p* within a sample space is 1 and in principle a large number of other beliefs can be incorporated within so long as they are complementary (§1.6.3, above). Under the subjectivist interpretation, different persons can believe whatever they want and assign their beliefs different *p*-values, even given the same evidence, permitting wide intersubjective belief variation.

Subjectivism almost certainly is true when considering a person's individual beliefs (§1.3, above). It breaks down, however, when considering a set comprising different beliefs, all held by the same person. This surely is normative. It would be odd for a person only to have one belief. Most people probably hold tens of thousands, perhaps hundreds of thousands, of beliefs, and their knowledge base most likely expands over time (Ohlsson, 2011, p. 293). The problem is not about subjectivism. Rather, it is about probability. Probability assessments do not occur on an interval scale, making it impossible to combine them or determine something analogous to their "mean" probability function using a linear pooling methodology (Wallsten et al., 1997)<sup>6</sup>. Beliefs comprising belief sets are interdependent, not independent. As a result, they cannot be evaluated using a differential equation or structural equation modeling approach. A differential equation approach will not work, because one cannot parameterize the values of the variables in order to create a belief change trajectory or phase portrait within a vector field. A structural equation modeling approach will not work, because one needs dimensionality reduction. For example, if one holds 13 separate beliefs, the binomial coefficient C(13, 4) is 715. Their interaction effects are 13! (13 factorial), or 6,227,020,800. Beliefs simply cannot be converted into numbers. They are not variables with values. Consequently, there must be some other way to fit beliefs into a non-linear dynamical model.
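The combinatorial figures in the text are easy to verify directly: 715 equals the binomial coefficient C(13, 4), and 13! counts the possible orderings of thirteen beliefs.

```python
import math

n = 13  # thirteen separate beliefs, as in the text

pairs = math.comb(n, 2)        # 78 pairwise combinations
four_way = math.comb(n, 4)     # 715, the binomial coefficient cited above
orderings = math.factorial(n)  # 6,227,020,800 possible orderings
subsets = 2 ** n - 1           # 8,191 non-empty subsets of beliefs

print(pairs, four_way, orderings, subsets)
```

Even for a modest belief set, the number of potential interactions explodes, which is the dimensionality problem the text describes.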

## **BELIEFS HAVE SEMANTIC, PROPOSITIONAL CONTENT**

The solution is that beliefs have semantic, propositional content. Semantic content need not be expressed in complete sentences or

<sup>6</sup>Primarily for this reason, it is not clear that a comprehensive Bayesian approach to belief formulation and revision (for a summary, see Davies and Egan, 2013) is viable.

even phrases. It can be concepts that either are the semantic content or that combine to form it (Laurence and Margolis, 2012). Beliefs are just such a conceptual state. Unlike variables populated by values, they must be elicited using a natural language and then combined into sets at various stages of the belief-generating process (*t*1, *t*2,... *tn*). One selects beliefs and includes them as members of belief sets by promoting or prioritizing them ahead of others, based on one's credences in the evidence supporting them, or levels of confidence in their conditions of satisfaction (§1.5, above; Makinson, 2009; Dietrich and List, 2013). Credences are situated along a continuum ranging from complete certainty of falsehood (does not meet perceived conditions of satisfaction) to complete certainty of truth (meets perceived conditions of satisfaction), depending on the evidence (Joyce, 2009).

### *Preference functions*

Individual beliefs are organized into sets by preference or ranking functions (γ), which assess the occurrence or persistence of the belief (Spohn, 2009). In order to assign a preference function, one must adopt a theory of utility to determine what counts as a desirable (utility-maximizing) action; establish degrees of belief; rank preferences; and determine what evidence counts as confirming what beliefs (Johnson-Laird, 2010, 2013; Meacham and Weisberg, 2011). The higher a belief's preference function, the more likely it is to provide a basis for behavior (Segerberg et al., 2009)<sup>7</sup>. Following this compilation process, different belief sets then can be evaluated in order to determine the nature, scope and extent of belief revision, most likely by a human skilled in use of the language in which the beliefs are expressed<sup>8</sup>. It is likely that different beliefs impose contrasting and disparate semantic burdens, based on factors such as prevalence, complexity, and the number of inferences involved.
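A preference function of this kind can be caricatured as a simple threshold rule. Everything in the sketch below (the numeric credence scale, the threshold gamma, the dictionary encoding) is an illustrative assumption of ours, not Spohn's ranking-function formalism:

```python
def belief_set(credences, gamma):
    """Toy preference function: admit into the belief set every belief
    whose evidential credence meets the threshold gamma. The credence
    values and the threshold rule are illustrative assumptions only."""
    return {belief for belief, c in credences.items() if c >= gamma}

# Candidate beliefs with hypothetical credences in their evidence.
candidates = {
    "If I try this, I will fail": 0.9,
    "My friends support me": 0.4,
    "Effort sometimes pays off": 0.7,
}
selected = belief_set(candidates, gamma=0.6)
# selected == {"If I try this, I will fail", "Effort sometimes pays off"}
```

Raising gamma yields a smaller, more conservative belief set; lowering it admits more weakly supported beliefs.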

### *Semantic encoding*

An example of a technique that has been devised to elicit beliefs is the articulated thoughts in simulated situations (ATSS) think-aloud paradigm, initially developed by Davison et al. (Zanov and Davison, 2010). Computational semantics attempts to model key features of natural language processes such as word meaning, sentence meaning, pragmatic usage and background knowledge (Stone, 2014). Recent initiatives include WordNet (Princeton University, 2010); latent semantic analysis (LSA) (University of Colorado Boulder, 1998); and SNePS (SNePS Research Group, 2013). WordNet is a lexical database that groups words into sets of distinct cognitive concepts. LSA evaluates word similarity by similarity of context of use. SNePS is a natural language knowledge representation and reasoning system. A SNePS sub-routine models belief revision to maintain conformity, conditioning and coherence (§1.6.1, §1.6.2, §1.6.3, above). It too requires both individual beliefs and their relationships to be semantically encoded. One of the research priorities of several of today's most prominent internet companies is to develop algorithms for natural language recognition. Apple acquired Siri in April 2010 (Wortham, 2010); Facebook announced Graph Search in January 2013 (Sengupta, 2013); Google announced Hummingbird in September 2013 (Miller, 2013); Yahoo announced SkyPhrase in December 2013 (Goel, 2013); and in February 2014, Wolfram released software intended to answer natural language queries with real-world information as a kind of "computational knowledge-engine" potentially demonstrating a form of "machine intelligence" (Lecher, 2014). One of the main challenges of these initiatives will be to capture the numerous shades and nuances of meanings used by fluent language speakers–the senses of words, in Fregean terms (§1.2, above).

### *Semantic entailment*

Closely related are problems of semantic entailment, that is, when a phrase or sentence commits one to other associated concepts. A classic example: "Socrates lived in Greece" should be inferred from "Socrates lived in Athens." Words are organized into "semantic/associative neighborhoods within a larger network of words and links that bind the network together" (Nelson et al., 2013, p. 797); Schroeter (2012) characterizes it as a two-dimensional semantic space comprising rules for assigning values to words and sentences. Specifying exactly what these neighborhoods and networks are is challenging, because (as with semantic encoding, §1.8.2, above) it depends on acquiring paraphrases, lexical semantic relationships, and inferences in contexts such as question answering, information extraction and summarization–similar to the usages employed by a natural language speaker (Dagan et al., 2009).
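The Socrates example can be captured by a toy part-of network. The `PART_OF` table and the traversal function below are illustrative assumptions of ours, not a model of human semantic memory:

```python
# Toy "semantic neighborhood": a part-of hierarchy licensing the classic
# entailment from "lived in Athens" to "lived in Greece".
PART_OF = {"Athens": "Greece", "Greece": "Europe"}

def located_within(place, region):
    """Follow the part-of chain upward to test containment."""
    while place in PART_OF:
        place = PART_OF[place]
        if place == region:
            return True
    return False

# "Socrates lived in Athens" entails "Socrates lived in Greece"
# (and "lived in Europe"), but not the converse.
```

Real entailment systems must induce such links from paraphrases and lexical relations rather than from a hand-built table, which is precisely the difficulty the text describes.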

## **BELIEFS DO NOT EXIST IN ISOLATION**

As semantic entailment illustrates, beliefs are components of complex domains, knowledge sets and networks (Davidson, 1994/2005). The limits of certitude on the one hand and psychopathology on the other allow for a wide variety of different {BEL | EVID} (Huber, 2009). One has an extensive set of unspecific background beliefs, which are culturally sensitive and context-dependent. They are "encoded in our linguistic formulation of the problem" (Weisberg, 2011, p. 507). Activities such as data selection, acquisition and learning require constant revision to one's knowledge base. Belief formation is subject to the overwhelming intervention of human experience, chance events and real-world constraints (Oaksford and Chater, 2007).

Quine and Ullian (1978) refer to this as a "web of belief"–"The totality of our so-called knowledge or beliefs, from the most casual matters of geography and history to the profoundest laws of atomic physics or even of pure mathematics and logic, is a man-made fabric which impinges on experience only along the edges" (Quine, 1953/1980, p. 42). Another way to look at beliefs is how they fit into what Searle (1995) calls the "background"–"all of those abilities, capacities, dispositions, ways of doing things and general know-how that enable us to carry out our intentions and apply our intentional states generally" (Searle, 2010, p. 31); or, the "foundational, non-representational non-rule-governed, dispositional structure of everyday understanding that underpins both our perception and our reasoning" (Rhodes and Gipps, 2008, p. 295).

<sup>7</sup>Other than noting its important function, it is beyond the scope of this review to assess γ's mechanism of action.

<sup>8</sup>Obviously this may be any type of language capable of performing this function.

#### **DYNAMICS OF NATURAL LANGUAGE FORMATION**

Another important factor involved in belief semantics is the dynamics of natural language formation. Any language must have certain minimal constructs and features. These include generativity (one can create an indefinite number of new sentences from its component elements); discreteness (semantic elements, such as words, retain their identity, even in different syntactical contexts); compositionality (smaller language units, such as words, can be combined to form more complex ones, such as sentences); predictability; and recursion (phrases can be embedded within phrases to create new sentences) (Hauser et al., 2002; Studdert-Kennedy, 2005; Searle, 2007). Noam Chomsky famously theorized there was a universal human linguistic structure, which he called "generative grammar" (Chomsky, 1955, 1965). For Chomsky, syntax was the essential component of language, as opposed to semantics (meaning and reference) and pragmatics (how language actually is used) (Chomsky, 1977)<sup>9</sup>.

#### **LANGUAGE AND MIND**

It is beyond the scope of this review to investigate the complex relationships between language and mind (for a current overview, see Gleitman and Papafragou, 2012, 2013). Issues include criticism of Chomsky's views; whether logical variables represent the propositional contents of mental states and that cognition consists in manipulating them, a view most closely associated with Jerry Fodor (1975); criticism of Fodor's views; the linguistic relativity hypothesis (Swoyer, 2003); whether one can observe thoughts or emotions without labeling them (Linehan, 1993); or whether simply changing the way one labels them is effective to initiate cognitive/affective/behavioral change (Lieberman et al., 2007; Hayes et al., 2012). Our concern is not just a matter of choosing new words to describe beliefs, but rather reformulating beliefs, which then are expressed using words. At a minimum, we are in accord with Davidson (1975), who holds that belief is central to thought and that to have a belief requires the ability to express it using words<sup>10</sup>.

The substantive propositional content of an individual belief is interesting and important, particularly for determining just which dysfunctional beliefs typically align with different types of psychopathology. We are more interested, though, in the relationship of an individual belief to the other constituents of the belief set of which the individual belief is a member, and how that set's membership changes or is reformulated between *t*<sup>1</sup> and *t*<sup>n</sup>. Belief revision does not involve alteration or replacement of that which the belief is about, i.e., the "*x*" in BEL(that "*x*"). It is not a form of reality modification. Rather, the focus of change is belief considered as a propositional attitude (§1, above). The nature, scope and extent of belief revision only can be evaluated by inspecting modifications to the semantics of sets of {BEL | EVID} at *k*<sup>1</sup> and *k*<sup>n</sup>.

#### **INTEGRATING BELIEF INTO A NON-LINEAR DYNAMICAL SYSTEM**

Given these complex conditions, how can belief revision using CBT be integrated into a theory of non-linear, dynamical systems? As set forth in our Introduction, above, belief revision essentially involves two separate pathways: one through cognition, the other through behavior. CBT straightforwardly uses interventions directed toward both. The first, cognitive restructuring, requires belief revision in order to initiate behavioral change. The second, exposure/response prevention, requires behavioral change in order to initiate belief revision. Both cognitive restructuring and exposure/response prevention are mechanisms of belief revision from *k*<sup>1</sup> to *k*<sup>2</sup> (*k*<sup>1</sup> → *k*<sup>2</sup>). **Figure 2** illustrates their respective critical paths for a client presenting with borderline personality disorder, DSM-5 §301.83.

#### **COGNITIVE RESTRUCTURING**

Cognitive restructuring is the therapeutic technology underlying the "cognitive" component of CBT (Spiegler and Guevremont, 2009). It contends that belief revision is the active ingredient motivating behavioral change: if belief set *k*<sup>1</sup> at time *t*<sup>1</sup> is modified to belief set *kn* at time *tn*, then more adaptive behavior will follow (Leahy, 2001, p. 23). Cognitive restructuring erodes dysfunctional beliefs through several steps: (1) identify them; (2) marshal disconfirming evidence against them; (3) deconstruct them by challenging and refuting them; (4) replace them with alternative, more functional beliefs; and then (5) conduct behavioral experiments to see how the environment responds (Huppert, 2009; McMillan and Lee, 2010; Morina et al., 2011). Examples of cognitive-oriented interventions include decatastrophizing, disputing the evidence, detecting logical errors, chain analysis, situational analysis, etc. (Leahy and Rego, 2012).

Clinical interventions look something like these: If one is afraid of snakes, that belief can be challenged through a series of counter-examples. A herpetologist might be concerned with the snake's various anatomical features. A veterinarian might be concerned with its health. A herpetoculturist might be concerned with its taxonomy. Some people have them as pets, or pose with them for photographs, or perform with them in theatrical productions. Each of these persons has a different, proactive mental stance toward things that are (or that appear to be) snakes, none of which are threatening. Or, if a person with lived experience concedes suicidal ideations or reports parasuicidal target behavior, then one way to intervene might be to evaluate the evidence and establish the active ingredients of a life worth living: "We have no reliable information that persons who are dead have a better quality of life than persons who are alive. If you're dead, then therapy won't work and you won't be able to get better."

It follows that in order to recalibrate one's belief-generating system, one must modify one's credences in the evidence supporting the pathological belief. The first step in cognitive restructuring is to elicit BEL(*x*). Then, for example, BEL("I'm afraid of *x*") at *t*<sup>1</sup> might get cognitively restructured into something

<sup>9</sup>The logical underpinnings of natural languages is an involved subject, beyond the scope of this review; for recent discussions, see Carruthers (2012) and Scholz (2011). Culbertson and Adger (2014) recently concluded that some grammatical rules (such as placing adjectives closest to the noun they modify) are innate.

<sup>10</sup>Davidson also contends that one must be aware one has a belief in order to hold it to begin with, because if one didn't, then one wouldn't be able to change it, because one wouldn't be able to recognize that the underlying belief was false. This type of metacognitive awareness might be helpful for eliciting beliefs, §1.8.2, above. However, we concur with Laurence and Margolis (2012) that such a requirement overstates the case.

like BEL<sup>⊕</sup>("There've been times when I've encountered *x* and it wasn't so bad") at *t*<sup>2</sup>. Positive belief attributions (BEL<sup>⊕</sup>) supplant negative ones (BEL<sup>⊖</sup>). Following cognitive restructuring, one then searches for discrepant evidence to confirm BEL<sup>⊕</sup> and disconfirm BEL<sup>⊖</sup>, giving one a good reason to reformulate one's behavioral repertoire (Garland et al., 2010; Morina et al., 2011; Lightsey et al., 2012). Like belief, fear simply is another propositional attitude, i.e., {fear(*x*) | EVID}. Once one has accumulated enough relevant evidence, the choice clearly is framed: spend a significant portion of one's time entrained to the feared outcome, vs. the likelihood it actually will occur (i.e., conditions of satisfaction will be met, §1.5, above). From an assessment standpoint, this likely would require one to have good metacognitive awareness, that is, the ability to reflect upon, understand and control one's learning (Schraw and Dennison, 1994) in order to be able to identify and articulate one's beliefs. A related concept from attachment theory is that of reflective functioning, that is, the ability to observe and describe one's own mental state (Fonagy et al., 1991).

Cognitive restructuring presents several issues:

1. It is difficult to challenge entrenched beliefs, even when they result in target behavior. Although maladaptive, to some extent they relieve immediate personal distress. Over time they are reinforced and become a conditioned response to the circumstances triggering them, which consolidate around their utility and effectiveness (Hartley and Phelps, 2012).

Example: aerophobia (fear of flying). In effect one has become fear-conditioned: the unconditioned stimulus (flying) initially provokes anxiety (unconditioned response), then becomes paired or associated with other typically-innocuous contexts or situations extrapolated from or analogized to the original one (such as acrophobia, fear of heights, the conditioned stimulus) (Samanez-Larkin et al., 2008). The resulting thought-pathways become ingrained with experience as they are reinforced by sufficient confirming evidence that maintains the associated beliefs until they become conditioned, learned responses (Tryon and McKay, 2009). One keeps doing the same thing over and over again because one is afraid of the perceived consequences of doing anything else.


2. This makes it difficult for clients to generalize from a specific exposure addressing a particular feared outcome to more global cognitive change. While one might become inoculated or desensitized to a particular trigger, establishing that it also applies in other contexts requires deducing there is a more pervasive relationship between them–which is a cognitive process. In effect one must blunt the impulse toward fractalization.

3. If one adopts the wrong cognitive hypothesis, then revising the associated belief set will be ineffective. In order to be successful, cognitive restructuring must correctly identify the ultimate fear: "I'll lose control," "I'll be judged," "I'll be embarrassed and humiliated," "I'm going to die," etc. If one is afraid of physiological symptoms such as those characteristic of panic, then the question should be, what happens next? For example, if a client presents with symptoms consistent with a diagnosis of social anxiety disorder (SAD), such as vasodilation (blushing), then the consequence might be that "people think I'm an idiot." If people think one's an idiot, then the next consequence might be "I'll be rejected and abandoned." If one's rejected and abandoned, then the next consequence might be "I'll lose my job and my relationships," etc. If the terminal fear is not adequately specified, then target behavior actually might increase over baseline, because rather than contending with dysfunctional beliefs, one just has animated or enlivened them. This is because one *thinks* one has handled the problem, but one really hasn't (§1.6, above). One just has deferred dealing with it. As a result, further triggers will continue to recruit and redeploy cognitive, affective and physiological assets to support it (Smits et al., 2008; Olthuis et al., 2012).

4. Cognitive restructuring essentially is a process of "out with the old, in with the new" using interventions such as those described at §2.1, above (Leahy and Rego, 2012). Because CBT regards dysfunctional beliefs as distortions or errors in thinking, such a challenge might be experienced as emotionally invalidating (Leahy, 2001, p. 58; Linehan, 1993, p. 92). Familiar (and to some extent serviceable) beliefs may be revealed as unrealistic, mistaken, distorted, or even irrational. As a result, subsequent behavior might just exchange one cognitive/affective state (e.g., anxiety) for another (e.g., "I'm deficient" or "I'm defective"). In this respect, dialectical behavior therapy (DBT) augments CBT case conceptualization. It emphasizes emotional validation in addition to cognitive restructuring. It is not enough to focus only on beliefs and behavior, because emotions (and their associated interoceptive sensations) also are an integral component of the same equation. In fact, if anything, in a contest between emotions and cognitions, emotions most likely will win out, because they are more fundamental and, in a sense, primordial (LeDoux, 1996; Damasio, 1999; Afraimovich et al., 2011; Frazzetto, 2013). A recent study by Moser et al. (2014) concluded that positively reinterpreting negative emotional experiences (such as those associated with fearful outcomes) is one of belief revision's key mechanisms, with well-defined neurological correlates. The equation *should* read: {dysfunctional beliefs} + {emotional dysregulation} = {target behavior}<sup>11</sup>.


#### **EXPOSURE/RESPONSE PREVENTION**

CBT's second critical path is behavioral intervention based on the concept of progressive desensitization: exposure/response prevention to a feared outcome, rather than escape/avoidance of it. It proposes that the main driver for therapeutic change is behavior, not cognition. It assumes that it is difficult for cognition alone to motivate new behavior; that one of the main reasons why persons engage in target behavior is to attempt to induce their environment to respond; that when reinforcement contingencies are altered, behavioral modification follows; and that psychological change occurs as a result. Instead of being the driving force motivating behavioral change, cognition brings up the rear. This dichotomy is similar to that between thought and action, or thinking vs. doing.

Using this approach, the first question always must be "how did the behavior get to be the way that it is?" Often this can be explained using classical and operant conditioning paradigms. Sometimes people enact coping strategies to prevent something bad from happening; occasionally, it may even be pleasurable. If, however, actions have *not* had effects, then it is necessary to supply them in order to consequate that behavior. The next step is to unpair or decouple a conditioned stimulus from an unconditioned one, or to extinguish target behavior that previously has been reinforced (and the entire cycle giving rise to it), by establishing prospective environmental contingencies; acquiring skills; enacting new behavior; and then evaluating evidence as to how the environment responds (Spiegler and Guevremont, 2009). At each stage, behavioral markers demonstrate that the feared outcome did not occur.

Target behavior typically is a form of escape/avoidance. It may be accommodating and protective in the short term, because it reduces the threat posed by dysfunctional beliefs (§2.1.1, above;

<sup>11</sup>While we spend considerable time analyzing pathways between cognition and behavior (§4), it is beyond the scope of this review to expand our analysis to include emotions and affect. For speculation on this point, see (Afraimovich et al., 2011; Huntsinger and Schnall, 2013); and (Rabinovich et al., 2010a).

Hofer, 2010). However, it is ineffective over the long term, as novel and even more threatening stimuli arise in the world and present for interpretation and action (Roemer et al., 2002; Carter et al., 2008; Lee et al., 2010). It does not affect one's pre-existing vulnerabilities and the environmental affordances that trigger or activate them. It does not down-regulate dysfunctional beliefs or dysregulated emotions. Instead, by impeding assimilation of accurate information, it maintains judgmental biases, emotional vulnerability and alarm sensitivity–a kind of "contrast avoidance" (Taylor and Alden, 2010; Newman and Llera, 2011, p. 226).

Adaptive new behavior, on the other hand, is generated by stepwise exposure followed by systematic desensitization or response prevention. Initially this is a "fragile behavioral state" and can be recovered "spontaneously or subsequent to environment influences, such as context changes or stress" (Herry et al., 2010, p. 599). As one confronts the feared stimulus, the fear becomes extinguished through a reverse inhibitory learning process, allowing for more flexible control of conditioned response by forming a consolidated extinction memory. With continued or reinitiated exposure, post-behavior cognitions consolidate and become further refined, dampening responsiveness in the brain's fear-sensitive network (Hauner et al., 2012; Trouche et al., 2013). Similar to cognitive restructuring (§2.1.3, above), in order to be an effective intervention, exposure/response prevention must be autogenic, i.e., personalized more or less exactly to falsifying or validating a specific feared outcome–the one that matters the most.

Example: if one is afraid of heights and things that move quickly, then an escape/avoidance strategy would be not to engage with them. An exposure/response prevention strategy, on the other hand, would be to take opposite action by (say) going on a series of roller-coaster rides at an amusement park, starting with those that are small and innocuous but then building up over the course of a day to those that are taller and faster. At each step one takes stock of one's mental condition, notices that one still is alive and breathing, thereby habituating or acclimating oneself to more challenging stimuli, resulting in cognitive change. Example: if one is afraid of driving on the freeway, then an escape/avoidance strategy would be to take surface streets. What happens, though, if the surface streets all are blocked and the only way to get to one's destination is by taking the freeway? The escape/avoidance strategy no longer works. A more adaptive exposure/response prevention strategy would be to progressively expose oneself to driving on the freeway by (say) traveling from one on-ramp to one off-ramp at a time, then gradually building this up to two, then three, etc. Example: rather than engaging in a difficult and potentially futile process of weighing pros and cons in order to motivate herself not to drink alcohol, a person with substance over-use issues alters her behavioral regimen not to drive by liquor stores and restructures her social network to exclude those persons maintaining it.

Behavior modification is powerful. Some theorists contend that in a contest between beliefs and behavior (i.e., cognitive restructuring versus exposure/response prevention followed by belief consolidation), behavior always will win; see e.g., Gipps (2013) and Longmore and Worrell (2007). Historically, committed behaviorists denied one has beliefs to begin with; rather, one only is disposed to respond to stimuli (Pavlov, 1927/2003; Skinner, 1947; Ryle, 1949/2009). Today, along similar lines, eliminative materialists such as Churchland and Churchland (1998) and Dennett (1992) deny beliefs are anything more than folk-psychological explanations (this phrase is intended to be mildly derisive) of complex neurological events (Bickle et al., 2010). The weakness of this formulation is what originally led to the cognitive revolution, as exemplified, for example, by Chomsky's (1959) critique of Skinner's (1957/1991) *Verbal Behavior*. Behavior does not, however, occur in a vacuum. There must be some threshold level of belief revision in order to stimulate it, most likely based on the salience of an initial belief or belief set, its relevance to current goals, or its resonance with a particular feature of the environment. In principle this should be similar to the way that intention redirects attention from the default mode network to some other neural construct or constructs (Buckner et al., 2008; Rabinovich et al., 2012a). Attention focuses intentional orientedness, causing heightened self-monitoring, resulting in greater interoceptive sensitivity (Simmons et al., 2006; Woody and Nosen, 2009), one of the main precursors to belief change.

Thereafter, the role of cognition primarily is to consolidate revised beliefs and build behavioral insight. Beliefs are conjectures or predictions about conditions of satisfaction and the evidence supporting them. The only way to accumulate evidence is by enacting behavioral experiments and seeing what happens. From a clinical standpoint, the client can assume the role of an anthropologist, investigating the behavior of a strange tribe, of which she also happens to be a member. If there is insufficient evidence to support a belief, or the evidence disconfirms it, then there is no particular reason why it should be retained as a component element of a belief set. Discrepant evidence creates "expectation violations" (disconfirms pathogenic beliefs), modifying behavioral vectors previously directed toward averting feared outcomes, thereby raising the cognitive accessibility of alternative and more flexible belief formulations. In many instances, the cognitive objective is not to eradicate fear, but rather to tolerate ambiguity. Using a variation of the Rescorla and Wagner (1972) model, Craske et al. (2012) recently advocated that while it may become semi-perturbed, the pairing or coupling between the conditioned stimulus and the unconditioned stimulus never really is eradicated. Instead, it is inhibited or attenuated. It follows that variability in fear level, or reintroducing elements of the unconditioned stimulus concurrently with the conditioned stimulus during exposure, is more likely to create a durable learning experience. Doing so *maximally* violates expectations, eliciting more improvisational and extemporaneous behavior, thereby promoting belief revision (Kircanski et al., 2012). The goal is not so much extinction (from a behavioral standpoint) as it is acceptance (from a cognitive standpoint) which is a completely different skill. 
As the Viennese novelist (and, in retrospect, proto-ACT theorist) Robert Musil (1930–43) declared: "one must live with uncertainty, yet not be caught in hesitation."

Cognition also extrapolates or pluralizes revised beliefs to analogous contexts. When one masters a skill in a certain domain, that mastery experience carries over to others. Without generalization effects, only the target behavior will be affected. While this may be acceptable insofar as it goes, especially in refractory cases, exposure/response prevention will have limited success unless it also addresses adjacent beliefs (Arntz, 2002; Bryant et al., 2003). To continue with the example from §2.1.6, above, if a person with SAD starts mindlessly speaking up at meetings, that will not in and of itself change cognition. It simply is a form of unregulated exposure/response prevention. It may even become a form of escape/avoidance if she engages in it unthinkingly in order to avoid cognitive dissonance, a necessary precursor to extinction. The more that target behavior is effective as a form of escape/avoidance, the more difficult it will be to create a counteracting exposure/response prevention that precipitates belief revision. Reciprocally, some persons who hold severely dysfunctional beliefs or who are considerably emotionally dysregulated may lack the cognitive capacity to perform generalization operations (§4, below). In such cases, target behavior must be specified even more precisely, otherwise it will not be extinguished, or some other undesired behavior will be reinforced instead.

#### **AUTOMATIC NEGATIVE THOUGHTS, INTERMEDIATE BELIEFS, CORE BELIEFS**

How do cognitive restructuring and exposure/response prevention integrate with the epistemology of CBT? Received Beck-Ellis theory (Ellis, 1994; Beck, 2011) holds that doxastic agents have a hierarchy of automatic thoughts, intermediate beliefs and core beliefs. There now are several dozen recognized schools of CBT, all of which trace their provenance back to Beck and Ellis (Emmelkamp et al., 2010).

### *Automatic thoughts*

For Beck (2011), automatic thoughts are an undercurrent of cognitions and self-talk, subject to articulation on query or in response to an analogous simulation (Zanov and Davison, 2010). They rarely are conscious in the sense of a state one is aware of; however, they typically are accessible and available to other cognitive processes (van Gulick, 2004).

### *Intermediate beliefs*

Automatic thoughts are linked to core beliefs by intermediate beliefs. Beck (2011) assumes the role played by intermediate beliefs is unproblematic (p. 205); however, they can be difficult to formulate and it is not clear anybody ever has held an intermediate belief. In principle they should be rules or assumptions in the form of conditional if-then statements such as: "If I (engage in rigid behavioral coping pattern), then (I'll be insulated from a core belief I'll experience as aversive)" or "Unless I (engage in rigid behavioral coping pattern), then (I'll be exposed to a core belief I'll experience as aversive)." For example, if one unexpectedly is running late for work because the bus is delayed, intermediate beliefs might be: "If I'm always on time for meetings, then I'm not inadequate" (or, "Unless I'm always on time for meetings, then I'm inadequate"). They should not, however, be idiographic. Thus, "If I'm on time for meetings, then I'll do well at work" is not a proper formulation of an intermediate belief. Rather, it is more of an expression of a particular coping style, connecting to an individual instance of behavior, not a pattern of behavior. Nor should intermediate beliefs be depersonalized. Thus, "People who frequently are late for meetings typically end up losing their jobs" also is not a proper formulation of an intermediate belief, because the outcome does not tie to a more generalizable core belief.

### *Core beliefs*

A core belief is not an actual thought in an epistemological sense. For example, if the automatic thought is "I'm running out of money," then the associated core belief might be, "One needs a lot of money in order to be safe," even though one never actually thinks that particular core belief. Uncovering it is cognitive restructuring's *raison d'être*. It is tempting to think of a core belief as an implicit conclusion derived from the application of a rule (an intermediate belief) to a premise (an automatic thought). All three are components of an information processing system (Beck, 2011, p. 33) or a way for people to "organize their experience in a coherent way in order to function adaptively" (Beck, 2011, p. 35).

Still, it is not clear what comprises a set of core beliefs. Is it just a single belief, or a set of multiple, interdependent beliefs? Although they acknowledge the possibility that there are many of them, all of the Beck-Ellis examples treat beliefs as singletons rather than as elements of belief sets. It seems implausible that individual beliefs, regardless of how entrenched, proximately cause (or explain) a complex phenomenon such as human behavior. It seems more likely that human behavior is the outcome of a dynamic, interactive network of beliefs (and that it reciprocally influences them).

It also is unclear just what causes what. Does a trigger– a real-world or imaginal event–activate core beliefs or automatic thoughts? Once set in motion, which causes which? Beck (2011) has little to say about the relationships between automatic thoughts, intermediate beliefs and core beliefs other than core beliefs "activate" automatic thoughts (p. 32) and "underlie" (p. 36) both them and intermediate beliefs. Intermediate beliefs "influence" one's view of the situation or event (p. 35), which "trigger" automatic thoughts (p. 38) (Beck apparently views these different verb formulations as synonymous).

#### **BELIEF REVISION–THREE AND ONLY THREE FUNDAMENTAL SYNTACTICAL OPERATIONS**

While CBT provides useful tools that can be used to induce or facilitate belief revision, such as cognitive restructuring or exposure/response prevention, the problems with Beck's (2011) formulation (§2.3, above) make clear that it falls short of explaining just how they do so. At best, from a clinical standpoint, they just "soften" a set of dysfunctional beliefs, or point out why individual beliefs are implausible (Beck) or illogical (Ellis). We contend that the process of belief revision in CBT can be better characterized using AGM<sup>12</sup>.

<sup>12</sup>Since their original (1985) paper, AGM theory has evolved and undergone significant further developments (Makinson, 2003; Costa and Pedersen, 2011; Gärdenfors, 2011). While there are other theories of belief revision (Fermé and Hansson, 2011), AGM is the one that has acquired the most traction in the literature. The concept of *k*, whether and how BEL represents or stands for a psychological state, all of the AGM postulates and all of the operations potentially performable on *k* have been discussed and challenged extensively. It is beyond the scope of this review to analyze these various permutations.

According to AGM, a person's knowledge base *k* comprises a number of individual beliefs, BEL1, BEL2, ... BEL*n*, which combine together to form belief sets. AGM provides a set of ecological rules for how beliefs dynamically evolve by examining the interaction effect of *k*<sup>1</sup>'s and *k*<sup>2</sup>'s respective belief sets at equilibrium points *t*<sup>1</sup> and *t*<sup>2</sup> during the process of belief revision. The problem AGM is trying to solve is to minimize the set of BELnew ∈ *k*<sup>2</sup> and the set of BELold ∉ *k*<sup>2</sup> *simultaneously*, so as to maximally preserve both *k*<sup>1</sup>'s and *k*<sup>2</sup>'s inductive cores. Compared with *k*<sup>1</sup>, *k*<sup>2</sup> is less subjectively distressing and leads to more adaptive or normative behavior.

This is interesting and important because it defines the necessary and sufficient conditions for belief revision–what has to happen and that is all that has to happen. It therefore specifies the minimum requirements necessary for successful cognitive restructuring or belief modification following exposure/response prevention. From a clinical standpoint, maybe this is all one can expect, particularly with difficult cases. It can accommodate a diverse belief set, limited only by one's strategies to interpret beliefs, semantically encode them by assigning them substantive propositional content (that "*x*") and then identify the resulting doxastic commitments, which gives it explanatory power. It deemphasizes the distinction between automatic thoughts, intermediate beliefs and core beliefs. All beliefs are targets for revision at any equilibrium point. This better explains the subjective phenomenological experience of belief revision. It also recognizes there are different related beliefs at *t*<sup>1</sup>, *t*<sup>2</sup>, etc. Some motivate behavioral change, e.g., *k*<sup>1</sup> = ("If I enact behavioral experiment *y* then *z* will happen"). Others reinforce it, e.g., *k*<sup>2</sup> following skills acquisition or exposure/response prevention = ("This is how the environment responded"). It is a dynamical system because it changes and evolves in real time. It is nonlinear because the "*x*" of BEL(*x*) is idiographic, idiosyncratic and unpredictable.
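AGM's minimal-change criterion, minimizing what is added and what is lost between *k*<sup>1</sup> and *k*<sup>2</sup> simultaneously, can be sketched in a few lines of Python. This is an illustration only: beliefs are modeled as opaque set elements, and the BEL labels are hypothetical.

```python
# Sketch: AGM's minimal-change criterion as set arithmetic.
# Beliefs are opaque labels (hypothetical); real AGM operates over
# logically closed sets of propositions.

def change_cost(k1, k2):
    """Return (beliefs added, beliefs removed) between two belief sets."""
    added = k2 - k1      # the BELnew elements of k2
    removed = k1 - k2    # the BELold elements no longer in k2
    return len(added), len(removed)

k1 = frozenset({"BEL1", "BEL2", "BEL3"})
k2 = frozenset({"BEL1", "BEL2", "BEL4"})  # BEL3 dropped, BEL4 gained

print(change_cost(k1, k2))  # (1, 1): one addition, one deletion
```

Among candidate successor sets, AGM prefers the *k*<sup>2</sup> with the smallest such cost, which is what preserves the inductive core.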

During belief revision, elements of belief sets are modified or replaced using three (and only three) fundamental syntactical operations, which are expansion (EXP); revision (REV); and contraction (CON). Particular beliefs are the semantics this architecture supports (Fermé and Hansson, 2011).

#### **EXPANSION (EXP)**

EXP is like adding a new belief without deleting any old ones. EXP (expressed as *k*<sup>1</sup> + BEL*x*) occurs when one accepts, acknowledges or incorporates a BELnew into *k*<sup>1</sup>. *k*<sup>2</sup> = (*k*<sup>1</sup> + BELnew): BELnew is added to *k*<sup>1</sup>; no BEL*x* ∈ *k*<sup>1</sup> is deleted or removed from *k*<sup>1</sup>; and on conclusion of belief revision, {(BEL1 ... BELn) ∪ BELnew} ⊆ *k*<sup>2</sup>, with the caveat that *k*<sup>2</sup> also is the smallest possible set containing (*k*<sup>1</sup> ∪ BELnew). Although it might be, BELnew does not necessarily have to be consistent with *k*<sup>1</sup>. Since AGM does not restrict the substantive propositional content "*x*" of BELnew (§1.3, above), it can have either ⊕ or ⊖ valence. If it has ⊕ valence (BEL*x*<sup>⊕</sup>), then it contributes to cognitive restructuring at *t*<sup>2</sup>. If it has ⊖ valence (BEL*x*<sup>⊖</sup>), then either it does not contribute to cognitive restructuring, or may even reinforce *k*<sup>1</sup>.

For this reason, EXP might be confusing for an AGM agent. BELold remain as elements of her belief set, even as they are joined by BELnew, which can be BEL<sup>⊕</sup>, BEL<sup>⊖</sup> or ambiguous. To continue with our previous example, the trigger is running late for a meeting at work because one's bus is late. Under such circumstances, one's beliefs might be: BEL1 ("My boss is going to get angry"), BEL2 ("My colleagues will disrespect me") and BEL3 ("My opinion doesn't count"). One then acquires a new belief BEL4 ("I need this paycheck to support myself"). BEL4 is not inconsistent with {BEL1, BEL2, BEL3}. For these reasons, we hypothesize that it is unlikely EXP alone will result in successful cognitive restructuring or belief consolidation following exposure/response prevention. **Figure 3** depicts this outcome.

#### **REVISION (REV)**

REV is like adding a new belief and deleting old, inconsistent ones. As with EXP, REV (expressed as *k*<sup>1</sup> ∗ BEL*x*) occurs when one accepts a BELnew or admits it to one's *k*<sup>1</sup> knowledge base. *k*<sup>2</sup> = (*k*<sup>1</sup> ∗ BELnew): BELnew is added to *k*<sup>1</sup>; on conclusion, {(BEL1 ... BELn) ∪ BELnew} ⊆ *k*<sup>2</sup>. The main difference between REV and EXP is that with REV, any BELold inconsistent with BELnew must be *deleted* from *k*<sup>1</sup> so that *k*<sup>2</sup> remains consistent.
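
A minimal sketch of REV under the same toy model (our own construction; it ignores logical closure and the entrenchment orderings that, in full AGM, determine *which* inconsistent beliefs to retract). Here each belief is a (proposition, polarity) pair, so inconsistency is simply the same proposition with opposite polarity:

```python
# Illustrative sketch of AGM revision (REV): add BELnew, then delete old
# beliefs that contradict it so the resulting set stays consistent.
# Beliefs are (proposition, polarity) pairs; names are our own.

def revise(k, bel_new):
    prop, polarity = bel_new
    # CON step folded in: drop any old belief asserting the opposite polarity
    consistent = {b for b in k if not (b[0] == prop and b[1] != polarity)}
    return consistent | {bel_new}          # EXP step: add the new belief

k1 = {("boss will be angry", True), ("my opinion counts", False)}
k2 = revise(k1, ("boss will be angry", False))   # BELnew from new experience

assert ("boss will be angry", True) not in k2    # inconsistent BELold deleted
assert ("boss will be angry", False) in k2       # BELnew accepted
assert ("my opinion counts", False) in k2        # unrelated beliefs retained
```

The deletion step is what distinguishes this from `expand` above: revision is non-monotonic, which is why we treat it as the paradigm case of therapeutic belief change.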

#### *Pragmatic Closure*

*k* is "logically closed" if it represents *all* of one's beliefs, even though they may be difficult or impossible to specify. Every BEL logically derivable from *k* already ∈ *k*, i.e., *k* includes not only BEL but also all BEL consequences. Stand-alone beliefs sometimes are referred to as "basic beliefs" and consequences as "derived beliefs"–those beliefs one is epistemically committed to hold, even though one might not actively do so (Gabbay et al., 2010). Since *k*<sup>1</sup> is logically closed in this sense, only *one* anomalous BEL(*x*) is sufficient to create inconsistency; an inconsistent *k*(*x*) sometimes is notated as *k*(*x*)⊥. In this respect, REV incorporates the concept of conformity (§1.6.1, above)<sup>13</sup>.

#### *Frame of discernment*

To some extent the problem of logical closure is solved by the concept of "frame of discernment." The domain of all possible beliefs must be truncated in order to engage in practical inference and reason from belief to action. One's frame of discernment is the set of all of the beliefs comprising *k* that are useful to answer, in a practical context, the question of what one believes. It is notated Θ, where (BEL ∈ Θ ∈ *k*); we might say one's Θ is "pragmatically closed" in order for one to be able to function effectively in the world. Example: when one adopts the set Θ<sup>1</sup> = {red, white, yellow} as the frame for the question "What color rose is Bill wearing today?" one formalizes the variable *x* with those possible values. The frame Θ<sup>2</sup> = {white, blue} might answer the question "What color shirt is Bill wearing today?" The frame for the conjoined question "What color rose and what color shirt is Bill wearing today?" is Θ<sup>1</sup> × Θ<sup>2</sup> = {(red, white), (red, blue), (white, white), (white, blue), (yellow, white), (yellow, blue)} (Liu et al., 1991). The frame of discernment narrows a potentially unwieldy set of beliefs down into something more pragmatically serviceable<sup>14</sup>.
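
The conjoined frame is simply the Cartesian product of the two component frames, which can be reproduced in a few lines (the variable names `theta1`/`theta2` are our own):

```python
from itertools import product

# The rose/shirt frames from the Liu et al. (1991) example above.
theta1 = ["red", "white", "yellow"]   # frame for "What color rose...?"
theta2 = ["white", "blue"]            # frame for "What color shirt...?"

# The frame for the conjoined question is the Cartesian product theta1 x theta2.
conjoined = list(product(theta1, theta2))

assert len(conjoined) == 6            # 3 x 2 joint possibilities
assert ("yellow", "blue") in conjoined
```

`itertools.product` enumerates the pairs in exactly the order listed in the text, illustrating how conjoining questions multiplies, rather than adds, the possibilities a frame must discriminate.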

To continue with our earlier example, let's say that at *k*<sup>2</sup> one has acquired BELnew<sup>⊕</sup> ("The last time I was late for work, my boss was understanding"). Because it is BEL<sup>⊕</sup>, it is inconsistent with {BEL1, BEL2, BEL3}. The objective of cognitive restructuring or belief consolidation following exposure/response prevention is for *k*<sup>1</sup> to be *in*consistent with *k*<sup>2</sup>. It follows that BELold should be BEL<sup>⊖</sup> and BELnew should be BEL<sup>⊕</sup>; otherwise, there would not be any therapeutic change. Cognitive restructuring is teleological in that it is undertaken with a specific objective in mind, which is belief change and resulting behavior modification. For these reasons, we hypothesize that REV is the paradigm case of successful cognitive restructuring (see **Figure 4**).

#### **CONTRACTION (CON)**

CON is like deleting an old belief without adding any new ones. CON (expressed as *k*<sup>1</sup> ÷ BEL*x*) is when one rejects a BELold or deletes it from her knowledge base. *k*<sup>2</sup> = (*k*<sup>1</sup> − BELold): *k*<sup>2</sup> supersedes *k*<sup>1</sup>; *k*<sup>2</sup> ⊆ (*k*<sup>1</sup> | *k*<sup>2</sup> − BELold); but from *k*<sup>2</sup> no (BEL*x* ∈ *k*<sup>1</sup>) has been unnecessarily deleted. Because a BEL has been deleted from one's *k*<sup>1</sup> belief set, CON is a process of

<sup>14</sup>A related concept is partition dependence, which is the psychological pattern of how one divides up a set of possible outcomes into particular events. Doing so influences the perceived likelihood those events will occur. Combining events into a common partition lowers their perceived probability. Conversely, unpacking events into separate partitions increases their perceived probability (Sonnemann et al., 2013). For example, apocryphally, Eskimos have numerous words for "snow," because that phenomenon allegedly is far more prevalent where they live than elsewhere (Martin, 1986). They need a vocabulary with greater subtlety and nuance to describe its various aspects. This in turn increases the probability an event will be interpreted as snowlike, because a set of phenomena (e.g. cold wet stuff falling from the sky) with its associated beliefs (e.g. if you stay out in it too long, you will freeze) has been parsed out into separate partitions. Rabinovich et al. (2014, p. 1) recently characterized this as "chunking"–a dynamical strategy agents use to "perform information processing of long sequences by dividing them in shorter information items" thereby making "more efficient use of short-term memory by breaking up long strings of information."

<sup>13</sup>There are several other possible operations one can perform using REV: "partial meet revision" and "transitively relational partial meet revision." We do not cover these here. Logical closure may be unrealistic in a real-world environment, because one might not recognize derived beliefs, even if they are specified. One draws on numerous other beliefs, facts, assumptions and knowledge about the world in order to function effectively within it. It is unlikely one ever is in command of all possibly relevant evidence pertaining to a belief or beliefs. It most likely would be impossible to specify fully all of the beliefs comprising one's knowledge base, a project that in effect would require axiomatizing all human knowledge (Dreyfus, 1992; Shanahan, 2009).

"epistemic entrenchment." In rejecting BELold, one also may have to disavow other BEL*x* that imply or are implied by it. Which beliefs should be deleted? From the standpoint of CBT:


We hypothesize that CON is the most problematic maneuver for an AGM agent, because its contribution to cognitive restructuring depends on whether it operates on a BEL<sup>⊕</sup> or a BEL<sup>⊖</sup>. If the BEL being deleted are BEL<sup>⊖</sup>, then the remainders will be BEL<sup>⊕</sup>. This corresponds with the intuitive requirement that successful cognitive restructuring should eliminate dysfunctional BEL<sup>⊖</sup>, while leaving BEL<sup>⊕</sup> alone. On the other hand, it also illustrates a way in which cognitive restructuring might backfire, for example, if one is so committed to a BEL<sup>⊖</sup> that a BEL<sup>⊕</sup> is deleted as a consequence. If the belief that is being deleted is a BEL<sup>⊕</sup>, then the remainders all may end up being BEL<sup>⊖</sup>, because they are well-entrenched. An example might be recovery following extinction using a classical conditioning model, which occurs when *k*<sup>1</sup> ⊆ {(*k*<sup>1</sup> ÷ BELnew) + BELold}. This means that if BELold had been deleted from one's belief set during belief revision, but one somehow readopted or reincorporated it, then the effect of cognitive restructuring would be reversed. Or, the BEL set ∈ *k*<sup>2</sup> could be an ambiguous mixture of both BEL<sup>⊕</sup> and BEL<sup>⊖</sup>, in which case cognitive restructuring would only be partially successful. Building on our previous examples, **Figure 5** illustrates an instance of successful belief revision using CON.
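
A toy sketch of CON under the same model (our own construction; full AGM contraction uses remainder sets and entrenchment orderings to decide which implying beliefs must also be surrendered). The optional `implies` mapping is a hypothetical stand-in for those doxastic commitments:

```python
# Illustrative sketch of AGM contraction (CON): delete BELold without adding
# anything new, removing no more than necessary. The `implies` mapping is a
# hypothetical registry of which beliefs entail the contraction target.

def contract(k, bel_old, implies=None):
    """k2 = k1 - BELold; also disavow beliefs registered as implying it."""
    implies = implies or {}
    doomed = {bel_old} | implies.get(bel_old, set())
    return k - doomed

k1 = {"I am worthless", "My opinion doesn't count", "My boss respects me"}

# Contracting a BEL with negative valence; a belief recorded (hypothetically)
# as implying it must be disavowed as well.
k2 = contract(k1, "I am worthless",
              implies={"I am worthless": {"My opinion doesn't count"}})

assert "I am worthless" not in k2
assert "My boss respects me" in k2   # the positively valenced belief survives
```

Note that nothing is added: if the agent later re-expands with the deleted belief, the contraction is undone, which is the spontaneous-recovery risk described above.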

# **INTEGRATING AGM INTO A THEORY OF NON-LINEAR DYNAMICAL BELIEF REVISION**

We conceptualize belief revision using AGM as an emergent property of a complex, self-organizing system involving huge numbers of neurons broadly distributed throughout different brain regions, including the prefrontal cortex (PFC), Broca's area and Wernicke's area (Cogan et al., 2014). There is now a considerable body of research imaging regions of the brain activated by BEL(*x*), starting approximately with Greene et al. (2001) and continuing through Harris et al. (2008) and d'Acremont et al. (2013). Other studies examine brain regions activated by semantic processing–the words in which beliefs are expressed. Huth et al. (2012) used WordNet (§1.8.2, above) to identify 1705 object and action categories from several hours of nature movies. When they presented them to research participants undergoing fMRI, they were able to map semantic selectivity into smooth gradients covering much of the cortex. Crangle et al. (2013) presented their research participants with 48 spoken-word and visual depictions of sentences about the geography of Europe, half of which were true and half of which were false. They used WordNet and LSA (§1.8.2, above) to extract and classify their propositional content–the *x* in BEL(*x*). The resulting semantic processing was associated with characteristic features of EEG recordings. Costanzo et al. (2013) presented research participants undergoing fMRI with 140 line drawings or pictures of objects (visual stimuli) together with corresponding nouns spoken aloud (auditory stimuli). They found that both converged and were processed in the same regions of the brain during superordinate semantic categorization.

Semantic memory long has been recognized as a fundamental component of human cognition (McRae and Jones, 2013). It is "general knowledge about the world, including concepts, facts and beliefs" and is acquired through experience, thereby "grounding knowledge in distributed representations across brain regions that are involved in perceiving or acting" (Yee et al., 2014, p. 353). Semantic network structure plays a key role in the formulation of ideas and the ways in which they are combined and conceptually associated (Goñi et al., 2011; Marupaka et al., 2012). It accommodates both abstract concepts and concrete ones, the former associated with the medial PFC and the superior temporal sulcus, the latter associated with the bilateral intraparietal sulcus (Wilson-Mendenhall et al., 2013). It represents cognitive information either as specific autobiographical episodes or more general semantic knowledge, each with different subjective experiences (Heisz et al., 2014). Rabinovich et al. (2012b, p. 81) characterize it as a "space of interconnected information items," where "each item [is a separate] dynamical element" and "the dynamics of thinking (or consciousness) is a flow in a semantic space."

<sup>15</sup>There are several other possible operations one can perform using CON, including "transitively relational partial meet contraction." We do not cover these, here.


This body of work supports a conclusion that {BEL | EVID} is not a specific topological location or ontogenetic landscape within the brain. Rather, it is a type of neural activity or pattern of activation that occurs within a comprehensive neural system. When one believes something, one enters into a series of hybrid doxastic/semantic states, which can be functionally represented as a non-linear, dynamical process–a belief revision network occurring in a global workspace–such as that depicted at **Figure 6** (while **Figure 6** depicts a two-dimensional surface, it should be understood as a multi-dimensional space; **Figure 7** depicts an alternative perspective).

This conceptualization also requires rethinking the relationship between beliefs and semantics. Unlike an fMRI or EEG recording depicting brain activity, a belief set cannot be described as a geometrical object or in statistical terms. Rather, it is an encoded set of semantic propositions, embodying emergent semantic properties in its very organization (Juarrero, 1999). A belief set creates an internal symbolic mental representation based on one's assessment of its conditions of satisfaction (§1.5, above); one can imagine the conditions of satisfaction being enacted or realized<sup>16</sup>. It interacts with other brain regions responsible for perception, cognition, emotion, language and behavior. They are embedded within a manifold or phase plane together with physiological assets such as blood flow and oxygen. The phase plane is in a constant state of flux, flexibly changing in response to environmental constraints and internal demands (Kelso, 1999). Belief revision is a dynamic pattern of activity occurring within the phase plane.

Some beliefs initially are stored in long-term memory. These most likely are enduring, persistent beliefs about self, others, world and future; background or network beliefs of the sort described at §1.9, above; and core beliefs of the sort described at §2.3.3, above. They are recalled into short-term memory in response to decision points, environmental affordances and outcomes, and other multiple attractors. The network's attractors constitute a "self-organized space with emergent properties that can only be characterized as semantic" because they "embody [word] meaning[s] or sense[s] in the organization

<sup>16</sup>Mental images are controversial (for a summary of recent work, see Doumas and Hummel, 2012; Markman, 2012; Reisberg, 2014; and Shea, 2013). We are not committed to a theory that one creates actual, static mental representations in the brain. They are not pictures; rather, they are "depictive representations interpreted by cognitive processes at play in other systems" (Borst, 2014, p. 84). They have "several levels of complexity, from sparse, atomic concepts to complex, knowledge intensive ones" (Rips et al., 2012, p. 177). An agent's behavior must be flexible in order to respond to her circumstances, and mental representations play an important role in enabling her to do so (Egan, 2012, p. 250). Perception, for example, may be more of a process whereby a perceiver skillfully interacts with her environment. The real world presents far too much information for the perceiver to encapsulate it in an isomorphic mental image. Rather, it is like a gigantic external memory, supplying a series of cues, which the perceiver can access as necessary (Noë, 2004). We do not, of course, contend that one literally perceives the words comprising the semantic formulation of one's belief set (in a manner similar to the way the Arnold Schwarzenegger character in the movie Terminator 3 was able to scroll through different belief-action options before selecting a particular alternative).

of the relationships that constitute the higher-dimensional space" (Juarrero, 1999, p. 167). Initially, the phase plane represents all possible states of the belief-generating and belief-revision systems. It has a large number of degrees of freedom. It is unstable in that small changes to initial conditions–both perceived and imaginal–have the potential to become radically amplified, resulting in any number of different multi-stable belief sets. While the output belief set at *kn* depends to some extent on the input belief set at *k*<sup>1</sup>, the relationship is asymmetrical, and *kn* cannot be reliably predicted from *k*<sup>1</sup>. Arguably, it exhibits chaotic dynamics because it would be difficult to specify the individual beliefs comprising the belief set as it evolves into novel and surprising states that are unexpectedly both deterministic and stochastic (non-deterministic) (Nicolis and Prigogine, 1989).

The belief revision system is transient. At *t*<sup>1</sup>, all possible belief trajectories (starting with the system's initial conditions) intersect the phase plane in a structure similar to a Poincaré surface. As it evolves forward in time, it is bombarded with evidence–information derived from its interactions with the environment and subsequent interpretations. It becomes destabilized and undergoes non-equilibrium, dissipative phase transition. Individual beliefs traverse each attractor's basin of attraction and converge into specific belief sets, which consolidate at saddle equilibrium points {*t*1, *t*<sup>2</sup> ... *tn*}. They can be conceptualized as a form of Mandelbrot fractal. Broader attractor basins capture or entrain a wider range of beliefs, depending on their strength. Because of the system's chaotic dynamics and each point's turbulent behavior, they resemble strange attractors. Convergence results in heteroclinic binding (Rabinovich et al., 2010b) of different evidentiary data to individual beliefs, which recruit resources and attempt to gain priority using the preference function γ as described at §1.8.1, above. The system bifurcates as new beliefs are formulated based on {BEL | EVID} (§1.1, above), revised conditions of satisfaction (§1.5, above), new evidence/information received as a result of interactions with the environment (§2.2, above), and associated evaluative processes.

Belief revision occurs as belief sets sequentially progress or are deflected from one metastable state to another, forming a heteroclinic channel. The separatrices are ridges defining its boundaries. They constrain the flow of resources available to each belief set by modifying the phase plane or the possible trajectories of movements within it. As one belief set begins to dominate, it acquires and sustains coherence, crowding out the semantic space potentially accessible to other beliefs. At some point it reaches critical mass and overcomes an inertial threshold, compelling its migration from *t*<sup>1</sup> to *tn*. During this process, the *k*<sup>1</sup> belief set competes with the *k*<sup>2</sup> belief set (then *k*<sup>2</sup> with *k*3, etc.) to alter its composition using CON, EXP, or REV, either in response to cognitive restructuring or exposure/response prevention with associated environmental feedback, followed by belief revision.

Since the individual beliefs comprising each belief set displace each other (using CON, EXP, or REV), this is a zero-sum, inhibitory process. The sequence of equilibrium points in the heteroclinic channel forms a heteroclinic belief revision network. This process typically remains non-conscious until *tn*, when elements of the belief set acquire salience or otherwise are extracted using typical CBT clinical techniques and protocols<sup>17</sup>. The combination of non-linearity and non-equilibrium, context-sensitive constraints initially permits multiple solutions, which have the potential to emerge from and be expressed within a diversified assortment of behaviors (Nicolis and Prigogine, 1989). Numerous beliefs compete in a kind of winnerless competition (Rabinovich et al., 2010a). As it stabilizes, though, the belief revision network appropriates a single behavioral output channel. The behavior semantically satisfies the intentions motivating it (the conditions of satisfaction of the associated belief sets, §1.5, above). Upon its conclusion at *tn*, the reformulated beliefs comprising the *kn* belief set are inserted (or reinserted) back into long-term memory. The behavioral stream transfers to an adjacent nonlinear dynamical system for action. Since emotion regulation also plays an important role in belief revision (Boden and Gross, 2013), associated emotions also are reregulated (§2.1.4, above)<sup>18</sup>.
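
Winnerless competition in Rabinovich's sense is typically modeled as a generalized Lotka-Volterra system whose asymmetric inhibition produces sequential switching of dominance along a heteroclinic channel. The following minimal sketch is our own construction (the parameters and the reading of the three variables as competing "belief" activations are illustrative only):

```python
# A toy generalized Lotka-Volterra system with asymmetric inhibition,
# illustrating winnerless competition: no single variable wins outright;
# dominance passes sequentially between them (our own parameters).

def simulate(a, rho, dt=0.01, steps=10_000):
    history = []
    for _ in range(steps):
        # da_i/dt = a_i * (1 - sum_j rho[i][j] * a_j)
        da = [a[i] * (1.0 - sum(rho[i][j] * a[j] for j in range(3)))
              for i in range(3)]
        a = [max(a[i] + dt * da[i], 1e-9) for i in range(3)]
        history.append(a)
    return history

# Asymmetric inhibition: each "belief" is suppressed strongly by one
# competitor and weakly by the other, which yields sequential switching.
rho = [[1.0, 1.5, 0.5],
       [0.5, 1.0, 1.5],
       [1.5, 0.5, 1.0]]

hist = simulate([1.0, 0.02, 0.01], rho)
winners = {max(range(3), key=lambda i: a[i]) for a in hist}
assert len(winners) >= 2   # dominance switches; no belief wins permanently
```

Each transient winner corresponds to a metastable equilibrium point; the itinerary between them is the heteroclinic channel described in the text.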

Cognition and behavior comprise a single autocatalytic unit and it is difficult to assess their respective influences at any *tn*. Neurocognitive methods do not yet have sufficient precision to discriminate between the two (Morrison and Knowlton, 2012). There are no studies persuasively isolating the cognitive component from the behavioral one. Both require selective deployment of attentional, cognitive and affective resources. Unless belief revision were assessed immediately following cognitive intervention, before enactment of any behavior, it would not be possible to isolate the floor effect of cognitive change and control for reinforcement effects, because cognitive change already would be in the process of being incrementally reinforced (for an early and unpersuasive attempt to do so based on the concept of "self-focused attention," see Wells, 2006). Any kind of change arguably results in a form of behavior. A recent study on the efficacy of mindfulness-based cognitive therapy (Kuyken et al., 2010)–seemingly, the paradigm case of a cognitive intervention– correctly noted that "these interactive mediation effects indicate that treatment changes the nature of the relationship between cognitive reactivity and outcome" (p. 1110).

What we can say is that together, they comprise a heterogeneous, self-organized, complex adaptive system (Juarrero, 1999) (in this sense, realizing Beck's concept of cognition as an information processing system, §2.3.3, above). Both are temporally and contextually embedded, exchanging information and energy with each other depending on the task at hand, the level of one's skills or expertise to accomplish it, and feedback from the environment. Structure and patterns emerge from repeated cycling involving the cooperation of many individual parts (Thelen and Smith, 2000). Although the system initially is out of equilibrium, with high entropy, it self-organizes by assuming a structure allowing it to operate more efficiently (Guastello and Liebovitch, 2009). Repeated behavioral stimulation and learning history facilitate signal transmission between neurons. Neural plasticity promotes Hebbian-type long-term potentiation, which in turn cascades into further hybrid cognitive-behavioral activation and reinforcement, strengthening attractors and facilitating the development of more predictable belief trajectories within the semantic phase plane. "Through repeated activation of a pattern the connections between units that are activated simultaneously become stronger and the whole pattern becomes an attractor." Thus, even if only partially activated, "the network can complete the pattern by a process of iterative spreading activation" so "the previously learned pattern is recovered in a number of updating cycles in which the activation level of each unit is adjusted according to the activation levels of the other units and the strength of the connections between the units" (Pecher, 2013, p. 359). As a result, conditions of satisfaction (§1.5, above) are revised, together with their corresponding internal symbolic mental representations (§1.1, above). These brain-environmental interactions comprise a negative feedback

<sup>17</sup>In this we are in accord with Dehaene (2014, p. 8) and Searle (1992, p. 152) to the effect that "The notion of an unconscious mental state implies accessibility to consciousness. We have no notion of the unconscious except as that which is potentially conscious." Metaphorically, beliefs are like objects within a multi-dimensional hologram; at any given time we are able to observe only a small portion of them within a potentially vast space-time continuum. Our characterization of the belief-generation and belief-modification process does not implicate any particular theory of action or agency, other than the basic principle that behavior is the action-expression of belief.

<sup>18</sup>Though we disagree with Boden and Gross' naive model of how this works (pp. 591-2), which appears to be the result of reading too much literature on acceptance and commitment therapy (ACT).

loop if they increase the incidence of target behavior, and a positive one if they decrease it.
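
The attractor pattern-completion dynamics quoted from Pecher (2013) above can be sketched as a toy Hopfield-style network (our own construction): Hebbian weights make the repeatedly activated pattern an attractor, and iterative updating cycles recover it from partial activation.

```python
# Toy Hopfield-style sketch of Hebbian pattern completion (our construction).
# One pattern is "learned"; units co-active in it get stronger connections.

pattern = [1, 1, -1, -1, 1, -1, 1, -1]      # the previously learned pattern
n = len(pattern)
W = [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
     for i in range(n)]                      # Hebbian outer-product weights

state = list(pattern)
state[0] = -state[0]
state[3] = -state[3]                         # partial/corrupted activation

for _ in range(5):                           # updating cycles
    for i in range(n):
        # each unit's activation is adjusted according to the activation
        # levels of the other units and the connection strengths
        h = sum(W[i][j] * state[j] for j in range(n))
        state[i] = 1 if h >= 0 else -1

assert state == pattern                      # the learned pattern is recovered
```

Even with a quarter of the units flipped, iterative spreading activation pulls the state back into the learned attractor, which is the mechanism the quoted passage describes.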

From a clinical standpoint, many cognitive interventions (such as mindfulness) are inherently mental and remain thoroughly solipsistic even as they reinforce and are reinforced by new behavior. Many principles of acceptance and commitment therapy (ACT) are cognitively front-loaded, for example, using metaphor as a means of identifying and developing a valued direction and defusing from one's private mental experiences (Hayes et al., 2012). Other examples are motivational interviewing for substance abuse (Miller and Rollnick, 2012); cognitive behavioral analysis system of psychotherapy (CBASP) for depression (McCullough, 2000); and cognitive processing therapy for PTSD (Resick et al., 2002). Behavioral factors, on the other hand, more clearly dominate interventions such as behavioral activation for depression; exposure/response prevention treatment for obsessive-compulsive disorder or attention deficit disorder; and prolonged exposure therapy for PTSD (Foa et al., 2007). With its dual emphases on learning (cognitive) then applying (behavioral) skills, DBT for borderline personality disorder (§2.1.4, above; Linehan, 1993) lies somewhere in the middle.

In some instances behavioral therapy is a more plausible intervention than cognitive therapy, and vice versa. Unquestionably it is possible to train up organisms with little cognitive processing capacity to demonstrate learned behavior. A 700-kg alligator, for example, has a brain that would fit comfortably inside a teaspoon (Coulson and Herbert, 1981), yet it still is capable of learning in the sense of Squire and Kandel (1998)<sup>19</sup>. In principle, it would be amenable to behavioral therapy. At some point, though, higher-order propositions must be expressed using natural language or a natural language equivalent<sup>20</sup>. Without it, propositions would be neither true nor false; the concept of truth builds upon veridical experience. Nor would beliefs have conditions of satisfaction (§1.5, above), nor would psychopathological beliefs lack them (§1.6, above). Unlike behavior therapy, cognitive therapy depends on semantics. For this reason, as per §2.1.3, above, it is unclear whether persons with thought disorders can benefit from it (compare Grant et al., 2012 with Aggarwal and Basu, 2013; for a current overview, see Bachman and Cannon, 2012; and Jauhar et al., 2014). While of course outcomes lie on a continuum, arguably, it would be ineffective in principle for those toward the far end of the spectrum. If a person remains impervious to environmental feedback–she is unable to develop adaptive cognitions and activate belief revision–we are inclined to say that something is impeding the assimilation of new evidence, or that her information processing systems require recalibration. Functionally, she may be in a concrete operational stage, or otherwise incapable of abstract thought or metacognition. Having a theory of mind–being able to think about thoughts–may be a necessary component of psychological change (Saxe and Young, 2014).
One solution from an operant conditioning perspective might be to increase positive reinforcement (R⊕) or to titrate down punishment using negative reinforcement (R⊖) in order to upregulate the desired behavior, with a view toward mobilizing additional cognitive resources.

Most likely cognition and behavior shuttle back and forth quickly depending on the client's perceptions, emotions, language capability, attentional focus, the context in which behavior occurs, the nature of the transaction the client is having with her/his environment, experience/learning history, genetics, neurochemistry, interoceptive sensitivity, memory capacity, heuristics, intuition, vulnerabilities, intentions, skills, values, and a variety of other factors. Their different trajectories oscillate (Schultz and Heimberg, 2008) in what Rabinovich et al. (2010b) would characterize as a heteroclinic channel between metastable states. Because the brain is a complex system with a variety of different inputs and outputs, neither cognition nor behavior can be controlled in isolation (Ruths and Ruths, 2014). From a clinical standpoint, target behavior should progressively and dynamically diminish. As depicted at **Figures 8**, **9**, their relationship is transactional. The exact mix of each depends not only on the type of therapy but also on the stage of the therapeutic process. For example, the manic phase of bipolar disorder (DSM-5 §296.xx) might be more amenable to cognitive therapy, whereas the depressive phase might be more amenable to behavioral therapy (Leahy, 2005). Daugherty et al. (2009) characterized this as a Liénard oscillator with autonomous forcing. From the standpoint

**Figure 8 |** The transactional relationship between cognition and behavior–conceptualization 1.

<sup>19</sup>This is the double entendre behind the title of B.F. Skinner's famous paper "Superstition in the Pigeon" (1948). Superstition is a form of cognition, whereas pigeons only are capable of learned behavior.

<sup>20</sup>There is no bright-line test for this, either. The meaning of simple propositions can be enacted using language-like behavior, such as Quine's famous example of a speaker using ostension to point to a rabbit, while uttering the word "gavagai" to designate a rabbit-like stage or rabbit-like behavior (Quine, 1964).

of belief revision semantics, the theme of the substantive propositional content ("*x*") remains the same, even as the propositional attitude toward it changes, e.g., if the domain is "affection," then manic = "adorable" whereas depressed = "unlovable." Conceptually, behavioral reformulation and cognitive reconstruction serially propel it in a dynamic progression from *t*<sup>1</sup> through *tn* as different inhibitory and stimulating paradigms take effect. At some point in this process–an extremely interesting one from the standpoint of cognitive science–their trajectories intersect and one transitions into the other. Both are active ingredients of therapeutic change.
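
The Liénard characterization by Daugherty et al. (2009) can be made concrete with the canonical Liénard system, the van der Pol oscillator, x″ − μ(1 − x²)x′ + x = 0. The following unforced sketch is our own construction, not the authors' actual model; reading x > 0 and x < 0 as the two mood phases is purely illustrative:

```python
# A minimal van der Pol (Liénard-type) oscillator, integrated with explicit
# Euler steps; its self-sustained limit cycle loosely illustrates cycling
# between two phases (our sketch, not the Daugherty et al. model).

def van_der_pol(mu=1.0, x=0.5, v=0.0, dt=0.001, steps=60_000):
    xs = []
    for _ in range(steps):
        a = mu * (1.0 - x * x) * v - x    # acceleration from the Lienard form
        x, v = x + dt * v, v + dt * a
        xs.append(x)
    return xs

xs = van_der_pol()
sign_changes = sum(1 for p, q in zip(xs, xs[1:]) if p * q < 0)
assert sign_changes >= 4                  # sustained cycling between phases
assert max(abs(x) for x in xs) < 3.0      # bounded limit-cycle amplitude
```

Whatever the initial condition, trajectories are drawn onto the same limit cycle, which is what makes the oscillator a natural metaphor for recurring phase alternation.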

# **CONCLUSION**

The ultimate goal of cognitive restructuring or belief consolidation following exposure/response prevention should be a thorough overhaul of a meaningful subset of one's entire belief system. Simply inducing doubt is not sufficient. An example of such a paradigm shift might be a prisoner on death row who is exonerated by new DNA evidence, resulting in radical reformation of her knowledge base, or Dostoyevsky's experience in front of a mock firing squad (Bloom, 2005). This is every bit as profound and disruptive as the transition from Ptolemaic astronomy to Copernican astronomy, or from Newtonian physics to Einsteinian physics, or through the so-called three waves of cognitive behavioral therapy (Hayes, 2004). Thomas Kuhn (1962/2012) labeled these "scientific revolutions"–on an individual level, they might be labeled "personal revolutions."

In addition to making a case for AGM, one of our main objectives in this review has been to illustrate a point of intersection between cognitive science and clinical psychology, two fields which long have enjoyed an uneasy *rapprochement* (Macleod, 2010). "The study of psychopathology has... become an important facet of the cognitive sciences, and the cognitive sciences have, in turn, exerted an important influence on many regions of psychiatry" (Cratsley and Samuels, 2013, p. 413). One of the characteristics of many cognitive science theories is that while each step of the argument makes sense, when viewed as a complete chain of inferential reasoning, the transition from premises to conclusion may be implausible, in a C.P. Snow (1959/2012) type sense. Like a salmon swimming upstream, one ends up in a very small pond. Clinical psychology, in turn, depends operationally on protocols that first were devised over a quarter of a century ago. The prospects for *détente* are not as far-fetched as they initially might seem. For example, on April 1, 2014, the Max Planck Society announced a €5 million investment in a new center for computational psychiatry to be based in London and Berlin, with a view toward uncovering relationships between cognition and psychopathology of the sort we hypothesize (Siddique, 2014).

We submit that the best way to think of our initiative is as an exercise in translational research. It applies a form of nonlinear analysis to the study of complex systems in cognitive science and behavioral analysis. Even though it may not exactly mirror actual, common-sense psychological activity, logical reasoning should "clarify, sharpen, systematize the purely semantic-level characterization of the demands on any such implementation, biological or not" (Dennett, 1984/2006, p. 449); it should "provide an account of our cognitive architecture–which specifies the basic operations, component parts, and organization of the mind" (Samuels, 2012). It also demonstrates how recent work in experimental cognitive science can be combined with clinical psychology to inform the process of psychological change.

# **REFERENCES**


man and shrew. *Comp. Biochem. Physiol.* 69A, 1–13. doi: 10.1016/0300-9629(81)90632-0


Damasio, A. (1999). *The Feeling of What Happens*. New York, NY: Harcourt, Inc.


*of Psychology: Contemporary Readings*, ed J. L. Bermúdez (New York, NY: Routledge), 433–454. Reprinted in (1998). *Brainchildren: Essays on Designing Minds*, D. C. Dennett (Cambridge, MA: MIT Press), 181–206.


Gleitman, L., and Papafragou, A. (2013). "Relations between language and thought," in *The Oxford Handbook of Cognitive Psychology*, ed D. Reisberg (Oxford: Oxford University Press), 504–523.

Glymour, C. (1975). Relevant evidence. *J. Philos.* 72, 403–426. doi: 10.2307/2025011


Hartley, C. A., and Phelps, E. A. (2012). Anxiety and decision-making. *Biol. Psychiatry* 72, 113–118. doi: 10.1016/j.biopsych.2011.12.027


Lecher, C. (2014). Stephen Wolfram wants to make computer language more human. *Popular Science*. Available online at: http://www.popsci.com/article/technology/stephen-wolfram-wants-make-computer-language-more-human

LeDoux, J. (1996). *The Emotional Brain*. New York, NY: Touchstone.


Musil, R. (1930-43). *Der Mann Ohne Eigenschaften*. Vienna, AU: Rowohlt Verlag.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 January 2014; paper pending published: 21 February 2014; accepted: 24 April 2014; published online: 15 May 2014.*

*Citation: Kronemyer D and Bystritsky A (2014) A non-linear dynamical approach to belief revision in cognitive behavioral therapy. Front. Comput. Neurosci. 8:55. doi: 10.3389/fncom.2014.00055*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Kronemyer and Bystritsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Characterizing psychological dimensions in non-pathological subjects through autonomic nervous system dynamics

Mimma Nardelli <sup>1</sup> , Gaetano Valenza<sup>1</sup> \*, Ioana A. Cristea2, 3, Claudio Gentili <sup>2</sup> , Carmen Cotet <sup>3</sup> , Daniel David<sup>3</sup> , Antonio Lanata<sup>1</sup> and Enzo P. Scilingo<sup>1</sup>

<sup>1</sup> Department of Information Engineering & Research Centre E. Piaggio, Faculty of Engineering, University of Pisa, Pisa, Italy, <sup>2</sup> Section of Psychology, Department of Surgical, Medical, Molecular, and Critical Area Pathology, University of Pisa, Pisa, Italy, <sup>3</sup> Department of Clinical Psychology and Psychotherapy, Babes-Bolyai University, Cluj-Napoca, Romania

#### Edited by:

Tobias Alecio Mattei, Brain & Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA

#### Reviewed by:

Jianbo Gao, Wright State University, USA

Sara Bottiroli, IRCCS Mondino, Italy

#### \*Correspondence:

Gaetano Valenza, Department of Information Engineering, University of Pisa, via Caruso 16, 56126 Pisa, Italy g.valenza@ieee.org

Received: 11 November 2014 Paper pending published: 22 December 2014 Accepted: 06 March 2015 Published: 25 March 2015

# Citation:

Nardelli M, Valenza G, Cristea IA, Gentili C, Cotet C, David D, Lanata A and Scilingo EP (2015) Characterizing psychological dimensions in non-pathological subjects through autonomic nervous system dynamics. Front. Comput. Neurosci. 9:37. doi: 10.3389/fncom.2015.00037

The objective assessment of psychological traits of healthy subjects and psychiatric patients has attracted growing interest in clinical and bioengineering research fields during the last decade. Considerable experimental evidence strongly suggests that a link between Autonomic Nervous System (ANS) dynamics and specific dimensions such as anxiety, social phobia, stress, and emotional regulation might exist. Nevertheless, an extensive investigation of a wide range of psycho-cognitive scales and non-invasive ANS markers gathered from standard and non-linear analysis still needs to be addressed. In this study, we analyzed the discerning and correlation capabilities of a comprehensive set of ANS features and psycho-cognitive scales in 29 non-pathological subjects monitored during resting conditions. In particular, state-of-the-art standard and non-linear analyses were performed on the Heart Rate Variability, InterBreath Interval, and InterBeat Respiration series, considered as both monovariate and multivariate measurements. Experimental results show that each ANS feature is linked to specific psychological traits, and that non-linear analysis outperforms standard analysis in the psychological assessment. Considering that current clinical practice relies only on subjective scores from interviews and questionnaires, this study provides objective tools for the assessment of psychological dimensions.

Keywords: psychological scales, Heart Rate Variability, InterBreath Intervals series, nonlinear analysis, multiscale entropy, multivariate multiscale entropy

# 1. Introduction

Psychological assessment refers to the practice of standardized evaluation of performance or impairment in different domains of thinking, learning, and behavior. Accordingly, such an assessment can be used to characterize and quantify different behaviors in healthy subjects or to reveal the presence of behavioral disorders such as anxiety and social phobia. Depending on the factors under observation, psychological assessment can be achieved via different routes: behavioral tasks, questionnaires, or interviews. The evaluation is carried out by a professional (e.g., a certified psychologist) in order to obtain standardized and quantifiable information about the subject under study (Cohen et al., 1992). These approaches are useful in performing an individual assessment, in which the performance of one person is interpreted through pre-existing norms, as well as in group assessment, which allows for different comparisons (within a single group or between groups) (Kenny et al., 2008). It is worthwhile noting that self-report questionnaires and interviews currently represent the standard clinical practice in diagnosing psychiatric disorders (Cohen et al., 1992; Valenza et al., 2013a, 2014c).

Nevertheless, several issues in these kinds of approaches still need to be addressed. First, the scores are obtained with subjective procedures, which might be biased by the subject's social desirability concerns or by recent emotional events. Moreover, professionals need to choose the appropriate test for each psychological dimension and subject, and verify that it has good psychometric properties (i.e., reliability and validity) in order to adhere to the evidence-based paradigm (Groth-Marnat, 2003; Hunsley and Mash, 2010). To overcome these problems, several efforts have been made in the psycho-physiological and bioengineering research fields to objectify the psychological assessment. In particular, physiological correlates of the central and autonomic nervous systems (CNS and ANS, respectively) have been extensively studied and taken into account (Taillard et al., 1990, 1993; Carney et al., 1995; Glassman, 1998; Stampfer, 1998; Iverson et al., 2002, 2005; Watkins et al., 2002; Calvo and D'Mello, 2010; Lin et al., 2010; Petrantonakis and Hadjileontiadis, 2011; Valenza et al., 2012a,b, 2013a,b, 2014c).

To give some significant examples, physiological correlates of mood disorders such as bipolar disorder have been found in sleep (Stampfer, 1998; Iverson et al., 2002, 2005), the hormonal system (Carney et al., 1995; Glassman, 1998; Watkins et al., 2002), and ANS dynamics through heartbeat and respiratory dynamics (Taillard et al., 1990, 1993; Valenza et al., 2013a, 2014c). Moreover, as psychological dimensions can be related to variations of emotional states, several computational methods for automatic emotion recognition have been developed using electroencephalogram (EEG) and ANS signal analysis (Taillard et al., 1990, 1993; Calvo and D'Mello, 2010; Lin et al., 2010; Petrantonakis and Hadjileontiadis, 2011; Valenza et al., 2012a,b, 2013a,b, 2014c).

Here we focus on the link between ANS dynamics and psychological dimensions. This choice is justified by the fact that ANS dynamics cannot be straightforwardly changed by the subject's intention and is under the direct control of CNS pathways involving the prefrontal cortex, amygdala, and brainstem (Ruiz-Padial et al., 2011). Of note, dysfunctions in these CNS recruitment circuits lead to pathological effects (Heller et al., 2009) such as anhedonia, i.e., the loss of pleasure or interest in previously rewarding stimuli, which is a core feature of major depression and other serious mood disorders. Moreover, ANS monitoring is widely available and cost-effective, can be easily performed through wearable systems such as sensorized t-shirts (Valenza et al., 2008, 2014c) or gloves (Lanatà et al., 2012), and its dynamics is thought to be less sensitive to artifacts than the EEG.

ANS dynamics has been demonstrated to provide effective markers of typical psychological processes. As a matter of fact, previous studies (Freeman and Nixon, 1985; Yeragani et al., 1999; Virtanen et al., 2003; Cohen and Benjamin, 2006; Shinba et al., 2008; Licht et al., 2009; Thayer et al., 2010, 2012) suggest that patients with anxiety are at increased risk for heart disease (e.g., the association between phobic anxiety or panic disorder and somatic morbidity such as coronary heart disease, coronary spasm, and ventricular arrhythmia). ANS markers of anxiety and panic disorders can be found through the analysis of Heart Rate Variability (HRV), revealing an increased heart rate and decreased power in the low-frequency (LF) and high-frequency (HF) bands. A decreased HF spectral power of HRV was also found in patients affected by generalized anxiety disorder (Thayer et al., 1996), whereas a decreased heart rate in response to stress was found in autism spectrum disorders (Jansen et al., 2006). This change could be related to abnormally high basal (nor)epinephrine levels. On the contrary, an increased mean heart rate associated with reduced variability has been observed in depressed patients (Carney et al., 2005). Moreover, it has been shown that subjects reporting excessive and persistent fear of social situations are characterized by atypical ANS dynamics, evident in variables such as HRV mean, respiration rate, tidal volume, and blood pressure (Grossman et al., 2001). ANS markers gathered from non-linear analysis have been related to psychological dimensions such as anxiety (Cohen and Benjamin, 2006) and panic disorder through symbolic analysis (Yeragani et al., 2000). Despite the large number of previous studies, none of them has reached an acceptable level of accuracy to effectively, reliably, and objectively characterize the psychological dimensions of healthy subjects and psychiatric patients, and to forecast a clinical course. A possible reason is the limited number of ANS features and specific psychological traits that were taken into account.

Therefore, here we present a detailed study on psychological assessment through an extensive analysis of ANS dynamics. Psychological dimensions were quantified by means of 6 psycho-cognitive scales, and ANS dynamics was characterized through standard and non-linear features (see details on the series definition, estimation, and parameter extraction in Section 2.3).

In order to perform a comprehensive study, the ANS non-linear dynamics has to be taken into account. Although the detailed physiology behind such complex dynamics has not been completely clarified, it is worthwhile noting that ANS non-linear dynamics plays a crucial role in most of the underlying biological processes: non-linear indices have been proven to be of prognostic value in aging and disease, showing robust and effective discerning and characterizing properties (Poon and Merrill, 1997; Glass, 2001; Goldberger et al., 2002; Stiedl and Meyer, 2003; Tulppo et al., 2005; Atyabi et al., 2006; Glass, 2009; Wu et al., 2009; Citi et al., 2012; Valenza et al., 2014a). Indeed, physiological systems are intrinsically non-linear systems characterized by multi-feedback interactions associated with long-range correlations (Marmarelis, 2004), likely due to the enormous number of structural units inside them and to the various non-linear neural interactions and integrations occurring at the neuron and receptor levels. The study of the complexity of physiological signals, in particular, has led to important results in recent decades in understanding the mechanisms underlying mental illness (Yang and Tsai, 2012). Several measures of complexity have also been proposed and applied to the study of mental illness based on various biomedical signals, from the EEG (Hu et al., 2006; Takahashi et al., 2010; Gao et al., 2011) to the MEG (Fernandez et al., 2010) and HRV (Mujica-Parodi et al., 2005; Hu et al., 2009, 2010; Gao et al., 2013; Valenza et al., 2014b). Accordingly, in this study we investigate the role of ANS non-linear dynamics in the psychological assessment, with respect to the standard analysis, i.e., analysis in the time and frequency domains.

# 2. Materials and Methods

# 2.1. Subjects Recruitment, Experimental Protocol, and Acquisition Set-up

A group of 29 non-pathological subjects (5 males), i.e., suffering from neither cardiovascular nor evident mental pathologies, was recruited to participate in the experiment. Subjects were students recruited from the Babes-Bolyai University via an online screening questionnaire assessing their intention to take part in the study. Participation was voluntary, and each subject signed a written informed consent after the study procedure had been explained. No compensation for participation was offered. Subjects underwent a medical screening interview to assess the presence of any medical condition or medication that might have interfered with their cardiovascular data. Their ages ranged from 21 to 35 years, and all were naive to the purpose of the experiment. The group was as heterogeneous as possible in order to cover a wide range of psycho-cognitive-behavioral dimensions. The experimental protocol was structured in the following two phases: (1) administration of self-report psycho-behavioral tests; (2) recording of the physiological signals. More in detail, all participants were screened by 6 self-report questionnaires (see details below), which comprised a total of 25 sub-scales. Then, the physiological signals, namely the ElectroCardioGram (ECG) and Respiration (RSP), were simultaneously acquired during a resting-state condition for 25 min through the BIOPAC MP150 device. The sampling rate was 1000 Hz for all signals. We used the ECG100C Electrocardiogram Amplifier from BIOPAC Inc., connected to pre-gelled Ag/AgCl electrodes placed following the Einthoven triangle configuration. The dedicated BIOPAC MP150 module used to record the respiration activity was the RSP100C Respiration Amplifier with the TSD201 sensor, a piezo-resistive sensor with an output resistance in the range 5–125 kOhm and a bandwidth of 0.05–10 Hz. This piezo-resistive sensor changes its electrical resistance when stretched or shortened, making it sensitive to the thoracic circumference variations occurring during respiration.

The ECG signal was used to extract the HRV series, which refers to the variation of the time intervals between consecutive heartbeats identified by their R-waves (RR intervals). Two different time series were extracted from the respiration activity: the InterBreath Interval (IBI) time series and the InterBeat Respiration (IBR) series. The IBI series was obtained by detecting the local maxima of each respiratory act, whereas the IBR series consists of the amplitude of the respiration signal sampled at the R-peak times.
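As an illustration, the extraction of the RR and IBR series from already-detected fiducial points can be sketched in Python with synthetic data (a minimal sketch; the function names are ours, not from the paper):

```python
import numpy as np

def rr_intervals(r_peak_times):
    """HRV (RR) series: differences between consecutive R-peak times, in seconds."""
    return np.diff(r_peak_times)

def ibr_series(resp_t, resp_amp, r_peak_times):
    """InterBeat Respiration: respiration amplitude sampled at the R-peak times."""
    return np.interp(r_peak_times, resp_t, resp_amp)

# toy example: regular 0.8 s heartbeats and ~15 breaths/min sinusoidal respiration
r_times = np.arange(0.0, 10.0, 0.8)
t = np.linspace(0.0, 10.0, 1000)
resp = np.sin(2 * np.pi * 0.25 * t)
rr = rr_intervals(r_times)
ibr = ibr_series(t, resp, r_times)
```

In practice the R-peak and breath-maxima detection steps would come first (e.g., by peak picking on the filtered signals); the sketch assumes those fiducial points are already available.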

# 2.2. Scales for the Assessment of Psychological Dimensions

In this work, we used a total of 6 self-report questionnaires, for most of which different sub-scales are considered. The total number of sub-scales used in this experiment was 25. A Cronbach's α measure (Bland and Altman, 1997) is associated with each scale and represents the internal consistency of the test. Such an α index depends on the number of test questions and on their average inter-correlation. Details on each scale and related sub-scales are as follows:


# 2.3. Methodology of Signal Processing

In this section, the methodology of signal processing applied to the Heart Rate Variability (HRV), InterBreath Interval (IBI), and InterBeat Respiration (IBR) series is reported in detail. HRV refers to the variability of the series comprised of the distances between two consecutive R-waves detected from the Electrocardiogram, i.e., the R-R intervals. IBI is the series comprised of the distances between two consecutive local maxima of the respiration activity (the maxima of two successive respiratory acts), whereas the IBR series is the respiratory activity sampled at times corresponding to the R-peaks. Standard and non-linear, monovariate and multivariate measures are extracted from each series in order to investigate a wide set of parameters characterizing the linear and non-linear ANS dynamics acting on the cardio-respiratory control.

# 2.3.1. Standard Measures

Standard analysis was performed on the HRV series in order to extract parameters defined in the time and frequency domain (Camm et al., 1996; Acharya et al., 2006; Valenza et al., 2012b). Time-domain features include statistical parameters and morphological indexes. More specifically, concerning the time-domain analysis, in addition to the first-order (meanRR) and second-order (SDNN) moments of the RR intervals, the so-called normal-to-normal (NN) intervals, the square root of the mean of the squared differences between successive NN intervals,

$$RMSSD = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N-1}\left(RR_{j+1}-RR_{j}\right)^{2}},$$

and the number of successive interval differences greater than 50 ms, expressed as a percentage of the total number of heartbeats analyzed,

$$pNN50 = \frac{NN50}{N-1}\cdot 100\%,$$

were calculated. Moreover, the triangular index (TINN) was estimated as the base of the triangle that best approximates the NN interval distribution (the triangle is found by minimizing the square difference).
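The time-domain indices above can be sketched as follows (illustrative Python, assuming RR intervals in milliseconds; not the authors' code):

```python
import numpy as np

def time_domain_features(rr_ms):
    """Standard time-domain HRV indices from RR (NN) intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diffs = np.diff(rr)
    return {
        "meanRR": rr.mean(),                               # first-order moment
        "SDNN": rr.std(ddof=1),                            # second-order moment
        "RMSSD": np.sqrt(np.mean(diffs ** 2)),             # rms of successive differences
        "pNN50": 100.0 * np.sum(np.abs(diffs) > 50) / len(diffs),
    }

feats = time_domain_features([800, 810, 790, 860, 805])
```

Note that pNN50 is normalized by the N − 1 successive differences, matching the definition in the text.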

Concerning the frequency-domain analysis, several features were calculated from the Power Spectral Density (PSD). In this work, the PSD was estimated by Welch's periodogram, which uses the FFT (Fast Fourier Transform) algorithm. Window width and overlap were chosen as the best compromise between frequency resolution and the variance of the estimated spectrum. Given the PSD, three spectral bands are defined as follows: VLF (very low frequency), with spectral components below 0.04 Hz; LF (low frequency), ranging between 0.04 and 0.15 Hz; and HF (high frequency), comprising frequencies between 0.15 and 0.4 Hz. For each of the three frequency bands, the frequency having maximum magnitude (VLF peak, LF peak, and HF peak), the power expressed as a percentage of the total power (VLF power %, LF power %, and HF power %), and the power normalized to the sum of the LF and HF powers (LF power nu and HF power nu) were evaluated. Moreover, the LF/HF power ratio was calculated.
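A minimal sketch of the band-power computation via Welch's periodogram (assuming the tachogram has been evenly resampled, here at a hypothetical 4 Hz; the function name is ours):

```python
import numpy as np
from scipy.signal import welch

def hrv_band_powers(rr, fs=4.0):
    """LF and HF band powers from an evenly resampled tachogram (Welch PSD)."""
    f, pxx = welch(rr, fs=fs, nperseg=min(256, len(rr)))
    lf = pxx[(f >= 0.04) & (f < 0.15)].sum()   # low-frequency band power
    hf = pxx[(f >= 0.15) & (f < 0.40)].sum()   # high-frequency band power
    return lf, hf

# synthetic tachogram dominated by a 0.25 Hz (respiratory-rate) oscillation
t = np.arange(0, 300, 0.25)                    # 4 Hz samples over 5 minutes
rr = 800 + 50 * np.sin(2 * np.pi * 0.25 * t)
lf, hf = hrv_band_powers(rr)
```

Because the synthetic oscillation sits at 0.25 Hz, the HF power dominates, as expected for a respiration-driven rhythm.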

### 2.3.2. Non-Linear Analysis

From the HRV, IBI, and IBR series, several non-linear measures were calculated. Such indices refer to the estimation and characterization of the phase space (or state space) of the physiological system generating the series. The phase space estimation involved the Takens method (Takens, 1981; Casdagli et al., 1991) and three parameters: m, the embedding dimension, a positive integer; τ, the time delay; and r, a positive real number representing the margin of tolerance for the trajectories within the space. Takens' theory allows for the reconstruction of dynamical systems of different natures from time series through the method of "delayed outputs." Starting from a time series

$$X = [u(T), u(2T), \dots, u(NT)],$$

the attractors of the discrete dynamical system are rebuilt in an m-dimensional space by applying a delay τ to the signal. This yields N − (m − 1) vectors of length m starting from a single series:

$$\begin{cases} X_1 = [u(T), u(T+\tau), \dots, u(T+(m-1)\tau)] \\ X_2 = [u(2T), u(2T+\tau), \dots, u(2T+(m-1)\tau)] \\ \dots \\ X_{N-(m-1)} = [u((N-(m-1))T), \dots, u((N-(m-1))T+(m-1)\tau)] \end{cases}$$
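The delayed-coordinate construction can be sketched as follows (an illustrative Python fragment with the delay expressed in samples; not from the paper):

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delayed-coordinate vectors X_j = (x_j, x_{j+tau}, ..., x_{j+(m-1)tau})."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    return np.array([x[j:j + (m - 1) * tau + 1:tau] for j in range(n_vectors)])

emb = delay_embed(np.arange(10), m=3, tau=2)   # 6 vectors of dimension 3
```

Each row of `emb` is one reconstructed state-space point; the same helper underlies the recurrence and entropy analyses that follow.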

The various vectors $X_j$ are the "delayed coordinates" and the derived m-dimensional space is called the "reconstructed space." From the state space theory, several ANS non-linear parameters can be derived using the following analyses:


# **2.3.2.1. Poincaré Plot**

This technique quantifies the fluctuations of the dynamics of the time series through a map of each point RR(n) of the RR series vs. the previous one. The quantitative analysis of the graph is made by calculating the standard deviations of the points with respect to the straight line $RR_{j+1} = RR_j$ (the line of identity). The first standard deviation, SD1, measures the dispersion of the points perpendicular to the line of identity and describes the short-term variability, whereas the second, SD2, describes the long-term variability.
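A common way to compute SD1 and SD2 is to rotate the Poincaré cloud by 45° about the line of identity; a minimal sketch under that convention (illustrative, not the authors' implementation):

```python
import numpy as np

def poincare_sd(rr):
    """SD1 (short-term) and SD2 (long-term) from the Poincaré cloud (RR_j, RR_{j+1})."""
    rr = np.asarray(rr, dtype=float)
    x, y = rr[:-1], rr[1:]
    sd1 = np.std((y - x) / np.sqrt(2), ddof=1)   # spread across the identity line
    sd2 = np.std((y + x) / np.sqrt(2), ddof=1)   # spread along the identity line
    return sd1, sd2

sd1, sd2 = poincare_sd([800.0, 820.0, 790.0, 830.0, 805.0])
```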

# **2.3.2.2. Recurrence Plot**

RP is a graphical method to investigate and quantify the time series complexity. The estimation starts from vectors

$$u_j = \left(RR_j, RR_{j+\tau}, \dots, RR_{j+(m-1)\tau}\right), \quad j = 1, 2, \dots, N-(m-1)\tau.$$

RP is a symmetrical square matrix of zeros and ones, whose dimensions are N − (m − 1)τ , and each element is given by

$$RP(j,k) = \begin{cases} 1 & \text{if } d(u_j, u_k) \le r \\ 0 & \text{otherwise} \end{cases}$$

where d is the Euclidean distance.

Several features can be extracted from the RP by means of Recurrence Quantification Analysis (RQA). In particular, in this study the following RQA indices were taken into account: longest diagonal line (RP Lmax), average diagonal line length (RP Lmean), divergence (RP DIV), determinism (RP DET, the percentage of recurrence points forming diagonal lines), recurrence rate (RP REC), trend, and Shannon entropy (RP ShanEn) (Zbilut et al., 1990; Marwan et al., 2002, 2007).
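The recurrence matrix itself, from which the RQA indices are derived, can be sketched as follows (illustrative Python; the threshold r and embedding parameters here are arbitrary):

```python
import numpy as np

def recurrence_plot(x, m, tau, r):
    """Binary recurrence matrix: RP[j,k] = 1 when embedded vectors are within r."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    emb = np.array([x[j:j + (m - 1) * tau + 1:tau] for j in range(n)])
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    return (dist <= r).astype(int)

def recurrence_rate(rp):
    """RP REC: fraction of recurrent points in the matrix."""
    return rp.mean()

x = np.sin(np.linspace(0, 8 * np.pi, 200))     # periodic series -> structured RP
rp = recurrence_plot(x, m=2, tau=5, r=0.2)
```

The diagonal-line statistics (Lmax, Lmean, DET) are then computed from runs of ones parallel to the main diagonal of `rp`.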

# **2.3.2.3. Correlation dimension, Approximate, and Sample Entropy Measures**

Starting from the vectors $X_1, X_2, \dots, X_{N-m+1}$ in $\mathbb{R}^m$, the distance between two vectors $X_i$ and $X_j$, according to the definition of Takens applied to high-dimensional deterministic systems, is given by (Takens, 1981; Schouten et al., 1994):

$$d[X_i, X_j] = \max_{k=1,2,\dots,m} |u(i+k-1) - u(j+k-1)| \tag{1}$$

For each $i$, with $1 \le i \le N - m + 1$, we computed the quantity $C_i^m(r)$:

$$C\_i^m(r) = \frac{\text{Number of j such that} (d[X\_i, X\_j] \le r)}{N - m + 1} \tag{2}$$

and we defined

$$C^m(r) = \frac{\sum\_{i=1}^{N-m+1} \log C\_i^m(r)}{N - m + 1} \tag{3}$$

The correlation dimension (CD) is given by Theiler (1987)

$$CD = \lim_{r \to 0} \lim_{N \to \infty} \frac{\log C^m(r)}{\log r}$$

The calculation of ApEn used in this study refers to the expression (Pincus, 1991; Fusheng et al., 2001):

$$\text{ApEn(m,r,N)} = \left[C^m(r) - C^{m+1}(r)\right] \tag{4}$$

SampEn is a refinement of ApEn and measures the proportion of pairs of vectors of length m that are "neighbors," i.e., whose distance is less than r, and remain neighbors when the pattern dimension increases from m to m + 1. Unlike ApEn(m, r, N), SampEn does not count the distance of vectors with themselves, i.e., self-matches, as suggested in the later work of Grassberger and co-workers (Grassberger and Procaccia, 1983; Grassberger, 1988), and it has the advantage of being less dependent on the time series length, showing relative consistency over a broader range of possible r, m, and N values. Renaming the $C^m(r)$ parameters computed without self-matches as $U^m(r)$, SampEn is calculated by the following expression (Richman and Moorman, 2000):

$$\text{SampEn}(m,r,N) = -\ln \frac{U^{m+1}(r)}{U^{m}(r)} \tag{5}$$
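Equation (5) can be sketched directly (an illustrative brute-force implementation using the Chebyshev distance of Equation 1 and excluding self-matches; the tolerance r = 0.2 times the standard deviation is a common convention, not necessarily the one used here):

```python
import numpy as np

def sampen(x, m=2, r=None):
    """Sample entropy -ln(A/B): Chebyshev distance, self-matches excluded."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    N = len(x)

    def pairs_within_r(dim):
        # use N - m templates for both lengths so the pair counts are comparable
        emb = np.array([x[i:i + dim] for i in range(N - m)])
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=-1)
        return (np.sum(dist <= r) - len(emb)) / 2.0   # drop diagonal self-matches

    return -np.log(pairs_within_r(m + 1) / pairs_within_r(m))

regular = np.tile([1.0, 2.0], 100)                   # perfectly predictable series
noise = np.random.default_rng(0).standard_normal(200)
```

A perfectly periodic series yields SampEn near zero (every m-neighbor remains an (m+1)-neighbor), while white noise yields a much larger value.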

# **2.3.2.4. Detrended Fluctuation Analysis**

The detrended fluctuation analysis features (DFA1 and DFA2) (Peng et al., 1995; Penzel et al., 2003) were evaluated to study the short- and long-term autocorrelation of the HRV series. The algorithm first computes the integrated series

$$y(k) = \sum\_{j=1}^{k} (RR\_j - \overline{RR})$$

with k = 1, ..., N. This series was divided into segments of equal length n, and for each segment the linear approximation $y_n$ (least-squares fit) was computed. Then the root-mean-square fluctuation was calculated as

$$F(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left(y(k) - y_n(k)\right)^2}$$

Plotting $\log F(n)$ against $\log n$, the slope of the regression line gives the scaling exponent α. The DFA1 and DFA2 features represent this slope over the ranges 4 ≤ n ≤ 16 and 16 ≤ n ≤ 64, respectively.
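The DFA procedure described above can be sketched as follows (illustrative Python; the scale range follows the DFA1 convention of the text):

```python
import numpy as np

def dfa_alpha(x, scales):
    """DFA scaling exponent: slope of log F(n) vs. log n over the given box sizes."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                  # integrated, mean-removed profile
    F = []
    for n in scales:
        n_seg = len(y) // n
        residuals = []
        for s in range(n_seg):
            seg = y[s * n:(s + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear detrending
            residuals.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(residuals)))
    return np.polyfit(np.log(scales), np.log(F), 1)[0]

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)
alpha_noise = dfa_alpha(noise, scales=range(4, 17))            # ~0.5 for white noise
alpha_walk = dfa_alpha(np.cumsum(noise), scales=range(4, 17))  # larger for a random walk
```

White noise gives α near 0.5, while its integral (a random walk) gives a markedly larger exponent, illustrating how α separates uncorrelated from strongly correlated dynamics.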

# **2.3.2.5. Multiscale Entropy and Multivariate Multiscale Entropy Analysis**

Multiscale Entropy Analysis (MSE) is a powerful methodology based on SampEn estimation. MSE has been applied in several fields, such as the study of human gait dynamics (Costa et al., 2003), the assessment of postural complexity (Costa et al., 2007), and synthetic RR time series (Costa et al., 2002). MSE can be an effective non-linear method to collect information about physiological systems whose dynamics is associated with multiple time scales. The method is based on the application of the sample entropy algorithm to coarse-grained time series, constructed from a one-dimensional discrete time series by averaging the data points within non-overlapping windows of increasing length σ. Given a time series {x1, ..., xi, ..., xN} and a scale factor σ, each element of the coarse-grained series $\{y_j^{(\sigma)}\}$ is calculated using the equation

$$y_j^{(\sigma)} = \frac{1}{\sigma} \sum_{i=(j-1)\sigma+1}^{j\sigma} x_i, \quad 1 \le j \le N/\sigma \tag{6}$$

The length of each coarse-grained time series is equal to the length of the original time series divided by σ. The second step consists in computing the SampEn algorithm (Richman and Moorman, 2000; Lake et al., 2002) on these series. Previous studies in which the MSE algorithm was applied to physiological data use the standard value m = 2 for the pattern dimension (Costa et al., 2003; Leistedt et al., 2011). In this work, the choice of r was performed following a method already used in the literature. SampEn values were calculated for scale factors σ ranging from 1 to 20, and the same process was carried out on the HRV, IBI, and IBR series. The complexity index (CI) was measured as the area under the MSE curve; it can be calculated for short time scales, from 1 to 8 (short CI), and for longer time scales, up to 20 (long CI) (Leistedt et al., 2011).
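The coarse-graining of Equation (6), together with a trapezoidal area-under-curve CI, can be sketched as follows (illustrative; the exact integration scheme used by the authors for the CI is not specified, so the trapezoidal rule here is an assumption):

```python
import numpy as np

def coarse_grain(x, sigma):
    """Eq. (6): average non-overlapping windows of length sigma."""
    x = np.asarray(x, dtype=float)
    n = len(x) // sigma
    return x[:n * sigma].reshape(n, sigma).mean(axis=1)

def complexity_index(entropies, scales):
    """CI: trapezoidal area under the SampEn-vs-scale (MSE) curve."""
    e = np.asarray(entropies, dtype=float)
    s = np.asarray(scales, dtype=float)
    return float(np.sum((e[1:] + e[:-1]) / 2.0 * np.diff(s)))

cg = coarse_grain(np.arange(10.0), sigma=2)
ci = complexity_index([1.0, 1.0, 1.0], [1, 2, 3])
```

In the full pipeline, SampEn would be computed on each coarse-grained series for σ = 1, ..., 20, and the resulting curve integrated over scales 1–8 (short CI) or 1–20 (long CI).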

Besides the MSE analysis, we performed the Multivariate Multiscale Entropy (MMSE) analysis (Ahmed and Mandic, 2011, 2012). This algorithm allows performing MSE analysis on multivariate time series. In this work, MMSE was used to quantify the complexity of the series derived from the electrocardiogram and the breath. In particular, MMSE results were obtained on the bivariate series HRV-IBI and HRV-IBR through the estimation of the CI indices (as described above for MSE). Before the MMSE calculation, the involved time series were scaled to the range between 0 and 1 to prevent the different amplitudes from influencing the complexity estimate (Ahmed and Mandic, 2011).

# **2.3.2.6. Symbolic Analysis**

Symbolic analysis (Yeragani et al., 2000; Porta et al., 2001; Baumert et al., 2002; Guzzetti et al., 2005; Tobaldini et al., 2009; Caminal et al., 2010) is another powerful non-linear method, which was applied to the HRV data series. For each HRV series gathered from each subject, 6 levels were constructed by evenly dividing the amplitude range of the samples, and a symbol (from 0 to 5) was assigned to each data sample according to the level it falls in. Then, a window of three consecutive points moves along the HRV series, and three possible configurations are identified while scanning the whole signal: the three points belong to the same level, i.e., no variation (0V); two consecutive points belong to the same level and one to another, i.e., one variation (1V); and the remaining cases, i.e., two variations (2V). The number of patterns falling into each group (0V, 1V, 2V) and the percentages of the total (0V%, 1V%, 2V%) were calculated and used as features. Previous studies support the hypothesis that an increase of 0V patterns is related to an activation of the sympathetic activity, an increase of 2V patterns is related to an increase of the parasympathetic activity, and an increase of 1V patterns is associated with a simultaneous increase of both parasympathetic and sympathetic activities.
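The 0V/1V/2V classification can be sketched as follows (illustrative Python; the uniform 6-level quantization follows the description above):

```python
import numpy as np

def symbolic_0v1v2v(rr, n_levels=6):
    """Percentages of 0V/1V/2V three-beat patterns after uniform quantization."""
    rr = np.asarray(rr, dtype=float)
    edges = np.linspace(rr.min(), rr.max(), n_levels + 1)
    symbols = np.clip(np.digitize(rr, edges) - 1, 0, n_levels - 1)
    counts = {"0V": 0, "1V": 0, "2V": 0}
    for a, b, c in zip(symbols, symbols[1:], symbols[2:]):
        if a == b == c:
            counts["0V"] += 1                    # no variation
        elif a != b and b != c:
            counts["2V"] += 1                    # two variations
        else:
            counts["1V"] += 1                    # exactly one variation
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in counts.items()}

res = symbolic_0v1v2v([800.0, 900.0] * 10)       # alternating series -> all 2V
```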

# 3. Experimental Results

Experimental results are expressed in terms of statistical and correlation analysis. In the literature it can be found the threshold score of each questionnaire above which the behavior of the subject results to show altered psycho-cognitive-behavioral traits. Among all the sub-scales we only considered those where the subjects spread out over a wide range of scores in order to identify two groups, one below and the other above the threshold. For each scale we identified two groups of subjects separated by the median. In order to have two groups numerically equivalent, we selected and investigated only these scales where the median was congruent with the threshold reported in the literature. In addition, for each of the 16 scales we verified that maximum and minimum scores of each group were in the tails of the population distribution reported by the literature. In other words, for each psychological subscale, the median value of the subjects score is calculated to identify two groups: one comprised of the subjects having scores below the median, and one comprised of the subjects having scores above the median. Only 16 out of 25 sub-scales divided the subjects in two groups numerically comparable, therefore we performed the statistical analysis on the scores obtained in these 16 sub-scales. The reference values from the literature about these sub-scales are evaluated on the control groups used in several previous works. For example we considered a sample of 103 subjects (age = 27.00 ± 8.80) for IRI Empathic Concern and IRI Personal Distress sub-scales, referring to a study which explored the relationship among psychological mindedness and several aspects of awareness which comprended this indices of empathy (Beitel et al., 2005) and a sample of 582 subjects for IRI fantasy sub-scale taking this data from a guide study on the empathy scales (Davis, 1980). 
For the two PANAS sub-scales, a group of 537 volunteers aged 18–91 was involved in a study that evaluated the reliability and validity of the PANAS (Crawford and Henry, 2004), and 53 participants (age = 34.32 ± 10.50) answered the LSAS questionnaires in a demonstration that this method may be employed in the assessment of social anxiety disorder (Fresco et al., 2001; Rytwinski et al., 2009). As a reference for the values of the BIS and BAS sub-scales we chose a previous study in which the answers of 2725 individuals aged 18–79 were used to validate the application of this scale to measure behavioral inhibition and activation and their correlation with depression and anxiety (Jorm et al., 1998). For the ZKPQ Impulsive Sensation Seeking and Activity sub-scales, the threshold value was taken from the answers of a group of 639 participants (age = 22.31 ± 5.08) in a study of the shortened form of the questionnaire (Aluja et al., 2003). Finally, as regards the DERS sub-scales, a study on 260 subjects (age = 23.10 ± 5.67) exploring the factor structure and psychometric properties of the DERS measures was used as reference for DERS Awareness (Gratz and Roemer, 2004), and a reference sample of 42 individuals (age = 24.24 ± 4.38), drawn from a study comparing the values of these psychological tests in depressed patients and healthy subjects, was considered for the other DERS sub-scales (Ehring et al., 2008).
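The median-split step described above can be illustrated with a short sketch. The function name is hypothetical, and the handling of ties (subjects scoring exactly at the median are excluded from both groups) is our assumption, since the text does not specify it:

```python
import numpy as np

def median_split(scores):
    """Sketch of the grouping step: split subjects into a below-median
    and an above-median group for one psychological sub-scale."""
    scores = np.asarray(scores, dtype=float)
    med = np.median(scores)
    low = np.where(scores < med)[0]   # group 1: subjects below the median
    high = np.where(scores > med)[0]  # group 2: subjects above the median
    # A scale is retained only when the two groups are numerically comparable
    balanced = abs(len(low) - len(high)) <= 1
    return low, high, med, balanced
```

In the paper this split is additionally required to be congruent with the literature threshold for each sub-scale; that check is a table lookup and is omitted here.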

In the statistical analysis, for each psychological sub-scale and for each ANS feature, we applied the Mann-Whitney test in order to evaluate whether the two groups were statistically different. Moreover, the non-parametric Spearman correlation coefficient was calculated between each psychological sub-scale and ANS feature.
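The two analyses can be sketched with SciPy's standard implementations; the group sizes, distributions, and score-feature relationship below are synthetic illustrations, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: one ANS feature measured in two groups of subjects
# obtained from a median split on a psychological sub-scale score.
group1 = rng.normal(0.0, 1.0, 30)
group2 = rng.normal(0.8, 1.0, 30)

# Mann-Whitney U test: non-parametric test of a difference between groups
u_stat, p_u = stats.mannwhitneyu(group1, group2, alternative="two-sided")

# Spearman rank correlation between questionnaire scores and a feature
scores = rng.uniform(0, 40, 60)
feature = 0.5 * scores + rng.normal(0, 5, 60)
rho, p_rho = stats.spearmanr(scores, feature)

print(f"U = {u_stat:.1f} (p = {p_u:.3g}), rho = {rho:.2f} (p = {p_rho:.3g})")
```

Both tests are rank-based, so they require no assumption of Gaussian samples, which is why they are the appropriate choice here.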

### 3.1. Statistical Analysis

As mentioned above, for each ANS feature, Mann-Whitney non-parametric U-tests were used to test the null hypothesis of no statistical difference between the two groups. The use of such a non-parametric test is justified by the non-Gaussian distribution of the samples (the Kolmogorov-Smirnov test rejected the null hypothesis of Gaussian samples at p < 0.05).

Concerning features from standard HRV analysis, 8 sub-scales (LSAS Anxiety of Performance, DERS Non-Acceptance, DERS Awareness, IRI Fantasy, IRI Empathic Concern, ZKPQ Activity, ZKPQ Impulsive Sensation Seeking, BAS) showed significant discerning capability, mostly through frequency-domain parameters (see details in **Table 1**). Concerning ANS features coming from the non-linear analysis, 9 sub-scales (PANAS Positive Affect, DERS Non-Acceptance, DERS Impulse, DERS Awareness, DERS Strategies, IRI Empathic Concern, BIS, BAS, ZKPQ Activity) showed

#### TABLE 1 | Statistical results related to standard HRV features (U-test).


VLF, Very Low Frequency; LF, Low Frequency; HF, High Frequency; nu, normalized units; TINN, width of triangular approximation to NN-interval frequency distribution; RMSSD, square root of mean squared forward differences of successive NN intervals; Pnn50, proportion of successive NN interval differences > 50 ms. ↑ indicates that an increase of the test score is associated with an increase of the feature value; ↓ indicates that an increase of the test score is associated with a decrease of the feature value.

significant differences considering monovariate and multivariate measures (see details in **Table 2**). An exemplary plot showing the discerning capability of MMSE analysis on DERS Non-Accept sub-scale is shown in **Figure 1**.

To summarize the results, the extracted features were able to discern the two groups in 12 out of 16 sub-scales. More specifically, standard HRV analysis provided exclusive information on the psychological assessment, i.e., not overlapping with that coming from the non-linear analysis, in only 2 sub-scales, whereas features from ANS non-linear dynamics exclusively discriminated the two groups in 4 sub-scales (see details in **Figure 2**).

### 3.2. Correlation Analysis

The Spearman correlation coefficient was used to show the relationship between the values of each feature across all the subjects and the corresponding score for each sub-scale. Accordingly,


MSE HRV, Multiscale Entropy on HRV series; MSE IBR, Multiscale Entropy on IBR series; MSE IBI, Multiscale Entropy on IBI series; MMSE HRV-IBR, Multivariate Multiscale Entropy on bivariate HRV and IBR series; MMSE HRV-IBI, Multivariate Multiscale Entropy on bivariate HRV and IBI series; CI, Complexity Index; ApEn, Approximate Entropy; SampEn, Sample Entropy; 0V, number of patterns with no variation in the amplitude; 0V%, 1V%, 2V%, percentage of the total patterns with zero, one, or two variations in the amplitude; SD1, Standard Deviation of Poincaré Plot related to the points perpendicular to the line of identity; DFA1, Detrended Fluctuation Analysis (first slope); RP Lmax, Recurrence Plot (longest diagonal line); CD, Correlation Dimension. ↑ indicates that an increase of the test score is associated with an increase of the feature value; ↓ indicates that an increase of the test score is associated with a decrease of the feature value.

the coefficient ρ and the p-value, expressing the probability that no correlation exists between the two variables, were computed for each sub-scale and each feature. Results are shown in **Tables 3**, **4**.

We found that ANS features related to the linear HRV dynamics are significantly correlated with 5 sub-scales, reaching absolute values of ρ up to 0.52 (BAS and ZKPQ Impulsive Sensation Seeking). Moreover, 10 sub-scales are significantly correlated with markers of ANS non-linear dynamics, reaching absolute values of ρ up to 0.55 (DERS Non-Acceptance).

Although the correlation coefficients are not very high, this is nevertheless an interesting result to be further validated and confirmed.

FIGURE 1 | Exemplary plot of Multivariate Multiscale Entropy analysis applied to HRV-IBI series in discerning the two groups (under the median-lower scores: group 1; over the median-higher scores: group 2) according to scores gathered from the DERS Non-Accept sub-scale.

The number of features with significant p-values (p < 0.05) given by this correlation coefficient is shown in **Figure 3** for each psychological dimension.

# 4. Discussion and Conclusion

In conclusion, we found several ANS biomarkers of psychological dimensions in non-pathological subjects. Such biomarkers are derived from the standard and complexity analysis of ANS measures such as HRV, IBI, and IBR series. We found that dimensions related to difficulties in emotion regulation (DERS),

TABLE 3 | Spearman correlation test results related to standard HRV features.


VLF, Very Low Frequency; LF, Low Frequency; HF, High Frequency; nu, normalized units; TINN, width of triangular approximation to NN-interval frequency distribution; RMSSD, square root of mean squared forward differences of successive NN intervals; Pnn50, proportion of successive NN interval differences > 50 ms.

interpersonal reactivity (IRI), behavioral activation or inhibition (BIS/BAS), sensation-seeking and activity (ZKPQ), and anxiety of performance (LSAS) are always associated with changes in the HRV dynamics, quantified using time- and frequency-domain indices (see **Table 1**). As each scale defines a different psychological dimension, it is very difficult to give a common interpretation of the features across them. A decrease of the LF/HF ratio, associated with increased questionnaire scores, characterizes ZKPQ Activity and IRI Empathic Concern, whereas an opposite trend is found for the awareness of difficulties in emotion regulation (DERS). HRV time-domain indices such as TINN, Pnn50, and RMSSD are effective only in characterizing empathic concern and emotion regulation. These results, gathered from statistical analyses of standard HRV parameters, are further confirmed by the correlation analyses, whose details are shown in **Table 3**.

It is worth noting that the HF power decreases as the DERS score increases. According to the literature (Porges, 1991, 1992), vagal tone is associated with the ability of emotional self-regulation and with high flexibility and adaptability to environmental changes. According to our results, when emotion dysregulation occurs, the sympathetic activity increases.

Further evidence supporting our results can be found in the current literature (Freeman and Nixon, 1985; Yeragani et al., 1999; Virtanen et al., 2003; Cohen and Benjamin, 2006; Shinba et al., 2008; Licht et al., 2009; Thayer et al., 2010, 2012), which suggests that patients with anxiety disorders show decreased power in the HRV-LF bands.


TABLE 4 | Spearman correlation test results related to non-linear HRV, IBI, IBR features.

MSE HRV, Multiscale Entropy on HRV series; MSE IBR, Multiscale Entropy on IBR series; MSE IBI, Multiscale Entropy on IBI series; MMSE HRV-IBR, Multivariate Multiscale Entropy on bivariate HRV and IBR series; MMSE HRV-IBI, Multivariate Multiscale Entropy on bivariate HRV and IBI series; CI, Complexity Index; ApEn, Approximate Entropy; SampEn, Sample Entropy; 0V, number of patterns with no variation in the amplitude; 0V%, 1V%, 2V%, percentage of the total patterns with zero, one, or two variations in the amplitude; SD1, Standard Deviation of Poincaré Plot related to the points perpendicular to the line of identity; DFA1, Detrended Fluctuation Analysis (first slope); RP Lmax, Recurrence Plot (longest diagonal line); RP DET, Recurrence Plot (determinism); RP REC, Recurrence Plot (trend); CD, Correlation Dimension.

Concerning the ANS non-linear dynamics, several biomarkers of psychological dimensions were found among complexity measures such as sample entropy, monovariate and multivariate multiscale entropy, short- and long-term correlations, correlation dimension, and recurrence and symbolic analysis, characterizing dimensions such as positive and negative affect (PANAS), social phobia (Liebowitz Social Anxiety Scale, LSAS), difficulties in emotion regulation (DERS), interpersonal reactivity (IRI), behavioral

inhibition or activation (BIS/BAS), and sensation-seeking and activity (ZKPQ). Our results on non-linear ANS markers for psychological dimensions confirm the previous findings (Yeragani et al., 2000; Cohen and Benjamin, 2006) and provide a wider portrait of the complexity modulation associated with behavioral characters.

**Figures 2**, **3** report the number of statistically significant features given by the Mann-Whitney test and the Spearman non-parametric correlation, respectively. It is worth noting that the significant non-linear features overall outnumber those extracted from the standard analysis, confirming that complexity measures play a relevant role in assessing the psycho-physiological dimensions.

Finally, some prudential considerations should be made. The physiological signals were acquired at rest right after the test, under the assumption that the psychological assessment acted as an affective elicitation. The results are therefore to be considered preliminary to future experiments in which subjects experience an actual affective dimension while being monitored. Nevertheless, it is worthwhile pointing out that complexity measures can be considered promising markers for assessing psychological traits. It is important to underline that such interest is not diminished by the difficulty of giving a physiological meaning to complexity measurements. In this sense, and more generally, our data suggest the possibility of an ANS fingerprinting of psychological dimensions. Therefore, beyond their precise physiological meaning, our results have interesting consequences for the psychometric and clinical fields. Our approach may be promising in describing psychological dimensions as a combination of different features, providing a full classification of psychological characteristics through a baseline ECG acquisition. However, more studies with a much higher number of subjects are needed to test the reliability and feasibility of these potential clinical implications. Furthermore, to test whether our methodology could also be extended to the extremes of the psychological dimensions, these studies should also include pathological samples (e.g., diagnosed subjects). Should that prove to be the case, this approach might hold promise as a tool for providing an external validation of psychological diagnoses.

# References


# Acknowledgments

The research leading to these results has received partial funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement n. 601165 of the project "WEARHAP."


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Nardelli, Valenza, Cristea, Gentili, Cotet, David, Lanata and Scilingo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# What is the mathematical description of the treated mood pattern in bipolar disorder?

# *Fatemeh Hadaeghi\*, Mohammad R. Hashemi Golpayegani and Shahriar Gharibzadeh*

*Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran*

*\*Correspondence: f\_hadaeghi@aut.ac.ir*

*Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

#### **A commentary on**

**Mathematical models of bipolar disorder** *by Daugherty, D., Roque-Urrea, T., Urrea-Roque, J., Troyer, J., Wirkus, S., and Porter, M. A. (2009). Commun. Nonlinear Sci. Numer. Simulat. 14, 2897–2908.*

In their innovative article, Daugherty et al. (2009) modeled the mood swings of a patient with bipolar disorder as a Liénard oscillator with autonomous forcing. They proposed that the emotional state of an untreated or treated bipolar type-II patient could be mathematically represented by Equation (1), in which *x(t)* represents the emotional state at time *t*. In this equation, by adjusting the parameter ρ, both the treated and the untreated person can be modeled.

$$\ddot{x} - 0.038\dot{x} + 0.180x = \rho \dot{x}^{3} + \mu \dot{x}^{5} - \nu \dot{x}^{11} \tag{1}$$

The phase space of Equation (1) which is shown in **Figure 1B**, includes an unstable limit cycle encircled by a large stable limit cycle. The authors have supposed that after treatment, the smaller stable limit cycle with sufficiently small amplitude would correspond to the ultimate emotional pattern to be achieved.

Nevertheless, based on previous studies (Gottschalk et al., 1995; Huber et al., 1999), we believe that both in normal persons and in treated patients, mood variations and emotional states do not exhibit such a periodic pattern (after 300 months in **Figure 1A**) and could be better described by a low-amplitude chaotic time series. Our evidence for this supposition includes: (1) the spatial complexity of brain components. In the brain, there are a large number of interacting neurons connected by synapses and interacting networks connected functionally

or structurally. As already demonstrated in studies of complex systems, the existence of multiple and interdependent connections acting in complex positive and negative feedback loops is very likely to lead to apparently random and unpredictable states (Korn and Faure, 2003). This unpredictability is a fundamental feature of chaotic patterns. (2) The temporal complexity of brain behavior. Besides the complex structural pattern of the brain, recordings from nerve cells as well as electroencephalograms have shown the chaotic temporal function of the brain in its interaction with the environment (Korn and Faure, 2003; Rabinovich et al., 2012).

In the case of mood as a state of the mind, therefore, it can be expected that mood variation in normal individuals would be complex rather than ordered. In addition, the environment is in constant modification; expecting it to generate standard and fixed emotional states or moods in such a periodic manner therefore seems quite unrealistic. Indeed, in the case of bipolar disorder, it has already been demonstrated that we are dealing with an intermittent behavior (Gottschalk et al., 1995) which can be simplified to a stable periodic pattern, in contrast with the highly chaotic patterns in normal individuals. Therefore, we believe that in treated patients it would not be adequate to reach a state of low-amplitude periodic oscillation. In fact, in abnormal states, as changes in the complexity of brain dynamics occur, therapeutic strategies would attempt to compensate for these changes (Bahrami et al., 2005; Mendez et al., 2012).

Based on the above-mentioned view, we propose to modify the aforementioned model by inserting a time-dependent term which reflects the momentary interactions of the brain with the time-varying environment as well as with interpersonal relationships. The proposed equation for an untreated person could be written as follows, in which ρ = −0.03302, μ = 0.078, ν = 0.00093, and η = 0.1.

$$\ddot{x} - 0.038\dot{x} + 0.180x = \rho \dot{x}^{3} + \mu \dot{x}^{5} - \nu \dot{x}^{11} - \eta x^{3} \tag{2}$$

The effect of treatments could be inserted through a sinusoidal function, which results in Equation (3).

$$\ddot{x} - 0.038\dot{x} + 0.180x = \rho \dot{x}^{3} + \mu \dot{x}^{5} - \nu \dot{x}^{11} - \eta x^{3} + q\cos(\omega t) \tag{3}$$

Changing the parameters of this equation, especially ω, q, and η, yields diverse patterns such as periodic, quasi-periodic, chaotic, and intermittent behaviors. Considering η = 1, ω = 2, and q = 1.2, Equation (3) has a chaotic solution. In order to provide deeper insight into such dynamics, we represent this time series and the chaotic attractor in the phase plane in **Figures 1C,D**. In this example, we present a mathematical representation of an untreated 20-year-old patient, Equation (2), as well as the effects of treatment, represented by Equation (3). In the phase-space portrait (**Figure 1D**), a small-amplitude stable chaotic attractor encircled by a large unstable periodic orbit (not shown in the figure) represents the desired attractor of the emotional state for a treated person.
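As a sketch, Equation (3) can be integrated numerically with SciPy by rewriting it as a first-order system. The parameter values are those quoted in the text; the initial condition and integration span are our assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# rho, mu, nu from Eq. (2); eta, omega, q are the treated-case values of Eq. (3)
rho, mu, nu = -0.03302, 0.078, 0.00093
eta, omega, q = 1.0, 2.0, 1.2

def treated_mood(t, state):
    """Eq. (3) as a first-order system with x = mood, v = dx/dt:
    x'' = 0.038 x' - 0.180 x + rho x'^3 + mu x'^5 - nu x'^11 - eta x^3 + q cos(wt)."""
    x, v = state
    dv = (0.038 * v - 0.180 * x
          + rho * v**3 + mu * v**5 - nu * v**11
          - eta * x**3 + q * np.cos(omega * t))
    return [v, dv]

sol = solve_ivp(treated_mood, (0.0, 200.0), [0.1, 0.0], max_step=0.05)
# sol.y[0] is the mood time series; plotting sol.y[0] against sol.y[1]
# gives the phase-plane portrait of the bounded attractor.
```

The high-order velocity term −νẋ¹¹ keeps the trajectory bounded, so the solution wanders on a bounded attractor rather than diverging.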

It is obvious that our modified model can represent both the rhythmic pattern of mood variation in patients and the complex pattern of mood states in treated subjects. Additionally, our equation seems to be more consistent with the evidence observed in empirical studies, because its adjustable parameters can reflect the effect of therapeutic strategies (Huber et al., 1999); however, theoretically, the

**FIGURE 1 | (A,B)** Time series of mood and phase space of treated patient in model of Equation (1). It has been supposed that smaller stable limit cycle with small amplitude is the desired emotional pattern of the patient after treatment (Daugherty et al., 2009). **(C)** Time series of mood pattern in modified model, before and after treatment. **(D)** Bounded chaotic attractor as a representation of relative variations in emotional state and the rate of its changes in a treated patient using modified model.

occurrence of a tangent bifurcation in the equation, caused by a change in one of the parameters, would be required in order to transition from a periodic pattern to a chaotic behavior. The exact meaning of such an event in clinical terms remains to be elucidated in future studies.

Finally, it is important to emphasize that, ultimately, the validity of all these theoretical models and predictions will rely on empirical studies employing qualitative analysis of self-rated mood records (life charts) based on psychological tests, or using complexity measures extracted from functional time series such as EEG, fMRI, or PET scans.

# **REFERENCES**

Bahrami, B., Seyedsadjadi, R., Babadi, B., and Noroozian, M. (2005). Brain complexity increases in mania. *Neuroreport* 16, 187–191. doi: 10.1097/00001756-200502080-00025


treatment. *J. Psychopharmacol.* 26, 636–643. doi: 10.1177/0269881111408966

Rabinovich, M. I., Afraimovich, V. S., Bick, C., and Varona, P. (2012). Information flow dynamics in the brain. *Phys. Life Rev*. 123C, 76–84.

*Received: 18 July 2013; accepted: 19 July 2013; published online: 12 August 2013.*

*Citation: Hadaeghi F, Hashemi Golpayegani MR and Gharibzadeh S (2013) What is the mathematical description of the treated mood pattern in bipolar disorder? Front. Comput. Neurosci. 7:106. doi: 10.3389/ fncom.2013.00106*

*Copyright © 2013 Hadaeghi, Hashemi Golpayegani and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Does "crisis-induced intermittency" explain bipolar disorder dynamics?

# *Fatemeh Hadaeghi\*, Mohammad R. Hashemi Golpayegani and Keivan Moradi*

*Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran \*Correspondence: f\_hadaeghi@aut.ac.ir*

*Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

**Keywords: bipolar disorder, chaos theory, crisis, intermittency, mathematical modeling, multistability, strange attractor**

The brain presents a large number of spatially connected and interacting neurons and synapses that form many positive and negative feedback circuits. These complex networks, in interaction with the environment, have been experimentally demonstrated to produce temporally chaotic behavior, which may be detected in recordings from individual nerve cells or neural ensembles (Korn and Faure, 2003). According to such a paradigm, the brain can be considered a complex system with chaos as its predominant dynamics. As a result, concepts from complex systems and chaos theory can be applied to the study of normal and abnormal brain functions.

One of the fundamental features of some complex systems is "multistability," which can be understood as the coexistence of several interacting attractors (Chian et al., 2006). These interactions result in various complex behaviors in the long-term dynamics of the system. Previous studies in several research areas, including neuroscience, have already reported the existence of multistability in natural systems (Chian et al., 2006; Goldbeter, 2011; Rabinovich et al., 2012).

From the perspective of chaos theory, the irregular alternation between episodes of various forms of chaotic or periodic behavior is known as "intermittency" (Tanaka et al., 2005; Chian et al., 2006). In a "global bifurcation," an "attractor-merging crisis" can yield intermittent behavior. This crisis occurs through the collision of two or more attractors with the boundaries of the basins of attraction of other attractors (Tanaka et al., 2005; Chian et al., 2006). In this case, by crossing the boundary, the trajectory of the system is attracted by the other attractor. The trajectory then remains there until another crossing, which may lead to a return to the first attractor. Chaotic intermittency has been reported in circuit oscillators, economic variables, and non-periodic associative dynamics in chaotic neural networks, as well as in psychiatric disorders such as obsessive-compulsive disorder (Tanaka et al., 2005; Chian et al., 2006; Rabinovich and Varona, 2011). We believe that this concept could also be applied to the mood variation pattern in bipolar disorder.

According to physiological studies, neuroplastic variations may be the underlying mechanism which explains the dysregulation of the main circuits involved in emotional processing (Kandel et al., 2000; Berns and Nemeroff, 2003). This emotional dysregulation is somatically represented as irregular mood swings. Therefore, we believe that the clinical course of bipolar disorder, characterized by repeated erratic cycles of mania and depression and by episodes of randomly appearing chaotic transitional states (Gottschalk et al., 1995; Berns and Nemeroff, 2003; Rabinovich et al., 2012), may also be understood based on the concept of chaotic intermittency. Manic, depressive, and transitional states could be considered stable or unstable attractors of a dynamical system through which the mood trajectory moves. The accidental and abrupt changes of mood state in bipolar disorder can then result from the collision of the initial mood trajectory with the boundary of the basin of attraction of another mood attractor. According to chaos theory, this intermittent behavioral pattern could be considered "crisis-induced intermittency." Following this viewpoint, in healthy subjects there would be only one "strange attractor" related to the mood states. The time series of such a strange attractor represents both positive and negative emotions, unpredictably and in response to internal (for example thought, attention, and memory) or external (environmental) stimuli. In a bipolar person, however, the initial emotional trigger of the disease results in a type of "exterior crisis" in the system, in which the destruction of the strange attractor is accompanied by the formation of two abnormal attractors (mania and depression) and chaotic transients between them.

In order to model such a scenario, models of chaotic systems which demonstrate various kinds of crises as their parameters change (such as the "forced Duffing" oscillator and the "Ikeda" iterated map) could be utilized to characterize the basic

**FIGURE 1 | (A)** Example of crisis induced intermittency in the forced Duffing oscillator. **(B)** Example of temporal pattern of mood variation in a patient with bipolar disorder (Tretter et al., 2011).

features of human emotional states when they present multistable and intermittent behaviors, as in the case of bipolar disorder. In order to provide deeper insight into such dynamics, we represent the time series of the forced Duffing oscillator in its crisis-induced intermittent mode in **Figure 1A**, and an example of the temporal pattern of self-rated mood records (life charts) of a person with bipolar disorder in **Figure 1B**. The proposed theoretical model would be useful to predict the evolution of such emotional states in bipolar disorder and to investigate the effects of psychopharmacological therapies. The experimental data for such investigations would most likely come from psychological tests, life chart recordings, or functional studies such as EEG, fMRI, or PET scan.
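A minimal sketch of cross-well hopping in the forced Duffing oscillator follows. We use the classic textbook chaotic parameter set (δ = 0.3, γ = 0.5, ω = 1.2) rather than a specific crisis point, so this illustrates irregular switching between the two wells, which is the qualitative behavior the analogy to mood poles relies on:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Double-well forced Duffing oscillator:
#   x'' + delta*x' - x + x^3 = gamma*cos(omega*t)
delta, gamma, omega = 0.3, 0.5, 1.2

def duffing(t, state):
    x, v = state
    return [v, -delta * v + x - x**3 + gamma * np.cos(omega * t)]

sol = solve_ivp(duffing, (0.0, 500.0), [0.5, 0.0], max_step=0.05)

# The sign of x marks which well (mood pole, in the analogy) the trajectory
# currently occupies; counting sign changes gives a crude measure of the
# irregular, intermittent switching between the two states.
switches = np.count_nonzero(np.diff(np.sign(sol.y[0])) != 0)
```

In the bipolar-disorder analogy, each well plays the role of a manic or depressive attractor and the irregular well-to-well transits play the role of chaotic transitional episodes.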

# **REFERENCES**

Berns, G. S., and Nemeroff, C. B. (2003). The neurobiology of bipolar disorder. *Am. J. Med. Genet. C Semin. Med. Genet.* 123C, 76–84.


understanding chaotic itinerancy. *Phys. Rev. E Stat. Nonlin. Soft Matter Phys.* 71, 016219.

Tretter, F., Gebicke-Haerter, P. J., an der Heiden, U., Rujescu, D., Mewes, H. W., and Turck, C. W. (2011). Affective disorders as complex dynamic diseases – a perspective from systems biology. *Pharmacopsychiatry* 44(Suppl. 1), S2–S8.

*Received: 21 July 2013; accepted: 29 July 2013; published online: 23 August 2013.*

*Citation: Hadaeghi F, Hashemi Golpayegani MR and Moradi K (2013) Does "crisis-induced intermittency" explain bipolar disorder dynamics? Front. Comput. Neurosci. 7:116. doi: 10.3389/fncom.2013.00116*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Hadaeghi, Hashemi Golpayegani and Moradi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Is there any geometrical information in the nervous system?

# *Sajad Jafari\*, Seyed M. R. Hashemi Golpayegani and Shahriar Gharibzadeh*

*Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran \*Correspondence: sajadjafari@aut.ac.ir*

*Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

**Keywords: trains of impulses, chaotic systems, sensitivity to initial conditions, geometry, phase space**

There has been increasing interest in recent years in analyzing neurophysiology from the viewpoint of complex and chaotic systems. For example, although the famous Hodgkin and Huxley model (Hodgkin and Huxley, 1952) has been the basis of almost all proposed models of neural firing, the Rose-Hindmarsh model (Hindmarsh and Rose, 1984) is known to be a more refined model because it can show different firing patterns, especially chaotic bursts of action potentials, which yields a proper match between the model's behavior and much real experimental data.

It is believed that information is transferred in the brain by trains of impulses, or action potentials, often organized in sequences of bursts; therefore, it is useful to determine the temporal patterns of such trains (Korn and Faure, 2003). Since chaotic systems are sensitive to initial conditions (Hilborn, 2000), many signals with minimal similarity in the time domain can have the same source; such behavior might be better understood by analyzing those signals in phase space and from a geometrical viewpoint (Jafari et al., 2013d), since although chaotic signals have pseudo-random behavior in time, they are ordered in phase space (i.e., if one plots the signals as a trajectory in a coordinate system of the system's variables, one encounters an ordered and specific topology called a strange attractor) (Hilborn, 2000).

In fact, in many applications of chaotic signals and systems, using temporal properties without care about this sensitivity to initial conditions can lead to important misinterpretations (Jafari et al., 2012, 2013a,c,d). Hence, it seems that, beyond temporal patterns, it is of paramount importance to investigate the topological patterns in such impulse trains. To accomplish such tasks, we have recently proposed some tools for geometrical analysis (Jafari et al., in press; Shekofteh et al., in press).

In order to show the benefit of using geometry and topology in the phase space (state space), a simple example is provided below. Consider the famous logistic map, a very simple and well-investigated chaotic map:

$$x_{k+1} = A x_k \left(1 - x_k\right) \tag{1}$$

Suppose that we have two different maps with different values of parameter A:

$$x_{k+1} = 3.8\, x_k \left(1 - x_k\right) \tag{2}$$

$$x_{k+1} = 3.9\, x_k \left(1 - x_k\right) \tag{3}$$

If we obtain one time series from each of them, as can be seen in **Figure 1A**, both are random-like, and recognizing the difference between them in the time domain seems difficult. However, they show two ordered and easily distinguishable patterns in the state space (**Figure 1B**).
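This distinguishability can be made concrete in a few lines: the two series look alike in time, yet each lies exactly on its own parabola in the (x_k, x_{k+1}) plane, so the parameter A is recoverable from the geometry. The estimator below is an illustrative sketch:

```python
import numpy as np

def logistic_series(A, x0=0.4, n=500):
    """Iterate the logistic map x_{k+1} = A x_k (1 - x_k)."""
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = A * x[k] * (1.0 - x[k])
    return x

a = logistic_series(3.8)  # map of Eq. (2)
b = logistic_series(3.9)  # map of Eq. (3)

# In the time domain both series are random-like, but in the (x_k, x_{k+1})
# state space every point of each series falls on its own parabola
# x_{k+1} = A x_k (1 - x_k), so A is recovered directly from the geometry.
A_est_a = np.median(a[1:] / (a[:-1] * (1 - a[:-1])))
A_est_b = np.median(b[1:] / (b[:-1] * (1 - b[:-1])))
print(round(A_est_a, 3), round(A_est_b, 3))  # → 3.8 3.9
```

The same idea underlies the geometrical tools mentioned above: structure invisible in the time domain becomes an explicit, low-dimensional shape in the state space.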

Since looking at neurophysiology from dynamical and geometrical points of view has already been successfully explored in previous works (Sauer, 1994; Christini and Collins, 1995; Gottschalk et al., 1995; Milton and Black, 1995; Sarbadhikari and Chakrabarty, 2001; Korn and Faure, 2003; Hadaeghi et al., 2013; Jafari et al., 2013a), we believe that future investigations, especially using real clinical data, will be able to evaluate our hypothesis and demonstrate the benefit of such geometrical analysis of nonlinear data. Ultimately, a better understanding of neuronal information transmission from the nonlinear dynamics standpoint is expected to provide a better understanding of the basic pathophysiology of neurological disorders, possibly fostering new therapeutic approaches.

# **REFERENCES**


*Received: 03 August 2013; accepted: 15 August 2013; published online: 30 August 2013.*

*Citation: Jafari S, Hashemi Golpayegani SMR and Gharibzadeh S (2013) Is there any geometrical information in the nervous system? Front. Comput. Neurosci. 7:121. doi: 10.3389/fncom.2013.00121*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Jafari, Hashemi Golpayegani and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Can cellular automata be a representative model for visual perception dynamics?

# *Maryam Beigzadeh, Seyyed Mohammad R. Hashemi Golpayegani and Shahriar Gharibzadeh\**

*Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran \*Correspondence: gharibzadeh@aut.ac.ir*

*Edited by:*

*Tobias A. Mattei, Ohio State University, USA*

**Keywords: cellular automata, visual perception modeling, complex systems, chaos and nonlinear dynamics, EEG**

"Cellular automata (CA) are mathematical models for systems in which many simple components act together to produce complicated patterns of behavior" (Packard and Wolfram, 1985). Applying the CA theoretical framework in the field of neuroscience has shown successful results in the interpretation of some cognitive aspects (Adams et al., 1992; Pashaie and Farhat, 2009; Kozma and Puljic, 2013; Lopez-Ruiz and Fournier-Prunaret, 2013; Mattei, 2013). In this short analysis, we suggest that CA can be a very reasonable tool to model both dynamical and structural aspects of visual perception. As Wolfram declared in his book, A New Kind of Science, visual perception is a kind of modeling and reducing the input visual sensory data into a more summary but still informative representation in the brain (Wolfram, 2002).

Studying the visual system can be very useful because, as previously noted, "The visual system has the most complex neural circuitry of all the sensory systems" (Kandel et al., 2000), and at least 20% of the human cerebral cortex is devoted to visual processing (Olshausen, 2002). Additionally, trying to understand visual perception may lead us to a better understanding of how other cognitive processes in the brain work.

It has already been demonstrated that brain dynamics (which are reflected in EEG, MEG, and ECoG signals) are inherently chaotic (Freeman, 1991). As we perceive different sensory information (i.e., images, sounds, odors, etc.) and recognize different patterns, these dynamical processes tend to settle into more regular patterns. This stage has been referred to by other researchers as "the transients between gas-like randomness and liquid-like order (Kozma et al., 2012)." According to this paradigm, each stimulus would tend to lead the system to its own "liquid-like attractor," different from those of other stimuli. Thus, after a sensory stimulus, the brain dynamics would exhibit temporary switching between these different states.

But what would be the advantage of using such a CA model? There are millions of highly interactive neurons in the visual system, each demonstrating its own complex behavior. Their combined and integrated functions lead to the overall process of perception. The CA framework provides a model in which a collection of many interactive agents (cells) relate to each other according to specific "interaction rules" in space and time. The number of agents, their dynamical properties, and their interactions with each other determine which kind of behavior (chaotic, periodic, etc.) the CA will adopt.
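As a minimal concrete illustration of this point (a toy sketch in Python/NumPy, not a model of the visual system), an elementary one-dimensional CA shows how the local interaction rule alone determines the global behavior class:

```python
import numpy as np

def step(cells, rule):
    """One synchronous update of an elementary (1-D, two-state) CA with
    periodic boundaries; `rule` is the 8-bit Wolfram rule number."""
    left, right = np.roll(cells, 1), np.roll(cells, -1)
    idx = (left << 2) | (cells << 1) | right      # neighbourhood code 0..7
    table = (rule >> np.arange(8)) & 1            # rule as a lookup table
    return table[idx]

def evolve(rule, width=101, steps=60):
    """Run the CA from a single seed cell and return the space-time history."""
    cells = np.zeros(width, dtype=np.int64)
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return np.array(history)

# Identical initial condition and update scheme -- only the local interaction
# rule differs, yet the resulting global behaviors fall into different classes
# (rule 30 is the classic chaotic-looking case, rule 90 a nested pattern).
for rule in (30, 90):
    h = evolve(rule)
    print(f"rule {rule}: active fraction after 60 steps = {h[-1].mean():.2f}")
```

The same machinery extends to two-dimensional grids and probabilistic rules, which is the setting usually considered for perceptual modeling.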

Compared to other multiagent modeling tools (such as artificial neural networks), in CA the researcher is able to determine the local behavior of individual cells as well as their interaction rules and connectivity patterns, both locally and globally in space. In a CA model it is also possible to analyze the behavior of the system from the micro level up to the macro level. But how could analyzing the spatial properties of CA make visual perception modeling more realistic? It has already been demonstrated that in the visual system (at least in primary processing areas such as V1) there are specialized cells which, because of their specific structure and function, are more sensitive to particular properties of the perceived visual scene (such as image edges, textures, orientation, and spatial frequencies) that are inherently space-related features.

In this sense, CA would fit as a very appropriate model, as it exhibits close theoretical similarities with other methods that use graph theory and small-world network analysis (Sporns, 2006; Stam and Reijneveld, 2007). Additionally, it has already been suggested that probabilistic CA can be successfully employed to model olfactory perception (Kozma et al., 2012). Nevertheless, using CA to model visual perception from the dynamical and structural standpoints has not been reported before, although CA has already been used for modeling simpler visual tasks, such as retinal function, or as a computational tool for implementing image processing tasks in computer vision applications (edge detection, texture detection, noise reduction, etc.) (Wolfram, 2002; Dhillon, 2012). In this short commentary we argue that CA can be used as a holistic model for the integration of local visual aspects in a broader multimodal integration of the global aspects of visual perception.

One possible strategy to implement this paradigm would be to use specific objective measures (such as the number of active neurons, or the mean activation value of a specific network), and afterwards attempt to match the behavior of the resulting time series (by comparing its phase space and strange attractors) with real EEG recordings related to specific visual tasks (such as the classic "face/non-face discrimination").

In summary, the dynamic behavior of CA has been shown to be a powerful tool for modeling several types of neuronal activity, and we believe it can be successfully used to study global features of visual perception. In fact, future studies in this area may be able to demonstrate how perceptual deficits commonly observed in clinical practice (such as face recognition deficits in autistic patients) may be represented by changes in the basic parameters of CA models of visual representation.

# **REFERENCES**




*Received: 01 September 2013; accepted: 09 September 2013; published online: 01 October 2013.*

*Citation: Beigzadeh M, Hashemi Golpayegani SMR and Gharibzadeh S (2013) Can cellular automata be a representative model for visual perception dynamics? Front. Comput. Neurosci. 7:130. doi: 10.3389/fncom. 2013.00130*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Beigzadeh, Hashemi Golpayegani and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Bifurcation analysis of "synchronization fluctuation": a diagnostic measure of brain epileptic states

# *Fatemeh Bakouie1,2\*, Keivan Moradi 1, Shahriar Gharibzadeh1 and Farzad Towhidkhah2*

*<sup>1</sup> Neural and Cognitive Sciences Lab, Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran*

*<sup>2</sup> Cybernetics and Modeling of Biological Systems Laboratory, Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran \*Correspondence: fbakouie@aut.ac.ir*

#### *Edited and reviewed by:*

*Tobias A. Mattei, Ohio State University, USA*

**Keywords: complex network, synchronization fluctuation, dynamical system, bifurcation, control parameter**

The brain is a complex network with functional elements spatially distributed in different regions. One suggested mechanism for communication among these distributed elements is synchronization (Singer, 1993).

Two oscillating neural groups are said to be "synchronized" if, over time, their phase difference does not increase appreciably. In a real system composed of several oscillators, the synchronization level is a computable parameter. According to this paradigm, depending on the functional state of the brain, the level of synchronization among brain regions may vary over time. This variation is called "synchronization fluctuation" (SF). Regarding the brain's higher functions, such as consciousness and memory, SF patterns are important features of normal brain states (Schnitzler and Gross, 2005; Watrous et al., 2013).

In some pathological brain states such as epilepsy, however, hypersynchronization is a major problem (Lehnertz et al., 2009). In such situations, synchronization occurs without fluctuations. Therefore, in epilepsy, SF may lose its dynamicity, producing a narrow-dynamics signal. The question which arises is: how is it possible to manage diseases related to the poor dynamics of SF in the brain?

The dynamical systems approach may provide some answers to this question. Based on dynamical systems theory, even a slight modification of a parameter (the so-called "control parameter") can lead to a significant qualitative change in the system's behavior. This change is called a "bifurcation" (Guckenheimer, 2007). The dynamical approach has already been successfully applied to the study of the functional status of epileptic states. For example, Babloyantz and Destexhe reported the nonlinearity of absence seizures (Babloyantz and Destexhe, 1986). Moreover, Stam claims that epilepsy is the most important application of nonlinear EEG analysis (Stam, 2005). In another study, Perez Velazquez et al. suggested that the interictal-ictal transition may be the result of a bifurcation due to alterations in control parameters, such as the balance between excitation and inhibition in the underlying neuronal networks (Perez Velazquez et al., 2003).

We hypothesize that SF may be a representative parameter of brain dynamics, one that undergoes identifiable bifurcations according to specific brain states. According to this approach, SF dynamics is expected to change from a rich state to a narrower one when the brain shifts from normal conscious to abnormal unconscious epileptic conditions.
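This hypothesis can be illustrated with a standard Kuramoto model of coupled phase oscillators (a toy sketch of our own, not a validated brain model; the parameter values are assumptions): sweeping the coupling strength K as a control parameter shows the synchronization level r(t) saturating, and its fluctuation (a crude stand-in for SF) collapsing, in the hypersynchronized regime.

```python
import numpy as np

def kuramoto_sf(K, n=100, dt=0.02, steps=6000, seed=0):
    """Simulate n Kuramoto phase oscillators with global coupling K and
    return the mean and fluctuation (std) of the synchronization level
    r(t) = |mean(exp(i*theta))| after discarding the transient."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0, n)              # natural frequencies
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    r_hist = []
    for t in range(steps):
        z = np.exp(1j * theta).mean()
        r, psi = np.abs(z), np.angle(z)
        # mean-field form of d(theta_i)/dt = omega_i + (K/n)*sum_j sin(theta_j - theta_i)
        theta = theta + dt * (omega + K * r * np.sin(psi - theta))
        if t > steps // 2:                       # keep the second half only
            r_hist.append(r)
    r_hist = np.array(r_hist)
    return r_hist.mean(), r_hist.std()

# Sweeping the control parameter K: weak coupling gives low, fluctuating
# synchrony; strong coupling gives hypersynchronization whose level barely
# fluctuates -- a crude analog of SF losing its dynamicity.
for K in (0.5, 1.6, 8.0):
    m, s = kuramoto_sf(K)
    print(f"K = {K}: mean r = {m:.2f}, SF (std of r) = {s:.4f}")
```

In this caricature, the bifurcation-like change in the fluctuation of r as K crosses its critical value plays the role of the hypothesized transition from rich to narrow SF dynamics.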

Biologically, different mechanisms have already been suggested as the underlying basis of brain synchronization. For instance, it has been shown that gap junctions, coupling of neurons via long-term synaptic plasticity, interneurons, and rhythm generators of the brain such as the medial septum-diagonal band of Broca (MSDBB) may play a role in synchronization between two neurons or between larger neuronal networks (Buzsáki, 2002). Such biological mechanisms that control synchronization can be considered control parameters of SF in brain dynamics. Among these parameters, for example, variations may exist in the number and permeability of gap junctions, the synaptic strength between two neurons, the distribution, frequency, and strength of GABAergic inhibition by interneurons, and the distribution, frequency, and strength of excitation and inhibition by the cholinergic and GABAergic neurons of the MSDBB. Moreover, Margineanu and Klitgaard have already demonstrated that levetiracetam (LEV) antagonizes neuronal (hyper)synchronization in the CA3 area of rat brain slices, an area prone to epilepsy (Georg Margineanu and Klitgaard, 2000). In another study, Clemens showed that valproate decreases EEG synchronization in idiopathic generalized epilepsy (Clemens, 2008).

Concerning connectivity among brain regions, Kay et al. reported that in treatment-responsive epileptic patients, compared to healthy controls, default mode network (DMN) connectivity is not significantly reduced, whereas in treatment-resistant epileptic patients connectivity is reduced relative to the control group (Kay et al., 2013). Another study showed DMN alterations in mesial temporal lobe epilepsy. Furthermore, Liao et al. showed that in mesial temporal lobe epilepsy (mTLE) patients with hippocampal sclerosis (HS), functional and structural connectivity between hippocampal structures and their adjacent regions is reduced (Liao et al., 2011). Compared to controls, significant reductions were also found in functional and structural connectivity between the posterior cingulate cortex (PCC)/precuneus (PCUN) and the bilateral mesial temporal lobes (mTLs). Resting-state functional magnetic resonance imaging studies showed that in drug-resistant temporal lobe epilepsy, functional connectivity between the hippocampus, anterior temporal and precentral cortices and the default mode and sensorimotor networks is reduced. Based on these findings, it could be claimed that the reduction in functional connectivity within the DMN in mTLE may result from reduced connection density, leading to degeneration of structural connectivity (Voets et al., 2012). These findings show that connectivity reduction occurs in epilepsy, while pharmacological treatment tends to drive this change in connectivity back toward the normal state. The mechanism of such therapeutic action, however, is still relatively unknown (Jin and Zhong, 2011).

In the future, it would be interesting to analyze the efficacy of therapeutic strategies addressing diseases caused by changes in SF dynamicity (such as antiepileptic drugs) according to their capacity to carefully tune the control parameters of SF in order to set the brain back to its normal states. As evidence, Krystal et al. hypothesized that the largest Lyapunov exponent (λ1) may decrease during electroconvulsive therapy (ECT) seizures (Krystal et al., 1996). Although they did not assess synchronization directly, a decreased λ1 corresponds to decreased EEG complexity. In another experimental treatment strategy for epilepsy, researchers implemented an "automated, just-in-time stimulation seizure control method" in epileptic rats. Interestingly, the successful control of seizures with this therapy correlated highly with desynchronization of brain dynamics (Good et al., 2009).

Such experimental studies support the idea that, by tuning the control parameters of SF, it may be possible to drive pathological brain states back into normal ones. Therefore, we suggest that SF may be an important measure representing brain dynamics, and that SF dynamics may be a promising subject for future experimental studies aiming to uncover the underlying mechanisms of pathological brain states.

# **REFERENCES**


*Received: 23 November 2013; accepted: 21 January 2014; published online: 06 February 2014.*

*Citation: Bakouie F, Moradi K, Gharibzadeh S and Towhidkhah F (2014) Bifurcation analysis of "synchronization fluctuation": a diagnostic measure of brain epileptic states. Front. Comput. Neurosci. 8:11. doi: 10.3389/fncom.2014.00011*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Bakouie, Moradi, Gharibzadeh and Towhidkhah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A more realistic quantum mechanical model of conscious perception during binocular rivalry

#### *Mohammad Reza Paraan1, Fatemeh Bakouie2 \* and Shahriar Gharibzadeh2*

*<sup>1</sup> Energy Engineering and Physics Department, Amirkabir University of Technology, Tehran, Iran*

*<sup>2</sup> Neural and Cognitive Sciences Lab, Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Iran*

*\*Correspondence: fbakouie@aut.ac.ir*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

**Keywords: consciousness, quantum state, mixed state, probability distribution, dominance duration**

#### **A commentary on**

# **Quantum formalism to describe binocular rivalry**

*by Manousakis, E. (2009). Biosystems 98, 57–66. doi: 10.1016/j.biosystems.2009.05.012*

Since the first systematic description of binocular rivalry by Wheatstone, this fascinating phenomenon has provided several new insights into the mechanisms of visual awareness (Leopold and Logothetis, 1999). Binocular rivalry (BR) is the subjective experience of randomly alternating perceptions pertaining to the two eyes when they are presented with conflicting stimuli. Because of its nature, BR enables consciousness researchers to separately investigate the mechanisms of perception and conscious experience (Gazzaniga et al., 2009). Among various descriptions of this phenomenon, quantum mechanical descriptions stand out as the most radical.

In a recent innovative work by Manousakis, the formalism of quantum mechanics is utilized to describe the conscious experience during BR. Although the author has successfully derived the observed probability distribution of dominance durations (PDDD), his approach neglects some essential features of conscious perception during BR. Generally, two kinds of perception dominate during BR: (1) full dominance of one eye's stimulus, and (2) composite or mixed dominance of the two monocular stimuli (Yang et al., 1992). Our argument revolves around the latter kind of perception, which is also referred to as the transition phase or transition state.

Classically, simplifying assumptions imposed experimental conditions in which only full dominance was perceived by subjects and the mixed state's (MS) duration was minimized. However, many experiments reveal the diversity of rivalry's temporal dynamics and, specifically, the important role of MS (Hollins, 1980; Blake et al., 1992; Bossink et al., 1993; Wilson et al., 2001). Regarding the neural correlates of MS, it has been shown that the frontoparietal areas of the brain trigger rivalry transitions (Lumer et al., 1998; Knapen et al., 2011). It must be emphasized that various studies on the neural concomitants of BR suggest that no single neural site or mechanism is at work during BR; rather, multiple stages and brain areas are involved (Blake and Logothetis, 2002).

Many attempts have been made to model the dynamical behavior of BR, most of which try to reproduce the temporal dynamics of BR by reconstructing specific neural mechanisms (Kalarickal and Marshall, 2000; Laing and Chow, 2002; Stollenwerk and Bode, 2003; Freeman, 2005). Many of these models ignore MS in order to avoid crippling complications, yet Brascamp and colleagues showed that none of the previous models is capable of reproducing the full range of observed dynamics, which includes MS (Brascamp et al., 2006b), and hence tried to develop a new model (Brascamp et al., 2006a; Noest and van Ee, 2006). Another group of models, of which Manousakis' model is an example, captures certain aspects of rivalry's dynamics without resorting to the underlying neural circuits (Mamassian and Goutcher, 2005). However, in order to obtain the PDDD, Manousakis employs temporal parameters characterizing neuronal firing. This is an interesting achievement because it ties the dynamics of conscious perception to specific firing patterns.

Like the classical models, Manousakis' model treats only the two dominance states, which are represented by two quantum states, while MS is ignored. The author compares his theoretical PDDD with the observed PDDD of classical experiments (Levelt, 1968; Lehky, 1995), which did not record the mixed states' durations separately. We believe that the quantum states are only symbols manipulated according to the quantum formalism, bearing no resemblance to the perceptions they represent. Therefore, in Manousakis' approach, only the number of states and their associated probabilities determine the resulting PDDD. Thus, unlike classical models, the scope of the quantum mechanical model can be readily extended by introducing a third quantum state representing MS. In order to test the new model, its PDDD should be calculated and compared against experimental data in which the dominance durations of all three states are recorded separately. It must be emphasized that the probability distribution is not a complete description of the dynamics of BR, and it will be necessary to extract other related quantities from the model in future work.
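To make the extension concrete, the following toy simulation (an illustrative sketch of our own, not Manousakis' actual formalism; the unitary evolution and observation scheme are assumptions) implements a collapse-style model in which adding a third quantum state for MS enters the dominance-duration statistics directly, with no change to the formalism itself:

```python
import numpy as np

def dominance_durations(n_states, mix=0.15, steps=50000, seed=1):
    """Toy collapse model (not Manousakis' actual formalism): between
    successive 'observations' a fixed near-identity unitary slightly mixes
    the current basis state with the others; each observation then collapses
    the state via the Born rule.  A dominance duration is a maximal run of
    identical collapse outcomes."""
    rng = np.random.default_rng(seed)
    # random near-identity unitary U = exp(i * mix * H) for a Hermitian H
    a = rng.normal(size=(n_states, n_states)) + 1j * rng.normal(size=(n_states, n_states))
    H = (a + a.conj().T) / 2.0
    w, v = np.linalg.eigh(H)
    U = v @ np.diag(np.exp(1j * mix * w)) @ v.conj().T
    outcomes = np.empty(steps, dtype=np.int64)
    psi = np.zeros(n_states, dtype=complex); psi[0] = 1.0
    for t in range(steps):
        psi = U @ psi
        p = np.abs(psi) ** 2
        k = rng.choice(n_states, p=p / p.sum())  # Born-rule observation
        psi = np.zeros(n_states, dtype=complex); psi[k] = 1.0  # collapse
        outcomes[t] = k
    change = np.flatnonzero(np.diff(outcomes)) + 1
    return np.diff(np.concatenate(([0], change, [steps])))

# A third quantum state (the mixed state) changes the statistics of
# dominance durations without altering the formalism.
for n in (2, 3):
    runs = dominance_durations(n)
    print(f"{n}-state model: {len(runs)} dominance periods,"
          f" mean duration = {runs.mean():.1f} observations")
```

The durations produced by this caricature are roughly geometric; reproducing the empirically observed PDDD would require the observation statistics that Manousakis derives from neuronal firing parameters.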

It is worthwhile to discuss another work, by Conte and colleagues, who showed that mental states follow quantum mechanics during the conscious bi-stable perception of ambiguous figures (Conte et al., 2009). Their model shares many features with that of Manousakis, with the exception that they take into account the periods during which their subjects report indeterminate perception. Indeterminate perception resembles MS in that both are mental states mediated by specific neural correlates. However, Conte et al. represent the indeterminacy state by the wave-function of the two-state system rather than by an additional third quantum state. Technically, a wave-function is a superposition of all the real possible states of a quantum system. We believe this is an inappropriate take on the problem, one that leads to inconsistencies within the model. The developers of these two quantum mechanical models hold that the actualization of each quantum state is equivalent to the activation of the neural correlates of consciousness (NCC) of the corresponding perception; a state is actualized when a quantum system is measured (observed) and its wave-function subsequently "collapses" to that constituent state. Therefore, we believe that the wave-function is not a legitimate representation: it does not describe a real state of the system and is doomed to collapse, while the specific NCC of MS, or of indeterminate perception, demands a distinct associated quantum state.

Manousakis' neglect of MS might be justified by the presumption that this state only functions as a bridge between the two dominance states; that is, MS does not compete with the other two and is not involved in rivalry. It is noteworthy that the term "transition" has led to a misunderstanding, namely that MS occurs only when perception is being switched from one eye to the other. But, as is often the case in BR experiments, subjects report the same perception as the one that was dominant before MS. Hence, there is no regular periodic alternation between dominance and suppression (Mueller and Blake, 1989; Brascamp et al., 2006b). We believe these observations indicate that MS is not a mere bridge connecting the two dominant states, but a state which dominates consciousness randomly and therefore enters the statistical calculations of quantum mechanics.

# **REFERENCES**




*Received: 12 January 2014; accepted: 02 February 2014; published online: 20 February 2014.*

*Citation: Paraan MR, Bakouie F and Gharibzadeh S (2014) A more realistic quantum mechanical model of conscious perception during binocular rivalry. Front. Comput. Neurosci. 8:15. doi: 10.3389/fncom. 2014.00015*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Paraan, Bakouie and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A hypothesis on the role of perturbation size on the human sensorimotor adaptation

#### *Fatemeh Yavari 1, Farzad Towhidkhah1 \* and Mohammad Darainy2*

*<sup>1</sup> Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Iran*

*<sup>2</sup> Department of Psychology, McGill University, Montreal, QC, Canada*

*\*Correspondence: towhidkhah@aut.ac.ir*

#### *Edited and reviewed by:*

*Tobias Alecio Mattei, Ohio State University, USA*

**Keywords: adaptation, perturbation amplitude, error size, sensory recalibration, internal model**

# **INTRODUCTION**

Some evidence suggests that, depending on the size of the error produced by a perturbation, distinct learning mechanisms and neural structures are employed in the brain (Kluzik et al., 2008; Criscimagna-Hemminger et al., 2010; Gibo et al., 2013). Here, based on existing evidence, we propose a hypothesis about the potential adaptation mechanisms that may be employed in the brain depending on the perturbation magnitude. In the following sections, we first briefly explain the proposed hypothesis. Then a short description of the resolution of the proprioceptive sense of the hand is presented; in this hypothesis, the size of the error is assessed relative to this resolution. Next, the empirical evidence supporting the proposed hypothesis is briefly described.

# **THE HYPOTHESIS**

Our hypothesis schematically represented in **Figure 1** is as follows:

1- For a perturbation amplitude that is small compared to the proprioceptive resolution, the produced movement error (Err. in **Figure 1**) will be small as well. A small error often does not reach the subject's awareness (Cressman and Henriques, 2009; Criscimagna-Hemminger et al., 2010). In this condition, the brain may attribute the perturbation to an internal source and compensate for it by recalibrating the proprioceptive sense. This may be expressed by shifting the input-output relationship of the proprioceptive module (i.e., the Proprioceptive block in **Figure 1**). The input-output relationship of this module has been modeled with a quantization (staircase) function to represent the limited resolution.

2- For a large perturbation amplitude, the produced movement error will be large as well, which typically makes the subject aware of the perturbation (Malfait and Ostry, 2004). In this case, the assumption is that the perturbation may be caused by an external source, and the brain may need to form or update internal forward and/or inverse models of the new dynamics to reduce movement errors.
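The two branches above can be sketched as a simple decision rule built on a quantization (staircase) model of the proprioceptive sense (a hypothetical illustration of our own; the 5° resolution is taken from the next section, and the function names are ours, not from any published model):

```python
import numpy as np

RESOLUTION_DEG = 5.0   # assumed proprioceptive resolution (see next section)

def perceived_error(true_error_deg, resolution=RESOLUTION_DEG):
    """Staircase (quantization) model of the proprioceptive sense:
    errors register only in steps of the sensory resolution."""
    return float(np.round(true_error_deg / resolution) * resolution)

def adaptation_route(perturbation_deg):
    """Hypothetical decision rule from the hypothesis: sub-resolution errors
    are attributed to an internal source (sensory recalibration), larger
    errors to an external source (internal-model update)."""
    if perceived_error(perturbation_deg) == 0.0:
        return "sensory recalibration"
    return "internal model update"

for p in (1.0, 2.0, 8.0, 30.0):
    print(f"perturbation {p:5.1f} deg -> perceived {perceived_error(p):5.1f} deg,"
          f" route: {adaptation_route(p)}")
```

In this sketch, sub-resolution perturbations quantize to a perceived error of zero and therefore trigger the recalibration branch, while larger ones trigger the internal-model branch.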

# **RESOLUTION OF THE PROPRIOCEPTIVE SENSE**

The resolution of the proprioceptive sense can be inferred from previous studies. Diedrichsen et al. (2010) moved subjects' hands passively, using a robotic arm, along a trajectory deviated 8° to the left or right of the body midline. In the absence of visual feedback, subjects were not able to guess the direction of this deviation. In another study (Farrer et al., 2003), the experimenter moved the subject's hand by pulling a rod connected to a joystick. Subjects had no direct view of their hand; instead, a virtual hand image provided visual feedback. The visual feedback was deviated to the right or left relative to the actual hand movement by a certain angular value (0, 5, 10, 15, 20, 30, 40, or 50°) in each trial. At the end of each movement, subjects had to indicate whether their movement and the visual feedback were at the same place. They were not able to detect the deviation when it was less than 5° (Figure 2 in Farrer et al., 2003). Also, Darainy et al. (2013) observed that during passive hand movements the perceptual boundary was to the left of the midline. Based on the observations in the above-mentioned studies and others (Cressman and Henriques, 2009; Fuentes et al., 2011), it can be suggested that the resolution of the proprioceptive sense is about 5° (in the midline direction). On the other hand, there is some evidence supporting the notion that proprioception is more precise in the front-back direction than in the left-right direction (van Beers et al., 2002; Wilson et al., 2010). Therefore, it seems plausible to infer that the maximum resolution of the proprioceptive sense is in the midline direction.

# **EVIDENCE SUPPORTING THE PROPOSED HYPOTHESIS**

Some of the observations which can be explained based on this hypothesis are given in the following:


**FIGURE 1 | Schematic representation of the proposed hypothesis.** The general structure of this model has been borrowed from other studies, e.g., Shadmehr and Krakauer (2008). θ(t), θ_FM(t), θ_p(t), and θ_v(t) are, respectively, the system output and its estimates by the forward model, the proprioceptive sense, and the visual sense. θ̂(t) is the final estimate of the system output obtained from integration. Dashed and dot-dashed lines show sensory recalibration and internal models' (IMs') adaptation, respectively.

confirm the dependence of adaptation on the cerebellum in the presence of large, but not small, errors, and are in line with the proposed hypothesis.




# **SUMMARY**

We have presented a hypothesis about the possible adaptation mechanisms employed in the brain depending on error size. The proposed hypothesis can help provide a better understanding of motor adaptation mechanisms in the brain. Further validation of the hypothesis requires more investigation and experiments. For example, adaptation in response to a gradual perturbation could be compared across deafferented subjects, cerebellar patients, and healthy individuals. This comparison could address generalization patterns to the untrained hand or to other contexts with the same hand, adaptation rate, wash-out rate, etc. It has been shown that deafferented individuals were able to adapt their reaches to altered visual feedback of the hand (Ingram et al., 2000; Bernier et al., 2006; Miall and Cole, 2007). Adaptation in these subjects may show different features compared to healthy individuals.

# **REFERENCES**




*Received: 22 January 2014; accepted: 22 February 2014; published online: 11 March 2014.*

*Citation: Yavari F, Towhidkhah F and Darainy M (2014) A hypothesis on the role of perturbation size on the human sensorimotor adaptation. Front. Comput. Neurosci. 8:28. doi: 10.3389/fncom.2014.00028*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Yavari, Towhidkhah and Darainy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Artificial neural networks: powerful tools for modeling chaotic behavior in the nervous system

*Malihe Molaie1, Razieh Falahian1, Shahriar Gharibzadeh1, Sajad Jafari <sup>1</sup> \* and Julien C. Sprott <sup>2</sup>*

*<sup>1</sup> Department of Bioelectric, Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran*

*<sup>2</sup> Department of Physics, University of Wisconsin, Madison, WI, USA*

*\*Correspondence: sajadjafari@aut.ac.ir*

#### *Edited and reviewed by:*

*Tobias Alecio Mattei, Ohio State University, USA*

**Keywords: artificial neural networks, biological systems, electroretinogram, chaos, bifurcation diagram**

Modeling real-world systems plays a pivotal role in their analysis and contributes to a better understanding of their behavior and performance. Classification, optimization, control, and pattern recognition problems rely heavily on modeling techniques. Such models can be categorized into three classes: white-box, black-box, and gray-box (Nelles, 2001). White-box models are fully derived from first principles, i.e., the laws of physics, chemistry, biology, economics, etc.; all equations and parameters are determined from theory. Black-box models are based solely on experimental data, and their structure and parameters are determined by experimental modeling. Building black-box models requires little or no prior knowledge of the system. Gray-box models represent a compromise or combination of white-box and black-box models (Nelles, 2001).

In the modeling of highly nonlinear and complex phenomena, we may not have a good understanding of the underlying processes, and thus black-box models may be our best (or even our only) choice. Artificial neural networks (ANNs) are among the most powerful and popular tools for black-box modeling and are inspired by real biological neural networks.

There has been increasing interest in recent years in analyzing neurophysiology from a nonlinear and chaotic systems viewpoint (Christini and Collins, 1995; Sarbadhikari and Chakrabarty, 2001; Korn and Faure, 2003; Hadaeghi et al., 2013; Jafari et al., 2013; Mattei, 2013). For example, although the famous Hodgkin and Huxley model (Hodgkin and Huxley, 1952) has been the basis of almost all proposed models of neural firing, the Rose-Hindmarsh model (Hindmarsh and Rose, 1984) is considered more refined because it can show different firing patterns, especially chaotic bursts of action potentials, which permit a proper match between model behavior and experimental data. Another example of chaotic behavior in the nervous system is the period-doubling route to chaos in flicker vision (Crevier and Meister, 1998), which is the focus of this letter.

Stimulation with periodic flashes of light is useful for distinguishing some disorders of the human visual system (Crevier and Meister, 1998). Crevier and Meister (1998) showed that period-doubling can occur during electroretinogram (ERG) recordings of the visual system. It is well-known that period-doubling occurs in nonlinear dynamical systems and is often associated with the onset of chaos. In one study (Crevier and Meister, 1998) the retina of a salamander was stimulated with periodic square-wave flashes, and the ERG was recorded. The flash frequency was varied between zero and 30 Hz while the contrast was held constant. In another recording, the contrast was varied while the frequency was fixed at 16 Hz. All the ERG signals were filtered at 1–1000 Hz. Using a common approach for obtaining a discrete time series from a continuously recorded signal, successive local maxima of the signal were extracted as a time series (**Figure 1A**). As shown in **Figures 1B,C**, both parameters (flash frequency and contrast) have a great effect on the recorded ERG signals and cause bifurcations resulting in a period-doubling route to chaos.
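The successive-maxima reduction described above can be sketched as follows; the synthetic signal standing in for an ERG trace is purely illustrative and is not the authors' data.

```python
import numpy as np

def local_maxima(signal):
    """Return the values of successive local maxima of a 1-D signal.

    A sample counts as a local maximum if it strictly exceeds both of
    its neighbors; the resulting sequence is the discrete time series x_n.
    """
    s = np.asarray(signal, dtype=float)
    mask = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])
    return s[1:-1][mask]

# Illustrative example: a modulated oscillation standing in for an ERG trace
t = np.linspace(0, 1, 1000)
x = (1 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 16 * t)
peaks = local_maxima(x)  # discrete series of successive maxima
```

Each value in `peaks` becomes one point x_n of the time series that the bifurcation diagrams are built from.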

However, it is difficult to understand the exact relations between the parameters and their effects. In other words, it is not easy to build a white-box model that can regenerate the signals and diagrams accurately, likely because of the highly complex and nonlinear dynamics involved. We have therefore used the ability of an ANN to learn highly nonlinear dynamics as a black-box model of this system. We used a feed-forward neural network with four hidden layers of (7/4/8/5) neurons (**Figure 1D**) and hyperbolic tangent transfer functions, which help the network learn the complex relationships between input and output. The activation function of the last layer is linear. We used the two parameters (contrast and frequency) and three delayed samples (*xn*−1*, xn*−2, and *xn*−3) as the inputs of the ANN to fit each data point of the time series (*xn*) as the output of the network.
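A minimal sketch of one forward pass through a network of this shape, assuming the five inputs listed above; the random weights are placeholders for parameters that would in practice be fit to the recorded ERG time series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer widths as described in the text: 5 inputs (contrast, frequency,
# and three delayed samples), four tanh hidden layers of 7/4/8/5 neurons,
# and one linear output neuron predicting x_n. Weights are random
# placeholders, not trained values.
sizes = [5, 7, 4, 8, 5, 1]
weights = [0.1 * rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def predict(contrast, frequency, x1, x2, x3):
    """One forward pass: tanh hidden layers, linear output layer."""
    a = np.array([contrast, frequency, x1, x2, x3], dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)
    return (a @ weights[-1] + biases[-1]).item()  # linear estimate of x_n
```

Training such a network on the (contrast, frequency, delayed-samples) → x_n pairs is what lets it regenerate the bifurcation diagrams.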

As shown in **Figures 1E,F**, this model can generate bifurcation diagrams similar to those obtained from real data. As a result, we believe that ANNs are powerful tools for modeling highly nonlinear behavior in the nervous system. In future work we plan to construct ANN models for more cases in greater detail, to extend the ideas in Hadaeghi et al. (2013) to patients with bipolar disorder, and to extend the ideas in Jafari et al. (2013) to patients with attention deficit hyperactivity disorder (ADHD).

were used. **(E)** Artificial bifurcation diagram resulting from varying the flash frequency input to the ANN. **(F)** Artificial bifurcation diagram resulting from varying the contrast input to the ANN.

# **ACKNOWLEDGMENTS**

The authors would like to thank Professor Markus Meister for allowing us to use his data.

# **REFERENCES**


Hodgkin, A. L., and Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. *J. Physiol*. 117, 500–544.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 13 March 2014; accepted: 21 March 2014; published online: 09 April 2014.*

*Citation: Molaie M, Falahian R, Gharibzadeh S, Jafari S and Sprott JC (2014) Artificial neural networks: powerful tools for modeling chaotic behavior in the nervous system. Front. Comput. Neurosci. 8:40. doi: 10.3389/fncom.2014.00040*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Molaie, Falahian, Gharibzadeh, Jafari and Sprott. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Synchrony analysis: application in early diagnosis, staging and prognosis of multiple sclerosis

# *Zahra Ghanbari and Shahriar Gharibzadeh\**

*Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran \*Correspondence: gharibzadeh@aut.ac.ir*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Al-Rahim Abbasali Tailor, The Ohio State University Wexner Medical Center, USA*

**Keywords: multiple sclerosis (MS), synchrony, early diagnosis, staging, prognosis, prediction**

Multiple Sclerosis (MS) is an autoimmune disease caused by degeneration of the myelin sheath of large-diameter fibers in the central nervous system. This causes deficits in the conducting properties of nerves and also affects electrical signaling. As a result, nerve conduction in MS patients is slower than normal (Kandel et al., 2000).

Neural synchrony has attracted great interest in neuroscience recently. In signal processing, synchrony refers to quantifying the similarity, coherence, or correlation among signals and can be measured using a variety of methods (Dauwels et al., 2010). Neural synchrony represents how synchronously neurons are firing (Vialatte et al., 2008). Synchrony has proven to be an important feature of brain signals, and many neurological diseases are accompanied by abnormalities in neural synchrony (Dauwels et al., 2008).

For example, loss of synchrony among brain signals has been observed in disorders such as Parkinson's disease and Alzheimer's disease (AD) and has been used for diagnosis. On the other hand, increased synchrony has been reported in disorders such as epileptic seizures (Vialatte et al., 2009).
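As an example of such a measure, the phase-locking value (one widely used synchrony index) can be computed as follows; the choice of index and the synthetic signals are illustrative, not the authors' proposed pipeline.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via FFT (a Hilbert-transform construction):
    zero out negative frequencies and double the positive ones."""
    x = np.asarray(x, dtype=float)
    n = x.size
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1:n // 2] = 2.0
        h[n // 2] = 1.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def phase_locking_value(x, y):
    """PLV = |mean(exp(i*(phi_x - phi_y)))|, in [0, 1]; 1 means a
    constant instantaneous phase difference (perfect phase synchrony)."""
    dphi = np.angle(analytic_signal(x)) - np.angle(analytic_signal(y))
    return float(np.abs(np.mean(np.exp(1j * dphi))))

# Two noisy recordings of the same 10 Hz rhythm are strongly phase-locked
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 500, endpoint=False)
a = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(500)
b = np.sin(2 * np.pi * 10 * t + 0.5) + 0.1 * rng.standard_normal(500)
plv = phase_locking_value(a, b)  # close to 1
```

Computing such an index pairwise across EEG/ERP channels yields the local and global synchrony measures the proposal relies on.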

Since perturbed electrical signaling and slowed nerve conduction are common to MS and the aforementioned diseases, this suggests using synchrony for MS as well. In addition, previous work on MS has reported loss of connectivity and synchronous function among different parts of patients' brains. It should be mentioned that most previous work concentrated on the cognitive impairments caused by the disease and applied its methods to MEG signals (Arrondo et al., 2009; Hardmeier et al., 2012).

It should also be noted that although MRI and ERP are both common tools in MS diagnosis and follow-up, a definite diagnosis cannot be made based on either criterion individually. In addition, MRI needs to be repeated (Greenberg et al., 2009; Longo et al., 2012) and is neither affordable nor available in many situations, so a more reliable and accessible solution is needed.

Given these points, we believe that recording electrical brain signals (particularly EEG and ERP) and calculating local and global synchrony among their channels may provide a stand-alone tool for diagnosing MS. The idea we put forward is to use calculated synchrony indices for detection, classification, and prediction based on electrical brain signals. Previous results investigating the connectivity and synchronous function of brain regions support this idea (Arrondo et al., 2009; Hardmeier et al., 2012).

The proposed idea may also help detect MS in its early stages. Additionally, since impairments increase as the disease progresses, synchrony measures may differ significantly across stages of the disease, so they could be useful for staging as well.

We also propose measuring synchrony among brain signals during onset periods. There should be a correlation between changes in synchrony measures and disease prognosis; in other words, based on the calculated synchrony indices, we may be able to predict the course of the disease. This would provide a clearer perspective on the likely efficacy of different management modalities (both medical and surgical). Additionally, based on the measured level of neural dyssynchrony, the proposed idea can be useful for assessing the efficacy of the selected treatments, for both the patient and the physician. Of course, experimental evaluations are needed to validate our hypothesis.

# **REFERENCES**


*Received: 26 April 2014; accepted: 27 June 2014; published online: 21 July 2014.*

*Citation: Ghanbari Z and Gharibzadeh S (2014) Synchrony analysis: application in early diagnosis, staging and prognosis of multiple sclerosis. Front. Comput. Neurosci. 8:73. doi: 10.3389/fncom.2014.00073*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Ghanbari and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The hypothetical cost-conflict monitor: is it a possible trigger for conflict-driven control mechanisms in the human brain?

# *Sareh Zendehrouh\*, Shahriar Gharibzadeh and Farzad Towhidkhah*

*Biomedical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran \*Correspondence: sareh.zendehrouh@gmail.com*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Tobias Alecio Mattei, Ohio State University, USA Carlos Rodrigo Goulart, Ohio State University, USA*

**Keywords: performance monitoring, cognitive control, conflict-driven control, monitor-controller networks, feedback-related negativity**

Flexible goal-directed behavior requires a performance monitoring system to monitor behavioral consequences in order to detect the need for further adjustments and control. When a failure in performance is detected by the monitoring system, signals are transmitted to the brain structures responsible for control implementation. Evidence suggests the anterior cingulate cortex (ACC) (Carter et al., 1998; Gehring and Knight, 2000; MacDonald et al., 2000; Ferdinand et al., 2012) and the lateral prefrontal cortex (lPFC) (MacDonald et al., 2000; Ridderinkhof et al., 2004a,b) as the neural correlates of the performance monitoring and control implementation systems, respectively. The interaction of these two systems appears to modulate several components of event-related brain potentials (ERPs) linked with performance monitoring, such as the error-related negativity (ERN), the N200, and the feedback-related negativity (FRN) (Gruendler et al., 2011). The ERN is an ERP component that begins close to the time of the erroneous response in speeded response time tasks and peaks about 100 ms later (Gehring et al., 1993). The N200 is another negative deflection in the ERP that peaks between 200 and 400 ms after stimulus onset, prior to response execution, on correct trials of cognitive control experiments (Olvet and Hajcak, 2008). The FRN, one of the most studied components, is a negative-going deflection observed 230–330 ms following outcome presentation (Miltner et al., 1997) in gambling and trial-and-error learning tasks (Holroyd et al., 2006). Source localization studies show the neural source of the FRN to be located most probably in the ACC (Miltner et al., 1997; Gehring and Willoughby, 2002; Bellebaum and Daum, 2008; Hauser et al., 2014).

The central question in the interaction of performance monitoring and control systems is how the brain determines the need to recruit the intervention of control structures. The reinforcement learning (RL) account of performance monitoring and control is one of the influential theories in the field (Holroyd and Coles, 2002; Holroyd et al., 2005). The theory is based on physiological evidence revealing the similarity between the phasic activity of the mesencephalic dopamine system and reward prediction errors (RPEs) in temporal difference models of learning (Suri, 2002). The theory holds that the monitor is located in the basal ganglia, which produces RPE signals that indicate when events are better or worse than expected. These RPEs are used by the ACC to improve performance on the task at hand (Holroyd et al., 2005). According to the RL model, negative RPEs sent to the ACC generate the ERN and the FRN. Another prominent theory, the conflict-monitoring theory (CMT), proposes that the performance monitoring system monitors for the coactivation of mutually incompatible response tendencies, or conflict, during response selection. The CMT suggests that the ACC detects a response-conflict signal and sends this information to the dorsolateral prefrontal cortex for further adjustment and control (Botvinick et al., 2001; Yeung et al., 2004). Based on this theory, the N2 and the ERN can be described using a conflict signal: the CMT argues that they are the electrophysiological correlates of pre-response and post-response conflict signals, respectively. However, since no motor response exists after external feedback presentation, the CMT cannot account for phenomena commencing after feedback onset, e.g., the FRN (Ullsperger et al., 2014). In our previous studies, we have explained the significance of integrating the computational models associated with the RL account and the CMT (Zendehrouh et al., 2013, 2014).
Since the unification of these two theories depends centrally on the definition of the conflict signal, we propose a hypothetical cost-conflict monitor in the brain that extends the CMT to account for post-feedback activities in feedback-based learning tasks. Based on this proposal, the FRN can be described using a cost-conflict signal.

The basis for our hypothetical cost-conflict monitor is that: (1) theoretically, conflict can occur anywhere within the information processing system (Carter and van Veen, 2007); (2) conflict-driven control is domain-specific and is suggested to be mediated by multiple, independent, parallel-operating conflict monitor-controller loops in the brain (Egner, 2008); and (3) the appraisal of the costs and benefits associated with different candidate actions is a key aspect of decision-making.

Delay-based and effort-based costs (the effort needed to perform an action in order to obtain a reward) are two types of costs that bias decision making (Floresco et al., 2008). In delay-based tasks, as time passes, the subjective value of a reward is discounted hyperbolically (Green and Myerson, 2004). Likewise, the aversiveness of a negative event decreases hyperbolically with time (Murphy et al., 2001). Evidence suggests that discounting can occur across many reward types, reward magnitudes, and several timescales, even on the order of tens of milliseconds (Haith et al., 2012). In this paper, it is hypothesized that in feedback-based learning tasks the participants face delay-based evaluations. Therefore, in these tasks, the time interval between response selection and feedback presentation gives rise to a cost. This delay elevates the cost of the rewarded outcome and reduces the cost of the non-rewarded outcome associated with the selected action. In fact, conflict can be produced by the simultaneous activation of the expected costs of possible outcomes that are mutually exclusive. Therefore, when a cost-conflict is detected by the monitoring system, the regulatory mechanism implements the required control, e.g., by modifying the excitatory weights to the response units. The cost-conflict signal that may arise between expected costs can reflect the amount of subjective transient uncertainty about what will happen, which increases with time (delay) until the actual outcome is received. The cost-conflict signal can also be viewed, in the context of the emerging field of neuroeconomics, as an ambiguity signal that may be present during decision-making. Ambiguity is defined as a lack of confidence in assigning probabilities to the possible outcomes (Kishida et al., 2010).
This is consistent with investigations suggesting the existence of an ambiguity-sensitive mechanism in the ventromedial prefrontal cortex (vmPFC) (Glimcher and Rustichini, 2004), and also with the role of this area in delay-cost coding (Prévost et al., 2010; Rushworth et al., 2011; Dreher, 2013).
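A minimal numerical sketch of the hypothesized cost-conflict signal, assuming hyperbolic discounting V(d) = V0/(1 + kd) and, by analogy with the conflict-monitoring literature, conflict measured as the product of the two coactivated expected costs; the functional forms and parameter values are our illustrative assumptions, not quantities specified by the proposal.

```python
def hyperbolic_discount(v0, k, delay):
    """Hyperbolic discounting: subjective magnitude V0 / (1 + k * delay)."""
    return v0 / (1.0 + k * delay)

def cost_conflict(reward_value, loss_aversiveness, k, delay):
    """Illustrative cost-conflict signal for a given feedback delay.

    Cost of the rewarded outcome: the value lost to hyperbolic
    discounting, which grows with delay. Cost of the non-rewarded
    outcome: its aversiveness, which fades with delay. Conflict is
    taken here, Hopfield-energy style, as the product of the two
    coactivated (mutually exclusive) expected costs.
    """
    cost_rewarded = reward_value - hyperbolic_discount(reward_value, k, delay)
    cost_nonrewarded = hyperbolic_discount(loss_aversiveness, k, delay)
    return cost_rewarded * cost_nonrewarded
```

With no delay the rewarded outcome carries no cost and the conflict is zero; introducing a response-to-feedback delay makes both costs nonzero and the conflict signal appear.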

This proposal can be validated by performing simple gambling games or probabilistic reinforcement learning tasks with feedback-timing manipulations at the millisecond timescale while measuring brain responses with functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) to identify the contributions of the ACC and the vmPFC. Studying the behavior of addicted and depressed individuals in these tasks, who show anomalies in value-based decision making (Sharp et al., 2012), could be especially informative.

Therefore, the cost-conflict monitor, as a loop independent of and parallel to the response-conflict monitor, detects the conflict between the costs of the likely outcomes of the selected action and uses this information to adjust future behavior, thereby implementing trial-by-trial adjustments. This proposal is admittedly speculative, and further experimental studies are needed to evaluate its merit. However, it can provide a promising avenue toward the unification of the computational models associated with the RL account and the CMT.

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 May 2014; accepted: 30 June 2014; published online: 21 July 2014.*

*Citation: Zendehrouh S, Gharibzadeh S and Towhidkhah F (2014) The hypothetical cost-conflict monitor: is it a possible trigger for conflict-driven control mechanisms in the human brain? Front. Comput. Neurosci. 8:77. doi: 10.3389/fncom.2014.00077*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Zendehrouh, Gharibzadeh and Towhidkhah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Modeling studies for designing transcranial direct current stimulation protocol in Alzheimer's disease

*Shirin Mahdavi , Fatemeh Yavari , Shahriar Gharibzadeh\* and Farzad Towhidkhah*

*Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran \*Correspondence: gharibzadeh@aut.ac.ir*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Al-Rahim Abbasali Tailor, The Ohio State University Wexner Medical Center, USA*

**Keywords: brain stimulation, transcranial direct current stimulation (tDCS), computational modeling, finite element model, human head model, Alzheimer's disease**

Transcranial direct current stimulation (tDCS) has been proposed as a technique for modulating brain activity. In this technique, a weak current (usually 1–2 mA) is delivered to the scalp through two sponge electrodes. There are two types of tDCS stimulation, cathodal and anodal, which inhibit and facilitate neuronal activity, respectively (Hansen, 2012).

tDCS has been shown to be effective in Alzheimer's disease (AD). Several studies have revealed that tDCS application can improve memory performance in Alzheimer's patients (APs) (Ferrucci et al., 2008; Boggio et al., 2009, 2012). For example, a single-session tDCS study (Ferrucci et al., 2008) revealed that anodal/cathodal tDCS significantly enhanced/worsened word recognition in AD patients. In another study, anodal stimulation over the dorsolateral prefrontal cortex (DLPFC) of APs led to recognition memory improvement in a visual memory task (Boggio et al., 2009). These effects seem to be persistent: in a multi-session tDCS study (Boggio et al., 2012), the improvement in patients' visual recognition lasted for 4 weeks.

The current pathway through the brain plays a key role in the observed effects, and modeling studies currently provide the only way to determine the pattern of current flow during tDCS. In recent years, finite element modeling has been suggested as a reliable and helpful tool in clinical therapeutic applications (Bikson et al., 2012).

A critical issue that must be considered in modeling studies is inter-individual anatomical variation. A modeling study has shown the profound role of individual cortical morphology in determining the current flow distribution in healthy people (Datta et al., 2012). The impact of pathologic anatomy (skull defects and lesions) on the modulation of current flow has also been examined in previous studies (Datta et al., 2010, 2011). Specifically, in AD, loss of neuronal structures and synaptic damage result in cortical shrinkage and ventricular enlargement (Frisoni et al., 2010). This changes the volume of cerebrospinal fluid (CSF), referred to as a "super highway" for current flow, and can therefore significantly alter the current pathway in these patients' heads compared to healthy subjects (Bikson et al., 2012). These studies suggest that it is imprecise to determine the dosage of applied current based only on healthy-human modeling or clinical trial outcomes.
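A toy calculation illustrates why CSF volume matters. Treating tissues as parallel conductive paths of equal length and cross-section (a drastic simplification of a finite element model; the conductivity values are approximate figures of the kind used in tDCS head models, not from this study), the highly conductive CSF carries most of the current:

```python
# Approximate tissue conductivities (S/m); illustrative values only
sigma = {"scalp": 0.465, "skull": 0.010, "csf": 1.65, "gray_matter": 0.276}

def current_shares(conductivities, areas):
    """For parallel conductive paths of equal length under the same
    voltage, each path carries current in proportion to its conductance
    G_i = sigma_i * A_i (Ohm's law applied per path)."""
    g = {k: conductivities[k] * areas[k] for k in conductivities}
    total = sum(g.values())
    return {k: gi / total for k, gi in g.items()}

# With equal cross-sections, CSF (highest conductivity) dominates,
# so a change in CSF volume, as in AD, can reroute the current.
shares = current_shares(sigma, {k: 1.0 for k in sigma})
```

In a real head model the geometry is three-dimensional and solved by finite elements, but the same conductance-weighting intuition explains why atrophy-driven CSF changes shift the current pathway.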

We hypothesize that the change in cortical thickness due to brain atrophy has significant effects on the current flow pattern. These anatomical alterations may shift the stimulated areas and the location of peak current density in the head. They may even alter the expected results of tDCS application.

We suggest that cortical thickness must be considered in modeling studies to obtain a more precise pattern of current flow in the head and of the stimulated brain regions. AD affects each patient's brain structure differently, so we suggest developing individualized models based on each patient's MRI data. These models can be used by clinicians to find the optimal electrode montage and current amplitude for each patient.

Using individualized models to design clinical protocols could also provide a better interpretation of the results.

# **REFERENCES**




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 January 2014; accepted: 27 June 2014; published online: 22 July 2014.*

*Citation: Mahdavi S, Yavari F, Gharibzadeh S and Towhidkhah F (2014) Modeling studies for designing transcranial direct current stimulation protocol in Alzheimer's disease. Front. Comput. Neurosci. 8:72. doi: 10.3389/fncom.2014.00072*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Mahdavi, Yavari, Gharibzadeh and Towhidkhah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *Fatemeh Yavari\**

*Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Iran \*Correspondence: f-yavari@aut.ac.ir*

#### *Edited by:*

*Tobias Alecio Mattei, Ohio State University, USA*

#### *Reviewed by:*

*Da-Hui Wang, Beijing Normal University, China Sen Song, Tsinghua University, USA*

**Keywords: internal forward model, internal inverse model, modular organization, schematic processing, stereotypes**

# **INTRODUCTION**

Our first impressions of other people are greatly affected by our previous experiences. Schematic processing, proposed in social psychology, explains our behavior when interacting with other people. It suggests the existence of different schemas in our brain for different groups of people, e.g., extroverts, introverts, shy people, women, men, etc., as well as schemas related to specific people such as our parents, close friends, supervisor, and even ourselves. Each schema is recalled when we meet the corresponding person/personality (Atkinson, 1996).

On the other hand, there is a relatively well-accepted theory in motor control and learning studies, the model-based theory (Daw and Dayan, 2014; Dayan and Berridge, 2014). It suggests the existence of internal models (forward and/or inverse) in the brain which help us plan and execute actions.

Although these two viewpoints may seem very distinct, there are some interesting similarities between them, which are explained in the following sections. I hypothesize that these correspondences may suggest that the brain employs the same algorithms in dealing with both situations.

Understanding brain function is a great challenge for many scientists. Further evaluation of the proposed hypothesis may help us better understand brain function, as advances in each field may encourage new ideas in the other.

In the following sections, each of the two viewpoints is described, and then their similarities are explained.

# **STEREOTYPES IN SOCIAL PSYCHOLOGY**

A stereotype is defined as "a fixed, often simplistic generalization about a particular group or class of people" (Cardwell, 2014). Stereotypes, schemas, and schematic processing enable us to efficiently organize and process the huge volume of input information to our brain. Instead of processing every little detail about a new person, we can simply recall the most similar schemas and broadly categorize the person, e.g., based on his or her most obvious physical features (Atkinson, 1996). Stereotypes enable us to respond rapidly in situations in which we have had similar experiences. Despite all these benefits, stereotypes may also result in prejudice. Since they bias our impressions, they can have very negative and even fatal (e.g., the Amadou Diallo case) consequences (Atkinson, 1996).

# **INTERNAL MODELS IN MOTOR CONTROL AND LEARNING STUDIES**

Internal models are defined as representations in the brain of external objects and/or our body organs (Kawato, 1999) (see Yavari et al., 2013 for a review). They are categorized into "forward" and "inverse" models, which mimic the "input-output" and "output-input" relationships of the related object/organ, respectively. Model-based theory suggests that motor learning/adaptation leads to the formation/modification of internal models (Hunter et al., 2009). Kawato et al. have proposed the co-existence of multiple pairs of internal forward-inverse models in the brain and, therefore, a modular structure for motor control and learning (Wolpert and Kawato, 1998; Haruno et al., 2001, 2003; Doya et al., 2002; Imamizu et al., 2003; Wada et al., 2003). Based on this idea, which has been supported by behavioral and imaging evidence (Wolpert and Kawato, 1998; Imamizu et al., 2003), each pair contains an inverse (controller) and a forward (predictor) internal model. The contribution of each controller to the final motor command is determined by the accuracy of the linked forward model. This modular structure can explain our remarkable ability in motor learning, adaptation, and behavioral switching (Haruno et al., 2003).
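The contribution scheme described above (each controller weighted by its forward model's prediction accuracy) can be sketched as follows; the softmax form and the temperature parameter are illustrative assumptions in the spirit of the MOSAIC architecture, not its exact equations.

```python
import numpy as np

def responsibilities(prediction_errors, temperature=1.0):
    """Responsibility of each forward/inverse-model pair: a softmax over
    negative squared forward-model prediction errors, so the module whose
    forward model best predicts the current context contributes most."""
    e = np.asarray(prediction_errors, dtype=float)
    w = np.exp(-(e ** 2) / temperature)
    return w / w.sum()

def combined_motor_command(module_commands, prediction_errors):
    """Final motor command: responsibility-weighted sum of the commands
    proposed by each module's inverse model (controller)."""
    lam = responsibilities(prediction_errors)
    return float(np.dot(lam, module_commands))
```

For example, with two modules whose forward models report errors 0.1 and 2.0, nearly all of the final command comes from the first module's controller.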

# **SIMILARITIES OF THE TWO MENTIONED VIEWPOINTS**

Some similarities between the two mentioned viewpoints are described here:


determines our judgment about his/her personality.

A similar process occurs with internal models in motor control: when manipulating a new tool, the most suitable FM/IM pair is activated based on context, e.g., by looking at the object's appearance, and the corresponding IM is used as the controller. In the next trial, the pair which produced the least prediction error will be activated and used (Wolpert et al., 2003).


Based on internal model theory, learning a new motor skill follows an almost similar process: when we try to manipulate a new object, in the early stage the CNS combines output signals from the internal models of the most similar (and familiar) objects. After some practice we learn to manipulate the new object skillfully, because a dedicated internal model has been formed for it (Imamizu and Kawato, 2012). Depending on the complexity of the new motor task, learning may require different amounts of time, even years (e.g., for professional athletes).

As can be seen, in both situations a new condition prompts greater reliance on previous experience, while gathering more information over time leads to the formation of a dedicated new internal model/stereotype.

# **CONCLUSION**

The human brain is probably the most fascinating creation in the world, and many scientists in different fields are trying to understand its function. Here I hypothesized that our brain may apply the same policy to quite distinct problems, e.g., social interaction and manipulating different objects.

It is worth mentioning that internal models have been proposed not only in motor control and learning, but also in other fields such as the control of mental activities (Ito, 2008), cognitive planning (Dayan and Yu, 2006), and decision making (Daw et al., 2011). These processes may have even more in common with stereotypes.

It would be interesting to also compare the corresponding neural substrates of stereotypes and internal models. Cell recordings in animal studies (Liu et al., 2003; Cerminara et al., 2009; Laurens et al., 2013) as well as imaging studies (Imamizu et al., 2000, 2003; Blakemore et al., 2001; Kawato et al., 2003; Higuchi et al., 2007; Milner et al., 2007) suggest the lateral and anterior cerebellum as the probable site of formation or storage of internal models. Some studies have suggested that the motor cortex and other frontal motor areas play important roles in the computation of internal models (Li et al., 2001; Shadmehr, 2004; Richardson et al., 2006; Shadmehr and Krakauer, 2008). The medial prefrontal cortex (mPFC) has been proposed as a candidate region in model-based evaluation (Hampton et al., 2006, 2008; Valentin et al., 2007; Daw et al., 2011). On the other hand, neuroimaging studies have shown the mPFC to be a crucial region in social inferences (Mitchell et al., 2005a,b, 2006) and in judgments of warmth and competence (Harris and Fiske, 2006). Activity in the middle mPFC has been shown to be associated with thinking about either the self or a similar other (Ida Gobbini et al., 2004; Mitchell et al., 2006), while activity in the dorsal mPFC is associated with thinking about a dissimilar other. Therefore, the mPFC seems to be important for ingroup and outgroup perception (Amodio and Lieberman, 2009). Perceiving a person as a social being, which has been proposed to form the basis of prejudice (Qiu, 2006), has also been suggested to depend on the dorsal mPFC (Amodio and Lieberman, 2009). Thus, the PFC appears to be a crucial brain region for both internal models and stereotypes.

Further evaluation of the proposed hypothesis may help us achieve a better understanding of brain function. For example, as mentioned, stereotypes have a significant effect on our social life and an undeniable effect on impression formation. They sometimes have a negative (even fatal) impact on our judgments, because they bias our impressions. The more we learn about this concept, the better we can correct such biases in our thinking.

Discoveries in each field may lead to new findings in the other. For instance, it has been shown that stereotypes may be activated through unconscious priming; in an experiment by Bargh et al. (1996), seeing images of young African American men triggered more aggressive behavior than images of young Caucasian men, even though the images were displayed for less than thirty thousandths of a second, i.e., subliminally (Atkinson, 1996). An analogous question can be investigated for motor actions, for example whether seeing a particular tool, such as a piano, can prime the corresponding skill of playing it. This would be useful both for better understanding motor-related mechanisms in the brain and in practical applications such as preparing athletes before a match to achieve better results.

The proposed hypothesis needs to be verified by specially designed experiments.

# **REFERENCES**

Amodio, D. M., and Lieberman, M. D. (2009). "Pictures in our heads: contributions of fMRI to the study of prejudice and stereotyping," in *Handbook of Prejudice, Stereotyping, and Discrimination* (New York, NY: Erlbaum), 347–366.


Wolpert, D. M., Doya, K., and Kawato, M. (2003). A unifying computational framework for motor control and social interaction. *Philos. Trans. R. Soc. Lond. B Biol. Sci.* 358, 593–602. doi: 10.1098/rstb.2002.1238


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 September 2014; accepted: 09 December 2014; published online: 05 January 2015. Citation: Yavari F (2015) Does our brain use the same policy for interacting with people and manipulating different objects? Front. Comput. Neurosci. 8:170. doi: 10.3389/fncom.2014.00170*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2015 Yavari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Stochastic non-linear oscillator models of EEG: the Alzheimer's disease case

#### Parham Ghorbanian<sup>1</sup> , Subramanian Ramakrishnan<sup>2</sup> and Hashem Ashrafiuon<sup>1</sup> \*

*<sup>1</sup> Department of Mechanical Engineering, Center for Nonlinear Dynamics and Control, Villanova University, Villanova, PA, USA, <sup>2</sup> Department of Mechanical and Industrial Engineering, University of Minnesota Duluth, Duluth, MN, USA*

In this article, the electroencephalography (EEG) signal of the human brain is modeled as the output of a stochastic non-linear coupled oscillator network. It is shown that EEG signals recorded under different brain states in healthy subjects as well as Alzheimer's disease (AD) patients may be understood as distinct, statistically significant realizations of the model. EEG signals recorded during eyes-open (EO) and eyes-closed (EC) resting conditions in a pilot study with AD patients and age-matched healthy control (CTL) subjects are employed. An optimization scheme is then utilized to match the output of the stochastic Duffing–van der Pol double oscillator network with the EEG signals recorded during each condition for AD and CTL subjects by selecting the model physical parameters and noise intensity. The selected signal characteristics are power spectral densities in the major brain frequency bands, together with Shannon and sample entropies. These measures allow matching of the linear time-varying frequency content as well as the non-linear signal information content and complexity. The main finding of the work is that statistically significant unique models represent the EC and EO conditions for both CTL and AD subjects. It is further shown that the inclusion of sample entropy in the optimization process, to match the complexity of the EEG signal, enhances the stochastic non-linear oscillator model performance.

### Edited by:

*Tobias Alecio Mattei, Brain & Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA*

### Reviewed by:

*Robin A. A. Ince, University of Manchester, UK*
*Fan Liao, Washington University in St. Louis, USA*

### \*Correspondence:

*Hashem Ashrafiuon, Department of Mechanical Engineering, Center for Nonlinear Dynamics and Control, Villanova University, 800 E. Lancaster Ave., Villanova, PA 19085, USA hashem.ashrafiuon@villanova.edu*

> Received: *28 January 2015* Accepted: *07 April 2015* Published: *24 April 2015*

#### Citation:

*Ghorbanian P, Ramakrishnan S and Ashrafiuon H (2015) Stochastic non-linear oscillator models of EEG: the Alzheimer's disease case. Front. Comput. Neurosci. 9:48. doi: 10.3389/fncom.2015.00048*

Keywords: EEG, Alzheimer's disease, stochastic differential equations, Duffing–van der Pol, entropy

# 1. Introduction

Quantitative analysis of human brain electroencephalography (EEG) recordings aimed at enhancing our understanding of brain injuries and disorders is currently an important research area. In addition to being useful in diagnosis, such analysis can provide insights into the underlying neurophysiology of the injury or disorder, thereby leading to better treatment and preventive strategies. Alzheimer's disease (AD) is the most common form of dementia and is the subject of intense research. While no known cure exists, certain medications have shown promise in delaying the symptoms (Dauwels et al., 2010) prompting researchers to seek early diagnosis and intervention strategies. In this context, analysis of the EEG is a potential non-invasive tool that may aid early diagnosis of AD. However, the use of EEG signal analysis in order to improve the diagnosis of AD is a complex problem where, despite significant advances, a number of fundamental questions remain open (Elgendi et al., 2011).

Considering now the characteristics of the EEG, since the non-stationary nature of the signal is generally well-recognized (see, for instance Akin, 2002), decomposition using a fast Fourier transform (FFT) with sliding windows and the wavelet transforms have been the most popular techniques employed to capture the spectral properties of EEG (Darvishi and Al-Ani, 2007; Dauwels et al., 2010). However, linear transformation methods fail to address the non-linear characteristics of the EEG signal (Stam, 2005). Therefore, non-linear dynamic approaches have been attempted as well, mostly involving computationally complex time series analysis (Jeong, 2004). Several other aspects of non-linear modeling and analysis in this context have also been studied in the literature (see, for instance, Stam, 2005 for a review). These include frameworks based on a neural mass model (Valdes et al., 1999; Huang et al., 2011), coupled oscillators (Baier et al., 2005; Leistritz et al., 2007), continuum models (Kim et al., 2007), non-linear non-stationary models (Celka and Colditz, 2002; Rankine et al., 2007), random neural networks (Acedo and Morano, 2013), and chaotic phenomena and stability aspects (Rodrigues et al., 2007; Dafilis et al., 2009). Stochastic approaches based on Markov chain Monte Carlo methods (Hettiarachchi et al., 2012) and Markov process amplitude (Wang et al., 2011) that take into account the inherent randomness of the EEG signal have also been reported. In the same vein, limit cycle oscillators (Hernandez et al., 1996; Burke and Paor, 2004) as well as stochastic synchronization (Bressloff and Lai, 2011) and stochastic approximation (Fell et al., 2000; Sun et al., 2008) methods have been considered in EEG modeling. Notably, limit cycle behavior at each of the brain frequency bands appears to provide a more accurate representation of the EEG signal than one based on chaotic phenomena.

Some of the most important features in non-linear dynamic and stochastic approaches are signal information content and complexity, as measured using various forms of information entropy. Measures such as Shannon entropy (Shannon, 1948) characterize the information content of a signal, and higher entropy corresponds to increased randomness and chaotic behavior (Abasolo et al., 2006). Importantly, one observes that, with respect to the EEG signal, higher information content correlates with better brain function (Shin et al., 2006). Furthermore, it has been reported that variations in information entropic measures may be used to detect functional abnormalities in the brain caused by disorders or injuries (Slobounov et al., 2009). Hence, the information content of the EEG signal, characterized by information-entropic measures, may be expected to be important in identifying distinct states of the brain. This is further reinforced by the recent results of McBride and colleagues on the role of information entropic and spectral analysis in the study of the early stages of Alzheimer's disease and mild traumatic brain injury (McBride et al., 2013a,b, 2014).

Entropy may also be utilized to measure signal complexity. For instance, embedding entropy provides information about how the EEG signal fluctuates in time by comparing the time series with a delayed version of itself (Abasolo et al., 2006). Moreover, the concept of approximate entropy was introduced as a measure of system complexity (Pincus, 1991) and has been applied to brain wave signals (Quiroga et al., 2001). However, the approximate entropy measure suffers from drawbacks such as bias and inconsistency (Xu et al., 2010). Hence, the notion of sample entropy was introduced (Richman and Moorman, 2000) as an improvement over approximate entropy.

In recent work, the authors proposed a phenomenological model of the EEG signal based on the dynamics of a stochastic, coupled, Duffing–van der Pol oscillator network (Ghorbanian et al., 2015). An optimization scheme was adopted to match the model output with actual EEG data obtained from healthy subjects in the two distinct resting eyes-open (EO) and eyes-closed (EC) conditions, and it was shown that the actual EEG signals in both cases were distinct realizations of the model with qualitatively different non-linear dynamic characteristics. Moreover, the model output and the actual EEG data were shown to be in good agreement in terms of both the power spectra (frequency content) and Shannon entropy (information content).

In the present effort, we improve the model introduced in Ghorbanian et al. (2015) by also matching the sample entropy of the model output and the EEG signal in order to capture the signal's complexity. A global optimization routine is employed to match the model output with EEG recordings in terms of the power spectrum, Shannon entropy, and sample entropy. The EEG signals were recorded under resting EC and EO conditions in an earlier pilot study of Alzheimer's disease (AD) patients vs. age-matched healthy control (CTL) subjects (Ghorbanian et al., 2013). The model parameters obtained for the oscillators representing the EC and EO EEG signals of CTL and AD patients are compared in order to establish statistically significant, distinct models for AD and CTL subjects under each condition. In addition, we present new results from a phase space reconstruction analysis of the model output matched to the actual EEG signal. The results indicate that the analytical model effectively captures the frequency spectrum and non-linear characteristics of the EEG signal in terms of complexity and information content. Furthermore, it is shown that the addition of sample entropy significantly enhances the model performance in terms of complexity and non-linear dynamic characteristics, as demonstrated by phase space reconstruction. The results suggest exciting new pathways toward better tools for distinguishing pathological and normal brain states in AD and perhaps other neurological diseases and disorders.

The rest of the article is organized as follows. Details of the EEG recordings, the analytical model, the optimization scheme, and the phase space reconstruction technique are provided in Section 2. The results are presented in Section 3 and discussed in Section 4. The article concludes with comments on further research in Section 5.

# 2. Materials and Methods

# 2.1. EEG Recording Blocks

Twenty-six AD patients and healthy age-matched CTL subjects were selected for this study ("A Brain-Computer Interface for Diagnosing Brain Function," Aspire IRB, Human Subject Protocol Number PDMC-001, approved on October 7, 2010). Of the 26 subjects selected, one withdrew and one did not qualify as either AD or CTL. Subjects were asked to relax and wear an EEG recording headset during alternating blocks of EC and EO resting conditions, followed by a variety of cognitive and auditory tasks and a final EC-EO resting period.

TABLE 1 | Optimal parameters of the Duffing–van der Pol oscillator for EC and EO of CTL subjects (N = 40) and the p-values from the unpaired t-test, Wilcoxon rank sum test, and Bonferroni correction.


The EEG signals were recorded through a single-dry electrode device at position Fp1 (based on a 10–20 electrode placement system) with a Bluetooth enabled telemetric headset. The headset's effective sample rate is 125 Hz. Frequencies below 1 Hz and above 60 Hz (near Nyquist frequency) were filtered out by the device hardware. On comparison of the EEG recordings by the device with those from other widely accepted devices, frequencies within 2–30 Hz were deemed to be very accurate.

The recording device eliminated frequently observed artifacts, including line noise. Other artifacts were mainly due to eye and muscle movements, which are common at the Fp1 location and can be clearly identified by their high amplitudes compared to true EEG recordings during resting states. These artifacts were removed using a simple artifact detection algorithm that eliminated any part of the signal greater than 4.5σ (standard deviations). The algorithm then reconstructed the nulled samples using FFT interpolation of the surrounding recorded data (Ghorbanian et al., 2013).
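As a rough illustration of the thresholding step described above, the sketch below nulls samples exceeding 4.5σ and refills them from the surrounding data. For brevity it substitutes linear interpolation for the FFT interpolation used in the study, and the function name is illustrative:

```python
import numpy as np

def remove_amplitude_artifacts(signal, thresh_sd=4.5):
    """Null samples whose deviation from the mean exceeds thresh_sd
    standard deviations, then refill the gaps by interpolating between
    neighboring good samples (linear here; the study used FFT-based
    interpolation)."""
    x = np.array(signal, dtype=float)  # copy so the input is untouched
    bad = np.abs(x - x.mean()) > thresh_sd * x.std()
    good_idx = np.flatnonzero(~bad)
    # Reconstruct the nulled samples from the surrounding recorded data.
    x[bad] = np.interp(np.flatnonzero(bad), good_idx, x[good_idx])
    return x
```

A signal with an eye-blink-like spike would have the spike replaced by values on the order of its neighbors, leaving the rest of the recording unchanged.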

The EEG recordings in this study were obtained from subjects in an AD pilot study with 14 control (CTL) subjects and 10 Alzheimer's disease (AD) patients presented in our earlier work (Ghorbanian et al., 2013). Recording blocks of 40-s duration (approximately 5000 samples) from resting eyes-closed (EC) and eyes-open (EO) conditions were selected. In all, 60 random blocks were selected from the pilot study: 40 blocks from CTL subjects (20 EC and 20 EO) and 20 blocks from AD subjects (10 EC and 10 EO). Note that the smaller number of AD patients, along with the smaller number of AD recording sessions not dominated by artifacts, resulted in the selection of a smaller AD sample size.

# 2.2. EEG Features

Since a good model must produce signals that match the EEG's frequency content, the time-varying power spectrum in each of the major brain EEG frequency bands was calculated using a short-time fast Fourier transform (FFT) with a sliding window. Specifically, the power spectrum was computed in seven ranges: lower δ (1–2 Hz), upper δ (2–4 Hz), θ (4–8 Hz), α (8–13 Hz), lower β (13–20 Hz), upper β (20–30 Hz), and γ (30–60 Hz). However, the lower δ and γ band powers, which happen to have little power, were ignored due to the unreliability of the device in those frequency ranges.
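The band-power computation can be sketched as follows. The window and step lengths and the Hanning taper are illustrative choices not specified in the text, and only the retained bands are shown:

```python
import numpy as np

FS = 125.0  # headset sample rate (Hz)
# Bands retained by the study (lower delta and gamma were discarded).
BANDS = {"upper_delta": (2, 4), "theta": (4, 8), "alpha": (8, 13),
         "lower_beta": (13, 20), "upper_beta": (20, 30)}

def band_powers(x, fs=FS, win=512, step=256):
    """Average power per band from a short-time FFT with a sliding
    window (window/step lengths are illustrative assumptions)."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    acc = {name: [] for name in BANDS}
    for start in range(0, len(x) - win + 1, step):
        seg = x[start:start + win] * np.hanning(win)  # tapered window
        psd = np.abs(np.fft.rfft(seg)) ** 2
        for name, (lo, hi) in BANDS.items():
            sel = (freqs >= lo) & (freqs < hi)
            acc[name].append(psd[sel].sum())
    return {name: float(np.mean(v)) for name, v in acc.items()}
```

Applied to a 40-s block of an EC recording, such a routine would show the expected α-band dominance.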

Shannon entropy was measured using a sliding temporal window technique. A temporal window slides along the signal's time representation with a sliding step (interval or bin) to sample a part of the signal. A discrete entropy estimator was applied, in which 10 uniform intervals equally divided the range of the normalized observed signal. The probability that the sampled signal belongs to an interval is then the ratio of the number of samples found within that interval to the total number of samples of the signal. The Shannon entropy is then calculated from these probabilities (Shin et al., 2006), separately for each 40-s EEG recording block (5000 samples).
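A minimal sketch of this 10-bin discrete estimator, applied here to a whole block rather than a sliding window, and assuming the natural logarithm:

```python
import numpy as np

def shannon_entropy(x, n_bins=10):
    """Discrete Shannon entropy: the signal range is split into n_bins
    uniform intervals and probabilities are estimated as occupancy
    ratios. The natural-log base is an assumption of this sketch."""
    x = np.asarray(x, dtype=float)
    counts, _ = np.histogram(x, bins=n_bins)  # 10 uniform intervals
    p = counts[counts > 0] / counts.sum()     # occupancy probabilities
    return float(-np.sum(p * np.log(p)))
```

A uniformly distributed signal approaches the maximum value ln(10) ≈ 2.30 for 10 bins, while a peaked distribution yields a lower value.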

Sample entropy (SE) is the negative natural logarithm of the conditional probability that two sequences of a time series that are similar for m points remain similar at the next point. For N given data points from a time series, [x(1), x(2), · · · , x(N)], we calculated the SE of each 40-s EEG recording block (5000 samples) using the statistic (Abasolo et al., 2006):

$$\text{SE}(m, r, N) = \left\{-\ln\left[\frac{U^{m+1}(r)}{U^m(r)}\right]\right\},\tag{1}$$

where m is the run length, r is the tolerance window size, and

$$U^{m}(r) = \frac{1}{(N-m)(N-m-1)} \sum\_{i=1}^{N-m} U\_{i}.\tag{2}$$

In the above equation, U<sub>i</sub> denotes the number of vectors X<sub>m</sub>(k) (1 ≤ k ≤ N − m, k ≠ i) whose Euclidean distance from X<sub>m</sub>(i) is less than or equal to r, where X<sub>m</sub>(i) = [x(i), x(i + 1), · · · , x(i + m − 1)].

Generally, a large m or small r results in the number of matches being too small for confident estimation of the conditional probability, and vice versa (Lake and Moorman, 2011). In this study, we used m = 2 and r = 0.25σ based on the consistency of the results and the recommended ranges in previous studies (Richman and Moorman, 2000; Xu et al., 2010).
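The SE statistic of Equations (1) and (2) with these parameter choices can be sketched as below. The Euclidean distance follows the text (many implementations use the Chebyshev distance instead), and the brute-force loop is for clarity rather than speed:

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.25):
    """SE(m, r, N) = -ln[U^(m+1)(r) / U^m(r)] with r = r_factor * std(x).
    Matches are counted with the Euclidean distance, as stated in the
    text (Chebyshev is the more common convention elsewhere)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    n = len(x) - m  # both template lengths use N - m vectors

    def match_count(length):
        # Total number of ordered pairs (i, k), k != i, within tolerance r.
        templates = np.array([x[i:i + length] for i in range(n)])
        total = 0
        for i in range(n):
            d = np.sqrt(((templates - templates[i]) ** 2).sum(axis=1))
            total += int((d <= r).sum()) - 1  # drop the self-match
        return total

    a = match_count(m + 1)  # matches that persist at length m + 1
    b = match_count(m)      # matches at length m
    return float(-np.log(a / b)) if a > 0 and b > 0 else float("inf")
```

A highly regular signal (e.g., a sinusoid) yields a much lower SE than white noise, consistent with SE as a complexity measure.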

## 2.3. Stochastic Coupled Non-linear Oscillators

We recall that the EEG has been modeled in the literature taking into account characteristics including non-linearity (both chaotic and non-chaotic), non-stationarity, and randomness of the signal (Fell et al., 2000; Rankine et al., 2007; Sun et al., 2008). The EEG has also been studied as the manifestation of underlying limit cycle oscillations at a given frequency and other such periodic solutions (Hernandez et al., 1996; Burke and Paor, 2004). While inspired by the above, the authors were fundamentally motivated to develop models that can better reproduce the significant linear and non-linear characteristics of actual EEG signals. Hence, we proposed a phenomenological model of the EEG based on a coupled system of Duffing–van der Pol oscillators subjected to white noise excitation (Ghorbanian et al., 2015). This particular oscillator was selected because the Duffing non-linearity allows a system with only two oscillators to capture the major brain frequency spectra, while the van der Pol non-linearity provides the self-excited limit cycle behavior that has been previously reported for each major brain frequency band (Burke and Paor, 2004).

We consider a phenomenological model of the EEG based on a coupled system of Duffing—van der Pol oscillators subject to white noise excitation, as shown in **Figure 1**. The equations for the model may be written as:

$$\begin{cases} \ddot{\mathbf{x}}\_1 + (k\_1 + k\_2)\mathbf{x}\_1 - k\_2 \mathbf{x}\_2 = -b\_1 \mathbf{x}\_1^3 - b\_2 (\mathbf{x}\_1 - \mathbf{x}\_2)^3 \\ \qquad + \epsilon\_1 \dot{\mathbf{x}}\_1 (1 - \mathbf{x}\_1^2), \\\ \ddot{\mathbf{x}}\_2 - k\_2 \mathbf{x}\_1 + k\_2 \mathbf{x}\_2 = b\_2 (\mathbf{x}\_1 - \mathbf{x}\_2)^3 \\ \qquad + \epsilon\_2 \dot{\mathbf{x}}\_2 (1 - \mathbf{x}\_2^2) + \mu \text{ } dW, \end{cases} \tag{3}$$

where x<sub>i</sub>, ẋ<sub>i</sub>, ẍ<sub>i</sub>, i = 1, 2 are the positions, velocities, and accelerations of the two oscillators, respectively. Parameters k<sub>i</sub>, b<sub>i</sub>, ε<sub>i</sub>, i = 1, 2 are, respectively, the linear stiffness, cubic stiffness, and van der Pol damping coefficients of the two oscillators. The b<sub>i</sub>'s indicate the strength of the Duffing non-linearity, which results in multiple resonant frequencies, while the ε<sub>i</sub>'s indicate the strength of the van der Pol non-linearity and determine the extent of self-excitation and the shape of the resulting limit cycle. Parameter µ represents the intensity of the white noise and dW is a Wiener process (Gardiner, 1985; Higham, 2001) representing the additive noise in the stochastic differential system. The input excitation to the system is provided through µdW. The output may be selected as any combination of the positions and velocities to mimic an EEG signal. Note that the Euler–Maruyama method (Higham, 2001) was selected to integrate the stochastic differential equations in Equation (3), since standard numerical integration methods are not applicable.
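An Euler–Maruyama integration of Equation (3) can be sketched as follows (with a semi-implicit position update for numerical stability). The function name, initial conditions, and the parameter values used in the example call are placeholders, not the optimized values reported later in the paper:

```python
import numpy as np

def simulate_model(params, T=40.0, fs=125.0, seed=0):
    """Euler-Maruyama integration of the coupled Duffing-van der Pol
    pair in Equation (3); the output is the velocity of the second
    oscillator, which receives the Wiener-process excitation."""
    k1, k2, b1, b2, e1, e2, mu = params
    dt = 1.0 / fs
    rng = np.random.default_rng(seed)
    x1 = x2 = v1 = v2 = 0.1  # arbitrary small initial conditions
    out = np.empty(int(T * fs))
    for i in range(out.size):
        # Accelerations from Equation (3), solved for x1'' and x2''.
        a1 = (-(k1 + k2) * x1 + k2 * x2 - b1 * x1**3
              - b2 * (x1 - x2)**3 + e1 * v1 * (1.0 - x1**2))
        a2 = (k2 * x1 - k2 * x2 + b2 * (x1 - x2)**3
              + e2 * v2 * (1.0 - x2**2))
        dW = np.sqrt(dt) * rng.standard_normal()  # Wiener increment
        v1 += a1 * dt
        v2 += a2 * dt + mu * dW  # noise drives the second oscillator
        x1 += v1 * dt
        x2 += v2 * dt
        out[i] = v2
    return out
```

For example, `simulate_model((400, 200, 10, 10, 2, 2, 0.5))` produces a 5000-sample (40 s at 125 Hz) noisy limit-cycle signal; these placeholder parameters were chosen only to respect the constraint bounds used in the optimization below.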

# 2.4. Optimization Formulation

We have selected the velocity of the second oscillator as the system output approximating the EEG signal since it is directly impacted by the noise. A global optimization search method based on a multi-start algorithm (Ugray et al., 2007) was adopted to determine the oscillator model parameters that can produce the output matching various EEG signals. The optimization objective function was selected as the root mean squared of the errors in power spectrum of each selected brain frequency bands plus weighted values of the errors in absolute Shannon and sample entropies. Hence, the optimization goal is error minimization:

$$\min\_{\mathcal{P}} J = \sqrt{\sum\_{j=1}^{m} (P\_{Ej} - P\_{Oj})^2} + \omega\_1 |S\_E - S\_O| + \omega\_2 |SP\_E - SP\_O|, \tag{4}$$

where J is the objective function, p = [k<sub>1</sub>, k<sub>2</sub>, b<sub>1</sub>, b<sub>2</sub>, ε<sub>1</sub>, ε<sub>2</sub>, µ] the decision variables, P<sub>Ej</sub> and P<sub>Oj</sub> the powers in the major brain frequency bands for the normalized EEG signal and the model output, respectively, m the number of frequency bands (m = 7), S<sub>E</sub> and S<sub>O</sub> the Shannon entropies of the EEG signal and the model output, respectively, SP<sub>E</sub> and SP<sub>O</sub> the sample entropies of the EEG signal and the model output, respectively, and w<sub>1</sub> and w<sub>2</sub> the weighting factors for the absolute Shannon and sample entropy errors, respectively; | | represents the absolute value. The weighting factors w<sub>1</sub> and w<sub>2</sub> are required to give equal importance to the power spectrum and entropy characteristics of the signal. Note that the magnitudes of the output signals are matched through normalization of both the model output and the EEG signal with respect to their standard deviations.
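The objective of Equation (4) is straightforward to express in code. This sketch assumes the band powers and entropies have already been computed for both signals, and its default weights use the w<sub>1</sub> = w<sub>2</sub> = 0.35 values reported in the Results:

```python
import numpy as np

def objective(p_model, p_eeg, shannon_model, shannon_eeg,
              se_model, se_eeg, w1=0.35, w2=0.35):
    """Objective J of Equation (4): root-sum-square band-power error
    plus weighted absolute Shannon and sample entropy errors."""
    band_err = np.sqrt(np.sum((np.asarray(p_eeg)
                               - np.asarray(p_model)) ** 2))
    return (band_err + w1 * abs(shannon_eeg - shannon_model)
            + w2 * abs(se_eeg - se_model))
```

A perfect match in all three characteristics yields J = 0; the global search then minimizes J over the decision variables subject to the bounds of Equation (5).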

The objective function minimization is subject to equality constraints represented by the state (Equation 3) and inequality constraints represented by the decision variable lower and upper bounds:

$$0 < k\_i \le 10^4, \quad 0 < b\_i \le \frac{1}{2} k\_i, \quad 0 < \epsilon\_i \le \frac{1}{3} k\_i,\tag{5}$$

$$i = 1, 2, \quad 0 \le \mu \le 2.$$

The constraints on the b<sub>i</sub>'s and ε<sub>i</sub>'s were imposed to avoid the chaotic regime (Li et al., 2006) and provide a periodic stochastic response.


Noise intensity is also constrained to avoid a response dominated by random noise. The initial guesses for the global optimization search are randomly generated within the bounds defined in Equation (5).

The stochastic component was introduced as white noise, generated through a normally distributed random variable and applied to the model via a Wiener process. A new random process was generated and applied to the model during the integration of the equations at each iteration of the optimization algorithm.

### 2.5. Statistical Analysis

A key objective of the phenomenological modeling in this work is the ability to establish a correspondence between variations in the model parameters and the variations in the data obtained from different physiological conditions. Hence, the parametric unpaired t-test and the non-parametric Wilcoxon rank sum test were employed to determine the relative significance of the model parameters. Furthermore, the Bonferroni correction was applied due to the multiple comparisons problem, and the adequacy of the sample sizes for the statistical tests was established using power analysis.

# 2.6. Phase Space Reconstruction

In addition to matching the Shannon and sample entropies of the model output and the EEG signal through the optimization process, it is of interest to investigate matching other features such as the phase plot, which plays a significant role in non-linear time series analysis (Kantz and Schreiber, 2004). It is known that any dynamic system can be completely recovered in the phase space, which may be reconstructed from the measured time domain response of the system (Nie et al., 2013). While the phase space consists of velocity and position variables for a mechanical system, when only the time representation of a signal is available, a phase space reconstruction technique based on the method of delays is used (Kantz and Schreiber, 2004).

The main idea is that one does not need the derivatives to form a coordinate system in which to capture the structure of phase space, but instead one could directly use the lagged variables:

$$\mathbf{x}(n+T) = \mathbf{x}(t\_0 + (n+T)\Delta \tau\_s),\tag{6}$$

where x(n) is the nth sample of the time series, Δτ<sub>s</sub> the time step, and T the delay integer to be determined. Then, a vector


of (embedding) dimension d may be constructed using the time lags as:

$$[\mathbf{x}(n), \mathbf{x}(n+T), \mathbf{x}(n+2T), \dots, \mathbf{x}(n+(d-1)T)].\tag{7}$$

Time-delay embedding is probably one of the best systematic methods for converting scalar data to multidimensional phase space (Abarbanel et al., 1993; Burke and Paor, 2004; Nie et al., 2013). An appropriate and successful reconstruction depends on the choice of both time delay T and the embedding dimension d (Nie et al., 2013).
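The delay-vector construction of Equations (6) and (7) can be sketched as (the function name is illustrative):

```python
import numpy as np

def delay_embed(x, d, T):
    """Stack the d-dimensional delay vectors
    [x(n), x(n+T), ..., x(n+(d-1)T)] of Equation (7), one per row."""
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (d - 1) * T  # number of complete delay vectors
    return np.column_stack([x[i * T : i * T + n_vec] for i in range(d)])
```

With d = 2 or 3 the rows of the returned array can be plotted directly as a reconstructed phase portrait of the scalar signal.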

In this study, the appropriate value of the time lag was determined using the average mutual information method applied to each EEG recording block. The idea behind mutual information is to identify the amount of information that can be learned about a measurement at one time from a measurement taken at another time. Consider the time series nth sample x(n) and its value after time delay T with the associated probability distributions of P(x(n)) and P(x(n + T)), respectively. The average information which can be obtained about x(n + T) from x(n) is given by the mutual information of the two measurements (Abarbanel et al., 1993; Mizrach, 1996):

$$I(\mathbf{x}(n), \mathbf{x}(n+T)) = \log\_2 \left[ \frac{P(\mathbf{x}(n), \mathbf{x}(n+T))}{P(\mathbf{x}(n))P(\mathbf{x}(n+T))} \right],\tag{8}$$

where P(x(n), x(n + T)) is the joint probability of the measurements x(n) and x(n + T) calculated using a binning-based method, in which 20 uniform intervals divided the range of the measurements equally. The average mutual information between


measurements of any value x(n) and x(n + T) is the average over all possible measurements of I(x(n), x(n + T)) (Abarbanel et al., 1993):

$$I(T) = \sum\_{\mathbf{x}(n), \mathbf{x}(n+T)} P(\mathbf{x}(n), \mathbf{x}(n+T)) I(\mathbf{x}(n), \mathbf{x}(n+T)). \tag{9}$$

If T is too small, the measurements x(n) and x(n + T) will overlap too much. However, if T is too large, then I(T) will approach zero and nothing relates x(n) to x(n + T). It has been suggested that a proper T can be chosen as the first minimum of I(T), which is not necessarily optimal but has been shown to work well (Abarbanel et al., 1993; Nie et al., 2013). If no minimum exists for I(T), the choice of T = 1 or 2 has been suggested (Abarbanel et al., 1993).
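The average mutual information of Equation (9) and the first-minimum rule can be sketched as follows, using the 20-bin joint histogram described above; the `max_lag` cap is an illustrative assumption:

```python
import numpy as np

def average_mutual_information(x, T, n_bins=20):
    """I(T) of Equation (9), estimated from a joint histogram of x(n)
    and x(n + T) with n_bins uniform intervals (20, as in the text)."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-T], x[T:]
    joint, _, _ = np.histogram2d(a, b, bins=n_bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of x(n)
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of x(n + T)
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))

def first_minimum_lag(x, max_lag=50):
    """First local minimum of I(T); falls back to T = 1 when no
    minimum exists, as suggested by Abarbanel et al. (1993)."""
    ami = [average_mutual_information(x, T) for T in range(1, max_lag + 1)]
    for i in range(1, len(ami) - 1):
        if ami[i] < ami[i - 1] and ami[i] < ami[i + 1]:
            return i + 1  # list index i corresponds to lag T = i + 1
    return 1
```

As expected, I(1) is near zero for white noise (no temporal structure) and large for a strongly autocorrelated signal such as a sinusoid.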

After specifying the correct time delay T, an appropriate embedding dimension, d, should also be found for the phase space reconstruction. If d is too small, the trajectories will not be unique. On the other hand, too large a d will result in additional computational cost by requiring extra dimensions (Nie et al., 2013).

# 3. Results

The optimization algorithm was applied separately to determine the model parameters (decision variables) for each of the 60 selected EEG signals, using the weighting factors w<sub>1</sub> = w<sub>2</sub> = 0.35. These weighting factors give equal importance to the entropy measures and the power spectrum. We then categorized the resulting 60 sets of model parameters into four groups based on recording conditions and subject diagnosis: EC-CTL, EO-CTL, EC-AD, and EO-AD.

# 3.1. Healthy Eyes-Closed and Eyes-Open Results

Initially, we studied the models derived for the EC and EO EEG signals of CTL subjects for validation purposes. The means and standard deviations of the optimal values of the model parameters for the EC-CTL and EO-CTL EEG signals are listed in **Table 1**. The p-values from the parametric and non-parametric tests after Bonferroni correction indicate that the differences between all parameters of the two models are strongly statistically significant, with the exception of the noise intensity. Note that µ is also found statistically significant using the t-test but is slightly off when the non-parametric method is used.

In order to ensure that adequate sample sizes were used, the minimum required difference between the means of the two groups of data was computed for each parameter. As expected from the very small p-values, the sample size for statistical testing was found to be sufficient, with more than 99.9% power for all parameters except the noise intensity µ, which was not found to be statistically significant using the non-parametric method.

Power spectra of the optimal stochastic oscillator model output and the EEG signals for the EC and EO cases of CTL subjects are presented in **Figure 2**, where the θ, α, and β band powers show excellent agreement. The comparison revealed that, as expected, the optimal model closely follows the α-band dominance in the EC cases, while in the EO cases the optimal model follows

TABLE 2 | Optimal parameters of the Duffing—van der Pol oscillator model for EC and EO of AD subjects (N = 20) and the p-values from the unpaired t-test and Wilcoxon rank-sum test after Bonferroni correction.


a flatter frequency distribution from the upper δ to the lower β frequency bands. Furthermore, the Shannon and SE values of the EEG signals and of the model outputs for the EC and EO cases show close agreement. Shannon entropy values were 1.80 ± 0.08 and 1.92 ± 0.08 for the EC EEG and model output, respectively, and 1.71 ± 0.11 and 1.57 ± 0.15 for the EO EEG and model output, respectively. SE values were 1.04 ± 0.20 and 1.17 ± 0.22 for the EC EEG and model output, respectively, and 0.97 ± 0.20 and 1.20 ± 0.18 for the EO EEG and model output, respectively. These results show a significant improvement over our previous model, in which only Shannon entropy was used (Ghorbanian et al., 2015). The improvement is clearly observed in the power spectra of sample EC and EO EEG signals and their corresponding optimal model outputs, shown in **Figures 3**, **4**, respectively. Both figures demonstrate more distributed spectra of the model outputs, with noise complexities similar to the actual EEG signals, when SE is added to the objective function; i.e., the power spectra of the signals obtained without matching SE have very discrete peaks, unlike the EEG.
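The two entropy measures compared throughout this section can be estimated as sketched below. The bin count and the sample-entropy parameters (m = 2, r = 0.2·SD) are common defaults assumed here, not values taken from the article:

```python
import numpy as np

def shannon_entropy(x, bins=16):
    """Histogram estimate of Shannon entropy (in nats)."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / len(x)
    return -np.sum(p * np.log(p))

def sample_entropy(x, m=2, r=None):
    """Sample entropy: -log of the conditional probability that sequences
    matching for m points (within tolerance r) also match for m+1 points."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)
    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        count = 0
        for i in range(len(templates)):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dist <= r)
        return count
    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

# A regular signal has lower sample entropy than white noise
rng = np.random.default_rng(1)
print(sample_entropy(np.sin(0.1 * np.arange(500))),
      sample_entropy(rng.standard_normal(500)))
```

Matching both measures in the objective function constrains the model output's information content and complexity simultaneously.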

The impact of SE in matching signal complexity is further demonstrated through phase plot reconstruction of the time series. The average mutual information for a sample EC EEG signal and for the outputs of the optimal stochastic oscillator models is shown in **Figure 5** as a function of lag time. The first minimum occurs at T = 5 lag samples for both the EEG signal and the optimal model derived with both Shannon and sample entropies, while T = 3 for the output of the model derived solely from Shannon entropy. The resulting reconstructed phase plots of the EC EEG signal and of the outputs of the two optimal models are presented in **Figure 6**. Clearly, the reconstructed phase plots of the EEG and of the output of the model derived using both Shannon and sample entropies display similar behavior, while the output of the model derived using only Shannon entropy is qualitatively different from the EEG signal in terms of complexity and noise. Indeed, this result provides further affirmation that the stochastic Duffing—van der Pol model yields an output that matches the actual EEG data in terms of the non-linear characteristics observed in the phase space.
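The lag-selection procedure described here, taking the first minimum of the average mutual information, can be sketched with a histogram-based MI estimate; the bin count and test signal are illustrative choices:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information of two series (nats)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))

def first_minimum_lag(x, max_lag=30):
    """Lag at the first local minimum of the average mutual information."""
    ami = [mutual_information(x[:-lag], x[lag:]) for lag in range(1, max_lag + 1)]
    for k in range(1, len(ami) - 1):
        if ami[k] < ami[k - 1] and ami[k] <= ami[k + 1]:
            return k + 1  # lags start at 1
    return max_lag

x = np.sin(0.2 * np.arange(5000)) + 0.1 * np.random.default_rng(0).standard_normal(5000)
print(first_minimum_lag(x))
```

The returned lag plays the role of the delay T used for the phase plot reconstruction.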

### 3.2. Alzheimer's Disease vs. Control Results

Next, we studied the models derived for the EC and EO EEG signals of AD subjects. The means and standard deviations of the optimal values of the model parameters for EC-AD and EO-AD EEG signals are listed in **Table 2**, along with the p-values from the two statistical tests after Bonferroni corrections, indicating that the differences between only the

TABLE 3 | The p-values from the unpaired t-test and Wilcoxon rank-sum test after Bonferroni correction for the comparison of model parameters between AD (N = 20) and CTL (N = 40) subjects.


first four parameters of the two models are statistically significant. Next, we separately compared the model parameters of the EC and EO EEG signals of CTL subjects with those of AD patients.

**Table 3** lists the p-values from the two statistical tests (unpaired t-test and Wilcoxon rank-sum) after Bonferroni corrections, comparing CTL vs. AD subjects under the EC and EO conditions separately. The results indicate that the differences between all model parameters of CTL and AD subjects under the EC condition are strongly statistically significant, except for the noise intensity. Again, µ is found statistically significant using the t-test, but falls slightly short when the non-parametric method is used. The differences between the model parameters of CTL and AD subjects under the EO condition are not as strong, though most are still statistically significant. In the EO case, the parameter µ is not statistically significant by either method, and the t-test does not find b<sup>1</sup> statistically significant either.

The power analysis results at the 90%, 95%, 99%, and 99.9% levels for the two statistical tests are listed in **Tables 4**, **5** for the EC and EO cases, respectively. The actual difference between the means is given in parentheses after each parameter. The results indicate that our sample size for statistical testing between AD and CTL subjects in the EC case was sufficient for all parameters

TABLE 4 | Minimum required difference between model parameter mean values of EC AD vs. EC CTL for various desired powers of statistical tests.


TABLE 5 | Minimum required difference between model parameter mean values of EO AD vs. EO CTL for various desired powers of statistical tests.


except µ, with more than 99.9% power. In the EO case, however, only the sample sizes for parameters b<sup>2</sup> and ε<sup>2</sup> reach more than 99.9% power, while k<sup>2</sup> reaches 90%; the sample sizes for the remaining parameters did not provide sufficient power.

Power spectra of the optimal stochastic oscillator model output and of the EEG signals for the EC and EO cases of AD subjects are presented in **Figure 7**, where again the θ, α, and β band powers show excellent agreement. The comparison revealed that the optimal model correctly captured the slight θ-band dominance of the EC cases for AD subjects (Ghorbanian et al., 2013), while in the EO cases the optimal model followed the flatter frequency distribution. Again, it should be noted that the higher error rates occur in the frequency bands with lower power. Furthermore, the Shannon and SE values of the EEG signals and of the model outputs for the EC and EO cases show close agreement. Shannon entropy values were 1.78 ± 0.04 and 1.70 ± 0.10 for the EC EEG and model output, respectively, and 1.63 ± 0.32 and 1.62 ± 0.27 for the EO EEG and model output, respectively. SE values were 1.06 ± 0.19 and 1.17 ± 0.21 for the EC EEG and model output, respectively, and 1.02 ± 0.39 and 1.29 ± 0.24 for the EO EEG and model output, respectively.
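Band-power comparisons of the kind shown in the power-spectrum figures can be sketched with a Welch estimate. The band edges and sampling rate below are conventional assumptions, not taken from the article:

```python
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Hz (conventional edges)

def band_powers(x, fs):
    """Relative power per band from a Welch power spectral density estimate."""
    f, psd = welch(x, fs=fs, nperseg=min(len(x), 2 * fs))
    total = psd.sum()
    return {name: psd[(f >= lo) & (f < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

# A 10 Hz oscillation plus noise is alpha-dominated, like EC recordings of CTL subjects
fs = 128
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.3 * np.random.default_rng(0).standard_normal(t.size)
print(band_powers(x, fs))
```

A θ-dominated signal, as in the EC-AD recordings, would instead show its largest relative power in the 4–8 Hz band.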

Power spectra of the outputs of the optimal stochastic oscillator models and of the EEG signals for sample EC and EO cases of AD subjects are presented in **Figures 8**, **9**. Again, it is clear that the addition of SE to the objective function yields output signals whose power spectra are much more similar to the EEG signal in terms of distribution and noise complexity. As expected, the power spectrum plots demonstrate that the EC EEG signals from AD subjects were slightly θ-band dominated, unlike the α-band dominance of the EC EEG recordings from CTL subjects.

# 4. Discussion

Power spectra of the optimal stochastic oscillator model output and EEG signals show excellent agreement in the brain's major frequency bands. The comparison revealed that the optimal model closely follows the α-band dominance in EC recordings of the control subjects. Furthermore, the model for EC recordings of AD patients closely followed the θ-band power dominance, indicating the slowing of the EEG signal in these patients. In the EO cases, the optimal model, as expected, followed a flatter frequency distribution from the upper δ to the lower β bands for both AD and CTL subjects. Further evidence


of the robustness of the models derived in this study is that the models derived for healthy-subject EC and EO EEG signals in our earlier study (Ghorbanian et al., 2015) fall within the same distributions obtained for the CTL subjects in the clinical study.

Moreover, the Shannon and SE values of the EEG signals and of the model outputs for the EC and EO cases show close agreement for both CTL and AD subjects. However, the differences between the entropy values of the CTL subjects and the AD patients were not statistically significant for either the EEG signal or the model output. This aspect needs further study, since EEG signals from AD patients may be expected to have lower complexity and thus lower entropy values.

The contributions of this article are as follows. Firstly, the objective function of the optimization scheme, which in our previous work yielded model parameters by comparison with actual EEG data, was extended to include both Shannon and sample entropies, the latter being a measure of signal complexity. The procedure yielded model outputs that agreed with the actual EEG signals with respect to frequency content (power spectra), information content (Shannon entropy), and complexity (sample entropy). It was shown that the addition of SE significantly enhances the performance of the optimal model in terms of both the power spectrum and the non-linear characteristics displayed through phase space reconstruction. The results demonstrate the feasibility of stochastic non-linear oscillator models, which can be further studied for greater insight into the dynamic characteristics of EEG signals.

Secondly, the model parameter differences for EC and EO EEG recordings were statistically significant leading to qualitatively and quantitatively distinct realizations of the underlying models for the cases considered. This is a key result of the work since it verifies that distinct models represent the EEG signals recorded under different brain states. Potentially, this could lead to unique models for different brain disorders and injuries.

Thirdly, the study provided unique models for the EC and EO EEG recordings of AD patients. The results showed that the differences in almost all of the model parameters between the AD and CTL subjects were statistically significant for the EC and EO cases. Moreover, the power spectrum plots showed a good match between the signal generated by the stochastic model and the actual EEG signal from AD patients. However, the results for the EC case of AD were more accurate than those for the EO cases, mainly because the optimization scheme provides a better match in the EC cases. The important conclusion here is that unique stochastic non-linear oscillator models can be developed to represent EEG signals from patients with a brain disorder.

Of particular interest is the potential connection between our model and the neural mass models studied in the literature. For instance, characterization of functional connectivity between remote cortical areas has been studied using neural mass models (David and Friston, 2003; David et al., 2004). These and other efforts (Sotero et al., 2007) represent intriguing attempts to capture actual neural dynamics using coupled oscillator models and suggest that, after all, models such as the one discussed in this article may be of broader scope than being purely phenomenological. Extrapolating further, it would then be of immense interest to understand the manifestation of phenomena such as synchronization (Mirollo and Strogatz, 1990) within the framework of our model and the implications for EEG characterization.

# 5. Conclusions

In this article, we presented results that further develop our recent work on modeling the EEG signal as the response of a stochastic, coupled Duffing—van der Pol system of two oscillators. The results verify that unique, statistically distinguishable stochastic Duffing—van der Pol oscillator models represent the EEG recorded from AD patients vs. healthy controls. Overall, the results further affirm the efficacy of a stochastic Duffing—van der Pol oscillator network model in capturing the key characteristics of actual EEG data under different brain states as well as brain conditions, i.e., healthy controls vs. patients with a brain disorder. The validation provided by these results motivates further research toward improving the analytical model and testing it against larger data sets. Furthermore, the results suggest that the modeling approach could help develop novel diagnostic and interventional tools for neurological diseases and disorders.

# References


Alzheimer disease," in International Conference of the IEEE Engineering in Medicine and Biology Society (Boston, MA), 6087–6091.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Ghorbanian, Ramakrishnan and Ashrafiuon. This is an openaccess article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multisensory integration using dynamical Bayesian networks

Taher Abbas Shangari <sup>1</sup> , Mohsen Falahi <sup>1</sup> , Fatemeh Bakouie<sup>2</sup> \* and Shahriar Gharibzadeh<sup>2</sup> \*

*<sup>1</sup> Amirkabir Robotic Center, Amirkabir University of Technology, Tehran, Iran, <sup>2</sup> Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran*

Keywords: multisensory integration, Dynamic Bayesian Networks, modeling, sensory processing disorder, Bayesian Models

Multisensory Integration (MSI) is the study of how information coming from different sensory modalities, such as vision and audition, is integrated by the nervous system (Stein et al., 2009) as a complex system. MSI is one of the most important topics in neuroscience and has a great influence on our decision-making system. It plays a key role in our understanding of the surrounding environment, creating a coherent representation of the world for us (Lewkowicz and Ghazanfar, 2009). Since the signals in our sensory systems are corrupted by variability or noise, the nervous system combines different kinds of sensory information, such as sound and touch, to achieve a meaningful and continuous stream of percepts (Kording and Wolpert, 2006; Lewkowicz and Ghazanfar, 2009). Recently, researchers have shown increased interest in MSI modeling as a way to discover the causes of related disorders such as hypersensitivity or hyposensitivity (Knill and Pouget, 2004). Moreover, individuals with Autism Spectrum Disorder (ASD) have an impaired ability to integrate multisensory information into a unified percept (Stevenson et al., 2014).

MSI has been modeled in a variety of ways. Computational methods such as the Kalman Filter (KF) and Bayesian Networks (BN) are widely used to model probabilistic functions of the nervous system, including MSI (Van Der Kooij et al., 1999; Kording and Wolpert, 2004). KF-based models make a basic assumption about the accuracy of the sensory input data: the Probability Density Function (PDF) of each sensor's error is Gaussian. Under this assumption, it is provable that fusing two different measurements of one variable leads to a more accurate estimate (Kalman, 1960). A serious weakness of this method, however, is precisely this basic assumption. Assuming a Gaussian PDF for the sensory systems' errors contradicts what is known about the brain's internal models and prior knowledge of the human sensory system and the environment, which are not necessarily Gaussian-like. Additionally, since each sensory modality uses a different format to encode the same properties of the environment or body, MSI cannot be as simple as an averaging of sensory inputs (Deneve and Pouget, 2004). Hence, KF-based models are not valid for many MSI studies, and researchers have therefore tried to modify the method (Van der Zijpp and Hamerslag, 1994; Julier and Jeffrey, 2004).
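The KF-style fusion result cited here, that combining two measurements of one variable yields a more accurate estimate under the Gaussian assumption, reduces to precision-weighted averaging for two independent Gaussian cues. A minimal sketch with made-up numbers:

```python
def fuse(mu1, var1, mu2, var2):
    """Precision-weighted fusion of two independent Gaussian estimates of
    one quantity; the fused variance is smaller than either input variance."""
    w1 = var2 / (var1 + var2)          # weight on the more reliable estimate
    mu = w1 * mu1 + (1.0 - w1) * mu2
    var = var1 * var2 / (var1 + var2)
    return mu, var

# A precise "visual" cue at 0.0 and a noisy "auditory" cue at 2.0
mu, var = fuse(mu1=0.0, var1=1.0, mu2=2.0, var2=4.0)
print(mu, var)  # 0.4 0.8 -- pulled toward the precise cue, variance reduced
```

This simple weighted averaging is exactly what the authors argue is insufficient for MSI once the Gaussian assumption fails.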

Since BNs make no assumption about the accuracy of the input data, they have attracted much attention recently. A BN is a graphical model that represents probabilistic relationships among variables of interest. Using graphical models in conjunction with statistical techniques offers several advantages for data analysis. Firstly, because a BN represents the conditional dependencies among all variables, it can handle situations in which some data entries are missing. Secondly, the model can be used to learn causal relationships, and hence to understand a problem domain and to predict the consequences of intervention. Thirdly, because BNs have both causal and probabilistic semantics, they are an ideal representation for combining prior knowledge with data (Heckerman, 1998; Wasserman, 2011).

#### Edited by:

*Tobias Alecio Mattei, Brain & Spine Center - InvisionHealth - Kenmore Mercy Hospital, USA*

#### Reviewed by:

*Malte J. Rasch, Beijing Normal University, China*

#### \*Correspondence:

*Fatemeh Bakouie, f\_bakouie@sbu.ac.ir; Shahriar Gharibzadeh, gharibzadeh@aut.ac.ir*

Received: *21 February 2015* Accepted: *29 April 2015* Published: *22 May 2015*

#### Citation:

*Abbas Shangari T, Falahi M, Bakouie F and Gharibzadeh S (2015) Multisensory integration using dynamical Bayesian networks. Front. Comput. Neurosci. 9:58. doi: 10.3389/fncom.2015.00058*

Generally, there are three main inference tasks for BNs: inferring unobserved variables, parameter learning, and structure learning. BNs are widely used for modeling knowledge in computational biology, bioinformatics, and related fields. For example, a BN can represent the probabilistic relationships between diseases and symptoms; given the symptoms, the network can be used to compute the probabilities of the presence of various diseases.
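The disease-symptom example can be made concrete with a minimal two-node network; all probabilities below are invented for illustration:

```python
# Minimal two-node Bayesian network: Disease -> Symptom,
# with illustrative (made-up) probabilities.
p_disease = 0.01                         # prior P(D=1)
p_symptom_given_d = {1: 0.90, 0: 0.05}   # P(S=1 | D)

def posterior_disease(symptom_present=True):
    """P(D=1 | S) by Bayes' rule on the two-node network."""
    likelihoods = {d: (p_symptom_given_d[d] if symptom_present
                       else 1 - p_symptom_given_d[d]) for d in (0, 1)}
    prior = {1: p_disease, 0: 1 - p_disease}
    evidence = sum(likelihoods[d] * prior[d] for d in (0, 1))
    return likelihoods[1] * prior[1] / evidence

print(posterior_disease(True))   # observing the symptom raises P(disease)
```

In a full network this computation generalizes to the three inference tasks above, with many variables and learned conditional probability tables.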

As mentioned before, the brain needs to use different sources of information together to make a sound decision about a situation. In such cases BNs can be used to model the brain's function, as in many studies (Seilheimer et al., 2014). It is worth mentioning that in BNs the relationship between nodes is not a simple averaging, so more complex probabilistic problems can be modeled (Bishop and Nasser, 2006).

However, the reliability of sensory modalities clearly varies widely with context, and in a BN the effect of one node on another can vary from one task or situation to the next. Yet once we designate a node as the parent of another, this relation cannot be changed, and new experiences will not create new links between separate nodes. The main weakness of BN-based models is their failure to address how the network should be reconstructed on the basis of newly observed experiences. Most studies in MSI modeling have focused on a single task in which the effective sensory resources are known in advance; the structure of the network is therefore known too, and only the network parameters need to be trained. By contrast, a model of MSI should not be restricted to certain tasks but should instead generalize to other tasks; that is, the model should be more dynamic and task-independent. In addition, time clearly has a great influence on our decision making and reasoning, and unfortunately BNs fail to encode time directly (Mihajlovic and Petkovic, 2001).

We suggest that MSI models will be more general if we use Dynamic Bayesian Networks (DBN), which describe systems that change dynamically over time. In a BN that models the interactions between sensory modalities, the nodes are associated with activated sensory modalities and the edges represent the interactions among them. The sensory modalities of a neural system with n modalities are indexed by a set I = {i : i = 1, 2, . . . , n}. Consider the activation of a sensory modality measured by an fMRI time series or EEG over that modality, and let x<sup>i</sup> be the activation measuring the response of sensory modality i.

BNs describe the PDF over the activations of the sensory modalities, where the graphical structure provides an easy way to specify conditional interdependencies for a compact parameterization of the distribution. A BN defined by a structure S is a directed acyclic graph (DAG) together with a joint distribution over the set of time series x = {x<sup>i</sup> : i ∈ I}. The set of activations of the parents of sensory modality i is denoted by a<sup>i</sup>, and a DAG offers a simple and unique way to decompose the likelihood of the activations in terms of conditional probabilities: p(x | θ) = ∏<sub>i∈I</sub> p(x<sup>i</sup> | a<sup>i</sup>, θ<sup>i</sup>), where θ = {θ<sup>i</sup> : i ∈ I} represents the parameters of the conditional probabilities (Rajapakse and Zhou, 2007).

DBNs extend BNs to incorporate the temporal characteristics of the time series x. Here x(t) = {x<sup>i</sup>(t) : i ∈ I} represents the activations of the n sensory modalities at time t, where the instances t = 1, 2, . . . , T correspond to the times at which the sensory modality measures are taken and T denotes the total number of measures. To model the temporal dynamics of brain processes, we need to model a probability distribution over the set of random variables ⋃<sub>t=1</sub><sup>T</sup> x(t), which is complex and practically hard.

To avoid an explosion of the model complexity, one can assume that the temporal changes of activations of brain regions are stationary and first-order Markovian. This assumption provides a tractable causal model that explicitly takes into account the temporal dependencies of brain processes. When facing more complex temporal processes and connectivity patterns, higher-order and non-stationary Markov models can be used to overcome the complexity.
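The stationary first-order Markov assumption can be illustrated with a tiny two-state example; the transition matrix below is hypothetical:

```python
import numpy as np

# Two activation states (low/high) for one sensory modality, with a
# hypothetical stationary transition matrix A[i, j] = P(x(t+1)=j | x(t)=i).
A = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi0 = np.array([1.0, 0.0])  # start in the low-activation state

def trajectory_probability(states, A, pi0):
    """Joint probability of a state trajectory under a stationary
    first-order Markov model: P(x1) * prod_t P(x_{t+1} | x_t)."""
    p = pi0[states[0]]
    for s, s_next in zip(states, states[1:]):
        p *= A[s, s_next]
    return p

def marginal_at(t, A, pi0):
    """Marginal P(x(t)) obtained by unrolling the transition network."""
    return pi0 @ np.linalg.matrix_power(A, t)

print(trajectory_probability([0, 0, 1, 1], A, pi0))  # 0.9 * 0.1 * 0.7
print(marginal_at(10, A, pi0))
```

Under the first-order assumption, the joint over a whole trajectory factorizes into the initial distribution times repeated applications of the same transition network, which is what keeps the model tractable.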

The connectivity structure between two consecutive data samples is represented by the transition network, which renders the joint distribution of all possible trajectories of the temporal processes. The structure of the DBN is obtained by unrolling the transition network over consecutive scans for all t = 1, 2, . . . , T (Rajapakse and Zhou, 2007).

In summary, we suggest that DBNs may be a more useful method for modeling MSI than prior methods, for three reasons. Firstly, since a DBN changes dynamically, the initial structure of the network does not lead to unreliable results, and the network can be used in various kinds of studies because the method is task-independent. Secondly, in cases where we are unsure about the relations and interactions between different sensory modalities, the DBN output can help us achieve a more accurate understanding of MSI processes. Thirdly, cyclic functional networks exist in the brain, such as cortico-subcortical loops, which BNs are not capable of modeling; unlike a BN, a DBN can model recurrent networks while still satisfying the acyclic constraint of the transition network (Rajapakse and Zhou, 2007). This is an important advantage of modeling the neural system with a DBN: these key features help us obtain a proper viewpoint on MSI in different tasks, making the study of related disorders easier and closer to reality.

# References


Heckerman, D. (1998). A Tutorial on Learning with Bayesian Networks. Springer Netherlands.

Julier, S. J., and Jeffrey, K. U. (2004). Unscented filtering and nonlinear estimation. Proc. IEEE 92, 401–422. doi: 10.1109/JPROC.2003. 823141

Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35–45.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Abbas Shangari, Falahi, Bakouie and Gharibzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.