**CRITICALITY AS A SIGNATURE OF HEALTHY NEURAL SYSTEMS: MULTI-SCALE EXPERIMENTAL AND COMPUTATIONAL STUDIES**

**Topic Editors Paolo Massobrio, Lucilla de Arcangelis, Valentina Pasquale, Henrik J. Jensen and Dietmar Plenz**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2015 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-503-9 **DOI** 10.3389/978-2-88919-503-9


## **CRITICALITY AS A SIGNATURE OF HEALTHY NEURAL SYSTEMS: MULTI-SCALE EXPERIMENTAL AND COMPUTATIONAL STUDIES**

Topic Editors:

**Paolo Massobrio,** University of Genova, Italy

**Lucilla de Arcangelis,** Second University of Napoli, Italy

**Valentina Pasquale,** Fondazione Istituto Italiano di Tecnologia, Italy

**Henrik J. Jensen,** Imperial College of London, UK

**Dietmar Plenz,** National Institute of Mental Health (NIH), USA

Criticality as a signature of healthy neural systems: multi-scale experimental and computational studies. Most commonly used words in the Research Topic. The font size is proportional to the number of times the word was used.

Since 2003, when spontaneous activity in cortical slices was first found to follow scale-free statistical distributions in size and duration, a growing body of experimental evidence and theoretical models supporting scale invariance in the cortex has been reported in the literature. Although strongly debated, such results come from many different in vitro and in vivo preparations (awake monkeys, anesthetized rats and cats, in vitro slices and dissociated cultures), suggesting that power law distributions and scale-free correlations are a very general and robust feature of cortical activity that has been conserved across species as a specific substrate for information storage, transmission and processing. Equally important, features reminiscent of scale invariance and criticality are observed at scales spanning from the level of interacting arrays of neurons all the way up to correlations across the entire brain. Yet even if we accept that the brain operates near a critical point, little is known about the causes and/or consequences of a loss of criticality and its relation to brain diseases (e.g., epilepsy). The study of how pathogenic mechanisms are related to the critical/non-critical behavior of neuronal networks would likely provide new insights into the cellular and synaptic determinants of the emergence of critical-like dynamics and structures in neural systems. At the same time, the relation between impaired behavior and the disruption of criticality would help clarify the role of criticality in normal brain function. The main objective of this Research Topic is to investigate the emergence and disruption of critical-like states in healthy and impaired neural systems.

# Table of Contents

- *Criticality as a Signature of Healthy Neural Systems*. Paolo Massobrio, Lucilla de Arcangelis, Valentina Pasquale, Henrik J. Jensen and Dietmar Plenz
- *On the Temporal Organization of Neuronal Avalanches*. Fabrizio Lombardi, Hans J. Herrmann, Dietmar Plenz and Lucilla De Arcangelis
- *Markers of Criticality in Phase Synchronization*. Maria Botcharova, Simon F. Farmer and Luc Berthouze
- *Self-Organized Criticality as a Fundamental Property of Neural Systems*. Janina Hesse and Thilo Gross
- *Marginally Subcritical Dynamics Explain Enhanced Stimulus Discriminability Under Attention*. Nergis Tomen, David Rotermund and Udo Ernst
- *Critical Role for Resource Constraints in Neural Models*. James A. Roberts, Kartik K. Iyer, Sampsa Vanhatalo and Michael Breakspear
- *Spike Avalanches in Vivo Suggest a Driven, Slightly Subcritical Brain State*. Viola Priesemann, Michael Wibral, Mario Valderrama, Robert Pröpper, Michel Le Van Quyen, Theo Geisel, Jochen Triesch, Danko Nikolić and Matthias H. J. Munk
- *Functional Significance of Complex Fluctuations in Brain Activity: From Resting State to Cognitive Neuroscience*. David Papo
- *Alternation of Up and Down States at a Dynamical Phase-Transition of a Neural Network with Spatiotemporal Attractors*. Silvia Scarpetta and Antonio de Candia
- *Power Law Scaling In Synchronization of Brain Signals Depends on Cognitive Load*. Jesse Tinker and Jose Luis Perez Velazquez
- *Universal Organization of Resting Brain Activity at the Thermodynamic Critical Point*. Shan Yu, Hongdian Yang, Oren Shriki and Dietmar Plenz

## Criticality as a signature of healthy neural systems

#### *Paolo Massobrio<sup>1</sup>\*, Lucilla de Arcangelis<sup>2</sup>, Valentina Pasquale<sup>3</sup>, Henrik J. Jensen<sup>4</sup> and Dietmar Plenz<sup>5</sup>*

*<sup>1</sup> Department Informatics, Bioengineering, Robotics, System Engineering (DIBRIS), University of Genova, Genova, Italy*

*<sup>2</sup> Department Industrial and Information Engineering, Second University of Napoli, Napoli, Italy*

*<sup>3</sup> Fondazione Istituto Italiano di Tecnologia, Genova, Italy*

*<sup>4</sup> Department of Mathematics, Centre for Complexity Science, Imperial College of London, London, UK*

*<sup>5</sup> Section on Critical Brain Dynamics, National Institute of Mental Health (NIH), Bethesda, MD, USA*

*\*Correspondence: paolo.massobrio@unige.it*

#### *Edited and reviewed by:*

*Maria V. Sanchez-Vives, Institució Catalana de Recerca i Estudis Avançats and Institut de Investigacions Biomèdiques August Pi i Sunyer, Spain*

**Keywords: criticality, neuronal avalanches, critical exponents, power law, network dynamics, healthy neural systems**

This Research Topic in "Frontiers in Systems Neuroscience" contains a collection of original contributions and review articles on the hypothesis that the normal, healthy brain resides in a critical state. The hypothesis that brain activity, or specifically, neuronal activity in the cortex, might be critical arose from the premise that a critical brain can show the fastest and most flexible adaptation to a rather unpredictable environment (for review see Chialvo, 2010). Over the last decade, numerous signatures of criticality have been identified in brain activity. Some of the most striking examples are the probability distributions of size and duration for intermittent spontaneous activity bursts during ongoing activity in the cortex (Beggs and Plenz, 2003). These distributions have been found to follow power laws, which are conserved across species [rat: (Gireesh and Plenz, 2008); non-human primate: (Petermann et al., 2009; Yu et al., 2011); MEG: (Poil et al., 2012; Palva et al., 2013; Shriki et al., 2013); EEG: (Meisel et al., 2013); fMRI: (Tagliazucchi et al., 2012; Haimovici et al., 2013)] and experimental preparations, spanning from reduced *in vitro* models [i.e., acute and organotypic slices (Beggs and Plenz, 2003) and dissociated cultures (Pasquale et al., 2008)] to *in vivo* animal models (Gireesh and Plenz, 2008; Petermann et al., 2009; Ribeiro et al., 2010). These scale-free activation patterns, called *neuronal avalanches*, provide evidence for criticality in the brain.

The bold claim that the brain is critical has elicited a healthy dose of skepticism and critique, often on technical grounds. For example, power laws are ubiquitous in nature, can potentially emerge from noise, and might not be particular to brain function (Touboul and Destexhe, 2010). This critique has refocused the debate on the specific, shallow exponents found for avalanche power laws. These exponents demonstrate that the dynamics introduce unique, long-range spatial correlations and require precisely balanced, weak interactions that differ from noise (Klaus et al., 2011). Similarly, discussion about the proper power law model and functional fit (Clauset et al., 2009; Dehghani et al., 2012) has highlighted the importance of careful identification of power law cut-offs in avalanche distributions, and their correct incorporation into appropriate statistical models (e.g., Langlois et al., 2014; Yu et al., 2014). Importantly, alternative approaches to avalanche dynamics using temporal scaling (Hardstone et al., 2012) and spatial scaling of fluctuations in ongoing human brain activity (Haimovici et al., 2013) have brought further support to the hypothesis of criticality in the brain.

As evidenced in the contributions to this Special Topic issue, the exploration and examination of brain activity in the framework of criticality represents a highly active, ongoing field of research. It has been shown that the distribution of silent times between consecutive avalanches displays a non-monotonic behavior (due to the slow alternation between up- and down-states, Scarpetta and De Candia, 2014), with a power law decay at short time scales (Lombardi et al., 2014). Further analyses (Lombardi et al., 2012) demonstrate that avalanche size and inter-avalanche silent times are correlated, and highlight that avalanche occurrence exhibits the characteristic periodicity of θ and β/γ oscillations. This observation is in line with pharmacological results that connect nested oscillations and neuronal avalanches in cortex (Gireesh and Plenz, 2008). Experimental observations of long-term temporal correlations (Botcharova et al., 2014) in fluctuations of phase synchronization in EEG and MEG signals suggest that the driving mechanisms behind avalanche activity are non-local, with all scales contributing to system behavior. Indeed, an important "keyword" that characterizes such scale-free systems is the presence of a critical point, indicating the existence of a critical branching process as the underlying structure that sustains this kind of dynamics. As shown in Yu et al. (2013), the ongoing resting activity in cortical networks organizes close to an effective thermodynamic critical point, suggesting the possibility that a critical state may in effect be described by methodology from thermodynamic equilibrium. As reviewed in Hesse and Gross (2014), a critical system displays optimal computational properties, indicating that criticality has been evolutionarily selected as a useful feature for the nervous system.

The progress in the field of criticality and brain dynamics is further demonstrated by the fact that current discussions, rather than rejecting criticality altogether, are often focused on the proximity of brain dynamics to the critical point under different conditions. Based on *in vivo* recordings of extracellular spiking activity and modeling work, it has been concluded that the brain does not reflect a critical state, but its emergent dynamics might self-organize to a slightly sub-critical regime (Priesemann et al., 2014). To reside in such a regime can be considered an advantage, since it might prevent brain activity from becoming epileptic, which has been associated with supercritical dynamics (Meisel et al., 2012). Based on modeling (Tomen et al., 2014), it has been suggested that cortical networks, by operating at the sub-critical to critical transition region, could dramatically enhance stimulus representation.

Thus, if the brain works close to or at a critical point, it is interesting to investigate the role of criticality in cognition and in the long-term temporal correlations observed in behavioral studies (Papo, 2014). Moreover, little is known about the causes and/or consequences of a loss of criticality, and its relation to brain diseases (e.g., epilepsy). The study of how pathogenic mechanisms are related to the critical/non-critical behavior of neuronal networks would likely provide new insights into the cellular and synaptic determinants supporting the emergence of critical-like dynamics and structures in neural systems. At the same time, the relationship between disrupted criticality and impaired behavior would help clarify the role of critical dynamics in normal brain functioning. In this Research Topic, Tinker and Perez Velazquez (2014) studied whether power law scaling can be achieved in the distribution of phase synchronization derived from MEG recordings, acquired from children with or without autism performing executive function tasks. Interestingly, Roberts et al. (2014) point out an issue not well explored in previous works, namely that existing models lack precise physiological descriptions of how the brain maintains its tuning near a critical point. The authors claim that a missing fundamental ingredient is a formulation of the reciprocal coupling between neural activity and metabolic resources. Recent findings are aligned with the authors' idea, which emerged from the analysis of disorders involving severe metabolic disturbances and altered scale-free properties of brain dynamics.

The hypothesis that cortical dynamics resides at a critical point, at which information processing is optimized, has refocused attempts to explain the tremendous variability in neuronal activity patterns observed in the brain at all scales. Over the last several years, this hypothesis has given rise to numerous conferences and workshops on the brain and criticality (Plenz and Niebur, 2014). The current Research Topic continues the endeavor to explore one of the most exciting current concepts on brain function.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 December 2014; accepted: 10 February 2015; published online: 25 February 2015.*

*Citation: Massobrio P, de Arcangelis L, Pasquale V, Jensen HJ and Plenz D (2015) Criticality as a signature of healthy neural systems. Front. Syst. Neurosci. 9:22. doi: 10.3389/fnsys.2015.00022*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2015 Massobrio, de Arcangelis, Pasquale, Jensen and Plenz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## On the temporal organization of neuronal avalanches

#### *Fabrizio Lombardi<sup>1</sup>\*, Hans J. Herrmann<sup>1,2</sup>, Dietmar Plenz<sup>3</sup> and Lucilla De Arcangelis<sup>4</sup>*

*<sup>1</sup> Institute of Computational Physics for Engineering Materials, ETH, Zurich, Switzerland*

*<sup>2</sup> Departamento de Física, Universidade Federal do Ceará, Fortaleza, Brazil*

*<sup>3</sup> Section on Critical Brain Dynamics, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA*

*<sup>4</sup> Department of Industrial and Information Engineering, Second University of Naples, National Institute for Nuclear Physics Gr. Coll. Salerno, Aversa, Italy*

#### *Edited by:*

*Paolo Massobrio, University of Genova, Italy*

#### *Reviewed by:*

*James A. Roberts, Queensland Institute of Medical Research Berghofer Medical Research Institute, Australia*

*Leonardo Paulo Maia, Universidade de São Paulo, Brazil*

#### *\*Correspondence:*

*Fabrizio Lombardi, Institute of Computational Physics for Engineering Materials, ETH, Schafmattstr. 6, Zürich, 8093, Switzerland e-mail: fabrizio.lombardi@ifb.baug.ethz.ch*

Spontaneous activity of cortex *in vitro* and *in vivo* has been shown to organize as neuronal avalanches. Avalanches are cascades of neuronal activity that exhibit a power law in their size and duration distribution, typical features of balanced systems in a critical state. Recently it has been shown that the distribution of quiet times between consecutive avalanches in rat cortex slice cultures displays a non-monotonic behavior with a power law decay at short time scales. This behavior has been attributed to the slow alternation between up and down-states. Here we further characterize the avalanche process and investigate how the functional behavior of the quiet time distribution depends on the fine structure of avalanche sequences. By systematically removing smaller avalanches from the experimental time series we show that size and quiet times are correlated and highlight that avalanche occurrence exhibits the characteristic periodicity of θ and β/γ oscillations, which jointly emerge in most of the analyzed samples. Furthermore, our analysis indicates that smaller avalanches tend to be associated with faster β/γ oscillations, whereas larger ones are associated with slower θ and 1–2 Hz oscillations. In particular, large avalanches corresponding to θ cycles trigger cascades of smaller ones, which occur at β/γ frequency. This temporal structure closely follows that of nested θ − β/γ oscillations. Finally we demonstrate that, because of the multiple time scales characterizing avalanche dynamics, the distributions of quiet times between avalanches larger than a certain size do not collapse onto a unique function when rescaled by the average occurrence rate. However, when considered separately in the up-state and in the down-state, these distributions are solely controlled by the respective average rate and two different unique functions can be identified.

**Keywords: neuronal avalanches, oscillations, rat cortex, waiting times, neuronal networks, criticality**

## **1. INTRODUCTION**

During sleep or under anesthesia, as well as *in vitro*, ongoing or spontaneous activity in cortex alternates between active periods with high probability of action potential firing and quiescent periods characterized by sparse firing (Plenz and Aertsen, 1996; Cossart et al., 2003; Cunningham et al., 2006; Hahn et al., 2006). These extracellular spiking dynamics correspond to so-called up and down-state fluctuations in the intracellular membrane potential of cortical neurons (Steriade et al., 1993; Plenz and Kitai, 1996; Wilson, 2008). During up-states, the intracellular membrane potential is close to firing threshold, allowing neurons to fire action potentials in response to synaptic input. In contrast, the membrane potential is more hyperpolarized during the down-state, leading to a low probability of firing. The up-state is generally considered a cortical network property that arises from the propagation of activity among recurrently connected neurons (Plenz and Kitai, 1996; McCormick et al., 2003; Wilson, 2008; Millman et al., 2010). The resulting synaptic input depolarizes neurons beyond threshold, supporting and prolonging the up-state. In that context, the up-state should be considered a metastable state, i.e., the membrane potential would rapidly decay to resting value if network mechanisms prevented the required excitability or excitatory synaptic drive for individual neurons.

Conversely, down-states reflect relatively quiescent network periods during which the membrane potential of most neurons is close to or even lower than their resting value. Down-states generally result from disfacilitation, i.e., a substantial reduction or lack of excitatory drive in the network (Cowan and Wilson, 1994; Timofeev et al., 2001). Transitions to the down-state can be caused by various mechanisms such as synaptic depression at glutamatergic synapses (Stevens and Tsujimoto, 1995; Staley et al., 1998), an increase of a factor inhibiting glutamate release, such as the nucleoside adenosine (Thompson et al., 1992), blockage of receptor channels by the presence, for instance, of external magnesium (Maeda et al., 1995), or spike adaptation, which arises from the intracellular accumulation of calcium entering during the action potential and opening potassium channels (Sanchez-Vives et al., 2000). Transitions to the up-state are generally thought to arise from non-linear amplification following recovery from disfacilitation. For example, spontaneous single action potentials, spontaneous miniature synaptic release, and recovery from synaptic vesicle depletion, i.e., synaptic depression, can cooperate to produce a non-linear amplification of small amplitude signals, leading to the generation of larger depolarizing events that rapidly transition the network to the up-state, as observed in cortical slabs (Timofeev et al., 2000) and slice cultures (Plenz and Aertsen, 1996).

During up-states, which usually last up to several hundreds of milliseconds, cortical neurons have been shown to fire irregularly, often during nested oscillations (e.g., Plenz and Kitai, 1996). Over the last decade, this highly variable firing pattern at short time scales of just a few milliseconds has been found to reflect a precise, scale-invariant organization of activity, so-called neuronal avalanches (Beggs and Plenz, 2003; Mazzoni et al., 2007; Gireesh and Plenz, 2008; Pasquale et al., 2008; Petermann et al., 2009; Shriki et al., 2013). Neuronal avalanches are intermittent bursts of activity cascades whose sizes and durations follow power law statistics, a typical feature of systems at criticality (Stanley, 1971). The statistics of time intervals separating successive avalanches have recently been studied in the spontaneous activity of rat cortex slice cultures (Lombardi et al., 2012). In Lombardi et al. (2012), these intervals are called waiting times and defined as the difference between the ending and starting time of consecutive avalanches. Here and in the following we will adopt a slightly different notation (Sanchez et al., 2002): we call quiet times the time intervals between the ending and starting time of consecutive avalanches, whereas we refer to waiting times as time intervals between starting times of consecutive avalanches.

The quiet time distribution is widely used in the stochastic analysis of natural phenomena, such as earthquakes, solar flares (de Arcangelis et al., 2006a), and rock fractures, where it is usually called the waiting time distribution. Indeed, for these phenomena the waiting times do not differ from quiet times because event durations can be neglected and processes can be consistently treated as point processes. For neuronal avalanches this approximation is not always valid since the shortest quiet times are comparable with some avalanche durations, as we will show in the following. While numerous similarities between earthquakes and neuronal avalanches have been found (Plenz, 2012), the quiet time distribution has so far been only incompletely analyzed for avalanches. Of particular interest are the universal temporal scaling features observed for earthquakes. Distributions of earthquake waiting times, in which waiting times are restricted to earthquakes above a given magnitude threshold, depend on the threshold, but nevertheless collapse onto a universal, i.e., threshold independent, function when waiting times are rescaled by the average rate (Corral, 2004). This property reveals that seismicity has a complex organization in time with universal properties: the removal of small events by increasing the minimal detection threshold does not affect the fundamental organization of earthquake occurrence.
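The collapse described above is commonly written as a scaling ansatz. As a sketch in notation that is not taken from this article (the symbols $\delta t$, $m_c$, $R$ and $f$ are assumptions introduced here for illustration), let $P(\delta t; m_c)$ be the waiting time distribution for events above a magnitude threshold $m_c$, and $R(m_c)$ the corresponding mean event rate. The statement that the distributions collapse when rescaled by the average rate then reads:

$$
P(\delta t;\, m_c) \;=\; R(m_c)\, f\!\big(R(m_c)\,\delta t\big)
$$

where $f$ does not depend on $m_c$. In practice this means that plotting $P/R$ against $R\,\delta t$ for several thresholds should place all curves on a single function if the scaling holds.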

The quiet time distribution of neuronal avalanches is characterized by a peculiar non-monotonic behavior, with power law decay followed by a local minimum and a more or less pronounced peak at a characteristic slow time scale (Lombardi et al., 2012). Numerical simulations suggest that such a distribution reflects the alternation between up and down-states in the network, which acts as a homeostatic mechanism controlling network excitability (Lombardi et al., 2012). In the current work, we analyze the functional behavior of the quiet time distribution in relation to the structure of avalanche sequences. In particular, we examine the relationship between quiet times and avalanche sizes by studying the distributions $P(t; s_c)$ of quiet times between consecutive avalanches of sizes larger than a given threshold $s_c$ and investigate whether the non-monotonic quiet time distribution identified in cortex cultures exhibits the universal scaling features reported for waiting time distributions of earthquakes. We first compare quiet and waiting time statistics for neuronal avalanches. Then we show (1) that the avalanche process in the up-state is solely controlled by the average occurrence rate and the corresponding quiet time distribution has a universal, i.e., sample independent, power law decay. (2) By systematically removing smaller avalanches from the experimental time series, we then unveil correlations between sizes and quiet times and highlight that avalanche occurrence exhibits some of the characteristic periodicity of θ (4–15 Hz), β (15–30 Hz), and γ (30–100 Hz) oscillations. Indeed, in place of the original power law, we observe several peaks at short time scales when considering only avalanches with size $s$ above a given threshold $s_c$. Therefore, smaller avalanches that are close in time are crucial for the power law in the quiet time distribution of up-states to emerge.
We observe that these avalanches tend to be related to short quiet times and fast β/γ oscillations, while larger avalanches are associated with slower θ and 1–2 Hz oscillations. In particular, we notice a hierarchical structure in avalanche sequences: in the up-states, large avalanches occurring with θ frequency trigger cascades of smaller avalanches corresponding to faster oscillations. Finally we demonstrate (3) that the distributions $P(t; s_c)$ of quiet times between avalanches with size $s$ above a given threshold $s_c$ do not collapse if quiet times are rescaled by the average rate $r = 1/\langle t \rangle$. However, when the different temporal scales that govern up and down-states are taken into account, a proper collapse can be obtained. Specifically, the distributions $P(t; s_c)$ in the up-state and in the down-state show the same functional behavior if quiet times are rescaled by the respective average avalanche rate.
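The thresholding and rescaling steps outlined above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions and not the authors' analysis code: the function names, the array-based inputs, and the reading of the average rate as the inverse of the mean quiet time are all hypothetical choices made here.

```python
import numpy as np


def thresholded_quiet_times(starts_ms, ends_ms, sizes, s_c):
    """Quiet times between consecutive avalanches with size > s_c.

    Avalanches at or below the threshold are removed first, so the quiet
    time spans everything between two surviving avalanches.
    """
    starts = np.asarray(starts_ms, dtype=float)
    ends = np.asarray(ends_ms, dtype=float)
    keep = np.asarray(sizes, dtype=float) > s_c
    starts, ends = starts[keep], ends[keep]
    return starts[1:] - ends[:-1]


def rescale_by_rate(quiet_times_ms):
    """Rescale quiet times by the average rate (assumed to be 1 / mean)."""
    quiet = np.asarray(quiet_times_ms, dtype=float)
    rate = 1.0 / quiet.mean()
    return rate * quiet, rate


def log_binned_density(values, n_bins=20):
    """Probability density estimated on logarithmically spaced bins."""
    values = np.asarray(values, dtype=float)
    values = values[values > 0]
    edges = np.logspace(np.log10(values.min()), np.log10(values.max()), n_bins + 1)
    # Pin the outer edges so rounding cannot push the extremes out of range.
    edges[0], edges[-1] = values.min(), values.max()
    density, edges = np.histogram(values, bins=edges, density=True)
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centers
    return centers, density, edges
```

To test for a collapse one would repeat the first two steps for several values of `s_c` and compare the log-binned densities of the rescaled quiet times (each density divided by the corresponding rate).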

## **2. MATERIALS AND METHODS**

## **2.1. EXPERIMENTAL SETUP**

Coronal slices from rat dorsolateral cortex (postnatal day 0–2; 350 µm thick) are attached to a poly-D-lysine coated 60-microelectrode array (MEA; Multichannelsystems, Germany) and grown at 35.5 °C in normal atmosphere in standard culture medium without antibiotics for 4–6 weeks before recording. Avalanche activity is measured from cortex-striatum-substantia nigra triple cultures or single cortex cultures as reported previously (Beggs and Plenz, 2003). In short, spontaneous avalanche activity is recorded outside the incubator in standard artificial cerebrospinal fluid (ACSF; laminar flow of 1 ml/min) under stationary conditions for up to 10 h. The spontaneous local field potential (LFP) is sampled continuously at 1 kHz at each electrode and low-pass filtered at 50 Hz. Negative deflections in the LFP (nLFP) are detected by crossing a noise threshold of −3 *SD* followed by negative peak detection within 20 ms. nLFP times and nLFP amplitudes are extracted. Neuronal avalanches are defined as spatio-temporal clusters of nLFPs on the MEA (Beggs and Plenz, 2003). A neuronal avalanche consists of a consecutive series of time bins of fixed width that contain at least one nLFP on any of the electrodes. Each avalanche is preceded and followed by at least one time bin with no activity. Without loss of generality, the present analysis is done with the bin width individually estimated for each culture from the average inter-nLFP interval on the array, at which the power law in avalanche sizes $s$, $P(s) \sim s^{-\alpha}$, yields $\alpha = 3/2$. The bin width ranged between 3 and 6 ms for all cultures. Avalanche size is defined as the sum of absolute nLFP amplitudes (µV) on active electrodes or simply the number of active electrodes. Size distributions are obtained using logarithmic binning for sizes expressed in µV.
A quiet time $t$ is defined as the time interval between the ending time $t^{f}_{j}$ of an avalanche and the starting time $t^{i}_{j+1}$ of the following one, namely $t_j = t^{i}_{j+1} - t^{f}_{j}$. A waiting time $\delta t$ is defined as the time interval between the starting time $t^{i}_{j}$ of an avalanche and the starting time $t^{i}_{j+1}$ of the following one, namely $\delta t_j = t^{i}_{j+1} - t^{i}_{j}$. Quiet (waiting) time distributions are obtained using logarithmic binning for quiet (waiting) times expressed in ms.
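The binning procedure and the two interval definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the dictionary layout are hypothetical, avalanche start and end times are taken on bin edges, and size is counted as the number of distinct active electrodes (the amplitude-based size would need the nLFP amplitudes as an extra input).

```python
import numpy as np


def detect_avalanches(event_times_ms, event_electrodes, bin_width_ms):
    """Group nLFP events into avalanches using fixed-width time bins.

    An avalanche is a run of consecutive non-empty bins, bounded on both
    sides by at least one empty bin.
    """
    times = np.asarray(event_times_ms, dtype=float)
    electrodes = np.asarray(event_electrodes)
    bins = np.floor(times / bin_width_ms).astype(int)
    order = np.argsort(bins, kind="stable")
    bins, electrodes = bins[order], electrodes[order]

    avalanches = []
    if bins.size == 0:
        return avalanches
    # A jump of more than one bin index means at least one empty bin.
    breaks = np.flatnonzero(np.diff(bins) > 1) + 1
    for b, e in zip(np.split(bins, breaks), np.split(electrodes, breaks)):
        avalanches.append({
            "start_ms": b[0] * bin_width_ms,       # left edge of first bin
            "end_ms": (b[-1] + 1) * bin_width_ms,  # right edge of last bin
            "size": len(set(e.tolist())),          # distinct active electrodes
        })
    return avalanches


def quiet_and_waiting_times(avalanches):
    """Quiet times (end to next start) and waiting times (start to next start)."""
    starts = np.array([a["start_ms"] for a in avalanches], dtype=float)
    ends = np.array([a["end_ms"] for a in avalanches], dtype=float)
    quiet = starts[1:] - ends[:-1]
    waiting = starts[1:] - starts[:-1]
    return quiet, waiting
```

Because waiting times include the duration of the preceding avalanche, the two arrays differ by exactly that duration for each pair, which is why the point-process approximation fails when durations are comparable to the shortest quiet times.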

#### *2.1.1. Up and down-state*

The following procedure is used to discriminate between up and down-states. An up-state consists of a consecutive series of avalanches separated by quiet times Δ*t* shorter than Δ*t*<sup>∗</sup>, where Δ*t*<sup>∗</sup> is defined as the local minimum of the quiet time distribution between the initial power law regime and the local peak observed between 500 and 1000 ms. Conversely, every quiet time longer than Δ*t*<sup>∗</sup> belongs to a down-state, and a consecutive series of avalanches separated by quiet times longer than Δ*t*<sup>∗</sup> is considered a down-state. The mean rate in the up-state is defined as *r<sub>up</sub>* = 1/⟨Δ*t<sub>up</sub>*⟩, whereas the mean rate in the down-state is defined as *r<sub>dw</sub>* = 1/⟨Δ*t<sub>dw</sub>*⟩.
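
As a rough illustration of this classification, the sketch below splits a sequence of quiet times at a given threshold and computes the two mean rates. The threshold `dt_star_ms` is taken as an input, since locating the local minimum of the distribution is not shown here. The function name and the treatment of a quiet time exactly equal to the threshold (assigned to the up-state) are assumptions.

```python
import numpy as np

def split_up_down(quiet_ms, dt_star_ms):
    """Assign each quiet time to the up-state or down-state and return the mean rates."""
    quiet = np.asarray(quiet_ms, dtype=float)
    is_down = quiet > dt_star_ms          # longer than the threshold: down-state
    up_quiet = quiet[~is_down]
    down_quiet = quiet[is_down]
    # Mean rate = inverse of the mean quiet time within each state (NaN if the state is absent).
    r_up = 1.0 / up_quiet.mean() if up_quiet.size else np.nan
    r_dw = 1.0 / down_quiet.mean() if down_quiet.size else np.nan
    return up_quiet, down_quiet, r_up, r_dw
```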

#### **2.2. NUMERICAL MODEL**

#### *2.2.1. Network and dynamics*

We consider *N* = 64000 neurons at random positions, each characterized by its potential *v<sub>i</sub>*. Neurons are connected by a scale-free network of synapses. More precisely, to each neuron *i* we assign an out-going connectivity degree, *k<sub>out<sub>i</sub></sub>* ∈ [2, 100], according to the degree distribution *P*(*k*) ∝ *k*<sup>−2</sup> of the functional network measured in Eguiluz et al. (2005). For different choices of network topology, the model exhibits the same scaling behavior of avalanche size and duration distributions (de Arcangelis et al., 2006b; Pellegrini et al., 2007; de Arcangelis and Herrmann, 2012). The universality class of the neuronal avalanche process is that of the mean field branching process (Zapperi et al., 1995; Lauritsen et al., 1996). To each synaptic connection we assign an initial random strength *g<sub>ij</sub>* ∈ [0.15, 0.3] and to each neuron an excitatory or inhibitory character. Outgoing synapses are excitatory if they belong to excitatory neurons, inhibitory otherwise. The network has a fraction *p<sub>in</sub>* of inhibitory synapses, which is fixed. Each synapse is directed, meaning that it can be used by neuron *i* to send a signal to neuron *j* but not vice versa. As a consequence, *g<sub>ij</sub>* ≠ *g<sub>ji</sub>* and in general the out-degree and in-degree of a neuron do not coincide. Therefore, once the network of output connections is established, we identify the resulting degree of in-connections, *k<sub>in<sub>j</sub></sub>*, for each neuron *j*, namely the number of synapses directed to neuron *j*. The number *k<sub>in<sub>j</sub></sub>* of in-going synapses can be considered as the dendritic tree of neuron *j*. We then assume that each neuron *j* has a soma whose surface is proportional to *k<sub>in<sub>j</sub></sub>*.

Whenever at time *t* the value of the potential in neuron *i* is above a certain threshold, *v<sub>i</sub>* ≥ *v<sub>max</sub>*, the neuron fires and its potential *v<sub>i</sub>* arrives at each of the *k<sub>out<sub>i</sub></sub>* connected neurons. In our simulations we use *v<sub>max</sub>* = 6. However, as in every SOC-like model, this parameter is not relevant and results are independent of this particular choice.

For real neurons the production of neurotransmitters at the presynaptic terminals, and then the charge entering the postsynaptic neuron, is controlled by the frequency of action potentials, which depends on the integrated stimulation received by the neuron. Here the integrated stimulation is given by *v<sub>i</sub>*, the membrane potential of the firing neuron. Therefore, we assume that the total charge *q<sub>i</sub>* that can enter into connected neurons is proportional to *v<sub>i</sub>* · *k<sub>out<sub>i</sub></sub>*. The change in the intracellular membrane potential of the postsynaptic neuron *j* is proportional to the relative synaptic strength *g<sub>ij</sub>*/Σ<sub>*l*</sub> *g<sub>il</sub>*,

$$v_{j}(t+1) = v_{j}(t) \pm \frac{v_{i} \cdot k_{out_{i}}}{k_{in_{j}}} \frac{g_{ij}}{\sum_{l=1}^{k_{out_{i}}} g_{il}}.\tag{1}$$

In Equation 1 it is assumed that the received charge is distributed over the surface *k<sub>in<sub>j</sub></sub>* of the soma of the post-synaptic neuron. The plus or minus sign is for excitatory or inhibitory synapses, respectively. After firing, the neuron is set in a refractory state lasting *t<sub>ref</sub>* = 1 time step, during which it is unable to receive or transmit any charge, and its membrane potential is set to *v<sub>rest</sub>* = 0.
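
A single firing event following Equation 1 might be sketched in Python as below. This is an illustrative reading of the rule, not the simulation code: the data layout (a list `targets` of index arrays and a matching list `g` of strength arrays), the function name, and the assumption that a neuron has no repeated targets are all hypothetical. The refractory period is not handled here.

```python
import numpy as np

def fire(i, v, targets, g, k_in, excitatory, v_rest=0.0):
    """Propagate the potential of firing neuron i to its targets (Equation 1).

    v        : float array of membrane potentials, updated in place
    targets  : targets[i] is an integer array of postsynaptic indices j (no repeats assumed)
    g        : g[i] is a float array of strengths g_ij aligned with targets[i]
    k_in     : array of in-degrees k_in_j
    """
    out = targets[i]
    k_out = len(out)
    weights = g[i] / g[i].sum()             # relative strength g_ij / sum_l g_il
    sign = 1.0 if excitatory[i] else -1.0   # plus for excitatory, minus for inhibitory
    dv = sign * v[i] * k_out / k_in[out] * weights
    v[out] += dv
    v[i] = v_rest                           # reset of the firing neuron
    return dv
```

For example, a neuron at *v* = 6 with two equally strong excitatory synapses onto neurons with in-degrees 2 and 4 delivers depolarizations of 3 and 1.5, respectively.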

#### *2.2.2. Avalanche activity*

When a neuron fires, it may bring to threshold some of the connected neurons, thus generating an avalanche, a cascade of activity which propagates through the network involving a variable number of neurons. During an avalanche there is no further external stimulation. As soon as no more neurons are able to fire, the avalanche ends and its size is recorded as the number of firing neurons *s*, or, alternatively, as the sum *s*<sub>Δ*V*</sub> of all positive potential variations (depolarizations) δ*v<sub>i</sub>*<sup>+</sup> that occurred in the network, namely *s*<sub>Δ*V*</sub> = Σ<sub>*i*</sub> δ*v<sub>i</sub>*<sup>+</sup>. By definition a single neuron firing does not constitute an avalanche. Avalanches are also characterized by their duration *T*, which is defined as the number of iterations taken by the activity propagation. The numerical time step for each iteration corresponds to the real time between the triggering of an action potential in the presynaptic neuron and the change of the membrane potential in the postsynaptic neuron; therefore it is of the order of 4–6 ms. After an avalanche ends, an external stimulus triggers further activity in the system. Distributions of sizes and durations are shown in Supplementary Figure 1.

#### *2.2.3. Synaptic plasticity*

We implement a Hebbian-like plasticity rule at the end of each avalanche. The strength *g<sub>ij</sub>* of the used connections is increased proportionally to the membrane potential variation |δ*v<sub>j</sub>*| of the postsynaptic neuron *j* induced by the presynaptic neuron *i* during the avalanche,

$$g_{ij} = g_{ij} + |\delta v_{j}| / v_{max},\tag{2}$$

whereas the strength of all inactive synapses is reduced by the average strength increase per bond

$$
\Delta g = \sum_{ij} \delta g_{ij} / N_B,\tag{3}
$$

where *N<sub>B</sub>* is the number of bonds. We set a minimum and a maximum value for the synaptic strength *g<sub>ij</sub>*, *g<sub>min</sub>* = 0.0001 and *g<sub>max</sub>* = 1.0. Whenever *g<sub>ij</sub>* < *g<sub>min</sub>*, the synapse is pruned. Since cortical plasticity such as long-term potentiation acts on time scales of seconds to minutes, which is much longer than the duration of avalanches, we apply the plasticity protocol for a certain number of stimulations and then study avalanche activity without further changing synaptic strengths. Specifically, since we do not want to alter the scale-free connectivity of the initial network, we apply plasticity rules until the first few synapses are pruned. After this plastic adaptation the *g<sub>ij</sub>* are distributed as shown in Supplementary Figure 2.
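
One possible reading of Equations 2 and 3 is sketched below in Python, with all synapses stored in a flat array. Several points are assumptions rather than statements from the text: *N<sub>B</sub>* is taken to count all bonds in the network, the maximum strength is enforced by clipping, and pruning is reported through a boolean mask instead of removing entries. The function and argument names are hypothetical.

```python
import numpy as np

def apply_plasticity(g, dv_abs, used, v_max=6.0, g_min=1e-4, g_max=1.0):
    """Update synaptic strengths after one avalanche.

    g      : strengths of all bonds (flat array)
    dv_abs : |delta v_j| induced through each bond during the avalanche
    used   : boolean mask of bonds that were active in the avalanche
    Returns the new strengths and a mask of bonds that survive pruning.
    """
    g = np.asarray(g, dtype=float).copy()
    used = np.asarray(used, dtype=bool)
    dg = np.zeros_like(g)
    dg[used] = np.asarray(dv_abs, dtype=float)[used] / v_max   # Equation 2 increment
    delta_g = dg.sum() / g.size       # Equation 3, assuming N_B counts all bonds
    g[used] += dg[used]               # strengthen used bonds
    g[~used] -= delta_g               # weaken inactive bonds
    g = np.minimum(g, g_max)          # cap at g_max (clipping is an assumption)
    alive = g >= g_min                # bonds below g_min are pruned
    return g, alive
```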

#### *2.2.4. Up-down state dynamics*

Alternation between the up and down-state was simulated on the basis of two concepts. First, the transition from one state to the other has a high degree of synchronization. Second, a down-state occurs when activity in the up-state reaches a level at which the up-state cannot sustain itself anymore. Such a decrease in activity can result from either the exhaustion of available synaptic vesicles (Staley et al., 1998) or the increase of factors inhibiting their release (Thompson et al., 1992). For simplicity, we assume that the transition happens after a sufficiently large avalanche, which causes a lack of available neurotransmitters and a sufficiently strong network inhibition.

Accordingly, at the end of each avalanche we measure its size in terms of the sum of depolarizations δ*v<sub>i</sub>*<sup>+</sup> of all neurons, *s*<sub>Δ*V*</sub>. As soon as an avalanche is larger than a threshold *s*<sub>Δ*V*</sub><sup>min</sup>, *s*<sub>Δ*V*</sub> > *s*<sub>Δ*V*</sub><sup>min</sup>, the system transitions into a down-state and neurons become hyperpolarized proportionally to their previous activity; namely, we reset

$$v_i = v_i - h \cdot \delta v_i. \tag{4}$$

This rule models the local inhibition experienced by a neuron, due to spike adaptation (Sanchez-Vives et al., 2000), adenosine accumulation (Thompson et al., 1992), synaptic vesicle depletion (Staley et al., 1998) or blockade of receptor channels by the presence of external magnesium (Maeda et al., 1995). The down-state ends whenever a new avalanche occurs, namely the system transitions into an up-state. When in the up-state, all neurons firing in the previous avalanche of size *s*<sub>Δ*V*</sub> are set to the depolarized value

$$v_i = v_{max}(1 - s_{\Delta V} / s_{\Delta V}^{min}). \tag{5}$$

This equation states that the neuron's intracellular membrane potential depends on the response of the whole network via *s*<sub>Δ*V*</sub> and implements a homeostatic mechanism at the single neuron level: When avalanche sizes *s*<sub>Δ*V*</sub> are close to the threshold *s*<sub>Δ*V*</sub><sup>min</sup>, the ratio *s*<sub>Δ*V*</sub>/*s*<sub>Δ*V*</sub><sup>min</sup> is close to 1 and membrane potentials are reset closer to a zero resting value, thus avoiding an explosive growth of the following avalanche. Conversely, when the previous avalanche is small, the network does sustain the depolarized state of the single neuron and the membrane potential stays closer to the firing threshold. We wish to stress that this mechanism is driven by the whole network activity, following the idea that the up-state in the cortex is a cooperative network state (Wilson, 2008). Furthermore, it is in agreement with measurements of the neuronal membrane potential, which remains significantly depolarized in the up-state (Wilson, 2008), and, at the same time, keeps activity balanced. Through Equation (5), the threshold *s*<sub>Δ*V*</sub><sup>min</sup> controls the level of excitability of the system.

At the network level, the high activity in up-states is sustained by a stimulation which has a random value in the interval [0, *du*), where *du* = *s*<sub>Δ*V*</sub><sup>min</sup>/*s*<sub>Δ*V*</sub>: After an avalanche, at each time step we randomly choose a neuron and increase its membrane potential by *rad* · *du*, where *rad* is a random number in the interval [0, 1). We notice that the amplitude of *du* depends on past network activity through the size of the previous avalanche *s*<sub>Δ*V*</sub>. As for Equation 5, the stimulation in the up-state is based on a homeostatic principle: The larger the previous avalanche, the smaller *du*, and vice versa.

Conversely, during the down-state, the system experiences a general disfacilitation mimicked by weak random stimulation: At each time step we randomly choose a neuron and increase its membrane potential by a small constant quantity (30–40 times smaller than *v<sub>max</sub>*). This drive reproduces the effect of the small depolarizations due to miniature potentials (minis) from spontaneous synaptic release observed in the down-state (Timofeev et al., 2001). The drive slowly brings the system back into an up-state not correlated to past activity (Lombardi et al., 2012).

During the avalanche propagation the drive is stopped, as in usual SOC models. This procedure implements the separation of time scales between fast avalanche propagation and slow neuron stimulation.

Equations 4 and 5 each depend on a single parameter, *h* and *s*<sub>Δ*V*</sub><sup>min</sup>, which introduce a memory effect at the level of single neuron activity and the entire system, respectively. In order to reproduce the experimentally observed behavior we only need to control the ratio *R* = *h*/*s*<sub>Δ*V*</sub><sup>min</sup>, as shown in Lombardi et al. (2012).
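
The state transition and the two drives described above can be summarized in a short Python sketch. It is an interpretation under stated assumptions, not the original simulation code: the function names are hypothetical, the down-state increment is set to *v<sub>max</sub>*/35 as one value inside the quoted 30–40 range, and the up-state drive amplitude is read as *du* = *s*<sub>Δ*V*</sub><sup>min</sup>/*s*<sub>Δ*V*</sub>.

```python
import numpy as np

def after_avalanche(v, dv, fired, s_dv, s_min, h, v_max=6.0):
    """Reset potentials at the end of an avalanche.

    v     : float array of membrane potentials, updated in place
    dv    : potential variation of each neuron during the avalanche
    fired : boolean mask of neurons that fired in the avalanche
    Returns True if the network enters a down-state.
    """
    if s_dv > s_min:
        v -= h * dv                               # Equation 4: hyperpolarization
        return True
    v[fired] = v_max * (1.0 - s_dv / s_min)       # Equation 5: depolarized reset
    return False

def drive_step(v, rng, in_down_state, s_dv, s_min, v_max=6.0, down_fraction=1.0 / 35.0):
    """Apply one step of external drive to a randomly chosen neuron."""
    j = rng.integers(v.size)
    if in_down_state:
        v[j] += down_fraction * v_max             # small constant increment (assumed value)
    else:
        du = s_min / s_dv                         # amplitude shrinks after large avalanches
        v[j] += rng.random() * du
    return j
```

A typical loop would call `drive_step` until some neuron reaches threshold, run the avalanche, and then call `after_avalanche` to decide the next state.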

## **3. RESULTS**

#### **3.1. WAITING TIME AND QUIET TIME DISTRIBUTIONS**

The definition of quiet time and waiting time is sketched in **Figure 1A** and can be summarized in the following equality:

$$
\delta t_{j} = \Delta t_{j} + T_{j}, \tag{6}
$$

that is, the *j*th waiting time is obtained by summing the *j*th quiet time and the duration *T<sub>j</sub>* of the *j*th avalanche. It follows that δ*t* ≃ Δ*t* if the relation *T* ≪ Δ*t* holds. For neuronal avalanches, durations *T* range from a few to a few tens of milliseconds (**Figure 1B**) and are therefore comparable with the shortest Δ*t*s (Lombardi et al., 2012). As a consequence, we expect quiet time and waiting time distributions to differ at short time scales. In **Figure 1** we show the distribution *P*(Δ*t*) of quiet times between successive avalanches in six different cortex slice cultures (Lombardi et al., 2012) and compare them to the corresponding distributions *P*(δ*t*) of waiting times. The quiet time distribution has been extensively discussed in Lombardi et al. (2012), where it was called waiting time distribution. Here we briefly recall its main features, namely the power law behavior at short time scales, from a few to 200–300 ms, and a local maximum situated at longer time scales, which leads to a peculiar non-monotonic behavior. The initial power law decay indicates that avalanches are temporally correlated if sufficiently close in time, which requires a sustained synaptic and firing activity in the network, namely an up-state. Conversely, longer quiet times correspond to down-states and sparse synaptic activity in the network.

**FIGURE 1 | Distributions of duration** *T***, quiet times** Δ*t* **and waiting times** δ*t* **for six cortex slice cultures. (A)** Definition of avalanche, quiet time and waiting time. nLFPs in the same time bin or consecutive bins define an avalanche. Avalanche duration *T* is given by the number *n* of consecutive non-empty bins times the bin width. A quiet time Δ*t* is the time interval between the end of an avalanche and the start of the following one. A waiting time δ*t* is the time interval between the start of an avalanche and the start of the following one. The following equality holds: δ*t* = Δ*t* + *T*. **(B)** Duration distributions. For better comparison duration *T* is expressed in multiples of the bin width. The initial power law regime extends for about one order of magnitude and is followed by an exponential cutoff. **(C)** Distribution of quiet times: All curves show an initial power law regime with an exponent μ ranging between −2.0 and −2.5. For larger Δ*t*, distributions are characterized by a local minimum followed by a more or less pronounced maximum at Δ*t* ≃ 1–2 s. Upper inset: Distributions of waiting times. Lower inset: illustrative comparison between quiet (cyan) and waiting (blue) time distribution for the blue curve in the main panel. The two distributions only differ at short time scales where durations are comparable to quiet times.

This non-monotonic behavior, with the same general features, can still be observed in the waiting time distributions (**Figure 1C**, upper inset). However, the power law exponent is generally slightly lower than the one measured for *P*(Δ*t*), as shown in the lower inset of **Figure 1C**. On the other hand, for time intervals larger than 200–300 ms, which are related to down-states, the two distributions basically coincide (**Figure 1C**, lower inset), meaning that, for this range of values, *T* ≪ Δ*t* and δ*t* ≃ Δ*t* is a good approximation. From Equation 6 it follows that the waiting time distribution *P*(δ*t*) results from the combination of two quantities, quiet times and durations. While for long time scales *P*(δ*t*) is dominated by the former, at short time scales both of them contribute to its functional behavior. In this range of values both Δ*t* and *T* are power law distributed and add up to give again a power law: Short durations significantly couple with short quiet times and, due to the lack of characteristic values, the net result is a power law with a larger slope. Is this power law carrying the same information as the statistics of time intervals without activity, i.e., quiet times? Evidently it does not, for the following reason: Durations, which are power law distributed, are not negligible, and concluding that avalanches are temporally correlated from the power laws in the waiting time distribution would be misleading (Sanchez et al., 2002). Indeed, in Lombardi et al. (2012) only quiet time statistics have been considered. Nevertheless, some specific information can be extracted from waiting time distributions, as we will discuss in the following.

#### **3.2. TEMPORAL FEATURES OF UP AND DOWN-STATE**

In Lombardi et al. (2012) we have used numerical simulations to investigate the origin of the non-monotonic behavior in the quiet time distribution and concluded that it arises from the slow alternation of up and down-state. Accordingly, in this section we systematically isolate each contribution to the overall quiet time distributions (see Materials and Methods) and further investigate the temporal features of these two network states.

In **Figure 2A** we show the experimental distributions of quiet times between consecutive avalanches in the up-states (see Materials and Methods): After rescaling Δ*t* by the mean rate *r<sub>up</sub>* in the up-state, distributions collapse onto a unique power law with exponent μ ≃ −2.2. This implies that the avalanche process in the up-state is solely controlled by the average occurrence rate and the corresponding quiet time distribution has a universal, i.e., sample independent, power law decay (**Figure 2A**). On the other hand, down-states produce long quiet times mostly contributing to the tail of the overall *P*(Δ*t*), exhibiting a distribution with a characteristic value τ<sub>*d*</sub>, as found numerically (Lombardi et al., 2012). This behavior has a simple interpretation: The recurrence of up-states has a more or less pronounced characteristic time. If the distribution of quiet times in the down-state is peaked around a particular value τ<sub>*d*</sub> and is sufficiently narrow, then a non-monotonic behavior can be observed in the quiet time distribution of the entire avalanche activity. Although distributions of quiet times in the down-states do exhibit common features across samples, they do not generally collapse onto a unique function after rescaling δ*t* by *r<sub>dw</sub>*, the mean rate in the down-state (**Figure 2B**).

To complete the investigation of up and down-state temporal features, we consider the distributions *P*(*T<sub>up</sub>*) and *P*(*T<sub>dw</sub>*) of up and down-state durations, respectively (**Figure 3**). Numerical curves are overplotted with experimental results. We notice here that, both numerically and experimentally, the two states are characterized by time scales that differ by about one order of magnitude. Moreover, their respective duration distributions exhibit a distinct functional behavior. On average, the durations of down-states are distributed around *T<sub>dw</sub>* ≃ 2000 ms and the tail of the distribution is well fitted by an exponential (Millman et al., 2010). This property characterizes most of the analyzed samples (Supplementary Figure 3). Conversely, the distribution *P*(*T<sub>up</sub>*) exhibits a tail compatible with a power law. However, in this case, the power law behavior arises by averaging over many cultures and does not necessarily characterize the up-state duration in each culture (Supplementary Figure 4).

#### **3.3. TEMPORAL STRUCTURE OF AVALANCHE PROCESS**

We have shown that quiet time distributions of distinct experimental samples show a qualitatively similar behavior. In particular, at short time scales, the distributions of quiet times are all characterized by the same power law (**Figure 2**), a general and robust feature of up-states. Here we go further in the characterization of the avalanche process and ask how the functional behavior of the distribution *P*(Δ*t*) depends on the fine structure of avalanche sequences. To do so, we study the distributions *P*(Δ*t*; *s<sub>c</sub>*) of quiet times between consecutive avalanches of size larger than a given threshold *s<sub>c</sub>*. In this way we remove smaller avalanches from the time series and analyze how the distribution changes as a function of *s<sub>c</sub>*. If the different distributions *P*(Δ*t*; *s<sub>c</sub>*) collapse onto a unique function, then the temporal properties of the avalanche process are invariant under the aforesaid removal procedure. This specific point will be addressed in the next section.
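
The removal procedure can be sketched as follows. This is a minimal Python illustration, not the authors' analysis code: the function names are hypothetical, quiet times are recomputed between retained avalanches only (so they include the durations of the removed ones), and the logarithmic binning uses an assumed resolution of ten bins per decade. The binning routine assumes the data span more than a single value.

```python
import numpy as np

def quiet_times_above(starts, ends, sizes, s_c):
    """Quiet times between consecutive avalanches whose size exceeds s_c."""
    keep = np.asarray(sizes) > s_c
    s = np.asarray(starts, dtype=float)[keep]
    e = np.asarray(ends, dtype=float)[keep]
    return s[1:] - e[:-1]

def log_binned_pdf(x, bins_per_decade=10):
    """Probability density of positive values on logarithmically spaced bins."""
    x = np.asarray(x, dtype=float)
    x = x[x > 0]
    lo, hi = np.log10(x.min()), np.log10(x.max())
    n = max(int(np.ceil((hi - lo) * bins_per_decade)), 1)
    edges = np.logspace(lo, hi, n + 1)
    density, _ = np.histogram(x, bins=edges, density=True)
    centers = np.sqrt(edges[:-1] * edges[1:])    # geometric bin centers
    return centers, density, edges
```

Calling `quiet_times_above` for increasing `s_c` and passing each result to `log_binned_pdf` gives one curve per threshold.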

In **Figure 4** we show the distribution *P*(Δ*t*; *s<sub>c</sub>*) for different values of *s<sub>c</sub>*. By removing avalanches we make the time series sparser and, as a result, we would expect the distributions *P*(Δ*t*; *s<sub>c</sub>*) to become broader and broader as we increase *s<sub>c</sub>*. Indeed this effect is observed, but it is minor for a wide range of *s<sub>c</sub>* values, which suggests that large quiet times tend to separate large avalanches. On the contrary, as a main effect, we observe that the distributions *P*(Δ*t*; *s<sub>c</sub>*) show peaks that were not present in the original *P*(Δ*t*). These peaks become pronounced for values of *s<sub>c</sub>* larger than 40 µV and are located either at time scales within the power law regime or at very long quiet times. The first peak appears at Δ*t* ≃ 40–60 ms (Δ*t*<sub>β</sub> in the following) and can be related to the period of β oscillations (**Figure 4C**). The second one arises at Δ*t* ∈ [80, 250] ms (Δ*t*<sub>θ</sub> in the following) and corresponds to the period of θ oscillations. This peak is visible in all samples. In particular it is very pronounced in **Figures 4B–D,F**. Quiet times around Δ*t*<sub>θ</sub> seem to play a special role with respect to our removal process: While the probability increases with *s<sub>c</sub>* for Δ*t* longer than Δ*t*<sub>θ</sub> and decreases for the shorter ones, it stays nearly constant in a neighborhood of Δ*t*<sub>θ</sub> (**Figures 4B–D,F**). This means that the ratio *N*(Δ*t*<sub>θ</sub>; *s<sub>c</sub>*)/*N*(*s<sub>c</sub>*) ≃ *const*, namely the number of quiet times corresponding to the θ period scales with the total number *N*(*s<sub>c</sub>*) of quiet times. Since the number of avalanches larger than *s<sub>c</sub>* is simply given by the number of quiet times plus one, the number *N*<sub>θ</sub>(*s<sub>c</sub>*) of avalanches related to θ oscillations scales with the total number *N*(*s<sub>c</sub>*) of avalanches, namely it decreases proportionally to *N*(*s<sub>c</sub>*) for increasing values of *s<sub>c</sub>*. On the other hand, the number of avalanches separated by longer and shorter quiet times decreases more slowly and faster than *N*(*s<sub>c</sub>*), respectively. This point can be understood as follows. If, for a given Δ*t*, the probability *P*(Δ*t*; *s<sub>c</sub>*) increases (decreases) with *s<sub>c</sub>* (**Figures 4B–D,F**), then the numerator of the ratio *N*(Δ*t*; *s<sub>c</sub>*)/*N*(*s<sub>c</sub>*) decreases more slowly (faster) than the denominator and so does the corresponding number of avalanches. Alternatively, one can look at the quantity *N*(Δ*t*; *s<sub>c</sub>*), which we show in Supplementary Figure 5, and notice that it decreases faster for small than for large Δ*t*s. Therefore, long quiet times tend to occur between large avalanches whereas shorter quiet times tend to separate the smaller ones. From **Figure 4** we notice that, whenever the peak around Δ*t*<sub>θ</sub> is not pronounced (**Figures 4A,E**), the Δ*t* characteristic of slow 1 Hz oscillations between up and down-states plays the role of fixed point. Finally, a further peak appears at Δ*t* ≃ 400–500 ms, which corresponds to a ≈ 2 Hz oscillation (**Figure 4B**). This peak behaves as the one at Δ*t*<sub>θ</sub>, namely it behaves as a fixed point for our removal procedure.

pronounced in **(B,C,D,F)**, less in **(A,E)**. The distributions in **(B,C)** exhibit one more peak around 500 ms, related to 2 Hz oscillations. It is worth noticing that the probability *P*(Δ*t*) for Δ*t* corresponding to the θ **(B,C,D,F)** and 1–2 Hz oscillations **(A,B)** is nearly a fixed point for this transformation. Insets: Experimental distributions of waiting times δ*t*, for different values of *s<sub>c</sub>*. In this case one more peak appears around 20–30 ms, which corresponds to γ oscillations.

Since avalanche durations and periods of fast oscillations are of the same order of magnitude, in order to capture their relation with avalanche sizes we have considered the distributions *P*(δ*t*; *s<sub>c</sub>*) of waiting times between consecutive avalanches of size larger than *s<sub>c</sub>*, which are shown in the insets of **Figure 4**. The picture emerging from the analysis of the quantity δ*t* is basically the same as the one we have drawn by looking at the quiet times, except for a peak corresponding to the faster γ oscillations, which can now be clearly observed in the insets of **Figures 4A,B,D–F**. The probability associated with this peak, which is situated at very short δ*t*, decreases with *s<sub>c</sub>* whenever it coexists with very pronounced θ peaks (**Figures 4B–D,F**), indicating that, at least in this particular case, faster oscillations tend to be associated with smaller avalanches.

To summarize, our removal procedure uncovers a rich temporal structure hidden behind the scale free behavior in the quiet time distribution: Besides the characteristic time associated with down-state duration, avalanche occurrence keeps the temporal features of θ and β/γ oscillations. They jointly emerge in most of the analyzed experimental samples (**Figures 4B–D,F**). While short quiet times and fast β/γ oscillations tend to be associated with smaller avalanches, slower oscillations are in general related to larger avalanches, but without any characteristic size. Indeed, varying the threshold *s<sub>c</sub>* in a range of values within the power law regime of the size distribution *P*(*s*), typically between 30 µV and 400 µV (**Figure 5B**), the probability *P*(Δ*t*; *s<sub>c</sub>*) of Δ*t* associated with θ (**Figures 4B–D,F**) or slower oscillations (**Figures 4A,E**) remains nearly unchanged. In particular, the θ peak coexists with a faster decrease of the probability of the γ period, thus suggesting a sort of hierarchical structure for avalanche sequences, which follows closely the temporal organization of nested θ − β/γ oscillations: Within up-states, large avalanches occur with θ frequency and trigger smaller ones in a faster γ cycle (**Figure 5A**). Remarkably, within γ cycles the quiet times have no characteristic value. Indeed, the quiet time distributions do not show peaks at very short time scales. Then, quiet times and durations, which are both power law distributed, show a peculiar coupling in the γ cycles. δ*t*s corresponding to these oscillations are short, which implies that both *T* and Δ*t* are short. Considering the scaling relation between duration *T* and *s* (Friedman et al., 2012), this is equivalent to saying that small avalanches are associated with short quiet times.

Here bar widths indicate durations: Avalanche start is at the right side of the bar. Bar heights indicate sizes. Spacing between blue bars corresponds to a θ period. Spacing between the starting points of green bars corresponds to a γ period. γ cycles do not show characteristic quiet times. Sizes *s* of avalanches related to θ cycles tend to fall within the blue region of the size distribution *P*(*s*) plotted in **(B)**, whereas the ones corresponding to nested γ oscillations fall within the green region. Therefore, the relationship between avalanches and oscillations does not imply characteristic sizes. In particular, for *s<sub>c</sub>* ≥ 80 µV, the number of avalanches *N*<sub>θ</sub> related to θ cycles scales with *s<sub>c</sub>* as the total number *N* of avalanches, namely *N*<sub>θ</sub>/*N* ≃ *const*. **(B)** Distributions of avalanche sizes for the experimental samples in **Figure 1**. Same color is used here for each sample.

**Figure 4** indicates that quiet times and avalanche sizes are correlated. The analysis of the scatter plots between Δ*t* and the sizes of the preceding and following avalanches also provides some evidence that correlations exist (Supplementary Figures 6, 7). In order to further validate this result, we reshuffle avalanche sizes while keeping the sequence of starting and ending times fixed. More precisely, we reassign to each avalanche a size taken at random from the measured size distribution. Then, we apply the same procedure described above. If no correlations existed between sizes and waiting times, then we should still observe the same peaks in the distributions *P*(Δ*t*; *s<sub>c</sub>*). As shown in **Figure 6**, in this case no peaks emerge in the power law regime, which implies that, in the up-state, waiting times are strongly correlated with sizes. In particular, periods of θ, β, and γ oscillations are correlated with sizes of corresponding avalanches. On the other hand, for longer waiting times we observe the same qualitative behavior discussed for the original time series. Therefore, we can state that at these longer time scales correlations with avalanche sizes are weak, but a more quantitative analysis is needed to exclude that they are significant.
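
The reshuffling test can be sketched in a few lines of Python. The sketch draws each surrogate size with replacement from the measured sizes, which is one reading of "taken at random from the measured size distribution"; a random permutation of the sizes would be another. The function name is hypothetical, and the quiet times are recomputed between retained avalanches as in the removal procedure.

```python
import numpy as np

def surrogate_quiet_times(starts, ends, sizes, s_c, rng):
    """Quiet times above threshold s_c after reassigning sizes at random.

    Start and end times stay fixed; only the sizes are redrawn, so any
    dependence between size and timing is destroyed in the surrogate.
    """
    sizes = np.asarray(sizes)
    fake = rng.choice(sizes, size=sizes.size, replace=True)   # resampled sizes
    keep = fake > s_c
    s = np.asarray(starts, dtype=float)[keep]
    e = np.asarray(ends, dtype=float)[keep]
    return s[1:] - e[:-1]
```

Comparing the distribution of these surrogate quiet times with that of the original sequence, for the same `s_c`, shows which peaks depend on the size-time relation.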

#### *3.3.1. Up and down-state*

From the analysis performed above it is evident that the functional behavior of the quiet time distribution arises from the superposition of many dynamic mechanisms. In Section 3.2 we have argued that non-monotonicity results from the alternation between up and down-state, which already implies two different mechanisms governing avalanche activity at short and long time scales. Then we have shown that the characteristic times of θ, β, and γ oscillations also enter the process. As a consequence, we do not expect the distribution *P*(Δ*t*; *s<sub>c</sub>*) to be controlled by a single parameter, as observed in other stochastic time series (Corral, 2004). Indeed, rescaling the quiet times by the mean avalanche rate *r* = 1/⟨Δ*t*⟩ does not lead to a collapse of the curves onto a single one (not shown).

However, one can apply the same removal procedure separately to up and down-states and then rescale quiet times by the respective average occurrence rate, *r<sub>up</sub>* and *r<sub>dw</sub>*, in order to find universal features for each of the two network states. We start by considering the distributions *P*(Δ*t*; *s<sub>c</sub>*) in the down-state and we rescale them by *r<sub>dw</sub>* = 1/⟨Δ*t<sub>dw</sub>*⟩. As shown in **Figure 7**, distributions collapse onto a unique function, which shows a characteristic value and an exponential tail. This functional behavior is common to all samples except the one in **Figure 7E**, whose departure from an exponential could be interpreted as an effect of the very sharp peak at Δ*t* ≃ 1 s and not as a result of a different dynamics in the down-state. The existence of a universal function implies that the quiet time distribution in the down-states is uniquely controlled by *r<sub>dw</sub>*. On the other hand, following the same procedure for the up-state does not provide a good data collapse (not shown). Peaks that emerge at short Δ*t* after the removal of smaller avalanches tell us that avalanche occurrence in the up-state is not solely controlled by one time constant, that is 1/*r<sub>up</sub>*. Nevertheless, here we show that the distributions of quiet times shorter than Δ*t*<sub>θ</sub> are solely controlled by *r*<sup>θ</sup><sub>*up*</sub> = 1/⟨Δ*t*⟩<sub>Δ*t*<Δ*t*<sub>θ</sub></sub>, where ⟨·⟩<sub>Δ*t*<Δ*t*<sub>θ</sub></sub> indicates the average over Δ*t* < Δ*t*<sub>θ</sub>. Indeed, rescaling them by *r*<sup>θ</sup><sub>*up*</sub> leads to a data collapse onto a unique function which follows a power law with an exponent μ ≃ −2 (**Figure 8**). This collapse is particularly good in samples that show a clear power law behavior for quiet times shorter than the Δ*t* corresponding to the θ and 1 Hz oscillation period (**Figures 8B–E,F**). Conversely, curves do not collapse whenever a further, shorter characteristic time is present (**Figure 8C**).
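
The rescaling used to test for a data collapse amounts to multiplying the quiet times by the mean rate of the selected subset. A minimal Python sketch follows; the function name and the optional `upper` cutoff, which stands in for the restriction to quiet times below the θ period, are assumptions made for illustration.

```python
import numpy as np

def rescale_by_rate(quiet_ms, upper=None):
    """Rescale quiet times by their mean rate r = 1 / <quiet time>.

    If `upper` is given, only quiet times shorter than `upper` are kept
    before the rate is computed. Returns the dimensionless values and r.
    """
    q = np.asarray(quiet_ms, dtype=float)
    if upper is not None:
        q = q[q < upper]
    rate = 1.0 / q.mean()
    return q * rate, rate
```

By construction the rescaled values have unit mean. Curves obtained for different thresholds collapse if their densities, divided by the rate and plotted against the rescaled quiet time, fall on one function.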

We obtain a similar result for numerical distributions. However, in this case removing avalanches according to their size does not lead to many peaks at short quiet times, which implies that there are only two characteristic time scales for numerical avalanches. In this case, we just need to consider up and down-state separately and rescale the quiet times by the respective average occurrence rate, *r<sub>up</sub>* and *r<sub>dw</sub>*. As shown in **Figures 7E,F**, **8E,F**, we obtain a good data collapse in both cases.

## **4. DISCUSSION**

The distribution of quiet times between consecutive avalanches in cortex slice cultures displays a power law decay at short time scales, namely from a few to 200–300 ms, and is generally characterized by a local maximum at longer quiet times, which leads to a non-monotonic behavior. Numerical simulations show that this non-monotonic distribution results from the slow alternation between up and down-states (Lombardi et al., 2012). The model suggests that in the up-state, where neurons mutually sustain their spiking activity, network mechanisms act as a form of short-term memory, which produces clusters of correlated avalanches and thus gives rise to the initial power law regime in the quiet time distribution. On the other hand, the synaptic activity during the down-state can be modeled as a random process that slowly brings the system back into the up-state, with no memory of past activity. Indeed, numerical distributions exhibit an exponential tail similar to the ones observed experimentally (Lombardi et al., 2012).

Accordingly, here we have defined as up-state (down-state) a consecutive series of avalanches separated by *t* shorter (longer) than the longest *t* falling within the power law regime of *P*(*t*) and systematically evaluated the quiet time distribution for up and down-states. We have shown that, while a power law with exponent μ ≃ −2 is a property of up-states in all analyzed samples, the recurrence of up-states has a characteristic time τ*<sup>d</sup>* which is sample dependent (≃ 1 s on average). Indeed, the durations of down-states, which are simply quiet times between successive up-states, are distributed around a certain value 1 s < *T* < 2 s, the tail of the distribution being well fitted by an exponential. Since the exponential behavior is characteristic of Poisson processes, we conclude that consecutive up-states are basically not correlated. Moreover, from the properties of Poisson processes it follows that, given the sequence of quiet times *t* between successive up-states, the jumps, i.e., the differences between two consecutive *t*s, are also exponentially distributed. The distribution of jumps is commonly used to characterize stochastic processes. It has been analyzed for burst sequences in spontaneous activity of dissociated cultures of cortical neurons (Segev et al., 2002) and has been approximated with a symmetric Levy distribution. While a Levy distribution is indicative of self-similarity in the process, spectral analysis was consistent with long range temporal correlation. Besides differences from the cultures considered here, discrepancies can also be due to the definition of burst adopted in Segev et al. (2002), which substantially differs from our definition of up-states.
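The statement about jumps can be made explicit with a short derivation, given here as a sketch under the assumption that the quiet times between successive up-states are independent and exponentially distributed with rate λ (a symbol introduced only for this illustration). The jump δ = *t*<sub>2</sub> − *t*<sub>1</sub> between two consecutive quiet times then has the density

$$p(\delta) = \int_{0}^{\infty} \lambda e^{-\lambda t}\, \lambda e^{-\lambda\left(t+|\delta|\right)}\, dt = \frac{\lambda}{2}\, e^{-\lambda|\delta|}$$

that is, a symmetric two-sided exponential (Laplace) distribution whose tails decay at the same rate λ as the quiet times themselves.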

**FIGURE 7 | Distributions of quiet times** *P***(***t***;** *sc* **) in the down-state for the experimental data samples of Figure 1C and the numerical samples reproducing blue squares and red diamonds curve of Figure 1C.** Distributions are rescaled by the mean rate *rdw* in the down-state. In five of the analyzed samples the tail of the distribution is well fitted by an exponential (black dashed line in **A–D,F**). Numerical data are shown in **(E,F)** together with the corresponding experimental curves of **Figure 1C** and shifted by 1 order of magnitude to the left, for clarity. Numerical distributions are averaged over 100 configurations of a network of *N* = 64000 neurons with *pin* = 0.1.

**FIGURE 8 | Distributions** *P***(***t < t<sup>θ</sup>* **;** *sc* **) of quiet times shorter than** *t*<sup>θ</sup> **in the up-state for the experimental data samples of Figure 1C and the numerical samples reproducing blue squares and red diamonds curve of Figure 1C.** *t*<sup>θ</sup> is sample dependent and its value varies in the interval [80, 250] ms. Distributions are rescaled by the mean rate *r*<sup>θ</sup><sub>*up*</sub>. Numerical data are shown in **(E,F)** together with the corresponding experimental curves of **Figure 1C** and shifted by 1 order of magnitude to the left, for clarity. Numerical distributions are averaged over 100 configurations of a network of *N* = 64000 neurons with *pin* = 0.1 and are rescaled by *rup*. The dashed line is a power law with exponent −2.2.

We have shown that, besides the characteristic recurrence time τ*<sup>d</sup>* between consecutive up-states, the analysis of quiet time distributions is able to capture the presence of θ, β, and γ oscillations in avalanche occurrence. The connection between nested oscillations and neuronal avalanches has been pointed out in Gireesh and Plenz (2008). Investigation of spontaneous neuronal activity in the rat cortex layer 2/3 has revealed that, during the second postnatal week, bursts develop a temporal organization of higher frequency oscillations, β and γ, nested into lower frequency θ oscillations, while the spatio-temporal organization of LFPs is characterized by the scaling behavior of neuronal avalanches. Here we have further elucidated the relation between avalanche sizes and the temporal structure of the avalanche process. When avalanches of all sizes are considered, the distribution of quiet times in the up-state is scale free. On the contrary, disregarding avalanches smaller than 80 µV, peaks corresponding to oscillations in θ, β, and γ frequency bands are clearly visible. Smaller avalanches (60–160 µV) tend to be associated with shorter quiet times and faster β/γ oscillations, larger ones with longer quiet times and slower θ or 1–2 Hz oscillations. Of considerable interest is the behavior of the θ and 1 Hz peaks under the removal procedure, which are nearly independent of the threshold *sc* on avalanche sizes: no matter how many avalanches are removed, the probability for quiet times around the period of the θ or 1 Hz oscillation does not change for a large range of *sc* values. Equivalently, avalanches corresponding to these frequency bands are a constant fraction of the total number, which implies that they have no characteristic size. This suggests a special role in the temporal organization of spontaneous activity.
In particular, we have noticed that large avalanches occurring with θ frequency trigger cascades of smaller avalanches corresponding to the higher frequency oscillations, in a sort of hierarchy which is reminiscent of the temporal organization of nested θ − β/γ oscillations (Gireesh and Plenz, 2008; He et al., 2010).

These results indicate that correlations between quiet times and avalanche sizes could be relevant and deserve further investigation. This point is intimately related to the existence of a universal scaling function for the distributions *P*(*t*;*sc*). A stochastic process for which such a universal function exists is a fixed point of the transformation which has been illustrated and performed in Section 3 (Corral, 2007). It can be shown that the only process without correlations which is invariant under this transformation is the Poisson process (Daley and Vere-Jones, 1988). More precisely, if sizes are independent of any other variable, the removal of events is equivalent to a so-called random thinning and, under certain conditions, the resulting process converges to a Poisson process. Here we have demonstrated that the distributions *P*(*t*;*sc*) do not collapse onto a unique function when *t* is rescaled by the average occurrence rate *r*. This is because of the multiple time scales in avalanche dynamics, which result from different mechanisms governing avalanche triggering during up and down-states. Indeed, the distributions *P*(*t*;*sc*) for the down-state are simply controlled by the respective average rate: when *t* is rescaled by *rdw*, the distributions *P*(*t*;*sc*) for the down-state collapse onto the same curve with an approximately exponential tail, which therefore implies that sizes of avalanches separated by large quiet times are either independent or weakly correlated, as are sizes and quiet times. On the other hand, in the up-state we observe that the peak associated with the period of θ oscillations and those corresponding to the β/γ oscillations scale differently with *sc* and therefore cannot be controlled by the same time scale, *rup*. In other words, oscillations introduce additional characteristic times in the up-state. However, we have shown that the power law for short quiet times is universal and controlled by ⟨*t*⟩<sub>*t* < *t*<sup>θ</sup></sub>.
A similar analysis has been recently performed for spike avalanches in freely behaving (FB) and anesthetized rats (AR) (Ribeiro et al., 2010), where the quiet time distributions show consistently a monotonically decreasing behavior. Universal scaling features are observed for FB rats when quiet times are rescaled by the average occurrence rate, whereas curves for AR do not collapse onto a unique function. Our analysis suggests that the different behavior between anesthetized and freely behaving rats could be due to different dynamic mechanisms characterizing spontaneous activity in AR.
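The random-thinning argument discussed above can be illustrated numerically. The sketch below is a toy example under stated assumptions and is not part of the original analysis: it builds a clearly non-Poissonian renewal process (gamma-distributed intervals, chosen arbitrarily), keeps each event independently with a small probability, and uses the coefficient of variation of the resulting intervals, which equals 1 for a Poisson process, as a rough indicator. The helper names are hypothetical.

```python
import numpy as np

def coefficient_of_variation(intervals):
    """Standard deviation over mean; equals 1 for exponential intervals."""
    return np.std(intervals) / np.mean(intervals)

def thin(event_times, keep_prob, rng):
    """Keep each event independently with probability keep_prob."""
    keep = rng.random(event_times.size) < keep_prob
    return event_times[keep]

rng = np.random.default_rng(1)
# Gamma intervals with shape 4 have a coefficient of variation of 0.5.
intervals = rng.gamma(shape=4.0, scale=1.0, size=200_000)
event_times = np.cumsum(intervals)

thinned_intervals = np.diff(thin(event_times, keep_prob=0.05, rng=rng))
cv_before = coefficient_of_variation(intervals)
cv_after = coefficient_of_variation(thinned_intervals)
```

Here the decision to keep an event is independent of its timing, which is the situation in which convergence toward a Poisson process is expected. When sizes and quiet times are correlated, as the up-state data suggest, removal by size is no longer a random thinning and this convergence need not hold.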

Waiting time distribution and its universal features have been widely investigated for earthquakes (Corral, 2004; de Arcangelis et al., 2006a). In this case the distribution is not exponential, but monotonic and solely controlled by *r*, except for corrections at short waiting times (Bottiglieri et al., 2010). On the other hand, many similarities between neuronal avalanches and earthquakes can be recognized, which have suggested a common interpretation in terms of self organized criticality (SOC). SOC was originally proposed as an explanation for long range correlations emerging in processes far from equilibrium (Tang et al., 1988) and has rapidly become a useful interpretative scheme for many stochastic natural phenomena that exhibit scale free statistics. As for neuronal avalanches and earthquakes, in many cases, e.g., solar flares (Boffetta et al., 1999), waiting time distributions are not exponential. Conversely, in the original sand pile model introduced by Bak, Tang and Wiesenfeld (BTW) to exemplify the SOC idea, waiting times are exponentially distributed (Boffetta et al., 1999) and this fact was used to question SOC as an interpretation for solar flares (Boffetta et al., 1999) and earthquakes (Yang et al., 2004). However, Paczuski et al. (2005) have argued that an experimental sequence of bursts can arise from a single avalanche observed at a finite detection threshold, which would give rise to a power law in the waiting time distribution of the BTW model. In addition, several different models have been proposed in order to show that SOC-like dynamics can provide temporal correlations among avalanches (Rios and Zhang, 1999; Baiesi and Maes, 2006) and a non-exponential distribution of waiting times (Sanchez et al., 2002; Lippiello et al., 2005; Baiesi and Maes, 2006).
In particular, it has been shown that in the so-called running sand pile (Sanchez et al., 2002), waiting times between avalanches with size above a large enough threshold are power law rather than exponentially distributed. Non-exponential waiting time distributions also arise if avalanches are triggered on the basis of the entire history of local stimulations (Lippiello et al., 2005). Here we have shown that our model, inspired by SOC, is able to capture the peculiar, non-exponential and non-monotonic behavior of the waiting time distribution for neuronal avalanches recorded in cortex slice cultures (Lombardi et al., 2012). Moreover, numerically generated up and down-states exhibit the same universal features found experimentally. This point is particularly important because it indicates that the lack of universality in the waiting time distribution for spike avalanches in anesthetized rats (Ribeiro et al., 2010) could be due to the coexistence of different dynamic mechanisms, each one controlling ongoing activity at different temporal scales. Indeed, in freely behaving rats, where no down-states are observed, the waiting time distribution is controlled by the average occurrence rate (Ribeiro et al., 2010), which, for our model, is equivalent to *rup*. From our simulations it emerges that the crucial features of this temporal evolution are (1) the different single neuron behavior in the two phases, namely the ability to oscillate between a very depolarized and hyperpolarized state, (2) the homeostatic mechanism driving activity in the up-state and (3) the network disfacilitation following up-states. The good agreement with experimental data indicates that the transition from an up-state to a down-state has a high degree of synchronization, whereas the onset of up-states is usually more gradual.
According to our numerical results, the alternation between up and down-states is the expression of a homeostatic regulation which, during a burst, is activated to control the excitability of the system and avoid pathological behavior.

## **ACKNOWLEDGMENT**

We thank the SNF for funding within project 205321-138074.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnsys.2014.00204/abstract

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 June 2014; accepted: 01 October 2014; published online: 28 October 2014. Citation: Lombardi F, Herrmann HJ, Plenz D and De Arcangelis L (2014) On the temporal organization of neuronal avalanches. Front. Syst. Neurosci. 8:204. doi: 10.3389/fnsys.2014.00204*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Lombardi, Herrmann, Plenz and De Arcangelis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Markers of criticality in phase synchronization

## *Maria Botcharova1,2, Simon F. Farmer 2,3 and Luc Berthouze4,5\**

*<sup>1</sup> CoMPLEX, Centre for Mathematics and Physics in the Life Sciences and Experimental Biology, University College London, London, UK*

*<sup>2</sup> Institute of Neurology, University College London, London, UK*

*<sup>3</sup> The National Hospital for Neurology and Neurosurgery, London, UK*

*<sup>4</sup> Centre for Computational Neuroscience and Robotics, University of Sussex, Falmer, UK*

*<sup>5</sup> Institute of Child Health, University College London, London, UK*

#### *Edited by:*

*Valentina Pasquale, Fondazione Istituto Italiano di Tecnologia, Italy*

#### *Reviewed by:*

*Joana R. B. Cabral, Universitat Pompeu Fabra, Spain Tiago Pereira, Imperial College London, UK*

#### *\*Correspondence:*

*Luc Berthouze, Centre for Computational Neuroscience and Robotics, University of Sussex, Falmer BN1 9QH, UK e-mail: L.Berthouze@sussex.ac.uk*

The concept of the brain as a critical dynamical system is very attractive because systems close to criticality are thought to maximize their dynamic range of information processing and communication. To date, there have been two key experimental observations in support of this hypothesis: (i) neuronal avalanches with power law distribution of size and (ii) long-range temporal correlations (LRTCs) in the amplitude of neural oscillations. The case for how these maximize dynamic range of information processing and communication is still being made and because a significant substrate for information coding and transmission is neural synchrony it is of interest to link synchronization measures with those of criticality. We propose a framework for characterizing criticality in synchronization based on an analysis of the moment-to-moment fluctuations of phase synchrony in terms of the presence of LRTCs. This framework relies on an estimation of the rate of change of phase difference and a set of methods we have developed to detect LRTCs. We test this framework against two classical models of criticality (Ising and Kuramoto) and recently described variants of these models aimed to more closely represent human brain dynamics. From these simulations we determine the parameters at which these systems show evidence of LRTCs in phase synchronization. We demonstrate proof of principle by analysing pairs of human simultaneous EEG and EMG time series, suggesting that LRTCs of corticomuscular phase synchronization can be detected in the resting state and experimentally manipulated. The existence of LRTCs in fluctuations of phase synchronization suggests that these fluctuations are governed by non-local behavior, with all scales contributing to system behavior.
This has important implications regarding the conditions under which one should expect to see LRTCs in phase synchronization. Specifically, brain resting states may exhibit LRTCs reflecting a state of readiness facilitating rapid task-dependent shifts toward and away from synchronous states that abolish LRTCs.

**Keywords: criticality, long-range temporal correlations, phase synchronization, detrended fluctuation analysis, oscillations, Kuramoto, Ising**

## **1. INTRODUCTION**

The concept of the brain as a dynamical system close to a critical regime is attractive because systems close to criticality are thought to maximize their dynamic range of information processing and communication, show efficiency in transmitting information and a readiness to respond to change (Linkenkaer-Hansen et al., 2001, 2004; Beggs and Plenz, 2003; Stam and de Bruin, 2004; Kinouchi and Copelli, 2006; Sornette, 2006; Shew et al., 2009; Werner, 2009; Chialvo, 2010; Beggs and Timme, 2012; Meisel et al., 2012; Shew and Plenz, 2013).

A number of modeling studies have shed important light on the behavior of neurally inspired systems close to their critical dynamical range (Kitzbichler et al., 2009; Shew et al., 2009; Breakspear et al., 2010; Daffertshofer and van Wijk, 2011; Poil et al., 2012). To date there have been two significant experimental observations suggesting that the brain may operate at, or near, criticality. These are: (i) the discovery that the spatio-temporal distribution of spontaneous neural firing statistics can be characterized as neuronal avalanches with a power law distribution of avalanche size (Beggs and Plenz, 2003) and (ii) the presence of long-range temporal correlations (LRTCs) in the amplitude fluctuations of neural oscillations, typically bandpassed MEG or EEG (Linkenkaer-Hansen et al., 2001; Hardstone et al., 2012). The mechanisms by which avalanches and LRTCs of oscillation amplitude may maximize the dynamic range of information processing and communication are still to be fully understood and experimental and computational neuroscience data linking the two phenomena are only just beginning to emerge (Plenz and Chialvo, 2009; Poil et al., 2012).

Population coding approaches to neuronal information storage and transmission show that both changes in the firing rate and changes in neuronal synchronization and desynchronization of action potentials are required to indicate changes in signal salience (Pfurtscheller, 1977, 1992; Singer, 1999; Baker et al., 2001; Schoffelen et al., 2005). At a coarser spatio-temporal scale, extracellular brain signals (local field potentials, corticography, EEG, and MEG), which depend on recordings within the brain, at the brain surface and at the scalp, are observed to be quasi-oscillatory (brain oscillations) and in the resting state contain spectral peaks within distinct frequency bands sitting on a 1/*f* decrease in power with increasing frequency (Buzsaki, 2006). Brain oscillations both in the resting state and during task conditions show short-range and long-range synchronization when examined both from the phase and amplitude envelope perspectives (Wang, 2010). Neuroscience has primarily focused on the detection of synchronization between areas either at zero phase lag, or with a fixed phase delay. This is in part a consequence of the fact that the averaging necessary to extract evidence of signal correlation requires a consistent phase relationship between the two signals for at least some period of the recording.

Importantly, neural synchronization is weak and it fluctuates spontaneously over time. A number of experiments have shown neural synchronization to be consistently modulated by cognitive, perceptual and motor tasks, supporting the idea that synchronization and de-synchronization within and across frequency bands may play an important role in communication within the nervous system (Conway et al., 1995; Farmer, 1998; Baker et al., 1999; Singer, 1999; Pikovsky et al., 2003; Schoffelen et al., 2005; Buzsaki, 2006; Doesburg et al., 2009; Fries, 2009; Akam and Kullmann, 2010). Changing synchronization patterns may indicate an evolution in the relationship and exchange of information (Pikovsky et al., 2003). Neural synchronization can exist between nearby and distant regions, across a range of time scales, and can be characterized using a number of time- and frequency-domain techniques as well as mutual information (Halliday et al., 1998; Schoffelen et al., 2005; Buzsaki, 2006; James et al., 2008; Brittain et al., 2009; Siegel et al., 2012).

Neuronal synchronization occurs when the mutual influence of neurons on each other causes them to fire close together in time. It is favored by oscillatory activity. Oscillators can be tipped in and out of weak synchronization through shared noise, a phenomenon first appreciated by Huygens (Pikovsky et al., 2003). Therefore, weak yet variable synchrony between neuronal oscillators may easily emerge within complex and highly interactive neural networks. In this paper the term synchronization will be used to encapsulate both zero and fixed phase lag synchrony but also situations in which any non-trivial phase relationship exists between signals. Importantly, we will introduce a new methodology to demonstrate that non-fixed yet non-random phase relationships between signals are present in models of critical synchronization and we will show that, in principle, the methodology can be applied to neural data in order to further explore the relationship between neural synchronization and systems operating close to a critical regime.

Recent evidence supporting the idea of criticality in the dynamics of the resting state brain activity and the appreciation that synchronization is an important extractable property of neural spatio-temporal dynamics have led researchers to ask whether neuronal synchrony can have properties consistent with a dynamical system at criticality. These approaches identify power law distributions in neural synchronization where synchronization has been defined as phase consistency between two thresholded time series, e.g., see the phase lock interval (PLI) measure and the lability of global synchronization (GLS) measure in Kitzbichler et al. (2009). These findings are of considerable interest; however, the results supporting power law behavior of PLI have been shown by the present authors to be vulnerable to data pooling and therefore may not provide robust estimates of critical synchronization in neural time series data (Botcharova et al., 2012, see also Shriki et al., 2013).

As discussed above, LRTCs (these will be formally defined in Section 2.3) exist in dynamical systems thought to operate close to a critical regime (Linkenkaer-Hansen et al., 2001). They are typically identified by the autocorrelation function of the time series decaying in the form of a power law (Granger and Joyeux, 1980). The detrended fluctuation analysis (DFA) technique allows a characterization of LRTCs through an exponent similar to the Hurst exponent. DFA has been widely used in order to demonstrate the presence of LRTCs in a number of natural and human phenomena (see Peng et al., 1994, 1995a,b; Stanley et al., 1994; Hausdorff et al., 1995; Bak, 1996; Robinson, 2003; Karmeshu and Krishnamachari, 2004; Wang et al., 2005; Samorodnitsky, 2006; Hardstone et al., 2012, for examples). In neurophysiology, the finding of LRTCs in amplitude fluctuations of the bandpass filtered MEG and EEG (Linkenkaer-Hansen et al., 2001, 2004) has inspired us to develop a methodological framework that can be used to verify the presence or absence of power law scaling of detrended fluctuations and, where power law scaling is present, to estimate and ascertain non-trivial DFA exponents in the moment-to-moment fluctuations of phase synchronization (quantified in terms of the rate of change of phase difference time series) between pairs of neuronal oscillation time series. It should be noted here that our focus on the rate of change of phase difference time series means that our framework is not reliant on the definition of (discrete) phase locking events. It is therefore expected to contribute insights regarding phase synchronization that corroborate or complement those provided by the study of intermittency in phase synchronization (e.g., Gong et al., 2007).

The methodology is tested as follows: (i) on synthetic time series where their phase difference has known temporal properties with a known DFA exponent. Using these simulations we demonstrate the method's ability to recover known DFA exponents in the phase difference, and we test the method's robustness to additive noise in such signals; (ii) the method is tested on two classical models of criticality, Ising and Kuramoto (Ising, 1925; Onsager, 1944; Kuramoto, 1975, 1984), from which time series and their pairwise phase differences can be extracted. The output of these models is examined using our method for those parameter values that determine the sub-critical, critical, and super-critical regimes. The classical Kuramoto model is tuned close to the physiological β frequency range of MEG and EEG and examined with additive noise. We show from this analysis that a rise in DFA exponent associated with robust power law detrended fluctuation scaling occurs close to the critical regimes of both the Ising model and the Kuramoto model with noise.

We next use our methodology to examine a system of Kuramoto oscillators, operating in a range of frequencies close to the physiological γ frequency range of MEG and EEG that are connected through a network constructed based on empirical estimations of brain connectivity parameters with time delays, noise and non-uniform connectivity (Cabral et al., 2011). From these simulations, we determine the parameters at which this system shows evidence of LRTCs in the rate of change of phase differences and we relate the presence of LRTCs to the network's connectivity.

Finally, we demonstrate that in principle this methodology may be applied to neurophysiological data through analysing pairs of human EEG and EMG time series. These preliminary results suggest that LRTCs can be detected in the phase synchronization between oscillations in human neurophysiological recordings.

We present and discuss our methodology in detail and we offer an interpretation of its results in relation to the emerging literature on neural synchrony and criticality within neural systems. We suggest that the existence of a valid DFA exponent in fluctuations of a phase difference measure indicates that the fluctuations are governed by non-local behavior, with all scales contributing to the system's behavior.

## **2. MATERIALS AND METHODS**

We seek to characterize the presence of LRTCs in the (time-varying) phase difference between two time series. These time series may be physiological signals such as EEG, MEG, or EMG, time series extracted from a simulation or physical model, or data recorded from other natural phenomena. Below, we present the detail of the various components of our proposed methodology, including a technique used to calculate phase differences, DFA and the recently introduced ML-DFA method for validating the output of DFA. **Figure 1** illustrates the application of our methodology to neurophysiological data using two sample MEG time series. We note that for these signals, we bandpass filter the data to a frequency band of interest; however, this step will be omitted in model data considered further in the manuscript.

#### **2.1. SIGNAL PHASE**

The phase of a single time series *s*(*t*) is calculated by first finding its analytic signal:

$$s_a(t) = s(t) + iH\left[s(t)\right] \tag{1}$$

where *i* is the imaginary unit and *H*[*s*(*t*)] is the Hilbert transform:

$$H\left[s(t)\right] = \text{p.v.} \int_{-\infty}^{\infty} s(\tau) \frac{1}{\pi(t-\tau)} d\tau \tag{2}$$

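and p.v. indicates that the transform is defined using the Cauchy principal value. The instantaneous phase φ(*t*) is then obtained as the argument of the complex-valued analytic signal.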
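As an illustration, the analytic signal and the instantaneous phase can be computed with a discrete Fourier transform. The sketch below is a minimal numpy version under the assumption of a uniformly sampled, finite-length signal; the function name `analytic_signal`, the sampling rate and the test sinusoid are illustrative and not taken from the original study. The phase is taken as the argument of the complex analytic signal.

```python
import numpy as np

def analytic_signal(s):
    """Return s + i*H[s] by suppressing negative frequencies in the DFT."""
    s = np.asarray(s, dtype=float)
    n = s.size
    spectrum = np.fft.fft(s)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC component once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist component once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

fs = 1000.0                               # sampling rate in Hz (assumed)
t = np.arange(2000) / fs                  # 2 s of samples
s = np.cos(2 * np.pi * 10.0 * t)          # 10 Hz test oscillation

s_a = analytic_signal(s)
phase = np.angle(s_a)                     # wrapped phase in [-pi, pi]
unwrapped = np.unwrap(phase)              # continuous phase
```

The test oscillation completes a whole number of cycles within the window, so the discrete transform is essentially exact here; for real recordings, edge effects near the start and end of the signal should be expected.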

#### **2.2. PHASE DIFFERENCE**

The signal phase is defined such that it belongs to a range φ(*t*) ∈ [0, 2π] or φ(*t*) ∈ [−π, π]. When a single oscillatory cycle is completed the phase returns to its starting value. A time-varying phase therefore has the properties of a sawtooth function (see panel 3 in **Figure 1**). In order to turn the phase into a continuous signal, the phase is unwrapped, so that at each discontinuity, a value of 2π is added to the phase (Freeman and Rogers, 2002; Freeman, 2004).

The phase difference φ1(*t*) − φ2(*t*) between two different time series *s*1(*t*) and *s*2(*t*) is calculated using the respective Hilbert transform of the signals *H*[*s*1(*t*)] and *H*[*s*2(*t*)] (Pikovsky et al., 2003):

$$\phi_1(t) - \phi_2(t) = \tan^{-1}\left\{\frac{H\left[s_1(t)\right]s_2(t) - s_1(t)H\left[s_2(t)\right]}{s_1(t)s_2(t) + H\left[s_1(t)\right]H\left[s_2(t)\right]}\right\} \tag{3}$$

Full synchronization between the two signals is indicated by a constant difference in phase over some time period (Pikovsky et al., 2003). The time series φ1(*t*) − φ2(*t*) is an unbounded process because φ1(*t*) and φ2(*t*) themselves are unbounded as long as the signals *s*1(*t*) and *s*2(*t*) continue to evolve as time increases. Because we use DFA (see Section 2.4) to assess the presence of LRTCs, and DFA in its standard form assumes a bounded signal, in this paper we characterize phase synchronization in terms of the time derivative of the phase difference time series φ1(*t*) − φ2(*t*), i.e., the rate of change of the phase difference.
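The computation of the phase difference in Equation (3) and of its rate of change can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the finite-difference approximation of the time derivative and the two test sinusoids are assumptions, and the four-quadrant `arctan2` is used in place of a plain arctangent so that the phase difference can be unwrapped.

```python
import numpy as np

def hilbert_transform(s):
    """Imaginary part of the DFT-based analytic signal, a discrete H[s]."""
    s = np.asarray(s, dtype=float)
    n = s.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(s) * weights).imag

def phase_difference_rate(s1, s2, fs):
    """Rate of change (rad/s) of the unwrapped phase difference of Eq. (3)."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    h1, h2 = hilbert_transform(s1), hilbert_transform(s2)
    numerator = h1 * s2 - s1 * h2
    denominator = s1 * s2 + h1 * h2
    # arctan2 resolves the quadrant that a plain arctangent would lose.
    wrapped = np.arctan2(numerator, denominator)
    # Finite differences approximate the time derivative (an assumption).
    return np.diff(np.unwrap(wrapped)) * fs

fs = 1000.0                                # sampling rate in Hz (assumed)
t = np.arange(2000) / fs
s1 = np.cos(2 * np.pi * 10.0 * t)          # 10 Hz oscillation
s2 = np.cos(2 * np.pi * 12.0 * t)          # 12 Hz oscillation
rate = phase_difference_rate(s1, s2, fs)
```

For these two sinusoids the phase difference drifts linearly, so the rate of change is constant and equal to 2π(10 − 12) rad/s; a fully synchronized pair would instead give a rate of zero.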

#### **2.3. LONG-RANGE TEMPORAL CORRELATIONS**

The autocorrelation function *Rss*(τ ) of a signal *s*(*t*) quantifies the correlation of a signal with itself at different time lags τ (Priemer, 1990), formally:

$$R_{ss}(\tau) = \int_{-\infty}^{\infty} s(t+\tau)\bar{s}(t)\,dt \tag{4}$$

where *s̄*(*t*) is the complex conjugate of *s*(*t*), and therefore *s̄*(*t*) = *s*(*t*) if *s*(*t*) is real-valued.

In signals with short-range or no dependence (Beran, 1994), the autocorrelation function shows a rapid decay. Gaussian white noise, for example, is a signal with no temporal dependence because each successive value of the time series is independent, so its autocorrelation function vanishes at all non-zero lags, whereas signals with short-range dependence typically show an exponential decay. In contrast, a slow decay of the autocorrelation function indicates that correlations persist even across large temporal separations, and this is referred to as long-range dependence (Beran, 1994).
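The contrast between no dependence and short-range dependence can be illustrated with a sample estimate of the autocorrelation function. The sketch below is an assumption-based example rather than part of the original methodology: the normalized estimator, the function name `autocorrelation` and the AR(1) process with coefficient 0.8 are chosen only for illustration. White noise gives values close to zero at all non-zero lags, whereas the AR(1) autocorrelation decays approximately as 0.8<sup>τ</sup>.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    variance = np.dot(x, x) / x.size
    return np.array([np.dot(x[: x.size - k], x[k:]) / (x.size * variance)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(2)
white = rng.standard_normal(20_000)

# AR(1) process x[t] = 0.8 * x[t-1] + noise: short-range dependence.
ar1 = np.zeros_like(white)
for i in range(1, ar1.size):
    ar1[i] = 0.8 * ar1[i - 1] + white[i]

acf_white = autocorrelation(white, 20)
acf_ar1 = autocorrelation(ar1, 20)
```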

If there is power law decay of the autocorrelation function, namely:

$$R_{ss}(\tau) \sim C\tau^{-\alpha} \tag{5}$$

where *C* > 0 and α ∈ (0, 1) are constants, and the symbol ∼ indicates asymptotic equivalence (Clegg, 2006), then the time series is said to contain LRTCs. LRTCs are a subject of considerable scientific interest. They have been detected in biological data (Peng et al., 1994; Carreras et al., 1998; Willinger et al., 1999; Linkenkaer-Hansen et al., 2001; Samorodnitsky, 2006; Berthouze et al., 2010) and have been discussed within the context of complex systems operating in a critical regime.

Applying a Fourier transformation to Equation (5), a similar formulation exists for the spectral density of the signal (Clegg, 2006), with *f* representing frequency:

$$G_{ss}(f) \sim Bf^{-\beta} \tag{6}$$

where β = 1 − α and is also related to the level of temporal dependence.

**FIGURE 1 |** […] two sample MEG signals from the left and right motor cortex, displayed throughout panels 1–4 in red and blue, respectively. Panel 2 shows an optional bandpass filtering step. In panel 3 the instantaneous phases of the two time series are calculated using the Hilbert transform. Panel 4 shows the unwrapped phases leading to a time-varying phase difference displayed in panel 5. In panel 6, the rate of change of this phase […] correspond to the minimum and maximum window sizes used in the DFA analysis, see Section 2.4. Panel 7 shows the resulting DFA fluctuation plot. The validity of this plot is determined using ML-DFA, see Section 2.5. In this case, the validity of the DFA plot was confirmed, with a DFA exponent of 0.57.

The exponents α and β in Equations (5) and (6) are connected to the Hurst exponent, *H*, by α = 2 − 2*H* and β = 2*H* − 1 (Beran, 1994; Taqqu et al., 1995).
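The relations between the three exponents can be checked numerically. The following is a minimal sketch; the function name `exponents_from_hurst` is introduced here purely for illustration.

```python
def exponents_from_hurst(H):
    """Map a Hurst exponent H to (alpha, beta) using alpha = 2 - 2H and beta = 2H - 1."""
    alpha = 2.0 - 2.0 * H
    beta = 2.0 * H - 1.0
    return alpha, beta


# Example: H = 0.75 gives alpha = 0.5, which lies in the interval (0, 1).
alpha, beta = exponents_from_hurst(0.75)
assert abs(beta - (1.0 - alpha)) < 1e-12  # consistent with beta = 1 - alpha
```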

In practice, finding the exponents α and β is not straightforward for an arbitrary signal. In the time domain, α is best approximated by the slope of the autocorrelation function in the limit of infinite time lags τ, where measurement errors are also largest (Clegg, 2006). Similarly, in the frequency domain, β is best approximated by the shape of the spectral density at large frequency shifts *f*. Determination of the Hurst exponent for non-stationary signals is not straightforward, and therefore, for practical applications, the related property of self-similarity (see below) is considered.

## **2.4. DETRENDED FLUCTUATION ANALYSIS**

DFA may be used to determine the self-similarity of a time series (Peng et al., 1994, 1995b). The application of DFA returns the value of an exponent, which is closely related to the Hurst exponent (Beran, 1994; Clegg, 2006). DFA is often considered to be applicable to both stationary and non-stationary data although recent reports, e.g., Bryce and Sprague (2012), have suggested that the ability of DFA to deal with non-stationary signals is overstated. In Section 2.5, we will describe our approach to mitigating this concern.

To calculate the DFA exponent, the time series is first detrended and then cumulatively summed. The root mean square error is then calculated when this signal is fitted by a line over different window sizes (or box sizes). Extensions of the technique can be used to fit any polynomial to each window; however, here we only consider linear detrending. If the time series is self-similar, there will be power law scaling between the residuals (or detrended fluctuations) and the box sizes. In the log space, this power law scaling yields a linear relationship between residuals and box sizes, the so-called DFA fluctuation plot, and the DFA exponent *H* is obtained using least squares linear regression. A DFA exponent in the range 0.5 < *H* < 1 indicates the presence of LRTCs. An exponent of 0 < *H* < 0.5 is obtained when the time series is anti-correlated, *H* = 1 represents pink noise, and *H* = 1.5 is Brownian noise. Gaussian white noise has an exponent of *H* = 0.5.
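The steps described above can be sketched in a few lines of standard-library Python. This is an illustrative implementation under simple assumptions (non-overlapping boxes, linear detrending, and a plain least squares fit to the fluctuation plot); it is not the code used for the analyses in this paper, and the names `dfa_exponent` and `_line_residual_ss` are introduced here for illustration only.

```python
import math
import random


def _line_residual_ss(y):
    """Sum of squared residuals after fitting a least-squares line to y."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    return sum((v - y_mean - slope * (i - x_mean)) ** 2 for i, v in enumerate(y))


def dfa_exponent(series, window_sizes):
    """Estimate the DFA exponent of `series` using linear detrending."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:                      # remove the mean, then cumulatively sum
        total += v - mean
        profile.append(total)
    log_w, log_f = [], []
    for w in window_sizes:
        n_boxes = len(profile) // w       # non-overlapping boxes of length w
        ss = sum(_line_residual_ss(profile[b * w:(b + 1) * w])
                 for b in range(n_boxes))
        log_w.append(math.log(w))
        log_f.append(0.5 * math.log(ss / (n_boxes * w)))  # log RMS fluctuation
    # Slope of the log-log fluctuation plot by least squares regression.
    mw = sum(log_w) / len(log_w)
    mf = sum(log_f) / len(log_f)
    return (sum((a - mw) * (b - mf) for a, b in zip(log_w, log_f))
            / sum((a - mw) ** 2 for a in log_w))


rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]
H_white = dfa_exponent(white, [8, 16, 32, 64, 128, 256])  # expected near 0.5
```

For Gaussian white noise the estimate should lie close to 0.5, and for its cumulative sum (Brownian noise) close to 1.5, up to finite-sample variability.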

When performing DFA on oscillatory signals, the smallest window length should be large enough to avoid errors in local root mean square fluctuations, and it is typically taken to be several times the length of a cycle at the characteristic frequency in the time series (Linkenkaer-Hansen et al., 2001). If the minimum window size is significantly smaller than this value, then the fluctuation plot will typically contain a crossover at the window length of a single period (Hu et al., 2001). However, for non-oscillatory time series for which there is no characteristic temporal scale and there are rapid changes at each innovation, such as Gaussian white noise or FARIMA time series (see Section 2.6.1), a smaller window size may be used.

The maximum window size should encompass a significant proportion of the time series yet contain sufficient estimates to allow for a robust estimate of the average fluctuation magnitude across the time series. It is typically taken to be *N*/10 where *N* is the length of the data (Linkenkaer-Hansen et al., 2001).

In our application of DFA to neurophysiological and model data, we use 20 window sizes with a logarithmic scaling and a minimum window of 8 time steps for simulated data, and 1 s for neurophysiological oscillations (sampled at 512 Hz, band-pass filtered 15.5–27.5 Hz) providing for a minimum of 16 cycles per second. Following Linkenkaer-Hansen et al. (2001) we take a maximum window size of *N*/10 time steps where *N* is the length of the time series.
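A logarithmically spaced set of window sizes between a minimum window and *N*/10 can be generated as in the sketch below. The rounding to integers and the removal of duplicates are assumptions made here for illustration, and `log_spaced_windows` is a hypothetical helper name.

```python
def log_spaced_windows(min_window, max_window, count=20):
    """Return up to `count` distinct integer window sizes with logarithmic spacing."""
    ratio = (max_window / min_window) ** (1.0 / (count - 1))
    # Rounding can create duplicates for small windows, so collect into a set.
    return sorted({int(round(min_window * ratio ** k)) for k in range(count)})


n = 2 ** 16                                   # illustrative series length
windows = log_spaced_windows(8, n // 10)      # minimum of 8 time steps, maximum N/10
```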

## **2.5. ASSESSING THE VALIDITY OF DFA**

As mentioned above, a self-similar process will produce a power law relationship between the magnitude of the detrended fluctuations and the box sizes. In DFA, this power law scaling is characterized in terms of the linear scaling between the log detrended fluctuations and the log box sizes (DFA fluctuation plot). It is beyond the scope of this paper to argue the validity of operating in the log domain (but see Clauset et al., 2006 for a reasoned view as to why this may not be appropriate). However, the object of DFA is to find evidence for or against scaling, and a valid DFA exponent can only be obtained when the DFA fluctuation plot is indeed linear. We have therefore introduced a model selection method for establishing the linearity of DFA fluctuation plots (Botcharova et al., 2013).

Our arguments for adopting a more rigorous approach are as follows: (i) there is no *a priori* means of confirming that a signal is self-similar, (ii) a DFA fluctuation plot will necessarily increase with window size, (iii) an exponent may be too easily obtained through simple regression analysis producing a statistically significant result with a high *r*<sup>2</sup> value even though the linear model may not best represent a given DFA fluctuation plot, (iv) the discovery of an exponent >0.5 with a high *r*<sup>2</sup> value may lead to the incorrect conclusion that the signal is self-similar with LRTCs.

Instead of a simple regression we use the model selection technique (ML-DFA) introduced in Botcharova et al. (2013) to determine whether a given DFA fluctuation plot is best-approximated by a linear model. This is a heuristic technique, which has been tested extensively and found to perform well in assessing linearity in the fluctuation plots of the following time series: (i) those with known combinations of short and LRTCs, (ii) self-similar time series with varying Hurst exponent, (iii) self-similar time series with added noise and (iv) time series with known oscillatory structure, e.g., sine waves (Botcharova et al., 2013).

The technique fits the DFA fluctuation plot with a number of different models (see below) and compares the fit of each model using the Akaike Information Criterion (AIC), which discounts for the number of parameters needed to fit the model. The DFA exponent is accepted as being valid only if the best fitting model is linear. We want to stress that this does not equate to stating that the fluctuation plot is linear. Rather, we do not reject the linear model hypothesis. In what follows, only those time series for which the linear model hypothesis is not rejected (i.e., their DFA fluctuation plot is best-fitted by the linear model) contribute to the DFA exponents presented in the present paper and where appropriate we indicate where linear scaling of the fluctuation plot is lost.

The models included in ML-DFA are listed below (see Botcharova et al., 2013 for a justification), with the *ai* parameters to be found. The number of parameters ranges between 2 for the linear model, and 8 for the four-segment spline model.

- Polynomial: $f(x) = \sum_{i=0}^{K} a_i x^i$ for $K \in \{1, \ldots, 5\}$
- Root: $f(x) = a_1 (x + a_2)^{1/K} + a_3$ for $K \in \{2, 3, 4\}$
- Logarithmic: $f(x) = a_1 \log(x + a_2) + a_3$
- Exponential: $f(x) = a_1 e^{a_2 x} + a_3$
- Spline with 2, 3 and 4 linear sections.

The first step of ML-DFA is to normalize the fluctuation magnitudes with:

$$lF_{scaled} = 100 \times \frac{lF - lF_{min}}{lF_{max} - lF_{min}}$$

where *lFmin* and *lFmax* are the minimum and the maximum values of vector *lF*, respectively. A function *L* is then defined:

$$\mathcal{L} = \prod_{i=1}^{n} p(lns(i))^{lF_{scaled}(i)}$$

which is a product across all windows *i*, and which works in a similar way to a likelihood function, where *p*(*lns*) represents the function:

$$p(lns) = \frac{\left| f(lns) \right|}{\sum_{i=1}^{n} \left| f(lns(i)) \right|}$$

where *f*(*lns*) is the fitted model. Absolute values are used in order to ensure that *p*(*lns*) remains in the range [0, 1], so that a function is rejected if it falls below 0.

The next step is to apply a logarithm to *L* to produce a function that is similar in form to a log-likelihood:

$$\log \mathcal{L} = \sum_{i=1}^{n} lF_{scaled}(i) \log p(lns(i))$$

This is maximized to find the parameters *ai* necessary for *f*(*lns*). It is worth mentioning that the application of the logarithm means that the values belonging to *lns* are not equally weighted for all *i*. The larger window sizes have a lower weighting, which is beneficial because these estimates are also the least robust since they have fewer samples associated with them.
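The normalization and the log-likelihood-like function defined above can be sketched as follows. The sketch only evaluates log *L* for model values that are already given; the search over the parameters *a<sub>i</sub>* is omitted. The helper names and the example data are hypothetical, and the handling of zero weights and zero model values is an assumption made here to keep the logarithm defined.

```python
import math


def scale_fluctuations(lF):
    """Rescale log fluctuation magnitudes to the range [0, 100]."""
    lo, hi = min(lF), max(lF)
    return [100.0 * (v - lo) / (hi - lo) for v in lF]


def pseudo_log_likelihood(model_values, lF_scaled):
    """Return sum_i lF_scaled(i) * log p(lns(i)), with p = |f| / sum |f|."""
    abs_vals = [abs(v) for v in model_values]
    norm = sum(abs_vals)
    total = 0.0
    for weight, a in zip(lF_scaled, abs_vals):
        if weight == 0.0:
            continue                   # a zero weight contributes nothing
        if a == 0.0:
            return float("-inf")       # model assigns no mass to a weighted point
        total += weight * math.log(a / norm)
    return total


lns = [math.log(w) for w in (8, 16, 32, 64, 128)]
lF = [0.5 * x + 0.1 for x in lns]      # hypothetical, perfectly linear fluctuation plot
lF_scaled = scale_fluctuations(lF)
linear_model = list(lF_scaled)         # a line reproducing the scaled data exactly
constant_model = [50.0] * len(lns)     # a flat alternative
ll_linear = pseudo_log_likelihood(linear_model, lF_scaled)
ll_constant = pseudo_log_likelihood(constant_model, lF_scaled)
```

In this example the model whose shape matches the scaled fluctuations obtains the larger value of log *L*, as expected for a function that behaves like a log-likelihood.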

Akaike's Information Criterion (AIC) is then computed, which is designed to prevent over-fitting—a situation that should in general be avoided—by taking into account the number of parameters used (Akaike, 1974; Mackay, 2003). For a model using *k* parameters, with likelihood function log*L*, the AIC is calculated using the following expression:

$$\text{AIC} = 2k - 2\log \mathcal{L} + \frac{2k(k+1)}{n-k-1}$$

where *k* is the number of parameters that the model uses (Akaike, 1974) and *n* is the number of window sizes. The final term is the adaptation proposed by Hurvich and Tsai (1989), which accounts for small sample sizes. The model which provides the best fit to the data is that with the lowest value of AIC. It is important to recall that the AIC can only be used to compare models. It does not give any information as to how good the models are at fitting the data, i.e., it is only its relative value, for different models, that is important; and it would not be possible, for instance, to compare AIC values obtained from different data sets to each other.
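The AIC expression above, and the selection of the model with the lowest value, can be illustrated with the short sketch below. The log-likelihood values are invented for the example, and the names `aicc` and `best_model` are hypothetical.

```python
def aicc(log_likelihood, k, n):
    """AIC including the small-sample correction, for k parameters and n points."""
    if n - k - 1 <= 0:
        raise ValueError("the correction term requires n > k + 1")
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates, n):
    """`candidates` maps a model name to (maximised log-likelihood, parameter count)."""
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical values for a fluctuation plot with n = 20 window sizes.
candidates = {"linear": (-100.0, 2), "spline_4": (-99.5, 8)}
winner, scores = best_model(candidates, n=20)
```

Here the four-segment spline attains a slightly higher log-likelihood, but its eight parameters are penalized, so the linear model has the lower AIC and would be retained.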

## **2.6. METHOD VALIDATION**

## *2.6.1. FARIMA processes*

An Autoregressive Fractionally Integrated Moving Average model (FARIMA) (Hosking, 1981) can be used to create time series with self-similarity. The model provides a process that can easily be manipulated to include a variable level of LRTCs within a signal, from which DFA should return the exponent used to construct the FARIMA process.

To construct a FARIMA process a time sequence of zero-mean white noise is first generated, which is typically taken to be Gaussian, and necessarily so to produce fractional Gaussian noise. The FARIMA process, *X*(*t*), is then defined by parameters *p*, *d*, and *q* and given by:

$$\left(1 - \sum_{i=1}^{p} \varphi_i B^i \right) (1 - B)^d X(t) = \left(1 + \sum_{i=1}^{q} \theta_i B^i \right) \varepsilon(t) \tag{7}$$

*B* is the backshift operator, so that *BX*(*t*) = *X*(*t* − 1) and *B*<sup>2</sup>*X*(*t*) = *X*(*t* − 2). Terms such as (1 − *B*)<sup>2</sup> are calculated using ordinary expansion, so that (1 − *B*)<sup>2</sup>*X*(*t*) = *X*(*t*) − 2*X*(*t* − 1) + *X*(*t* − 2). While the parameter *d* must be an integer in the ARIMA model, the FARIMA can take fractional values for *d*. A binomial series expansion is used to calculate the result:

$$(1 - B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k$$

The left hand sum deals with the autoregressive part of the model, where *p* indicates the number of back-shifted terms of *X*(*t*) to be included and ϕ<sub>*i*</sub> are the coefficients with which these terms are weighted. The right hand sum represents the moving average part of the model. The number of terms of white noise to be included is *q*, with coefficients θ<sub>*i*</sub>. In the range |*d*| < 1/2, FARIMA processes are capable of modeling long-term persistence (Hosking, 1981). As we will only consider *p* = 1 and *q* = 1 throughout the manuscript, we will refer to ϕ<sub>1</sub> as ϕ and θ<sub>1</sub> as θ. We set |ϕ| < 1, |θ| < 1 to ensure that the coefficients in Equation (7) decrease with increasing application of the backshift operator, thereby guaranteeing that the series converges, and *X*(*t*) is finite (Hosking, 1981).

A FARIMA(0,*d*,0) is equivalent to fractional Gaussian noise with *d* = *H* − 1/2 (Hosking, 1981). This produces a time series with a DFA fluctuation plot that has been shown to be asymptotically linear with a slope of *d* + 0.5 (Taqqu et al., 1995; Bardet and Kammoun, 2008). By manipulating the ϕ and θ parameters, the DFA fluctuation plots can also be distorted.
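A FARIMA(0,*d*,0) series can be approximated by inverting the fractional difference operator, i.e., by writing *X*(*t*) as a weighted sum of past white noise values with the binomial coefficients of (1 − *B*)<sup>−*d*</sup>. The sketch below truncates this infinite sum after `n_terms` terms, which is an assumption made for illustration and is not the exact algorithm of Hosking (1981) used in this paper. The function names are hypothetical.

```python
import random


def fractional_ma_weights(d, n_terms):
    """Coefficients psi_k of (1 - B)^(-d) = sum_k psi_k B^k, via a recursion."""
    psi = [1.0]
    for k in range(1, n_terms):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


def farima_0d0(d, length, n_terms=500, seed=0):
    """Truncated moving-average approximation to a FARIMA(0, d, 0) series."""
    rng = random.Random(seed)
    # Extra noise values at the start provide the history needed by the filter.
    eps = [rng.gauss(0.0, 1.0) for _ in range(length + n_terms)]
    psi = fractional_ma_weights(d, n_terms)
    return [sum(psi[k] * eps[t + n_terms - k] for k in range(n_terms))
            for t in range(length)]


x = farima_0d0(d=0.3, length=200)   # d = H - 1/2, so this targets H = 0.8
```

With *d* = 0 all weights beyond the first vanish and the output reduces to the white noise input, which gives a simple check of the construction.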

## *2.6.2. Surrogate data*

Two time series *x*<sub>1</sub>(*t*) and *x*<sub>2</sub>(*t*) can be constructed such that the time derivative of their phase difference is a FARIMA time series *X*(*t*) with a known DFA exponent (Hosking, 1981). Concretely, we work backwards from the time series *X*(*t*) to which DFA is applied. The phase difference of the two time series, Δ(φ(*t*)), will be the cumulative sum of *X*(*t*), which is discrete in this case:

$$\Delta(\phi(t)) = \sum_{s=1}^{t} X(s)$$

The two phases φ<sub>1</sub>(*t*) and φ<sub>2</sub>(*t*) of *x*<sub>1</sub>(*t*) and *x*<sub>2</sub>(*t*), respectively, must be constructed to have a difference of Δ(φ(*t*)), or some multiple of Δ(φ(*t*)), since DFA is unaffected by multiplying a time series by a constant. We therefore set $\phi_1(t) = \frac{\sum_{s=1}^{t} X(s)}{2f_s}$ and $\phi_2(t) = -\frac{\sum_{s=1}^{t} X(s)}{2f_s}$, where *f<sub>s</sub>* takes the role of a nominal sampling rate for the surrogate data.

Since the phase of a cosine signal is equal to its argument, the two signals *x*1(*t*) and *x*2(*t*) are defined as:

$$x_1(t) = \cos\left(\omega t + \frac{\sum_{s=1}^t X(s)}{2f_s}\right)$$

and

$$x_2(t) = \cos\left(\omega t - \frac{\sum_{s=1}^t X(s)}{2f_s}\right)$$

where ω is a constant.

In what follows, we used ω = 1 and *fs* = 600. These values were chosen in order to produce a smooth enough phase difference. This was necessary to prevent artifacts produced by the Hilbert transform when applied to non-smooth data. When using physiological data, a high enough sampling rate guarantees that the signals will be smooth.
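The construction of the surrogate pair can be sketched as follows, assuming that time *t* is counted in integer steps starting at 1 and that the carrier term is ω*t*. The function name `surrogate_pair` and the short input series are hypothetical.

```python
import math


def surrogate_pair(X, omega=1.0, fs=600.0):
    """Build x1, x2 whose phase difference is the cumulative sum of X divided by fs."""
    x1, x2, phase_diff = [], [], []
    cumulative = 0.0
    for t, value in enumerate(X, start=1):
        cumulative += value
        half = cumulative / (2.0 * fs)        # phi_1(t) = +half, phi_2(t) = -half
        x1.append(math.cos(omega * t + half))
        x2.append(math.cos(omega * t - half))
        phase_diff.append(2.0 * half)         # phi_1(t) - phi_2(t)
    return x1, x2, phase_diff


X = [0.5, -0.2, 0.1]                          # stand-in for a FARIMA series
x1, x2, phase_diff = surrogate_pair(X)
```

The increments of `phase_diff` equal *X*(*t*)/*f<sub>s</sub>*, i.e., a constant multiple of *X*(*t*), so the DFA exponent of the rate of change of phase difference should match that of *X*(*t*).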

A hundred time series *X*(*t*) were generated using the algorithm described in Hosking (1981) for each of the 11 DFA exponents 0.5, 0.55, 0.6,..., 1. Each simulation contains 2<sup>22</sup> = 4,194,304 innovations. The value of the exponent of *X*(*t*) is first computed, the two signals *x*<sub>1</sub>(*t*) and *x*<sub>2</sub>(*t*) are then constructed, and the phase analysis method is applied. Window sizes used for application of DFA were logarithmically spaced with a minimum of 600 time steps to correspond to *f<sub>s</sub>* and maximum *N*/10 where *N* = 2<sup>22</sup> is the length of the time series.

A further control analysis was performed in which a Gaussian white noise time series η*i*(*t*) was added to one of the signals, namely,

$$x_1'(t) = \cos\left(\omega t + \frac{\sum_{s=1}^t X(s)}{2f_s}\right) + \eta_i(t),$$

before the phase analysis method was applied in order to recover the DFA exponent of the phase difference *X*(*t*). This allowed us to alter the signal-to-noise ratio of *x*1(*t*) in an additive way, which we may suppose to be the case for noise in a neurophysiological time series. By applying the phase analysis method to signals with additive noise, we were able to test the robustness of the method to noisy data. In this analysis, first we will estimate the extent to which the DFA exponent alters when noise is added. Second, we will assess whether ML-DFA rejects those DFA exponents that we know to contain noise, and if so, we will quantify the level of noise at which exponents are no longer valid.

#### **2.7. MODEL SIMULATIONS**

## *2.7.1. The Ising model*

The Ising model is a model of ferromagnetism (Ising, 1925). In two dimensions, the model is implemented on a lattice (grid) of elements, or particles which represent a metallic sheet. A temperature parameter controls the collective magnetization (Onsager, 1944). The Ising model has been recently used as a model for a two-dimensional network of connected and interacting neurons (Kitzbichler et al., 2009).

Each element of the grid is assigned a spin *p<sub>i</sub>*, initially at random, which takes a value +1 (spin up) or −1 (spin down). Spins may switch up and down in time in a fashion influenced by both the energy of the full system and by the spin configuration of other neighboring elements. The energy of the system in a given configuration of spins *p* is given by the Hamiltonian function $H(p) = -J \sum_{i,\, j = nn(i)}^{N} p_i p_j$, where *j* is an index for the four elements that are nearest neighbors *nn* of each element *i* of the square grid. The negative sign is included by convention. The average energy of the system is *E* = ⟨*H*⟩, where the symbol ⟨ ⟩ indicates taking the expectation value.

The probability *P* of a given configuration occurring is then proportional to *e*<sup>−*H*(*p*)/*kT*</sup>, where *T* is the temperature parameter and *k* is Boltzmann's constant. The system may switch into a new configuration if its associated probability is higher or equal to that of the current configuration. The Ising model is implemented using the Metropolis Monte Carlo Algorithm (Metropolis et al., 1953).
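A single-spin-flip Metropolis sweep can be sketched as below. Several choices are assumptions made for illustration rather than details taken from this paper: periodic boundary conditions, an energy in which each nearest-neighbor pair is counted once, and the standard Metropolis acceptance rule, which also accepts an energy-raising flip with probability *e*<sup>−Δ*E*/*kT*</sup>. The sketch requires *T* > 0.

```python
import math
import random


def metropolis_sweep(spins, T, rng, J=1.0, k=1.0):
    """Attempt one spin flip per lattice site (on average), updating `spins` in place."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                      + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
        delta_E = 2.0 * J * spins[i][j] * neighbours   # energy change if flipped
        # Accept flips that lower (or keep) the energy; otherwise accept with
        # the Boltzmann probability exp(-delta_E / kT).
        if delta_E <= 0.0 or rng.random() < math.exp(-delta_E / (k * T)):
            spins[i][j] = -spins[i][j]


rng = random.Random(0)
size = 16                                              # small lattice for illustration
lattice = [[1] * size for _ in range(size)]            # fully ordered start
for _ in range(5):
    metropolis_sweep(lattice, T=0.5, rng=rng)
magnetization = sum(sum(row) for row in lattice) / size ** 2
```

At a temperature well below the critical value, the ordered lattice is expected to remain almost fully magnetized after a few sweeps.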

At temperature *T* = 0, the system is highly ordered and corresponds to a magnetic state (see **Figure 2** for an example of an Ising model lattice). With increasing temperature values, the probability of a spin changing increases. As the system temperature increases the spins change more rapidly and the system becomes increasingly disordered and corresponds to a non-magnetic state (**Figure 2A**). The temperature value at which a transition occurs between the magnetized and non-magnetized states is known as the critical temperature *Tc*. At this temperature (see **Figure 2B**), the system will have a large dynamic range and infinite correlation length. However, in practice, this means that the system contains spin clusters of all sizes, and correlations between elements of an infinite system remain finite (Onsager, 1944; Daido, 1989). In other words, the Ising model is predicted to have long-range correlations between its elements at *Tc*.

The value of the critical temperature *Tc* was calculated for the two-dimensional Ising model in Onsager (1944), and is given by the solution to the equation

$$\sinh\left(\frac{2J}{kT_c}\right) = 1$$

In the implementation of the Ising model used here, the lattice consists of 96 × 96 elements. The constants *J* and *k* are set to *J* = 1 and *k* = 1 without loss of generality, which gives the critical temperature $T_c = \frac{2}{\ln(1 + \sqrt{2})} \approx 2.269$.
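The quoted value of the critical temperature follows directly from the condition sinh(2*J*/*kT<sub>c</sub>*) = 1, as the following short check illustrates (the function name is hypothetical).

```python
import math


def critical_temperature(J=1.0, k=1.0):
    """Solve sinh(2J / (k * Tc)) = 1 for Tc, using asinh(1) = ln(1 + sqrt(2))."""
    return 2.0 * J / (k * math.log(1.0 + math.sqrt(2.0)))


Tc = critical_temperature()
assert abs(math.sinh(2.0 / Tc) - 1.0) < 1e-12   # satisfies the defining equation
```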

In order to obtain a time series from this spatial model, we follow the procedure introduced by Kitzbichler et al. (2009). Namely, the lattice is divided into a number of smaller square lattices, which we refer to as sub-lattices, and a number of time series are created by taking an average spin value for each sub-lattice. Here, we use a sub-lattice size of 8 × 8 as in Kitzbichler et al. (2009), but we also investigated other sub-lattice sizes (results not shown) in order to verify that this choice of sub-lattice size did not affect the results. Indeed, previous work by Priesemann et al. (2009) suggests that the sub-sampling of a system may cause it to be mis-classified as subcritical or supercritical when it is in fact in a critical state.

Pairs of time series, for every possible pairing of sub-lattices belonging to the larger grid, were used as input signals for the phase analysis method. For the sub-lattice of size 8 × 8 considered here, 144 time series could be created allowing for 10,296 pairings. Each time series consisted of 64,000 innovations.
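The sub-lattice averaging, and the counts of 144 time series and 10,296 pairings, can be reproduced with the sketch below. It computes the averages for a single lattice configuration; repeating it after each update of the lattice would yield one time series per sub-lattice. The function name and the test lattice are hypothetical.

```python
import math


def sublattice_means(spins, block=8):
    """Average spin of each non-overlapping block x block sub-lattice, row by row."""
    n = len(spins)
    means = []
    for bi in range(0, n, block):
        for bj in range(0, n, block):
            total = sum(spins[i][j]
                        for i in range(bi, bi + block)
                        for j in range(bj, bj + block))
            means.append(total / (block * block))
    return means


lattice = [[1 if (i + j) % 2 == 0 else -1 for j in range(96)] for i in range(96)]
means = sublattice_means(lattice)          # one value per sub-lattice
n_pairs = math.comb(len(means), 2)         # number of distinct sub-lattice pairs
```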

#### *2.7.2. The Kuramoto model*

The Kuramoto model is a classical model of synchronization (Acebrón et al., 2005; Chopra and Spong, 2005) and has been used to study the oscillatory behavior of neuronal firing (Pikovsky et al., 2003; Kitzbichler et al., 2009; Breakspear et al., 2010) among many other biological systems.

The Kuramoto model describes the phase behavior of a system of mutually coupled oscillators with a set of differential equations. Each of *N* oscillators in the system rotates at its own natural frequency ω<sub>*i*</sub>, *i* = 1,..., *N*, drawn from some distribution *g*(ω). However, it is attracted out of this cycle through coupling *K*, which is globally applied to the system. Time *t* is taken to run for *T* seconds in steps of length *dt* = 10<sup>−3</sup>. The differential equation to describe the phase of an oscillator is (Kuramoto, 1975, 1984):

$$\dot{\phi}_i(t) = \omega_i(t) + \frac{K}{N} \sum_{j=1}^N \sin(\phi_j(t) - \phi_i(t)) \tag{8}$$

Because the Kuramoto model provides an equation governing the phase evolution of each oscillator in the system, there is no need for the Hilbert transform to recover the phase time series and therefore only the latter stages of the phase analysis method are used (see steps 3–6 in **Figure 1**).

Kuramoto (1975) showed that the evolution of any phase φ*i*(*t*) may be re-expressed using two mean field parameters, which result from the combined effect of all oscillators in the system. Namely, we may write:

$$\dot{\phi}_i(t) = \omega_i + Kr(t)\sin(\psi(t) - \phi_i(t))\tag{9}$$

where ψ(*t*) is the mean phase of the oscillators, and *r*(*t*) is their phase coherence, so that:

$$r(t)e^{i\psi(t)} = \frac{1}{N} \sum_{j=1}^{N} e^{i\phi_j(t)}\tag{10}$$

This crucially indicates that each oscillator is coupled to the others through its relationship with mean field parameters *r*(*t*) and ψ(*t*), so that no single oscillator, or oscillator pair drives the process on their own. The oscillators synchronize at a phase equal to the mean field ψ(*t*), and *r*(*t*) describes the strength of synchronization, sometimes referred to as the extent of order in the system (Strogatz and Mirollo, 1991; Bonilla et al., 1992). When *r*(*t*) = 0, no oscillators are synchronized with each other. When *r*(*t*) = 1, all oscillators are entrained with each other.
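The mean field parameters of Equation (10) can be computed from a set of phases as in the sketch below (the function name is hypothetical).

```python
import cmath
import math


def order_parameter(phases):
    """Return (r, psi): the phase coherence and mean phase of Equation (10)."""
    z = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(z), cmath.phase(z)


r_sync, psi_sync = order_parameter([0.3] * 10)                         # identical phases
r_spread, _ = order_parameter([2.0 * math.pi * k / 8 for k in range(8)])  # evenly spread
```

Identical phases give *r* = 1 with ψ equal to the common phase, whereas phases spread evenly around the circle give *r* = 0.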

One solution to Equation (9) is *r* ≡ 0 for all time and coupling, leaving each oscillator to evolve independently at its own natural frequency. Using a limit of *N* → ∞, some further deductions can be made, including the fact that when the natural frequency distribution *g*(ω) is unimodal and symmetric, another solution can be found for ω*i*, with *r*(*t*) not equivalent to 0 (Kuramoto, 1975). A critical bifurcation occurs for sufficiently high coupling, resembling a second-order phase transition (Miritello et al., 2009) in which the order parameter [here, *r*(*t*)] leaves zero and grows continuously with coupling (Strogatz and Mirollo, 1991; Dörfler and Bullo, 2011). The coupling at the bifurcation is referred to as the critical coupling *Kc* (Dörfler and Bullo, 2011).

In an infinite Kuramoto model, criticality is defined through this point of bifurcation. For a finite system, however, the critical point can only be approximated by this theoretical value. One defining characteristic of the critical coupling for the Kuramoto system is that the greatest number of oscillators come into synchronization at this value. In our study, we deal with finite-sized implementations of the Kuramoto model, and we use this characteristic as a marker of the onset of the critical regime in addition to the theoretical value *K<sub>c</sub>*. Specifically, we use a measure characterizing the onset of synchronization with increasing coupling introduced by Kitzbichler et al. (2009). This is the change in the "effective mean-field coupling strength," Δ(*Kr*). If the value of *Kr* exceeds the difference between the natural frequency and the mean phase ω<sub>*i*</sub> − ψ (in modulus), i.e., |ω<sub>*i*</sub> − ψ| < *Kr*, then oscillator *i* will synchronize to the mean field (Mertens, 2011). Thus, the value of *K* at which *Kr* increases maximally is the coupling value at which the greatest number of oscillators are drawn into the mean field.
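Given a grid of coupling values and the corresponding (time-averaged) coherence values, the coupling at which *Kr* increases maximally can be located as sketched below. The data are invented for illustration, and reporting the upper end of the coupling step with the largest increase is an assumption of this sketch.

```python
def coupling_of_max_increase(couplings, coherences):
    """Return the coupling at which K*r rises most between neighboring grid values."""
    kr = [K * r for K, r in zip(couplings, coherences)]
    deltas = [b - a for a, b in zip(kr, kr[1:])]      # change in K*r per step
    best = max(range(len(deltas)), key=deltas.__getitem__)
    return couplings[best + 1], deltas


couplings = [20.0, 22.0, 24.0, 26.0, 28.0]            # hypothetical coupling grid
coherences = [0.10, 0.10, 0.50, 0.60, 0.65]           # hypothetical mean values of r
K_star, deltas = coupling_of_max_increase(couplings, coherences)
```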

In this paper, we consider the Kuramoto model with a noise term added to the phase equation, namely, Equation (8) becomes:

$$\dot{\phi}_i(t) = \omega_i(t) + \frac{K}{N} \sum_{j=1}^{N} \sin(\phi_j(t) - \phi_i(t)) + \eta_i(t) \tag{11}$$

where η<sub>*i*</sub> is a noise input taken to be uncorrelated Gaussian noise with zero mean (⟨η<sub>*i*</sub>⟩ = 0) and covariance σ<sub>*i*</sub><sup>2</sup>/*T* (⟨η<sub>*i*</sub>(*t*) η<sub>*j*</sub>(*s*)⟩ = δ<sub>*ij*</sub>δ(*t* − *s*)σ<sub>*i*</sub><sup>2</sup>/*T*), where δ<sub>*ij*</sub> is the Kronecker delta, δ(*t* − *s*) is the Dirac delta function, σ<sub>*i*</sub> is in radians and *T* = 1 s here.
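A noisy Kuramoto system of this kind can be integrated numerically with an Euler–Maruyama scheme, as in the sketch below. Several choices are assumptions made for illustration: the coupling sum is evaluated through the mean field form of Equation (9), the noise increment per step is σ√*dt* times a standard normal variate, all oscillators share the same σ, and the default parameter values (including the deliberately strong coupling in the example) are not those of the simulations reported here.

```python
import cmath
import math
import random


def simulate_noisy_kuramoto(n_osc=50, K=30.0, sigma=0.32, steps=2000, dt=1e-3,
                            mean_freq=44.0 * math.pi, freq_sd=15.0, seed=1):
    """Euler-Maruyama integration; returns final phases and the coherence r(t)."""
    rng = random.Random(seed)
    omega = [rng.gauss(mean_freq, freq_sd) for _ in range(n_osc)]
    phi = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_osc)]
    coherence = []
    for _ in range(steps):
        z = sum(cmath.exp(1j * p) for p in phi) / n_osc
        r, psi = abs(z), cmath.phase(z)               # mean field parameters
        coherence.append(r)
        phi = [p + (w + K * r * math.sin(psi - p)) * dt
               + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
               for p, w in zip(phi, omega)]
    return phi, coherence


# Strong coupling, well above the critical value, should give high coherence.
phases, coherence = simulate_noisy_kuramoto(K=200.0)
```

Because the mean field form is algebraically equivalent to the pairwise sum, each step costs a number of operations proportional to *N* rather than *N*<sup>2</sup>.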

This creates a richer structure in the oscillator dynamics, which we suggest may better reflect coupling of neurophysiological oscillators. Furthermore, it has been shown that addition of noise increases the critical regime over a wider range of coupling values (Breakspear et al., 2010). This may allow for the fluctuations of phase difference of a given oscillator pair to persist for longer with increasing coupling before full synchronization is achieved.

Strogatz and Mirollo (1991) analytically derived a formula for the critical coupling in an infinite Kuramoto model with added noise, *K<sub>c,noise</sub>*. As the number of oscillators is inevitably finite, this value is only an approximation to the true critical coupling in the system, but we find it useful and it is displayed alongside plots of Δ(*Kr*), which although originally introduced for a noiseless model, remains a helpful marker of the effective critical coupling in the Kuramoto model when noise levels are not too large (Mertens, 2011).

In this study, we generated time series for 200 oscillators of the Kuramoto model described by Equation (11). Each time series was 6100 time steps long. The standard deviation σ<sub>*i*</sub> was set to 0.32. The distribution of natural frequencies was *g*(ω) ∼ *N*(44π, σ<sub>ω</sub>), with standard deviation σ<sub>ω</sub> = 15. This corresponds to a normal distribution centered around 22 Hz (which is a unimodal distribution). In order to get an idea of the spread of the distribution, the minimum natural frequency selected from this distribution was 16.3 Hz and the maximum was 27.8 Hz. We selected this frequency range because it spans the β-band of EEG, MEG, and EMG oscillations (Farmer, 1998).

For these parameter values, the critical coupling *Kc* is equal to:

$$K_c = \frac{2\sqrt{2}}{\sqrt{\pi}} \sigma_\omega \approx 23.93$$
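The numerical value of the noiseless critical coupling can be checked directly. For a normal distribution the expression above coincides with the classical result *K<sub>c</sub>* = 2/(π*g*(ω̄)), where *g*(ω̄) is the density at the mean frequency; the sketch below evaluates both forms (the function name is hypothetical).

```python
import math


def critical_coupling_gaussian(freq_sd):
    """Noiseless critical coupling for normally distributed natural frequencies."""
    return 2.0 * math.sqrt(2.0) / math.sqrt(math.pi) * freq_sd


Kc = critical_coupling_gaussian(15.0)
peak_density = 1.0 / (15.0 * math.sqrt(2.0 * math.pi))   # normal density at its mean
assert abs(Kc - 2.0 / (math.pi * peak_density)) < 1e-9   # the two forms agree
```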

The integral for *Kc*,*noise* is not analytically calculable for a normal distribution *g*(ω) ∼ *N* , but empirical calculation yields:

$$K_{c,noise} \approx 23.85$$

## *2.7.3. The Cabral model*

The third model that we consider in this paper was developed by Joanna Cabral and her colleagues and is referred to here as the Cabral model. It is a modification of the Kuramoto model, combining the dynamics of the Kuramoto oscillators with the network properties observed in the human brain (Cabral et al., 2011).

The Cabral model includes a noise input to the Kuramoto oscillators and situates the 66 oscillators on a connectivity matrix with varying connection strengths and time delays based on empirical measurements of 998 brain regions, which have been down-sampled to 66 (Honey et al., 2009). The list of brain regions considered in this model is given in the supplementary material of Cabral et al. (2011) and is reproduced in the Appendix to the present paper. Specifically, Equation (8) is modified to include a connectivity term *C<sub>ij</sub>* between oscillators *j* and *i*, namely,

$$\dot{\phi}_i(t) = \omega_i(t) + \frac{K}{N} \sum_{j=1}^N C_{ij} \sin(\phi_j(t - D_{ij}) - \phi_i(t)) + \eta_i(t) \tag{12}$$

where η*<sup>i</sup>* is the noise input previously introduced, and *Dij* is the time delay associated with the link between oscillators *j* and *i*. The matrix of delays *D* is extracted from a matrix of empirical distances *L* between regions using:

$$D_{ij} = \frac{\langle D \rangle \, L_{ij}}{\langle L \rangle}$$

and is used to encode the length of time taken by neural activity to traverse the connection space. The connectivity and distance matrices (*C* and *D*, respectively) are shown in **Figure 12**. They can also be visualized through the schematic diagram in **Figure 3** in which the thickness and color of the lines represent the weights of the connections between the oscillators denoting individual brain regions. These weights are proportional to the number of fibers that were empirically observed to connect the various regions (Cabral et al., 2011, 2012). Brain regions may be identified by their labels, the abbreviations of which are given in **Table A1** in the Appendix.
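The rescaling of distances into delays can be sketched as follows. The sketch assumes that the mean ⟨*L*⟩ is taken over all off-diagonal entries of the distance matrix; the averaging convention actually used in the model may differ (for example, it may be restricted to connected pairs). The function name and the small example matrix are hypothetical.

```python
def delay_matrix(lengths, mean_delay):
    """Scale a distance matrix so that its off-diagonal mean equals `mean_delay`."""
    n = len(lengths)
    off_diag = [lengths[i][j] for i in range(n) for j in range(n) if i != j]
    mean_length = sum(off_diag) / len(off_diag)
    return [[mean_delay * lengths[i][j] / mean_length for j in range(n)]
            for i in range(n)]


L = [[0.0, 10.0, 20.0],
     [10.0, 0.0, 30.0],
     [20.0, 30.0, 0.0]]                 # hypothetical distances between three regions
D = delay_matrix(L, mean_delay=11.0)    # delays proportional to distance
```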

In Cabral et al. (2011), the model was used to generate time series which were used as input to a hemodynamic model and bandpass filtered. Each time series was 10<sup>6</sup> time steps long, corresponding to 1000 s. The resulting time series were compared to recordings of BOLD fMRI signals using Pearson's correlation coefficient and mean squared error to determine the parameter values *K* and *D* that generated the time series which most closely approximated the BOLD data.

In this model, there is no theoretically derived value of critical coupling and Δ(*Kr*) is only a marker of effective change in coupling that may or may not be critical. We interpret a rise in Δ(*Kr*) as an increase in order of the system similar to that observed by Kitzbichler et al. (2009).

The phase analysis method presented here was applied to the Cabral model for coupling parameters *K* ranging from 1 to 20. We note that this encompasses *K* = 18, the value identified by Cabral et al. (2011) as best approximating human brain resting state BOLD fluctuations. Natural frequencies were drawn from a normal distribution with *g*(ω) ∼ *N*(120π, σ<sub>ω</sub>) with standard deviation σ<sub>ω</sub> = 5, which corresponds to a normal distribution centered around 60 Hz in the γ frequency band. This was selected because γ oscillations have been shown to play a significant part in the BOLD signal fluctuations (see Cabral et al., 2011 for details).

The standard deviation σ*<sup>i</sup>* of the noise input was set to 1.25. It was found that values of σ*<sup>i</sup>* < 3 did not significantly alter the resulting parameter values of *K* and *D*. The value *D* = 11 is taken as in Cabral et al. (2011).

## *2.7.4. Clusters in the Cabral model*

Cabral et al. (2011) identified a number of clusters of oscillators, along with a set of 12 oscillators which are not part of a cluster. These clusters are listed below in **Table 1**. In our analysis, we considered how each of these different clusters contributed to the overall behavior.

## *2.7.5. Disruptions to the Cabral model*

In order to investigate the role of connectivity in sustaining LRTCs of rate of change of phase difference, we modified the connectivity matrix *C* in the Cabral model in two ways, as shown in **Figure 4**. First, beginning with the empirical connectivity matrix we deleted any connection that extended from one hemisphere into the other. We preserved all the other elements of the model's connectivity and oscillator characteristics.


*The 66 oscillators of the Cabral model can be separated into 6 clusters, based on their mutual connectivity and distance matrix patterns, and a final set of 12 oscillators, which are not considered to belong to a cluster, but are grouped together here for convenience. The table also states the average sum of weights per node belonging to each cluster and the average number of connections per node (both to 2 d.p.).*

The second exploration involved a reconnection of the connectivity matrix in a random arrangement, while preserving the degree distribution and weight distribution of each oscillator by an algorithm described in Gionis et al. (2007) and Hanhijärvi et al. (2009). Specifically, a list of the outgoing weights of each oscillator was made alongside the node from which it extends. Two weights were selected from this list. If they did not belong to the same node, then the nodes were connected to each other with the associated outgoing weights that were selected. These weights were then deleted from the list. To continue the algorithm, two further weights were selected. After the first step, it was necessary to check at each iteration that the nodes were not already connected before connecting them. If the nodes were connected, or if they were the same node, new weights were selected from the list.
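The reconnection procedure described above can be sketched as follows. This is an illustrative reading of the description rather than the implementation of Gionis et al. (2007) or Hanhijärvi et al. (2009). In particular, the cap on the number of attempts is an assumption added here, because the random pairing can reach a state in which the remaining weights cannot be placed. The function name and the example list are hypothetical.

```python
import random


def rewire(stubs, rng, max_attempts=10000):
    """Randomly pair (node, outgoing weight) entries into edges.

    Returns the edges as {(a, b): (weight from a, weight from b)} with a < b,
    together with any entries that could not be placed.
    """
    remaining = list(stubs)
    edges = {}
    attempts = 0
    while len(remaining) >= 2 and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(remaining)), 2)
        (a, wa), (b, wb) = remaining[i], remaining[j]
        key = (min(a, b), max(a, b))
        if a == b or key in edges:
            continue                       # same node or already connected: redraw
        edges[key] = (wa, wb) if a < b else (wb, wa)
        for idx in sorted((i, j), reverse=True):
            del remaining[idx]             # used weights leave the list
    return edges, remaining


# Hypothetical example: six nodes, each with two outgoing weights.
stubs = [(node, 0.1 * (node + 1) + 0.01 * s) for node in range(6) for s in range(2)]
edges, leftover = rewire(stubs, random.Random(0))
```

By construction, the number of connections made for each node plus its unplaced weights equals the number of outgoing weights it started with, so the degree of each node is preserved whenever the list is exhausted.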

Analysis of the random connectivity model, and comparison of its results with those of the disconnected hemisphere model and the standard, empirically connected model, allowed us to determine the extent to which a realistic connectivity matrix of the human brain predisposes the system to LRTCs in the rate of change of the phase difference between the oscillator pairs representing different brain regions.

*2.7.5.1. A note on notation.* From this point in the text, all instances of oscillator phase φ*<sub>i</sub>*(*t*) and *r*(*t*) will be written as φ*<sub>i</sub>* and *r* for ease of notation, unless stated otherwise. Any quantities that are defined using the phases of one or more oscillators are also implicitly functions of time, although the *t* is omitted for the same reason.

## **2.8. NEUROPHYSIOLOGICAL DATA**

Previously collected neurophysiological data were used to illustrate the application of the method (see James et al., 2008 for full details). Briefly, EEG and EMG signals were simultaneously recorded whilst a healthy adult subject performed a 2-min 10% MVC (maximum voluntary contraction) isometric abduction of the index finger of the right hand. The EMG was recorded using bipolar electrodes situated over the first dorsal interosseous muscle (1DI). The EEG was recorded using a modified Maudsley montage from 24 Ag/AgCl electrodes with impedance <5 kΩ. The data were amplified, bandpass filtered at 4–256 Hz, and sampled at 512 Hz. We analyzed EEG recorded from over the left sensorimotor cortex. The signal processing pathway was set out as in **Figure 1**, including bandpass filtering in the β frequency range (15.5–27.5 Hz).

## **3. RESULTS**

## **3.1. SURROGATE DATA**

The signals described in Section 2.6.2 were analyzed. The scatter plot presented in **Figure 5** shows the DFA exponents of the rate of change of phase difference expected from the construction of a FARIMA time series with known parameters against those recovered by applying the phase analysis method. The scatter plot shows a strong linear relationship between the expected and recovered exponents with a slope of 0.998. The fact that the slope is slightly <1 indicates that the recovered exponent was slightly under-estimated by our method. This minor tendency will decrease the likelihood of false positive results.

As noise is added to a signal with a known DFA exponent in its phase, the exponent of its phase is found to be reduced. **Figure 6** shows that as the noise level is progressively increased, the percentage difference between the known DFA exponent and that recovered by the method increases. When the noise level is above one which causes the percentage difference between known and recovered DFA exponent to exceed approximately 5% (note, as shown in **Figure 6**, that this noise level depends on the exponent, e.g., 0.1 for true DFA exponent of 1, 0.025 for exponent of 0.75), no values are returned for the recovered DFA exponent. This occurs because the recovered DFA exponents are not considered to be valid by ML-DFA because their associated DFA fluctuation plots are not best approximated by a linear model (see Section 2.5).

As the noise level is increased further, and as it passes a level of ≈0.3–0.4, noise dominates the signal and valid exponents are once again obtained. These exponents are at or close to 0.5 regardless of the value of the known DFA exponents, indicating that the phase relationship of the two signals *s*<sub>1</sub>(*t*) and *s*<sub>2</sub>(*t*) is dominated by noise only.

## **3.2. THE ISING MODEL**

**Figure 7** shows the results for sub-lattices of size 8 × 8. At a high temperature of *T* = 10<sup>5</sup>, the average DFA exponent across all pairwise comparisons is 0.57 (see magenta shaded bar). This value is in excess of the 0.5 expected for Gaussian white noise and indicates that even at high temperatures there is order within the rate of change of phase difference between pairs of lattice time series. As the temperature is lowered, the DFA exponent of the rate of change of phase difference increases steadily, reaching a maximum of 0.65 at *T* = 2.55 (see magenta shaded bar), indicating maximal LRTC just before the critical temperature is reached.

The change in mean DFA has to be seen within the context of the validity of the DFA fluctuation plots. As the system cools toward the critical point, the validity of DFA exponents across all pairwise phase differences drops abruptly. The first temperature value for which <100% of the DFA plots are valid is *T* = 2.75, shown as a magenta shaded bar. There is a large fall in DFA fluctuation plot validity as the critical temperature is reached (from 56% to 34%). This fall in validity reflects the onset of full synchronization between a number of the time series. At the critical point, *T* = *T<sub>c</sub>*, which occurs between *T* = 2.25 and *T* = 2.3 (see magenta shaded bars), the validity is 34% of time series pairs, with a mean DFA exponent of 0.64. As the Ising model cools below the critical point, the DFA validity in general falls, and there are no valid DFA fluctuation plots below *T* = 2.15. As discussed above, this occurs because of the loss of fluctuations in the rate of change of phase difference due to full synchronization.
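As a rough consistency check on the location of the critical point, one can assume the standard two-dimensional square-lattice Ising model with nearest-neighbour coupling *J* and no external field. That assumption is ours, since the lattice details are not restated in this section. Under it, Onsager's exact result for the infinite lattice gives

$$
T_c = \frac{2J}{k_B \ln\left(1 + \sqrt{2}\right)} \approx 2.269\,\frac{J}{k_B},
$$

which, in units where $J = k_B = 1$, lies within the interval between *T* = 2.25 and *T* = 2.3 quoted above.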

Results obtained for sub-lattice sizes of 32 × 32, 16 × 16, 12 × 12, and 6 × 6 were found to be qualitatively consistent with the results shown in **Figure 7** (results not shown).

## **3.3. THE KURAMOTO MODEL**

The group average results for the Kuramoto model are shown in **Figure 8**. The peak average DFA exponent occurs at *K* ≈ 22. The value of the average DFA exponent at this coupling value is 0.65 with standard deviation 0.06, consistent with the rate of change of phase difference showing LRTCs. The peak DFA exponent occurs one coupling value later than the peak of the Δ(*Kr*) measure, at *K* ≈ 21. The peak of Δ(*Kr*) marks the coupling value at which the order parameter *r* increases most, and the point of greatest oscillator coupling flux in the system (Kitzbichler et al., 2009). The peak coupling value of Δ(*Kr*) and the maximum DFA values are just less than the theoretical critical coupling of the infinite Kuramoto system with noise, *K<sub>c</sub>* ≈ 23.85. Again, these results must be understood in the context of DFA fluctuation plot validity, which is 42% of the 199,000 oscillator pairs at *K* ≈ 22. Once full synchronization occurs between an individual pair of oscillators, their phase difference takes a constant value. ML-DFA detects the resulting loss of scaling by indicating that the DFA fluctuation plot is no longer linear.

**FIGURE 6 | True and recovered DFA exponents for noisy signals with LRTCs. (A)** Recovered DFA exponent values as noise is progressively added. For each of the DFA exponents given in the legend (box insert), a signal *x*<sub>1</sub>(*t*) was constructed with a noise level σ ∈ [0, 1], shown on the *x* axis. The phase synchrony analysis method was applied to *x*<sub>1</sub>(*t*) and *x*<sub>2</sub>(*t*). This was performed 100 times. For DFA exponents corresponding to DFA fluctuation plots that were accepted as linear by ML-DFA, the average value for the 100 signal pairs is shown. There are no data points corresponding to the intermediate noise level of ≈0.1 to ≈0.3 because all 100 DFA fluctuation plots for signals with this noise level were determined to be invalid by ML-DFA. **(B)** The % difference between recovered and known DFA exponents as a function of the noise added to a signal with a known DFA exponent in its phase. The data shown in this plot is the same as that in **(A)**, but it is expressed in terms of the % difference between true and recovered DFA exponents rather than the raw recovered value. Only noise levels of σ ∈ [0, 0.1] are shown. The colors represent different true DFA exponent values, as indicated by the legend within the inserted box. The dashed line indicates a 5% difference between known and recovered exponents. When the difference between the known and recovered exponent exceeded approximately 5% for any value of the true exponent, the DFA fluctuation plot is not accepted as being linear by ML-DFA and therefore the exponent is not shown on the plots.

After the peak DFA at *K* ≈ 22, further increase in *K* eventually causes full synchronization between all individual oscillator pairs. Across the whole system, fewer than 10% of oscillator pairs yield a valid DFA after the critical coupling is exceeded. When all oscillator pairs are synchronized with each other, the order parameter of the system approaches its maximum level of 1 but the DFA fluctuation measure of rate of change of phase difference is no longer valid.

Analysis of the Kuramoto model with noise suggests that LRTCs in the rate of change of phase difference between oscillator pairs occur when the system is in a state of maximal flux just prior to the onset of full synchronization.
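To make the order parameter concrete, the following sketch integrates a small mean-field Kuramoto system with additive noise and reports the time-averaged *r* at two coupling values. It is a toy illustration under assumed settings: unit-variance Gaussian natural frequencies, noise amplitude 0.1, 100 oscillators, and an Euler-Maruyama step of 0.01 are our choices, not the parameters used in this study, and the function names are hypothetical.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r = |mean(exp(i * phi))|, between 0 and 1."""
    z = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(z)


def simulate_noisy_kuramoto(K, n=100, steps=2000, dt=0.01, noise=0.1, seed=0):
    """Euler-Maruyama integration of a mean-field Kuramoto model with noise.

    Returns r averaged over the second half of the run. All default
    parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    omega = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    r_sum, r_count = 0.0, 0
    for step in range(steps):
        z = sum(cmath.exp(1j * t) for t in theta) / n
        r, psi = abs(z), cmath.phase(z)
        # Mean-field form: (K/n) * sum_j sin(theta_j - theta_i)
        # equals K * r * sin(psi - theta_i).
        theta = [
            t + dt * (w + K * r * math.sin(psi - t))
            + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for t, w in zip(theta, omega)
        ]
        if step >= steps // 2:
            r_sum += r
            r_count += 1
    return r_sum / r_count


r_uncoupled = simulate_noisy_kuramoto(K=0.0)   # weak order expected
r_coupled = simulate_noisy_kuramoto(K=10.0)    # strong order expected
```

Sweeping `K` over a grid and differencing the resulting values of `K * r` would give a discrete analogue of the change-in-order measure discussed in the text.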

## *3.3.1. Individual oscillator pairs*

Further insights into the rate of change of phase difference fluctuation behavior can be obtained from DFA of individual oscillator pairs. Analysis of a set of 5 oscillator pairs is shown in **Figure 9**. The top panel shows the change in DFA exponent with coupling *K* for a pair whose initial frequencies are very close (0.001 Hz apart). The bottom panel shows the changes in DFA exponent for an oscillator pair with initial frequencies that differ by ≈7.0 Hz. The middle panels show oscillator pairs with varying amounts of initial frequency difference (increasing top to bottom). Non-valid DFA exponents are not plotted in the left-hand panels, but the right-hand panels indicate, for each pair, whether the DFA fluctuation plot is linear ("yes" or "no") at a given value of *K*. At low coupling *K*, the oscillators do not interact with each other and each evolves at its own natural frequency. The order in the system is low and the DFA exponent of ≈0.5 reflects the additive noise which dominates the fluctuations in the rate of change of phase difference. A DFA value of ≈0.5 is also evident in the average DFA (**Figure 8**). There is almost 100% validity across all pairs because white noise time series are scale-free and therefore the DFA fluctuation plot obtained from analysing them is expected to be linear (**Figure 8**).

As the coupling parameter *K* is increased, the DFA exponents of each of the oscillator pairs rise until a peak is reached. The value of *K* at which a maximal valid exponent is retrieved for these peaks is related to the difference in natural frequencies of the two oscillators as well as their interactions with the noise and the mean field. Oscillator pairs which start further apart in frequency terms develop full synchronization later than those whose initial frequencies are close together. As *K* increases, the DFA exponent of the rate of change of phase difference increases. The pairs with the strongest LRTCs, on the basis of the highest DFA exponent value prior to onset of full synchronization, are those with the greatest initial frequency difference. Increasing temporal order of the rate of change of phase difference prior to full synchronization of these pairs may indicate a state of pre-synchronization in these pairs.

## **3.4. THE CABRAL MODEL**

For the Cabral model we present results regarding both the global behavior of the system through average DFA exponents across all possible pairs of oscillators (**Figure 10**) and the behavior of the system at cluster level through average DFA exponents of intra-cluster pairs of oscillators (**Figure 11**).

## *3.4.1. Global behavior*

The model introduced by Cabral et al. (2011) is affected by rich interplay between the connectivity and distance matrices as well as the noise and natural frequency elements of the system. The average valid DFA exponents for all oscillator pairings (*n* = 2145) are shown in **Figure 10** as the coupling in the system is increased.


These average exponent values indicate the presence of LRTCs in the rate of change of phase difference. The peak values of mean DFA exponent correspond to peaks in the change in order parameter (Δ(*Kr*)) derived for the classical Kuramoto model and the Kuramoto model with noise; see Kitzbichler et al. (2009) and **Figure 8**. Such peaks occur when the system undergoes the greatest change in synchronization. The peak in Δ(*Kr*) corresponds closely to the coupling value that shows maximum mean DFA exponent (*K* = 5 and 6, respectively; see **Figure 10**).


The number of pairings that yield valid DFA exponents in the rate of change of their phase difference is equal to 100% when there is no coupling in the system (magenta shaded bar at *K* = 0), but it falls as coupling is introduced (magenta shaded bar at *K* = 1). At the coupling value of the DFA peak, *K* = 6, validity is at 20%, which is higher than the neighboring coupling values (magenta shaded bar at *K* = 6).

## *3.4.2. Cluster behavior*

At coupling value *K* = 6, the value at which the global behavior shows peak DFA value, the intra-cluster results indicate that only cluster 4, consisting of oscillators 27–40, shows valid nontrivial DFA exponents. These exponents are consistent with the presence of LRTCs. This suggests that cluster 4 acts as an organizing force in the system when the system is in its greatest state of flux, as demonstrated by a large increase in the order parameter. This cluster corresponds to the most connected brain regions listed in **Table 1** and **Table A1** in the Appendix.

The connectivity and distance matrices for the Cabral model are shown in **Figure 12**. The linear coupling between oscillators for two values of *K* is shown in **Figure 13**. The central cluster of oscillators with high levels of synchronization is evident from the two correlation matrices. At *K* = 6 (**Figure 13A**), i.e., the value at which LRTCs are detected in the rate of change of phase difference, the central oscillator cluster shows evidence of synchronization but with Pearson correlation values of <1.0. As *K* increases to 18, the value identified by Cabral et al. (2011) as best approximating human brain resting state BOLD fluctuations, it can be seen from **Figures 10** and **11** that the proportion of oscillator pairs with valid DFA fluctuation plots is low (approximately 5%). Those oscillator pairs that remain and show persistently valid DFA fluctuation plots are predominantly individual oscillators with low average weight per node (0.03) and low average degree (8.59). Their associated DFA exponent is on average 0.5 (see **Figure 11**). At *K* = 18, the Cabral model shows strong cluster synchronization. In particular, the central cluster 4 (oscillators 27–40), which contains homologous elements connected across the corpus callosum, shows Pearson correlation values close to 1.0, indicative of full synchrony (**Figure 13B**). Therefore, the results we obtained from the Kuramoto model with noise and those derived from the Cabral model are similar. Both show valid DFA fluctuation plots with LRTCs of the rate of change of phase difference at a coupling value where Δ(*Kr*) is increasing, and loss of validity as full synchronization takes over. As discussed earlier, "criticality" is not defined for the Cabral model, but with increasing *K* there is clearly a change in the system's order which is detected through our method.

**Figure 14** shows the DFA exponents of the rate of change of phase difference between individual pairs of oscillators in the form of a symmetric lattice of size 66 × 66, where each element in the lattice represents a brain region as detailed in **Table A1** of the Appendix. **Figure 14A** shows the importance of the central cluster in generating LRTCs of phase synchronization. Importantly, it shows this cluster's influence over many of the other oscillators in the Cabral model. Cluster group 4 has the greatest sum of weights per oscillator and the greatest number of connections per oscillator (see **Table 1**). The correlation between the number of connections of a given oscillator and the average DFA exponent of its rate of change of phase difference with all other oscillators is 0.359, suggesting a relationship between oscillators with large connectivity and those with large DFA exponents in their pairwise phase difference.

## *3.4.3. Comparison of the three connectivity structures*

In the Cabral model, the Δ(*Kr*) measure has its peak at coupling value *K* = 6. Here, we compare the effects of the three connectivity matrices introduced in Section 2.7.5 on the DFA exponents of the pairwise phase difference between oscillators at this coupling value in **Figure 14**.

The empirical connectivity matrix showed large DFA exponents indicating the presence of LRTCs at this coupling value for a small number of hub oscillators belonging to cluster 4 (see above). These oscillators have a high number of connections and large weights associated with these connections (see **Table 1**). When the two hemispheres are disconnected, we see no LRTCs in the DFA exponents of the phase difference at this coupling value. When the distance matrix is preserved, but the connectivity and associated weights are assigned at random, LRTCs are still present in the DFA exponent of the phase differences between oscillators, but a lower value of DFA exponent is obtained. There is no apparent cluster formation when connectivity is random.

**FIGURE 10 | The average DFA exponents of phase synchrony as a function of the coupling parameter,** *K***, in the extended Kuramoto model (Cabral et al., 2011).** The model includes 66 oscillators at normally distributed natural frequencies with mean 60 Hz and standard deviation σ*<sup>i</sup>* = 1.25. The connectivity and time delay matrices are set from empirical values. The average of the valid DFA exponents is shown in magenta and the proportion of valid exponents, as calculated by ML-DFA, is indicated by bars. The Kuramoto model order parameter *r* is in blue, and the quantity Δ(*Kr*) is in cyan. The peak Δ(*Kr*) has been used as an indicator of the effective critical coupling. A horizontal line at DFA exponent 0.5 is plotted to guide the eye. The proportion of valid DFA bars for *K* = 0, *K* = 1, and *K* = 6 have been shaded in magenta.

## *3.4.4. Neurophysiological data*

**Figure 15** illustrates the application of our phase synchrony analysis technique to the human neurophysiological data described in Section 2.8. In this example, a valid DFA exponent of ≈0.6 was obtained for the rate of change of phase difference between the simultaneously recorded EEG and EMG data during a steady muscle contraction, indicative of the presence of LRTCs. Analysis of amplifier noise and artificially generated noise time series using processing steps identical to those for the EEG and EMG data (signal processing pathway shown in **Figure 1**) resulted in a valid DFA fluctuation plot but with exponent of 0.48, consistent with uncorrelated noise.

**FIGURE 12 | Connectivity and distance matrices for the Cabral model.** Each oscillator number represents a brain region, which is defined in **Table A1** in the Appendix. An empty (white) element means that the two regions are not connected. Regions are not connected to themselves so that the diagonals are white. **(A)** Shows the pairwise connection matrix *C* between the 66 oscillators. **(B)** Shows the matrix of pairwise distances *L* between the brain regions that are represented by the 66 oscillators. Matrix *L* is symmetric; however, matrix *C* is not because the connection weights are normalized by row. The values associated with the colors of the plots are defined by the color bars. Red colors in **(A)** represent higher weights. Red colors in **(B)** represent longer distance connections.

## **4. DISCUSSION**

The aim of this paper is to introduce a new methodology for eliciting a marker of criticality in neuronal synchronization. This methodology relies on the rate of change of the phase difference between two signals as a (time-varying) measure of phase synchronization. The presence of LRTCs in this quantity is proposed as a marker of criticality and is assessed using DFA in combination with the recently proposed ML-DFA, a heuristic technique for validating the output of DFA. With these methods, we can first determine the presence or absence of power law scaling using ML-DFA and secondly the presence or absence of LRTCs in the phase synchronization of two time series based on the value of the DFA exponent. If the method returns an exponent of ≈0.5, this indicates a phase relationship similar to white Gaussian noise; however, if the DFA exponent is greater than 0.5, this indicates the presence of LRTCs. Importantly, we can attribute significance to the loss of power law scaling within the fluctuation plot and draw conclusions based on an exponent value only when the exponent has been recovered from plots that are judged to be valid by ML-DFA.
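For readers unfamiliar with DFA, the following minimal sketch estimates the exponent as the slope of the log-log fluctuation plot. It is a generic textbook form of DFA with linear detrending over non-overlapping windows, written for illustration only. The function name and window sizes are assumptions, and the ML-DFA validity check used in this paper is not implemented, so the function returns a slope even when the fluctuation plot is not linear.

```python
import math
import random


def dfa_exponent(x, window_sizes):
    """Estimate the DFA exponent as the slope of log F(n) against log n.

    Uses non-overlapping windows and a linear detrend in each window.
    The input must fluctuate: a constant series gives zero residuals.
    """
    mean_x = sum(x) / len(x)
    profile, total = [], 0.0
    for value in x:                      # integrate the mean-removed series
        total += value - mean_x
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        t_mean = (n - 1) / 2.0
        t_var = sum((t - t_mean) ** 2 for t in range(n))
        squared_residuals, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            y_mean = sum(seg) / n
            slope = sum((t - t_mean) * (y - y_mean)
                        for t, y in enumerate(seg)) / t_var
            squared_residuals += sum(
                (y - y_mean - slope * (t - t_mean)) ** 2
                for t, y in enumerate(seg))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(squared_residuals / count))  # log RMS

    # Ordinary least-squares slope of the fluctuation plot.
    mx = sum(log_n) / len(log_n)
    my = sum(log_f) / len(log_f)
    return (sum((a - mx) * (b - my) for a, b in zip(log_n, log_f))
            / sum((a - mx) ** 2 for a in log_n))


rng = random.Random(1)
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]
walk, level = [], 0.0
for w in white:                          # random walk = integrated white noise
    level += w
    walk.append(level)

windows = [16, 32, 64, 128, 256]
alpha_white = dfa_exponent(white, windows)   # theory predicts a value near 0.5
alpha_walk = dfa_exponent(walk, windows)     # theory predicts a value near 1.5
```

In the method described here, the input series would be the rate of change of the phase difference between two signals rather than raw noise.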

## **4.1. SURROGATE DATA**

It was found that the phase synchrony analysis method recovers a known DFA exponent value in the rate of change of phase difference between two signals of surrogate data with a high degree of accuracy (*r* = 0.998). When the structure of phase synchronization was perturbed with an additive noise source, it was found that noise levels causing a difference of more than approximately 5% between the true and recovered DFA exponents caused the exponents to be judged as invalid by ML-DFA. When the surrogate data was characterized by a DFA exponent close to 1, the recovery of this exponent using DFA was more resistant to noise than for surrogate data with a lower DFA exponent of 0.6 (**Figure 6**). In these simulations we used additive noise which was included at the amplitude stage of the surrogate time series prior to extraction of the phase using the Hilbert transform.

## **4.2. THE ISING MODEL**

We had initially expected to see LRTCs in the Ising model only in the vicinity of the critical parameter, and a DFA exponent of 0.5 when the energy in the system was large (disordered phase). However, in applying our method to the Ising model, neither of these expectations was fully realized. It was found that when the temperature was increased to a very high level of *T* = 10<sup>5</sup>, the DFA exponent of the rate of change of phase difference did not fall to 0.5, but remained at ≈0.57. This did not change when the temperature was set to an even higher value of *T* = 10<sup>12</sup>. This was not a finite size effect of the system, as the result held when larger lattice sizes (up to 1000 × 1000) were used (results not shown). We noted that when pure phase was analyzed, i.e., an uncoupled system of Kuramoto oscillators, DFA exponents of 0.5 were obtained as expected, and therefore we cannot exclude the possibility that artifacts induced by the Hilbert transform may inject some order into the resulting phase time series. However, within the Ising system, the expectation of a DFA exponent of 0.5 at high *T* is based only on our intuition concerning the operation of the system. As all elements in the Ising lattice interact with their neighbors, it is possible that some temporal correlation in the rate of change of phase difference may persist regardless of temperature value, and this may be the cause of a DFA exponent above 0.5.

Importantly, we found that the DFA exponent was indicative of LRTCs at critical temperature but was maximal at *T* = 2.55, just in excess of the critical temperature. As can be seen in **Figure 7**, the consistent change in the DFA value and the change in power law scaling behavior indicates that the phase synchrony analysis method is capturing an important behavior of the system close to its critical regime. However, it is important to realize that unless an experimental neuroscientific paradigm can be discovered that produces similar consistent changes in this measure, neurophysiological data will have to be interpreted with caution, i.e., we may be able to state that for a given pair of neural oscillation time series there exists power law scaling with a DFA exponent indicative of LRTCs in the rate of change of their phase difference but we may not know whether for this neural state there may exist other higher (or lower) exponent values. In other words, the technique may provide evidence that the system is ordered in ways that are similar to systems nearing their critical regime but whether the technique will pinpoint the most critical regime in a neural system is open to question. We will consider this further in our discussion of the results of analysing a Kuramoto system with noise.

Interestingly, the evolution of the DFA exponent with the temperature parameter shares a key characteristic with that of a recently published measure of information flow in the same model (Barnett et al., 2013), specifically, an asymmetry around the critical point, with a sharp rise in the metric as temperature is increased toward the critical *T* = *T<sub>c</sub>* and a gradual descent as the temperature rises significantly. It would be of interest to further assess the extent to which the proposed method captures information flow in the system, e.g., through a comparison of both methods when applied to the Kuramoto model.

## **4.3. THE KURAMOTO MODEL**

In the Kuramoto model, the critical transition is characterized in terms of a global order parameter which reflects the overall organization of the system. However, through our phase synchrony analysis method we are able to make observations at a pair-wise level of Kuramoto oscillators, always bearing in mind that even at the pair-wise level the result is influenced by the oscillators' interactions with all other oscillators in the model. As individual Kuramoto oscillator pairs become fully synchronized, their rate of change of phase difference no longer contains moment-to-moment fluctuations and thus power law scaling in the DFA measure is lost. This is an important consideration because it emphasizes the difference between our method and more standard measures of neural synchrony. Methods for detecting neural synchrony rely on phase consistency to allow averaging out of fluctuations so that a measure of coupling (e.g., coherence and phase coherence) is obtained. In contrast, the method introduced in this paper is dependent on the fluctuations of the two phase signals and their interaction. Therefore, our method detects "order" across time in the rate of change of phase difference rather than phase consistency between two processes.
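The quantity on which the method operates can be sketched as below. This is an illustrative finite-difference version under assumed conventions: phases in radians, uniform sampling interval `dt`, and a hypothetical function name. The paper's own processing pathway (bandpass filtering and Hilbert-transform phase extraction) is not reproduced here.

```python
import math


def phase_difference_rate(phi1, phi2, dt=1.0):
    """Finite-difference rate of change of the unwrapped phase difference."""
    diff = [a - b for a, b in zip(phi1, phi2)]
    unwrapped = [diff[0]]
    for d in diff[1:]:
        step = d - unwrapped[-1]
        # Map each step into [-pi, pi) so that 2*pi wrap-arounds in the
        # input phases are not mistaken for genuine jumps.
        step = (step + math.pi) % (2 * math.pi) - math.pi
        unwrapped.append(unwrapped[-1] + step)
    return [(b - a) / dt for a, b in zip(unwrapped, unwrapped[1:])]


# Two noiseless oscillators advancing by 1.0 and 1.2 rad per sample,
# with phases wrapped to [0, 2*pi).
phi_a = [(1.0 * k) % (2 * math.pi) for k in range(50)]
phi_b = [(1.2 * k) % (2 * math.pi) for k in range(50)]
drifting = phase_difference_rate(phi_a, phi_b)

# A fully synchronized pair: constant phase offset, hence no fluctuations.
phi_c = [p + 0.5 for p in phi_a]
locked = phase_difference_rate(phi_a, phi_c)
```

The locked pair yields a series that is zero up to rounding error, which illustrates why the fluctuation analysis loses validity once a pair is fully synchronized.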

The phenomenon of loss of fluctuations at the onset of full synchronization is well illustrated both for the global Kuramoto model and for individual oscillator pairs extracted from the Kuramoto model. In the global analysis, the peak in the DFA exponent occurs close to the observed peak of Δ(*Kr*) and at values of *K* just below the theoretical critical coupling value. At these values of *K*, a power law scaling exists for the rate of change of phase difference, and the DFA exponent of oscillator pairs with different initial frequencies indicates the presence of LRTCs. At the onset of full synchronization, the number of oscillator pairs for which DFA is valid drops, yet those whose phase differences still possess fluctuations continue to show LRTCs. Once the critical regime has been fully crossed and the order parameter *r* approaches 1, the DFA of the rate of change of phase difference is no longer valid for any oscillator pair.

The LRTC behavior is also clearly explained as the coupling value *K* decreases toward zero. As can be seen in **Figure 8**, the DFA exponent of the pairwise rates of change of phase difference decreases toward 0.5 and yet scaling remains valid. These changes in DFA exponent are evident both on the global level in the average DFA and for individual oscillator pairs. At *K* = 0 the phases are independent from one another yet contain noise; thus the rate of change of phase difference time series contains innovations that are random across time with a DFA which is valid and returns the expected exponent of 0.5.

## **4.4. ORDER WITHIN THE ISING AND KURAMOTO MODELS**

In these models, temperature *T* (Ising) and coupling *K* (Kuramoto) play a similar role in controlling the *order* within the two systems, and the DFA validity and exponent results obtained from analysis of rate of change of phase difference in both of these models mirror each other. In the Kuramoto model, there is a transition from an uncoupled to a synchronized state with increasing *K*. Similarly in the Ising model, there is a transition from a very ordered to a disordered system with increasing *T*. In the human brain, we are not able to characterize the system by incrementally tuning a parameter and observing the result, and we are only privy to snapshots of the working system. However, we can begin to understand the behavior of the brain within this range of behaviors by comparing the DFA of the rate of change of phase difference of pairs of neurophysiological signals to the outcomes of these models of criticality.

## **4.5. THE CABRAL MODEL**

We found that LRTCs exist in the rate of change of phase difference between oscillator pairs at parameter values close to those at which the change in order, Δ(*Kr*), increases sharply. Extrapolating from the Kuramoto model with noise, we suggest that there are important changes in the order of the phase synchronization of interacting oscillators in the Cabral model that involve the presence of LRTCs when the order in that system is at or close to a point of maximal change.

It is important to note that the value of *r* in the Cabral model does not reach a level of 1 in the range of coupling values 0–20. It approaches a level of ≈0.4 as *K* approaches 20 with maximal rate of change at *K* ≈ 6. Further analysis of the Cabral model indicates that *r* will gradually reach a value closer to 1 as *K* increases above a value of 60, as seen in Figure 4 of Cabral et al. (2011). Cabral focussed her attention on *K* = 18 at which point the model, when fed through the Balloon-Windkessel hemodynamic model, produced an output that closely matched the spatio-temporal correlations seen in the BOLD signals of the resting state fMRI. We find that at this value, there are no LRTCs detectable in the rate of change of phase difference measure.

## **4.6. THE ROLE OF CONNECTIVITY IN THE CABRAL MODEL**

Although most of the results were obtained at *K* = 6, selected because it is the peak of Δ(*Kr*), it is important to note that LRTCs exist for a broader range of coupling values *K*. This finding agrees with a recent study by Moretti and Muñoz (2013) in which the authors demonstrated that a network with complex connectivity, such as that of the Cabral model and, indeed, that of the brain, causes the critical point to become a broader critical "region."

Our examination of oscillator pairs belonging to a single cluster, as defined in Cabral et al. (2011), indicates that the emergence of LRTCs is determined primarily by oscillators belonging to cluster 4 which has a large number of connections and a large sum of connection weights. This cluster is located centrally, and it contains four brain regions of particular importance to the resting state network (Fransson and Marrelec, 2008; van den Heuvel and Sporns, 2011). These are oscillators 33 and 34, which correspond to the left and right posterior cingulate cortices, and oscillators 32 and 35 which represent the left and right precuneus. These central brain regions are known to be important with a higher metabolic activity than other regions during the resting state.

Importantly, we find that LRTC behavior of this cluster, and its relationship to the other clusters in the network, is dependent on trans-callosal left-right connectivity. Indeed, disruption of the left-right trans-callosal connections resulted in a loss of LRTCs in the rate of change of phase difference between time series extracted from the central cluster 4 and the other oscillators in the Cabral network. Intuitively, those oscillators that are connected to many other oscillators in the network will also influence the phases of a large number of other oscillators. When these oscillators try to synchronize, we suggest that those that are well connected will be subjected to conflicting phase inputs from their neighbors and thus increased variation in their phase fluctuations, yielding a larger DFA exponent. These variations in fluctuation will in turn feed into the neighboring oscillators and cause them to also have large variations in fluctuation as they attempt to synchronize with their well-connected neighbor. On the other hand, an oscillator that is poorly connected or connected to just one other oscillator may have a more straightforward task of synchronizing with just this (albeit changing) oscillator speed.

The LRTCs in the rate of change of phase difference were also disrupted by randomization of connectivity, albeit less severely than when the trans-callosal connections were severed. When a random connectivity is assigned, no clusters exist and DFA exponents are significantly reduced.

The results obtained from the phase synchrony analysis method here may pave the way for potential future use of the Cabral model in investigating specific pathological modifications of connectivity and their effects on the time-varying synchronization patterns between different brain regions. The method has the potential to be used to trace some types of pathological synchronization such as may arise in epileptic or Parkinsonian conditions to any roots that they may have either in the connectivity, clustering or noise input elements of the Cabral model and therefore potentially also of the nervous system.

## **4.7. NEUROPHYSIOLOGICAL DATA**

In order to show proof of principle, we have presented an example of our method's application to neurophysiological data, in this case EEG and EMG simultaneously recorded during voluntary muscle contraction. It was through this experimental paradigm that corticomuscular coherence (CMC) in the 16–32 Hz (β) frequency range was first discovered (Conway et al., 1995; Halliday et al., 1998) and shown to be the β frequency common drive to human motoneurons first described by Farmer et al. (1993). These preliminary results indicate power law scaling in the DFA plot with a DFA exponent of ≈0.6.

It has been recognized through application of time-varying coherence measures that CMC fluctuates even when a subject attempts to maintain the same motor output (Muthukumaraswamy, 2011). As discussed earlier, the techniques introduced here allow us to focus on the fluctuations within the phase coupling rather than on the averaged measure of coupling. These preliminary results indicate that the fluctuations in the rate of change of phase difference between simultaneously recorded EEG and EMG show power law scaling and LRTCs within the β frequency range. We suggest that the analysis of instantaneous phase difference of neurophysiological data using the methods described in this paper will allow researchers to investigate the coupling between signals in a way that will allow a new appreciation of the relationship between neural synchrony and other oscillator systems approaching their critical regime.
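As a rough illustration of how a DFA exponent such as the ≈0.6 quoted above is obtained, the following is a minimal sketch of standard first-order detrended fluctuation analysis. It is not necessarily the exact pipeline used for the EEG and EMG data: the function name `dfa_exponent`, the window sizes, and the synthetic white-noise input are assumptions made for illustration only, and no phase extraction or validation of the power-law fit is included.

```python
import math
import random


def dfa_exponent(series, window_sizes):
    """Estimate the first-order DFA exponent of a 1-D sequence (illustrative)."""
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-subtracted series to obtain the profile.
    profile, running = [], 0.0
    for value in series:
        running += value - mean
        profile.append(running)

    log_n, log_f = [], []
    for n in window_sizes:
        xs = list(range(n))
        x_mean = (n - 1) / 2.0
        sxx = sum((x - x_mean) ** 2 for x in xs)
        n_windows = len(profile) // n
        squared_residuals = 0.0
        for w in range(n_windows):
            seg = profile[w * n:(w + 1) * n]
            y_mean = sum(seg) / n
            # Step 2: least-squares linear trend within this window.
            slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, seg)) / sxx
            intercept = y_mean - slope * x_mean
            squared_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, seg)
            )
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # Step 4: the slope of log F(n) against log n is the DFA exponent.
    k = len(log_n)
    mx, my = sum(log_n) / k, sum(log_f) / k
    num = sum((a - mx) * (b - my) for a, b in zip(log_n, log_f))
    den = sum((a - mx) ** 2 for a in log_n)
    return num / den


# Assumed test input: uncorrelated Gaussian noise, for which the exponent
# should lie near 0.5 (no long-range temporal correlations).
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha_white = dfa_exponent(white_noise, [16, 32, 64, 128, 256])
print(round(alpha_white, 2))
```

In this reading, an exponent near 0.5 corresponds to uncorrelated fluctuations, whereas values between 0.5 and 1 are the range usually interpreted as LRTCs.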

## **4.8. LRTCs IN RATE OF CHANGE OF PHASE DIFFERENCE AND THE BRAIN**

LRTCs have been associated with model dynamical systems that show efficiency in learning, memory formation, rapid information transfer, and network organization. The broad dynamical range of which LRTCs are a marker acts to support these functions (Linkenkaer-Hansen et al., 2001, 2004; Stam and de Bruin, 2004; Sornette, 2006; Shew et al., 2009; Chialvo, 2010; Werner, 2010; Beggs and Timme, 2012; Meisel et al., 2012). It has been argued by a number of researchers that these properties if present would be of major benefit to the functions that human brain dynamics needs to support and there is now a literature that connects the theory of critical systems with properties of human brain dynamics (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Kitzbichler et al., 2009; Shew et al., 2009; Chialvo, 2010).

In this paper, we focus on LRTCs, and because of the importance in neuroscience of brain oscillations and the concept of communication through coherence, we make the link between LRTCs and phase synchrony. We note that in the model systems that we have explored the highest valid DFA exponents were recovered when the systems were close to their critical point but in a slightly more disordered state than at exact criticality. We explained this on the basis of full synchronization within our model systems being a point at which the rate of change of phase difference is lost (observed in Ising at *T* < *Tc* and in Kuramoto for increasing *K*).
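The loss of the rate of change of phase difference at full synchronization can be illustrated with a deliberately reduced example. The sketch below assumes two noise-free Kuramoto oscillators with natural frequency difference `delta_omega` and coupling `coupling`, for which the phase difference obeys d(Δφ)/dt = Δω − K sin(Δφ). This is a simplification chosen for illustration, not the networks analyzed in this paper, and the function name and parameter values are hypothetical.

```python
import math


def mean_abs_phase_drift(coupling, delta_omega=1.0, dt=0.001, steps=50000):
    """Euler-integrate d(dphi)/dt = delta_omega - coupling * sin(dphi).

    Returns the mean absolute rate of change of the phase difference over
    the second half of the run (after transients have decayed).
    """
    dphi = 0.0
    total = 0.0
    count = 0
    for step in range(steps):
        rate = delta_omega - coupling * math.sin(dphi)
        dphi += rate * dt
        if step >= steps // 2:
            total += abs(rate)
            count += 1
    return total / count


weak = mean_abs_phase_drift(coupling=0.5)    # K < delta_omega: phases keep drifting
strong = mean_abs_phase_drift(coupling=2.0)  # K > delta_omega: phase difference locks
print(weak, strong)
```

Once the pair locks, the rate of change of phase difference vanishes and there are no fluctuations left to analyze, which is consistent with the most informative regime lying on the slightly disordered side of the transition.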

In neurophysiological systems, it is important to appreciate that full synchronization of neural oscillators is a pathological state (e.g., observed in the EEG and MEG of epileptic seizures and in EMGs showing pathological tremor). The healthy resting brain state therefore is characterized by weak and variable neural synchrony which would be expected to show fluctuations (temporal innovations) in a measure of the change in phase synchrony, i.e., the rate of change of phase difference. From the perspective of brain dynamics (and muscle activation dynamics) the most important constraints are to avoid pathological synchronization whilst at the same time maintaining the potential for useful synchronization. We suggest therefore that in the healthy state the instantaneous phase difference between neural oscillators will show power law fluctuation plots with a DFA exponent that is either 0.5 or that will show LRTCs. If LRTCs are found in the resting state then they may represent an optimum state of readiness to which the system can readily return if increased synchronization occurs as a result of sensory stimulation, motor task, or cognitive action. Such temporary changes in synchronization may occur in order to support communication through coherence. The resting state, however, is characterized by fluctuations of phase synchrony that have LRTCs and represent the behavior of weakly coupled oscillators whose synchrony can be modulated. The hypothesis that the LRTCs of rate of change of phase difference of brain oscillations may be altered through task is an experimentally tractable question.

To conclude, the evidence for the brain as a critical system continues to accrue. There is an important need to link the criticality paradigm with the paradigm that attaches functional significance to neural synchrony. The methodology presented in this paper takes us some way toward this synthesis.

## **FUNDING**

Engineering and Physical Sciences Research Council (EPSRC).

## **ACKNOWLEDGMENT**

Maria Botcharova thanks the Centre for Mathematics and Physics in the Life Sciences and Experimental Biology (CoMPLEX), University College London for their funding and continuing support. Simon F. Farmer was supported by University College London Hospitals Biomedical Research Centre (BRC).

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 April 2014; accepted: 01 September 2014; published online: 24 September 2014.*

*Citation: Botcharova M, Farmer SF and Berthouze L (2014) Markers of criticality in phase synchronization. Front. Syst. Neurosci. 8:176. doi: 10.3389/fnsys.2014.00176 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Botcharova, Farmer and Berthouze. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**



*The labels, brain regions, and oscillator numbers used in the Cabral model. The abbreviations, full names, and oscillator numbers corresponding to the left and the right hemispheres are given for each brain region.*

## Self-organized criticality as a fundamental property of neural systems

## *Janina Hesse<sup>1,2,3</sup>\* and Thilo Gross<sup>4</sup>*

*<sup>1</sup> Computational Neurophysiology Group, Institute for Theoretical Biology, Humboldt Universität zu Berlin, Berlin, Germany*

*<sup>2</sup> Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany*

*<sup>3</sup> École Normale Supérieure, Paris, France*

*<sup>4</sup> Department of Engineering Mathematics, Merchant Venturers School of Engineering, University of Bristol, Bristol, UK*

#### *Edited by:*

*Dietmar Plenz, National Institute of Mental Health, NIH, USA*

#### *Reviewed by:*

*Shan Yu, National Institute of Mental Health, USA; Woodrow Shew, University of Arkansas, USA; Hongdian Yang, Johns Hopkins University School of Medicine, USA*

#### *\*Correspondence:*

*Janina Hesse, Computational Neurophysiology Group, Institute for Theoretical Biology, Humboldt Universität zu Berlin, Philippstr. 13, Building 4, 10115 Berlin, Germany e-mail: janina.hesse@bccn-berlin.de*

The neural criticality hypothesis states that the brain may be poised in a critical state at a boundary between different types of dynamics. Theoretical and experimental studies show that critical systems often exhibit optimal computational properties, suggesting the possibility that criticality has been evolutionarily selected as a useful trait for our nervous system. Evidence for criticality has been found in cell cultures, brain slices, and anesthetized animals. Yet, inconsistent results were reported for recordings in awake animals and humans, and current results point to open questions about the exact nature and mechanism of criticality, as well as its functional role. Therefore, the criticality hypothesis has remained a controversial proposition. Here, we provide an account of the mathematical and physical foundations of criticality. In the light of this conceptual framework, we then review and discuss recent experimental studies with the aim of identifying important next steps to be taken and connections to other fields that should be explored.

**Keywords: self-organized criticality, brain, phase transition, dynamics, neural network**

## **1. INTRODUCTION**

The brain can be studied by two complementary approaches: Bottom-up approaches start on the level of single neurons or small groups of neurons, and then generalize upwards to the level of the brain. Hypotheses on the macroscopic level are formed based on the microscopic dynamics. For example, the observation of resonance in electrophysiological recordings can predict oscillations on the network level. By contrast, top-down approaches start by considering the properties of the brain on the level of brain areas or the whole brain, and infer downwards to the properties of its constituents. Hypotheses on the microscopic level are formed based on the macroscopic dynamics. For example, correlated activity in EEG recordings predicts a connection between the underlying brain areas.

A central concept connecting the microscopic and macroscopic levels is *criticality*. In the investigation of neural criticality, the word *critical* is used in the sense of statistical physics, which is distinct from other meanings, including the colloquial use. In statistical physics, criticality is defined as a specific type of behavior observed when a system undergoes a *phase transition*.

Physics classifies the behavior of systems into qualitatively different *phases*. This classification scheme has its origin in the phases of classical matter, i.e., the solid, liquid, and gaseous phases. The different macroscopic properties of, say, ice, liquid water, and steam can be explained by the microscopic forces between single water molecules. The discovery of this connection inspired the application of the concept of phases in a broader context and led to the identification of many more phases and different types of phase transitions.

To distinguish different phases, one considers macroscopic, measurable properties of the system, so-called *order parameters*. One then observes how these order parameters change as an ambient property, the so-called *control parameter*, is varied. In general, a smooth change in the control parameter leads to a smooth change in the order parameters. However, there are certain points where the values of the order parameters jump or make sharp turns, see **Figure 1**. These points mark boundaries between different phases, and moving the control parameter across such a boundary causes a *phase transition*. If the transition is marked by a jump in the order parameters of the system (mathematically-speaking, a discontinuity in the phase diagram), the phase transition is called *discontinuous*. Such transitions are sometimes called *transitions of first order*. If the phase diagram is continuous and the transition is marked by a sharp corner (a point of non-differentiability), then the phase transition is *continuous* (*second order*).

If a system has a continuous phase transition, then the system can reside exactly at the transition point between two phases. This state on the edge between two qualitatively different types of behavior is called the *critical state*, and in this state the system is *at criticality*. Because phase transitions usually break certain symmetries of the system, they often separate an ordered state from a less ordered state. Critical states are therefore said to be on the *edge of chaos*.

As we discuss in detail below, systems at criticality are believed to have optimal memory and information processing capabilities. This general theoretical prediction was verified in many specific models such as boolean networks (Kauffman, 1984; Derrida and Pomeau, 1986), liquid state machines (Langton, 1990), and neuronal networks (Maass et al., 2002; Bertschinger and Natschläger, 2004), for a review also see Legenstein and Maass (2007). These findings inspired the *criticality hypothesis*, which proposes that the brain operates in a critical state because the associated optimal computational capabilities should be evolutionarily selected for.

Deviations from criticality could be symptomatic or causative for certain pathologies. This may pave the way for new diagnostics and treatments. For instance, Meisel et al. (2012) showed that hallmarks of criticality disappeared during epileptic seizures. Furthermore, insights into criticality in the brain could yield valuable design and operating principles for computation more generally, for example for unstructured artificial systems such as computers built from randomly-deposited nanowire memristors.

However, the criticality hypothesis is far from undisputed and many open questions remain. In particular, for a system to be at criticality, one parameter needs to be tuned exactly to the right point. One can therefore ask how a complex dynamic and variable system such as the brain can remain correctly tuned to this state. For a plausible answer, first note that the theory of phase transitions typically considers infinite systems. In large but finite systems, phase transitions occur not at a single point, but are smoothed out over a small parameter range. Instead of the unique critical state, we find a small region that is not technically critical, but still retains many properties of criticality, see **Figure 1** (Moretti and Muñoz, 2013). However, even remaining in this "critical" region should require mechanisms that actively retune the brain. The general idea of systems tuning themselves to critical states through active decentralized processes is known as *self-organized criticality* (SOC) (Bak et al., 1988), and is illustrated in **Figure 2**. After a burst of activity in this area in the 1990s, the theory of self-organized criticality encountered some obstacles and interest slowly subsided (Vespignani and Zapperi, 1998). It was revived by Bornholdt and Rohlf (2000), who discovered an elegant mechanism of self-organized criticality in networks and already suggested it as a plausible mechanism for neural criticality.

The criticality hypothesis can thus build both on evolutionary arguments and on a plausible general mechanism that can explain the self-organization to the critical state. Although investigated analytically and numerically for numerous toy models, it is still unclear whether and how such a mechanism is implemented in the brain. Evidence for criticality has been found in experiments on cell cultures (e.g., Beggs and Plenz, 2003; Tetzlaff et al., 2010), animals (e.g., Petermann et al., 2009; Hahn et al., 2010) and humans (e.g., Kitzbichler et al., 2009; Meisel et al., 2012). However, it has been pointed out that some evidence may be misleading and could potentially be explained by alternative mechanisms (Botcharova et al., 2012). Some experimental studies also report negative results where characteristics of criticality were not observed in the neuronal activity (e.g., Bédard et al., 2006; Dehghani et al., 2012, but see criticism in Yu et al., 2014). In general, the relationship between the theoretical framework and its biological realization remains unclear. While models have demonstrated the plausibility of self-organized criticality in the brain, it is not clear to which of the many conceivable phase transitions the brain organizes, if and how different forms of plasticity drive the brain to this state, and whether different brain regions organize independently. Resolving these questions could lead to a much deeper understanding of neural criticality, explain apparent contradictions in experimental findings, and open up new connections with other fields.

Neural criticality has been reviewed in recent articles (Beggs, 2008; Kello et al., 2010; Beggs and Timme, 2012; Shew and Plenz, 2013; Marković and Gros, 2014) and is the topic of a contributed volume (Plenz and Niebur, 2014). In this review, our aim is to present a clear picture of the underlying concepts and ideas from statistical physics and nonlinear dynamics. We do not attempt to provide a comprehensive survey, but instead highlight specific papers to illustrate general insights that are evident in much of the recent literature. We first present a simple toy model that provides the essential concepts against which much of the recent work can be discussed. We then review self-organized criticality in nervous systems with a special focus on the interaction of theoretical and experimental work in this field. We point out several current questions and connections to other phenomena. Because of the emerging connections, we believe that the criticality hypothesis inspires discussions and the development of tools for the analysis of brain dynamics which will prove useful independent of the validity of the hypothesis itself.

## **2. EXAMPLE OF A PHASE TRANSITION IN A NETWORK**

Phase transitions and criticality can already be observed in simple network models. In physics, such highly simplified models have proven useful to distill the essence of a phenomenon, before investigating how this essence is reshaped through additional details present in the real system.

Consider a large directed network of excitable nodes that can be seen as a crude model of neurons. On average, each node has *z* outgoing links that can propagate activity to other nodes. In analogy to neural systems, we refer to the source of a given link as the *presynaptic* node and to the destination of a given link as the *postsynaptic* node. For a given link, activity is not transmitted instantaneously; instead, there is a small probability *p* that an activation of the presynaptic node activates the postsynaptic node in a small time interval of length τ. Active nodes decay back to the inactive state within the same time interval.

Clearly, this model is excessively simplistic and omits many additional effects and factors that are present in real nervous systems. However, as we show below, the model already contains all ingredients to exhibit a phase transition, and thus provides us with a simple model of a phase transition to play with.

Let us now try to understand the macroscopic dynamics of the system based on its microscopic rules. We define the network activity *A*(*t*) as the mean proportion of activated nodes at time *t*. Higher values of *A* imply that there are more active nodes, which can serve as sources of activity, but fewer resting nodes, which can still be activated. Mathematically, we can capture the ensuing dynamics by the differential equation

$$\frac{dA}{dt} = \underbrace{-\frac{1}{\tau}A}_{\text{inactive}} + \underbrace{pz\left(1 - A\right)\frac{1}{\tau}A}_{\text{activation}}\tag{1}$$

The system approaches a dynamical equilibrium, *dA*/*dt* = 0. Setting the left-hand side of Equation (1) to zero reveals two qualitatively different steady states. In one of them, the activity dies out, *A*<sub>0</sub> = 0, whereas in the other, a stable level of activity *A*<sub>0</sub> = 1 − 1/(*pz*) is maintained. Generally, a system will only approach steady states that are stable to small perturbations. Stability analysis (Guckenheimer and Holmes, 1983) reveals that the stable state is the quiescent state for *pz* < 1 and the active state for *pz* > 1, such that the activity is non-negative. We can thus say that we observe an active state of the network when the connectivity *z* exceeds *z*<sup>∗</sup> = 1/*p*.
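The two steady states of Equation (1) can be checked numerically. The following minimal sketch integrates Equation (1) with a simple Euler scheme and compares the long-time activity with the stable steady state stated above. The function names, the step size, the initial activity, and the value *p* = 0.1 are illustrative assumptions, not values taken from the text.

```python
def steady_state_activity(p, z, tau=1.0, a_init=0.1, dt=0.01, steps=20000):
    """Euler-integrate Equation (1) and return the activity at the end of the run."""
    a = a_init
    for _ in range(steps):
        da_dt = (-a + p * z * (1.0 - a) * a) / tau
        a += da_dt * dt
    return a


def predicted_activity(p, z):
    """Stable steady state of Equation (1): 0 for pz <= 1, else 1 - 1/(pz)."""
    return max(0.0, 1.0 - 1.0 / (p * z))


p = 0.1  # assumed transmission probability, so that z* = 1/p = 10
for z in (5, 10, 20, 40):
    # At z = z* the simulated value is still decaying slowly toward zero.
    print(z, steady_state_activity(p, z), predicted_activity(p, z))
```

Sweeping *z* in this way traces out the order parameter as a function of the control parameter, which is the kind of curve a phase diagram displays.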

Plotting the level of activity observed in the system's long term behavior, *A*<sub>0</sub>, as a function of the connectivity *z* reveals a typical *phase diagram* (**Figure 1**). In this context, the connectivity *z* is the control parameter, and the activity *A*<sub>0</sub> is the order parameter. The diagram shows a subcritical quiescent phase and a supercritical active phase. The critical connectivity *z*<sup>∗</sup> = 1/*p* corresponds to a phase transition between these two phases. We note that even in this simple model the relation between phase transitions and symmetry breaking is evident. In the quiescent phase, all nodes are in the same (inactive) state, whereas this symmetry is broken in the active phase. In the quiescent phase, the system is completely static, whereas in the active phase, the individual nodes are activated stochastically, and seemingly chaotically. The phase transition point therefore marks the edge of chaos.

The analytical solution only holds for the limit of large networks. In small networks, network activity is difficult to sustain near the critical point, where the sustained network activity is so low that it easily dies out by chance. The abrupt change at the critical point is smoothed out and the observed phases are no longer perfectly distinct (**Figure 1**). Analogous effects are seen in any finite system. In a large but finite system such as the brain, one would therefore not expect to find a single isolated point that expresses perfect criticality, but rather a small region that shows properties of critical systems in an approximate sense.

We emphasize that the simple model discussed here only exhibits one type of phase transition, the onset of activity. Additionally, there can be many other types of phase transitions. Another example that is commonly encountered in models, and may be more relevant for neural information processing, is a transition that marks the onset of synchronous (i.e., correlated) activity in the network (e.g., Meisel and Gross, 2009; Yang et al., 2012). One implication of the presence of such additional transitions is that labels such as subcritical and supercritical can only be applied with respect to a certain transition. For instance, a system that shows activity, but not correlated activity, can be considered supercritical with respect to the activity transition, but subcritical with respect to the synchronization transition.

## **3. PROPERTIES OF PHASE TRANSITIONS**

In the following, we discuss how phase transitions and critical dynamics can be detected in experiments. The most direct evidence for a phase transition is certainly provided by a phase diagram (**Figure 1**) (Dickman et al., 2000). In this type of diagram, the existence of a phase transition can be seen directly in the response of the order parameter to variations of the control parameter. However, for creating such a diagram the control parameter must be accessible (controllable) in the experimental setting. For instance, it is difficult to imagine an experiment where the connectivity of the brain (our parameter *z* from above) can be varied *in vivo*. Yet, it might be possible to control the effective connectivity (e.g., *pz*) by pharmacological interventions in *in vitro* experiments. Although some studies discussed below report results for such modifications of control parameters, most of the evidence for criticality comes from experiments that show criticality indirectly by the observation of certain hallmarks. In this section, we discuss these hallmarks of criticality in the context of the simple model introduced above.

One commonly used hallmark comes from the theory of branching processes (Harris, 1963). Suppose we could observe only a tiny portion of the system, which only rarely lights up with (possibly spontaneous) activity. Under the assumption that the connections are sufficiently short-ranged to be within our observation window, we can still estimate the number of secondary activations that a given focal activation triggers, the so-called *branching parameter* σ. In the subcritical phase, this number is on average less than one. In the supercritical phase, the dynamics persists in the system and thus there must be on average as many activations as deactivations, which implies a branching parameter of one. In a spatially extended system that is not too far in the supercritical phase, a branching parameter greater than one may be observed over short times in response to an artificial excitation. Therefore, the observation of a branching parameter σ = 1 in response to an artificial excitation in a sufficiently quiescent system may be seen as evidence for criticality. However, compared to other hallmarks, this evidence is relatively weak because a branching ratio of one does not necessarily imply critical dynamics, but is also observed in supercritical states.
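For the simple model, the branching parameter in response to an excitation of a quiescent network can be estimated by direct sampling. The sketch below assumes that each of the *z* outgoing links of a focal node transmits independently with probability *p*, and that all target nodes are inactive, so that the expected number of secondary activations is *pz*. The function name, seed, and parameter values are hypothetical, and the saturation effects that hold σ at one in the active phase are deliberately left out.

```python
import random


def estimate_branching_parameter(p, z, n_trials=20000, seed=1):
    """Average number of secondary activations triggered by one active node.

    Assumes a quiescent network: every one of the z postsynaptic nodes is
    available and is activated independently with probability p.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        total += sum(1 for _ in range(z) if rng.random() < p)
    return total / n_trials


sigma_sub = estimate_branching_parameter(p=0.1, z=5)    # should lie near p*z = 0.5
sigma_crit = estimate_branching_parameter(p=0.1, z=10)  # should lie near p*z = 1.0
print(sigma_sub, sigma_crit)
```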

Another hallmark of criticality is related to the response of the system to external stimuli. In our model, the sensitivity to inputs (the *dynamic range*) is maximal at criticality. This can be shown by considering the temporal development of a small perturbation δ. The dynamical evolution of the perturbation of the steady state *A*<sub>0</sub> is given by inserting *A*<sub>0</sub> + δ into Equation (1), which yields

$$\begin{aligned}
\frac{d\delta}{dt} &= \left(-1 + pz\left(1 - (A_0 + \delta)\right)\right) \frac{1}{\tau} (A_0 + \delta) \\
&= -\frac{1}{\tau} \left(\pm (1 - pz)\delta + pz\delta^2\right) \\
&\approx -\frac{1}{\tau} \left(\pm (1 - pz)\delta\right),
\end{aligned}$$

where the plus applies if *z* < 1/*p* and the minus otherwise. The approximation in the last line holds for sufficiently small perturbations. The resulting equation is a linear differential equation, which implies that after the perturbation the system relaxes rapidly (exponentially) back to *A*<sub>0</sub>. In this case, the half-life of a sufficiently small perturbation is |τ ln (2)/(1 − *pz*)|. Any memory of the perturbation therefore disappears quickly. As the system approaches criticality, *pz* → 1 and the half-life increases. At criticality, *z*<sup>∗</sup> = 1/*p*, such that the first order term 1 − *pz* vanishes, and the approximation leading to the third line no longer holds. In this case, only the quadratic term remains and the system relaxes algebraically (as a power law in time, rather than exponentially) back to the state *A*<sub>0</sub>, which means that the memory of the perturbation is retained for a long time. This property is often called *critical slowing down*. Let us emphasize that critical slowing down is not only a property of the specific model considered here, but a general feature of critical phase transitions in the dynamics of a system (Scheffer et al., 2009). It lends critical systems a long memory and may play an important role for their computational properties.
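Critical slowing down can be made concrete by measuring how long a small perturbation takes to halve. The sketch below integrates Equation (1) from *A*<sub>0</sub> + δ and compares the measured half-life with the linear estimate |τ ln(2)/(1 − *pz*)|. The function name, the perturbation size `delta0`, the Euler step, and the parameter values are assumptions chosen for illustration.

```python
import math


def relaxation_half_life(p, z, delta0=0.01, tau=1.0, dt=0.001, max_time=1000.0):
    """Time for a perturbation delta0 around the steady state A0 to shrink to delta0/2."""
    a0 = max(0.0, 1.0 - 1.0 / (p * z))  # stable steady state of Equation (1)
    a = a0 + delta0
    t = 0.0
    while abs(a - a0) > delta0 / 2.0 and t < max_time:
        a += (-a + p * z * (1.0 - a) * a) / tau * dt
        t += dt
    return t


p = 0.1  # assumed transmission probability, so that z* = 1/p = 10
for z in (5, 8, 9.5, 10):
    pz = p * z
    # Linear estimate; it diverges at pz = 1, where only the quadratic term acts.
    linear = math.log(2) / abs(1.0 - pz) if pz != 1.0 else float("inf")
    print(z, relaxation_half_life(p, z), linear)
```

At *pz* = 1 the measured half-life stays finite but scales inversely with the perturbation size (about τ/δ for this equation), so ever smaller perturbations take ever longer to halve, in contrast to the perturbation-independent half-life of the exponential case.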

In the example system, we can also understand the emergence of memory in the critical state on a microscopic level. Consider a situation in which we artificially activate a small number of neurons. We now ask how long the memory of this activation lasts in the time evolution of the system. Let us first consider a system in a subcritical state. Here, we already know that the branching parameter is less than one and hence the initially activated neurons will activate only a smaller number of neurons such that the signal from the initial activation quickly (i.e., exponentially) decreases over time. Consider now a supercritical state. We recognize that the branching parameter is equal to one, so we expect that the initial artificial activation of the small group of neurons triggers a cascade that stays on average roughly constant in size. We could therefore naively expect that the memory of the activation persists in the system. However, the truth is a bit more subtle: While the cascade indeed persists, some of the neurons involved in the cascade would have been activated anyway due to the ongoing self-sustained activity of the system. Thus, the difference between the artificially excited system and a system where the artificial activation did not take place shrinks in time; again the memory of the activation is lost exponentially. By contrast, the critical system already has a branching parameter of one, allowing the cascade that we have set off to persist for a long time, and it has also negligible background activity, allowing the information transmitted by the cascade to persist without interference.

The slowing down leads to another observable characteristic of critical systems, called 1/*f*-noise, which is commonly observed in nature (Hausdorff and Peng, 1996). If a critical system is constantly perturbed by weak random inputs, the dynamics is a superposition of a multitude of slowly decaying (algebraic) responses. The power spectrum of this noisy response then follows a power-law, which means that the energy dissipated at frequency *f* is approximately 1/*f*<sup>α</sup>, where α is some constant. While every critical system should exhibit power-law noise, the observation of this type of noise alone does not constitute a proof of criticality, as it is also observed in certain other processes (Cencini et al., 2000; Bédard et al., 2006).

Power-laws appear in critical systems also in a different way. Loosely speaking, phase transitions occur at the points where the line between macroscopic and microscopic dynamics is blurred, e.g., where avalanches initiated on the microscopic level become so large that they affect the dynamics on the macroscopic level. For several reasons this is only possible when the size distribution of avalanches obeys a power-law (Levy and Solomon, 1996). Let us once again consider the simple model proposed above. Since the branching ratio is one independent of the size of the current avalanche, the probability distributions describing the cascades of events downstream from an activated node are independent of whether the node is the initial node that sparked the avalanche or a node that is only activated as the result of a long sequence of events. This is one aspect of the self-similarity found in critical processes (Marković and Gros, 2014). The cascade of subsequent activations caused by a given node is statistically identical to the cascades of subsequent activations triggered by the activated nodes. This in turn causes power-laws to appear in many observables of the system. Thus, criticality is generally associated with the appearance of power-law distributions of the form *f*(*x*) = *Cx*<sup>−α</sup> for many different observables.
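The contrast between critical and subcritical avalanche statistics can be sketched with a plain branching process. The code below assumes a large, quiescent network in which avalanches never collide and no node is revisited, so that every active node independently activates each of its *z* targets with probability *p*. The function names, seed, size cap, and parameter values are hypothetical choices for illustration.

```python
import random


def avalanche_size(p, z, rng, max_size=10000):
    """Total number of activations in one cascade started by a single node."""
    active, size = 1, 1
    while active > 0 and size < max_size:
        # Each active node tries to activate z targets, each with probability p.
        offspring = sum(1 for _ in range(active * z) if rng.random() < p)
        size += offspring
        active = offspring
    return size


def tail_fraction(p, z, threshold, n_avalanches=2000, seed=2):
    """Fraction of simulated avalanches whose size is at least `threshold`."""
    rng = random.Random(seed)
    sizes = [avalanche_size(p, z, rng) for _ in range(n_avalanches)]
    return sum(1 for s in sizes if s >= threshold) / n_avalanches


critical_tail = tail_fraction(p=0.5, z=2, threshold=100)      # p*z = 1
subcritical_tail = tail_fraction(p=0.25, z=2, threshold=100)  # p*z = 0.5
print(critical_tail, subcritical_tail)
```

In the critical case a sizeable fraction of avalanches reaches large sizes, as expected for a heavy-tailed distribution, whereas in the subcritical case such avalanches are practically absent. A careful power-law fit would need many more samples and a proper statistical test, which this sketch does not attempt.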

The observation of power-laws in multiple observables with consistent exponents constitutes relatively strong evidence for criticality. Although it is often pointed out that these relationships could also arise in non-critical systems, this criticism is much weaker for microscopic observables than for macroscopically recorded power-law noise. We note that some of the examples often quoted as spurious power-laws are themselves mistaken. For instance, it is often held that the Barabási-Albert model (Barabási and Albert, 1999) leads to networks with a power-law degree distribution but does not correspond to a critical state. However, the Barabási-Albert model is indeed in a critical state that marks the transition between exponential and star-like networks (Krapivsky and Krioukov, 2008).

A consequence of the blurring of the line between global and local scales, and part of the reason for the appearance of power-laws, is the so-called *scale independence* (Goldenfeld, 1992). This phenomenon captures the observation that critical systems show similar patterns at all scales. For example, the shapes of avalanches of any size resemble each other (Marković and Gros, 2014). As one approaches criticality, correlations occur between distant parts of the system, which means that external perturbations or spontaneous fluctuations can influence large parts of the system. For instance, stimulations induce small avalanches already in the subcritical region of our simple model. As we slowly increase the connectivity, these avalanches get bigger and bigger and reach the scale of the system at criticality. In this case, the avalanches occur on all scales up to the system size, which implies that the typical length of correlations diverges. If we increase the connectivity further, activity in the system continues to increase, making simultaneously occurring avalanches likely. As a node cannot be activated twice at the same time, one of the avalanches effectively stops whenever two avalanches reach the same node. These collisions between the avalanches decrease long-ranged correlations and destroy the divergence of the correlation length.

In summary, criticality occurs at phase transitions for which the order parameter changes non-smoothly but continuously with the control parameter. Proper phase transitions are an idealization only expected in the infinite size limit—in real systems, the transition is less well defined and smoothed out over a finite interval. At criticality, as well as in its proximity, the system dynamics exhibits critical slowing down, and the distributions of observables and fluctuations follow power-laws. These hallmarks of criticality lend critical systems their optimal information processing and storage capabilities, reviewed by Shew and Plenz (2013). Critical slowing down allows memories of dynamical patterns to be retained for a long time (Beggs and Plenz, 2004; Haldeman and Beggs, 2005; Chialvo, 2006; Chen et al., 2010; Kello et al., 2010). Furthermore, criticality maximizes the dynamic range of the response to inputs (Kinouchi and Copelli, 2006; Shew et al., 2009) and the variability of the neuronal response (Shew et al., 2011; Yang et al., 2012; Meisel et al., 2013). As scale-independent systems naturally show both small and large activity patterns, inputs can be processed in parallel, and integrated over the whole system (Gutiérrez et al., 2011).

## **4. SELF-ORGANIZATION TO A CRITICAL BRAIN STATE**

To observe criticality, a control parameter has to be tuned to its critical value. In a variable system such as the brain, and without an external observer, critical dynamics can only be conserved by *self-organized criticality* (Bak, 1996), a constant tuning of the control parameter by a decentralized internal mechanism. For many systems with a critical phase transition, *self-organized criticality* is easily implemented by a mechanism that increases the control parameter in the subcritical phase and decreases it in the supercritical phase.

We use the term *control parameter* also in self-organized critical systems although the control parameter is no longer controlled externally, but by the system itself. To adjust the control parameter appropriately, self-organizing mechanisms have to evaluate the current phase of the system from an internal perspective. In the nervous system, the self-organization probably relies on the dynamics of single neurons or synapses, and not on a global regulation, e.g., by the endocrine system, because evidence for critical brain dynamics is especially prominent in *in vitro* studies, where the neurons are separated from the rest of the brain that could act as global integrator. A central challenge is therefore to explain how individual neurons or synapses can infer the phase from local observations.

To decide whether the system is in the sub- or supercritical phase, the self-organizing mechanism has to evaluate the global mean of the order parameter. However, as the information accessible to a single neuron or synapse is necessarily local, it is reasonable to expect that the global mean is approximated by a temporal mean over the dynamics (Bornholdt and Rohlf, 2000). To allow for an estimation of the global mean based on a temporal integration of local observations, the change in the control parameter has to be considerably slower than the dynamics of the system. For example, in the sandpile model presented in the introduction (Bak et al., 1988), criticality is only reached when the next sand grain is dropped after any dynamics on the pile has ceased. Self-organized critical systems show in general a *time-scale separation* between changes in the system structure and changes in the dynamics of the system (Vespignani and Zapperi, 1998).

Theoretical arguments seem to suggest that self-organized criticality can be fully realized only in systems in which the control parameter is conserved (Dickman et al., 2000). In the brain, which is constantly subject to external input, the self-organization never precisely reaches the critical point (Bonachela et al., 2010). However, the characteristics of criticality, such as computational capabilities and sensitivity, are already increased in the proximity of the critical point. Therefore, we use the term self-organized criticality to refer to neural networks that are either right at or sufficiently close to the critical point, the latter being a state that has previously been called *self-organized quasi-criticality* (Bonachela and Muñoz, 2009).

For the brain with its highly hierarchical and modular structure, it is likely that critical points generalize to critical regions (Griffiths phases) (Moretti and Muñoz, 2013). This relaxes the requirements on the tuning of the control parameter, which could also be shown in a realistic model of neuronal network dynamics (Rubinov et al., 2011). In modular systems, the global phase transition is spread out because, for a certain range of the control parameter, some modules are still in the subcritical phase, while other modules are already in the supercritical phase. Properties arising at criticality, such as power-law distributions, large dynamic range and slowing down of the dynamics, are approximately observed for any value of the control parameter in this critical range. Dynamical states similar to criticality are therefore likely whenever the self-organizing mechanism tunes the control parameter to the proximity of this critical region.

Self-organization of plausible neural models to criticality was demonstrated in a number of papers (e.g., Bornholdt and Rohlf, 2000; Levina et al., 2009; Meisel and Gross, 2009; Droste et al., 2013). However, many questions remain. First, we do not know which parameters in the brain are tuned to reach criticality. On a microscopic scale, synaptic conductances seem to be a likely candidate. As a substantial change in the synaptic conductances is only observed after several spikes, the plasticity acts on a slower time-scale than the neuronal activity. This provides the time-scale separation required for a robust tuning of the system to critical states. A change in the synaptic conductances could directly influence the excitability of the synapse (basically the parameter *p* in our simplified model), which is sufficient to tune the system to criticality.

A change of individual synaptic weights, which translates into an overall change in excitability, is only the simplest possible scenario. The excitability can also be changed directly by mechanisms of homeostatic plasticity (Stewart and Plenz, 2008; Droste et al., 2013). Other possible targets for sophisticated self-organizing plasticity mechanisms are changes in the level of micro-scale modularity, or of the heterogeneity in the system. These factors, which we have ignored so far, affect the location of critical points and can thus be used to tune the system to criticality. The simple picture, in which exactly one global control parameter is tuned, is thus misleading. In reality, the microscopic changes in the system are likely to affect tens or hundreds of network level quantities at the same time, which all act as possible control parameters for phase transitions.

Another open question is to which critical state the network organizes. While we have so far focused on the phase transition at the onset of activity, some evidence suggests the onset of synchrony as a more likely candidate. Some insights into this question can be gained based on the relation between the nature of the transition at which the system resides and efficient coding of information. For the activity transition considered so far, the optimal computational properties are likely to be realized if the information is presented in a rate code, where the activity of a node represents directly an input. To achieve optimal information representation for a synchronization code, where an input is represented by synchronous activity, the system needs to be tuned to the phase transition at the onset of synchronization.

In a system with many parameters, the term *critical point* is misleading. From a mathematical perspective, the critical point is a bifurcation point of the macroscopic dynamics, and as such is characterized by its codimension, which is one in this case (Kutsnetsov, 1998). What this means is that the critical point is actually a manifold which has one dimension less than the embedding parameter space. So, in a one-dimensional parameter space, i.e., when only one parameter is varied, the critical point appears as a (zero-dimensional) point. However, in a two-dimensional parameter space, where a second parameter is varied, we find a (one-dimensional) line of critical points. In a three-dimensional parameter space, criticality occurs on a surface, and so on.

In complex networks, there is an abundance of parameters that affect the dynamics, including for instance the mean degree and mean outgoing link weights, which are often considered, but also clustering coefficients, modularity, and abundances of larger motifs. The precise number of parameters that play a role in neural criticality is hard to determine. However, let us point out that the one dimensional picture (**Figure 1**), which is usually drawn, is particularly misleading. Consider that in one dimension the probability that two different phase transitions occur at the same parameter value is of measure zero. However, in two parameter dimensions, each phase transition occurs on a critical line in the parameter space, and crossings between the lines are likely. Thus, if there are two processes of plasticity that tune the system to two different critical states, there is generally a possibility to observe both forms of criticality at the same time. Some evidence for such double criticality was already observed by Yang et al. (2012) and Meisel et al. (2013). This can potentially explain why characteristics of both activity (e.g., Beggs and Plenz, 2003, 2004) and synchronization (e.g., Linkenkaer-Hansen et al., 2001; Kitzbichler et al., 2009) phase transitions have been observed in experiments.

So far we have talked about *the brain* as a critical system. However, there is at least the possibility that different regions of the brain are tuned to criticality separately, and perhaps to different phase transitions. Working at the activity transition seems particularly advantageous for the detection of weak stimuli, as it allows a single spike to trigger a cascade of activity. On the other hand, working at the synchronization phase transition appears advantageous for cognitive processes.

## **5. EXPERIMENTAL EVIDENCE FOR THE CRITICALITY HYPOTHESIS**

The demonstration of self-organized criticality in the brain is controversial. Several experimental studies support the criticality hypothesis, while others interpret their results as contradicting it. In this section, we discuss common measurements used to support criticality in the brain and stress their potential shortcomings.

The best proof of criticality would be provided by a phase diagram as in **Figure 1**, where the critical point appears as a kink in the curve. However, in self-organized critical systems, the control parameter is set by the dynamics itself. If the control parameter is experimentally perturbed away from its critical value, it starts to return to that value, such that it cannot be set freely. However, if the return is sufficiently slow, phase diagrams can be obtained approximately by monitoring a suitable order parameter while the system relaxes to the critical state.

In recent studies, most evidence for the criticality hypothesis in experiments and simulations is based on power-laws. As power-laws are expected in virtually every critical system, the existence of power-laws is a fundamental prerequisite for criticality, but by itself not sufficient to prove criticality. Power-laws have been explained alternatively by different non-critical mechanisms (Touboul and Destexhe, 2010; Marković and Gros, 2014), such as filtered neural activity (Bédard et al., 2006; Bédard and Destexhe, 2009), noise (Bonachela and Muñoz, 2009; Miller et al., 2009), or noisy feed-forward structures amplifying small perturbations (Benayoun et al., 2010). For an educational review on the topic see Beggs and Timme (2012).

A major concern is the inference of power-law behavior from data. When plotted on a log-log plot (**Figure 3**), power-laws follow a straight line with a slope determined by their critical exponent α. However, visual inspection of a diagram can lead to false positives and it has been pointed out that conventional goodness-of-fit tests are ill suited for power-laws (Newman, 2005). The identification of power-laws is thus delicate and demands advanced fitting procedures because power-laws are difficult to differentiate from other heavy-tail distributions (Clauset et al., 2009; Klaus et al., 2011; Marković and Gros, 2014). Furthermore, power-laws are truncated in systems of finite size (Bonachela et al., 2010) and are influenced by subsampling (Priesemann et al., 2009; Ribeiro et al., 2010; Priesemann et al., 2014; Ribeiro et al., 2014).
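
As an illustration of what such a fitting procedure can look like, the following sketch uses the standard maximum-likelihood estimator for a continuous power-law tail together with a Kolmogorov-Smirnov distance. It is in the spirit of the likelihood-based approach discussed by Clauset et al. (2009), not a reproduction of their full method. It assumes that the lower cutoff `x_min` is known, and the synthetic data and function names are illustrative.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n samples from p(x) ~ x**(-alpha), x >= x_min (inverse transform)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


def fit_power_law(data, x_min):
    """Maximum-likelihood exponent and its standard error for x >= x_min."""
    tail = [x for x in data if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha = 1.0 + n / log_sum
    return alpha, (alpha - 1.0) / math.sqrt(n)


def ks_distance(data, alpha, x_min):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in data if x >= x_min)
    n = len(tail)
    distance = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and just after x.
        distance = max(distance,
                       abs(model_cdf - i / n),
                       abs(model_cdf - (i + 1) / n))
    return distance
```

A small Kolmogorov-Smirnov distance only indicates that a power-law is compatible with the data. Fitting the same estimator to exponentially distributed data also returns an exponent, but with a clearly larger distance, which is why a comparison against alternative heavy-tailed distributions remains necessary.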

Most experimental and numerical studies on self-organized criticality concentrate on the identification of *neuronal avalanches* (Beggs and Plenz, 2003), i.e., bursts of activity that spread through the network and are predicted to follow power-law distributions in certain critical states (e.g., Harris, 1963; Eurich et al., 2002; Larremore et al., 2012). As the precise network topology is often not known in experimental observations, events are considered as part of the same avalanche if they occur in temporal and spatial proximity. This is justified in systems without long-range connections. In this case avalanches form local wave-like structures. If long-range connections are present it is difficult to assign observed activity to a particular avalanche. Avalanches with power-law size distribution can then still be present in the system although no local outbreaks that follow a power-law distribution are detected, which may explain why wave-like activity propagation is for example not observed in acute slices (Stewart and Plenz, 2006).

If the spontaneous activity and the stimulation rate are low, which is the case in most models, one avalanche is temporally separated from the next. In this case, the size of the avalanche is defined as the number of neurons activated by the initial stimulation. In experiments, the definition is not straightforward as the time-scale separation between dynamics initiation and dynamics progression is less clear (Shew et al., 2009; Ribeiro et al., 2010; Priesemann et al., 2014). Instead, avalanches are declared separated if the dynamics is interrupted for at least one pre-defined time bin. The dynamics is evaluated based on specific events seen in multi-electrode recordings, for example spikes or strong negative deflections of the local field potential (LFP). Resulting event time series are binned and a sequence of consecutive active bins is defined as an avalanche, see **Figure 4**.
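
The binning procedure can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions: events are given as a flat list of time stamps pooled over all electrodes, the avalanche size is taken to be the number of events, and the function name `extract_avalanches` is hypothetical.

```python
import math


def extract_avalanches(event_times, bin_width):
    """Split pooled event times into avalanches.

    An avalanche is a maximal run of consecutive non-empty time bins;
    it ends as soon as at least one empty bin follows.
    Returns a list of (size, duration_in_bins) tuples, where size is the
    number of events in the avalanche.
    """
    counts = {}
    for t in event_times:
        b = math.floor(t / bin_width)
        counts[b] = counts.get(b, 0) + 1

    avalanches = []
    size = duration = 0
    previous_bin = None
    for b in sorted(counts):
        if previous_bin is not None and b != previous_bin + 1:
            # An empty bin separates this event from the previous avalanche.
            avalanches.append((size, duration))
            size = duration = 0
        size += counts[b]
        duration += 1
        previous_bin = b
    if duration > 0:
        avalanches.append((size, duration))
    return avalanches
```

Applying the function to the same events with two different values of `bin_width` shows directly how the choice of bin size changes the result: a wide bin merges events that a narrow bin assigns to separate avalanches.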

In models, in which a time-scale separation between dynamics initiation and dynamics progression is given, the avalanche size distribution is independent of the chosen threshold and bin size (Priesemann et al., 2014), which is a consequence of the scale independence of critical processes. In experimental data, however, avalanche size distributions depend on the chosen event threshold and on the bin size used for the binning process (Pasquale et al., 2008; Touboul and Destexhe, 2010; Priesemann et al., 2013). The avalanche size distribution changes with the bin size when subsampling introduces artificial pauses in single avalanches and when the external input is large enough to initiate multiple avalanches simultaneously (Priesemann et al., 2014). Most studies use a bin size that fits the time that the neural signal takes to spread between electrodes (Beggs and Plenz, 2003; Stewart and Plenz, 2006; Pasquale et al., 2008), and some studies also report power-law fitting for different bin sizes (Hahn et al., 2010; Tetzlaff et al., 2010). As expected at criticality, neuronal avalanches show further scale-free properties. Importantly, the avalanche distributions overlap when rescaled by the number of recording electrodes (finite-size scaling, Klaus et al., 2011; Yu et al., 2013). Results are furthermore independent of the recording electrode number and distance (Beggs and Plenz, 2003; Hsu et al., 2008; Pasquale et al., 2008; Tetzlaff et al., 2010) and long-range spatial and temporal correlations can be shown (Petermann et al., 2009; Hahn et al., 2010; Yu et al., 2013).

For LFP-recordings, critical neuronal avalanche distributions are reported for various animals and brain regions, both *in vitro* (Beggs and Plenz, 2003, 2004; Mazzoni et al., 2007; Pasquale et al., 2008) and *in vivo* (Petermann et al., 2009; Hahn et al., 2010). Neuronal avalanches can be formed by nested oscillations (slices and anesthetized rat; Gireesh and Plenz, 2008) and the variability in the synchronization is maximal (Yang et al., 2012). The critical exponents of the avalanche size distributions (e.g., Beggs and Plenz, 2003; Hahn et al., 2010; Klaus et al., 2011; Friedman et al., 2012) fit theoretical predictions (e.g., Harris, 1963).

When spikes are evaluated, the picture is less consistent. Power-law distributed avalanches were not observed in awake animals (Bédard et al., 2006; Dehghani et al., 2012; Priesemann et al., 2014), which is consistent with theoretical models that predict criticality in a resting state. Indeed there is some evidence that the brain's critical state deteriorates during wakefulness and recovers during sleep (Meisel et al., 2013, compare also Priesemann et al., 2013). In anesthetized animals or cultures, power-law distributions for the spiking activity can be observed (Hahn et al., 2010; Ribeiro et al., 2010), but most recordings do not support power-law fitting (Bédard et al., 2006; Hahn et al., 2010; Ribeiro et al., 2010; Dehghani et al., 2012). Avalanche distributions as observed for spiking activity can however be reproduced by subsampling models implementing self-organized criticality with increased external input and tuned to a slightly subcritical regime (Priesemann et al., 2014). An alternative explanation for noncritical avalanche distributions may be recordings that are biased toward a specific subset of neurons, for example if cell types with particularly clear spike shapes in the extracellular signal are preferentially identified. Furthermore, it is questionable whether we can expect hallmarks of criticality if just a few neurons are recorded simultaneously, because criticality is intrinsically a network effect. In many real-world systems, the scale-independence breaks down if we get too close to the level of single dynamical units.

Apart from properties of the critical state, implications of the self-organization to criticality can be examined. For instance, models of self-organized criticality reproduce developmental phases of cell cultures. Starting from an unconnected state, the temporal development of avalanche distributions in neuronal cultures can be fitted by models of self-organized criticality (Tetzlaff et al., 2010). Also slices from newborn rats of different ages show a temporal development from subcritical to critical dynamics (Gireesh and Plenz, 2008; Stewart and Plenz, 2008). Organotypic cell cultures can develop to subcritical, critical or supercritical states (Pasquale et al., 2008; Tetzlaff et al., 2010). Intriguingly, only the critical cultures show scaling of the mean temporal profile of avalanches, i.e., the data collapse when normalized appropriately (Friedman et al., 2012). The scaling also predicts the relationship between exponents, which is a strong indicator of criticality (Friedman et al., 2012).

Recent results suggest that also in humans, brain dynamics is close to criticality, yet slightly subcritical (Priesemann et al., 2013, 2014), a possibility first raised by Pearlmutter and Houghton (2009). Resting state dynamics from human brains reveal events analogous to neuronal avalanches whose dynamics fluctuate closely around criticality (EEG: Allegrini et al., 2010; fMRI: Tagliazucchi et al., 2012; MEG: Shriki et al., 2013; EEG and MEG during rest and tasks: Palva et al., 2013). The resulting critical exponents correlate with the critical exponents of the long-range temporal correlations (Palva et al., 2013). Imaging data furthermore suggest power-law noise because activity fluctuations (e.g., EEG: Novikov et al., 1997; ECoG: Miller et al., 2009) and correlation fluctuations (e.g., EEG and MEG: Linkenkaer-Hansen et al., 2001; fMRI and MEG: Kitzbichler et al., 2009) follow power-laws.

Correlations can also be used to construct functional connectivity maps, whose power-law distributed properties might relate to self-organized criticality (e.g., Eguiluz et al., 2005; Bassett et al., 2006; Expert et al., 2010; Lee et al., 2010; Van De Ville et al., 2010). For example, the duration distribution of functional connections in EEG recordings follows a power-law, which is stable over several states of consciousness (awake, loss of consciousness due to anesthesia, and recovery) and frequency bands (Lee et al., 2010).

The criticality hypothesis predicts that sufficiently strong perturbations of the network dynamics should eliminate the power-laws found in the previously cited studies. In the following, we discuss studies showing that observed hallmarks of criticality vanish in response to interventions that change the network dynamics. Such deviations from criticality, and especially the subsequent return of the network to a critical state, strongly support criticality, since alternative explanations of power-laws based on low level features, such as noise and filtering of neuronal tissue, should be independent of the network dynamics.

Hallmarks of criticality are apparently destroyed during epileptic seizures. Epileptic dynamics shows hallmarks of supercritical states, and destroys power-laws observed in healthy brains (Hobbs et al., 2010; Meisel et al., 2012). If the network adapts to the supercritical state during the seizure, this may explain reduced activity and a smaller critical exponent after the seizure (Hsu et al., 2008). A self-organized criticality model suggests a relation between epileptic activity and decreased neuronal connectivity (Meisel et al., 2012). While it is thus tempting to equate epileptic seizures with supercritical dynamics, care has to be taken as seizures could very well be the result of another, overriding mechanism that is not captured by current models of neural self-organized criticality.

In contrast to epileptic seizures, pharmacologically induced variations in activity do not always destroy power-law distributed neuronal avalanches. In acute slices, the level of dopamine that implies maximal activity coincides with critical avalanche size distributions with a critical exponent of −1.5, while more or less dopamine preserves the power-law distribution, but shows steeper critical exponents (Stewart and Plenz, 2006). Steeper exponents reduce the occurrence of large avalanches and spatial correlations (Stewart and Plenz, 2006). Steeper critical exponents are as well observed under reduced spontaneous activity due to pharmacological interventions with a dopamine D1 receptor antagonist (Gireesh and Plenz, 2008), but the same antagonist can also suppress neuronal avalanches (Stewart and Plenz, 2006). The application of acetylcholine, which increases the spontaneous activity, results in exponential avalanche distributions (Pasquale et al., 2008). Strong pharmacological interventions can furthermore change the dynamical state of neural networks via alterations of excitation or inhibition. As expected from the idea that balanced excitation and inhibition are required for critical brain dynamics, this eliminates the observed hallmarks of criticality (**Table 1**).

The observation of variable exponents is interesting, as the critical exponents of phase transitions are usually independent of system features. This universality typically holds broadly even across different systems. The exponent of −1.5 is a plausible result as it is characteristic of the directed percolation universality class into which many processes of activity propagation fall. Exponents with a larger absolute value are more difficult to explain. While a complex real world system can potentially exhibit such exponents, it is also plausible that what is observed here is actually the breakdown of the power-law as the system is pushed from the critical state. If this occurs, the underlying distributions return to exponential behavior and thus exhibit fewer large events. In a certain transition region around the critical state, they can therefore easily be mistaken for steeper power-laws.
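
How an emerging exponential cutoff can masquerade as a steeper power-law can be made concrete with a small numerical sketch. The functional form used here, a power-law with exponent 1.5 multiplied by an exponential cutoff at scale `cutoff`, is an assumption chosen for illustration, as is the deliberately naive straight-line fit on the log-log plot.

```python
import math


def apparent_exponent(cutoff, x_max=100, alpha=1.5):
    """Least-squares log-log slope of f(x) = x**(-alpha) * exp(-x / cutoff).

    The fit is performed on the noise-free curve for x = 1 .. x_max, which
    mimics reading an exponent off a straight line drawn through a
    finite observation window.
    """
    xs = range(1, x_max + 1)
    log_x = [math.log(x) for x in xs]
    log_f = [-alpha * math.log(x) - x / cutoff for x in xs]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_f = sum(log_f) / n
    covariance = sum((a - mean_x) * (b - mean_f) for a, b in zip(log_x, log_f))
    variance = sum((a - mean_x) ** 2 for a in log_x)
    return covariance / variance
```

With an infinite cutoff the fitted slope equals −1.5. As the cutoff moves into the observation window, the fitted slope becomes more negative, although the underlying power-law part of the distribution is unchanged.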

The premise of self-organized criticality is that the system is able to tune itself back to the critical state after moderate perturbations. This reorganization to criticality after long-lasting increases in inhibition has so far not been observed experimentally (Tetzlaff et al., 2010). Over the duration of the experiment, the network state does not adapt to decreased inhibition (Shew et al., 2009). Even after the inhibition-decreasing drug is washed out, neuronal slices take several hours to recover criticality (Shew et al., 2009). This time-scale is consistent with reorganization on a slow time-scale, for instance due to slow plasticity mechanisms such as homeostatic plasticity.

In summary, evidence for self-organized criticality is provided by critical neuronal avalanches in various animals, power-law noise in brain imaging data, scale independence and finite-size scaling. While power-laws can also be explained by alternative hypotheses, deviations from criticality and subsequent reorganization provide strong evidence for the criticality hypothesis. Perhaps the most compelling evidence is not provided by any individual study, but rather by the breadth of experimental results which provide evidence for criticality in many different systems using various approaches.

## **6. MODELS OF SELF-ORGANIZED CRITICALITY IN NEURAL NETWORKS**

Apart from direct experimental evidence, support of self-organized neural criticality comes from a range of models which show that self-organized criticality in the brain is plausible.

While simple model networks allow for analytical considerations that show general features, the more complex models convince with biological detail. Self-organized criticality can be implemented robustly in networks ranging from simple, binary units (e.g., Bienenstock and Lehmann, 1998; Bornholdt and Rohlf, 2000; Bornholdt and Röhl, 2003) up to more biologically realistic integrate-and-fire neurons (e.g., De Arcangelis et al., 2006; Levina et al., 2007, 2009; Meisel and Gross, 2009; Rubinov et al., 2011), for which dynamical switching between subcritical *down*-states and critical *up*-states can be observed (Millman et al., 2010).

Several models reproduced experimental results on criticality (e.g., De Arcangelis et al., 2006; Millman et al., 2010; Tetzlaff et al., 2010; Meisel et al., 2012). Yet, if critical models are suggested by parameter fitting based on experimental data, care has to be taken: the estimation of model parameters shows an intrinsic trend toward apparently critical values because, around the phase transition, the uncertainty of the estimate is minimized and the number of distinguishable models is greatest (Mastromatteo and Marsili, 2011).

Most numerical studies simulate a network of identical model neurons, where activity is regulated by the implemented adaptation mechanism. The network dynamics is launched by an initial stimulation of an arbitrary subset of neurons and analyzed after a period that allows the network to self-organize. The adaptation changes microscopic parameters depending on a given microscopic rule and depending on local measurements of the dynamical state. Using plausible rules, it is then observed that one order parameter of the system approaches the critical point.

If models use activity dependent rules, then the system can self-organize to the critical point at the onset of activity, where avalanche distributions follow power-laws. Inspired by the study of branching processes, these mechanisms change the probability with which activity is transmitted from one neuron to the next. This can be realized either through a regulation of the synaptic connection such as activity-dependent rewiring (Bornholdt and Röhl, 2003; Tetzlaff et al., 2010), Hebbian plasticity (De Arcangelis et al., 2006), short-term synaptic plasticity (Levina et al., 2007, 2009; Millman et al., 2010), or, in a certain parameter range, spike-timing dependent plasticity (STDP) (Rubinov et al., 2011); or through a regulation of the neuronal excitability such as internal homeostatic plasticity (Droste et al., 2013).

**Table 1 | Deviations from criticality due to unbalanced excitation and inhibition.**

If the adaptation rule is dependent on relative timing or phase differences, the system can self-organize to the critical point at the onset of synchronization. Using phase coherence as order parameter, such models self-organize to criticality if the connections are created and retracted as observed for synaptic rewiring during development, or if the strength of the connections is changed as observed for STDP (Meisel and Gross, 2009). STDP is thus a plausible mechanism that could organize a system to both activity and synchronization phase transitions.

Especially with mechanisms based on STDP, the models reach biologically plausible network structures. They self-organize from a highly connected state to a sparsely connected state, in which only few strong synapses survive (Jost and Kolwankar, 2009; Meisel and Gross, 2009). The resulting networks show power-law distributed synaptic fluctuations (Shin and Kim, 2006) and a scale-free network structure (Shin and Kim, 2006; Meisel and Gross, 2009).

Most neuron models are rather simple, but the self-organized criticality mechanisms also allow for the implementation of certain more realistic properties. Models using integrate-and-fire neurons can implement delayed synaptic transmission (e.g., Rubinov et al., 2011) and a refractory period, which is thought to hinder back-propagation of neuronal avalanches (e.g., De Arcangelis et al., 2006). In addition, integrate-and-fire neurons can also have leaky membranes (Meisel and Gross, 2009; Millman et al., 2010; Rubinov et al., 2011). Up to now, self-organized criticality has not been reported for conductance-based neuron models, probably because the network simulations are constrained by the available computational power, which limits the self-organization to criticality by restricting either network size or simulation duration. Just one study reports that a network of Hodgkin-Huxley model neurons self-organizes to a scale-free network with STDP (Shin and Kim, 2006). The observation of self-organized criticality across a wide range of neuron models is intuitive as the critical state itself should be independent of microscopic details.

Criticality and self-organized criticality can already be observed in models with very simple dynamics, such as the toy model proposed above. Nevertheless, many current models capture the complex interplay between inhibitory and excitatory neurons (De Arcangelis et al., 2006; Shin and Kim, 2006; Tetzlaff et al., 2010). The resulting dynamics then depends only on the ratio of inhibitory and excitatory connection strengths, such that a regulation of the excitatory connections is sufficient (Bienenstock and Lehmann, 1998; Shin and Kim, 2006). The exact role played by the balance of excitation and inhibition in the brain is poorly understood. It can be shown mathematically that this interplay in itself is not a prerequisite for criticality (Jost and Kolwankar, 2009). Nevertheless, the interplay between inhibition and excitation could play an important role for the system's computational capabilities in the critical state.

A crucial ingredient for robust self-organized criticality is the ability to sense the global state of the system based on local information. For instance, concerning the activity transition, every local neuron or synapse has a plasticity rule that increases or decreases the unit's activity. Self-organized criticality can only be achieved if the increase is more frequently or more strongly realized in the subcritical than in the supercritical state, and the decrease in the supercritical state. Thus, on some level, the global state has to be detectable on the local scale. Regarding activity, this is much easier for the supercritical state than for the subcritical state. Even a single neuron or synapse that experiences a high level of activity can conclude that the system is in the supercritical state with high probability. Conversely, the absence of such activity, observed locally, does not necessitate a subcritical state on the global level.

Because the subcritical state is difficult to recognize by a local mechanism, it is likely that criticality in the brain is achieved by a slow continuous increase of the control parameter, which is then overcompensated by a decisive decrease once supercritical dynamics is detected. Such an asymmetric regulation is implemented in models inspired by short-term synaptic depression, where synaptic efficiency is abruptly decreased when a spike occurs and afterwards increases exponentially until the next spike occurs (Levina et al., 2007, 2009; Millman et al., 2010), and in a model inspired by calcium-dependent development of axons and dendrites with faster dynamics in the direction of subcritical states, where the rate of dendritic retraction was twice the rate of axonal outgrowth (Tetzlaff et al., 2010). The use of asymmetric regulation was emphasized in a simpler model by Droste et al. (2013). Since the dynamics on the subcritical side is slower, the system spends more time on the subcritical side and thus, on average, appears slightly subcritical, which is consistent with experimental findings.
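The asymmetric regulation described above can be illustrated with a minimal sketch. It is not one of the cited models: it assumes a branching process whose branching ratio `sigma` plays the role of the control parameter, a fixed slow increment after each avalanche, and a multiplicative drop whenever an avalanche reaches a size cap, which here stands in for the detection of supercritical activity. All names and parameter values (`slow_gain`, `drop`, `cap`) are hypothetical.

```python
import random


def avalanche_size(sigma, cap, rng):
    """Size of one branching-process avalanche, truncated at `cap`.

    Each active unit gets two chances to activate a successor, each with
    probability sigma / 2, so the mean number of successors is sigma.
    """
    p = sigma / 2.0
    active, size = 1, 1
    while active > 0 and size < cap:
        offspring = sum(1 for _ in range(2 * active) if rng.random() < p)
        active = offspring
        size += offspring
    return min(size, cap)


def regulate(steps=4000, sigma0=0.8, slow_gain=0.001, drop=0.1,
             cap=1000, seed=1):
    """Slowly raise sigma; cut it sharply when an avalanche hits the cap."""
    rng = random.Random(seed)
    sigma, trace = sigma0, []
    for _ in range(steps):
        trace.append(sigma)
        if avalanche_size(sigma, cap, rng) >= cap:
            sigma *= 1.0 - drop   # decisive decrease (assumed to be 10%)
        else:
            sigma += slow_gain    # slow continuous increase
    return trace


if __name__ == "__main__":
    trace = regulate()
    mean_sigma = sum(trace) / len(trace)
    below = sum(s < 1.0 for s in trace) / len(trace)
    print(f"mean sigma = {mean_sigma:.3f}, fraction of time below 1 = {below:.2f}")
```

In this sketch the critical point is `sigma = 1`. Because runaway avalanches are rare below that value and frequent above it, the sawtooth trajectory is expected to sit mostly on the subcritical side, in line with the argument in the text.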

In general, we can expect that self-organized criticality in finite systems drives the system slightly into the subcritical phase. For the onset of synchronization, the local detection of synchrony implies that some degree of synchrony exists in the system such that the system must be in the supercritical state. By contrast, the absence of synchrony observed locally does not imply that the system is necessarily in the subcritical state as synchronous dynamics may already exist elsewhere in the system. Again, we expect that self-organization will drive the system to a slightly subcritical state.

The finding that even highly simplified models reproduce experimental results suggests fundamental properties of self-organizing mechanisms for which implementation details do not matter. The robustness of self-organization to criticality can increase with system size, suggesting that self-organized criticality is especially easily implemented in large neural networks (Levina et al., 2007; Rubinov et al., 2011). While each of the models discussed here can be criticized in various ways, the observation of robust self-organized criticality across a broad range of modeling assumptions and frameworks lends much credibility to the criticality hypothesis.

## **7. DISCUSSION**

If self-organized criticality is indeed fundamental for the functioning of the brain, then we expect a link between self-organized criticality and other properties of the brain. In the following, we speculate on the relation of self-organized criticality with sensory input, learning and sleep.

Most studies of self-organized criticality have so far focused on systems without input. However, to assess the impact of criticality on the brain's computational capabilities, inputs need to be considered. Based on current results, it is likely that high levels of input will cause hallmarks of criticality to disappear as internal dynamics is replaced by externally triggered activity. Inputs are considerably decreased in slice and cell cultures compared to *in vivo* preparations, and the same probably holds for anesthetized animals compared to awake animals (Beggs and Plenz, 2003; Ribeiro et al., 2010; Touboul and Destexhe, 2010). It is therefore not surprising that most evidence for criticality comes from these systems. Future experimental studies aiming to find hallmarks of criticality should therefore likewise focus on low-input situations.

In systems with strong input, the discussion of self-organized criticality is conceptually more difficult as the definition of the system now has to include a statistical model of inputs. While it is still possible to define phases and phase transitions, the phase transitions become harder to identify and critical states can easily be mistaken for supercritical states. For instance, if we add inputs to the toy model proposed above we always observe activity, even in subcritical states.

In a situation where the brain is exposed to a significant level of input, we would expect that self-tuning mechanisms fail as the retuning mechanisms start to compensate for the input by regulating activity down. The system thus departs from the state where the internally generated dynamics is critical. Indeed, evidence for critical brain dynamics decreases during prolonged periods of wakefulness, and increases after a night of sleep (Meisel et al., 2013). It is thus plausible that sleep is essential for retuning the brain to the critical state where it can operate effectively.

Both experimental studies (e.g., Bassett et al., 2006; Bédard et al., 2006; Hahn et al., 2010; Dehghani et al., 2012; Priesemann et al., 2013, 2014) and models point to self-organization to a subcritical state close to criticality. Many authors have suggested that this is a safety mechanism to prevent pathological supercritical dynamics. From a theoretical point of view, another explanation appears more plausible. Any finite real-world system, subject to noise and inputs, can only self-organize to critical states with limited accuracy. Due to limitations in the sensing of the global state, systems spend on average more time in the subcritical phase.

One property that is so far widely ignored in the literature is the dimensionality of the underlying parameter space. In simple systems that have only one control parameter, the critical state is a point. However, in general it is a manifold whose dimensionality is less than the dimensionality of the parameter space. Technically, the parameter space spanned by a complex network includes all the individual link weights and is thus extremely high-dimensional. Even if we only focus on the main macroscopic descriptors of networks, we can easily identify tens of parameters that can potentially affect the dynamics and can be affected by the plasticity. If only ten such parameters played a role in the real system, the critical state would still be a nine-dimensional manifold and thus a huge parameter space.
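The dimension counting in this paragraph can be made explicit under the simplifying assumption, not stated in the text, that criticality imposes a single smooth condition on the parameters. The symbols $f$, $p_i$, and $n$ are introduced here only for illustration, and the count holds generically, that is, where the gradient of $f$ does not vanish:

```latex
\mathcal{M}_c = \{\, (p_1,\dots,p_n) \in \mathbb{R}^n : f(p_1,\dots,p_n) = 0 \,\},
\qquad \dim \mathcal{M}_c = n - 1 ,
\qquad n = 10 \;\Rightarrow\; \dim \mathcal{M}_c = 9 .
```

With a single control parameter ($n = 1$) the same count gives a zero-dimensional set, which is the critical point of the simple systems mentioned above.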

One implication of the high dimensionality of the critical manifold is that the system can change and therefore learn while remaining in the critical state. However, the connection between learning and criticality apparently goes deeper than that. For instance, it has been claimed that self-organized criticality is essential for learning (for a review see Hsu et al., 2008), but further exploration of the detailed connection between learning and criticality seems necessary.

Another implication of the high-dimensional parameter space of complex networks is that the system can reside in multiple phase transitions at the same time. Intriguingly, recent results suggest that neural networks are organized to both the activity and synchronization phase transition (e.g., Yang et al., 2012 for organotypic slices; Meisel et al., 2013; or Linkenkaer-Hansen et al., 2001 and Kitzbichler et al., 2009 compared to Tagliazucchi et al., 2012 and Shriki et al., 2013 for brain imaging). Future modeling work should address whether neural networks can support multiple or simultaneous critical states.

A central question is whether the brain self-organizes to criticality as a single system, or as a collection of many, potentially overlapping, subsystems. While simulations consider predominantly homogeneous networks, anatomical features divide the brain into clearly defined brain areas. Several authors stress the possibility that different brain areas self-organize independently (Bédard et al., 2006; Kitzbichler et al., 2009; Meisel and Gross, 2009; Priesemann et al., 2009; Meisel et al., 2012). If this is confirmed, the next logical questions are whether all brain areas self-organize to criticality and, if so, whether they all organize to the same phase transition. Resolving these questions could greatly strengthen the link between self-organized criticality and its medical implications.

## **8. CONCLUSION**

The neural criticality hypothesis is motivated by the relationship between criticality and optimal computational properties. The hypothesis is supported by experiments that observed hallmarks of criticality for a wide range of animals from leech to humans, over several states of consciousness, and on many different experimental scales from recordings of a few neurons up to the whole brain. However, the experimental evidence is still controversial and more studies are needed to resolve major open questions and rule out alternative explanations for the observed phenomena. Based on the presently available work, we judge self-organized criticality as preferable over alternative explanations because it provides an evolutionarily motivated explanation for several otherwise disconnected observations.

In addition to experiments, the criticality hypothesis is supported by models which demonstrate that the self-organization to critical states in the brain is feasible and plausible. While these models necessarily simplify the brain to various degrees, they paint a consistent picture where essentially the same phenomenon is observed independently of specific modeling assumptions.

The criticality hypothesis is intriguing because it opens new perspectives in several areas. First, deviations from criticality could be symptomatic of diseases of the central nervous system (Meisel et al., 2012; Shew and Plenz, 2013). Understanding self-organized criticality in the brain could thus lead to new diagnostic tools, and possibly treatments. Second, connections are presently emerging which suggest that understanding criticality in the brain could provide important insights into other phenomena including sleep, learning, the root-causes of certain diseases, and a deeper understanding of information processing. Finally, several results which have been obtained in the context of self-organized criticality in the brain suggest that criticality is a prerequisite for efficient information processing in unstructured systems. This could provide a general principle that is broadly relevant beyond the field of neuroscience and could be valuable for overcoming various challenges, from understanding swarm intelligence (Ioannou et al., 2012) to constructing microprocessors that process information using randomly-deposited nano-scale components. We believe that these perspectives provide a strong incentive for more experimental and theoretical work in the area of self-organized criticality.

## **AUTHOR CONTRIBUTIONS**

Janina Hesse scanned the literature and wrote the paper; Thilo Gross wrote the paper and supervised the project.

## **FUNDING**

This work was partially supported by the EPSRC under grant no EP/K031686/1. Janina Hesse was funded by a scholarship from the École Normale Supérieure, Paris, and by grants from the Federal Ministry of Education and Research, Germany (01GQ1001A, 01GQ0901) and the Deutsche Forschungsgemeinschaft (SFB618, GK1589/1).

## **ACKNOWLEDGMENT**

Janina Hesse thanks Viola Priesemann for helpful discussions and valuable comments on the manuscript.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 May 2014; accepted: 25 August 2014; published online: 23 September 2014.*

*Citation: Hesse J and Gross T (2014) Self-organized criticality as a fundamental property of neural systems. Front. Syst. Neurosci. 8:166. doi: 10.3389/fnsys.2014.00166 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Hesse and Gross. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Marginally subcritical dynamics explain enhanced stimulus discriminability under attention

## *Nergis Tomen\*, David Rotermund and Udo Ernst*

*Institute for Theoretical Physics, University of Bremen, Bremen, Germany*

#### *Edited by:*

*Dietmar Plenz, National Institute of Mental Health, National Institutes of Health, USA*

#### *Reviewed by:*

*Paul Miller, Brandeis University, USA; John A. Wolf, University of Pennsylvania, USA; Dietmar Plenz, National Institute of Mental Health, National Institutes of Health, USA*

#### *\*Correspondence:*

*Nergis Tomen, Institute for Theoretical Physics, University of Bremen, Hochschulring 18, Bremen D-28359, Germany e-mail: nergis@neuro.uni-bremen.de*

Recent experimental and theoretical work has established the hypothesis that cortical neurons operate close to a critical state which describes a phase transition from chaotic to ordered dynamics. Critical dynamics are suggested to optimize several aspects of neuronal information processing. However, although critical dynamics have been demonstrated in recordings of spontaneously active cortical neurons, little is known about how these dynamics are affected by task-dependent changes in neuronal activity when the cortex is engaged in stimulus processing. Here we explore this question in the context of cortical information processing modulated by selective visual attention. In particular, we focus on recent findings that local field potentials (LFPs) in macaque area V4 demonstrate an increase in γ-band synchrony and a simultaneous enhancement of object representation with attention. We reproduce these results using a model of integrate-and-fire neurons where attention increases synchrony by enhancing the efficacy of recurrent interactions. In the phase space spanned by excitatory and inhibitory coupling strengths, we identify critical points and regions of enhanced discriminability. Furthermore, we quantify encoding capacity using information entropy. We find a rapid enhancement of stimulus discriminability with the emergence of synchrony in the network. Strikingly, only a narrow region in the phase space, at the transition from subcritical to supercritical dynamics, supports the experimentally observed discriminability increase. At the supercritical border of this transition region, information entropy decreases drastically as synchrony sets in. At the subcritical border, entropy is maximized under the assumption of a coarse observation scale. Our results suggest that cortical networks operate at such near-critical states, allowing minimal attentional modulations of network excitability to substantially augment stimulus representation in the LFPs.

**Keywords: criticality, neuronal avalanches, phase transition, attention, synchronization, gamma-oscillations, information entropy**

## **1. INTRODUCTION**

Self-organized criticality (SOC) is a property observed in many natural dynamical systems in which the states of the system are constantly drawn toward a critical point at which a phase transition occurs. A variety of systems such as sandpiles (Held et al., 1990), water droplets (Plourde et al., 1993), superconductors (Field et al., 1995), and earthquakes (Baiesi and Paczuski, 2004) exhibit SOC. In such systems, system elements are collectively engaged in cascades of activity called avalanches, whose size distributions obey a power-law at the critical state (Bak et al., 1987). Scientists have long hypothesized that SOC might also be a feature of biological systems (Bak and Sneppen, 1993) and that criticality of dynamics is relevant for performing complex computations (Crutchfield and Young, 1989; Langton, 1990). Support was given by modeling studies showing that networks of integrate-and-fire (IAF) neurons are able to display SOC (Corral et al., 1995), and predicting that avalanches of cortical neurons may belong to a universality class with a power-law exponent τ = 3/2 (Eurich et al., 2002).
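As a hedged illustration of what a power-law exponent of 3/2 means in practice, the following sketch samples avalanche sizes from a critical branching process. This is a standard toy model and not the IAF networks of the cited studies. The offspring rule, the size cap `cap`, and all function names are assumptions made for this example. For a size density decaying as s^(-3/2), the tail fraction P(S ≥ s) decays as s^(-1/2), so quadrupling s should roughly halve it.

```python
import random


def avalanche_size(sigma, cap, rng):
    """Size of one branching-process avalanche, truncated at `cap`.

    Each active unit has two chances to activate a successor, each with
    probability sigma / 2; sigma = 1 is the critical branching ratio.
    """
    p = sigma / 2.0
    active, size = 1, 1
    while active > 0 and size < cap:
        offspring = sum(1 for _ in range(2 * active) if rng.random() < p)
        active = offspring
        size += offspring
    return min(size, cap)


def sample_sizes(n, sigma=1.0, cap=2000, seed=0):
    """Draw n avalanche sizes with a reproducible random stream."""
    rng = random.Random(seed)
    return [avalanche_size(sigma, cap, rng) for _ in range(n)]


def tail_fraction(sizes, s):
    """Empirical estimate of P(S >= s)."""
    return sum(1 for x in sizes if x >= s) / len(sizes)


if __name__ == "__main__":
    sizes = sample_sizes(5000)
    for s in (10, 40, 160):
        print(s, tail_fraction(sizes, s))
```

Finite sample size and the cap distort the largest avalanches, so only the intermediate range of sizes should be compared against the power law.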

Experimental data indicate that cortical dynamics may indeed assume a critical state: Beggs and Plenz (2003) showed that neuronal avalanche size distributions follow a power-law with τ = 3/2 in organotypic cultures as well as in acute slices of rat cortex. The observed avalanche size distributions closely matched the closed-form expressions derived for neural systems of finite size (Eurich et al., 2002). Subsequently, Pasquale et al. (2008) presented the ability of dissociated and cultured cortical rat neurons to self-organize into networks that exhibit avalanches *in vitro*. Petermann et al. (2009) reported similar avalanche size distributions in the spontaneous cortical activity of awake monkeys. On a larger spatial scale, Shriki et al. (2013) presented scale-free avalanches in resting state MEG in humans. In addition, recent studies address questions relating to, for example, the rigorousness of statistical analysis (Klaus et al., 2011), subsampling (Priesemann et al., 2009), and resolution restraints as well as exponent relations (Friedman et al., 2012) in experimental criticality studies.

Combined, such theoretical and experimental results constitute the hypothesis that cortical neuronal networks operate near criticality (Bienenstock and Lehmann, 1998; Chialvo and Bak, 1999; Chialvo, 2004; Beggs, 2008; Fraiman et al., 2009). What makes the criticality hypothesis especially compelling is the idea that a functional relationship may exist between critical dynamics and optimality of information processing as well as information transmission (Bertschinger and Natschläger, 2004; Haldeman and Beggs, 2005; Kinouchi and Copelli, 2006; Nykter et al., 2008; Shew et al., 2009). However, the majority of neuronal avalanche observations are of spontaneous or ongoing activity in the absence of an actual sensory stimulus being processed by the cortex. In addition, no experimental studies exist to date which explore the criticality of neuronal dynamics *in vivo* in conjunction with a specific behavioral task, or under changing task demands.

Nevertheless, criticality describes the border between asynchronous and substantially synchronous dynamics, and in the field of vision research, synchronization has been studied extensively as a putative mechanism for information processing (von der Malsburg, 1994). Experimental studies demonstrated that in early visual areas, oscillations in the γ-range (about 40–100 Hz) occur during processing of a visual stimulus (Eckhorn et al., 1988; Gray and Singer, 1989). Here, mutual synchronization between two neurons tends to become stronger if the stimulus components within their receptive fields are more likely to belong to one object (Kreiter and Singer, 1996), thus potentially supporting feature integration. Furthermore, it has been shown that selective visual attention is accompanied by a strong increase in synchrony in the γ-band in visual cortical networks (Fries et al., 2001; Taylor et al., 2005). In this context, γ-oscillations have been proposed to be the essential mechanism for information routing regulated by attention (Fries, 2005; Grothe et al., 2012). Moreover, recent studies have demonstrated links between synchronized activity in the form of oscillations in MEG (Poil et al., 2012) and LFP recordings (Gireesh and Plenz, 2008) and in the form of neuronal avalanches.

These findings motivated us to explore the potential links between synchronization, cortical information processing, and criticality of the underlying network states in the visual system. In particular, we investigated the criticality hypothesis in the context of γ-oscillations induced by selective visual attention. If visual cortical networks indeed assume a critical state in order to optimize information processing, such a state should be prominent during the processing of an attended stimulus, since attention is known to improve perception (Carrasco, 2011) and to enhance stimulus representations (Rotermund et al., 2009).

Specifically, we will focus here on a structurally simple network model for population activity in visual area V4. We will first demonstrate that our model reproduces key dynamical features of cortical activation patterns including the increase in γ-oscillations under attention observed in experiments (Fries et al., 2001; Taylor et al., 2005). In particular, we will explain how attention enhances the representation of visual stimuli, thus allowing the brain state corresponding to a particular stimulus to be classified with higher accuracy (Rotermund et al., 2009), and we will identify mutual synchronization as the key mechanism underlying this effect.

Construction of this model allowed us to analyze dependencies between network states and stimulus processing in a parametric way. In particular, we were interested in whether such a network displayed critical dynamics, and how they relate to cognitive states. We inquired: Is criticality a "ground state" of the cortex which is assumed in the absence of stimuli, and helps process information in the most efficient way as soon as a stimulus is presented? Or is the cortex rather driven toward a critical state only when there is a demand for particularly enhanced processing, such as when a stimulus is attended?

To answer these questions, we (a) characterized the network state based on neuronal avalanche statistics (subcritical, critical, or supercritical), (b) quantified stimulus discriminability, and (c) analyzed the richness of the dynamics (information entropy of spike patterns) in the two-dimensional phase space spanned by excitatory and inhibitory coupling strengths. Within this coupling space, we identified a transition region where the network undergoes a phase transition from subcritical to supercritical dynamics for different stimuli. We found that the onset of γ-band synchrony within the transition region is accompanied by a dramatic increase in discriminability. At supercritical states epileptic activity emerged, thus indicating an unphysiological regime, and both information entropy and discriminability values exhibited a sharp decline.

Our main finding is that cortical networks operating at marginally subcritical states provide the best explanation for the experimental data (Fries et al., 2001; Taylor et al., 2005; Rotermund et al., 2009). At such states, fine modulations of network excitability are sufficient for significant increases in discriminability.

## **2. RESULTS**

## **2.1. ATTENTION ENHANCES SYNCHRONIZATION AND IMPROVES STIMULUS DISCRIMINABILITY**

Our study is motivated by an electrophysiological experiment (Rotermund et al., 2009) which has demonstrated that attention improves stimulus discriminability: While a rhesus monkey (*Macaca mulatta*) attended to one of two visual stimuli simultaneously presented in its left and right visual hemifields, epidural LFP signals were recorded in area V4 of the visual cortex. Power spectra of the wavelet-transformed LFPs display a characteristic peak at γ-range frequencies between 35 and 80 Hz as well as a 1/*f* offset (**Figure 2A**). For assessing stimulus discriminability, Rotermund et al. used support vector machines (SVMs) on these spectral power distributions in order to classify the stimuli on a single trial basis. A total of six different visual stimuli (complex shapes) were used in the experiments; therefore, the chance level was around 17%. This analysis yielded two results which are central for this paper:


In this study, we present a minimal model which allows us to investigate putative neural mechanisms underlying the observed data.

### **2.2. REPRODUCTION OF EXPERIMENTAL KEY FINDINGS**

The spectra recorded in the experiment are consistent with neural dynamics comprising irregular spiking activity (the 1/*f* background) and oscillatory, synchronized activity in the γ-band. In order to realize such dynamics in a structurally simple framework, we considered a recurrently coupled network of IAF neurons which is driven by Poisson spike trains. The network consists of both excitatory and inhibitory neurons interacting via a sparse, random coupling matrix with a uniform probability of a connection between two neurons (for details see Section 4.1). The strengths *Jinh* and *Jexc* of inhibitory and excitatory recurrent couplings are homogeneous. While oscillatory activity is generated as a consequence of the recurrent excitatory interactions, the stochastic external input and inhibitory couplings induce irregular spiking, thus providing a source for the observed background activity.

We consider this network as a simplified model of a neuronal population represented in LFP recordings of area V4 and the external Poisson input as originating from lower visual areas such as V1. One specific visual stimulus activates only a subset of V4 neurons by providing them with a strong external drive while the remaining V4 neurons receive no such input (**Figure 1A**). We drove a different, but equally sized subset of V4 neurons for each stimulus. Hence in a recording of summed population activity (e.g., LFPs), where the identity of activated neurons is lost, stimulus identity is represented in the particular connectivity structure of the activated V4 subnetwork. We simulated a total of *N* = 2500 neurons but kept the number of activated V4 neurons fixed at *N*active = 1000 since every stimulus in the experiment was approximately the same size. With this setup, we ensured that the emerging stimulus-dependent differences in the network output are a consequence of stimulus identity and not of stimulus amplitude.
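The idea that stimulus identity is carried by the connectivity of the driven subnetwork can be sketched as follows. This is only an illustration under stated assumptions: the connection probability `p_conn`, the random choice of subsets, and all function names are hypothetical, the distinction between excitatory and inhibitory neurons is ignored, and the sizes are scaled down from the *N* = 2500 and *N*active = 1000 used in the text so that the example runs quickly.

```python
import random


def build_network(n, p_conn, rng):
    """Sparse random directed coupling: every ordered pair is connected
    independently with uniform probability p_conn (no self-connections)."""
    return [{j for j in range(n) if j != i and rng.random() < p_conn}
            for i in range(n)]


def stimulus_subsets(n, n_active, n_stimuli, rng):
    """One equally sized random subset of driven neurons per stimulus."""
    return [set(rng.sample(range(n), n_active)) for _ in range(n_stimuli)]


def internal_connections(targets, subset):
    """Number of connections that start and end inside the driven subset."""
    return sum(len(targets[i] & subset) for i in subset)


if __name__ == "__main__":
    rng = random.Random(0)
    n, n_active = 500, 200  # scaled-down stand-ins for N and N_active
    targets = build_network(n, p_conn=0.02, rng=rng)
    for k, subset in enumerate(stimulus_subsets(n, n_active, 6, rng)):
        count = internal_connections(targets, subset)
        print(f"stimulus {k}: {count} internal connections")
```

Because every subset has the same size, differences between the printed counts reflect only which neurons are driven, which mirrors the argument that stimulus amplitude is held constant while subnetwork connectivity varies.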

The variability of the couplings in our network mimics the structure of cortical couplings, which are believed to enhance certain elementary feature combinations [such as edge elements aligned to the populations' RF features (Kisvárday et al., 1997)] while suppressing others. Consequently, there will be stimuli activating subsets of V4 neurons which are strongly interconnected, while other stimuli will activate subsets which are more weakly connected.

We simulated the network's dynamics in response to *Na* = 6 different stimuli in *Ntr* = 20 independent trials. Comparable to the experiments, LFP signals were generated by low-pass filtering the summed pre- and postsynaptic V4 activity (Section 4.1.3). We computed the spectral power distributions using the wavelet transforms of LFP time series.

For sufficiently large *Jexc* the neurons in the V4 population were mutually synchronized, leading to a peak in the power spectra at γ-band frequencies. The average frequency of the emergent oscillations depends mainly on the membrane time constant τ for the particular choice of external input strength. Averaged over trials, these power spectra reproduced all the principal features displayed by the experimental data (**Figure 2**). In particular, spectra for individual stimuli differed visibly, with largest variability observed in the γ-range. Since the identity of activated neurons is lost in the population average, any differences in strength of the observed γ-oscillations can only be attributed to subnetwork connectivity. This result has a natural explanation because connection strength and topology strongly determine synchronization properties in networks of coupled oscillatory units (see for example Guardiola et al., 2000; Lago-Fernández et al., 2000; Nishikawa et al., 2003).


**FIGURE 2 | Comparison of model dynamics to experimental recordings.** LFP spectral power distributions in **(A)** the experiment and **(B)** the model for non-attended (left) and attended (right) conditions. In each case, spectra averaged over trials are shown for 6 stimuli (different colors). In both **(A,B)** the spectrum for each stimulus is normalized to its respective maximum in the non-attended case. Model spectra reproduce the stereotypical 1/*f* background as well as the γ-peaks observed in the experimental spectra. Under attention, γ-band oscillations become more prominent and spectra for different stimuli become visibly more discriminable. **(C)** Single trial LFP time-series from the model, illustrating the analyzed signals in the non-attended (top) and attended (bottom) conditions. [Data shown in **(A)** are courtesy of Dr. Andreas Kreiter, Dr. Sunita Mandon, and Katja Taylor (Taylor et al., 2005)].

Differences in power spectra become even more pronounced if a stimulus is attended. We modeled attention by globally enhancing excitability in the V4 population. This can be realized either by increasing the efficacy of excitatory interactions, or by decreasing efficacy of inhibition. In this way, the gain of the V4 neurons is increased (Reynolds et al., 2000; Fries et al., 2001; Treue, 2001; Buffalo et al., 2010), and synchronization in the γ-range gets stronger and more diverse for different stimuli while the 1/*f* background remains largely unaffected (**Figure 2B**). For visualizing the effect of attention, single trial LFP signals corresponding to attended and non-attended conditions for a specific stimulus are given in **Figure 2C**. Note that the change induced by attention does not need to be large; in the example in **Figure 2** inhibitory efficacy was reduced by 10% from *jinh* = 0.80 to 0.72.

The observed changes in the power spectra with attention can be interpreted in terms of the underlying recurrent network dynamics: each activated subnetwork has a particular composition of oscillatory modes, and enhancing excitability in such a non-linear system will activate a larger subset of these modes more strongly. This effect is enhanced by synchronization emerging at different coupling strengths for different stimuli. With a further increase in the coupling, however, groups of neurons oscillating at different frequencies will become synchronized at a single frequency (Arnold tongues, Coombes and Bressloff, 1999), which ultimately decreases the diversity of power spectra.

## **2.3. ENHANCEMENT OF STIMULUS DISCRIMINABILITY IS A ROBUST PHENOMENON**

The spectra in **Figure 2B** were generated using coupling parameters *Jexc* and *Jinh* specifically tuned for reproducing the experimental data. However, the basic phenomenon is robust against large changes in the parameters: Discriminability increase is coupled to the emergence of strong γ -oscillations. To show this, we varied the excitatory and inhibitory coupling strengths independently, and quantified stimulus discriminability using SVM classification for every parameter combination. When varying the inhibitory efficacies, we used a step size that is proportional to the excitatory efficacy: *Jinh* = · *Jexc* · *jinh* for every point in the coupling space where *jinh* is the inhibitory scaling factor. We set the upper bound of excitation and the lower bound of inhibition so as to avoid unphysiologically high firing rates due to the activation of all neurons, including those that did not receive external input. **Figure 3A** shows the classification results in coupling space, averaged over *Nw* = 5 independently realized random connectivity architectures of the V4 network. The coupling values used for generating the spectra in **Figure 2B** are indicated by white markers. Classification performance is 24.2% in the non-attended (white cross) condition (significantly above chance level, ∼17%, via a one-tailed binomial test with *p* < 0.005) and 32.8% in the attended (white circle) condition. Notably, discriminability is significantly above chance level only in a bounded region of the parameter space. Within this region, relatively small increases in excitatory, or decreases in inhibitory coupling strengths lead to an acute discriminability enhancement.

This effect comes about in the following way: In networks with low excitation and high inhibition, the dynamics are asynchronous and the LFP spectra are dominated by the 1/*f*-noise. In this case, every stimulus input is mapped to a network output with similar spectral components and with a large trial-to-trial variance. This severely impedes the ability to classify stimuli correctly. On the other hand, in networks with very high excitation and low inhibition, synchronous activity dominates the dynamics and epileptic behavior is observed. Mutual synchronization of the activated V4 neurons leads to co-activation of the otherwise silent V4 neurons which do not receive external input. This means that every stimulus input is mapped to spike patterns where almost all neurons are simultaneously active at all times. The corresponding spectra have reduced trial-to-trial variability but are almost identical for different stimuli. Consequently, stimulus discriminability reaches a maximum only in a narrow region of the parameter space which is associated with the onset of synchrony.

**FIGURE 3 |** [...] classification performance as a function of the excitatory coupling strength *Jexc* and the inhibitory coupling scaling factor *jinh* (obeying *Jinh* = · *Jexc* · *jinh*). The coupling values representing the non-attended and attended conditions in **Figure 2B** are marked by a cross and a circle, respectively. **(B)** Discriminability index in the coupling space for the same spectra. For both **(A,B)**, the strength of the background noise was *cmix* = 0.2.

It is necessary to point out that the absolute magnitude of the SVM performance depends strongly on the background noise (i.e., on the value of *cmix*) which constitutes the 1/*f*-background in the spectra. For example, without the addition of the background noise (i.e., *cmix* = 0), SVM classification performance is 36.67% for the non-attended and 43.83% for the attended spectra in **Figure 2B**. Nevertheless, the observation of a bounded region of enhanced discriminability persists even in the absence of 1/*f*-noise. This finding has an important consequence: It allows us to identify coupling parameters which cannot explain the experimental data regardless of the "real" noise level. Thus, it outlines a specific working regime in which the model can reproduce both of the experimental findings described in Section 2.1.

#### **2.4. CHARACTERIZATION OF DYNAMICAL NETWORK STATES**

Our findings indicate that a significant discriminability increase correlates implicitly with the onset of synchronous dynamics. In the following, we will focus on this network effect in more detail, and investigate its ramifications for information processing in the visual system.

In order to obtain a better understanding of the behavior of the system, we implemented certain reductions to our simulations. First, we excluded regions in parameter space where all neurons not receiving external input became activated. For most of the phase space, recurrent excitation is not strong enough to activate these stimulus-nonspecific neurons. At the supercritical regions, where excitation is strong and neurons are firing synchronously, however, these silent neurons become activated. This effect further increases the average excitatory input strength in the recurrent V4 population, leading to epileptic activity at very high (biologically implausible) frequencies. Such a regime would be highly unrealistic, since neurons in V4 populations have well-structured receptive fields and are only activated by specific stimuli (Desimone and Schein, 1987; David et al., 2006). Therefore, we proceeded to isolate the activity of externally driven subnetworks and focused our analysis on their output. This was realized by limiting the number of neurons in the network to *N* = *N*active = 1000 and by assigning different random coupling matrices to simulate different stimulus presentations. Thus, distinct network architectures stand for distinct stimulus identities.

When constructing the output signal, we now excluded the background noise induced by the V1 afferents (i.e., we set *cmix* = 0), but note that the V4 neurons were still driven by this stochastic, Poisson input. This segregation of V4 activity from background noise was necessary for the analysis of network dynamics, in order to ensure that the observed variance of the LFP spectra across trials originated in the V4 population.

In the reduced simulations, spikes propagated and impacted the postsynaptic neurons' membrane potentials instantaneously (see Section 4.3). We also prevented neurons from firing twice during an avalanche. These latter changes were introduced for inspecting criticality in the system dynamics (described in detail in Section 2.4.1), allowing us to quantify the number of neurons involved in an avalanche event accurately.

Since SVM classification is a comparatively indirect method for quantifying discriminability, employing classifiers which are difficult to interpret, we introduce the discriminability index (DI) as a simplified measure. The DI quantifies by how much, averaged over frequencies, the distributions of LFP spectra over trials overlap for each stimulus pair (see Section 4.2.3). As oscillations emerge in the network dynamics, the trial-to-trial variability of the spectra decreases (i.e., the distributions become narrower), and the average spectrum for each stimulus becomes more distinct (i.e., the means of the distributions disperse). Hence, the DI provides us with a meaningful approximation of the SVM classification performance. We find that the DI yields a phase space portrait (**Figure 3B**) similar to the SVM classification result (**Figure 3A**) for the full network simulations.

In order to compute discriminability in the reduced simulations, we used *Ntr* = 36 trials from each of the *Na* = 20 different stimuli. Simulations with the reduced network produce the same qualitative behavior in phase space (**Figure 4A**), in the sense that a discriminability increase is only observed in a narrow region of the phase space, located at the border between regimes with and without strongly synchronous activity. Discriminability is maximized as oscillations emerge, and decays quickly in the regions where epileptic behavior is observed as all neurons fire simultaneously. Combined with the experimental evidence, our findings suggest that the cortex operates near a particular state where small modifications of excitability lead to substantial changes in its collective dynamics.

**FIGURE 4 | Discriminability of the LFP spectra in relation to the avalanche statistics. (A)** Discriminability index in the reduced simulations. As in the full simulations (**Figure 3B**), stimulus discriminability increases dramatically in a narrow region of the coupling space. **(B)** Avalanche size distributions *P*(*s*) in the sub-critical (green), critical (blue), and super-critical (red) regimes for a single stimulus. Insets show how the corresponding avalanche duration distributions *P*(*T*) and the mean avalanche sizes ⟨*s*⟩ conditioned on the avalanche duration *T* behave in the three distinct regimes. The corresponding coupling parameter values are marked with crosses in **(A)**. **(C)** The values of the estimated power-law exponents τ, α, and 1/σν*z* for each value of the excitatory coupling strength *Jexc*. The lines mark the mean exponent at the critical point for each stimulus and the corresponding colored patches represent the standard deviation over the stimuli. The black dashed line shows the value of α computed using Equation 4, by plugging in the other two exponents.

However, time-averaged power spectra of local field potentials are not well suited for characterizing different aspects of this state. Since epidural LFPs are signals averaged over large neuronal populations, dynamic features in spiking patterns become obscured, and temporal variations in the network dynamics are lost in the averaging process. In the following, we will go beyond LFPs and focus on (a) the size distribution of synchronized events (avalanche statistics), and (b) on the diversity and richness of patterns generated by the network (measured by information entropy).

## *2.4.1. Criticality of dynamics*

The network dynamics can be classified into three distinct regimes of activity characterized by their avalanche size distributions: subcritical, critical, and supercritical (**Figure 4B**). In the subcritical state spiking activity is uncorrelated, events of large sizes are not present and the probability distributions *P*(*s*) of observing an avalanche event of size *s* exhibit an exponential decay. In the supercritical state, spiking activity is strongly synchronous and avalanches spanning the whole system are observed frequently. This behavior is represented in the avalanche size distributions by a characteristic bump at large event sizes. The critical state signifies a phase transition from asynchronous to oscillatory activity and the corresponding avalanche size distributions *P*(*s*) display scale-free behavior.

$$P(s) \propto s^{-\tau} \tag{1}$$

Even though power-law scaling of the avalanche size distributions, combined with the sudden emergence of oscillatory behavior in the system, strongly suggests a phase transition in network dynamics, it is not sufficient to definitively conclude that the system is critical (Beggs and Timme, 2012; Friedman et al., 2012). Therefore, to inspect criticality in the network dynamics, we investigated the behavior of two other relevant avalanche statistics: the distribution *P*(*T*) of avalanche durations *T* and the mean avalanche size ⟨*s*⟩(*T*) given the avalanche duration *T*. We find that both of these follow a power law for intermediate values of *T* at the critical points (**Figure 4B**, insets):

$$P(T) \propto T^{-\alpha} \tag{2}$$

$$\langle s \rangle (T) \propto T^{1/\sigma \nu z} \tag{3}$$

We observe that the behavior of *P*(*T*) within the phase space is similar to that of *P*(*s*). In the subcritical regime, there are only avalanches of short durations, and *P*(*T*) has a short tail. In the supercritical regime, *P*(*T*) displays a bump at large event durations. For ⟨*s*⟩(*T*), we observe scale-free behavior in both the subcritical and critical regimes. Again, a bump appears for large *T* in the supercritical regime. In order to quantify the power-law scaling of the avalanche size and duration distributions, we applied a maximum-likelihood (ML) fitting procedure (Clauset et al., 2009) and obtained an ML estimate of the power-law exponent for every stimulus. We obtained the power-law exponent of the mean size distributions conditioned on the avalanche duration using a least squares fitting procedure (Weisstein, 2002). Notably, the exponents obtained from the simulated dynamics fulfill the exponent scaling relationship (**Figure 4C**)

$$\frac{\alpha - 1}{\tau - 1} = \frac{1}{\sigma \nu z} \tag{4}$$

as predicted by universal scaling theory (Sethna et al., 2001; Friedman et al., 2012).
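The exponent estimation and the consistency check of Equation 4 can be illustrated with a minimal sketch. It assumes the continuous ML estimator described by Clauset et al. (2009), whereas avalanche sizes in the simulations are discrete, and it runs on synthetic samples rather than on model output. The function names `fit_power_law_exponent`, `predicted_alpha`, and `sample_power_law` are hypothetical and are not taken from the original analysis code.

```python
import math
import random


def fit_power_law_exponent(samples, x_min):
    """Continuous ML estimate of the exponent for P(x) ~ x**(-exponent), x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all samples equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def predicted_alpha(tau, inv_sigma_nu_z):
    """Duration exponent implied by Equation 4: (alpha - 1) / (tau - 1) = 1 / (sigma nu z)."""
    return 1.0 + (tau - 1.0) * inv_sigma_nu_z


def sample_power_law(exponent, x_min, n, rng):
    """Draw n synthetic samples by inverse-transform sampling (illustration only)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (exponent - 1.0)) for _ in range(n)]


# Synthetic "avalanche sizes" with a known exponent (assumed value, not a model result).
rng = random.Random(0)
sizes = sample_power_law(exponent=1.5, x_min=1.0, n=20000, rng=rng)
tau_hat = fit_power_law_exponent(sizes, x_min=1.0)
```

With estimates of τ and 1/σν*z* in hand, `predicted_alpha` gives the value of α that Equation 4 requires, which can then be compared with the independently fitted duration exponent.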

As a goodness-of-fit measure for the avalanche size distributions, we employed the Kolmogorov–Smirnov (KS) statistic. The KS statistic *D* averaged over all stimuli (i.e., network architectures) is given in **Figure 5A**. However, for identifying points in the phase space at which the network dynamics are critical, the KS statistic is ineffective: Even in the transition region from subcritical to supercritical behavior, the avalanche size distributions rarely display a perfect power-law which extends from the smallest to the largest possible event size. Therefore, we introduced lower and upper cut-off thresholds on the avalanche sizes during the fitting process (see Section 4.3). While this procedure allowed us to obtain better fits, it also led to a large region of subcritical states which had relatively low (and noisy) *D*-values. This presents a predicament for automatically and reliably detecting the critical points by searching for minima in the *D*-landscape. Furthermore, we found that avalanche size distributions become scale-free at different points in phase space for different stimuli (**Figure 5B**). Therefore, the minima of the average KS statistic in **Figure 5A** are not representative of the critical points of the system.
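To make the role of the cut-offs concrete, the following sketch computes a KS statistic *D* between an empirical size distribution and a power law restricted to a window [`s_min`, `s_max`]. It is an illustration under stated assumptions: the model is a continuous truncated power law, the data are synthetic, and the names `truncated_power_law_cdf` and `ks_statistic` are hypothetical. The cut-off procedure actually used is the one described in Section 4.3.

```python
import random


def truncated_power_law_cdf(s, tau, s_min, s_max):
    """CDF of a continuous power law P(s) ~ s**(-tau) restricted to [s_min, s_max], tau != 1."""
    a = s_min ** (1.0 - tau)
    b = s_max ** (1.0 - tau)
    return (a - s ** (1.0 - tau)) / (a - b)


def ks_statistic(samples, tau, s_min, s_max):
    """Largest gap between the empirical CDF and the model CDF inside the cut-offs."""
    kept = sorted(s for s in samples if s_min <= s <= s_max)
    if not kept:
        raise ValueError("no samples inside the cut-off window")
    n = len(kept)
    d = 0.0
    for i, s in enumerate(kept):
        model = truncated_power_law_cdf(s, tau, s_min, s_max)
        # The empirical CDF jumps from i/n to (i+1)/n at s, so check both sides.
        d = max(d, abs(model - i / n), abs((i + 1) / n - model))
    return d


# Synthetic comparison: power-law sizes versus exponentially distributed sizes.
rng = random.Random(1)
tau, s_min, s_max = 1.5, 1.0, 100.0
a, b = s_min ** (1.0 - tau), s_max ** (1.0 - tau)
power_law_sizes = [(a - rng.random() * (a - b)) ** (1.0 / (1.0 - tau)) for _ in range(5000)]
exponential_sizes = [rng.expovariate(0.5) for _ in range(5000)]
d_power_law = ks_statistic(power_law_sizes, tau, s_min, s_max)
d_exponential = ks_statistic(exponential_sizes, tau, s_min, s_max)
```

In this toy setting `d_power_law` is small and `d_exponential` is clearly larger. Distributions that merely approach a power law inside the window can still yield low *D*-values, which is the difficulty noted above for locating critical points from *D* alone.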

Visual inspection revealed that the subcritical avalanche size distributions converge slowly to a power-law as inhibition is decreased. At a critical value of inhibition, a phase transition occurs and the bump characteristic of supercritical distributions appears abruptly. Consequently, it is trivial to determine the transition regions graphically. We automated this procedure by using a binary variable γ, which assumes a value of 1 if a bump is detected in the avalanche size distributions (if the distribution is supercritical) and 0 otherwise (if the distribution is subcritical). Its mean ⟨γ⟩ over all stimuli is given in **Figure 6A**. We observed that there are clearly defined regions of sub- and supercritical dynamics, where ⟨γ⟩ is 0 or 1, respectively, i.e., where all stimuli yield the same classification. The points for which 0 < ⟨γ⟩ < 1 define the transition region, where synchronization builds up rapidly for different stimuli.

In **Figure 6B** the transition region is plotted together with the discriminability index for comparison. We observe that the points at which discriminability is enhanced are confined to the neighborhood of the transition region. Discriminability is maximized within the transition region, where the network dynamics are supercritical for a subset of architectures and subcritical for the remaining ones. This means that if cortical neurons were to maximize discriminability, a set of stimulus inputs would effectively map to epileptic output activity. Such a scenario is not only physiologically implausible, but actually pathological. Taken together, these findings suggest that only marginally subcritical points, and not ones within the transition and supercritical regions, qualify for explaining the experimental observations.

Therefore, we propose that the cortex operates at near-critical states, at the subcritical border of the transition region. Such near-critical states are unique in their ability to display significant discriminability enhancement under attention while avoiding pathologically oscillatory dynamics. In addition, strongly correlated activity is associated with encoding limitations. However, neither the discriminability of LFP spectra, nor the avalanche statistics considered putative, neurophysiologically plausible decoding schemes used by downstream visual areas. To address this issue, we next inspected the diversity of spike patterns generated in the V4 network, and how this diversity behaves in the neighborhood of the transition region.

#### *2.4.2. Information entropy*

We computed information entropy (Shannon, 1948) in order to assess the diversity of V4 spike patterns generated in response to stimuli within the coupling space. In doing so, we considered different scales on which read-out of these patterns, e.g., by neurons in visual areas downstream of V4, might take place.

**FIGURE 5 |** [...] calculated using the γ measure is given in white. **(B)** KS statistic *D* as a function of inhibitory coupling scaling factor *jinh* for two exemplary stimuli, *a*<sup>1</sup> (blue) and *a*<sup>2</sup> (red), illustrating how the *D* minima occur at different points in the phase space for different stimuli (*Jexc* = 0.2 mV). The γ-transition region is given in magenta.

**FIGURE 6 |** [...] **phase transition from subcritical to supercritical avalanche statistics. (A)** γ measure averaged over all stimulus presentations. The network dynamics are subcritical for all stimuli in the regions of the phase space where the mean γ = 0, and supercritical in the regions where γ = 1. A phase transition from subcritical to supercritical dynamics takes place between these two regions, at different points for different stimuli. This transition region where 0 < γ < 1 is indicated by white dots. **(B)** Comparison of the discriminability index (**Figure 4A**) and the transition region.

At the finest scale of observation, the read-out mechanism has access to complete information about V4 spiking activity. In this case, it can discriminate between spikes originating from distinct presynaptic V4 neurons. At the coarsest observation scale, the read-out mechanism is not capable of observing every individual neuron, but rather integrates the total V4 input by summing over the presynaptic activity at a given time. To account for this, we introduce a scale parameter *K* which reduces a spike pattern **X** comprising spikes from *N* neurons to a representation of *N*/*K* channels with each channel containing the summed activity of *K* neurons (**Figure 1B**).
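The reduction controlled by the scale parameter *K* can be sketched as follows. The sketch assumes that channels are formed from consecutive neuron indices and that a pattern is given as one activity value per neuron for a single time bin. Both choices, as well as the function name `coarse_grain`, are assumptions made for illustration and are not specified by the text above.

```python
def coarse_grain(pattern, k):
    """Sum the activity of consecutive groups of k neurons into len(pattern) / k channels."""
    n = len(pattern)
    if k < 1 or n % k != 0:
        raise ValueError("k must be a positive divisor of the number of neurons")
    return [sum(pattern[i:i + k]) for i in range(0, n, k)]


# Toy pattern of N = 6 neurons (1 = spike, 0 = silent).
pattern = [1, 0, 1, 1, 0, 0]
finest = coarse_grain(pattern, 1)    # K = 1: every neuron is its own channel
paired = coarse_grain(pattern, 2)    # K = 2: three channels of two neurons each
coarsest = coarse_grain(pattern, 6)  # K = N: one channel with the summed activity
```

For *K* = 1 the pattern is unchanged, while for *K* = *N* only the total number of active neurons survives, which corresponds to the two extreme observation scales described above.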

**Figures 7A,B** show how information entropy compares with the transition region of the system for *K* = 1 (full representation) and for *K* = *N* (summed activity over whole network). For each inhibitory coupling, the value of the excitatory coupling which maximizes information entropy is marked with a dashed line. For both conditions, we see that information entropy displays a sharp decline near the transition region. This behavior is consistent with a phase transition toward a regime of synchronous activity, as the emergence of strong correlations attenuates entropy by severely limiting the maximum number of possible states. In comparison to the finest scale of observation (*K* = 1), we find that the maxima of information entropy are shifted to greater values of excitation at the coarsest scale of observation (*K* = *N* = 1000). **Figure 7C** shows how the maxima of information entropy evolve as a function of observation scale *K*, converging onto near-critical points. This effect arises because, as *K* increases, the points with the greatest number of states in the network activity are shifted toward the transition region. By construction, the number of possible states of **X** is finite, and the uniform distribution has the maximum entropy among all the discrete distributions supported on the finite set {*x*1,..., *xn*}. Hence, information entropy of the spike patterns increases with both an increase in the number of observed states and an increase in the flatness of the probability mass function *P*(**X**) of the states. For the coarsest scale of observation, *P*(**X**) is equivalent to the avalanche size distributions, and it is clear that a power-law scaling of these distributions covers the largest range of states (**Figure 4B**). However, for large *jinh* (*jinh* ≳ 0.6), entropy maxima persist at moderately subcritical regions. For large *K*, these regions are characterized by *P*(**X**) with smaller supports but more uniform shapes than the *P*(**X**) near the transition region. The flatness of these distributions, especially at small event sizes, causes the entropy maxima to appear around *Jexc* = 1.8 mV, instead of being located at higher values of excitation.
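The dependence of entropy on both the number of observed states and the flatness of *P*(**X**) can be checked with a small plug-in estimate. This is a sketch under the assumption that *P*(**X**) is estimated from relative frequencies of the observed patterns. The estimator used for the reported results is not restated here, and the name `pattern_entropy_bits` is hypothetical.

```python
import math
from collections import Counter


def pattern_entropy_bits(patterns):
    """Plug-in Shannon entropy (in bits) of the empirical distribution over patterns."""
    counts = Counter(tuple(p) for p in patterns)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one pattern is required")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Four equally frequent states: flat distribution, maximal entropy for four states.
uniform = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Same four observations, but dominated by a single state (as under strong synchrony).
skewed = [(1, 1), (1, 1), (1, 1), (0, 0)]

h_uniform = pattern_entropy_bits(uniform)
h_skewed = pattern_entropy_bits(skewed)
```

The flat distribution over four states reaches the maximum of 2 bits, whereas the skewed one falls below 1 bit, mirroring the decline of entropy when correlated activity concentrates probability on a few states.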

Combined, our results can be interpreted in the following way for the two extreme conditions discussed:


## **3. DISCUSSION**

In this paper we addressed the criticality hypothesis in the context of task-dependent modulations of neuronal stimulus processing. We focused, in particular, on changes in cortical activity induced by selective visual attention. We considered recent findings that γ-band oscillations emerge collectively with an enhancement of object representation in LFPs in macaque area V4 under attention (Rotermund et al., 2009). We reproduced these results using a model of a visual area V4 population comprising IAF neurons recurrently coupled in a random network. Attention induces synchronous activity in V4 by modulating the efficacy of recurrent interactions. In the model, we investigated the link between experimentally observed enhancement of stimulus discriminability, scale-free behavior of neuronal avalanches and encoding properties of the network quantified by information entropy.

**FIGURE 7 |** [...] converge toward the transition region (black) as *K* is increased.

We found that the emergence of γ -band synchrony is strongly coupled to a rapid discriminability enhancement in the phase space. Notably, we observed that discriminability levels comparable to the experiments appear exclusively in the neighborhood of the transition region, where network dynamics transition from subcritical to supercritical for consecutive values of excitation for different stimuli. This effect arises because synchronizability of the network depends inherently on its connectivity structure, and the strength of synchrony for different stimuli is most diverse near and within the transition region. However, this also means that information entropy displays a sharp decline as network activity becomes strongly correlated for some stimuli, beginning within the transition region and reaching a minimum in the supercritical regions. Therefore, we propose that cortical networks operate at near-critical states, at the subcritical border of the transition region. Such marginally subcritical states allow for fine modulations of network excitability to dramatically enhance stimulus representation in the LFPs. In addition, for a putative encoding scheme in which higher area neurons integrate over the spiking activity in local V4 populations (coarse observation scale), near-critical states maximize information entropy.

#### **3.1. ROBUSTNESS OF RESULTS**

In this work we aimed to reproduce the characteristic features of the experimental findings with an uncomplicated model, in part due to considerations of computational expense. The conclusions of this paper depend mainly on the facts that in our model: (1) the emergence of synchronous spiking activity can be described by a phase transition as a function of an excitability parameter, and (2) synchronizability of the network depends implicitly on the topography of its connections. Therefore, we believe that as long as these requirements are met, discriminability enhancement will correlate with a narrow choice of parameters which generate near-critical dynamics. This will also be the case in more complex and biologically plausible models which detail different synchronization mechanisms that might be responsible for generating neural γ-activity (see, for example, the reviews Tiesinga and Sejnowski, 2009; Buzsáki and Wang, 2012).

In fact, recent modeling work by Poil et al. (2012), which employed a network consisting of IAF neurons with stochastic spiking and local connectivity, reported a result which nicely parallels our findings. For random realizations of their network architecture, the greatest variance of the power-law scaling of the avalanche size distributions was found near the critical points. In this framework, different random realizations of network connectivity were used to describe differences between human subjects, and the authors concluded that their findings provide an explanation for interindividual differences in α-oscillations in human MEG.

## **3.2. PHYSIOLOGICAL PLAUSIBILITY**

We simulated cortical structure employing a random network of finite size, thus our model had a connectivity structure which varied for different subpopulations of activated neurons. This setting spared us any particular assumptions about the connection topology of V4 neurons, which is still subject of extensive anatomical research. In the brain, variability in connectivity of neurons in a local population is not random, but signifies a highly structured global network. Such functional connectivity is exemplified in the primary visual cortex by long-range connections between neurons with similar receptive field properties such as orientation preference (Kisvárday et al., 1997). These connections are thought to serve feature integration processes such as linking edge segments detected by orientation-selective neurons in V1 or V2 into more complex shapes, thus giving rise to the array of receptive field structures found in V4 (Desimone and Schein, 1987; David et al., 2006). In consequence, connection variability in the brain is significantly higher than random. Specifically, the variance of degree distributions is higher, the synaptic weights are heterogeneous, and the coupling structures are more anisotropic than in our simulations. Hence connection variability across different local networks is not decreased as drastically when the number of neurons is increased. In fact, assuming random variability implied a trade-off in our simulations: On the one hand, increasing the number of neurons decreased diversity in activation patterns and pattern separability, while on the other hand, it improved the assessment of criticality by increasing the range over which avalanche events could be observed.

In addition, in our model, we posited that attention modulates the efficacy of interactions, in order to reproduce the attention induced gain modulation and γ -synchrony using a reductionist approach. In biological networks, these effects may originate from more complicated mechanisms. For example, previous studies have shown that such an increase in gain (Chance et al., 2002) as well as synchronous activity (Buia and Tiesinga, 2006) can be achieved by modulating the driving background current. However, as described in Section 3.1, we expect our results will persist in other models where the network dynamics undergo a phase transition toward synchronous dynamics as a function of the responsiveness of neurons which is enhanced by attention. As an alternative to enhancing synaptic efficacy, we also tested a scenario in which attention provided an additional, weak external input to all neurons (results not shown). This led to qualitatively similar findings, with a quantitatively different discriminability boost.

Lastly, our current understanding of cortical signals strongly suggests that LFPs are generated mainly by a postsynaptic convolution of spikes from presynaptic neurons (Lindén et al., 2011; Makarova et al., 2014) and that even though other sources may contribute to the LFP signal, they are largely dominated by these synaptic transmembrane currents (Buzsáki et al., 2012). We generate the LFP signal through a convolution of the sum of appropriately scaled recurrent and external spiking activity. In our model, this closely approximates the sum of postsynaptic currents to V4 neurons: We are considering a very simple model of a small V4 population in which the postsynaptic potentials are evoked solely by these recurrent and external presynaptic spikes; degree distributions in the connectivity structure of the network has a small variance; the recurrent synaptic strengths are homogeneous; and there is no stochasticity in the recurrent synaptic transmission (i.e., every V4 spike elicits a postsynaptic potential in the V4 neurons it is recurrently coupled to). In addition, there is no heterogeneity in cell morphologies or the location of synapses, which are believed to influence the contribution of each synaptic current to the LFP signal in cortical tissue (Lindén et al., 2010). Combined, this means that each spike elicited by a model V4 neuron has a similar total impact on the postsynaptic membrane potentials, and the low-pass filtered spiking activity represents the postsynaptic currents well. Furthermore, even though our model does not incorporate the full biological complexity of cortical neurons, we believe that the particular choice of constructing the LFP signal in our model is not consequential for our results. 
The increase in discriminability of the LFP spectra originates primarily in the γ-band (both in the model and the experimental data), and we assume that correlated synaptic currents emerge simultaneously with correlated spiking activity, as there is experimental evidence that spiking (multi-unit) activity is synchronized with the LFP signal during attention-induced γ-oscillations (Fries et al., 2001).

## **3.3. DYNAMICS, STRUCTURE, AND FUNCTION**

In order to scrutinize the role of synchrony in enhancing stimulus representations, we considered an idealistic scenario: Each stimulus activates a different set with an *identical* number of neurons, so that without synchronization stimulus information encoded in activated neuron identities would be lost in the *average* population rate. By means of the different connectivities within different sets, however, this information becomes re-encoded in response amplitude and γ -synchrony. In principle, this concept is very similar to the old idea of realizing binding by synchrony (von der Malsburg, 1994), namely, using the temporal domain to represent information about relevant properties of a stimulus, for example, by tagging its features as belonging to the same object or to different objects in a scene.

However, strong synchronization hurts encoding by destroying information entropy. This is visible in the dynamics in the supercritical regime where ultimately all neurons do the same: fire together at identical times. Therefore, synchronization is only beneficial for information processing if additional constraints exist: for example, a neural bottleneck in which some aspect of the full information available would be lost, or a certain robustness of signal transmission against noise is required and can be realized by the synchronous arrival of action potentials at the dendritic tree.

In our setting, this bottleneck is the coarse observation scale where neuron identity information is lost by averaging over all neural signals. In such a case, information entropy is maximized as oscillations emerge at near-critical points. Although this situation is most dramatic for epidural LFPs that sum over thousands of neurons, it may also arise in more moderate scales if neurons in visual areas downstream of V4 have a large fan-in of their presynaptic connections. Naturally, this does not exclude the possibility that such a bottleneck may be absent and that cortical encoding can make use of spike patterns on finer spatial scales. This would shift the optimal operating regime "deeper" into the subcritical regime, and away from the transition region. Nonetheless, for this finer scale assumption, marginal subcriticality might represent a best-of-both-worlds approach. In particular, a penalty in information entropy may be necessary to ensure a certain level of synchronous activity required for other functionally relevant aspects of cortical dynamics, such as information routing regulated by attention via "communication through coherence" (Fries, 2005; Grothe et al., 2012).

In general, which coding scheme is optimal for information transmission and processing always depends strongly on neural constraints and readout schemes. Nevertheless, specific assumptions about stimulus encoding do not influence our conclusion that the experimentally observed effects are unique to near-critical dynamics.

#### **3.4. OUTLOOK**

In summary, our study establishes several novel links between criticality, γ-synchronization, and task requirements (attention) in the mammalian visual system. Our model predicts that cortical networks, specifically in visual area V4, operate in a marginally subcritical regime; that task-dependent (e.g., attention-induced) modulations of neuronal activity may push network dynamics toward a critical state; and that the experimentally observed discriminability increase in LFP spectra can be attributed to differences in the network structure across different stimulus-specific populations. It remains for future studies to explore these links in more detail, and to provide experimental support for our model's predictions. With recent advances in optogenetic methods and multielectrode recording techniques, assessing avalanche statistics in behaving nonhuman primates with the required precision will soon be possible.

#### **4. MATERIALS AND METHODS**

#### **4.1. NETWORK MODEL**

#### *4.1.1. Structure and dynamics*

The V4 network consists of *N* recurrently coupled IAF neurons *i* = 1,..., *N* described by their membrane potential *V*(*t*):

$$\begin{split} \tau\_{\mathrm{mem}} \frac{\mathrm{d}V\_{i}(t)}{\mathrm{d}t} &= -\left(V\_{i}(t) - V\_{R}\right) + J\_{\mathrm{ext}} \sum\_{k} \delta(t - t\_{ik}') \\ &+ J\_{\mathrm{exc}} \sum\_{j=1}^{N\_{\mathrm{exc}}} w\_{ij} \sum\_{k} \delta(t - t\_{jk}) - J\_{\mathrm{inh}} \sum\_{j=N\_{\mathrm{exc}}+1}^{N} w\_{ij} \sum\_{k} \delta(t - t\_{jk}) \end{split} \tag{5}$$

The membrane potential evolves according to Equation 5, where every V4 neuron *i* has a resting potential *VR* = −60 mV and generates an action potential when *V* crosses a threshold *V*θ = −50 mV. After spiking, *V*(*t*) is reset to *VR*. We picked the parameters to be representative of those of an average cortical neuron (Kandel et al., 2000; Noback et al., 2005). We used a membrane time constant of τ*mem* = 10 ms. In Equation 5, *tjk* denotes the *k*-th spike from V4 neuron *j*, and *t*′*ik* the *k*-th spike from V1 (external input) to V4 neuron *i*.

V4 neurons are primarily driven by the external (feedforward) input once a stimulus is presented (see Section 4.1.2). Presynaptic V1 spikes have an external input strength *Jext* = 0.1 mV.

*Ninh* V4 neurons are inhibitory (interneurons) and the remaining *Nexc* are excitatory cells (pyramidal neurons). We assumed a fixed ratio of *Nexc*/*Ninh* = 4 (Abeles, 1991). The neurons are connected via a random coupling matrix with connection probability *p* = 0.02 (Erdős-Rényi graph). Connections are directed (asymmetrical), and we allow for self-connectivity. *wij* assumes a value of 1 if a connection exists from neuron *j* to neuron *i*, and is 0 otherwise. Global coupling strengths can be varied independently by changing *Jinh* and *Jexc*.

Simulations were performed with an Euler integration scheme using a time step of Δ*t* = 0.1 ms. Membrane potentials of V4 neurons were initialized such that they would fire at random times (drawn from a uniform distribution) when isolated and driven by a constant input current (asynchronous state). We simulated the network's dynamics for a period of *T*total = 2.5 s and discarded the first, transient 500 ms before analysis.

#### *4.1.2. Stimulus and external input*

For comparison with the experimental data, we drove our network using *Na* different stimuli. Specifically, we assumed that each stimulus activates a set of neurons in a lower visual area such as V1 or V2 whose receptive fields match (part of) the stimulus (**Figure 1A**). These neurons in turn provide feedforward input to a subset of *N*active neurons in the V4 layer. We realized this input as independent homogeneous Poisson processes with rate *fmax* = 10 kHz. This situation is equivalent to each activated V4 neuron receiving feedforward input from roughly 1000 neurons, each firing at about 10 Hz during stimulus presentation.

Since stimuli used in the experiment had similar sizes, we assumed the subset of activated V4 neurons to have constant size *N*active = 1000 for all stimuli. For each stimulus, we randomly chose the subset of V4 neurons which were activated by feedforward input. With a total of *N* = 2500 neurons, these subsets were not mutually exclusive for different stimuli. The remaining *N* − *N*active neurons received no feedforward input. Each stimulus was presented to the network in *Ntr* independent trials, and the simulations were repeated for *Nw* independent realizations of the V4 architecture *wij*.
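The Euler scheme for Equation 5, together with the Poisson drive described in this section, can be illustrated with a minimal Python sketch. It is not the simulation code used in this study and rests on several assumptions: each incoming spike is taken to shift the membrane potential by the corresponding coupling strength (in mV), recurrent spikes are delivered on the following time step, initial potentials are simply drawn uniformly between *VR* and *V*θ, and the function names, the reduced network size, and the default values of *Jexc* and *Jinh* are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson-distributed count (Knuth's method; adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_lif(n=200, n_active=80, p_conn=0.02, j_exc=0.1, j_inh=0.2,
                 j_ext=0.1, f_ext=10_000.0, t_total=0.2, dt=1e-4,
                 tau_mem=0.01, v_rest=-60.0, v_thresh=-50.0, seed=0):
    """Euler integration of a small leaky IAF network; returns (time, neuron) spikes."""
    rng = random.Random(seed)
    n_exc = 4 * n // 5  # N_exc / N_inh = 4
    # targets[j] lists the neurons i with w_ij = 1 (self-connections allowed).
    targets = [[i for i in range(n) if rng.random() < p_conn] for _ in range(n)]
    active = set(rng.sample(range(n), n_active))  # neurons receiving V1 input
    # Assumption: uniform initial potentials as a simple asynchronous start.
    v = [v_rest + rng.random() * (v_thresh - v_rest) for _ in range(n)]
    pending = [0.0] * n  # recurrent input waiting to be delivered
    spikes = []
    for step in range(int(round(t_total / dt))):
        nxt = [0.0] * n
        for i in range(n):
            drive = pending[i]
            if i in active:
                # Number of presynaptic V1 spikes in this time step.
                drive += j_ext * poisson(f_ext * dt, rng)
            v[i] += dt / tau_mem * (v_rest - v[i]) + drive
            if v[i] >= v_thresh:
                v[i] = v_rest  # reset after spiking
                spikes.append((step * dt, i))
                weight = j_exc if i < n_exc else -j_inh
                for tgt in targets[i]:
                    nxt[tgt] += weight
        pending = nxt
    return spikes
```

Under this reading of the coupling strengths, the mean external drive alone brings an isolated active neuron to *VR* + *Jext* *fmax* τ*mem* = −60 mV + 0.1 mV × 10 kHz × 10 ms = −50 mV, that is, just to threshold, so that firing in the sketch is driven by input fluctuations and recurrent coupling.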

#### *4.1.3. Local Field Potentials (LFPs)*

In the experiments motivating this work, spiking activity was not directly observable. Only neural population activities (LFPs) were measured by epidural electrodes. Likewise, using our model we generated LFP signals *U*(*t*) by a linear superposition of spiking activities of all neurons *j* in layer V4 and spiking activities of V1 neurons presynaptic to V4 neurons *i*, scaled by a mixing constant of *cmix* = 0.2. This was followed by a convolution with an exponential kernel *Kexp* (low-pass filter). In our network, this is a close approximation of summing the postsynaptic transmembrane currents of the V4 neurons (Lindén et al., 2011; Buzsáki et al., 2012; Makarova et al., 2014).

$$U(t) = K\_{\exp}(t, \tau\_k) \otimes \left(\sum\_{jk} \delta(t - t\_{jk}) + c\_{\text{mix}} \sum\_{ik} \delta(t - t'\_{ik})\right) \tag{6}$$

$$K\_{\exp}(t,\tau\_k) = \frac{1}{\tau\_k} \ e^{-t/\tau\_k}. \tag{7}$$

We used a time constant of τ*k* = 15 ms for the kernel and discarded a period of 50 ms (∼3.3 τ*k*) from both ends of the LFP signal in order to avoid boundary effects.
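Equations 6 and 7 can be sketched in a few lines of Python. The sketch assumes that spikes have already been binned into per-time-step counts and that the kernel is causal, in which case the convolution with the exponential kernel reduces to a first-order recursive filter. The function name and argument layout are hypothetical and do not reproduce the code used in this study.

```python
import math


def lfp_from_counts(v4_counts, v1_counts, dt=1e-4, tau_k=0.015,
                    c_mix=0.2, trim=0.05):
    """Low-pass filter the summed spike counts with an exponential kernel.

    v4_counts[n] and v1_counts[n] are the numbers of V4 and presynaptic V1
    spikes in time bin n. The first and last `trim` seconds are discarded.
    """
    decay = math.exp(-dt / tau_k)
    u, out = 0.0, []
    for n4, n1 in zip(v4_counts, v1_counts):
        # Each unit-area spike contributes K_exp(0) = 1 / tau_k at its own bin
        # and then decays exponentially with time constant tau_k.
        u = decay * u + (n4 + c_mix * n1) / tau_k
        out.append(u)
    k = int(round(trim / dt))
    return out[k:len(out) - k]
```

For a 250 ms trace at Δ*t* = 0.1 ms, for example, the returned signal covers the central 150 ms.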

#### **4.2. ANALYSIS OF NETWORK DYNAMICS**

#### *4.2.1. Spectral analysis*

Mirroring the experiments, we performed a wavelet transform using complex Morlet's wavelets ψ(*t*,*f*) (Kronland-Martinet et al., 1987) for time-frequency analysis. We obtained the spectral power of the LFPs via

$$p(t,f) = \left| \int\_{-\infty}^{+\infty} \psi(\tau\_{w}, f) \, U(t - \tau\_{w}) \, \mathrm{d}\tau\_{w} \right|^2. \tag{8}$$

In order to exclude boundary effects, we only took wavelet coefficients outside the cone-of-influence (Torrence and Compo, 1998). Finally, we averaged the power *p*(*t*,*f*) over time to obtain the frequency spectra *p*(*f*). This method is identical to the one used for the analysis of the experimental data (Rotermund et al., 2009). The power *p*(*t*, *f*) of the signal was calculated in *Nf* = 20 different, logarithmically spaced frequencies *f* , in the range from *fmin* = 5 Hz to *fmax* = 200 Hz.
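As an illustration of Equation 8, the following Python sketch computes the time-averaged power of a sampled signal at a single frequency. Several choices are assumptions made for the example rather than details taken from the experimental analysis: the wavelet normalization, the number of cycles (`n_cycles`), the truncation of the wavelet at three standard deviations, and the use of only those time points where the truncated wavelet fits entirely inside the signal, as a crude stand-in for the cone-of-influence criterion.

```python
import cmath
import math


def log_spaced(f_min=5.0, f_max=200.0, n_f=20):
    """Return n_f logarithmically spaced frequencies from f_min to f_max."""
    ratio = (f_max / f_min) ** (1.0 / (n_f - 1))
    return [f_min * ratio ** k for k in range(n_f)]


def morlet(f, dt, n_cycles=6.0):
    """Sampled complex Morlet wavelet, truncated at +/- 3 standard deviations."""
    sigma = n_cycles / (2.0 * math.pi * f)  # temporal width in seconds
    half = int(math.ceil(3.0 * sigma / dt))
    norm = 1.0 / math.sqrt(sigma * math.sqrt(math.pi))
    return [norm * math.exp(-(m * dt) ** 2 / (2.0 * sigma ** 2))
            * cmath.exp(2j * math.pi * f * m * dt)
            for m in range(-half, half + 1)]


def mean_power(signal, f, dt, n_cycles=6.0):
    """Time-averaged wavelet power p(f) of `signal` at frequency f."""
    psi = morlet(f, dt, n_cycles)
    half = len(psi) // 2
    total, count = 0.0, 0
    # Only use time points where the whole wavelet lies inside the signal.
    for n in range(half, len(signal) - half):
        acc = 0j
        for m, w in enumerate(psi):
            acc += w * signal[n - (m - half)]  # psi(tau_w) * U(t - tau_w)
        total += abs(acc * dt) ** 2
        count += 1
    if count == 0:
        raise ValueError("signal is too short for this frequency")
    return total / count
```

A spectrum at the 20 frequencies used here would then be obtained as `[mean_power(u, f, dt) for f in log_spaced()]`, where `u` is an LFP trace sampled at interval `dt`.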

#### *4.2.2. Support vector machine classification*

In order to assess the enhancement of stimulus representation in the LFPs, we performed SVM classification using the libsvm package (Chang and Lin, 2011). The SVM employed a linear kernel function and the quadratic programming method to find the separating hyperplanes. We implemented a leave-one-out routine, where we averaged over *Ntr* results obtained by using *Ntr* − 1 randomly selected trials for each stimulus for training and the remaining trial for testing.
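The leave-one-out routine can be sketched as follows. To keep the example self-contained, the linear SVM from libsvm is replaced by a nearest-centroid classifier, which is a deliberately simpler stand-in and not the classifier used in this study. The data layout and function names are hypothetical.

```python
def nearest_centroid_predict(train, x):
    """Return the label whose mean training vector is closest to x.

    train maps each stimulus label to a list of feature vectors.
    """
    best_label, best_dist = None, float("inf")
    for label, rows in train.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(spectra):
    """Average classification accuracy over all leave-one-out folds.

    spectra maps each stimulus label to a list of per-trial spectra, with
    the same number of trials for every stimulus. In each fold, one trial
    per stimulus is held out for testing and the rest are used for training.
    """
    n_tr = len(next(iter(spectra.values())))
    correct = total = 0
    for held in range(n_tr):
        train = {s: [row for t, row in enumerate(rows) if t != held]
                 for s, rows in spectra.items()}
        for s, rows in spectra.items():
            correct += nearest_centroid_predict(train, rows[held]) == s
            total += 1
    return correct / total
```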

#### *4.2.3. Discriminability index*

The discriminability index DI(*Jexc*, *Jinh*) was defined as

$$\text{DI} = \frac{1}{N\_a(N\_a - 1)/2} \frac{1}{N\_f} \frac{1}{N\_{tr}} \sum\_{i=1}^{N\_a - 1} \sum\_{j=i+1}^{N\_a} \sum\_f \sum\_{tr} \left[ \frac{\text{erf}(Z\_{DI}(f, tr, i, j)/\sqrt{2})}{2} + \frac{1}{2} \right] \tag{9}$$

with

$$Z\_{DI}(f,tr,i,j) = \frac{|\overline{p}\_i(f,tr) - \overline{p}\_j(f,tr)|}{\sigma\_{tr}(\overline{p}\_i(f,tr)) + \sigma\_{tr}(\overline{p}\_j(f,tr))} \tag{10}$$

where σ*tr* is the standard deviation of frequency spectra *p* over different trials *tr* and erf( · ) is the error function. The assumption underlying the DI measure is that, at a given frequency *f*, the LFP power across trials *tr* is normally distributed. Discriminability of two stimuli thus depends on how much the areas under their corresponding distributions overlap. DI represents the mean pairwise discriminability of unique stimulus pairs {*i*, *j*}, averaged over frequencies and trials. For one particular frequency band, the DI measure is related to the area-under-the-curve of a receiver-operator-characteristic of two normal distributions. By this definition, DI is normalized between 0.5 and 1, a higher DI indicating better discriminability. Because trials have a finite duration, however, DI has a bias which took an approximate value of 0.69 in our simulations (**Figures 3B**, **4A**, **6B**). In addition, since there are typically frequencies which carry no stimulus information (e.g., the 110 Hz band, see **Figure 2B**), DI is confined to values smaller than 1.

The discriminability index was further averaged over *Nw* independent realizations of the coupling matrix in the full simulations. In the reduced model, we ran the simulations for an extended duration of *T*total = 12 s. For computing DI, we then divided the LFP time series into *Ntr* = 36 trials.
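A minimal Python sketch of Equations 9 and 10 is given below. It assumes that the quantity entering Equation 10 is the time-averaged power of a single trial, that the trial-to-trial standard deviation is estimated with the sample formula, and that spectra are stored as nested lists. These choices and the function name are assumptions made for illustration.

```python
import math
import statistics


def discriminability_index(spectra):
    """Mean pairwise discriminability (Equations 9 and 10).

    spectra[i][tr][f] is the time-averaged power for stimulus i,
    trial tr, and frequency index f.
    """
    n_a, n_tr, n_f = len(spectra), len(spectra[0]), len(spectra[0][0])
    # Standard deviation over trials for every stimulus and frequency.
    sigma = [[statistics.stdev([spectra[i][tr][f] for tr in range(n_tr)])
              for f in range(n_f)] for i in range(n_a)]
    total = 0.0
    for i in range(n_a - 1):
        for j in range(i + 1, n_a):
            for f in range(n_f):
                denom = sigma[i][f] + sigma[j][f]
                for tr in range(n_tr):
                    diff = abs(spectra[i][tr][f] - spectra[j][tr][f])
                    if denom > 0.0:
                        z = diff / denom
                    else:  # no trial-to-trial variability at all
                        z = math.inf if diff > 0.0 else 0.0
                    total += 0.5 * math.erf(z / math.sqrt(2.0)) + 0.5
    n_pairs = n_a * (n_a - 1) / 2
    return total / (n_pairs * n_f * n_tr)
```

Under this reading, and in the limit of many trials, two stimuli with identical normal power distributions yield an expected DI of 1/2 + arctan(1/√2)/π ≈ 0.696 rather than 0.5, which is close to the bias of 0.69 quoted above.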

#### **4.3. NEURONAL AVALANCHES**

#### *4.3.1. Separation of time scales*

A neuronal avalanche is defined as the consecutive propagation of activity from one unit to the next in a system of coupled neurons. The size of a neuronal avalanche is equal to the total number of neurons that are involved in that avalanche event, which starts when a neuron fires, propagates through generations of postsynaptic neurons, and ends when no neurons are activated anymore. Avalanche duration is then defined as the number of generations of neurons an avalanche event propagated through. In such a system, the critical point is characterized by a scale-free distribution of avalanche sizes and durations.

In simulations assessing avalanche statistics, recurrent spikes were delivered instantaneously to all postsynaptic neurons for proper separation of two different avalanches. This means that as soon as an avalanche event started, action potentials were propagated to all the generations of postsynaptic spikes within the same time step, until the avalanche event ended. This corresponds to a separation of timescales between delivery of external input and avalanche dynamics. In this way we could determine the avalanche sizes precisely, by "following" the propagation of every spike through the network.

In addition, we implemented a basic form of refractoriness which prevented a neuron from firing more than once during an avalanche event (holding its membrane potential at *VR* after it fired). Since each avalanche event took place in a single time step of the simulations, this corresponded to each neuron having an effective refractory period equivalent to the integration time step Δ*t*.
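The generation-by-generation propagation with refractoriness can be sketched in Python as follows. The data layout (`targets`, `is_exc`), the function name, and the convention that the initiating neuron counts as the first generation are assumptions made for this illustration.

```python
def run_avalanche(seed_neuron, v, targets, is_exc, j_exc, j_inh,
                  v_rest=-60.0, v_thresh=-50.0):
    """Propagate one avalanche to completion within a single time step.

    v is the list of membrane potentials (modified in place), targets[j]
    lists the postsynaptic neurons of j, and is_exc[j] is True for
    excitatory neurons. Returns (size, duration).
    """
    fired = {seed_neuron}
    v[seed_neuron] = v_rest
    generation = [seed_neuron]
    duration = 0
    while generation:
        duration += 1
        kicks = {}
        for j in generation:
            weight = j_exc if is_exc[j] else -j_inh
            for i in targets[j]:
                if i not in fired:  # refractory neurons stay at v_rest
                    kicks[i] = kicks.get(i, 0.0) + weight
        next_generation = []
        for i, dv in kicks.items():
            v[i] += dv
            if v[i] >= v_thresh:
                v[i] = v_rest
                fired.add(i)
                next_generation.append(i)
        generation = next_generation
    return len(fired), duration
```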

#### *4.3.2. Analysis of criticality of dynamics*

For each network realization, we obtained the probability *P*(*s*) of observing an avalanche of size *s* by normalizing histograms of avalanche sizes.

For every distribution *P*(*s*) obtained from our simulations, we calculated a maximum-likelihood estimator τ̂ for the power-law exponent τ using the statistical analysis described in Clauset et al. (2009) for discrete distributions. For a comprehensive account of the fitting method please see Clauset et al. (2009). To explain the procedure briefly, we started by defining a log-likelihood function *L*(τ). This quantifies the likelihood that the *n* empirical avalanche size observations *si* (*i* = 1,..., *n*), which were recorded during our simulations, were drawn from a perfect power-law distribution with exponent τ.

$$\mathcal{L}(\tau) = -n \ln \zeta(\tau, s\_{\min}) - \tau \sum\_{i=1}^{n} \ln s\_i \tag{11}$$

where

$$\zeta(\tau, s\_{\min}) = \sum\_{n=0}^{\infty} \ (n + s\_{\min})^{-\tau} \tag{12}$$

is the Hurwitz zeta function. For a set of τ-values in the interval [1.1, 4], we computed *L*(τ) (using Equation 11), and the value of τ which maximized the log-likelihood was taken as the exponent τ̂ of the power-law fit *Pfit*(*s*) ∝ *s*<sup>−τ̂</sup>. During the fitting procedure, we used a lower cut-off threshold *smin* = *N*/100 = 10 and an upper cut-off threshold *smax* = 0.6*N* = 600. In other words, we fit a power law to the set of empirical observations in the interval *smin* ≤ *si* ≤ *smax*. We repeated this fitting procedure to obtain power-law exponents α for the avalanche duration distributions *P*(*T*) ∝ *T*<sup>−α</sup>, using *Tmin* = 5 and *Tmax* = 30.
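The grid search over Equation 11 can be sketched in Python as follows. The Hurwitz zeta function is approximated here by a finite sum plus an Euler-Maclaurin tail correction, and the grid spacing of 0.01 is chosen for the example, since the spacing is not specified above. Function names are hypothetical.

```python
import math


def hurwitz_zeta(tau, q, n_terms=1000):
    """Approximate the Hurwitz zeta function (Equation 12) for tau > 1.

    The first n_terms terms are summed directly; the remainder is replaced
    by its integral plus half of the first omitted term.
    """
    head = sum((n + q) ** -tau for n in range(n_terms))
    x = n_terms + q
    return head + x ** (1.0 - tau) / (tau - 1.0) + 0.5 * x ** -tau


def fit_exponent(sizes, s_min, s_max, tau_grid=None):
    """Return the grid value of tau that maximizes Equation 11.

    Only observations with s_min <= s <= s_max enter the likelihood.
    """
    data = [s for s in sizes if s_min <= s <= s_max]
    if not data:
        raise ValueError("no observations inside the fitting window")
    n = len(data)
    sum_log = sum(math.log(s) for s in data)
    if tau_grid is None:
        tau_grid = [1.1 + 0.01 * k for k in range(291)]  # 1.1 ... 4.0

    def log_likelihood(tau):
        return -n * math.log(hurwitz_zeta(tau, s_min)) - tau * sum_log

    return max(tau_grid, key=log_likelihood)
```

With the cut-offs given above, a call would read `fit_exponent(sizes, s_min=10, s_max=600)`.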

For clarity, it is important to point out that the ML analysis described in Clauset et al. (2009) does not take into consideration an upper cut-off in the empirical power-law distributions. One of the reasons we used an upper cut-off threshold during fitting is that the automated detection of critical points using the γ measure required us to fit a power-law exponent also to subcritical and supercritical avalanche size distributions. Using the complete tail of the distribution during the fitting procedure, for example in supercritical regimes, would yield a bias toward lower exponent estimates which would make it difficult to reliably detect the bump at large event sizes. This would hinder the detection of critical points using the γ measure, as it depends on an exponent which reliably represents the behavior of the distribution in the medium range of event sizes. More importantly, most of the size and duration distributions we observed at critical points displayed an exponential upper cut-off, as also observed in other experimental and theoretical work (Beggs and Plenz, 2003; Beggs, 2008; Petermann et al., 2009; Klaus et al., 2011; De Arcangelis and Herrmann, 2012). In statistics of neuronal avalanches, the exact location of the cut-off threshold depends strongly on system size and the duration of observations, and increasing either will increase the number of sampled avalanches and shift the cut-off threshold to higher values, but not make it vanish. In addition, excluding the observations above a cut-off threshold reduced the absolute magnitude of the log-likelihood function for all values of τ (Equation 11), but the value of τ which maximized the log-likelihood provided us with a better estimate of the exponent for the middle range of the distributions where power-law scaling was prominent.

We used a least squares fitting procedure to find the power-law exponents for ⟨*s*⟩(*T*) (Weisstein, 2002), as it is not a probability distribution, using *Tmin* = 2 and *Tmax* = 20. In this procedure, the exponent 1/σν*z* of the function ⟨*s*⟩(*T*) ∝ *T*<sup>1/σν*z*</sup> is given by the closed expression

$$\frac{1}{\sigma \nu z} = \frac{m \sum\_{i=1}^{m} \left( \ln T\_i \ln \langle s \rangle\_i \right) - \sum\_{i=1}^{m} \left( \ln T\_i \right) \sum\_{i=1}^{m} \left( \ln \langle s \rangle\_i \right)}{m \sum\_{i=1}^{m} \left( \ln T\_i \right)^2 - \left( \sum\_{i=1}^{m} \left( \ln T\_i \right) \right)^2} \tag{13}$$

where *m* is the total number of points on the function ⟨*s*⟩(*T*), *Ti* are the duration values of the points and ⟨*s*⟩*i* are the corresponding mean avalanche sizes.
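Equation 13 is the ordinary least-squares slope in log-log coordinates and can be written directly in Python. The function name and argument layout are hypothetical.

```python
import math


def scaling_exponent(durations, mean_sizes, t_min=2, t_max=20):
    """Least-squares estimate of the exponent in <s>(T) ~ T**exponent.

    durations[i] and mean_sizes[i] are paired values T_i and <s>_i; only
    points with t_min <= T_i <= t_max are used.
    """
    pts = [(math.log(t), math.log(s))
           for t, s in zip(durations, mean_sizes) if t_min <= t <= t_max]
    m = len(pts)
    sum_x = sum(x for x, _ in pts)
    sum_y = sum(y for _, y in pts)
    sum_xy = sum(x * y for x, y in pts)
    sum_xx = sum(x * x for x, _ in pts)
    return (m * sum_xy - sum_x * sum_y) / (m * sum_xx - sum_x ** 2)
```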

The KS statistic *D* was computed using

$$D = \max\_{s \ge N/100} |F(s) - F\_{\text{fit}}(s)| \tag{14}$$

where *F*(*s*) and *Ffit*(*s*) are the cumulative distribution functions (CDFs) of *P*(*s*) and *Pfit*(*s*), respectively.
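A minimal Python sketch of Equation 14 follows. It assumes that both cumulative distributions are evaluated and normalized on the fitting window between *smin* and *smax*, which is one possible reading of the definition and may differ from the normalization used in this study. The function name is hypothetical.

```python
from collections import Counter


def ks_statistic(sizes, tau, s_min, s_max):
    """Maximum distance between the empirical and fitted CDFs.

    Assumption: both CDFs are restricted to, and normalized on, the
    window s_min <= s <= s_max.
    """
    data = [s for s in sizes if s_min <= s <= s_max]
    if not data:
        raise ValueError("no observations inside the fitting window")
    counts = Counter(data)
    n = len(data)
    support = range(s_min, s_max + 1)
    weights = [s ** -tau for s in support]
    norm = sum(weights)
    f_emp = f_fit = d = 0.0
    for s, w in zip(support, weights):
        f_emp += counts.get(s, 0) / n
        f_fit += w / norm
        d = max(d, abs(f_emp - f_fit))
    return d
```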

We defined the transition region, where the network dynamics switch from sub-critical to super-critical statistics, using the binary indicator variable γ:

$$\gamma = \begin{cases} 1 & \text{if } \quad F(N) - F(0.6N - 1) > F'\_{\text{fit}}(N) - F'\_{\text{fit}}(0.6N - 1) \\ 0 & \text{else} \end{cases} \tag{15}$$

In Equation 15, *F′fit*(*s*) = *Ffit*(*s*) *F*(*N*/100)/*Ffit*(*N*/100). γ assumes a value of 1, signifying super-critical statistics, if the tail of the empirical avalanche size distribution *P*(*s* > 0.6*N*) is heavier than that of the fit. Additionally, we visually verified that the indicator γ works well for describing the behavior of the distributions in coupling space. The region in which the mean of γ over the *Na* different stimuli lies between 0 and 1 was termed the transition region.

#### **4.4. COMPUTATION OF INFORMATION ENTROPY**

We quantified information entropy *H*(**X**) using a state variable **X** which represents the spiking patterns of V4 neurons at a given time point *t* (**Figure 1B**). We constructed the probability *P*(**X** = *xi*) of observing a spike pattern *xi* using the *T*total/Δ*t* spike patterns observed in one trial.

$$H(\mathbf{X}) = -\sum\_{i} P(\mathbf{x}\_{i}) \log\_{2} P(\mathbf{x}\_{i}) \tag{16}$$

Considering different read-out strategies of the information encoded by V4 neurons in the higher visual areas, we computed information entropy in different scales of observation *K*. These scales were defined as follows (**Figure 1B**):

For the finest observation scale, *K* = 1, the state variable **X** consists of *N* channels, representing *N* V4 neurons. Each channel assumes a value of 1 if the corresponding neuron generated an action potential at time *t*, and 0 otherwise. We randomly picked the order in which different neurons were represented in **X**.

As we increase the observation scale *K*, **X** comprises *N*/*K* channels, and each channel represents the sum of spikes from *K* different neurons. For *K* > 1, we constructed **X** by adding up the spiking activity of *K* consecutive neurons, while conserving the aforementioned random order of neurons over the channels. At the coarsest scale of observation, we sum over the activity of the whole network (i.e., for *K* = 1000, **X** is a scalar in the interval [0, 1000]).
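The coarse-graining and the entropy of Equation 16 can be illustrated with a short Python sketch. It assumes that spike patterns are supplied as equal-length tuples of 0s and 1s, already arranged in the fixed random neuron order. Function names are hypothetical.

```python
import math
from collections import Counter


def coarse_grain(pattern, k):
    """Sum the activity of k consecutive neurons into one channel."""
    return tuple(sum(pattern[c:c + k]) for c in range(0, len(pattern), k))


def pattern_entropy(patterns, k=1):
    """Entropy (in bits) of the observed patterns at observation scale k.

    patterns holds one binary spike pattern per time step.
    """
    counts = Counter(coarse_grain(p, k) for p in patterns)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

For the four two-neuron patterns (1, 0), (0, 1), (1, 0), (0, 1), the entropy is 1 bit at *K* = 1 and 0 bits at *K* = 2, since summing over both neurons removes the identity information that distinguishes the two patterns.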

#### **FUNDING**

This work has been supported by the Bundesministerium für Bildung und Forschung (BMBF, Bernstein Award Udo Ernst, Grant No. 01GQ1106).

#### **ACKNOWLEDGMENTS**

The authors would like to thank Dr. Andreas Kreiter for providing the data shown in **Figure 2A**, and Dr. Klaus Pawelzik for fruitful discussions about the project.

#### **REFERENCES**

Abeles, M. (1991). *Corticonics: Neural Circuits of the Cerebral Cortex*. New York, NY: Cambridge University Press. doi: 10.1017/CBO9780511574566

Baiesi, M., and Paczuski, M. (2004). Scale-free networks of earthquakes and aftershocks. *Phys. Rev. E* 69, 066106. doi: 10.1103/PhysRevE.69.066106


neuronal avalanche data. *Phys. Rev. Lett.* 108, 208102. doi: 10.1103/PhysRevLett.108.208102


neurons. *Neuroscience* 153, 1354–1369. doi: 10.1016/j.neuroscience.2008.03.050


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 April 2014; accepted: 04 August 2014; published online: 25 August 2014. Citation: Tomen N, Rotermund D and Ernst U (2014) Marginally subcritical dynamics explain enhanced stimulus discriminability under attention. Front. Syst. Neurosci. 8:151. doi: 10.3389/fnsys.2014.00151*

*This article was submitted to the journal Frontiers in Systems Neuroscience. Copyright © 2014 Tomen, Rotermund and Ernst. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Critical role for resource constraints in neural models

#### *James A. Roberts<sup>1</sup>\*, Kartik K. Iyer<sup>1,2</sup>, Sampsa Vanhatalo<sup>3</sup> and Michael Breakspear<sup>1,4</sup>*

*<sup>1</sup> Systems Neuroscience Group, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia*

*<sup>2</sup> Faculty of Health Sciences, School of Medicine, University of Queensland, Brisbane, QLD, Australia*

*<sup>3</sup> Department Clinical Neurophysiology, Children's Hospital, Helsinki University Central Hospital, University of Helsinki, Helsinki, Finland*

*<sup>4</sup> Royal Brisbane and Women's Hospital, Herston, QLD, Australia*

#### *Edited by:*

*Dietmar Plenz, National Institute of Mental Health, NIH, USA*

#### *Reviewed by:*

*Jason N. MacLean, University of Chicago, USA Hongdian Yang, Johns Hopkins University School of Medicine, USA*

#### *\*Correspondence:*

*James A. Roberts, Systems Neuroscience Group, QIMR Berghofer Medical Research Institute, 300 Herston Rd., Herston, QLD 4006, Australia e-mail: james.roberts@qimrberghofer.edu.au*

Criticality has emerged as a leading dynamical candidate for healthy and pathological neuronal activity. At the heart of criticality in neural systems is the need for parameters to be tuned to specific values or for the existence of self-organizing mechanisms. Existing models lack precise physiological descriptions for how the brain maintains its tuning near a critical point. In this paper we argue that a key ingredient missing from the field is a formulation of reciprocal coupling between neural activity and metabolic resources. We propose that the constraint of optimizing the balance between energy use and activity plays a major role in tuning brain states to lie near criticality. Important recent findings aligned with our viewpoint have emerged from analyses of disorders that involve severe metabolic disturbances and alter scale-free properties of brain dynamics, including burst suppression. Moreover, we argue that average shapes of neuronal avalanches are a signature of scale-free activity that offers sharper insights into underlying mechanisms than afforded by traditional analyses of avalanche statistics.

**Keywords: criticality, mathematical models, metabolic resources, burst suppression, scale-free dynamics**

## **INTRODUCTION**

A substantial body of evidence now suggests that the brain operates near criticality. That is, analyses of healthy (Meisel et al., 2013) and pathological (Roberts et al., 2014) brain activity yield parameters lying near the cusp between stability and instability. Such a state confers benefits of increased flexibility (Kinouchi and Copelli, 2006; Shew et al., 2009), optimized information transfer (Beggs and Plenz, 2003; Shew et al., 2011), and increased storage capacity (Haldeman and Beggs, 2005; Shew et al., 2011). However, the question of *how* the brain maintains criticality is not clear. Prevailing theories posit various mechanisms but little attention has been paid to unifying these. In this Perspective Article, we argue that since existing mechanisms ultimately rely on various forms of activity-dependent modulation, models that integrate neuronal activity with metabolic resources present an opportunity for unifying existing theories of neuronal criticality. Moreover, we suggest that disambiguation of competing models would benefit from complementing traditional approaches of calculating scaling exponents with analyses of the deeper scaling properties encoded in average event shapes. This has been employed successfully in physics, but has only recently found traction in neuroscience.

## **COMPETING MECHANISMS IN MODELS OF CRITICAL BRAIN DYNAMICS**

Much of the attention on critical brain dynamics has centered on neuronal avalanches in fluctuating local field potentials measured using small grids of electrodes (Beggs and Plenz, 2003; Petermann et al., 2009; Priesemann et al., 2013), though signatures of criticality have been detected in many other large-scale measurements including MEG (Palva et al., 2013; Shriki et al., 2013), EEG (Linkenkaer-Hansen et al., 2001; Palva et al., 2013; Roberts et al., 2014), and fMRI (Haimovici et al., 2013). Modeling efforts have tended to focus on spatial avalanches in networks of spiking neurons, with relatively few analyses of criticality in large-scale models relevant to EEG (Steyn-Ross et al., 1999; Robinson et al., 2010; Aburn et al., 2012). Such models will be crucial for describing the macroscopic scale accessible in human studies.

Models of critical brain dynamics typically fall into two classes: those with a tuning parameter and those that self-organize. Models with a tuning parameter only exhibit critical dynamics when the model parameters are set precisely at the critical state, such as in branching processes (Beggs and Plenz, 2003; Haldeman and Beggs, 2005) and in typical mean-field models (Steyn-Ross et al., 1999). Parameter-setting mechanisms are outside the scope of these models by design; presumably, slow parameter modulations exist to set the parameters, but these are not explicitly modeled. In self-organizing models, the parameters evolve "naturally" to the critical point, usually involving synaptic plasticity based on either the strength of activity (De Arcangelis et al., 2006; Levina et al., 2007) or spike timing (Meisel and Gross, 2009; Rubinov et al., 2011). Another means of self-organizing to a critical point is to grow a network from scratch, with activity-dependent plasticity governing the growth rules (Tetzlaff et al., 2010). A common feature of self-organization in physics is a separation of time scales between the slow build-up of energy and fast relaxation or dissipation; earthquakes are a classic example (Sethna et al., 2001). While similar time-scale separations exist in many neural models, an explicit link to energy (or at least a proxy for energy) is rarely made. We argue that such links will be important for unifying various tuning mechanisms.

## **AVERAGE BURST SHAPES ARE SENSITIVE TO UNDERLYING MECHANISMS**

The proliferation of models exhibiting criticality has centered on reproducing scale-free distributions of event sizes and durations seen empirically, with varying degrees of biological realism versus abstraction. While criticality likely arises in more than one context in the brain, it is also likely that there is room to unify theories where they describe the same phenomenon. Conversely, it is important to find ways of distinguishing between competing mechanisms that do not necessarily perform equally well in all settings. Disambiguating competing models is likely hampered by the limited set of measures typically used when benchmarking candidate models against empirical data. Power-law exponents are the most widely used means of testing model validity. However, multiple models can exhibit the same exponents while having different mechanisms and avalanche shapes (Sethna et al., 2001). Thus, average shapes are a sharper test of competing theories. This has been particularly successful in studies of ferromagnetism, where existing theories that reproduced correct exponents were shown to not reproduce the correct shapes (Mehta et al., 2002). By moving beyond traditional analyses, average shapes reveal deeper mechanistic insights (Zapperi et al., 2005; Papanikolaou et al., 2011). This approach has recently been applied in neuroscience, revealing a variety of shapes in both data and models (Friedman et al., 2012; Priesemann et al., 2013; Roberts et al., 2014). In particular, invariance of average shapes across time scales is a strong indicator of scale-free dynamics, while a scale-dependent change in shape (such as asymmetry at long time scales, quantified with skewness) hints at deviations from perfect scale-free behavior that may not be visible in typical event statistics such as size distributions.

Recently it was shown that burst suppression following hypoxia exhibits a striking example of scale-free dynamics (Roberts et al., 2014). Burst suppression occurs in various abnormal brain states such as recovery from hypoxia and during anesthesia, and is characterized by near-quiescent "suppressed" periods punctuated erratically by large-amplitude "bursts" of electrical activity. In post-hypoxic burst suppression, scale-free properties vary across the recovery period, with scale-free burst distributions prominent during the burst-suppression phase, exhibiting stronger truncation upon the resumption of healthy activity (**Figure 1A**). These statistical features appear to relate closely to the pathophysiology, as they co-vary significantly with later clinical outcome (Iyer et al., 2014). Since criticality is usually associated with healthy states, its existence in neonatal burst suppression broadens criticality's applicability to at least one pathology, and suggests that the developing brain may provide a new window into critical brain states.

Power-law scaling is also seen in duration distributions and in the scaling relationship between sizes and durations, with the exponents related in line with theories of crackling noise (Roberts et al., 2014). But scaling exponents do not tell the whole story. Scale-invariance of burst shapes is disrupted in the burst-suppression phase, showing increasing leftward asymmetry at long time scales (**Figure 1B**). Again, this feature of metabolically-compromised cortex diminishes upon resumption of healthy activity (**Figure 1C**, and cf. insets of panels **B** and **C**). This scale-free signature is thus acutely sensitive to the pathophysiology. In light of their success in explaining Barkhausen noise in ferromagnetism (Sethna et al., 2001; Mehta et al., 2002; Zapperi et al., 2005), where analysis of average shapes led to the development of new models, we argue that average shapes are under-utilized as a signature of scale-free dynamics in neural systems. We hope that rigorous testing of models against data will enable similar progress to that seen in the study of ferromagnetism. Moreover, analysis of events themselves, rather than coarse summary statistics, is underused in clinical settings.

**FIGURE 1 | Signatures of criticality in burst suppression EEG. (A)** Distributions of burst area (BA) for burst suppression (red) and later in recovery (blue), with corresponding power-law fits (green and orange, respectively). **(B)** Asymmetric average burst shapes for burst suppression EEG over a range of durations (red to blue, shortest to longest). Inset: burst skewness (-) as a function of duration *T* for burst suppression with linear fit (red). **(C)** Symmetric average burst shapes from EEG recorded later during recovery. Inset: burst skewness later in recovery. **(D)** Asymmetric average burst shapes from the simple model showing resource depletion. For more details see Roberts et al. (2014).

## **UNIFYING MECHANISMS OF SELF-ORGANIZATION VIA BIOPHYSICAL MODELING OF RESOURCE CONSTRAINTS**

Asymmetry of the average shape arises from history-dependent effects (Zapperi et al., 2005; Roberts et al., 2014). A well-established example in physics is the response of a ferromagnet in a slowly changing external field (a classic, controllable example of criticality). There, the external field aligns microscopic domains in the magnet, but instead of gradually aligning in unison the individual domains flip suddenly and erratically, triggering similar flips in their neighbors. This yields a bursty signal termed Barkhausen noise, a striking example of crackling noise (Sethna et al., 2001) with characteristic asymmetric burst shapes that lean to the left. These shapes were explained using a model with history dependence derived from the dynamics of domain wall pinning (Zapperi et al., 2005; Papanikolaou et al., 2011). For burst suppression in post-hypoxic neonates, it was found that left-leaning bursts (**Figure 1D**) arise from a simple model with activity-dependent damping:

$$
\dot{x} = -\lambda x + \xi(t), \tag{1}
$$

$$
\lambda = \alpha\_2 \int\_{-\infty}^t e^{-\alpha\_1(t-\tau)} x(\tau)^2 \, d\tau. \tag{2}
$$

Here, *x* represents neuronal activity, ξ is a Gaussian white noise drive, λ is the damping coefficient, and α<sub>1</sub> and α<sub>2</sub> are constants. This form was motivated by the fast-out slow-return nature of the leftward asymmetry: damping is low at the beginning of a burst but rises as activity accumulates in its recent history. This is consistent with the post-hypoxic brain being acutely sensitive to its constrained metabolic resources. Although this is a simple phenomenological model, the central idea of activity-dependent modulations is widely applicable. For example, metabolic constraints have recently been incorporated into a cellular model to explain a different (non-scale-free) type of burst suppression induced in adult EEG during propofol anesthesia (Ching et al., 2012). Moreover, advancing technologies for measuring metabolic variables will yield rich data sets prompting model development. Oxygen availability has recently been shown to be tightly coupled to levels of excitability in slice preparations (Hajos et al., 2009; Ivanov and Zilberter, 2011), prompting calls to study the feedback loop between activity and energy availability (Zilberter et al., 2010). Such approaches may yield new insights into activity that requires high metabolic load, such as the high-frequency gamma activity associated with higher cognitive functions (Kann, 2011). Furthermore, live O<sub>2</sub> monitoring enables unprecedented insight into metabolic dynamics (Ingram et al., 2014), motivating new models of seizure dynamics (Wei et al., 2014) that complement models of ion concentrations (Cressman et al., 2009). This last application is notable because brain dynamics have been shown to deviate from criticality during seizures (Meisel et al., 2012).
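As an illustration of how Equations 1 and 2 can be explored numerically, the following is a minimal sketch using Euler-Maruyama integration. It relies on the observation that differentiating the exponential-kernel integral in Equation 2 gives the equivalent local form dλ/dt = −α<sub>1</sub>λ + α<sub>2</sub>*x*<sup>2</sup>. The function name, parameter values, time step, and the zero initial history are assumptions chosen for illustration and are not taken from the original study.

```python
import numpy as np

def simulate_damped_activity(alpha1=1.0, alpha2=1.0, dt=1e-3,
                             n_steps=20000, seed=0):
    """Euler-Maruyama sketch of Equations 1 and 2.

    The integral for lambda is replaced by its equivalent ODE,
    d(lambda)/dt = -alpha1 * lambda + alpha2 * x**2.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    lam = np.zeros(n_steps)  # assumes no activity before t = 0
    # Increments of the Gaussian white noise drive scale with sqrt(dt).
    noise = rng.standard_normal(n_steps - 1) * np.sqrt(dt)
    for i in range(n_steps - 1):
        x[i + 1] = x[i] - lam[i] * x[i] * dt + noise[i]
        lam[i + 1] = lam[i] + (-alpha1 * lam[i] + alpha2 * x[i] ** 2) * dt
    return x, lam

x, lam = simulate_damped_activity()
```

Because λ is driven only by *x*<sup>2</sup>, it stays non-negative in this scheme as long as α<sub>1</sub>·dt < 1, so the damping can only grow with recent activity and relax when activity is low.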

Thus, we argue that since signatures of scale-free dynamics appear to be sensitive to metabolic disturbances, proper understanding of these dynamics should parsimoniously describe the underlying metabolic system to which the dynamics are closely coupled. This allows the metabolic states to tune the neuronal states. More concretely, typical models of the form

$$
\dot{\mathbf{x}} = f(\mathbf{x}, M, t) + \xi(t), \tag{3}
$$

where *M* are parameters, can be extended to incorporate dynamics for the slow evolution of *M* given by

$$
\dot{M} = \epsilon \, g(\mathbf{x}, M, t), \tag{4}
$$

where ε is a small parameter determining the separation of time scales. This formalism of slow parameter dynamics (not necessarily metabolic) is widely used to model oscillatory systems such as bursting in individual neurons (Izhikevich, 2000), EEG oscillations in anesthesia (Liley and Walsh, 2013; Ching and Brown, 2014), and seizures (Jirsa et al., 2014). The bifurcations involved in such oscillatory transitions are likely different from the critical points responsible for scale-free dynamics, but the core approach is valid for modeling all types of slow parameter changes, and should be applied in studies of neuronal criticality. In our example of post-hypoxic burst suppression, one could envisage three time scales: fast neuronal dynamics on the order of tens of milliseconds, slower dynamics governing activity-dependence within bursts on the order of hundreds of milliseconds to seconds, and very slow dynamics describing the recovery trajectory in and out of burst suppression on the order of tens of minutes. Indeed, such a hierarchy of time scales in a phenomenological model successfully explains many features of seizure dynamics (Jirsa et al., 2014). On slower time scales still, we expect that another key target for such modeling will be the sleep-wake cycle, which is itself fundamentally tied to slow homeostatic processes and known to exhibit temporally-varying signatures of criticality (Meisel et al., 2013; Priesemann et al., 2013).
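The fast-slow structure of Equations 3 and 4 can be sketched generically as follows. This is a hedged illustration only: the state *x* is taken to be scalar rather than a vector for brevity, and the function name `simulate_slow_fast`, the example choices of *f* and *g*, and all numerical values are hypothetical.

```python
import numpy as np

def simulate_slow_fast(f, g, x0, m0, eps=0.01, dt=1e-3,
                       n_steps=10000, noise_sd=1.0, seed=0):
    """Euler-Maruyama sketch of a fast variable x with a slow parameter M.

    dx/dt = f(x, M, t) + noise   (fast, cf. Equation 3)
    dM/dt = eps * g(x, M, t)     (slow, cf. Equation 4)
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    m = np.empty(n_steps)
    x[0], m[0] = x0, m0
    for i in range(n_steps - 1):
        t = i * dt
        kick = noise_sd * np.sqrt(dt) * rng.standard_normal()
        x[i + 1] = x[i] + f(x[i], m[i], t) * dt + kick
        m[i + 1] = m[i] + eps * g(x[i], m[i], t) * dt
    return x, m

# Hypothetical example: M acts as a damping that slowly tracks x**2.
x, m = simulate_slow_fast(f=lambda x, m, t: -m * x,
                          g=lambda x, m, t: x ** 2 - m,
                          x0=0.0, m0=1.0)
```

Setting `eps=0` freezes *M* and recovers a fixed-parameter model, which is a convenient check that the slow dynamics are the only source of parameter drift.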

More broadly, all mechanisms for slow parameter modulations are tightly constrained by the need for the brain to optimize the use of its resources. This view has been extraordinarily successful in explaining the structure of brain networks in terms of minimizing wiring costs (Bullmore and Sporns, 2012), yet has been used only sparingly to study large-scale brain dynamics. The brain evolved under the constraint of finite resources, so understanding how this constraint shapes brain dynamics will likely tell us more about the specific resource constraints, the resulting dynamics, and how the brain may be organized to circumvent these restrictions. Most attention thus far has been devoted to overall activity levels (Attwell and Laughlin, 2001), and even then most of the brain's energy expenditure remains unexplained (Raichle, 2006; Buzsáki et al., 2007). We hypothesize that resource constraints not only underpin activity-dependent modulations on micro- and meso-scales, but collectively act to keep the brain near a critical point on the macro-scale. That is, optimizing the balance between the brain's competing needs of being active while not squandering its energy supplies seems consistent with self-organization to a critical point. Failures of this balance lead to neurological disorders (Meisel et al., 2012; Roberts et al., 2014), demonstrating that studying pathological activity, particularly in metabolically demanding states, enables better understanding of healthy brain states.

In sum, these considerations suggest new unifying principles across the spectrum of criticality in neural systems as well as new means of disambiguating between competing causal mechanisms. Crucially, this approach also suggests a means of integrating data from emerging technologies that combine electrical, hemodynamic, and metabolic imaging—a major upcoming challenge for neuroscience.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 May 2014; accepted: 05 August 2014; published online: 22 August 2014. Citation: Roberts JA, Iyer KK, Vanhatalo S and Breakspear M (2014) Critical role for resource constraints in neural models. Front. Syst. Neurosci. 8:154. doi: 10.3389/fnsys.2014.00154*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Roberts, Iyer, Vanhatalo and Breakspear. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Spike avalanches *in vivo* suggest a driven, slightly subcritical brain state

## *Viola Priesemann<sup>1,2,3,4</sup>\*, Michael Wibral<sup>5,6</sup>, Mario Valderrama<sup>7</sup>, Robert Pröpper<sup>8,9</sup>, Michel Le Van Quyen<sup>10</sup>, Theo Geisel<sup>1,2</sup>, Jochen Triesch<sup>3</sup>, Danko Nikolić<sup>3,4,6,11</sup> and Matthias H. J. Munk<sup>12</sup>*

*<sup>1</sup> Department of Non-linear Dynamics, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany*

*<sup>2</sup> Bernstein Center for Computational Neuroscience, Göttingen, Germany*

*<sup>3</sup> Frankfurt Institute for Advanced Studies, Frankfurt, Germany*

*<sup>4</sup> Department of Neurophysiology, Max Planck Institute for Brain Research, Frankfurt, Germany*

*<sup>5</sup> Magnetoencephalography Unit, Brain Imaging Center, Johann Wolfgang Goethe University, Frankfurt, Germany*

*<sup>6</sup> Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany*

*<sup>7</sup> Department of Biomedical Engineering, University of Los Andes, Bogotá, Colombia*

*<sup>8</sup> Neural Information Processing Group, Department of Software Engineering and Theoretical Computer Science, TU Berlin, Berlin, Germany*

*<sup>9</sup> Bernstein Center for Computational Neuroscience, Berlin, Germany*

*<sup>10</sup> Centre de Recherche de l'Institut du Cerveau et de la Moelle épinière, Hôpital de la Pitié-Salpêtrière, INSERM UMRS 975—CNRS UMR 7225-UPMC, Paris, France*

*<sup>11</sup> Department of Psychology, Faculty of Humanities and Social Sciences, University of Zagreb, Zagreb, Croatia*

*<sup>12</sup> Physiology of Cognitive Processes, Max Planck Institute for Biological Cybernetics, Tübingen, Germany*

#### *Edited by:*

*Valentina Pasquale, Fondazione Istituto Italiano di Tecnologia, Italy*

#### *Reviewed by:*

*John M. Beggs, Indiana University, USA; Mauro Copelli, Federal University of Pernambuco, Brazil; Silvia Scarpetta, University of Salerno, Italy; Miguel Angel Muñoz, Universidad de Granada, Spain*

#### *\*Correspondence:*

*Viola Priesemann, Max Planck Institute for Dynamics and Self-Organization, Am Fassberg 17, 37077 Göttingen, Germany; e-mail: v.priesemann@gmx.de*

In self-organized critical (SOC) systems, avalanche size distributions follow power-laws. Power-laws have also been observed for neural activity, and so it has been proposed that SOC underlies brain organization as well. Surprisingly, for *spiking* activity *in vivo*, evidence for SOC is still lacking. Therefore, we analyzed highly parallel spike recordings from awake rats and monkeys, anesthetized cats, and also local field potentials from humans. We compared these to spiking activity from two established critical models: the Bak-Tang-Wiesenfeld model, and a stochastic branching model. We found fundamental differences between the neural and the model activity. These differences could be overcome for both models through a combination of three modifications: (1) subsampling, (2) increasing the input to the model (this way eliminating the separation of time scales, which is fundamental to SOC and its avalanche definition), and (3) making the model slightly sub-critical. The match between the neural activity and the modified models held not only for the classical avalanche size distributions and estimated branching parameters, but also for two novel measures (mean avalanche size, and frequency of single spikes), and for the dependence of all these measures on the temporal bin size. Our results suggest that neural activity *in vivo* shows a mélange of avalanches, and not temporally separated ones, and that their global activity propagation can be approximated by the principle that one spike on average triggers a little less than one spike in the next step. This implies that neural activity does not reflect a SOC state but a slightly sub-critical regime without a separation of time scales. Potential advantages of this regime may be faster information processing, and a safety margin from super-criticality, which has been linked to epilepsy.

**Keywords: self-organized criticality, human intracranial recordings, spike train analysis, highly parallel recordings, spiking neural networks, multiunit activity, cortex, monkeys**

**Measures, variables, and abbreviations:** α, connection strength or synaptic strength; β, scaling exponent (DFA); σ, branching parameter; σ<sup>∗</sup>, estimated branching parameter; τ, critical exponent of the avalanche size distribution; *bs*, bin size; DFA, detrended fluctuation analysis; *f(s)*, avalanche size distribution; *f*(*s* = 1, *bs*), frequency of avalanches of size *s* = 1 and their dependence on the bin size; *h*, rate of input spikes, also called drive (Hz); <*s*>, mean avalanche size; <*IEI*>, average inter-event interval, <*IEI*> = 1/*R*; *N*, number of sampled (model) neurons; *r*, rate per unit (Hz); *R*, population rate (Hz); STS, separation of time scales.

## **INTRODUCTION**

Avalanches, earthquakes, and forest fires are all cascades of activity in otherwise quiescent systems (Gutenberg and Richter, 1944; Bak et al., 1987; Drossel and Schwabl, 1992; Frette et al., 1996; Dickman et al., 2000). Most of the time, the size of these cascades, or avalanches, is small, but sometimes avalanches are large enough to span the entire system. The size *s* of an avalanche is the number of units activated during a cascade, and interestingly, the distribution *f(s)* of avalanche sizes in the systems mentioned above precisely follows a power law:

$$f(s) \sim s^{-\tau} \tag{1}$$

where τ is the critical exponent. Critical exponents determine the macroscopic behavior of a system, and indicate the system's universality class (Wilson, 1975).

Power law distributions are characteristic for second-order phase transitions, where the system is in a "critical" state. If the system evolves to reach a critical state without fine-tuning of control parameters, the system is termed *self-organized critical* (SOC) (Bak et al., 1987; Jensen, 1998; Nagler et al., 1999; Beggs and Plenz, 2003; Frigg, 2003; Beggs and Timme, 2012; Pruessner, 2012).

SOC models show avalanches or cascades of activity across their units, which may arise from simple local interactions (Bak et al., 1987; Drossel and Schwabl, 1992; Olami et al., 1992). These avalanches can include all units in the system. However, most avalanches are small or intermediate in size. Note that avalanches of size one, i.e., only one unit is active and no further activity is triggered, have the highest chance of occurring (see Equation 1). Overall, avalanches are not characterized by an average size, i.e., the size distribution is scale-free, and only the true size of the system restricts the avalanche size range.

In nervous systems, scale-free properties have been observed in local field potentials (LFP), electro- and magnetoencephalographic (EEG, MEG) activity, and BOLD signals (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Petermann et al., 2009; Hahn et al., 2010; Ribeiro et al., 2010; Tetzlaff et al., 2010; Friedman et al., 2012; Poil et al., 2012; Tagliazucchi et al., 2012; Priesemann et al., 2013; Shriki et al., 2013). They have been found in different preparations, ranging from cultures to *in vivo* preparations, and across different species and phyla: leeches, rats, cats, monkeys, and humans (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Mazzoni et al., 2007; Pasquale et al., 2008; Petermann et al., 2009; Priesemann et al., 2009, 2013; Hahn et al., 2010; Ribeiro et al., 2010; Tetzlaff et al., 2010; Friedman et al., 2012; Poil et al., 2012; Tagliazucchi et al., 2012; Shriki et al., 2013). The prevailing hypothesis is that scale-free neural activity arises from SOC behavior (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Mazzoni et al., 2007; Beggs, 2008; Pasquale et al., 2008; Petermann et al., 2009; Shew et al., 2009; Hahn et al., 2010; Ribeiro et al., 2010; Tetzlaff et al., 2010; Friedman et al., 2012; Poil et al., 2012; Tagliazucchi et al., 2012; Gal and Marom, 2013; Shriki et al., 2013). However, there are also studies that reported deviations from scale-free activity: Neural activity was shown to exhibit sub-critical and super-critical behavior during development *in vitro* (Pasquale et al., 2008; Tetzlaff et al., 2010; Friedman et al., 2012); and there are also studies in which *in vivo* neural activity appeared as sub-critical (Bedard et al., 2006; Priesemann et al., 2013). Thus, healthy brains seem to be capable of organizing themselves into a range of states that are not necessarily SOC.

Nevertheless, because neural activity from coarse scale measures (e.g., population spikes, LFP, MEG, BOLD) often does show power law scaling, the same was expected for more basic constituents of neural activity, namely the spiking activity. Surprisingly, however, spike avalanches often deviated from power law scaling (Bedard et al., 2006; Pasquale et al., 2008; Hahn et al., 2010; Tetzlaff et al., 2010). In fact, to the best of our knowledge, there is not a single study that demonstrated power laws for spikes in awake animals. The deviations from power law scaling in previous studies were attributed either to sub- or supercritical states (Pasquale et al., 2008; Tetzlaff et al., 2010), or to subsampling effects (Ribeiro et al., 2010). Subsampling refers to the technical constraint that only a fraction of all neurons in a given area can be measured. Subsampling can impede the observation of power law distributions in SOC models (Priesemann et al., 2009, 2013; Ribeiro et al., 2010; Girardi-Schappo et al., 2013) and hence a critical system can be misinterpreted as sub- or super-critical (Priesemann et al., 2009). Therefore, subsampling effects need to be taken into account when interpreting spike avalanches.

An important property of SOC systems, which is potentially absent in neural activity, is the separation of time scales (STS) (Bak et al., 1987; Drossel and Schwabl, 1992; Clar et al., 1996; Dickman et al., 2000; Pruessner, 2012; Hartley et al., 2013) whereby pauses between avalanches last much longer than the avalanches proper. For example, forest fires last for a much shorter time than it takes to regrow the forest. Similarly, earthquakes are much more rapid than the time it takes to build shear stress through plate tectonics (Drossel and Schwabl, 1992; Clar et al., 1996, 1999; Baiesi and Paczuski, 2004). Likewise, in the classical sandpile model, scale-free avalanche distributions are observed only if the grains are dropped at a low enough rate (Vespignani and Zapperi, 1997, 1998). This low rate of external input, called drive, is a necessary condition for the long pauses and hence for SOC (Bak et al., 1987; Drossel and Schwabl, 1992; Clar et al., 1996; Dickman et al., 2000; Pruessner, 2012; Hartley et al., 2013).

Neither the neural activity we analyzed here, nor that from previous studies of neural avalanches showed STS: There were no long pauses in the neural activity which could be seen as natural separations between avalanches. Without such pauses, unambiguous detection of the beginning and the end of an individual avalanche is not possible. Hence, the method of temporal binning had been introduced as a workaround (Beggs and Plenz, 2003) (**Figure 1**). Here, the choice of the bin size determines what is considered to be a pause between avalanches. Consequently, avalanche sizes necessarily change with the choice of the bin size (see e.g., Beggs and Plenz, 2003; Priesemann et al., 2009, 2013; Hahn et al., 2010). This implies that also the avalanche size distributions and, more importantly, power law exponents change with the choice of bin size (Beggs and Plenz, 2003; Priesemann et al., 2013). This is in marked contrast to fully sampled SOC systems, in which the power law exponents do not change under temporal binning as a result of STS. These differences have to be considered when comparing neural activity *in vivo* to that of classical SOC models.

As indicated above, in classical SOC systems each avalanche is separated from the next one by a long pause. In contrast, in *driven* SOC systems, i.e., SOC systems without STS, avalanches can meet, merge, intermingle, and split up: They form a mélange. As we demonstrate in this paper, neural activity indeed resembles such a mélange of avalanches instead of well-separated ones.

To investigate the differences between *in vivo* and model activity, we analyzed spike avalanches recorded in awake rats and monkeys, anesthetized cats, and LFP avalanches recorded in humans, and compared these *in vivo* avalanches to avalanches from an established SOC model (Bak-Tang-Wiesenfeld model) (Bak et al., 1987; Dunkelmann and Radons, 1994; Priesemann et al., 2009, 2013), and to those from a stochastic branching model (Harris, 1963; Haldeman and Beggs, 2005).

**FIGURE 1 | Definition of avalanche sizes, branching parameter σ<sup>∗</sup>, and their change with bin size. (A)** To define avalanches, temporal binning (boxes) is applied to a sequence of spikes (red dots and diamonds). Empty bins are marked in blue. An avalanche is an ensemble of spikes in a sequence of non-empty bins. Its size *s* is the total number of spikes, as indicated above the bins. The branching parameter σ<sup>∗</sup><sub>i</sub> is the ratio of the number of spikes in one bin to the number of spikes in the previous bin, as indicated below the bins. If the previous bin was empty, σ<sup>∗</sup><sub>i</sub> is "not defined" (nd). The estimated branching parameter σ<sup>∗</sup> for an experiment is the average over all σ<sup>∗</sup><sub>i</sub>. [...]

become larger, since pauses "disappear". The branching parameter σ<sup>∗</sup> also changes with the bin size. **(C)** Under subsampling, only a fraction of the units are recorded (red dots), while others are missed (gray). Thereby subsampling can split a single avalanche into several parts. **(A–C)** In the model, spikes are either triggered externally by some drive (red diamonds), or they are evoked by presynaptic activity (red dots). If a second avalanche is triggered while the first one is still active [last avalanche in **(A)**], then the two avalanches cannot be told apart and are evaluated as if they were a single one.
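The binning procedure described in the caption of Figure 1 can be sketched in a few lines. The following is a minimal illustration, not the authors' analysis code: the function name, the toy spike times, and the assumption that spike times are non-negative are all hypothetical. Following the caption, a ratio σ<sup>∗</sup><sub>i</sub> is computed for every bin whose predecessor is non-empty (so the last bin of an avalanche contributes a ratio of zero), and bins with an empty predecessor are skipped as "not defined".

```python
import numpy as np

def avalanche_sizes(spike_times, bin_size):
    """Extract avalanche sizes and an estimated branching parameter.

    spike_times: pooled spike times of all recorded units (>= 0, same
                 units as bin_size). Returns (sizes, sigma_star).
    """
    spike_times = np.asarray(spike_times, dtype=float)
    if spike_times.size == 0:
        return np.array([], dtype=int), float("nan")
    # Count spikes per temporal bin.
    idx = np.floor(spike_times / bin_size).astype(int)
    counts = np.bincount(idx, minlength=idx.max() + 1)
    # An avalanche is a run of non-empty bins; its size is the spike total.
    sizes, current = [], 0
    for c in counts:
        if c > 0:
            current += c
        elif current > 0:
            sizes.append(current)
            current = 0
    if current > 0:
        sizes.append(current)
    # Ratio of each bin to its predecessor, skipping empty predecessors.
    prev, nxt = counts[:-1], counts[1:]
    defined = prev > 0
    sigma_star = float(np.mean(nxt[defined] / prev[defined])) if defined.any() else float("nan")
    return np.array(sizes), sigma_star

toy_spikes = [0.1, 0.2, 1.5, 3.1, 3.2, 3.3, 4.4]  # arbitrary toy data
sizes_fine, sigma_fine = avalanche_sizes(toy_spikes, bin_size=1.0)      # sizes: [3, 4]
sizes_coarse, sigma_coarse = avalanche_sizes(toy_spikes, bin_size=2.0)  # sizes: [7]
```

In this toy example the empty bin that separates two avalanches at a bin size of 1 is absorbed when the bin size is doubled, so the two avalanches merge into a single larger one and the estimated branching parameter changes as well.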

## **RESULTS**

As a widely held belief states that mammalian nervous systems operate in a SOC state, we first briefly recapitulate the theoretically expected avalanche statistics in this state by example of a SOC model and a critical stochastic branching model. We then show that all of the analyzed neural avalanches *in vivo* showed clear deviations from the expected statistics.

The remainder of the results then demonstrates how two simple and neurophysiologically well-motivated conceptual changes in the models can serve to align model and *in vivo* activity with respect to a large set of measured quantities.

### **DIFFERENCES BETWEEN NEURAL DYNAMICS** *IN VIVO* **AND SOC**

The first example model is a simple neural network model, which is known to have SOC properties (Bak et al., 1987). Furthermore, this SOC model has been shown to match LFP avalanches in monkeys and humans (Priesemann et al., 2009, 2013). In our study, the model consisted of 2500 non-leaky integrate-and-fire neurons arranged as a 50 by 50 grid with nearest neighbor connections of synaptic strength α = 1 (see Methods). In this model, spikes are either evoked by activity from presynaptic neurons, or by a random external input to a neuron. This input is termed drive and has a rate *h*. For *h* → 0 and α = 1, this model obeys local energy conservation (Bonachela et al., 2010), and is equivalent to the well-known SOC Bak-Tang-Wiesenfeld model (Bak et al., 1987). *h* → 0 is necessary for a model to be SOC (Vespignani and Zapperi, 1997, 1998; Dickman et al., 2000), because it guarantees the obligatory STS. *h* → 0 is implemented by applying external input only when there is otherwise no activity in the model. The input triggers an avalanche, i.e., a cascade of events. The size *s* of an avalanche is defined as the total number of spikes evoked by a single input spike. This model is known to show a power law for *f(s)* with slope τ ≈ 1 (**Figure 2A**), and a cutoff at *s* ≈ 1000 (Bak et al., 1987). This cutoff reflects the finite size of the model (Bak et al., 1988; Kadanoff et al., 1989; Ktitarev et al., 2000).

*Figure caption fragment (apparently Figure 2):* [...] for all bin sizes. *h* was chosen such that the population rate *R* of the 100 [...] lines indicate potential power law slopes to guide the eye. All *f(s)* are logarithmically binned and *f(s)* is in absolute counts.

To later demonstrate that our conclusions are not specific to the SOC model above, we simulated a second model, namely a stochastic branching model (see Methods) (Harris, 1963; Haldeman and Beggs, 2005). Like the SOC model, it was simulated with 2500 neurons, but in contrast to the SOC model, the *k* = 4 postsynaptic neurons were chosen randomly at each step. Activity propagated stochastically, i.e., an active neuron activated each of its *k* postsynaptic neurons with probability *p* = α/*k*. Like the SOC model, this model is critical for α = 1, and sub- (super-) critical for α < 1 (α > 1). The critical stochastic branching model with STS also showed a power law distribution for *f(s)*, but with a different critical exponent (τ = 1.5, **Figure 2G**).
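The propagation rule of the stochastic branching model can be illustrated with a short simulation of a single avalanche under separation of time scales (one seed spike, no further drive). This is a simplified sketch rather than the implementation described in the Methods: the function name, the step cap `max_steps`, and the treatment of coincident inputs (a neuron targeted more than once in a step fires only once) are assumptions.

```python
import numpy as np

def branching_avalanche(alpha=1.0, k=4, n_neurons=2500, rng=None,
                        max_steps=10_000):
    """Size of one avalanche started by a single seed spike.

    Each active neuron draws k random postsynaptic targets per step and
    activates each of them with probability p = alpha / k.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = alpha / k
    active = np.array([rng.integers(n_neurons)])
    size = 0
    for _ in range(max_steps):  # hypothetical cap to guarantee termination
        if active.size == 0:
            break
        size += active.size
        targets = rng.integers(n_neurons, size=(active.size, k))
        fired = rng.random((active.size, k)) < p
        # Assumption: a neuron hit several times in one step spikes once.
        active = np.unique(targets[fired])
    return size

rng = np.random.default_rng(0)
sizes = [branching_avalanche(alpha=0.9, rng=rng) for _ in range(1000)]
```

For α = 0 every avalanche has size one, since no spike is ever passed on. For α < 1, and as long as coincident inputs are rare, standard branching-process theory gives a mean avalanche size of roughly 1/(1 − α).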

The results for the stochastic branching model and the SOC model were qualitatively the same for all measures used below. The similarity also held when the models were modified analogously. Therefore, in the following, we mainly report results for the SOC model.

Our critical models were contrasted with highly parallel recordings from awake rats (hippocampus), awake monkeys (prefrontal cortex), and from an anesthetized cat (visual cortex area 18). The avalanche distributions *f(s)* from these *in vivo* spike recordings were all very similar, but clearly differed from those obtained from the fully sampled critical models (compare **Figures 2D–F** with **A,G**). In particular, the *in vivo f(s)* neither followed a power law, in contrast to what is expected for a SOC system, nor an exponential distribution, as would be expected for independent Poissonian activity (Figures S1 and S2 show the *in vivo f(s)* for each experiment in double-logarithmic and log-linear scales, respectively). Quantitatively, the *f(s)* were best fit in 16 out of 17 experiments by a lognormal distribution

$$f(s) \sim e^{-\frac{\left(\ln(s) - \mu\right)^2}{2\sigma^2}}$$

with parameters μ = 0.89 ± 0.25 and variance σ<sup>2</sup> = 1.2 ± 0.1, given a bin size of one average inter-event interval (<*IEI*>) (see Clauset et al., 2007; Priesemann et al., 2013 for details). Based on these parameters, the maximum of *f(s)* was at *s* = 0.87 ± 0.38 (mean ± SD). Because this maximum typically lies below the smallest possible avalanche size (*s* = 1), *f(s)* was monotonically decreasing over the observed range. Two alternative distributions, namely stretched exponentials and power laws with cutoff, also provided reasonable fits, with likelihoods ∼1% worse than the one for the lognormal distribution.
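The position of the maximum can be checked with a short calculation. The sketch below assumes the standard lognormal density, which carries a 1/*s* prefactor that the proportionality above does not show explicitly; under that assumption the mode is exp(μ − σ<sup>2</sup>). It uses only the mean parameter values quoted in the text, so the result need not coincide with the reported average over experiments.

```python
import math

def lognormal_pdf(s, mu, sigma2):
    """Standard lognormal density (assumes the usual 1/s prefactor)."""
    norm = s * math.sqrt(2.0 * math.pi * sigma2)
    return math.exp(-(math.log(s) - mu) ** 2 / (2.0 * sigma2)) / norm

def lognormal_mode(mu, sigma2):
    """Location of the maximum of the standard lognormal density."""
    return math.exp(mu - sigma2)

mu, sigma2 = 0.89, 1.2                 # mean fit parameters from the text
mode = lognormal_mode(mu, sigma2)      # exp(-0.31), i.e., below s = 1
values = [lognormal_pdf(s, mu, sigma2) for s in range(1, 51)]
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
```

With these values the mode falls below *s* = 1, so the density evaluated at integer avalanche sizes decreases monotonically, in line with the statement above.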

Interestingly, all *in vivo* avalanche distributions were similar despite changes in the population rate *R* by a factor of 50 (from 37 to 1560 Hz) across the 17 experiments (Figures S1, S2).

Note that some of the *f(s)* of the rat experiments could also be approximated by a power law, but at most for one selected bin size (Figure S3A). When slightly changing the bin size, the *f(s)* clearly deviated from power law scaling (Figure S3B). This is in stark contrast to the behavior expected for SOC systems.

A second striking difference between critical models and *in vivo* activity was that the *in vivo f(s)* changed with the bin size across a range from 0.5 to 128 ms. The reason for the bin size dependence was that *in vivo* recordings showed pauses of variable length between the spikes, while SOC activity showed only the long pauses between avalanches, which are due to STS. In order to introduce pauses of variable length into the model avalanches, one can apply subsampling and drop STS (see next two sections).

#### **SUBSAMPLING INTRODUCES PAUSES AT SHORT TIME SCALES**

Subsampling refers to the problem that we are far from being able to sample all spikes from all neurons, even for a single brain area (**Figure 1C**). Thus, for a careful comparison between *in vivo* recordings and models, the activity from the models should be subsampled in the same manner as in the experiments. Because in each experiment around 100 neurons were recorded in parallel, in the model we constrained the sampling also to *N* = 100 randomly chosen neurons out of the 2500. We fixed the subsampling by the number of neurons, and not the fraction, because running these critical models with millions of neurons is beyond our computational capacities, and because the qualitative results did not change in larger models, i.e., when decreasing the fraction (see below).

When applying subsampling, the model avalanche size distribution *f(s)* changed with bin size (**Figures 2B,H**). A change in bin size affected *f(s)*, because subsampling introduces apparent pauses in a single avalanche (**Figure 1C**). These apparent pauses were relatively short compared to the duration of an avalanche, and compared to the pauses between avalanches in the full model (by definition of STS). Therefore, when subsampling, *f(s)* changed only with small bin sizes but stopped changing its shape with larger ones (**Figures 2B,H**).
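Subsampling as described here amounts to discarding all spikes from unrecorded units before any avalanche analysis is applied. A minimal sketch, assuming spikes are stored as parallel arrays of times and neuron indices (the function name and data layout are hypothetical):

```python
import numpy as np

def subsample_spikes(spike_times, neuron_ids, n_sampled=100,
                     n_total=2500, seed=0):
    """Keep only spikes from n_sampled randomly chosen neurons."""
    rng = np.random.default_rng(seed)
    sampled = rng.choice(n_total, size=n_sampled, replace=False)
    spike_times = np.asarray(spike_times)
    neuron_ids = np.asarray(neuron_ids)
    keep = np.isin(neuron_ids, sampled)
    return spike_times[keep], neuron_ids[keep], sampled

# Toy raster: every one of the 2500 neurons spikes exactly once.
ids = np.arange(2500)
times = np.linspace(0.0, 1.0, 2500)
t_sub, id_sub, sampled = subsample_spikes(times, ids)
```

In the toy raster only 100 of the 2500 spikes survive, so gaps appear between the remaining spikes that were not present in the fully sampled activity. These gaps are the apparent pauses that make *f(s)* depend on the bin size.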

These results also held when using a larger model and sampling the same number of neurons, i.e., a smaller fraction of neurons. In this case, the distance and hence the traveling time of avalanches between sampled neurons became larger and longer, and the inter spike intervals became unrealistically long. Nonetheless, at large bin size, a similar fraction of small avalanches was observed (due to STS). As a consequence, *f(s)* also stopped changing like in smaller models, and never became as flat as the *in vivo f(s)*. Hence, the behavior of a larger model was the same as that of smaller ones, but on a longer time scale.

Subsampling the SOC model did not only introduce a dependence of *f(s)* on the bin size, it also affected the cutoff of *f(s)*. Thereby, the absolute value of the cutoff became more similar for the model and the *in vivo f(s)* (**Figures 2B,H**).

In sum, acknowledging subsampling effects in the model allowed for a better match between the model and the *in vivo* activity, but only for small bin sizes up to a few milliseconds. For larger bin sizes, the *in vivo f(s)* continued to become flatter, while the model *f(s)* stopped changing their shape. This indicated that a modification to the model dynamics itself was necessary to match *in vivo* activity.

#### **AN INCREASE IN DRIVE RATE** *h* **CREATES A MÉLANGE OF AVALANCHES**

We hypothesized that *in vivo* and SOC activity differed because SOC models have STS (Vespignani and Zapperi, 1997, 1998; Dickman et al., 2000), which is necessarily absent *in vivo*. STS can be eliminated from the models by increasing the drive rate *h*. We increased *h* in such a way that the model population rate *R* matched the *in vivo* population rate under subsampling (*h* = 0.02 Hz, and *R* = 320 Hz; single neuron rate *r* in the model: *r* = *R*/*N* = 3.2 Hz). In this *driven* SOC model, the avalanches were no longer separated by long pauses (**Figure 3B**). Instead, at any point in time, avalanches could start, meet, intermingle, split into branches, or die out (**Figures 1**, **3B**). In such a mélange of avalanches, single avalanches can no longer be tracked.

The mélange of avalanches in the driven model hardly showed any pauses when all neurons were sampled (**Figure 3B**). However, under subsampling, pauses were more frequent. Thus, subsampling allowed for an extraction of apparent avalanches by applying temporal binning (**Figure 1**). Note that these apparent avalanches do not correspond to the avalanches observed in classical SOC models in which avalanches are separated by long pauses, and are thereby defined unambiguously. However, the apparent avalanches from the driven models are conceptually the same as those extracted from *in vivo* recordings because avalanches in both cases are extracted with the same method.

As expected for the *driven*, subsampled SOC model, *f(s)* changed with all bin sizes (**Figures 2C,I**), and thereby resembled the *in vivo f(s)* much better than the original SOC model (**Figure 2**).

#### **DRIVEN CRITICAL AND DRIVEN SUB-CRITICAL STATES**

In the following, we address the question of whether subsampling and the elimination of STS are sufficient to match the model activity with the *in vivo* activity, or whether it is necessary to additionally introduce deviations from criticality.

To tune models away from criticality, we made use of the fact that SOC and branching models are only critical in the conservative limit (α = 1) (Harris, 1963; Bonachela and Muñoz, 2009; Bonachela et al., 2010). Hence, by introducing dissipation (α < 1) these models can be made sub-critical. In fact, the model dynamics showed a smooth transition from the "driven SOC" state (α = 1) to pure Poisson activity (α = 0) (**Figures 3**, **4**) with decreasing α. In principle, a decrease in α also decreased the firing rate *r* of each unit. To still maintain a constant firing rate *r*, a concomitant increase in the drive rate *h* was applied. In this way, the model could make the transition from driven SOC to Poissonian activity without a change in *r* (**Figure 4**, black line). Given a fixed *r*, a decrease in α decreased the variability of the model's population rate (**Figure 3**).
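The trade-off between α and *h* at a fixed rate can be illustrated with a deliberately simplified process. The sketch below is not the authors' lattice SOC model or their branching network. It assumes a mean-field driven branching process in which the activity in the next time step is Poisson distributed with mean α·A + h. For that assumed process the stationary mean activity is h/(1 − α), so the hypothetical helper sets h = rate·(1 − α) to hold the mean fixed. The drive values required in the authors' finite models may differ from this relation.

```python
import numpy as np

def simulate_driven_branching(alpha, rate, n_steps, seed=0):
    """Population activity of a mean-field driven branching process.

    Assumption (not from the source): A[t+1] ~ Poisson(alpha * A[t] + h),
    with 0 <= alpha < 1 and h chosen such that the mean activity is `rate`.
    """
    rng = np.random.default_rng(seed)
    h = rate * (1.0 - alpha)       # drive that keeps the mean at `rate`
    activity = np.empty(n_steps, dtype=np.int64)
    current = rng.poisson(rate)    # start close to the stationary mean
    for t in range(n_steps):
        current = rng.poisson(alpha * current + h)
        activity[t] = current
    return activity

# Same mean activity, different alpha: compare the fluctuations.
poisson_like = simulate_driven_branching(0.0, 10.0, 100_000, seed=1)
near_critical = simulate_driven_branching(0.9, 10.0, 100_000, seed=1)
print(poisson_like.mean(), poisson_like.var())
print(near_critical.mean(), near_critical.var())
```

In this toy process the mean stays the same while the variance of the population activity shrinks as α is lowered, which is the qualitative behavior reported for the models.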

To understand which network dynamics between driven critical and Poissonian accounted best for the *in vivo* spike avalanches, we identified those measures in the model which depended most sensitively on α *under subsampling*. We found that α had a prominent effect on the avalanche size distribution *f(s)*, in particular on how *f(s)* depended on the bin size. We quantified this below using the following avalanche measures: the mean avalanche size (<*s*>), the frequency of avalanches of size *s* = 1 (*f*(*s* = 1)), and the estimated branching parameter σ<sup>∗</sup>. The way in which these measures changed with the bin size depended sensitively on α. In addition, we estimated the scaling exponent β of the "detrended fluctuation analysis" (DFA) (Peng et al., 1994, 1995; Kantelhardt et al., 2002). (Note that the scaling exponent β is often denoted as α in the literature.) The results of these analyses are presented in detail below, and compared one by one to the *in vivo* results.

#### **THE MEAN AVALANCHE SIZE**

The mean avalanche size (<*s*>) from the subsampled model followed a power law with increasing bin size for α = 1 (driven SOC), and followed an exponential for α = 0 (Poissonian activity) (**Figure 5A**). For intermediate values of α, the relation changed gradually.

For the experiments, the observed <*s*> at a given bin size depended strongly on the population spike rate *R* that varied considerably between experiments (*R* ranged from 37 Hz to 1.5 kHz). To diminish the impact of *R*, we used a normalized bin size, i.e., a bin size in units of average inter-event-intervals (1 <*IEI*> = 1/*R*). Using the normalized bin size, the <*s*> of all experiments overlapped (**Figure 5A**, gray lines). However, the <*s*> did not follow a power law with changing bin size *in vivo*, in contrast to the driven critical model. In fact, the *in vivo* <s> was best matched by the <*s*> of the driven, sub-critical models (α ≈ 0.99). Thus, comparing the *in vivo* and model <*s*> indicated that spike avalanches resembled a driven sub-critical regime more closely than a driven SOC state.
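The normalization of the bin size can be sketched as follows. The function name and arguments are hypothetical, and the sketch assumes that 1 <*IEI*> is estimated as the mean interval between successive spikes of the pooled spike train. It relies on the fact that every spike belongs to exactly one avalanche, so the mean avalanche size equals the number of spikes divided by the number of avalanches.

```python
import numpy as np

def mean_size_vs_normalized_bin(spike_times, bins_in_iei):
    """Mean avalanche size <s> for bin sizes given in units of <IEI>.

    spike_times: pooled spike times of all sampled units.
    bins_in_iei: iterable of bin sizes in units of the mean
                 inter-event interval (1 <IEI> = 1/R).
    Returns a dict mapping each normalized bin size to <s>.
    """
    t = np.sort(np.asarray(spike_times, dtype=float))
    mean_iei = np.mean(np.diff(t))  # estimate of 1/R
    result = {}
    for b in bins_in_iei:
        idx = np.floor((t - t[0]) / (b * mean_iei)).astype(np.int64)
        # A new avalanche starts wherever at least one empty bin
        # lies between two successive spikes.
        n_avalanches = 1 + np.count_nonzero(np.diff(idx) > 1)
        result[b] = t.size / n_avalanches
    return result
```

Because the bin size is expressed relative to the population rate, curves from recordings with very different *R* can be plotted on the same axis.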

#### **THE FREQUENCY OF AVALANCHES OF SIZE ONE**

The frequency of avalanches of size *s* = 1, *f*(*s* = 1, *bs*) quantifies how *f(s)* decayed with the bin size (*bs*) at *s* = 1, i.e., how the intercept of *f(s)* with the y-axis in **Figure 2** changed. *f(s)* at *s* = 1 was equally spaced from bin size 1 to 32 ms for the driven critical models under subsampling (**Figures 2C,I**) which is remarkable as it corresponds to a power law behavior of *f*(*s* = 1, *bs*) for the driven SOC model (black line in **Figure 5B**; note that the x-axis here is in <*IEI*>, and 1 <IEI> = 2 ms in the model). For the sub-critical models (α < 1), *f*(*s* = 1, *bs*) decayed more steeply than a power law. For the Poissonian case (α = 0), it followed an exponential. In this respect, *f*(*s* = 1, *bs*) and <*s*> showed similar behaviors with α.
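A minimal sketch of how *f*(*s* = 1, *bs*) could be computed for one bin size is given below. It assumes that the "frequency" is the fraction of all avalanches that have size one; whether the authors used relative or absolute frequencies before normalization is not stated in this section. The function name is hypothetical.

```python
import numpy as np

def size_one_frequency(spike_times, bin_size):
    """Fraction of binning-dependent avalanches that have size s = 1."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    idx = np.floor((t - t[0]) / bin_size).astype(np.int64)
    # Split the spike train wherever at least one empty bin
    # separates two successive spikes.
    starts = np.flatnonzero(np.diff(idx) > 1) + 1
    sizes = np.diff(np.concatenate(([0], starts, [t.size])))
    return np.mean(sizes == 1)
```

Evaluating the function over a range of bin sizes and plotting the result on double-logarithmic axes gives a curve of the kind discussed here; dividing by the value at *bs* = 1 <*IEI*> reproduces the normalization used for the figure.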

*f*(*s* = 1, *bs*) is a promising new measure to assess criticality under subsampling, because in contrast to many other measures, its behavior did not change with the subsampling strategy: For the driven SOC model, it showed power law scaling independently of the number and spatial arrangement of the sampled units (**Figure 6**). However, the slope of the power law did change due to the model's next-neighbor topology: With smaller distances between sampled sites, the power laws became flatter (red and pink traces in **Figure 6**). For the stochastic branching model, the same results held, but the power law slopes did not change under subsampling, owing to the model's random topology.

**FIGURE 5 | Two new avalanche measures. (A)** The mean avalanche size and **(B)** the frequency of avalanches with size *s* = 1, *f*(*s* = *1*, *bs*), changed with the bin size (*bs*) in the model (colored) and in the experiments (gray). The colored lines show *f*(*s* = *1*, *bs*) for the model with varying synaptic strength α. In the model, the drive rate *h* was adjusted such that each neuron spiked with *r* ≈ 5 Hz. In **(B)**, *f*(*s* = *1*, *bs*) was normalized such that *f*(*s* = *1*, *bs* = *1* <*IEI*>) = 1.

The *in vivo f*(*s* = 1, *bs*) did not follow a power law (**Figure 5B**, gray lines), and for most cases did not follow an exponential dependency either (**Figure 5B**). The best approximation for the *in vivo f*(*s* = 1, *bs*) was the driven, slightly sub-critical model (α ≈ 0.99). This is in agreement with the results for <*s*>.

The precise value of α necessary to achieve the best match between model and experiments potentially depended on a number of factors (e.g., finite size effects). However, the main result, namely that <*s*> and *f*(*s* = 1, *bs*) observed *in vivo* followed neither a power law nor an exponential dependence on the bin size, excludes both critical and Poissonian states of operation.

#### **THE BRANCHING PARAMETER** *σ*<sup>∗</sup>

A widely used measure to estimate whether the *in vivo* avalanches reflected a driven SOC brain state is the branching parameter σ<sup>∗</sup>, which has been used in many past studies about neural avalanches to test whether the brain was SOC (Beggs and Plenz, 2003; Beggs, 2007; Plenz and Thiagarajan, 2007; Priesemann et al., 2009, 2013; Shew et al., 2009; Klaus et al., 2011; Shriki et al., 2013). The analysis of σ<sup>∗</sup> was initially inspired by the theory of branching processes (Harris, 1963), in which σ = 1 guarantees that a branching process is critical. Note, however, that *estimating* σ<sup>∗</sup> from data may yield misleading results, because σ<sup>∗</sup> depends on various factors such as the bin size (Beggs and Plenz, 2003; Priesemann et al., 2013), the subsampling geometry (Priesemann et al., 2009), and STS (i.e., *h* → 0 vs. *h* > 0). We next show how σ<sup>∗</sup> depended on these factors in our models, and then use these results to estimate whether the *in vivo* avalanches might reflect a SOC state.

For the modified SOC model, we expect that σ equals α. For the second model we used, i.e. the stochastic branching model, we *know* by definition of the model that σ equals α. However, when estimating σ<sup>∗</sup> in this model by applying temporal binning to the model activity, finding the expected σ<sup>∗</sup> = α was the exception, not the rule (Figure S4; results were very similar to the ones for the SOC model in **Figure 7**). In addition, σ<sup>∗</sup> changed with the bin size, although the model parameter σ proper is obviously bin size independent (**Figures 7**, S4). Although the estimated σ<sup>∗</sup> failed to approximate the true σ, σ<sup>∗</sup> may still be a viable approach to compare model and *in vivo* activity in the following. Since the results for both models were basically the same, we again focus on the results for the modified SOC model.
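To make the bin-size dependence of the estimate concrete, the sketch below implements one common form of the estimator: for every avalanche, the number of spikes in its second bin is divided by the number of spikes in its first bin, and the ratios are averaged over avalanches. This form is an assumption here; the exact estimator used by the authors is defined in their Methods, which are not part of this section. The function name is hypothetical.

```python
import numpy as np

def estimate_sigma(spike_times, bin_size):
    """Estimate sigma* as the mean ratio of second-bin to first-bin counts."""
    t = np.asarray(spike_times, dtype=float)
    idx = np.floor((t - t.min()) / bin_size).astype(np.int64)
    counts = np.bincount(idx)
    # Append one empty bin so that an avalanche starting in the last
    # bin still has a well-defined (empty) successor bin.
    counts = np.append(counts, 0)
    previous = np.concatenate(([0], counts[:-1]))
    # First bin of an avalanche: non-empty bin preceded by an empty one.
    first = np.flatnonzero((counts > 0) & (previous == 0))
    ratios = counts[first + 1] / counts[first]
    return ratios.mean()
```

With this form of the estimator, an avalanche that fits into a single bin contributes a ratio of zero, which is one way to see why the estimate can drop toward zero at large bin sizes when avalanches are separated by long pauses.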

With STS, σ<sup>∗</sup> always approached zero for large bin sizes independently of model state and subsampling approach (dashed lines in **Figures 7A,B**, S4). For intermediate bin sizes and under subsampling, σ<sup>∗</sup> varied widely. σ<sup>∗</sup> tended to be smaller for smaller α, but the absolute value of σ<sup>∗</sup> apparently cannot serve as an indicator for the state of the system (**Figures 7A,B**). Thus, under most analysis conditions, the estimated σ<sup>∗</sup> did not show the intended result (σ<sup>∗</sup> = α). Note that in theory, σ<sup>∗</sup> should not change at all with the bin size.

**FIGURE 7 | The estimated branching parameter** *σ*<sup>∗</sup> **changed with bin size. (A,B)** In the model, σ<sup>∗</sup> depended on the synaptic strength α and the bin size. For the driven model, the spike rate was fixed to *r* = 5 Hz (full lines), while for the model with separation of time scales the drive was infinitesimally small (*h* → 0; dashed lines). For *h* → 0 and α = 1, the model is SOC (black dashed lines). **(A)** Results for the fully sampled model. **(B)** Results for subsampling *N* = 100 neurons from the model. **(C)** σ<sup>∗</sup> for the spiking activity recorded in monkeys, cats, and rats varied with the bin size, but was very similar across species and experiments. It was well approximated by the driven model with α = 0.98 (green line).

Without STS (full lines in **Figures 7A,B**, S4), σ<sup>∗</sup> was ≤1 for small bin sizes, ≥1 for intermediate bin sizes, and approximated unity for large bin sizes – independently of the state of the model. This shows that the widely held assumption that an estimated σ<sup>∗</sup> > 1 (σ<sup>∗</sup> < 1) corresponds to a super-critical (sub-critical) state of the system is likely incorrect, especially for the ubiquitous scenario of subsampling.

Although the expected σ<sup>∗</sup> = 1 is neither unique to critical systems, nor indicative of criticality, σ<sup>∗</sup> and its dependence on the bin size still reflect the intrinsic dynamics of the system. Therefore, comparing σ<sup>∗</sup> between *in vivo* and model activity may still help to indicate the state of the system. Note that to estimate the *in vivo* σ<sup>∗</sup> we used the normalized bin size (in <*IEI*>) to account for the different population rates *R* in the experiments. σ<sup>∗</sup> was very similar across all experiments (**Figure 7C**) despite a 50-fold difference in *R*. This indicates once again that neural avalanches *in vivo* hardly differ across mammalian species (from rats to monkeys), across brain structures (from hippocampus to prefrontal cortex), and across cognitive states (from anesthetized to awake behaving animals).

Given the complex dependence of σ<sup>∗</sup> on the bin size, how can σ<sup>∗</sup> be used to estimate the precise state of the neural network? First, for all *in vivo* avalanches, σ<sup>∗</sup> approximated unity for large bin size (**Figure 7C**). However, this simply indicates that spiking activity *in vivo* lacks STS. Second, the maximum of σ<sup>∗</sup> under subsampling may be an indicator of the state. The maximum of σ<sup>∗</sup> increased with increasing α. For α = 1, σ<sup>∗</sup> showed a maximum of ≈3 at *bs* ≈100 ms. [The same values held for the stochastic branching model (Figure S4)]. For the experiments, the maximum value of σ<sup>∗</sup> was only around 1.4. Overall, the best match for the *in vivo* σ<sup>∗</sup> was achieved by the driven, slightly sub-critical models (α ≈ 0.98). This result is in line with the previous results for *f*(*s* = 1, *bs*) and <*s*>.

**FIGURE 8 | The exponent** *β* **of the DFA** depended on the synaptic strength *α* in the model (diamonds), and was affected by subsampling (black: fully sampled model; green: subsampled model). For the experiments, β (gray circles) and the respective mean values (gray bars) ranged between 0.55 and 0.9.

#### **THE SCALING EXPONENT** *β*

In DFA, the scaling exponent β quantifies the memory decay in a time series. β = 0.5 indicates that a time series has no memory (uncorrelated); β ≈ 1 indicates 1/f (pink) noise; and β ≈ 1.5 Brownian noise. We estimated β for the population rate time series of the model (*r* = 5 Hz), and for each experiment. As expected, under full sampling the model with α = 1 showed β ≈ 1 (**Figure 8**, black diamonds); with decreasing α, β decreased as well; and for α = 0 (Poisson), we found β ≈ 0.5. Qualitatively, the same results held under subsampling, but β tended to be underestimated (**Figure 8**, green diamonds).
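For orientation, a basic first-order DFA can be sketched as below. This is a generic textbook variant with linear detrending in non-overlapping windows, not necessarily the exact procedure or window range used by the authors; the function name and the choice of window sizes are assumptions.

```python
import numpy as np

def dfa_exponent(x, window_sizes):
    """Scaling exponent of first-order detrended fluctuation analysis.

    x: 1-D time series (e.g., a population rate).
    window_sizes: window lengths in samples, each at least 3.
    """
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())           # integrated signal
    fluctuations = []
    for n in window_sizes:
        n_win = profile.size // n
        segments = profile[: n_win * n].reshape(n_win, n).T  # (n, n_win)
        k = np.arange(n)
        # Least-squares line for every window at once.
        coef = np.polyfit(k, segments, 1)       # shape (2, n_win)
        trend = np.outer(k, coef[0]) + coef[1]  # shape (n, n_win)
        residual = segments - trend
        fluctuations.append(np.sqrt(np.mean(residual ** 2)))
    # Slope of log F(n) versus log n.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.standard_normal(2 ** 15)
sizes = [16, 32, 64, 128, 256, 512]
print(dfa_exponent(white, sizes))             # expected near 0.5
print(dfa_exponent(np.cumsum(white), sizes))  # expected near 1.5
```

The two test signals correspond to the reference cases named above: uncorrelated noise and Brownian noise.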

The *in vivo* activity showed neither β = 1 nor β = 0.5, but β ranged between 0.55 and 0.9. These β values correspond to those of the sub-critical, driven model with 0.98 ≤ α < 0.999.

All the above measures indicated that driven, slightly subcritical models provided the best match to *in vivo* spike avalanches. Most of these measures were derived from the avalanche size distribution, and hence we expect a good match between the *in vivo f(s)*, and the *f(s)* of the driven models with α ≈ 0.99. Indeed, given a normalized bin size, both sub-critical models fitted the *in vivo f(s)* well (**Figure 9**). The small differences for large *s* (*s* > 100) may potentially be overcome by applying a more realistic drive instead of uncorrelated Poissonian drive, for example one that reflects the statistics of neural activity (as outlined here), or the statistics of our environment (Field, 1987; Van der Schaaf and van Hateren, 1996; Simoncelli and Olshausen, 2001; Sinz et al., 2009).

#### **LFP AVALANCHES IN HUMANS**

Approximate power law distributions have been reported for coarse measures of neural activity, such as population spikes, LFP, EEG, MEG, and BOLD activity (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Petermann et al., 2009; Hahn et al., 2010; Ribeiro et al., 2010; Tetzlaff et al., 2010; Friedman et al., 2012; Poil et al., 2012; Tagliazucchi et al., 2012; Priesemann et al., 2013; Shriki et al., 2013). In the following, we show that LFP recordings in humans also indicate a driven, slightly subcritical regime, despite their approximate power law scaling of *f(s)*.

LFPs were recorded using intracranial depth electrodes from five human subjects. Each subject had between 44 and 63 recording contacts implanted. From these recordings, we extracted avalanches of enhanced activity (see Methods and Priesemann et al., 2013). The LFP *f(s)* closely followed a power law (**Figure 10A**), and the slope of the power law decreased with increasing bin sizes. This is in contrast to SOC systems in which the slope does not change with temporal binning (**Figures 2A,G**), and indicates that LFP avalanches, like the spike avalanches, lack clear STS.

In general, the LFP *f(s)* showed a better approximation to power law scaling than any of the spike avalanche distributions (**Figures 2**, **10**). Despite an approximate power law scaling for *f(s)*, all the other measures we used here [i.e., <*s*>, *f*(*s* = 1, *bs*), σ∗, and β] indicated a sub-critical regime: The <*s*> and the *f*(*s* = 1, *bs*) both deviated from power law scaling (**Figure 10B**); the branching parameter did not show a pronounced peak (**Figure 11**); and the scaling exponent β of the DFA was smaller than unity (mean(β) = 0.6; **Figure 7**). This is in line with our previous study on the same data (Priesemann et al., 2013), and with our results for spiking activity. In sum, despite approximate power-law scaling in *f(s)*, all the other measures indicated a driven, slightly sub-critical regime on the level of LFP activity.

## **DISCUSSION**

This study challenges the hypothesis that mammalian brains operate in a SOC state, as has been repeatedly suggested (Linkenkaer-Hansen et al., 2001; Beggs and Plenz, 2003; Haldeman and Beggs, 2005; Levina et al., 2007a; Hsu et al., 2008; Pasquale et al., 2008; Stewart and Plenz, 2008; Petermann et al., 2009; Priesemann et al., 2009; Shew et al., 2009; Hahn et al., 2010; Ribeiro et al., 2010; Tetzlaff et al., 2010; Poil et al., 2012; Tagliazucchi et al., 2012; Shriki et al., 2013). Despite these claims, evidence for SOC was found lacking for spiking data, which are generally considered an important and reliable marker of neural activity. To test the SOC hypothesis, we therefore analyzed *in vivo* spiking activity from three mammalian species and local field potential recordings from the human brain using established measures of criticality, and also novel ones that are robust to common shortcomings of experimental data, such as subsampling. We particularly focused on systematic changes of these measures with the choice of the bin size.

Spike avalanches from rats, cats, and monkeys, and LFP avalanches from humans showed deviations from the behavior expected for SOC, thereby contradicting the SOC hypothesis. To reproduce the *in vivo* results and provide potential explanations for their deviations from SOC, we modified the models capable of critical behavior. We found a close match between *in vivo* and model behavior (1) if those models were subsampled, (2) if the STS – a fundamental property of SOC systems – was eliminated, and (3) if the models were tuned to a sub-critical regime. As these results generalized over two very different models, we interpret results from the *in vivo* recordings here as evidence that mammalian nervous systems operate in a driven, sub-critical regime. This regime, albeit not critical, was, however, remarkably similar across species and experimental conditions.

depth electrodes in humans followed power laws. The slope of the power laws changed with the bin size (see legend). The bin size was changed over a 1000-fold range, from sampling resolution (400 Hz, i.e., 2.5 ms) to "gluing" everything together at *bs* ≈ 2500 ms. The bin size closest to one inter event interval is marked in purple (*bs* = 80 ms, see Methods). **(B)** Neither the mean avalanche size (<*s*>), nor the frequency of avalanches of size *s* = 1, *f(s = 1, bs),* showed a power law. Each line represents the results for one recording session (<*s*> in black, *f*(*s* = *1*, *bs*) in gray).

## **UNIVERSAL BEHAVIOR OF SPIKE AVALANCHE DISTRIBUTIONS ACROSS RECORDING AREAS, VIGILANCE STATES AND SPECIES**

The observed avalanche size distributions *f(s)* were similar across species and recording areas (hippocampus in rats, visual cortex in cats, prefrontal cortex in monkeys). A similar universality of *f(s)* across recording areas has been reported by Ribeiro and colleagues (hippocampus, somatosensory cortex, and visual cortex in rats) (Ribeiro et al., 2010). Thus, avalanche activity seems to be independent of the function and the precise anatomy of an area. This might either indicate that avalanches are not a sensitive measure of neural dynamics, or that activity propagation must follow principles that are independent of the specific role that a brain area plays in information processing. The first argument is not likely applicable, since avalanches change under data shuffling and they sensitively reflect the correlation structure in the data (e.g., Figure 1 in Priesemann et al., 2013). The second argument might indeed hold. Hence, the challenge is to identify the principle that gives rise to these apparently universal spike avalanche distributions. This principle may in fact be very simple. As discussed below, our modified SOC model, as well as a simple branching model, suggests that on average one spike gives rise to a little less than one subsequent spike, and that quiescence in the population activity is prevented by "input spikes" which trigger avalanches at a low rate. This principle differs from SOC, where one spike *on average* gives rise to exactly one subsequent spike, and the rate of input spikes approaches zero (STS). As a consequence, SOC activity shows only one avalanche at a time, while the driven, slightly sub-critical regime shows instead a mélange of avalanches.

## **EMPIRICAL AVALANCHE DISTRIBUTIONS RULE OUT THE CRITICAL AND THE POISSON STATES**

Let us first summarize the conclusions that can be drawn from the analyses of the *in vivo* spike avalanches alone, without referring to modeling. *f(s)* did not show the power law scaling that is characteristic of SOC, nor did the novel measures (*f*(*s* = 1, *bs*), <*s*>) support the hypothesis of critical behavior. Thus, the hypothesis that spike avalanches show signs of SOC can be ruled out. In addition, we can rule out the hypothesis of largely independent Poissonian behavior of the spiking units (that is often used in models), because in this case the avalanche distributions should have shown exponential behavior, which was not observed. We therefore conclude that spiking activity is neither (self-organized) critical nor Poissonian.

## **LIMITATIONS OF THE MODELS AND MEASURES**

The SOC model used here was admittedly simple – it comprised neither inhibitory connections nor leakage in the neurons; synaptic connections had a homogeneous nearest-neighbor topology and were all of identical strength α. We chose this model because the basic variant (σ = 1, *h* → 0; i.e., the Bak-Tang-Wiesenfeld model; Bak et al., 1987) is extensively studied in the context of SOC (De Menech et al., 1998; Jensen, 1998; Vespignani et al., 1998; Dickman et al., 2000; Dhar, 2006; Pruessner, 2012). The second model we used was a stochastic branching model (Harris, 1963; Haldeman and Beggs, 2005). It was set up to be comparable to the SOC model, but had a random topology, and the activity propagated stochastically with *p* = α/*k*. In this model, the number of connections *k* hardly affected the results (see also Haldeman and Beggs, 2005).

For both models, the avalanche dynamics was qualitatively similar. Hence, the model results were not specific to the topology (local vs. random), the number of connections *k*, and the precise spike propagation mechanisms (deterministic vs. stochastic). In contrast, implementing leaky model neurons may hinder SOC altogether (Bonachela and Muñoz, 2009; Bonachela et al., 2010). This in itself is an argument against the hypothesis that neural activity is SOC, but it could still be "quasi-critical" (Bonachela and Muñoz, 2009; Bonachela et al., 2010). However, our results indicate sub-criticality.

We note that the power law scaling observed for the novel measures (*f*(*s* = 1, *bs*), <s>) in the critical models has not been derived analytically yet. However, in both critical models the novel measures showed power law scaling despite the different topology and the different spike propagation rules, and hence we expect this behavior to be characteristic for critical dynamics. Still, for now these measures can only be used as tools to compare model and *in vivo* dynamics, and not for determining scaling laws.

## **ON THE PLAUSIBILITY OF EXTERNAL DRIVE**

Spike and LFP avalanches recorded in rats, cats, and primates were best matched by a *driven* sub-critical model. The drive in the model consisted of input spikes, i.e., of spikes not caused by presynaptic spikes from within the model. Given their importance for a successful match between *in vivo* and model activity, we may ask what the *in vivo* counterpart of the input spikes in the models could be. *In vivo*, such input spikes can be provided by at least three sources—by sensory input elicited by stimuli in the outside world, from brain structures other than the one under consideration, or by internal activation which presumably occurs spontaneously. Such spontaneous activity can for example be generated by pacemaker cell activity (Selverston, 2008; Longtin, 2013), or vesicle fusion at a presynaptic terminal without a preceding spike (Fredj and Burrone, 2009). With all these known input sources *in vivo*, it came as no surprise that the model required input spikes (i.e., drive) to be able to match *in vivo* activity.

## **INPUT SPIKES MOST LIKELY DO NOT CONSTITUTE A LARGE FRACTION OF THE OBSERVED ACTIVITY**

The fraction of "input spikes" (drive) among all the spikes of the model is negligible at criticality (α = 1). This fraction, given a constant spike rate *r*, increases with tuning toward sub-criticality (α < 1), until *all* spikes are input spikes in the Poisson state (α = 0), and none arises from synaptic transmission. For example, in the driven, slightly sub-critical model (α = 0.99), only one in ∼3600 spikes was an input spike. To illustrate this number, imagine a neuron that spikes with a rate of 1 Hz. This neuron fires spontaneously (i.e., an "input spike") only once an hour. This example is simplistic, because it assumes that the input is homogeneous. However, it illustrates well that the fraction of input spikes (from the external world, other brain structures, or of stochastic origin; see above) in the driven, sub-critical regime that reproduced the *in vivo* findings is extremely small compared to the overall level of activity.
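The arithmetic behind the once-an-hour illustration is spelled out below. The helper is hypothetical and, like the example in the text, assumes that input spikes are distributed homogeneously over neurons and time.

```python
def seconds_between_input_spikes(rate_hz, input_fraction):
    """Mean time between input spikes of a single neuron.

    rate_hz: total firing rate of the neuron.
    input_fraction: fraction of its spikes that are input spikes.
    """
    return 1.0 / (rate_hz * input_fraction)

# A neuron firing at 1 Hz with one input spike per ~3600 spikes:
interval = seconds_between_input_spikes(1.0, 1.0 / 3600.0)
print(interval / 3600.0, "hours between input spikes")
```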

## **CONCEPTUAL CONSIDERATIONS ON THE ANALYSIS OF NEURAL AVALANCHES AND THE CRITICAL STATE**

While we have so far discussed how *in vivo* spike avalanches suggest a driven sub-critical regime of operation for mammalian nervous systems, several neglected but important conceptual issues with the analysis of neural avalanches surfaced in this study. These are discussed in the following.

## **THE TERM "AVALANCHE" REFERS TO DIFFERENT ENTITIES IN SOC MODELS AND IN THE ANALYSIS OF NEURAL DATA**

Although it is rarely fully acknowledged, the term "avalanche" refers to different entities for activity in SOC models and for neural activity. In SOC models, an avalanche is a cascade of events that originates from a *single* input event (Bak et al., 1987). Subsequent avalanches are always separated by pauses (STS). In contrast, for neural activity, avalanches are defined using temporal binning (Beggs and Plenz, 2003), because neural activity lacks clear pauses that could naturally serve to define the beginning and end of an avalanche. Such avalanches can be defined on any spike time series, irrespective of its origin. Consequently, "binning-dependent avalanches" do not correspond to classical SOC avalanches. Although these two types of "avalanches" are different entities, it is customary to use the same term when referring to any one of them. In the present study, we analyzed the "binning-dependent" avalanches in both cases, in the driven models and in the *in vivo* activity. This justifies a comparison between model and *in vivo* activity, and was also necessary as binning-dependent avalanches are the de-facto standard in the analysis of neural systems, although previous studies frequently alluded to SOC avalanches.

## **AVALANCHE DEFINITIONS IN HIGHLY PARALLEL RECORDINGS**

For neural activity, avalanches are commonly defined using temporal binning, and this definition relies on pauses. We can expect that physiologically relevant pauses (i.e., pauses of a few ms) vanish in spike recordings, when activity of a large number of neurons is recorded in parallel. For example, if each neuron spikes with 1 Hz, sampling only 100 neurons in parallel would frequently produce pauses that are several milliseconds long. However, when sampling thousands or even millions of such neurons, pauses would probably be absent. Without pauses, neither the classical nor the binning-dependent avalanche definition is applicable, and consequently, alternative approaches to assess criticality have to be established. Currently, these approaches threshold the activity and thereby introduce pauses (e.g., Spasojević et al., 1996; Papanikolaou et al., 2011; Poil et al., 2012). As an alternative approach, we propose to apply systematic subsampling. Both approaches allow using the binning-dependent avalanche definitions again.
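The example with 100 versus millions of neurons can be made quantitative under a strong simplifying assumption that does not hold for real data: if all neurons fired independently as Poisson processes, the intervals between successive spikes of the pooled train would be exponentially distributed with rate *N*·*r*. The hypothetical helper below returns the probability that such an interval exceeds a given pause length.

```python
import math

def pause_probability(n_neurons, rate_hz, pause_s):
    """P(interval between successive pooled spikes > pause_s).

    Assumes independent Poisson neurons, which is only a rough
    reference case and not a model of the recorded activity.
    """
    return math.exp(-n_neurons * rate_hz * pause_s)

# Pauses longer than 5 ms at 1 Hz per neuron:
print(pause_probability(100, 1.0, 0.005))      # about 0.61
print(pause_probability(10 ** 6, 1.0, 0.005))  # numerically zero
```

Correlated activity changes these numbers, but the trend with the number of sampled neurons is the point of the illustration.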

## **CAN WE DETERMINE A SPECIFIC CRITICAL EXPONENT FOR NEURAL DATA?**

Avalanche size distributions of critical branching processes have an exponent of τ = 1.5 (Harris, 1963). Since branching processes have some resemblance with propagation of neural activity, it was hypothesized that neural avalanches should also show τ = 1.5. Indeed, τ = 1.5 has been observed (Beggs and Plenz, 2003, 2004; Stewart and Plenz, 2008; Hahn et al., 2010; Priesemann et al., 2013), but only for specific bin sizes. For example, Beggs and Plenz showed in their seminal work that τ = 1.5 holds for one specific bin size (4 ms), but when changing the bin size from 1 to 16 ms, the exponent decreased from 2 to 1.2 (Beggs and Plenz, 2003). Similarly, for the LFP avalanches shown here, τ = 1.5 was observed only for a bin size of ∼80 ms, and with varying the bin size from 2.5 ms to ∼2.5 s, the exponent changed from 3 to 1 (**Figure 10A**) (Priesemann et al., 2013). Changes in τ were also observed in the driven, subsampled SOC model (**Figure 2C**). Thus, drive and subsampling may underlie the variation of τ in experiments as well. However, irrespective of its origin, it is an open question how to reconcile the variation of τ in neural data with the fixed τ in critical systems. One proposal is to use a specific bin size for neural data, namely one average inter-event interval (<*IEI*>) (Beggs and Plenz, 2003). However, there is no theoretical underpinning yet why this bin size should be preferred over others, and even when using this bin size, τ was found to be 1.8 in spike avalanches in anesthetized cats (Hahn et al., 2010), instead of 1.5. Thus, in neural data, there is not a unique τ, and therefore there is no specific critical exponent for neural activity that would allow linking neural activity to a universality class.

Since neural avalanche distributions change with the bin size (Beggs and Plenz, 2003; Priesemann et al., 2009, 2013; Benayoun et al., 2010; Hahn et al., 2010), we side with Benayoun et al., who "do not read any significance into the particular slope observed. [...] In our view, any good model of neural avalanches must reproduce the variability in the observed slope of the power law with temporal bin width." (Benayoun et al., 2010). Although we did not observe power laws here for the *in vivo f(s)*, our model could reproduce the *in vivo* spike *f(s)* and their change with temporal binning. It could also reproduce the bin-size dependent changes of novel and established measures of avalanche dynamics (*f*(*s* = 1, *bs*), <*s*>, σ<sup>∗</sup>, DFA exponent). To the best of our knowledge, this is the first model that matched not only the avalanche properties for a single bin size, but also their changes with changing bin size.

## **SUBSAMPLING EFFECTS IN THE ASSESSMENT OF CRITICALITY**

Subsampling is unavoidable in spike avalanche recordings *in vivo*, and is helpful when comparing neural activity to model activity (Priesemann et al., 2009). However, subsampling was also shown to complicate criticality analysis because it can distort avalanche measures (Priesemann et al., 2009, 2013; Ribeiro et al., 2010). To overcome this problem, we here developed avalanche measures that are not distorted by subsampling. One example is the bin size dependence of the frequency of avalanches of size one (*f*(*s* = 1, *bs*)). This measure robustly showed power-law scaling in the driven SOC states, and exponential scaling in the Poisson state, independent of subsampling strategies (**Figure 6**). Therefore, we propose to use *f*(*s* = 1, *bs*) as a robust measure for criticality analysis.

Subsampling effects can appear very strong if one uses a fixed bin size, e.g., 1 ms as in Ribeiro et al. (2014). We used instead a normalized bin size, which accounts for the problem that the population rate *R* changes with the number of sampled neurons. Using a normalized bin size diminished subsampling effects, and also allowed for a comparison to the *in vivo* recordings.

## **FINITE SIZE EFFECTS IN CRITICALITY ASSESSMENT**

The finite size of the critical models limited the correlation lengths in space and time and thereby caused the cutoff in *f(s)* (**Figures 2A,G**). In analogy, the finite size is expected to also have caused – in the driven critical models – the cutoff at large bin size in the novel measures (*f*(*s* = 1, *bs*), <s>). Since finite size effects decrease with increasing system size, and since the *in vivo* spikes were recorded in a far larger system than our model spikes, finite size effects are unlikely to account for the deviations from power law scaling found for the *in vivo* activity.

In critical models, the finite size can change the value of α, for which the model is critical. For example, Eurich et al. (2002) showed for their model that the critical α depended on the model size *L* as α<sub>crit</sub> = 1 − *L*<sup>−0.5</sup>. Thus, their finite size models with α → 1 were super-critical and showed peaks in their *f(s)*. This was not the case for our critical models. Our models, in contrast, appeared to be slightly sub-critical at α = 1. This is probably due to the open boundary conditions we used in contrast to Eurich et al. Hence, since the finite size made our models at most sub-critical but not super-critical, there is no concern that the observed match of model and *in vivo* results at values of α < 1 is due to finite size effects.

## **DIFFERENT TYPES OF CRITICAL PHASE TRANSITIONS EXIST**

To better understand criticality and potential deviations from it, it is also important to define which type of criticality one refers to. Critical phase transitions can occur for example for the transitions from order to chaos (Bertschinger and Natschläger, 2004; Haldeman and Beggs, 2005; Boedecker et al., 2012; Lizier, 2013), from non-oscillatory to oscillatory regimes (Linkenkaer-Hansen et al., 2001; Poil et al., 2012), from replay to non-replay of spatiotemporal patterns (Scarpetta and de Candia, 2013), and from a regime with finite to one with potentially infinite avalanche sizes (Bak et al., 1987; Drossel and Schwabl, 1992; Olami et al., 1992; Eurich et al., 2002; Beggs and Plenz, 2003; Haldeman and Beggs, 2005; Levina et al., 2007a,b, 2009), as known from branching processes (Harris, 1963). One study has found that the transitions to chaos and to potentially infinite avalanches coincide in their model (Haldeman and Beggs, 2005), but it is unclear whether this finding generalizes to other systems. We here want to emphasize that our model showed a transition to potentially infinitely large avalanches.

## **CONSEQUENCES FOR INFORMATION PROCESSING AND STABILITY OF BRAIN DYNAMICS**

After having discussed evidence from *in vivo* spike avalanche distributions for a driven, sub-critical mode of operation, and after having clarified conceptual issues, we now turn to the question of what consequences these findings may have on information processing and dynamic stability in the mammalian brain.

### **SUB-CRITICALITY, SUPER-CRITICALITY, AND STABILITY**

Criticality is characterized by a power-law distribution of its avalanche sizes. This indicates that avalanches of any size can occur; even close to infinite-size avalanches may occur, provided that the system is large enough to sustain them. Infinite-size avalanches do occur in the super-critical regime, and have been linked to epileptic seizures (Hsu et al., 2008; Meisel et al., 2012). Such infinite avalanches produce runaway activity, and could thereby impair normal brain activity. Therefore, it is unlikely that it would be good for a normally functioning brain to be supercritical. Sub-criticality, in contrast, never shows infinitely large avalanches, and thus offers a safer regime for brain operation. Thus, a *slightly* sub-critical regime allows the brain to avoid runaway activity, while still allowing moderate activity propagation, and maintaining most of the possible computational advantages that come with criticality (Haldeman and Beggs, 2005; Kinouchi and Copelli, 2006; Beggs, 2008; Shew et al., 2009; Shew and Plenz, 2013).

## **DRIVE AND INFORMATION PROCESSING**

There may be good reason why neural activity *in vivo* does not show an STS for its avalanches: When eliminating the STS, avalanches run in parallel, meet, and intermingle. Thereby, the *rate* of computations may be increased compared to the SOC state. In addition, the presence of multiple, potentially interacting avalanches may enable collision-based computation, which is one fundamental way of information modification (Lizier, 2013). Thus, a driven state may increase the rate and capacity of neural information processing *in vivo*.

## **CONCLUSIONS**

Our analysis of *in vivo* data indicated that the mammalian brain is not SOC because *in vivo* spiking activity differed fundamentally from activity expected for SOC. Instead, the mammalian brain apparently self-organizes to a slightly sub-critical regime without an STS. Mechanistically, such a driven, sub-critical regime shows a mélange of avalanches, while SOC systems, in contrast, are characterized by temporally separated avalanches. Operating in a slightly sub-critical regime may prevent the brain from tipping over to super-criticality, which has been linked to epilepsy. Regarding computational capabilities, which have been reported to be optimal for SOC, a slightly sub-critical regime only deviates little from SOC and therefore its computational capabilities may still be close to optimal, while the non-zero drive in general may allow for a higher rate of information processing. Taken together, a driven, slightly sub-critical regime may strike a balance between optimal information processing and the need to avoid runaway activity.

## **METHODS**

### **SELF-ORGANIZED CRITICAL MODEL**

The SOC neural network model we used here is the Bak-Tang-Wiesenfeld model (Bak et al., 1987), and modified versions of it. Translated to a neuroscience context, the model consisted of 2500 non-leaky integrate and fire neurons. A neuron *i* spiked if its membrane voltage *V<sub>i</sub>*(*t*) reached a threshold Θ:

$$\text{If } V_i(t) > \Theta, \ V_i(t+1) = V_i(t) - 4$$

Θ was set to Θ = 0 for convenience. Note that the choice of Θ does not change the activity of the model at all. The model neurons were arranged on a 2D lattice, and each neuron was connected locally to its four nearest neighbors, i.e., the coupling strength α<sub>*ij*</sub> = α for all four nearest neighbors of neuron *i*, and α<sub>*ij*</sub> = 0 otherwise:

$$V_i(t+1) = V_i(t) + \sum_j \alpha_{ij} \cdot \delta(t - T_j) + H(t)$$

The time *t* was updated in ms (i.e., 1 ms effective synaptic delay). *T<sub>j</sub>* denoted the spike times of neuron *j*, and *H*(*t*) was a function which set a neuron above threshold with a certain Poisson rate *h*. *h* represented the "drive" in the context of SOC. Note that the neurons at the edges and corners of the grid had only 3 and 2 neighbors, respectively. This model is equivalent to the well-known Bak-Tang-Wiesenfeld model (Bak et al., 1987) if *h* → 0 and α = 1. In contrast, for α = 0, the model represented independent Poisson units which spiked with rate *r* = *h*.
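
The two update rules can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the implementation used for the results: the function name `simulate_lattice`, the synchronous update order, the random initial voltages, and the way the drive is realized (a neuron at or below threshold is set to Θ + 1 with probability *h* per time step) are choices made here for illustration only.

```python
import random

def simulate_lattice(L, alpha, h, steps, theta=0.0, seed=0, initial_v=None):
    """Simulate an L x L lattice of non-leaky units with open boundaries.

    Returns (spikes, V): spikes is a list of (time step, neuron index),
    V is the final list of voltages.
    """
    rng = random.Random(seed)
    n = L * L
    if initial_v is None:
        # Assumption: start from random voltages at or below threshold.
        V = [theta - rng.randint(0, 3) for _ in range(n)]
    else:
        V = list(initial_v)

    # Nearest neighbours; edge and corner units have only 3 and 2 of them.
    neighbours = []
    for i in range(n):
        row, col = divmod(i, L)
        nb = []
        if row > 0:
            nb.append(i - L)
        if row < L - 1:
            nb.append(i + L)
        if col > 0:
            nb.append(i - 1)
        if col < L - 1:
            nb.append(i + 1)
        neighbours.append(nb)

    spikes = []
    for t in range(steps):
        # All units above threshold at the start of the step spike together.
        active = [i for i in range(n) if V[i] > theta]
        for i in active:
            spikes.append((t, i))
        for i in active:
            V[i] -= 4                  # reset rule: V_i(t+1) = V_i(t) - 4
            for j in neighbours[i]:
                V[j] += alpha          # coupling alpha_ij = alpha
        # Assumed realization of the drive H(t): lift a unit above threshold.
        for i in range(n):
            if rng.random() < h and V[i] <= theta:
                V[i] = theta + 1
    return spikes, V
```

With `L = 50` the lattice has the 2500 neurons described above. Setting `alpha=1.0` with a very small `h` approximates the SOC limit, whereas `alpha=0.0` leaves only the independent drive.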

Subsampling (Priesemann et al., 2009) was applied to the model by sampling the activity of 100 randomly selected neurons only, and neglecting the activity of all other neurons. To simulate specific subsampling effects, the sampled neurons were not chosen randomly, but arranged in specific configurations (see **Figure 6**, right part). Here the sampled neurons were arranged to have very small or very large distances. For the small distances, 4 × 4 or 8 × 8 neurons from a compact, central subset were sampled (**Figure 6**, red and pink), and for the large distances, 4 × 4 or 8 × 8 neurons with distance 5 grid units between them were sampled (**Figure 6**, turquoise and beige).
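
The sampling configurations can be sketched as index selections on the lattice. The following Python sketch is illustrative only: the helper names `random_subset`, `grid_subset`, and `subsample` are hypothetical, neurons are indexed row by row, and centering the spaced configurations on the lattice is an assumption (the text specifies a central position only for the compact subsets).

```python
import random

def random_subset(n_neurons, n_sampled, seed=0):
    """Indices of n_sampled distinct, randomly chosen neurons."""
    return sorted(random.Random(seed).sample(range(n_neurons), n_sampled))

def grid_subset(L, k, spacing):
    """Indices of k x k neurons, `spacing` grid units apart, centred on an L x L lattice.

    spacing=1 gives a compact block, spacing=5 the widely spaced configuration.
    """
    extent = (k - 1) * spacing
    if extent > L - 1:
        raise ValueError("configuration does not fit on the lattice")
    start = (L - 1 - extent) // 2
    return [(start + a * spacing) * L + (start + b * spacing)
            for a in range(k) for b in range(k)]

def subsample(spikes, sampled):
    """Keep only the (time, neuron) events of the sampled neurons."""
    keep = set(sampled)
    return [(t, i) for (t, i) in spikes if i in keep]
```

For the 50 × 50 lattice, `grid_subset(50, 4, 1)` and `grid_subset(50, 8, 5)` would correspond to a compact 4 × 4 block and a spaced 8 × 8 arrangement, respectively.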

## **STOCHASTIC BRANCHING MODEL**

In addition to the SOC model, we also simulated a classical stochastic branching model. In this model, a branching process (Harris, 1963; Haldeman and Beggs, 2005) was mapped on a grid of neurons. An active neuron activated each of its *k* postsynaptic neurons with probability *p* = α · 1/*k*. As in the SOC model, this model was critical for α = 1 in the infinite size limit, and subcritical (supercritical) for α < 1 (α > 1). In contrast to the SOC model, here the postsynaptic neurons were assigned randomly at each step. The other parameters were analogous to the SOC model: The model had 2500 neurons with *k* = 4 connections each, and α and *h* were balanced such that neurons spiked with *r* = 5 Hz (except if *h* → 0). The open boundary conditions were implemented by defining *p<sub>diss</sub>* = 0.001 as the probability that a neuron projected "outside of the grid," i.e., the probability that an activation of a postsynaptic neuron was not effective. Note that *p<sub>diss</sub>* > 0 makes the model slightly subcritical. Subsampling was implemented in the same manner as in the SOC model. Note however that spatial distances have no meaning in this model because of its random topology. Results for this model were qualitatively similar to those of the SOC model. Therefore, we usually reported the results of the SOC model only.
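
A compact Python sketch of such a driven branching process is given below for illustration. It is not the authors' code: the function name `simulate_branching` is hypothetical, targets are drawn uniformly with replacement at every step, a neuron that is activated several times in the same step counts once, and the drive *h* is treated as an activation probability per neuron and time step. These are assumptions of the sketch.

```python
import random

def simulate_branching(n_neurons, k, alpha, h, steps, p_diss=0.001, seed=0):
    """Return the number of active neurons in each time step."""
    rng = random.Random(seed)
    p = alpha / k  # activation probability per postsynaptic neuron
    active = set()
    counts = []
    for _ in range(steps):
        nxt = set()
        for _neuron in active:
            for _target in range(k):
                # Transmission succeeds with probability p and is lost
                # ("projects outside of the grid") with probability p_diss.
                if rng.random() < p and rng.random() >= p_diss:
                    # Postsynaptic neurons are assigned randomly at each step.
                    nxt.add(rng.randrange(n_neurons))
        # External drive: each neuron is activated with probability h.
        for i in range(n_neurons):
            if rng.random() < h:
                nxt.add(i)
        active = nxt
        counts.append(len(active))
    return counts
```

In this sketch the expected number of neurons activated by one active neuron is α(1 − *p<sub>diss</sub>*), which is slightly below one at α = 1, in line with the remark that *p<sub>diss</sub>* > 0 makes the model slightly subcritical.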

#### **EXPERIMENTS**

We evaluated spikes from recordings in three different species, namely in rats, cats and monkeys. The rat experimental protocols were approved by the Institutional Animal Care and Use Committee of Rutgers University (Mizuseki et al., 2009). The cat experiments were performed in accordance with guidelines established by the Canadian Council for Animal Care (Blanche, 2009). The monkey experiments were performed according to the German Law for the Protection of Experimental Animals, and were all approved by the Regierungspräsidium Darmstadt. The procedures also conformed to the regulations issued by the NIH and the Society for Neuroscience.

The spike recordings from the rats and the cats came from the NSF-funded CRCNS data sharing website (Blanche, 2009; Mizuseki et al., 2009). In brief, in rats the spikes were recorded in CA1 of the right dorsal hippocampus during an open field task. We used the first data set of each animal (ec013.527, ec014.277, ec015.041, ec016.397), and from rat "ec014" we also used a second data set (ec014.333). The five datasets provided sorted spikes, i.e., {37, 77, 32, 58, 58} single units and {4, 8, 8, 8, 8} multi units, respectively. However, since the identity of a unit does not matter for the definition of neural avalanches (see below), the single- and multi-unit activity was combined to one set of spike times. More details on the experimental procedure and the datasets proper can be found in Mizuseki et al. (2009).

For the spikes from the cat, neural data were recorded by Tim Blanche in the laboratory of Nicholas Swindale, University of British Columbia, and downloaded from the NSF-funded CRCNS Data Sharing website (Blanche, 2009). We used the data set pvc3, i.e., recordings in area 18 which contain 50 sorted single units (Blanche and Swindale, 2006). We used that part of the experiment in which no stimuli were presented, i.e., the spikes reflected spontaneous activity in the visual cortex of the anesthetized cat. Details on the experimental procedures and the data proper can be found in Blanche and Swindale (2006); Blanche (2009).

In the monkey experiments, spikes were recorded simultaneously from up to 16 single-ended micro-electrodes (ø = 80 μm) or tetrodes (ø = 96 μm) in lateral prefrontal cortex of three trained macaque monkeys (M1: 6 kg ♀; M2: 12 kg ♂; M3: 8 kg ♀). The electrodes had impedances between 0.2 and 1.2 MΩ at 1 kHz, and were arranged in a square grid with inter-electrode distances of either 0.5 or 1.0 mm. The monkeys performed a visual short term memory task with on average 80% correct behavioral responses which required them to memorize a sample object and to compare a test stimulus presented after a delay of 3 s to memory content. The monkeys indicated via differential button press whether test and sample stimuli matched or not. Each trial consisted of a 1 s long baseline, 500–900 ms sample stimulus presentation, a delay of 3 s and a response interval lasting throughout a 2 s test stimulus presentation. More details of the experimental procedure can be found in Pipa et al. (2009). In total, we analyzed spike data from 11 experimental sessions comprising almost 12,000 trials. In M1 and M2 we recorded four sessions each, and in M3 we recorded three sessions. Six out of 11 sessions were recorded with tetrodes (2/4, 4/4, and 0/3 from M1, M2, and M3, respectively). Spike sorting on the tetrode data was performed using a Bayesian optimal template matching approach as described in Franke (2011) (see Franke et al., 2010 for an earlier version) using the "Spyke Viewer" software (Pröpper and Obermayer, 2013). On the single electrode data, spikes were sorted with a multi-dimensional PCA method (Smart Spike Sorter by Nan-Hui Chen).

### **MEASURES**

Avalanches in SOC systems are cascades of spikes triggered by a single external spike (Bak et al., 1987). An avalanche can span the entire system, but can also affect just a few sites before it dies out. By definition, in SOC models subsequent avalanches are separated by pauses that are much longer than the avalanches proper (STS) (Bak et al., 1987; Pruessner, 2012). This means that a new avalanche is only triggered after the previous one has long died out. In SOC systems, several avalanche characteristics, such as the distribution of sizes and durations, follow scaling laws, known from the framework of "renormalization theory" (Stanley, 1971, 1999; Sethna et al., 2001; Dhar, 2006). In the following, we define the avalanche measures and describe the expected scaling laws for the SOC model and the critical stochastic branching model.

The avalanche size *s* is the total number of spikes in an avalanche. The avalanche size distribution *f(s)* is the frequency distribution of avalanche sizes, and *p(s)* refers to the respective probability distribution. *f(s)* follows a power law in SOC systems:

$$f(s) \sim s^{-\tau}$$

τ is the critical exponent and depends on the SOC model. For the SOC model we use here (α = 1 and *h* → 0), τ ≈ 1 (Bak et al., 1987; Priesemann et al., 2009), and for the critical branching model τ = 1.5 (Harris, 1963; Haldeman and Beggs, 2005).

The definition of avalanche sizes in the driven models (*h* > 0) and *in vivo* relied on temporal binning (Beggs and Plenz, 2003), since these systems lacked STS. When applying temporal bins to a spike train, the avalanche size was defined as the total number of events in subsequent, non-empty time bins (**Figure 1**). Stating it differently, an avalanche is by definition the activity in a sequence of full bins, and is preceded and followed by an empty bin. With this definition, *f(s)* changed with the bin size (**Figure 1**).

As stated above, *f(s)* changed with the bin size. To quantify the bin-size dependent changes of *f(s)*, we used the mean avalanche size (<*s*>), and the measure *f*(*s* = 1, *bs*), i.e., the bin size dependence of the frequency of avalanches of size *s* = 1.
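
The binning-based definition and the two bin-size dependent measures can be sketched as follows. This is a minimal Python illustration, not the analysis code of the study: the function names `avalanche_sizes` and `bin_size_scan` are hypothetical, spike times of all units are assumed to be pooled into one list, *f*(*s* = 1, *bs*) is reported as a raw count, and avalanches cut off by the start or end of the recording are not treated specially.

```python
from collections import Counter

def avalanche_sizes(spike_times, bin_size):
    """Avalanche sizes: events in runs of consecutive non-empty time bins."""
    if not spike_times:
        return []
    # Count events per occupied bin; bins that never appear are empty.
    counts = Counter(int(t // bin_size) for t in spike_times)
    sizes = []
    current = 0
    previous_bin = None
    for b in sorted(counts):
        if previous_bin is not None and b != previous_bin + 1:
            # At least one empty bin separates the two occupied bins.
            sizes.append(current)
            current = 0
        current += counts[b]
        previous_bin = b
    sizes.append(current)
    return sizes

def bin_size_scan(spike_times, bin_sizes):
    """Map each bin size to (number of size-one avalanches, mean avalanche size)."""
    result = {}
    for bs in bin_sizes:
        sizes = avalanche_sizes(spike_times, bs)
        if sizes:
            result[bs] = (sizes.count(1), sum(sizes) / len(sizes))
        else:
            result[bs] = (0, 0.0)
    return result
```

For example, the spike times `[0.5, 1.2, 1.7, 4.1, 7.0, 7.9, 8.3]` yield the avalanche sizes `[3, 1, 3]` for a bin size of 1 and a single avalanche of size 7 for a bin size of 4, which illustrates how strongly *f(s)* depends on the bin size.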

A common measure to characterize neural avalanches is the branching parameter. In a branching process, the branching parameter σ defines whether activity expands (σ > 1) or dies out (σ < 1) (Harris, 1963). Between these two regimes, at σ = 1, the branching process is critical (Harris, 1963). In analogy, σ<sup>∗</sup> was estimated from spike trains using temporal binning as follows (Beggs and Plenz, 2003; Priesemann et al., 2009): σ<sup>∗</sup><sub>*i*</sub> is the number of events in time bin *t<sub>i</sub>* divided by the number of events in time bin *t*<sub>*i*−1</sub>. The average over all σ<sup>∗</sup><sub>*i*</sub> (for which the number of events in *t*<sub>*i*−1</sub> is not zero) is defined as the estimated branching parameter σ<sup>∗</sup> (**Figure 1**) (Beggs and Plenz, 2003; Priesemann et al., 2009). Note that σ<sup>∗</sup> depends on the bin size, and may fail to provide the intended results (see Results and Discussion).
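
The estimator can be written down in a few lines. The Python sketch below is illustrative only; the function name `estimate_branching_parameter` is hypothetical, and it is an assumption of the sketch that the binning runs from the bin of the first event to the bin of the last event, so that empty bins after the last event are not considered.

```python
from collections import Counter

def estimate_branching_parameter(spike_times, bin_size):
    """Average ratio of event counts in successive bins, skipping empty predecessors."""
    if not spike_times:
        raise ValueError("no events")
    counts = Counter(int(t // bin_size) for t in spike_times)
    first, last = min(counts), max(counts)
    ratios = []
    for b in range(first + 1, last + 1):
        previous = counts.get(b - 1, 0)
        if previous > 0:
            # Ratio of events in bin t_i to events in bin t_(i-1).
            ratios.append(counts.get(b, 0) / previous)
    if not ratios:
        raise ValueError("no pair of successive bins with a non-empty first bin")
    return sum(ratios) / len(ratios)
```

For bin counts of 2, 4, 0, 1, 1 the contributing ratios are 4/2, 0/4, and 1/1 (the ratio following the empty bin is skipped), giving an estimate of 1.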

Detrended fluctuation analysis (DFA) (Peng et al., 1994, 1995; Kantelhardt et al., 2002) quantifies long-range correlations in a time series, which also dominate SOC systems. We applied DFA to the time course of the summed population activity. The summed population activity is the total number of spikes across all neurons at each sampling step. For the DFA, we used analysis window widths from 2<sup>4</sup> to 2<sup>11</sup> ms. Smaller window widths could not be used because of the limited sampling resolution, and for windows larger than 2 s the power-law scaling broke down, and this impeded the estimation of the DFA exponent β.
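
For readers unfamiliar with DFA, the procedure can be sketched in plain Python. This is a generic first-order DFA with non-overlapping windows and linear detrending, written here for illustration; it is not claimed to reproduce the exact variant or settings used in the study, and the names `dfa_exponent` and `white_noise` are hypothetical.

```python
import math
import random

def _linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_exponent(signal, widths):
    """Slope of log F(w) against log w, with F(w) the detrended fluctuation."""
    mean = sum(signal) / len(signal)
    profile = []
    total = 0.0
    for v in signal:
        total += v - mean          # cumulative sum of the mean-subtracted signal
        profile.append(total)
    log_w, log_f = [], []
    for w in widths:
        xs = list(range(w))
        squared = []
        for start in range(0, len(profile) - w + 1, w):
            segment = profile[start:start + w]
            slope, intercept = _linear_fit(xs, segment)
            # Mean squared residual around the local linear trend.
            squared.append(sum((y - (slope * x + intercept)) ** 2
                               for x, y in zip(xs, segment)) / w)
        if not squared:
            raise ValueError("window width exceeds signal length")
        log_w.append(math.log(w))
        log_f.append(math.log(math.sqrt(sum(squared) / len(squared))))
    return _linear_fit(log_w, log_f)[0]

def white_noise(n, seed=0):
    """Uncorrelated Gaussian test signal (exponent close to 0.5 is expected)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

For a population activity sampled in 1 ms steps, the window widths stated above would correspond to `widths = [2 ** k for k in range(4, 12)]`. Uncorrelated noise gives an exponent near 0.5, while long-range correlations give larger values.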

It sometimes is helpful to measure the bin size not in absolute time (e.g., milliseconds), but in "average inter event intervals" (<*IEI*>). The <*IEI*> is the inverse of the population rate *R*, i.e., the rate of all units together, independent of their origin. In contrast to the population rate *R*, the rate of a single unit is denoted with *r*.

## **LFP RECORDINGS IN HUMANS**

We evaluated LFP which were recorded with intracranial depth recordings in humans. We used the very same data and analysis methods as in Priesemann et al. (2013), and we used the results from all vigilance states combined, because we already showed that the differences with vigilance states were small (Priesemann et al., 2013). We analyzed data from five subjects [three females (aged 21, 23, and 27), two males (aged 25 and 48)] with refractory partial epilepsy undergoing pre-surgical evaluation. The subjects were hospitalized between February 2005 and March 2007 in the epilepsy unit at the Pitié-Salpêtrière hospital in Paris. All patients gave their informed consent and procedures were approved by the local ethical committee (CCP). Each patient was continuously recorded during several days (duration range: 9–20 days; mean duration: 16 days) with intracranial and scalp electrodes (Nicolet acquisition system, CA, US). Depth electrodes were composed of 4–10 cylindrical contacts (2.3-mm long, 1 mm in diameter, 10-mm apart center-to-center), mounted on a 1 mm wide flexible plastic probe. Pre and post implantation MRI scans were evaluated to anatomically locate each contact along the electrode trajectory. The placement of electrodes within each patient was determined solely by clinical criteria. Signals were digitized at 400 Hz. The five subjects were implanted with (44, 48, 50, 50, and 63) intracranial LFP recording sites. In total seven recording sites were excluded from the analysis due to artifacts and thus we used (44, 48, 45, 50, and 61) recording sites for data evaluation. All LFP were low-pass filtered at 40 Hz (4th order Butterworth, MATLAB) to reduce the impact of line noise.

To analyze the neuronal avalanches for these LFP data in the same manner as the spike data, we extracted binary events from the LFP. These binary events represent phases of enhanced synaptic activity. To extract these events, we calculated the area under the positive deflection lobes between two zero crossings of the LFP (Figure 2 in Priesemann et al., 2013). As LFP-voltages reflect current flows via Ohm's law, this time integral, or area under the voltage curve, is proportional to the total amount of displaced charges and hence describes the departure from equilibrium (charge neutrality) quantitatively, in contrast to simple voltage peaks. To obtain binary events from the LFP, we applied a threshold to the area values under the LFP deflection lobe. The threshold was selected such that each recording site in each interval of constant vigilance state had the same event rate *r* = 1/4 Hz. In contrast to our first paper with these data (Priesemann et al., 2013), we here used only one value for *r*, and combined the results for all vigilance states from wakefulness to deep sleep, since neither *r* nor the different vigilance states affected the results qualitatively (Priesemann et al., 2013).
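
The event extraction can be sketched as follows. The Python code is a simplified illustration rather than the original analysis: the function names are hypothetical, the area is approximated by a rectangle rule on the sampled signal, the event is placed at the onset of the lobe, and the threshold is chosen as the area of the *n*-th largest lobe so that roughly *n* = rate × duration events remain. All of these are assumptions of the sketch.

```python
def positive_lobe_areas(signal, dt):
    """Return (start index, area) for each positive lobe between zero crossings."""
    lobes = []
    area = 0.0
    start = None
    for i, v in enumerate(signal):
        if v > 0:
            if start is None:
                start = i
            area += v * dt             # rectangle-rule time integral
        elif start is not None:
            lobes.append((start, area))
            start = None
            area = 0.0
    if start is not None:
        lobes.append((start, area))    # lobe still open at the end of the signal
    return lobes

def threshold_for_rate(lobes, duration, rate):
    """Area threshold that keeps about rate * duration of the largest lobes."""
    n_events = int(round(rate * duration))
    areas = sorted((a for _, a in lobes), reverse=True)
    if n_events <= 0 or n_events > len(areas):
        raise ValueError("requested event rate cannot be met")
    return areas[n_events - 1]

def binary_events(lobes, threshold):
    """Start indices of the lobes whose area reaches the threshold."""
    return [start for start, area in lobes if area >= threshold]
```

Applying `threshold_for_rate` separately to every recording site and interval, with the same target rate, mirrors the idea that each site contributes the same event rate *r*.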

For the avalanche analysis in the humans, we used a bin size either in units of average inter event intervals (<*IEI*>) or in ms. The <*IEI*> is a function of the event rate *r* and the number of electrode contacts *N*, <*IEI*> = 1/(*r* · *N*) = 1/*R*. Since *r* was fixed and *N* did not vary much across patients, the following approximation holds: 1 <*IEI*> ≈ 80 ms.
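
The approximation can be checked with a short calculation. The Python snippet below is only a worked example of the formula <*IEI*> = 1/(*r* · *N*); the function name `mean_iei_ms` is hypothetical, and the site counts are the numbers of recording sites used for data evaluation as stated above.

```python
def mean_iei_ms(rate_per_site_hz, n_sites):
    """Average inter event interval in ms for n_sites sites with equal event rate."""
    population_rate_hz = rate_per_site_hz * n_sites   # R = r * N
    return 1000.0 / population_rate_hz                # <IEI> = 1 / R, in ms

if __name__ == "__main__":
    for n_sites in (44, 48, 45, 50, 61):
        print(n_sites, round(mean_iei_ms(0.25, n_sites), 1))
```

With *r* = 1/4 Hz and *N* = 50 the population rate is *R* = 12.5 Hz, i.e., <*IEI*> = 80 ms; for the other patients the value lies between roughly 66 and 91 ms.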

## **FUNDING**

Viola Priesemann received financial support from the German Ministry for Education and Research (BMBF) via the Bernstein Center for Computational Neuroscience (BCCN) Göttingen under Grant No. 01GQ1005B. Viola Priesemann and Matthias H. J. Munk received funding from the Federal Ministry of Education and Research (BMBF) Germany under grant number 01GQ0742. Viola Priesemann, Michael Wibral, and Jochen Triesch received funding from the LOEWE Grant "Neuronale Koordination Forschungsschwerpunkt Frankfurt (NeFF)." Robert Pröpper received funding from the Deutsche Forschungsgemeinschaft (GRK 1589/1). Danko Nikolić received funding from the Deutsche Forschungsgemeinschaft (NI 708/5-1) and the Hertie Stiftung. Jochen Triesch is supported by the Quandt foundation.

## **ACKNOWLEDGMENTS**

We thank Dr. Anna Levina for helpful discussions, and Maximilian Puelma Touzel for his comments on the manuscript. We thank Hanka Klon-Lipok for monkey training and excellent support for recording experiments, Sergio Neuenschwander for providing the software for monkey data acquisition ("SPASS"), and Nan-Hui Chen for providing the software for spike sorting ("Smart Spike Sorter"). We thank Philipp Meier and Christian Donner for contributing to the tetrode spike sorting (monkeys). Neural data from the cat were recorded by Tim Blanche in the laboratory of Nicholas Swindale, University of British Columbia, and downloaded from the NSF-funded CRCNS Data Sharing website. Neural data from the rat were provided by K. Mizuseki, A. Sirota, E. Pastalkova, and G. Buzsáki, Rutgers University, and were downloaded from the NSF-funded CRCNS Data Sharing website.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnsys.2014.00108/abstract

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 February 2014; accepted: 21 May 2014; published online: 24 June 2014. Citation: Priesemann V, Wibral M, Valderrama M, Pröpper R, Le Van Quyen M, Geisel T, Triesch J, Nikolić D and Munk MHJ (2014) Spike avalanches in vivo suggest a driven, slightly subcritical brain state. Front. Syst. Neurosci. 8:108. doi: 10.3389/fnsys.2014.00108*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Priesemann, Wibral, Valderrama, Pröpper, Le Van Quyen, Geisel, Triesch, Nikolić and Munk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Functional significance of complex fluctuations in brain activity: from resting state to cognitive neuroscience

## *David Papo\**

*Computational Systems Biology Group, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain*

#### *Edited by:*

*Lucilla De Arcangelis, Second University of Naples, Italy*

#### *Reviewed by:*

*Paolo Allegrini, Consiglio Nazionale delle Ricerche, Italy Antonio De Candia, University of Naples, Italy*

#### *\*Correspondence:*

*David Papo, Computational Systems Biology Group, Center for Biomedical Technology, Universidad Politécnica de Madrid, Calle Ramiro de Maeztu, 7, 28040 Madrid, Spain e-mail: papodav@gmail.com*

Behavioral studies have shown that human cognition is characterized by properties such as temporal scale invariance, heavy-tailed non-Gaussian distributions, and long-range correlations at long time scales, suggesting models of how (non-observable) components of cognition interact. On the other hand, results from functional neuroimaging studies show that complex scaling and intermittency may be generic spatio-temporal properties of the brain at rest. Somewhat surprisingly, though, the neural correlates of cognition have hardly ever been studied at time scales comparable to those at which cognition shows scaling properties. Here, we analyze the meanings of scaling properties and the significance of their task-related modulations for cognitive neuroscience. It is proposed that cognitive processes can be framed in terms of complex generic properties of brain activity at rest and, ultimately, of functional equations, limiting distributions, symmetries, and possibly universality classes characterizing them.

**Keywords: scaling, multifractals, ageing, weak ergodicity breaking, symmetry, fluctuation-dissipation theorem, cognitive neuroscience, resting state**

## **INTRODUCTION**

Ideally, cognitive psychology aims at providing a description of the space of cognitive processes, the nature of each of them, and the way they interact. Cognitive processes are unobservable regimes of an underlying dynamical system. However, they can be reconstructed by considering that sequences of observable quantities, sampled during the execution of controlled cognitive tasks, are the output of this system.

In behavioral studies, the underlying system is construed as a black box function, with given tasks, supposed to summon given cognitive processes, as inputs, and observable behavioral performance as outputs.

Typically, a quantitative description of cognitive processes consists in calculating means and standard deviations of trial-averaged performance measures, implicitly assuming an underlying Gaussian distribution (which is completely described by its first two moments), and statistical independence of the various trials.

However, the results of numerous behavioral studies [see (Kello et al., 2010) for a review] cannot be reconciled with Gaussian distribution functions. Power-law distributions and temporal scaling have consistently been found for relatively short time series (∼10<sup>2</sup>–10<sup>3</sup> time points) (Gilden, 2001) of inter-trial fluctuations in performance levels, although finer temporal scales have also been considered, particularly for motor tasks (Cabrera and Milton, 2002; Diniz et al., 2011).

Behavioral scaling laws contain important information about cognitive function, viz. on how (non-observable) components of cognition interact (Holden et al., 2009). For instance, power-law scaling of trial-to-trial performance variations has been taken to arise from multiplicative interactions among interdependent processes, suggesting that the mechanisms through which processes interact to give rise to cognitive performance may be no less fundamental than single components' functioning principles (Holden et al., 2009; Ihlen and Vereijken, 2010).

The scaling properties appear to be modulated in a task-specific way. For example, increasing task difficulty accelerates the transition from 1/f to white noise in decision-making time series (Correll, 2008; Grigolini et al., 2009).

Cognitive function is naturally understood as originating from brain activity, and quantitatively characterized in terms of the brain properties associated with the execution of given cognitive tasks. Cortical activity adds spatial and temporal scales unavailable in behavioral studies, so that scaling can be assessed within single process realizations.

The brain generates fluctuations with complex scaling properties (Novikov et al., 1997; Linkenkaer-Hansen et al., 2001; Gong et al., 2002, 2007; Freeman et al., 2003; Bianco et al., 2007; Suckling et al., 2009; Freyer et al., 2009), even in the absence of exogenous perturbations or changes in parameters controlling its activity. Only a few experimental studies (Linkenkaer-Hansen et al., 2004; Popivanov et al., 2006; Buiatti et al., 2007; Bhattacharya, 2009; He et al., 2010; Ciuciu et al., 2011, 2012; Zilber et al., 2012) have investigated the scaling properties of *task-related* brain activity, or their relationship with behavioral ones (Monto et al., 2008; Palva et al., 2013; Kello, 2013).

The aetiology and functional meaning of brain fluctuation scaling have been discussed at length. For example, the presence of spatial and temporal inverse-power law correlations is often taken to suggest that the brain lives near a second order phase transition, a condition optimizing information processing and storage, and dynamic response (Chialvo, 2010).

Here, instead, we discuss ways in which fluctuation properties can be used as metrics making cognitive function observable.

## **A RANDOM WALK AROUND BRAIN ACTIVITY'S SPACE**

To garner a physical intuition of the meaning of brain fluctuations one can think of brain activity as the motion of a random walker making steps of size *x* at given times *t*, or, in the continuous limit, of a diffusing macroscopic particle in a complex high-dimensional space, subject to viscous friction, with a time scale τ<sub>*m*</sub>, and driven by an additive random force with a characteristic time τ<sub>η</sub> (Hsu and Hsu, 2009).

The relationship between τ<sub>*m*</sub> and τ<sub>η</sub> determines how the system evolves in this complex space, including traveled distances, velocity, degree to which the space is visited, time to reach a given target point, system's memory of its own trajectory within the landscape, relationship between spontaneous and task-related activity, and ultimately how microscopic fluctuations renormalize to give rise to observable macroscopic statistical properties (Papo, 2013b).

If spontaneous fluctuations were *Markovian*, with Gaussian δ-correlated noise, and τ<sub>η</sub> ≪ τ<sub>*m*</sub>, the particle would undergo normal diffusion: the step length would be taken from the Maxwell-Boltzmann equilibrium distribution, and the mean-square distance (MSD) traveled by the particle would scale linearly with time, ⟨|*x*(*t*)|<sup>2</sup>⟩ ∼ *t*. Under general conditions, the first passage time from a prescribed phase space domain would be characterized by a universal distribution, independent of the jump length distribution (Sparre Andersen, 1953). For *t* ≫ τ<sub>*m*</sub>, the temporal autocorrelation of velocity fluctuations would behave as *C*(τ) ∼ exp(−*t*/τ<sub>*m*</sub>), with a unique *characteristic time* τ<sub>*m*</sub>. The dynamics would hop without memory from one configuration to another, eventually visiting the whole phase space.
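
The linear growth of the MSD for a memoryless walker is easy to verify numerically. The following Python sketch is a toy illustration and not a model of brain activity: the function name `ensemble_msd` is hypothetical, the walk is one-dimensional, and the steps are independent Gaussian increments of unit variance, chosen here for simplicity in place of a Maxwell-Boltzmann distribution.

```python
import random

def ensemble_msd(n_walkers, n_steps, seed=0):
    """Mean-square displacement after each step, averaged over independent walkers."""
    rng = random.Random(seed)
    positions = [0.0] * n_walkers
    msd = []
    for _ in range(n_steps):
        for w in range(n_walkers):
            # Independent increments: the walker keeps no memory of its past.
            positions[w] += rng.gauss(0.0, 1.0)
        msd.append(sum(x * x for x in positions) / n_walkers)
    return msd
```

For unit-variance steps the MSD after *t* steps is expected to be close to *t*, so `ensemble_msd(2000, 100)[t - 1] / t` should stay near one for all *t*. Deviations from this linear law are what the anomalous case discussed next refers to.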

However, the properties of observed brain fluctuations are inconsistent with the Markovian approximation (Fraiman and Chialvo, 2012). Spontaneous fluctuations show temporal and spatial scale-free statistics (Novikov et al., 1997; Linkenkaer-Hansen et al., 2001; Gong et al., 2002; Stam and de Bruin, 2004; Expert et al., 2010; van de Ville et al., 2010). The MSD scales as ⟨|*x*(*t*)|<sup>2</sup>⟩ ∼ *t*<sup>2ν</sup> with ν ≠ 1/2, so that its diffusion is *anomalous*, and indeed even *strongly* anomalous (Suckling et al., 2009; Ciuciu et al., 2011, 2012; Zilber et al., 2012), with the *q*-th moments scaling as ⟨|*x*(*t*)|<sup>*q*</sup>⟩ ∼ *t*<sup>*q*ν(*q*)</sup>, with ν(*q*) ≠ *const* (Castiglione et al., 1999). Appropriately rescaled average temporal fluctuations collapse onto *universal scaling functions* (Sherrington, 2010; Friedman et al., 2012; Shriki et al., 2012).

Exponential relaxation is replaced by complex scaling, e.g., of a Mittag-Leffler type (Bianco et al., 2007), with stretched exponential relaxation at microscopic scales (*t* < τ), and inverse power-law scaling *C*(τ) ∼ τ<sup>−α</sup>, for *t* ≫ τ, so that, for α ≤ 1, the *correlation time* τ<sub>*C*</sub> = ∫<sub>0</sub><sup>∞</sup> *C*(*t*) *dt* diverges, leading to a scale-free process with memory. The system undergoes *ageing* (Bianco et al., 2007): correlations are time-inhomogeneous, with a dependence on the time of application of a given field, history-dependent (Sherrington, 2010), and weakly non-ergodic (Bianco et al., 2007), as some phase space regions may take extremely long times to be visited (Bouchaud, 1992).
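
The divergence of the correlation time for α ≤ 1 follows from a one-line integration. As a worked sketch, assume a pure power-law tail *C*(*t*) = (*t*/*t*<sub>0</sub>)<sup>−α</sup> for *t* ≥ *t*<sub>0</sub>, where the lower cutoff *t*<sub>0</sub> and the finite upper limit *T* are symbols introduced here for illustration only:

$$\int_{t_0}^{T} \left(\frac{t}{t_0}\right)^{-\alpha} dt = \begin{cases} \dfrac{t_0}{1-\alpha}\left[\left(\dfrac{T}{t_0}\right)^{1-\alpha} - 1\right], & \alpha \neq 1 \\ t_0 \ln\dfrac{T}{t_0}, & \alpha = 1 \end{cases}$$

For α < 1 the integral grows as *T*<sup>1−α</sup> and for α = 1 as ln *T*, so it diverges as *T* → ∞. For α > 1 it converges to *t*<sub>0</sub>/(α − 1), i.e., a finite correlation time exists only for sufficiently fast decay.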

Activity shows statistical and dynamical *intermittency*: on the one hand, although large-scale fluctuations are approximately Gaussian, non-Gaussian fluctuations appear at higher frequencies (Freyer et al., 2009). On the other hand, activity is characterized by alternating laminar and turbulent phases (Gong et al., 2007; Allegrini et al., 2010, 2011).

## **UNDERSTANDING BRAIN FLUCTUATIONS**

The statistical and dynamical properties of brain fluctuations contain information on the structure of the functional space within which brain dynamics evolves, and on the style, as it were, with which brain dynamics explores its dynamic repertoire (Ghosh et al., 2007; Deco et al., 2011; Betzel et al., 2012).

## **FROM SINGLE STEPS TO COMPLETE WALKS**

Scaling laws indicate that the walker takes steps of all sizes, from local to extremely long jumps.

More importantly, probability distributions contain information on how observable large-scale outcomes arise from the interactions of many small-scale processes (Frank, 2009). Observed probability distributions can be thought of as reproducible macroscopic features emerging from the sum of highly fluctuating individual elements. It is natural to see this sum as representing the temporal aggregation of fluctuations within a given time-window.

The central limit theorem (CLT) ensures that the limit distribution of the sum of a large number of random variables is a stable law. The law is Gaussian if the variables are independent and have finite variance. For correlated or infinite variance fluctuations, the CLT ought to be generalized, and the stable law is not Gaussian but Lévy. Importantly, in the latter case the largest term is of approximately the same order of magnitude as the sum, indicating that extreme events dominate the underlying process (Laguës and Lesne, 2008).
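The dominance of the largest term can be illustrated numerically. The following is a minimal sketch, not taken from the cited works: the choice of exponential variables for the finite-variance case, of a Pareto law with tail exponent 0.5 for the heavy-tailed case, and the helper name `max_to_sum_ratio` are all illustrative assumptions.

```python
import numpy as np

def max_to_sum_ratio(samples):
    """Fraction of the total sum contributed by the single largest term."""
    samples = np.asarray(samples, dtype=float)
    return samples.max() / samples.sum()

rng = np.random.default_rng(0)
n = 100_000

# Finite-variance case (assumed example): exponential variables with mean 1.
light = rng.exponential(scale=1.0, size=n)

# Heavy-tailed case (assumed example): classical Pareto with P(X > x) = x**-0.5
# for x >= 1, whose mean and variance are both infinite.
tail_exponent = 0.5
heavy = rng.pareto(tail_exponent, size=n) + 1.0

light_ratio = max_to_sum_ratio(light)
heavy_ratio = max_to_sum_ratio(heavy)
print(f"largest term / sum, finite variance: {light_ratio:.2e}")
print(f"largest term / sum, heavy tail:      {heavy_ratio:.2e}")
```

In the finite-variance case the largest term is a negligible fraction of the sum, while in the heavy-tailed case a single term typically accounts for a sizeable fraction of it.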

From a dynamical view-point, the CLT accounts for normal diffusion and the time dependence of the MSD (or the walker's position), while generalized CLTs result in anomalous diffusion, which differs both in relaxation speed, and in the probability distribution's shape, even at very long times.

Probability distributions can be seen as resulting from the iteration of some action on them. For instance, stable laws are fixed points of the convolution operation. Somewhat equivalently, fluctuation distributions can be understood as asymptotic behaviors emerging as the system is coarse-grained and rescaled (Hochberg and Pérez-Mercader, 2003). Scale-free distributions are fixed points of a renormalization flow, and universality classes are their basins of attraction. The surface comprising the models flowing into the same fixed point separates the space into phases, corresponding to different macroscopic phenomenologies (Laguës and Lesne, 2008). Universality can be understood in terms of relevant and irrelevant operators, depending on the consequence they have on the statistical behavior (Laguës and Lesne, 2008).

Probability distributions can also be seen as solutions to specific problems expressed e.g., by differential equations (Barenblatt and Zel'dovich, 1972). For instance, probability distributions are solutions of the Fokker-Planck equation of evolution of the particle's transition probabilities, under given information constraints (Jaynes, 1957). For the linear diffusion equation, the solution is a time-evolving spatial Gaussian probability function maximizing the Shannon entropy. Correlated anomalous diffusion is governed by a nonlinear Fokker-Planck equation whose exact stationary solutions are probability distributions maximizing Tsallis generalized entropy (Borland, 1998).

## **EMERGENCE OF STRUCTURE: MEMORY, TEMPORAL ORDER, AND NON-LOCALITY**

Correlations are propagators, whose characteristic length ξ constitutes an active time window within which all points are somehow related to each other.

A Markovian system has perfectly elastic, almost instantaneous relaxation and no memory: the time axis tends to be infinitely fragmented, so that activities of overall duration *L* are *temporally disordered* (*L* ≫ ξ).

Brain fluctuations' loss of scale separation allows microscopic randomness to renormalize and become macroscopically detectable (Grigolini et al., 1999): correlated driving noise and cross-scale relationships produce *temporally ordered* structures (*L* ∼ ξ ), so that activity at a given time point is temporally *nonlocal*, and not easily divorced from that occurring within the scaling range.

With temporal scaling, fluctuations no longer have a characteristic time; more than to a multiplicity of scales {τ<sub>*i*</sub>}, the emphasis shifts to some *relationship* between them. The brain's functional heterogeneity introduces a spatial distribution of time scales {σ<sub>*i*</sub>} inducing a structure *S*. Eventually, the studied dynamics is a field $\phi(\vec{s}, t) \in \Phi$, where $\Phi = \{\phi\}$ is a space of systems, endowed with a spatio-temporal structure {*S*\*}, with arbitrarily complex topological properties (Zaslavsky, 2002), and which can become observable through a wealth of collective state variables *X* ∈ **X**.

The structure {*S*\*} is a dynamical system in the space of fields φ, relating representations of the process at different scales (Friedrich et al., 2000; Bacry et al., 2001; Longo et al., 2012). For instance, at any given scale λ within the scaling range, the probability *P*(*x*, *t*) that the particle traveled a distance *x* at time *t* can be thought of as the convolution of the distribution *P*(*x*, *t*) at the coarsest scale and a probability distribution *G*(.), not necessarily a power-law (Chainais et al., 2005), expressing the relationships across time scales (Castaing et al., 1990). For scale-invariant processes, *P*(*x*, *t*) = *t*<sup>−ν</sup>*F*(*x*/*t*<sup>ν</sup>), *G* collapses into a single point, and is simply the *scaling exponent* ν. Scale invariance breakdown indicates that *P*(*x*, *t*) is specified by a complex spectrum of scaling exponents.
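The scale-invariant form of the distribution implies that positions rescaled by the factor *t* raised to the power ν have a time-independent distribution. A minimal sketch of this collapse is given below for the simplest case only, an ordinary Gaussian random walk with ν = 1/2. The walk, the sample sizes and the helper name `rescaled_spread` are illustrative assumptions, and the sketch makes no attempt to model brain data.

```python
import numpy as np

def rescaled_spread(positions, t, nu):
    """Standard deviation of x / t**nu; constant in t if P(x, t) = t**-nu * F(x / t**nu)."""
    return np.std(positions / t**nu)

rng = np.random.default_rng(1)
n_walkers, n_steps = 10_000, 400

# Ordinary random walk with unit-variance Gaussian steps (assumed toy process).
steps = rng.standard_normal((n_walkers, n_steps))
paths = np.cumsum(steps, axis=1)  # paths[:, k] is the position after k + 1 steps

nu = 0.5
times = [25, 100, 400]
spreads = {t: rescaled_spread(paths[:, t - 1], t, nu) for t in times}

for t, s in spreads.items():
    print(f"t = {t:4d}: raw std = {np.std(paths[:, t - 1]):6.2f}, rescaled std = {s:.3f}")
```

The raw spread grows with time, while the rescaled spread stays close to a constant, which is the single-exponent collapse described above. A process with a spectrum of exponents would not collapse under any single choice of ν.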

The set of renormalization operators is endowed with some structure, e.g., a multiplicative semi-group structure, and a covariance property comparable to that of tensors under the action of rotations, with scale invariance replacing Galilean invariance and fractal geometry the Euclidean one (Lesne, 2008a). In turn, scaling laws can be seen as the statistical properties prescribed by the symmetries of a (semi)group on the time-scale space (Borgnat et al., 2003).

Altogether, the presence of complex fluctuations allows treating brain activity as a physical object, defining subparts, and relationships among them, and ultimately using theoretical physics tools such as functional analysis and algebra to characterize them.

## **VELOCITY AND OPERATIONAL TIME**

The presence of scaling can be interpreted in dynamical terms in various ways.

Furthermore, the Lamperti transform establishes a bijective correspondence between self-similar processes on R<sup>+</sup> and stationary processes on R (Flandrin et al., 2003). Self-similar solutions reflect a uniform propagation regime (Barenblatt and Zel'dovich, 1972), and the system can be seen as moving at constant velocity, given by the scaling exponent (Sornette, 2004), whereas the breakdown of exact self-similarity indicates that the propagator is not time-stationary.
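For a self-similar process *X* with exponent *H*, the Lamperti transform is *Y*(*t*) = *e*<sup>−*Ht*</sup>*X*(*e*<sup>*t*</sup>). The sketch below applies it to ordinary Brownian motion (*H* = 1/2), chosen here purely as an illustrative example; the grid, the ensemble size and the function name `lamperti` are assumptions made for the illustration.

```python
import numpy as np

def lamperti(x_at_exp_t, t, hurst):
    """Lamperti transform Y(t) = exp(-H t) * X(exp(t)) of samples X(exp(t))."""
    return np.exp(-hurst * t) * x_at_exp_t

rng = np.random.default_rng(2)
n_paths = 20_000

t = np.linspace(0.0, 4.0, 9)  # time axis of the transformed process (step 0.5)
s = np.exp(t)                 # corresponding physical times on the positive half-line

# Brownian motion sampled at the times s, built from independent Gaussian increments.
dt = np.diff(np.concatenate(([0.0], s)))
increments = rng.standard_normal((n_paths, s.size)) * np.sqrt(dt)
x = np.cumsum(increments, axis=1)

y = lamperti(x, t, hurst=0.5)

var_x = x.var(axis=0)  # grows like s = exp(t)
var_y = y.var(axis=0)  # stays close to 1

# Covariance at lag 1 (two grid steps apart), estimated at two different origins.
cov_early = np.mean(y[:, 0] * y[:, 2])
cov_late = np.mean(y[:, 4] * y[:, 6])
print(var_x.round(2), var_y.round(2), round(cov_early, 3), round(cov_late, 3))
```

The variance of the self-similar process grows with time, whereas the transformed process has constant variance and a covariance that depends only on the lag, as expected for a stationary process.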

The scaling properties also define an intrinsic time of the process. This can be seen by considering that the random walk of brain activity has a waiting-time distribution (WTD) between jumps scaling as a power-law. The WTD defines an internal *operational time*, which can grow sub- or super-linearly with *physical time* (Sokolov and Klafter, 2005). Without multiplicative interactions, *operational* and *physical* time coincide. Multiplicative cross-scale interactions bias the WTD so that local probability densities become time-dependent and *intermittent*, and time translational invariance is broken (Crisanti and Ritort, 2003). The observed Mittag-Leffler fluctuation distribution (Bianco et al., 2007) may in fact stem from the process's intermittent subordination with internal time.
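A sub-linear operational time can be illustrated with a toy renewal process in which the operational time is simply the number of jumps completed by physical time *t*. The sketch below is an assumption-laden illustration, not the model of the cited works: waiting times are drawn from a Pareto law with tail exponent 0.5 (infinite mean), and the helper name `jump_counts` is hypothetical.

```python
import numpy as np

def jump_counts(waiting_times, t):
    """Number of jumps completed by physical time t in each realization."""
    arrival_times = np.cumsum(waiting_times, axis=1)
    return (arrival_times <= t).sum(axis=1)

rng = np.random.default_rng(4)
n_realizations, n_waits = 2_000, 2_000

# Assumed waiting-time law: P(wait > w) = w**-0.5 for w >= 1 (infinite mean).
alpha = 0.5
waits = rng.pareto(alpha, size=(n_realizations, n_waits)) + 1.0

t_short, t_long = 1_000.0, 4_000.0
mean_short = jump_counts(waits, t_short).mean()
mean_long = jump_counts(waits, t_long).mean()

# A clock ticking linearly in physical time would give a growth factor of 4.
growth = mean_long / mean_short
print(f"mean jumps by t={t_short:.0f}: {mean_short:.1f}; by t={t_long:.0f}: {mean_long:.1f}")
print(f"growth factor for a fourfold increase in physical time: {growth:.2f}")
```

With this waiting-time law the mean number of jumps grows roughly as the square root of physical time, so quadrupling the physical time only about doubles the operational time.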

## **DYNAMICAL REGIMES AND FLUCTUATION DISSIPATION RELATIONS**

Brain fluctuation properties help relate two only seemingly antagonistic aspects of brain activity: spontaneous and task-induced brain activity. For Gaussian δ-correlated fluctuations, the fluctuation-dissipation theorem (FDT) ensures that the system's integrated response χ(*t*, *t*′) at time *t* to an external field applied at time *t*′ and the autocorrelation function *C<sub>X</sub>*(*t*, *t*′) of the unperturbed system are linked by the temperature *T* of the bath with which the system is in equilibrium (Kubo, 1966). Translated in terms of brain activity, the FDT would establish an equivalence between stimulus-evoked and spontaneous brain fluctuation correlations (Papo, 2013c).

Complex multiscale fluctuations suggest that thermalization happens simultaneously at widely different timescales, so that the FDT in its classical form is not expected to hold (West et al., 2008). For systems with the type of intermittency observed for brain activity, the linear response is anomalous even with simple stimuli (Silvestri et al., 2009; Allegrini et al., 2010). The way the FDT is violated and the ingredients necessary to recover it can be used in various ways as descriptors of brain activity.

First, the properties of ongoing fluctuations define the form of the generalized FDT holding for brain activity and, *in fine*, the way stimulus information is transferred to the brain. The presence of correlated noise affects the particle's transport properties and corresponding dynamics (Machura and Łuczka, 2010), and information transfer is maximized when stimuli and brain fluctuations display similar scaling properties (Allegrini et al., 2007; West and Grigolini, 2010; Aquino et al., 2011). Moreover, scaling exponents mark dynamical transitions between qualitatively different response regimes (Burov and Barkai, 2008).

Second, the nature of the FDT violation helps in understanding at what scales correlations and memory start playing a role, and in correctly characterizing the underlying dynamics by specifying the additional degrees of freedom necessary to recover Markovianity (Zwanzig, 2001).

Finally, *effective temperatures*, i.e., what a thermometer responding on the time scale at which the system slowly reverts to equilibrium would measure, which may be used to derive a generalized FDT (Cugliandolo et al., 1997), constitute intrinsic time scales of the system. Fluctuations ultimately identify a spatial distribution of scale-dependent relationships between spontaneous and stimulus-induced brain activity, quantifying the extent to which each scale deviates from equilibrium (Papo, 2013a). This reflects the fact that a path realizes qualitatively different diffusion processes at different temporal and spatial scales.

## **TASK-RELATED MODULATIONS**

Because most complex scaling properties are presumably generic, psychologists are primarily interested in the extent to which cognitive activity may affect them. Furthermore, precisely because they are generic, task-induced modulations of these properties represent powerful descriptors of the underlying processes.

## **CROSS-OVERS AND SYMMETRY CHANGES**

Numerous cognitive tasks have been shown to modulate the scaling exponents of brain fluctuation probability functions (Linkenkaer-Hansen et al., 2004; Popivanov et al., 2006; Buiatti et al., 2007; He et al., 2010; Ciuciu et al., 2011, 2012; Zilber et al., 2012). Task demands also appear to enhance data collapse and universality of brain fluctuations (Bhattacharya, 2009).

Cognitive demands may push brain activity toward the basin of attraction of adaptively advantageous probability distributions. Cognitive function would be tantamount to designing a driving noise function making the system's stationary distribution equal a desired target one. Moreover, insofar as power laws are solutions of functional equations, rather than frequency or amplitude modulators, cognitive processes may be conceptualised as operators acting upon the functional form of brain activity.

A still poorly explored possibility is that these modulations represent cross-overs between universality classes. This would allow classifying observed cognitive function as operators acting on symmetries (Lübeck, 2004). Renormalization flows would represent generalized dynamic pathways within the functional space, and universality classes a partition of this space, quantifying robustness with respect to control parameter variations (Lesne, 2008a,b).

Whether and how cognitive demands act on brain activity's symmetries deserves investigation (Freeman and Vitiello, 2006; Buice and Cowan, 2009). For instance, the transition from mono- to multifractal distributions has been reported at the late stages of various fracture phenomena (de Arcangelis and Herrmann, 1989; Kapiris et al., 2004). However, whether spontaneous activity is temporally scale-free (van de Ville et al., 2010) or breaks down scale invariance (Ciuciu et al., 2011, 2012; Wink et al., 2012; Zilber et al., 2012) is still an open debate. Existing discrepancies may stem from the order parameter used to evaluate scaling, e.g., whether it is local or has prominent spatial extension, as heterogeneity and disorder may directly affect the scaling exponents.

The shrinking of the multifractal spectrum associated with performance of cognitive tasks may amount to selecting a set of complex patterns from the available repertoire, or to modifying the rate at which these patterns are re-edited across the system (Kenet et al., 2003; Betzel et al., 2012). On the other hand, stimuli drive neural activity away from criticality (Kohen-Kashi Malina et al., 2013), an action reminiscent of the interruption of ageing caused by an external field forcing a glassy material (Kranz et al., 2010). In this sense, one may interpret multifractality as a sign of ageing (Allegrini et al., 2004).

## **STEERING WITHIN THE PHASE SPACE**

As they modulate the temporal scaling of fluctuations, cognitive demands affect the temporal organization of brain activity and the corresponding operational time.

The observable outcome could come in the form of a modulation of cross-over scales, e.g., the time scales at which fluctuations start converging to a Gaussian distribution, varying the likelihood of large scale events (Mantegna and Stanley, 1994), the length interval over which activity can be considered a Markov process, the time scale of the transition from microscopic to macroscopic dynamics (Aquino et al., 2007), or the degree of nonergodicity, corresponding to different ways of visiting the state space (Lomholt et al., 2013).

Furthermore, stimulus-induced modulations of temporal correlations may induce phase transitions in first-passage times (Carretero-Campos et al., 2012) and in response regimes (Burov and Barkai, 2008), and may influence fluctuations' transition to scaling, while endogenous activity likely affects the WTD scaling properties (Aquino et al., 2011).

Finally, cognitive demands may bias either the probabilities or the occurrence times of the walker's jumps (Allegrini et al., 2004), and therefore the operational time associated with a given process.

## **CONCLUSIONS**

We addressed the question of whether and how brain fluctuations help describe non-observable cognitive processes.

That observed behavior is a product of brain activity is a matter of general consensus. Here, we further proposed that the former can be described in terms of the generic properties of the latter, such as scaling regimes and their basins of attraction, symmetries (not only scale invariance), and FDT violations. Ultimately, it is tempting to conceive of observed behavior as a macroscopic property emerging from the renormalization of microscopic brain fluctuations.

Such characterization of the action of cognitive demands on brain activity affords a wealth of order parameters through which activity becomes observable, each representing a cut, in different dimensions and scales, of the same underlying space. More generally, it allows a conceptualization whereby cognitive processes operate upon the structure of brain activity, producing effects observable from various perspectives (e.g., structural or dynamical). Eventually, this shapes a functional space for which internal structure, and transition and combinatory rules can be extracted.

Finally, it is important to warn that these descriptions do not unambiguously characterize the aetiology of fluctuation properties, as similar scaling properties may stem from qualitatively different generators (Magdziarz et al., 2009; Meroz et al., 2013; Thiel et al., 2013) which may be difficult to distinguish with a finite amount of data (Grigolini, 2008).

## **ACKNOWLEDGMENT**

The author acknowledges the support of MINECO (FIS2012-38949-C03-01).

## **REFERENCES**


scaling behaviour of sensorimotor oscillations. *Eur. J. Neurosci.* 19, 203–211. doi: 10.1111/j.1460-9568.2004.03116.x


Zwanzig, R. (2001). *Nonequilibrium Statistical Mechanics*. Oxford: Oxford University Press.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 April 2014; accepted: 26 May 2014; published online: 11 June 2014.*

*Citation: Papo D (2014) Functional significance of complex fluctuations in brain activity: from resting state to cognitive neuroscience. Front. Syst. Neurosci. 8:112. doi: 10.3389/fnsys.2014.00112*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Papo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Alternation of up and down states at a dynamical phase-transition of a neural network with spatiotemporal attractors

## *Silvia Scarpetta<sup>1,2</sup>\* and Antonio de Candia<sup>3,4,5</sup>*

*<sup>1</sup> Dipartimento di Fisica "E. R. Caianiello", Università di Salerno, Fisciano (SA), Italy*

*<sup>2</sup> INFN Gr. Coll. di Salerno, Fisciano (SA), Italy*

*<sup>3</sup> Dipartimento di Fisica, Università di Napoli Federico II, Napoli, Italy*

*<sup>4</sup> CNR-SPIN, Sezione di Napoli, Italy*

*<sup>5</sup> INFN, Sezione di Napoli, Complesso Universitario di Monte S. Angelo, Naples, Italy*

#### *Edited by:*

*Lucilla De Arcangelis, Second University of Naples, Italy*

#### *Reviewed by:*

*Sebastiano Stramaglia, Università degli Studi di Bari, Italy*

*Fabrizio Lombardi, ETH Zürich, Switzerland*

#### *\*Correspondence:*

*Silvia Scarpetta, Dipartimento di Fisica "E. R. Caianiello", Università di Salerno, Via Giovanni Paolo II, 132, Stecca 9, I84084 Fisciano (SA), Italy e-mail: sscarpetta@unisa.it*

Complex collective activity emerges spontaneously in cortical circuits *in vivo* and *in vitro*, such as alternation of up and down states, replay of precise spatiotemporal patterns, and power law scaling of neural avalanches. We focus on such critical features observed in cortical slices. We study the spontaneous dynamics emerging in noisy recurrent networks of spiking neurons with sparse structured connectivity. The dynamics is studied in the presence of noise, with fixed connections; no short-term synaptic depression is used. Two different regimes of spontaneous activity emerge on changing the connection strength or noise intensity: a low activity regime, characterized by a nearly exponential distribution of firing rates with a maximum at rate zero, and a high activity regime, characterized by a nearly Gaussian distribution peaked at a high rate, with long-lasting replay of stored patterns. Between these two regimes, a transition region is observed, where firing rates show a bimodal distribution, with alternation of up and down states. In this region, one observes neuronal avalanches exhibiting power laws in size and duration, and a waiting time distribution between successive avalanches which shows a non-monotonic behavior. During periods of high activity (up states) consecutive avalanches are correlated, since they are part of a short transient replay initiated by noise focusing, and waiting times show a power law distribution. One can think of this critical dynamics as a reservoir of dynamical patterns for memory functions.

**Keywords: criticality, phase transition, STDP, associative memory, spatiotemporal pattern replay, neural avalanches, up and down states**

## **1. INTRODUCTION**

Spontaneous cortical activity, i.e., ongoing activity in the absence of sensory stimulation, can show very complex collective features, with, in some cases, the membrane potential making spontaneous transitions between two different levels called up and down states (Steriade et al., 1993; Cowan and Wilson, 1994; Cossart et al., 2003; Shu et al., 2003). This alternation of "down states" of network quiescence and "up states" of generalized spiking and neuronal depolarization has been observed to occur spontaneously in a variety of systems and conditions, both *in vitro* (Plenz and Kitai, 1998; Cossart et al., 2003; Shu et al., 2003) and *in vivo* during slow-wave sleep, anaesthesia and quiet waking (Petersen et al., 2003; Luczak et al., 2007). The precise mechanism by which these up-state transitions occur is still unclear, but it seems to rely on network mechanisms (Cossart et al., 2003). Up-state transitions are almost abolished by pharmacological blockers such as glutamate receptor antagonists (Cossart et al., 2003; Shu et al., 2003) and totally abolished by glutamate and GABA receptor antagonists (Cossart et al., 2003).

Results on *in vitro* and *in vivo* up states have suggested that this spontaneous activity occurs in a highly structured way, with repeating spatiotemporal patterns of cellular activity (Cossart et al., 2003; Luczak and MacLean, 2012). Because of their stereotyped spatio-temporal dynamics, it has been conjectured that network up states are circuit attractors (Cossart et al., 2003). Transitions between down and up states can also be evoked by sensory stimulation (Petersen et al., 2003), and interestingly evoked activity patterns are similar to the up states produced spontaneously (Luczak et al., 2007). Also *in vitro*, in thalamocortical slices, the patterns of activity evoked by thalamic stimulation were similar to the patterns of activity that occurred spontaneously during the up states (Luczak and MacLean, 2012).

Many experimental results, both in cell cultures and slices as well as *in vivo* (Gireesh and Plenz, 2008; Petermann et al., 2009; Ribeiro et al., 2010; Plenz, 2012; Haimovici et al., 2013), have also supported the idea that the brain operates near the critical point of a phase transition (Plenz and Thiagarajan, 2007; Chialvo, 2010; Plenz, 2012; Tagliazucchi et al., 2012; Yang et al., 2012; Plenz, 2013; Shew and Plenz, 2013). Neuronal avalanches, i.e., cascades of activity with power law distributions of sizes and durations (Beggs and Plenz, 2003; Mazzoni et al., 2007; Plenz and Thiagarajan, 2007; Pasquale et al., 2008; Plenz, 2012), are only one of the observed properties suggestive of criticality. Criticality is very advantageous for the brain, in terms of optimization of dynamical range, information transmission and capacity (large repertoire of diverse activity patterns) (Kinouchi and Copelli, 2006; Deco et al., 2013; Shew and Plenz, 2013).

All these intriguing results on spontaneous dynamics support the long-standing hypothesis that the brain can move in a landscape with multiple dynamical attractors, and that up states may be the result of the system falling into one of these attractors. From this point of view, the spontaneous fluctuations between up and down states may be the signature of a system poised at a non-equilibrium phase transition, where the system fluctuates in the landscape and flexibly switches from one state to another. Several models have been proposed as explanations for the avalanche power law distributions that emerge in spontaneous cortical activity (Kinouchi and Copelli, 2006; Levina et al., 2007; Plenz and Thiagarajan, 2007; de Arcangelis and Herrmann, 2010; Millman et al., 2010; Lombardi et al., 2012, in preparation; Yang et al., 2012; Scarpetta and de Candia, 2013), and many have discussed the emergence of up and down states in terms of attractor states of a dynamical system (Holcman and Tsodyks, 2006; Parga and Abbott, 2007; Millman et al., 2010), or self-organized criticality (Lombardi et al., 2012, in preparation). To get bistability, in Parga and Abbott (2007) IF neurons were augmented with a nonlinear membrane current, while in Holcman and Tsodyks (2006) and Millman et al. (2010) the crucial role of activity-dependent short-term synaptic depression was pointed out. For example, in the attractor model discussed in Holcman and Tsodyks (2006) the mean time the network spends in the down state is comparable to the mean time it takes for the synapses to recover from a certain depressed activity.

In this paper, we study a model that captures not only the emergence of neural avalanches and up and down states, but also additional features of spontaneous activity, such as the stable recurrence of particular spatiotemporal patterns. In particular, recurrence of spatiotemporal patterns has been observed within up states (Luczak and MacLean, 2012, and refs therein), and neuronal avalanches also seem to be highly repeatable, and can be clustered into statistically significant families of activity patterns that satisfy several requirements of a memory substrate (Beggs and Plenz, 2004; Stewart and Plenz, 2006; Gireesh and Plenz, 2008). The model is a network of leaky integrate-and-fire (LIF) neurons, whose connections have synaptic strengths designed in order to store in the network a set of spatiotemporal patterns. The network shows two distinct regimes, a regime of collective replay activity for high connection strength or high noise, and a regime of no activity for low connection strength or low noise. Between these two distinct regimes, there appears a region where noise is able to switch the network between periods of quiescence (down states) and periods of high rate coherent activity (up states). At a finer temporal scale, within up states, one observes neural avalanches with power law size and duration distributions. In this model, fluctuations between up and down states emerge even in the absence of short-term depression, or of any kind of single neuron bistability. It is a network effect, the result of a structured connectivity that produces multiple dynamical attractors. Near the non-equilibrium phase transition separating the two regimes in which the network remains permanently in either the up or the down state, one observes high fluctuations, induced by noise, with the emergence of transient up states. The mean time the network spends in the down or in the up state is related to noise intensity and connection strength.

## **2. MATERIAL AND METHODS**

## **2.1. THE MODEL**

We model the neurons as leaky integrate-and-fire (LIF) units. The postsynaptic membrane potential of neuron *i*, when the neuron does not emit a spike, is given by the equation

$$\frac{dV_i(t)}{dt} = -\frac{V_i(t)}{\tau_m} + \frac{I_i(t)}{C},\tag{1}$$

where τ<sub>*m*</sub> is the characteristic time of the membrane, *C* the membrane capacitance, and *I<sub>i</sub>*(*t*) the total current input to neuron *i*. The input is given by

$$I_i(t) = \sum_j \sum_{t_i < t_j < t} \frac{Q_{ij}}{\tau_s} e^{-(t - t_j)/\tau_s} + \sum_{t_i < \hat{t}_i < t} \frac{\hat{Q}_i}{\tau_s} e^{-(t - \hat{t}_i)/\tau_s} \tag{2}$$

where *t<sub>j</sub>* are the spike times of neuron *j*, *Q<sub>ij</sub>* is the total charge released at the synapse between neurons *i* and *j*, τ<sub>*s*</sub> is the characteristic time of the synapse, $\hat{t}_i$ are the times of noise events releasing a random charge $\hat{Q}_i$ at some point of the membrane of neuron *i*, and the sums extend over the spikes *t<sub>j</sub>* and noise events $\hat{t}_i$ between the last spike *t<sub>i</sub>* of neuron *i* and the present time *t*. Defining $J_{ij} = Q_{ij}/[C(1 - \tau_s/\tau_m)]$ and $\hat{J}_i = \hat{Q}_i/[C(1 - \tau_s/\tau_m)]$, we therefore have (Gerstner et al., 1993; Gerstner and Kistler, 2002)

$$V_i(t) = \sum_j \sum_{t_i < t_j < t} J_{ij}\, \epsilon(t - t_j) + \sum_{t_i < \hat{t}_i < t} \hat{J}_i\, \epsilon(t - \hat{t}_i) \tag{3}$$

where $\epsilon(t) = e^{-t/\tau_m} - e^{-t/\tau_s}$. When the potential $V_i(t)$ reaches the threshold value $\Theta_i$, neuron *i* emits a spike, and its potential is reset to the base value $V_i = 0$. In the present paper we set the same threshold $\Theta_i \equiv \Theta$ for all the neurons, τ<sub>*m*</sub> = 10 ms and τ<sub>*s*</sub> = 5 ms; we extract the times $\hat{t}_i$ of noise events from a Poissonian distribution with a rate ρ = 1 ms<sup>−1</sup> for each neuron, and extract $\hat{J}_i$ from a Gaussian distribution with zero mean and standard deviation $\sqrt{\frac{\alpha}{\rho} \sum_j J_{ij}^2}$. The constant α, which has the dimension of a rate, sets the "noise level" of the network.
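The step from Equations (1) and (2) to Equation (3) can be checked numerically for a single input. The sketch below integrates Equation (1) with a forward-Euler scheme for one presynaptic spike at time zero and compares the result with the closed-form kernel. The unit charge and capacitance, the time step, and the function names are illustrative assumptions; only the two time constants come from the text.

```python
import numpy as np

TAU_M = 10.0  # membrane time constant in ms (value from the text)
TAU_S = 5.0   # synaptic time constant in ms (value from the text)

def epsilon(t):
    """Response kernel of Equation (3): exp(-t / tau_m) - exp(-t / tau_s)."""
    return np.exp(-t / TAU_M) - np.exp(-t / TAU_S)

def integrate_single_input(charge, capacitance, t_max=60.0, dt=0.001):
    """Forward-Euler integration of Equation (1) for one input spike at t = 0."""
    n = int(round(t_max / dt))
    t = np.arange(n + 1) * dt
    v = np.zeros(n + 1)
    for k in range(n):
        current = charge / TAU_S * np.exp(-t[k] / TAU_S)  # one term of Equation (2)
        v[k + 1] = v[k] + dt * (-v[k] / TAU_M + current / capacitance)
    return t, v

# Arbitrary illustrative units (assumption): Q = 1, C = 1.
charge, capacitance = 1.0, 1.0
t, v_numeric = integrate_single_input(charge, capacitance)

# Closed form: V(t) = J * epsilon(t), with J = Q / [C (1 - tau_s / tau_m)].
j = charge / (capacitance * (1.0 - TAU_S / TAU_M))
v_closed = j * epsilon(t)

max_error = np.max(np.abs(v_numeric - v_closed))
t_peak = t[np.argmax(v_closed)]
print(f"max |numeric - closed form| = {max_error:.2e}, peak at t = {t_peak:.2f} ms")
```

With these time constants the kernel peaks at 10 ln 2 ms, roughly 6.9 ms after the input, and the numerical and closed-form traces agree up to the discretization error of the Euler scheme.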

The synapse strengths *J<sub>ij</sub>* are held fixed during the simulation (no short-term plasticity). They are set at the beginning with a "learning procedure" (Scarpetta et al., 2001, 2002; Scarpetta and Marinaro, 2005; Scarpetta and Giacco, 2012; Scarpetta et al., 2013), inspired by spike-timing-dependent plasticity (STDP) (Markram et al., 1997, 2011). During this initial "learning procedure," we store *P* patterns in the network connections. A pattern μ = 1, ..., *P* is a phase-coded spike train of period *T*<sup>μ</sup>, with one spike per neuron and per cycle, where the activity of neuron *i* is given by

$$x_i^{\mu}(t) = \sum_{n = -\infty}^{\infty} \delta \left[ t - (t_i^{\mu} + nT^{\mu}) \right]. \tag{4}$$

The times $t_i^{\mu}$ are given by $t_i^{\mu} = \frac{\phi_i^{\mu}}{2\pi} T^{\mu}$, where $\phi_i^{\mu}$ are phases chosen randomly in [0, 2π) and then kept fixed, which give the "order of spiking" of neurons within pattern μ. During the initial learning procedure, the network is forced to replay pattern μ, and the connections evolve due to STDP, so that in the interval [−*T*, 0] the change in the connection *J<sub>ij</sub>* is given by

$$\begin{aligned} \delta J_{ij} &= H_i \left( \frac{T^{\mu}}{T} \right) \int_{-T}^{0} dt \int_{-T}^{0} dt' \, x_i(t)\, A\left( t - t' \right) x_j(t') \\ &= H_i \sum_{n=-\infty}^{\infty} A \left( t_j^{\mu} - t_i^{\mu} + nT^{\mu} \right), \end{aligned} \tag{5}$$

where *H<sub>i</sub>* is a constant depending on the postsynaptic neuron *i* that sets the strength of the connections, $T^{\mu}/T$ is a normalization factor, *x<sub>j</sub>*(*t*) is the activity of the presynaptic neuron at time *t*, and *x<sub>i</sub>*(*t*) the activity of the postsynaptic one. In STDP, the learning window *A*(τ) is the measure of the strength of the synaptic change when a time delay τ occurs between pre- and post-synaptic spikes. The window *A*(τ) is the one introduced and motivated by Abarbanel et al. (2002), with the same parameters used in Abarbanel et al. (2002) to fit the experimental data of Bi and Poo (1998), see **Figure 1**. This function satisfies the balance condition $\int_{-\infty}^{\infty} A(\tau)\, d\tau = 0$. Notably, when *A*(τ) is used in Equation (5) to learn phase-coded patterns with uniformly distributed phases, the balance condition ensures that the sum of the connections on a single neuron, $\sum_j J_{ij}$, is of order $1/\sqrt{N}$, and therefore ensures a balance between excitation and inhibition (Scarpetta et al., 2010). Note that, as we are studying a network of excitatory neurons, the negative connections have to be thought of as connections mediated by fast inhibitory interneurons. When multiple phase-coded patterns are stored, the learned connections are simply the sum of the contributions from individual patterns, namely,

$$J_{ij} = \sum_{\mu=1}^{P} \delta J_{ij}^{\mu}. \tag{6}$$
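The construction of the connections from phase-coded patterns can be sketched as follows. This is a toy illustration under loudly stated assumptions: the learning window `toy_window` is a simple balanced double exponential and is NOT the Abarbanel et al. (2002) window used in the paper, whose parameters are not given in the text; the argument of the window is taken literally from the last line of Equation (5); the infinite sum over *n* is truncated to a few periods; all strengths *H<sub>i</sub>* are set to one; and the network size is reduced to keep the example small.

```python
import numpy as np

def toy_window(tau, tau_plus=20.0, tau_minus=20.0):
    """Illustrative balanced window with zero integral (NOT the Abarbanel et al. window)."""
    tau = np.asarray(tau, dtype=float)
    out = np.zeros_like(tau)
    pos, neg = tau > 0, tau < 0
    out[pos] = np.exp(-tau[pos] / tau_plus) / tau_plus
    out[neg] = -np.exp(tau[neg] / tau_minus) / tau_minus
    return out

def learn_connections(phases, period, h, n_images=2):
    """Equations (5)-(6): J_ij = sum_mu H_i sum_n A(t_j - t_i + n T), truncated in n."""
    n_patterns, n_neurons = phases.shape
    j = np.zeros((n_neurons, n_neurons))
    for mu in range(n_patterns):
        t = phases[mu] / (2.0 * np.pi) * period    # spike times of pattern mu
        delay = t[None, :] - t[:, None]            # delay[i, j] = t_j - t_i
        for n in range(-n_images, n_images + 1):   # truncated sum over periods
            j += h[:, None] * toy_window(delay + n * period)
    np.fill_diagonal(j, 0.0)                       # no self-connections
    return j

rng = np.random.default_rng(3)
n_neurons, n_patterns, period = 500, 2, 333.0      # reduced size (assumption)
phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_patterns, n_neurons))
h = np.ones(n_neurons)                             # uniform strengths (assumption)

J = learn_connections(phases, period, h)

# Net incoming weight relative to the total absolute incoming weight, per neuron.
balance = np.abs(J.sum(axis=1)).mean() / np.abs(J).sum(axis=1).mean()
print(f"relative imbalance of incoming connections: {balance:.3f}")
```

Because the toy window has zero integral and the phases are uniform, positive and negative incoming connections nearly cancel for each neuron, which is the balance property invoked in the text.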

Throughout the paper we use a number of neurons *N* = 3000, a period *T*<sup>μ</sup> = 333 ms, and a number of patterns *P* = 2. Moreover, we use two values for the strengths *H<sub>i</sub>* of the connections in Equation (5): a value *H*<sub>0</sub> for "normal" neurons, and a value *H*<sub>1</sub> = 3*H*<sub>0</sub> for "leader neurons," which are chosen for each pattern μ as a fraction of 3% of the neurons that have consecutive phases, for example the lowest phases in the interval [0, 2π). Note that the values of *H*<sub>0</sub> are expressed in units of the threshold of the neurons. The role of these few "leader" neurons, with higher incoming connection strengths, is to collect and amplify activity initiated by noise, and to give rise to a cue able to initiate the short collective replay.

After the learning procedure, we perform a pruning procedure, by which only a fraction of the *N*(*N* − 1) connections *J<sub>ij</sub>* survives. Namely, for each neuron *i*, we take all the incoming connections *J<sub>ij</sub>*, and separately consider the positive (excitatory) and the negative (inhibitory) ones. Among the positive ones, we delete a fixed fraction *f*<sup>+</sup><sub>prune</sub> of those with the lowest values. Then, we delete a fraction *f*<sup>−,*i*</sup><sub>prune</sub>, which can depend on the neuron *i*, of the negative connections with the lowest absolute values, choosing *f*<sup>−,*i*</sup><sub>prune</sub> so that the sum of the incoming connections to neuron *i* at the end is as close as possible to zero. Throughout the paper we use *f*<sup>+</sup><sub>prune</sub> = 70%. As a consequence, at the end of the pruning process, about 12% of the *N*(*N* − 1) connections survive as positive connections, and 27% as negative connections, with statistical fluctuations of order 1/√*N*. After the learning and pruning procedure is applied, the dynamics of the network is studied with the connections *J<sub>ij</sub>* fixed; that is, we apply neither STDP nor short-term depression.
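The pruning rule for a single neuron can be sketched as follows. This is only an illustration under stated assumptions, not the code used for the simulations: the function name `prune_incoming` is hypothetical, deleted connections are represented by a weight of 0.0, and the fraction of negative connections to delete is found by simply trying every possible number of weakest negative connections and keeping the best balance.

```python
def prune_incoming(weights, f_plus=0.7):
    """Return a pruned copy of one neuron's incoming weights (0.0 = deleted).

    Hypothetical helper: `weights` holds the incoming J_ij of a single
    neuron i, and `f_plus` is the fraction of weakest positive weights removed.
    """
    pos = sorted((w, j) for j, w in enumerate(weights) if w > 0)
    neg = sorted((-w, j) for j, w in enumerate(weights) if w < 0)  # sorted by |w|

    # Delete the fraction f_plus of positive connections with the lowest value.
    n_drop = int(round(f_plus * len(pos)))
    kept_pos = pos[n_drop:]
    pos_sum = sum(w for w, _ in kept_pos)

    # Try dropping the k weakest negative connections for every k, and keep
    # the k that brings the total incoming weight closest to zero.
    neg_total = sum(a for a, _ in neg)
    best_k, best_err = 0, abs(pos_sum - neg_total)
    dropped = 0.0
    for k in range(1, len(neg) + 1):
        dropped += neg[k - 1][0]
        err = abs(pos_sum - (neg_total - dropped))
        if err < best_err:
            best_k, best_err = k, err

    pruned = [0.0] * len(weights)
    for w, j in kept_pos:
        pruned[j] = w
    for a, j in neg[best_k:]:
        pruned[j] = -a
    return pruned


example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, -1, -2, -20, -5]
pruned_example = prune_incoming(example)
```

In this toy example the three strongest positive weights survive, and the weakest negative weight is removed because doing so balances the incoming sum exactly.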

## **3. RESULTS**

We studied the dynamics of the network with *N* = 3000 neurons as a function of two parameters: the parameter *H*<sub>0</sub>, setting the strengths of the connections, and the parameter α, setting the noise level. The former is expressed in units of the threshold of the neurons, while the latter has dimensions of ms<sup>−1</sup>. We started with a network with all the potentials *V<sub>i</sub>*(0) = 0, and let the system evolve according to Equation (1). We discarded the first 60 s of the dynamics, to avoid considering the transient, and analyzed the dynamics for a total of 10<sup>7</sup> spikes, or 1200 s, whichever condition was met first<sup>1</sup>. The simulated time was therefore between 180 and 1200 s, depending on the average spiking rate of the neurons. For each value of the pair of parameters *H*<sub>0</sub> and α, we average the results over four realizations of the patterns, that is, of the quenched random phases φ<sup>μ</sup><sub>*i*</sub>.

#### **3.1. SPIKING RATE DISTRIBUTION AND DYNAMICAL REGIMES**

In **Figure 2A** we show the average spiking rate in Hz per neuron, as a function of the noise level α and the connection strengths *H*<sub>0</sub>. While the average rate increases continuously as either of the parameters is increased, the distribution of the rates in a finite interval of time changes qualitatively. We bin the time in 1 ms intervals, evaluate the rate in Hz per neuron for every interval, and compute the distribution of the rates. For low connection strength, or low noise, the distribution is nearly exponential (see **Figure 3A**), with an average rate lower than 2 Hz. For high connection strength and high noise, the distribution is nearly Gaussian (see **Figure 3C**), with an average rate higher than 13 Hz. In an intermediate region the distribution is bimodal (see **Figure 3B**), and shows both peaks, one exponential at low rates and one Gaussian at high rates, with a minimum in between. These three different regimes are shown in **Figure 2B** with different colors. The intermediate bimodal regime resembles the phase coexistence observed in a first-order equilibrium phase transition, even though in our case the transition is a non-equilibrium one.

<sup>1</sup>This is the time appearing in Equation (1), not the CPU time needed to perform the simulations, which ranged between 1 and 5 h.
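The binning of the spikes into a population rate, and the histogram of that rate, can be sketched as below. This is a minimal illustration, assuming the spikes of all neurons are available as a flat list of times in ms; the function names `population_rate` and `rate_histogram`, and the 1 Hz histogram bin width, are hypothetical choices and not taken from the paper.

```python
def population_rate(spike_times_ms, n_neurons, t_total_ms, bin_ms=1.0):
    """Rate in Hz per neuron for every time bin of width bin_ms."""
    n_bins = int(t_total_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        b = int(t // bin_ms)
        if 0 <= b < n_bins:
            counts[b] += 1
    # spikes per bin -> spikes per second per neuron
    return [c * 1000.0 / (n_neurons * bin_ms) for c in counts]


def rate_histogram(rates, width_hz=1.0):
    """Fraction of time bins whose rate falls in each interval of width_hz."""
    hist = {}
    for r in rates:
        k = int(r // width_hz)
        hist[k] = hist.get(k, 0) + 1
    total = len(rates)
    return {k * width_hz: n / total for k, n in sorted(hist.items())}


# Toy data: 2 neurons, 3 ms of activity, three spikes in total.
toy_rates = population_rate([0.2, 0.7, 2.5], n_neurons=2, t_total_ms=3.0)
toy_hist = rate_histogram(toy_rates, width_hz=500.0)
```

With these definitions a bimodal regime would show up as two separated groups of occupied intervals in the dictionary returned by `rate_histogram`.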

The qualitative difference in the distribution of the spiking rates corresponds to a different dynamical behavior. At low rates, when the distribution is nearly exponential with a maximum at zero rate, the dynamical behavior is dominated by noise. The potential of neurons is governed by an Ornstein–Uhlenbeck process, and with some probability crosses the threshold, giving rise to a spike that is not able, however, to generate spreading activity in the network (**Figure 4A**). On the other hand, in the high rate regime, the noise triggers the replay of one of the patterns encoded in the network (**Figure 4C**). In this case, once the replay of the pattern has started, the noise is not able to stop it, so that the replay is permanent<sup>2</sup>. The intermediate regime, corresponding to a bimodal distribution of the rates, is shown in **Figure 4B**. In this case the noise is able to start the replay of a pattern, but also to stop it, so that the activity is intermittent, and resembles the experimentally observed alternation of up and down states. The firing rate is high in the "up" state (during short replays) and low during "down" states; therefore the distribution of the rates is bimodal. The corresponding region, shown with green dots in **Figure 2B**, separates the regime of permanent collective replay of spatiotemporal patterns (red) from the region of quiescence with low activity (yellow). Such a non-equilibrium phase transition has recently been studied in a similar model (Scarpetta and de Candia, 2013, 2014), showing that in the region where the order parameter, which measures the similarity between spontaneous dynamics and the stored dynamic patterns, passes from zero to one, the fluctuations of the order parameter are maximized.

<sup>2</sup>Note that, once the replay of the pattern has started, it is able to sustain itself, provided the synaptic strengths are larger than some threshold (Scarpetta et al., 2010; Scarpetta and Giacco, 2012), so that with a suitable triggering it is possible to observe the continuous replay of the pattern down to α = 0.

**FIGURE 2 | (A)** The average spiking rate in Hz per neuron, as a function of the noise level α and the connection strengths *H*<sub>0</sub>. **(B)** Shape of the firing rate distribution: nearly exponential (yellow), nearly Gaussian (red), or bimodal (green). The letters on the plot mark the points whose rate distribution is shown in **Figure 3**, and whose activity is shown as a raster plot in **Figure 4**. As a function of the noise level α and the connection strengths *H*<sub>0</sub> we identify a region with bimodal rate distribution and alternation of up and down states (green), which separates the regime of low, nearly exponentially distributed firing rates (yellow) from the regime of high, nearly Gaussian collective firing activity (red).

Note that the replay of a pattern appears in the raster plot of **Figure 4** as a sawtooth, when the neurons are sorted on the vertical axis in the order of the pattern that is being replayed. On the other hand, it appears as completely random when the neurons are sorted in another way, for example in the order of a pattern that is not being replayed. The alternation of states seen in the raster plot in the bimodal region (**Figure 4B**) resembles the alternation of up and down states observed to occur spontaneously. Notably, as reviewed in Luczak and MacLean (2012), there is experimental evidence that during the up states neurons often activate in a surprisingly similar sequential order, reproducing default spatiotemporal patterns.

#### **3.2. AVALANCHES SIZE AND TIME DISTRIBUTION**

In the intermediate regime, where the network alternates between up and down states, we observe that inside the periods of high firing rate (up states), at a finer level, the activity is made of a series of cascades or "avalanches," separated by short drops in the rate. Cortical activity cascades that follow precise power laws, i.e., neural avalanches, have been observed experimentally during spontaneous cortical activity *in vitro* and *in vivo* (Plenz, 2012, and references therein).

Experimentally, neural avalanches are defined in terms of local field potentials recorded at electrodes, which average the activity of many neurons. In our model, we have to distinguish between the few spikes generated by noise, which we want to characterize as no activity, and the spikes generated when a collective pattern is replayed, which instead represent activity in the network. Given the separation we have seen in the global spiking rates, with rates lower than 2 Hz corresponding to no activity, and rates larger than 13 Hz representing the collective replay of a pattern, we identify "avalanches" as sequences of consecutive time bins with a rate higher than a threshold *R*<sub>min</sub> = 7 Hz. Successive time bins are concatenated until an empty bin (rate lower than *R*<sub>min</sub>) is reached, at which point the concatenation process stops.
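The thresholding procedure can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `find_avalanches` is hypothetical, the input is assumed to be the number of spikes in each time bin, a bin whose rate is exactly equal to the threshold is treated as empty (the text leaves this case open), and the size of an avalanche is taken as the total number of spikes it contains.

```python
def find_avalanches(counts, n_neurons, bin_ms=1.0, r_min_hz=7.0):
    """Return (start_bin, duration_in_bins, size_in_spikes) per avalanche."""
    avalanches = []
    start, size = None, 0
    for b, c in enumerate(counts):
        rate_hz = c * 1000.0 / (n_neurons * bin_ms)  # Hz per neuron in this bin
        if rate_hz > r_min_hz:
            if start is None:          # a new avalanche begins
                start, size = b, 0
            size += c
        elif start is not None:        # an "empty" bin closes the avalanche
            avalanches.append((start, b - start, size))
            start = None
    if start is not None:              # avalanche still open at the end
        avalanches.append((start, len(counts) - start, size))
    return avalanches


# Toy data: with 1000 neurons and 1 ms bins, the rate in Hz equals the count.
toy_avalanches = find_avalanches([0, 8, 10, 3, 9, 0], n_neurons=1000)
```

Here the bins with 8 and 10 spikes form one avalanche of duration 2 and size 18, and the isolated bin with 9 spikes forms a second avalanche of duration 1.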

We define the size of an avalanche as the total number of spikes, that is, the integral of the rates over the avalanche duration. In **Figure 5** we show the distribution of the sizes (A) and durations (C) of the avalanches for *H*<sub>0</sub> = 0.221, α = 0.06 ms<sup>−1</sup>, which corresponds to point B in **Figure 2B**. Note that the distributions are well described by power laws, with exponent 3/2 for the sizes and 2 for the durations, as experimentally observed (Plenz, 2012). Such behavior is quite robust, and is observed generically in the region of the non-equilibrium transition between the replay and non-replay of spatiotemporal patterns (Scarpetta and de Candia, 2013). It does not depend on the precise value of *R*<sub>min</sub> chosen. Indeed, as shown in **Figure 5B**, the size distribution follows approximately the same power law in a range of *R*<sub>min</sub> from *R*<sub>min</sub> = 5 Hz to *R*<sub>min</sub> = 9 Hz.

distribution shown in **Figure 3A**. Alternation of states of quiescence with states of higher activity is shown in **(B)**, corresponding to **Figure 3B**. During the states of higher activity a collective coherent replay of one of the two stored patterns emerges. In this regime the noise is able to initiate a short collective replay of a pattern, and also to stop it. In panel **(B)** we can see that both patterns are initiated intermittently: a short replay of pattern 2 is followed by a quiescence period and then by a short replay of pattern 1. The raster plot in panel **(C)** shows a regime with stable attractors, with permanent replay of pattern 2.

#### **3.3. WAITING TIMES BETWEEN AVALANCHES AND UP AND DOWN STATES**

We have computed the distribution *P*(Δ*t*) of the waiting times Δ*t* between successive avalanches. In **Figure 6A** we show the distribution for synaptic strengths *H*<sub>0</sub> = 0.214, 0.221, and 0.228, and noise α = 0.06 ms<sup>−1</sup>, in the region where the rate distribution is bimodal. The middle (red) curve at *H*<sub>0</sub> = 0.221 corresponds to point B in **Figure 2B**. The distribution presents a regime between 10 and 50 ms characterized by a power law with exponent −3, preceded by a regime with a lower slope. For times larger than 50 ms, the distribution shows a broad plateau, which is longer the lower the noise or the strength of the connections.

A power law regime in the waiting times between avalanches has been observed also experimentally, for example in freely behaving rats (Ribeiro et al., 2010), or in cortical slices (Lombardi et al., 2012, in preparation). The second regime, corresponding to large waiting times, is also observed in Lombardi et al. (2012, in preparation).

The power law regime corresponds to waiting times between successive avalanches within the same up state. The power law in the distribution indicates temporal correlation, i.e., that consecutive avalanches belonging to the same up state are correlated. Indeed in our model the up state is the result of the system falling in one of the many metastable spatiotemporal pattern attractors, corresponding to a collective replay activity. The large bump at long times in the waiting time distribution is related on the other hand to down states, that is intervals in which the network does not replay any of the encoded patterns.

We have also plotted the data of **Figure 6A** in an alternative way. While *P*(Δ*t*)δ*t* is the probability of observing a waiting time between Δ*t* and Δ*t* + δ*t*, in **Figure 6B** we plot *P̄*(Δ*t*), where *P̄*(Δ*t*)δλ is the probability of observing a waiting time between Δ*t* and Δ*t*(1 + δλ). Note that *P̄*(Δ*t*) = *P*(Δ*t*)Δ*t*. With this alternative definition, the distribution becomes non-monotonic, with a pronounced maximum at high values of Δ*t*, and the exponent of the initial power law becomes −2.
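The relation between the two densities, and the change of the exponent from −3 to −2, follow from a change of variable. A short derivation, assuming a pure power law in the initial regime and writing Δ*t* for the waiting time: the interval from Δ*t* to Δ*t*(1 + δλ) has width δ*t* = Δ*t* δλ, so that

$$\bar{P}(\Delta t)\,\delta\lambda = P(\Delta t)\,\delta t = P(\Delta t)\,\Delta t\,\delta\lambda \;\;\Rightarrow\;\; \bar{P}(\Delta t) = P(\Delta t)\,\Delta t, \qquad P(\Delta t) \propto \Delta t^{-3} \;\;\Rightarrow\;\; \bar{P}(\Delta t) \propto \Delta t^{-2}.$$

The extra factor Δ*t* also explains why a plateau in *P* at long times turns into a rising branch, and hence a maximum, in *P̄*.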

In **Figure 7** we show the distribution *P*(Δ*t*) in the region of high activity, marked in red in **Figure 2B**. In this case the replay of the pattern becomes continuous, and therefore the plateau at long times in the distribution of waiting times disappears. The distribution therefore shows only the power law regime, as shown in **Figure 7**, corresponding to the point in phase space marked with letter C in **Figure 2B**. On the other hand, in the region of low activity, marked in yellow in **Figure 2B**, avalanches become very sparse, so that the distribution of waiting times is different from zero only for very long times, and in this case the initial power law regime disappears.

**FIGURE 7 | The distribution of waiting times *P*(Δ*t*) in the region of high activity, for *H*<sub>0</sub> = 0.250, α = 0.08 ms<sup>−1</sup> (point C in Figure 2B).** In this case the plateau at high values of the waiting time, corresponding to down states, disappears, because the replay of the patterns becomes continuous, and only the power law, corresponding to the concatenation of correlated avalanches inside an up state, is observed.

where the rate distribution is bimodal. The middle curve (red) for

As in Lombardi et al. (2012), we define the up states as periods of high activity characterized by the concatenation of consecutive avalanches with waiting times lower than *T*<sub>max</sub> = 50 ms, the maximum time falling inside the power law regime of waiting times. Successive avalanches are concatenated until a waiting time larger than *T*<sub>max</sub> is reached, at which point the concatenation process stops. Similarly, down states are defined as a concatenation of waiting times larger than *T*<sub>max</sub>. An isolated avalanche preceded and followed by a waiting time larger than *T*<sub>max</sub> does not stop the down state.
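The segmentation into up and down states can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `up_down_states` is hypothetical, avalanches are given as time-sorted `(start_ms, end_ms)` pairs, a waiting time exactly equal to the threshold is treated as long, and only down states bounded by an up state on both sides are returned, since the duration of the others is not determined by the data.

```python
def up_down_states(avalanches, t_max_ms=50.0):
    """Return (ups, downs) as lists of (start_ms, end_ms) intervals."""
    # Group avalanches separated by waiting times shorter than t_max_ms.
    groups = []
    for start, end in avalanches:
        if groups and start - groups[-1][-1][1] < t_max_ms:
            groups[-1].append((start, end))
        else:
            groups.append([(start, end)])

    # A group of two or more avalanches is an up state. An isolated avalanche
    # (a group of one) does not interrupt the surrounding down state.
    ups = [(g[0][0], g[-1][1]) for g in groups if len(g) > 1]

    # A down state spans the time between two consecutive up states.
    downs = [(a[1], b[0]) for a, b in zip(ups, ups[1:])]
    return ups, downs


toy = [(0, 10), (20, 30), (100, 105), (200, 210), (230, 240), (250, 260)]
toy_ups, toy_downs = up_down_states(toy)
```

In the toy data the avalanche at 100 ms is isolated, so it is absorbed into the single down state that runs from 30 ms to 200 ms.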

In **Figure 8** we show the distribution of the durations of down and up states, for the same parameters as in **Figure 6**. The distribution of durations of down states (**Figure 8A**) is well fitted by an exponential (continuous lines in the figure), showing that the transition from down to up states is controlled by a Poissonian probability, due to the noise focusing that triggers a replay of one of the patterns encoded in the network. On the other hand, the distribution of durations of up states cannot be fitted by an exponential as well as that of down states, except for a narrow interval at large times. Indeed, one observes an excess of durations around 100 ms, which corresponds in the model to the duration of one period of the replayed pattern. Moreover, when one approaches the region of parameter space where the replay of the patterns becomes continuous, the distributions show significant deviations from the exponential also for large times, and could be better fitted by a stretched exponential. This is apparent going from the blue to the red and green curves in **Figure 8B**, which are all in the bimodal region, but get closer and closer to the region of self-sustained replay. Note that, when one goes deep inside the region of self-sustained replay, no down states are observed in practice, and up states last for a time of the order of the experimental time.

#### **3.4. BEHAVIOR IN PRESENCE OF SHORT TERM DEPRESSION**

It has been conjectured that the alternation between up and down states depends crucially on short-term synaptic depression (STSD). As we have shown, in our model this instability between up and down states is present even in the absence of short-term depression, and is due instead to the particular structure of the connections, which are far from being random. Such structure determines in the network a large transition region of phase space, where there is a coexistence of both the dynamical attractor states, corresponding to the replay of the encoded patterns, and the attractor corresponding to quiescence of the network.

distributions. Durations of down states are well fitted by an exponential, while durations of up states show some deviations both at short and at large times.

in presence of STSD for *H*<sub>0</sub> = 0.221 and 0.231. **(B)** Raster plot

case of **Figure 4B**.

However, as short term depression is present in real synapses in the brain, we show here that it does not invalidate the behavior displayed by the model considered here, but changes only the parameters, such as the strength of connections, where the transition region appears. We have added STSD in the model, implementing a dynamics on the connections *Jij* according to the equation

$$\frac{dJ_{ij}}{dt} = \frac{1}{\tau_r} (J_{ij}^0 - J_{ij}),$$

where *J*<sup>0</sup><sub>*ij*</sub> are the connections given by Equation (6), and τ<sub>*r*</sub> = 10 ms is the recovery time of synapses. Moreover, we depress *J<sub>ij</sub>* by a factor *f*<sub>stsd</sub> = 0.5 every time the presynaptic neuron *j* fires a spike.
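The combined effect of recovery and depression on a single synapse can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `stsd_trace` is hypothetical, "depress by a factor" is assumed to mean multiplication by *f*<sub>stsd</sub>, and between spikes the recovery equation is integrated exactly, which gives an exponential relaxation toward the resting value with time constant τ<sub>*r*</sub>.

```python
import math


def stsd_trace(j0, spike_times_ms, tau_r_ms=10.0, f_stsd=0.5):
    """Synaptic strength just after each presynaptic spike.

    j0 is the resting strength of the connection; spike_times_ms are the
    sorted spike times of the presynaptic neuron.
    """
    j, t_prev, trace = j0, None, []
    for t in spike_times_ms:
        if t_prev is not None:
            # Exact solution of dJ/dt = (j0 - J) / tau_r over the interval.
            j = j0 + (j - j0) * math.exp(-(t - t_prev) / tau_r_ms)
        j *= f_stsd  # depression at the presynaptic spike (assumed multiplicative)
        trace.append(j)
        t_prev = t
    return trace


# Two presynaptic spikes separated by one recovery time.
toy_trace = stsd_trace(1.0, [0.0, 10.0])
```

Because the relaxation is toward the resting value, the same update applies unchanged to negative (inhibitory) connections.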

In **Figure 9A**, we show the distribution of the rates, evaluated as in **Figure 3**, in the absence of STSD for *H*<sub>0</sub> = 0.221 (point B in **Figure 2**), and in the presence of STSD for *H*<sub>0</sub> = 0.221 and 0.231. Note that, for the same synaptic strength at rest, the effect of STSD is to lower the fraction of time in which the network is in the up state. However, for a slightly higher *H*<sub>0</sub>, the distribution is very similar to the one without STSD. In **Figure 9B**, we show the raster plot in the case of *H*<sub>0</sub> = 0.231 and τ<sub>*r*</sub> = 10 ms, showing a behavior very similar to the one displayed in **Figure 4B**, with the alternation of up and down states, and of different patterns replayed in the up states. Therefore, the effect of STSD is to slightly shift the region of the phase space in which the transition is observed.

## **4. DISCUSSION**

Our model is the first, to our knowledge, that describes neural avalanches, recurrences of spatiotemporal patterns, and alternation of up and down states in a single minimal model.

Differently from our previous work (Scarpetta and de Candia, 2013), here we study a sparse connectivity, which is the result of a competitive pruning process applied after the learning procedure. Moreover, while we previously introduced heterogeneity in the network topology using neurons with different spiking thresholds, here we show that avalanches may be initiated by the interplay between miniatures noise and the heterogeneity in the strengths of connections, in agreement with recent experimental results (Orlandi et al., 2013).

The model shows a region of low activity, with Poissonian spiking rate, and a region of high activity, characterized by the continuous replay of one of the multiple attractors stored in the network connections, depending on the value of synaptic strength and noise intensity. In the region of phase space separating these two regimes, one observes an alternation of periods of quiescence (down states) and periods of highly correlated activity (up states), corresponding to an intermittent replay of the patterns. At a finer temporal scale, up states are made of a sequence of avalanches, showing power law distributions of sizes and durations. In this model the alternation of up and down states does not depend on a kind of neuron bistability, nor on synaptic depression, but is rather a network effect, the result of a structured connectivity that produces multiple dynamical attractors, and of the fact that at the non-equilibrium phase transition the network dynamics fluctuates between different metastable basins of attraction.

Therefore, such complex dynamics appears at a dynamical transition between disordered Poissonian activity, and an ordered permanent dynamical state. In such region, the network is able to respond to external inputs in a flexible way, switching effectively between different modes of operation, corresponding to the different basins of attraction, that may be connected to functionally relevant behavior.

## **REFERENCES**

Abarbanel, H., Huerta, R., and Rabinovich, M. I. (2002). Dynamical model of long-term synaptic plasticity. *PNAS* 99, 10132–10137. doi: 10.1073/pnas.132651299


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 March 2014; accepted: 28 April 2014; published online: 19 May 2014. Citation: Scarpetta S and de Candia A (2014) Alternation of up and down states at a dynamical phase-transition of a neural network with spatiotemporal attractors. Front. Syst. Neurosci. 8:88. doi: 10.3389/fnsys.2014.00088*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Scarpetta and de Candia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Power law scaling in synchronization of brain signals depends on cognitive load

## **Jesse Tinker<sup>1</sup> and Jose Luis Perez Velazquez<sup>1,2</sup>\***

<sup>1</sup> Neuroscience and Mental Health Programme, Brain and Behaviour Centre, Division of Neurology, The Hospital for Sick Children, Toronto, ON, Canada

<sup>2</sup> Institute of Medical Science and Department of Paediatrics, University of Toronto, Toronto, ON, Canada

#### **Edited by:**

Paolo Massobrio, University of Genova, Italy

#### **Reviewed by:**

Yoshio Sakurai, Kyoto University, Japan; Thierry Ralph Nieus, Istituto Italiano di Tecnologia, Italy

## **\*Correspondence:**

Jose Luis Perez Velazquez, Institute of Medical Science and Department of Paediatrics, University of Toronto and Neuroscience and Mental Health Programme, Brain and Behaviour Centre, Division of Neurology, The Hospital for Sick Children, 555 University Avenue, Toronto M5G1X8, ON, Canada e-mail: jlpv@sickkids.ca

As it has several features that optimize information processing, it has been proposed that criticality governs the dynamics of nervous system activity. Indications of such dynamics have been reported for a variety of in vitro and in vivo recordings, ranging from in vitro slice electrophysiology to human functional magnetic resonance imaging. However, there still remains considerable debate as to whether the brain actually operates close to criticality or in another governing state such as stochastic or oscillatory dynamics. A tool used to investigate the criticality of nervous system data is the inspection of power-law distributions. Although the findings are controversial, such power-law scaling has been found in different types of recordings. Here, we studied whether there is a power law scaling in the distribution of the phase synchronization derived from magnetoencephalographic recordings during executive function tasks performed by children with and without autism. Characterizing the brain dynamics that is different between autistic and non-autistic individuals is important in order to find differences that could either aid diagnosis or provide insights as to possible therapeutic interventions in autism. We report in this study that power law scaling in the distributions of a phase synchrony index is not very common and its frequency of occurrence is similar in the control and the autism group. In addition, power law scaling tends to diminish with increased cognitive load (difficulty or engagement in the task). There were indications of changes in the probability distribution functions for the phase synchrony that were associated with a transition from power law scaling to lack of power law (or vice versa), which suggests the presence of phenomenological bifurcations in brain dynamics associated with cognitive load. Hence, brain dynamics may fluctuate between criticality and other regimes depending upon context and behaviors.

**Keywords: autism, synchrony, power law, criticality, bifurcations, magnetoencephalography**

## **INTRODUCTION**

Much is being discussed today about the possible critical dynamics of brain activity and its close relatives complexity and emergence. The appealing characteristics of criticality (for comprehensive introductions to the field, see Christensen and Moloney, 2005; Sornette, 2004), derived from early theoretical and computational work indicating the optimization of information processing and adaptability in general at the "edge of chaos" (Packard, 1988; Langton, 1990), fostered a tremendous interest in the application of these concepts to nervous system function (concisely reviewed in Beggs, 2007; Chialvo, 2010; Shew and Plenz, 2013), for, after all, whereas brain cells (glia and neurons) perform individually relatively simple computations, in their collective activity in the brain, cell networks achieve complex operations leading to adaptive behaviors. Critical dynamics generally show scale-invariant organization (similar fluctuations occurring at all spatio-temporal scales) which can be described by scale-invariant metrics. Of these metrics, power laws in the distribution of characteristics of the system (for instance the size of events or inter-event intervals) have been considered as typical signatures of criticality. Encouraged by early experimental observations reporting the celebrated 1/f power spectrum scaling (Pritchard, 1992; Georgelin et al., 1999), neuroscientists launched an intense investigation to study the presence of power laws in experimental recordings of all types, from *in vitro* systems to *in vivo* recordings. However, during these investigations, no consensus on the interpretation of power law scaling has emerged and many misunderstandings are currently apparent (Beggs and Timme, 2012). Most notably, while the presence of power laws is commonly thought to be associated with complexity, this association has only been formally demonstrated to occur in equilibrium statistical mechanics in systems near bifurcations. 
In addition, there are many means by which a system may display power laws (Mitzenmacher, 2004; Clauset et al., 2009; Marković and Gros, 2014) and some have little to do with complex dynamics.

Keeping these considerations in mind, we have assessed the possible presence of power-law scaling in a phase synchronization index of magnetoencephalographic brain recordings in children with and without autism during performance of two executive function tasks. Characterizing the difference in brain dynamics between autistic and non-autistic individuals is motivated by the potential to find differences that could either aid diagnosis or provide insights as to possible therapeutic interventions in autism. Autism and related disorders (autism spectrum disorders, ASD) are accompanied by different styles of brain information processing, reflected in some particular behavioral features of individuals with ASD. The Austrian psychiatrist Kanner (1943) described autism as ". . .the inability to experience wholes without full attention to the constituent parts" (even though it seems that the term autism was coined in 1911 by the Swiss psychiatrist Eugen Bleuler, who used it to describe the "withdrawal into one's inner world"). Ideas proposed to explain the behavioral traits in ASD have mostly been on the psychological level of description, such as the weak central coherence hypothesis (Frith, 1989). With the advent of new analytical methods to scrutinize brain dynamics, especially the analysis of synchrony and "connectivity", these ideas have been "translated" into neurophysiological notions such as disconnection amongst brain circuits. Yet, the debate still continues regarding the possible hyper- or hypo-connectivity in autistic brains.<sup>1</sup> What seems conceivable is that the brain coordination dynamics differs in ASD brains from others, for it is the coordinated activity of transiently formed cell assemblies that underlies cognition (von der Malsburg, 1981; Flohr, 1995; Bressler and Kelso, 2001; Kelso, 2008; Pérez Velázquez and Frantseva, 2011). Thus, studies aimed at assessing brain coordinated activity could be of relevance in the field.

Our study uses magnetoencephalographic (MEG) recordings done in two groups, children with and without ASD, performing two different executive function tasks. In our analysis, we calculated a synchronization index and studied whether the index's empirical density function (edf) displayed power law scaling. Specifically, we looked for different expressions of power law scaling between the two groups of children and the two executive tasks. We found that power law scaling was not common and its frequency of occurrence was decreased when the cognitive load of the test was high. This difference between tasks was seen in both groups of children but little inter-group variation was observed. We discuss implications of these findings in the Discussion section.

## **MATERIALS AND METHODS**

## **PARTICIPANTS**

Data were drawn from a larger sample of children enrolled in previous studies (Pérez Velázquez et al., 2009; Teitelbaum et al., 2012). Sixteen control children (7 females) and 15 children (1 female) diagnosed with high functioning autism (Asperger syndrome) participated in the study. The children's parents provided informed consent for the protocol approved by the Hospital for Sick Children Review Ethics Board. Age range was between 7 and 16 years. Patients met the criteria for ASD based on DSM-IV and were evaluated by the psychologists in the Autism Research Unit of the Hospital for Sick Children or were recruited from the Geneva Center for Autism and Autism Ontario. Age-matched control children had no known neurological disorders. Cognitive abilities were measured using the Wechsler Abbreviated Scale of Intelligence (WASI), as reported previously (Pérez Velázquez et al., 2009). The data analyzed in this study corresponded to 14 children (7 in each group) for the Stroop task and 25 children (12 in the ASD group) in the auditory attention task.

## **MAGNETOENCEPHALOGRAPHIC RECORDINGS**

MEG recordings were acquired at 625 Hz sampling rate, using a CTF Omega 151 channel whole head system (CTF Systems Inc., Port Coquitlam, Canada), as previously described (Pérez Velázquez et al., 2009). Head movement was tracked by measuring the position of three head coils every 30 ms, located at the nasion, left and right ear, and movements less than 5 mm were considered acceptable. Sensors used in the analyses are depicted in **Figure 1**, and were located over the following cortical areas: left and right frontal (LF, RF), left and right parietal (LP, RP), and left and right temporal (LT, RT). We chose these cortical areas as they are associated with executive functions and relatively mutually distant in space.

## **EXECUTIVE FUNCTION TASKS**

## **Stroop color-word test**

The color Stroop interference paradigm is a commonly used test of inhibition (Stroop, 1935), in which the participant names the colors of the ink in which words are written. It consists of a list of color words written in congruent color (e.g., the word "green" written in green color), followed by a list in incongruent color (e.g., the word "green" written in red color). It is well established that processing the content of the word is more automatic than processing the color of the word. Therefore, in the incongruent condition, the individual needs to inhibit the response of word naming that competes with the response of color naming. In our experimental MEG set-up, words were presented to participants via a video projector, and the children's responses were monitored on-line to check for errors. Besides the congruent (termed "Congruent ink" in this study) and incongruent ("Incongruent ink") conditions, we also conducted a baseline condition ("Black-ink") in which participants named the color words written in black ink, where interference effects were expected to be much lower or absent. Ninety-four words were presented for each condition (Black-ink, Congruent-ink, and Incongruent-ink); the time interval between words was 2.5 s.

## **Auditory attention task**

The auditory task included two conditions with varying attentional requirements. In the simple reaction-to-stimulus condition, that is, a "low attention" condition (which we term "No attend" in this work), the participants heard repeated identical auditory tones and were instructed to press the response button to every tone. In the auditory oddball condition ("Attend" condition), which required attention to a deviant tone amongst otherwise common tones, participants pressed the response button only after hearing a deviant tone. In this way, the "low-attention" condition mainly reflected sensory registration of auditory stimuli, and the oddball condition reflected decision-making based on an auditory distinction. The baseline recording for this task was a period of 30–60 s when individuals were asked to remain quiet and did not receive any auditory input. Tones were presented binaurally with a 750 ms inter-stimulus interval. MEG recording time was 5 min for the low-attention and oddball conditions. There were 400 identical stimuli presented in the low-attention condition and, equally, 400 stimuli presented in the oddball condition, of which 30% were deviant tones.

<sup>1</sup> See, for instance, http://sfari.org/news-and-opinion/news/2013/autismbrains-are-overly-connected-studies-find for recent perspectives on the topic.

## **PHASE SYNCHRONIZATION ANALYSIS**

Visual inspection of the MEG recordings for artifacts was done during the acquisition and off-line before the analysis to remove sensors with artifacts or repeat the acquisition. Recordings were initially band-passed using a FIRLCS filter with a band-pass of ±2 Hz around a "*central frequency*". The band-pass filtering done before the extraction of the oscillation phase removes eye blink artifacts (which tend to appear in frontal sensors) because these last around 300–400 ms, which is ∼2.5–3.3 Hz in terms of frequencies. Since in our study the lowest frequency studied is 10 ± 2 Hz, we can consider that eye blinks are not affecting our results. In this study, we used four central frequencies, 10, 18, 26 and 32 Hz, thus covering the range 8–34 Hz. The reason to choose these frequency bands is that they cover most of the ranges from α to lower γ that have been attributed to cognitive task performances. In addition, due to some limitations with the extraction of the phase using the Hilbert transform, especially the advice to have about 20 points per characteristic period of the oscillation (see page 367 in Pikovsky et al., 2001), phase synchrony was not assessed past 34 Hz.
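The arithmetic behind the two frequency arguments in this paragraph can be checked with a short sketch. It uses only values stated in the text (625 Hz sampling rate, central frequencies of 10, 18, 26 and 32 Hz, a ±2 Hz band, and eye blinks lasting 300–400 ms); the function and variable names are illustrative and are not part of the original analysis.

```python
# Values taken from the text; names are illustrative only.
SAMPLING_RATE_HZ = 625.0
CENTRAL_FREQUENCIES_HZ = [10.0, 18.0, 26.0, 32.0]
HALF_BANDWIDTH_HZ = 2.0


def samples_per_period(frequency_hz, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Number of recorded samples covering one oscillation period."""
    return sampling_rate_hz / frequency_hz


def blink_frequency_range_hz(min_duration_s=0.3, max_duration_s=0.4):
    """Frequency range (low, high) equivalent to an event lasting 300-400 ms."""
    return 1.0 / max_duration_s, 1.0 / min_duration_s


blink_low, blink_high = blink_frequency_range_hz()
for f in CENTRAL_FREQUENCIES_HZ:
    low_edge = f - HALF_BANDWIDTH_HZ
    high_edge = f + HALF_BANDWIDTH_HZ
    print(
        f"{f:.0f} Hz band [{low_edge:.0f}, {high_edge:.0f}] Hz: "
        f"{samples_per_period(f):.1f} samples per period, "
        f"above blink range: {low_edge > blink_high}"
    )
```

At 32 Hz the recording provides roughly 19.5 samples per period, which is close to the recommended value of about 20 and is consistent with not assessing synchrony at higher frequencies.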

On these band-passed signals, the Hilbert transform was applied and successive values of instantaneous phases were derived from the corresponding analytic signal. These phase series were then analyzed using sliding windows, extracting the Mean Phase Coherence Statistic between two MEG recording channels (sensors) as described in Mormann et al. (2000) and as previously applied (Garcia Dominguez et al., 2005, 2007). With noisy data, phase synchronization has to be defined in a statistical sense: two signals are phase synchronized if the difference between their phases is bounded over a selected time window, that is, if it clusters around a single value (Pikovsky et al., 2001). A measure of this is the circular variance (CV) of the phase differences ∆θ(*t*), or alternatively, the coefficient *R* = 1 − CV, which can also be expressed as:

$$R_{jk} = \left| \left\langle \exp\left(i\,\Delta\theta_{jk}(t)\right) \right\rangle \right|$$

Here |·| denotes the norm and < · > the mean value. ∆θ*jk*(*t*) = θ*j*(*t*) − θ*k*(*t*) are the series of phase differences between the analytic signals of the series indexed by *j* and *k* (each index refers to one signal, that is, one MEG sensor time series) over a given time window. The value of *R* varies from 0 to 1: the higher the value, the tighter the clustering of the phase differences ∆θ about a single mean value; that is, the closer the *R*-value is to 1, the more synchronized the signals.
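The definition of *R* can be illustrated with a minimal sketch. It assumes the instantaneous phases (in radians) have already been extracted, so the band-pass filtering and Hilbert transform steps are not shown; the function names, window length and step are hypothetical choices made for the example, not those of the original analysis.

```python
import cmath
import math


def mean_phase_coherence(phases_j, phases_k):
    """R_jk = |<exp(i * (theta_j - theta_k))>| over one time window."""
    if len(phases_j) != len(phases_k) or not phases_j:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_j, phases_k))
    return abs(total / len(phases_j))


def sliding_coherence(phases_j, phases_k, window, step):
    """R computed in successive windows of `window` samples, moved by `step`."""
    starts = range(0, len(phases_j) - window + 1, step)
    return [
        mean_phase_coherence(phases_j[s:s + window], phases_k[s:s + window])
        for s in starts
    ]


# A constant phase difference clusters perfectly, so R is (numerically) 1.
locked_j = [0.3 * t for t in range(10)]
locked_k = [0.3 * t - 1.0 for t in range(10)]
print(mean_phase_coherence(locked_j, locked_k))

# Phase differences spread evenly around the circle give R close to 0.
spread_j = [2.0 * math.pi * n / 8 for n in range(8)]
spread_k = [0.0] * 8
print(mean_phase_coherence(spread_j, spread_k))
```

The two printed cases correspond to the two ends of the 0 to 1 range described above.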

To estimate the mean synchrony index in the Stroop task, as described in detail in Pérez Velázquez et al. (2009), averages of the values of the synchronization index *R* were computed from stimulus presentation to the moment near the individual's response, about 0.45–0.6 s after stimulus presentation in the Stroop task. The precise time to calculate the average varied slightly from individual to individual because the time to answer was variable and the average of the synchrony index was taken from the time of stimulus presentation to just before the subject's response. For this purpose, the minimum time for each response of the individual rather than the mean of each subject's distribution of reaction times was taken. All 282 trials (94 words × 3 conditions) were used for the analysis. The "baseline" was the initial list of words written in black ink. The results derived from the estimations of the magnitude of synchrony in this task were reported in Pérez Velázquez et al. (2009). In the present study, those synchronization indices estimated in the previous study were used to construct the edf to be analyzed as described below.

For the auditory attention task data, synchrony between two cortical sensor groups (those mentioned above and shown in **Figure 1**) was computed, as in the Stroop task, using the average of all sensor combinations between the regions. For example, we selected 6 left parietal sensors and 6 right temporal sensors and formed 36 inter-group pairs. For each task condition, the synchrony values between these 36 sensor pairs were averaged to define the average synchrony index between the two sensor groups at each time point. Unlike in the Stroop task, in this case the synchrony index was not calculated phase-locked to the stimulus presentation, but rather was calculated in a sliding window of 1 s for the whole 5 min recording (this is reasonable as attention, in this task, is supposed to be continuous and not intermittent). This derived average was then compared between subject groups and between conditions, and was used to obtain the empirical distribution functions.
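The averaging over inter-group sensor pairs described above can be sketched as follows. The sensor labels (`LP1`, `RT1`, and so on), the synthetic phase series and the function names are hypothetical and serve only to show how 6 sensors in one group and 6 in another yield 36 pairs whose *R*-values are averaged within a window.

```python
import cmath
from itertools import product


def mean_phase_coherence(phases_j, phases_k):
    """R_jk = |<exp(i * (theta_j - theta_k))>| over one time window."""
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_j, phases_k))
    return abs(total / len(phases_j))


def group_synchrony(phases_by_sensor, group_a, group_b):
    """Average R over all inter-group sensor pairs; also return the pair count."""
    pairs = list(product(group_a, group_b))
    values = [
        mean_phase_coherence(phases_by_sensor[a], phases_by_sensor[b])
        for a, b in pairs
    ]
    return sum(values) / len(values), len(pairs)


# Hypothetical sensor labels and synthetic phases (one window of 50 samples).
left_parietal = [f"LP{i}" for i in range(1, 7)]
right_temporal = [f"RT{i}" for i in range(1, 7)]
phases = {
    name: [0.1 * t + 0.2 * index for t in range(50)]
    for index, name in enumerate(left_parietal + right_temporal)
}

average_r, n_pairs = group_synchrony(phases, left_parietal, right_temporal)
print(n_pairs, average_r)
```

In this synthetic case every pair has a constant phase offset, so the group average is 1; real recordings would give a spread of values between 0 and 1.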

## **TESTING POWER LAW DISTRIBUTION**

We used the method described in Janczura and Weron (2012), which is based on the asymptotic properties of the edf. Details and validation of the procedure can be found in that article. We used the edf (the sample estimate of the cumulative distribution function) of the phase synchronization index rather than probability densities (pdf) because constructing a pdf requires binning the data points, which introduces bias, whereas constructing an edf does not; in general, tests on the edf are also more powerful than those done on the pdf (Newman, 2005). In particular, it has been documented that the cdf is more accurate for fitting power laws (Dehgani et al., 2012). To construct the edf, the values of the *R* index were not further averaged in the Stroop task because the values represented the average from stimulus presentation to the moment near the individual's response in the time periods mentioned in the previous section on phase synchrony analysis. In the auditory task, the *R*-values, computed in a sliding window as mentioned above, were averaged in 1-s windows to reduce the number of data points (otherwise we would have 625 points per second, as we used a 625 Hz acquisition rate, which would result in a very large number of data points for the 5 min MEG time series and for all sensor combinations) and to make the data more comparable to the Stroop task data. In total, for each individual and each task, the number of data points (that is, the *R*-values corresponding to the sensor combinations) used to derive the edf was 864 in the Stroop task, 4437 in the attention task, and 387 in the "baseline" for the attention task (because here the recordings were of shorter duration).
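To make the distinction between the edf and a binned pdf concrete, the following minimal sketch builds an edf directly from a sample, with no binning, and derives the 1-edf values that are plotted on logarithmic axes. The function names and the small sample of *R*-values are hypothetical.

```python
from bisect import bisect_right


def make_edf(values):
    """Return a function x -> fraction of sample values less than or equal to x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def edf(x):
        return bisect_right(ordered, x) / n

    return edf


def survival_points(values):
    """(x, 1 - edf(x)) at each distinct sample value.

    The largest value is dropped because 1 - edf is zero there and
    cannot be drawn on a logarithmic axis.
    """
    edf = make_edf(values)
    return [(x, 1.0 - edf(x)) for x in sorted(set(values)) if edf(x) < 1.0]


# Hypothetical R-values; a repeated value is handled without any binning choice.
sample = [0.2, 0.4, 0.4, 0.9]
edf = make_edf(sample)
print(edf(0.4))
print(survival_points(sample))
```

Because every data point contributes a step to the edf, no bin width has to be chosen, which is the property the text refers to.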

In brief, Janczura and Weron's MATLAB algorithm (CI\_powertail.m) estimates confidence intervals at a specified significance level (set at 0.05 in this study) for a power law fitted to a certain range of the edf. The logarithmic plots (**Figures 2** and **3**) represent 1-edf on the *y*-axis and the data on the *x*-axis. Unless otherwise stated in the text, the tail power law was fitted to the largest 5–1% of values for the "attend" and "no attend" conditions of the auditory task, and to the largest 25–2.5% of values in the baseline condition of the auditory task and in the Stroop task. The ranges had to differ because of the different number of data points as detailed above. When the power law was fitted to central regions of the edf, the range was 70–25%.
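The idea of fitting a power law to a selected range of 1-edf can be illustrated with a deliberately simplified sketch. This is not the Janczura and Weron procedure and does not compute confidence intervals: it only selects the values ranked between two upper percentages (for example the largest 25–2.5%) and estimates the exponent as the least-squares slope of log(1-edf) against log(*x*). The function name, the synthetic Pareto sample and its exponent of 2.5 are assumptions made for the example; real *R*-values are bounded between 0 and 1 and would behave differently.

```python
import math
import random


def tail_slope(values, upper_fraction, lower_fraction):
    """Estimate a power-law exponent on part of the upper tail.

    Uses the values ranked between the largest `upper_fraction` and the
    largest `lower_fraction` of the sample (e.g. 0.25 and 0.025), and returns
    minus the least-squares slope of log(1 - edf) versus log(x).
    Assumes positive values without ties.
    """
    if not 0.0 <= lower_fraction < upper_fraction <= 1.0:
        raise ValueError("need 0 <= lower_fraction < upper_fraction <= 1")
    ordered = sorted(values)
    n = len(ordered)
    start = int(round(n * (1.0 - upper_fraction)))
    stop = int(round(n * (1.0 - lower_fraction)))
    points = []
    for i in range(start, stop):
        survival = (n - (i + 1)) / n  # fraction of values above ordered[i]
        if ordered[i] > 0 and survival > 0:
            points.append((math.log(ordered[i]), math.log(survival)))
    if len(points) < 2:
        raise ValueError("not enough points in the selected range")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx


# Synthetic data with a known tail exponent, for illustration only.
random.seed(0)
synthetic = [random.paretovariate(2.5) for _ in range(20000)]
estimate = tail_slope(synthetic, 0.25, 0.025)
print(round(estimate, 2))
```

With far fewer data points, as in the baseline recordings, a narrow range such as 5–1% would leave only a handful of values to fit, which is one way to see why the fitted ranges had to differ between conditions.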

To assess possible phenomenological bifurcations (Kuehn, 2011), we estimated whether two pdfs of the *R* indices were statistically different using the two-sample Kolmogorov-Smirnov test, with the null hypothesis that the two data sets are from the same continuous distribution.
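A two-sample Kolmogorov-Smirnov comparison of this kind can be sketched with the standard library alone. The text does not state which implementation was used, so the code below is an assumption: it computes the maximum distance between the two edfs and an approximate *p*-value from the asymptotic Kolmogorov series with a commonly used small-sample correction. The function name and the example data are hypothetical.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Return (D, approximate p-value) for the two-sample KS test."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # The largest edf difference is attained at one of the pooled data points.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    effective_n = math.sqrt(n * m / (n + m))
    lam = (effective_n + 0.12 + 0.11 / effective_n) * d
    if lam < 0.3:
        # The alternating series converges slowly here; p is essentially 1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Hypothetical data: identical samples versus completely separated samples.
same = [i / 100 for i in range(50)]
shifted = [0.5 + i / 100 for i in range(50)]
print(ks_two_sample(same, same))
print(ks_two_sample(same, shifted))
```

Identical samples give D = 0 and a *p*-value of 1, while fully separated samples give D = 1 and a *p*-value far below 0.05, so the null hypothesis of a common distribution would be rejected only in the second case.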

## **RESULTS**

In order to inspect characteristics of the phase synchrony probability distribution function or the edf, a computation of the phase synchrony index, described in Section Materials and Methods, was done first. It should be noted, as discussed below in the Discussion section, that our synchrony analysis among MEG sensors reflects population-scale levels of activity in large cellular ensembles, mostly a combination of synaptic potentials and neuronal action potentials (Toga and Mazziotta, 2002); thus the synchrony index in reality represents correlated phases among the MEG sensors. The average magnitude of the synchrony revealed few differences between the ASD and the control (non-ASD) group during task performance. The most notable difference is that the slight increase in synchrony during performance of the auditory attention task was more evident in the non-ASD group, as presented in **Figure 1** for the central frequency of 10 Hz. Note that the synchrony index between two sensor groups tends to increase from the "control", or baseline, to the "attend" condition in both ASD and non-ASD participants, and that it is already higher than baseline in the "no-attend" condition. Apparently, the mere fact that participants had to perform a task, either paying or not paying attention to deviant tones as instructed, already changes the brain synchrony patterns. In contrast, and shown as well in **Figure 1**, no apparent reproducible change in synchrony is detected when evaluated at 32 Hz (or at 26 Hz, not shown). The increase in the synchrony index *R* associated with task performance was observed when phase synchrony was computed at central frequencies of 10 and 18 Hz, and the relative changes during task performance, between the "baseline" and "attend" condition, were an increase of 13.1 ± 5.6% and 5.54 ± 4.6% for the non-ASD group at 10 and 18 Hz respectively, and 4.4 ± 3.5% and −0.6 ± 1.9% (a decrease) for the ASD group. Thus it seems that it is around the α-frequency range (10 ± 2 Hz) where the tendency to enhance the magnitude of phase synchrony amongst the MEG sensors assessed is more pronounced. Note that, in the ASD group, the magnitude of synchrony at 32 Hz is highest in the parietal sensors (LP-RP) regardless of task condition, as indicated as well in previous studies (Pérez Velázquez et al., 2009; Teitelbaum et al., 2012). The number of errors (deviant tones not detected) committed during the performance of the oddball task ("attend" condition) did not differ significantly between the two groups, although there was a trend for worse performance by those with ASD (ASD mean of 12.25 ± 13 errors; control group mean of 10.1 ± 7.3).

versus the data (R-values) are presented. Dashed red lines indicate 95% confidence for a power law, showing the presence of outliers in **B**. Note as well the change in the pdf, becoming almost bimodal in **B**, and as well the possible presence of a power law regime in the middle of the edf. Fitting the power law in this middle range (70–30% of the values, see Section Materials and Methods for ranges used), the exponent found was 2.4. **(C)** Data collected from another individual (non-ASD) during performance of the Stroop task, showing the presence of outliers in the tail during the incongruent color condition.

**FIGURE 3 | Upper graphs correspond to one subject (ASD) performing the auditory attention task, illustrating that the tail power law characteristics disappear during the attend condition**. The inset on the right-hand side graph is the log-linear plot, suggesting that the edf has more pronounced exponential characteristics rather than power law features. Lower plots are from another subject (non-ASD) showing the presence of outliers in both conditions of the auditory task, but more numerous in the "attend" condition (circled). As in **Figure 2B**, the middle part (70–40%) of the edf in the "Attend" condition could be fit to a power law with exponent 3.4.

The changes in synchrony during the Stroop task were reported in Pérez Velázquez et al. (2009), so they will not be reproduced here. Briefly, significant increases in the magnitude of the synchrony index were observed in the non-ASD participants during the "incongruent" condition, but no apparent change in synchrony between conditions was detected in the ASD group. The behavioral results of this task are also reported in that paper: the difference in the errors committed (reading the word rather than naming the color in the incongruent condition) between the two groups of children was not significant even though, as found in the attention task, there was a tendency for ASD subjects to make more mistakes (average of 5 ± 4.8 errors for the ASD participants versus 2.68 ± 2.5 errors in the control group).

Whereas the averaged magnitudes of synchronization provide certain information regarding brain dynamics, it is also informative to inspect the whole pdf of the magnitudes of synchrony, and this was our main purpose in the present study. As is well known, when pdfs are not Gaussian the central tendencies (median, most probable value, average) will differ, so which one to use is a matter of convenience or taste; inspecting characteristics of the whole pdf, especially the tails, thus provides a more complete picture than averages and variances alone. **Figure 2** shows two pdfs of the synchronization index evaluated at 26 Hz for one subject performing the auditory task: **Figure 2A** is derived from the baseline condition and **Figure 2B** from the "no-attend" condition. Note the differences in shape, one ("no-attend") being bimodal; these differences can be quantified by a Kolmogorov-Smirnov (KS) test (in this case the difference is highly significant: *p* < 0.0001). The logarithmic plot of the edf (or rather 1-edf, as noted in Section Materials and Methods) is represented in the same figure, and these plots were used to assess the presence of power laws in the tails, as explained in Section Materials and Methods. Note that some power laws could be present not in the tails of the distribution but in the middle, as in **Figure 2B** (the straight segment in the middle); however, this was uncommon (less than 45% of inspected edfs). Power laws in the tails were not found very frequently either. **Tables 1** and **2** provide the abundance of power laws found in the tails: on average, in the non-ASD group these were present in 27.9% (auditory task) and 32.1% (Stroop task) of those evaluated, and in the ASD group the averages were 35% (auditory task) and 39.3% (Stroop task).
Other values are presented in the tables, where a disappearance of tail power law characteristics with increasing cognitive effort can be seen, an effect present in both tasks. **Figure 2C** shows the loss of tail power law (appearance of outliers) in the "incongruent ink" condition of the Stroop task, the most demanding of the three in that task. Perhaps because of this effect, notice in **Tables 1** and **2** that the frequency of tail power laws is greater in the baseline condition of the auditory task (60.5% of instances), when children were asked to remain relaxed, than in the baseline condition of the Stroop task (36.4% of instances), where the cognitive load was higher as they had to read a list of words in black ink. Thus, fewer power law features are associated with more cognitive effort. The Discussion section contains comments on why the power law regime in the synchrony distribution is less frequent as cognitive load increases. Representative examples are presented in **Figures 2** and **3**. Even when power laws could not be fitted, there were more outliers as cognitive effort increased, as depicted in **Figure 3** (lower graphs) and quantified in **Table 1** ("points out of PL"). It is worth noticing that, rather than power laws, some of the edfs had exponential characteristics, as shown in **Figure 3**, upper graph inset.

The tendency to change in pdf characteristics (as represented in **Figure 2**) is suggestive of a critical transition, of the kind known as phenomenological bifurcations (Kuehn, 2011), which describe changes in the probability density functions of random dynamical systems. To quantitatively assess the difference between pdfs in each condition of the tasks, KS tests were used. Of 388 pdfs evaluated, including all children and all tasks (thus 388/2 = 194 transitions, where one "transition" means going from one task condition, say "congruent ink", to the next, "incongruent ink"), changes between pdfs were significant (*p* < 0.05) 52.1% of times, but were more abundant when there was a transition from power-law to non-power law (or vice versa) characteristics (53.9% of the times) than when there was no such transition (39% of the times). This difference was more pronounced in the data corresponding to the Stroop task (47.2% of instances for power law to non-power law, and 25% for the other case) than in the auditory attention task. These observations suggest that a bifurcation, manifested as a change in the characteristics of the pdf, may take place when increasing the cognitive load of a task.

**Table 1 | Presence of tail power law (PL) regimes in the distribution of the synchronization index during the auditory attention task**.

Data point outliers ("Points out of PL") are more numerous as the attentional demand increases, from baseline to the "Attend" condition (lower plots in Figure 3 depict one example).

## **DISCUSSION**

Critical dynamics, the behavior of extended systems near a phase transition where scale invariance prevails, has been proposed for nervous system activity as it has several features that optimize information processing (Beggs, 2007; Shew and Plenz, 2013), and this notion has been taken with such enthusiasm that the field is currently in the grip of an explosion of fecundity. Indications of such dynamics have been reported for a variety of *in vitro* and *in vivo* recordings, ranging from *in vitro* slice electrophysiology to human functional magnetic resonance imaging. However, there still remains considerable debate as to whether brains really operate close to criticality rather than displaying, for instance, stochastic or oscillatory dynamics. One sign of criticality that has become a favorite is the inspection of power-law distributions in nervous system data, and such power-law scaling has been reported in association with different types of recordings, even though some studies failed to find clear evidence. Here, we studied whether there is a power law scaling in the distribution of the phase synchronization derived from magnetoencephalographic recordings during executive function tasks performed by children with and without ASD. Our observations suggest that power law scaling of phase synchrony indices derived from MEG recordings is not very common in either the ASD or the non-ASD group, and its frequency of occurrence tends to diminish with increased cognitive load/effort as children performed the tasks. There were indications of changes in the phase synchrony probability distribution functions associated with a transition from power law scaling to lack of power law, perhaps suggesting the presence of phenomenological bifurcations in brain dynamics associated with cognitive load.
Hence, the observations of power law and other (exponential) scaling regimes plus the signs of phenomenological bifurcations further support the metastability of brain dynamics and suggest that some brain areas experience critical transitions.

Our studies are based on the calculation of a phase synchronization index from MEG recordings that reflect large-scale activity, at the collective level, in extensive cellular ensembles. The synchrony index thus represents correlated activity in the brain areas over which the sensors are located. There are certain limitations worth noting. Perhaps principally, the signals detected may summate at nearby MEG sensors, depending on the intensity of the source, causing multiple sensors to contain similar activities. To minimize summation of signals, the areas of sensors chosen were not direct neighbors. These sensors were chosen as well because the cortical areas over which they are located are associated with sensorimotor transformations (Binkofski et al., 1999). Estimating the time series at the source level (in brain tissue) is a solution to overcome the summation at neighboring sensors. While some methods to derive the signals at the sources have been reported in the literature, source reconstruction adds another level of complexity to the analysis and may even yield spurious results, as it is an "ill-posed mathematical problem" (Gross et al., 2013): assumptions must be made about the origin and location of the expected sources in order to properly constrain the solution to the problem, and thus there is bias, as it is not trivial to choose what brain areas could be expected to account for the brain dynamics. With these considerations in mind, our analyses are performed at the sensor level and the conclusions we draw from them focus on relative changes without focusing on specific cortical areas.

Traditional scientific reporting methods normally use averages and variances, which tend to hide the variability and fluctuations in data sets. Thus, a complementary approach is the observation of the full pdf. As evidenced in the figures, power law scaling could always be found in some segments of the edf, a well-known feature as few real-world distributions follow a power law over the entire range (Newman, 2005). This imposes a certain arbitrary constraint, in that one must choose a range of values over which the power law may hold, a choice that is not trivial when using empirical data, and thus the scaling exponent will vary depending on the chosen data points. It is known as well that two or more power law regimes with different exponents may be present in the same distribution. To make things more complicated, the exponent values depend on sampling and several other aspects (Priesemann et al., 2009; Touboul and Destexhe, 2010; Marković et al., 2013). For all these reasons, we do not emphasize the values of the exponents in our work; nevertheless, we note that the values of the exponent, either in the tail or in the middle of the distribution, are larger than 2 (see legends of **Figures 2B** and **3**, where a power law approximated to the middle part of the edf provided exponents >2). Because the classical exponent associated with self-organized criticality is 1, the celebrated 1/f scaling (Bak et al., 1988; Pritchard, 1992), exponents larger than 1 may not be associated with this phenomenon. High values of exponents have been recently reported in recordings from cats, monkeys and humans (Touboul and Destexhe, 2010; Dehgani et al., 2012); hence the matter of self-organized criticality in nervous system activity remains unclear at the present time.
Nevertheless, our study is not intended to present evidence for self-organized critical dynamics in brain synchronization, but rather to inspect certain properties of the distributions of our synchrony index associated with performance of executive function tasks in two groups of individuals. In instances where power law regimes co-exist with others (e.g., exponential) in the distributions of synchrony magnitudes, it could be hypothesized that this is a sign of the metastability of brain dynamics, a notion proposed by several authors (Bressler and Kelso, 2001; Fingelkurts and Fingelkurts, 2004; Kelso, 2008; Pérez Velázquez and Frantseva, 2011; Deco and Jirsa, 2012; Kelso et al., 2013). Incidentally, one of the earliest proposals of the brain as an "organ whose natural state is one of unstable equilibrium" is due to William James in his 1879 essay "Are we automata?" published in *Mind*, 4, 1–22.

One aspect that, in principle, could be concluded from our study is that the power law features become less frequent as tasks require more effort/cognitive component. Since we investigated the edf of a synchronization index amongst MEG sensors, and if we assume these indices represent correlated activity in brain areas over which the sensors are positioned as expounded above, power law scaling then denotes that synchrony has no characteristic scale, and the absence of power law indicates that there are characteristic scales in synchrony; in the case of the (right-hand side) tails of the distribution, the absence of power law features means that some especially high values of the magnitude of synchrony appear, perhaps because of the change in coordinated activity in certain cortical areas associated with task performance. Thus, the decreased incidence of power law regimes that we observed as cognitive effort increases can possibly be explained by the associated enhanced synchronization needed to perform the task. In fact, **Figure 1** indicates a tendency to increase synchrony during task performance. Equally, in the Stroop task, an increase in the magnitude of phase synchrony going from the baseline ("black ink") to the "incongruent" condition (that is, the most difficult of the three conditions in that task) was previously reported only for the non-ASD group (Figures 1 and 2 in Pérez Velázquez et al., 2009); consequently, notice in **Table 2** the reduced occurrence of tail power laws for this group as task difficulty increased, but not for the ASD set. If some specific cortical areas become more synchronous, this will result in high values of the magnitude of synchrony (our *R* index) and therefore the loss of scale-free features. Heavy tails have been associated with small world features (Feldt et al., 2011), which, applied to our studies, would suggest that there are a few highly "connected" cortical regions whereas most have few connections.
Or more accurately, because phase synchrony as evaluated here is not really a measure of connectivity but a correlation between phases of oscillations, those results could be interpreted as a few regions with highly correlated phases of the oscillations (cautionary notes on the notion of "connectivity" derived from these types of analyses have been presented elsewhere, Perez Velazquez, 2012). It is of interest that, in experiments *in vitro*, enhancing excitation using blockers of GABAergic transmission results in deviations from the neuronal avalanche power law observed in unperturbed brain slices (Beggs and Plenz, 2003). It is conceivable that this *in vitro* manipulation shares similar neurophysiological features with increasing cognitive effort, perhaps increased activity/excitation in some cortical regions, and therefore both results, *in vitro* and ours *in vivo*, are complementary.

In our study we have not emphasized the possible association of the observed power law regimes and self-organized criticality, because, as noted above, it is still inconclusive that power law scaling is directly related to self-organized criticality in nervous systems. Indeed, features of critical dynamics emerge in various situations even when the dynamics are not critical, as shown in networks that possess a hierarchical modular structure (Friedman and Landsberg, 2013) or a noisy feedforward structure (Benayoun et al., 2010). While indications of criticality derived from "neuronal avalanches" of activity (Beggs and Plenz, 2003) or the scaling of fluctuations in functional brain imaging (Fraiman and Chialvo, 2012) have been reported, other studies have cast some doubt as to the methods used to assess power laws in brain recordings (Clauset et al., 2009; Dehgani et al., 2012). For instance, Touboul and Destexhe (2010) observed that sometimes the scaling behavior is a consequence of the thresholding method, which applies to amplitude-based recordings. There is doubt too as to the generic character of this presumed criticality in nervous tissue (Bédard et al., 2006; Beggs and Timme, 2012). To complicate matters, power laws can be generated in a variety of manners (Reed and Hughes, 2002; Marković and Gros, 2014). Nevertheless, our finding of some signs of phenomenological bifurcations, most commonly associated with transitions from power law to non-power law regimes, may suggest that, in some instances, our MEG recordings display signatures of possible phase transitions and thus provide, perhaps indirect, support for criticality in some instances.
The observation that power law regimes are not frequently seen may present another indication of criticality, because in principle it is only at the bifurcation point that power laws should be apparent, but once the transition has taken place, other regimes can be present; this important point, often overlooked, is mentioned in Beggs and Timme (2012). To stress it again, what has been demonstrated beyond doubt is that in systems at thermodynamic equilibrium power laws are found only near bifurcations, but in far from equilibrium conditions, this remains unclear. In view of what we, and others, have been reporting with regard to the apparent mixture of regimes, especially exponential (which is related to Poisson-type stochastic processes) and power-law scaling, brain recordings may represent the activity of coupled oscillator phenomena (Perez Velazquez et al., unpublished observations) in stochastic settings (Teramae and Tanaka, 2004; section 1.5 in Pérez Velázquez and Frantseva, 2011). For instance, Reed and Hughes (2002) reported that randomly observed stochastic processes exhibit tail power laws, and Deco and Jirsa (2012) proposed that resting state networks in the brain emerge as structured noise fluctuations in a multistable attractor landscape.

In terms of synchronization in the brain, the presence or absence of characteristic scales makes sense according to what has been found regarding, for instance, the stability of certain functional nets derived from EEG recordings (Chu et al., 2012), a phenomenon which would require characteristic scales if we assume those stable nets are almost always functionally "connected". Scale invariance makes sense too, as many brain nets have to be loosely or very transiently coordinated, especially when analyzing recordings like MEG or EEG that represent global, collective activities in myriads of cells. These neurophysiological features would support the varied dynamic behaviors of brain networks and, in general, metastable dynamics.

We have used phase synchronization in this study to evaluate power law scaling, instead of other, more commonly used measures such as the size of bursts or number of spikes in neuronal avalanches. It is difficult to ascertain what type of metric is best suited to characterize collective brain dynamics, but synchronization has two advantages. First, it seems to be a reasonable metric to scrutinize collective network dynamics, and it is and has been very widely used to study cognition and brain pathologies. The second advantage over other metrics that have been used in this type of study is that a threshold is not needed to define the characteristic to be analyzed (the problem with threshold-based methods for assessing power law regimes was mentioned above; Touboul and Destexhe, 2010). Using different metrics to scrutinize for criticality will be crucial in the future, considering the controversies with the study of neuronal avalanches.

To conclude, a few comments on what these results may indicate about ASD brain dynamics. The Introduction noted the current debate about the classical notion of "underconnectivity" in view of recent observations suggesting, if anything, the opposite. Since specific changes in brain dynamics were first proposed to account for ASD cognitive features, including the temporal binding deficit (Brock et al., 2002) and disruptions of coordinated timing in cellular activity and associated synchronization dynamics (Herbert, 2005; Uhlhaas and Singer, 2007), many reports have appeared, sometimes presenting contrasting evidence. This should not be surprising if we consider the wide spectrum of autistic syndromes and, of course, the great variety of experimental and analytical methods used to assess brain dynamics. In our study, no major differences were found between the ASD and the non-ASD participants, other than a tendency for non-ASD individuals to exhibit more synchrony when performing the tasks and thus, in general, less frequent power-law features than in the ASD data (see percentages in the tables). Thus, the current assortment of observations seems to indicate that, as we already noted in previous publications (Pérez Velázquez and Frantseva, 2011; Garcia Domínguez et al., 2013; Pérez Velázquez and Fernández Galán, 2013), it may not be a matter of more or less connectivity in the ASD brain, but rather of a different type of coordinated brain activity that manifests in the particular information processing characteristics and associated special cognitive style of individuals with autism.

## **ACKNOWLEDGMENTS**

This research was supported by a New Investigator Grant from the Hospital for Sick Children Foundation and by the Natural Sciences and Engineering Research Council of Canada (NSERC).

## **REFERENCES**


Sornette, D. (2004). *Critical Phenomena in Natural Sciences.* Berlin: Springer Verlag.

Stroop, J. R. (1935). Studies of interference in serial verbal reactions. *J. Exp. Psychol.* 18, 643–662. doi: 10.1037/h0054651


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 21 February 2014; accepted: 14 April 2014; published online: 01 May 2014*.

*Citation: Tinker J and Perez Velazquez JL (2014) Power law scaling in synchronization of brain signals depends on cognitive load. Front. Syst. Neurosci. 8:73. doi: 10.3389/fnsys.2014.00073*

*This article was submitted to the journal Frontiers in Systems Neuroscience*.

*Copyright © 2014 Tinker and Perez Velazquez. This is an open-access article distributed under the terms of Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Universal organization of resting brain activity at the thermodynamic critical point

#### *Shan Yu\* , Hongdian Yang†‡, Oren Shriki and Dietmar Plenz ‡*

*Section on Critical Brain Dynamics, National Institute of Mental Health, NIH, Bethesda, MD, USA*

#### *Edited by:*

*Lucilla De Arcangelis, Second University of Naples, Italy*

#### *Reviewed by:*

*Stefano Panzeri, Italian Institute of Technology, Italy*

*Matias Palva, University of Helsinki, Finland*

#### *\*Correspondence:*

*Shan Yu, Section on Critical Brain Dynamics, Laboratory of Systems Neuroscience, National Institute of Mental Health, Bethesda, MD 20892, USA e-mail: yushan.mail@gmail.com*

#### *†Present address:*

*Hongdian Yang, The Solomon H. Snyder Department of Neuroscience and Brain Science Institute, Johns Hopkins University School of Medicine, Baltimore, USA*

*‡These authors have contributed equally to this work.*

Thermodynamic criticality describes emergent phenomena in a wide variety of complex systems. In the mammalian cortex, one type of complex dynamics that spontaneously emerges from neuronal interactions has been characterized as neuronal avalanches. Several aspects of neuronal avalanches such as their size and lifetime distributions are described by power laws with unique exponents, indicating an underlying critical branching process that governs avalanche formation. Here, we show that neuronal avalanches also reflect an organization of brain dynamics close to a thermodynamic critical point. We recorded spontaneous cortical activity in monkeys and humans at rest using high-density intracranial microelectrode arrays and magnetoencephalography, respectively. By numerically changing a control parameter equivalent to thermodynamic temperature, we observed typical critical behavior in cortical activities near the actual physiological condition, including the phase transition of an order parameter, as well as the divergence of susceptibility and specific heat. Finite-size scaling of these quantities allowed us to derive robust critical exponents highly consistent across monkeys and humans that uncover a distinct, yet universal organization of brain dynamics. Our results demonstrate that normal brain dynamics at rest resides near or at criticality, which maximizes several aspects of information processing such as input sensitivity and dynamic range.

#### **Keywords: neuronal avalanches, LFP, MEG, critical exponents, phase transition**

The cerebral cortex of the mammalian brain consists of tens of billions of neurons with interactions among them that exist at many scales ranging from local microcircuits, to cortical areas, and even across the entire cortex. These myriads of neuronal interactions underlie various brain functions including motion, perception, and cognition (Abeles et al., 1993; Vaadia et al., 1995; Rodriguez et al., 1999; Singer, 1999). Understanding the general principles governing these interactions and how they give rise to emergent properties of information processing is one of the most challenging questions in systems neuroscience.

For several decades, concepts and tools developed in statistical physics have addressed the collective behavior of complex systems by studying the interactions among the constituent microscopic system components. Of the many states a complex system might adopt, the critical state at thermodynamic equilibrium has been extensively studied and this state might be particularly relevant for the brain. Microscopically, the critical state represents exquisitely balanced interactions among all system components (Stanley, 1999). Macroscopically, such balanced interactions poise the system at a transition between two contrasting phases (quantified by the order parameter, *M*) and give rise to a number of non-trivial emergent properties, including the divergence of the sensitivity to external perturbations (quantified by the susceptibility, χ), and internal complexity/diversity (quantified by the specific heat, *C*; Stanley, 1987; Binney et al., 1992; Sornette, 2006). For the cortex, these quantities have intuitive meanings in terms of neuronal information processing. χ reflects the input sensitivity of the system (Newman and Barkema, 1999), *C* reflects the dynamic range of neuronal populations in representing inputs (Tkacik et al., 2009; Macke et al., 2011), and *M* measures the overall neuronal activity level. The maximization of χ and *C* achieved at criticality can thus be interpreted as optimizing input sensitivity (Houweling and Brecht, 2007; Huber et al., 2008; Shew et al., 2009) and dynamic range (Shew et al., 2009; Tkacik et al., 2009; Macke et al., 2011), respectively. At the same time, the changes of *M*, i.e., the overall activity level, may reflect state changes of the brain, such as transitions between sleep and wakefulness or between focused attention and inattention (Cohen and Maunsell, 2009; Mitchell et al., 2009; Vyazovskiy et al., 2009; Harris and Thiele, 2011; Grosmark et al., 2012).

Importantly, near the critical state, those emergent behaviors do not depend on the specific microscopic realization of a system. It has been shown that a multitude of systems can be categorized into a small number of "universality classes" based on only a few parameters, i.e., so called "critical exponents" (Stanley, 1987, 1999; Binney et al., 1992; Sornette, 2006). Within individual classes, apparently different systems follow the same quantitative rules. A major question thus arises, whether such universality of critical behavior, encountered when studying physical systems, might also include biological complex systems such as the cortex that evolved to process information.

Recent studies of neuronal avalanches strongly suggest that neuronal interactions, both at the mesoscopic scale (within tens of mm<sup>2</sup> of cortical tissue; Beggs and Plenz, 2003; Petermann et al., 2009) and at the macroscopic level (across the entire cortex; Allegrini et al., 2010; Tagliazucchi et al., 2012; Palva et al., 2013; Shriki et al., 2013), may position the cortex at or near a non-equilibrium critical state in order to optimize information processing (Kinouchi and Copelli, 2006; Rämö et al., 2007; Shew et al., 2009, 2011; Yang et al., 2012). Neuronal avalanches are intermittent cortical activity cascades that spontaneously form in the normal brain. During an avalanche, spontaneous activation of one neuronal group can trigger consecutive activations of other neuronal groups within just a few milliseconds and the propagation of such activity spans both spatial and temporal domains. This propagation is well-described by a non-equilibrium critical branching process, which successfully explains some of the functional advantages of neuronal avalanches (Beggs and Plenz, 2003; Shew et al., 2009, 2011; Yang et al., 2012). However, it is currently unclear if neuronal avalanches indicate cortical dynamics close to a critical state in the equilibrium thermodynamic sense and, if so, what universality class this form of cortical activity might belong to. The current study aims to address these questions and their potential functional implications for the brain.

## **MATERIALS AND METHODS**

## **LOCAL FIELD POTENTIAL (LFP) RECORDINGS IN MONKEYS**

All experiments were carried out in accordance with NIH guidelines for animal use and care. The protocol was approved by the Animal Care and Use Committee of the National Institute of Mental Health. Ongoing LFP activity was recorded from two adult monkeys (*Macaca mulatta*). Multi-electrode arrays (10 × 10; 400 μm inter-electrode distance; 1 or 0.6 mm electrode length; BlackRock Microsystems) were chronically implanted in the left pre-motor (Monkey 1) or prefrontal (Monkey 2) cortex (**Figure 1A**). Twenty to thirty minutes of ongoing LFP (1–100 Hz) signals were simultaneously obtained from each electrode while the animals were sitting alert in a primate chair but not engaged in any behavioral task. For more experimental details, see Yu et al. (2011).

## **MAGNETOENCEPHALOGRAPHY (MEG) RECORDINGS IN HUMAN SUBJECTS**

All experiments were carried out in accordance with NIH guidelines for human subjects. Ongoing brain activity (∼30 min) was recorded from 3 healthy female participants. The sampling rate was 600 Hz, and the data were band-pass filtered between 1 and 150 Hz. The sensor array consisted of 275 axial first-order gradiometers. Two dysfunctional sensors were removed, leaving 273 sensors in the analysis. Analysis was performed directly on the axial gradiometer waveforms. For more details, see Shriki et al. (2013).

**FIGURE 1 | Identifying avalanche dynamics in LFP signals. (A)** Lateral view of the macaque brain showing the position of the multi-electrode array (square, not to scale) in pre-motor (Monkey 1; blue) and prefrontal (Monkey 2; orange) cortex. PS, Principal Sulcus; CS, Central Sulcus. **(B)** Example period of continuous LFP at a single electrode. Asterisks indicate peaks of negative deflections in the LFP (nLFPs) that pass the threshold (Thr., broken line; −2.5 SD). **(C)** Identification of spatiotemporal nLFP clusters and corresponding spatial patterns. Left: nLFPs that occur in the same time bin or consecutive bins of length Δ*t* define a spatiotemporal cluster, whose size is given by its number of nLFPs (two clusters of size 4 and 5 shown; gray area). Right: Patterns represent the spatial information of clusters only. **(D,E)** Neuronal avalanche dynamics are identified when the sizes of activity cascades distribute according to a power-law with slope close to −1.5. Four distributions from the same original data set (solid line) using different areas (inset), i.e., number of electrodes (*n*), are superimposed. The power-law distributions vanish for shuffled data (broken lines). A theoretical power-law with slope of −1.5 is provided as guidance to the eye (gray, broken line). **(D)** is reprinted from Yu et al. (2011).

#### **AVALANCHE ANALYSIS**

Negative deflections in the LFP (nLFPs) were detected by applying a threshold at −2.5 standard deviations (SDs) of the LFP fluctuations estimated for each electrode separately (**Figure 1B**). Such a threshold is based on the non-linear relation between nLFP amplitudes and the ability of local neuronal groups to synchronize with other, spatially separated ones (Thiagarajan et al., 2010; Yu et al., 2011). The nLFP peak times were then binned using a time window, Δ*t*. Results shown are based on Δ*t* = 2 ms (Monkey 1) and 4 ms (Monkey 2) but they are similar across a wide range of Δ*t* (2–16 ms tested). Spatiotemporal clusters of nLFPs, i.e., avalanches, were defined by consecutive bins such that each bin contained at least one nLFP at any site in the selected group (Beggs and Plenz, 2003). The size of a cluster, *s*, was defined as the number of nLFPs in the cluster (**Figure 1C**). A similar analysis was applied to identify avalanches from the MEG recordings, for which a threshold at −3.0 SD of the MEG waveforms was used to detect significant neuronal events. The time window Δ*t* was 1.67 ms (1 × sampling period; subject 1) or 3.34 ms (2 × sampling period; subjects 2, 3). For more details, see Shriki et al. (2013). Avalanche patterns were obtained by collapsing all time bins within an avalanche to form a corresponding spatial pattern **σ** = (σ<sub>1</sub>, σ<sub>2</sub>, ..., σ<sub>*n*</sub>), where *n* is the number of recording sites, i.e., system size, included in the analysis and σ<sub>*i*</sub> = 1 if at least one nLFP occurred at site *i* and σ<sub>*i*</sub> = −1 otherwise (**Figure 1C**).
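The clustering rule described above can be illustrated with a minimal Python sketch. It assumes that threshold crossing has already been done and that the input is a list of (peak time in ms, site index) pairs; the function name `find_avalanches` and its arguments are hypothetical and are not taken from the original analysis code.

```python
from collections import defaultdict

def find_avalanches(events, bin_ms, n_sites):
    """Group nLFP events into avalanches.

    events  : iterable of (peak_time_ms, site_index) pairs (assumed input format)
    bin_ms  : width of the time bin in ms
    n_sites : number of recording sites in the selected group
    Returns a list of (size, pattern) tuples, where size is the number of
    nLFPs in the cluster and pattern is a tuple of +1/-1 values per site.
    """
    # Assign every event to a time bin.
    bins = defaultdict(list)
    for t_ms, site in events:
        bins[int(t_ms // bin_ms)].append(site)

    # Runs of consecutive non-empty bins form one avalanche.
    clusters, current, prev_bin = [], [], None
    for b in sorted(bins):
        if prev_bin is not None and b != prev_bin + 1:
            clusters.append(current)
            current = []
        current.extend(bins[b])
        prev_bin = b
    if current:
        clusters.append(current)

    # Collapse each cluster in time to obtain its spatial pattern.
    result = []
    for sites in clusters:
        active = set(sites)
        pattern = tuple(1 if i in active else -1 for i in range(n_sites))
        result.append((len(sites), pattern))
    return result
```

For example, with 2 ms bins, events at 0.5, 1.0 and 2.5 ms fall into two consecutive bins and form one avalanche of size 3, whereas an event at 9.0 ms starts a new avalanche because the intervening bins are empty.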

## **USING THE DICHOTOMIZED GAUSSIAN (DG) MODEL FOR ESTIMATING PATTERN PROBABILITIES** *p<sub>i</sub>*

The DG model is a useful tool for capturing the statistics of binary neural activity patterns (Amari et al., 2003; Macke et al., 2009, 2011; Yu et al., 2011). It applies a threshold to multivariate Gaussian variables: *y<sub>i</sub>* = 1 when *u<sub>i</sub>* > 0 and *y<sub>i</sub>* = −1 otherwise, where **u** = (*u*<sub>1</sub>, *u*<sub>2</sub>, ..., *u<sub>n</sub>*) ∼ *N*(δ, λ), δ is the mean and λ is the covariance of the Gaussian variables. In order to match the rate, *r*, and covariance, Σ, of the observed binary variables, i.e., avalanche patterns, δ and λ need to be adjusted according to δ<sub>*i*</sub> = Φ<sup>−1</sup>(*r<sub>i</sub>*) and λ<sub>*ij*</sub> as the solution of Σ<sub>*ij*</sub> = Φ<sub>2</sub>(δ<sub>*i*</sub>, δ<sub>*j*</sub>, λ<sub>*ij*</sub>) − Φ(δ<sub>*i*</sub>)Φ(δ<sub>*j*</sub>), where Φ and Φ<sup>−1</sup> are the cumulative probability function of a Gaussian distribution (Φ for 1-dimensional and Φ<sub>2</sub> for 2-dimensional) and its inverse function, respectively. An implementation of the model in MATLAB can be found in Macke et al. (2009). The pattern probabilities for the DG model were obtained by calculating the cumulative distribution of multivariate Gaussians (MATLAB function *mvncdf*).
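As a rough illustration of the thresholding step only, the following Python sketch draws samples from a two-unit DG model and tabulates the resulting pattern frequencies. It assumes unit-variance latent Gaussians and takes the latent correlation `lam` as given rather than solving for it from an observed covariance; it also uses Monte Carlo sampling in place of the multivariate Gaussian cumulative distribution used in the study. The function name and arguments are hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_dg_pair(r1, r2, lam, n_samples, seed=0):
    """Estimate pattern probabilities of a two-unit dichotomized Gaussian.

    r1, r2 : target probabilities that unit 1 / unit 2 is active (+1)
    lam    : correlation of the latent Gaussian variables (assumed known here)
    Returns a dict mapping (y1, y2) patterns (values +1/-1) to frequencies.
    """
    rng = random.Random(seed)
    # Latent means chosen so that P(u > 0) equals the target rate.
    d1 = NormalDist().inv_cdf(r1)
    d2 = NormalDist().inv_cdf(r2)
    counts = {}
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u1 = d1 + z1
        u2 = d2 + lam * z1 + math.sqrt(1.0 - lam * lam) * z2  # corr(u1, u2) = lam
        y = (1 if u1 > 0 else -1, 1 if u2 > 0 else -1)
        counts[y] = counts.get(y, 0) + 1
    return {pattern: c / n_samples for pattern, c in counts.items()}
```

With a positive latent correlation, the estimated probability of both units being active exceeds the product of the two rates, which is the dependence the model is meant to capture.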

#### **FITTING A POWER-LAW TO THE SIZE DISTRIBUTION**

The exponent of the best fitting power-law, ε, was estimated by minimizing the Kolmogorov–Smirnov (KS) distance between the empirical distribution and a power-law distribution (Klaus et al., 2011). The KS distance (*D*<sub>KS</sub>) was defined as

$$D\_{\rm KS} = \max\_{s} \left| CDF\_{\rm emp}(s) - CDF\_{\rm power\text{-}law}(s) \right|,\tag{1}$$

where *s* is the pattern size and *CDF*<sub>emp</sub> and *CDF*<sub>power-law</sub> represent the cumulative distribution function for the empirical size distribution and the power-law function used for fitting, respectively.
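A minimal Python sketch of this kind of fit is given below. It assumes integer sizes, a discrete power law truncated to the range `s_min` to `s_max`, and a simple grid search over candidate exponents; these choices, together with all function names, are illustrative assumptions and may differ from the procedure of Klaus et al. (2011).

```python
def size_histogram(sizes, s_min, s_max):
    """Count how often each size in [s_min, s_max] occurs."""
    counts = {}
    for s in sizes:
        if s_min <= s <= s_max:
            counts[s] = counts.get(s, 0) + 1
    return counts

def ks_distance(counts, exponent, s_min, s_max):
    """KS distance between the empirical CDF and a truncated power law.

    The model is P(s) proportional to s**exponent on s_min..s_max,
    so a slope of -1.5 corresponds to exponent = -1.5.
    """
    n = sum(counts.values())
    norm = sum(s ** exponent for s in range(s_min, s_max + 1))
    cdf_emp = cdf_pl = d_ks = 0.0
    for s in range(s_min, s_max + 1):
        cdf_emp += counts.get(s, 0) / n
        cdf_pl += s ** exponent / norm
        d_ks = max(d_ks, abs(cdf_emp - cdf_pl))
    return d_ks

def fit_exponent(sizes, s_min, s_max, grid=None):
    """Return the grid exponent with the smallest KS distance."""
    if grid is None:
        grid = [-3.0 + 0.01 * k for k in range(251)]  # -3.00 ... -0.50
    counts = size_histogram(sizes, s_min, s_max)
    return min(grid, key=lambda e: ks_distance(counts, e, s_min, s_max))
```

A grid search keeps the sketch dependency-free; a continuous optimizer could be substituted without changing the definition of the distance.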

## **INFERRING** *p<sub>i</sub>* **FOR DIFFERENT VALUES OF T**

To predict the pattern probabilities *p<sub>i</sub>* for different values of the fictitious temperature, *T*, it is useful to express the state probability as a function of interactions that occur at different orders (Nakahara and Amari, 2002; Amari et al., 2003). Let the pattern probability be *p*(**σ**), where **σ** = (σ<sub>1</sub>, σ<sub>2</sub>, ..., σ<sub>*n*</sub>) and σ<sub>*i*</sub> ∈ {1, −1} represents the state of an individual component. Generally, we can write *p*(**σ**), using the full log-linear expansion, as

$$p(\boldsymbol{\sigma}) = \frac{1}{Z} \exp\left(\sum\_i \theta\_i \sigma\_i + \sum\_{i < j} \theta\_{ij} \sigma\_i \sigma\_j + \sum\_{i < j < k} \theta\_{ijk} \sigma\_i \sigma\_j \sigma\_k + \dotsb\right),\tag{2}$$

where *Z* is the normalization factor and θ characterizes different orders of interactions. The full log-linear expansion and its lower-order approximations have been widely used in characterizing neuronal interactions (Schneidman et al., 2006; Yu et al., 2008; Ohiorhenuan et al., 2010).

Next, we define θ = θ<sup>0</sup>/*T*, where θ<sup>0</sup> represents the intrinsic interaction strength that does not depend on *T*. If we denote

$$E(\boldsymbol{\sigma}) = -\left(\sum\_i \theta^0\_i \sigma\_i + \sum\_{i < j} \theta^0\_{ij} \sigma\_i \sigma\_j + \sum\_{i < j < k} \theta^0\_{ijk} \sigma\_i \sigma\_j \sigma\_k + \dotsb\right),$$

Equation 2 can be rewritten as

$$p(\boldsymbol{\sigma}) = \frac{1}{Z} \exp\left(\frac{-E(\boldsymbol{\sigma})}{T}\right). \tag{3}$$

We can then use the single histogram method (Ferrenberg and Swendsen, 1988; Newman and Barkema, 1999) to infer *p<sub>i</sub>* for different *T*, an approach that was used for modeling natural image statistics (Stephens et al., 2013) and was also recently introduced to neuroscience (Tkacik et al., 2009). Specifically, if *p<sub>i</sub>* denotes the probability of any given pattern *i* and *E<sub>i</sub>* the corresponding *E*, Equation 3 changes to

$$p\_i = \frac{1}{Z}e^{-E\_i/T} \tag{4}$$

Setting *T* = 1 for the original recording, Equation 4 can be expressed as

$$p\_i(1) = \frac{1}{Z\left(1\right)}e^{-E\_i},\tag{5}$$

which enables us to compute *p<sub>i</sub>* for different *T* as

$$p\_i(T) = \frac{1}{Z} e^{-E\_i/T} = \frac{1}{Z} \left[ Z(1)\, p\_i(1) \right]^{1/T} = \frac{Z(1)^{1/T}}{Z}\, p\_i(1)^{1/T} \tag{6}$$

The normalization factor is determined by requiring that Σ<sub>*i*</sub> *p<sub>i</sub>*(*T*) = 1.
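In practice, Equation 6 amounts to raising each measured probability to the power 1/*T* and renormalizing. A minimal Python sketch follows; the function name `reweight` is hypothetical, and patterns with zero probability are assumed to stay at zero.

```python
def reweight(p1, T):
    """Pattern probabilities at temperature T from those measured at T = 1.

    Implements p_i(T) proportional to p_i(1)**(1/T), normalised to sum to 1.
    p1 : list of pattern probabilities at T = 1
    T  : fictitious temperature (T > 0)
    """
    powered = [p ** (1.0 / T) if p > 0 else 0.0 for p in p1]
    z = sum(powered)  # fixes the normalisation
    return [w / z for w in powered]
```

Lowering *T* below 1 concentrates probability on the most likely patterns, whereas raising *T* flattens the distribution toward uniform.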

## **COMPUTING THE SPECIFIC HEAT, SUSCEPTIBILITY, AND ORDER PARAMETER**

The specific heat, *C*, is:

$$C = \frac{1}{n} \frac{\partial U}{\partial T} = \frac{\left< E\_i^2 \right> - \left< E\_i \right>^2}{nT^2},\tag{7}$$

where *n* is the system size, *U* ≡ ⟨*E<sub>i</sub>*⟩ = Σ<sub>*i*</sub> *p<sub>i</sub>E<sub>i</sub>*, and *E<sub>i</sub>* can be calculated according to Equation 4. Given *n* and *T*, *C* reflects the variance of log(*p<sub>i</sub>*), a useful metric for quantifying the capacity of the system to represent information (Tkacik et al., 2009; Macke et al., 2011).

The order parameter, *M*, is defined as:

$$M = \frac{1}{n} \sum\_{i=1}^{2^n} p\_i m\_i,\tag{8}$$

where

$$m\_i = \sum\_{j=1}^{n} \sigma\_j^i,$$

and the superscript *i* indicates that the value of σ is taken from the *i*th pattern. *M* has a very intuitive meaning for a cortical system: it reflects the overall activity level of the system.

Finally, the susceptibility χ is a measure of the sensitivity of the system to small external perturbations. χ is defined as the change rate of *M* when a small external field *H* is applied:

$$\chi = \left. \frac{\partial M}{\partial H} \right|\_{H=0} = \frac{\left\langle m\_i^2 \right\rangle - \left\langle m\_i \right\rangle^2}{nT} \tag{9}$$

The field *H* exerts its effect by changing the preference of the units to be active or not, i.e., their likelihood of being involved in an avalanche. Specifically, applying *H* is equivalent to adding a term *H*σ<sub>*i*</sub> to the Hamiltonian (*E*). For cortical dynamics, *H* can be thought of as an approximation of a local perturbation, e.g., making a single neuron or a small group of neurons fire [analogous to flipping a single spin in a model; see Newman and Barkema (1999)], and/or a weak common input from, e.g., distant cortical areas or sub-cortical brain structures.
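The three quantities in Equations 7–9 can be computed directly from a table of pattern probabilities. The Python sketch below assumes that all patterns have non-zero probability at *T* = 1 and takes the energies as −ln *p<sub>i</sub>*(1), which equals *E<sub>i</sub>* up to an additive constant that does not affect the variance; the function name is hypothetical.

```python
import math

def thermo_quantities(patterns, p1, T):
    """Specific heat C, order parameter M and susceptibility chi at temperature T.

    patterns : list of tuples with entries +1/-1, one tuple per pattern
    p1       : probabilities of those patterns at T = 1 (all assumed > 0)
    """
    n = len(patterns[0])
    # Energies up to an additive constant: E_i = -ln p_i(1).
    energies = [-math.log(p) for p in p1]
    # Reweight to temperature T (Equation 6).
    weights = [p ** (1.0 / T) for p in p1]
    z = sum(weights)
    p_T = [w / z for w in weights]
    # m_i is the summed activity of pattern i.
    m = [sum(pattern) for pattern in patterns]

    mean_e = sum(p * e for p, e in zip(p_T, energies))
    mean_e2 = sum(p * e * e for p, e in zip(p_T, energies))
    mean_m = sum(p * x for p, x in zip(p_T, m))
    mean_m2 = sum(p * x * x for p, x in zip(p_T, m))

    c = (mean_e2 - mean_e ** 2) / (n * T ** 2)   # Equation 7
    big_m = mean_m / n                           # Equation 8
    chi = (mean_m2 - mean_m ** 2) / (n * T)      # Equation 9
    return c, big_m, chi
```

Evaluating this function on a grid of *T* values for several system sizes yields curves of the kind that are subsequently used for finite size scaling.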

#### **FINITE SIZE SCALING (FSS) ANALYSIS**

At the thermodynamic limit (*n* → ∞), a critical system can be identified by power-law behaviors of its macroscopic quantities, including the correlation length ξ (a characteristic distance beyond which correlations diminish), specific heat *C*, magnetization *M* and susceptibility χ. These quantities follow a power-law relation as a control parameter, such as the thermodynamic temperature *T*, approaches a critical value *T<sub>c</sub>*, with specific critical exponents ν, α, β, and γ, respectively:

$$
\xi \sim |t|^{-\nu} \tag{10}
$$

$$C \sim \left| t \right|^{-\alpha} \tag{11}$$

$$M \sim |t|^{\beta} \tag{12}$$

$$\chi \sim |t|^{-\gamma} \tag{13}$$

where *t* = (*T* − *T<sub>c</sub>*)/*T<sub>c</sub>*. In principle, one could directly measure these relations to determine whether and when the system will be critical, i.e., to determine *T<sub>c</sub>*, and, at the same time, estimate all critical exponents.

The complication comes from the fact that real systems are finite in size. This so-called "finite size effect" causes the system's behavior to deviate from the thermodynamic limit. A standard procedure in statistical physics to solve this problem is Finite Size Scaling (FSS; Binney et al., 1992; Newman and Barkema, 1999). By analyzing the behavior of systems with different sizes, FSS extrapolates the behavior to the thermodynamic limit and estimates *T<sub>c</sub>* and the critical exponents. Briefly, we can choose a unique set of critical exponents to scale Equations 10–13 with different linear sizes of the system, *L* = *n*<sup>1/*d*</sup>, where *d* is the dimensionality, and then collapse the curves obtained for all sizes. Specifically, *t* needs to be scaled by *L*<sup>1/ν</sup>, whereas *C*, *M*, and χ are scaled by *L*<sup>−α/ν</sup>, *L*<sup>β/ν</sup>, and *L*<sup>−γ/ν</sup>, respectively. The critical exponents (ν, α, β, and γ) and *T<sub>c</sub>* that achieve the collapse are equivalent to those expected for a measurement made at the thermodynamic limit (see Appendix for detailed derivation). We identified the best collapse by minimizing the distance among all functions with different sizes using numerical optimization (MATLAB function *fminsearch*). Initial conditions for optimization were systematically changed according to a grid search method within a large parameter space and the resulting values for exponents were stable. These values were also stable for different ranges of *T* used to perform FSS. Results reported were based on *T* = 0.5–2.5.
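The rescaling step can be sketched in Python for the susceptibility as follows. The function name, argument names and the default dimensionality `d=2` are illustrative assumptions; the optimization over exponents that the study performed with *fminsearch* is not reproduced here.

```python
def rescale_susceptibility(T_values, chi_values, n, Tc, nu, gamma, d=2):
    """Map a susceptibility curve for system size n onto scaled coordinates.

    x = L**(1/nu) * (T - Tc) / Tc   and   y = L**(-gamma/nu) * chi,
    with linear size L = n**(1/d). The default d is a placeholder.
    """
    L = n ** (1.0 / d)
    xs = [L ** (1.0 / nu) * (T - Tc) / Tc for T in T_values]
    ys = [L ** (-gamma / nu) * chi for chi in chi_values]
    return xs, ys
```

If the chosen *T<sub>c</sub>*, ν and γ are appropriate, the scaled curves for all system sizes fall onto a single function of the scaled temperature; the specific heat and the order parameter can be treated in the same way with their own exponents.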

#### **MEASURING GOODNESS OF COLLAPSE**

For different system sizes *i*, the dependency of a system parameter, e.g., susceptibility χ<sub>*i*</sub>, on *T* was obtained. To quantify how well such a series of functions can be collapsed by FSS, we compared their "closeness" before (without scaling) and after the collapse (the best results achieved by numerical optimization). Specifically, the goodness of collapse (*GC*) is indicated by the ratio of the mean squared deviation (MSD) after and before the collapse, i.e., *GC* = MSD<sub>after</sub>/MSD<sub>before</sub>. Formally,

$$\mathrm{MSD} = \left\langle \left\langle (\chi\_i - \bar{\chi})^2 \right\rangle\_T \right\rangle\_i,$$

where χ̄ is the point-wise average over all system sizes, ⟨·⟩<sub>*T*</sub> indicates the average across the range of *T* and ⟨·⟩<sub>*i*</sub> indicates the average across system sizes. A smaller *GC* indicates a better collapse.
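A minimal Python sketch of this measure is shown below. It assumes that the curves for all system sizes have already been sampled on a common grid (after scaling, this would require interpolation onto shared abscissa values, which is not shown); the function names are hypothetical.

```python
def mean_squared_deviation(curves):
    """Mean squared deviation of a family of curves from their point-wise mean.

    curves : list of equally long lists, one per system size, all sampled
             on the same grid (an assumption of this sketch).
    """
    k, m = len(curves), len(curves[0])
    # Point-wise average over system sizes.
    avg = [sum(curve[j] for curve in curves) / k for j in range(m)]
    total = sum((curve[j] - avg[j]) ** 2 for curve in curves for j in range(m))
    return total / (k * m)

def goodness_of_collapse(before, after):
    """Ratio MSD_after / MSD_before; smaller values mean a better collapse."""
    return mean_squared_deviation(after) / mean_squared_deviation(before)
```

A value well below 1 indicates that scaling brought the curves much closer together than they were originally.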

## **RESULTS**

#### **AVALANCHE DYNAMICS AT THE MESOSCOPIC SCALE**

We first investigated neuronal avalanches at the mesoscopic scale (Beggs and Plenz, 2003; Petermann et al., 2009; Hahn et al., 2010; Ribeiro et al., 2010; Yu et al., 2011). Ongoing neuronal activity in two monkeys was recorded with 10 × 10 high-density micro-electrode arrays chronically implanted in superficial layers of cortex (**Figure 1A**). Significant negative local field potential deflections (nLFPs), which indicate synchronized activity of local neuronal populations (Petermann et al., 2009; Yu et al., 2011), were detected using an amplitude threshold of −2.5 SDs of the LFP calculated for each electrode (**Figure 1B**). A spatiotemporal nLFP cluster was identified if nLFPs on the multielectrode array occurred within the same or consecutive time bins of width Δ*t* (**Figure 1C**). Importantly, the cluster size *s*, defined as the number of nLFPs in a cluster, distributed according to a power-law with an exponent close to −1.5. Moreover, the distribution exhibited scale-free behavior, i.e., the power-law and its slope were stable for different system sizes *n*, whereas the cut-off changed systematically with *n* (**Figures 1D,E**). This power-law demonstrates that ongoing cortical activity at rest in awake monkeys organizes as neuronal avalanches (Beggs and Plenz, 2003; Petermann et al., 2009). It indicates the presence of significant correlations in neuronal activity among cortical sites and, accordingly, is destroyed when the times of nLFPs are shuffled randomly (**Figures 1D,E**, broken lines).

#### **CHARACTERIZATION OF THE CRITICAL BEHAVIOR**

Next, we investigated whether neuronal avalanches reflect a cortical state close to criticality in the sense of a thermodynamic equilibrium. Our approach is based on a method similar to Monte Carlo simulations (Newman and Barkema, 1999). First, we estimate the probability *p<sub>i</sub>* of individual configurations in the system based on actual recordings. For an equilibrium system, those probabilities would give a complete characterization of the system's behavior. Then, we infer the changes of *p<sub>i</sub>* with the change of a control parameter, *T*, which is considered to be equivalent to thermodynamic temperature. Finally, we compute various macroscopic properties, including susceptibility, specific heat, and an order parameter, as a function of *T* to judge if the actual *T* (the one associated with the original recording) is close to the critical point.

More specifically, we define the configurations or states of the system by the spatial avalanche patterns, obtained by collapsing the spatiotemporal avalanche patterns along the temporal domain. This mapping ignores the internal temporal structure of individual avalanches. Each avalanche is originally represented by an *n* by *m* activity matrix, where *n* is the number of electrodes and *m* is the temporal duration of the avalanche. The activity matrix is then turned into an *n*-component binary vector where an electrode is set to 1 if it participates at least once in the avalanche and to −1 otherwise [**Figure 1C**, see also Methods and Yu et al. (2011)]. The finite duration of the recording limits the direct estimation of pattern probabilities *p<sub>i</sub>* to *n* ∼ 10. Therefore, in order to estimate *p<sub>i</sub>* for larger *n*, we take advantage of a parametric model, the Dichotomized Gaussian (DG) model (Amari et al., 2003; Macke et al., 2009, 2011; Yu et al., 2011), which considers only the observed first-order (event rate) and second-order (pairwise correlations) statistics. This model estimates *p<sub>i</sub>* of avalanche patterns more accurately than directly measuring it from the limited data [**Figure 2**; see also Yu et al. (2011)]. Due to the exponential increase in possible configurations with increasing *n*, we restrict the calculation of *p<sub>i</sub>* to *n* = 20. In total, we analyzed four 20-electrode sub-groups recorded from each of the two monkeys.

After obtaining *pi* for the condition in which the actual recording was taken, we introduce a control parameter *T*, which changes both the likelihood of a given site to participate in an avalanche and the correlation among activities between different sites (Binney et al., 1992; Newman and Barkema, 1999). *T* is similar to the thermodynamic temperature and allows us to systematically estimate the system's behavior for conditions different from the recorded, physiological condition. To infer *pi* for

**FIGURE 2 | The DG model predicts state probability more accurately than direct sampling. (A)** Observed probability *pi* (thirty 10–electrode sub-groups) is plotted against the prediction made by direct sampling and the DG model. Solid line indicates equality. The comparison is based on 2-fold cross-validation (Yu et al., 2011). **(B)** JS divergence (Yu et al., 2011) between the observed and predicted probabilities of spatial avalanche patterns for the same thirty 10–electrode groups shown in **(A)**. Linked dots are the results obtained by direct sampling and the DG model for the same group. The DG model has significantly smaller JS divergence (21% reduction, *p* < 10<sup>−</sup>5, paired-sample signed rank test).

different *T*, we use the single histogram method (Ferrenberg and Swendsen, 1988; Newman and Barkema, 1999), which accurately predicts behavior of equilibrium system for different values of the control parameter. We note that the equilibrium assumption for the data is supported by the stable size distribution of avalanches over time (**Figure 3**) and the demonstration of detailed balance (**Figure A1**; see Appendix for more details). If we set *T* at which the actual recording was taken to be 1, it can be shown that, *pi*(*T*) = <sup>1</sup> *<sup>Z</sup> pi* (1) <sup>1</sup>/*<sup>T</sup>* where *pi*(*T*) is the state probability with the thermodynamic temperature *T* and *Z* is a normalization factor (Methods). After obtaining *pi* for a wide range of *T*, we use finite size scaling (FSS) analysis (Newman and Barkema, 1999) to investigate whether the avalanche state (*T* = 1) is close to a thermodynamic critical point, i.e., if the critical "temperature" *Tc* ≈ 1. We first analyzed the thermodynamic quantities χ, C, and M as functions of *T* for different system sizes (*n* = 12 − 20; **Figure 4**). Those functions measured for different *n* will be scaled according to a unique set of *Tc* and critical exponents to test if they can be collapsed. Specifically, *T* needs to be scaled by *<sup>L</sup>*1/ν(*<sup>T</sup>* <sup>−</sup> *Tc*)/*Tc*, where *<sup>L</sup>* <sup>=</sup> <sup>√</sup>*<sup>d</sup> <sup>n</sup>* and *<sup>d</sup>* is the dimensionality of the system. χ, *C*, and *M* need to be scaled by *L*−α/ν, *L*β/ν, and *L*−γ/ν, respectively. Achieving such a collapse implies that, at the thermodynamic limit, the system has a critical point at *Tc*, which is characterized by the divergence of χ and *C* and the phase transition of *M*. To illustrate this, we consider the collapse of χ, which implies that, at *Tc*, the scaled quantity of χ, i.e., *L*−γ/νχ, is a constant. 
When *n* → ∞, $L^{-\gamma/\nu} = n^{-\gamma/(\nu d)} \to 0$ because $\gamma/(\nu d) > 0$ (see below). Therefore, a finite product of $L^{-\gamma/\nu}$ and χ implies χ → ∞. We find an excellent collapse up to *n* = 20 (**Figure 4**). Importantly, the values of $T_c$ estimated by the FSS method are close to 1 (**Table 1**), suggesting that ensembles of neuronal avalanches are organized in the vicinity of a thermodynamic critical point. In addition to $T_c$, FSS also estimates the critical exponents, including ν, α, β, and γ. They characterize how χ, *C*, and *M* change as a function of *T* in the thermodynamic limit. We find that ν ≈ (0.8–0.9)/*d*, α ≈ 0.7, β close to 0, and γ close to 1. These results are consistent across the datasets obtained from two monkeys (**Table 1**).

**FIGURE 3 | Stability of the power law size distribution during the recording. (A)** Avalanche pattern size distribution of the whole recording (30 min) plotted on a double-logarithmic scale. ε, exponent of the best fitting power law to the distribution. Avalanche patterns were identified based on the activity recorded in the whole array (91 channels, Monkey 1). **(B)** The full dataset as analyzed in **(A)** was split into 10 consecutive, non-overlapping segments, each of which lasted for 3 min. Avalanche pattern size distributions were calculated for individual segments and plotted (color coded). **(C)** The original dataset as analyzed in **(A)** was shuffled in time (i.e., the sequence of activities was randomized) to eliminate temporal dependencies and split into ten consecutive, equal-sized segments. Avalanche pattern size distributions were calculated for individual segments and plotted (color coded). In **(B)** and **(C)**, ε is represented as mean ± s.d. (across all segments).

**FIGURE 4 | Critical behavior in susceptibility, specific heat, and order parameter observed for neuronal avalanches at the mesoscopic level, i.e., recorded by LFPs.** Susceptibility **(A)**, specific heat **(B)**, and order parameter **(C)** are plotted as a function of *T* for system size *n* = 12–20 (color code). Left: Original non-scaled functions. Right: Corresponding collapse using FSS analysis. Scaled quantities are plotted with $t = (T - T_c)/T_c$ and $L = \sqrt[d]{n}$, where *d* is the dimensionality of the system. Critical exponents: α, β, γ, and ν. We note that the peaks for the scaled variables χ and *C* are not expected to be at the location of $L^{1/\nu}t = 0$.
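The single histogram reweighting used above, $p_i(T) = p_i(1)^{1/T}/Z$, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and the toy probabilities are hypothetical, and the specific heat shown here (variance of the energy $E_i = -\log p_i(1)$ under $p(T)$, divided by $nT^2$) is one common convention assumed for the example. The definitions actually used for χ, *C*, and *M* are given in the Methods.

```python
import numpy as np

def reweight(p1, T):
    """Single-histogram reweighting: p_i(T) = p_i(1)**(1/T) / Z."""
    p1 = np.asarray(p1, dtype=float)
    seen = p1 > 0                       # unobserved states stay at probability 0
    log_w = np.full(p1.shape, -np.inf)
    log_w[seen] = np.log(p1[seen]) / T
    log_w -= log_w[seen].max()          # subtract the maximum for numerical stability
    w = np.exp(log_w)
    return w / w.sum()                  # dividing by the sum plays the role of Z

def specific_heat(p1, T, n):
    """Assumed convention: Var_T(E) / (n * T**2), with E_i = -log p_i(1)."""
    p1 = np.asarray(p1, dtype=float)
    seen = p1 > 0
    pT = reweight(p1, T)[seen]
    E = -np.log(p1[seen])
    mean_E = np.sum(pT * E)
    return float(np.sum(pT * (E - mean_E) ** 2) / (n * T ** 2))

# Toy example: the 2**2 = 4 states of a hypothetical n = 2 system.
p_observed = np.array([0.7, 0.1, 0.1, 0.1])
for T in (0.5, 1.0, 2.0):
    print(T, reweight(p_observed, T).round(3), round(specific_heat(p_observed, T, n=2), 4))
```

Lowering *T* below 1 concentrates probability on the most frequent states, whereas raising it flattens the distribution; *T* = 1 returns the measured probabilities unchanged.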

## **AVALANCHE DYNAMICS AT THE MACROSCOPIC SCALE**

Seeking to extrapolate from these results, we applied the FSS analysis to neural dynamics manifested at the macroscopic scale—the whole human brain—measured by MEG. In **Figure 5**, we show that ongoing neuronal activity in human MEG reflects neuronal avalanches, which reconfirmed our recent finding (Shriki et al., 2013). Despite the dramatically different spatial scales between the LFP and MEG signals from monkeys and humans (>10,000-fold difference in recording areas), we found strikingly similar behavior for the activity measured across the entire human cortex when the control parameter, *T*, and system size, *n*, change (**Figure 6**). Again, FSS analysis suggests that *Tc* ≈ 1 for the macroscopic system (**Table 1**). The results were consistent across different human subjects and, importantly, both *Tc* and the critical exponents of MEG recordings are very similar to those obtained from the LFP recordings (**Figure 7**). Such similarity, in terms of both the scaling behavior, i.e., collapse of curves, and critical exponents, strongly suggests a universal organization that underlies neuronal interactions at various spatial scales.

## **VALIDATING THE FSS METHOD THROUGH A SIMPLE MODEL**

Next, we investigated a simple, easily interpretable model to illustrate the sensitivity of FSS analysis in distinguishing critical from non-critical system dynamics. To this end, we used the DG model in which all elements were embedded in a ring configuration. Each element had a well-defined "distance" to every other element (**Figure 8A**). We set the covariance of hidden variables (Methods) *i* and *j*, $\lambda_{ij}$, as a Gaussian function of the distance $r_{ij}$ between them: $\lambda_{ij} = \lambda_{\max} \exp\left[-\frac{1}{2}\left(\frac{r_{ij}}{\omega}\right)^{2}\right]$, where $\lambda_{\max}$ is the maximal covariance and ω is the SD of the Gaussian function. In the limit ω → ∞, all $\lambda_{ij}$ become identical and criticality is ensured (Macke et al., 2011). Conversely, decreasing ω to 0 drives the system to an independent state (**Figure 8B**).
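The distance-dependent covariance of the ring model can be written out as follows. This is a sketch under stated assumptions: neighboring elements are taken to be one unit apart and distance is measured along the shorter arc of the ring, and the function name and the example values of $\lambda_{\max}$ and ω are hypothetical. Sampling binary patterns from the DG model itself is described in the Methods and is not reproduced here.

```python
import numpy as np

def ring_covariance(n, lam_max, omega):
    """Return ring distances r_ij and lambda_ij = lam_max * exp(-0.5 * (r_ij / omega)**2).

    Requires omega > 0; omega = np.inf gives identical covariances for all pairs.
    The diagonal (r = 0) is simply lam_max here and is not a pairwise covariance.
    """
    idx = np.arange(n)
    d = np.abs(idx[:, None] - idx[None, :])
    r = np.minimum(d, n - d)                      # shorter way around the ring
    lam = lam_max * np.exp(-0.5 * (r / omega) ** 2)
    return r, lam

# Hypothetical example: 10 elements, so the largest ring distance is 5.
r, lam = ring_covariance(10, lam_max=0.3, omega=2.0)
print(r[0])
print(lam[0].round(3))
```

With small ω only near neighbors covary appreciably, whereas for ω much larger than the largest ring distance the covariance becomes nearly uniform.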

**Table 1 | Critical temperature** *Tc* **and critical exponents ν***d***, α, β, and γ estimated using finite size scaling analysis (FSS) for eight 20-electrode sub-groups in two monkeys (M1, M2) and six 20-sensor sub-groups in three human subjects (H1–H3).**

*Arguments in brackets indicate that Tc and* ν*d were estimated by applying FSS to susceptibility* χ*, specific heat C and order parameter M, respectively.*

We applied the FSS method to this system. To facilitate the analysis, system sizes were set to *n* = 6–10. In **Figures 8C–F**, we plot the goodness of collapse, the estimate of *Tc*, and the critical exponents as a function of ω. We found that, for this model, the deviation from the critical state (ω = ∞) is detectable for ω < 7–8. Given that all *r* ≤ 5, we consider the sensitivity of the FSS for detecting deviations from criticality to be satisfactory. We note that with larger system sizes in the analysis, even higher sensitivity will be achieved. We also compared these results with real data (*n* = 6–10) and found that the results we obtained for cortical activity are very close to a true critical state (**Figures 8C–F**), further supporting the previous results that neuronal avalanches represent a cortical state close to thermodynamic criticality.
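To make the notion of a scaling collapse concrete, the sketch below rescales susceptibility curves for several system sizes and measures how far apart they remain. It is only an illustration: the spread measure, the function names, and the synthetic curves are assumptions made for this example, and the goodness-of-collapse measure actually used for **Figure 8C** is defined in the Methods. Because $L = \sqrt[d]{n}$, only the product ν*d* enters the rescaling.

```python
import numpy as np

def collapse_spread(T, curves, sizes, Tc, nu_d, gamma):
    """Normalised spread between rescaled susceptibility curves (0 = perfect collapse).

    x = n**(1/(nu*d)) * (T - Tc) / Tc   and   y = n**(-gamma/(nu*d)) * chi.
    """
    T = np.asarray(T, dtype=float)
    xs = [n ** (1.0 / nu_d) * (T - Tc) / Tc for n in sizes]
    ys = [n ** (-gamma / nu_d) * np.asarray(c, dtype=float) for n, c in zip(sizes, curves)]
    lo = max(x.min() for x in xs)                 # compare only where all curves overlap
    hi = min(x.max() for x in xs)
    grid = np.linspace(lo, hi, 50)
    resampled = np.array([np.interp(grid, x, y) for x, y in zip(xs, ys)])
    return float(np.mean(np.var(resampled, axis=0)) / np.mean(resampled ** 2))

def synthetic_chi(T, n, Tc=1.0, nu_d=0.85, gamma=1.0):
    """Purely illustrative curve that obeys the scaling form exactly."""
    x = n ** (1.0 / nu_d) * (T - Tc) / Tc
    return n ** (gamma / nu_d) / (1.0 + x ** 2)

T = np.linspace(0.5, 1.5, 401)
sizes = [12, 14, 16, 18, 20]
curves = [synthetic_chi(T, n) for n in sizes]
good = collapse_spread(T, curves, sizes, Tc=1.0, nu_d=0.85, gamma=1.0)
bad = collapse_spread(T, curves, sizes, Tc=1.2, nu_d=0.85, gamma=1.0)
print(good, bad)
```

In practice one would minimise such a spread over *Tc* and the exponents. The example only contrasts the correct parameters with a deliberately wrong *Tc*.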

## **CORRELATION STRUCTURE IN NEURONAL AVALANCHE DYNAMICS**

The results based on this simple model also provide testable predictions for the empirical data. First, if we remove all correlations in activity between cortical sites, the critical behavior observed in the original data should be abolished. To test this prediction, we used independent Poisson processes to generate nLFPs at the empirically measured rate for each cortical site. χ, *C*, and *M* were then calculated as a function of *T* and *n* in the same way as for the original data. As expected, none of the three quantities depended on system size any longer and thus none showed any scaling behavior (**Figure 9**). Another important prediction is that the original data should contain long-range spatial correlations. In **Figure 10**, we plot the correlation *G*, defined as $G_{ij} = \langle\sigma_i\sigma_j\rangle - \langle\sigma_i\rangle\langle\sigma_j\rangle$, as a function of the Euclidean distance *r* between sites *i* and *j* in both linear and log-log coordinates. We found that the correlation decreases slowly with increasing distance and that the rate of decay slows further at larger distances. As a result, for an increase in distance by one order of magnitude, the correlation decreases by less than 50% (**Figures 10A,B**), demonstrating that fluctuations in activity between very distant cortical sites are still correlated. For critical systems, theory predicts that the decay in spatial correlation should be a power law function with an exponent close to zero, which ensures the existence of long-range correlations (Binney et al., 1992). In line with theory, the spatial correlations in monkey 1 and those with distance >1 mm in monkey 2 exhibit a linear tendency in log-log coordinates, with exponents of −0.24 ± 0.05 (**Figures 10C,D**). The 10 × 10 recording array with an interelectrode distance of 0.4 mm limits our investigation of the spatial correlation function to roughly one order of magnitude, from 0.4 to 4.5 mm of distance. 
On the other hand, 4.5 mm already captures a relatively large distance within one cortical area of a macaque's brain. A more definitive conclusion about whether a power law is a good approximation awaits future studies with the capability to record from a much wider spatial extent. It is interesting that the data and the model with ω = ∞ share the same set of critical exponents (**Figures 8E,F**), despite their differences in correlation structure. Whereas *G* was constant in the model (for ω = ∞), it changed systematically as a function of *r* in the data. Consequently, all patterns with the same size were equally probable in the model (Macke et al., 2011), whereas these probabilities differed in the data by up to 2 orders of magnitude. Therefore, the fact that the model and the data share the same set of exponents is non-trivial, suggesting that they belong to the same universality class.
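The pairwise correlation $G_{ij} = \langle\sigma_i\sigma_j\rangle - \langle\sigma_i\rangle\langle\sigma_j\rangle$ and its dependence on distance can be computed along the following lines. This is a minimal sketch with hypothetical function names and toy data. It assumes a binary activity array of shape (time bins, sites) and site coordinates in mm, and it fits the power-law exponent by ordinary least squares in log-log coordinates, which may differ from the fitting procedure behind **Figure 10**.

```python
import numpy as np

def correlation_vs_distance(sigma, coords):
    """Return (r_ij, G_ij) for all site pairs i < j.

    sigma  : array (time bins, sites) of 0/1 activity
    coords : array (sites, 2) of electrode positions
    """
    sigma = np.asarray(sigma, dtype=float)
    coords = np.asarray(coords, dtype=float)
    mean = sigma.mean(axis=0)
    G = sigma.T @ sigma / sigma.shape[0] - np.outer(mean, mean)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))          # Euclidean distance
    iu = np.triu_indices(sigma.shape[1], k=1)      # each pair once
    return r[iu], G[iu]

def powerlaw_exponent(r, G):
    """Slope of log10(G) against log10(r), using pairs with positive G only."""
    keep = G > 0
    slope, _ = np.polyfit(np.log10(r[keep]), np.log10(G[keep]), 1)
    return float(slope)

# Toy data: independent activity on a 3 x 3 grid with 0.4 mm pitch.
rng = np.random.default_rng(0)
coords = np.array([(0.4 * i, 0.4 * j) for i in range(3) for j in range(3)])
sigma = (rng.random((5000, 9)) < 0.1).astype(int)
r, G = correlation_vs_distance(sigma, coords)
print(r.min(), r.max(), np.abs(G).max())
```

For independent toy activity such as this, *G* fluctuates around zero at all distances, which is the behavior expected for the Poisson surrogate rather than for the recorded data.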

## **RELATION BETWEEN THE POWER-LAW SIZE DISTRIBUTION AND THERMODYNAMIC CRITICALITY**

**FIGURE 7 |** *Tc* **and critical exponents α, β, γ, and ν estimated using finite size scaling analysis in two monkeys and three human subjects.** Four (two) different 20-electrode/sensor sub-groups were analyzed for each monkey (human) dataset resulting in the sample size of 8 (6). Values are mean (center circle) ± s.d. (error bars omitted for s.d. smaller than center circle).

The equilibrium critical behavior revealed here is not simply implied by the power-law distributed avalanche sizes. This can be demonstrated by studying the probability *p*<sub>0</sub> of the quiescent state, i.e., the state in which all sites are inactive. This probability is not constrained by the power-law distribution in avalanche patterns (because a size of zero leads to divergence for a power law), but nevertheless is important in order to obtain proper scaling and collapse using FSS. In the original data, *p*<sub>0</sub> decreased in a unique way with increasing system size *n* (**Figure 11**). When *p*<sub>0</sub> was changed randomly with *n*, the functions could no longer be collapsed despite the preservation of the power law in the size distribution (**Figure 12**). Furthermore, we know that a system is not required to have power-law distributed avalanche sizes in order to exhibit features of equilibrium criticality. For example, Macke et al. (2011) have shown that for a system with (1) higher order interactions and (2) infinite correlation length, thermodynamic criticality is ensured, regardless of the pattern size distribution. Although the power-law size distribution is not necessarily associated with thermodynamic criticality, by testing a wide range of *T*, we found that the particular value of *T* that minimizes the distance between a power law and the actual distribution is very close to 1 (0.99 ± 0.03; mean ± SD across eight sub-groups from 2 monkeys for the best fitting power law and 1.03 ± 0.10 for the power law with slope −1.5; **Figure 13**), demonstrating that there is a unique "temperature" associated with the avalanche dynamics. Given that there is no trivial relation between the power-law size distribution and the thermodynamic criticality, our finding that cortical dynamics exhibit these two features simultaneously is intriguing.
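The search for the *T* that brings the pattern size distribution closest to a power law can be sketched as follows. The code is illustrative only: function names and the toy pattern set are hypothetical, pattern size is taken as the number of active sites, and the reference is assumed to be a discrete power law with slope −1.5 truncated at the system size. The fitting details behind **Figure 13** may differ.

```python
import numpy as np

def reweight(p1, T):
    """p_i(T) = p_i(1)**(1/T) / Z (single histogram method)."""
    w = np.asarray(p1, dtype=float) ** (1.0 / T)
    return w / w.sum()

def size_distribution(patterns, p):
    """P(s) for s = 1..n; the quiescent pattern (s = 0) is excluded."""
    patterns = np.asarray(patterns)
    p = np.asarray(p, dtype=float)
    sizes = patterns.sum(axis=1)
    n = patterns.shape[1]
    P = np.array([p[sizes == s].sum() for s in range(1, n + 1)])
    return P / P.sum()

def ks_to_power_law(P, slope=-1.5):
    """Kolmogorov-Smirnov distance between P(s) and a truncated power law s**slope."""
    s = np.arange(1, len(P) + 1, dtype=float)
    ref = s ** slope
    ref /= ref.sum()
    return float(np.max(np.abs(np.cumsum(P) - np.cumsum(ref))))

# Toy pattern set: the quiescent state plus one pattern of each size 1..8,
# with probabilities chosen so that sizes follow s**-1.5 at T = 1.
n = 8
patterns = np.vstack([np.zeros(n, dtype=int)] +
                     [np.r_[np.ones(s, dtype=int), np.zeros(n - s, dtype=int)]
                      for s in range(1, n + 1)])
w = np.r_[3.0, np.arange(1, n + 1, dtype=float) ** -1.5]
p1 = w / w.sum()
for T in (0.7, 1.0, 1.3):
    print(T, ks_to_power_law(size_distribution(patterns, reweight(p1, T))))
```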

## **DISCUSSION**

Our results suggest that neuronal avalanches at both mesoscopic and macroscopic scales manifest a cortical state near thermodynamic criticality. The critical exponents found are highly consistent among different subjects and are reasonably consistent across the two different scales and species. Our results are reminiscent of the well-known fact that, near the critical state, emergent behaviors do not depend on the specific microscopic realization of a system and, therefore, a multitude of systems can be categorized into a small number of universality classes based on their critical exponents (Stanley, 1987, 1999; Binney et al., 1992; Sornette, 2006). Our results thus suggest a general principle governing the collective behavior of cortical activities across spatial scales.

#### **METHODOLOGICAL CONSIDERATIONS**

We demonstrated previously that the nLFP correlates with local neuronal synchrony and increased spiking activity from local neuronal populations (Petermann et al., 2009; Yu et al., 2011). However, the exact spatial extent of the LFP is still debated. While some studies suggest that the LFP reflects neuronal activity within the vicinity of the microelectrode (<0.2–0.4 mm radius; Katzner et al., 2009; Xing et al., 2009), there is evidence that even distant (>1 mm) neuronal activity might contribute to the LFP due to volume conduction (e.g., Kajikawa and Schroeder, 2011). Similar concerns apply to MEG signals, as one MEG sensor can detect signals generated by multiple sources. The question thus arises to what extent linear mixing of signals from different sources might affect the results presented in the current study. In general, volume conduction and/or signal mixing cannot produce genuine critical behavior.

elements is 1. **(B)** The covariance of the hidden variables in the DG model, λ, is plotted as a function of the distance, *r*, that separates corresponding elements for different choices of the standard deviation of a Gaussian function, ω. **(C–F)** Goodness of collapse, *Tc* and critical exponents measured for various systems are plotted against ω (open circles). In all systems, λmax and mean event rate were set such that when ω = ∞, the average covariance and the event rate match what we empirically observed for Monkey 1. Corresponding results obtained from actual data for Monkey 1 (averaged across four sub-groups) are shown for comparison (broken lines).

**FIGURE 9 | Shuffled data does not exhibit scaling behavior.** Original data was the same as shown in **Figure 4**. At *T* = 1, we calculated the individual pattern probabilities based on independent Poisson processes to generate nLFPs with the same empirically measured rate for each cortical site. Using the same method applied to original data, we calculate χ, *C*, and *M* as functions of *T*. In contrast to the original data, the curves for systems of different sizes are almost identical for χ **(A)**, *C* **(B),** and *M* **(C)**. For visual clarity, curves with different sizes have different widths.

Criticality relies on long-range correlations that emerge from cascades of local interactions. That is, the activity of unit *A* affects unit *B*, which in turn affects unit *C*, and so on. As a result, the activity of unit *A* will be correlated (with some temporal delay) with a distant unit *X* (Stanley, 1999). If measured interactions arose solely from volume conduction and/or signal mixing, the activity of a local unit would not causally affect nearby units and, therefore, causal chains of interactions could not form. Accordingly, volume conduction and/or linear signal mixing should not lead to the appearance of critical dynamics. We verified this statement by modeling volume conduction in a 10 × 10 array configuration, in which even fairly strong volume conduction fails to reproduce long-range correlations as observed in our neuronal data (see Appendix **Figure A2**). Furthermore, the FSS method we used here to identify criticality is robust to a potential contribution from volume conduction. This can be easily seen in the ring model we used to identify scaling collapse. Introducing volume conduction into the ring model is equivalent to an increase in ω, which controls the spatial extent of covariance between nearby elements. Our simulations demonstrated that even strong volume conduction (ω = 5) failed to produce the critical behavior as observed in our neuronal data (cf. **Figure 8**). These analyses suggest that our conclusions are unlikely to be affected by volume conduction or signal mixing.

**FIGURE 11 | Change in the probability of the quiescent state as a function of system size in the data.** For 4 sub-groups analyzed in monkey 1, probability of the quiescent state measured for the original data (blue) is plotted as a function of system size (from 1 to 20). Probability of the quiescent state measured for corresponding shuffled data (orange) is plotted for comparison. Shuffled data were obtained by randomizing the activity sequence for individual electrodes, which eliminates the correlation among different electrodes but preserves the probability of being active for all electrodes.

**power-law size distribution.** Pattern probabilities of the original data (as shown in **Figure 4**) were modified so that the probability for the quiescent state, $p_0$, was set randomly from a uniform distribution (0, 1) while the probabilities for all other states were renormalized, i.e., $p_i = p_i/(1 - p_0)$. Therefore, the power-law size distribution was preserved. **(A)** Specific heat, *C*, is plotted as a function of *T* for system size *n* = 12–20 (color coded). **(B)** No collapse can be achieved.

1.5 with a step of 0.05. Distribution at *T* = 1 is marked by red. Inset: Kolmogorov–Smirnov distance ($D_{\mathrm{KS}}$, a goodness-of-fit measure) between the actual pattern size distributions and best fitting power law (purple) or power law with slope −1.5 (blue) is minimized for *T* ≈ 1.

A recent study (Mastromatteo and Marsili, 2011) reported that experimental data might falsely imply criticality due to (1) the limitation of finite sampling and (2) the bias introduced when choosing parameters to achieve best accuracy in the inference procedure. However, neither aspect applies to the current study. The pair-wise correlations we observed for nLFPs that constitute neuronal avalanches are within the range of 0.2–0.6 (Pearson's *r*) and, given our sample sizes, the margin of error is <0.05 (95% confidence interval). Therefore, our sample sizes were large enough to infer even lower or higher correlation strengths [indicating larger distances from the critical state, see Mastromatteo and Marsili (2011)], if they actually existed in the system. This suggests that the proximity to a critical state is a true feature of the cortex. Furthermore, in the current analysis, no parameter for analyzing the data was chosen according to the criterion of inference accuracy. Taken together, the current results are robust in light of the known methodological biases.
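As a rough check on how a margin of error below 0.05 for Pearson's *r* relates to sample size, the sketch below uses the standard Fisher *z* transformation to obtain an approximate 95% confidence interval. The sample size of 2000 is a hypothetical value chosen only for illustration, since the actual number of samples is not stated in this section, and the function name is likewise an assumption.

```python
import math

def pearson_ci(r, n_samples, z_crit=1.96):
    """Approximate confidence interval for Pearson's r via the Fisher z transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n_samples - 3)      # standard error in z space
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical sample size; the correlations span the range reported in the text.
for r in (0.2, 0.4, 0.6):
    lo, hi = pearson_ci(r, n_samples=2000)
    print(r, round(lo, 3), round(hi, 3), round((hi - lo) / 2, 3))
```

The approximation assumes independent samples, so temporally correlated recordings would require a correspondingly smaller effective sample size.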

## **SUGGESTIONS OF A NEW UNIVERSALITY CLASS FOR THE RESTING BRAIN**

One of the key steps in our analysis was the use of the single histogram method to infer system behavior for different values of the control parameter *T*. This is a well-established method and has been widely applied to study various empirical systems and models at, or close to, equilibrium (Tkacik et al., 2009; Macke et al., 2011; Stephens et al., 2013). Using the same method, Stephens et al. (2013) recently found that the spatial pattern of natural images contains indications of criticality. Macke et al. (2011) found that if a system exhibits higher-order interactions, its specific heat will diverge as long as the correlation does not decay as a function of the distance. In a study of spiking activity in salamander retina (Tkacik et al., 2009), it was found that the maximal heat capacity increased with system size and that the corresponding *T* (*T*<sub>peak</sub>) approached 1. This was suggested as evidence for criticality (Tkacik et al., 2009). Heat capacity, though, is an extensive quantity and thus an increase in heat capacity with increasing system size is difficult to interpret. It does not necessarily indicate an increase in specific, i.e., normalized, heat capacity. Furthermore, without a sound extrapolation of *T*<sub>peak</sub> for *n* → ∞, it is difficult to give an accurate estimate of *Tc*. In the current study, we took several steps to avoid such ambiguities. First, specific heat *C* was analyzed directly. More importantly, we used FSS to estimate both *Tc* and the critical exponents, providing a quantitative characterization of the system's behavior.

Interestingly, the critical exponents derived for the cortical activities are different from those that are commonly found in physics such as the Ising model, Heisenberg model or Spherical model (Binney et al., 1992). Cortical activity has distinctive features, including a currently unknown dimensionality and a special structure of higher-order interactions (Yu et al., 2011), which may underlie its unique critical exponents. We also notice that the value of β is close to zero, which in some cases indicates that the phase transition is a discontinuous one (Achlioptas et al., 2009). However, recently it was found that some continuous phase transitions have β so close to zero that it is practically indistinguishable from a discontinuous one (Riordan and Warnke, 2011). To further elucidate this issue, future work with approaches that can analyze much larger systems, i.e., larger *n*, would be needed to increase the precision in estimating *Tc* and critical exponents.

## **NON-EQUILIBRIUM AND EQUILIBRIUM PERSPECTIVES OF NEURONAL AVALANCHE DYNAMICS**

Our current approach did not address the organization of activities within individual avalanches. It has been previously demonstrated that such activities can be effectively understood in the framework of a critical branching process (Beggs and Plenz, 2003; Shew et al., 2009, 2011; Friedman et al., 2012; Yang et al., 2012). That approach considers the spatiotemporal organization of events (nLFPs) that occur in an avalanche to be the result of balanced cascades and correctly predicts the power-law distribution in avalanche size with an exponent of −1.5. The critical branching process is a well-studied, non-equilibrium critical condition, which belongs to the universality class of directed percolation (Buice and Cowan, 2007). By collapsing the temporal dimension, we compressed the spatiotemporal pattern of neuronal cascades into spatial-only patterns and thus ignored the non-equilibrium cascading process in our present study. At the same time, we analyzed the ensemble of all cascades as a whole. Thus, our approach focused on the organization of avalanche activities at a different level. In this regard, the current results provide a complementary view to better understand cortical dynamics, suggesting a highly structured, hierarchical organization of cortical activity. We propose that cortical dynamics are organized close to criticality from both the non-equilibrium, branching process perspective and the equilibrium thermodynamic perspective. The former is indicated by a power-law size distribution, whereas the latter is indicated by *Tc* close to 1. Interestingly, recent studies that investigated large scale (across the entire brain) neuronal dynamics have also reported evidence for criticality in an equilibrium as well as non-equilibrium context (Deco and Jirsa, 2012; Haimovici et al., 2013; Shriki et al., 2013). 
Future studies that investigate how the brain can achieve both types of criticality, at different spatial as well as temporal scales, hold great promise to uncover a more complete picture of cortical dynamics.
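The non-equilibrium picture referred to above can be illustrated with a simple branching process in which every active unit triggers, on average, σ units in the next time step. The sketch below is an assumption-laden toy model rather than a model of the recordings: the symbol σ for the branching parameter, the Poisson offspring distribution, the size cap, and the function name are all choices made for this example. At σ = 1 the process is critical and avalanche sizes become heavy-tailed. At σ < 1 avalanches die out quickly.

```python
import numpy as np

def avalanche_sizes(sigma, n_avalanches, max_size=10_000, seed=0):
    """Simulate avalanche sizes of a branching process with Poisson offspring.

    Each avalanche starts from one active unit; the sum of `active` independent
    Poisson(sigma) offspring counts is drawn as a single Poisson(sigma * active).
    """
    rng = np.random.default_rng(seed)
    sizes = np.empty(n_avalanches, dtype=int)
    for k in range(n_avalanches):
        active, size = 1, 1
        while active > 0 and size < max_size:     # stop runaway avalanches at the cap
            active = rng.poisson(sigma * active)
            size += active
        sizes[k] = size
    return sizes

critical = avalanche_sizes(1.0, 2000)
subcritical = avalanche_sizes(0.5, 2000)
print(np.median(critical), critical.max())
print(np.median(subcritical), subcritical.max())
```

Plotting a histogram of `critical` on double-logarithmic axes is one way to compare the simulated tail with the −1.5 slope mentioned in the text.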

For the non-equilibrium critical state characterized by power-law probability distributions, theoretical as well as empirical studies have revealed functional advantages for neuronal information processing (Kinouchi and Copelli, 2006; Rämö et al., 2007; Shew et al., 2009, 2011; Tsubo et al., 2012; Yang et al., 2012). The equilibrium, thermodynamic criticality also has direct functional implications. From an information-theoretical point of view, the maximal specific heat, i.e., maximal variance of log(*pi*), implies the largest dynamic range for population coding (Tkacik et al., 2009; Macke et al., 2011). This is also consistent with the finding that the dynamics of the brain reach the highest signal complexity near the equilibrium criticality (Deco and Jirsa, 2012). The maximal susceptibility has an even more straightforward interpretation: it means that cortical networks have attained the largest sensitivity to small perturbations. This may play an essential role in allowing the organism to detect and respond to subtle environmental changes. Such a high sensitivity of cortical networks has been demonstrated empirically for both spiking activity (Houweling and Brecht, 2007; Huber et al., 2008) and neuronal population activity reflected in the LFPs (Shew et al., 2009). The current results provide new insights into these intriguing phenomena of cortical dynamics.

## **POTENTIAL FUNCTIONAL ROLE OF THE CONTROL PARAMETER T IN THE BRAIN**

In systems studied in statistical mechanics, increasing the temperature *T* drives the system toward a state of higher activity and weaker effective interactions among the system components. Similar changes in activity and interactions have also been observed in the brain, specifically the cortex. For example, an increase in firing rate that is accompanied by a decrease in pairwise correlation has been documented in transitions from a less vigilant state to a more vigilant state, e.g., from sleep to wakefulness (Vyazovskiy et al., 2009; Grosmark et al., 2012) and from an inattentive to an attentive state (Cohen and Maunsell, 2009; Harris and Thiele, 2011; Mitchell et al., 2009). These observations suggest that there might be intrinsic neural mechanisms for adjusting cortical states roughly along the same dimension as changing *T*.

It is well known that neuromodulators, such as acetylcholine (ACh) and dopamine (DA), produce numerous diverse effects at the receptor, synaptic transmission, and single neuron level (Picciotto et al., 2012; Tritsch and Sabatini, 2012). On the other hand, when studying the effect of, e.g., ACh in the context of cortical state changes (Himmelheber et al., 2000; Jones, 2005; Brown et al., 2011), the effects brought about by an increase in the tone of ACh are quite reminiscent of the effects of increasing *T* in our framework. In particular, ACh drives cortical networks toward a state of high activity and weak coupling both *in vitro* (Chiappalone et al., 2007; Pasquale et al., 2008) and *in vivo* (Goard and Dan, 2009; Thiele et al., 2012). Similarly, the neuromodulator dopamine was shown to control neuronal avalanche dynamics via an inverted-U profile typical for the regulation of working memory (Stewart and Plenz, 2006). At moderate dopamine D1-receptor stimulation, neuronal avalanche dynamics were established, whereas lower or higher receptor stimulation abolished avalanche dynamics and reduced the number of local synchronized events, reminiscent of weaker coupling between neurons.

The control parameter *T* might not capture the effects of changing the balance of fast excitation to fast inhibition (E/I) in a network. Experimentally, it has been shown that a proper E/I balance is required to maintain avalanche dynamics in cortical networks (Beggs and Plenz, 2003; Shew et al., 2009, 2011; Yang et al., 2012). Neuronal simulations have demonstrated that such a proper E/I balance, in addition, establishes long-range temporal correlations in the network (Poil et al., 2012) as identified in the human EEG (e.g., Linkenkaer-Hansen et al., 2005; Montez et al., 2009). An increase in excitation, e.g., by reducing inhibition, increases activity. However, it also leads to an increase, not a decrease, in coupling (Shew et al., 2009, 2011).

## **CONCLUDING REMARKS**

By studying neuronal avalanches in non-human primates and human subjects, we demonstrated that ongoing resting activity in the cortex organizes close to a thermodynamic critical point. We derived the cortical equivalents of three quantities, namely susceptibility, specific heat capacity, and an order parameter, that are commonly used in statistical mechanics to capture the behavior of systems near a thermodynamic critical point. By investigating the scaling behavior of these quantities, we uncovered a potentially new universality class for the brain and propose that this endows cortical networks with maximized input sensitivity and dynamic range for representing information. Our results reveal, in a quantitative manner, how the interactions among individual neurons in cortex collectively give rise to emergent behavior that is highly non-trivial. With ever increasing capacity of monitoring activities of large neuronal networks, we anticipate that the framework provided here will be instrumental for understanding how cortical states are regulated through myriads of neuronal interactions to optimize information processing.

## **AUTHOR CONTRIBUTIONS**

Conceived and designed the experiments: Shan Yu, Hongdian Yang, Oren Shriki, and Dietmar Plenz. Performed the experiments: Shan Yu and Oren Shriki. Analyzed the data: Shan Yu, Hongdian Yang, and Oren Shriki. Contributed reagents/materials/analysis tools: Hongdian Yang. Wrote the paper: Shan Yu, Hongdian Yang, Oren Shriki, and Dietmar Plenz.

## **ACKNOWLEDGMENTS**

The authors would like to thank Richard Saunders and Andy Mitz for help with monkey data collection and the MEG Core Facility of the NIMH for help with human data collection. The authors would also like to thank Jayanth R. Banavar, Amos Maritan, Rajarshi Roy, Mauro Copelli, Mikhail Anisimov, Didier Sornette and Woodrow Shew for helpful comments on an early version of the manuscript. This study was supported by the Intramural Research Program of the National Institute of Mental Health, NIH.

## **REFERENCES**


*introduction to the renormalization group*. Oxford: Clarendon Press.

32, 3366–3375. doi: 10.1523/JNEUROSCI.2523-11.2012


3312–3322. doi: 10.1152/jn.00953.2009


in parietal alpha and prefrontal theta oscillations in early-stage Alzheimer disease. *Proc. Natl. Acad. Sci. U.S.A.* 106, 1614–1619. doi: 10.1073/pnas.0811699106


*D: Nonlinear Phenomena* 227, 100–104. doi: 10.1016/j.physd.2006.12.005


*J. Neurosci.* 29, 11540–11549. doi: 10.1523/JNEUROSCI.2573-09.2009

Yang, H., Shew, W. L., Roy, R., and Plenz, D. (2012). Maximal variability of phase synchrony in cortical networks with neuronal avalanches. *J. Neurosci.* 32, 1061–1072. doi: 10.1523/JNEUROSCI.2771-11.2012

Yu, S., Huang, D., Singer, W., and Nikolić, D. (2008). A small world of neuronal synchrony. *Cereb. Cortex* 18, 2891–2901. doi: 10.1093/cercor/bhn047

Yu, S., Yang, H., Nakahara, H., Santos, G. S., Nikolić, D., and Plenz, D. (2011). Higher-order interactions characterized in cortical activity. *J. Neurosci.* 31, 17514–17526. doi: 10.1523/JNEUROSCI.3127-11.2011

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 May 2013; paper pending published: 14 June 2013; accepted: 31 July 2013; published online: 22 August 2013.*

*Citation: Yu S, Yang H, Shriki O and Plenz D (2013) Universal organization of resting brain activity at the thermodynamic critical point. Front. Syst. Neurosci. 7:42. doi: 10.3389/fnsys.2013.00042*

*This article was submitted to the journal Frontiers in Systems Neuroscience*

*Copyright © 2013 Yu, Yang, Shriki and Plenz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**

## **A. EXAMINING THE ASSUMPTIONS ABOUT STATIONARITY AND EQUILIBRIUM**

Thermodynamic equilibrium implies that the macroscopic properties of the system remain stable and do not change with time. As the distribution of avalanche sizes captures the essential feature of cortical dynamics (Beggs and Plenz, 2003; Petermann et al., 2009; Shew et al., 2009, 2011; Yang et al., 2012), we examined the stability of this size distribution. In **Figure 3** (main text), we show that the avalanche size distribution, measured for 10 consecutive, equal-sized segments of recording, is stable across the whole recording period (30 min). To compare this with a true equilibrium condition, we shuffled the original avalanche raster (i.e., randomized its sequence) and repeated the same analysis. The variability of the estimated power law exponent, ε, across all segments is small for both the original and shuffled datasets. *F*-test statistics also revealed no significant difference in the variance of ε between the two conditions (*p* = 0.13), suggesting a stable organization of the system over time.

Secondly, we demonstrate that the data satisfy two crucial criteria that will lead to equilibrium: (1) detailed balance (microreversibility) and (2) accessibility/ergodicity (Binney et al., 1992). Detailed balance is achieved in a system if the following relation holds: $p_i\,p_{i \to j} = p_j\,p_{j \to i}$, where *i* and *j* are possible states (configurations) of the system, $p_i$ is the probability of state *i*, and $p_{i \to j}$ is the transition probability from state *i* to state *j*. For avalanche patterns defined by clustering a period of activity flanked by quiescent periods before and after it (Beggs and Plenz, 2003), it is clear that detailed balance strictly holds for systems of arbitrary size, because under this condition every transition from a quiescent state (i.e., all sites are inactive) to an active state (i.e., at least one of the sites is active) is accompanied by the reverse of that transition. In other words, the system will satisfy $p_q\,p_{q \to i} = p_i\,p_{i \to q}$, where *q* is the quiescent state and *i* is any active state. Such a feature, combined with the fact that all $p_{i \to j} = 0$ when *i* and *j* are both active states, ensures detailed balance.
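The two sides of the detailed balance relation can be estimated directly from a sequence of observed states, because the product of a state probability and a transition probability equals the joint frequency of the corresponding consecutive pair. The sketch below is illustrative: the function name and the toy sequences are hypothetical, and states are assumed to be hashable labels such as integers or tuples of 0/1 values.

```python
import numpy as np
from collections import Counter

def detailed_balance_flows(states):
    """Estimate p_i * p_(i->j) and p_j * p_(j->i) for every observed pair i != j.

    The joint frequency of the consecutive pair (i, j) estimates p_i * p_(i->j).
    Returns two arrays (forward, backward); detailed balance predicts equality.
    """
    states = list(states)
    counts = Counter(zip(states[:-1], states[1:]))
    total = len(states) - 1
    seen, forward, backward = set(), [], []
    for (i, j), c in counts.items():
        if i == j or (j, i) in seen:              # skip self-transitions and duplicates
            continue
        seen.add((i, j))
        forward.append(c / total)
        backward.append(counts.get((j, i), 0) / total)
    return np.array(forward), np.array(backward)

# A back-and-forth sequence satisfies detailed balance; a cyclic one does not.
print(detailed_balance_flows([0, 1, 0, 1, 0]))
print(detailed_balance_flows([0, 1, 2, 0, 1, 2, 0]))
```

Plotting `forward` against `backward` gives the kind of scatter around the equality line that is compared between original and shuffled data in **Figure A1**.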

To study whether detailed balance still holds when we release the constraints set by the rules that identify avalanches, we examined the relation *p<sub>i</sub>p<sub>i→j</sub>* = *p<sub>j</sub>p<sub>j→i</sub>* in the data with quiescent periods removed. In this case, both constraints, i.e., the symmetrical transition between a quiescent state and an active state and the zero transition probability between active states, are removed. In **Figure A1**, we plotted the measured *p<sub>i</sub>p<sub>i→j</sub>* against *p<sub>j</sub>p<sub>j→i</sub>* for systems of different sizes (*n* = 2–5). Overall, the data points are fairly close to the identity line, suggesting that the equality is fulfilled. For comparison, we constructed a shuffled data set in which the sequence of avalanche patterns was randomized. In this shuffled data set, any possible temporal dependency is removed, so it is in a true equilibrium state and therefore fulfills detailed balance. The same analysis was then performed for the shuffled data, and the results are similar to those from the original data, indicating that the deviation from the identity line for the original data is largely due to finite sampling and not to a violation of detailed balance. To quantify this effect, we computed the ratio *r* = *D*<sub>data</sub>/*D*<sub>shuffled</sub>,

**FIGURE A1 | Detailed balance approximately holds for the data with quiescent periods removed.** For differently sized systems (*n* = 2–5), the empirically measured *p<sub>i</sub>p<sub>i→j</sub>* is plotted against *p<sub>j</sub>p<sub>j→i</sub>* for both the original data (*blue*) and shuffled data (*red*). For every size, 100 different systems (i.e., different combinations of electrodes) were analyzed. The solid lines represent equality. *r* is a measure of the distance from equality, relative to that of the shuffled data (see Appendix text A for details). It is represented as mean ± s.d. (across 100 systems). **(A–D)** System size equals 2, 3, 4, and 5, respectively.

where

$$D = \frac{2\left|p_i p_{i \to j} - p_j p_{j \to i}\right|}{p_i p_{i \to j} + p_j p_{j \to i}}\tag{A1}$$

We found that for *n* > 2, this ratio is very close to one (Mann–Whitney U test, *p* > 0.05), indicating that any violation of detailed balance is small enough to be undetectable within the current recording length. Due to the lack of sufficient data, a direct check of detailed balance cannot be performed for larger systems (*n* ≫ 5). However, given the results obtained for *n* = 2–5, and given that exponentially more samples are needed to detect the same level of violation as the system size increases, detailed balance among the active states, i.e., avalanche patterns, should be a good approximation for even larger systems.
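The detailed-balance test based on Eq. A1 can be sketched in a few lines. The following is an illustrative sketch under stated assumptions, not the original analysis code: states are represented as hashable labels, the joint quantity *p<sub>i</sub>p<sub>i→j</sub>* is estimated as the relative frequency of consecutive pairs (*i*, *j*), and *D* is averaged over all observed pairs of distinct states (the averaging scheme and all function names are hypothetical).

```python
import random
from collections import Counter

def joint_transition_probs(states):
    """Estimate p_i * p_{i->j} as the relative frequency of consecutive pairs (i, j)."""
    pairs = Counter(zip(states[:-1], states[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}

def mean_balance_distance(states):
    """Average D (Eq. A1) over unordered pairs of distinct states with an observed transition."""
    joint = joint_transition_probs(states)
    visited = set()
    distances = []
    for (i, j) in joint:
        if i == j or (j, i) in visited:
            continue  # skip self-transitions and pairs already handled in reverse
        visited.add((i, j))
        forward = joint[(i, j)]
        backward = joint.get((j, i), 0.0)
        distances.append(2.0 * abs(forward - backward) / (forward + backward))
    return sum(distances) / len(distances)

def balance_ratio(states, rng):
    """r = D_data / D_shuffled, with the shuffled sequence as finite-sampling baseline."""
    shuffled = list(states)
    rng.shuffle(shuffled)
    return mean_balance_distance(states) / mean_balance_distance(shuffled)

rng = random.Random(1)

# A cyclic sequence 0 -> 1 -> 2 -> 0 ... maximally violates detailed balance.
cyclic = [k % 3 for k in range(3000)]

# Independent draws have no temporal structure, so r should be of order one.
iid = [rng.randrange(4) for _ in range(4000)]

r_cyclic = balance_ratio(cyclic, rng)
r_iid = balance_ratio(iid, rng)
print(r_cyclic, r_iid)
```

In this toy setting the cyclic sequence yields *r* much larger than one, whereas the independent sequence yields a ratio of order one, which is the behavior reported above for the recorded data.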

The accessibility/ergodicity assumption requires that, from any given state, the system should be able to evolve (after a sufficiently long time) to any other state. Although a direct test of ergodicity is not possible due to the limited length of the recording, the power-law distribution of avalanche sizes provides strong empirical evidence to support it. Such a heavy-tailed distribution indicates that even large systems can visit configurations that cover all possible avalanche sizes.

between site *i* and *j*. **(A)** Spatial configuration of the simulated array. **(B–D)** For an example site (the red square in **A**), the mixing weight *w<sub>ij</sub>* (color-coded) is shown for different spatial extents (ω), with reference to the whole array. **(E)** Correlation, *G*, between mixed activities is plotted as a function of the separation distance for different ω (color-coded). To facilitate comparison, *G* is normalized by the correlation between the nearest neighbors, i.e., *r* = 1.

Taken together, various empirical tests strongly suggest that the stationarity and even the equilibrium assumption can be considered a reasonable first approximation for our data.

#### **B. ANALYTICAL DERIVATION OF FINITE SIZE SCALING METHOD**

For readers who are not familiar with finite-size scaling, we illustrate the method using the susceptibility χ as an example. In the vicinity of the critical temperature *T<sub>c</sub>*, χ can be expressed as a function of the correlation length ξ:

$$\chi = \xi^{\gamma/\nu} \tag{A2}$$

In a finite-size system, the correlation length becomes comparable to the system size *L* and therefore has a cutoff. Consequently, χ also has a cutoff. If we use ξ to denote the correlation length in the thermodynamic limit, the cutoff takes place when ξ > *L*. We can then rewrite χ as

$$
\chi = \xi^{\gamma/\nu} \chi_0(L/\xi),
\tag{A3}
$$

which satisfies the conditions above. Then define

$$
\chi_0(x) \sim
\begin{cases}
x^{\gamma/\nu}, & \text{for } x < 1 \\
c, & \text{otherwise, where } c \text{ is a constant.}
\end{cases}
\tag{A4}
$$

Therefore, when the system size is finite,

$$\chi_L = \xi^{\gamma/\nu} (L/\xi)^{\gamma/\nu} = L^{\gamma/\nu},\tag{A5}$$

and the correlation length is comparable to the system size. Otherwise, when the system size is infinite, the correlation length is ξ itself and

$$
\chi = c\xi^{\gamma/\nu}.\tag{A6}
$$

Now we can rewrite the equation to remove ξ, whose exact value we do not know, and to introduce a dimensionless function χ̄(*x*), which will be the scaling function for χ<sub>*L*</sub>. Using ξ ∼ |*t*|<sup>−ν</sup>, so that *L*/ξ = *L*|*t*|<sup>ν</sup>, we obtain

$$\begin{split} \chi &= \xi^{\gamma/\nu} \chi_0 \left( L \left| t \right|^{\nu} \right) \\ &= \xi^{\gamma/\nu} \chi_0 \left[ \left( L^{1/\nu} \left| t \right| \right)^{\nu} \right] \\ &= |t|^{-\gamma} \chi_0 \left[ \left( L^{1/\nu} \left| t \right| \right)^{\nu} \right] \\ &= L^{\gamma/\nu} L^{-\gamma/\nu} |t|^{-\gamma} \chi_0 \left[ \left( L^{1/\nu} \left| t \right| \right)^{\nu} \right] \\ &= L^{\gamma/\nu} \left( L^{1/\nu} \left| t \right| \right)^{-\gamma} \chi_0 \left[ \left( L^{1/\nu} \left| t \right| \right)^{\nu} \right] . \end{split} \tag{A7}$$

Set *x* = *L*<sup>1/ν</sup>|*t*|, which will be the scaling variable:

$$\chi = L^{\gamma/\nu} x^{-\gamma} \chi_0(x^{\nu}) \tag{A8}$$

Define the scaling function χ̄(*x*) = *x*<sup>−γ</sup>χ<sub>0</sub>(*x*<sup>ν</sup>), then

$$\chi = L^{\gamma/\nu} \, \overline{\chi}(L^{1/\nu} \, |t|). \tag{A9}$$

Note when ξ ∼ *L*,

$$\begin{split} \overline{\chi}(x) &= x^{-\gamma} \chi_{0}(x^{\nu}) \\ &= \left(L^{1/\nu} \, |t|\right)^{-\gamma} \chi_{0} \left[\left(L^{1/\nu} \, |t|\right)^{\nu}\right] \\ &= L^{-\gamma/\nu} \, |t|^{-\gamma} \, \chi_{0} \left(L \, |t|^{\nu}\right) \\ &= L^{-\gamma/\nu} \, |t|^{-\gamma} \, c \left(L \, |t|^{\nu}\right)^{\gamma/\nu} \\ &= L^{-\gamma/\nu} \, |t|^{-\gamma} \, c L^{\gamma/\nu} \, |t|^{\gamma} \\ &= c \end{split} \tag{A10}$$

Thus, the scaling function is a constant and independent of the system size.

The scaling function can also be written as

$$\begin{split} \overline{\chi} \left( L^{1/\nu} |t| \right) &= L^{-\gamma/\nu} |t|^{-\gamma} c \left( L |t|^{\nu} \right)^{\gamma/\nu} \\ &= L^{-\gamma/\nu} \left( |t|^{-\nu} \right)^{\gamma/\nu} c \left( \frac{L}{|t|^{-\nu}} \right)^{\gamma/\nu} \end{split} \tag{A11}$$

Recall ξ ∼ |*t*|<sup>−ν</sup>, so we have

$$\overline{\chi}\left(L^{1/\nu}|t|\right) = L^{-\gamma/\nu} \,\xi^{\gamma/\nu} c\left(\frac{L}{\xi}\right)^{\gamma/\nu} \tag{A12}$$

Also recall, when system size is finite,

$$\chi_L = \xi^{\gamma/\nu} (L/\xi)^{\gamma/\nu},\tag{A13}$$

So

$$\overline{\chi}\left(L^{1/\nu}|t|\right) = L^{-\gamma/\nu}\chi_L = c \tag{A14}$$

From Eq. A14, we can measure χ<sub>*L*</sub>(*t*) for various system sizes *L* in a temperature range close to *T<sub>c</sub>*, and rescale χ<sub>*L*</sub>(*t*) by *L*<sup>−γ/ν</sup> for each *L* to obtain the scaling function χ̄(*L*<sup>1/ν</sup>|*t*|), with *L*<sup>1/ν</sup>|*t*| as the scaling variable. If we choose the correct *T<sub>c</sub>*, ν and γ, the scaling functions for different system sizes will fall on the same curve.
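The rescaling procedure can be illustrated with a small numerical sketch. Everything below is a toy example under stated assumptions rather than the procedure applied to the recordings: the susceptibility is generated from a hypothetical scaling function 1/(1 + *x*<sup>γ</sup>), the exponents γ = 7/4 and ν = 1 (the two-dimensional Ising values) are used only for illustration, and the function names are invented for this example. To keep the comparison simple, every system size is sampled at the same values of the scaling variable and only γ is varied in the trial.

```python
GAMMA, NU = 1.75, 1.0  # illustrative exponents (2D Ising values), not fitted to data

def toy_chi(L, t, gamma=GAMMA, nu=NU):
    """Synthetic chi_L(t) = L^(gamma/nu) * f(L^(1/nu)|t|) with hypothetical f(x) = 1/(1+x^gamma)."""
    x = L ** (1.0 / nu) * abs(t)
    return L ** (gamma / nu) / (1.0 + x ** gamma)

def rescale(L, t, chi_L, gamma, nu):
    """Map a measurement (t, chi_L) at size L to collapse coordinates (x, y)."""
    x = L ** (1.0 / nu) * abs(t)
    y = L ** (-gamma / nu) * chi_L
    return x, y

def collapse_spread(curves, gamma, nu):
    """Largest relative spread of rescaled y across sizes at matched sample index."""
    rescaled = {
        L: [rescale(L, t, chi, gamma, nu)[1] for t, chi in points]
        for L, points in curves.items()
    }
    spread = 0.0
    for ys in zip(*rescaled.values()):
        spread = max(spread, (max(ys) - min(ys)) / max(ys))
    return spread

# Sample every size at the same values of x = L^(1/nu)|t|.
x_grid = [0.1 * k for k in range(1, 21)]
curves = {}
for L in (8, 16, 32):
    ts = [x / L ** (1.0 / NU) for x in x_grid]
    curves[L] = [(t, toy_chi(L, t)) for t in ts]

good = collapse_spread(curves, GAMMA, NU)  # correct exponents: curves coincide
bad = collapse_spread(curves, 1.0, NU)     # wrong gamma: curves separate
print(good, bad)
```

With the correct exponents the spread is at the level of floating-point rounding, while a wrong γ leaves the curves for different *L* clearly separated. In practice, *T<sub>c</sub>*, ν and γ would be adjusted until the spread between sizes is minimized.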

## **REFERENCES**

Beggs, J. M., and Plenz, D. (2003). Neuronal avalanches in neocortical circuits. *J. Neurosci.* 23, 11167–11177.

