# Adaptation of short-term plasticity parameters via error-driven learning may explain the correlation between activity-dependent synaptic properties, connectivity motifs and target specificity

^{1}Department Computer Science, University of Sheffield, Sheffield, UK^{2}Theoretical Neurobiology and Neuroengineering Laboratory, Department Biomedical Sciences, University of Antwerp, Antwerp, Belgium^{3}Laboratory of Neural Microcircuitry, Brain Mind Institute, Swiss Federal Institute of Technology of Lausanne, École Polytechnique Fédérale de Lausanne, Switzerland^{4}INSIGNEO Institute for in Silico Medicine, University of Sheffield, Sheffield, UK

The anatomical connectivity among neurons has been experimentally found to be largely non-random across brain areas. This means that certain connectivity motifs occur at a higher frequency than would be expected by chance. Of particular interest, short-term synaptic plasticity properties were found to colocalize with specific motifs: an over-expression of bidirectional motifs has been found in neuronal pairs where short-term facilitation dominates synaptic transmission among the neurons, whereas an over-expression of unidirectional motifs has been observed in neuronal pairs where short-term depression dominates. In previous work we found that, given a network with fixed short-term properties, the interaction between short- and long-term plasticity of synaptic transmission is sufficient for the emergence of specific motifs. Here, we introduce an error-driven learning mechanism for short-term plasticity that may explain how such observed correspondences develop from randomly initialized dynamic synapses. By allowing synapses to change their properties, neurons are able to adapt their own activity depending on an error signal. This results in more rich dynamics and also, provided that the learning mechanism is target-specific, leads to specialized groups of synapses projecting onto functionally different targets, qualitatively replicating the experimental results of Wang and collaborators.

## 1. Introduction

It is the current belief that experiences and memories are registered in long-term stable synaptic changes. Long-term plasticity, and in particular Hebbian learning or Spike-Timing-Dependent-Plasticity (STDP), is a form of unsupervised learning that captures correlations in the neuronal input. Hence, their involvement in, for instance, the development of receptive fields (e.g., Song et al., 2000; Clopath et al., 2010) or memory and associations is long-standing knowledge. However, the variety of different long-term plasticity rules (Markram et al., 2011), indicates that the precise synaptic prescriptions of long-term plasticity mechanisms remain unclear.

On the contrary, short-term plasticity (STP) is well-described (Varela et al., 1997; Markram et al., 1998b; Le Be' and Markram, 2006; Rinaldi et al., 2008; Testa-Silva et al., 2012; Costa et al., 2013; Romani et al., 2013) in the context of specific models (Tsodyks and Markram, 1997; Hennig, 2013; Rotman and Klyachko, 2013). Its role in neuronal computation is assumed to be related to temporal processing, see for instance (Natschläger et al., 2001) or the work by Carvalho and Buonomano (2011), where STP is demonstrated to enhance the discrimination ability of a single neuron (i.e., a tempotron, see Gütig and Sompolinsky, 2006), when presented with forward and reverse patterns. Synapses with STP are also optimal estimators of presynaptic membrane potentials (Pfister et al., 2010).

The investigation of the brain wiring diagram known as *connectomics* has recently made spectacular progress and generated excitement for its perspectives (Seung, 2009). Novel discoveries in molecular biology (Wickersham et al., 2007; Zhang et al., 2007; Lichtman et al., 2008), neuroanatomical methods (Denk and Horstmann, 2004; Chklovskii et al., 2010), electrophysiology (Song et al., 2005; Hai et al., 2010; Perin et al., 2011), and imaging (Friston, 2011; Minderer et al., 2012; Wedeen et al., 2012) have pushed forward the technological limits for ultimate access to neuronal connectivity. The comprehension of this level of organization of the brain (Kandell et al., 2008) is pivotal to understanding the richness of its high-level cognitive, computational and adaptive properties, as well as its dysfunctions.

At the microcircuit level (Binzegger and Douglas, 2004; Grillner et al., 2005; Silberberg et al., 2005; Douglas and Martin, 2007a,b), the non-random features of cortical connectivity have recently raised a lot of interest (Song et al., 2005; Perin et al., 2011). The occurrence of stereotypical connectivity motifs was experimentally demonstrated and, in some cases, accompanied by physiological information on neuronal and synaptic properties (Song et al., 2005; Wang et al., 2006; Silberberg and Markram, 2007; Perin et al., 2011), on activity-dependent short-term and long-term plasticity (Buonomano and Merzenich, 1998) and rewiring (Chklovskii et al., 2004; Le Be' and Markram, 2006). Recent experimental findings obtained in young ferret cortices (Wang et al., 2006) indicate that short-term facilitation and depression correlate to specific connectivity motifs: neurons connected by synapses exhibiting short-term facilitation form predominantly reciprocal (bidirectional) motifs; neurons connected by synapses exhibiting short-term depression form unidirectional motifs. Interestingly, the same overexpression of connectivity motifs has been observed in another brain area, i.e., the excitatory microcircuitry of the olfactory bulb (Pignatelli, 2009).

Earlier work by Vasilaki and Giugliano (2012, 2014) attempted to shed light on this correlation between STP and the observed wiring diagram configuration. They demonstrate that all facilitating or all depressing networks, upon receiving the same wave-like stimulation, give rise to the experimentally observed motifs: bidirectional for facilitating synapses and unidirectional for depressing synapses. This was explained both in the context of mean field analysis and microscopic simulations as a frequency-dependent effect. This is a simple consequence of the type of input (wave like) and the choice of the STDP triplet rule (Pfister and Gerstner, 2006). Differently from the classical pair rule, the triplet rule displays a frequency-dependent behavior, which can explain some experimental results (Sjöström et al., 2001): at low frequencies the rule reveals the classic STDP and, given a wave-like input, it results in unidirectional connectivity (Clopath et al., 2010; Vasilaki and Giugliano, 2014). At high frequencies, however, it reveals “classic Hebb” behavior: neurons that fire together, wire together. Hence, the low firing network develops unidirectional connectivity, while the high firing network develops bidirectional connectivity; for details see (Vasilaki and Giugliano, 2014). However, the observed synaptic development was not associated to any particular type of learning, but was explored as the emerging structure upon receiving a wave like input: what the network learned *per se* in that context was not clear.

With the present work we aim to complement and extend on Vasilaki and Giugliano (2012, 2014). We define a learning model for STP through which a population of neurons can modify its synapses in order to adapt its own activity and then fulfill a given time-varying task. The key idea comes from an optimization perspective: neurons that are able to modify their synapses, for instance making depressing synapses more and more depressing or even turning them into facilitating ones, would allow for much more flexibility and efficacy in signal transmission. A similar argument can be found in Markram et al. (1998a), whereas for earlier but different mechanisms of STP optimization or learning we redirect to Natschläger et al. (2001) and Carvalho and Buonomano (2011).

Then, we construct a typical inverted associative learning problem (Asaad et al., 1998; Fusi et al., 2007; Vasilaki et al., 2009b) where neurons have to learn to respond with high or low frequencies, when presented with the same wave-like input signal. We use this paradigm to show the potential of our model. In particular, not only do we provide an explanation for the correspondence motifs-synaptic properties within the context of learning both STP and STDP (triplet rule) but we also qualitatively capture, for instance, the heterogeneity in synaptic properties observed by Wang et al. (2006).

Moreover, having defined the learning model as a target-specific mechanism, we are able to obtain variability in the short-term profile of synapses innervating functionally different targets. Finally, we show that the learning model can be reduced to a minimal model where only the time constant of recovery from depression τ_{rec} needs to be learnt in order to obtain neurons firing at high or low frequency. Comparing this finding with the results from Carvalho and Buonomano (2011), we suggest that different parameters of the model describing STP might be related to different types of coding.

## 2. Materials and Methods

### 2.1. Single Neuron Model

Each neuron is modeled as in Carvalho and Buonomano (2011): the sub-threshold dynamics of the electrical potential *V*_{i} of the generic neuron *i* are described by the equation:

where *E*_{rev} = 30 mV is the reversal potential and *g*_{L} = 0.1 μS is the leak conductance - both quantities are equal and fixed for all neurons. {*g*_{ij}}_{i,j = 1, … N} is the matrix of conductances and the generic element *g*_{ij} represents the conductance of the synapse going from neuron *j* to neuron *i*. Upon arrival of a presynaptic action potential elicited by neuron *j*, each of the conductances *g*_{ij} with *i* = 1, … *N*, *i* ≠ *j* increases by a quantity *w*_{ij}, called effective synaptic efficacy, and decays exponentially back to zero with a fixed time constant τ_{g} = 10 ms, equal for all synapses:

where *t*^{f}_{j} is the *f*-th spike emitted by neuron *j*. The effective synaptic efficacy depends on both presynaptic and postsynaptic factors:

where *r*_{ij} and *u*_{ij} are the presynaptic variables representing depression and facilitation in the STP model (see Subsection 2.2) and *A*_{ij} is the postsynaptic variable for the maximum synaptic strength (or absolute efficacy), which represents the maximum synaptic response (see Subsection 2.7). If *V*_{i}(*t*) ≥ 1 mV a spike is elicited by neuron *i* and *V*_{i}(*t* + *dt*) is set to 0 for the next *t*_{ref} = 10 ms (refractory period).

### 2.2. STP Model

Short-term synaptic plasticity is described at each synapse through the evolution of two variables, *r*_{ij} and *u*_{ij}, representing the degree of depression and facilitation of the synapse connecting neuron *j* to neuron *i*. The time course of *r*_{ij} and *u*_{ij} is given by the following kinetic equations (Markram et al., 1998b; Maass and Markram, 2002):

*U*_{ij}, τ_{recij} and τ_{facilij} are the parameters of the model and they represent, respectively: fraction of resources used by the first action potential, time constant of recovery from depression and time constant of synaptic facilitation. A learning rule for STP has to allow changes to (at least one of) these parameters. At each synapse, the product of *r*_{ij} and *u*_{ij} determines the presynaptic efficacy.

### 2.3. STDP Model

We use the triplet learning rule defined by Pfister and Gerstner (2006) with hard bounds: maximum weights can only vary in the interval [*A*_{min}, *A*_{max}]. In this model, each neuron has two presynaptic variables *m*^{1}, *m*^{2} and two postsynaptic variables *o*^{1}, *o*^{2}. In the absence of any activity, these variables exponentially decay toward zero with different time constants:

whereas when the neuron elicits a spike they increase by 1:

Then, assuming that neuron *i* fires a spike, the STDP implementation of the triplet rule can be written as follows:

where γ is the learning rate, ϵ is an infinitesimal time constant to ensure that the values of *m*^{2}_{i} and *o*^{2}_{i} used are the ones right before the update due to the spike of neuron *i*, and *A*_{ij} is the maximum strength of the connection from *j* to *i*. Values of STDP amplitudes are taken from Pfister and Gerstner (2006) and are listed in Table 1.

In order to set *A*_{min} we note that if the maximum weights connecting the input neurons to a specific output neuron all collapse to zero in the low firing rate regime, then, in the subsequent high firing rate regime, inputs were not able to “wake up” this neuron: it remained almost silent all the time. To avoid this, we set *A*_{min} = 10^{−3}. With such a small value we can still apply the symmetry measure (Esposito et al., 2014), which assumes *A*_{min} = 0, see Subsection 2.9, to evaluate the symmetry of the network.

### 2.4. Learning Task

Neurons are divided into different populations, each of them is required to fire at one of the two target firing rates: 30 Hz (*high*) or 5 Hz (*low*). To allow the populations to reach their target rate, both short- and long-term plasticity parameters are adapted via error-driven learning (see Subsection 2.6) and, in addition, the maximum synaptic strength is shaped by the STDP triplet rule (see Subsection 2.3).

### 2.5. Input Signal and Input Neurons

In all simulations, the input signal is delivered only to a subset of neurons in the network, which we call input neurons *N*_{in}. Each of these neurons receives a pulse-like stimulus with a fixed frequency ν_{in} = 10 Hz, whose amplitude (2 mV) is chosen to always elicit an action potential in the corresponding input neuron. The stimulus delivery, however, is not synchronous across the input neurons, but it follows a *sequential protocol*: neurons are stimulated one after another with a fixed time delay *t*_{delay} and in a fixed order. We choose *t*_{delay} = (ν_{in} *N*_{in})^{−1} so that neurons that belong to input cyclically receive a stimulus. To further explain this, one may imagine labeling the neurons depending on the order they receive the stimulus, and therefore on the firing order, then have the firing pattern *N*_{1}, *N*_{2}, *N*_{3}, …, *N*_{Nin}, *N*_{1}, *N*_{2}, *N*_{3}, …, *N*_{Nin}, *N*_{1}, …, with each pair of adjacent spikes being separated by a time interval of *t*_{delay}. We can think of the *N*_{in} neurons as if they are organized in a ring and the stimulus as a cyclically traveling wave across this ring. To include the effect of noise, a random Gaussian variable with zero mean and standard deviation equal to 0.1 *t*_{delay} is added to the firing times. The magnitude of the standard deviation is such that there is no inversion in the firing order. With this construction, the stimulus delivered to input neurons can be thought as generated by an external (not explicitly simulated) population of neurons where each external neuron projects only onto one corresponding input neuron.

Note that, by construction, in the absence of any other signal, the firing pattern of the input neurons reflects that of the stimulus. This means that the external signal implicitly fixes a level of minimum activation for the *N*_{in} neurons: their firing rate cannot be smaller than ν_{in}. Due to this constraint, the input neurons, despite being free to change their parameters according to STP learning rules (see Subsection 2.6), are not totally free to regulate their firing activity, which may prevent them from effectively fulfilling the task. The rest of the neurons, instead, are totally free to adapt their activity and are called output neurons. For these reasons, we read out the interesting quantities only from output neurons (we refer to Results and to Figures 1A, 4A for more details on the architecture).

**Figure 1. Single population scenario: network architecture and activity, connectivity and STP parameters adaptation in the output population with ( U, τ_{rec}) learning scheme (Part 1). (A)** Architecture. The learning network (

*green*) is divided into an input region (

*blue*) and an output region (

*red*). Connections (

*black arrows*) are all-to-all and obey both Spike-Timing Dependent Plasticity and rate-dependent Short-Term Plasticity. Input neurons receive an external wave-like stimulus (

*blue dashed arrows*).

**(B)**Mean firing rate of the output population.

*Shaded area*represents standard deviation,

*horizontal dotted gray lines*show the two target firing rates (high = 30 Hz, low = 5 Hz) and

*vertical black arrows*mark the onset of the four dynamic phases alternating the target according to the sequence low-high-low-high.

**(C)**Symmetry measure applied on the connectivity of the output population. In accordance to the target, connectivity switches between unidirectionality (low values) and bidirectionality (high values).

**(D,E)**Mean values of the recovery time constant τ

_{rec}and synaptic utilization

*U*for the synapses projecting onto the output neurons. We observe depression (high values) at low firing rates and facilitation (low values) at high firing rates.

### 2.6. Error-Driven Learning Rule for STP

The task can be formulated as an optimization problem where neurons regulate their own activity in order to minimize the objective function defined as:

where ν_{lim} is the maximum allowed frequency due to the refractory period (ν_{lim} = 1/*t*_{ref}), ν_{targ} is the target firing rate and 〈ν〉 is the mean firing rate across a single population. To calculate firing rates of single neurons ν_{i} we use an exponential moving average with time constant τ_{ν} = 1 *s*:

where $\widehat{{\nu}}$_{i} is the current firing rate, which basically reflects if neuron *i* has fired ($\widehat{{\nu}}$_{i} = 1 Hz) or not ($\widehat{{\nu}}$_{i} = 0 Hz). The population mean firing rate is therefore:

with *N*_{pop} being the size of the population.

By following a standard procedure, learning rules can be derived from Equation (9) by applying the gradient descent method (Hertz et al., 1991). Since the task is not based on single neurons but it involves an entire population, we use a mean-field approach for the derivation of the learning rules. Therefore, from now on in this section, we switch from the above single neuron notation to mean-field variables, by dropping the *ij* indices. It is worth noting that in our formulation the target is achieved not by directly acting on the firing rates, but by tuning the STP parameters, which in turn affects the firing itself. Therefore, 〈ν〉 = 〈ν〉 (*U*, τ_{rec}, τ_{facil}) and by using the chain rule we can formally write the following update rule for each parameter *p*:

where η_{p} is the learning rate, which in principle could be different for each parameter. The form of the function 〈ν〉 (*U*, τ_{rec}, τ_{facil}) can be derived with a semi-heuristic procedure, following (Vasilaki and Giugliano, 2014). Whenever possible, for the mean-field variables we use the same symbols as in Vasilaki and Giugliano (2014) for consistency. Thus, we introduce the mean-field variables *u*, *x*, *U*, and *A*, respectively describing facilitation, depression, synaptic utilization and maximum strength. We assume a threshold-linear gain function between input mean current *h* and output mean firing rate 〈ν〉 = *a* [(*h* − ϑ)]_{+}, for some constants *a*, ϑ. We can then write the dynamic mean-field equations for a population of neurons recurrently connected by short-term synapses as follows (Chow et al., 2005):

where *I*_{ext} is the mean external current and τ is a decaying constant. By imposing equilibrium conditions, ḣ = ẋ = $\dot{{u}}$ = 0, and combining the resulting equations, we can finally write:

Now we observe that by taking the limit *h* → ∞ in *F*(〈ν〉_{h}) we obtain an upper bound for the maximum allowed firing rate 〈ν〉 ≤ $\frac{{A}}{{{\tau}}_{{r}{e}{c}}}$ + *I*_{ext} (for more details see Vasilaki and Giugliano, 2014). We can heuristically turn the above inequality into an equality:

so as by plugging Equation (15) into Equation (12) we can finally obtain an explicit form for the learning rule. In particular, since only one of the three parameters appears in Equation (15), we have a single rule for τ_{rec} only:

Then, according to the above derivation, the only parameter that needs to be learnt is τ_{rec}. Here we adopt the view (Tsodyks and Markram, 1997; Markram et al., 1998b; Thomson, 2000; Chow et al., 2005) that facilitation/depression corresponds to small/large values of τ_{rec} and *U* as well. Therefore, assuming that they apparently play a similar role, we can heuristically take a similar dependence of 〈ν〉 upon *U*: 〈ν〉 = $\frac{{A}}{{U}}$ + *I*_{ext}, which leads us to a similar learning rule:

With the same heuristic argument we can also write down a relation involving τ_{facil}. Indeed, it is well-know that facilitation/depression corresponds to large/small values of τ_{facil}, so we can hypothesize a linear relation, also including the dependence on the maximum strength for similarity with the other parameters. Thus, 〈ν〉 ∝ *A*τ_{facil} + *I*_{ext}, which gives the following learning rule:

Finally, based on the fact that *A* turns out to appear in Equation (15), and supported by experimental results showing an interaction between STP and STDP (Markram et al., 1997; Sjöström et al., 2003), we can also introduce a STP-dependent learning rule for the maximum synaptic strength:

This synaptic modification clearly does not substitute the traditional STDP, since the two rules come from different mechanisms. Rather, we assume they both contribute to maximum weights changes (see Subsection 2.7).

### 2.7. Single Neuron Learning Framework: Combining STDP and STP Learning Models

Equations (16–19) are mean field learning rules for the four parameters τ_{rec}, *U*, τ_{facil}, *A*. It is straightforward to turn them into single neuron online learning rules. From now on, we return to a single neuron notation. Similarly to STDP, we hypothesize that modifications of STP are triggered by postsynaptic events: every time neuron *i* elicits a spike, its current firing rate is updated as well as the mean population firing rate. Neuron *i* can therefore backwards regulate its incoming synapses, through the following set of equations:

The firing event of the neuron *i* also triggers STDP, according with Equation (8). This contribution sums up with the above STP-dependent change, so as the total modification of the maximum synaptic strength is:

Note that when we converted mean field population equations into single neuron equations we kept the population mean firing rate 〈ν〉, instead of turning it into the single rate ν_{i}. This is because the task is defined at a population level. Learning rates of the three STP parameters are chosen to be equal and error-dependent:

with η = 0.1. The learning rate for maximum synaptic strength, instead, is fixed in time and it is the same as the one used for STDP, η_{Aij} ≡ γ.

Now we have four single neuron rules for the STP learning model, plus an equation for STDP and an equation for combining the different rules for the maximum synaptic strength. All these six rules together, Equations (8, 20–24) form a complete learning scheme for each neuron, which is implemented in our simulations. These rules are now local, since their computation takes place separately in each neuron, but receive a global signal encoding for the task performance error.

### 2.8. Investigation of Different Rule Combinations

In the Results section we consider different learning mechanisms: in addition to STDP, that is crucial for the formation of motifs (Vasilaki and Giugliano, 2012, 2014), different combinations of the four STP rules are taken into account while the remaining parameters are kept fixed. At first we allow only two parameters to change: *(i)* τ_{rec}, because Equation (15) implies that for high frequencies this is the only critical parameter for adapting the firing rate of the population, and *(ii) U*, since it was a key parameter adopted in the work in Carvalho and Buonomano (2011). Then, we introduce the STP-dependent rule on the maximum synaptic strength, Equation (23), with the view to observe a more stable learning process. Following this, we also include τ_{facil} in the learning scheme for a full parameter adaptation (*full model*) and finally we investigate the minimal number of parameters that needs to be adapted (*minimal model*), based on Equation (15). Looking for other parameter combinations might not be meaningful, as Equation (15) indicates the key parameters that are involved in changing the mean firing of the population.

### 2.9. Connectivity Analysis

To reveal the type of connectivity in the output population, we use a symmetry index defined as a measure of the symmetry of the connectivity matrix *W* (Esposito et al., 2014):

Here *M* is the number of instances where both *A*_{ij} and *A*_{ji} are zero, i.e., there is no connection between two neurons. Since in our case connections are bounded in the interval [10^{−3}, 1], *M* = 0 all the time. Equation (26) is able to capture the presence of global non-random structures in a network, returning a value included in [0, 1]. Values of *s* close to 1 reflect the presence of a global bidirectional motif, whereas when *s* approaches 0, a unidirectional motif is emerging. Note that, in order to apply the measure Equation (26), we assume that the lower bound for connections is 0. However, the choice of a small value such as 10^{−3} does not affect the measure.

### 2.10. Data Sharing

We provide the scripts that were used to construct the main figures of the paper in the ModelDB database, accession number:169242.

## 3. Results

### 3.1. Single Population with a Time-Varying Task: A Continuum between Facilitation and Depression

First, we apply our learning model to a specific task demonstrating how synapses can change their behavior driven by an external feedback signal. The problem we study is simple: a population of neurons is presented with a stimulus and is required to produce a certain output as a response to that stimulus. Once the learning has been successful, for the same input signal the desirable output changes. In other words, neurons are trained to respond to a change in the associative paradigm (*inverted associative learning problem*), that can be due to, for instance, changes in the environmental conditions.

Let us give a concrete example of an inverted associative learning problem, taken from Asaad et al. (1998). In their work, the authors trained monkeys to associate visual stimuli (pictures) with delayed saccadic movements, left or right, with associations being reversed from time to time. Monkeys had to go beyond learning a single cue-response association: they are required to learn to associate, on alternate blocks, two cue objects with two different saccades. In other words, after having learned the relation {*object A, go right*}, and {*object B, go left*}, the associations were reversed such that now they needed to learn {*object A, go left*} and {*object B, go right*}.

Similar to the (Asaad et al., 1998) experiment, we assume a binary problem, i.e., environmental conditions can change only between two states, and we measure the neurons' activity in terms of firing rate. This means that neurons are initially asked to fire at some rate and, after learning this task, they are asked to fire at a different rate, while keeping the same input signal all the time. Thus, the problem we defined is a simpler version of the monkey experiment, with only a single input. In order to train the neurons on the current associative paradigm, an external global signal is required, that can be considered as an error signal (see Section Methods 2.6 and 2.7).

#### 3.1.1. Problem description and network architecture

We created a learning network of *N* = 40 conductance-based integrate-and-fire neurons (see Section Methods 2.1) all to all connected. Synaptic connections are modified by the STDP triplet rule (Pfister and Gerstner, 2006) and STP is implemented by using the Tsodyks and Markram model (TM model) described in Markram et al. (1998b); Maass and Markram (2002).

Figure 1A shows the network architecture, composed by two non-overlapping regions: a *blue* one with *N*_{in} = 30 neurons receiving the input signal and a *red* one with *N*_{out} = 10 neurons from which we read out the quantities of interest. Note that for clarity, only a few neurons (*black circles*) and connections (*black arrows*) are drawn. The network is therefore formed by two functionally distinct populations, with the input population delivering the stimulus to the output one. Recursive connections are present within each population and across populations, and they are all plastic, in the sense of both long-term and STP. We refer to this architecture as a first or single population scenario.

The input neurons are stimulated one after the other, following a *sequential protocol*, and approximately with the same rate, ν_{in} = 10 Hz. The amplitude of the stimulus is such that input neurons release a spike every time they receive an input (see Section Methods 2.5). This external source can be thought as an additional population of neurons, which we are not simulating here, where each “external” neuron is connected only with a corresponding neuron in the input population by means of a fixed feedforward connection (*blue dashed arrows*).

We hypothesize that the whole learning network (*green* region in Figure 1A) is presented with a sequence of two tasks while the stimulus pattern is kept fixed. The tasks are firing low (5 Hz) and firing high (30 Hz) and the sequence is *low-high-low-high*. Therefore, neurons have to repeatedly learn a new association and forget the previous one in a dynamic context divided in four phases of *t*_{ph} = 100 s. We refer to them as: low 1, high 1, low 2, high 2. As discussed at the beginning of this section, this picture is inspired by a typical inverted associative learning problem: considering the monkey experiment from Asaad et al. (1998) as a metaphor, our scenario provides a simplified version, where instead of having two different inputs, *object A* and *object B*, we have a single input. Indeed, we can think we are presenting the network with only *object A* and while doing this we switch the target association between the two states *go right* and *go left*, which correspond to our low and high firing rate targets. We call the desirable context-dependent target rate, ν_{targ}. As described in Methods, the difference between ν_{targ} and the current firing rate of each population 〈ν〉 is the error signal that, according which our rate-dependent STP causes synapses to adapt their activity.

In all simulations, single neuron parameters {*U*, τ_{rec}, τ_{facil}, *w*}_{ij} are initially drawn from uniform distributions (for *i* ≠ *j*), respectively in [0.05, 0.95], [100, 900] ms, [1, 900] ms, [10^{−3}, 1]. Synaptic variables are initialized at their equilibrium values, i.e., *r*_{ij} = 1 and *u*_{ij} = *U*_{ij}. All the simulations in this subsection use γ = 1 for the high rate regime and γ = 2 for the low rate regime. Values of the parameters are listed in Table 1.

#### 3.1.2. Learning *U* and τ_{rec}

We initially studied the problem with a learning scheme involving *U* and τ_{rec} only, Equations (20, 21), so there is no additional change in maximum synaptic strengths due to STP. Indeed, due to Equation (15) and (Carvalho and Buonomano, 2011), we wanted to test the hypothesis that *U* and τ_{rec} are the only crucial parameters that need to be learnt for adapting the firing rate of a population. The results are displayed in Figures 1B–E, with *vertical black arrows* marking the beginning of each of the four phases, and in Figure 2.

**Figure 2. Single population scenario: STP parameters distribution during adaptation with ( U, τ_{rec}) learning scheme (Part 2).** Different phases of the dynamics are represented.

**(A)**Initial (uniformly random) condition.

**(B)**End of

*low 1*phase (target rate is 5 Hz).

**(C)**End of

*high 1*phase (30 Hz).

**(D)**End of

*low 2*phase (5 Hz).

**(E)**End of

*high 2*phase (30 Hz).

*Columns 1, 2*Histograms of recovery time constant and synaptic utilization of the synapses projecting onto the output neurons. Low values indicates facilitation whereas high values suggest depression.

*Column 3*Single synapse traces obtained with the TM model by applying a 5 Hz stimulus. Synaptic parameters used are mean values obtained from the distributions drawn in

**(A,B)**. Synapses display a clear alternation between depressing and facilitating behavior.

Figure 1B shows the average firing rate of the *N*_{out} neurons, with *shaded area* being the standard deviation. Target firing rates are show with *gray dotted lines*. The adaptation to the new target is fast, except during the *low 2* phase, when we switch from high to low rate, where an initial fast decrease of the firing rate is followed by a much slower adaptation. Despite the fact that neurons do not reach the target rate during this phase, we observe a monotonically decreasing activity which would eventually stabilize at 5 Hz if we were allowing the simulation to run for longer. The reason for this double slope adaptation will be further discussed now.

Figure 1C shows the evolution of the symmetry index (see Section Methods 2.9). At the beginning, the value reflects the randomness in the connections (the mean value of *s* for a network with uniform random connections is indeed ≃ 0.614, see Esposito et al., 2014), whereas, as learning takes place, we observe the development of unidirectional (low values of *s*) and bidirectional (high values of *s*) motifs, depending on the set target. This can also be formalized by applying the *p*-value hypothesis test obtained by using mean and variance of *s* on a completely random network with uniform distribution of connections (Esposito et al., 2014). *P*-values are shown in Table 2. We, again, observe rather slow dynamics during the *low 2* phase that, within the fixed simulation time, prevent the system from reaching a clear connectivity configuration. However, the trend of *s* clearly shows that the connectivity within the output population is approaching unidirectionality.

**Table 2. Symmetry measure and p-value for the single population scenario with {τ_{rec}, U} and {τ_{rec}, U, A} learning schemes**.

Figures 1D,E depicts the time course of the recovery time constant τ_{rec} and synaptic utilization *U* averaged across the output neurons, with *shaded area* representing standard deviation. Both parameters oscillate between high values, which correspond to depressing behavior, and low values, that indicate facilitation. Note that the dynamics of τ_{rec} and *U* is fast in all phases, the third included. This is not surprising since STP is a fast process and leads to fast adaptation of its parameters. As a result, neurons' response to a change in the target rate takes place in a short time. However, during the *low 2* phase, synaptic parameters saturate before the neurons could fulfill the task, with STDP being the only remaining mechanism through which the output population can regulate its own activity. This results in a much slower decrease toward the target rate for two reasons: *(i)* STDP by itself acts on much longer time scales, *(ii)* switching from high to low rate is the most challenging part of the entire sequence of tasks due to the saturation of the maximum weights in the previous *high 1* phase, which slows down the process even further.

Figure 2 provides additional evidence of the alternation between the two different synaptic behaviors. Plots are organized in five rows, with each row displaying information from a different phase of the simulation. Panel A shows the initial uniform condition, panel B the end of *low 1* phase, etc. For each stage, we draw the histograms of recovery time constant (*Column 1*) and synaptic utilization (*Column 2*). According to the narrow standard deviation observed in Figures 1D,E, distributions peak around extreme values, reflecting two different, synaptic behaviors. *Column 3* in Figure 2, displaying the single synapse traces obtained with a TM model, demonstrates the corresponding behaviors: at the end of the phases where neurons are required to fire low we observe a typical depressing response, whereas at the end of the high rate regimes synapses show a typical facilitating trace. To generate the traces, we used a 5 Hz signal to stimulate a synapse with a parameters given by the mean values obtained from the corresponding histograms. Note that the synaptic trace for the initial condition, i.e., before learning shapes the parameters, already shows depression, which explains why the distributions of τ_{rec} and *U* at the end of the *low 1* phase are much broader than in the following phases.

Altogether, the four panels **(B–E)** in Figure 1 and the five panels **(A–E)** in Figure 2 show that the properties and activity of the output population oscillate between two states and that the desirable structure is formed depending on the target rate. In particular, we observe that neurons that fire at low frequency turn their synaptic properties into depressing and the connections formed are mostly unidirectional. On the other hand, when the target rate is set at a high frequency, neurons develop facilitating synapses and bidirectional connections.

#### 3.1.3. STP-dependent modification of *A* enhances performance

Given the speed convergence issue in the *low 1* phase, we introduced an additional learning mechanism, i.e., the STP-dependent rule for *A*, Equations (23, 24). Indeed, this mechanism provides an additional way, besides the STDP, for regulating the long-term plastic synapses. In all the other aspects, the model remains as above.

Figure 3 shows simulation results, with panels A-D depicting the same quantities as panels B-E in Figure 1 (symbols as before). A direct panel-by-panel comparison shows that the results are very similar, meaning that with this new learning configuration the output population also learns to adapt its synaptic properties in order to fulfill the current task, with subsequent motifs formation. As expected, due to the additional leaning rule for *A*, the dynamics are faster: in particular, during the *low 1* phase, neurons reach the target rate within the simulation time, and the value of the symmetry measure is much lower than before, confirming the formation of a unidirectional motif; compare with Figures 1C, 3B and see Table 2. Note that the adaptation of the STP parameters is also faster, as they depend on the current value of the maximum synaptic strength. Thus, the STP-dependent modification of *A* improves the overall performance and introduces an interesting link between STP and STDP.

**Figure 3. Single population scenario: activity, connectivity and STP parameter adaptation in the output population with ( U, τ_{rec}, A) learning scheme. (A)** Mean firing rate of the output population.

*Shaded area*represents standard deviation,

*horizontal dotted gray lines*show the two target firing rates (high = 30 Hz, low = 5 Hz) and

*vertical black arrows*mark the onset of the four dynamic phases alternating the targets in the sequence low-high-low-high.

**(B)**Symmetry measure applied on the connectivity of the output population. In accordance with the target rate, connectivity switches between unidirectionality (low values) and bidirectionality (high values).

**(C,D)**Mean values of recovery time constant τ

_{rec}and synaptic utilization

*U*for the synapses projecting onto the output neurons. We observe depression (high values) at low firing rates and facilitation (low values) at high firing rates. Compared to Figure 1, we observe an improvement in the overall performance due to the inclusion of the STP-dependent modification of

*A*.

### 3.2. Two Populations with a Different Task: Synaptic Differentiation

Now we consider a different scenario, which we refer to as the second or double population scenario. The two tasks associated with low and high targets are now simultaneously active and must be learnt by different populations, interacting via lateral connections and receiving the same stimulus source. Reasons are multiple: we want to investigate if our model allows to contemporary encode both associative paradigms, without the need of forgetting one of the two. In addition, we want to study the possibility that target-specific STP emerges as a consequence of the target-dependent learning rules we chose for our model. In particular, we want to test whether our model is able to reproduce existing experimental data, specifically that appearing in Table 1 from the paper by Wang et al. (2006).

#### 3.2.1. Network architecture

The new configuration is depicted in Figure 4A and it is obtained by mirroring the structure of the first scenario and by adding recursive connections between functionally homologous populations. This led to a network of *N* = 80 conductance-based integrate-and-fire neurons, organized in two distinct branches of 40 neurons each, with the first branch required to fire at a high rate (ν = 30 Hz) and the second branch at a low rate (ν = 5 Hz). Both targets remain fixed throughout the entire simulation. Each branch is a replication of the architecture we used previously, i.e., it is formed by an input and an output population recursively connected. Thus, the network is formed by four functionally different populations: ℘^{in}_{1}, ℘^{in}_{2}, ℘^{out}_{1}, ℘^{out}_{2}, with obvious meaning of symbols. Input populations in both branches receive the stimulus from the same source: a single wave-like signal is delivered to the *N*_{input} = 60 neurons with ν = 10 Hz, stimulating one neuron per time (see Section Methods 2.5), first the neurons in ℘^{in}_{1} and then the neurons in ℘^{in}_{2}. All connections are plastic following the STDP triplet rule and TM model for STP.

**Figure 4. Double population scenario: network architecture, activity and connectivity of the output populations with full ( U, τ_{rec}, τ_{facil}, A) learning scheme (Part 1). (A)** Architecture. The previous network is doubled so that there are now four populations: two input regions (

*blue*) and two output regions ℘

^{out}

_{1}, ℘

^{out}

_{2}(

*red*). The four populations are organized in two branches, one required to fire at high rates (30 Hz) and the second at low rates (5 Hz). Within each branch connections are all to all (

*black arrows*) whereas initially weak connections (

*gray arrows*) are present between the two output populations and between the two input populations. Input neurons receive a wave-like stimulus from outside (

*blue dashed arrows*). All synapses obey both Spike-Timing Dependent Plasticity and rate-dependent Short-Term Plasticity.

**(B)**Mean firing rate of the output populations,

*black line*for ℘

^{out}

_{1}, and

*gray line*for ℘

^{out}

_{2}.

*Shaded area*represents standard deviation and

*horizontal dotted gray lines*show the two target firing rates (30 Hz for ℘

^{out}

_{1}, 5 Hz for ℘

^{out}

_{2}).

**(C)**Symmetry measure applied on the connectivity of the output population. Color legend as in

**(B)**. Connectivity evolves differently in the two populations, leading to a bidirectional motif in ℘

^{out}

_{1}and to a unidirectional motif in ℘

^{out}

_{2}.

Lateral connections are present between the inputs ℘^{in}_{1}, ℘^{in}_{2} and between the outputs ℘^{out}_{1}, ℘^{out}_{2}. To stress that they are functionally different, we drew their initial values from a uniform distribution in [10^{−3}, 10^{−1}], but, during the evolution, synapses are allowed to grow up to *A*_{max} = 1 as any other synapse. Furthermore, cross connections between different output and input populations, i.e., between ℘^{in}_{1}, ℘^{out}_{2} and between ℘^{out}_{1}, ℘^{in}_{2} are absent. The rest of the connections - within each population and across populations belonging to the same branch - are drawn from a uniform distribution in [10^{−3}, 1] and they are not allowed to exceed this interval during the simulation. STP variables are initialized as in the single population scenario and in all the simulations presented in this subsection we used γ = 2 as the learning rate.

#### 3.2.2. Full model: adaptation of *U*, τ_{rec}, τ_{facil} and *A*

We begin by studying the behavior of the *full model*: all four parameters are modified by our rate-dependent STP, Equations (20–24). Taking into account the modifications of all three STP parameters allows us to make a direct comparison with (Wang et al., 2006). Results are displayed in Figures 4B–C and in Figure 5.

**Figure 5. Double population scenario: STP parameters adaptation and final distribution for the output populations with full ( U, τ_{rec}, τ_{facil}, A) learning scheme (Part 1). (A–C)** Mean values of recovery time constant τ

_{rec}, facilitation time constant τ

_{facil}and synaptic utilization

*U*.

*Black lines*represent mean values across the synapses projecting onto output population 1 from both output populations, ℘

^{out}

_{1}∪ ℘

^{out}

_{2}→ ℘

^{out}

_{1}, whereas

*gray lines*describe the synapses projecting onto output population 2 from both output populations, ℘

^{out}

_{1}∪ ℘

^{out}

_{2}→ ℘

^{out}

_{2}.

*Shaded areas*show standard deviation. We observe that the two populations develop different synaptic types, facilitating for ℘

^{out}

_{1}and depressing for ℘

^{out}

_{2}.

**(D–F)**Corresponding histograms of the three synaptic parameters at the end of the simulation. For each of them we show four different groups of values, mapping qualitatively to the four subtypes identified by Wang et al. (2006), see Table 3.

*Light gray:*℘

^{out}

_{2}→ ℘

^{out}

_{2}(E2a).

*Medium gray:*℘

^{out}

_{1}→ ℘

^{out}

_{2}(E2b).

*Dark gray:*℘

^{out}

_{1}→ ℘

^{out}

_{1}(E1a).

*Black:*℘

^{out}

_{2}→ ℘

^{out}

_{1}(E1b).

**(G–J)**Single synapse traces obtained with the TM model by using a 12 Hz stimulus. Each panel represents a different subtype of synapses.

**(G)**℘

^{out}

_{2}→ ℘

^{out}

_{2}.

**(H)**℘

^{out}

_{1}→ ℘

^{out}

_{2}.

**(I)**℘

^{out}

_{1}→ ℘

^{out}

_{1}.

**(J)**℘

^{out}

_{2}→ ℘

^{out}

_{1}. Synaptic parameters used are the mean values obtained from the distributions drawn in

**(D–F)**. A comparison with (Wang et al., 2006) on the basis of the traces only shows that we are able to identify three of the four subtypes.

Figures 4B,C shows the time course of the mean firing rate and symmetry index in both output populations, *black lines* for ℘^{out}_{1} and *light gray lines* for ℘^{out}_{2}. *Shaded areas* and *dark gray dotted lines* represent standard deviation and target firing rates. Both populations ℘^{out}_{1}, ℘^{out}_{2} approach the target rate while developing specific connectivity: as expected, a bidirectional motif emerges in the population firing at the high rate whereas the population firing at the low rate develops mostly unidirectional connections.

Figures 5A–C shows the time evolution of the three parameters of the TM model: *black lines* and *gray lines* represent the mean value of the synapses projecting from the two output populations ℘^{out}_{1} ∪ ℘^{out}_{2}, respectively onto ℘^{out}_{1} and ℘^{out}_{2}. Shaded area is the standard deviation. As expected from the previous simulation, we observe that the two populations develop different synaptic types: high values of τ_{facil} and low values of τ_{rec} and *U*, as observed in the population firing at the high rate, suggest a facilitating behavior, whereas values as the one observed in ℘^{out}_{2}, characterize depressing synapses. Mean values at the end of the simulation are reported in Table 3 *rows 1,4*. These results show that our model develops target-specific STP and results in good agreement with the data in Wang et al. (2006). Indeed, although single values are not identical, the qualitative synaptic behavior is represented: recalling the notation used in Wang et al. (2006), two main types of synapses are present. The group projecting from ℘^{out}_{1} ∪ ℘^{out}_{2} onto ℘^{out}_{1} can be mapped onto the type *E*1 and the group projecting from ℘^{out}_{1} ∪ ℘^{out}_{2} onto ℘^{out}_{2} that can be mapped onto the type *E*2.

**Table 3. Types and subtypes of excitatory synapses between the two output populations in the full model {τ _{rec}, U, τ_{facil}, A}**.

Following Wang et al. (2006), we can also refine our classification, introducing a further distinction within each class. With this purpose, we show in Figures 5D–F the distributions of τ_{rec}, τ_{facil} and *U* at the end of the simulation within the entire output population ℘^{out}_{1} ∪ ℘^{out}_{2}. For each histogram, data have been divided into four groups, representing the four different subtypes: ℘^{out}_{2} to ℘^{out}_{2} with *light gray*, ℘^{out}_{1} to ℘^{out}_{2} with *medium gray*, ℘^{out}_{1} to ℘^{out}_{1} with *dark gray*, ℘^{out}_{2} to ℘^{out}_{1} with *black*. While the distinction between the two synaptic types mapping onto *E*1 and *E*2 is evident, the difference between two subtypes in the same type cannot be easily seen. However, by looking at the mean values of synaptic parameters in Table 3 *rows 2, 3, 5, 6* and in particular the ratio τ_{rec}/τ_{facil} in Table 3 *column 5*, the distinction into four subtypes becomes more clear. As reported in *column 7* of Table 3, we can map the synaptic subtypes as follows: *E*1*a* corresponds to the group ℘^{out}_{1} → ℘^{out}_{1}, *E*1*b* to ℘^{out}_{2} → ℘^{out}_{1}, *E*2*a* to ℘^{out}_{2} → ℘^{out}_{2} and *E*2*b* to ℘^{out}_{1} → ℘^{out}_{2}.

Finally, similarly to Figure 2, in Figures 5G–J we show single synapse traces for each subtype. We observe that, except for the last trace, different groups effectively show a distinctive response to the same stimulus (12 Hz) and the traces reproduce the ones of the corresponding subtypes in Wang et al. (2006).

Although in Figures 5D–F we present four different histograms for each parameter, we can reason on the overall distribution within the entire output population ℘^{out}_{1} ∪ ℘^{out}_{2} as the sample size is the same in all histograms. We can therefore observe that the distribution of τ_{rec} closely matches that in Wang et al. (2006), whereas the distribution of *U* reproduces the peak at around 0.25 but is less broad. On the other hand, the distribution of τ_{facil} is rather different, being totally shifted toward facilitating values in our case. This may be due to the fact that *U* is much more peaked around low values. We decided then to discard τ_{facil} from the learning scheme and run a simulation where only *U*, τ_{rec} and *A* are learnt, as we did for the single population scenario in subsection 3.1.3. We observed that the behavior of the output populations and all the results remain unchanged. We provide an explanation for this in the Discussion.

#### 3.2.3. A minimal model for rate-dependent STP: adaptation of τ_{rec} and *A*

Finally, we study the *minimal model*: a model that suffices to obtain the desired behaviors by adapting as few parameters as possible. The choice of the parameters to be learnt is naturally suggested by the form of the objective function Equation (9): τ_{rec} and *A*. Interestingly, this minimal model preserves two key features: *(i)* both a presynaptic parameter, τ_{rec}, and a postsynaptic parameter, *A*, participate in learning, *(ii)* STP and STDP are linked to each other through the STP-dependent modification of *A*.

In Figure 6 we show the results of the minimal model: from A to D, respectively: mean output firing rates, symmetry index, τ_{rec} evolution and τ_{rec} distribution in the four groups of synapses. By comparing these panels with the ones from the full model simulation, we observe that output populations still efficiently fulfill the task while developing the expected connectivity motifs. Also, in Table 4 we report the mean values of τ_{rec} for the four groups of synapses that we identified with the full model: there is still a clear distinction between them. We can therefore conclude that this minimal model is sufficient for qualitatively reproducing the main two types and also the subtypes of Wang et al. (2006).

**Figure 6. Double population scenario: learning in the output populations with minimal (τ _{rec}, A) model. (A)** Mean firing rate of the output populations,

*black line*for ℘

^{out}

_{1}and

*gray line*for ℘

^{out}

_{2}.

*Shaded area*represents standard deviation and

*horizontal dotted gray lines*show the two target firing rates (30 Hz for ℘

^{out}

_{1}, 5 Hz for ℘

^{out}

_{2}).

**(B)**Symmetry measure applied on the connectivity of the output population. Color legend as in

**(B)**. Connectivity evolves differently in the two populations, leading to a bidirectional motif in ℘

^{out}

_{1}and to a unidirectional motif in ℘

^{out}

_{2}.

**(C)**Mean value of recovery time constant τ

_{rec}.

*Black line:*℘

^{out}

_{1}∪ ℘

^{out}

_{2}→ ℘

^{out}

_{1}.

*Gray line:*℘

^{out}

_{1}∪ ℘

^{out}

_{2}→ ℘

^{out}

_{2}. We observe that the two populations develop different type of synapses, facilitating for ℘

^{out}

_{1}and depressing for ℘

^{out}

_{2}.

**(D)**Corresponding histograms of the recovery time constant at the end of the simulation.

*Light gray*: ℘

^{out}

_{2}→ ℘

^{out}

_{2},

*medium gray*: ℘

^{out}

_{1}→ ℘

^{out}

_{2},

*dark gray*: ℘

^{out}

_{1}→ ℘

^{out}

_{1},

*black*: ℘

^{out}

_{2}→ ℘

^{out}

_{1}. The panels show that the achievement of the tasks and the differentiation of the synapses is still possible with this minimal model.

**Table 4. Types and subtypes of excitatory synapses between the two output populations in the minimal model (τ _{rec}, A)**.

## 4. Discussion

It is well-known that synapses are activity-dependent connections through which neurons propagate information. STP is a mechanism that describes these phenomena in short time scales and introduces two typical synaptic behaviors: depression and facilitation. Contrary to long-lasting modifications of maximum synaptic strengths, for example STDP, existing models of STP do not rely on any learning mechanisms, apart from very few exceptions; see for instance (Carvalho and Buonomano, 2011). Motivated by their work, it is our belief that more efficient dynamics would be possible if synapses were allowed to change their short-term behavior by tuning their own parameters, depending on one or more external controlling factors, for example, their current task. Typically, one asks which is the firing regime for which a certain type of synapse performs better (Barak and Tsodyks, 2007), whereas we are looking at the picture from a reverse perspective: we want to obtain some frequency regime, which is the most efficient way to do it from a synaptic point of view? A similar concept can be found in Natschläger et al. (2001), where the authors trained a network with a temporal structured target signal, using optimization techniques.

In our work, we developed a learning scheme for STP, and we obtained, with a semi-rigorous argument, a learning rule for only one of the three parameters of the TM model, τ_{rec}. Based on specific experimental results (Tsodyks and Markram, 1997; Markram et al., 1998b; Thomson, 2000) and data fitting (Chow et al., 2005), we used the conjecture that STP behavior of synapses has the same functional dependence on *U* and τ_{rec}, which allowed us to write a similar rule for the synaptic utilization *U*. Interestingly, such learning rules depend on the maximum synaptic strength, and they therefore: *(i)* provided a natural link between STP and STDP and *(ii)* allowed us to derive an STP-dependent rule for the maximum synaptic strength, to be added to the STDP contribution.

The interaction between short- and long-term plasticity is largely supported by experimental evidence (Markram et al., 1997), although the exact mechanisms are still unknown. Some results (Markram and Tsodyks, 1996; Sjöström et al., 2003, 2007) suggest that synapses become more/less depressing after long-term potentiation/depression. Our rules incorporate this behavior: long-term potentiation/depression always produces larger/smaller changes in STP parameters. However, whether these modifications bring more facilitation or depression critically depends on whether the population firing rate 〈ν〉 is approaching the target rate ν_{targ} from above or below. Consider, for example, Equation (16): if ν_{targ} − 〈ν〉 < 0, then long-term potentiation will produce a stronger depression, thus reproducing the experimentally observed behavior. In our simulations, this happens to the neurons that are firing at low frequencies. On the other hand, if ν_{targ} − 〈ν〉 > 0, then an increase in *A* will make τ_{rec} even smaller, resulting in a less depressing synapse. In our simulations, this happens to the neurons that are firing at high frequencies. A similar argument can be formulated for the induction of long-term depression. We note that several mechanisms have been identified to compete during synaptic transmission, resulting in a more complex and less clear relationship between STP and STDP (Sjöström et al., 2007).

In Sjöström et al. (2003, 2007) the authors link the interaction between short- and long-term plasticity with the frequency of firing: at high rates, synapses tend to become stronger and more depressing, while at lower frequencies they tend to become weaker and less depressing. Our derivation, instead, suggests the opposite: if we rely on the hypothesis that large values of τ_{rec} lead to depression and small values to facilitation (Chow et al., 2005), according to Equation (15), facilitating synapses allow neurons to reach higher frequencies. These findings, together with the STDP triplet rule, from the basis of our work: they provide the theoretical basis for the experimentally observed correspondence between facilitation and bidirectionality, and between depression and unidirectionality. The behavior expressed by Equation (15) is experimentally and computationally based on previous work that relates facilitation with high frequency and rate code, and depression with low frequency and temporal code (Fuhrmann et al., 2002; Blackman et al., 2013). This is because, for example, a facilitating synapse may require several spikes to elicit an action potential, meaning that only high frequency stimulation can generate postsynaptic spikes (Matveev and Wang, 2000; Klyachko and Stevens, 2006).

We derived our rules by minimizing an error function that is equal to zero when the target and actual firing rates are equal. Alternatively, we could have defined a reward function opposite to the error function in the sense that for zero error the reward function has its maximum value, and it is equal to zero for large error. We could have then taken the gradient of the reward function instead, bringing the derived rules into the framework of policy gradient learning methods and reinterpreting the feedback signal as a reward signal (Urbanczik and Senn, 2009; Vasilaki et al., 2009a; Richmond et al., 2011). In biological systems, dopamine is thought to act as reward signal (Schultz et al., 1997; Fiorillo et al., 2003), and its role in the context of learning associated with STDP, and more generally with Hebbian learning, has been extensively studied (Tobler et al., 2005; Izhikevich, 2007; Legenstein et al., 2008).

Each of the learning rules we proposed depends, however, on the difference between the target and the actual firing rates, computed at the population level. This implies the presence of: *(i)* a single feedback signal encoding the population activity, which is processed outside the population and broadcasted to all neurons; *(ii)* an external signal bringing information about the current paradigm, i.e., the target firing rate. Similar to Urbanczik and Senn (2009), we can assume that synapses receive both signals via ambient neurotransmitter concentrations, leading to an on-line plasticity rule.

We initially tested our learning scheme by implementing the rules for τ_{rec} and *U* on a classical paradigm of inverting associations: keeping the stimulus fixed and varying the associations, the network had to learn to first make choice A and then unlearn it in favor of choice B. This led to a network able to periodically switch its behavior from depressing to facilitating and vice versa, closely following the change in the association paradigm. Throughout the simulation, the network formed motifs similar to those experimentally observed in Wang et al. (2006) and Pignatelli (2009), with facilitating synapses developing bidirectional motifs and depressing synapses developing unidirectional motifs. The desirable motifs were formed due to two factors: (i) the triplet rule that governed long-term potentiation and (ii) the wave-like input stimulus of the network. The form of the plasticity rule guarantees that when neurons fire at high frequency, the synaptic efficacy increases. Hence, synapses will grow up to their bounds, leading to bidirectional connections. On the contrary, when neurons fire at low frequencies, the synaptic efficacy decreases, yet the wave-like input imposes unidirectional connectivity.

We further extended this learning model by adding an STP-motivated rule for the maximum synaptic strength, and we tested it on the same invert association scenario. Results showed the same behavior as before but with faster dynamics due to the joint action of STP and STDP on the absolute efficacy.

In the second part of the paper, we extended our study. First, we considered two populations that have to fire at different frequencies (low, high). Then, we introduced a learning rule for the facilitating time constant, in order to have a full learning model involving all four parameters. The aim was twofold:

*(i) Comparison of our results with experimental data in Wang et al. (2006)*. Although the accuracy is not excellent, we were able to qualitatively reproduce the basic differentiation in the ranges of values of the STP parameters, reflecting the existence of four different synaptic subtypes. We believe that by further adapting the model, in particular learning rates and target frequencies or by considering other rule combinations, it is possible to obtain different parameter values (in principle an infinite combination of them), and thus possibly reproduce the results of Wang and collaborators even better. However, we think this may not be critical because, as a recent study (Costa et al., 2013) has pointed out, fitting techniques generally used for deriving STP parameters from experimental data may give unreliable results. Given this limitation, we think it is important that our model accounts for a large variety of parameter values in principle, and that in this specific case of Wang et al. (2006) it is able to replicate the basic distinction in the synaptic response.

*(ii) Differentiation of synaptic types innervating two functionally different populations*. The reason for this lies in the way we constructed the learning model: what triggers the synaptic modification is the spike of the postsynaptic neuron. The firing rate of the population to which this postsynaptic neuron belongs is the information used to tune the values of STP parameters. In other words, we implement a target-specific learning mechanism. This choice is based on an optimization argument: the more direct and efficient way for a neuron to influence its own activity through synaptic changes is to modify incoming synapses rather than outgoing synapses. A second scheme, a source-specific learning mechanism modifying the outgoing synapses, would have probably led to the same results within closed microcircuits, but on a much longer time scale.

Our target-specific learning mechanism is also supported by experimental evidence (see Blackman et al., 2013 for a review). Despite the fact that STP seems to be mainly a presynaptic mechanism, it has been shown that the target cells can also determine the STP dynamics. All the studies we are aware of have established such a target specificity only in the context of excitatory cells innervating other excitatory cells on one hand and inhibitory cells on the other, specially interneurons (Markram et al., 1998b; Reyes et al., 1998; Buchanan et al., 2012). It would therefore be interesting to appropriately modify the double population scenario by incorporating a population of inhibitory neurons and comparing the results with existing data. In addition, some authors (Blackman et al., 2013; Costa et al., 2013) suggested that a similar differentiation might exist within excitatory only populations. Having target-specific STP for excitatory-excitatory connections is still an open possibility that needs to be further explored. Here we show from a theoretical point of view that such a differentiation is possible between fundamentally similar (all excitatory) but functionally different (encoding for different paradigms) targets.

The well-established existence of STP-target specificity provides us with a possible biological explanation for the learning rules we derived. Indeed, this scenario requires that the postsynaptic neuron can regulate specifically its own presynaptic compartment only, by a retrograde signal that does not affect neighboring cells. Thus, diffusive retrograde messengers, for example endocannabinoids and nitric oxide, do not appear to be the most suited agents, whereas synaptic adhesion molecules, for example cadherins (Bozdagi et al., 2004) and neuroligins (Dean and Dresbach, 2006), seem to be better candidates for playing this role. These molecules are responsible for governing the presynaptic transmitter release through many different presynaptic mechanisms (Zucker and Regehr, 2002; Blatow et al., 2003; Deng et al., 2011; Blackman et al., 2013).

We underline that the way we obtained the learning rules is based in part on heuristic evaluation. According to Equation (15), derived from a semi-rigorous argument, the key parameters seems to be τ_{rec} and *A*. By also including *U* following Carvalho and Buonomano (2011), we obtain a learning scheme involving τ_{rec}, *U* and *A* only, which we used to study the double population problem and evaluate the importance of τ_{facil}. Results remain essentially unchanged from the full model, suggesting that τ_{facil} does not play a critical role in the task we defined. This is not surprising and the reason is that our rules link facilitation with a high firing rate, and depression with a low firing rate. Indeed, even with a small facilitation time constant (small τ_{facil}), synapses are still able to fire at a high rate, as long as the stimulating frequency is high enough and recovery from depression is fast enough (low τ_{rec}). Therefore, the time constant of recovery from depression seems to be the only parameter regulating the firing frequency of the neuron for high firing rates, exactly as it comes out from the objective function (we recall that Equation 15 comes from an inequality obtained in the limit of high frequency). With our novel view of allowing synapses to modify their properties from facilitating to depressing and vice versa, we therefore suggest that τ_{rec} is the parameter that is mostly related to rate coding, whereas *U* to temporal coding.

This conclusion is also supported by Carvalho and Buonomano (2011). In this paper the authors described a simple problem based on temporal synchrony between two inputs that cannot be solved unless STP is learnt, together with STDP. Besides the long-lasting change in *A*, they introduce a temporal synaptic plasticity for *U* only and they showed that this indeed solves the problem. Also, they reported that changing *U* only was the most efficient way to solve the problem. Our work supports the hypothesis that, when dealing with rate coding tasks, the only necessary parameter that has to be learnt is τ_{rec}, whereas, based on Carvalho and Buonomano (2011), when dealing with temporal coding tasks, the only necessary parameter is *U*.

Another result pointing to a similar direction can be found in Natschläger et al. (2001), where the authors use optimization techniques, rather than explicit learning rules, to train a network of neurons in order to transform a time-varying input into a desired time-varying output. They show that to achieve good performance, one needs to change at least two parameters, either *A* and τ_{rec}, or *A* and *U*. This confirms that learning must involve at least one presynaptic and one postsynaptic parameter, and that τ_{facil} seems not to be relevant in these types of tasks.

We finally presented results from what we call the minimal model, where only τ_{rec} and *A* were allowed to change, since both their corresponding update rules come directly from the gradient of the objective function we defined. Results confirmed our belief, as we were still able to learn the tasks while obtaining results similar to those from Wang and collaborators. It is in agreement with our conjecture that when we tried to apply learning on *U* and *A* only (results now shown here), the network failed to perform its task because the population that was supposed to fire high stabilized at a much lower frequency, i.e., ~ 15 Hz. Therefore, an alternative minimal model adapting *U* and *A* would be able to successfully learn only targets of a lower firing regime. We believe that specialization of parameters in the STP model depending on tasks and signal encoding may be a key ingredient toward a better understanding of synaptic and neuron functionality.

## Author Contributions

All authors provided materials and contributed to writing the article.

## Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Acknowledgments

This work was supported by the Royal Society (travel grant JP091330-2009/R4, http://royalsociety.org) and the European Commission (FP7 Marie Curie Initial Training Network “NAMASEN,” grant n. 264872, http://cordis.europa.eu/fp7). Michele Giugliano acknowledges additional support from the European Commission (FP7 Information and Communication Technologies, Future Emerging Technology programme, “BRAINLEAP” grant n. 306502, http://cordis.europa.eu/fp7/ict/fet-open) and the Flemish agency for Innovation by Science and Technology (grant n. 90455/1955, http://www.iwt.be); Eleni Vasilaki acknowledges additional support from the Engineering and Physical Sciences Research Council (grant n. EP/J019534/1 and e-futures award EFXD12003/EFXD12004, http://efutures.ac.uk/xd). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## References

Asaad, W. F., Rainer, G., and Miller, E. K. (1998). Neural activity in the primate prefrontal cortex during associative learning. *Neuron* 21, 1399–1407. doi: 10.1016/S0896-6273(00)80658-3

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Barak, O., and Tsodyks, M. (2007). Persistent activity in neural networks with dynamic synapses. *PLoS Comput. Biol*. 3:e35. doi: 10.1371/journal.pcbi.0030035

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Binzegger, T., and Douglas, RJ Martin, K. (2004). A quantitative map of the circuit of cat primary visual cortex. *J. Neurosci*. 24, 8441–8453. doi: 10.1523/JNEUROSCI.1400-04.2004

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Blackman, A. V., Abrahamsson, T., Costa, R. P., Lalanne, T., and Sjöström, P. J. (2013). Target-cell-specific short-term plasticity in local circuits. *Front. Synaptic Neurosci* 5:11. doi: 10.3389/fnsyn.2013.00011

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Blatow, M., Caputi, A., Burnashev, N., Monyer, H., and Rozov, A. (2003). Ca2+ buffer saturation underlies paired pulse facilitation in calbindin-d28k-containing terminals. *Neuron* 38, 79–88. doi: 10.1016/S0896-6273(03)00196-X

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Bozdagi, O., Valcin, M., Poskanzer, K., Tanaka, H., and Benson, D. L. (2004). Temporally distinct demands for classic cadherins in synapse formation and maturation. *Mol. Cell. Neurosci*. 27, 509–521. doi: 10.1016/j.mcn.2004.08.008

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Buchanan, K. A., Blackman, A. V., Moreau, A. W., Elgar, D., Costa, R. P., Lalanne, T., et al. (2012). Target-specific expression of presynaptic nmda receptors in neocortical microcircuits. *Neuron* 75, 451–466. doi: 10.1016/j.neuron.2012.06.017

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Buonomano, D. V., and Merzenich, M. (1998). Cortical plasticity: from synapses to maps. *Annu. Rev. Neurosci*. 21, 149–186. doi: 10.1146/annurev.neuro.21.1.149

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Carvalho, T., and Buonomano, D. V. (2011). A novel learning rule for long-term plasticity of short-term synaptic plasticity enhances temporal processing. *Front. Integr. Neurosci*. 5:20. doi: 10.3389/fnint.2011.00020

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Chklovskii, D., Mel, B., and Svoboda, K. (2004). Cortical rewiring and information storage. *Nature* 431, 782–788. doi: 10.1038/nature03012

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Chklovskii, D., Vitaladevuni, S., and Scheffer, L. (2010). Semi-automated reconstruction of neural circuits using electron microscopy. *Curr. Opin. Neurobiol*. 20, 667–675. doi: 10.1016/j.conb.2010.08.002

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Chow, C., Gutkin, B., Hansel, D., Meunier, C., and Dalibard, J. (eds.). (2005). *Methods and Models in Neurophysics*. Chapter Synaptic Dynamics. Les Houchers: Elsevier. 245–266.

Clopath, C., Buesing, L., Vasilaki, E., and Gerstner, W. (2010). Connectivity reflects coding: a model of voltage-based stdp with homeostasis. *Nat. Neurosci*. 13, 344–352. doi: 10.1038/nn.2479

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Costa, R. P., Sjöström, P. J., and van Rossum, M. C. W. (2013). Probabilistic inference of short-term synaptic plasticity in neocortical microcircuits. *Front. Comput. Neurosci*. 7:75. doi: 10.3389/fncom.2013.00075

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Dean, C., and Dresbach, T. (2006). Neuroligins and neurexins: linking cell adhesion, synapse formation and cognitive function. *Trends Neurosci*. 29, 21–29. doi: 10.1016/j.tins.2005.11.003

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Deng, L., Kaeser, P. S., Xu, W., and Sdhof, T. C. (2011). Rim proteins activate vesicle priming by reversing autoinhibitory homodimerization of munc13. *Neuron* 69, 317–331. doi: 10.1016/j.neuron.2011.01.005

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Denk, W., and Horstmann, H. (2004). Serial block-face scanning electron microscopy to reconstruct three-dimensional tissue nanostructure. *PLoS Biol*. 2:e329. doi: 10.1371/journal.pbio.0020329

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Douglas, R., and Martin, K. (2007a). Mapping the matrix: the ways of neocortex. *Neuron* 56, 226–238. doi: 10.1016/j.neuron.2007.10.017

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Douglas, R., and Martin, K. (2007b). Recurrent Neuronal circuits in the neocortex. *Curr. Biol*. 17, R496–R500. doi: 10.1016/j.cub.2007.04.024

Esposito, U., Giugliano, M., van Rossum, M., and Vasilaki, E. (2014). Measuring symmetry, asymmetry and randomness in neural network connectivity. *PLoS ONE* 9:e100805. doi: 10.1371/journal.pone.0100805

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Fiorillo, C. D., Tobler, P. N., and Schultz, W. (2003). Discrete coding of reward probability and uncertainty by dopamine Neurons. *Science* 299, 1898–1902. doi: 10.1126/science.1077349

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Friston, K. (2011). Functional and effective connectivity: a review. *Brain Connect*. 1, 13–36. doi: 10.1089/brain.2011.0008

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Fuhrmann, G., Segev, I., Markram, H., and Tsodyks, M. (2002). Coding of temporal information by activity-dependent synapses. *J. Neurophysiol*. 87, 140–148. doi: 10.1152/jn.00258.2001

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Fusi, S., Asaad, W. F., Miller, E. K., and Wang, X.-J. (2007). A neural circuit model of flexible sensorimotor mapping: learning and forgetting on multiple timescales. *Neuron* 54, 319–333. doi: 10.1016/j.neuron.2007.03.017

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Gütig, R., and Sompolinsky, H. (2006). The tempotron: a neuron that learns spike timing-based decisions. *Nat. Neurosci*. 9, 420–428. doi: 10.1038/nn1643

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Grillner, S., Markram, H., De Schutter, E., Silberberg, G., and LeBeau, F. (2005). Microcircuits in action - from cpgs to neocortex. *Trends Neurosci*. 28, 525–533. doi: 10.1016/j.tins.2005.08.003

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Hai, A., Shappir, J., and Spira, M. (2010). In-cell recordings by extracellular microelectrodes. *Nat. Methods* 7, 200–202. doi: 10.1038/nmeth.1420

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Hennig, M. H. (2013). Theoretical models of synaptic short term plasticity. *Front. Comput. Neurosci*. 7:154. doi: 10.3389/fncom.2013.00154

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Hertz, J., Krough, A., and Palmer, R. (1991). *Introduction to the Theory of Neural Computation*. Redwood City, CA: Addison Wesley.

Izhikevich, E. M. (2007). Solving the distal reward problem through linkage of stdp and dopamine signaling. *Cereb. Cortex* 17, 2443–2452. doi: 10.1093/cercor/bhl152

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Kandell, E., Schwartz, J., and Jessel, A. (2008). *Principles of Neural Science*. New York, NY: McGraw-Hill.

Klyachko, V. A., and Stevens, C. F. (2006). Excitatory and feed-forward inhibitory hippocampal synapses work synergistically as an adaptive filter of natural spike trains. *PLoS Biol*. 4:e207. doi: 10.1371/journal.pbio.0040207

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Le Be', V., and Markram, H. (2006). Spontaneous and evoked synaptic rewiring in the neonatal neocortex. *Proc. Natl. Acad. Sci. U.S.A*. 103, 13214–13219. doi: 10.1073/pnas.0604691103

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Legenstein, R., Pecevski, D., and Maass, W. (2008). A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. *PLoS Comput. Biol*. 4:e1000180. doi: 10.1371/journal.pcbi.1000180

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Lichtman, J., Livet, J., and Sanes, J. (2008). A technicolour approach to the connectome. *Nat. Rev. Neurosci*. 9, 417–422. doi: 10.1038/nrn2391

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Maass, W., and Markram, H. (2002). Synapses as dynamic memory buffers. *Neural Netw*. 15, 155–161. doi: 10.1016/S0893-6080(01)00144-7

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Markram, H., and Tsodyks, M. (1996). Redistribution of synaptic efficacy between neocortical pyramidal neurones. *Nature* 382, 807–809. doi: 10.1038/382807a0

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Markram, H., Lübke, J., Frotscher, M., and Sakmann, B. (1997). Regulation of synaptic efficacy by coincidence of postsynaptic aps and epsps. *Science* 275, 213–215. doi: 10.1126/science.275.5297.213

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Markram, H., Pikus, D., Gupta, A., and Tsodyks, M. (1998a). Potential for multiple mechanisms, phenomena and algorithms for synaptic plasticity at single synapses. *Neuropharmacology* 37:489–500. doi: 10.1016/S0028-3908(98)00049-5

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Markram, H., Wang, Y., and Tsodyks, M. (1998b). Differential signaling via the same axon of neocortical pyramidal neurons. *Proc. Nat. Acad. Sci. U.S.A*. 95, 5323–5328. doi: 10.1073/pnas.95.9.5323

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Markram, H., Gerstner, W., and Sjöström, P. J. (2011). A history of spike-timing-dependent plasticity. *Front. Synaptic Neurosci*. 3:4. doi: 10.3389/fnsyn.2011.00004

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Matveev, V., and Wang, X. J. (2000). Differential short-term synaptic plasticity and transmission of complex spike trains: to depress or to facilitate? *Cereb. Cortex* 10, 1143–1153. doi: 10.1093/cercor/10.11.1143

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Minderer, M., Liu, W., Sumanovski, L., Kügler, S., Helmchen, F., and Margolis, D. (2012). Chronic imaging of cortical sensory map dynamics using a genetically encoded calcium indicator. *J. Physiol*. 590, 99–107. doi: 10.1113/jphysiol.2011.219014

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Natschläger, T., Maass, W., and Zador, A. (2001). Efficient temporal processing with biologically realistic dynamic synapses. *Network* 12, 75–87. doi: 10.1080/net.12.1.75.87

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Perin, R., Berger, T., and Markram, H. (2011). A synaptic organizing principle for cortical neuronal groups. *Proc. Natl. Acad. Sci. U.S.A*. 108, 5419–5424. doi: 10.1073/pnas.1016051108

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Pfister, J.-P., and Gerstner, W. (2006). Triplets of spikes in a model of spike timing–dependent plasticity. *J. Neurosci*. 26, 9673–9682. doi: 10.1523/JNEUROSCI.1425-06.2006

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Pfister, J.-P., Dayan, P., and Lengyel, M. (2010). Synapses with short-term plasticity are optimal estimators of presynaptic membrane potentials. *Nat. Neurosci*. 13, 1271–1275. doi: 10.1038/nn.2640

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Pignatelli, M. (2009). *Structure and Function of the Olfactory Bulb Microcircuit*. Ph.D. thesis, École Polytechnique Fédérale de Lausanne. Available online at: http://library.epfl.ch/en/theses/?nr=4275.

Reyes, A., Lujan, R., Rozov, A., Burnashev, N., Somogyi, P., and Sakmann, B. (1998). Target-cell-specific facilitation and depression in neocortical circuits. *Nat. Neurosci*. 1, 279–285. doi: 10.1038/1092

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Richmond, P., Buesing, L., Giugliano, M., and Vasilaki, E. (2011). Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations. *PLoS ONE* 6:e18539. doi: 10.1371/journal.pone.0018539

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Rinaldi, T., Perrodin, C., and Markram, H. (2008). Hyper-connectivity and hyper-plasticity in the medial prefrontal cortex in the valproic acid animal model of autism. *Front. Neural Circ*. 2:4. doi: 10.3389/neuro.04.004.2008

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Romani, A., Marchetti, C., Bianchi, D., Leinekugel, X., Poirazi, P., Migliore, M., et al. (2013). Computational modeling of the effects of amyloid-beta on release probability at hippocampal synapses. *Front. Comput. Neurosci*. 7:1. doi: 10.3389/fncom.2013.00001

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Rotman, Z., and Klyachko, V. A. (2013). Role of synaptic dynamics and heterogeneity in Neuronal learning of temporal code. *J. Neurophysiol*. 110, 2275–2286. doi: 10.1152/jn.00454.2013

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Schultz, W., Dayan, P., and Montague, P. R. (1997). A neural substrate of prediction and reward. *Science* 275, 1593–1599. doi: 10.1126/science.275.5306.1593

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Seung, H. (2009). Reading the book of memory: sparse sampling reading the book of memory: sparse sampling versus dense mapping of connectomes. *Neuron* 62, 17–29. doi: 10.1016/j.neuron.2009.03.020

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Silberberg, G., and Markram, H. (2007). Disynaptic inhibition between neocortical pyramidal cells mediated by martinotti cells. *Neuron* 53, 735–746. doi: 10.1016/j.neuron.2007.02.012

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Silberberg, G., Grillner, S., LeBeau, F., Maex, R., and Markram, H. (2005). Synaptic pathways in neural microcircuits. *Trends Neurosci*. 28, 541–551. doi: 10.1016/j.tins.2005.08.004

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Sjöström, P.J., Turrigiano, G.G., and Nelson, S.B. (2001). Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. *Neuron* 32, 1149–1164. doi: 10.1016/S0896-6273(01)00542-6

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Sjöström, P. J., Turrigiano, G. G., and Nelson, S. B. (2003). Neocortical ltd via coincident activation of presynaptic nmda and cannabinoid receptors. *Neuron* 39, 641–654. doi: 10.1016/S0896-6273(03)00476-8

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Sjöström, P. J., Turrigiano, G. G., and Nelson, S. B. (2007). Multiple forms of long-term plasticity at unitary neocortical layer 5 synapses. *Neuropharmacology* 52, 176–184. doi: 10.1016/j.neuropharm.2006.07.021

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Song, S., Miller, K., and Abbott, L. (2000). Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. *Nat. Neurosci*. 3, 919–926. doi: 10.1038/78829

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Song, S., Sjöström, P., Reigl, M., Nelson, S., and Chklovskii, D. (2005). Highly nonrandom features of synaptic connectivity in local cortical circuits. *PLoS Biol*. 3:e68. doi: 10.1371/journal.pbio.0030068

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Testa-Silva, G., Loebel, A., Giugliano, M., de Kock, C. P. J., Mansvelder, H. D., and Meredith, R. M. (2012). Hyperconnectivity and slow synapses during early development of medial prefrontal cortex in a mouse model for mental retardation and autism. *Cereb. Cortex* 22, 1333–1342. doi: 10.1093/cercor/bhr224

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Thomson, A. M. (2000). Facilitation, augmentation and potentiation at central synapses. *Trends Neurosci*. 23, 305–312. doi: 10.1016/S0166-2236(00)01580-0

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Tobler, P. N., Fiorillo, C. D., and Schultz, W. (2005). Adaptive coding of reward value by dopamine Neurons. *Science* 307, 1642–1645. doi: 10.1126/science.1105370

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Tsodyks, M., and Markram, H. (1997). The neural code between neocortical pyramidal neurons depends on neurotransmitter release probability. *Proc. Natl. Acad. Sci. U.S.A*. 94, 719–723. doi: 10.1073/pnas.94.2.719

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Urbanczik, R., and Senn, W. (2009). Reinforcement learning in populations of spiking Neurons. *Nat. Neurosci*. 12, 250–252. doi: 10.1038/nn.2264

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Varela, J., Sen, K., Gibson, J., Fost, J., Abbott, L., and Nelson, S. (1997). A quantitative description of short-term plasticity at excitatory synapses in layer 2/3 of rat primary visual cortex. *J. Neurosci*. 17, 7926–7940.

Vasilaki, E., and Giugliano, M. (2012). “Emergence of connectivity patterns from long-term and short-term plasticities,” in *Artificial Neural Networks and Machine Learning ICANN 2012* Vol. 7552 of *Lecture Notes in Computer Science*, eds A. Villa, W. Duch, P. Rdi, F. Masulli, and G. Palm (Berlin; Heidelberg: Springer), 193–200.

Vasilaki, E., and Giugliano, M. (2014). Emergence of connectivity motifs in networks of model neurons withshort- and long-term plastic synapses. *PLoS ONE* 9:e84626. doi: 10.1371/journal.pone.0084626

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Vasilaki, E., Frémaux, N., Urbanczik, R., Senn, W., and Gerstner, W. (2009a). Spike-based reinforcement learning in continuous state and action space: when policy gradient methods fail. *PLoS Comput. Biol*. 5:e1000586. doi: 10.1371/journal.pcbi.1000586

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Vasilaki, E., Fusi, S., Wang, X.-J., and Senn, W. (2009b). Learning flexible sensori-motor mappings in a complex network. *Biol. Cybern*. 100, 147–158. doi: 10.1007/s00422-008-0288-z

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Wang, Y., Markram, H., Goodman, P., Berger, T., Ma, J., and Goldman-Rakic, P. (2006). Heterogeneity in the pyramidal network of the medial prefrontal cortex. *Nat. Neurosci*. 9, 534–542. doi: 10.1038/nn1670

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Wedeen, V., Rosene, D., Wang, R., Dai, G., Mortazavi, F., Hagmann, P., et al. (2012). The geometric structure of the brain fiber pathways. *Science* 335, 1628–1634. doi: 10.1126/science.1215280

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Wickersham, I., Lyon, D. C., Barnard, E., Mori, T., Finke, S., Conzelmann, K., et al. (2007). Monosynaptic restriction of transsynaptic tracing from single, genetically targeted neurons. *Neuron* 53, 639–647. doi: 10.1016/j.neuron.2007.01.033

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Zhang, F., Aravanis, A., Adamantidis, A., de Lecea, L., and Deisseroth, K. (2007). Circuit-breakers: optical technologies for probing neural signals and systems. *Nat. Rev. Neurosci*. 8, 577–581. doi: 10.1038/nrn2192

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Zucker, R., and Regehr, W. (2002). Short-term synaptic plasticity. *Annu. Rev. Physiol*. 64, 355–405. doi: 10.1146/annurev.physiol.64.092501.114547

Pubmed Abstract | Pubmed Full Text | CrossRef Full Text | Google Scholar

Keywords: short-term plasticity, long-term plasticity, learning, rate code, motifs, target-specificity

Citation: Esposito U, Giugliano M and Vasilaki E (2015) Adaptation of short-term plasticity parameters via error-driven learning may explain the correlation between activity-dependent synaptic properties, connectivity motifs and target specificity. *Front. Comput. Neurosci*. **8**:175. doi: 10.3389/fncom.2014.00175

Received: 15 May 2014; Accepted: 31 December 2014;

Published online: 29 January 2015.

Edited by:

Cristina Savin, Institute of Science and Technology Austria, AustriaReviewed by:

Rui Ponte Costa, University of Oxford, UKAlberto Bernacchia, Jacobs University Bremen, Germany

Copyright © 2015 Esposito, Giugliano and Vasilaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Eleni Vasilaki, Department Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP Sheffield, UK e-mail: e.vasilaki@sheffield.ac.uk

## COMMENTARY