**NEUROBIOLOGICAL CIRCUIT FUNCTION AND COMPUTATION OF THE SEROTONERGIC AND RELATED SYSTEMS**

**Topic Editors Kae Nakamura and KongFatt Wong-Lin**

INTEGRATIVE NEUROSCIENCE

### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2015 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

Cover image provided by Ibbl sarl, Lausanne CH

**ISSN** 1664-8714 **ISBN** 978-2-88919-384-4 **DOI** 10.3389/978-2-88919-384-4


# **NEUROBIOLOGICAL CIRCUIT FUNCTION AND COMPUTATION OF THE SEROTONERGIC AND RELATED SYSTEMS**

Topic Editors: **Kae Nakamura,** Kansai Medical University, Japan **KongFatt Wong-Lin,** University of Ulster, UK

Serotonin is one of the oldest neurotransmitters in evolutionary terms, and the serotonergic system is complex and multifaceted. Serotonin-producing neurons in the raphe nuclei provide serotonin innervations throughout various parts of the brain, modulating cellular excitability and network properties of targeted brain areas, and regulating mood, cognition and behavior. Dysfunctions of the serotonergic system are implicated in neuropsychiatric disorders including depression, schizophrenia, and drug abuse. Although the system has been studied for many years, an integrative account of its functions and computational principles remains elusive. This is partly attributed to the high variability and heterogeneity in terms of neuronal properties and receptor types, and its extensive connections with other brain regions. This Frontiers Research Topic e-book is a collection of recent experimental and computational work and approaches at multiple scales that provide the latest information regarding the integrated functions of the serotonergic system. The contributed papers include a variety of experimental and computational work, and human clinical studies.

# Table of Contents


*Exploring the Effects of Depression and Treatment of Depression in Reinforcement Learning*

Pedro Castro-Rodrigues, Albino J. Oliveira-Maia and KongFatt Wong-Lin

*A Dynamic, Embodied Paradigm to Investigate the Role of Serotonin in Decision-Making*

Derrik E. Asher, Alexis B. Craig, Andrew Zaldivar, Alyssa A. Brewer and Jeffrey L. Krichmar

### Functions and computational principles of serotonergic and related systems at multiple scales

#### *Kae Nakamura<sup>1</sup>\* and KongFatt Wong-Lin<sup>2</sup>\**

*<sup>1</sup> Department of Physiology, Kansai Medical University, Osaka, Japan*

*<sup>2</sup> Intelligent Systems Research Centre, School of Computing and Intelligent Systems, University of Ulster, Northern Ireland, L'Derry, UK*

*\*Correspondence: nakamkae@hirakata.kmu.ac.jp; k.wong-lin@ulster.ac.uk*

### *Edited by:*

*Sidney A. Simon, Duke University, USA*

### **Keywords: serotonin 5-HT, neural circuit, computational modeling, dopamine, serotonin, dorsal raphe nucleus, locus coeruleus, pedunculopontine tegmental nucleus**

As one of the phylogenetically and ontogenetically oldest neurotransmitters, the monoamine serotonin (5-HT) is derived from tryptophan in neurons within the raphe nuclei, and innervates various parts of the nervous system (Jacobs and Azmitia, 1992). The serotonergic system is complex and can generate multifarious actions (Barnes and Sharp, 1999; Smythies, 2005). There are seven general families of serotonin receptors with multiple receptor subtypes, all of which are G protein-coupled receptors (GPCRs) except one (5-HT3 receptor), which is a ligand-gated ion channel, and these receptors can modulate the release of many major neurotransmitters such as glutamate, GABA, dopamine, acetylcholine, and norepinephrine (Barnes and Sharp, 1999; Smythies, 2005). It can also modulate neuronal excitability and network properties of many targeted brain areas, and regulate mood, cognition and behavior (Smythies, 2005). Dysfunctions of the serotonergic system are implicated in neuropsychiatric disorders including depression and schizophrenia (Müller and Jacobs, 2009). The serotonergic system has been the target of pharmaceuticals for decades, primarily to treat biological and neuropsychiatric disorders. These include antidepressants, antipsychotics, hallucinogens, antimigraine agents, and gastroprokinetic agents (Nichols and Nichols, 2008). Hence, the study of serotonin has high societal impacts.

Although the serotonergic system has been studied for many years, an integrative account of its underlying functions remains elusive. This could be partly attributed to the high variability and heterogeneity in terms of neuronal properties and receptor subtypes, and its extensive connections with other brain regions. Indeed, it has been claimed that serotonin is involved "in virtually everything, but responsible for nothing" (Jacobs and Fornal, 1995). While there have already been many excellent reviews and books on serotonin and related neural systems (e.g., Jacobs and Azmitia, 1992; Barnes and Sharp, 1999; Smythies, 2005; Müller and Jacobs, 2009), we hope that this collection of recent works provides complementary and updated coverage of their diverse functions. In particular, unlike previous collections, neurobiologically based computational studies are included in this collection, as we consider them important for elucidating some of the underlying principles, especially at the systems level. Hence, we have made a concerted effort to invite both experimental and computational articles in this Research Topic. These works include original results, reviews, and hypotheses over multiple levels: from receptors and channels, to neuronal circuits, and finally to behavior and neuropsychiatric disorders.

At the receptor and cellular levels, Maejima et al. (2013) discussed various GPCRs and ion channels involved in serotonin regulation and introduced optogenetic techniques that modulate intracellular signaling to control the serotonergic system more finely for studies of its functions. The activation of serotonin receptors is determined by the release and uptake dynamics of serotonin. Unlike those of other more commonly studied neurotransmitters such as acetylcholine, the release and uptake dynamics of serotonin are not well characterized. Dankowski and Wightman (2013) reviewed the challenges and developments of fast-scan cyclic voltammetry to monitor serotonin at the subsecond (possibly millisecond) timescale in both *in vitro* and *in vivo* conditions.

At the neuronal circuit level, Celada et al. (2013) provided a comprehensive review on cortical modulation of serotonin. In particular, the prefrontal cortex, linked to executive brain functions, seemed to form closed-loop interactions with the serotonin neurons in the dorsal raphe nucleus. This review was well complemented by biologically realistic computational modeling works of serotonin modulation on the prefrontal cortex. In Wang and Wong-Lin (2013), a biologically motivated model was developed to investigate how the co-modulation of serotonin and dopamine in the prefrontal cortex could result in complex, non-intuitive neuronal circuit dynamics, thus challenging current simpler theories on neuromodulation. Cano-Colino et al. (2013) incorporated serotonin modulation into an established computational model of the prefrontal cortex performing spatial working memory tasks. The model showed that excessive serotonin could impede task performance, and interestingly predicted that serotonin levels could affect neuronal memory fields.

Besides the cortex, serotonin is also known to modulate important subcortical brain regions. Using a mathematical model of multiple brain regions, Reed et al. (2013) demonstrated the potential roles of serotonin in maintaining homeostasis in the basal ganglia (via the frontal cortex) under dopamine depletion (e.g., in Parkinson's disease). In Nakamura (2013), the neural circuit architecture of the dorsal raphe nucleus and other key subcortical brain regions involved in reward-based decision making and learning was discussed. The dorsal raphe nucleus has strong anatomical and functional connectivity with neighboring structures including the pedunculopontine tegmental nucleus (PPTg) and the locus coeruleus (LC), where many cholinergic and noradrenergic neurons are found, respectively (Koyama and Koyama, 1993; Martinez-Gonzalez et al., 2011). Indeed, Okada and Kobayashi (2013) showed that PPTg neurons exhibit similar tracking of future reward expectation as neurons in the dorsal raphe nucleus. Tsuruoka et al. (2012) reviewed the role of the LC in pain control, which might be involved in aversive information processing.

It has been proposed that reinforcement learning models can be used as a platform for studying neurological and neuropsychiatric disorders (Maia and Frank, 2011). In this collection, Herzallah et al. (2013) distinguished among depressed patients with and without antidepressant medication and healthy control subjects by observing their performance in learning from positive (reward) and negative (punishment) feedback. Castro-Rodrigues and Oliveira-Maia (2013) provided a useful commentary on this important original work. Finally, the comprehensive review by Asher et al. (2013) proposed a closed-loop paradigm toward understanding serotonergic roles in decision making by involving behavioral experiments, game theory, computational modeling, and human–robotic interaction, a truly integrative neuroscience approach.
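As a purely illustrative sketch of how learning from positive and negative feedback can be separated in a reinforcement learning model, the Python snippet below uses a simple delta-rule (Rescorla-Wagner style) learner with one learning rate for a rewarded cue and another for a punished cue. This is not the task or the model used by Herzallah et al. (2013); the function name `simulate_learner`, the learning rates, the feedback probability, and the trial count are all hypothetical choices made for illustration.

```python
import random


def simulate_learner(alpha_reward, alpha_punish, n_trials=200,
                     p_feedback=0.8, seed=0):
    """Delta-rule learner with separate learning rates (hypothetical sketch).

    One cue is followed by reward (+1) with probability p_feedback,
    another cue by punishment (-1) with the same probability;
    otherwise the outcome is 0. Returns the two learned cue values.
    """
    rng = random.Random(seed)
    v_reward = 0.0  # learned value of the usually rewarded cue
    v_punish = 0.0  # learned value of the usually punished cue
    for _ in range(n_trials):
        # Positive-feedback trial: update scaled by alpha_reward.
        outcome = 1.0 if rng.random() < p_feedback else 0.0
        v_reward += alpha_reward * (outcome - v_reward)
        # Negative-feedback trial: update scaled by alpha_punish.
        outcome = -1.0 if rng.random() < p_feedback else 0.0
        v_punish += alpha_punish * (outcome - v_punish)
    return v_reward, v_punish


# A learner that updates from both feedback types, and one that is
# (by assumption) insensitive to reward but still learns from punishment.
balanced = simulate_learner(alpha_reward=0.2, alpha_punish=0.2)
reward_insensitive = simulate_learner(alpha_reward=0.0, alpha_punish=0.2)
print("balanced:", balanced)
print("reward-insensitive:", reward_insensitive)
```

In a sketch of this kind, differences between groups would appear as differences in the fitted learning rates; which group shows which pattern is an empirical question that the code does not address.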

We hope that this issue will provide a comprehensive review of the diverse and complex functions and computations of serotonergic and related systems at multiple scales of investigation. We also hope that it will motivate and inspire a more integrative research approach, from the cellular to the systems level, toward understanding neuromodulatory systems.

### **ACKNOWLEDGMENTS**

We would like to thank all the authors for participating, the Frontiers Neuroscience Editorial Office staff for their help, and the chief editor, Sid Simon, for his encouragement. We would also like to thank Sid Simon for comments on this editorial, and the reviewers, whose contributions significantly helped to improve the published papers that constitute this Research Topic.

### **REFERENCES**


*Received: 18 February 2014; accepted: 19 February 2014; published online: 07 March 2014.*

*Citation: Nakamura K and Wong-Lin K (2014) Functions and computational principles of serotonergic and related systems at multiple scales. Front. Integr. Neurosci. 8:23. doi: 10.3389/fnint.2014.00023*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2014 Nakamura and Wong-Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Biological implications of coeruleospinal inhibition of nociceptive processing in the spinal cord

### *Masayoshi Tsuruoka\*, Junichiro Tamaki, Masako Maeda, Bunsho Hayashi and Tomio Inoue*

*Department of Physiology, Showa University School of Dentistry, Tokyo, Japan*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*Kae Nakamura, Kansai Medical University, Japan; Hidemasa Furue, National Institute for Physiological Sciences, Japan*

*\*Correspondence: Masayoshi Tsuruoka, Department of Physiology, Showa University School of Dentistry, 1-5-8 Hatanodai, Tokyo 142-8555, Japan. e-mail: masa@dent.showa-u.ac.jp*

The coeruleospinal inhibitory pathway (CSIP), the descending pathway from the nucleus locus coeruleus (LC) and the nucleus subcoeruleus (SC), is one of the centrifugal pain control systems. This review answers two questions regarding the role coeruleospinal inhibition plays in the mammalian brain. The first is related to an abnormal pain state, such as inflammation. Peripheral inflammation activated the CSIP, and activation of this pathway resulted in a decrease in the extent of the development of inflammatory hyperalgesia. During inflammation, the responses of the dorsal horn neurons to graded heat stimuli in the LC/SC-lesioned rats did not increase further with increasing stimulus intensity in the higher temperature range. These results suggest that the function of the CSIP is to maintain the accuracy of intensity coding in the dorsal horn, because the plateauing of the heat-evoked response in the LC/SC-lesioned rats during inflammation is due to a response saturation that results from the lack of coeruleospinal inhibition. The second concerns attention and vigilance. During freezing behavior induced by air-puff stimulation, nociceptive signals were inhibited by the CSIP. This result implies that the CSIP suppresses the pain system in order to extract other sensory information that is essential for circumstantial judgment.

**Keywords: locus coeruleus/subcoeruleus, coeruleospinal pathway, pain control, peripheral inflammation, startle response, air-puff stimulation, spinal dorsal horn**

### **INTRODUCTION**

It is a general principle that the brain regulates its sensory inputs. This principle applies to all of the somatosensory pathways that have been investigated. The inhibitory regulation of nociceptive inputs is of particular clinical interest because this regulation may lead to a reduction of pain. Inhibitory action on nociceptive processing is accomplished via descending or ascending inhibitory pathways (Horie et al., 1991; Koyama et al., 1995; Willis and Coggeshall, 2004). There is considerable interest in the role of descending inhibitory pathways and the possibility of targeting these pathways for clinical treatments. A number of studies have demonstrated that stimulation at many sites of the brain can produce analgesia by inhibiting nociceptive transmission in the spinal cord (see review by Willis and Coggeshall, 2004).

The descending pathway from the nucleus locus coeruleus (LC) and the nucleus subcoeruleus (SC) is one of the centrifugal pain control systems. The LC/SC provides noradrenergic innervation of the spinal cord (Guyenet, 1980; Westlund et al., 1983, 1984; Fritschy and Grzanna, 1990; Clark and Proudfit, 1991, 1992; Grzanna and Fritschy, 1991; Proudfit and Clark, 1991). Activation of the LC/SC either electrically or chemically can produce profound antinociception (Segal and Sandberg, 1977; Margalit and Segal, 1979; Jones and Gebhart, 1986a; Jones, 1991; West et al., 1993) and can inhibit nociceptive activity in dorsal horn neurons (Hodge et al., 1981; Mokha et al., 1985; Jones and Gebhart, 1986a,b, 1987, 1988). Thus, the coeruleospinal inhibitory pathway (CSIP) appears to play a significant role in spinal nociceptive processing.

During the first decade of the twenty-first century, we were particularly interested in the role of the CSIP in the everyday life of mammals, including its roles in normal and abnormal pain conditions. Based on our experimental results that characterized coeruleospinal inhibition of nociceptive processing in the spinal cord, this review provides an answer to the question regarding the role of coeruleospinal inhibition in the mammalian brain. We hope that our inferences will aid in a better understanding of the role of centrifugal control of sensation.

### **CONTRIBUTION OF THE CSIP TO PAIN CONTROL UNDER AN ABNORMAL PAIN STATE AND ITS BIOLOGICAL IMPLICATIONS**

### **ACTIVATION OF THE CSIP BY PERIPHERAL INFLAMMATION (Tsuruoka and Willis, 1996a,b)**

In our series of studies, inflammatory pain, rather than neuropathic pain, was used as the abnormal pain state. Pain can be divided into three groups on the basis of its source: inflammatory (nociceptive) pain, neuropathic pain, and psychogenic pain. Inflammatory pain is nociceptive pain mediated by nociceptors, neuropathic pain is a pathological pain induced by dysfunction of the peripheral or central nervous system, and psychogenic pain results from psychological causes. We adopted inflammatory pain because peripheral inflammation occurs more frequently in everyday life than the other types of pain.

We compared the development of peripheral hyperalgesia between rats that received bilateral lesions to the LC/SC and sham-operated control animals for 4 weeks after administration of carrageenan (an inflammatory agent) (**Figure 1**). Four hours after the induction of inflammation, the paw withdrawal latencies (PWLs) to heat stimuli in the inflamed paws of the LC-lesioned rats were significantly shorter than those of the sham-operated rats. This result shows that peripheral inflammation activates the CSIP and that the activation of this pathway results in a decrease in the extent of the development of hyperalgesia. The difference in the PWLs between the two groups was not observed at 7 days, whereas edema and hyperalgesia were still present in the inflamed paw. This result suggests that the CSIP is active only in the acute phase of the inflammatory process.

### **A POSSIBLE INTERACTION WITH OPIOID SYSTEMS (Tsuruoka and Willis, 1996b)**

We examined whether coeruleospinal inhibition of nociceptive processing depends on an interaction with other inhibitory systems that involve opioid peptides (**Figures 2**, **3**). In the acute phase of inflammation, systemic administration of naloxone significantly further decreased the PWLs of the LC-lesioned rats, which indicates that opioid inhibitory mechanisms are active in the acute phase of inflammation. This result suggests that the coeruleospinal inhibition system interacts with the opioid inhibitory system. However, systemic naloxone never reversed the nociceptive threshold in sham-operated rats under inflammation, whereas a reversal was observed in LC-lesioned rats. These results indicate that the coeruleospinal inhibitory system is far more dominant than the opioid inhibitory system under an inflammatory pain state.

**FIGURE 2 | The effect of naloxone or saline on PWLs in sham-operated rats tested 4 h, 7 days and 28 days after the injection of carrageenan.** The data were obtained 10 min after intraperitoneal (i.p.) naloxone (*n* = 8) or saline (*n* = 8) and are presented for both the inflamed **(A)** and the contralateral non-inflamed **(B)** paws. -*P* < 0.01, significantly different from PWLs before injection. ∗*P* < 0.01, significantly different between two groups of rats (Tsuruoka and Willis, 1996b).

**FIGURE 3 | The effect of naloxone or saline on PWLs in LC-lesioned rats tested 4 h, 7 days and 28 days after the injection of carrageenan.** The data were obtained 10 min after i.p. naloxone (*n* = 6) or saline (*n* = 6) and are presented for both the inflamed **(A)** and the contralateral non-inflamed **(B)** paws. -*P* < 0.01, significantly different from PWLs before injection. ∗*P* < 0.01, significantly different between two groups of rats (Tsuruoka and Willis, 1996b).

Opioid inhibitory mechanisms were inactive in both the LC-lesioned rats and the sham-operated rats at 7 days, whereas edema and hyperalgesia were still present in the inflamed paws. Comparable data have been obtained in rats with unilateral inflammation in which naloxone induced no significant effect at 1 week, whereas this drug reduced the paw-pressure threshold 24 h after the induction of inflammation (Millan et al., 1988). In both the sham-operated and the LC-lesioned rats, we found that the baseline PWLs of the inflamed paws were prolonged for the rats that had recovered from the inflammation. These analgesic states at 28 days resulted from the activation of endogenous opioid controls, which was apparent following systemic administration of naloxone. This finding is consistent with reports on rats with carrageenan-induced inflammation (Kayser and Guilbaud, 1991) and rats with neuropathic hyperalgesia (Attal et al., 1990), whereby naloxone produced a hyperalgesic effect in rats that had recovered from either inflammation or neuropathic hyperalgesia.

### **THE ROLE OF THE CSIP IN THE INTENSITY CODING OF NOCICEPTIVE SIGNALS UNDER AN ABNORMAL PAIN STATE (Tsuruoka et al., 2003)**

Extracellular recordings were made from neurons in the lumbar enlargement of the spinal cord that had receptive fields on the hindpaws or toes. The neurons included 63 wide-dynamic-range neurons and two high-threshold neurons. These neurons were tested for changes in heat-evoked response during hindpaw inflammation (**Figures 4**, **5**). During inflammation, the responses of the dorsal horn neurons to graded heat stimuli in the LC/SC-lesioned rats did not increase further with increasing stimulus intensity in the higher temperature range (49–53 °C), whereas the responses recorded from the LC/SC-intact rats continued to increase at temperatures of 49 °C or higher. Therefore, it is clear that the plateauing of the heat-evoked response in the LC/SC-lesioned rats during inflammation is due to a response saturation that results from the lack of coeruleospinal inhibition.

… histograms for a neuron in an LC/SC-intact rat. **(B)** Rate histograms for a neuron in an LC/SC-lesioned rat (Tsuruoka et al., 2003).

… rats (*n* = 11). Open circles (○) represent neurons in the LC/SC-lesioned rats (*n* = 20). Neuronal discharges to each temperature in graded heat stimuli (ordinate) are expressed as a percentage of the control. In **(A)** and **(B)**, 100% (control) were discharges to heating at 53 °C in the LC/SC-intact rats. #*P* < 0.05, ##*P* < 0.01, significantly different from the value of the LC/SC-intact rats (ANOVA, with Scheffe's t-test as a post hoc analysis of differences). ∗*P* < 0.05, significantly different between responses to heating at 49 °C and responses to heating at 53 °C (ANOVA, with Scheffe's t-test as a post hoc analysis of differences) (Tsuruoka et al., 2003).

Previous studies have reported that the descending system from the brain stem, including the LC/SC, becomes more active in modulating spinal nociceptive processes during peripheral inflammation (Ren and Dubner, 1996; Wei et al., 1999). In these studies, it has been suggested that the CSIP plays a role in suppressing the hyperexcitability of nociceptive dorsal horn neurons during inflammation. Our study provides additional findings concerning the role of coeruleospinal inhibition in nociception under the condition of inflammation. In intensity coding, the plateauing of the stimulus–response curve of dorsal horn neurons indicates that the accuracy of the transmission of stimulus intensity decreases in the dorsal horn. This implies that differences in stimulus intensity in the higher temperature range cannot be distinguished in the LC/SC-lesioned rats, in which the plateauing of the heat-evoked response was observed. Because the plateauing of the heat-evoked response was not seen in the LC/SC-intact rats, the CSIP activated by peripheral inflammation may be involved in the prevention of the plateauing of the heat-evoked response in the dorsal horn. Activation of the CSIP decreases the responses of dorsal horn neurons to noxious heating so that the heat-evoked responses do not saturate in the higher temperature range, which prevents the plateauing of the heat-evoked response in the dorsal horn. It seems that the function of CSIP activation by peripheral inflammation is to maintain the accuracy of intensity coding in the dorsal horn. Thus, a possible role of CSIP activation by peripheral inflammation is to provide a means to discriminate among differences in the intensity of a painful stimulus in an inflamed region, as well as in the condition without inflammation. It is likely that the CSIP contributes to the discrimination of the intensity of pain sensation under abnormal pain states, such as inflammation.
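The saturation argument above can be made concrete with a toy calculation. The Python sketch below is an illustration under stated assumptions, not a model fitted to the recordings of Tsuruoka et al. (2003): the function `heat_response`, the linear drive above a heat threshold, the response ceiling `r_max`, the gain, and the multiplicative form of descending inhibition are all hypothetical choices. It shows only the logic of the claim: when a hyperexcitable neuron hits its response ceiling, two high temperatures evoke the same response, and scaling the drive down keeps them distinguishable.

```python
def heat_response(temp_c, csip_inhibition, gain=20.0,
                  threshold_c=43.0, r_max=100.0):
    """Toy heat-evoked response (arbitrary units) of a dorsal horn neuron.

    The drive grows linearly above a heat threshold, descending
    inhibition scales the drive down by the fraction csip_inhibition
    (0.0 = no inhibition), and the response cannot exceed r_max.
    All parameter values are hypothetical.
    """
    drive = gain * max(0.0, temp_c - threshold_c)
    inhibited_drive = drive * (1.0 - csip_inhibition)
    return min(r_max, inhibited_drive)  # ceiling models response saturation


def discriminable(curve, t_low, t_high):
    """True if the higher temperature evokes a larger response."""
    return curve[t_high] > curve[t_low]


TEMPS_C = [45, 47, 49, 51, 53]

# "Intact": descending inhibition present (assumed fraction 0.5).
intact = {t: heat_response(t, csip_inhibition=0.5) for t in TEMPS_C}
# "Lesioned": no descending inhibition.
lesioned = {t: heat_response(t, csip_inhibition=0.0) for t in TEMPS_C}

for t in TEMPS_C:
    print(f"{t} C  intact={intact[t]:6.1f}  lesioned={lesioned[t]:6.1f}")

print("intact 49 vs 53 discriminable:", discriminable(intact, 49, 53))
print("lesioned 49 vs 53 discriminable:", discriminable(lesioned, 49, 53))
```

With these assumed parameters, the uninhibited curve reaches the ceiling at 49 °C and stays flat up to 53 °C, whereas the inhibited curve keeps rising across the same range. Other forms of inhibition (for example, subtractive) would give the same qualitative picture as long as they keep the drive below the ceiling.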

### **COERULEOSPINAL INHIBITION ON VISCERAL PAIN PROCESSING AND VISCEROMOTOR REFLEXES (Tsuruoka et al., 2010b)**

Visceral nociceptive signals are subject to coeruleospinal inhibition (Liu et al., 2007). We identified, in rats, dorsal horn neurons whose visceral nociceptive responses were not inhibited by the CSIP (LC/SC-unaffected neurons) (Liu et al., 2008). To determine the possible role of LC/SC-unaffected neurons in pain processing and visceromotor reflexes (muscular defense), we electrically stimulated the descending colon, and simultaneously recorded both the evoked discharge in the ventral posterolateral (VPL) nucleus of the thalamus and the electromyogram (EMG) of the abdominal muscle under halothane anesthesia (**Figures 6**, **7**). It is known that spinothalamic tract neurons that are excited by visceral nociceptive stimuli are located in the dorsal horn and that postsynaptic dorsal column neurons, which conduct visceral nociceptive signals in the dorsal column, are located near the central canal of the spinal cord (Al-Chaer et al., 1996, 1999; Ness, 2000; Palecek et al., 2002, 2003; Willis and Coggeshall, 2004). We clarified that all the LC/SC-unaffected neurons tested were located in the dorsal horn, and none were in the area near the central canal of the spinal cord (Tsuruoka et al., 2008). This result suggests that the LC/SC-unaffected neurons include spinothalamic tract cells. It has been confirmed that the spinothalamic tract neurons are involved in the development of visceromotor reflexes, such as muscular defense (Palecek and Willis, 2003). Thus, the LC/SC-unaffected neurons may be involved in visceromotor reflexes. As seen in **Figures 6**, **7**, the inhibitory effect of LC/SC stimulation differed between the evoked discharge of the VPL and the EMG of the abdominal musculature. The EMG was not completely inhibited even when the stimulus intensity was increased up to 150 μA, whereas the evoked discharge disappeared. If the LC/SC-unaffected neurons are involved in visceromotor reflexes, their presence can explain the fact that visceromotor reflexes are not completely inhibited during activation of the coeruleospinal modulation system. The minimum visceromotor reflex responses (muscular defense) are maintained by the presence of LC/SC-unaffected neurons, which play an important role in protecting visceral organs.

Visceral nociceptive information ascending in the spinothalamic tract subserves multiple functions, acting as components of visceromotor reflexes as well as signals for producing visceral pain (Palecek and Willis, 2003). The location of the LC/SC-unaffected neurons in the spinal cord and the different inhibitory effects of LC/SC stimulation between the evoked discharge and the EMG responses lead to the conclusion that the LC/SC-unaffected neurons contribute to the maintenance of a minimum tonic contraction of the abdominal musculature even when visceral pain is completely inhibited. Considering the role of muscular defense, it is reasonable to assume that some visceral nociceptive neurons are not under the control of the CSIP to prevent the disappearance of muscular defense. Thus, the presence of the LC/SC-unaffected neurons may be advantageous for an individual in an abnormal pain state, such as inflammation.

**FIGURE 6 | Difference in inhibitory effect of LC/SC stimulation between the evoked discharge and the EMG activity. (A)** Stimulation site of the LC/SC (closed circle). **(B)** The EMG activity in the masseter muscle evoked by an increase in the intensity of LC/SC stimulation. **(C)** An example of the effect of graded LC/SC stimulation on the evoked discharge and the EMG activity. Note that LC/SC stimulation at a stimulus intensity below 50 μA never produced EMG activity of the masseter muscle associated with stimulation of the mesencephalic trigeminal nucleus, located just lateral to the LC/SC, and that EMG activity was still observed even when the evoked discharge was completely inhibited by LC/SC stimulation at an intensity over 50 μA. Electrical stimulation of the descending colon is indicated by the arrow (Tsuruoka et al., 2010b).

### **CONTRIBUTION OF THE CSIP TO PAIN CONTROL ACCOMPANIED BY THE MAMMALIAN-STARTLE RESPONSE**

### **INVOLVEMENT OF THE LC/SC IN THE INDUCTION OF FREEZING BEHAVIOR FOLLOWING THE STARTLE REACTION (Tsuruoka et al., 2010a)**

The startle response is an example of a simple behavior in mammals. An air puff is one of the startle-eliciting stimuli (Davis, 1984; Taylor et al., 1991; Knapp and Pohorecky, 1995; Cooke and Graziano, 2003, 2004; Steensland et al., 2005; Lockey et al., 2009). The air-puff-induced startle response is widely used to study the function of the central nervous system (Geyer et al., 1982; Anand et al., 1999; Cooke and Graziano, 2003, 2004), including habituation (Rinaldi and Thompson, 1985), motor and cardiovascular responses (Retting et al., 1986; Woodworth and Johnson, 1988; Casto et al., 1989; Taylor et al., 1991; Zaretsky et al., 2003; de Menezes et al., 2008), and anxiety (Barros and Miczek, 1996). The air-puff startle reaction consists of a marked extension of both forelimbs and hindlimbs or a rapid flexion of the whole body that results in an overall shortening of the body (Cassela and Davis, 1986; Taylor et al., 1991; Knapp and Pohorecky, 1995). Following the startle reaction, rats freeze in a defense response lasting approximately 2–5 s. We designated this freezing behavior as a defensive-like, immobile posture (DIP) on the basis of Davis' study (1984).

**FIGURE 7 | (A)** Localization of LC/SC stimulation sites (*n* = 12). Each closed circle represents one animal. The rostrocaudal extension of the stimulation sites was 9.6 ± 0.1 mm caudal to the bregma. Only data from rats in which the tip of the stimulating electrode was located within the LC/SC were adopted. **(B)** Graphs summarizing the effects of varying stimulus intensities of LC/SC stimulation on either the evoked discharge or the EMG activity (*n* = 8). The evoked discharge and the EMG activity during LC/SC stimulation are expressed as a percentage of the value before LC/SC stimulation (control). Note that the inhibitory effect was different between the evoked discharge and the EMG. ∗*P* < 0.05, significantly different from the control. (Tsuruoka et al., 2010b).

Regression analysis showed a low correlation (R<sup>2</sup> = 0.11) in the relation between the DIP period and the startle magnitude, which suggests that the DIP period is not influenced by the startle magnitude. This finding generates the notion that the DIP is an independent component of the startle response, although the startle reaction and the DIP are continuous behavioral responses in the air-puff startle. In this context, the DIP period might be used as another endpoint for assessing the air-puff startle.
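As an illustration of how a coefficient of determination such as the one reported above is obtained, the following minimal Python sketch fits a least-squares line and computes R<sup>2</sup>. The function name `r_squared` and the sample values are hypothetical and serve only as an example; they are not data from Tsuruoka et al. (2010a), and the sketch does not claim to reproduce the authors' analysis.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x.

    Assumes x and y have equal length and that neither is constant.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and total sums of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not measured data).
startle_magnitude = [1.0, 2.0, 3.0, 4.0, 5.0]
dip_period_s = [3.1, 2.4, 3.9, 2.6, 3.5]
print(round(r_squared(startle_magnitude, dip_period_s), 2))
```

A value of R<sup>2</sup> close to 0 means that the fitted line explains little of the variance in the dependent variable, which is the sense in which a low value is read as a lack of dependence between two endpoints.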

The startle magnitude in the LC/SC-lesioned rats was significantly lower than before the lesions, although the decrease was slight. This result is consistent with a preceding investigation in which an LC lesion reduced the startle reaction in rats (Adams and Geyer, 1981). Bilateral lesions of the LC/SC also produced a significant reduction in the DIP period. These results suggest that the LC/SC is involved in the induction of both the startle reaction and the DIP, although the startle magnitude and the DIP period are independent endpoints for assessment. The reduction of both the startle magnitude and the DIP period in the LC/SC-lesioned rats suggests that the LC/SC exerts an excitatory influence on the air-puff startle.

LC/SC neurons have been implicated in regulation of attention and vigilance (Foote et al., 1983; Aston-Jones et al., 1991). The DIP period, therefore, seems to be an attentional state and a vigilance condition. This notion may be supported by the following finding reported by Knapp and Pohorecky (1995): an air-puff stimulus elicits ultrasonic vocalizations (e.g., 22 kHz), which are thought to reflect an aversive behavioral state, following the startle reaction in rats. Considering these findings, it can be inferred that following the startle reaction, rats likely focus attention on judging their circumstances so that unnecessary sensory information may be inhibited to extract other sensory information which is essential to the survival of an individual. It is likely that the DIP period is the time for circumstantial judgment.

### **ACTIVATION OF THE CSIP DUE TO AIR-PUFF STIMULATION (Tsuruoka et al., 2011)**

Because stimulation of the CSIP produces inhibition of nociceptive transmission in the spinal dorsal horn (Tsuruoka et al., 2004; Liu et al., 2007), the tail flick test was used to examine air-puff stimulation-induced activation of the CSIP (**Figure 8**). The tail flick test in rats has often been used for measuring a response to a noxious stimulus and for assaying analgesic drugs since it was first described by D'Amour and Smith (1941). The antinociceptive effect has been estimated by the prolongation of the tail flick latency (e.g., Li et al., 2010; Schröder et al., 2010; Silva et al., 2010). As shown in **Figure 9**, tail flick latencies with air-puff stimulation were significantly prolonged when compared to values without air-puff stimulation, indicating air-puff stimulation-induced antinociception at the spinal cord level. Because this phenomenon was not observed after the LC/SC was bilaterally lesioned, it appears that air-puff stimulation activates the descending pathway from the LC/SC so that nociceptive signals are inhibited in the spinal dorsal horn.

**FIGURE 8 | (A)** Photograph of the tail heating. The heat source was attached to the tail, and a voltage (35 V) was applied to the coil. The heat source was a tubiform coil of wire (50 turns) covered with a thin film made of acrylic resin. The whole heat source looked like a plastic tube with an 8-mm inside diameter and a 20-mm length. Half of the inside was the effective electric heating surface. Following heating of the tail, rats whisked their tail (tail flick reflex) and bit the heat source when the tail could not be removed from the heat by the tail flick. The arrow points to the heat source. **(B)** An example of changes in the skin temperature of the tail after the heating was begun. In this case, the skin temperature of the tail before heating was 30.1 °C. After heating was begun, the skin temperature of the tail increased almost linearly to 74.0 °C at 10 s. **(C)** The relation between the skin temperature of the tail and the time from the beginning of tail heating. Eleven points were obtained from 11 untreated rats. Note that a high correlation (R<sup>2</sup> = 0.99) was observed between the skin temperature of the tail and the time from the beginning of tail heating (Tsuruoka et al., 2011).

Air-puff stimulation induces the DIP, which is a defensive movement, following the startle reaction. It is known that the DIP is mediated by cortical areas, as well as by the LC/SC, in the brain (Cooke and Graziano, 2003). From this finding, it follows that the ascending pathway from the LC/SC is activated by air-puff stimulation for inducing the DIP. Because the DIP and the antinociception in the spinal dorsal horn are simultaneous events, it seems that the descending and ascending LC/SC pathways are simultaneously active following air-puff stimulation. Moreover, as shown in **Figure 10A**, there was no significant difference between the DIP period and the tail flick latency. This result suggests that the activity of the descending and ascending LC/SC systems stops simultaneously.

In addition, as mentioned in the section "Activation of the CSIP by Peripheral Inflammation," the CSIP is active during an abnormal pain state in the peripheral tissues, such as peripheral inflammation (Tsuruoka and Willis, 1996a,b). This finding indicates that the LC/SC descending mechanism is part of a spino-pontine-spino pathway. In contrast, air-puff stimulation-induced descending inhibition of nociceptive signals in the spinal dorsal horn suggests a top–down modulation of nociceptive signals from the LC/SC to the spinal dorsal horn.

### **AIR-PUFF STIMULATION-INDUCED SUPPRESSION OF BITE BEHAVIOR (Tsuruoka et al., 2011)**

Noxious stimuli also induce nociceptive behavior mediated by a higher center of the brain (e.g., Woolfe and MacDonald, 1944). If air-puff stimulation-induced activation of the CSIP inhibits nociceptive signals in the spinal dorsal horn, it will certainly result in suppression of nociceptive behavior. Following heating of the tail, rats whisked their tail and then bit the heat source when tail heating was continued because the tail could not escape from heating by tail flick. In our study, the bite behavior induced by tail heating was a candidate for nociceptive behavior mediated by a higher center of the brain. The bite latency was defined as the time between the onset of heat stimulation and the first motion of bite behavior. Because it was shown that bite behavior was undoubtedly nociceptive and that the bite latency reflected the nociceptive threshold for evoking bite behavior (Tsuruoka et al., 2011), the bite latency could be considered as an indicator for estimating nociception.

Bite latencies with air-puff stimulation were significantly prolonged when compared to values without air-puff stimulation (Tsuruoka et al., 2011). This result indicates air-puff stimulation-induced nociceptive inhibition. Fanselow and Helmstetter (1988) have shown that rats react with a defense response of freezing and a reduction in sensitivity to painful stimulation when they are placed in a situation that has come to be associated with footshock through the process of Pavlovian conditioning. Our study differs from that of Fanselow and Helmstetter in two ways: (1) air-puff stimulation is not painful stimulation and (2) our study did not utilize Pavlovian conditioning. Indeed, influences of learning the experimental conditions could be excluded in our experiment (Tsuruoka et al., 2011). The air-puff stimulation-induced nociceptive inhibition may be different from nociceptive inhibition under a fear-like condition with respect to the underlying mechanisms and the biological meaning.

significantly prolonged by air-puff stimulation in pre-lesions, suggesting that the descending inhibitory system from the LC/SC is involved in air-puff stimulation-induced antinociception (Tsuruoka et al., 2011).

Regarding air-puff stimulation-induced nociceptive inhibition, there is a possibility that the air-puff stimulation-induced prolongation of the bite latency is due to decreased interactions in motor response/behavioral output systems rather than to decreased sensory processing. It seems that air-puff stimulation suppresses motor response/behavioral output systems so that the DIP is induced. However, as seen in **Figure 10B**, the bite latency was held nearly constant regardless of changes in the DIP period, suggesting that the bite latency is not influenced by the DIP period. Together with the result of air-puff stimulation-induced coeruleospinal antinociception, this finding suggests that nociceptive inhibition is an independent component of the air-puff startle response, although the DIP and the nociceptive inhibition are simultaneous behavioral responses in the air-puff startle. We speculate that the two air-puff stimulation-induced events, suppression of motor response/behavioral output systems and nociceptive inhibition, are parallel phenomena that are not related via cause and effect.

tail flick latency (*n* = 26) and the bite latency (*n* = 40). ∗∗*P* < 0.01, significantly different from DIP periods and tail flick latencies. Note that there was no significant difference between the DIP period and the tail flick latency, suggesting that the descending and ascending LC/SC systems simultaneously cease activity. **(B)** Regression analysis of the DIP period and the bite latency. Nine points were obtained from nine untreated rats. The regression line corresponds to *y* = 4.24 + 0.26*x*. Note that the bite latency was nearly constant regardless of the change in the DIP period, suggesting that the bite latency is not influenced by the DIP period (Tsuruoka et al., 2011).
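The regression line reported for Figure 10B can be evaluated directly to see how much the fitted value changes across a range of DIP periods. The short Python sketch below does this under two assumptions that are not stated in the caption: that *x* is the DIP period and *y* the bite latency, and that both are expressed in seconds. The function name `predicted_bite_latency` is hypothetical.

```python
def predicted_bite_latency(dip_period, intercept=4.24, slope=0.26):
    """Evaluate the reported regression line y = 4.24 + 0.26x.

    Assumption: x is the DIP period and y the bite latency (same time unit).
    """
    return intercept + slope * dip_period


# DIP periods of roughly 2-5 s are mentioned earlier in the text.
for dip in (2.0, 5.0):
    print(dip, round(predicted_bite_latency(dip), 2))
```

Over that 3-unit range of *x*, the fitted value changes by 0.26 × 3 = 0.78, against an intercept of 4.24.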

### **INVOLVEMENT OF THE CSIP IN AIR-PUFF STIMULATION-INDUCED SUPPRESSION OF BITE BEHAVIOR (Tsuruoka et al., 2011)**

In the LC/SC-lesioned rats, air-puff stimulation-induced prolongation of bite latencies was not observed in post-lesions, whereas air-puff stimulation significantly prolonged bite latencies in prelesions (**Figure 11Ba**). In contrast, in the sham-lesioned rats, air-puff stimulation significantly prolonged bite latencies in both pre- and post-lesions (**Figure 11Bb**).

It has been reported that a connection from the dorsomedial hypothalamus through the rostral ventromedial medulla takes part in the circuitry of air-puff stress (Zaretsky et al., 2003) and that the dorsomedial hypothalamus recruits nociceptive-modulating neurons in the rostral ventromedial medulla (de Menezes et al., 2008). We have shown that the LC/SC is involved in the circuitry of air-puff stress (Tsuruoka et al., 2010a). As suggested by the result shown in **Figure 11**, the LC/SC is also a brain structure involved in the air-puff stimulation-induced nociceptive inhibition mechanism.

**FIGURE 11 | (A)** Extent of neurotoxin-induced bilateral lesions of the LC/SC (*n* = 10). The rostrocaudal extension was between 0.8 and 1.5 mm, and the LC/SC was always completely destroyed ventrodorsally throughout its rostrocaudal extension. The drawing is simplified from Paxinos and Watson (1998). **(B)** The effect of either bilateral lesions of the LC/SC (**a**, *n* = 10) or sham lesions of the LC/SC (**b**, *n* = 10) on the bite latency. Pre, pre-lesions; Post, post-lesions. ∗∗*P* < 0.01, significantly different from bite latencies of the non-air-puff condition. Note that a significant air-puff stimulation-induced prolongation of the bite latency was not observed in post-lesions of the LC/SC-lesioned rats, whereas bite latencies were significantly prolonged by air-puff stimulation in pre-lesions, suggesting that the LC/SC is involved in air-puff stimulation-induced nociceptive modulation (Tsuruoka et al., 2011).

At this time, as shown in **Figure 12A**, the following four CSIPs have been demonstrated: (1) in ipsilaterally projecting neurons, axons descend the ipsilateral dorsolateral funiculus or ventrolateral funiculus to terminate in the dorsal horn on the side of the descending projection (Sluka and Westlund, 1992); (2) in ipsilaterally projecting neurons, axons cross the midline within the brain, travel through the contralateral ventrolateral funiculus and recross the midline at spinal segmental levels (Jones and Gebhart, 1987); (3) in contralaterally projecting neurons, axons cross the midline within the brain and travel through the dorsolateral funiculus to terminate in the dorsal horn on the side of the descending projection (Clark and Proudfit, 1992); and (4) in contralaterally projecting neurons, axons descend through the ipsilateral ventrolateral funiculus and cross the midline at spinal segmental levels (Tsuruoka et al., 2004). Neurotransmitters related to coeruleospinal inhibition of nociceptive signals are shown in **Figure 12B**. Norepinephrine released from descending LC/SC neurons binds to α2-adrenoceptors, and nociceptive signals are inhibited pre- or post-synaptically (Willis and Coggeshall, 2004). Inhibitory effects of GABAergic interneurons are facilitated by cholinergic interneurons (Baba et al., 1998).

norepinephrine; ACh, acetylcholine; GABA, gamma-aminobutyric acid.

### **BIOLOGICAL IMPLICATIONS OF AIR-PUFF STIMULATION-INDUCED NOCICEPTIVE INHIBITION**

Our results may support the finding reported by Bushnell et al. (2004) that pain sensation is often reduced under an attentional state or a vigilance condition. LC/SC neurons have been implicated in the regulation of attentional states and vigilance (Foote et al., 1983; Aston-Jones et al., 1991). There is evidence that dysregulation of the LC-noradrenergic system causes clinical problems in humans, such as attention deficit hyperactivity disorder (see Berridge and Waterhouse, 2003). Because induction of the DIP following an air-puff startle reaction is mediated by the LC/SC (Tsuruoka et al., 2010a), the DIP period seems to be an attentional state and a vigilance condition. This notion may be supported by the finding reported by Knapp and Pohorecky (1995) that an air-puff stimulus elicits ultrasonic vocalizations (e.g., 22 kHz), which are thought to reflect an aversive behavioral state following the startle reaction in rats. The air-puff startle response occurs in conscious animals but not in anesthetized animals. Considering these findings, it may be possible to infer that nociceptive inhibition produced by air-puff stimulation forms part of the emotional reaction in animals. During the DIP period, rats probably focus on judging their circumstances, so that nociceptive signals may be inhibited in order to extract other sensory information that is essential to this judgment. It is likely that sensory information related to the survival of an individual has priority over pain signals. Concerning the function of the LC/SC in the regulation of attentional states and vigilance, we speculate that following the startle reaction the LC/SC suppresses both motor and pain systems for judging circumstances.

### **ACKNOWLEDGMENTS**

The present study was supported by a grant from Daiichi Sankyo Co. We also wish to thank Drs. Yukiko Hiruma and Mutsumi Nonaka for their critical reading of the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 June 2012; accepted: 12 September 2012; published online: 28 September 2012.*

*Citation: Tsuruoka M, Tamaki J, Maeda M, Hayashi B and Inoue T (2012) Biological implications of coeruleospinal inhibition of nociceptive processing in the spinal cord. Front. Integr. Neurosci. 6:87. doi: 10.3389/fnint.2012.00087*

*Copyright © 2012 Tsuruoka, Tamaki, Maeda, Hayashi and Inoue. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### Serotonin modulation of cortical neurons and networks

### *Pau Celada<sup>1,2</sup>, M. Victoria Puig<sup>3</sup> and Francesc Artigas<sup>1,2</sup>\**

*<sup>1</sup> Department of Neurochemistry and Neuropharmacology, Institut d'Investigacions Biomèdiques de Barcelona (CSIC), IDIBAPS, Barcelona, Spain*

*<sup>2</sup> Centro de Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain*

*<sup>3</sup> The Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA, USA*

### *Edited by:*

*KongFatt Wong-Lin, University of Ulster, Northern Ireland*

### *Reviewed by:*

*Rodrigo N. Romcy-Pereira, Universidade Federal do Rio Grande do Norte, Brazil*

*Aaron Gruber, University of Lethbridge, Canada*

#### *\*Correspondence:*

*Francesc Artigas, Department of Neurochemistry and Neuropharmacology, Institut d'Investigacions Biomèdiques de Barcelona (CSIC), IDIBAPS, Rosselló 161, 6th floor, 08036 Barcelona, Spain. e-mail: fapnqi@iibb.csic.es*

The serotonergic pathways originating in the dorsal and median raphe nuclei (DR and MnR, respectively) are critically involved in cortical function. Serotonin (5-HT), acting on postsynaptic and presynaptic receptors, is involved in cognition, mood, impulse control and motor functions by (1) modulating the activity of different neuronal types, and (2) varying the release of other neurotransmitters, such as glutamate, GABA, acetylcholine and dopamine. Also, 5-HT seems to play an important role in cortical development. Of all cortical regions, the frontal lobe is the area most enriched in serotonergic axons and 5-HT receptors. 5-HT and selective receptor agonists modulate the excitability of cortical neurons and their discharge rate through the activation of several receptor subtypes, of which the 5-HT1A, 5-HT1B, 5-HT2A, and 5-HT3 subtypes play a major role. Little is known, however, about the role of other excitatory receptors moderately expressed in cortical areas, such as 5-HT2C, 5-HT4, 5-HT6, and 5-HT7. *In vitro* and *in vivo* studies suggest that 5-HT1A and 5-HT2A receptors are key players and exert opposite effects on the activity of pyramidal neurons in the medial prefrontal cortex (mPFC). The activation of 5-HT1A receptors in mPFC hyperpolarizes pyramidal neurons, whereas that of 5-HT2A receptors results in neuronal depolarization, reduction of the afterhyperpolarization and increase of excitatory postsynaptic currents (EPSCs) and of discharge rate. 5-HT can also stimulate excitatory (5-HT2A and 5-HT3) and inhibitory (5-HT1A) receptors in GABA interneurons to modulate synaptic GABA inputs onto pyramidal neurons. Likewise, the pharmacological manipulation of various 5-HT receptors alters oscillatory activity in PFC, suggesting that 5-HT is also involved in the control of cortical network activity. A better understanding of the actions of 5-HT in PFC may help to develop treatments for mood and cognitive disorders associated with an abnormal function of the frontal lobe.

**Keywords: 5-hydroxytryptamine or serotonin, dorsal raphe nucleus, electrophysiological recordings, GABAergic interneurons, oscillatory activity, prefrontal cortex, pyramidal neurons, serotonin receptors**

### **INTRODUCTION**

Serotonin (5-hydroxytryptamine, 5-HT) is one of the phylogenetically older molecules used in cellular communications. It is present in the central nervous system (CNS) of vertebrates and invertebrates and plays the role of neurotransmitter/neuromodulator. It also functions as a developmental signal in the CNS and regulates a variety of physiological functions in the periphery (where most 5-HT is present), such as intestinal motility, platelet aggregation, and vasoconstriction.

Within the CNS, the serotonergic system is involved in a large number of functions resulting from its widespread innervation of the whole neuraxis. The axons of serotonergic neurons of the midbrain raphe nuclei reach almost every brain structure. Action potentials traveling along these axons release 5-HT, which can act on pre- and postsynaptic receptors coupled to different signal transduction mechanisms. So far, 14 different 5-HT receptor subtypes have been identified, corresponding to 7 different families: 5-HT1 (5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, 5-HT1F), 5-HT2 (5-HT2A, 5-HT2B, 5-HT2C), 5-HT3, 5-HT4, 5-HT5 (5-HT5A, 5-HT5B), 5-HT6, and 5-HT7. With the exception of the 5-HT3 receptor, a pentameric ligand-gated ion channel composed of several subunits (up to 5 different ones have been identified), the remaining 5-HT receptors belong to the superfamily of G-protein coupled receptors and their activation results mainly in modulatory actions in the neurons expressing these receptors.

Given the widespread innervation of the brain and the richness of signals evoked by 5-HT, it is not surprising that the 5-HT system is the target of many drugs used to treat brain diseases and also of recreational drugs. For instance, most antidepressant treatments block the 5-HT transporter and increase the extracellular (or synaptic) 5-HT concentration and hence, they indirectly elevate the serotonergic tone at pre- and postsynaptic 5-HT receptors. This action is supposed to mediate the therapeutic effect of these drugs. Moreover, some anxiolytic drugs are 5-HT1A receptor agonists and 5-HT3 receptor antagonists are commonly used to treat emesis induced by anti-cancer treatments.

On the other hand, drugs of abuse such as cocaine, amphetamine or MDMA (ecstasy) target monoaminergic transporters, including the 5-HT transporter. Furthermore, hallucinogens like LSD, DOI, DOB, or DOM are 5-HT2 receptor agonists, whereas atypical antipsychotics act as preferential antagonists of these receptors (Roth et al., 2004; see Geyer and Vollenweider, 2008 for a review).

Among the various 5-HT receptors, the 5-HT1 family has probably received the most attention because of its high-density expression in limbic (5-HT1A) and motor (5-HT1B) brain areas and the various roles subserved by some of its members. Thus, in addition to being located postsynaptically to 5-HT axons, 5-HT1A and 5-HT1B receptors are autoreceptors in 5-HT neurons and therefore control the overall (5-HT1A) as well as the local (5-HT1B) activity of the system. 5-HT1B receptors are also terminal heteroreceptors and modulate the release of various transmitters, including dopamine, glutamate, GABA, and acetylcholine. Moreover, 5-HT1A receptors are highly expressed by different neuronal types (mainly pyramidal but also GABAergic) in prefrontal cortex (PFC), which suggests an important role in the control of mood and emotions as well as in cognitive processes. An extensive account of the characteristics of the serotonergic system is beyond the scope of the present review. The reader is referred to several review papers dealing with the anatomy, physiology, neurochemistry, and neuropharmacology of the 5-HT system (Jacobs and Azmitia, 1992; Barnes and Sharp, 1999; Adell et al., 2002; Smythies, 2005; Artigas, 2013). In the following sections, we focus on the role of 5-HT in the modulation of cortical activity.

### **THE CORTICAL 5-HT SYSTEM: RECEPTOR LOCALIZATION**

There is growing evidence that the serotonergic pathways originating in the dorsal and median raphe nuclei (DR and MnR, respectively) are critically involved in cortical functions. 5-HT appears to play an important role in the development of the somatosensory cortex and formation of the barrel cortex. In the adult brain, the axons of 5-HT neurons innervate a large number of cortical areas, including the entorhinal and cingulate cortices, which contain a moderate to high density of 5-HT receptors. However, of all cortical regions, the frontal lobe is the richest area in serotonergic terminals and 5-HT receptors.

Yet, unlike that of dopamine, which has been extensively studied (Williams and Goldman-Rakic, 1995; Robbins and Arnsten, 2009), the role of 5-HT in the PFC remains less well understood. Indeed, the widespread localization of 5-HT receptors (particularly of the 5-HT1A, 5-HT2A, and 5-HT2C subtypes) and the high density of 5-HT axons (greater than in any other cortical area) in this cortical region suggest an important role of 5-HT in cognitive and emotional functions depending on PFC activity. Hence, the selective depletion of 5-HT in the monkey frontal cortex impairs cognitive flexibility (increases perseveration) and reversal learning (Clarke et al., 2004, 2005), likely via 5-HT2A receptors (Carli et al., 2006; Boulougouris et al., 2007). In addition, optimized levels of 5-HT in the PFC are important for behavioral inhibition, as elevated or reduced 5-HT increases impulsivity (Harrison et al., 1997; Dalley et al., 2002; Winstanley et al., 2004). In fact, both stimulation of 5-HT1A receptors and blockade of 5-HT2A receptors decrease impulsivity (Winstanley et al., 2003; Carli et al., 2006; Talpos et al., 2006), suggesting that a downregulation of cortical serotonergic activity may effectively promote behavioral control. 5-HT in the frontal cortex is also involved in the modulation of attention in humans, an effect that implicates 5-HT1A, but not 5-HT2A, receptors (Carter et al., 2005; Scholes et al., 2006). Moreover, as observed for dopamine D1 receptors (Williams and Goldman-Rakic, 1995; Puig and Miller, 2012), the blockade of 5-HT2A receptors in the monkey lateral PFC prevents the increase in neuronal activity during a working memory task (Williams et al., 2002), and a study associates allelic variants of this receptor with memory capacity in humans (De Quervain et al., 2003). Furthermore, hallucinogens like LSD or DOI are 5-HT2A receptor agonists, which also suggests a role of 5-HT in the processing of external (sensory) and internal information through the activation of 5-HT2A receptors. On the other hand, 5-HT1A agonists display anxiolytic/antidepressant activity in animal models (Martin et al., 1990; De Vry, 1995; Carr and Lucki, 2011) whereas 5-HT1A receptor antagonists reverse drug-induced cognitive deficits (Harder and Ridley, 2000; Mello e Souza et al., 2001; Misane and Ögren, 2003). Likewise, preclinical studies suggest that 5-HT4 receptor agonists may exert rapid antidepressant actions by acting on PFC receptors (Lucas et al., 2007).

A key piece of information for interpreting physiological and behavioral data concerning the cortical 5-HT system is the regional and cellular localization of the 5-HT receptors. Several studies have examined the localization of 5-HT in the cortex. Early studies using receptor autoradiography and *in situ* hybridization made it possible to identify various 5-HT receptors in cortical areas, notably the 5-HT1A, 5-HT2A, and 5-HT2C subtypes (Pazos and Palacios, 1985; Pazos et al., 1985; Pompeiano et al., 1992, 1994). Further studies identified other receptor subtypes, albeit at lower densities.

5-HT1A receptors are particularly enriched in the rodent medial PFC (mPFC), entorhinal cortex and, to a lesser extent, cingulate and retrosplenial cortices. Outside the cortex, they are densely expressed in the hippocampus, septum and the raphe nuclei. In the latter location, the receptor is almost exclusively expressed by 5-HT neurons, where it functions as an autoreceptor in the plasma membrane of perikarya and dendrites (Riad et al., 2000). PET scan studies using a radiolabeled selective antagonist ([11C]-WAY-100635) have shown a very similar distribution in human brain, with an enrichment of the signal in the temporal and frontal lobes, cingulate cortex and the raphe nuclei (Martinez et al., 2001). Interestingly, as also observed in rats (Weber and Andrade, 2010), there is a marked rostro-caudal negative gradient in the cortical abundance of 5-HT1A receptors, with the largest abundance in PFC.

Likewise, the neocortex of rodent, primate and human brains shows a large abundance of 5-HT2A receptors, with an enrichment in frontal regions (Pompeiano et al., 1994; Burnet et al., 1995; López-Giménez et al., 1998; Hall et al., 2000; Amargós-Bosch et al., 2004). Lower abundances are found in the ventro-caudal part of CA3, medial mammillary nucleus, striatum (dorsal and ventral) and several brainstem nuclei (Pompeiano et al., 1994; Burnet et al., 1995; López-Giménez et al., 1998). Interestingly, pyramidal neurons in the rat PFC that simultaneously project to the ventral tegmental area and the dorsal raphe nucleus express 5-HT2A receptors (Vázquez-Borsetti et al., 2009, 2011). This reveals a close anatomical interaction or "loop" between frontal areas and dopamine and serotonin neurons of the brainstem, as found in several electrophysiological studies (Thierry et al., 1979, 1983; Tong et al., 1996; Hajós et al., 1998; Celada et al., 2001; Martín-Ruiz et al., 2001). As for 5-HT1A receptors, there is good agreement between the autoradiographic and *in situ* hybridization signals, which indicates that the receptor is expressed mainly in the somatodendritic region. Similar regional distributions have been reported in human brain using the selective antagonist ligand M100907 *in vivo* (PET scan) or *in vitro* (autoradiography) (Hall et al., 2000).

5-HT1A and 5-HT2A receptors are present in a high proportion of cells in some cortical regions. Double *in situ* hybridization studies, to label the cellular phenotype and the respective receptor mRNA, have shown that around 50% of pyramidal neurons (labeled with the vGluT1 mRNA) and 20–30% of GABAergic interneurons (labeled with GAD65/67 mRNA) express 5-HT1A and/or 5-HT2A receptor mRNAs in various areas of the PFC (Santana et al., 2004) (**Table 1**). Interestingly, about 30% of parvalbumin-expressing fast-spiking interneurons in the PFC express 5-HT1A or 5-HT2A receptors which, unlike in pyramidal neurons, are largely segregated in separate neuron populations (Puig et al., 2010).

**Figure 1** shows the localization of the transcripts for 5-HT1A and 5-HT2A receptors in the PFC of the rat. Interestingly, 5-HT1A and 5-HT2A receptor transcripts are heavily co-expressed in rat and mouse PFC. Approximately 80% of the cells expressing 5-HT1A receptor mRNA also express the 5-HT2A receptor mRNA in all PFC areas examined, except in layer VI and the lower part of the infralimbic area, where the density of cells expressing 5-HT2A receptors is much lower (Amargós-Bosch et al., 2004; Santana et al., 2004).

**Table 1 | Proportion of pyramidal and local GABAergic neurons that express the mRNAs encoding 5-HT1A and 5-HT2A receptors.**

*Data are means of three rats and represent the percentage of the counted cells expressing the mRNAs of each 5-HT receptor in pyramidal (vGluT1 mRNA-positive) and GABAergic (GAD mRNA-positive) cells. MOs, secondary motor area; ACAd, dorsal anterior cingulate area; PrL, prelimbic area; ILA, infralimbic area; PIR, piriform cortex; TT, tenia tecta. Layer VIa denotes deep areas of the sensorimotor cortex at prefrontal level.*

*<sup>a</sup>The data of the ILA correspond to its more ventral part, which shows a remarkably low level of 5-HT2A receptor, whereas cell counts from its dorsal part are more similar to those of PrL.*

*\*P < 0.05 vs. PrL, TT, and PIR; \*\*P < 0.05 vs. the rest of areas, except layer VIa (P = 0.9); \*\*\*P < 0.05 vs. the rest of areas; <sup>+</sup>P < 0.05 vs. the rest of areas except ILA; <sup>++</sup>P < 0.05 vs. ACAd and PrL (Tukey test post-ANOVA). Data from Santana et al. (2004).*

The abundant co-expression of 5-HT1A and 5-HT2A receptors raises questions about the physiological role of the simultaneous occurrence of inhibitory (5-HT1A) and excitatory (5-HT2A) receptors responding to 5-HT in the same cortical neurons. Various hypotheses have been examined (Amargós-Bosch et al., 2004; Puig et al., 2005), but perhaps one of the most convincing explanations is the putative localization of both receptors in different cellular compartments. Thus, immunohistochemical studies by several groups, using different antibodies, consistently show a predominant location of 5-HT2A receptors in the apical dendrites (and to a lesser extent, cell bodies) of cortical pyramidal neurons (Jakab and Goldman-Rakic, 1998, 2000; Jansson et al., 2001; Martín-Ruiz et al., 2001), where they may amplify the impact of excitatory synaptic currents. On the other hand, there is considerable disagreement regarding the location of cortical 5-HT1A receptors, due to the use of different antibodies. A homogeneous labeling of cell bodies and dendrites was initially reported (Kia et al., 1996), but more recent studies performed in rodent, primate, and human brain tissues using a different antibody (Azmitia et al., 1996) show the *exclusive* labeling of the axon hillock of pyramidal neurons (De Felipe et al., 2001; Czyrak et al., 2003; Cruz et al., 2004). This location suggests that 5-HT axons would be able to establish axo-axonic contacts with pyramidal neurons, similar to those established by chandelier GABA interneurons, which would markedly affect the generation of action potentials. In this way, 5-HT axons reaching apical dendrites would be able to modulate glutamatergic inputs onto pyramidal cells (Aghajanian and Marek, 1997; Puig et al., 2003) whereas those reaching axon hillocks would control the probability of generation of nerve impulses through the activation of 5-HT1A receptors. However, existing discrepancies on the cellular localization of 5-HT1A receptors prevent firm conclusions from being drawn on this point.

**FIGURE 1 | Localization of 5-HT1A and 5-HT2A receptor mRNAs in the rat PFC using double** *in situ* **hybridization histochemistry. (A–C)** Coronal sections of rat PFC showing a large number of cells expressing **(A)** 5-HT1A receptors (Dig-labeled oligonucleotides) or **(B)** 5-HT2A receptors (dark field; 33P-labeled oligonucleotides); **(C)** an adjacent Nissl-stained section. Note the abundant presence of cells expressing both receptors in layers II–V, as well as in piriform cortex (PIR) and tenia tecta (TT). **(D–F)** Enlargements of the marked area in panels **(A–C)**. **(D,E)** Show the presence of a large number of cells containing 5-HT1A and 5-HT2A receptor transcripts in cingulate (ACAd) and prelimbic (PL) cortex. Cells in deep layers (VI) express preferentially 5-HT1A receptor mRNA. **(G–J)** Individual cells expressing both receptor transcripts. Occasional cell profiles containing only 5-HT1A (blue arrowheads) or 5-HT2A receptor mRNAs (red arrowheads) were seen in the infralimbic cortex **(G,I)**, piriform cortex **(H)** and tenia tecta **(J)**. Bar size is 1 mm in **(A–C)**, 250 μm in **(D–F)**, 50 μm in **(G,H)** and 30 μm in **(I,J)**. Reproduced with permission from Amargós-Bosch et al. (2004).

5-HT1B and 5-HT1D receptors show a widespread brain distribution, with a relatively low abundance in cortex. Radioligand binding and autoradiographic studies have detected the presence of a high density of 5-HT1B receptors in the basal ganglia and hippocampal formation, particularly the subiculum (Pazos and Palacios, 1985; Offord et al., 1988). They are negatively coupled to adenylate cyclase, and activation of 5-HT1B receptors by selective agonists decreases forskolin-stimulated adenylate cyclase activity [for review see Sari (2004)].

The comparison of autoradiographic, *in situ* hybridization and immunohistochemical studies has revealed that 5-HT1B receptors are located both presynaptically (i.e., on 5-HT axons) and postsynaptically to 5-HT neurons, mostly on axons of intrinsic neurons of the basal ganglia (Riad et al., 2000). Presynaptic 5-HT1B autoreceptors, however, represent a small proportion of the entire population of 5-HT1B receptors in the brain because the lesion of 5-HT neurons does not generally result in a reduction of their density (Compan et al., 1998). Notwithstanding the low density of cortical 5-HT1B receptors seen by autoradiography, electrophysiological studies have identified 5-HT1B receptor-mediated actions in the cingulate cortex of the rat (Tanaka and North, 1993; see below). The other two members of the 5-HT1 family (5-HT1E and 5-HT1F receptors) are also present in cortex, particularly entorhinal cortex, yet their low abundance and the lack of selective pharmacological tools have hampered the study of their actions on cortical neurons.

5-HT2B receptors are expressed at a very low density in the brain. In contrast, 5-HT2C receptors (formerly named 5-HT1C receptors) are highly expressed in the choroid plexus (where they were initially identified), various cortical areas in the rodent brain, particularly the PFC, the limbic system (nucleus accumbens, hippocampus, amygdala) and the basal ganglia (caudate nucleus, substantia nigra). 5-HT2C receptors are also expressed in the human cortex, yet their abundance relative to other brain areas appears to be lower than in rat brain (Clemett et al., 2000; Pandey et al., 2006). Interestingly, immunohistochemical studies suggest that cortical 5-HT2C receptors are mainly expressed in pyramidal neurons (Clemett et al., 2000; Puig et al., 2010), and not in fast-spiking interneurons (Puig et al., 2010), yet data using a different antibody indicate that more than 50% of the 5-HT2C receptor immunoreactivity is present in GABAergic neurons (Liu et al., 2007).

5-HT3 receptors are moderately abundant in the neocortex and other telencephalic regions, such as the olfactory cortex, the hippocampus, and the amygdala. Interestingly, most cortical 5-HT3 receptor mRNA is located in GABAergic interneurons, as assessed by *in situ* hybridization (Morales and Bloom, 1997; Puig et al., 2004). These are calbindin- and calretinin- (but not parvalbumin-) containing neurons and are located in superficial cortical layers (I–III) (Morales and Bloom, 1997; Puig et al., 2004) (**Figure 2**).

5-HT4 receptors are abundant in the olfactory tubercle, some structures of the basal ganglia (caudate putamen, ventral striatum), medial habenula, hippocampal formation, and amygdala. The neocortex contains low levels of the 5-HT4 receptor and its encoding mRNA, as assessed by autoradiography and *in situ* hybridization, respectively (Waeber et al., 1994; Vilaró et al., 1996, 2005). However, studies using the single-cell RT-PCR technique reported that ∼60% of pyramidal neurons recorded in the PFC contain the 5-HT4 receptor transcript (Feng et al., 2001), and overexpression of 5-HT4 in the mPFC increases DRN 5-HT activity (Lucas et al., 2005), likely through descending excitatory axons reaching the DR (Celada et al., 2001; Vázquez-Borsetti et al., 2011).

**FIGURE 2 | Composite photomicrographs showing the localization of cells expressing vGluT1 (A), GAD (B), 5-HT3 (C), and 5-HT2A (D) mRNAs through layers I–VI at the level of the prelimbic area in the rat PFC.** The continuous vertical line denotes the location of the midline whereas the dotted line shows the approximate border between layer I and II. Pyramidal neurons (as visualized by vGluT1 mRNA) are present in layers II–VI whereas GAD mRNA-positive cells are present in all layers, including layer I. Note the segregation of cells expressing 5-HT3 **(C)** and 5-HT2A receptors **(D)**. 5-HT3 receptor transcript is expressed by a limited number of cells present in layers I–III, particularly in the border between layers I and II. However, they represent 40% of GABAergic neurons in layer I. On the other hand, cells in these locations, particularly in layer I, do not express 5-HT2A receptors. The asterisk denotes an artifact of the emulsion, seen in the dark field. Scale bar = 150 μm. Reproduced with permission from Puig et al. (2004).

5-HT5 receptors are the least well understood 5-HT receptors. Both 5-HT5A and 5-HT5B receptor subtypes were found in rodents whereas only the 5-HT5A was identified in human brain. The 5-HT5A receptor mRNA is found at relatively high levels in the hippocampus, the medial habenula and the raphe but is absent in cortex. The occurrence of the receptor in the midbrain raphe, where the cell bodies of 5-HT neurons are located, raises the possibility that it may directly or indirectly influence the activity of 5-HT neurons, and thus the levels of 5-HT in the target structures. On the other hand, the 5-HT5B receptor mRNA is present throughout the rat brain, with higher levels in the hippocampus, hypothalamus, pons, and cortex (Erlander et al., 1993).

The richest brain areas in 5-HT6 receptor mRNA are the ventral striatum and adjacent areas (nucleus accumbens, olfactory tubercle, islands of Calleja) as well as the dorsal striatum (caudate-putamen). High levels of 5-HT6 receptor mRNA are also found in the hypothalamus and the hippocampus, whereas the cerebral cortex, the substantia nigra, and the spinal cord contain low/moderate levels of the transcript (Gerard et al., 1996). Immunohistochemical studies have confirmed a similar distribution of the receptor protein, although the PFC shows a labeling density greater than that of the mRNA and similar to that of the hippocampus (Gerard et al., 1997).

Finally, the 5-HT7 receptor mRNA is localized to discrete regions of the rodent brain. Higher levels are present in the thalamus and the hippocampus whereas moderate levels are seen in the septum, the hypothalamus, the centromedial amygdala, and the periaqueductal gray. Autoradiographic studies indicate a similar distribution of the binding sites, in the cortex, septum, thalamus, hypothalamus, centromedial amygdala, periaqueductal gray, and superior colliculus (Gustafson et al., 1996). A similar distribution of the 5-HT7 receptor was reported in human brain (Martin-Cora and Pazos, 2004). Interestingly, the 5-HT7 receptor is also localized in the raphe nuclei in both rodent and human brain, which has raised interest in targeting 5-HT receptors as a potential new mechanism to control brain 5-HT levels by regulating the neuronal activity of the ascending 5-HT systems.

In summary, 5-HT released from axons innervating the cerebral cortex can modulate the activity of cortical neurons through several distinct receptors. However, with few exceptions (see above) little is known about the cellular phenotype of the neurons expressing 5-HT receptors, their precise distribution in cortical layers and the proportion of neurons of each type (e.g., pyramidal, stellate, or GABAergic neurons) expressing the receptor subtypes. The knowledge of these data is deemed important to identify the cellular elements and local circuitry involved in the cortical actions of 5-HT.

### **ROLE OF 5-HT RECEPTORS ON CORTICAL NEURON ACTIVITY**

### **5-HT1A RECEPTORS**

The 5-HT1A receptor has been characterized biochemically and electrophysiologically as being coupled to the Gi/o family of heterotrimeric G proteins. Gi/o proteins coupled to 5-HT1A receptors are composed of pertussis toxin-sensitive αi/αo subunits. This coupling mechanism was demonstrated *in vivo* and *in vitro* in the DR (Innis and Aghajanian, 1987). In hippocampal cells (as well as in 5-HT cells) a similar G protein couples 5-HT1A and GABAB receptors to potassium channels (Andrade et al., 1986).

5-HT1A receptors are also coupled to potassium and, to a lesser extent, calcium channels. Intracellular current-clamp recordings in slices containing the DR indicated that the 5-HT-induced inhibition is mediated by an enhancement of the inward rectifying potassium conductance (Aghajanian and Lakoski, 1984; Williams et al., 1988). Exogenous application of 5-HT and 5-HT1A agonists also elicited membrane potential hyperpolarization and decreased the membrane input resistance in DR 5-HT neurons *in vitro* (Aghajanian and Lakoski, 1984; Sprouse and Aghajanian, 1987), leading to an overall reduction in the probability of action potential firing. Effects similar to those of 5-HT1A receptor activation in the DR have been reported in other neuronal types, such as hippocampal pyramidal cells (Andrade and Nicoll, 1987). 5-HT1A receptors are also involved in the modulation of excitatory glutamatergic neurotransmission, since their activation suppresses AMPA-mediated signaling through the inhibition of CaMKII (Cai et al., 2002a) and reduces NMDA-mediated currents in PFC neurons (Zhong et al., 2008).

Early *in vivo* electrophysiological recordings showed that iontophoretic application of 5-HT excited and inhibited different cortical neurons (Krnjevic and Phillis, 1963; Roberts and Straughan, 1967), even though the major effect of 5-HT was inhibition of firing (Krnjevic and Phillis, 1963; Reader et al., 1979; Ashby et al., 1994; Zhang et al., 1994). Studies in the neocortex showed that 5-HT application also induced a hyperpolarization followed by a depolarization in a subpopulation of neurons (Davies et al., 1987; Tanaka and North, 1993). These effects were mediated, respectively, by an action of 5-HT on 5-HT1A and 5-HT2 receptors (Araneda and Andrade, 1991), and can likely be accounted for by the high co-expression of 5-HT1A and 5-HT2A receptors in cortical neurons (Amargós-Bosch et al., 2004). The three major actions of 5-HT on the activity of cortical neurons (inhibitions, excitations, and biphasic responses) have recently been found in layer V pyramidal neurons of mouse PFC slices to be mediated by 5-HT1A and 5-HT2A receptors (Avesar and Gulledge, 2012) (**Figure 3**). Similarly, both hyperpolarizing (5-HT1A receptor-mediated) and depolarizing (5-HT2A receptor-mediated) responses to 5-HT have been demonstrated in human neocortex *in vitro* (Newberry et al., 1999).

In entorhinal cortex neurons recorded in the current clamp mode, 5-HT evoked a biphasic response consisting of a large amplitude hyperpolarization followed by a slowly developing, long lasting, and small amplitude depolarization. In voltage clamp, 5-HT consistently evoked an outward current followed by a smaller inward shift of holding current. The outward current was mediated by the activation of 5-HT1A receptors (Ma et al., 2007).

**FIGURE 3 |** From Avesar and Gulledge (2012) with permission from A. Gulledge.

Systemic administration of selective 5-HT1A receptor agonists suppresses the firing activity of 5-HT neurons in the DR and MnR (Blier and de Montigny, 1987; Hajós et al., 1995; Casanovas et al., 2000) and hippocampal neurons (Tada et al., 1999). However, these agents appear to have a more complex effect on cortical (PFC) pyramidal neurons, with either a biphasic effect (increase in firing activity at lower doses followed by decrease at higher doses) or affecting different neuronal populations in a distinct manner (most neurons excited by 5-HT1A agonists) (Borsini et al., 1995; Hajos et al., 1999; Diaz-Mataix et al., 2006; Lladó-Pelfort et al., 2010, 2012a,b). Thus, the systemic administration of the selective 5-HT1A receptor agonist 8-OH-DPAT increased the firing activity of pyramidal neurons and reduced that of fast-spiking GABAergic interneurons, suggesting a preferential action of 5-HT1A agonists on 5-HT1A receptors in fast-spiking GABA interneurons, particularly at lower doses (Lladó-Pelfort et al., 2012b).

### **5-HT1B/1D RECEPTORS**

Anatomical and pharmacological evidence indicates that 5-HT1B receptors have an axonal localization in different cerebral pathways and exert an inhibitory action on neurotransmitter release (Sari, 2004). Accordingly, electrophysiological evidence for the role of 5-HT1B receptors in neuronal function is based on the assessment of inhibitory actions on evoked synaptic potentials or currents in target neurons, whereas neurochemical studies have examined direct effects on neurotransmitter release.

The control of glutamate release by 5-HT1B receptors has been described in different brain areas. In slices of cingulate cortex, 5-HT, acting on 5-HT1B receptors, reduced the amplitude of NMDA and non-NMDA components of synaptic potentials recorded intracellularly in layer V pyramidal neurons (Tanaka and North, 1993) (**Figure 4**). It has also been reported that 5-HT1B receptors mediate the 5-HT suppression of the evoked fast excitatory postsynaptic current (evEPSC) in layer V pyramidal neurons in response to nearby electrical stimulation of cortical afferents (Lambe and Aghajanian, 2004).

### **5-HT2A/2C RECEPTORS**

5-HT2A receptors are coupled to phospholipases through Gq proteins. Their activation leads to the production of IP3 and diacylglycerol and the mobilization of intracellular Ca<sup>2+</sup> stores. Furthermore, the 5-HT2A receptor is widely recognized as the main G-protein-coupled receptor through which 5-HT has excitatory actions. However, some aspects are highly controversial, including the localization of receptors responsible for the actions of 5-HT and 5-HT2A agonists.

There is a substantial overlap between the localization of 5-HT axon terminals and 5-HT2 receptors in rat cortex (Blue et al., 1988). 5-HT2A receptors are localized both to pyramidal neurons and GABAergic interneurons in the PFC (Willins et al., 1997; Jakab and Goldman-Rakic, 1998; Santana et al., 2004). 5-HT2A receptors are highly expressed in large and medium-size parvalbumin- and calbindin-containing interneurons involved in the feed-forward inhibition of pyramidal neurons (Jakab and Goldman-Rakic, 2000; Puig et al., 2010). In the rat frontoparietal cortex, 5-HT axons are parallel to the apical dendrites of pyramidal neurons expressing 5-HT2A receptors (Jansson et al., 2001). Additionally, a lower proportion of 5-HT2A receptors was found presynaptically (Jakab and Goldman-Rakic, 1998; Miner et al., 2003).

Activation of 5-HT2A receptors exerts complex effects on the activity of PFC neurons. Thus, the microiontophoretic application of DOI suppressed the firing activity of putative pyramidal neurons in anesthetized rats but enhanced the excitatory effect of glutamate at low ejection currents (Ashby et al., 1990). *In vitro* recordings of identified pyramidal neurons in PFC slices have revealed that 5-HT2A receptor activation increases spontaneous EPSCs and depolarizes the recorded cells (Araneda and Andrade, 1991; Tanaka and North, 1993; Aghajanian and Marek, 1997, 1999a,b; Zhou and Hablitz, 1999; Avesar and Gulledge, 2012) (**Figures 3**, **5**). In addition, 5-HT can elicit 5-HT2A-mediated IPSCs through the activation of GABA synaptic inputs (Zhou and Hablitz, 1999), an effect that can be accounted for by the activation of 5-HT2A receptors in GABAergic interneurons (Santana et al., 2004).

Importantly, recent studies have provided new insights into the role of 5-HT2A-mediated excitations and 5-HT1A-mediated inhibitions in PFC circuits. 5-HT generated 5-HT2A-mediated excitatory or biphasic responses in all callosal/commissural (COM) neurons responsive to 5-HT *in vitro*, whereas corticopontine-projecting neurons (CPn) were universally inhibited by 5-HT through 5-HT1A receptors (Avesar and Gulledge, 2012). Additionally, cortico-mesencephalic pyramidal neurons respond to 5-HT1A and/or 5-HT2A receptor activation, as indicated by *in vivo* physiological (Puig et al., 2005) and pharmacological studies (Celada et al., 2001; Martín-Ruiz et al., 2001; Bortolozzi et al., 2005). Thus, 5-HT may exert different and projection-selective actions on PFC pyramidal neurons, activating cortico-cortical output channels, having mixed actions on some cortico-subcortical output channels and inhibiting other cortical pyramidal neurons, particularly CPn neurons. 5-HT2A receptor-dependent serotonergic excitation of COM neurons may also be related to the parallel rostral-to-caudal gradients found for cortical 5-HT2A receptor expression (Pazos et al., 1985; Weber and Andrade, 2010) and COM neuron density (Chao et al., 2009).

One of the most important mechanisms by which 5-HT increases pyramidal cell excitability seems to be mediated by the inhibition of the afterhyperpolarization current (IAHP) typically observed after a burst of spikes in response to 5-HT2 receptor activation (Araneda and Andrade, 1991; Andrade, 1998). Accordingly, studies conducted in layer V neurons of PFC have identified 5-HT2A receptors as the primary receptor involved in the inhibitory effect of 5-HT on the slow afterhyperpolarization current (IsAHP), suggesting also the contribution of additional 5-HT receptor subtypes (Villalobos et al., 2005). As the AHP is involved in determining neuronal excitability, such an inhibition could contribute to regulating the firing pattern of cortical neurons. Interestingly, the 5-HT2A/2C receptor agonist DOI markedly affects the firing activity of PFC pyramidal neurons *in vivo* (Puig et al., 2003). Systemic DOI administration increased, decreased or left unaffected the activity of pyramidal neurons in PFC by a 5-HT2A receptor-dependent mechanism. As observed for 5-HT (Zhou and Hablitz, 1999), inhibitory actions of DOI seem to be dependent on GABAA receptor tone, suggesting the involvement of 5-HT2A receptors in GABAergic interneurons.

There seems to be a tight link between 5-HT2A receptors and glutamatergic transmission. Hence, the excitatory effects of DOI appear to involve interaction with glutamatergic transmission because DOI increases the excitatory effects of glutamate on prefrontal neurons (Ashby et al., 1989a, 1990). Likewise, the 5-HT2A receptor-mediated EPSCs evoked by 5-HT in mPFC slices are occluded by blockade of AMPA receptors and mGluR II receptor activation (Aghajanian and Marek, 1997, 1999a,b). Moreover, the modulation of prefrontal NMDA transmission by 5-HT and DOB appears to involve pre- and postsynaptic 5-HT2A receptors (Arvanov et al., 1999). Likewise, 5-HT modulates NMDA transmission in PFC via 5-HT1A and 5-HT2A receptors, with opposite actions of both receptors (Yuen et al., 2008; Zhong et al., 2008). Finally, the observation that the selective mGluRII agonist LY-379268 reversed the excitatory effect of DOI on pyramidal neurons *in vivo* is consistent with these *in vitro* observations (Puig et al., 2003).

It has been suggested that in mPFC 5-HT activates 5-HT2A receptors located putatively on thalamocortical terminals to release glutamate and evoke EPSCs in pyramidal cells (Aghajanian and Marek, 1997). This interpretation was based on a number of observations, including the fact that this effect was antagonized by AMPA receptor antagonists and mGluR agonists (Aghajanian and Marek, 1997, 1999a,b). Similarly, the increase in pyramidal cell firing evoked by systemic DOI administration was dependent on glutamate inputs (Puig et al., 2003). However, the lesion of the thalamus (including the dorsomedial and centromedial nuclei which project to mPFC) (Berendse and Groenewegen, 1991; Fuster, 1997) did not abolish the excitatory effects of DOI on mPFC pyramidal neurons (Puig et al., 2003). Likewise, *in vitro* electrophysiological studies in 5-HT2A receptor knockout mice are also discordant with the existence of presynaptic 5-HT2A receptors in thalamocortical afferents (Béïque et al., 2007) and electron microscopy studies have failed to identify 5-HT2A receptors in excitatory axonal terminals in the mPFC [most are located postsynaptically (Miner et al., 2003)]. Overall, these data have raised doubts about the presynaptic mechanisms responsible for the excitatory actions of 5-HT2A receptors and suggest that postsynaptic receptors, which make up the majority of cortical 5-HT2A receptors, are involved.

5-HT2A receptors seem to exert a more marked depolarizing action in early stages of postnatal development, since the depolarizing action of 5-HT diminishes with age. Both 5-HT2A and 5-HT7 receptors appear to underlie the depolarizing effect of 5-HT (see below) (Zhang, 2003; Béïque et al., 2004).

There is almost no evidence for a role of 5-HT2C receptors in the modulation of cortical activity. 5-HT-induced slow excitations in prefrontal interneurons seem to be mediated exclusively by 5-HT2A receptors, since the 5-HT2C antagonist SB242084 failed to reduce 5-HT excitations previously blocked by a 5-HT2A/2C antagonist (Puig et al., 2010). This agrees with the described primary expression of 5-HT2C receptors in pyramidal neurons. Consistently, in the piriform cortex, it was reported that 5-HT could activate pyramidal neurons via 5-HT2C receptors and GABAergic neurons via 5-HT2A receptors (Sheldon and Aghajanian, 1991). However, the depolarizing action of 5-HT in layer V pyramidal neurons of the mPFC does not seem to depend on 5-HT2C receptor activation since it was not blocked by the selective antagonist SB 242084 (Béïque et al., 2004). This inconsistency will require further investigation.

### **5-HT3 RECEPTORS**

5-HT can mediate rapid excitatory responses through the activation of the 5-HT3A receptor, a ligand-gated ion channel. These receptors may be involved in cortical actions of 5-HT, since some 5-HT3 receptor antagonists display procognitive effects (Staubli and Xu, 1995). These agents have also been reported to display anxiolytic and antipsychotic activity in animal models (Higgins and Kilpatrick, 1999) and to improve the therapeutic action of antipsychotic drugs in schizophrenic patients (Sirota et al., 2000). Likewise, the atypical antipsychotic clozapine is an antagonist of 5-HT3 receptors (Watling et al., 1989; Edwards et al., 1991).

Early microiontophoretic studies showed that 5-HT and 5-HT3 receptor agonists suppressed pyramidal neuron activity in rat PFC through the activation of 5-HT3 receptors by a direct action (Ashby et al., 1989b, 1991, 1992). However, more recent *in vitro* studies indicate that 5-HT may also increase IPSCs in pyramidal neurons by activation of 5-HT3 receptors, likely as a result of a fast synaptic excitation of local GABAergic neurons (Zhou and Hablitz, 1999; Férézou et al., 2002; Xiang and Prince, 2003). The latter observations are consistent with the presence of 5-HT3 receptors in GABAergic interneurons in the rat telencephalon, including the PFC (Morales and Bloom, 1997; Puig et al., 2004). In macaque cortex, 5-HT3 receptors are expressed by a subpopulation of calbindin- and calretinin-positive interneurons (Jakab and Goldman-Rakic, 2000). *In vivo* studies also show the excitation of GABA interneurons in the mPFC through 5-HT3 receptors (Puig et al., 2004).

### **OTHER 5-HT RECEPTORS**

There is almost no information on the effects of 5-HT or selective agonists on cortical neurons through the activation of 5-HT4–5-HT7 receptors. Nonetheless, it has been suggested that the 5-HT7 receptor may play a role during early postnatal development of cortical circuits. *In vitro* whole cell recordings revealed a shift in the effect of 5-HT on membrane potential across development, in accordance with coordinated changes in the expression and function of 5-HT1A, 5-HT2A, and 5-HT7 receptors. Hence, 5-HT in early postnatal days elicits a marked depolarization of pyramidal cells (dependent on 5-HT2A and 5-HT7 receptors) which progressively shifts to hyperpolarization (mediated by 5-HT1A receptors). This change appears to be mainly due to the loss of 5-HT7 receptors together with an increased function of 5-HT1A receptors (Béïque et al., 2004).

The activation of 5-HT4 receptors has dual effects (enhancement or reduction) on GABA-evoked currents in PFC pyramidal neurons, with the direction of the 5-HT4 receptor-mediated effect determined by neuronal activity. These observations suggest a flexible mechanism for 5-HT4 receptors to dynamically regulate synaptic transmission and neuronal excitability in the PFC network (Cai et al., 2002b; Yan, 2002). Also, extracellular recordings in rat frontal cortical slices have shown that bursting activity could be modulated by the application of the 5-HT4 agonist zacopride. Thus, perfusion of zacopride induces an increase in spontaneous bursting activity (Zahorodna et al., 2004).

Blockade of 5-HT6 receptors improves cortical performance in different learning and memory paradigms, an effect still poorly understood (Upton et al., 2008). Recent work using whole-cell patch-clamp electrophysiological recordings showed that 5-HT6 agonists reversibly reduced spontaneous glutamatergic transmission in both striatal and layer V PFC pyramidal neurons, an effect prevented by preincubation with the selective 5-HT6 antagonist SB258585. Since no evidence for the expression of 5-HT6 receptors on glutamatergic neurons has been provided, and localization of this receptor on GAD-immunoreactive neurons has been reported (Woolley et al., 2004), these data suggest that modulation of the glutamatergic transmission might be mediated by 5-HT6 receptors expressed by GABAergic neurons (Tassone et al., 2011).

### *In vivo* **ACTIONS OF ENDOGENOUS SEROTONIN ON CORTICAL NEURONS**

Despite the wealth of *in vitro* studies on the actions of 5-HT on cortical neurons, the effects of endogenous 5-HT and its receptors on the regulation of cortical inhibitory and excitatory responses *in vivo* are not fully known. The effect of endogenous 5-HT on postsynaptic 5-HT1A receptors has been typically examined in two forebrain areas, the CA region of the hippocampal formation and the mPFC, using extracellular recordings. Electrical stimulation of the medial forebrain bundle inhibits hippocampal pyramidal neurons, an effect reversed by 5-HT1A receptor blockade (Chaput and de Montigny, 1988). Similarly, as previously observed with the microiontophoretic application of 5-HT (see above), the electrical stimulation of DR/MnR at a physiological rate (∼1 spike/s) mainly evoked inhibitory responses in PFC cells *in vivo*, which were partly or totally blocked by the selective 5-HT1A antagonist WAY-100635 (Hajós et al., 2003; Amargós-Bosch et al., 2004; Puig et al., 2005, 2010). DR/MnR stimulation evoked 5-HT1A-mediated inhibitory responses in two thirds of pyramidal neurons of the mPFC, identified by antidromic activation from the midbrain. The rest of the responses were orthodromic excitations, either pure (13%) or preceded by short-latency inhibitions (20%, i.e., biphasic responses) (Puig et al., 2005) (**Figure 6**). Excitatory responses were blocked by the selective 5-HT2A receptor antagonist M100907 (Amargós-Bosch et al., 2004; Puig et al., 2005).

Intriguingly, the proportion of excitatory responses was markedly lower than that of inhibitory responses despite the ∼80% co-expression of 5-HT1A and 5-HT2A receptor mRNAs in PFC neurons (Amargós-Bosch et al., 2004). This observation agrees with the predominant inhibitory effects of 5-HT on cortical neurons observed *in vitro* (see above). The putative differential location of 5-HT1A and 5-HT2A receptors in different compartments of pyramidal neurons may account for the greater proportion of inhibitory responses. Should 5-HT1A receptors be localized on axon hillocks, endogenous 5-HT would have a profound suppressing effect on action potential generation. Alternatively, a direct effect of 5-HT1A receptors on ionic conductances (↑ K<sup>+</sup> currents) may favor inhibitory over excitatory actions of 5-HT mediated by 5-HT2A receptors, which result from indirect and long-lasting changes of similar ionic conductances in the opposite direction (↓ K<sup>+</sup> currents, ↑ Ca<sup>2+</sup> currents). Interestingly, 5-HT1A and 5-HT2A receptor mRNAs do not colocalize in parvalbumin-expressing inhibitory neurons of the PFC (Puig et al., 2010). This supports the notion that 5-HT1A receptors in pyramidal neurons, by down-regulating action potential output, and 5-HT2A receptors, by enhancing synaptic inputs onto the dendrites, exert a balanced modulation of cortical pyramidal networks across layers not available to small local interneurons (Puig and Gulledge, 2011).

In addition, inhibitory responses elicited in mPFC pyramidal neurons by raphe stimulation involve a GABAergic component since they were blocked by WAY-100635 but also by the GABAA antagonist picrotoxinin (Puig et al., 2005). This GABAergic response may indeed result from the activation of local GABA interneurons by axon collaterals of pyramidal neurons projecting to midbrain. Likewise, 5-HT may also activate PFC GABAergic neurons through 5-HT2A receptors thus inhibiting pyramidal cell activity, as observed *in vitro* (Zhou and Hablitz, 1999). The activation of 5-HT3 receptors is also likely, given their involvement in the excitatory responses of a subpopulation of GABAergic interneurons (Puig et al., 2004; see above). However, a third element seems also likely to explain the inhibition given the very short latency of inhibitory responses (9 ms on average) induced by DR/MnR stimulation. This latency is shorter than the time required for action potentials to travel along 5-HT axons from midbrain to PFC (orthodromic potentials; ∼25 ms) or pyramidal axons from PFC to midbrain (antidromic potentials; ∼15 ms). Overall, these observations would be consistent with the existence of a monosynaptic GABAergic projection from the midbrain raphe to mPFC, as suggested by anatomical studies (Jankowski and Sesack, 2002). This pathway would be analogous to the ascending GABAergic pathway between the ventral tegmental area and the mPFC or the nucleus accumbens (the mesocortical and mesolimbic pathways, respectively) (Carr and Sesack, 2000), suggesting a common pattern of control of PFC by monoaminergic nuclei in which monoamine and projection GABA neurons would be involved.
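The latency argument can be made explicit with a small check. The sketch below is only illustrative: it uses the three latencies quoted in the text, and the function and variable names are hypothetical. It encodes the simple assumption that a pathway can account for a response only if its conduction time does not exceed the observed onset latency.

```python
# Latencies (ms) quoted in the text; all names here are illustrative only.
INHIBITION_ONSET_MS = 9.0   # mean onset of raphe-evoked inhibition in mPFC
ORTHODROMIC_5HT_MS = 25.0   # approx. conduction time, midbrain -> PFC (5-HT axons)
ANTIDROMIC_PYR_MS = 15.0    # approx. conduction time, PFC -> midbrain (pyramidal axons)


def pathway_can_explain(onset_ms, conduction_ms):
    """Assumption: a pathway can explain a response only if its conduction
    time is no longer than the observed onset latency."""
    return conduction_ms <= onset_ms


def candidate_pathways(onset_ms, pathways):
    """Return the names of the pathways fast enough to produce the response."""
    return [name for name, t in pathways.items()
            if pathway_can_explain(onset_ms, t)]


known_routes = {
    "5-HT axons (orthodromic)": ORTHODROMIC_5HT_MS,
    "pyramidal axon collaterals (antidromic)": ANTIDROMIC_PYR_MS,
}

# Neither known route is fast enough, so the list is empty.
print(candidate_pathways(INHIBITION_ONSET_MS, known_routes))
```

Under this assumption neither route qualifies, which is the gap that a faster, direct GABAergic projection would fill.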

Similarly to pyramidal neurons, endogenous 5-HT elicits 5-HT1A receptor-mediated inhibitions and 5-HT2A receptor-mediated excitations in PFC parvalbumin-expressing fast-spiking interneurons *in vivo* (Puig et al., 2010). There is also evidence that 5-HT3 receptors can activate GABA interneurons in the rat PFC *in vivo*. Physiological stimulation of the raphe nuclei excites local GABAergic neurons located in superficial layers (I–III) of the prelimbic and cingulate areas. These responses can clearly be distinguished from 5-HT2A receptor-mediated excitations because they (a) have shorter onset latency and duration than 5-HT2A receptor-mediated excitations, (b) show a higher concordance rate (number of spikes evoked/number of stimuli delivered in DR/MnR), and (c) are blocked by the 5-HT3 receptor antagonists ondansetron and tropisetron (Puig et al., 2004; Puig and Gulledge, 2011).
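The concordance rate defined in (b) is a simple ratio. A minimal sketch follows; the function name and the example counts are hypothetical and are not data from the cited studies.

```python
def concordance_rate(spikes_evoked, stimuli_delivered):
    """Concordance rate as defined in the text: number of spikes evoked
    divided by the number of stimuli delivered in DR/MnR."""
    if stimuli_delivered <= 0:
        raise ValueError("stimuli_delivered must be a positive count")
    if spikes_evoked < 0:
        raise ValueError("spikes_evoked cannot be negative")
    return spikes_evoked / stimuli_delivered


# Hypothetical counts, for illustration only.
print(concordance_rate(45, 50))   # a tightly coupled response
print(concordance_rate(10, 50))   # a loosely coupled response
```

A value near 1 means that almost every stimulus is followed by an evoked spike.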

In summary, three possibilities (differential localization and/or control of ion flows by 5-HT1A and 5-HT2A receptors in pyramidal neurons, existence of a raphe-mPFC GABAergic pathway, and inhibitory action in pyramidal cells mediated by activation of excitatory 5-HT receptors on GABAergic interneurons), not mutually exclusive, may account for the predominantly inhibitory responses elicited by raphe stimulation on pyramidal neurons of the mPFC. To the best of our knowledge, *in vivo* responses to other 5-HT receptors present in low or moderate abundances in cortex, such as 5-HT2C, 5-HT4, 5-HT6, or 5-HT7, have not been reported so far.

### **SEROTONIN MODULATES CORTICAL NETWORK ACTIVITY**

Neuronal populations, via their anatomical and functional interconnections, can display sophisticated discharge patterns that arise from the synchronization of their activity. That is, they can form neural networks whereby synchronous activity can become oscillatory. Oscillatory activities, ranging from 0.1 up to a few hundred cycles per second, generate small electrical waves detectable outside the skull through electroencephalographic (EEG) recordings or intracerebrally through local field potential (LFP) recordings. Oscillations at several frequency bands have been recorded in the neocortex during natural sleep, anesthesia, and alertness, where their presence tightly correlates with a variety of behavioral tasks. During slow-wave sleep (SWS) and deep anesthesia, slow waves (∼2 Hz) and delta waves (1–4 Hz) are prominent (Steriade et al., 1993; Mukovski et al., 2007; Celada et al., 2008; Puig et al., 2010), which are critical for memory consolidation and learning (Stickgold, 2005; Marshall et al., 2006; Landsness et al., 2009). During wakefulness, cortical alpha (10–14 Hz) and gamma (30–80 Hz) waves correlate with the modulation of attention, memory, and learning (Fries et al., 2001; Jensen et al., 2002; Ward, 2003; Buschman and Miller, 2007; Fries, 2009; Siegel et al., 2009; Benchenane et al., 2011; Bollimunta et al., 2011; Puig and Miller, 2012), whereas beta waves (15–30 Hz) also play a role in learning (Puig and Miller, 2012). Work reported over the last decade has suggested that the synchronization of neural activity in the neocortex and subcortical structures such as the hippocampus may indeed be critical for the normal processing of cognitive functions. In fact, schizophrenia patients, who show clear cognitive impairment (Elvevåg and Goldberg, 2000; Harvey et al., 2001), display abnormal cortical oscillatory activity in the slow and gamma frequency bands (Hoffmann et al., 2000; Spencer et al., 2003; Cho et al., 2006). Likewise, abnormal cortical oscillations can be observed in a variety of psychiatric disorders (see below). Considering the growing evidence pointing to an important role of cortical oscillations in cognition and the abundant psychiatric medication targeting the serotonergic system, the involvement of 5-HT in the generation and modulation of cortical oscillatory activities is of high interest, as it will help to identify new targets for psychiatric treatments.

### **SEROTONERGIC MODULATION OF CORTICAL OSCILLATIONS**

The seminal studies pioneered by Mircea Steriade and colleagues (Steriade et al., 1993, 1996, 2001; Steriade, 2006) described in detail the cellular mechanisms underlying the spontaneous slow rhythms (∼2 Hz)—including slow and spindle waves—present in the neocortex during natural sleep. Slow waves reflect the spontaneous changes in membrane potential and synchronous firing of neuronal ensembles coordinated by an underlying slow oscillation. At a cellular level, they consist of an alternation between periods of activity (called UP states) and silence (DOWN states), that are not observed during wakefulness. UP and DOWN states reflect periods of membrane depolarization and hyperpolarization, respectively, within large neuronal networks (Steriade et al., 1993; Contreras and Steriade, 1995; Mukovski et al., 2007). Spontaneous low frequency oscillations (∼1 Hz) have also been reported to occur in slices of the ferret visual cortex (Sanchez-Vives and McCormick, 2000), suggesting the existence of intrinsic mechanisms. To our knowledge, the role of 5-HT in the modulation of cortical slow oscillations during natural sleep has not been addressed, despite the marked activity changes of raphe 5-HT neurons during sleep and wakefulness.

Mild anesthetics, such as chloral hydrate, can reliably generate slow oscillations in the neocortex of laboratory animals that resemble the slow rhythms of natural SWS. This provides a more convenient preparation to examine how synchronous activity is modulated by 5-HT and its receptors. Endogenous 5-HT, released after electrical stimulation of the DR in the midbrain, modulates both the frequency and amplitude of cortical slow-like oscillations recorded in the PFC of anesthetized rats (Puig et al., 2010). 5-HT release produces a moderate increase in frequency by promoting rapid initiation of UP states, while reducing the amplitude and duration of DOWN states (**Figure 7**). This indicates that the activity of 5-HT neurons in the DR (and possibly MnR) may directly regulate the frequency of cortical slow oscillations by promoting UP states. Thus, although most pyramidal neurons are inhibited by the physiological release of 5-HT (see above), 5-HT appears to have an excitatory effect on cortical networks *in vivo*, because UP states are generated by the synchronous depolarization of large ensembles of cortical neurons. In fact, a massive release of 5-HT in the cortex following high-frequency stimulation of the DR completely suppresses cortical slow waves by promoting a long-lasting depolarization and elimination of DOWN states (Puig et al., 2010). These excitatory influences of 5-HT on cortical slow oscillations may be accomplished via 5-HT2A receptors (Celada et al., 2008; Puig et al., 2010). Hence, both pharmacological stimulation of 5-HT2A receptors with the hallucinogen and 5-HT2A receptor agonist DOI and their blockade with the 5-HT2A/2C antagonist ritanserin desynchronize slow waves in rat PFC (**Figure 8**), suggesting that a balanced stimulation of 5-HT2A receptors is critical for a stable synchronization of cortical slow waves.
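One common way to quantify such changes is to segment a recorded trace into UP and DOWN states with an amplitude threshold and then count UP states per unit time. The following is a minimal sketch under that assumption; the function names, the threshold value, and the toy trace are hypothetical and do not reproduce the analysis of the cited studies.

```python
def detect_up_states(trace, threshold, dt_ms):
    """Return (onset_ms, offset_ms) for each run of consecutive samples
    strictly above `threshold`. Assumes evenly spaced samples, dt_ms apart."""
    states = []
    start = None
    for i, value in enumerate(trace):
        if value > threshold and start is None:
            start = i                                  # UP state begins
        elif value <= threshold and start is not None:
            states.append((start * dt_ms, i * dt_ms))  # UP state ends
            start = None
    if start is not None:                              # trace ends during an UP state
        states.append((start * dt_ms, len(trace) * dt_ms))
    return states


def up_state_rate_hz(states, total_ms):
    """Number of detected UP states per second of recording."""
    return len(states) / (total_ms / 1000.0)


# Toy trace (arbitrary units), one sample every 100 ms.
toy_trace = [0, 0, 1, 1, 0, 0, 1, 0]
ups = detect_up_states(toy_trace, threshold=0.5, dt_ms=100)
print(ups)
print(up_state_rate_hz(ups, total_ms=len(toy_trace) * 100))
```

With such a segmentation, a faster initiation of UP states appears as a higher UP-state rate, and shorter DOWN states appear as shorter gaps between consecutive intervals.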

Much less is known about the role of 5-HT in the modulation of high-frequency oscillations, especially for the alpha (10–14 Hz) and beta (15–30 Hz) bands. A correlation between increases in the power of alpha oscillations in the ventral PFC and increased levels of 5-HT in whole blood has been found (Fumoto et al., 2010; Yu et al., 2011). More information is available on gamma oscillations. During natural sleep and anesthesia, gamma oscillations (30–100 Hz) are present in cortex along with slow waves (Steriade, 2006; Puig et al., 2010; Massi et al., 2012), although their specific involvement in cortical processing is poorly understood. Gamma oscillations are generated by the synchronous firing of fast-spiking interneuron networks (Cardin et al., 2009; Sohal et al., 2009), which exert potent inhibition onto pyramidal neurons and other fast-spiking neurons.

Interestingly, 5-HT exerts a strong modulation of gamma oscillations in the PFC of anesthetized rats via 5-HT1A and 5-HT2A receptors (Puig et al., 2010). Specifically, blockade of 5-HT1A receptors increases the amplitude of cortical gamma waves and the discharge rate of 5-HT1A-expressing fast-spiking interneurons, and sharpens the synchronization of these neurons to gamma cycles. By contrast, blocking 5-HT2A receptors decreases cortical gamma oscillations and desynchronizes 5-HT2A-expressing fast-spiking interneurons from gamma waves. In other words, endogenous 5-HT can dampen or enhance gamma oscillations by reducing or increasing the activity and synchronization of 5-HT1A- and 5-HT2A-expressing fast-spiking interneurons, respectively. Overall, the major effect of 5-HT is a reduction of cortical gamma oscillations during sleep-like epochs. Further investigations should establish whether these actions also occur during wakefulness.
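The synchronization of a neuron to gamma cycles is often summarized by the vector strength (mean resultant length) of its spike phases. The sketch below illustrates that measure under a strong simplifying assumption, namely a gamma rhythm of constant frequency, so that the phase of each spike follows directly from its time. The function name and the spike times are hypothetical, and this is not necessarily the measure used in the cited work.

```python
import cmath
import math


def vector_strength(spike_times_s, freq_hz):
    """Mean resultant length of spike phases relative to an idealized
    oscillation of constant frequency `freq_hz`.
    1.0 = every spike at the same phase; ~0.0 = no phase preference."""
    if not spike_times_s:
        raise ValueError("at least one spike is required")
    phases = [2.0 * math.pi * freq_hz * t for t in spike_times_s]
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(resultant)


GAMMA_HZ = 40.0  # assumed, illustrative gamma frequency

# One spike per cycle at a fixed phase: tightly locked.
locked = [k / GAMMA_HZ for k in range(4)]
# Spikes spread evenly over one cycle: no phase preference.
spread = [k / (4 * GAMMA_HZ) for k in range(4)]

print(round(vector_strength(locked, GAMMA_HZ), 6))
print(round(vector_strength(spread, GAMMA_HZ), 6))
```

In these terms, a sharper synchronization corresponds to a vector strength closer to 1, and desynchronization to a value closer to 0. Real LFP data would require estimating the instantaneous gamma phase from the filtered signal rather than assuming a fixed frequency.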

### **RELEVANCE FOR PSYCHIATRIC DISORDERS**

Abnormal oscillatory activities in the cortex have been observed in a number of neurological and psychiatric disorders (Basar and Güntekin, 2008). For example, the synchronization of slow (<1 Hz), delta (1–4 Hz) and gamma (30–80 Hz) oscillations is reduced in schizophrenia, major depression, and bipolar disorder (Keshavan et al., 1998; Hoffmann et al., 2000; Spencer et al., 2003; Cho et al., 2006; Uhlhaas and Singer, 2006). Impaired gamma oscillations and synchrony have also been reported in schizophrenia patients, suggesting the existence of network alterations (Spencer et al., 2003; Cho et al., 2006; Uhlhaas and Singer, 2006; Basar and Güntekin, 2008; Gonzalez-Burgos and Lewis, 2008; Gonzalez-Burgos et al., 2010).

Interestingly, patients with depression who do not respond to selective serotonin reuptake inhibitors (SSRIs) have alterations in alpha power in some cortical areas compared with responders and healthy control subjects (Bruder et al., 2008), suggesting some relationship between 5-HT levels and the amplitude of alpha oscillations.

Animal models of psychiatric disorders are starting to shed new light on the contribution of 5-HT to normal and abnormal cortical oscillations. Importantly, they are providing valuable information to better understand the cellular mechanisms by which psychiatric medication compensates for imbalances in network activity. For instance, the hallucinogen and preferential 5-HT2A receptor agonist DOI disrupts low-frequency oscillations in the PFC of anesthetized rats (Celada et al., 2008), an effect reversed by antipsychotic drugs with different pharmacological targets. While the effect of clozapine can be interpreted by its competition with DOI at 5-HT2A receptors, the effect of haloperidol must necessarily be interpreted at the network level, given its inability to occupy 5-HT2A receptors at the dose used. A reduction in slow wave activity has been detected in patients with schizophrenia during sleep (Hoffmann et al., 2000); hence, a potential source of this decrease could be an unbalanced stimulation of cortical 5-HT2A receptors.

Interestingly, the disruption of PFC activity evoked by DOI is similar to, yet of smaller magnitude than, that produced by the NMDA receptor antagonist phencyclidine (PCP) (Kargieman et al., 2007). The effect of PCP is also reversed by haloperidol and clozapine, which suggests a link between the disrupting action of PCP and DOI on PFC activity and their psychotomimetic activity. Likewise, the reversal of this effect by two antipsychotic drugs with different primary targets strongly suggests a relationship with their therapeutic action. Despite the low *in vitro* affinity of clozapine for 5-HT1A receptors, its reversal of PCP effects depends on the *in vivo* stimulation of such receptors (Kargieman et al., 2012), as previously observed for the antipsychotic-evoked release of dopamine in mPFC (Diaz-Mataix et al., 2005).

**FIGURE 9 | Schematic representation of the relationships between the mPFC and the DR involving 5-HT1A and 5-HT2A receptors.** Pyramidal neurons in the mPFC project densely to the DR/MnR and modulate the activity of serotonergic neurons via direct and indirect influences (Celada et al., 2001). In turn, endogenous 5-HT modulates pyramidal cell activity through the activation of various receptors expressed in the neocortex, of which 5-HT1A and 5-HT2A receptors play a major role. The latter receptors are particularly enriched in apical dendrites of pyramidal neurons where they can facilitate AMPA inputs. A smaller population of 5-HT2A receptors are expressed by GABA interneurons, including fast-spiking (FS) interneurons. Pyramidal 5-HT1A receptors may be localized in the axon hillock, together with GABAA receptors activated by chandelier axons (Azmitia et al., 1996; De Felipe et al., 2001; Cruz et al., 2004) or in the somatodendritic compartment (Riad et al., 2000). It is possible that 5-HT axons reaching the cortex at different levels may exert distinct effects on pyramidal neurons, depending on a precise topology between certain 5-HT neurons or neuronal clusters within the DR/MnR and 5-HT1A- or 5-HT2A-receptor-rich compartments, in agreement with anatomical studies showing an association between 5-HT axons and such receptor-rich areas (De Felipe et al., 2001; Jansson et al., 2001). Also, 5-HT axons reaching upper layers, including layer I, may activate 5-HT3 receptors located on GABAergic non-FS interneurons to modulate inputs onto the tufts and most distant segments of apical dendrites of pyramidal neurons. 5-HT1B receptors are present on serotonergic axons (not shown) and in axons of other neuronal types (e.g., glutamatergic) where they regulate neurotransmitter release and modulate synaptic activity. The scheme shows also the putative GABAergic projections from DR/MnR to the mPFC suggested by electrophysiological and anatomical studies (see text).

Collectively, these studies highlight the importance of 5-HT in regulating cortical network activity, as well as the complexities of the alterations present in psychiatric disorders and the compensatory mechanisms mediating the action of psychiatric medication. Detailed knowledge of the cellular and circuit mechanisms underlying serotonergic modulation of cortical oscillations in health and disease could provide valuable information for our understanding of why many schizophrenia and other psychiatric treatments are largely ineffective at restoring PFC function.

### **CONCLUSIONS**

The assessment of the *in vivo* and *in vitro* actions of 5-HT on cortical neurons and networks has revealed a complex pattern of action. 5-HT can hyperpolarize pyramidal neurons through the activation of 5-HT1A receptors, an action that results from the opening of G protein-coupled inward rectifying K<sup>+</sup> channels. This effect is followed by a reduction of the firing activity of pyramidal neurons. At the same time, 5-HT can depolarize the same neurons through 5-HT2A receptors and increase their excitability. These two receptors appear to be the main players for the postsynaptic actions of 5-HT in the cerebral cortex. *In situ* hybridization studies have revealed the concurrent presence of 5-HT1A and 5-HT2A receptor mRNAs in a large proportion (∼80%) of neurons in the PFC. Although several hypotheses have been put forward, it is still unclear what determines whether a given pyramidal neuron responds to 5-HT with an excitation or an inhibition, although the latter responses predominate, both *in vitro* and *in vivo*. However, the fact that 5-HT induces depolarizing actions on slow oscillations *in vivo* suggests that it can exert excitatory effects on cortical neural networks independent of action potential generation, a view awaiting further confirmation. The role of these two receptors in the modulation of the activity of GABAergic interneurons is still poorly understood. A significant proportion of these neurons, including fast-spiking interneurons, located in layers II–VI express 5-HT1A and/or 5-HT2A receptors; yet so far, there are no studies examining the role of both receptors in the modulation of ion currents. *In vivo*, 5-HT1A receptors decrease, whereas 5-HT2A receptors increase, the spiking rate of fast-spiking interneurons in the PFC (Puig et al., 2010).

By contrast, there is reasonable knowledge of the role of 5-HT3 receptors in the control of the activity of GABAergic interneurons. There seems to be a segregation between 5-HT1A/5-HT2A receptors on one side, likely expressed in parvalbumin- and calbindin-containing interneurons, and 5-HT3 receptors on the other, expressed in calretinin- and (to a lesser extent) calbindin-containing neurons. Moreover, interneurons expressing 5-HT3 receptors are localized mostly in layers I–III, which suggests a role in the modulation of inputs reaching the tufts and upper segments of the apical dendrites of pyramidal neurons.

Another 5-HT receptor for which a role (yet still poorly characterized) has been attributed is the 5-HT1B receptor, whose activation by 5-HT can presynaptically modulate GABAergic and glutamatergic inputs onto pyramidal neurons. **Figure 9** shows a schematic representation of the PFC—raphe circuit with the most important receptors involved in the serotonergic actions in PFC and their presumed localization.

Unfortunately, little is known about the actions of 5-HT on other receptors, some of which are expressed in significant amounts in the neocortex. 5-HT2C receptors can modulate the activity of pyramidal neurons in piriform cortex, but this does not seem to be the rule in neocortex. On the other hand, the neuronal depolarization induced by 5-HT7 receptor activation disappears a few weeks after birth, which suggests a role in development but not in adulthood. Further studies are indeed required to clarify the complex role of 5-HT in the modulation of cortical activity. Current and new knowledge in this area will help to understand the involvement of 5-HT in cortical functions, notably those in PFC, a brain region highly enriched in 5-HT elements and involved in critical brain functions such as cognition and emotional control, among others.

### **ACKNOWLEDGMENTS**

Supported by grants SAF 2012-35183 (Ministry of Economy and Competitiveness and EU FEDER funds), PI09/1245 and PI12/00156 (PN de I+D+I 2008-2011, ISCIII-Subdirección General de Evaluación y Fomento de la Investigación and the European Regional Development Fund. Una manera de hacer Europa). Support by the Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM (P82, 11INT3) and Generalitat de Catalunya (SGR20093) is also acknowledged. M. V. Puig was supported by the Japanese Society for the Promotion of Science (JSPS). Pau Celada is supported by the Researcher Stabilization Program of the Health Department of the Generalitat de Catalunya. We thank Dr. Allan Gulledge for permission to reproduce data in **Figure 3**.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 January 2013; paper pending published: 07 March 2013; accepted: 01 April 2013; published online: 19 April 2013.*

*Citation: Celada P, Puig MV and Artigas F (2013) Serotonin modulation of cortical neurons and networks. Front. Integr. Neurosci. 7:25. doi: 10.3389/fnint.2013.00025*

*Copyright © 2013 Celada, Puig and Artigas. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Reward prediction-related increases and decreases in tonic neuronal activity of the pedunculopontine tegmental nucleus

### *Ken-ichi Okada1,2 and Yasushi Kobayashi1,2,3,4\**

*<sup>1</sup> Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan*

*<sup>2</sup> Center for Information and Neural Networks, National Institute of Information and Communications Technology, and Osaka University, Osaka, Japan*

*<sup>3</sup> Research Center for Behavioral Economics, Osaka University, Osaka, Japan*

*<sup>4</sup> Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Saitama, Japan*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*Joshua D Berke, University of Michigan, USA Sheri Mizumori, University of Washington, USA*

#### *\*Correspondence:*

*Yasushi Kobayashi, Graduate School of Frontier Biosciences, Osaka University, 1-4 Yamadaoka, Suita 565-0871, Japan. e-mail: yasushi@fbs.osaka-u.ac.jp*

The neuromodulators serotonin, acetylcholine, and dopamine have been proposed to play important roles in the execution of movement, control of several forms of attentional behavior, and reinforcement learning. While the response pattern of midbrain dopaminergic neurons and its specific role in reinforcement learning have been revealed, the roles of the other neuromodulators remain elusive. Reportedly, neurons in the dorsal raphe nucleus, one major source of serotonin, continually track the state of expectation of future rewards by showing a correlated response to the start of a behavioral task, reward cue presentation, and reward delivery. Here, we show that neurons in the pedunculopontine tegmental nucleus (PPTN), one major source of acetylcholine, showed similar encoding of the expectation of future rewards by a systematic increase or decrease in tonic activity. We recorded and analyzed PPTN neuronal activity in monkeys during a reward conditioned visually guided saccade task. The firing patterns of many PPTN neurons were tonically increased or decreased throughout the task period. The tonic activity pattern of neurons was correlated with their encoding of the predicted reward value; neurons exhibiting an increase or decrease in tonic activity showed higher or lower activity in the large reward-predicted trials, respectively. Tonic activity and reward-related modulation ended around the time of reward delivery. Additionally, some tonic changes in activity started prior to the appearance of the initial stimulus, and were related to the anticipatory fixational behavior. A partially overlapping population of neurons showed both the initial anticipatory response and subsequent predicted reward value-dependent activity modulation by their systematic increase or decrease of tonic activity. These bi-directional reward- and anticipatory behavior-related modulation patterns are suitable for the presumed role of the PPTN in reward processing and motivational control.

**Keywords: acetylcholine, reinforcement learning, pedunculopontine tegmental nucleus, reward prediction, cholinergic, tonic activity, motivation**

### **INTRODUCTION**

The pedunculopontine tegmental nucleus (PPTN) is the major source of cholinergic projections in the midbrain, but also contains glutamatergic, gamma aminobutyric acid (GABA)ergic, dopaminergic, and noradrenergic neurons (Mesulam et al., 1983; Rye et al., 1987; Clements and Grant, 1990; Jones, 1991; Spann and Grofova, 1992; Ford et al., 1995; Takakusaki et al., 1996; Wang and Morales, 2009). The PPTN controls sleeping/waking (Datta and Siwek, 2002; Kayama and Koyama, 2003) and locomotion (Garcia-Rill and Skinner, 1988; Takakusaki et al., 2004; Harris-Warrick, 2011), and also has a role in regulating motivated behavior (Rompre and Miliaressis, 1985; Kozak et al., 2005; Doya, 2008; Winn, 2008; Wilson et al., 2009; Okada et al., 2011). However, the role of the PPTN in motivated behavioral control remains rather elusive. On the other hand, there are numerous studies showing that another neuromodulation system, i.e., dopaminergic neurons located in the substantia nigra pars compacta and ventral tegmental area, play an essential role in the regulation of motivated behavior by encoding a reward prediction error signal for reinforcement learning (Schultz, 1998; Bromberg-Martin et al., 2010b). Dopaminergic neurons exhibit phasic burst firing in response to external stimuli and rewards, and their response magnitude alters throughout the course of learning to match the reward prediction error signal (Hollerman and Schultz, 1998). The PPTN projects to dopaminergic neurons (Beninato and Spencer, 1987), and these excitatory cholinergic/glutamatergic projections are thought to regulate the firing of dopaminergic neurons (Lokwan et al., 1999; Forster and Blaha, 2003; Mena-Segovia et al., 2008b). Thus, it is possible that neurons in the PPTN encode the reward-related signals that are necessary for the computation of the reward prediction error signal by dopaminergic neurons.
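The reward prediction error mentioned above can be illustrated with a standard textbook learning rule. The sketch below uses a Rescorla-Wagner style update, which is an assumption made for illustration and not a model taken from this study. The function name, the learning rate, and the reward sequence are all hypothetical.

```python
def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Track a predicted reward value V across trials.

    On each trial the prediction error is delta = r - V, and the
    prediction is updated as V <- V + alpha * delta.
    Returns the final prediction and the list of per-trial errors.
    """
    value = v0
    errors = []
    for reward in rewards:
        delta = reward - value      # reward prediction error on this trial
        errors.append(delta)
        value += alpha * delta      # move the prediction toward the outcome
    return value, errors


# Hypothetical session: the same reward (1.0) delivered on three trials.
final_value, deltas = rescorla_wagner([1.0, 1.0, 1.0], alpha=0.5)
print(final_value)
print(deltas)
```

As the prediction approaches the delivered reward, the error shrinks from trial to trial. Computing the error requires holding the predicted value until the outcome arrives, which is the kind of signal that a sustained, reward-dependent input could supply.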

Many previous studies, including ours, reported the phasic activity of PPTN neurons in response to reward, sensory stimulus, and movement (Garcia-Rill and Skinner, 1988; Matsumura et al., 1997; Dormont et al., 1998; Kobayashi et al., 2002; Pan and Hyland, 2005; Okada and Kobayashi, 2009; Okada et al., 2009; Norton et al., 2011). Some studies also showed tonic changes in activity in relation to locomotion (Garcia-Rill and Skinner, 1988), sleeping/waking (Datta and Siwek, 2002), and arousal state (Mena-Segovia et al., 2008a). Previously, we recorded tonic changes in neuronal activity in the PPTN of monkeys performing a reward-biased saccade task comparable to those used in recordings from dopaminergic neurons. We found that one group of PPTN neurons showed a tonic increase in activity during the task execution period, with greater activity during successful versus failed trials (Kobayashi et al., 2002) and greater activity during highly motivated trials (Okada et al., 2009). These neurons could act as a gate from motivation to action by changing attentional or arousal processes, which matches the presumed role of the PPTN as part of the ascending reticular activating system (Steriade, 1996). Furthermore, some tonic excitatory neurons showed stronger responses to large reward-predicted cues than to small reward-predicted cues (Okada et al., 2009). This group of PPTN neurons may provide the neural substrates for the temporal memory of the predicted reward magnitude, which is required for the computation of the reward prediction error.

Recent neurophysiological studies of the serotonergic dorsal raphe nucleus (DRN) reported that DRN neurons showed either an increase or decrease in tonic activity to reward-related cues and reward outcomes (Nakamura et al., 2008), and these tonic changes in their activity continually encode the state of expectation of future rewards, such that the response of a neuron to the start of the task was correlated with its response to the reward cues and outcomes (Bromberg-Martin et al., 2010a). The PPTN and DRN are interconnected with each other (Steininger et al., 1992; Honda and Semba, 1994), and serotonergic and cholinergic neuromodulatory systems control many brain functions, such as sleeping/waking (Kayama and Koyama, 2003) and locomotion (Takakusaki et al., 2004; Harris-Warrick, 2011), in a mutually interacting manner. Thus, similar bi-directional reward value coding might also be present in PPTN neurons.

To investigate the relationship between task-related tonic activity and reward-related modulation, we analyzed PPTN neuronal activity during several phases of a behavioral task, i.e., just before the start of the task, just after the start of the task, at reward cue presentation, and at reward delivery. In addition to the neurons we reported previously that exhibited an increase in tonic activity, other PPTN neurons showed a decrease in tonic activity during the behavioral task, similar to the activity of some DRN neurons. We found a correlation between the tonic increase or decrease in activity to the start of the task and the responses to large/small reward cues, but not to the delivery of large/small rewards. Furthermore, a partially overlapping population of neurons showed preparatory activity modulation that depended on the anticipatory fixational behavior of the monkeys. This result suggests that PPTN neurons encode both the externally cued reward value and internal anticipatory state signals by systematic changes in their tonic firing rate, both in excitatory and suppressive directions.

### **MATERIALS AND METHODS**

### **GENERAL**

We recorded neuronal activity from the PPTN in three Japanese macaque monkeys (*Macaca fuscata*; animal Ds, male; animal Tm, female; animal Dn, male) while they performed a reward-biased visually guided saccade task. All experimental procedures were performed in accordance with the National Institutes of Health *Guidelines for the Care and Use of Laboratory Animals* and approved by the Committee for Animal Experiments at Okazaki National Research Institutes and Osaka University.

Information on the experimental procedures was published previously (Kobayashi et al., 2002). Briefly, a head-holding device, a chamber for unit recording, and a scleral search coil were implanted under general anesthesia. During the experimental sessions, the animals were seated in a primate chair and placed in a sound-attenuated room. All aspects of the behavioral experiment, including presentation of the stimuli, monitoring of eye movements, monitoring of neuronal activity, and reward delivery, were under the control of a personal computer-based real-time data acquisition system (TEMPO) with a real-time link to MATLAB. Eye position was monitored by means of a scleral search coil system with a spatial resolution of 0.1° and time resolution of 1 kHz. The stimuli were presented on the screen of a 21-inch cathode ray tube monitor that was placed 28 cm in front of the animals.

### **BEHAVIORAL TASK**

The animals performed a reward-biased visually guided saccade task. This task was comparable to those used in recordings from basal ganglia nuclei and dopaminergic neurons in which the shape of the fixation target (FT; square, circle, or triangle) indicated the reward magnitude (large or small, **Figure 1**). The monkeys initially fixated on the central target, then made a saccade to the peripheral target, and finally received a juice reward. During the initial fixation period, the shape of the FT cued the animals to expect either a large or small reward upon the successful completion of the trial. In the recordings from animal Ds, an uninformative small fixation point (FP) was presented initially at the center of the screen. The monkey was required to fixate on the FP within 3000 ms to a precision of ±2°. After 400–800 ms fixation, the FP was replaced by a square or triangle FT, which was associated with the reward magnitude. In the recordings from animals Tm and Dn, the initial stimulus was a square or circle FT, and its shape was associated with large or small rewards, respectively. The FT shape-reward magnitude contingency was switched at quasi-random intervals (20–30 trials).

The subsequent task procedures were the same for all monkeys. After fixation on the FT for a variable duration (400–1500 ms), a saccade target (ST, a circle of 0.8°) appeared at an eccentricity of 10° from the FT in 1 of 2 (left or right) or 8 (0, 45, 90, 135, 180, 225, 270, and 315°) possible directions. The monkey was required to saccade to the ST within 80–500 ms to a precision of ±2°. Successful trials were rewarded with juice presented together with a tone at 100 or 300 ms after the ST disappeared. The large and small rewards consisted of 3 or 1 drops of juice (each drop ∼0.1 ml), respectively. If the animal broke fixation at any time during the fixation period or failed to make a saccade to the ST, an error tone sounded and the trial was aborted. The intertrial interval, which started at the time of reward offset and lasted until the onset of the initial stimulus in the next trial, was fixed at 1500 ms in the recordings from animal Ds, and quasi-randomly varied (within 1.5–2 s) in the recordings from animals Tm and Dn.

### **RECORDING PROCEDURE**

Guide tubes held within the recording chamber were aimed at the PPTN of the monkeys using magnetic resonance imaging (2.2 T) under general anesthesia. The locations of the recorded neurons were reconstructed for two monkeys from the readings of the micromanipulator and those of the guide grids of the recording chamber, referenced to a single marker site selected for each monkey (Okada et al., 2009). Correct placement of the recording electrode was confirmed by monitoring the neuronal activity in the surrounding structures, including the auditory responses in the inferior colliculus encountered at 3–7 mm before those in the PPTN and high-frequency tonic fiber activity in the cerebellar peduncle, close to the PPTN.

While the PPTN is the major source of cholinergic projections in the brainstem (Mesulam et al., 1983), it also contains glutamatergic and GABAergic (Clements and Grant, 1990; Spann and Grofova, 1992; Ford et al., 1995; Takakusaki et al., 1996; Wang and Morales, 2009) as well as dopaminergic (Rye et al., 1987) and noradrenergic (Jones, 1991) neurons. It was suggested that there are two types of neurons that generate broad and brief action potentials, respectively, in slice preparations of the rat PPTN (Takakusaki et al., 1997). Recent extracellular recording studies also reported neurons that generated broad and brief action potentials; however, they exhibited a unimodal distribution and could not be classified into groups (Matsumura et al., 1997; Kobayashi et al., 2002). Therefore, rather than choosing neurons with specific electrophysiological properties, we studied all well isolated neurons in the PPTN whose activity changed during the saccade task.

### **DATA ANALYSIS**

Our database consisted of 507 neurons in animal Ds (saccade task with FP presentation), 156 neurons in animal Tm, and 29 neurons in animal Dn. The data from monkeys Tm and Dn are from the same neurons recorded in a previous study (Okada et al., 2009). The analysis included neuronal data from all correctly performed trials, excluding the first three trials of each block when the animals were adapting to the change in the FT shape-reward magnitude contingency.

We used receiver operating characteristic (ROC) analysis to compare the firing rates in two time windows during a trial or for two different task conditions. In principle, ROC analysis evaluates the reliability with which an ideal observer could correctly distinguish between two conditions from the neuronal signal. The ROC value is calculated as the probability that a randomly chosen firing rate from the first condition has a higher value than a randomly chosen firing rate from the second condition (excluding ties; Green and Swets, 1966). Thus, an ROC value of 1 implies that neuronal activity in the first condition is always higher than in the second condition. An ROC value of 0.5 implies that neuronal activity does not discriminate between the two conditions, and an ROC value of 0 implies that neuronal activity is always higher in the second condition.
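The ROC value defined above can be illustrated with a minimal Python sketch. The function name `roc_value` and the example firing rates are hypothetical and are not taken from the original analysis code. Tied pairs are dropped, following the "excluding ties" wording, and returning 0.5 when every pair is tied is an assumed convention.

```python
from itertools import product


def roc_value(first, second):
    """Probability that a firing rate drawn from `first` exceeds one drawn
    from `second`, evaluated over all pairs with tied pairs excluded."""
    greater = 0
    less = 0
    for a, b in product(first, second):
        if a > b:
            greater += 1
        elif a < b:
            less += 1
    if greater + less == 0:
        return 0.5  # assumed convention: all pairs tied, no discrimination
    return greater / (greater + less)


# Hypothetical firing rates (spikes/s) for two conditions.
large_reward = [22.0, 25.0, 19.0, 30.0]
small_reward = [15.0, 18.0, 21.0, 12.0]
print(roc_value(large_reward, small_reward))
```

A value above 0.5 here would indicate higher activity in the first (large reward) condition, matching the convention used in the text.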

For the analysis of task-related tonic changes in activity (see **Figure 4**), the normalized activity of each neuron at a given time was calculated as the ROC value comparing the firing rate in a 200 ms window centered on that time with the firing rate during the pre-fixation period (a 600 ms window before the onset of the initial stimulus). This normalization allowed us to display neuronal activity during both the task period and the intertrial interval. Neurons were classified as "tonic excitatory" or "tonic suppressive" based on their significant increase or decrease in activity during the post-fixation period (0–600 ms after the onset of the initial stimulus) versus their activity in the pre-fixation period (*p* < 0.05, Wilcoxon rank-sum test). We defined the fixation period response as the ROC value comparing the firing rates in the same two 600 ms windows (post-fixation versus pre-fixation). An ROC value >0.5 implies that neuronal activity is increased after the onset of the initial stimulus.
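As a rough illustration of this windowed normalization, the sketch below computes the ROC value for one 200 ms window against the 600 ms pre-fixation baseline. All names (`firing_rate`, `normalized_activity`) and the spike times are hypothetical. The half-open window boundaries and the exclusion of tied pairs are assumptions, and the Wilcoxon rank-sum classification step is not reproduced here.

```python
from itertools import product


def roc_value(first, second):
    """P(rate from `first` > rate from `second`), tied pairs excluded."""
    greater = sum(1 for a, b in product(first, second) if a > b)
    less = sum(1 for a, b in product(first, second) if a < b)
    if greater + less == 0:
        return 0.5  # assumed convention when every pair is tied
    return greater / (greater + less)


def firing_rate(spike_times_ms, start_ms, end_ms):
    """Firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    count = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return count / ((end_ms - start_ms) / 1000.0)


def normalized_activity(trials, center_ms, width_ms=200, baseline_ms=600):
    """ROC value comparing per-trial rates in a window centered on
    `center_ms` with per-trial rates in the baseline [-baseline_ms, 0)."""
    half = width_ms / 2
    test = [firing_rate(tr, center_ms - half, center_ms + half) for tr in trials]
    base = [firing_rate(tr, -baseline_ms, 0) for tr in trials]
    return roc_value(test, base)


# Hypothetical spike times (ms) relative to the onset of the initial stimulus.
trials = [
    [-500, 10, 50, 90],
    [-300, -100, 20, 40, 60, 80],
]
print(normalized_activity(trials, center_ms=100))
```

Sliding `center_ms` across the trial would give one row of the kind of pixel plot described for Figure 4, under the same assumptions.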

For the analysis of reward-related modulation (see **Figure 5**), the normalized activity of each neuron was calculated as an ROC value separately for the large and small reward trials. We calculated reward-related modulation by comparing the firing rate in the large versus small reward trials, separately for the reward cue period (0–600 ms after FT onset) and outcome period (0–600 ms after reward delivery), using the Wilcoxon rank-sum test (*p* < 0.05) and ROC analysis. An ROC value >0.5 implies that neuronal activity is higher in the large reward condition (positive reward modulation). We also determined the contributions of the predicted reward magnitude and FT shape to the neuronal responses. Multiple linear regression analysis was performed in which the responses were modeled by a linear sum of the predicted reward magnitude and FT shape. From the regression coefficients of the model, we confirmed that the neuronal activity was significantly modulated by the predicted reward magnitude (Okada et al., 2009).

For the analysis of anticipatory behavior-related modulation (see **Figure 7**), we used the reaction time to fixate on the initial target (RTit) as a measure of the monkeys' anticipation of the occurrence of an upcoming event. Even before the appearance of the initial stimulus, our monkeys frequently shifted their gaze to the center of the screen (entered a 3° window), i.e., they performed self-initiated movements based on their anticipation of an upcoming visual event. An RTit <0 implies that the monkey made an anticipatory gaze shift before the appearance of the initial target, while an RTit >0 implies that the monkey made a gaze shift after the appearance of the target. We classified the trials into short and long RTit categories (shorter and longer RTit than the median values of the individual neurons, respectively), and the normalized activity of each neuron was calculated as an ROC value separately for the short and long RTit trials. We calculated behavior-related modulation by comparing the firing rate in the short versus long RTit trials for the pre-fixation period (0–600 ms before the appearance of the initial stimulus) using the Wilcoxon rank-sum test (*p* < 0.05) and ROC analysis. An ROC value >0.5 implies that neuronal activity is higher in the short RTit condition (positive behavioral modulation).
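The median split on RTit can be sketched as follows. The function name `split_by_rtit` and the trial values are hypothetical. Dropping trials whose RTit equals the median is an assumption, since the handling of such trials is not described.

```python
from statistics import median


def split_by_rtit(rtits_ms, rates):
    """Split per-trial pre-fixation firing rates into short- and long-RTit
    groups relative to the median RTit of the session. Trials exactly at
    the median are dropped (assumption)."""
    cutoff = median(rtits_ms)
    short = [r for t, r in zip(rtits_ms, rates) if t < cutoff]
    long_ = [r for t, r in zip(rtits_ms, rates) if t > cutoff]
    return short, long_


# Hypothetical trials: a negative RTit marks an anticipatory gaze shift.
rtits_ms = [-200, -50, 100, 300]
rates = [12.0, 10.0, 6.0, 4.0]  # spikes/s in the 600 ms pre-fixation window
short, long_ = split_by_rtit(rtits_ms, rates)
print(short, long_)
```

The two groups would then be compared with the ROC value and the rank-sum test, with the short RTit group as the first condition.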

Correlations between the fixation period response, reward-related modulation, and behavior-related modulation were assessed using Spearman's rank correlation. To estimate the significance of the correlation, we performed a permutation test by shuffling each dataset 20,000 times.
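A minimal sketch of a Spearman rank correlation with a shuffle-based significance estimate is given below. The helper names, the random seed, the two-sided criterion, and the example ROC values are all assumptions made for illustration. The original analysis may have differed in these details.

```python
import random


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined correlation treated as zero (assumption)
    return sxy / (sxx * syy) ** 0.5


def spearman_permutation(x, y, n_shuffles=20000, seed=0):
    """Spearman's rho and a two-sided permutation p-value obtained by
    shuffling the ranks of `y` relative to `x`."""
    rx, ry = ranks(x), ranks(y)
    rho = pearson(rx, ry)
    rng = random.Random(seed)
    shuffled = list(ry)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(rho):
            extreme += 1
    return rho, extreme / n_shuffles


# Hypothetical ROC values for eight neurons.
fixation = [0.31, 0.42, 0.48, 0.55, 0.61, 0.70, 0.78, 0.85]
reward = [0.40, 0.45, 0.47, 0.52, 0.50, 0.58, 0.63, 0.66]
rho, p_value = spearman_permutation(fixation, reward)
print(rho, p_value)
```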

We quantified the electrophysiological properties of the neurons, including spike duration, spiking irregularity, and firing rate. Spike duration was measured between the first negative deflection and the peak of the second positive deflection of the spike waveform. Spiking irregularity was measured for each spike using the coefficient of variation (CV) of five successive interspike intervals (ISI), where the standard deviation (S.D.) of the ISI was divided by the mean of the ISI (CV = S.D. (ISI)/mean (ISI)). The irregularity index of each neuron was defined as the median of the CV of all of its spikes recorded during the performance of correct trials. The firing rates of tonic excitatory and suppressive neurons during the pre- and post-fixation periods were compared using the Wilcoxon rank-sum test (*p* < 0.05/6, Bonferroni correction).
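The irregularity index can be sketched as the median of local CV values computed over sliding groups of five successive ISIs. The function name `irregularity_index` and the spike trains are hypothetical. The use of the population standard deviation and the way groups of intervals are assigned to spikes are assumptions, because these details are not specified.

```python
from statistics import mean, median, pstdev


def irregularity_index(spike_times_ms, n_isi=5):
    """Median over sliding groups of `n_isi` successive interspike intervals
    of CV = S.D.(ISI) / mean(ISI). Returns None if no group can be formed."""
    times = sorted(spike_times_ms)
    isis = [b - a for a, b in zip(times, times[1:])]
    cvs = []
    for i in range(len(isis) - n_isi + 1):
        window = isis[i:i + n_isi]
        m = mean(window)
        if m > 0:  # skip degenerate groups of coincident spikes
            cvs.append(pstdev(window) / m)  # population S.D. (assumption)
    if not cvs:
        return None
    return median(cvs)


# Hypothetical spike trains (ms): perfectly regular versus alternating ISIs.
regular = list(range(0, 101, 10))
irregular = [0, 2, 12, 14, 24, 26, 36]
print(irregularity_index(regular), irregularity_index(irregular))
```

A perfectly regular train gives an index of zero, and larger values indicate more irregular firing.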

### **RESULTS**

### **INCREASE AND DECREASE IN THE TONIC ACTIVITY OF PPTN NEURONS**

We analyzed the activity of neurons recorded from the PPTN while the monkeys performed a reward-biased visually guided saccade task (**Figure 1**). During fixation on the reward-conditioned FT, the monkeys could expect either a large or small reward upon the successful completion of the trial depending on the shape of the FT. Animal behavior is influenced by the predicted reward magnitude, such that there is a higher success rate and shorter saccadic reaction time to the ST in large reward-predicted trials than in those with a small reward (Okada et al., 2009).

As we reported previously, many PPTN neurons increased their tonic activity around the time of the initial target appearance, and this increase was sustained until the end of the trial (tonic excitatory neurons); some of these neurons showed predicted reward-related activity modulation (Okada et al., 2009). **Figure 2A** illustrates a raster and spike density function for a representative tonic excitatory neuron. This neuron showed an anticipatory increase in activity around the time of the initial target appearance. After the reward-conditioned FT was presented during fixation, the neuron exhibited higher activity in response to the large reward-indicating FT than the small reward-indicating FT (positive reward modulation). This tonic activity and differential response ended around the time of reward delivery, and the neuronal activity was almost unrelated to the actual magnitude of the given reward (Okada et al., 2009). Thus, this neuron possibly encoded the expectation of future rewards by increasing its activity.

Here we show another group of PPTN neurons that exhibited reverse response patterns compared to the tonic excitatory neurons. **Figure 2B** illustrates an example neuron that decreased its tonic activity during the task period. Its activity began to decrease before the appearance of the FP, and it then showed lower activity in response to a large reward-indicating FT (negative reward modulation). This differential response faded away during the saccade period and its activity returned to the pre-fixation level after reward delivery. There was no significant difference in activity according to the magnitude of the given reward. Thus, opposite to the tonic excitatory neurons, this neuron might encode the expectation of future rewards by decreasing its activity. Some tonic suppressive neurons additionally responded to multiple task events, such as the appearance of the visual stimulus and saccade, similar to the tonic excitatory neurons (Okada and Kobayashi, 2009; Okada et al., 2009). The example neuron shown in **Figure 2C** decreased its tonic activity during the task period and showed no activity modulation with the magnitude of the predicted reward, but showed a phasic burst of activity with saccades toward the ipsilateral side.

Previously, we reported that the activity of tonic excitatory neurons was correlated with the monkeys' behavioral performance, such that the neuronal responses were stronger for successful trials than for erroneous trials (Kobayashi et al., 2002; Okada et al., 2009). Tonic suppressive neurons also showed behavioral performance-related activity modulation. **Figure 3A** compares the activity of a representative tonic suppressive neuron in successful and erroneous trials, in which the monkey failed to fixate on the central target. In successful trials, this neuron showed a decrease in tonic activity around the time of FP appearance; however, there was no decrease in its activity in erroneous trials. This result suggests that tonic suppressive neurons might signal the attentional and/or motivational state, similar to the tonic excitatory neurons.

As shown in **Figures 2B,C**, tonic suppressive neurons showed an increase in tonic activity after task reward delivery that was sustained until the start of the next trial. However, there was no reward magnitude-related difference in activity after reward delivery. To examine further whether the signal of tonic suppressive neurons actually encoded reward information, we compared their responses to rewards that were delivered expectedly during the saccade task and delivered unexpectedly during the intertrial interval. **Figure 3B** shows a representative example that decreased its tonic activity during the task period and exhibited a rebound of activity shortly after reward delivery in the task condition. However, this neuron remained totally unresponsive to unexpectedly delivered rewards. This result suggests that tonic suppressive neurons do not encode the signal for the actual reward, but encode the predicted reward signal by decreasing their activity, similar to tonic excitatory neurons.

**FIGURE 2 | Reward-related activity modulation of the tonic excitatory and suppressive neurons.** The rastergrams and spike density functions are shown in four sections. From left to right, the data are aligned to the time of FP onset, FT onset, ST onset, and reward delivery, respectively. The colored rasters and traces indicate the large reward trials (red), small reward trials (cyan), and all trials (black). **(A)** This representative neuron exhibited an increase in tonic activity during the task period, and showed greater activity for the large reward-predicted cue than for the small reward-predicted cue. **(B)** The activity of this representative neuron decreased during the task period, and showed smaller activity for the large reward-predicted cue. **(C)** This representative neuron also showed a decrease in tonic activity during the task period, but showed no reward prediction-related activity modulation.

We found that many PPTN neurons showed a task-related increase or decrease in tonic activity. As reported previously, more than half of the PPTN neurons increased their activity around the time of the initial target appearance and this activity was sustained until the end of the trial (tonic excitatory neurons, *N* = 372, 54%), regardless of whether the initial target was an uninformative FP (indicating the start of a trial, top panel of **Figure 4A** and red trace of **Figure 4B**) or a reward-conditioned FT (indicating both the start of a trial and reward magnitude, top panel of **Figure 4C** and red trace of **Figure 4D**). Furthermore, another group of PPTN neurons showed a reverse response pattern; their activity was tonically decreased around the time of the initial target appearance and rebounded at the end of the task (tonic suppressive neurons, *N* = 114, 16%, bottom panels of **Figures 4A,C** and blue trace of **Figures 4B,D**). Some of the remaining neurons exhibited various phasic discharges to the visual stimulus, saccade, and reward delivery (Okada and Kobayashi, 2009; Okada et al., 2009). Here, we focused on the neuronal data showing tonic changes in activity.

**the PPTN for the saccade task. (A,C)** The activity of each PPTN neuron is presented as a row of pixels for the task with (**A**, *N* = 507) and without (**C**, *N* = 185) FP presentation. From left to right, the data are aligned to the time of FP/FT onset, FT onset, ST onset, and reward delivery, respectively. The data were plotted separately for neurons that showed an increase in tonic activity (top), no significant modulation (middle), and a decrease in tonic activity (bottom). The neurons have been sorted in the order of their initiation of changes in tonic firing. The color of each pixel indicates the ROC value based on the comparison of the firing rate between a

duration). The warm colors (ROC value >0.5) indicate increases in the firing rate relative to the pre-fixation period, whereas the cool colors (ROC value <0.5) indicate decreases in the firing rate. **(B,D)** Population average activity is shown for the task with **(B)** and without **(D)** FP presentation, separately for tonic excitatory neurons (red) and tonic suppressive neurons (blue). **(E,F)** Histograms for the firing rate during the pre- **(E)** and post-fixation **(F)** periods for the tonic excitatory (red) and suppressive (blue) neurons.

Tonic excitatory and suppressive neurons started to change their tonic activity even before the appearance of the initial target, both in the fixed and quasi-randomized intertrial interval conditions. We first compared the time course of this anticipatory modulation for tonic excitatory and suppressive neurons. The start of the changes in neuronal activity was defined as the time at which the normalized activity rose above (or fell below) that of the pre-fixation period by more than two standard deviations. Many tonic excitatory neurons showed an increase in activity before the appearance of the initial target (*N* = 284, 76% of tonic excitatory neurons). Similarly, many tonic suppressive neurons also started to decrease their activity before the appearance of the initial target (*N* = 74, 65% of tonic suppressive neurons). These preparatory changes in activity were slightly more frequent in tonic excitatory neurons than in tonic suppressive neurons (*p* < 0.05, chi-square test). Thus, the tonic activity was triggered not only by the appearance of the external visual stimulus but also by the anticipation of the upcoming event and/or motivation of the monkeys. We will discuss the relationship between the preparatory changes in activity and the monkeys' anticipatory behavior in detail in a later section.
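The comparison of proportions reported above can be checked with a short sketch of a 2 × 2 chi-square test, using only the counts given in the text (284 of 372 and 74 of 114). The function name `chi_square_2x2` is hypothetical. Whether a continuity correction was applied in the original analysis is not stated, so the sketch omits it as an assumption.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]], with one degree of freedom."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = erfc(sqrt(statistic / 2))  # chi-square survival function, df = 1
    return statistic, p_value


# Counts from the text: neurons with and without a preparatory change,
# for tonic excitatory (284 of 372) and tonic suppressive (74 of 114) groups.
statistic, p_value = chi_square_2x2(284, 372 - 284, 74, 114 - 74)
print(round(statistic, 2), round(p_value, 3))
```

Under these assumptions the resulting p-value falls below 0.05, in line with the reported significance level.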

Previous *in vitro* studies reported that the neurotransmitter of PPTN neurons might be related to their electrophysiological properties, such as spike duration, spiking irregularity, and baseline firing rate (Takakusaki et al., 1996). However, it is difficult to classify neurons by these properties in extracellular recording experiments (Matsumura et al., 1997; Kobayashi et al., 2002). We tested whether these properties were correlated with the task-related tonic activity pattern, but we found no clear evidence for such a correlation. When comparing the firing rate of tonic excitatory and suppressive neurons during the pre-fixation period (600 ms period before the appearance of the initial stimulus), the tonic excitatory neurons exhibited a significantly lower firing rate than the tonic suppressive neurons (**Figure 4E**, median, 9.6 spikes/s for tonic excitatory neurons, 16.2 spikes/s for tonic suppressive neurons, *p* < 0.001). However, when comparing the firing rate during the active period (post-fixation period of the tonic excitatory neurons and pre-fixation period of the tonic suppressive neurons) and silent period (pre-fixation period of the tonic excitatory neurons and post-fixation period of the tonic suppressive neurons), there was no significant difference (*p* = 0.09 for the active period, *p* = 0.58 for the silent period). Therefore, we concluded that these two groups of neurons showed a similar range of firing rates. In addition, tonic excitatory and suppressive neurons did not show a significant difference in spike duration (median, 0.53 ms for both tonic excitatory and suppressive neurons, *p* = 0.87), spiking irregularity (median CV, 0.57 for tonic excitatory neurons and 0.51 for tonic suppressive neurons, *p* = 0.17), or recording site (data not shown, see also Okada et al., 2009).

Thus, we concluded that some PPTN neurons increased, while others decreased, their tonic activity during the task period, and had a similar time course of modulation and range of firing characteristics. Therefore, these two groups of PPTN neurons showed mirror image activity patterns.

### **CORRELATION BETWEEN TASK-RELATED TONIC ACTIVITY AND REWARD-RELATED MODULATION**

We then tested the hypothesis that the tonic changes in the activity of PPTN neurons encoded the state of expectation of future rewards, as DRN neurons do, by analyzing the relationship between the tonic activity during the task execution period and the differential responses to the reward cues and actual reward delivery. Even if the initial target was an uninformative FP, the start of the task could be a clue for the future reward value after the successful completion of a trial (possibly the mean value of the large and small rewards). Therefore, if the tonic activity reflected the monkeys' expectation of future rewards, then the tonic excitatory neurons should exhibit stronger activity to large reward cues (positive reward modulation; activity increased during a positive state), and the tonic suppressive neurons should exhibit weaker activity to large reward cues (negative reward modulation; activity decreased during a positive state). Conversely, if the tonic changes in activity were independent of the expectation of the reward value and encoded some variables about the behavioral task, there would be no systematic relationship between the sign of tonic activity changes and reward-related activity modulation.

As shown in **Figure 2A**, some tonic excitatory neurons exhibited higher activity in response to the large reward-indicating FT than the small reward-indicating FT. This positive reward coding was the major pattern of the predicted reward-related modulation of the tonic excitatory neurons. The population average normalized activity is shown in **Figure 5A** for neurons that showed an increase in tonic activity and positive reward modulation for the reward-conditioned FT (20% of tonic excitatory neurons, *N* = 74/372). The data from the two reward tasks were pooled and presented together because they showed similar results. At the population level, activity modulation started even before the appearance of the initial target. If the large reward cue appeared, the higher activity was maintained, whereas if the small reward cue appeared, the activity decreased, but was still higher than the activity during the intertrial interval. Consistent with our previous finding, this differential response to the reward cue was dependent on the predicted reward magnitude rather than the shape of the FT (*p* < 0.05 and *p* > 0.1, respectively, multiple regression analysis), indicating that the tonic activity encoded the magnitude of the predicted reward rather than a simple visual response to the target stimulus (Okada et al., 2009). In a subset of the tonic excitatory neurons with positive reward modulation, the predicted reward-related differential response was maintained until shortly after reward delivery (*N* = 17/74). A small population of tonic excitatory neurons (*N* = 16/372) showed a weak negative reward modulation, in that their response was smaller for the large reward-indicating cue.

Tonic suppressive neurons also showed a correlation between their task-related tonic activity and reward-related modulation. Some tonic suppressive neurons showed negative reward modulation to the FT (13% of tonic suppressive neurons, *N* = 14/110, **Figure 5B**). Opposite to the population activity pattern of the tonic excitatory neurons, the tonic suppressive neurons remained silent during the presentation of a large reward cue, whereas if a small reward cue appeared, their activity increased, but it was still lower than their activity during the intertrial interval. Similar to the tonic excitatory neurons, the differential response to the reward cue was dependent on the predicted reward magnitude rather than the shape of the FT (*p* < 0.05 and *p* > 0.1, respectively, multiple regression analysis). Some neurons maintained negative reward modulation until shortly after reward delivery (*N* = 6/14), and only two tonic suppressive neurons showed positive reward modulation to the FT.

**FIGURE 5 | Correlation between task-related tonic activity and reward-related modulation. (A,B)** Population average activity is shown separately for tonic excitatory neurons with positive reward modulation **(A)** and tonic suppressive neurons with negative reward modulation **(B)**. Neurons recorded with and without FP presentation are included. **(C)** Plot of the fixation period response (*x*-axis) versus reward-related modulation (*y*-axis). The fixation period response was measured as the ROC value for each neuron to discriminate between its firing rates during the post-fixation period (0–600 ms after the appearance of the initial stimulus) versus the pre-fixation period (0–600 ms before the appearance of the initial stimulus). Reward-related modulation was measured between its firing rates at 0–600 ms after the appearance of the reward-conditioned FT for large versus small reward trials. The marker shapes indicate neurons with a significant increase (rightward triangles) and decrease (leftward triangles) in activity during the post-fixation period (*p* < 0.05). The marker colors indicate neurons that showed significantly higher (red) and lower (cyan) activity during the large reward trials (*p* < 0.05).

We then analyzed the correlation between the strength of tonic activity modulation during the task execution period and the differential response to the reward cue and reward delivery in order to examine the pattern of reward value coding in PPTN neurons. We used ROC analysis to measure the strength of tonic activity modulation and reward-related response of each neuron (Green and Swets, 1966). We found that the modulation of tonic activity during the fixation period was positively correlated with its reward-related modulation to the reward-conditioned FT (0–600 ms after FT appearance, *r* = 0.17, *p* < 0.001, **Figure 5C**), but not after reward delivery (0–600 ms after reward delivery, *r* = 0.04, *p* = 0.14, data not shown). This result further supported the view that PPTN neurons encoded the prediction of future rewards by their bi-directional changes in tonic activity, but did not primarily encode the actual reward value information. These tonic firing PPTN neurons might play a role in the reward prediction-based behavioral control system rather than the actual reward-based feedback valuation system.

Then, we examined the relationship between the absolute strength of tonic activity modulation during the task execution period and the absolute strength of the reward-related modulation after FT presentation. We used the absolute value of the difference between the ROC value and 0.5. The absolute ROC value for tonic activity modulation was not correlated with the absolute ROC value for reward-related modulation (*r* = 0.02, *p* = 0.34), possibly because there was a substantial number of neurons that showed strong tonic changes in activity during the task period, but had no reward-related modulation. Overall, we found no relationship between the absolute strength of the tonic activity during the task and the absolute strength of the reward-related modulation; however, the sign of the reward-related modulation could be predicted by the increase or decrease in tonic activity.

Thus, we concluded that some PPTN neurons encoded the tonic reward value prediction signal either by increasing or decreasing their firing rate.

### **CORRELATION BETWEEN TASK-RELATED TONIC ACTIVITY AND ANTICIPATORY BEHAVIOR-RELATED MODULATION**

We then examined whether the tonic changes in the activity of PPTN neurons are correlated with the anticipatory behavior of the monkeys. Previously, we reported that the predictive increase in activity before the task period was correlated with the monkeys' anticipatory behavior, which possibly reflected the monkeys' prediction of an upcoming visual event and motivational state (Okada et al., 2009). Our monkeys often made an anticipatory gaze shift, i.e., a self-initiated movement based on their anticipatory preparation for an upcoming visual event. The RTit was determined as a measure of the monkeys' behavior and, possibly, their state of anticipation of an upcoming event. We classified the trials according to the RTit into short and long RTit categories (shorter and longer RTit than the median values of the individual neuron, respectively) and compared neuronal activity just before the start of the task (0–600 ms before the appearance of the initial stimulus).

The anticipatory response was correlated with the monkeys' anticipatory behavior. **Figures 6A,B** illustrate the activity of a representative tonic excitatory neuron during the same set of trials aligned to the gaze shift to the center of the screen (**Figure 6A**) and appearance of the FP (**Figure 6B**). This neuron showed a predictive increase in activity before FP appearance, and this tonic activity persisted during the task period. The start of the changes in neuronal activity was time locked to the gaze shift to the center of the screen (**Figure 6A**), rather than the appearance of the FP (**Figure 6B**). In other words, the neuron showed higher anticipatory activity in the short RTit trials than in the long RTit trials (positive behavioral modulation, **Figure 6B**). Activity reached a plateau shortly after the anticipatory gaze shift, even before the appearance of the FP. After the reward-conditioned FT was presented during fixation, this neuron also showed higher activity in response to the large reward-indicating FT than the small reward-indicating FT (positive reward modulation).

**Figure 6 (caption, partial).** Colored rasters and traces indicate the short RTit trials (purple), long RTit trials (green), large reward trials (red), and small reward trials (cyan). **(A,B)** This representative neuron exhibited an increase in tonic activity during the task period, and the increase in activity started before the appearance of the FP in a behavior-dependent manner. **(C,D)** The activity of this representative neuron was decreased in a time-locked manner to the monkey's centering gaze shift.

Similarly, tonic suppressive neurons showed a predictive decrease in activity before the appearance of the initial stimulus in an RTit-dependent manner. **Figures 6C,D** show a neuron that decreased its tonic activity during the task period. Similar to the tonic excitatory neuron shown in **Figures 6A,B**, the pause in its activity started at the time of the centering gaze shift (**Figure 6C**). Therefore, this neuron showed higher anticipatory activity in the long RTit trials than in the short RTit trials (negative behavioral modulation), and this differential response was only apparent in the pre-fixation period (**Figure 6D**).

We found correlations between the tonic activity modulation and anticipatory behavior-related modulation. The population average normalized activity is shown in **Figure 7A** for neurons that showed an increase in tonic activity and positive behavioral modulation. If the monkeys made a short RTit for a centering gaze shift, a subset of tonic excitatory neurons showed higher activity during the pre-fixation period and this activity reached a plateau before the appearance of the FP, whereas if the monkeys made a long RTit for a centering gaze shift, there was a slower increase in activity (20% of tonic excitatory neurons, *N* = 74/372, **Figure 7A**). There was a small population of tonic excitatory neurons (*N* = 10) that showed a weak negative behavioral modulation in that their response was smaller in the shorter RTit trials. Opposite to the tonic excitatory neurons, some tonic suppressive neurons showed higher activity during the pre-fixation period in the short RTit trials than in the long RTit trials (14% of tonic suppressive neurons, *N* = 16/110, **Figure 7B**), and only seven neurons showed positive behavioral modulation. Thus, the increase or decrease in tonic activity reflected the motivational and/or attentional state of the monkey based on the anticipation of an upcoming event.

We then analyzed the correlation between the strength of tonic activity modulation and anticipatory behavior-related modulation. We found that the strength of tonic activity modulation during the task execution period was positively correlated with anticipatory behavior-related modulation (0–600 ms before initial stimulus appearance, *r* = 0.40, *p* < 0.001, **Figure 7C**). Furthermore, the absolute strength of the anticipatory response could be predicted by the tonic activity modulation during the task. The absolute ROC value for the tonic activity modulation was positively correlated with the absolute ROC value for the anticipatory response (*r* = 0.28, *p* < 0.01). Thus, the sign and strength of the anticipatory behavior-related modulation could be predicted by the increase or decrease in tonic activity. Neurons that had behavioral dependency basically showed a predictive change in their firing rate. We also analyzed the effect of reward history on the monkeys' behavior and neuronal activity, but there was no significant correlation.

Thus far, we have described separately the correlation between the tonic activity modulation during the task and predicted reward value-related modulation (**Figure 5C**) and anticipatory behavior-related modulation (**Figure 7C**). We then questioned whether PPTN neurons encoded the externally cued predicted reward value and internal anticipatory state by a correlated increase or decrease in their tonic neuronal activity. The example neuron in **Figure 6B** showed correlated encoding such that the neuron initially showed behavior-related anticipatory increases in activity and then showed a predicted reward-related differential response. **Figure 8** shows the correlation between the strength of reward-related modulation and behavior-related modulation. Some neurons showed both predicted reward value-related activity modulation and anticipatory behavior-related activity modulation (triangles, *N* = 14 for tonic excitatory neurons and *N* = 3 for tonic suppressive neurons). In addition, by correlation analysis, we found that the predicted reward-related modulation after FT presentation was positively correlated with anticipatory behavior-related modulation before the appearance of the initial stimulus (*r* = 0.10, *p* = 0.006, **Figure 8**). On the other hand, largely separate groups of neurons showed either predicted reward value-related activity modulation (red, cyan) or anticipatory behavior-related activity modulation (purple, green). Thus, the prediction of the reward-value signal and anticipatory behavior-related signal converged in a subset of PPTN neurons, while other separate populations of neurons carried these two signals independently.

### **DISCUSSION**

We found that most PPTN neurons showed a tonic increase or decrease in activity during the task execution period, and the sign of tonic activity modulation was correlated with their response magnitude to large/small reward cues. This result suggests that the tonic activity of PPTN neurons encodes the prediction of a future reward. Additionally, the modulation of tonic activity during the task was also correlated with the monkeys' anticipatory behavior. Thus, the tonic activity of PPTN neurons also reflects the monkeys' motivational and/or attentional state, which was based on the anticipation of an upcoming event. Altogether, some PPTN neurons increased, while others decreased, their activity driven either by the externally cued reward value or internal anticipatory state. These bi-directional modulation patterns with reward and motivation are in agreement with the presumed role of the PPTN in reward processing and motivational control.

Some previous studies reported the phasic activity of PPTN neurons in response to a given reward (Dormont et al., 1998; Kobayashi et al., 2002; Okada et al., 2009; Norton et al., 2011).

Previously, we reported that different individual PPTN neurons showed task-related tonic activity and actual reward-related phasic responses (Okada et al., 2009). Consistent with this, we found that the modulation of tonic activity during the fixation period was not correlated with the actual reward magnitude-related activity modulation after reward delivery. Another group of neurons, i.e., the tonic suppressive neurons we reported here, showed an increase in tonic activity after task reward delivery that was sustained until the start of the next trial. However, the rebound in activity was not primarily the reward magnitude-related response because: (1) the strength of rebound activity did not change with the actual reward magnitude and (2) tonic suppressive neurons remained totally unresponsive to unexpectedly delivered rewards. Therefore, tonic excitatory and suppressive neurons encode the predicted reward value by increasing and decreasing their tonic activity, and further, a separate population of PPTN neurons encodes the actual reward value by a phasic increase in their activity (Okada et al., 2009).

[Figure caption fragment] increase (upward triangles) and decrease (downward triangles) in activity (*p* < 0.05).

Because obtaining a reward and avoiding a punishment are basic desires of all animals, similar modulation of reward prediction-related neuronal activity has been reported in many brain areas, including the cerebral cortices and basal ganglia nuclei (Leon and Shadlen, 1999; Roesch and Olson, 2003; Samejima et al., 2005; Belova et al., 2008; Joshua et al., 2009; Vickery et al., 2011; Tachibana and Hikosaka, 2012). Midbrain dopaminergic neurons encode the error between reward prediction and the actual reward and act as a teacher to revise the reward prediction to match an uncertain environment and acquire the maximum reward (Schultz, 1998; Bromberg-Martin et al., 2010b). The PPTN receives signals from these reward-related structures and provides strong excitatory inputs to dopaminergic neurons (Mena-Segovia et al., 2008b; Winn, 2008). Computational models of dopaminergic neuronal firing presumed the necessity of tonic excitatory and inhibitory reward prediction signals into dopaminergic neurons to produce the reward prediction error signal (Houk et al., 1995; Montague et al., 1996). The mirror image activity patterns of reward prediction-related tonic excitatory and suppressive PPTN neurons would match the requirements of this model. Thus, PPTN neurons could send both positive and negative reward prediction components to dopaminergic neurons, which are necessary for the computation of the reward prediction error signal.
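The role of positive and negative prediction components in such models can be illustrated with the standard temporal-difference form of the reward prediction error. This is a minimal sketch: the function name, the discount factor `gamma`, and the numerical values are illustrative assumptions, and assigning the two prediction terms to excitatory and inhibitory inputs follows the interpretation proposed above rather than any measured circuit.

```python
def prediction_error(reward, v_now, v_next, gamma=1.0):
    """Temporal-difference error: delta = reward + gamma * V(next) - V(now).

    The two prediction terms enter with opposite signs, which is why the
    models need both an excitatory and an inhibitory prediction input.
    """
    excitatory_prediction = gamma * v_next   # positive prediction component
    inhibitory_prediction = -v_now           # negative prediction component
    return reward + excitatory_prediction + inhibitory_prediction


# Hypothetical values: reward of 1 unit, predictions between 0 and 1.
print(prediction_error(reward=1.0, v_now=1.0, v_next=0.0))  # fully predicted reward
print(prediction_error(reward=1.0, v_now=0.0, v_next=0.0))  # unexpected reward
print(prediction_error(reward=0.0, v_now=1.0, v_next=0.0))  # omitted reward
```

A fully predicted reward yields no error because the negative prediction component cancels the reward term, whereas an unexpected reward and an omitted reward yield errors of opposite sign.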

In addition to the reward prediction-related activity modulation, a somewhat overlapping group of PPTN neurons showed anticipatory behavior-related activity modulation. These neurons showed an anticipatory increase/decrease of tonic activity before the appearance of the initial stimulus that was maintained until the end of the task. Furthermore, a tonic change in neuronal activity was almost absent in the error trials. Indeed, many previous studies reported that the PPTN is involved in the motivational control system. The cholinergic projections from the PPTN to the thalamus are considered as a part of the ascending reticular activating system and have a role in motivational control (Steriade, 1996). Several motivated behaviors of rats are controlled by the PPTN (Kozak et al., 2005; Wilson et al., 2009). In conditioned cats, reversible blockage of the PPTN by the injection of muscimol caused an elongation of intertrial intervals in a lever-release task (Conde et al., 1998). The correlation between the tonic activity and the monkeys' anticipatory behavior suggests that the tonic activity of PPTN neurons might reflect the motivational and/or attentional state of the monkey and could act as the motivational drive to start and successfully complete a behavior.

We found that somewhat overlapping, but largely separate, groups of neurons showed predicted reward value-related activity modulation or behavior-related activity modulation with bidirectional changes in tonic activity. Previous studies hypothesized functional differences between the anterior and posterior PPTN, such that the posterior PPTN is connected with the sensorimotor structure and the ventral tegmental area, whereas the anterior PPTN is connected with the forebrain and the substantia nigra pars compacta (Winn, 2008). One possibility is that the neurons that showed reward-related activity modulation belong to the anterior PPTN and play a role in reward processing, whereas the neurons that showed behavior-related activity modulation belong to the posterior PPTN and play a role in motivational control. However, we found no difference in the recording sites between the reward- and behavior-related neurons. The PPTN is also hypothesized to be an integrative interface for multimodal signals (Inglis and Winn, 1995). Therefore, another possibility is that reward- and motivation-related signals converge at the PPTN neurons, and thus the PPTN neurons encoded the externally cued predicted reward value and internally driven anticipatory behavior by a correlated increase or decrease in their tonic activity.

The PPTN is connected with other neuromodulator systems that are involved in motivated behavior. The cholinergic PPTN has reciprocal connections with the serotonergic DRN (Steininger et al., 1992; Honda and Semba, 1994), and their mutual functions reportedly control wake/sleep and locomotion (Kayama and Koyama, 2003; Takakusaki et al., 2004; Harris-Warrick, 2011). In a motivated behavioral task, DRN neurons carry state value signals that track progress through a task both before and after reward delivery (Bromberg-Martin et al., 2010a). The tonic response patterns of PPTN and DRN neurons were very similar during the task execution period during which the monkeys predict the future reward. There were also differences in their activity patterns, such that most DRN neurons continually encode the reward signal after reward delivery; however, the firing of many PPTN neurons returns to the baseline state around the time of reward delivery. These different neuromodulator systems might play a role in motivated behavioral control in parallel and interact with one another.

While the PPTN is the major source of cholinergic projections in the brainstem, it also contains glutamatergic, GABAergic, dopaminergic, and noradrenergic neurons. One simple hypothesis is that these neurochemical types of neurons correspond to the different response types such as the increase and decrease in tonic activity. For example, during a behavioral task, cholinergic/glutamatergic tonic excitatory neurons could be activated while GABAergic tonic suppressive neurons reduce their firing and thereby disinhibit target neurons; together, they would form a push-pull circuit that could effectively activate the target neurons. However, there are no reliable electrophysiological criteria (e.g., firing rate, spike shape, and spiking regularity) to identify the neurotransmitter of the recorded neuron. Additionally, we found that the tonic activity pattern of our recorded PPTN neurons had no clear relationship with several electrophysiological properties. One future direction is to determine their neuronal activity and neurotransmitter content by using new techniques (Mena-Segovia et al., 2008a; Boucetta and Jones, 2009; Cohen et al., 2012).

In addition to the tonic excitatory neurons and their positive reward- and behavioral modulation we reported previously (Okada et al., 2009), here, we demonstrated that other PPTN neurons showed a decrease in tonic activity during the task period. Furthermore, some of these neurons showed negative activity modulation related to the predicted reward value and anticipatory behavior. The negative reward prediction signal of the PPTN would match the requirements of the reinforcement learning model. Conversely, the role of the negative anticipatory behavior-related signal remains rather elusive. The cholinergic, serotonergic, and dopaminergic neuromodulatory systems control many brain functions in a mutually interacting manner, and an understanding of the role of each neuromodulator system in reinforcement learning and motivational behavioral control will be an important direction for future research.

### **REFERENCES**


### **ACKNOWLEDGMENTS**

This work was supported by Precursory Research for Embryonic Science and Technology (PRESTO) from the Japan Science and Technology Agency and Grants-in-aid for scientific research from the Japan Society for the Promotion of Science (24120511, 23650145) and Osaka University Global COE program "Human Behavior and Socioeconomic Dynamics."




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 December 2012; accepted: 25 April 2013; published online: 14 May 2013.*

*Citation: Okada K and Kobayashi Y (2013) Reward prediction-related increases and decreases in tonic neuronal activity of the pedunculopontine tegmental nucleus. Front. Integr. Neurosci. 7:36. doi: 10.3389/fnint.2013.00036*

*Copyright ©2013 Okada and Kobayashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Modulation of firing and synaptic transmission of serotonergic neurons by intrinsic G protein-coupled receptors and ion channels

### *Takashi Maejima, Olivia A. Masseck, Melanie D. Mark and Stefan Herlitze\**

*Department of Zoology and Neurobiology, Ruhr-University Bochum, Bochum, Germany*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*David M. Lovinger, National Institutes of Health, USA; Evan Deneris, Case Western Reserve University, USA*

### *\*Correspondence:*

*Stefan Herlitze, Department of Zoology and Neurobiology, Ruhr-University Bochum, Universitätsstr. 150, ND 7/31, D-44780 Bochum, Germany. e-mail: stefan.herlitze@rub.de*

Serotonergic neurons project to virtually all regions of the central nervous system and are consequently involved in many critical physiological functions such as mood, sexual behavior, feeding, sleep/wake cycle, memory, cognition, blood pressure regulation, breathing, and reproductive success. Therefore, serotonin release and serotonergic neuronal activity have to be precisely controlled and modulated by interacting brain circuits to adapt to specific emotional and environmental states. We review the current knowledge about G protein-coupled receptors and ion channels involved in the regulation of the serotonergic system and how their regulation modulates the intrinsic activity of serotonergic neurons and their transmitter release, and we discuss the latest methods for controlling the modulation of serotonin release and intracellular signaling in serotonergic neurons *in vitro* and *in vivo*.

**Keywords: 5-HT system, GPCRs, auto-regulation, hetero-regulation, optogenetics**

### **INTRODUCTION**

The serotonergic system consists of a small number of neurons that are born in the ventral regions of the hindbrain (Deneris and Wyler, 2012). In the adult nervous system, serotonergic neurons [5-HT (5-hydroxytryptamine) neurons] are located in the nine raphe nuclei that are restricted to the basal plate of the midbrain, pons, and medulla (Dahlstrom and Fuxe, 1964). 5-HT neurons located in the rostral raphe nuclei, such as the dorsal raphe nucleus (DRN) and the median raphe nucleus (MRN), give rise to the majority of the serotonergic ascending fibers into the forebrain including cerebral cortex, limbic system, and basal ganglia (Jacobs and Azmitia, 1992). The activity of the serotonergic system is regulated via transmitter release from local interneurons and/or afferents to the raphe nuclei (hetero-regulation), via mechanisms arising from 5-HT neurons themselves (auto-regulation), and potentially via alterations in the extracellular milieu (e.g., increase in CO2; Pineyro and Blier, 1999; Richerson, 2004). In this review, we will discuss G protein-coupled receptors (GPCRs) and ion channels located at somatodendritic and presynaptic regions of 5-HT neurons in the DRN and MRN that contribute to the modulation of 5-HT neuronal activity and 5-HT release (**Figure 1**).

The DRN and MRN are the primary nuclei of 5-HT projections to the forebrain and provide the neural substrate for communication between the global forebrain and other neuromodulatory systems by sending a wide range of 5-HT projections and receiving a wide variety of afferents (Jacobs and Azmitia, 1992). The DRN is located right beneath the posterior part of the cerebral aqueduct and contains about half of all 5-HT neurons in the central nervous system (CNS). It can be further divided into six regions: rostral, caudal, dorsomedial, ventromedial, interfascicular, and lateral parts. The MRN is located at the ventral expansion of the DRN or the midline of the pontine tegmentum where many 5-HT neurons are densely packed in the midline and some 5-HT neurons are scattered in the periphery. Within the DRN and MRN, 5-HT neurons project to defined target areas in the brain (Adell et al., 2002; Lechin et al., 2006). For example, DRN 5-HT neurons innervate the prefrontal cortex, lateral septum, and ventral hippocampus, while MRN 5-HT neurons innervate the temporal cortex, medial septum, and dorsal hippocampus. Afferent projections to the raphe nuclei are diverse and include acetylcholine (ACh) from the laterodorsal tegmental nucleus, dopamine from the substantia nigra and ventral tegmental area, histamine from the tuberomammillary hypothalamic nucleus, noradrenaline (NA) from the locus coeruleus, serotonin itself from the raphe nuclei and several neuropeptides as well as excitatory glutamatergic and inhibitory GABAergic inputs. Glutamatergic inputs come from several nuclei including the medial prefrontal cortex and lateral habenula nucleus (Aghajanian and Wang, 1977; Celada et al., 2001; Adell et al., 2002; Bernard and Veh, 2012). Some inputs make direct contacts with 5-HT neurons, while others project onto local GABAergic interneurons that provide feedforward inhibitory input to 5-HT neurons (Sharp et al., 2007).
In addition to the local GABAergic interneurons located in the raphe nuclei and in the neighboring periaqueductal gray area, extrinsic GABAergic projections have been suggested (Gervasoni et al., 2000). The responsiveness of 5-HT neurons to each of the inputs differs between DRN and MRN, and also within the subnuclei of the DRN (Adell et al., 2002; Lechin et al., 2006). The differences in responsiveness most likely depend on differences in the strength of the afferent inputs for each of the nuclei and subnuclei and also on the expression and types of the ionotropic and metabotropic receptors in 5-HT neurons. Furthermore, the intrinsic membrane excitability of 5-HT neurons has been reported to differ in the distinct raphe nuclei (Beck et al., 2004; Crawford et al., 2010). Importantly, beyond the anatomical and physiological differences, it has been reported that subpopulations of 5-HT neurons have distinct implications in specific physiological functions and behaviors (Abrams et al., 2004; Lechin et al., 2006; Hale and Lowry, 2010).

"fnint-07-00040" — 2013/5/21 — 14:57 — page 1 — #1

### **AUTO-REGULATION**

The midbrain 5-HT neurons elicit spontaneous action potentials (APs), with a regular, slow firing pattern (1–5 APs/s; Aghajanian and Vandermaelen, 1982; Vandermaelen and Aghajanian, 1983). 5-HT released from 5-HT neurons acts either on the 5-HT neuron itself or on the target circuits. There are several ways in which 5-HT neurons may receive 5-HT. First, dendrodendritic synapses releasing 5-HT have been described in raphe nuclei between 5-HT neurons. Second, recurrent axonal collaterals have been suggested to project back to the raphe nucleus itself to release 5-HT. Finally, 5-HT neurons between different raphe nuclei such as DRN and MRN communicate with each other via 5-HT (for review, see Adell et al., 2002; Harsing, 2006; Lechin et al., 2006). Indeed, electrical stimulation in DRN slice preparations induces 5-HT1A receptor-mediated slow inhibitory postsynaptic potentials (IPSPs) in 5-HT neurons (Pan et al., 1989; Morikawa et al., 2000), demonstrating 5-HT release in the proximity of 5-HT neurons.

Once 5-HT is released, 5-HT receptors will be activated. Seven subgroups of 5-HT receptors comprising ionotropic as well as metabotropic receptors have been described (5-HT1–5-HT7), with 15 total variants identified to date (Barnes and Sharp, 1999; Hoyer et al., 2002; Kroeze et al., 2002). The 5-HT GPCRs can be divided into three major subgroups depending on which G protein signaling pathway they activate. 5-HT1 receptors couple mainly to the Gi/o pathway; 5-HT4, 5-HT5, 5-HT6, and 5-HT7 receptors couple to the Gs pathway; and 5-HT2 receptors activate the Gq/11 pathway. The 5-HT3 receptors are ligand-gated ion channels.
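The grouping described above can be written down as a small lookup table. This is only a sketch for orientation: the dictionary and helper names are hypothetical, and the assignments simply restate the grouping given in the text.

```python
# Signaling mechanism per 5-HT receptor subgroup, as grouped in the text.
COUPLING = {
    "5-HT1": "Gi/o",
    "5-HT2": "Gq/11",
    "5-HT3": "ligand-gated ion channel",
    "5-HT4": "Gs",
    "5-HT5": "Gs",
    "5-HT6": "Gs",
    "5-HT7": "Gs",
}


def coupling_of(receptor):
    """Look up the mechanism for a subtype name such as '5-HT1A' or '5-HT2A/2C'.

    The first five characters of the name identify the subgroup (e.g. '5-HT1').
    """
    return COUPLING[receptor[:5]]


def subgroups_for(mechanism):
    """Return all receptor subgroups assigned to one signaling mechanism."""
    return sorted(name for name, mech in COUPLING.items() if mech == mechanism)


print(coupling_of("5-HT1A"))   # Gi/o
print(subgroups_for("Gs"))     # ['5-HT4', '5-HT5', '5-HT6', '5-HT7']
```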

5-HT1A, 5-HT1B, and 5-HT1D receptors are found on somatodendritic and axonal regions of 5-HT neurons (McDevitt and Neumaier, 2011). All three receptors act as negative feedback effectors for 5-HT neuronal firing and 5-HT release (Andrade, 1998). Somatodendritically located 5-HT1A receptors down-regulate the firing rate of 5-HT neurons via activation of G protein-coupled inwardly rectifying potassium channels (GIRK) leading to membrane hyperpolarization, and reduction or complete block of AP firing (Colino and Halliwell, 1987; Sprouse and Aghajanian, 1987; Blier et al., 1989; Hjorth and Sharp, 1991; Penington et al., 1993; Stamford et al., 2000). In addition, 5-HT1A receptors inhibit voltage-gated Ca<sup>2+</sup> channels of the N- and P/Q-type in 5-HT neurons, as application of a selective 5-HT1A agonist diminishes somatic Ca<sup>2+</sup> channel currents (Penington and Kelly, 1990; Penington et al., 1992; Bayliss et al., 1997b). The functional consequence of Ca<sup>2+</sup> channel inhibition is an increase in the firing rate due to reduction in the afterhyperpolarization, which may involve Ca<sup>2+</sup>-activated K<sup>+</sup> channels (Bayliss et al., 1997b). The physiological role of the differential effects of 5-HT1A receptors on AP firing has not been addressed so far, but may involve input specificity due to 5-HT1A/GIRK and 5-HT1A/Ca<sup>2+</sup> channel colocalization in specific subcellular domains and/or differences in the regulatory properties of heterogeneous 5-HT neurons within and among different raphe nuclei (Calizo et al., 2011).

The predominant 5-HT receptors at the presynaptic terminal are the 5-HT1B/1D receptors. 5-HT1B/1D receptors have been shown to inhibit 5-HT release from the axonal varicosities as demonstrated with electrophysiological experiments (Sprouse and Aghajanian, 1987; Boeijinga and Boddeke, 1993; Morikawa et al., 2000), probably due to inhibition of Ca<sup>2+</sup> influx through voltage-gated Ca<sup>2+</sup> channels such as P/Q-type and N-type Ca<sup>2+</sup> channels (Kimura et al., 1995; Harvey et al., 1996). The 5-HT1B receptors have been shown to underlie presynaptic autoinhibition of 5-HT release in which 5-HT1A-mediated slow IPSPs are reduced by previously released 5-HT activating presynaptic 5-HT1B receptors (Morikawa et al., 2000). In addition, 5-HT1B receptors have been suggested to up-regulate 5-HT reuptake by serotonin transporters (Xie et al., 2008; Hagan et al., 2012), and 5-HT synthesis itself might be under the control of 5-HT1B (Hjorth et al., 1995). Thus, 5-HT1B autoreceptors may have the ability to control 5-HT release independently from the actual firing rate.

5-HT1F receptor mRNA has also been detected in the raphe nuclei (Bruinvels et al., 1994). Since 5-HT1F receptors have a high affinity for sumatriptan, a 5-HT1B/1D agonist, and the sumatriptan-induced reduction in 5-HT release (monitored by voltammetry in brain slices) could not be blocked by 5-HT1A/1B/1D receptor antagonists, 5-HT1F had been suggested as a possible candidate of a serotonergic autoreceptor in the MRN (Hopwood and Stamford, 2001).

While the expression and function of 5-HT1 receptors have been directly demonstrated in 5-HT neurons, involvement of other 5-HT receptors such as 5-HT2, 5-HT3, and 5-HT4–7 is less clear and may differ among species, the developmental stage of the animal and 5-HT neuron subtypes.

5-HT2 receptors are functionally expressed in particular on GABAergic interneurons in the DRN, since activation of 5-HT2A/2C receptors increases fast inhibitory postsynaptic current (IPSC) frequency in 5-HT neurons and reduces 5-HT neuronal firing as electrophysiologically measured in brain slices (Liu et al., 2000; Leysen, 2004). 5-HT2 receptor mRNA and proteins have been identified in the DRN and embryonic 5-HT neurons (Wright et al., 1995; Clemett et al., 2000; Wylie et al., 2010). Additionally, 5-HT2 receptors have been postulated to increase 5-HT1A-mediated responses in 5-HT neurons (Kidd et al., 1991). However, a direct modulatory effect of 5-HT2 in 5-HT neurons has not been demonstrated.

The 5-HT3 receptors have also been suggested to act as presynaptic autoreceptors in serotonergic nerve terminals. Although 5-HT3 receptors have been shown to enhance 5-HT release in various brain areas including the raphe nuclei as monitored by [3H]5-HT assays (Bagdy et al., 1998), there is no direct immunohistochemical and electrophysiological evidence of the presence of 5-HT3 receptors in 5-HT neurons (van Hooft and Vijverberg, 2000).

For the Gs protein-coupled 5-HT4–7 receptors, mainly indirect evidence exists for an autoregulatory role of these GPCRs in 5-HT neurons.

5-HT4 receptors seem to be located somatodendritically and presynaptically. A presynaptic potentiating effect of 5-HT4 receptor on glutamate release, which can be counteracted by 5-HT1A receptor-mediated inhibitory action, has been described in hippocampal neurons (Kobayashi et al., 2008). Since neurotransmitter release of various transmitters including 5-HT is modulated by 5-HT4 agonists, a presynaptic localization of 5-HT4 receptors on 5-HT neurons seems possible (Mengod et al., 2010).

[Figure caption fragment] GPCRs coupling to the Gq/11 pathway or activation of excitatory ligand-gated

[Figure caption fragment] pathway and via opening of excitatory, ligand-gated ion channels.

While the function of 5-HT5 receptors in the CNS has not been thoroughly studied, two studies suggest a role of 5-HT5 receptors for modulating 5-HT neurons. First, 5-HT5B receptor mRNA, a receptor which is expressed in rodents but not in humans, is colocalized with the mRNA of 5-HT transporter in the DRN (Serrats et al., 2004). Second, blockade of 5-HT5A receptors in the DRN attenuates the 5-carboxamidotryptamine (5-CT; non-selective agonist in particular for 5-HT1A/1B/1D receptors) induced reduction of 5-HT neuronal firing but fails to affect 5-HT release measured using fast cyclic voltammetry *in vitro* (Thomas et al., 2006). The data suggest an autoreceptor modulation of 5-HT neurons via 5-HT5A receptors.

5-HT6 and 5-HT7 receptor protein and mRNA have been detected in cells in the raphe nuclei including the DRN (Ruat et al., 1993; To et al., 1995; Gustafson et al., 1996; Woolley et al., 2004; see also Gerard et al., 1997; Hamon et al., 1999), but a functional role as autoreceptors for modulating 5-HT neurons could not be demonstrated so far (Bourson et al., 1998; Roberts et al., 2001).

Thus, 5-HT neuronal firing is auto-regulated in particular by 5-HT1A receptors, which activate the Gi/o pathway and thereby open K<sup>+</sup> and close Ca<sup>2+</sup> conductances. At the presynaptic terminal, 5-HT1B/1D receptor activation reduces 5-HT release most likely via Gi/o protein-mediated inhibition of presynaptic Ca<sup>2+</sup> channels. In addition, potentiation of 5-HT release by activating 5-HT4 receptors via the Gs pathway seems possible. Since other 5-HT receptor mRNAs have been detected in 5-HT neurons, other autoregulatory mechanisms may exist in subgroups of 5-HT neurons or during different developmental stages of the serotonergic transmitter system. In particular, animal models for the selective activation of these GPCRs during development will further elucidate the modulatory role of other 5-HT receptors in the auto-regulation of 5-HT neuronal firing and 5-HT release.
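How an added K<sup>+</sup> conductance can slow or block firing may be illustrated with a toy leaky integrate-and-fire neuron. This is a deliberately simplified sketch and not a model of real 5-HT neurons: the integrate-and-fire formulation, the function name, and every parameter value (conductances, reversal potentials, drive, threshold) are illustrative assumptions in arbitrary units, and the Ca<sup>2+</sup> channel effects described above are not included.

```python
def count_spikes(g_girk, duration_ms=1000.0, dt=0.1):
    """Count spikes of a toy leaky integrate-and-fire cell (capacitance = 1).

    g_girk stands in for the 5-HT1A-activated GIRK conductance; all
    parameter values are illustrative assumptions, not measurements.
    """
    g_leak, e_leak = 0.1, -65.0      # leak conductance and reversal (mV)
    e_k = -90.0                      # K+ reversal potential (mV)
    drive = 2.0                      # constant depolarizing drive
    threshold, reset = -50.0, -65.0  # spike threshold and reset (mV)

    v, spikes = reset, 0
    for _ in range(int(duration_ms / dt)):
        # Forward-Euler step of dV/dt = -g_leak*(V-E_leak) - g_girk*(V-E_K) + drive
        dv = -g_leak * (v - e_leak) - g_girk * (v - e_k) + drive
        v += dt * dv
        if v >= threshold:
            spikes += 1
            v = reset
    return spikes


baseline = count_spikes(0.0)    # no GIRK activation
weak = count_spikes(0.01)       # weak GIRK activation: slower firing
strong = count_spikes(0.1)      # strong GIRK activation: firing blocked
print(baseline, weak, strong)
```

With these assumed parameters, a small GIRK conductance lowers the firing rate, and a larger one pulls the steady-state potential below threshold so that firing stops, mirroring the "reduction or complete block" described for 5-HT1A activation.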

### **HETERO-REGULATION**

5-HT modulates various complex behaviors and therefore the serotonergic transmitter system receives feedback and feedforward information from other brain areas and networks involved in regulating the different behaviors (Adell et al., 2002; Lechin et al., 2006; Sharp et al., 2007). Thus, the hetero-regulation of the 5-HT neurons involves various transmitter systems.

Forty-nine different GPCRs belonging to all four GPCR subfamilies were identified in postmitotic embryonic 5-HT neurons using microarray expression profiling (Wylie et al., 2010). These GPCRs include adrenergic, calcitonin, cannabinoid, GABA, histamine, opioid, and serotonin receptors. For these transmitter systems, a postnatal modulatory role for 5-HT neurons has been described (see below). In addition, other GPCRs such as thrombin, chemokine, prostaglandin E, melanin-concentrating hormone, cadherin, and parathyroid hormone receptors as well as frizzled (FZD) and smoothened (SMO) homologs and the orphan receptors (GPR 19, 56, 85, 98, 125, 135, 162, and 173) were also identified, but a physiological role for their modulation of 5-HT neurons still needs to be defined and investigated (for a complete list, see Wylie et al., 2010). We will summarize recent findings on the intrinsic hetero-regulation of the serotonergic transmitter system.

### **GABA, GLYCINE, AND GLUTAMATE**

5-HT neurons within the raphe nuclei receive in particular GABAergic but also glutamatergic input. As expected, the GABAergic input onto 5-HT neurons reduces neuronal firing, while glutamatergic input increases firing activity (Pan and Williams, 1989; Levine and Jacobs, 1992; Becquet et al., 1993a,b; for review, see Adell et al., 2002; Harsing, 2006). These effects have been mainly attributed to the expression of ionotropic GABA and glutamate receptors (GluRs) in 5-HT neurons (Tao and Auerbach, 2000; Gartside et al., 2007), which is in agreement with the expression of 2-amino-3-(3-hydroxy-5-methyl-isoxazol-4-yl)propanoic acid (AMPA), *N*-methyl-D-aspartate (NMDA), and kainate receptors [GluR 1,2; NMDA-R-2B, kainate receptor 5 (Grik5)] as well as GABAA receptor subunits (GABAA β1–3 and γ2) in embryonic 5-HT neurons (Wylie et al., 2010). Interestingly, the glycine receptor α1 and β subunits have also been identified in human brain and in mouse embryonic 5-HT neurons (Baer et al., 2003; Wylie et al., 2010). Indeed, the DRN receives input from glycinergic fibers (Rampon et al., 1996, 1999), which inhibit 5-HT neuronal firing and 5-HT release (Gallager and Aghajanian, 1976; Wang and Aghajanian, 1977; Becquet et al., 1993a).

The modulation of 5-HT neurons by GABAB receptors has been addressed in various studies. In general, activation of GABAB receptors by selective agonists decreases 5-HT release (Becquet et al., 1993b). The decrease of 5-HT release by GABAB receptors located within 5-HT neurons is most likely mediated via activation of GIRK channels, leading to a reduction in AP firing (Innis and Aghajanian, 1987; Williams et al., 1988; Bayliss et al., 1997a; Cornelisse et al., 2007). Within 5-HT neurons, GABAB receptors are located extrasynaptically, suggesting that spillover of GABA during high activity of GABAergic neurons would modulate 5-HT neuronal activity within raphe nuclei (Varga et al., 2002). On the other hand, there is little information so far about modulatory effects of metabotropic GluRs (mGluRs) in 5-HT neurons. Although administration of a group II mGluR (mGluR2/3) antagonist has been reported to increase 5-HT neuronal activity, an indirect effect on presynaptic excitatory neurons seems to be involved (Kawashima et al., 2005).

### **CORELEASE OF GLUTAMATE OR GABA FROM 5-HT NEURONS**

Previous reports have suggested the possibility of glutamate release from 5-HT neurons based on the presence of vesicular glutamate transporter type 3 in a subset of 5-HT neurons (Gras et al., 2002; Amilhon et al., 2010). Using optogenetic techniques, the corelease of glutamate and 5-HT from serotonergic terminals was demonstrated in a serotonergic projection from the MRN to hippocampal GABAergic interneurons (Varga et al., 2009). The serotonergic fibers make direct synaptic contacts with the GABAergic neurons and exert fast synaptic transmission mediated by ionotropic GluRs and 5-HT3 receptors. In addition to the glutamate transporters, GABA and its synthesizing enzyme, glutamic acid decarboxylase (GAD), have also been reported in subsets of 5-HT neurons, suggesting the corelease of GABA and 5-HT (Nanopoulos et al., 1982; Belin et al., 1983; Fu et al., 2010; Hioki et al., 2010; Shikanai et al., 2012). However, the vesicular inhibitory amino acid transporter, which is necessary for filling synaptic vesicles with GABA, was absent in 5-HT/GAD67-positive neurons and their projections, suggesting that GABA may be released by nonvesicular mechanisms such as reverse transport through GABA transporters (Shikanai et al., 2012). Thus, 5-HT axonal projections have the potential to modulate neuronal activity in target areas using at least three different transmitters, i.e., 5-HT, glutamate, and GABA. It is intriguing to speculate that the auto-regulation of 5-HT neurons itself might be modulated by corelease of glutamate and GABA.

### **ACETYLCHOLINE**

The DRN also receives cholinergic input from the laterodorsal tegmental nucleus (Wang et al., 2000). Modulation of 5-HT neurons by ACh mainly involves nicotinic ACh receptors (nAChRs) and can increase 5-HT neuronal firing (Mihailescu et al., 2002), for example, via presynaptic modulation of glutamate release (Garduno et al., 2012) or via opening of nAChRs expressed in 5-HT neurons (Galindo-Charles et al., 2008; Chang et al., 2011). Very limited information is available on the expression and function of muscarinic ACh receptors (mAChRs) in 5-HT neurons. M1 mAChRs (Gq/11) might be expressed on serotonergic projections into the hippocampus (Rouse and Levey, 1996), and application of the mAChR antagonist atropine into the DRN enhances antidepressant-like 5-HT1A agonist effects (Haddjeri et al., 2004), which may involve M2 (Gi/o) but not M1 (Gq/11) receptors (Haddjeri et al., 2000). Nevertheless, a detailed and direct demonstration of the involvement of mAChRs in 5-HT neurons is still needed.

### **DOPAMINE**

The 5-HT neurons of the DRN reciprocally interact with the dopaminergic mesencephalic transmitter system via dopamine receptors. D1-like receptors (D1 and D5) couple to the Gs pathway, while D2-like receptors (D2–4) couple to the Gi/o pathway. In the DRN, D2 and D3 receptor expression has been detected so far, with very little or no expression of D1-like receptors (Bouthenet et al., 1987; Dawson et al., 1988; Cortes et al., 1989; Wamsley et al., 1989; Yokoyama et al., 1994; Suzuki et al., 1998). Dopamine increases 5-HT neuronal firing in the DRN *in vivo* and in slice electrophysiological recordings, and also increases 5-HT release detected by *in vivo* microdialysis in the DRN and other brain areas (Ferre and Artigas, 1993; Ferre et al., 1994; Matsumoto et al., 1996; Mendlin et al., 1998; Haj-Dahmane, 2001; Martin-Ruiz et al., 2001). The modulatory effects on 5-HT neurons have been shown to be mediated by D1- and D2-like receptors located outside the DRN (Martin-Ruiz et al., 2001) or by direct activation of D2 receptors expressed in 5-HT neurons (Haj-Dahmane, 2001). Activation of D2-like receptors in 5-HT neurons leads to membrane depolarization, involving activation of G proteins, phospholipase C, and a non-selective cation current, most likely mediated by a transient receptor potential (TRP) channel (Aman et al., 2007). These effects suggest that D2-like receptors in 5-HT neurons may activate the Gq/11 rather than the Gi/o signaling pathway. The Gq/11 protein coupling might be explained by the heterodimerization between D1- and D2-like receptors (Rashid et al., 2007; Hasbi et al., 2010). It should be noted that the experiments suggesting the modulation of 5-HT neurons by intrinsic D2-like receptors were performed with relatively high concentrations of quinpirole and sulpiride in *in vitro* preparations and therefore have to be interpreted carefully.

### **NORADRENALINE**

5-HT neurons in the raphe nuclei receive noradrenergic input, in particular from the locus coeruleus (Adell et al., 2002; Lechin et al., 2006). α1 and α2 adrenergic receptor mRNA and protein have been detected in the DRN and MRN (Unnerstall et al., 1985; Rosin et al., 1993; Scheinin et al., 1994; Talley et al., 1996; Day et al., 1997; Strazielle et al., 1999). α1 and α2 adrenergic receptors couple to the Gq/11 and Gi/o pathway, respectively. Therefore, depending on which pathway is activated, an increase or decrease in 5-HT neuronal activity and 5-HT release can be postulated if adrenergic receptors are expressed in 5-HT neurons. Early studies revealed that NA causes an increase in 5-HT neuronal firing in the DRN. This effect has been suggested to be mediated via activation of α1 adrenoceptors (presumably the α1B adrenoceptor subtype) located on 5-HT neurons (Vandermaelen and Aghajanian, 1983; Day et al., 1997) and may involve the suppression of a 4-aminopyridine-sensitive K<sup>+</sup> conductance (*I*A; Aghajanian, 1985). *In vivo* experiments suggest that α1 adrenoceptors are tonically activated by endogenous NA (Adell and Artigas, 1999; Pudovkina et al., 2003). In contrast, 5-HT release detected by voltammetry or [3H]5-HT assay in the DRN slice preparation is inhibited by NA, an effect which has been attributed to α2 adrenoceptors (involving the α2A adrenoceptor subtype) and also perhaps indirectly to α1 adrenoceptors (Frankhuijzen et al., 1988; Hopwood and Stamford, 2001). Since α2 receptors couple to the Gi/o pathway, 5-HT release and 5-HT neuronal firing could be reduced via α2 adrenoceptors located at the soma or presynaptic terminal of the 5-HT neurons themselves (Hopwood and Stamford, 2001), or via inhibition of NA release at noradrenergic synaptic terminals, which would lower the effective NA concentration available for activating the α1 adrenoceptor/Gq signaling pathway. The direct effect of α2 adrenoceptors in 5-HT neurons is supported by the fact that embryonic 5-HT neurons express α2A adrenoceptors (Wylie et al., 2010).

### **HISTAMINE**

There are four histamine receptors (H1–4), which couple to different G protein pathways, i.e., H1 (Gq/11), H2 (Gs), and H3 and H4 (Gi/o). Early studies suggested that histamine reduces the firing of 5-HT neurons in the DRN (Lakoski and Aghajanian, 1983) via H2 receptors (Lakoski et al., 1984). Since H2 couples to the Gs pathway, which would be expected to excite rather than inhibit the cells expressing it, the results suggest that H2 receptors are localized on GABAergic terminals. Later findings showed that histamine increased 5-HT neuronal firing in the DRN via activation of H1 receptors and the opening of a non-selective cation conductance through Gq/11 signaling pathways (Barbara et al., 2002; Brown et al., 2002). Expression profiling in mouse embryos suggested the expression of H3 receptors in 5-HT neurons, which is consistent with high mRNA levels in the DRN (Lovenberg et al., 1999; Drutel et al., 2001; Pillot et al., 2002). However, the low binding of an H3 receptor-selective radioligand in the DRN suggests that H3 receptors are mainly functional at the presynaptic terminal of 5-HT projections (Pillot et al., 2002). Indeed, increasing levels of histamine decrease 5-HT release detected by an *in vivo* electrochemical technique (Hashemi et al., 2011).

### **ENDOCANNABINOIDS**

The cannabinoid receptor family consists of two subtypes, CB1 and CB2. Modulation of neuronal activity is mainly exerted via CB1, which couples to the Gi/o protein (Howlett et al., 1986) and probably also to the Gs pathway (Glass and Felder, 1997). CB1 receptors are localized in particular at presynaptic terminals, where they inhibit presynaptic Ca<sup>2+</sup> influx and reduce transmitter release via endocannabinoid (eCB)-mediated retrograde signaling (Maejima et al., 2001). CB1 receptors are expressed in serotonergic fibers (Haring et al., 2007; Ferreira et al., 2012) and their mRNA is found early in development (Wylie et al., 2010). 5-HT release in projection areas of the DRN is reduced by activation of CB1, as monitored by microdialysis (Egashira et al., 2002) and [3H]5-HT assay (Nakazi et al., 2000; Egashira et al., 2002), and increased by inhibition of CB1 *in vivo* and *in vitro* (Darmani et al., 2003; Tzavara et al., 2003; Aso et al., 2009). Studies in CB1 knock-out animals and chronic activation of CB1 receptors *in vivo* suggest that CB1 may regulate the function and expression of 5-HT1A receptors (Aso et al., 2009; Moranta et al., 2009; Zavitsanou et al., 2010). Interestingly, 5-HT neurons themselves synthesize eCBs in an activity-dependent manner (Haj-Dahmane and Shen, 2009, 2011). eCB release from 5-HT neurons can be induced by orexin-B, which activates orexin (OX) receptors coupled to the Gq/11 pathway (Liu et al., 2002b; Haj-Dahmane and Shen, 2005). It has therefore been speculated that activity-dependent activation of the Gq/11 pathway in general may lead to the production of eCBs in 5-HT neurons (Haj-Dahmane and Shen, 2011). Within the DRN, eCBs mainly act on glutamatergic terminals and probably also on GABAergic terminals (Liu et al., 2002b; Haj-Dahmane and Shen, 2009; Mendiguren and Pineda, 2009; Tao and Ma, 2012), leading to a reduction in glutamate and GABA release onto 5-HT neurons and thereby changing the activity of the 5-HT neurons themselves.

### **FRIZZLED RECEPTORS**

Four frizzled-class receptors (FZD1–3 and SMO) have been detected in postmitotic embryonic 5-HT neurons (Wylie et al., 2010). These receptors mainly couple to the Wnt signaling cascade and are most likely involved in the development and maturation of 5-HT neurons (Simon et al., 2005; Song et al., 2012). However, the role of frizzled receptors in 5-HT neurons remains to be determined.

### **NEUROPEPTIDES**

Dr. Hökfelt's laboratory demonstrated that various peptide transmitters are expressed in the DRN with species-specific differences between mice and rats (Fu et al., 2010). The identified neuropeptides are cholecystokinin (CCK), calcitonin gene-related peptide (CGRP), vasoactive intestinal peptide (VIP), somatostatin, substance P, dynorphin, neurotensin, thyrotropin-releasing hormone (TRH), enkephalin, galanin, neuropeptide Y (NPY), and corticotropin-releasing factor (CRF). Among these, various peptide receptors have been identified and functionally described in 5-HT neurons.

### **BOMBESIN**

To date, three types of bombesin receptors have been described, i.e., the neuromedin B (NMB) receptor (BB1 receptor), the gastrin-releasing peptide (GRP) receptor (BB2 receptor), and the bombesin receptor subtype 3 (BRS-3 or BB3 receptor; Jensen et al., 2008). The NMB receptors are functionally expressed in 5-HT neurons in the DRN (Pinnock et al., 1994; Woodruff et al., 1996). BB1 receptor activation leads to an increase in 5-HT neuronal firing via suppression of a K<sup>+</sup> current, most likely involving the activation of the Gq/11 pathway (Benya et al., 1992; Woodruff et al., 1996), and as a consequence to an increase in 5-HT release at projection sites such as the hippocampus, as monitored by *in vivo* microdialysis (Merali et al., 2006).

### **CALCITONIN**

Calcitonin receptor (CalcR) mRNA has been localized in 5-HT neurons in the DRN (Nakamoto et al., 2000), which is in agreement with the expression profiling studies of postmitotic embryonic 5-HT neurons (Wylie et al., 2010). The CalcR couples to the Gs pathway (Hay et al., 2005) and the Gq/11 pathway (Offermanns et al., 1996). Since high levels of amylin binding sites are detected in the DRN (Sexton et al., 1994) and mRNA for CGRP has been localized in 5-HT positive axon terminals in monkeys (Arvidsson et al., 1990), it is most likely that CalcR assembles with receptor activity-modifying proteins (RAMPs) in 5-HT neurons to respond to the various peptide transmitters (i.e., CGRP, adrenomedullin, and amylin; Tilakaratne et al., 2000). A direct function of CalcR in 5-HT neurons has not yet been demonstrated. However, injection of CGRP into rats induces anxiety-like behaviors and increases c-Fos expression in the DRN (Sink et al., 2011).

### **CHEMOKINE RECEPTORS**

Two of the 18 identified chemokine receptor subtypes have been detected in microarray analyses of embryonic 5-HT neurons, i.e., the Duffy antigen/chemokine receptor and chemokine (C-X-C motif) receptor 4 (CXCR4; Wylie et al., 2010). CXCR4 is expressed on the outer membrane of the majority of 5-HT neurons in the DRN (Heinisch and Kirby, 2010). So far, only an indirect action of CXCR4 in the modulation of 5-HT neurons has been demonstrated, since application of CXCR4 ligands and antagonists modulates GABA and glutamate release onto 5-HT neurons (Heinisch and Kirby, 2010), which is in agreement with the described Gi/o protein-mediated inhibition of Ca<sup>2+</sup> channels via CXCR4 (Oh et al., 2002). In addition, CX3CR1 is also expressed in 5-HT neurons in the DRN and MRN (Heinisch and Kirby, 2009). Here, the CX3CR1-specific ligand fractalkine/CX3CL1 increased the evoked IPSC amplitude in 5-HT neurons (Heinisch and Kirby, 2009), an effect which is most likely mediated postsynaptically and not presynaptically. The effect is surprising, since CX3CR1 has been described to couple to the Gi/o pathway (Oh et al., 2002), which would inhibit synaptic transmitter release and induce paired-pulse facilitation (PPF) if activated on GABAergic terminals. Therefore, CX3CL1 could increase or otherwise modulate GABAA receptor trafficking and GABAA receptor currents in 5-HT neurons via activation of CX3CR1. A signaling function for the Duffy antigen remains to be determined.

### **CHOLECYSTOKININ**

The expression of CCK receptors in 5-HT neurons in the DRN has also been suggested. Application of CCK increases 5-HT neuronal firing, which is blocked by the CCK1 antagonist L-364,718 (Boden et al., 1991). In addition, 5-HT release measured as outflow of [3H]5-HT in cortical slices is increased by CCK-4, which involves CCK2 receptors (Siniscalchi et al., 2001). Both CCK1 and CCK2 receptors mainly couple to the Gq/11 and Gs pathways (de Weerth et al., 1993; Lee et al., 1993; Ulrich et al., 1993).

### **CORTICOTROPIN-RELEASING FACTOR**

Two CRF receptor subtypes (CRF1 and CRF2) have been described in the brain, and both seem to be localized in GABAergic neurons in the DRN as well as in 5-HT and non-5-HT neurons, with differential subcellular localizations (for review, see Valentino et al., 2010). CRF1 and CRF2 couple to the Gs pathway (Chang et al., 1993; Perrin et al., 1993; Vita et al., 1993; Lovenberg et al., 1995; Liaw et al., 1996) and also to the Gq/11 pathway in heterologous expression systems (Dautzenberg et al., 2004), leading to the assumption that stimulation of CRF1 or CRF2 will increase neuronal firing. Indeed, activation of CRF1 on GABAergic neurons increases GABA release onto 5-HT neurons, while activation of CRF2 elicits an inward current in 5-HT neurons (Kirby et al., 2008). Based on the differential localization of the CRF receptors within the DRN and their receptor type-specific actions on 5-HT neurons, it has been suggested that at low concentrations of CRF, 5-HT neuronal activity is decreased, while at high concentrations, 5-HT neuronal activity is increased (Kirby et al., 2008; Valentino et al., 2010).

### **GALANIN**

The neuropeptide galanin activates three types of galanin receptors (GalR1–3). GalR1 and GalR2 are highly expressed in the DRN (Melander et al., 1988; Larm et al., 2003; Lu et al., 2005; Sharkey et al., 2008). Moreover, GalR1 expression has also been described in 5-HT neurons of rats but not of mice (Xu et al., 1998; Larm et al., 2003). GalR activation in the DRN causes a K<sup>+</sup> conductance-mediated hyperpolarization in rat brain slices (Xu et al., 1998), most likely via GalR3-mediated Gi/o activation of GIRK channels (Swanson et al., 2005). These effects are in agreement with *in vivo* microdialysis studies showing that injection of galanin into the DRN reduces 5-HT release in the hippocampus via GalR activation (Kehr et al., 2002). In contrast to the inhibitory action of galanin on 5-HT neuronal activity, a reduction in inhibitory input onto 5-HT neurons has also been described (Sharkey et al., 2008). Here, the pan GalR1–3 agonist reduced GABA-mediated fast synaptic transmission accompanied by an increase in PPF, suggesting that GalRs are expressed on GABAergic terminals and inhibit presynaptic Ca<sup>2+</sup> channels via the Gi/o pathway. On the other hand, the GalR2 agonist galanin (2–11) reduced IPSP amplitude but did not cause PPF, suggesting a postsynaptic action (Branchek et al., 2000; Sharkey et al., 2008). Additionally, galanin (2–11) was demonstrated to increase 5-HT release in hippocampal tissue, as shown by immunofluorescence and high-performance liquid chromatography (HPLC) measurement (Mazarati et al., 2005). Therefore, GalR2 receptors may activate 5-HT neurons via a reduction in GABAergic input onto 5-HT neurons and/or via a Gq/11-mediated increase in 5-HT neuronal firing. Galanin also modulates 5-HT1A autoreceptor responses *in vivo*. A possible mechanism of this modulation could be the heterodimerization of GalR with 5-HT1A, which has been observed in heterologous expression systems (for review, see Kuteeva et al., 2008; Borroto-Escuela et al., 2010).

### **HYPOCRETIN–OREXIN**

The two hypocretin/orexin (OX1 and OX2) receptors are expressed in tryptophan hydroxylase-positive neurons in the DRN (Brown et al., 2002). Their intracellular signaling targets are rather complex, involving activation of Gi/o, Gq/11, Gs, and other G proteins (Scammell and Winrow, 2010). Orexin-positive fibers project onto GABAergic as well as 5-HT neurons in the DRN (Peyron et al., 1998). Application of the neuropeptides orexin-A and orexin-B causes a Na<sup>+</sup>/K<sup>+</sup> non-selective cation current in 5-HT neurons (Brown et al., 2002; Liu et al., 2002b; Kohlmeier et al., 2008), suggesting that activation of OX1 and OX2 leads to an increase in 5-HT neuronal firing. The neuropeptides also induce GABA release onto 5-HT neurons at higher peptide concentrations (Liu et al., 2002b). In addition, orexin increases the somatic L-type Ca<sup>2+</sup> current in 5-HT neurons in a protein kinase C-dependent manner (Kohlmeier et al., 2008). It has therefore been suggested that modulation of Ca<sup>2+</sup> transients by orexin may be involved in the transcriptional regulation of long-term processes (Kohlmeier et al., 2008).

### **OPIOIDS**

Raphe nuclei receive dynorphinergic, enkephalinergic, and β-endorphinergic innervation (Adell et al., 2002). These transmitters activate μ, κ, and δ opioid receptors, which primarily couple to the Gi/o pathway. Injection of morphine into the DRN causes an increase in 5-HT release detected by forebrain microdialysis (Tao and Auerbach, 1994). The increase in 5-HT release is most likely mediated via Gi/o protein-mediated inhibition of GABAergic interneurons in the DRN, involving μ opioid receptors located on GABAergic neurons (Jolas and Aghajanian, 1997). The modulation of 5-HT release in the DRN by κ and δ opioid receptors has also been described (Tao and Auerbach, 2002). Activation of δ receptors increased, while activation of κ receptors decreased, 5-HT release measured with *in vivo* microdialysis. The κ receptor effects do not involve the modulation of GABAergic or glutamatergic inputs in the DRN (Tao and Auerbach, 2002), suggesting that κ receptors are expressed in 5-HT neurons. Likewise, opioid receptor-like 1 (Oprl1) receptors are most likely expressed in 5-HT neurons, for the following reasons. Oprl1 or nociceptin (NOP) receptors belong to the opioid receptor family but are activated by NOP (orphanin FQ), a neuropeptide derived from the prepronociceptin protein. High levels of NOP receptor binding sites have been detected in the DRN (Florin et al., 2000), and Oprl1 receptor expression has been detected in embryonic 5-HT neurons (Wylie et al., 2010). NOP/orphanin FQ inhibits 5-HT release in the DRN via Oprl1 (Tao et al., 2007), suggesting a functional role of Oprl1 in 5-HT neurons early in development and in the adult brain.

### **SUBSTANCE P**

Substance P belongs to the tachykinin family and has a high affinity for the three different neurokinin receptors (NK1–3), in particular for NK1 (Hokfelt et al., 2001). These GPCRs couple mainly to the Gq/11 pathway (Stratowa et al., 1995), but Gs pathway activation has also been reported for NK1 in cell culture systems (Martini et al., 2002). Various histological studies have revealed extensive expression of NK1 receptors in the DRN (Maeno et al., 1993; Saffroy et al., 1994; Vigna et al., 1994; Charara and Parent, 1998; Sergeyev et al., 1999; Froger et al., 2001). Most studies suggest that NK1 receptors are not localized on 5-HT neurons (Froger et al., 2001; Santarelli et al., 2001), while others revealed NK1 receptor expression in a subpopulation of 5-HT neurons (Santarelli et al., 2001; Lacoste et al., 2006, 2009). Interestingly, NK1 receptors are found in the cytoplasm of the 5-HT neurons and in dendritic membranes of GABAergic neurons. After administration of an NK1 antagonist or deafferentation of substance P-releasing projections, the density of membrane-bound NK1 receptors is increased in the somatodendritic region of 5-HT neurons, suggesting that membrane trafficking of NK1 receptors may be regulated by substance P input. This mechanism may contribute to the modulation of 5-HT neuronal firing under certain physiological conditions (Lacoste et al., 2009). In addition, conflicting results exist for the effect of NK1 on 5-HT release and firing of 5-HT neurons. Inhibition of NK1 in the DRN using antagonists or knock-out strategies leads to an increase in firing activity of 5-HT neurons *in vivo* (Haddjeri and Blier, 2001; Santarelli et al., 2001). In contrast, activation of NK1 and NK3 increases spontaneous excitatory postsynaptic currents (EPSCs) in DRN 5-HT neurons, resulting in an increased firing of the 5-HT neurons as observed in brain slice recordings (Liu et al., 2002a). These effects could be blocked by NK1 and NK3 antagonists (Liu et al., 2002a). Also, activation of NK1 via intra-raphe injection of substance P in the DRN increases 5-HT release within the DRN, but decreases 5-HT release in frontal cortex as measured with *in vivo* microdialysis (Guiard et al., 2007). These effects and also the described increase in 5-HT firing in NK1 knock-out mice involve changes in 5-HT1A autoreceptor levels, suggesting at least a functional coupling between NK1 and 5-HT1A receptors. Further investigations to verify these interactions in 5-HT neurons are required.

In summary, the various heteroreceptors integrate incoming information via two main pathways, i.e., Gi/o and Gq/11, leading to inhibition or activation of 5-HT neuronal firing and 5-HT release, respectively. Besides the "classical" Gi/o protein-mediated, membrane-delimited modulation of GIRK and presynaptic Ca<sup>2+</sup> channels, other ion channel targets have been identified in 5-HT neurons. For example, two-pore-domain K<sup>+</sup> channels [TWIK-related acid-sensitive K-1 (TASK-1) and TASK-3] have been described in dorsal and caudal raphe 5-HT neurons (Washburn et al., 2002). TASK channels are inhibited by GPCRs coupling to the Gq/11 pathway, most likely in a membrane-delimited manner involving the direct binding of Gαq subunits (Chen et al., 2006). The existence of voltage-sensitive but not ATP-dependent K<sup>+</sup> channels in DRN neurons including 5-HT neurons has been proposed based on drug application studies (Harsing, 2006). In addition, TRP channels have been described to be modulated by D2-like receptors (Aman et al., 2007). According to the microarray expression profiling studies, various ion channel targets of GPCRs are expressed in embryonic 5-HT neurons, including TRP (Trpm4 and Trpm7), two-pore channels (TPCN1), cyclic nucleotide-gated channels (Hcn3), and KCNQ (Kcnq2; Wylie et al., 2010). Therefore, more detailed studies have to be performed to determine the role of other ion channel targets and in particular the long-term effects of GPCR modulation for the serotonergic system.
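
The two-pathway summary above can be read as a simple sign rule, sketched here in Python as a reading aid and not as a model taken from the cited studies. The sketch assumes that Gi/o coupling lowers and Gq/11 or Gs coupling raises the activity of the cell or terminal expressing the receptor, and that the net effect on 5-HT neurons is inverted when the receptor sits on GABAergic input. The function name `predicted_effect` and the site labels are hypothetical.

```python
# Sign rule for GPCR modulation of 5-HT neurons (illustrative simplification).
# Assumption: Gi/o is inhibitory, Gq/11 and Gs are excitatory for the cell
# or terminal that expresses the receptor.
PATHWAY_SIGN = {"Gi/o": -1, "Gq/11": +1, "Gs": +1}

# Assumption: how activity at the receptor's site carries over to 5-HT neurons.
SITE_SIGN = {
    "5-HT neuron": +1,          # direct effect
    "glutamatergic input": +1,  # more glutamate release excites 5-HT neurons
    "GABAergic input": -1,      # more GABA release inhibits 5-HT neurons
}


def predicted_effect(pathway: str, site: str) -> str:
    """Return the predicted change in 5-HT neuronal activity."""
    sign = PATHWAY_SIGN[pathway] * SITE_SIGN[site]
    return "increase" if sign > 0 else "decrease"


# Examples taken from the receptor descriptions in the text.
examples = [
    ("alpha1 adrenoceptor", "Gq/11", "5-HT neuron"),
    ("mu opioid receptor", "Gi/o", "GABAergic input"),
    ("CB1 receptor", "Gi/o", "glutamatergic input"),
    ("CRF1 receptor", "Gs", "GABAergic input"),
]
for receptor, pathway, site in examples:
    print(f"{receptor} ({pathway}, {site}): {predicted_effect(pathway, site)}")
```

The rule reproduces these four cases but not every finding reviewed here. D2-like receptors in 5-HT neurons, for instance, depolarize the cells although D2-like receptors are usually assigned to the Gi/o pathway.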

### **INTEGRATION AND SIGNAL PROCESSING OF MODULATORY INFORMATION BY SEROTONERGIC NEURONS: WHY SO MANY GPCRs?**

The serotonergic transmitter system modulates many physiological functions such as mood, sexual behavior, feeding, sleep/wake cycle, memory, cognition, blood pressure regulation, breathing, and reproductive success (Mooney et al., 1998; Abrams et al., 2004; Lechin et al., 2006; Lerch-Haner et al., 2008; McDevitt and Neumaier, 2011). Because of the complexity and variety of the behaviors modulated by serotonin, it is expected that modulatory signals from other brain areas, including sensory information, are integrated in nuclei containing 5-HT neurons using a high diversity of GPCRs. While GABAergic and glutamatergic input into the raphe nuclei will adjust 5-HT neurons to the current inhibitory/excitatory state of the brain, other transmitter systems will inform 5-HT neurons more directly about the serotonin-associated behavior. For example, dopamine is involved in reward-driven learning; ACh modulates arousal and reward; NA and CRF are involved in stress responses; histamine is involved in sleep regulation and sexual function; bombesin and CCK regulate eating behavior; eCBs modulate memory, appetite, stress, social behavior, anxiety, and sleep; galanin has been implicated in the regulation of sleep–wake cycle, cognition, emotion, and blood pressure; hypocretin–orexin modulates arousal, wakefulness, and appetite; and opioids and substance P are involved in pain perception and mood (White and Rumbold, 1988; Woodruff et al., 1996; Greenough et al., 1998; Bear et al., 2001; Merali et al., 2006; Monti, 2010; Haj-Dahmane and Shen, 2011). Since all of these behavioral signals can be integrated in nuclei containing 5-HT neurons, regulation of serotonin release will affect the same behaviors listed above. Thus, there is a tight interaction and signaling exchange between the different transmitter systems to precisely modulate behavioral output. Consequently, long-term changes in serotonin release can involve changes in the auto-regulation involving 5-HT receptors or in the hetero-regulation involving the above-mentioned GPCRs and can cause neuropsychiatric disorders, most notably depression, anxiety, schizophrenia, and dementia (Lucki, 1998; Davidson et al., 2000; Mann et al., 2001; Nelson and Chiavegatto, 2001). The modulation of the different behaviors is even more complex since other GPCRs, such as orphan GPCRs with so far unknown function, are also expressed in 5-HT neurons. Therefore, new strategies and techniques have to be developed and applied to understand complex behaviors related to the serotonergic system.

### **NEW APPROACHES TO CONTROL AND UNDERSTAND SEROTONERGIC G PROTEIN-COUPLED RECEPTOR SIGNALING PATHWAYS**

Recently, several new approaches to manipulate the activity of 5-HT neurons in a cell type-specific manner have been developed. Various promoter/enhancer sequences have been isolated and characterized, which allow for the expression of proteins of choice within at least subsets of 5-HT neurons. These DNA sequences include the promoter or enhancer sequences of the Pet-1/Fev transcription factor, the serotonin transporter (SLC6A4), and tryptophan hydroxylase 2 (TPH2; Scott et al., 2005). Using these different promoter/enhancer sequences, different mouse lines and virus approaches have been developed to activate and/or silence/delete reporter genes such as green fluorescent protein (GFP) or tdTomato, Cre or Flp recombinases, tetracycline-inducible systems, tetanus toxin light chain, and genes of interest (Scott et al., 2005; Kim et al., 2009; Madisen et al., 2010, 2012; Richardson-Jones et al., 2010; Zhao et al., 2011). Using the tetracycline-inducible system, for example, the genetic ablation of 5-HT1A from the majority of 5-HT neurons could be achieved (Richardson-Jones et al., 2010, 2011).

For the investigation of the modulation and function of neuronal circuits in general, and of the serotonergic system in particular, chemical and optogenetic techniques have been developed in recent years (Herlitze and Landmesser, 2007; Masseck et al., 2010). For control of neuronal activity in various neuronal circuits including the 5-HT system, the light-gated non-selective cation channel ChR2 has been used; it allows for the dissection of 5-HT-mediated behavioral effects in different raphe nuclei (Li et al., 2005; Varga et al., 2009; Zhao et al., 2011; Madisen et al., 2012; see also Kim et al., 2009). For the investigation of GPCR signals, various chemically and light-activated GPCRs have been developed (Masseck et al., 2010). For example, vertebrate rhodopsin (vRh) has been used to regulate Gi/o signaling pathways in neurons by light (Li et al., 2005). Exogenously expressed vRh inhibits neuronal firing and neurotransmitter release *in vitro* and *in vivo*, most likely via activation of GIRK and inhibition of presynaptic Ca<sup>2+</sup> channels (Li et al., 2005; Oh et al., 2010; Gutierrez et al., 2011). Since vRh belongs to the class A or rhodopsin-like group of GPCRs, like the serotonergic autoreceptors 5-HT1A/1B/1D, the idea arose to generate light-activated chimeric receptors that couple to intracellular signaling pathways in 5-HT1 receptor domains. The feasibility of receptor domain swapping has been demonstrated by Dr. Khorana's group (Kim et al., 2005). They replaced the intracellular loops of vRh with those of the β2-adrenergic receptor and turned vRh into a light-activated G<sub>s</sub> protein-coupled receptor, Opto-β2AR. This approach, however, was not suitable for exchanging the intracellular peptide loops of vRh for the intracellular domains of the 5-HT1A receptor, since the resulting chimeric receptor showed altered activation and deactivation kinetics with respect to GIRK channel modulation (Oh et al., 2010). However, it could be shown that the C-terminus (CT) of the 5-HT1A receptor was sufficient to target vRh into expression domains of the 5-HT1A receptor and to functionally substitute for Gi/o pathway activation. The CTs of GPCRs contain peptide signal domains for subcellular targeting and G protein interaction. The CT of 5-HT1A had been shown to be necessary for trafficking of the receptor to dendritic domains via interaction with the trafficking protein Yif1B (Carrel et al., 2008). Fusion of the 5-HT1A receptor CT onto vRh was therefore sufficient to target the chimeric construct Rh-CT5-HT1A into somatodendritic regions of hippocampal neurons and of 5-HT neurons in the DRN, and to exclude the chimeric receptor from the axons. The Rh-CT5-HT1A receptors were able to functionally substitute for intracellular 5-HT1A signals in DRN neurons of 5-HT1A knock-out mice, i.e., illumination induced a K<sup>+</sup> conductance (most likely GIRK) and reduced the firing rate of spontaneously active 5-HT neurons. Thus, the adaptation of, for example, light-activated or chemically activated GPCRs, and their coupling and anchoring to intracellular signaling domains, will allow for the dissection of multiple GPCR pathways within the serotonergic system and of their interactions *in vitro* and *in vivo*.










**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 February 2013; accepted: 03 May 2013; published online: 23 May 2013.*

*Citation: Maejima T, Masseck OA, Mark MD and Herlitze S (2013) Modulation of firing and synaptic transmission of serotonergic neurons by intrinsic G protein-coupled receptors and ion channels. Front. Integr. Neurosci. 7:40. doi: 10.3389/fnint.2013.00040*

*Copyright © 2013 Maejima, Masseck, Mark and Herlitze. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### Computational studies of the role of serotonin in the basal ganglia

#### *Michael C. Reed<sup>1</sup>\*, H. Frederik Nijhout<sup>2</sup> and Janet Best<sup>3</sup>*

*<sup>1</sup> Department of Mathematics, Duke University, Durham, NC, USA*

*<sup>2</sup> Department of Biology, Duke University, Durham, NC, USA*

*<sup>3</sup> Department of Mathematics, The Ohio State University, Columbus, OH, USA*

### *Edited by:*

*KongFatt Wong-Lin, University of Ulster, Northern Ireland*

### *Reviewed by:*

*Matthew S. Matell, Villanova University, USA*

*Rosario Moratalla, Cajal Institute, Spain*

### *\*Correspondence:*

*Michael C. Reed, Department of Mathematics, Duke University, Physics Building, 120 Science Drive, Durham, NC 27708, USA. e-mail: reed@math.duke.edu*

It has been well established that serotonin (5-HT) plays an important role in the striatum. For example, during levodopa therapy for Parkinson's disease (PD), the serotonergic projections from the dorsal raphe nucleus (DRN) release dopamine as a false transmitter, and there are strong indications that this pulsatile release is connected to dyskinesias that reduce the effectiveness of the therapy. Here we present hypotheses about the functional role of 5-HT in the normal striatum and present computational studies showing the feasibility of these hypotheses. Dopaminergic projections to the striatum inhibit the medium spiny neurons (MSNs) in the striatopallidal (indirect) pathway and excite MSNs in the striatonigral (direct) pathway. It has long been hypothesized that the effect of dopamine (DA) depletion caused by the loss of SNc cells in PD is to change the "balance" between the pathways to favor the indirect pathway. Originally, "balance" was understood to mean equal firing rates, but now it is understood that the level of DA affects the patterns of firing in the two pathways too. There are dense 5-HT projections to the striatum from the dorsal raphe nucleus and it is known that increased 5-HT in the striatum facilitates DA release from DA terminals. The direct pathway excites various cortical nuclei and some of these nuclei send inhibitory projections to the DRN. Our hypothesis is that this feedback circuit from the striatum to the cortex to the DRN to the striatum serves to stabilize the balance between the direct and indirect pathways, and this is confirmed by our model calculations. Our calculations also show that this circuit contributes to the stability of the dopamine concentration in the striatum as SNc cells die during Parkinson's disease progression (until late phase). There may be situations in which there are physiological reasons to "unbalance" the direct and indirect pathways, and we show that projections to the DRN from the cortex or other brain regions could accomplish this task.

**Keywords: serotonin, basal ganglia, direct pathway, mathematical model**

### **INTRODUCTION**

The serotonergic system projects widely in the brain with targets including sensory, motor, and limbic systems (Feldman et al., 1997; Hornung, 2003), and it participates in many neural functions including cognition, mood, regulation of feeding, and sleep-wake behavior (Feldman et al., 1997; Monti, 2011). The dorsal raphe nucleus (DRN) contains a significant proportion of the brain's serotonergic neurons, and its axons are distinguished from other serotonergic projections by their small varicosities and their lack of true synapses, so that they contribute to volume transmission (Hornung, 2003). The result may be a complex modulation of neural activity rather than direct stimulation of a specific response, which may account for the fact that the effects of these projections have been long debated; see for example Monti (2010).

The DRN makes dense serotonergic projections to the basal ganglia (BG), including the substantia nigra pars compacta (SNc) and the striatum (Vertes, 1991). While the role of this serotonergic innervation is unclear in healthy individuals, in the past 10 years it has become appreciated that these serotonergic projections to the striatum play a special role in patients with Parkinson's disease (PD) being treated with levodopa (L-DOPA). In serotonergic neurons, 5-hydroxy-L-tryptophan is converted to serotonin (5-HT) by aromatic amino acid decarboxylase, the same enzyme that converts L-DOPA to dopamine (DA) in dopaminergic cells. Thus, when L-DOPA is systemically administered, as is the common first treatment for PD, serotonergic cells are also able to convert the L-DOPA into DA, store it with 5-HT in the neuronal vesicles, and then release a mixture of 5-HT and DA in response to action potentials (Carta et al., 2007). In the striatum, the implication is that these cells release DA without the restraint and homeostatic mechanisms present on DA cells, allowing large fluctuations in the level of extracellular DA. When only a minority of DA cells have been lost (to PD), the remaining cells may be able to help regulate these fluctuations. But as more DA cells die, these fluctuations are less restrained and dyskinesias occur (de la Fuente-Fernandez et al., 2001; Lindgren et al., 2010). These effects were investigated using a mathematical model in Reed et al. (2012).

In this paper, we present a hypothesis about the role of 5-HT in the healthy basal ganglia, focusing on the effects of 5-HT on the so-called direct and indirect pathways. In the next two sections we describe the direct and indirect pathways and our hypothesis. In the following section, we will present a computational model that we use to test the plausibility of our hypothesis. We will then describe the results of computational experiments on the model.

### **DIRECT AND INDIRECT PATHWAYS AND BALANCE**

The BG are a group of subcortical nuclei including the striatum, subthalamic nucleus, internal and external globus pallidus, and substantia nigra. Cortical-BG-thalamic circuits are critically involved in many functions including sensorimotor, emotion, and cognition (Haber and Calzavara, 2009; Lincoln et al., 2012). Multiple paths and subcircuits within BG have been identified. In some cases the different circuits perform different functions; for instance the striatum, the input nucleus of the BG, has anatomic and functional subdivisions including sensorimotor and associative. In other cases, pathways may compete, as has been postulated for action selection.

Two of the most studied pathways through the BG are the direct and indirect pathways, segregated pathways through medium spiny neurons (MSNs) in the striatum. The names reflect the fact that the direct pathway proceeds from the striatum directly to either the internal portion of the globus pallidus (GPi) or the substantia nigra pars reticulata (SNr), the two output nuclei of the BG. The indirect pathway, on the other hand, also involves a subcircuit that includes the external portion of the globus pallidus (GPe) and the subthalamic nucleus (STN) before reaching the output nuclei. The two pathways have opposing effects on the thalamus: the indirect pathway has an inhibitory effect, while the direct pathway has an excitatory effect (Smith et al., 1998; Gerfen and Surmeier, 2011).

Albin and DeLong (Albin et al., 1989; DeLong, 1990) proposed that the balance of these opposing pathways is important for healthy function. Dopaminergic cells in the SNc project to the striatum and inhibit MSNs in the indirect pathway and excite MSNs in the direct pathway. Albin and DeLong noted that, during PD, the loss of dopaminergic cells in the SNc has the effect of shifting the balance in favor of the indirect pathway, and they reasoned that the increased inhibitory output from BG to thalamus might account for some of the motor symptoms of PD, such as bradykinesia and difficulty in initiating movement. This view later lost favor in the face of new experimental observations that appeared to contradict the Albin–DeLong theory. The fact that pallidotomy—lesioning the GPi—alleviates some PD motor symptoms fit well with the theory, but it emerged that high frequency stimulation of GPi was equally effective therapeutically. The solution to this conundrum seemed to be that the *pattern* of neuronal firing in the BG was as important for symptoms as the *rate* of firing; it was then often assumed that the Albin–DeLong theory was dead. In particular, it has been established experimentally that firing patterns in the GPi become bursty as PD progresses. Noting that this firing pattern is effectively a stronger signal than the irregular firing observed in the healthy GPi, however, allows the possibility that the Albin-DeLong theory retains merit but the notion of "balance" needs to be interpreted more generally. With this more general notion of balance, it is again widely hypothesized that many of the motor symptoms of PD are due to an imbalance between the direct and indirect pathways (Kravitz et al., 2010; Gerfen and Surmeier, 2011; Zold et al., 2012).

The BG play a critical role in action selection, and it has been proposed that changes in DA levels are important in this process. One key difference between MSNs in the direct and indirect pathways lies in their responses to extracellular DA: MSNs in the direct pathway express D1 receptors and are stimulated by DA while MSNs in the indirect pathway express D2 receptors and are inhibited by DA. It is known that D1 receptors mediate the effect of DA on the dyskinesias mentioned above (Darmopil et al., 2009; Mela et al., 2012). MSNs in both pathways receive feedforward inhibition from cortical pyramidal neurons that project to striatal inhibitory interneurons; this inhibition, together with collateral inhibition from other MSNs, may suppress MSN activity in circuits corresponding to undesired actions. In the circuit of the desired action, selection could depend upon the level of DA. While both the direct and indirect pathways receive the feed forward inhibition, it has been found that these inhibitory projections preferentially connect with the direct pathway and that there is an inhibitory feedback loop from the GPe in the indirect pathway (Bevan et al., 1998; Gerfen and Surmeier, 2011). Since the indirect pathway MSNs express D2 receptors, this feedback loop is expected to be inhibited by basal levels of DA. However, a transient decrease in DA could facilitate the feedback by disinhibiting the inhibitory projection to the GPe. Meanwhile, cortical excitation in the direct pathway helps counter the feed forward inhibition there. In this description of action selection, the presence of DA helps shift the balance in favor of the direct pathway. We mention these details of feedforward and feedback circuits in action selection to show how important the balance between direct and indirect pathways is in considering action selection, but these detailed feedforward and feedback circuits are not in our model.

Computational models of the BG abound, including biophysical models (Terman et al., 2002; Rubchinsky et al., 2003). Many studies focus on functions believed to be performed by the BG (Doya, 1999) such as reinforcement learning (Bar-Gad et al., 2011) or action selection (Gurney et al., 2001; Humphries et al., 2006; Houk et al., 2007; Girard et al., 2008). These models often involve competition between different loops through the BG. Some models explicitly consider the balance between pathways, with a loss of balance hypothesized to occur when DA is depleted (Leblois et al., 2006). Contreras-Vidal and Stelmach (1995) also consider the role of other neuropeptides (dynorphin, Substance P, enkephalin) in the imbalance of pathways that accompanies nigral degeneration. However, these studies do not consider the BG to be embedded in a larger regulatory circuit. Our model uses a simple global circuit as a platform for investigating the role of serotonin in the striatum.

### **A HYPOTHESIS ABOUT THE ROLE OF 5-HT IN THE STRIATUM**

What is the role of 5-HT in the striatum? Both the discussion of PD and the discussion of action selection above suggest that the balance between the direct and indirect pathways is important for the function of the BG in healthy individuals. We will explain how 5-HT in the striatum may help to maintain that balance.

The circuit we will consider is shown schematically in **Figure 1**. Dopaminergic neurons in the SNc innervate the MSNs in the striatum, inhibiting the MSNs in the indirect pathway and stimulating the MSNs in the direct pathway. The DRN sends dense serotonergic projections to the striatum (Vertes, 1991) and it is known that increased concentrations of 5-HT in the striatum facilitate the release of DA from the dopaminergic projections from the SNc (Blandina et al., 1989; Bonhomme et al., 1995; Deurwaerdere et al., 1996). Projections from the thalamus excite cortical neurons. There are numerous projections from higher brain regions to the DRN (Monti, 2010); in particular there are inhibitory projections from medial prefrontal cortex (mPFC) (Celada et al., 2001). In addition, SNc projections excite DRN neurons (DiMatteo et al., 2008) and DRN neurons inhibit SNc neurons (Guiard et al., 2008). Nucleus X stands for some other nucleus that sends excitatory or inhibitory projections to the DRN. This is the basic structure of our model that is indicated schematically in **Figure 1**; many of the details of the direct and indirect pathways, described in Smith et al. (1998), have been omitted here.

**Figure 1 | … pathways.** Dopaminergic neurons of the SNc inhibit the indirect pathway and stimulate the direct pathway from the cortex to the thalamus. Serotonin from the DRN projections to the striatum increases DA release, and projections from the cortex inhibit DRN firing. The details of BG circuitry in the direct and indirect pathways are omitted. References for the influences are given in the text. There are many descending projections to the DRN; nucleus X represents one of these that may excite or inhibit the DRN.

We can now describe verbally how the circuit in **Figure 1** allows the serotonergic projection from the DRN to the striatum to maintain the balance between the direct and indirect pathways. Suppose, for some reason, that the firing of the DA neurons in the SNc goes down. This will lower DA in the striatum, which in turn will decrease the firing in the direct pathway and increase the firing in the indirect pathway. Therefore, the thalamus will be more inhibited and the ascending projections to the mPFC will fire less. The descending projections from the mPFC to the DRN will then also fire less, removing inhibition from the DRN. The increased firing of the DRN neurons will release more 5-HT in the striatum. Finally, the increased 5-HT tone in the striatum increases DA release in response to each action potential, partially compensating for the initial decrease in DA release.
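The chain of influences just described can be summarized as a sign diagram. This is only a schematic restatement of the verbal argument, with arrows up and down denoting an increase or decrease relative to baseline:

$$\text{SNc}\downarrow \;\Rightarrow\; DA\downarrow \;\Rightarrow\; \begin{cases}\text{direct pathway}\downarrow \\ \text{indirect pathway}\uparrow\end{cases} \;\Rightarrow\; \text{thalamus}\downarrow \;\Rightarrow\; \text{mPFC}\downarrow \;\Rightarrow\; \text{DRN}\uparrow \;\Rightarrow\; \text{5-HT}\uparrow \;\Rightarrow\; DA\uparrow$$

Both routes from DA to the thalamus are net excitatory (excitation of the excitatory direct pathway, inhibition of the inhibitory indirect pathway), and the only sign reversal on the way back to the striatum is the cortical inhibition of the DRN. The loop as a whole is therefore a negative feedback: a perturbation of striatal DA is opposed rather than amplified.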

### **METHODS**

In this section we describe the details of the mathematical model that implements the circuit indicated schematically in **Figure 1**. The variables of the mathematical model, given in **Table 1**, are the firing rates (in Hz) corresponding to the nuclei pictured in **Figure 1** and the concentrations (in nM) of DA and 5-HT in the striatum. The firing rates can be thought of as average firing rates in the nuclei or as the firing rates of particular neurons in the nuclei that correspond to specific actions.

The differential equations satisfied by the variables are given below. They are the simplest equations one can imagine that express the positive and negative influences between the variables shown in **Figure 1**. There are external input terms. The influences are all simple linear functions of the relevant variables except for the term *G* · *5HT* · *SN* that expresses the release of DA in the striatum as a function of SNc firing rate and of 5-HT concentration in the striatum. Most of the equations have decay terms that guarantee that firing rates or concentrations go to zero in the absence of input. Of course, the real physiological situation is much more complicated. We have left out important nuclei in the BG and there is no representation of the important dynamics within various nuclei. The purpose of this simple model is merely to allow us to investigate and illustrate the hypothesis about the role of 5-HT in maintaining homeostasis between the direct and indirect pathways.

$$\frac{d\,MI}{dt} = a_{1c} - a_{1da} \cdot DA - d_1 \cdot MI \tag{1}$$

$$\frac{d\,MD}{dt} = a_{2c} + a_{2da} \cdot DA - d_2 \cdot MD \tag{2}$$

$$\frac{d\,TH}{dt} = a_3 + a_{3md} \cdot MD - a_{3mi} \cdot MI - d_3 \cdot TH \tag{3}$$

$$\frac{d\,CX}{dt} = a_{4th} \cdot TH - d_4 \cdot CX \tag{4}$$

$$\frac{d\,DRN}{dt} = a_5 - a_{5cx} \cdot CX + a_{5sn} \cdot SN - d_5 \cdot DRN \tag{5}$$

$$\frac{d\,DA}{dt} = G \cdot 5HT \cdot SN - d_6 \cdot DA \tag{6}$$

$$\frac{d\,5HT}{dt} = a_7 \cdot DRN - d_7 \cdot 5HT \tag{7}$$

$$\frac{d\,SN}{dt} = a_8 - a_{8drn} \cdot DRN - d_8 \cdot SN \tag{8}$$

The values of the parameters in the differential equations and their meanings are listed in **Table 2**.

The values of the parameters were adjusted so that the steady state values of the variables are in ranges that correspond to experimental observations. For example, the steady state concentration of DA in the striatum is 2.72 nM (Segovia and DelArco, 1997; Jones et al., 1998), the steady state concentration of 5-HT in the striatum is 0.846 nM (Knobelman et al., 2001), the DRN firing rate is 1.41 Hz (Feldman et al., 1997), the SNc firing rate is 4.47 Hz (Feldman et al., 1997), the firing rates in the indirect and direct pathways are 1.9 Hz (Mahon et al., 2006), and the thalamic firing rate is 17.5 Hz (Ohara et al., 2007). We have not been specific about which cortical region is projecting to the DRN in our model, so we did not attempt to put that rate in any particular range.
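Setting the time derivatives in equations (6) and (7) to zero gives two steady-state relations that these values must satisfy. The following is a short worked sketch; starred symbols denote steady-state values (our notation), and the value of the 5-HT decay constant, 2, is the baseline quoted later in the section on SSRIs.

$$DA^{*} = \frac{G \cdot 5HT^{*} \cdot SN^{*}}{d_6}, \qquad 5HT^{*} = \frac{a_7 \cdot DRN^{*}}{d_7}$$

Provided the quoted values are exact steady states of the model, the second relation implies

$$a_7 = \frac{d_7 \cdot 5HT^{*}}{DRN^{*}} = \frac{2 \times 0.846}{1.41} = 1.2,$$

and the first fixes the ratio of gain to DA clearance,

$$\frac{G}{d_6} = \frac{DA^{*}}{5HT^{*} \cdot SN^{*}} = \frac{2.72}{0.846 \times 4.47} \approx 0.72.$$

Because striatal 5-HT is proportional to DRN firing at steady state, any percentage change in DRN firing carries over unchanged to 5-HT as long as the release and decay coefficients are held fixed.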

### **RESULTS**

In this section we describe some experiments with the model that illustrate and confirm some of the speculations discussed in the Introduction. We will see that the circuit in **Figure 1** automatically compensates for changes in the system that would unbalance the direct and indirect pathways. The firing rates discussed can be interpreted as average firing rates of populations of neurons or as firing rates of individual neurons.

### **DECREASE IN SNc FIRING**

Suppose that the average firing rate of SNc cells decreases, for example through cell death as in Parkinson's disease. We can simulate this condition by raising the parameter *d*<sub>8</sub> from 10 to 17 because this will lower SNc firing at steady state. The resulting steady state values of the variables are given in column A1 of **Table 3**. The units are Hz except for DA and 5-HT, where the units are nM. These new steady state values are to be compared with the baseline steady states given in the column labeled "normal" in **Table 3**. As one can see, the SNc firing rate falls 49% but the DA concentration in the striatum falls from 2.72 to 1.98 nM, a decrease of only 27%. The firing rate in the indirect pathway increases by only 6.6% and that in the direct pathway decreases by only 10%. This is the homeostatic effect of the DRN projection to the striatum discussed above, and it is easy to understand using **Figure 1**. As SNc firing decreases, the indirect pathway is less inhibited and the direct pathway is less stimulated. Thus, the thalamus is more inhibited and fires less, so the cortex is less stimulated and fires less. This removes some cortical inhibition from the DRN, which fires more, driving the concentration of 5-HT up in the striatum. The enhanced 5-HT concentration in the striatum causes more DA release, which partially compensates for the lower SNc firing rate.

To test whether these homeostatic effects really are the result of the circuit in **Figure 1**, we ran the same simulation as in A1, but now we kept the 5-HT in the gain term [the first term on the right side of equation (6)] constant at its normal value of 0.846 nM. Thus, we have kept the circuit intact but the increase of 5-HT in the striatum will not be felt by the DA terminals. The results are shown in column A2. Now, MI increases by 14%, MD decreases by 22% and DA in the striatum decreases by 54%.

**Table 3 | Results of simulations no. 1.**
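The comparison between simulations A1 and A2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. Only the baseline steady state, the 5-HT decay constant of 2 and the change of the SNc decay constant from 10 to 17 are taken from the text; every other rate constant is a hypothetical placeholder (the Table 2 values are not used), the input terms, `G` and `a7` are back-calculated so that the quoted baseline is a steady state, and the names `a5cx`, `a5sn` and `a8drn` are our labels for the coupling coefficients in equations (5) and (8). The numbers it prints therefore should not be expected to match Table 3; the sketch is meant only to show the mechanics of clamping 5-HT in the gain term.

```python
# Sketch of simulations A1 and A2. NOT the authors' code.
# From the text: the baseline steady state, d7 = 2, and d8 raised from 10 to 17.
# All other rate constants are hypothetical placeholders, not the Table 2 values.

BASE = dict(MI=1.9, MD=1.9, TH=17.5, DRN=1.41, DA=2.72, HT=0.846, SN=4.47)


def calibrated_params():
    """Placeholder couplings; input terms, G and a7 are back-calculated
    so that BASE is a steady state of equations (1)-(8)."""
    p = dict(d1=1.0, d2=1.0, d3=1.0, d4=1.0, d5=1.0, d6=1.0, d7=2.0, d8=10.0,
             a1da=0.2, a2da=0.25, a3md=5.0, a3mi=5.0, a4th=0.2,
             a5cx=1.0, a5sn=0.05, a8drn=1.0)
    cx = p["a4th"] * BASE["TH"] / p["d4"]   # cortical rate follows from a4th
    p.update(
        a1c=p["d1"] * BASE["MI"] + p["a1da"] * BASE["DA"],
        a2c=p["d2"] * BASE["MD"] - p["a2da"] * BASE["DA"],
        a3=p["d3"] * BASE["TH"] - p["a3md"] * BASE["MD"] + p["a3mi"] * BASE["MI"],
        a5=p["d5"] * BASE["DRN"] + p["a5cx"] * cx - p["a5sn"] * BASE["SN"],
        G=p["d6"] * BASE["DA"] / (BASE["HT"] * BASE["SN"]),
        a7=p["d7"] * BASE["HT"] / BASE["DRN"],
        a8=p["d8"] * BASE["SN"] + p["a8drn"] * BASE["DRN"],
    )
    y0 = [BASE["MI"], BASE["MD"], BASE["TH"], cx,
          BASE["DRN"], BASE["DA"], BASE["HT"], BASE["SN"]]
    return p, y0


def rhs(y, p, ht_clamp=None):
    """Right-hand sides of equations (1)-(8). If ht_clamp is given, the DA
    terminals see that fixed 5-HT level instead of the simulated one."""
    MI, MD, TH, CX, DRN, DA, HT, SN = y
    ht_seen = HT if ht_clamp is None else ht_clamp
    return [
        p["a1c"] - p["a1da"] * DA - p["d1"] * MI,
        p["a2c"] + p["a2da"] * DA - p["d2"] * MD,
        p["a3"] + p["a3md"] * MD - p["a3mi"] * MI - p["d3"] * TH,
        p["a4th"] * TH - p["d4"] * CX,
        p["a5"] - p["a5cx"] * CX + p["a5sn"] * SN - p["d5"] * DRN,
        p["G"] * ht_seen * SN - p["d6"] * DA,
        p["a7"] * DRN - p["d7"] * HT,
        p["a8"] - p["a8drn"] * DRN - p["d8"] * SN,
    ]


def settle(p, y0, ht_clamp=None, t_end=100.0, dt=0.01):
    """Integrate with classical fourth-order Runge-Kutta for t_end time units."""
    y = list(y0)
    for _ in range(round(t_end / dt)):
        k1 = rhs(y, p, ht_clamp)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)], p, ht_clamp)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)], p, ht_clamp)
        k4 = rhs([a + dt * b for a, b in zip(y, k3)], p, ht_clamp)
        y = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y


params, y_base = calibrated_params()
lesion = dict(params, d8=17.0)                      # lowers SNc firing
a1 = settle(lesion, y_base)                         # A1: feedback intact
a2 = settle(lesion, y_base, ht_clamp=BASE["HT"])    # A2: 5-HT in gain term clamped
for label, y in (("A1", a1), ("A2", a2)):
    print(f"{label}: DA = {y[5]:.2f} nM, SNc = {y[7]:.2f} Hz, DRN = {y[4]:.2f} Hz")
```

Passing `ht_clamp` changes only the 5-HT value seen by the gain term in equation (6); equation (7) still evolves freely, which mirrors the statement that the circuit is kept intact while the DA terminals do not feel the rise in 5-HT.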

### **DECREASE IN THE GAIN** *G*

Suppose that the gain, *G*, that governs how much each nM of 5-HT in the striatum increases DA release is cut in half. One could imagine that half the 5-HT receptors on DA terminals in the striatum become ineffective. The new steady state values are given in column B of **Table 3**. *MI* increases by only 7.5% and *MD* decreases by only 13%. The reason is that as the direct and indirect pathways start to become unbalanced the thalamus fires less and thus the cortex fires less. Thus, the cortical inhibition is partially withdrawn and the DRN fires much more (up 58%), and this causes a 58% rise in striatal 5-HT that partially compensates for the drop in *G*. This is true despite the fact that the SNc firing rate declines because of inhibition from the DRN.

### **DECREASE IN 5-HT RELEASE**

Suppose that we decrease by 50% the coefficient *a*<sub>7</sub>, which represents the amount of 5-HT released per DRN spike in the striatum. The new steady states are given in column C of **Table 3**. Although one might expect that 5-HT would decrease by 50% in the striatum, in fact it only decreases by 21%. The reason, as above, is that as the direct and indirect pathways become unbalanced, the thalamus and cortex fire less, so the inhibition of the DRN is partially withdrawn and DRN firing rises 58%, which partially compensates for the drop in the 5-HT release coefficient *a*<sub>7</sub>. As a result, the indirect and direct pathways change by only 7.5 and 15%.
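The 21% figure can be checked directly against the steady state of equation (7). This is a quick consistency check rather than a new result; primes denote values after the change (our notation).

$$\frac{5HT'}{5HT} = \frac{a_7'}{a_7} \cdot \frac{DRN'}{DRN} = 0.5 \times 1.58 = 0.79$$

That is a 21% decrease, in agreement with the simulation.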

### **THE EFFECTS OF SSRIs**

The mechanisms by which chronic doses of selective serotonin reuptake inhibitors (SSRIs) relieve depressive symptoms in some patients remain largely unknown. Since many proposed mechanisms involve increases in extracellular 5-HT in projection regions of the raphe nuclei, it was of interest to ask what changes in the circuit in **Figure 1** would result from an increase in 5-HT in the striatum. If we decrease the coefficient *d*<sub>7</sub> in equation (7) then we expect that the steady state value of 5-HT in the striatum would increase, so we decreased *d*<sub>7</sub> from 2 to 1. The results can be seen in column D of **Table 4**. The firing rates in the direct and indirect pathways change modestly. But the most dramatic effect is on the firing of the DRN, which decreases by 44%. As a result, the concentration of 5-HT only increases from 0.846 to 0.944 nM. It is known that acute doses of SSRIs decrease DRN firing (Gartside et al., 1995; Hajos et al., 1995; ElMansari et al., 2005), so the circuit in **Figure 1** gives another mechanism by which that could occur. Of course, the standard mechanism is that by increasing the extracellular concentration of 5-HT near the cell bodies in the DRN, the 5-HT1A autoreceptors are stimulated and this decreases DRN firing.

**Table 4 | Results of simulations no. 2.**
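The small rise in 5-HT follows from the steady state of equation (7). As a consistency check, with primes denoting values after the change (our notation):

$$\frac{5HT'}{5HT} = \frac{d_7}{d_7'} \cdot \frac{DRN'}{DRN} = \frac{2}{1} \times 0.56 = 1.12, \qquad 5HT' \approx 1.12 \times 0.846 \approx 0.95\ \text{nM}$$

This agrees with the reported 0.944 nM to within the rounding of the 44% figure: halving the decay constant alone would double 5-HT, but the fall in DRN firing cancels most of that increase.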

### **DEEP BRAIN STIMULATION OF THE SNc**

Since deep brain stimulation is used to treat a variety of neurological disorders, we asked how stimulation of the SNc would affect the circuit in the model. To do this we lowered the constant *d*<sub>8</sub> from 10 to 5, which should increase SNc firing. The results can be seen in column E of **Table 4**; indeed the firing rate of the SNc more than doubles. More interesting is the fact that the firing rate of the DRN decreases by 44% because of the extra excitation of the direct pathway and the extra inhibition of the indirect pathway. It is tempting to wonder whether this is the cause of the transient acute depression seen in some patients when the SNc is stimulated (Bejjani et al., 1999).

The simulations in A–E show how the network in **Figure 1** automatically compensates for "flaws" in the DA and 5-HT systems to try to keep the direct and indirect pathways in balance. Of course, there may be times when it is important to change the balance between the pathways.

### **CHANGING THE BALANCE BETWEEN THE DIRECT AND INDIRECT PATHWAYS**

The role of 5-HT in increasing DA release and the circuit in **Figure 1** can also be used to unbalance the direct and indirect pathways. Suppose that there is some other nucleus (called nucleus X in **Figure 1**) that projects to the DRN. The projection may be excitatory or inhibitory, which would correspond to increasing or decreasing the external drive, *a*<sub>5</sub>, to the DRN. If *a*<sub>5</sub> is *increased* by 50%, the new steady states are as given in column F1 of **Table 4**. So, the indirect pathway has been decreased by 9% and the direct pathway has been increased by 14%.

Conversely, if we *decrease* *a*<sub>5</sub> by 50% the new steady states are as given in column F2 of **Table 4**. We see that the indirect pathway is increased by 14% and the direct pathway is decreased by 21%. Thus, the brain can control the balance between the direct and indirect pathways via projections to the DRN.

### **PHASIC CORTICAL INPUT TO THE DIRECT PATHWAY**

Suppose that there is a burst of cortical input to the direct pathway such as might occur when a particular action is selected and a particular group of striatal neurons in the direct pathway is stimulated. Now we are thinking of **Figure 1** as the connection diagram for small groups of neurons in each nucleus that correspond to this action. To test the result in the model we double the cortical input, *a*<sub>2c</sub>, to the direct pathway for one second between *t* = 1 s and *t* = 2 s. The result can be seen in **Figure 2**.

There is a substantial increase in firing in the direct pathway that decays in about 4 s. Notice that this is followed by several seconds in which the direct pathway is inhibited and the indirect pathway is increased. This is a result of the feedback pathway via the DRN depicted in **Figure 1**. It is tempting to speculate that this inhibition may play a functional role in action-selection by turning off the direct pathway after it has been stimulated.

### **DISCUSSION**

The purpose of the simple model in this paper is to illustrate and investigate various homeostatic mechanisms arising from the DRN projections to the BG. It is clear that the model is highly simplified and that many important physiological details have not been considered. This paper is based on the fact that direct pathway MSNs express D1 receptors and are stimulated by DA, while MSNs in the indirect pathway express D2 receptors and are inhibited by DA. There are other sub-circuits of the BG, such as the hyperdirect loop (Leblois et al., 2006), and there are other important serotonergic projections to the BG such as the moderately dense projection from the DRN to the globus pallidus. There are also other interactions between 5-HT and DA in the BG that we have not discussed here. Boureau and Dayan (2011) discuss the opponency between DA and 5-HT in behavior. Dopamine is important for changes in synaptic strength in the cortical striatal pathway (Gerfen and Surmeier, 2011). Further, there are differences between the direct and indirect pathways other than the different dopamine receptor types. For example, indirect pathway MSNs are more excitable, an effect thought to be due to differences in morphology (Gerfen and Surmeier, 2011). It is known in animal models that there is rapid hyper-innervation of the striatum by 5-HT terminals after SNc lesions (Maeda et al., 2003). This would increase the gain, *G*, in our model and also provide homeostasis of extracellular DA.

It is not surprising that the BG, a regulatory biochemical and electrophysiological network of enormous importance for many aspects of human behavior, would have many built-in homeostatic mechanisms. Such mechanisms buffer the functions of the BG in the face of normal biological variations in the parts and also help to keep the BG functional in the face of large environmentally caused perturbations such as drug use and cell death. One of the most important such homeostatic mechanisms is the "passive stabilization" of DA in the striatum after cell death in the SNc. It is well known that the tissue content of DA in the striatum declines more or less proportionally to cell death in the SNc (Bezard et al., 2001; Dentresangle et al., 2001; Bergstrom and Garris, 2003), but that the extracellular level of DA in the striatum remains almost constant until 80 or 90% of the cells of the SNc have died. Many active compensatory mechanisms were proposed, see for example Hornykiewicz (1966), to explain this remarkable homeostasis. However, a very simple explanation was proposed in Bergstrom and Garris (2003). As SNc cells die, less DA is released into the striatum but there are also proportionally fewer dopamine reuptake transporters, so the concentration in the extracellular space should remain constant. This proposal by Bergstrom and Garris was verified by the current authors using mathematical modeling in Reed et al. (2009), where it is also explained why the homeostasis breaks down when more than 80% of the SNc cells have died.
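The cancellation behind passive stabilization can be illustrated with a minimal sketch. It assumes that both DA release and Michaelis–Menten reuptake capacity scale with the fraction of surviving SNc cells. The function name, parameter values, and units are hypothetical, and this is not the model of Reed et al. (2009).

```python
def steady_state_da(surviving_fraction, release=100.0, vmax=400.0, km=200.0):
    """Extracellular DA at which release balances Michaelis-Menten reuptake.

    Assumption (illustrative only): release and maximal reuptake both scale
    with the fraction of surviving SNc cells. Units are arbitrary.
    """
    if not 0.0 < surviving_fraction <= 1.0:
        raise ValueError("surviving_fraction must be in (0, 1]")
    scaled_release = surviving_fraction * release
    scaled_vmax = surviving_fraction * vmax
    if scaled_release >= scaled_vmax:
        raise ValueError("release must stay below maximal reuptake")
    # Solve scaled_release = scaled_vmax * e / (km + e) for e.
    return km * scaled_release / (scaled_vmax - scaled_release)


for fraction in (1.0, 0.5, 0.2):
    print(fraction, round(steady_state_da(fraction), 6))
```

Because the surviving fraction multiplies both release and reuptake, it cancels from the steady state. The sketch deliberately leaves out whatever causes the breakdown beyond roughly 80% cell loss, which is treated in Reed et al. (2009).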

In this paper we show how the circuitry in **Figure 1** automatically leads to several other homeostatic mechanisms. In Results A we showed that in the face of decreased SNc cell firing the level of 5-HT in the striatum will go up and increase the release of DA per action potential, partially compensating for the loss of SNc cell firing. Thus the circuit tends to keep the balance between the direct and indirect pathways. We note that this is consistent with the finding that the DRN fires more in animal models of PD (Zhang et al., 2007; Kaya et al., 2008; Wang et al., 2009). In Results B we showed that if the gain that measures the influence of 5-HT on DA release decreases (for example by receptor loss), then the circuit automatically partially compensates by increasing the firing rate of DRN neurons and thus releasing more 5-HT. In Results C we showed that if there is a drop in 5-HT release in the striatum, the circuit will automatically compensate by increasing DRN firing. All of these effects depend, of course, on the descending inhibitory projections from the cortex to the DRN, which in our model we assume are coming from the mPFC (Celada et al., 2001). So, these results give a reason for the descending inhibitory projections from the cortex to the DRN.

We also show (Results D) that increasing 5-HT in the striatum, as caused by an SSRI for example, would depress firing in the DRN. And, similarly, deep brain stimulation of the SNc would also depress DRN firing (Results E) and may explain the transient depression seen in some DBS patients.

There are times when it might be important to overcome one of these homeostatic mechanisms, for example by changing the balance between the direct and indirect pathways for a subset of neurons involved in choosing a particular action. We showed in Results F how this can be done by changing the input to the DRN, in our example by excitatory or inhibitory input from a hypothesized nucleus X. It is tempting to think that this gives a possible explanation of the plethora of descending projections to the DRN (Monti, 2010).

Our intention was to use our simple model to discuss possible roles of 5-HT in the striatum. Though our results are suggestive, the model depends on the circuitry that we hypothesize in **Figure 1**. Therefore, firm conclusions depend on experimental confirmation.

### **ACKNOWLEDGMENTS**

This research was partially supported by NSF grants EF-1038593 (H. Frederik Nijhout, Michael Reed), NSF agreement 0112050 through the Mathematical Biosciences Institute (Janet Best, Michael Reed), an NSF CAREER Award (Janet Best), the Alfred P. Sloan Foundation (Janet Best), and NIH grant R01 ES019876 (D. Thomas). The authors are grateful to Parastoo Hashemi and Steven Schiff for discussions about SSRIs and deep brain stimulation.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 February 2013; accepted: 05 May 2013; published online: 24 May 2013.*

*Citation: Reed MC, Nijhout HF and Best J (2013) Computational studies of the role of serotonin in the basal ganglia. Front. Integr. Neurosci. 7:41. doi: 10.3389/fnint.2013.00041*

*Copyright © 2013 Reed, Nijhout and Best. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### Monitoring serotonin signaling on a subsecond time scale

### *Elyse C. Dankoski<sup>1</sup> and R. Mark Wightman<sup>1,2</sup>\**

*<sup>1</sup> Curriculum in Neurobiology, University of North Carolina, Chapel Hill, NC, USA*

*<sup>2</sup> Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

#### *Reviewed by:*

*Katie A. Jennings, Oxford University, UK*

*Paul A. Garris, Illinois State University, USA*

### *\*Correspondence:*

*R. Mark Wightman, Department of Chemistry, University of North Carolina, Campus Box 3290, Chapel Hill, NC 27599, USA e-mail: rmw@unc.edu*

Serotonin modulates a variety of processes throughout the brain, but it is perhaps best known for its involvement in the etiology and treatment of depressive disorders. Microdialysis studies have provided a clear picture of how ambient serotonin levels fluctuate with regard to behavioral states and pharmacological manipulation, and anatomical and electrophysiological studies describe the location and activity of serotonin and its targets. However, few techniques combine the temporal resolution, spatial precision, and chemical selectivity to directly evaluate serotonin release and uptake. Fast-scan cyclic voltammetry (FSCV) is an electrochemical method that can detect minute changes in neurotransmitter concentration on the same temporal and spatial dimensions as extrasynaptic neurotransmission. Subsecond measurements both *in vivo* and in brain slice preparations enable us to tease apart the processes of release and uptake. These studies have particularly highlighted the significance of regulatory mechanisms to proper functioning of the serotonin system. This article will review the findings of FSCV investigations of serotonergic neurotransmission and discuss this technique's potential in future studies of the serotonin system.

**Keywords: 5-HT, cyclic voltammetry, carbon-fiber microelectrode, selective serotonin reuptake inhibitor, serotonin autoreceptor, serotonin transporter**

### **INTRODUCTION**

The neurotransmitter serotonin [also called 5-hydroxytryptamine (5-HT)] can be found in nearly every region of the central nervous system. Its functions are as diverse as the areas it innervates, and it is a complex component of many psychiatric disorders. This pervasive involvement in brain-wide neurocircuitry is supported by an exceptionally large family of receptors whose collective functional scope enables the multifarious actions of serotonin throughout the brain (Barnes and Sharp, 1999). Much of our knowledge about serotonin comes from studies investigating its actions via these receptors, which remain the target of many pharmacotherapies involving serotonergic signaling. However, the release and uptake dynamics of serotonin precipitate its downstream effects, and exploring how these dynamics are modulated has provided key insights into the serotonin system and its therapeutic potential.

A number of techniques have been used to characterize serotonin signaling in the brain. *In vivo* microdialysis has provided key insights into natural and pharmacologically-induced fluctuations in ambient extracellular levels of serotonin. However, even recent advancements in microdialysis sampling rates provide markedly lower temporal resolution than required to examine individual release and uptake events (Schultz and Kennedy, 2008). Electrophysiological measurements can infer some properties of neurotransmitter release by measuring postsynaptic response, and this method works well for neurotransmitters like glutamate and GABA, whose ligands effect instantaneous changes in ionic current or membrane potential. However, most serotonin receptors in the brain are G-protein coupled and activate intracellular cascades over time periods of 400 ms or more, resulting in postsynaptic effects that are too slow or heterogeneous to reveal information about small, fast changes in concentration. Rigorous characterization of serotonin signaling requires a technique that operates on the same temporal and spatial scales as its release and uptake processes.

Electroanalytical techniques, which combine chemical selectivity with high temporal resolution, are often used in brain tissue to monitor small, fast changes in neurotransmitter concentrations concurrent with release and uptake. Serotonin signaling has been studied using several electroanalytical techniques, including differential pulse voltammetry and chronoamperometry [for a review, see Stamford (1985)]. Among these techniques, fast-scan cyclic voltammetry (FSCV) offers the best combination of temporal and chemical sensitivity for measuring endogenous changes in serotonin concentration in brain tissue. This article will review the findings of voltammetric studies and discuss their contribution to current understanding of the mechanisms modulating serotonin release and uptake.

### **FAST-SCAN CYCLIC VOLTAMMETRY OF SEROTONIN**

FSCV is an electrochemical technique that detects changes in endogenous neurotransmitter levels rapidly enough to distinguish release and uptake events in brain tissue. The monoamine neurotransmitters dopamine, norepinephrine, and serotonin are well-suited to voltammetric detection because they oxidize predictably and at low potentials. To evaluate changes in neurotransmitter concentration, FSCV measures the current generated by the oxidation of a neurotransmitter. Oxidation is driven by a potential waveform applied to a carbon-fiber sensor. The current generated is proportional to the concentration of analyte at the carbon surface, so the current-to-concentration relationship can be quantified by calibrating microelectrodes in authentic standards before or after experimental use. Chemical selectivity, or the ability to identify the neurotransmitter being measured, is facilitated by analyzing the plot of generated current vs. applied potential. This current-voltage curve is termed the cyclic voltammogram. Monoamines oxidize and reduce at predictable potentials, and their cyclic voltammograms have a characteristic shape that is easy to recognize. An example of a voltage waveform, cyclic voltammograms, and *in vitro* calibration is shown in **Figure 1**. The "fast-scan" in the technique's name refers to the potential waveform, which is applied rapidly and repeatedly, producing up to 10 cyclic voltammograms per second. The carbon-fiber microelectrode sensors used in FSCV have small dimensions (5 × 100 µm), and this small size enables sampling from as few as 100 synapses at a time, with the electrode targeted to a discrete brain region. Thus, FSCV is a technique for which temporal and spatial scales of data collection are compatible with monitoring neurotransmission.

**FIGURE 1 |** *In vitro* **calibration of microelectrodes. (A)** Voltage potential waveform, described by Jackson et al. (1995), for detection of serotonin. **(B)** Cyclic voltammograms (current-voltage curves) obtained for known concentrations of serotonin injected into a flow cell apparatus. The concentration (right) and its corresponding oxidation current amplitude (left axis) are noted by dashed lines. **(C)** Maximal oxidation current vs. concentration of serotonin. The data are fit to a linear regression (black line), the slope of which gives a calibration factor for serotonin measured at these electrodes.
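The calibration step can be sketched in a few lines: fit a line to peak oxidation current against known concentration, then use the slope as the calibration factor to convert measured currents. The function names and the flow-cell numbers below are hypothetical and serve only as illustration. They are not data from the studies reviewed here.

```python
def fit_calibration(concentrations_nm, peak_currents_na):
    """Least-squares line through calibration points.

    Returns (slope, intercept); the slope is the calibration factor (nA per nM).
    """
    n = len(concentrations_nm)
    if n < 2 or n != len(peak_currents_na):
        raise ValueError("need at least two matched calibration points")
    mean_c = sum(concentrations_nm) / n
    mean_i = sum(peak_currents_na) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations_nm)
    if sxx == 0.0:
        raise ValueError("calibration needs at least two distinct concentrations")
    sxy = sum((c - mean_c) * (i - mean_i)
              for c, i in zip(concentrations_nm, peak_currents_na))
    slope = sxy / sxx
    return slope, mean_i - slope * mean_c


def current_to_concentration(current_na, slope, intercept=0.0):
    """Convert a measured oxidation current (nA) to concentration (nM)."""
    return (current_na - intercept) / slope


# Hypothetical flow-cell standards; not values from the article.
standards_nm = [0.0, 250.0, 500.0, 1000.0]
currents_na = [0.0, 5.0, 10.0, 20.0]
slope, intercept = fit_calibration(standards_nm, currents_na)
print(current_to_concentration(8.0, slope, intercept))
```

A linear fit is only appropriate over the range in which current remains proportional to concentration, which is why the calibration standards should bracket the concentrations expected in tissue.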

### **BRAIN REGIONS WITH MEASURABLE SEROTONIN RELEASE**

In brain slices, changes in serotonin concentration can be evoked using local electrical stimulation in brain regions containing serotonergic neurons or their axonal projections. The dorsal raphe nucleus (DRN), a tiny hub in the core of the brainstem, contains the majority of serotonin-producing neurons that send ascending projections into the brain. Voltammetric measurements detect serotonin efflux from both axonal and somatodendritic sites in this region because a subset of serotonergic neurons synapse locally. Although axonal serotonin release is prevalent throughout the central nervous system, experiments employing FSCV are typically constrained to brain regions dense with serotonergic terminals and limited interference from other neurotransmitters and metabolites. These studies predominantly take place in the substantia nigra, a midbrain region composed of the *pars compacta*, packed with dopamine-synthesizing neurons, and the *pars reticulata* (SNr), a networked relay region that includes the densest serotonergic projections from the DRN to any forebrain region (9 × 10<sup>6</sup> sites per mm<sup>3</sup>) (Moukhles et al., 1997). In the SNr, serotonin is the predominant electroactive neurotransmitter evoked by electrical stimulations, frequently observed in the absence of somatodendritic dopamine release (Cragg et al., 1997). However, Moukhles et al. (1997) reported that serotonergic processes form synaptic junctions at a higher rate in the SNr than in any other brain region. It should be considered, therefore, that serotonin dynamics described in this region may be dissimilar to the dynamics in other brain regions, including the cerebral cortex, neostriatum, and hippocampus, where a majority of serotonin terminals form non-junctional synapses (Descarries et al., 1990). Serotonin efflux has also been described using FSCV in brain slices containing the suprachiasmatic nucleus (SCN) and ventral lateral geniculate nucleus (vLGN), hypothalamic and thalamic areas with similarly robust serotonergic innervation.

Serotonin measurements in *in vivo* FSCV experiments have taken place exclusively in the SNr. Thick vasculature and meninges above the DRN make targeting this region in the intact brain with a fragile carbon-fiber microelectrode difficult. Other serotonergic regions of interest, such as the hippocampus and prefrontal cortex, have not been explored due to significant chemical interference from other monoamines. However, recent advancements in neuronal stimulation technology may help circumvent this problem, and these potential future directions will be discussed in more detail in the conclusion of this article.

### **ELECTROCHEMICAL IDENTIFICATION**

Electrochemical methods, including FSCV, lack absolute chemical specificity. Some chemical species, particularly those with similar structure, can interfere with detection of the desired substance by oxidizing at similar or identical potentials. Therefore, voltammetric measurements rely on five criteria for identification of endogenously released substances: First, cyclic voltammograms obtained under experimental conditions must have high correlation with cyclic voltammograms of the authentic compound. Second, presence of the neurotransmitter must be validated by independent chemical identification, such as microdialysis, tissue content analysis, or radioligand binding in the targeted brain region. The third criterion requires precise anatomical positioning of the sensor into the brain region of interest. Fourth, observed release should follow known physiological properties for the neurotransmitter and target brain region. Finally, identification of the released substance is dependent on pharmacological validation.
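The first criterion can be made concrete with a minimal sketch that computes the Pearson correlation between an evoked cyclic voltammogram and one recorded from the authentic compound, both sampled at the same potentials. The function name and the current values are hypothetical. No acceptance threshold is implied, since none is specified here.

```python
import math


def voltammogram_correlation(sample, template):
    """Pearson r between two voltammograms sampled at the same potentials."""
    if len(sample) != len(template) or len(sample) < 2:
        raise ValueError("voltammograms must share the same sampling grid")
    mean_s = sum(sample) / len(sample)
    mean_t = sum(template) / len(template)
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(sample, template))
    var_s = sum((s - mean_s) ** 2 for s in sample)
    var_t = sum((t - mean_t) ** 2 for t in template)
    if var_s == 0.0 or var_t == 0.0:
        raise ValueError("a flat voltammogram has no defined correlation")
    return cov / math.sqrt(var_s * var_t)


# Hypothetical currents (nA) at matching potentials; not data from the article.
template_na = [0.0, 1.0, 4.0, 2.0, -1.0, -0.5]
evoked_na = [0.5 * i + 0.1 for i in template_na]  # same shape, smaller amplitude
r = voltammogram_correlation(evoked_na, template_na)
print(f"r = {r:.3f}")
```

Correlation compares shape rather than amplitude, so a smaller evoked signal with the same peak positions still matches the template. A high correlation alone does not establish identity, which is why the remaining four criteria are applied alongside it.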

O'Connor and Kruk (1991a) systematically addressed the criteria for electrochemical validation in the first published report of endogenous serotonin measured in rat brain slices using FSCV. The cyclic voltammogram obtained from electrically-evoked serotonin is highly correlated to the one obtained after adding known concentrations of serotonin to the bath solution. Stimulation trains (500 ms in duration) elicited transient flux in serotonin levels in the DRN, where serotonin-synthesizing neurons are located, and SCN, a region dense in serotonin projections from the DRN (Fuxe, 1965). The evoked concentrations measured in both brain regions were completely and reversibly abolished by removal of calcium from the buffer solution or addition of the sodium-channel blocker tetrodotoxin, complying with known physiological properties of exocytotic release. RO 4-1284, an irreversible vesicular monoamine transporter 2 (VMAT2) inhibitor, attenuated release, confirming that observed release was vesicular in nature. Inhibition of monoamine oxidase (MAO) had no effect on stimulated efflux, ruling out interference from serotonin's metabolite, 5-HIAA. Finally, the clearance rate of serotonin in the DRN and SCN could be decreased after application of citalopram, a selective serotonin uptake inhibitor, but not by benztropine, a norepinephrine uptake inhibitor, to the bath solution. Similar procedures validate the identity of serotonin detected in subsequent experiments by this and other groups.

Bunin and Wightman (1998) later investigated an aspect of serotonin's physiological release properties that had not been addressed by initial voltammetric characterizations. The dimensions of carbon-fiber microelectrodes are considerably larger than the synaptic cleft into which neurotransmitters are released (**Figure 2**). Consequently, FSCV detects extracellular, not intrasynaptic, changes in concentration, and its measurements are limited to the neurotransmitter concentration that diffuses into the extrasynaptic space following release. A number of neuromodulators diffuse beyond the synaptic space to reach their receptors and transporters in a process called volume transmission (Fuxe et al., 2010), and prior evidence from non-voltammetric techniques implicated serotonin as a volume neurotransmitter. Ultrastructural studies of serotonergic terminals throughout the brain suggest that they form predominantly non-junctional synapses (Chazal and Ralston, 1987). This terminal architecture, together with reports that expression of serotonin transporters and receptors occurs primarily on extrasynaptic regions of neuronal processes (Kia et al., 1996; Zhou et al., 1998), is indicative of volume transmission. In light of this information, Bunin and Wightman (1998) hypothesized that electrically-evoked serotonin should reach the extracellular space via diffusion, without buffering from uptake and receptor binding sites. This was found to be the case for both somatodendritic and terminal release, where the concentration of serotonin evoked per stimulation pulse during 20-pulse trains was equivalent to the concentration evoked by a single pulse (Bunin and Wightman, 1998). Therefore, the authors concluded that serotonin concentrations measured by voltammetry reflect physiological volume transmission from the synapse to its extrasynaptic targets.

**FIGURE 2 | Illustration of a carbon-fiber microelectrode in SNr.** Scaling of the microelectrode to *in situ* serotonergic fibers (gray) and uptake sites (yellow) is a representation based on Moukhles et al. (1997).

### **TECHNICAL CONSIDERATIONS**

Since O'Connor and Kruk's first report of voltammetric detection of serotonin, several modifications have been implemented to adapt and improve the use of FSCV for novel applications. The voltage potential waveform (−1 to +1.4 to −1 V) used by the Stamford and Kruk labs, as well as others, in serotonin studies cited throughout this review was adjusted by Jackson et al. (1995) to improve temporal resolution. This modification to an N-shaped waveform, which scans from +0.2 to +1.0 to −0.1 and back to +0.2 V (**Figure 1A**), was designed to reduce serotonin adsorption to the electrode surface, as this slows electrode response times. It also avoids fouling reactions of serotonin's oxidative and reductive byproducts, improving electrode sensitivity and stability over time (Jackson et al., 1995). The modified waveform improved electrode response times and enabled more accurate measurements of release and uptake rates, facilitating closer examination of the kinetic parameters of serotonin release.
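The shape of the N-shaped waveform can be sketched as a piecewise-linear ramp through its four vertex potentials. The vertices come from the text. The scan rate and sampling rate are not stated here, so the defaults below (1000 V/s and 100 kHz) are assumptions chosen for illustration, as is the function name.

```python
def n_shaped_waveform(scan_rate_v_per_s=1000.0, sample_rate_hz=100_000.0,
                      vertices=(0.2, 1.0, -0.1, 0.2)):
    """Return (times_s, potentials_v) for one piecewise-linear scan.

    scan_rate_v_per_s and sample_rate_hz are assumed values, not taken
    from the article; vertices are the potentials quoted in the text.
    """
    times, potentials = [0.0], [vertices[0]]
    elapsed = 0.0
    for start, stop in zip(vertices, vertices[1:]):
        segment_time = abs(stop - start) / scan_rate_v_per_s
        steps = max(1, round(segment_time * sample_rate_hz))
        for k in range(1, steps + 1):
            frac = k / steps
            times.append(elapsed + frac * segment_time)
            # Linear interpolation that lands exactly on each vertex.
            potentials.append((1.0 - frac) * start + frac * stop)
        elapsed += segment_time
    return times, potentials


times_s, potentials_v = n_shaped_waveform()
print(f"scan duration: {times_s[-1] * 1e3:.2f} ms, "
      f"range: {min(potentials_v):+.1f} to {max(potentials_v):+.1f} V")
```

Under the assumed scan rate, one sweep covers 2.2 V of total excursion, so a single scan occupies only a small fraction of the interval between repetitions.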

Improvement of carbon-fiber microelectrode sensors has been another ongoing adaptation to voltammetric measurements of serotonin. Brazell et al. (1987) first reported that dip-coating a carbon-fiber microelectrode in Nafion, a cation-selective polymer, improves serotonin and dopamine detection (Brazell et al., 1987; Jackson et al., 1995). Nafion enhances serotonin detection in two ways: first, by directly increasing the electrode's sensitivity to (positively-charged) serotonin, and second, by reducing its sensitivity to interfering anionic species such as uric acid and serotonin's metabolites. Years later, the first *in vivo* voltammetric measurements of endogenous serotonin concentrations in a rat owed their success to the enhanced sensitivity and temporal resolution facilitated by Nafion-coated sensors and the modified voltage potential waveform (Hashemi et al., 2009).

Many labs continued their investigations of serotonin release without adopting either modification. Because each study reports on electrically-evoked changes in serotonin concentration, which are derived from *in vitro* calibrations, comparing findings between labs is not considered an issue within this review. It is important to note that these calibrations do not take into account the deleterious effects of electrode fouling, which may differ appreciably over the course of an experiment depending on the waveform used. Regardless of waveform choice, however, demonstration of a linear relationship between the concentration applied and the current evoked establishes the suitability of a waveform for stable detection of serotonin.

### **ELECTRICAL STIMULATION**

Many of the optimal electrical stimulation parameters for evoking somatodendritic and terminal serotonin in brain slices are consistent with previously established physiological principles. Serotonergic fibers are not myelinated and, like other unmyelinated fibers, are maximally excited by wider stimulation pulse widths, up to 2 ms in length (Anden et al., 1967; Merrill et al., 1978; Millar et al., 1985; Bunin et al., 1998). The amplitude of evoked serotonin concentration is also strongly dependent on increases in stimulation intensity (up to 380 µA) and number of pulses in a stimulation train. Maximal release amplitudes are also positively correlated with increasing frequency, up to 100 Hz (O'Connor and Kruk, 1991b; Iravani and Kruk, 1997; Bunin and Wightman, 1998), although a detailed investigation by John et al. (2006) found that electrically-evoked concentrations were less sensitive to stimulation frequencies above 30 Hz. These constrained ranges of frequency dependence could reflect limitations in vesicular availability, but Aghajanian et al. (1990) have posited that processes in terminal regions store enough serotonin to sustain long, high frequency release. Although serotonergic neurons are typically thought to fire at a rate of 0.5–5 Hz, burst-firing in the DRN has been measured at a rate of 100 Hz (Aghajanian et al., 1978; Vandermaelen and Aghajanian, 1983; Hajos et al., 1995). Differences in the range of frequency sensitivity between studies may therefore reflect dynamic, physiological fluctuations and could point to yet another regulatory component within the serotonin system. Future investigation of the mechanisms influencing frequency dependence would be an interesting addition to our understanding of serotonin signaling.

Although voltammetric studies of serotonin have used a wide array of stimulation parameters, one type has been used repeatedly in the studies reviewed in this article. Pseudo-one-pulse (POP) stimulations consist of 5–10 pulses applied at 100–200 Hz and are shorter than 100 ms in duration. They are designed to approximate a single electrical impulse but evoke more consistent efflux. In brain slice experiments, POP stimulations are often used to avoid creating an endogenous "tone" at receptors, which facilitates more direct investigation of the effects of selective agonists and antagonists on autoreceptor-mediated modulation of release (Limberger et al., 1991; Thienprasert and Singer, 1993).
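The POP ranges quoted above can be checked with simple arithmetic, as in the sketch below. The function names are hypothetical, and the default pulse width of 2 ms is an assumption based on the upper bound mentioned earlier for unmyelinated fibers.

```python
def pulse_train_duration_ms(n_pulses, frequency_hz, pulse_width_ms=2.0):
    """Time from onset of the first pulse to the end of the last pulse."""
    if n_pulses < 1 or frequency_hz <= 0.0:
        raise ValueError("need at least one pulse and a positive frequency")
    inter_pulse_ms = 1000.0 / frequency_hz
    return (n_pulses - 1) * inter_pulse_ms + pulse_width_ms


def is_pseudo_one_pulse(n_pulses, frequency_hz, pulse_width_ms=2.0):
    """Check a train against the POP ranges quoted in the text."""
    duration = pulse_train_duration_ms(n_pulses, frequency_hz, pulse_width_ms)
    return (5 <= n_pulses <= 10
            and 100.0 <= frequency_hz <= 200.0
            and duration < 100.0)


print(pulse_train_duration_ms(10, 100.0), is_pseudo_one_pulse(10, 100.0))
print(pulse_train_duration_ms(50, 100.0), is_pseudo_one_pulse(50, 100.0))
```

Even the longest train within the quoted ranges (10 pulses at 100 Hz) stays under 100 ms with the assumed pulse width, whereas a 50-pulse train at the same frequency does not qualify.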

Endogenous serotonin concentrations have been evoked *in vivo* using electrical stimulation of the DRN as well as the medial forebrain bundle (MFB). A subset of serotonergic neurons that project to the SNr also send axon collaterals to forebrain structures via the MFB (Van Der Kooy and Hattori, 1980). Electrical stimulation of these collaterals excites SNr-projecting neurons in a retrograde direction, eliciting serotonin in the desired region (Hashemi et al., 2011). While targeting a stimulation electrode to the MFB is less challenging than targeting the DRN, this stimulation site can also be used to evoke neurotransmitter release in many brain regions. This may have indirect effects on serotonin signaling, complicating interpretation of data. Many optimal stimulation parameters are consistent between *in vitro* and *in vivo* measurements, including pulse width, stimulation intensity, and stimulation length. However, the concentration of serotonin evoked in the SNr is remarkably lower than predicted by brain slice measurements, prompting curiosity about the potential for serotonergic regulatory mechanisms that require intact brain tissue.

### **RELEASE**

Local electrical stimulations of serotonin terminals in brain slices typically evoke concentration changes in the 100 nM range. *In vivo*, however, serotonin concentrations evoked in the SNr rarely reach 100 nM, even after pharmacological manipulations (Hashemi et al., 2012). *In vivo* serotonin release, measured in an intact brain, is presumably limited by negative feedback from somatodendritic and terminal autoreceptors as well as inhibitory neurotransmitters that are released concurrently, which may account for some of the disparity in release amplitudes. In brain slices, concentration flux coincides with onset of the stimulation pulse train and this rising phase reaches its maximum within milliseconds of the stimulation's end. Serotonin evoked *in vivo* tends to overshoot the duration of stimulation. The overshoot is partially an effect of the broader area of release sites activated by a remote stimulation location, but is also due to limited diffusion rates through a Nafion polymer coating that is applied to enhance sensitivity *in vivo* (Hashemi et al., 2009).

As mentioned in a previous section, electrically-evoked serotonin concentrations measured in brain slices are sensitive to stimulation frequency. A proposal by Wightman et al. (1988) explains this observation: more uptake occurs in the time between stimulation pulses during low frequency stimulations, which limits the summation of extracellular neurotransmitter concentration. Jennings et al. (2010) hypothesized that shifts in uptake rate associated with differential serotonin transporter (SERT) expression would predictably alter this frequency dependence. Mice with either a gain or a loss of SERT expression displayed significantly lower sensitivity to stimulation frequency than their wild-type littermates. Furthermore, in wild-type mice, a selective serotonin transporter inhibitor reduced sensitivity to stimulation frequency (Jennings et al., 2010). These findings underscore the importance of SERT in establishing a functional, dynamic equilibrium between release and uptake that enables coherent serotonin signaling.

Time-resolved measurements with FSCV also enable examination and comparison of the kinetic parameters of serotonin transmission. Neurotransmitter uptake is assumed to follow Michaelis–Menten dynamics, and uptake as well as concentration evoked per stimulus pulse can be calculated using a modified model of enzyme kinetics. **Figure 3** shows the equations used to model (**i**) uptake and (**ii**) release and representative signals predicted for stimulations of varying frequency. In brain slice preparations, the concentration evoked per stimulation pulse ([5-HT]<sub>pulse</sub>) was found to be 100 ± 20 nM in DRN, and significantly lower in the SNr, at 55 ± 7 nM. Differences in [5-HT]<sub>pulse</sub> are proportional to differences in tissue content between the two brain regions, indicating that local stores of serotonin may influence the concentration that can be evoked by electrical stimulation (Bunin et al., 1998). *In vivo* [5-HT]<sub>pulse</sub> in the SNr is much lower, comparatively: 1.5 nM per pulse using DRN stimulation, and 1.1 nM per pulse from the MFB. **Figure 4** shows an averaged recording of *in vivo* serotonin signals in the SNr; note that the concentration evoked is strikingly lower than predicted by the model in **Figure 3**. Given that both Bunin et al. (1998) and Hashemi et al. (2011) conducted experiments in the SNr, the nearly 50-fold difference cannot be attributed to differences in tissue content. Instead, this discrepancy between brain slice and *in vivo* preparations suggests powerful regulatory mechanisms acting on serotonin release *in vivo* which may depend on intact circuitry.
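The qualitative behavior of such a model can be sketched numerically. The exact equations of **Figure 3** are not reproduced in the text, so the sketch assumes a commonly used form: during stimulation the concentration rises at a rate of frequency × [5-HT] per pulse, and it is cleared throughout by Michaelis–Menten uptake. The per-pulse value of 55 nM is the SNr slice figure quoted above. The V<sub>max</sub> and K<sub>m</sub> values, the function name, and the integration settings are illustrative assumptions only.

```python
def simulate_train(frequency_hz, n_pulses=50, conc_per_pulse_nm=55.0,
                   vmax_nm_per_s=500.0, km_nm=170.0,
                   dt_s=1e-4, tail_s=5.0):
    """Euler integration of d[5-HT]/dt = release - Vmax*[5-HT]/(Km + [5-HT]).

    Release equals frequency * conc_per_pulse while the train is on and
    zero afterwards. vmax_nm_per_s and km_nm are placeholder values, not
    parameters reported in the article.
    Returns (times_s, concentrations_nm).
    """
    stim_s = n_pulses / frequency_hz
    n_steps = int(round((stim_s + tail_s) / dt_s))
    conc = 0.0
    times, trace = [0.0], [0.0]
    for step in range(n_steps):
        t = step * dt_s
        release = frequency_hz * conc_per_pulse_nm if t < stim_s else 0.0
        uptake = vmax_nm_per_s * conc / (km_nm + conc)
        conc = max(0.0, conc + dt_s * (release - uptake))
        times.append(t + dt_s)
        trace.append(conc)
    return times, trace


# 50-pulse trains at the three frequencies mentioned for the model traces.
peaks = {f: max(simulate_train(f)[1]) for f in (100.0, 20.0, 10.0)}
for f, peak in peaks.items():
    print(f"{f:5.0f} Hz: peak {peak:7.1f} nM")
```

For a fixed number of pulses, the lower-frequency trains last longer, so more uptake occurs between pulses and the peak is smaller. The absolute amplitudes depend entirely on the placeholder kinetic parameters and should not be read as predictions.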

Hashemi et al. (2012) investigated mechanisms that may limit *in vivo* neurotransmission by using a common MFB stimulation to compare serotonin and dopamine efflux in the SNr and nucleus accumbens, respectively. The dopamine system serves as a good basis for comparison with the serotonin system because the two monoamines share parallel features in the mechanisms controlling their synthesis, release, modulation, uptake, and metabolic degradation. Inhibition of the monoamine synthesis enzyme aromatic amino acid decarboxylase and monoamine vesicular packing protein (VMAT2) considerably decreased the concentration of evoked dopamine to 18% and 6% of control amplitudes, respectively, but affected serotonin to a much lesser extent (48% and 72%, respectively). Serotonin efflux was also resistant to short term depression after repeated stimulation pulse trains, while dopamine efflux was attenuated by 38% after 20 stimulations. This suggests that a relatively small proportion of the available vesicular serotonin is mobilized for release by each electrical stimulation train, a finding which may partially explain the low concentrations observed *in vivo*.

### **MODULATION BY AUTORECEPTORS**

Three subtypes of serotonin receptors, all 5-HT1-type, are expressed on serotonergic axons, soma, and dendrites and function as autoreceptors that provide inhibitory feedback. 5-HT1-type receptors are found throughout the brain as autoreceptors, expressed on pre-synaptic serotonin terminals, and also as heteroreceptors, expressed on post-synaptic targets. The most well-studied autoreceptors, 5-HT1A, 1B, and 1D, are seven-transmembrane, G-protein-coupled receptors (GPCRs). 5-HT1B and 1D autoreceptors negatively couple to adenylyl cyclase (Yocca and Maayani, 1990). 5-HT1A heteroreceptors throughout the brain also inhibit adenylyl cyclase activity, but autoreceptors in the DRN apparently function through a different Gi-coupled mechanism (Clarke et al., 1996). *In vivo* studies of these autoreceptors are challenging because even highly-selective drugs inadvertently target pharmacologically-identical heterosynaptic receptors, which are often expressed at high levels in the same brain region as the autoreceptor. Intact circuitry thus makes it difficult to extricate direct effects of autoreceptor activity from indirect regulation by heteroreceptors.

**FIGURE 3 |** (caption fragment) 50 pulse stimulation trains at 100, 20, and 10 Hz, as predicted by the model. Traces are representations of data based on Bunin et al. (1998).

Voltammetric measurements in brain slices avoid some of the problems associated with 5-HT1-type receptor pharmacology. In slices, the absence of spontaneous activity in serotonergic cells, due to either separation from cell bodies in a terminal slice or elimination of noradrenergic inputs in a DRN slice, results in loss of endogenous serotonin tone (Judge and Gartside, 2006). Therefore, these experiments avoid tonic activation of autoreceptors and can also avoid transient autoreceptor activity, when appropriate, using POP stimulations. This provides an opportunity to study the timing and function of these receptors in relative isolation. O'Connor and Kruk (1991b) showed that the non-selective autoreceptor antagonist methiothepin did not affect the concentration of serotonin evoked by POP stimulations, but increased serotonin elicited by longer stimulations. Further exploration with stimulations of varying frequency and duration determined that activation of autoreceptors requires a stimulation period of at least 400 ms (O'Connor and Kruk, 1991b). This time frame is comparable to the activation window for dopamine autoreceptors in striatal and limbic regions. Phillips et al. (2002) found that the activation delay observed for dopamine autoreceptors reflects timing of intracellular cascades added to the rate of neurotransmitter diffusion in a given brain area (Phillips et al., 2002).

5-HT1A, 1B, and 1D receptors are expressed at high levels in the DRN, where they negatively influence neuronal firing rate and extracellular levels of serotonin (Sprouse and Aghajanian, 1987; Pineyro et al., 1995; Moret and Briley, 1997; Adell et al., 2001). Voltammetric studies corroborate the inhibitory functions of all three receptors in this region by demonstrating that their selective agonists can reduce the amplitude of electrically-evoked serotonin release (Davidson and Stamford, 1995b; Hopwood and Stamford, 2001). Although its heteroreceptor analogues are prominently expressed in limbic regions, 5-HT1A autoreceptors are only expressed in the DRN and median raphe nucleus (Verge et al., 1985). Serotonin levels in forebrain terminal regions are affected by 5-HT1A-mediated changes in DRN unit activity (Kreiss and Lucki, 1994; Casanovas et al., 1997), but only 5-HT1B and 1D autoreceptors are expressed locally to functionally inhibit release in these regions. Voltammetric measurements in the SCN and vLGN confirm absence of 5-HT1A autoreceptor function in these terminal regions. 5-HT1B and 1D receptors, and not 5-HT1A receptors, negatively influence serotonin efflux in brain slices containing terminal regions (O'Connor and Kruk, 1992; Davidson and Stamford, 1996).

The 5-HT1A receptor may be the trump card in this family of autoreceptors: 5-HT1A receptor mRNA is expressed in nearly 100% of serotonergic cells and up to 15% of GABAergic interneurons in the DRN (Day et al., 2004). This receptor robustly regulates both neuronal firing rates and extracellular serotonin levels in the DRN (Sprouse and Aghajanian, 1987; Hjorth and Sharp, 1991). Voltammetric measurements find that antagonists for 5-HT1A and 1B have a supra-additive effect when administered together: increase in serotonin efflux is greater when both receptors are blocked than would be expected given the effect of each antagonist alone (Roberts and Price, 2001). In addition, the effects of 5-HT1B receptor antagonists on serotonin efflux are overpowered by 5-HT1A receptors unless they are also blocked, suggesting that these receptors compensate for reductions of 5-HT1B activity. Given these results, it is suggested that 5-HT1A and 1B receptors exhibit a functional interaction that is facilitated by proximal expression sites on serotonin neurons. Interest in 5-HT1A receptors has increased in the last decade, as they may play a role in depression and anxiety-related disorders (Ohno, 2010). Use of FSCV in future studies could meaningfully contribute to our understanding of how 5-HT1A receptor-mediated modulation of serotonin release plays a role in the etiology and treatment of these disorders.
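The notion of a supra-additive effect can be made concrete with a simple comparison against an additive expectation. The sketch below assumes that the expected combined effect is the sum of the effects of each antagonist alone; the function name and the example percentages are hypothetical and are not data from Roberts and Price (2001).

```python
def interaction_index(effect_a, effect_b, effect_combined):
    """Ratio of the observed combined effect to the additive expectation.

    Effects are percent increases in evoked serotonin over control.
    A value above 1 indicates a supra-additive interaction under the
    (assumed) additive null model; a value below 1 indicates sub-additivity.
    """
    expected = effect_a + effect_b
    if expected == 0:
        raise ValueError("additive expectation is zero; index is undefined")
    return effect_combined / expected


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    index = interaction_index(effect_a=20.0, effect_b=30.0, effect_combined=80.0)
    print("supra-additive" if index > 1 else "not supra-additive", index)
```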

Much speculation has occurred regarding the explanation for seemingly parallel functions of 5-HT1B and 1D autoreceptors. Both receptors are expressed in most serotonergic brain regions and have superficially redundant effects. One theory posits that these autoreceptors differ in their affinity for serotonin: one high affinity and the other low affinity. However, it has since been demonstrated that their affinities are nearly identical (Boess and Martin, 1994). More likely, the two receptors are expressed in different anatomical locations, and thus provide site-specific regulation of serotonin release, e.g., dendritic vs. axonal localizations in the DRN. Stamford et al. (2000) have reviewed the evidence supporting this hypothesis (Stamford et al., 2000).

The SNr expresses the highest concentration of 5-HT1B autoreceptors and heteroreceptors in the murine brain (Pazos and Palacios, 1985). 5,7-HT-induced lesions of serotonin neurons reduced 5-HT1B expression level by 37%, presumably due to degradation of serotonin terminals (Verge et al., 1986); this suggests that over 1/3 of 5-HT1B receptors expressed in the SNr could function as autoreceptors. Heterosynaptic function of 5-HT1B receptors on presynaptic sites in the SNr has been well-documented and may yield important therapeutic findings (Sari, 2004), but its functionality as an autoreceptor in the SNr remains controversial. Iravani and Kruk (1997) found no effects of 5-HT1B receptor antagonists on electrically-evoked serotonin concentrations in SNr slice preparations (Iravani and Kruk, 1997). However, Threlfell et al. (2010) report that these autoreceptors influence short-term depression of serotonin efflux. In paired stimulation trains, the concentration of serotonin evoked by the second stimulation (S2) reached 30% of that evoked by the first stimulation (S1) when there was a 1 second delay between S1 and S2. Antagonists of 5-HT1B receptors relieved this depression by up to 20% (Threlfell et al., 2010). 5-HT1B autoreceptors are thus apparently functional in the SNr, although their modulatory effects may be less robust than in other brain regions.

It is possible that the role of autoreceptors could be better elucidated by *in vivo* voltammetric studies, where endogenous serotonin tone is undisturbed and autoreceptor function is closer to normal physiological levels. However, a limited number of studies currently address the effects of serotonin's autoreceptors *in vivo*. In practice, it is difficult to selectively target 5-HT1-type receptors on serotonin terminals when pharmacologically indistinct 5-HT1-type heteroreceptors are expressed throughout the brain. As with *in vivo* microdialysis, the direct roles of the autoreceptor would be difficult to extricate from indirect modulation by *in situ* circuitry. Recent technological advancements in iontophoresis enable spatially-resolved, quantitative drug delivery at the site of voltammetric measurements (Herr and Wightman, 2013). Future studies using FSCV combined with this drug-delivery method have great potential to answer important questions about serotonin's autoreceptors.

### **UPTAKE**

Serotonin clearance is achieved primarily via active transport. Its transporter, SERT, is a member of the Na+/Cl− transporter family, which includes dopamine, norepinephrine, GABA, and glutamate transporters (Bennett et al., 1973; Iversen, 1974). SERT displays high affinity for serotonin in the nanomolar concentration range (Blakely et al., 1991). Inhibitors of SERT, selective serotonin reuptake inhibitors (SSRIs), have been a significant target of research efforts for decades, owing to their widespread use as antidepressant medications. Given acutely, SSRIs exert striking effects on the serotonin system: they elevate extracellular serotonin levels in the DRN (Bel and Artigas, 1992), which in turn decreases rate of cell firing due to activation of 5-HT1A autoreceptors (Chaput et al., 1986; Gartside et al., 1995). However, in therapeutic practice, SSRIs relieve depressive symptoms only after a chronic period of 3–6 weeks. It is during this period that the effects of transport inhibition on serotonin transmission become less clear. FSCV provides an ideal method for deciphering the effects of SSRIs because it can distinguish between changes in released serotonin and changes in rate of uptake.

Electrically-evoked changes in serotonin concentration are cleared from the extrasynaptic space within seconds of stimulation termination. The term *t*<sub>1/2</sub> is often used to compare rates of clearance; *t*<sub>1/2</sub> is the time elapsed between peak concentration of neurotransmitter and its decay to half this amplitude. Across brain regions, brain slice and *in vivo* voltammetric measurements report similar values of *t*<sub>1/2</sub>, ranging from approximately 1 to 3 seconds (O'Connor and Kruk, 1994; Iravani et al., 1999; Davidson and Stamford, 2000; Hashemi et al., 2012). Rates of neurotransmitter clearance may positively correlate with the density of transporter sites in a given brain region: Bunin et al. (1998) report clearance rates of 1300 ± 20 nM/s in the DRN and 570 ± 70 nM/s in the SNr, and quantitative autoradiographic studies report 2–4 fold greater SERT binding levels in the DRN (Kovachich et al., 1988; Kovacevic et al., 2010). However, some comparisons of SERT density across brain regions do not support this conclusion, particularly in species other than rat, so more thorough investigation of the relationship between uptake rate and SERT density is needed. Transporter expression also influences release: brain slice studies in mice that either lack or overexpress SERT have demonstrated a negative correlation between transporter expression level and concentration of serotonin evoked by electrical stimulation (John et al., 2006; Jennings et al., 2010). The disparity observed between clearance rates in the DRN and SNr is conspicuously proportional to differences Bunin et al. (1998) reported in release rates. This suggests a consistent relationship between transporter expression levels, uptake rates, and release rate. Modeling serotonin signaling kinetics in more brain regions could confirm whether this relationship holds true throughout the brain.
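As an illustration of how the clearance half-time relates to the kinetic parameters, the sketch below measures the half-time from a sampled trace and compares it with the closed-form value for pure Michaelis–Menten decay, t = (C0/2 + KM·ln 2)/Vmax, which follows from integrating dC/dt = −Vmax·C/(KM + C) from C0 to C0/2. The Vmax value is the SNr figure quoted above; the starting concentration, the KM value, and all function names are assumptions made for illustration.

```python
import math


def decay_trace(c0, vmax, km, dt=1e-4, duration=5.0):
    """Synthetic clearance trace: Michaelis-Menten decay from c0 (forward Euler)."""
    times, conc, c = [], [], c0
    for i in range(int(round(duration / dt))):
        times.append(i * dt)
        conc.append(c)
        c -= dt * vmax * c / (km + c)
    return times, conc


def half_life_from_trace(times, conc):
    """Time from the peak to the first sample at or below half the peak amplitude."""
    peak_idx = max(range(len(conc)), key=conc.__getitem__)
    half = conc[peak_idx] / 2.0
    for t, c in zip(times[peak_idx:], conc[peak_idx:]):
        if c <= half:
            return t - times[peak_idx]
    return None  # the trace never decays to half amplitude


def mm_half_life(c0, vmax, km):
    """Closed-form half-time for Michaelis-Menten decay starting at c0."""
    return (c0 / 2.0 + km * math.log(2.0)) / vmax


if __name__ == "__main__":
    # vmax is the SNr value quoted in the text; c0 and km are hypothetical.
    times, conc = decay_trace(c0=1000.0, vmax=570.0, km=170.0)
    print(half_life_from_trace(times, conc), mm_half_life(1000.0, 570.0, 170.0))
```

The closed-form expression makes explicit that, under this model, the half-time depends on the peak amplitude as well as on Vmax and KM, so half-times are most comparable between recordings of similar amplitude.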

SSRIs decrease rate of neurotransmitter clearance while increasing the maximum amplitude of electrically-evoked serotonin concentrations. In brain slices, SERT inhibition slows clearance (measured as an increase in *t*<sub>1/2</sub>) by 150–700%. This wide spread of responses may be attributable to experimental variability between studies, particularly differences in stimulation parameters. Indeed, *in vivo* studies of SSRI effects in the SNr using identical stimulation parameters report comparable changes in *t*<sub>1/2</sub> using MFB and DRN stimulation sites (increasing by 324 and 306%, respectively) (Hashemi et al., 2009, 2012). SSRIs also increase evoked serotonin concentrations by 200–450% in SNr brain slices, and up to 410% *in vivo* (Iravani et al., 1999; John et al., 2006; Hashemi et al., 2012). In this case, the intensity of the SSRI's effect is associated with different stimulation frequencies or pulse number. Structure and selectivity differences between SERT inhibitors may also contribute to variable responses between voltammetric studies; however, differences between SSRIs have not been specifically investigated using FSCV. Serotonin efflux in SNr brain slices has been modeled to describe the effects of an SSRI, fluoxetine, on apparent *K*<sub>M</sub>, the Michaelis–Menten constant (John et al., 2006). Quantifying changes to *K*<sub>M</sub>, *V*<sub>max</sub>, and [5-HT]pulse may be a more effective way to contrast the effects of various SERT inhibitors on serotonin signaling in future studies in brain slices and *in vivo*. Thorough comparison of these effects could inform clinical usage of these pharmacotherapies.
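One way to see how a change in apparent KM could translate into a change in the clearance half-time is sketched below. It assumes purely competitive inhibition, so that apparent KM = KM·(1 + [I]/Ki), and it reuses the closed-form half-time for Michaelis–Menten decay. The inhibitor concentrations, the Ki value, the KM value, and the function names are all hypothetical; the cited studies may have used a different parameterization.

```python
import math


def apparent_km(km, inhibitor, ki):
    """Apparent Michaelis constant under (assumed) competitive inhibition."""
    return km * (1.0 + inhibitor / ki)


def mm_half_life(c0, vmax, km):
    """Closed-form half-time for Michaelis-Menten decay starting at c0."""
    return (c0 / 2.0 + km * math.log(2.0)) / vmax


def percent_increase_t_half(c0, vmax, km, inhibitor, ki):
    """Percent increase in half-time caused by the inhibitor, relative to control."""
    base = mm_half_life(c0, vmax, km)
    drug = mm_half_life(c0, vmax, apparent_km(km, inhibitor, ki))
    return 100.0 * (drug - base) / base


if __name__ == "__main__":
    # All parameter values below are hypothetical placeholders (nM, nM/s).
    for dose in (0.0, 100.0, 1000.0):
        change = percent_increase_t_half(c0=500.0, vmax=570.0, km=170.0,
                                         inhibitor=dose, ki=100.0)
        print(f"[I] = {dose:6.0f} nM -> t1/2 increases by {change:.0f}%")
```

Because the half-time also depends on the peak amplitude, the same change in apparent KM yields a different percent change in half-time at different evoked concentrations, which is one reason fitted kinetic parameters may be easier to compare across studies than half-times.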

### **AUTORECEPTORS MEDIATE SOME EFFECTS OF ACUTE UPTAKE INHIBITION**

In addition to their inhibitory influence on release, serotonin's autoreceptors appear to modulate response to SERT inhibition. A number of studies report that autoreceptor antagonists can potentiate the rise in extracellular serotonin levels elicited by SSRIs (Hjorth, 1993; Artigas et al., 1994), and 5-HT1A autoreceptors also mediate reduction of firing rate by SSRIs in the DRN (Gartside et al., 1995). The concentration change evoked by POP stimulations, deliberately rapid enough to avoid creating an endogenous tone, typically does not activate autoreceptors and is thus not affected by their antagonists. However, in brain slices of the DRN, paroxetine-induced increases in serotonin efflux were potentiated by 5-HT1A and 1B/D receptor antagonists (Davidson and Stamford, 1995a). Therefore, it is hypothesized that SERT inhibition causes an increase in extracellular serotonin levels sufficient to activate autoreceptors, even in brain slices. This produces an inhibitory tone, such that autoreceptor antagonists can further unmask SSRI-induced increases in release. 5-HT1B and 1D autoreceptors appear to similarly potentiate the effects of SSRIs in distal brain regions, as the Stamford lab also reports increases in paroxetine's effects in the vLGN when co-administered with 5-HT1B and 1D receptor antagonists (Davidson and Stamford, 1997). The interaction between regulation of release and uptake functions may also be an important detail in understanding how chronic uptake inhibition functions in treating depressive disorders.

### **CHRONIC UPTAKE INHIBITION**

The gap between onset of acute physiological effects and the therapeutic efficacy achieved in a chronic treatment period implies that SSRI-induced increases in serotonin levels are not directly producing antidepressant effects. Instead, elevated serotonin levels may influence long-term changes in serotonin signaling and its downstream targets to relieve symptoms of depression (Blier et al., 1987). The effects of long-term SERT inhibition are conflicting: some find increases in extracellular serotonin levels, and some find no changes. Associated with these outcomes are variable reports of autoreceptor desensitization or hypersensitization of 5-HT1A and 1B autoreceptors (Chaput et al., 1986; Invernizzi et al., 1992, 1995; Bosker et al., 1995a,b; Moret and Briley, 1996). Studies examining the effects of chronic SSRI treatment using FSCV have produced more consistent findings.

FSCV measurements of serotonin signaling after 21 days of SSRI exposure reveal that rate of clearance, measured by *t*<sub>1/2</sub>, is unchanged by this treatment. This lack of change is intriguing because radioligand binding studies report brain-wide reductions in SERT density after chronic inhibition (Kovacevic et al., 2010). It may reflect compensation by other clearance mechanisms, such as low affinity serotonin transporters. High and low affinity transport systems have been described for other monoamine neurotransmitters (Iversen, 1974; Stamford et al., 1984, 1990; Hagan et al., 2011). Although studies suggest that these transporters may play an important role in serotonin signaling (Daws, 2009), there are presently no FSCV studies describing their role in modulating release and uptake. The role of non-selective uptake transporters in modulating serotonin signaling, particularly following chronic SSRI treatment, would be interesting to investigate using FSCV.

Long-term SSRI treatment increases stimulation-evoked serotonin concentrations in the DRN and other brain regions by 20–100%, depending on the experiment and brain region studied (O'Connor and Kruk, 1994; Davidson and Stamford, 1998, 2000). These findings concur with the results of Schoups et al. (1986), who found that electrically-evoked release of tritiated serotonin ([<sup>3</sup>H]5-HT) in the hypothalamus increased after 21 days of SSRI treatment. Although increases in serotonin efflux are observed after acute SERT inhibition, these can be explained by changes in rate of uptake. However, *t*<sub>1/2</sub> was not altered in any voltammetric investigation of long-term SSRI treatment. Therefore, increases in evoked concentrations induced by chronic treatment must rely on another mechanism. Changes in other aspects of release may contribute to this effect, for example: the quantity or composition of serotonin stored in vesicles, regulation of intracellular calcium, or excitability of the synaptic membrane. These mechanisms have not yet been explored in depth using voltammetric methods.

Alterations in 5-HT1A autoreceptors contribute to the effects of chronic SSRIs on serotonin signaling. Under normal conditions, activated 5-HT1A receptors inhibit serotonin release and neuronal firing rates, and chronic SSRI treatment may modify this activity. Selective suppression of 5-HT1A autoreceptors can produce antidepressant behavioral effects in the absence of SSRIs (Bortolozzi et al., 2012). Many investigations have described functional desensitization of 5-HT1A receptors after chronic SERT inhibition, but to varying degrees across brain regions (Kreiss and Lucki, 1994, 1997; Cremers et al., 2000; Bosker et al., 2001; Rossi et al., 2008). Davidson and Stamford (1998) compared serotonin release and uptake and neuronal firing rates in the DRN of rats treated with water or paroxetine for 21 days. Paroxetine-treated rats had significantly higher serotonin release rates but exhibited no differences in firing rate. Interestingly, application of a 5-HT1A receptor agonist revealed that firing rate was *less* sensitive, and release amplitude *more* sensitive, to this manipulation. These contradictory findings on 5-HT1A receptor sensitivity were not a total surprise: prior studies found similar desensitization of 5-HT1A receptors in the control of firing rate after chronic paroxetine treatment (Chaput et al., 1986; Blier et al., 1988, 1990), and O'Connor and Kruk (1994) had previously reported sensitization of 5-HT1A receptors controlling release amplitude. The dichotomous effect of chronic SSRIs on 5-HT1A receptor sensitization indicates a functional distinction between the receptors mediating neuronal firing and those controlling release. Given the complex effects of chronic SSRIs on 5-HT1A autoreceptors in the DRN, it would be interesting to see how these changes translate to serotonin release in an intact brain. Currently, however, no studies employing FSCV have examined the effects of chronic SERT inhibition *in vivo*.

5-HT1B and 1D receptors also desensitize after chronic SERT inhibition, although the extent to which this occurs appears to vary between brain regions. O'Connor and Kruk (1994) reported desensitization of 5-HT1B receptors in SCN after chronic treatment with fluoxetine. In contrast, the Stamford lab found no changes in the sensitivity of 5-HT1B receptors in the vLGN, instead finding desensitization of 5-HT1D receptors after chronic paroxetine. This inconsistency may reflect differences in autoreceptor expression in the SCN and vLGN, or result from difficulty in selectively targeting the 5-HT1B receptor pharmacologically (O'Connor and Kruk do not address the effects of 5-HT1D receptors in their study). Additionally, while O'Connor and Kruk (1994) found no desensitization of 5-HT1B receptors in the DRN, Davidson and Stamford (2000) later demonstrated that 5-HT1B receptor desensitization was apparent only when the 5-HT1A autoreceptor is antagonized (Davidson and Stamford, 2000). This adds further weight to the conjecture that 5-HT1A and 1B receptors functionally interact in the DRN.

### **MONOAMINE OXIDASE**

Metabolic degradation of serotonin by the enzyme MAO also contributes to serotonin clearance, especially in the developing brain (Cases et al., 1995, 1998). However, MAO inhibition in brain slices has no reported effect on release amplitudes or uptake (O'Connor and Kruk, 1991a), a finding used to confirm absence of serotonin's metabolites from the voltammetric signal. Owesson et al. (2002) showed a greater role of MAO in regulating serotonin efflux using transgenic mice lacking MAO-A expression. MAO-A is the isoenzyme that preferentially degrades norepinephrine, epinephrine, dopamine, and serotonin, and mice lacking this enzyme have decreased neuronal firing rates in the DRN and increased extracellular serotonin levels (Evrard et al., 2002). In brain slices of the DRN, MAO-A-deficient mice displayed significantly greater serotonin efflux and reduced clearance rates compared to wild-type controls. Additionally, the effects of citalopram were smaller and radioligand binding showed significantly lower expression of SERT in these mice (Owesson et al., 2002). This suggests that serotonin signaling is subject to regulation by MAO under the right experimental conditions. *In vivo* work supports this idea, as a recent study has shown that MAO inhibitors dramatically increase serotonin efflux in the SNr (Hashemi et al., 2012). MAO inhibition also has a much greater effect on serotonin than dopamine efflux when compared *in vivo*, suggesting a unique role for metabolic degradation in the regulation of serotonin transmission compared to other monoaminergic systems.

### **FUTURE DIRECTIONS**

Most of the studies reviewed in this article focused on describing the role of autoreceptors and transporters in modulating serotonin signaling throughout the brain. However, signaling is also considerably influenced by many other neurotransmitter systems, including norepinephrine, glutamate, GABA, and a number of neuroendocrine modulators. These external influences are highly implicated in serotonin's involvement in a number of psychiatric disorders and, while they have been investigated by other techniques, their functions have not been fully described using subsecond voltammetric measurements. Evaluating the effects of external modulatory mechanisms on the subsecond dynamics of serotonin signaling could provide important clues about their role in neurological disorders.

Ongoing methodological developments continue to progress voltammetric measurements beyond the current experimental limits. While electrochemical techniques have been optimized for serotonin detection (Lama et al., 2012), multi-electrode arrays are being developed which would enable measurements of multiple neurotransmitters in multiple locations simultaneously. Additionally, iontophoretic methods adapted for FSCV now enable localized, quantitative drug delivery, enabling investigation of recording-site specific effects *in vivo*. FSCV can also be paired with concurrent electrophysiological measurements to couple information about neurotransmitter release to single-unit responses of post-synaptic neurons. Iontophoretic and electrophysiological methods have already been applied to voltammetric studies of dopamine release in anaesthetized and freely-moving animals (Takmakov et al., 2011), and the Wightman group is currently working to adapt these methods for serotonin detection.

It has previously been challenging to selectively study serotonin's autoreceptors *in vivo* because homologous receptors are expressed throughout the brain. However, many novel drug-delivery and transgenic methods have been developed to avoid this type of complication. DREADDs (designer receptors exclusively activated by designer drugs) have been used to target specific G-protein-activated cascades in serotonergic neurons (Dong et al., 2010). A light-activated 5-HT1A receptor has been generated that can be expressed selectively on serotonergic neurons (Oh et al., 2010). Furthermore, transgenic mice and rats offer many opportunities to study signaling in models of neurological disorders and targeted deletions. The effects of SERT deletion or overexpression on serotonin signaling have been investigated in brain slices of the SNr but not in an *in vivo* preparation. Many conditional knockout mouse models, which avoid confounding developmental effects, are now available for serotonin's transporter and receptors. These techniques could lead to more selective targeting and better characterization of serotonin's receptors and their downstream effectors in combination with voltammetric measurements.

Voltammetric measurements have, until recently, been limited to brain regions with high levels of the neurotransmitter of interest and limited presence of other electroactive compounds. This is because electrical stimulations indiscriminately excite all proximal nerve terminals. Use of optogenetic stimulation circumvents this barrier by enabling selective excitation of a specific population of neurons. Channelrhodopsin-2-mediated serotonin efflux has been measured in fly larvae using FSCV in a technique developed by the Venton group. The light-evoked efflux is vesicular and subject to regulation by synthesis and uptake transport in a manner that is similar to mammalian serotonin release (Borue et al., 2009, 2010). Selective stimulation of serotonergic neurons in a mammalian model would permit measurements in brain regions with significant interference from other electroactive neurotransmitters, such as the hippocampus.

Finally, while voltammetric measurements of serotonin have presently only occurred in brain slices and anaesthetized animals, an exciting future direction for research will be monitoring serotonin signaling in an awake, freely moving animal. FSCV has been used to measure endogenous dopamine and norepinephrine release in freely moving animals, and this research has led to groundbreaking information coupling real-time neurotransmission to specific facets of behaviors. Many questions remain about serotonin's role in both basic and complex nervous system processes, and coupling FSCV to relevant behavioral paradigms may yield important clues about its function.

**FIGURE 5 | A synopsis of the findings presented in this article.** Serotonin (5-HT) is synthesized from tryptophan in a two-step process requiring tryptophan hydroxylase and aromatic amino acid decarboxylase. Serotonin is packaged into vesicles by vesicular monoamine transporter 2 (VMAT2) and is released via calcium-dependent exocytosis. Released serotonin diffuses to extrasynaptic receptors and transporters via volume transmission. Its autoreceptors (5-HT1A, 1B, and 1D) are inhibitory and coupled to Gi proteins. The serotonin transporter (SERT) has high affinity and selectivity for uptake of extracellular serotonin. Inside the terminal, serotonin is primarily metabolized by monoamine oxidase (MAO).

### **CONCLUSION**

Serotonin signaling is an important component in the etiology and treatment of many neurological disorders. By combining subsecond temporal resolution with nanomolar sensitivity to concentration changes, FSCV has revealed a great deal about dynamic serotonin transmission. These findings are summarized by the illustration in **Figure 5**. Studies using voltammetric methods have emphasized the importance of autoreceptor-mediated inhibitory feedback mechanisms in normal signaling as well as response to SSRIs. Further, recent *in vivo* measurements suggest that intact brain circuitry supports the involvement of multiple modulatory mechanisms in the control of serotonin signaling. New developments in a variety of techniques present potential for more intricate assessment of regulation within and external to the serotonin system. Future studies using FSCV in combination with new technologies will likely elucidate many of the mysteries of the serotonin system.

### **ACKNOWLEDGMENTS**

The authors wish to thank the Electronics Facility at University of North Carolina for their contribution to selected studies cited in this review. Our work was funded by the National Institutes of Health (R01 NS38879 to R. Mark Wightman).

### **REFERENCES**

229, 101–103. doi: 10.1016/0014-299990292-C


for the therapeutic response in major depression. *J. Clin. Psychopharmacol.* 7, 24S–35S.


5-HT1A autoreceptors evokes strong anti-depressant-like effects. *Mol. Psychiatry* 17, 612–623. doi: 10.1038/mp.2011.92.


neuronal release and uptake: an investigation of extrasynaptic transmission. *J. Neurosci.* 18, 4854–4860.


600, 81–92. doi: 10.1111/j.1749-6632.1990.tb16874.x


R. M. (2011). *In vivo* electrochemical evidence for simultaneous 5-HT and histamine release in the rat substantia nigra pars reticulata following medial forebrain bundle stimulation. *J. Neurochem.* 118, 749–759. doi: 10.1111/j.1471-4159.2011.07352.x


autoreceptors of the dorsal and median raphe nuclei. *Synapse* 25, 107–116. doi: 10.1002/(SICI)1098-2396(199702)25:2<107::AID-SYN1>3.0.CO;2-G


studied using fast cyclic voltammetry. *Brain Res.* 568, 123–130. doi: 10.1016/0006-899391387-G


*Neuroscience* 165, 212–220. doi: 10.1016/j.neuroscience.2009.10.005


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 March 2013; accepted: 16 May 2013; published online: 05 June 2013.*

*Citation: Dankoski EC and Wightman RM (2013) Monitoring serotonin signaling on a subsecond time scale. Front. Integr. Neurosci. 7:44. doi: 10.3389/fnint.2013.00044*

*Copyright © 2013 Dankoski and Wightman. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Comodulation of dopamine and serotonin on prefrontal cortical rhythms: a theoretical study

#### *Da-Hui Wang<sup>1</sup>\* and KongFatt Wong-Lin<sup>2</sup>\**

*<sup>1</sup> Department of Systems Science and National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China <sup>2</sup> Intelligent Systems Research Centre, School of Computing and Intelligent Systems, University of Ulster, Derry, UK*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*Lei Niu, Albert Einstein College of Medicine, USA M. Victoria Puig, Massachusetts Institute of Technology, USA*

### *\*Correspondence:*

*Da-Hui Wang, Department of Systems Science and National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Xinjiekouwai Street 19, Haidian district, Beijing 100875, China e-mail: wangdh@bnu.edu.cn; KongFatt Wong-Lin, Intelligent Systems Research Centre, University of Ulster, Magee Campus, Northland Road, BT48 7JL, Northern Ireland, UK e-mail: k.wong-lin@ulster.ac.uk*

The prefrontal cortex (PFC) is implicated in cognitive control. Abnormal PFC activities and rhythms have been observed in some neurological and neuropsychiatric disorders, and evidence suggests influences from the neuromodulators dopamine (DA) and serotonin (5-HT). Despite the high level of interest in these brain systems, the combined effects of DA and 5-HT modulation on PFC dynamics remain unknown. In this work, we build a mathematical model that incorporates available experimental findings to systematically study the comodulation of DA and 5-HT on the network behavior, focusing on beta and gamma band oscillations. A single-neuron model shows that pyramidal cells with 5-HT1A and 5-HT2A receptors can be non-monotonically modulated by 5-HT. A two-population excitatory-inhibitory network consisting of pyramidal cells with D1 receptors can provide rich repertoires of oscillatory behavior. In particular, 5-HT and DA can modulate the amplitude and frequency of the oscillations, which can emerge or cease, depending on receptor types. Certain receptor combinations are conducive to the robustness of the oscillatory regime, or to the existence of multiple discrete oscillatory regimes. In a multi-population heterogeneous model that takes into account possible combinations of receptors, we demonstrate that robust network oscillations require high DA concentration. We also show that selective D1 receptor antagonists (agonists) tend to suppress (enhance) network oscillations and increase the frequency from beta toward gamma band, while selective 5-HT1A antagonists (agonists) act in opposite ways. Selective D2 or 5-HT2A receptor antagonists (agonists) can lead to a decrease (increase) in oscillation amplitude, but only 5-HT2A antagonists (agonists) can increase (decrease) the frequency. These results are comparable to some pharmacological effects. Our work illustrates the complex mechanisms of DA and 5-HT when operating simultaneously through multiple receptors.

**Keywords: dopamine DA, serotonin 5-HT, prefrontal cortical circuit, computational model, selective dopamine and serotonin receptor agonist and antagonist, nonlinear dynamics**

### **1. INTRODUCTION**

The prefrontal cortex (PFC) plays an essential role in many higher brain functions such as goal-directed behavior, action planning, learning, attention, mnemonic processes, inhibitory control, and task switching (Miller, 2000; Fuster, 2001; Miller and Cohen, 2001; Andrade, 2011b). Neural oscillations in the PFC are suggested to be important for communication within the PFC and with other brain regions, and are suggested to regulate such higher cognitive functions (Wang, 2010; Benchenane et al., 2011).

Neural activities in the PFC are known to be regulated by endogenous neuromodulators. In particular, the neuromodulators dopamine (DA) and serotonin (5-HT) can modulate PFC neuronal excitability, synaptic transmission, plasticity and other electrical and biochemical properties, and hence affect various brain functions and behaviors (de Almeida et al., 2008; Kehagia et al., 2010; Puig and Gulledge, 2011; Rogers, 2011; Puig and Miller, 2012; Tritsch and Sabatini, 2012). DA alone can modulate the PFC in various ways through D1-like (comprising D1 and D5) receptors and D2-like (comprising D2, D3, and D4) receptors expressed on the pyramidal cells and interneurons (Vincent et al., 1993; Gaspar et al., 1995; Vincent et al., 1995; Muly et al., 1998; Neve et al., 2004; Seamans and Yang, 2004; Lapish et al., 2007; de Almeida et al., 2008; Santana et al., 2009). Activation of D1-like receptors can increase the intrinsic excitability and the input-output gain of PFC pyramidal cells (Henze et al., 2000; Thurley et al., 2008). It is also found to directly depress excitatory interaction between pyramidal cells, increase the excitability of fast-spiking interneurons, and enhance inhibitory (GABAergic) synaptic transmission (Zhou and Hablitz, 1999; Gao et al., 2001; Gulledge and Jaffe, 2001; Gonzalez-Burgos et al., 2002; Gorelova et al., 2002; Kroner et al., 2007). These effects can be attributed to the ability of D1-like receptors to trigger a variety of ionic channel activities, e.g., enhancement of sodium current, and attenuation of slowly-inactivating potassium currents and glutamate mediated synaptic currents (Yang and Seamans, 1996; Gao et al., 2001; Seamans et al., 2001a; Gonzalez-Islas and Hablitz, 2003; Tseng and O'Donnell, 2004). Activation of D2-like receptors seems to produce effects opposite to those of D1-like receptors (Sesack and Bunney, 1989; Yang and Mogenson, 1990; Gulledge and Jaffe, 1998).

The PFC also receives dense 5-HT innervation from the raphe nuclei (Vertes, 1991; Vertes et al., 1999; de Almeida et al., 2008). Although 5-HT can modulate neural activity through seven distinct subtypes of receptors (Hoyer et al., 2002), 5-HT1A and 5-HT2A receptors are abundant in the PFC and seem to be the main contributors. Specifically, about 50–60% of the pyramidal neurons express 5-HT1A and/or 5-HT2A receptors (Pazos and Palacios, 1985; Pompeiano et al., 1992, 1994; Kia et al., 1996; Lopez-Gimenez et al., 1997; Willins et al., 1997; Martin-Ruiz et al., 2001; Santana et al., 2004; de Almeida and Mengod, 2007; Wedzony et al., 2008; Weber and Andrade, 2010), while a subpopulation of pyramidal cells express 5-HT1A or 5-HT2A receptors alone (Amargos-Bosch et al., 2004; Santana et al., 2004; Weber and Andrade, 2010; Andrade, 2011a). Inhibitory interneurons in the PFC also express 5-HT1A or 5-HT2A receptors (Pazos and Palacios, 1985; Willins et al., 1997; Santana et al., 2004; de Almeida and Mengod, 2007; Di Pietro and Seamans, 2007; Puig et al., 2010; Weber and Andrade, 2010). 5-HT1A and 5-HT2A receptors seem to act in opposing ways. For example, the activation of 5-HT1A receptors can lead to an increase in potassium conductance, resulting in an inhibitory response of the neuronal membrane potential (Andrade et al., 1986; Beique et al., 2004; Goodfellow et al., 2009), while the activation of 5-HT2A receptors generates an excitatory response by decreasing the potassium conductance (Zhang and Arsenault, 2005; Andrade, 2011a) or by mediating a calcium-sensitive nonspecific cation conductance (Villalobos et al., 2005; Zhang and Arsenault, 2005). *In vivo* and *in vitro* studies demonstrate that 5-HT evokes different responses in pyramidal cells (inhibitions, excitations, and biphasic responses), but the overall effect is overwhelmingly inhibitory (Puig et al., 2005).

In addition to modulating neuronal excitability, 5-HT1A and 5-HT2A receptors can also modulate synaptic transmission. For example, 5-HT1A receptor activation can decrease the function of AMPA (Cai et al., 2002) and NMDA (Cai et al., 2002; Zhong et al., 2008) receptors. In contrast, 5-HT2A receptor activation can enhance the function of AMPA (Cai et al., 2002) and NMDA (Yuen et al., 2005) receptors. Activation of 5-HT2A receptors inhibits GABA<sub>A</sub> function through phosphorylation of GABA<sub>A</sub> receptors (Feng et al., 2001; Zhong and Yan, 2004).

At the neuronal network level, it has been found that DA injected in the PFC of anesthetized rats enhances hippocampal-prefrontal coherence in the theta band oscillation (Benchenane et al., 2010), which could be due to DA modulating the GABAergic inhibition (Tierney et al., 2008). Blocking D1 receptors is known to increase alpha and beta band oscillations in local field potentials, more so for novel than for familiar associations (Puig and Miller, 2012). Increasing extracellular DA with genetic polymorphism of the dopamine transporter (DAT1) in humans can enhance the evoked gamma response to a stimulus (Demiralp et al., 2007). 5-HT can also increase the frequency and amplitude of slow waves by promoting the UP states in the PFC via activation of 5-HT2A receptors, suggesting an excitatory effect under *in vivo* conditions (Puig et al., 2010). 5-HT2A/2C receptor agonists/antagonists have also been found to synchronize/desynchronize frontal cortical oscillations in anesthetized rats (Budzinska, 2009).

Dysregulation of DA and 5-HT in the PFC, and abnormal neural activity levels and oscillations in the PFC are implicated in various mental illnesses such as schizophrenia, attention deficit hyperactivity disorder, depression and addiction (Basar and Guntekin, 2008; Robbins and Arnsten, 2009; Ross and Peselow, 2009; Artigas, 2010; Curatolo et al., 2010; Arnsten, 2011; Meyer, 2012; Noori et al., 2012). Abnormal cortical oscillations can be observed in various neurological and psychiatric disorders, and in particular, disrupted beta (12–30 Hz) and gamma (30–80 Hz) band oscillations are found in schizophrenia, major depression and bipolar disorder (Spencer et al., 2003; Cho et al., 2006; Uhlhaas and Singer, 2006; Basar and Guntekin, 2008; Gonzalez-Burgos and Lewis, 2008; Gonzalez-Burgos et al., 2010; Uhlhaas and Singer, 2010, 2012). For example, schizophrenic patients have enhanced power in the beta2 (16.5–20 Hz) frequency band in the frontal cortex as compared to controls (Merlo et al., 1998; Venables et al., 2009). Beta band oscillation in the frontal cortex in a rat model of Parkinson's disease is also abnormally high compared to controls (Sharott et al., 2005). These mental disorders are usually treated with neuropharmacological drugs that target the DA and/or 5-HT systems (Di Pietro and Seamans, 2007; Bolasco et al., 2010; Poewe et al., 2010; Meltzer and Massey, 2011), which also seem to influence brain rhythms (Kleinlogel et al., 1997; Nichols, 2004; Sharott et al., 2005; Budzinska, 2009).

Although there have been extensive investigations on the modulation of DA and 5-HT on the PFC, little is known about their comodulation effects on the PFC network dynamics and their potential applications in drug treatments (Diaz-Mataix et al., 2005; Di Pietro and Seamans, 2007; Artigas, 2010). In fact, many of the DA and 5-HT induced intracellular signaling pathways overlap (Amargos-Bosch et al., 2004; Santana et al., 2004; Di Pietro and Seamans, 2007; Esposito et al., 2008; Santana et al., 2009), suggesting that DA and 5-HT may cooperatively modulate PFC activity. One notable study found that coadministration of a 5-HT2A antagonist with a D2 antagonist in the PFC significantly increases DA release, to a level greater than that induced by either antagonist alone (Westerink et al., 2001). A recent study found that co-application of DA and 5-HT can increase the evoked excitability of certain PFC pyramidal cells (the gain of the neuronal input-output response) more than when either was applied alone, while the activities of other pyramidal cells are more strongly suppressed (Di Pietro and Seamans, 2011). Furthermore, the same study shows that prior DA or 5-HT application can potentiate the subsequent effect of the other.

In this work, we integrate the essential available experimental findings into a biologically motivated computational model to provide insights into the possible PFC dynamics caused by the comodulation of DA and 5-HT. The focus will be on tonic DA and 5-HT modulations, and their effects on higher frequency band oscillations.

### **2. MATERIALS AND METHODS**

The computational models in this work implement DA and 5-HT comodulation at the neuronal and synaptic levels. In the following, we describe the various neuronal constituents of the PFC modulated by DA and 5-HT.

### **2.1. SUBGROUPS OF PFC NEURONS**

The modulation of DA and 5-HT on the neuronal activity depends on the specific receptor subtypes and their combinations since they can evoke different intracellular signaling pathways. Therefore, we divide the pyramidal cells and inhibitory interneurons into subgroups according to their expression of receptors (see **Table 1**). For simplicity, we omit neurons that do not express DA or 5-HT receptors. We also ignore pyramidal cells expressing DA or 5-HT receptors only, and those co-expressing 5-HT2A and D1-like receptors due to the relatively lower expression of 5-HT2A receptors. Thus, we consider 4 subgroups of pyramidal cells expressing the following combinations of receptors: D1+5-HT1A, D2+5-HT1A, D1+5-HT1A+5-HT2A, and D2+5-HT1A+5-HT2A. For inhibitory interneurons, we also consider 4 subgroups: D1+5-HT1A, D2+5-HT1A, D1+5-HT2A, and D2+5-HT2A.

The instantaneous population firing rate for each neuronal subgroup follows the established dynamics (Wilson and Cowan, 1972; Dayan and Abbott, 2001; Murphy and Miller, 2009):

$$
\tau_i \frac{dr_i}{dt} = -r_i + f_i(I_{i,\text{syn}}) \tag{1}
$$

where *i* is the index for the subgroup of neurons. For pyramidal cells, *i* = 1 to 4 denote the subpopulations expressing D1+5-HT1A, D2+5-HT1A, D1+5-HT1A+5-HT2A, and D2+5-HT1A+5-HT2A, respectively. For interneurons, *i* = 5 to 8 denote the subgroups expressing D1+5-HT1A, D1+5-HT2A, D2+5-HT1A, and D2+5-HT2A, respectively. τ<sub>*i*</sub> is the neuronal membrane time constant, set at 10 ms for pyramidal cells and 15 ms for inhibitory neurons (McCormick et al., 1985). *I*<sub>*i*,syn</sub> is the total synaptic current to the *i*-th subgroup. The activation function *f<sub>i</sub>*(*I*<sub>*i*,syn</sub>) follows the established form (Eckhoff et al., 2011):

$$f(I_{\rm syn}) = \frac{CI_{\rm syn} - I_L}{1 - \exp[-g(CI_{\rm syn} - I_L)] + (CI_{\rm syn} - I_L)/r_{\rm max}} \tag{2}$$

where *r*<sub>max</sub> denotes the saturated firing rate, set at 80 Hz for pyramidal cells and 120 Hz for inhibitory neurons. *C* is the gain, set at 300 Hz/nA for pyramidal cells and 500 Hz/nA for inhibitory interneurons. *I<sub>L</sub>* is associated with the membrane leakage current and is set at 150 Hz for pyramidal cells and 180 Hz for inhibitory interneurons. The curvature of the activation function, *g*, is 0.2 Hz<sup>−1</sup>.

**Table 1 | Percentage of prefrontal cortical neurons expressing DA and 5-HT receptor subtypes∗.**

*\*Data adapted from references, Gaspar et al., 1995; Amargos-Bosch et al., 2004; Santana et al., 2004, 2009; Andrade, 2011a; Puig, 2011.*
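As a minimal sketch, the activation function of Equation (2) can be evaluated with the parameter values quoted in the text. The function name `activation`, the dictionary layout, and the handling of the removable singularity at *CI*syn = *IL* are illustrative assumptions, not part of the original model description.

```python
import math

# Parameter values quoted in the text (rates in Hz, currents in nA).
PARAMS = {
    "pyr": {"C": 300.0, "I_L": 150.0, "r_max": 80.0},   # pyramidal cells
    "int": {"C": 500.0, "I_L": 180.0, "r_max": 120.0},  # inhibitory interneurons
}
G_CURVATURE = 0.2  # curvature g, in 1/Hz


def activation(i_syn_nA, cell_type):
    """Firing rate (Hz) from Equation (2) for a total synaptic current in nA."""
    p = PARAMS[cell_type]
    x = p["C"] * i_syn_nA - p["I_L"]  # net drive, in Hz
    if abs(x) < 1e-9:
        # Limit of Equation (2) as x -> 0 (assumption: treat 0/0 by its limit).
        return 1.0 / (G_CURVATURE + 1.0 / p["r_max"])
    if -G_CURVATURE * x > 700.0:
        # exp() would overflow; the rate is numerically zero here.
        return 0.0
    return x / (1.0 - math.exp(-G_CURVATURE * x) + x / p["r_max"])


if __name__ == "__main__":
    for current in (0.0, 0.4, 0.6, 1.0, 10.0):
        print(current, round(activation(current, "pyr"), 3))
```

The rate is close to zero for weak input, rises smoothly around *CI*syn ≈ *IL*, and stays below *r*max for strong input.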

### **2.2. MODULATION OF NEURONAL EXCITABILITY**

In DA modulation, although D1 and D2 receptors can act on different signal transduction pathways, they can effectively attain opposite effects (e.g., D1 receptors activate protein kinase A while D2 receptors inactivate it) (Trantham-Davidson et al., 2004; Beaulieu and Gainetdinov, 2011; Tritsch and Sabatini, 2012). Therefore, we model the modulation of D1 (D2) on neuronal activity by increasing (decreasing) the gain factor *C* and decreasing (increasing) the leakage factor *I<sub>L</sub>* of the input-output function (Thurley et al., 2008).

With regard to 5-HT modulation, experiments have shown that 5-HT1A can hyperpolarize the neurons through the activation of G protein-gated inwardly rectifying K<sup>+</sup> channels, while 5-HT2A activation can induce slow membrane depolarization and inhibition of the slow after-hyperpolarization, which increases membrane excitability (Andrade et al., 1986; Beique et al., 2004; Goodfellow et al., 2009; Andrade, 2011a). 5-HT1A receptors are often localized on the axon initial segment and soma of pyramidal neurons where they act to suppress action potential generation, while 5-HT2A receptors are abundant in apical dendrites where they can amplify the synaptic current (Amargos-Bosch et al., 2004; Santana et al., 2004). 5-HT2A receptors are also found to increase the gain of the input-output relationship of pyramidal neurons (Zhang and Arsenault, 2005). Therefore, the modulation of 5-HT on neuronal activity can be modeled by an increase in the leakage factor *I<sub>L</sub>* (for 5-HT1A) and an increase in the gain factor *C* (for 5-HT2A).

The activation of D1, D2, 5-HT1A and 5-HT2A receptors is concentration dependent (Trantham-Davidson et al., 2004; Hurley, 2006; Solt et al., 2007). For simplicity, we apply the sigmoid function to describe the concentration dependent modulation of DA and 5-HT on the gain and leak factors (**Tables 2** and **3**), similar to previous work (Fellous and Linster, 1988; Scheler, 2004). In these formulae, [DA]<sub>1</sub> denotes the half maximal effective concentration (EC50) of DA for the D1 receptor, and [DA]<sub>2</sub> is the EC50 of DA for the D2 receptor. Similarly, [5-HT]<sub>1</sub> and [5-HT]<sub>2</sub> are the EC50 for 5-HT1A and 5-HT2A receptors, respectively. In brief, we depict the modulation of DA and 5-HT on neuronal activity by changing the activation function through multiplying *C* with a gain factor shown in **Table 2**, and *I<sub>L</sub>* with the leakage factor shown in **Table 3**.

The concentrations of DA and 5-HT in **Tables 2** and **3** can be inferred from experiments such as those using microdialysis and voltammetry techniques. The basal extracellular DA and 5-HT concentrations in the PFC are approximately 0.2–2.5 nM/L in the resting condition, and can increase by as much as 10–200% during behavioral tasks (Adell et al., 1991; Watanabe et al., 1997; Lena et al., 2005; Winstanley et al., 2006; Rogoz and Golembiowska, 2010; Seeman, 2010; Staiti et al., 2011; van Dijk et al., 2012).



**Table 3 | Leakage factor for the neuronal subgroups.**


*Pyr, Pyramidal cells; Int, Interneurons.*

*See text for parameter values.*

D1- and D2-like receptors can have a high affinity state with a binding constant around the nM/L level or a low affinity state with a binding constant around the μM/L level (Richfield et al., 1989). Since we are studying only tonic concentration levels, we focus on the high affinity receptors only, assuming that low affinity ones are activated more in the phasic or evoked mode, e.g., during behavioral tasks. In particular, we assume that the high affinity D1- and D2-like receptors are sensitive within a range of 0–50 nM/L with different EC50 values. We chose [DA]<sub>1</sub> = 4 nM/L and [DA]<sub>2</sub> = 8 nM/L (Koshkina, 2006), so that lower [DA] activates D1 receptors only while higher [DA] activates both D1 and D2 receptors, similar to the observed activation order of these receptors depending on DA concentration (Trantham-Davidson et al., 2004).

Similarly, 5-HT1A and 5-HT2A receptors can also operate in high-affinity and low-affinity states (Glennon et al., 1998; Watson et al., 2000). In this work, we assume that 5-HT1A and 5-HT2A receptors operate in the high-affinity state since tonic 5-HT concentration at the nM/L level is far below the affinity of 5-HT for the low agonist affinity state (Watson et al., 2000), and we vary the 5-HT concentration within the range 0–5 nM/L. The affinity of 5-HT for 5-HT1 receptors is higher than for most other subtypes of 5-HT receptors, and lower concentrations favor 5-HT1A receptor activation (Ramage, 2010). A recent study shows that 5-HT1A receptors have a lower EC50 than 5-HT2A receptors in the concentration-electrophysiological response relationship (Goodfellow and Lambe, 2009). Thus, we adopt a lower affinity for 5-HT2A receptors than for 5-HT1A receptors. In particular, we assign [5-HT]<sub>1</sub> = 1 nM and [5-HT]<sub>2</sub> = 2 nM. For brevity, we shall drop the 1/L units for the dopamine and serotonin concentrations.

*Pyr, Pyramidal cells; Int, Interneurons. See text for parameter values.*

The specific gain modulation parameter values are as follows: ε<sub>1</sub> = 0.15 for D1, ε<sub>2</sub> = 0.1 for D2, and δ<sub>2</sub> = 0.2 for 5-HT2A. The parameter values reflecting the curvature of gain modulation are chosen as: β<sub>1</sub> = β<sub>2</sub> = 1/nM for D1 and D2, and α<sub>2</sub> = 4/nM for 5-HT2A. The parameters describing the amplitude of leak modulation are chosen as: ε̄<sub>1</sub> = 0.15 for D1, ε̄<sub>2</sub> = 0.1 for D2, and δ̄<sub>1</sub> = 0.15 for 5-HT1A. The parameters reflecting the curvature of leak modulation are chosen as: β̄<sub>1</sub> = 1/nM for D1, β̄<sub>2</sub> = 1/nM for D2, and ᾱ<sub>1</sub> = 4/nM for 5-HT1A. The modulation factors due to D1, D2, 5-HT1A and 5-HT2A receptors are shown in **Figure 1**.
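The sigmoid modulation of the gain and leak factors can be sketched as follows. Because Tables 2 and 3 are not reproduced here, the functional form (one plus or minus an amplitude times a logistic function of concentration) is an assumption, chosen to agree with the signs described in the text and with the 5-HT expression given in Section 3.1. The names `gain_factor` and `leak_factor` are hypothetical.

```python
import math

# EC50 values (nM) and (amplitude, curvature in 1/nM) pairs quoted in the text.
EC50 = {"D1": 4.0, "D2": 8.0, "5-HT1A": 1.0, "5-HT2A": 2.0}
GAIN = {"D1": (+0.15, 1.0), "D2": (-0.10, 1.0), "5-HT2A": (+0.20, 4.0)}
LEAK = {"D1": (-0.15, 1.0), "D2": (+0.10, 1.0), "5-HT1A": (+0.15, 4.0)}


def logistic(conc_nM, ec50_nM, curvature):
    """Fraction of the maximal effect reached at a given concentration."""
    return 1.0 / (1.0 + math.exp(-curvature * (conc_nM - ec50_nM)))


def gain_factor(receptor, conc_nM):
    """Assumed multiplier on the gain C: D1 and 5-HT2A raise it, D2 lowers it."""
    amplitude, curvature = GAIN[receptor]
    return 1.0 + amplitude * logistic(conc_nM, EC50[receptor], curvature)


def leak_factor(receptor, conc_nM):
    """Assumed multiplier on the leak I_L: D1 lowers it, D2 and 5-HT1A raise it."""
    amplitude, curvature = LEAK[receptor]
    return 1.0 + amplitude * logistic(conc_nM, EC50[receptor], curvature)


if __name__ == "__main__":
    for da in (0.0, 4.0, 8.0, 20.0):
        print(da, gain_factor("D1", da), gain_factor("D2", da))
```

At the EC50 each factor sits halfway between 1 and its saturated value, for example a D1 gain factor of 1.075 at [DA] = 4 nM.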

function of DA concentration **(A)**; modulation factor due to 5-HT1A (5-HT2A) receptors as a function of 5-HT concentration **(B)**.

### **2.3. SYNAPTIC CURRENTS**

The synaptic currents are mediated by AMPA, NMDA, and GABA receptors. We adopt an all-to-all connectivity among neuronal subgroups, thus, the synaptic currents to one neuron in the *i*-th subgroup can be approximated by summing all presynaptic neurons (*Nj*) and normalized by the neurons in the *i*-th subgroup (*Ni*):

$$I_{i,\text{syn}} = \sum_{j=1}^{4} \left(G_{A,ij} S_{A,j} + G_{N,ij} S_{N,j}\right) \frac{N_j}{N_i} - \sum_{j=5}^{8} G_{G,ij} S_{G,j} \frac{N_j}{N_i} + \tau_A G_{A,\text{ext},i}\, r_{\text{ext}} \tag{3}$$

where the term τ<sub>*A*</sub>*G*<sub>*A*,ext,*i*</sub>*r*<sub>ext</sub> describes the constant part of the AMPA mediated background Poisson input with rate *r*<sub>ext</sub> (2.4 kHz) (Wong and Wang, 2006). Unlike Wong and Wang (2006) and Eckhoff et al. (2011), we do not include synaptic noise in the model as we find that noise does not significantly affect our results (not shown).

*Pyr, Pyramidal cells; Int, Interneurons.*

*G<sub>X,ij</sub>* in Equation (3) denotes the averaged coefficient or strength of the synaptic currents mediated by receptors *X* (*A* for AMPA, *N* for NMDA, and *G* for GABA) from neuron *j* to *i*. Their values are constrained by the experimentally observed neural circuit oscillation frequencies and are assigned as: *G<sub>A,PP</sub>* = 4.42 nA, *G<sub>A,IP</sub>* = 4.21 nA, *G<sub>N,PP</sub>* = 0.10 nA, *G<sub>N,IP</sub>* = 0.83 nA, *G<sub>G,PI</sub>* = 2.275 nA, *G<sub>G,II</sub>* = 1.75 nA, *G*<sub>*A*,ext,*P*</sub> = 0.0929 nA, and *G*<sub>*A*,ext,*I*</sub> = 0.0716 nA. *S<sub>X,j</sub>* is the averaged synaptic gating variable mediated by receptors *X* expressed on neurons in the *j*-th subgroup and follows the established dynamical forms (Brunel and Wang, 2001; Wong and Wang, 2006; Eckhoff et al., 2011):

$$\frac{dS_{A,j}}{dt} = -\frac{S_{A,j}}{\tau_A} + \frac{r_j}{1000} \tag{4}$$

$$\frac{dS_{N,j}}{dt} = -\frac{S_{N,j}}{\tau_N} + 0.641(1 - S_{N,j})\frac{r_j}{1000} \tag{5}$$

$$\frac{dS_{G,j}}{dt} = -\frac{S_{G,j}}{\tau_G} + \frac{r_j}{1000} \tag{6}$$

where τ<sub>*A*</sub> = 2 ms, τ<sub>*N*</sub> = 100 ms, and τ<sub>*G*</sub> = 10 ms are the decay time constants for AMPA, NMDA, and GABA receptors, respectively. The fraction *N<sub>j</sub>*/*N<sub>i</sub>* can be approximated according to the experimental observations on the distribution of DA and 5-HT receptors (Pazos and Palacios, 1985; Amargos-Bosch et al., 2004; Beique et al., 2004; Santana et al., 2004; de Almeida and Mengod, 2007; Santana et al., 2009). Based on experimental observations (see **Table 1**), we assume that the ratio for pyramidal cells expressing D1 to D2 receptors is approximately 25:15%, and the ratio for interneurons expressing D1 to D2 receptors is approximately 30:10%. We also assume that the ratio of pyramidal cells or interneurons solely expressing 5-HT1A receptors to those expressing both 5-HT1A and 5-HT2A receptors is approximately 50:50%. Moreover, pyramidal cells expressing D2 receptors are often found in apposition with GABAergic cells not expressing D2 receptors, so that the synaptic currents from inhibitory neurons expressing D2 receptors are smaller than those from inhibitory neurons expressing D1 receptors (de Almeida and Mengod, 2007). We therefore specify the fraction from D2 expressing interneurons to D2 expressing pyramidal cells as a fifth of that from D1 expressing interneurons. **Table 4** lists the above-mentioned fractions, where the fraction from the *j*-th to *i*-th subgroup is the value at the *i*-th row and *j*-th column. When simulating the two-population model, we ignore other subgroups by setting the irrelevant connections to zero.
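To make the dynamics concrete, the sketch below integrates Equations (1) and (4)–(6) for a single pyramidal-interneuron pair with forward Euler. It rests on several assumptions that are not taken from the source: each recurrent current is read as the product of a coefficient *G* and a gating variable *S*, the fractions *N<sub>j</sub>*/*N<sub>i</sub>* are set to 1 because Table 4 is not reproduced here, no DA or 5-HT modulation is applied, and the step size, initial rates, and function names are arbitrary. It illustrates the integration scheme only and is not expected to reproduce the published oscillations.

```python
import math

# Values quoted in the text (time in ms, rates in Hz, currents in nA).
TAU = {"P": 10.0, "I": 15.0}        # membrane time constants
C = {"P": 300.0, "I": 500.0}        # gain, Hz/nA
I_LEAK = {"P": 150.0, "I": 180.0}   # leak term, Hz
R_MAX = {"P": 80.0, "I": 120.0}     # saturated rate, Hz
G_CURV = 0.2                        # curvature g, 1/Hz
TAU_A, TAU_N, TAU_G = 2.0, 100.0, 10.0
G_A = {"PP": 4.42, "IP": 4.21}      # AMPA, P -> P and P -> I
G_N = {"PP": 0.10, "IP": 0.83}      # NMDA, P -> P and P -> I
G_G = {"PI": 2.275, "II": 1.75}     # GABA, I -> P and I -> I
G_EXT = {"P": 0.0929, "I": 0.0716}
R_EXT = 2400.0                      # background rate, Hz


def activation(i_syn, pop):
    """Equation (2); the x -> 0 limit and overflow guard are added assumptions."""
    x = C[pop] * i_syn - I_LEAK[pop]
    if abs(x) < 1e-9:
        return 1.0 / (G_CURV + 1.0 / R_MAX[pop])
    if -G_CURV * x > 700.0:
        return 0.0
    return x / (1.0 - math.exp(-G_CURV * x) + x / R_MAX[pop])


def simulate(t_end_ms=500.0, dt=0.05):
    """Forward-Euler integration; returns a list of (r_P, r_I) pairs in Hz."""
    r = {"P": 1.0, "I": 1.0}        # arbitrary initial rates
    s_a = s_n = s_g = 0.0
    ext = TAU_A * R_EXT / 1000.0    # tau_A * r_ext, dimensionless
    trace = []
    for _ in range(round(t_end_ms / dt)):
        i_syn = {
            "P": G_A["PP"] * s_a + G_N["PP"] * s_n - G_G["PI"] * s_g + G_EXT["P"] * ext,
            "I": G_A["IP"] * s_a + G_N["IP"] * s_n - G_G["II"] * s_g + G_EXT["I"] * ext,
        }
        new_r = {p: r[p] + dt / TAU[p] * (-r[p] + activation(i_syn[p], p)) for p in r}
        s_a += dt * (-s_a / TAU_A + r["P"] / 1000.0)
        s_n += dt * (-s_n / TAU_N + 0.641 * (1.0 - s_n) * r["P"] / 1000.0)
        s_g += dt * (-s_g / TAU_G + r["I"] / 1000.0)
        r = new_r
        trace.append((r["P"], r["I"]))
    return trace


if __name__ == "__main__":
    rates = simulate()
    print("final rates (Hz):", rates[-1])
```

Multiplying `C` and `I_LEAK` by concentration-dependent factors before calling `simulate` would be one way to explore modulation, again under the assumptions stated above.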

### **2.4. MODULATION OF SYNAPTIC TRANSMISSION**

The synaptic current coefficients or strengths, *G<sub>X,ij</sub>*, can be modulated by DA and 5-HT through the activation of D1, D2, 5-HT1A, and 5-HT2A receptors. Studies have demonstrated that D1-like receptors can enhance AMPA, NMDA, and GABA mediated synaptic currents or receptor expression, while D2-like receptors decrease them (Seamans et al., 2001b; Gorelova et al., 2002; Thurley et al., 2008). Similar to the modulation of DA on synaptic currents, 5-HT also bidirectionally modulates the synaptic currents through the activation of different receptors: 5-HT1A receptors reduce AMPA and NMDA mediated currents (Cai et al., 2002; Zhong et al., 2008), while 5-HT2A receptors increase them. Activation of 5-HT2A receptors on pyramidal cells can reduce GABA mediated currents (Feng et al., 2001) while 5-HT1A receptors may suppress the presynaptic GABAergic release in interneurons (Yan, 2002).

We adopt a scalar, sigmoid function factor for the modulation of synaptic current coefficients (Fellous and Linster, 1988; Scheler, 2004). In principle, as all 4 considered receptors can modulate the three (AMPA, NMDA and GABA mediated) synaptic currents, we have 12 possible modulation factors (**Table 5**). The parameters λ<sub>*ij*</sub> and κ<sub>*ij*</sub> depict the modulatory effects of the *i*-th receptor on the *j*-th synaptic type. We set the amplitude λ<sub>*ij*</sub> = 0.2 and the curvature κ<sub>*ij*</sub> = 1/nM for *i* = 1, 2 and κ<sub>*ij*</sub> = 4/nM for *i* = 3, 4.

Usually, the synaptic modulation factors are determined by the specific types of receptors expressed on the postsynaptic neuron. For example, GABA mediated currents from a D1+5-HT2A expressing (presynaptic) interneuron to D1+5-HT2A expressing (postsynaptic) pyramidal cells should be modulated by the factor

$$\left(1 + \frac{\lambda_{23}}{1 + e^{-\kappa_{23}([\text{DA}] - [\text{DA}]_1)}}\right) \times \left(1 - \frac{\lambda_{43}}{1 + e^{-\kappa_{43}([\text{5-HT}] - [\text{5-HT}]_2)}}\right).$$

**Figure 2A** shows that it is more effective to have [DA] and [5-HT] increase or decrease in the same direction. One exception is the modulation of 5-HT1A on GABA-mediated synaptic currents. As mentioned above, 5-HT1A receptors can modulate the GABA-mediated currents via a presynaptic mechanism by reducing the GABAergic release (Yan, 2002), which effectively reduces the GABA-mediated currents. Thus, we assume that only presynaptically expressed 5-HT1A receptors can modulate GABA-mediated currents, by the factor

$$1 - \frac{\lambda_{33}}{1 + \exp[-\kappa_{33}([\text{5-HT}] - [\text{5-HT}]_1)]}.$$

For example, if the postsynaptic pyramidal neuron expresses D2, 5-HT1A and 5-HT2A receptors, while the presynaptic

**FIGURE 2 | Examples of synaptic current modulations. (A)** Modulation factor of GABA mediated current from an inhibitory interneuron to a pyramidal cell, both expressing D1 and 5-HT2A receptors. **(B)** Modulation factor of an NMDA- or AMPA-mediated synaptic current by a presynaptic inhibitory neuron expressing D1 and 5-HT1A receptors and a postsynaptic pyramidal cell expressing D1 and 5-HT2A receptors.


*Top to bottom rows: modulation factors due to D1, D2, 5-HT1A, and 5-HT2A receptors, respectively.*

*\*As activation of 5-HT1A receptors on an interneuron can reduce GABA release, an assumption is made in which GABA-mediated currents are modulated only by presynaptic 5-HT1A receptors on an interneuron and not postsynaptic 5-HT1A receptors.*
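The combined factor described in Section 2.4 for a GABA-mediated current onto a D1+5-HT2A expressing pyramidal cell can be sketched numerically as below. The amplitude 0.2, the curvatures 1/nM (DA receptors) and 4/nM (5-HT receptors), and the EC50 values are those quoted in the text; the function name and the choice of sample concentrations are assumptions made for illustration.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def gaba_factor_d1_5ht2a(da_nM, ht_nM, amplitude=0.2,
                         kappa_da=1.0, kappa_ht=4.0,
                         ec50_d1=4.0, ec50_5ht2a=2.0):
    """Product of a D1 enhancement and a 5-HT2A reduction of the GABA current."""
    d1_term = 1.0 + amplitude * logistic(kappa_da * (da_nM - ec50_d1))
    ht2a_term = 1.0 - amplitude * logistic(kappa_ht * (ht_nM - ec50_5ht2a))
    return d1_term * ht2a_term


if __name__ == "__main__":
    for da in (0.0, 4.0, 20.0):
        row = [round(gaba_factor_d1_5ht2a(da, ht), 3) for ht in (0.0, 2.0, 5.0)]
        print("[DA] =", da, "nM:", row)
```

With these values the factor grows with [DA] and shrinks with [5-HT], so it ranges from roughly 0.8 (low DA, high 5-HT) to roughly 1.2 (high DA, low 5-HT).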

### **3. RESULTS**

In the following, we first investigate the effects of DA and 5-HT modulation on individual PFC neurons, and then on various coupled excitatory-inhibitory PFC circuits. After that, we investigate how a more realistic heterogeneous multi-population network model is modulated by DA and 5-HT concentration levels and selective receptor agonists/antagonists.

### **3.1. NON-MONOTONIC MODULATION OF 5-HT ON PYRAMIDAL CELL COEXPRESSING 5-HT1A AND 5-HT2A RECEPTORS**

As previously mentioned, our model assumes that D1 and D2 receptors are expressed on distinct neuronal populations. Thus, it is expected that the modulation of DA on single neuronal activity should monotonically depend on the extracellular DA concentration. Similarly, for neurons solely expressing 5-HT1A or 5-HT2A receptors, the neuronal activity will also monotonically depend on extracellular 5-HT concentration. However, for pyramidal cells coexpressing 5-HT1A and 5-HT2A receptors, the modulation of 5-HT is not monotonic. The steady firing rate for such a pyramidal cell is

$$r = \frac{I}{1 - e^{-gI} + I/r_{\rm max}}$$

with

$$I = \left(1 + \frac{\delta_2}{1 + e^{-\alpha_2([\text{5-HT}] - [\text{5-HT}]_2)}}\right) \times \left(1 + \frac{\lambda_{41}}{1 + e^{-\kappa_{41}([\text{5-HT}] - [\text{5-HT}]_2)}}\right) C_P I_{\rm syn} - \left(1 + \frac{\bar{\delta}_1}{1 + e^{-\bar{\alpha}_1([\text{5-HT}] - [\text{5-HT}]_1)}}\right) I_L.$$

For any constant *I*<sub>syn</sub>, the combined modulation effect is found to be non-monotonic (**Figure 3**). Lower 5-HT concentration initially decreases the neuronal firing rate because of the activation of the high-affinity 5-HT1A receptors. But higher concentration of 5-HT subsequently increases the neuronal firing rate by activating the lower-affinity 5-HT2A receptors.
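The non-monotonic dependence can be checked with a short numerical sketch of the steady-rate expression in this section, using the parameter values quoted in the Methods. The fixed input *I*syn = 0.6 nA and the concentration grid are arbitrary assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def steady_rate(ht_nM, i_syn_nA=0.6):
    """Steady firing rate (Hz) of a pyramidal cell coexpressing 5-HT1A and 5-HT2A."""
    # 5-HT2A: gain factor (delta_2 = 0.2) times AMPA factor (lambda_41 = 0.2),
    # both with curvature 4/nM and EC50 = 2 nM.
    gain = (1.0 + 0.2 * logistic(4.0 * (ht_nM - 2.0))) ** 2
    # 5-HT1A: leak factor with amplitude 0.15, curvature 4/nM, EC50 = 1 nM.
    leak = 1.0 + 0.15 * logistic(4.0 * (ht_nM - 1.0))
    drive = gain * 300.0 * i_syn_nA - leak * 150.0  # C_P = 300 Hz/nA, I_L = 150 Hz
    if abs(drive) < 1e-9:
        return 1.0 / (0.2 + 1.0 / 80.0)  # limit of the rate expression at zero drive
    return drive / (1.0 - math.exp(-0.2 * drive) + drive / 80.0)


concentrations = [0.1 * k for k in range(51)]  # 0 to 5 nM
rates = [steady_rate(c) for c in concentrations]

if __name__ == "__main__":
    k_min = rates.index(min(rates))
    print("rate at 0 nM:", round(rates[0], 2))
    print("minimum near", round(concentrations[k_min], 1), "nM:", round(rates[k_min], 2))
    print("rate at 5 nM:", round(rates[-1], 2))
```

Under these assumptions the rate first falls as 5-HT1A receptors engage around 1 nM and then rises above its starting value once 5-HT2A receptors engage around 2 nM.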

### **3.2. MODULATION ON TWO-POPULATION EXCITATORY-INHIBITORY NEURONAL NETWORKS**

The simplest "canonical" cortical column and its oscillatory behavior can be modeled by an excitatory neuronal population mutually coupled to an inhibitory neuronal population (Wilson and Cowan, 1972). In our model, for every pair of coupled excitatory and inhibitory neuronal populations considered, we set the parameters (i.e., the fractions in **Table 4**) describing the other (six) populations to be zero. Based on the expression of the various receptors (**Tables 2** or **3**), there are 4 × 4 = 16 combinations of excitatory and inhibitory neurons in the two-population network model. In general, we find that if the network consists of pyramidal cells which express D2 receptors (Pyr2 or Pyr4), the network cannot attain oscillatory behavior over the ranges of [DA] and [5-HT] explored. However, if the network includes D1-expressing pyramidal cells (Pyr1 or Pyr3), a rich repertoire of dynamical behavior can be produced with varying [DA] and [5-HT], as described below.

**Figure 4** shows the results for the Pyr1-type (D1+5-HT1A) pyramidal cells coupled to Int1-type (D1+5-HT1A) interneurons. **Figure 4A** shows an example of the firing rate time course of Pyr1 neurons for [DA] = 7 nM. A higher [5-HT] level results in lower oscillation amplitude but higher frequency. The firing rate time courses for the inhibitory Int1 populations look similar (not shown). At intermediate [DA], the oscillation frequency decreases with increasing [DA] while remaining within the high beta range (**Figure 4B**).

**Figure 5A1** (green lines) summarizes the oscillation amplitudes over a range of [DA] levels. The green lines denote the maximum (top) and minimum (bottom) firing rates during oscillation. The red lines represent collections of unstable steady states (specifically, unstable fixed points), while the black lines represent the asynchronous stable steady states (stable fixed points). With sufficiently low [DA], oscillations disappear through a bifurcation (specifically, a Hopf bifurcation) (Strogatz, 2001), such that the neurons in the network fire tonically and asynchronously at stable rates. A higher [5-HT] level shifts the onset of oscillation (the bifurcation point) rightward (compare dotted and bold), which means that a higher [DA] level is required to maintain the oscillations. Moreover, the range of oscillation amplitudes is also significantly more constrained. Globally, we can also map out the network behavior with respect to [DA] and [5-HT], i.e., a phase diagram. The phase diagram in **Figure 5A2** shows that oscillation can occur when [DA] is sufficiently high (∼7 nM) regardless of the [5-HT] level. The inset in **Figure 5A2** replicates **Figure 4B**.

**Pyr1-type (D1+5-HT1A) and Int1 (D1+5-HT1A) neurons. (A)** Firing rate time course of pyramidal cells with [DA] = 7 nM and [5-HT] = 0.3 nM (dotted) or 2 nM (solid). **(B)** Oscillation frequency decreases with increasing [DA]. [5-HT] = 0.3 nM (dotted) and 2 nM (solid).
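The loss of stability at a Hopf bifurcation, mentioned above, can be sketched with a linearized two-population excitatory-inhibitory rate model. This is a generic textbook illustration, not the model of this paper: the linear system, the weights `w_ee`, `w_ei`, `w_ie`, and the 10 ms time constants are all hypothetical. Here `w_ee` stands in loosely for any modulation that raises the effective excitatory gain. The fixed point is stable while the largest real part of the Jacobian eigenvalues is negative, and a Hopf bifurcation occurs where a complex-conjugate pair crosses the imaginary axis.

```python
import cmath
import math

def ei_jacobian(w_ee, w_ei, w_ie, tau_e, tau_i):
    """Jacobian (a, b, c, d) of the linear E-I rate model
       tau_e dE/dt = -E + w_ee*E - w_ei*I
       tau_i dI/dt = -I + w_ie*E
    All weights and time constants are hypothetical."""
    return (
        (w_ee - 1.0) / tau_e, -w_ei / tau_e,
        w_ie / tau_i, -1.0 / tau_i,
    )

def eigenvalues(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    a, b, c, d = jac
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 + root, trace / 2.0 - root

def growth_and_frequency(jac):
    """Return (largest real part in 1/s, oscillation frequency in Hz)."""
    lam1, lam2 = eigenvalues(jac)
    growth = max(lam1.real, lam2.real)
    freq_hz = abs(lam1.imag) / (2.0 * math.pi)
    return growth, freq_hz

if __name__ == "__main__":
    for w_ee in (1.5, 2.0, 2.5):
        jac = ei_jacobian(w_ee, w_ei=2.0, w_ie=2.0, tau_e=0.01, tau_i=0.01)
        growth, freq = growth_and_frequency(jac)
        state = "stable" if growth < 0 else "marginal or unstable"
        print(f"w_ee={w_ee}: growth={growth:.1f}/s, f={freq:.1f} Hz, {state}")
```

In this toy system the eigenvalue pair crosses the imaginary axis at `w_ee = 2`. In a nonlinear rate model, the growing oscillation beyond that point would be bounded by the saturating firing-rate function and would settle onto a limit cycle.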

**Figures 5B–D** show the modulation by [DA] and [5-HT] of networks composed of Pyr1-type pyramidal cells and the remaining interneuronal types: Int2-type (D1+5-HT2A) inhibitory neurons (B), Int3-type (D2+5-HT1A) inhibitory neurons (C), and Int4-type (D2+5-HT2A) inhibitory neurons (D). For the network consisting of Pyr1 and Int2 neurons, **Figure 5B1** shows two examples of the bifurcation diagram for the pyramidal cells' firing rate with respect to [DA], given fixed [5-HT] = 0.3 nM (dotted) and [5-HT] = 2.0 nM (solid). The phase diagram (**Figure 5B2**) shows that the neurons fire tonically and asynchronously given higher [5-HT] and lower [DA], and fire synchronously given lower [5-HT] and moderately higher [DA]. The oscillation emerges with increasing [DA] through a Hopf bifurcation. The oscillation amplitude increases and approaches saturation (**Figure 5B1**), while the oscillation frequency decreases from 30 Hz to approximately 20 Hz with increasing [DA] (inset of **Figure 5B2**). For the network with D2+5-HT1A inhibitory neurons, only a finite range of [DA] supports oscillation (**Figure 5C2**). The oscillation emerges and then disappears through Hopf bifurcations with increasing [DA], regardless of the [5-HT] level. The oscillation amplitude also increases and then decreases as [DA] is increased, for fixed [5-HT] = 0.3 nM (dotted) and [5-HT] = 2.0 nM (solid) (**Figure 5C1**). The oscillation frequency likewise depends non-monotonically on [DA] (**Figure 5C2** inset). For the network with Pyr1-type pyramidal cells and Int4-type (D2+5-HT2A) inhibitory neurons, oscillation can be obtained only in a finite range of [DA] if [5-HT] is low ([5-HT] = 0.3 nM) (**Figures 5D1,D2**), similar to **Figure 5C**. A similar minimal (maximal) oscillation frequency (amplitude) can be observed (**Figure 5D2** inset, dotted). However, for high [5-HT] ([5-HT] > 2 nM) and [DA] (>7 nM), oscillation becomes more easily attainable (bold).

A similar analysis is done on Pyr3-type (D1+5-HT1A+5-HT2A) pyramidal cells instead of Pyr1-type cells. **Figure 6** summarizes the analysis of two-population networks of Pyr3 paired individually with the same four (Int1–4) types of inhibitory neurons. The phase diagrams (**Figure 6**, right) indicate that oscillations cannot be attained if [DA] and [5-HT] levels are sufficiently low. **Figures 6A1,A2** (Pyr3 and Int1) look qualitatively similar to **Figures 5A1,A2** (Pyr1 and Int1) except that now there is a finite regime of [5-HT] that does not allow oscillation to occur. **Figures 6B1,B2** (Pyr3 and Int2) look qualitatively similar to **Figures 5B1,B2** (Pyr1 and Int2) throughout the range of [DA] and [5-HT] explored. **Figures 6C1,C2** (Pyr3 and Int3) seem to be a hybrid of **Figure 6A** (Pyr3 and Int1) and **Figure 5C** (Pyr1 and Int3), with one of the asynchronous regions (**Figure 6C2**). This implies that when [5-HT] is sufficiently high, oscillation can occur even with a very low level of [DA] (<3.5 nM). The network with Pyr3 and Int4 (**Figure 6D**) looks similar to that of Pyr1 and Int4 (**Figure 5D**), but the oscillation regime now extends over a much larger range of [DA] and [5-HT].

We have seen in **Figures 4**–**6** how [DA] and [5-HT] can modulate networks consisting of different combinations of pyramidal cells and inhibitory neurons. Taken together, these results support several observations. Firstly, pyramidal cells with (excitatory) 5-HT2A receptors can oscillate even with low [DA] (**Figures 6A,C,D**), in contrast to pyramidal cells with no 5-HT2A receptors (**Figures 5A,C,D**). Secondly, inhibitory neurons with 5-HT2A receptors can enhance inhibition in the circuit, which can cause oscillation to cease (compare **Figure 6D2** with **Figure 5D2**). Thirdly, higher [DA] will inhibit interneurons expressing D2 receptors, because these receptors are inhibitory upon activation, and as a result network oscillation will cease (**Figures 5C2, D2**, **6C2**, **D2**).

### **3.3. HETEROGENEOUS NETWORK MODEL**

After investigating DA and 5-HT modulation of the various possible two-population excitatory-inhibitory networks, we now study neuromodulation and drug effects on the dynamics of a fully connected 8-population network with neurons expressing all the considered receptor types and their combinations.

### *3.3.1. 5-HT and DA modulation*

We first vary [DA] to investigate the modulation by DA, fixing [DA]1 = 4 nM, [DA]2 = 8 nM, [5-HT]1 = 1 nM, [5-HT]2 = 2 nM, and [5-HT] = 0.3 nM. Oscillation emerges at [DA] = 3.21 nM through a Hopf bifurcation. Because the excitatory D1-like receptors have a higher affinity than the inhibitory D2-like receptors, the amplitudes of the neuronal firing rates first increase before decreasing or saturating as [DA] increases (**Figures 7A–C**). This is especially pronounced for neurons that express D2 receptors (Pyr2, Pyr4, Int3, and Int4), which exhibit an inverted U-shaped modulation (Pyr2 in **Figure 7B**; Pyr4, Int3, and Int4 are not shown because DA modulation of these neurons is similar to that of Pyr2). The oscillation frequency decreases from the low gamma to the beta band with increasing [DA], before it slightly increases again with a further increase in [DA] (**Figure 7D**). The value of [DA] producing the minimal oscillation frequency (∼6 nM) coincides with that of the maximal neuronal firing rates.

Next, we vary [5-HT] while keeping [DA] fixed at 5 nM, with [DA]1 = 4 nM, [DA]2 = 8 nM, [5-HT]1 = 1 nM, and [5-HT]2 = 2 nM. **Figure 8** shows that 5-HT modulates the network activity non-monotonically. The network oscillates at either a low or a high [5-HT] level, while an intermediate [5-HT] level (within the 1.08–2.22 nM range) leads to asynchronous tonic stable activity (**Figures 8A–C**). This phenomenon arises from the different affinities of 5-HT1A and 5-HT2A receptors. The intermediate tonic stable state may not arise if [5-HT]1 > [5-HT]2. In fact, we have observed such multiple oscillation regimes in the simpler two-population excitatory-inhibitory network model (**Figures 5B2,C2**, **6C2,D2**). For pyramidal cells without 5-HT2A receptors (Pyr1 and Pyr2), activity is almost fully suppressed by 5-HT1A receptor inhibition (**Figure 8A**; Pyr2 not shown). The slight increase in their activity during oscillation is driven indirectly by other neuronal subgroups (e.g., excitation from oscillating Pyr3-type neurons; **Figure 8B**). The frequency of the oscillation increases with increasing [5-HT] until the latter reaches a Hopf bifurcation point (1.08 nM), after which the oscillation ceases. When [5-HT] exceeds a larger critical value (2.22 nM), the oscillation reappears and its frequency decreases toward a stable value of about 24 Hz (**Figure 8D**). It should be noted that activation of neurons at high levels of [5-HT] is observed only for Pyr3-type (D1+5-HT1A+5-HT2A) pyramidal cells and Int2-type (D1+5-HT2A) inhibitory neurons, while the other types of neurons are inhibited (not shown).
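The two oscillatory regimes reported above for [DA] = 5 nM can be summarized as a simple lookup. The sketch below only encodes the two Hopf bifurcation points quoted in the text (1.08 nM and 2.22 nM); it does not simulate the network. The function name and the treatment of the exact boundary values are assumptions.

```python
# Hopf bifurcation points quoted in the text for [DA] = 5 nM (in nM).
LOWER_HOPF_NM = 1.08
UPPER_HOPF_NM = 2.22

def regime_at_da_5nm(ht_nm):
    """Qualitative network regime for a given [5-HT] (nM) at [DA] = 5 nM.

    Boundary values are assigned to the asynchronous regime, which is an
    arbitrary convention adopted for this illustration.
    """
    if ht_nm < 0:
        raise ValueError("concentration must be non-negative")
    if ht_nm < LOWER_HOPF_NM:
        return "oscillatory (low [5-HT] regime)"
    if ht_nm > UPPER_HOPF_NM:
        return "oscillatory (high [5-HT] regime)"
    return "asynchronous tonic"

if __name__ == "__main__":
    for ht in (0.3, 1.5, 2.5):
        print(f"[5-HT] = {ht} nM: {regime_at_da_5nm(ht)}")
```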

Finally, we vary [DA] and [5-HT] simultaneously, and find that if [DA] is above a certain value (10.946 nM), the network oscillates for any [5-HT] level (**Figure 9A**). Specifically, there exists a Λ-shaped green curve in **Figure 9** below which the network cannot support oscillatory activity (black region), while above it oscillation occurs. Although the oscillation frequencies generally decrease with increasing [DA] levels, they stay within the same range (**Figure 9**). Moreover, the frequency of the oscillation depends non-monotonically on [5-HT]: it increases before [5-HT] exceeds the left branch of the Λ shape in the phase diagram, and decreases after [5-HT] exceeds the right branch of the Λ shape (**Figure 9**). When [DA] > 10.946 nM, there exists an optimal [5-HT] value at which the network attains its maximum oscillation frequency. The peak of the frequency lies in the gamma range, which is often observed during attentional processing (Benchenane et al., 2011). Interestingly, when [DA] is smaller than 4.1 nM, only a narrow finite range of [5-HT] values supports oscillatory behavior.

**[DA].** Oscillation of the network emerges from a Hopf bifurcation at [DA] = 3.21 nM. **(A–C)** The amplitude of the oscillation increases with increasing [DA] before the activation of D2 receptors reduce **(A,C)** or

upon activation of D2 receptors. [DA]1 = 4 nM, [DA]2 = 8 nM, [5-HT]1 = 1 nM, [5-HT]2 = 2 nM, and [5-HT] = 0.3 nM.

expressing D1, 5-HT1A, and 5-HT2A receptors. **(C)** Activity of interneurons expressing D1 and 5-HT2A receptors. **(D)** Dependence of the oscillation frequency on [5-HT]. Insets: firing rates of Pyr3 and Int2 given [5-HT] = 0.5 nM and 2.5 nM. [DA]1 = 4 nM, [DA]2 = 8 nM, [5-HT]1 = 1 nM, [5-HT]2 = 2 nM, and [DA] = 5 nM.

### *3.3.2. 5-HT and DA receptor selective agonist/antagonist*

Having observed how the PFC network model can be modulated by [DA] and [5-HT], we now investigate the influence of selective agonists and antagonists of DA and 5-HT receptors. A selective agonist (antagonist) is a drug that activates (blocks) a specific receptor without affecting other receptors. Our model can mimic the effect of a receptor-selective agonist or antagonist by decreasing or increasing, respectively, the half-maximal effective concentration of the receptor, namely, [DA]1 for D1, [DA]2 for D2, [5-HT]1 for 5-HT1A, or [5-HT]2 for 5-HT2A receptors (Lambert, 2004; Golan et al., 2007).

To investigate the effect of a D1 selective agonist or antagonist, we fix [DA]2 = 8 nM, [5-HT] = 0.3 nM, [DA] = 3 nM, and vary [DA]1 in the range 0–10 nM. We find that smaller [DA]1 values (simulating a D1 receptor agonist) favor larger firing rate amplitudes and slower oscillations (**Figure 10**). At intermediate [DA]1 values, there is a transition to a smaller oscillation amplitude. Higher [DA]1 values eventually shut off the oscillation via a Hopf bifurcation (at [DA]1 = 4.79 nM), at which point the percentage of active D1 receptors is approximately 1/(1 + exp(1.78667)) = 14.35%. With regard to the neuronal activity, decreasing [DA]1 generally leads to higher neuronal firing rates. As [DA]1 increases, the oscillation amplitude of D1-expressing neurons decreases more slowly (**Figures 10A,C**) than that of D2-expressing neurons (**Figure 10B**), before they all reach a stable asynchronous tonic state.
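The percentage quoted above follows from a logistic activation curve, and the same curve shows how shifting the half-maximal concentration mimics an agonist or antagonist. The sketch below is illustrative only: the unit slope (1 per nM) is an assumption inferred from the quoted exponent, with [DA] = 3 nM and a bifurcation value taken as 4.78667 nM (rounded to 4.79 nM in the text), and the function name is hypothetical.

```python
import math

def active_fraction(conc_nm, ec50_nm, slope_per_nm=1.0):
    """Fraction of activated receptors for a logistic dose-response curve.

    conc_nm: ligand concentration (nM).
    ec50_nm: half-maximal effective concentration (nM).
    slope_per_nm: steepness; the default of 1/nM is an assumption.
    """
    return 1.0 / (1.0 + math.exp(-slope_per_nm * (conc_nm - ec50_nm)))

if __name__ == "__main__":
    da = 3.0  # [DA] in nM, as fixed in the D1 agonist/antagonist experiment
    # Antagonist-like shift: EC50 raised to the Hopf bifurcation value.
    print(f"EC50 = 4.78667 nM: {100 * active_fraction(da, 4.78667):.2f}% active")
    # Baseline EC50 used elsewhere in the text ([DA]1 = 4 nM).
    print(f"EC50 = 4 nM:       {100 * active_fraction(da, 4.0):.2f}% active")
    # Agonist-like shift: EC50 lowered to 1 nM.
    print(f"EC50 = 1 nM:       {100 * active_fraction(da, 1.0):.2f}% active")
```

Lowering the half-maximal concentration at fixed [DA] raises the activated fraction (agonist-like), while raising it lowers the fraction (antagonist-like).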

Next, we vary [DA]2 while fixing [DA]1 = 4 nM, [5-HT] = 0.3 nM, and [DA] = 4 nM, to mimic the influence of a D2 selective agonist/antagonist on the oscillation. The results are shown in **Figure 11**. Unlike D1 modulation, no bifurcation occurs when we vary [DA]2, and the behavior of the network does not change dramatically. Varying [DA]2 only slightly increases the oscillation frequency (**Figure 11**), and it affects the oscillation amplitudes of D2-expressing neurons only (**Figures 11B,D**).

To study the effects of a selective 5-HT1A agonist/antagonist, we vary [5-HT]1 while fixing [DA]1 = 4 nM, [DA]2 = 8 nM, [DA] = 4 nM, [5-HT]2 = 2 nM, and [5-HT] = 0.3 nM. We find that the system becomes oscillatory once [5-HT]1 exceeds a critical value (Hopf bifurcation point at [5-HT]1 = 0.516 nM). The frequency (amplitude) of the oscillation decreases (increases) with increasing [5-HT]1, eventually reaching robust oscillatory behavior once the 5-HT1A receptors are blocked (**Figure 12**).

Finally, we investigate the modulation by a 5-HT2A selective agonist/antagonist by fixing [DA]1 = 4 nM, [DA]2 = 8 nM, [DA] = 4 nM, [5-HT]1 = 1 nM, and varying [5-HT]2. As [5-HT]2 increases from 0 to 5 nM, the 5-HT2A receptors transition from the activated state to the blocked state. As a result, the amplitude of the oscillation decreases, but the frequency of the oscillation increases (**Figure 13**). Note that the different neuronal types are affected differently. For example, the firing rate of pyramidal cells expressing D1 and 5-HT1A receptors decreases to approximately 10 Hz (**Figure 13A**), but that of pyramidal cells expressing D2 and 5-HT1A (or 5-HT1A and 5-HT2A) receptors decreases to less than 1 Hz (**Figure 13B**). The firing rate of interneurons expressing D1 and 5-HT1A (or 5-HT2A) decreases to approximately 20 Hz (**Figure 13C**), and that of interneurons expressing D2 and 5-HT1A (or 5-HT2A) receptors decreases to less than 5 Hz (**Figure 13D**). As atypical antipsychotic drugs typically block 5-HT2A and D2 receptors (Maher et al., 2002, 2011), we also investigated such drug effects by increasing the values of [5-HT]2 and [DA]2 together, but we did not find much difference from varying [5-HT]2 and [DA]2 individually.

### **4. DISCUSSION**

### **4.1. SUMMARY OF RESULTS**

In this work, we have shown, from single neuron to neuronal circuits, how DA and 5-HT, with their multiple receptors and combinations, can tonically modulate the PFC neural activity, resulting in a variety of complex behaviors.

Due to the different affinities and opposing effects of the 5-HT1A and 5-HT2A receptors, the firing activity of a PFC excitatory neuron coexpressing these two receptors can be inhibited before being enhanced as the 5-HT concentration increases. When we extend our analysis to the two-population excitatory-inhibitory neuronal networks, we find that, in general, pyramidal cells expressing D1 receptors can provide a variety of network behaviors. In particular, 5-HT and DA can modulate the amplitude and frequency of the network oscillations. Depending on the receptor types expressed by the neurons in the network, 5-HT and DA modulation can cause the oscillations to emerge or cease. Hence, this can result in a finite oscillatory regime, which can create an optimal oscillation frequency and amplitude with respect to a certain [DA] or [5-HT] level. Moreover, we find that certain combinations of receptors are conducive to the robustness of the oscillatory regime, and to the existence of multiple oscillatory regimes.

**FIGURE 10 | D1 selective agonist/antagonist on the heterogeneous network.** Increasing [DA]1 will decrease the amplitude of oscillation but increase the frequency of oscillation. **(A)** Dependence of firing rate of pyramidal cell (D1+5-HT1A) on [DA]1. The amplitude of the oscillation, the difference between the green lines, decreases with increasing [DA]1. **(B)** Dependence of firing rate of pyramidal cell (D2+5-HT1A) on [DA]1. **(C)** The firing rate of interneurons (D1+5-HT1A) depends on [DA]1. **(D)** Frequency of oscillation increases with the increase of [DA]1 before the agonist/antagonist shuts off the oscillation. Insets: firing rates of two types of pyramidal cells over time with [DA]1 = 1 nM (left), and [DA]1 = 3 nM (right).

Firing rate of pyramidal cells expressing D2 and 5-HT1A receptors. **(C)** Firing rate of interneuron expressing D1 and 5-HT2A receptors. **(D)** Firing rate of interneurons expressing D2 and 5-HT1A receptors.

The analysis of the two-population model gives us leverage for understanding the more complex and realistic heterogeneous network model. In the heterogeneous network model, the pyramidal cells and interneurons with all the considered combinations of D1, D2, 5-HT1A, and 5-HT2A receptors are synaptically coupled. The model reveals that the network requires a sufficiently high level of [DA] in order to oscillate. At intermediate levels of [DA], a bimodal feature with respect to the 5-HT concentration appears: network oscillation can occur in two separate ranges of 5-HT concentration. This bimodal feature is largely due to the different affinities and opposing effects of 5-HT1A and 5-HT2A receptors, as observed in the two-population model. A very low DA concentration can suppress the oscillation regardless of the 5-HT level. Finally, we show that selective D1 receptor antagonists (agonists) tend to suppress (enhance) network oscillations and shift them from the beta toward the gamma band, while selective 5-HT1A antagonists (agonists) act in the opposite way. Selective D2 or 5-HT2A receptor antagonists can lead to a decrease in oscillation amplitudes, but only 5-HT2A antagonists can increase the oscillation frequency.

Based on the analysis of the two-population and full network models, a general trend can be observed: the oscillation frequency will decrease if the change causes an overall increase in excitation within the network ([DA]1 ↓, [DA]2 ↑, [5-HT]1 ↑, and [5-HT]2 ↓), and vice versa.
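The general trend stated above can be written down as a small sign rule. The sketch below merely encodes the qualitative statement from the text (which direction of change in each half-maximal concentration increases net excitation, and that increased excitation lowers the oscillation frequency). It is not derived from simulation, and the parameter keys and function name are hypothetical.

```python
# Effect on net network excitation of RAISING each half-maximal concentration,
# as stated in the text: excitation increases for [DA]1 down, [DA]2 up,
# [5-HT]1 up, and [5-HT]2 down.
EXCITATION_SIGN_WHEN_RAISED = {
    "DA1": -1,   # weaker activation of excitatory D1 receptors
    "DA2": +1,   # weaker activation of inhibitory D2 receptors
    "5HT1": +1,  # weaker activation of inhibitory 5-HT1A receptors
    "5HT2": -1,  # weaker activation of excitatory 5-HT2A receptors
}

def predicted_frequency_change(parameter, delta):
    """Qualitative frequency change when `parameter` changes by `delta` (nM)."""
    if delta == 0:
        return "unchanged"
    direction = 1 if delta > 0 else -1
    excitation_change = EXCITATION_SIGN_WHEN_RAISED[parameter] * direction
    # More excitation -> lower frequency, and vice versa.
    return "decrease" if excitation_change > 0 else "increase"

if __name__ == "__main__":
    print(predicted_frequency_change("DA1", -1.0))   # D1 agonist-like shift
    print(predicted_frequency_change("5HT2", +1.0))  # 5-HT2A antagonist-like shift
```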

### **4.2. RELATIONS TO NEUROPHARMACOLOGICAL DRUG EFFECTS**

As mentioned earlier, abnormal beta and gamma band oscillations have been observed in various neurological and neuropsychiatric disorders (Spencer et al., 2003; Cho et al., 2006; Uhlhaas and Singer, 2006, 2010; Basar and Guntekin, 2008; Gonzalez-Burgos and Lewis, 2008; Gonzalez-Burgos et al., 2010). Schizophrenic patients (late responders to antipsychotic drugs) have been shown to have enhanced power in the beta2 (16.5–20 Hz) frequency band in the frontal cortex as compared to controls (Merlo et al., 1998; Venables et al., 2009). Fluphenazine, an antagonist of both pre- and postsynaptic D2 receptors, can reduce beta2 power in schizophrenic patients (Kleinlogel et al., 1997). In our model, if we simulate only the effect of a postsynaptic D2 receptor antagonist, the result is actually an enhancement of the beta2 oscillation amplitude (**Figure 11**). However, an antagonist of presynaptic D2 receptors can effectively decrease the overall DA concentration, which can shift the oscillation frequency out of the beta2 range (**Figure 7**). Hence, the model suggests that the antagonist effects on presynaptic D2 receptors may dominate those on postsynaptic D2 receptors.

In a rat model of Parkinson's disease, beta band oscillation in the frontal cortex is abnormally high compared to controls (Sharott et al., 2005). In that work, the authors showed that administration of apomorphine, a non-selective dopamine agonist which activates both D1- and D2-like receptors (but with higher preference for D2-like receptors), reduces the high beta band power and shifts the oscillation slightly toward higher frequency (from 28 Hz to about 35 Hz). In our model, D2 agonist generally reduces the oscillation amplitude of the beta band (**Figure 11**), while D1 agonist slightly decreases the oscillation frequency. The former is consistent with the experiment but not the latter. This discrepancy deserves to be further investigated.

Hallucinogens (psychedelics) are agonists of 5-HT2 receptors that enhance PFC activity and metabolism in humans, and they have been considered for the treatment of psychiatric disorders (Nichols, 2004). It has been shown that application of the 5-HT2 agonist 2,5-dimethoxy-4-iodoamphetamine (DOI), as compared to the 5-HT2 antagonist ketanserin, can lower the power of the beta band in the EEG signal from the frontal cortex of anesthetized rats (Budzinska, 2009). Our model with a high [5-HT]2 value (mimicking 5-HT2A antagonists) also reduces the oscillation amplitude in the beta band (**Figure 13**), and thus is consistent with the experiments.

### **4.3. MODEL LIMITATIONS AND FUTURE WORK**

Our modeling approach involves incorporating various experimental data to constrain the model parameters. This includes electrophysiological and pharmacological properties of PFC neurons and synapses, and how these are distinctly modulated by the different DA and 5-HT receptors. Thus, our approach lies more toward biologically constrained firing-rate models (Wong and Wang, 2006; Eckhoff et al., 2011) than abstract connectionist models (Fellous and Linster, 1988). This is a first step toward a systemic understanding of DA and 5-HT comodulation in the PFC. Admittedly, the model has its limitations.

As expected from large-scale biologically based modeling, many model parameters are involved here. We have tried to base as many parameters as possible on experimental data. Some of these parameters are directly obtained or inferred from experimental measurements, while others are based on indirect evidence or assumptions. The extensive investigations on the localization of DA and 5-HT receptors in the PFC provided biologically plausible proportions of neuronal subpopulations in the PFC network. However, we did not simulate all possible details in the model. For example, we did not include pyramidal cells that coexpress both D1 and D2 receptors. We also did not include PFC neurons that express none of the D1, D2, 5-HT1A, and 5-HT2A receptors. It remains unknown how these neurons will indirectly affect PFC network behavior upon 5-HT and DA co-modulation. Moreover, DA and 5-HT receptors generally have low and high affinity states, but the present model assumes only receptors in high affinity states. In terms of selecting the parameters for the model, we have chosen only a single value for each parameter or variable (e.g., tonic basal [5-HT] in the PFC) within a range of available values identified over various separate experiments. This problem is often encountered when integrating data from multiple sources during model development. Furthermore, oscillations in the cerebral cortex can differ among cortical layers, but the current model does not deal with this issue. The present model also considers only the tonic release state, which is closer to a resting state and independent of any specific cognitive task.

Despite these limitations and assumptions, it is sometimes advantageous to understand neuromodulation phenomena by moving from simpler to more complex models, teasing apart the contributions of individual components of a system, which is a key advantage of computational modeling. Even with such simplified models, the behaviors produced by DA-5-HT comodulation are already rather complex. Moreover, the specific [DA]1 and [DA]2, [5-HT]1 and [5-HT]2 values only reflect the comparative affinities of the receptors for DA or 5-HT. Varying these values while preserving their relative affinities will not dramatically change the network's qualitative behavior under DA and 5-HT co-modulation. That is, their absolute values are not as important as their relative values. A possible extension of our present work would be to explicitly specify the cortical layers, which are known to be distinctively modulated by DA and 5-HT (Wang, 2010). Furthermore, for the model to generate slower oscillations such as theta and other lower frequency bands, and hence allow direct comparison with other experimental data (Benchenane et al., 2010; Puig et al., 2010), the model may require additional slower dynamical features such as GABAB-mediated synaptic currents. These concerns will be addressed in future work.

### **ACKNOWLEDGMENTS**

This work was supported by the National Natural Science Foundation of China grant nos. 91132702 and 31271169 (Da-Hui Wang), and the Center of Excellence in Intelligent Systems award, funded by InvestNI and the Integrated Development Fund, through its local facilitator, ILEX (KongFatt Wong-Lin).

### **REFERENCES**

by recurrent inhibition. *J. Comput. Neurosci.* 11, 63–85. doi: 10.1023/A:1011204814320


before and during treatment with fluphenazine. *Schizophr. Res.* 3, 162–163.


differential D1 versus D2 dopamine receptor regulation of inhibition in prefrontal cortex. *J. Neurosci.* 24, 10652–10659. doi: 10.1523/JNEUROSCI.3179-04.2004


prefrontal cortex. *J. Neurosci.* 13, 2551–2564.


prefrontal and entorhinal cortices– immunohistochemical studies. *J. Physiol. Pharmacol.* 59, 229–238.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 April 2013; accepted: 08 July 2013; published online: 05 August 2013. Citation: Wang D-H and Wong-Lin K (2013) Comodulation of dopamine and serotonin on prefrontal cortical rhythms: a theoretical study. Front. Integr. Neurosci. 7:54. doi: 10.3389/fnint.2013.00054*

*Copyright © 2013 Wang and Wong-Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The role of the dorsal raphé nucleus in reward-seeking behavior

### *Kae Nakamura1,2\**

*<sup>1</sup> Department of Physiology, Kansai Medical University, Hirakata, Japan*

*<sup>2</sup> Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Kawaguchi, Japan*

### *Edited by:*

*KongFatt Wong-Lin, University of Ulster, UK*

### *Reviewed by:*

*Quentin Huys, University College London, UK Andrew J. Greenshaw, University of Alberta, Canada*

### *\*Correspondence:*

*Kae Nakamura, Department of Physiology, Kansai Medical University, 2-5-1 Shin-machi, Hirakata-city, Osaka, 570-1010, Japan e-mail: nakamkae@hirakata.kmu.ac.jp*

Pharmacological experiments have shown that the modulation of brain serotonin levels has a strong impact on value-based decision making. Anatomical and physiological evidence also revealed that the dorsal raphé nucleus (DRN), a major source of serotonin, and the dopamine system receive common inputs from brain regions associated with appetitive and aversive information processing. The serotonin and dopamine systems also have reciprocal functional influences on each other. However, the specific mechanism by which serotonin affects value-based decision making is not clear. To understand the information carried by the DRN for reward-seeking behavior, we measured single neuron activity in the primate DRN during the performance of saccade tasks to obtain different amounts of a reward. We found that DRN neuronal activity was characterized by tonic modulation that was altered by the expected and received reward value. Consistent reward-dependent modulation across different task periods suggested that DRN activity kept track of the reward value throughout a trial. The DRN was also characterized by modulation of its activity in the opposite direction by different neuronal subgroups, one firing strongly for the prediction and receipt of large rewards, with the other firing strongly for small rewards. Conversely, putative dopamine neurons showed positive phasic responses to reward-indicating cues and the receipt of an unexpected reward amount, which supports the reward prediction error signal hypothesis of dopamine. I suggest that the tonic reward monitoring signal of the DRN, possibly together with its interaction with the dopamine system, reports a continuous level of motivation throughout the performance of a task. Such a signal may provide "reward context" information to the targets of DRN projections, where it may be integrated further with incoming motivationally salient information.

**Keywords: 5-HT, dopamine, raphé, saccade, primate, reinforcement, reward**

### **INTRODUCTION**

Serotonin (5-hydroxytryptamine, 5-HT) is present in almost all organisms from plants to vertebrates. In mammals, 5-HT has been found in all organs, such as the brain, gut, lung, liver, kidney, and skin, as well as platelets. Such a wide distribution indicates that 5-HT is an essential chemical for all living animals. In the brain, the distribution of 5-HT projections is widespread, regulating the activity of almost all brain regions. Thus, it is no surprise that 5-HT has been implicated in a variety of brain functions, such as the sleep-wake cycle, appetite, locomotion, emotion, hormonal regulation, and as a trophic factor.

In addition to the "basic" brain functions described above, the role of 5-HT in cognitive functions, including attention, control of impulsivity, coping with stress, social behavior, value-based decision making, and learning and memory, has also captured a great deal of attention. The breakdown of the 5-HT system is often associated with neuropsychiatric diseases including depression, schizophrenia, drug abuse, autism, and Parkinson's disease. However, the specific mechanisms by which 5-HT is involved in these cognitive processes are not yet clear.

Among the possible functions of 5-HT, this review will focus on its role in reward-seeking behavior. There are already good reviews about the role of 5-HT in value-based decision making, often being compared with dopamine function. For example, it has been proposed that the tonic and phasic dopamine and 5-HT systems represent value and action, which are not independent, in an opposite manner. Thus, dopamine may be involved in behavioral activation to obtain rewards and 5-HT may be involved in inhibition in the face of punishment (Boureau and Dayan, 2010; Cools et al., 2011). This unified model can account for the variety of aspects of decision making, including response vigor, time discounting, switching, and risk sensitivity, observed in behavioral-pharmacological experiments in animals and humans. The aim of this review is to further focus on the anatomical and physiological evidence of the 5-HT system and link it with the above findings. I will first review the anatomical evidence that supports the involvement of the raphé nuclei, the origin of 5-HT, in reward-dependent behavior. Among the raphé nuclei, I will focus on the dorsal raphé nucleus (DRN) because it has strong anatomical and physiological connections with the brain areas that are related to reward processing. Second, I will introduce pharmacological studies that examined the impact of changes in the brain levels of 5-HT on reward-seeking behavior. Although the results are mixed, depending on the affected brain regions and the type of 5-HT receptors examined, these studies generally support the inhibitory effect of 5-HT on reward-seeking behavior.

The behavioral pharmacological studies examined how 5-HT is utilized at the projection targets. On the other hand, it is also critical to reveal when and in which situations 5-HT is secreted or when DRN neurons are activated in real time. Recently, several research groups measured the activity of single DRN neurons while animals performed behavioral tasks. I will review the results of single unit recordings from the DRN, including our recent experiments in monkeys. The results show that DRN neuronal activity continuously keeps track of the expected and received reward value throughout the trials.

Finally, I will discuss the possible mechanisms by which 5-HT modulates value-based decision making, together with dopamine and other brain structures, such as the lateral habenula, amygdala, frontal cortex, and basal ganglia.

### **ANATOMICAL IMPLICATION OF THE ROLE OF 5-HT IN MOTIVATIONAL BEHAVIOR**

There is a great amount of evidence demonstrating tight anatomical connections between the raphé nuclei and the brain areas that are related to reward (Azmitia and Gannon, 1986; Molliver, 1987; Jacobs and Azmitia, 1992; Michelsen et al., 2007).

Among the 9 raphé nuclei B1–B9 (Dahlstroem and Fuxe, 1964), those that are often discussed in relation to reward-related behavior are the DRN, which is the largest group (B7), lumped together with B6, and the median raphé nucleus (MRN), which consists of B8 and B5.

### **INPUT TO THE DRN THAT MAY BE INVOLVED IN REWARD PROCESSING (FIGURE 1, LEFT)**

The DRN receives projections from many brain areas that have been associated with reward and punishment. These areas tend to project to distinct divisions of the DRN (Aghajanian and Wang, 1977; Sakai et al., 1977; Behzadi et al., 1990; Peyron et al., 1998).

Cortical areas projecting to the DRN include the medial prefrontal (Arnsten and Goldman-Rakic, 1984), lateral and medial orbital, cingulate, infralimbic, and insular cortices (Arnsten and Goldman-Rakic, 1984; Sesack et al., 1989; Amat et al., 2005). At least a part of the projection from the medial frontal cortex is via GABA interneurons in the raphé nuclei (Arnsten and Goldman-Rakic, 1984; Hajos et al., 1998; Varga et al., 2001, 2003; Jankowski and Sesack, 2004), which in turn project to 5-HT neurons.

Subcortical areas projecting to the DRN include the amygdala (Peyron et al., 1998; Lee et al., 2007), substantia nigra pars reticulata (SNr), ventral pallidum, preoptic area, claustrum, bed nucleus of the stria terminalis, zona incerta, medial and lateral preoptic areas, hypothalamus, and, most prominently, the lateral habenula nucleus (Pasquier et al., 1976; Aghajanian and Wang, 1977; Wang and Aghajanian, 1977c; Herkenham and Nauta, 1979; Stern et al., 1979; Kalen et al., 1989; Peyron et al., 1998; Varga et al., 2003), whose projection is through the fasciculus retroflexus. The lateral habenula is a brain region that represents negative motivational values, such as reward omission and aversive stimuli (Matsumoto and Hikosaka, 2007, 2009; Hong and Hikosaka, 2008) and transmits these signals to midbrain dopamine neurons and the DRN. Many studies have reported an inhibitory effect from the habenula to the DRN via the rostromedial tegmental nucleus (RMTg). Stimulation of the habenula suppresses the activity of DRN 5-HT neurons (Aghajanian and Wang, 1977; Wang and Aghajanian, 1977c; Stern et al., 1979; Nishikawa and Scatton, 1984, 1985, 1986; Scatton et al., 1984; Nishikawa et al., 1986; Ferraro et al., 1997; Varga et al., 2003) and decreases 5-HT release in the caudate nucleus and substantia nigra (Reisine et al., 1982; but see Kalen et al., 1989).

[Figure legend (fragment): … tegmental nucleus; and GPb, globus pallidus external and internal border (anatomically called the internal medullary lamina). The open and filled rectangles correspond to excitatory and inhibitory connections, respectively. The arrows indicate that the effect is unknown or that both excitatory and inhibitory effects have been reported.]

The hypothalamus is also an important source of reward information for the DRN (Celada et al., 2002). Hypothalamic orexin neurons are activated by arousal, feeding, and rewarding stimuli (Mieda and Yanagisawa, 2002; Lee et al., 2005; Harris and Aston-Jones, 2006) and facilitate 5-HT release (Tao et al., 2006). The amygdala, in which neurons encode positive or negative motivational values (Ledoux, 2000; Belova et al., 2008), also sends projections to the DRN.

The dopamine neurons in the ventral tegmental area (VTA) and substantia nigra pars compacta (SNc) also project to the DRN and MRN (Kalen et al., 1988; Mansour et al., 1990; Peyron et al., 1995; Kitahama et al., 2000), which may exert facilitatory effects on putative 5-HT neurons in the DRN by D2-like dopamine receptor activation (Ferre and Artigas, 1993; Mendlin et al., 1999; Haj-Dahmane, 2001).

Finally, the activity of neurons in the raphé nuclei is regulated by 5-HT via the 5-HT1A receptor found on the somata and dendrites of neurons in the raphé nuclei, where it functions as a somato-dendritic auto-receptor (Wang and Aghajanian, 1977a; Gozlan et al., 1983; Verge et al., 1985; Carey et al., 2004).

### **OUTPUT FROM THE RAPHÉ NUCLEI (FIGURE 1, RIGHT)**

Efferent projections from the raphé nuclei are widespread, but constitute a topographic organization along the rostrocaudal and medial-lateral axes (Imai et al., 1986; Abrams et al., 2004; Lee et al., 2008). Separate ascending pathways have been described in rats and primates. In the rat, the largest pathway is the medial forebrain bundle, which carries fibers from the MRN and DRN to a wide range of target areas in the forebrain. In primates, a significant number of these fibers (∼25%) are heavily myelinated (Azmitia and Gannon, 1986), and the largest pathway appears to be the dorsal raphé-cortical tract, which enters the cortex through the internal capsule (Azmitia and Segal, 1978; Azmitia and Gannon, 1986). Many projection sites include areas that are associated with reward processing, such as the neocortex, nuclei in the basal ganglia, nucleus accumbens, amygdala, septum, hippocampus, and hypothalamus (Azmitia and Segal, 1978; Azmitia and Gannon, 1986; Molliver, 1987; Vertes, 1991; Peyron et al., 1998).

The innervations from the DRN have several characteristics. First, individual DRN neurons give rise to several sets of collateral (branched) projections to distinct, but functionally related, targets. Single DRN 5-HT neurons project to the septum and the entorhinal area, both of which are essential for normal hippocampal function (Kohler et al., 1982), to various combinations of the olfactory cortex, septum, and medial thalamus (De Olmos and Heimer, 1980), to the prefrontal cortex and nucleus accumbens (Van Bockstaele et al., 1993), and to the central nucleus of the amygdala and paraventricular nucleus in the hypothalamus, both of which are involved in central autonomic control, anxiety, and conditional fear (Petrov et al., 1994; Lowry, 2002). This branching is also observed in the DRN projections to the sensory-motor areas, such as the lateral geniculate body and superior colliculus (Villar et al., 1988), which are important for visual information processing, and the substantia nigra, subthalamic nucleus, and caudate-putamen (Van Der Kooy and Hattori, 1980; Imai et al., 1986), which are involved in the execution of movement. These serotonergic collateral projections to functionally and anatomically related targets could facilitate the integrated and temporally coordinated modulation of multiple brain regions.

Second, 5-HT acts on all major dopaminergic pathways, i.e., nigrostriatal, mesocortical, mesolimbic, and tuberoinfundibular. The interaction of the 5-HT system with the dopamine system has been documented in the frontal cortex and basal ganglia nuclei, which form part of the nigrostriatal, mesocortical, and mesolimbic dopamine pathways. The fourth dopamine pathway, the tuberoinfundibular pathway, projects from the arcuate nucleus to the median eminence in the hypothalamus. Here, dopamine inhibits the secretion of prolactin from the anterior pituitary gland during the resting state. Stressful events that evoke prolactin release also seem to rely, at least partially, on central serotonin function (Bregonzio et al., 1998). In terms of receptor types, and with some exceptions, 5-HT1A, 5-HT1B, 5-HT2A, 5-HT3, and 5-HT4 receptors facilitate dopamine release, while 5-HT2C receptors exert tonic inhibition on dopamine release (for review, Alex and Pehek, 2007).

Here, in the first section, we discuss primarily the anatomical connections of the DRN to the reward-related brain areas. In the second section, we will focus more on the differential functional effects of 5-HT.

### *Projections to the striatum and SNr*

Among the widespread efferent projections of the DRN, those to the basal ganglia structures, especially the striatum and substantia nigra, may be particularly important for the control of the reward-dependent modulation of action (monkey, Lavoie and Parent, 1990; rat, Van Der Kooy and Hattori, 1980; Imai et al., 1986). In monkeys (Lavoie and Parent, 1990; Haber, 2003), 5-HT terminals are particularly abundant in the ventral striatum, including the nucleus accumbens, ventrolateral region of the putamen, and ventromedial region of the caudate nucleus. The influence of 5-HT depends on the type and location of its receptors. High levels of 5-HT1B, 5-HT2A, and 5-HT2C receptors are reported in the striatum (Wright et al., 1995; Eberle-Wang et al., 1996).

Many reports examining the function of 5-HT in the striatum have focused on its effect on dopamine release. Electrical stimulation of the DRN enhanced dopamine release in the nucleus accumbens, but reduced it in the dorsal striatum; however, the specific effect depends on the type of receptors present. The facilitatory effect of endogenous 5-HT on dopamine release in the nucleus accumbens depends on the presence of 5-HT2A and 5-HT3 receptors, and not on 5-HT2B/2C receptors. Conversely, 5-HT2C receptors tonically inhibit dopamine release in the dorsal and ventral striatum (Jiang et al., 1990; Chen et al., 1991; De Deurwaerdere et al., 1998). The activation of 5-HT1B receptors in the nucleus accumbens reportedly attenuated dopamine-dependent responses to a conditioned reward (Fletcher and Korth, 1999; but see Galloway et al., 1993).

The SNr is one of the major targets of the DRN in rats (Dray et al., 1976; Fibiger and Miller, 1977; Azmitia and Segal, 1978; Van Der Kooy and Hattori, 1980; Wirtshafter et al., 1987; Corvaja et al., 1993; Van Bockstaele et al., 1994; Moukhles et al., 1997), cats (Mori et al., 1987), and monkeys (Lavoie and Parent, 1990). In monkeys, 5-HT innervations are particularly dense in the SNr, but much less so in the SNc (Lavoie and Parent, 1990). Coexpression of 5-HT2C receptor mRNA with glutamic acid decarboxylase, but not with tyrosine hydroxylase mRNA, indicates that 5-HT2C receptors are restricted to GABAergic neurons (Eberle-Wang et al., 1997). The functional significance of 5-HT in the SNr, however, is not well understood.

### *Projections to the SNc and VTA*

Electron microscopy studies have shown that 5-HT neurons make direct synaptic contacts with dopaminergic and nondopaminergic neurons in the VTA (Herve et al., 1987; Van Bockstaele et al., 1994), indicating the direct and indirect influence of the raphé nuclei on the midbrain dopamine system. Electrical stimulation of the MRN (Dray et al., 1976) and DRN (Trent and Tepper, 1991; Gervais and Rouillard, 2000) inhibits the activity of the majority (but not all) of dopamine neurons. Further studies showed that the effect of 5-HT on midbrain dopamine neurons depends on the subtypes of 5-HT receptors present and the location of the dopamine neurons (Alex and Pehek, 2007). The systemic application of a 5-HT2C agonist decreased the baseline activity of dopamine neurons in a dose-dependent manner (Di Giovanni et al., 2000; Gobert et al., 2000), while the application of a 5-HT2C/2B antagonist caused a dose-dependent increase in the baseline and burst activity of dopamine neurons (Ugedo et al., 1989; Di Giovanni et al., 1999). As 5-HT2C receptors are mainly localized in GABAergic neurons in the SNr and VTA, which in turn inhibit dopamine neurons, the inhibitory effect of a 5-HT2C agonist on dopamine function is, at least in part, due to the GABA-mediated tonic inhibitory effect of 5-HT on mesolimbic and nigrostriatal dopamine function. On the other hand, the activation of VTA 5-HT1B receptors increases mesolimbic dopamine release, probably by inhibiting GABA release (Yan and Yan, 2001; Yan et al., 2004). Some authors have reported direct facilitatory effects of 5-HT on dopamine neurons *in vitro* (Nedergaard et al., 1988).
In addition, 5-HT receptors located presynaptically on dopamine terminals or postsynaptically in dopamine projection areas could activate feedback loops, such as the striato-nigral, nucleus accumbens-VTA, or frontal-VTA pathways, thus indirectly altering the excitability of dopamine neurons in the SNc or VTA, resulting in changes in their baseline firing rates (Di Giovanni et al., 2010).

### *Projections to the amygdala*

Several nuclei of the amygdala receive rich serotonergic innervations (Steinbusch, 1981). In rats, the rostral and medial subregions are dense projection sites of 5-HT neurons. In monkeys, 5-HT projections are found widely in the amygdala, with the highest concentration in the lateral division of the central nucleus and lateral-dorsal part of the bed nucleus of the stria terminalis (Sadikot and Parent, 1990; Freedman and Shi, 2001). The effect of DRN on neurons in the amygdala is reportedly inhibitory and mediated by direct DRN-amygdala serotonergic projections (Wang and Aghajanian, 1977b).

### *Projections to the hypothalamus*

The hypothalamus plays a significant role in the processing of natural rewards, such as food and sex (Harris et al., 2005; Muschamp et al., 2007), and it receives strong inputs from the DRN (Nambu et al., 1999). Extracellular 5-HT levels increased in the medial and lateral hypothalamus during the anticipation and intake of food, but not after its consumption (Schwartz et al., 1990). Interestingly, this finding was interpreted in line with the reward-inhibiting and satiety-facilitating functions of 5-HT in the hypothalamus (Hoebel et al., 1989).

### *Projections to the cortex*

The DRN also projects to virtually all cortical areas, and its effect can be excitatory or inhibitory, depending on which layers it projects to and the presence of different receptor types. Electrical stimulation of the DRN and MRN inhibits the majority of medial prefrontal cortex neurons via 5-HT1A (Hajos et al., 2003; Puig et al., 2005) or 5-HT2 (Mantz et al., 1990) receptors. Among several receptor types, 5-HT2A receptors are particularly dense in the prefrontal and anterior cingulate cortices (Pazos et al., 1985), and they are primarily located on the apical dendrites of pyramidal neurons (Jakab and Goldman-Rakic, 1998; Cornea-Hebert et al., 1999). Prefrontal 5-HT2A receptors may activate cortico-tegmental projection neurons, which in turn facilitate VTA dopamine neurons (Pehek et al., 2006). On the other hand, 5-HT2A/2C receptors are also present in the GABAergic interneurons of the cortex and may regulate glutamatergic output (Abi-Saab et al., 1999). 5-HT2C activation in the medial frontal cortex suppresses cocaine-seeking behavior (Pentkowski et al., 2010).

### **5-HT AND THE REWARD CIRCUIT**

5-HT has long been implicated in a wide variety of motivational processes; however, contrasting effects have been reported: many studies indicate a positive reward effect, whereas others indicate a negative effect.

The positive reward effects of 5-HT have been described mainly in relation to brain self-stimulation experiments where animals perform operant responses such as pressing a bar to receive electrical stimulation of the brain. The majority of self-stimulation studies have focused on the medial forebrain bundle, which contains ascending dopaminergic fibers; however, several studies have also shown that stimulation of the raphé nuclei and their vicinity is equally effective (Miliaressis et al., 1975; Miliaressis, 1977; Rompre and Miliaressis, 1985). In addition, some pharmacological experiments using the systemic reduction of 5-HT reported attenuated cocaine-seeking behavior (Tran-Nguyen et al., 1999, 2001).

However, many lines of evidence indicate the inhibitory effects of 5-HT on the reward circuitry. The systemic injection of the 5-HT releaser *d*-fenfluramine (Fletcher, 1995) and the injection of 5-HT into the accumbens (Fletcher, 1996; Fletcher and Korth, 1999) attenuated conditioned responses to obtain amphetamine. The systemic reduction of 5-HT also reportedly enhanced reward-related behavior (Leccese and Lyness, 1984; Tran-Nguyen et al., 2001), while the findings of others depended on the type of reinforcement and the method used to reduce the function of 5-HT.

Experiments with the local injection of 5-HT inhibitors in the raphé nuclei support an inhibitory role of the raphé nuclei in motivational behavior. The local injection of a low dose of the 5-HT1A agonist 8-hydroxy-2-(di-n-propylamino)tetralin (8-OH-DPAT), which selectively inhibits serotonergic neurons in the MRN or DRN (Fletcher et al., 1993, 1995), and muscimol (Liu and Ikemoto, 2007) into the MRN induces conditioned place preference. It is of particular interest that these effects were reversed when dopamine antagonists were administered systemically (Fletcher et al., 1999; Liu and Ikemoto, 2007) or directly into the nucleus accumbens (Muscat et al., 1989) or striatum (Fletcher and Davies, 1990; Fletcher, 1991), indicating that the reward effect of 5-HT antagonists may depend, at least partly, on the removal of the inhibitory influence of 5-HT on the mesolimbic dopamine system. Indeed, the systemic administration of 8-OH-DPAT increased the firing rate of the majority (75%) of dopamine cells studied and stimulated their bursting activity (Prisco et al., 1994).

The role of 5-HT in reward is complicated by the fact that it binds to a large number of receptor types that have different effects on reward-oriented behavior (Higgins and Fletcher, 2003). One of the principal receptor types involved in reward-oriented behavior may be the 5-HT2C receptor. This receptor's mRNA is expressed in the anterior olfactory nucleus, olfactory tubercle, claustrum, piriform and entorhinal cortices, lateral septal nucleus, amygdala, subiculum and ventral part of CA3, lateral habenula, subthalamic nucleus, SNr, VTA (Molineaux et al., 1989; Pompeiano et al., 1994; Wright et al., 1995; Eberle-Wang et al., 1997; Clemett et al., 2000), and dorsal and ventral (including nucleus accumbens) striatum, all of which are important parts of the reward-related circuitry. Another functional characteristic of the 5-HT2C receptor is that it possesses a high level of constitutive activity, even in the absence of agonist stimulation (Berg et al., 2008). It has been reported that neurons with 5-HT2C receptors in the nucleus accumbens and striatum are probably GABAergic projection neurons (Eberle-Wang et al., 1997). It was also suggested that all 5-HT2C mRNA-containing cells in the SNr and VTA are GABAergic, not dopaminergic, neurons. Thus, the tonic suppressive influence of 5-HT on dopamine neurons would be mediated by 5-HT2C receptors acting on GABAergic neurons, which in turn suppress dopaminergic neurons in the VTA. This mechanism would allow 5-HT2C receptors to exert a tonic influence on the activity of the mesocortical and mesolimbic dopaminergic pathways. Note, however, that a recent study provided anatomical and behavioral support for the localization of 5-HT2C receptors on dopamine neurons in the VTA (Ji et al., 2006). Altogether, 5-HT2C receptors tonically regulate, mainly by inhibition, dopamine release from the terminal regions of the nigrostriatal and mesolimbic pathways (Di Giovanni et al., 1999; Gobert et al., 2000).

As described above, many behavioral-pharmacological studies have reported the effects of 5-HT on the reward circuitry. However, the direction (positive or negative) of its effects should be analyzed carefully because it may vary depending on the method used to modulate 5-HT levels (e.g., systemic or local), the location of self-stimulation (Ahn et al., 2005), or the kinds of behavioral test used (Mosher et al., 2005; Hayes et al., 2009).

Another hypothesis for the role of the 5-HT system in reward-seeking behavior is that 5-HT regulates the timescale of reward prediction, such as the balance between immediate and delayed rewards. In reinforcement learning theory, the state value is discounted when the delivery of the reward is delayed, and Doya et al. suggested that 5-HT regulates this reward discounting rate (Doya, 2002; Tanaka et al., 2004). Indeed, the 5-HT level and firing rate in the DRN increased when rats waited to obtain rewards, and the level of neuronal firing was correlated with successful waiting (Miyazaki et al., 2010, 2011). Such "wait to obtain a reward" behavior might be originally initiated by the reward signal that activates the dopamine system, which then promotes behavioral vigor or activation, and at the same time, the subsequent activation of the DRN is necessary for the successful withholding of responses to obtain rewards.
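The discounting idea can be made concrete with a minimal sketch. It assumes simple exponential discounting with a discount factor `gamma`; the reward sizes, delays, and function names are hypothetical illustrations and are not taken from the cited studies. A larger `gamma` (a longer timescale of reward prediction) shifts preference from a small immediate reward toward a large delayed reward.

```python
def discounted_value(reward, delay, gamma):
    """Value of a reward delivered after `delay` time steps,
    under exponential discounting (an assumed functional form)."""
    return reward * gamma ** delay


def preferred_option(gamma, small=(1.0, 0), large=(4.0, 5)):
    """Compare a small immediate reward with a large delayed one.

    Each option is a hypothetical (reward, delay) pair.
    Returns "delayed" if the large delayed reward has the higher
    discounted value, otherwise "immediate".
    """
    v_small = discounted_value(small[0], small[1], gamma)
    v_large = discounted_value(large[0], large[1], gamma)
    return "delayed" if v_large > v_small else "immediate"


if __name__ == "__main__":
    # A steep discounter versus a shallow discounter.
    for gamma in (0.6, 0.9):
        print(gamma, preferred_option(gamma))
```

In this toy setting, a steep discounter (`gamma = 0.6`) takes the immediate reward, whereas a shallow discounter (`gamma = 0.9`) waits for the larger one, which is the kind of shift the hypothesis attributes to 5-HT.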

### **5-HT AND AVERSIVE INFORMATION PROCESSING**

The participation of 5-HT in aversive information processing has also been reported repeatedly. Strong evidence that 5-HT is involved in aversive information processing comes from the observation that there is a change in neuronal activity in the raphé nuclei or an increase in 5-HT levels in response to aversive stimuli. Stress-related stimuli activate immediate-early gene expression within the DRN (Pezzone et al., 1993). In the DRN of anesthetized rats, the majority of neurochemically identified 5-HT neurons with a clock-like firing pattern were phasically excited, whereas the majority of bursting 5-HT neurons were inhibited by noxious footshocks (Schweimer and Ungless, 2010). The activity level of the raphé nuclei is also modulated; it increases under inescapable shocks (Grahn et al., 1999; Takase et al., 2004). Forced swimming induced an increase or decrease in 5-HT levels, as measured by microdialysis, depending on the brain region examined; its levels increased in the striatum, but decreased in the amygdala and lateral septum (Kirby et al., 1995).

The role of 5-HT in aversive information processing has multiple facets. First, several lines of evidence suggest that 5-HT modulates sensitivity to threat-related stimuli and punishment (for review, Deakin, 1991; Cools et al., 2008). A negative correlation between 5-HT levels and aversion has been demonstrated repeatedly, indicating the analgesic effect of 5-HT. Low levels of 5-HT in human subjects, achieved by acute tryptophan (the precursor of 5-HT) depletion, enhanced the responsiveness of several brain regions, especially the amygdala, to aversive stimuli, such as fearful faces and negative words (Hariri et al., 2002; Cools et al., 2005; Hariri and Holmes, 2006; Roiser et al., 2008). Low levels of 5-HT also alter the performance of a probabilistic reversal learning task by abnormally enhancing the impact of punishment, such as the inappropriate avoidance of less frequent punishment (Evers et al., 2005; Chamberlain et al., 2006). Note, however, that the role of 5-HT in a probabilistic reversal task may come from the changes in the processing of negative feedback signals *per se*, rather than changes in sensitivity to the error, because the changes in medial frontal activity did not differ between errors that were or were not followed by behavioral correction (Evers et al., 2005).

Just as decreased 5-HT function causes punishment processing to be enhanced, animal studies have shown that an increase in 5-HT levels inhibits responses to punishment. A well-known example is that increasing 5-HT levels via selective 5-HT reuptake inhibitors produces a potent reduction in the levels of anxiety, an effect underlying many anxiolytic drugs. Conditioned fear stress increases extracellular 5-HT levels in the rat medial prefrontal cortex, followed by a reduction of freezing behavior (Hashimoto et al., 1999). 5-HT also suppresses panic or defensive reactions (Maier and Watkins, 2005) and aggression (Marsh et al., 2002; Miczek et al., 2007). The DRN itself and the projection sites of 5-HT, such as the prefrontal cortex and amygdala, may be involved in this process (Graeff et al., 1996). The amygdala has an essential role in the learning and expression of conditioned fear to unconditional and conditional stimuli (Bechara et al., 1995; Ledoux, 2007), and the injection of the 5-HT reuptake blocker citalopram to the amygdala, which presumably enhances 5-HT levels, impairs fear conditioning (Inoue et al., 2004). Amygdala neurons that are excited by the electrical stimulation of glutamate-releasing inputs from the frontal cortex are inhibited by the concurrent iontophoresis of 5-HT, probably by the activation of GABA-releasing neurons through excitatory 5-HT receptors in the amygdala (Stutzmann et al., 1998; Stutzmann and Ledoux, 1999). Thus, deficient 5-HT function might result in the enhanced processing of harmful stimuli because of the diminished inhibitory modulation of excitatory sensory afferents, thereby enabling innocuous sensory signals to be processed by the amygdala as being emotionally salient.

Secondly, recent theoretical and experimental studies suggest that 5-HT does not operate solely as an affective (i.e., aversive) factor. Instead, the influence of 5-HT on aversive processing is evident at the junction of affective and activational factors; specifically, behavioral inhibition in the face of aversive predictions (Dayan and Huys, 2008, 2009; Boureau and Dayan, 2010). For example, in a task in which healthy human subjects decided to respond or not to obtain a reward or to avoid punishment, temporarily lowering 5-HT levels abolished the punishment-induced slowing of their response, but it did not affect the general inhibition of their motor response or sensitivity to aversive outcomes (Crockett et al., 2009). However, aversive predictions can be an instrumental process that links stimuli, responses, and outcomes, or they can be a Pavlovian process that links stimuli and outcomes. Here, further study revealed that 5-HT is involved in reflexive, Pavlovian aversive predictions because the latencies for the punished and non-punished responses were prolonged in the presence of punishment stimuli under acute tryptophan depletion (Crockett et al., 2012).

The third aspect of 5-HT-dependent neuronal processes associated with aversive experiences is behavioral control over a stressor. Generally, the emotional consequences related to aversive events are less severe if the subjects have control over the aversive events, and a lack of control of stress leads to mood and anxiety disorders. Experimentally, animals exposed to inescapable stressors subsequently exhibit "learned helplessness," a set of behavioral changes that include an impaired ability to escape from aversive events, increased fear conditioning and anxiety, a potentiated response to addictive drugs, and altered pain sensitivity. It has been suggested that 5-HT is involved in this "reduction of action" after a stressful, uncontrollable situation. Indeed, the activity of DRN 5-HT neurons, as measured by Fos expression (Grahn et al., 1999), and 5-HT levels, as measured by *in vivo* microdialysis (Maswood et al., 1998), in the DRN or its projection sites (Amat et al., 1998; Bland et al., 2003a,b) were enhanced under an inescapable stress, such as tailshock, but not under an escapable stress. Further, the intense activation of DRN 5-HT neurons by an uncontrollable stress sensitizes these neurons for a period of time (Amat et al., 1998). The inactivation of 5-HT blocks the occurrence of these behavioral changes (Maier et al., 1994, 1995).

One possible mechanism for the activation of the DRN under inescapable stress is input to the DRN from the habenula. Lesions of the habenula severely attenuate the rise in 5-HT levels in the DRN under both escapable and inescapable stress, thus eliminating the difference between them and producing behavioral indifference (Amat et al., 2001). The frontal cortex may also be involved in this process. When a stressor is controllable, the DRN is no longer activated by the stressor due to inhibitory signals from the ventral medial prefrontal cortex (Amat et al., 2005). The role of the ventral medial prefrontal cortex may be to detect the fact that the stressor is controllable rather than to escape learning *per se*. If controllable, the prefrontal cortex inhibits DRN activation and thus prevents learned helplessness. A recent study used an optogenetic approach to reveal more detailed neuronal circuits that support such behavioral changes; activation of the prefrontal-DRN pathway is causally involved in an increase in effortful movement during the forced swim test, which is a challenging and inescapable situation, whereas activation of the prefrontal-habenula pathway caused the opposite effect (Warden et al., 2012). Note that such a situation or state-dependence is also documented for single neuronal activity. For example, DRN neurons in the rat responded to a tone differently, depending on the reward and no-reward context (Li et al., 2013).

Fourth, an increase of 5-HT may regulate the processing of stress via the activation of pituitary and adrenal functions (Vernikos-Danellis et al., 1977), which have bi-directional interactions with the 5-HT system. There is a dense projection of corticotropin-releasing factor (CRF) neurons to the raphé nuclei in rats (Cummings et al., 1983; Lowry et al., 2008) and humans (Austin et al., 1997). A subpopulation of CRF-containing neurons is present in the dorsomedial part of the DRN, and dual-labeling immunohistochemistry revealed that almost all CRF-containing neurons are serotonergic (Commons et al., 2003). Intracerebroventricular injections of the selective CRF2 receptor agonist urocortin 2 increased the activity of serotonergic neurons (Abrams et al., 2004; Staub et al., 2005, 2006). However, the effects of CRF on the DRN appeared to be either excitatory or inhibitory, probably depending on the location of the recorded neurons within the DRN, e.g., neurons in the ventromedial region were inhibited, whereas neurons in the dorsomedial and lateral wings had variable responses (Kirby et al., 2000). In addition, CRF-containing axons from the dorsomedial DRN project to CRF-containing neurons of the central nucleus of the amygdala, a stress-related area and a part of the central autonomic system (Petrov et al., 1994). There is also a dense projection of 5-HT neurons to the suprachiasmatic nucleus, which in turn regulates the secretion of CRF from the hypothalamus and, consequently, adrenocorticotropic hormone (ACTH) release. Thus, it is important to emphasize the two roles of 5-HT in the mammalian brain, i.e., as a neurotransmitter and a hormonal factor. These two aspects may be related to each other as a recent study showed that the negative prediction error signal in the ventral striatum is strengthened under stress (Robinson et al., 2013).

### **SINGLE UNIT RECORDINGS FROM THE DRN**

The anatomical and pharmacological evidence reviewed above suggests that 5-HT has potent effects on reward and punishment, and that its effects are tightly regulated by the neural circuitry interacting with the DRN. A missing piece of this puzzle is precisely how DRN neurons behave while reward-oriented behavior unfolds in real time. The studies reviewed above typically manipulated DRN function over long timescales, such as hours, and over a wide spatial extent, altering 5-HT function in multiple brain regions simultaneously. Yet the reward-related processes that the DRN regulates, including seeking, consuming, and learning about rewards, are performed during natural behavior within the span of minutes or seconds. In addition, while the behavioral-pharmacological experiments examined how 5-HT is utilized at the projection sites, much less is understood about the situations in which DRN neurons secrete 5-HT.

To understand which aspects of cognitive behavior are encoded by the activity of DRN neurons in real time, several research groups have measured the activity of single DRN neurons while animals performed behavioral tasks. In the following section, I will introduce our studies in primates performing "biased reward saccade tasks" (**Figures 2A,B**) (Bromberg-Martin et al., 2010). Using saccades as a behavioral measure is advantageous for several reasons. First, the measurement and assessment of changes in behavior, i.e., eye movement, are relatively simple. Second, the neuronal circuit for the generation of eye movements is well established.

While the activity of DRN neurons was found to be correlated with a variety of events, including movements, stimulus identity, and response direction (Ranade and Mainen, 2009), we found that reward information is one of the most influential factors for the modulation of DRN neuronal activity. A comparison of DRN activity with that of midbrain dopamine neurons also highlighted the distinct aspects of reward coding by different monoamine neurotransmitters.

### **SINGLE NEURONAL ACTIVITY OF THE PRIMATE DRN IN A BIASED REWARD SACCADE TASK**

Nakamura et al. recorded DRN neuronal activity while monkeys performed memory-guided saccade tasks with a biased reward schedule (Nakamura et al., 2008). After fixation on a central fixation point, a target flashed briefly to either the left or right. After a delay of 800 ms, the animal made a saccade in the direction where the target was previously presented (**Figure 2A**). The main feature of the task was the block design of the reward schedule (**Figure 2B**). For every 20–28 consecutive trials, called a block, one direction was always associated with a large reward, while the other direction was always associated with a small reward (e.g., right-large, left-small). Thus, we could measure the effect of the expectation and receipt of a certain reward size on neuronal activity. In addition, this target location-reward size contingency was switched between blocks (e.g., right-large, left-small to right-small, left-large) without an explicit signal, which caused the receipt of an unexpectedly large or small reward on the very first trial of each block. This feature enabled us to measure the effect of the positive and negative reward prediction error.
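The block structure described above can be illustrated with a minimal sketch. It is not the original task code: the function name `make_schedule`, the starting contingency, the number of blocks, and the use of simple uniform random target selection (the task used a pseudo-random sequence whose details are not given here) are all assumptions made for illustration.

```python
import random

def make_schedule(n_blocks=4, block_len=(20, 28), seed=0):
    """Return a list of trial dicts for a biased-reward block schedule (illustrative)."""
    rng = random.Random(seed)
    large_side = "right"  # arbitrary starting contingency (assumption)
    trials = []
    for block in range(n_blocks):
        n_trials = rng.randint(*block_len)  # 20-28 trials per block, inclusive
        for i in range(n_trials):
            target = rng.choice(["left", "right"])  # assumed: uniform random target
            reward = "large" if target == large_side else "small"
            trials.append({
                "block": block,
                "trial_in_block": i,
                "target": target,
                "reward": reward,
                # The first trial after an unsignaled reversal delivers a reward
                # that contradicts the contingency of the previous block.
                "unsignaled_switch": block > 0 and i == 0,
            })
        # Reverse the position-reward contingency without any explicit cue.
        large_side = "left" if large_side == "right" else "right"
    return trials

schedule = make_schedule()
```

In such a schedule, only the trials flagged `unsignaled_switch` carry an outcome that the animal could not have anticipated from the preceding block, which is what makes them useful for probing prediction-error responses.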

### **DRN NEURONS ENCODE THE EXPECTED AND RECEIVED REWARD VALUE**

We found that many DRN neurons exhibited task-related activity that was modulated by the expected and received reward value. **Figure 2C** shows a representative example. This neuron exhibited an increase in activity after the onset of the fixation point (FPon) followed by regular and tonic firing until reward onset (RWon). The activity further increased after the onset of a large reward, but ceased after the onset of a small reward, and this trend lasted tonically after reward onset. Another example neuron in **Figure 2D** showed an opposite modulation pattern. This neuron exhibited a decrease in activity after the onset of the fixation point followed by a tonic increase for small reward trials and suppression for large reward trials.

Reward-dependent modulation in activity was commonly observed in the population of DRN neurons. **Figure 2E** illustrates the time course of activity modulation using receiver operating characteristic (ROC) analysis by comparing the firing rate of each neuron for large (**Figure 2E**, left) and small (**Figure 2E**, middle) reward conditions to their baseline activity during 400 ms before fixation onset. During both periods before and after reward delivery, called the pre- and post-reward periods, respectively, many DRN neurons exhibited tonic increases (shown in warm colors) or decreases (cool colors) in activity. **Figure 2E**, right, compares the activity of each neuron between the large- and small-reward trials. The tonic reward effect was present in many neurons during both the pre- and post-reward periods.
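The sliding-window ROC comparison described above can be sketched as follows. The window parameters (a 400-ms control period before fixation onset, a 100-ms test window, 20-ms steps) follow the legend of **Figure 2E**; the function names, the synthetic neuron, and its firing rates are hypothetical and serve only to make the example runnable.

```python
import numpy as np

def roc_area(test, control):
    """Area under the ROC curve: P(test > control) + 0.5 * P(test == control)."""
    test = np.asarray(test, dtype=float)[:, None]
    control = np.asarray(control, dtype=float)[None, :]
    return float(np.mean((test > control) + 0.5 * (test == control)))

def sliding_roc(spike_times, baseline=(-400, 0), t_start=0, t_stop=2000,
                width=100, step=20):
    """spike_times: per-trial arrays of spike times (ms) aligned to fixation onset."""
    def rates(lo, hi):
        counts = [np.sum((np.asarray(tr) >= lo) & (np.asarray(tr) < hi))
                  for tr in spike_times]
        # Convert counts to spikes/s so windows of different length are comparable.
        return np.array(counts) / ((hi - lo) / 1000.0)
    control = rates(*baseline)
    centers = np.arange(t_start, t_stop + 1, step)
    values = [roc_area(rates(c - width / 2, c + width / 2), control)
              for c in centers]
    return centers, np.array(values)

# Hypothetical neuron: 5 spikes/s until 500 ms, 20 spikes/s afterwards.
rng = np.random.default_rng(0)
def fake_trial():
    early = rng.uniform(-400, 500, rng.poisson(5 * 0.9))
    late = rng.uniform(500, 2100, rng.poisson(20 * 1.6))
    return np.concatenate([early, late])

trials = [fake_trial() for _ in range(40)]
centers, roc = sliding_roc(trials)
```

Values above 0.5 correspond to the warm colors in the figure (firing above baseline) and values below 0.5 to the cool colors. Replacing the control sample with the small-reward trials and the test sample with the large-reward trials would give the large-versus-small comparison shown in the right panel.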

There was a notable difference in reward-dependent modulation between the pre- and post-reward periods, indicating a different source of information. For each neuron, the change in activity during the pre-reward period, compared with baseline activity, tended to be in the same direction in both the large- and small-reward trials. On the contrary, the change in activity during the post-reward period, compared with baseline activity, tended to be in the opposite direction. For example, for the neuron shown in **Figure 2C**, the pre-reward activity increased compared with the baseline in both the large- and small-reward trials. On the other hand, its post-reward activity increased in the large-reward trials, but was inhibited in the small-reward trials relative to its baseline activity before fixation point onset. Thus, the main cause of the reward effect during the pre-reward period was that the change in activity tended to be stronger in the large-reward trials than in the small-reward trials. Conversely, the reward-dependent modulation of post-reward activity was caused by the modulation of activity in the opposite direction, depending on the reward value.

### **DRN NEURONS KEEP TRACK OF THE EXPECTED AND RECEIVED REWARD VALUE**

DRN neurons exhibited a tonic increase or decrease in activity that was modulated by the expected and received reward value. What do these tonic changes encode? One possibility is that this tonic modulation of activity encodes sustained aspects of motivated behavior, such as the state of expectation of future rewards for each moment. If so, the activity during the fixation period may represent the expected value of the performance of the task itself. This is because the animal did not know the exact value of the upcoming reward during the fixation period, but knew the averaged expected reward value, which should be a value between the large and small rewards. After target presentation, the exact expected/received reward value was known. If the neurons encoded the behavioral tasks primarily in terms of their reward value throughout a trial, then the neurons excited during the fixation period should be preferentially excited by the reward cues (i.e., carrying positive reward signals), whereas the neurons inhibited during the fixation period should be preferentially inhibited by the reward cues (i.e., carrying negative reward signals). Conversely, if the neurons encoded the fixation period and reward value in an independent manner, then there should be no systematic relationship between fixation- and reward-related activity.

Analysis revealed that there was indeed a strong correlation between the tonic activity level of a neuron during the fixation period and its encoding of reward-related cues and outcomes. For example, neurons like the one presented in **Figure 2C** showed a sustained elevation in activity during the fixation period. After reward delivery, these neurons responded with a positive reward signal, with higher activity in response to the large- than to the small-reward trials. Other neurons, like the one presented in **Figure 2D**, showed a sustained suppression in activity during the fixation period, with higher activity in unrewarded trials than in rewarded trials. The population average of normalized activity was computed separately for neurons with positive, negative, or no significant reward signals in response to the outcome (**Figures 3A–C**). Neurons with positive reward signals for the outcome had elevated activity during the early period of the task (**Figure 3A**); if the rewarded target appeared, their activity was elevated further, whereas if the unrewarded target appeared, they returned to near baseline. Neurons with negative reward signals had suppressed activity during the early period of the task (**Figure 3B**); if the rewarded target appeared, their activity was suppressed further, whereas if the unrewarded target appeared, they returned to near baseline. Neurons with no significant reward signals had a tendency for small phasic responses to the fixation point and targets and slightly elevated activity during the task (**Figure 3C**).

**FIGURE 2 | (A)** One direction rewarded memory guided saccade (1DR-MGS) task. After the monkey fixated on the central fixation point for 1200 ms, one of the two target positions was flashed for 100 ms. After the fixation point disappeared, the monkey made a saccade to the cued position to receive a liquid reward. The white arrows indicate the direction of gaze. In a block of 20–28 trials (e.g., left-large block), one target position (e.g., left) was associated with a large reward and the other position (e.g., right) was associated with a small reward. The position-reward contingency was then reversed (e.g., right-large block). **(B)** Left-large and right-large conditions were alternated between blocks with no external cue. The location of the target was determined pseudo-randomly. **(C,D)** Examples of the activity of two DRN neurons in the 1DR-MGS task. The activity in the large- and small-reward trials is shown in red and blue, respectively. The histograms and raster plots are shown in three sections: the left section is aligned to the time of fixation point onset (FPon), the middle section is aligned to target onset (TGon) and fixation point offset (FPoff), and the right section is aligned to reward onset (RWon). The black dots indicate saccade onset (SACon); the blue dots indicate reward onset and offset. Note that reward offset (RWoff) applies only to the large-reward trials. **(E)** Population activity of DRN neurons in the 1DR-MGS task (*n* = 84). The activity of each neuron is presented as a row of pixels. Left and center: changes in the neuronal firing rate from baseline are compared in the large- and small-reward trials. The color of each pixel indicates the ROC value based on the comparison of the firing rate between a control period just before fixation onset (400 ms duration) and a test window centered on the pixel (100 ms duration). This analysis was repeated by moving the test window in 20-ms steps. The warm colors (ROC > 0.5) indicate increases in the firing rate relative to the control period, while the cool colors (ROC < 0.5) indicate decreases in the firing rate. Right: changes in reward-dependent modulation. The ROC value of each pixel was based on the comparison of the firing rate between the large- and small-reward trials. The warm colors (ROC > 0.5) indicate higher firing rates in the large-reward trials than in the small-reward trials. Modified from (Nakamura et al., 2008).

The activity of a neuron during the fixation period was strongly positively correlated with its degree of reward discrimination during the post-target and post-reward periods (**Figure 3D**). If the elevation of activity during the fixation period was stronger, the neuron had higher discrimination of a positive-reward signal, with stronger activity for large- than small-reward trials during the post-target and post-reward periods. If the inhibition of fixation activity was stronger, the neuron had higher discrimination of a negative-reward signal, with stronger activity for small- than large-reward trials during the post-target and post-reward periods. Thus, most DRN neurons responded to the initiation of a behavioral task in the same direction as they responded to the reward cues and outcomes, and those neurons with stronger task coding also had stronger reward coding. These two signals combined so that the level of DRN activity tracked progress throughout the task toward obtaining future rewards. This form of correlated task and reward coding had a dominant influence on DRN neurons, and it was not simply one of many systematic forms of task and reward encoding.

neurons were sorted into these categories based on significant reward discrimination during a 150–450-ms window after outcome onset (gray bar on the x-axis; *p* < 0.05, Wilcoxon rank-sum test). Thick lines, mean normalized activity; light shaded areas, 1 SEM. **(D)** Neural activity during the fixation period was positively correlated with reward coding during the target and outcome periods. The x-axis indicates the fixation period response, which was measured as the ROC area for each neuron for discriminating between its firing rate at 500–900 ms after fixation point onset vs. the pre-fixation period at 0–400 ms before fixation point onset. The y-axis indicates reward discrimination, which was the difference in reward responses between the large- and small-reward trials. The text indicates rank correlation (rho) and its

indicate the line of best fit calculated using type 2 least-squares regression. **(E,F)** The first **(E)** and second **(F)** principal components of dorsal raphé neural activity profiles during the memory-guided saccade. Curves represent the normalized firing rate of the principal component during the fixation period (black) and after the onset of the rewarded (red) and unrewarded (blue) target, separately for the contralateral-rewarded block (dark colors) and ipsilateral-rewarded block (light colors). The first principal component indicated tonically increased activity during the fixation period and positive-reward coding during the target, memory, and outcome periods. The second component indicated tonically increased activity in response to reward delivery. Modified from (Bromberg-Martin et al., 2010).

So far, the analysis showed that DRN neurons encoded the information of the task (i.e., fixation period activity) and reward outcome in a correlated manner. However, it did not analyze whether this correlation had a dominant tendency or it was merely one of many systematic forms of task and reward encoding. The analysis was also performed only for restricted periods of the task, which were determined tentatively. To characterize the activity patterns of neurons during all task phases in an unbiased manner, we applied principal component analysis (Richmond and Optican, 1987; Paz et al., 2005). In this analysis, the activity of each neuron is described as a linear combination of the major components of activity that varied systematically with the task variables; the first principal component represents the most common pattern of neural activity with the greatest amount of variance, the second principal component explains the second most common pattern of neural activity, and so on. Then, the activity profile of every neuron may be reconstructed as the sum of the mean neural activity profile plus a weighted combination of the principal components. If a neuron is assigned a positive weight for a component, then its activity is positively related to the time series of that component. Conversely, if a neuron is assigned a negative weight for a component, then its activity is negatively related to the time series of that component.
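The decomposition described above (mean profile plus a weighted sum of principal components) can be sketched with a singular value decomposition. This is a generic illustration rather than the published analysis pipeline: the function name `pca_profiles`, the population size, and the synthetic tonic pattern are assumptions.

```python
import numpy as np

def pca_profiles(activity, n_components=2):
    """activity: (n_neurons, n_timebins) array of normalized firing-rate profiles."""
    mean_profile = activity.mean(axis=0)
    centered = activity - mean_profile
    # Rows of vt are the principal components, each a time series across the task.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    weights = centered @ components.T  # one weight per neuron per component
    explained = s ** 2 / np.sum(s ** 2)
    return mean_profile, components, weights, explained[:n_components]

# Synthetic population: one shared tonic pattern with positive or negative weights.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)
pattern = np.tanh(6 * (t - 0.3))  # hypothetical step-like tonic profile
true_w = rng.normal(size=60)      # positive- and negative-coding neurons
activity = np.outer(true_w, pattern) + 0.1 * rng.normal(size=(60, 50))

mean_profile, components, weights, explained = pca_profiles(activity)
# Each neuron is approximated by the mean profile plus its weighted components.
reconstruction = mean_profile + weights @ components
```

The sign of a principal component is arbitrary, so a "positive" weight is meaningful only relative to the plotted orientation of that component.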

In the DRN population of neurons, the first principal component (**Figure 3E**) indicated a positive correlation between task onset-related activity and reward coding. It consisted of a gradual increase in tonic activity during the inter-trial interval and after fixation point onset, followed by an additional increase in tonic activity in response to the rewarded target. The second principal component (**Figure 3F**) had a prolonged tonic change in activity after a reward was delivered. Thus, whereas the first component resembled "task-reward value coding," the second component resembled "reward delivery coding."

Note that principal component analysis treats neural activity as a linear combination of orthogonal components; if the "true" components underlying neural activity are combined nonlinearly or are not orthogonal, the principal components may not represent them perfectly. Nevertheless, further analysis indicated that only the first two principal components explained significantly more variance in activity than would be expected under the null hypothesis that there were no systematic patterns in the data using shuffled datasets (Bromberg-Martin et al., 2010). Thus, these principal components explained most of the systematic variation in neuronal activity that was related to task events.
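One way to compare the variance explained by each component against a shuffled-data null can be sketched as follows. The shuffling scheme used in the original study is not described here, so the choice below (permuting time bins independently for each neuron, then comparing rank-matched components against the 95th percentile of the null) is an assumption, as are all names and the synthetic data.

```python
import numpy as np

def explained_variance(activity):
    """Fraction of variance explained by each principal component."""
    centered = activity - activity.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s ** 2 / np.sum(s ** 2)

def shuffle_null(activity, n_shuffles=200, seed=0):
    """Null distribution of explained variance after destroying shared time courses."""
    rng = np.random.default_rng(seed)
    null = np.empty((n_shuffles, min(activity.shape)))
    for i in range(n_shuffles):
        # Assumed scheme: permute time bins independently for each neuron.
        shuffled = np.array([rng.permutation(row) for row in activity])
        null[i] = explained_variance(shuffled)
    return null

# Synthetic population built from two systematic patterns plus noise.
rng = np.random.default_rng(2)
t = np.linspace(0, 1, 40)
patterns = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
w = rng.normal(size=(60, 2)) * np.array([2.0, 1.5])
activity = w @ patterns + 0.3 * rng.normal(size=(60, 40))

observed = explained_variance(activity)
threshold = np.percentile(shuffle_null(activity), 95, axis=0)
significant = observed > threshold  # compared component by component, by rank
```

With data constructed this way, only the components that reflect patterns shared across neurons should exceed the null threshold, which mirrors the logic of the test described above without reproducing its exact procedure.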

### **DIFFERENCE FROM DOPAMINE NEURONS**

Reward-dependent modulations of the activity of DRN neurons were distinctively different from those observed in putative dopamine neurons for the same task (the visually guided version of the biased-reward saccade task, **Figure 4A**). First, whereas DRN neurons responded to both the reward-predicting stimulus and the reward itself (TGon and RWon, respectively, **Figure 4B**), dopamine neurons predominantly responded to the reward-predicting sensory stimulus (TGon). Second, whereas the DRN contains neurons that preferred larger rewards and neurons that preferred smaller rewards, dopamine neurons invariably preferred larger rewards (i.e., are excited by larger rewards). Third, whereas DRN neurons reliably coded the value of the received reward, whether or not it was expected, dopamine neurons responded to a reward only when it was larger or smaller than expected. **Figure 4C** shows the changes in neuronal activity during the pre- and post-reward periods when the target location-reward value contingency was switched. The activity of positive and negative reward-coding DRN neurons reflected the expected (pre-reward) and received (post-reward) reward values. The changes in the activity of dopamine neurons during the pre-reward period were similar to those of DRN neurons. However, unlike DRN neurons, dopamine neurons responded to reward delivery only when the cue position-reward contingency was switched so that the reward was unexpectedly small or large, consistent with the prediction error hypothesis (Schultz, 1998; Kawagoe et al., 2004). Finally, whereas DRN neurons typically exhibited tonic responses, dopamine neurons exhibited phasic responses. Thus, DRN neurons provide tonic signals related to the expected and received reward values, unlike dopamine neurons that provide phasic signals related to the reward prediction error.

### **DISCUSSION**

The characteristic features of the activity of DRN neurons observed in the biased-reward saccade tasks were a tonic response pattern and stronger modulation for the most valuable option in either a positive or negative manner. The tonic activity underlying the expected reward value indicates its role in subjective motivation to obtain a reward or "wanting;" the response to the received reward value indicates its role in a subjective hedonic experience or "liking" (Berridge and Kringelbach, 2008). Correlated fixation-period activity, which represents the task value, and post-outcome activity, which represents the value of the received reward, indicate that DRN activity encodes behavioral tasks primarily in terms of their reward value throughout a trial. The principal components, which explained the majority of activity patterns, indicate that this reward coding, aside from other possible sensory-motor coding, is the major component of DRN activity. Conversely, DRN neurons do not appear to encode the prediction error signal of appetitive or aversive events.

Possible sources of the pre-reward activity (i.e., the response to fixation and target) may be dopamine neurons in the SNc and VTA, and neurons in the lateral habenula (**Figure 5**). Since dopamine neurons are excited by a large reward-predicting cue, DRN neurons would also be excited by the same cue (Kawagoe et al., 2004). Indeed, during the pre-reward period, a large-reward preference was more common (∼20% of all task-related DRN neurons) than a small-reward preference (∼5%). The main projection from the lateral habenula to the DRN is, on the other hand, inhibitory. Using the same biased-reward saccade tasks, Matsumoto et al. showed that lateral habenula neurons were excited by stimuli that predict small rewards and were inhibited by a large-reward predicting cue (Matsumoto and Hikosaka, 2007). Such modulation of habenula activity would then be inversely translated into the large-reward preference of DRN neurons via inhibitory neurons in the RMTg.

**aspects of the reward. (A)** Visually guided version of the one direction rewarded saccade (1DR-VGS) task. **(B)** Activity of 167 DRN neurons and 64 dopamine neurons for the 1DR-VGS task. The same format is used as in **Figure 2E**. **(C)** Changes in neuronal activity with the reversal of position-reward contingency. Top and middle: DRN neurons with large- and small-reward preferences, respectively. Bottom: dopamine neurons. For each group, the activity during the pre-reward period (400 ms after target onset) is shown on the left, and the activity during the post-reward period (400–800 ms dopamine neurons) is shown on the right. For each graph, the left panel shows large-to-small reward reversal; the right panel shows small-to-large reward reversal. The large-reward trials are indicated by dark gray; the small-reward trials are indicated by clear areas (as in the top). Shown are the mean and SE of the normalized neuronal activity for the n-th trial after contingency reversal. The asterisks (∗) indicate activity that was significantly different from the activity in the last five trials of the block with the reversed contingency (*p* < 0.01, Mann-Whitney *U*-test). Modified from (Nakamura et al., 2008).

The post-reward responses of DRN neurons are unlikely to be derived from dopamine or habenula neurons because neither exhibits post-reward responses, except on the first trial after the block was switched. Possible origins of the post-reward activity include the amygdala, hypothalamus, and medial prefrontal cortex. In the post-reward period, unlike the pre-reward period, the direction of modulation relative to the baseline was often opposite between the large- and small-reward trials. This observation indicates different sources of activity for the large- and small-reward trials. It was also found that one population of DRN neurons showed a large-reward preference and another population showed a small-reward preference. One possible interpretation would be that the two kinds of reward-related signals (small > large and large > small) are represented in other brain areas, such as the anterior cingulate cortex (Niki and Watanabe, 1979; Amiez et al., 2006), and these signals are transmitted to the DRN (Arnsten and Goldman-Rakic, 1984). Another possible source is the amygdala. Amygdala neurons, like DRN neurons, tracked progress throughout a behavioral task, such that the response of a neuron to the start of the task was strongly correlated with its response to the reward cue and outcome. They also include both positive and negative coding neurons (Belova et al., 2008). Another possibility is that the reward information originated from the same group of neurons, but was transmitted to the DRN by different mechanisms; one directly, the other indirectly, via inhibitory connections. For example, the ventral medial prefrontal cortex inhibits 5-HT neurons in the DRN by targeting local GABAergic interneurons (Varga et al., 2001). Such multi-channeled inputs would enable the DRN to integrate positive and negative reward values independently over time.

et al., 2004), lateral habenula (Matsumoto and Hikosaka, 2007, 2009), and DRN neurons (Nakamura et al., 2008), and for the Pavlovian conditioning task in the amygdala (Belova et al., 2007, 2008) is shown.

### **POSSIBLE FUNCTIONS OF THE DRN IN REWARD PROCESSING AND THE DIRECTION OF FUTURE RESEARCH**

The tonic activity of DRN neurons may be ideal to signal a continuous level of motivation and hedonic experience throughout the performance of a task. Such a signal may provide a "reward context" signal to the targets of DRN projections, where the signal may be used differently depending on the type of 5-HT receptor present.

First, the sustained reward signals in the DRN could be used to track the value of the current behavioral state. Such estimated values have an important role in theories of reinforcement learning, which suggest that the prediction error signal of dopamine neurons is calculated as the difference between the actual and expected reward values. Thus, DRN activity could contribute to the computation of prediction errors by providing the current state of the expected reward value.
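The relationship between a sustained value signal and a phasic prediction error can be made concrete with a simple delta-rule sketch. This is a textbook formulation rather than a model fitted to the recordings; the learning rate, initial value, and reward magnitudes are arbitrary assumptions.

```python
def simulate(rewards, alpha=0.3, v0=0.5):
    """Delta rule: V <- V + alpha * (r - V). Returns expected values and errors."""
    values, errors = [], []
    v = v0
    for r in rewards:
        delta = r - v          # prediction error: actual minus expected reward
        values.append(v)       # expected value held before the outcome
        errors.append(delta)
        v += alpha * delta
    return values, errors

# One target position: large reward (1.0) for 20 trials, then small reward (0.2)
# after an unsignaled contingency reversal.
rewards = [1.0] * 20 + [0.2] * 20
values, errors = simulate(rewards)
```

In this sketch the error term is large only on the first trials after the reversal and then decays toward zero, whereas the value term remains at a sustained level that tracks the current reward. The former resembles the phasic dopamine response and the latter the tonic signal discussed here, under the stated assumptions.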

Second, DRN activity may report the long-term averaged reward, rather than immediate, phasic reward information (Daw et al., 2002). In real life, one needs to integrate flows of information, including both appetitive and aversive events and situations, to achieve better decision making to adapt to external changes. The tonic activation patterns of DRN neurons may be useful in integrating appetitive and aversive information coming from different sources (as in **Figure 1**, left) over a substantial period of time.

The activity of DRN neurons observed in behaving monkeys is characterized by a mirror-image pattern of reward coding by different subsets of neurons, namely, positive and negative reward coding (**Figure 5**). The current theoretical account of 5-HT function is that it may be involved in behavioral inhibition in the face of punishment (Cools et al., 2011). Thus, the neuronal activity data in the DRN of behaving animals appear partially unexpected because some neurons showed stronger activity in the expectation of large rewards. This seemingly inconsistent finding may arise because different groups of neurons map onto neurochemically or anatomically different subgroups. In the neurochemical account, it is possible that the negative coding DRN neurons could be serotonergic projection neurons, while the positive coding ones may be GABAergic interneurons. Clarifying the underlying cell properties is essential for further understanding of the function of 5-HT (Schweimer and Ungless, 2010). In the anatomical account, neurons may respond differently depending on the circuit in which they are involved. For example, a recent single unit recording study in primates (Inaba et al., 2013) reported that neurons that prefer rewards tend to be distributed more rostrally, while neurons that prefer no rewards were distributed more caudally. It is possible that these different types of neurons may be involved in different anatomical circuits in the brain.

The mirror-image activity of different sets of DRN neurons also suggests that their function may be highly context-dependent. The DRN is anatomically and functionally linked to different circuits involving different brain structures, such as the frontal cortex, amygdala, basal ganglia, and dopamine neurons, and context here may depend on which circuit is mainly involved. Indeed, Warden et al. showed that stimulation of specific projections from the medial frontal cortex to the DRN caused changes in animals' movement in a challenging situation (the forced swim test), while stimulation of the overall DRN caused, in addition to the usual effects observed in a challenging situation, a general increase in movement (the open field test) (Warden et al., 2012). The activation of a specific pathway of DRN neurons with specific task-related activity may support the context-dependent selection of value-based decision making.

Another possible function of the seemingly opposite signals might be the interaction between the 5-HT and other systems, including dopamine systems, to compute appetitive and aversive information in a balanced manner (**Figure 5**). As in Solomon and Corbit's affective dynamics model (Solomon and Corbit, 1974), the value of rewards is treated as a continuous signal rather than a pulsatile one, and the tonic DRN activity we observed may correspond to this signal. With the normal level of DRN activity and 5-HT, the baseline activity of dopamine neurons may be tonically suppressed. In addition, phasic cues indicating appetitive and aversive events would simultaneously drive both dopamine neurons and DRN neurons, the latter of which inhibit dopamine neurons, at least partly. Thus, the DRN would attenuate the strength of responses of the dopamine system to appetitive and aversive events. This process might have the advantage of maintaining equilibrium in terms of reward to prevent excessive positive or negative value coding. Given the variety of 5-HT receptors and their functions, this scheme is, of course, simplistic. It should also be clarified whether the regulation of the reward circuit by 5-HT is always dopamine-dependent, as in the proposed scheme, or whether it can act independently and directly. Research combining circuit-specific manipulation, such as the optogenetic approach, with detailed analyses of neural activity in relation to changes in behavior would lead to a clear understanding of the role of 5-HT.

### **REFERENCES**


*Brain Res.* 917, 118–126. doi: 10.1016/S0006-8993(01)02934-1


### **ACKNOWLEDGMENTS**

Kae Nakamura was supported by Precursory Research for Embryonic Science and Technology, a Grant-in-Aid for Scientific Research B, a Grant-in-Aid for Scientific Research on Priority Areas, and the Human Frontier Science Program. I thank Dr. Ethan S. Bromberg-Martin for helpful comments.

of state value in the amygdala. *J. Neurosci.* 28, 10023–10030. doi: 10.1523/JNEUROSCI.1400-08.2008


of task reward value in the dorsal raphe nucleus. *J. Neurosci.* 30, 6262–6272. doi: 10.1523/JNEUROSCI.0015-10.2010


*Psychopharmacology (Berl.)* 180, 670–679. doi: 10.1007/s00213-005-2215-5


serotonin2C receptor messenger RNA in the basal ganglia of adult rats. *J. Comp. Neurol.* 384, 233–247.


*(Berl.)* 142, 165–174. doi: 10.1007/s002130050876


(1999). Activation of serotonin-immunoreactive cells in the dorsal raphe nucleus in rats exposed to an uncontrollable stressor. *Brain Res.* 826, 35–43. doi: 10.1016/S0006-8993(99)01208-1


*Behav. Brain Res.* 197, 323–330. doi: 10.1016/j.bbr.2008.08.034


*Acad. Sci. U.S.A.* 95, 735–740. doi: 10.1073/pnas.95.2.735


22, 148–162. doi: 10.1016/S0893-133X(99)00093-7


A. C., and Spiga, R. (2002). Laboratory-measured aggressive behavior of women: acute tryptophan depletion and augmentation. *Neuropsychopharmacology* 26, 660–671. doi: 10.1016/S0893-133X(01)00369-4


response to delayed but not omitted rewards. *Eur. J. Neurosci.* 33, 153–160. doi: 10.1111/j.1460-9568.2010.07480.x


Goto, K. (1999). Distribution of orexin neurons in the adult rat brain. *Brain Res.* 827, 243–260. doi: 10.1016/S0006-8993(99)01336-0


preferential involvement of 5-HT2A serotonin receptors in stress- and drug-induced dopamine release in the rat medial prefrontal cortex. *Neuropsychopharmacology* 31, 265–277. doi: 10.1038/sj.npp.1300819


serotonin and GABA. *Cereb. Cortex* 15, 1–14. doi: 10.1093/cercor/bhh104


307–322. doi: 10.1016/0006-8993(94)91330-7


involvement in pituitary-adrenal function. *Ann. N.Y. Acad. Sci.* 297, 518–526. doi: 10.1111/j.1749-6632.1977.tb41879.x


raphe. *Science* 197, 89–91. doi: 10.1126/science.194312


mesolimbic dopaminergic neuronal activity via GABA mechanisms: a study with dual-probe microdialysis. *Brain Res.* 1021, 82–91. doi: 10.1016/j.brainres.2004.06.053

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 February 2013; accepted: 30 July 2013; published online: 27 August 2013.*

*Citation: Nakamura K (2013) The role of the dorsal raphé nucleus in reward-seeking behavior. Front. Integr. Neurosci. 7:60. doi: 10.3389/fnint.2013.00060*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2013 Nakamura. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Learning from negative feedback in patients with major depressive disorder is attenuated by SSRI antidepressants

*Mohammad M. Herzallah1,2\*, Ahmed A. Moustafa3,4, Joman Y. Natsheh1,2, Salam M. Abdellatif1, Mohamad B. Taha1, Yasin I. Tayem1, Mahmud A. Sehwail1, Ivona Amleh1, Georgios Petrides5, Catherine E. Myers6,7,8 and Mark A. Gluck2*

*<sup>1</sup> Al-Quds Cognitive Neuroscience Lab, Faculty of Medicine, Al-Quds University, Abu Dis, Palestinian Territories*

*<sup>2</sup> Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ, USA*

*<sup>3</sup> Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia*

*<sup>4</sup> School of Social Sciences and Psychology, University of Western Sydney, Sydney, NSW, Australia*

*<sup>5</sup> Hofstra North Shore-LIJ School of Medicine, The Zucker Hillside Hospital North Shore-LIJ Health System, Glen Oaks, NY, USA*

*<sup>8</sup> Department of Psychology, Rutgers University, Newark, NJ, USA*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*Albino J. Oliveira-Maia, Champalimaud Foundation, Portugal*

*Gina R. Poe, University of Michigan, USA*

#### *\*Correspondence:*

*Mohammad M. Herzallah, Al-Quds Cognitive Neuroscience Lab, Faculty of Medicine, Al-Quds University, P.O. Box 20002, Abu Dis, Palestinian Territories; Center for Molecular and Behavioral Neuroscience, Rutgers University, 197 University Avenue, Room 209, Newark, NJ 07102, USA e-mail: mohammad.m.herzallah@gmail.com*

One barrier to interpreting past studies of cognition and major depressive disorder (MDD) has been the failure in many studies to adequately dissociate the effects of MDD from the potential cognitive side effects of selective serotonin reuptake inhibitor (SSRI) use. To better understand how remediation of depressive symptoms affects cognitive function in MDD, we evaluated three groups of subjects: medication-naïve patients with MDD, medicated patients with MDD receiving the SSRI paroxetine, and healthy control (HC) subjects. All were administered a category-learning task that allows for dissociation between learning from positive feedback (reward) vs. learning from negative feedback (punishment). Healthy subjects learned significantly better from positive feedback than the medication-naïve and medicated MDD groups, whose learning accuracy did not differ significantly. In contrast, medicated patients with MDD learned significantly less from negative feedback than medication-naïve patients with MDD and healthy subjects, whose learning accuracy was comparable. A comparison of subjects' relative sensitivity to positive vs. negative feedback showed that both the medicated MDD and HC groups conform to Kahneman and Tversky's (1979) Prospect Theory, which expects losses (negative feedback) to loom psychologically slightly larger than gains (positive feedback). However, the medicated MDD and HC profiles are not similar, which indicates that the state of medicated MDD is not "normal" when compared to HC, but rather balanced with less learning from both positive and negative feedback. On the other hand, medication-naïve patients with MDD violate Prospect Theory by having significantly exaggerated learning from negative feedback. This suggests that SSRI antidepressants impair learning from negative feedback, while having a negligible effect on learning from positive feedback. Overall, these findings shed light on the importance of dissociating the cognitive consequences of MDD from those of SSRI treatment, and of evaluating cognition in MDD subjects in a medication-naïve state before the administration of antidepressants. Future research is needed to correlate the mood-elevating effects of SSRIs with the cognitive balance between reward- and punishment-based learning that they produce.

**Keywords: major depressive disorder, selective serotonin reuptake inhibitor, basal ganglia, reward, punishment**

### **INTRODUCTION**

Major depressive disorder (MDD) is a debilitating psychiatric disease, characterized by persistent low mood and significant loss of pleasure (Belmaker and Agam, 2008). MDD has been associated with various cognitive deficits, including alterations to learning from positive feedback (reward) and negative feedback (punishment; Eshel and Roiser, 2010). Behavioral studies suggest that patients with MDD show hypersensitive responses to punishment (Beats et al., 1996; Elliott et al., 1996, 1997), while being hyposensitive to reward (Henriques et al., 1994; McFarland and Klein, 2009; Robinson et al., 2012a). These findings fit with psychological theories of MDD, which argue that patients with MDD manifest abnormally negative attitudes and thoughts (Bower, 1981), while being unable to modulate their behavioral responses when presented with positive reinforcement, which results in misconception of environmental information to confirm these biases (Gotlib and Joormann, 2010; Roiser and Sahakian, 2013). Such cognitive biases relate to the underlying neural circuits that are affected by MDD, namely the basal ganglia and the limbic system (Sheline et al., 2001; Nutt, 2006; Dunlop and Nemeroff, 2007). Accordingly, we can draw two major conclusions from the literature on MDD patients' ability to process information in the context of positive and negative feedback. The first is that patients with MDD show exaggerated responses to negative feedback (Beats et al., 1996; Elliott et al., 1996, 1997), while the second is that MDD patients show hyposensitive responses to positive feedback (Henriques et al., 1994; McFarland and Klein, 2009; Robinson et al., 2012a).

"fnint-07-00067" — 2013/9/20 — 11:24 — page 1 — #1

In addition to being implicated in the pathophysiology of MDD, the monoamines serotonin and dopamine have also been shown to play major roles in reinforcement learning (Deakin, 1991; Dunlop and Nemeroff, 2007; Cools et al., 2011). Serotonin has been prominently associated with aversive processing as well as behavioral inhibition, where serotonin levels positively correlate with punishment-induced inhibition and aversive processing but not overall inhibition of motor responses to aversive outcomes (Deakin and Graeff, 1991; Crockett et al., 2009). Studies have shown that acute tryptophan depletion (a dietary technique used to reduce central serotonin concentrations) enhances reversal learning of aversive cues in healthy subjects (Cools et al., 2008), which mimics the feedback sensitivity bias in patients with MDD (Clark et al., 2009; Eshel and Roiser, 2010). Dopamine is key for learning from positive feedback (Schultz et al., 1997), and it has been suggested that dopaminergic dysregulation also plays a central role in the cognitive correlates of MDD (Nutt, 2006; Dunlop and Nemeroff, 2007; Nutt et al., 2007). Imaging studies have shown that patients with MDD exhibit hyposensitive responses to reward alongside an attenuated striatal response to the presentation of reward (Henriques et al., 1994; McFarland and Klein, 2009; Robinson et al., 2012a). These reports highlight the low serotonergic and low dopaminergic state in MDD, which could represent the neurochemical basis for the observed cognitive biases in MDD (Cools et al., 2011).

A substantial proportion of patients with MDD respond to pharmacological treatment with antidepressants, including selective serotonin reuptake inhibitors (SSRIs; Carvalho et al., 2007), which are thought to achieve their therapeutic effect, primarily, by modifying synaptic availability of monoamines, namely serotonin, dopamine, and norepinephrine (Malberg and Schechter, 2005). Recent studies argue that SSRI administration in MDD results in normalization of activity in the prefrontal cortex (PFC) and amygdala (Di Simplicio et al., 2012; Godlewska et al., 2012), normalization of the functional connectivity between PFC and both hippocampus and amygdala (McCabe et al., 2011), and enhancement of reward learning and striatal activity (Stoy et al., 2011). On the other hand, reports suggest that the administration of SSRIs diminishes the processing of both reward and punishment stimuli in healthy subjects (McCabe et al., 2010), but diminishes learning from punishment stimuli and enhances learning from reward stimuli in rats (Bari et al., 2010). Accordingly, there is evidence that SSRI administration normalizes brain activity in key regions for learning from positive and negative feedback, and enhances learning from positive feedback. Unfortunately, relatively little is known about how the remediation of psychiatric symptoms by SSRIs impacts the balance between learning from reward and punishment in MDD.

In this study, our main aim was to investigate the effect of remediation of depressive symptoms by SSRI administration on the balance between learning from positive and negative feedback in MDD. We tested medication-naïve patients with MDD, SSRI-responder patients with MDD, and matched healthy control (HC) subjects on a computer-based learning task that uses a mix of positive and negative feedback (Bodi et al., 2009). To our knowledge, no previous study has attempted to dissociate the effects of MDD and SSRIs on reward and punishment learning within the same study.

### **MATERIALS AND METHODS**

### **PARTICIPANTS**

We recruited and tested 13 medication-naïve patients with MDD, 18 SSRI-responding patients with MDD (MDD-T), and 22 HC subjects from various psychiatric clinics, mental health care centers and primary health care centers throughout the West Bank, Palestinian Territories. All subjects were White, ranging from 20 to 60 years of age. Participants were group matched for age, gender, and years of education, as shown in **Table 1**. All subjects underwent screening evaluations that included a medical history and a physical examination. Psychiatric assessment was conducted using an unstructured interview with a psychiatrist using the DSM-IV-TR criteria for the diagnosis of MDD (melancholic subtype), and the Mini International Neuropsychiatric Interview (MINI; Amorim et al., 1998). We recruited medication-naïve patients with MDD after they met the DSM-IV-TR criteria for MDD and completed the MINI structured clinical interview to confirm the diagnosis and the absence of comorbidities. We tested medication-naïve patients with MDD immediately prior to their initiating treatment with SSRIs. All SSRI-treated patients with MDD received 10–30 mg of paroxetine per day (mean = 18.333, SD = 5.941) as part of their normal ongoing treatment. MDD-T patients' average exposure to SSRIs was 12.833 (SD = 18.912) months. MDD-T patients' response to SSRIs was assessed using subjective reports and scores on the Beck Depression Inventory II (BDI-II). Inclusion criteria for HC subjects were absence of any psychiatric, neurological, or other disorders that might affect cognition. Exclusion criteria for all subjects included psychotropic drug exposure, except for the SSRI paroxetine in the SSRI-treated MDD group; major medical or neurological illness; illicit drug use or alcohol abuse within the past year; lifetime history of alcohol or drug dependence; psychiatric disorders other than major depression (excepting comorbid anxiety symptoms); and current pregnancy or breastfeeding. After receiving a complete description of the study, participants provided written informed consent as approved by both the Al-Quds University Ethics Committee and the Rutgers Institutional Review Board.

**Table 1 | Summary of demographic and neuropsychological results.**

*HC, healthy controls; MDD, medication-naïve patients with MDD; MDD-T, SSRI-treated patients with MDD; MMSE, Mini-Mental Status Exam; BDI-II, Beck Depression Inventory II; BAI, Beck Anxiety Inventory; TPQ, Tridimensional Personality Questionnaire; HA, harm avoidance; RD, reward dependence; NS, novelty seeking.*

### **PSYCHOMETRIC AND PSYCHOPATHOLOGY TEST BATTERY**

All subjects completed the validated Arabic version (Herzallah et al., 2010, 2013) of a battery of psychometric and psychopathology test questionnaires: Mini-Mental Status Examination (MMSE; Folstein et al., 1975), BDI-II (Beck et al., 1996), and Beck Anxiety Inventory (BAI; Beck et al., 1988). Further, all subjects completed the Tridimensional Personality Questionnaire (TPQ; Cloninger et al., 1991). All results are summarized in **Table 1**.

### **COMPUTER-BASED COGNITIVE TASK**

*Reward and punishment learning*

Participants were administered a computer-based classification task (Bodi et al., 2009). On each trial, participants viewed one of eight images (**Figure 1**), and were asked to guess whether that stimulus predicts rainy weather (Rain, **Figure 1**) or sunny weather (Sun, **Figure 1**). For each participant, the eight images were randomly assigned to be stimuli S1–S8. On any given trial, stimuli S1, S3, S5, and S7 predicted Rain, while stimuli S2, S4, S6, and S8 predicted Sun. Stimuli S1–S4 were used in the reward-learning task. Four stimuli per valence were employed in order to balance category outcome frequencies, so that two stimuli in each task were associated with each outcome. Thus, if the participant correctly guessed category membership on a trial with any of the reward-learning stimuli, a reward of +25 points was received; if the participant guessed incorrectly, no feedback appeared. Stimuli S5–S8 were used in the punishment-learning task. Thus, if the participant guessed incorrectly on a trial with any of these stimuli, a punishment of –25 points was received; correct guesses received no feedback.

**FIGURE 1 |** [...] whether this stimulus predicts rain or sun. **(B)** No feedback is given for incorrect answers in rewarding stimuli or correct answers in punishing [...] incorrect responses get punished with visual feedback and the loss of 25 points.

The experiment was conducted on a Macintosh MacBook, programmed in the SuperCard language. The participant was seated in a quiet testing room at a comfortable viewing distance from the screen. The keyboard was masked except for two keys, labeled "Sun" and "Rain," which the participant could use to enter responses. At the start of the experiment, the participant read the following instructions: "Welcome to the Fortuneteller School! You will be trained as a fortune teller to predict the weather. You learn to do this by using cards that either predict rain or sun. Your goal is to learn which cards predict rain and which cards predict sun." The practice phase then walked the participant through an example of a correct and an incorrect response to a sample trial in the reward-learning task and an example of a correct and an incorrect response to a sample trial in the punishment-learning task. These examples used images other than those assigned to S1–S8. The participant saw a practice image, with a prompt to choose "Sun" or "Rain," and a running tally of points at the lower right corner of the screen. The tally was initialized to 500 points at the start of practice. The participant was first instructed to press the "Sun" key, which resulted in a reward of +25 and an updated point tally, and then the "Rain" key, which resulted in no feedback. The participant then saw a second practice figure and was instructed first to press the "Rain" key, which resulted in a punishment of –25 and an updated point tally, and then the "Sun" key, which resulted in no feedback. After these two practice trials, a summary of instructions appeared: "So... for some pictures, if you guess CORRECTLY, you WIN points (but, if you guess incorrectly, you win nothing). For other pictures, if you guess INCORRECTLY, you LOSE points (but, if you guess correctly, you lose nothing). Your job is to win all the points you can and lose as few as you can. Press the mouse button to begin the experiment."

From here, the experiment began. In each trial, the participant saw one of the eight stimuli (S1–S8) and was prompted to guess whether it was a "Sun" or a "Rain." On trials in the reward-learning task (with stimuli S1–S4), correct answers were rewarded with positive feedback and a gain of 25 points; incorrect answers received no feedback. On trials in the punishment-learning task (with stimuli S5–S8), incorrect answers were punished with negative feedback and a loss of 25 points; correct answers received no feedback. The task contained 160 trials, distributed over four blocks of 40 trials. Within each block, each stimulus appeared five times and trial order was randomized, so that training on the reward-learning task (S1–S4) and the punishment-learning task (S5–S8) was intermixed. Trials were separated by a 1 s interval, during which the screen was blank. The no-feedback outcome, when it arrived, was ambiguous, as it could signal lack of reward (if received during a trial with S1–S4) or lack of punishment (if received during a trial with S5–S8).
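The trial structure and feedback rule described above can be summarized in a short sketch. This is an illustration only, not the SuperCard program used in the study: the function names, the seeded random generator, and the simulated guessing participant are hypothetical, and resetting the tally to 500 at the start of the main experiment is an assumption (the text states only that the tally was 500 at the start of practice).

```python
import random

STIMULI = [f"S{i}" for i in range(1, 9)]
RAIN_STIMULI = {"S1", "S3", "S5", "S7"}    # these predict Rain; the others predict Sun
REWARD_STIMULI = {"S1", "S2", "S3", "S4"}  # reward-learning task; S5-S8 are punishment-learning
POINTS = 25
N_BLOCKS = 4
REPEATS_PER_BLOCK = 5                      # 8 stimuli x 5 repeats = 40 trials per block


def correct_category(stimulus):
    """Return the outcome ('Rain' or 'Sun') that a stimulus predicts."""
    return "Rain" if stimulus in RAIN_STIMULI else "Sun"


def feedback(stimulus, response):
    """Return the point change for one trial; 0 means no feedback is shown."""
    is_correct = response == correct_category(stimulus)
    if stimulus in REWARD_STIMULI:
        return POINTS if is_correct else 0   # reward trials: win or nothing
    return 0 if is_correct else -POINTS      # punishment trials: nothing or lose


def make_schedule(rng):
    """Build four blocks of 40 trials with randomized order within each block."""
    schedule = []
    for _ in range(N_BLOCKS):
        block = STIMULI * REPEATS_PER_BLOCK
        rng.shuffle(block)
        schedule.append(block)
    return schedule


if __name__ == "__main__":
    rng = random.Random(0)                   # hypothetical seed, for reproducibility only
    schedule = make_schedule(rng)
    tally = 500                              # assumption: tally restarts at 500
    for block in schedule:
        for stimulus in block:
            response = rng.choice(["Sun", "Rain"])  # simulated guessing participant
            tally += feedback(stimulus, response)
    print("trials:", sum(len(block) for block in schedule), "final tally:", tally)
```

Because a zero point change occurs both for an incorrect response on a reward trial and for a correct response on a punishment trial, the sketch also makes the ambiguity of the no-feedback outcome explicit.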

### **STATISTICAL ANALYSIS**

The normality of data distribution was checked using Kolmogorov–Smirnov tests. All data were normally distributed (*p* > 0.1). We used mixed-design three-way ANCOVA followed by mixed-design two-way ANOVA and one-way ANOVA *post hoc* tests, Tukey's honestly significant difference (HSD) *post hoc* tests and Bonferroni *post hoc* tests. The level of significance was set at α = 0.05.
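The Bonferroni-adjusted thresholds reported in the Results follow from dividing the family-wise α = 0.05 by the number of tests in each family. The sketch below reproduces that arithmetic; the function name is hypothetical, and the family sizes (three groups, two feedback types, four blocks, eight psychometric measures) are inferred from the reported thresholds rather than stated by the authors.

```python
FAMILY_ALPHA = 0.05  # overall significance level stated in the text


def bonferroni_alpha(n_tests, family_alpha=FAMILY_ALPHA):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


if __name__ == "__main__":
    # Family sizes are assumptions inferred from the thresholds in the Results.
    families = {
        "one-sample t-tests (3 groups)": 3,
        "two-way ANOVAs (2 feedback types)": 2,
        "one-way ANOVAs (4 blocks)": 4,
        "psychometric measures (8 tests)": 8,
    }
    for label, n_tests in families.items():
        print(f"{label}: alpha = {bonferroni_alpha(n_tests):.5f}")
```

Under these assumptions the thresholds are 0.05/3 ≈ 0.017, 0.05/2 = 0.025, 0.05/4 = 0.0125, and 0.05/8 = 0.00625, the last of which appears in the text as 0.006.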

### **RESULTS**

### **BEHAVIORAL RESULTS**

We used one-sample *t*-tests on the percentage of correct responses in the fourth block of learning, for both reward and punishment, to check whether each group learned significantly better than chance. In reward learning, MDD-T and HC learned significantly better than chance, with a Bonferroni-corrected α = 0.017 to protect the level of significance [MDD-T: *t*(17) = 3.264, *p* = 0.005; HC: *t*(21) = 9.997, *p* < 0.001], while MDD did not [*t*(12) = 0.925, *p* = 0.373]. In punishment learning, all groups learned significantly better than chance, with a Bonferroni-corrected α = 0.017 to protect the level of significance [MDD: *t*(12) = 7.704, *p* < 0.001; MDD-T: *t*(17) = 3.394, *p* = 0.003; HC: *t*(11) = 13.231, *p* < 0.001].
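As a minimal sketch of the test against chance, the statistic for one group can be computed as below. The chance level of 50% is an assumption based on the task having two response options, the function name is hypothetical, and the scores in the usage example are invented for illustration and are not data from the study.

```python
import math
import statistics

CHANCE_PERCENT = 50.0  # assumption: two response options, so guessing yields 50% correct


def one_sample_t(scores, mu=CHANCE_PERCENT):
    """Return (t, df) for a one-sample t-test of the mean of scores against mu."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("scores have zero variance; t is undefined")
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical block-4 percentages of correct responses, for illustration only.
    hypothetical_scores = [60.0, 70.0, 80.0, 90.0]
    t, df = one_sample_t(hypothetical_scores)
    print(f"t({df}) = {t:.3f}")
```

The degrees of freedom equal the group size minus one, so groups of 13, 18, and 22 subjects give 12, 17, and 21. Converting *t* to a *p* value requires the *t* distribution, which a statistics package such as SciPy (`scipy.stats.ttest_1samp`) provides; the resulting *p* would then be compared with the Bonferroni-corrected α.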

Using a mixed-design three-way ANCOVA, we analyzed the data obtained from the cognitive task with group as the between-subject variable, learning block and feedback type as within-subject variables, BDI-II scores as a covariate, and the percentage of correct responses on reward and punishment as the dependent variables. There was a significant effect of group [*F*(2,51) = 9.433, *p* < 0.001, η<sup>2</sup> = 0.270] and block [*F*(3,153) = 11.880, *p* < 0.001, η<sup>2</sup> = 0.189], as illustrated in **Figure 2**. However, there was no significant effect of feedback type [*F*(1,51) = 1.337, *p* = 0.253]. We conducted two *post hoc* mixed-design two-way ANOVAs, with group as the between-subject variable, learning block as the within-subject variable, and the percentage of correct responses on reward as the dependent variable in one ANOVA and on punishment in the other, with a Bonferroni-corrected α = 0.025 to protect the level of significance. The reward *post hoc* ANOVA revealed a significant effect of group [*F*(2,50) = 5.094, *p* = 0.010, η<sup>2</sup> = 0.169] and block [*F*(3,150) = 6.000, *p* = 0.001, η<sup>2</sup> = 0.107] along with an interaction between group and block [*F*(6,150) = 3.098, *p* = 0.007, η<sup>2</sup> = 0.110]. We used four *post hoc* one-way ANOVAs to explore the significant interaction between group and block, with group as the between-subject variable and the percentage of correct responses in each of the four reward-learning blocks as the dependent variable, with a Bonferroni-corrected α = 0.0125 to protect the level of significance. One-way ANOVA and Tukey's HSD results are summarized in **Table 2**. The punishment *post hoc* two-way ANOVA showed a significant effect of group [*F*(2,50) = 4.512, *p* = 0.016, η<sup>2</sup> = 0.153] and block [*F*(3,150) = 45.644, *p* < 0.001, η<sup>2</sup> = 0.477], but no interaction between group and block at the corrected α [*F*(6,150) = 2.426, *p* = 0.029]. Tukey's HSD *post hoc* test revealed a significant difference between MDD-T and both MDD and HC (*p* < 0.05), but not between MDD and HC.

To investigate the balance between reward and punishment learning, we subtracted punishment learning accuracy in a particular block from reward learning accuracy in the same block. A two-way ANOVA, with group as the between-subject variable, block of learning as the within-subject variable, and the mean difference between percentage correct responses in reward and punishment trials as the dependent variable, revealed a significant effect of block [*F*(3,150) = 11.147, *p* < 0.001, η<sup>2</sup> = 0.182] and an interaction between block and group [*F*(6,150) = 3.145, *p* = 0.006, η<sup>2</sup> = 0.112], but no significant effect of group [*F*(2,50) = 2.486, *p* = 0.094], as illustrated in **Figure 3**. To investigate the interaction between block and group, we ran four *post hoc* one-way ANOVAs with Tukey's HSD *post hoc* analyses, one for each block, with group as the between-subject variable and the mean difference between percentage correct responses in reward and punishment trials as the dependent variable. ANOVA and Tukey's HSD results are reported in **Table 3**.

**FIGURE 2 | Performance on the reward and punishment learning task. (A)** The mean number of correct responses in the four phases for the reward stimuli (±SEM). **(B)** The mean number of correct responses in the four phases for the punishment stimuli (±SEM). MDD, medication-naïve patients with MDD; MDD-T, medicated patients with MDD; HC, healthy controls.

*HC, healthy controls; MDD, medication-naïve patients with MDD; MDD-T, SSRI-treated patients with MDD. The symbol "\*" marks significant results.*
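The difference score used in this analysis can be illustrated with a short sketch. The function name is hypothetical and the per-block percentages in the usage example are invented for illustration; they are not values from the study.

```python
def reward_minus_punishment(reward_pct, punishment_pct):
    """Per-block difference: percent correct on reward minus percent correct on punishment."""
    if len(reward_pct) != len(punishment_pct):
        raise ValueError("both sequences must cover the same blocks")
    return [r - p for r, p in zip(reward_pct, punishment_pct)]


if __name__ == "__main__":
    # Hypothetical percent-correct values for blocks 1-4 of one subject.
    reward = [50.0, 55.0, 60.0, 65.0]
    punishment = [55.0, 65.0, 75.0, 80.0]
    print(reward_minus_punishment(reward, punishment))
```

A negative difference in a block means the subject was more accurate on punishment trials than on reward trials in that block, and a value near zero indicates balanced learning from the two feedback types.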

### **PSYCHOMETRIC RESULTS**

There was no significant effect of group on age, education, MMSE score, or the novelty seeking subsection of the TPQ, with a Bonferroni-corrected α = 0.006 to protect the level of significance (*p* > 0.006). However, there was a significant difference between groups in BDI-II scores [*F*(2,50) = 77.576, *p* < 0.001, η<sup>2</sup> = 0.756, Tukey's HSD *post hoc*: significant difference between MDD and both MDD-T and HCs], BAI scores [*F*(2,50) = 52.444, *p* < 0.001, η<sup>2</sup> = 0.677, Tukey's HSD *post hoc*: significant difference between MDD and both MDD-T and HCs], harm avoidance subsection of the TPQ [*F*(2,50) = 15.903, *p* < 0.001, η<sup>2</sup> = 0.389, Tukey's HSD *post hoc*: significant difference between HC and both MDD and MDD-T, and between MDD-T and HC], and reward dependence subsection of the TPQ [*F*(2,50) = 5.808, *p* = 0.005, η<sup>2</sup> = 0.189, Tukey's HSD *post hoc*: significant difference between HC and both MDD and MDD-T].

### **DISCUSSION**

We have three main findings. First, SSRI-treated patients with MDD were less sensitive to negative feedback (punishment) than either medication-naïve patients with MDD or HC subjects, based on their accuracy in the cognitive task. Second, both medication-naïve and SSRI-treated patients with MDD were less sensitive to positive feedback than HC subjects. Third, a comparison of subjects' learning from positive vs. negative feedback showed that both the HC and MDD-T groups conform to Kahneman and Tversky's (1979) Prospect Theory, which expects losses (negative feedback) to loom psychologically larger than gains (positive feedback; Kahneman and Tversky, 1979). In contrast, the medication-naïve MDD patients violate Prospect Theory by being significantly more biased toward negative feedback.

### **BEHAVIORAL AND NEURAL CORRELATES OF MDD**

Abnormally exaggerated reactions to negative events and the overlooking of positive events are considered central features of MDD (Beats et al., 1996; Elliott et al., 1996). These abnormal responses to positive and negative feedback represent an important link between emotional and cognitive disturbances in MDD (Wright and Beck, 1983; Elliott et al., 1997), with patients showing increased elaboration of negative information (Gotlib and Joormann, 2010) while ignoring positive information. As explained by the cognitive theory of depression (Clark and Beck, 2010), depressed people tend to demonstrate selective attention to negative information, magnifying the importance and meaning placed on negative events (Beck, 1979; Bower, 1981). Our results show that medication-naïve patients with MDD learn from punishment as efficiently as HC subjects, but fail to learn from reward feedback. However, the task design used in the current study is not ideal for delineating higher-than-normal learning from punishment in MDD because of a possible ceiling effect (**Figure 2B**). Further research is needed to investigate the differential sensitivity to negative feedback in MDD as compared to healthy subjects, and to properly correlate cognitive measures with symptom distribution and severity in patients with MDD.

The strong biases of patients with MDD toward negative stimuli and away from positive ones highlight the role of serotonin in the processing of affective stimuli, in the inhibitory control of behavior, and in the adaptation of animals to aversive events (Graeff et al., 1996), and underpin the attentional bias in MDD toward negative feedback (Mogg et al., 1995; Harmer et al., 2009). Lowering brain serotonin levels by acute depletion of tryptophan (the serotonin precursor) in healthy volunteers results in increased sensitivity to punishment and negative feedback without affecting reward (Cools et al., 2008; Robinson et al., 2012b). These alterations in reward and punishment processing implicate a neural circuit that is composed of brain regions strongly innervated by serotonin, namely, the medial PFC and the ventral striatum (Clark et al., 2009).

Recent imaging studies argue that patients with MDD manifest cognitive and neurochemical dysfunction directly related to the nigrostriatal dopaminergic system (Dunlop and Nemeroff, 2007; Walter et al., 2007; Robinson et al., 2012a). In addition, previous research has shown that the basal ganglia dopaminergic system is vital for learning to predict rewarding outcomes (Schultz et al., 1997; Haber and Knutson, 2010). In a previous study using a reward-punishment learning task (similar to the task we used in this paper), we demonstrated that medication-naïve patients with Parkinson's disease learned very well from punishment but were impaired on reward learning (Bodi et al., 2009). Our findings indicate that medication-naïve patients with MDD show a cognitive profile similar to that of de novo patients with Parkinson's disease (Bodi et al., 2009). Both disorders were shown to suppress learning from reward (Henriques et al., 1994; Bodi et al., 2009; McFarland and Klein, 2009; Robinson et al., 2012a), without altering learning from punishment (Beats et al., 1996; Elliott et al., 1996, 1997; Bodi et al., 2009). This observation might be attributed to the effect of both disorders on striatal dopamine (Kish et al., 1988; Walter et al., 2007). Further, there is a very high level of comorbidity between MDD and Parkinson's disease (Cummings, 1992; Schuurman et al., 2002; Leentjens et al., 2003; Veiga et al., 2009). However, it is not clear whether this overlap between the two disorders is a consequence of dopaminergic dysfunction alone or of a mixture of monoaminergic effects (Kitaichi et al., 2010; Delaville et al., 2012). In addition, our findings suggest that SSRI-treated patients with MDD learn significantly less than HC subjects from positive feedback, similar to medication-naïve patients with MDD. Future studies ought to compare the cognitive correlates of SSRI administration in MDD and in depression in Parkinson's disease.

**Table 3 | Summary of the** *post hoc* **one-way ANOVA and Tukey's HSD** *post hoc* **analyses on each block of mean difference between percentage correct responses in reward and punishment trials to investigate the interaction between block and group, with group as the between-subject variable and the mean difference between percentage correct responses in reward and punishment trials as the dependent variable.**

*HC, healthy controls; MDD, medication-naïve patients with MDD; MDD-T, SSRI-treated patients with MDD. The symbol "\*" marks significant results.*

Increasing the central level of serotonin by administration of SSRIs counteracts MDD-related negative biases in aversive learning paradigms in animals (Bari et al., 2010) as well as in emotional learning paradigms in humans (Harmer et al., 2009; McCabe et al., 2010). Various studies show that the administration of SSRIs normalizes the BOLD response in the dorsomedial PFC and across the functional connection between PFC and both hippocampus and amygdala (McCabe et al., 2011). Hence, it has been proposed that SSRIs may ameliorate MDD symptoms by inhibiting processing of negative feedback (Boureau and Dayan, 2011; Cools et al., 2011). In agreement with these results, we found here that SSRI-treated patients with MDD are less sensitive to negative feedback than both medication-naïve patients with MDD and HC subjects.

In Watts et al. (2012), daily administration of SSRIs caused normal rats to slowly begin to lose selectivity in their box-checking behavior for food reward; they soon began to check more unbaited boxes. If SSRI administration reduces the salience of punishment, it may be that Watts et al.'s (2012) behavioral outcome is not due to a lack of consolidation or reconsolidation of which boxes were baited or unbaited, as the authors chose to interpret their findings, but rather resulted from a lack of motivation to discriminate the rewarded from the unrewarded boxes, since the slight negative drawback (waste of time and effort) of checking an unbaited box was no longer worth the cognitive effort of discrimination. This could support either a learning deficit with MDD treatment or a loss of the power of negative motivation, or both. However, it also remains possible that the change in MDD-T performance in our study is due to an *a priori* learning impairment caused by the MDD treatment, or to the effects of recovery from MDD. All groups did seem to learn the positive reward stimuli, but none of them learned them well, whereas the MDD and HC groups learned from punishment quite well, and the MDD-T group's poor learning from punishment was comparable to its poor learning from reward.

Driven by the SSRI-related suppression of punishment learning, SSRI-treated patients with MDD expressed a balanced reward-punishment learning bias similar to that of HC subjects. This balance may be the underlying mechanism for SSRI-induced restoration of mood in patients with MDD. It is worth noting, however, that the SSRI-treated MDD and HC profiles are not similar, which indicates that the state of SSRI-treated MDD is not "normal" (when compared to HC), but rather balanced with less learning from both positive and negative feedback. The negative values in this difference computation for the HC and MDD-T groups indicate a biased sensitivity to learn slightly more quickly from negative feedback (punishment) than from positive feedback (reward), as expected by Kahneman and Tversky's (1979) Prospect Theory, which expects that losses from negative feedback should loom larger than gains from positive feedback. Only the MDD group failed to conform to Prospect Theory, showing a significantly exaggerated bias toward negative feedback.

### **LIMITATIONS AND FUTURE DIRECTIONS**

An important limitation of the current study is that the different severity of depressive symptoms in SSRI-treated vs. medication-naïve patients might have contributed to the difference between the groups. We did not have access to SSRI-treated patients' BDI-II scores before they were placed on the SSRI regimen. Therefore, it is impossible to conclude that the observed behavioral effects originate from the medication alone. However, we added BDI-II scores as a covariate in our main analysis, and matched the different groups on a number of psychometric measures.

Another major limitation of our study is the between-subject design, where the medication-naïve and the SSRI-treated patients with MDD are different individuals. Given the heterogeneity of MDD, and how various subtypes of MDD differ with regard to cognitive function, the current result might be confounded by between-subject variability originating from factors other than MDD and SSRI administration. Further, given that we recruited SSRI-responders, not all of the selected medication-naïve patients with MDD can be expected to turn out to be responders once they start SSRI monotherapy, which limits the comparability of the groups. We did, of course, try to control for that in the current study by recruiting melancholic patients with MDD only, and by matching the two groups on various psychometric and demographic measures as described earlier. However, future work ought to address this issue by examining the same patients with MDD on and off medication. Another limitation of the current study is the low number of recruited subjects. However, given that the focus of the current study is cognitive function assessment, all *a priori* power analyses indicated the need for 14 subjects per group to achieve power levels higher than 90%, which confirms the sufficiency of the number of subjects in the analysis of our primary cognitive results. Future studies, however, should address these limitations and better control for possible confounding variables.

### **ACKNOWLEDGMENTS**

We would like to thank Al-Quds Cognitive Neuroscience Lab students: Omar Danoun, Dana Deeb, Aya Imam, Issa Isaac, Hussain Khdour, and Jeries Kort for their excellent technical assistance. Research reported in this publication was supported by National Institutes of Health Award R21MH095656 from the Fogarty International Center and the National Institute of Mental Health to Mark A. Gluck, the Palestinian American Research Center (PARC), as well as generous donations from Saad N. Mouasher, Dr. Samih Darwazah (Hikma Pharmaceuticals LLC.), Dr. Fouad Rasheed, and Mahmud Atallah.

### **REFERENCES**

depression: diminished responsiveness to anticipated reward but not to anticipated punishment or to nonreward or avoidance. *Depress. Anxiety* 26, 117–122. doi: 10.1002/da.20513


not reward prediction: implications for resilience. *Psychopharmacology (Berl.)* 219, 599–605. doi: 10.1007/s00213-011-2410-5


depressed patients normalizes after treatment with escitalopram. *J. Psychopharmacol.* 26, 677–688. doi: 10.1177/0269881111416686


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

### *Received: 10 April 2013; accepted: 26 August 2013; published online: 23 September 2013.*

*Citation: Herzallah MM, Moustafa AA, Natsheh JY, Abdellatif SM, Taha MB, Tayem YI, Sehwail MA, Amleh I, Petrides G, Myers CE and Gluck MA (2013) Learning from negative feedback in patients with major depressive disorder is attenuated by SSRI antidepressants. Front. Integr. Neurosci. 7:67. doi: 10.3389/fnint.2013.00067*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2013 Herzallah, Moustafa, Natsheh, Abdellatif, Taha, Tayem, Sehwail, Amleh, Petrides, Myers and Gluck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Serotonergic modulation of spatial working memory: predictions from a computational network model

#### *Maria Cano-Colino<sup>1</sup>, Rita Almeida<sup>1,2</sup> and Albert Compte<sup>1</sup>\**

*<sup>1</sup> Systems Neuroscience Group, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain*

*<sup>2</sup> Department of Neuroscience, Karolinska Institute, Stockholm, Sweden*

### *Edited by:*

*KongFatt Wong-Lin, University of Ulster, Northern Ireland*

### *Reviewed by:*

*KongFatt Wong-Lin, University of Ulster, Northern Ireland*

*Da-Hui Wang, Beijing Normal University, China*

### *\*Correspondence:*

*Albert Compte, Institut d'Investigacions Biomèdiques August Pi i Sunyer, C/Rosselló 149, 08036 Barcelona, Spain e-mail: acompte@clinic.ub.es*

Serotonin (5-HT) receptors of types 1A and 2A are strongly expressed in prefrontal cortex (PFC) neurons, an area associated with cognitive function. Hence, 5-HT could be effective in modulating prefrontal-dependent cognitive functions, such as spatial working memory (SWM). However, a direct association between 5-HT and SWM has proved elusive in psycho-pharmacological studies. Recently, a computational network model of the PFC microcircuit was used to explore the relationship between 5-HT and SWM (Cano-Colino et al., 2013). This study found that both excessive and insufficient 5-HT levels lead to impaired SWM performance in the network, and it concluded that analyzing behavioral responses based on confidence reports could facilitate the experimental identification of SWM behavioral effects of 5-HT neuromodulation. Such analyses may have confounds based on our limited understanding of metacognitive processes. Here, we extend these results by deriving three additional predictions from the model that do not rely on confidence reports. Firstly, only excessive levels of 5-HT should result in SWM deficits that increase with delay duration. Secondly, excessive 5-HT baseline concentration makes the network vulnerable to distractors at distances that were robust to distraction in control conditions, while the network still ignores distractors efficiently for low 5-HT levels that impair SWM. Finally, 5-HT modulates neuronal memory fields in neurophysiological experiments: Neurons should be better tuned to the cued stimulus than to the behavioral report for excessive 5-HT levels, while the reverse should happen for low 5-HT concentrations. In all our simulations agonists of 5-HT1A receptors and antagonists of 5-HT2A receptors produced behavioral and physiological effects in line with global 5-HT level increases. Our model makes specific predictions to be tested experimentally and advance our understanding of the neural basis of SWM and its neuromodulation by 5-HT receptors.

**Keywords: persistent activity, prefrontal cortex, computational model, metacognition, working memory**

### **INTRODUCTION**

Spatial working memory (SWM) is a function of the prefrontal cortex (PFC) (Smith and Jonides, 1999) that is known to be under the neuromodulatory control of the monoamine nuclei of the brainstem (Brozoski et al., 1979; Porrino and Goldman-Rakic, 1982; Ellis and Nathan, 2001; Robbins and Arnsten, 2009; Arnsten, 2011; Arnsten et al., 2012). Catecholamines in particular have been implicated in SWM, while comparatively few studies have investigated neuromodulation of SWM by the indolamine serotonin (5-hydroxytryptamine, 5-HT) (Park et al., 1994; Luciana et al., 1998, 2001; Vollenweider et al., 1998; Carter et al., 2005; Wingen et al., 2007; Wittmann et al., 2007; Mendelsohn et al., 2009; Silber and Schmitt, 2010). These studies have not found consistent effects of 5-HT on SWM, even though the serotonergic system is associated with general cognitive function (Schmitt et al., 2006; Robert and Benoit, 2008; Ögren et al., 2008; Froestl et al., 2012). This is unexpected, given the marked effect of local application of serotonergic drugs on the persistent activity of prefrontal neurons in monkeys engaged in a SWM task (Williams et al., 2002), an activity that is believed to subserve working memory function in the PFC (Funahashi et al., 1989; Goldman-Rakic, 1995). A role for 5-HT in controlling PFC-dependent functions is also suggested by the strong expression of serotonin receptors in PFC (Jacobs and Azmitia, 1992; Jakab and Goldman-Rakic, 1998; Barnes and Sharp, 1999; Amargós-Bosch et al., 2004; Santana et al., 2004; De Almeida et al., 2008) and their capacity to modulate prefrontal cortical activity (Jacobs and Azmitia, 1992; Puig et al., 2004, 2005, 2010; Celada et al., 2013).

Recently, a computational model of SWM was used to study the effects of 5-HT receptors on network function (Cano-Colino et al., 2013). This study suggested that a non-monotonic dependency of SWM performance on 5-HT concentration could underlie the difficulty in identifying serotonergic effects in psychopharmacological studies of SWM. Furthermore, the network model predicted that worsened SWM performance upon excessive and defective activation of 5-HT receptors could be discriminated based on a careful examination of the nature of the errors committed (Cano-Colino et al., 2013). In particular, the confidence declared by subjects after erroneous responses could distinguish the behavioral effects of increased and reduced 5-HT activation, and thus lead to detecting a robust serotonergic effect on SWM. These predictions are experimentally testable, since confidence reports are being increasingly used in both human (Pessoa and Ungerleider, 2004; Mayer and Park, 2012; Rademaker et al., 2012) and animal (Middlebrooks and Sommer, 2011, 2012; Tanaka and Funahashi, 2012) studies of working memory. However, there are limitations in relying only on the confidence report to test the computational model. On the one hand, confidence reports rely on metacognition, the knowledge about one's own cognitive processes (Flavell, 1979), which is also a PFC-dependent function (Schnyer et al., 2004; Pannu et al., 2005; Rounis et al., 2010; Middlebrooks and Sommer, 2012). As a result, metacognitive reports could themselves be affected by serotonergic manipulations, and there could be a confound between a 5-HT effect on the confidence circuits and a 5-HT effect on the SWM circuit. On the other hand, neurobiological research on metacognition is in its infancy, so that confidence-based predictions are difficult to test in animal studies.

Here, we present behavioral and physiological predictions from the computational model of SWM subject to 5-HT neuromodulation (Cano-Colino et al., 2013) that do not require metacognitive reports. In this way, we provide control predictions to disambiguate a 5-HT effect on metacognition as opposed to on SWM, and we also extend the predictive power of the computational model so it can be validated in electrophysiological studies that use well-established behavioral paradigms of SWM during application of pharmacological agents in behaving animals (Williams et al., 2002; Meneses, 2007; Vijayraghavan et al., 2007).

### **MATERIALS AND METHODS**

### **THE MODEL NETWORK**

We used a neuronal network model of the PFC to explore the relationship between 5-HT and SWM (Cano-Colino et al., 2013). The network model represents a local circuit of the monkey dorsolateral PFC (Funahashi et al., 1989). The local recurrent cortical network consists of two populations of leaky integrate-and-fire neurons (Tuckwell, 1988): excitatory pyramidal cells (*N<sub>E</sub>* = 1024) and inhibitory interneurons (*N<sub>I</sub>* = 256). The membrane voltage *V<sub>m</sub>* of each neuron obeys the following dynamical equation:

$$C\_m \frac{dV\_m}{dt} = -I\_L - I\_{\text{syn},e} - I\_{\text{syn},i} - I\_{\text{ext}} + I\_s - I\_{\text{5-HT}}$$

where *C<sub>m</sub>* represents the membrane capacitance of the neuron. When *V<sub>m</sub>* reaches a threshold value *V*<sub>th</sub>, *V<sub>m</sub>* is reset to *V*<sub>res</sub> and stays there for an absolute refractory period τ<sub>ref</sub>. *I*<sub>ext</sub> represents random synaptic inputs from outside the network, simulated as uncorrelated Poisson spike trains activating AMPA channels of conductance *g*<sub>ext</sub> at a rate ν<sub>ext</sub>. *I<sub>s</sub>* is the input current associated with stimulus presentation (see below). *I*<sub>syn,*e*</sub> and *I*<sub>syn,*i*</sub> are the recurrent synaptic inputs from presynaptic pyramidal cells and interneurons, respectively. Details of synaptic transmission are given below. For pyramidal neurons, *I*<sub>5-HT</sub> is the current modulated by 5-HT (see below) and the leak current is *I<sub>L</sub>* = *g<sub>L</sub>* (*V<sub>m</sub>* − *E<sub>L</sub>*), with *g<sub>L</sub>* and *E<sub>L</sub>* being the conductance and reversal potential of leak channels. For interneurons, *I<sub>L</sub>* depends on 5-HT concentration (see below) and there is no *I*<sub>5-HT</sub>. The intrinsic parameters that characterize pyramidal cells are: *C<sub>m</sub>* = 0.5 nF, *g<sub>L</sub>* = 27.4 nS, *E<sub>L</sub>* = −70 mV, *V*<sub>th</sub> = −50 mV, *V*<sub>res</sub> = −60 mV, ν<sub>ext</sub> = 1650 Hz, *g*<sub>ext</sub> = 5 nS, and τ<sub>ref</sub> = 2 ms. For interneurons, *C<sub>m</sub>* = 0.2 nF, *E<sub>L</sub>* = −70 mV, *V*<sub>th</sub> = −50 mV, *V*<sub>res</sub> = −60 mV, ν<sub>ext</sub> = 1800 Hz, *g*<sub>ext</sub> = 1.8 nS, and τ<sub>ref</sub> = 1 ms.
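The threshold, reset, and refractory rules described above can be illustrated with a minimal sketch of a single pyramidal cell. This is not the authors' C++ code: it keeps only the leak current plus a constant depolarizing current (the hypothetical argument `i_inj_nA`), omits all synaptic and 5-HT currents, and uses forward Euler for brevity. The function name `simulate_lif` is likewise an assumption made for illustration.

```python
def simulate_lif(i_inj_nA, t_total_ms=1000.0, dt_ms=0.02):
    """Spike times (ms) of a leak-only pyramidal LIF cell (illustrative sketch)."""
    c_m = 0.5                                  # membrane capacitance, nF
    g_l = 0.0274                               # leak conductance, uS (27.4 nS)
    e_l, v_th, v_res = -70.0, -50.0, -60.0     # mV
    t_ref = 2.0                                # absolute refractory period, ms
    v = e_l
    refractory_left = 0.0
    spikes = []
    for step in range(int(round(t_total_ms / dt_ms))):
        if refractory_left > 0.0:
            # Hold the membrane at the reset potential while refractory.
            refractory_left -= dt_ms
            v = v_res
            continue
        # nF * mV / ms = nA and uS * mV = nA, so the units are consistent.
        dv_dt = (-g_l * (v - e_l) + i_inj_nA) / c_m
        v += dt_ms * dv_dt
        if v >= v_th:
            spikes.append(step * dt_ms)
            v = v_res
            refractory_left = t_ref
    return spikes
```

With these parameters the leak alone sets a rheobase of *g<sub>L</sub>* (*V*<sub>th</sub> − *E<sub>L</sub>*) = 27.4 nS × 20 mV ≈ 0.55 nA, so a call such as `simulate_lif(0.5)` should stay subthreshold while `simulate_lif(0.7)` should fire repetitively.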

Neurons are connected through conductance-based synapses of the AMPA, NMDA, and GABA<sub>A</sub> types, which were calibrated by the experimentally measured dynamics of synaptic currents (Wang, 1999). Thus, postsynaptic currents were modeled according to *I*<sub>syn</sub> = *g*<sub>syn</sub> *s* (*V<sub>m</sub>* − *V*<sub>syn</sub>), where *g*<sub>syn</sub> is a synaptic conductance, *s* a synaptic gating variable, and *V*<sub>syn</sub> the synaptic reversal potential (*V*<sub>syn</sub> = 0 for excitatory synapses, *V*<sub>syn</sub> = −70 mV for inhibitory synapses). AMPAR and GABA<sub>A</sub>R synaptic gating variables were modeled as an instantaneous jump of magnitude 1 when a spike occurred in the presynaptic neuron, followed by an exponential decay with time constant 2 ms for AMPA and 10 ms for GABA<sub>A</sub>. The NMDA conductance was voltage-dependent, with *g*<sub>syn</sub> multiplied by 1/(1 + [Mg<sup>2+</sup>] exp(−0.062 *V<sub>m</sub>*)/3.57), [Mg<sup>2+</sup>] = 1.0 mM, and the channel kinetics were modeled by:

$$\frac{ds}{dt} = -\frac{1}{\tau\_s}s + \alpha\_s x (1 - s),$$

$$\frac{dx}{dt} = -\frac{1}{\tau\_x}x + \sum\_i \delta\left(t - t\_i\right),$$

where *s* is the gating variable, *x* a synaptic variable proportional to the neurotransmitter concentration in the synapse, *t<sub>i</sub>* the presynaptic spike times, τ<sub>*s*</sub> = 100 ms the decay time of NMDA currents, τ<sub>*x*</sub> = 2 ms controls the rise time of NMDAR channels, and α<sub>*s*</sub> = 0.5 kHz controls the saturation properties of NMDAR channels at high presynaptic firing frequencies. Parameters for synaptic transmission were taken from Compte et al. (2000).
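As a rough illustration of the NMDA description above, the following sketch evaluates the magnesium block factor and integrates the two gating equations for a given list of presynaptic spike times. It is a simplified forward-Euler version written for this text, not the original simulation code; the function names `mg_block` and `nmda_gating` are hypothetical, and each presynaptic spike is assumed to add a unit jump to *x*.

```python
import math

def mg_block(v_mV, mg_mM=1.0):
    """Voltage-dependent factor multiplying the NMDA conductance."""
    return 1.0 / (1.0 + mg_mM * math.exp(-0.062 * v_mV) / 3.57)

def nmda_gating(spike_times_ms, t_total_ms, dt_ms=0.02,
                tau_s=100.0, tau_x=2.0, alpha_s=0.5):
    """Trace of the NMDA gating variable s; alpha_s = 0.5 kHz = 0.5 per ms."""
    spike_steps = {int(round(t / dt_ms)) for t in spike_times_ms}
    s, x = 0.0, 0.0
    trace = []
    for step in range(int(round(t_total_ms / dt_ms))):
        if step in spike_steps:
            x += 1.0  # the delta function contributes a unit jump in x
        ds_dt = -s / tau_s + alpha_s * x * (1.0 - s)
        dx_dt = -x / tau_x
        s += dt_ms * ds_dt
        x += dt_ms * dx_dt
        trace.append(s)
    return trace
```

The factor (1 − *s*) keeps the gating variable below 1, which is how the model captures saturation at high presynaptic rates, and `mg_block` grows with depolarization, so NMDA currents are stronger in neurons that are already active.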

**FIGURE 1 |** … indicated by thickness of connections). Connections onto pyramidal neurons are indicated with a solid line and onto interneurons with a dashed line. **(B)** In pyramidal cells (blue triangle), 5-HT has an inhibitory effect via 5-HT1A receptors by increasing a K<sup>+</sup> current (*I*<sub>K1A</sub>), and an excitatory effect via …

The network model simulates neurons selective to the memorized location in working memory tasks (Funahashi et al., 1989; Goldman-Rakic, 1995). Pyramidal cells and interneurons were spatially distributed on a ring, labeled by their preferred direction of motion (θ<sub>*i*</sub>, from −180 to 180°) (**Figure 1A**). Connections between cells were spatially tuned, such that nearby cells were more strongly connected than distant cells (Compte et al., 2000). The connection strength *g*<sub>syn,*ij*</sub> between pyramidal cells *i* and *j* depended on the difference in preferred angle between the cells and was described by the equation *g*<sub>syn,*ij*</sub> = *W*(θ<sub>*i*</sub> − θ<sub>*j*</sub>) *G*<sub>syn</sub>, where *W*(θ<sub>*i*</sub> − θ<sub>*j*</sub>) was the sum of a constant term plus a Gaussian: *W*(θ<sub>*i*</sub> − θ<sub>*j*</sub>) = *J*<sup>−</sup> + (*J*<sup>+</sup> − *J*<sup>−</sup>) exp[−(θ<sub>*i*</sub> − θ<sub>*j*</sub>)<sup>2</sup>/2σ<sup>2</sup>]. *W*(θ<sub>*i*</sub> − θ<sub>*j*</sub>) depends on two parameters, *J*<sup>+</sup> and σ, while *J*<sup>−</sup> is determined from a normalization condition (Compte et al., 2000). All the connections were structured with the same σ (σ<sub>*EE*</sub> = σ<sub>*EI*</sub> = σ<sub>*IE*</sub> = σ<sub>*II*</sub> = 14.4°) but with different *J*<sup>+</sup>: *J*<sup>+</sup><sub>*EE*</sub> = 2, *J*<sup>+</sup><sub>*EI*</sub> = 0.5, *J*<sup>+</sup><sub>*IE*</sub> = 1.4, *J*<sup>+</sup><sub>*II*</sub> = 1.9. Following the notations in Compte et al. (2000) and Cano-Colino et al. (2013), the parameters defining the strengths of local connections in the network were as follows: *G*<sub>*EE*,AMPA</sub> = 0.14 nS, *G*<sub>*EE*,NMDA</sub> = 2.1 nS (pyramid to pyramid); *G*<sub>*EI*,AMPA</sub> = 0.72 nS, *G*<sub>*EI*,NMDA</sub> = 1.9 nS (pyramid to interneuron); *G<sub>IE</sub>* = 7.8 nS (interneuron to pyramid); *G<sub>II</sub>* = 4.4 nS (interneuron to interneuron).
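The connectivity footprint described above can be sketched numerically. The text does not spell out the normalization condition, so the example below assumes the one commonly attributed to Compte et al. (2000), namely that *W* averages to 1 over the ring; treat that as an assumption of this sketch. The helper names `circ_diff` and `footprint` are hypothetical.

```python
import math

def circ_diff(a_deg, b_deg):
    """Signed angular difference wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def footprint(j_plus, sigma_deg=14.4, n=1024):
    """Return (J_minus, W) with W sampled at n equally spaced angles.

    Assumption: J_minus is fixed by requiring mean(W) = 1 over the ring.
    """
    angles = [-180.0 + 360.0 * k / n for k in range(n)]
    gauss = [math.exp(-circ_diff(th, 0.0) ** 2 / (2.0 * sigma_deg ** 2))
             for th in angles]
    mean_g = sum(gauss) / n
    # mean(W) = J_minus * (1 - mean_g) + J_plus * mean_g = 1
    j_minus = (1.0 - j_plus * mean_g) / (1.0 - mean_g)
    w = [j_minus + (j_plus - j_minus) * g for g in gauss]
    return j_minus, w
```

Under this assumption, a peak strength *J*<sup>+</sup> above 1 (as for the pyramid-to-pyramid connections) yields *J*<sup>−</sup> below 1, so neighbors are coupled more strongly than average and distant cells more weakly, while *J*<sup>+</sup> below 1 gives the opposite profile.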

### **5-HT MODULATION**

**FIGURE 1 |** (continued) … of 16 possible locations. After a delay of 3 s subjects have to report the location of the stimulus. In the distractor trials, a distractor stimulus is presented at location θ<sub>*D*</sub> during the delay (at 1.75 s), and a second delay of 1.75 s follows before the behavioral response.

The model included 5-HT receptor physiological effects on PFC neurons (**Figure 1B**). 5-HT1A and 5-HT2A receptors are the two most abundant 5-HT receptors in the PFC (Santana et al., 2004), and they are highly co-localized in pyramidal neurons (∼80%) (Amargós-Bosch et al., 2004). The main and faster inhibitory response to 5-HT pulses is through 5-HT1A receptors and the later excitatory response through 5-HT2A receptors (Puig et al., 2005; Goodfellow et al., 2009), which desensitize at high [5-HT] levels (Araneda and Andrade, 1991). We modeled the kinetics of the mechanisms triggered by these receptors with the following simple kinetic equations (Destexhe et al., 1994; Cano-Colino et al., 2013):

$$\frac{ds\_{1A}}{dt} = -\frac{s\_{1A}}{\tau\_{1A}} + \alpha\_{1A}[\text{5-HT}],$$

$$\frac{ds\_{2A}}{dt} = -\frac{s\_{2A}}{\tau\_{2A}} + \alpha\_{2A}[\text{5-HT}](1 - s\_{2A}),$$

where *s*<sub>1A</sub> and *s*<sub>2A</sub> are the gating variables for the corresponding 5-HT receptors (0 < *s*<sub>1A</sub>, *s*<sub>2A</sub> < 1); [5-HT] is the serotonin concentration, which we take to be 10 nM in physiological conditions (Celada et al., 2001), but the exact absolute value is not critical in our model. The receptor time constants τ<sub>1A</sub> and τ<sub>2A</sub> (τ<sub>1A</sub> = 30 ms and τ<sub>2A</sub> = 120 ms) were chosen to match the time course of PFC responses to the electrical stimulation of raphe neurons in the rat (Amargós-Bosch et al., 2004; Puig et al., 2005). The parameters α<sub>1A</sub> = 1.8 kHz/μM, α<sub>2A,*E*</sub> = 2.25 kHz/μM and α<sub>2A,*I*</sub> = 11 kHz/μM control the affinity of the receptors to 5-HT and were obtained through an optimization procedure seeking to ensure the stability of working memory function in the network in the presence of baseline 5-HT (Cano-Colino et al., 2013). Assuming a diffuse, temporally constant action of 5-HT on the network's mechanisms to mimic systemic application of serotonergic agents, our simulations used a constant value of [5-HT] for all neurons in the network and through all periods of the task. Phasic actions of the serotonergic system in the course of the task are not evaluated in this study. In different simulations we changed the tonic level of [5-HT], increasing or decreasing the initial value (physiological level, see above) by 10 or 20%. We also mimicked the effect of agonists or antagonists of 5-HT receptors. To this end, we ran simulations in which we kept constant the activation of one receptor, by fixing the value of [5-HT] in its kinetic equation, while we modified the activation of the other one by changing [5-HT] in its equation.
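Because [5-HT] is held constant throughout each simulation, the receptor gating variables settle to steady-state values that follow directly from setting the time derivatives in the kinetic equations to zero. The sketch below works this out for the tonic levels mentioned in the text; the function names are hypothetical, and the unit conversion (1 kHz/μM = 1 per ms per μM, 10 nM = 0.01 μM) is the only step not stated explicitly in the source.

```python
def s1a_steady(conc_uM, alpha=1.8, tau_ms=30.0):
    """Steady state of ds1A/dt = -s1A/tau + alpha*[5-HT] (alpha in 1/(ms*uM))."""
    return alpha * tau_ms * conc_uM

def s2a_steady(conc_uM, alpha, tau_ms=120.0):
    """Steady state of ds2A/dt = -s2A/tau + alpha*[5-HT]*(1 - s2A)."""
    a = alpha * tau_ms * conc_uM
    return a / (1.0 + a)

BASELINE_UM = 0.01  # 10 nM physiological level

def tonic_table(scales=(0.8, 0.9, 1.0, 1.1, 1.2)):
    """Steady-state activations for -20% to +20% changes in tonic [5-HT]."""
    rows = []
    for scale in scales:
        c = BASELINE_UM * scale
        rows.append((scale,
                     s1a_steady(c),            # 5-HT1A, pyramidal cells
                     s2a_steady(c, 2.25),      # 5-HT2A, pyramidal cells
                     s2a_steady(c, 11.0)))     # 5-HT2A, interneurons
    return rows
```

At the 10 nM baseline this gives *s*<sub>1A</sub> = 0.54, *s*<sub>2A</sub> ≈ 0.73 in pyramidal cells and *s*<sub>2A</sub> ≈ 0.93 in interneurons. The 5-HT1A variable scales linearly with concentration, whereas the (1 − *s*<sub>2A</sub>) term makes the 5-HT2A variable saturate.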

The physiological actions of these receptors on pyramidal neurons have been characterized in *in vitro* electrophysiological studies (Andrade et al., 1986; Araneda and Andrade, 1991; Béïque et al., 2004; Villalobos et al., 2005; Zhang and Arsenault, 2005; Ma et al., 2007). In interneurons, 5-HT2A receptors have been described in *in vitro* studies to increase neuronal excitability (Deng et al., 2007; Puig et al., 2010). We included these electrophysiologically defined effects of 5-HT receptors in our network model (**Figure 1B**, see below) (Meeter et al., 2006; Cano-Colino et al., 2013).

We simulated the hyperpolarizing action of the 5-HT1A receptor in pyramidal prefrontal neurons by including a 5-HT-modulated K<sup>+</sup> current (*I*<sub>K1A</sub>) in excitatory cells of our network (**Figure 1B**, blue triangle), modeled according to:

$$I\_{K1A} = g\_{K1A} s\_{1A} \left( V - V\_K \right),$$

where *g*<sub>K1A</sub> = 29.7 nS is the maximal conductance of the channel and *V<sub>K</sub>* = −70 mV is the potassium reversal potential. The depolarizing/excitatory response to 5-HT2A receptor activation was simulated through three different mechanisms: increase in intracellular Ca<sup>2+</sup>, inhibition of Ca<sup>2+</sup>-activated afterhyperpolarization currents (*I*<sub>KCa</sub>), and activation of an afterdepolarization current mediated by a Ca<sup>2+</sup>-dependent non-selective cation channel (*I*<sub>Can</sub>). The calcium dynamics were modeled by the following equation:

$$\frac{d\left[\text{Ca}^{2+}\right]}{dt} = \alpha\_{\text{Ca}} \sum\_{sp} \delta\left(t - t\_{sp}\right) - \frac{\left[\text{Ca}^{2+}\right]}{\tau\_{\text{Ca}}} + \gamma\_{[\text{5-HT}]} s\_{2A},$$

where γ<sub>[5-HT]</sub> = 0.41 nM/ms controls calcium flow through 5-HT2A receptors, the [Ca<sup>2+</sup>] influx per spike is α<sub>Ca</sub> = 0.1 μM and τ<sub>Ca</sub> = 240 ms (Wang, 1998; Tegnér et al., 2002). These calcium dynamics affect *I*<sub>KCa</sub> and *I*<sub>Can</sub> according to:

$$\begin{split} I\_{\text{KCa}} &= g\_{\text{KCa}} \left( 1 - s\_{2A} \right) \frac{\left[ \text{Ca}^{2+} \right]}{\left[ \text{Ca}^{2+} \right] + K\_{D}} \left( V - V\_{K} \right), \\ I\_{\text{Can}} &= g\_{\text{Can}} m\_{\text{Ca}}^{2} h\_{\text{Ca}} \left( V - V\_{\text{Can}} \right), \\ \frac{dm\_{\text{Ca}}}{dt} &= \frac{m\_{\infty} - m\_{\text{Ca}}}{\tau\_{\text{Can}}}, \\ m\_{\infty} &= \frac{\alpha\_{\text{Can}} \left[ \text{Ca}^{2+} \right]}{\alpha\_{\text{Can}} \left[ \text{Ca}^{2+} \right] + \beta\_{\text{Can}}}, \\ \tau\_{\text{Can}} &= \frac{1}{\alpha\_{\text{Can}} \left[ \text{Ca}^{2+} \right] + \beta\_{\text{Can}}}, \\ h\_{\text{Ca}} &= \frac{1}{1 + \exp \left( \left( \left[ \text{Ca}^{2+} \right] - \beta\_{h} \right) / \alpha\_{h} \right)}, \end{split}$$

where *g*<sub>KCa</sub> = 703 nS, *V<sub>K</sub>* = −70 mV, *K<sub>D</sub>* = 30 μM, *g*<sub>Can</sub> = 36 nS, *V*<sub>Can</sub> = −20 mV, α<sub>Can</sub> = 0.0056 [ms·μM]<sup>−1</sup>, β<sub>Can</sub> = 0.002 ms<sup>−1</sup>, α<sub>*h*</sub> = 3 μM, β<sub>*h*</sub> = 5 μM (Wang, 1998; Tegnér et al., 2002; Cano-Colino et al., 2013).
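To give a feel for the magnitudes involved, the sketch below evaluates the calcium-dependent quantities for a silent neuron. It rests on two readings of the equations that should be treated as assumptions: that the 5-HT2A-dependent calcium source term is γ<sub>[5-HT]</sub> multiplied by *s*<sub>2A</sub>, and that the inactivation variable has the form 1/(1 + exp(([Ca<sup>2+</sup>] − β<sub>*h*</sub>)/α<sub>*h*</sub>)). All function names are hypothetical.

```python
import math

GAMMA_5HT = 0.41e-3          # uM/ms (0.41 nM/ms)
TAU_CA = 240.0               # ms
ALPHA_CAN = 0.0056           # 1/(ms*uM)
BETA_CAN = 0.002             # 1/ms
ALPHA_H, BETA_H = 3.0, 5.0   # uM
K_D = 30.0                   # uM

def resting_calcium(s2a):
    """Steady-state [Ca2+] (uM) without spikes, assuming a gamma*s2A source."""
    return GAMMA_5HT * TAU_CA * s2a

def m_inf(ca_uM):
    """Steady-state activation of the Can current."""
    return ALPHA_CAN * ca_uM / (ALPHA_CAN * ca_uM + BETA_CAN)

def tau_can(ca_uM):
    """Activation time constant of the Can current, in ms."""
    return 1.0 / (ALPHA_CAN * ca_uM + BETA_CAN)

def h_ca(ca_uM):
    """Calcium-dependent inactivation (assumed sigmoid form)."""
    return 1.0 / (1.0 + math.exp((ca_uM - BETA_H) / ALPHA_H))

def kca_fraction(ca_uM):
    """Calcium-dependent factor of the KCa current."""
    return ca_uM / (ca_uM + K_D)
```

Under these assumptions, full 5-HT2A activation alone would raise resting calcium by about 0.1 μM, roughly the same as the influx from a single spike, and the activation time constant of *I*<sub>Can</sub> is several hundred milliseconds at such low calcium levels.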

We included the depolarizing action of 5-HT2A receptors on model interneurons (**Figure 1B**, red circle) by having 5-HT2A receptor activation decrease the conductance of the leak current:

$$g\_L = g\_L^{\*}(1 - s\_{2A}),$$

where *g<sub>L</sub>*<sup>∗</sup> = 26 nS.

### **SIMULATIONS**

We simulated a SWM task that resembled behavioral protocols used in monkey experiments (Funahashi et al., 1989; Williams et al., 2002). In brief, monkeys fixate on a central spot during a brief presentation of a peripheral cue and throughout a subsequent delay period. After this delay, they make a saccadic eye movement to where the cue had been presented in order to obtain a reward. To mimic this behavioral protocol in our simulations, simulation trials consisted of four periods: fixation (3 s), cue (0.25 s), delay (3 s), and response (**Figure 1C**, left). In the fixation period there were no additional external inputs to the network, so it typically stayed in a spontaneous, unstructured firing state. In the cue period, a cue stimulus was applied at location θ<sub>*S*</sub>. This was simulated as current injection to each excitatory neuron in the network (labeled by θ<sub>*i*</sub>) of intensity *I<sub>s</sub>*(θ<sub>*i*</sub>) = *I*<sub>1</sub> exp[μ<sub>stim</sub>(cos(θ<sub>*i*</sub> − θ<sub>*S*</sub>) − 1)]. We used *I*<sub>1</sub> = 0.235 nA and μ<sub>stim</sub> = 10. During the delay, no stimulus was presented, so that the network maintained the cue position in a stable pattern of network activation (activity bump).
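The spatial profile of the cue current can be written as a one-line function. The sketch below is only an illustration of the formula in the text, with a hypothetical function name and with angles passed in degrees and converted to radians before taking the cosine.

```python
import math

def cue_current(theta_i_deg, theta_s_deg, i1_nA=0.235, mu_stim=10.0):
    """Cue current (nA) injected into the excitatory neuron labeled theta_i."""
    delta = math.radians(theta_i_deg - theta_s_deg)
    return i1_nA * math.exp(mu_stim * (math.cos(delta) - 1.0))
```

The profile peaks at 0.235 nA for the neuron whose preferred location matches the cue and falls off symmetrically and periodically around the ring, so neurons far from the cue receive essentially no stimulus current.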

The response period was not simulated explicitly, but a decoding algorithm was used to simulate behavioral responses during simulated tasks. The task consisted of reporting the location of the cue stimulus after the delay period within a predefined tolerance window. To obtain a behavioral response from our simulation trials we computed a population vector estimation (Georgopoulos et al., 1986; Lee et al., 1988) from the network activity at the end of the delay period. Thus, if {*n<sub>i</sub>*, *i* = 1 … *N<sub>E</sub>*} are the spike counts of all the excitatory neurons labeled by {θ<sub>*i*</sub>, *i* = 1 … *N<sub>E</sub>*} in a 50-ms window at the end of the delay period, the population vector was computed as the normalized sum of each neuron's selectivity vector *e*<sup>*i*θ<sub>*i*</sub></sup> (we use complex notation to operate with vectors in a compact manner) weighted by its spike count: *P* = (Σ<sub>*i*</sub> *n<sub>i</sub>* *e*<sup>*i*θ<sub>*i*</sub></sup>)(Σ<sub>*i*</sub> *n<sub>i</sub>*)<sup>−1</sup>. We then extracted the modulus *C* and angle θ<sub>*R*</sub> of the resultant population vector: *P* = *C* *e*<sup>*i*θ<sub>*R*</sub></sup>. For each individual simulation trial we took θ<sub>*R*</sub> as the decoded location memorized in the network activity before response initiation, the *behavioral response*. Correct trials were those trials for which |θ<sub>*R*</sub> − θ<sub>*S*</sub>| < 22.5°, where 22.5° is an arbitrarily defined window around θ<sub>*S*</sub> to define correct trials.
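The population vector read-out and the correct-trial criterion can be sketched as follows. The function names are hypothetical, and two details are choices made for this example rather than statements from the text: an all-silent window returns no angle (`None`), and the error |θ<sub>*R*</sub> − θ<sub>*S*</sub>| is wrapped around the ring before being compared with the 22.5° window.

```python
import cmath
import math

def population_vector(spike_counts, preferred_deg):
    """Return (C, theta_R in degrees) from spike counts and preferred angles."""
    total = sum(spike_counts)
    if total == 0:
        return 0.0, None  # no spikes: the decoded angle is undefined
    p = sum(n * cmath.exp(1j * math.radians(th))
            for n, th in zip(spike_counts, preferred_deg)) / total
    return abs(p), math.degrees(cmath.phase(p))

def is_correct(theta_r_deg, theta_s_deg, window_deg=22.5):
    """Correct trial if the wrapped decoding error is inside the window."""
    if theta_r_deg is None:
        return False
    err = (theta_r_deg - theta_s_deg + 180.0) % 360.0 - 180.0
    return abs(err) < window_deg
```

The modulus *C* is close to 1 when spikes are concentrated on neurons with similar preferred locations and close to 0 when activity is spread evenly around the ring, which is why it carries information about how structured the late-delay activity is.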

In several simulations, we tested the influence of distractors on network performance. We define a distractor as an external stimulus of the same intensity and duration as the cue stimulus, but one that appears after the cue and that the task requires to be ignored. The distraction trials consisted of: fixation (0.75 s), cue stimulus in the direction θ<sub>*S*</sub> (0.25 s), delay 1 (1.75 s), distractor in the direction θ<sub>*D*</sub> (0.25 s), delay 2 (1.75 s), and response (**Figure 1C**, right). Distractors were modeled as a cue stimulus (same strength, *I*<sub>1</sub> = 0.235 nA and μ<sub>stim</sub> = 10, same duration, 250 ms) but at a different location relative to the cue stimulus. We tested several distances between the stimulus position and the distractor position (θ<sub>*D*</sub> − θ<sub>*S*</sub>, from 0 to 180° in steps of 11.25°) for different levels of baseline [5-HT] (see above). We ran 100 simulations for each condition ([5-HT] and distance θ<sub>*D*</sub> − θ<sub>*S*</sub>), and then we computed the behavioral response (report location, θ<sub>*R*</sub>) at the end of the delay period, from which we obtained a measure of distractibility as θ<sub>*R*</sub> − θ<sub>*S*</sub> vs. θ<sub>*D*</sub> − θ<sub>*S*</sub>.

### **NUMERICAL INTEGRATION**

The integration method used was a second-order Runge–Kutta algorithm with a time step of Δ*t* = 0.02 ms. The custom code for the simulations was written in C++.
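For readers unfamiliar with the scheme, a second-order Runge–Kutta step can be sketched as below. The text does not say which second-order variant was used, so the midpoint form shown here is an assumption, and the helper names `rk2_step` and `integrate` are hypothetical. The example is in Python for brevity, whereas the original simulations were written in C++.

```python
import math

def rk2_step(f, t, y, dt):
    """One midpoint (second-order Runge-Kutta) step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
    return y + dt * k2

def integrate(f, y0, t_end_ms, dt_ms=0.02):
    """Integrate from t = 0 to t_end_ms with a fixed step and return y(t_end)."""
    y, t = y0, 0.0
    for _ in range(int(round(t_end_ms / dt_ms))):
        y = rk2_step(f, t, y, dt_ms)
        t += dt_ms
    return y
```

As a quick check, integrating a passive decay such as `lambda t, v: -v / 18.25` (the membrane time constant *C<sub>m</sub>*/*g<sub>L</sub>* of the model pyramidal cell, in ms) over 20 ms should agree closely with the analytical exponential at this step size.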

### **RESULTS**

We used a computational network model to investigate how 5-HT modulates WM function (Cano-Colino et al., 2013). This neuronal network model mimics the activity of neurons in PFC of monkeys performing a visuo-spatial WM task (Compte et al., 2000), and incorporates the cellular mechanisms of receptors 5-HT1A and 5-HT2A described in PFC neurons *in vitro* (**Figure 1B**; see Materials and Methods) (Cano-Colino et al., 2013). The network model consisted of 1024 pyramidal neurons (excitatory cells) and 256 interneurons (inhibitory cells), each coding for a stimulus location at a specific angle (**Figure 1A**). Every neuron was connected to all other neurons in the network, and it received independent random inputs (assumed to be inputs from other brain regions). Connections between pyramidal cells coding for nearby stimuli were stronger than average (see Materials and Methods). With this general network organization, simulations produce neural activity consistent with PFC single-neuron data acquired during performance of visuo-spatial WM tasks (Funahashi et al., 1989; Compte et al., 2000).

We used this model to simulate behavioral tasks used in monkeys and humans to test SWM (Funahashi et al., 1989; Park and Holzman, 1992; Park et al., 1999). The task (**Figure 1C**, left) starts with a fixation period of 3 s followed by a brief presentation (0.25 s) of a cue stimulus (localized external current input to the network). The cue stimulus appeared in a random location restricted within a circle of given eccentricity from the fixation point. Thus, the cue stimulus location was entirely described by an angle value θ<sub>*S*</sub> (−180° < θ<sub>*S*</sub> < 180°). After cue stimulus extinction there was a delay period of 3 s, after which the location of the cue stimulus had to be reported based on the network's neural activity at the end of the delay period. We did not explicitly simulate the response period activity of the network (see Materials and Methods).

The network parameters could be tuned so that during the delay period the memory of the cue stimulus was maintained in a localized bump of neural activity (cluster of neighboring cells with raised activity, see **Figure 2A**) by virtue of strong reverberatory recurrent excitation among neighboring excitatory cells and strong disynaptic inhibition between excitatory cells of dissimilar selectivity (Compte et al., 2000; Tegnér et al., 2002). In this optimal network operation regime, this tuned persistent activity state (*memory state*) is maintained in a stable manner by the network, but the network can also sustain a low firing rate, unstructured network state (*spontaneous state*) if no cue stimulus has been presented (as during the fixation period in **Figure 2A**). Neuromodulation through 5-HT receptors imbalances this network regime and causes the destabilization of either the memory state or the spontaneous state, thus resulting in two types of behavioral errors that can be distinguished based on the reported confidence in the response (Cano-Colino et al., 2013). We first review briefly this result and we then use the model to derive additional predictions that do not require metacognitive evaluation.

### **5-HT MODULATION OF SWM PERFORMANCE**

We ran repeated trials with the task structure defined above but with different noise realization so that network activity over the course of the trial varied substantially from trial to trial. Then, we could extract behavioral output from these network simulations that could be treated similarly as in a real psychophysics experiment (Cano-Colino and Compte, 2012; Murray et al., 2012; Cano-Colino et al., 2013). For each simulation trial we obtained a *behavioral response* by extracting a population vector read-out of the angle θ*<sup>R</sup>* encoded in network activity in a window of 50 ms at the end of the delay period. Thus, for each simulation trial we obtained the full network dynamics over the course of the trial, and one behavioral measure: the decoded stimulus location θ*R*. As it is usual in behavioral experiments, we classified trials as correct or error trials. If |θ*<sup>R</sup>* − θ*S*| *<* 22.5◦ we classified the trials as correct, and if |θ*<sup>R</sup>* − θ*S*| *>* 22.5◦ the trial was an error. Typically, in correct trials the localized network activity (*bump*) triggered by the stimulus at θ*<sup>S</sup>* was maintained by excitatory reverberation robustly through the delay (*memory bump trials*, **Figure 2A**). When inspecting the network dynamics in error trials two main causes for errors could be distinguished. In some cases, network activity formed in response to the stimulus at θ*<sup>S</sup>* failed to reverberate through the length of the delay period so that by the end of the delay network activity was unstructured and did not contain any robust signal (*decaying bump trials*, **Figure 2B**). Responses in these error trials would be declared with low confidence because

**FIGURE 2 | 5-HT modulates SWM performance in the network model. (A–C)** Example of rastergrams of simulations with 3 s fixation, a cue presentation of 250 ms (gray bar) and a delay period of 3 s. Panel **(A)** shows a correct trial: at the end of the delay period the memory bump is in the same position where the cue was presented. Possible errors can be due to: memory disappearance before the end of the delay period (**B**, decaying bump trial) or emergence of a spurious, misaligned bump (**C**, emergent bump trial).

See specific criteria for defining correct and error trials in the text, based on the relative match of cue location (red triangle) and decoded bump location at the end of the delay (black triangle, θ*R*). **(D–F)** For different levels of [5-HT] (black bars), and varying activation of 5-HT1A (purple bars) or 5-HT2A (blue bars) receptors, fraction of correct trials **(D)**, fraction of decaying bump error trials, **(E)** and fraction of emergent bump error trials **(F)**. Thousand trial simulations per condition.

there was no signal in the network at the end of the delay and the response θ*<sup>R</sup>* would essentially be randomly chosen. In other cases, the error occurred because a spontaneous bump of activity was formed in the network before the cue stimulus was presented and it remained stable for the duration of the delay period, despite the subsequent presentation of the stimulus at θ*<sup>S</sup>* (*emergent bump trials*, **Figure 2C**). These responses would be declared with high confidence, as the late-delay signal in the network was strong, albeit wrong.
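The read-out and trial classification described above can be sketched in a few lines. This is a minimal illustration, not the authors' simulation code: the function names, the array layout (one firing rate per neuron, labeled by preferred angle) and the use of a circular distance are assumptions made for the example; only the 22.5° criterion comes from the text.

```python
import numpy as np

def population_vector_angle(rates, preferred_deg):
    """Decode an angle in [0, 360) degrees from firing rates.

    rates: firing rate of each neuron in the read-out window (assumed layout).
    preferred_deg: preferred angle of each neuron, in degrees.
    """
    phi = np.deg2rad(np.asarray(preferred_deg, dtype=float))
    # Sum of unit vectors at each preferred angle, weighted by firing rate.
    vec = np.sum(np.asarray(rates, dtype=float) * np.exp(1j * phi))
    return float(np.rad2deg(np.angle(vec)) % 360.0)

def circular_distance(a_deg, b_deg):
    """Smallest absolute angular difference, in degrees (0 to 180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)

def classify_trial(theta_r, theta_s, tol_deg=22.5):
    """Label a trial 'correct' if the report lies within tol_deg of the cue."""
    return "correct" if circular_distance(theta_r, theta_s) < tol_deg else "error"
```

Note that this classification alone cannot separate decaying bump from emergent bump errors; in the simulations that distinction relies on inspecting the full network dynamics.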

We then ran 1000 trials of this task simulation for 5 different values of tonic 5-HT concentration, and we identified trials as correct or error trials based on the classification described above. When the network model was subject to the reference 5-HT concentration of 10 nM (*physiological level*) almost all trials were correct responses. However, when the tonic 5-HT level was either increased or decreased, errors became more frequent (**Figure 2D**, black bars). The results showed an inverted U-shape, with optimal performance around our reference 5-HT concentration. Similar non-monotonic dependencies were observed when the activation of 5-HT1A or 5-HT2A receptors was independently manipulated (**Figure 2D**, purple and blue bars). It has been argued that this non-monotonic dependence of behavioral accuracy on 5-HT concentration could be a factor in the inconsistent results of psycho-pharmacological studies exploring the effect of 5-HT on SWM (Cano-Colino et al., 2013).

Inspection of the network activity in individual trials revealed that the errors committed were very different in the high 5-HT and low 5-HT conditions: errors in the low 5-HT network were mostly emergent bump trials (**Figure 2F**, black bars), while the performance decrease for high 5-HT concentrations was mostly due to decaying bump trials (**Figure 2E**, black bars). The same trend of error types was observed when varying only 5-HT1A receptor activation: 5-HT1A agonists caused decaying bump trials (**Figure 2E**, purple bars) and 5-HT1A antagonists promoted emergent bump trials (**Figure 2F**, purple bars). In contrast, 5-HT2A receptor manipulations presented the opposite behavior: antagonists of 5-HT2A increased decaying bump trials (**Figure 2E**, blue bars) and 5-HT2A agonists increased the incidence of emergent bump trials (**Figure 2F**, blue bars). The analogy between the types of errors resulting from 5-HT1A receptor activation and from 5-HT concentration increases indicates that the relevant effect of 5-HT in this WM network is a change of cellular excitability in excitatory neurons (because this is the only effect of 5-HT1A in the model) (Cano-Colino et al., 2013). One can therefore understand emergent bump trials as a consequence of enhanced network excitability, and decaying bump trials as a result of reduced network excitability. These two network dynamics cause behavioral errors of very different nature.

Non-monotonic dependencies of behavioral performance on 5-HT modulations in **Figure 2D** can thus turn into monotonic relationships (**Figures 2E,F**) if we can distinguish the nature of the errors (whether a decaying bump trial or an emergent bump trial) on a trial-by-trial basis. Monotonic dependencies would then be much easier to document in experimental population studies. In our network simulations we resorted to the full network dynamics, but this is experimentally inaccessible with current neurophysiological techniques. One possibility presented by Cano-Colino et al. (2013) is to use the confidence in the response as a behavioral parameter to distinguish the two types of errors: low-confidence errors would mostly follow the predictions for decaying bump trials (**Figure 2E**), while high-confidence errors would mimic emergent bump trials (**Figure 2F**). We now turn to identifying other ways to tell apart the two error regimes experimentally without resorting to a metacognitive evaluation.

### **DELAY-LENGTH DEPENDENCY OF 5-HT EFFECTS ON SWM PERFORMANCE**

One defining feature of working memory processes is the fact that the retention of stimulus attributes degrades with time (Pasternak and Greenlee, 2005). Also in our computational network model, the representation of the cue stimulus degrades during the delay (Compte et al., 2000). We reasoned that the performance of networks with parameter modulations promoting decaying bump trials would be especially sensitive to the duration of the delay because for longer delays more and more memory states would destabilize and decay to the spontaneous state. We tested this explicitly by running multiple trials with three different delay lengths: 1, 2, and 3 s delay, and for three 5-HT concentrations: physiological [5-HT], low [5-HT], and high [5-HT]. When we plotted the fraction of correct trials for each of the factors, we found that for increases in baseline 5-HT levels behavioral performance decayed with delay duration more strongly than in either the physiological or low [5-HT] conditions (**Figure 3**). Strong delay-length dependency of SWM performance therefore characterizes decaying bump error trials and should be a specific trait of agonists, not antagonists, of 5-HT1A receptors or antagonists, not agonists, of 5-HT2A receptors, or treatments leading to increased, not decreased, baseline [5-HT] according to our computational network model.

### **5-HT RECEPTORS EFFECT ON THE DISTRACTIBILITY OF THE NETWORK**

The ability to resist distractors is an important component of SWM, and our computational network model has inhibitory

trials, for a delay length of 1, 2, and 3 s for two levels of [5-HT], compared with the physiological level of [5-HT] (dashed line). **Top**, for a low level of [5-HT] (20% reduction compared to the physiological level), the fraction of correct trials was reduced but remained roughly constant for different delay lengths. **Bottom**, for a high level of [5-HT] (20% increase compared to the physiological level), the fraction of correct trials diminished parametrically with increasing delay length.

mechanisms that allow it to resist the presentation of intervening distractors (Compte et al., 2000). We hypothesized that a regime with a fraction of decaying bump trials would behave radically differently from a regime with a large proportion of emergent bump trials in relation to the filtering of unwanted distractors, and this would provide us with another prediction to disambiguate error types in different neuromodulatory manipulations. We therefore sought to characterize how 5-HT neuromodulation affected the resistance to distractors presented as intervening stimuli during the delay period in the network model.

To this aim, we ran trials with a different simulation protocol (distractor trials, **Figure 1C**, right) (see Materials and Methods). First, a cue stimulus was shown at an angle θ*S*. It normally triggered the memory state, a bump with a peak at an angle close to θ*<sup>S</sup>* (see **Figures 4A,B**), prior to the presentation of a distractor at angle θ*D*, of the same intensity and duration as the cue stimulus (see Materials and Methods). We then quantified the effect of the distraction by measuring the peak location of the bump state

black triangle) is decoded from activity at the end of the delay period. **(C)** For the physiological level of [5-HT], fraction of simulation trials in which a given distraction angle θ*<sup>R</sup>* − θ*<sup>S</sup>* was observed when stimulus and distractor were presented at different relative locations θ*<sup>D</sup>* − θ*<sup>S</sup>* (θ*<sup>S</sup>* = 0◦, θ*<sup>D</sup>* from 0 to 180◦ in steps of 11.25◦, 100 simulations per θ*D*). Warmer colors on the diagonal indicate distraction, while resistance to distractors is characterized by higher fraction of trials (warmer colors) along the *x*-axis.

at the end of the delay period (report angle θ*R*, **Figures 4A,B**) (see Materials and Methods). The effect of the distractor was quantified as the difference between the location of the behavioral response and the location of the cue stimulus, θ*<sup>R</sup>* − θ*S*. Distractor stimuli were presented at various positions relative to the cue stimulus (θ*<sup>D</sup>* − θ*S*), in separate trials. The effect of the distractor depends on the stimulation intensity (Compte et al., 2000), so we considered only the condition in which the applied distractor stimulus had the same intensity and duration as the cue stimulus. We chose the stimulation intensity so that a similar proportion of trials had response θ*<sup>R</sup>* in the vicinity of the cue θ*<sup>S</sup>* and the distractor θ*<sup>D</sup>* stimuli, for large θ*<sup>D</sup>* − θ*S*.

When the distractor was applied, the network activity that was previously storing the cue stimulus could maintain its original position, resisting the distraction (**Figure 4A**), or move to the location of the distractor (**Figure 4B**). We ran several simulations of the task for different distances between the stimulus position and the distractor position (θ*<sup>D</sup>* − θ*S*). **Figure 4C** shows the fraction of trials that, for a given distance between the distractor and cue stimuli (θ*<sup>D</sup>* − θ*S*), had an effect on the behavioral response θ*<sup>R</sup>* − θ*<sup>S</sup>* (distance between the location of the report and the cue stimuli). As previously shown (Compte et al., 2000), the effect of the distractor depended on the distance to the cue stimulus: distraction was very probable when distractor and cue stimulus were presented in nearby locations (θ*<sup>D</sup>* − θ*<sup>S</sup> <* 45◦). When the distance θ*<sup>D</sup>* − θ*<sup>S</sup>* increased, both distraction and no-distraction trials were observed in a similar proportion (as a result of our choice of stimulation strength, see above). This pattern of distraction corresponded to our network model with a physiological level of [5-HT].

The manipulation of baseline 5-HT concentrations in the network model was found to greatly affect the distractibility of the network. Low [5-HT] (**Figure 5A**, top) led to perseverant behavior, resisting the distraction in almost all trials. Only when the distractor was within 45◦ of the cue stimulus was the memory state perturbed, and the memory bump was attracted toward the distractor location. For larger distances the network was unaffected by distractors. High [5-HT] (**Figure 5A**, bottom) had the opposite effect: the network was easily distracted by both close-by and distant distractors. Thus, the two conditions were more readily distinguished by using distractors sufficiently distant from the cue: whereas 33% of network simulations (293 out of 900) resulted in a correct response in the baseline condition for θ*<sup>D</sup>* − θ*<sup>S</sup> >* 90◦, this fraction grew significantly to 76% of simulations when reducing baseline [5-HT] by 20%, and it instead decreased to just 6% of correct trials when baseline [5-HT] was raised by 20%.
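The distant-distractor measure quoted above (fraction of trials whose report stays near the cue when the distractor is far away) could be computed from per-trial angles along the following lines. This is a hedged sketch: the function names, the per-trial array inputs and the reuse of the 22.5° correctness criterion for distractor trials are assumptions of the example, not details confirmed by the text.

```python
import numpy as np

def circ_diff(a_deg, b_deg):
    """Signed angular difference a - b, wrapped to [-180, 180) degrees."""
    a = np.asarray(a_deg, dtype=float)
    b = np.asarray(b_deg, dtype=float)
    return (a - b + 180.0) % 360.0 - 180.0

def fraction_resisting(theta_r, theta_s, theta_d, min_sep_deg=90.0, tol_deg=22.5):
    """Fraction of distant-distractor trials in which the report stays near the cue.

    theta_r, theta_s, theta_d: per-trial report, cue and distractor angles (degrees).
    min_sep_deg: only trials with |theta_d - theta_s| above this value are counted.
    tol_deg: assumed tolerance for calling a report 'correct'.
    """
    separation = np.abs(circ_diff(theta_d, theta_s))
    report_error = np.abs(circ_diff(theta_r, theta_s))
    distant = separation > min_sep_deg
    if not np.any(distant):
        return float("nan")  # no trial satisfies the separation criterion
    return float(np.mean(report_error[distant] < tol_deg))
```

As a consistency check on the reported numbers, 293 correct responses out of 900 simulations is a fraction of about 0.326, which rounds to the 33% given for the baseline condition.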

We also studied how specific receptor agonists and antagonists would modify the capacity of the network to resist distractors. From the results of **Figures 2E,F** we hypothesized that agonists of 5-HT1A receptors and antagonists of 5-HT2A receptors would have the same effect on network distractibility as increasing baseline [5-HT], further supporting the idea that changes of [5-HT] engage dominantly 5-HT1A. Therefore, we varied in our network simulations the tonic activation of one receptor type independently from the other, mimicking the effect of agonists or antagonists of the receptors, similarly to what we did in **Figure 2**. When the activation of 5-HT1A receptors was decreased (**Figure 5B**, top) the network was completely unaffected by distractors, as happened for low levels of [5-HT] (**Figure 5A**, top). And when 5-HT1A receptors were agonized the network lost its ability to resist the distractors for any position of the distractor (**Figure 5B**, bottom), as also happened with high [5-HT] (**Figure 5A**, bottom). In contrast, modulation of 5-HT2A receptor activation led to the reversed pattern: 5-HT2A agonists made the network more resistant to an intervening stimulus,

and 5-HT1A agonists (Bottom, +10% change relative to baseline 5-HT1A activation) have effects on network distractibility similar to those of the corresponding manipulations of [5-HT], suggesting that general 5-HT modulation operates primarily on 5-HT1A receptors. **(C)** In contrast, 5-HT2A agonists (Top, +20% change relative to physiological value) and 5-HT2A antagonists (Bottom, +20% change relative to physiological value) follow the opposite trend to the corresponding [5-HT] modulations.

whereas 5-HT2A antagonists rendered the network very sensitive to intervening distractors (**Figure 5C**). The larger differences again occur for θ*<sup>D</sup>* − θ*<sup>S</sup> >* 90◦, so that the percent of correct responses in these trials is 73% (5%) for a 20% 5-HT1A (5-HT2A) receptor activity reduction, and 5% (74%) for a 20% 5-HT1A (5-HT2A) receptor activity enhancement.

Thus, a SWM task with intervening distractors is a behavioral protocol that can clearly tell apart the two different causes for SWM deficits that our network model predicts to occur as serotonergic neuromodulation is parametrically varied.

### **5-HT MODULATES THE SHAPE OF NEURONAL MEMORY FIELDS**

So far we have been able to propose two experimental validations of the computational model in behavioral experiments that do not require metacognitive evaluation. However, our model is neurobiologically explicit and it can also produce predictions regarding PFC neural activity in SWM tasks upon systemic 5-HT neuromodulation. We derived one such prediction based on how the different error types affect neural tuning curves in the delay period of a SWM task. We first illustrate how we build such tuning curves in **Figure 6A**, which shows schematically the firing rate during the delay period for 5 different trials (labeled *i–v*) in which the stimulus was presented at θ*<sup>S</sup>* = 0 (red triangles), together with the location of the behavioral report (θ*R*, black triangles), computed at the end of the delay period. Following the criteria in **Figure 2**, trial *i* would be classified as a correct trial, trials *ii*, *iii,* and *iv* would be emergent bump error trials and trial *v* would be a decaying bump error trial. The latter shows a small bump around the cue location due to the activity at the beginning of the delay, before the bump decays, but the report location is in

**[5-HT] levels. (A)** Scheme of 5 possible examples of neuronal firing rate during the delay period (*i–v*). The red triangle shows the location of the stimulus (θ*S*), the black triangle the report location calculated at the end of the delay period (θ*R*), and the classification of each trial is shown on the right, computed as in **Figure 2**. Right, Tuning curves are computed by averaging firing rate, after aligning the trials to the stimulus location (red line, *cue tuning*) or to the report location (black line, *report tuning*). **(B)** Predicted delay cue tuning changes with 5-HT: low [5-HT] (dark red line) induces increases in

tuning curve tails, high [5-HT] (orange line) mostly reduces tuning curve peak. **(C)** Tuning curves in the delay period computed from the stimulus cue position (red lines) and from the position of the behavioral report (black lines). Average delay firing rates for bins of 45◦ (circles) and Gaussian fit computed from all trials (200 trials) (top panels) or just from error trials (bottom panels). Left: low [5-HT] (20% decrease compared to the physiological level) has better tuning for report than cue tuning curves. Right: high [5-HT] (20% increase compared to the physiological level) shows better tuning for cue than for report tuning curves.

a random location, because the activity at the end of the delay is unstructured. As it is customary to build tuning curves, we can average the firing rate over many different trials by using the location of the cue stimulus as a reference (θ − θ*S*). We call this *cue tuning* (**Figure 6A** right, red curve). Error trials can have a marked influence on the cue tuning: although its height depends on correct trials, it decreases if there is a substantial number of error trials.

In addition, we now propose to compute a different neural tuning curve, based on the behavioral response. We calculate the *report tuning* curve (**Figure 6A** right, black curve) by averaging the firing rates during the delay period in different trials as a function of the distance from the neuronal cue preference θ (obtained from the cue tuning curve) to the report location θ*R* (θ − θ*R*). The report tuning is also affected by the number of error trials, but it is in addition very sensitive to the type of error trials. Thus, emergent bump error trials (*ii–iv*) make the curve more sharply tuned, while decaying bump error trials (*v*) reduce its tuning. We therefore hypothesized that a systematic study of how serotonergic manipulations alter the sharpness of cue tuning and report tuning in a SWM task could yield a prediction from our modeling approach that could be tested electrophysiologically to identify the effects of different error substrates in PFC neuronal responses.
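The two tuning curves differ only in the per-trial reference angle used for alignment, which the following sketch makes explicit. It is an illustration under stated assumptions rather than the authors' analysis code: the function name `aligned_tuning`, the trials-by-neurons array of mean delay rates and the binning details are hypothetical; only the 45° bin width is taken from the Figure 6 caption.

```python
import numpy as np

def aligned_tuning(rates, preferred_deg, ref_deg, bin_width_deg=45.0):
    """Average delay rates in bins of (preferred angle - reference angle).

    rates: array (n_trials, n_neurons) of mean delay firing rates (assumed layout).
    preferred_deg: array (n_neurons,) of preferred angles in degrees.
    ref_deg: array (n_trials,) of reference angles, either the cue location
             (cue tuning) or the report location (report tuning).
    Returns bin centers in degrees (from -180 upward) and mean rate per bin.
    """
    rates = np.asarray(rates, dtype=float)
    pref = np.asarray(preferred_deg, dtype=float)[None, :]
    ref = np.asarray(ref_deg, dtype=float)[:, None]
    rel = (pref - ref + 180.0) % 360.0 - 180.0  # wrapped to [-180, 180)
    n_bins = int(round(360.0 / bin_width_deg))
    # Bins are centered on multiples of bin_width_deg; +/-180 share one bin.
    idx = np.floor((rel + 180.0 + bin_width_deg / 2.0) / bin_width_deg).astype(int) % n_bins
    sums = np.bincount(idx.ravel(), weights=rates.ravel(), minlength=n_bins)
    counts = np.bincount(idx.ravel(), minlength=n_bins)
    centers = -180.0 + bin_width_deg * np.arange(n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts  # empty bins yield NaN
    return centers, means
```

Calling the function once with the cue locations and once with the report locations gives the cue tuning and the report tuning for the same set of trials. For a trial in which the bump sits at the report location rather than at the cue, the report-aligned curve peaks at 0° while the cue-aligned curve peaks away from 0°.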

We used the full network dynamics from our simulations in **Figure 2** to test this hypothesis with our modeling data. We computed neuronal firing rates in the delay period for each neuron θ in each simulation trial and we averaged across all trials, correct and error trials. We first analyzed predictions in relation to the classically defined delay tuning curve, the cue tuning. Relative to the cue tuning curve in the control [5-HT] levels, reduction of baseline [5-HT] essentially increased firing rate to non-preferred cues, while increased [5-HT] caused primarily a reduction of delay firing rates in response to cue stimuli (**Figure 6B**). We then computed delay firing rates using both bins centered around the cue location θ*<sup>S</sup>* (cue tuning) and bins centered around the report location θ*<sup>R</sup>* (report tuning). We found that for low baseline [5-HT] (**Figure 6C**, top left) the report tuning curve was more sharply tuned than the cue tuning curve because of the high incidence of emergent bump error trials (**Figure 2**). On the other hand, for high baseline [5-HT] (**Figure 6C**, top right) the report tuning curve presented worse selectivity than the cue tuning curve, as a result of decaying bump error trials. An increase in the firing rate in the tails of the curves was also observed for the high baseline [5-HT] condition, especially for the report tuning curve.

These differences were still bigger when we analyzed only error trials. For low baseline [5-HT] (**Figure 6C**, bottom left) the cue tuning curve showed essentially no selectivity for the cue because practically all errors corresponded to emergent bump trials. In contrast, the report tuning curve was still nicely tuned because these error trials maintained a bump of activity centered around the report location (**Figure 2C**). For high baseline [5-HT] (**Figure 6C**, bottom right) the cue tuning curve retained good cue selectivity because decaying bump error trials had bumps centered around the cue location in the early period of the delay. In contrast, the report tuning was not selective at all, since for these error trials responses do not reflect a bump of activity in the network but a random choice due to the lack of network selectivity at the end of the delay (**Figure 2B**). Note however a small selectivity in the report tuning of the bottom right panel in **Figure 6C**. This is due to the fact that mean network activity over the whole delay (as plotted in **Figure 6C**) still has residual enhanced rates around the cue stimulus for decaying bump trials (case *v* in **Figure 6A**), which are the dominant type of error trials there. Since in error trials behavioral reports are far from the cue, neurons with preference right on the report are never participating in such residual mean activity. This explains the dip in the report tuning curve in **Figure 6C**, bottom right panel.

Our computational model thus predicts that a comparison of cue tuning and report tuning curves from delay activity in SWM tasks should be able to reveal the different neural substrate of errors committed in the various 5-HT modulations considered. A majority of emergent bump errors leads to sharper report tuning than cue tuning, while a majority of decaying bump errors is reflected in a sharper cue tuning compared to the report tuning.

### **DISCUSSION**

We used a cortical network model for SWM which includes the effect of 5-HT modulation of PFC neurons proposed previously (Cano-Colino et al., 2013) to investigate behavioral and physiological effects of 5-HT neuromodulation of SWM. The dynamics of our computational network model was strongly affected by changes in baseline 5-HT concentration [5-HT] in regard to the stability of the homogeneous, unstructured spontaneous state, and the stability of the tuned, persistent activity memory state (Cano-Colino et al., 2013). However, when the network was tested in a SWM task that mimics experimental behavioral protocols (**Figure 1C**, left), performance changed non-monotonically with [5-HT] (**Figure 2D**), which may compromise the detection of behavioral effects (Cano-Colino et al., 2013). Nonetheless, the error types committed for reduced and increased [5-HT] were of radically different nature, either decaying bump errors (**Figure 2B**) or emergent bump errors (**Figure 2C**), and distinction between these error types revealed monotonic, predictable effects of [5-HT] manipulations that would be easier to demonstrate experimentally (**Figures 2E,F**). In a previous study we argued that one way to distinguish error types would be by separating errors based on the declared confidence of the subjects in their responses (Cano-Colino et al., 2013). While confidence reports are increasingly being used in behavioral studies of SWM (Pessoa and Ungerleider, 2004; Middlebrooks and Sommer, 2011; Mayer and Park, 2012; Rademaker et al., 2012; Tanaka and Funahashi, 2012), interpretations suffer from our limited understanding of metacognition mechanisms. Here, we sought to explore other possible strategies to distinguish errors without using metacognition, in order to control for possible confounds and to provide simpler behavioral protocols that can be implemented more easily in animal studies. 
Two of these predictions concern behavioral experiments that can be carried out both on human and nonhuman subjects (**Figures 3**, **5**), and one prediction is specific for electrophysiological experiments in monkey studies of SWM (**Figure 6**).

In a first prediction we propose to study the dependency of behavioral accuracy on delay length as a way to disambiguate decaying bump errors from emergent bump errors. Decaying bump errors are very sensitive to delay length because once the cue stimulus triggers bump activity in the network, random fluctuations can destabilize the bump, and such destabilization is more likely the longer the delay period. Thus, delay-length dependency in error rates is a property of decaying bump error trials, and not of emergent bump error trials (**Figure 3**), and this could distinguish the behavioral errors committed for low and high [5-HT] experimentally without requesting a report of confidence from the subjects. For other neuromodulatory systems, pharmacological modulations can indeed have delay-dependent or delay-independent behavioral effects on working memory depending on dosage (Penetar and McDonough, 1983; Chudasama and Robbins, 2004).

The second prediction concerned the ability to resist distractors during the SWM task. Previous computational studies have examined the conditions for SWM distraction in similar neuronal network models (Compte et al., 2000; Brunel and Wang, 2001; Durstewitz and Seamans, 2002; Gruber et al., 2006; Murray et al., 2012). When the external distractor is presented close to the location of the cue stimulus, the memory report is invariably attracted toward the distractor location (**Figure 4C**), but when the distractor is beyond a given distance from the cue (in **Figure 4** ∼45◦), one observes either a negligible effect of the distractor or a complete distraction (**Figure 4**) (Compte et al., 2000). The proportion of distraction trials for these distant-distractor conditions (Compte et al., 2000) as well as the window of short distances in which the distractor attracts the memory trace (Gruber et al., 2006; Murray et al., 2012) depend on the specific parameters of the network simulations and are therefore subject to possible neuromodulation (Compte et al., 2000; Brunel and Wang, 2001; Durstewitz and Seamans, 2002; Gruber et al., 2006). Interestingly, recent psychophysics studies have found evidence for this distance-dependent attraction of distractors in SWM (Van der Stigchel et al., 2007; Herwig et al., 2010), thus lending support to our network mechanism for the control of SWM distraction.

Our model makes testable experimental predictions about how 5-HT pharmacological manipulations affect distractibility in a SWM task with intervening distractors. Drugs that enhance brain 5-HT levels, such as selective serotonin reuptake inhibitors or tryptophan loading treatments, should decrease the ability to resist distractors (**Figure 5A**, bottom), and result in increased errors for distractors far away from cue stimuli. On the other hand, reducing 5-HT brain levels with tryptophan depletion should lead to better resistance to distant distractors (**Figure 5A**, top). Directly modulating 5-HT1A and 5-HT2A receptors should also affect distractibility: Antagonists of 5-HT1A and agonists of 5-HT2A receptors would facilitate resistance to distractors, while increased distractibility should be expected after treatment with agonists of 5-HT1A and antagonists of 5-HT2A receptors (**Figure 5**). Experimentally, we are not aware of any study that has investigated the effect of 5-HT drugs on distraction in such SWM tasks. Instead, the effect of 5-HT on cognitive flexibility is well-described (Robbins and Roberts, 2007). Several studies have shown a perseverative, inflexible behavior associated with prefrontal 5-HT depletion including a failure in error detection, altered responsiveness to punishment or loss of reward, and a deficit in inhibitory control (Deakin, 1991; Murphy et al., 2002; Evers et al., 2005). Large depletions of 5-HT throughout the PFC (Clarke et al., 2004, 2005) as well as more restricted lesions targeting the OFC (Clarke et al., 2007) result in impaired discrimination reversal performance characterized by marked response perseveration. 
Although these tasks are clearly different from the distraction SWM task that we analyzed here, it is suggestive to realize that if we change the task and ask the network to switch to the memory of the distractor to complete the task, reduced [5-HT] would result in perseverant behaviors (memories would remain on the initial cue) and the network would be unable to perform the required switch.

We observed a parallelism in distractibility between the global effects of baseline [5-HT] and the activation of 5-HT1A receptors. In contrast, distractibility effects from the activation of 5-HT2A receptors followed the opposite trends, with agonists reducing distractibility and antagonists increasing it. These results reinforce the idea advanced before (**Figure 2** and Cano-Colino et al., 2013) that 5-HT1A receptors drive the effects of general modulation of 5-HT baseline levels in PFC in SWM.

As a third prediction, we took advantage of the fact that our model is mechanistically explicit to derive predictions at the neurophysiological level that can be addressed in electrophysiological experiments in behaving animals. We reasoned before that microinfusion of serotonin receptor agonists and antagonists in the PFC should alter behavioral WM performance non-monotonically (**Figure 2**), while modulating monotonically the average firing rate during delay periods of correct trials (Cano-Colino et al., 2013). We now add that in such experiments, receptor modulations should alter the shape of neuronal tuning curves in error trials (**Figure 6**). In error trials, delay tuning to the cued stimulus (cue tuning) should be diminished relative to correct trials but some residual cue tuning may be detectable, since decaying bump error trials have early-delay rate elevations linked to the cued location (**Figure 2B**). We propose to compare this cue tuning to tuning curves computed based on the behavioral report (report tuning), not the memory cue. Report tuning in the delay will be stronger (weaker) than cue tuning for emergent bump (decaying bump) error trials (**Figure 6**). Considering the specific effects of the 5-HT receptors separately, our model thus predicts stronger (weaker) report tuning than cue tuning in error trials after microinfusion of a 5-HT1A antagonist (agonist) or a 5-HT2A agonist (antagonist) (**Figure 6**). Although this difference is especially pronounced for error trials, a change in tuning sharpness of cue and report tuning curves should also be detectable when analyzing all trials together, provided the task was designed to be near psychophysical thresholds so errors would be numerous.

Our study has some limitations. We included only 5-HT receptor mechanisms that are known to be in PFC neurons and have also been described physiologically in *in vitro* studies. We therefore did not include 5-HT1A receptors on inhibitory interneurons or the effects of 5-HT1A and 5-HT2A receptors on intracortical glutamatergic and GABAergic synaptic transmission in our model (Celada et al., 2013). We also left out other receptors that are known to be expressed in PFC but lack *in vitro* characterization, such as 5-HT3 (Puig et al., 2004), or 5-HT2C receptors (Pompeiano et al., 1994). However, analysis of the network simulations leads to the conclusion that the effects reported here are primarily dependent on the modulation of the pyramidal neuron excitability and not specific to the mechanisms of any single receptor (Cano-Colino et al., 2013). For this reason, we expect that adding more 5-HT receptors or functional effects as their *in vitro* characterization becomes available will increase the predictions that emanate from the model but will not essentially change the current findings. A second limitation is related to the fact that we are modeling neurons as single compartments where different 5-HT receptors interact compactly. However, there are indications of a possible distinct spatial distribution of 5-HT1A and 5-HT2A receptors on PFC pyramidal neurons (Nichols and Nichols, 2008; Celada et al., 2013). Our modeling study cannot address possible mechanistic consequences of such segregation of 5-HT receptors on different neuronal compartments. A third limitation stems from the fact that our simulations rely on many parameters that lack clear experimental references. We reported here simulations that correspond to one particular network realization that yields reasonable SWM behavior, and the question remains as to how general these findings are in relation to other possible parameter instantiations. 
This is a difficult problem that affects this kind of modeling project, but we have addressed it partially by finding 20 different network realizations using an unbiased automated optimization procedure and confirming that our predictions are shared by all these different networks (Cano-Colino et al., 2013). We are therefore confident that our analysis extends to a large family of SWM models with 5-HT neuromodulation.

Although we focused our study on 5-HT effects on SWM, the modulations of network function that we observed depend essentially on a concerted change in neuronal polarization in our network, here due to the action through 5-HT1A receptors (Cano-Colino et al., 2013). Therefore, any neuromodulatory agent that alters the excitability of neurons in the PFC would lead to analogous predictions in our SWM network model. This is further strengthened by the fact that synaptic strengths are also known to affect emergent and decaying bump dynamics in a similar way to cellular excitability (Edin et al., 2009; Wei et al., 2012), so the qualitative results would apply to neuromodulators affecting either cellular or synaptic excitability. Thus, all our predictions aimed at distinguishing error types to characterize the behavioral effects of neuromodulation on SWM would apply also to the dopamine or norepinephrine (NE) systems (Aston-Jones and Cohen, 2005; Cools and D'Esposito, 2011). This is underscored by a qualitative comparison of our results with those of Eckhoff et al. (2009), who also used a computational approach to explain the non-monotonic effects of NE on decision-making. They found that low tonic NE produces unmotivated behavior, due to fading or decaying memory. In contrast, high tonic NE causes impulsive responses and poor accuracy, due to the emergence of spontaneous activity prior to stimulus onset. Our results with 5-HT suggest that a careful analysis of error types in a wide range of cognitive tasks is critical to understand the effects of neuromodulatory systems in cognitive function. Experimentally, rodent experiments with D1 receptor manipulations report that an excessive activation leads to perseverative responses while suppression of the receptor induces more random behavior in a SWM task (Zahrt et al., 1997; Floresco and Phillips, 2001; Seamans and Yang, 2004). This is reminiscent of our findings that behavioral inverted U-curves may reflect different types of errors, and it is in line with the pattern of distractibility of the network upon modulation of 5-HT2A receptors. Notably, D1 and 5-HT2A receptor activations both have a generally depolarizing effect on PFC pyramidal neurons (Araneda and Andrade, 1991; Yang and Seamans, 1996). In a suggestive study (Vijayraghavan et al., 2007), local activation of D1 receptors by iontophoresis was associated with a dose-dependent change in neuronal tuning curves in a SWM task so that insufficient D1 stimulation led to diminished cue tuning due to stronger response to non-preferred locations, while excessive D1 stimulation diminished cue tuning by reducing responses to preferred locations. Interestingly, these modulations parallel our network model's predictions for 5-HT2A receptor modulation (**Figure 6B**). Taken together, we propose that identifying error mechanisms when neuromodulation causes inverted-U dose-response curves in working memory can be a fruitful avenue to advance our understanding of neuromodulatory control of higher cognitive functions.

### **ACKNOWLEDGMENTS**

The work was carried out at the Esther Koplowitz Center, Barcelona. This work was supported by the Ministry of Economy and Competitiveness of Spain (grant no. BFU2009-09537); the European Regional Development Fund, European Union; the Karolinska Institutet Strategic Neuroscience Program to Rita Almeida, and Generalitat de Catalunya (grant no. 2007BP-B100135) to Rita Almeida. We thank Francesc Artigas, Pau Celada, Jaime de la Rocha, Alex Roxin, Klaus Wimmer, Juan Pablo Ramírez-Mahaluf, and Daniel Jercog for fruitful discussions.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 May 2013; accepted: 05 September 2013; published online: 26 September 2013.*

*Citation: Cano-Colino M, Almeida R and Compte A (2013) Serotonergic modulation of spatial working memory: predictions from a computational network model. Front. Integr. Neurosci. 7:71. doi: 10.3389/fnint.2013.00071*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2013 Cano-Colino, Almeida and Compte. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Exploring the effects of depression and treatment of depression in reinforcement learning

### *Pedro Castro-Rodrigues 2,3 and Albino J. Oliveira-Maia1,2,4\**

*<sup>1</sup> Champalimaud Neuroscience Programme, Champalimaud Foundation, Lisboa, Portugal*

*<sup>2</sup> Neuropsychiatry Unit, Champalimaud Clinical Centre, Champalimaud Foundation, Lisboa, Portugal*


### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

**Keywords: depression, SSRI antidepressants, reinforcement (Psychology), punishment, reward**

### **A Commentary on the Frontiers Research Topic**

### *Neurobiological circuit function and computation of the serotonergic and related systems*

In this research topic, Herzallah et al. (2013) have contributed original work, exploring the effects of Major Depressive Disorder (MDD) and of treatment with a selective serotonin reuptake inhibitor (SSRI) on responses in a reinforcement-learning task. The task developed by the authors allows for dissociation between positive (reward) and negative (punishment) reinforcement learning and was used to compare, in a cross-sectional design, healthy controls (HC), patients with MDD prior to initiating treatment (MDD), and patients recovered from MDD under treatment with paroxetine (an SSRI; MDD-T). The main finding of their work was that, when compared to HC, patients with untreated MDD were impaired in reward learning, while MDD patients under treatment with paroxetine were impaired in both reward and punishment learning. As a result of these response profiles, the relative learning from reward and punishment was similar between HC and MDD-T (due to blunted responses to both valences in MDD-T) while, in MDD patients, learning from negative feedback was exaggerated when compared to learning from positive reinforcement.

The findings reported in this paper (Herzallah et al., 2013) contribute novel insights to the body of literature exploring the relationship between depression, reinforcement and serotonin (5-HT). Interest in understanding reinforcement processes in the context of depression rests on the fact that anhedonia, i.e., the loss of interest or pleasure in all or almost all activities, is a key symptom of MDD (APA Task Force on DSM-IV, 1994). Preclinical analogs to anhedonia, commonly used to assess depression-like behavior in rodents, are the sucrose intake and preference tests, whereby decreased intake of, or preference for, a sweet sucrose solution (relative to water) is argued to reflect an anhedonic state (Monleon et al., 1995). The relationship between serotonergic processing and depression has been suggested mostly from the antidepressant effects of drugs modulating serotonergic neurotransmission (Shopsin et al., 1976; Delgado et al., 1990). Recently, several hypotheses have been put forward to suggest that the link between MDD and 5-HT might be indirect and mediated by associative learning (Robinson and Sahakian, 2008) and/or disinhibition of negative thoughts (Dayan and Huys, 2008). Elliott et al. (1996), for example, found that during application of a neuropsychological assessment battery, after having solved one problem incorrectly, MDD patients were more likely than controls to fail the subsequent problem. This finding correlated with the severity of the depression and has been interpreted as oversensitivity to negative feedback.

Several paradigms have been used to explore the processing capabilities of depressed patients in contexts of reward and/or punishment. Findings have not been entirely consistent, possibly due to heterogeneity of patient samples and differences in testing protocols, with different authors suggesting that patients with MDD have more significant deficits in reward learning (Henriques and Davidson, 2000; Pizzagalli et al., 2008; McFarland and Klein, 2009; Robinson et al., 2012), punishment learning (Murphy et al., 2003; Santesso et al., 2008), or both (Must et al., 2006). Henriques and Davidson (2000) compared MDD patients to a group of non-depressed control subjects on a verbal memory task under three monetary payoff conditions: neutral, reward, and punishment. While control subjects maximized their earnings by modulating their pattern of response in both reward and punishment conditions, relative to the neutral condition, depressed subjects did not do so during reward. Others (McFarland and Klein, 2009) have reported similar findings, with reactivity in response to anticipated reward being significantly diminished in currently depressed compared with never depressed participants, and marginally diminished in comparison with previously depressed participants. There is also evidence to suggest exaggerated responses to punishment in MDD, rather than diminished responses to reward. In a study based on a probability reversal task (Murphy et al., 2003), when given misleading negative feedback, MDD patients were impaired in the ability to maintain the response set, as shown by their increased tendency to switch responding to the 'incorrect' stimulus following negative reinforcement, relative to controls.

Others have compared neural responses to stimuli with positive and negative valence between depressed and nondepressed volunteers. McCabe et al. (2009) compared unmedicated patients with a history of major depression to age- and gender-matched HC. Despite no differences in stimulus ratings, recovered patients had decreased neural responses to the pleasant stimulus in the ventral striatum and increased responses in the caudate nucleus to the aversive stimulus. The same authors (McCabe et al., 2010) later found that citalopram (an SSRI antidepressant), but not reboxetine (another antidepressant that acts as a noradrenaline reuptake inhibitor), reduced activation from rewarding stimuli in the ventral striatum and orbitofrontal cortex. Citalopram also decreased neural responses to the aversive stimuli conditions in areas such as the lateral orbitofrontal cortex, while reboxetine produced a similar, although weaker, effect. Others have found that another antidepressant (duloxetine, a dual serotonin and noradrenaline reuptake inhibitor), rather than diminishing the neural processing of both rewarding and aversive stimuli, can increase ventral striatal activity in humans (Ossewaarde et al., 2011).

The findings now reported by Herzallah et al. (2013) highlight the importance of considering treatment effects when testing the relationship between depression and reinforcement learning. Furthermore, the authors provide behavioral support for the neural findings previously described by McCabe et al. (2010): in both cases responses to reward and punishment were reduced by treatment with an SSRI, possibly accounting for the experience of *emotional blunting* described by some patients during SSRI treatment (Price et al., 2009). However, as is acknowledged by the authors (Herzallah et al., 2013), these findings should be interpreted in the context of their cross-sectional study design. For example, it is unclear if, before starting treatment, disease severity in MDD-T patients was similar to that found in MDD patients, and also if the latter group will respond to treatment similarly to the MDD-T group. Importantly, because MDD-T patients are responders to treatment, the experimental design also does not distinguish entirely between the effects of paroxetine *per se* and the effects of recovery from depression. Future work, with a longitudinal study design and/or including groups at different phases of treatment or treated with non-SSRI drugs or non-pharmacological alternatives, should address these questions.

### **ACKNOWLEDGMENTS**

We thank Marta Camacho for review of this manuscript. Albino J. Oliveira-Maia is funded by a Junior Research and Career Development Award from the Harvard Medical School—Portugal Program and a grant from the BIAL Foundation.



*Received: 08 September 2013; accepted: 14 September 2013; published online: 09 October 2013.*

*Citation: Castro-Rodrigues P and Oliveira-Maia AJ (2013) Exploring the effects of depression and treatment of depression in reinforcement learning. Front. Integr. Neurosci. 7:72. doi: 10.3389/fnint.2013.00072*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2013 Castro-Rodrigues and Oliveira-Maia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A dynamic, embodied paradigm to investigate the role of serotonin in decision-making

### *Derrik E. Asher1\*†, Alexis B. Craig1\*†, Andrew Zaldivar1\*†, Alyssa A. Brewer <sup>2</sup> and Jeffrey L. Krichmar1,3*

*<sup>1</sup> Cognitive Anteater Robotics Lab, Department of Cognitive Sciences, University of California, Irvine, CA, USA*

*<sup>2</sup> Laboratory of Visual Neuroscience, Department of Cognitive Sciences, University of California, Irvine, CA, USA*

*<sup>3</sup> Cognitive Anteater Robotics Lab, Department of Computer Science, University of California, Irvine, CA, USA*

### *Edited by:*

*Kae Nakamura, Kansai Medical University, Japan*

### *Reviewed by:*

*Lei Niu, Albert Einstein College of Medicine, USA*

*KongFatt Wong-Lin, University of Ulster, Northern Ireland*

#### *\*Correspondence:*

*Derrik E. Asher, Alexis B. Craig, and Andrew Zaldivar, Cognitive Anteater Robotics Lab, Department of Cognitive Sciences, University of California, 2220 Social and Behavioral Sciences Gateway Building, Irvine, CA 92697, USA e-mail: dasher@uci.edu; acraig1@uci.edu; azaldiva@uci.edu*

†*Derrik E. Asher, Alexis B. Craig, and Andrew Zaldivar have contributed equally to this work.*

Serotonin (5-HT) is a neuromodulator that has been implicated in cost assessment and harm aversion. In this review, we look at the role 5-HT plays in making decisions when subjects are faced with potentially harmful or costly outcomes. We review approaches for examining the serotonergic system in decision-making. We introduce our group's paradigm used to investigate how 5-HT affects decision-making. In particular, our paradigm combines techniques from computational neuroscience, socioeconomic game theory, human–robot interaction, and Bayesian statistics. We will highlight key findings from our previous studies utilizing this paradigm, which helped expand our understanding of 5-HT's effect on decision-making in relation to cost assessment. Lastly, we propose a cyclic multidisciplinary approach that may aid in addressing the complexity of exploring 5-HT and decision-making by iteratively updating our assumptions and models of the serotonergic system through exhaustive experimentation.

**Keywords: serotonin, embodiment, cost assessment, human–robot interaction, adaptive agents, game theory, cognitive modeling, acute tryptophan depletion**

### **INTRODUCTION**

Recent theoretical work has implicated serotonergic (5-HT) function in critical dimensions of reward versus punishment and invigoration versus inhibition (Boureau and Dayan, 2010). Both of these dimensions have influence on a broad range of decision-making elements, including reward processing, impulsivity, reward discounting, predicting punishment, harm aversion, opponency with other neuromodulators, and anxious states (Doya, 2008; Dayan and Huys, 2009; Cools et al., 2010). In this review, we first provide evidence from the literature indicating several of the proposed functions attributed to the serotonergic system. Next, we discuss approaches that utilized game theory and other behavioral measures along with some metric of serotonergic function. Finally, we introduce a multidisciplinary experimental paradigm to model the role 5-HT plays in decision-making, and present some of our work that has utilized this paradigm.

Our paradigm begins with base assumptions regarding the role of serotonin or other neuromodulators to construct a simulated agent that can adapt to environmental challenges. That adaptive agent, which is either embodied in a robotic platform for human– robot interaction studies or embedded in a computer interface, is incorporated into a game theoretic environment. The data collected from these experiments are then analyzed to support or reject hypotheses about the roles of neuromodulators in specific cognitive functions such as decision-making, which may lead to the use of more sophisticated adaptive agents in subsequent studies.

### **FUNCTIONAL ROLES OF SEROTONIN**

It has been suggested that serotonin influences a broad range of decision-based functions such as reward assessment, cost assessment, impulsivity, harm aversion, and anxious states. This section discusses recent evidence demonstrating the role serotonin has on these decision-based functions.

Though reward processing is a function that has primarily been attributed to the dopaminergic (DA) system, 5-HT has also been associated with reward-related behavior (Tanaka et al., 2007, 2009; Nakamura et al., 2008; Schweighofer et al., 2008; Bromberg-Martin et al., 2010; Okada et al., 2011; Seymour et al., 2012). Recent single-unit recordings of serotonergic neurons in the monkey dorsal raphe nucleus (DRN), which is a major source of serotonergic innervation in the central nervous system, demonstrated that many of these neurons represent reward information (Nakamura et al., 2008; Bromberg-Martin et al., 2010; Okada et al., 2011). Nakamura et al. (2008) showed that during a saccade task, after target onset but before reward delivery, the activity of many DRN neurons was modulated by the expected reward size. Bromberg-Martin et al. (2010) showed that a group of DRN neurons tracked progress toward future delayed reward after the initiation of a saccade and after the value of the trial was revealed. These studies suggest that DRN neurons, which include 5-HT neurons, may influence behavior based on the amount of delay before reward delivery and the value of the reward in future motivational outcomes (Nakamura et al., 2008; Bromberg-Martin et al., 2010).

"fnint-07-00078" — 2013/11/19 — 20:26 — page 1 — #1

Other anatomical evidence has shown that projections from DRN to reward-related DA regions support 5-HT's role in both reward and punishment (Tops et al., 2009). A theoretical review by Boureau and Dayan (2010) suggested that the 5-HT and dopamine systems primarily activate in opposition and at times in collaboration for goal-directed actions. A review by Doya (2008) also highlighted possible computational factors of decision-making in brain regions innervated by serotonin and dopamine (for a schematic of the potential interplay between 5-HT and other brain structures, see Doya, 2008, Figure 3). 5-HT projections to dopamine areas have been shown to regulate threat avoidance (Rudebeck et al., 2006; Tops et al., 2009), and an impairment in these projections can lead to impulsivity and addiction (Deakin, 2003). Altogether, the interaction between these systems allows 5-HT to play various functional roles in decision-making where reward and punishment, as well as invigoration and inhibition, are in opposition.

In addition to reward processing, several studies have investigated serotonin's involvement in reward and impulsivity by manipulating levels of central 5-HT in humans using the acute tryptophan depletion (ATD) procedure. ATD is a dietary reduction of tryptophan, an amino acid precursor of 5-HT, which causes a rapid decrease in the synthesis and release of the human brain's central 5-HT, thus affecting behavioral control (Nishizawa et al., 1997). Altering 5-HT levels via ATD influences a subject's ability to resist a small immediate reward in favor of a larger delayed reward (delay reward discounting; Tanaka et al., 2007, 2009; Schweighofer et al., 2008). As such, subjects that underwent ATD had both an attenuated assessment of delayed reward and a bias toward small reward, which were indicative of impulsive behavior.
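
Delay reward discounting can be made concrete with a short Python sketch. It assumes an exponential discounting rule, in which a reward of size r delivered after d steps is valued at r·γ^d, a common choice in reinforcement learning models. The function names, reward sizes, delays, and values of γ below are hypothetical and are not taken from the cited studies.

```python
def discounted_value(reward, delay, gamma):
    """Present value of a reward received after `delay` steps (exponential discounting)."""
    return reward * gamma ** delay


def choose(small_option, large_option, gamma):
    """Pick the option with the higher discounted value.

    Each option is a (reward, delay) pair.
    """
    v_small = discounted_value(*small_option, gamma)
    v_large = discounted_value(*large_option, gamma)
    return "large-delayed" if v_large > v_small else "small-immediate"


small_option = (1.0, 1)   # small reward after a short delay (assumed values)
large_option = (4.0, 6)   # larger reward after a longer delay (assumed values)

patient_choice = choose(small_option, large_option, gamma=0.9)
impulsive_choice = choose(small_option, large_option, gamma=0.6)
print("gamma = 0.9:", patient_choice)
print("gamma = 0.6:", impulsive_choice)
```

With these illustrative numbers, a shallow discount factor favors the larger delayed reward and a steep one favors the small immediate reward, which is the behavioral pattern described as impulsive in the studies above.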

Besides reward, 5-HT has also been linked to predicting punishment or harm aversion (Cools et al., 2008; Crockett et al., 2009, 2012; Tanaka et al., 2009; Seymour et al., 2012). Cools et al. (2008) paired the ATD procedure with a reversal-learning task, demonstrating that subjects under ATD made more prediction errors for punishment-associated stimuli than for reward-associated stimuli. In a related study, Crockett et al. (2009) utilized the ATD procedure with a Go/No-Go task to show that lowering 5-HT levels resulted in a decrease in punishment-induced inhibition. In a follow-up study, they investigated the mechanisms through which 5-HT regulated punishment-induced inhibition by using the ATD procedure paired with their Reinforced Categorization task, a variation on the Go/No-Go task (Crockett et al., 2012). Subjects with lowered 5-HT were faster in responding to stimuli predictive of punishments (Crockett et al., 2012), indicating a manipulation of some punishment-predicting mechanism associated with standard serotonergic function. Together, these results suggest that 5-HT influences the ability to inhibit actions that predict punishment and to avoid harmful circumstances.

Beyond punishment, 5-HT has been implicated in stress and anxiety (Millan, 2003; Jasinska et al., 2012). A recent review by Jasinska et al. (2012) proposed a mechanistic model between environmental impact factors and genetic variation of the serotonin transporter (5-HTTLPR), linking to the risk of depression in humans. They argued that genetic variation may be linked to a balance in the brain's circuitry underlying stressor reactivity and emotion regulation triggered by a stressful event, ultimately leading to depression (Jasinska et al., 2012). A review by Millan (2003) described studies showing that 5-HT function has been tied to an organism's anxious states triggered by conditioned or unconditioned fear. Together, this work suggests a functional role for 5-HT in the control of anxious states.

In summary, these studies reveal serotonergic modulation of a wide range of decision-based functions including but not limited to reward processing, motivational encoding, punishment prediction, discounting, impulsivity, harm aversion, and anxious states. Building on this body of work, many researchers in the field have utilized their own approaches in studies to better understand the function of serotonin in behavior. In the present paper, we introduce a novel, multi-disciplinary approach to study serotonin's influence in decision-making that may highlight many of the functions described above. Our paradigm combines techniques from computational neuroscience, socioeconomic game theory, human–robot interaction, and Bayesian statistics.

### **INVESTIGATION OF DECISION-MAKING USING GAME THEORY AND SEROTONERGIC MANIPULATIONS**

Game theory is a toolbox that is utilized in a multitude of disciplines for its ability to quantitatively measure and predict behavior in situations of cooperation and competition (Maynard Smith, 1982; Nowak et al., 2000; Skyrms, 2001). It operates on the principle that organisms will balance reward with effort while acting in self-interest to obtain the optimal result in a given situation. Game theory is especially valuable as a venue for studying human behavior because it provides a replicable, predictable, and controlled environment with clearly defined boundaries. These elements are essential when introducing computer agents as opponents.

Game theory has been combined with manipulations of serotonin to help understand its role in socioeconomic decision-making. For example, in the Prisoner's Dilemma, where subjects either cooperate or defect in a risky situation, it has been shown that ATD increases the prevalence of defecting, which might be considered an impulsive, risk-taking choice (Wood et al., 2006). Similarly, the Ultimatum game is a test of cooperation in which a proposer offers a share of a resource to a receiver, and the receiver can either accept or reject this offer (Nowak et al., 2000; Sanfey, 2003). In studies conducted by Crockett et al. (2008) incorporating the Ultimatum game with serotonergic manipulations, it was found that subjects under ATD rejected a significantly higher proportion of unfair offers and that decreased serotonin levels correlated with increased dorsal striatal activity induced by costly punishment (Crockett et al., 2013). In contrast, subjects that ingested citalopram, an SSRI, were less likely to punish unfairness in the Ultimatum game (Crockett et al., 2010). Together, these studies implicate the involvement of the serotonergic system with cost in decision-making, an important result in understanding the cost and reward mechanisms in the brain.

Another notable game that focuses on the investigation of cooperation and social contracts is the Stag Hunt. In the Stag Hunt, two players must independently choose to hunt a high payoff stag cooperatively or a low payoff hare individually. The risk in decision-making lies in the case when only one player chooses stag, resulting in no payoff for that player (Skyrms, 2004). The body of work involving Stag Hunt largely involves simulations with set-strategy agents or human players as opponents (Skyrms, 2004; Szolnoki and Perc, 2008; Scholz and Whiteman, 2010). More recently, the use of adaptive agents, computer players that learn in real-time, has been gaining popularity in the field of social decision-making (Yoshida et al., 2008, 2010). Yoshida et al. (2010) conducted a study in which adaptive agents played a spatiotemporal version of the Stag Hunt game against human subjects in an fMRI scanner, implicating both rostral medial prefrontal cortex and dorsolateral prefrontal cortex in processing uncertainty and sophistication of agent strategy, respectively. Utilizing adaptive agents allows for a dynamic yet controlled behavioral manipulation in subjects, which is useful within game environments, particularly if applied to studying the cost and reward mechanisms of the brain.
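
The payoff structure of the Stag Hunt described above can be written down in a few lines of Python. The numerical payoffs (4 for a shared stag, 1 for a hare, 0 for hunting stag alone) and the function name `stag_hunt_payoffs` are assumptions chosen only to respect the ordering given in the text; the cited studies may use different values.

```python
STAG_PAYOFF = 4   # assumed payoff to each player when both hunt stag
HARE_PAYOFF = 1   # assumed payoff for hunting hare, regardless of the other player
ALONE_PAYOFF = 0  # payoff for hunting stag while the other player hunts hare


def stag_hunt_payoffs(action_a, action_b):
    """Return the (payoff_a, payoff_b) pair for one round of the Stag Hunt."""
    def payoff(own, other):
        if own == "hare":
            return HARE_PAYOFF
        return STAG_PAYOFF if other == "stag" else ALONE_PAYOFF

    return payoff(action_a, action_b), payoff(action_b, action_a)


for actions in [("stag", "stag"), ("stag", "hare"), ("hare", "hare")]:
    print(actions, stag_hunt_payoffs(*actions))
```

The risk discussed in the text appears in the mixed case: the player who commits to stag alone receives nothing, while the player who hunts hare still collects the low but safe payoff.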

Pairing a decision-making task with ATD in the absence of game theory has further illuminated serotonin's involvement in behavior. This combination has revealed serotonin's involvement in the reflexive avoidance of relatively immediate small costs in favor of larger future costs with an Information Sampling Task (Crockett et al., 2011). The pairing of ATD with a "four-armed bandit" task showed that depleted subjects tended to be both more perseverative and less receptive to reward (Seymour et al., 2012). These results show that the combination of a decision-making task with serotonergic manipulation (e.g., ATD) can provide important information about the role serotonin has in the decision-making process. In general, the combination of ATD with a decision-making task provides a useful venue for the exploration of social behavior and the neural correlates of cost and reward in decision-making.

In addition to altering decision-making, reduced 5-HT levels via ATD have been correlated with individual differences in subject behavior (Krämer et al., 2011; Demoto et al., 2012). Subjects with high neuroticism and low self-directedness personality traits have been shown to be particularly susceptible to central 5-HT depletion, resulting in decreased selection of delayed larger reward over smaller immediate reward when performing a delayed reward choice task (Demoto et al., 2012). Similarly, subjects with low baseline aggression have displayed reduced reactive aggression when performing a competitive reaction time task with depleted 5-HT levels (Krämer et al., 2011). The results from these studies provide evidence for individual behavioral differences correlated with central 5-HT manipulation, which may serve as a direction for future study.

In summary, due to the complex nature of the serotonergic system, researchers have utilized several complementary methods to investigate the varied aspects of its behavioral influence. This review introduces a multidisciplinary experimental paradigm to model the role 5-HT plays in decision-making.

### **A MULTIDISCIPLINARY PARADIGM TO INVESTIGATE THE SEROTONERGIC SYSTEM**

Our paradigm combines socioeconomic game theory with embodied models of learning and adaptive behavior (**Figure 1A**). In particular, we constructed our computational models to reflect 5-HT's potential interplay with the expected cost of a decision (Daw et al., 2002; Cools et al., 2007), under the assumption that 5-HT, released by the DRN, can act as an opponent to dopamine. In this case, activation of the 5-HT system may cause an organism to be withdrawn or risk-averse, and the DA system causes the organism to be uninhibited or risk taking (Boureau and Dayan, 2010). Within the context of this paper, cost is defined as either the perceived loss of an expected payoff or harm from a potential threat, depending on the scope of the study it is used in. We will compare our present results using this paradigm with other studies, and discuss future steps that may lead to more accurate modeling of serotonin's proposed role in assessing the tradeoff between cooperation and competition (**Figure 1B**).

### **PARADIGM OVERVIEW**

When considering the complex and highly varied behavior in decision-making during socioeconomic games (Lee, 2008a), adaptive agents (i.e., computer algorithms that can adjust their game playing behavior in response to the human player or to changes in the environment) provide a formidable means to engage human subjects that exceeds the abilities of set-strategy agents (i.e., algorithms that do not adjust their strategy over the course of a game) (Valluri, 2006). As a result of the dynamic nature of adaptive agents, there is a bidirectional influence between agent and player that is otherwise limited, as is the case for set-strategy agents. Additionally, adaptive agents are capable of changing their strategies over time, both between and during games. This allows for more organic player behavior that resembles interactions with a human subject. The advantage of using an adaptive agent over human subjects is that experimenters have greater control over how the agent performs, addressing a weakness commonly found in the highly variable and complex decision-making strategies of humans (Craig et al., 2013). Furthermore, adaptive agents themselves are a source of information, as it is possible to examine their internal processes and strategies that develop in response to the game environment. Adaptive agents work well in a simulation setting, running thousands of trials very quickly; however, it is often necessary to match simulations with comparable studies in human subjects to get a more complete picture of an agent's behavior. In human studies, it is not only possible for human subjects to interact with a computer screen; these agents can also be embodied in robots for further investigation of human–robot interactions.
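
The distinction between set-strategy and adaptive agents can be sketched in Python for a Stag Hunt setting. This is a deliberately simple illustration and not the neural network model used in our studies: the class names, the payoff values (4 for a shared stag, 1 for a hare), the learning rate, and the running-estimate update rule are all assumptions.

```python
class SetStrategyAgent:
    """Plays the same action every round and ignores the opponent."""

    def __init__(self, action):
        self.action = action

    def act(self):
        return self.action

    def observe(self, opponent_action):
        pass  # a set-strategy agent never adjusts


class AdaptiveAgent:
    """Tracks how often the opponent hunts stag and responds to that estimate."""

    def __init__(self, learning_rate=0.5, p_stag=0.5):
        self.learning_rate = learning_rate
        self.p_stag = p_stag  # estimated probability that the opponent hunts stag

    def act(self):
        # Expected payoff of stag is 4 * p_stag; hare pays 1 for certain (assumed values).
        return "stag" if 4.0 * self.p_stag > 1.0 else "hare"

    def observe(self, opponent_action):
        target = 1.0 if opponent_action == "stag" else 0.0
        self.p_stag += self.learning_rate * (target - self.p_stag)


def play(agent, opponent, n_rounds=5):
    """Return the sequence of actions chosen by `agent` against `opponent`."""
    history = []
    for _ in range(n_rounds):
        own, other = agent.act(), opponent.act()
        history.append(own)
        agent.observe(other)
        opponent.observe(own)
    return history


against_hare = play(AdaptiveAgent(), SetStrategyAgent("hare"))
against_stag = play(AdaptiveAgent(), SetStrategyAgent("stag"))
print("adaptive vs. always-hare:", against_hare)
print("adaptive vs. always-stag:", against_stag)
```

Because the adaptive agent's internal estimate `p_stag` is available for inspection after every round, the sketch also illustrates how an agent's internal state can serve as a source of information alongside its observable choices.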

Embodied models have been shown to elicit strong reactions in humans (Breazeal and Scassellati, 2002; Kidd and Breazeal, 2004) and exhibit more natural and complex behavior than pure simulations (Krichmar and Edelman, 2002, 2005). For these reasons, embodied models provide a good platform for studying a wide range of cognitive functions. One previous study tested subjects' engagement with robots, as compared to animated characters, in an experiment where subjects had to cooperate with, persuade, and assist the robot in the completion of various tasks (Kidd and Breazeal, 2004). Subjects found robots to be more credible, informative, and enjoyable to interact with compared to an animated character on a computer screen. Similar results were found by Wainer et al. (2007), further reinforcing the theory that robotic platforms are seen as more cognizant, helpful, and pleasant to work with as reported by subjects. This, in turn, has led other researchers to adopt robots as brain-based devices, because they provide a framework for understanding the interaction of simulated brain activity within a real environment. Furthermore, the embodied

"fnint-07-00078" — 2013/11/19 — 20:26 — page 3 — #3

**FIGURE 1 | Multidisciplinary paradigms. (A)** Multidisciplinary paradigm for investigating the role of serotonin in decision-making and behavior. The model begins with base assumptions regarding neuromodulation, which are then used to develop an adaptive neural network model of cost and reward assessment. This network is embedded in an agent acting as a player in a game theoretic environment, alongside control conditions with set-strategy agents. These agents are both embodied in robotic players and simulated in computer-based games. The agents are used in both human and simulation experiments to assess the adaptive network's ability to behave naturally, as well as the human subjects' reactions to the adaptive agent compared to set-strategy agents. Human subject experiments under this paradigm can include acute tryptophan depletion (ATD) manipulations. The results from human and simulation experiments are then processed to determine the validity of any hypotheses developed at the outset, in addition to the appearance of interesting emergent behavior.

**(B)** Cyclic, multidisciplinary paradigm. This model is a modified version of (A) with an added iterative component, as well as the inclusion of both fMRI (Tanaka et al., 2007; Yoshida et al., 2010; Crockett et al., 2012; Seymour et al., 2012) and genetics (Bevilacqua and Goldman, 2011; Hyde et al., 2011; Loth et al., 2011) components of human experimentation. The addition of an iterative component allows the results of previous studies conducted under the paradigm to be analyzed for possible areas of improvement in the model, which are then committed as alterations. The new node represents the following three modifications: (1) make new interpretations as to the role of serotonin in human subject behavior; (2) develop a new cognitive model based on human subject behavior; and (3) modify the adaptive neural network to create agents that reflect observed individual differences in human subject behavior. This paradigm allows for a constantly improving neural network model that is increasingly more able to fit the demands of studying decision-making and behavior.

approach might serve as the foundation for the development of intelligent machines that adhere to neurobiological constraints (Krichmar and Edelman, 2002, 2005).

To further elucidate the role of serotonin and dopamine in decision-making, we have developed a multidisciplinary paradigm that incorporates embodied adaptive agents into interactive game environments (**Figure 1A**). Our general paradigm includes several key aspects, which we describe in detail below. In brief, we begin with base assumptions founded on previous studies that are used to construct an adaptive agent. That model, alongside set-strategy agents used in control conditions, is either embodied in a robotic platform (Agents: Embodied) or embedded in a computer interface (Agents: Simulated). Those agents are incorporated into a game theoretic environment in both human subject and simulation experiments. Human subject experiments include manipulation with ATD (Human Experiments: ATD). The data collected from these experiments are analyzed to either support or reject specific hypotheses about the role of serotonin in decision-making, or to create new models that explore the theories that emerge from the data.

### **BASE ASSUMPTIONS**

To start, we assume that serotonergic activity in the raphe nucleus is related to the expected cost of a decision. In this case, cost assessment can be related to harm or loss aversion (Doya, 2002; Millan, 2003; Cools et al., 2008; Crockett et al., 2009, 2012; Murphy et al., 2009; Tanaka et al., 2009; Takahashi et al., 2012), as well as risk in discounting reward (Schweighofer et al., 2008; Tanaka et al., 2009). These suppositions imply that decreased serotonergic activity would result in reduced harm aversion and reduced risk aversion in the decision-making process, along with some alteration of learning parameters influenced by cost. That is, the magnitude of cost in making decisions is perceived as less when serotonin levels are low. On the other hand, we assumed that DA activity was related

Asher et al. Dynamic paradigm to investigate serotonin

to the expected reward of a decision (Schultz, 1997; Berridge and Robinson, 1998; McClure et al., 2003; Redgrave and Gurney, 2006). Under this assumption, reduced dopamine would result in reduced reward-seeking behavior and manipulation of the learning parameters influenced by reward. In other words, the magnitude of reward in making decisions is perceived as less when dopamine levels are low. In this case, high serotonin levels have a strong influence over decisions, resulting in less risk-taking behavior.

Although controversial compared to other neuromodulators, evidence suggests that serotonergic neuromodulation features both tonic and phasic modes of activity (Briand et al., 2007; Schweimer and Ungless, 2010; Nakamura, 2013). The phasic mode is associated with transient bursts of neural activity from aversive stimuli (Schweimer and Ungless, 2010), whereas the tonic mode is represented by baseline activity (Briand et al., 2007) linked to reward magnitude assessment (Nakamura et al., 2008), outcome-based motivation (Bromberg-Martin et al., 2010), and behavioral regulation (Okada et al., 2011). Although not specific to serotonin, it has been suggested that phasic neuromodulation amplifies inhibitory connections and extrinsic inputs from the thalamus, whereas during tonic neuromodulation, intrinsic cortico-cortical connections are relatively stronger (Kobayashi et al., 2000; Gu, 2002; Hasselmo and McGaughy, 2004; Lapish et al., 2006). Theoretical work has shown that this change in synaptic currents associated with phasic neuromodulation can produce a winner-take-all (WTA) network response (Krichmar, 2008). This WTA response is indicative of decisive, exploitive behavior. In addition, evidence has indicated that neuromodulatory activity is linked to increased plasticity (Gu, 2002; Fletcher and Chen, 2010; Shumake et al., 2010; Moran et al., 2013; Nakamura, 2013). Thus, phasic neuromodulation might increase future responses to salient stimuli.

The association between tonic/phasic neuromodulation and explore/exploit behavior was originally put forth by Aston-Jones and Cohen (2005) based on their observations of the noradrenergic system during studies with awake-behaving monkeys. Based on this and other empirical evidence, we have extended the exploration/exploitation idea to other neuromodulatory systems (Asher et al., 2010, 2012a; Zaldivar et al., 2010; Krichmar, 2013). Specifically, tonic levels of neuromodulation have been associated with distractible behavior and poor task performance, whereas phasic neuromodulation has been associated with attentiveness and good task performance (Aston-Jones and Cohen, 2005). Although tonic levels are associated with distractibility, this is a necessary component of the drive for exploration, or seeking out new sources of rewarding stimuli in an environment. The attentiveness associated with phasic neuromodulation is necessary for the ability to exploit a resource that has proven to be rewarding, and also to attend to salient stimuli in an environment.

These base assumptions have led us to develop models balancing cost and reward in decision-making through simulation of the neuromodulatory systems, which reflects neural activity in the brain and its resulting explorative and exploitive behaviors.

### **ADAPTIVE AGENT MODELS**

Given our base assumptions, we developed adaptive neural models capable of shaping action selection involved in decision-making (**Figure 2**). In general, these models made decisions based on their assessment of the expected cost and reward of actions, where cost was related to harm or loss aversion (Asher et al., 2010, 2012a; Zaldivar et al., 2010; Craig et al., 2013).

### *Neural network model*

Our neural network model, which was used in Hawk-Dove and Chicken human-robot interaction studies, simulated neuromodulation and plasticity based on environmental conditions, as well as previous experiences with cost and reward (Asher et al., 2010, 2012a; Zaldivar et al., 2010). The model was divided into three distinct neural areas: (1) Game-Dependent Input Neurons, (2) Action Neurons, and (3) Neuromodulatory Neurons (**Figure 2A**). The Game-Dependent Input Neurons, akin to sensory neurons, represented the possible environmental states the model could observe. The Action Neurons reflected the different choices the model could make in its environment. The Neuromodulatory Neurons featured Cost neurons, which represented serotonergic neuromodulation, and Reward neurons, which represented DA neuromodulation. The connections between the Game-Dependent Input Neurons and both the Neuromodulatory and Action Neurons were subject to neuromodulated synaptic plasticity. To model phasic neuromodulation's effect on decision-making, neuromodulatory activity amplified the extrinsic excitatory connections from the Game-Dependent Input Neurons and the inhibitory connections from the opposing Action Neuron (**Figure 2A**).

The activity of each of the Game-Dependent Input Neurons ($n_i$) was computed as follows:

$$n_i = \begin{cases} b + noise, & i = \text{salient stimulus} \\ noise, & \text{otherwise} \end{cases} \tag{1}$$

where *b* was a constant value dependent on the game played (*b* = 0.75 for the Hawk-Dove and *b* = 0.45 for the Chicken games described below), and *noise* represented neural noise, which was a random number between 0 and 0.25 drawn from a uniform distribution.

The neural activities for the action and neuromodulatory neurons were simulated by a mean firing rate neuron model, where the firing rate of each neuron ranged from 0 (quiescent) to 1 (maximal firing) on a continuous scale. The activity of both Action Neurons was based on their previous firing rates, plastic extrinsic excitatory input from the Game-Dependent Input Neurons, non-plastic intrinsic excitatory input from the opposing action neuron, and non-plastic intrinsic inhibitory input from the opposing action neuron (**Figure 2A**). In contrast, the activity of both Neuromodulatory Neurons was based on plastic extrinsic excitatory input from the Game-Dependent Input Neurons and previous cost/reward information reflected in the respective firing rates at the previous time step. The equation for the mean firing rate neuron model was:

$$s_i(t) = \rho_i s_i(t-1) + (1 - \rho_i) \left(\frac{1}{1 + \exp(-5 I_i(t))}\right) \tag{2}$$

where *t* was the current time step, $s_i$ was the activation level of neuron *i*, $\rho_i$ was a constant set to 0.1 denoting the persistence of the neuron, and $I_i$ was the synaptic input. The synaptic input of the neuron was based on pre-synaptic neural activity, the connection strength of the synapse, and the amount of neuromodulatory activity:

$$I_i(t) = noise + \sum_j nm(t-1)\, w_{ij}(t-1)\, s_j(t-1) \tag{3}$$

where $w_{ij}$ was the synaptic weight from neuron *j* to neuron *i*, and *nm* was the level of neuromodulation, which was the combined average activity of the Cost and Reward neurons. The *noise* term represented neural input noise and was a random number between −0.5 and 0, drawn from a uniform distribution.

Phasic neuromodulation can have a strong effect on action selection and learning (Krichmar, 2008). During phasic neuromodulation, extrinsic excitatory synaptic projections from sensory systems and intrinsic inhibitory inputs are amplified relative to recurrent or excitatory intrinsic connections. In the model, the input (Game-Dependent Input Neurons) to Action neurons represented sensory connections and the inhibitory Action-to-Action neurons represented the intrinsic inhibitory connections. To simulate the effect of phasic neuromodulation, intrinsic inhibitory and sensory connections were amplified by setting *nm* in Equation 3 to ten times the combined average activity of the simulated Cost and Reward neurons. Otherwise, *nm* in Equation 3 was set to 1 for all other connections. In previous simulation studies and robotic experiments, this mechanism was shown to be effective in making the network exploitive when neuromodulation levels were high and exploratory when neuromodulation levels were low (Krichmar, 2008; Cox and Krichmar, 2009).

After the neural activities for the Action and Neuromodulatory Neurons were computed, a learning rule was applied to the plastic connections (projections from Game-Dependent Input Neurons) of the neural model. The learning rule depended on the current activity of the pre-synaptic neuron, the post-synaptic neuron, the overall activity of the modulatory neurons, and the cost/reward outcome from the game played:

$$\Delta w_{ij} = \alpha \ast nm(t-1)\, s_j(t-1)\, s_i(t-1) \ast R \tag{4}$$

where $s_j$ was the pre-synaptic neuron activity level, $s_i$ was the post-synaptic neuron activity level, *nm* was the average activity of the Neuromodulatory Neurons, and *R* was the level of reinforcement based on payoff and cost (Equation 5). The pre-synaptic neuron ($s_j$) in Equation 4 was the most active Game-Dependent Input Neuron (Equation 1). The post-synaptic neuron ($s_i$) could be the most active Action neuron, the Cost neuron, or the Reward neuron. The level of reinforcement was given by:

where the *Reward Received* and *Cost Received* were values determined by the positive and negative payoffs, respectively. The values were determined by a payoff matrix specific to the game being played (Asher et al., 2012a). Application of Equation 5 was based on the assumption that the Reward neuron activity predicted the reward of an upcoming action and the Cost neuron activity predicted the cost of that action. If the predictions were accurate, there would be little change in synaptic plasticity, whereas if the predictions were inaccurate, synaptic plasticity would occur (Equations 4–5).
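As an illustration only, Equations 1–4 can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in the cited studies: the function names, the learning rate `alpha`, the example weights of 0.5, and the reading of "combined average activity" as an arithmetic mean are all assumptions, and the reinforcement *R* is passed in by the caller rather than computed.

```python
import math
import random


def input_activity(n_inputs, salient, b, rng):
    """Equation 1: b + noise for the salient stimulus, noise alone otherwise."""
    return [(b if i == salient else 0.0) + rng.uniform(0.0, 0.25)
            for i in range(n_inputs)]


def phasic_gain(cost_rate, reward_rate):
    """nm for amplified connections: ten times the mean Cost/Reward activity.

    Reading "combined average" as the arithmetic mean is an assumption.
    """
    return 10.0 * (cost_rate + reward_rate) / 2.0


def synaptic_input(weights, pre_rates, gains, noise):
    """Equation 3: noise + sum_j nm_j * w_ij * s_j (one gain per connection)."""
    return noise + sum(g * w * s for g, w, s in zip(gains, weights, pre_rates))


def firing_rate(prev_rate, syn_input, rho=0.1):
    """Equation 2: persistence term plus a sigmoid of the synaptic input."""
    return rho * prev_rate + (1.0 - rho) / (1.0 + math.exp(-5.0 * syn_input))


def weight_change(alpha, nm, pre_rate, post_rate, reinforcement):
    """Equation 4: alpha * nm * s_j * s_i * R; R is supplied by the caller."""
    return alpha * nm * pre_rate * post_rate * reinforcement


# Hypothetical single step for one Action Neuron with two plastic inputs.
rng = random.Random(0)  # seeded only to make the sketch repeatable
inputs = input_activity(n_inputs=2, salient=0, b=0.75, rng=rng)
gain = phasic_gain(cost_rate=0.2, reward_rate=0.6)
drive = synaptic_input([0.5, 0.5], inputs, [gain, gain], rng.uniform(-0.5, 0.0))
rate = firing_rate(prev_rate=0.0, syn_input=drive)
```

With ρ = 0.1, a neuron's rate is dominated by its current synaptic input, so a large neuromodulatory gain on the sensory connections quickly pushes the favored neuron toward saturation, which is consistent with the exploitive behavior described above.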

### *Actor-critic model*

In addition to the neural network described above, we have implemented more abstract adaptive agents based on our assumptions (**Figure 2B**). For example, a variation of the Actor-Critic model was used to simulate reward and cost assessment in a Stag-Hunt game (Craig et al., 2013). In general, the Actor-Critic model should abide by the tenets of game theory, learning to behave in such a way that maximizes gains and minimizes potential losses. Our model contained three state tables (one each for the Reward Critic, the Cost Critic, and the Actor), each comprising a column for scalar weight values, similar to the plastic weights in the neural model described in the previous section, and several columns representing the state of the environment, akin to the sensory neurons in our neural model (see **Figure 2A**, Game-Dependent Input Neurons). The weights for the Cost and Reward Critics indicated the expected cost and expected reward values learned over time. For instance, state values in the Stag-Hunt game were related to the distances of the agent and the other players to potential rewarding stimuli. The weights in the Cost and Reward Critic state tables were governed by a delta rule for error prediction:

$$\delta(t) = r(t) + V(s, t) - V(s, t - 1) \tag{6}$$

where *r*(*t*) was either the reward or cost at time *t*, *V*(*s, t*) was the Critic's weight at state *s*, at time *t*, and *V*(*s*, *t*–1) was the Critic's weight for the previous timestep. More specifically, the reward *r*(*t*) value corresponds to the agent's expected value of their selected choice of rewarding stimuli at that timestep, and the cost *r*(*t*) is the negative of that value in the case that the expected reward was not fulfilled (i.e., perceived loss). However, other interpretations of cost are possible depending on the game being played. The delta value of Equation 6 was used to update the weights in the Reward and Cost critic tables at every timestep according to the following function:

$$V(s, \ t+1) = V(s, \ t) + \delta(t) \tag{7}$$

The Actor weights were the likelihood to execute a particular action at a state and were updated by using the reward and cost information for that state. In the case that the model decided on choice 1 of two choices, the Actor weights were updated based on the following equation:

$$\begin{aligned} V(c1, \ s, \ t+1) &= V(c1, \ s, \ t) + 1 - p[c1] \ast \delta(t) \\ V(c2, \ s, \ t+1) &= V(c2, \ s, \ t) + 1 - p[c2] \ast \delta(t) \end{aligned} \tag{8}$$

*V*(*c*1, *s*, *t*) was the Actor's state table value for deciding on choice 1 (of two possible actions) in state *s* at time *t*. Likewise, *V*(*c*2, *s*, *t*) was the Actor's state table value for choice 2 in state *s* at time *t*. δ(*t*) was the delta value from both the Reward and Cost Critics. Thus, the Actor was updated based on the assessment of both the Cost and Reward Critics. The probabilities for selecting either choice 1 or choice 2 were decided using a SoftMax function:

$$\begin{aligned} p[c1] &= \frac{e^{V(c1, s, t)}}{e^{V(c1, s, t)} + e^{V(c2, s, t)}}\\ p[c2] &= 1 - p[c1] \end{aligned} \tag{9}$$

This implementation of the Actor Critic provided a cost-reward tradeoff mechanism for decision-making in game environments, analogous to the interplay between the DA and serotonergic neuromodulatory systems.
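The Critic update and the SoftMax choice rule can be illustrated with a short Python sketch. It covers Equations 6, 7, and 9 only, and it is an assumption-laden illustration rather than the code of Craig et al. (2013): the function names are hypothetical, and subtracting the larger value before exponentiating is a numerical convenience that leaves the probabilities of Equation 9 unchanged.

```python
import math


def td_delta(r, v_now, v_prev):
    """Equation 6: delta(t) = r(t) + V(s, t) - V(s, t - 1)."""
    return r + v_now - v_prev


def critic_update(v_now, delta):
    """Equation 7: V(s, t + 1) = V(s, t) + delta(t)."""
    return v_now + delta


def softmax_two(v_c1, v_c2):
    """Equation 9: probabilities of selecting choice 1 and choice 2."""
    m = max(v_c1, v_c2)  # shift both values; the ratio is unaffected
    e1 = math.exp(v_c1 - m)
    e2 = math.exp(v_c2 - m)
    p_c1 = e1 / (e1 + e2)
    return p_c1, 1.0 - p_c1


# Hypothetical values for one state of the Reward Critic.
delta = td_delta(r=1.0, v_now=0.5, v_prev=0.25)
v_next = critic_update(0.5, delta)
p_c1, p_c2 = softmax_two(0.0, 0.0)
```

When the two Actor values are equal, the SoftMax rule assigns each choice a probability of 0.5, so early in learning the agent samples both actions.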

### **GAME ENVIRONMENTS**

In our experiments, we utilized both an adaptive neural network (**Figure 2A**) and an instantiation of the Actor-Critic model (**Figure 2B**) to investigate cost and reward in games of decision-making.

The adaptive neural network of **Figure 2A** and set-strategy models used as controls were both embodied as robotic agents and embedded in computer simulations within a game theoretic environment to investigate reciprocal social interactions depending on reward and cost assessment. For these experiments, we selected the game of Hawk-Dove, which is similar to the widely studied Prisoner's Dilemma (Kiesler et al., 1996) but arguably more informative when studying a model of the serotonergic system's role in cost assessment in competitive situations.

Our version of Hawk-Dove (**Figure 3**) was played with an adaptive neural network model contesting over a resource with another player in an area referred to as the territory of interest (TOI) (Asher et al., 2010, 2012a; Zaldivar et al., 2010). The game started with each player and the TOI randomly placed inside an environment. In the Hawk-Dove game, each player needed to reach the TOI and choose between two actions: escalate (an aggressive, confrontational tactic) or display (a nonviolent, cooperative tactic). If both players chose to escalate, they fought, resulting in an injury or penalty, which could either be serious or mild. If only one player chose to escalate, then the escalating player received the total value of the TOI, and the other player received nothing. If both players

(escalate), avoiding or risking injury in hopes of a larger payoff, respectively. © 2012 IEEE. Reprinted, with permission, from Asher et al. (2012a).

chose to display, then there was a tie, and both players split the value of the TOI. Our variant of Hawk-Dove also modified the harshness of the environment in certain experimental conditions by increasing the likelihood of receiving a serious injury when escalating. Thus, players should strive to cooperate and minimize penalty from escalating by either alternating their actions or sharing the resource, which, at times, may result in conflict, as each player is attempting to secure the highest payoff.
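The outcome rules of this Hawk-Dove variant can be summarized in a small Python sketch. The numerical values (TOI value, mild and serious penalties, probability of serious injury) are hypothetical placeholders rather than the payoff matrix of Asher et al. (2012a), and treating a fight as the same penalty for both players is an assumption made for brevity.

```python
import random

ESCALATE, DISPLAY = "escalate", "display"


def hawk_dove_payoffs(a, b, toi_value=1.0, mild=-0.25, serious=-1.0,
                      p_serious=0.25, rng=random):
    """Return (payoff_a, payoff_b) for one encounter at the TOI.

    p_serious stands in for environmental harshness: the chance that a
    fight ends in a serious rather than a mild injury (hypothetical values).
    """
    if a == ESCALATE and b == ESCALATE:
        penalty = serious if rng.random() < p_serious else mild
        return penalty, penalty
    if a == ESCALATE:
        return toi_value, 0.0  # lone escalator takes the whole TOI
    if b == ESCALATE:
        return 0.0, toi_value
    return toi_value / 2.0, toi_value / 2.0  # both display: split the TOI
```

Raising `p_serious` makes mutual escalation more costly on average, which is one way to express the harsher environmental conditions described above.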

Alongside Hawk-Dove, Chicken (Rapaport and Chammah, 1966) was used to investigate competitive situations in terms of expected costs and reward. Unlike Hawk-Dove, in which players could choose their action first or wait to see the other player's decision, Chicken forced players to decide on an action quickly, as they did not learn their opponent's choice until the outcome. In our version of Chicken (**Figure 4**), the human subject and the adaptive neural network model each controlled a racecar, with the two cars heading toward each other on a single lane track (Asher et al., 2012a). If a player chose to swerve, that player relinquished the single lane track to the other player and received no reward, while the player who did not swerve received the maximum payoff. If both players swerved, they each received the minimum payoff. If neither player swerved, then the result was a severe head-on collision, the worst outcome for both players. Thus, the best outcome for a given player was to stay straight while the other player swerved. This created a situation in which each player, in an attempt to secure the best outcome, risked the worst scenario in terms of payoff.
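The four outcomes of Chicken can be written as a lookup table. The sketch below is illustrative only: the payoff magnitudes are hypothetical and are chosen solely to respect the ordering described in the text (collision worst, staying straight against a swerving opponent best).

```python
SWERVE, STRAIGHT = "swerve", "straight"


def chicken_payoffs(a, b, max_payoff=1.0, min_payoff=0.25, collision=-1.0):
    """Return (payoff_a, payoff_b); all magnitudes are hypothetical."""
    table = {
        (STRAIGHT, SWERVE): (max_payoff, 0.0),   # swerver receives no reward
        (SWERVE, STRAIGHT): (0.0, max_payoff),
        (SWERVE, SWERVE): (min_payoff, min_payoff),
        (STRAIGHT, STRAIGHT): (collision, collision),  # head-on collision
    }
    return table[(a, b)]
```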

While games such as the Prisoner's Dilemma, Hawk-Dove and Chicken are used to explore cost and reward assessment in competitive situations, the socioeconomic game known as the Stag Hunt is better suited to investigate cooperative situations and the formation of social contracts. Evidence suggests that neural responses are different when the social interaction is perceived to be cooperative versus competitive (Fleissbach et al., 2007). In Stag Hunt, two players must decide whether to cooperate with each other in order to hunt the high-payoff stag, or hunt a low-payoff hare individually (Skyrms, 2004). The risk involved with stag hunting is that both players must commit to hunting stag. If one player hunts stag while the other player hunts hare, the stag hunter is unable to catch the stag and receives no payoff. While the standard version of Stag Hunt is typically played as a simple stag or hare choice, we used a variant of the game, much like the one used by Yoshida et al.

risk of a collision and continue straight ahead in hopes of a larger payoff. © 2012 IEEE. Reprinted, with permission, from Asher et al. (2012a).

(2010), that incorporated a spatiotemporal component (**Figure 5**). The game was played on a computer-simulated 5 × 5 board, with tokens depicting the locations of the two players, the stag target, and the hare targets. Players moved toward the targets at the start of each game, with enforced token adjacency as a condition to catch them. Cooperation is crucial in Stag Hunt, as in order to obtain the highest payoff (stag capture) players must form social contracts to work together.
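The adjacency condition for capture can be sketched as follows. This is a hedged illustration, not the game code of Craig et al. (2013): orthogonal (non-diagonal) adjacency, the (row, column) coordinate convention, and checking the stag before the hares are all assumptions, as are the function names.

```python
def adjacent(p, q):
    """True if two (row, col) cells share an edge (assumed adjacency rule)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1


def capture(player, agent, stag, hares):
    """Return 'stag', 'hare:player', 'hare:agent', or None if nothing is caught.

    A stag needs both hunters adjacent to it; a hare needs only one.
    """
    if adjacent(player, stag) and adjacent(agent, stag):
        return "stag"
    for name, pos in (("player", player), ("agent", agent)):
        if any(adjacent(pos, hare) for hare in hares):
            return "hare:" + name
    return None
```

For example, with the stag at (2, 2), hunters at (2, 1) and (1, 2) would jointly capture it, whereas a single hunter next to the stag catches nothing.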

### **TESTING THE ADAPTIVE MODELS IN GAME ENVIRONMENTS**

Depending on the goal of the experiment in question, simulations can provide significant information about behavior development in an adaptive model. These experiments often consist of exhaustive model testing with various opponents, environmental conditions, and intrinsic model parameters resulting in various behaviors and strategies that the model may exhibit. With our model of cost and reward modulation (see Neural Network Model), we conducted simulation experiments that revealed that the model was capable of predicting upcoming costs and rewards (Asher et al., 2010; Zaldivar et al., 2010). This resulted in the evolution of mixed strategies that allowed the model to compete for resources, independent of the opponents' actions. With our instantiation of an adaptive Actor-Critic model (see Actor-Critic Model) embedded in the Stag Hunt game, we found that this model developed suitable state tables to guide the agent in cost and reward prediction while playing against set-strategy agents (Craig et al., 2013). In both cases, the simulations showed that the adaptive model was sensitive to the other player's strategy and the game environment. For example, when making decisions in the Stag Hunt, the model not only took into consideration its distance to the game tokens, but also the other player's distance to tokens. These simulation experiments provide evidence that the base assumptions were a sufficient foundation for the model governing behavior. However, simulation experiments are only a small subset of the methods that can be utilized when studying an adaptive model's behavior. It is also important to observe real human interaction with the model in an effort to assess the model's ability to replicate natural behavior.

Following simulation, human subject experiments were performed to test the adaptive model's performance against human players, as well as the subjects' reactions to playing against both set-strategy and adaptive agents, and the influence of embodied agents on game play. Our first set of human subjects experiments involved ATD, the dietary manipulation described above that temporarily lowers serotonin levels in the central nervous system, resulting in decreased cooperation and lowered harm-aversion (Wood et al., 2006; Crockett et al., 2008). In these ATD experiments, two sessions (tryptophan-depleted and control) were performed on two separate days. For each session, healthy, adult subjects played the Hawk-Dove (**Figure 3**) and Chicken (**Figure 4**) games against adaptive agents both in simulation and embodied in physical robots. We measured changes in behavior associated with lowered levels of 5-HT throughout the interactions between human subjects and the robotic agent in the game environments.

**FIGURE 5 | Stag Hunt game environment.** The game board included a 5 × 5 grid of spaces upon which the player (stick figure image), agent (robot image), stag (stag image), and hare (hare image) tokens resided. The screen included a button to start the experiment, the subject's score for the round, the subject's overall score for the experiment, the game number, a countdown to the start of the game, and a counter monitoring the game's timeout. In the game of Stag Hunt, two players attempt to hunt a low-payoff hare alone, or attempt to cooperate with the other player to hunt a large payoff stag. © 2013 by Adaptive Behavior. Reprinted by Permission of SAGE from Craig et al. (2013).

In our next set of human subject experiments, the participants played the Stag Hunt game with various set-strategy and adaptive simulated agents. Subjects played games against each of five computer strategies, including an adaptive model. In each game, players navigated the game board on a computer (**Figure 5**), and the game ended when either one of the players successfully captured a hare, or both players worked together to capture a stag. The adaptive agent was an instantiation of the Actor-Critic model that weighed cost and reward to make decisions in the environment in a manner much like the serotonergic and DA systems are thought to act in humans (Cools et al., 2010). Using these paradigms, we were able to study the ability of humans to cooperate with adaptive agents, as well as the extent of learning that takes place in the model when placed in an environment fostering cooperation.

Altogether, our multidisciplinary paradigm is one of many that are currently utilized in this field to explore the theorized role of the serotonergic system in behavior as related to cost assessment. The paradigm provides a balanced and informative procedure that incorporates both neuromodulation and behavior with current methods and technology; the results obtained with it are described below.

### **RESULTS OF OUR STUDIES CONDUCTED USING THIS MULTIDISCIPLINARY APPROACH**

### **ADAPTIVE NEURAL NETWORK PLAYING THE HAWK-DOVE GAME**

We explored the research question of how the interplay between cost and reward would lead to appropriate decision-making under varying conditions in a game theoretic environment. To address this question, we modeled several predictions as to how the activity of a cost function leads to appropriate action selection in competitive and cooperative environments (Asher et al., 2010; Zaldivar et al., 2010). One such prediction was that the interaction between the simulated serotonergic neuromodulatory system, associated with the expected cost of a decision, and the simulated DA system, associated with the expected reward of a decision, would allow for appropriate decision-making in Hawk-Dove (see **Figure 2A** and Adaptive Agent Models). Our results verified this prediction, as the adaptive neural agent was more likely to escalate over the resource when activity of the reward system exceeded the activity of the cost system. Conversely, when the reward activity did not exceed the activity of cost, the adaptive neural agent displayed. One further prediction verified by our results was that the impairment of the serotonergic system would lead to perseverant, uncooperative behavior. A simulated lesion of the serotonergic system resulted in the adaptive neural agent almost always engaging in risk-taking (aggressive) behavior, which was similar to the uncooperative behavior seen in human studies where serotonin levels were lowered via ATD while subjects played games such as Prisoner's Dilemma and the Ultimatum game (Wood et al., 2006; Crockett et al., 2008). Altogether, our results are in agreement with the theoretical work proposed by Boureau and Dayan (2010), in which the influences of the serotonergic and DA systems in generating an appropriate decision are sometimes in opposition.

### **ATD AND EMBODIMENT IN HAWK-DOVE AND CHICKEN GAMES**

To test the influence of embodiment and serotonin on decisions where there is a tradeoff between cooperation and competition, we conducted a study that included both embodied and simulated versions of adaptive agents along with manipulation of serotonin in human subjects. We used ATD to reveal the ways humans interacted with these agents in competitive situations via the Hawk-Dove and Chicken games (Asher et al., 2012a). Although we did not look at the ratio of plasma tryptophan to other large neutral amino acids, the differences between total blood plasma tryptophan levels with (5–8 μmol/L) and without (51–182 μmol/L) ATD were highly significant (*p* < 0.0005, Wilcoxon rank-sum test). Contrary to our expectations, we found that our subjects' ability to assess cost when tryptophan-depleted was unchanged and that they were not more likely to cooperate with an adaptive embodied agent at the population level of analysis. Subjects responded equally strongly to both the embodied and simulated adaptive agents, and tryptophan-depleted subjects did not show a significantly increased proportion of aggressive decisions (escalate) resulting from a decrease in cost assessment. Instead, we found that subjects significantly altered their strategy from Win-Stay-Lose-Shift (WSLS) against control adaptive agents to Tit-For-Tat (T4T) against an aggressive version of the model, which is in agreement with previous studies (Wood et al., 2006; Crockett et al., 2008). We interpreted this result as subjects tending toward retaliatory behavior when confronted with agents that partook in risky behavior. This result was in agreement with those found by Crockett et al. (2008), which indicated that subjects under ATD tended to reject significantly more unfair offers in the Ultimatum game. The rejection of unfair offers in the Ultimatum game is similar to the retaliatory behavior observed in both the Hawk-Dove and Chicken games.
Additionally, a type of motivational opponency was found in the dorsal and ventral regions of the striatum when subjects received costly punishment in the Ultimatum game (Crockett et al., 2013). Given that serotonin has been shown to have an inhibitory effect on the striatum (Di Cara et al., 2001), decreased 5-HT levels led to greater striatal activity. Similarly, subjects demonstrating retaliatory behavior under the effects of ATD would likely have shown increased dorsal striatal activity, a result that has been observed in previous work (de Quervain, 2004; Krämer et al., 2007; Strobel et al., 2011). Although we did not collect any brain imaging data, we would expect to see differences in striatal activity across our subjects correlating with their individual baseline levels of retaliatory behavior.

Although the small sample size (*n* = 8) may have contributed to the lack of significant differences between the tryptophan-depletion and control conditions and between the embodied-agent and simulation conditions, it is also possible that differences between the conditions were masked by subgroups of subjects responding differently across conditions.
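For readers unfamiliar with the two strategies named above, the following minimal Python sketch states WSLS and T4T as decision rules for a two-action game. The action labels follow the text (escalate, display); the function names and the boolean `won_last_round` flag are illustrative assumptions, not the analysis code used in the study, and what counts as a "win" depends on the payoff matrix, which is not reproduced here.

```python
ESCALATE = "escalate"
DISPLAY = "display"


def other(action):
    """Return the alternative action in a two-action game."""
    return DISPLAY if action == ESCALATE else ESCALATE


def win_stay_lose_shift(my_last_action, won_last_round):
    """WSLS: repeat the last action after a win, switch after a loss."""
    return my_last_action if won_last_round else other(my_last_action)


def tit_for_tat(opponent_last_action):
    """T4T: copy whatever the opponent did on the previous round."""
    return opponent_last_action
```

Under these rules, a T4T player facing an agent that escalated on the previous round escalates in turn, which is the retaliatory pattern described above.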

### **COGNITIVE MODELING**

To better understand our results at the individual subject level, we implemented a cognitive model to investigate potential behavioral differences in the subjects' decision-making by examining their propensity to choose the aggressive action (escalate) in the Hawk-Dove game under the various conditions. Because these cognitive models use Bayesian inference to predict subject behavior based on many individual decisions, their predictions were not weakened by a small sample size.

To investigate how ATD and embodiment affected subjects' decision-making in our previous work (see ATD and Embodiment in Hawk-Dove and Chicken Games), we implemented a cognitive model using hierarchical Bayesian inference. Hierarchical Bayesian inference has been shown to be a highly customizable and reliable way of exploring models of cognitive processes (Rouder et al., 2005; Lee, 2008b; Wetzels et al., 2010). In addition, Bayesian graphical models have been used to make inferences about the use of strategies such as WSLS or T4T from data consisting of sequences of choices from human subjects studies in N-armed bandit tasks, as well as other sequential decision-making tasks (Lee et al., 2011; Newell and Lee, 2011).

We used a hierarchical latent mixture model with Bayesian inference to analyze the individual differences in decision-making arising from alterations in serotonin levels and from agent embodiment (Asher et al., 2012b). The hierarchical attribute of these models allows the parameters controlling cognitive processes to vary across individuals. We decided to use latent mixture models, as they allow for modeling completely different strategies across individuals. Formally, we recast the cognitive models as probabilistic graphical models and used Markov Chain Monte Carlo (MCMC) methods for computational Bayesian inference. By utilizing hierarchical latent mixture models, we addressed the question of how ATD and embodiment in the Hawk-Dove game could affect subjects' decisions to compete (i.e., choose the aggressive escalate action) or cooperate (i.e., choose the passive display action). We modeled the probability of escalating through a logistic model. The logit (Cramer, 2003) of the probability of escalating for each subject in each condition was assumed to follow a Gaussian distribution defined by its mean and variance (hyperparameters in the hierarchical model), with the mean modeled as the sum of the baseline level of escalating for the subject and an additive effect associated with ATD or embodiment (Asher et al., 2012b).
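One way to write down the logistic component described above is sketched below. The notation is introduced here for illustration and is an assumption: $\theta_{ij}$ is the probability that subject $i$ escalates in condition $j$, $\beta_i$ is the subject's baseline level, $\delta_{ij}$ is the additive effect associated with ATD or embodiment, and $\sigma^2$ is the variance hyperparameter. The binomial likelihood for the number of escalate choices $k_{ij}$ out of $n_{ij}$ decisions is likewise an assumption made for completeness; the exact parameterization and priors are those of Asher et al. (2012b) and may differ.

$$
\operatorname{logit}(\theta_{ij}) = \ln\frac{\theta_{ij}}{1-\theta_{ij}} \sim \mathcal{N}\!\left(\beta_i + \delta_{ij},\ \sigma^2\right), \qquad k_{ij} \sim \operatorname{Binomial}\!\left(n_{ij},\ \theta_{ij}\right)
$$

In a latent mixture version of this sketch, the sign of $\delta_{ij}$ would depend on an unobserved group indicator for each subject, which is what allows one subgroup to escalate more and another to escalate less under the same condition.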

We showed that subjects separated into two distinct subgroups for the probability to choose the aggressive action (escalate) across the conditions (**Figure 6**). Our justification for this conclusion was based on the assumption that the effect of ATD/embodiment could vary across individuals as is reinforced by recent evidence suggesting that the effects of ATD give rise to individual differences across subjects (Krämer et al., 2011; Demoto et al., 2012; Seymour et al., 2012). The individual differences observed could either result in an increase or decrease in the likelihood of selecting an aggressive action. Alternatively, between the two subgroups, there existed a
potential middle ground that was relatively unbiased on the scale of increased or decreased selection of aggressive actions, which is analogous to random behavior or the lack of influence from the experimental conditions (null hypothesis). No subjects fell within this middle ground for the conditions shown in **Figure 6**, further reinforcing a strong possibility for the existence of at least two subgroups within the subject population (*n* = 8). The results from this analysis revealed a differential influence of lowered serotonin levels and of agent embodiment on individual decision-making in a competitive game (Asher et al., 2012b), which potentially implicates neural correlates in these individual differences.

To give a full account of the data, the hierarchical model was designed to address individual differences at two levels: the baseline level, which depends on the subject's inherent tendencies, and the additive level, which depends on the interaction between the subject's natural tendencies and the experimental conditions. In contrast to the results from our population analysis (see ATD and Embodiment in Hawk-Dove and Chicken Games), we found that clustering subjects into two opposing subgroups better represented the data. That is, one group of subjects, in concordance with expectation, had a higher probability to escalate under tryptophan depletion, but another had a lower probability to escalate in the tryptophan-depleted condition (**Figure 6**). Similarly, we found that two subgroups better predicted the rate of escalation when comparing responses to a robot versus responses to a computer simulation (**Figure 6**). The formation of these subgroups is not accounted for by the variance of the data in the population analysis (Asher et al., 2012a). We hypothesized that these subgroups may be typical of any given population of human subjects under these conditions and that future studies should take individual variation into consideration. This framework for evaluating cognition offers a comprehensive approach for modeling individual differences in cognitive strategies (Lee, 2008b; Lee et al., 2011).

**FIGURE 6 | Estimated group identities based on cognitive modeling results.** Both plots show each subject's likelihood to choose the aggressive action (escalate) for the two different conditions. Red and green dots correspond to subjects that showed, respectively, an increased or decreased probability to escalate from their baselines. Error bars show the 95% Bayesian confidence interval of the posterior mean. The *x*-axes indicate subject numbers, which correspond to the same subjects in the two plots. The *y*-axes show the Bayesian model's mean output indicating group affiliation with respect to the subject's likelihood to escalate, relative to their independently determined baselines. The *y*-axis value of 1 indicates a strong likelihood of decreasing choices to escalate relative to their baseline level for the conditions, whereas the value of 2 indicates a strong likelihood of increasing choices to escalate relative to their baseline level for the conditions. The group identities were estimated based on: **(A)** the influence tryptophan depletion had on subjects' choices for aggressive actions (Escalation, Tryptophan), and **(B)** the influence an embodied agent had on subjects' choices for aggressive actions (Escalation, Robot). From Asher et al. (2012b), presented at the Cognitive Science Conference and published in the Proceedings (COGSCI 2012, Sapporo, JP).
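The point that opposing subgroups can hide an effect at the population level can be illustrated numerically. The sketch below uses entirely hypothetical numbers (a shared baseline on the logit scale and an additive effect of equal size but opposite sign in two subgroups of four subjects); these are not estimates from the study. It shows only that the mean change in the probability of escalating can be near zero even though every individual changes.

```python
import math


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical values for illustration only (not estimates from the study).
baseline_logit = 0.0                 # 50% chance of escalating at baseline
effect_size = 1.0                    # size of the additive effect on the logit scale
group_signs = [+1] * 4 + [-1] * 4    # four subjects increase, four decrease

baseline_p = inv_logit(baseline_logit)
condition_p = [inv_logit(baseline_logit + s * effect_size) for s in group_signs]

# Per-subject change in the probability of escalating.
changes = [p - baseline_p for p in condition_p]
population_mean_change = sum(changes) / len(changes)

print([round(c, 3) for c in changes])
print(abs(population_mean_change) < 1e-12)
```

The exact cancellation here follows from the symmetric numbers chosen; with unequal subgroup sizes or effect sizes the population mean would be only partly attenuated, which is still enough to reduce the power of a population-level test.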

### **HUMANS PLAYING STAG HUNT WITH SIMULATED ADAPTIVE AGENTS**

In our recent study using the Stag Hunt game, we investigated the variance in behavior of human subjects while playing Stag Hunt against adaptive (cost/reward learning) and set-strategy agents, with the intent of finding a stronger response evoked by adaptive agents than by set-strategy agents. We found that adaptive agents, controlled by an Actor-Critic model (see **Figure 2B** and Adaptive Agent Models), caused subjects to invest more time and effort into game play than set-strategy agents (Craig et al., 2013). The strategy of the adaptive agent was formed by taking into consideration the reward and costs of its decisions, much like the theorized roles of the DA and serotonergic systems, while the four set-strategy agents conformed to the following tactics: (1) always hunt hare, (2) always hunt stag, (3) act randomly, and (4) WSLS. During games with an adaptive agent, human subjects took significantly longer to make a move than when playing against the other agents. Specifically, in games in which the subject did not receive a payoff (i.e., the subject lost the game), subjects took a significantly longer path across the board to their endgame position than in all other tested set-strategy conditions. These findings indicate that playing against adaptive agents correlated with more effort spent on the subject's part while making decisions. Moreover, it appears that subjects might have been trying to guide the adaptive agents toward stags; such a strategy would suggest that, purely through experience, subjects became aware of the fact that the adaptive agent, unlike the set-strategy agents, could be influenced. The increased time and effort exerted by the subjects in the adaptive condition from our Stag Hunt experiment may be related to increased neural activity seen in other Stag Hunt studies (Yoshida et al., 2010). Using fMRI while subjects played the Stag Hunt, Yoshida et al. (2010) observed increased activity in rostral medial and dorsolateral prefrontal cortices when subjects played a more sophisticated (adapting) agent, areas associated with planning and mentalization.
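For readers unfamiliar with Actor-Critic learning, the sketch below shows a generic, single-state actor-critic update for an agent choosing between hunting stag and hare. It is a textbook-style illustration and not the neural model used by Craig et al. (2013): the class name, learning rates, softmax policy, and the payoff values in the usage note are all assumptions introduced here, and the sketch collapses reward and cost into a single net payoff rather than tracking them separately.

```python
import math
import random


class ActorCritic:
    """Minimal single-state actor-critic over a fixed set of actions."""

    def __init__(self, actions, actor_lr=0.1, critic_lr=0.1, seed=0):
        self.actions = list(actions)
        self.preferences = {a: 0.0 for a in self.actions}  # actor: action preferences
        self.value = 0.0                                    # critic: expected payoff
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.rng = random.Random(seed)

    def policy(self):
        """Softmax over action preferences."""
        m = max(self.preferences.values())
        exps = {a: math.exp(p - m) for a, p in self.preferences.items()}
        total = sum(exps.values())
        return {a: e / total for a, e in exps.items()}

    def choose(self):
        """Sample an action from the current policy."""
        probs = self.policy()
        weights = [probs[a] for a in self.actions]
        return self.rng.choices(self.actions, weights=weights)[0]

    def update(self, action, payoff):
        """Critic computes a prediction error; actor reinforces the action by it."""
        error = payoff - self.value
        self.value += self.critic_lr * error
        self.preferences[action] += self.actor_lr * error
        return error
```

A usage pattern would be to call `agent = ActorCritic(["stag", "hare"])`, then repeatedly call `agent.choose()` and `agent.update(action, payoff)` with whatever payoff the game returns (for example, a hypothetical 4 for a successful stag hunt and 1 for a hare). Because the agent's policy depends on the payoffs it has received, a human partner can shape its behavior, which is the property that distinguishes it from the set-strategy agents.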

Similar to our findings with the Hawk-Dove game, the Stag Hunt study also highlighted subject variation when playing games of decision-making. When assessing the ratio of stag-to-hare captures, playing against an adaptive agent appeared to evoke different equilibria of hunt decisions in individual subjects. It appears that, much like the Hawk-Dove results (Asher et al., 2012b), over half of the analyzed subjects became either strongly cooperative or strongly competitive when playing the adaptive agent. Overall, these results showed that adaptive agents are able to evoke complex behavioral responses in human subjects that may vary depending on individual subject differences. This is useful when studying decision-making and also offers control over agent behavior that would not have been possible in human-human studies. While ATD was not performed in the Stag Hunt experiment, it is possible that the grouping of these subjects also resulted from individual differences, such as genetic polymorphisms related to the serotonergic system (Bevilacqua and Goldman, 2011; Hyde
et al., 2011; Loth et al., 2011) leading to changes in cost/reward assessment.

### **FUTURE DIRECTIONS USING A CYCLIC, MULTIDISCIPLINARY PARADIGM**

In an attempt to more accurately model serotonin's theorized influence on decision-making, we suggest that future experiments improve upon the approach illustrated in **Figure 1A** with the addition of an iterative component (**Figure 1B**). Such a cyclic paradigm would have the following components: (1) the development of an embodied neural model to support socioeconomic game studies; (2) an experimental protocol in which subject behavior and neural correlates of decision-making can be probed and categorized; (3) the design of an improved neural model that captures the neuromodulatory influences and individual variation of decision-making in socioeconomic games, to be used in subsequent experiments; and (4) the deployment of a population of models with varying phenotypes to be used in subsequent socioeconomic game studies. Components (3) and (4) allow the paradigm to run cyclically, thereby improving the paradigm through analysis and incorporation of past results. In this cyclic, multidisciplinary paradigm, we amend our previous multidisciplinary paradigm with a feedback loop that: (1) makes new interpretations for the role of serotonin in subject behavior; (2) develops a new cognitive model based on subject behavior; and (3) modifies the adaptive neural network to construct agents that capture individual behavioral differences demonstrated by subjects. These modifications are performed with the intention of refining the adaptive neural network's performance, making its behavior more natural and human-like. After each cycle, the experimental paradigm improves to better suit the purposes of the task (e.g., stronger decision-making/modeling of neuromodulation), while holding constant the general framework of testing (e.g., game theoretic environment, simulation/human experimentation, etc.).

While the proposed paradigm is intended to improve the field of modeling decision neuroscience, our current models are rather abstract and would benefit from the incorporation of additional empirical data collected from the mammalian brain in neuroimaging and neurophysiological studies. Functional data from neuroimaging studies can help our models become more biologically realistic by revealing the specific brain areas active during select behaviors. Single unit recording studies in animals inform the more granular neural behavior within each modeled brain region. Together, these empirical data provide the base assumptions that guide our models' computational neural behavior and architecture, making them more biologically realistic. Improved biological plausibility can, in turn, increase the efficacy of theoretical predictions made by our models, resulting in better theories to be tested through future neurophysiological and neuroimaging experiments.

Single unit recording studies in animals provide a critical component to computational modeling, as physical data is essential for developing base assumptions and confirming predictions made by models. For instance, phasic and tonic serotonergic activity in monkeys and rats has been associated with components of cost and reward processing (Nakamura et al., 2008; Bromberg-Martin et al., 2010; Schweimer and Ungless, 2010; Okada et al., 2011; see Base Assumptions). In future work, our base assumptions could more accurately incorporate the dynamics of both phasic and tonic serotonergic activity to improve the biological plausibility of our models. This could lead to better predictions about the dynamics between phasic and tonic serotonergic neuromodulation and their impact on cost and reward processing. Thus, where computational models inevitably rely on empirical data to make predictions about neuromodulatory influence over biological behavior, their predictions can provide theoretical evidence for future experiments.

Empirical data from neuroimaging studies provide a relationship between brain activity and behavior that can be used as the foundation for biological plausibility in a computational model. For example, fMRI has been used to determine the relationship between brain regions innervated by serotonin and behaviors involved with reward prediction (Tanaka et al., 2007), the perceived value of reward (Seymour et al., 2012), and the association of serotonin with reactive aggression under certain circumstances (Krämer et al., 2011), amongst other behaviors. The tasks and behavioral results are comparable to studies conducted using the initial paradigm outlined in this paper (see **Figure 1A**), but they include the relationship between serotonergic brain regions and the different behaviors. For example, in future models, we plan to implement the serotonergic influence over the striatum to obtain theoretical evidence for reward prediction based on different levels of serotonin, and its resulting effect on decision-making (Tanaka et al., 2007; Seymour et al., 2012). These empirical data could be incorporated into future computational models, leading to more diverse and organic model behavior, which could also aid in the development of a better theoretical understanding of the underlying relationship between serotonergic brain regions and their associated behaviors. While fMRI is incorporated into our cyclic, multidisciplinary paradigm (**Figure 1B**), results from neuroimaging studies outside of our paradigm remain an integral part of the foundation for biological plausibility in our models and, in turn, increase the efficacy of our paradigm.

In addition to these sources of empirical evidence, theoretical data from other biologically realistic models and neurally inspired robotic agents can contribute to the biological plausibility of our models and serve as a basis for further empirical investigation. For example, incorporating more biophysically detailed models of DA and serotonergic neuromodulation, such as those of Chorley and Seth (2011), Wong-Lin et al. (2012), Avery et al. (2013), and Cano-Colino et al. (2013), may be informative. The temporal dynamics of these models, as well as the neuroanatomical pathways that they include, would be of interest when coupled with subject interactions.

Embodiment is a key element of the paradigm we are promoting, and these human-robot interaction experiments may not only evoke strong responses in subjects, but they may also inform the development of future neurorobots. Embodied models using robotic platforms have provided clues as to how neuromodulation can give rise to adaptive behavior in biological systems (Krichmar, 2013; Luciw et al., 2013). In one such experiment, using an actor-critic model featuring a reinforcement learning algorithm allowed a biped NAO robot to develop locomotion and adjust its
gait to different conditions (Li et al., 2013). These studies provide theoretical evidence for how adaptive behavior could develop in a biological system and suggest how this could be applied to a robotic system. Similarly, a recent biologically plausible model in simulation was able to develop motivated behavior by implementing an interplay between aversive and appetitive stimuli, which induced activity in their simulated serotonergic and DA brain regions, respectively (Weng et al., 2013). By closing the loop between brain, body, and environment, these embodied systems demonstrate how neuromodulators, such as dopamine and serotonin, can influence action selection and decision-making.

It is important to emphasize that the models used in our paradigm serve as a venue for investigating the influence of serotonin in motivational systems for robots and other autonomous systems. Future iterations of research through our paradigm could modify our model to increase the accuracy and scope of its biological representation. In contrast to other similar models that associate serotonin with decision-making (Daw et al., 2002; Doya, 2002; Weng et al., 2013), we modeled how phasic serotonergic neuromodulation could influence an autonomous agent's behavior in a game theoretic environment (Asher et al., 2010, 2012a; Zaldivar et al., 2010). However, this abstraction is limited in extrapolating the role serotonin plays in decision-making in other environments because, while game theory is a good tool for investigating decision-making, it explicitly places numerical value on the cost and reward elements of decision-making. In contrast, other biologically plausible models of neuromodulation and behavior have linked the value of a decision to novelty (Bolado-Gomez and Gurney, 2013; Krichmar, 2013), curiosity (Luciw et al., 2013), and uncertainty (Krichmar, 2013) in environments devoid of game theory. Krichmar (2013) built upon our work with a biologically plausible model of neuromodulation and behavior consisting of acetylcholine/norepinephrine (novelty), serotonin (withdrawal and harm aversion), and dopamine (invigoration and risk-taking) systems, in an autonomous robot that demonstrated anxious and curious states associated with rodent behavior. This work was able to use this expanded model to show that high levels of serotonin caused withdrawn behavior, while low levels of serotonin, in combination with high levels of dopamine, brought about excessive exploratory behavior. Additionally, top-down signals from the frontal cortex to the raphe nucleus were found to be critical for coping with stressful events. In the pursuit of more biologically accurate models of behavior and decision-making, biological neuronal modeling allows for studying hypotheses about serotonin's involvement in behavior and learning that would otherwise be empirically difficult to test. Furthermore, predictions from such embodiment studies could motivate the design and scope of new animal studies.

Utilizing a cyclic paradigm lends itself especially well to studies that incorporate embodiment, as the internal mechanisms governing embodied models are constantly being updated to improve their behavior during interactions with human subjects. The evidence that human subjects are more likely to treat robotic platforms similarly to other humans rather than computer simulations (Breazeal and Scassellati, 2002; Kidd and Breazeal, 2004; Asher et al., 2012b) suggests that there are larger social expectations placed on embodied agents. Additionally, human subjects have been shown to report embodied agents as more helpful and aware when compared to simulated agents (Wainer et al., 2007), further justifying their use in studying human behavior and decision-making. Since the iteratively improved adaptive agents in our paradigm are modified to better simulate neuromodulatory influence and behave increasingly more like human subjects, their embodiment could lead to more robust decision-making and social interactions, which in turn leads to more compelling and informative human-robot interaction studies and better predictions about serotonergic and DA influence over behavior. Results of these studies ultimately lead to improved adaptive models situated in robots, which have value in a wide variety of applications (e.g., medical, commercial, industrial, etc.).

While implementing adaptive agents in robotic platforms is a promising venture for future study, past experiments have revealed individual differences between subjects that warrant the investigation of genetic sources. Because the results from our Hawk-Dove and Stag Hunt experiments showed individual variation in game play, genetic screening for polymorphisms in human subjects could provide a venue for studying serotonin's role in this variation. Several groups have suggested that individual differences in behavior are influenced by genetic polymorphisms related to serotonin signaling (Bevilacqua and Goldman, 2011; Hyde et al., 2011; Loth et al., 2011). Within this cyclic paradigm, one such proposed study could incorporate our current adaptive agents used to play cooperative (Stag Hunt) and competitive (Hawk-Dove and Chicken) games with a random sampling of human subjects, who would be screened for polymorphisms related to serotonergic function (e.g., 5-HTTLPR) (Homberg and Lesch, 2011). From the data analysis of these genetic polymorphism experiments, genetic-dependent diversity could be integrated into the neuromodulatory function of our adaptive agents, and any additional experiments necessitated by the predictions that emerged from the previous iteration through the paradigm would be conducted (**Figure 1B**). These experiments could allow for the generation of new hypotheses leading to predictions about the genetic variation in serotonergic neuromodulation and its ties to motivated human behavior. Ultimately, the predictions might help shape the next generation of empirical studies.

Though it is important to utilize new techniques such as genetic screening to better understand the role of serotonin in decision-making, a primary benefit of our paradigm is its incorporation of theoretical predictions from past work into future studies. Previously, we found that the concept of two opposing subgroups (**Figure 6**) best described the subjects' behavior in the Hawk-Dove game. This theoretical data could be applied to the next generation of adaptive models (via the iterative component of **Figure 1B**) through additional assumptions or constraints of serotonergic neuromodulation. These new assumptions could lead to better predictions about the diversity in behavior resulting from serotonergic manipulation. As another example, from the Stag Hunt human subject experiment, we discovered the tendency for adaptive agents to move counterintuitively when subject behavior was erratic. In a second iteration of experiments, we could improve upon this model by utilizing a top-down mechanism founded in neuromodulation to converge behavior in the face of seemingly random
influence. By using cognitive models, we can create behavioral phenotypes in future models to match the potential individual differences that arise in any given population. These are a couple of examples of how this cyclic paradigm would help our adaptive models and ultimately our understanding of neuromodulatory influence over behavior in decision-making.

In terms of potential clinical application, the proposed paradigm may help illuminate components of brain disorders associated with abnormal serotonergic function. Serotonin has been implicated in a variety of neuropsychiatric conditions including bipolar disorder (Robinson et al., 2009), antisocial personality disorder (Deakin, 2003), anxiety disorder (Heisler et al., 1998; Lowry et al., 2008), and affective disorder (Lowry et al., 2008). Because serotonin is strongly involved in these neuropsychiatric diseases, many frequently prescribed antidepressant and anti-anxiety medications target serotonin receptors. However, due to the complex physiological action of serotonin, it is difficult not only to gauge the effectiveness of these psychiatric drugs, but also to isolate the neural pathways relevant to the serotonergic regulation of these disorders. Our work modeled the theorized influence serotonin has on decision-making in the context of cost assessment, which was accomplished by simulated lesions of the cost assessment region of the model (Asher et al., 2010, 2012a; Zaldivar et al., 2010). This resulted in an increase in impulsivity that could possibly be extrapolated to deficits in learning associated with these neuropsychiatric disorders. The Stag Hunt game, which focuses on cooperation, may be applicable in the study of social disorders such as autism. Manipulations of computational models could also mimic such disorders, which would lead to predictions regarding the neural correlates of the disorder and could possibly warrant drug or therapy experimentation that tests the model's predictions.

Thus, the cyclic, multidisciplinary paradigm provides a strong approach toward making predictions about the neurobiology that ties serotonin to motivated behavior. As we continue to explore serotonin and its role in decision-making, future studies should consider applying this paradigm in order to accommodate the complex behavior that accompanies the activity of the serotonergic system. Situating adaptive neural models in a game theoretic environment, using them in both human and simulation experiments, and following up with analysis that leads to an upgraded model for future use is a strategy that lends itself to the production of valuable research in the fields of neuromodulation, behavior, technology, and neuropsychiatry.

### **REFERENCES**


Rilling, J. K., and Sanfey, A. G. (2011). The neuroscience of social decision-making. *Annu. Rev. Psychol.* 62, 23–48. doi: 10.1146/annurev.psych.121208.131647


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 March 2013; accepted: 24 October 2013; published online: 21 November 2013.*

*Citation: Asher DE, Craig AB, Zaldivar A, Brewer AA and Krichmar JL (2013) A dynamic, embodied paradigm to investigate the role of serotonin in decision-making. Front. Integr. Neurosci. 7:78. doi: 10.3389/fnint.2013.00078*

*This article was submitted to the journal Frontiers in Integrative Neuroscience.*

*Copyright © 2013 Asher, Craig, Zaldivar, Brewer and Krichmar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*