# **BASAL GANGLIA: PHYSIOLOGICAL, BEHAVIORAL, AND COMPUTATIONAL STUDIES**

**Topic Editors Ahmed A. Moustafa, Alon Korngreen, Izhar Bar-Gad and Hagai Bergman**

SYSTEMS NEUROSCIENCE COMPUTATIONAL NEUROSCIENCE

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

Cover image provided by Ibbl sarl, Lausanne CH

**ISSN** 1664-8714 **ISBN** 978-2-88919-388-2 **DOI** 10.3389/978-2-88919-388-2

## **BASAL GANGLIA: PHYSIOLOGICAL, BEHAVIORAL, AND COMPUTATIONAL STUDIES**

Topic Editors: **Ahmed A. Moustafa,** University of Western Sydney, Australia **Alon Korngreen,** Bar-Ilan University, Israel **Izhar Bar-Gad,** Gonda Brain Research Center, Bar-Ilan University, Israel **Hagai Bergman,** The Hebrew University of Jerusalem, Israel

The basal ganglia has received much attention over the last two decades, as it has been implicated in many neurological and psychiatric disorders. Most of this research, in both animals and humans, attempts to understand the neural and biochemical substrates of basic motor and learning processes, and how these are affected in human patients as well as animal models of brain disorders.

The current volume contains research articles and reviews describing basic, pre-clinical and clinical neuroscience research of the basal ganglia written by attendees of the 11th Triennial Meeting of the International Basal Ganglia Society (IBAGS) that was held March 3-7th, 2013 at the Princess Hotel, Eilat, Israel and by researchers of the basal ganglia. Specifically, articles in this volume include research reports on the biochemistry, computational theory, anatomy and physiology of single neurons and functional circuitry of the basal ganglia networks as well as the latest data on animal models of basal ganglia dysfunction and clinical studies in human patients.

# Table of Contents


*The Michelin Red Guide of the Brain: Role of Dopamine in Goal-Oriented Navigation*

Aude Retailleau and Thomas Boraud

*Synchrony in Parkinson's Disease: Importance of Intrinsic Properties of the External Globus Pallidus*

Bettina C. Schwab, Tjitske Heida, Yan Zhao, Enrico Marani, Stephan A. van Gils and Richard J. A. Van Wezel

*Role of Movement in Long-Term Basal Ganglia Changes: Implications for Abnormal Motor Responses*

Nicola Simola, Micaela Morelli, Giuseppe Frazzitta and Lucia Frau


J. Vincent Filoteo and W. Todd Maddox


Dorin Yael, Dagmar H. Zeef, Daniel Sand, Anan Moran, Donald B. Katz, Dana Cohen, Yasin Temel and Izhar Bar-Gad

*Decomposition of Abnormal Free Locomotor Behavior in a Rat Model of Parkinson's Disease*

Benjamin Grieb, Constantin Von Nicolai, Gerhard Engler, Andrew Sharott, Ismini Papageorgiou, Wolfgang Hamel, Andreas K. Engel and Christian K. Moll

*Primary Motor Cortex of the Parkinsonian Monkey: Altered Neuronal Responses to Muscle Stretch*

Benjamin Pasquereau and Robert S. Turner

*Subthalamic Nucleus Long-Range Synchronization—an Independent Hallmark of Human Parkinson's Disease*

Shay Moshel, Reuben R. Shamir, Aeyal Raz, Fernando R. de Noriega, Renana Eitan, Hagai Bergman and Zvi Israel


Renana Eitan, Reuben Ruby Shamir, Eduard Linetsky, Ovadya Rosenbluh, Shay Moshel, Tamir Ben-Hur, Hagai Bergman and Zvi Israel


Avital Adler, Shiran Katabi, Inna Finkes, Yifat Prut and Hagai Bergman

*Simulating the Effects of Short-Term Synaptic Plasticity on Postsynaptic Dynamics in the Globus Pallidus*

Moran Brody and Alon Korngreen

*An Extended Reinforcement Learning Model of Basal Ganglia to Understand the Contributions of Serotonin and Dopamine in Risk-Based Decision Making, Reward Prediction, and Punishment Learning*

Pragathi P. Balasubramani, V. Srinivasa Chakravarthy, Balaraman Ravindran and Ahmed A. Moustafa


Vignesh Muralidharan, Pragathi P. Balasubramani, V. Srinivasa Chakravarthy, Simon J. G. Lewis and Ahmed A. Moustafa

*A Neurocomputational Theory of how Explicit Learning Bootstraps Early Procedural Learning*

Erick J. Paul and F. Gregory Ashby

*Linking Reward Processing to Behavioral Output: Motor and Motivational Integration in the Primate Subthalamic Nucleus*

Juan-Francisco Espinosa-Parrilla, Christelle Baunez and Paul Apicella

*Fronto-Striatal Gray Matter Contributions to Discrimination Learning in Parkinson's Disease*

Claire O'Callaghan, Ahmed A. Moustafa, Sanne de Wit, James M. Shine, Trevor W. Robbins, Simon J. G. Lewis and Michael Hornberger


Ankur Gupta, Pragathi P. Balasubramani and V. Srinivasa Chakravarthy


Henning Schroll and Fred H. Hamker


## Basal ganglia: physiological, behavioral, and computational studies

#### *Ahmed A. Moustafa<sup>1</sup>\*, Izhar Bar-Gad<sup>2</sup>, Alon Korngreen<sup>2,3</sup> and Hagai Bergman<sup>4</sup>*

*<sup>1</sup> Department of Veterans Affairs, New Jersey Health Care System, School of Social Sciences and Psychology, Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia*

*<sup>2</sup> Gonda Brain Research Center, Bar-Ilan University, Ramat Gan, Israel*

*<sup>3</sup> Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel*

*<sup>4</sup> Department of Neurobiology (Physiology), Faculty of Medicine, Edmond and Lily Safra Center for Brain Research, Institute of Medical Research Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, Israel*

*\*Correspondence: a.moustafa@uws.edu.au*

#### *Edited and reviewed by:*

*Maria V. Sanchez-Vives, ICREA-IDIBAPS, Spain*

**Keywords: basal ganglia, dopamine, Parkinson's disease (PD), computational modeling, animal studies, human imaging studies, deep brain stimulation**

The basal ganglia has received much attention over the last two decades, as it has been implicated in many neurological and psychiatric disorders, including Parkinson's disease (PD), Attention Deficit Hyperactivity Disorder (ADHD), Tourette's syndrome, and dystonia. Most current basal ganglia research, in both animals and humans, attempts to understand the neural and biochemical substrates of basic motor and learning processes, and how these are affected in human patients as well as animal models of brain disorders, particularly PD.

The current volume contains research articles and reviews describing basic, pre-clinical and clinical neuroscience research of the basal ganglia written by researchers of the basal ganglia and attendees of the 11th Triennial Meeting of the International Basal Ganglia Society (IBAGS) that was held on March 3–7, 2013 at the Princess Hotel, Eilat, Israel. Specifically, articles in this volume include research reports on the biochemistry, computational theory, anatomy, and physiology of single neurons and functional circuitry of the basal ganglia networks.

Below, we provide a summary of articles published in the volume. We divided the articles into four sections: animal studies, human studies, computational modeling, and reviews.

#### **ANIMAL STUDIES**

Using physiological recordings, Adler et al. (2013) studied the relationship between medium spiny cells and interneurons of the striatum while monkeys performed a Pavlovian conditioning task, showing that both classes of neurons play a key role in conditioning. Along these lines, Yael et al. (2013) studied the effects of the D2 antagonist haloperidol on striatal activity, finding evidence that it alters the firing patterns of medium spiny cells and interneurons. Interestingly, Bronfeld et al. (2013) found that microinjection of the GABA<sub>A</sub> antagonist bicuculline into the dorsal striatum leads to Tourette-like tics in rats, potentially highlighting a role of interneurons in such motor symptoms. However, future studies should explain the exact function of interneurons in the generation of tics. In another study, Plata et al. (2013) found that the administration of nicotine activates inhibitory interneurons in the striatum, which may help treat motor symptoms of PD (similarly to dopaminergic medications). The implications of these findings remain to be shown in clinical studies. Using whole-cell patch-clamp recordings, Arias-Garcia et al. (2013) found, for the first time, that Ca<sup>2+</sup>-activated K<sup>+</sup> channels explain the difference in duration between corticostriatal responses in the basal ganglia direct and indirect pathways. Future research should explain the relationship between these findings and motor symptoms in PD.

While the above studies focus on the striatum, other studies in the volume address the function of downstream basal ganglia structures. For example, Lavian et al. (2013) studied the effects of high and low frequency stimulation of the subthalamic nucleus on the activation of subthalamic and globus pallidus neurons *in vitro*, finding evidence of a complex relationship between these two basal ganglia structures. In another physiological study, Espinosa-Parrilla et al. (2013) recorded from the subthalamic nucleus in behaving monkeys, showing evidence that it plays a role in reward and motivational processes; their findings suggest that the subthalamic nucleus transforms motivational information into motor responses. These findings can potentially shed light on how motivational factors impact motor symptoms in PD. Benhamou and Cohen (2014) studied the types and activity patterns of globus pallidus internal segment neurons, finding evidence of the existence of distinct neuronal populations in this brain structure. Future physiological studies should investigate the exact function of these cell populations, and whether they are differentially impacted by neurological disorders.

Other studies in the volume focused on the behavioral profile of animal models of PD. In one study, Pasquereau and Turner (2013) investigated the effects of fast muscle stretches on primary motor cortex activity in monkeys before and after MPTP treatment; they found, for the first time, evidence that MPTP can alter the primary motor cortex response to muscle stretch. Future research should investigate whether these findings can shed light on the nature of akinesia and bradykinesia symptoms in patients with PD. In another study, Grieb et al. (2013) video-monitored locomotor behavior in 6-OHDA rats, finding evidence of impaired motion speed among other locomotion variables. These data replicate findings from studies of human patients with PD.

## **HUMAN STUDIES**

Most of the studies in the volume focus on PD. For instance, Moshel et al. (2013) investigated the pattern of subthalamic nucleus oscillation and synchronicity in PD patients during the deep brain stimulation procedure. This is among the few studies that investigate the physiological patterns of different regions within the subthalamic nucleus in PD patients. Along the same lines, Eitan et al. (2013) recorded from the subthalamic nucleus in PD patients during deep brain stimulation procedures while subjects were presented with emotional stimuli and found evidence for hemispheric (right) and domain (ventro-medial) specificity of these responses. Future research should investigate hemispheric specificity of cognitive and motor processes in PD patients undergoing deep brain stimulation.

Other studies in the volume focus on cognitive profiles of PD patients. Filoteo and Maddox (2014) studied category learning performance in PD patients, finding evidence for an effect of category discontinuity on learning. This extends prior findings by the same authors on the effects of PD on category learning (Filoteo et al., 2007). Using imaging and computational modeling, O'Callaghan et al. (2013) studied learning impairment in PD patients, showing evidence for a role for the ventromedial prefrontal cortex and inferior frontal gyrus in these processes; this extends prior findings of the role of the basal ganglia in learning processes (Bodi et al., 2009; Keri et al., 2010).

In one study, Moll et al. (2014) recorded single-cell as well as local field potential activity from the globus pallidus (internal and external segments) in patients with cervical dystonia undergoing deep brain stimulation. They found that cervical dystonia is associated with asymmetric pallidal functions. However, future experimental and theoretical work should explain how damage to the globus pallidus relates to dystonia symptoms, and whether the basal ganglia direct pathway plays a similar role in dystonia symptoms.

#### **COMPUTATIONAL MODELING**

The volume contains different kinds of computational models of the basal ganglia that focus on physiological properties of basal ganglia structures or the effects of PD on motor and cognitive processes.

Regarding physiological models, Brody and Korngreen (2013) provided a compartmental computational model linking synaptic plasticity with globus pallidus dynamics, and used it to examine the mechanism underlying the differential effects of low and high frequency stimulation on this structure. Future computational studies should also investigate the dynamics of other basal ganglia structures, including low and high frequency stimulation of the subthalamic nucleus. Along the same lines, Merrison-Hort and Borisyuk (2013) developed a computational model of the globus pallidus along with afferent inputs from the cortex and subthalamic nucleus, highlighting how motor symptoms in PD can arise from aberrations to this circuit. This work shows that complex interactions among cortical and subcortical structures underlie the occurrence of motor symptoms in PD. Unlike the above models, Guo et al. (2013) provided a computational model of the thalamocortical circuit to investigate activity patterns of the thalamus in dystonia and PD, finding evidence that both diseases have to some extent similar effects on these basal ganglia structures. However, it remains to be shown how damage to different basal ganglia structures can lead to different symptoms, as in PD and dystonia.

Some other models in the volume focus on simulating basal ganglia-related motor and cognitive processes in health and disease. For example, Gupta et al. (2013) designed a basal ganglia model of precision grip in PD, addressing the effects of dopamine medications on grip function. This is among the first models that explain how dopamine medications impact grip force in PD patients. Future work should also address the effects of deep brain stimulation on grip force in PD. Muralidharan et al. (2014) provided one of the first computational models that simulate freezing of gait in PD as well as the effects of dopamine medications on gait parameters. This model specifically focused on simulating data from healthy subjects and PD patients passing through doorways of different widths. Future work should explain other factors that lead to freezing of gait, including obstacle avoidance, turning, and motor initiation. In another study, Balasubramani et al. (2014) designed a computational model of the role of dopamine and serotonin interaction in the basal ganglia in reward, punishment, and risk-based decision making, as studied previously in PD patients (Frank et al., 2007; Bodi et al., 2009). This is among the few models that investigate the function of dopamine and serotonin in behavioral processes in PD. As for motor processes, Tomkins et al. (2013) developed a model of the striatum in action selection and decision making, showing how this circuit decides on responses based on cortical inputs. Future work should investigate how action selection relates to motor symptoms in PD, including akinesia, bradykinesia, and medication-induced dyskinesia. Gershman et al. (2014) addressed the issue of time in reinforcement learning models of the basal ganglia, suggesting a single mechanism for reinforcement learning and interval timing. Paul and Ashby (2013) provided a computational model showing how memory systems (explicit and procedural memory systems) interact during learning.

## **REVIEWS**

Our volume includes various reviews, which focus on the physiological properties of a basal ganglia structure, a basal ganglia-related disorder, computational models of the basal ganglia, the mechanism of action of deep brain stimulation in basal ganglia-related disorders, or the interaction of the basal ganglia with other brain structures.

For example, Schwab et al. (2013) provided a review of the physiological properties of the globus pallidus external segment in health and disease (focusing on PD). Along the same lines, Nambu and Tachibana (2014) reviewed data on basal ganglia (particularly subthalamic nucleus and globus pallidus) oscillations in relation to PD motor symptoms. These reviews complement many other existing reviews on the function and role of the striatum in PD motor symptoms. On the other hand, Molochnikov and Cohen (2014) reviewed data on the function of nigrostriatal and mesolimbic dopamine in different hemispheres. This review highlights the dissociable function of different hemispheres, yet future work should relate these findings to neurological and psychiatric disorders. In another review, Bosch-Bouju et al. (2013) reviewed data on the role of the thalamus in the integration of data from the cortex, cerebellum, and basal ganglia, and how such pathways play a role in the occurrence of motor symptoms in PD. In another paper, Nougaret et al. (2013) provided a commentary on a recent anatomical study on the prefrontal-subthalamic pathway in primates (Haynes and Haber, 2013). These studies highlight key findings on the patterns of connections from various prefrontal and cortical areas to the subthalamic nucleus, and have implications for understanding the motor and cognitive function of the subthalamic nucleus in healthy subjects as well as in patients with PD.

Further, some reviews in the volume attempt to explain the neural mechanism of deep brain stimulation. For instance, Chiken and Nambu (2014) reviewed studies arguing that deep brain stimulation works by disrupting abnormal signal transmission in PD, dystonia, and tremor, while Smith et al. (2014) reviewed data on the role of the thalamo-striatal pathway in motor symptoms in PD and suggested that deep brain stimulation of this pathway can aid in the treatment of PD symptoms. Unlike these reviews, Jahanshahi (2013) reviewed recent studies on the effects of subthalamic deep brain stimulation on motor and cognitive processes in PD, focusing on inhibitory and cognitive control. This review shows that besides motor processes, deep brain stimulation has a complex effect on cognitive control as well as other cognitive processes.

Some other reviews in the volume focus on behavioral processes. For instance, Seger (2013) provided an overview of the function of the visual cortico-striatal loop. This loop has so far received less attention in the literature than other basal ganglia loops. Interestingly, Simola et al. (2013) reviewed data on how early movement can impact abnormal involuntary movement and dyskinesia, focusing on the effects of dopamine replacement therapies on these motor complications. Studies reviewed here shed light on how levodopa and dopamine agonists can differentially affect the occurrence of dyskinesia in a subset of PD patients. Unlike prior reviews, Retailleau and Boraud (2014) reviewed data on the role of dopamine projection to the hippocampus in navigation in 6-OHDA rats. This review addresses a less-studied issue, as most existing studies focus on dopamine projections to the basal ganglia and prefrontal cortex. In another review, Shine et al. (2013) provided a review of the neural and cognitive underpinnings of freezing of gait in PD. This review sheds light on the complexity of freezing of gait and explains how damage to the basal ganglia and the cortex can lead to this motor symptom. Further, Moustafa and Poletti (2013) reviewed studies on the cognitive and neural abnormalities in subtypes of PD patients, including tremor- vs. akinesia-dominant patients as well as patients with or without depression, impulsivity, and hallucinations. Further, Leisman et al. (2014) reviewed studies addressing the relationship between the basal ganglia and ADHD. This review explains how damage to corticostriatal loops leads to attentional problems and impulsive behavior in ADHD patients.

Other reviews focus on computational models of the basal ganglia. In one review, Schroll and Hamker (2013) provided an extensive survey of existing basal ganglia models, focusing on models relating the anatomy of the basal ganglia to behavioral processes. This review covered most existing studies on the function of the basal ganglia direct and indirect pathways. On the other hand, Helie et al. (2013) reviewed basal ganglia network models of various motor and cognitive processes, including working memory, categorization, sequence learning, and handwriting. This review shows how the basal ganglia can play a similar role in motor and cognitive processes.

## **CONCLUSIONS**

This volume provides the latest data on animal models of basal ganglia dysfunction and clinical studies in human patients with basal ganglia-related disorders. Although there is a multitude of studies on the anatomy, physiology, and computational models of the basal ganglia, there are still many open questions. Future experimental and computational studies will continue to clarify how exactly neurological and psychiatric disorders impact the basal ganglia, as well as the neural mechanisms of medications and deep brain stimulation.

## **REFERENCES**


insights on subthalamic nucleus functions. *Front. Comput. Neurosci.* 7:135. doi: 10.3389/fncom.2013.00135


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 July 2014; accepted: 04 August 2014; published online: 21 August 2014. Citation: Moustafa AA, Bar-Gad I, Korngreen A and Bergman H (2014) Basal ganglia: physiological, behavioral, and computational studies. Front. Syst. Neurosci. 8:150. doi: 10.3389/fnsys.2014.00150*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Moustafa, Bar-Gad, Korngreen and Bergman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The Michelin red guide of the brain: role of dopamine in goal-oriented navigation

## **Aude Retailleau<sup>1</sup>\* and Thomas Boraud<sup>2,3</sup>**

<sup>1</sup> Sagol Department of Neurobiology, University of Haifa, Haifa, Israel

<sup>2</sup> Institut des Maladies Neurodegeneratives UMR 5293, University of Bordeaux, Bordeaux, France

<sup>3</sup> Institut des Maladies Neurodegeneratives UMR 5293, CNRS, Bordeaux, France

#### **Edited by:**

Ahmed A. Moustafa, University of Western Sydney, Australia

#### **Reviewed by:**

Mehdi Khamassi, CNRS (Centre National de la Recherche Scientifique), France

Ahmed A. Moustafa, University of Western Sydney, Australia

#### **\*Correspondence:**

Aude Retailleau, Neural Representations Lab, Sagol Department of Neurobiology, University of Haifa, 99 Aba Khoushy Ave, Mount Carmel, Haifa 3498838, Israel

e-mail: retailleau@campus.haifa.ac.il

Spatial learning has been recognized over the years to be under the control of the hippocampus and related temporal lobe structures. Hippocampal damage often causes severe impairments in the ability to learn and remember a location in space defined by distal visual cues. Such cognitive disabilities are found in Parkinsonian patients. We recently investigated the role of dopamine in navigation in the 6-hydroxydopamine (6-OHDA) rat, a model of Parkinson's disease (PD) commonly used to investigate the pathophysiology of dopamine depletion (Retailleau et al., 2013). We demonstrated that dopamine (DA) is essential to spatial learning, as its depletion results in spatial impairments. Our results showed that the behavioral effect of DA depletion is correlated with modification of the neural encoding of spatial features and decision-making processes in the hippocampus. However, the origin of these alterations in the neural processing of spatial information needs to be clarified. It could result from a local effect: dopamine depletion directly disturbs the processing of relevant spatial information at the hippocampal level. Alternatively, it could result from a more distributed network effect: dopamine depletion elsewhere in the brain (entorhinal cortex, striatum, etc.) modifies the way the hippocampus processes spatial information. Indeed, recent experimental evidence in rodents demonstrates that other brain areas are involved in the acquisition of spatial information. Amongst these, the cortex-basal ganglia (BG) loop is known to be involved in reinforcement learning and has been identified as an important contributor to spatial learning. In particular, it has been shown that altered activity of the BG striatal complex can impair the ability to perform spatial learning tasks. The present review provides a glimpse of the findings obtained over the past decade that support a dialog between these two structures during spatial learning under DA control.

**Keywords: dopamine, hippocampus, basal ganglia, spatial navigation, striatum**

## **INTRODUCTION: OF PARKINSON'S DISEASE, DISORIENTATION, DOPAMINE AND HIPPOCAMPUS**

Parkinson's disease (PD) has long been characterized as a motor disease (Agid, 1991). Emphasis has long been placed on the so-called triad of akinesia, rigidity and tremor. However, classic drug and surgical therapies are now able to control these symptoms to a large extent, often at the cost of disabling side effects such as drug-induced dyskinesia (Boraud et al., 2001). Nevertheless, the increased autonomy acquired by Parkinsonian patients has unmasked cognitive symptoms that were previously underestimated (Sawamoto et al., 2002). It was previously thought that cognitive alteration was a late feature of the disease but, in fact, it occurs early in about 15–20% of patients (Weintraub et al., 2011; Svenningsson et al., 2012). Amongst cognitive symptoms we can identify spatial disorientation, which was described as an early landmark 25 years ago but was left unexplored for a long time (Hovestadt et al., 1987; Taylor et al., 1989) despite its impact on quality of life and its status as a public health problem (Crizzle et al., 2012).

Spatial cognition includes all the processes that allow animals to acquire process, memorize and use spatial information to perform goal-directed movements. Animals perceive space and extract pertinent information relevant to their spatial behavior using two types of sensory information: external cues supplied by the environment and internal cues supplied by their own movements. Navigation strategies are based on internal cues (path integration), external cues (taxon navigation) or on both (local navigation; O'Keefe and Conway, 1978; Gallistel, 1990; Trullier et al., 1997) Navigation could be represented in space by two systems of coordinates: an external coordinate system (allocentric representation) or an internal frame (egocentric representation; Berthoz, 1991; Poucet and Benhamou, 1997; Benhamou and Poucet, 1998). Several structures are considered to be important for building the neural representation of these spatial coordinates often called cognitive maps: amongst them the hippocampus is considered as the corner stone (O'Keefe and Dostrovsky, 1971; O'Keefe, 1979). Dopaminergic afferents to the hippocampal formation arise from both the ventral tegmental area (VTA, A10) and *substantia nigra pars compacta* (SNc, A8–A9) dopaminergic cell groups (Scatton et al., 1980). In addition, intra-hippocampal injections of D1 agonists and D2 antagonists improve memory (Packard and White, 1991; Wilkerson and Levin, 1999), while 6- Hydroxy-dopamine (6-OHDA) lesions of dopaminergic inputs to hippocampus induce spatial working memory deficits (Gasbarri et al., 1996). The dorsal part of Cornu Ammonis areas 3 (CA3), considered to be involved in the rapid acquisition of new memory (Kesner, 2007), is the target of dopamine (DA) projections from VTA and SNc (Scatton et al., 1980; Luo et al., 2011). 
DA innervation, together with a higher lability of place fields compared with those of CA1 (Barnes et al., 1990; Mizumori, 2006), makes CA3 a good candidate for the detection of the contextual significance of spatial features (Penner and Mizumori, 2012).
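
Path integration, the navigation strategy based on internal cues mentioned above, can be made concrete with a short dead-reckoning sketch. The example is purely illustrative and is not taken from the cited studies: the function names `integrate_path` and `homing_vector`, and the description of self-motion as (turn, distance) pairs, are assumptions made for this illustration.

```python
import math

def integrate_path(steps, start=(0.0, 0.0), heading=0.0):
    """Estimate position from self-motion cues only (dead reckoning).

    steps: iterable of (turn, distance) pairs. `turn` is the change in
    heading in radians (counter-clockwise positive), applied before
    moving `distance` along the new heading. Hypothetical input format.
    """
    x, y = start
    for turn, distance in steps:
        heading += turn
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y, heading

def homing_vector(x, y, start=(0.0, 0.0)):
    """Distance and bearing (radians) from (x, y) back to the start point."""
    dx, dy = start[0] - x, start[1] - y
    return math.hypot(dx, dy), math.atan2(dy, dx)

# Outbound path: 3 units forward, turn left by 90 degrees, 4 units forward.
x, y, heading = integrate_path([(0.0, 3.0), (math.pi / 2, 4.0)])
distance_home, bearing_home = homing_vector(x, y)
```

Because the estimate relies on self-motion alone, any error in the sensed turns or distances accumulates along the path, which is one reason why external cues are useful for correcting it.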

These facts raise the question of what happens in this brain structure when dopamine is depleted in Parkinsonian patients. We recently addressed this issue in the 6-OHDA rat, a model of PD commonly used to investigate the pathophysiology of dopamine depletion (Retailleau et al., 2013). We demonstrated that the hippocampus plays a role not only in the identification of locations in an environment and in the planning of trajectories to goals and path selection, as previously shown (Morris et al., 1990; Bannerman et al., 1995; Whishaw and Jarrard, 1995; Whishaw and Tomie, 1996; Hollup et al., 2001; Johnson and Redish, 2007; Rolls, 2007), but also in goal encoding. Behavioral analysis showed that sham rats are able to optimize their behavior in a baited Y-maze in order to increase the total amount of reward obtained over the course of the session. This behavior is correlated with an increase in the mean firing rate of CA3 neurons at both the decision point and the reward location. Moreover, lesion of the dopaminergic neurons of the substantia nigra disrupted this ability and decorrelated the firing rate of CA3 neurons from the decision and reward locations (**Figure 1**).

However, the origin of these alterations in the neural processing of spatial features needs to be clarified. It could result from a local effect: dopamine depletion directly disturbs the processing of relevant spatial information at the hippocampal level. Alternatively, it could result from a more distributed network effect: dopamine depletion elsewhere in the brain (entorhinal cortex, striatum, etc.) modifies the way the hippocampus processes spatial information. In this mini-review we survey the arguments in the literature that support each theory.

## **INFLUENCE OF DA IN THE HIPPOCAMPUS**

Dopamine from the SNc and VTA provides a reward prediction error signal to the dorsal and ventral striatum (Schultz, 2006). It is tempting to assume that similar mechanisms are involved in the hippocampus itself, and there are strong elements to support this assumption. The hippocampus is one of the rare brain areas where all five subtypes of DA receptors are found, from both the D2-like (Brouwer et al., 1992) and D1-like families (Gingrich et al., 1992; Laurier et al., 1994; Sokoloff and Schwartz, 1995). In addition, intra-hippocampal injections of D1 agonists and D2 antagonists improve memory (Packard and White, 1991; Wilkerson and Levin, 1999), while 6-OHDA lesions of the dopamine input to the hippocampus result in spatial working memory deficits (Gasbarri et al., 1996). Exposure to a novel context increases hippocampal dopamine release (Ihalainen et al., 1999), which in turn facilitates long-term potentiation (LTP) induction (Li et al., 2003; Lemon and Manahan-Vaughan, 2006). Detection of a context change by the hippocampus can be used to update memory systems, which in turn may signal dopamine neurons to increase dopamine release. The subsequent increase in dopamine release in the hippocampus seems to excite hippocampal neurons and so increase the duration of neural responses to glutamatergic input (Smialowski, 1987; Smialowski and Bijak, 1987), thereby contributing to the increased stabilization of place fields typically observed as rats become familiar with new environmental conditions (Frank et al., 2004).
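
The notion of a reward prediction error can be illustrated with the temporal-difference (TD) error used in reinforcement learning, δ = r + γV(next state) − V(current state). The sketch below is a generic textbook-style illustration, not a model from the studies cited above: the three-state trial, the learning rate `alpha`, the discount factor `gamma` and the function names are all assumptions chosen for the example.

```python
def td_errors(values, rewards, gamma=0.9):
    """TD error for each step of one trial.

    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t); the value after the
    last state of the trial is taken to be 0.
    """
    deltas = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        deltas.append(reward + gamma * next_value - values[t])
    return deltas

def learn(n_states=3, n_trials=200, alpha=0.1, gamma=0.9):
    """Learn state values on a trial where only the last state is rewarded."""
    values = [0.0] * n_states
    rewards = [0.0] * (n_states - 1) + [1.0]
    first = td_errors(values, rewards, gamma)      # errors before learning
    for _ in range(n_trials):
        for t, delta in enumerate(td_errors(values, rewards, gamma)):
            values[t] += alpha * delta             # error-driven update
    last = td_errors(values, rewards, gamma)       # errors after learning
    return values, first, last

values, first_errors, last_errors = learn()
```

Before learning, the error is positive only at the unexpected reward. After learning, the reward is predicted by the earlier states and the errors shrink toward zero, which is the qualitative signature usually associated with a reward prediction error signal.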

All of these data strengthen the hypothesis of a local effect of DA in the hippocampus, but there is a major drawback: the direct dopaminergic input to the hippocampus is very sparse and poorly described despite intensive search. As far as we know, only one study has described a few fibers arising from both the VTA (A10) and SNc (A8–A9) dopaminergic cell groups (Scatton et al., 1980). But sparseness does not mean inefficiency, and the contrast with the reasonably high density of DA receptors in the hippocampus argues for an actual functional effect.

## **BASAL GANGLIA AND SPATIAL NAVIGATION**

Experimental approaches in rodents have provided behavioral evidence that pharmacological manipulations of the nucleus accumbens (NAc, part of the ventral striatum) impair acquisition and/or performance in spatial learning tasks (Annett et al., 1989; Wiener, 1993; Ploeger et al., 1994; Gal et al., 1997; Smith-Roe et al., 1999; Atallah et al., 2007; Ferretti et al., 2007). Packard and McGaugh's seminal experiments (Packard and McGaugh, 1996) introduced the hypothesis that the striatum contributes to egocentric learning. However, recent data have generated controversy and show that the striatum is also involved in other aspects of spatial learning. Various studies have examined the involvement of different sub-regions of the striatum in allocentric and egocentric learning (Voorn et al., 2004). Their results suggest that the dorsolateral striatum (DLS) plays a role in egocentric spatial learning (Yin and Knowlton, 2004) based on cue-action association (van der Meer et al., 2010), and that both the NAc and the dorsomedial striatum (DMS) play a role in allocentric learning (Lavoie and Mizumori, 1994; Mizumori et al., 2004; Ferretti et al., 2007). The central role of the ventral striatum in spatial decision making has been confirmed by electrophysiological approaches (Lansink et al., 2009; van der Meer et al., 2010; van der Meer and Redish, 2011).

These findings pave the way for an integration of allocentric and egocentric processes in the striatum. The striatum is a highly convergent structure in which the different functional territories overlap extensively (Parent and Hazrati, 1995). Thorn et al. (2010) recently demonstrated that DMS and DLS display contrasting patterns of activity during task performance that developed concurrently with sharply different dynamics. This suggests that the integration can be either a cooperative (additive) process or a competitive process that would contribute to multimodal decision making using mechanisms similar to those that have been modeled for action selection (Leblois et al., 2006; Guthrie et al., 2013).

## **INFLUENCE OF DA IN THE BASAL GANGLIA**

The regulation of striatal input by tonic and phasic dopamine release has been described so extensively that it cannot be detailed in this mini-review, and we refer the reader to more comprehensive reviews (Goto et al., 2007; Schultz, 2007; Humphries et al., 2012). Briefly, dopaminergic presynaptic boutons contact the base of the dendritic spines of the medium spiny neurons of the striatum. From this key position they control the effect of the other inputs contacting the head of the spine (Parent and Hazrati, 1993) and play a key role in synaptic plasticity.

The role of DA at the striatal level in the control of spatial navigation has been demonstrated in both the ventral and dorsal parts. In the ventral part, evidence has accumulated since the late 1980s. Mogenson was the first to highlight that hippocampal signal transmission to the pedunculopontine tegmental nucleus is regulated by dopamine receptors of the NAc (Mogenson et al., 1987). The integrative role of the NAc and subpallidal area in relaying hippocampal signals to the mesencephalic locomotor region in the brainstem was investigated electrophysiologically in urethane-anaesthetized rats. A behavioral study of the functional connections was also performed in freely moving rats. In the electrophysiological experiments, subpallidal output neurons to the pedunculopontine nucleus were first identified by their antidromic responses to electrical stimulation of the pedunculopontine nucleus. Hippocampal stimulation was then shown to inhibit orthodromically some of these subpallidal neurons. The inhibitory response was attenuated following microinjection of a dopamine D2 agonist, but not a D1 agonist, into the NAc. This suggests that signal transmission from the hippocampus to the subpallidal output neurons projecting to the pedunculopontine nucleus is modulated by a D2 receptor-mediated mechanism in the NAc. Injections of N-methyl-D-aspartate into the ventral subiculum of the hippocampus resulted in a threefold increase in locomotor responses. Injection of a D2 agonist into the accumbens reduced this hyperkinetic response dose-dependently, suggesting that D2 receptors regulate locomotor responses initiated by the hippocampal-accumbens pathway. These results provide evidence of limbic (e.g., hippocampal) influences on locomotor activity by way of NAc-subpallidal-pedunculopontine nucleus connections that may contribute to adaptive behavior.

Concerning the dorsal part (DMS and DLS), the role of DA in controlling spatial navigation has been partially demonstrated only much more recently (for review: Penner and Mizumori, 2012). What is known so far is that DA depletion in the DLS interferes with strategy shifting, but no experiment has yet been conducted to assess whether it disrupts egocentric learning.

Thus, although it is highly probable that DA depletion in the striatum could explain the effect on spatial behavior, there is no definitive evidence yet.

## **TOWARDS AN INTEGRATIVE REPRESENTATION**

This quick review of the literature highlights the multilevel involvement of DA in the processing of spatial information by the nervous system. However, it remains to be determined at which level(s) the disruption of dopaminergic function impacts spatial navigation, in order to unravel the origin of spatial disorientation in PD patients (Hovestadt et al., 1987; Taylor et al., 1989).

The classical segregation between allocentric and egocentric learning seems too schematic and does not take into account the different modalities of perception of the environment. The hippocampus and striatum both contribute to encoding the spatial representation and seem to be involved in various modalities of perception. The hippocampus is related to the building of a spatial map (where am I?), but a recent study showed that many place cells recorded from rats performing place or cue navigation tasks also discharge when the animals are at the goal location (Hok et al., 2007). The striatum provides reinforcing values to several aspects of the environment (where is my reward?): the ventral part to the pure localization aspects, while the medial part encodes external cues. Meanwhile, the dorsal striatum encodes the procedure (from here, which sequence of actions do I have to take in order to get my reward?; Retailleau et al., 2012). This taxonomy overlaps with the one recently proposed by Khamassi and Humphries (2012), who proposed that the allocentric framework encompasses a model-based learning process while the egocentric framework is more related to model-free associative learning. It seems obvious that DA plays a key role in the striatum and that striatal DA depletion partly disrupts these processes. The rich density of DA receptors, together with the responses to reward location and decision processes that we observed in CA3 (Retailleau et al., 2013), also makes a local effect highly probable. Both mechanisms should therefore be involved in the spatial disorientation observed in PD patients, but what remains to be established is in what proportion.

## **HOW DOES A SPATIAL STRATEGY EMERGE?**

It has been shown that different spatial strategies can be used and that a lesion of the hippocampus or of one of the striatal territories may shift the dominant strategy used by the subject (Packard and McGaugh, 1996; Mizumori et al., 2004; Penner and Mizumori, 2012). This strongly suggests that the neural mechanism for the specific selection of one of these modalities is based on competition.

We demonstrated that action selection and decision making are emergent properties of the architecture of the frontal and dorsal part of the cortex-basal ganglia (BG) loop (Leblois et al., 2006; Guthrie et al., 2013). This makes it a perfect analog of the actor in the actor-critic model of Sutton and Barto (1998), working under the supervision of a critic. Actor and critic are interfaced by DA at the striatal level. The exact role of DA is still debated and may differ between structures (Khamassi and Humphries, 2012). While it almost certainly acts as a prediction error signal in dorsal territories, it may be a simple reward signal elsewhere. In fact, the neural substrate underlying the critic is still unknown and different sub-populations of the actor structures may be involved (Lansink et al., 2009; Gläscher et al., 2010). For a comprehensive review of the involvement of the structures in different modalities of critic processing, we refer the reader to Khamassi and Humphries (2012). We propose here to extend our model of the actor to the modalities of spatial navigation. The three modalities (external coordinates, external cues and internal coordinates) are supported by three different loops: the first passes through the ventral BG and the hippocampus, the second through the medial BG and thalamus, and the third through the dorsal BG (**Figure 2**). The competition mechanism would allow the activation of one of the three modalities. During learning, dopamine would induce plasticity at various levels (ventral/medial or dorsal striatum, hippocampus) and modify the respective weight of each loop. According to the phase of learning and the reinforcing value of one of the aspects, one of the three networks would win the competition and therefore take over the control of spatial navigation. This theory has the advantage of taking into account both the competitive and the collaborative aspects of their interactions. It explains why, when all loops are functional, one can take over from another during the course of learning while, when one is disrupted, a different loop is able to take over the spatial navigation function. At the beginning of learning the system switches from one modality to another randomly, owing to the noise level of the system. Meanwhile, each system learns in parallel from the outcomes. From the work of Packard and McGaugh (1996), we can infer that the more ventral system learns first and drives animal behavior after a moderate period of learning. After longer practice, the dorsal system is strengthened, takes over from the ventral system and drives the animal's behavior.
Inhibition of the stronger partner of the system (the ventral in the early phase, the dorsal in the later phase) allows the weaker system to take control of the behavior. Depending on the topography of the lesion, dopamine depletion as in PD may disrupt one or several of these systems. Where this disruption occurs and whether it can be compensated for by another system are still open questions to be investigated.
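
The proposed competition between loops can be sketched as a noisy winner-take-all process in which each loop's weight is learned in parallel. This is a toy illustration only, not the published model of Leblois et al. (2006) or Guthrie et al. (2013): the learning rates, asymptotic weights, noise level and the simple update rule are hypothetical values chosen so that the ventral loop learns quickly and the dorsal loop learns slowly but ends up stronger.

```python
import random
from collections import Counter

# Hypothetical parameters: (learning rate, asymptotic weight) for each loop.
LOOPS = {
    "ventral": (0.20, 0.6),  # learns fast, moderate final weight
    "medial": (0.05, 0.4),
    "dorsal": (0.02, 1.0),   # learns slowly, high final weight
}

def simulate(n_trials=300, noise_sd=0.05, lesioned=(), seed=0):
    """Return the name of the loop controlling behavior on each trial."""
    rng = random.Random(seed)
    weights = {name: 0.0 for name in LOOPS}
    winners = []
    for _ in range(n_trials):
        # Competition: the intact loop with the largest noisy weight wins.
        scores = {name: w + rng.gauss(0.0, noise_sd)
                  for name, w in weights.items() if name not in lesioned}
        winners.append(max(scores, key=scores.get))
        # Parallel learning: every intact loop moves toward its asymptote.
        for name, (rate, asymptote) in LOOPS.items():
            if name not in lesioned:
                weights[name] += rate * (asymptote - weights[name])
    return winners

def dominant(winners, start, stop):
    """Most frequent controlling loop over trials start..stop-1."""
    return Counter(winners[start:stop]).most_common(1)[0][0]

intact = simulate()
no_dorsal = simulate(lesioned=("dorsal",))
early, late = dominant(intact, 10, 20), dominant(intact, 250, 300)
late_without_dorsal = dominant(no_dorsal, 250, 300)
```

With these assumed parameters, control is decided by noise on the first trials, the ventral loop dominates after moderate training, the dorsal loop takes over after extended training, and removing the dorsal loop leaves the ventral loop in control late in training.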

#### **REFERENCES**


alterations of pallidal neurones in the MPTP-treated monkey. *Brain* 124, 546–557. doi: 10.1093/brain/124.3.546


during spatial learning. *J. Physiol. Paris* 106, 72–80. doi: 10.1016/j.jphysparis.2011.10.002


Wiener, S. I. (1993). Spatial and behavioral correlates of striatal neurons in rats performing a self-initiated navigation task. *J. Neurosci.* 13, 3802–3817.

Wilkerson, A., and Levin, E. D. (1999). Ventral hippocampal dopamine D1 and D2 systems and spatial working memory in rats. *Neuroscience* 89, 743–749. doi: 10.1016/s0306-4522(98)00346-7

Yin, H. H., and Knowlton, B. J. (2004). Contributions of striatal subregions to place and response learning. *Learn. Mem.* 11, 459–463. doi: 10.1101/lm.81004

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 October 2013; accepted: 18 February 2014; published online: 18 March 2014.*

*Citation: Retailleau A and Boraud T (2014) The Michelin red guide of the brain: role of dopamine in goal-oriented navigation. Front. Syst. Neurosci. 8:32. doi: 10.3389/fnsys.2014.00032*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Retailleau and Boraud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Synchrony in Parkinson's disease: importance of intrinsic properties of the external globus pallidus

## *Bettina C. Schwab<sup>1,2</sup>\*, Tjitske Heida<sup>2</sup>, Yan Zhao<sup>2</sup>, Enrico Marani<sup>2</sup>, Stephan A. van Gils<sup>1</sup> and Richard J. A. van Wezel<sup>2,3</sup>*

*<sup>1</sup> Applied Analysis and Mathematical Physics, MIRA Institute of Technical Medicine and Biomedical Technology, University of Twente, Enschede, Netherlands*

*<sup>2</sup> Biomedical Signals and Systems, MIRA Institute of Technical Medicine and Biomedical Technology, University of Twente, Enschede, Netherlands*

*<sup>3</sup> Biophysics, Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands*

#### *Edited by:*

*Hagai Bergman, The Hebrew University, Israel*

#### *Reviewed by:*

*Thomas Wichmann, Emory University School of Medicine, USA Jérôme Baufreton, Centre National de la Recherche Scientifique, France Maria S. Aymerich, CIMA-Universidad de Navarra, Spain*

#### *\*Correspondence:*

*Bettina C. Schwab, Applied Analysis and Mathematical Physics, MIRA Institute for Biomedical Technology and Technical Medicine, University of Twente, Zilverling, Hallenweg 19, PO BOX 217, 7500 AE Enschede, Netherlands e-mail: b.c.schwab@utwente.nl*

The mechanisms for the emergence and transmission of synchronized oscillations in Parkinson's disease, which are potentially causal to motor deficits, remain debated. Aside from the motor cortex and the subthalamic nucleus, the external globus pallidus (GPe) has been shown to be essential for the maintenance of these oscillations and plays a major role in sculpting neural network activity in the basal ganglia (BG). While neural activity of the healthy GPe shows almost no correlations between pairs of neurons, prominent synchronization in the β frequency band arises after dopamine depletion. Several studies have proposed that this shift is due to network interactions between the different BG nuclei, including the GPe. However, recent studies demonstrate an important role for the properties of neurons within the GPe. In this review, we will discuss these intrinsic GPe properties and review proposed mechanisms for activity decorrelation within the dopamine-intact GPe. Failure of the GPe to desynchronize correlated inputs can be a possible explanation for synchronization in the whole BG. Potential triggers of synchronization involve the enhancement of GPe-GPe inhibition and changes in ion channel function in GPe neurons.

**Keywords: external globus pallidus, dopamine depletion, desynchronization, plasticity, GPe-GPe synapses, HCN channels, SK channels, NaF channels**

#### **1. INTRODUCTION**

Neural activity in the basal ganglia (BG) of patients with idiopathic Parkinson's disease (PD) and animal models of PD commonly shows high levels of synchronization, bursting, and oscillations in low frequency bands such as θ (4–7 Hz) and β (15–30 Hz) frequencies (Bergman et al., 1994; Obeso et al., 2000; Brown et al., 2001; Montgomery, 2007; Wichmann et al., 2011). Although it is not completely clear whether these abnormal neural activities cause PD motor symptoms, they are reliable disease markers as they coincide with motor symptoms after severe dopamine depletion (Kühn et al., 2006, 2009; Hammond et al., 2007; Eusebio et al., 2007; Quiroga-Varela et al., 2013). Nevertheless, the mechanisms and origins of the emergence and transmission of synchronization, bursting and oscillations remain controversial. Oscillations in the β frequency range, often related to rigidity, akinesia and bradykinesia, have been proposed to arise via the cortex (Brown, 2003; Sharott et al., 2005; Tachibana et al., 2011) or via interactions of the subthalamic nucleus (STN) and the external globus pallidus (GPe) (Plenz and Kitai, 1999; Bevan et al., 2002; Terman et al., 2002; Tachibana et al., 2011; Fan et al., 2013).

After dopamine depletion, prominent changes in neural synchronization occur in projection neurons of the GPe, which has a central position in the BG loop (Smith et al., 1998)<sup>1</sup>. Under healthy conditions, activity in the GPe shows almost no correlations between pairs of neurons (Nini et al., 1995; Raz et al., 2000; Mallet et al., 2008), including spatially nearby neurons (Bar-Gad et al., 2003), although neurons in the GPe possess a large number of local axon collaterals and are believed to receive common inputs (Francois et al., 1984; Percheron et al., 1991; Yelnik, 2002). In contrast, after dopamine depletion, strong synchronization in the β frequency range was found (Nini et al., 1995; Raz et al., 2000; Heimer et al., 2002; Mallet et al., 2008). These findings led to the suggestion of a local mechanism that decorrelates activity in the healthy GPe (Bar-Gad et al., 2003). Failure of the GPe to decorrelate synchronized input can be an explanation for abnormal synchrony of the whole BG.
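
To make concrete why the absence of pairwise correlations is notable when neurons share inputs, the following toy simulation compares two model neurons with and without a common drive. It is a hedged illustration, not an analysis from the cited recordings: the Bernoulli spiking model, the bin count, the base rate and the `common_weight` parameter are all assumptions made for the example.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def simulate_pair(common_weight, n_bins=20000, base_rate=0.2, seed=1):
    """Binned spike trains (0 or 1 per bin) of two neurons.

    In each bin a shared binary input raises or lowers the spike
    probability of both neurons by common_weight / 2. With
    common_weight = 0 the two neurons spike independently.
    """
    rng = random.Random(seed)
    train_a, train_b = [], []
    for _ in range(n_bins):
        shared = 1.0 if rng.random() < 0.5 else 0.0
        p_spike = base_rate + common_weight * (shared - 0.5)
        train_a.append(1 if rng.random() < p_spike else 0)
        train_b.append(1 if rng.random() < p_spike else 0)
    return train_a, train_b

r_independent = pearson(*simulate_pair(common_weight=0.0))
r_common = pearson(*simulate_pair(common_weight=0.3))
```

In this sketch the shared drive alone produces a clearly positive spike-count correlation, whereas independent neurons stay close to zero. Near-zero correlations despite common inputs therefore point toward some additional decorrelating mechanism, as suggested in the text.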

In this review, we discuss recent evidence supporting the crucial role of GPe properties in synchronizing and desynchronizing afferent activity and their remodeling during parkinsonism. We describe proposed mechanisms for this synchronization process intrinsic to the GPe, based on synaptic and cellular properties.

#### **2. INTRINSIC GPe STRUCTURE**

The GPe is located centrally in the BG and contributes essentially to its multiple feedback loops (Jaeger and Kita, 2011). Its neural dynamics, involving high firing rates, are strongly influenced by excitatory inputs from the STN (Goldberg et al., 2003). GABAergic synapses projecting to the GPe are provided mainly by the striatum, and about 10% of the synapses arise in the GPe itself (Shink and Smith, 1995). Morphological characterization of these local axon collaterals in the rat brain indicates that the GP not only acts as a relay nucleus, but has intrinsic structures capable of internal information processing (Sadek et al., 2007). In these structures, information is processed from neurons in the outer part of the GP to neurons in the inner part (Sadek et al., 2007). This elaborate GP internal connectivity seems essential for sculpting GP activity, and GP projection neurons may take additional roles as inhibitory interneurons that control spiking behavior.

<sup>1</sup>The primate external and internal globus pallidus (GPe and GPi) are named globus pallidus (GP) and entopeduncular nucleus (EP), respectively, in rodents. We will refer to GPe and GPi for these nuclei in general.

In healthy awake animals, two electrophysiological cell types have been identified in the GPe based on their firing rates and patterns (DeLong, 1971; Bugaysen et al., 2010; Benhamou et al., 2012). 6-hydroxydopamine (6-OHDA) treated rats also showed clear differences in the firing rates and patterns between two distinct GP neuron populations *in vivo* (Mallet et al., 2008). In contrast, studies using healthy rat brain slices described three electrophysiological subgroups of neurons in the GP (Cooper and Stanford, 2000; Bugaysen et al., 2010). However, more recent *in vitro* studies reported no clear qualitative electrophysiological differences among GP neurons and challenge the existence of distinct GP neuron types (Chan et al., 2004, 2011; Hashimoto and Kita, 2006; Günay et al., 2008; Deister et al., 2013).

Nevertheless, an anatomical dichotomy has often been described in the GP (Cooper and Stanford, 2002; Hoover and Marshall, 2002; Nobrega-Pereira et al., 2010). For instance, a group of proenkephalin-positive neurons that preferentially target the striatum (Hoover and Marshall, 2002) and a small population of calretinin-positive interneurons (Cooper and Stanford, 2002) have been reported. Based on fate mapping analysis, as many as five neural populations have been identified in the mouse GP, which differ in progenitor lineage and partly in their embryonic domains (Nobrega-Pereira et al., 2010).

Recently, Mallet et al. (2012) combined anatomical and electrophysiological characteristics of classes of GP neurons. They described the existence of two distinct neural populations in the GP of 6-OHDA treated rats that have distinct molecular profiles and axonal connectivities. Neurons of the first population fired antiphasic to STN neurons, often expressed parvalbumin (PV) and targeted the STN or the EP. The second population constituted a novel cell type: neurons that fired in phase with STN neurons, expressed proenkephalin and innervated both projection neurons and interneurons of the striatum. Mallet et al. (2012) also found differences in the dendritic and axonal architectures of the two cell types. In particular, local axon collaterals of the first neural population were longer while their dendritic spine density was lower in comparison to the second population.

Altogether, the complex structure of the GPe on both the synaptic and cellular levels indicates that information processing within the GPe is possible and might be critical for modulating dynamics in the whole BG network.

#### **3. IMPORTANT CONTRIBUTION OF THE GPe TO THE PATHOPHYSIOLOGY OF PARKINSONISM**

The GPe is in a unique position to propagate synchronized oscillatory activity, since it projects to virtually all other BG nuclei (Mallet et al., 2008). Furthermore, its neurons possess intrinsic oscillatory properties, leading to a steady pacemaking function (Wilson, 2013). Nevertheless, β band oscillations in the GPe in parkinsonism commonly exhibit smaller amplitudes than those in the STN or the GPi (Stein and Bar-Gad, 2013).

An important hypothesis proposes that the GPe plays a major role in information processing in the dopamine depleted BG, in particular by interacting with the STN (Plenz and Kitai, 1999; Bevan et al., 2002; Terman et al., 2002; Fan et al., 2013). A study in 1-Methyl-4-Phenyl-1,2,3,6-Tetrahydropyridine (MPTP) treated monkeys showed that muscimol inactivation of the GPe to block its GABAergic outputs led to prominent reductions of β oscillations in the STN (Tachibana et al., 2011). The GPe is therefore assumed to regulate the presence of oscillations in the dopamine depleted BG, while the origins of these oscillations remain unclear.

Since neural activity abnormalities in PD are reversible with L-3,4-dihydroxyphenylalanine (L-Dopa) treatment, their emergence and reversal are both thought to be crucially dependent on dopamine levels, although possibly involving different mechanisms (Brown et al., 2001; Kühn et al., 2006; Tachibana et al., 2011). Although the literature on the effects of dopamine depletion often focuses on the striatum, PD patients lose about 82% of dopamine in the GPe (Rajput et al., 2008). This specific loss is also thought to contribute to motor symptoms, as supported by several studies investigating the influence of dopamine directly in the rat GP. Firstly, dopamine D1/D2 receptor blockade in the GP induced akinesia (Hauber and Lutz, 1999). Secondly, direct application of dopamine in the GP restored motor behavior in a 6-OHDA model (Galvan et al., 2001). Thirdly, injections of 6-OHDA in the GP induced dopamine depletion in both GP and striatum and mimicked the parkinsonian motor symptoms and neural activity abnormalities resulting from striatal 6-OHDA injections (Abedi et al., 2013).

These findings support the important role of dopamine depletion in the GPe for PD. Furthermore, the results of Abedi et al. (2013) additionally indicate that direct injury of the GPe could contribute to PD pathology. Indeed, Fernandez-Suarez et al. (2012) reported prominent cell death of PV-positive GABAergic GPe neurons, commonly projecting to STN and GPi, in 6-OHDA treated rats and in MPTP treated monkeys. In contrast, an earlier study by Hardman and Halliday (1999) did not describe abnormalities in the total number of PV-positive GPe neurons in PD patients. However, when considering cell density rather than absolute cell counts, death of GPe neurons is also possible here as seen in a trend toward a decrease of PV-positive neuron density (Fernandez-Suarez et al., 2012). Fernandez-Suarez et al. (2012) speculate that a loss of GABAergic GPe neurons could decrease inhibition of the STN and thus support its hyperactivity. Furthermore, the GPe may lose parts of its intrinsic structure, thereby forfeiting its ability to perform complex information processing. To prevent secondary cell death, adaptive processes could be triggered that may additionally impede information processing.

#### **4. POTENTIAL INTRA-GPe MECHANISMS FOR (DE)SYNCHRONIZATION**

Several mechanisms have been proposed that increase synchronization inside the GPe in parkinsonism or, conversely, that desynchronize this nucleus under healthy conditions. The majority of these mechanisms are based on interactions between the GPe and other nuclei, namely the STN and striatum (e.g., Alexander and Crutcher, 1990; Ingham et al., 1997; Plenz and Kitai, 1999; Terman et al., 2002; Kumar et al., 2011; Fan et al., 2013). In the following sections, we describe intrinsic GPe mechanisms for (de)synchronization, involving cellular and synaptic GPe properties.

#### **4.1. CELLULAR PROPERTIES**

Intrinsic properties of GPe neurons are determined by more than 10 voltage-gated ion channel types (Mercer et al., 2007; Günay et al., 2008; Jaeger and Kita, 2011). Changes in the expression or function of these channels can contribute to changes in activity dynamics and influence synchrony *in vivo*. Hyperpolarization-activated cyclic nucleotide-gated (HCN), small conductance calcium-activated potassium (SK), and fast, transient, voltage-dependent sodium (NaF) channels, as well as general cellular heterogeneity, have been proposed as contributors to desynchronization of the dopamine-intact GPe.

#### *4.1.1. HCN channel expression*

HCN channels are widely expressed in the dendrites of neurons in various parts of the brain such as the cortex, hippocampus, and thalamus (Poolos, 2012). They support pacemaking and can take part in sculpting synaptic responses (Chan et al., 2004).

Chan et al. (2004) proposed HCN channels in GP neuron dendrites as key determinants of regular spiking and synchronization. In a study of HCN channel function in 6-OHDA lesioned mice, Chan et al. (2011) uncovered an HCN channelopathy in GP neurons that accompanied pacemaking deficits. HCN channels located presynaptically on GP terminals are known to decrease the likelihood of GABA release (Boyes and Bolam, 2007). Viral delivery of HCN subunits and L-type calcium channel agonists restored pacemaking but did not improve motor symptoms, suggesting that the channelopathy might be an adaptive process and not causal for motor deficiency.

#### *4.1.2. SK channel expression*

SK channels are assumed to contribute to the firing dynamics in most excitable cells (Bond et al., 1999) and can modulate plasticity (Woodward et al., 2010).

Studies with brain slices of healthy rats (Deister et al., 2009) and computational models of GPe neurons (Deister et al., 2009; Schultheiss et al., 2010) proposed a mechanism of decorrelation via an SK current. Deister et al. (2009) showed that rat GP neurons express functional SK channels that contribute to the precision of autonomous firing in GP neurons. Strong SK currents can decrease the sensitivity of GPe neurons to smaller synchronized inputs (Deister et al., 2009). Phase response curve analysis suggests that dendritic SK channel expression controls synchronization by changing the phase dependence of synaptic effects on spike timing (Schultheiss et al., 2010).

SK channels can indirectly be modulated via dopamine (Ramanathan et al., 2008) and may therefore exhibit altered dynamics in PD.

### *4.1.3. NaF channel expression*

The initiation and propagation of action potentials in dendrites depend significantly on NaF channels (Hanson et al., 2004).

High expression of dendritic NaF channels has been suggested as a potential mechanism that actively decorrelates the GPe (Edgerton and Jaeger, 2011). In their computational model, Edgerton and Jaeger (2011) showed that neurons with low dendritic NaF channel expression have a high tendency to phase lock with synchronized synaptic input. They estimated that SK channel expression is only relevant in synchronizing neural activity if the dendritic NaF channel conductance is low compared to the conductance of other channels. Additionally, they reported that HCN channel expression did not significantly alter oscillatory firing, leaving dendritic NaF channel expression as the main factor in determining the phase-locking properties of neurons.

GP neurons of rats express dendritic NaF channels, and their distribution is enriched near sites of excitatory synaptic input (Hanson et al., 2004). Whether dendritic NaF channel expression actually decreases in PD has not yet been investigated. However, in other neuron types, it has been reported that NaF current density is subject to regulation through multiple pathways and on multiple timescales (Herzog et al., 2003; Hu et al., 2005; Xu et al., 2005), for example by dopamine D2 receptor-activated Ca<sup>2+</sup> signaling within a few minutes (Hu et al., 2005).

### *4.1.4. Cellular heterogeneity*

Recently, Deister et al. (2013) suggested cellular heterogeneity as an active decorrelation mechanism. They found that the heterogeneity in firing rates and patterns found in GP neurons in healthy rats is not due to multiple cell types or synaptic transmission but is rather caused by a change over time in cellular properties common to all neurons, leading to different cellular characteristics within minutes. Quantitative changes in the expression of HCN or other ion channels could underlie this dynamic cellular heterogeneity. Continuous variations in ion channel composition could account for the entire range of firing rates and patterns in the GPe (Günay et al., 2008).

Since neurons firing at widely different rates do not tend to synchronize with each other, this cellular heterogeneity may make the GPe less susceptible to synchronized inputs. Deister et al. (2013) therefore describe a powerful mechanism of decorrelation in the healthy GPe. However, changes in this heterogeneity after dopamine depletion have not yet been investigated.
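The argument that widely differing firing rates oppose synchronization can be illustrated with a generic phase-oscillator sketch. The following is a minimal Kuramoto-type toy model, not a model of GPe neurons: the all-to-all sinusoidal coupling, the parameter values, and the function name `mean_synchrony` are all assumptions chosen for illustration. It compares the time-averaged order parameter *r* (0 = incoherent, 1 = fully synchronized) for a narrow and a wide distribution of natural frequencies under identical coupling.

```python
import cmath
import math
import random


def mean_synchrony(freq_spread, coupling=1.0, n=50, dt=0.05, steps=1000, seed=0):
    """Time-averaged Kuramoto order parameter r for n coupled phase oscillators.

    All parameters are in arbitrary units and are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Natural frequencies: Gaussian around a common mean with the given spread.
    omega = [rng.gauss(10.0, freq_spread) for _ in range(n)]
    # Random initial phases.
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    r_sum, r_count = 0.0, 0
    for step in range(steps):
        # Mean field: r measures synchrony, psi is the mean phase.
        z = sum(cmath.exp(1j * th) for th in theta) / n
        r, psi = abs(z), cmath.phase(z)
        if step >= steps // 2:  # discard the initial transient
            r_sum += r
            r_count += 1
        # Euler step of d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i).
        theta = [th + dt * (w + coupling * r * math.sin(psi - th))
                 for th, w in zip(theta, omega)]
    return r_sum / r_count


r_homogeneous = mean_synchrony(freq_spread=0.05)
r_heterogeneous = mean_synchrony(freq_spread=2.0)
print(f"narrow rate distribution: r = {r_homogeneous:.2f}")
print(f"wide rate distribution:   r = {r_heterogeneous:.2f}")
```

In this toy setting, the same coupling strength that locks a nearly homogeneous population fails to lock a strongly heterogeneous one. Whether this carries over to the GPe depends on the actual synaptic and cellular properties discussed in this review.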

In summary, in the dopamine intact BG, HCN, SK, and NaF channels as well as cellular heterogeneity have been convincingly argued to contribute to neural dynamics in the GPe. A qualified hypothesis states that GPe neurons are not very dependent on synaptic input due to their intrinsic pacemaker function, potentially sustained by HCN and SK channel function (Wilson, 2013). Loss of the autonomous GPe activity could lead to correlation of neural activity by shared inputs. However, cellular changes in the GPe after dopamine depletion that could cause such a loss are rarely studied in experiments. It therefore seems likely that cellular properties contribute to desynchronization of the healthy GPe, but it remains unclear whether these properties induce synchronization after dopamine depletion.

## **4.2. SYNAPTIC PROPERTIES**

Synaptic coupling inside the GPe via local axon collaterals is well established (Francois et al., 1984; Kita, 1994; Sadek et al., 2007; Miguelez et al., 2012), although functional GPe connectivity is highly variable and depends on the brain state (Magill et al., 2006). Rat GP-GP synapses differ considerably from striatum-GP synapses, with a lower paired-pulse ratio and weak responses to stimulation (Sims et al., 2008).

Although little is known about the effects of GABAergic transmission within the GPe, connections from the GPe to the STN and the substantia nigra pars reticulata (SNr) are better described and may share characteristics of GPe-GPe connections. In rat brain slices, GP-STN connections have been found to be sparse, but sufficiently powerful to inhibit and synchronize the autonomous activity of STN neurons (Baufreton et al., 2009). Bursts of activity from the rat GP are also able to effectively silence the firing of SNr neurons, although SNr neurons can resume firing due to depression of these GP-SNr synapses (Connelly et al., 2010). A recent study demonstrates that GP-GP connections are also highly efficacious in modulating postsynaptic activity despite substantial short-term depression and sparse connectivity (Bugaysen et al., 2013).

### *4.2.1. Synaptic strength*

Miguelez et al. (2012) showed that GP-GP inhibitory synaptic transmission increased in a rat 6-OHDA model, leading to increased rebound bursting. This altered transmission may have major impacts on neural dynamics. Kita et al. (2004, 2006) demonstrated that specific blocking of GABA receptors in the monkey GPe regularizes neuron firing, indicating that GABAergic inhibition from the striatum and GPe regulates pallidal firing.

It is still unclear how strongly, and in which direction, inhibitory GPe-GPe coupling influences synchrony in parkinsonism. If their firing is periodic, coupling between GPe cells could either synchronize or desynchronize activity (Wilson, 2013). In the healthy GPe, given the pacemaking function of these neurons, local axon collaterals may act as desynchronizing elements (Sims et al., 2008; Wilson, 2013). However, after dopamine depletion, the effect of local axon collaterals could be reversed and synchronize activity (Wilson, 2013).
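That coupling between periodically firing cells can act in either direction can be sketched with two identical, mutually coupled phase oscillators. This is a deliberately reduced toy example under stated assumptions: the sinusoidal interaction function, the coupling values, and the name `final_phase_difference` are hypothetical and not taken from the cited studies. In real neurons, the effective direction depends on the phase response curve and synaptic kinetics rather than on a single signed constant. For the phase difference φ between the two oscillators, the model reduces to dφ/dt = −2K sin φ.

```python
import math


def final_phase_difference(coupling, phi0=1.0, dt=0.01, steps=5000):
    """Integrate d(phi)/dt = -2 * K * sin(phi) with the Euler method.

    phi is the phase difference between two identical oscillators that
    influence each other through a sinusoidal interaction of strength K
    (an illustrative assumption, in arbitrary units).
    """
    phi = phi0
    for _ in range(steps):
        phi += dt * (-2.0 * coupling * math.sin(phi))
    return phi


# Positive effective coupling pulls the pair together (phase difference -> 0).
in_phase = final_phase_difference(coupling=0.5)
# Negative effective coupling pushes the pair apart (phase difference -> pi).
anti_phase = final_phase_difference(coupling=-0.5)
print(f"K = +0.5: final phase difference = {in_phase:.3f} rad")
print(f"K = -0.5: final phase difference = {anti_phase:.3f} rad")
```

The same pair of oscillators thus settles into synchrony or anti-phase firing depending only on the sign of the effective interaction, which is one simple way to picture how a change in synaptic or cellular properties could reverse the effect of local collaterals.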

### *4.2.2. Synaptic architecture*

Highly heterogeneous synaptic coupling between GPe neurons could be a factor for their desynchronization. As heterogeneity on a cellular basis can act as a decorrelator, highly inhomogeneous coupling amongst neurons could lead to similar effects. Sadek et al. (2007) described the anatomical network of GP-GP axon collaterals in the rat as structured rather than homogeneously distributed. It can be speculated that through injury or adaptive remodeling, this structure may become damaged and lose the ability to desynchronize.

Although changes in synaptic transmission within the rat GP after dopamine depletion have been measured (Miguelez et al., 2012), the detailed intrinsic connectivity of the GPe still remains poorly understood. Nevertheless, it has become evident that this nucleus cannot be considered merely a homogeneous relay nucleus (Sadek et al., 2007). Further studies of its structural and functional connectivity, especially at different dopamine levels, are needed to shed light on information processing inside the GPe.

## **5. CONCLUSIONS**

Several lines of evidence emphasize the importance of intrinsic GPe properties in abnormal synchronization in parkinsonism. This makes the GPe an attractive target for future therapies, potentially involving direct pharmacological targeting.

Most of the evidence provided in this review is based on rodent studies, but the rodent GP may differ substantially from the human GPe in some aspects. Functionally, a lower average firing rate has been observed in the rodent GP compared to the primate GPe, while firing patterns were very similar (Benhamou et al., 2012). Anatomically, little is known about the level of human GPe local collateralization, although its existence is hardly debated (Francois et al., 1984). The rat GP is studied in more detail and shows a high level of complex local connections (Sadek et al., 2005, 2007).

Though often assumed, it remains unclear whether increased synchronization in the BG causes motor impairments in PD patients (Quiroga-Varela et al., 2013). The onset of the synchronization process occurs independently of the onset of motor symptoms in animal models of increasing levels of dopamine depletion (Leblois et al., 2007; Dejean et al., 2012). Nevertheless, the impact of β band synchronization on motor control remains an established assumption (Brittain and Brown, 2013).

A comprehensive mechanism for dopamine-dependent synchronization and desynchronization of the GPe is still missing. However, loss of pacemaker function in GPe neurons and altered function of GPe-GPe synapses are important candidates (Wilson, 2013). Although this review focuses on intrinsic GPe properties, we do not suggest that interactions in the BG network are less important. Synaptic input to the GPe, mainly from the STN, plays a major role in pallidal synchronization (Goldberg et al., 2003; Tachibana et al., 2011). We propose intrinsic mechanisms of the GPe as crucial in processing these synchronized or partly synchronized inputs.

After dopamine depletion, GP neurons undergo plastic changes in their synaptic and cellular structure (Chan et al., 2011; Miguelez et al., 2012; Wichmann and Smith, 2013), which may potentially trigger synchronized neural activity. However, further studies on ion channel remodeling after dopamine depletion and their effects on synchrony and motor performance are missing. Intrinsic GPe connectivity is still insufficiently described and may not be restricted to GABAergic transmission. We emphasize that special attention should be drawn to possible cell death in the GPe (Fernandez-Suarez et al., 2012). Adaptive processes could be triggered to prevent further cell death that may lead to altered neural activity, which might involve synaptic as well as cellular changes.

## **ACKNOWLEDGMENTS**

Bettina C. Schwab receives funding from the Netherlands Organization for Scientific Research (NWO) and the MIRA Institute for Biomedical Technology and Technical Medicine, University of Twente. Yan Zhao is supported by the BrainGain SmartMix Program and the Whitaker International Program.

#### **REFERENCES**


160, 229–243. doi: 10.1016/S0079-6123(06)60013-7


*Brain Res.* 929, 243–251. doi: 10.1016/S0006-8993(01)03263-2


model of parkinsonism. *J. Neurosci.* 22, 7850–7855.


Pathological synchronisation in the subthalamic nucleus of patients with Parkinson's disease relates to both bradykinesia and rigidity. *Exp. Neurol.* 215, 380–387. doi: 10.1016/j.expneurol.2008.11.008


in subthalamic nucleus neurons through inhibition of Cav2.2 channels. *J. Neurophysiol.* 99, 442–459. doi: 10.1152/jn.00998.2007


Milestones in research on the pathophysiology of Parkinson's disease. *Mov. Disord.* 26, 1032–1041. doi: 10.1002/mds.23695


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 July 2013; accepted: 13 September 2013; published online: 04 October 2013.*

*Citation: Schwab BC, Heida T, Zhao Y, Marani E, van Gils SA and van Wezel RJA (2013) Synchrony in Parkinson's disease: importance of intrinsic properties of the external globus pallidus. Front. Syst. Neurosci. 7:60. doi: 10.3389/fnsys.2013.00060*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Schwab, Heida, Zhao, Marani, van Gils and van Wezel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Role of movement in long-term basal ganglia changes: implications for abnormal motor responses

## *Nicola Simola<sup>1</sup>, Micaela Morelli<sup>1,2,3</sup>\*, Giuseppe Frazzitta<sup>4,5</sup> and Lucia Frau<sup>1</sup>*

*<sup>1</sup> Section of Neuropsychopharmacology, Department of Biomedical Sciences, University of Cagliari, Cagliari, Italy*

*<sup>2</sup> Center of Excellence for Neurobiology of Dependence, University of Cagliari, Cagliari, Italy*

*<sup>3</sup> National Council of Research (CNR), Institute of Neuroscience, Cagliari, Italy*

*<sup>4</sup> Department of Parkinson Disease Rehabilitation, "Moriggia-Pelascini" Hospital, Gravedona ed Uniti (Como), Italy*

*<sup>5</sup> Fondazione Europea Ricerca Biomedica (FERB), "S.Isidoro" Hospital, Trescore Balneario, Italy*

#### *Edited by:*

*Hagai Bergman, The Hebrew University, Israel*

#### *Reviewed by:*

*Thomas Boraud, Universite de Bordeaux, CNRS, France*

*Alexia Pollack, University of Massachusetts-Boston, USA*

#### *\*Correspondence:*

*Micaela Morelli, Section of Neuropsychopharmacology, Department of Biomedical Sciences, University of Cagliari, Via Ospedale 72, 09124 Cagliari, Italy. e-mail: morelli@unica.it*

Abnormal involuntary movements (AIMs) and dyskinesias elicited by drugs that stimulate dopamine receptors in the basal ganglia are a major issue in the management of Parkinson's disease (PD). Preclinical studies in dopamine-denervated animals have contributed to the modeling of these abnormal movements, but the precise neurochemical and functional mechanisms underlying these untoward effects are still elusive. It has recently been suggested that the performance of movement may itself promote the later emergence of drug-induced motor complications, by favoring the generation of aberrant motor memories in the dopamine-denervated basal ganglia. Our recent results from hemiparkinsonian rats subjected to the priming model of dopaminergic stimulation are in agreement with this. These results demonstrate that early performance of movement is crucial for the manifestation of sensitized rotational behavior, indicative of an abnormal motor response, and neurochemical modifications in selected striatal neurons following a dopaminergic challenge. Building on this evidence, this paper discusses the possible role of movement performance in drug-induced motor complications, with a look at the implications for PD management.

#### **Keywords: priming, movement, immobilization,** *zif-268***, dynorphin, 6-OHDA, striatonigral, Parkinson's disease**

Motor complications induced by dopamine replacement therapy (DRT) are the major untoward effects associated with the pharmacologic management of Parkinson's disease (PD). These complications include end-of-dose deterioration, motor fluctuations, and abnormal motor responses, the latter being very disabling and severely limiting the patient's quality of life. Results obtained in experimental animal models of PD have indicated that pulsatile stimulation of dopamine receptors following DRT is a key step in the emergence of abnormal motor responses, and that these untoward effects are associated with a malfunction in the signal transduction pathway of the dopamine D1 receptor (Gerfen et al., 1990; Nutt, 2007; Santini et al., 2008; Guigoni and Bezard, 2009; Stocchi, 2009). Nevertheless, the precise mechanisms that underlie abnormal motor responses caused by DRT are still to be elucidated.

Recent findings have suggested that the generation of aberrant procedural memories in striatal motor circuits could participate in the manifestation of abnormal motor responses associated with DRT (Calon et al., 2000; Pisani et al., 2005; Jenner, 2008; Simola et al., 2009; Frau et al., 2013). Thus, the striatum plays a major role in processes such as integration of motor signals, acquisition of motor habits, and execution of motor programs, which are all critically regulated by dopamine (Mink, 1996; Packard and Knowlton, 2002; Gerdeman et al., 2003; Tang et al., 2007; Willuhn and Steiner, 2008). Starting from these premises, it has been hypothesized that the dopamine-denervated striatum fails to properly process motor information, and that this may result in an overload of striatal motor circuits following the performance of movement stimulated by drugs that activate dopamine receptors (Picconi et al., 2005; Pisani et al., 2005). This process, in turn, would promote pathologic motor learning, and eventually the onset of abnormal motor responses, such as dyskinesia (Picconi et al., 2005; Jenner, 2008). Therefore, the performance of movement might itself play a role in the emergence of abnormal motor responses caused by DRT. Interestingly, studies in both dopamine-denervated experimental animals and PD patients provide support to this view, by showing that physical activity may influence the severity of abnormal motor responses triggered by repeated administration of dopaminergic drugs (Reuter et al., 1999, 2000; Frazzitta et al., 2012; Aguiar et al., 2013). Moreover, recent evidence obtained in an experimental model of abnormal motor responses in hemiparkinsonian rats has provided a direct demonstration of an important role of movement performance in the emergence of these untoward effects (Simola et al., 2009; Frau et al., 2013).

## **CRITICAL EVALUATION OF ABNORMAL MOTOR RESPONSES IN EXPERIMENTAL ANIMAL MODELS OF PD**

Studies in experimental animals have dramatically contributed to the modeling of abnormal motor responses induced by dopaminergic drugs and elucidation of their mechanisms, and important results in this sense have been obtained in the unilaterally 6-hydroxydopamine (6-OHDA)-lesioned rat. Briefly, this animal model is characterized by a hemiparkinsonism subsequent to the infusion of 6-OHDA in the nigrostriatal pathway, which manifests as unilateral forelimb akinesia (Simola et al., 2007). Moreover, when treated with drugs that stimulate dopamine receptors, 6-OHDA-lesioned rats display a characteristic contralateral rotational behavior directed away from the site of toxin infusion, which is indicative of the antiparkinsonian effectiveness of the drug (Deumens et al., 2002; Simola et al., 2007). However, it is worth emphasizing that the repeated administration of drugs that stimulate dopamine receptors induces a sensitization in contralateral rotational behavior which reproduces the same biochemical changes observed in rats displaying dyskinetic-like abnormal involuntary movements (AIMs). In fact, abnormal motor responses induced by repeated treatment with dopaminergic drugs can be modeled in 6-OHDA-lesioned rats by measuring two types of behaviors: AIMs and sensitization in contralateral rotational behavior. AIMs consist of repetitive and purposeless movements of limbs and trunk, and are a reliable rodent model of human dyskinesias (Cenci et al., 1998; Lindgren et al., 2007). Sensitization in contralateral rotational behavior is also indicative of abnormal motor responses to dopaminergic drugs, since the intensity of this phenomenon directly correlates with the prodyskinetic potential of these drugs (Henry et al., 1998; Pinna et al., 2006; Carta et al., 2008).

With regard to sensitization in contralateral rotational behavior, it is worth mentioning the priming model. Priming involves a first administration of a dopamine D1/D2 receptor agonist (induction phase) that stimulates contralateral rotational behavior, followed, 3 days later, by the administration of a highly dyskinetic D1 receptor agonist (expression phase), at otherwise scarcely effective doses on rotational behavior (Morelli et al., 1989). Primed hemiparkinsonian rats display a vigorous contralateral rotational behavior on the expression phase, which is associated with neurochemical modifications in the striatum that are peculiar to experimental paradigms of prolonged administration of dopaminergic drugs (Pollack et al., 1997; van de Witte et al., 1998; Simola et al., 2007; Scholz et al., 2008; Nadjar et al., 2009). This evidence, therefore, indicates that the priming model is highly valuable for investigating the mechanisms that underlie abnormal motor responses to dopaminergic drugs in hemiparkinsonian rats.

## **MOVEMENT PERFORMANCE FOLLOWING INITIAL DOPAMINERGIC STIMULATION ENABLES THE MANIFESTATION OF SENSITIZED ROTATIONAL BEHAVIOR IN PRIMED HEMIPARKINSONIAN RATS**

Hemiparkinsonian 6-OHDA-lesioned drug-naïve rats that are treated with the D1/D2 agonist apomorphine (0.2 mg/kg, s.c.) during priming induction and left to rotate freely in response to the drug exhibit a marked contralateral rotational behavior when challenged 3 days later with the D1 receptor agonist 1-Phenyl-2,3,4,5-tetrahydro-(1H)-3-benzazepine-7,8-diol (SKF 38393, 3 mg/kg, s.c.), which is indicative of an abnormal motor response (Morelli et al., 1989; Simola et al., 2009). This behavioral effect of SKF 38393 has been found to be almost completely abolished in rats subjected to the same pharmacologic treatment that were immobilized for 1 h in a restrainer apparatus during priming induction, so that they could not perform rotational behavior in response to apomorphine (**Figure 1**; Simola et al., 2009). Importantly, immobilization has been demonstrated to suppress priming expression only when imposed concomitantly to apomorphine administration, but not immediately before or immediately after priming induction, therefore excluding a non-specific effect of immobilization (Simola et al., 2009). Moreover, the influence of immobilization on the effects of SKF 38393 has been shown not to be affected by the elevation in stress hormones that is associated with this procedure. This is clearly demonstrated by the finding that immobilization during priming induction retained the suppressant effects on priming expression even in rats treated with metyrapone (Simola et al., 2009), which prevents the elevation in stress glucocorticoid hormones that may be caused by immobilization (Calvo and Volosin, 2001). This finding is very important, since previous evidence has demonstrated that stress may have a profound influence on the behavioral and neurochemical effects of movement performance (Howells et al., 2005). 
Taken together, these findings indicate that the suppression of priming expression in rats immobilized during priming induction is attributable solely to the fact that immobilization prevented rats from performing movement in response to the initial dopaminergic stimulation (Simola et al., 2009). Therefore, the results obtained in hemiparkinsonian rats subjected to the priming model have provided the first direct demonstration that the performance of drug-stimulated movement may be crucial for the later emergence of abnormal motor responses to repetitive administration of dopaminergic drugs.

[Figure caption, partial] … was followed by priming expression with SKF 38393 (3 mg/kg s.c.), 3 days later. Rats were either allowed to rotate or were immobilized during priming induction. \**p* < 0.05 vs. non-primed rats and primed immobilized rats.

## **MOTOR PERFORMANCE FOLLOWING INITIAL DOPAMINERGIC STIMULATION TRIGGERS NEUROCHEMICAL MODIFICATIONS IN SELECTED STRIATAL EFFERENT NEURONS OF PRIMED HEMIPARKINSONIAN RATS**

In order to evaluate whether or not movement performance triggers changes indicative of neuronal long-term modifications, several biochemical and molecular parameters have been evaluated in the striatum of hemiparkinsonian drug-naïve and primed rats. The results obtained have demonstrated that both sensitized rotational behavior on the expression of priming and dyskinetic-like movements in dopamine-denervated animals repeatedly treated with dopaminergic drugs are associated with long-term changes in the production of striatal cyclic adenosine monophosphate (cAMP; Pinna et al., 1997), phosphorylation of dopamine- and cAMP-regulated neuronal phosphoprotein (DARPP-32; Santini et al., 2007), and expression of mRNAs encoding for different proteins and immediate early genes (IEGs; Barone et al., 1994; Cenci et al., 1998, 2009; Crocker et al., 1998; van de Witte et al., 1998; Carta et al., 2003, 2008; Aubert et al., 2005). In this regard, the IEG *zif-268* has recently been shown to be a useful marker of neuronal modification that involves striatal efferent pathways in animal models of abnormal motor responses (Carta et al., 2008, 2010).

*zif-268* (also known as *Egr-1*, *Krox-24*, *NGFI-A*, or *Zenk*) belongs to a class of inducible IEGs that encode regulatory transcription factors, and that have been implicated in diverse processes in a variety of cell types, including cell growth, differentiation, and apoptosis in response to extracellular stimuli (Gashler and Sukhatme, 1995). In the brain, *zif-268* mRNA and protein are constitutively expressed in several areas, such as the neocortex and hippocampus, and, of great importance in PD, the striatum (Christy et al., 1988; Mack et al., 1990; Schlingensiepen et al., 1991; Worley et al., 1991). Moreover, *zif-268* can be rapidly and transiently induced by a variety of pharmacologic and physiologic stimuli, including neurotransmitters, growth factors, peptides, depolarization, seizures, ischemia, and brain injury or cellular stress (Gashler and Sukhatme, 1995; Beckmann and Wilce, 1997). As mentioned above, *zif-268* may represent a useful and sensitive neurochemical marker in the evaluation of abnormal motor and neuronal responses associated with the onset of dyskinesia in dopamine-denervated animals treated with dopaminergic drugs (Carta et al., 2005, 2008). Thus, drugs such as L-3,4-dihydroxyphenylalanine (L-DOPA) and SKF 38393, which induce severe dyskinetic-like movements, markedly increase the levels of *zif-268* mRNA in striatal neurons, after both acute and subchronic treatment (Carta et al., 2005, 2008, 2010). In contrast, treatment with drugs that elicit a scarce dyskinetic-like response, such as ropinirole, does not produce any elevation in *zif-268* (Carta et al., 2010). Notably, the expression of mRNA encoding *zif-268* was selectively increased in the direct striatonigral pathway, which seems to be the pathway most involved in the development of dyskinetic movements (Carta et al., 2010).

Similar to these findings, experiments in 6-OHDA-lesioned hemiparkinsonian rats subjected to the priming model have shown that the administration of the D1 receptor agonist SKF 38393 on priming expression induces an increase in striatal *zif-268* mRNA (Frau et al., 2013). Moreover, analysis at the single-cell level showed that only enkephalin(−) striatonigral neurons, which belong to the direct pathway, displayed a significantly higher expression of *zif-268* following SKF 38393. On the other hand, enkephalin(+) striatopallidal neurons, which belong to the indirect pathway, which is less involved in abnormal motor responses elicited by dopaminergic drugs (Carta et al., 2010), did not show any significant modifications in the levels of *zif-268*. No significant differences in this effect were observed when the results from primed rats that performed rotational behavior during priming induction were compared with those obtained in rats that were immobilized on priming induction (Frau et al., 2013). This latter finding seems to suggest that modifications in *zif-268* mRNA levels observed in the enkephalin(−) striatonigral neurons of hemiparkinsonian rats during priming expression are not influenced by the fact that the rats could perform rotational behavior during priming induction. In this regard, it is worth mentioning that enkephalin(−) striatonigral neurons include two subpopulations: substance P(+) and dynorphin(+) neurons. Analysis of *zif-268* in these neuronal populations has demonstrated a selective increase of this IEG in dynorphin(+) striatonigral neurons of rats primed with SKF 38393 that performed rotational behavior during priming induction, compared with primed rats immobilized on priming induction (**Figure 2**). 
This finding demonstrates, in the first place, a critical role of drug-stimulated movement performance in the emergence of neurochemical modifications in striatal neurons of dopamine-denervated rats subjected to repetitive administration of dopaminergic drugs, and that the dynorphin(+) neurons are selectively involved in the long-term modifications caused by early motor performance in the priming model (Frau et al., 2013). Moreover, the intensity of rotational behavior on priming expression was found to correlate positively with the levels of *zif-268* mRNA in dynorphin(+) neurons (Frau et al., 2013). This finding is very interesting, as it indicates a relationship between early performance of drug-stimulated movement and appearance of neurochemical adaptations associated with an abnormal motor response to dopaminergic drugs (Frau et al., 2013). With regard to the modifications in the levels of *zif-268* in the priming model, it is also relevant to observe that *zif-268* is rapidly induced in certain forms of learning, or after long-term potentiation (Lanahan and Worley, 1998; O'Donovan et al., 1999; Tischmeyer and Grimm, 1999; Bozon et al., 2002), and has recently been shown to be necessary for the formation of different forms of long-term memory (Jones et al., 2001). Together with the results obtained in the priming model, this finding would provide support to the hypothesis suggesting that abnormal motor responses to repetitive administration of dopaminergic drugs in conditions of dopaminergic denervation might involve the generation of abnormal procedural memories in striatal motor circuits.

## **CONCLUSIONS**

The results obtained in 6-OHDA-lesioned hemiparkinsonian rats subjected to the priming model demonstrate that the early performance of drug-stimulated movement promotes the later emergence of an abnormal motor response and striatal neurochemical adaptations following the subsequent administration of dopaminergic drugs with pro-dyskinetic potential (Simola et al., 2009; Frau et al., 2013). These results may appear to contrast with recent studies in experimental animals and PD patients that demonstrate how performance of movement in the form of physical training and exercise may improve motor deficits and even ameliorate dyskinesias (Goodwin et al., 2008; Döbrössy et al., 2010; Frazzitta et al., 2010; Dutra et al., 2012; Frazzitta et al., 2012; Aguiar et al., 2013). In this regard, it is worth considering that extensive neuroplasticity takes place in the striatum, which regulates movement performance, that physical activity may interfere with these neuroplastic phenomena, eventually influencing the execution of movement at a later time, and that neuroplasticity can be profoundly modified in conditions of dopamine denervation (Tillerson et al., 2001; Packard and Knowlton, 2002; Smith and Zigmond, 2003; Schouenborg, 2004; Graybiel, 2005). Therefore, it can be hypothesized that drug-stimulated movement and voluntary physical activity, given their different nature, might result in distinct neuroplastic adaptations in the dopamine-denervated striatum, leading to different effects on abnormal motor responses. Thus, irrepressible movement stimulated by dopaminergic drugs could overload striatal motor circuits with redundant information and promote pathologic motor learning, eventually triggering the generation of aberrant habits, which may manifest as abnormal motor responses, such as dyskinesias (Calon et al., 2000; Picconi et al., 2005; Jenner, 2008). 
On the other hand, physical activity in the framework of therapeutic programs could compete with purposeless movements triggered by dopaminergic drugs, therefore counteracting the generation of abnormal procedural mnemonic traces in the striatum, and thus ameliorating motor performance and mitigating dyskinesias.

In summary, the results obtained in hemiparkinsonian rats subjected to the priming model suggest that the performance of movement in response to an initial stimulation of dopamine receptors in the dopamine-denervated striatum plays a key role in the emergence of both abnormal motor responses and specific neuroadaptive changes in response to a later dopaminergic challenge. These results may help understand the initial molecular events that are at the basis of motor complications, such as dyskinesia, associated with DRT used to manage PD.

#### **AUTHOR CONTRIBUTIONS**

Nicola Simola and Lucia Frau: writing of the first draft of the manuscript and review. Giuseppe Frazzitta and Micaela Morelli: manuscript review and critique.

#### **ACKNOWLEDGMENTS**

Dr. Nicola Simola gratefully acknowledges Sardinia Regional Government for the financial support (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2007–2013, Axis IV Human Resources, Objective l.3, Line of Activity l.3.1 "Avviso di chiamata per il finanziamento di Assegni di Ricerca"). Dr. Lucia Frau acknowledges the Sardinian Regional Government for financial support (Legge Regionale 7 Agosto 2007, N.7, attivazione contratto con giovani ricercatori, contratto 150/2013).

#### **REFERENCES**

Aguiar, A. S. Jr., Moreira, E. L., Hoeller, A. A., Oliveira, P. A., Córdova, F. M., Glaser, V., et al. (2013). Exercise attenuates levodopa-induced dyskinesia in 6-hydroxydopamine-lesioned mice. *Neuroscience* 243, 46–53. doi: 10.1016/j.neuroscience.2013.03.039

Aubert, I., Guigoni, C., Håkansson, K., Li, Q., Dovero, S., Barthe, N., et al. (2005). Increased D1 dopamine receptor signaling in levodopa-induced dyskinesia. *Ann. Neurol.* 57, 17–26. doi: 10.1002/ana.20296

Barone, P., Morelli, M., Popoli, M., Cicarelli, G., Campanella, G., and Di Chiara, G. (1994). Behavioural sensitization in 6-hydroxydopamine lesioned rats involves the dopamine signal transduction: changes in DARPP-32 phosphorylation. *Neuroscience* 61, 867–873. doi: 10.1016/0306-4522(94)90409-X

DOPA treatment in the basal ganglia and their relevance to the development of dyskinesia. *Parkinsonism Relat. Disord.* 15, S59–S63. doi: 10.1016/S1353-8020(09)70782-5

factors. *Prog. Nucleic Acid Res. Mol. Biol.* 50, 191–224. doi: 10.1016/S0079-6603(08)60815-6

levodopa-induced dyskinesia and signal transduction. *FEBS J.* 275, 1392–1399. doi: 10.1111/j.1742-4658.2008.06296.x

dependent on D1 dopamine receptors in the striatum. *Neuroscience* 153, 249–258. doi: 10.1016/j.neuroscience.2008.01.041

Worley, P. F., Christy, B. A., Nakabeppu, Y., Bhat, R. V., Cole, A. J., and Baraban, J. M. (1991). Constitutive expression of zif268 in neocortex is regulated by synaptic activity. *Proc. Natl. Acad. Sci. U S A* 88, 5106–5110. doi: 10.1073/pnas.88.12.5106

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 July 2013; accepted: 30 September 2013; published online: 23 October 2013.*

*Citation: Simola N, Morelli M, Frazzitta G, and Frau L (2013) Role of movement in long-term basal ganglia changes: implications for abnormal motor responses. Front. Comput. Neurosci. 7:142. doi: 10.3389/fncom.2013.00142 This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Simola, Morelli, Frazzitta and Frau. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The role of frontostriatal impairment in freezing of gait in Parkinson's disease

#### *James M. Shine<sup>1</sup>\*, Ahmed A. Moustafa<sup>2</sup>, Elie Matar<sup>1</sup>, Michael J. Frank<sup>3</sup> and Simon J. G. Lewis<sup>1</sup>*

*<sup>1</sup> Parkinson's Disease Clinic, Brain and Mind Research Institute, The University of Sydney, Sydney, NSW, Australia*

*<sup>2</sup> School of Social Sciences and Psychology, Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia*

*<sup>3</sup> Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University, Providence, RI, USA*

#### *Edited by:*

*Hagai Bergman, The Hebrew University, Jerusalem, Israel*

#### *Reviewed by:*

*Alessandro Stefani, University of Rome, Italy*

*Fahad Sultan, University Tübingen, Germany*

#### *\*Correspondence:*

*James M. Shine, Parkinson's Disease Research Clinic, Brain and Mind Research Institute, The University of Sydney, 94 Mallet St, Camperdown, Sydney, NSW 2050, Australia*

*e-mail: mac.shine@sydney.edu.au*

Freezing of gait (FOG) is a disabling symptom of advanced Parkinson's disease (PD) that leads to an increased risk of falls and nursing home placement. Interestingly, multiple lines of evidence suggest that the manifestation of FOG is related to specific deficits in cognition, such as set shifting and the ability to process conflict-related signals. These findings are consistent with the specific patterns of abnormal cortical processing seen during functional neuroimaging experiments of FOG, implicating increased neural activation within cortical structures underlying cognition, such as the Cognitive Control Network. In addition, these studies show that freezing episodes are associated with abnormalities in the BOLD response within key structures of the basal ganglia, such as the striatum and the subthalamic nucleus. In this article, we discuss the implications of these findings on current models of freezing behavior and propose an updated model of basal ganglia impairment during FOG episodes that integrates the neural substrates of freezing from the cortex and the basal ganglia to the cognitive dysfunctions inherent in the condition.

#### **Keywords: Parkinson's disease, freezing of gait, functional decoupling, subthalamic nucleus, pedunculopontine tegmental nucleus**

#### **INTRODUCTION**

Freezing of Gait (FOG) is a common disabling symptom of Parkinson's disease (PD) that typically manifests itself as a sudden inability to walk, despite the intention to move forward (Giladi et al., 1997; Nutt et al., 2011). Alternatively, the condition can manifest as an inability to turn in a tight circle or to traverse confined spaces, such as narrow doorways (Spildooren et al., 2010). In addition, freezing episodes occur more frequently during the performance of a concurrent cognitive task while walking (Jacobs et al., 2009; Lewis and Barker, 2009; Spildooren et al., 2010) and the phenomenon has also been associated with deficits in a number of executive functions (Amboni et al., 2008), including impairments in attentional set-shifting (Naismith and Lewis, 2010; Shine et al., 2012a) and response conflict resolution (Vandenbossche et al., 2012; Matar et al., 2013). Due to the vast range of conditions that can either provoke or relieve freezing behavior, there is currently a lack of consensus regarding the underlying mechanism of freezing (Nutt et al., 2011; Shine et al., 2011).

This lack of fundamental understanding of the condition has also severely limited the current therapeutic options for freezing. For example, FOG is known to show only partial amelioration with dopaminergic medication (Shine et al., 2011) and there is even evidence that some patients only experience freezing during the "on" state (Espay et al., 2012) (though this is a separate syndrome that is not the focus of this paper). In addition, freezing behavior shows a variable response to deep brain stimulation (DBS) therapy targeting either the subthalamic nucleus (STN) (Niu et al., 2012) or the pedunculopontine nucleus (PPN) (Lewis and Barker, 2009; Follett and Torres-Russotto, 2012). Due to the complex interplay of cognitive, affective, and motor impairments in the disorder, along with variable responses to dopaminergic and electrophysiological therapies, the pathophysiology remains poorly understood at the neural level (Nutt et al., 2011; Shine et al., 2011).

It is clear that a common pathophysiological mechanism of FOG must encompass each of these associated features, linking the factors that *provoke* freezing behavior with those that are involved in the *manifestation* of the symptom. In this perspective, we will use recent insights from neuroimaging experiments to sketch a framework that will link the provocation and manifestation of freezing into a single unified mechanism.

#### **EVIDENCE FROM NEUROIMAGING STUDIES**

A number of recent studies have used MRI techniques to explore group differences in the structural integrity of the brain. Voxel-based morphometry has shown that patients with FOG have reduced gray matter volume in the posterior cingulate cortex and precuneus (Tessitore et al., 2012a). Using resting state functional connectivity, Tessitore and colleagues also reported that patients with FOG have impaired functional connectivity within the frontoparietal networks sub-serving attentional functions (Tessitore et al., 2012b). In addition, studies using diffusion tensor imaging have demonstrated that FOG is associated with poor connectivity between the PPN and the cerebellum (Schweder et al., 2010; Fling et al., 2013), as well as the thalamus and frontal cortex (Fling et al., 2013).

Despite the differences in the structural integrity of the brain in patients with FOG, it is not clear whether the underlying pathophysiology is localized to brainstem or frontostriatal systems. However, the results do broadly suggest impairments in global attentional mechanisms, perhaps implicating thalamic dysfunction (Sadaghiani et al., 2010), or in the effective updating of motor plans, through impaired communication between the cerebellum and the brainstem structures controlling gait (Ito, 2008).

Despite the discovery of these structural abnormalities in patients with FOG, the paroxysmal nature of freezing behavior suggests that the disorder is predominantly functional, manifesting only in response to a specific combination of abnormal neuronal patterns. As such, recent research studies have utilized novel behavioral tasks [such as mental imagery (Snijders et al., 2010) and virtual reality (VR) (Matar et al., 2013; Shine et al., 2013a)] that are able to explore the neuronal abnormalities associated with freezing events in susceptible patients without requiring the execution of actual gait. For example, a recent fMRI study investigated the functional MRI changes associated with a motor imagery task in a group of patients with FOG (Snijders et al., 2010). In this study, patients with and without clinical FOG were required to imagine themselves walking along a presented pathway. Despite the lack of any overt movements in the task, the patients with FOG showed a preferential activation in the mesencephalic locomotor region (MLR), a brainstem structure associated with the neural control of gait (Lemon, 2008) and the production of anticipatory postural adjustments, which are abnormal in patients with FOG (Jacobs and Horak, 2007; Snijders et al., 2010). As mentioned above, these brainstem regions also show a lack of white matter connectivity to cortical, thalamic, and cerebellar structures (Schweder et al., 2010; Fling et al., 2013). Together, these results suggest that FOG may be due to pathological processes affecting the brainstem structures controlling the processing of gait.

In contrast, functional neuroimaging experiments exploiting VR, where patients use foot pedals to navigate a three-dimensional environment presented on a screen, offer the ability to probe neural activity when a patient is actually performing a motor task. Recent work has demonstrated that freezing behavior can be provoked during this paradigm with patients experiencing paroxysmal episodes where they are unable to move their feet (Naismith and Lewis, 2010). Indeed, the amount of freezing elicited in the VR environment has been correlated with both self-reported (Naismith and Lewis, 2010) and clinically observed FOG (Shine et al., 2012b). The ability to elicit this freezing behavior in the fMRI setting has allowed insights into the neural correlates of FOG (Shine et al., 2013a,b).

One such experiment explored the fMRI differences between periods of normal walking and periods of freezing elicited during the VR task (Shine et al., 2013a). The freezing episodes were associated with a significant increase in the BOLD response within the bilateral dorsolateral prefrontal cortex and the posterior parietal cortices (Shine et al., 2013a), which was consistent with the findings from a number of previous neuroimaging experiments that have implicated frontoparietal dysfunction in the pathophysiology of freezing (Bartels and Leenders, 2008). In addition, the motor arrests were associated with a concomitant decrease in the BOLD response within the bilateral caudate nucleus, suggesting a potential dissociation between the cortical and subcortical members of the frontostriatal pathways involved in executive function (Alexander and DeLong, 1986). Indeed, a more recent neuroimaging experiment utilizing the same VR paradigm has shown that motor arrests on the VR task are related to a paroxysmal functional decoupling between the cortical and subcortical regions of the frontostriatal system (Shine et al., 2013c).

Motor arrests on the VR task were also associated with significant abnormalities in the BOLD response within the internal segment of the globus pallidus (GPi), the STN and the MLR (Shine et al., 2013a). Although initially counter-intuitive, one likely explanation put forward for this finding is that during freezing, the GPi and STN enter into an oscillatory state (Timmermann et al., 2007; Spildooren et al., 2010), decreasing their need for oxygen (Buzsáki and Draguhn, 2004) and lowering the relative signal in the BOLD response (Zumer et al., 2010). The oscillatory activity between these two nuclei would ultimately manifest as overwhelming inhibition of the MLR, leading to a decrease in the neuronal firing in this nucleus (Spildooren et al., 2010).

While these conclusions are speculative, there is a wealth of experimental evidence from electrophysiological recording studies that implicate abnormal oscillatory dynamics in the pathophysiology of akinetic Parkinsonian symptoms, particularly in the "theta" (Sarnthein and Jeanmonod, 2007) and "beta" (∼13–20 Hz) frequency bands (Hammond et al., 2007; Mallet et al., 2008a,b; Degos et al., 2009; Weinberger and Dostrovsky, 2011). Although there is some debate about the precise source of these oscillatory signals (McCarthy et al., 2011; Tsang et al., 2012), increased activity in the STN is presumed to play a prominent role (Mallet et al., 2008a,b). Therefore, it is a direct prediction of this theory that freezing episodes should be associated with a temporary increase in beta synchrony within the STN (Brown, 2006), a prediction that is aligned with local field potential recordings in patients with FOG (Weinberger et al., 2006; Singh et al., 2012). In addition, the relative inactivation of the STN following high-frequency DBS surgery has also been shown to alleviate freezing behavior (Niu et al., 2012); however, it is not currently clear whether this is due to inactivation of the STN, decreased oscillatory synchrony between the cortex and basal ganglia or some other related mechanism (Bar-Gad et al., 2003; Hammond et al., 2007; Johnson et al., 2008; Weinberger and Dostrovsky, 2011). These results highlight the key role of subcortical activity in the pathophysiology of FOG, which may reflect an inability to effectively "update" motor sets during ongoing motor task performance (Chee et al., 2009). Furthermore, this interpretation is aligned with recent neurophysiological studies that explored electrical recordings directly from STN and PPN during DBS surgery (Singh et al., 2012; Thevathasan et al., 2012); however, these patterns have not been confirmed during dynamic exploration of specific episodes of FOG.
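The prediction of increased beta synchrony during freezing could, in principle, be quantified as the fraction of spectral power that falls in the beta band of a local field potential trace. The following is only a minimal sketch of that idea and not the analysis used in the cited recording studies: the function name `band_power_fraction`, the sampling rate, and the synthetic "walking" and "freezing" signals are illustrative assumptions, and only the approximate 13–20 Hz band is taken from the text.

```python
import numpy as np

def band_power_fraction(signal, fs, f_lo, f_hi):
    """Fraction of total non-DC spectral power lying in [f_lo, f_hi] Hz."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                 # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2     # power at each frequency bin
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    total = spectrum[1:].sum()                      # skip the DC bin
    return spectrum[in_band].sum() / total if total > 0 else 0.0

# Synthetic traces (assumed values): 2 s sampled at 1000 Hz.
fs = 1000.0
t = np.arange(int(2.0 * fs)) / fs
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, t.size)
walking = noise + 0.2 * np.sin(2 * np.pi * 16 * t)   # weak 16 Hz component
freezing = noise + 2.0 * np.sin(2 * np.pi * 16 * t)  # strong 16 Hz component

beta_walk = band_power_fraction(walking, fs, 13.0, 20.0)
beta_freeze = band_power_fraction(freezing, fs, 13.0, 20.0)
print(f"beta fraction, walking: {beta_walk:.3f}; freezing: {beta_freeze:.3f}")
```

With real recordings, a windowed estimate (for example, the same measure computed in short sliding windows) would be needed to follow the transition from walking to freezing over time.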

In addition to these sub-cortical processes, it is clear that there is likely to be a cortical component to the freezing phenomenon. For instance, a further fMRI experiment utilizing the VR gait task highlighted above showed that, although PD patients were able to recruit the dorsolateral prefrontal cortex and the posterior parietal cortex during the dual-processing of concurrent cognitive and motor tasks, those with FOG had decreased activity in the pre-supplementary motor area (pSMA) and the bilateral ventral striatum (Shine et al., 2013b). Although these regions are involved in a number of functions, this limited activity may represent the inability to effectively "shift" between competing attentional networks (Spildooren et al., 2010; Menon, 2011; Nutt et al., 2011; Shine et al., 2011) or could indicate an impairment in processing of error-related neuronal signals during the VR task (Salamone and Correa, 2002). This failure in error processing could relate more specifically to dysfunction within the hyper-direct pathway of the basal ganglia, which links specific regions of the frontal cortex (including the pSMA) with the STN of the basal ganglia (Nachev et al., 2008; Cavanagh and Frank, 2013; Haynes and Haber, 2013) and has previously been suggested to have a role in the processing of response conflict (Cavanagh et al., 2012). Furthermore, a role for the hyper-direct pathway in FOG is aligned with a recent electroencephalography (EEG) study that has shown that the transition from walking to freezing is associated with large increases in activity within the theta frequency sub-band (Handojoseno et al., 2012). This shift in EEG frequency has been broadly aligned with the response to conflict processing in the motor and pre-motor cortices (Nachev et al., 2008) and has also been related to increased response caution, reflected in a slowing of reaction time during conflict (Cavanagh et al., 2011). 
Furthermore, conflict-related theta power in cortical scalp electrodes was mirrored in the activation of STN local field potentials, further implicating the hyper-direct pathway in conflict processing (Cavanagh et al., 2011).

Together, these studies highlight the role of dysfunctional neuronal processing at multiple hierarchical levels of the nervous system, including brainstem and spinal cord structures that control gait and posture, frontoparietal cortical regions that subserve executive functions, and subcortical basal ganglia structures that link the two levels together.

#### **AN UPDATED MODEL OF FREEZING BEHAVIOR**

A key feature of freezing behavior is that it can be triggered by a range of differing factors including the processing of cognitive (Lewis and Barker, 2009), sensorimotor (Almeida and Lebold, 2010; Ehgoetz Martens et al., 2013) and limbic/affective information (Ehgoetz Martens et al., 2013). Although on the surface these aspects of brain function all appear quite dissimilar, they can all potentially lead to a state of increased response conflict processing in the brain in patients with PD (Vandenbossche et al., 2012).

Through its connection to the pSMA (the so-called "hyperdirect" pathway of the basal ganglia), the STN is able to bypass the striatum and directly drive an increase in inhibitory GABAergic output from the output structures of the basal ganglia, such as the internal segment of the globus pallidus (GPi). Increased activity in the GPi, which is a member of the direct pathway of the basal ganglia, leads to an increase in the rate of inhibitory output onto the brainstem and thalamic structures that control the output of effective motor behaviors (Frank, 2006) (see **Figure 1**). Given its key role in regulating activity in the basal ganglia circuitry, the STN is therefore able to modulate response conflict until such time as an appropriate response can be made (Frank, 2006). Through its processing of conflict-related neuronal signals and the modulation of inhibitory tone within the basal ganglia, the STN therefore serves as a putative neuroanatomical link between the provocation and the manifestation of freezing behavior. During the processing of increased conflict, regardless of its modality (e.g., processing cognitive information whilst maintaining motor output), activity within the STN could act to "trigger" overwhelming increases in the activity of the output structures of the basal ganglia, effectively shutting down their targets through GABAergic inhibition.

**FIGURE 1 | Freezing is due to decreased input and output of the basal ganglia.** In periods of high response conflict, the subthalamic nucleus (STN) increases its firing rate, which subsequently drives increased activity within the internal segment of the globus pallidus (GPi), leading to decreased activity within the thalamus (Thal) and the pedunculopontine nucleus (PPN). While the triggering event in this model is currently unknown, it may be due to: (1) inherent impairment within the pre-supplementary motor area (pSMA), leading to an inefficient communication with the STN; (2) impairments in the temporal dynamics between the pSMA and STN, possibly due to impaired white matter connectivity; (3) cellular deficits within the striatum that place the nucleus at an increased risk of becoming hyperpolarized; or (4) pathology within the PPN, priming the nucleus for hyperpolarization by minimal inhibitory input. Freezing may be due to any combination of these factors. Key: red, increased activity; blue, decreased activity; arrow, excitatory connection; circle, inhibitory connection.
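The chain of influence proposed here (conflict raises STN firing, STN excites the GPi, and the GPi inhibits the thalamus and PPN) can be illustrated with a toy steady-state firing-rate calculation. This is a deliberately simplified sketch rather than a published model: the function `basal_ganglia_output`, all baseline rates, and all connection weights are hypothetical values in arbitrary units, chosen only to respect the signs of the connections described in the text.

```python
def relu(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)

def basal_ganglia_output(conflict, striatal_drive=1.0):
    """Steady-state rates (arbitrary units) of a toy hyperdirect-pathway circuit.

    All baselines and weights are hypothetical illustration values.
    """
    stn = 0.5 + 1.5 * conflict                     # conflict (via pSMA) excites the STN
    gpi = relu(1.0 + 1.0 * stn - striatal_drive)   # STN excites, striatum inhibits the GPi
    thal = relu(2.0 - 1.0 * gpi)                   # GPi inhibits the thalamus
    ppn = relu(2.0 - 1.0 * gpi)                    # GPi inhibits the PPN
    return {"STN": stn, "GPi": gpi, "Thal": thal, "PPN": ppn}

low_conflict = basal_ganglia_output(conflict=0.0)
high_conflict = basal_ganglia_output(conflict=1.0)
for name in ("STN", "GPi", "Thal", "PPN"):
    print(f"{name}: low={low_conflict[name]:.2f}, high={high_conflict[name]:.2f}")
```

In this sketch, lowering `striatal_drive` has the same qualitative effect as raising conflict, which mirrors the suggestion that reduced striatal activity and increased STN activity could both push the output nuclei toward excessive inhibition.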

Given the specific patterns of connectivity within the basal ganglia circuitry, the likely sequela of impaired striatal activity is that the output structures (the globus pallidus internus and the substantia nigra pars reticularis) will enter into low-energy oscillatory states, coupling with structures such as the STN (Buzsáki and Draguhn, 2004; Frank, 2006; Spildooren et al., 2010). Indeed, recent models of basal ganglia function have proposed that the primary role of the basal ganglia network is in the modulation of emergent synchronous oscillatory patterns rather than the alteration of single synaptic patterns (Bar-Gad et al., 2003; Hammond et al., 2007; Weinberger and Dostrovsky, 2011). The alternation between relative excitation and inhibition in the basal ganglia (Chee et al., 2009) would therefore manifest as an inability to effectively "select" a single motor plan, which would ultimately lead to the overwhelming inhibition of the efferent targets of the output nuclei (such as the relay nuclei of the thalamus) and the brainstem structures controlling gait, such as the PPN (Tsang et al., 2010) and other members of the MLR (Snijders et al., 2010). Decreased activity within the dorsal PPN would impair motor initiation, due to the nucleus's efferent connectivity with the central pattern generators in the spinal cord that control coordinated flexion and extension of the muscles controlling gait (Takakusaki et al., 2004).

These proposed roles of the STN are well supported by behavioral (Aron and Poldrack, 2006; Frank, 2006; Wiecki and Frank, 2010) and neuroimaging evidence (Aron et al., 2007; Haynes and Haber, 2013). For example, the STN has a well-known association with set-shifting behavior (Tsang et al., 2012), which is impaired in patients with freezing (Shine et al., 2012a). Furthermore, the role of the STN in freezing is aligned with the functional neuroimaging studies suggesting that the GPi and the STN enter into a low-energy oscillatory state during freezing behavior (Buzsáki and Draguhn, 2004; Spildooren et al., 2010). In the low dopaminergic state, this oscillatory activity would then strongly facilitate increased activity within both the direct and indirect pathways of the basal ganglia, leading to overwhelming inhibition on the output structures of the basal ganglia, such as the thalamus and the MLR, but also on the striatum itself, leading to a functional decoupling between the cortex and the striatum. This is precisely the pattern of BOLD response changes that was observed during motor arrests on the VR task (Snijders et al., 2010) and has recently been reproduced during freezing episodes in an upper-limb tapping task (Vercruysse et al., 2013). Finally, the model also receives supportive evidence from a study of PD patients, in which high-frequency STN DBS led to a decrease in the normally elongated reaction times during conflict processing (Frank et al., 2007).

The oscillating inhibitory state of the basal ganglia nuclei may also explain the poorly understood phenomenon of "trembling in place," which refers to lower limb oscillatory activity in the 5–7 Hz range commonly observed during episodes of FOG (Moore et al., 2012). Computational modeling experiments have shown that Parkinsonian tremor (which occurs in the same 5–7 Hz frequency band) emerges naturally due to rhythmic activity between the STN and the globus pallidus externus in the presence of a dopaminergically depleted basal ganglia (Pahapill and Lozano, 2000). The inhibitory output from the basal ganglia would therefore oscillate at the same frequency, rhythmically imposing and relaxing its inhibition on the PPN. The moments of relative quiescence would allow increased signaling from the cholinergic cells within the PPN, which would still remain active through their connections with sensory signals arriving via the dorsal root of the spinal cord (Pahapill and Lozano, 2000) (see **Figure 2**). This mechanism also provides a potential explanation for the relationship between the freezing phenomenon and impaired postural responses during tests of balance function (Jacobs and Horak, 2007).
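The idea that a rhythmically waxing and waning inhibitory output would imprint its own frequency on a tonically driven target can be sketched numerically. The example below is purely illustrative and makes several assumptions that are not in the text: the GPi output is modeled as a 6 Hz sinusoid around a fixed mean, the PPN receives a constant sensory drive, and the PPN rate is simply the drive minus the inhibition, floored at zero. Only the 5–7 Hz range comes from the passage above.

```python
import numpy as np

fs = 200.0                          # sampling rate in Hz (assumed)
t = np.arange(int(5.0 * fs)) / fs   # 5 s of simulated time
f_osc = 6.0                         # inside the 5-7 Hz range named in the text

# Hypothetical oscillating inhibitory output of the basal ganglia (arbitrary units).
gpi_output = 1.0 + 0.8 * np.sin(2 * np.pi * f_osc * t)

# Hypothetical constant sensory drive to the PPN; its rate cannot go below zero.
sensory_drive = 1.2
ppn_rate = np.clip(sensory_drive - gpi_output, 0.0, None)

# Dominant frequency of the resulting PPN activity.
spectrum = np.abs(np.fft.rfft(ppn_rate - ppn_rate.mean()))
freqs = np.fft.rfftfreq(ppn_rate.size, d=1.0 / fs)
dominant_hz = freqs[np.argmax(spectrum)]
print(f"dominant PPN frequency: {dominant_hz:.1f} Hz")
```

In this toy setting the PPN is silenced whenever the inhibition exceeds the drive and becomes active in the gaps, so its activity is rhythmic at the frequency of the inhibitory oscillation.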

#### **INSIGHTS INTO POTENTIAL PATHOLOGICAL PROCESSES**

One of the major implications of this model is that freezing is best conceptualized as a functional disorder that only manifests once certain circumstances have occurred. This raises an interesting question regarding the likely location of pathology in the brains of patients with PD and freezing behavior. Based on the model, any pathological process that impaired the capacity of the brain to deal with information processing, and thus, to manifest increased conflict signaling would lead to an increase in freezing behavior.

**Freezing.** During walking, the Pre-supplementary Motor Area (pSMA) and Motor Cortex are able to effectively communicate motor plans to the basal ganglia, leading to the effective gating of basal ganglia outflow, allowing the appropriate communication of motor plans to the brainstem structures controlling gait, such as the dorsal pedunculopontine nucleus (PPNd), which subsequently informs the motor spinal cord (SCm), leading to normal gait. In addition, there is also effective feedback from the sensory spinal cord (SCs) to the cholinergic PPN (PPNc), further informing gait and balance. During Freezing, overwhelming response conflict leads to an increase in the firing activity within the globus pallidus internus (GPi), effectively decreasing the output of the basal ganglia. The inhibited PPNd is then unable to communicate effective motor plans to the SCm; however, the SCs is able to remain in communication with the PPNc, leading to an imbalance in activity within the greater PPN and abnormal activation patterns in the SCm. The overwhelming inhibitory state of the basal ganglia can only be broken by a focused, goal-directed action, which would trigger the striatum to inhibit the GPi, effectively releasing the STN-mediated "brake" on the PPNd and the SCm. Key: black, active; gray, hypo-active; dotted line, mixed active/hypoactive.

Although there are many regions of the brain in which pathology would lead to increased global conflict, the most likely candidates are the ascending neurotransmitter projection systems of the brainstem, such as the ventral tegmental area, the locus coeruleus, and/or the dorsal raphe nucleus. Each of these nuclei sends modulatory neurotransmitters to large portions of the brain, including both cortical and subcortical sites involved in walking and executive function. Indeed, these regions are often the target of Lewy body pathology (Rye and DeLong, 2003) and some of the projection nuclei, such as the dorsal raphe (Xiang et al., 2005) and the ventral tegmental area (Oades and Halliday, 1987), have direct efferent connections with the basal ganglia. Although each of the neurotransmitter systems has different effects on target neurons, imbalances in the proportion and timing of these neurochemicals are likely to lead to inefficient neuronal processing on a global scale, priming the system for errors and hence increased response conflict.

Another possible candidate region is the PPN (Mazzone et al., 2013), which has also been associated with Lewy body pathology in PD (Pahapill and Lozano, 2000). The PPN has extensive efferent connections with the STN and also with a number of other structures that modulate the brain's response to conflict, such as the intralaminar nuclei of the thalamus and the striatum (Pahapill and Lozano, 2000). In rats, low-frequency stimulation of the PPN leads to decreased firing in the STN (Alam et al., 2012) and as such, impairments in the effectiveness of this signaling pathway could potentially lead to an increase in the influence of the STN over the basal ganglia. These concepts are supported by the functional improvements experienced by patients with freezing following DBS of the PPN (Stefani et al., 2007; Strafella et al., 2008; Ferraye et al., 2010; Hamani et al., 2011; Tykocki et al., 2011; Follett and Torres-Russotto, 2012; Khan et al., 2012), along with increases in regional cerebral blood flow observed during stimulation of the PPN (Strafella et al., 2008; Ballanger et al., 2009). In addition, this interpretation is consistent with the finding that the PPN lacks appropriate white matter connectivity with the cerebellum in patients with FOG and PD (Tessitore et al., 2012b; Fling et al., 2013); however, this finding could also be explained as the by-product of decreased firing within the PPN as a result of overwhelming inhibition from the basal ganglia (Lewis and Barker, 2009), suggesting that the primary pathology in freezing may not reside in the PPN.

It is also possible that the proposed increase in STN oscillatory activity is due to a dysfunctional process within the pSMA (see **Figure 1**). For example, global states that predispose the brain to impaired information processing may preferentially affect the pSMA, particularly in its ability to effectively select motor plans to match both exogenous affordance patterns and internal motivational states (Nachev et al., 2008). Due to an inability of the pSMA to switch between competing motor plans, the STN may receive inappropriate information from the pSMA, either of multiple, competing motor plans (Nachev et al., 2008) or the correct plan on a time-delay (Hikosaka and Isoda, 2010). A failure of information processing would cause the STN to increase its firing rate, leading to inhibition of the input and output structures of the basal ganglia, thus effectively "buying time" until the appropriate, goal-directed behavior can be transmitted by the cortex (Frank, 2006; Spildooren et al., 2010).

#### **FUTURE DIRECTIONS**

Although the separate predictions of these different hypotheses may be difficult to dissociate with fMRI, measures with higher temporal resolution may help to clarify the precise role of each structure in the spatiotemporal evolution of a freezing episode. As such, future studies should now be constructed to test the different aspects of this model using an array of neuroscientific techniques. Firstly, activity within the STN and PPN could potentially be recorded directly during DBS surgery, allowing for the analysis of the time course of activation and deactivation patterns within the different nuclei with respect to freezing behavior. Future neuroimaging experiments should explore the presence or absence of impairments in functional and effective connectivity associated with the predictions of the model, with a particular emphasis on the dynamic connectivity between cortical and subcortical structures. Finally, computational modeling experiments could be designed in order to probe the dynamic elements of the model, focusing on whether abnormal conflict processing through the STN can predict the specific behavioral patterns displayed on different neuropsychological and motor-based tasks by patients with freezing. Together, the results of these studies will help to inform the next generation of therapeutic advances for freezing behavior in PD, including the utilization of targeted closed-loop DBS (Rosin et al., 2011; Tsang et al., 2012) and the discovery of novel locations for DBS electrode placement (Stefani et al., 2007; Lourens et al., 2011; Khan et al., 2012).

#### **REFERENCES**

doi: 10.1146/annurev.ne.09.030186.002041

disease. *Brain* 133(Pt 1), 205–214. doi: 10.1093/brain/awp229


ganglia mechanisms. *Trends Cogn. Sci.* 14, 154–161. doi: 10.1016/j.tics.2010.01.006


(A10) system: neurobiology. 1. Anatomy and connectivity. *Brain Res.* 434, 117–165. doi: 10.1016/0165-0173(87)90011-7


*ONE* 8:e52602. doi: 10.1371/journal.pone.0052602


in the control of motor behaviors. *Neurosci. Res.* 50, 137–151. doi: 10.1016/j.neures.2004.06.015


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 July 2013; accepted: 13 September 2013; published online: 04 October 2013.*

*Citation: Shine JM, Moustafa AA, Matar E, Frank MJ and Lewis SJG (2013) The role of frontostriatal impairment in freezing of gait in Parkinson's disease. Front. Syst. Neurosci. 7:61. doi: 10.3389/fnsys.2013.00061*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Shine, Moustafa, Matar, Frank and Lewis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Procedural-based category learning in patients with Parkinson's disease: impact of category number and category continuity

## *J. Vincent Filoteo<sup>1,2</sup>\* and W. Todd Maddox<sup>3,4</sup>*

*<sup>1</sup> Veterans Administration San Diego Healthcare System, San Diego, CA, USA*

*<sup>2</sup> Department of Psychiatry, University of California, San Diego, CA, USA*

*<sup>3</sup> Department of Psychology, University of Texas, Austin, TX, USA*

*<sup>4</sup> Institute for Neuroscience, University of Texas, Austin, TX, USA*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Shawn Ell, University of Maine, USA*

*Carol Seger, Colorado State University, USA*

#### *\*Correspondence:*

*J. Vincent Filoteo, Veterans Administration San Diego Healthcare System, Psychology Service 116B, VASDHS, 3350 La Jolla Village Dr., San Diego, CA 92161, USA e-mail: vfiloteo@ucsd.edu*

Previously we found that Parkinson's disease (PD) patients are impaired in procedural-based category learning when category membership is defined by a nonlinear relationship between stimulus dimensions, but these same patients are normal when the rule is defined by a linear relationship (Maddox and Filoteo, 2001; Filoteo et al., 2005a,b). We suggested that PD patients' impairment was due to a deficit in recruiting "striatal units" to represent complex nonlinear rules. In the present study, we further examined the nature of PD patients' procedural-based deficit in two experiments designed to examine the impact of (1) the number of categories, and (2) category discontinuity on learning. Results indicated that PD patients were impaired only under discontinuous category conditions but were normal when the number of categories was increased from two to four. The lack of impairment in the four-category condition suggests normal integrity of striatal medium spiny cells involved in procedural-based category learning. In contrast, and consistent with our previous observation of a nonlinear deficit, the finding that PD patients were impaired in the discontinuous condition suggests that these patients are impaired when they have to associate perceptually distinct exemplars with the same category. Theoretically, this deficit might be related to dysfunctional communication among medium spiny neurons within the striatum, particularly given that such communication may be mediated by cholinergic interneurons and a cholinergic deficiency could underlie some of PD patients' cognitive impairment.

**Keywords: Parkinson's disease, category learning, implicit processes, procedural learning, striatum, basal ganglia**

## **INTRODUCTION**

It is now widely accepted that there are multiple category learning systems (Ashby et al., 1998, 2010; Smith et al., 1998, 2012; Ashby and Maddox, 2005, 2011) and that different neural systems play different roles in these systems (Knowlton et al., 1994, 1996; Poldrack et al., 1999; Ashby and Ell, 2001; Filoteo et al., 2001a,b, 2005a,b; Patalano et al., 2001; Keri, 2003; Reber et al., 2003; Shohamy et al., 2004a,b; Maddox et al., 2005a,b; Cincotta and Seger, 2007; Nomura et al., 2007; Price et al., 2009; Waldschmidt and Ashby, 2011). One of the more interesting, and potentially important lines of research in this area is the study of how some categories can be acquired without conscious awareness. This phenomenon, often referred to as procedural-based category learning, occurs when participants learn complex categorization rules, and despite highly accurate learning, they are unable to describe explicitly why any given exemplar belongs to a specific category.

The behavioral mechanisms of procedural-based category learning have received much attention in several recent studies with normal individuals (Gluck et al., 2002; Maddox and Ashby, 2004; Ashby and Maddox, 2005, 2011; Ashby and O'Brien, 2005). These studies have demonstrated that this form of category learning has distinct operating characteristics that differentiate it from other types of category learning processes, such as explicit category learning (Ashby et al., 2002, 2003a,b; Maddox et al., 2003, 2004a,b; Maddox and Ing, 2005; Worthy et al., 2013). For example, the perceptual similarity among exemplars has to occur along a continuum within each category for normal procedural-based learning to occur, whereas this is not the case for explicit category learning (Maddox et al., 2005a,b, 2007). Similarly, the number of categories to be learned does not differentially impact long-run accuracy in procedural-based category learning, but increasing the number of categories impedes the learning of explicit category rules (Maddox et al., 2004a,b).

Much has also been learned about the underlying neurobiology of implicit or procedural-based category learning through functional imaging of normal individuals and through the study of patients with neurological disorders. For example, fMRI studies with normal participants have identified the striatum as an important brain region for procedural-based category learning (Filoteo et al., 2006; Cincotta and Seger, 2007; Nomura et al., 2007; Waldschmidt and Ashby, 2011) and other studies have implicated midbrain dopamine regions in some implicit category learning tasks (Aron et al., 2004). Past work with patients with striatal dysfunction has also implicated this brain region in implicit forms of category learning. Knowlton et al. (1996), for example, demonstrated that patients with Parkinson's disease (PD) are impaired in learning probabilistically determined categories, a finding that has received considerable support in the literature (Shohamy et al., 2004a,b). Importantly, other patient studies have indicated that brain structures associated with explicit memory (hippocampus and diencephalon) do not contribute to the same extent to implicit forms of category learning (Knowlton et al., 1994) or the long-term retention of procedural-based categories (Filoteo et al., 2001b).

In our work we have conducted a series of studies designed to further understand the nature of procedural-based category learning deficits in patients with PD, and by extension, the role of the striatum in this process (Maddox and Filoteo, 2001; Ashby et al., 2003b; Filoteo et al., 2005a). We have primarily used the perceptual categorization task (Ashby and Gott, 1988), in which participants view simple two-dimensional stimuli, often consisting of a single line that varies in length and orientation (or a Gabor patch that varies in spatial frequency and orientation; see **Figure 1**), and are asked to categorize each stimulus into one of two categories (Category A or B); feedback is given immediately following each response. The rule that dictates category membership depends on the nature of the relationship between the two stimulus dimensions. **Figures 2A,B** provide examples in which the optimal rule is linear or nonlinear, respectively. This figure provides scatter plots of Category A and B stimuli where the x-axis represents the length of the line (in arbitrary units) and the y-axis represents the orientation of the line (in arbitrary units). Closed squares represent stimuli from Category A and open circles represent stimuli from Category B. Each individual stimulus has the length value on the x-axis and the orientation value on the y-axis. The linear rule depicted in **Figure 2A** is represented as a linear function and provides an optimal separation of the Category A and B stimuli, whereas the nonlinear rule in **Figure 2B** is represented as a quadratic function that provides an optimal separation of the Category A and B stimuli. A participant who adopted the linear rule in **Figure 2A** or the nonlinear rule in **Figure 2B** would maximize long-run accuracy. Note that both rules are procedural-based category learning rules because it is very difficult to verbalize either the linear or nonlinear relationship between the two stimulus dimensions when they are not in the same perceptual units (e.g., length and orientation).

Our first study using this paradigm (Maddox and Filoteo, 2001) found that PD patients were impaired in learning a categorization rule that was based on a nonlinear relationship between lines that varied in length and orientation, whereas they were normal in learning a linear rule. Similarly, in our next study (Ashby et al., 2003a,b) we used a somewhat different task but again found that PD patients were normal in learning linear procedural-based rules. Finally, in a third study (Filoteo et al., 2005a,b) we again examined linear and nonlinear category learning and found that the patients were impaired in the nonlinear condition but not in the linear condition. Importantly, task difficulty could not explain these findings since the more difficult task (based on the accuracy of the control participants) was the linear task, on which PD patients were normal. This series of studies suggests that PD patients are impaired in procedural-based category learning, but only when the rule that dictates category membership is nonlinear.

A surface-level explanation of our findings is that PD results in deficits in learning nonlinear procedural-based rules, but it does not impact linear rule learning. Unfortunately, this explanation does not provide any insight into the possible mechanisms that might be driving these findings. To help interpret our past results we use the neurobiological and theoretical framework provided by the Striatal Pattern Classifier (SPC) model introduced by Ashby and colleagues (Ashby and Waldron, 1999). This model has been found to provide a good account of normal participants' response patterns in previous procedural-based category learning studies (e.g., Ashby and Waldron, 1999; Waldron and Ashby, 2001; Maddox et al., 2007; for applications to stimulus identification see Ashby et al., 2001; Maddox, 2001, 2002). The assumptions of this model are based on the neurobiology proposed to underlie the procedural-based category learning system in COVIS (Ashby et al., 2001). The SPC model, which is outlined in detail in Ashby and Waldron (1999), incorporates the knowledge of the many-to-one mapping of visual cortical cells onto cells in the striatum (Wilson, 1995). The model proposes hypothetical "striatal units" that are thought to represent the medium spiny cells in the striatum and provide a low-resolution map of the perceptual space. During procedural-based category learning the model assumes that these striatal units become associated with a category label and learn to associate a response with groups of cells in visual regions of cortex. It is important to be clear that the SPC is a computational model that is inspired by what is known about the neurobiology of the striatum. Because of this fact, the "striatal units" are hypothetical and could be interpreted within the language of some other computational model (e.g., as "prototypes" in a multiple prototype model).

One important finding from the application of the SPC to data obtained from normal individuals (Ashby and Waldron, 1999) is that a greater number of striatal units are typically needed to represent a nonlinear rule as compared to a linear rule. The SPC is a minimum distance classifier. This is depicted in **Figure 3A** in which a linear rule is approximated by one striatal unit representing Category A (closed square) and another striatal unit representing Category B (open circle). In this case a minimum distance "bound" is learned. Note in **Figure 3A** that only a single unit per category is needed to approximate a linear rule. In contrast, **Figure 3B** provides a graphic representation of how striatal units might approximate a nonlinear rule. As can be seen in the first panel of **Figure 3B**, a single unit per category does not provide a good approximation of the optimal nonlinear rule. However, in **Figure 3C** the addition of a second striatal unit allows for a better approximation of the nonlinear rule via the piece-wise combination of two linear bounds so that two minimum distance bounds are learned. Thus, the SPC model argues that additional striatal units are needed to represent nonlinear rules.
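
The minimum distance logic described above can be illustrated with a short sketch. This is not the SPC implementation of Ashby and Waldron (1999); it only shows how assigning each stimulus to the label of its nearest unit yields one linear bound with a single unit per category and a piece-wise linear bound once a category has two units. The function name `nearest_unit_label` and all unit coordinates are hypothetical values chosen for illustration.

```python
import math

def nearest_unit_label(stimulus, units):
    """Return the category label of the unit closest to the stimulus.

    stimulus: (x, y) point in a two-dimensional perceptual space.
    units: list of ((x, y), label) pairs; positions are illustrative only.
    """
    closest = min(units, key=lambda unit: math.dist(stimulus, unit[0]))
    return closest[1]

# One unit per category: the implied bound is the perpendicular bisector
# of the segment joining the two units, i.e., a single straight line.
linear_units = [((30.0, 70.0), "A"), ((70.0, 30.0), "B")]

# Two units for Category A and one for Category B: each A unit forms its
# own linear bound with the B unit, giving a piece-wise linear boundary.
nonlinear_units = [((20.0, 60.0), "A"), ((80.0, 60.0), "A"), ((50.0, 30.0), "B")]

print(nearest_unit_label((20.0, 80.0), linear_units))     # nearest unit is A
print(nearest_unit_label((90.0, 50.0), nonlinear_units))  # nearest unit is the second A unit
```

In this sketch, adding a unit changes only the list passed to the classifier, which mirrors the idea that representing a nonlinear rule is a matter of recruiting more units rather than changing the decision process.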

This observation raised the interesting possibility that dysfunction in PD within these model-based "striatal units" might also reflect the pathological manifestations of PD within actual medium spiny neurons, which compose the majority of cells within the striatum. These cells are the primary recipients of cortical input to the striatum and are part of the direct and indirect pathways within the basal ganglia. Medium spiny neurons are thought to be impacted in PD through the dysfunction of their dendritic spines due to deafferentation effects following the loss of dopamine cells within the pars compacta of the substantia nigra (Deutch et al., 2007), although this change might only be reflected functionally in the later stages of the disease (Zaja-Milatovic et al., 2005). Given the involvement of the medium spiny neurons in PD, one manner in which this disease could impact the proposed units is that the actual number of functional medium spiny neurons has diminished in these patients, and because of this, nonlinear categories that require a greater number of units can no longer be adequately represented. Thus, procedural-based learning conditions in which a greater number of units are required would always place PD patients at a disadvantage. We refer to this as the "number of units" hypothesis.

An alternative, but somewhat related, possibility is that the number of functional medium spiny neurons is normal in PD (at least early in the disease), but somehow these neurons are unable to communicate in a manner that would enable learning to occur when a greater number of striatal units is needed to represent the categories, such as under nonlinear conditions. That is, the number of functional medium spiny neurons is sufficient to support nonlinear category learning, but impairment in the ability of these neurons to communicate results in impaired learning. We refer to this as the "communication among units" hypothesis. This hypothesis was initially based on the observation from our previous studies that the learning of nonlinear rules requires that certain stimuli that are less perceptually similar have to be grouped into the same category, whereas certain stimuli that are more perceptually similar have to be grouped into different categories. Such communication among medium spiny neurons would be needed for the striatum to output a consistent message regarding the category to which a particular stimulus belongs. That is, unless there were some sort of co-activation among medium spiny neurons that processed the percepts of stimuli belonging to the same category, the output of these neurons would theoretically send a unique message to other structures eventually responsible for generating a response (e.g., the globus pallidus), and these structures would have to somehow resolve the fact that different medium spiny cells are signaling that their representation belongs to the same category. Note that this would only be the case when multiple units are required to represent a category, because theoretically, no such resolution would be required when only single medium spiny neurons (or medium spiny neurons within close proximity of one another) are needed to learn, such as under linear conditions.

An important question that needs to be addressed, however, is what could allow the medium spiny neurons to communicate. One possibility is that cholinergic interneurons that connect medium spiny neurons within the striatum enable such communication, and under conditions in which it would be theoretically beneficial for such cells to communicate (i.e., nonlinear conditions), striatal interneurons are involved in the learning process. These neurons, often referred to as tonically active neurons (or TANs, for their tonic firing rate at rest), comprise only a small percentage of neurons in the striatum but recently have been implicated in processes important to procedural-based category learning (Ashby et al., 2007; Ashby and Crossley, 2011; Crossley et al., 2013). Specifically, most studies suggest that these interneurons modulate the input of cortical cells onto striatal medium spiny neurons by decreasing (or pausing) their activity when a rewarding stimulus is processed within the striatum, which allows for increased reinforcement learning (Apicella, 2002; Joshua et al., 2008; Aosaki et al., 2010). However, another way in which these interneurons could contribute to learning is by controlling the number of potential responses that are selected by the striatum (Stocco, 2012), which would be more consistent with a broader view of the role of these interneurons in various aspects of learning (e.g., Apicella, 2007). This proposed process could provide an appropriate mechanism by which the striatum is able to link perceptually distinct stimuli to the same category response. This process is also consistent with other models of basal ganglia function that suggest a role of the striatum in response selection (Mink, 1996; Stocco, 2012), with reinforcement learning being one aspect of selecting a response (Bar-Gad et al., 2003; Redgrave et al., 2011) or linking networks within the striatum that are important for learning (Graybiel et al., 1994). Although the exact effects of interneurons on medium spiny cell function are not completely known and are likely very complex (see Oldenburg and Ding, 2011), these cells do appear to play an important role in normal striatal functioning. Importantly, animal models of PD suggest that the reduction of dopaminergic projections to the striatum results in abnormal interneuron activity (Raz et al., 2001; Pisani et al., 2003; Bonsi et al., 2011).

The purpose of the current study is to examine both the "number of units" hypothesis and the "communication among units" hypothesis described above. Experiment 1 examined the ability of PD patients and normal controls (NC) to learn a procedural-based task in which there were either four categories (Four-Category condition) or two categories (Two-Category condition). **Figure 4** displays the stimulus distributions for the Four- and Two-Category conditions. If PD results in a deficit in the number of hypothetical striatal units, then PD patients should demonstrate greater impairment in the Four-Category condition as compared to the Two-Category condition. In contrast, Experiment 2 examined the "communication among units" hypothesis by determining the ability of PD patients and NC participants to learn categories that have either a discontinuous distribution of stimuli (Discontinuous condition) or a continuous distribution of stimuli (Continuous condition). **Figure 6** displays the stimulus distributions for the Discontinuous and Continuous conditions. A finding that PD patients are impaired in the Discontinuous condition relative to the Continuous condition would provide theoretical support for the "communication among units" hypothesis of procedural-based category learning deficits in PD.

## **GENERAL METHODS**

#### **PARTICIPANTS**

A total of 41 individuals participated in at least one of the two experiments: 20 PD patients and 21 NC participants. For the PD patients, 11 participated in at least one condition in both experiments and 9 participated in at least one condition in only one experiment. For the NC participants, 8 participated in at least one condition in both experiments and 12 participated in at least one condition in only one experiment. Participants were randomized to each experiment and, in the case of those who participated in more than one experiment or in both conditions within an experiment, the order of administration of the experiments was randomized<sup>1</sup>. Participants were tested a minimum of 2 months apart between experiments or conditions. The specific numbers of individuals who participated in the two experiments are as follows. Experiment 1: Four-Category Condition, 12 PD patients (8 males and 4 females) and 12 NC participants (4 males and 8 females); Two-Category Condition, 11 PD patients (5 males and 6 females) and 11 NC participants (4 males and 7 females). Eight PD patients and 8 NC participants were tested in both the Four-Category and Two-Category conditions. Experiment 2: Discontinuous Condition, 10 PD patients (6 males and 4 females) and 10 NC participants (4 males and 6 females); Continuous Condition, 11 PD patients (8 males and 3 females) and 11 NC participants (5 males and 6 females). Five PD patients and 7 NC participants were tested in both the Discontinuous and Continuous conditions.

The patients were recruited from Movement Disorder Clinics at UCSD and were diagnosed by a board-certified neurologist with subspecialty training in movement disorders. The diagnosis was based on UK Brain Bank Criteria (Hughes et al., 1992). PD patients were not included in the study if they scored above a cut-off of 11 on the Geriatric Depression Scale or if they scored below 130 on the Mattis Dementia Rating Scale (MDRS; Mattis, 1988). For Experiment 1, 14 patients were taking daily L-dopa medication, 8 were taking a dopamine receptor agonist, 5 were taking an MAO inhibitor, 5 were taking a COMT inhibitor as part of their L-dopa preparation, 5 were taking amantadine, and 1 was taking an anticholinergic. For Experiment 2, 14 patients were taking daily L-dopa medication, 8 were taking a dopamine receptor agonist, 3 were taking an MAO inhibitor, 6 were taking a COMT inhibitor as part of their L-dopa preparation, 5 were taking amantadine, and 1 was taking an anticholinergic.

**Tables 1**, **3** show the mean age, years of education, and scores on the MDRS for the PD patients and NC participants who participated in Experiments 1 and 2, respectively, and the mean Hoehn and Yahr Rating Scale (HYRS; Hoehn and Yahr, 1967) score and the length of illness (LOI; years) for the PD patients. In both experiments, the PD and NC groups did not differ in age, education, scores on the MDRS, or gender distribution (all *p*s > 0.05).

#### **STIMULI AND STIMULUS GENERATION**

In both experiments, the stimuli consisted of a single Gabor patch (see **Figure 1**) that varied in orientation and spatial frequency. The stimuli were computer generated and displayed on a 21-inch monitor with 1360 × 1024 resolution. Each Gabor patch was generated using MATLAB routines from Brainard's (1997) Psychophysics Toolbox, and each stimulus was 7 cm in diameter, which subtended a visual angle of about 8.8° from a viewing distance of 45 cm.
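
As a quick check of the stimulus geometry, the visual angle can be recomputed from the stated diameter and viewing distance. The sketch below uses the standard formula for the angle subtended by an object centered on the line of sight; the function name `visual_angle_deg` is an illustrative choice and is not part of the Psychophysics Toolbox. With the values given in the text it returns a value just under 9°, in line with the approximate figure reported.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size
    viewed head-on from a given distance: 2 * atan(size / (2 * distance))."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

# Stimulus diameter and viewing distance as stated in the text.
angle = visual_angle_deg(7.0, 45.0)
print(f"{angle:.2f} degrees")
```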

Both experiments used the randomization technique of Ashby and Gott (1988). For each experiment, an equal number of Category A and Category B stimuli were generated by sampling randomly from two bivariate normal distributions. Each random sample (*x*<sub>f</sub>, *x*<sub>o</sub>) was converted to a stimulus by deriving the frequency, *f* = 0.0025 + (*x*<sub>f</sub>/5000) cycles per pixel, and orientation, *o* = 0.36*x*<sub>o</sub> degrees. The scaling factors were chosen in an attempt to equate the salience of frequency and orientation based on our past experience with these stimuli. Each category distribution is specified by a mean and a variance on each dimension, and by a covariance between dimensions. For both category structures it was always the case that the covariance matrix for Category A was identical to the covariance matrix for Category B. The categories differed only in the location of their means.
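
The sampling and conversion steps described above can be sketched as follows. The original stimuli were generated with MATLAB routines; this Python version is only an illustration of the same logic. The conversion in `to_stimulus` follows the two scaling equations given in the text, whereas the function names and the example mean, variances, and covariance are hypothetical placeholders and are not the values from **Tables 2**, **4**.

```python
import math
import random

def sample_bivariate_normal(rng, mean, var_f, var_o, cov_fo):
    """Draw one (x_f, x_o) sample from a bivariate normal distribution
    using the Cholesky factor of the 2 x 2 covariance matrix."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    sd_f = math.sqrt(var_f)
    x_f = mean[0] + sd_f * z1
    x_o = (mean[1] + (cov_fo / sd_f) * z1
           + math.sqrt(var_o - cov_fo ** 2 / var_f) * z2)
    return x_f, x_o

def to_stimulus(x_f, x_o):
    """Convert a sample to Gabor parameters using the scaling in the text."""
    frequency = 0.0025 + x_f / 5000.0   # cycles per pixel
    orientation = 0.36 * x_o            # degrees
    return frequency, orientation

# Hypothetical category parameters, for illustration only.
rng = random.Random(0)
samples = [sample_bivariate_normal(rng, (100.0, 150.0), 100.0, 100.0, 0.0)
           for _ in range(2000)]
stimuli = [to_stimulus(x_f, x_o) for x_f, x_o in samples]
```

Because the two categories share a covariance matrix and differ only in their means, a second category would reuse the same call with a different `mean` argument.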

The exact parameter values for the two experiments are listed in **Tables 2**, **4**, and the category structures are displayed in **Figures 4**, **6**. **Figure 4A** displays the category structures for the Four-Category condition in Experiment 1. Each filled square denotes the spatial frequency and spatial orientation of a Gabor pattern from Category A, each open circle denotes the spatial frequency and spatial orientation of a Gabor pattern from Category B, each closed diamond denotes the spatial frequency and spatial orientation of a Gabor pattern from Category C, and each closed triangle denotes the spatial frequency and spatial orientation of a Gabor pattern from Category D. **Figure 4B** displays the category structures for the Two-Category condition in Experiment 1, and **Figures 6A,B** display the Discontinuous and Continuous conditions in Experiment 2, respectively. For these figures, each filled square denotes the spatial frequency and spatial orientation of a Gabor pattern from Category A, while each unfilled circle denotes the spatial frequency and spatial orientation of a Gabor pattern from Category B. The solid line(s) in **Figures 4**, **6** denotes the location of the optimal decision bound(s). The use of the optimal bound in each of the four conditions maximizes long-run accuracy. Optimal accuracy in each condition was 95% given that the categories overlapped to some extent, and thus were probabilistic.

**Table 1 | Demographic characteristics and Mattis Dementia Rating Scale Scores of the PD patients and NC participants in the Four-Category and Two-Category Conditions of Experiment 1.**

*HY, Hoehn and Yahr Rating Scale score; LOI, Length of Illness.*

**Table 2 | Category distribution parameter values for Experiment 1.**

**Table 3 | Demographic characteristics and Mattis Dementia Rating Scale Scores of the PD patients and NC participants in the Discontinuous and Continuous Conditions in Experiment 2.**

*HY, Hoehn and Yahr Rating Scale score; LOI, Length of Illness.*

**Table 4 | Category distribution parameter values for Experiment 2.**

<sup>1</sup>Additional analyses were conducted to determine if the order of administration of either the experiments or conditions could account for the pattern of results reported below and it was determined that this was not the case.

#### **EXPERIMENTAL PROCEDURE**

For the Four-Category and Two-Category conditions in Experiment 1, 600 trials were presented in 6 blocks of 100 trials. For the Discontinuous and Continuous conditions in Experiment 2, 400 trials were presented in 5 blocks of 80 trials. At the start of each condition, the participants were told that they were involved in a study that examined their ability to categorize simple stimuli, that a series of stimuli would be presented, and that they would be asked to categorize each as a member of Category A, B, C, or D for the Four-Category condition of Experiment 1, or Category A or B in the Two-Category condition of Experiment 1 and in both conditions of Experiment 2. They were also told that at the beginning of the experiment they may feel as though they were guessing, but as the experiment progressed, their accuracy would likely increase. Participants indicated their categorization responses by pressing designated keys on the computer keyboard. For each trial in both experiments, the stimulus was presented until the participant made a categorization response. Feedback was presented immediately after the response for 1 s and consisted of either the word "wrong" if the response was incorrect or "correct" if the response was correct. Once feedback was given, the next trial was initiated 1 s later.

## **EXPERIMENT 1: FOUR-CATEGORY vs. TWO-CATEGORY CONDITIONS**

Experiment 1 was designed to examine the "number of units" hypothesis. In the Four-Category condition, the participant must learn to assign each stimulus to one of four categories. Theoretically, each category is represented by a single striatal unit that is linked to the corresponding response (A, B, C, or D). As can be seen in **Figure 4A**, these categories are derived from four clusters of stimuli with different means and standard deviations (see **Table 2**). In the Two-Category condition, these same four clusters of stimuli are again used, but now there are only two categories given that Category A and B stimuli from the Four-Category condition are collapsed into Category A for the Two-Category condition, and Category C and D from the Four-Category condition are collapsed into Category B for the Two-Category condition. Thus, the exact stimuli are held constant across the two conditions, as is the nature of the stimuli, the timing of the task trials, and the nature of feedback. The only thing that varied is the number of categories.

As noted above, if PD patients' deficits in learning nonlinear rules were due to such rules requiring a greater number of units to represent nonlinearity (see **Figure 3**), and if there were a deficiency in the number of units in PD patients, then the "number of units" hypothesis would predict that PD patients would be differentially impaired in the Four-Category condition as compared to the Two-Category condition.

#### **RESULTS**

Accuracy rates for the Four-Category condition of Experiment 1 are displayed in **Figure 5A** and were analyzed using a 2 (group: PD vs. NC) × 6 (blocks 1–6) mixed-design ANOVA. Results revealed a main effect of block, *F*(5, 110) = 37.02, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.63, with both PD and NC participants' performance improving across the trials. However, there was no main effect of group, *F*(1, 22) = 0.08, *p* = 0.78, η<sub>p</sub><sup>2</sup> = 0.00, and no group by block interaction, *F*(5, 110) = 0.35, *p* = 0.88, η<sub>p</sub><sup>2</sup> = 0.02. Accuracy rates for the Two-Category condition are displayed in **Figure 5B** and were analyzed using the same ANOVA design. Results indicated a main effect of block, *F*(5, 110) = 11.88, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.37, but no main effect of group, *F*(1, 22) = 0.21, *p* = 0.65, η<sub>p</sub><sup>2</sup> = 0.00, and no group by block interaction, *F*(5, 110) = 1.23, *p* = 0.30, η<sub>p</sub><sup>2</sup> = 0.06.
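
For readers who wish to relate the reported effect sizes to the test statistics, partial eta squared can be recovered from an *F* value and its degrees of freedom using the standard identity that partial eta squared equals *F* × df<sub>effect</sub> / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The sketch below is an illustrative helper (the name `partial_eta_squared` is our own choice) and is applied only to the block effect and the group by block interaction reported for the Four-Category condition.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic:
    (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Four-Category condition, values as reported in the text.
block_effect = partial_eta_squared(37.02, 5, 110)
interaction = partial_eta_squared(0.35, 5, 110)
print(round(block_effect, 2), round(interaction, 2))
```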

#### **DISCUSSION**

The results of Experiment 1 suggest that PD patients are not impaired when learning either four or two categories. As can be seen in **Figures 5A,B**, participants initially demonstrate a disadvantage in learning four categories as compared to two categories, but this is because participants are guessing early in learning and chance responding in the Four-Category condition is 25%, whereas in the Two-Category condition it is 50%. However, as learning progresses, performance improves in both the Four- and Two-Category conditions and asymptotes at approximately 80% during the last block of trials. These findings are consistent with our previous work with healthy younger adults that showed little impact of category number on procedural-based category learning (Maddox et al., 2004a,b). The most important finding, however, is that there was no difference between PD patients and NC participants in the pattern and extent of learning in either the Four- or Two-Category conditions. If we can assume that normal learning in the Four-Category condition required a greater number of functional striatal units, then these findings do not support the "number of units" hypothesis.

#### **EXPERIMENT 2: DISCONTINUOUS CATEGORY vs. CONTINUOUS CATEGORY CONDITIONS**

The purpose of Experiment 2 was to examine the impact of within-category discontinuity on procedural-based category learning in PD and NC participants. As noted above, we have found in three past studies that PD patients are not impaired in learning procedural-based category rules when the rule that dictates category membership is linear (Ashby et al., 2003a,b; Filoteo et al., 2005a,b) and our findings from Experiment 1 in this study provide further support for this observation. As we have argued above, one aspect of learning nonlinear rules is that participants must learn to categorize perceptually dissimilar stimuli into the same category so that they can activate the same response, and conversely, participants must learn not to categorize perceptually similar stimuli into the same category so that such stimuli can elicit a different response. This process is thought to occur through a response selection mechanism that is modulated by cholinergic interneurons within the striatum by inhibiting competing responses (e.g., Stocco, 2012). If there were a deficiency in communication among the medium spiny neurons within the striatum because of poor communication through the interneurons, then learning would be impaired. Again, we refer to this hypothesis as the "communication among units" hypothesis.

To test this hypothesis, we created a two-category condition in which a greater number of units would be needed to represent the stimuli within a single category but the rule was nevertheless linear. To do so, we created discontinuous categories by using two non-overlapping clusters within each category. As can be seen in **Figure 6A**, Category A stimuli (A1 and A2 clusters under Discontinuous condition in **Table 4**) compose two clusters as do Category B stimuli (B1 and B2 clusters under Discontinuous condition in **Table 4**). Importantly, stimuli from A1 and B1 are perceptually more similar than are A1 and A2 stimuli or B1 and B2. Thus, two important features of nonlinear rules are replicated: (1) perceptually dissimilar stimuli must be categorized together, and (2) more striatal units are needed to represent the categories; however, the rule is now linear. In contrast, in the Continuous condition, which served as the control condition, the categories were again composed of two clusters, but the clusters overlapped, which resulted in participants having to learn to categorize perceptually similar stimuli into the same category and a greater likelihood that only a single unit would be needed to represent the categories. If PD patients were differentially impaired in the Discontinuous relative to the Continuous condition, it would provide support for the "communication among units" hypothesis.

#### **RESULTS**

Accuracy rates for the Discontinuous Condition of Experiment 2 are depicted in **Figure 7A** and were analyzed using a 2 (group: PD vs. NC) × 5 (blocks 1–5) mixed-design ANOVA. Results of this analysis identified a main effect of group, *F*(1, 18) = 6.68, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.27, with PD patients performing worse than NC participants overall, and a main effect of block, *F*(4, 72) = 11.06, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.38, with both PD and NC participants' performances improving across the blocks. There was no group by block interaction, *F*(4, 72) = 1.37, *p* = 0.25, η<sub>p</sub><sup>2</sup> = 0.07. Performances in the Continuous Condition are shown in **Figure 7B** and were examined using the same mixed-design ANOVA as for the Discontinuous Condition. Results of this analysis indicated that there was a main effect of block, *F*(4, 80) = 4.37, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.18, but no effect of group, *F*(1, 20) = 0.21, η<sub>p</sub><sup>2</sup> = 0.01, and no group × block interaction, *F*(4, 80) = 0.85, *p* = 0.50, η<sub>p</sub><sup>2</sup> = 0.04.
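As a reading aid, the partial eta squared values reported for Experiment 2 can be recovered from each *F* ratio and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The sketch below is not part of the original analysis; the function name `partial_eta_squared` and the dictionary labels are illustrative assumptions, and the inputs are the statistics quoted in the paragraph above.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported for Experiment 2.
# The labels are illustrative; they are not names used by the authors.
reported_effects = {
    "Discontinuous, group": (6.68, 1, 18),
    "Discontinuous, block": (11.06, 4, 72),
    "Discontinuous, group x block": (1.37, 4, 72),
    "Continuous, block": (4.37, 4, 80),
    "Continuous, group": (0.21, 1, 20),
    "Continuous, group x block": (0.85, 4, 80),
}

if __name__ == "__main__":
    for label, (f_value, df_effect, df_error) in reported_effects.items():
        eta = partial_eta_squared(f_value, df_effect, df_error)
        print(f"{label}: partial eta squared = {eta:.2f}")
```

Rounded to two decimals, these six values agree with the effect sizes reported in the text for Experiment 2.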

#### **MODEL BASED ANALYSES**

To further examine the results obtained in Experiment 2, we applied models to the final block of data separately from each participant (e.g., Estes, 1956; Maddox and Ashby, 1998; Smith and Minda, 1998; Maddox, 1999). The main class of model on which we focussed assumed that participants used an implicit procedural-based learning strategy, instantiated by applying Ashby and Waldron's (1999) Striatal Pattern Classifier (SPC; see below for details). The model parameters were estimated using maximum likelihood (Ashby, 1992; Wickens, 1993) and the goodness-of-fit statistic was

$$\text{AIC} = 2r - 2\ln L,$$

where *r* is the number of free parameters and *L* is the likelihood of the model given the data (Akaike, 1974; Takane and Shibayama, 1992). The AIC statistic penalizes a model for extra free parameters in such a way that the smaller the AIC, the closer a model is to the "true model," regardless of the number of free parameters. Thus, to find the best model among a given set of competitors, one simply computes an AIC value for each model, and chooses the model associated with the smallest AIC value (for a discussion of the complexities of model comparisons see Myung, 2000; Pitt et al., 2002).
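The selection rule just described can be sketched in a few lines. This is only an illustration of the AIC comparison, not the authors' fitting code: the helper names `aic` and `best_model`, the log-likelihood values, and the parameter counts are all hypothetical and were chosen solely to make the arithmetic easy to follow.

```python
def aic(log_likelihood: float, n_free_params: int) -> float:
    """AIC = 2r - 2 ln L, where r is the number of free parameters."""
    return 2 * n_free_params - 2 * log_likelihood


def best_model(fits: dict) -> tuple:
    """Return the name of the model with the smallest AIC, plus all AIC scores.

    `fits` maps a model name to (maximized log-likelihood, number of free parameters).
    """
    scores = {name: aic(log_l, r) for name, (log_l, r) in fits.items()}
    winner = min(scores, key=scores.get)
    return winner, scores


# Hypothetical fits for one participant; these numbers are invented for
# illustration and do not come from the study.
example_fits = {
    "SPC-1": (-62.0, 3),
    "SPC-2": (-55.5, 7),
}

if __name__ == "__main__":
    winner, scores = best_model(example_fits)
    for name, score in scores.items():
        print(f"{name}: AIC = {score:.1f}")
    print(f"Preferred model: {winner}")
```

In this made-up case the model with more parameters is preferred because its gain in log-likelihood outweighs the penalty of 2 per additional free parameter.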

The SPC model has been found to provide a good computational model of participants' responding in previous information-integration category learning studies (e.g., Ashby and Waldron, 1999; Waldron and Ashby, 2001; for applications to stimulus identification see Ashby et al., 2001; Maddox, 2001, 2002). In addition, the assumptions of this model are based on the neurobiology proposed to underlie the procedural-based system (Ashby et al., 2001). The SPC-1 assumes that there is one striatal unit for each category, and the SPC-2 assumes that there are two striatal units for each category. Both models assume a single noise parameter that estimates the variability associated with the participant's responding, with large variability estimates being associated with less deterministic responding and small variability estimates being associated with more deterministic responding. These models were developed to examine the possibility that participants in the discontinuous condition might learn to associate the separate, and distinct, sub-clusters of perceptually similar stimuli with the appropriate category. We hypothesized that if there was a deficit in communication and recruitment among the medium spiny neurons via dysfunction of the interneurons, then the SPC-1 model should be more likely to account for the pattern of PD patients' responding in the discontinuous condition, whereas an SPC-2 model would be more likely to account for the NC participants' responding. In contrast, there should be no difference between the groups in the continuous condition.

The results of the model applications supported our prediction in that only 1 out of 10 of the PD patients' data sets in the discontinuous condition was better fit by the SPC-2 model, whereas 4 out of 10 of the NC participants' data sets were best fit by the SPC-2 model. Furthermore, for both groups, those participants whose data were best fit by the SPC-2 model demonstrated better accuracy than those whose data were best fit by the SPC-1 model (69.8 vs. 58.8% for the PD group; 87.2 vs. 70.0% for the NC participants). In contrast, in the continuous condition, 0 out of 11 PD patients' data sets were better fit by the SPC-2 model, and only 1 of the 11 data sets from the NC participants was best fit by the SPC-2 model. Thus, in the discontinuous condition, the SPC model with a greater number of units was more likely to account for NC data sets, and this model was also associated with greater accuracy rates<sup>2</sup>.

### **DISCUSSION**

The results from Experiment 2 indicated that, compared to NC participants, PD patients are impaired in procedural-based category learning when the categories are discontinuous but are not impaired with continuous categories. **Figure 7** also demonstrates the slight advantage NC participants have when learning continuous vs. discontinuous categories, a finding that we observed in our previous studies with healthy younger participants (Maddox et al., 2007). Of note, if the category clusters from the discontinuous condition (see **Figure 6A**) were from four different continuous categories, as opposed to two discontinuous categories, we would predict that PD patients would be normal.

Note that this is the first study in which we found PD patients to be impaired in learning a linear procedural-based rule, arguing against the surface-level explanation that PD patients' deficits in category learning are simply due to the nonlinearity of the rule. Rather, the present results support the hypothesis that PD patients are impaired in procedural-based category learning when there is a need for communication among striatal units, thereby supporting the "communication among units" hypothesis. We now turn to a discussion of the theoretical implications of our findings.

#### **THEORETICAL DISCUSSION**

The main finding from the present set of experiments was that PD patients are impaired in learning discontinuous categories but are normal in learning continuous categories, which provides initial support for our "communication among units" hypothesis. In contrast, the two groups did not differ in learning a procedural-based task with four categories, which does not support our "number of units" hypothesis.

The finding that PD patients are not impaired in learning either four- or two-category tasks suggests that the theoretical striatal units are functionally intact. This is consistent with the hypothesis that the medium spiny neurons were able to adequately represent multiple categories in our sample of PD patients. While it is known from PD animal models that the functional integrity of the medium spiny neurons can diminish in the absence of dopamine (Arbuthnott et al., 2000), post-mortem studies with actual PD patients suggest that structural changes to these neurons occur only in later stages of the disease and may be the cause of motor complications secondary to dopaminergic treatment (i.e., dyskinesia). Specifically, Zaja-Milatovic et al. (2005) examined 9 PD patients post-mortem who had the disease a mean of 13 years and found that dendritic length of the medium spiny neurons was reduced in all striatal regions examined in PD patients relative to control post-mortem samples. The patients in our study tended to have the disease a mean of 6–7 years and were not displaying any complications of dopaminergic treatment (e.g., dyskinesia), so it is possible that the disease had not progressed in our patients to a point where the functioning of the medium spiny neurons had been impacted. This possibility raises the interesting question of whether PD patients with motor complications such as dyskinesia would be more likely to display deficits in the four-category condition given the possibility that their medium spiny neurons are less functional.

In contrast to the findings in Experiment 1, PD patients demonstrated a deficit in the Discontinuous condition in Experiment 2 but not in the Continuous condition. We have hypothesized that this is due to abnormal communication among the cholinergic interneurons in PD. We further hypothesized that normal interneuron communication is needed so that different medium spiny neurons that are processing perceptually dissimilar stimuli can resolve that they are representing stimuli that belong to the same category and are linked to the same response. This would only be required when there is a need for a greater number of the theoretical striatal units, such as when multiple units are needed to represent perceptually dissimilar exemplars from the same category (i.e., with discontinuous and nonlinear categories). Our assumption that striatal cholinergic interneurons are dysfunctional in PD is based on animal models that demonstrate increased activity of such neurons in the presence of reduced dopamine levels (Raz et al., 2001; Pisani et al., 2003; Bonsi et al., 2011). If this overactivity of striatal interneurons is sufficient, improper signaling between medium spiny neurons would be likely to occur and the linking of perceptually dissimilar stimuli to the same response would be greatly compromised. This theoretical explanation is supported by other lines of research suggesting that the role of the basal ganglia, in general, and striatum, in particular, is to participate in response selection via the disinhibition of wanted responses and inhibition of unwanted responses (Mink, 1996; Stocco, 2012).

The possibility that cholinergic abnormality in PD underlies cognitive deficits in these patients is not new. However, the role of acetylcholine in PD cognition is not straightforward. On one hand there are previous studies indicating that medications that prevent the breakdown of acetylcholine (i.e., cholinesterase inhibitors) improve cognition in demented patients with PD (Emre et al., 2004; Bosboom et al., 2009; Possin et al., 2013). On the other hand we argue here that increased activity in cholinergic interneurons leads to a deficit in procedural-based category learning. Adding to this possible paradox are findings from a previous study where we demonstrated that impaired learning of a nonlinear procedural-based rule predicted future decline in global cognitive functioning in a group of nondemented PD patients (Filoteo et al., 2007). In addition, the fact that anticholinergic medications are often given to patients early in the course of the disease to improve motor symptoms by presumably reducing the overactivity of the cholinergic interneurons also adds to the confusion as to how acetylcholine helps or hurts cognitive and motor functioning in PD. While we are unlikely to resolve these issues here, these possibilities raise the intriguing question of whether the administration of an anticholinergic would paradoxically improve nonlinear or discontinuous category learning in nondemented PD patients, or whether the use of a cholinesterase inhibitor would have any impact. These questions, and the general role of acetylcholine in PD cognition, certainly warrant further study.

<sup>2</sup>It should also be noted that we applied a number of hypothesis-testing models that assumed the individual used an explicit approach to learning the categories. Although a few data sets were best fit by this class of models, the actual fits were close to those of the SPC models. Given the specific questions we posed with the modeling and the small sample size, we felt it was more important to focus on contrasting the SPC-1 and SPC-2 models.

It is important to note that the ideas tested in this paper are based on a hypothetical role of the function of cholinergic interneurons in the striatum and clearly represent an oversimplification of both the architecture and function of striatal medium spiny neurons and interneurons. At present, there is no neurobiological evidence to suggest that the specific role of these interneurons is to provide a conduit through which medium spiny neurons can link perceptually dissimilar stimuli to the same response. It may also be the case that the findings we report here are not due to such impairment but rather to some other mechanism, such as dysfunction in the output stage of response selection (e.g., Gurney et al., 2001). What is important is that we have further identified the experimental conditions under which PD patients demonstrate procedural-based category learning deficits, and that these data provide additional insights into the mechanistic basis for some of our highly consistent previous results (Maddox and Filoteo, 2001; Ashby et al., 2003a,b; Filoteo et al., 2005a,b). In addition, the present work offers a potential computational understanding of the similarities between impaired nonlinear and discontinuous procedural-based category learning deficits in PD.

There are obviously several limitations to the present work. First, in regard to Experiment 1, it is possible that we did not tax the striatum sufficiently by the use of only four categories. It is possible that had we increased the number of categories we would have seen a deficit in the PD patients. As noted above, it is also possible that if we were to test patients in a more advanced stage of PD we would be more likely to see an impairment given the possibility that medium spiny neurons are only impacted in later stages of the disease (Zaja-Milatovic et al., 2005). Second, in regard to Experiment 2, there are several additional manipulations that could have been conducted to further examine the impact of discontinuity in PD patients' procedural-based category learning deficit. For example, in the present study we only examined one within-category discontinuous separation and one between-category separation. In other words, the within-category cluster distance is fixed and so is the category (A vs. B) cluster separation. This issue could be examined parametrically to see what within- and what between-category separations lead to a deficit. If, for example, we found that systematically increasing the between-category separation decreases the magnitude of impairment in PD, this would further support the notion that perceptual similarity plays a key role in the observed deficit. Such manipulations are critical to further advance these theories. Third, in Experiment 2, the conditions did not only differ in terms of category continuity but also in terms of within-category range (i.e., how much of the stimulus space was occupied by category exemplars), which also could have explained the findings. However, in a previous study with healthy participants (Maddox and Filoteo, 2011) we found that category discontinuity had a greater impact on learning than did within-category range, suggesting that the results from the present study are less likely related to the degree of within-category range. Nonetheless, it will be important for future studies to directly examine this issue in PD.

In summary, the present study tested two theories of PD patients' deficits in procedural-based category learning. Our results and conclusions, while highly tentative and theoretical, suggest that PD patients are primarily impaired when learning requires perceptually dissimilar stimuli to be grouped in the same category, which may be due to dysfunctional communication among striatal units.

#### **ACKNOWLEDGMENTS**

This research was supported in part by VA Merit Award and NINDS Grant (R01-41372) to J. Vincent Filoteo and NIMH Grant (R01-59196) to W. Todd Maddox.

#### **REFERENCES**


Ashby, F. G., Ell, S. W., and Waldron, E. M. (2003a). Procedural learning in perceptual categorization. *Mem. Cognit.* 31, 1114–1125. doi: 10.3758/BF03196132


category learning. *Neuroreport* 16, 111–115. doi: 10.1097/00001756-200502080-00007


category learning. *J. Exp. Psychol. Learn. Mem. Cogn.* 31, 654–669. doi: 10.1037/0278-7393.31.4.654


data from neuroimaging and neuropsychology. *Brain* 127(Pt 4), 851–859. doi: 10.1093/brain/awh100


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 October 2013; accepted: 20 January 2014; published online: 19 February 2014.*

*Citation: Filoteo JV and Maddox WT (2014) Procedural-based category learning in patients with Parkinson's disease: impact of category number and category continuity. Front. Syst. Neurosci. 8:14. doi: 10.3389/fnsys.2014.00014*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Filoteo and Maddox. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Asymmetric pallidal neuronal activity in patients with cervical dystonia

#### *Christian K. E. Moll<sup>1</sup>\*, Edgar Galindo-Leon<sup>1</sup>, Andrew Sharott<sup>1,2</sup>, Alessandro Gulberti<sup>1</sup>, Carsten Buhmann<sup>3</sup>, Johannes A. Koeppen<sup>4</sup>, Maxine Biermann<sup>4</sup>, Tobias Bäumer<sup>5</sup>, Simone Zittel<sup>5</sup>, Manfred Westphal<sup>4</sup>, Christian Gerloff<sup>3</sup>, Wolfgang Hamel<sup>4</sup>, Alexander Münchau<sup>5</sup> and Andreas K. Engel<sup>1</sup>*



#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Maria C. Rodriguez-Oroz, Hospital Donostia, Spain Jean-Jacques Soghomonian, Boston University School of Medicine, USA*

#### *\*Correspondence:*

*Christian K. E. Moll, Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Martinistrasse 52, 20246 Hamburg, Germany e-mail: c.moll@uke.de*

The origin of asymmetric clinical manifestation of symptoms in patients suffering from cervical dystonia (CD) is hitherto poorly understood. Dysregulated neuronal activity in the basal ganglia has been suggested to have a role in the pathophysiology of CD. Here, we re-assessed the question to what extent relative changes occur in the direct vs. indirect basal ganglia pathway in CD, whether these circuit changes are lateralized, and how these alterations relate to CD symptoms. To this end, we recorded ongoing single cell and local field potential (LFP) activity from the external (GPe) and internal pallidal segment (GPi) of 13 CD patients undergoing microelectrode-guided stereotactic surgery for deep brain stimulation in the GPi. We compared pallidal recordings from CD patients operated under local anaesthesia (LA) with those obtained in CD patients operated under general anaesthesia (GA). In awake patients, mean GPe discharge rate (52 Hz) was lower than that of GPi (72 Hz). Mean GPi discharge ipsilateral to the side of head turning was higher than contralateral and correlated with torticollis symptom severity. Lateralized differences were absent at the level of the GPe and in recordings from patients operated under GA. Furthermore, in the GPi of CD patients there was a subpopulation of theta-oscillatory cells with unique bursting characteristics. Power and coherence of GPe– and GPi–LFPs were dominated by a theta peak and also exhibited band-specific interhemispheric differences. Strong cross-frequency coupling of low-gamma amplitude to theta phase was a feature of pallidal LFPs recorded under LA, but not GA. These results indicate that CD is associated with an asymmetric pallidal outflow. Based on the finding of symmetric neuronal discharges in the GPe, we propose that an imbalanced interhemispheric direct pathway gain may be involved in CD pathophysiology.

#### **Keywords: cervical dystonia, GPi, GPe, microelectrode recording, LFP, oscillations, coherence, phase–amplitude coupling**

## **INTRODUCTION**

Isolated cervical dystonia (CD), or spasmodic torticollis, is the most common form of adult-onset focal dystonia (Albanese et al., 2013). CD is characterized by phasic or sustained involuntary neck muscle contractions causing abnormal movements and postures of head and neck (Chan et al., 1991). The clinical presentation of many patients with CD is head turning and/or tilting to one side, with rotation in the horizontal plane being the most common pattern of abnormal head and neck posture (Jankovic et al., 1991). Symptom control is often achieved with injections of botulinum toxin into overactive neck muscles (Albanese et al., 2011). For medically refractory CD, deep brain stimulation (DBS) of the internal globus pallidus (GPi) has emerged as a therapeutic option with good long-term efficacy (Krauss et al., 1999; Kiss et al., 2007; Walsh et al., 2013).

Currently, the pathophysiology of CD is not well understood. It is a widely accepted view that insufficient motor control in dystonia is critically related to functional disturbances of the basal ganglia, eventually giving rise to a systems level loss of inhibitory functions (Berardelli et al., 1998; Vitek et al., 1999; Hallett, 2011; Hendrix and Vitek, 2012). Much of the foundation for our current understanding of CD as a basal ganglia-related motor circuit disorder dates back to the 1920s when Foerster first proposed that CD symptoms would result from a deficient inhibitory striatal control of the globus pallidus (Foerster, 1921) and, more specifically, from "a focal destruction of that part of the neostriatum corresponding to the neck muscles" (Foerster, 1933). Influenced by Foerster's conceptual framework, supported by results of laboratory studies (Montanelli and Hassler, 1964) and motivated by experiences from torticollis surgery, Hassler later proposed that an unbalanced pallidal outflow could be critically involved in the asymmetrical clinical manifestation of CD (Hassler and Dieckmann, 1970).

*<sup>3</sup> Department of Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany*

Since then, the concepts of basal ganglia organization have markedly changed with recognition of the dichotomous organization of striatal outflow, giving rise to direct and indirect pathways through the basal ganglia (Albin et al., 1989; Delong, 1990; Smith et al., 1998). In this influential model, the striatum exerts a dual control on basal ganglia output neurons in the GPi with opposite effects on thalamocortical circuits. Striatal direct pathway projection neurons inhibit pallidal outflow, and increased gain in this pathway is thought to facilitate movement initiation. On the other hand, an activation of indirect pathway projection neurons leads to increased activity in GPi neurons (by silencing neurons in the globus pallidus externus (GPe), which in turn leads to disinhibition of excitatory inputs to GPi from the subthalamic nucleus), thereby suppressing competing movements. In the rate model of dystonia, reduced pallidal inhibition results from a functional imbalance in the direct and indirect pathways. Either overactivity in the direct pathway or underactivity in the indirect pathway leads to excessive thalamocortical excitation and involuntary dystonic movements (Hallett, 2011). Alternatively, altered spatio-temporal patterns of neuronal activity such as excessive synchrony and/or oscillations along the different basal ganglia pathways may participate in the disruption of motor control in dystonia (Vitek, 2002). To what extent relative changes in the direct vs. indirect pathway occur in CD, whether these circuit changes are lateralized at the basal ganglia level, and how these alterations relate to CD symptoms is hitherto unclear.
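The sign logic of the rate model described above can be made concrete with a toy steady-state calculation. The sketch below is purely illustrative and is not a model used in this study: the function `pallidal_output`, the baseline constants, the unit weights, and the rectified-linear form are all assumptions, and the rates are in arbitrary units. It only encodes the connectivity signs stated in the text (direct pathway inhibits GPi; indirect pathway inhibits GPe, which disinhibits the subthalamic nucleus, which excites GPi; GPi inhibits thalamus).

```python
def relu(x: float) -> float:
    """Rectify a rate so that it cannot become negative."""
    return max(0.0, x)


def pallidal_output(direct_drive: float, indirect_drive: float) -> dict:
    """Toy steady-state rates (arbitrary units) for the direct/indirect scheme.

    All baselines and weights are hypothetical and chosen for readability only.
    """
    gpe = relu(1.0 - indirect_drive)        # indirect striatal neurons inhibit GPe
    stn = relu(1.0 - gpe)                   # GPe inhibits the subthalamic nucleus
    gpi = relu(1.0 + stn - direct_drive)    # STN excites GPi, direct pathway inhibits it
    thalamus = relu(2.0 - gpi)              # GPi inhibits thalamocortical drive
    return {"GPe": gpe, "STN": stn, "GPi": gpi, "thalamus": thalamus}


if __name__ == "__main__":
    scenarios = {
        "balanced": (0.5, 0.5),
        "direct pathway overactive": (0.8, 0.5),
        "indirect pathway underactive": (0.5, 0.2),
    }
    for label, (direct, indirect) in scenarios.items():
        rates = pallidal_output(direct, indirect)
        print(label, {name: round(rate, 2) for name, rate in rates.items()})
```

Under these assumptions, both direct pathway overactivity and indirect pathway underactivity lower the toy GPi rate and raise the thalamic term, which is the qualitative prediction the rate model makes for dystonia. Note that only the indirect pathway manipulation changes the GPe rate in this scheme.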

There is a continuing interest in the results of single cell and local field potential (LFP) recordings from otherwise inaccessible subcortical regions of the human brain in the context of DBS surgery, as these techniques allow addressing some of the above-mentioned matters (Engel et al., 2005; Vitek et al., 2011). However, available data on lateralized neuronal activity in the GP of CD patients are sparse. Single cell recordings from the GPi of CD patients have mostly been pooled together with data from patients with phenotypically or etiologically different types of dystonia (Starr et al., 2005; Chang et al., 2007; Weinberger et al., 2012), precluding a detailed analysis of lateralized activity changes. To our knowledge, only one study has investigated pallidal single cell activity specifically in CD patients (Tang et al., 2007). Tang et al. (2007) found no evidence for side-to-side differences in discharge rates or patterns of GPi neurons. Several LFP studies have confirmed the presence of pronounced low-frequency oscillatory activity in the GPi of CD patients (Liu et al., 2008; Sharott et al., 2008). In a recent LFP study in the GPi of CD patients, significant interhemispheric differences in the expression of these low-frequency oscillations have been reported at the population level (Lee and Kiss, 2013). At present, it is not clear how the observed discrepancies between these single cell and LFP studies can be accounted for.

The primary goal of this study was to reinvestigate the question of lateralized differences in pallidal outflow in CD patients. To address this question, we analyzed both single cell spiking and LFP activity from GPe and GPi of CD patients undergoing microelectrode-guided DBS surgery. In the absence of control data, we compared pallidal recordings from awake CD patients with those made in a population of CD patients operated under general anaesthesia (GA), a condition under which abnormal head and neck movements or postures were absent.

## **PATIENTS AND METHODS**

## **PATIENTS**

Thirteen patients (7 women, 6 men) with CD aged between 30 and 74 years were studied. The patients were referred to our hospital between February 2006 and April 2013 for stereotactic implantation of DBS electrodes in the GPi because of failure of treatment with botulinum toxin. Of these, 9 patients were operated under local anaesthesia (hereafter referred to as "LA") and 4 patients were operated under GA due to their symptom severity or anxiety. Disease duration ranged from 3 to 30 years. The current patient sample was limited to medically refractory cases of adult onset isolated dystonia (Albanese et al., 2013). Patients with combined or complex dystonia were excluded. The majority of patients presented with a complex combination of rotation (torticollis), tilt (laterocollis), flexion (antecollis), or extension (retrocollis) of head and neck, and shoulder elevation. Rotational torticollis was the most common abnormal movement pattern. The chin was turned to the right in 6 patients (3 in the LA group), and to the left in the other 6 patients (5 in the LA group). Involuntary head posturing was accompanied by tremulous head movements in 8 patients (4 in each group). Further demographic and clinical details of the patients are given in **Table 1**.

## **CLINICAL EXAMINATION**

Preoperative symptom severity was rated using the torticollis severity scale of the Toronto Western Spasmodic Torticollis Rating Scale (TWSTR, maximum score of this torticollis severity subscale = 35; Consky and Lang, 1994). Standardized videotapes were reviewed independently by two movement disorder neurologists (Simone Zittel, Maxine Biermann), and a score was given by consensus rating. Torticollis symptom severity was comparable in both patient populations (TWSTR severity score of the LA group, 21.5 ± 4.5 vs. 20.3 ± 2.4 in the GA group, Mann–Whitney test, *p* = 0.86). All intraoperative recordings and subsequent data analyses were carried out without prior knowledge of patients' symptom severity.

## **DBS SURGERY**

In all patients, bilateral DBS electrodes were implanted in a single operation. All patients had normal brain magnetic resonance imaging (MRI). As reported elsewhere in detail (Moll et al., 2009), surgical planning comprised the delineation of a reference line connecting anterior and posterior commissure in T1-weighted MRIs as well as a direct visualization of the surgical target region on MRIs fused with stereotactic computerized tomography scans for indirect and individual targeting, respectively. All procedures and collection of recordings during DBS surgery were approved by the local ethics committee and all patients gave their written and informed consent. In all patients, the posteroventral lateral aspect of the GPi was targeted 20–22 mm lateral to the midline, 2–3 mm anterior to the mid-commissural point and 3 mm below the commissural plane. Approach angles of the planned trajectories were 25 ± 11° in the sagittal and 10 ± 5° in the frontal plane. Microelectrode-guided stereotactic implantation of DBS electrodes encompassed recordings from up to five parallel tracks (mean number of recording tracks for mapping, 4 ± 1, Micro Guide, Alpha Omega Inc., Nazareth, Israel). The recording configuration used ("BenGun") consisted of four outer tracks arranged in a concentric array around a central trajectory aiming at the target. In all but one patient (case 10), the "BenGun" was turned 45° with respect to the standard "cross-like" configuration, taking into account the elongated morphology of the globus pallidus. Typically, antero- and postero-medial trajectories were used in addition to the central electrode, in some cases together with an additional anterolateral track. Microelectrodes (Alpha Omega Neuroprobe, Alpha Omega Inc., Nazareth, Israel) were simultaneously advanced in steps of 100–500 µm. Average tip impedance was 660 ± 290 kΩ at 1000 Hz. Recordings were started 16 ± 4 mm above the radioanatomically defined target level. In the LA group, vigilance level was continuously monitored and all patients were awake and co-operative throughout the whole recording procedure. During recordings, patients were asked to lie as still as possible with their eyes closed. Only "movement-free" recordings of spontaneous activity that were made before or after the assessment of sensorimotor responses by passive or active movements were included in the present study. In the GA group, anaesthesia was induced with an intravenous bolus of 2 mg/kg propofol. Intravenous anaesthesia was then maintained with 4.1 ± 1.3 mg/kg/h propofol in combination with 0.2 ± 0.04 µg/kg/min remifentanil. GA patients were ventilated via an endotracheal tube with an oxygen-air mixture and anaesthetic depth and adequacy were carefully monitored throughout the whole operation by an experienced anaesthesiologist. In contrast to LA surgeries, during which some of the patients experienced involuntary muscle spasms of their neck muscles, no spontaneous movements or signs of dystonia occurred during surgery in GA.

#### **Table 1 | Clinical characteristics of CD patients.**

*§As determined by the severity scale of the Toronto Western Spasmodic Torticollis Rating Scale (maximal score = 35).*

*Abbreviations: CD, cervical dystonia; TC, torticollis; LC, laterocollis; AC, anterocollis; RC, retrocollis; SE, shoulder elevation; WC, writer's cramp; n/a, not available (quantification of patient's symptom severity not possible due to cervical plate stabilization performed in the 1980s).*

#### **DATA ACQUISITION**

Unit activity was bandpass-filtered between 300 and 6000 Hz, amplified (×10,000), and sampled at 24 kHz. Monopolar local field potentials (LFPs) were recorded from the uninsulated distal-most part of the guide tube (contact size of this macrotip ∼1 mm, impedance <1 kΩ), located 3 mm above the microtip. The guide tube was used as common reference. LFPs were bandpass-filtered between 1 and 300 Hz, sampled at 1.5 or 3 kHz and amplified (×5–10,000).

## **DATA ANALYSIS**

#### **SPIKE DETECTION AND SORTING**

For spike detection, a threshold was set >5 *SD* of the background noise and then adjusted to the individual signal to noise ratio of the recorded unit (Offline-Sorter, Plexon Inc., Dallas, TX, USA). Extracted total waveform length was 3 ms with a prethreshold time of 1 ms. Waveform clusters were then visualized in 2D or 3D space. Typically, plotting the first vs. second principal component of the waveform resulted in a distinct cluster in addition to a noise cluster which was discarded. In the vast majority of cases, only one single unit was recorded per electrode tip. Next, we carefully checked the autocorrelogram of the resulting spike train for the presence of a central valley, to exclude refractory period violations. Cells in which >3% of the total interspike intervals (ISI) were shorter than 2 ms and multiunit activity were excluded from this study. The signal to noise ratio was then assessed for every well-isolated neuron. To this end, the peak-to-peak amplitude of the average action potential waveform was divided by the mean of 5× *SD* of the background activity, defined as noise. Rate stability was also tested. For every unit, instantaneous firing rate was calculated as a function of time with a binning of 10 ms and visualized after smoothing using a second-order low-pass filter with zero phase shift (cut-off frequency, 10 Hz). Recordings with obvious trends or transients in the rate functions were excluded from further analysis. Only well-isolated single cell data was included in our database.
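The two isolation criteria described above (fraction of ISIs shorter than 2 ms, and peak-to-peak amplitude relative to 5 × *SD* of the background) can be illustrated with a minimal sketch. This is not the authors' Matlab or Offline-Sorter code; the function names, the use of the population standard deviation, and the treatment of exactly 3% as acceptable are assumptions made for illustration.

```python
from statistics import pstdev


def isi_violation_fraction(spike_times_s, refractory_s=0.002):
    """Return the fraction of interspike intervals shorter than refractory_s."""
    isis = [later - earlier
            for earlier, later in zip(spike_times_s, spike_times_s[1:])]
    if not isis:
        raise ValueError("at least two spikes are required")
    return sum(isi < refractory_s for isi in isis) / len(isis)


def signal_to_noise(mean_waveform, background, noise_factor=5.0):
    """Peak-to-peak amplitude of the mean waveform over noise_factor x SD of background."""
    peak_to_peak = max(mean_waveform) - min(mean_waveform)
    # Assumption: population SD of the background samples is used as the noise SD.
    return peak_to_peak / (noise_factor * pstdev(background))


def passes_refractory_check(spike_times_s, max_fraction=0.03):
    """Units with more than 3% short ISIs were excluded in the text."""
    return isi_violation_fraction(spike_times_s) <= max_fraction
```

With spike times in seconds, `passes_refractory_check` returns `False` for a train in which one of four ISIs is shorter than 2 ms.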

As action potential shape could help to distinguish subpopulations of pallidal neurons (Bugaysen et al., 2010), we measured a variety of spike characteristics. These measures were derived from the averaged action potential waveform of the upsampled (1 MHz) and amplitude-normalized individual spikes. This study reports peak-to-peak durations. Several basic descriptors of single neuron discharge were determined to characterize ongoing spiking activity. Mean and peak (95th percentile rate) firing rate were calculated for each neuron. To evaluate the regularity of neural discharges, we computed the coefficient of variation (CV) as the SD of the ISI distribution divided by its mean. Because the CV (ISI) may overestimate the irregularity of bursting neurons, we also used the CV2 (Holt et al., 1996)—which compensates for bursting by calculating a local CV for consecutive ISI pairs and not for the whole spike train. We calculated the mean CV2 for each spike train. Neuronal bursting behavior was assessed by applying the Poisson burst surprise method (Legendy and Salcman, 1985). Only bursts having a surprise threshold >5 were considered, corresponding to a probability of <0.001 that the burst of interest would occur in a spike train that follows a Poisson distribution. The results of this analysis are given as the percentage of spikes that participate in those bursts. By dividing the mean ISI by the modal ISI we calculated a simple burst index for every unit, which has previously been applied to characterize human GP activity (Hutchison et al., 2003). In order to determine the time-magnitude structure of burst discharges, we calculated burst-triggered averages (Chan et al., 2011) for every unit that contained >5 bursts with a surprise value of 5. Finally, we adopted a modified version of Kaneoke and Vitek's neuronal discharge classification method (Kaneoke and Vitek, 1996) to differentiate regular, random and bursting discharge patterns (Levy et al., 2001). 
Briefly, discharge density histograms were created for every neuron (bin width = 1/mean firing rate) and compared with the discharge density of a Poisson process with a mean of 1 (Chi-square test). When the distribution followed a Poisson process, the neuron was classified as having a random discharge pattern. When the distribution differed significantly from a Poisson process, a regular or a bursting spike discharge pattern was assumed when the variance of the discharge density histogram was <1 or >1, respectively. Oscillatory properties of spike trains were assessed using the spectral method described by Rivlin-Etzion et al. (2006). Briefly, the power spectral density of a spike train was computed using nonoverlapping Hanning windows (length, 4096 ms) and a frequency resolution of 0.25 Hz. Confidence levels were created on the basis of a surrogate distribution of 100 random ISI shuffles. When peaks exceeded the 95% confidence interval, they were considered significant. Prevalence of units with significant oscillatory spiking was then assessed for each of the following frequency bands: theta (4–8 Hz), alpha (9–13 Hz), beta (14–35 Hz), and gamma (35–80 Hz).
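As a rough illustration of the regularity descriptors defined above, the following sketch computes the global CV, the mean CV2 of Holt et al. (1996), and the mean-to-modal ISI burst index from a list of ISIs. The function names are hypothetical, the population SD is assumed for the CV, and the 1 ms histogram bin used to locate the modal ISI is an assumption (the text does not state a bin width).

```python
from collections import Counter
from statistics import mean, pstdev


def cv(isis):
    """Global coefficient of variation: SD of the ISIs divided by their mean."""
    return pstdev(isis) / mean(isis)


def cv2(isis):
    """Mean local CV2 over consecutive ISI pairs (Holt et al., 1996)."""
    pairs = zip(isis, isis[1:])
    return mean(2 * abs(b - a) / (a + b) for a, b in pairs)


def burst_index(isis, bin_width=0.001):
    """Mean ISI divided by the modal ISI (centre of the fullest histogram bin)."""
    # Assumption: ISIs are in seconds and the mode is taken from 1 ms bins.
    counts = Counter(int(isi // bin_width) for isi in isis)
    modal_bin, _ = counts.most_common(1)[0]
    modal_isi = (modal_bin + 0.5) * bin_width
    return mean(isis) / modal_isi
```

A perfectly regular train gives CV = CV2 = 0, whereas a train alternating between short and long ISIs has a high CV2 even though its firing rate is stable. This local sensitivity to interval-to-interval changes is what distinguishes CV2 from the global CV.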

#### **LFP ANALYSIS**

Due to technical issues, LFPs of cases 5 (LA) and 11 (GA) were unavailable. On the basis of visual inspection, LFP recordings with exceptional noise levels and artifacts were excluded from further analysis. LFPs were then downsampled to a sampling frequency of 500 Hz and digitally filtered with a second-order notch filter at integer multiples of 50 Hz to remove line noise. To avoid phase distortions of the LFP signals, we applied zero-phase forward and reverse digital filtering. LFP power was estimated using multitaper spectral methods (time-bandwidth parameter *nw* = 3.5, 6 Slepian tapers, nfft = 500). To allow for a comparison across anaesthesia groups and to account for different LFP amplitudes depending on recording position and electrode impedance, we analyzed relative rather than absolute power. In order to obtain relative power, spectra were normalized to the total power between 1 and 250 Hz. As for the analysis of spike trains, spectra were subdivided into the above-mentioned frequency bands. Note that the 50 ± 2 Hz window was excluded for spectral analysis of LFPs in the gamma frequency range. Power was then averaged within these bands and significance levels of multiple *post-hoc* tests were adjusted using the Bonferroni correction.
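The normalization and band-averaging steps can be sketched as follows, starting from an already estimated power spectrum (the multitaper estimation itself is not reproduced here). The function name, the inclusive band edges, and the choice to leave the 50 ± 2 Hz window inside the 1–250 Hz total are assumptions for illustration only.

```python
# Frequency bands as listed in the spike-train analysis above.
BANDS_HZ = {"theta": (4, 8), "alpha": (9, 13), "beta": (14, 35), "gamma": (35, 80)}


def relative_band_power(freqs_hz, power, bands=BANDS_HZ,
                        total_range=(1.0, 250.0), line_hz=50.0, line_halfwidth=2.0):
    """Mean relative power per band, normalized to total power in total_range."""
    total = sum(p for f, p in zip(freqs_hz, power)
                if total_range[0] <= f <= total_range[1])
    result = {}
    for name, (low, high) in bands.items():
        # Skip bins within line_hz +/- line_halfwidth (only gamma overlaps 50 Hz).
        in_band = [p / total for f, p in zip(freqs_hz, power)
                   if low <= f <= high and abs(f - line_hz) > line_halfwidth]
        result[name] = sum(in_band) / len(in_band)
    return result
```

With nfft = 500 at a sampling rate of 500 Hz the spectrum has 1 Hz bins, so `freqs_hz` would simply be 0, 1, ..., 250 in this setting.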

#### **SPIKE-FIELD COHERENCE**

Spiking activity and LFP dynamics provide complementary perspectives on neural processes (Galindo-Leon and Liu, 2010; Moran and Bar-Gad, 2010). To estimate the relation between pallidal spiking and LFPs, we calculated the spike-field coherence (SFC) as described in Fries et al. (2001). Segments of LFP signal extending ±250 ms around each individual spike were used to calculate the spike-triggered average (STA). The individual Hamming-tapered LFP-segments and the Hamming-tapered STA were Fourier transformed to calculate the corresponding power spectral densities. SFC was calculated as the ratio between the power spectrum distribution of the STA and the mean power spectral density of LFP segments. The threshold of significance was determined by randomizing the spike-times (100 repetitions) preserving the number of spikes and the ISI distribution. The mean of the randomization-based SFC plus two standard deviations was considered as the threshold of significance. Spike-LFP pairs were never from the same electrode [distance between microtip (units) and macrotip (LFPs) was 3 mm].
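The SFC computation described above can be sketched in a few lines. This is an illustrative reading of the procedure, not the authors' implementation: the function names are hypothetical, spikes are given as sample indices, the window is specified in samples (±250 ms corresponds to ±125 samples at 500 Hz), and a naive DFT is used to keep the example free of dependencies. The surrogate-based significance threshold is not included.

```python
import cmath
import math


def power_spectrum(x):
    """Squared magnitude of a naive DFT at non-negative frequencies."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def spike_field_coherence(lfp, spike_samples, half_window):
    """Power of the tapered STA divided by the mean power of the tapered segments."""
    n = 2 * half_window + 1
    taper = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]  # Hamming
    # Keep only spikes whose full window lies inside the recording.
    segments = [lfp[s - half_window:s + half_window + 1] for s in spike_samples
                if s - half_window >= 0 and s + half_window < len(lfp)]
    if not segments:
        raise ValueError("no spike has a complete LFP window")
    sta = [sum(column) / len(segments) for column in zip(*segments)]
    sta_power = power_spectrum([v * w for v, w in zip(sta, taper)])
    seg_powers = [power_spectrum([v * w for v, w in zip(seg, taper)])
                  for seg in segments]
    mean_power = [sum(column) / len(segments) for column in zip(*seg_powers)]
    return [ps / pm if pm > 0 else 0.0 for ps, pm in zip(sta_power, mean_power)]
```

When every spike occurs at the same phase of an LFP oscillation, all segments are identical and the ratio approaches 1 at that frequency. When spikes are spread evenly over all phases, the STA cancels and the ratio approaches 0.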

#### **FIELD–FIELD COHERENCE**

The coherence *Cij*(*ω*), defined as the measure of the linear dependency at a particular frequency *ω* of two simultaneously recorded LFP signals from electrodes *i* and *j*, was calculated as:

$$C_{ij}(\omega) = \frac{\left|P_{ij}(\omega)\right|^2}{P_{ii}(\omega)\,P_{jj}(\omega)}\tag{1}$$

with *Pii*(*ω*) and *Pjj*(*ω*) the power spectral densities of channels *i* and *j*, respectively, and *Pij*(*ω*) the cross power spectral density.
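A minimal sketch of Equation (1) is given below. The spectra are estimated by averaging over non-overlapping segments, which is needed because a single segment always yields a coherence of 1. The segment scheme, the absence of a taper, the naive DFT, and the function names are simplifying assumptions, not a description of the estimator used in the study.

```python
import cmath
import math


def half_spectrum(x):
    """Naive DFT for non-negative frequencies (adequate for short segments)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def coherence(x, y, seg_len):
    """Magnitude-squared coherence per Equation (1), averaged over segments."""
    n_seg = min(len(x), len(y)) // seg_len
    if n_seg < 2:
        raise ValueError("need at least two segments; one segment always gives 1")
    n_freq = seg_len // 2 + 1
    p_ii = [0.0] * n_freq   # auto-spectrum of channel i
    p_jj = [0.0] * n_freq   # auto-spectrum of channel j
    p_ij = [0j] * n_freq    # cross-spectrum
    for s in range(n_seg):
        xs = half_spectrum(x[s * seg_len:(s + 1) * seg_len])
        ys = half_spectrum(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_freq):
            p_ii[k] += abs(xs[k]) ** 2
            p_jj[k] += abs(ys[k]) ** 2
            p_ij[k] += xs[k] * ys[k].conjugate()
    # Common normalization factors cancel in the ratio.
    return [abs(p_ij[k]) ** 2 / (p_ii[k] * p_jj[k]) if p_ii[k] * p_jj[k] > 0 else 0.0
            for k in range(n_freq)]
```

Two signals with a fixed phase relation at a given frequency yield a value near 1 at that frequency regardless of their amplitudes, while a phase relation that changes from segment to segment drives the value toward 0.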

#### **SPIKE–SPIKE COHERENCE**

The frequency relation between the spiking activity of two single units was estimated by the spike–spike coherence as described in Halliday et al. (1995), using the NeuroSpec toolbox (version 2.0, www.neurospec.org). Briefly, in this algorithm the discrete spiking signal (defined by the spike onset times at millisecond resolution) is converted into a continuous signal of 0s and 1s sampled at 1 kHz, with 1 corresponding to a *spike* and 0 to *no spike*. With both signals in continuous form, the algorithm proceeds to evaluate their power spectral densities and cross power spectral density as in Equation (1).
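The conversion from spike times to a 1 kHz binary signal can be sketched as below. This is an illustration and not NeuroSpec code; the function name and the floor-based assignment of a spike to its millisecond bin are assumptions.

```python
import math


def spikes_to_binary(spike_times_ms, duration_ms):
    """Return a list of 0s and 1s with one sample per millisecond."""
    train = [0] * duration_ms
    for t in spike_times_ms:
        idx = math.floor(t)          # millisecond bin containing the spike
        if 0 <= idx < duration_ms:   # ignore spikes outside the recording
            train[idx] = 1
    return train
```

Two spikes falling into the same millisecond collapse into a single 1. For well-isolated single units this should be rare, given the refractory-period criterion applied during spike sorting.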

#### **PHASE–AMPLITUDE COUPLING**

Cross-frequency modulation was quantified using the previously described modulation index (Canolty et al., 2006). Briefly, LFP data was filtered into low-frequency (2 Hz-intervals between 2 and 20 Hz) and high-frequency (5 Hz-intervals between 20 and 200 Hz) bands. Analytic phase and amplitude envelope were then extracted by applying the Hilbert transform. A surrogate distribution with disrupted phase–amplitude relationship was generated by recalculating the modulation index 200 times, randomly assigning power-to-phase couples. Raw modulation indices were then transformed so that the normalized modulation index was defined as distance (in units of standard deviation) away from the mean of the distribution expected by chance (Cohen et al., 2009).
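The normalization step can be sketched as follows, assuming the low-frequency phase and high-frequency amplitude series have already been obtained (filtering and the Hilbert transform are not shown). The raw index is taken as the length of the mean vector of amplitude × exp(i·phase), following Canolty et al. (2006). Surrogates are built by randomly permuting amplitude values relative to phase, which is one reading of "randomly assigning power-to-phase couples". The function names and the fixed random seed are illustrative assumptions.

```python
import math
import random
from statistics import mean, stdev


def raw_modulation_index(phase, amplitude):
    """Length of the mean vector of amplitude * exp(i * phase)."""
    re = sum(a * math.cos(p) for p, a in zip(phase, amplitude))
    im = sum(a * math.sin(p) for p, a in zip(phase, amplitude))
    return math.hypot(re, im) / len(phase)


def normalized_modulation_index(phase, amplitude, n_surrogates=200, seed=0):
    """Return (raw index, z-score relative to the shuffled surrogate distribution)."""
    rng = random.Random(seed)
    raw = raw_modulation_index(phase, amplitude)
    shuffled = list(amplitude)
    surrogate = []
    for _ in range(n_surrogates):
        rng.shuffle(shuffled)  # break the phase-amplitude pairing
        surrogate.append(raw_modulation_index(phase, shuffled))
    return raw, (raw - mean(surrogate)) / stdev(surrogate)
```

An amplitude envelope that peaks at a fixed phase of the slow rhythm yields a raw index well above the surrogate mean and therefore a large z-score, whereas an envelope unrelated to phase yields a z-score near the bulk of the surrogate distribution.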

All statistical spike train and LFP analyses were carried out offline using custom-written Matlab software (The MathWorks, Natick, MA). Comparisons were performed using the software GraphPad Prism (Version 5.0, GraphPad Software, Inc., La Jolla, CA). In case of non-normal distribution of the data, nonparametric statistical testing was applied (Mann–Whitney rank sum test). Correlations were measured using Spearman's rank correlation coefficient Rho. If not stated otherwise, population averages represent data pooled from both operated hemispheres. Lateralized differences were assessed in patients with significant head turn (all patients except case #5). To allow for a calculation of meaningful averages, individual interhemispheric differences were assessed in patients with a minimum of 5 isolated single units per hemisphere. An alpha level of 0.05 was used for all statistical tests in this study. Unless otherwise noted, all values are given as mean ± *SD*.

### **RESULTS**

#### **NEURONAL DATA SAMPLE**

We recorded the activity of 593 cells from 87 trajectories that traversed the GP region of 9 CD patients recorded under LA (*n* = 394 cells) and of 4 CD patients that were operated under GA with propofol and remifentanil (*n* = 199 cells). The average recording duration in this study was 50 s (range, 15–384 s). Only well-isolated single cell data was included in our database. The average fraction of ISIs that violated a 2 ms post-spike silence was 0.27 ± 0.55% and the average signal to noise ratio for the whole dataset was 2.7 ± 1.1 (noise was defined as 5 standard deviations of the background activity to comply with the peak-to-peak definition of the signal). The average signal to noise ratio in recordings carried out under GA (3.2 ± 1.4) was found to be significantly higher compared to LA recordings (2.5 ± 1.0, Mann–Whitney test, *p* < 0.0001). The incidence of isolating more than one single unit from an individual microelectrode tip was considerably higher in GA recordings (17/189 = 9%) compared to recordings in the awake patient (6/356 = 1.7%). Possibly, this is related to reduced noise levels and neuronal firing rates in the GP of patients operated under GA (see below), but stability of recordings could also be improved under GA. We carefully combined radioanatomical and electrophysiological criteria to define 193 of the cells as GPe cells (LA, *n* = 133; GA, *n* = 60), 349 as GPi cells (LA, *n* = 220; GA, *n* = 129), and 51 as "border cells" (LA, *n* = 41; GA, *n* = 10). A zone of electrical silence and the presence of peripallidal border cells were important landmarks for the localization of white matter fascicles surrounding GPe and GPi, respectively.

## **COMPARISON OF NEURONAL DISCHARGES UNDER LOCAL vs. GENERAL ANAESTHESIA**

#### *GPe neurons*

Average peak-to-peak durations of GPe spikes were slightly but significantly longer in the LA group [Mann–Whitney test, *p* = 0.03; 260 ± 74 µs (LA) vs. 231 ± 41 µs (GA)]. Peak and average firing rates of GPe neurons in the GA group were significantly lower compared to GPe neurons recorded under LA [Mann–Whitney tests, *p* < 0.0001; peak firing rate, 219.4 ± 98.4 Hz (LA) vs. 123.8 ± 64.9 Hz (GA); mean firing rate, 52.1 ± 27.4 Hz and 13.3 ± 8.7 Hz, respectively; **Table 2**; **Figures 1A–D**]. The strong reduction in rate was accompanied by a drastic alteration of the discharge pattern. While the majority of GPe neurons (∼80%) recorded under LA fired in an irregular manner, a bursty firing pattern was predominant in GPe neurons under GA. This difference was significant when comparing the proportion of neurons exhibiting different discharge modes (Chi-Square = 97.85, *df* = 2, *p* < 0.0001; **Figure 1I**) and for all tested descriptors of bursting behavior (Mann–Whitney test, *p* < 0.0001; e.g., 19.6 ± 15.8 vs. 59.8 ± 21.4% spikes participating in bursts, **Figures 1G,H**). Accordingly, both the global and local coefficients of variation had significantly higher means in the GA group compared to GPe recordings under LA (**Table 2**; **Figures 1E,F**). Burstiness of GPe neurons under LA and GA was inversely correlated with mean firing rate (LA, Spearman's rho −0.31, *p* = 0.0002; GA, Spearman's rho −0.31, *p* = 0.02).

#### *GPi neurons*

As with GPe neurons, action potentials of GPi neurons recorded under LA were significantly longer (280 ± 66 µs) compared to waveforms derived from GA recordings (255 ± 63 µs; Mann–Whitney test, *p* < 0.001). A comparison of the raw spike trains depicted in **Figures 2A,B** reveals that the direction of rate and pattern changes in the GPi associated with different anaesthesia conditions was similar to the differences seen in GPe recordings. In comparison with LA, GPi recordings under GA had significantly lower firing rates and more bursty discharge patterns (**Table 2**; **Figures 2C–I**). A significant negative correlation was noted between burstiness and mean firing rate in both anaesthesia conditions (LA, Spearman's rho −0.45, *p* < 0.0001; GA, Spearman's rho −0.24, *p* = 0.004). When comparing discharge rates between GPe and GPi within the patient group operated under LA, GPi cells (72.0 ± 28.2 Hz) fired significantly faster than GPe neurons (52.1 ± 27.4 Hz; Mann–Whitney test, *p* < 0.0001). In 5/9 patients, GPi firing rates were significantly higher compared to the GPe (Mann–Whitney tests, *p* < 0.05). In the remaining 4 patients, a non-significant trend toward higher GPi rates was observed (not shown). The average ratio of GPe:GPi firing frequency (Obeso et al., 1997) was 0.73 for recordings under LA and 0.51 for GA.

**Table 2 | Electrophysiological properties of GP neurons in patients with CD.**

*Note that results in this table are given as means* ± *s.e.m.*

*Abbreviations: GPe, external globus pallidus; GPi, internal globus pallidus; LA, local anaesthesia; GA, general anaesthesia; CV (ISI), coefficient of variation of the interspike intervals; CV2 (ISI), local coefficient of variation of the interspike intervals as determined by the method of Holt et al. (1996), P–P duration, peak-to-peak duration of the average action potential waveform.*

#### *Border cells*

Peripallidal border cells were recorded at the transitions from putamen to GPe and from GPe to GPi, respectively. Occasionally, putative border cells were also found within or at the ventral border of the GPi (delineating the lamina pallidi interna and the transition to the subpallidal fiber field, respectively). In line with previous reports (Delong, 1971; Hutchison et al., 1994; Taha et al., 1997), border cells exhibited several distinguishing features compared to GP neurons. First, border cells displayed long spike durations, which, however, were not significantly influenced by the anaesthetic regime (LA, 332 ± 96 µs and GA, 321 ± 53 µs; Mann–Whitney test, *p* = 0.7). Border cells had lower peak firing rates, and generally discharged significantly more slowly (∼30 Hz) and more regularly (*CV* values <0.7) compared to both GPe and GPi neurons. Another prominent feature of border cell firing was the virtual absence of group discharges, as evidenced by low values for all measures that describe bursting behavior. In contrast to the apparent sensitivity of GPe and GPi neurons to anaesthetic drugs (see above), anaesthesia had little impact on border cell activity (**Figures 3A,B**). In agreement with previous observations made in generalized dystonia patients (Hutchison et al., 2003), no single descriptive parameter of neuronal activity differed significantly between LA and GA (all tests *p* > 0.05; **Figures 3C–H**). However, a significant difference was noted between the LA and GA group concerning the proportion of border cells that exhibited regular, random, or bursty discharge patterns (Chi-Square = 7.1, *df* = 1, *p* = 0.008; **Figure 3I**). The proportion of regular firing border cells was higher in the patient group recorded under GA.

#### **BURSTING BEHAVIOR**

To further characterize the time-magnitude structure of group discharges, we constructed peri-burst time histograms for all cells that contained >5 bursts. When comparing GPe and GPi group discharges within the same anaesthesia condition, burst morphology was relatively similar. The main determinants were (i) a short pre-burst decrease in discharge rate, (ii) peak intra-burst firing rates being reached around or shortly after burst onset, and (iii) a relatively rapid decay of neuronal firing rates back to baseline levels within 100–150 ms (**Figures 4A,B**). Bursting properties differed considerably between the two anaesthesia conditions. The pre-burst notch was absent or only moderately expressed in group discharges recorded from GPe and GPi neurons with low baseline firing rates under GA (**Figures 4C,D**). In both pallidal segments, average duration of burst discharges was significantly longer under GA compared to LA (**Figures 4E,F**).

In six cells recorded from the GPi of six different patients operated under LA (cases 1, 2, and 5–8), we observed a unique burst discharge pattern that consisted of (i) an initial brief period of tonic single spiking turning into (ii) the actual high-frequency group discharge and (iii) a post-burst pause in neuronal firing (**Figure 5A**). Burst episodes were characterized by a progressive spike amplitude reduction and an acceleration-deceleration spiking pattern (see the sub-panel in **Figure 5A**). This acceleration-deceleration pattern resulted in a parabolic-like shape when ISI duration was plotted as a function of position within the burst (ISI ordinal). **Figure 5B** shows pooled data from all bursts that were recorded from the same neuron as shown in **Figure 5A**, with alignment at the position of the last spike in a burst. Note the reduced variability in the fine structure of the burst during the acceleration phase and the progressive lengthening of ISIs at burst end. The population average of mean peri-burst discharge rates for all neurons with this unique bursting morphology is depicted in **Figure 5C**. The accelerating discharge rate following burst onset and the post-burst decrease in firing rate are clearly distinguishable. Firing rate then gradually ramps up to the next peak around 300 ms, which marks the onset of the next burst and indicates strong rhythmicity in these neurons. In fact, each cell had a significant spectral peak in the lower theta-frequency range (thin gray lines in **Figure 5D**), which gave rise to a distinct peak in the mean corrected power spectrum at 3–4 Hz (thick black line). The percentage of spikes participating in bursts was considerably higher in cells displaying the aforementioned bursting properties compared to the average burstiness of all GPi cells (34.7 ± 13.7% vs. 11.9 ± 11.7% for the whole sample). The average firing rate of these cells was not significantly different from the average firing rate of non-oscillatory GPi neurons recorded under LA, but lower compared to other oscillatory cells that did not display the described unique discharge characteristics.

Abbreviations: Peak firing rate, 95th percentile rate; CV (ISI), coefficient of variation of the interspike intervals; CV2 (ISI), local coefficient of variation of the interspike intervals as determined by the method of Holt et al.

significance (∗∗∗*p* < 0.001). **(I)** Stacked bar graph of GPe cells under LA and GA, depicting the relative portion of cells firing in different discharge modes as classified by the algorithm of Kaneoke and Vitek (1996).

## **OSCILLATORY ACTIVITY IN GP CELLS**

As rhythmic neuronal activity in the GP has repeatedly been proposed to have a role in the pathophysiology of dystonia, we also assessed spike-train periodicity. **Figure 6A** (left and middle panel) shows two cells recorded from the GPi of case 3 (operation under LA) that exhibited prominent oscillatory activity as evidenced by spectral peaks in the theta-frequency range. The relative proportion of cells with significant peaks in their corrected power spectra was about the same for all investigated cell types and patient groups (GPi/LA, 20%; GPe/LA, 15%; GPi/GA, 16.3%; GPe/GA, 26.3%; Border cells/LA, 19.5%; Border cells/GA, 20%). The stacked-bar graph in **Figure 6B** provides their percent abundance relative to all neurons within a patient subsample for each frequency band, as well as numbers of cells contributing to each portion. Irrespective of anaesthesia conditions, theta-oscillatory neurons constituted the largest fraction of cells with significant spectral peaks within the GPi. Gamma-oscillatory units in the GPe were noticeably more prevalent under GA. In fact, strong oscillatory modulations at high frequencies were repeatedly observed in the peri-burst time histograms of these cells (see **Figure 4C**). Significant spectral peaks of border cells originated from their rhythmic single cell spiking (see **Figures 3A,B**), with peak frequencies roughly corresponding to their average discharge rates. In both the LA and GA sample, the average firing rate of GPi oscillatory neurons was significantly higher compared to cells without significant peaks in their corrected power spectra (GPi/LA, 80.4 ± 30.2 Hz vs. 69.9 ± 27.3 Hz, *t*-test, *p* = 0.03; GPi/GA, 39.6 ± 33.8 Hz vs. 23.2 ± 18.4 Hz, Mann–Whitney test, *p* = 0.02). In the GPe, however, average discharge rates of oscillatory and non-oscillatory cells were not significantly different (GPe/LA, 50.8 ± 31.9 Hz vs. 52.4 ± 26.7 Hz, *t*-test, *p* = 0.81; GPe/GA, 15.3 ± 10.6 Hz vs. 12.6 ± 7.9 Hz, Mann–Whitney test, *p* = 0.46). No significant differences were noticed in any sample between the burstiness of oscillatory cells and their non-oscillatory counterparts (Mann–Whitney tests, all *p*-values > 0.05).

#### **SPIKE–SPIKE COHERENCE**

Pairwise coherence was investigated in a total of 126 simultaneously recorded spike train pairs. The majority of pallidal unit pairs exhibited flat coherence spectra and cross-correlations. The percentage of GPi cell pairs with significant peaks in the coherence-spectra was not significantly different between LA (10/41, 21.3%) and GA recordings (9/38, 23.7%, Fisher's exact test, *p* = 0.74). However, the proportion of significantly coherent cell pairs in the GPe was significantly lower in LA (5/26, 19.2%) than in GA recordings (7/21, 33.3%, Fisher's exact test, *p* = 0.04). Irrespective of anaesthesia and pallidal structure, significant peaks in the spike–spike coherence spectra consistently occurred in the theta-, beta-, and gamma frequency range. **Figure 6A** (right panel) shows significant theta-coherence in a cell pair that was simultaneously recorded in the GPi of case 3 and **Figure 6C** provides numbers and relative portions of significant coherence spectra between pallidal neurons for each frequency band. Alpha-band coherence was restricted to recordings under GA and related to propofol-associated alpha-spindling (not shown). Spike–spike coherence in the gamma-band range was more prevalent in GPe than in GPi.

## **LFP POWER**

Our pallidal LFP database of patients operated under LA comprised artifact-free recordings from >250 sites in 8 patients (average number of LFP recording sites per patient: 35 ± 23) located within GPi and GPe, respectively. Power spectra of these recordings were compared to those derived from >100 LFP recordings under GA in 3 patients (average number of LFP recording sites per patient: 40 ± 27) from each pallidal structure (**Figure 7**). Power spectra of both GPe and GPi recordings under LA were dominated by a peak in the theta-frequency range (peak frequency ∼7 Hz). As **Figures 7A,C** illustrate, LFP power under GA also peaked in the theta range in both pallidal subdivisions, albeit with higher peak power and a peak frequency shift toward lower frequencies. When averaging all GPi LFPs for individual patients, each patient displayed a main low-frequency peak between 2 and 8 Hz in the power spectrum (not shown). A second distinct spectral peak in the alpha-frequency range was recognizable in 7 out of 9 pallida of patients operated under LA. The GA group generally exhibited significantly more delta but less alpha power compared to LA in both GPe and GPi (**Figures 7B,D**). Due to a more pronounced peak frequency shift of GPe spectral power under GA, beta but not theta power was significantly lower compared to GPe–LFP power under LA. In contrast, GPi theta power was significantly higher in awake patients compared to patients operated under GA. When comparing the spectral profile of GPe vs. GPi within the LA patient group, significant power differences were found in two frequency ranges. While theta LFP power was significantly higher within GPi (Mann–Whitney test, Bonferroni-corrected *p* < 0.0001), alpha power was significantly higher within GPe (Mann–Whitney test, *p* = 0.02 after Bonferroni correction).

on an expanded time-scale. **(B)** The acceleration-deceleration pattern resulted in a parabolic-like shape when ISI duration was plotted as a function of position within the burst (ISI ordinal). Pooled data from all bursts that were recorded from the visualized neuron, with alignment at the position of the last spike in a burst. Note the reduced variability in

## **LFP–LFP COHERENCE**

When comparing simultaneously recorded sites within GPi and GPe, respectively, peak LFP-coherence (with values >0.7) consistently occurred at low frequencies <10 Hz. The magnitude of LFP coherence under GA was generally lower compared to LA across the whole frequency spectrum in both pallidal structures (**Figures 8A,C**). However, when band-specific LFP–LFP coherence was assessed statistically, significant differences between the two anaesthesia conditions were confined to the region of the GPi. Here, theta-, alpha-, and gamma coherence were significantly higher for LA compared to GA recordings (**Figure 8D**; Mann–Whitney tests, Bonferroni-corrected *p*-values = 0.003, 0.005, and <0.001, respectively). In contrast, no significant differences in LFP coherence between LA and GA were observed within GPe (**Figure 8B**). When comparing the magnitude of LFP–LFP coherence between GPe and GPi in patients operated under LA, no significant differences were observed in any frequency band. The only significant difference in LFP coherence under GA was detected in the gamma frequency range, which was significantly higher in GPe compared to GPi (Mann–Whitney test, *p*-value after Bonferroni-correction = 0.004).

acceleration-deceleration pattern as in **(B)**. Firing rates were normalized by each neuron's average discharge rate between −300 and −100 ms before burst onset. **(D)** Mean (thick black line) and individual (gray line) normalized power spectra of the same cells as in **(C)**. All neurons had significant theta peaks in their corrected power spectra.

#### **PHASE–AMPLITUDE COUPLING IN PALLIDAL LFPs**

represent the asymptotic value of each estimate and solid horizontal lines give the estimated upper and lower 95% confidence limits. (Right panel) Coherence estimate. The horizontal dashed line gives the estimated upper 95% confidence limit. **(B)** Stacked bar graph illustrating absolute numbers

significantly coherent pairs. Individual pairs of simultaneously recorded cells could have more than one significant peak in their coherence spectra. Therefore, the sum of all relative portions may exceed 100% for a given subsample.

After having confirmed the presence of a dominant theta-rhythm in both power and coherence of pallidal LFPs, we wanted to know whether low-frequency phase and high-frequency amplitude of different LFP rhythms are coupled. To this end, we used the method of Canolty et al. (2006) and scanned a broad range of phase–amplitude pairs (each created by extracting phase information from a lower and amplitude from a higher frequency band) for the presence of significant co-modulation. The pseudocolor plots in **Figure 9** depict grand averages of *z*-scored modulation indices across all patient populations and pallidal substructures, respectively. Only statistically significant values (Bonferroni-corrected *z*-score > 4) of cross-frequency phase–amplitude coupling are shown. Our analysis revealed especially strong theta-phase modulation (peaking at 3–7 Hz) of the lower gamma-band (peaking at ∼40 Hz) in the GPi, but also in the GPe of patients operated under LA. Moreover, high gamma amplitude (>125 Hz) was modulated by the phase of oscillatory activity in the upper theta/lower alpha frequency range. Interestingly, the latter theta/alpha-to-high-gamma comodulation was also observed in both GPe and GPi in recordings under GA. In contrast, theta-low gamma coupling was noticeably less expressed or absent in the GPi under GA. Instead, we observed substantial delta-phase modulation of oscillations in the beta-frequency (20–30 Hz) range. No difference in the phase–amplitude coupling pattern was found between ipsi- and contralateral recordings for any structure or condition.
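The z-scored modulation index used in such a scan can be sketched as follows. This is a minimal illustration in the spirit of the Canolty et al. (2006) approach, not the authors' analysis code: it assumes that band-pass filtering and Hilbert transformation have already produced a low-frequency phase series and a high-frequency amplitude envelope, and all function names, parameter values and the synthetic test signal (`simulate`) are hypothetical. Surrogates are built by circularly shifting the amplitude series against the phase series.

```python
import cmath
import math
import random
import statistics


def modulation_index(phase, amplitude):
    """Raw index: length of the mean of amplitude * exp(i * phase)."""
    total = sum(a * cmath.exp(1j * p) for p, a in zip(phase, amplitude))
    return abs(total / len(phase))


def z_scored_index(phase, amplitude, n_surrogates=200, seed=0):
    """z-score the raw index against surrogates with time-shifted amplitude."""
    rng = random.Random(seed)
    n = len(phase)
    unit = [cmath.exp(1j * p) for p in phase]

    def index_at(lag):
        # Circularly shift the amplitude series by `lag` samples.
        return abs(sum(unit[t] * amplitude[(t + lag) % n] for t in range(n)) / n)

    raw = index_at(0)
    # Keep lags well away from 0 so surrogates are decorrelated from the raw pairing.
    surrogates = [index_at(rng.randrange(n // 5, n - n // 5))
                  for _ in range(n_surrogates)]
    return (raw - statistics.mean(surrogates)) / statistics.stdev(surrogates)


def simulate(coupled, n=5000, fs=200.0, f_phase=6.0, seed=1):
    """Toy data: phase of a noisy 6 Hz rhythm and a noisy amplitude envelope."""
    rng = random.Random(seed)
    phase, amplitude, phi = [], [], 0.0
    for _ in range(n):
        phi += 2 * math.pi * f_phase / fs + rng.gauss(0.0, 0.2)
        depth = 0.5 * math.cos(phi) if coupled else 0.0
        phase.append(phi % (2 * math.pi))
        amplitude.append(1.0 + depth + rng.gauss(0.0, 0.2))
    return phase, amplitude
```

Calling `z_scored_index(*simulate(coupled=True))` yields a large positive z-score, whereas the uncoupled toy signal stays near zero. In a full scan, one such value would be computed for every phase–amplitude frequency pair and compared against the significance threshold.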

## **SPIKE-FIELD COHERENCE**

SFC between LFPs and pallidal single unit activity was generally small and tended to disappear in the relatively high variance of the estimate. Estimating phase-coupling between spikes and LFPs with the pairwise phase consistency method of Vinck et al. (2010) produced essentially the same results, indicating that local spiking activity is not locked to global pallidal population oscillations picked up at distant sites.
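For orientation, the pairwise phase consistency estimate can be written in a few lines. The sketch below assumes that the LFP phase at each spike time has already been extracted. The function names and any input values are illustrative and not taken from the study. The estimate is the mean cosine of the phase difference over all spike pairs, which is equivalent to a closed form based on the resultant vector.

```python
import cmath
import math
from itertools import combinations


def ppc(phases):
    """Pairwise phase consistency of spike phases (radians).

    Closed form of the mean cosine of all pairwise phase differences:
    (|sum(exp(i*theta))|**2 - N) / (N * (N - 1)).
    """
    n = len(phases)
    if n < 2:
        raise ValueError("need at least two spike phases")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


def ppc_pairwise(phases):
    """Same quantity computed directly over all spike pairs (slow, for checking)."""
    pairs = list(combinations(phases, 2))
    return sum(math.cos(a - b) for a, b in pairs) / len(pairs)
```

Values near 1 indicate that spikes occur at a consistent LFP phase, and values near 0 indicate no consistent locking. Because the self-pairs are excluded, the estimate is not inflated by low spike counts in the way the plain resultant length is.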

## **CORRELATION OF ELECTROPHYSIOLOGICAL RECORDINGS WITH DEMOGRAPHIC AND CLINICAL VARIABLES**

As illustrated in **Figure 10A**, a significant positive correlation was observed (Spearman's rho 0.78, *p* = 0.017) between patients' age and internal pallidal firing rate (both hemispheres pooled). Age and GPe discharge rate were not correlated (Spearman's rho 0.12, *p* = 0.776). A significant negative correlation was observed between disease duration and GPi firing rate (Spearman's rho −0.7, *p* = 0.043; not shown). However, an even stronger negative correlation was found between disease duration and external pallidal discharge rate (Spearman's rho −0.81, *p* = 0.011; **Figure 10B**). We then wanted to test whether the proportion of lifetime encumbered with CD also correlated with pallidal discharge rates. Therefore, we normalized the duration of CD symptoms to age at surgery by calculating the ratio: CD duration/age at surgery (Lumsden et al., 2013). While the correlation of this ratio and GPe rate missed statistical significance (Spearman's rho −0.62, *p* = 0.085), a significant negative correlation with internal pallidal discharge rate was found (Spearman's rho −0.72, *p* = 0.037). No significant correlation was found between patients' age and disease duration (Spearman's rho −0.29, *p* = 0.46) or torticollis symptom severity (Spearman's rho −0.07, *p* = 0.843).
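The normalization and rank correlation described here can be sketched as follows. This is an illustrative standard-library version with hypothetical function names. It returns only the correlation coefficient and does not compute the *p*-values reported above, and any input values supplied to it would be the reader's own, not data from the study.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def proportion_of_life(duration_years, age_years):
    """CD duration divided by age at surgery, per patient."""
    return [d / a for d, a in zip(duration_years, age_years)]
```

With one mean firing rate per patient, `spearman_rho(proportion_of_life(durations, ages), rates)` gives the coefficient for the normalized measure.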

#### **LATERALIZED DIFFERENCES**

In patients operated under LA, between 3 and 22 GPi neurons were isolated per hemisphere (mean, 12 ± 5 neurons) and 5 or more units were obtained in 16/18 hemispheres. This allowed us to perform meaningful individual comparisons of discharge parameters between the two sides in most patients.

To better understand the relation between pallidal outflow and the asymmetric clinical manifestation of CD, we assessed possible lateralized differences in relation to the direction of head turn (chin direction) in patients with torticollis. In the 8 patients of the LA group with significant head turn, the population average of GPi discharge rates ipsilateral to the side of head turn (*n* = 99) was significantly higher than on the contralateral side (*n* = 99; 74.8 ± 27.3 vs. 67.9 ± 27.5 Hz; Mann–Whitney test, *p* = 0.04; **Figure 11A**). When this comparison was done for each patient individually, it was noted that ipsilateral discharge rates were always slightly higher than contralateral, but lateralized differences only reached statistical trend level in 3/8 patients (**Figure 11B**). In order to control for spurious side-to-side differences, we also performed a comparison between neurons recorded in the left (*n* = 120) and right (*n* = 100) GPi, respectively. Average discharge rates did not differ (unpaired *t*-test, *p* = 0.7). When the interhemispheric comparison was done for GPi cells recorded from anaesthetized patients, no significant difference in firing rate was found between the two sides [ipsilateral (*n* = 53) 25.6 ± 19 vs. contralateral (*n* = 76) 26.2 ± 24.4 Hz; Mann–Whitney test, *p* = 0.9]. Having observed that side-to-side differences were present under LA, but not GA, we then wanted to test whether these lateralized differences are a feature of the whole pallidal axis, including the GPe. Notably, when comparing mean firing rates of GPe neurons ipsilateral (*n* = 54) and contralateral (*n* = 79) to chin direction in the LA group, no significant interhemispheric difference was found (Mann–Whitney test, *p* = 0.5), suggesting that the described discharge asymmetry in our patients was confined to the level of basal ganglia output neurons in the GPi. Peak firing rate and descriptors of firing patterns showed no side-to-side difference, neither in the LA group (Mann–Whitney tests; peak firing rate, *p* = 0.3; CV (ISI), *p* = 0.1; CV2 (ISI), *p* = 0.5; participation of spikes in bursts, *p* = 0.4; Burst index, *p* = 0.9) nor in patients with GA. Likewise, no significant difference in any other GPe discharge parameter was found between the two sides. In patients with significant head turning that were operated under LA, the number of GPi neurons with significant spectral peaks showed no lateralized difference (ipsilateral *n* = 26/contralateral *n* = 28). Because the presence of dystonic head tremor could influence pallidal neuronal activity (Raz et al., 2000), we wondered whether tremor-related inhomogeneities in our patient sample could partially explain some of our results. Noticeably, the mean neuronal firing rate of GPi cells did not differ significantly between tremulous and non-tremulous CD patients operated under LA (Mann–Whitney test; *p* = 0.6).

Subsequently, we addressed the question whether similar lateralized differences might be observed in LFPs recorded from the GPi. The power spectral profiles of LFPs recorded from hemispheres ipsilateral and contralateral to the side of head turning were largely similar and both dominated by a distinct peak in the 4–9 Hz range (**Figure 11C**). After Bonferroni correction for multiple comparisons, significant band-specific differences were found for two frequency ranges: while alpha power was significantly higher in the ipsilateral GPi (Mann–Whitney test, *p* = 0.0028), power in the gamma-frequency range was significantly higher in the GPi contralateral to the side of chin direction (Mann–Whitney test, *p* < 0.0001). Side-specific assessment of LFP–LFP coherence revealed strongly coherent GPi–LFP signals in both hemispheres across a broad range of frequencies. Beta-band LFP coherence within the GPi ipsilateral to the side of chin rotation was significantly higher compared to the opposite side (**Figure 11D**; Mann–Whitney test, Bonferroni-corrected *p*-value = 0.04).
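The band-wise testing scheme used throughout these comparisons (one Mann–Whitney test per frequency band, followed by Bonferroni correction) can be sketched as below. This is a simplified illustration under stated assumptions: it uses the normal approximation without tie or continuity correction, and the function names are hypothetical. The exact test variant and the number of bands entering the correction are not specified in this section.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann–Whitney test via the normal approximation.

    Returns (U statistic for sample x, p-value). No tie or continuity
    correction is applied, so this is only adequate for larger samples.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))
    # Average 1-based rank for each distinct value, so ties share a rank.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1
    rank_sum_x = sum(rank_of[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In use, the ipsilateral and contralateral power (or coherence) values of each band would be passed to `mann_whitney_u`, and the resulting list of *p*-values to `bonferroni`.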

#### **CORRELATION WITH DISEASE SEVERITY (TWSTR SEVERITY SCORE)**

Mean discharge rate of GPi cells pooled from both hemispheres was not correlated with torticollis severity (Spearman's rho 0.37, *p* = 0.4). Noticeably, for neurons recorded from the GPi ipsilateral to head turn, there was a significant correlation with the severity of dystonic symptoms (**Figure 12**; Spearman's rho 0.775, *p* = 0.04). In contrast, no significant relationship was found between contralateral GPi firing rate and TWSTR severity score (Spearman's rho 0.406, *p* = 0.4).

#### **DISCUSSION**

The principal findings of the current study were that lateralized differences of mean GPi discharge rates, depending on the direction of head excursion, exist in patients with CD. Mean GPi discharge ipsilateral to the side of head turning was higher than contralateral and correlated with torticollis symptom severity. Lateralized differences were absent in recordings from patients operated under GA. Mean GPe discharge rate was lower than in GPi and was inversely correlated with disease duration. Another key finding of our study was that the GPi of CD patients comprised a subpopulation of theta-oscillatory cells with unique bursting characteristics. Power and coherence of ongoing GPe and GPi LFPs were dominated by a theta peak and also exhibited band-specific interhemispheric differences. Cross-frequency coupling of low-gamma amplitude to theta phase was especially pronounced in pallidal LFPs recorded under LA. Finally, single cell spiking was not locked to global pallidal LFP oscillations recorded at distant sites.

coupling between theta phase and amplitude in the lower gamma band. Under GA, this theta–gamma co-modulation was shifted toward higher frequencies for both phase and amplitude in the GPe **(B)** and was completely absent in the GPi **(D)**. A significant cross-frequency interaction between alpha-phase and amplitude in the higher gamma range was observed in both structures and under both anaesthetic modalities.

## **FIRING RATES IN GPe AND GPi**

Previous studies in CD (Tang et al., 2007) or other forms of dystonia (Merello et al., 2004; Starr et al., 2005) have reported parity in discharge rates between pallidal compartments, corresponding to a calculated ratio of GPe:GPi firing frequency (Obeso et al., 1997) close to 1. This ratio was 0.73 in our study, which lies between the generally lower values (∼0.6) reported for this ratio in PD (Starr et al., 2005; Tang et al., 2007) and GPe:GPi firing frequency ratios of ∼0.8–1 in recordings from healthy primates (Starr et al., 2005; Elias et al., 2008; Erez et al., 2011).

In our study, the mean firing rate of GPi neurons pooled from both hemispheres of awake CD patients was 72 Hz. This is in excellent agreement with the mean rate of 71 Hz that was reported by the only available study investigating pallidal discharges in a cohort of CD patients (Tang et al., 2007). Thus, internal pallidal discharge rates of CD patients are grosso modo similar to those being reported for healthy nonhuman primates (Filion and Tremblay, 1991; Wichmann et al., 2002; Starr et al., 2005; Elias et al., 2008; Erez et al., 2011) but considerably higher compared to the generally low GPi rates reported for patients with other types of dystonia, in particular generalized forms (Vitek et al., 1999; Sanghera et al., 2003; Merello et al., 2004; Starr et al., 2005; Zittel et al., 2009).

Contrasting with the report of Tang et al. (2007), firing rates of neurons in the GPe (mean rate, 52 Hz) in our study were significantly lower than in GPi but also lower compared to the mean rates of ∼65 Hz that are typically found in the GPe of healthy monkeys (Wichmann et al., 2002; Starr et al., 2005; Elias et al., 2008; Erez et al., 2011). Instead, average GPe rates were in the same range as values reported for patients suffering from Parkinson's disease (Favre et al., 1999; Starr et al., 2005; Tang et al., 2007). Consistent with the finding of higher than normal firing rates of STN neurons in patients with CD and segmental dystonia (Schrock et al., 2009), this finding suggests that, similar to PD, the indirect pathway is also overactive in CD. Taken together, our finding of decreased firing rates in GPe, but close to normal rates in GPi, could be explained by dual hyperactivation of both the direct and indirect pathways in our CD patients, which is also in accordance with a current model of dystonia (Vitek, 2002): while excess direct pathway gain will *decrease* GPi activity, overactivation of the indirect pathway will *increase* GPi activity (via weakened inhibitory inputs from GPe and disinhibited excitatory projections from STN). The net result may be no change in GPi activity.

#### **INFLUENCE OF ANAESTHETIC MODALITY ON PALLIDAL ACTIVITY IN CD**

It has long been known that pallidal neuronal activity in patients with dystonia is severely affected by propofol (Hutchison et al., 2003), which is the most frequently used anaesthetic in surgery for movement disorders. However, these observations were mainly made in patients suffering from generalized dystonia, where GA is frequently employed due to severity of symptoms. Our results confirm the notion of a propofol-related reduction in pallidal firing rates in conjunction with profound discharge pattern changes toward slow non-oscillatory bursting, and extend it to a homogeneous patient sample with a focal manifestation of dystonia. To the best of our knowledge, firing rates and patterns of pallidal border cells have hitherto not been reported in patients with CD. Unlike pallidal neurons, discharge parameters of border cells showed no differences between LA and GA and were comparable to recordings from healthy primates (Bezard et al., 2001). Both observations are consistent with previous research in patients with generalized dystonia (Hutchison et al., 2003), suggesting that spontaneous border cell activity is unaffected in different types of dystonia and not strongly affected by propofol anaesthesia. The reported differences in firing rates and bursting characteristics between the two anaesthetic modalities may aid future microelectrode-guided mappings of the pallidal region in DBS surgery for CD, especially when performed under GA with propofol and remifentanil.

**FIGURE 10 | Scatterplots of pallidal discharge rates as a function of demographic and clinical variables.** Data points in both graphs represent mean firing rates for each patient (pooled from recordings in both

expression of a prominent peak in the theta-frequency range. Relative alpha power is significantly higher in ipsilateral pallidal LFPs (arrowhead). Inset: Statistical comparison of ipsi- and contralateral LFP power between 8 and 13 Hz. **(D)** Side-specific coherence spectrum of LFP pairs that were simultaneously recorded in the GPi of awake patients. Comparison of ipsilateral (gray, *n* = 38) and contralateral (white, *n* = 39) GPi–LFP pairs. Inset: Ipsilateral GPi–LFP coherence was significantly higher in the beta frequency range. Asterisks refer to significant differences (∗*p* < 0.05; ∗∗*p* < 0.01, Bonferroni corrected).

#### **CORRELATION OF PALLIDAL DISCHARGE PARAMETERS WITH CLINICAL VARIABLES**

In addition to these findings, we found a series of significant correlations of pallidal discharge characteristics with clinical and demographic variables in our dataset, which have not been previously reported. In our dataset, patient's age was positively correlated with internal pallidal discharge rates of awake CD patients. This may be of interest because neuronal activity of dystonia patients is typically compared to data derived from Parkinson's disease patients (Silberstein et al., 2003; Starr et al., 2005; Tang et al., 2007), a patient group that is on average considerably older. Thus, inferences made based on differences in neuronal firing rate between different patient groups should take this possible bias into account. GPi and GPe firing rates were both negatively correlated with disease duration, suggesting that the dual hyperactivation of direct and indirect pathways (Vitek et al., 1999) is strongest in patients with a long history of CD symptoms.

Age at surgery and disease duration have repeatedly been identified as predictors of surgical outcome in patients with generalized dystonia treated with GPi–DBS (Isaias et al., 2008; Valldeoriola et al., 2010). However, data regarding predictors of surgical outcome in patients with CD are conflicting. While one study reported better surgical outcomes in patients with shorter disease duration (Yamada et al., 2013), another found no evidence for any such correlation (Witt et al., 2013). We also observed a significant negative correlation between the proportion of life lived with CD and firing rates in the GPi. Future studies should test whether response to pallidal DBS in CD depends on this measure, as has been demonstrated for childhood dystonia (Lumsden et al., 2013).

### **LATERALIZED DIFFERENCES**

Perhaps the most surprising observation of the present study was the finding of significant interhemispheric differences of pallidal outflow in relation to the clinically determined head excursions. The discharge rates of neurons located within the GPi ipsilateral to the side of head turn were significantly higher compared to their contralateral counterparts. This asymmetry was absent under GA and not observed in the GPe of awake patients. In conjunction with lines of evidence from our rate analysis in GPe and GPi (see above), our finding of imbalanced GPi rates suggests that, on top of a symmetrical overactivation of indirect pathways, an asymmetric overactivation of the direct pathway may play a role in CD. How could this firing rate asymmetry be significant to the pathophysiology of CD? Lateralized inhibitory outflow from the GPi would lead to unbalanced activation of thalamo-cortical modules and/or brain stem centers and result in uncontrollable activation of neck muscles on one side, thus contributing to the asymmetric clinical manifestation of CD. While the observed absolute side-to-side difference was moderate (∼7 Hz), network modeling studies have suggested that comparably small firing rate changes in a population of noisy or irregularly firing neurons may translate into larger changes in the activity of downstream neurons (Adair, 2001).

However, lateralized differences in individual patients reached non-significant trend levels at best, suggesting that large numbers of recordings may be required to detect this firing rate difference at the single-subject level. In conjunction with a possible undersampling of the neck-representing pallidal territory that is involved in CD, this might also explain differences between our findings and those of a previous investigation on pallidal physiology reporting the absence of side differences in CD patients (Tang et al., 2007).

A possible link between imbalanced pallidal outflow and the asymmetric manifestation of CD symptoms might also influence the surgical approach, e.g., the selection of the side for DBS in unilateral interventions or staged bilateral procedures. The question of therapeutic dominance of one GPi is undecided (Walsh et al., 2013). There are conflicting reports of beneficial effects on CD symptoms of GPi surgery both ipsilateral (Islekel et al., 1999; Moll et al., 2008; Torres et al., 2010) and contralateral (Kavaklis et al., 1992; Escamilla-Sevilla et al., 2002) to the side of head deviation. Our finding of a significant correlation between ipsi- but not contralateral GPi discharge rates and torticollis symptom severity may provide support for the view that the striato-pallidal system with less overactivity along the direct pathway (i.e., the GPi ipsilateral to the direction of head turn) could be an important determinant in the pathophysiology of CD.

It is noteworthy that, in contrast to bursting or oscillation spike train properties, ipsilateral GPi discharge rate was the only neuronal activity parameter that correlated significantly with torticollis severity in our study. This is in agreement with a similar observation from a pallidal physiology study in patients with phenotypically and etiologically diverse types of isolated dystonia (Starr et al., 2005) and suggests that information relevant to disease severity in dystonia may primarily be encoded in rate, rather than pattern of pallidal discharges. However, it is important to note that Starr et al. (2005) found an inverse relationship between GPi rate and baseline dystonia severity score, while GPi rates were positively correlated with severity of neck affection in our CD patients, as assessed by a focal dystonia rating scale (TWSTRS).

#### **SINGLE CELL OSCILLATIONS AND COHERENCE OF NEURON PAIRS**

The proportion of significantly oscillatory cells (15–30%) in our study is in the same range as previously reported for patients with different types of isolated dystonia (Starr et al., 2005), but contrasts with a lower incidence of occurrence reported for CD patients (Tang et al., 2007). Single cell oscillations in the theta frequency range were consistently found in both pallidal divisions and under both anaesthetic modalities, which is in line with previous observations reporting pronounced low-frequency oscillatory activity in pallidal neurons of awake dystonia patients (Starr et al., 2005). In addition, we observed significant oscillatory activity of neuronal discharges in the beta- and gamma-frequency range. Higher frequency oscillations were particularly expressed within burst firing of GPe neurons under GA. Furthermore, our study provides evidence for the existence of oscillatory long-range synchrony between distant (∼2 mm) pallidal neurons, which, to the best of our knowledge, has not been reported before (Tang et al., 2007). Interestingly, both oscillatory single cell firing and coherence in the alpha-frequency range were restricted to recordings under GA. This suggests a frequency-specific modulation of neuronal activity related to anaesthesia, such as propofol-related spindling.

### **GPi HIGH FREQUENCY BURSTERS IN CD OSCILLATE AT THETA-FREQUENCIES**

A subpopulation of neurons in the GPi of patients with Parkinson's disease exhibiting a unique type of neuronal discharge was first described by Taha et al. (1997), who coined the term "GPi tonic-burster" for units combining discharge characteristics of two presumably different cell types that are regularly encountered in recordings from the rodent and primate GPe (Delong, 1971; Bugaysen et al., 2010; Benhamou et al., 2012). On the one hand, these cells share a tonic level of high frequency discharges with neurons called high frequency pausers (HF-P). On the other hand, some specific burst characteristics of GPi tonic bursters (high intraburst frequencies, progressive spike amplitude reduction) are similar to those of low-frequency bursters (LF-B). In conjunction with other lines of evidence (Taha et al., 1997; Chan et al., 2011), our results suggest that one striking feature of GPi tonic bursters is their robust rhythmicity. Taking this into account and to comply with the classic terminology introduced by DeLong (Delong, 1971), we propose that these cells may be termed oscillatory high-frequency bursters (OHF-B). All recorded OHF-B neurons had a unique burst structure. Each burst had a stereotypical acceleration-deceleration pattern, i.e., intraburst ISI reached a minimum midway through and became progressively longer toward the end of each burst. Together with the periodic transitions between bursting and quiescence, this peculiar discharge characteristic is reminiscent of "parabolic bursting" (Ermentrout and Kopell, 1986), referring to the parabola-like appearance of ISI sequences in a burst (for an example, see **Figure 5B**). It is interesting to note that the peak oscillation frequencies of OHF-B discharges in CD and PD are markedly different. While OHF-B cells in CD patients peaked in the theta range, their oscillation frequency in patients suffering from Parkinson's disease was confined to the alpha-frequency range (Chan et al., 2011).

In any case, oscillation periods of individual OHF-B cells were remarkably robust. Although these cells are sparsely distributed, it is therefore conceivable that OHF-B cells represent a distinct class of pacemaker neurons in the GPi. Their rhythmic bursting may be involved in the generation and/or maintenance of a steady network rhythm in larger neuronal populations, similar to "hub neurons" orchestrating widespread synchronization in developing networks (Bonifazi et al., 2009). It is of interest that only two of the patients in whom OHF-B cells were recorded had head tremor. Therefore, this set of neurons is unlikely to represent true tremor cells (which drive head tremor or are driven by proprioceptive feedback from tremulous muscle movements). Moreover, it is unlikely that OHF-B properties are related to injury discharge or other recording instabilities because the average recording duration of stable activity from these cells was 79 ± 55 s.

## **LOW-FREQUENCY RHYTHMICITY IN PALLIDAL LOCAL FIELD POTENTIALS**

In our study, GPi–LFPs of awake CD patients had dominant spectral components at low frequencies (<10 Hz), adding to the existing evidence that increased oscillatory activity in the extended theta/alpha-frequency range is a prominent feature of LFPs recorded in the GPi of patients with different types of dystonia (Silberstein et al., 2003; Chen et al., 2006b; Liu et al., 2006), including CD (Lee and Kiss, 2013).

To our knowledge, GPe–LFPs have not been reported in CD. Our data show that strong low-frequency oscillations are also cardinal features of LFPs recorded from the GPe in CD patients. Furthermore, this study also extends the observation of higher theta power in GPi compared to GPe, as has previously been reported for patients with various types of isolated dystonia (Chen et al., 2006b), to a group of homogeneous cases in terms of clinical phenotype and localization of dystonic symptoms. Our finding of higher alpha power in the GPe compared to GPi has not been reported previously, perhaps due to the common usage of relatively broad spectral frequency ranges (typically, 4–10 and 11–30 Hz) in many studies investigating LFPs in dystonia (Silberstein et al., 2003; Lee and Kiss, 2013).

Pallidal low-frequency LFP oscillations have been demonstrated to be correlated with (Chen et al., 2006a) and coherent to (Liu et al., 2006) dystonic muscle activity. Involuntary dystonic muscle spasms in turn have been shown to be synchronized by a descending oscillatory drive at similar frequencies in patients with isolated dystonia (Tijssen et al., 2000) and directional analysis has suggested a causal link between pallidal LFP oscillations <10 Hz and dystonic muscle activity (Sharott et al., 2008). In order to address the question whether low-frequency oscillations in the pallidal LFPs are relevant to the expression of clinical head excursion, we recorded from CD patients that were operated under GA and showed no signs of dystonic muscle activity. Rather unexpectedly, we also observed distinct theta peaks in power spectra of pallidal LFPs recorded from these fully anaesthetized and symptom-free patients. This suggests that low-frequency rhythmicity represents a general characteristic of pallidal LFP activity in CD that does not simply arise as a consequence of dystonic muscle activity. It is noteworthy in this context that coherence in two distinct frequency ranges of GPi–LFPs was observed to be significantly higher in the awake state compared to LFPs recorded under GA. In addition to elevated levels of low-frequency coherence in the theta- and alpha-band, GPi–LFP coherence in the gamma-band was also higher in patients operated under LA. Higher intrapallidal coherence levels could either have a generative role for dystonic muscle contractions (which was not specifically assessed in our study) or arise as a consequence of proprioceptive feedback from overactive neck muscles.

Side differences in GPi LFP power between 4 and 30 Hz have recently been reported in patients with CD (Lee and Kiss, 2013). The authors found a slight, yet non-significant preponderance of higher 4–30 Hz power in the GPi ipsilateral to the direction of head turning. Supporting the notion that the direction of head excursion in CD is related to interhemispheric differences of pallidal neuronal activity, we found significantly higher alpha-band power and beta-band coherence of GPi–LFPs ipsilateral to the direction of head rotation compared to the contralateral hemisphere.

#### **PALLIDAL THETA–GAMMA PHASE–AMPLITUDE COUPLING**

Another novel finding in our study is the demonstration of strong phase–amplitude cross-frequency coupling between theta/alpha and gamma oscillations in pallidal LFPs. Theta/alpha-gamma coupling was noticeably less prevalent in LFP recordings under GA, despite the presence of strong oscillatory activity in the LFPs. In conjunction with previous lines of evidence showing a relationship between pallidal gamma-band activity and the scaling of ongoing movements (Brucke et al., 2012), it is therefore conceivable that the frequently observed excessive theta/alpha oscillations in GPi–LFPs of dystonia patients influence aspects of involuntary muscle contractions via modulation of activity in the gamma-frequency range.

Spurious coupling in EEG or LFP frequency comodulation measures has been demonstrated to result from artifactual sharp edges in the data (Kramer et al., 2008). This is a serious concern for studies in patients with hyperkinetic movement disorders, as movement artifacts could produce abrupt voltage changes, eventually resulting in artificial frequency comodulation results. In our study, head fixation precluded major head movements resulting from CD symptoms. Furthermore, the analysis of phase–amplitude measures was restricted to periods of stable recording conditions and the absence of head movements, respectively, as confirmed by concurrent stationary unit activity.

## **PALLIDAL UNITS ARE NOT LOCKED TO GLOBALLY COHERENT LFP OSCILLATIONS**

Previous attempts to link pallidal single neuron activity to the LFP in patients with dystonia have reported a prevalence of significant spike-field locking between 10% (Weinberger et al., 2012) and 30% (Chen et al., 2006b). In the study of Weinberger et al. (2012), the prevalence of units with significant SFC was ∼8% in unit and LFP recordings from the same electrode. This proportion dropped to 2% when the relationship of unit discharges to LFPs recorded from a second microelectrode (located 600 µm apart) was investigated. Together with the lack of a consistent relationship between spikes and LFPs (recorded 3 mm away from the microtip) in our study, this distance-related decay profile of SFC may indicate that ongoing oscillatory activities in the GP of CD patients, although highly coherent at the LFP level, do not necessarily couple spiking output to input signals on a global scale. Furthermore, the generally smaller SFC values for single unit activity compared to that for multiunit activity may have contributed to this null finding (Zeitler et al., 2006). Contrasting with the absence of long-range SFC within the GP, however, a subset of neuronal pairs showed significant spike–spike coherence. The presence of such long-range interactions at the level of single pallidal neurons is suggestive of an alternative explanation that takes into account the functional topography of the GP. In this respect, the low SFC may be due to the sampling of different pallidal subterritories, as electrode contacts for unit and LFP recordings with an interelectrode distance of 3 mm were arranged along the dorso-ventral pallidal axis. In contrast, coherence estimates between pairs of spikes or LFPs were based on recordings in the horizontal plane (2 mm away) that were most likely situated in similar functional divisions of the GP.

## **LIMITATIONS OF THE APPROACH**

One of the major limitations with invasive recordings from patients undergoing DBS surgery is the absence of data from a healthy control group. Data obtained from the diseased brains of human patients (collected during therapeutic interventions) are commonly compared to similar recordings from the healthy nonhuman primate. Conclusions from these comparisons across species must necessarily remain tentative and should be interpreted carefully.

Furthermore, fixation of the patient's head and neck in an overcorrected position may have provoked compensatory neck muscle activity. Thus, it is impossible to distinguish whether our results obtained from awake CD patients were the cause or the consequence of dystonic symptoms (Berardelli et al., 1998). While it is common practice to compare the results of invasive recordings between different movement disorders (Silberstein et al., 2003; Starr et al., 2005; Tang et al., 2007), we chose a different approach and compared data from awake CD patients with recordings in patients with the same diagnosis, but DBS surgery under GA, which was associated with complete suppression of all dystonic muscle activities.

Torticollis severity cannot be directly assessed in the context of head fixation during conventional frame-based stereotactic interventions. It would therefore be interesting to see how our results compare to pallidal recordings in patients without head fixation undergoing frameless stereotactic DBS surgery.

Being based on microelectrode-guidance, our study does not suffer from limitations relating to imprecise localization of recording sites that are inevitably associated with postoperative LFP recordings using the permanently implanted DBS electrode. Nevertheless, it is important to note that a definite localization of recording sites, as obtained by, e.g., histological control in animal studies, is impossible in studies of invasive recordings in humans.

When microelectrode-guided DBS surgery in the same patient is performed on different days (e.g., in the context of a staged procedure), lateralized differences of neuronal activity may be difficult to assess, as mere implantation of a DBS electrode can exert significant effects on a patient's symptoms (Cersosimo et al., 2009). Such patient-to-patient variations were minimized by only including patients receiving bilateral DBS implants on the same day.

Although the present study contains the hitherto largest number of neurons recorded from the GP of CD patients, the overall number of patients included in our study was relatively small due to a limited number of CD cases undergoing DBS surgery. In the light of conflicting evidence from a previous study (Tang et al., 2007), there is clearly a need for larger-scale studies with higher statistical power to clarify the issue of possible lateralized differences in pallidal outflow in patients with CD.

## **CONCLUSION**

Taken together, our results provide tentative support for the possibility that CD may be associated with a symmetrical indirect-pathway overactivity in conjunction with imbalanced interhemispheric activity levels along the direct pathway. The direct pathway ipsilateral to the direction of head turning could, according to this view, fail to compensate for excess indirect pathway activity. We hypothesize that asymmetrical pallido-thalamic output gain could then arise as a consequence of a disrupted balance of direct and indirect pathway overactivation on one side and be an important factor contributing to the asymmetric manifestation of CD symptoms.

#### **ACKNOWLEDGMENTS**

The authors are grateful to Yoshiki Kaneoke (Wakayama Medical University, Japan) and Anita Vasavada (Washington State University, USA) for generously sharing their MATLAB-code for spike train analysis. Andreas K. Engel acknowledges support by the EU (FP7-ICT-270212, ERC-2010-AdG-269716).

#### **REFERENCES**


globus pallidus in patients with primary dystonia. *Brain* 131, 1562–1573. doi: 10.1093/brain/awn083


patient with cervical dystonia and tremor. *J. Neurosurg.* 113, 1230–1233. doi: 10.3171/2010.4.JNS091722


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 November 2013; accepted: 23 January 2014; published online: 11 February 2014.*

*Citation: Moll CKE, Galindo-Leon E, Sharott A, Gulberti A, Buhmann C, Koeppen JA, Biermann M, Bäumer T, Zittel S, Westphal M, Gerloff C, Hamel W, Münchau A and Engel AK (2014) Asymmetric pallidal neuronal activity in patients with cervical dystonia. Front. Syst. Neurosci. 8:15. doi: 10.3389/fnsys.2014.00015*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Moll, Galindo-Leon, Sharott, Gulberti, Buhmann, Koeppen, Biermann, Bäumer, Zittel, Westphal, Gerloff, Hamel, Münchau and Engel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Electrophysiological characterization of entopeduncular nucleus neurons in anesthetized and freely moving rats

## **Liora Benhamou and Dana Cohen\***

The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel

#### **Edited by:**

Alon Korngreen, Bar-Ilan University, Israel

#### **Reviewed by:**

Charles J. Wilson, University of Texas at San Antonio, USA

Laurens Witter, Harvard Medical School, USA

#### **\*Correspondence:**

Dana Cohen, The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, No. 901, Ramat-Gan 52900, Israel e-mail: danacoh@gmail.com

The EntoPeduncular nucleus (EP), which is homologous to the internal segment of the Globus Pallidus (GPi) in primates, is one of the two basal ganglia (BG) output nuclei. Despite their importance in cortico-BG information processing, EP neurons have rarely been investigated in rats and there is no available electrophysiological characterization of EP neurons in vivo. We recorded and analyzed the activity of EP neurons in freely moving as well as anesthetized rats, and compared their activity patterns. Examination of neuronal firing statistics during wakefulness suggested that similar to neurons recorded in the primate GPi, EP neurons are a single population characterized by Poisson-like firing. Under isoflurane anesthesia the firing rate of EP neurons decreased substantially and their coefficient of variation and relative duration of quiescence periods increased. Investigation of the relationship between firing rate and depth of anesthesia revealed two distinct neuronal groups: one that decreased its firing rate with the increase in anesthesia level, and a second group where the firing rate was independent of anesthesia level. Post-hoc examination of the firing properties of the two groups showed that they were statistically distinct. These results may thus help reconcile in vitro studies in rats and primates which have reported two distinct neuronal populations, and in vivo studies in behaving primates indicating one homogeneous population. Our data support the existence of two distinct neuronal populations in the rat EP that can be distinguished by their characteristic firing response to anesthesia.

**Keywords: basal ganglia, electrophysiology, anesthesia, extracellular recording, firing patterns, neuronal population**

#### **INTRODUCTION**

The Basal Ganglia (BG), a group of subcortical nuclei, process and integrate motor, cognitive, and limbic information arriving from most cortical areas and the thalamus via a network of segregated pathways (Alexander et al., 1986; Alexander and Crutcher, 1990; Deniau et al., 1996; Kitano et al., 1998). The globus pallidus internal segment (GPi), which is one of the two output stations of the BG, projects to the cortex through different thalamic nuclei, and influences information processing in these targets (Kravitz et al., 2010; Kojima et al., 2013). The GPi is thought to participate in motor control especially in movement initiation, sequential organization of movement and action selection (Horak and Anderson, 1984; Mink, 1996; Redgrave et al., 1999). GPi lesions cause motor disturbances such as reduced speed and acceleration, resulting in hypometria (Turner and Anderson, 2005; Turner and Desmurget, 2010). By contrast, ablation of the GPi has been shown to improve the conditions of patients suffering from Parkinson's disease, which suggests that abnormal information processing in the GPi may have a greater influence on motor behavior than its silencing (Obeso et al., 2009). Additionally, it has been shown that GPi neurons process sensory information related to reward consumption (Lidsky, 1975), and that they are also tuned to reward predicting targets (Hong and Hikosaka, 2008), reward probability (Joshua et al., 2009) and that their activity facilitates response variability required for associative learning (Sheth et al., 2011). Moreover, it has been suggested that the GPi neurons may initiate reward related signals through their effect on the lateral habenula (LHb) (Hong and Hikosaka, 2008). Overall, the complexity of BG information processing encourages further examination of information flow through the different BG nuclei.

Primate GPi neurons are anatomically and biochemically comparable to the neurons of the GPe, a central station of the BG. However, extracellular electrophysiological characterizations of the firing properties of GPe neurons suggest there are two distinct groups of neurons, dubbed high frequency pausers and low frequency bursters, whereas a similar analysis of GPi neurons reveals only one neuronal population (DeLong, 1971). GPi neurons display continuous high rate discharges of about 60 spikes/s with rapid firing rate fluctuations ranging from 10 to 107 spikes/s, but without the pauses or prolonged silences seen in GPe neurons (DeLong, 1971). Other studies have reported that GPi neurons display uncorrelated non-synchronous activity (Bar-Gad et al., 2003a) and that they exhibit slow oscillations in the multisecond range (Wichmann et al., 2002). Currently there is no established characterization of neuronal activity in EntoPeduncular (EP) neurons in freely moving rats.

In general, the rodent and primate BG share similar cell types and connectivity. Thus comparative studies may help clarify the nature, role and features of the computation taking place in these structures. We previously compared the rat Globus Pallidus (GP) activity to the GPe, its primate homologue (Benhamou et al., 2012) and showed that like the primate GPe, the rat GP comprised two neuronal populations. Although the neuronal firing rates in the GP were substantially lower than those observed in the GPe, both species displayed similar firing pattern characteristics. Since the EP is homologous to the primate GPi, it was hypothesized that an extracellular electrophysiological characterization of the firing properties of EP would reveal a single population. Contrary to this assumption, however, anatomical tracing studies in primates, cats, and rats have reported that based on their projection target sites, it is likely that the GPi/EP neurons form two neuronal populations (Nauta, 1974; Filion and Harnois, 1978; van der Kooy and Carter, 1981; Tokuno et al., 1988; Parent et al., 2001). These anatomical studies were corroborated by a study in slice preparation of the rat EP that described two distinct neuronal populations based on their intrinsic electrical properties (Nakanishi et al., 1990).

Here, we characterize and compare neuronal activity recorded chronically in the EP of freely moving rats during wakefulness and under different levels of isoflurane anesthesia. Considering the similarity in the electrophysiological properties of GPe neurons between rats and primates (Benhamou et al., 2012), we hypothesized that rat EP neurons would display similar properties to those observed in primate GPi. During wakefulness, the recorded neurons displayed tonic activity with near Poissonian firing. The distributions of neuronal statistics such as firing rate (FR), coefficient of variation (CV), Fano factor (FF) and relative duration of quiescence periods (Silence Index) calculated in freely moving animals were spread along a continuum, thus suggesting that this sample was composed of one neuronal population. By contrast, the characterization of neuronal activity measured under different levels of anesthesia revealed two distinct groups: one whose firing rate decreased as anesthesia level increased and a second whose firing rate was independent of anesthesia level. *Post-hoc* examination of the statistics of the two populations revealed that despite being spread over a continuum, the characteristics of the two populations were significantly different. Our data thus suggest the existence of two neuronal populations in the rat EP that can only be distinguished under anesthesia.

## **MATERIALS AND METHODS**

All procedures were carried out in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and the Bar-Ilan University Guidelines for the Use and Care of Laboratory Animals in Research.

## **SURGICAL PROCEDURE**

The surgical procedure was previously described (Nicolelis et al., 1997; Jacobson et al., 2009). In brief, four adult male Long-Evans rats (Harlan) weighing 450 g on average (range: 430–480 g) were sedated with 5% isoflurane and were then injected i.m. with ketamine HCl and xylazine HCl (100 and 10 mg/kg, respectively). Supplementary injections of xylazine and ketamine were administered if required. Rats were held in a stereotaxic frame (Kopf Instruments, USA). After sterilization, the rat's head was incised and the skull surface revealed. Connective tissue was then removed and the skull surface cleaned. Two bilateral craniotomies slightly larger than the electrode were made above the EP (anteroposterior (AP): 2.4 mm, mediolateral (ML): 3.0 mm, dorsoventral (DV): −7.0 mm). Eight Formvar coated Nichrome wires (coated diameter: 0.0015″, A-M Systems, Inc.) placed in a 27-gauge cannula were slowly introduced into the EP through each craniotomy independently. Electrodes (impedance 0.1–0.2 MΩ at 1 kHz) were cemented in place with dental acrylic, leaving the upper part of the connectors exposed.

At the end of the experiment, the rats were anesthetized with 5% isoflurane, then ketamine, xylazine and morphine (3 mg/kg) were injected i.p. and electrolytic lesions were made before perfusion with 4% formaldehyde. The brain was fixated with 20% sucrose and sectioned with a cryostat into 60 µm slices. Electrode positions were confirmed histologically using a microscope (Nikon Eclipse E400, 1x/400).

## **DATA COLLECTION**

Following about 10 days of recovery from surgery, the activity of EP neurons was recorded in four adult male Long-Evans rats. Within each session the rats were anesthetized with gradually decreasing doses of isoflurane (4% to 1% by steps of 1%) followed by a period of wakefulness during which the rats moved around in the recording chamber rather than remained immobile. Based on our observations of the elapsed time from anesthesia removal until animals started to walk freely in the recording chamber we estimated a recovery time of 5 min for each alteration in isoflurane level thus allowing neuronal activity to physiologically adjust to the different levels. Hence, data collection started after a 5 min intermission following each transition in anesthesia depth. Neural activity was amplified, band-pass filtered at 150–8000 Hz and sampled at 40 kHz using a multichannel acquisition processor system (MAP system; Plexon Inc, Dallas, TX, USA). Offline sorting was performed on all continuously recorded units (OfflineSorter V2.8.8; Plexon, Dallas, TX) and the data were analyzed using custom-written MATLAB software (R2010b, MathWorks Inc., Natick, MA).

## **DATA ANALYSIS**

#### **STATISTICAL ANALYSIS**

All the data reported are presented as the mean ± SEM. Unless stated otherwise, statistically significant differences between parameters were assessed by the non-parametric Kruskal-Wallis test with the Tukey-Kramer multiple comparison correction.

In order to identify significant nonlinear correlations between firing rate and isoflurane level, we used the non-parametric Kendall-Tau correlation test (MATLAB function: corr with type Kendall; *p* < 0.05). The Kendall-Tau test provides a coefficient that represents the degree of concordance between two columns of ranked data. It is similar to the non-parametric Spearman's ρ-test but it enables easier interpretation of the correlation value. Specifically, it is the difference between the probability that the observed data are in the same order and the probability that the observed data are not in the same order.
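The interpretation of the coefficient as a difference of two probabilities can be illustrated with a minimal Python sketch. It is not the analysis code used in the study (which relied on MATLAB's corr); the function name `kendall_tau`, the tie-free "tau-a" variant, the coding of wakefulness as 0% isoflurane and the firing-rate values are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: P(pair in same order) - P(pair in opposite order).

    Assumption: no tie correction is applied (tied pairs count as neither
    concordant nor discordant).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical firing rates (spikes/s); 0 stands for wakefulness,
# 1-4 for the isoflurane percentage.
isoflurane = [0, 1, 2, 3, 4]
rate_decreasing = [30.0, 24.0, 18.0, 11.0, 6.0]  # monotonic decrease
rate_flat = [15.0, 9.0, 10.0, 8.5, 9.5]          # no clear trend

tau_decreasing = kendall_tau(isoflurane, rate_decreasing)
tau_flat = kendall_tau(isoflurane, rate_flat)
print(tau_decreasing, tau_flat)
```

A strictly monotonic decrease yields a coefficient of −1, whereas a rate that drops once and then fluctuates gives a value closer to zero.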

#### **WAVEFORM PARAMETERS**

The waveform parameters consisted of the valley to peak ratio, valley to peak duration, valley width, zero-cross, and peak and valley amplitudes. The valley is the minimal amplitude time point and the peak is the first maximum observed after the valley time point. Valley width quantifies the duration of the extracellular waveform at its half amplitude, the valley to peak ratio is the valley amplitude divided by the absolute value of the peak amplitude, and zero-cross is the time elapsed between the two time points around the valley in which the amplitude equals zero (see inset in **Figure 3A**).

#### **FIRING PARAMETERS**

The firing parameters included the CV, FF, FR, refractory period and mode Inter-Spike Interval (ISI). The term CV defines the standard deviation of the ISI distribution divided by its mean. The FF is the variance of the spike count distribution calculated in non-overlapping time windows, divided by its mean (window duration equals the median ISI of each neuron). The FR is the total number of spikes divided by the total recording time (spike count rate). The refractory period defines the time in ms elapsed from 0 on the neurons' auto-correlogram to attain its half height at stable state. Mode ISI describes the mode value of the ISI distribution using 5 ms precision bins.
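The definitions of FR, CV and FF given above can be sketched in Python as follows. This is an illustrative sketch rather than the custom MATLAB code used in the study; the function name `firing_parameters`, the use of the population standard deviation and variance, and both example spike trains are assumptions.

```python
import random
import statistics


def firing_parameters(spike_times, duration):
    """Return (FR, CV, FF) for one spike train; times and duration in seconds."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    # FR: total number of spikes divided by total recording time.
    fr = len(spike_times) / duration
    # CV: standard deviation of the ISI distribution divided by its mean.
    cv = statistics.pstdev(isis) / statistics.fmean(isis)
    # FF: variance / mean of spike counts in non-overlapping windows whose
    # duration equals the median ISI of the neuron.
    window = statistics.median(isis)
    n_windows = int(duration // window)
    counts = [0] * n_windows
    for t in spike_times:
        idx = int(t // window)
        if idx < n_windows:  # ignore spikes in a trailing partial window
            counts[idx] += 1
    ff = statistics.pvariance(counts) / statistics.fmean(counts)
    return fr, cv, ff


# Hypothetical perfectly regular train: one spike every 125 ms for 10 s.
regular = [0.0625 + 0.125 * k for k in range(80)]
fr_reg, cv_reg, ff_reg = firing_parameters(regular, duration=10.0)

# Hypothetical Poisson-like train at about 20 spikes/s (exponential ISIs).
rng = random.Random(0)
t, poisson_like = 0.0, []
for _ in range(5000):
    t += rng.expovariate(20.0)
    poisson_like.append(t)
fr_poi, cv_poi, ff_poi = firing_parameters(poisson_like, duration=t)
print(fr_reg, cv_reg, ff_reg)
print(fr_poi, cv_poi, ff_poi)
```

The regular train gives CV and FF of zero, while the Poisson-like train gives values close to 1 for both, which is the reference against which the "near Poissonian" description is made.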

#### **AUTO- AND CROSS-CORRELATIONS**

Autocorrelation and crosscorrelation functions were calculated for latencies of 1000 ms and 500 ms respectively (bin size equals 1 ms).

For the cross-correlation functions, upper and lower confidence levels were calculated as follows: the mean and standard deviation of the cross-correlograms at time ±4–5 s were calculated. The probability that the signal crossed a specified limit in 1% of the bins over 1 s was then corrected for every bin according to the Bonferroni correction for multiple comparisons: $p = 0.01 / N_{\mathrm{bins}}$. Assuming a normal distribution, we obtained the number of standard deviations (*Z*-value) required to attain the probability *p* and drew the lower and upper confidence levels at the ordinates corresponding to the mean ± *Z* standard deviations.
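As a rough sketch of this calculation, the snippet below derives the *Z*-value from the Bonferroni-corrected probability and places the confidence levels around a baseline mean. The function name `confidence_limits`, the one-sided treatment of *p* for each limit, the choice of 1000 bins (1 s at 1 ms resolution) and the baseline bin counts are assumptions, not details taken from the study.

```python
from statistics import NormalDist, fmean, pstdev


def confidence_limits(baseline_bins, n_bins, alpha=0.01):
    """Return (lower, upper, z) confidence levels for a cross-correlogram.

    baseline_bins: correlogram values from the distant-lag baseline segment.
    n_bins: number of bins tested, used for the Bonferroni correction.
    """
    p = alpha / n_bins                  # corrected per-bin probability
    z = NormalDist().inv_cdf(1.0 - p)   # assumption: one-sided per limit
    mean = fmean(baseline_bins)
    sd = pstdev(baseline_bins)
    return mean - z * sd, mean + z * sd, z


# Hypothetical baseline bin values (counts per bin).
baseline = [10, 12, 11, 9, 10, 13, 8, 11, 10, 12]
lower, upper, z_value = confidence_limits(baseline, n_bins=1000)
print(lower, upper, z_value)
```

With 1000 bins the corrected probability is 10⁻⁵, so the limits sit a little more than four standard deviations from the baseline mean; a two-sided treatment would push them slightly further out.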

#### **SILENCE INDEX**

The increase in CV indicated that under anesthesia neuronal firing could not be modeled by a Poisson distribution. Therefore to enable comparison between neuronal relative quiescence periods while taking into account the different firing rates we created the Silence Index which normalized the relative quiescence time of a neuron with that expected from a Poisson neuron having the same firing rate. The proportion of time taken by the longest 5% of the ISIs relative to the whole recording time was calculated for the real neuron and its Poisson model neuron. Then the value of the real neuron was divided by that of its Poisson model. Values substantially higher than 1 indicate that the neuron tended to be more silent than a Poisson neuron with the same firing rate, suggesting that the real neuron compensated its prolonged silent periods by a higher firing rate during non-quiescent periods in order to obtain a similar firing rate.
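The Silence Index can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sum of all ISIs stands in for the whole recording time, and the Poisson reference is taken from the closed-form expectation for exponentially distributed ISIs, $q(1 - \ln q)$ with $q = 0.05$, rather than from a simulated model neuron. Under that assumption the reference does not depend on firing rate. All function names and example values are hypothetical.

```python
import math


def longest_isi_fraction(isis, top_frac=0.05):
    """Fraction of total time occupied by the longest `top_frac` of the ISIs."""
    n_top = max(1, round(top_frac * len(isis)))
    longest = sorted(isis, reverse=True)[:n_top]
    return sum(longest) / sum(isis)


def poisson_reference(top_frac=0.05):
    """Expected fraction for exponential ISIs: q * (1 - ln q)."""
    return top_frac * (1.0 - math.log(top_frac))


def silence_index(isis, top_frac=0.05):
    """Real-neuron fraction divided by that of its Poisson model."""
    return longest_isi_fraction(isis, top_frac) / poisson_reference(top_frac)


# Hypothetical ISIs in seconds.
regular_isis = [0.05] * 100               # clock-like firing, no pauses
pausing_isis = [0.02] * 95 + [1.0] * 5    # fast firing with long pauses

si_regular = silence_index(regular_isis)
si_pausing = silence_index(pausing_isis)
print(poisson_reference(), si_regular, si_pausing)
```

In this sketch a clock-like neuron scores well below 1 and a neuron with long pauses scores well above 1, matching the interpretation given in the text.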

## **RESULTS**

We recorded the activity of 38 single neurons in the EP of four rats and characterized their waveforms and firing parameters during different levels of isoflurane anesthesia and wakefulness. The electrode placement in the EP was verified histologically by making an electrolytic lesion via the electrodes that displayed neuronal activity during recording sessions (see Section Materials and Methods). An example of the placement of an implanted electrode in the EP is shown in **Figure 1A**. Its corresponding coronal slice taken from a rat atlas (Paxinos, 2007) is shown in **Figure 1B**. In each session neuronal activity was recorded during five states: four levels of isoflurane anesthesia (1%, 2%, 3% and 4%), and wakefulness, during which the animal could move freely in the recording chamber. Neuronal activity was recorded throughout each session with a 5 min intermission following each transition in anesthesia depth, thus allowing neuronal activity to physiologically adjust to the different levels (see Section Materials and Methods). Overall, in each state we analyzed neuronal activity for an average time of 511 ± 24.47 s.

We first analyzed neuronal activity in the awake state to test whether as in primates the properties of the recorded population are uniformly distributed despite the possibility of including two subtypes as suggested by anatomical studies (Nakanishi et al., 1990; Parent et al., 2001). Initial observations indicated that all the recorded neurons displayed similar waveform shapes composed of a sharp narrow valley that was followed by a slightly longer duration peak. Waveform shapes of three EP neuronal exemplars are shown in **Figure 2A**. The waveforms were characterized by the following parameters described in the Materials and Methods section (see also inset in **Figure 2A**): the valley width at half amplitude (142 ± 4.43 µs), the waveform duration measured from the first valley to the following peak (258 ± 11 µs), the valley to peak amplitude ratio (0.67 ± 0.03), and the zero-crossing duration (275 ± 12 µs). Additionally, the neurons had an average refractory period of 6.97 ± 4.61 ms (mean ± std). Examination of the firing patterns displayed by EP neurons showed that these neurons fired tonically at a relatively high rate (26.0 ± 3.1 spikes/s; **Figure 2B**). Their firing was irregular as reflected in the relatively flat autocorrelograms over more than 1 s (see examples in **Figure 2C**). Of the 24 neuronal pairs recorded simultaneously during wakefulness, none showed a peak in the crosscorrelogram suggesting that EP neurons did not interact with one another or that their interactions were negligible. Like their waveforms, EP neurons displayed homogenous firing patterns characterized by the following parameters: CV = 1.08 ± 0.08, FF = 1.02 ± 0.09, mode ISI = 19.2 ± 1.5 ms, and a Silence Index = 1.06 ± 0.05. The distributions of these parameters are shown in **Figures 2D–H**. 
Overall, the relatively narrow distributions with a single peak characteristic of these firing pattern statistics suggest that the recorded population was composed of a single cell type that fired in a Poissonian manner most of the time.

We then examined the effect of isoflurane anesthesia on the properties of the EP neurons. The isoflurane did not influence the waveform shapes of the neurons, and all the measured parameters remained unchanged (valley width: *p* = 0.27, valley to peak duration: *p* = 0.94, zero-cross: *p* = 0.62, valley to peak amplitude ratio: *p* = 0.97), indicating that the recordings were stable throughout the sessions. By contrast, firing patterns varied substantially with isoflurane administration (see **Table 1**). The firing rate decreased significantly during anesthesia (14.8 ± 1.5 spikes/s) relative to wakefulness (26.0 ± 3.1 spikes/s; Kruskal-Wallis; *p* = 0.00009) as shown in **Figure 3A** and in the example of neuronal firing recorded during wakefulness and under anesthesia (**Figure 3E**). Additionally, the CV values were substantially higher during anesthesia (2.40 ± 0.28) relative to wakefulness (1.08 ± 0.08; *p* = 0.03; **Figure 3B**). The increase in the CV values combined with the decrease in FR suggested that EP neurons ceased firing for durations longer than expected from irregular Poisson firing. Indeed, the Silence Index measured during isoflurane administration (1.51 ± 0.09) was substantially higher than during wakefulness (1.06 ± 0.05; **Figure 3D**; *p* = 0.048). An example of the ISI histogram of one EP neuron is shown during anesthesia and wakefulness and illustrates the long tail of ISIs observed under anesthesia (**Figure 3F**). The FF values within the population did not change with isoflurane administration (**Figure 3C**; *p* = 0.75), and the neuronal pairwise interactions (*n* = 24) remained absent.



Examination of the relationship between the neuronal firing rate and the different levels of anesthesia revealed two types of EP neurons: one type (Group I) whose firing rate decreased as the anesthesia level increased, and a second type (Group II) whose firing rate was uncorrelated with anesthesia level. Neuronal spike trains representative of the two groups are shown in **Figure 4A**. The non-parametric Kendall-Tau correlation test, which identifies significant nonlinear correlations (*p* < 0.05), was used to determine which of the neurons decreased their firing rate with the increase in isoflurane level (see Section Materials and Methods). Observation of the Tau correlation coefficient showed that none of the neurons increased its firing rate with anesthesia (Tau values < 0); however, only those that substantially and monotonically decreased firing rate (values ∼ −1) were identified as significant (red color; **Figure 4B**). Similar results were obtained using the Spearman's ρ rank correlation test. Group I comprised 37% (14/38) of the EP neurons whereas Group II comprised the remaining 63% (24/38) of the recorded population. Statistical examination of the influence of isoflurane administration on the two groups revealed a main effect for both group type and anesthesia level (a two-way ANOVA with Tukey-Kramer correction for multiple comparisons; *p* < 0.0001). *Post-hoc* examination of the firing rate showed that during wakefulness Group I displayed a significantly higher firing rate than Group II. Moreover, the firing rate of Group I gradually decreased and reached significance at an isoflurane level of 2% relative to wakefulness, whereas the firing rate of Group II decreased abruptly with anesthesia administration and remained constant during all isoflurane levels (**Figure 4C**).
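The grouping criterion can be sketched as a small classification routine. The sketch assumes one firing-rate value per state (wakefulness coded as 0%, then 1–4% isoflurane) and an exact two-sided permutation test on the Kendall score; neither detail is stated in the text, and the function names and firing rates are hypothetical.

```python
from itertools import combinations, permutations


def kendall_score(x, y):
    """Number of concordant pairs minus number of discordant pairs."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x2 - x1) * (y2 - y1)
        score += (product > 0) - (product < 0)
    return score


def classify_neuron(levels, rates, alpha=0.05):
    """Return (group, tau, p_value) for one neuron.

    Group I: significant negative correlation between rate and level.
    Group II: everything else.
    """
    n_pairs = len(levels) * (len(levels) - 1) // 2
    observed = kendall_score(levels, rates)
    # Exact null distribution: score under every reordering of the rates.
    null = [kendall_score(levels, perm) for perm in permutations(rates)]
    p_value = sum(abs(s) >= abs(observed) for s in null) / len(null)
    tau = observed / n_pairs
    group = "Group I" if (tau < 0 and p_value < alpha) else "Group II"
    return group, tau, p_value


levels = [0, 1, 2, 3, 4]  # assumption: wakefulness coded as 0 % isoflurane
group_dec, tau_dec, p_dec = classify_neuron(levels, [30.0, 24.0, 18.0, 11.0, 6.0])
group_flat, tau_flat, p_flat = classify_neuron(levels, [15.0, 9.0, 10.0, 8.5, 9.5])
print(group_dec, tau_dec, p_dec)
print(group_flat, tau_flat, p_flat)
```

Under these assumptions, with only five points per neuron, just a strictly monotonic decrease reaches *p* < 0.05 in a two-sided test, which is in line with the observation that only neurons with Tau values near −1 were identified as significant.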

Taking into consideration that neurons with similar firing rates can display different firing patterns and, *vice versa*, neurons with different firing rates can display similar firing patterns, we tested whether the two groups display similar firing pattern characteristics. Specifically, we statistically compared the values of the firing pattern parameters CV, FF and Silence Index between the two groups, which seemed non-separable in the awake state. Although both groups significantly decreased their firing rate in response to anesthesia, they were characterized by different firing pattern parameters (**Figures 4D–F**). Group I displayed lower values of CV (1.70 ± 0.34), Silence Index (1.26 ± 0.11), and FF (0.91 ± 0.08) compared to Group II (CV: 2.25 ± 0.27; *p* = 0.004; Silence Index: 1.47 ± 0.08, *p* = 0.014; and FF: 1.09 ± 0.06, *p* = 0.008). These data suggest that the two groups may reflect the existence of two subpopulations of neurons in the EP that could only be distinguished by their characteristic electrophysiological properties during different levels of anesthesia.

#### **DISCUSSION**

In this study we characterized neuronal activity in the EP of anesthetized and freely moving rats. Our results show that in freely moving animals, EP neurons displayed a high firing rate of 26 ± 3.1 spikes/s (range: 2.23–76.13 spikes/s, median: 20.30 spikes/s) whereas administration of isoflurane anesthesia induced a significant decrease in the firing rate to an average of 14.8 ± 1.5 spikes/s (range: 0.2–74.4 spikes/s, median: 9.27 spikes/s). This decrease in firing rate occurred either gradually as more anesthesia was delivered, or abruptly with the transition between wakefulness and the first level of anesthesia and then remained relatively stable regardless of the anesthesia level. Based on this distinction in the neurons' responses to anesthesia, we classified the EP neurons into two groups: Group I and Group II. Although examination of the firing properties of EP neurons during wakefulness indicated a homogenous population characterized by near Poissonian tonic firing and a lack of interactions, further examination suggested otherwise; the newly defined groups had significantly different firing pattern properties. Our results provide evidence for the potential existence of two distinct subpopulations within the rat EP that could be distinguished by their electrophysiological characteristics during isoflurane anesthesia.

Analysis of EP neurons recorded during wakefulness without *a priori* classifying them into two distinct groups based on their anesthesia response curves revealed a seemingly homogeneous population. Similarly, extracellular recordings in the primate GPi revealed one homogenous neuronal population (DeLong, 1971). The firing pattern characteristics of GPi neurons, excluding firing rate, described by DeLong (1971) and in later studies by others (Joshua et al., 2009; Sheth et al., 2011) are similar in nature to those observed in this study. Additionally, EP neurons were uncorrelated and apparently did not interact with one another, as was previously observed in primates (Bar-Gad et al., 2003a), strengthening the view that the BG act as a decorrelating and dimensionality reduction system (Bar-Gad et al., 2003b). Under isoflurane, EP neurons increased their CV and Silence Index and decreased FR, indicating that in that condition these neurons displayed a higher firing variability and pauses longer than expected from Poissonian firing. A similar result was obtained in humans under propofol anesthesia, which inhibited GPi neurons and, in addition to inducing long pauses, increased their burst index (Hutchison et al., 2003).

The main difference between the neuronal activity of EP neurons and that of the GPi neurons is their average firing rate, which was 26.0 ± 3.1 spikes/s in EP and 63 spikes/s in GPi (DeLong, 1971). In a previous study we characterized neuronal activity of GP neurons in freely moving rats and compared the results to those obtained in primate GPe neurons using the same methodology (Benhamou et al., 2012). The outcome of that study showed that, similarly to primates, rat GP neurons could be divided into two distinct neuronal populations based on their characteristic firing patterns. Two measures diverged from primates: the first was the average firing rate, which appeared to be much lower in rats (20.07 spikes/s) than in primates (55 spikes/s), and the second was that the two subpopulations represented different proportions (rats: 73.5 vs. 26.5%; primates: 85 vs. 15%, for high frequency pausers and low frequency bursters, respectively; DeLong, 1971; Bugaysen et al., 2010; Benhamou et al., 2012). It should be noted that the proportions of these subpopulations resembled the *in vitro* proportions reported in rodents (Kita and Kitai, 1991; Cooper and Stanford, 2000). The current results further support the view that firing patterns, rather than firing rates and subpopulation proportions, are important to the preservation of BG information processing, especially in the GP and EP, and are therefore conserved across species.

Isoflurane administration substantially altered the firing patterns displayed by EP neurons. Isoflurane is minimally metabolized in the body and has a low blood-gas partition coefficient (1.4) so that it has a very low solubility in blood and changes in concentrations are rapid. In the present experiment, we estimated a recovery time of 5 min based on our observations of the time it took animals to walk normally in the recording chamber following the removal of 1% isoflurane anesthesia. Recently, a median recovery time of 330 s has been reported when animals were placed under 1.5% isoflurane for 40 min (Taylor et al., 2013). Therefore, we cannot rule out that some of the anesthesia evoked effects may have had a longer lasting influence on the electrophysiological characteristics of EP neurons. Such a case however would have likely minimized rather than enhanced or caused the observed differences between the two groups because of residual effect averaging.

*In vitro* studies in rodents, cats and primates have reported the presence of two neuronal populations based either on their biophysical properties (Nakanishi et al., 1990) or on their projections' target sites (Nauta, 1974; Filion and Harnois, 1978; van der Kooy and Carter, 1981; Tokuno et al., 1988; Parent et al., 2001). On one hand, a proportion of 2/3 to 1/3 was found in rats based on the EP projections to the LHb and to the thalamus, respectively (van der Kooy and Carter, 1981). On the other hand, when the identification of two types of neurons was based on their membrane electrical properties a different proportion of division into types was reported; namely, 79.6% of type I neurons and 16.6% of type II neurons (Nakanishi et al., 1990). By taking into account that the intracellular recording method may create a sampling bias and consequently skew the data toward more accessible neurons, it is plausible that despite the difference in proportions the two studies dealt with the same neuronal types (i.e., that EP neurons projecting to different targets may be biophysically distinct). Isoflurane operates by altering neuronal membrane excitability (Franks and Lieb, 1994; Arhem et al., 2003) suggesting that neurons with different membrane properties may display different response curves under anesthesia. Therefore, based on the sensitivity of the membrane excitability to isoflurane and the similarity between the reported proportions of the two identified neuronal populations and the proportions observed in this study (63% in Group II and 37% in Group I) it is possible that the neurons comprising Group I are those projecting to the thalamus and those comprising Group II project to the LHb.

A complementary factor which may accentuate the differences between the firing properties of EP neurons is the different origin of their inputs; pallidohabenular neurons receive afferents from striatofugal neurons in the patch compartment whereas pallidothalamic neurons receive afferents from the matrix compartment (Rajakumar et al., 1993). It remains unknown however whether medium spiny neurons in the patch compartment respond to anesthesia differently than medium spiny neurons in the matrix. Another possible partition of EP neurons into subpopulations is based on their neurotransmitters; pallidothalamic neurons seem to comprise mainly GABAergic projections whereas the majority of the pallidohabenular projections in rats are non-GABAergic (Araki et al., 1984), and are possibly cholinergic (Moriizumi and Hattori, 1992). However, the release of different neurotransmitters is less likely to account for the observed differences in the groups' response curves and would yield a different group ratio (∼1:1; GABAergic vs. non-GABAergic neurons) than the observed ratio (2:1; pallidohabenular vs. pallidothalamic). Therefore, a classification of Group I and Group II as GABAergic and possibly cholinergic neurons is less probable.

Primate and cat pallidohabenular neurons, which are far less abundant (10 and 16% of the neurons, respectively) than rat pallidohabenular neurons (2/3 of the neurons), have lower baseline firing rates (33 spikes/s) than presumed motor GPi neurons (77 spikes/s) (Larsen and Mcbride, 1979; Hong and Hikosaka, 2008). In the current study we found that 63% of the neurons (Group II neurons) had significantly lower firing rates than Group I neurons in awake rats (Group II: 17.8 ± 2.5 spikes/s; and Group I: 40.0 ± 5.7 spikes/s). The similarity in the observed firing rate imbalance between the two identified groups in rats and between the habenular projecting GPi neurons and motor neurons in primates lends additional support to the hypothesis that the Group II neurons reported in this study are habenular projecting EP neurons whereas the Group I neurons are thalamus projecting EP neurons. By taking into account the small size of rat EP in the rostro-caudal axis (600–800 µm) it seems that confirmation of this hypothesis could be obtained by optogenetic activation of identified habenular projecting EP neurons or by their antidromic activation via stimulation of the LHb. If confirmed, the simple procedure of isoflurane administration or possibly other kinds of anesthesia would permit selective access to limbic vs. motor information processing in the rat EP and primate GPi.

## **CONCLUSION**

Our results suggest that: (1) firing patterns are preserved in the rat EP compared to primate GPi whereas firing rate and subpopulation ratio are not; and (2) the use of isoflurane anesthesia may provide an easy method to accentuate electrophysiological differences in the EP neurons and consequently enable their classification into two distinct neuronal populations. In so doing our results potentially reconcile *in vitro* studies performed in cat and rodent EP, and primate GPi where two subpopulations of neurons have been identified, and electrophysiological studies in behaving primates where only one homogeneous neuronal population has been reported. Further experiments are required to determine whether these two populations of neurons correspond to pallidohabenular and pallidothalamic neurons, and investigate whether their information processing capacity and contribution to behavior are similar or different in nature.

## **ACKNOWLEDGMENTS**

The study was financed in part by a REALNET (FP7-ICT270434) grant from the European Commission. The authors declare no competing financial interests.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The Associate Editor, Dr. Alon Korngreen declares that, despite being affiliated to the same institution as the authors, the review process was handled objectively and no conflict of interest exists.

*Received: 22 October 2013; accepted: 12 January 2014; published online: 10 February 2014.*

*Citation: Benhamou L and Cohen D (2014) Electrophysiological characterization of entopeduncular nucleus neurons in anesthetized and freely moving rats. Front. Syst. Neurosci. 8:7. doi: 10.3389/fnsys.2014.00007*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Benhamou and Cohen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## High and low frequency stimulation of the subthalamic nucleus induce prolonged changes in subthalamic and globus pallidus neurons

## *Hagar Lavian1,2, Hana Ben-Porat <sup>1</sup> and Alon Korngreen1,2\**

*<sup>1</sup> The Leslie and Susan Gonda Interdisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan, Israel <sup>2</sup> The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Jose Bargas, Universidad Nacional Autónoma de México, Mexico M. Gustavo Murer, Universidad de Buenos Aires, Argentina*

#### *\*Correspondence:*

*Alon Korngreen, The Leslie and Susan Gonda Interdisciplinary Brain Research Center and The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel*

*e-mail: alon.korngreen@biu.ac.il*

High frequency stimulation (HFS) of the subthalamic nucleus (STN) is widely used to treat the symptoms of Parkinson's disease (PD) but the mechanism of this therapy is unclear. Using a rat brain slice preparation maintaining the connectivity between the STN and one of its target nuclei, the globus pallidus (GP), we investigated the effects of high and low frequency stimulation (LFS) (HFS 100 Hz, LFS 10 Hz) on activity of single neurons in the STN and GP. Both HFS and LFS caused changes in firing frequency and pattern of subthalamic and pallidal neurons. These changes were of synaptic origin, as they were abolished by glutamate and GABA antagonists. Both HFS and LFS also induced a long-lasting reduction in firing frequency in STN neurons, possibly contesting a direct causal link between HFS and the outcome of deep brain stimulation (DBS). In the GP both HFS and LFS induced either a long-lasting depression or, less frequently, a long-lasting excitation. Thus, in addition to the intrinsic activation of the stimulated neurons, long-lasting stimulation of the STN may trigger prolonged biochemical processes.

**Keywords: subthalamic nucleus, globus pallidus, high frequency stimulation, low frequency stimulation, basal ganglia**

## **INTRODUCTION**

High frequency stimulation (HFS) of the subthalamic nucleus (STN) is widely used to treat various basal ganglia disorders, particularly Parkinson's disease (PD; Limousin et al., 1995; Starr et al., 1998). Because both STN lesions and HFS lead to similar amelioration of PD symptoms (Bergman et al., 1990; Benazzouz et al., 1993), it is commonly thought that the alleviating effect of HFS stems from partial or complete inhibition of the STN neurons. Supporting this hypothesis, some *in vivo* studies have shown that HFS in the STN suppressed the firing rate of STN neurons (Tai et al., 2003; Filali et al., 2004). Similarly, HFS of the STN in brain slice preparations inhibited subthalamic neurons (Beurrier et al., 2001; Magarinos-Ascone et al., 2002), supporting the view of inhibition of the STN as the underlying cause for alleviation of PD symptoms. Note, however, that such inhibitory effects have not yet been demonstrated to be specific to stimulation at therapeutic parameters. Furthermore, other *in vivo* studies found that HFS in the STN led to increased glutamate concentration (Windels et al., 2000) or elevated firing rate in STN target nuclei, indicating increased subthalamic activity (Hashimoto et al., 2003). Finally, using optogenetics to selectively inhibit subthalamic neurons did not affect motor symptoms in hemi-parkinsonian rats (Gradinaru et al., 2009). Thus, the effects of HFS in the STN on the firing frequency of its neurons and other basal ganglia neurons are unresolved.

HFS may alleviate PD by changing the firing pattern of STN neurons rather than their firing rate, as HFS in an irregular pattern is not effective in treating PD, as opposed to regular stimulation (Dorval et al., 2010). Indeed, *in vivo* HFS of the STN causes changes in the timing of the firing of STN neurons (Meissner et al., 2005) and of neurons in the external and internal globus pallidus (GP; Hashimoto et al., 2003; Moran et al., 2011).

In contrast with the alleviating effect of HFS, low frequency stimulation (LFS) may worsen PD symptoms (Timmermann et al., 2004). Can this be correlated to cellular changes in STN neurons as observed for HFS? To investigate the effects of STN stimulation, we compared the effects of HFS and LFS on the STN and on one of its target nuclei, the GP. Pallidal neurons not only receive glutamatergic input from the STN (Smith and Parent, 1988), but also innervate STN neurons via GABAergic synapses (Shink et al., 1996). We use these two reciprocally connected nuclei as a system for understanding the dynamic changes occurring during repetitive stimulation. Whole-cell recordings were made simultaneously from pallidal and subthalamic neurons in a rat brain slice preparation preserving the connectivity between these two nuclei. Both HFS and LFS led to similar prolonged depression of subthalamic firing. A similar long-term depression was seen in pallidal neurons, which also showed, less frequently, a long-term excitation.

## **MATERIALS AND METHODS**

#### *IN VITRO* **SLICE PREPARATION**

Brain slices were obtained from 17–22 day old Wistar rats as previously described (Stuart et al., 1993; Bugaysen et al., 2010). Rats were killed by rapid decapitation according to the guidelines of the Bar-Ilan University animal welfare committee. This procedure was approved by the national committee for experiments on laboratory animals at the Israeli Ministry of Health. The brain was quickly removed and placed in ice-cold artificial cerebrospinal fluid (ACSF) containing (in mM): 125 NaCl, 4 KCl, 25 NaHCO3, 1.25 Na2HPO4, 2 CaCl2, 2 MgCl2, 25 glucose, and 0.5 Na-ascorbate (pH 7.4 with 95% O2/5% CO2). Thick sagittal slices (370 µm) were cut on an HR2 Slicer (Sigman Electronic, Germany) and transferred to a submersion-type chamber, where they were maintained for the remainder of the day in ACSF at room temperature. Experiments were carried out at 37°C, and the recording chamber was constantly perfused with oxygenated ACSF.

#### *IN VITRO* **ELECTROPHYSIOLOGY**

Individual GP and STN neurons were visualized using infrared differential interference contrast (IR-DIC) microscopy. Whole-cell recordings were obtained from the soma of GP and STN neurons using patch pipettes (4–8 MΩ) pulled from thick-walled borosilicate glass capillaries (2.0 mm outer diameter, 0.5 mm wall thickness, Hilgenberg, Malsfeld, Germany). The standard pipette solution contained (in mM): 140 K-gluconate, 10 NaCl, 10 HEPES, 4 MgATP, 0.05 spermine, 5 l-glutathione, 0.2 EGTA, and 0.4 GTP (Sigma, pH 7.2 with KOH). The reference electrode was an Ag–AgCl pellet placed in the bath. Voltage signals were amplified by an Axopatch-200B amplifier (Axon Instruments), filtered at 10 kHz and sampled at 20 kHz. The 10-mV liquid junction potential measured under the ionic conditions reported here was not corrected for.

Electrical stimulation was applied via a monopolar 2–3 kΩ Narylene-coated stainless steel microelectrode positioned on the rostrodorsal part of the STN. The anode was an Ag–AgCl pellet placed in the bath. Stimulation pulses consisted of 100–300 µA biphasic currents (200 µs cathodal followed by 200 µs anodal phase). For LFS, the interval between consecutive pulses was 100 ms over 30 s, giving a stimulation frequency of 10 Hz (300 stimuli). For HFS, the interval was 10 ms over 20 s, giving a stimulation frequency of 100 Hz (2000 stimuli). In several experiments the following drugs were added to the ACSF: bicuculline (BCC) methiodide to block GABA<sub>A</sub> receptors (final concentration 50 µM), D(-)-2-amino-5-phosphonopentanoic acid (APV) (50 µM) and 6-cyano-7-nitroquinoxaline-2,3-dione (CNQX) (15 µM) to block NMDA and AMPA receptors, respectively.
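As a worked check of the protocol arithmetic, the following minimal Python sketch derives the stimulation frequency and pulse count from the inter-pulse interval and train duration, and the charge delivered per phase from the pulse amplitude and width. The function and variable names are illustrative assumptions and are not part of the original methods.

```python
def train_parameters(interval_ms, duration_s):
    """Return (frequency in Hz, number of pulses) for a regular pulse train."""
    frequency_hz = 1000.0 / interval_ms
    n_pulses = round(duration_s * frequency_hz)
    return frequency_hz, n_pulses


def phase_charge_nC(current_uA, width_us):
    """Charge per phase in nanocoulombs (1 uA x 1 us = 1 pC)."""
    return current_uA * width_us / 1000.0


# Values taken from the protocol described in the text.
lfs = train_parameters(interval_ms=100.0, duration_s=30.0)
hfs = train_parameters(interval_ms=10.0, duration_s=20.0)

# Charge per 200 us phase at the two ends of the 100-300 uA range.
q_min = phase_charge_nC(100.0, 200.0)
q_max = phase_charge_nC(300.0, 200.0)

print("LFS:", lfs)
print("HFS:", hfs)
print("charge per phase (nC):", q_min, "to", q_max)
```

Because the cathodal and anodal phases have equal width, each biphasic pulse of this kind is charge balanced when both phases have the same amplitude.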

## **DATA ANALYSIS**

All off-line analyses were carried out using Matlab R2007b (Mathworks) and IgorPro 6.0 (WaveMetrics) on a personal computer. Data for each experiment were obtained from at least five rats. All results for each experiment were pooled and displayed as mean ± SD. The pre-stimulus average firing rate was calculated from spikes extracted from 30 s of continuous recording. Firing rate was defined as the time-dependent average firing rate aligned to the stimulation onset. Changes in firing pattern were determined through raster plots and peri-stimulus time histograms (PSTH) from each recording. A *t*-test gave the significance of changes before and after application of the drugs during stimulation (∗*p* < 0.05 and ∗∗*p* < 0.01).

*[Figure caption fragment]* … stimulation (upper trace). IPSP recorded in the STN in response to GP stimulation (lower trace).

## **RESULTS**

The STN was identified as a small oval nucleus above the internal capsule (**Figure 1A**). Neurons in this region were spontaneously active and were characterized by sag current induced by hyperpolarization (**Figure 1B**) and burst firing (*n* = 39), as previously reported (Nakanishi et al., 1987; Beurrier et al., 1999). The GP was identified rostral to the internal capsule and caudal to the striatum (**Figure 1A**). To assess the connectivity between both nuclei in the slices, subthalamic and pallidal neurons were labeled with biocytin, allowing us to track stained axons. Axons in both directions were best preserved when the sagittal slices were cut at an angle of 17° to the midline. To confirm the functional connectivity in each slice, we placed the stimulating electrode at the center of each nucleus and obtained whole-cell recordings from neurons in the other nucleus. As both subthalamic and pallidal neurons are spontaneously active, recorded neurons were hyperpolarized to identify synaptic activity. Single pulse stimulation of the STN evoked excitatory postsynaptic potentials (EPSPs) in the recorded GP neurons (**Figure 1C**). These EPSPs were abolished after application of APV and CNQX, indicating a glutamatergic origin (data not shown). Single pulse stimulation of the GP evoked inhibitory postsynaptic potentials (IPSPs) in the subthalamic neurons, indicating the preservation of GABAergic axons from the GP to the STN (**Figure 1C**).

Whole-cell recordings were obtained from 47 subthalamic and 48 pallidal neurons during repetitive stimulation of the STN at different frequencies. All recorded neurons were labeled with biocytin showing that they lay within the boundaries of the appropriate nucleus. **Figure 2A** shows an example of such simultaneous recording of a subthalamic (i) and a pallidal neuron (ii) during 100 Hz stimulation of the STN. To investigate the effects of HFS on the pallidal and subthalamic neurons, we recorded simultaneously from both nuclei during 100 Hz stimulation. In order to identify changes in firing pattern, we calculated raster plots and PSTH from each recording (**Figures 2B, C**). HFS evoked significant changes in the firing pattern of most subthalamic (*n* = 44/47) and pallidal neurons (45/48) (**Figure 3**) as found *in vivo* (Hashimoto et al., 2003; Moran et al., 2011). During the stimulation the firing of neurons from both nuclei became partially locked to the stimulus pulses; population PSTHs showed that the STN and GP firing rate transiently increased at an average latency of 2.6 ± 1 ms and 2.5 ± 0.9 ms, respectively (**Figure 3A**). These latencies are consistent with *in vivo* findings (Kita and Kitai, 1991; Hashimoto et al., 2003; Moran et al., 2011). The effect of LFS was examined using stimulation at 10 Hz. Similar to the responses observed during HFS, most of the subthalamic (28/37) and pallidal (34/44) neurons showed a transient increase in firing rate at an average latency of 2.3 ± 0.8 ms and 2.5 ± 0.6 ms, respectively (**Figure 3B**).
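The peri-stimulus analysis can be illustrated with a minimal Python sketch that uses only the standard library. It bins spike times relative to each stimulus pulse and reports the center of the bin with the highest rate as a latency estimate. The bin width, the window length, the peak-bin definition of latency, and all names are assumptions made for illustration; the text does not specify how latency was measured, and the original analysis was carried out in Matlab and IgorPro.

```python
from bisect import bisect_left


def psth(spike_times_ms, stim_times_ms, window_ms=10.0, bin_ms=0.5):
    """Peri-stimulus time histogram.

    Returns (left bin edges in ms, mean firing rate per bin in spikes/s).
    """
    n_bins = int(round(window_ms / bin_ms))
    counts = [0] * n_bins
    spikes = sorted(spike_times_ms)
    for t0 in stim_times_ms:
        # Index of the first spike at or after this stimulus pulse.
        i = bisect_left(spikes, t0)
        while i < len(spikes) and spikes[i] - t0 < window_ms:
            b = int((spikes[i] - t0) / bin_ms)
            counts[min(b, n_bins - 1)] += 1  # guard against rounding at the edge
            i += 1
    n_trials = len(stim_times_ms)
    rates = [c / (n_trials * bin_ms / 1000.0) for c in counts]
    edges = [k * bin_ms for k in range(n_bins)]
    return edges, rates


def peak_latency_ms(edges, rates, bin_ms=0.5):
    """Center of the bin with the highest rate (an assumed latency measure)."""
    k = max(range(len(rates)), key=rates.__getitem__)
    return edges[k] + bin_ms / 2.0


# Synthetic example: ten pulses at 100 Hz, one spike 2.6 ms after each pulse.
stim = [10.0 * k for k in range(10)]
spikes = [t + 2.6 for t in stim]
edges, rates = psth(spikes, stim)
latency = peak_latency_ms(edges, rates)
print("estimated latency (ms):", latency)
```

With a 0.5 ms bin the estimate is only resolved to the bin center, so a finer bin or a threshold-crossing criterion would be needed to compare directly with the latencies reported above.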

The early excitation in the STN evoked by the stimulation protocols could result from intrinsic activation of the subthalamic neurons or from the activation of afferent cortical and thalamic glutamatergic fibers. The early GP excitation could result from antidromic activation of pallidal axons or from the activation of subthalamic fibers. To determine the origin of this excitatory phase APV and CNQX were applied to the slices. These glutamatergic blockers completely abolished this excitatory response in both subthalamic and pallidal neurons (**Figure 4**). That is, rather than activating the neurons intrinsically, the stimulation results in activation of neurons in both nuclei via glutamatergic synapses.

In some cases a transient inhibition followed the initial excitation evoked by each pulse. During HFS, the firing rate of 7/47 subthalamic (15%) and 9/48 pallidal neurons (19%) decreased to a minimum at an average latency of 3.7 ± 0.9 ms and 3.6 ± 0.5 ms, respectively (**Figure 5A**). Single pulses in LFS resulted in inhibition in 9 subthalamic (24%) and 19 pallidal (43%) neurons. This inhibitory phase was abolished by applying BCC (**Figure 5B**) and thus results from the delayed activation of GABAergic synapses. The lower prevalence of inhibition after HFS, compared with LFS, may be due to partial depression of GABAergic synapses due to the high frequency of stimulation.

We next characterized the effect of repetitive STN stimulation on the firing frequency of the subthalamic and pallidal neurons. The population firing rate of the subthalamic neurons increased slightly during both HFS and LFS. In most subthalamic neurons, cessation of stimulation was followed by a prolonged decrease in firing rate. An example of such a prolonged decrease in the firing rate after LFS at 10 Hz is shown in **Figure 6A**. Following 10 Hz stimulation the firing rate of 22/37 subthalamic neurons was reduced by 31 ± 23% (*p* < 0.01); after 100 Hz stimulation the firing rate of 28/47 subthalamic neurons was reduced by 41 ± 26% (*p* < 0.01, **Figures 6B, C**). This prolonged inhibition was evident in cells from different areas within the STN and did not vary with the distance from the stimulation electrode.

Long-term effects were also observed in pallidal neurons. Similar to the responses recorded in the STN, the firing rate of 20/40 pallidal neurons was reduced by 22 ± 20% (*p* < 0.01) after 10 Hz stimulation, and after 100 Hz stimulation the firing rate of 25/48 pallidal neurons was reduced by 21 ± 13% (*p* < 0.01) (**Figure 7**). As in the STN, there was no relation between the long-term response and the location of the pallidal neurons. However, in contrast to the STN, approximately a quarter of the pallidal neurons showed the opposite effect, a prolonged increase of firing rate which could be induced by both protocols (**Figure 8**). Following 10 Hz stimulation the firing rate of 9/40 pallidal neurons increased by 17 ± 11% (*p* < 0.01). Following 100 Hz stimulation the firing rate of 11/48 pallidal neurons increased by 24 ± 21% (*p* < 0.01).
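The long-term changes reported here are percentage changes in firing rate relative to the pre-stimulus baseline. A minimal Python sketch of that arithmetic follows, together with the responder fractions implied by the reported cell counts. The 30 s post-stimulus window, the synthetic spike trains, and the function names are assumptions for illustration; the text specifies only the 30 s pre-stimulus window.

```python
def mean_rate(spike_times_s, start_s, stop_s):
    """Mean firing rate (spikes/s) in the half-open window [start_s, stop_s)."""
    n = sum(1 for t in spike_times_s if start_s <= t < stop_s)
    return n / (stop_s - start_s)


def percent_change(pre_rate, post_rate):
    """Signed change in firing rate as a percentage of the baseline rate."""
    return 100.0 * (post_rate - pre_rate) / pre_rate


# Synthetic neuron: 10 spikes/s for 30 s before stimulation,
# 5 spikes/s in an assumed 30 s window after stimulation ends at t = 60 s.
pre_spikes = [k * 0.1 for k in range(300)]
post_spikes = [60.0 + k * 0.2 for k in range(150)]
pre = mean_rate(pre_spikes, 0.0, 30.0)
post = mean_rate(post_spikes, 60.0, 90.0)
change = percent_change(pre, post)
print("change in firing rate (%):", change)

# Responder counts as reported in the text: (responders, recorded neurons).
reported = {
    ("STN", "10 Hz", "decrease"): (22, 37),
    ("STN", "100 Hz", "decrease"): (28, 47),
    ("GP", "10 Hz", "decrease"): (20, 40),
    ("GP", "100 Hz", "decrease"): (25, 48),
    ("GP", "10 Hz", "increase"): (9, 40),
    ("GP", "100 Hz", "increase"): (11, 48),
}
fractions = {key: 100.0 * k / n for key, (k, n) in reported.items()}
for key, pct in fractions.items():
    print(key, f"{pct:.1f}%")
```

The pallidal "increase" fractions computed this way (9/40 and 11/48) both fall between 22 and 23%, which matches the description of approximately a quarter of pallidal neurons.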

#### **DISCUSSION**

This study characterized changes in firing of subthalamic and GP neurons during HFS and LFS of the STN. Both HFS and LFS led to significant short-term modulation of firing pattern in both nuclei (**Figures 2**, **3**) through the activation of glutamatergic and GABAergic synapses (**Figures 4**, **5**). LFS and HFS also induced similar long-term effects. In the STN, both protocols induced long-lasting suppression of firing rate in many neurons (**Figure 6**), while in the GP the same protocols could induce either long-lasting suppression or, less frequently, excitation (**Figures 7**, **8**).

Both HFS and LFS activated glutamatergic and GABAergic synapses, thus reshaping the firing pattern of the subthalamic neurons (**Figures 4**, **5**). None of the recorded cells was intrinsically activated by the stimulus pulses. These findings fit the changes in firing pattern found in *in vivo* HFS studies (Hashimoto et al., 2003). It has also been shown *in vivo* that HFS of the STN raised the glutamate concentration in the GP. This suggests that HFS activates subthalamic neurons, which consequently excite pallidal neurons (Windels et al., 2000). We showed here that the excitation of both subthalamic and pallidal neurons was of glutamatergic origin. Previous findings showed that cortical afferents to the STN are activated antidromically during STN-HFS (Li et al., 2012). Our findings suggest that the activation of these glutamatergic fibers during STN-HFS or LFS consequently excites the subthalamic neurons.

During repetitive stimulation of the STN, the glutamatergic excitation after each pulse was followed by a GABAergic inhibition. Blocking the glutamatergic conduction blocked both excitatory and inhibitory phases (data not shown), indicating the dependence of the inhibitory GABAergic effect on activation of glutamatergic fibers. It is possible that, due to differences in the basal activity of the glutamatergic and GABAergic afferents, the stimulus pulses directly activated only glutamatergic synapses, with the delayed GABAergic effect stemming from the increased activity of pallidal neurons. The STN receives glutamatergic inputs from the cortex and thalamus and GABAergic inputs from the GP. Unlike the thalamic and cortical afferents, pallidal neurons fire spontaneously at relatively high rates and thus the GABAergic synapses are partially depressed (Atherton et al., 2013; Bugaysen et al., 2013). As a result, the stimulus pulses may only modulate the activity of the glutamatergic synapses, while the GABAergic synapses remain unaffected. This suggestion is supported by our finding that GABAergic inhibition was present in 24% of subthalamic neurons during 10 Hz stimulation, yet in only 15% of neurons during 100 Hz stimulation.

We found no clear dynamic change in the neuronal activity during repetitive stimulation; however, there was a significant depression of firing rate following either 100 or 10 Hz stimulation in most subthalamic and pallidal neurons (**Figures 6**, **7**). In addition, both HFS and LFS induced a prolonged increase of the firing rate in about a quarter of GP neurons but not in STN neurons (**Figure 8**). These results fit the different dynamic and static changes in firing rate induced by similar stimulation protocols applied to the GP (Erez et al., 2009; Bugaysen et al., 2011).

The differences in long-term responses between STN and GP (**Figures 6**–**8**) may arise from differences between the nuclei. Unlike the homogeneous neuronal population of the STN, the neurons in the GP can be classified into different populations by their electrophysiological and molecular characteristics (Cooper and Stanford, 2000; Bugaysen et al., 2010). These differences may account for the opposing long-term effects observed in the pallidal neurons. On the other hand, the differences in response may be explained by the inner connectivity of the GP. Pallidal neurons send collaterals to form inhibitory synapses within the GP (Sato et al., 2000; Sadek et al., 2007). We recently showed that a single pallidal neuron can modulate the postsynaptic firing rate of other neurons in the GP (Bugaysen et al., 2013). It is thus possible that the different long-term responses were recorded from neurons receiving different synaptic inputs. One may expect that pallidal neurons receiving direct input from the STN exhibit a prolonged depression similar to that in the subthalamic cells. As the activity of these neurons decreases, other pallidal neurons may increase their firing rate due to disinhibition. This hypothesis remains to be investigated.

As the long-lasting effects observed here after repetitive stimulation were independent of frequency, they may be due to biochemical processes rather than inactivation of the stimulated neurons. Shen et al. (2003) reported that HFS of the STN *in vitro* induced either long-term depression or long-term potentiation in the subthalamic synapses, implicating synaptic plasticity mechanisms. Here we found no excitatory effect in the STN. It should be pointed out that the experiments reported here were performed in slices obtained from normal rats, and may not represent the effects of STN-HFS and LFS in dopamine-depleted preparations. As long-term plasticity mechanisms may be dopamine dependent (Yamawaki et al., 2012; Dupuis et al., 2013), similar measurements from dopamine-depleted slices are required.

HFS is commonly thought to alleviate PD symptoms through inhibiting STN neurons. *In vitro* studies have shown that HFS of the STN caused complete cessation of firing during or following stimulation (Beurrier et al., 2001). An *in vivo* study showed decreased firing rate of subthalamic neurons during, but not after, HFS of the STN in 6-hydroxydopamine (6-OHDA) lesioned rats (Filali et al., 2004). These studies indicate that HFS has an inhibitory effect on the STN and suggest that this may play a role in alleviating PD symptoms. This hypothesis is challenged by our results, which, as far as we know, are the first to demonstrate similar inhibitory effects induced by LFS of the STN. LFS shows no therapeutic effect in PD and may even worsen the symptoms. Therefore the mechanisms underlying the therapeutic effect of HFS in the STN remain to be resolved.

## **ACKNOWLEDGMENTS**

This work was supported by the Legacy Heritage Bio-Medical Program of the Israeli Science Foundation (#981/10) to Alon Korngreen.

## **REFERENCES**


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 August 2013; accepted: 04 December 2013; published online: 18 December 2013*.

*Citation: Lavian H, Ben-Porat H and Korngreen A (2013) High and low frequency stimulation of the subthalamic nucleus induce prolonged changes in subthalamic and globus pallidus neurons. Front. Syst. Neurosci. 7:73. doi: 10.3389/fnsys.2013.00073 This article was submitted to the journal Frontiers in Systems Neuroscience*.

*Copyright © 2013 Lavian, Ben-Porat and Korngreen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Haloperidol-induced changes in neuronal activity in the striatum of the freely moving rat

#### *Dorin Yael 1†, Dagmar H. Zeef 2†, Daniel Sand1, Anan Moran3, Donald B. Katz 3, Dana Cohen1, Yasin Temel <sup>2</sup> and Izhar Bar-Gad1 \**

*<sup>1</sup> The Leslie & Susan Goldschmied (Gonda) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel*

*<sup>2</sup> Departments of Neuroscience and Neurosurgery, Maastricht University Medical Center, Maastricht, Netherlands*

*<sup>3</sup> Department of Psychology, Volen National Center for Complex Systems, Brandeis University, Waltham, MA, USA*

#### *Edited by:*

*Alon Korngreen, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Christian K. E. Moll, University Clinic Hamburg-Eppendorf, Germany Judith Walters, National Institute of Neurological Disease and Stroke, National Institutes of Medicine, USA*

#### *\*Correspondence:*

*Izhar Bar-Gad, The Leslie & Susan Goldschmied (Gonda) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan 52900, Israel e-mail: izhar.bar-gad@biu.ac.il*

*†These authors have contributed equally to this work.*

The striatum is the main input structure of the basal ganglia, integrating input from the cerebral cortex and the thalamus, which is modulated by midbrain dopaminergic input. Dopamine modulators, including agonists and antagonists, are widely used to relieve motor and psychiatric symptoms in a variety of pathological conditions. Haloperidol, a dopamine D2 antagonist, is commonly used in multiple psychiatric conditions and motor abnormalities. This article reports the effects of haloperidol on the activity of three major striatal subpopulations: medium spiny neurons (MSNs), fast spiking interneurons (FSIs), and tonically active neurons (TANs). We implanted multi-wire electrode arrays in the rat dorsal striatum and recorded the activity of multiple single units in freely moving animals before and after systemic haloperidol injection. Haloperidol decreased the firing rate of FSIs and MSNs while increasing their tendency to fire in an oscillatory manner in the high voltage spindle (HVS) frequency range of 7–9 Hz. Haloperidol led to an increased firing rate of TANs but did not affect their non-oscillatory firing pattern and their typical correlated firing activity. Our results suggest that dopamine plays a key role in tuning both single unit activity and the interactions within and between different subpopulations in the striatum in a differential manner. These findings highlight the heterogeneous striatal effects of tonic dopamine regulation via D2 receptors which potentially enable the treatment of diverse pathological states associated with basal ganglia dysfunction.

**Keywords: striatum, dopamine antagonist, extra cellular recording, neuronal subpopulations, oscillations**

#### **INTRODUCTION**

The striatum is the main input structure of the basal ganglia, a group of sub-cortical nuclei involved in motor, limbic and associative behavior. The vast majority (>95%) of striatal neurons are the GABAergic projection neurons, which are traditionally termed medium spiny neurons (MSNs) (Rymar et al., 2004). MSNs can be divided into D1-receptor expressing MSNs projecting to the substantia nigra pars reticulata (SNr), and the entopeduncular nucleus (EP) and D2-receptor expressing MSNs projecting to the globus pallidus (GP) (Albin et al., 1989). MSNs are characterized by bursty firing in response to different movements and behaviors with a very low baseline firing rate (Wilson and Groves, 1981). Additionally, the striatum contains multiple small populations of interneurons (Tepper et al., 2004). Two of the most extensively studied interneuron populations are the GABAergic fast spiking interneurons (FSIs) and the cholinergic tonically active interneurons (TANs). FSIs, which comprise ∼1% of the striatal cell population, receive inputs from the cortex and GP, project to surrounding MSNs, and exert a strong inhibitory modulation (Berke, 2011). Electrophysiologically, FSIs display narrow spike waveforms with a high baseline firing rate and resemble the FSIs found in the hippocampus and cerebral cortex (Somogyi and Klausberger, 2005; Berke, 2011). TANs, which make up 1–2% of the striatal population (Goldberg and Reynolds, 2011), have a tonic irregular firing pattern, lack burst activity, and are strongly influenced by both positive and aversive stimuli. TANs show wide and high amplitude extracellular spike waveforms, with a relatively long refractory period (Sharott et al., 2009).

Dopamine and dopamine receptors play a key role in controlling information processing during both normal and pathological states. Dopamine denervation is the pathological hallmark of Parkinson's disease, while a hyper-dopaminergic state has been associated with hyper-kinetic and hyper-behavioral disorders ranging from dyskinesia to schizophrenia. Within the striatum, dopamine modulates the activity of individual neurons as well as the dynamics within and between the neuronal subpopulations (Murer et al., 2002; Humphries et al., 2009). Phasic and tonic alterations of dopamine levels within the striatum lead to a variety of changes in neuronal activity. Reward-related phasic release of dopamine from the midbrain leads to inhibition of the cholinergic interneurons (TANs) in both non-human primates and rats, and similar effects have been reported for aversive stimuli (Aosaki et al., 1994b; Raz et al., 1996; Ravel et al., 2003). Chronic depletion of dopamine, such as in the 6-OHDA rat model or the MPTP-treated primate model of Parkinson's disease, leads to an increase in the firing rate of the TANs and the formation of oscillatory firing patterns (Raz et al., 1996; Kish et al., 1999). The effect of dopamine on MSNs has classically been divided into positive modulation via D1-receptors and negative modulation via the D2-receptors (Albin et al., 1989). This basic modulation of the direct (D1 receptor expressing) and indirect (D2 receptor expressing) MSNs is further dependent upon additional factors such as cortical input, as well as the behavioral state of the subject (Moyer et al., 2007; Gerfen and Surmeier, 2011). The firing rate of FSIs has been shown to depend on dopamine: it increases after agonist application and decreases after antagonist application (Wiltschko et al., 2010).

**Abbreviations:** BG, basal ganglia; FSI, fast spiking interneuron; GP, globus pallidus; HVS, high voltage spindle; MSN, medium spiny neuron; SNc, substantia nigra pars compacta; TAN, tonically active neuron.

These pronounced motor and behavioral effects of dopamine are utilized in providing pharmacological treatment of multiple motor and psychiatric disorders. Antipsychotic drugs, which are widely used in psychiatric conditions such as psychosis and mania or hyperkinetic motor disorders such as tics and chorea, are classically divided into typical and atypical antipsychotics. Typical antipsychotics (e.g. haloperidol) have a high affinity for D2 receptors (Gardner et al., 2005). The D2 receptor is expressed in the indirect basal ganglia pathway and plays an important role in behavioral, cognitive as well as affective and motor pathologies (Cohen and Frank, 2009). The high affinity of typical antipsychotics for the D2 receptor is known to cause extrapyramidal side effects such as akinesia, dystonia, parkinsonism, and tardive dyskinesia, as well as cognitive and behavioral alterations (Strange, 2008).

The current study assessed the effects of haloperidol on multiple striatal neuron subtypes (MSNs, TANs, and FSIs) in freely moving rats. We used chronic multi-electrode recordings to characterize the changes at the single neuron level and discuss network dynamics within the striatum and their potential role in the cortico-basal ganglia pathway.

## **METHODS**

#### **ANIMALS**

Ten adult rats (five Sprague Dawley and five Long-Evans; Harlan), weighing 250–450 g, were used in this study. The animals were kept under controlled temperature, humidity, and lighting (12 h light/dark cycle) conditions and had free access to food and water. Experiments were performed during the light phase of the cycle. All procedures were approved and supervised by the Institutional Animal Care and Use Committee (IACUC) and were in accordance with the National Institute of Health Guide for the Care and Use of Laboratory Animals and the Bar-Ilan University Guidelines for the Use and Care of Laboratory Animals in Research. This study was approved by the National Committee for Experiments in Laboratory Animals at the Ministry of Health (permit number 6-01-11).

#### **SURGICAL PROCEDURES**

After arrival, the animals were housed in their home cage for at least 1 week before the operation. Animals were sedated with 5% isoflurane and then anesthetized with an intra-peritoneal injection of a mixture of ketamine HCl (100 mg/kg) and xylazine HCl (10 mg/kg) and an intra-muscular injection of atropine (0.1 mg). Anesthesia during the operation was maintained using isoflurane and supplementary injections of ketamine. The animals' scalp was shaved and cleaned, and anesthetic cream (2.5% lidocaine and 2.5% prilocaine) was applied in their ears to prevent discomfort during head fixation. The rats were placed in a stereotactic frame (Stoelting Co., Wood Dale, IL, USA) and their head was fixed. The scalp was sterilized and lidocaine (10 mg/ml) was locally injected into the skin around the incision location. A central incision was made and the skull was exposed. The head was then straightened by equalizing bregma-lambda height. Coordinates of the craniotomies were marked and holes were drilled. The dura was removed and 16-electrode arrays were slowly lowered unilaterally into the striatum and were grounded and fixed to the skull by screws and dental cement.

Two types of custom-made multi-wire arrays were used. Four of the animals were implanted with non-movable 4 × 4 linear arrays of 45 μm diameter isonel-coated tungsten microwires (Yarom and Cohen, 2011) [AP +1.0, −0.5; ML 2.0, 4.0; DV 5.0 (Paxinos and Watson, 2007)] (**Figures 1A,C**). These arrays displayed higher probabilities of recording TANs. Six of the animals were implanted with movable bundles of 16 formvar-coated nichrome microwires of 25 μm diameter (Grossman et al., 2008) [AP +1.0, −0.5; ML 2.0, 3.5; DV 4.0 (Paxinos and Watson, 2007)] (**Figures 1B,D**). These arrays displayed higher probabilities of recording MSNs and FSIs.

**FIGURE 1 | Methods.** Illustration of the **(A)** non-movable 4 × 4 45μm diameter isonel-coated tungsten microwire linear arrays and the **(B)** movable bundles of 16 25μm diameter formvar coated nichrome microwires and **(C,D)** their locations in the dorsal striatum.

#### **EXPERIMENTAL SESSIONS**

Experimental sessions began at least 7 days post-operation to allow full recovery of the animals. The animals were placed in the recording chamber, where they could move freely, and were connected to the recording system. The signal was continuously recorded throughout the session. The analog signal was amplified (×200), band-pass filtered (0.5–10000 Hz, 4-pole Butterworth filter) and digitized at a sampling rate of 44 kHz (Alpha-Lab SNR, Alpha-Omega Engineering, Nazareth, Israel). The digitized continuous raw signal was acquired to allow offline sorting and analysis. Sessions began with recordings in the naïve state, followed by recording of neuronal activity during and after a single subcutaneous haloperidol (0.05 mg) injection. A few experimental sessions were performed in some of the animals (2.5 ± 1.3 sessions/animal, mean ± *SD*), separated by prolonged periods (11.3 ± 7 days between sessions, mean ± *SD*) to avoid accumulation of the haloperidol effect. In order to minimize behavioral effects on the results, the electrophysiological recordings took place during the resting state of the animals both before and after the injection. Notably, the resting state following haloperidol injection included akinesia and rigidity, which were not part of the pre-injection state.

#### **DATA PREPROCESSING AND ANALYSIS**

Spike sorting of multiple single units from the continuous raw signal was performed offline (Offline Sorter v2.8.8, Plexon, Dallas, TX). Offline sorting using a continuous signal instead of the more common segmented data makes it possible to obtain long overlapping windows (1.5–2 ms in our data) surrounding each spike, thereby increasing the fidelity and quality of the sorting. Only neurons that formed clearly distinguishable clusters were included in the study. Neurons that did not display a stable waveform throughout the recording session were eliminated from the database. All subsequent analyses were performed using custom-written MATLAB code (Mathworks, Natick, MA).

#### **HISTOLOGY**

After completion of the experimental sessions, the rats were deeply anesthetized with a mixture of ketamine (100 mg/kg), xylazine (10 mg/kg) and additional morphine hydrochloride (0.15 mg/kg). A current (−100 μA, 10 s) was passed through the tips of the electrodes to create small marker microlesions. Rats were perfused transcardially with 0.9% saline followed by 10% paraformaldehyde. The brains were removed and cryoprotected for a minimum of 48 h in a fixation solution of 30% sucrose in 10% paraformaldehyde. Brains were frozen and cut into 50 μm thick sections on a cryostat. Sections holding the striatum were counterstained with cresyl violet for identification of the structures and electrode placement verification.

## **RESULTS**

#### **EXPERIMENTAL SESSIONS**

We continuously recorded extracellular activity in the dorsal striatum of freely moving rats in naïve and post-systemic haloperidol injection states using two types of 16 microwire arrays (25 and 45μm wire diameter). The recordings were made in 10 rats over 22 separate recording sessions. Electrophysiological recordings in the naïve state began after an initial acclimatization period in which the animals were allowed to move freely and explore the recording cage. Recording in the naïve state lasted 30 min, continued throughout the injection time and for a post-injection period of 30 min. Injection of haloperidol induced observable akinesia and rigidity within a few minutes during all of the sessions. These behavioral effects could be modulated for short periods of time; for example, after moving the animals from the recording cage to their home cage, a short period of exploration could occur. An additional common phenomenon occurring after haloperidol injection was increased stereotypical behavior in the form of teeth chattering.

#### **NEURONAL CLASSIFICATION**

The recorded signal was offline sorted and the dataset of 87 single units was classified into three categories: MSNs (30/87, 34.5%), TANs (42/87, 48.3%), and FSIs (15/87, 17.2%) (**Figure 2A**). All the neurons presented well separated clusters which were stable throughout the whole recording session. The classification was first done subjectively using multiple parameters such as the mean waveform shape, inter-spike interval (ISI) histogram, firing rate and pattern, and was subsequently verified via clustering using three parameters: (1) mean firing rate during the naïve period, (2) peak to valley width of the mean spike waveform and (3) the ISI coefficient of variation (CV), where the different subpopulations formed three clearly separable clusters (**Figure 2B**). The mean firing rate was calculated using 600 s long segments taken from the naïve state recordings (ending at least 100 s prior to the haloperidol injection). MSNs typically displayed a low baseline firing rate of 1.9 ± 1.9 Hz (mean ± *SD*), TANs displayed a medium rate of 4.2 ± 2.0 Hz (mean ± *SD*) and FSIs displayed the highest baseline rate of 17.5 ± 10.3 Hz (mean ± *SD*). The CV was calculated using the same time windows by dividing the standard deviation of the 1st order ISIs by their mean. This measure varied across the striatal subpopulations: MSNs typically displayed a bursty firing pattern over low baseline activity and a high CV value of 2.5 ± 0.7 (mean ± *SD*). FSIs displayed a bursty firing pattern over high baseline activity and a CV of 1.7 ± 0.6 (mean ± *SD*). TANs displayed a Poisson-like firing pattern and a CV value of 0.9 ± 0.3 (mean ± *SD*). Waveform peak to valley width was calculated using the waveforms of the whole recording periods. MSN waveforms were typically wide and displayed a long positive second phase. FSIs displayed a narrow waveform, and TANs displayed a wide waveform containing an initial positive phase (**Figure 2C**).
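The rate and CV measures used for the classification can be illustrated with a short sketch. The analyses in this study were run in custom MATLAB code that is not reproduced here, so the Python below is only an illustration: the function name `mean_rate_and_cv`, the input format (sorted spike times in seconds) and the use of the sample standard deviation are all assumptions.

```python
import statistics

def mean_rate_and_cv(spike_times, t_start, t_stop):
    """Return (mean firing rate in Hz, ISI coefficient of variation).

    spike_times: sorted spike times in seconds (hypothetical input format).
    Only spikes inside [t_start, t_stop) are used.
    """
    window = [t for t in spike_times if t_start <= t < t_stop]
    rate = len(window) / (t_stop - t_start)
    # First-order ISIs: differences between consecutive spikes.
    isis = [b - a for a, b in zip(window, window[1:])]
    if len(isis) < 2:
        return rate, float("nan")
    # The text does not state whether the sample or population SD was used;
    # the sample SD is assumed here.
    cv = statistics.stdev(isis) / statistics.mean(isis)
    return rate, cv

# Toy regular train: 5 Hz for 600 s, so the CV is essentially zero.
regular = [0.2 * i for i in range(3000)]
rate_regular, cv_regular = mean_rate_and_cv(regular, 0.0, 600.0)

# Toy bursty train: three spikes 5 ms apart every 2 s, so the CV exceeds 1.
bursty = [2.0 * i + 0.005 * j for i in range(300) for j in range(3)]
rate_bursty, cv_bursty = mean_rate_and_cv(bursty, 0.0, 600.0)
```

In this toy example the regular train plays the role of a non-bursting unit and the bursty train that of a unit with a high CV; real units would of course fall between these extremes.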

#### **HALOPERIDOL EFFECTS**

Statistics comparing the naïve (pre-haloperidol injection) state with the post-haloperidol injection state were calculated using stable periods of continuous 600 s recordings: one from the naïve state, ending at least 100 s prior to the haloperidol injection, and one from the post-haloperidol injection state, starting at least 600 s after the haloperidol injection. This procedure produced stable periods and served to avoid transition periods.
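A minimal sketch of how such stable periods could be selected from a session, and how a period can be split into short windows for per-unit rate comparisons, is given below. It is written in Python rather than the MATLAB used in the study, and the function names, the example injection time of 1800 s and the choice to use the minimum gaps exactly are assumptions made for illustration.

```python
def stable_segments(injection_time, length=600.0, pre_gap=100.0, post_gap=600.0):
    """Return ((naive_start, naive_stop), (post_start, post_stop)) in seconds.

    The naive segment ends `pre_gap` s before the injection and the
    post-injection segment starts `post_gap` s after it. The text gives
    these gaps as minimum values; using them exactly is an assumption.
    """
    naive = (injection_time - pre_gap - length, injection_time - pre_gap)
    post = (injection_time + post_gap, injection_time + post_gap + length)
    return naive, post

def windowed_rates(spike_times, start, stop, width=10.0):
    """Firing rate (Hz) in consecutive non-overlapping windows of `width` s."""
    n_windows = int(round((stop - start) / width))
    counts = [0] * n_windows
    for t in spike_times:
        if start <= t < stop:
            idx = min(int((t - start) / width), n_windows - 1)
            counts[idx] += 1
    return [c / width for c in counts]

# Hypothetical session: injection after 30 min; toy unit firing at 2 Hz
# before the injection and at 1 Hz afterwards.
injection_time = 1800.0
spikes = [0.5 * i + 0.25 for i in range(3600)]
spikes += [injection_time + i + 0.25 for i in range(1200)]

naive, post = stable_segments(injection_time)
naive_rates = windowed_rates(spikes, *naive)
post_rates = windowed_rates(spikes, *post)
```

Each 600 s period yields 60 rate values per unit, and the two resulting distributions can then be compared with a standard two-sample test.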

The mean firing rates of the neurons during the stable windows of the naïve and post-haloperidol states were compared. Both MSNs and FSIs significantly lowered their firing rates after haloperidol injection (paired *t*-test, *p* < 0.01) from the mean population baseline rates of 1.9 ± 0.4 Hz (mean ± s.e.m.) and 17.5 ± 2.6 Hz (mean ± s.e.m.) to firing rates of 0.6 ± 0.2 Hz (mean ± s.e.m.) and 6.5 ± 1.4 Hz (mean ± s.e.m.), respectively (**Figure 3A**). To assess the behavior of individual neurons within each subtype, comparisons were made of the distribution of firing rates of each neuron in 60 short (10 s) windows making up the stable periods. In a single unit analysis, 80% (24/30) of the MSNs (**Figure 3B**) and 100% (15/15) of the FSIs (**Figure 3C**) significantly lowered their firing rates after the haloperidol injection (*t*-test, *p* < 0.01). The TANs' firing rates were elevated after the haloperidol injection (paired *t*-test, *p* < 0.01) from a mean population baseline rate of 4.2 ± 0.3 Hz (mean ± s.e.m.) to a firing rate of 6.3 ± 0.4 Hz (mean ± s.e.m.) (**Figure 3A**). In a single unit analysis, 88% (37/42) of the TANs elevated their firing rate post-haloperidol injection in a significant manner (*t*-test, *p* < 0.01) (**Figure 3D**). The transition dynamics of firing rates following haloperidol were complex and dependent on the specific animal and session; however, the effects were clear across the population (**Figures 4A,B**). Despite the different dynamics across the sessions, simultaneously recorded neurons demonstrated the same dynamics in their transition phases and timing following the injection (**Figure 4C**).

**FIGURE 2 |** [...] raw data (right) and spike waveform (left) of MSN, FSI, and TAN (black lines: mean; colored lines: mean ± *SD*). **(B)** Division of the classified [...] and CV in the naïve state. Filled circles are the units presented in **(A)**. **(C)** Mean population waveform of MSN (blue), FSI (red), and TAN (green).

The CV of the naïve and post-haloperidol states were calculated for single units during the same 600 s stable time frames by dividing the standard deviation of the ISIs by their mean. The MSNs significantly (paired *t*-test, *p* < 0.01) lowered their CVs following the haloperidol injection from a value of 2.53 ± 0.13 (mean ± s.e.m.) to a value of 1.95 ± 0.14 (mean ± s.e.m.) (**Figure 5A**). CV values of the FSI population did not change significantly (naïve 1.7 ± 0.2, post-haloperidol 1.9 ± 0.2, paired *t*-test *p* > 0.01) (**Figure 5B**). TANs significantly (paired *t*-test, *p* < 0.01) elevated their CV values from a baseline value of 0.90 ± 0.05 (mean ± s.e.m.) to a value of 0.99 ± 0.05 (mean ± s.e.m.) post-haloperidol injection (**Figure 5C**). The changes in the TANs' CV values following the haloperidol injection might be explained by the shortening of their refractory period, which was clearly seen in their ISI histograms (**Figure 5D**) and autocorrelation functions (**Figure 5E**).

A total of 131 TAN pairs were recorded simultaneously in the naïve and in the subsequent post-haloperidol injection state. In the naïve state many pairs exhibited significant (*p* < 0.01) positive correlations (62/131, 47.3%) and a smaller minority had significantly negative correlations (13/131, 9.9%). Following the haloperidol injection the asymmetry increased, in that the majority of the pairs displayed a significant positive correlation (85/131, 64.9%) and a smaller minority showed significantly negative correlations (6/131, 4.6%) (**Figure 6A**). The haloperidol injection changed the fraction of significantly correlated neurons in a significant manner (χ<sup>2</sup> test, *p* < 0.01). However, the mean pair-wise correlation, which was highly positive in both the naïve and post-haloperidol states, did not differ between the two states (**Figure 6B**). This correlation was calculated on the rate normalized correlation functions. Using the raw cross correlation functions yielded larger positive correlations, where the skew was due to the increased firing rate during the post-haloperidol state.
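The difference between raw and rate normalized correlation functions can be made concrete with a small sketch. The exact normalization used in the study is not specified beyond being scaled to the mean rate of each pair, so the version below is an assumption: the coincidence count in each lag bin is divided by the count expected from two independent trains with the same rates, so that a value of 1 marks chance level. The function name, bin width and lag range are hypothetical, and edge effects are ignored.

```python
import bisect

def normalized_cross_correlogram(spikes_a, spikes_b, duration,
                                 bin_width=0.01, max_lag=0.1):
    """Cross-correlogram of two sorted spike trains (times in seconds).

    Lags are defined as (time in B) - (time in A). Counts are divided by the
    number of coincidences per bin expected from independent trains with the
    same mean rates, so 1.0 corresponds to chance level.
    """
    n_bins = 2 * int(round(max_lag / bin_width))
    counts = [0] * n_bins
    for t in spikes_a:
        lo = bisect.bisect_left(spikes_b, t - max_lag)
        hi = bisect.bisect_left(spikes_b, t + max_lag)
        for s in spikes_b[lo:hi]:
            idx = int((s - t + max_lag) / bin_width)
            counts[min(idx, n_bins - 1)] += 1
    expected = len(spikes_a) * len(spikes_b) * bin_width / duration
    return [c / expected for c in counts]

# Toy pair: train B repeats train A with a 5 ms delay.
train_a = [i + 0.5 for i in range(100)]
train_b = [t + 0.005 for t in train_a]
ccg = normalized_cross_correlogram(train_a, train_b, duration=100.0)
peak_bin = ccg.index(max(ccg))  # bin 10 covers lags from 0 to 10 ms
```

Because the expected count scales with the product of the two rates, a uniform increase in firing rate raises the raw correlogram but leaves this normalized version unchanged, which is the reason for comparing states on the normalized functions.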

Oscillatory activity in the 7–9 Hz band could be observed in some of the LFP recordings in the naïve state. These oscillations, which are typical of high voltage spindles (HVS), occurred in 19/72 (26.4%) of the recordings and were typically short and intermittent. Following haloperidol injection the number of oscillatory LFPs increased significantly to 33/72 (45.8%) (χ<sup>2</sup> test, *p* < 0.01) and their duration and abundance increased (**Figures 7A,B**). The mean power of the LFP increased preferentially in the HVS frequency band and its harmonics (**Figure 7C**). This was accompanied by a smaller but significant peak in the HVS frequency and its first harmonic in the spiking activity of single neurons (**Figure 7D**). The number of neurons with a significant peak in the power spectrum increased from 7/87 (8.1%) in the naïve state to 21/87 (24.1%) during the post-haloperidol state (χ<sup>2</sup> test, *p* < 0.01).
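As an illustration of what a spectral peak in the HVS band means for a spike train, the sketch below bins a spike train and evaluates its power at a handful of frequencies with a direct discrete Fourier transform. This is a stand-in chosen so the example needs only the Python standard library; it is not the spectral estimator used in the study, and the criterion for a significant peak is not reproduced. The function names, bin width and toy spike train are assumptions.

```python
import cmath
import math

def binned_counts(spike_times, duration, bin_width):
    """Spike counts in consecutive bins of `bin_width` s over [0, duration)."""
    n = int(round(duration / bin_width))
    counts = [0] * n
    for t in spike_times:
        if 0 <= t < duration:
            counts[min(int(t / bin_width), n - 1)] += 1
    return counts

def power_at(signal, fs, freq):
    """Power of the mean-subtracted signal at one frequency (direct DFT)."""
    mean = sum(signal) / len(signal)
    acc = sum((x - mean) * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return abs(acc) ** 2 / len(signal)

# Toy unit firing once per cycle of an 8 Hz rhythm for 20 s.
fs = 100.0  # bins per second (10 ms bins)
spikes = [0.125 * i + 0.002 for i in range(160)]
counts = binned_counts(spikes, duration=20.0, bin_width=1.0 / fs)

freqs = list(range(2, 21))  # 2 to 20 Hz in 1 Hz steps
spectrum = {f: power_at(counts, fs, f) for f in freqs}
peak_freq = max(spectrum, key=spectrum.get)
in_hvs_band = 7 <= peak_freq <= 9
```

For this toy train the spectrum peaks at the 8 Hz base frequency, with a second, slightly smaller peak at the 16 Hz harmonic, which mirrors the base frequency and first harmonic pattern described for the recorded units.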

The activity of single neurons during HVS varied greatly depending on the neuronal sub-type. Both FSIs and MSNs fired in an oscillatory manner in a fixed phase with the LFP. Oscillations of the LFP in the HVS frequency appeared prior to the injection but were increased notably following the haloperidol injection (**Figures 8A,B**). This was accompanied by equivalent oscillations in the neuronal spiking activity of MSNs and FSIs (FSI example: **Figures 8C,D**). The LFP and spike train signals were highly coherent following the haloperidol injection in the HVS base frequency and its harmonics (**Figures 8E,F**). In sharp contrast, TANs did not display spike train oscillatory activity during oscillations in the LFP (**Figures 9A–F**) and thus were not coherent with the LFP signal. This property was evident when observing the overall neuronal population, which demonstrated a roughly equal coherence of MSNs and FSIs with their background LFP with only a small coherence for the TAN population (**Figure 9G**).

## **DISCUSSION**

In this study, we examined the effect of systemic haloperidol administration on the activity of neurons from different subpopulations within the striatum in freely moving rats. Our results demonstrate that neuronal sub-types within the striatum responded in a distinctly different fashion to the haloperidol application. MSNs and FSIs dramatically lowered their firing rate following haloperidol injection, and the firing pattern of the MSNs also changed, becoming more regular as a result of decreased bursting. In addition, the low frequency oscillations (7–9 Hz) that are typical of HVSs were evident following the haloperidol injection: the LFP signal became highly oscillatory and the firing pattern of both FSIs and MSNs became phase locked to these oscillations. TANs, on the other hand, showed a drug-induced increase in firing rate, accompanied by a more irregular firing pattern resulting from a shortening of the relative refractory period. The TANs displayed a non-oscillatory firing pattern during periods of HVS and maintained the typical widely correlated activity within the sub-population.

**FIGURE 6 |** [...] simultaneously. [...] correlation of the TAN pairs in the naïve (dark green) and post-haloperidol (light green) states. The narrow lines indicate one s.e.m. significance lines and the horizontal black dashed line indicates the normalized unity line. The scale is normalized to the mean rate of each pair.

The effects of dopamine modulation on both behavior and its underlying neuronal mechanisms have been studied extensively in the cortico-basal ganglia loop. Multiple techniques for dopamine modulation have been used in the study of movement and psychiatric disorders such as Parkinson's disease, Huntington's disease and Tourette syndrome, as well as in the study of normal processes such as plasticity, learning and reward. These techniques differ in such key aspects as (1) spread, (2) specificity, and (3) duration of the effect, which makes comparison across results complex. Dopamine denervation techniques range from widespread, non-specific chronic depletion of dopamine neurons, such as those observed following 6-OHDA injections or with genetic knockout models, to acute, focal, and specific antagonisms caused by localized dopamine antagonist microinjections or focal optogenetic modulations. In the current study we used systemic acute injections of the D2 antagonist haloperidol. In this case, spatially, the effect was widespread in the brain and did not only influence the striatal neurons. However, temporally, it had a limited duration, with minor impact on long term changes in the network function and structure. By contrast, chronic treatment with haloperidol has been shown to lead to significant plastic changes (Benes et al., 1985) and behavioral symptoms resembling tardive dyskinesia (Kinon et al., 1984). In addition, the dosage of haloperidol used in this study (0.05 mg) was lower by an order of magnitude than some recent studies (Burkhardt et al., 2007; Yang et al., 2013) that have examined its effect on the basal ganglia. The low dosage elicited the same behavioral symptoms to a large extent, while avoiding permanent damage such as potential apoptosis in the striatum and SNr (Mitchell et al., 2002) and resembling the dosage used in human patients.

Despite the differences between techniques, dopamine denervation mostly yields similar behavioral and neuronal results. Behaviorally, decreased dopamine levels have been shown to induce akinesia and parkinsonian-like symptoms such as rigidity. These outcomes are common to studies using localized 6-OHDA injection (Deumens et al., 2002), systemic injections of dopamine antagonists (Burkhardt et al., 2007) and tetrodotoxin (TTX) injections to the medial forebrain bundle (MFB) (Prosperetti et al., 2013). Neurophysiologically, striatal activity has been widely studied. However, the classification of striatal neurons into different populations has only recently become standardized, leaving multiple early studies hard to interpret as their data contain a myriad of cell types.

**FIGURE 7 |** [...] post-haloperidol injection reflecting the HVSs. **(C,D)** Mean power of the **(C)** LFP recording (*n* = 72) and **(D)** spike trains (*n* = 87) during the naïve state (black) and post-haloperidol state (gray). The narrow lines indicate one s.e.m.

**FIGURE 8 | Oscillatory neuronal activity during HVSs.** The oscillatory activity is shown for a single recording of an FSI which spikes in an oscillatory manner during HVSs: **(A,B)** the LFP, **(C,D)** the spiking activity of a neuron sorted from the same electrode and **(E,F)** the coherence between the LFP and the spiking activity. The **(A,C)** spectrogram and **(E)** coherogram illustrate the changes in the oscillatory activity of the signals over 10 s windows throughout the session. The solid white line indicates the haloperidol injection time and the dotted white arrow indicates the stable post-haloperidol segment. The mean **(B,D)** power spectrum and **(F)** coherence are averaged over the stable post-haloperidol segment.

**FIGURE 9 |** [...] LFP and the spiking activity. The **(A,C)** spectrogram and **(E)** coherogram illustrate the changes in the oscillatory activity of the signals over 10 s [...] coherence between the LFP signal and the spiking activity of MSNs (blue), FSIs (red), and TANs (green) during the post-haloperidol segment.

TANs in the dorsal striatum express both D2 and D5 receptors and receive dopaminergic input primarily from the SNc (Levey et al., 1993). *In vivo* intracellular recordings have shown that dopamine increase, via stimulation of the SNc, induces inhibition of the firing rate of TANs (Reynolds et al., 2004). Tonic dopamine depletion in 6-OHDA rats, on the other hand, increases their baseline discharge rate (Kish et al., 1999; Sanchez et al., 2011). Similarly, an acute block of the MFB by TTX also increases the firing rate (Prosperetti et al., 2013). Interestingly, while apomorphine likewise suppresses the firing rates of TANs (Fujita et al., 2013), D-Amphetamine, which increases the release of dopamine and inhibits its re-uptake, leads to an increased firing rate of TANs (Kish et al., 1999). In the non-human primate, no effect on spontaneous discharge rate of TANs by localized microinjections of either a D2 antagonist (sulpiride) or a D1 antagonist (SCH23390) (Watanabe and Kimura, 1998), or by unilateral dopamine depletion by MPTP infusion to the caudate-putamen complex (Aosaki et al., 1994a), was observed in the striatum. Generally, the increased rate of TANs following haloperidol injection in our study is in line with most of the literature of dopamine modulation.

There are few studies of FSI modulation by dopamine given the nascent state of the field. However, the decreased firing rate of the FSIs following haloperidol application is consistent with *in vivo* studies reporting a decreased firing rate post eticlopride, a D2 receptor antagonist (Wiltschko et al., 2010). The effect of D2 on FSIs has been hypothesized to result from modulation of their GABAergic input (Bracci et al., 2002).

Unlike the general consistency between reports on TAN and FSI activity following dopamine modulations, previous studies of MSN activity paint a diverse and complex picture. The range of expression of dopamine receptors on MSNs is diverse: D1-like receptors have classically been assigned to the direct striato-nigral pathway and the D2-like receptors to the indirect striato-pallidal pathway, and their activation increases and decreases the firing rate of MSNs, respectively (Albin et al., 1989). This might partially explain the heterogeneous results on the activity of MSNs following dopamine modulation in studies which do not break down MSNs into different subpopulations or use specific receptor modulation (Prosperetti et al., 2013). Chronically denervated animals tend to show an overall increase in firing rate (Kish et al., 1999; Chen et al., 2001), whereas acute denervation leads to both increases and decreases in firing rate (Wiltschko et al., 2010; Prosperetti et al., 2013). Our data contrast with these reports, since most of our MSNs significantly lowered their firing rate following haloperidol administration. This is a surprising result in that D2 receptor inactivation is expected to increase the MSN firing rate (Albin et al., 1989; DeLong, 1990); the observed decrease may result from indirect network interactions.

The relationship between FSIs and MSNs firing following haloperidol injection did not follow their presumed interaction. A decrease in the firing rate of FSIs is expected to elevate the MSN firing rate, due to the strong inhibition that FSIs exert on MSNs (Koos and Tepper, 1999) whereas in our study both FSIs and MSNs lowered their firing rate. This discrepancy has been reported elsewhere, but the mechanism is still under debate (Lansink et al., 2010; Wiltschko et al., 2010). The most common suppositions are that in the dopamine depleted state, the FSIs reduce their sensitivity to cortical stimulation or the cortico-striatal input might have a larger influence post dopamine depletion (Kimura, 1990; Sharott et al., 2009; Berke, 2011).

HVSs appear in awake immobile rats and are characterized by a typical frequency of 5–13 Hz with a typical spike and wave shape (Klingberg and Pickenhain, 1968). HVSs have been studied extensively in multiple brain areas and have been observed in different parts of the cortico-basal ganglia loop. The origin of HVSs is still under debate and thalamic as well as cortical influences are thought to play an important role (Berke et al., 2004). Neurons in the striatum (presumably MSNs) were shown to fire phase-locked to HVSs, mostly within the spike component (Buzsaki et al., 1990). HVSs were shown to affect neuronal activity in the basal ganglia input (striatal MSNs) and output (SNr), which are phase-locked to both the cortical and basal ganglia local field potentials (Dejean et al., 2007). Dopamine modulates HVSs, and recent studies have demonstrated a significant increase in these oscillations following dopamine depletion and, specifically, in the LFP oscillations in the cortex and GP following systemic application of D2 antagonists (Yang et al., 2013). Under the dopamine depletion state in the 6-OHDA rat, the duration and frequency of occurrence of HVSs increased significantly (Dejean et al., 2008). This increase was also apparent following the injection of dopamine antagonists directly to the striatum (Deransart et al., 2000). Our results extend previously reported work by revealing that the effect of HVSs is not uniform across the different neuronal subpopulations. Whereas haloperidol increases the abundance of HVSs, which is consistent with the literature regarding dopaminergic reduction in the striatum, and triggers FSIs and MSNs to show phase-locked activity to the LFP, TANs do not reveal this phenomenon and remain largely non-oscillatory. This contrasts with previous studies in the parkinsonian primate, which have found oscillatory activity that was coherent with basal ganglia oscillations (Raz et al., 1996, 2001). This might be explained by the intrinsic ability of TANs to generate spikes and/or by stronger cortical or thalamocortical influences on MSNs and FSIs.

Haloperidol, given at amounts comparable to the dosage administered to human patients suffering from different movement and psychiatric disorders, produces profound behavioral and neuronal effects. Despite the clear behavioral effect, the underlying neuronal mechanism appears complex and diverse even within only the striatum itself. Different sub-populations of neurons are affected differentially and the effects include changes in neuronal rate, pattern, and interactions. This emphasizes the need to generate fine grain clustering of neurons to elucidate the mechanisms of action of pharmaceuticals on the brain as a whole and specifically on the striatum.

#### **ACKNOWLEDGMENTS**

We thank M. Dror for his help with the animals, K. Belelovsky and Y. Loewenstern for help with the histology. This work was supported by an Israel Science Foundation (ISF) grant (372/09), a National Institute for Psychobiology grant and a Tourette Syndrome Association (TSA) grant. D. H. Zeef was supported by a NENS Stipends grant.

#### **REFERENCES**


Yang, C., Ge, S. N., Zhang, J. R., Chen, L., Yan, Z. Q., Heng, L. J., et al. (2013). Systemic blockade of dopamine D2-like receptors increases high-voltage spindles in the globus pallidus and motor cortex of freely moving rats. *PLoS ONE* 8:e64637. doi: 10.1371/journal.pone.0064637

Yarom, O., and Cohen, D. (2011). Putative cholinergic interneurons in the ventral and dorsal regions of the striatum have distinct roles in a two choice alternative association task. *Front. Syst. Neurosci.* 5:36. doi: 10.3389/fnsys.2011.00036

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 September 2013; accepted: 25 November 2013; published online: 16 December 2013.*

*Citation: Yael D, Zeef DH, Sand D, Moran A, Katz DB, Cohen D, Temel Y and Bar-Gad I (2013) Haloperidol-induced changes in neuronal activity in the striatum of the freely moving rat. Front. Syst. Neurosci. 7:110. doi: 10.3389/fnsys.2013.00110 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Yael, Zeef, Sand, Moran, Katz, Cohen, Temel and Bar-Gad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Decomposition of abnormal free locomotor behavior in a rat model of Parkinson's disease

## *Benjamin Grieb<sup>1,2</sup>\*, Constantin von Nicolai<sup>1,3</sup>, Gerhard Engler<sup>1</sup>, Andrew Sharott<sup>1,4</sup>, Ismini Papageorgiou<sup>5</sup>, Wolfgang Hamel<sup>6</sup>, Andreas K. Engel<sup>1†</sup> and Christian K. Moll<sup>1†</sup>*

*<sup>1</sup> Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, University of Hamburg, Hamburg, Germany*

*<sup>2</sup> Department of General Psychiatry, Center for Psychosocial Medicine, University of Heidelberg, Heidelberg, Germany*

*<sup>3</sup> Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany*

*<sup>4</sup> Medical Research Council, Anatomical Neuropharmacology Unit, Department of Pharmacology, University of Oxford, Oxford, UK*

*<sup>5</sup> Division of General Neurophysiology, Institute of Physiology and Pathophysiology, University of Heidelberg, Heidelberg, Germany*

*<sup>6</sup> Department of Neurosurgery, University Medical Center Hamburg-Eppendorf, University of Hamburg, Hamburg, Germany*

#### *Edited by:*

*Hagai Bergman, The Hebrew University, Israel*

#### *Reviewed by:*

*M. Gustavo Murer, Universidad de Buenos Aires, Argentina*

*Nicola B. Mercuri, University of Rome, Italy*

#### *\*Correspondence:*

*Benjamin Grieb, Department of General Psychiatry, Center for Psychosocial Medicine, University of Heidelberg, Voßstrasse 2, 69115 Heidelberg, Germany e-mail: benjamin.grieb@med.uni-heidelberg.de*

*†These authors have contributed equally to this work.*

Poverty of spontaneous movement, slowed execution and reduced amplitudes of movement (akinesia, brady- and hypokinesia) are cardinal motor manifestations of Parkinson's disease that can be modeled in experimental animals by brain lesions affecting midbrain dopaminergic neurons. Most behavioral investigations in experimental parkinsonism have employed short-term observation windows to assess motor impairments. We postulated that an analysis of longer-term free exploratory behavior could provide further insights into the complex fine structure of altered locomotor activity in parkinsonian animals. To this end, we video-monitored 23 hours of free locomotor behavior and extracted several behavioral measures before and after the expression of a severe parkinsonian phenotype following bilateral 6-hydroxydopamine (6-OHDA) lesions of the rat dopaminergic substantia nigra. Unbiased stereological cell counting verified the degree of midbrain tyrosine hydroxylase positive cell loss in the substantia nigra and ventral tegmental area. In line with previous reports, overall covered distance and maximal motion speed of lesioned animals were found to be significantly reduced compared to controls. Before lesion surgery, exploratory rat behavior exhibited a bimodal distribution of maximal speed values obtained for single movement episodes, corresponding to a "first" and "second gear" of motion. 6-OHDA injections significantly reduced the incidence of second gear motion episodes and also resulted in an abnormal prolongation of these fast motion events. Likewise, the spatial spread of such episodes was increased in 6-OHDA rats. The curvature of motion tracks was increased in both lesioned and control animals. We conclude that the discrimination of distinct modes of motion by statistical decomposition of longer-term spontaneous locomotion provides useful insights into the fine structure of fluctuating motor functions in a rat analog of Parkinson's disease.

**Keywords: 6-OHDA lesions, stereology, spontaneous activity, Parkinson disease, video monitoring**

## **INTRODUCTION**

Neurotoxin-induced degeneration of nigral dopamine neurons in experimental animals results in motor abnormalities relevant to motor symptoms of Parkinson's disease (PD; Cenci et al., 2002). One strategy to deplete midbrain dopaminergic neurons in rats is to infuse the neurotoxin 6-hydroxydopamine (6-OHDA) directly into the substantia nigra pars compacta (SNc; Schwarting and Huston, 1996a,b), which can also cause moderate cell death in the neighboring ventral tegmental area (VTA). In the prototypic toxin-induced rat model of PD, unilateral intracerebral 6-OHDA injections lead to the expression of a strictly lateralized hemiparkinsonian phenotype, the behavioral sequelae of which have been described in great detail. A wide variety of different behavioral tests are used to examine motor changes associated with the asymmetrical depletion of the nigrostriatal dopaminergic system, e.g., skilled motor tasks (Mokrý et al., 1995; Truong et al., 2006), footprint analysis (Metz et al., 2005), treadmill running (Brazhnik et al., 2012), or drug-induced rotation (Ungerstedt, 1971; Kelly, 1975). Compared to the unilateral 6-OHDA model, bilaterally lesioned rats have been used less commonly, although they display a far more severe parkinsonian phenotype. In this respect, the slowness and scarcity of movement observed in the bilateral 6-OHDA rat model resembles more closely the marked expression of PD-like symptoms in monkeys treated systemically with 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP; Bergman et al., 1990) and cardinal motor manifestations of human PD patients in advanced disease stages. The development of pronounced motor impairment in rats with severe bilateral 6-OHDA lesions is, however, often accompanied by aphagia, adipsia and abulia (Schallert et al., 1978; Sakai and Gash, 1994; Cass et al., 2005; Ferro et al., 2005).

Rat equivalents of akinesia are commonly assessed in an "open-field" environment. Most often, rat motion (including scanning, walking or running, as well as rearing) is detected by the consecutive breaking of light beams arranged in an array around the arena (Cass et al., 2005; Ferro et al., 2005; Belujon et al., 2007). Other methods to quantify locomotor capacities are wheel (Schallert et al., 1978) or treadmill running (Avila et al., 2010). These standard measurements usually assess a short period of 10–60 min of locomotor activity (Cass et al., 2005; Ferro et al., 2005; Belujon et al., 2007), assuming a stable motor phenotype over time. However, these methods may underestimate the complexity of exploratory behavior, in particular long-term motor fluctuations.

Recently, a novel analytical approach to video-based tracking data utilized the statistical discrimination of rat motor behavior on the basis of speeds (Drai et al., 2000; Drai and Golani, 2001). This form of data analysis showed that naïve rats, but also naïve mice (Drai et al., 2001), use distinctly different modes of motion to explore their environment. The current study aimed to assess the impact of dopamine depletion on these naturally occurring behavioral patterns. We hypothesized that expression of a severe parkinsonian phenotype would not only reduce overall motion speed, as could be assessed with other standard quantification methods, but would also alter the fine spatio-temporal structure of exploratory locomotion modes. To test this, we adapted the analysis approach of Drai and colleagues to video-based tracking data from 23 hours of continuous and spontaneous locomotion of bilaterally 6-OHDA lesioned and control rats.

## **MATERIALS AND METHODS**

#### **ANIMALS**

Animal experiments were approved by the local government authorities of Hamburg and carried out in accordance with the European Council Directive 86/609/EEC. All experiments were performed on male Brown Norway rats (*Rattus norvegicus*; Charles River Laboratories, Sulzfeld, Germany). All efforts were made to minimize suffering. The bilateral 6-OHDA lesion model is known to be associated with high mortality due to aphagia/adipsia and subsequent loss of body weight (Ferro et al., 2005). Careful clinical inspection and weighing of animals was performed on a daily basis. In general, we offered a soft and moist rat chow and 10%-glucose solution in addition to standard chow and water *ad libitum*. To sustain aphagic and adipsic PD rats, we administered a liquid and high caloric nutrition for rodents (Altromin, Lage, Germany) by manual needle feeding. In total, the dropout rate was 5/14 PD rats. Dropouts included perioperative death (*n* = 2 PD) and loss of >20% body weight without stabilization of weight loss within 14 days after surgery (*n* = 3 PD), which led to euthanasia of these animals by decapitation under deep anesthesia (i.p.-injection of ketamine 100 mg/kg, xylazine 6 mg/kg).

#### **EXPERIMENTAL DESIGN**

We randomly assigned 14 rats to a PD group receiving bilateral 6-OHDA injections into the SNc and eight rats to a control group receiving bilateral vehicle injections (preoperative weight: 384 ± 32 g, mean ± *SD*). Video tracking of 23 hours of spontaneous locomotor activity was performed prior to surgery. Due to the labor intensity of sustaining PD rats we ran the experiment in two phases. Within the first subset of 10 rats (*n* = 8 PD; *n* = 2 controls) the dropout rate was 4/8 PD rats. The second subset of 12 rats (*n* = 6 PD; *n* = 6 controls) exhibited a dropout rate of 1/6 PD rats. Postoperative locomotor activity was assessed after a recovery period of 12 ± 2 days in the first subset and 28 ± 2 days in the second subset. After postoperative video monitoring, all rats were subsequently used in a separate study. We sacrificed rats after induction of deep anesthesia (ketamine/xylazine) ∼14 weeks after lesioning by transcardial perfusion with saline. A midbrain tissue block was immersion-fixed in 4%-PFA-solution (paraformaldehyde in 0.1 M phosphate buffered saline, Sigma-Aldrich) for subsequent histological TH-staining and stereological counting of tyrosine hydroxylase (TH)-positive SNc and VTA neurons.

#### **BILATERAL 6-HYDROXYDOPAMINE LESIONS**

Stereotactic injections were performed under general anesthesia induced with isoflurane (Baxter Germany GmbH, Unterschleißheim, Germany) and maintained with i.p.-injections of ketamine (65 mg/kg, Dr. E. Gräub AG, Bern, Switzerland) and xylazine (3 mg/kg, Bayer Health Care, Leverkusen, Germany). Thirty minutes prior to 6-OHDA or vehicle injections, rats received a bolus i.p.-injection of desipramine (25 mg/kg, Sigma-Aldrich, Munich, Germany) to minimize uptake of 6-OHDA in noradrenergic midbrain neurons (Schwarting and Huston, 1996b). Cardiopulmonary protection was assured by an initial bolus injection of atropine (0.25 mg/kg, B. Braun Melsungen AG, Melsungen, Germany). Body temperature was monitored during surgery with a rectal probe and hypothermia was prevented with an adjustable heating pad (FST, Heidelberg, Germany). In addition, the eyes were covered with dexpanthenol cream to prevent exsiccation. Animals were mounted in a stereotactic frame (David Kopf Instruments, Tujunga, USA). To target the SNc, a Hamilton microliter-syringe (FST) was lowered through a burr hole placed at +4 mm AP and ±2.2 mm ML using the interaural line as reference (Paxinos and Watson, 2005). The syringe was slowly lowered to the target depth at −8 mm relative to the dura. Five microliters of neurotoxin (3 µg/µl 6-OHDA hydrochloride free base in 0.2% ascorbic acid solution, stored on ice; Sigma-Aldrich) or vehicle (aqua injectabilia, 0.2% ascorbic acid solution; Sigma-Aldrich) were slowly infused at a rate of 0.5 µl/min and the syringe was left in place for 2 min to allow for complete absorption of the toxin. The burr hole was closed with bone wax and the procedure was repeated in the contralateral hemisphere. Rats received s.c.-injections of metamizole (100 mg/kg, Medistar, Holzwickede, Germany) for postoperative analgesia following surgery and on behavioral signs of pain distress.

#### **STEREOLOGY**

To count TH-positive dopaminergic cells in PD rats and controls we performed free-floating TH-immunohistochemistry on serial 40 µm coronal sections of the midbrain. **Figure 1A** displays representative examples of photomicrographs depicting TH-stained midbrain sections from a vehicle and 6-OHDA injected rat. For technical reasons we could only obtain stereology in a subset of 8/9 PD rats and 5/8 controls. PFA-fixed tissue blocks containing SNc and VTA were transferred to 30%-sucrose solution and kept at 4 °C for 24 hours. Sections were cut with a freezing-microtome (Leica Instruments, Wetzlar, Germany) and stained in free-floating fashion for TH. Briefly, sections were washed in phosphate buffer (0.01 M PBS, Sigma-Aldrich), incubated with 3%-H2O2-solution for 3 min to block endogenous peroxidase activity and incubated with 2% normal horse serum (with 0.3% Triton X-100 added, Sigma-Aldrich) for 30 min. Sections were then incubated overnight at 4 °C with the primary TH-antibody (1:250, monoclonal mouse antibody, Novocastra reagents, Leica Microsystems, Wetzlar, Germany), followed by the biotinylated secondary antibody (1:400, Novocastra reagents) for 30 min, and afterwards incubated with avidin and biotinylated horseradish peroxidase (ABC kit, Novocastra reagents) for another 30 min. TH was visualized by adding peroxidase substrate (0.02% DAB reagent in 0.003% H2O2 in PBS) for 2–10 min. Finally, sections were mounted on glass slides, dehydrated in an ascending alcohol series and fixed under a coverslip with Roti Histokitt II (Carl Roth, Karlsruhe, Germany).

**FIGURE 1 |** … and VTA. **(C)** Estimated absolute population of cells in the SNc and VTA for vehicle (*n* = 5) and 6-OHDA (*n* = 8) injected rats. Double asterisks denote *P* ≤ 0.001 (uncorrected). TH, tyrosine-hydroxylase; 6-OHDA, 6-hydroxydopamine; VTA, ventral tegmental area; SNc, substantia nigra pars compacta; SNr, substantia nigra pars reticulata.

Unbiased stereological counting of TH-positive cells in the SNc and VTA was performed using the Stereoinvestigator Software (Version 10.0, MicroBrightField Inc., Williston, Vermont, USA). The hardware consisted of an Olympus Bx61 brightfield microscope (Olympus Deutschland GmbH, Hamburg, Germany) equipped with a Microfire TM A/R camera (Optronics, California, USA) and an x-y-z galvano table (Carl Zeiss AG, Jena, Germany). The optical fractionator probe (West et al., 1991; West, 2002) was applied on series of 40 µm thick coronal sections. To assure sampling of comparable structural parts, the stereological analysis was centered on an independent anatomical hallmark (the rootlets of the oculomotor cranial nerve). Overall we sampled 10 sections (sampling rate of 1.2 ± 0.3 sections, mean ± *SD*) spanning ∼480 µm in the cranio-caudal dimension. However, due to limited availability of continuous TH-sections containing SNc and VTA, three rats were investigated using 7, 8, or 9 sections, respectively. Using a 2.5× magnification with a numerical aperture of 0.075 we defined three anatomical regions of interest (ROI), i.e., the SNc, the VTA and the substantia nigra pars reticulata (SNr) (Paxinos and Watson, 2005). For examples of anatomical 3-D reconstructions of ROIs and counted cells, see **Figure 1B**. Cell counting was performed using a 40× Plan-Neofluar dry type objective lens with a numerical aperture of 0.75 (Carl Zeiss AG) within the SNc and VTA-ROIs. The counting frame (50 × 50 µm) with a dissector height of 20 µm was applied with a uniform random sampling grid of 150 × 150 µm (optical dissector volume of 50,000 µm³, sampling grid area of 22,500 µm², Gundersen, 1986). Schaeffer's estimated coefficient of error ranged between 0.05 and 0.12 for the VTA of both controls and PD animals, as well as the SNc of controls. For SNc-ROIs of PD animals it ranged between 0.3 and 0.9, reflecting the scarcity of TH-positive cells in this area (Gundersen, 1986; West, 2002). The total volume of the sampled SNc and VTA parts was stereologically estimated using the Cavalieri method (Gundersen and Jensen, 1987). Our histological regime resulted in robust sampling of >200 neurons per VTA and SNc in vehicle injected controls, enabling us to express TH-positive cell numbers in absolute numbers. Cell counts were calculated separately for each ROI and hemisphere, respectively, and combined across hemispheres (*n* = 6 hemispheres per group).

To assess the symmetry of depletion we calculated the laterality index (*LI*) for each rat (Seghier, 2008), *LI* = *f* × (*ELH* − *ERH*)/(*ELH* + *ERH*), where *ELH* and *ERH* are the combined VTA and SNc estimates of TH-positive cells for the left and right hemisphere, respectively, and *f* is a scaling factor set to 1, so that the *LI* is bounded between −1 and 1.
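The laterality index defined above can be illustrated with a short Python sketch. It is not the authors' MATLAB routine; the function name `laterality_index` and the cell estimates in the example are hypothetical and chosen only to show the bounds of the index.

```python
def laterality_index(e_lh: float, e_rh: float, f: float = 1.0) -> float:
    """Return LI = f * (E_LH - E_RH) / (E_LH + E_RH)."""
    total = e_lh + e_rh
    if total == 0:
        raise ValueError("both hemisphere estimates are zero; LI is undefined")
    return f * (e_lh - e_rh) / total


# Hypothetical cell estimates, not data from the study.
symmetric = laterality_index(3000.0, 3000.0)      # equal counts in both hemispheres
left_dominant = laterality_index(4000.0, 1000.0)  # more surviving cells on the left
print(symmetric, left_dominant)
```

With *f* = 1, equal estimates give an index of 0, and the index approaches +1 or −1 as the surviving cells are confined to the left or right hemisphere.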

#### **BEHAVIORAL MONITORING**

Locomotion of 6-OHDA or vehicle injected rats was investigated via continuous video monitoring of 23 hours of spontaneous behavior in an open field-like environment. We used an infrared video-based tracking system (VideoMot 2.0, TSE Systems, Bad Homburg, Germany) to record movement paths as time series of x-y-positions at a sampling rate of 12.8 Hz. The recording arena measured 70 × 100 cm with 40 cm high walls (i.e., 320 × 435 pixels after frame grabbing and offline analysis) and was equipped with three food pellet feeders and one water outlet positioned in the corners. The ground was covered with bedding. The arena size was chosen to be ∼3 times larger than the size of the rat's home cage to allow for generation of more naturalistic movement patterns including running. The arena was placed in a custom-made recording box lined with foamed plastic to ensure light and acoustic insulation. Recordings took place under constant darkness.

#### **BEHAVIORAL DATA ANALYSIS**

To analyze spontaneous long-term behavior we modified a data processing and analysis approach developed for open-field locomotion in rodents (Drai et al., 2000; Drai and Golani, 2001). All routines were written in MATLAB (The Mathworks, Natick, USA). The analytical approach is based on the idea of calculating single speed values (cm/s) from x-y-position time series using a sliding window, separating rest vs. motion episodes and characterizing motion episodes by the maximal speed reached within each episode rather than the average speed of a given episode. Our modified analysis consisted of five separate analysis steps (see **Figure 2**). First, raw x-y-position time series data obtained for 23 hours at a sampling rate of 12.8 Hz (∼1,000,000 single data points) contained large amounts of artifacts (see **Figure 2A**). Typically, artifacts resulted from spurious "jumping" within a continuous x-y-position time series to distant places and back. Frequent sources of artifacts lay outside the actual recording arena and could therefore be removed through spatial outlier rejection. However, artifacts also appeared within the arena, generally jumping toward the arena's corners where animals spent most of their resting time (**Figure 2B**). All places on the pixel level that were "visited" twice during the recording time were discarded, eliminating artifacts associated with resting spots (**Figure 2C**). To further de-noise the recordings we applied a low-pass threshold of 80 cm/s (**Figure 2D**), as preliminary data screening showed that no rat reached speed levels >70 cm/s inside the recording arena (data not shown). Single speed values were calculated on position time series with a moving window of 0.3 s (Drai et al., 2000; Drai and Golani, 2001). Separation of resting vs. motion episodes was done by high-pass thresholding speed values at a noise level of 4 cm/s (**Figure 2E**). This threshold was based on previous work of Drai et al. (2000) and adapted to the distribution of single speeds in our data set with a peak at 4 cm/s (data not shown). The remaining motion episodes were used for further analysis (**Figure 2F**). To compensate for erroneous separation of movement episodes resulting from artifact correction we interpolated x-y-coordinates of single missing values.
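The speed calculation and the separation of rest from motion episodes can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code. The sampling rate, window length and 4 cm/s threshold are taken from the text, whereas the function names, the choice of a trailing window that sums the path length, and the synthetic track are assumptions made for the example.

```python
import math

FS_HZ = 12.8      # sampling rate reported in the text
WINDOW_S = 0.3    # moving-window length reported in the text
REST_CM_S = 4.0   # rest/motion threshold reported in the text


def window_speeds(xy, fs=FS_HZ, window_s=WINDOW_S):
    """Speed (cm/s) per window: path length inside the window divided by its duration."""
    n = max(1, round(window_s * fs))  # window length in sampling steps (assumed rounding)
    steps = [math.dist(xy[i], xy[i + 1]) for i in range(len(xy) - 1)]
    return [sum(steps[i:i + n]) * fs / n for i in range(len(steps) - n + 1)]


def motion_episodes(speeds, threshold=REST_CM_S):
    """Return (start, stop, max_speed) for every run of speeds above the threshold."""
    episodes, start = [], None
    for i, s in enumerate(speeds):
        if s > threshold and start is None:
            start = i                                   # a motion episode begins
        elif s <= threshold and start is not None:
            episodes.append((start, i, max(speeds[start:i])))
            start = None                                # back to rest
    if start is not None:                               # episode still open at the end
        episodes.append((start, len(speeds), max(speeds[start:])))
    return episodes


# Synthetic track in cm: rest, 20 steps of 1 cm along x, rest again.
track = [(0, 0)] * 10 + [(i, 0) for i in range(1, 21)] + [(20, 0)] * 10
episodes = motion_episodes(window_speeds(track))
print(episodes)
```

Each episode is summarized by its maximal speed, which is the quantity whose logarithm is later used to separate the gears of motion.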

#### **BEHAVIORAL ENDPOINTS**

Smoothed histograms of log transformed maximal speed values, termed "log max-SD," were used to identify different modes of motion in control and PD rats (Drai et al., 2000; Drai and Golani, 2001). A Gaussian mixture model was fitted to the empirical data to obtain information about the localization of different "gears" of motion within the distribution of speeds. Gears were separated at 10 cm/s, a value consistently derived from preoperative recordings. Descriptive locomotion parameters were calculated for slow (i.e., first gear) and fast (i.e., second gear) episodes. However, as the dissection of second gear episodes allowed a view on full-blown motion, we concentrated our analysis on such episodes.
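To make the mixture-model step concrete, the following Python sketch fits two Gaussians to log-transformed maximal speeds by expectation-maximisation. It is a generic textbook procedure and not necessarily the fitting routine used in the study. The function name `fit_two_gaussians`, the initialisation from the two halves of the sorted data, and the synthetic sample (including its spreads of 0.08 and 0.10) are assumptions for illustration.

```python
import math
import random


def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM; return [weight, mean, variance] pairs."""
    data = sorted(values)
    half = len(data) // 2
    params = []
    for part in (data[:half], data[half:]):      # initialise from the two halves
        mu = sum(part) / len(part)
        var = sum((v - mu) ** 2 for v in part) / len(part) or 1e-6
        params.append([0.5, mu, var])
    for _ in range(n_iter):
        # E-step: responsibility of the first component for every value.
        resp = []
        for v in data:
            dens = [w * math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
                    for w, mu, var in params]
            resp.append(dens[0] / (sum(dens) or 1e-300))
        # M-step: update weight, mean and variance of both components.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - p for p in resp]
            nk = max(sum(r), 1e-12)
            mu = sum(ri * v for ri, v in zip(r, data)) / nk
            var = sum(ri * (v - mu) ** 2 for ri, v in zip(r, data)) / nk
            params[k] = [nk / len(data), mu, max(var, 1e-6)]
    return sorted(params, key=lambda p: p[1])    # slower gear first


rng = random.Random(0)
# Synthetic log10 maximal speeds; centres and spreads are illustrative values only.
sample = ([rng.gauss(0.72, 0.08) for _ in range(300)]
          + [rng.gauss(1.24, 0.10) for _ in range(300)])
first_gear, second_gear = fit_two_gaussians(sample)
print("gear centres in cm/s:", 10 ** first_gear[1], 10 ** second_gear[1])
```

The fitted means are on the log10 scale, so raising 10 to the mean converts a gear centre back to cm/s.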

The six different behavioral endpoints were defined as: (1) incidence of fast speed movement episodes, which is the percentage of second gear episodes relative to the number of all motion episodes; (2) absolute distance, calculated as the cumulative sum of the overall covered distance for all motion episodes; (3) average maximal movement speed, corresponding to the mean of all maximal speeds of second gear episodes; (4) dwell time, i.e., the average time it took the rats to execute a motion episode; (5) spatial spread, i.e., the mean distance covered within a motion episode; (6) curvature of movement tracks, i.e., the ratio of the detected real motion track distance and the distance of the virtual line connecting two coordinate pairs directly during temporal windows of 0.5 s. Curvatures were averaged across all windows of a given motion episode.
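Two of these endpoints, the incidence of second gear episodes and the curvature of movement tracks, can be sketched in Python as shown below. The 10 cm/s gear split, the 12.8 Hz sampling rate and the 0.5 s window come from the text. The function names, the use of consecutive non-overlapping windows, and the handling of windows without net displacement are assumptions, and the sketch returns a plain ratio because the text does not state how the percentage values reported for curvature were derived.

```python
import math


def second_gear_incidence(max_speeds, gear_split=10.0):
    """Percentage of motion episodes whose maximal speed exceeds the gear split (cm/s)."""
    fast = sum(1 for s in max_speeds if s > gear_split)
    return 100.0 * fast / len(max_speeds)


def curvature_ratio(xy, fs=12.8, window_s=0.5):
    """Mean ratio of travelled path length to straight-line distance per window."""
    n = max(1, round(window_s * fs))            # sampling steps per window
    ratios = []
    for start in range(0, len(xy) - n, n):      # consecutive, non-overlapping windows
        seg = xy[start:start + n + 1]
        path = sum(math.dist(a, b) for a, b in zip(seg, seg[1:]))
        chord = math.dist(seg[0], seg[-1])
        if chord > 0:                           # skip windows without net displacement
            ratios.append(path / chord)
    return sum(ratios) / len(ratios) if ratios else float("nan")


# Hypothetical episode data, not taken from the study.
straight = [(float(i), 0.0) for i in range(13)]
corner = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(second_gear_incidence([5.0, 6.0, 12.0, 20.0]))
print(curvature_ratio(straight), curvature_ratio(corner))
```

A perfectly straight track yields a ratio of 1, and any detour within a window increases it.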

Our experimental paradigm allowed for *post-hoc* splitting of the experimental groups into rats monitored after 12 ± 2 days and 28 ± 2 days of recovery. We separately analyzed the correlation between the six behavioral endpoints and the estimated absolute population of TH-positive neurons in the SNc and VTA, the corresponding LI of cell depletion and the observed weight reduction for the 6-OHDA and vehicle injected control group. We utilized bootstrap regressions, which do not rely on a normal distribution of data.
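The idea behind a bootstrap regression can be illustrated with a generic percentile bootstrap of an ordinary least-squares slope. The sketch below is not the implementation of the toolbox used in the study. The function names, the percentile confidence interval, the fixed seed and the example data are all assumptions.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x, or None if all x are identical."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return None
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def bootstrap_slope_ci(x, y, n_boot=5000, alpha=0.05, seed=0):
    """Return the observed slope and a percentile bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample pairs with replacement
        s = ols_slope([x[i] for i in idx], [y[i] for i in idx])
        if s is not None:                            # discard degenerate resamples
            slopes.append(s)
    slopes.sort()
    lower = slopes[round(alpha / 2 * n_boot)]
    upper = slopes[round((1 - alpha / 2) * n_boot) - 1]
    return ols_slope(x, y), (lower, upper)


# Hypothetical, perfectly linear data: every resample has a slope of 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * v + 1.0 for v in xs]
slope, (ci_low, ci_high) = bootstrap_slope_ci(xs, ys)
print(slope, ci_low, ci_high)
```

Because the interval is read from the resampled slopes themselves, no normality assumption about the data is needed.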

#### **STATISTICAL ANALYSIS**

All *post-hoc* analyses were performed using the MATLAB Statistics Toolbox and the BRAVO Toolbox for Bootstrap Regression Analysis of Voxelwise Observations. In case of non-normal data distributions, we employed non-parametric statistical testing (Wilcoxon rank sum test). An alpha level of 0.05 was used for all statistical tests. All statistical results are given as mean ± standard deviation (SD). We corrected *P*-values for multiple comparisons using the false discovery rate (FDR) method (Benjamini and Hochberg, 1995) in case of descriptive behavioral results and multiple bootstrap regression analyses. Bootstrap regressions were performed with 5000 iterations. For descriptive analysis of behavioral endpoints alone, *P*-values were FDR-corrected for a total of 48 different comparisons (2 experimental groups, 6 behavioral endpoints; 4 different comparisons: preoperative vs. 12 + 28 days postoperative, preoperative vs. 12 days postoperative, preoperative vs. 28 days postoperative and 12 days postoperative vs. 28 days postoperative). For bootstrap regressions, *P*-values were FDR-corrected for a total of 55 regressions (the estimated neuronal population of the SNc and VTA, the LI, weight reduction and 6 behavioral endpoints).
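The false discovery rate correction of Benjamini and Hochberg (1995) can be sketched in Python as a step-up adjustment of the *P*-values. The function name `fdr_bh` and the example *P*-values are hypothetical, and the sketch stands in for, rather than reproduces, the MATLAB routine used in the study.

```python
def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0                      # adjusted values are capped at 1
    # Walk from the largest to the smallest P-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical uncorrected P-values, not results from the study.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = fdr_bh(raw)
print(corrected)
```

A comparison is then called significant when its adjusted value does not exceed the alpha level of 0.05.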

## **RESULTS**

#### **STEREOLOGICAL COUNTING OF TH-POSITIVE SNc AND VTA NEURONS**

Bilateral injections of 15 µg 6-OHDA into the SNc resulted in extensive cell death of dopaminergic neurons in the SNc and intermediate cell death in the VTA (**Figure 1C**). The estimated population of TH-positive neurons for combined SNc-ROIs was reduced by 95% in PD rats compared to controls. The absolute number of cells within combined SNc-ROIs was 253.8 ± 196.9 in PD vs. 5098 ± 1189 in control rats (*P* = 0.001). For combined VTA-ROIs, the estimated population of TH-positive neurons was reduced by 51.4% in PD rats compared to controls. Here, the absolute number of cells within combined VTA-ROIs was 3073 ± 766 in PD vs. 6318 ± 828 in control rats (*P* = 0.001). We also tested whether the cell estimates differed significantly between the two subgroups that were monitored 12 and 28 days after lesioning. No statistically significant difference was found for either VTA or SNc-ROIs of 6-OHDA or vehicle injected rats (*P*-values > 0.25, data not shown). Furthermore, the estimated populations within single ROIs did not differ significantly between left vs. right hemispheres in PD and control rats, respectively (*P* > 0.2). Likewise, the LI was 0.14 ± 0.2 for 6-OHDA treated rats and 0.08 ± 0.01 for controls (*P* = 1, data not shown).

#### **POSTSURGICAL COURSE**

Careful daily inspection of PD and control rats in the home cage environment revealed reduced spontaneous locomotion and movement speed of 6-OHDA-treated rats. Body posture was hunched and hind limb rigidity was detectable upon manual assessment. PD rats displayed aphagia and adipsia leading to a reduction in body weight of 19 ± 7.7% of the preoperative weight in PD and 1.5 ± 4.5% in control rats prior to postoperative behavioral monitoring (*P* = 0.002). Three PD rats were euthanized, as they did not exhibit stabilization of weight loss within 12 days after surgery. None of the control rats showed overt movement deficits, aphagia, or adipsia. No additional signs of altered behavior that could indicate persistent pain distress were detected during inspections.

#### **CIRCADIAN ACTIVITY**

Free locomotion monitoring took place under constant dark conditions for at least 23 continuous hours. Although no external light cues were given we saw a pattern of locomotor activity reflecting a circadian rhythm in preoperative (data not shown) and postlesion recordings in vehicle injected controls (**Figure 3A**). In most rats we found two activity phases that were separated by a phase of reduced activity lasting ∼12 hours each.

**FIGURE 3 | Circadian post-lesion activity.** Free locomotor activity of all vehicle **(A)** and all 6-OHDA **(B)** injected rats after lesioning. Animal numbers are given to the left of the respective vertical bars representing global activity phases. Red solid lines mark light switches at 11 pm (lights on) and 11 am (lights off) that would take place under usual housing conditions. Note that behavioral monitoring was performed under constant dark conditions.

This corresponds to the fact that our rats were accustomed to a 12 hour day/night cycle switching at 11 am (lights off) and 11 pm (lights on). Experiments were usually started between 5 and 6 pm and held under constant dark conditions. In PD rats, however, we found a partially disturbed circadian rhythm (**Figure 3B**) with two major changes. First, PD rats tended to show a prolonged initial global activity period and less distinguishable transition periods. Second, PD rats did not exhibit a clear rebound of activity in the later phase of monitoring.

#### **BIMODAL DISTRIBUTION OF MAXIMAL SPEED VALUES**

In prelesion recordings, characterization of movement episodes by the log max-SD resulted in a bimodal distribution (**Figure 4**). In order to distinguish between distinct gears of motion, we fitted a Gaussian mixture model to the empirical distribution of each recording. First gear speeds were centered at a log max-SD of 0.72 ± 0.02 in control and 0.71 ± 0.03 in PD rats, corresponding to a velocity of 5.1 cm/s. Second gear speeds were centered at a log max-SD of 1.24 ± 0.1 in control and 1.24 ± 0.06 in PD rats, i.e., 17.4 cm/s. Notably, 6-OHDA injections resulted in marked changes of the distribution of log max-SD values. We observed a partial loss of the bimodal distribution and curve flattening at the center of preoperative second gear episodes. This complicated the fitting of Gaussians and produced spurious results in several animals, e.g., PD rats #1, #2, and #5. Therefore, we separated gears at a log max-SD of 1 (10 cm/s), a value derived from the consistent bimodal distributions of preoperative monitoring.

#### **INCIDENCE OF FAST SPEED MOVEMENT EPISODES**

Comparing the number of first gear and second gear episodes per recording, we saw a shift in their incidence after 6-OHDA lesioning (**Figure 5A**). In the PD group, the incidence of second gear episodes decreased from 50.6 ± 4.1% to 36.4 ± 9% (pre- vs. 12 + 28 days postoperatively, *P* = 0.01, corrected). In vehicle injected controls the proportion of first gear and second gear episodes remained the same (51.8 ± 6.1% vs. 47.4 ± 4.7%, pre- vs. 12 + 28 days postoperatively, *P* = 0.26, corrected). Considering the two subpopulations separately, we only saw a significant difference after 12 days of recovery in PD rats (*P* = 0.02 at 12 days vs. *P* = 0.08 at 28 days; *P* = 0.35 for 12 vs. 28 days; corrected).

#### **OVERALL COVERED DISTANCE**

6-OHDA injections led to a reduction of the total covered distance within 23 hours of spontaneous locomotion (PD rats: 129.9 ± 25.8 m vs. 94.3 ± 28.7 m, pre- vs. 12 + 28 days postoperatively, *P* = 0.05; control rats: 143.2 ± 38 m vs. 138.9 ± 31.7 m, pre- vs. 12 + 28 days postoperatively, *P* = 1, corrected; **Figure 5B**). Again, this difference was significant for the PD group at 12 days, but not at 28 days of recovery (*P* = 0.02 at 12 days vs. *P* = 0.35 at 28 days; *P* = 1 for 12 vs. 28 days; corrected).

#### **AVERAGE MAXIMAL MOVEMENT SPEED**

The average maximal speed across all second gear episodes of full motion was significantly reduced in PD rats (19.9 ± 1.7 cm/s vs. 17 ± 2 cm/s, pre- vs. 12 + 28 days postoperatively, *P* = 0.02; control rats: 20.6 ± 1.8 cm/s vs. 20 ± 1.3 cm/s, pre- vs. 12 + 28 days postoperatively, *P* = 1, corrected; **Figure 5C**). Likewise, the difference was also significant within the PD group at 12 days of recovery (*P* = 0.02 at 12 days vs. *P* = 0.18 at 28 days; *P* = 0.92 for 12 vs. 28 days; corrected). We found no significant difference for first gear episodes (*P* = 0.07 pre- vs. 12 + 28 days postoperatively, *P* = 0.25 at 12 days, *P* = 0.09 at 28 days; *P* = 0.26 for 12 vs. 28 days; corrected).

#### **DWELL TIME**

The average dwell time of second gear episodes was increased in lesioned animals but not in controls (PD rats: 1.32 ± 0.08 s vs. 1.96 ± 0.35 s, pre- vs. 12 + 28 days postoperatively, *P* = 0.01; control rats: 1.33 ± 0.09 s vs. 1.37 ± 0.15 s, pre- vs. 12 + 28 days postoperatively, *P* = 0.84, corrected; **Figure 5D**). Interestingly, the difference in dwell time was also significant in the 28 days subset of PD rats, but not at 12 days post-surgery (*P* = 0.17 at 12 days vs. *P* = 0.01 at 28 days; *P* = 0.05 for 12 vs. 28 days; corrected). Furthermore, the dwell time of first gear episodes was also significantly prolonged (*P* = 0.001 pre- vs. 12 + 28 days postoperatively, *P* = 0.01 at 12 days, *P* = 0.008 at 28 days; *P* = 0.98 for 12 vs. 28 days; corrected).

#### **SPATIAL SPREAD**

The spatial spread accomplished within a given episode (averaged across all second gear episodes of full motion) was increased after 6-OHDA lesioning (PD rats: 17.5 ± 2 cm vs. 22.1 ± 5.3 cm, pre- vs. 12 + 28 days postoperatively, *P* = 0.08; control rats: 18 ± 1.2 cm vs. 18 ± 2.6 cm, pre- vs. postoperatively, *P* = 0.83, corrected; **Figure 5E**). Comparable to the dwell time, the difference was significant at 28 days in PD-rats, but not at 12 days or within the combined group (*P* = 1 at 12 days vs. *P* = 0.01 at 28 days; *P* = 0.08 for 12 vs. 28 days; corrected). Similar to the dwell time, the spatial spread was also significantly enlarged for first gear episodes (*P* = 0.001 pre- vs. 12 + 28 days postoperatively, *P* = 0.02 at 12 days, *P* = 0.008 at 28 days; *P* = 0.98 for 12 vs. 28 days; corrected).

#### **CURVATURE**

6-OHDA injections led to an increase of the mean curvature of motion tracks (PD-rats: 4.07 ± 0.52% vs. 9.15 ± 1.66%, pre- vs. 12 + 28 days postoperatively, *P* = 0.002). Notably, the curvature was also significantly increased in vehicle injected control rats, but only for the comparison of pre vs. combined postoperative groups (3.850 ± 0.70% vs. 5.15 ± 0.64%, pre- vs. 12 + 28 days postoperatively, *P* = 0.02, corrected; **Figure 5F**). However, in PD rats we saw a stronger increase 12 days after injections (*P* = 0.02 at 12 days) and a significant recovery of curvature ratios 2 weeks later (*P* = 0.05 for 12 vs. 28 days). At the later time point the ratio was still significantly enhanced (*P* = 0.01 at 28 days). The same pattern of statistically significant differences was observed for first gear episodes (*P* = 0.001 pre- vs. 12 + 28 days postoperatively, *P* = 0.01 at 12 days, *P* = 0.008 at 28 days; *P* = 0.05 for 12 vs. 28 days; corrected).

#### **BOOTSTRAP REGRESSIONS**

We found a significant correlation for five of 55 independent comparisons (**Figure 6A**) in 6-OHDA treated rats (*n* = 8). The estimated number of VTA, but not of SNc neurons, correlated negatively with the magnitude of weight reduction (*R* = −0.86, **Figure 6B1**). Weight loss also correlated with dwell time (*R* = 0.8, **Figure 6B2**) and spatial spread (*R* = 0.8, **Figure 6B3**). Spatial spread also correlated with the incidence of second gear episodes (*R* = 0.71, **Figure 6B4**). Furthermore, the curvature of motion tracks obtained postoperatively correlated negatively with the spatial spread (*R* = −0.9, **Figure 6B5**). A further nine comparisons were statistically significant before FDR-correction and yielded *P*-values < 0.1 after correction (**Figure 6A**). Among them was the only comparison (incidence of second gear episodes vs. maximal speed) that was also highly correlated in vehicle injected controls (*R* = 1, data not shown).
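The FDR-correction referred to above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure, which this section does not name explicitly; the function name and the raw *P*-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the tests, sorted by ascending raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
for r, a in zip(raw, adj):
    print(f"raw={r:.3f}  adjusted={a:.5f}")
```

A comparison that is nominally significant before correction (for example, raw *P* = 0.039 here) can end up above 0.05 after adjustment, which is the situation described for the nine additional comparisons.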

#### **DISCUSSION**

The main findings of our study are that the stereotypical locomotion pattern of rats exhibiting a first and a second gear of motion is partially lost after bilateral dopaminergic denervation. The reduced incidence of second gear episodes points at distinct deficits in the execution of fast motor sequences. However, although generally slowed, bilaterally lesioned rats also displayed bouts of spontaneous locomotion, to some degree similar to kinesia paradoxa observed in PD patients. Contrary to expectation, we observed an abnormally increased spatial spread and increased motion time during episodes of full motion in lesioned rats. Moreover, we noted larger curvature values in movement paths of lesioned rats. Upon acceleration, curvature values showed a reduction toward physiological values. All these changes were accompanied by an alteration of circadian locomotor activity.

The depletion of estimated dopaminergic cell populations in the SNc and VTA in our disease model was comparable to that observed in humans suffering from advanced PD, where the VTA is known to be significantly less depleted than the SNc (Hirsch et al., 1988). However, the mesocorticolimbic dopamine projections are vulnerable in advanced PD as well, and seem to contribute to the complex clinical picture encountered in humans, especially at later disease stages (Thobois et al., 2010). It is worth mentioning that our stereological results were similar to cell-loss estimates of 40–50% in the VTA described in human postmortem studies (Uhl et al., 1985; Hirsch et al., 1988; Dymecki et al., 1996; McRitchie et al., 1997; Thobois et al., 2010). The estimated number of SNc neurons of vehicle injected rats in our study was comparable to the number of TH-positive cells reported by a previous study (Fox et al., 2001) that investigated Fisher 344 × Brown Norway hybrids. Other studies reported significantly higher TH-positive cell counts in different rat strains (Lewis rats, Strackx et al., 2008; Long-Evans rats, Healy-Stoffel et al., 2012; Sprague-Dawley rats, Walker et al., 2012), suggesting large differences in the absolute number of midbrain dopaminergic cells in different rat strains.

PD rats in our study displayed a considerable reduction of body weight (∼20%). Weight loss and associated metabolic changes might by themselves have an influence on locomotion. We included the reduction of body weight into multiple bootstrap regressions and investigated the interdependence between behavioral endpoints, dopaminergic cell loss and weight reduction. We found a significant negative correlation between the VTA, but not the SNc dopaminergic population, and the magnitude of weight loss. Thus, abnormal feeding behavior and more generally abulia, was related to cell loss within the mesolimbic dopaminergic system in our study (Redgrave et al., 2010).

Our analysis approach allowed the discrimination of naturally occurring motion episodes in severely dopamine depleted and overtly akinetic and bradykinetic rats. The characterization of motion episodes by the maximal rather than the average speed accomplished in a given episode enabled us to analyze rare and, given the pronounced parkinsonian phenotype of our rats, unexpected episodes of qualitatively altered motor activity. These motion episodes were still characterized by a reduced incidence (i.e., poverty of movement or akinesia), reduced overall traveled distance (i.e., hypokinesia) and reduced maximal speed (i.e., bradykinesia) compared to controls. Hence, important hallmark symptoms of PD were clearly detectable in motion episodes of 6-OHDA rats. However, we did not see a significant correlation of these three behavioral endpoints with the estimated populations of the SNc and VTA.

Overall, the quantitative change in traveled distance was rather small. A disruption of the circadian rhythm, a cardinal non-motor symptom in human PD patients (Videnovic and Golombek, 2013) and present in animal models of PD (Kudo et al., 2011; Willison et al., 2013) may introduce a bias. Higher absolute activity values may result from prolongation and fragmentation of activity and reduced sleep phases. Our analysis method could not differentiate between awake resting and true sleep, but conclusions could still be drawn from the overall circadian activity pattern. Here, we observed a more fragmented pattern of activity together with a nearly absent second activity phase. Moreover, PD rats showed a significantly reduced absolute number of motion episodes in comparison to controls (data not shown). Together, this argues against a circadian bias toward higher distance values in our data.

The finding of increased dwell times and spatial spread in first and second gear motion episodes of PD rats was rather unexpected. The spatial spread in the 6-OHDA group was increased to values not seen in any pre-lesion or control recordings. Theoretically, an increased dwell time could be explained as a function of slowed locomotion. To reach the same location, bradykinetic rats may simply need more time. This interdependence should theoretically manifest in a negative correlation between speed and dwell time or spatial spread. To the contrary, our data revealed a positive correlation between speed and spatial spread, albeit not significant after FDR-correction. That is, faster PD rats showed spatially extended second gear episodes. Could a postlesion increase in curvature explain increased dwell times? Arguing against that, curvature values were significantly negatively correlated with spatial spread, and also with dwell time, speed and incidence of second gear episodes (before FDR-correction). Furthermore, a higher incidence of second gear behavior correlated significantly with prolonged spatial spread. Such an increased incidence could be the result of an uneven effect of 6-OHDA lesioning and stronger diminishment of first gear in comparison to second gear activity. This would also be supported by the finding that 6-OHDA induced weight loss correlated positively with the incidence of second gear activity (before FDR-correction). Taken together, increased spatial spread and dwell times constitute an abnormal behavioral characteristic of PD rats.

An abnormal increase of locomotor activity in bilaterally lesioned rats has been described in response to a pharmacological challenge (Schallert et al., 1978). Unexpected bursts of locomotion in otherwise akinetic patients are also well known to occur in selected PD patients or cases with postencephalitic parkinsonism (kinesia paradoxa). The sudden ability of our rats to cover longer distances within a motion episode may thus represent a rat equivalent of this condition. There are, however, important differences to classical kinesia paradoxa, which is often triggered by external sensory cues (Martin, 1967).

One possible explanation for prolonged dwell times and increased spatial spread could be problems with the termination of movement. In PD patients, deficiency in smooth motion termination is known (Dounskaia et al., 2009), along with an inability to promptly change generated force or quickly re-plan current motion. Typically, PD patients show an asymmetric evolution of velocities during the execution of goal-directed behaviors with initial fast accelerations (Flash et al., 1992). In our data we found a symmetric evolution of acceleration between PD and control rats. That is, we saw a linear relationship between the dwell time and the time point where PD or control rats accomplished their maximal speed in a given episode (data not shown). Thus, the observed behavioral abnormalities do not support the presence of the hastening phenomenon (unwanted acceleration of movement in human PD patients) in our rats. Finally, it is also conceivable that the abnormal drive to continue exploration or food search could be a behavioral consequence of metabolic changes accompanying weight loss (Redgrave et al., 2010).

Another abnormal feature was found in the movement path's curvature. The ratio between real and direct distance of first and second gear motion episodes was significantly enlarged in PD rats and, to a lesser extent, in controls. What could be the cause of increased curvatures? Unilaterally depleted rats display a spontaneous ipsiversive motor bias (Ungerstedt, 1971). Hence, curvature increases could result from spontaneous partial turning behavior provoked by asymmetric dopaminergic lesioning in our rats. However, the calculated LI did not correlate with curvature or any other behavioral endpoint except distance (before FDR-correction). Furthermore, unilateral 6-OHDA lesions were shown to shorten steps in spontaneous walking (Metz et al., 2005), thus modeling the shuffling gait of PD patients (Knutsson, 1972). Hind limb rigidity was detected upon manual assessment in our PD rats and could have contributed to disturbed locomotor patterns with reduced step sizes, axial instability and loss of balance during walking. Hemiparkinsonian rats, when tested by e.g., beam-walking, also exhibit difficulties in motor coordination (Truong et al., 2006). Interestingly, we found a significant negative correlation between curvature and spatial spread, as well as maximal speed, dwell time and incidence of second gear activity (before FDR-correction). Thus, faster running or greater spreads were associated with straighter movement paths in PD rats. If limb rigidity was indeed related to the expression of increased curvature values, then the ability of PD rats to generate fast motion with reduced curvature values may reflect an overcoming of rigidity for brief periods.
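The ratio between real and direct distance mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the toy tracks are hypothetical, and expressing the result as percent excess over the straight line is our guess at how the percentage values in the Results might be obtained, not something stated in this section.

```python
import math


def path_to_chord_ratio(points):
    """Ratio of travelled path length to straight-line start-to-end distance."""
    # Sum of the segment lengths between consecutive tracked positions.
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Direct distance between the first and last position of the episode.
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("start and end coincide; ratio undefined")
    return path / chord


# Hypothetical tracks (x, y in cm), for illustration only.
straight = [(0, 0), (5, 0), (10, 0)]
detour = [(0, 0), (3, 4), (6, 0)]

for name, track in (("straight", straight), ("detour", detour)):
    ratio = path_to_chord_ratio(track)
    # Assumed convention: percent excess of real over direct distance.
    print(f"{name}: ratio={ratio:.3f}, excess={100 * (ratio - 1):.1f}%")
```

A perfectly straight episode yields a ratio of 1, and any deviation from the direct line raises it, so the measure is independent of speed and of the absolute length of the episode.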

We saw some differences between lesioned animals that were monitored at an early (12 days) and later stage (28 days) after 6-OHDA lesioning. Most strikingly, the dwell time was significantly longer at 28 days in comparison with 12 days postlesion. In contrast, the increase in curvature was significantly attenuated at the later time point. Importantly, no significant difference in cell counts was observed between the two subsets. It remains difficult to infer which compensatory mechanisms were at work here. The milder but still significant increase of curvature in controls, in conjunction with the significant decrease of curvature values in PD rats with time, could argue for an influence of and recovery from surgery *per se*. The emergence of an abnormally increased behavior such as kinesia paradoxa may have in turn developed over time when the rats recovered fully from surgery. Despite putative effects of recovery from surgery, absolute distance, speed and incidence of second gear activity were comparably decreased at an early and later point of investigation.

We conclude that long-term behavioral observations of spontaneous locomotion offer new perspectives on distinctly different modes of motion in a rat model of advanced PD. The present behavioral analysis, in conjunction with *in-vivo* electrophysiology, may be particularly suited to reveal neural mechanisms underlying motor fluctuations such as kinesia paradoxa, and may provide further insights into the complex pathophysiology of PD.

## **AUTHOR CONTRIBUTIONS**

Benjamin Grieb, Gerhard Engler, Ismini Papageorgiou, Wolfgang Hamel, and Christian K. Moll designed research; Benjamin Grieb, Gerhard Engler, and Ismini Papageorgiou performed experiments; Benjamin Grieb, Constantin von Nicolai, Andrew Sharott, and Ismini Papageorgiou analyzed the data; Benjamin Grieb, Constantin von Nicolai, Andrew Sharott, Andreas K. Engel, and Christian K. Moll wrote the manuscript.

#### **ACKNOWLEDGMENTS**

The authors would like to thank Doris Lange for help with the histology. This work was supported by the European Union (MRTN-CT-2005-019247).

## **REFERENCES**


Parkinson's disease: insights from a transgenic mouse model. *Exp. Neurol.* 243, 57–66. doi: 10.1016/j.expneurol.2013.01.014

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 September 2013; accepted: 08 November 2013; published online: 27 November 2013.*

*Citation: Grieb B, von Nicolai C, Engler G, Sharott A, Papageorgiou I, Hamel W, Engel AK and Moll CK (2013) Decomposition of abnormal free locomotor behavior in a rat model of Parkinson's disease. Front. Syst. Neurosci. 7:95. doi: 10.3389/fnsys.2013.00095*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Grieb, von Nicolai, Engler, Sharott, Papageorgiou, Hamel, Engel and Moll. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Primary motor cortex of the parkinsonian monkey: altered neuronal responses to muscle stretch

## *Benjamin Pasquereau and Robert S. Turner\**

*Department of Neurobiology, Center for Neuroscience and The Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, PA, USA*

#### *Edited by:*

*Hagai Bergman, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*M. Gustavo Murer, Universidad de Buenos Aires, Argentina Yifat Prut, The Hebrew University of Jerusalem, Israel Joshua Goldberg, The Hebrew University of Jerusalem, Israel*

#### *\*Correspondence:*

*Robert S. Turner, Department of Neurobiology, Center for Neuroscience and The Center for the Neural Basis of Cognition, University of Pittsburgh, 4047 BST-3, 3501 Fifth Avenue, Pittsburgh, PA 15261, USA e-mail: rturner@pitt.edu*

Exaggeration of the long-latency stretch reflex (LLSR) is a characteristic neurophysiologic feature of Parkinson's disease (PD) that contributes to parkinsonian rigidity. To explore one frequently-hypothesized mechanism, we studied the effects of fast muscle stretches on neuronal activity in the macaque primary motor cortex (M1) before and after the induction of parkinsonism by unilateral administration of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP). We compared results from the general population of M1 neurons and two antidromically-identified subpopulations: distant-projecting pyramidal-tract type neurons (PTNs) and intra-telencephalic-type corticostriatal neurons (CSNs). Rapid rotations of elbow or wrist joints evoked short-latency responses in 62% of arm-related M1 neurons. As in PD, the late electromyographic responses that constitute the LLSR were enhanced following MPTP. This was accompanied by a shortening of M1 neuronal response latencies and a degradation of directional selectivity, but surprisingly, no increase in single unit response magnitudes. The results suggest that parkinsonism alters the timing and specificity of M1 responses to muscle stretch. Observation of an exaggerated LLSR with no change in the magnitude of proprioceptive responses in M1 is consistent with the idea that the increase in LLSR gain that contributes to parkinsonian rigidity is localized to the spinal cord.

**Keywords: stretch reflex, primary motor cortex, MPTP, Parkinson's disease, rigidity**

## **INTRODUCTION**

Rigidity, one of the cardinal signs of Parkinson's disease (PD), is defined clinically as a sustained increase in resistance to passive movement of a joint throughout its range (Stebbins and Goetz, 1998). Although various possible contributing factors have been studied for decades at the central (Cantello et al., 1991; Aminoff et al., 1997; Strafella et al., 1997; Young et al., 1997), spinal (Angel and Hofmann, 1963; Dietrichson, 1971; Andersson and Sjolund, 1978; Delwaide et al., 1991; Lelli et al., 1991; Marchand-Pauvert et al., 2011) and muscle (Dietz et al., 1981; Lee and Tatton, 1982; Noth et al., 1988) levels, the exact pathophysiologic mechanisms of parkinsonian rigidity remain elusive. Many observations have suggested that the stretch reflex plays a key role in the generation of rigidity, but controversies persist regarding the precise relationship (Lee and Tatton, 1975; Berardelli et al., 1983; Rothwell et al., 1983; Meara and Cody, 1992; Xia et al., 2009).

In neurologically-intact animals, the stretch reflex appears to regulate limb stiffness to maintain precise control of multijoint posture during interactions with an unstable environment (Rothwell, 1990; Shemmell et al., 2010; Pruszynski et al., 2011a). The stretch reflex involves at least two components: a short-latency response mediated by fast-conducting segmental pathways (Magladery et al., 1951; Burke et al., 1984), and a long-latency component mediated, at least in part, by a transcortical pathway (Hammond, 1955; Marsden et al., 1972; Capaday et al., 1991; Day et al., 1991; Matthews, 1991; Palmer and Ashby, 1992; Pruszynski et al., 2011b). The idea that cortex contributes to the long-latency component (i.e., the long latency stretch reflex, LLSR) is supported by observations that neurons in primary motor cortex (M1) respond to proprioceptive perturbations at delays (20–60 ms prior to the late muscle reaction) appropriate for them to participate in this long-latency component (Evarts, 1973; Conrad et al., 1975; Evarts and Tanji, 1976; Cheney and Fetz, 1984; Abbruzzese et al., 1985; Aminoff et al., 1997; Mackinnon et al., 2000). The ability of unilateral muscle stretches to evoke bilateral LLSRs in cases of congenital corticospinal tract malformation (Matthews et al., 1990; Capaday et al., 1991) provides strong support for the view that a component of the LLSR is mediated via a transcortical route. It is important to note, however, that slow-conducting spinal reflex pathways also contribute to the LLSR (Berardelli et al., 1983; Cody et al., 1986).

In PD, the short-latency component of the stretch reflex appears essentially normal (Berardelli et al., 1983; Rothwell et al., 1983; Cody et al., 1986; Meara and Cody, 1993; Hayashi et al., 2001), but the LLSR is markedly exaggerated (Tatton and Lee, 1975; Mortimer and Webster, 1979; Berardelli et al., 1983; Rothwell et al., 1983; Cody et al., 1986; Meara and Cody, 1993; Hayashi et al., 2001; Xia et al., 2009). By increasing the activation of muscles that oppose passive stretch, an abnormally-increased LLSR may be a key factor in the genesis of parkinsonian rigidity. Indeed, several studies have documented a correlation between the magnitude of the LLSR and the severity of parkinsonian rigidity (Lee and Tatton, 1975; Mortimer and Webster, 1979; Tatton et al., 1984; Xia et al., 2009). Other studies failed to confirm such a clear relationship (Berardelli et al., 1983; Rothwell et al., 1983; Cody et al., 1986), most likely because additional reflex abnormalities, such as the aberrant activation of the shortening muscle, also contribute (Andrews et al., 1972; Xia et al., 2006, 2009). Further support for the LLSR's role in parkinsonian rigidity comes from observations that treatments for PD (i.e., dopaminergic medication, pallidotomy and subthalamic stimulation) clearly reduce stretch-related muscle activation, and ameliorate muscle stiffness proportionately (Teravainen et al., 1989; Limousin et al., 1999; Fung et al., 2000; Hayashi et al., 2001; Lee et al., 2002; Xia et al., 2006; Levin et al., 2009; Marchand-Pauvert et al., 2011; Raoul et al., 2012).

Two straightforward possibilities may account for the exaggerated LLSR of PD. The trans-cortical hypothesis states that abnormal neuronal activity transmitted from the parkinsonian basal ganglia (BG) (Delong and Wichmann, 2007) causes increased somatosensory responsiveness in the M1 neurons that participate in the LLSR. Indeed, neurons in the BG and thalamus of parkinsonian subjects have larger-than-normal responses to proprioceptive stimulation and reduced response selectivity with respect to the joint and limb stimulated (Filion et al., 1988; Bergman et al., 1994; Pessiglione et al., 2005). According to this model, increased proprioceptive responsiveness might be transmitted to M1 where it would appear as (1) a greater incidence of neuronal responses to proprioceptive stimulation in the population of M1 neurons, (2) an increase in the balance of torque-evoked increases over decreases, or (3) an increase in the magnitude of torque-evoked responses in individual M1 neurons. It is unclear, however, how this simple model can be reconciled with the abundant evidence that the M1 of parkinsonian subjects has reduced responsiveness to somatosensory inputs as measured by electroencephalography (Rossini et al., 1989; Aminoff et al., 1997; Rickards and Cody, 1997; Lewis and Byblow, 2002; Schrader et al., 2008; Degardin et al., 2009), transcranial magnetic stimulation (Lewis and Byblow, 2002), and functional imaging (Boecker et al., 1999). The alternative model hypothesizes that the exaggerated LLSR of PD is mediated by abnormal function of slow-conducting spinal reflex pathways (Berardelli et al., 1983; Cody et al., 1986; Simonetta Moreau et al., 2002; Marchand-Pauvert et al., 2011; Raoul et al., 2012).

Primary motor cortex is composed of a complex collection of distinct cell types that differ with respect to intrinsic physiology (McCormick et al., 1985; Connors and Gutnick, 1990; Stewart and Foehring, 2000; Hattox and Nelson, 2007; Kiritani et al., 2012) and afferent innervation (Swadlow, 1994). These subtypes may perform dissimilar functions in the LLSR and may be affected differently in the parkinsonian state (Pasquereau and Turner, 2011). For example, distant-projecting lamina 5b pyramidal tract-type neurons (PTNs) and intratelencephalic-projecting corticostriatal neurons (CSNs) in the M1 have markedly different somatosensory responsiveness (Bauswein et al., 1989; Turner and Delong, 2000). PTNs are positioned to play a relatively direct role in the expression of the abnormal LLSR in PD due to their direct projections to segmental motor nuclei (Landgren et al., 1962; Brodal, 1978; Kuypers, 1981). The CSNs of M1, in contrast, provide an important glutamatergic input to motor regions of the striatum, and thus, are in a position to influence, for better or worse, the disordered physiology of the dopamine-depleted striatum (Mallet et al., 2006).

Despite clear implication of the motor cortex in the LLSR, only a few studies have compared cerebral responses to proprioceptive inputs and abnormalities in the LLSR in parkinsonian subjects (Rossini et al., 1989, 1991; Aminoff et al., 1997), and these used a non-invasive electrocerebral approach. These studies reported the seemingly paradoxical finding that an exaggerated LLSR was correlated with *attenuation* of a sensory evoked potential that is thought to emanate from precentral cortical areas. To elucidate the changes in cortical function associated with the exaggerated LLSR of PD, we performed single unit recording in the M1 of parkinsonian macaques and studied the short latency neuronal responses to rapid muscle stretch. Based on previous observations that the spontaneous activity of PTN and CSN populations are affected differently in parkinsonism (Mallet et al., 2006; Pasquereau and Turner, 2011), we hypothesized that the responses of PTNs and CSNs to muscle stretch would be affected differently in parkinsonism. To address this issue, antidromically-identified neurons (PTNs and CSNs) were studied in the arm area of M1 in two rhesus monkeys. The stretch reflex was evoked by sudden rotations of the animal's elbow or wrist and recordings were obtained before and after induction of parkinsonism by unilateral intra-carotid administration of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP).

## **METHODS**

#### **ANIMALS, APPARATUS, TASKS**

Two female monkeys (*Macaca mulatta*) were used for these experiments (monkeys V and L). All aspects of animal care were in accord with the *Guide for the Care and Use of Laboratory Animals* (National Research Council, 1996), and all procedures were approved by the institutional animal care and use committee.

Data from these animals were part of a recent publication describing changes in resting cortical activity associated with parkinsonism (Pasquereau and Turner, 2011). Many aspects of the experimental approach were described in detail in that report. In brief, the animals performed a visuomotor step-tracking task similar to one used in previous studies of cortical and BG neuronal activity (Alexander, 1987; Mitchell et al., 1987; Alexander and Crutcher, 1990; Turner and Delong, 2000). The animal sat in a primate chair and faced a computer monitor. The right arm was secured into a close-fitting padded cradle attached to a one-dimensional torquable manipulandum. The wrist (monkey L) or elbow (monkey V) joint was aligned with the manipulandum's axis of rotation. Flexion and extension movements rotated the manipulandum in the horizontal plane and thereby controlled the horizontal position of an onscreen cursor. A trial began when a center target appeared and the monkey made the appropriate joint movement to align the cursor with the target. The monkey maintained this position for the duration of a start-position hold period (random duration, 2–5 s), during which the animal could not predict the location of the upcoming lateral target. The target then shifted to the left or right (chosen at random), and the animal moved the cursor to capture the lateral target. The animal received a drop of juice for successful completion of the task.

On two-thirds of the trials (selected at random), single flexing or extending torque impulses (0.1 Nm, 50 ms duration) were applied to the manipulandum by a DC brushless torque motor (TQ40W, Aerotech Inc., Pittsburgh PA) at an unpredictable time beginning 1–2 s (uniform randomized distribution) after initial capture of the center target. Each square-wave torque impulse induced an angular displacement of the joint (mean = 10-deg) causing a sudden stretch of arm extensor or flexor muscles. The animals were not trained to produce a specific response to these unpredictable perturbations, but the animals naturally adopted a strategy that returned the joint to its initial pre-impulse position.
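The perturbation schedule described above can be sketched as a simple random draw per trial. This is an illustrative sketch rather than the authors' task-control code: the function and field names are hypothetical, and the equal probability of flexing and extending impulses is an assumption that the text does not state.

```python
import random

TORQUE_NM = 0.1      # impulse amplitude given in the text
DURATION_S = 0.050   # impulse duration given in the text
P_PERTURB = 2 / 3    # fraction of trials that receive an impulse


def draw_perturbation(rng):
    """Return None for an unperturbed trial, else a dict describing the impulse."""
    if rng.random() >= P_PERTURB:
        return None
    return {
        # Assumption: both directions are equally likely.
        "direction": rng.choice(["flexion", "extension"]),
        # Onset is uniform between 1 and 2 s after center-target capture.
        "onset_s": rng.uniform(1.0, 2.0),
        "torque_nm": TORQUE_NM,
        "duration_s": DURATION_S,
    }


rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
trials = [draw_perturbation(rng) for _ in range(3000)]
perturbed = [t for t in trials if t is not None]
print(f"perturbed fraction: {len(perturbed) / len(trials):.3f}")
```

Drawing the onset from a uniform window, together with interleaved unperturbed trials, is what makes the impulse unpredictable in both timing and occurrence.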

Later aspects of the behavioral trial, which evaluated instruction-related neuronal activity, are irrelevant to the current study.

## **SURGERY**

The animals were prepared surgically using aseptic techniques under isoflurane inhalation anesthesia (Pasquereau and Turner, 2011). A cylindrical stainless steel chamber was implanted at an angle of 35° in the coronal plane to allow access to the arm-related regions of the left M1 and the posterior putamen. The chamber and hardware for head fixation were fixed to the skull with bone screws and methyl methacrylate polymer.

Pairs of fine Teflon-insulated multistranded stainless steel wires were implanted into multiple arm muscles: flexor carpi ulnaris, flexor carpi radialis, biceps longus, brachioradialis, and triceps lateralis in monkey L; and posterior deltoid, trapezius, triceps longus, triceps lateralis and brachioradialis in monkey V. The wires were led subcutaneously to a connector fixed to the skull implant. Accurate placement of electromyographic (EMG) electrodes was verified post-surgically. Following surgery, animals were given prophylactic antibiotics and analgesic medication.

#### **PLACEMENT OF ELECTRODES FOR ANTIDROMIC IDENTIFICATION**

PTNs and CSNs were identified by antidromic activation from electrodes implanted in the cerebral peduncles and posterolateral striatum. Sites for implantation were identified using standard electrophysiological mapping techniques (Turner and Delong, 2000). Arm-related areas of the putamen were identified by sensorimotor examination of striatal activity and microstimulation effects. The arm-related fiber tract in the pre-pontine cerebral peduncle (ventral to the substantia nigra) was located using similar techniques.

Custom-built PtIr microwire electrodes were implanted at arm-related sites in putamen and the peduncle (Turner and Delong, 2000). After implantation, stimulation through the electrodes evoked arm movements similar to those observed at the target sites during microelectrode mapping. In both animals, three such electrodes were implanted in the posterior putamen between the planes of HC anterior 8 and 14, and one electrode was implanted in the arm-responsive portion of the pre-pontine peduncle (for details, see Turner and Delong, 2000). Histologic reconstruction confirmed that the striatal and peduncle electrodes were at sites known from anatomical studies to receive the bulk of M1 CSN and PTN projections, respectively (Brodal, 1978; Flaherty and Graybiel, 1991; Takada et al., 1998; Turner and Delong, 2000).

#### **DATA ACQUISITION**

Areas of M1 related to the primary joint used in the task were identified using microstimulation and sensorimotor mapping. We performed trans-dural extracellular recording using single glass-coated PtIr microelectrodes mounted in a hydraulic microdrive (MO-95, Narishige Intl., Tokyo). A cortical region was targeted for data collection if neurons responded to active and/or passive movement of the arm and microstimulation at low currents evoked contraction of forelimb muscles (<40-μA, 10 biphasic pulses at 300-Hz). Microelectrode penetrations were performed throughout the targeted cortical area using sequential stimulation of each putamen and peduncle stimulating site as search stimuli (biphasic current pulses of 700-μA, 0.2-ms duration separated by 0.1-ms, >1.5-s between successive biphasic shocks). Neurons were selected for data collection if they were activated antidromically or if they were located in close proximity (<0.5-mm) to an antidromically-activated neuron. Standard tests for antidromic identification were used: constant antidromic latency (<0.2-ms jitter), reliable following of a high-frequency train of stimuli (three or four shocks at 200-Hz), and collision of antidromic spikes with spontaneously occurring spikes (Fuller and Schlag, 1976; Turner and Delong, 2000).

Neuronal activity was collected while the animal performed the step-tracking task. The microelectrode signal was amplified ×10<sup>4</sup> and bandpass filtered (0.3–10 kHz, DAM-80, WPI Inc.). The action potentials of single neurons (sampled at 60 kHz) were discriminated on-line using template-based spike sorting (MultiSpike Detector, Alpha Omega Engineering, Nazareth, Israel). The timing of detected spikes and of relevant task events was sampled digitally at 1 kHz and saved to disk for offline analysis. EMG signals were differentially amplified (gain = 10-K), band-pass filtered (20-Hz to 5-kHz), rectified and then low-pass filtered (100-Hz). EMG data were collected during only a subset of data recording sessions. (No usable EMG signal was available in monkey L after MPTP administration.) Analog data reflecting angular position of the manipulandum (i.e., joint angle), the torque produced by the motor, and EMG were digitized at either 200-Hz (monkey L) or 500-Hz (monkey V).

#### **ADMINISTRATION OF MPTP**

After an adequate number of neurons were sampled from the neurologically-normal state, a hemiparkinsonian syndrome was induced by injection of MPTP into the left internal carotid artery [0.5-mg/kg, (Bankiewicz et al., 1986; Wu et al., 2007)]. This model of parkinsonism was chosen to facilitate care of the animals during the months-long period of post-intoxication recording and to increase the likelihood that animals would continue performing the operant task following intoxication (Bankiewicz et al., 2001). The MPTP administration procedure was performed under general anesthesia (1–3% Isoflurane) and prophylactic antibiotics and analgesics were administered post-surgically. Both animals developed stable signs of parkinsonism contralateral to the infusion (i.e., on the right side of the body). Quantitative measures of the severity of parkinsonism, of its impact on task performance, and histologic evidence of dopamine depletion are documented in a previous report (Pasquereau and Turner, 2011). Post-MPTP recording sessions started >30-days after MPTP administration.

#### **HISTOLOGY**

After the last recording session, each monkey was given a lethal dose of sodium pentobarbital and was perfused transcardially with saline followed by 10% formalin in phosphate buffer and then sucrose. The brains were processed histologically to localize microelectrode tracks (using cresyl violet staining) and to document the loss of dopaminergic cells in the substantia nigra *pars compacta* (SNc) using tyrosine hydroxylase (TH) immunochemistry (See Pasquereau and Turner (2011) for details concerning histologic results).

#### **DATA ANALYSIS**

The present report addresses only the short-latency joint movements and changes in neuronal activity evoked by torque perturbation during the start-position hold period. The general features of task performance before and after MPTP have been reported previously (Pasquereau and Turner, 2011).

The digitized signal reflecting manipulandum angle was filtered and differentiated [low-pass 25-Hz (Hamming, 1983)]. The onset of torque-evoked movement, peak velocity, and movement termination were detected automatically using angle, velocity, and duration criteria. A velocity threshold of 5-deg/s was used to detect movement. Session-by-session means of kinematic measures were entered into Three-Way ANOVAs to test for effects of MPTP administration, movement direction and animal.
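
As a rough illustration of the threshold step only, the sketch below differentiates a sampled angle trace and reports the first sample at which the absolute velocity exceeds 5 deg/s. The function names and the central-difference derivative are assumptions made for illustration; the 25-Hz low-pass filter and the additional angle and duration criteria mentioned above are omitted, so this is not the authors' implementation.

```python
def velocity(angle_deg, fs_hz):
    """Central-difference derivative of an angle trace (deg) -> deg/s."""
    n = len(angle_deg)
    if n < 2:
        return [0.0] * n
    dt = 1.0 / fs_hz
    vel = [0.0] * n
    vel[0] = (angle_deg[1] - angle_deg[0]) / dt
    vel[-1] = (angle_deg[-1] - angle_deg[-2]) / dt
    for i in range(1, n - 1):
        vel[i] = (angle_deg[i + 1] - angle_deg[i - 1]) / (2.0 * dt)
    return vel


def movement_onset_index(angle_deg, fs_hz, threshold_deg_s=5.0):
    """Index of the first sample whose |velocity| exceeds the threshold, or None."""
    for i, v in enumerate(velocity(angle_deg, fs_hz)):
        if abs(v) > threshold_deg_s:
            return i
    return None


# Hypothetical trace sampled at 500 Hz: flat for 5 samples, then a 0.1 deg/sample ramp.
trace = [0.0] * 5 + [0.1 * k for k in range(1, 6)]
onset = movement_onset_index(trace, fs_hz=500.0)
```

With real data, the trace would be filtered before differentiation, since an unfiltered derivative amplifies digitization noise and can cross a 5-deg/s threshold spuriously.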

EMG data were analyzed using methods similar to those described previously (Turner et al., 1995). For each muscle, peritorque means of the digitized EMG signals were constructed for all valid torque perturbations in each direction (flexion and extension). The signal was normalized by subtracting the background activity recorded 200-ms prior to the onset of the torque perturbation.

Neuronal data were screened based on the location, quality, and duration of the recording. Data were included in the analysis database if they met the following criteria: (1) The recording was obtained from a location within 3-mm of the anterior bank of the central sulcus from which movements of the arm could be evoked by microstimulation (i.e., <40-μA, 10 pulses at 300-Hz). (2) Adequate single unit isolation was maintained throughout the recording. Isolation was controlled during acquisition by adjusting electrode position and spike sorting. Adequate isolation was verified off-line by testing whether a neuron's inter-spike intervals (ISIs) obeyed a refractory period (>2-ms). (3) Neurons were studied if they were either responsive to antidromic stimulation or were encountered within 0.5-mm of an antidromically activated neuron.
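
The refractory-period check in criterion (2) can be sketched as follows. The function name and the example spike times are hypothetical, and the source does not state what fraction of short intervals, if any, was tolerated, so the sketch only reports that fraction.

```python
def isi_violation_fraction(spike_times_ms, refractory_ms=2.0):
    """Fraction of inter-spike intervals that are not longer than refractory_ms."""
    times = sorted(spike_times_ms)
    isis = [b - a for a, b in zip(times, times[1:])]
    if not isis:
        return 0.0
    violations = sum(1 for isi in isis if isi <= refractory_ms)
    return violations / len(isis)


# Hypothetical spike times (ms); one of the four ISIs (1.5 ms) is too short.
spikes = [0.0, 12.0, 13.5, 40.0, 95.0]
frac = isi_violation_fraction(spikes)
```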

The analyses presented here were performed on data extracted from completely different task time periods than those used in our previous publication (Pasquereau and Turner, 2011). In addition, not all neurons were recorded from under all task conditions, so the neuronal populations studied here overlap only partially with those used in (Pasquereau and Turner, 2011).

For each torque perturbation, continuous neuronal activation functions [spike density functions (SDFs)] were generated by convolving a spike train's delta function (1-ms resolution) with an asymmetric Gamma function kernel (*k* = 1.5 and θ = 20). SDFs are traditionally constructed as a sum of Gaussian functions centered on the times of each discriminated action potential (Szucs, 1998). Gaussian functions, however, exert influence backward in time (Thompson et al., 1996; Isoda and Hikosaka, 2008; Heitz et al., 2010) such that SDFs constructed using a Gaussian kernel occasionally gave spurious results, indicating that torque-evoked responses began prior to torque onset. The Gamma function, which approximates the timecourse of a postsynaptic potential, avoided this problem by exerting influence only forward in time. Mean peri-torque SDFs (averaged across trials) were constructed separately for the two torque directions for neurons studied during at least 5 repetitions of each torque direction. A neuron's baseline firing rate was calculated as the mean of the SDFs across the 500-ms epoch immediately preceding torque pulse onset. Phasic responses to a torque perturbation were detected by comparing SDF values (millisecond-by-millisecond) during a post-torque epoch (200-ms) relative to a cell's baseline firing rate (2-tailed *t*-test). The threshold for significance was adjusted to account for multiple comparisons [*p* < 0.01/(200-ms epoch/40 ms gamma filter half-width = 5 independent comparisons) = 0.002]. A neuron was judged to be torque-related if it generated a significant post-torque response for at least one movement direction. Response onset times, defined as the time at which the SDF first crossed the *p* = 0.002 threshold, were determined separately for each torque direction. For comparisons between neuronal populations, a cell's earliest response onset across directions was used as that cell's latency. 
We analyzed only cortical responses that began at relatively short latencies (20–60 ms), so as to exclude responses related to volitional compensatory movements (Evarts, 1973; Pruszynski et al., 2011b). The time of offset of a neuronal response was calculated using a similar method, searching for the time point at which the SDF first returned within the *p* = 0.002 threshold relative to baseline firing rate.
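
The causal property of the Gamma kernel can be illustrated with a minimal sketch. It assumes that θ = 20 is expressed in milliseconds (the text gives no unit), truncates the kernel at 200 ms, and normalizes it to unit sum; the function names are hypothetical. The last two lines reproduce the multiple-comparison arithmetic given above.

```python
import math


def gamma_kernel(k=1.5, theta_ms=20.0, length_ms=200):
    """Gamma-density kernel sampled at 1-ms steps, normalized to unit sum."""
    raw = [
        (t ** (k - 1.0)) * math.exp(-t / theta_ms) / (math.gamma(k) * theta_ms ** k)
        for t in range(length_ms)
    ]
    total = sum(raw)
    return [v / total for v in raw]


def spike_density(spike_times_ms, duration_ms, kernel):
    """Causal convolution: each spike adds the kernel to bins at and after its own time.

    Returns a spike density function in spikes/s at 1-ms resolution.
    """
    sdf = [0.0] * duration_ms
    for s in spike_times_ms:
        s = int(s)
        if s < 0 or s >= duration_ms:
            continue
        for lag, w in enumerate(kernel):
            if s + lag >= duration_ms:
                break
            sdf[s + lag] += w * 1000.0  # per-ms weight -> spikes/s
    return sdf


kernel = gamma_kernel()
sdf = spike_density([100], duration_ms=400, kernel=kernel)  # single spike at 100 ms

# Corrected significance threshold: 200-ms epoch / 40-ms half-width = 5 comparisons.
n_independent = 200 // 40
alpha = 0.01 / n_independent
```

Because the kernel is zero at lag 0 and undefined for negative lags, a spike at 100 ms contributes nothing to the SDF at or before 100 ms, which is the behavior that prevents apparent responses before torque onset.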

The magnitude of a torque-evoked response was measured using three separate measures. First, the mean firing rate during the response was calculated (i.e., mean of the SDF between times of response onset and offset). Second, we calculated the maximum change of firing rate away from baseline between times of response onset and offset. And third, we calculated the area under the curve in the SDF between times of response onset and offset. The temporal dispersion of a response was measured as its full-width half-maximum (FWHM).
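
A minimal sketch of these four measures, applied to a 1-ms resolution SDF, is given below. The function names are hypothetical, and two details are assumptions: the area is computed on the change from baseline rather than on the raw SDF, and the FWHM is taken around the largest excursion from baseline.

```python
def response_measures(sdf, baseline, onset, offset):
    """Mean rate, maximum change from baseline, and area under the curve
    for samples onset..offset-1 of a 1-ms resolution SDF (spikes/s)."""
    segment = sdf[onset:offset]
    mean_rate = sum(segment) / len(segment)
    max_change = max((v - baseline for v in segment), key=abs)
    # Area of the change from baseline, in spikes (rate x time, 1-ms bins).
    area = sum(v - baseline for v in segment) / 1000.0
    return mean_rate, max_change, area


def fwhm_ms(sdf, baseline):
    """Full width at half maximum of the largest excursion from baseline, in ms."""
    change = [abs(v - baseline) for v in sdf]
    peak = max(change)
    if peak == 0:
        return 0
    i_peak = change.index(peak)
    half = peak / 2.0
    left = i_peak
    while left > 0 and change[left - 1] >= half:
        left -= 1
    right = i_peak
    while right < len(change) - 1 and change[right + 1] >= half:
        right += 1
    return right - left + 1


# Hypothetical SDF (spikes/s) with a 10 spikes/s baseline.
example_sdf = [10, 10, 12, 20, 30, 20, 12, 10, 10]
mean_rate, max_change, area = response_measures(example_sdf, 10, onset=2, offset=7)
width = fwhm_ms(example_sdf, 10)
```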

The directional selectivity of torque-evoked responses was parameterized using a directional selectivity index (DSI) (Suarez et al., 1995): DSI = |1 - (NP/*P*)|, where NP equals the maximum change in firing rate from baseline in the non-preferred direction and *P* equals the maximum change in firing rate in the preferred direction during the post-torque period (200-ms). The preferred direction is defined as the direction that elicits the largest change from baseline firing rate (either positive or negative). By this convention, DSI = 0 meant that the response to torque perturbation was equal for both directions and therefore was not directionally selective (classified as *non-directional*). DSI ≈ 1 meant that the cell response (either increase or decrease in firing) was present for only one movement direction (classified as *unidirectional activity*). A cell's torque response was considered directionally selective if the DSI was >0.50 (i.e., P/2>NP). Responses with DSIs between 0.5 and 1 were defined as *bidirectional* whereas those with DSIs >1 were considered *reciprocal* (changes of opposite sign for opposing directions of movement).
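
The index itself is simple to compute from the signed peak changes in firing rate for the two torque directions. The sketch below is illustrative only: the function name and the example values are hypothetical, and it returns the DSI and the preferred direction without assigning the response categories.

```python
def directional_selectivity(change_flexion, change_extension):
    """Return (DSI, preferred direction) from signed peak changes in firing rate (spikes/s).

    The preferred direction is the one with the larger absolute change from baseline.
    """
    if abs(change_flexion) >= abs(change_extension):
        p, np_, preferred = change_flexion, change_extension, "flexion"
    else:
        p, np_, preferred = change_extension, change_flexion, "extension"
    if p == 0:
        raise ValueError("no response in either direction")
    return abs(1.0 - np_ / p), preferred


# Hypothetical responses (spikes/s change from baseline).
equal = directional_selectivity(20.0, 20.0)       # same response both ways -> DSI 0
one_sided = directional_selectivity(20.0, 0.0)    # response in one direction -> DSI 1
opposite = directional_selectivity(20.0, -10.0)   # opposite signs -> DSI > 1
weaker = directional_selectivity(5.0, 20.0)       # NP = P/4 -> DSI 0.75
```

Note that a DSI above 1 can arise only when the two changes have opposite signs, because the non-preferred change is never larger in magnitude than the preferred one.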

To study MPTP-related changes for different subsets of cells, we corrected for multiple comparisons using the Bonferroni–Holm method (Holm, 1979). Then, two separate analyses were performed to determine if MPTP-induced alterations in neuronal responses correlated with behavioral measures of rigidity. The first analysis tested for relationships across recording sessions. We tested for correlations between mean measures of neuronal responses (magnitudes and latencies) and mean measures of reflex movements for those sessions (behavioral indexes: peak velocity of movements and magnitude of EMG responses). Spearman's rank correlations were performed and the threshold for significance was adjusted to account for multiple comparisons [*p* < 0.05/(4 comparisons) = 0.0125]. The second analysis tested for relationships within individual recording sessions. We tested for correlations between a single neuron's trial-to-trial response latency and an animal's trial-to-trial response to torque perturbations (peak velocity and EMG). Spearman's rank correlations were performed for individual cells and results were compared between populations recorded pre- and post-MPTP.
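
For readers unfamiliar with the Bonferroni–Holm step-down procedure (Holm, 1979), a minimal sketch follows. The function name and the p-values are hypothetical and are not taken from the study.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure; returns booleans (True = reject) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is tested at alpha/m, the next at alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values for four comparisons.
decisions = holm_reject([0.030, 0.001, 0.012, 0.200])
```

In this example the first threshold equals the plain Bonferroni value of 0.05/4 = 0.0125 used for the correlation analyses, and later thresholds are progressively less strict.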

## **RESULTS**

#### **DATABASE**

Single unit recordings were obtained from the arm-related areas of the left M1 of two monkeys. A total of 227 neurons were studied in the neurologically normal state. Of these, 66 were activated antidromically from peduncle stimulation (PTNs: 49 in monkey V and 17 in monkey L; **Table 1**) and 56 were activated from the putamen (CSNs: 31 in monkey V and 25 in monkey L). Of the 232 neurons collected during the post-MPTP period, 65 were PTNs (54 in monkey V and 11 in monkey L) and 58 were CSNs (48 in monkey V and 10 in monkey L). Only 3 cells were activated antidromically from both the putamen and the peduncle (0.5% of neurons studied). These three cells were included in the general "M1" category (i.e., all cells studied including PTNs, CSNs and non-activated cells), but they were excluded from "PTN" and "CSN" categories. The remainder of the neurons (105 pre-MPTP and 109 post-MPTP) were not activated antidromically (NA), but were recorded either at the same time as PT or CS recordings or were sampled within 0.5-mm of an antidromically activated neuron along the same microelectrode track.

Paralleling our previous report (Pasquereau and Turner, 2011), in neurologically-normal animals, resting neuronal firing rates measured immediately prior to torque onset were markedly higher for PTNs (mean rate: 16.3-spikes/s, range: 0.2–45; **Table 1**) than for CSNs (mean rate: 3.6-spikes/s, range: 0–22; Mann–Whitney *U*-test, *p* < 0.001). This result is consistent with previous descriptions of the marked differences in resting firing rates between intra-telencephalic-like CSNs and PTNs (Bauswein et al., 1989; Turner and Delong, 2000).

**Table 1 | Effects of MPTP on two distinct subpopulations of M1 cells.**


*Mean values* ± *SD before and after MPTP treatment are calculated for all M1 cells (left), for PTNs (middle), and for CSNs (right). \*p* < *0.05, \*\*p* < *0.01, \*\*\*p* < *0.001 (Mann–Whitney U-test); and* ##*p* < *0.01 (*χ<sup>2</sup> *test). All statistical results in this table compare results within the indicated category for pre- vs. post-MPTP periods.*

Also consistent with previous reports (Pasquereau and Turner, 2011) the spontaneous activity of the general population of M1 cells decreased by 17% after MPTP treatment (Mann–Whitney *U*-test, *p* < 0.05/2). MPTP had markedly different effects on the two identified neuronal populations. The pre-torque firing rate of PTNs was reduced by 27% following MPTP treatment (Mann–Whitney *U*-test, *p* < 0.001; **Table 1**) whereas the mean activity of CSNs remained unchanged (Mann–Whitney *U*-test, *p* = 0.5). The antidromic latencies of PTNs remained unchanged between pre- and post-MPTP periods (mean = 1.9-ms; Mann–Whitney *U*-test, *p* = 0.9; **Table 1**) whereas those of CSNs were significantly longer following MPTP administration (means = 5.1 and 6.1-ms, pre- and post-MPTP, respectively; Mann–Whitney *U*-test, *p* < 0.05/3).

#### **BEHAVIORAL EFFECTS OF MPTP**

Intracarotid administration of MPTP rendered the animals moderately parkinsonian as evidenced by increased reaction times, decreased movement velocities and reduced movement extents in the behavioral task [for details, see Pasquereau and Turner (2011)]. Monkeys showed limited variation in the severity of these impairments throughout the post-MPTP recording period (maximum 117-days).

Induction of parkinsonism also led to a slowing of torque-induced displacements of the arm (ANOVA; *F* > 515, *p* < 0.001; **Figure 1**) and a reduction in movement amplitude (*F* > 49, *p* < 0.01). More specifically, the mean peak velocity and the amplitude of torque-induced displacements averaged across recording sessions were reduced by 16.2% and 6.6%, respectively, following MPTP administration, consistent with an increase of rigidity in the parkinsonian condition. The latencies of torque-induced displacements remained unmodified post-MPTP (*F* < 0.2 and *p* > 0.05; **Figure 1**), but we found significant differences in the effects of MPTP for the two torque directions (*F* > 94 and *p* < 0.001). Because of this, data were analyzed separately for flexion and extension torque directions.

EMG responses to sudden, unpredictable muscle stretches were analyzed in their separate latency components (**Figure 2**). The short-latency response defined as the *m1* component (muscle activity occurring between 15 ± 3 and 40 ± 4-ms post-torque) was poorly formed and generally much smaller than the long-latency response (*m2*; 41 ± 4 and 80 ± 7-ms post-torque). Following the administration of MPTP, the *m2* component was markedly larger (Kolmogorov two-sample test, *p* < 0.05; **Figure 2**). The total duration of EMG responses to torque perturbations also increased (from 57 ± 6 to 79 ± 3-ms; Kolmogorov two-sample test, *p* < 0.001), primarily due to the appearance of an *m3* component. The *m1* short-latency response remained weak in all cases, and latencies of the earliest EMG responses showed no significant change (*p* > 0.05).

#### **TORQUE-EVOKED NEURONAL RESPONSES – PREVALENCE**

A large fraction of the general population of M1 cells (62%, 284/459, of all CSNs, PTNs, and NA cells) responded to torque perturbations with a phasic response at short latency (<60-ms, **Figure 3A**). Nearly half of these torque-related cells (124/284, 44%) responded for only one direction of rotation (flexion or extension). **Figures 3B,C** illustrate examples of neuronal responses to torques: a CSN that responded to flexion torques only **(3B)** and a PTN that responded to both flexion and extension torques **(3C)**. Monophasic increases in firing constituted a large fraction of the torque response in M1 (87%) whereas torque-evoked decreases in firing were observed infrequently (13% of all M1 neurons; **Table 1**). Consistent with previous observations (Turner and Delong, 2000), PTNs were more responsive to proprioceptive stimulation than CSNs. Torque responses were observed in 68% (89/131) of PTNs but only 31% (35/114) of CSNs (χ<sup>2</sup> = 32.3, *p* < 0.001; **Table 1** and **Figure 3A**).
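
The PTN versus CSN comparison can be checked from the reported counts with a 2 × 2 χ<sup>2</sup> test. The text does not say whether a continuity correction was applied; the sketch below assumes Yates' correction, which gives a statistic close to the reported value of 32.3 (the uncorrected statistic is somewhat larger). The function name is hypothetical.

```python
def chi2_2x2_yates(a, b, c, d):
    """Chi-square statistic with Yates' continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    num = n * max(abs(a * d - b * c) - n / 2.0, 0.0) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den


# Responsive vs. total cell counts reported in the text.
ptn_resp, ptn_total = 89, 131
csn_resp, csn_total = 35, 114
chi2 = chi2_2x2_yates(
    ptn_resp, ptn_total - ptn_resp,
    csn_resp, csn_total - csn_resp,
)
```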

#### **TORQUE-EVOKED NEURONAL RESPONSES – PRE- vs. POST-MPTP**


Contrary to predictions from a simple model for exaggerated LLSRs in PD (see Introduction), induction of parkinsonism did not increase the prevalence of torque responses in the general population of M1 cells (62% for both states), in CSNs (26%), or in PTNs (68%; χ<sup>2</sup> < 6.4, *p* > 0.05/3; **Figure 3A**, **Table 1**). These results are broken down in a more fine-grained manner for the two monkeys in the Supplementary Table. The small number of proprioceptive-responsive CSNs studied post-MPTP prevented more in-depth analysis of response timing in this neuronal type.

Induction of parkinsonism also did not alter the balance of torque-elicited increases and decreases in firing. For PTNs, most torque-evoked changes in activity consisted of an increase in firing (72% of responses; 97 of 134 responses counting responses to flexions and extensions separately; **Figure 4A** and **Table 1**). That high prevalence of torque-evoked increases in firing did not differ between pre- and post-MPTP periods (χ<sup>2</sup> = 0.1, *p* = 0.7; **Table 1**). The magnitudes of torque-evoked responses also did not differ between pre- and post-MPTP periods. Response magnitudes were compared for M1 cells, and again for PTNs separately, using three different measures of magnitude: (1) the mean change of firing rate relative to pre-torque baseline, (2) the maximum change of firing rate, and (3) the area under the curve in the SDF. None of these comparisons yielded a significant difference between pre- and post-MPTP periods (Kolmogorov two-sample test and Mann–Whitney *U*-test, all *p*'s > 0.15; **Figure 4A**). To control for the possibility that these results were biased by the MPTP-induced global slowing of torque-induced displacements of the arm (see *Behavioral effects of MPTP*, above), we selected subsets of recording sessions from pre- and post-MPTP sessions in which velocities were equivalent (range: 130–200 deg/s, Mann–Whitney *U*-test, *p* > 0.05; Supplementary Figure 1). We confirmed our main result in this subset by observing that MPTP did not change the magnitude of torque-evoked responses in M1 (Mann–Whitney *U*-test, all *p*'s > 0.19).

**Figure 3 |** … extension) did not change following MPTP administration for the general population of M1 cells (top), for CSNs (middle), or for PTNs (bottom) (χ<sup>2</sup> test, *p* > 0.05/3). **(B,C)** Representative short latency responses to torque perturbations for CSNs **(B)** and PTNs **(C)**. The torque responses of CSNs tended to be small in magnitude and in one direction only, whereas the … flexing or extending torques (*vertical black lines*). The *vertical gray dotted lines* show neuronal response latencies detected using the significance threshold computed from the pre-torque period (*horizontal dotted lines*, *p* < 0.01/(5 comparisons) = 0.002). *Inset figures* in each panel illustrate antidromic activation and collision tests for the neuron that has its torque response shown.

The latencies of torque-evoked responses, however, were significantly shorter following MPTP administration. This observation held true for the general population of M1 neurons and specifically for PTNs, and for responses to flexions and extensions, all considered separately (Kolmogorov two-sample test, all *p*'s < 0.05; **Figure 4B**). For PTNs, the mean latency of responses was reduced by 17% from a pre-MPTP value of 46-ms (45-ms for flexions, and 46-ms for extensions) to a post-MPTP value of 38-ms (37-ms for flexions, and 39-ms for extensions). Similarly, for the general population of M1 neurons, mean response latencies were reduced by 15% from 41-ms pre-MPTP (40-ms for flexions, and 42-ms for extensions) to 35-ms post-MPTP (36-ms for flexions, and 35-ms for extensions; Kolmogorov two-sample test, *p* < 0.01). M1 response latencies were reduced by similar degrees for flexion and extension directions (Kolmogorov two-sample test, *p* = 0.52; **Figure 4B**). This latency shortening effect could not be attributed to history effects (e.g., increasing experience by the animal or cumulative recording tracks) because we found no correlation between the latencies of torque-evoked responses and the days of training an animal experienced (monkey V: Spearman |rho| = 0.02, *p* = 0.85; monkey L: Spearman |rho| = 0.08, *p* = 0.68).
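
The Kolmogorov two-sample comparisons of latency distributions rest on the largest vertical gap between two empirical cumulative distributions. The sketch below computes only that statistic (not a p-value); the function name and the latency values are hypothetical and are not data from the study.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical response latencies (ms) before and after treatment.
pre = [38, 42, 45, 47, 50, 52]
post = [30, 33, 36, 38, 41, 44]
d = ks_statistic(pre, post)
```

Because the statistic is sensitive to any difference between the cumulative distributions, it suits a leftward shift of the whole latency distribution of the kind described above.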

In addition to the shift in latencies, torque-evoked increases in PTN activity were more tightly synchronized or time-locked to the torque perturbation following MPTP (**Figure 4A**). For these responses, the response FWHM was reduced by 31% from a mean of 73-ms (80-ms for flexion, and 66-ms for extension) pre-MPTP to 50-ms (54-ms for flexion, and 46-ms for extension) post-MPTP (Mann–Whitney *U*-test, *p*'s < 0.05).

**Figure 4 | … perturbations.** The latency of response shifted significantly in PTNs or in the general population of M1 cells, although its magnitude remained unchanged between MPTP periods. **(A)** Population mean spike density functions (SDFs) averaged across all positive (top) or negative (bottom) phasic responses in PTNs for flexion or extension movements (left and right panels, respectively). Gray shadings and vertical gray lines indicate ± *SE* and the time of torque onset (time 0), respectively. Color plot of all … polarity of responses. Colors along each horizontal band indicate the significant changes in firing rate of one PTN induced by torque perturbation (red-yellow = increases; blue-cyan = decreases; firing rate scale in the bar). Black = no significant change in a SDF. **(B)** Following MPTP, the cumulative distributions for M1 (left) or for PTNs (right) showed a reduction of response latency to torque perturbation (Kolmogorov two-sample test; ∗∗∗*p* < 0.001, ∗∗*p* < 0.01, ∗*p* < 0.05).

#### **RESPONSE DIRECTIONALITY – PRE- vs. POST-MPTP**

In the general population of M1 cells, the directional selectivity of torque responses was reduced following MPTP. Prior to MPTP, a large proportion of the torque-responsive cells in M1 responded differently for the two directions of torque perturbation (**Figure 5**; 143/231, 62% of all torque-responsive cells). By classifying the torque-evoked responses of M1 neurons according to their directional selectivity indices (DSI, *Methods*), it became clear that highly directional (reciprocal and unidirectional) response types became less common following MPTP administration (10% reduction in incidence), and bi-directional or non-directional response types became more common (10% increase in incidence; χ<sup>2</sup> = 22.2, *p* < 0.001; ### in **Figure 6**). Non-significant reductions (*p* > 0.05/2, likely due to small N) in the incidence of highly directional response patterns were observed in CSN and PTN sub-populations. This general decrease in directional selectivity could not be attributed directly to MPTP-induced changes in resting firing rates because pre-torque mean firing rates did not differ between the different directional categories (One-Way ANOVA, *F* = 0.77 and *p* = 0.46). In addition, EMG responses to torque perturbations did not show alterations in directional selectivity or sign of co-contraction between antagonist muscles (Supplementary Figure 2).

#### **ANALYSIS OF NEURAL-BEHAVIORAL CORRELATIONS**

Although torque-evoked cortical responses, movement velocity, and the *m2* component of EMG reflex activity were all affected by MPTP administration, we found no correlation between the modifications in M1 responses and variations in the LLSR. That question was addressed first by testing for correlations between session-by-session measures of the mean PTN response latency and mean measures of the LLSR (i.e., peak velocity of movements, and magnitude of the *m2* muscle component). This analysis was restricted to the relatively homogenous population of PTNs because their activity was known to be affected by MPTP administration. No significant correlation was found for either analysis (Velmax, Spearman |rho| < 0.18, *p* > 0.05; or *m2*max, |rho| < 0.26, *p* > 0.05). A second approach searched for correlations trial-by-trial between torque-evoked neuronal responses and measures of the LLSR (Velmax or *m2*max). Measures of PTN response latencies did not correlate with any of the measures of LLSR (Spearman |rho| < 0.34; *p* > 0.05). Similar negative results were obtained from correlation analyses of the general population of M1 cells (results not shown).

## **DISCUSSION**

The present study addresses a long-standing but untested hypothesis for why the long-latency component of the stretch reflex is exaggerated in Parkinson's disease. Despite the fact that exaggeration of the LLSR is a significant contributor to parkinsonian rigidity (Lee and Tatton, 1975; Mortimer and Webster, 1979; Tatton et al., 1984; Xia et al., 2009), the CNS correlates of the parkinsonian stretch reflex have seldom been examined (Rossini et al., 1989; Aminoff et al., 1997) and never, to our knowledge, using single unit recording. One straightforward hypothesis states that the LLSR is exaggerated in PD because of increased corticospinal sensitivity to muscle stretch (Rothwell et al., 1983). Alternative hypotheses implicate increased gain in slow-conducting spinal reflex pathways (Berardelli et al., 1983). We tested the transcortical hypothesis by measuring the stretch-evoked responses of identified sub-populations of M1 neurons prior to and following the induction of parkinsonism. Contrary to the predictions of the trans-cortical model, we did not observe an increase in M1 responsiveness to proprioceptive stimuli. Rather, the MPTP-induced increase in rigidity and LLSR was associated with shortened M1 response latency and a reduction in directional selectivity. The sections below address each of these findings in turn.

#### **ABSENCE OF INCREASED RESPONSE PREVALENCE OR MAGNITUDE**

We found no evidence that the induction of parkinsonism led to a greater incidence of neuronal responses to proprioceptive stimuli in the population of M1 neurons, an increase in the balance of torque-evoked increases over decreases, or an increase in the magnitude of torque-evoked responses in individual M1 neurons. These observations run contrary to predictions of the simple trans-cortical model for the exaggerated LLSR of PD. In their key study of the LLSR in PD patients, Rothwell et al. stated that "if the stretch reflex excitability is indeed enhanced in PD, the gain must be increased at some central site" (Rothwell et al., 1983, p. 35). The M1, and its population of PTNs in particular, is one of the few central nodes in the circuit that mediates the LLSR.

A straightforward interpretation of Rothwell et al.'s statement predicts that neurons in this circuit will show an increased prevalence of stretch-evoked neuronal responses or increased neuronal response magnitudes. We found no evidence of either in the activity of single units in M1.

**Figure 6 |** … further limited the statistical analysis. χ<sup>2</sup> test; ###*p* < 0.001.

Although inconsistent with the simple trans-cortical hypothesis, this result agrees with the abundant evidence that somatosensory-evoked responses are not enhanced in the sensorimotor cortex of parkinsonian subjects (Rossini et al., 1989; Aminoff et al., 1997; Rickards and Cody, 1997; Boecker et al., 1999; Lewis and Byblow, 2002; Schrader et al., 2008). The classical rate model of PD pathophysiology also proposes that excessive inhibitory outflow from the parkinsonian BG attenuates thalamocortical responses (Delong and Wichmann, 2007). Despite the limitations of the rate model (Montgomery, 2007), evidence continues to support the model's prediction that parkinsonism is associated with a reduction in M1 excitability (Lefaucheur, 2005; Brown et al., 2009; Pasquereau and Turner, 2011; Viaro et al., 2011; Leon-Sarmiento et al., 2013). From this perspective, it might be seen as surprising that we did not observe more profound *decreases* in the prevalence or magnitude of M1 responses to proprioceptive stimulation. Our failure to observe an attenuation of proprioceptive responsiveness, as predicted by indirect measures of brain activity (Rossini et al., 1989; Aminoff et al., 1997; Rickards and Cody, 1997; Boecker et al., 1999; Lewis and Byblow, 2002; Schrader et al., 2008), may be explained by the idea that M1 is organized into dissociable sub-circuits that are affected differently with the induction of parkinsonism (Shepherd, 2013).

Numerous studies have reported that neurons in BG structures and in the motor thalamus of parkinsonian animals show marked increases in the prevalence and magnitude of proprioceptive responses (Filion et al., 1988; Bergman et al., 1994; Pessiglione et al., 2005; Bronfeld and Bar-Gad, 2011). Our results suggest that this exaggerated proprioceptive responsiveness in subcortical structures is not relayed to cortex in any simple way. This result is congruent with other lines of evidence that indicate that the BG-thalamo-cortical pathway does not operate as a simple driver of cortical activity (Inase et al., 1996; Rubin and Terman, 2004; Kuramoto et al., 2009; Goldberg and Fee, 2012).

Our observation of an elevated LLSR (**Figure 2**) combined with no change in prevalence or magnitude of stretch-evoked neuronal responses in PTNs is consistent with the view that the alteration in reflex function that mediates the exaggerated LLSR of PD is localized to the spinal cord (Simonetta Moreau et al., 2002; Marchand-Pauvert et al., 2011; Raoul et al., 2012). Several abnormalities in segmental function have been observed in PD and there is no consensus as to which of them contribute most significantly to the LLSR or rigidity. Some studies implicate slow-conducting group II spindle afferents (Cody et al., 1986; Marchand-Pauvert et al., 2011). Others suggest impairments in inhibition (Tsai et al., 1997; Meunier et al., 2000), perhaps via underactivation of the IB interneuron (Delwaide et al., 1991), or, alternatively, an abnormality in homosynaptic depression (Raoul et al., 2012). For all of these, dysfunction within the BG may affect segmental motor function via descending BG projections to the pedunculopontine nucleus and from there to spinal cord-projecting brainstem nuclei (Delwaide et al., 2000). Degeneration of the noradrenergic projection to the spinal cord may also play an important role (Simonetta Moreau et al., 2002).

#### **ALTERED RESPONSE LATENCIES AND DIRECTIONALITY**

The stretch-evoked neuronal responses in M1 began at shorter latencies following MPTP administration (**Figure 4B**). The fact that very similar latency shifts were observed for responses to torque perturbations in the flexion and extension directions suggests that the latency shift was a primary effect of parkinsonism and not a by-product of altered postural bias or background muscle activity. The mechanisms that might mediate such a shift in response latency remain unclear as do its functional implications.

We also found that the directional specificity of neuronal responses was reduced following MPTP administration (**Figure 6**). Similar reductions in somatosensory specificity have been reported for neuronal activity in BG structures and in the thalamus of parkinsonian subjects (Filion et al., 1988; Bergman et al., 1994; Pessiglione et al., 2005). Our observation of reduced encoding of movement direction in M1 is consistent with the general concept that a reduction in functional specificity may be an important component of the pathophysiology of PD (Bronfeld and Bar-Gad, 2011). More specifically, the increased prevalence of non-directional and bidirectional responses may contribute to the genesis of EMG "shortening reactions," which are abnormal muscle reflex responses to passive joint rotation that appear in the muscles that are shortened by the rotation (Andrews et al., 1972; Berardelli and Hallett, 1984). Recent evidence implicates the shortening reaction in the lead-pipe nature of parkinsonian rigidity (Xia et al., 2009, 2011). The deficient directional specificity observed in M1 may contribute to the genesis of shortening reactions by producing corticospinal excitation of segmental circuit elements that should, under normal conditions, be suppressed (e.g., in the motor neuron pool of the shortened muscle). A similar mechanism may account for the more transient responses observed following MPTP administration (**Figure 4**).

#### **METHODOLOGIC CONSIDERATIONS**

This is the first study to our knowledge to examine single unit activity in M1 related to the exaggerated LLSR of parkinsonism. We examined the torque-evoked responses of M1 neurons while our research subjects were engaged in a behavioral task that required postural stabilization. This approach allowed us to maintain relative control over the behavioral state of our animal subjects across the induction of moderate parkinsonism. We used the well-established non-human primate MPTP model of PD (Bankiewicz et al., 1986, 2001) and we studied the activity of two distinct sub-populations of cortical neurons, PTNs and CSNs, as identified by antidromic activation. The divergent effects of MPTP intoxication on the two neuronal populations are consistent with the view that the two play different roles in the pathophysiology of PD (Shepherd, 2013).

It is important to recognize several limitations of the methodology used. First, we did not pre-load the muscle to be stretched or control for the level of muscle activity prior to delivery of torque perturbations, as is often done in studies of the stretch reflex (Rothwell et al., 1983). It is unlikely that this represents a serious confound for the principal results, however, because the trans-cortical component of the LLSR appears to be insensitive to the pre-perturbation level of muscle activity (Pruszynski et al., 2011b).

Second, the contribution of the trans-cortical pathway to the LLSR appears to vary between effectors, being maximal for finger muscles (Marsden et al., 1983; Noth et al., 1991) and of less importance for muscles around more proximal joints such as the wrist and elbow (Berardelli et al., 1983; Cody et al., 1986). We might have obtained different results if we had studied, for example, the effects in M1 of proprioceptive perturbations delivered to intrinsic muscles of the hand.

Third, the peak velocity and amplitude of torque-evoked joint movements were significantly smaller following MPTP administration (**Figure 1**). The magnitude of the LLSR is known to be modulated in proportion to the velocity of muscle stretches (Rothwell et al., 1983; Powell et al., 2012). The stretch-evoked neuronal responses in M1 might have been more numerous or larger in magnitude if the perturbation kinematics had been identical pre- and post-MPTP. However, the difference in kinematics is unlikely to invalidate our main results. Even when data were selected from subsets of recording sessions that had equivalent velocities, we found that response magnitudes were very similar in the pre- and post-MPTP periods (Supplementary Figure 1).

Fourth, different results might have been observed if we had induced more severe parkinsonian symptoms or bilateral parkinsonism. Although our animals showed clear behavioral and histologic signs of moderate parkinsonism (Pasquereau and Turner, 2011), the symptoms of our animals were not as severe as those rendered by other MPTP intoxication protocols (Bankiewicz et al., 2001; Emborg, 2007).

## **CONCLUSIONS**

In summary, our results are not consistent with the idea that the exaggerated LLSR of PD is mediated by an increase in transcortical reflex gain. The most likely alternative is that reflex gain is abnormally increased in slow-conducting segmental pathways, such as those driven by group II spindle afferents (Cody et al., 1986; Marchand-Pauvert et al., 2011). The reduced directional specificity of M1 responses to muscle stretch provides additional evidence for the general breakdown in functional specificity in parkinsonism. This breakdown in directional specificity may contribute to the abnormal shortening reactions that contribute to parkinsonian rigidity.

## **FUNDING**

This work was supported by the National Institute of Neurological Disorders and Stroke at the National Institutes of Health (grant numbers NS044551 and NS055197 to Robert S. Turner and the Center for Neuroscience Research in Non-human Primates, 1P30NS076405).

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnsys.2013.00098/abstract

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 August 2013; accepted: 11 November 2013; published online: 26 November 2013.*

*Citation: Pasquereau B and Turner RS (2013) Primary motor cortex of the parkinsonian monkey: altered neuronal responses to muscle stretch. Front. Syst. Neurosci. 7:98. doi: 10.3389/fnsys.2013.00098*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Pasquereau and Turner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Subthalamic nucleus long-range synchronization—an independent hallmark of human Parkinson's disease

## *Shay Moshel<sup>1,2,3,4</sup>\*†, Reuben R. Shamir<sup>1,3</sup>†, Aeyal Raz<sup>5</sup>, Fernando R. de Noriega<sup>6</sup>, Renana Eitan<sup>7</sup>, Hagai Bergman<sup>1,2,3</sup> and Zvi Israel<sup>6</sup>*

*<sup>1</sup> Department of Medical Neurobiology, IMRIC, The Hebrew University-Hadassah Medical School, Jerusalem, Israel*

*<sup>2</sup> The Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem, Israel*

*<sup>3</sup> The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel*

*<sup>4</sup> The Research Laboratory of Brain Imaging and Stimulation, The Jerusalem Mental Health Center, Kfar-Shaul Etanim, Hebrew University-Hadassah Medical School, Jerusalem, Israel*

*<sup>5</sup> Department of Anesthesiology, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA*

*<sup>6</sup> Department of Neurosurgery, Center for Functional and Restorative Neurosurgery, Hadassah University Hospital, Jerusalem, Israel*

*<sup>7</sup> Department of Psychiatry, Hadassah University Hospital, Jerusalem, Israel*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Thomas Wichmann, Emory University School of Medicine, USA; Peter Brown, University of Oxford, UK*

#### *\*Correspondence:*

*Shay Moshel, The Interdisciplinary Center for Neural Computation and the Edmond and Lily Safra Center for Brain Sciences and the Department of Medical Neurobiology, The Hebrew University - Hadassah Medical School, Building 3, PO Box 12272, Jerusalem 91120, Israel e-mail: shaymoshel@gmail.com*

*†These authors have contributed equally to this work.*

Beta-band synchronous oscillations in the dorsolateral region of the subthalamic nucleus (STN) of human patients with Parkinson's disease (PD) have been frequently reported. However, the correlation between STN oscillations and synchronization has not been thoroughly explored. We analyzed 2390 multi-unit pairs recorded simultaneously by two parallel microelectrodes (separated by a fixed distance of 2 mm; *n* = 72 trajectories, each with two electrode tracks spanning >4 mm of STN) in 57 PD patients undergoing STN deep brain stimulation surgery. Automatic procedures were utilized to divide the STN into dorsolateral oscillatory and ventromedial non-oscillatory regions, and to quantify the intensity of STN oscillations and synchronicity. Finally, the synchronicity of simultaneously vs. non-simultaneously recorded pairs was compared using a shuffling procedure. Synchronization was observed predominantly in the beta range and only between multi-unit pairs in the dorsolateral oscillatory region (*n* = 615). No synchronization was observed in dorsolateral-ventromedial pairs (*n* = 548) or in ventromedial-ventromedial pairs (*n* = 1227). Oscillation and synchronicity intensity declined along the STN dorsolateral-ventromedial axis, suggesting a fuzzy border between the STN regions. Synchronization strength was significantly correlated with the oscillation power, but synchronization was no longer observed following shuffling. We conclude that STN long-range beta oscillatory synchronization is due to increased neuronal coupling in the Parkinsonian brain and does not merely reflect the outcome of oscillations at similar frequency. The neural synchronization in the dorsolateral (probably the motor domain) STN probably augments the pathological changes in firing rate and patterns of subthalamic neurons in PD patients.

**Keywords: Parkinson's disease, subthalamic nucleus, deep brain stimulation, oscillations, synchronization**

#### **INTRODUCTION**

The subthalamic nucleus (STN) plays a critical role in the control of basal ganglia activity (Kitai and Kita, 1987; Nambu et al., 2002). In Parkinson's disease (PD), midbrain dopaminergic neurons degenerate, leading to a cascade of physiological changes that strongly affect the STN (Bergman et al., 1994; Hamani et al., 2004). Inactivation (Bergman et al., 1990; Aziz et al., 1991; Alvarez et al., 2009) and deep brain stimulation (DBS, Benazzouz et al., 1993; Pollak et al., 1993; Benabid et al., 1994; Weaver et al., 2009; Follett et al., 2010; Moro et al., 2010; Williams et al., 2010; Bronstein et al., 2011; Lhommée et al., 2012; Odekerken et al., 2013; Schuepbach et al., 2013) of the STN are highly effective in the management of advanced PD.

Neuronal oscillations, at the level of action-potential (spike) discharge (Rodriguez-Oroz et al., 2001; Kuhn et al., 2005; Moran et al., 2008; Zaidel et al., 2010; Guo et al., 2012; Lourens et al., 2013) and local field potential (LFP; Kuhn et al., 2009; Chen et al., 2010; Giannicola et al., 2010; Rosa et al., 2011), have been observed in physiological studies of the STN of PD patients undergoing DBS surgery. LFPs span the frequency range of 1–70 Hz [or 1–400 Hz, if one includes the high gamma peaks reported at 65–90 Hz and 250–350 Hz (Danish et al., 2007), but see (Yuval-Greenberg et al., 2008) for possible confounding factors in the high frequency regime of LFP], whereas spikes have their maximal power around 1000 Hz. Thus, although LFP oscillations have been thought to imply spike synchronization (Brown and Williams, 2005; Hammond et al., 2007; de-Solages et al., 2011), they more likely represent sub-threshold phenomena such as synaptic activity (Belitski et al., 2010; Buzsaki et al., 2012), which is probably correlated with spike activity.

Conclusive evidence of the correlation (and causality) between neuronal oscillations and synchronization in the PD STN has remained elusive. Physiological studies of neuronal synchronization in the STN of the MPTP primate model have not yet been reported. Robust oscillatory synchronization patterns of STN spiking activity have been reported in the 6-hydroxydopamine rodent model of Parkinsonism (Machado et al., 2006; Mallet et al., 2008a,b; Lintas et al., 2012). In human PD patients, oscillatory synchronization of spiking activity has been reported in several studies (Levy et al., 2000, 2002a,b; Amirnovin et al., 2004; Weinberger et al., 2006; Hanson et al., 2012; Alavi et al., 2013; Lourens et al., 2013), but there have been no detailed descriptions of the dependence of the neuronal synchronization on the oscillatory activity or the spatial properties of the neuronal pairs (e.g., simultaneous recording of neurons from the oscillatory and non-oscillatory regions of the STN, see below).

Previous studies have shown that the STN of PD patients can be divided into a dorso-lateral oscillatory region (DLOR) and ventro-medial non-oscillatory region (VMNR) (Moran et al., 2008; Zaidel et al., 2010; Seifried et al., 2012; Guo et al., 2013). The first aim of this study was to explore the properties of neuronal (spike) synchronization of the STN of human PD patients, principally within and between the different STN domains. The second goal was to further explore the relationship between oscillations and synchronization phenomena in the neural activity of the STN.

To overcome the inherent technical difficulties of spike isolation (Joshua et al., 2007; Hill et al., 2011) and spike sorting (Lewicki, 1998) in the electrically noisy environment of the human operating room, and to increase the sensitivity of correlation analysis (Bedenbaugh and Gerstein, 1997; Gerstein, 2000) this study used the unresolved collective (multi-unit) spiking activity recorded by two different microelectrodes exploring the boundaries and the domains of the STN during DBS procedures. This enabled the exploration of the properties of long-range correlation in the STN, in contrast to correlation studies of the activity recorded by a single electrode (e.g., Moran et al., 2008) which can only reveal short range correlations.

## **MATERIALS AND METHODS**

## **PATIENTS AND SURGERY**

Simultaneous microelectrode recordings from two electrodes in patients with Parkinson's disease (PD) undergoing surgery for subthalamic nucleus (STN) deep brain stimulation (DBS) were analyzed in this study. All patients met accepted criteria for STN DBS and signed informed consent for surgery. Microelectrode recording is performed to accurately localize STN borders and domains, in order to optimize the placement of the DBS electrode and thus enhance the therapeutic effects of the DBS procedure. The data collection was therefore done as part of our routine procedures and not part of a clinical trial. This study was authorized and approved by the Institutional Review Board of Hadassah University Hospital in accordance with the Helsinki Declaration (reference codes: 0545-08-HMO and HMO: 10-18.01.08).

Surgery was performed using a CRW stereotactic frame (Radionics, Burlington, MA, USA). STN target coordinates were chosen as a composite of the indirect anterior commissure-posterior commissure (AC-PC) atlas-based location and direct (1.5 or 3 Tesla) T2 magnetic resonance imaging (MRI), using Framelink 4 or 5 software (Medtronic, Minneapolis, USA). The recordings used in this study were made while the patients were awake without sedation. The patient's level of awareness was continuously assessed clinically and, if drowsy, the patient was stimulated and awoken through conversation by a member of the surgical team. Data were obtained while the patients were off dopaminergic medication, which was stopped 12 h prior to surgery.

### **MICROELECTRODE RECORDINGS**

Data were acquired with the MicroGuide system (Alpha-Omega Engineering, Nazareth, Israel). Neurophysiological activity was recorded using polyamide-coated tungsten microelectrodes (Alpha Omega) with an impedance of 0.60 ± 0.11 MΩ (mean ± standard deviation, *SD*; measured at 1 kHz at the beginning of each trajectory). The signal was amplified by 10,000, band-pass filtered from 250 to 6000 Hz using four-pole Butterworth filter hardware, and sampled at 48 kHz by a 12-bit A/D converter (using ±5 V input range). Local field potentials were not recorded due to constraints of electrical noise in the operating room.

Microelectrode recording was performed using two parallel microelectrodes starting 10 mm above the estimated center of the dorsolateral STN target, based on the pre-operative T2 MRI image. The two electrodes were simultaneously advanced, and therefore the distance between the two electrodes was fixed (2 mm) during all recordings. Trajectories followed a double-oblique approach (approximately 60° from the axial AC-PC plane and 15° from the mid-sagittal plane) toward the STN target. The angles of the trajectory were slightly modified to avoid the cortical sulci, the ventricles and major blood vessels as revealed by gadolinium-enhanced T1 MRI (Machado et al., 2006). The "central" electrode was directed at the center of the STN target, and an "anterior" (ventral) electrode was located 2 mm anterior to the central electrode. Typically, the electrodes were advanced in steps of ∼100 μm between successive recording sites within the STN. Only trajectories where both electrodes had passed through the STN for at least 4 mm were used in this study (yielding 72 trajectories of 2 electrodes from 57 PD patients undergoing bilateral STN deep brain stimulation surgery). After identification of the STN ventral border by the electro-physiologist, the STN and its sub-regions were automatically detected using the Hidden Markov model (HMM) method (Zaidel et al., 2009).

## **DATABASE**

We studied 72 STN trajectories (each of 2 electrodes) from 57 PD patients, 40 males and 17 females, aged 58.9 ± 10.3 years (mean ± standard deviation, *SD*) and with disease duration of 10.3 ± 4.7 years (mean ± *SD*). The UPDRS motor part score, UPDRS III, was 49.2 ± 17.8 (mean ± *SD*) when assessed off dopamine replacement therapy before surgery. Patient details and clinical effects of the surgery are given in **Table 1**.

The minimal recording duration for an STN pair to be included in this study was 5 s (analysis of the subset of recordings with a minimal duration of 10 s revealed similar results, data not shown). A total of 2390 multi-unit pair sites, in which both electrodes were judged to be inside the STN for the minimal duration, were studied. The same database was used for the single site (oscillation) analysis, yielding 4780 single STN sites. Recording (and analysis) time duration of the STN pairs equaled 23.7 ± 25.3 s (mean ± *SD*).

#### **ANALYSIS OF SYNCHRONIZATION AND OSCILLATIONS**

All data analysis utilized custom-made MATLAB 7.10b (R2010.b) routines. The local field potential frequency domain was filtered out by the recording apparatus. Burst frequencies below the range of the operating room band-pass filter (250–6000 Hz) could be detected using the rectified signal, which follows the envelope of multi-unit activity (Moran et al., 2008; Halliday and Farmer, 2010; Moran and Bar-Gad, 2010; Zaidel et al., 2010). The raw 250–6000 Hz analog signal was therefore rectified by the "absolute" operator and the global mean was subtracted. Thus, the resulting analysis represents only spike activity.
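The rectification step described above can be illustrated with a minimal sketch. It is not the original MATLAB routine; the function name `rectify_and_demean` and the toy trace are hypothetical, and the input is assumed to be an already band-passed (250–6000 Hz) signal.

```python
from statistics import fmean

def rectify_and_demean(samples):
    """Full-wave rectify a band-passed trace, then subtract its global mean.

    Hypothetical helper: `samples` stands for the 250-6000 Hz analog signal.
    """
    rectified = [abs(x) for x in samples]      # the "absolute" operator
    mean = fmean(rectified)                    # global mean of the rectified trace
    return [x - mean for x in rectified]

# Toy trace in arbitrary units (illustration only, not recorded data).
trace = [0.5, -1.5, 2.0, -1.0, 0.0]
envelope = rectify_and_demean(trace)
```

The resulting zero-mean trace follows the envelope of the multi-unit activity, so slow burst frequencies can appear in its spectrum even though they lie below the hardware pass band.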

The average power spectrum density (PSD) at each site was calculated using Welch's method with a 1.5 s Hamming window (50% overlap), after removing the local window mean, and with 131,072 FFT points (nfft), yielding a spectral resolution of 1/3 Hz [$\mathrm{nfft} = 2^{\mathrm{round}(\log_2(F_s/f_{res}))}$, where $F_s$ is the sampling frequency and $f_{res}$ is the spectral resolution]. PSD amplitude is affected by the amplitude of the recorded neural activity, which is impacted by non-physiological factors such as the impedance of the electrode (Zaidel et al., 2010). To create homogenous PSD results for all recorded sites, the "relative" (normalized) power spectral density was calculated by dividing the PSD by the total power of the signal between 0 and 3000 Hz. This relative, or normalized, power spectral density therefore estimates the spectral peak in relation to the other peaks in the spectrogram.
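As a worked check of the FFT-length formula and the normalization, the following sketch evaluates nfft for the stated sampling rate (48 kHz) and target resolution (1/3 Hz). The function names are hypothetical, and the normalization simply sums PSD bins between 0 and 3000 Hz, which is one plausible reading of "total power".

```python
import math

def nfft_for_resolution(fs_hz, f_res_hz):
    """Nearest power-of-two FFT length for a target spectral resolution."""
    return 2 ** round(math.log2(fs_hz / f_res_hz))

def relative_psd(psd, freqs_hz, f_max_hz=3000.0):
    """Divide each PSD bin by the summed power between 0 and f_max_hz."""
    total = sum(p for p, f in zip(psd, freqs_hz) if 0.0 <= f <= f_max_hz)
    return [p / total for p in psd]

fs = 48_000.0
nfft = nfft_for_resolution(fs, 1.0 / 3.0)   # 2**17 = 131072 points
bin_width = fs / nfft                       # about 0.366 Hz per bin

# Toy spectrum (illustration only): three bins at 0, 1500 and 3000 Hz.
normalized = relative_psd([1.0, 1.0, 2.0], [0.0, 1500.0, 3000.0])
```

Because nfft is rounded to a power of two, the bin spacing is about 0.366 Hz, close to the nominal 1/3 Hz quoted in the text.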

To compute coherence, the magnitude squared (MS-) coherence method (Kay, 1988; Miller and Sigvardt, 1998) was used. Welch's method was utilized, with a 1.5 s Hamming window (50% overlap), after removing the local window mean and with a spectral resolution of 1/3 Hz (same conventions as for PSD). Coherence values are limited (by definition) between 0 and 1. All coherence averages were therefore calculated in Fisher's transform domain (Miranda de Sa et al., 2009) and then reversed.
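Averaging in the Fisher domain can be sketched as follows. The text does not give the exact form of the transform, so this example assumes the common variance-stabilizing choice z = atanh(sqrt(C)) applied to the magnitude-squared coherence C, with the inverse applied to the mean; `fisher_average` is a hypothetical name.

```python
import math

def fisher_average(coherences):
    """Average MS-coherence values in the Fisher (atanh) domain.

    Assumption: z = atanh(sqrt(C)); values must satisfy 0 <= C < 1,
    since atanh is undefined at 1.
    """
    z = [math.atanh(math.sqrt(c)) for c in coherences]
    z_mean = sum(z) / len(z)
    return math.tanh(z_mean) ** 2              # back to the coherence domain

pooled = fisher_average([0.0, 0.81])
```

The motivation for such a transform is that coherence is bounded between 0 and 1, so a plain arithmetic mean treats values near the bounds asymmetrically.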

By definition, the removal of each window mean in the spectrum and the coherence analysis eliminates any power at 0 Hz (DC). We therefore start all spectrum and coherence plots in this manuscript at 1 Hz.

A constant baseline level emerged in our coherence results (e.g., **Figures 4A,B**). This baseline probably resulted from the finite sampling of two "random noise" sources. To verify this, pairs of Gaussian random noise sources were simulated. The simulated data were subjected to the same filters and absolute operator as the real neuronal data and the same analysis tools. The magnitude of the coherence baseline dropped exponentially as time duration increased. Therefore, the baseline level seen in the STN coherence is most likely due to the finite (and relatively short) duration (mean = 23.7 s) of the recordings in human patients. The coherence functions were normalized by the subtraction of the average coherence of the randomly shuffled (10,000 times) pairs from the same STN domain. Note that the normalized coherence functions can therefore display negative coherence values.
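The finite-sampling baseline can be reproduced in a simplified simulation. The sketch below is not the procedure used in the study: it omits the band-pass filter, the rectification and the Hamming window, uses non-overlapping segments with a naive DFT, and all names and parameter values (segment length, seeds) are hypothetical. It only illustrates that two independent Gaussian noise sources show a non-zero average coherence that shrinks as more segments are averaged.

```python
import cmath
import random

def dft(x):
    """Naive one-sided discrete Fourier transform (bins 0..n/2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def ms_coherence(x, y, seg_len):
    """Magnitude-squared coherence from non-overlapping, mean-removed segments."""
    n_bins = seg_len // 2 + 1
    sxx = [0.0] * n_bins
    syy = [0.0] * n_bins
    sxy = [0j] * n_bins
    for start in range(0, len(x) - seg_len + 1, seg_len):
        xs = x[start:start + seg_len]
        ys = y[start:start + seg_len]
        mx = sum(xs) / seg_len
        my = sum(ys) / seg_len
        fx = dft([v - mx for v in xs])
        fy = dft([v - my for v in ys])
        for k in range(n_bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    # Skip the DC bin, which is emptied by the mean removal.
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) for k in range(1, n_bins)]

def noise_baseline(n_segments, seg_len=32, seed=0):
    """Mean coherence of two independent Gaussian noise traces."""
    rng = random.Random(seed)
    n = n_segments * seg_len
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    coh = ms_coherence(x, y, seg_len)
    return sum(coh) / len(coh)

short_record = noise_baseline(4)     # few segments: higher spurious baseline
long_record = noise_baseline(64)     # many segments: baseline shrinks
```

Under these assumptions the short record yields a clearly higher baseline than the long one, which is the qualitative behavior described above for recordings of limited duration.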

### **SYNCHRONIZATION AND OSCILLATION STRENGTH (***SYNC* **AND** *OSCIL SCORES***)**

Rosenberg and Halliday (Halliday et al., 1995, 2000; Farmer et al., 1997; Farmer, 1998) proposed a very useful method to estimate coherence significance. However, this method employs a threshold confidence level, and does not offer a quantitative measure of synchronization strength. Therefore, a *Z*-score-like method (effective *Z*-score: $Z^{*}$) was devised to determine the synchronization and oscillation strength. The *Z*-score of a given parameter is defined as the number of standard deviations above (or below) the mean. In this case the parameter was the maximum value (peak value) of the smoothed PSD or coherence (see below). However, instead of using the standard deviation of the entire frequency range, a *tail* standard deviation ($\sigma_{tail}$) was defined in the frequency range of 35 to 70 Hz. In this range, no coherence or power spectrum phenomena were observed in our dataset (**Figures 1D**, **2A–C**, **3A,B**, **4B**). To smooth the coherence, a simple moving average (SMA) was calculated, with a window size of 23 samples (7.67 Hz), and a delta of one sample (i.e., the frequency resolution of 1/3 Hz). The synchronization strength or score was defined as $Z^{*} = (\max(\mathrm{SMA}(C(f))) - \mu)/\sigma_{tail}$, where $\max(\mathrm{SMA}(C(f)))$ is the maximum value of the coherence after smoothing with the moving average, and $\mu$ is the coherence mean. To find the frequency at which the spectrum or the coherence achieved its maximal value, $\arg\max(\mathrm{SMA}(C(f)))$ was calculated. The maximal peaks of the coherence $C(f)$ were defined in the smoothed coherence function with minimal distances of 5 Hz between them. The search for the coherence peak was started at the lower frequency, and progressed to the largest value of the smoothed coherence function. All calculations (max, mean and arg-max) were performed in the frequency range between 1 and 70 Hz. Negative scores were found in a few cases, due to residual high power at the low (1–2 Hz) frequency range, and these were ignored.

To determine the oscillation strength, the same effective *Z*-score as for the synchronization was used and defined as the *oscil score*. The maximum value of the smoothed PSD (by a simple moving average, with window size of 23 samples, and delta of one sample, i.e., 1/3 Hz), and the *tail* standard deviation ($\sigma_{tail}$) in the frequency range of 35–70 Hz were calculated. The *oscil score* was defined as $Z^{*} = (\max(\mathrm{SMA}(\mathrm{PSD}(f))) - \mu)/\sigma_{tail}$, where $\max(\mathrm{SMA}(\mathrm{PSD}(f)))$ is the maximum value of the PSD after smoothing with the moving average and $\mu$ is the PSD mean.
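The effective *Z*-score used for both the *sync* and *oscil scores* can be sketched as below. Several details are assumptions made for illustration only: the mean is taken over the unsmoothed values in 1–70 Hz, the tail SD is the population SD of the unsmoothed values in 35–70 Hz, and the moving average keeps only fully covered windows. The function names and the synthetic spectra are hypothetical.

```python
import math
from statistics import fmean, pstdev

def moving_average(values, window=23):
    """Simple moving average with a step of one sample (full windows only)."""
    return [fmean(values[i:i + window]) for i in range(len(values) - window + 1)]

def effective_z(values, freqs_hz, window=23, band=(1.0, 70.0), tail=(35.0, 70.0)):
    """Z* = (max(SMA(values)) - mean) / tail SD, evaluated within `band`.

    Assumptions: mean and tail SD come from the unsmoothed values.
    """
    in_band = [(f, v) for f, v in zip(freqs_hz, values) if band[0] <= f <= band[1]]
    band_vals = [v for _, v in in_band]
    tail_vals = [v for f, v in in_band if tail[0] <= f <= tail[1]]
    peak = max(moving_average(band_vals, window))
    return (peak - fmean(band_vals)) / pstdev(tail_vals)

# Synthetic functions on a 1/3 Hz grid from 1 to 70 Hz (illustration only).
freqs = [1 + k / 3 for k in range(208)]
flat = [0.05 + 0.01 * (k % 2) for k in range(208)]           # ripple, no peak
peaked = [v + 0.5 * math.exp(-((f - 20) / 3) ** 2)           # bump near 20 Hz
          for f, v in zip(freqs, flat)]

z_flat = effective_z(flat, freqs)
z_peaked = effective_z(peaked, freqs)
```

With these synthetic inputs the flat function stays far below a score of 2, whereas the function with a beta-range bump exceeds it by a wide margin.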

To explore the relationship between oscillation and synchronization, a statistical measure of the oscillation strength of the two oscillatory sites was used. The average $\mathrm{PSD}^{*} = (\mathrm{PSD}_1 + \mathrm{PSD}_2)/2$ was calculated, where $\mathrm{PSD}_1$ and $\mathrm{PSD}_2$ were the power spectrum densities of each site in the neuronal pair, and the *oscil score* of $\mathrm{PSD}^{*}$ was calculated. Additionally, other estimates of the *oscil scores* of the two sites were calculated as $\min(oscil_1, oscil_2)$, $\max(oscil_1, oscil_2)$, and the geometric mean of the two scores, where $oscil_1$ and $oscil_2$ were the *oscil scores* of each PSD site. The geometric mean was calculated as $oscil = \mathrm{sign}(oscil_1 \cdot oscil_2) \cdot \mathrm{GeoMean}(|oscil_1|, |oscil_2|)$, where sign is the sign operator applied to the product of $oscil_1$ and $oscil_2$, to handle negative values.
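The signed geometric mean of two *oscil scores* follows directly from the formula in the text; a short sketch is given here, with the hypothetical function name `signed_geometric_mean`.

```python
import math

def signed_geometric_mean(oscil_1, oscil_2):
    """sign(oscil_1 * oscil_2) * sqrt(|oscil_1| * |oscil_2|)."""
    product = oscil_1 * oscil_2
    sign = (product > 0) - (product < 0)       # +1, 0 or -1
    return sign * math.sqrt(abs(oscil_1) * abs(oscil_2))

both_positive = signed_geometric_mean(4.0, 9.0)
mixed_signs = signed_geometric_mean(-4.0, 9.0)
```

Note that, with the formula as written, the result is negative only when exactly one of the two scores is negative; two negative scores yield a positive value.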

Synchronization or oscillations were defined to be significant when the scores reached $Z^{*} \geq 2$ (i.e., the coherence or the PSD peak value was at least 2 *SD* above the mean value of the function).

## **COHERENCE CONFIDENCE LEVEL**

To validate the *sync score*, the confidence level (CL) of the coherence analysis (Halliday et al., 1995) was used. We divided each microelectrode record of duration $R$ into $L$ non-overlapping segments of duration $S$ ($R = L \cdot S$). The total spectrum was calculated by averaging the magnitude-squared (MS) discrete Fourier transform (periodogram) across segments, after removing the local mean in each segment. Each segment contained $S = 2^{16}$ samples, with a frequency resolution of 0.7336 Hz. Only complete segments were analyzed; data points at the end of the record that did not make a complete segment were not included in the analysis. The procedures were implemented using the free Neurospec MATLAB toolbox: http://www.neurospec.org. To obtain the approximate 95% and 99% confidence levels, the thresholds $\mathrm{CL}_{95} = 1 - 0.05^{1/(L-1)}$ and $\mathrm{CL}_{99} = 1 - 0.01^{1/(L-1)}$, respectively, were used. **Figure 4B** depicts examples of the relation between the MS-coherence estimates (*Z*-scores) used in this manuscript and the coherence confidence levels of 95% and 99%.
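As a worked example of the confidence-level thresholds, the sketch below counts the complete segments in a record of the mean duration reported in the Database section (23.7 s) and evaluates the 95% and 99% thresholds. The function names are hypothetical, and the 48 kHz sampling rate and segment length of 2^16 samples are taken from the text.

```python
def complete_segments(duration_s, fs_hz=48_000, seg_len=2 ** 16):
    """Number of complete, non-overlapping segments in a record."""
    return int(duration_s * fs_hz) // seg_len

def coherence_confidence_level(n_segments, alpha=0.05):
    """Threshold 1 - alpha**(1/(L-1)) for L non-overlapping segments."""
    if n_segments < 2:
        raise ValueError("need at least two segments")
    return 1.0 - alpha ** (1.0 / (n_segments - 1))

n_seg = complete_segments(23.7)                     # 17 complete segments
cl95 = coherence_confidence_level(n_seg, 0.05)      # roughly 0.17
cl99 = coherence_confidence_level(n_seg, 0.01)      # roughly 0.25
```

Shorter records contain fewer segments and therefore require a higher coherence value to reach the same confidence level, which is why the sampling duration matters when comparing coherence examples.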

## **ASSESSING THE CAUSAL RELATIONS BETWEEN OSCILLATIONS AND SYNCHRONY**

Spurious synchronization can arise from non-coupled oscillatory sites that oscillate in the same frequency bands (e.g., two atomic clocks might be synchronized due to their exact frequency although there is no physical coupling between them; Strogatz, 2003). To rule out this spurious oscillation-synchronization, the mean coherence of randomly shuffled pairs (10,000 times) was calculated for each category (all pairs, DLOR-DLOR, VMNR-VMNR, and DLOR-VMNR) of the STN. The shuffling was performed using the Mersenne Twister algorithm (Matsumoto and Nishimura, 1998) with a different seed number in each iteration.
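The shuffling control can be sketched as follows. This is an illustration under stated assumptions rather than the original procedure: each surrogate pairs the first-electrode site of one recording with the second-electrode site of a different recording, original pairings are explicitly excluded, and `pair_metric` is a hypothetical placeholder for the coherence computation. Python's standard `random.Random` generator is itself a Mersenne Twister, and a new seed is used for each iteration.

```python
import random

def shuffled_pairs(n_pairs, seed):
    """Re-pair electrode-1 and electrode-2 sites so no original pair survives."""
    if n_pairs < 2:
        raise ValueError("need at least two recorded pairs to shuffle")
    rng = random.Random(seed)                  # Mersenne Twister generator
    partners = list(range(n_pairs))
    while any(i == p for i, p in enumerate(partners)):
        rng.shuffle(partners)                  # retry until no site keeps its partner
    return list(enumerate(partners))

def shuffled_mean(pair_metric, n_pairs, n_iter=100):
    """Mean of pair_metric(i, j) over surrogate pairs, one seed per iteration."""
    total = 0.0
    for seed in range(n_iter):
        pairs = shuffled_pairs(n_pairs, seed)
        total += sum(pair_metric(i, j) for i, j in pairs) / n_pairs
    return total / n_iter

surrogate = shuffled_pairs(10, seed=1)
```

If coherence in the real pairs merely reflected independent oscillations at similar frequencies, the surrogate (non-simultaneous) pairs would be expected to show comparable coherence; coherence that vanishes after shuffling points instead to genuine coupling.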

## **RESULTS**

## **SYNCHRONIZATION OCCURS ONLY BETWEEN DLOR PAIRS**

**Figure 1** shows an example of synchronous oscillatory activity as recorded by two electrodes inserted into the STN of a PD patient during DBS procedures. The raw analog data are shown in two time scales in **Figures 1A,B**. The power spectra and the coherence function of this recording are shown in **Figures 1C,D**, respectively. One can easily observe the synchronous oscillations in the beta range (∼20 Hz) in this example.

To explore the properties of STN neuronal synchronization, STN spiking activity simultaneously recorded from two electrodes was analyzed (**Figure 2**). In total, 2390 multi-unit pairs along 72 STN trajectories (with >4 mm STN span in both electrodes) from 57 PD patients undergoing DBS surgery were included in the analysis.

Previous physiological studies of the basal ganglia in the rodent (Mallet et al., 2008a,b) and primate (Bergman et al., 1994; Nini et al., 1995; Raz et al., 1996, 2000, 2001; Goldberg et al., 2002) models of PD have indicated abnormal synchronicity of basal ganglia neurons as one of the major changes occurring in the network following dopamine depletion. Nevertheless, when the neuronal synchronization of simultaneously recorded STN sites was measured over the entire STN (2390 pairs), no distinguishable synchronization was found.

The STN can be spatially differentiated into sub-regions according to neural activity (Zaidel et al., 2010). Two areas could be robustly discriminated in our recordings: the dorsolateral oscillatory region (DLOR, *n* = 1778 sites, **Figure 3A**) and the ventromedial non-oscillatory region (VMNR, *n* = 3002 sites, **Figure 3B**). **Figures 3C,D** show the distribution of the *oscil scores* in the DLOR and VMNR, respectively. As expected, significantly higher *oscil score* values were observed in the DLOR than in the VMNR.

**FIGURE 2 |** [...] calculated using the HMM-defined border, and lower rows calculated with a progressively increasing gap from the HMM border. *N* is the number of pairs, and the shaded areas represent the standard error of the mean (SEM) of the relative coherence function for each population. **(D)** Distribution of oscillation scores along the normalized DLOR depth. For each trajectory the DLOR length was normalized so that 0 corresponds to entry and 1 to the end of the DLOR.

The division of the STN into the DLOR and VMNR domains enabled testing of the synchronization of STN pairs from the same and different regions. Significant synchronization, mainly in the frequency band of 8–30 Hz was found, but only between pairs in the DLOR itself (DLOR-DLOR, *n* = 615 pairs, **Figure 2A**, upper subplot). This synchronization was not observed in pairs of electrodes at DLOR and VMNR (*n* = 548 pairs, **Figure 2B**, upper subplot) or in the VMNR (*n* = 1227 pairs, **Figure 2C**, upper subplot). This finding is consistent with previous multiple electrode studies of the human Parkinsonian STN (Levy et al., 2000, 2002a,b; Amirnovin et al., 2004; Weinberger et al., 2006, 2009; Alavi et al., 2013; Lourens et al., 2013) which reported coherence between STN oscillations in a small fraction of STN pairs. However, our findings indicated that the topographical location of the STN electrodes affected the probability of finding a correlation between STN sites, and coherence was only and robustly found between DLOR-DLOR multi-unit pairs.

Recent imaging studies (Lambert et al., 2012; Haynes and Haber, 2013) have clarified that the boundaries between the functional subdomains of the STN are fuzzy, and an overlap of motor and non-motor projections can be found in the transition areas between the STN domains. Therefore, the average coherence at the dorsolateral and ventromedial STN was tested with increasing gaps (0.5–2 mm) from the HMM borders. These results are shown in the lower five rows of **Figure 2**, and reveal a sharpening and increase of the average coherence peak in the STN DLOR when the gap is increased. Similarly, when the *oscillation scores* are calculated along the normalized depth of the DLOR, a gradual

**FIGURE 3 | Oscillatory activity in the STN DLOR and VMNR.** Average power spectra of STN activity of DLOR **(A)** and VMNR **(B)** recordings. The shaded areas represent the standard error of the mean (SEM) of the spectrum function for each population. *N*sites is the number of sites averaged. **(C,D)** Distribution of the *oscil scores* in the DLOR and the VMNR recordings. Note the different scales of the *Y*-axes. *Oscil scores* below zero and above 20 (*n* = 68 and 153 out of 4780) are appended to the first and last bin, respectively, to enhance visualization of the results.

decrease in the oscillatory scores is observed as the DLOR lower border is approached (**Figure 2D**).

The above results were obtained by averaging over pairs recorded for different durations. The average coherence results (**Figure 2**) were further compared to the average coherence results of the same pairs with homogeneous intervals (only the first 10 s of each recording was included, and recordings with durations shorter than 10 s were excluded). Similar results (data not shown) were obtained.

## **SYNCHRONIZATION vs. OSCILLATIONS IN THE DLOR AREA**

Next, the correlation between oscillations and synchronization in the STN was analyzed. The oscillation and synchronization strengths were calculated using the *oscil* and *sync scores* for each pair in the DLOR area. **Figures 4A,B** depict three examples of power spectra and coherence functions of STN activity with their respective *oscil* and *sync scores*. See also **Figure 1** for an example of a simultaneous recording of two

**FIGURE 4 | Examples of power spectra and coherence functions with different oscillation and synchronization scores. (A)** Three examples of power spectra with low, medium, and high oscillation (*oscil*) scores. **(B)** Three examples of STN coherence functions with low, medium, and high synchronization (*sync*) scores. The dashed horizontal and continuous lines denote the confidence interval of 99 and 95% respectively. Sampling duration equals 33.4, 14.5, and 35.1 s for the power spectra examples and 40.2, 41.5, and 40.6 s for the coherence examples (yielding similar confidence intervals for the coherence functions).

sites in the STN, and their corresponding values of *oscil* and *sync scores*.

**Figure 5B** depicts the scatter plot of the *oscil* and *sync scores* for all DLOR pairs. Different indicators for the oscillation strength of pairs of STN sites were used: the minimal and maximal *oscil score*, the arithmetic average of the PSDs, and the geometric mean of the *oscil scores*. In all cases, the scatter plot of the *sync score* vs. the *oscil score* of the pairs within the DLOR area (*n* = 615) indicated a significant correlation (*r* > 0.24, *p* < 0.001) between the synchronization and the oscillations. Here (**Figure 5B**) we show only the data for the arithmetic mean of the *oscil scores*.

The correlation between the oscillation and synchronization strength could imply that the synchronization pattern was dependent on the homogeneity of the neuronal oscillations within the DLOR. If the neural oscillations in different sites of the STN of a single patient have a very stable and equal frequency, the existence of synchronization may not be the result of physical coupling between the STN neurons. Therefore, the DLOR pairs of each trajectory were randomly shuffled and the synchronization between the shuffled (non-simultaneously recorded) pairs was re-quantified in each trajectory. After shuffling, the oscillations remained in the same frequency band, but the synchronization was no longer apparent. **Figures 5A,C** show the average coherence functions before and after shuffling of the DLOR-DLOR pairs

significant average coherence between non-shuffled pairs within the STN DLOR. *N* = 615 represents the number of DLOR-DLOR pairs. **(B)** Scatter plot of synchronization *(sync)* and oscillation *(oscil)* scores in the STN DLOR reveals that the two measures are correlated. Each square represents the synchronization (*Y*-axis) vs. average (arithmetic mean) of the two oscillation scores of one of the 615 pairs within the DLOR. *r* is the Spearman correlation coefficient and *p* is the probability that *r* = 0 (no correlation between the scores). **(C)** Synchronization is no longer seen between non-simultaneously recorded (shuffled) STN DLOR-DLOR pairs. Inset: Schematic illustration of the shuffling procedure. The shuffling procedure was repeated 10,000 times for each pair. **(D)** Scatter plot of shuffled oscil-sync score values. Same conventions as in **(B)**.

(*n* = 615), respectively. **Figures 6A,B** depict the average *sync* and *oscil scores* before and after shuffling in the STN DLOR and the VMNR. As expected, shuffling had no significant effect on the *oscil score* in either area (oscillation is a property of a single element and therefore should not be affected by the shuffling procedure). However, the average *sync score* of the DLOR pairs, but not of the VMNR pairs, declined significantly after the shuffling procedure (**Figure 6A**). Finally, **Figure 5D** depicts the scatter plot of the *sync* and *oscil scores* of the shuffled pairs within the DLOR. The Spearman correlation between the *sync score* and *oscil score* dropped dramatically from *r*1 = 0.34 to *r*2 = 0.05 (*p* < 0.001 for the null assumption that *r*1 = *r*2). The mean PSD estimate for the average *oscil score* (as in **Figure 5B**) was used for this analysis. Similar results were obtained for the other indicators of oscillation strength of the STN pairs.
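The logic of the shuffling control can be illustrated with a minimal sketch. This is not the analysis code used in the study: the signals are synthetic, the zero-lag Pearson correlation stands in for the coherence-based *sync score*, and all names and parameters (`simulate_pair`, `step`, `jitter`, the number of pairs) are hypothetical. Each simulated recording contains two sites driven by a shared oscillation whose phase drifts over time. Re-pairing site A of one recording with site B of another keeps the oscillation frequency but removes the shared drive.

```python
import math
import random


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(rng, n_samples=500, step=0.6, jitter=0.3):
    """Toy simultaneous pair: a shared drifting oscillation plus private noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    site_a, site_b = [], []
    for _ in range(n_samples):
        phase += step + rng.gauss(0.0, jitter)  # phase drifts within a recording
        common = math.sin(phase)                # input shared by both sites
        site_a.append(common + rng.gauss(0.0, 1.0))
        site_b.append(common + rng.gauss(0.0, 1.0))
    return site_a, site_b


def mean_sync(pairs):
    """Average absolute correlation over a list of (site_a, site_b) pairs."""
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


rng = random.Random(1)
recorded = [simulate_pair(rng) for _ in range(20)]

# Shuffle control: pair site A of one recording with site B of a different
# recording. A non-zero rotation guarantees that no pair stays simultaneous.
offset = rng.randrange(1, len(recorded))
shuffled = [(recorded[k][0], recorded[(k + offset) % len(recorded)][1])
            for k in range(len(recorded))]

sync_recorded = mean_sync(recorded)
sync_shuffled = mean_sync(shuffled)
print(f"simultaneous: {sync_recorded:.2f}, shuffled: {sync_shuffled:.2f}")
```

In this toy setting the simultaneous pairs stay correlated while the re-paired ones do not, which is the qualitative pattern reported for the DLOR-DLOR pairs. A single rotation is used here for brevity, whereas the study repeated the shuffling many times per pair.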

#### **COHERENCE IS MAINLY IN THE BETA FREQUENCY RANGE**

Next, the frequency at which each spectrum (**Figure 7A**) and coherence function (**Figure 7C**) reached its maximal value was calculated. In both cases, a bi-modal distribution was observed, with a dominance of tremor frequency (3–7 Hz) and beta (12–30 Hz) oscillations, for the auto-spectra and the coherence functions, respectively. **Figures 7B,D** show the scatterplot of the maximal *oscil* and *sync scores*, respectively, as a function of their frequency. While the *oscil scores* had similar values in the beta and the tremor range, the values of the maximal *sync scores* were much higher in the beta than in the tremor range. These results are in line with our previous primate studies (Raz et al., 2000) that revealed

**FIGURE 6 | Average values of synchronization (***sync score***) in the DLOR, but not average VMNR synchronization or oscillations scores in both STN domains, are affected by the shuffling procedures. (A)** Synchronization (*Sync*) scores before (white) and after (black) shuffling, lines indicate standard error of the mean (SEM). **(B)** Oscillation (*Oscil*) scores before and after shuffling. Same conventions as in **(A)**.

615 DLOR-DLOR pairs.

mainly 5 Hz peaks in the auto-correlations vs. higher frequencies (10 Hz) in the cross-correlation functions of pallidal units and pairs recorded in the globus pallidus after MPTP treatment. However, we cannot rule out the possibility that the 10 Hz activity in this study is not tremor related and a harmonic feature (or n:m locking) of the tremor or of the neuronal oscillations at the tremor frequency.

#### **LACK OF POSITIVE CORRELATION BETWEEN THE STN** *OSCIL* **AND** *SYNC SCORES* **vs. PD SYMPTOMS**

Previous studies have suggested that STN oscillations and synchronization are correlated with tremor in PD patients (Levy et al., 2002a). This would indicate that the STN synchronized oscillations are driven by the tremor (which may be generated by an independent neuronal loop). The above findings of robust synchronization in the beta rather than in the tremor frequency range (**Figure 7D**) are not in line with this hypothesis. Nevertheless, we looked for correlations between the *oscil* and *sync scores* of our patients and their pre-operative (OFF medication) UPDRS scores. We did not find a significant positive correlation between the average *oscil* and *sync scores* of STN activity and the UPDRS tremor scores of the contra-lateral upper limb(s), all tremor (including axial) scores, or all UPDRS III motor scores. There was a trend for STN synchronized beta oscillations to be more robust in patients with less tremor. While these results might point to a correlation between STN beta oscillations and akinetic/rigid Parkinsonian symptoms (an issue that requires clarification in future studies with a larger sample of patients and with intra-operative clinical assessment), they definitely indicate that the STN beta synchronized oscillations are not a by-product of the PD tremor.

## **DISCUSSION**

In this manuscript, synchronization within the human Parkinsonian subthalamic nucleus was investigated. No significant synchronization was found over the STN as a whole. After dividing the STN into two electro-physiologically distinct regions, the dorsolateral oscillatory region (DLOR) and the ventromedial non-oscillatory region (VMNR), significant synchronization in the beta range was observed, but only within the DLOR. The strength of the DLOR synchronization was correlated with the strength of the oscillations of the multi-unit pairs. Nevertheless, shuffling between DLOR pairs abolished synchronization, suggesting that STN synchronization is an independent phenomenon and not a mere reflection of neuronal oscillations at similar frequencies.

Previous studies have shown significant spatial overlap between the DLOR and the STN sensorimotor area (Rodriguez-Oroz et al., 2001; Zaidel et al., 2010). The finding that the STN VMNR (considered to be part of the limbic and associative basal ganglia network) remains unsynchronized is consistent with the predominantly motor nature of PD. However, the (normal) lack of synchronization in the STN VMNR may be due to a selection bias of our DBS patients. Since conventional inclusion criteria were used to select candidates for DBS, patients were usually severely motor-impaired and had few of the non-motor features of the disease. Furthermore, the DLOR may reflect the pathological area of the STN which progressively invades the limbic domains of the STN as the disease advances. Finally, our results are in line with a fuzzy rather than a sharp boundary between the STN sub-domains (Lambert et al., 2012; Haynes and Haber, 2013).

#### **THE STN SPIKING POPULATION ACTIVITY IS SYNCHRONIZED**

In this study, population spiking (multi-unit) activity was used as a measure of the spiking activity of the STN rather than the more classical parameter of single unit activity (Perkel et al., 1967; Abeles, 1982; Lemon, 1984; Eggermont, 1990). This was primarily for practical reasons. The goal of physiological recording in the operating room (OR) is to enable better identification of the borders of the subthalamic nucleus and its sub-regions. The electrode is therefore advanced in 100 μm steps rather than the 2–5 μm steps customary in the research laboratory setup. The sampling duration at each step is also limited (Shamir et al., 2012) and the OR conditions often do not allow stable recordings (as compared to 30–90 min stable recordings in a research setting). On the other hand, the cross-correlation of composite spike trains derived from several un-discriminated cells recorded on a single electrode (multi-unit activity) enhances the sensitivity of correlation methods. First, the higher discharge rate of multi vs. single unit recording reduces the asymmetric sensitivity of correlation methods to excitation vs. inhibition (Aertsen and Gerstein, 1985). Second, multi-unit cross-correlation can be a more sensitive detector of a neuronal relationship than single-unit cross-correlation (Bedenbaugh and Gerstein, 1997). Thus, use of a multi-unit signal is warranted for both practical and theoretical reasons. Furthermore, the use of signals recorded by two different electrodes in this study reveals the long range (2 mm) synchronization of the STN DLOR. It is hoped that future studies of STN units using objective metrics for quantification of the quality of the unit isolation (Joshua et al., 2007; Hill et al., 2011) will shed more light on synchronization in the STN and other basal ganglia structures of human patients.

#### **SYNCHRONIZATION ONLY OCCURS BETWEEN DLOR PAIRS IN THE STN**

Early studies described neuronal synchronization in the STN as an epiphenomenon found mainly in patients presenting with tremor (Levy et al., 2000, 2002a). More recent studies (Hanson et al., 2012; Alavi et al., 2013; Lourens et al., 2013) have reported that synchronization can be found between some but not all STN pairs. On the other hand, beta-band LFP and spike oscillations have been described as a consistent feature of human PD in the dopamine depleted state (Brown and Williams, 2005; Foffani et al., 2005; Little et al., 2012). Moreover, many studies have documented the consistency of beta-band oscillations in both the spatial and temporal domains (Bronte-Stewart et al., 2009; de-Solages et al., 2010; Zaidel et al., 2010; Abosch et al., 2012; Little et al., 2012). In this study, synchronization within the Parkinsonian STN DLOR was indeed found to correlate with oscillations. However, the shuffling procedure revealed that STN synchronization was not due to independent oscillators with a similar oscillation frequency (Strogatz, 2003). If this had been the case, a significant synchronization should also have been observed between the shuffled (non-simultaneously recorded pairs of the same patient) DLOR-DLOR pairs. Thus, the synchronization of the simultaneously recorded STN pairs probably reflects the increased coupling between these neurons in the dopamine depleted state of Parkinson's disease. This increased coupling is probably due to the increased efficacy of the common inputs to the STN cells, either from the cortex (Nambu, 2004; Kita and Kita, 2012) or from the external segment of the globus pallidus (Plenz and Kitai, 1999; Tachibana et al., 2011). However, at this stage the possibility of increased coupling by lateral connectivity within the STN cannot be ruled out (Parent et al., 2000; Parent and Parent, 2007).

The finding that most of the energy of the STN synchronous oscillations is in the beta range suggests that these oscillations are not generated by feedback of the peripheral tremor. It is interesting to note that synchronous oscillations in the basal ganglia of MPTP treated primates are mainly found in the 10 Hz domain, whereas human oscillations span the full beta range (12–30 Hz). Future studies should reveal whether this is due to a species difference or to differences between the MPTP model and human idiopathic Parkinson's disease.

#### **CONCLUDING NOTES**

In this study we show that the STN domain most affected by PD dopamine depletion (the DLOR, probably the STN motor domain) exhibited both oscillations and synchronization. This suggests that synchronization reflects an additional property of the Parkinsonian STN. Previous studies in the basal ganglia of MPTP treated primates have demonstrated that synchronization can be completely independent of oscillatory activity (Heimer et al., 2002). The previous and the current findings can serve to recast the relationship between oscillations and synchronization in the Parkinsonian basal ganglia (Raz et al., 2000; Amirnovin et al., 2004; Moran et al., 2008). In addition to changes in discharge rate and pattern, synchronization within the STN may be another pathophysiological marker of Parkinson's disease. The potential consequences of synchronization (as opposed to other attributes like rate and pattern change) are probably mainly due to reduced information capacity of the basal ganglia neurons. However, the different pathological changes in the parkinsonian basal ganglia are probably not mutually exclusive. Synchronized oscillations have stronger effects than less synchronized oscillations and completely unsynchronized oscillations might have no effect on target neurons. Furthermore, future studies toward adaptive DBS (Rosin et al., 2011) should investigate which of the pathophysiological changes in the STN activity might be used as the optimal trigger for closed loop DBS.

#### **ACKNOWLEDGMENTS**

This work was supported partially by the Vorst Family Foundation for Parkinson Research, by the Simone and Bernard Guttman chair of Brain Research, and the generous support of the Rosetrees and Dekker foundations (to Hagai Bergman) as well as by the ELSC post-doctorate fellowships to Shay Moshel and Reuben R. Shamir and the PATH and Bloom foundations of London to Zvi Israel. We would like to thank Prof. Y. Ritov for statistical advice and Dr. S. Freeman and E. Singer for help in scientific editing.

#### **AUTHOR CONTRIBUTIONS**

Shay Moshel and Reuben R. Shamir contributed equally. Shay Moshel did the data analysis, and wrote the manuscript together with Hagai Bergman and Zvi Israel. Reuben R. Shamir handled the database and collected part of the data. AR initiated the study and helped in the collection of data. Hagai Bergman, Renana Eitan, Fernando R. de Noriega, and Zvi Israel collected the data. All authors discussed the results, reviewed the manuscript, and made their comments.

#### **REFERENCES**


the subthalamic nucleus of patients with Parkinson's disease. *Exp. Neurol.* 194, 212–220. doi: 10.1016/j.expneurol.2005.02.010


advanced Parkinson disease: a randomized controlled trial. *JAMA* 301, 63–73. doi: 10.1001/jama.2008.929


Zaidel, A., Spivak, A., Shpigelman, L., Bergman, H., and Israel, Z. (2009). Delimiting subterritories of the human subthalamic nucleus by means of microelectrode recordings and a Hidden Markov Model. *Mov. Disord.* 24, 1785–1793. doi: 10.1002/mds.22674

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 July 2013; accepted: 18 October 2013; published online: 19 November 2013.*

*Citation: Moshel S, Shamir RR, Raz A, de Noriega FR, Eitan R, Bergman H and Israel Z (2013) Subthalamic nucleus long-range synchronization—an independent hallmark of human Parkinson's disease. Front. Syst. Neurosci. 7:79. doi: 10.3389/fnsys.2013.00079*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Moshel, Shamir, Raz, de Noriega, Eitan, Bergman and Israel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Global actions of nicotine on the striatal microcircuit

## *Víctor Plata, Mariana Duhne, Jesús Pérez-Ortega, Ricardo Hernández-Martinez, Pavel Rueda-Orozco, Elvira Galarraga, René Drucker-Colín and José Bargas\**

*División de Neurociencias, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Mexico City, Mexico*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Raju Metherate, University of California, Irvine, USA (in collaboration with Irakli Intskirveli) James M. Tepper, Rutgers, The State University of NJ, USA*

#### *\*Correspondence:*

*José Bargas, División de Neurociencias, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Circuito ext s/n, México City DF 04510, México e-mail: jbargas@ifc.unam.mx*

The question addressed in the present work is: what is the predominant action induced by the activation of cholinergic-nicotinic receptors (nAChRs) in the striatal network, given that nAChRs are expressed by several elements of the circuit (cortical terminals, dopamine terminals, and various striatal GABAergic interneurons)? To answer this question, some type of multicellular recording has to be used without losing single cell resolution. Here, we used calcium imaging and nicotine. It is known that in the presence of low micromolar N-Methyl-D-aspartate (NMDA), the striatal microcircuit exhibits neuronal activity consisting of the spontaneous synchronization of different neuron pools that interchange their activity following determined sequences. The striatal circuit also exhibits profuse spontaneous activity in pathological states (without NMDA) such as dopamine depletion. However, in this case, most pathological activity is generated by the same neuron pool. Here, we show that both types of activity are inhibited during the application of nicotine. Nicotine actions were blocked by mecamylamine, a non-specific antagonist of nAChRs. Interestingly, inhibitory actions of nicotine were also blocked by the GABAA-receptor antagonist bicuculline, in which case the actions of nicotine on the circuit became excitatory and facilitated neuronal synchronization. We conclude that the predominant action of nicotine in the striatal microcircuit is indirect, via the activation of networks of inhibitory interneurons. This action inhibits striatal pathological activity in early Parkinsonian animals almost as potently as L-DOPA.

**Keywords: striatal microcircuit, nicotine, nicotinic receptors, GABAergic interneurons**

#### **INTRODUCTION**

The striatal microcircuit is composed of projection neurons and different classes of interneurons (e.g., Kawaguchi, 1993; Tepper and Bolam, 2004; Tepper et al., 2010). This circuit receives inputs from the cortex, the substantia nigra compacta (SNc) and the thalamus, among others, being a main entrance to the basal ganglia, a system of nuclei which encodes movement, associative learning, and procedural memory (Cools, 1980; DeLong, 1981; Aosaki et al., 1994; Kreitzer and Malenka, 2008; Do et al., 2012).

Although striatal projection neurons (SPNs) have low rates of basal firing in control conditions (e.g., Mink, 2003), uncorrelated excitatory drives such as N-Methyl-D-aspartate (2–5 μM NMDA) induce correlated neuronal activity in the control striatal microcircuit *in vitro* (Carrillo-Reid et al., 2008), similar to that produced during movement *in vivo* (Vautrelle et al., 2009). This activity consists of moments of recurrent and spontaneous synchronization in the firing of different neuron pools. This synchronous activity alternates among the different neuron pools, generating the appearance of determined sequences, some of them being reverberant sequences or cycles (Carrillo-Reid et al., 2008, 2009a). These dynamics have been shown to be modulated by transmitters acting through G-protein coupled receptors and signaling pathways such as those activated by dopamine (DA) and acetyl-choline (ACh) (Carrillo-Reid et al., 2009a, 2011).

On the other hand, when deprived of DA supply, as in animal models of Parkinson's disease (PD), the striatal circuitry also generates a profuse spontaneous and synchronized activity without the addition of NMDA or any other excitatory drive. However, this pathological activity induced by DA-depletion differs from that found in control tissue: it is characterized by the loss of sequential activity and alternating dynamics. Almost all activity becomes generated by the same neuron pool with recurrent synchronization, resembling the repetitive oscillations found in Parkinsonian subjects (Jáidar et al., 2010). Here, we show that both control (with NMDA) and Parkinsonian activities are globally suppressed by nicotine administration.

The ACh present in the striatal microcircuit is released by local cholinergic interneurons, and its level is the highest of any brain region, as are the levels of choline acetyl-transferase and cholinesterase (Mesulam et al., 1992; Contant et al., 1996; Goldberg et al., 2012). Cholinergic interneurons are autonomous pacemakers and ACh release is continuous and dynamic, thus producing a varying tonic level of ACh in the whole striatum according to demand (Bennett and Wilson, 1999; Goldberg and Wilson, 2005).

The majority of the neurons (>90%) in the striatal circuit are SPNs which respond to ACh via muscarinic G-protein coupled receptors (Galarraga et al., 1999; Alcantara et al., 2001; Yan et al., 2001; Zhang et al., 2002). Known actions of these muscarinic receptors are facilitatory due in part to suppression of K+ outward currents, directly or indirectly (Howe and Surmeier, 1995; Gabel and Nisenbaum, 1999; Galarraga et al., 1999; Lin et al., 2004; Olson et al., 2005; Pérez-Burgos et al., 2008; Pérez-Rosello et al., 2005; Shen et al., 2005).

However, much less is known about the nicotinic receptors present in this circuit (Goldberg et al., 2012). It is known that nAChRs are present in striatal dopaminergic terminals and promote DA release (e.g., Wonnacott et al., 2000; Grady et al., 2007; Keath et al., 2007; Livingstone and Wonnacott, 2009; Xiao et al., 2009; Cachope et al., 2012; Threlfell et al., 2012). It is also known that they are present in the terminals of cortical afferents and promote glutamate release (e.g., Marchi et al., 2002; Zhang and Warren, 2002; Campos et al., 2010). Finally, they are present in striatal GABAergic interneurons, promoting GABA release that inhibits projection neurons (Koós and Tepper, 2002; Wilson, 2004; Kreitzer and Malenka, 2008; Livingstone and Wonnacott, 2009; Xiao et al., 2009; English et al., 2011; Ibáñez-Sandoval et al., 2011; Luo et al., 2013). Each of these actions has been studied directly and separately in cell-focused studies. However, it is not known which of them predominates in the microcircuit as a whole during administration of an nAChR agonist.

Note that, if actions on glutamate afferents were predominant, we should see an enhancement of activity similar to that produced by NMDA alone, and a summation of effects would be evident. On the other hand, if the release of DA were the main action, mixed effects would appear: some interneurons are activated by DA, although their GABA release may be inhibited, which makes the net result hard to foretell (Bracci et al., 2002; Centonze et al., 2003). Activation of interneurons may enhance inhibition due to cholinergic activation. But at the network level, SPNs may be inhibited or excited by DA (e.g., Kiyatkin and Rebec, 1996), and both classes of DA-receptors, D1 and D2, increase synchronous firing (Carrillo-Reid et al., 2011). In addition, in the DA-depleted circuit, activity increases pathologically (Jáidar et al., 2010) while collateral inhibition is decreased (Taverna et al., 2008). In summary, a host of mixed and parallel actions makes it hard to foretell what the global action of nicotinic receptor activation on the striatal microcircuit would be.

The importance of answering this question resides in the suspected neuro-protective action of nicotine in the prevention and development of PD (e.g., Costa et al., 2001; Quik et al., 2007; Kawamata et al., 2012). Although this postulate is still controversial (García-Montes et al., 2012), the hypothesis has its origin in epidemiological studies that claim a lower incidence of PD in smokers (Gorell et al., 1999; Herman et al., 2001; Driver et al., 2009), and in clinical studies that claim improvements of motor and cognitive symptoms in PD patients given nicotine analogs (Fagerström et al., 1994; Villafane et al., 2007). From these arguments comes the importance of knowing with certainty what the end result would be of administering a tonic level of an nAChR agonist in the striatal circuit, which in the present case is seen as a neuronal population of diverse elements capable of generating assembly dynamics (Carrillo-Reid et al., 2008) and involved in the generation of PD signs and symptoms.

## **MATERIALS AND METHODS**

#### **SLICE PREPARATION**

Corticostriatal slices (300 μm) were obtained from PD20-40 male mice as previously described (Vergara et al., 2003). Animal experimentation followed the National Institutes of Health Guide for Care and Use of Laboratory Animals and the National University of Mexico guidelines. Slices were obtained with ice-cold saline (4 °C) containing in mM: 123 NaCl, 3.5 KCl, 1 MgCl2, 1.5 CaCl2, 26 NaHCO3, and 11 glucose (saturated with 95% O2 and 5% CO2). Slices remained in saline at room temperature (21–25 °C) for at least 1 h before the experiments.

#### **OPTICAL RECORDINGS OF NEURONAL POPULATIONS WITH SINGLE CELL RESOLUTION**

Slices were incubated in the dark for about 20 min with 10 μM of the calcium indicator fluo-8 AM (Tef Labs, Austin, TX) in saline containing 0.1% dimethylsulphoxide (DMSO), equilibrated with 95% O2 and 5% CO2. We used an upright microscope equipped with a 10X, 0.95 NA water-immersion objective (E600FN Eclipse, Nikon, Melville, NY). To observe the changes in fluorescence we delivered pulses at 488 nm (50–100 ms exposure) to the preparation with a Lambda LS illuminator (Sutter Instruments, Novato, CA) connected to the microscope via fiber optics. The image field was 800 × 600 μm in size. Short movies (∼180 s each; one movie is referred to as an epoch) were acquired at time intervals of 5–20 min during ≥60 min with a cooled digital camera (SenSys 1401E, Roper Scientific, Tucson, AZ) at 100–250 ms/frame.

Neurons active during the experiment (30 to 300 depending on the number of epochs and the age of the animals; Carrillo-Reid et al., 2008) were identified by their spontaneous calcium transients. They were recorded in control saline or during the application of NMDA and/or nicotine, with or without ionotropic channel blockers such as: 6-cyano-2,3-dihydroxy-7-nitro-quinoxaline disodium salt (10 μM CNQX), D-(-)-2-amino-5-phosphonovaleric acid (50 μM APV), bicuculline (10 μM) or gabazine (10 μM) (Sigma-Aldrich-RBI, St. Louis, MO). Stock solutions were prepared before experiments and added to the recording chamber in the final concentration indicated.

#### **IMAGE PROCESSING**

Neurons active in the field of view were selected automatically by a custom-made program written in the LabView™ programming environment. The program processes the image sequence to obtain the fluorescence signals originating from action potential discharge (Carrillo-Reid et al., 2008, 2009a). Briefly, a two-dimensional coordinate was assigned to each cell. Each neuron was numbered and its precise location in the field of view was known. Calcium transients represent changes in fluorescence: $(F_i - F_o)/F_o$, where $F_i$ denotes the fluorescence intensity at any frame and $F_o$ denotes the basal fluorescence of each neuron. As has been reported (Carrillo-Reid et al., 2008, 2009b; Jáidar et al., 2010), the first time derivative of the calcium transient reflects the time of electrical discharge of striatal neurons (threshold: >2.5 times the standard deviation of the noise value); in this way, the electrical activity of each neuron in the field of view could be followed throughout the experiment.
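The detection step described above can be sketched in a few lines. This is an illustrative sketch, not the LabView program used by the authors. The way the basal fluorescence and the noise level are estimated here (mean and derivative spread over the first `n_baseline` frames) is an assumption, and the function names and the toy trace are hypothetical.

```python
import statistics


def delta_f_over_f(trace, f0):
    """Relative fluorescence change (F_i - F_o) / F_o for every frame."""
    return [(f - f0) / f0 for f in trace]


def detect_events(trace, n_baseline, k=2.5):
    """Mark frame-to-frame steps whose first derivative exceeds k noise SDs.

    Assumption: the first n_baseline frames are quiet, so they provide both
    the basal fluorescence F_o and the noise estimate.
    """
    f0 = statistics.mean(trace[:n_baseline])
    dff = delta_f_over_f(trace, f0)
    deriv = [b - a for a, b in zip(dff, dff[1:])]      # first time derivative
    noise_sd = statistics.stdev(deriv[:n_baseline - 1])
    return [1 if d > k * noise_sd else 0 for d in deriv]


# Toy trace (arbitrary units): quiet baseline, one transient, quiet again.
trace = [100, 101, 99, 100, 101, 99, 100,
         140, 160, 150, 140, 130, 120, 110,
         100, 101, 99, 100]
events = detect_events(trace, n_baseline=7)
print([i for i, v in enumerate(events) if v])  # steps on the rising phase
```

Only the rising phase of the transient crosses the threshold, so the slow decay of the calcium signal is not counted as additional discharge.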

We constructed binary matrices with the activity of dozens of neurons recorded simultaneously (raster plots). In each matrix, each row denotes an active neuron (numbered), while each column represents a time frame when an image was taken. The time axis represents the frames of each movie, converted to minutes. For analysis, we considered calcium transients elicited by neurons only. Signals from neurons are much faster than signals from glial cells (Ikegaya et al., 2004; Sasaki et al., 2007; Carrillo-Reid et al., 2008). To visualize electrical activity from several active neurons, the binary matrix was plotted as a raster plot where neuronal firing is represented by dots. The activity histogram, shown below the raster plot, illustrates the population activity of all neurons recorded during an experiment; it is obtained by summing all active neurons in each frame. Each spontaneous peak of synchronous activity denotes a pool of several neurons firing together (or having closely correlated activity). Note that each peak of synchronization is a neuronal column vector. Different colors denote that neuronal vectors are composed of different neuron pools.
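As a small illustration of these definitions, the sketch below builds a toy binary matrix, sums it column by column to obtain the activity histogram, and extracts the column vectors at two peaks. The matrix values and function names are hypothetical and serve only to show the data layout.

```python
def activity_histogram(raster):
    """Number of active neurons in each frame (column sums of the raster)."""
    return [sum(column) for column in zip(*raster)]


def column_vector(raster, frame):
    """Neuronal vector at one frame: which neurons were active."""
    return [row[frame] for row in raster]


# Rows are neurons, columns are frames; 1 marks inferred discharge.
raster = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 0],
]

hist = activity_histogram(raster)
peak_a = column_vector(raster, 1)
peak_b = column_vector(raster, 4)
print(hist)            # [0, 3, 0, 1, 3, 0]
print(peak_a, peak_b)  # [1, 1, 1, 0] [1, 0, 1, 1]
```

The two peaks have the same height but different column vectors, which is the sense in which different peaks can be produced by different neuron pools.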

#### **STATISTICAL METHODS**

Statistically significant peaks of co-active neurons were vectorized, identified, and counted (Carrillo-Reid et al., 2008); therefore, the level of correlated firing in the network can be quantitatively assessed and statistically compared. To assess the probability that a given peak of synchronization had appeared by chance, the points of the same matrix (raster plot) were used for Monte Carlo simulations with 10,000 replications. Thus, a level of significance is marked (dashed line) for all activity histograms. All the peaks of synchronization denoted by colors surpassed the significance level (*P* < 0.05). In control conditions without NMDA there are no significant peaks of synchronization. However, they appear in control tissue after adding NMDA. In addition, they appear without NMDA when the tissue is depleted of dopamine (DA-depleted). If a treatment suppresses NMDA-induced activity or DA-depletion-induced activity, significant peaks of synchronization may disappear from the activity histogram, and the matrix then contains only black dots, denoting a decrease of network activity.
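One way such a significance level could be obtained is sketched below in Python. The surrogate scheme is an assumption made for illustration (each neuron's frames are shuffled independently, and the threshold is the 95th percentile of the largest chance co-activation); the text above states only that the points of the same matrix were used for 10,000 Monte Carlo replications. All function and variable names are hypothetical.

```python
import numpy as np

def activity_histogram(raster):
    """Sum a binary (neurons x frames) raster column by column."""
    return np.asarray(raster).sum(axis=0)

def significance_threshold(raster, n_rep=10_000, alpha=0.05, seed=0):
    """Co-activity level reached by chance with probability below alpha.

    Assumption: every surrogate shuffles each neuron's frames independently,
    keeping each cell's activity count but destroying timing across cells.
    """
    rng = np.random.default_rng(seed)
    raster = np.asarray(raster)
    max_coactive = np.empty(n_rep)
    for r in range(n_rep):
        surrogate = np.array([rng.permutation(row) for row in raster])
        max_coactive[r] = surrogate.sum(axis=0).max()
    return np.quantile(max_coactive, 1 - alpha)

# Hypothetical raster: sparse random firing plus one planted synchronous frame.
rng = np.random.default_rng(1)
raster = np.zeros((20, 200), dtype=int)
for i in range(20):
    raster[i, rng.choice(200, size=5, replace=False)] = 1
raster[:15, 100] = 1

hist = activity_histogram(raster)
threshold = significance_threshold(raster, n_rep=300)  # fewer replications for speed
significant_frames = np.flatnonzero(hist > threshold)
```

Columns of the raster at the frames in `significant_frames` are the neuronal vectors that would be kept for the subsequent similarity analysis.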

The sequence of peaks of synchronization in the activity histogram denotes the activity of the microcircuit over time, that is, a sequence of neuron pools synchronizing their firing and alternating their activity with other neuron pools. This type of circuit behavior has been called assembly dynamics (e.g., Carrillo-Reid et al., 2008). To know whether synchronization increased or decreased after a given treatment, the number of neuronal vectors was counted in each of several image sequences (epochs) at different times in the same experiment. Averages from different slices were pooled and a distribution-free statistic (Mann-Whitney's U) was used for comparison. A Wilcoxon *T*-test was used to compare the same slice under different conditions. Note that each 3 min epoch commonly has dozens of individual cells. For comparison of epochs from different slices (**Figure 3**) the Kruskal-Wallis one-way analysis of variance was used with *post hoc* Dunn tests.

These peaks of synchronization denote neuronal vectors that represent microcircuit activity in a multidimensional space, where the number of dimensions is given by the total number of active cells in each vector. Vectorization of network activity throughout the experiment allows a search for recurrent patterns of activity, i.e., vectors being active repeatedly or different vectors alternating their activity (Schreiber et al., 2003; Ikegaya et al., 2004; Brown and Williams, 2005; Carrillo-Reid et al., 2008). The set of all these vectors connected by arrows represents the transitions between network states. To know whether the same vectors are active several times, similarity maps were constructed and all possible vector pairs were compared over time. The similarity index between any pair of vectors is defined by their normalized scalar product (Sasaki et al., 2007; Carrillo-Reid et al., 2008, 2009b): the cosine of the angle between the compared vectors. High similarity between vectors means that the activity of almost the same cells (same neuron pool) generated them at different times (Schreiber et al., 2003; Carrillo-Reid et al., 2008, 2009a; Jáidar et al., 2010). Different vectors (different neuron pools) are denoted by different colors in raster plots, activity histograms and locally linear embedding (LLE) graphs.
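The similarity index defined above, the normalized scalar product of two vectors, can be written compactly. The following Python sketch is illustrative only; the function name and the toy vectors are hypothetical.

```python
import numpy as np

def similarity_matrix(vectors):
    """Pairwise normalized scalar products (cosines) between column vectors.

    vectors: (neurons x peaks) binary array, one column per synchronization peak.
    """
    v = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(v, axis=0)
    norms[norms == 0] = 1.0          # empty vectors get similarity 0 instead of NaN
    unit = v / norms
    return unit.T @ unit

# Four hypothetical peaks over four neurons (columns are neuronal vectors).
peaks = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
])
sim = similarity_matrix(peaks)
```

Here the first two peaks recruit the same pool (similarity 1), the third recruits a disjoint pool (similarity 0 with the first), and the fourth shares one of two cells with the first (similarity 0.5).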

The method used to detect the dynamics of network states from multidimensional vectors has been published (Carrillo-Reid et al., 2008, 2009b; Jáidar et al., 2010). Briefly, the dimensionality of population vectors representing network states was reduced by LLE, a dimensionality reduction technique that preserves the structure of non-linear multidimensional data (Roweis and Saul, 2000; Brown and Williams, 2005; Carrillo-Reid et al., 2008). Vectors are then projected into a two-dimensional space. As a result, it is possible to visualize clusters of data points representing the recurrence of similar vectors with similar pools of neurons (network states) alternating their activity. Their sequences of activation may follow cycles or reverberation denoted by arrows (Schreiber et al., 2003; Sasaki et al., 2007; Carrillo-Reid et al., 2008, 2009b; Jáidar et al., 2010). To choose the optimal number of network states we used hard and fuzzy clustering algorithms and Dunn's index as a validity function (Bezdek and Pal, 1998; Sasaki et al., 2007; Carrillo-Reid et al., 2008, 2009b).
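As an illustration of how a validity function can rank candidate partitions of the projected vectors, the Python sketch below computes Dunn's index in its basic form (smallest distance between clusters divided by the largest cluster diameter) for a given hard partition. It does not perform the clustering itself, and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter.

    Requires at least two clusters and at least one cluster with two points.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    max_diameter = dist[same].max()      # largest within-cluster distance
    min_separation = dist[~same].min()   # closest pair from different clusters
    return min_separation / max_diameter

# Hypothetical two-dimensional projections forming two well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = dunn_index(points, [0, 0, 1, 1])   # partition that follows the groups
bad = dunn_index(points, [0, 1, 0, 1])    # partition that cuts across them
```

A larger index indicates compact, well-separated clusters, so the candidate number of network states giving the highest index would be preferred.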

Global neuronal activity over a given time was also represented by cumulative distributions of all cell activity in a given epoch. The rates of accumulation were approximated with *ad hoc* linear regressions. Their average slopes ± their estimation errors were compared for significant differences with non-paired Student's *t* tests, experiment by experiment. Average significance is reported. In addition, for sample comparisons of these parameters we used Wilcoxon's T statistic for paired samples and Mann-Whitney U statistic for non-paired samples (Plata et al., 2013).
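The rate of accumulation can be sketched as the slope of a straight line fitted to the cumulative activity. The Python example below is an illustration under stated assumptions: the function name is hypothetical and the frame rate is a free parameter chosen for the example, since it is not given in this section.

```python
import numpy as np

def accumulation_rate(raster, frames_per_min):
    """Slope (activity events per minute) of a line fitted to cumulative activity."""
    hist = np.asarray(raster).sum(axis=0)             # active cells per frame
    cumulative = np.cumsum(hist)                      # cumulative activity over the epoch
    minutes = np.arange(1, hist.size + 1) / frames_per_min
    slope, _intercept = np.polyfit(minutes, cumulative, 1)
    return slope

# Hypothetical epoch: two cells active in every one of 60 frames, 20 frames per minute.
raster = np.ones((2, 60), dtype=int)
rate = accumulation_rate(raster, frames_per_min=20)   # 2 events/frame * 20 frames/min
```

Slopes obtained in this way for two conditions of the same experiment are the quantities that would then be compared statistically.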

#### **THE 6-OHDA HEMI-PARKINSONIAN MODEL**

Hemiparkinsonian animals, rats or BAC-mice, were obtained as previously reported (Jáidar et al., 2010). Briefly, animals were anesthetized with ketamine (85 mg/kg, i.p.) plus xylazine (15 mg/kg, i.p.) while immobilized on a stereotactic frame. Each animal received a unilateral injection of 6-OHDA (8 μg in 0.2 μl with 0.2 mg/ml of ascorbic acid), at 0.1 μl/min, in the substantia nigra pars compacta (SNc coordinates: 3.8 mm caudal, 1.8 mm lateral to bregma, and 7.1 mm ventral to the skull surface in rats; 2.6 mm caudal, 0.7 mm lateral to bregma, and 4.5 mm ventral to the skull surface in postnatal day 21 mice).

As described before (López-Huerta et al., 2012), the degree of DA deprivation was tested by provoking turning behavior 8 days after the surgery using automated rotometers and amphetamine (4 mg/kg, i.p.). Because this protocol has been extensively reported on several occasions (6-OHDA model; e.g., López-Huerta et al., 2012), it will not be reported in the Results, but only as a method to obtain hemiparkinsonian animals. Left and right full body turns were recorded for 90 min by a home-made computerized monitor. Animals showing >500 turns ipsilateral to the injected side were considered for further experiments.

### **RESULTS**

#### **CELL ASSEMBLY DYNAMICS IN THE STRIATAL MICROCIRCUIT**

**Figure 1A** illustrates a raster plot or matrix showing the activity of >100 neurons in a field of view within a striatal slice in control conditions. Each row in the plot represents the electrical activity of one neuron across a series of images (columns) recorded by means of calcium imaging using fluo-8 (see Materials and Methods). The leftmost frame shows 3 min of activity (an epoch) in control striatal tissue without NMDA: note the scarcity of spontaneous activity and the absence of significant peaks of synchronization. The next three frames (3 epochs, 3 min each) show an increase in activity involving dozens of individual striatal neurons (single cell resolution) after adding 2 μM NMDA to the bath saline. All three epochs display the NMDA-induced activity. Colored dots denote neurons firing together and belonging to a pool of neurons. Different colors denote different neuron pools. The activity of these neurons is vectorized (column vectors). Note that neuronal vectors alternate their activity over time, that is, the network transits from one set of neurons to another, as indicated by different colors.

The activity histogram at the bottom (**Figure 1B**) represents the summed neuronal activity of the matrix above, column by column, over time. Therefore, it represents a multicellular or population recording, but with the possibility of locating and counting each of the cells composing the peaks of synchronization (Carrillo-Reid et al., 2008, 2009a). The dashed horizontal line shows the level of statistical significance for the peaks of synchronization (obtained by Monte Carlo simulations, see above). Note that several peaks of neuronal synchronization, denoted by colors, cross this level, indicating that sets of neurons synchronize their firing significantly and spontaneously and then alternate their activity with other synchronous neuron pools. The similarity index (**Figure 1C**) compares each neuronal vector with all others over time: a patchy appearance shows that several similar vectors were in charge of activity over time. Dimensionality reduction of neuronal vectors using LLE shows the same vectors projected into a two-dimensional space with no units (**Figure 1D**; Carrillo-Reid et al., 2009a). Similar vectors group together (denoted by different colors), indicating various (3) network states in the circuit. Transitions between network states follow determined sequences or trajectories (arrows). Percentages give the probability of leaving a given network state. In this way, we can say that the group of neurons in the field of view shows cell assembly dynamics (Carrillo-Reid et al., 2008, 2009a). Note that in the control without NMDA (leftmost epoch in **Figure 1A**) there is no assembly dynamics. In the next three Figures, control activity indicates activity in the presence of 2 μM NMDA, that is, cell assembly dynamics or circuit activity. It is on this activity that we tested the global action of nicotine in the striatal circuitry.
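As a rough illustration of how percentages of this kind could be derived, the Python sketch below estimates transition probabilities from the sequence of network states visited by successive peaks. The state labels and function name are hypothetical, and the estimator (relative frequency of each observed transition) is an assumption; the exact procedure is described in the cited methods papers.

```python
from collections import Counter, defaultdict

def transition_probabilities(states):
    """Estimate P(next state | current state) from a sequence of state labels."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in counts.items()
    }

# Hypothetical sequence of network states, one label per synchronization peak.
sequence = ["A", "A", "B", "C", "A", "B"]
probs = transition_probabilities(sequence)

# Probability of leaving state A = 1 - probability of staying in A.
p_leave_a = 1 - probs["A"].get("A", 0.0)
```

In this toy sequence, state A is followed once by itself and twice by B, so the estimated probability of leaving A is 2/3.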

#### **GLOBAL ACTION OF NICOTINE IN THE ACTIVITY OF THE STRIATAL MICROCIRCUIT IS INHIBITORY**

Raster plot in **Figure 2A** illustrates three epochs (3 min each, separated by dashed vertical lines). The left epoch shows activity similar to that described in **Figure 1A** in the continuous presence of 2 μM NMDA (arrow; NMDA was present throughout this experiment): this activity reveals significant peaks of synchronization in the activity histogram below, the manifestation of assembly dynamics (**Figure 2B**, colored). In the presence of NMDA, addition of 1 μM nicotine to the bath saline (middle epoch) drastically reduced the level of neuronal activity (horizontal bar indicates time of nicotine exposure). Nicotine reduced most circuit activity: fewer dots and an absence of significant peaks of synchronization. Note that the actions of nicotine were reversible upon washing (right epoch). The histogram of multicellular activity (**Figure 2B**; note significance level indicated by a dashed horizontal line) shows spontaneous and significant peaks of synchronization only in the presence of NMDA alone. Addition of nicotine abolished the peaks of synchronized activity even in the presence of NMDA (**Figure 2B**). Nonetheless, after nicotine was washed off, the peaks of synchronization began to return (right epoch). By summing all activity from the histogram in **Figure 2B** over time (all bars in the histogram for each epoch), we obtained a graph of cumulative cell activity (**Figure 2C**). It shows that total circuit activity in the presence of NMDA alone is much higher than during nicotine (*ad hoc* fitting of straight lines, where slopes give the rate of accumulation over time ± estimation error). An average of a sample of experiments yields 479 ± 2 (act/min), which decays to 190 ± 1.5 (act/min) during nicotine (*n* = 6 slices; *P* < 0.005). **Figure 2D** shows the similarity matrix when nicotine is not present, confirming that assembly dynamics was present during NMDA. In addition, we counted the number of significant peaks of synchronization per epoch in the presence of NMDA and during nicotine in the continuous presence of NMDA (**Figure 2E**): an average of 3.3 ± 1.2 peaks/epoch in the control (NMDA) vs. 0.16 ± 0.4 peaks/epoch in the presence of nicotine (*n* = 6 slices; *n* = 3 epochs per slice; ∗∗*P* < 0.025).

**Figure 3** shows that actions of nicotine were blocked by 10 μM mecamylamine, a non-selective and non-competitive antagonist of nAChrs (*n* = 4). NMDA produced the usual increase in circuit activity with spontaneous and statistically significant peaks of synchronization (**Figures 3A,B** left epoch; NMDA was present during the whole experiment). This activity was greatly decreased when nicotine was added to the bath (middle epoch in **Figures 3A,B**), when significant peaks of synchronization disappeared, denoting that many individual neurons stopped firing (dots in the matrix). However, they partially returned when mecamylamine was added in the presence of both NMDA and nicotine (**Figures 3A,B** right epoch), suggesting that the actions of nicotine were receptor specific. A more complete characterization of the nicotinic receptors involved is beyond the scope of the present report. Cumulative plots (**Figure 3C**) show that circuit activity is significantly lowered only when nicotine was added: an average of the sample of experiments yields 163 ± 1.3 (act/min) in NMDA, which decays to 51 ± 0.7 (act/min) during nicotine (*n* = 3; average significance of *P* < 0.01). Note that subsequent addition of mecamylamine produced a recovery in the trend of cumulative activity (green trace), even in the continuous presence of nicotine. The vector similarity matrix confirmed that NMDA-induced activity had assembly dynamics (**Figure 3D**). Histogram in **Figure 3E** shows the average number of synchronization peaks taken from several control slices in the presence of NMDA: 3.6 ± 0.34 peaks per epoch (*n* = 10 slices and *n* = 10 epochs). This average decreased significantly when nicotine was added to the superfusion: 0.2 ± 0.13 peaks per epoch (∗∗∗*P* < 0.001; *n* = 10 slices and *n* = 10 epochs). Note that when the non-selective nAChr antagonist, mecamylamine, was added to the bath saline, the peaks of synchronization returned gradually and significantly: 2.7 ± 0.33 peaks per epoch (∗∗*P* < 0.01; *n* = 4 slices; *n* = 6 epochs).

**FIGURE 1 | Assembly dynamics in the control striatal circuitry after NMDA. (A)** Raster plot showing the simultaneous activity of >100 neurons in a striatal slice using calcium imaging. Each row in the matrix represents the activity of a single neuron across the series of images (columns). Left epoch shows 3 min of activity in control striatal tissue without NMDA. Note scarce activity and the absence of peaks of synchronization. In the next three epochs, separated by dashed vertical lines, 9 min of activity are shown after adding 2 μM NMDA to the bath saline (denoted by horizontal black line on top). Note that many more dots, some of them colored, populate the matrix. Colored dots denote the synchronized activity of several neurons in a given column or neighboring columns (a neuron pool is then represented as a column vector). Note that different neuron pools produce the circuit activity over time. **(B)** Activity histogram at bottom is the summed neuronal activity (multicellular activity) from the raster plot above, column by column (frame by frame in a given movie). The dashed line shows the level of significance for the spontaneous peaks of synchronization denoted by colors (obtained by Monte Carlo simulations; statistically significant neuronal vectors; *n* = 6 slices). **(C)** A similarity index matrix compares each vector with all others over time: a patchy appearance shows that similar vectors were in charge of activity. **(D)** Dimensionality reduction using locally linear embedding (LLE) shows neuronal vectors projected in a two-dimensional space with no units. Similar vectors grouped together (denoted by different colors) give rise to network states. The transitions among network states are denoted by arrows. Percentages give the probability of leaving a given state. Colored dots and arrows represent the sequential activity of the circuit, that is, pools of neurons synchronize their firing and pass their activity from one pool of neurons to another: cell assembly dynamics (Carrillo-Reid et al., 2008). Note reverberant trajectories in the sequence. Here and in the next figures, epochs (times of continuous image series) are separated by vertical dashed lines.

In summary, nicotine was able to reduce circuit activity in the striatal network and its actions were both reversible (**Figure 2**) and blocked by mecamylamine (**Figure 3**). But nAChrs are ionotropic channels that carry inward currents that should excite, not inhibit, circuit activity, just as NMDA receptors do. How is it possible that two ionotropic cationic receptors have opposite actions in the same circuit? In accordance with some cell-focused studies, we hypothesized that the actions of nicotine were indirect, via the activation of striatal GABAergic interneurons (e.g., English et al., 2011; Ibáñez-Sandoval et al., 2011; Luo et al., 2013).

To see if this latter hypothesis was true we tried to block nicotine effects on the circuit with GABAA-receptor antagonists.

#### **INHIBITORY ACTION OF NICOTINE IN THE STRIATAL CIRCUIT IS BLOCKED BY GABAA-RECEPTOR ANTAGONISTS**

**Figure 4A** illustrates a raster plot with three epochs (separated by dashed vertical lines). At left, the usual neuronal activity found in striatal tissue after addition of 2 μM NMDA to the bath is shown (cf., **Figure 1**). Significant peaks of synchronization are present (**Figure 4B** left epoch; blue and red). In the middle epoch, 1 μM nicotine plus 10 μM bicuculline were added. In contrast with the action of bicuculline alone, where the activity of the circuit increases using the same pool of neurons over and over again (see Figure 8 in Carrillo-Reid et al., 2008 and Figure 5 in Jáidar et al., 2010), bicuculline together with nicotine generated circuit activity that displayed an increase in synchronization peaks (**Figures 4A,B** middle epoch). In these conditions, nicotine no longer restrained circuit activity, and the induced activity consisted of a frenzied sequence of peaks coming from different neuron pools that alternated their activity without a regular pace. Gabazine (10 μM) had the same results when administered with nicotine (*n* = 3; not shown but see: López-Huerta et al., 2012, 2013). Alternations among neuronal pools became frequent and were rarely repeated.

In summary, bicuculline added alone increased activity based on a single or dominant peak of synchronization (see Carrillo-Reid et al., 2008; Jáidar et al., 2010); nicotine added alone decreased the activity, and peaks of synchronization disappeared (**Figure 2**); but nicotine added with bicuculline increased activity based on an increase in the peaks of synchronization (**Figure 4**; cf., middle epoch with left and right epochs). Washing off both drugs returned the circuit to the usual levels of activity during NMDA (**Figures 4A,B** right epoch). The similarity matrix includes vector activity with bicuculline and nicotine (**Figure 4C**). Dimensionality reduction with LLE compares circuit dynamics before (**Figure 4D** left; only two states, red and blue) and after nicotine plus bicuculline (**Figure 4D** right; four states). Note that transitions between network states increased drastically. This action was surprising in the sense that it suggests that several different sets of neurons are being activated by nicotine in spite of blocking GABAA-receptor transmission, as in the case of DA D2-receptor agonist action in the striatal circuit (cf., Figure 5B in Carrillo-Reid et al., 2011).

#### **PARKINSONIAN ACTIVITY IN DOPAMINE DEPLETED STRIATUM IS REDUCED BY NICOTINE**


A sample of mice was lesioned unilaterally with 6-OHDA. The degree of DA deprivation in the striatal tissue was tested by provoking turning behavior 8 days after the surgery using automated rotometers and amphetamine (4 mg/kg, i.p.). Animals showing >500 turns ipsilateral to the injected side were considered for further experiments (López-Huerta et al., 2012).

**FIGURE 4 | Global actions of nicotine in the striatal circuit are blocked by bicuculline. (A)** Raster plot showing again three different epochs of activity. At the left epoch, the typical activity of the striatal circuit with 2 μM NMDA is seen displaying two neuron pools with synchronized activity (red and blue). Middle epoch: neuronal activity is dramatically increased due to the addition of 1 μM nicotine plus 10 μM bicuculline to the bath saline. In contrast with what happens when bicuculline is administered alone (Carrillo-Reid et al., 2008), nicotine plus bicuculline showed a quite diverse set of synchronization peaks. Right epoch: nicotine plus bicuculline actions were reversible. **(B)** Activity histogram of summed multicellular activity showing that in control (left) there are five peaks of spontaneous synchronization belonging to two different neuron pools (blue and red). Peaks of synchronization increase in number and classes when both nicotine and bicuculline are added to the bath (middle epoch). When bicuculline is given alone, circuit activity is manifested by a highly recurrent peak of synchronization (Carrillo-Reid et al., 2008). Right epoch: activity begins to return to normal after the drugs were washed off. **(C)** Similarity matrix in the presence of both drugs. **(D)** LLE before (NMDA left) and after addition of nicotine plus bicuculline (right): note more network states and more transitions in circuit activity when both drugs are present.

As already reported (Jáidar et al., 2010), the DA-depleted striatum exhibited spontaneous activity and statistically significant peaks of synchronization that appear spontaneously in the absence of NMDA or any other excitatory drive (**Figures 5A,B**; first three epochs). This pathological activity is different from that recorded in the control striatum without NMDA (leftmost epoch in **Figure 1A**). It is also different from NMDA-induced activity in control tissue (Carrillo-Reid et al., 2008; Jáidar et al., 2010). The first three epochs in the raster plot of **Figure 5A** show Parkinsonian activity, followed by addition of nicotine in the last two epochs. In fact, nicotine was added after the beginning of the fourth epoch so that its rapid action could be appreciated. The activity histogram (**Figure 5B**) shows that the DA-depleted microcircuit presents significant peaks of synchronization. Note, however, that in this case the peaks are composed of the same pool of neurons (same color, red), having recurrent activity again and again (Jáidar et al., 2010), as is the case with bicuculline when it is given alone (Carrillo-Reid et al., 2008). In other words, DA depletion produces increased activity with no alternation. Interestingly, addition of 1 μM nicotine to the bath saline abolished Parkinsonian activity and the peaks of synchronization (last two epochs in **Figures 5A,B**). Cumulative activity clearly shows more activity over time for the Parkinsonian microcircuit (**Figure 5C**): the average rate of activity over time in the DA-deprived circuit was 170 ± 1 (act/min), while it was 54 ± 0.4 (act/min) after nicotine (*n* = 6; *P* < 0.006), showing that nicotine significantly reduced pathological activity.

As previously reported (Jáidar et al., 2010), peaks of synchronization were made by similar neuronal vectors or the same network state having recurrent activity, according to similarity index and LLE analysis (**Figures 5D,E**; Jáidar et al., 2010). Note LLE (**Figure 5E**) showing the same network state recurring on itself, absorbing all synchronized neurons, suggesting that this is the microcircuit correlate of Parkinsonian slow oscillations. It was surprising that nicotine was able to suppress the excess activity and the dominant network state. Finally, we also quantified the number of synchronization peaks per epoch: an average of 4.2 ± 1.7 peaks/epoch (*n* = 6 slices; *n* = 18 epochs) in a sample of DA-deprived slices and 0.3 ± 0.7 peaks/epoch in DA-deprived slices during nicotine treatment (**Figure 5F**; *n* = 6 slices; *n* = 12 epochs; ∗∗∗*P* < 0.005). In fact, the action of nicotine in this early Parkinsonian tissue is as potent as that of L-DOPA (cf., Plata et al., 2013).

**FIGURE 5 | Nicotine inhibits Parkinsonian activity in the dopamine depleted striatal microcircuit. (A)** Raster plots showing five epochs of activity in a DA-depleted striatal circuit. There is spontaneous neuronal activity in the absence of NMDA (first three epochs show activity in a 6-OHDA rodent model of hemi-Parkinsonism, see Materials and Methods). However, significant synchronization is composed of the same neuron pool. Last two epochs show that nicotine reduces this activity. **(B)** Note that activity during DA deprivation exhibits significant peaks of synchronization. However, all are depicted with the same color since similarity indexes indicate that these vectors are built with a similar set of neurons presenting correlated activity in a recurrent mode. Note that these peaks disappear during nicotine. **(C)** Cumulative activity is significantly higher during DA-depletion than during DA-depletion with nicotine. **(D)** Similarity matrix shows similar neuronal vectors dominating the activity over time in the DA-depleted microcircuit. **(E)** LLE shows that the same network state keeps repeating its firing over time in the DA-depleted tissue (all dots are of the same group-color), as though the circuit could not leave this dominant network state (Jáidar et al., 2010). Nicotine was able to suppress this recurrent network state significantly. **(F)** The histogram shows that peaks of spontaneous synchronization are virtually abolished by nicotine. ∗∗∗*P* < 0.005.

### **DISCUSSION**

Global action of nicotine in the striatal microcircuit reduced both the NMDA-induced neuronal activity in control tissue and the Parkinsonian pathological activity in the rodent 6-OHDA model of PD. However, nicotine, like NMDA, is an agonist of ligand-gated ionotropic channels whose role is to generate inward currents and depolarize target neurons. Nevertheless, these two agonists, NMDA and nicotine, have completely different and even opposite actions in the striatal circuit.

Cell-focused studies have disentangled an array of different actions of nicotine at the cellular and synaptic levels. First, nicotine activates nAChrs in incoming glutamate terminals inducing glutamate release (e.g., Marchi et al., 2002; Zhang and Warren, 2002; Campos et al., 2010). If this action were the predominant action at the microcircuit level, then it would be facilitatory of circuit activity and add to the activation produced by NMDA in the control circuit or to the pathological activity found in the DA-depleted circuit. This was not the case.

Second, nicotine also activates dopaminergic synapses, inducing DA release (e.g., Wonnacott et al., 2000; Grady et al., 2007; Keath et al., 2007; Livingstone and Wonnacott, 2009; Xiao et al., 2009; Cachope et al., 2012; Threlfell et al., 2012). This action taken alone would lead to the induction of some kind of activity in the network (Carrillo-Reid et al., 2011), although this activity would be unpredictable based solely on cell-focused studies. On the one hand, DA induces firing in striatal interneurons. This action would favor inhibition of the circuit (Bracci et al., 2002). On the other hand, besides postsynaptic activation of interneurons, DA also inhibits the release of GABA from the terminals of the same or different interneurons (Centonze et al., 2003). These results are apparently contradictory, and the end result of both actions taken together is hard to infer from cell-focused studies alone. This fact supports the need to observe actions of transmitters at the network level, and not only at the cellular level. Also, when seen at the circuit level, both classes of DA receptors, D1 and D2, induce an increase in spontaneous synchronized activity (Carrillo-Reid et al., 2011). Besides, individual SPNs may be inhibited or excited by DA (e.g., Kiyatkin and Rebec, 1996; Carrillo-Reid et al., 2011) depending on context. Taken one by one, it is difficult to make sense of all these actions, and any inference about the end results is precluded. This fact supports the direct study of the global action in the microcircuit as a whole.

Furthermore, when the activity of the circuit is increased due to DA depletion (Jáidar et al., 2010), collateral inhibition among SPNs is decreased, and most actions of nicotine would be on GABAergic synapses made by interneurons. In such a case, DA analogs, such as L-DOPA, completely restore the control activity of the circuit, dramatically and reversibly; this action is a neuronal correlate of behavioral and clinical trials (Plata et al., 2013). Accordingly, one question of the present work is how closely nicotinic actions approximate those of L-DOPA.

Third, nAChr actions have also been recorded in striatal GABAergic interneurons during cell-focused studies: several interneuron types, such as fast-spiking (FS), low-threshold spiking (LTS), tonically firing neurogliaform cells, and others, release GABA upon nicotinic activation and inhibit projection neurons (Koós and Tepper, 2002; Wilson, 2004; Kreitzer and Malenka, 2008; Livingstone and Wonnacott, 2009; Xiao et al., 2009; English et al., 2011; Ibáñez-Sandoval et al., 2011; Luo et al., 2013). As with the cell-focused studies described above, this appears to be another important action of nicotine when observed in isolation, cell by cell.

To conclude, cell-focused studies have revealed a host of mixed and parallel actions, some pointing to inhibitory and others to excitatory types of activity. Taken together, all these actions make it hard to predict the global action of nicotinic receptor activation on the striatal microcircuit as a whole. The importance of answering this question is that clinical studies claim improvements of motor and cognitive symptoms in PD patients treated with nicotine analogs (Fagerström et al., 1994; Villafane et al., 2007), that is, after systemic administration. Moreover, some epidemiological studies claim a lower incidence of PD in smokers (Gorell et al., 1999; Herman et al., 2001; Driver et al., 2009). Therefore, it is logical to ask what the predominant action of nicotine in the striatal circuit is when tonic concentrations are raised. To answer this question directly at the microcircuit level, some type of multicell recording, and therefore a more sophisticated analysis, is required. However, the results surpassed any expectation about which of the mixed and parallel actions described in cell-focused studies predominates: nicotine readily and reversibly decreased control (with NMDA) and Parkinsonian activity. This observation gives rise to perhaps more important questions: what is the relation between nicotine actions and its supposed anti-Parkinsonian activity? Could it be used as an adjunct therapy in PD?

In summary, we demonstrate that the predominant action of a tonic elevation in nicotine concentration in the striatal microcircuit is the inhibition of network activity through the activation of GABAergic transmission, since inhibitory nicotinic action was blocked by GABAA-receptor antagonists. Secondly, the inhibitory action could be exerted either on the assembly dynamics induced by an uncorrelated excitatory drive such as NMDA (Carrillo-Reid et al., 2008), or on the pathological activity derived from DA depletion (Jáidar et al., 2010) in early Parkinsonian animal models.

In the second case, nicotinic action is a microcircuit correlate of the already described anti-Parkinsonian actions of nicotine (Fagerström et al., 1994; Gorell et al., 1999; Costa et al., 2001; Herman et al., 2001; Villafane et al., 2007; Driver et al., 2009; Quik et al., 2007; García-Montes et al., 2012; Kawamata et al., 2012). Indeed, in comparison to our early Parkinsonian animal models (Jáidar et al., 2010), we show that the action of nicotine is almost as strong as that of L-DOPA (Plata et al., 2013). Such a strong anti-Parkinsonian action at the circuit level was certainly unexpected.

Many classes of striatal interneurons are known to express nAChRs and can be activated by nicotinic analogs (Koós and Tepper, 2002; Wilson, 2004; Quik et al., 2007; Kreitzer and Malenka, 2008; Livingstone and Wonnacott, 2009; Xiao et al., 2009; English et al., 2011; Ibáñez-Sandoval et al., 2011; Luo et al., 2013). As a result, they release GABA and inhibit SPNs. This inhibition is blocked by GABAA-receptor antagonists such as bicuculline (e.g., English et al., 2011; Ibáñez-Sandoval et al., 2011; Luo et al., 2013). Accordingly, we show here that global inhibitory nicotinic actions could be blocked with GABAA-receptor antagonists such as bicuculline, suggesting that the strong decrease in circuit activity, normal or pathological, is due to massive interneuron activation. In fact, the action of bicuculline alone on the circuit has been reported both in control activity (with NMDA; Carrillo-Reid et al., 2008) and in Parkinsonian activity (with DA depletion; Jáidar et al., 2010). When administered alone, bicuculline produces an increase in activity that, similarly to Parkinsonian activity, is characterized by a dominant pool of neurons that synchronize spontaneously and recurrently (Carrillo-Reid et al., 2008). In DA-depleted tissue, the resulting activity entrenches the dominant state produced by the absence of DA, destroying any alternating dynamics that may remain (Jáidar et al., 2010). In contrast, when bicuculline was given in the presence of nicotine, multiple peaks of synchronization appeared, manifesting strong assembly dynamics with abundant trajectories in the LLE plot. This behavior suggests that several classes of interneurons are being activated.

Further research is needed to find out which interneuron classes predominate, since each element of the circuit may express a different nAChR. Dissecting the mechanism of these powerful nicotinic actions is important because PD is known to be accompanied by hypercholinergia (reviewed in Pisani et al., 2005; Goldberg and Reynolds, 2011) and because some types of interneurons become hyperexcitable during DA depletion (Dehorter et al., 2009), a result that appears counter-intuitive given the present data. Therefore, this action may involve the activation of a specific interneuron network (with a specific nAChR class). A strong candidate is the neurogliaform interneuron, which releases abundant GABA, setting the stage for volume transmission (Ibáñez-Sandoval et al., 2011). Alternatively, nicotine may favor the synchronized activity of an interneuron network coupled by gap junctions (Tepper et al., 2010). In both cases, activation of these neurons may be able to inhibit the interneurons that are overactive during PD (Dehorter et al., 2009), including, perhaps, the cholinergic ones. Future research may find that a specific receptor is involved, a necessary step toward finding selective ligands with potential therapeutic use. Finally, *in vivo* experiments are needed to answer another question: how would such a potent inhibition allow common motor tasks?

In any case, the present results highlight the possibility of using nicotine analogs as an adjunct to L-DOPA in PD therapy.

The increase in DA release induced by nicotine may have its own beneficial actions as long as some DA-terminals still remain (Wonnacott et al., 2000; Grady et al., 2007; Keath et al., 2007; Livingstone and Wonnacott, 2009; Xiao et al., 2009; Cachope et al., 2012; Threlfell et al., 2012). A possible difficulty is that DA and glutamate release are tied together, and have a reciprocal interaction and regulation. In any case, here, we present a bio-assay that allows the evaluation of control and pathological activity of the striatal microcircuit in which the actions of drugs with suspected therapeutic actions may be tested and compared with usual behavioral assays.

#### **CONCLUSION**

Calcium imaging techniques may serve to design bio-assays to test potential anti-Parkinsonian drugs in *in vitro* brain slices. Here, we tested nicotine, which has long been suspected to possess anti-Parkinsonian activity. Indeed, it had an action similar to that of L-DOPA assayed in the same preparation (cf., Plata et al., 2013). However, both drugs were assayed in early Parkinsonian animals tested with turning behavior as a correlate of DA deprivation. Further research is needed to test nicotine analogs in later stages of the disease, for example, in dyskinesias.

In addition, we investigated the mechanism of nicotinic actions, which turned out to be indirect: the global effect of elevating the tonic concentration of nicotine in the striatal microcircuit was to inhibit the circuit in a way that was blocked by GABAA-receptor antagonists, that is, most probably by activating a set of inhibitory interneurons. Further research is needed to find out which neurons and nAChRs are involved in these actions. Of the many parallel and sometimes contradictory actions that have been described separately in cell-focused studies, the activation of interneurons appeared to be the predominant one.

#### **AUTHOR CONTRIBUTIONS**

Víctor Plata, Mariana Duhne, and Pavel Rueda-Orozco performed most experiments; Ricardo Hernández-Martinez and Mariana Duhne lesioned animals with 6-OHDA and evaluated behavior; Víctor Plata and Jesús Pérez-Ortega wrote or modified the acquisition and analysis software; Elvira Galarraga, René Drucker-Colín, and José Bargas had the original ideas and planned and reviewed the experiments; José Bargas and Víctor Plata wrote the article. Irakli Intskirveli reviewed the article.

#### **ACKNOWLEDGMENTS**

We thank A. Laville, G. X. Ayala and A. Aparicio for technical support and advice, and Dr. C. Rivera for animal care. This work was supported by Consejo Nacional de Ciencia y Tecnología (CONACyT México) grants 98004 and 154131, and by grants from Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México: IN-202914 and IN-202814 to Elvira Galarraga and José Bargas, respectively. Víctor Plata had a CONACyT doctoral fellowship, and data in this work are part of his doctoral dissertation in the Posgrado en Ciencias Biomédicas de la Universidad Nacional Autónoma de México.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 August 2013; accepted: 17 October 2013; published online: 06 November 2013.*

*Citation: Plata V, Duhne M, Pérez-Ortega J, Hernández-Martinez R, Rueda-Orozco P, Galarraga E, Drucker-Colín R and Bargas J (2013) Global actions of nicotine on the striatal microcircuit. Front. Syst. Neurosci. 7:78. doi: 10.3389/fnsys.2013.00078 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Plata, Duhne, Pérez-Ortega, Hernández-Martinez, Rueda-Orozco, Galarraga, Drucker-Colín and Bargas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Asymmetric right/left encoding of emotions in the human subthalamic nucleus

## *Renana Eitan<sup>1†</sup>, Reuben R. Shamir<sup>2,3,4\*†</sup>, Eduard Linetsky<sup>5</sup>, Ovadya Rosenbluh<sup>1</sup>, Shay Moshel<sup>2,3,4,6</sup>, Tamir Ben-Hur<sup>5</sup>, Hagai Bergman<sup>2,3</sup> and Zvi Israel<sup>4</sup>*

*<sup>1</sup> Department of Psychiatry, Hadassah-Hebrew University Medical Center, Jerusalem, Israel*

*<sup>2</sup> Department of Medical Neurobiology (Physiology), Institute of Medical Research – Israel-Canada, The Hebrew University-Hadassah Medical School, Jerusalem, Israel*

*<sup>3</sup> The Edmond and Lily Safra Center for Brain Research, The Hebrew University, Jerusalem, Israel*

*<sup>4</sup> Department of Neurosurgery, Center for Functional and Restorative Neurosurgery, Hadassah-Hebrew University Medical Center, Jerusalem, Israel*

*<sup>5</sup> Department of Neurology, Hadassah–Hebrew University Medical Center, Jerusalem, Israel*

*<sup>6</sup> The Jerusalem Mental Health Center, Kfar-Shaul Etanim, Jerusalem, Israel*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Andrea A. Kühn, Charité, University-Medicine Berlin, Germany Paul Krack, Grenoble Institute of Neuroscience - Inserm U.836-UJF-CEA-CHU, France*

#### *\*Correspondence:*

*Reuben R. Shamir, Department of Medical Neurobiology (Physiology) and Department of Neurosurgery, The Hebrew University-Hadassah Medical School, Ein-Karem Campus, PO Box 12272, 91120 Jerusalem, Israel*

*e-mail: shamir.ruby@gmail.com*

*†These authors have contributed equally to this work.*

Emotional processing is lateralized to the non-dominant brain hemisphere. However, there is no clear spatial model for lateralization of emotional domains in the basal ganglia. The subthalamic nucleus (STN), an input structure in the basal ganglia network, plays a major role in the pathophysiology of Parkinson's disease (PD). This role is probably not limited only to the motor deficits of PD, but may also span the emotional and cognitive deficits commonly observed in PD patients. Beta oscillations (12–30 Hz), the electrophysiological signature of PD, are restricted to the dorsolateral part of the STN that corresponds to the anatomically defined sensorimotor STN. The more medial, more anterior and more ventral parts of the STN are thought to correspond to the anatomically defined limbic and associative territories of the STN. Surprisingly, little is known about the electrophysiological properties of the non-motor domains of the STN, nor about electrophysiological differences between right and left STNs. In this study, microelectrodes were utilized to record the STN spontaneous spiking activity and responses to vocal non-verbal emotional stimuli during deep brain stimulation (DBS) surgeries in human PD patients. The oscillation properties of the STN neurons were used to map the dorsal oscillatory and the ventral non-oscillatory regions of the STN. Emotive auditory stimulation evoked activity in the ventral non-oscillatory region of the right STN. These responses were not observed in the left ventral STN or in the dorsal regions of either the right or left STN. Therefore, our results suggest that the ventral non-oscillatory regions are asymmetrically associated with non-motor functions, with the right ventral STN associated with emotional processing. These results suggest that DBS of the right ventral STN may be associated with beneficial or adverse emotional effects observed in PD patients and may relieve mental symptoms in other neurological and psychiatric diseases.

**Keywords: Parkinson's disease, deep brain stimulation (DBS), emotions, subthalamic nucleus, spikes**

#### **INTRODUCTION**

Subthalamic nucleus (STN) deep brain stimulation (DBS) is an established therapy for the management of motor symptoms of advanced Parkinson's disease (PD; Benazzouz et al., 1993; Benabid et al., 1994; Weaver et al., 2009; Follett et al., 2010; Moro et al., 2010; Bronstein et al., 2011; Lhommée et al., 2012; Odekerken et al., 2013; Schuepbach et al., 2013), and is also a promising potential therapy for the management of obsessive-compulsive disorder (OCD) (Mallet et al., 2008; Chabardès et al., 2012).

Psychiatric adverse effects such as apathy, depression, changes in emotion recognition and reactivity, hypomania, and suicide have been observed in PD patients before and after STN DBS (Dujardin et al., 2004; Schroeder et al., 2004; Biseul et al., 2005; Temel et al., 2006; Drapier et al., 2008; Witt et al., 2008; Péron et al., 2010a). Castner et al. (2007) observed delayed reaction times for negative valence words in healthy control volunteers and in PD patients "on" stimulation, but not in PD patients in the "off" stimulation condition. In addition, recognition of negative emotions (fear, anger, and sadness) expressed visually (facial expressions) or vocally was significantly impaired in PD patients after STN DBS (Péron et al., 2010a,b). PD patients "on" stimulation reported significantly less intense feelings of fear, anxiety, and disgust for film excerpts intended to induce "fear" as compared with the pre-operative and the control groups (Vicente et al., 2009).

In contrast, overall *improvement* in neuropsychiatric symptoms in PD patients following STN DBS has recently been reported (Lhommée et al., 2012). STN DBS of PD patients in the "off" medication state was associated with a reduction in the frequency and severity of non-motor fluctuations (Ortega-Cubero et al., 2013), and reduced compulsive use of dopaminergic medication and its behavioral consequences (Eusebio et al., 2013). Thus, the STN is not a pure motor structure, but is also involved in emotional processing (Péron et al., 2012).

Although there is considerable anatomical evidence that supports the segregation of the STN into distinct limbic, associative and motor zones (Lambert et al., 2012; Haynes and Haber, 2013), there is no widely accepted spatial model for the physiological correlates of this subdivision (Brunenberg et al., 2012; Buot et al., 2012). Emotional processing is considered to be lateralized and attributed to the non-dominant hemisphere (Gainotti, 2012). Right/left asymmetry of dopamine deficits may differentially impact emotion and cognition in PD patients (Tomer and Aharon-Peretz, 2004; Lambert et al., 2012; Ventura et al., 2012). Piallat et al. (2011) have reported that burst neurons were predominantly left-sided in the STN of OCD but bilateral in PD patients. Such right/left asymmetry has not been observed in the STN of PD patients either by means of imaging modalities or macro-electrode local field potential (LFP) recordings (Kühn et al., 2005; Buot et al., 2012; Lambert et al., 2012).

To better understand the limbic role of the STN, microelectrode unit activity in PD patients undergoing STN DBS was recorded and analyzed during intraoperative vocal emotional stimulation. This study utilized the oscillatory activity that characterizes the PD STN for segmentation of motor and non-motor regions. STN oscillatory activity in the beta (12–30 Hz) band is associated with akinetic-rigid PD symptoms and is observed in the dorso-lateral oscillatory region (DLOR) of the STN that corresponds to the anatomically defined sensorimotor region (Zaidel et al., 2010). Beta band oscillations are less likely to be observed in the ventro-medial non-oscillatory region (VMNR) of the STN, which is probably related to cognitive and limbic functions. This study tested whether the electrophysiological response to an emotional stimulus would be observed in the ventro-medial part of the STN, and furthermore whether it would be lateralized to the non-dominant hemisphere, as in the cortex.

#### **METHODS**

Clinical characteristics of the PD patients (*n* = 17) in this study are given in **Table 1**. All patients met accepted inclusion criteria for DBS surgery and signed informed consent. This study was authorized and supervised by the IRB of Hadassah Medical Center (reference code: 0168-10-HMO).

*\**,*\*\*Right/Left trajectory was excluded for artifacts and noise; ACE-R, Addenbrooke's Cognitive Examination - Revised (range 0–100, lower scores indicate cognitive decline); FAB, Frontal Assessment Battery (range 0–18, lower scores indicate cognitive decline); HAM-21, Hamilton Depression Scale (range 0–67, higher scores indicate depressed mood); UPDRS OFF, Unified Parkinson's Disease Rating Scale off medication (range 0–199, higher scores indicate advanced dysfunction here and in parts I-IV); UPDRS Part I: clinician-scored mentation, behavior, and mood evaluation (range 0–16); UPDRS Part II: clinician-scored activities of daily life evaluation (range 0–52); UPDRS Part III: clinician-scored motor evaluation (range 0–108); UPDRS Part IV: clinician-scored evaluation of complications of therapy (range 0–23).*

STN evoked responses to emotional vocal stimuli were analyzed. To avoid a bias, the emotional stimulus should incorporate a variety of emotions and be as culture and language independent as possible. For this purpose, the Montreal affective voices (MAV) database was selected and validated. The MAV consists of negative, neutral and positive non-verbal male and female voices (Belin et al., 2008). The emotional voices were played to the PD patients during DBS surgery, synchronized with microelectrode recordings of the STN. The data were then analyzed to define the areas of the STN at which the emotive vocal stimulation resulted in a modulation of the neuronal activity. In the following sections the above paradigm is described in more detail.

#### **VALIDATION OF EMOTIONAL VOICES DATABASE**

The Montreal affective voices, a validated tool for research on auditory affective processing (Belin et al., 2008), was used in this study. During vocal communication, listeners attend to speech prosody to infer the emotions or the affective mood of the speaker. Non-verbal affective processing based on auditory recognition does not contain verbal context and is therefore valid across different countries and cultures. Moreover, it is probably less dependent on brain areas that are specialized for language processing, and therefore reflects a purer emotional activity. To validate the Montreal affective voices for Hebrew speakers, the computerized Montreal affective voices question battery was translated to Hebrew and validated with 29 healthy volunteers (>60 years, 66.7 ± 5.2 years, mean ± SD, *n* = 14, 8 females and 6 males; and <60, 34.9 ± 8.9 years, mean ± SD, *n* = 15, 8 females and 7 males). The emotive voices were played to all patients (*n* = 17), whilst on their regular dopamine replacement therapy, the day before surgery so that they would be more familiar with the MAV in the operating room. Thirteen patients (62.3 ± 7.5 years old, mean ± SD, three females and 10 males) also answered the Montreal affective voices battery that assesses subjective information regarding the valence and arousal of the played voices.

For unbiased comparison of the PD patients and healthy volunteers, an age and gender matched group of 12 healthy volunteers was selected (60.5 ± 13.1 years old, mean ± SD, three females and nine males; shuffling the members of this group yielded similar results). To estimate the accuracy of the recognition of emotive voices (recognition accuracy), the valence of the emotional voices was considered to be 100, 0 and −100 for positive, neutral, and negative voices, respectively. The patients and volunteers were asked to rank the valence on a continuous scale from −100 (negative) to 100 (positive). The recognition accuracy was defined as the absolute difference between the database values and the patient evaluation.
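As a minimal sketch of this definition, the Python snippet below computes the measure for a few ratings. The function names and the example ratings are hypothetical and are not taken from the study; only the reference values (100, 0, −100) and the rating scale come from the text. Note that, as defined, smaller values indicate a rating closer to the database value.

```python
# Reference valence assigned to each category of voice (values from the text).
REFERENCE_VALENCE = {"positive": 100, "neutral": 0, "negative": -100}

def recognition_accuracy(category, rating):
    """Absolute difference between the reference valence and a rating.

    `rating` is on the continuous scale from -100 (negative) to 100 (positive).
    Smaller values mean the rating was closer to the reference.
    """
    if not -100 <= rating <= 100:
        raise ValueError("rating must lie between -100 and 100")
    return abs(REFERENCE_VALENCE[category] - rating)

def mean_recognition_accuracy(ratings):
    """Average the measure over (category, rating) pairs."""
    values = [recognition_accuracy(c, r) for c, r in ratings]
    return sum(values) / len(values)

# Hypothetical ratings, for illustration only (not study data).
example = [("positive", 60), ("neutral", -10), ("negative", -85)]
print(mean_recognition_accuracy(example))
```

Averaging per category rather than over all voices would allow the positive, neutral and negative voices to be compared separately.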

#### **SURGERY AND MICROELECTRODE RECORDING**

Surgery, recording and data analysis methods used to evaluate the response to the emotive stimuli and to discriminate between dorsal and ventral regions of the STN are similar to those reported in our previous study (Shamir et al., 2012). Briefly, surgery was performed using the CRW stereotactic frame (Radionics, Burlington, MA, USA). STN target coordinates were chosen as a composite of indirect targeting, based on the anterior commissure–posterior commissure atlas-based location, and direct targeting with three Tesla T2 magnetic resonance imaging (MRI), using Framelink 5 software (Medtronic, Minneapolis, USA). All recordings used in this study were made while the patients were awake and not under sedation. The patient's level of awareness was continuously assessed clinically, and if drowsy the patient was stimulated and awoken through conversation by a member of the surgical team. The side (right/left) of the first trajectory was chosen according to the severity of the Parkinsonian symptoms (right side first in 7 patients, left side first in 10 patients). The DBS procedures were done off dopaminergic medications (>12 h since last medication).

Microelectrode recording (MER) data was acquired with the MicroGuide system (AlphaOmega Engineering, Nazareth, Israel). Neurophysiological activity was recorded via polyamide coated tungsten microelectrodes (Alpha Omega) with impedance: 0.59 ± 0.13 MΩ (mean ± SD, measured at 1 kHz at the beginning of each trajectory). The signal was amplified by 10,000, band-passed from 250 to 6000 Hz, using a hardware four-pole Butterworth filter, and sampled at 48 kHz by a 12-bit A/D converter (using ±5 V input range). LFPs were not recorded due to constraints of electrical noise in the operating room. For both the left and right hemispheres, a microelectrode-recording trajectory using two parallel microelectrodes was made, starting at 10 mm above the calculated target (center of the STN trajectory as per imaging). A "central" electrode was directed at the center of the dorsolateral STN target, and an "anterior" electrode was advanced in parallel, 2 mm anterior and ventral to the central electrode. A typical trajectory was ∼60° from the axial anterior commissure–posterior commissure plane and ∼15° from the mid-sagittal plane. Final trajectory plans were slightly modified to avoid the cortical sulci, the ventricles and major blood vessels.

Spontaneous and evoked STN multi-unit MER activity was recorded. The two electrodes were advanced simultaneously in small discrete steps of ∼0.1 mm, with a typical recording duration of ∼20 s at each site within the STN along the planned trajectory axis (recording durations were increased to ∼180 s at sites of vocal emotional stimuli). Acquisition of the MER data was synchronized with the played emotional voices. The emotional voices were randomized and a pseudo-random set (of actors and emotions) was played 1–3 times/trajectory in accordance with the patient's condition and preference. The STN entry and exit were discerned visually by the neurophysiologist as a sharp increase and decrease in the background activity, respectively. The STN boundaries were further confirmed and the dorsolateral oscillatory region was detected automatically using a custom method (Zaidel et al., 2009, 2010). A visual display of typical trajectory data is presented in **Figure 1**. The root mean square (RMS) of the MER signal is normalized by its value in a non-cellular area (white matter of the anterior internal capsule before the STN entry). The normalized RMS at the different recording depths is used to estimate the spiking (multi-unit) background activity and to define the MER depths at which the electrode intersects the dorsal STN entry and ventral STN exit (**Figure 1**, upper plot). Spectrograms (power spectral density as a function of the recording depth) facilitate the subdivision of the STN into a dorso-lateral oscillatory region and a ventro-medial non-oscillatory region (**Figure 1**, lower plot). Typically, 80–100 MER sites were recorded for each hemisphere over 40 min. Of this time, 10–15 min were usually dedicated to the current experiment. The neurosurgeon and the psychiatrist continuously evaluated the patient's condition during the surgery. The experimental procedure was stopped if the patient expressed unwillingness to continue or if prolonging the surgery was deemed inadvisable.

#### **NEURONAL DATA RECORDING AND ANALYSIS**

In the operating room, 44 emotive voices were played for durations of 1–3 s with inter-voice random intervals of 2–4 s in up to three locations within the STN. Seventy-one data segments from 25 STNs of the 17 patients were available for analysis (**Table 2**).

First, the raw analogue signal was rectified by the "absolute" operator to detect burst frequencies below the range of the operating room 250–6000 Hz band-pass filter (Zaidel et al., 2010). The rectified signal was smoothed with a digital eighth-order low-pass Chebyshev Type I software filter with cutoff frequencies of 80 and 400 Hz. Then, the signals were down-sampled to 200 and 1000 Hz for root mean square (RMS) and spectral analysis, respectively. The above low-pass filter thresholds and down-sample rates were selected empirically such that the RMS best represents the average background activity (low-pass filtering at 80 Hz and down-sampling to 200 Hz) and to allow an accurate spectral analysis (low-pass filtering at 400 Hz and down-sampling to 1000 Hz).

The RMS is related to the variability of the MER signal in the local (<0.1 mm) area of the electrode. Intensive and burst activity that characterizes the STN is associated with a high variability and therefore large RMS values. The RMS was computed with a bin size of 500 ms and bin steps of 20 ms on the down-sampled rectified MER data. Then, the RMS Z-score was computed to indicate how the MER energy changed below or above the mean RMS during vocal emotive stimulation. The Z-score function parameters (mean and SD) were estimated from the RMS values 1 s before the voice stimulation and smoothed with a Gaussian window (window size = 35 ms).
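The sliding-window RMS and its Z-score can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' Matlab implementation: the function names are hypothetical, each bin is time-stamped by its end, the baseline is taken as the bins ending within 1 s before the stimulus, the Gaussian smoothing step is omitted, and the input is a synthetic signal instead of MER data.

```python
import math
import random
import statistics

def sliding_rms(signal, fs=200, bin_ms=500, step_ms=20):
    """Return (bin end times in s, RMS per bin) for a sliding window."""
    bin_n = int(round(fs * bin_ms / 1000))    # 100 samples at 200 Hz
    step_n = int(round(fs * step_ms / 1000))  # 4 samples at 200 Hz
    times, values = [], []
    for start in range(0, len(signal) - bin_n + 1, step_n):
        window = signal[start:start + bin_n]
        values.append(math.sqrt(sum(x * x for x in window) / bin_n))
        times.append((start + bin_n) / fs)
    return times, values

def rms_zscore(times, rms, stim_time, baseline_s=1.0):
    """Z-score every RMS value against the bins that end in the baseline.

    Assumption: the baseline is every bin whose end time falls within
    `baseline_s` seconds before the stimulus onset.
    """
    baseline = [v for t, v in zip(times, rms)
                if stim_time - baseline_s <= t <= stim_time]
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return [(v - mu) / sd for v in rms]

# Synthetic rectified signal (not study data): 2 s of baseline activity,
# then 2 s with doubled amplitude after a "stimulus" at t = 2 s.
random.seed(0)
fs = 200
signal = [abs(random.gauss(0, 1)) for _ in range(2 * fs)]
signal += [abs(random.gauss(0, 2)) for _ in range(2 * fs)]

times, rms = sliding_rms(signal, fs=fs)
z = rms_zscore(times, rms, stim_time=2.0)
```

Because consecutive bins overlap by 480 ms, neighbouring Z-score values are strongly correlated, which should be kept in mind when averaging them.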

To estimate if there was a significant RMS response with respect to electrode location, the average Z-score was computed over a bin size of 2 s (400 samples) after stimulation and over all the 44 played voices. A ∼5% average Z-score increase was observed at many sites within the right ventromedial non-oscillatory region, but not in the left ventro-medial non-oscillatory region or left or right dorso-lateral oscillatory regions. Therefore 5% was used as a cut-off to decide whether an STN site was responsive to the emotional voices. This low "experimentally driven" threshold is to be expected when averaging the Z scores of 400 samples. Similar results were found with different thresholds in the range of 3–7% (data not shown).

The RMS estimates the total energy of the STN signal integrated over all frequencies. Spectral analysis was performed to study the effect of the emotive vocal stimulation on the different frequency bands of the STN spiking activity. Specifically, the power spectral density (PSD) and event related desynchronization (ERD) were computed at a resolution of 1/3 Hz in the range of 3–250 Hz. As a preparation step for computing the average power spectral density, the 1000 Hz down-sampled rectified MER data was truncated into segments of 2000 ms with steps of 50 ms between segments. Then, the power spectral density of each segment was computed using Welch's method with a 1 s Hamming window and 50% overlap. The event related de-synchronization was computed by dividing the resulting power spectral density by the average power spectral density 1 s before stimulation for each frequency bin (0.33 Hz). Thus, the event related de-synchronization normalizes out the high power spectral density values at the beta (12–30 Hz) and gamma (30–100 Hz) band frequencies that characterize the dorso-lateral and ventro-medial regions, respectively. Such location related features may mask a possible temporal response to emotive stimuli. The event related de-synchronization therefore estimates the temporal changes in the frequency domain after emotive stimulation. Specifically, modulations were expected to be observed at the alpha (8–12 Hz) and low beta (12–20 Hz) bands (Buot et al., 2012). The neural modulation was also analyzed with respect to the valence (positive, neutral or negative) of the emotive voice. The average (over 2 s) responses were compared and a *t*-test was performed to evaluate the significance of their differences.
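The baseline normalization that defines the event related de-synchronization can be sketched as below. The sketch assumes that the power spectral density of each segment has already been computed (for example with Welch's method) and shows only the division by the pre-stimulus average and a band average. The function names, the time stamp assigned to each segment, and the toy numbers are hypothetical and are not taken from the study.

```python
def erd_ratio(psd, seg_times, stim_time=0.0, baseline_s=1.0):
    """Divide each PSD row by the mean baseline PSD, per frequency bin.

    psd[i][k]    power of segment i at frequency bin k
    seg_times[i] time stamp (s) assigned to segment i (an assumption here)
    """
    base_rows = [row for t, row in zip(seg_times, psd)
                 if stim_time - baseline_s <= t < stim_time]
    if not base_rows:
        raise ValueError("no baseline segments before the stimulus")
    n_freq = len(psd[0])
    base = [sum(r[k] for r in base_rows) / len(base_rows)
            for k in range(n_freq)]
    return [[row[k] / base[k] for k in range(n_freq)] for row in psd]

def band_mean(row, freqs, lo, hi):
    """Average the values of one row over frequencies in [lo, hi)."""
    vals = [v for f, v in zip(freqs, row) if lo <= f < hi]
    return sum(vals) / len(vals)

# Toy example (not study data): four frequency bins, four segments.
freqs = [4, 10, 16, 40]               # Hz
seg_times = [-1.0, -0.5, 0.0, 0.5]    # s relative to stimulus onset
psd = [
    [2.0, 4.0, 4.0, 1.0],
    [2.0, 4.0, 4.0, 1.0],
    [2.0, 2.0, 2.0, 1.0],
    [2.0, 1.0, 2.0, 1.0],
]

erd = erd_ratio(psd, seg_times)
alpha_after = band_mean(erd[3], freqs, 8, 12)      # alpha band, 8-12 Hz
low_beta_after = band_mean(erd[3], freqs, 12, 20)  # low beta, 12-20 Hz
```

Values below 1 indicate a power decrease relative to the pre-stimulus baseline at that frequency, and values above 1 indicate an increase.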

Artifacts (e.g., large transients induced by patient movement or staff handling of the patient and electrical artifacts during stimulation) were detected by visual inspection of the raw data and PSD plots, and MERs associated with these artifacts were excluded (1 right and 3 left trajectories). Analysis was performed with custom programs written in Matlab (The MathWorks, Inc., Massachusetts, US).

#### **RESULTS**

PD patients "on" medication (∼24 h before surgery) less accurately recognized (*p* < 0.05) voices expressing positive and neutral emotions in comparison to healthy age and gender matched subjects (**Figure 2**).

**FIGURE 2 | Emotions recognition accuracy.** Medicated Parkinson's disease (PD) patients (*n* = 13) compared with three healthy control (HC) groups: (1) average age and gender (male/female) ratio matched (AGM) group (*n* = 12); (2) 60 years old or older (*n* = 14), and; (3) younger than 60 years old (*n* = 15). Error bars represent standard error of the mean (SEM). ∗*p* < 0.05 (*t*-test) statistically significant difference between the PD patients and each of the tested healthy control groups.

Neuronal responses to emotive stimuli in the ventro-medial non-oscillatory region of the right STN were far larger than the responses at the dorso-lateral oscillatory region. An example is presented in **Figure 3**. In this case one electrode was located in the ventro-medial non-oscillatory region of the right STN while the other was located 2 mm apart at the dorso-lateral oscillatory region of the same STN such that the spiking activity was recorded simultaneously from both regions and for the same vocally emotive stimuli. The Z score of the spiking activity increased in the ventro-medial non-oscillatory region after stimulation, but not in the dorso-lateral oscillatory region (**Figure 3A** right vs. left). Moreover, large decreases in the power spectral density and event related desynchronization of alpha and low beta bands (8–12 and 12–20 Hz, respectively) were observed in the ventro-medial zone, but not in the dorso-lateral region (**Figures 3B,C** right vs. left). A comparison of the power spectral density values reveals that beta band (12–30 Hz) oscillatory activity was observed at the dorso-lateral oscillatory-region (**Figure 3B** left) before and after the presentation of the vocal stimuli (Time = 0). Gamma band (30–100 Hz) power spectral density values were larger at the ventro-medial non-oscillatory area in comparison to the dorso-lateral oscillatory-region (**Figure 3B** right vs. left).

Analyzing the STN spiking activity of all 17 patients collectively (12 bilateral and 5 left unilateral DBS surgeries; 11 right and 14 left STNs after exclusion of noisy data, **Tables 1**, **2**) supported these findings. The impact of emotional stimuli on the event related de-synchronization and the Z-score spiking activity was different in different STN domains (**Figure 4**). The right ventro-medial non-oscillatory region of the STN was associated with large responses (reduction in event related de-synchronization and increased Z score spiking activity) to the vocally emotive stimuli (**Figure 4B**, right). The neuronal responses in the right ventro-medial non-oscillatory region were of larger magnitude (paired *t*-test *p* < 0.01) in comparison to the responses in the left ones (**Figure 4B**, left) and to the left and right dorso-lateral oscillatory-regions (**Figure 4A**, left and right). Moreover, larger responses were observed for positive, rather than negative or neutral, stimuli in the right STN ventro-medial non-oscillatory region (**Figure 5**; paired *t*-test *p* < 0.01).

Spatial analysis of the responses reveals that an average Z score increase of 5% or more (see Methods section for more information) was observed in 48% (10/21) of MER sites in the right ventro-medial non-oscillatory region, in comparison to only 11% (2/19) of sites in the left ventro-medial non-oscillatory region, and to 23% (3/13) and 22% (4/18) of the sites in the right and left dorso-lateral oscillatory-regions, respectively. Moreover, most of the responses in the right and left dorso-lateral oscillatory-regions (66% and 75%, respectively) are near their border with the ventro-medial non-oscillatory region and might reflect a fuzzy boundary between the different territories of the STN.
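The reported proportions follow directly from the site counts, as the short Python check below illustrates. The counts are those given in the text; the dictionary layout and the function name are only for illustration, and VMNR and DLOR stand for the ventro-medial non-oscillatory and dorso-lateral oscillatory regions.

```python
# (responsive sites, recorded sites) per region, as reported in the text.
counts = {
    "right VMNR": (10, 21),
    "left VMNR": (2, 19),
    "right DLOR": (3, 13),
    "left DLOR": (4, 18),
}

def percent(responsive, total):
    """Share of responsive sites, rounded to the nearest whole percent."""
    return round(100 * responsive / total)

percentages = {region: percent(*c) for region, c in counts.items()}
for region, value in percentages.items():
    print(f"{region}: {value}%")
```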

Finally, intra-operative responses (7 right ventral STNs, 8 data segments) to vocally emotive stimuli were compared to the results of the preoperative MAV battery. Intraoperatively, patients were off medications, while preoperative testing was performed with patients on medications to ensure highest emotional recognition ability and to minimize the preoperative stress. Analysis of the intraoperative responses was carried out only for emotive voices that the patient had reported preoperatively as reflecting increased arousal and non-neutral valence. Significant correlation (*p* < 0.05) was observed between preoperative emotion recognition accuracy and intraoperative spiking activity (Z score and event related de-synchronization) for the emotive voices (**Figure 6**; *r* = 0.24, and *r* = −0.35, respectively). Significant correlation was also observed between preoperative perceived arousal of the emotive voice and intraoperative modulation of the neural activity amplitude (Z-score, *r* = 0.25, *p* < 0.05).

## **DISCUSSION**

These results provide electrophysiological evidence for lateralization of emotional brain function in the human basal ganglia. They also further support the concept of segregation of the STN into separate motor and non-motor regions and suggest that these regions may be differentiated by specific electrophysiological markers.

## **STN FUNCTIONAL ORGANIZATION**

Early evidence for the somatotopic organization of the STN can be traced back to non-human primate studies in the late 1940's (Mettler and Stern, 1962). Lesions in some STN areas of a Rhesus monkey resulted in ballistic movements, while other areas were associated with a choreic type of movement, or were not associated with abnormal movements at all. Studies of the basal ganglia of behaving primates have revealed a somatotopic organization of movement related neurons (Delong et al., 1984; Wichmann et al., 1994; Nambu et al., 2002). Autoradiographic tracer studies have demonstrated that the ipsilateral STN receives a somatotopically organized projection from the pre-central motor cortex (Monakow et al., 1978). The remaining part of the nucleus received less intensive projections from the premotor and prefrontal areas. Nambu et al. (1996; Nambu, 2011) found that inputs from the primary motor cortex were allocated mostly within the lateral half of the STN (hyperdirect pathway) and a somatotopic organization was demonstrated for the orofacial, forelimb, and hindlimb subdomains. Haynes and Haber (2013) have recently observed a topographically organized cortico-STN pathway in primates. They report that limbic areas project to the medial tip of the STN, straddling its border and extending into the lateral hypothalamus. Associative areas project to the medial half of the STN, and motor areas to the lateral half. Moreover, limbic projections terminated primarily rostrally and motor projections more caudally. An imaging study on human subjects suggests that the STN can be divided bilaterally into limbic, associative and motor regions occupying the anterior, mid and posterior portions of the nucleus, respectively (Lambert et al., 2012).

Microelectrode recordings (MER) are often utilized in DBS surgeries to define the STN boundaries (Hutchison et al., 1998; Benazzouz et al., 2002; Castner et al., 2007) and facilitate the spatial mapping of its physiological subdomains. Rodriguez-Oroz et al. (2001) studied STN MER of PD patients and reported that all neurons with sensorimotor responses were in the dorsolateral region of the STN. Abosch et al. (2002) also reported that movement related responses were observed more in the dorsal part of the STN than in its ventral area. Further studies incorporating MER reported even more detailed somatotopic organization, and revealed that arm-related STN cells were located laterally and at the rostral and caudal poles, whereas leg-related cells were located medially and centrally (Rodriguez-Oroz et al., 2001; Theodosopoulos et al., 2003; Romanelli et al., 2004). Clinically, the most effective DBS treatment is associated with stimulation of the dorsal STN region (Godinho et al., 2006; Weise et al., 2013). Neural activity of the anterior-medial area of the STN can be correlated with checking behavior in patients with OCD (Burbaud et al., 2013). Another recent study incorporating LFP signals from the STN of PD patients demonstrated that the ventral part of the STN encodes the emotional valence of stimuli independently of the motor context (Buot et al., 2012).

**FIGURE 4 | A comparison of responses to emotive stimulation in different subthalamic nucleus (STN) regions. (A)** Left and right dorso-lateral oscillatory regions (DLOR, 14 STNs, 18 data segments, and 11 STNs, 13 data segments, respectively). **(B)** Left and right ventro-medial non-oscillatory regions (VMNR, 14 STNs, 19 data segments, and 11 STNs, 21 data segments, respectively). Significant increases in the background activity and reduced oscillatory activity were observed in the VMNR of the right STN after vocal emotional stimuli **(B**, right**)**, but not in the left VMNR **(B**, left**)** or left or right STN DLOR **(A)**. Solid line represents the response mean, and shaded area represents the standard error of the mean (SEM).

**FIGURE 5 | A comparison of responses to emotive stimulation in subthalamic nucleus (STN) regions. (A)** Left and right dorso-lateral oscillatory regions (DLOR, 5 STNs, 7 data segments, and 3 STNs, 4 data segments, respectively). **(B)** Left and right ventro-medial non-oscillatory regions (VMNR, 5 STNs, 8 data segments, and 4 STNs, 10 data segments, respectively). A steep reduction in the oscillatory activity is present in the right STN VMNR for positive voices **(B**, right**)**, but not for neutral voices and less for negative voices. Such responses are not observed for emotive voice stimuli in the left VMNR **(B**, left**)** or left or right STN DLOR **(A)**. Color coding: red, green, and blue indicate responses to negative, neutral, and positive emotive voices, respectively. Solid lines represent the mean of the event related desynchronization change (ERD), and the surrounding shaded area represents the response standard error of the mean (SEM).

Our results support the subdivision of the STN into dorsal "motor" and ventral "non-motor" regions. These findings may explain previous reports on emotional changes in PD patients undergoing DBS in the STN (Temel et al., 2006; Mallet et al., 2007; Witt et al., 2008). Finally, our findings provide supportive evidence for the limbic role of the right ventral STN (Péron et al., 2013) and its involvement in encoding of emotional prosody (Alba-Ferrara et al., 2011).

#### **DOPAMINERGIC MODULATION OF EMOTIONAL PROCESSING IN PD**

It has been proposed that dopaminergic medication may reverse the bias of non-medicated PD patients for better learning from negative feedback and make them more sensitive to positive than negative outcomes (Frank et al., 2004; Bódi et al., 2009; Maia and Frank, 2011). Therefore, it might be expected that PD patients on medication would better recognize positive emotions. However, our results demonstrate a significant impairment in recognition of non-verbal vocal bursts representing positive emotions, but not negative ones, in the medicated PD patients tested before their DBS procedures. The reported results of PD emotion-recognition studies are mixed (Kan et al., 2002; Assogna et al., 2010; Moustafa et al., 2013). The discrepancies between the studies may be related to the stimulation modality (e.g., visual vs. auditory; Kan et al., 2002), the type of dopaminergic therapy (e.g., Levodopa vs. dopamine agonists; Moustafa et al., 2013), the intensity of presented emotions (Assogna et al., 2010), and the testing conditions (i.e., high patient stress and anxiety on the day before surgery in our setup).

Previous studies utilizing macro-electrode LFP recordings in the STN of PD patients have demonstrated event related desynchronization activity in the alpha and low beta frequencies (8–20 Hz) after emotive stimuli (Kühn et al., 2005; Brücke et al., 2007; Huebl et al., 2011; Buot et al., 2012). The largest modulation was observed after a visual presentation of pleasant stimuli in patients "on" dopamine medication. Buot et al. reported that the largest modulation in "off" dopamine medication PD patients was caused by unpleasant stimuli (Buot et al., 2012). This observation adds important new evidence regarding the effect of dopaminergic drug therapy on emotional processing (Castrioto et al., 2013). The discrepancy between the reported LFP results and those in this study may be related to the differences in stimulation modality, the intensity of presented emotions and the high patient stress and anxiety during surgery.

Another marked difference between our results and previous LFP studies is that no significant lateralization of the STN responses was observed in macro-electrode recordings. This discrepancy may be explained by the greater spatial resolution afforded by microelectrode recording of spiking activity in comparison to macro-electrode LFP recording. Furthermore, LFPs probably mainly reflect the synaptic input of the STN (Buzsáki et al., 2012) while spikes recorded with microelectrodes reflect the output of the STN. It may be that there is right/left symmetry in the synaptic inputs to the STN and still right/left *asymmetry* in their output (e.g., due to different excitability levels).

## **CONCLUDING REMARKS**

The involvement of the basal ganglia in reinforcement learning has been extensively studied (Schultz et al., 1997; Hollerman and Schultz, 1998; Morris et al., 2004; Bayer and Glimcher, 2005; Joshua et al., 2010). Effective reinforcement learning depends on accurate recognition of the current state of the animal, including the emotional valence of different stimuli. Our results show that the largest activity modulation was associated with positive emotional stimuli and therefore the right ventral STN likely encodes emotional information that may be incorporated in reinforcement learning.

The results of this study imply that DBS of the right ventral STN might be associated with more psychiatric side effects in PD patients in comparison to other STN regions. Therefore, the right ventral STN should be identified during the implantation of the DBS electrodes. We suggest that adjustment of STN DBS stimulation parameters should take into account emotional symptoms and could employ different strategies for the right and left STN to improve treatment outcome. Special caution is advised with right-sided stimulation in patients who are prone to or have developed psychiatric side effects.

Further studies may investigate the potential benefit of ventral STN DBS for primary psychiatric disorders such as depression or OCD. It is further suggested that these studies will examine the possibility that STN DBS for psychiatric indications might be right unilateral rather than bilateral. Finally, it is our hope that further studies will also explore the hypothesis that the left ventro-medial non-oscillatory region of the STN is related to other non-motor functions that are lateralized to the dominant hemisphere such as speech (Anzak et al., 2011).

#### **REFERENCES**


## **ACKNOWLEDGMENTS**

This research was supported in part by the Post-doctoral fellowships (to Reuben R. Shamir) of the Edmond and Lily Safra Center for Brain Sciences (ELSC), the Vorst family grant for research on Parkinson's disease, by the Simone and Bernard Guttman chair of Brain Research, and the generous support of the Rosetrees and Dekker foundations (to Hagai Bergman), the PATH fund and Bloom foundation for research on Parkinson's disease (to Zvi Israel), the Brain and Behavior Research Foundation (for Renana Eitan), and the Joint Research Grant from Hebrew University Hadassah Medical School and the Hadassah Medical Organization (to Renana Eitan, Zvi Israel, and Hagai Bergman).

(2012). Structural and resting state functional connectivity of the subthalamic nucleus: identification of motor STN parts and the hyperdirect pathway. *PLoS ONE* 7:e39061. doi: 10.1371/journal.pone.0039061


in the primate subthalamic nucleus: evidence for ordered but reversed body-map transformations from the primary motor cortex and the supplementary motor area. *J. Neurosci.* 16, 2671–2683.


*Ann. Neurol.* 69, 793–802. doi: 10.1002/ana.22222


J., et al. (2009). Bilateral deep brain stimulation vs best medical therapy for patients with advanced Parkinson disease: a randomized controlled trial. *JAMA* 301, 63–73. doi: 10.1001/jama.2008.929


disease: a randomised, multicentre study. *Lancet Neurol.* 7, 605–614. doi: 10.1016/S1474-4422(08)70114-5


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 July 2013; accepted: 04 October 2013; published online: 29 October 2013.*

*Citation: Eitan R, Shamir RR, Linetsky E, Rosenbluh O, Moshel S, Ben-Hur T, Bergman H and Israel Z (2013) Asymmetric right/left encoding of emotions in the human subthalamic nucleus. Front. Syst. Neurosci. 7:69. doi: 10.3389/fnsys.2013.00069*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Eitan, Shamir, Linetsky, Rosenbluh, Moshel, Ben-Hur, Bergman and Israel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Duration differences of corticostriatal responses in striatal projection neurons depend on calcium activated potassium currents

## *Mario A. Arias-García, Dagoberto Tapia, Edén Flores-Barrera, Jesús E. Pérez-Ortega, José Bargas and Elvira Galarraga\**

*Departamento de Neurociencia Cognitiva, División de Neurociencias, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, México City DF, México*

#### *Edited by:*

*Alon Korngreen, Bar-Ilan University, Israel*

#### *Reviewed by:*

*James M. Tepper, Rutgers, The State University of NJ, USA*

*Edward A. Stern, Bar-Ilan University, Israel*

#### *\*Correspondence:*

*Elvira Galarraga, Departamento de Neurociencia Cognitiva, División de Neurociencias, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Circuito Exterior S/N, Ciudad Universitaria, 04510 México City DF, México e-mail: egalarra@ifc.unam.mx*

The firing of striatal projection neurons (SPNs) exhibits afterhyperpolarizing potentials (AHPs) that determine discharge frequency. They are in part generated by Ca2+-activated K+-currents involving BK and SK components. It has previously been shown that suprathreshold corticostriatal responses are more prolonged and evoke more action potentials in direct pathway SPNs (dSPNs) than in indirect pathway SPNs (iSPNs). In contrast, iSPNs generate dendritic autoregenerative responses. Using whole cell recordings in brain slices, we asked whether Ca2+-activated K+-currents play a role in these responses. Secondly, we asked if these currents may explain some differences in synaptic integration between dSPNs and iSPNs. Neurons obtained from BAC D1 and D2 GFP mice were recorded. We used charybdotoxin and apamin to block BK and SK channels, respectively. Both antagonists increased the depolarization and delayed the repolarization of suprathreshold corticostriatal responses in both neuron classes. We also used NS 1619 to enhance BK channels, and NS 309 or CyPPA to enhance SK channels. Current enhancers hyperpolarized and accelerated the repolarization of corticostriatal responses in both neuron classes. Nevertheless, these drugs made evident that the contribution of Ca2+-activated K+-currents was different in dSPNs as compared to iSPNs: in dSPNs their activation was slower, as though calcium needed a diffusion delay to activate them. In contrast, their activation was fast and then sustained in iSPNs, as though calcium flux activates them at the moment of entry. The blockade of Ca2+-activated K+-currents made iSPNs look like dSPNs. Conversely, their enhancement made dSPNs look like iSPNs. It is concluded that Ca2+-activated K+-currents are a main intrinsic determinant of the differences in synaptic integration of corticostriatal polysynaptic responses between dSPNs and iSPNs.

**Keywords: Ca2+-activated K+-currents, BK channels, SK channels, corticostriatal pathway, medium spiny neurons, striatum, synaptic integration**

## **INTRODUCTION**

Ca2+-activated K+-currents are important regulators of excitability: they control firing frequency and synaptic integration (Bond et al., 2005; Salkoff et al., 2006; Faber, 2009). In striatal projection neurons (SPNs), Ca2+-activated K+-currents contribute to action potential repolarization and the afterhyperpolarization that makes up a great part of the interspike interval during repetitive firing (Pineda et al., 1992; Vilchis et al., 2000; Pérez-Garci et al., 2003; Pérez-Rosello et al., 2005; Wolf et al., 2005; Galarraga et al., 2007; Flores-Barrera et al., 2009). Selective peptidic toxins made clear that Ca2+-activated K+-currents in SPNs comprise "small" SK and "large" BK conductance channels (Pineda et al., 1992; Bargas et al., 1999). These channels are important targets for the actions of neurotransmitters, e.g., dopamine, acetylcholine and somatostatin, among others (Pineda et al., 1995; Hernández-López et al., 1996; Vilchis et al., 2002; Pérez-Rosello et al., 2005; Galarraga et al., 2007). In dissociated neurons, Ca2+-activated K+-currents are preferentially activated by calcium entry through CaV2.1 and CaV2.2 channels. CaV1 and CaV2.3 calcium channels represent a much smaller Ca2+ source (Vilchis et al., 2000).

However, Ca2+ entry in the dendrites of SPNs is not mainly due to CaV2.1 and CaV2.2 channels. Near synaptic sites of SPNs, NMDA receptors and CaV1, CaV2.3, and CaV3 channels become a main source of Ca2+ (Galarraga et al., 1997; Carter and Sabatini, 2004; Higley and Sabatini, 2008; Flores-Barrera et al., 2011). Because Ca2+-activated K+-channels could be present in the synaptic regions of dendrites (e.g., Womack et al., 2004; Ngo-Anh et al., 2005; Gu et al., 2008; Benhassine and Berger, 2009; Wynne et al., 2009; Faber, 2010; Grewe et al., 2010; Allen et al., 2011; Hosy et al., 2011; Tonini et al., 2013), one question to ask is whether Ca2+-activated K+-currents are involved in dendritic synaptic integration and the repolarization of polysynaptic corticostriatal responses (Vizcarra-Chacon et al., 2013), given that Ca2+ sources could be different from those in the soma. Different sources of Ca2+ to activate Ca2+-dependent K+-currents in the dendrites and the soma-axon hillock regions could imply that synaptic inputs and the generation of action potentials may be regulated differentially. Thus, in the present work we describe the role of Ca2+-activated K+-currents in the corticostriatal responses and use them as reporters of the importance of Ca2+ influx into the dendrites during synaptic integration.

SK class channels are widely expressed in the brain and are found in dendritic spines near synaptic contacts (Ngo-Anh et al., 2005; Gu et al., 2008; Faber, 2010; Allen et al., 2011; Hosy et al., 2011; Tonini et al., 2013). BK class channels are functional in each neuronal compartment: soma, dendrites and terminals (Gu et al., 2008; Benhassine and Berger, 2009; Wynne et al., 2009; Grewe et al., 2010). Ca2+-activated K+-channels on neuronal dendrites can be activated by calcium influx through synaptic glutamate receptors and/or voltage-activated Ca2+-channels (Faber, 2009).

Because polysynaptic corticostriatal responses after a single stimulus are more prolonged, and evoke more action potentials, in direct pathway SPNs (dSPNs) than in indirect pathway SPNs (iSPNs) (Bargas et al., 1991; Flores-Barrera et al., 2010, 2011), a logical follow-up of our previous studies had the following goals. First, to identify whether Ca2+-activated K+-currents participate in the polysynaptic corticostriatal response of SPNs (Vizcarra-Chacon et al., 2013). Second, to determine the relative importance of these currents in suprathreshold synaptic integration. Third, to observe whether there is a difference in the roles of BK and SK currents in these responses. And finally, to find out if these currents can explain the difference in duration between the corticostriatal polysynaptic responses of dSPNs and iSPNs.

## **MATERIALS AND METHODS**

#### **SLICE PREPARATION**

All experiments were carried out in accordance with the National Institutes of Health Guide for Care and Use of Laboratory Animals and were approved by the Institutional Animal Care Committee of the Universidad Nacional Autónoma de México. D1 and D2 dopamine receptors-eGFP BAC transgenic mice, between postnatal days 30–60 (PD30-60; FVB background, developed by the GENSAT project) were used. Adult Wistar rats were also used to detect possible inconsistencies. The number of animals employed in the experimental samples was near the minimal possible to attain robust reproducible results and statistical significance. Animals were anesthetized with ketamine/xylazine. Their brains were quickly removed and placed into ice-cold (4°C) cerebrospinal fluid (CSF) containing (in mM): 126 NaCl, 3 KCl, 25 NaHCO3, 1 MgCl2, 2 CaCl2, 11 glucose, 300 mOsm/L, pH = 7.4 with 95% O2 and 5% CO2. Parasagittal neostriatal slices (250–300 μm thick) were cut using a vibratome and stored in oxygenated bath CSF at room temperature for at least 1 h before recording.

#### **ELECTROPHYSIOLOGICAL RECORDINGS**

Whole-cell patch-clamp recordings were performed with micropipettes made with borosilicate glass, fire polished for D.C. resistances of about 3–6 MΩ. Internal solution was (in mM): 120 KMeSO4, 2 MgCl2, 10 HEPES, 10 EGTA, 1 CaCl2, 0.2 Na2ATP, 0.2 Na3GTP, and 1% biocytin. Some recordings were carried out using sharp microelectrodes (80–120 MΩ) filled with 1% biocytin and 3 M potassium acetate, fabricated from borosilicate glass (Flores-Barrera et al., 2010). Slices were superfused with CSF at 2 ml/min (34–36°C). Recordings were digitized and stored with the aid of software designed in the laboratory by one of the co-authors in the LabView environment (National Instruments, Austin, TX, USA). To have an approximation of whole neuronal input resistance (RN) and changes in bridge balance, a small current pulse of −20 mV was delivered to the soma before the orthodromic responses were obtained at intervals over 20 s. Changes over 20% in the responses to this pulse or in the D.C. current needed to maintain the holding potential in current-clamp mode led us to discard the experiments. The sampling rate and digitizing procedures made it difficult to capture both the slow corticostriatal responses and full-blown trains of sodium spikes at the same time, so that some spikes look truncated in some traces. Cells with resting potential more negative than −70 mV (at zero current) and RN of about 100–200 MΩ were chosen (e.g., Salgado et al., 2005).
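The 20% stability criterion described above can be expressed as a simple check. The following Python sketch is only an illustration under stated assumptions: the function names, the example numbers, and the choice to measure change relative to the first (baseline) measurement are hypothetical and are not taken from the acquisition software used in the study.

```python
def exceeds_tolerance(baseline, current, tolerance=0.20):
    """Return True if `current` deviates from `baseline` by more than `tolerance`.

    The deviation is expressed as a fraction of the baseline magnitude
    (assumption: change is measured relative to the first measurement).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return abs(current - baseline) / abs(baseline) > tolerance


def keep_recording(pulse_baseline, pulse_now, holding_baseline, holding_now):
    """Keep an experiment only if both monitored quantities stay within 20%.

    `pulse_*` is the response to the small test pulse and `holding_*` is the
    D.C. current needed to maintain the holding potential.
    """
    return not (
        exceeds_tolerance(pulse_baseline, pulse_now)
        or exceeds_tolerance(holding_baseline, holding_now)
    )


# Hypothetical values: a 15% change in the pulse response is tolerated,
# whereas a 25% change leads to rejection of the experiment.
print(keep_recording(-10.0, -11.5, -50.0, -55.0))
print(keep_recording(-10.0, -12.5, -50.0, -55.0))
```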

Drugs were dissolved in the CSF from stock solutions made daily. Recordings were carried out in the dorsal striatum. Stimulation was performed with concentric bipolar electrodes (tip = 50 μm) in the cortex (>250 μm from the callosum). Synaptic responses were evoked by a single square pulse of 0.1 ms. The cell membrane potential was held at −80 mV. A series of current pulses of increasing intensities was used. Responses obtained with suprathreshold stimulus strength (2× threshold) were compared (Vizcarra-Chacon et al., 2013). Current-clamp data were obtained to observe the most physiological response. eGFP-positive and negative neurons from D1 and D2 eGFP animals were compared. After recordings, neurons were injected with biocytin. eGFP-positive neurons were visualized on a confocal microscope as previously described (Flores-Barrera et al., 2010).

Apamin and charybdotoxin were obtained from Alomone Labs (Israel). NS 1619 (BK channel activator), CyPPA, and NS 309 (SK channel activators) were acquired from TOCRIS (R&D Systems, Minneapolis, MN, USA).

#### **STATISTICS**

Statistical values are presented as mean ± s.e.m. Digital subtraction was used to obtain time courses of the components enhanced or decreased by the different peptides and drugs. The main parameter reported was "half width" (duration at 50% of maximal response amplitude) because it comprises both the increase in response duration and the increase in response amplitude. Kruskal-Wallis tests with *post-hoc* Dunn tests were used to compare samples with different treatments. Statistical difference was accepted when *P* < 0.05. Intensity-response (I-R) plots were built using as response the average magnitude of the half width as a function of stimulus strength. Data were fitted to the following sigmoid function using a non-linear Marquardt algorithm:

$$R(i) = \frac{R_{\text{max}}}{1 + e^{-k(i - i_h)}}$$

where *R*(*i*) is the response as a function of stimulus intensity normalized to threshold units, *R*max is the maximal half width attained, *k* is the slope factor and *i*<sub>h</sub> is the stimulus intensity necessary to attain a response of half-maximal magnitude. In this work, the value reported is *R*max. The function fits to average experimental values are plotted with their 95% confidence intervals as colored bands (Tecuapetla et al., 2005; López-Huerta et al., 2013) while standard errors of average responses are plotted as usual: lines with the symbols.
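To illustrate how the parameters of this sigmoid shape the I-R curve, the short Python sketch below evaluates the function at the stimulus strengths used in the experiments. The parameter values are hypothetical and were chosen only for illustration; they are not fitted values from this study, and the fitting step itself (a non-linear least-squares routine of the Marquardt type) is not shown.

```python
import math


def intensity_response(i, r_max, k, i_h):
    """Sigmoid I-R function: R(i) = r_max / (1 + exp(-k * (i - i_h)))."""
    return r_max / (1.0 + math.exp(-k * (i - i_h)))


# Hypothetical parameters chosen only for illustration (not fitted values).
R_MAX_MS = 300.0  # maximal half width, in ms
K = 4.0           # slope factor, per threshold unit
I_H = 1.0         # intensity (in threshold units) giving a half-maximal response

# Evaluate at subthreshold (0.5x), threshold (1.0x) and suprathreshold (2.0x).
for intensity in (0.5, 1.0, 2.0):
    response = intensity_response(intensity, R_MAX_MS, K, I_H)
    print(f"{intensity:.1f}x threshold -> {response:.1f} ms")
```

By construction, the response at the intensity `I_H` equals half of `R_MAX_MS`, and the curve approaches `R_MAX_MS` as the intensity grows, which is why the saturating value is the one reported.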

## **RESULTS**

SPNs were recorded from PD30-60 BAC D1 or D2 eGFP mice and from rats of equivalent age (*n* = 77). We previously reported that suprathreshold corticostriatal responses are more prolonged and evoke more action potentials in dSPNs than in iSPNs (Flores-Barrera et al., 2010) and that their duration depends on polysynaptic activity (Vizcarra-Chacon et al., 2013) involving both cortical and striatal neurons (a feed-forward circuit) as well as intrinsic currents (Vergara et al., 2003; Flores-Barrera et al., 2009, 2011).

#### **BLOCKADE OF BK AND SK CHANNELS DEPOLARIZES AND PROLONGS CORTICOSTRIATAL RESPONSES IN STRIATAL PROJECTION NEURONS**

Red trace in **Figure 1A** is a control corticostriatal response in a dSPN. The superimposed blue trace is a response obtained with the same stimulus and in the same cell after adding 20 nM charybdotoxin (ChTx), a blocker of BK-channels, to the bath CSF: the response exhibited an additional depolarization that prolonged the response and the number of spikes. The superimposed

**FIGURE 1 | Blockade of BK and SK channels depolarize and prolong corticostriatal responses in striatal projection neurons. (A)** Three superimposed records from a dSPN were obtained without changing stimulus strength: they show a corticostriatal response in control conditions (red trace), the same response in the presence of the blocker of BK-channels, 20 nM charybdotoxin (ChTx, blue trace), and finally, the same response in the presence of both ChTx and the blocker of SK channels, 100 nM apamin (ChTx + apamin, black trace). Each blocker depolarized and prolonged the response, and also increased the number of action potentials that were fired. Digital subtractions at bottom illustrate the hyperpolarizing influence that Ca2+-activated K+-currents exerted over the corticostriatal responses and that were suppressed by the blockade of ChTx (blue) and apamin (black). Note that the hyperpolarizing influence of Ca2+-activated K+-currents rise slowly and last hundreds of milliseconds thus contributing to restrain the build up of the corticostriatal response. **(B)** Intensity-response (I-R) graphs obtained by plotting average half width (duration at 50% of the peak amplitude of the response) as a function of threshold intensity including: minimal, subthreshold (0.5×), threshold (1.0×) and suprathreshold responses (2.0×). A sigmoid function was fitted. Curves with the action of each blocker are plotted individually as well as the administration of both blockers together. Shadowed colored areas denote 95% confidence intervals, symbols depict average values of the samples for each stimulus strength ± s.e.m. **(C)** Tukey box plots illustrate the distributions of

measurements for suprathreshold responses (half width at 2× threshold strength), in control, in the presence of each blocker, and in the presence of both blockers. Differences are significant: ∗*P* < 0.5 and ∗∗*P* < 0.01 with respect to the controls. **(D)** Three superimposed records from an iSPN were obtained without changing stimulus strength: they show a corticostriatal response in control conditions (green trace), the same response in the presence of the blocker of BK channels, 20 nM charybdotoxin (ChTx, blue trace), and finally, the same response in the presence of both ChTx and apamin (ChTx plus apamin, black trace). As in the case of dSPNs, each blocker increased response duration and depolarization (half width). Subtractions at the bottom show the hyperpolarizing influences exerted by Ca2+-activated K+-currents that were suppressed by charybdotoxin (ChTx, blue trace) and apamin plus ChTx (black trace). Note that hyperpolarization induced by Ca2+-activated K+-currents rise fast and then produced a sustained plateau hyperpolarization that restrains the corticostriatal response by hundreds of milliseconds until final return to holding potential. **(E)** Fitted I-R plots (half width) were graphed individually for each blocker and for both blockers administered together as in **(B)**. **(F)** Box plots summarize the distributions of measurements for suprathreshold response in control, with each blocker, and with both blockers administered together. Differences are significant: ∗*P* < 0.5 and ∗∗*P* < 0.01 with respect to the control. Due to the intent of showing the slow responses some fast spikes are truncated by the digitizing procedure.

black trace is the corticostriatal response to the same stimulus after adding 100 nM apamin, a blocker of SK-channels, in the continuous presence of ChTx. This response was even more depolarized, more prolonged and the number of action potentials fired increased, showing that Ca2+-activated K+-currents control the level and duration of corticostriatal responses which physiologically compose the depolarized up-states (Stern et al., 1997; Vergara et al., 2003; Vautrelle et al., 2009) of SPNs. At the bottom, digital subtractions of the hyperpolarizing influences elicited by Ca2+-activated K+-currents that were suppressed by the actions of these peptides are illustrated (ChTx in blue and apamin plus ChTx in black). Note that hyperpolarizations induced by Ca2+-activated K+-currents rise slowly and lasts hundreds of milliseconds, suggesting that, after synaptic activation, Ca-entry takes some time to activate BK- and SK-channels in dSPNs. The parameter that encompasses both the depolarization and the increase in duration produced by the peptides is the half width of the response (duration at half amplitude). **Figure 1B** shows that half width as a function of stimulus intensity (in threshold units) could be used to build I-R plots, which could be fitted to sigmoid functions (see Materials and Methods): continuous traces represent the average fit of several similar experiments and surrounding shadowed areas represent 95% confidence intervals. Symbols with lines represent individual average responses ± s.e.m. to each stimulus. Note that peptides, administered jointly, or independently, change I-R plots with respect to the control, although all plots tend to reach saturation. 
Box plots in **Figure 1C** summarize the actions of both peptides in the experimental samples (because not all distributions approached normality, distribution-free statistical tests were used for comparisons; see Materials and Methods), including a sample where both peptides were applied together in either order at 2× threshold strength (saturation). Half width in the control was (mean ± s.e.m.): 296 ± 26 ms (*n* = 24, in red), after ChTx it was 474 ± 40 ms or a 60% increase (*n* = 10; in blue, ∗*P* < 0.05), after apamin it was 370 ± 53 ms or a 25% increase (in purple, *n* = 10; ∗*P* < 0.05), and with both peptides applied together it was 773 ± 26 ms or a 161% increase (in black, *n* = 6; ∗∗*P* < 0.02). Fitted values of the same parameter (e.g., *R*max for maximal half width; see Materials and Methods) were not significantly different from experimental measurements (**Figure 1B**) and therefore are not included here to avoid redundancy.
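The half width used throughout (duration at half amplitude) can be illustrated with a minimal sketch. This is not the authors' acquisition or analysis program: the function name `half_width`, the synthetic triangular trace, the 1 ms sampling interval and the -80 mV baseline are all assumptions made for illustration, and the sketch ignores fast spikes riding on the slow response.

```python
def half_width(trace, dt_ms, baseline):
    """Duration (ms) between the first and last samples that sit at or
    above half of the peak amplitude, measured from the baseline."""
    amplitudes = [v - baseline for v in trace]
    peak = max(amplitudes)
    if peak <= 0:
        return 0.0  # no depolarizing response to measure
    half = peak / 2.0
    above = [i for i, a in enumerate(amplitudes) if a >= half]
    return (above[-1] - above[0]) * dt_ms


# Synthetic slow response (hypothetical values): 100 ms linear rise from
# -80 to -60 mV followed by a 300 ms linear decay back to -80 mV.
rise = [-80.0 + 20.0 * i / 100 for i in range(101)]
decay = [-60.0 - 20.0 * i / 300 for i in range(1, 301)]
trace = rise + decay

hw = half_width(trace, dt_ms=1.0, baseline=-80.0)
print(f"half width: {hw:.0f} ms")
```

For this synthetic trace the half-amplitude level is crossed about 50 ms into the rise and about 150 ms into the decay, so the measured half width is close to 200 ms.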

The same experiments were performed in iSPNs. Half width of iSPNs is known to be briefer than that of dSPNs (Flores-Barrera et al., 2010). These neurons commonly fire a single or brief high frequency burst of spikes at the beginning of the synaptic response. As above, **Figure 1D** shows three superimposed records: the green one was taken in control conditions, the blue trace was obtained with the same stimulus and in the same cell after addition of 20 nM charybdotoxin (+ChTx) and the black trace was obtained after addition of 100 nM apamin (+apa) in the continuous presence of charybdotoxin. Each peptide depolarized and prolonged the response (half width). At the bottom, subtractions of the hyperpolarizing influences elicited by Ca2+-activated K+-currents blocked by these peptides are illustrated (charybdotoxin in blue and ChTx plus apamin in black). Note that hyperpolarizations induced by Ca2+-activated K+-currents in iSPNs show a major difference when compared to those obtained from dSPNs: hyperpolarizations begin quickly, and are then followed by a plateau-like hyperpolarization that restrains the corticostriatal response for hundreds of milliseconds before slowly returning to holding potential during the last half of the response. These responses suggest that, in contrast to dSPNs, Ca-entry activates BK- and SK-channels immediately at the beginning of the synaptic response in iSPNs, and that the Ca-entry that induces them probably comes from all-or-none dendritic phenomena that produce the Ca-influx (Bargas et al., 1991; Flores-Barrera et al., 2010; Higley and Sabatini, 2010; Plotkin et al., 2011; Vizcarra-Chacon et al., 2013). **Figure 1E** shows fitted I-R plots for each peptide acting independently or together in either order. Continuous traces are the function fits for the set of experiments and surrounding shadowed areas correspond to 95% confidence intervals. Symbols with lines are average measurements ± s.e.m. 
Box plots in **Figure 1F** summarize the actions of both peptides in different samples, including the sample where both peptides were applied together in either order: at 2× threshold strength, the control was 115 ± 13 ms (in green, *n* = 19), while after ChTx it was 204 ± 44 ms or a 77% increase (in blue, *n* = 7; ∗*P* < 0.05), and after apamin it was 267 ± 46 ms or a 132% increase (in purple, *n* = 8; ∗∗*P* < 0.02). Both peptides applied together rendered a half width of 344 ± 48 ms or a 199% increase (in black, *n* = 6; ∗∗*P* < 0.02).
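The percentage increases quoted for the blockers follow directly from the reported mean half widths. The short check below recomputes them; the helper name `percent_change` and the dictionary layout are illustrative choices, while the numbers are the means given in the text.

```python
def percent_change(control, treated):
    """Change of the treated mean relative to the control mean, in percent."""
    return 100.0 * (treated - control) / control


# Mean half widths (ms) at 2x threshold strength, as reported in the text.
reported_means = {
    "dSPN": {"control": 296, "ChTx": 474, "apamin": 370, "ChTx + apamin": 773},
    "iSPN": {"control": 115, "ChTx": 204, "apamin": 267, "ChTx + apamin": 344},
}

changes = {}
for cell, means in reported_means.items():
    for condition, value in means.items():
        if condition == "control":
            continue
        changes[(cell, condition)] = round(percent_change(means["control"], value))
        print(f"{cell} {condition}: +{changes[(cell, condition)]}%")
```

The rounded results (60, 25 and 161% for dSPNs; 77, 132 and 199% for iSPNs) match the values stated above.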

#### **ENHANCEMENT OF BK AND SK CHANNELS HYPERPOLARIZES AND SHORTENS CORTICOSTRIATAL RESPONSES IN STRIATAL PROJECTION NEURONS**

Red trace in **Figure 2A** is a control corticostriatal response in a dSPN. The superimposed orange trace is the same response obtained with the same stimulus and in the same cell after adding 2.5 μM NS 309, an activator of SK-channels, to the bath CSF: response depolarization decreased and duration became briefer. 2.5 μM CyPPA had the same actions (not illustrated). Finally, the same response is shown in the presence of both NS 309 and the activator of BK-channels, 2.5 μM NS 1619 (black trace): there was a larger reduction of the response, repolarization was faster and the number of action potentials fired was reduced to one. These actions show that a positive modulation of Ca2+-activated K+-currents may control the magnitude of the corticostriatal responses in dSPNs. At the bottom, subtractions illustrate the cortical-induced depolarizations suppressed by these drugs (NS 309 in orange and NS 1619 plus NS 309 in black). That is, enhancement of Ca2+-activated K+-currents may suppress a great part of the prolonged depolarizing corticostriatal response. Nonetheless, note that these depolarizing components begin and end slowly in dSPNs. **Figure 2B** shows that I-R plots produced by any enhancer or a combination of them were depressed with respect to the control. Box plots in **Figure 2C** summarize the actions of both enhancers in different experimental samples, including a sample where both drugs were applied together in either order at 2× threshold strength. Half width in the control was (mean ± s.e.m.): 302 ± 31 ms (*n* = 18, in red), after NS 1619 it was 57 ± 2 ms or an 81% decrease (*n* = 6; in pink, ∗*P* < 0.05), after NS 309 it was 65 ± 6 ms or a 78% decrease (in orange, *n* = 6; ∗*P* < 0.05), and with both activators applied

**FIGURE 2 | Enhancement of BK and SK channels hyperpolarizes and shortens corticostriatal responses in striatal projection neurons. (A)** Three representative records from a dSPN were obtained in the same cell without changing stimulus strength (superimposed): the red trace shows a corticostriatal response in control conditions, the same response is shown in the presence of 2.5 μM NS 309 (orange trace), an activator of SK-channels, and finally, the same response is shown in the presence of both NS 309 and the activator of BK-channels, 2.5 μM NS 1619 (black trace). Each activator hyperpolarized the response, shortened its duration, and decreased the number of action potentials fired. Subtractions of the response depolarizations blocked by NS 309 (orange) and NS 309 plus NS 1619 (black) are shown at the bottom. Note that depolarizations blocked by these drugs begin slowly and last hundreds of milliseconds. Therefore, their blockade greatly reduced the suprathreshold responses. **(B)** Fitted I-R plots with 95% confidence intervals and experimental averages ± s.e.m. Action of each drug is plotted individually as well as the administration of both activators together. **(C)** Box plots summarize the distribution of measurements for suprathreshold response (half width at 2×) in different samples: control, with each activator, and with both activators administered together. Samples differed significantly from the control (∗*P* < 0.05). **(D)** Three superimposed records from an iSPN were obtained without changing stimulus strength: they show a corticostriatal response in control conditions (green trace), the same response in the presence of the activator of SK channels, 2.5 μM NS 309 (orange trace), and finally, the same response in the presence of both NS 309 and 2.5 μM NS 1619 (black trace). As in the case of dSPNs, each activator decreased response duration and depolarization (half width). Subtractions of the depolarizations blocked by NS 309 (orange) and NS 309 plus NS 1619 (black) are shown at the bottom. Note that depolarizations in this case had a sudden initiation and repolarization. **(E)** Intensity-response (I-R) graphs obtained by plotting half width (duration) at 50% of the peak amplitude of the response as a function of threshold intensity including subthreshold (0.5×), threshold (1.0×), and suprathreshold responses (2.0×). Action of each activator is plotted individually as well as the administration of both drugs together. **(F)** Box plots summarize the distribution of samples for suprathreshold response (half width at 2×) in control, with each activator, and with both drugs administered together. Differences were significant (∗*P* < 0.05). Although some fast spikes are truncated by the digitizing procedure, note that in these cases some initial spikes exhibit inactivation due to their relative refractory periods.

together it was 54 ± 16 ms or an 82% decrease (in black, *n* = 5; ∗*P* < 0.05).
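The subtractions shown at the bottom of the figure panels are point-by-point differences between two records obtained with the same stimulus. A minimal sketch follows, assuming both traces share the same sampling and alignment and taking the difference as control minus drug; the function name `subtract_traces`, the sign convention and the sample values are illustrative assumptions, not the authors' procedure.

```python
def subtract_traces(control, with_drug):
    """Point-by-point difference (control - drug) of two aligned traces.
    With an activator the result is the depolarization removed by the drug
    (positive values); with a blocker it is the hyperpolarizing influence
    that the blocked current exerted in control (negative values)."""
    if len(control) != len(with_drug):
        raise ValueError("traces must have the same number of samples")
    return [c - d for c, d in zip(control, with_drug)]


# Hypothetical depolarizations above holding potential (mV).
control = [0.0, 10.0, 20.0, 15.0, 5.0]
with_activator = [0.0, 8.0, 12.0, 6.0, 1.0]

suppressed = subtract_traces(control, with_activator)
print(suppressed)
```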

Green trace in **Figure 2D** is a control corticostriatal response in an iSPN. The superimposed orange trace is the same response after adding 2.5 μM NS 309 (SK-channels enhancer). Note that response depolarization decreased, duration became briefer and fewer action potentials were fired. 2.5 μM CyPPA had the same actions (not illustrated). Also superimposed, the response of the same cell is shown in the presence of both NS 309 and 2.5 μM NS 1619 (black trace). Note that the cell response was reduced to the point of becoming subthreshold. These actions show that Ca2+-activated K+-currents may modulate corticostriatal responses in iSPNs. At the bottom, subtractions showing the depolarizing components of the corticostriatal responses shunted by the enhancement of Ca2+-activated K+-currents are illustrated (NS 309 in orange and NS 1619 plus NS 309 in black). Note that actions of these drugs begin and return faster, as compared to those from dSPNs. **Figure 2E** shows that I-R plots produced by any enhancer or a combination of them were depressed with respect to the control. Box plots in **Figure 2F** summarize the actions of both enhancers in the experimental samples, including a sample where both drugs were applied together in either order at 2× threshold strength. Half width in the control was (mean ± s.e.m.): 110 ± 9 ms (*n* = 10, in green), after NS 1619 it was 44 ± 4 ms or a 60% decrease (*n* = 6; in pink, ∗*P* < 0.05), after NS 309 it was 40 ± 5 ms or a 62% decrease (in orange, *n* = 6; ∗*P* < 0.05), and with both activators applied together it was 38 ± 16 ms or a 64% decrease (in black, *n* = 4; ∗*P* < 0.05).
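To make concrete what a "depressed" I-R plot means, the sketch below evaluates a generic sigmoid at several stimulus intensities. The functional form, the parameter names `r_max`, `i_half` and `slope`, and the values of `i_half` and `slope` are assumptions; the actual fitted function is described in Materials and Methods and is not reproduced here. The two `r_max` values simply borrow the reported iSPN means at 2× threshold (110 ms in control, 44 ms with NS 1619) as stand-ins for the saturation level.

```python
import math


def sigmoid_ir(intensity, r_max, i_half, slope):
    """Half width (ms) predicted by a generic sigmoid at a stimulus
    intensity expressed in threshold units."""
    return r_max / (1.0 + math.exp(-(intensity - i_half) / slope))


intensities = [0.5, 1.0, 1.5, 2.0]  # subthreshold to suprathreshold

# Hypothetical parameters: only r_max differs between the two curves.
control_curve = [sigmoid_ir(i, r_max=110.0, i_half=1.0, slope=0.2) for i in intensities]
depressed_curve = [sigmoid_ir(i, r_max=44.0, i_half=1.0, slope=0.2) for i in intensities]

for i, c, d in zip(intensities, control_curve, depressed_curve):
    print(f"{i:.1f}x threshold: control {c:6.1f} ms, with enhancer {d:6.1f} ms")
```

Under these assumptions both curves saturate near 2× threshold, and the curve with the smaller `r_max` lies below the control at every intensity, which is the pattern described for Figures 2B and 2E.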

To conclude, in both classes of SPNs Ca2+-activated K+-currents control the level of depolarization and duration of the prolonged polysynaptic corticostriatal response during synaptic integration. These results strongly suggest that intrinsic currents such as Ca2+-activated K+-currents may control the influence of polysynaptic entries and their consequent activation of intrinsic currents contributing to up-state duration. However, a main difference between both classes of SPNs is that in iSPNs a much larger influence of Ca2+-activated K+-currents appears to be present, triggered immediately at the initiation of the synaptic inputs. Next, we studied the influence of Ca2+-activated K+-currents on subthreshold synaptic responses, before action potentials were elicited, to see if there is a difference between both neuron classes.

## **Ca2+-ACTIVATED K+-CURRENTS CONTROL SUBTHRESHOLD SYNAPTIC INPUTS IN iSPNs**

**Figures 3A,B** illustrate subthreshold synaptic potentials recorded after cortical stimulation in a dSPN and an iSPN in three different conditions: control (red and green traces for dSPN and iSPN, respectively), after addition of 100 nM apamin to block SK-channels (purple trace), and after addition of 20 nM ChTx in the continuous presence of apamin to block both SK- and BK-components of Ca2+-activated K+-currents (black traces). Note that in the case of dSPNs, calcium influx during subthreshold synaptic events does not appear to activate much Ca2+-activated K+-current, since the subthreshold synaptic potentials are very little affected. In comparison, blockade of Ca2+-activated K+-currents in iSPNs greatly alters the amplitude of subthreshold synaptic events, suggesting that calcium influx capable of activating these currents can be generated with a minimum of active synapses (Higley and Sabatini, 2010). Because these events most probably occur in dendritic spines, we conclude that calcium influx during subthreshold synaptic events is enough to activate Ca2+-activated K+-currents in iSPNs, but not in dSPNs. **Figures 3C,D** show similar experiments with the enhancers of Ca2+-activated K+-currents. First, subthreshold synaptic potentials were recorded in control conditions (red and green traces for dSPN and iSPN, respectively), then the same events were recorded during enhancement of SK-channels (orange traces, NS 309), and finally, the same events were recorded during enhancement of both SK- and BK-current components (black traces, with NS 309 plus NS 1619). Enhancers of Ca2+-activated K+-currents do affect the amplitude of subthreshold synaptic potentials in dSPNs, showing that, if activated by some other means (more or larger synaptic potentials or spatial summation, as in suprathreshold responses), the channels that carry these currents are near enough to shunt synaptic inputs. However, enhancers of Ca2+-activated K+-currents reduce subthreshold synaptic potentials induced in iSPNs to such an extent as to almost completely block the synaptic event. These results suggest that in iSPNs, synaptic inputs may activate Ca2+-activated K+-currents, augmenting the action of the enhancers and exerting a much larger effect as compared to dSPNs.

We conclude that even at subthreshold levels, Ca-entry during synaptic potentials in iSPNs is powerful enough to activate Ca2+-activated K+-currents and then shunt synaptic entries, to a degree that, during suprathreshold polysynaptic events (Vizcarra-Chacon et al., 2013) their duration and depolarization (half width) are greatly reduced in comparison to the same events generated in dSPNs. To further support these conclusions, we tried to control the corticostriatal suprathreshold responses in these neurons by manipulating their Ca2+-activated K+-currents.

#### **CONTROLLING VOLTAGE TRAJECTORIES OF CORTICOSTRIATAL RESPONSES**

The question underlying this series of experiments is: what are the main determinants of the different voltage trajectories found in dSPNs, as compared with iSPNs, during corticostriatal responses? **Figure 4A** shows that when the corticostriatal response from a dSPN is subject to an enhancer of a Ca2+-activated K+-current (BK-channels component enhanced by NS 1619 in this case), the response acquires a faster repolarization, shortens its train of spikes and becomes similar to a typical corticostriatal response from an iSPN. This result suggests that the reason why dSPN responses are slower than those from iSPNs is a decreased activation of Ca2+-activated K+-currents. Conversely, **Figure 4B** shows that when the corticostriatal response from an iSPN, exhibiting a brief train of spikes and a faster repolarization, is subject to a blocker of a Ca2+-activated K+-current (SK-channels component blocked by apamin in this case), the response is prolonged, its repolarization retarded and its train of spikes increases in duration, becoming similar to a corticostriatal response from a dSPN. This result supports the inference that the reason why iSPN responses are faster than those from dSPNs is an increased activation of Ca2+-activated K+-currents. Superimposition of the traces in each condition supports these hypotheses (**Figures 4C,D**).

**FIGURE 4 | Controlling voltage trajectories of corticostriatal responses: converting responses from dSPNs or iSPNs into one another. (A)** Control corticostriatal response in a dSPN (top). The arrow indicates the addition of 2.5 μM of the BK-channel activator NS 1619. Note that repolarization was enhanced and half width decreased, the corticostriatal response becoming comparable to that of an iSPN in control conditions (bottom). Inset: immunocytochemical preparation showing neurons from a BAC-D1 eGFP mouse (green) and the recorded neuron filled with biocytin (orange). **(B)** Control corticostriatal response in an iSPN. The arrow indicates the addition of 100 nM of the SK-channel blocker apamin. Note that fast repolarization and the firing of a brief burst of action potentials at the beginning of the response were replaced by a response with increased duration (half width) and a more prolonged train of action potentials firing at lower frequency, comparable to those seen in control dSPNs. Inset: immunocytochemical preparation showing neurons from a BAC-D2 eGFP mouse (green) and the recorded neuron filled with biocytin (orange). **(C)** Superimposition of control dSPN and iSPN responses clearly shows a larger half width for dSPNs. **(D)** After the actions of NS 1619 and apamin in dSPNs and iSPNs, respectively, the half width of iSPNs became larger than that of dSPNs, strongly suggesting that the shape of the corticostriatal responses in these neurons depends on Ca2+-activated K+-currents. Experiment performed in triplicate and with different blockers and enhancers. Although the digitizing procedure may show incomplete fast spikes, the increase in the amplitude of some spikes in the dSPN recording is due to longer interspike intervals (**A**, top). Note the constancy of spike amplitude in **(B)**, bottom.

## **Ca2+-ACTIVATED K+-CURRENTS AND PROPAGATION OF DENDRITIC AUTOREGENERATIVE EVENTS IN iSPNs**

It has been shown that the reason why iSPNs fire a brief train of action potentials at the beginning of the corticostriatal response and then repolarize faster than dSPNs is the presence of autoregenerative events in their dendrites: the brief train of action potentials rides on top of autoregenerative events that are elicited as local dendritic responses (Carter and Sabatini, 2004; Day et al., 2008; Flores-Barrera et al., 2010). However, on some occasions dendritic autoregenerative events propagate to the somatic compartment, inactivating the brief burst of spikes (Flores-Barrera et al., 2010). It is beyond the scope of the present report to determine the origin and ionic constitution of these regenerative events in iSPNs (but see: Higley and Sabatini, 2010).

However, several independent reports using different methods have posited that different classes of voltage-activated calcium channels as well as NMDA channels may be the origin of these events (Sabatini and Svoboda, 2000; Sabatini et al., 2002; Carter and Sabatini, 2004; Carter et al., 2007; Higley and Sabatini, 2008; Flores-Barrera et al., 2009; Plotkin et al., 2011). Moreover, channels that carry Ca2+-activated K+-currents have been shown to be present in dendritic spines or nearby dendrites in many neuron classes (Wolfart and Roeper, 2002; Cai et al., 2004; Bond et al., 2005; Ngo-Anh et al., 2005; Gu et al., 2008; Benhassine and Berger, 2009; Faber, 2009, 2010; Lujan et al., 2009; Hopf et al., 2010; Allen et al., 2011; Hosy et al., 2011), and data from striatal neurons is very suggestive that, at least in iSPNs, a tight synaptic association between synaptic, voltage-activated Ca2+-channels and Ca2+-activated K+-channels may be present (Day et al., 2008; Higley and Sabatini, 2008; Hopf et al., 2010).

The above described phenomena suggest that Ca2+-activated K+-currents may control autoregenerative events initiated by dendritic synaptic inputs, thus interfering with their propagation to the somatic-axonal compartments at most times. An experiment supporting this inference is shown in **Figure 5A**: the green trace shows a couple of action potentials riding on top of a suspected dendritic regenerative event taking place during synaptic activation. The subsequent addition of 100 nM apamin discloses the regenerative event by promoting its propagation to the soma (purple trace). Further addition of 20 nM ChTx in the continuous presence of apamin (black trace) makes this event more prolonged and capable of inducing the firing of fast spikes. These results suggest that Ca2+-activated K+-currents were preventing autoregenerative events in the dendrites from propagating to the somatic compartment. In addition, in iSPNs, autoregenerative events sometimes take the place of the initial bursts of sodium spikes (Flores-Barrera et al., 2010). The green trace in **Figure 5B** illustrates one such case. It is also shown that addition of NS 309 (orange trace) rapidly abolished the propagation of the regenerative event, producing a fast repolarization. These experiments suggest that Ca2+-activated K+-currents control synaptic integration in SPNs; in particular, they prevent propagation of regenerative events to the somatic area, secluding them within dendritic compartments. Their propagation would make the generation and propagation of a burst of action potentials out to the axon inefficient.

**FIGURE 5 | Ca2+-activated K+-currents control the propagation of autoregenerative potentials in iSPNs. (A)** Green trace shows a control suprathreshold corticostriatal response in an iSPN. Addition of 100 nM apamin discloses an autoregenerative and propagated action potential whose main ionic component is Ca2+ (purple trace; Bargas et al., 1991). Subsequent addition of 20 nM ChTx in the continuous presence of apamin further prolongs the duration of the regenerative event without increasing its amplitude, illustrating its all-or-none properties (black trace). **(B)** Green trace is a spontaneous regenerative event that sometimes appears in control suprathreshold responses in iSPNs (alternating with the initial burst of spikes). Addition of 2.5 μM NS 309 hindered the propagation of this event to the somatic area and accelerated the repolarization (orange trace). Clearly, fast spikes inactivate when calcium autoregenerative potentials propagate.

#### **DISCUSSION**

Briefly, the present work shows that: (1) The SK and BK components of Ca2+-activated K+-currents present in SPNs (Pineda et al., 1992; Bargas et al., 1999; Vilchis et al., 2000) participate in shaping the corticostriatal polysynaptic responses. Basically, these currents participate in synaptic integration, controlling the duration and depolarization of the responses. (2) The role of Ca2+-activated K+-currents is different in dSPNs as compared to iSPNs: in dSPNs, Ca2+-activated K+-currents produce slow gradual responses, suggesting that a diffusion delay occurs between Ca-entry and K+-current activation. In contrast, Ca2+-activated K+-current action measured in the same way in iSPNs is fast rising and thereafter maintained during a large portion of the response, suggesting that these currents are activated immediately at the beginning of the synaptic inputs. In fact, Ca2+-activated K+-currents appeared to regulate small subthreshold synaptic potentials in iSPNs (Higley and Sabatini, 2010) and much less so in dSPNs. (3) We show that Ca2+-activated K+-currents are main determinants of the time courses of the corticostriatal responses in SPNs: enhancement of these currents accelerated the repolarization of dSPNs, making them look like iSPNs. Correspondingly, their blockade prolongs the corticostriatal responses in iSPNs, making them look like dSPNs. Finally, (4) a main role is played by Ca2+-activated K+-currents in iSPNs. It is known that when these neurons are activated, they may elicit regenerative events, seen as all-or-none events that, when propagation ensues, can become full-blown calcium action potentials as recorded from the soma. When this occurs, the burst of action potentials that should go out to the axon is obliterated.

#### **OUTPUTS OF dSPNs AND iSPNs HAVE DIFFERENT DURATIONS**

Before BAC-mice were available, investigators averaged the output of SPNs thinking that different durations were a part of the same spectrum. However, BAC-D1/2 GFP mice have taught us that iSPNs and dSPNs have different excitable properties (Shen et al., 2007; Day et al., 2008; Kravitz et al., 2010; Gerfen and Surmeier, 2011), and that their responses to intracortical stimulation are differentially integrated (Flores-Barrera et al., 2010), even though responses to stimulation at the dendrites with uncaged glutamate, and the glutamate receptors employed, do not seem different (Plotkin et al., 2011; Vizcarra-Chacon et al., 2013). Once SPN classes were sorted out, differences in the duration of their physiological responses (trains of action potentials) were easily observed, even during extracellular recordings under certain conditions (optogenetic stimulation; e.g., see Figures 2E–H in: Kravitz et al., 2010). How can these differences in duration be explained? In the present work we show that Ca2+-activated K+-currents are a main factor in explaining the duration of corticostriatal responses.

Apparently, dendritic Ca2+-activated K+-currents limit calcium influx and depolarization, making a negative feedback loop (Faber, 2009). Since a stronger calcium entry has been previously reported in iSPNs (Day et al., 2008), the functional importance of Ca2+-activated K+-currents in determining the threshold and duration of dendritic autoregenerative potentials (most probably due to Ca2+: Bargas et al., 1991; Carter and Sabatini, 2004) is fundamental to avoid their propagation, so that the brief train of action potentials that goes out to the axon can be generated. The action of these currents at subthreshold levels shows that the calcium to activate them is available even with weak synaptic stimuli (Higley and Sabatini, 2010). However, at the level of dendritic spines, blockage of some Ca2+ sources could have paradoxical effects, increasing the amplitude of synaptic events (Higley and Sabatini, 2010). But at suprathreshold levels, where up-states could arise, NMDA receptors and SK channels are functionally coupled in the dendritic spines of hippocampal, amygdala, and striatal neurons (Bloodgood and Sabatini, 2005; Ngo-Anh et al., 2005; Lujan et al., 2009; Faber, 2010; Higley and Sabatini, 2010) to control the duration and depolarizing level of plateau potentials (Cai et al., 2004), limiting the influx of calcium through NMDA receptors and controlling the amount of excitation (Bond et al., 2005; Ngo-Anh et al., 2005; Faber, 2010; Tonini et al., 2013).
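The negative feedback loop described above (depolarization admits calcium, calcium activates a K+ conductance, and that conductance curtails the depolarization) can be caricatured with a two-variable toy model. This sketch is not a model from the present work and is not fitted to any data: the equations, the time constants, the square-pulse drive and the feedback gain `g_k` are all arbitrary assumptions, in arbitrary units, chosen only to show the direction of the effect.

```python
def simulate(g_k, dt=0.1, t_end=600.0, tau_v=10.0, tau_ca=50.0):
    """Forward-Euler integration of a toy feedback loop (arbitrary units).
    v   : depolarization above holding potential
    ca  : calcium signal that follows v with a lag
    g_k : gain of the Ca-activated, K-like term that pulls v back down
    """
    v, ca, trace = 0.0, 0.0, []
    for n in range(round(t_end / dt)):
        t = n * dt
        drive = 1.0 if 50.0 <= t < 250.0 else 0.0  # hypothetical synaptic drive
        dv = (drive - v - g_k * ca * v) / tau_v
        dca = (v - ca) / tau_ca
        v += dt * dv
        ca += dt * dca
        trace.append(v)
    return trace


def time_above(trace, level, dt=0.1):
    """Total time (same units as dt) the trace spends at or above a level."""
    return sum(1 for v in trace if v >= level) * dt


no_feedback = simulate(g_k=0.0)
with_feedback = simulate(g_k=4.0)

print("peak:", max(no_feedback), "vs", max(with_feedback))
print("time above 0.5:", time_above(no_feedback, 0.5), "vs", time_above(with_feedback, 0.5))
```

In this caricature, switching the feedback on lowers the peak depolarization and shortens the time spent above a fixed level, and setting `g_k` back to zero (the analog of a blocker) restores the longer response.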

But why is the activation of Ca2+-dependent K+-currents faster in iSPNs than in dSPNs, so that repolarization of responses in iSPNs makes them much briefer? Apparently, eliciting local autoregenerative responses in the dendrites, at the time of synaptic inputs, is more common in iSPNs than in dSPNs (Flores-Barrera et al., 2010; Higley and Sabatini, 2010). A brisk Ca2+ entry may explain the fast rise in the activation and actions of Ca2+-activated K+-currents, and therefore the faster repolarization of iSPNs after a brief burst of action potentials.

However, these are not the only intrinsic currents that participate in shaping the suprathreshold corticostriatal response or in eliciting autoregenerative responses in SPNs. Other inward currents participate, for example the slow inward current carried by CaV1 channels (Galarraga et al., 1997; Vergara et al., 2003; Carter and Sabatini, 2004; Flores-Barrera et al., 2011), a slow sodium component blocked by phenytoin and riluzole (Carrillo-Reid et al., 2009b), and inward currents carried by CaV3 and CaV2.3 channels (Carter and Sabatini, 2004; Higley and Sabatini, 2010; Plotkin et al., 2011), all of them in addition to the synaptically activated NMDA and kainate slow current components (Schiller and Schiller, 2001; Carter and Sabatini, 2004; Plotkin et al., 2011; Vizcarra-Chacon et al., 2013). But analyses are not complete: other inward currents need to be confirmed or discarded in mature neurons, for example currents carried by TRP channels (Hill et al., 2006) or calcium-activated non-selective cation currents (ICAN; Mrejeru et al., 2011).

Further research is needed to see if different synapses (proximal, distal) use the same calcium source, or if there could be many sources, either the same or diverse for different synapses. It is known that voltage-gated Ca2+-channels and NMDA receptors differ in their distribution over spines and dendritic shafts. High calcium concentrations can only be reached within a few nanometers around the channel and dissipate within microseconds (Sabatini et al., 2002). Thus, the Ca2+ source used to activate Ca2+-dependent K+-currents may be very specific and precise (Vilchis et al., 2000). In contrast, NMDA-channels lead to a longer lasting calcium gradient extending across the spine during an excitatory postsynaptic potential (Sabatini and Svoboda, 2000; Keller et al., 2008). In any case, the Ca2+ source should be able to generate autoregenerative events to explain the differences in duration of the responses between dSPNs and iSPNs.

In view of the above, it should not seem strange that intrinsic outward currents should also participate in suprathreshold corticostriatal responses to damp or stop inward currents from reaching too strong depolarizations. In the present work we show that Ca2+-activated K+-currents are a component of the outward currents that fulfill this role. They have been described at the somatic (Bargas et al., 1999) and the dendritic levels (Ngo-Anh et al., 2005). However, several other outward currents may participate together with the synaptic GABAergic currents (Flores-Barrera et al., 2010) in order to control suprathreshold responses originated by converging polysynaptic cortical inputs known to outlast the initial stimulus and the monosynaptic responses by several hundreds of milliseconds (Vizcarra-Chacon et al., 2013).

It is known that these prolonged responses arise from the polysynaptic convergence of cortical and feed-forward striatal inputs because in current-clamp mode the glutamatergic response can clearly be divided into two parts: an early part, which is monosynaptic, and a late part, which by definition cannot be monosynaptic but is blocked by AMPA-receptor antagonists (Vizcarra-Chacon et al., 2013), that is, it has to be polysynaptic. In addition, in voltage-clamp mode, at certain holding potentials, a late barrage of synaptic inputs keeps arriving for several hundreds of milliseconds after the initial monosynaptic response (Goldberg et al., 2012; Vizcarra-Chacon et al., 2013). Either stimulus strength or holding potential has to be changed for the activation of unclamped intrinsic currents by this synaptic barrage (Flores-Barrera et al., 2009). Finally, several striatal interneurons can be activated from the cortex (e.g., Vizcarra-Chacon et al., 2013) and the feed-forward circuit that involves other striatal cellular elements in the suprathreshold corticostriatal response of SPNs is well known (rev. in: Surmeier et al., 2011).

## **CONCLUSION**

Ca2+-activated K+ channels, most probably located at the dendrites, explain the different durations between corticostriatal responses in dSPNs and iSPNs. Because polysynaptic corticostriatal responses (Vizcarra-Chacon et al., 2013) are the components of more prolonged up-states (Stern et al., 1997), an inference can be made: transmitters that modulate Ca2+-influx or Ca2+-activated K+-currents intervene in the regulation of up-state duration. This in turn regulates correlated firing among neuronal ensembles to produce network dynamics (Carrillo-Reid et al., 2008). Therefore, these currents may be a part of the explanation of the actions of modulatory transmitters at the striatal microcircuit (Carrillo-Reid et al., 2009a; Jáidar et al., 2010; Carrillo-Reid et al., 2011).

## **ACKNOWLEDGMENTS**

We thank Antonio Laville, Gabriela X Ayala, and Ariadna Aparicio for technical support and advice and Dr. Claudia Rivera for animal care. This work was supported by the Consejo Nacional de Ciencia y Tecnología (México) grants 98004 and 154131 and by grants from Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México: IN202914 and IN202814 to Elvira Galarraga and José Bargas, respectively. Mario A Arias-García has a CONACyT doctoral fellowship and data in this work are part of his doctoral dissertation in the Posgrado en Ciencias Biomédicas de la Universidad Nacional Autónoma de México. Jesús Pérez-Ortega was the author of the acquisition program used for stimulation and recording.

## **REFERENCES**


striatal EPSPs through an L-type Ca2+ conductance. *Neuroreport* 8, 2183–2186. doi: 10.1097/00001756-199707070-00019


*Neuroreport* 14, 1253–1256. doi: 10.1097/01.wnr.0000081861.45938.71


the SLO family. *Nat. Rev. Neurosci.* 7, 921–931. doi: 10.1038/nrn1992


channels are selectively coupled to P/Q-type calcium channels in cerebellar Purkinje neurons. *J. Neurosci.* 24, 8818–8822. doi: 10.1523/JNEUROSCI.2915-04.2004

Wynne, P. M., Puig, S. I., Martin, G. E., and Treistman, S. N. (2009). Compartmentalized β subunit distribution determines characteristics and ethanol sensitivity of somatic, dendritic, and terminal large conductance calcium-activated potassium channels in the rat central nervous system. *J. Pharmacol. Exp. Ther*. 329, 978–986. doi: 10.1124/jpet.108.146175

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 August 2013; accepted: 15 September 2013; published online: 04 October 2013.*

*Citation: Arias-García MA, Tapia D, Flores-Barrera E, Pérez-Ortega JE, Bargas J and Galarraga E (2013) Duration differences of corticostriatal responses in striatal projection neurons depend on calcium activated potassium currents. Front. Syst. Neurosci. 7:63. doi: 10.3389/fnsys.2013.00063*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Arias-García, Tapia, Flores-Barrera, Pérez-Ortega, Bargas and Galarraga. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Motor tics evoked by striatal disinhibition in the rat

## *Maya Bronfeld, Dorin Yael, Katya Belelovsky and Izhar Bar-Gad\**

*The Leslie and Susan Gonda (Goldschmied) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel*

#### *Edited by:*

*Hagai Bergman, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*M. Gustavo Murer, Universidad de Buenos Aires, Argentina Jonathan Mink, University of Rochester Medical Center, USA*

#### *\*Correspondence:*

*Izhar Bar-Gad, The Leslie and Susan Gonda (Goldschmied) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan 52900, Israel e-mail: izhar.bar-gad@biu.ac.il*

Motor tics are sudden, brief, repetitive movements that constitute the main symptom of Tourette syndrome (TS). Multiple lines of evidence suggest the involvement of the cortico-basal ganglia system, and in particular the basal ganglia input structure, the striatum, in tic formation. The striatum receives somatotopically organized cortical projections and contains an internal GABAergic network of interneurons and projection neurons' collaterals. Disruption of local striatal GABAergic connectivity has been associated with TS and was found to induce abnormal movements in model animals. We have previously described the behavioral and neurophysiological characteristics of motor tics induced in monkeys by local striatal microinjections of the GABAA antagonist bicuculline. In the current study we explored the abnormal movements induced by a similar manipulation in freely moving rats. We targeted microinjections to different parts of the dorsal striatum, and examined the effects of this manipulation on the induced tic properties, such as latency, duration, and somatic localization. Tics induced by striatal disinhibition in monkeys and rats shared multiple properties: tics began within several minutes after microinjection, were expressed solely on the contralateral side, and waxed and waned around a mean inter-tic interval of 1–4 s. A clear somatotopic organization was observed only in rats, where injections to the anterior or posterior striatum led to tics in the forelimb or hindlimb areas, respectively. These results suggest that striatal disinhibition in the rat may be used to model motor tics such as those observed in TS. Establishing this reliable and accessible animal model could facilitate the study of the neural mechanisms underlying motor tics, and the testing of potential therapies for tic disorders.

#### **Keywords: basal ganglia, Tourette syndrome, striatum, bicuculline, GABA**

## **INTRODUCTION**

Tourette syndrome (TS) is a childhood-onset neurological disorder characterized by the persistent expression of motor and vocal tics (American Psychiatric Association, 2000). Tics are involuntary, repetitive stereotyped movements or vocalizations. Their severity and complexity vary in the range between "simple tics," which involve only one or a few closely related muscle groups whose activation leads to isolated brief jerk-like movements, and "complex tics," which involve sequential or coordinated activation of several muscle groups. Although the neuronal mechanisms underlying motor tics remain unknown, multiple studies suggest the involvement of the cortico-basal ganglia (CBG) loop, and specifically the striatum, in the pathophysiological processes leading to tics (Peterson et al., 1998, 2003; Albin et al., 2003; Bloch et al., 2005; Kalanithi et al., 2005; Wang et al., 2011; Worbe et al., 2012).

The basal ganglia (BG) are a group of interconnected subcortical nuclei, which form a feedback loop with the cortex. The BG are involved in motor, associative and limbic processes (Alexander et al., 1986), and their dysfunction has been implicated in multiple motor and psychiatric disorders, including Parkinson's disease (Deuschl et al., 2000), Huntington's disease (Reiner et al., 1988), obsessive compulsive disorder (OCD) (Modell et al., 1989) and TS (Singer and Minzer, 2003; Abelson et al., 2005; Kalanithi et al., 2005). The striatum is the main input structure of the BG which receives glutamatergic inputs from the cerebral cortex and the thalamus and dopaminergic inputs from the midbrain (Kemp and Powell, 1971; Wilson et al., 1983; Haber et al., 2000). The striatum projects internally to other BG nuclei which ultimately influence cortical activity via their innervation of the thalamus (Albin et al., 1989; DeLong, 1990). The striatum can be sub-divided into motor, associative, and limbic domains, defined by the origins of the cortical inputs to these territories (Alexander et al., 1986). In the rat, the striatum can be roughly divided into the sensorimotor dorso-lateral part (Cospito and Kultas-Ilinsky, 1981; Ebrahimi et al., 1992), the associative dorsomedial part (Reep et al., 2003), and the limbic ventral territory (Berendse et al., 1992). The dorso-lateral striatum receives somatotopically organized sensorimotor inputs from the cortex, but studies differ as to the exact location, organization and segregation of the somatic territories within the striatum (Webster, 1961; Ebrahimi et al., 1992; Brown and Sharp, 1995; Brown et al., 1998).

The striatal cell population is composed of a single population of projection neurons and multiple types of interneurons. The vast majority of neurons (estimated to be 95% in the rat) are the GABAergic projection neurons, also known as medium spiny neurons (MSNs) (Kemp and Powell, 1971). The striatum also includes multiple types of GABAergic interneurons, such as the fast spiking interneurons (FSIs), low threshold spiking neurons

**Abbreviations:** BG, basal ganglia; CBG, cortico-basal ganglia; ITI, inter-tic-interval; MSN, medium spiny neuron; TS, Tourette syndrome.

(LTS), and others (Koos and Tepper, 1999; Tepper and Bolam, 2004). Thus, GABA is a key neurotransmitter of the intrastriatal network as it mediates inputs from interneurons, MSN collaterals (Wilson and Groves, 1980; Bolam et al., 2000) and external input from other parts of the BG (namely, the globus pallidus externus—GPe) (Mallet et al., 2012).

Dysfunction of GABAergic transmission, specifically within the striatum, has been implicated in TS. Imaging studies have identified a reduction in the volume of the striatum in TS patients (Peterson et al., 2003; Bloch et al., 2005) and post mortem studies indicate that this loss can primarily be attributed to a specific reduction in the number of striatal GABAergic interneurons (Kalanithi et al., 2005; Kataoka et al., 2010). Functional imaging studies in TS patients have also found a widespread GABAergic dysfunction in multiple BG nuclei, including the striatum (Lerner et al., 2012).

Disruption of local GABAA transmission within the striatum has been shown to induce abnormal movements or behavioral states in model animals (Worbe et al., 2009). Early studies found that local administration of GABAA antagonists, such as picrotoxin or bicuculline into the striatum of rats (Marsden et al., 1975; Tarsy et al., 1978) and monkeys (Crossman, 1987) induces repetitive contralateral jerk-like movements which were initially termed dyskinesia or myoclonus. More recent studies in monkeys suggested that these movements can be used to model the motor tics observed in TS (McCairn et al., 2009; Worbe et al., 2012; Bronfeld et al., 2013). In the current study we sought to characterize the abnormal movements induced by striatal disinhibition in the rat, using similar tools and analyses to the research performed on primates. We used local microinjections of bicuculline into the anterior and posterior parts of the rat striatum and examined the induced abnormal movements and behaviors, their progression in time, and their somatic distribution.

## **MATERIALS AND METHODS**

#### **ANIMALS**

Twenty-eight adult male Sprague Dawley rats weighing 316–460 g (375 ± 40 g, mean ± STD) were used in this study. Rats were maintained under conditions of controlled temperature and humidity, in a 12:12 h light/dark cycle, with free access to food and water. Data from three male *Macaca fascicularis* monkeys were also used in this study. All primate procedures were previously described in detail (McCairn et al., 2009; Bronfeld et al., 2011). All procedures were approved and supervised by the Institutional Animal Care and Use Committee (IACUC) and were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and the Bar-Ilan University Guidelines for the Use and Care of Laboratory Animals in Research. This study was approved by the National Committee for Experiments in Laboratory Animals at the Ministry of Health (permit number 6–01–11).

#### **SURGICAL PROCEDURE**

Rats were sedated with 5% isoflurane and then injected intraperitoneally with ketamine HCl (100 mg/kg) and xylazine HCl (10 mg/kg). Anesthesia was maintained using a mixture of isoflurane 0.5–1% and oxygen, and supplementary injections of ketamine were administered as required. The rat's head was fixed in a stereotaxic frame (Stoelting Co., Wood Dale, IL, USA). After sterilization of the skin and local infiltration with lidocaine (10 mg/ml) to reduce pain, an incision was made in order to expose the skull surface. Connective tissue was removed and the skull surface cleaned. Holes were drilled bilaterally in the skull targeting the anterior (AP: +1.5, ML: ±2.5, DV: 3) or posterior (AP: −0.4 to −0.5, ML: ±3.5, DV: 3) striatum (**Figure 1**) (Paxinos and Watson, 2007). Guide cannulae (stainless steel 25 G tubes) were inserted to a location 2 mm above the injection target and sealed with a cannula-dummy (stainless steel wire 28 G). Most animals were implanted in both hemispheres with only one cannula in each hemisphere, but in a subset of animals (4 rats) two cannulae were implanted in each hemisphere. The implantation was secured to the skull with screws and dental acrylic cement (Coltene/Whaledent Inc., Cuyahoga Falls, OH, USA). Norocarp (Carprofen, 4 mg/kg) was injected subcutaneously post-surgery to relieve pain. Experimental sessions began after the animals recovered from surgery (typically 7–10 days).

#### **MICROINJECTIONS**

We used microinjections of bicuculline, a competitive GABAA antagonist that can also inhibit SK channels (Stocker et al., 1999). Bicuculline methiodide (Sigma-Aldrich, Rehovot, Israel) was dissolved in physiological saline or artificial cerebrospinal fluid [aCSF, containing (in mM): 145 NaCl, 15 HEPES, 2.5 KCl, 2 MgCl2, 1.2 CaCl2, pH 7.4 with NaOH] to a final concentration of 1 μg/μl. A volume of 0.5–1 μl was pressure injected at a rate of 0.35 or 0.5 μl/min (NE-1000, New Era Pump Systems, Farmingdale, NY, USA) using a 10 μl syringe (Hamilton, Reno, NV, USA). The injection was applied through an injection cannula (stainless steel 30 G tube) which was attached to the syringe via a flexible tube (Tygon microbore tube, Component Supply Company, Fort Meade, FL, USA). The injection cannula was manually inserted into the guide cannula, which was fixed in the acrylic, with its tip located 2 mm past the tip of the guide cannula in the striatum.
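The injected dose and the injection duration follow directly from the stated concentration, volumes, and pump rates. The short Python sketch below works through that arithmetic; the function name `injection_summary` and the constant name are hypothetical and are not taken from the study's own software.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
CONCENTRATION_UG_PER_UL = 1.0  # bicuculline concentration (1 ug/ul)


def injection_summary(volume_ul, rate_ul_per_min):
    """Return (dose in micrograms, injection duration in minutes)."""
    dose_ug = CONCENTRATION_UG_PER_UL * volume_ul
    duration_min = volume_ul / rate_ul_per_min
    return dose_ug, duration_min


# Enumerate the volume and rate combinations mentioned in the Methods.
for volume_ul in (0.5, 1.0):
    for rate in (0.35, 0.5):
        dose, minutes = injection_summary(volume_ul, rate)
        print(f"{volume_ul} ul at {rate} ul/min: {dose:.1f} ug over {minutes:.1f} min")
```

Under these values the dose is 0.5–1 μg per injection, and the duration ranges from 1 min (0.5 μl at 0.5 μl/min) to just under 3 min (1 μl at 0.35 μl/min).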

#### **EXPERIMENTAL SESSIONS**

Experimental sessions began with a period of normal behavior (typically 10 min) within the behavioral observation cage,

**FIGURE 1 | Histology.** Nissl staining of coronal sections of the right hemisphere from an injected rat. A cannula lesion can be seen in the **(A)** anterior and **(B)** posterior striatum, black arrows point to the corresponding injection sites.

to establish a baseline behavior and allow the animals to acclimatize to the cage. Following this period the injection cannula was inserted and bicuculline was injected into the striatum over 1–3 min. The injection cannula was removed 1–2 min after the end of the injection, and the cannula-dummy was re-inserted. The animals' behavior was monitored continuously throughout the session and some sessions were filmed for detailed offline analysis using high speed video (two cameras at 50 frames/s each, uEye, IDS Imaging Development Systems, Obersulm, Germany) and acquired using StreamPix 4 (Norpix, Montreal, Canada). Sessions ended and the animals returned to their home cage at least 40 min after the microinjection (if no behavioral abnormalities were observed), or at least 10 min after bicuculline-induced behavioral abnormalities disappeared. Control injections of saline or aCSF were performed in some animals, using the same experimental design.

#### **HISTOLOGY**

The position of the microinjections was verified at the end of the experiments using histology. Animals were anesthetized and transcardially perfused with physiological saline followed by 10% paraformaldehyde. The brains were removed and immersed in a fixation solution of 30% sucrose and 10% paraformaldehyde for a minimum of 48 h. Brains were then frozen and sliced (50 μm coronal sections) using a cryostat microtome. Sections were mounted on glass slides, stained with Cresyl violet, and subsequently examined under a microscope to verify the location of the cannulae.

#### **DATA ANALYSIS**

Frame-by-frame analysis of sessions that were video recorded was used for all detailed temporal analyses of the tics. Tics were marked by the timing of the first frame in which the tic-related deflection of the relevant body part was observed. Each session was viewed from two angles (top and side) to ensure optimal identification of tic movements. Calculations of the average and the coefficient of variation (CV) of the inter-tic-intervals (ITIs) for both rats and monkeys were based on a representative video segment of 1–3 min from each session. All data analyses were performed using custom-written MATLAB code (MATLAB 2010A, MathWorks, Natick, MA, USA). All measures in the Results section are described as mean±STD unless stated otherwise.
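The ITI measures described above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the function name `iti_statistics`, the use of the sample standard deviation for the CV, and the example frame indices are all assumptions made for illustration. Only the 50 frames/s rate is taken from the text.

```python
import statistics

FRAME_RATE_HZ = 50.0  # camera frame rate stated in the Methods


def iti_statistics(tic_frames, frame_rate_hz=FRAME_RATE_HZ):
    """Return (mean ITI in s, CV of the ITIs, tics per minute) from tic-onset frames."""
    times = [frame / frame_rate_hz for frame in sorted(tic_frames)]
    itis = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_iti = statistics.mean(itis)
    # Assumption: sample standard deviation; the population form is an alternative.
    cv = statistics.stdev(itis) / mean_iti
    return mean_iti, cv, 60.0 / mean_iti


# Made-up onset frames for illustration only (not data from the study).
example_frames = [0, 100, 210, 300, 400]
mean_iti, cv, tics_per_min = iti_statistics(example_frames)
print(f"mean ITI = {mean_iti:.2f} s, CV = {cv:.2f}, rate = {tics_per_min:.1f} tics/min")
```

The last returned value converts a mean ITI into a tic rate, so a mean ITI of 2 s corresponds to 30 tics per minute.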

#### **RESULTS**

Eighty-nine microinjections were performed, 53 in the anterior (28 cannulae) and 36 in the posterior (23 cannulae) striatum (**Figure 1**). One to five injections were performed in each cannula, and repeated injections typically induced similar behavioral effects. Motor tics were induced in 36/53 (68%) anterior and 21/36 (58%) posterior injections. Tics always appeared on the side contralateral to the injection and none were observed on the ipsilateral side. Tics were characterized as repetitive brief muscle contractions in an isolated muscle group of the hindlimb, forelimb, or face (see supplementary video). Tic-related movements ranged from small twitches of one finger to strong deflections of an entire limb or even the torso. During the course of a single session, small tics were observed initially; these rapidly increased in amplitude and became more pronounced, then usually maintained a stereotypic shape and size throughout the session until they gradually decreased and ceased completely.

Tics appeared within 6 ± 3 min from the onset of the microinjection. However, the tic latency was significantly shorter following injections to the anterior than to the posterior striatum (latency anterior 4.8 ± 2.3 min, latency posterior 8.4 ± 2.8 min, *p* < 0.001, two-tailed *t*-test) (**Figure 2A**). Tics lasted for an average of 45 ± 18 min, with no significant difference between anterior and posterior locations (duration anterior 41 ± 13 min, duration posterior 50 ± 22 min, *p* > 0.1, two-tailed *t*-test) (**Figure 2B**). During the course of a session tic frequency varied around a mean base frequency which was typical of that session (**Figure 3A**). Across animals, the mean ITI ranged between 1.3 and 4.1 s, with an average of 2.02 ± 0.97 s (*N* = 14 sessions). The ITIs observed in rats were not significantly different from those observed in monkeys following striatal disinhibition, for which the mean ITI ranged between 0.5 and 6 s, with an average of 2.6 ± 1.44 s (**Figure 3C**, *N* = 14 sessions, Mann–Whitney *U*-test, *p* > 0.1). In both monkeys and rats the length of ITIs was regularly distributed (**Figure 3B**), as was evident from their low *CV* values (**Figure 3D**, rats *CV* = 0.33 ± 0.14, monkey *CV* = 0.32 ± 0.1, Mann–Whitney *U*-test, *p* > 0.1).
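As a rough plausibility check of the latency comparison, a *t* statistic can be computed from the reported means and standard deviations alone. The sketch below rests on two assumptions that the text does not state: that the group sizes equal the numbers of tic-inducing injections (36 anterior, 21 posterior), and that a pooled-variance (Student) test was used. The function name is hypothetical.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from summary statistics."""
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, dof


# Assumed group sizes: all tic-inducing injections contributed a latency value.
t_stat, dof = pooled_t_from_summary(8.4, 2.8, 21, 4.8, 2.3, 36)
print(f"t({dof}) = {t_stat:.2f}")
```

Under these assumptions the statistic is a little above 5 with 55 degrees of freedom, which is compatible with the reported *p* < 0.001.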

Following microinjection, tics were usually observed in a single body part and were limited to this location throughout the session [22/36 (61%) anterior and 11/21 (52%) posterior tic-inducing injections]. However, sometimes [9/36 (25%) anterior injections, 8/21 (38%) posterior injections] tics appeared in an additional location later during the session; e.g., tics observed initially only in the hindlimb progressed later to the forelimb, in which case tics were expressed simultaneously in both locations. In a minority

**FIGURE 4 | Tic somatic localization.** Behavioral effects of bicuculline microinjections into the anterior (left) or posterior (right) striatum, classified according to the extent of localization of motor tics.


of cases [5/36 (14%) anterior injections, 2/21 (10%) posterior injections] the tics appeared in two locations at once (**Figure 4**).

The location of the tics followed a somatotopic organization relative to the injection site such that forelimb tics were more commonly induced by injections to the anterior striatum, whereas hindlimb tics were mostly induced by injections to the posterior striatum. This effect was most apparent when examining the first location in which tics were expressed following microinjection (chi-square test with Yates' correction, χ<sup>2</sup> (4) = 44.23, *p* < 0.001). In 29/36 (81%) anterior microinjections the tics were initially expressed in the contralateral forelimb. All other microinjections in the anterior location initially induced tics either in the head/face muscles alone (2/36, 6%) or in the head/face muscles simultaneously with the forelimb (5/36, 14%). Hindlimb tics were never expressed as the first effect following anterior striatal microinjections (0/36, 0%). In contrast, posterior microinjections induced hindlimb tics with the shortest latency in 18/21 cases (86%). The first effects of the remainder of the posterior microinjections were tics expressed either in head/face muscles alone, in the face and forelimb areas together, or in the hindlimb and forelimb simultaneously [1/21 (5%) injections each case] (**Figure 5A**). This somatotopic organization was maintained even when examining all locations in which tics were expressed throughout the session [chi-square test with Yates' correction, χ<sup>2</sup> (6) = 32.59, *p* < 0.001]. Forelimb tics were almost always (35/36, 97%) observed following injections into the anterior striatum. Forelimb tics induced by anterior injections were most commonly expressed alone [21/36 (58%)] or in conjunction with head/face tics [11/36 (31%) injections], and only seldom together with hindlimb tics (3/36, 8%). Hindlimb tics were the most common effect of injections into the posterior striatum (19/21, 90%). 
Hindlimb tics were expressed alone [10/21 (48%) injections] or in combination with forelimb tics (5/21, 24%), facial tics (1/21, 5%), or both (3/21, 14%). Only 2/21 (10%) posterior injections did not induce hindlimb tics, but rather tics


in head/face muscles alone (1/21, 5%) or together with forelimb tics (1/21, 5%) (**Figure 5B**).
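The first-location comparison can be recomputed from the counts given in the text. The Python sketch below is an illustration rather than the authors' MATLAB analysis: it assumes the counts form a 2 × 5 table (injection site by first tic location) and that the continuity correction is subtracted from every cell's absolute deviation. The function name `chi_square` is hypothetical.

```python
def chi_square(table, yates=False):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed - expected)
            if yates:
                diff -= 0.5  # continuity correction, applied to every cell (assumption)
            stat += diff ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Columns: forelimb, hindlimb, head/face, head/face + forelimb, hindlimb + forelimb
first_tic_location = [
    [29, 0, 2, 5, 0],  # anterior tic-inducing injections (n = 36)
    [0, 18, 1, 1, 1],  # posterior tic-inducing injections (n = 21)
]
stat, dof = chi_square(first_tic_location, yates=True)
print(f"chi2({dof}) = {stat:.2f}")
```

With these assumptions the statistic has 4 degrees of freedom and comes out very close to the reported value of 44.23. Note that this plain form of the correction slightly inflates cells whose deviation is already below 0.5.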

To control for the possible effects of cortical damage on tic expression we implanted a subset of animals with two cannulae in each hemisphere, over both the anterior and the posterior striatum. Overall, the presence of two cannulae in each hemisphere did not affect the somatotopic distribution of the tics. In all animals with two cannulae per hemisphere, the first location in which tics were expressed was the contralateral forelimb following anterior injections (*N* = 11 injections) and the contralateral hindlimb following posterior injections (*N* = 5 injections). The somatotopic distribution of all tic locations per session (**Table 1**) also did not significantly differ between animals with one or two cannulae per hemisphere, for either the anterior (chi-square test with Yates' correction, χ<sup>2</sup> (3) = 5.6, *p* > 0.1) or the posterior

**FIGURE 5 | Somatotopic organization of tics.** Percentages of bicuculline microinjections into the anterior (blue) or posterior (red) striatum inducing tics in different body parts. **(A)** Body part in which the first tics following microinjection were observed. **(B)** All body parts in which tics were observed at any stage during the session.

(chi-square test with Yates' correction, χ<sup>2</sup> (5) = 1.33, *p* > 0.1) injections.

Additional hyper-behavioral abnormalities were observed after some of the striatal bicuculline microinjections. These included hyperactivity, expressed as increased locomotion with frequent switches between different behaviors from the rat's normal behavioral repertoire (e.g., walking, rearing, sniffing, eating, etc.), circling around the cage perimeter due to an increased tendency to turn in one direction (contraversive to the injected hemisphere), and pivoting (on-the-spot rotations toward the side contralateral to the injection). Each of these behaviors could be observed alone or in combination with another, and their expression, latency and duration were independent of the expression of tics. Hyper-behavioral abnormalities were more commonly observed following injections to the posterior [19/36 (53%) injections] than the anterior [13/53 (25%) injections] striatum [chi-square independence test, χ<sup>2</sup> (1) = 7.43, *p* < 0.01]. Such behaviors could be observed both following tic-inducing microinjections [10/36 (28%) anterior injections, 10/21 (48%) posterior injections] and microinjections that did not induce tics [3/17 (18%) anterior injections, 9/15 (60%) posterior injections]. Another phenomenon which was sometimes observed following bicuculline microinjection was an abnormal posture of the limb. This abnormality was observed only in the hindlimbs following injections to the posterior striatum, usually in relation to hindlimb tics: at the end of a tic, the leg would not return to its normal position but rather maintained an abnormal posture. An abnormal posture was observed in 11/21 (52%) posterior injections, in which it waxed and waned during the course of a session, such that it was evident only during some but not all of the tics.
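The comparison of hyper-behavioral abnormalities between injection sites reduces to a 2 × 2 table, for which the uncorrected Pearson statistic has a closed form. The sketch below uses that shortcut with the counts given above; it assumes no continuity correction was applied, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: anterior, posterior; columns: abnormality observed, not observed.
anterior_yes, anterior_total = 13, 53
posterior_yes, posterior_total = 19, 36
stat = chi_square_2x2(
    anterior_yes, anterior_total - anterior_yes,
    posterior_yes, posterior_total - posterior_yes,
)
print(f"chi2(1) = {stat:.2f}")
```

The result agrees with the reported χ<sup>2</sup> (1) = 7.43 to two decimal places.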

In many sessions [23/36 (64%) anterior and 16/21 (76%) posterior tic-inducing injections] tetanic episodes were sporadically interspersed within the tic train. These episodes included several seconds in which the animal expressed multiple high-frequency successive tics, and would stop all other activities. The prevalence of these episodes varied widely across animals and sessions and ranged from once or a few times up to multiple times throughout the session.

Control injections of saline or aCSF were performed only in cannulae in which bicuculline previously induced the expression of tics: 7 anterior cases and 4 posterior cases. No abnormal behaviors or movements of any kind were observed following any of the control microinjections.



## **DISCUSSION**

In the current study we used localized striatal microinjections of the GABAA antagonist bicuculline to induce motor tics in freely behaving rats and characterized the nature and course of the induced abnormal movements and behaviors, most notably motor tics. Tics started within a few minutes following microinjection and their expression was mostly confined to a single body part. Tic expression followed a somatotopic organization, in which injections in the anterior striatum mainly induced tics in the contralateral forelimb, whereas injections to the posterior striatum induced hindlimb tics. The tics lasted for slightly under 1 h on average, and their frequency varied within and between different sessions and animals in the range of 15–45 tics per minute. In some cases, striatal bicuculline microinjections elicited other hyper-behavioral abnormalities, including general hyperactivity and contraversive circling or pivoting, as well as tetanic episodes which were interspersed within the tic trains.

Overall, motor tics induced by striatal disinhibition in rats demonstrated the main properties of tics induced in monkeys using a similar manipulation (Crossman et al., 1988; McCairn et al., 2009; Bronfeld et al., 2011): (1) In both species tics started within a few minutes following microinjection and were expressed as brief jerk-like movements manifested as retraction or contraction of the affected body part or muscle group on the contralateral side to the injection (Tarsy et al., 1978; Patel and Slater, 1987; Crossman et al., 1988; McCairn et al., 2009); (2) Tics were mostly focal in nature, but were sometimes expressed in more than one location (Crossman et al., 1988; McCairn et al., 2009; Bronfeld et al., 2011). Even in such cases, the tics remained isolated and did not resemble seizure-like whole body phenomena; (3) During tic expression the animals seemed to maintain normal motor and cognitive functions, as was evidenced by sustained exploration, rearing, grooming and eating in the rats, and by continued performance of a motor task by the monkeys (McCairn et al., 2009; Worbe et al., 2009); (4) Normal behavior was only paused during the brief tetanic episodes, which were observed in both monkeys and rats (McKenzie and Viik, 1975; Tarsy et al., 1978; McCairn et al., 2009); (5) The frequency of tics displayed variability during the course of a session as well as across different sessions and animals. However, the overall frequency range and the general temporal structure of tics throughout a session were remarkably similar between the species (McKenzie and Viik, 1975; Tarsy et al., 1978; McCairn et al., 2009). (6) The additional hyper-behavioral symptoms (hyperactivity and contraversive circling) which sometimes followed microinjections were common to both species. 
Such behaviors have been observed in both monkeys (Grabli et al., 2004; Worbe et al., 2009; Bronfeld et al., 2010) and rats (Wisniecki et al., 2003; Ikeda et al., 2010) following microinjections of GABAA antagonists into the striatum or into an adjacent BG nucleus—the GPe. In the current study, hyperbehavioral symptoms were more common following microinjections into the posterior striatum, a location which is closer to the striatum-GPe border, compared with the anterior striatum location. This suggests that these symptoms may be mediated, at least in part, by diffusion of the bicuculline to the GPe, rather than by its direct effect in the striatum. Hindlimb dystonia was also sometimes observed following striatal bicuculline microinjections in the rats but not in monkeys. However, this phenomenon is not unique to the rats, as it was previously described in cats following striatal disinhibition (Yoshida et al., 1991; Yamada et al., 1995).

The main difference between motor tics induced by striatal disinhibition in monkeys and rats pertains to the somatotopic organization observed in the rat which was not detected in the monkey (Crossman et al., 1988; McCairn et al., 2009; Worbe et al., 2009, 2013; Bronfeld et al., 2011). Somatotopic organization of tic-inducing striatal disinhibition in the rat has been observed in previous studies. It is evident both from a combination of different studies, each describing injections in a different striatal location that induced tics in different body parts (McKenzie et al., 1972; Patel and Slater, 1987; Muramatsu et al., 1990; Nakamura et al., 1990), and from a study that explicitly compared the effects of injections in different striatal loci (Tarsy et al., 1978). Overall, the somatotopic organization followed an anterior-posterior axis, in which tics were observed in orofacial muscles following injections into the anterior striatum (Nakamura et al., 1990), in the forelimbs following injections into the central striatum (Tarsy et al., 1978; Patel and Slater, 1987) and in the hindlimbs following injections into the posterior striatum (Tarsy et al., 1978). Mixed effects were also observed, mostly following the same somatotopic organization (head and forelimb tics, forelimb and hindlimb tics). We also observed some degree of co-expression of tics in multiple body parts, which may be attributed either to small variability in inter-animal injection sites or to diffusion of the injected substance across multiple somatotopic territories (Yoshida et al., 1991).

A possible explanation for the different effects of differentially located injections is the somatotopic organization of the striatum. Although there is a general consensus that the striatal motor territory is somatotopically organized, as evidenced by anatomical connectivity (Brown, 1992; Ebrahimi et al., 1992; Brown et al., 1998), neuronal activity (Carelli and West, 1991), metabolic mapping (Brown and Sharp, 1995) and lesion studies (Pisa and Schranz, 1988), findings tend to differ regarding the topographical arrangement of this organization. Early studies reported that cortical fibers projecting to the striatum are topographically well-organized along an anterior-posterior axis, creating a somatotopic map in which the head, forelimb, and hindlimb are represented in successively posterior striatal locations (Webster, 1961). More recent studies suggest a more complex organization of cortico-striatal connectivity, in which cortical projections cluster in partially overlapping groups and create a complex three-dimensional map along the anterior-posterior, medial-lateral and dorso-ventral axes (Carelli and West, 1991; Brown, 1992; Ebrahimi et al., 1992; Brown et al., 1998). The effects of local striatal disinhibition in our study and in previous ones (McKenzie et al., 1972; Tarsy et al., 1978; Patel and Slater, 1987; Muramatsu et al., 1990; Nakamura et al., 1990) support the former organization but not the latter.

An alternative explanation for the observed somatotopy is that the effects of local striatal disinhibition might be mediated by and in fact dependent on the involvement of the cerebral cortex. Cortical involvement was suggested by Tarsy and colleagues, who found that when the injection cannula was inserted into the rat striatum at an angled trajectory that spared the sensorimotor cortex overlaying the striatum, tics did not appear following the injection of GABA antagonists (Tarsy et al., 1978). Remarkably, tics could be elicited by a combination of striatal disinhibition via an angled cannula combined with cortical damage (done by merely inserting a second cannula into the cortex). These findings were later challenged by Patel and colleagues who found that cortical damage had no effect on the manifestation of tics since they were equally induced by straight (passing through the cortex) and angled (leaving the cortex intact) striatal microinjections (Patel and Slater, 1987). In the current study, we had a subset of animals which had simultaneous cortical damage in the areas overlaying both the anterior and posterior striatal injection sites. The somatic distribution of tics in these animals was similar to the distribution observed in animals implanted with only one cannula in each hemisphere, which suggests that the cortical damage has little or no effect over the determination of tic locations.

In the primate, the somatotopic organization of the striatum has been well-established by converging results of anatomical studies (Kunzle, 1975; Flaherty and Graybiel, 1991; Takada et al., 1998), recordings of neuronal activity (Nambu et al., 2002) and electrical microstimulation (Alexander and DeLong, 1985). Overall it follows a dorsal-ventral organization, in which the hindlimbs are represented at the most dorsal aspect of the striatum, followed by the forelimb and the head/face regions which are represented at gradually more ventral parts. However, the effects of striatal disinhibition fail to display any somatotopic organization. Instead, tics are expressed predominantly in orofacial muscles, and to a lesser extent in forelimb muscles (Crossman et al., 1988; McCairn et al., 2009; Worbe et al., 2009, 2013; Bronfeld et al., 2011) and rarely in hindlimb muscles (Crossman et al., 1988), regardless of the striatal injection site. The basis of this inter-species difference and the lack of tic-related somatotopy in monkeys are unclear. A possible explanation might be related to an over-representation of orofacial regions in the primate striatum (Miyachi et al., 2006). This is in line with tic expression in human TS patients, in which head and face tics are typically the first and most common symptom of the disorder (Kurlan, 2004).

Previous studies describing the behavioral outcomes of striatal disinhibition referred to the induced abnormal movements as choreiform (McKenzie and Viik, 1975), dyskinesia (Muramatsu et al., 1990) or more commonly, myoclonus (Pycock et al., 1976; Tarsy et al., 1978; Patel and Slater, 1987; Crossman et al., 1988). However, recent studies in monkeys suggest that this model may be applicable to the study of motor tics (McCairn et al., 2009; Bronfeld et al., 2011, 2013; Bronfeld and Bar-Gad, 2013; Worbe et al., 2013). The results of the current study suggest that striatal disinhibition in rats may also be used as a model for tics. The isolated, brief, and sudden nature of the abnormal movements in rats is qualitatively more similar to tics rather than chorea, which tends to consist of smooth continuous movements (Mark, 2004). Several other characteristics of these movements, including their spontaneous nature, their persistence both during rest and active movements, their irregular timing and their localized expression are more consistent with the clinical expression of tics rather than myoclonus (Obeso et al., 1985; Shibasaki and Hallett, 2005; Vercueil, 2006). Finally, while tics have been associated with dysfunction of the CBG system (Bronfeld and Bar-Gad, 2013), myoclonus is attributed to other neuronal structures (Caviness and Brown, 2004; Shibasaki and Hallett, 2005; Cassim and Houdayer, 2006). Another feature of the striatal disinhibition model is the occurrence of other hyperbehavioral disorders in both monkeys and rats. These behavioral abnormalities are reminiscent of attention deficits, hyperactivity and repetitive/compulsive behaviors (Grabli et al., 2004; Worbe et al., 2009) which are often observed in TS patients (Freeman et al., 2000) but not in myoclonus patients. 
This comorbidity, present in both humans and the animal model, strengthens the validity of the model and suggests that a common neurophysiological mechanism may underlie these different disorders (Bronfeld and Bar-Gad, 2013; Bronfeld et al., 2013).

Multiple neuronal sub-populations and pathways within the striatum might mediate the tic-inducing effects of bicuculline. These include the GABAA-mediated collateral connections between MSNs (Tunstall et al., 2002), connections between striatal interneurons and MSNs (Koos and Tepper, 1999), interactions between different types of interneurons leading to indirect modulation of the MSNs (English et al., 2012) and feedback afferents from the GPe (Bevan et al., 1998; Mallet et al., 2012). A recent study demonstrated that a selective inhibition of one type of striatal interneurons (FSI) could induce complex abnormal movements in mice, including dystonic postures and jerking movements (Gittis et al., 2011). These findings are in line with anatomical observations of selective loss of FSIs in human TS patients (Kalanithi et al., 2005). However, FSI dysfunction does not account for the complete behavioral manifestation of motor tics, and further studies are required to identify the neural substrates and interactions involved in the generation of bicuculline-induced motor tics.

The resemblance between striatal disinhibition in monkeys and rats, and indeed between these animal models and the clinical symptoms of human TS patients, suggests that rats may be a valid animal species for the continued exploration of tics through this model. A rat-based model for TS holds great promise for advancing research in this field. The phenomenological similarities of tics induced in animals and those observed in human patients offer a unique opportunity to directly study the physiological and neurological mechanisms underlying and associated with motor tics. Furthermore, this model may be used as a platform for the investigation of novel therapeutic targets of TS, including screening of pharmacological agents and testing of novel medical devices and behavioral manipulations.

### **ACKNOWLEDGMENTS**

We thank D. Cohen and M. Dror for their help with the animals and T. Tsur, E. Vinner and D. H. Zeef for technical assistance. This work was supported by an Israel Science Foundation (ISF) grant (372/09), a National Institute for Psychobiology grant and a Tourette Syndrome Association (TSA) grant.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Systems\_Neuroscience/10.3389/fnsys.2013.00050/abstract

## **REFERENCES**


Synaptic organisation of the basal ganglia. *J. Anat.* 196(Pt 4), 527–542. doi: 10.1046/j.1469-7580.2000.19640527.x


recent advances. *Lancet Neurol.* 3, 598–607. doi: 10.1016/S1474-4422(04)00880-4


436–447. doi: 10.1017/S0012162200000839


fascicularis. *Brain Res.* 88, 195–209. doi: 10.1016/0006-8993(75)90384-4


in basal ganglia/limbic striatal and thalamocortical circuits as a pathogenetic mechanism of obsessive-compulsive disorder. *J. Neuropsychiatry Clin. Neurosci.* 1, 27–36.


globus pallidus. *Brain Res.* 116, 353–359. doi: 10.1016/0006-8993(76)90916-1


investigated by local injection of GABA antagonists. *Neurosci. Res.* 10, 34–51. doi: 10.1016/0168-0102(91)90018-T

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 July 2013; accepted: 24 August 2013; published online: 18 September 2013.*

*Citation: Bronfeld M, Yael D, Belelovsky K and Bar-Gad I (2013) Motor tics evoked by striatal disinhibition in the rat. Front. Syst. Neurosci. 7:50. doi: 10.3389/fnsys.2013.00050*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Bronfeld, Yael, Belelovsky and Bar-Gad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Different correlation patterns of cholinergic and GABAergic interneurons with striatal projection neurons

## *Avital Adler 1,2\*, Shiran Katabi 1, Inna Finkes 1, Yifat Prut 1,2,3 and Hagai Bergman 1,2,3*

*<sup>1</sup> Department of Medical Neurobiology, Institute of Medical Research Israel-Canada, The Hebrew University-Hadassah Medical School, Jerusalem, Israel*

*<sup>2</sup> The Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem, Israel*

*<sup>3</sup> Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Gilad Silberberg, Karolinska Institute, Sweden*

*Andrew Sharott, University of Oxford, UK*

#### *\*Correspondence:*

*Avital Adler, Department of Medical Neurobiology, The Hebrew University-Hadassah Medical School, Kiryat Hadassah, PO Box 12272, Jerusalem 91120, Israel e-mail: avital.adler@gmail.com*

The striatum is populated by a single projection neuron group, the medium spiny neurons (MSNs), and several groups of interneurons. Two of the electrophysiologically well-characterized striatal interneuron groups are the tonically active neurons (TANs), which are presumably cholinergic interneurons, and the fast spiking interneurons (FSIs), presumably parvalbumin (PV) expressing GABAergic interneurons. To better understand striatal processing it is thus crucial to define the functional relationship between MSNs and these interneurons in the awake and behaving animal. We used multiple electrodes and standard physiological methods to simultaneously record MSN spiking activity and the activity of TANs or FSIs from monkeys engaged in a classical conditioning paradigm. All three cell populations were highly responsive to the behavioral task. However, they displayed different average response profiles and a different degree of response synchronization (signal correlation). TANs displayed the most transient and synchronized response, MSNs the most diverse and sustained response and FSIs were in between on both parameters. We did not find evidence for direct monosynaptic connectivity between the MSNs and either the TANs or the FSIs. However, while the cross correlation histograms of TAN to MSN pairs were flat, those of FSI to MSN displayed positive asymmetrical broad peaks. The FSI-MSN correlogram profile implies that the spikes of MSNs follow those of FSIs and both are driven by a common, most likely cortical, input. Thus, the two populations of striatal interneurons are probably driven by different afferents and play complementary functional roles in the physiology of the striatal microcircuit.

#### **Keywords: striatum, interneurons, cross-correlation, physiology, spikes**

## **INTRODUCTION**

The striatum is the primary input stage of the basal ganglia network. Its medium spiny projection neurons (MSNs) constitute the vast majority of striatal cells (Tepper et al., 2008). However, their activity, and hence striatal output, is thought to be strongly affected by a proportionally small population of aspiny interneurons (Kawaguchi et al., 1995; Kreitzer, 2009; Tepper et al., 2010; Gittis and Kreitzer, 2012). Two major groups of striatal interneurons have been extensively studied by *in vitro* and *in vivo* physiological methods: the fast spiking parvalbumin (PV) expressing GABAergic interneurons (FSIs; e.g., Berke, 2008; Tepper et al., 2010) and the tonically active cholinergic interneurons (TANs; e.g., Kimura et al., 1984; Aosaki et al., 1994; Graybiel et al., 1994; Morris et al., 2004).

GABAergic FSIs form powerful perisomatic synapses onto MSNs (Tepper et al., 2008). *In vitro* studies suggest that the GABAergic FSIs provide strong feed-forward inhibition that shapes the firing patterns of MSNs (Tepper et al., 2008; Gittis et al., 2010; Planert et al., 2010). On the other hand, there are no reports of studies of mono-synaptic interactions between TANs and MSNs (but see English et al., 2012, for recent evidence for di-synaptic interactions between TANs and MSNs). TANs probably cannot be simply characterized as having an excitatory or inhibitory effect on MSN activity; rather they are assumed to have a global modulatory effect (Oldenburg and Ding, 2011).

FSIs have been found to display high sensitivity to cortical input (Mallet et al., 2005) and may integrate information from diverse cortical areas (Parthasarathy and Graybiel, 1997). Their *in vivo* extracellular activity has been described mainly in rodents, and exhibits robust task-related responses in operant conditioning paradigms (Berke, 2011). However, although FSIs are coupled by gap junctions (Kita et al., 1990; Koos and Tepper, 1999), their *in vivo* activity was shown to be highly individualized (Berke, 2008; Schmitzer-Torbert and Redish, 2008). Similarly, correlation studies of the spiking activity of simultaneously recorded FSI-MSN pairs failed to find strong evidence for monosynaptic inhibition of MSN activity by FSIs (Sharott et al., 2009; Gage et al., 2010; Lansink et al., 2010). The *in vivo* activity of TANs has been amply investigated in behaving primates. In associative learning paradigms these cells pause their tonic firing for 200–300 ms in response to external stimuli that become associated with rewarding (and aversive) outcomes (Kimura et al., 1984; Apicella, 2002; Joshua et al., 2008). The TANs receive excitatory inputs from cortex and thalamus, and have been shown to increase their discharge in response to direct cortical stimulation (Sharott et al., 2012). Although TANs receive both cortical and thalamic innervations, the TAN characteristic pause response is probably driven by thalamic input (Matsumoto et al., 2001; Nanda et al., 2009; Ding et al., 2010; Schulz et al., 2011).

Both striatal interneuron cell types have been shown to be imperative to normal striatal functioning (Pisani et al., 2007; Gittis et al., 2011). This makes it crucial to define the functional (*in vivo*) relationship between their activity and the MSNs that mediate striatal output. We thus recorded and analyzed the simultaneous spiking activity of MSN–TAN or MSN–FSI pairs and used cross-correlation methods to identify the direct synaptic interactions and/or common input drives of these projection neuron-interneuron pairs.

## **METHODS**

Two monkeys (*Macaca fascicularis*, G male, 4.5 kg; L female, 3 kg) were used in this study. Experimental protocols were conducted in accordance with the National Research Council's *Guide for the Care and Use of Laboratory Animals* and the Hebrew University guidelines for the use and care of laboratory animals in research. The experimental protocols were approved and supervised by the Institutional Animal Care and Use Committee (IACUC) of the Hebrew University and Hadassah Medical Center. The Hebrew University is an Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC) internationally accredited institute. The behavioral paradigm, surgery procedures, data-recording, and analysis methods were described in previous manuscripts (Adler et al., 2012, 2013). Here we only describe in detail the methods not previously used.

#### **BEHAVIORAL TASK**

During recordings the monkeys were engaged in a well-practiced classical conditioning task involving rewarding, aversive, and neutral outcomes (**Figure 1A**). Details of the behavioral paradigm and monkey behaviors are provided in our previous reports (Adler et al., 2012, 2013). Briefly, each trial began with the presentation of a visual cue (full-screen fractal images) for a period of 2 s. The cues were immediately followed by an outcome which could be one of three categories: liquid food in the reward trials, air puff (directed at both eyes) in the aversive trials, or neither in the neutral trials. The beginning of the outcome epoch was signaled by one of three sounds (duration, 80 ms) that discriminated the three outcome categories. Trials were followed by a variable inter-trial interval (ITI) of 5–6 s. In each category there were three/two (monkey G and L, respectively) different visual cues. In the rewarding and aversive trials the cues were differentiated by the magnitude or intensity of the liquid food or air puff, respectively. In the neutral trials the cues were differentiated by a change in the duration of the ITI (−2/0/+2 s to ITI duration). In total there were nine/six (monkey G and L, respectively) different visual cues; three/two (monkey G and L, respectively) for each outcome category. In this study, we combined the trials within each outcome category and present the results for the rewarding trials (which include all amounts of liquid food), the aversive trials (which include all air puff intensities), and the neutral trials. Visual fractal cues and auditory sounds were randomized between monkeys.

Licking and blinking behavior was recorded by an infrared reflection detector (Dr. Bouis, Freiburg, Germany) and video computerized analysis (Mitelman et al., 2009). We have previously demonstrated (Adler et al., 2012, 2013) that during recordings the monkeys were familiar with the visual cues and displayed the appropriate anticipatory licking and blinking behavior. Specifically, they licked to the presentation of the rewarding (and not aversive or neutral) cues and they blinked to the presentation of the aversive (and not rewarding or neutral) cues.

## **RECORDING AND CLASSIFICATION OF EXTRACELLULARLY RECORDED STRIATAL NEURONS**

Striatal neuronal activity was recorded by two to eight glass coated tungsten microelectrodes (impedance at 1 kHz 0.3–0.8 MΩ and horizontal distance of 0.5 mm) that were advanced separately (EPS; Alpha-Omega Engineering) into the different domains of the anterior striatum (**Figure 1B**). The electrodes were slowly advanced in each recording session to enable optimal detection and sorting of the spontaneous spiking activity. We used two criteria to distinguish between striatal cell types: the cells' average firing rate, and extracellular spike waveform duration from the first negative peak to the following positive peak (**Figures 1C–E**). Cells with waveform durations of 0.9–2.5 ms and average firing rates <4 Hz were classified as presumed MSNs. Cells with waveform durations >2.5 ms and average firing rates of 3–15 Hz were classified as TANs. Finally, cells with waveform durations <0.9 ms and average firing rates >4 Hz were classified as FSIs. Remaining cells that did not strictly belong to the above groups were discarded (**Figure 1D**, unclassified) and are not reported here. Additional classifications using valley width at half maximum of the spike waveform and discharge pattern [coefficient of variation (CV) of the inter spike interval (ISI); see below] received similar identification (data not shown).
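The two-criterion rule above can be written as a short decision function. The following is a minimal Python sketch, not the authors' code: the function name `classify_striatal_cell` is hypothetical, and the treatment of values that fall exactly on a boundary (for example a waveform duration of exactly 0.9 ms) is an assumption, since the text does not specify it.

```python
def classify_striatal_cell(waveform_ms: float, rate_hz: float) -> str:
    """Label a cell from spike waveform duration (ms) and mean firing rate (Hz).

    Thresholds follow the text; inclusive/exclusive handling of exact
    boundary values is an assumption made for this sketch.
    """
    if 0.9 <= waveform_ms <= 2.5 and rate_hz < 4:
        return "MSN"
    if waveform_ms > 2.5 and 3 <= rate_hz <= 15:
        return "TAN"
    if waveform_ms < 0.9 and rate_hz > 4:
        return "FSI"
    # Cells that satisfy none of the rules are discarded from the analysis.
    return "unclassified"
```

Note that the rules are not exhaustive: a cell with an MSN-like waveform but a firing rate of 10 Hz, for instance, satisfies none of them and ends up unclassified.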

## **DATA ANALYSIS OF THE DISCHARGE PATTERN OF STRIATAL NEURONS**

The discharge patterns of the recorded striatal neurons were characterized by the ISI CV and the auto-correlation histograms (**Figure 2**). The ISI CV is defined as the SD/mean of the ISIs of each neuron and was calculated on the entire recording epoch (including the ITI). The auto correlation histogram (ACH) was calculated for a delay of 2 s. The ACH of each neuron was calculated for each task event and averaged to provide the raw ACH. The raw ACH was normalized by the average firing rate of the cell. Note that the CV is affected only by the first order ISI, whereas the ACH is affected by all spikes occurring within the 2 s interval.
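As an illustration of the two measures, the sketch below computes the ISI CV and a raw auto-correlation histogram from a list of spike times in seconds. It is a simplified, assumption-laden example: the function names are hypothetical, the population SD is used for the CV, the ACH is computed over one continuous spike train rather than per task event, and the normalization by firing rate is left out because the text does not state its exact form.

```python
from statistics import mean, pstdev


def isi_cv(spike_times):
    """Coefficient of variation (SD / mean) of the inter-spike intervals."""
    t = sorted(spike_times)
    isis = [b - a for a, b in zip(t, t[1:])]
    return pstdev(isis) / mean(isis)


def autocorrelogram(spike_times, max_lag=2.0, bin_size=0.001):
    """Conditional firing rate (spikes/s) at each lag around every spike.

    Lags run from -max_lag to +max_lag; a spike is never paired with itself.
    The double loop is quadratic in the number of spikes, which is fine for
    a sketch but slow for long recordings.
    """
    t = sorted(spike_times)
    n_bins = int(round(2 * max_lag / bin_size))
    counts = [0] * n_bins
    for i, ti in enumerate(t):
        for j, tj in enumerate(t):
            lag = tj - ti
            if i != j and -max_lag <= lag < max_lag:
                counts[min(int((lag + max_lag) / bin_size), n_bins - 1)] += 1
    return [c / (len(t) * bin_size) for c in counts]
```

Only adjacent spikes enter `isi_cv`, whereas every spike pair within the lag window contributes to `autocorrelogram`, which is the distinction drawn in the last sentence of the paragraph above.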

## **DATA ANALYSIS OF SINGLE CELL RESPONSES**

Classical conditioning paradigm. Visual cues were presented for 2 s and predicted the delivery of food (reward trials, upper row), air puff (aversive trials, third row), or only sound (neutral trials, second row). The trial outcome epoch was followed by a variable inter trial interval (ITI) of 5–6 s. **(B)** Recording sites: a representative coronal section +3 mm from the anterior commissure [adapted from Martin and Bowden, 2000]. Two to eight electrodes were advanced separately into one or two of the three sub regions of the striatum. P for putamen, C for caudate, and V for ventral striatum. **(C)** An example of simultaneously recorded pairs of units from the putamen. Each row is a 4 s analog trace of extracellular recording from a single electrode filtered between 300 and 6000 Hz. First two rows are MSN (red) and TAN (blue) recorded simultaneously, second two rows are MSN (red) and FSI (green) recorded simultaneously. **(D)** Classification of striatal neuron subtypes. Each dot represents a single neuron colored according to

either group. Abscissa: firing rate in Hz (logarithmic scale). Ordinate: spike waveform duration (ms). **(E)** Spike waveform averaged over all cells (average ± STD, line and shaded envelope, respectively) in each of the clusters. Waveform length was measured as the distance between the first negative peak and the next positive peak (left and right dashed lines, respectively). Upper row: TAN. Middle row: MSN. Third row: FSI. Same color coding as in **(D)**. **(F)** Spatial layout of TAN-MSN pairs. Each point represents a single pair. Abscissa: coordinates in the horizontal plane (in mm); M, medial; L, lateral; zero in the center of the putamen in our recordings. Ordinate: coordinates in the peri-sagittal plane (in mm); A, anterior; P, posterior; zero is coronal section AC0 (AC, anterior commissure). Z-axis: depth from entry to the striatum (in mm) of the TAN in the TAN-MSN pair. Blue and gray, location of pairs with significant and not significant correlations, respectively. **(G)** Spatial layout of FSI-MSN pairs. Same conventions as in **(F)**.

Neural responses to behavioral events were characterized by a post stimulus time histogram (PSTH) starting at cue presentation and ending 2 s after outcome delivery (**Figure 3**). PSTHs were calculated in 1 ms bins and smoothed with a Gaussian window (*SD* of 20 ms). The baseline firing rate was calculated by averaging the firing rate in the last 3 s of the ITI and was subtracted from the smoothed PSTH. To determine a significant response in a single PSTH, we calculated the SD of the PSTH of the last 3 s of the ITI using the same number of trials as in the studied PSTH and identified time segments in which the deviation from the baseline firing rate exceeded three times the ITI-SD. A response was considered significant only if the duration of the deviant segment was >60 ms (three times the SD of the smoothing filter).
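The smoothing and significance criterion described for single-cell responses can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' implementation: the function names are hypothetical, the Gaussian kernel is truncated at 4 SD and renormalized at the edges, the input PSTH is assumed to be baseline-subtracted already, and positive and negative deviations are treated alike.

```python
import math


def gaussian_smooth(values, sd_bins):
    """Smooth a sequence with a Gaussian kernel (SD given in bins)."""
    half = int(4 * sd_bins)  # truncate the kernel at 4 SD (assumption)
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k / sd_bins) ** 2) for k in offsets]
    out = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= i + k < len(values):
                num += w * values[i + k]
                den += w
        out.append(num / den)  # renormalize where the kernel is clipped
    return out


def significant_segments(psth, baseline_sd, bin_ms=1, n_sd=3.0, min_ms=60):
    """Return (start_bin, stop_bin) runs where |psth| > n_sd * baseline_sd
    and the run lasts longer than min_ms. `psth` is baseline-subtracted."""
    segments, start = [], None
    for i, v in enumerate(list(psth) + [0.0]):  # sentinel closes a final run
        deviant = abs(v) > n_sd * baseline_sd
        if deviant and start is None:
            start = i
        elif not deviant and start is not None:
            if (i - start) * bin_ms > min_ms:
                segments.append((start, i))
            start = None
    return segments
```

With 1 ms bins and the 20 ms smoothing SD given in the text, a call such as `significant_segments(gaussian_smooth(psth, 20), baseline_sd)` would flag only deviations that persist for more than 60 ms.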

## **DATA ANALYSIS OF RESPONSE SIMILARITY OF CELL PAIRS**

**FIGURE 2 | Striatal MSNs, TANs, and FSIs display different firing patterns. (A)** MSN discharge pattern. Left subplot: Distributions of the CV of the ISIs of striatal MSN neurons. Abscissa: CV. Ordinate: fraction of cells. Right subplot: Average ± SEM (solid line and envelope) auto-correlation histogram of MSNs normalized by the average discharge rate and averaged over all cells. N is for number of neurons. **(B)** TAN discharge pattern. Same conventions as in **(A)**. **(C)** FSI discharge pattern. Same conventions as in **(A)**.

**FIGURE 3 | Striatal MSNs, TANs, and FSIs display different response profiles. (A)** MSN response profile. Average response ± SEM (solid line and envelope) to cue presentation (0 s) and outcome delivery (2 s). Ordinate: firing rate in Hz normalized by the ITI discharge rate. Blue, reward events; red, aversive events; green, neutral events. N is for number of cells. **(B)** TAN response profile. Same conventions as in **(A)**. **(C)** FSI response profile. Same conventions as in **(A)**.

**FIGURE 4 | Different response characteristics of striatal MSNs, TANs, and FSIs. (A)** MSN response profile. Left subplot: distribution of MSNs that had a significant response. Blue, red, and green bars: fraction of cells that had a significant response for reward, aversive, and neutral events, respectively. Black bar: fraction of cells that had a significant response to at least one of the task events. Second subplot: distribution of response onset. Abscissa: time in seconds for significant increase in firing rate. Red line marks the average response onset time. Right subplot: distribution of the signal correlation between all (simultaneously and non-simultaneously recorded) MSN pairs. N is for number of pairs. **(B)** TAN response profile. Same conventions as in **(A)**. In the second subplot: distribution of response onset, left and right columns: latency of significant decrease and increase in firing rate, respectively. **(C)** FSI response profile. Same conventions as in **(A)**.

The signal correlation (**Figure 4**, right column) was calculated between all cell pairs (simultaneously and non-simultaneously recorded) within each population as described previously (Joshua et al., 2009; Adler et al., 2013). Briefly, a signal correlation measures to what extent a pair of neurons tend to respond similarly to the behavioral events (i.e., similarity of the PSTH vectors). For each neuron we computed the PSTHs in 100 ms bins (without smoothing) for all behavioral events. We combined all PSTHs of a single cell into one matrix with rows for each behavioral event and columns for each 100 ms time bin. For each column, we subtracted that column's mean and then flattened the matrix into a single vector. For each pair of neurons we computed their signal correlation by calculating the correlation coefficient of these two vectors. Signal correlation values range from plus one (for highly correlated response profiles) through zero (non-correlated response profiles) to minus one (anti-correlated response profiles).
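The signal-correlation procedure (event-by-bin matrix, column-mean subtraction, flattening, correlation coefficient) can be sketched in a few lines. The Python example below is a hedged illustration: the function names are hypothetical, the correlation coefficient is taken to be Pearson's, and each PSTH matrix is represented as a plain list of rows.

```python
import math


def response_vector(psth_matrix):
    """Rows are behavioral events, columns are 100 ms bins.
    Subtract each column's mean, then flatten row by row."""
    n_rows = len(psth_matrix)
    col_means = [sum(col) / n_rows for col in zip(*psth_matrix)]
    return [x - m for row in psth_matrix for x, m in zip(row, col_means)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def signal_correlation(psth_a, psth_b):
    """Signal correlation of two neurons from their event-by-bin PSTHs."""
    return pearson(response_vector(psth_a), response_vector(psth_b))
```

Subtracting the column means removes the part of the response that is common to all events at a given time bin, so the correlation reflects how similarly two neurons differentiate between events rather than a shared time course.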

## **DATA ANALYSIS OF CORRELATIVE SPIKING ACTIVITY OF SIMULTANEOUSLY RECORDED CELL PAIRS**

Spike to spike synchronization (**Figures 5**–**8**) between simultaneously recorded MSN-TAN, MSN-FSI, or TAN-TAN pairs was determined using cross correlation histograms (CCHs; Perkel et al., 1967). CCHs were computed with 1 ms bins for ±2 s around the trigger (MSN) spike and the conditional discharge rates of the reference cells (TAN or FSI) were smoothed using a Gaussian (*SD* of 10 ms). For the TAN-TAN pairs the selection of the trigger and the reference cell was done randomly. Only cell pairs with minimal isolation quality (>0.7, Joshua et al., 2007) and rate stability that were recorded simultaneously for more than 21 and 30 min (monkey L and G, respectively) were included in the database. We used different inclusion criteria for the two monkeys in order to have a similar number of trials for each category for the two monkeys (two and three different outcome magnitudes were used in monkey L and G, respectively, see Behavioral task details above).

CCHs were computed separately for each task event and averaged to provide the raw CCH. Raw CCH may reflect the common activation of the recorded neurons either by the intrinsic network connectivity or by the common activation by behavioral events. However, the common activation by the behavioral events could be detected also in trials that have not been simultaneously recorded. Raw CCHs can be therefore normalized (corrected for common modulation of discharge rate) by using PSTH (reflecting the average responses of the recorded cell) and shift predictors (shuffling of the trials). As expected for stationary data, PSTH and shift predictors yielded similar results. Only the PSTH correction method is presented here.
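To make the correction step concrete, the sketch below builds a raw CCH from simultaneously recorded trials and subtracts a shift predictor built from non-simultaneous trial pairs. It shows the shift predictor only because it is the simpler of the two corrections named above; the results reported in this paper use the PSTH predictor. All names are hypothetical, spike times are assumed to be in seconds relative to each trial's start, and pairing trial *i* with trial *i* + 1 is one common choice of shuffle, not necessarily the one used by the authors.

```python
def cross_correlogram(trigger, reference, max_lag, bin_size):
    """Count reference spikes in each lag bin around every trigger spike."""
    n_bins = int(round(2 * max_lag / bin_size))
    counts = [0] * n_bins
    for t in trigger:
        for r in reference:
            lag = r - t
            if -max_lag <= lag < max_lag:
                counts[min(int((lag + max_lag) / bin_size), n_bins - 1)] += 1
    return counts


def shift_corrected_cch(trigger_trials, reference_trials,
                        max_lag=2.0, bin_size=0.001):
    """Raw CCH minus shift predictor, in spikes/s per trigger spike.

    Trial i of the trigger cell is paired with trial i + 1 of the
    reference cell (wrapping around) to build the predictor, so at
    least two trials are needed for a meaningful correction.
    """
    n_trials = len(trigger_trials)
    n_bins = int(round(2 * max_lag / bin_size))
    raw, shifted = [0] * n_bins, [0] * n_bins
    for i, trig in enumerate(trigger_trials):
        same = cross_correlogram(trig, reference_trials[i], max_lag, bin_size)
        other = cross_correlogram(
            trig, reference_trials[(i + 1) % n_trials], max_lag, bin_size)
        raw = [a + b for a, b in zip(raw, same)]
        shifted = [a + b for a, b in zip(shifted, other)]
    n_trigger_spikes = sum(len(t) for t in trigger_trials)
    scale = 1.0 / (n_trigger_spikes * bin_size)
    return [(r - s) * scale for r, s in zip(raw, shifted)]
```

Correlations that arise only because both cells are locked to the same task event appear equally in the raw and shifted histograms and cancel, whereas spike-to-spike coupling within a trial survives the subtraction.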

To determine a significant peak/trough in a single CCH we calculated the SD of the last 0.5 s in both negative and positive lags of the CCH, and identified segments in which the CCH (±1.5 s around zero) exceeded three times the SD. A CCH was considered to have a significant peak/trough only if the duration of the deviant segment was >30 ms (three times the SD of the smoothing filter). We used additional methods (Abeles, 1982a) to determine the significance of the CCH and obtained similar results (data not shown).

aversive event. Same conventions as in **(A)**. **(C)** Normalized CCH averaged over all interneurons to MSN pairs for the neutral event. Same conventions as in **(A)**. **(D)** Normalized CCH (using the PSTH predictor) averaged over behavioral events for all interneurons to MSN pairs.

To better characterize the CCHs we calculated the area under the curve of the normalized CCH at ±1.5 s time lags. Specifically, we summed the number of spikes in the ±1.5 s CCH time bins and divided this sum by the number of bins to obtain the average number of added spikes of the reference cell (FSI or TAN) around the occurrence of a spike of the trigger (MSN) cell. This parameter ranges from negative values (indicating inhibitory correlation) through zero (indicating no correlation) to positive values (indicating positive correlation).

Finally, to determine the skewness of a significant CCH we calculated a symmetry index. The index is the number of significant bins at positive lags minus the number of significant bins at negative lags, divided by their sum.
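The symmetry index is a simple ratio of bin counts. The Python sketch below is illustrative only: the function name is hypothetical, bins at exactly zero lag are ignored (an assumption), and the function is meant to be called only on CCHs that already contain at least one significant bin.

```python
def symmetry_index(significant, lags):
    """(N_pos - N_neg) / (N_pos + N_neg) over significant CCH bins.

    `significant` holds one boolean per bin and `lags` the matching bin
    lags. Returns +1 when all significant bins lie at positive lags and
    -1 when all lie at negative lags. Bins at lag 0 are ignored.
    """
    n_pos = sum(1 for s, lag in zip(significant, lags) if s and lag > 0)
    n_neg = sum(1 for s, lag in zip(significant, lags) if s and lag < 0)
    if n_pos + n_neg == 0:
        raise ValueError("no significant bins: index is undefined")
    return (n_pos - n_neg) / (n_pos + n_neg)
```

Because the MSN is the trigger cell, the sign of the index indicates on which side of the MSN spike the significant interneuron bins are concentrated.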

#### **RESULTS**

#### **STRIATAL CELL CLASSIFICATION AND IDENTIFICATION**

We recorded the activity of striatal neurons from two monkeys engaged in a well-practiced classical conditioning task (**Figure 1A**). The task involved presentation of visual images (cues) predicting either food outcome in rewarding trials, air puff in aversive trials or neither in neutral trials (Adler et al., 2012, 2013). Recordings were made from two to eight electrodes simultaneously in all striatal domains (anterior caudate, putamen and ventral striatum, **Figure 1B**). We classified striatal cells into three distinct groups using the waveform profiles (300–6000 Hz band-pass filtered extracellularly recorded activity) and the average firing rates of the recorded neurons (**Figures 1C–E**). Of the 1287 neurons that passed our inclusion criterion, 777 were classified as striatal phasically active neurons (presumably striatal projection neurons, MSNs), 283 were classified as TANs (presumably striatal cholinergic neurons), and 36 as FSIs (presumably striatal parvalbumin expressing GABAergic neurons). As reported previously in the rodent (Berke et al., 2004; Berke, 2008), the primate FSIs had the narrowest spike waveform lengths and the fastest average firing rates. TANs had the widest spike waveform lengths and intermediate firing rates. Finally, MSNs displayed an intermediate waveform length and the slowest firing rates.

different y-scales).

**FIGURE 7 | MSN-interneuron pairs do not display narrow peaks or troughs in their cross correlation histograms. (A)** Raw (with no normalization) cross correlation histograms (CCH) between pairs of striatal interneurons and MSNs averaged over all pairs. Black line and gray shaded envelope display average and SEM values, respectively. Cross correlation histograms were computed with 1 ms bins for ±100 ms around the trigger spike and were smoothed using a Gaussian (*SD* of 2 ms). The MSN (trigger cell) discharge is at time zero. Ordinate: conditional firing rate (spikes/s) of the FSI and TAN (reference cell), given a spike of the MSN at time zero. Abscissa: Time lag (in ms) around the discharge of the trigger cell. Axis labels on lower left subplots apply for all subplots. **(B)** Normalized CCH (using the PSTH predictor) averaged over behavioral events for all interneurons to MSN pairs.

**FIGURE 8 | TAN-TAN pairs display narrow peaks in their CCHs. (A)** Raw (with no normalization) cross correlation histograms (CCH) between pairs of striatal TANs averaged over all simultaneously recorded pairs. Black line and gray shaded envelope display average and SEM values, respectively. Cross correlation histograms were computed with 1 ms bins for ±2000 ms around the trigger spike and were smoothed using a Gaussian (*SD* of 10 ms). Ordinate: conditional firing rate of the reference cell, given a spike of the trigger cell at time zero. Inset: CCH at shorter time scale (±500 ms around the trigger spike). **(B)** Normalized CCH (using the PSTH predictor) averaged over behavioral events for all TAN to TAN pairs.

#### **DISTINCT DISCHARGE PATTERNS OF STRIATAL NEURONS**

The three populations of striatal neurons also displayed distinctive firing patterns (**Figure 2**). TANs had the lowest values of CV of their ISIs with a very narrow distribution, revealing the tendency of these neurons for regular discharge. The CV of the MSNs ISIs was larger and broadly distributed, and the distribution of FSIs CV was intermediate in values and variance (**Figure 2**, left column).
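The regularity measure used above, the coefficient of variation (CV) of the inter-spike intervals (ISIs), can be illustrated with a minimal Python sketch. The function name `isi_cv`, the use of the population standard deviation, and the two toy spike trains are assumptions made for illustration only; they are not taken from the authors' analysis code.

```python
import statistics

def isi_cv(spike_times):
    """Coefficient of variation (SD / mean) of the inter-spike intervals.

    spike_times: sorted spike times in seconds (hypothetical input format).
    """
    isis = [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes to estimate the CV")
    # Population SD is an assumption; the sample SD would be equally valid.
    return statistics.pstdev(isis) / statistics.fmean(isis)

# Toy trains: a clock-like train (TAN-like regularity) and a bursty train.
regular = [0.1 * i for i in range(50)]
bursty = []
t = 0.0
for _ in range(10):
    for gap in (0.01, 0.01, 0.01, 0.5):   # three short ISIs, then a long pause
        t += gap
        bursty.append(t)

print(isi_cv(regular), isi_cv(bursty))
```

A CV near zero indicates regular, pacemaker-like firing, whereas values above one indicate firing that is more irregular than a Poisson process, as expected for bursty discharge.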

Similar phenomena were observed in the average auto-correlograms of the three populations (**Figure 2**, right column). The auto-correlogram reveals the probability of a neuron discharging a spike as a function of time relative to a previous spike (at time = 0) of the same neuron. The average auto-correlogram of the TANs (**Figure 2B**) shows a relative refractory period with a tendency for rebound after discharge. On the other hand, the average auto-correlograms of both the MSNs and FSIs (**Figures 2A,C**) revealed their tendency to fire in bursts (central peaks in the histograms) that lasted ∼0.5 s.

#### **DISTINCT POPULATION RESPONSE PATTERNS OF STRIATAL CELLS**

Cells in all three sub-populations were highly modulated by the task, particularly to cue presentation (**Figure 3**). More than 93% of striatal cells (for all three populations) responded to at least one of the task events (**Figure 4**, left column). However, across striatal sub-populations, the cells displayed distinct response profiles.

MSNs (**Figure 3A**) typically responded with an average sustained increase in discharge rate to the visual cues, which started on average 547.2 ± 9.8 ms after cue presentation. As reported previously (Adler et al., 2012, 2013), MSNs displayed highly diverse responses. The diversity of the responses of neuronal population can be characterized by the distribution of the signal correlations; i.e., the similarity of the vectors of responses of two neurons of this population (Oram et al., 1998; Averbeck and Lee, 2004; Cohen and Kohn, 2011). The neural activity of MSN-MSN pairs was characterized by a symmetrical signal correlation distribution (average signal correlation ± SEM; 0.004 ± 0.0003, **Figure 4A** right).
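As a rough illustration of the signal correlation described above, the following Python sketch computes the Pearson correlation between the response vectors of every pair of neurons in a population. The exact construction of the response vectors is given in the original Methods and is not reproduced here; the neuron names, the per-event response values, and the helper names `pearson` and `signal_correlations` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signal_correlations(responses):
    """Pairwise signal correlations.

    responses: dict mapping a neuron label to its vector of mean responses,
    one entry per task event (assumed format).
    """
    names = sorted(responses)
    return {
        (a, b): pearson(responses[a], responses[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical mean rate changes (spikes/s) for three task events.
example = {
    "cell_1": [2.0, -1.0, 0.5],
    "cell_2": [1.8, -0.7, 0.4],
    "cell_3": [-1.5, 2.0, 0.1],
}
for pair, r in signal_correlations(example).items():
    print(pair, round(r, 3))
```

A population with diverse responses yields a distribution of pairwise values centered near zero, whereas a population with stereotyped responses yields a distribution shifted toward positive values.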

Unlike MSNs, TANs (**Figures 3B**, **4B**) responded with a very stereotyped and synchronized (average TAN-TAN signal correlation ± SEM; 0.12 ± 0.0008, **Figure 4B**, right) pause and rebound excitation to cue presentation (**Figure 3B**), which was very sharp and immediate (average ± SEM onset to pause response: 153.2 ± 3.9 ms, to excitation: 334.9 ± 4.9 ms, **Figure 4B**, 3rd subplot).

FSIs (**Figure 3C**), like MSNs, responded mostly with an increase in discharge rates to cue and outcome presentation (at time = 0 and 2 s, respectively). However, this response was more immediate than the MSN response (**Figure 4C**, middle column; average ± SEM FSI onset time: 251.7 ± 27.4 ms, One-Way ANOVA, *p* < 0.05, *f* = 75.13, *df* = 2; MSN response onset time was different from that of TAN and FSI). In terms of similarity of the neural responses of FSI-FSI pairs (**Figure 4C** right column), FSIs were not as diverse as the MSNs (average signal correlation ± SEM; 0.06 ± 0.009). However, they also did not display the highly synchronized activity pattern of TANs that is characteristic of basal ganglia neuromodulator groups (e.g., dopamine neurons and TANs, Morris et al., 2004; Joshua et al., 2009). A One-Way ANOVA revealed that the distribution of the FSI-FSI signal correlation was different (*p* < 0.05, *f* = 8.78, *df* = 2) from that of MSN-MSN and TAN-TAN pairs.

To sum up, all three populations of striatal projection and interneurons were highly modulated by the task; however, they differed considerably in their response profile and response synchronization levels.

#### **MSNs ARE DIFFERENTIALLY CORRELATED WITH STRIATAL INTERNEURONS**

**Figures 5A**, **6** display the raw and corrected average CCHs between striatal MSN-TAN and MSN-FSI pairs (left and right columns, respectively).

The CCHs between simultaneously recorded pairs of TANs and MSNs were typically flat. In fact, all (*N* = 379 pairs) but three MSN-TAN pairs were not significantly correlated. We further calculated the average number of added spikes of the reference cell (TAN) to the trigger cell (MSN) around the corrected CCH time window of ±1.5 s (see Methods). A negative value would indicate that whenever the trigger cell spiked, the reference cell was more likely to suppress its discharge, a positive value would indicate the opposite, and zero would imply there was no correlation between the two. As predicted by the average flat CCH, we found the distribution of added spikes for the MSN-TAN pairs to be symmetrical around zero and not significantly different from zero (*Z*-test, *p* = 0.7, **Figure 5B**, left).
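The cross correlation histogram and the "added spikes" measure can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the CCH is the conditional firing rate of the reference cell around each trigger spike, and added spikes are taken as the area of the CCH above a baseline rate. The paper's actual correction uses a PSTH predictor and smoothing (see Methods), neither of which is reproduced here, and the function names are hypothetical.

```python
def cross_correlogram(trigger, reference, half_window=1.5, bin_width=0.001):
    """Conditional firing rate (spikes/s) of the reference cell around
    each trigger spike. All times are in seconds."""
    n_bins = int(round(2 * half_window / bin_width))
    counts = [0] * n_bins
    for t in trigger:
        for r in reference:          # O(N*M): fine for a sketch, slow for real data
            lag = r - t
            if -half_window <= lag < half_window:
                idx = min(int((lag + half_window) / bin_width), n_bins - 1)
                counts[idx] += 1
    # Normalize counts to a rate: per trigger spike and per second.
    return [c / (len(trigger) * bin_width) for c in counts]

def added_spikes(cch, baseline_rate, bin_width=0.001):
    """Area of the CCH above (positive) or below (negative) the baseline,
    i.e., extra reference spikes per trigger spike (assumed definition)."""
    return sum((rate - baseline_rate) * bin_width for rate in cch)

# Toy example: the reference cell fires 10.5 ms after every trigger spike.
trig = [1.0, 2.0, 3.0]
ref = [t + 0.0105 for t in trig]
cch = cross_correlogram(trig, ref, half_window=0.1, bin_width=0.001)
print(max(cch), added_spikes(cch, baseline_rate=0.0))
```

With this sign convention, a negative value means that the reference cell tends to suppress its discharge around trigger spikes, a positive value means the opposite, and a value near zero means no detectable correlation.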

To ascertain that the average flat MSN-TAN CCH was not a result of opposing effects canceling each other out, we examined the CCHs separately for each type of task event and normalized them by a PSTH predictor (**Figure 6**, left). We found the raw (data not shown) and the normalized CCHs were typically flat for all behavioral events. There were no MSN-TAN pairs with significant CCHs for the reward event and only a single pair had a significant CCH for the aversive and neutral events.

Unlike the flat CCHs of MSN-TAN pairs, the MSN-FSI pairs were highly correlated. The average raw CCH of all MSN-FSI pairs (*N* = 66 pairs, **Figure 5A**, right) displayed a very broad positive and asymmetrical peak. Even after normalizing the raw CCHs by a PSTH predictor (**Figure 6**, right) to compensate for the effects of similar responses (**Figures 3**, **4**) a broad positive peak remained. Most of the MSN-FSI pairs that displayed a significant CCH (*N* = 29 pairs) had a positive peak (*N* = 24 pairs) and only five had a negative trough. This is evident both in the average CCH (**Figures 5A**, **6**) and in the positively skewed distribution of the CCH number of added spikes (**Figure 5B**, significantly different from zero, *Z*-test, *p* < 0.05). We found the correlation between MSN and FSI pairs was dependent on the cells' location within the striatum (**Figures 1F**,**G**). MSN-FSI pairs with significant correlations were more likely to be located posteriorly (*t*-test, *p* < 0.05).

Most MSN-FSI pairs with a significant CCH exhibited an asymmetrical histogram where the peak of the histogram was shifted toward negative values. This implies that the spikes of the trigger cell (MSN) followed those of the reference cell (FSI). We quantified the asymmetry (in the CCHs with significant positive peaks) using a symmetry index (see Methods). Most (20/24) MSN-FSI pairs with a significant CCH had a negative symmetry index with an average of −0.31 ± 0.1 (mean ± SEM, calculated over both positive and negative indices) indicating a CCH peak that was shifted to the left. Like MSN-TAN pairs, the correlation between FSIs and MSNs was not dependent on the task event (**Figure 6**, right). Furthermore, we examined the MSN-FSI CCHs at shorter time lags of ±100 ms (**Figure 7**) to search for the expected effects of the mono-synaptic inhibition of MSN discharge by FSI activity. We did not find troughs (or peaks) in the raw (**Figure 7A**) and PSTH predictor normalized (**Figure 7B**) CCHs in these time frames (none of the pairs were significant).

### **DISCUSSION**

We simultaneously recorded the spiking activity of striatal projection neurons (MSNs) and interneurons (TANs or FSIs) from monkeys engaged in a classical conditioning task involving rewarding, aversive, and neutral cues.

All striatal neurons were highly responsive to the behavioral events, but they displayed differential response properties. Striatal MSNs displayed the most sustained (**Figure 3**) and diverse response pattern (symmetric and broad distribution of the values of MSN-MSN signal correlation; **Figure 4**, right column). Striatal TANs (presumably cholinergic interneurons) displayed the most transient and synchronized activity pattern (the distribution of TAN-TAN signal correlation was significantly shifted to the right). Finally, striatal FSIs (presumably GABAergic interneurons) displayed intermediate values in both parameters.

The TAN-MSN CCHs were flat, suggesting a modulatory rather than a driving effect of the synchronized TAN activity on MSN neurons. The FSI-MSN pairs displayed a broad and asymmetrical peak in their CCHs. Thus, our correlation analysis of the spiking activity of remote (>0.5 mm) FSI-MSN pairs does not reveal evidence for mono-synaptic inhibition, but it shows that generally FSI discharges precede MSN spikes.

#### **STRIATAL MSNs AND TANs ARE NOT CORRELATED**

Striatal TANs constitute a very small percentage of striatal cells (Aosaki et al., 1995). Nonetheless, their widespread axonal field suggests that they should have a significant influence over MSN activity via their muscarinic synapses (Bolam et al., 1984; Bonsi et al., 2011). Anti-cholinergic agents were the first effective pharmacological treatment for Parkinson's disease, and their significant role in the pathophysiology of basal ganglia related disorders is emphasized in the dopamine-acetylcholine balance hypothesis (Calabresi et al., 2006; Aosaki et al., 2010; Sciamanna et al., 2012). Acetylcholine secreted by TANs can affect MSNs directly by changing the cells' excitability (Kreitzer, 2009; Goldberg et al., 2012) or indirectly by altering the dopaminergic input to the striatum (Threlfell et al., 2012). However, striatal TANs (like dopaminergic neurons, Moss and Bolam, 2008; Matsuda et al., 2009; Rice et al., 2011) probably have widespread influences via volume conductance and extra synaptic effects. Thus, they can modulate (in conjunction with the dopaminergic and other modulators of the striatum) the efficacy of the cortico and thalamo-striatal synapses, rather than directly affecting their target neurons' ongoing discharge (Kreitzer, 2009; Higley et al., 2011).

Consistent with this reasoning, we did not find any correlations between the TANs' spiking activity and that of the MSNs (**Figures 5**–**7** left columns), in line with a previous primate study of TAN-MSN correlations (Kimura et al., 2003). In this study, Kimura et al. reported significant (serial) correlation only between 3 out of 16 TAN-PAN pairs (Kimura et al., 2003, last line of their Table 1). Furthermore, we previously reported a lack of TAN-globus pallidus correlations in the normal (before MPTP) primate (Raz et al., 2001). As most of the innervation of pallidal neurons (>90% of their synapses, Percheron et al., 1994) is from striatal MSNs, a lack of TAN-pallidal correlation is congruent with a lack of TAN-MSN correlation.

The finding that MSNs and TANs were not synchronized seems to be at odds with a recent optogenetic study (English et al., 2012) revealing strong TAN-MSN poly-synaptic modulation mediated by neuropeptide Y-neurogliaform (NPY-NGF) interneurons. The lack of TAN-MSN correlations is even less expected given the physiological studies revealing TAN-TAN synchronization (Raz et al., 1996; Kimura et al., 2003; Morris et al., 2004). These correlation studies imply a functional redundancy among TANs (i.e., the ongoing spiking activity of a single TAN is a faithful representation of the entire TAN network). Furthermore, beyond the synchronization of the spontaneous TAN spikes, TANs also show exceptional similarity in their responses to behavioral events. Indeed, many studies of the responses of TANs to behavioral events indicate that the TAN network is globally synchronized (see Graybiel et al., 1994, Figure 4; Adler et al., 2013, Figure 9). Thus, the finding of flat MSN-TAN correlations cannot be dismissed on the basis of the spatial distance (>0.5 mm) between the MSN-TAN pairs of this study.

Nonetheless, the practical implication of the physiological TAN-to-TAN synchronization is still quite modest. The typical shape of a TAN-TAN cross-correlogram can be characterized as a triangle with a 100 ms base (around time zero, the time of a spike of the trigger TAN) and a height of 1 spike/s above the average discharge rate of the reference TAN (see **Figures 8A,B** for raw and PSTH predictor normalized TAN-TAN correlograms, respectively). Namely, there is an increased probability (beyond the baseline discharge rate) for one TAN to emit a spike within a ∼100 ms window around the discharge of another TAN. This synchronization is thus very different from the optogenetic stimulation, which probably induces a considerably stronger and sharper synchronization between TANs. We therefore suggest that the differences between the TAN-MSN connectivity found in our study and that found in the studies that used optogenetic tools (English et al., 2012) are due to these different time and intensity scales of TAN synchronization.
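To make the modest size of this effect concrete, the triangular description can be taken at face value and its area computed. This is an illustrative back-of-the-envelope estimate, not a value reported in the study:

$$
\text{added spikes} \approx \tfrac{1}{2} \times \text{base} \times \text{height} = \tfrac{1}{2} \times 0.1\ \text{s} \times 1\ \text{spike/s} = 0.05\ \text{spikes per trigger spike}
$$

Under this approximation, a spike of one TAN would be accompanied by roughly one extra spike of the other TAN for every 20 trigger spikes.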

#### **LACK OF EVIDENCE FOR MONO-SYNAPTIC INHIBITION IN THE FSI-MSN CROSS CORRELATION HISTOGRAMS**

In this study we used the waveform profiles of the extracellularly recorded spikes and the average discharge rates (**Figure 1D**) to classify three populations of striatal neurons (MSN, TANs, and FSI). We have found that other parameters (e.g., discharge pattern and responses to behavioral events) also revealed different profiles for the three classes of neurons. Following the rodent literature, we assume that our FSIs are the PV expressing GABAergic interneurons of the striatum. However, TH-expressing neurons can also be fast-spiking and calretinin neurons, which are particularly numerous in the primate striatum, could also make up part of the sample (although we do not yet know their spike shape, they are also GABAergic interneurons). The methodological limits of extra-cellular recordings in behaving animals do not enable us to verify that the FSIs recorded here represent a single type of striatal interneuron, and this should be further clarified by future studies.

*In vitro* studies have demonstrated that single FSI spikes can delay or abolish MSN spikes (Koos and Tepper, 1999; Planert et al., 2010). Furthermore, the FSIs are coupled by gap junctions (Kita et al., 1990; Koos and Tepper, 1999). Together, these properties were interpreted as suggesting that FSIs synchronously inhibit MSNs. However, in line with recent theoretical (Hjorth et al., 2009) and rodent *in vivo* studies (Berke, 2008; Gage et al., 2010; Lansink et al., 2010), we found that in the primate, the FSI population does not show synchronized spiking activity and does not respond similarly to behavioral events (**Figure 4C**, right subplot). Furthermore, as in these *in vivo* rodent studies (Gage et al., 2010; Lansink et al., 2010) we could not detect narrow troughs in the FSI-MSN CCHs. Finally, our results are in line with an earlier rodent study (Sharott et al., 2009). Although this study was carried out under anesthesia, it demonstrates the lack of negative correlation between MSNs and FSIs and that FSI-MSN correlations are stronger than TAN-MSN correlations.

This discrepancy between *in vitro* vs. the *in vivo* rodent and current primate studies could possibly be rooted in differences between intra- and extra-cellular recording methods. Intracellular studies are biased for adjacent neurons. If the *in vivo* FSI network is not synchronized, the lack of short-latency troughs in the FSI-MSN CCHs may reflect a selection bias for extra-cellular recording of only unconnected MSN-FSI pairs. In fact, the probability of detecting a connected pair with our multiple electrode setup (0.5 mm horizontal distance between electrodes) was small since FSIs make strong and dense projections onto MSNs within a 0.3 mm radius of their soma (Koos et al., 2004; Mallet et al., 2005; Gittis et al., 2010). However, Gage et al. (2010) only examined pairs recorded by the same tetrode, where the MSN is most probably within the axonal field of the recorded FSI, and, as here, failed to detect evidence for strong functional effects of the mono-synaptic FSI-MSN connection.

FSI-MSN connections show substantial depression during continuous discharge (Klaus et al., 2011). The lack of evidence for monosynaptic inhibition between spikes of FSIs and MSNs could also reflect the high discharge rate of FSIs in behaving animals (unlike slice preparations), thus leading to prolonged synaptic depression of the MSNs (see discussion in Gage et al., 2010). However, see a recent study of local pallidal interactions (Bugaysen et al., 2013) revealing that despite significant short-term synaptic depression and the high frequency discharge of pallidal neurons the local network is modulated by these depressing synapses.

Finally, such discrepancies between *in vitro* experiments (mainly recording intra-cellular sub threshold phenomena) and the spiking activity of pairs of neurons have been reported previously (Abeles, 1982b; Eggermont, 1990; Renart et al., 2010). These may reflect the lower sensitivity of cross-correlation methods for the detection of inhibition (Aertsen and Gerstein, 1985); however, fast inhibition of pyramidal cells by local interneurons has been detected by cross-correlation studies in the cortex (Bartho et al., 2004; Gage et al., 2010). In our view, the lack of evidence for short latency depression of MSN discharge by FSI spikes highlights the non-linearity of the input-output relations of the striatal microcircuits (see further Discussion below).

#### **MSNs AND FSIs ARE ACTIVATED BY A COMMON INPUT**

FSIs project heavily onto MSNs. They are highly sensitive to cortical input and display shorter response latencies than MSNs and were therefore suggested to mediate striatal feed-forward inhibition (Tepper et al., 2008). Our response onset measurements (**Figure 4**, mid column) extend this observation to behaving primates as well.

The MSN-FSI CCHs in this study displayed an asymmetrical broad (∼1 s) peak. This very broad peak might explain the discrepancy between our study and a previous rodent study (Gage et al., 2010) which did not find peaks in 100 ms normalized CCHs. The broad CCH peak likely originated from a common input to both cell types. It remains to be determined whether the cortical projection to the FSIs is distinct to a certain extent from other cortico-striatal projections (Berke, 2011). Our data suggest that adjacent MSNs and FSIs receive similar cortical input. This result is congruent with the claim that FSI feed-forward inhibition expands the dynamical range of afferent input to which the MSNs can respond by setting a threshold for MSN activation that is proportional to stimulus strength (Pouille et al., 2009; Gittis et al., 2010). For FSI feed-forward inhibition to regulate the MSN activity dynamic range, FSIs must spike prior to the MSNs in response to afferent stimulation. The asymmetrical FSI-MSN CCHs found in our study (**Figures 5**, **6** right columns) and the faster onset times of the FSIs (**Figure 4**, 2nd column) meet this requirement and thus support arguments for faster cortical activation of the FSIs.

The asymmetric MSN-FSI correlograms could also resolve the apparent contradiction between *in vitro* and *in vivo* studies and the lack of evidence for mono synaptic inhibition in the MSN-FSI CCHs in the *in vivo* condition. The latency between FSI firing and MSN inhibitory post synaptic current (IPSC) is very short and the result of the IPSC is often a delaying of MSN spiking. The end result could appear to be MSN firing following FSI firing with a delay that appears to be synchronous (on a large time scale) with the FSI discharge. Our results thus imply that the shared cortical drive to both cell types, the faster responses of FSI and the relative delay of MSN discharge are the main processes in striatal microcircuit physiology. Therefore, consistent with growing evidence (Berke, 2011; Gittis et al., 2011), we suggest that FSIs play a complex and detailed role in modulating MSN activity rather than broad and non-specific inhibition.

Finally, another proposed function of the FSI-MSN synapse is in synchronizing the delayed spikes in MSNs. In the future, this could be tested using the Joint-PSTH (JPSTH) method (Aertsen et al., 1989) between two MSNs and an FSI recorded simultaneously, using the spike of the FSI as the JPSTH trigger. The scarcity of FSI recordings and the low discharge rate of striatal MSNs do not enable us (with our current methodological limits) to reliably perform this analysis on our data.

## **CONCLUDING REMARKS**

We presented a differential functional relationship between MSNs and two types of striatal interneurons: TANs, the presumably cholinergic interneurons, and FSIs, the presumably PV expressing GABAergic interneurons. We did not find evidence for direct monosynaptic interactions between the MSNs and either striatal interneuron at the level of cross correlation of their spiking activity. However, the flat CCHs of MSN-TAN pairs contrasted with the asymmetric broadly peaked CCHs of MSN-FSI pairs. This suggests that the two interneuron populations play a different role in modulating MSN activity and striatal information processing (Szydlowski et al., 2013). In this report we do not present any data regarding direct cross-correlations between FSIs and TANs. This is due to the scarcity of such simultaneously recorded pairs in the striatum of behaving primates. Nevertheless, the robust differences between the correlation patterns of MSNs with TANs and FSIs suggest that these striatal interneurons have independent and different functions. Whereas the highly synchronized TANs are likely to have a widespread influence via volume conductance, the less-synchronized FSIs appear to be more involved in spatially constrained feed-forward information processing in the striatal network.

## **ACKNOWLEDGMENTS**

This study was supported in part by the Select and Act FP7 grant, by the Simone and Bernard Guttman chair of Brain Research, and the generous support of the Rosetrees and Dekker foundations (to Hagai Bergman). Avital Adler is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities.

## **REFERENCES**


Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. *J. Neurophysiol.* 92, 600–608. doi: 10.1152/jn.01170.2003


*J. Neurosci.* 28, 11673–11684. doi: 10.1523/JNEUROSCI.3839-08.2008


M. (2010). Fast-spiking interneurons of the rat ventral striatum: temporal coordination of activity with principal cells and responsiveness to reward. *Eur. J. Neurosci.* 32, 494–508. doi: 10.1111/j.1460-9568.2010.07293.x


588–598. doi: 10.1111/j.1460-9568.2008.06598.x


(1996). Neuronal synchronization of tonically active neurons in the striatum of normal and parkinsonian primates. *J. Neurophysiol.* 76, 2083–2088.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 July 2013; accepted: 15 August 2013; published online: 03 September 2013.*

*Citation: Adler A, Katabi S, Finkes I, Prut Y and Bergman H (2013) Different correlation patterns of cholinergic and GABAergic interneurons with striatal projection neurons. Front. Syst. Neurosci. 7:47. doi: 10.3389/fnsys.2013.00047*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Adler, Katabi, Finkes, Prut and Bergman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Simulating the effects of short-term synaptic plasticity on postsynaptic dynamics in the globus pallidus

## *Moran Brody1 and Alon Korngreen1,2\**

*<sup>1</sup> The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan, Israel <sup>2</sup> The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Dieter Jaeger, Emory University, USA Charles J. Wilson, University of Texas at San Antonio, USA*

#### *\*Correspondence:*

*Alon Korngreen, The Mina and Everard Goodman Faculty of Life Sciences, Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan 52900, Israel e-mail: alon.korngreen@biu.ac.il*

The rat globus pallidus (GP) is one of the nuclei of the basal ganglia and plays an important role in a variety of motor and cognitive processes. *In vivo* studies have shown that repetitive stimulation evokes complex modulations of GP activity. *In vitro* and computational studies have suggested that short-term synaptic plasticity (STP) could be one of the underlying mechanisms. The current study used simplified single compartment modeling to explore the possible effect of STP on the activity of GP neurons during low and high frequency stimulation (HFS). To do this we constructed a model of a GP neuron connected to a small network of neurons from the three major input sources to GP neurons: striatum (Str), subthalamic nucleus (STN) and GP collaterals. All synapses were implemented with a kinetic model of STP. The simulations were qualitatively compared to a data set previously recorded *in vitro* in our lab. The *in vitro* responses to low frequency repetitive stimulation were closely reconstructed, including rate changes and locking to the stimulus. This reconstruction mainly involved fast forms of plasticity that have been found at these synapses. Reconstructions of experimental responses to HFS required adding slower forms of plasticity to the STN and GP collateral synapses, as well as adding metabotropic receptors to the STN-GP synapses. These findings suggest the existence of as yet unreported slower short-term dynamics in the GP. The computational model made additional predictions about GP activity during low and high frequency stimulation that may further our understanding of the mechanisms underlying repetitive stimulation of the GP.

**Keywords: network, neuron, short-term plasticity, facilitation, depression, metabotropic receptors, deep brain stimulation, simulation**

## **INTRODUCTION**

The rat globus pallidus (GP), the homolog of the primate and human globus pallidus external segment (GPe), is one of the nuclei of the basal ganglia and plays an important role in a variety of motor and cognitive processes (Kita and Kitai, 1994; Kita, 2007; Sadek et al., 2007; Goldberg and Bergman, 2011). More than 80% of the inputs reaching the GP are inhibitory. Nevertheless, the firing rate of GP neurons increases during various motor actions (Georgopoulos et al., 1983; Mink and Thach, 1991; Gardiner and Kitai, 1992; Turner and Anderson, 1997). One explanation for this seeming contradiction may be the involvement of short-term synaptic depression of the GP inhibitory synapses. Such short-term synaptic depression has been reported for inputs from the striatum to the GP (Str-GP) (Rav-Acha et al., 2005; Sims et al., 2008), for the connections from the subthalamic nucleus to the GP (STN-GP) (Hanson and Jaeger, 2002) and for GP-GP synapses (Sims et al., 2008). It has been suggested that both Str-GP and GP-GP synapses undergo depression during stimulation (Rav-Acha et al., 2005; Sims et al., 2008), while STN-GP synapses display facilitation followed by depression (Hanson and Jaeger, 2002). Simultaneous weakening of inhibitory synapses and strengthening of excitatory synapses could lead to domination of excitatory inputs. This would explain the elevation of firing rate seen during motor actions.

Short-term plasticity (STP) at GP synapses could also explain an *in vitro* data set recently recorded in our lab, in which low frequency stimulation (LFS) locked the firing of GP neurons to the stimulus but had only a mild impact on their firing rate (Bugaysen et al., 2011). In contrast, high frequency stimulation (HFS) generated biphasic modulation of the firing frequency, with inhibitory and excitatory phases. Blocking synaptic transmission showed that these effects result from synaptic activity and not from direct activation of the neurons.

Due to changes in the pattern of GP activity during Parkinson's disease (Nini et al., 1995; Raz et al., 2000; Mallet et al., 2008; Bronfeld et al., 2010; Moran et al., 2012), the GP has become a target for high frequency deep brain stimulation (DBS) to treat the symptoms of the disease (Yelnik et al., 2000; Dostrovsky et al., 2002; Bar-Gad et al., 2004; Vitek et al., 2004). Stimulation of the GP may have a broad impact on the function of the basal ganglia due to its input-output connections, but the physiological effects of DBS of the GP are still unclear. Analyzing its possible effects will further our understanding of GP influence on the basal ganglia.

To shed light on the mechanisms generating GP activity in the normal and pathological state it is important to consider the effects of STP on GP dynamics. These were qualitatively explored here using a simple single compartment model. The model consisted of a GP neuron connected to a small network of neurons from the three major input sources to the GP, striatum, subthalamic nucleus and GP-GP synapses. All synapses were implemented with a kinetic model of STP and the model attempted to reproduce the previously recorded data set (Bugaysen et al., 2011).

#### **METHODS**

#### **SHORT-TERM SYNAPTIC PLASTICITY IN IONOTROPIC SYNAPSES**

The mathematical framework used here to implement STP dynamics has been previously described (Varela et al., 1997). This model faithfully captures physiological results although it pays no attention to the biological mechanisms underlying STP. The postsynaptic amplitude, *A*, depends on three factors: the initial amplitude, *A*<sub>0</sub>, a facilitation variable, *F*, and a depression variable, *D*.

$$A = A_0 F D \tag{1}$$

Both *F* and *D* were initially set to 1.

The depression variable *D* was multiplied with each stimulus by a constant *d* representing the depression following a single action potential:

$$D \to Dd \tag{2}$$

Since *d* ≤ 1, *D* decreased with each action potential. After each stimulus, *D* recovered exponentially to 1 using first order kinetics with time constant τ*D*:

$$
\tau_D \frac{dD}{dt} = 1 - D \tag{3}
$$

The facilitation variable *F* was increased with each stimulus by a constant *f* representing the facilitation following a single action potential:

$$F \to F + f \tag{4}$$

Since *f* ≥ 0, *F* increased with each action potential. *F* was incremented by *f* rather than multiplied by it, since multiplication during HFS caused *F* to grow beyond any biological proportion. After each stimulus *F* recovered exponentially to 1 using first order kinetics with time constant τ*F*:

$$
\tau_F \frac{dF}{dt} = 1 - F \tag{5}
$$
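The update rules of Equations 1–5 can be sketched in a few lines of Python. Between stimuli, Equations 3 and 5 have closed-form exponential solutions, so no numerical integration is needed. The parameter values below are placeholders chosen for illustration and are not the values used in the study; evaluating the amplitude just before applying the per-spike updates is also an assumption of this sketch.

```python
import math

def stp_amplitudes(stim_times, a0=1.0, d=0.8, tau_d=0.5, f=0.2, tau_f=0.1):
    """Postsynaptic amplitude at each stimulus for a depression/facilitation
    model. Times and time constants are in seconds (placeholder values)."""
    D = F = 1.0                       # both variables start at 1
    amplitudes = []
    last = None
    for t in stim_times:
        if last is not None:
            dt = t - last
            # Exponential recovery toward 1 between stimuli (Eqs 3 and 5).
            D = 1.0 - (1.0 - D) * math.exp(-dt / tau_d)
            F = 1.0 + (F - 1.0) * math.exp(-dt / tau_f)
        amplitudes.append(a0 * F * D)  # Eq 1
        D *= d                         # Eq 2: multiplicative depression
        F += f                         # Eq 4: additive facilitation
        last = t
    return amplitudes

# 20 Hz train of four stimuli.
print(stp_amplitudes([0.0, 0.05, 0.10, 0.15]))
```

Setting `f=0.0` gives a purely depressing synapse and `d=1.0` a purely facilitating one, which is a convenient way to check each mechanism in isolation.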

#### **TYPE 1 METABOTROPIC GLUTAMATE RECEPTORS**

The model used to implement type 1 metabotropic glutamate receptors was based on a previous model implementing GABA-B metabotropic receptors (Destexhe et al., 1998). The model is described by the following scheme:

$$R_0 + T \rightleftharpoons R \tag{6}$$

$$
R \rightleftharpoons G \tag{7}
$$

where binding of a neurotransmitter *T* to an inactivated receptor *R*<sub>0</sub> causes the receptor to reach the active state *R*. The receptor activation leads to G-protein activation *G* which represents the increase in membrane conductance. This scheme is given by the following equations:

$$\frac{d[R]}{dt} = Kf_R \ast [T] \ast (1 - R) - Kb_R \ast R \tag{8}$$

$$\frac{d[G]}{dt} = Kf_G \ast R - Kb_G \ast G \tag{9}$$

$$I_i = g_{\max} \ast G \ast (v - E_i) \tag{10}$$

where [*R*] represents the fraction of activated receptors and [*G*] represents the fraction of activated G-proteins. With each action potential, *T* was raised from 0 to a constant value *X* for a duration *t* and then decayed exponentially back to 0 using first order kinetics with time constant τ*T*:

$$
\tau_T \frac{dT}{dt} = -T \tag{11}
$$
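The receptor scheme in Equations (8–11) can be integrated numerically as sketched below. This is an illustrative forward-Euler version, not the mechanism file used in the simulations. The function names are hypothetical, and the rate constants `kf_r`, `kb_r`, `kf_g` and `kb_g` are placeholder values in 1/ms chosen only to make the sketch run; they are not the constants reported for the model. The transmitter pulse height, pulse duration and decay time constant follow the values given in the text.

```python
import math

def simulate_mglur(spike_times_ms, t_end_ms, dt=0.01,
                   kf_r=0.05, kb_r=0.0001, kf_g=0.1, kb_g=0.03,
                   x=0.2, pulse_ms=0.28, tau_t=0.01):
    """Integrate Equations 8, 9 and 11; return the (R, G) traces."""
    spikes = sorted(spike_times_ms)
    next_spike = 0
    pulse_until = float("-inf")
    R = G = T = 0.0
    r_trace, g_trace = [], []
    decay = math.exp(-dt / tau_t)        # exact one-step decay of T (Equation 11)
    for step in range(int(round(t_end_ms / dt))):
        t = step * dt
        # Each action potential starts a transmitter pulse of height x
        while next_spike < len(spikes) and spikes[next_spike] <= t:
            pulse_until = spikes[next_spike] + pulse_ms
            next_spike += 1
        T = x if t < pulse_until else T * decay
        dR = kf_r * T * (1.0 - R) - kb_r * R     # Equation 8
        dG = kf_g * R - kb_g * G                 # Equation 9
        R += dt * dR
        G += dt * dG
        r_trace.append(R)
        g_trace.append(G)
    return r_trace, g_trace

def mglur_current(g_max, G, v, e_rev):
    """Equation 10: current carried by the G-protein-gated conductance."""
    return g_max * G * (v - e_rev)

# Three presynaptic spikes at 40 Hz, simulated for 100 ms
r, g = simulate_mglur([25.0, 50.0, 75.0], t_end_ms=100.0)
```

Because the unbinding rate is slow relative to the pulse duration, *R* accumulates over successive spikes, which is what allows the receptor to outlast the stimulation.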

#### **PARAMETER SENSITIVITY ANALYSIS**

To test the sensitivity of the model to variations in the value of a single parameter the simulation was performed with 100 random values of this parameter sampled from a normal distribution. Afterwards the mean trial was calculated for each parameter, and for every sampled value the distance from the mean trial was estimated using a mean square error (MSE) function:

$$\chi^2 = E\left((\hat{\theta} - \theta)^2\right) = \frac{1}{N} \sum \left(\chi_i^{*} - \chi_i\right)^2 \tag{12}$$
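The distance measure in Equation (12) can be illustrated as follows. This is a sketch under the assumption that each trial is stored as a list of equally sampled values (for example, a binned firing rate trace); the helper names `mean_trial` and `mse` are hypothetical.

```python
def mean_trial(trials):
    """Point-by-point average of a list of equally long traces."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]

def mse(trace, reference):
    """Mean square error between one trace and a reference trace (Equation 12)."""
    if len(trace) != len(reference):
        raise ValueError("traces must have the same length")
    return sum((a - b) ** 2 for a, b in zip(trace, reference)) / len(trace)

# Two toy trials; each is compared against the mean trial
trials = [[1.0, 2.0], [3.0, 4.0]]
reference = mean_trial(trials)
errors = [mse(trial, reference) for trial in trials]
```

A parameter to which the model is insensitive yields similar errors for all sampled values, whereas a sensitive parameter yields errors that grow as the sampled value moves away from the mean.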

#### **DATA ANALYSIS**

All simulations were carried out using NEURON 7.1; data were analyzed with Igor Pro 6.02A (WaveMetrics) and MATLAB R2010a (Mathworks). Firing rate and poststimulus time histogram (PSTH) were calculated for each simulation. Mean firing rate was calculated for each category using 1 s bins. The PSTH was calculated for each frequency by averaging the action potentials in every millisecond of a 100 ms time window after each stimulus. All rate and PSTH results are presented as normalized frequency relative to the prestimulus firing rate under control conditions.
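The PSTH and normalization steps described above can be sketched as follows. This is an illustrative Python version, not the Igor Pro or MATLAB code used for the analysis; the function names are hypothetical, and spike and stimulus times are assumed to be given in milliseconds.

```python
def psth(spike_times_ms, stim_times_ms, window_ms=100, bin_ms=1):
    """Average firing rate (Hz) in each bin of the window following every stimulus."""
    n_bins = window_ms // bin_ms
    counts = [0] * n_bins
    for stim in stim_times_ms:
        for spike in spike_times_ms:
            lag = spike - stim
            if 0 <= lag < window_ms:
                counts[int(lag // bin_ms)] += 1
    # Spikes per stimulus per bin, converted to spikes per second
    scale = 1000.0 / (bin_ms * len(stim_times_ms))
    return [c * scale for c in counts]

def normalize(rates_hz, prestimulus_rate_hz):
    """Express rates relative to the prestimulus firing rate under control conditions."""
    return [r / prestimulus_rate_hz for r in rates_hz]

# Toy example: one spike 5 ms after each of two stimuli
histogram = psth([5.0, 105.0], [0.0, 100.0])
```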

#### **THE MODEL**

The model consisted of one simplified postsynaptic, regular firing, neuron connected to a network of inputs including Str, STN and GP neurons (**Figure 1**). Different neuron types displayed different neuronal firing properties and were connected to the main postsynaptic neuron through synapses with distinctive dynamics. All the cellular and synaptic properties are described below.

#### **CELLULAR PROPERTIES**

All neurons consisted of a single compartment soma (diameter 96 μm). The channel dynamics in the neurons followed a modified Hodgkin–Huxley model (Hodgkin and Huxley, 1952) that was adapted to display regular firing phenotype (Pospischil et al., 2008). This model generated action potentials with amplitude of 92 mV (measured from threshold) and a half-width of 0.9 ms at 36 degrees centigrade.

The sodium conductance of this model obeyed the following equations (Pospischil et al., 2008).

$$\begin{aligned} I_{\text{Na}} &= \overline{g}_{\text{Na}} m^3 h \,(V - E_{\text{Na}})\\ \frac{dm}{dt} &= \alpha_m (V) (1 - m) - \beta_m (V) m\\ \frac{dh}{dt} &= \alpha_h (V) (1 - h) - \beta_h (V) h\\ \alpha_m &= \frac{-0.32 (V - V_T - 13)}{\exp\left[-\left(\frac{V - V_T - 13}{4}\right)\right] - 1} \\ \beta_m &= \frac{0.28 (V - V_T - 40)}{\exp\left[\frac{V - V_T - 40}{5}\right] - 1} \\ \alpha_h &= 0.128 \exp\left[-\frac{V - V_T - 17}{18}\right] \\ \beta_h &= \frac{4}{1 + \exp\left[-\left(\frac{V - V_T - 40}{5}\right)\right]} \end{aligned} \tag{13}$$

The potassium conductance of this model obeyed the following equations (Pospischil et al., 2008).

$$\begin{aligned} I_K &= \overline{g}_K n^4 \,(V - E_K) \\ \frac{dn}{dt} &= \alpha_n(V)(1 - n) - \beta_n(V)n \\ \alpha_n &= \frac{-0.032\left(V - V_T - 15\right)}{\exp\left[-\left(\frac{V - V_T - 15}{5}\right)\right] - 1} \\ \beta_n &= 0.5 \exp\left[-\frac{V - V_T - 10}{40}\right] \end{aligned} \tag{14}$$

The sodium and potassium channel conductances were set to *g*Na = 0.05 S/cm<sup>2</sup> and *g*K = 0.005 S/cm<sup>2</sup> and their reversal potentials to *E*Na = 50 mV and *E*K = −100 mV, respectively. Conductance for the leak current was set to *g*Leak = 0.0001 S/cm<sup>2</sup> and the reversal potential to *E*leak = −65 mV. *V*T adjusts spike threshold and was set in our model to −63 mV. Some of the models presented in Pospischil et al. (2008) contain either a slow potassium current (m-current) to introduce spike frequency adaptation or voltage-gated calcium currents to generate bursting. To keep the model of our postsynaptic neuron as simple as possible these additional channels were not inserted.
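The voltage-dependent rate functions of the sodium and potassium conductances can be written out as a short Python sketch. It is illustrative only and makes two assumptions that should be checked against Pospischil et al. (2008): β*m* is written with a positive exponent and α*n* uses the same 15 mV offset in its numerator and exponent, so that every rate remains non-negative. The helper `vtrap` (a name borrowed from common NEURON usage) handles the removable singularity where numerator and denominator both vanish.

```python
import math

V_T = -63.0   # mV, spike threshold adjustment used in the model

def vtrap(x, y):
    """x / (exp(x / y) - 1), replaced by its series limit near x = 0."""
    if abs(x / y) < 1e-6:
        return y * (1.0 - x / (2.0 * y))
    return x / (math.exp(x / y) - 1.0)

def alpha_m(v): return 0.32 * vtrap(-(v - V_T - 13.0), 4.0)
def beta_m(v):  return 0.28 * vtrap(v - V_T - 40.0, 5.0)
def alpha_h(v): return 0.128 * math.exp(-(v - V_T - 17.0) / 18.0)
def beta_h(v):  return 4.0 / (1.0 + math.exp(-(v - V_T - 40.0) / 5.0))
def alpha_n(v): return 0.032 * vtrap(-(v - V_T - 15.0), 5.0)
def beta_n(v):  return 0.5 * math.exp(-(v - V_T - 10.0) / 40.0)

def steady_state(alpha, beta, v):
    """Steady-state value of a gating variable, alpha / (alpha + beta)."""
    a, b = alpha(v), beta(v)
    return a / (a + b)

# Steady-state sodium activation at the leak reversal potential
m_inf_rest = steady_state(alpha_m, beta_m, -65.0)
```

Plotting `steady_state` for each gate over a range of voltages is a quick way to confirm that activation rises and inactivation falls with depolarization.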

STN neurons display spontaneous firing rate at rest (Nakanishi et al., 1987; Bevan and Wilson, 1999; Do and Bean, 2003) ranging between 5 and 40 Hz (Nakanishi et al., 1987). The simulated STN neurons here were characterized by a 6 Hz firing frequency. Sixteen STN neurons were attached to the main postsynaptic neuron by a glutamatergic synapse. A stimulating electrode was inserted into ten of these neurons (**Table 1**).

GP neurons have also been found to exhibit spontaneous firing at rest (Cooper and Stanford, 2000; Bugaysen et al., 2010); in this model the GP neurons connected to the principal cell were adjusted to display a 13 Hz firing frequency corresponding to our previous recordings (Bugaysen et al., 2010). Twenty-nine GP neurons were attached to the main postsynaptic neuron. A stimulating electrode was inserted into six of these neurons (**Table 1**). Str neurons are quiescent at rest (Delong, 2000). Thus our model included only Str neurons into which a stimulating electrode was inserted. Eight Str neurons were attached to the main postsynaptic neuron and stimulated during the simulation (**Table 1**).

#### **SYNAPTIC PROPERTIES**

STN-GP synapses undergo facilitation followed by fast depression (Hanson and Jaeger, 2002). Additionally, the simulations predicted the involvement of slower synaptic dynamics including augmentation and slow depression. Thus, implementation of these synaptic properties resulted in extension of Equation (1) as follows:

$$A = A_0 \cdot f \cdot d_{\text{fast}} \cdot a \cdot d_{\text{slow}} \tag{15}$$

The fast depression and facilitation values were set at: *d*fast = 0.9, τ*D*fast = 491 ms, *f* = 0.4, τ*F* = 170 ms, based on previous findings (Hanson and Jaeger, 2002), while the augmentation and slow depression were set at: *d*slow = 0.9975, τ*D*slow = 250,000 ms, *a* = 0.03, τ*A* = 8000 ms (**Table 2**). Activation of these synapses produced an EPSP of 0.42 mV.
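To see how the four components of Equation (15) depend on stimulation frequency, one can compute the value that each variable settles to under regular stimulation. The sketch below is a back-of-the-envelope illustration, not part of the published model code. It assumes that depression variables follow Equations (2–3), that augmentation follows the same additive rule as facilitation (Equations 4–5), and that each variable is read out just before a spike; the function names are hypothetical.

```python
import math

def depression_steady_state(d, tau_ms, rate_hz):
    """Fixed point of D -> 1 - (1 - d * D) * exp(-T / tau) for interval T."""
    e = math.exp(-(1000.0 / rate_hz) / tau_ms)
    return (1.0 - e) / (1.0 - d * e)

def additive_steady_state(increment, tau_ms, rate_hz):
    """Fixed point of F -> 1 + (F + increment - 1) * exp(-T / tau)."""
    e = math.exp(-(1000.0 / rate_hz) / tau_ms)
    return 1.0 + increment * e / (1.0 - e)

def relative_amplitude(rate_hz):
    """A / A0 from Equation 15 once all four variables have settled."""
    f_ss = additive_steady_state(0.4, 170.0, rate_hz)           # facilitation
    a_ss = additive_steady_state(0.03, 8000.0, rate_hz)         # augmentation
    d_fast = depression_steady_state(0.9, 491.0, rate_hz)       # fast depression
    d_slow = depression_steady_state(0.9975, 250000.0, rate_hz) # slow depression
    return f_ss * d_fast * a_ss * d_slow

lfs, hfs = relative_amplitude(10.0), relative_amplitude(40.0)
```

Because the slow variables need many time constants to settle, these fixed points describe only the limit of very long stimulation; the transient responses in the simulations depend on the full time course.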





**Table 2 | Synaptic parameters.**


The STN-GP synapses were supplemented with type 1 metabotropic glutamate receptors (mGluRs1) using Equations (8–11), and the rate equations were adjusted with the rate constants *KfR* = 20 ms, *KbR* = 10,000 ms, *KfG* = 10 ms, *KbG* = 30 ms. With each action potential, *T* was set to 0.2 for 0.28 ms and then decayed exponentially to 0 with time constant τ*T* = 0.01 ms. The maximal conductance was *g*max = 0.06 pS.

Sims et al. (2008) reported that GP-GP synapses undergo a minor fast depression. This was not implemented in our model. However, our simulations predicted that this synapse should display slow depression, which was therefore implemented using Equation (1) with the facilitation terms (*f*, τ*F*) omitted. The depression constants were set at *d* = 0.998, τ*D* = 20,000 ms (**Table 2**).

The Str-GP synapses are inhibitory synapses that undergo rapid depression during stimulation (Rav-Acha et al., 2005). These synapses were also implemented using Equation (1) without facilitation. The depression constants for these synapses were set at *d* = 0.998, τ*D* = 20,000 ms, based on Rav-Acha et al. (2005) (**Table 2**).

#### **RESULTS**

STP dynamics reported for Str-GP, STN-GP and GP-GP synapses were implemented in the attempt to qualitatively simulate a previously recorded *in vitro* data set (Bugaysen et al., 2011). Str-GP synapses were characterized by fast depression with a time constant of 600 ms (Rav-Acha et al., 2005) and STN-GP synapses were characterized by facilitation with a time constant of 170 ms, followed by depression with a time constant of 491 ms (Hanson and Jaeger, 2002). Since GP-GP synapses undergo a minor and insignificant depression (Sims et al., 2008), they were not implemented with STP kinetics.

Experimental firing rate changes during repetitive 10 Hz stimulation could be reproduced using these fast time constants (**Figure 2**, left column). Stimulation under control conditions slightly decreased the firing frequency (**Figure 2A**). A blockade of inhibitory synapses equivalent to *in vitro* application of bicuculline caused a marked increase of the spontaneous firing rate and a further increase during stimulation (**Figure 2C**). Blockade of excitatory synapses equivalent to *in vitro* application of APV and CNQX slightly decreased the spontaneous firing rate, while stimulation during the blockade caused an additional decrease. These phenomena were not observed experimentally (**Figure 2E**). Unlike the results obtained with LFS, the model had difficulty in capturing the effects of repetitive 40 Hz stimulation (**Figure 2**, right column). In all three conditions HFS induced a constant change in the firing frequency. This also contrasted with the *in vitro* results that showed complex modulation of the firing rate.

to mean prestimulus frequency under control conditions. Left column shows LFS results under control conditions **(A)**, during blockade of inhibition **(C)** and blockade of excitation **(E)**. Right column shows HFS results under control conditions **(B)**, during blockade of inhibition **(D)** and blockade of excitation **(F)**. Black traces, model results; gray traces, normalized population average recorded from GP neurons *in vitro* (Bugaysen et al., 2011). Horizontal bars indicate the stimulation period. In these experiments inhibition was blocked with 50 μM bicuculline and excitation with 15 μM CNQX and 50 μM APV.

Since the model using fast forms of STP was unable to reproduce all experimental results, slower STP dynamics were introduced into the various synapses. These slow kinetics were set to roughly the order of magnitude of the observed experimental time course. Thus, they may reflect not only STP but also other cellular processes such as channel adaptation or intracellular calcium buildup. Slow depression with a time constant of 20 s was added to GP-GP synapses, while augmentation and a slower depression, with time constants of 8 and 250 s respectively, were added to STN-GP synapses. A previous study showed that Str-GP synapses were completely depressed during HFS (Rav-Acha et al., 2005); thus Str-GP synapses were not included in 40 Hz stimulations and were not implemented with slower forms of STP.

LFS results were not influenced by the new STP dynamics (**Figure 3**, left column), but during HFS most of the results were greatly improved (**Figure 3**, right column). With slower forms of STP the model succeeded in generating complex rate changes resembling the *in vitro* results during 40 Hz stimulation under control conditions (**Figure 3B**). The model faithfully captured the initial decrease in firing frequency but the following increase was smaller than observed experimentally. Results during inhibitory blockade highly resembled the *in vitro* results during stimulation but the model had less success reconstructing the after-stimulation effects (**Figure 3D**). The model failed to improve the results during blockade of excitation (**Figure 3F**). Overall, as the slow STP dynamics in STN-GP and GP-GP synapses significantly improved the model results, the model predicts the existence of slower plasticity dynamics in the GP that have not as yet been observed.

**FIGURE 3 | Model and experimental results during LFS and HFS after adding slow kinetics to GP-GP and STN-GP synapses.** Left column shows LFS results under control conditions **(A)**, under blockade of inhibition **(C)** and blockade of excitation **(E)**. Right column shows HFS results under control conditions **(B)**, under blockade of inhibition **(D)** and blockade of excitation **(F)**. Plotted as in **Figure 1**. Black traces, model results; gray traces, normalized population average recorded from GP neurons *in vitro* (Bugaysen et al., 2011). Horizontal bars indicate the stimulation period. In these experiments inhibition was blocked with 50 μM bicuculline and excitation with 15 μM CNQX and 50 μM APV.

Following the addition of slow synaptic kinetics, the main differences between the model and the experimental results were: (1) the after-stimulation effects (**Figures 3B,D**) and (2) a lower firing rate during stimulation (**Figure 3**). These differences led us to add a mechanism implementing type 1 metabotropic glutamate receptors (mGluRs1). These are among the metabotropic glutamate receptors located on GP neurons (Testa et al., 1994, 1998; Hanson and Smith, 1999; Smith et al., 2001; Marino et al., 2002; Poisik et al., 2003; Kaneda et al., 2007). Activation of mGluRs1 depolarizes the membrane potential of GP neurons during and after stimulation (Poisik et al., 2003; Kaneda et al., 2007). Since STN-GP synapses were the only synapses in the model activated by glutamate, the mGluRs were added only to these synapses.

The addition of mGluRs kinetics improved both LFS and HFS results (**Figure 4**), with the improvement during HFS being more significant. 10 Hz stimulation under control conditions (**Figure 4A**) caused the firing rate to first decrease, but shortly afterwards it returned almost to the prestimulus level. Additionally, unlike the first two versions of the model, in this model firing rate increased slightly at the end of stimulation, an increase also observed experimentally. This version of the model with mGluRs also improved the results with blockade of inhibition and excitation, mainly by elevating the firing frequency during stimulation (**Figures 4C,E**). With 40 Hz stimulation, the results were markedly improved under control conditions (**Figure 4B**); after an initial decrease firing rate increased significantly for tens of milliseconds, remaining elevated after the end of stimulation. Under blockade of inhibition (**Figure 4D**) the mGluRs addition caused the firing rate to decay gradually after the end of the stimulation. Finally, firing rate increased during stimulation under blockade of excitation (**Figure 4F**), returning to almost the prestimulus firing rate. Even though these effects were not identical to the experimental results they showed a similar tendency. However, the induction of mGluRs also resulted in after-stimulation effects not observed experimentally.

Adding the metabotropic receptors was the final step in the construction of our model. **Figure 5** illustrates the conductivity of all synapses during LFS (**Figures 5A–D**) and all synapses except Str-GP synapses during HFS (**Figures 5E–G**), since Str-GP synapses were excluded during 40 Hz stimulation. After reaching the final model configuration we examined its ability to reconstruct rapid changes in firing pattern, including locking to the stimulus. Even though the model was fine-tuned based on the results of slow firing rates, similar rapid firing pattern effects were reproduced and the results under all three conditions included features resembling the experimental results (cf. **Figures 6A–F**). These results consisted of a decrease in firing rate immediately after the stimulus, which later returned to baseline. The recovery of the firing rate in the model was faster than observed experimentally. With blockade of inhibition (**Figure 6B**) the firing rate increased immediately after the stimulus, later decreasing and fluctuating around the prestimulus firing rate, similar to the experimental results (**Figure 6E**).

excitation **(E)**. Right column, HFS results under control conditions **(B)**, during blockade of inhibition **(D)** and blockade of excitation **(F)**. Black traces, model results; gray traces, normalized population average recorded from GP neurons *in vitro* (Bugaysen et al., 2011). In these experiments inhibition was blocked with 50 μM bicuculline and excitation with 15 μM CNQX and 50 μM APV. Plotted as in **Figure 1**. Horizontal bars indicate the stimulation period.

The addition of slow time constants to STN-GP and GP-GP synapses, as well as the addition of metabotropic receptors to STN-GP synapses, had a marked influence on the model results. Thus, the sensitivities of four parameters characterizing the various features of the slow dynamics were tested (see Methods). We examined three parameters that characterize the STN-GP synapses (the augmentation time constant τ*A*, the slow depression time constant τ*D*slow, and the reaction rate constant describing the metabotropic receptor decay, *KbR*) and the depression time constant (τ*D*) characterizing GP-GP synapses. Random values of these parameters were sampled from normal distributions in which the mean for each parameter was its final value in the model and the SD was 20% of the mean for τ*D*slow, *KbR*, τ*D*, and 30% for τ*A*.
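The sampling scheme can be sketched as follows. This is a minimal illustration using Python's standard `random` module: the dictionary keys and the function name are hypothetical labels, the means are the final parameter values quoted in the text (with *KbR* taken as listed), and the fixed seed is an assumption added only to make the example reproducible. No truncation is applied, so rare negative samples are possible.

```python
import random

# Final model values used as distribution means (from the text)
FINAL_VALUES = {
    "tau_A": 8000.0,         # augmentation time constant, STN-GP (ms)
    "tau_D_slow": 250000.0,  # slow depression time constant, STN-GP (ms)
    "Kb_R": 10000.0,         # metabotropic receptor decay constant, as listed
    "tau_D_GPGP": 20000.0,   # depression time constant, GP-GP (ms)
}
# SD as a fraction of the mean: 30% for tau_A, 20% for the others
SD_FRACTION = {"tau_A": 0.30, "tau_D_slow": 0.20, "Kb_R": 0.20, "tau_D_GPGP": 0.20}

def sample_parameter(name, n=100, seed=0):
    """Draw n values of one parameter from its normal distribution."""
    rng = random.Random(seed)
    mean = FINAL_VALUES[name]
    sd = SD_FRACTION[name] * mean
    return [rng.gauss(mean, sd) for _ in range(n)]

samples = {name: sample_parameter(name) for name in FINAL_VALUES}
```

Each sampled value would then be used in one simulation run while the remaining parameters are held at their final values.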

**Figure 7** illustrates the results of the parameter sensitivity analyses for these four parameters. During 10 Hz stimulation the population PSTH results of all parameters exhibited similar errors for the various values sampled (**Figure 7**, top row). This indicated that these parameters mildly influenced firing pattern during LFS. These findings are reasonable since the population PSTH time windows were 100 ms while the slow dynamics time constants ranged from several to hundreds of seconds. Firing rate results during 10 Hz stimulation differed for the four parameters (**Figure 7**, middle row). Changing the values of the augmentation and the slow depression of STN-GP synapses resulted in similar errors for the different sampled values (**Figures 7B,E**), indicating that these parameters had no impact during LFS. That is, these results fit the previous results showing no change in the firing rate during LFS after adding slow time constants to STN-GP synapses (**Figure 3**, left column).

However, changing the reaction rate constant and the GP-GP depression time constant resulted in a greater error, as the sampled values were farther from the mean (**Figures 7H,K**) indicating that the model was sensitive to changes in these parameters. The results for the reaction rate constant of the metabotropic receptors correlated with the addition of these receptors having some influence on the model results during LFS (**Figure 4**, left column).

The results for the depression time constant of the GP-GP synapses showed a marked influence on the model results, even though this kind of influence was not apparent during LFS (**Figure 3**, left column). During 40 Hz stimulation changes in all four parameter values produced greater error, showing a clear influence on the model results (**Figure 7**, bottom row). This fitted the finding that both slower synaptic dynamics and metabotropic receptors greatly improved the model results during HFS (**Figures 3**, **4** right columns).

DBS treatment for Parkinson's disease is only therapeutic at frequencies above 100 Hz (Dostrovsky and Lozano, 2002), while LFS has no effect on the symptoms or even worsens them (Rizzone et al., 2001; Moro et al., 2002). An *in vitro* study showed that the activity of STN neurons is differently modulated by HFS and LFS (Garcia et al., 2003). Thus we tested the response of GP neurons to stimulation at various frequencies to determine whether HFS and LFS also have different effects on GP activity.

No changes were found during stimulation at frequencies up to 10 Hz; **Figure 8A** gives an example during 5 Hz stimulation. In contrast, there were marked changes in firing rate during stimulation at or above 20 Hz. Above 40 Hz the response became biphasic, with an initial decrease in firing rate followed by an increased firing rate that decayed during the stimulation but returned to the mean prestimulus firing rate only tens of seconds after the end of stimulation. **Figure 8B** shows a response to 100 Hz stimulation. To quantify the firing rate differences during LFS and HFS we compared the minimal and maximal firing rates observed with each stimulus frequency (**Figure 8C**). This quantification emphasized that LFS had little influence on the firing rate; up to 10 Hz stimulation elicited no or only small differences between the minimal and the maximal firing rate. In contrast, HFS resulted in large differences which increased with stimulus frequency.
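The quantification in **Figure 8C** amounts to taking the minimum and maximum of the normalized firing rate within the stimulation period for each stimulus frequency. A minimal sketch is given below, assuming the rate traces are stored in 1 s bins and indexed by bin number; the function names are hypothetical and the traces in the example are made-up numbers, not model output.

```python
def min_max_rate(norm_rate, stim_start_bin, stim_end_bin):
    """Return (min, max) of a normalized rate trace during stimulation."""
    window = norm_rate[stim_start_bin:stim_end_bin]
    if not window:
        raise ValueError("stimulation window is empty")
    return min(window), max(window)

def rate_modulation(traces_by_freq, stim_start_bin, stim_end_bin):
    """Summarize every stimulus frequency by its min, max and their difference."""
    summary = {}
    for freq_hz, trace in traces_by_freq.items():
        lo, hi = min_max_rate(trace, stim_start_bin, stim_end_bin)
        summary[freq_hz] = {"min": lo, "max": hi, "range": hi - lo}
    return summary

# Made-up traces: flat at 5 Hz, biphasic at 100 Hz
toy_traces = {5: [1.0, 1.0, 1.0, 1.0], 100: [1.0, 0.5, 1.5, 1.0]}
summary = rate_modulation(toy_traces, stim_start_bin=1, stim_end_bin=3)
```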

## **DISCUSSION**

The main objective of this study was to qualitatively simulate GP activity during low-frequency stimulation (LFS) and high-frequency stimulation (HFS), while emphasizing the importance of STP dynamics at the three main synapses of the GP: synapses with neurons from the striatum (Str-GP), synapses with neurons from the subthalamic nucleus (STN-GP) and synapses between GP neurons (GP-GP). We constructed a model imitating the activity of a simple postsynaptic neuron connected to a small input network.

The model presented in this work aimed at qualitatively investigating the role of STP in the firing of GP neurons. Thus, the main focus of the model was the description of synaptic dynamics while the firing dynamics of the postsynaptic neuron were simulated using a simple Hodgkin-Huxley like model (Pospischil et al., 2008). This simplified postsynaptic model does not take into account the large number of ion channels expressed in GP neurons nor their contribution to slow postsynaptic integration. It is quite likely that the slow time constants we added to the synapse in order to better simulate the response of the neuron to repetitive stimulation may not be a synaptic property but one stemming from postsynaptic channel activity. Indeed both experiments and modeling have shown substantial contribution of calcium activated channels to the firing of GP neurons (Deister et al., 2009). Thus, the current simulations present a set of thought experiments aimed at identification of the possible role of STP in GP firing rather than a full model of the postsynaptic neuron. The first model implemented with the fast STP dynamics reported for Str-GP (Rav-Acha et al., 2005) and STN-GP (Hanson and Jaeger, 2002) synapses failed to reconstruct *in vitro* HFS effects (**Figure 2**) (Bugaysen et al., 2011). This result highlighted the need to address longer time scales when considering the effect of prolonged stimulation of GP neurons. Thus, slower STP dynamics were introduced; depression was added to GP-GP synapse dynamics and both depression and augmentation were added to the STN-GP synapse dynamics. The addition of these synaptic processes greatly improved the model results during HFS (**Figure 3**), suggesting the existence of slower STP dynamics at GP synapses that have not yet been isolated in biological preparations.

Because slow STP dynamics have a great influence during prolonged HFS, it is especially important to consider them in the GP, which serves as a target for high-frequency DBS in the attempt to treat symptoms of Parkinson's disease (Yelnik et al., 2000; Dostrovsky et al., 2002; Bar-Gad et al., 2004; Vitek et al., 2004). Under normal conditions GP neurons do not appear to exhibit correlated activity (Nini et al., 1995; Bar-Gad et al., 2003; Goldberg and Bergman, 2011), but correlation was observed in 20% of paired GP neurons in monkey and rat Parkinson's models (Nini et al., 1995; Mallet et al., 2008; Bronfeld et al., 2010; Goldberg and Bergman, 2011). Synaptic depression can decorrelate neuron activity (Abbott and Regehr, 2004); thus the depression predicted by our model at GP-GP synapses could cause decorrelation during HF-DBS treatments of the GP and STN nuclei. Since both STN DBS (Benabid, 2003) and GP DBS (Vitek et al., 2004) alleviate Parkinson's symptoms, it is tempting to speculate that decorrelation derived from synaptic depression is one of the mechanisms underlying the therapeutic effects of DBS.

Implementation of metabotropic glutamate type 1 receptors, which have been found in GP neurons (Testa et al., 1994, 1998; Hanson and Smith, 1999; Marino et al., 2002), further improved the model results. Their major contribution was to the reconstruction of after-stimulation effects (**Figure 4**). Thus, our model predicts that these after-stimulation effects are due to the activation of mGluRs and not of ionotropic synapses, agreeing with previous studies (Poisik et al., 2003; Kaneda et al., 2007).


All model responses to HFS derived from activation of only STN-GP and GP-GP synapses, without any activation of Str-GP synapses. This implies Str-GP synapses have no impact on GP activity during HFS and, indeed, Rav-Acha et al. (2005) showed that Str-GP synapses were completely depressed above 33 Hz stimulation. In the final configuration of the model STN-GP synapses accounted for 30% of the inputs, Str-GP synapses for 15% and GP-GP synapses for 55%. Histochemical analyses show that STN-GP synapses account for less than 20% of the inputs, 80% of inputs arise from Str-GP synapses and the remainder from GP-GP synapses (Kita and Kitai, 1994; Shink and Smith, 1995; Kita, 2007; Sadek et al., 2007). These discrepancies could result from the properties of the *in vitro* slices from which the physiological data set for reconstruction was obtained. As these brain slices included mainly the GP nucleus, it is plausible that the GP-GP connections in the slice were much better preserved than the other connections. This could have biased the original data set away from the conditions in which the histochemical results were obtained.
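The input percentages quoted for the final model configuration follow from the neuron counts given under Cellular properties (16 STN, 29 GP and 8 Str neurons), assuming that each presynaptic neuron contributes a single input to the postsynaptic cell:

$$
\begin{aligned}
N_{\text{total}} &= 16 + 29 + 8 = 53 \\
\text{STN-GP:}\quad \frac{16}{53} &\approx 0.30 \\
\text{Str-GP:}\quad \frac{8}{53} &\approx 0.15 \\
\text{GP-GP:}\quad \frac{29}{53} &\approx 0.55
\end{aligned}
$$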

A higher baseline firing rate of STN neurons than the 6 Hz frequency used here has been reported in a number of studies (Nakanishi et al., 1987; Beurrier et al., 1999; Do and Bean, 2003; Hallworth and Bevan, 2005). Hanson et al. (2004) found sodium channels near STN boutons along GP dendrites, and these channels amplified the depolarization generated by activation of STN-GP synapses. Elevating the baseline firing rate of STN neurons and/or increasing the EPSP size in STN-GP synapses to fit these studies would probably reduce the number of STN neurons, bringing their number closer to the anatomical results.

Str-GP and GP-GP synapses were both implemented with IPSPs of 0.33 mV. Str-GP synapses are located mainly on GP dendrites, while GP-GP synapses are located only on the soma. Moreover, GP-GP synapses are larger than Str-GP synapses (Falls et al., 1983; Kita, 1994, 2007; Shink and Smith, 1995). Therefore, the IPSPs generated by activation of Str-GP synapses could be smaller than those generated by activation of GP-GP synapses. Smaller IPSPs would most likely increase the number of Str neurons required, bringing it closer to the anatomical results.

Stimulating the model with frequencies from 1 to 100 Hz resulted in different responses to LFS (up to 10 Hz) and HFS (20 Hz and above) (**Figure 8**). Different responses to LFS and HFS have also been observed in the STN (Garcia et al., 2003). DBS treatment for Parkinson's disease is only therapeutic at high frequencies and has no impact at low frequencies (Rizzone et al., 2001; Dostrovsky et al., 2002; Moro et al., 2002). DBS efficiency with HFS may thus result from the strong influence of HFS on the firing rate and firing pattern of both the GP and the STN, changes which are not apparent during LFS. The results of the model predict the existence of new STP dynamics and that GP neurons respond differentially to LFS and HFS. Testing these predictions in biological preparations will further our understanding of GP activity in the normal and pathological state, as well as during DBS treatment in Parkinson's disease.

#### **ACKNOWLEDGMENTS**

This work was supported by the Legacy Heritage Bio-Medical Program of the Israeli Science Foundation to Alon Korngreen (Grant #981/10).

### **REFERENCES**


*J. Neurosci.* 29, 8452–8461. doi: 10.1523/JNEUROSCI.0576-09.2009


neurons through the frequency-dependent activation of postsynaptic GABAA and GABAB receptors. *J. Neurosci.* 25, 6304–6315. doi: 10.1523/JNEUROSCI.0450-05.2005


nonexclusive relation of pallidal discharge to five movement modes. *J. Neurophysiol.* 65, 273–300.


(1997). A quantitative description of short-term plasticity at excitatory synapses in layer 2/3 of rat primary visual cortex. *J. Neurosci.* 17, 7926–7940.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 June 2013; accepted: 22 July 2013; published online: 08 August 2013. Citation: Brody M and Korngreen A (2013) Simulating the effects of short-term synaptic plasticity on postsynaptic dynamics in the globus pallidus. Front. Syst. Neurosci. 7:40. doi: 10.3389/fnsys.2013.00040*

*Copyright © 2013 Brody and Korngreen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning

#### *Pragathi P. Balasubramani1, V. Srinivasa Chakravarthy1\*, Balaraman Ravindran2 and Ahmed A. Moustafa3*

*<sup>1</sup> Department of Biotechnology, Indian Institute of Technology - Madras, Chennai, India*

*<sup>2</sup> Department of Computer Science and Engineering, Indian Institute of Technology - Madras, Chennai, India*

*<sup>3</sup> Foundational Processes of Behaviour Research Concentration, Marcs Institute for Brain and Behaviour & School of Social Sciences and Psychology, University of Western Sydney, Sydney, NSW, Australia*

#### *Edited by:*

*Izhar Bar-Gad, Bar Ilan University, Israel*

#### *Reviewed by:*

*Alon Korngreen, Bar-Ilan University, Israel Robert Schmidt, BrainLinks-BrainTools, Germany*

#### *\*Correspondence:*

*V. Srinivasa Chakravarthy, Computational Neuroscience Laboratory, Department of Biotechnology, Indian Institute of Technology - Madras, Chennai 600036, India e-mail: schakra@iitm.ac.in*

Although empirical and neural studies show that serotonin (5HT) plays many functional roles in the brain, prior computational models mostly focus on its role in behavioral inhibition. In this study, we present a model of risk based decision making in a modified Reinforcement Learning (RL)-framework. The model depicts the roles of dopamine (DA) and serotonin (5HT) in Basal Ganglia (BG). In this model, the DA signal is represented by the temporal difference error (δ), while the 5HT signal is represented by a parameter (α) that controls risk prediction error. This formulation that accommodates both 5HT and DA reconciles some of the diverse roles of 5HT particularly in connection with the BG system. We apply the model to different experimental paradigms used to study the role of 5HT: (1) Risk-sensitive decision making, where 5HT controls risk assessment, (2) Temporal reward prediction, where 5HT controls time-scale of reward prediction, and (3) Reward/Punishment sensitivity, in which the punishment prediction error depends on 5HT levels. Thus the proposed integrated RL model reconciles several existing theories of 5HT and DA in the BG.

#### **Keywords: serotonin, dopamine, basal ganglia, Reinforcement Learning, Risk, Reward, Punishment, Decision Making**

## **INTRODUCTION**

Monoamine neuromodulators such as dopamine, serotonin, norepinephrine and acetylcholine are hailed to be the most promising neural messengers to ensure healthy adaptation to our uncertain environments. Specifically, serotonin (5HT) and dopamine (DA) play important roles in various cognitive processes, including reward and punishment learning (Cools et al., 2011; Rogers, 2011). DA signaling has been linked to reward processing in the brain for a long time (Bertler and Rosengren, 1966). Furthermore, the activity of mesencephalic DA neurons is found to closely resemble the temporal difference (TD) error in Reinforcement Learning (RL) (Schultz, 1998). This TD error represents the difference between the total reward (outcome) that the agent or subject receives at a given state and time and the total predicted reward. The resemblance between the TD error signal and the DA signal served as a starting point for an extensive theoretical and experimental effort to apply concepts of RL to understand the functions of the Basal Ganglia (BG) (Schultz et al., 1997; Sutton and Barto, 1998; Joel et al., 2002; Chakravarthy et al., 2010). This led to the emergence of a framework for understanding BG function in which the DA signal plays a crucial role. Deficiency of this neuromodulator (DA) leads to symptoms observed in neurodegenerative disorders like Parkinson's Disease (Bertler and Rosengren, 1966; Goetz et al., 2001).
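For readers unfamiliar with the TD error, the following sketch shows the standard one-step form, δ = r + γV(s′) − V(s), as described in Sutton and Barto (1998). It is a generic textbook illustration rather than the extended model developed in this article; the function names, the discount factor `gamma` and the `learning_rate` are hypothetical choices for the example.

```python
def td_error(reward, value_current, value_next, gamma=0.9):
    """One-step temporal difference error: received plus discounted predicted
    future reward, minus the reward predicted for the current state."""
    return reward + gamma * value_next - value_current

def td0_update(values, state, next_state, reward, learning_rate=0.1, gamma=0.9):
    """Move the value estimate of `state` a small step along the TD error."""
    delta = td_error(reward, values[state], values[next_state], gamma)
    values[state] += learning_rate * delta
    return delta

# A cue followed by an unexpected reward produces a positive TD error
values = {"cue": 0.0, "end": 0.0}
delta = td0_update(values, "cue", "end", reward=1.0)
```

Once the value of the cue state has converged to the reward it predicts, the same reward produces a TD error near zero, mirroring the reduced DA response to fully predicted rewards.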

#### **THE MULTIPLE FUNCTIONS OF SEROTONIN**

It is well known that dopamine is not the only neuromodulator associated with BG function. Serotonin (5HT) projections to the BG are also known to have an important role in decision making (Rogers, 2011). 5HT is an ancient molecule that exists even in plants (Angiolillo and Vanderkooi, 1996). Through its precursor tryptophan, 5HT is linked to some of the fundamental processes of life itself. Tryptophan-based molecules in plants are crucial for capturing the light energy necessary for glucose metabolism and oxygen production (Angiolillo and Vanderkooi, 1996). Thus, by virtue of its fundamental role in energy conversion, 5HT is integral to mitosis, maturation, and apoptosis. In lower organisms, it modulates feeding behavior and social behaviors such as dominance posture and escape responses (Kravitz, 2000; Azmitia, 2001; Chao et al., 2004). Due to its extended role as a homeostatic regulator in higher animals and in mammals, 5HT is also associated with appetite suppression (Azmitia, 1999; Halford et al., 2005; Gillette, 2006). Furthermore, 5HT plays important roles in anxiety, depression, inhibition, hallucination, attention, fatigue, and mood (Tops et al., 2009; Cools et al., 2011). Increasing the 5HT level leads to decreased punishment prediction, though recent evidence pointing to the role of DA in processing aversive stimuli makes the picture more complicated (So et al., 2009; Boureau and Dayan, 2011). The tendency to pay more attention to negative than to positive experiences or other kinds of information (negative cognitive bias) is found to occur at lower levels of 5HT (Cools et al., 2008; Robinson et al., 2012). 5HT is also known to control the time scale of reward prediction (Tanaka et al., 2007) and to play a role in risk-sensitive behavior (Long et al., 2009; Murphy et al., 2009; Rogers, 2011). Studies found that under conditions of tryptophan depletion, which is known to reduce the brain 5HT level, risky choices are preferred to safer ones in decision making tasks (Long et al., 2009; Murphy et al., 2009; Rogers, 2011). Reports of the 5HT transporter gene influencing risk-based decision making also exist (He et al., 2010; Kuhnen et al., 2013). 5HT is known to influence the non-linearity in risk-based decision making (Kahneman and Tversky, 1979), i.e., risk aversion in the case of gains and risk seeking in the case of losses when presented with choices of equal means (Murphy et al., 2009; Zhong et al., 2009a,b). In summary, 5HT is not only important for behavioral inhibition, but is also related to time scales of reward prediction, risk, anxiety, attention etc., and to non-cognitive functions like energy conversion, apoptosis, feeding, and fatigue.

#### **PRIOR THEORETICAL AND COMPUTATIONAL ABSTRACT MODELS OF SEROTONIN**

It would be interesting to understand and reconcile the roles of DA and 5HT in the BG. Prior abstract models addressing the same quest such as that by Daw et al. (2002) argue that DA signaling plays a role that is complementary to 5HT. It has been suggested that whereas the DA signal responds to appetitive stimuli, 5HT responds to aversive or punitive stimuli (Daw et al., 2002). Unlike computational models that argue for complementary roles of DA and 5HT, empirical studies show that both neuromodulators play cardinal roles in coding the signals associated with the reward (Tops et al., 2009; Cools et al., 2011; Rogers, 2011). Genes that control neurotransmission of both molecules are known to affect processing of both rewarding and aversive stimuli (Cools et al., 2011). Complex interactions between DA and 5HT make it difficult to tease apart precisely the relative roles of the two molecules in reward evaluation. Some subtypes of 5HT receptors facilitate DA release from the midbrain DA releasing sites, while others inhibit them (Alex and Pehek, 2007). In summary, it is clear that the relationship between DA and 5HT is not one of simple complementarity. Both synergistic and opposing interactions exist between these two molecules in the brain (Boureau and Dayan, 2011).

Efforts have been made to elucidate the function of 5HT through abstract modeling. Daw et al. (2002) developed a line of modeling that explores an opponent relationship (Daw et al., 2002; Dayan and Huys, 2008) between DA and 5HT. In an attempt to embed all the four key neuromodulators—DA, 5HT, norepinephrine and acetylcholine—within the framework of RL, Doya (2002) associated 5HT with discount factor, γ , which is a measure of time-scale of reward integration (Doya, 2002; Tanaka et al., 2007). There is no single computational theory that integrates and reconciles the existing computational perspectives of 5HT function in a single framework.

#### **OUR MODEL IN BRIEF**

In this modeling study, we present a model of both 5HT and DA in BG simulated using a modified RL framework. Here, DA represents TD error as in most extant literature of DA signaling and RL (Schultz et al., 1997; Sutton and Barto, 1998), and 5HT controls risk prediction error. Action selection is controlled by the utility function that is a weighted combination of both the value and risk function (Bell, 1995; Preuschoff et al., 2006; D'acremont et al., 2009). In the proposed modified formulation of utility function, the weight of the risk function depends on the sign of the value function and a tradeoff parameter α, which we describe in detail below. Just as value function was thought to be computed in the striatum, we now propose that the utility function is computed in the striatum.

The outline of the paper is as follows: Section Methods describes the model equations. In Section Results, we show that a combination of both value and the risk function for decision making explains the following experiments. The first of these pertains to risk sensitivity in bee foraging (Real, 1981). Here we demonstrate that the proposed 5HT and DA model can simulate this simple neurobiological instance of risk-based decision making. We then show the capability of the model to explain the roles of 5HT in the representative experimental conditions: risk sensitivity in Tryptophan depleted conditions (Long et al., 2009); time-scale of reward prediction (Tanaka et al., 2007); and reward and punishment sensitivity (Cools et al., 2008). We present the discussion on the model and results in Section Discussion. Furthermore in the discussion, we hypothesize that the plausible neural correlates for the risk component are the D1R and the D2R co-expressing medium spiny neurons of the striatum, with serotonin selectively modulating this population of neurons.

### **METHODS**

Along the lines of the utility models described by Bell (1995) and D'acremont et al. (2009), we present here the utility function, *Ut*, as a tradeoff between the expected payoff and the variance of the payoff (the subscript *"t"* refers to time). The original utility formulation used in Bell (1995) and D'acremont et al. (2009) is given by Equation 2.1.

$$U\_t(s, a) = Q\_t(s, a) - \kappa \sqrt{h\_t(s, a)}\tag{2.1}$$

where *Qt* is the expected cumulative reward and *ht* is the risk function or reward variance, for state, *s*, and action, *a*; κ is the risk preference. Note that in Equation 2.1, we represent the state and action explicitly, unlike Bell (1995) and D'acremont et al. (2009).

In classical RL (Sutton and Barto, 1998) terms, following policy, π, the action value function, *Q*, at time *t* of a state, *"s,"* and action, *"a"* may be expressed as (Equation 2.2).

$$Q^{\pi}(s, a) = E\_{\pi}(r\_{t+1} + \gamma r\_{t+2} + \gamma^2 r\_{t+3} + \cdots \mid s\_t = s, \ a\_t = a) \tag{2.2}$$

where *rt* is the reward obtained at time, *t*, and γ is the discount factor (0 <γ < 1). *E*<sup>π</sup> denotes the expectation when action selection is done with policy π. The incremental update for the action value function, *Q* is defined as in Equation 2.3.

$$Q\_{t+1}(s\_t, a\_t) = Q\_t(s\_t, a\_t) + \eta\_Q \delta\_t \tag{2.3}$$

where *st* is the state at time, *t*; *at* is the action performed at time, *t*, and η*<sup>Q</sup>* is the learning rate of the action value function (0 < η*<sup>Q</sup>* < 1). δ*<sup>t</sup>* is the TD error defined by Equation 2.4,

$$\delta\_t = r\_{t+1} + \gamma Q\_t(s\_{t+1}, \ a\_{t+1}) - Q\_t(s\_t, \ a\_t) \tag{2.4}$$

In the case of immediate reward problems, δ*<sup>t</sup>* is defined by Equation 2.5.

$$
\delta\_t = r\_t - Q\_t\left(\mathbf{s}\_t, \ a\_t\right) \tag{2.5}
$$

Similar to the value function, the risk function "*ht*" has an incremental update as defined by Equation 2.6.

$$h\_{t+1}(s\_t, a\_t) = h\_t(s\_t, a\_t) + \eta\_h \xi\_t \tag{2.6}$$

where η*<sup>h</sup>* is the learning rate of the risk function (0 < η*<sup>h</sup>* < 1), and ξ*<sup>t</sup>* is the risk prediction error expressed by Equation 2.7,

$$
\xi\_t = \delta\_t^2 - h\_t(s\_t, a\_t) \tag{2.7}
$$

η*<sup>h</sup>* and η*<sup>Q</sup>* are set to 0.1, and *Qt* and *ht* are set to zero at *t* = 0, for the simulations described below in Sections Risk Sensitivity and Rapid Tryptophan Depletion, Time Scale of Reward Prediction and Serotonin, and Reward/Punishment Prediction Learning and Serotonin.
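The incremental updates of Equations 2.3 and 2.5 to 2.7 can be sketched for the immediate-reward case as follows. This is a minimal illustration and not the authors' implementation: the function name `update_value_and_risk` and the constant reward stream are assumptions made only for this example.

```python
def update_value_and_risk(q, h, reward, eta_q=0.1, eta_h=0.1):
    """One immediate-reward update of value q and risk h (Eqs 2.3, 2.5-2.7)."""
    delta = reward - q          # TD error, Eq 2.5
    xi = delta ** 2 - h         # risk prediction error, Eq 2.7 (pre-update h)
    q_new = q + eta_q * delta   # value update, Eq 2.3
    h_new = h + eta_h * xi      # risk update, Eq 2.6
    return q_new, h_new


# Hypothetical reward stream: a constant reward of 1 on every trial.
q, h = 0.0, 0.0                 # Q and h start at zero, as stated in the text
for _ in range(200):
    q, h = update_value_and_risk(q, h, reward=1.0)
```

With a constant reward the TD error shrinks toward zero, so `q` approaches the reward and `h` decays toward zero. With a variable reward, `h` would instead track a running average of δ².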

We now present a modified form of the utility function by substituting κ = α.*sign*[*Qt*(*st*, *at*)] in (Equation 2.1).

$$U\_t(\mathbf{s}\_t, \ a\_t) = Q\_t(\mathbf{s}\_t, \ a\_t) - \alpha \text{sign}(Q\_t(\mathbf{s}\_t, \ a\_t)) \sqrt{h\_t(\mathbf{s}\_t, \ a\_t)} \tag{2.8}$$

In Equation 2.8, the risk preference includes three components: the "α" term, the "*sign(Qt)*" term, and the risk term √*ht*. The *sign(Qt)* term achieves a familiar feature of human decision making, viz. risk aversion for gains and risk seeking for losses (Kahneman and Tversky, 1979). In other words, for α > 0, when *sign(Qt)* is positive (negative), *Ut* is maximized by minimizing (maximizing) risk. Note that the expected action value *Qt* is positive for gains, i.e., rewards greater than a reward base (= 0), and negative for losses. We associate the 5HT level with α, a constant that controls the relative weighting of action value and risk (Equation 2.8).
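The effect of the *sign(Qt)* term can be illustrated with a short sketch of Equation 2.8. The helper names and the numerical values are hypothetical, and taking sign(0) = 0 is an assumption of this example; the text does not specify that case.

```python
import math


def sign(x):
    """Return -1.0, 0.0 or 1.0 according to the sign of x (sign(0) = 0 assumed)."""
    return float((x > 0) - (x < 0))


def utility(q, h, alpha):
    """Modified utility of Eq 2.8: U = Q - alpha * sign(Q) * sqrt(h)."""
    return q - alpha * sign(q) * math.sqrt(h)


# Hypothetical numbers: identical risk (h = 0.25), values of opposite sign.
u_gain = utility(q=1.0, h=0.25, alpha=1.5)    # risk lowers utility: 1 - 0.75
u_loss = utility(q=-1.0, h=0.25, alpha=1.5)   # risk raises utility: -1 + 0.75
```

For a positive value the risk term is subtracted, so a riskier option is less attractive; for a negative value it is added, so a riskier option is more attractive.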

In this study, action selection is performed using softmax distribution (Sutton and Barto, 1998) generated from the utility. Note that traditionally the distribution generated from the action value is used. The probability, *Pt(a|s)* of selecting an action, *a*, for a state, *s*, at time, *t*, is given by the softmax policy (Equation 2.9).

$$P\_t(a|s) = \frac{\exp(\beta U\_t(s, a))}{\sum\_{i=1}^n \exp(\beta U\_t(s, i))} \tag{2.9}$$

*n* is the total number of actions available at state, *s*, and β is the inverse temperature parameter. Values of β tending to 0 make the actions almost equiprobable and the β tending to ∞ make the softmax action selection identical to greedy action selection.
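A minimal sketch of the softmax policy of Equation 2.9 is given below. The function name `softmax_policy` is hypothetical, and subtracting the maximum utility before exponentiating is an implementation choice for numerical stability that leaves the probabilities unchanged.

```python
import math


def softmax_policy(utilities, beta):
    """Action probabilities of Eq 2.9 for a list of utilities U_t(s, i)."""
    # Shifting by the maximum avoids overflow and does not change the result.
    u_max = max(utilities)
    weights = [math.exp(beta * (u - u_max)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities for two actions.
probs_low_beta = softmax_policy([1.0, 0.0], beta=0.01)   # nearly equiprobable
probs_high_beta = softmax_policy([1.0, 0.0], beta=50.0)  # nearly greedy
```

The two calls illustrate the limits described above: a small β flattens the distribution, while a large β concentrates it on the action with the highest utility.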

#### **RESULTS**

In this section, we apply the model of 5HT and DA in BG (Section Methods) to explain several risk-based decision making phenomena pertaining to BG function.

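- Risk sensitivity in bee foraging (Real, 1981)
- Risk sensitivity and tryptophan depletion (Long et al., 2009)
- Time scale of reward prediction and serotonin (Tanaka et al., 2007)
- Reward/punishment prediction learning and serotonin (Cools et al., 2008)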

The parameters for each experiment are optimized using a genetic algorithm (GA) (Goldberg, 1989); details of the GA option set are given in the Supplementary material.

#### **RISK SENSITIVITY IN BEE FORAGING**

#### *Experiment summary*

In the bee foraging experiment by Real (1981), bees were allowed to choose between flowers of two colors—blue and yellow. Both types of flowers deliver the same amounts of mean reward (nectar) but differ in the reward variance. The experiment showed that bees prefer the less risky flowers i.e., the one with lesser variance in nectar (Real, 1981).

Biogenic amines such as 5HT are found to influence foraging behavior in bees (Schulz and Robinson, 1999; Wagener-Hulme et al., 1999). In particular, the brain levels of dopamine, serotonin, and octopamine are found to be high in foraging bees (Wagener-Hulme et al., 1999). Montague et al. (1995) showed risk aversion in bee foraging using a general predictive learning framework without mentioning DA. They assume a special "subjective utility," which is a non-linear reward function (Montague et al., 1995), to account for the risk sensitivity of the subject. In the foraging problem of Real (1981), bees choose between two flowers that have the same mean reward but differ in risk or reward variance. Therefore, the problem is ideally suited to a risk-based decision making approach. We show that the task can be modeled, without any assumptions about "subjective utility," by using the proposed 5HT-DA model, which has an explicit representation of risk.

#### *Simulation*

We model the above phenomenon of bee foraging using the modified utility function of Section Methods. The foraging problem of Real (1981) is treated as a variation of the stochastic "two-armed bandit" problem (Sutton and Barto, 1998), with no state (*s*) and two actions (*a*). We represent the colors of the flowers ("yellow" and "blue"), which are the only predictors of nectar delivery, as the two arms (viz. the two actions, *a*). The initial series of experimental trials is modeled with all the blue flowers ("no-risk" choice) delivering 1 μl of nectar (reward value *r* = 1), 1/3 of the yellow flowers delivering 3 μl (*r* = 3), and the remaining 2/3 of the yellow flowers containing no nectar at all (*r* = 0) (yellow flowers = "risky" choice). These contingencies are reversed at trial 15 and stay that way until trial 40. Since the task requires only a single decision per trial, we model it as an *immediate reward* problem (Equation 2.5). Hence the δ for any trial *t* is calculated as in Equation 3.1.2.1 and used to update the respective action value by Equation 3.1.2.2.

$$\delta\_t = r\_t - Q\_t(a\_t), \quad a\_t \in \{\text{blue flower}, \text{yellow flower}\} \tag{3.1.2.1}$$

$$Q\_{t+1}(a\_t) = Q\_t(a\_t) + \eta\_Q \delta\_t \tag{3.1.2.2}$$

$$h\_{t+1}(a\_t) = h\_t(a\_t) + \eta\_h \xi\_t \tag{3.1.2.3}$$

$$
\xi\_t = \delta\_t^2 - h\_t(a\_t) \tag{3.1.2.4}
$$

$$U\_t(a\_t) = Q\_t(a\_t) - \alpha \text{sign}(Q\_t(a\_t))\sqrt{h\_t(a\_t)}\tag{3.1.2.5}$$

In our simulation, the expected action value (given by *Q*) for both flowers converges to the same value (= 1). Our model accounts for risk through the variance component of the utility function (Equation 3.1.2.5), represented by "*h*" for each flower (Equations 3.1.2.3, 3.1.2.4), which plays a key role in action selection.

#### *Results*

In the experiment (Real, 1981), most of the bees visited the constant nectar yielding blue flowers initially i.e. they chose a risk-free strategy, but later the choice switched to the yellow flowers, once the yellow became the less risky choice. We observe the same in our simulations too. Risk-aversive behavior being an optimal approach during the positive rewarding scenario, the blue flowers that deliver a steady reward of 1 have higher utility and are preferred over the more variable yellow flowers initially. The situation is reversed after trial 15 when the blue flowers suddenly become risky and the yellow ones become risk-free. Here, the utility of the yellow flowers starts increasing, as expected. Note that the expected action value for both flowers still remains the same, though the utility has changed.

With η*<sup>h</sup>* = 0.051, η*<sup>Q</sup>* = 0.001, α = 1.5 in Equation 3.1.2.5, and β = 10 in Equation 2.9 for the simulation, the proposed model captures the shift in selection in less than 5 trials from the indication of the contingency reversal (red line in the **Figure 1**). Since the value is always non-negative, and α > 0, our model exhibits risk-averse behavior, similar to the bees in the study.
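The preference for the constant flower can be checked with a steady-state calculation. The sketch below is an idealisation and not the trial-by-trial simulation: it assumes that *Q* has converged to the mean reward and *h* to the reward variance of each flower, and the helper names are hypothetical. The values of α and β are those reported above.

```python
import math


def mean_and_variance(outcomes):
    """Mean and variance of a discrete reward distribution [(reward, prob), ...]."""
    mean = sum(r * p for r, p in outcomes)
    var = sum(p * (r - mean) ** 2 for r, p in outcomes)
    return mean, var


def p_first(u_a, u_b, beta):
    """Softmax probability (Eq 2.9) of choosing option a over option b."""
    return 1.0 / (1.0 + math.exp(-beta * (u_a - u_b)))


alpha, beta = 1.5, 10.0                         # values reported in the text
blue = [(1.0, 1.0)]                             # constant reward r = 1
yellow = [(3.0, 1.0 / 3.0), (0.0, 2.0 / 3.0)]   # r = 3 in 1/3 of flowers, else 0

q_blue, h_blue = mean_and_variance(blue)        # mean 1, variance 0
q_yellow, h_yellow = mean_and_variance(yellow)  # mean 1, variance 2

# Eq 3.1.2.5 with sign(Q) = +1, since both values are positive.
u_blue = q_blue - alpha * math.sqrt(h_blue)
u_yellow = q_yellow - alpha * math.sqrt(h_yellow)
p_blue = p_first(u_blue, u_yellow, beta)
```

Under these assumptions both flowers have the same value, but only the variable flower pays a risk penalty, so the softmax policy selects the constant flower almost exclusively before the reversal.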

#### **RISK SENSITIVITY AND RAPID TRYPTOPHAN DEPLETION**

#### *Experiment summary*

Now we show that the above risk-based decision making framework of the 5HT-DA model can also explain the Long et al. (2009) experiment on risk sensitivity under conditions of tryptophan depletion. Their experiment required the monkey to saccade to one of two given targets. One target was associated with a guaranteed juice reward (safe) and the other with a variable juice volume (risky). The monkeys showed a non-linear risk sensitivity toward juice rewards, adopting risk-seeking behavior for small juice rewards and risk-averse behavior for larger ones (Long et al., 2009). When brain 5HT levels were reduced by Rapid Tryptophan Depletion (RTD), the monkeys preferred risky over safer alternatives (Long et al., 2009). Tryptophan is a precursor of 5HT, and therefore a reduction in tryptophan causes a reduction in 5HT.

**FIGURE 1 | Simulation results (Sims), as an average of 1000 instances, and experimental results (Expt) adapted from Real (1981); the red line indicates the contingency reversal.**

#### *Simulation*

The juice rewards *r<sup>j</sup>*, represented in Long et al. (2009) as the open time of the solenoid used to control the juice flow to the mouth of the monkeys, are given in **Table 1**. The non-linearity in risk attitudes exhibited by the monkeys is accounted for in the model by considering a reward base (*r<sup>b</sup>*) that is subtracted from the juice reward (*r<sup>j</sup>*) obtained. The resultant subjective reward (*r*) is treated as the actual immediate reward received by the agent (Equation 3.2.2.1). Subtracting *r<sup>b</sup>* from *r<sup>j</sup>* associates any *r<sup>j</sup>* < *r<sup>b</sup>* with an effect similar to a loss (in the economic sense), and any *r<sup>j</sup>* > *r<sup>b</sup>* with a gain.

$$r = r^j - r^b \tag{3.2.2.1}$$

The reward base (*r<sup>b</sup>*) used in the experiment is 193.2. A separate utility function *Ut* is computed using Equation 2.8 for each pair of state *s* (tabulated in **Table 1**) and action choice *a* (*a* ∈ {*safe target*, *risky target*}). This is also modeled as an *immediate reward* problem, and the subjective reward given by Equation 3.2.2.1 is used in the TD error calculation (Equation 2.5) for the respective (state, action) pair. The action value function is updated over trials using Equation 2.3, and the risk function using Equation 2.6, for each (state, action) pair described above.
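The way the reward base turns small juice rewards into loss-like outcomes can be sketched as follows. This is an idealised steady-state illustration, not the simulation itself: the juice values and the spread of the risky target are hypothetical (the entries of Table 1 are not reproduced here), *Q* and *h* are assumed to have converged to the mean and variance of the subjective reward, and the function name `p_safe` is invented for the example. The values of *r<sup>b</sup>*, α and β are those reported in the text.

```python
import math

R_BASE = 193.2   # reward base r^b reported in the text


def p_safe(juice_mean, spread, alpha, beta=0.044):
    """P(safe) when safe pays juice_mean and risky pays juice_mean +/- spread
    with equal probability (an equal-expected-value choice)."""
    q = juice_mean - R_BASE                 # subjective mean reward, Eq 3.2.2.1
    sgn = (q > 0) - (q < 0)                 # sign(Q)
    u_safe = q                              # zero variance, so U = Q
    u_risky = q - alpha * sgn * spread      # sqrt(h) = spread for this gamble
    return 1.0 / (1.0 + math.exp(-beta * (u_safe - u_risky)))


# Hypothetical juice values on either side of the reward base.
p_gain_baseline = p_safe(juice_mean=250.0, spread=50.0, alpha=1.985)
p_gain_rtd = p_safe(juice_mean=250.0, spread=50.0, alpha=1.658)
p_loss_baseline = p_safe(juice_mean=150.0, spread=50.0, alpha=1.985)
```

In this sketch a juice reward above the reward base gives a positive *Q* and a preference for the safe target, which is weaker for the smaller α associated with RTD, while a juice reward below the reward base gives a negative *Q* and a preference for the risky target.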

#### *Results*

Here we examine the following conditions: (1) overall choice, (2) equal expected value (EEV) and (3) unequal expected value (UEV). In EEV cases, a saccade to either the safe or the risky target offered the same mean reward, as in the first four states (*s*) of **Table 1**. In UEV cases, the mean rewards of the two targets differ, as in the last two states (*s*) of **Table 1**. The optimized 5HT parameter α (used in Equation 2.8) is 1.658 for the RTD condition and 1.985 for the baseline (control) condition. The optimized β used in Equation 2.9 is 0.044. Long et al. (2009) demonstrated a significant reduction in choosing the safe option on lowering the 5HT level in the brain. This was seen irrespective of whether the options had equal or unequal expected value (EEV/UEV). Our simulation results show a similar trend for the EEV and UEV conditions (**Figure 2**: Sims) to that of the experimental results [**Figure 2**: Expt, adapted from Long et al. (2009)]. The classical RL model would fail to account for such a result in the selection of the safe option, especially in the EEV case, where that model would predict equal probability (= 0.5) of selecting the safe and the risky rewards.

**Table 1 | The sample reward schedule adapted from Long et al. (2009).**

#### **TIME SCALE OF REWARD PREDICTION AND SEROTONIN**

#### *Experiment summary*

In this section, we show using the model of Section Methods that the α parameter that represents 5HT is analogous to the time scale of reward integration (γ as in Equation 2.2) as described in the experiment of Tanaka et al. (2007). In order to verify the hypothesis that 5HT corresponds to the discount factor, γ (as in Equation 2.4), Tanaka et al. (2007) conducted an experiment in which subjects performed a multi-step delayed reward choice task in an fMRI scanner. Subjects had to choose between a white square leading to a small early reward and a yellow square leading to a large but delayed reward (Tanaka et al., 2007). They were tested in three conditions: (1) tryptophan depleted, (2) control and (3) excess tryptophan. At the beginning of each trial, subjects were shown two panels consisting of white and yellow squares, respectively. The two panels were occluded by variable numbers of black patches. When the subjects selected one of the panels, a variable number of black patches was removed from the selected panel. When either panel was completely exposed, the reward was provided. One of the panels (yellow) provided a larger reward with a greater delay; the other (white) delivered a smaller reward after a shorter delay. A total of 8 trials were presented to each subject, and the relative time delay ranges set for the white and the yellow panels were (3.75∼11.25 s, 15∼30 s) in four trials, (3.75∼11.25 s, 7.5∼15 s) in two trials, and (1.6∼4.8 s, 15∼30 s) and (1.6∼4.8 s, 7.5∼15 s) in one trial each.

#### *Simulation*

We modeled the above task with the state variable, *s*, representing the number of black patches in a panel and the action, *a*, as choosing any one of the panels. Each simulation time step equals one experimental time step of 2.5 s. The initial numbers of black patches on the white and yellow panels are 18 ± 9 and 72 ± 24, respectively. The number of patches removed varied between trials, and is given for the white panel and the yellow panel as follows (Tanaka et al., 2007): (Ss, Sl) = (6 ± 2, 8 ± 2) in 4 trials, (6 ± 2, 16 ± 2) in 2 trials, and (14 ± 2, 8 ± 2) and (14 ± 2, 16 ± 2) in the remaining 2 trials, respectively. The above 8 trials are repeated for all three tryptophan conditions, viz. depleted, control and excess. Finally, the reward associated with the white panel is *r* = 1 and that associated with the yellow panel is *r* = 4. Since there is a delay in receiving the reward, the TD error formulation of Equation 3.3.2.1 is used for updating the value of the states (denoting the discounted expectation of reward from a particular number of patches in a panel). The action of removing certain patches from a panel leads to another resultant state with a reduced number of patches. Hence at any particular "*t*" the resultant states of the white and yellow panels are compared for action selection. While the value function is updated using Equation 3.3.2.2, the risk function is updated as in Equations 3.3.2.3 and 3.3.2.4. The agent is then made to choose between the utility functions (Equation 3.3.2.5) of the two panels at time *t*. Eventually, the panel that is completely exposed is labeled as selected for that trial.

**FIGURE 2 |** … any condition did not reject the null hypothesis, which proposes no difference between means, with *P* value > 0.05. Here the experimental results are adapted from Long et al. (2009).

$$\delta\_t = r\_{t+1} + \gamma Q\_t(s\_{t+1}) - Q\_t(s\_t) \tag{3.3.2.1}$$

$$Q\_{t+1}(s\_t) = Q\_t(s\_t) + \eta\_Q \delta\_t \tag{3.3.2.2}$$

$$h\_{t+1}(s\_t) = h\_t(s\_t) + \eta\_h \xi\_t \tag{3.3.2.3}$$

$$
\xi\_t = \delta\_t^2 - h\_t(s\_t) \tag{3.3.2.4}
$$

$$U\_t(s\_t) = Q\_t(s\_t) - \alpha \operatorname{sign}(Q\_t(s\_t)) \sqrt{h\_t(s\_t)} \tag{3.3.2.5}$$
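The role of the discount factor in Equations 3.3.2.1 and 3.3.2.2 can be illustrated on a simplified, deterministic version of the task. The sketch below is not the simulation described above: it omits the risk term and the random number of patches removed, the numbers of steps to reward are hypothetical, and the function names are invented for the example. The rewards *r* = 1 and *r* = 4 are those given in the text.

```python
def td_chain_values(n_steps, reward, gamma, eta_q=0.1, sweeps=2000):
    """TD(0) values on a deterministic chain that pays `reward` after n_steps.

    q[k] is the value of the state that is k transitions away from the reward;
    q[0] is the terminal state and stays at zero.
    """
    q = [0.0] * (n_steps + 1)
    for _ in range(sweeps):
        for k in range(n_steps, 0, -1):
            r_next = reward if k == 1 else 0.0
            delta = r_next + gamma * q[k - 1] - q[k]   # Eq 3.3.2.1
            q[k] += eta_q * delta                      # Eq 3.3.2.2
    return q


def prefers_delayed(gamma, white_steps=2, yellow_steps=6):
    """Compare start-state values: white pays 1 early, yellow pays 4 late."""
    q_white = td_chain_values(white_steps, reward=1.0, gamma=gamma)[white_steps]
    q_yellow = td_chain_values(yellow_steps, reward=4.0, gamma=gamma)[yellow_steps]
    return q_yellow > q_white
```

In this deterministic setting the learnt value of a state *k* transitions before the reward approaches γ<sup>*k*−1</sup>*r*, so a small γ favors the early small reward and a large γ favors the delayed large reward. Because the TD error vanishes at convergence, the risk term is zero here and the utility equals the value.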

#### *Results*

In **Figure 3A**, for sample values of γ = (0.5, 0.6, 0.7) used in Equation 3.3.2.1, the probability of selecting larger reward is plotted as a function of α. Note that for constant γ , the probability of selecting delayed reward increases with α. The β used to report the **Figure 3** is 20. The change of value (*Q*) and risk function (*h*) as a function of the *states, s* (# of black patches) of each panel is shown in Supplementary material for various values of γ . If α is interpreted as 5HT level, delayed deterministic reward choices are favored at higher 5HT levels. Thus α in our model effectively captures the role of γ in the experiment of Tanaka et al. (2007) for functionally representing the action of 5HT in the striatum of BG. In addition, a trend of increasing differences between the utilities of the yellow and the white panels as a function of the state, *st*, could be seen on increasing the value of α (**Figure 3B**). This is similar to the increasing differences of value functions for states, *st*, between the yellow and white panels on increasing the value of γ (**Figure 3B**, Supplementary material). These differences in values / utilities are of prime importance for deciding the exploration/exploitation type of behavior by any policy such as that in Equation 2.9.

Our goal in the Section Time Scale of Reward Prediction and Serotonin is to relate our model's serotonin correlate (α in Equation 2.8) to that proposed in experiment of Tanaka et al. (2007) (γ as in Equation 2.2) in striatum. The differential activity of striatum observed in fMRI of the subjects in different tryptophan conditions was indeed modeled in Tanaka et al. (2007) via value functions (Equations 2.2–2.3) with different γ values.

**FIGURE 3 | (A)** Selection of the long term reward as a function of α. Increasing γ increased the frequency of selecting the larger and more delayed reward. Increasing α also gave similar results for a fixed γ . **(B)** Differences in the utilities (*U*) between the yellow and white panels averaged across trials for the states, *st*, as a function of γ and α. Here *N* = 2000.

Specifically, the value generated by a lower (higher) γ value better modeled the striatal activity following tryptophan depletion (excess tryptophan). An increase in γ results in a value distribution which, when expressed with a particular value of β (Equation 2.9), would increase the probability of selecting the delayed but larger rewards (Sutton and Barto, 1998). Note that the subjects in Tanaka et al. (2007) show no great preference for one action over the other, though the striatal activity levels in subjects show sensitivity to γ values. This could be because action selection is not singularly influenced by the striatum and is probably influenced by downstream structures like GPi (Globus Pallidus interna), or parallel structures like STN (SubThalamic Nucleus) and GPe (Globus Pallidus externa) (Chakravarthy et al., 2010). Doya (2002) suggested that the randomness in action selection, which has been parametrized by β (Equation 2.9) in RL models, can be correlated with the effect of norepinephrine on the pallidum. Thus for sufficiently small β, it is possible to obtain equal probability of action selection, though the corresponding utilities might be sufficiently different. The focus of this section is to draw analogies between the discount parameter γ of classical RL models and the α parameter in our utility-based model, as substrates for *5HT function in striatum*.

#### **REWARD/PUNISHMENT PREDICTION LEARNING AND SEROTONIN**

#### *Experiment summary*

The ability to differentially learn and update action selection from reward and punishment feedback is shown to change on altering the tryptophan levels in subjects. We model a deterministic reversal learning task (Cools et al., 2008; Robinson et al., 2012) in which the subjects were presented with two stimuli, one associated with reward and the other with punishment. On each trial, the subjects had to predict whether the highlighted stimulus would lead to a reward or a punishment. The subjects were tested under either balanced or depleted tryptophan levels (drink) on their association of the stimulus with the corresponding action at any time. Erroneous trials were followed by the same stimulus until it had been predicted correctly by the subject, and the same is adopted in the simulations. Trials were grouped into blocks. Each subject performed 4 experimental blocks, which were preceded by a practice block to familiarize the subject with the task. Each experimental block consisted of an acquisition stage followed by a variable number of reversal stages. One of two possible experimental conditions was applied to each block. The experimental conditions were the unexpected reward (punishment) condition, in which a stimulus previously associated with punishment (reward) becomes rewarding (punishing). Since there were 4 blocks of trials, there were two blocks for each condition. Performance of the subjects in the non-reversal trials was evaluated as a function of (a) drink and condition (unexpected reward/unexpected punishment), and (b) drink and outcome (reward/punishment) trial type. Results showed that performance did not vary significantly with condition in either the balanced or the tryptophan-depleted case. Errors were fewer for tryptophan-depleted cases than for balanced cases in both conditions. Specifically, errors were fewer for punishment-prediction trials than for reward-prediction trials in tryptophan-depleted cases. Thus the experiment suggests that tryptophan depletion selectively enhances punishment prediction relative to reward prediction. Please refer to Cools et al. (2008) for a detailed explanation of the experimental setup and results.

#### *Simulation*

We model the two stimuli as states, *s* (*s* ∈ {*s*<sub>1</sub>, *s*<sub>2</sub>}), and the response of associating a stimulus with reward or punishment as the action, *a* (*a* ∈ {*a*<sub>1</sub> = *reward*, *a*<sub>2</sub> = *punishment*}). At any particular trial, *t*, the rewarding association is coded by *rt* = +1, and the punitive association is coded by *rt* = −1. This is treated as an immediate reward problem and the TD error calculation in Equation 2.5 is used. As in the experiments, three types of trials are simulated: non-reversal trials, in which the association of a stimulus-response pair is learnt; reversal trials, in which the change of the learnt association is triggered; and switch trials, in which the reversed associations are tested following the reversal trials. The setup is similar to that of the experiment: the maximum number of reversal stages per experimental block is 16, with each stage continuing until the number of correct responses falls in the range 5–9. The block terminates automatically after 120 trials. There are two blocks in each condition, and hence a total of 480 trials (4 blocks) per agent. The design of the experiment has an inbuilt complementarity in the association of the actions with a particular stimulus (increasing the action value of *a*<sub>1</sub> for a stimulus, *s*, decreases that of *a*<sub>2</sub> for *s*) and in that of the stimuli with a particular action (increasing the action value of *s*<sub>1</sub> for *a* decreases that of *s*<sub>2</sub> for *a*). Hence in the simulations, the action values [*Qt*(*st*, *at*) as in Equation 2.3] of the two actions [*Q*(*s*, *a*<sub>1</sub>) and *Q*(*s*, *a*<sub>2</sub>)] for any particular state *s* are constrained to be complementary (Equation 3.4.2.1) at any trial "*t*."

$$Q(s,\ a\_1) = -Q(s,\ a\_2) \tag{3.4.2.1}$$

The action values of the two stimuli, *s*, [*Q*(*s*<sub>1</sub>, *a*) and *Q*(*s*<sub>2</sub>, *a*)] mapped to the same action, *a*, are also complementary (Equation 3.4.2.2) at any trial "*t*."

$$Q(s\_1, a) = -Q(s\_2, a) \tag{3.4.2.2}$$

Hence, only one of the four value functions [*Q*(*s*<sub>1</sub>, *a*<sub>1</sub>), *Q*(*s*<sub>1</sub>, *a*<sub>2</sub>), *Q*(*s*<sub>2</sub>, *a*<sub>1</sub>), *Q*(*s*<sub>2</sub>, *a*<sub>2</sub>)] is learnt by training, while the other three are set by the complementarity rules to capture the experimental design. We assume that such a complementarity could be learnt during the initial practice block that facilitated familiarity. The action (response) is selected by executing the policy of Equation 2.9, with β optimized to 10, on the utilities (Equation 2.8) of the two responses (*a*) for any given stimulus (*s*) at a trial (*t*). The corresponding risk functions are given by Equation 2.6.
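The complementarity rules of Equations 3.4.2.1 and 3.4.2.2 can be sketched as a function that fills the whole action value table from a single learnt entry. The choice of *Q*(*s*<sub>1</sub>, *a*<sub>1</sub>) as the learnt entry, the function name `full_q_table` and the numerical value are assumptions made for this illustration.

```python
def full_q_table(q_s1_a1):
    """Fill all four action values from Q(s1, a1) using Eqs 3.4.2.1 and 3.4.2.2."""
    return {
        ("s1", "a1"): q_s1_a1,
        ("s1", "a2"): -q_s1_a1,   # Eq 3.4.2.1 applied to state s1
        ("s2", "a1"): -q_s1_a1,   # Eq 3.4.2.2 applied to action a1
        ("s2", "a2"): q_s1_a1,    # either rule applied once more
    }


q_table = full_q_table(0.4)       # hypothetical learnt value
```

Applying both rules in sequence returns the original value, so the table is consistent whichever rule is used to derive *Q*(*s*<sub>2</sub>, *a*<sub>2</sub>).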

#### *Results*

In the non-reversal trials, all the errors with respect to the drink and the condition (viz., unexpected reward and unexpected punishment) are shown in **Figure 5**. The errors with respect to the drink and the outcome (viz., reward and punishment prediction errors) in both conditions are shown in **Figure 4**. Our results (**Figure 4**: sims values) show that the reward prediction error in the simulations does not vary much from the balanced condition (optimized α = 0.5, representing control tryptophan) to the tryptophan-depleted condition (optimized α = 0.3), but the punishment prediction error decreases, thereby matching the experimental results [**Figure 4**: expt values adapted from Cools et al. (2008)]. The errors in unexpectedly rewarding and punitive trials are the same in both the balanced and tryptophan-depleted cases (**Figure 5**: sims values), again matching the experiment [**Figure 5**: expt values adapted from Cools et al. (2008)]. Therefore, increased 5HT levels in the balanced condition are seen to promote the inhibition of responses to punishing outcomes, as proposed by Cools et al. (2008). Reducing 5HT via tryptophan depletion then removes this inhibition. We can see a similar result in **Figures 4**, **5**, which depict the balanced (α = 0.5) and tryptophan-depleted (α = 0.3) conditions. The *Sign*(*Qt*) term in Equation 3.3.2.5 plays a crucial role in this differential response to gains (rewards) and losses (punishments) (an analysis of the results on removing the *Sign*(*Qt*) term is provided in the Supplementary material). As the data are in the form of counts, the errors are reported as SQRT (error counts) (Cools et al., 2008) in **Figures 4**, **5**.

#### **DISCUSSION**

#### **MAIN FINDINGS OF THE MODEL**

The Reinforcement Learning framework has been used extensively to model the function of the basal ganglia (Frank et al., 2007; Chakravarthy et al., 2010; Krishnan et al., 2011; Kalva et al., 2012). The starting point of our model was to understand the contributions of serotonin to BG function (Tanaka et al., 2009; Boureau and Dayan, 2011). We use the notion of risk, since serotonin has been shown to be associated with risk sensitivity. Some instances are as follows. When choices with risky and safe rewards are presented, a reduction of central serotonin levels favors the selection of risky choices compared to baseline levels (Long et al., 2009). The non-linearity in risk-based decision making (risk aversion in the case of gains and risk seeking in the case of losses) is postulated to be affected by central serotonin levels (Murphy et al., 2009). Negative affective behavior such as depression and anxiety, and other behavior such as impulsivity, caused by the reduction of central serotonin levels, is argued to be risky choice selection in a risk-based decision making framework (Dayan and Huys, 2008). Based on the putative link between serotonin function and risk sensitivity, we have extended the classical RL approach of policy execution by using the utility function (Equation 2.8) instead of the value function. The utility function combines the value function with the risk function. We propose that the weightage (α) that combines value and risk in the utility function represents serotonin (5HT) functioning in the BG. Using this formulation, we show that three different experimental paradigms instantiating diverse theories of serotonin function in the BG can be explained under a single framework.

The proposed model is applied to different experimental paradigms. The first is a bee foraging task in which bees choose between yellow and blue flowers based on the associated risk (Real, 1981). The proposed model is applied to this simple instance of risk-based decision making, though the experiment does not particularly relate to DA and 5HT signaling. The risk sensitivity reported in the bee foraging experiment is predicted accurately by our model (for α = 1).

**function of "α" and outcome trial type; "α = 0.5" (balanced) and "α = 0.3" (Tryptophan depletion).** Error bars represent standard errors of the difference as a function of "α" in simulation for size "*N*" = 100 (Sims).

Next we model experiments dealing with various functions of 5HT. One such experiment links 5HT levels to risky behavior. Long et al. (2009) and Murphy et al. (2009) associate 5HT levels with non-linear risk sensitivity in gains and losses. In our study, we model a classic experiment by Long et al. (2009) describing the risk sensitivity of monkeys on depleting 5HT levels. With our model, the increased risk-seeking behavior in the RTD condition is captured with parameter α = 1.658 and the baseline condition with α = 1.985. This result shows that our model's 5HT-correlate "α" can control risk sensitivity.

The third experiment is a reward prediction problem (Tanaka et al., 2009) associating 5HT with the time scale of prediction. Herein the subjects chose between a smaller short-term reward and a larger long-term reward. Our modeling results show that for a fixed γ, increasing α increases the probability of choosing the larger, long-term reward. Since higher α denotes a higher 5HT level, the model corroborates the experimental result, suggesting that our model's 5HT-correlate "α" behaves similarly to the time scale of reward prediction.

Finally the fourth experiment is to show the differential effect of 5HT on the sensitivity to reward and punishment prediction errors. Under conditions of balanced 5HT (α = 0.5), the model is less sensitive to punishment and commits more errors in predicting punishment; this trend is rectified in depleted 5HT (α = 0.3) condition. For numerical analysis of reward and punishment prediction error, the experiment by Cools et al. (2008) did not take the acquisition trials into consideration. However, these trials serve to learn the initial association between stimulus and response. They also act as a base for the forthcoming reversal and switch trials and are hence taken into analysis in our simulation. This differential effect shown by the model 5HT-correlate "α" toward punishment corroborates the experimental evidence linking 5HT to adverse behavior exhibited in psychological disorders like depression and anxiety (Cools et al., 2008, 2011; Boureau and Dayan, 2011).

Simulation results thus show that the proposed model of 5HT function in BG reconciles three diverse existing theories on the subject: (1) risk-based decision making, (2) time-scale of reward prediction and (3) punishment sensitivity. To our knowledge this is the first model that can reconcile the diverse roles of serotonin under a simple and single framework.

#### **SIGNIFICANCE OF SIGN(Qt)**

The *sign(Qt)* term presented in the modified formulation of the utility function (Equation 2.8) denotes the preference for risk in a given context of the experiment. At high mean reward values humans are found to be risk-averse, whereas at low mean reward values they are risk-seeking (Kahneman and Tversky, 1979). In neuroeconomic experiments, this risk preference is statistically determined, for example, by maximizing the log likelihood of the decisions (D'acremont et al., 2009). Though this method estimates the risk preference subjectively, it is derived from decisions made throughout the experiment. The use of *sign(Qt)* in our model takes into account the variation of the subjective risk preference according to the expected cumulative reward outcomes observed *within* an experiment. The significance of this term in the formula of the modified utility (Equation 2.8) can be seen from the Supplementary material, which presents the results of simulating the experiment by Cools et al. (2008) with an altered model having no *sign(Qt)* term in the utility function of Equation (2.8). In that altered model the mean number of errors does not vary as a function of both trial type and condition for different values of "α," contrary to what happens in the experiment. Thus the *sign(Qt)* term is essential for simulating the results of Cools et al. (2008). Such nonlinear risk sensitivity has been shown to be modulated by 5HT in various experiments (Long et al., 2009; Murphy et al., 2009), which further strengthens our proposal of introducing the term *sign(Qt)* in Equation (2.8).
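The role of the sign term can be made concrete with a small Python sketch. It assumes the utility has the form U = Q − α·sign(Q)·√h, which is our reading of the description of Equation 2.8 in this section; the numerical values of Q and h are purely illustrative, and dropping the sign factor is one possible interpretation of the altered model.

```python
import math

def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def utility(q, h, alpha, use_sign=True):
    """Utility as value minus a risk term weighted by alpha.

    Assumed form: U = Q - alpha * sign(Q) * sqrt(h), where h is the
    risk (variance) estimate. With use_sign=False the sign(Q) factor
    is dropped, as one reading of the altered model in the text.
    """
    s = sign(q) if use_sign else 1
    return q - alpha * s * math.sqrt(h)

alpha = 0.5  # balanced condition quoted in the text
# Illustrative options: a safe one (h = 0) and a risky one (h = 0.25).
gain_safe, gain_risky = utility(1.0, 0.0, alpha), utility(1.0, 0.25, alpha)
loss_safe, loss_risky = utility(-1.0, 0.0, alpha), utility(-1.0, 0.25, alpha)
loss_risky_nosign = utility(-1.0, 0.25, alpha, use_sign=False)
```

Under these assumptions the risky option has lower utility than the safe one for gains (risk aversion) but higher utility for losses (risk seeking). Without the sign factor the risky option is penalized in both cases, so the asymmetry between gains and losses disappears.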

#### **5HT-DA INTERACTION IN THE "RISK" COMPONENT OF DECISION MAKING**

The risk part of the utility function (Equation 2.8) has three components: α, *sign(Qt)*, and √*ht*. While "α" represents 5HT, the remaining two components are dependent on "δ" or DA. Thus the proposed model of risk computation postulates a complex interaction between DA and 5HT. In neurobiology, complex interactions are indeed seen to exist between DA and 5HT (Di Matteo et al., 2008a,b) at the cellular level that are not detailed in the present abstract model. The 5HT afferents from the dorsal raphe nucleus differentially modulate the DA neurons in SNc and ventral tegmental area (VTA) (Gervais and Rouillard, 2000). The 5HT projections act via specific receptor subtypes in the DA neurons. The action of 5HT 1A, 5HT 1B, 5HT 2A, 5HT 3, and 5HT 4 agonists facilitates dopaminergic release, whereas 5HT 2C agonists inhibit it. Selective serotonin reuptake inhibitors are known to reduce the spontaneous activity of DA neurons in VTA (Di Mascio et al., 1998; Alex and Pehek, 2007; Di Giovanni et al., 2008). The 5HT neurons in the dorsal raphe nucleus also receive dense DA innervation from midbrain DA neurons (Ferre et al., 1994) and express D2R (Suzuki et al., 1998).

#### **CONTRIBUTIONS FROM EXISTING MODELS**

Previous models of 5HT seem to focus on individual functions of 5HT in isolation without reconciling them in a single framework. Most of them consider 5HT as a neuromodulator mediating aversive outcomes (Daw et al., 2002; Boureau and Dayan, 2011; Cools et al., 2011). Some describe 5HT as a controller of the time scale in the prediction of rewards (Tanaka et al., 2007), and as a modulator that associates aversive outcomes with past actions (Tanaka et al., 2009). Psychological disorders associated with lowered 5HT levels, such as impulsivity and negative moods, have also been studied with the existing models of 5HT. They attribute impulsivity to increased short-term reward prediction (Tanaka et al., 2007) and negative moods to increased punishment sensitivity (Cools et al., 2011; Robinson et al., 2012). Such observations may then be captured in our model by assessing the risk involved in the task and by controlling the "α" (5HT) parameter.

#### **STUDY PREDICTIONS AND FUTURE WORK**

Our proposed unified model is an abstract mathematical model, aimed at explaining a range of behavioral effects of 5HT. It is only a preliminary model that uses a modified RL framework and explains the role of 5HT and DA in the BG. It focuses mainly on risk computation and the role of nigrostriatal DA signal in shaping the learning of risk and value in BG. Ideally, a convincing model of utility computation in BG should go beyond the 5HT-DA interaction in the abstract representation of the value and the risk quantities and demonstrate how the utility computation would be carried out by the neurobiological correlates in BG.

In classical Actor-Critic approaches to modeling BG function, value computation is thought to occur in striatum (Joel et al., 2002). There is evidence from functional imaging that supports this theory (O'doherty et al., 2006). There is strong evidence for the existence of DA-modulated plasticity in corticostriatal connections, an effect that is necessary to account for value computation in the medium spiny neurons (MSNs) of striatum (see review by Kötter and Wickens, 1998). The idea that MSNs are probably cellular substrates for value computation has found its place in recent modeling literature (Morita et al., 2012).

Starting from the fact that the effect of DA on the D1-expressing MSNs of the striatum is to increase the firing rate (by having an increasing gain as a function of δ), it has been shown in a computational model of BG that these D1-expressing MSNs are capable of computing value (Krishnan et al., 2011). Just as D1R-expressing MSNs are thought to be cellular substrates for value computation in the striatum (Kötter and Wickens, 1998; O'doherty et al., 2006; Krishnan et al., 2011; Morita et al., 2012), we propose that D1D2-coexpressing MSNs can be the cellular correlates for risk computation. We have already developed a network model of BG in which risk is computed by D1D2-coexpressing neurons in the striatum, while value is computed by D1-expressing medium spiny neurons (unpublished). Just as neurons that compute the value function (Equations 2.3–2.4) require a monotonically increasing gain as a function of δ in the MSNs, the risk function (Equations 2.6–2.7) would require a "U-shaped" gain function as a function of δ. It is plausible that such risk-type gain functions would be exhibited by neurons that coexpress both the D1-like gain function that increases with δ *and* the D2-like gain function that decreases with δ (Servan-Schreiber et al., 1990; Moyer et al., 2007; Thurley et al., 2008; Humphries et al., 2009). Interestingly, about 59% of neurons in Globus Pallidus and 20–30% in ventral striatum coexpress D1R and D2R (Perreault et al., 2010). Even among the MSNs of the striatum, the proportion of D1R-D2R co-expressing neurons is greater in ventral striatal MSNs (17% in shell) than in dorsal striatum (5%) (Surmeier et al., 1996; Bertran-Gonzalez et al., 2008). Some studies also point out that around 70% of the MSNs in striatum coexpress the D1 and the D2 type receptors (Surmeier et al., 1996).
The ventral striatum also mediates risk sensitivity in action selection (Stopper and Floresco, 2011), the latencies of response, and the sensitivity to the magnitude of the rewards (Acheson et al., 2006; Floresco et al., 2006). This encourages us to predict a link between the risk-based functioning of the ventral striatum and the significant presence of the co-expressing D1R-D2R neurons here. We would also like to explore the plausibility of the functioning of D1R-D2R co-expressing neurons to the computation of the risk function and the selective modulation of serotonin on these risk computing neurons in future. We predict therefore that selective loss of these co-expressing neurons would make the subject less sensitive to the risk component of the environment.
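The idea that combining a D1-like and a D2-like gain can yield a "U-shaped" gain may be sketched in Python as follows. The sigmoidal form, the slope, and the offset are hypothetical choices made only for illustration; they are not taken from the network model mentioned above.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def d1_gain(delta, slope=4.0, offset=1.0):
    """D1-like gain: increases monotonically with delta."""
    return sigmoid(slope * (delta - offset))

def d2_gain(delta, slope=4.0, offset=1.0):
    """D2-like gain: decreases monotonically with delta."""
    return sigmoid(-slope * (delta + offset))

def coexpressing_gain(delta):
    """Sum of both gains: low near delta = 0, high for large |delta|."""
    return d1_gain(delta) + d2_gain(delta)

# Evaluate the combined gain on a few illustrative values of delta.
gains = {delta: coexpressing_gain(delta) for delta in (-2.0, 0.0, 2.0)}
```

The offsets matter in this sketch: two sigmoids with zero offset would sum to a constant, so a U-shape arises only when each gain saturates away from δ = 0.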

The role of serotonin in the reward and punishment sensitivity of PD subjects could also be analyzed using our proposed modeling approach. In experiments where the reward/punishment sensitivity of PD subjects was studied, PD patients ON DA medication showed increased reward sensitivity compared to PD OFF subjects, who showed increased punishment sensitivity (Frank et al., 2007; Bodi et al., 2009). Our proposed model, in which serotonin controls the weightage of risk, is expected to account for the aforementioned experimental results. Preliminary work on the application of the proposed model to the study of Bodi et al. (2009) gave encouraging results (unpublished).

In connection with the neurobiological correlate of the *sign(Qt)* term, the aforementioned discussion suggests a general, complex interaction between DA and 5HT signals. But as a specific circuit that can form the basis for the −*sign(Qt)* term in Equation 2.8, we invoke the circuitry that links the habenula with the striatum. The habenula is a structure that is thought to be involved in the brain's responses to reward, pain and anxiety (Lecourtier and Kelly, 2007; Hikosaka, 2010). It gained importance for its interactions with the DA and 5HT systems (Lecourtier and Kelly, 2007; Hikosaka, 2010). It is a small structure located near the posterior-dorsal-medial end of the thalamus. It is divided into medial habenula (MHb) and lateral habenula (LHb). Striatum (in particular D1R-containing striosome) and LHb are thought to form a negative feedback loop [LHb→Rostromedial Tegmental Nucleus (RMTg)→VTA/SNc→Striatum→Globus Pallidus→LHb], not via direct connections but via intermediaries (Lecourtier and Kelly, 2007; Hikosaka, 2010). Activation of LHb neurons inhibits the DA cells of VTA and SNc. This DA is also known to have a special action on MSNs, as follows. Activation of D1 receptors is known to enhance (suppress) the activation of MSNs if the prior membrane state is depolarized (polarized) (Flores-Hernandez et al., 2002). However, we do not know if the action of DA on the hypothesized risk-computing D1–D2 coexpressing neurons is one of the stabilizers of the pre-existing state. Based on the data reviewed above, we plan to develop a model in which D1-expressing MSNs, whose activity represents value, act on D1R-D2R co-expressing MSNs via the habenula, by an interaction term that can be roughly described by −*sign(Qt)*.

Finally, a theory of 5HT and DA in the BG must go beyond the striatum, since 5HT innervations in the BG are not confined to striatum but include GPe, SNc, and PPN (Wallman et al., 2011). We plan to elucidate the role of 5HT and DA in these other nuclei of the BG through a more complete network model in future work. The suggested roles of DA in the BG include (1) plasticity of corticostriatal connections, (2) switching between DP and IP by striatal DA, and (3) modulating the exploratory drive arising from the STN-GPe system (Chakravarthy et al., 2010; Kalva et al., 2012). Analogously, a comprehensive theory of 5HT and DA in the BG is planned to be developed. The theory might shed light on the role of 5HT in some of the key functions of the BG, viz., action selection/decision making.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fncom.2014.00047/abstract

#### **REFERENCES**


microdialysis studies. *Prog. Brain Res.* 172, 7–44. doi: 10.1016/S0079-6123(08)00902-3


*Neurosci. Biobehav. Rev.* 31, 658–672. doi: 10.1016/j.neubiorev.2007.01.004


of rewards in the ventral and dorsal striatum. *PLoS ONE* 2:e1333. doi: 10.1371/journal.pone.0001333


and losses. *Proc. R. Soc. B Biol. Sci.* 276, 4181–4188. doi: 10.1098/rspb.2009.1312

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 December 2013; accepted: 30 March 2014; published online: 16 April 2014.*

*Citation: Balasubramani PP, Chakravarthy VS, Ravindran B and Moustafa AA (2014) An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning. Front. Comput. Neurosci. 8:47. doi: 10.3389/fncom.2014.00047*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Balasubramani, Chakravarthy, Ravindran and Moustafa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Transient and steady-state selection in the striatal microcircuit

#### *Adam Tomkins<sup>1,2</sup>, Eleni Vasilaki<sup>1,2</sup>, Christian Beste<sup>3</sup>, Kevin Gurney<sup>4</sup>\* and Mark D. Humphries<sup>5</sup>*

*<sup>1</sup> Department of Computer Science, University of Sheffield, Sheffield, UK*

*<sup>2</sup> INSIGNEO Institute for in Silico Medicine, University of Sheffield, Sheffield, UK*

*<sup>3</sup> Cognitive Neurophysiology, Universitätsklinikum Carl Gustav Carus, TU Dresden, Germany*

*<sup>4</sup> Adaptive Behaviour Research Group, Department of Psychology, University of Sheffield, Sheffield, UK*

*<sup>5</sup> Faculty of Life Sciences, University of Manchester, Manchester, UK*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Pragathi Priyadharsini Balasubramani, Indian Institute of Technology, India Greg Ashby, University of California, Santa Barbara, USA*

#### *\*Correspondence:*

*Kevin Gurney, Adaptive Behaviour Research Group, Department of Psychology, Western Bank, University of Sheffield, Sheffield, S10 2TP, UK e-mail: k.gurney@sheffield.ac.uk*

Although the basal ganglia have been widely studied and implicated in signal processing and action selection, little is known about the active role the striatal microcircuit plays in action selection in the basal ganglia-thalamo-cortical loops. To address this knowledge gap we use a large-scale three-dimensional spiking model of the striatum, combined with a rate-coded model of the basal ganglia-thalamo-cortical loop, to assess the computational role the striatum plays in action selection. We identify a robust transient phenomenon generated by the striatal microcircuit, which temporarily enhances the difference between two competing cortical inputs. We show that this transient is sufficient to modulate decision making in the basal ganglia-thalamo-cortical circuit. We also find that the transient selection originates from a novel adaptation effect in single striatal projection neurons, which is amenable to experimental testing. Finally, we compared transient selection with models implementing classical steady-state selection. We challenged both forms of model to account for recent reports of paradoxically enhanced response selection in Huntington's disease patients. We found that steady-state selection was uniformly impaired under all simulated Huntington's conditions, but transient selection was enhanced given a sufficient Huntington's-like increase in NMDA receptor sensitivity. Thus our models provide an intriguing hypothesis for the mechanisms underlying the paradoxical cognitive improvements in manifest Huntington's patients.

**Keywords: response selection, action selection, striatum, Huntington's disease, basal ganglia, excitotoxicity**

## **1. INTRODUCTION**

Finding the neural substrate for the process of "selection" is key to furthering our understanding of decision-making (Ding and Gold, 2013), action selection (Mink, 1996; Grillner et al., 2005), planning (Houk and Wise, 1995), action sequencing (Jin and Costa, 2010), and even working memory (Gruber et al., 2006). A unifying proposal is that the basal ganglia form just such a generic selection mechanism (Prescott et al., 1999; Redgrave et al., 1999); this proposal neatly explains why the basal ganglia have been hypothesized to contribute to each of these functions. But specifying the computational process of selection by the basal ganglia is challenging (Berns and Sejnowski, 1998; Gurney et al., 2001a,b; Humphries et al., 2006; Leblois et al., 2006).

A particular unknown is the computational role of the basal ganglia's input nucleus, the striatum. The striatum's GABAergic projection neurons comprise the vast majority of cells and are connected by local collaterals of their axons (Wilson and Groves, 1980). The lack of layers or of clear axial preferences in the direction of dendrites or axons suggests that striatal tissue is homogeneous in all three dimensions (Humphries et al., 2010). Such GABAergic connectivity naturally lends itself to the idea that the striatum forms a vast recurrent network that, locally, implements a winner-takes-all computation (Alexander and Wickens, 1993; Fukai and Tanaka, 1997; Wickens, 1997). The weak strength of synapses between the projection neurons (Jaeger et al., 1994; Czubayko and Plenz, 2002; Tunstall et al., 2002) is difficult to reconcile with this proposal (Plenz, 2003), as they suggest projection neuron output can only modulate ongoing activity and not outright inhibit their targets.

Here we report an alternative, transient form of selection that can occur in weak, sparse networks of striatal projection neurons. Using our three-dimensional network model of distance-dependent connections in the striatal microcircuit (Humphries et al., 2009b, 2010), we explored the effect on striatal output of competing inputs to two projection neuron populations. We found that rapidly stepped input to one population caused a transient competitive effect on the two populations' outputs, which disappeared after around 100 ms. In response to the same inputs, we also found that sufficiently dense striatal connectivity could result in steady-state competition, where the post-step equilibrium activity of each population reflects the inhibition of one by the other.

To compare transient and steady-state selection we challenged both forms of model to account for the paradoxical response selection results of Beste et al. (2008). They found that manifest Huntington's disease patients were both faster and less error prone than controls on a simple two-choice reaction-time task. As Huntington's disease primarily results in striatal damage, this suggests the hypothesis that changes in the striatum directly affect response selection. We expand on the role of the striatum in signal selection by describing a framework for signal selection that may account for both the typical decline in performance for most tasks under Huntington's conditions (Ho et al., 2003) and a mechanism for increased performance under the same conditions. We thus explored how Huntington's disease-like changes to our striatum models could affect both transient and steady-state selection, and sought whether the effect on either form of selection could explain the results of Beste et al. (2008), while also accounting for the usual cognitive impairment in Huntington's disease (Lawrence et al., 1998; Ho et al., 2003).

#### **2. MATERIALS AND METHODS**

We study here an updated version of our prior, full-scale model of striatum (Humphries et al., 2009b, 2010). Compared to those models, the model here brings together the three-dimensional anatomy model from Humphries et al. (2010) with an updated version of the dopamine-modulated projection neuron model from Humphries et al. (2009a).

#### **2.1. SPIKING NEURON MODELS**

The basic model neuron used in the large scale striatal model is derived from the model neuron proposed in Izhikevich (2003), which was extended to encompass the effects of dopamine modulation on intrinsic ion channels and synaptic input in Humphries et al. (2009b).

In the biophysical form of the Izhikevich model neuron, *v* is the membrane potential and the "recovery variable" *u* is the contribution of the neuron class's dominant ion channel:

$$C\dot{v} = k\left(v - v_r\right)\left(v - v_t\right) - u + I \tag{1}$$

$$
\dot{u} = a\left[b\left(v - v_r\right) - u\right] \tag{2}
$$

with reset condition

$$\text{if } v > v_{\text{peak}} \text{ then } v \leftarrow c, \; u \leftarrow u + d$$

where in the equation for the membrane potential (Equation 1), *C* is capacitance, *vr* and *vt* are the resting and threshold potentials, *I* is a current source, and *c* is the reset potential. Parameter *a* is a time constant governing the time scale of the recovery due to the dominant ion channel. Parameters *k* and *b* are derived from the I-V curve of the target neuron behavior, where *b* describes how sensitive the recovery variable *u* is to fluctuations in the membrane potential *v*. Parameter *d* describes the after spike reset of recovery variable *u*, and can be tuned to modify the rate of spiking output.
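A minimal Python sketch of Equations (1) and (2) with the reset condition, integrated by the forward Euler method, is given below. The parameter values are illustrative defaults chosen by us for a medium-spiny-like neuron; they are not the fitted values of Table 1, and the function name and step size are likewise assumptions.

```python
def simulate_izhikevich(I, T=1000.0, dt=0.1, C=50.0, k=1.0,
                        v_r=-80.0, v_t=-25.0, a=0.01, b=-20.0,
                        c=-55.0, d=150.0, v_peak=40.0):
    """Integrate Equations (1)-(2) and return the spike times in ms.

    I is a constant injected current; all default parameter values are
    illustrative assumptions, not the values used in the article.
    """
    v, u = v_r, 0.0            # start at rest with no recovery current
    spike_times = []
    for step in range(int(round(T / dt))):
        dv = (k * (v - v_r) * (v - v_t) - u + I) / C   # Equation (1)
        du = a * (b * (v - v_r) - u)                   # Equation (2)
        v += dt * dv
        u += dt * du
        if v > v_peak:         # reset condition
            v = c
            u += d
            spike_times.append((step + 1) * dt)
    return spike_times

spikes_rest = simulate_izhikevich(I=0.0)      # no input current
spikes_driven = simulate_izhikevich(I=500.0)  # constant suprathreshold drive
```

With no input the model stays at the resting potential, whereas a sufficiently large constant current produces repetitive spiking. A smaller time step or a higher-order integrator would be a natural refinement if spike timing accuracy matters.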

#### *2.1.1. Projection neuron model*

The projection neuron models' parameter values and their source are given in **Table 1**. Parameters *C*, *d*, *vt*, and the AMPA synaptic conductance *g*ampa (see below) were found by searching for the best-fit to the f-I curve and spiking input–output functions of the Moyer et al. (2007) 189-compartment projection neuron model (Humphries et al., 2009a).

In Humphries et al. (2009a) we showed how this model can capture key dynamical phenomena of the projection neuron: the slow-rise to first spike following current injection; paired-pulse facilitation lasting hundreds of milliseconds; and bimodal membrane behavior emulating up- and down-state activity under anaesthesia and in stimulated slice preparations.

#### *2.1.2. Fast-spiking interneuron model*

For the FSI model, Equation (2) for the *u* term is given by (Izhikevich, 2007b)

$$\dot{u}_{\rm fs} = \begin{cases} -au_{\rm fs} & \text{if } v_{\rm fs} < v_{\rm b}, \\ a\left[b\left(v_{\rm fs} - v_{\rm b}\right)^{3} - u_{\rm fs}\right] & \text{if } v_{\rm fs} \ge v_{\rm b}, \end{cases} \tag{3}$$

which enables the FSI model to exhibit Type 2 dynamics, such as a non-linear step at the start of the current-frequency curve between 0 and 15–20 spikes/s. Further discussion on the FSI model used in the striatal microcircuit can be found in Humphries et al. (2009b); the FSI model parameters are reproduced in **Table 2**.



**Table 2 | Intrinsic parameters for the fast spiking interneuron model.**


*Dimensions are given where applicable. See Humphries et al. (2009b) for details.*

#### *2.1.3. Dopaminergic modulation of intrinsic ion channels*

Tonic levels of dopamine in the striatum modulate the excitability of the projection neurons and fast-spiking interneurons (Nicola et al., 2000; Mallet et al., 2006). Our network model incorporates modulation by tonic dopamine through the relative activation levels of D1 and D2 receptors. These levels are modeled using the method proposed in Humphries et al. (2009b), in which complex membrane dynamics are subsumed by linear transforms with only two parameters φ<sub>1</sub>, φ<sub>2</sub> ∈ [0, 1], describing the proportion of D1 and D2 receptor activation, respectively. Throughout we used φ<sub>1</sub> = φ<sub>2</sub> = 0.3.

For activation of D1 receptors on projection neurons we used the simple mappings:

$$v_r \leftarrow v_r\left(1 + K\phi_1\right) \tag{4}$$

and

$$d \leftarrow d\left(1 - L\phi_1\right), \tag{5}$$

which respectively model the D1-receptor mediated enhancement of the inward-rectifying potassium current (KIR) (Equation 4) and enhancement of the L-type Ca<sup>2+</sup> current (Equation 5).

For activation of D2 receptors on projection neurons we used the mapping:

$$k \leftarrow k\left(1 - \alpha\phi_2\right) \tag{6}$$

which models the small inhibitory effect on the slow A-type potassium current, increasing the neuron's rheobase current (Moyer et al., 2007).

With these mappings, the model neuron is able to accurately capture the effect of D1 or D2 receptor activation on both the f-I curves and spiking input–output functions of the Moyer et al. (2007) compartmental model of the projection neuron.

Dopamine-modulated fast-spiking interneurons in the striatal network only express the D1 family of receptors (Centonze et al., 2003). Activation of this receptor depolarizes the neuron's resting potential [see Humphries et al. (2009b) for further details]. Thus we used the following mapping of the resting potential:

$$v_r \leftarrow v_r\left(1 - \eta\phi_1\right) \tag{7}$$
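The mappings of Equations (4)–(7) amount to rescaling a few intrinsic parameters, which the following Python sketch makes explicit. The coefficient values for *K*, *L*, α and η, the baseline parameters, and the function names are placeholders assumed for illustration; the fitted values belong in the parameter tables.

```python
def modulate_projection_neuron(params, phi1=0.0, phi2=0.0,
                               K=0.0289, L=0.331, alpha=0.032):
    """Apply Equations (4)-(6) to a dict of intrinsic parameters.

    K, L and alpha are placeholder coefficients assumed for this sketch.
    """
    out = dict(params)
    out["v_r"] = params["v_r"] * (1.0 + K * phi1)   # Eq. 4: D1, KIR current
    out["d"] = params["d"] * (1.0 - L * phi1)       # Eq. 5: D1, L-type Ca2+ current
    out["k"] = params["k"] * (1.0 - alpha * phi2)   # Eq. 6: D2, A-type K+ current
    return out

def modulate_fsi(params, phi1=0.0, eta=0.1):
    """Apply Equation (7): D1 activation depolarizes the FSI resting potential."""
    out = dict(params)
    out["v_r"] = params["v_r"] * (1.0 - eta * phi1)
    return out

baseline = {"v_r": -80.0, "d": 150.0, "k": 1.0}   # illustrative values only
d1_cell = modulate_projection_neuron(baseline, phi1=0.3)
d2_cell = modulate_projection_neuron(baseline, phi2=0.3)
fsi_cell = modulate_fsi({"v_r": -70.0}, phi1=0.3)
```

Because the resting potential is negative, the factor (1 + *K*φ<sub>1</sub>) hyperpolarizes it in the projection neuron, while (1 − ηφ<sub>1</sub>) depolarizes it in the interneuron. Setting both activation levels to zero returns the baseline parameters unchanged.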

#### **2.2. SYNAPTIC MODELS**

Synaptic input comprises the source of current *I* in Equation (1):

$$I = I_{\text{ampa}} + I_{\text{gaba}} + B(v)I_{\text{nmda}}, \tag{8}$$

where *I*ampa, *I*gaba, *I*nmda are current input from AMPA, GABA, and NMDA receptors, respectively, and *B*(*v*) is a term that models the voltage-dependent magnesium plug in the NMDA receptors. Compared to the projection neuron, FSIs receive no NMDA receptor input from cortex, have a moderately larger AMPA conductance (**Table 2**), but do receive input via local gap junctions (see below).

Each synaptic input type *z* (where *z* is one of ampa, nmda, gaba) is modeled by

$$I_z = \bar{g}_z h_z \left(E_z - v\right), \tag{9}$$

where *ḡ*<sub>*z*</sub> is the maximum conductance and *E*<sub>*z*</sub> is the reversal potential. We use the standard single-exponential model of post-synaptic currents

$$\dot{h}_z = \frac{-h_z}{\tau_z}, \quad \text{and} \quad h_z(t) \leftarrow h_z(t) + S_z(t), \tag{10}$$

where τ<sub>*z*</sub> is the appropriate synaptic time constant, and *S*<sub>*z*</sub>(*t*) is the number of pre-synaptic spikes arriving at all the neuron's receptors of type *z* at time *t*.

Given that one interest here is in the possible roles of striatal NMDA sensitivity in Huntington's disease, we paid careful attention to two complexities of the NMDA receptor: its non-linear voltage-gating, and its saturation. The term *B*(*v*) in Equation (8), which models the voltage-dependent magnesium plug in the NMDA receptors, is given by (Jahr and Stevens, 1990)

$$B(v) = \frac{1}{1 + \frac{\left[\text{Mg}^{2+}\right]_0}{3.57} \exp\left(-0.062v\right)}, \tag{11}$$

where [Mg<sup>2+</sup>]<sub>0</sub> is the equilibrium concentration of magnesium ions.
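Equation (11) can be evaluated directly, as in the short Python sketch below. The membrane potential is taken in mV and the magnesium concentration in mM; the default of 1 mM and the function name `mg_block` are assumptions made for illustration.

```python
import math

def mg_block(v, mg0=1.0):
    """Voltage-dependent magnesium block of Equation (11).

    v is the membrane potential in mV; mg0 is the magnesium
    concentration in mM (1.0 is an assumed, illustrative value).
    """
    return 1.0 / (1.0 + (mg0 / 3.57) * math.exp(-0.062 * v))

# The block is relieved as the membrane depolarizes.
block_at = {v: mg_block(v) for v in (-80.0, -40.0, 0.0)}
```

Near a hyperpolarized potential the factor is close to zero, so NMDA current contributes little in Equation (8); it grows monotonically with depolarization.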

Because glutamate can remain locked into the NMDA receptor for 100 ms or more (Lester et al., 1990), the pool of available receptors rapidly becomes saturated at high afferent firing rates. To capture this we introduce a mean-field model of synaptic saturation where we interpret the term *hz* in Equation (10) as the number of active receptor groups over the whole neuron. Each step in *h*nmda, following a number of spikes *S*nmda(*t*), activates that number of receptor groups, which decays with a time constant τnmda. To introduce saturation, we bound the size of the step by the proportion of available groups. Together, these concepts give us the model:

$$\dot{h}\_{\text{nmda}} = \frac{-h\_{\text{nmda}}}{\tau\_{\text{nmda}}}, \quad \text{and} \qquad h\_{\text{nmda}}(t) \leftarrow h\_{\text{nmda}}(t) + \left[1 - \frac{h\_{\text{nmda}}(t)}{N\_{\text{nmda}}}\right] S\_{\text{nmda}}(t). \tag{12}$$
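The difference between the linear update of Equation (10) and the saturating update of Equation (12) can be illustrated with a short sketch. The time constant, the number of receptor groups, and the function names below are hypothetical placeholders chosen for illustration; they are not the model's tuned parameters.

```python
import math

def decay(h, dt_ms, tau_ms):
    """Exact solution of dh/dt = -h / tau over one time step."""
    return h * math.exp(-dt_ms / tau_ms)

def linear_step(h, n_spikes):
    """Equation (10): every arriving spike adds one unit, without bound."""
    return h + n_spikes

def saturating_step(h, n_spikes, n_groups):
    """Equation (12): the step is scaled by the fraction of free receptor groups."""
    return h + (1.0 - h / n_groups) * n_spikes

# Drive both synapse models with one spike per 1 ms step (assumed values).
TAU_MS, N_GROUPS, DT_MS = 100.0, 10.0, 1.0
h_lin = h_sat = 0.0
for _ in range(500):
    h_lin = linear_step(decay(h_lin, DT_MS, TAU_MS), 1)
    h_sat = saturating_step(decay(h_sat, DT_MS, TAU_MS), 1, N_GROUPS)
```

With one spike per step, the saturating variable stays below the number of receptor groups while the linear variable grows far beyond it, which is the behavior the mean-field model is intended to capture at high afferent rates.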

As well as introducing this saturation of the NMDA synapses, we also removed the 1/τ<sub>*s*</sub> scaling of post-synaptic current amplitude used in Humphries et al. (2009a). This allowed the model synaptic conductances to be of the same order of magnitude as their experimental counterparts. Consequently, we re-tuned *g*<sub>ampa</sub> by fitting the input–output functions of the Moyer et al. (2007) 189-compartment projection neuron model, following the protocol in Humphries et al. (2009a). We obtained fits as good as those found previously with a value of *g*<sub>ampa</sub> = 0.4 (results not shown).

#### *2.2.1. Dopaminergic modulation of synaptic input*

Following the projection neuron models in Humphries et al. (2009a), we add D1 receptor modulation of NMDA receptor evoked EPSPs by

$$I\_{\rm nmda}^{\rm D1} = I\_{\rm nmda} \left( 1 + \beta\_1 \phi\_1 \right), \tag{13}$$

and we add D2 receptor modulation of AMPA receptor evoked EPSPs by

$$I\_{\rm ampa}^{\rm D2} = I\_{\rm ampa} \left( 1 - \beta\_2 \phi\_2 \right), \tag{14}$$

where β<sub>1</sub> and β<sub>2</sub> are scaling coefficients determining the relationship between dopamine receptor occupancy and the effect magnitude (**Table 3**). Due to the addition of saturating NMDA synapses, we also re-tuned these parameter values by fitting the input–output functions of the Moyer et al. (2007) 189-compartment projection neuron model under D1 and D2 receptor modulation of synaptic inputs, following the protocol in Humphries et al. (2009a).

Finally, following the model in Humphries et al. (2009b), we add D2 receptor modulation of GABAergic input to FSIs by

$$I\_{\rm gaba}^{\rm fsi} = I\_{\rm gaba} \left( 1 - \epsilon\_2 \phi\_2 \right). \tag{15}$$

#### *2.2.2. Gap junctions*

A gap junction between FSIs *i* and *j* is modeled as a compartment with voltage *v*<sup>∗</sup><sub>*ij*</sub>, which has dynamics

$$\tau \dot{v}\_{ij}^{\*} = \left(v\_{i} - v\_{ij}^{\*}\right) + \left(v\_{j} - v\_{ij}^{\*}\right),\tag{16}$$

where τ is a time constant for voltage decay, and *v*<sub>*i*</sub> and *v*<sub>*j*</sub> are the membrane potentials of the FSI pair. The current introduced by that cable to the FSI pair is then

$$I\_{\rm gap}^{\*}(i) = g\left(v\_{ij}^{\*} - v\_{i}\right) \qquad \qquad I\_{\rm gap}^{\*}(j) = g\left(v\_{ij}^{\*} - v\_{j}\right), \tag{17}$$

where *g* is the effective conductance of the gap junction. The total gap junction input *I*<sub>gap</sub> to an FSI is then the sum over all contributions *I*<sup>∗</sup><sub>gap</sub>.

**Table 3 | Synaptic and gap junction parameters for the striatal network.**
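A minimal sketch of the gap-junction compartment in Equations (16) and (17) is given below, using a forward Euler step for illustration. The time constant, conductance, step size, and the clamped membrane potentials are hypothetical values, and the function names are not from the original model code.

```python
def gap_voltage_step(v_star, v_i, v_j, tau_ms, dt_ms):
    """One forward-Euler step of Equation (16) for the junction compartment."""
    dv = ((v_i - v_star) + (v_j - v_star)) / tau_ms
    return v_star + dt_ms * dv

def gap_currents(v_star, v_i, v_j, g):
    """Equation (17): current delivered to each FSI of the coupled pair."""
    return g * (v_star - v_i), g * (v_star - v_j)

# Hold the two FSIs at fixed potentials (assumed, in mV) and let the
# compartment voltage relax between them.
V_I, V_J = -60.0, -80.0
TAU_MS, DT_MS, G = 5.0, 0.1, 0.5
v_star = 0.0
for _ in range(1000):
    v_star = gap_voltage_step(v_star, V_I, V_J, TAU_MS, DT_MS)

i_to_i, i_to_j = gap_currents(v_star, V_I, V_J, G)
```

With both membrane potentials held fixed, the compartment settles at their mean, so the two currents are equal and opposite: the more depolarized cell is pulled down and the more hyperpolarized cell is pulled up.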

#### **2.3. STRIATUM NETWORK MODEL**

Our model captures the connections within the GABAergic microcircuit in striatum, illustrated in **Figure 1**. We simulated a large-scale model representing a three-dimensional cuboid of the striatum in the adult rat at one-to-one scale, containing every projection neuron and fast-spiking interneuron present in the biological tissue. We used a density of 84,900 projection neurons per mm<sup>3</sup> (Oorschot, 1996) and an FSI density of 1% [see Humphries et al. (2010) for discussion]. We assumed projection neurons were evenly split between D1 and D2 receptor dominant types, and without any spatial bias. Hence we randomly assigned half of the projection neurons to be D1-type and half to be D2-type.

In the Results we predominantly report simulations of a cube 300 μm on a side, giving 2292 projection neurons and 23 FSIs. Other sizes are noted explicitly where used.

To connect the neurons we used two different models. In the *physical* model we used distance-dependent functions for the probability of connection between each element of the microcircuit. These functions were derived from overlap of dendritic and axonal arbors, and are given in Humphries et al. (2010) for each connection type in the microcircuit.

**FIGURE 1 | GABAergic striatal microcircuit.** Input to the striatum comes from glutamatergic (GLU: •) fibers originating in the cortex, thalamus, hippocampal formation and amygdala, and dopaminergic (DA: -) fibers from brainstem dopaminergic neurons. The projection neurons (SPNs) are interconnected via local collaterals of their axons projecting to other nuclei of the basal ganglia. The fast-spiking interneurons (FSIs) can form dendro-dendritic gap junctions between them and are also connected by standard axo-dendritic synapses. All these intra-striatal axo-dendritic connections are GABAergic and hence inhibitory.

In the *random* model we ignored distance, and simply made connections to each neuron at random until the correct number of incoming connections of each type was made. The target numbers of connections were derived from the mean values obtained from the central neurons of the three-dimensional connectivity model in Humphries et al. (2010), and taken from column 1 of Table 5 in that paper: SPNs → 1 SPN: 728; FSIs → 1 SPN: 30.6; FSIs → 1 FSI: 12.8; FSI gap junctions per FSI: 0.65.
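One way the *random* wiring rule could be realized is sketched below. Several details are assumptions made only for this illustration and are not stated in the text: fractional targets such as 30.6 are handled by stochastic rounding, self-connections are excluded, and the same source may be drawn more than once. The function names and the reduced population of 200 projection neurons are hypothetical.

```python
import random

def draw_in_degree(mean_count, rng):
    """Round a fractional mean to an integer so that the expectation is preserved."""
    base = int(mean_count)
    return base + (1 if rng.random() < mean_count - base else 0)

def random_inputs(targets, sources, mean_count, rng):
    """For each target, keep drawing random sources until its in-degree is met."""
    wiring = {}
    for target in targets:
        candidates = [s for s in sources if s != target]  # assumed: no self-connections
        k = draw_in_degree(mean_count, rng)
        wiring[target] = [rng.choice(candidates) for _ in range(k)]
    return wiring

rng = random.Random(0)
spns = [f"spn{i}" for i in range(200)]   # reduced population, for brevity
fsis = [f"fsi{i}" for i in range(23)]

fsi_to_spn = random_inputs(spns, fsis, 30.6, rng)
fsi_to_fsi = random_inputs(fsis, fsis, 12.8, rng)
mean_fsi_inputs = sum(len(v) for v in fsi_to_spn.values()) / len(spns)
```

Because distance is ignored, every neuron of a given type receives essentially the same number of inputs regardless of where it sits in the cuboid, in contrast to the *physical* model, where neurons near the edges receive fewer.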

#### **2.4. SELECTION COMPETITIONS**

Cortical input to the model was designed to emulate the response selection component in a general two-choice task, where a (possibly noisy) stimulus taking one of two values is observed over time and a choice made between the two corresponding responses. In such a task, we propose that the two responses are made salient by the onset of each trial and then, after a perceptual decision is made about the stimulus value, the corresponding response increases in salience. This generic setup was inspired by the experimental procedures of Beste et al. (2008), in which participants were asked to distinguish between short (200 ms) and long (400 ms) auditory tones, using a distraction paradigm. Inputs followed a ramping trajectory to simulate evidence accumulation and increasing decision confidence (Asaad et al., 2000). We previously showed that transient selection can be seen in response to stepped cortical inputs (Tomkins et al., 2012).

The striatum model was divided into three populations: two physically close SPN populations representing the two competing responses, which we refer to throughout as *channels*, and the remaining background neurons, which were given a constant input. Neurons were assigned at random, with 40% in channel 1, 40% in channel 2, and the remaining 20% labeled "background" neurons.

The input protocol is illustrated in **Figure 2A**, and **Figure 2B** shows an example response of the entire network to this protocol. Each response population received a priming input at a background rate for 1500 ms, causing it to reach a steady state of firing activity. At 1500 ms, channel 1 (black) received an input that ramped for 50 ms, raising its salience toward a new steady state, at which it became the most salient cortical input to the striatum. During the 50 ms ramping time, channel 2 also received a ramping input, matching that of channel 1 for 25 ms. Following this, the signal to channel 2 decreased back to the background rate, describing the evidence accumulation trajectory of an out-competed action.
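The input protocol can be sketched as a function returning the cortical rate for each channel at a given time. The shape of the return of channel 2 to the background rate is not specified in the text; a linear fall over the second half of the ramp is assumed here purely for illustration, as are the example rates and the function name.

```python
def cortical_rate(t_ms, channel, base_hz, step_hz, onset_ms=1500.0, ramp_ms=50.0):
    """Input rate (Hz) to one channel under the ramping protocol."""
    # Fraction of the ramp completed: 0 before onset, 1 once the ramp has finished.
    ramp = min(max((t_ms - onset_ms) / ramp_ms, 0.0), 1.0)
    if channel == 1:
        return base_hz + step_hz * ramp
    # Channel 2 tracks channel 1 for the first half of the ramp, then falls
    # back linearly to the background rate (assumed shape).
    return base_hz + step_hz * min(ramp, 1.0 - ramp)

# Example with an assumed 4 Hz baseline and 2 Hz step.
before = cortical_rate(1000.0, 1, 4.0, 2.0)
mid_ch1 = cortical_rate(1525.0, 1, 4.0, 2.0)
mid_ch2 = cortical_rate(1525.0, 2, 4.0, 2.0)
late_ch1 = cortical_rate(1600.0, 1, 4.0, 2.0)
late_ch2 = cortical_rate(1600.0, 2, 4.0, 2.0)
```

The two channels are indistinguishable until 25 ms after onset, after which only channel 1 continues toward its new steady-state rate.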

Rates were specified for each cortical spike train input to each projection neuron and FSI model. Both neuron models received the equivalent of 250 input spike trains [see Humphries et al. (2009b) for details].

We measured how the striatal microcircuit performed channel-wise signal selection on the cortical inputs, using this simple protocol, inspired by the auditory decision task performed in Beste et al. (2008). However, due to the abstract nature of the input protocol we use, applied to a generic simulation of the striatal microcircuit, the selection measured in these results could be applied to any channel-wise decision task throughout the striatum, and is not limited to auditory processing.

**FIGURE 2 | Measures of selectivity in striatal output. (A)** Ramping cortical input into the striatum model. Two channels are driven by input spike trains, demonstrating signal selection between most-salient (channel 1) and least-salient (channel 2) striatal signals. **(B)** Raster plot of the striatum microcircuit output for a single selection experiment. Increased firing can be seen in channel 1 at the onset of the ramped input in panel **(A)**. **(C)** A sample striatal output of the *physical* network, showing a zero-phase filter of the mean spiking output from the two competing channels in response to the ramped input in panel **(A)**. Annotations demonstrate the measures used in the transient selectivity measure. *S*<sup>1</sup>, *S*<sup>2</sup>: stable firing rate; Δ<sup>*S*(1,2)</sup>: maximum of the difference between the two channels' firing rates over the transient period. **(D)** A sample striatal output of the *random* network, in response to the same input. Annotations demonstrate the measures used in the steady-state selectivity measure. *S<sup>P</sup>*: pre-step stable firing rate.

#### **2.5. METRICS FOR SELECTION**

We define "selectivity" in the striatum as the ability to robustly distinguish competing signals. The striatum demonstrates two complementary modes of selectivity, which we measure with different metrics. These selection metrics are applied to the output of each channel, which is characterized by a zero-phase filtered mean firing rate.

#### *2.5.1. Transient selectivity*

Given a competitive split in cortical input, we see a temporary boosting of the most-salient signal, accompanied by a temporary suppression of the least-salient competing signal (**Figure 2C**). This transient phenomenon boosts the difference in salience between the two competing signals. We identify two key measures: (1) Δ<sup>*S*(1,2)</sup>, the maximum difference between the two signals during the transient peaks; (2) *S*<sup>1</sup>, *S*<sup>2</sup>, the mean stable activity level of each channel after the transient period dissipates. The total transient selectivity, between 0 and 1, is defined as

$$\text{TS} = 1 - \frac{S^1 - S^2}{\Delta^{S(1,2)}}, \qquad 0 \le \text{TS} \le 1 \tag{18}$$

where Δ<sup>*S*(1,2)</sup> is the *maximum* difference between the firing rates of Channel 1 and Channel 2 over the transient window (*t* = 1500 : 2000 ms). This enables the measure to allow for cases in which the largest perturbations from the mean are not temporally coincident, either due to reliable intrinsic dynamic properties of the network, or statistical fluctuations therein.
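A sketch of how Equation (18) might be evaluated on two filtered rate traces follows. Two interpretive assumptions are made here: the maximum difference is taken as the peak of channel 1 minus the trough of channel 2 within the transient window, so that non-coincident extremes are both counted, and the stable rates are taken as means over all samples after the window. The synthetic traces and the function name are illustrative only.

```python
def transient_selectivity(rate1, rate2, t_ms, window=(1500.0, 2000.0)):
    """Equation (18) on two equally sampled, filtered firing-rate traces."""
    start, end = window
    in_win = [(a, b) for a, b, t in zip(rate1, rate2, t_ms) if start <= t <= end]
    after = [(a, b) for a, b, t in zip(rate1, rate2, t_ms) if t > end]
    # Assumed reading: peak of channel 1 minus trough of channel 2 in the window.
    delta = max(a for a, _ in in_win) - min(b for _, b in in_win)
    s1 = sum(a for a, _ in after) / len(after)
    s2 = sum(b for _, b in after) / len(after)
    return 1.0 - (s1 - s2) / delta

# Synthetic traces sampled every 10 ms: a 100 ms transient, then stable rates.
t_ms = [10.0 * k for k in range(300)]
rate1 = [10.0 if t < 1500 else 20.0 if t < 1600 else 14.0 for t in t_ms]
rate2 = [10.0 if t < 1500 else 6.0 if t < 1600 else 10.0 for t in t_ms]
ts = transient_selectivity(rate1, rate2, t_ms)
```

In this synthetic case the transient difference (14) exceeds the stable difference (4), giving TS = 1 − 4/14. A value near 1 indicates that the contrast between channels during the transient is much larger than the contrast that persists afterwards.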

#### *2.5.2. Steady-state selectivity*

The striatum network can exhibit signal suppression on its least-salient channel due to sustained inhibition by the most-salient channel. *Steady-state* selectivity is measured on the least-salient channel, as the percentage reduction in the mean channel firing rate after the rise in salience of the most-salient signal. An example of steady-state selectivity in the *random* network can be seen in **Figure 2D**. We define *S<sup>P</sup>* as the stable firing rate of the primed channel 2 before the increase in competition, and from this we calculate the steady-state selectivity (SS) as:

$$\text{SS} = 100 \times \left( 1 - \frac{S^2}{S^P} \right). \tag{19}$$
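Equation (19) can be sketched in the same way. The windows used to estimate the stable rates before and after the rise in salience are assumptions chosen for illustration (the last 500 ms of priming, and everything after 2000 ms), as is the function name.

```python
def steady_state_selectivity(rate2, t_ms, pre=(1000.0, 1500.0), post_start=2000.0):
    """Equation (19): percentage reduction of the least-salient channel's rate."""
    pre_vals = [r for r, t in zip(rate2, t_ms) if pre[0] <= t < pre[1]]
    post_vals = [r for r, t in zip(rate2, t_ms) if t > post_start]
    s_p = sum(pre_vals) / len(pre_vals)    # stable primed rate S^P
    s_2 = sum(post_vals) / len(post_vals)  # stable rate after the step, S^2
    return 100.0 * (1.0 - s_2 / s_p)

# Synthetic trace: channel 2 drops from 10 Hz to 2 Hz once channel 1 wins.
t_ms = [10.0 * k for k in range(300)]
rate2 = [10.0 if t < 1500 else 2.0 for t in t_ms]
ss = steady_state_selectivity(rate2, t_ms)
```

A value of 0 means the losing channel is unaffected, and 100 would correspond to complete suppression, i.e., a winner-takes-all outcome.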

#### **2.6. BASAL GANGLIA-THALAMOCORTICAL LOOP MODEL OF TRANSIENT SELECTION**

To study the contribution of the transient striatal dynamics to the selection mechanism of the whole basal ganglia, we used the population-level implementation of our basal-ganglia thalamocortical loop model (Humphries and Gurney, 2002). **Figure 3** schematically illustrates the loop model, and the connectivity of the response-representing populations.

The average activity *a* of all neurons comprising a channel's population changes according to

$$
\tau \dot{a} = -a(t) + I(t) \tag{20}
$$

where τ is a time constant and *I* is summed, weighted input. We used τ = 10 ms throughout. The normalized firing rate *y* of the population is given by the piecewise-linear output function

$$y(t) = F\left(a(t), \theta\right) = \begin{cases} 0 & a(t) \le \theta \\ a(t) - \theta & \theta < a(t) < 1 - \theta \\ 1 & a(t) \ge 1 - \theta \end{cases} \tag{21}$$

with threshold θ.

**FIGURE 3 | Basal ganglia thalamo-cortical loop model.** The main circuit (right) embeds the basal ganglia into a thalamo-cortical feedback loop. Each nucleus contains multiple response-representing populations. Within the basal ganglia, the circuit can be decomposed into an off-center, on-surround network (left): three populations are shown, with example activity levels in the bar charts to illustrate the relative contributions of the nuclei. Note that, for clarity, full connectivity is only shown for the second population. Briefly, the selection mechanism works as follows. Constant inhibitory output from substantia nigra pars reticulata (SNr) provides an "off" signal to its widespread targets in the thalamus and brainstem. Cortical inputs representing competing saliences are organized in separate populations, which project to corresponding populations in striatum and subthalamic nucleus (STN). The balance of focussed inhibition from striatum and diffuse excitation from STN results in the most salient input suppressing the inhibitory output from the corresponding SNr population, signaling "on" to that SNr population's targets. Tonic dopamine levels in the striatum set the ease with which the channels are selected, and subsequently switched between following further salient inputs. For quantitative demonstrations of this model see Gurney et al. (2001b) and Humphries and Gurney (2002). GP: globus pallidus; SNr: substantia nigra pars reticulata; STN: subthalamic nucleus; TRN: thalamic reticular nucleus.

The following describes net input *Ii* and output *yi* for the *i*th channel of each structure, with *n* channels in total. The full model was thus given by (Humphries and Gurney, 2002):

$$\begin{aligned} \text{Cortex:} \quad & I\_i^{\text{ctx}} = y\_i^{\text{thal}} + c\_i, \\ & y\_i^{\text{ctx}} = F\left(a\_i^{\text{ctx}}, 0\right), \\ \text{Thalamus:} \quad & I\_i^{\text{thal}} = y\_i^{\text{ctx}} - y\_i^{\text{SNr}} - 0.1 y\_i^{\text{TRN}} \ldots \end{aligned}$$

Net input was computed from the outputs of the other structures, except driving input *c*<sub>*i*</sub> to channel *i* of cortex. The striatum was divided into two populations, one of projection neurons with the D1-type dopamine receptor, and one of projection neurons with the D2-type dopamine receptor. Many converging lines of evidence from electrophysiological and anatomical studies support this functional split into D1- and D2-dominant projection neurons and, further, that the D1-dominant neurons project to SNr, and the D2-dominant neurons project to GP (Gerfen et al., 1990; Surmeier et al., 2007; Matamales et al., 2009).

In line with the projection neuron model described above, the model included opposite effects of activating D1 and D2 receptors on striatal projection neuron activity: D1 activation facilitated cortical efficacy at the input, while D2 activation attenuated this efficacy (Moyer et al., 2007; Humphries et al., 2009a). The mechanism for this mirrored that of the spiking projection neuron model in using simple linear factors. Thus, if the relative activations of D1 and D2 receptors by tonic dopamine are λ<sub>1</sub>, λ<sub>2</sub> ∈ [0, 1], then the increase in efficacy due to D1 receptor activation was given by (1 + λ<sub>1</sub>); the decrease in efficacy due to D2 receptor activation was given by (1 − λ<sub>2</sub>). Throughout we set λ<sub>1</sub> = λ<sub>2</sub> = 0.2, simulating tonic levels of dopamine.

The negative thresholds ensured that STN, GP, and SNr have spontaneous tonic output (Humphries et al., 2006). We simplified the model here compared to Humphries and Gurney (2002) by delivering input only to cortex, to represent the salience-driven response selection, rather than to cortex, striatum and STN; both models gave qualitatively the same results. We used exponential Euler to numerically solve this system, with a time-step of 1 ms.
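For the leaky dynamics of Equation (20), an exponential Euler step treats the net input as constant across the step and integrates the linear decay exactly. The sketch below shows a single isolated population with a fixed input. In the full loop model the input would be recomputed from the other populations' outputs at each step, which is not reproduced here, and the function name is hypothetical.

```python
import math

def exp_euler_step(a, net_input, dt_ms=1.0, tau_ms=10.0):
    """Advance tau * da/dt = -a + I by one step, holding I fixed over the step."""
    return net_input + (a - net_input) * math.exp(-dt_ms / tau_ms)

# Relax from rest toward a constant input of 0.5 for one time constant (10 steps).
a = 0.0
for _ in range(10):
    a = exp_euler_step(a, 0.5)
```

After one time constant the activity has covered a fraction 1 − e<sup>−1</sup> of the distance to the input, as expected for a first-order system. Unlike forward Euler, this update remains stable for any step size.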

We used *n* = 8 channels in total, with two of those channels (4 and 5) receiving non-zero inputs, mimicking the input protocol used for the striatal network model, which is designed to abstractly simulate the two-choice reaction-time task performed in Beste et al. (2008). Baseline inputs *c*<sub>4</sub> = *c*<sub>5</sub> = 0.3 were delivered at simulation onset. A step in input *c*<sub>5</sub> occurred between 100 and 200 time-steps: a small step to *c*<sub>5</sub> = 0.5 or a large step to *c*<sub>5</sub> = 0.7. The ability of the model to select was assessed during this step period. As in prior models (Berns and Sejnowski, 1998; Gurney et al., 2001b; Humphries and Gurney, 2002; Humphries et al., 2006), selection was assessed by observing the change in activity on each SNr channel, as this output provides the tonic inhibition of thalamic and brainstem structures and is thought to gate the execution of actions (Redgrave et al., 1999). Here, successful selection of a channel was defined as the SNr output falling to zero.
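The selection criterion can be sketched as a simple test on an SNr output trace sampled once per time-step. The function below returns the proportion of the step period during which the channel was selected and the delay from step onset to the first selected time-step. The tolerance, the synthetic trace, and the function name are assumptions for illustration.

```python
def selection_stats(snr_output, step_on=100, step_off=200, eps=1e-9):
    """Proportion of the step period with SNr output at zero, and delay to selection."""
    window = snr_output[step_on:step_off]
    selected = [y <= eps for y in window]
    proportion = sum(selected) / len(window)
    delay = selected.index(True) if any(selected) else None  # in time-steps
    return proportion, delay

# Synthetic SNr trace: tonic output of 0.2 that falls to zero 10 steps after
# the input step and recovers when the step ends.
snr = [0.2] * 110 + [0.0] * 90 + [0.2] * 50
proportion, delay = selection_stats(snr)
```

A channel that never reaches zero output during the step yields a proportion of 0 and no delay, corresponding to a failure to select.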

#### *2.6.1. Modeling transient selection in the rate-coded model*

We mimicked the ability of the striatum microcircuit to produce transient phenomena using an input injection into the striatum of the rate-coded model. At *t* = 100 we injected external inputs into each striatal channel in the model, forcing a transient increase or decrease as appropriate in the corresponding channels. Transient sizes were extracted from the striatal microcircuit traces, and reproduced in the rate-coded model. Individual transients were calculated as the percentage change in the firing rate of the circuit during the transient period compared to the stable firing rate achieved post-transient. This allowed us to gauge the role of the complex striatal dynamics, generated by our microcircuit model and responsible for the transient selection mechanism, on the selection properties of the entire basal ganglia-cortex loop.

#### **3. RESULTS**

In what follows we discuss the simulation results of our model and interpret them as potential mechanisms explaining the findings of Beste et al. (2008). We discuss the two types of potential selection mechanisms that we have termed *transient* and *steady-state*.

#### **3.1. TRANSIENT SELECTION BY THE STRIATUM**

#### *3.1.1. Transient selection emerges from the striatal microcircuit*

We sought insight into the potential for competition within the striatum by examining the dynamics of our three-dimensional network model. We first explored the effect on striatal output of competing inputs to two projection neuron populations. These inputs were intended to emulate the changes in cortical signals representing two alternative responses in a generic two-choice decision-making task.

**Figure 4A** shows the mean firing rate of each channel from the same example simulation. After the divergence in inputs at *t* = 1.5 s, a transient increase of the firing rate is elicited in channel 1, the most salient population, and a transient suppression of the firing rate is elicited in channel 2. This transient suppression occurs despite no change in the input to channel 2. Moreover, this population rapidly returns (∼100 ms) to its pre-step firing rate. Consequently, we termed this phenomenon *transient* selection.

We found that the elicited transient selection was robust over a wide range of choices for the baseline input rate and the signal difference between the two channel inputs after the signal divergence. **Figure 4B** shows that transient selection could be robustly elicited for any step size over 0.5 Hz when the baseline input rate exceeded ∼4 Hz.

#### *3.1.2. Transient selection is due to both circuit and intrinsic membrane properties*

We further investigated the mechanisms underlying the positive and negative transient changes in population activity. We found that the positive transient was produced by single neuron dynamics, whereas the negative transient was due to network connectivity. This can be seen in **Figures 5A,B**, where lesioning either the projection neuron connections or all the network connections abolished the negative transient but did not prevent the positive transient.

To confirm the positive transient was a single neuron phenomenon, we simulated an individual projection neuron model receiving many trials of the same stepped input protocol, and averaged its responses. The resulting peri-stimulus time histogram (**Figure 5C**) shows that the neuron had a clear transient increase in firing probability immediately after the step of input. Running the same test on a model of a cortical regular-spiking pyramidal neuron, with input scaled to produce approximately the same steady-state rates, showed no such transient increase in firing probability after a step in input (**Figure 5D**). Thus the transient increase in population activity observed in a single trial of the network is a statistical phenomenon of synchronous spiking of many projection neurons, and seemingly dependent upon properties particular to the striatal projection neuron.

trains have been convolved with a zero-phase digital filter to create smooth firing rates without lag. **(B)** Mean transient selection landscape color coded such that brighter colors represent higher selectivity. Landscape shows the mean transient selectivity averaged over 30 trials as a function of base input signal and step in signal difference during competition.

We sought to elucidate these properties by injecting sequential current steps directly into the projection neuron model and observing the behavior of the membrane voltage *v* and slow current *u*. **Figure 5E** shows that a step in current applied to an already depolarized membrane triggers a rapid double spike, followed by slower regular spiking. **Figure 5F** plots the corresponding trajectory of the slow current *u*: the initial depolarizing injection makes the slow current *u* increasingly negative, thus slowly charging the membrane potential *v* [**Figure 5E**; see Equation (1)]. The subsequent step of injected current increases the membrane potential rapidly, and the contribution of the large, negative *u* ensures a rapid pair of spikes time-locked to the current step. However, once spiking has been initiated, the equilibrium value of *u* is less negative than immediately before the current step. Consequently, the smaller contribution of the slow current *u* ensures a comparatively slow spike rate in the steady-state.

To show that the slow current *u* is critical, we examined the dependence of this spiking "adaptation" on the parameters of the slow current. We repeated the sequential-step current injection protocol for a range of step sizes, and measured the adapting response as *f*<sub>ratio</sub> = *F*<sub>first</sub>/*F*<sub>last</sub>, the ratio of the first and last interspike intervals after the current step. A value of *f*<sub>ratio</sub> > 1 thus indicates an adaptation. We found that the adaptation response appeared with a second current step above ∼50 pA (blue curves in **Figures 5G,H**). **Figure 5G** shows that the adaptation response disappeared if we reduced the effective time constant of the slow current (increased *a*), allowing the slow current to recover faster after spiking. **Figure 5H** shows that the adaptation response also disappeared if we reduced the gain *b* of the slow current. The transient phenomenon thus depends critically on the slow current *u*.
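The adaptation measure can be sketched from a list of spike times. One assumption is made explicit here: *F*<sub>first</sub> and *F*<sub>last</sub> are taken to be instantaneous firing rates, i.e., the reciprocals of the first and last interspike intervals after the step, so that a rapid initial doublet followed by slower spiking gives a ratio above 1. The spike times and the function name are illustrative.

```python
def f_ratio(spike_times_ms, step_ms):
    """Ratio of first to last instantaneous rate after the current step."""
    after = [t for t in spike_times_ms if t >= step_ms]
    if len(after) < 3:
        return None  # need at least two intervals to compare
    isis = [b - a for a, b in zip(after, after[1:])]
    f_first = 1000.0 / isis[0]   # Hz, from the first interspike interval
    f_last = 1000.0 / isis[-1]   # Hz, from the last interspike interval
    return f_first / f_last

# A doublet 5 ms apart just after the step at 1000 ms, then regular 50 ms intervals.
adapting = f_ratio([900.0, 1002.0, 1007.0, 1050.0, 1100.0, 1150.0], 1000.0)
# Perfectly regular spiking shows no adaptation.
regular = f_ratio([1010.0, 1060.0, 1110.0, 1160.0], 1000.0)
```

Under this reading, the doublet example gives a ratio of 10, while regular spiking gives exactly 1.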

As lesioning only the connections between the projection neurons could abolish the negative transient (**Figure 5A**), this suggested it arose from a network effect where the neurons contributing to the positive transient inhibited their targets. To test this observation, we simulated the model with lesioned projection-neuron collaterals for a range of baseline input firing rates and step sizes (protocol in **Figure 2A**) and computed the size of the negative transient that resulted. **Figure 5I** shows that the negative transient was indeed abolished for a wide range of values for the input firing rates. However, a sufficiently large baseline firing rate and step in firing rate could still result in a negative transient (upper-right corner of **Figure 5I**). Thus, it seems that sufficient cortical drive of the FSI population (which inhibits the projection neurons) also contributes to the negative transient in projection neuron population activity.

output with lesioned projection neuron connections. **(B)** Striatal output with all intra-striatal connections lesioned. **(C)** Peri-stimulus time histogram of a single projection neuron output, averaged over 50 steps of spiking input from *r* = 4 Hz to *r* = 7.2 Hz (onset *t* = 3 s), exhibiting transient behavior. **(D)** Peri-stimulus time histogram of a single regular-spiking cortical neuron model, averaged over 50 steps of spiking input from *r* = 0.75 Hz to *r* = 3 Hz, with no transient behavior. Model parameters given in Izhikevich (2007a). **(E)** The membrane potential (*v*) of the projection neuron model in response to an initial current injection (200 pA) followed by a further step in current at 1 s. **(F)** The corresponding changes in the slow current (*u*). **(G)** *f*<sub>ratio</sub> in the projection neuron model as a function of current step size and slow current decay constant 1/*a* ms. **(H)** *f*<sub>ratio</sub> in the projection neuron model as a function of current step size and slow current gain *b*. **(I)** The effect of projection neuron connection lesions on the negative transient. Landscape of negative transients measured as ratio of the maximum negative transient peak over the steady-state, plotted as a function of base input rate vs signal difference.

#### *3.1.3. Transient selection is sufficient to alter decision making performance*

Though the previous result demonstrates the existence and origin of transient selection within the striatum, it is not sufficient to show a causative effect of transient selection on decision-making. To address this issue, we asked whether such transient signals in the striatum could enhance the selection of input signals by the basal ganglia circuit. Here we consider selection to mean that the output of a substantia nigra pars reticulata (SNr) population falls from its tonic rate to zero. In particular, we hypothesized that the transient signals in striatum would be amplified in the complete basal-ganglia-thalamo-cortical loop, and thus directly influence the output of the basal ganglia.

To test this, we used our rate-coded model of population activity in the basal ganglia-thalamocortical loop (Humphries and Gurney, 2002). The model received inputs to two populations of cortico-striatal neurons (**Figure 6A**), mimicking the protocol used in our full-scale striatum model. An example of the subsequent SNr outputs is illustrated in **Figure 6B**. At the time of the step in input to one population, we emulated the subsequent transient signals observed in our full-scale model by brief injections of further increased input to that striatal population and decreased input to the other. These correspondingly produced small, brief positive and negative transients in the output of those striatal populations, for both D1- and D2-type projection neurons (**Figures 6C,D**). Note that the subthalamic nucleus populations also received the cortical input signals, but not the transient signals.

We found that a small positive transient elicited in the striatal population was sufficient to change the speed and persistence of selection (**Figures 6E–H**). **Figures 6E,F** show that signal selection was maintained for longer with increasing transient sizes. Correspondingly, **Figures 6G,H** show that increasing the size of transients injected into the model striatum decreased the time to selection. These changes were found irrespective of the size of input step, or of the closed-loop gain *g* of the positive thalamo-cortical feedback loop (Chambers et al., 2011). (When *g* = 1, this loop is a perfect integrator, while with *g* = 2, there is an amplifying feedback loop.) Thus, transient signals in the striatum are sufficient to modulate selection by the basal ganglia.

**FIGURE 6 | Transient selection in striatum is amplified by basal ganglia-thalamo-cortical loop.** Panels **(A–D)** show an example simulation of the loop model that included emulation of the transient selection signals originating in the striatum (transient size: 50%; thalamo-cortical loop gain *g* = 2). **(A)** Cortical input to the rate-coded model, mimicking the selection protocol used in the striatal microcircuit selection experiments. **(B)** Corresponding SNr output response for three populations: no input (red); baseline only (blue); and baseline-plus-step (green). The input step thus caused clear selection by forcing the SNr output to zero. **(C)** Evoked response in the rate-coded striatal D1 neurons, showing the effect of the injected transient at *t* = 100. **(D)** Evoked response in the rate-coded striatal D2 neurons. **(E)** Proportion of time an action was selected, as a function of transient size. Transient size is expressed as a proportion of the steady-state firing rate achieved without the transient. Step values indicate the cortical input before and after the step in input. Parameter *g*: closed-loop gain of the thalamocortical loop. **(F)** Proportion of time an action was selected, given a small input step. **(G)** Time delay before selection achieved, as a function of transient size, for large input step. Delay is given between the step in cortical input and the corresponding SNr population reaching zero output. **(H)** Time delay before selection achieved, as a function of transient size, for small input step.

#### **3.2. STEADY-STATE SELECTION BY THE STRIATUM**

Prior debates about selection in the striatum have focussed on stable, winner-take-all modes of computation (Wickens, 1997; Plenz, 2003). In order to compare transient selection with this more common form of selection computations, we sought to understand whether our striatal model could show stable, winner-takes-all-like dynamics; here we refer to these as "steady-state" selection, in contrast to "transient" selection, as the competition between inputs causes persistent changes to output firing rates.

#### *3.2.1. Steady-state selection in a randomly-connected model*

Neurally-inspired models of winner-take-all dynamics are often based on fully-connected or dense randomly-connected networks (Hartline and Ratliff, 1958; Alexander and Wickens, 1993; Fukai and Tanaka, 1997; Mao and Massaquoi, 2007; Yim et al., 2011). We thus simulated our striatal model with random connectivity, in which each neuron type received, on average, the same number of connections, and the connections were made by choosing source neurons at random from across the three-dimensional cuboid. The target number of connections was based on the expected number of connections of a projection neuron and FSI in the center of a 1 mm<sup>3</sup> network, according to the computational anatomical estimates of Humphries et al. (2010) (see Materials and Methods). In this way, the randomly-connected model was more densely connected than the distance-dependent model. Thus, while closer to the topology usually studied for steady-state selection, the randomly-connected model still retained connection statistics consistent with the estimates obtained in Humphries et al. (2010).

We tested the randomly-connected model with the same stepped input protocol as the physically-connected model (**Figure 2A**). **Figure 7A** shows an example of the mean population firing rates in the randomly-connected striatum model, with evident steady-state selection: the population receiving the stepped cortical input increases its firing rate, and the other population correspondingly decreases its firing rate despite receiving the same input throughout. We found that the magnitude of steady-state selection was dependent on the size of the baseline firing rate and input step. **Figure 7B** shows that the most effective steady-state selection occurred for low baseline rates and large input steps, approaching a winner-takes-all-like response of nearly complete suppression (∼80%) of the losing population's activity.
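The suppression magnitude quoted here is described in the caption of Figure 8 as the percentage suppression in the average firing rate of the losing channel after the competitive signal onset. A minimal sketch of such a measure follows; the choice of comparing equal windows before and after onset, and the names `suppression_percent`, `onset` and `window`, are assumptions made for illustration.

```python
def suppression_percent(rates, times, onset, window):
    """Percentage drop in mean firing rate of the losing population.

    Compares the mean rate in the `window` seconds before `onset`
    with the mean rate in the `window` seconds after it.
    """
    before = [r for r, t in zip(rates, times) if onset - window <= t < onset]
    after = [r for r, t in zip(rates, times) if onset <= t < onset + window]
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return 100.0 * (mean_before - mean_after) / mean_before

# Toy trace: 5 spikes/s before the step at t = 2.5 s, 1 spike/s afterwards.
times = [i / 10 for i in range(50)]
rates = [5.0 if t < 2.5 else 1.0 for t in times]
magnitude = suppression_percent(rates, times, onset=2.5, window=2.0)
```

For this toy trace the measure gives 80% suppression, since the rate falls from 5 to 1 spikes/s.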

**Figure 7C** shows that lesioning the connections between projection neurons prevents steady-state selection. **Figure 7D** shows that lesioning the FSI input to the projection neurons reduces but does not eliminate the steady-state selection, while also reinstating a transient period. This suggests that mutual inter-channel inhibition by the projection neuron populations is responsible for the suppression effect seen in both the *random* and the larger *physical* networks.

#### *3.2.2. Distance-dependent connectivity can support steady-state selection*

To assess whether such steady-state selection required homogeneous, random connectivity of the kind described above, we checked whether such selection could be found in the physical model of connectivity. Again using the same stepped input protocol, we simulated physical networks up to 1 mm<sup>3</sup>, in order to increase the density of connectivity within the center of the network, which scales with the number of neurons in the model (**Figure 8B**).

**Figure 8A** shows that steady-state selection could be observed for distance-dependent connectivity, given a sufficiently large model (here 1 mm<sup>3</sup>). We found that the magnitude of steady-state selection increased monotonically with increasing network size (**Figure 8D**), approaching the steady-state selectivity seen in the *random model*. **Figures 8B,C** show that in the *physical* model, as the number of neurons increases with network size, so does the average number of connections each projection neuron receives. By contrast, the *random model* always has the same density of connections. The *physical* model's correspondence between the number of connections to a projection neuron and the effectiveness of steady-state selection suggests that such selection is dependent on the density of connections between projection neurons.

The model further suggests that it is only the increased density of connections that is key, and not an increase in recurrent connections between projection neurons. **Figure 8E** shows the absolute number of recurrent connections in the *physical* and *random* network configurations. Note that the number of bi-directional connections in the *random* network drops off as a function of network size because each neuron receives a fixed number of connections regardless of the network size. By contrast, we see a small rise in the number of bi-directional connections in the *physical* model. However, **Figure 8F** shows that in both random and physical networks the proportion of connections that are bi-directional falls with increasing network size. Thus, the increased effectiveness of steady-state selection is likely due to increased absolute connection density and not increased recurrent connections.
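The falling proportion of bi-directional connections in a fixed in-degree random network can be checked with a short sketch. It is an illustration under stated assumptions, not the authors' analysis: one neuron type, exactly `k_in` inputs per neuron, and the ratio is taken here as the fraction of all connections that are reciprocated. All function names are hypothetical.

```python
import random

def random_inputs(n_neurons, k_in, rng):
    """inputs[i] = set of k_in random source neurons projecting to neuron i."""
    return [set(rng.sample([j for j in range(n_neurons) if j != i], k_in))
            for i in range(n_neurons)]

def bidirectional_stats(inputs):
    """Return (number of reciprocal pairs, fraction of connections reciprocated)."""
    pairs = sum(1 for i, sources in enumerate(inputs)
                for j in sources if j > i and i in inputs[j])
    total = sum(len(sources) for sources in inputs)
    # Each reciprocal pair accounts for two directed connections.
    return pairs, 2 * pairs / total

rng = random.Random(1)
ratios = {}
for n in (100, 400, 1600):
    pairs, ratio = bidirectional_stats(random_inputs(n, 20, rng))
    ratios[n] = ratio
```

In this sketch a given connection is reciprocated with probability `k_in / (n_neurons - 1)`, so the reciprocated fraction shrinks as the network grows while the in-degree stays fixed.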

#### **3.3. COMPARING SELECTION MECHANISMS: PARADOXICAL SELECTION ENHANCEMENT IN HUNTINGTON'S DISEASE**

Having established that two contrasting forms of selection can be supported by the striatal circuit, depending on the type and density of connectivity, we then sought insight into how the two forms of selection could be distinguished. In particular, we hypothesized that they would make different predictions about how changes to the striatum would alter response selection. In order to test this hypothesis, we sought an experimental data-set that could provide a basis for testing our predictions.

Beste et al. (2008) have recently shown a rare example of paradoxical cognitive enhancement in a neurological disorder. They reported that manifest Huntington's disease patients had faster and less error-prone response selection on a simple two-choice auditory task than controls or pre-manifest Huntington's disease patients. Huntington's disease is primarily characterized by widespread loss of striatal projection neurons [FSI populations have been shown to be more resistant to HD-modifications (Ghiglieri et al., 2012)] and by increased sensitivity of NMDA receptors on striatal projection neurons (Fan and Raymond, 2007). These results therefore suggest the hypothesis that one or both of these changes to the striatum lead to enhanced selection, and we accordingly examined excitotoxicity as a possible candidate for the paradoxical improvement.

**FIGURE 8 | Steady-state selection in the physical model of the striatal microcircuit. (A)** Mean firing rate of two projection neuron populations in a 1 mm<sup>3</sup> model, with 89,749 total simulated neurons. **(B)** Number of simulated neurons as a function of network size. **(C)** Average number of connections per neuron as a function of network size. The *physical* network (black) approaches the density of connections seen in the random network (gray) with increased network size. **(D)** Magnitude of steady-state selection as a function of network size. All simulations used the inputs [5,6] Hz. Magnitude is the percentage suppression in the average firing rate of the losing channel after the competitive signal onset (*t* = 2.5 s). Shown in gray is the steady-state selectivity seen in the random model for a network of size 300 μm<sup>3</sup>. Bars set at ± 2 s.d., computed over 15 repeats. **(E)** Number of bi-directional connections as a function of network size. The total number of pairs of reciprocal connections in the *physical* model is shown in black, and in the *random* model in gray. Bi-directional pairs decrease in the *random* model with increasing network size, due to the fixed number of connections each neuron receives. **(F)** The ratio *R*<sub>bi</sub> of bi-directional connections to the total number of connections a neuron makes for the *physical* model (black) and the *random* model (gray).

We thus simulated both transient and steady-state selection under Huntington's-like changes to the striatal model, and searched for evidence of enhanced selection. We emulated increased NMDA receptor sensitivity by increasing the conductance of the NMDA synapse (we report this as the ratio of the NMDA:AMPA conductances), and separately emulated the cell loss by randomly removing a specified percentage of projection neurons. We did this to explore a wide range of plausible simulated Huntington's disease conditions. Across both changes, we mapped the change in transient and steady-state selection in response to the same input protocol (baseline 5 Hz, step 1 Hz).
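The two Huntington's-like manipulations can be sketched as simple operations on a model description. This is a hedged illustration only: the helper names, the population size and the grid of conditions are hypothetical, and the only values taken from the text are the control point (0% cell loss, NMDA:AMPA ratio 0.5) and the idea of scaling the NMDA conductance relative to AMPA.

```python
import random

def lesion_projection_neurons(neuron_ids, percent_loss, rng):
    """Return the surviving neuron ids after removing percent_loss % at random."""
    n_remove = round(len(neuron_ids) * percent_loss / 100.0)
    removed = set(rng.sample(neuron_ids, n_remove))
    return [n for n in neuron_ids if n not in removed]

def nmda_conductance(g_ampa, nmda_ampa_ratio):
    """NMDA conductance implied by an AMPA conductance and an NMDA:AMPA ratio."""
    return g_ampa * nmda_ampa_ratio

rng = random.Random(0)
neurons = list(range(1000))  # hypothetical projection neuron population

# Hypothetical grid of conditions; the control corner is (0 % loss, ratio 0.5).
conditions = [(loss, ratio)
              for loss in (0, 25, 50, 75)
              for ratio in (0.5, 0.75, 1.0)]
survivors = {c: lesion_projection_neurons(neurons, c[0], rng) for c in conditions}
```

Each entry of `survivors` would then define one simulated condition, with the selection measure evaluated on the lesioned network at the corresponding NMDA:AMPA ratio.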

#### *3.3.1. Steady-state selection consistently degrades in simulated Huntington's disease*

To assess the impact of Huntington's-like changes on steady-state selection, we used the randomly-connected model to ensure that the suppression of the losing population was sufficient to be detectably modulated by the Huntington's-like changes. **Figure 9** shows that steady-state selection was uniformly diminished by all Huntington's-like changes, whether in isolation or combination.

#### *3.3.2. Transient selection enhancement in simulated Huntington's disease*

We assessed the impact of Huntington's-like changes on transient selection using the same physical model network as that used for **Figure 4**. **Figure 10** shows that transient selection could be diminished by the loss of projection neurons alone, yet could be enhanced by the simultaneous increase in NMDA conductance. Thus the model predicts a region of Huntington's-like conditions where the deleterious effect of cell loss can be more than compensated by the increased sensitivity of NMDA receptors.

**FIGURE 9 | Steady-state selection under simulated Huntington's disease. (A)** An example of reduced signal suppression in the striatum with high cell atrophy (65% cell loss, NMDA:AMPA ratio 0.5). **(B)** An example of removed signal suppression in the striatum with high degradation (75% cell loss, NMDA:AMPA ratio 1). **(C)** Magnitude of signal suppression over all simulated Huntington's conditions. Magnitudes are means over 15 simulations. The control, healthy-state model is in the bottom left-hand corner (NMDA:AMPA = 0.5; 0% atrophy).

**Figure 10A** shows an example improvement in transient selectivity under high cell atrophy and high excitability, whereas **Figure 10B** shows the removal of the transient selectivity under high cell atrophy but only a small increase in excitability. These examples show that the transient selectivity range of ∼0.10 over the "excitotoxicity landscape" in **Figure 10C** corresponds to dramatic changes in the striatal output. Further, **Figure 6** shows that even small modifications in the transient size in the striatum will modulate the signal selection speed in the wider basal ganglia networks.

## **4. DISCUSSION**

We found a novel form of transient selection supported by the striatal network. This emerged from our three-dimensional network of sparse, weak feedback connectivity between the striatal projection neurons and dense, strong feedforward inputs from the fast-spiking interneurons. We observed that rapidly increasing the ongoing input to one of two competing populations of projection neurons caused a transient peak of activity in that population and a synchronous transient dip in activity of the other. The dip lasted around 100 ms before the activity returned to its pre-step level, thus showing no steady-state competitive effect between the two populations.

Using a population-level model of the complete basal ganglia-thalamo-cortical loop, we showed that transient selection in the striatum was sufficient to enhance selection by the entire circuit (as determined by suppression of SNr output). The presence of transient selection both increased the speed at which the whole circuit resolved a competition between salient inputs, and increased the circuit's ability to persist with the selected input. Both effects were observed for either perfect-integrator or amplifying feedback in the thalamo-cortical loop.

The origin of the transient selection had two components. The positive transient in the population activity was driven by single neuron adaptation. We found that a further step in input to an already depolarized projection neuron caused a spike followed by rapid decrease in spiking probability. This implies that the positive transient observed in the population activity was a statistical effect: that, across a whole population of projection neurons, a sub-set of neurons were sufficiently depolarized at the time of stepped input to show this adaptation effect in synchrony, and thus cause a transient peak in population activity.

The negative transient in the population activity was a subsequent network effect of the positive transient: the synchronized spiking of the neurons participating in the positive transient was sufficient to drive a dip in activity in their target neurons in the other population.

#### **4.1. TWO FORMS OF SELECTION COMPETITION**

Having established the existence and mechanics of the transient selection phenomenon, we sought to understand the conditions under which our striatal model could also support a steady-state competition effect, akin to classical winner-takes-all (Hartline and Ratliff, 1958; Fukai and Tanaka, 1997; Mao and Massaquoi, 2007). Such steady-state competition could plausibly arise in striatum as each projection neuron receives sufficient weak synapses from other projection neurons to continuously modulate its ongoing activity (Guzman et al., 2003; Humphries et al., 2010; Chuhma et al., 2011).

We found that increasing the number of projection neuron synapses gave rise to steady-state competition, where the stable increase in activity in one population caused a stable decrease in activity of the other population. These results are consistent with Yim et al. (2011), who reported a weakly-competitive effect between two populations of neurons in a randomly-connected inhibitory network of spiking neurons, and showed that weak correlation between inputs to the network could enhance this effect. We advanced this result by showing that such steady-state competition could arise in both distance-dependent and randomly-connected networks, given either that we increased the physical size of our three-dimensional striatal network, and thus increased the density of connections, or that we randomly connected the network based on the average connections of the most densely connected projection neuron.

Our models thus predict that the form of selection competition is dependent on the density of connections between projection neurons. Whether the striatum is ever as sparsely connected as in our distance-dependent model, or ever as densely connected as in the homogeneous random model, is an open question. It is possible that both forms of selection exist depending on local inhomogeneities in striatal tissue. We know that many aspects of the striatum show gradients of density across the network, including the dorsal-ventral gradient of interneuron populations (Kubota and Kawaguchi, 1993) and the rostro-caudal gradient of FSI gap junctions (Fukuda, 2009). Correspondingly, it is plausible that there exists a gradient of projection neuron connection density.

We also note that the recent report by Oorschot et al. (2013) of projection neuron collaterals making synapses on to the somas of other projection neurons can only enhance both forms of competition. Such GABAergic somatic synapses are likely to shunt all dendritic input to the soma, thus providing powerful feedback inhibition. For transient selection, this could result in a larger negative transient; for steady-state selection, this could result in more depressed activity in the losing population. Open questions here include the relative density of such somatic synapses originating from projection neurons, and whether they have specific functional targets such as specifically occurring between projection neurons in competing populations.

Both forms of striatal selection mechanisms ultimately influence selection mediated by the whole basal ganglia network and expressed via their output nuclei (including SNr). As discussed in the Materials and Methods, this expression is via disinhibition (Chevalier and Deniau, 1990; Berns and Sejnowski, 1998; Redgrave et al., 1999; Gurney et al., 2001a; Humphries et al., 2006); increased activity of a striatal population inhibits the tonic inhibitory output of a SNr population, thus representing the selection of their represented signal (**Figure 3**). We showed that transient selection in the striatal populations is sufficient to enhance selection by disinhibition from SNr (**Figure 6**). This occurs because the most salient input causes a transient increase of activity in the corresponding striatal population and consequently transiently decreases the output of the corresponding SNr population. This fall is sufficient to allow activity to grow in the target thalamo-cortical loop, which in turn projects to the original striatal population, further increasing its activity—thus the positive feedback loop amplifies the transient changes in striatum. The effect of steady-state selection in the striatum on the whole basal ganglia is more straightforward. The long-lasting drop in output of all losing striatal populations comparatively reduces their inhibition of the corresponding SNr populations. Consequently, the fall in output of the SNr population representing the winning signal is enhanced compared to its competitors.

#### **4.2. EXPERIMENTAL PREDICTIONS OF TRANSIENT SELECTION**

Direct experimental observation of transient selection is challenging. The positive transient in population activity could only be observed on a single trial given sufficient simultaneous sampling of neurons within that population, a situation unlikely to occur with current recording technology. However, we showed that the basic mechanism underlying the positive transient in the population activity could be observed through sequential steps of current injection into a single neuron model. Thus our model makes a tractable experimental prediction: that there exists a regime of long, sequential steps of current into the projection neuron soma that will elicit a rapid burst of two or more spikes followed by slower regular firing. If such a regime exists, it would provide evidence in favor of the existence of transient selection mechanisms in the striatal network.

#### **4.3. TRANSIENT SELECTION ALONE COULD EXPLAIN ENHANCED SELECTION IN HUNTINGTON'S DISEASE**

We sought to determine whether transient and steady-state selection could be differentiated by their predictions for how changes to the striatal circuit would affect selection. To this end, we asked if Huntington's-like changes of increased NMDA receptor sensitivity and loss of projection neurons could account for Beste et al. (2008)'s report of enhanced selection by Huntington's disease patients. In terms of our models, we asked if either transient or steady-state selection would improve due to these Huntington's-like changes to the striatum.

As one might expect *a priori*, simply removing projection neurons and thus reducing connectivity between them impaired both types of selection. Increasing NMDA receptor sensitivity also impaired steady-state selection, and thus this form of selection predicted that all Huntington's-like changes impair selection, a result which is inconsistent with the report by Beste et al. (2008). Surprisingly, however, we found that for transient selection, increased NMDA receptor sensitivity could more than compensate for cell loss and actually enhance selection. We also found that transient selectivity was only clearly improved with both high cell degradation and increased excitability, and thus not in pre-symptomatic-like conditions. Thus, alteration of transient selection and not steady-state selection in striatum is consistent with enhanced performance of symptomatic Huntington's disease patients compared to controls and pre-symptomatic patients.

Beste et al. (2008) noted that this enhanced response selection was paradoxical, as Huntington's disease patients are consistently worse than age-matched controls across a range of cognitive decision-making tasks (Knopman and Nissen, 1991; Bamford et al., 1995; Lawrence et al., 1998; Ho et al., 2003). Our models offer two potential explanations for why Huntington's disease related changes in striatum are usually associated with cognitive impairment but could also lead to paradoxical cognitive enhancement. First, suppose that all regions of striatum engaged by cognitive tasks implement transient selection. Our model shows that there are limited combinations of NMDA receptor sensitivity increase and cell atrophy where transient selection is enhanced compared to the healthy case; for most combinations transient selection is deteriorated compared to the healthy state. Thus, one hypothesis is that there is a continuum of NMDA receptor sensitivity increase and cell atrophy across the striatum, and the Beste et al. (2008) task engaged a region of striatum with enhanced transient selection, whereas most tasks engage regions of the striatum with deteriorated transient selection. Second, suppose instead that different regions of striatum use transient or steady-state selection dependent on the local density of projection neuron connections. Our models show that steady-state selection is always deteriorated by any Huntington's-like change to the striatum. Consequently, this suggests the hypothesis that the Beste et al. (2008) task engaged a region of the striatum using (enhanced) transient selection, whereas most cognitive tasks engage a region of striatum using steady-state selection, and thus are always deteriorated in Huntington's disease patients compared to the healthy state.

#### **FUNDING**

L'Agence Nationale de Recherche "NEUROBOT" project and a MRC Senior non-Clinical Fellowship (Mark D. Humphries); the EU Framework 7 "IM-CLeVeR" project (Kevin Gurney); EPSRC Green Brain project EP/J019534/1 (Eleni Vasilaki); EPSRC DTA student scholarship (Adam Tomkins); and Deutsche Forschungsgemeinschaft (DFG) Grant BE4045/10-1.

#### **ACKNOWLEDGMENTS**

We thank Alex Cope for help with testing the simulation code.

## **REFERENCES**

Alexander, M. E., and Wickens, J. R. (1993). Analysis of striatal dynamics: the existence of two modes of behaviour. *J. Theor. Biol.* 163, 413–438. doi: 10.1006/jtbi.1993.1128

Asaad, W. F., Rainer, G., and Miller, E. K. (2000). Task-specific neural activity in the primate prefrontal cortex. *J. Neurophysiol.* 84, 451–459.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 September 2013; accepted: 21 December 2013; published online: 20 January 2014.*

*Citation: Tomkins A, Vasilaki E, Beste C, Gurney K and Humphries MD (2014) Transient and steady-state selection in the striatal microcircuit. Front. Comput. Neurosci. 7:192. doi: 10.3389/fncom.2013.00192*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Tomkins, Vasilaki, Beste, Gurney and Humphries. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A computational model of altered gait patterns in Parkinson's disease patients negotiating narrow doorways

#### *Vignesh Muralidharan<sup>1</sup>, Pragathi P. Balasubramani<sup>1</sup>, V. Srinivasa Chakravarthy<sup>1</sup>\*, Simon J. G. Lewis<sup>2</sup> and Ahmed A. Moustafa<sup>3</sup>*

*<sup>1</sup> Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India*

*<sup>2</sup> Parkinson's Disease Research Clinic, Brain and Mind Research Institute, The University of Sydney, Sydney, NSW, Australia*

*<sup>3</sup> Marcs Institute for Brain and Behaviour and School of Social Sciences and Psychology, University of Western Sydney, Sydney, NSW, Australia*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Rafal Bogacz, University of Oxford, UK*

*Robert Rosenbaum, University of Pittsburgh, USA*

#### *\*Correspondence:*

*V. Srinivasa Chakravarthy, Computational Neuroscience Laboratory, Department of Biotechnology, School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India e-mail: schakra@iitm.ac.in*

We present a computational model of altered gait velocity patterns in Parkinson's Disease (PD) patients. PD gait is characterized by short shuffling steps, reduced walking speed, increased double support time and sometimes increased cadence. The most debilitating symptom of PD gait is the context-dependent cessation of gait known as freezing of gait (FOG). Cowie et al. (2010) and Almeida and Lebold (2010) investigated FOG as the changes in velocity profiles of PD gait, as patients walked through a doorway of variable width. The former reported a sharp dip in velocity, a short distance from the doorway, that was greater for narrower doorways. They compared the gait performance of PD freezers ON and OFF dopaminergic medication. In keeping with this finding, the latter reported the same for ON medicated PD freezers and non-freezers. In the current study, we sought to simulate these gait changes using a computational model of Basal Ganglia based on Reinforcement Learning, coupled with a central pattern generator (CPG) model that mimics spinal rhythm generation. In the model, a simulated agent was trained to learn a value profile over a corridor leading to the doorway by repeatedly attempting to pass through the doorway. Temporal difference error in value, associated with the dopamine signal, was appropriately constrained in order to reflect the dopamine-deficient conditions of PD. Simulated gait under PD conditions exhibited a sharp dip in velocity close to the doorway, with PD OFF freezers showing the largest decrease in velocity compared to PD ON freezers and controls. PD ON and PD OFF freezers both showed sensitivity to the doorway width, with the narrow door producing the lowest velocity and stride length. Step length variations were also captured, with PD freezers producing smaller steps and larger step variability than PD non-freezers and controls. In addition, this model is the first to explain the non-dopamine dependence of FOG, giving rise to several other possibilities for its etiology.

**Keywords: gait, freezing of gait, doorway, basal ganglia, reinforcement learning**

#### **INTRODUCTION**

Altered gait behavior is a motor impairment observed in patients with Parkinson's disease (PD), a neurodegenerative disorder that involves a loss of dopaminergic neurons in the brain. PD gait is characterized by the following features: (1) reduced stride length, reduced walking speed, increased cadence and increased double support duration (Morris et al., 1998); (2) flat foot strike and, in rare conditions, a "toe-to-heel strike" gait pattern (Hughes et al., 1990); (3) lower intra-individual variability in foot strike patterns than in control subjects (Kimmeskamp and Hennig, 2001); (4) altered vertical ground reaction force (VGRF), the normal force exerted on the foot during gait, which has two peaks in controls: one when the foot hits the ground, and the other when it lifts off again. In early stages of PD, the two peaks in VGRF are present but with lower intensity compared to controls. In advanced PD, where the patients walk with narrow shuffling steps, the two peaks in VGRF merge into one (Koozekanani et al., 1987); (5) postural instability, a common feature in late-stage PD. Postural sway is also reduced, probably due to reduced flexibility in adjusting one's bodily responses to changing posture (Morris et al., 2000). Abnormal postural sway in PD might also be due to stiff joints. The degree of gait variability, as seen in any of the above-mentioned features, is correlated with gait severity in PD patients (Hausdorff et al., 1998).

In addition to the aforementioned features, a more debilitating and dramatic feature of PD gait is known as Freezing of Gait (FOG). It is characterized by frequent falls (Latt et al., 2009), and is an episodic phenomenon of cessation of gait triggered by certain environmental contexts like narrow passages or crowded places (Almeida and Lebold, 2010; Cowie et al., 2010). PD gait features like reduced stride length and reduced walking speed appear to be gradually aggravated under certain environmental conditions, culminating in a motor block, or a freezing episode (Chee et al., 2009). Some PD patients (PD-freezers) exhibit freezing in specific contexts such as facing transverse lines on a road crossing or narrow doorways (Hughes et al., 1990; Morris et al., 1998), while the same transverse lines on a treadmill alleviate freezing symptoms (Azulay et al., 1999). This shows the importance of higher-level cortical control over the rhythm-generating spinal control in gait and FOG, since visual feedback can affect gait only through the cortical route.

Human motor function has three levels of control: cortical, subcortical and spinal. Specifically, gait is controlled by a complex network of brain areas spanning all three levels: the neocortex (Sahyoun et al., 2004); subcortical areas including the basal ganglia (BG), vestibular system and cerebellum; and the spinal cord (Middleton and Strick, 2000; Lemon, 2008; Takakusaki et al., 2008). Motor commands arising from the brain's gait control centers are strongly influenced by sensory feedback via visual, proprioceptive and other sensory channels (Sahyoun et al., 2004). At the level of the spinal cord, each limb is thought to be controlled by a network of unit burst generators called Central Pattern Generators (CPGs) (Ijspeert, 2008). This network of CPGs, which acts under top-down control from higher cortical motor areas and under proprioceptive and visual feedback, is thought to be the ultimate driver of human gait. The broad picture of the neural substrates involved in gait control is shown in **Figure 1A**. However, since the focus of the present study is PD gait, we limit ourselves to a smaller architecture that highlights the role of BG (**Figure 1B**). A more detailed description and justification of the model architecture is presented in The Model.

Motor and other forms of impairment observed in PD are primarily linked to dopamine deficiency caused by cell loss in the Substantia Nigra pars compacta (SNc), a small but important nucleus in BG (Kish et al., 1988). The BG is a group of subcortical nuclei performing vital roles of action selection, action gating, motor preparation, among others (Chakravarthy et al., 2010). The striatum is the major input port of BG affected by the activity of the cortex and the limbic regions. This gets connected directly to the output port (Globus Pallidus interna / Substantia Nigra pars reticulata) via the direct pathway (DP), or through Sub-thalamic Nuclei—Globus Pallidus externa network via the indirect pathway (IP). The output nuclei project onwards to cortical targets like prefrontal, premotor and the motor cortices via the thalamus (Chakravarthy et al., 2010).

The idea that the mesencephalic dopamine signal is linked to environmental rewards (Houk et al., 1995; Schultz et al., 1997) opened doors to the application of concepts from reinforcement learning (RL) to model BG (Joel et al., 2002; Frank, 2005; Chakravarthy et al., 2010). The basic tenet of RL is that stimulus-response pairs that are rewarding are reinforced and those that are punitive are attenuated. The mapping between stimuli and responses would have been an easier problem, but for the fact that reward often comes, not immediately after an action is performed, but after a delay. In some cases, reward and punishment feedback arrives after a long series of actions. It remains then to allocate credit to past actions and determine which actions have contributed to reward and to what extent, a problem otherwise known as the temporal credit assignment problem. Since reward comes after a delay, for the simulated object (referred to here as the agent) to select the correct action at any given instant, RL theory offers a surrogate to reward known as the value function. The value function is defined as the total expected future reward, with appropriate discounting of the future. The RL component known as the 'Critic' computes value after repeatedly sampling the action space and receiving rewards/punishments. Another key RL component known as the 'Actor' uses the value information provided by the critic to select correct or potentially rewarding actions. Many computational models have been directed toward mapping RL concepts onto the functional anatomy of BG (Joel et al., 2002).
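To make the value function and the Critic concrete, here is a minimal sketch of textbook TD(0) value learning on a one-dimensional corridor. It is not the model developed in this paper: the fixed step-forward policy, the single reward at the final state, and the parameters `gamma` (discount factor) and `alpha` (learning rate) are assumptions chosen for illustration.

```python
def td0_corridor(n_states=10, gamma=0.9, alpha=0.1, episodes=500):
    """Learn a value profile over a 1-D corridor with TD(0).

    The agent always steps forward; reaching the last state (the "doorway")
    ends the episode with reward 1. All other transitions give reward 0.
    """
    V = [0.0] * n_states  # V[n_states - 1] is terminal and stays 0
    for _ in range(episodes):
        for s in range(n_states - 1):
            s_next = s + 1
            reward = 1.0 if s_next == n_states - 1 else 0.0
            # Temporal difference error: the dopamine-like teaching signal.
            delta = reward + gamma * V[s_next] - V[s]
            # Critic update: move the value estimate toward its target.
            V[s] += alpha * delta
    return V

V = td0_corridor()
```

After training, the learned values rise toward the doorway, approaching `gamma ** k` for a state that is `k` steps before the rewarded transition, which illustrates how discounting spreads a delayed reward back over earlier states.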

According to the classical depictions of functional anatomy of BG, the DP facilitates movement, and is hence dubbed the GO pathway, while the IP inhibits movement and is hence known as the NOGO pathway (Albin et al., 1989; Frank, 2005). Signal transmission between DP and IP is thought to be switched by striatal dopamine: higher (lower) levels of striatal dopamine activate the GO (NOGO) pathway. In earlier work, we proposed that the classical GO/NOGO picture of BG function needs to be expanded, suggesting insertion of a third "EXPLORE" regime between GO and NOGO (Sridharan et al., 2006; Chakravarthy et al., 2010; Magdoom et al., 2011; Kalva et al., 2012). This EXPLORE regime drives the stochastic exploration of action space, which is essential for RL to work in complex environments. In the present study we use this expanded GO/EXPLORE/NOGO (GEN) understanding of BG functioning to model PD gait.
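The three-regime idea can be illustrated with a simple thresholding sketch. This is only a schematic reading of the description above: the function name `gen_regime` and the thresholds `low` and `high` are hypothetical, and the cited GEN models may implement the switching differently.

```python
def gen_regime(dopamine, low=-0.1, high=0.1):
    """Map a dopamine-like signal to a GO / EXPLORE / NOGO regime.

    `low` and `high` are hypothetical thresholds chosen for illustration:
    high dopamine selects GO, low dopamine selects NOGO, and intermediate
    values fall into the EXPLORE regime between them.
    """
    if dopamine > high:
        return "GO"
    if dopamine < low:
        return "NOGO"
    return "EXPLORE"

regimes = [gen_regime(d) for d in (0.5, 0.0, -0.5)]
```

Under this reading, limiting the range of the dopamine-like signal would change how often each regime is entered, which is one way a dopamine-deficient condition could be expressed in such a scheme.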

Two experimental studies that investigate the gait pattern of PD patients as they approach a doorway are simulated in the present study (Almeida and Lebold, 2010; Cowie et al., 2010). Cowie et al. (2010) report a sharp dip in velocity as the PD patient approaches the doorway, a dip that becomes sharper for narrower doorways; this effect was more pronounced in PD patients (ON and OFF freezers) than in healthy controls. Almeida and Lebold (2010) consider a similar setup but compare the gait patterns of PD freezers with those of non-freezers in terms of step length and its variability. The proposed BG model accounts for the velocity profiles and gait features (stride/step lengths) of PD patients reported in these two experimental studies.

We model gait at two levels of control: (1) the higher level of control representing the cortico-basal-ganglia system, and (2) the spinal-level CPGs that translate higher-level gait commands such as velocity into gait rhythm (**Figure 1B**). The BG model is essentially simulated using the Actor-Critic architecture, with the difference that the Actor is modeled by the GEN model (Sridharan et al., 2006; Chakravarthy et al., 2010; Magdoom et al., 2011; Kalva et al., 2012). The spinal CPGs are modeled by networks of Hopf oscillators (Righetti and Ijspeert, 2006). The model is used to simulate the results of two PD gait studies (Almeida and Lebold, 2010; Cowie et al., 2010).

The paper is organized as follows. The Model section describes the modeling components and equations. The Results section explains the experimental setup, the model implementation and the simulation results. Velocity profiles of control subjects and PD patients (ON/OFF, freezers/non-freezers) as they negotiate a doorway are simulated and compared with experimental results. Finally, the Discussion section covers the results obtained, model limitations, predictions and future work.

#### **THE MODEL**

The proposed model simulates the approach of a subject to a doorway and computes the velocity profile along the track leading to the doorway. The agent repeatedly approaches a doorway, walking along a short track. The agent aims at passing through the doorway without bumping into the sides of the doorway. Due to the well-known tradeoff between accuracy and speed in motor function (Mackay, 1982; Bradshaw and Sparrow, 2000; Duarte and Latash, 2007), rapid approaches to the doorway are more likely to result in a collision. Therefore, in our model, the agent learns to reduce its speed in the vicinity of the doorway, which it does using RL mechanisms.

**Figure 2** shows the block diagram of the proposed model, which consists mainly of three components: the Cortico-BG system, the CPG, and the locomotor apparatus. The Cortico-BG system, shown inside the dashed box (**Figure 2**), takes a representation of the view of the doorway, the "view vector," from the position, *X*, of the agent. The view vector is obtained from the cortical module VISION. The block labeled τ represents the time delay in the pathway. The BG [consisting of the CRITIC, ACTOR (GEN), VALUE DIFFERENCE, and TD ERROR modules] uses the view vector and updates the agent's velocity (*vx* and *vy*). This velocity information from the higher command centers is sent to the CPG module, which translates the velocity into joint angles (θ). The subsequent block, labeled STRIDE, uses the joint angle information and orientation (*vx* and *vy*) and computes the next position.

The ENVIRONMENT (doorway) module checks whether the new position results in a collision of the agent with the doorway. A positive reward, *r*, is delivered if there is no collision, and a punishment (negative *r*) in case of collision. The BG uses the view vector and reward information to compute value, thereby completing the cycle.

We now describe individual model components in detail.

#### **THE CORTICO-BASAL GANGLIA SYSTEM: VISION**

This module computes the state of the agent, the "view vector," φ, which codes the view of the doorway from the position [(*x*, *y*) or *X*] of the agent (**Figure 3**). The calculations are given by the Equations (1–5).

In our study, the field of vision (FOV) of the agent is fixed at 120°. The FOV is divided into small sectors, whose number sets the size of the view vector. In our case, the view vector is a 1 × 50 array and the FOV is therefore split into 50 sectors (**Figure 3**). The position of the agent (*x*, *y*) is the viewing point, and the orientation components *vx* and *vy* form the view direction of the agent, from which it can see 60° to the left and 60° to the right. Taking *R*<sub>o</sub> as the orientation vector (2 × 1) formed by *vx* and *vy*, and Θ<sup>sec</sup><sub>*i*</sub> as the angle subtended by the *i*th sector with respect to *R*<sub>o</sub>, the orientation vector of each of the other 49 sectors is given by

$$R\_i^{\text{sec}} = O\_{\text{mat}} R\_o \tag{1}$$

where *O*mat is the orientation matrix (2 × 2) given by

$$O\_{\rm mat} = \begin{bmatrix} \cos \left( \Theta\_i^{\rm sec} \right) & \sin \left( \Theta\_i^{\rm sec} \right) \\ -\sin \left( \Theta\_i^{\rm sec} \right) & \cos \left( \Theta\_i^{\rm sec} \right) \end{bmatrix} \tag{2}$$

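The slope *m*<sub>*i*</sub> (Equation 3) of each *R*<sup>sec</sup><sub>*i*</sub>, with components *R*<sup>*x*</sup><sub>*i*</sub> and *R*<sup>*y*</sup><sub>*i*</sub>, is calculated with respect to the agent's current position (*x*, *y*).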

$$m\_i = \left[ \left( y + R\_i^{y} \right) - y \right] / \left[ \left( x + R\_i^{x} \right) - x \right] \tag{3}$$

To determine whether a given sector's orientation vector hits the door or a wall, and taking the *y* coordinate of the door as *y*<sup>door</sup><sub>*i*</sub>, the *x* coordinate (*x*<sup>door</sup><sub>*i*</sub>) at which each orientation vector reaches *y*<sup>door</sup><sub>*i*</sub> is calculated as in Equation 4.

$$x\_{i}^{\text{door}} = \left(y\_{i}^{\text{door}} - y\right) / m\_{i} + x \tag{4}$$

Using the *x*<sup>door</sup><sub>*i*</sub> coordinates of all the sectors, the view vector is given by Equation 5.

$$\phi\_i(t) = \begin{cases} 1 & \text{if } (x\_i^{\text{door}} \ge -d\_{\text{Pos}\_x}) \wedge (x\_i^{\text{door}} \le d\_{\text{Pos}\_x}) \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

Therefore, the agent viewing the doorway from a given position (*X*) sees more or fewer 1s in its visual field, depending on its orientation, its distance to the doorway and the width of the doorway (*d*length) (**Figure 3**). The view vector is thus well suited to serve as the state of the agent.
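The computation in Equations (1–5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `view_vector`, the even spacing of sector angles across the FOV, and the reading of *d*Pos\_x as the half-width of the door opening (`door_half_width`) are assumptions. The slope of Equation 3 is folded into the ratio `rx / ry` so that a vertical ray needs no special case.

```python
import math

def view_vector(x, y, vx, vy, door_y, door_half_width,
                n_sectors=50, fov_deg=120.0):
    """Binary view vector: 1 where a sector's ray meets the door line inside the opening."""
    phi = []
    for i in range(n_sectors):
        # Assumed: sector angles are spread evenly over the field of vision.
        ang = math.radians(-fov_deg / 2 + i * fov_deg / (n_sectors - 1))
        # Equations 1-2: rotate the orientation vector (vx, vy) by the sector angle.
        rx = math.cos(ang) * vx + math.sin(ang) * vy
        ry = -math.sin(ang) * vx + math.cos(ang) * vy
        if ry <= 0:
            # Assumed: a ray that does not head toward the door line sees no door.
            phi.append(0)
            continue
        # Equations 3-4 with m_i = ry / rx: x where the ray reaches y = door_y.
        x_door = x + (door_y - y) * rx / ry
        # Equation 5: 1 if the ray lands inside the door opening, else 0.
        phi.append(1 if -door_half_width <= x_door <= door_half_width else 0)
    return phi

# Example: the same door fills more sectors when seen from close up.
far = view_vector(0.0, 0.1, 0.0, 1.0, door_y=10.0, door_half_width=1.0)
near = view_vector(0.0, 9.0, 0.0, 1.0, door_y=10.0, door_half_width=1.0)
print(sum(far), sum(near))
```

The number of 1s grows as the agent approaches the door, which is the property that lets the view vector act as a state signal.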

#### **THE BASAL GANGLIA MODULE**

The BG module is essentially simulated using the Actor-Critic architecture (Joel et al., 2002) but with important deviations from classical RL (Sutton and Barto, 1998) regarding the formulation of the Actor.

#### *Critic*

The Critic computes the value "*V*" for the view vector [φ(*t*)]. It is defined as an estimation of the predicted reward at any time, *t*, for that state φ(*t*). The value function is denoted by Equation 6.

$$V(t) = E(r\_{t+1} + \gamma r\_{t+2} + \gamma^2 r\_{t+3} + \dotsb) \tag{6}$$

Here, *r*<sub>*t*</sub> is the reward *r* obtained at time *t*.
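As a small numerical illustration of Equation 6, the sketch below evaluates the discounted sum for one sampled reward sequence. Equation 6 is an expectation over such sequences, so this computes a single sample of it; the function name and the example rewards are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum r_{t+1} + gamma * r_{t+2} + gamma^2 * r_{t+3} + ... for one sequence."""
    total = 0.0
    for k, r in enumerate(rewards):
        # rewards[0] plays the role of r_{t+1}, so it is not discounted.
        total += (gamma ** k) * r
    return total

# Example: a reward of 5 that arrives two steps later, with gamma = 0.9.
print(discounted_return([0.0, 0.0, 5.0], 0.9))
```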

In our study, we approximated *V*(*t*) as in Equation 7.

$$V(t) = \tanh\left[\sum W\_i(t)\phi\_i(t)\right] \tag{7}$$

The update equation for the above approximation (with weight vector *W*) is given by Equation 8.

$$
\Delta W = \eta \delta \phi(t) \tag{8}
$$

Here, δ(*t*) denotes the TEMPORAL DIFFERENCE (TD) error in the value function, which is correlated with dopamine signaling (Schultz, 2010). It is given by Equation 9, in which γ is the discount factor.

$$
\delta = r(t) + \gamma V(t) - V(t-1) \tag{9}
$$
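Equations (7–9) together define one learning step of the Critic. The following sketch assumes list-based weights and hypothetical values for the learning rate η and discount factor γ; the function names are illustrative and not taken from the original study.

```python
import math

def critic_value(weights, phi):
    """Equation 7: V = tanh(sum_i W_i * phi_i)."""
    return math.tanh(sum(w * p for w, p in zip(weights, phi)))

def td_error(reward, v_now, v_prev, gamma):
    """Equation 9: delta = r(t) + gamma * V(t) - V(t - 1)."""
    return reward + gamma * v_now - v_prev

def update_weights(weights, phi, delta, eta):
    """Equation 8: W <- W + eta * delta * phi(t)."""
    return [w + eta * delta * p for w, p in zip(weights, phi)]

# Example with hypothetical values: a rewarded view raises its own value estimate.
w = [0.0, 0.0, 0.0]
phi = [1, 0, 1]
delta = td_error(reward=1.0, v_now=critic_value(w, phi), v_prev=0.0, gamma=0.9)
w = update_weights(w, phi, delta, eta=0.1)
print(w, critic_value(w, phi))
```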

#### *GO/EXPLORE/NOGO or GEN*

The policy (Actor) used here is known as the GO/EXPLORE/NOGO or GEN policy, the neurobiological origins of which were described in earlier work (Sridharan et al., 2006; Magdoom et al., 2011; Kalva et al., 2012). GEN essentially represents an approach to action selection that performs stochastic hill-climbing over the value function. In the doorway problem studied here, reward, *r*, is obtained at the doorway when the agent passes through it without collision. Thus the value profile is expected to have a maximum at the doorway. Therefore, the value gradient can be used to approach the doorway securely without colliding with its sides.

A quantity known as *VALUE DIFFERENCE* (Equation 10), δ*V*, which is the gradient of the value,

$$
\delta\_V = V(t) - V(t-1) \tag{10}
$$

plays an important role in the process of hill-climbing over the value profile.

Note the resemblance between the Value Difference in Equation 10 and the TD error (Equation 9). It may be observed that δ = δ*V* when γ = 1 and the agent is not at the goal state (*r* = 0). We assume that both δ and δ*V* represent dopamine signals but perform distinct roles: while δ is used for training the value function, as in typical Actor-Critic models of BG, we assume that δ*V* is used for switching between DP and IP, which is thought to be a function of striatal dopamine (Humphries and Prescott, 2010; Amemori et al., 2011). δ*V* can be used to hill-climb over the value function using the following rules,

$$\begin{array}{lll} \text{if } (\delta\_V > D\_{\text{hi}}) & \Delta X(t) = +\Delta X(t-1) & \text{Go} \quad (a) \\ \text{elseif } (\delta\_V > D\_{\text{lo}} \ \wedge \ \delta\_V \le D\_{\text{hi}}) & \Delta X(t) = \chi & \text{Explore} \quad (b) \\ \text{else } (\delta\_V \le D\_{\text{lo}}) & \Delta X(t) = -\Delta X(t-1) & \text{NoGo} \quad (c) \end{array} \tag{11}$$

where *X* = (*x*, *y*) denotes the position of the agent on the track; *D*hi is a positive threshold and *D*lo is a negative threshold; χ is a uniform random variable. A similar rule for hill-climbing over the value function was used earlier in Magdoom et al. (2011), which describes a model of Parkinsonian reaching movements.
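The three regimes of Equation 11 map directly onto a three-way branch. The sketch below is illustrative only: the function name and the threshold values in the example are hypothetical, and the range of χ, [−0.5, 0.5], is taken from the caption of Figure 4.

```python
import random

def gen_step_discrete(delta_v, prev_step, d_hi, d_lo, rng=random):
    """Return the next position update according to the rule of Equation 11."""
    if delta_v > d_hi:
        # GO: value rose significantly, keep moving in the same direction.
        return tuple(prev_step)
    if delta_v > d_lo:
        # EXPLORE: no significant change in value, take a random step.
        return tuple(rng.uniform(-0.5, 0.5) for _ in prev_step)
    # NOGO: value fell significantly, reverse the previous step.
    return tuple(-s for s in prev_step)

# Example with hypothetical thresholds D_hi = 0.1 and D_lo = -0.1.
print(gen_step_discrete(0.5, (0.1, 0.2), d_hi=0.1, d_lo=-0.1))   # GO
print(gen_step_discrete(-0.5, (0.1, 0.2), d_hi=0.1, d_lo=-0.1))  # NOGO
```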

The key difference between the classical RL implementation of the Actor, wherein the action is typically modeled as an explicit function of the state φ, and the GEN policy is that in GEN the action is computed by following the value gradient over the position space, *X*. Although value is a function of the view vector, φ(*t*), we perform the hill-climbing over the position space, *X*, which maps onto the view vector uniquely.

The 3 discrete regimes—GO, EXPLORE and NOGO—of Equation 11 can be combined seamlessly into a single equation as follows (Equation 12):

$$\begin{array}{l} \Delta X(t) = A\_{\rm G} \text{sig}(\lambda\_{\rm G} \delta\_{V}) \Delta X(t-1) \\ \quad + A\_{\rm E} \chi \exp(-\delta\_{V}^{2}/\sigma\_{\rm E}^{2}) \\ \quad - A\_{\rm N} \text{sig}(\lambda\_{\rm N} \delta\_{V}) \Delta X(t-1) \end{array} \tag{12}$$

where

$$\text{sig}(x) = 1/\left[1 + \exp(-x)\right] \tag{13}$$

The rationale behind Equations (11–13) (Sridharan et al., 2006; Chakravarthy et al., 2010; Magdoom et al., 2011; Kalva et al., 2012) may be described as follows. The "GO" regime, which occurs when δ*V* > *D*hi, means that the previous position update, Δ*X*(*t* − 1), caused a significant increase in the value (δ*V*). Therefore, according to Equation 11 above, Δ*X*(*t*) is in the same direction as Δ*X*(*t* − 1), which justifies the form of the first term in Equation 12, a continuous version of the rule of Equation 11a, as shown in **Figure 4**. The GO regime is thought to be implemented by the DP, which is activated at higher levels of striatal dopamine, δ*V*. A low level of dopamine, δ*V* < *D*lo, implies that the previous position update caused a significant decrease in the value. Therefore, the position update is in the opposite direction to the previous update. This mechanism is thought to be implemented by the IP. This regime is represented by the third term in Equation 12, a sigmoid with a negative slope (λN), which is a continuous version of the rule of Equation 11c. Intermediate levels of dopamine (*D*lo < δ*V* < *D*hi) imply that the previous change in value is not significant; therefore the subsequent position update occurs in a random (χ) direction (Equation 11b). The second term in Equation (12) is a continuous version of the rule of Equation 11b.

**FIGURE 4 | An illustration of the operation of GO, EXPLORE, NOGO regimes.** Each of the regimes represents a map between Δ*X*(*t*−1) and Δ*X*(*t*) defined as Δ*X*(*t*) = χ · *C*(δ*V*) · Δ*X*(*t*−1). **(A)** For the GO regime, *C* = 1 for δ*V* > *D*hi, else *C* = 0; χ = 1. The resulting step-like profile is approximated by a sigmoid, shown as the first term on the RHS of Equation (12). **(B)** For the NOGO regime, *C* = 1 for δ*V* < *D*lo, else *C* = 0; χ = 1. The resulting inverted step profile is approximated by the sigmoid defined as the third term on the RHS of Equation (12). **(C)** For the EXPLORE regime, *C* = 1 for *D*lo < δ*V* < *D*hi, else *C* = 0. This pulse-like profile of *C* is approximated by a Gaussian function of δ*V*; χ is a random number drawn from a uniform distribution over the range [−0.5, 0.5] in this case.

The parameters that define the GEN policy are *A*G, *A*N, *A*E, λG, λN in Equation (12), the discount factor γ in Equation (9), and the width σ of the Gaussian term in Equation (12) (written σE there). The last parameter, σ, is known as the "exploration parameter" since it controls the extent of exploration by the GEN policy. The parameters that denote changes in dopamine corresponding to PD OFF and ON conditions are δlim and δmed, respectively. These parameters are trained using genetic algorithms (Appendix A) after imposing specific constraints related to the various conditions (controls, PD OFF and PD ON), as described in the later sections.
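The continuous GEN rule of Equations (12–13) can be sketched as follows. The function names are illustrative, the value used for the exploration width in the example is hypothetical, and χ is drawn independently for each coordinate from [−0.5, 0.5], which is an assumption about how the random direction is sampled.

```python
import math
import random

def sig(x):
    """Equation 13: logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def gen_step(delta_v, prev_step, a_g, a_e, a_n, lam_g, lam_n, sigma_e, rng=random):
    """Equation 12: blend of GO, EXPLORE and NOGO terms weighted by delta_V."""
    go = a_g * sig(lam_g * delta_v)                         # first term
    explore = a_e * math.exp(-delta_v ** 2 / sigma_e ** 2)  # second term
    nogo = a_n * sig(lam_n * delta_v)                       # third term
    return tuple(go * s + explore * rng.uniform(-0.5, 0.5) - nogo * s
                 for s in prev_step)

# Example: exploration switched off (a_e = 0) to show the GO and NOGO limits.
params = dict(a_g=2.5, a_e=0.0, a_n=1.0, lam_g=1.0, lam_n=-1.0, sigma_e=0.1)
print(gen_step(5.0, (1.0, 0.0), **params))   # large positive delta_V: same direction
print(gen_step(-5.0, (1.0, 0.0), **params))  # large negative delta_V: reversed
```

With a negative λN, the third term is large only when δ*V* is strongly negative, so the three terms take over from one another smoothly as δ*V* sweeps from negative to positive values.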

Thus the GEN policy computes the update in position, Δ*X*. The position update is represented as velocity components (*vx* and *vy*) and passed on to the CPG module, which in turn computes the hip and knee angles θ for the calculation of the next position.

#### **THE CPG MODULE**

CPGs are neural networks in the spinal cord capable of producing coordinated rhythmic activity for driving rhythmic movements like locomotion (Ijspeert, 2008). A network of coupled non-linear oscillators, modeled using adaptive Hopf oscillators (Righetti and Ijspeert, 2006), is used here as a model of the CPG network. The model assumes that the CPG controls the angle profiles of the hip and knee joints, which directly reflect the motor output, producing the activation and deactivation of muscles necessary for gait. It is a simple kinematic model of the leg, in which the CPGs control the joint angles of the hip (θ*h*) and the two knees (θ*k*1 and θ*k*2). The hip and knee joint angles are approximations of human locomotion obtained by Fourier analysis (De Pina Filho and Dutra, 2009). **Figure 5B** shows the approximate profiles of the hip and knee joints, modeled as truncated Fourier series (De Pina Filho and Dutra, 2009). These profiles serve as the training signals for the CPGs and span 500 time steps for one gait cycle (*T*). Since our aim is to reproduce the rhythms, during training these signals are presented repeatedly to the network until it converges to produce gait cycles with the appropriate amplitude, frequency and phase relationship.

The motivation behind using the adaptive Hopf network is to have smooth control over the amplitude and frequency of the oscillators. Three pools are used to represent the CPG network (*s* = 3, with *j* = 1, …, *s* indexing the hip, knee1, and knee2, respectively). Each pool consists of an optimal number of oscillators, two for the hip and three for each of the knees (*N* = 1 or *N* = 2, respectively, with *i* = 0, …, *N*), that in total constitute the CPG network (**Figure 6**). The dynamics of the adaptive Hopf oscillators are given by Equations 14–21 for the neurons (oscillators) in each pool *j*. Each variable carries the subscript *i*, *j*, denoting the *i*th oscillator in the *j*th pool. The intrinsic variables *p*<sub>*i*,*j*</sub> and *q*<sub>*i*,*j*</sub> of the oscillators are given in Equations 14, 15, with *z*<sub>*i*,*j*</sub> = √(*p*<sub>*i*,*j*</sub><sup>2</sup> + *q*<sub>*i*,*j*</sub><sup>2</sup>). *F*<sub>*j*</sub>(*T*) is the error signal described in Equation 18, where "*T*" denotes the time steps needed to complete one gait cycle, which in our study is taken to be a vector of size [1 × 500]. It is weighted by a factor ε and is given as feedback to the oscillators through Equation 14. In Equations 14, 15, μ controls the amplitude of oscillations, and ξ controls the speed of recovery of the system after perturbations.

**FIGURE 5 | (A)** The joint angle representation on the kinematic leg model, with the thigh (*l*<sub>1</sub>) and shank (*l*<sub>2</sub>) links and the joint angles θ*h* (hip), θ*k*1, θ*k*2 (two knees); **(B)** Variation of hip and knee angles with time and their inter-phase relationships. Extrema (θ*h*\_*ext*) in the hip angle are denoted by numbers 1, 2, and 3.

$$\dot{p}\_{i,j} = \xi(\mu - z\_{i,j}^2)p\_{i,j} - \omega\_{i,j}q\_{i,j} + \varepsilon F\_j(T) + \tau\sin(\theta\_{i,j}^{IP} - \psi\_{i,j}) \tag{14}$$

$$\dot{q}\_{i,j} = \xi(\mu - z\_{i,j}^2)q\_{i,j} + \omega\_{i,j}p\_{i,j} \tag{15}$$
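To make the roles of μ and ξ concrete, the sketch below integrates a single free-running Hopf oscillator with the forward Euler method. It rests on several assumptions: the error feedback and phase-coupling terms of Equation 14 are omitted, the cross terms are given opposite signs (−ω*q* in the *p* equation and +ω*p* in the *q* equation, the standard Hopf form), *z* is taken to be the oscillator radius, and the parameter values and step size are hypothetical.

```python
import math

def hopf_step(p, q, omega, mu, xi, dt):
    """One Euler step of a free-running Hopf oscillator (no input, no coupling)."""
    r2 = p * p + q * q                        # squared radius, z^2
    dp = xi * (mu - r2) * p - omega * q
    dq = xi * (mu - r2) * q + omega * p       # assumed standard Hopf sign
    return p + dt * dp, q + dt * dq

def simulate(n_steps, p=0.1, q=0.0, omega=2.0 * math.pi, mu=1.0, xi=1.0, dt=0.001):
    """Integrate from a small initial state and return the final (p, q)."""
    for _ in range(n_steps):
        p, q = hopf_step(p, q, omega, mu, xi, dt)
    return p, q

# Example: the radius settles near sqrt(mu) regardless of the small initial state.
p, q = simulate(20000)
print(math.hypot(p, q))
```

Under these assumptions the oscillation amplitude is set by μ alone, while ξ only changes how quickly the trajectory returns to the limit cycle.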

The adaptation of the oscillators to a specific frequency (ω*i*,*j*) and amplitude (α*i*,*j*) of an input signal is achieved by Equations 16 and 17, respectively. The learning rate η*a* in the update equation for α*i*,*j* (Equation 17) is set at 0.08.

$$
\dot{\omega}\_{i,j} = -\varepsilon F\_j(T) \frac{q\_{i,j}}{z\_{i,j}} \tag{16}
$$

$$
\dot{\alpha}\_{i,j} = \eta\_a p\_{i,j} F\_j(T) \tag{17}
$$

*Fj*(*T*) describes the error signal (Equation 18), defined as the difference between the teaching signal (*P*teach,*j*) and the learnt signal (*Q*learned,*j*) at a time instant. The teaching signal (a single gait cycle) for the oscillators in a pool "*j*" represents the angle profile of one of the joints (the hip, knee1, or knee2) shown in **Figure 5B** and is a vector of size (1 × 500) in time. Hence all the oscillators (*i* = 0, …, *N*) in a particular pool (*j*) receive the teaching signal of the hip (if *j* = 1), knee 1 (if *j* = 2), or knee 2 (if *j* = 3).

**FIGURE 6 | Training of the CPG network with the desired hip (*h*) and knee (*k*1 and *k*2) angles (θ) represented in Figure 5B.** The numbers of Hopf oscillators used to train the hip (ω*h*) and knee angles (ω*k*1 and ω*k*2) are 2 and 3, respectively. Phase difference within CPGs is maintained by ψlocal, while that across CPGs is maintained by ψglobal. The αs modulate the intrinsic CPG rhythm to output the learnt joint angles.

$$F\_j(T) = P\_{teach,j}(T) - Q\_{learned,j}(T) \tag{18}$$

*Fj*(*T*) is provided to the oscillatory network only during the adaptation stage (learning) and grows smaller as learning progresses until it eventually becomes zero (*P*teach,*j* = *Q*learned,*j*). The α*i*,*j* and ω*i*,*j* converge at this point, and the network still encodes the pattern even after the removal of *Fj*(*T*). The converged variables are written α<sup>0</sup><sub>*i*,*j*</sub> and ω<sup>0</sup><sub>*i*,*j*</sub>, where the superscript "0" denotes convergence to optimal values. The learnt signals, which are the joint angles θ*h*, θ*k*1 and θ*k*2, are the outputs of the oscillator pools, each given by the dot product of α<sup>0</sup><sub>*i*,*j*</sub> and *p*<sub>*i*,*j*</sub> (Equation 19).

$$Q\_{learned,j}(T) = \sum\_{i=0}^{N} \alpha\_{i,j}^{0} p\_{i,j} \tag{19}$$
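The adaptation loop formed by Equations (16–19) can be sketched for a single pool as one Euler update of the frequencies and amplitudes. This is an illustration under stated assumptions: the value of ε and the step size are hypothetical (only η*a* = 0.08 is given in the text), *z* is taken to be the oscillator radius, and the update of *p* and *q* themselves (Equations 14, 15) is left out.

```python
import math

def pool_output(alphas, ps):
    """Equation 19: learnt signal as the weighted sum of oscillator states."""
    return sum(a * p for a, p in zip(alphas, ps))

def adapt_step(ps, qs, omegas, alphas, teach, eps=0.9, eta_a=0.08, dt=0.001):
    """One Euler step of the frequency and amplitude adaptation for one pool."""
    f = teach - pool_output(alphas, ps)             # Equation 18: error signal
    new_omegas, new_alphas = [], []
    for p, q, w, a in zip(ps, qs, omegas, alphas):
        z = math.hypot(p, q)                        # assumed: z is the radius
        dw = -eps * f * q / z if z > 0.0 else 0.0   # Equation 16
        da = eta_a * p * f                          # Equation 17
        new_omegas.append(w + dt * dw)
        new_alphas.append(a + dt * da)
    return f, new_omegas, new_alphas

# Example: with zero error nothing changes; with a positive error the
# parameters of the oscillators move according to their current state.
ps, qs = [1.0, 0.0], [0.0, 1.0]
print(adapt_step(ps, qs, [6.0, 12.0], [0.5, 0.5], teach=0.5))
print(adapt_step(ps, qs, [6.0, 12.0], [0.5, 0.5], teach=1.0))
```

Once the error reaches zero the frequencies and amplitudes stop changing, which matches the statement that the network retains the learnt pattern after the error signal is removed.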

Intra-pool phase relationship (i.e., within hip, within each knee) is maintained via the internal variable ψ*i*,*j* (Equation 20), where τ is the weight factor that maintains the phase relationship among the oscillators (Equation 14) with respect to the 0th oscillator within the pool (for a single "*j*" under consideration). This indicates that each oscillator within that pool receives a scaled phase input ψ*i*,*j* from its respective reference oscillator ψ0,*j*

$$\dot{\psi}\_{i,j} = \sin\left(\frac{\alpha\_{i,j}}{\alpha\_{0,j}}\theta\_{0,j}^{IP} - \theta\_{i,j}^{IP} - \psi\_{i,j}\right) \tag{20}$$

where the instantaneous phase of an oscillator, θ<sup>IP</sup><sub>*i*,*j*</sub>, within a pool is

$$\theta\_{i,j}^{IP} = \text{sgn}(p\_{i,j}) \cos^{-1}(-\frac{q\_{i,j}}{z\_{i,j}}) \tag{21}$$
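Equation 21 translates into a short helper. The sketch assumes that *z* is the oscillator radius, returns 0 when the oscillator sits at the origin (a choice made here only to avoid division by zero), and clamps the argument of the inverse cosine to guard against rounding error.

```python
import math

def instantaneous_phase(p, q):
    """Equation 21: theta_IP = sgn(p) * arccos(-q / z)."""
    z = math.hypot(p, q)
    if z == 0.0:
        return 0.0                         # assumed convention at the origin
    sign = (p > 0) - (p < 0)               # sgn(p) as -1, 0 or +1
    arg = max(-1.0, min(1.0, -q / z))      # keep acos within its domain
    return sign * math.acos(arg)

# Example: states on the unit circle.
print(instantaneous_phase(1.0, 0.0))
print(instantaneous_phase(-1.0, 0.0))
```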

In addition to the local phase variable ψ*i*,*j* (Equation 20), which does not consider phase maintenance across different pools of oscillators (*j* = 1, …, *s*), a global/inter-pool phase relationship (between the hip and two knees) is introduced via a new state variable ψ<sup>*G*</sup><sub>0,*j*</sub>, whose dynamics are governed by Equations 22, 23. Equation 22 is similar to Equation 14, with two changes: the variable *p*<sub>0,*j*</sub> controls the dynamics of only the 0th oscillator of each pool, and the global phase variable ψ<sup>*G*</sup><sub>0,*j*</sub> (Equation 23) is added. These equations represent phase maintenance by a scaled phase input (ψ<sup>*G*</sup><sub>0,*j*</sub>) across the pools of oscillators. In our case the global phase is maintained with respect to one of the hip oscillators (the 0th oscillator in the hip pool, i.e., ψ<sup>*G*</sup><sub>0,1</sub>) as the reference oscillator. The block diagram for training the CPG network is given in **Figure 6**, and **Table 1** lists the values of the parameters used in the CPG model.

$$\dot{p}\_{0,j} = (\mu - z\_{0,j}^2)p\_{0,j} - \omega\_{0,j}q\_{0,j} + \tau\sin(\theta\_{0,j}^{IP} - \psi\_{0,j}^G) \tag{22}$$

$$\dot{\psi}\_{0,j}^{G} = \sin(\theta\_{0,j-1}^{IP} - \theta\_{0,j}^{IP} - \psi\_{0,j}^{G}) \tag{23}$$

**Table 1 | List of parameter values for simulating the network of adaptive hopf oscillators (CPG model).**


The GEN equation yields velocity components *vx* and *vy*, providing information on the magnitude and the direction of the agent's movements. Since our aim is to model stride length, the magnitude of the velocity obtained from the BG module is used to control the α<sup>0</sup><sub>*i*,*j*</sub> of the oscillators through a proportionality gain (*k*) (Equation 24). This provides a proxy for the magnitude of velocity in terms of the joint angles. Since the joint angular velocities can be varied in terms of their amplitude (by changing α<sup>0</sup><sub>*i*,*j*</sub>), it is convenient to translate the indirect measure of velocity obtained from the GEN module into a realistic motion of the joints.

$$k(t) = A\_k \tanh\left(c\_k \sqrt{v\_x^2 + v\_y^2}\right) \tag{24}$$

where *k*(*t*) is the proportionality gain variable. *Ak* is the amplitude factor for the gain and *ck* is the sensitivity/slope factor, which are set at 3 and 1, respectively, in all conditions. The α<sup>0</sup><sub>*i*,*j*</sub> is modulated by *k*(*t*) as in Equation 25

$$
\alpha\_{i,j}^f(t) = \alpha\_{i,j}^0 k(t) \tag{25}
$$

Here α<sup>*f*</sup><sub>*i*,*j*</sub> reflects the change in α<sup>0</sup><sub>*i*,*j*</sub> on being modulated by a factor of *k*(*t*) in each step of a trial to the doorway. α<sup>*f*</sup><sub>*i*,*j*</sub> now takes the role of α<sup>0</sup><sub>*i*,*j*</sub> (as seen in Equation 19) and affects the output of the CPG network, in particular the amplitudes of the hip and knee angles. When larger (smaller) velocities (*vx* and *vy*) are obtained from the GEN module, the gain variable *k*(*t*) increases (decreases), which in turn increases (decreases) the amplitudes of the hip and knee angles by modulating α<sup>0</sup><sub>*i*,*j*</sub> (Equation 25). Therefore, an increased magnitude of velocity obtained from the BG results in increased stride lengths from the oscillator network, and vice versa. The stride module then calculates the stride/step length using θ*h*, θ*k*1 and θ*k*2, which is the actual displacement used to translate the agent's position in space. The stride length obtained as a function of the joint angles is described in the section below. The connectivity between the BG network and the CPGs for training is given in **Figure 2**. Once the CPGs are trained, the gait execution is modeled as in **Figure 7**.
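The velocity-to-amplitude coupling of Equations 24 and 25 is simple enough to write out directly. In this sketch the values *Ak* = 3 and *ck* = 1 come from the text, while the function names and the example amplitudes are hypothetical.

```python
import math

def gain(vx, vy, a_k=3.0, c_k=1.0):
    """Equation 24: k = A_k * tanh(c_k * |v|)."""
    return a_k * math.tanh(c_k * math.hypot(vx, vy))

def modulated_amplitudes(alpha0, vx, vy):
    """Equation 25: scale the converged amplitudes alpha^0 by the gain k."""
    k = gain(vx, vy)
    return [a * k for a in alpha0]

# Example: faster commanded velocity gives larger oscillator amplitudes,
# saturating at A_k times the converged amplitude.
print(modulated_amplitudes([1.0, 2.0], 0.1, 0.0))
print(modulated_amplitudes([1.0, 2.0], 3.0, 4.0))
```

Because of the tanh, the gain saturates at *Ak*, so very large commanded velocities cannot push the joint amplitudes beyond a fixed bound.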

#### **THE LOCOMOTOR APPARATUS: STRIDE**

As depicted in **Figure 2**, the stride module uses the angles θ*i* (especially θ*h*) from the CPG network to determine the stride length. Stride length in a gait cycle is defined as the distance from the heel strike of one leg to the next heel strike of the same leg, and thus covers two steps. The hip angle θ*h*, as seen in **Figure 5B**, has three peaks. Since θ*h* is the angle between the two hips and the knee angles are almost 0 at the extremes (**Figure 5B**), each peak in the hip angle represents a *Step*. Considering the first peak as the heel strike of one of the legs, the next two peaks would be the next two steps, or a *Stride* (**Figure 5B**). The thigh length, *l*<sub>1</sub>, is taken as 0.5 m and the shank length, *l*<sub>2</sub>, as 0.6 m (**Figure 5A**); both are adapted from Taga's biped model (Taga et al., 1991). The stride length (SL) is calculated as in Equation 26.

$$L\_{\rm STR} = 2(l\_1 + l\_2)\sin(\theta\_{h,{\rm ext}2}/2) + 2(l\_1 + l\_2)\sin(\theta\_{h,{\rm ext}3}/2) \tag{26}$$

To simulate the *step lengths*, only a single peak (θ*h*\_*ext*2) is considered; hence *L*STR retains only the first term. As the α<sup>0</sup><sub>*i*</sub> are modulated, the amplitude of θ*h* varies, giving rise to different *stride/step lengths*. The stride/step length supplies the displacement information to the agent. The direction is obtained from the unit vectors of *vx* and *vy* (*v̂x* and *v̂y*) of the GEN module. The stride length and the direction are combined to calculate the agent's next position as in Equation 27.

$$\begin{aligned} \Delta x &= L\_{\text{STR}} \, \hat{v}\_{x} \\ \Delta y &= L\_{\text{STR}} \, \hat{v}\_{y} \end{aligned} \tag{27}$$

The change in position (given by Equation 27) then triggers the VISION module to compute the new view vector, thus closing the loop. The trained Cortico-BG and CPG modules, along with the locomotor apparatus, are then used for testing the agent's performance as shown in **Figure 7**.
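Equations 26 and 27 can be combined into a short sketch of the STRIDE block. The link lengths are those given in the text; treating the hip extrema as angles in radians, keeping the position unchanged when the velocity is zero, and the function names are assumptions made for illustration.

```python
import math

L1, L2 = 0.5, 0.6  # thigh and shank lengths in meters, as given in the text

def step_length(theta_ext):
    """One term of Equation 26: step length from a single hip-angle extremum."""
    return 2.0 * (L1 + L2) * math.sin(theta_ext / 2.0)

def stride_length(theta_ext2, theta_ext3):
    """Equation 26: a stride is the sum of two consecutive steps."""
    return step_length(theta_ext2) + step_length(theta_ext3)

def next_position(x, y, vx, vy, length):
    """Equation 27: move by `length` along the unit vector of (vx, vy)."""
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        return x, y                      # assumed: no direction, no displacement
    return x + length * vx / speed, y + length * vy / speed

# Example: hypothetical hip extrema of 0.4 rad, heading straight down the track.
stride = stride_length(0.4, 0.4)
print(stride, next_position(0.0, 0.1, 0.0, 1.0, stride))
```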

#### **RESULTS**

We simulate the results of two experimental studies of the gait patterns of PD patients as they walk toward a doorway (Almeida and Lebold, 2010; Cowie et al., 2010). In both studies, PD patients were asked to walk through doorways of different sizes (wide, medium and narrow), with the aim of understanding the changes in gait velocity and the conditions that trigger FOG. The Cowie et al. (2010) study shows significant differences in gait velocity and stride length among healthy controls, PD ON freezers and PD OFF freezers. In this study, the controls produce higher gait velocities and longer stride lengths than PD freezers under all door conditions. PD ON subjects show lower velocities and shorter stride lengths than controls but higher than PD OFF subjects, who show the lowest velocities. The PD subjects (both ON and OFF) also produce significant dips in their gait velocity, especially near the doorway, showing signs of freezing (**Figures 9A**, **10A**). The Almeida and Lebold (2010) study specifically compares the gait patterns of PD freezers and PD non-freezers. The differences among the three groups of subjects (controls, PD freezers and non-freezers) are evident from their step length profiles. PD freezers produce significantly shorter step lengths than controls and PD non-freezers. The trend is exaggerated as the doorway width decreases, with the narrow doorway producing the shortest step length (**Figure 12A**). The study also shows increased variability among PD freezers in comparison to controls and non-freezers (**Figure 13A**). The experimental paradigm is quite similar in the two studies (Almeida and Lebold, 2010; Cowie et al., 2010).

#### **SIMULATING THE ENVIRONMENT**

We start with a description of the doorway and the reward schedule used in the ENVIRONMENT module of our model. The agent's state and action are represented by the view vector and the velocity vector, respectively. The position limits are [−2, 2] for the *x*-position across the breadth of the track, and [0, 10] for the *y*-position along the length of the track (**Figure 3**). At the start, the agent is always positioned at *y* = 0.1 for a random *x*, and is oriented directly toward the door, whose center is located at (*x*, *y*) = (0, 10) (**Figure 3**). The view vector φ(*t*) corresponding to any given position and orientation is given by Equation 5, and the velocity is the action selected by following the GEN policy (Equation 12). The agent is presented with three door conditions (wide, medium, and narrow).

In the Cowie et al. (2010) study, the door sizes are scaled to the participant's shoulder width (narrow door: 100% of shoulder width; medium door: 125%; wide door: 150%), while the Almeida and Lebold (2010) study uses doors of fixed size (wide door: 1.8 m; normal door: 0.9 m; narrow door: 0.675 m). In our model, the agent has a circular body of diameter 1 m and the door sizes (*dlength*) are 3 m for "wide," 2.5 m for "medium/normal" and 2 m for "narrow" cases. The agent must control its movements through a distance of 10 m before it encounters the door. The rewards/punishments are as follows: *r* = 5 at the door for successful passage, *r* = −1 for collision with the sides of the door or the boundaries of the track, and *r* = 0 elsewhere.
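The reward schedule can be written as a small function. The door widths, body diameter, track limits and reward values are taken from the text; the collision geometry is an assumption made for this sketch (the agent clears the door if its whole body fits inside the opening when it reaches the door line, and it hits the track boundary when its center leaves the *x* limits).

```python
DOOR_WIDTH = {"wide": 3.0, "medium": 2.5, "narrow": 2.0}  # meters
BODY_RADIUS = 0.5        # circular body of diameter 1 m
TRACK_HALF_WIDTH = 2.0   # x in [-2, 2]
DOOR_Y = 10.0            # the door is centered at (0, 10)

def reward(x, y, door):
    """Return r for the agent's new position under the assumed geometry."""
    if y >= DOOR_Y:
        # At the door line: success only if the whole body fits in the opening.
        clears = abs(x) + BODY_RADIUS <= DOOR_WIDTH[door] / 2.0
        return 5.0 if clears else -1.0
    if abs(x) > TRACK_HALF_WIDTH:
        # Collision with the boundary of the track.
        return -1.0
    return 0.0

# Example: the same lateral offset passes the wide door but hits the narrow one.
print(reward(0.9, 10.0, "wide"), reward(0.9, 10.0, "narrow"))
```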

#### **SIMULATING THE GEN**

In the BG model, the GEN parameters (*A*'s and λ's: *A*G, *A*N, *A*E, λG, λN) of Equation 12 are computed for all the doorway cases (narrow, medium and wide) and medication conditions (ON/OFF). For all the doorways, the above parameters are first optimized for controls and then used directly for simulating the PD condition. The optimization is done such that the simulation results fall within the error of the experimental results. The cost function chosen for optimization considers two elements: (a) the magnitude of the stride/step length for each doorway condition and (b) the stride/step length gradient between doorways in any medication condition, the details of which are explained in Appendix A. The distinction between the conditions of PD freezers (ON and OFF) and non-freezers in the two experiments is explained as follows.

#### *In Cowie et al. (2010) study*

Once the set for controls (*A*G, *A*N, *A*E, λG, λN, γ and σ) is optimized, the set for the PD freezers is obtained as follows. The parameters δlim (representing limited dopamine availability) and δmed are treated specially for the PD OFF case. Since δlim controls the clamping of δ (Equation 9), a step that represents dopamine deficiency under PD conditions, we search for the optimal δlim (Equation 28) to describe the PD OFF gait results. Furthermore, in the PD OFF condition we set δmed = 0, denoting absence of medication. Additionally, γ (discount factor) and σ (exploration parameter) are also trained in the PD OFF condition. In summary, the parameters trained in the PD OFF condition are δlim, γ, and σ, while δmed is simply set to 0. All these parameters (*A*'s and λ's from the control set, δlim, γ) are carried over from the PD OFF to the PD ON case, except σ and δmed (Equation 29), which are trained. The optimized parameter values are given in **Table 2**.

#### *In Almeida and Lebold (2010) study*

The controls set (*A*G, *A*N, *A*E, λG, λN, γ, and σ) is first optimized as in the Cowie et al. case. Further, as the experimental results for both freezers and non-freezers are in the PD ON condition, δlim, δmed, γ and σ are optimized for PD freezers. For the PD non-freezers, all the PD freezers' parameters are carried over except for γ and σ, which are re-optimized to match the experimental results. The optimized parameter values are given in **Table 3**.

Adjusting parameters such as γ and σ in addition to δ (δlim, δmed) for simulating PD freezers (ON / OFF as in Cowie et al., 2010) and non-freezers (as in Almeida and Lebold, 2010) departs from conventional modeling of the PD condition, in which only the dopamine analogue δ (in particular δlim and δmed) is varied. The motivation behind this strategy is explained in the section Rationale Behind Optimization Strategy and Model Behavior.

The parameters *A*G, *A*N, and *A*E are optimized to 2.5, 1 and 1 respectively, and the sensitivities to Go (λG) and NoGo (λN) are fixed at 1 and −1, respectively, for the controls and PD freezers / non-freezers irrespective of the door width *d*length simulated. Since PD is a dopamine-deficient condition, PD OFF conditions are simulated in the model by clamping δ (Equation 9) to a low value "δlim" (Equation 28). To the clamped δ, a medication factor δmed is added to simulate PD ON conditions (Equation 29). A similar modeling approach to PD conditions was adopted earlier by Magdoom et al. (2011). Conceptually, if the range of δ values for controls is represented as [*a*, *b*], then PD OFF adopts a range of [*a*, δlim], where both *a*, δlim < *b*, and PD ON takes up the range [*a* + δmed, δlim + δmed], where δlim + δmed < *b*. In the simulations we set *a* and *b* to −1 and +1, respectively. **Tables 2**, **3** show the parameter values for different condition settings.

$$\text{PD OFF:} \quad \text{if } \delta > \delta_{\text{lim}}, \quad \delta = \delta_{\text{lim}} \tag{28}$$

$$\text{PD ON:} \quad \delta = \begin{cases} \delta_{\text{lim}} + \delta_{\text{med}}, & \text{if } \delta > \delta_{\text{lim}} \\ \delta + \delta_{\text{med}}, & \text{otherwise} \end{cases} \tag{29}$$
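The clamping rules of Equations 28 and 29 can be sketched as a single function. This is a minimal illustration under stated assumptions: the function name, the condition labels, and the numerical values in the usage note are hypothetical and are not the optimized values of Tables 2 and 3.

```python
def effective_delta(delta: float, condition: str,
                    delta_lim: float = 1.0, delta_med: float = 0.0) -> float:
    """Apply the PD OFF / PD ON clamping rules to a TD error `delta`.

    `condition` is one of the hypothetical labels "control", "pd_off", "pd_on".
    """
    if condition == "control":
        return delta                  # controls use the unclamped TD error
    clamped = min(delta, delta_lim)   # Equation 28: delta above delta_lim is set to delta_lim
    if condition == "pd_off":
        return clamped                # no medication term in the OFF state
    if condition == "pd_on":
        return clamped + delta_med    # Equation 29: medication added in both branches
    raise ValueError(f"unknown condition: {condition}")
```

For instance, with the illustrative values δlim = 0.2 and δmed = 0.3, a TD error of 0.8 becomes 0.2 in PD OFF and 0.5 in PD ON, whereas a TD error of −0.5 is left unchanged in PD OFF and shifted to −0.2 in PD ON.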

#### **SIMULATING THE VELOCITY PROFILES, STRIDE / STEP LENGTHS**

The Cowie et al. (2010) study suggests that there is no significant change in the cadence (steps/s) of the subjects involved in the study. Therefore, the frequency of the Hopf oscillators is fixed such that the output rhythm produces 2 steps/s, i.e., 1 stride/s. Moreover, in order to prevent the agent from making undesirable backward movements away from the door, the stride / step length is set to a small constant value whenever the velocity (*vy*) generated from the GEN is negative. In our simulation, this constant value is taken to be 0.0001. (Otherwise, the velocities *vx* and *vy* from the GEN are translated into the corresponding stride / step length by the CPGs of the section The CPG Module.) The model (cortico-BG system) simulation is discrete in time (*t*), i.e., each iteration is considered as the execution of a single stride and a single update of the velocity of the agent.

During training, the agent repeatedly walks along a track to the doorway of specific size for 100 passes, and the value function is built up by training the value weights, *W* (Equation 8). In testing conditions, the pre-trained weights of the value function are used and the agent is run for another 100 passes to obtain a velocity profile along the track.

Since the model does not provide velocities at every point in space, linear interpolation is used to fill in the gaps of *vy*, which is then averaged across the 100 passes to construct the velocity profile. Here, if *va* and *vb* represent velocities at two discrete points *Xa* and *Xb*, then the velocity *v*res at an intermediate position *X*res between *Xa* and *Xb* is given by Equation 30.

$$v_{\rm res} = v_a + (v_b - v_a) \frac{X_{\rm res} - X_a}{X_b - X_a} \tag{30}$$
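The interpolation of Equation 30 and the subsequent averaging across passes can be sketched as follows. The function names and the data layout (each pass as a list of position-velocity pairs with strictly increasing positions) are assumptions made for illustration only.

```python
def interpolate_velocity(x_res, x_a, v_a, x_b, v_b):
    """Equation 30: linear interpolation of velocity between two sampled points."""
    if x_b == x_a:
        raise ValueError("x_a and x_b must differ")
    return v_a + (v_b - v_a) * (x_res - x_a) / (x_b - x_a)


def resample(samples, grid):
    """Evaluate one pass on a common grid of positions.

    `samples` is a list of (position, velocity) pairs with strictly
    increasing positions; every grid point must lie inside that range.
    """
    out = []
    for x in grid:
        for (x_a, v_a), (x_b, v_b) in zip(samples, samples[1:]):
            if x_a <= x <= x_b:
                out.append(interpolate_velocity(x, x_a, v_a, x_b, v_b))
                break
        else:
            raise ValueError("grid position outside sampled range")
    return out


def average_profile(passes, grid):
    """Average the resampled velocities across passes, point by point."""
    resampled = [resample(p, grid) for p in passes]
    return [sum(col) / len(col) for col in zip(*resampled)]
```

Resampling every pass onto the same grid before averaging is what allows velocities recorded at different positions in different passes to be combined into a single profile.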

The following results are averaged over a length of the track starting from 2 m before the door up to the doorway (along the Y axis), for 50 such velocity profiles. In order to maintain regional consistency with each door size, the positions of the agent taken into account for averaging the velocities are (1) the position of the door itself and (2) half of the door width [−2*d*pos, 2*d*pos] on either side of the door along the width of the track.

#### **RATIONALE BEHIND OPTIMIZATION STRATEGY AND MODEL BEHAVIOR**

PD is a condition marked by decreased dopamine levels in the BG, and hence the simulations of PD, starting from the controls, are first directed toward understanding the role of the parameter δlim. The dopamine analogue δlim is varied over [−1, 1], where −1 represents highly depleted conditions and +1 the unclamped control condition, and the stride length is determined at each level of δlim as the model output. The simulations are carried out in a freeze-stimulating narrow door case (following the simulation criteria used for the Cowie et al. study, see **Table 2**) with all other parameters kept constant at control levels. The model shows no significant differences in stride length on varying δlim, as seen in **Figure 8A**. Incidentally, the Cowie et al. study makes an interesting observation about the velocity profiles of PD OFF and PD ON freezing subjects. Medication does not alter the stride trends across doorways seen in both ON and OFF states (i.e., although PD ON subjects have longer strides than PD OFF subjects, both classes of subjects show sensitivity to doorway size), suggesting the involvement of factors other than dopamine in freezing events (**Figure 10A**).

These observations prompted us to investigate other parameters that could bring about the behavioral trend seen in the freezers. The discount factor γ and the exploration parameter σ are good modulatory candidates to explore apart from dopamine, because they have been linked to the neuromodulators serotonin and norepinephrine, respectively (Doya, 2002; Tanaka et al., 2007), and because the levels of these neuromodulators have been shown to be altered in PD and under medication (Chalmers et al., 1971; Fahn et al., 1971). Varying γ and σ individually produces changes in the stride lengths, as seen in **Figure 8B**. These simulations (also following the simulation criteria used for the Cowie et al. study, see **Table 2**) are carried out at the unclamped, or control, level of δlim under a narrow doorway. The variation in stride length motivates optimizing γ and σ in PD conditions to match the trends seen in the experiments.



**Table 3 | Parameter values for different condition settings for δlim, γ, σ, and δmed for the Almeida and Lebold (2010) study.**


The velocity profiles obtained from the model for the Cowie et al. (2010) study, for controls and the PD condition, are shown in **Figures 9A,B** respectively. In controls there is a reduction in velocity on approaching the doorway, which is exaggerated in PD conditions. The velocity near the doorway is normalized by the average velocity calculated far before the doorway (5–6 m), as seen in **Figures 9A,B**. Additionally, the simulations show a door-size dependent scaling of velocity in the case of PD subjects. The simulated stride length profile for controls, PD ON and PD OFF under different doorway sizes is shown in **Figure 10B**, and that of the experiments (Cowie et al., 2010) in **Figure 10A**. The average stride length of controls is higher than that of the PD patients. In the model, we also found that the PD ON case has higher mean velocities than PD OFF, in agreement with experimental data. Our simulation results also show that there is a significant difference in stride lengths (*p* < 0.005) between the wide/medium door and the narrow door conditions in both PD ON and PD OFF states (**Figure 10**). The value function profiles for controls and PD show marked differences (**Figure 11**). The value function for controls shows a positive gradient in the vicinity of the door, suggesting the presence of a reward at the door. In the case of PD patients, the value function is inverted and dips before the doorway, indicating low reward expectancy near the doorway. Since the GEN dynamics (Equation 12) depend on the gradient of the value function (represented by δ*V*, Equation 10), this negative gradient of the value function may be a factor contributing to the velocity dip near the doorway.

Almeida and Lebold (2010) show differences in gait patterns between PD ON freezers and non-freezers. The experiments, conducted in the ON condition, report that the PD freezers group produces significantly reduced step lengths compared to non-freezers and controls. This reduction in step length is further amplified in the case of reduced door sizes (*d*length). PD freezers also show changes in step length variability, a clear concomitant feature of freezing (Almeida et al., 2007). Our model captures this effect, and we present our results in terms of step length and step length coefficient of variation (CV). PD freezers show significantly reduced step lengths compared to non-freezers (*p* < 0.05) and controls (*p* < 0.005) under all door conditions (**Figure 12**). In order to capture the increased variability observed in PD freezers, the CV of step length within a trial is determined (**Figure 13B**) throughout the corridor facing the doorway, and is averaged across trials (*N* = 50). Step length CV shows trends similar to those seen in the original study (**Figure 13A**), where the PD freezers show significantly higher CV in comparison to controls and PD non-freezers in all three door conditions. The step length CV reported in Almeida and Lebold (2010) is hypothesized to reflect an unstable gait and voluntary control exerted over it.
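The step length CV measure described above can be sketched as follows. The text does not state whether the sample or population standard deviation is used, or whether the CV is expressed as a percentage; the sample standard deviation and a plain ratio are assumed here, and the function names are hypothetical.

```python
from statistics import mean, stdev


def step_length_cv(step_lengths):
    """Coefficient of variation (SD / mean) of step lengths within one trial.

    Assumption: sample standard deviation; returned as a ratio, not a percentage.
    """
    return stdev(step_lengths) / mean(step_lengths)


def mean_cv_across_trials(trials):
    """Average the within-trial CV over all trials (N = 50 in the simulations)."""
    return mean(step_length_cv(trial) for trial in trials)
```

Computing the CV within each trial before averaging keeps between-trial differences in mean step length from inflating the variability measure.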

One conclusion from the experiments is that dopamine reduction, modeled here by clamping δ, cannot alone lead to FOG (Almeida and Lebold, 2010; Cowie et al., 2010). The simulations reinforce this conclusion. We therefore studied the role of other model parameters, including γ and σ, in bringing about FOG (Almeida and Lebold, 2010; Cowie et al., 2010). The results suggest that several factors are involved in an event like FOG, and that a single parameter (δ, γ, or σ) might not be sufficient to produce the observed effect of freezing. A plausible neurobiological interpretation of this modeling conclusion is presented in the following section.

### **DISCUSSION**

In this study, we model gait changes and the occurrence of FOG in PD patients walking through doorways of different sizes. Our model reproduces the results of the studies of Cowie et al. (2010) and Almeida and Lebold (2010). The model shows a significant decrease in velocity (seen as a dip in the velocity profile) and in stride length for PD (ON/OFF) compared to controls, as seen in Cowie et al. (2010). The decrease in velocity observed in the controls and PD (ON/OFF) freezers also varies significantly with door size, i.e., a reduction in door size increases the dip in velocity near the doorway. The step length profiles of controls, PD freezers and PD non-freezers are also reproduced in concordance with the Almeida and Lebold (2010) results. We show that PD freezers produce significantly smaller steps than the controls and PD non-freezers in all the doorway conditions. Furthermore, within the PD freezers there exists a doorway effect, with the narrowest door producing the shortest step length. In addition, we replicate the observed trend of increased CV in step length in PD freezers compared to non-freezers and controls.

FOG is a characteristic feature highlighting the influence of the cortico-BG loop on the spinal rhythms in gait generation (Lewis and Barker, 2009; Naismith et al., 2010). Gait is a motor function that can be driven by spinal circuits and error-correcting systems like the BG, with only limited conscious and voluntary control from the motor cortical areas (Takakusaki et al., 2008). Certain external conditions, for example confined spaces, might force a shift toward increased voluntary control of gait (Maruyama and Yanagisawa, 2006). Furthermore, the manifestations of FOG as start-hesitation, destination-hesitation and obstacle avoidance have been thought to result from impairment in willed / voluntary action (Maruyama and Yanagisawa, 2006). Lewis and Barker (2009) hypothesized that freezing might also result from depletion of the available dopamine (δ) on induction of high cognitive loads.

To our knowledge, there are no existing computational models explaining FOG in PD. Our model addresses this gap by considering the impact of different levels of control on gait. The model consists of two stages of control acting on the locomotor apparatus: the cortico-BG module and the CPGs. The cortico-BG module uses RL concepts for learning the environment in which the agent is placed (for navigating through doorways of variable widths). The BG dynamics are modeled through the GEN, which has been tested in many of our earlier studies (Sridharan et al., 2006; Magdoom et al., 2011; Kalva et al., 2012). This module outputs a higher-level control parameter, such as the velocity of gait, to be passed on to the next stage of control: the CPGs. The CPGs are modeled through dynamic adaptive Hopf oscillators (Righetti and Ijspeert, 2006) representing the rhythmic spinal cord activity aiding locomotion. Here the velocity obtained from the cortico-BG module is translated to the joint angle displacement during gait. This joint angle information is converted to translatory motion in terms of stride / step lengths in the locomotor apparatus. This modeling approach offers two major advantages: (1) It consolidates the essential functioning of the two stages of control in an abstract manner to explain FOG, which a detailed, purely CPG-driven biped model of gait (Taga et al., 1991; Mori et al., 2004) cannot reproduce. (2) It also explains the non-dopamine dependence of FOG seen in the experiments modeled in this study (Almeida and Lebold, 2010; Cowie et al., 2010). The results point to the role of the other parameters used in the study (γ and σ) in explaining the context-dependent freezing phenomenon.

#### **INFLUENCE OF δ, γ AND σ PARAMETERS AND THEIR PLAUSIBLE CORRELATES**

As discussed above, δ is the correlate of dopamine function, representing the temporal difference (TD) error in the value function. Since dopamine deficiency is generally considered the crucial factor, the "star of the show" (Lewitt, 2012), responsible for PD-related impairment, RL-based computational models of BG function typically propose TD error (a dopamine correlate) as the key variable that controls normal and pathological function. It should be noted that the study by Cowie et al. (2010) made the interesting observation that the L-Dopa medication given to restore the dopamine levels of PD freezers did not have a significant effect on the sensitivity to doorways. This is captured effectively by our model, as seen in **Figure 8A**. The figure supports the non-dopamine dependence of FOG by showing no significant changes in the stride length simulated for the narrow doorway (width 2 m) under various clamped δ conditions at control levels of γ and σ. It is also known that significant changes in other key neuromodulators like norepinephrine, serotonin and acetylcholine are observed in PD, though these findings have not sufficiently influenced mainstream thinking about PD pathogenesis.

Norepinephrine is involved in important brain functions like wakefulness, vigilance and circadian rhythms (Aston-Jones et al., 1994; Yu and Dayan, 2005; Lewitt, 2012). Similar to the loss of dopaminergic cells in the SNc, there is a marked loss of norepinephrine-releasing cells in the Locus Coeruleus (LC) in PD (Cash et al., 1987; Del Tredici et al., 2002). Loss of norepinephrine is found to produce more pronounced motor impairment than the destruction of dopamine fibers caused by MPTP (Rommelfanger and Weinshenker, 2007). Serotonin is known to be significantly involved in a wide spectrum of functions, ranging from moods like anxiety and depression, which can lead to major disorders such as bipolar disorder, major depression and schizophrenia, to sensitivity to, and prediction of, reward and punishment in action selection (Lopez-Ibor, 1992; Vaswani et al., 2003; Boureau and Dayan, 2011; Rogers, 2011). There is evidence for altered serotonergic transmission and its involvement in motor impairment in PD (Fahn et al., 1971; Kish et al., 2008). It would be interesting to have a theory of BG function that combines the action of dopamine, norepinephrine and serotonin.

There was indeed an attempt to accommodate the function of all four neuromodulators (dopamine, serotonin, norepinephrine and acetylcholine) in a unified theoretical framework based on RL (Doya, 2002). According to this view, dopamine represents the TD error; norepinephrine represents exploration, denoted by the temperature parameter, β; serotonin represents the discount parameter, γ; and acetylcholine represents the learning rate, η. Specifically, within BG circuitry, it was suggested that GP is the substrate for exploration (Doya, 2002). GP is also known to have high levels of norepinephrine (Russell et al., 1992). From a purely dynamical point of view, the chaotic dynamics of the STN-GPe system qualify it to serve as a source of exploratory drive, an idea that has been investigated extensively using computational models (Sridharan et al., 2006; Ranganathan et al., 2012). In the present model, the exploration parameter, σ, denotes the extent of exploration, and therefore may be described as a correlate of norepinephrine in the BG. Similarly, serotonin has been linked to the discount factor, γ, or the time-scale of reward integration, with larger values of γ corresponding to higher levels of serotonin (Tanaka et al., 2007). Low levels of serotonin have been associated with impulsivity, a behavior that may be thought of as a result of short-term reward seeking (Rogers, 2011). Based on the arguments just described, we adjust both γ and σ, which represent serotonin and norepinephrine respectively, in addition to δlim and δmed, which are related to dopamine levels, in the present model to capture PD-related gait changes.
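To illustrate how γ sets the time-scale of reward integration, the following sketch uses the standard RL definition of a discounted reward. It assumes, purely for illustration, a single reward at the door (the value *r* = 5 used in the simulations) and one discount step per stride; the function name and the γ values in the usage note are hypothetical, and the model's actual value function is learned rather than computed in this closed form.

```python
def discounted_value(reward: float, steps_to_reward: int, gamma: float) -> float:
    """Present value of a single reward received `steps_to_reward` steps ahead."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return (gamma ** steps_to_reward) * reward
```

For example, two strides before the door, the reward of 5 is worth 1.25 when γ = 0.5 but about 4.05 when γ = 0.9, so a smaller γ makes the door reward influence behavior only very close to the doorway.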

Therefore, in addition to incorporating PD-related changes in δ (δlim and δmed, corresponding to the OFF and ON conditions respectively), we also explore the effect of the discount factor (γ) and exploration parameter (σ) on the velocity profile of the agent. These parameters have distinct roles in the model. By lowering σ it is possible to produce the velocity dip and stride length decrease, as in **Figures 8B**, **9B**. As a result, PD freezers (ON / OFF) are modeled with lower σ compared to controls (see **Table 2**). The lower γ maintains the doorway effect between the controls and PD freezers and also emphasizes the fact that smaller values of γ fit PD velocity profiles better in the model, reflecting reduced serotonin levels in PD patients compared to controls (**Figure 8B**). Specifically, the PD ON conditions are modeled by increased σ compared to the PD OFF case and the addition of δmed in the model (described in the section Simulating the GEN). Modulating σ in addition to δmed in PD ON implies that the medication increases the norepinephrine levels in the BG. There is evidence supporting this claim: norepinephrine levels do increase on uptake of dopamine medication (Chalmers et al., 1971), and L-Dopa treated rats have been found to have higher levels of norepinephrine, mainly in the striatum, hypothalamus, brainstem and cerebellum (Romero et al., 1972). Taking these factors into account, the model incorporates changes in σ, which give a much better match to the experimental data than altering δmed alone. This further led us to hypothesize that, even among PD subjects, the freezers have decreased serotonin and norepinephrine compared to non-freezers. Under the conditions of PD non-freezers, the γ and σ levels increase in comparison to the PD freezers (**Table 3**). These results lead us to propose that the γ and σ values reflect the importance of considering other neuromodulators, serotonin and norepinephrine respectively, in context-dependent FOG.

We conclude that the loss of dopaminergic cells alone cannot explain the FOG mechanism observed in PD patients. We predict that altered levels of serotonin and norepinephrine may contribute to freezing. Future work will be aimed at the development of a more detailed network model of the BG and its role in gait control. Such a model should elucidate the contributions of dopamine, serotonin and norepinephrine to gait in normal and PD conditions.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 September 2013; accepted: 21 December 2013; published online: 09 January 2014.*

*Citation: Muralidharan V, Balasubramani PP, Chakravarthy VS, Lewis SJG and Moustafa AA (2014) A computational model of altered gait patterns in parkinson's disease patients negotiating narrow doorways. Front. Comput. Neurosci. 7:190. doi: 10.3389/fncom.2013.00190*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Muralidharan, Balasubramani, Chakravarthy, Lewis and Moustafa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**

The Genetic Algorithm (Goldberg, 1989) option set for optimization is given in the following table. Optimization toolbox 6.0, Matlab R2011a, The Mathworks Inc. is used.


Cost function: (Expt measure − Sims measure)<sup>2</sup>

$$C = 0.5 \sum_{i=1}^{3} (ex_i - sims_i)^2 + 0.5 \left\{ \left[ (ex_1 - ex_2) + (ex_2 - ex_3) + (ex_1 - ex_3) \right] - \left[ (sims_1 - sims_2) + (sims_2 - sims_3) + (sims_1 - sims_3) \right] \right\}$$

Here "*ex*" refers to the experimental stride length values at each doorway (1: wide, 2: medium, 3: narrow) and "*sims*" is the model's output for a given set of parameter values. The parameters optimized in each condition are described in the Results section.

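The cost function above can be written as a short function. This sketch follows the expression as printed, in which the gradient term is not squared; the function and variable names are hypothetical, and the numbers in the usage note are illustrative rather than experimental values.

```python
def cost(ex, sims):
    """Cost over three doorways (index 0: wide, 1: medium, 2: narrow).

    `ex` holds experimental stride lengths, `sims` the simulated ones.
    """
    if len(ex) != 3 or len(sims) != 3:
        raise ValueError("expected one value per doorway (wide, medium, narrow)")

    # (a) magnitude term: squared error at each doorway
    magnitude = 0.5 * sum((e - s) ** 2 for e, s in zip(ex, sims))

    # (b) gradient term: pairwise differences between doorways
    def gradient_sum(x):
        return (x[0] - x[1]) + (x[1] - x[2]) + (x[0] - x[2])

    gradient = 0.5 * (gradient_sum(ex) - gradient_sum(sims))
    return magnitude + gradient
```

For instance, `cost([3.0, 2.0, 1.0], [2.0, 2.0, 2.0])` gives 3.0: the magnitude term contributes 1.0 and the gradient term 2.0, penalizing a simulation that matches the medium doorway but shows no doorway effect.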
## A neurocomputational theory of how explicit learning bootstraps early procedural learning

*Erick J. Paul<sup>1</sup> and F. Gregory Ashby<sup>2</sup>\**

*<sup>1</sup> Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana, Champaign, IL, USA <sup>2</sup> Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Carol Seger, Colorado State University, USA Srinivasa Chakravarthy, Indian Institute of Technology Madras, India*

#### *\*Correspondence:*

*F. Gregory Ashby, Department of Psychological and Brain Sciences, University of California, 552 University Road, Santa Barbara, Santa Barbara, CA 93106, USA e-mail: ashby@psych.ucsb.edu*

It is widely accepted that human learning and memory are mediated by multiple memory systems that are each best suited to different requirements and demands. Within the domain of categorization, at least two systems are thought to facilitate learning: an explicit (declarative) system depending largely on the prefrontal cortex, and a procedural (non-declarative) system depending on the basal ganglia. Substantial evidence suggests that each system is optimally suited to learn particular categorization tasks. However, it remains unknown precisely how these systems interact to produce optimal learning and behavior. In order to investigate this issue, the present research evaluated the progression of learning through simulation of categorization tasks using COVIS, a well-known model of human category learning that includes both explicit and procedural learning systems. Specifically, the model's parameter space was thoroughly explored in procedurally learned categorization tasks across a variety of conditions and architectures to identify plausible interaction architectures. The simulation results support the hypothesis that one-way interaction between the systems occurs such that the explicit system "bootstraps" learning early on in the procedural system. Thus, the procedural system initially learns a suboptimal strategy employed by the explicit system and later refines its strategy. This bootstrapping could be from cortical-striatal projections that originate in premotor or motor regions of cortex, or possibly by the explicit system's control of motor responses through basal ganglia-mediated loops.

**Keywords: basal ganglia, categorization, computational model, COVIS, declarative, non-declarative, procedural learning**

## **INTRODUCTION**

The existence of multiple memory systems was theorized as early as the 1970s (Tulving, 1972), and the idea was more formally stated in the mid-1980s (Tulving, 1985). The distinction between declarative (i.e., explicit, conscious memory for specific events or facts) and non-declarative (i.e., implicit) memory systems is now well established (e.g., Poldrack and Packard, 2003).

The fundamental ideas driving multiple memory system theories have been applied more recently to the domain of category learning. Perhaps the most successful multiple system model of category learning is COVIS (Ashby et al., 1998). COVIS assumes two interacting systems that each map onto previously hypothesized human memory systems (Ashby and O'Brien, 2005). The COVIS explicit system uses declarative memory and mediates learning in tasks that require hypothesis testing, logical reasoning, and the application of verbalizable rules. In contrast, the procedural system uses non-declarative memory and learns to gradually associate motor programs with regions of perceptual space through reinforcement learning (Ashby and Waldron, 1999; Ashby et al., 2007).

Computational and mathematical models based on other theories of human category learning have been described, tested, and compared (e.g., Homa et al., 1979; Hintzman, 1984; Nosofsky, 1986; Kruschke, 1992; Ashby and Maddox, 1993; Smith and Minda, 2000; Love et al., 2004), and even other multiple-systems accounts have been proposed (Erickson and Kruschke, 1998; Anderson and Betz, 2001), though no other model has been formulated with such deep ties to known neurobiology as COVIS. It is precisely because of these neurobiological constraints that COVIS has been such a successful model of human category learning.

The neurobiology underlying its two distinct systems has been well described (Ashby et al., 1998; Ashby and Valentin, 2005; Ashby and Ennis, 2006). Furthermore, the neurobiological motivation of COVIS serves to constrain the model by utilizing the known neural basis of the constituent memory systems responsible for category learning to define the function and implementation of each system. Following is a brief description of the two systems of COVIS.

## **COVIS**

As mentioned above, the explicit system of COVIS learns in tasks that require logical reasoning and explicit rules. This system tests simple hypotheses about category membership by allocating executive attention to single stimulus dimensions and then formulating explicit rules using Boolean algebra (e.g., logical conjunctions). Working memory is used to store candidate rules during testing. The COVIS explicit system is assumed to be mediated by a broad neural network that includes the prefrontal cortex (PFC), the anterior cingulate cortex (ACC), the head of the caudate nucleus, and medial temporal lobe (MTL) structures.

The procedural system of COVIS learns to associate motor programs with multidimensional stimuli via reinforcement learning (Ashby and Waldron, 1999). The neuroanatomical basis is the direct pathway through the basal ganglia. More specifically, for visual stimuli the relevant structures are early visual cortex (excluding V1) and parietal visual areas, posterior putamen, the internal segment of the globus pallidus (GPi), the ventral-anterior and ventral-lateral thalamic nuclei, and premotor cortex [i.e., supplementary motor area (SMA) and/or dorsal premotor cortex (PMd)]. The key site of learning is at cortical-striatal synapses, and plasticity there follows reinforcement learning rules, with dopamine from the substantia nigra pars compacta (SNpc) serving as the training signal.

COVIS assumes that the explicit system dominates in rule-based (RB) category-learning tasks. In RB tasks, the categories can be separated by a rule that is easy to describe verbally. In many cases, only a single stimulus dimension is relevant, although tasks where the optimal strategy is a logical conjunction are also RB. An example is shown in the bottom panel of **Figure 1**. Here, stimuli are discs with alternating black/white bars that vary in their orientation and thickness (or spatial frequency). A simple rule on the thickness dimension successfully accounts for the separation of the categories.

The top panel of **Figure 1** shows an example of an information-integration (II) category-learning task. Here, information about both stimulus dimensions (i.e., orientation and spatial frequency) must be (pre-decisionally) integrated to respond optimally. Note that there is no way to verbalize the optimal strategy in this task (denoted by the diagonal boundary). RB strategies could be (and frequently are) adopted in II tasks, but such strategies lead to suboptimal performance. COVIS assumes that when the explicit system fails to learn a task of this kind, it passes control to the procedural system.

Many dissociations in learning RB and II categories have been observed across a variety of methodologies and in numerous human and non-human populations (see Ashby and Maddox, 2005, 2011 for reviews). These dissociations strongly support the hypothesis that multiple memory systems contribute to category learning.

#### **INTERACTIONS BETWEEN DECLARATIVE AND PROCEDURAL MEMORY**

Although overwhelming evidence now supports the existence of functionally separate declarative and procedural memory systems, much less is known about how these systems interact. The first important question to answer is how a single response is selected, given that either system can presumably control behavior.

Logically, there are at least three possible ways to select a single response when two learning and memory systems are simultaneously active. One possibility is that the outputs of the constituent systems are mixed or blended to produce the final output. This assumption is made by several currently popular categorization models (Erickson and Kruschke, 1998; Anderson and Betz, 2001).

Mixture models assume that all categorization responses reflect a mixture of declarative and procedural processes, so that the difference between RB and II tasks is quantitative rather than qualitative—that is, the mixture might weight the declarative output more heavily in an RB task than in an II task, but some weight is always given to both systems. The problem for mixture models is to account for the many behavioral dissociations that have been reported between performance in RB and II tasks. For example, a simultaneous dual task greatly interferes with RB category learning, but not with II learning (Waldron and Ashby, 2001; Zeithamova and Maddox, 2006). If the dual task interferes with the use of declarative memory and responding is always a mixture of declarative and procedural memory processes then it seems that a dual task should interfere with both RB and II learning.


A second logical possibility, which we call *soft switching*, is that only one system controls each response, but that control is passed back and forth between the systems on a trial-by-trial basis. This is the assumption made by the original version of COVIS. More specifically, COVIS assumes that the response on each trial is generated by whichever system is most confident in the accuracy of its response (weighted by a system bias parameter).

Ashby and Crossley (2010) reported results that argue against soft switching. These experiments used hybrid categories (illustrated in **Figure 2**) that were constructed so that perfect performance is possible if participants use a 1D rule on disks with steep orientations and an II strategy on disks with shallow orientations. Nevertheless, fewer than 5% of participants were able to adopt a strategy of the optimal type (solid black line). On the other hand, Erickson (2008) reported the results of an experiment in which about 40% of his participants appeared to switch successfully between declarative (i.e., RB) and procedural (i.e., II) strategies on a trial-by-trial basis. But this experiment provided participants with several explicit cues that signaled which memory system to use on each trial. Despite all of these cues, most participants failed to switch strategies. Thus, although the existing data suggest that some participants are able to soft switch under some conditions, the evidence argues strongly against soft switching as the default mechanism to resolve system interactions.

A third logical possibility is *hard switching* (HS), in which one system is used exclusively and a single switch is made to the other system (when the task demands it). This hypothesis seems most consistent with the available data, in that it accounts for all the RB vs. II behavioral dissociations, as well as the data of Erickson (2008) and Ashby and Crossley (2010). Nevertheless, hard switching faces numerous theoretical challenges. For example, consider an II category-learning task. A hard-switch version of COVIS assumes that participants begin experimenting with explicit strategies, and then when these all fail, switch to a procedural strategy. If so, then what is happening within the procedural system early in learning when declarative memory systems control behavior? One possibility is that no procedural learning occurs until the procedural system controls behavior. This model makes a strong prediction. A suboptimal explicit strategy will almost always yield performance that is significantly above chance in an II task. Thus, if the procedural system cannot learn while the explicit system controls behavior, then a hard-switch model must predict that accuracy will rise to well above chance and then suddenly fall back to chance when control is passed to the procedural system. To our knowledge, none of the many published II studies has ever reported this strange data pattern.

The second possibility is that the procedural system learns on trials when the explicit system is controlling behavior. COVIS actually predicts such learning because it assumes that each system receives its own separate feedback signal that rewards or punishes the system for recommending either the correct or incorrect response, respectively, regardless of whether that recommendation was used to select the single response made on that trial. These separate feedback signals allow both systems to learn independently of the other, and regardless of whether or not they control responding.

The assumption that the explicit and procedural category-learning systems each receive their own independent feedback signal on every trial is very strong. In most category-learning experiments, only a single feedback signal is given that is based entirely on the emitted (observable) response; independent feedback would require sophisticated self-monitoring that might, for example, send a reward signal to the procedural system on trials when the emitted response was incorrect. The explicit system might have such flexibility: evidence suggests that the PFC can handle abstract forms of feedback and generate second-order feedback (Ashby et al., 2002; Maddox et al., 2003; Cools et al., 2006; Maddox et al., 2008; Wallis and Kennerley, 2010). On the other hand, the available evidence does not support the hypothesis that the procedural system can self-monitor and generate its own feedback. Learning in the procedural system depends on dopamine signals in the striatum. Dopamine neurons in SNpc respond to rewards and reward-predicting stimuli, and also encode reward prediction error (Schultz et al., 1997, 2000; Tobler et al., 2003), which depends in part on the valence of the feedback. This suggests that the basal ganglia are unable to flexibly manipulate feedback signals, and instead respond to the valence (and expectation) of feedback (and reward). As such, the possibility of independent feedback seems unlikely.

A more plausible solution is to assume that each system receives the feedback elicited by the observable response and that each system must use this common feedback to guide learning. The theoretical challenge for this hypothesis is to show how one system can learn from feedback that is based on the response of the other system. Because the evidence is good that the explicit system dominates during early responding, this question is moot for tasks that are learned by the explicit system (e.g., RB tasks): in such cases, evidence suggests that the procedural system is never used. In II tasks, however, we expect the explicit system to control early learning and the procedural system to control late learning. The key question addressed by this article is how the procedural system can learn during trials when the explicit system controls responding and there is only a single source of feedback.

To begin, we need to identify a candidate neural mechanism that mediates the competition between the two systems. The hypothesis that the procedural system is able to learn while the explicit system is in control suggests that when in control, the explicit system prevents the procedural system from accessing motor output systems, but does not interfere with learning. Some independent evidence supports this hypothesis (e.g., Packard and McGaugh, 1996; Foerde et al., 2006, 2007).

Given these considerations, Ashby and Crossley (2010) proposed that frontal cortex and the subthalamic nucleus (STN) control system interactions via the hyperdirect pathway through the basal ganglia. The hyperdirect pathway begins with direct excitatory glutamate projections from frontal cortex (via presupplementary cortex, preSMA) to the STN, which sends excitatory glutamate projections directly to the internal segment of the globus pallidus (GPi; Parent and Hazrati, 1995; Joel and Weiner, 1997), making it more difficult for striatal activity to affect cortex. The evidence that the preSMA is in this circuit comes from several sources. First, Hikosaka and Isoda (2010) reviewed evidence that the preSMA is crucial for switching between controlled and automatic responding. Second, Waldschmidt and Ashby (2011) reported that after 20 sessions of practice, the only brain areas in which neural activation correlated with accuracy in an II task were the preSMA and SMA.

Evidence for a role of the STN in this model comes largely from studies using the stop-signal task, in which participants initiate a motor response as quickly as possible when a cue is presented. On some trials, a second cue is presented soon after the first, signaling participants to inhibit their response. A variety of evidence implicates the STN in this task (Aron and Poldrack, 2006; Aron et al., 2007; Mostofsky and Simmonds, 2008). A popular model is that the second cue generates a "stop signal" in cortex that is rapidly transmitted through the hyperdirect pathway to the GPi, where it cancels out the "go signal" being sent through the striatum.

Ashby and Crossley (2010) hypothesized that, when the explicit system is controlling behavior, a stop signal may inhibit a potentially competing response generated by the procedural system: the PFC could increase STN output via the hyperdirect pathway, preventing the procedural system from influencing cortical motor systems, thereby allowing the explicit system to control the overall response. Note that because the inhibition occurs downstream from the striatum, this hypothesis theoretically allows procedural learning to occur within the striatum. The next section describes several different model architectures that will be used to explore the conditions under which procedural learning can occur with a single source of feedback.

#### **SIMULATION SET 1—COVIS ARCHITECTURES**

The basic model that we will use in all investigations is COVIS (Ashby et al., 1998, 2011). The original version of COVIS assumed soft switching and independent feedback. However, to gain more insight into system interactions, we will explore a number of alternative model architectures. For the reasons described above, our focus will be on two model features: the nature of the feedback signal (separate signals to each system vs. a single feedback signal); and the switching mechanism (soft vs. hard). Following are descriptions of three alternative versions of COVIS that explore the effects of varying these two features.

#### **MODEL 0: INDEPENDENT FEEDBACK, SOFT SWITCHING**

Our first goal is to implement COVIS computationally as described in Ashby et al. (2011). Only components critical to the simulations will be described; interested readers are encouraged to review the complete description elsewhere (e.g., Ashby et al., 1998, 2011; Hélie et al., 2012a,b). Model 0 is simply the independent feedback model previously described with some simplifications made to each system to improve the likelihood of learning in the procedural system, and to maximize computational efficiency.

First, because our goal is not to elucidate the nature of learning in the explicit system, it was heavily simplified to respond with the most accurate possible one-dimensional rule from the outset of the simulated experiment (in normal applications, the explicit system selects and tests a variety of rules). Thus, in current applications, the explicit system has no free parameters and reduces to the equation below. The response rule is: respond A on trial *n* if *hE*(*x*) < 0; respond B if *hE*(*x*) > 0; and the discriminant function is defined as:

$$h_E(\mathbf{x}) = x_d - C_d - \varepsilon_E, \tag{1}$$

where *x*<sub>*d*</sub> is the value of stimulus *x* on dimension *d*, *C*<sub>*d*</sub> is a constant that plays the role of a decision criterion (typically learned; hard-coded here) and ε<sub>*E*</sub> is a normally distributed random variable with mean 0 and variance σ<sup>2</sup><sub>*E*</sub> that models the variability in both stimulus perception and memory of the decision criterion. When σ<sup>2</sup><sub>*E*</sub> is large, the discriminant value becomes less deterministic, so σ<sup>2</sup><sub>*E*</sub> was set to zero in the present application.
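A minimal sketch of Equation 1 and its response rule may make the simplification concrete. The function names, the tie-breaking choice at exactly zero, and the example criterion are illustrative assumptions, not part of the original implementation.

```python
import random

def explicit_discriminant(x_d, c_d, sigma_e=0.0, rng=random):
    """Equation 1: h_E(x) = x_d - C_d - eps_E, with eps_E ~ N(0, sigma_e^2)."""
    # With sigma_e = 0 (the setting used here) the noise term vanishes.
    eps = rng.gauss(0.0, sigma_e) if sigma_e > 0 else 0.0
    return x_d - c_d - eps

def explicit_response(x_d, c_d, sigma_e=0.0):
    """Respond 'A' if h_E(x) < 0 and 'B' if h_E(x) > 0.

    The text leaves h_E(x) == 0 unspecified; assigning it to 'B'
    is an assumption made only for this sketch.
    """
    return "A" if explicit_discriminant(x_d, c_d, sigma_e) < 0 else "B"
```

With a hypothetical criterion of 50, a stimulus value of 40 on the relevant dimension yields response A and a value of 60 yields response B.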

The COVIS procedural system is a two-layer connectionist network. The input layer includes 625 input units arranged in a 25 × 25 grid (i.e., because the simulations will use stimuli that vary on two stimulus dimensions). Each input unit is tuned to a particular stimulus, in the sense that it is maximally activated when its preferred stimulus is presented, and it is less activated when similar stimuli are presented. The activation in sensory cortical unit *K* on trial *n* is given by

$$I_K(n) = e^{-\frac{d(K,\,\text{stimulus})^2}{\sigma_R}} \tag{2}$$

where *d*(*K*, stimulus) is the Euclidean distance (in stimulus space) between the stimulus preferred by unit *K* and the presented stimulus (i.e., in units of 25 × 25 space). Equation 2 is a radial basis function (e.g., Kruschke, 1992; Riesenhuber and Poggio, 1999). The constant σ<sub>*R*</sub> has the effect of expanding or narrowing the width of the radial basis function, much like a variance.
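The input layer of Equation 2 can be sketched as follows. The placement of preferred stimuli at integer grid coordinates, the row-major unit ordering, and any particular value of `sigma_r` are assumptions made for illustration.

```python
import math

GRID = 25  # 25 x 25 grid of input units (625 in total)

def input_activations(stimulus, sigma_r):
    """Equation 2 for every input unit.

    Assumption: unit K prefers the grid point (row, col), with units
    stored in row-major order, so K = row * GRID + col.
    """
    sx, sy = stimulus
    activations = []
    for row in range(GRID):
        for col in range(GRID):
            d2 = (row - sx) ** 2 + (col - sy) ** 2  # squared Euclidean distance
            activations.append(math.exp(-d2 / sigma_r))
    return activations
```

A unit whose preferred stimulus coincides with the presented stimulus has activation 1, and activation falls off smoothly with distance at a rate set by σ<sub>*R*</sub>.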

The output layer in the COVIS procedural system is assumed to represent the striatum. The model includes the same number of output units as there are alternative responses. All simulations described here used two categories (A and B), so all models included two output units. Activation of striatal unit *J* in the output layer is determined by the weighted sum of activations from the input layer projecting to it:

$$S_J(n) = \sum_{K=1}^{625} w_{K,J}(n)\, I_K(n) \tag{3}$$

where *w*<sub>*K,J*</sub>(*n*) is the strength of the synapse between cortical unit *K* and striatal cell *J* on trial *n*. The decision rule in the procedural system is similar to that of the explicit system. The decision rule is: respond A on trial *n* if *S*<sub>*A*</sub>(*n*) > *S*<sub>*B*</sub>(*n*); otherwise respond B. The relative activity between striatal units changes as the model learns, and learning is accomplished by adjusting the synaptic weights, *w*<sub>*K,J*</sub>(*n*), up or down from trial to trial via reinforcement learning.
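Equation 3 and the procedural decision rule amount to a weighted sum followed by a comparison. The sketch below assumes a dictionary of weight vectors keyed by response label, which is a representation chosen for illustration only.

```python
def striatal_activation(weights_j, inputs):
    """Equation 3: weighted sum of input activations for one striatal unit."""
    return sum(w * i for w, i in zip(weights_j, inputs))

def procedural_response(weights, inputs):
    """Respond 'A' if S_A(n) > S_B(n); otherwise respond 'B'.

    `weights` maps each response label to its vector of cortical-striatal
    weights (hypothetical layout). Returns the response and both activations.
    """
    s_a = striatal_activation(weights["A"], inputs)
    s_b = striatal_activation(weights["B"], inputs)
    return ("A" if s_a > s_b else "B"), s_a, s_b
```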

The weights in the procedural system are updated based on three factors: (1) pre-synaptic activation, (2) post-synaptic activation, and (3) dopamine levels (e.g., Arbuthnott et al., 2000; Ashby and Hélie, 2011). Specifically, *w*<sub>*K,J*</sub>(*n*) is updated after each trial using the following learning rule:

$$\begin{aligned} w_{K,J}(n+1) &= w_{K,J}(n) + \alpha I_K(n) \left[ S_J(n) - \theta_{\text{NMDA}} \right]^+ \\ &\quad \left[ D(n) - D_{\text{base}} \right]^+ \left[ w_{\text{max}} - w_{K,J}(n) \right] \\ &\quad - \beta I_K(n) \left[ S_J(n) - \theta_{\text{NMDA}} \right]^+ \left[ D_{\text{base}} - D(n) \right]^+ w_{K,J}(n) \\ &\quad - \gamma I_K(n) \left[ \theta_{\text{NMDA}} - S_J(n) \right]^+ \left[ S_J(n) - \theta_{\text{AMPA}} \right]^+ w_{K,J}(n) \end{aligned} \tag{4}$$

The function [*g*(*n*)]<sup>+</sup> equals *g*(*n*) if *g*(*n*) > 0, and 0 otherwise. The constant *D*<sub>base</sub> is the baseline dopamine level, *D*(*n*) is the amount of dopamine released on trial *n* following feedback, and α, β, γ, θ<sub>NMDA</sub>, and θ<sub>AMPA</sub> are all constants. The first three of these (i.e., α, β, and γ) operate like standard learning rates because they determine the magnitudes of increases and decreases in synaptic strength. The constants θ<sub>NMDA</sub> and θ<sub>AMPA</sub> represent the activation thresholds for post-synaptic NMDA and AMPA (more precisely, non-NMDA) glutamate receptors, respectively. Numerically, θ<sub>NMDA</sub> > θ<sub>AMPA</sub> because NMDA receptors have a higher threshold for activation than AMPA receptors. This is critical because NMDA receptor activation is required to strengthen cortical-striatal synapses (Calabresi et al., 1996).

The positive term in Equation (4) describes the conditions under which synapses are strengthened (i.e., striatal activation above the threshold for NMDA receptor activation and dopamine above baseline) and the two negative terms describe conditions that cause the synapse to be weakened. The first possibility (line 3) is that post-synaptic activation is above the NMDA threshold but dopamine is below baseline (as on an error trial), and the second possibility is that striatal activation is between the AMPA and NMDA thresholds. Note that synaptic strength does not change if post-synaptic activation is below the AMPA threshold.
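The three-factor rule of Equation 4 can be written as a single-weight update. This is a sketch under stated assumptions: the default `w_max = 1.0` and all numerical parameter values used with it are hypothetical, whereas the default `d_base = 0.2` follows the baseline dopamine level given for Equation 8.

```python
def pos(x):
    """[x]^+ : returns x if x > 0, and 0 otherwise."""
    return x if x > 0 else 0.0

def update_weight(w, i_k, s_j, d, *, alpha, beta, gamma,
                  theta_nmda, theta_ampa, d_base=0.2, w_max=1.0):
    """One application of Equation 4 to a single synapse w_{K,J}."""
    # Strengthening: activation above NMDA threshold, dopamine above baseline.
    ltp = alpha * i_k * pos(s_j - theta_nmda) * pos(d - d_base) * (w_max - w)
    # Weakening: activation above NMDA threshold, dopamine below baseline.
    ltd_dopamine = beta * i_k * pos(s_j - theta_nmda) * pos(d_base - d) * w
    # Weakening: activation between the AMPA and NMDA thresholds.
    ltd_weak = gamma * i_k * pos(theta_nmda - s_j) * pos(s_j - theta_ampa) * w
    return w + ltp - ltd_dopamine - ltd_weak
```

Because every term is gated by a rectified factor, a synapse whose post-synaptic activation is below the AMPA threshold is returned unchanged, as the text describes.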

The Equation (4) model of reinforcement learning requires that the amount of dopamine released in response to the feedback signal [that is, the *D*(*n*) term] be specified on every trial. COVIS adopts the popular model that, over a wide range, dopamine firing is proportional to the reward prediction error (RPE) (e.g., Schultz et al., 1997; Tobler et al., 2003):

$$\text{RPE} = \text{Obtained Reward} - \text{Predicted Reward} \tag{5}$$

The procedural system of COVIS uses a simple model of dopamine release by first computing both obtained and predicted reward, and then by estimating the amount of dopamine released as a function of RPE.

In applications that do not vary the valence of the rewards, the obtained reward *Rn* on trial *n* is defined as

$$R_n = \begin{cases} +1 & \text{if correct feedback} \\ 0 & \text{if no feedback is given} \\ -1 & \text{if error feedback} \end{cases} \tag{6}$$

In models that assume separate, independent feedback signals, the reward signal is based on the response suggested by the procedural system regardless of which system initiated the response. In models that assume a single feedback signal, the reward is based on the observable response. The single-operator learning model (Bush and Mosteller, 1955) is used to compute predicted reward, *Pn*:

$$P_n = P_{n-1} + \alpha_{\text{pr}} \left( R_{n-1} - P_{n-1} \right) \tag{7}$$

It is well known that when computed in this fashion, *Pn* converges exponentially to the mean reward value and then fluctuates around this value (until reward contingencies change). The parameter αpr governs how quickly *Pn* converges.

Finally, to compute the dopamine release to the feedback, a simple model matching empirical results reported by Bayer and Glimcher (2005) is used:

$$D(n) = \begin{cases} 1 & \text{if } \text{RPE} > 1 \\ 0.8\,\text{RPE} + 0.2 & \text{if } -0.25 \le \text{RPE} \le 1 \\ 0 & \text{if } \text{RPE} < -0.25 \end{cases} \tag{8}$$

Note that the baseline dopamine level is 0.2 (i.e., when RPE = 0) and that dopamine levels increase linearly with RPE between a floor of 0 and a ceiling of 1.
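Equations 5 through 8 chain together into a short dopamine model, sketched below. The function names and the string labels used for feedback are illustrative assumptions.

```python
def obtained_reward(feedback):
    """Equation 6: +1 for correct, -1 for error, 0 when no feedback is given.

    The labels "correct", "error", and None are hypothetical encodings.
    """
    return {"correct": 1.0, "error": -1.0, None: 0.0}[feedback]

def predicted_reward(p_prev, r_prev, alpha_pr):
    """Equation 7: single-operator update of predicted reward."""
    return p_prev + alpha_pr * (r_prev - p_prev)

def dopamine(rpe):
    """Equation 8: piecewise-linear dopamine release as a function of RPE."""
    if rpe > 1.0:
        return 1.0
    if rpe < -0.25:
        return 0.0
    return 0.8 * rpe + 0.2

def dopamine_on_trial(feedback, p_n):
    """Equation 5 followed by Equation 8 for one trial."""
    rpe = obtained_reward(feedback) - p_n
    return dopamine(rpe)
```

For example, an unexpected correct response (predicted reward 0) gives RPE = 1 and maximal dopamine release of 1, whereas a fully predicted correct response (predicted reward 1) gives RPE = 0 and the baseline level of 0.2.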

For the final simplification, a strong form of lateral inhibition at the level of the striatum was assumed. Computationally, this amounts to updating only the weights associated with the striatal unit matching the response suggested by the procedural system. For example, if the procedural system suggests an "A" response, only the weights associated with the "A" striatal unit are modified. This simplification effectively serves a dual-purpose: it accelerates learning in the procedural system because only the weights relevant to that trial are updated and improves computational efficiency.

In order to resolve the competition between the systems and to select an overall model response, confidence is measured on every trial by calculating a discriminant value for each system. For the explicit system, the discriminant value equals the distance from the stimulus to the decision bound used by the explicit system. The confidence of the explicit system, which equals |*hE*(*n*)|, will be large on any trial where the stimulus is far (in stimulus space) from the response criterion. In the procedural system, confidence is measured by the difference in striatal unit activity and is defined by:

$$|h_P(n)| = |S_A(n) - S_B(n)| \tag{9}$$

Note that as with the explicit system, the procedural system will be highly confident when it strongly favors one response over another. Hence, the explicit system is more confident when the stimulus is far from the bound and the procedural is more confident when the stimulus is strongly associated with one motor response but not the other.

The trust placed in each system is determined by overall system weights, θ*E*(*n*) and θ*P*(*n*), where θ*E*(*n*) + θ*P*(*n*) = 1. Because humans are naturally rule preferring and there is no procedural learning at the beginning of an experiment, COVIS assumes that trust in the explicit system is initially much higher than in the procedural system, hence θ*E*(1) = 0.99. Throughout each experiment, the system weights are adjusted based on the success of the explicit system. When the explicit system suggests a correct response,

$$
\theta_E(n+1) = \theta_E(n) + \Delta_{\rm OC} \left[ 1 - \theta_E(n) \right], \tag{10}
$$

where Δ<sub>OC</sub> is a learning rate constant. If instead the explicit system suggests an incorrect response then

$$
\theta_E(n+1) = \theta_E(n) - \Delta_{\rm OE}\, \theta_E(n), \tag{11}
$$

where Δ<sub>OE</sub> is another rate constant. The two regulatory terms on the ends of Eqs. 10 and 11 restrict θ*E*(*n*) to the range 0 ≤ θ*E*(*n*) ≤ 1. Finally, on every trial, θ*P*(*n*) = 1 − θ*E*(*n*). Thus, Equations 10 and 11 also guarantee that θ*P*(*n*) falls in the range 0 ≤ θ*P*(*n*) ≤ 1. The parameters Δ<sub>OC</sub> and Δ<sub>OE</sub> control how fast θ*E*(*n*) changes in response to correct and incorrect feedback, respectively; thus, they also control how quickly θ*P*(*n*) changes, which is related to how frequently the procedural system will be allowed to generate the overall system response.

The overall system decision rule is to emit the response suggested by the explicit system if θ*E*(*n*) × |*hE*(*n*)| > θ*P*(*n*) × |*hP*(*n*)|; otherwise emit the response suggested by the procedural system. Notice that this is done on a trial-by-trial basis and that either system may be responsible for the overall response, depending on its confidence and trust; hence, this is a soft-switching model.
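The trust update (Equations 10 and 11) and the soft-switching rule can be sketched together. Function names and the example parameter values are assumptions made for illustration.

```python
def update_trust(theta_e, explicit_correct, delta_oc, delta_oe):
    """Update the explicit-system weight theta_E after one trial."""
    if explicit_correct:
        return theta_e + delta_oc * (1.0 - theta_e)  # Equation 10
    return theta_e - delta_oe * theta_e              # Equation 11

def select_system(theta_e, h_e, s_a, s_b):
    """Soft switch: pick the system with the larger trust-weighted confidence."""
    theta_p = 1.0 - theta_e
    confidence_e = abs(h_e)        # distance from the explicit decision bound
    confidence_p = abs(s_a - s_b)  # Equation 9
    if theta_e * confidence_e > theta_p * confidence_p:
        return "explicit"
    return "procedural"
```

With the initial weight θ*E*(1) = 0.99, the procedural system can only win a trial when its confidence exceeds that of the explicit system by roughly two orders of magnitude, which is why early responding is dominated by the explicit system.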

All exploratory analyses will be conducted with this model, so that Model 0 will serve as a baseline to compare the effects of modifications to this standard implementation. The simplifications made to the model serve only to optimize learning performance and computational efficiency. It is important to emphasize that the simplifications will only help the procedural system learn faster, so all results should be interpreted as a best-case learning scenario.

#### **MODEL 1: SINGLE SOURCE OF FEEDBACK, HARD SWITCH**

The first modified COVIS model follows the revisions suggested by the evidence reviewed above. Specifically, this model only receives a single source of feedback based on the response of the controlling system and assumes that the switch from the explicit to the procedural system is a one-time, hard switch.

The goal of the present research is not to specify exactly how this hard switch is implemented computationally, but instead to evaluate learning in the procedural system under explicit-system controlled responding. For this reason, Model 1 never switches from the explicit to the procedural system. The hyperdirect pathway model places the switching gate downstream from learning in the procedural system (i.e., downstream from the striatum), so the procedural system weights are still modified on every trial. The absence of any switching is a worst-case scenario for a hard-switch model in a procedurally learned task, but it is also the best way to evaluate the ability of the procedural system to learn while the explicit system controls responding (by design, the procedural system will necessarily learn after a switch).

With a single feedback source, and because the model never switches, RPE (Equations 5 and 7) and therefore dopamine (Equation 8) will always be driven by the responses of the explicit system. Concretely, the reward *Rn* is determined only by the explicit system's response to the stimulus (affecting the RPE and dopamine calculations). The procedural system will suggest its response normally, but the weights will be updated based on the feedback elicited by the explicit system response. All other features of this model are identical to the implementation of Model 0.

#### **MODEL 2: SINGLE SOURCE OF FEEDBACK, SOFT SWITCH**

The second modified COVIS model is a simple modification of Model 0 in which a single source of feedback is used in conjunction with soft switching. Models 1 and 2 together allow procedural-system learning with a single source of feedback to be evaluated both under hard- and soft-switching architectures.

As with Model 0, the switching algorithm described above was used to select between procedural and explicit system responses on every trial. The difference from Model 0 is that, while the explicit system controls the overall response, the procedural system (specifically, RPE and dopamine) is updated using the feedback signal generated by the explicit system's response. On the other hand, whenever the procedural system controls the overall response, the feedback signal is guaranteed to be congruent with the response suggested by the procedural system. All other features of this model are identical to the implementation of Model 0.
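The only difference among the three models, as far as the procedural system is concerned, is which response the reward signal is scored against. A minimal sketch, with an illustrative function name and argument layout:

```python
def procedural_reward(model, controlling, explicit_resp, procedural_resp, correct):
    """Return R_n (Equation 6) as delivered to the procedural system.

    model: 0 (independent feedback), 1 (single feedback, explicit system
    always in control), or 2 (single feedback, soft switching).
    controlling: "explicit" or "procedural"; only consulted for Model 2.
    """
    if model == 0:
        judged = procedural_resp      # feedback on the procedural suggestion
    elif model == 1:
        judged = explicit_resp        # the model never switches
    elif model == 2:
        judged = explicit_resp if controlling == "explicit" else procedural_resp
    else:
        raise ValueError("model must be 0, 1, or 2")
    return 1 if judged == correct else -1
```

On a trial where the explicit system responds A, the procedural system suggests B, and B is correct, Model 0 rewards the procedural system, whereas Model 1 (and Model 2 while the explicit system is in control) delivers error feedback despite the correct procedural suggestion.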

A summary of these models and their differences appears in **Table 1**.

## **METHOD—SIMULATION SET 1**

## **CATEGORIZATION SIMULATIONS**

#### *Information-integration categories*

Each model was evaluated through simulation to determine whether the procedural system could learn II categories (e.g., as in **Figure 1**, top panel). Human participants reliably learn categories like these, so any multiple systems model of category learning must pass this basic performance benchmark. These simulations served as a first-pass elimination round: any model performing poorly on this category structure should be considered unviable, while all successful models will move on to a second simulation experiment. A total of 300 sample stimuli were drawn from each of two bivariate normal distributions (with the category means and variances in **Table 2**). The sample category distributions used are shown in **Figure 3**. Maximum performance by the explicit system on these categories is 78.5%, and the explicit one-dimensional bound (the dashed line) was set to guarantee that the explicit system reached this level of performance.
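Drawing the category samples can be sketched with a hand-written Cholesky factor of the 2 × 2 covariance matrix. The means, variances, and covariance used in the usage note are hypothetical placeholders, not the Table 2 values.

```python
import math
import random

def sample_bivariate_normal(mean, var_x, var_y, cov_xy, n, rng):
    """Draw n points from a bivariate normal distribution.

    Uses the lower-triangular Cholesky factor of the covariance matrix
    [[var_x, cov_xy], [cov_xy, var_y]], which must be positive definite.
    """
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(var_y - l21 ** 2)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points
```

For instance, `sample_bivariate_normal((40.0, 60.0), 100.0, 100.0, 80.0, 300, random.Random(1))` returns 300 stimuli for one category under these placeholder parameters; a second call with a different mean would produce the contrasting category.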

Model 0 should have no trouble learning these categories as each subsystem receives independent feedback. Models 1 and 2 are particularly interesting because, currently, it is unknown if any COVIS model will successfully learn II categories without independent feedback.

**Table 1 | Summary of each model and brief description of differences.**

**Table 2 | Mean, variance, and covariance parameters for each category in Figure 3.**

#### *Hybrid categories (Ashby and Crossley, 2010)*

All models that successfully learned the benchmark II category structures were tested on the hybrid categories shown in **Figure 2** (Ashby and Crossley, 2010). These categories require a 1D rule on disks with steep orientations and an II strategy on disks with shallow orientations, so optimal responding requires trial-by-trial system switching. As mentioned earlier, Ashby and Crossley (2010) reported that only 4% of participants showed any evidence of trial-by-trial switching. An obvious prediction is that soft-switching models will perform well on these categories (i.e., better than human participants), but it is not clear how well hard-switching models will perform.

The hybrid categories shown in **Figure 2** were used in this simulation. A total of 300 stimuli were uniformly sampled from each of two categories separated by the hybrid bound. Maximum performance by the explicit system on these categories is 88.33%, and the one-dimensional bound of the explicit system (the dashed line) was set to this optimal position.

#### **ASSESSMENT OF MODEL PERFORMANCE—PARAMETER SPACE PARTITIONING**

The goal of our simulation analyses is not to ask how well each particular model can fit some data set, but rather to ask whether each model is or is not capable of learning. Before concluding that a model cannot learn, it is vital to examine its performance under a wide range of parameter settings. Similarly, when a model does learn, it is important to know whether the learning is representative of the model, or restricted to a small set of parameter settings. Because of these unique modeling goals, we chose to evaluate the performance of each model using a parameter space partitioning (PSP) analysis (Pitt et al., 2006, 2008). PSP is a technique used to investigate the global performance of cognitive models. The basic idea is to exhaustively explore the parameter space (defined by the free parameters of the model) and to map out regions that lead to qualitatively different behaviors (called data patterns).

The end result is a disjoint partitioning of the parameter space into regions, each of which produces a qualitatively unique behavior (data pattern). The finite set of data patterns must be defined by the experimenter. In order to map out the space, discrete "steps" in the parameter space are each assigned a data pattern. A step is defined as a particular set of numerical values for every parameter that defines the space. Because the computational demands of searching the parameter space increase dramatically with the number of parameters, PSP uses an efficient Markov chain Monte Carlo search algorithm (Pitt et al., 2006). Recall that the implementation of COVIS used here was optimized for computational efficiency so that the parameter space search would be computationally tractable. Once the entire parameter space has been mapped, volumes (i.e., contiguous regions) of the space are computed to quantify the range of parameters that produce a particular data pattern. These volumes describe how likely a behavior is to be produced by the model.

**Figure 4** offers a schematic representation of this analysis and related concepts. In this hypothetical example, a simple model is defined by two parameters (θ1, θ2). By simultaneously varying these parameters, the model can account for three qualitatively different behaviors or data patterns. These could be almost anything. For example, pattern 1 might be that performance is significantly better in an experimental condition than in a control condition (e.g., at least 5% better), pattern 2 could be the reverse ordering, and pattern 3 might be that performance in the two conditions is not significantly different. The PSP analysis measures the area (or volume when there are 3 or more parameters) of the parameter space that predicts each of the three patterns. In this case, the analysis reveals that for most pairs of parameters, data pattern 1 is produced, but for more restricted sets of parameters, patterns 2 or 3 occur.
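The idea of measuring the share of parameter space that yields each data pattern can be illustrated with a brute-force grid sweep. Note that this is a deliberately simple stand-in: PSP itself uses a Markov chain Monte Carlo search (Pitt et al., 2006), and the function name, grid resolution, and toy model below are assumptions.

```python
from collections import Counter
from itertools import product

def partition_volumes(model_pattern, ranges, steps=20):
    """Estimate the fraction of a box-shaped parameter space per data pattern.

    model_pattern: function mapping parameter values to a pattern label.
    ranges: one (low, high) pair per parameter.
    The space is divided into steps**len(ranges) equal cells, and the
    model is evaluated once at the midpoint of each cell.
    """
    axes = []
    for low, high in ranges:
        width = (high - low) / steps
        axes.append([low + (k + 0.5) * width for k in range(steps)])
    counts = Counter(model_pattern(*point) for point in product(*axes))
    total = steps ** len(ranges)
    return {pattern: count / total for pattern, count in counts.items()}
```

Applied to a toy two-parameter model that produces pattern 1 whenever the first parameter exceeds 0.5 and pattern 2 otherwise, a sweep over the unit square assigns half of the area to each pattern. A grid sweep scales exponentially with the number of parameters, which is the cost the MCMC search in PSP is designed to avoid.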

#### **IMPLEMENTATION OF PSP**

For the following analyses, a MATLAB implementation of the PSP algorithm was obtained from the website of J. I. Myung (http://faculty.psy.ohio-state.edu/myung/personal/psp.html). For the PSP search algorithm to proceed, the model must produce deterministic output (i.e., produce the same behavior) for a given set of parameters. To accomplish this, all randomized features of the model must be fixed. All models used here omit the noise (e.g., perceptual and criterial) terms typically included. Thus, the only probabilistic features are the random initial weights for each striatal unit and the randomized stimulus ordering.

In normal applications of the model, the stimulus ordering is completely randomized in every simulation. Because of this, the performance of the model averaged across many (i.e., 100 or more) simulations will be robust to stimulus ordering. This is important because Pitt et al. (2006) observed that other models of category learning (e.g., ALCOVE; Kruschke, 1992) are sensitive to stimulus ordering. In addition, the effect of stimulus ordering interacts with the particular initial randomization of weights in the procedural system. To handle these random ordering and initialization effects, it is necessary to choose a fixed random sample on which to run all PSP analyses. For these reasons, a fixed set of 19 random stimulus orderings and weight initializations was generated. Each simulation (each step in the parameter space) was tested on this set, and the model performance was averaged across them. This allowed for deterministic output and relatively stable estimates of the model behavior.
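One way to obtain deterministic output is to generate the orderings and initial weights once from a seeded generator and reuse them at every step. In the sketch below, the seed, the uniform weight initialization, and the `run_model` callable are hypothetical.

```python
import random

N_INITIALIZATIONS = 19  # number of fixed orderings and initializations

def make_fixed_initializations(n_stimuli, n_weights, master_seed=0):
    """Build a reproducible set of stimulus orderings and initial weights."""
    rng = random.Random(master_seed)
    initializations = []
    for _ in range(N_INITIALIZATIONS):
        order = list(range(n_stimuli))
        rng.shuffle(order)
        # Assumption: initial weights drawn uniformly from [0, 1).
        weights = [rng.random() for _ in range(n_weights)]
        initializations.append((order, weights))
    return initializations

def mean_accuracy(run_model, params, initializations):
    """Average a model's accuracy over the fixed set for one parameter step.

    run_model(params, order, weights) is a hypothetical callable that
    returns an accuracy in [0, 1].
    """
    scores = [run_model(params, order, weights) for order, weights in initializations]
    return sum(scores) / len(scores)
```

Because the same set is reused at every step, two evaluations of the same parameter values always return the same averaged accuracy, which is the property the search algorithm needs.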

It is necessary to select the parameters defining the parameter space. Including every parameter of the model would be inefficient because the parameter space would be very high dimensional and many parameters interact in predictable ways. For example, in COVIS, if the AMPA and NMDA thresholds (Equation 4) are set too high, then the procedural system will be unable to learn regardless of the values of any other parameters.

Our goal is only to evaluate the ability of learning to proceed in the procedural system, so the parameter space was constrained to parameters that directly affect procedural learning: the learning rates, α and β (Equation 4), and α*pr* (Equation 7), which determines how fast predicted reward converges to the expected reward value, and therefore affects the trial-by-trial dopamine fluctuations and weight adjustment.

Additionally, for Model 2 only, the system switching rate parameters Δ*OC* and Δ*OE* (Equations 10, 11) were also included in the PSP analysis. These parameters directly affect system-switching behavior and learning in the procedural system (note that this is not true for Model 0 because the procedural system always receives independent feedback). **Table 3** summarizes the function and search range for each manipulated parameter. Every other parameter was fixed to the specific value shown in **Table 4**. The learning rate parameter γ (Equation 4) was irrelevant because the AMPA and NMDA thresholds were set low enough that the model would never be above AMPA but below NMDA (e.g., line 3 of Equation 4).

**Table 3 | Function and search range for each parameter in the PSP analysis.**

**Table 4 | Values of all parameters that remained fixed across all simulations.**

The next choice in a PSP analysis is to define the qualitative behaviors (data patterns) that determine the partitioning of the parameter space. We chose to partition based on the accuracy of the procedural system separated into deciles from 0 to 100%; hence, a total of 10 data patterns were possible. For example, data pattern 1 occurs if the procedural system accuracy falls between 0 and 10%, pattern 2 occurs if accuracy falls between 10 and 20%, and so forth. This simple definition of data patterns easily allows for a rough quantification of how well the procedural system performs on the category structure after the weights have been trained.
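The decile definition of data patterns can be sketched as a small function. This is an illustration under one stated assumption: decile boundaries are treated as lower-inclusive (so 90% and above, including exactly 100%, is pattern 10), which matches the "at or above 90%" wording used in the results. The function names are hypothetical.

```python
import math


def data_pattern(accuracy_percent):
    """Map procedural-system accuracy (0-100%) to a decile pattern 1-10.

    Assumption: decile boundaries are lower-inclusive, so 90% and above
    (including exactly 100%) falls in pattern 10.
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy must be between 0 and 100 percent")
    return min(10, math.floor(accuracy_percent / 10) + 1)


def pattern_for_step(accuracies_percent):
    """Average test accuracy across initializations, then classify it."""
    mean_acc = sum(accuracies_percent) / len(accuracies_percent)
    return data_pattern(mean_acc)


print(data_pattern(5), data_pattern(72.5), data_pattern(95))  # 1 8 10
```

Classifying the averaged accuracy, rather than each run separately, mirrors the description of averaging across initializations before assigning a pattern to a step in parameter space.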

The PSP analysis proceeded in two stages: the initial partitioning stage, and an evaluation of robustness stage. For the initial stage, each step in the parameter space was tested on all 19 initializations. Each initialization included a training phase, where the model was trained with feedback on the sample stimuli, and a testing phase where end-of-simulation (learned) procedural system weights were run forward through the stimuli again simply to estimate the asymptotic procedural system accuracy. The test performance of the model was averaged over all 19 initializations to determine the final data pattern for each step in the parameter space. The PSP algorithm proceeded for six search cycles to obtain a reliable partitioning of the parameter space.

The complete PSP search returned the volume of parameter space that was associated with each of the 10 data patterns, and a specific set of parameter values that generated each discovered pattern. The robustness stage further tested each model in 200 random stimulus orderings and weight initializations using the parameters returned for each discovered data pattern. This two-stage approach was used because the initial PSP analysis is computationally demanding, so running hundreds of simulations per step in parameter space would be prohibitively time consuming. The second stage is important, though, because it essentially establishes the reliability of the parameters to produce a particular behavior (pattern).
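The second-stage check can be sketched as a comparison of mean accuracy on the fixed sample against mean accuracy on fresh random runs. The function names, the tolerance of 0.05, and the accuracy values below are all hypothetical and chosen only for illustration; the text does not state a numerical robustness criterion.

```python
from statistics import mean


def robustness_gap(fixed_accuracies, random_accuracies):
    """Mean accuracy on the fixed sample minus mean accuracy on new runs.

    A gap near zero means the pattern found by the PSP search carries over
    to fresh stimulus orderings and weight initializations.
    """
    return mean(fixed_accuracies) - mean(random_accuracies)


def is_robust(fixed_accuracies, random_accuracies, tolerance=0.05):
    """Hypothetical criterion: the two means differ by at most `tolerance`."""
    return abs(robustness_gap(fixed_accuracies, random_accuracies)) <= tolerance


# Made-up proportions correct, for illustration only (the text uses 19 fixed
# and 200 random runs; short lists are used here for brevity).
stable_fixed, stable_random = [0.95, 0.96], [0.95, 0.97]
fragile_fixed, fragile_random = [0.62, 0.66, 0.64], [0.50, 0.52, 0.51]

print(is_robust(stable_fixed, stable_random))
print(is_robust(fragile_fixed, fragile_random))
```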

#### **RESULTS—SIMULATION SET 1**

#### **MODEL 0: INDEPENDENT FEEDBACK, SOFT SWITCH**

Recall that Model 0 is simply the COVIS model as previously described. It is an independent feedback model (i.e., both systems learn independently and receive a separate feedback signal), and uses soft switching (either system can potentially control the overall system response on every trial). This version of Model 0 will be referred to as Model 0 (2FB-SS) because it receives two independent feedback (2FB) signals and uses soft switching (SS).

#### *Information integration categories*

**Figure 5** shows the percentage of the volume of parameter space that produced each data pattern that was discovered. Note that Model 0 (2FB-SS) showed robust learning of the II categories in the majority of the parameter space (**Figure 5**, first column). A total of three data patterns were found. Over 99.99% of the volume of the parameter space learned the categories at or above 90% accuracy (Pattern 10). The remaining two data patterns accounted for less than 0.01% of the parameter space volume and corresponded to accuracy in the 70–79% and the 80–89% deciles (Patterns 8 and 9).

Recall that in the Stage 2 analysis, representative parameter values were chosen that produced each of the three observed data patterns and then the model was tested on 200 random stimulus orderings and weight initializations for each of these three parameter settings. In all cases, the performance of the model on these new tests was virtually identical to the performance on the 19 stimulus orderings and weight initializations used in the PSP analysis. Thus, learning in Model 0 (2FB-SS) is highly robust.

Finally, **Figure 6** shows the learned procedural system weights for each striatal unit at the end of training (again, averaged across 200 simulations). It is clear from the figure that a progression of weight strength follows the data patterns: parameters producing the worst model performance led to noisier, smaller weights than parameters producing the best model performance. Still, the success of Model 0 (2FB-SS) in the II categories suggests that it will learn equally well in the hybrid category structures.

#### *Hybrid categories*

As with the II categories, Model 0 (2FB-SS) showed very good learning throughout the parameter space with the hybrid categories (**Figure 5**, second column). A total of five data patterns were found. Over 99.94% of the parameter space learned the hybrid categories at or above 90% accuracy. Less than 0.06% of the parameter space was accounted for by data patterns corresponding to the four accuracy deciles between 50 and 90%.

As with the II categories, Model 0 (2FB-SS) showed very robust learning in each region of the parameter space. In all cases, performance on the 200 random initializations was virtually identical to performance on the 19 training initializations. The behavior of the model appears to be very similar for both hybrid and II categories. An examination of the weights showed that they closely mimicked the structure of the hybrid categories (e.g., as the **Figure 6** weights mimic the structure of the II categories).

#### **MODEL 1: SINGLE SOURCE OF FEEDBACK, HARD SWITCH**

Recall that Model 1 receives only one feedback signal, and uses hard switching (the explicit system controls responding until it yields control to the procedural system, at which point the procedural system controls responding), so we refer to this model as Model 1 (1FB-HS). Also recall that in these simulations the procedural system is never allowed to respond—in other words, the hard switch never occurs. This way, the ability of the procedural system to learn during explicit system responding can be fully evaluated.

#### *Information integration categories*

The partitioning for Model 1 (1FB-HS) required a slight modification to acquire a higher resolution partitioning of the space. In the first partitioning, approximately 62% of the parameter space volume was between 40 and 59% test performance accuracy, and about 38% of the volume was associated with test performance accuracy between 60 and 69%. To create a finer-grained partitioning, for Model 1 (1FB-HS) only the 60–69% decile was subdivided into 60–64 and 65–69% ranges.

**Figure 5** shows that the model performed poorly on the II categories. Approximately 60% of the parameter space performed at between 40 and 59% in the test phase, suggesting that, for more than half of the volume, the model produced no measurable learning. About 39% of the parameter space volume showed between 60 and 64% test accuracy and slightly more than 1% of the volume was above 65% test accuracy. The robustness analysis, described in **Figure 7**, shows that **Figure 5** actually overestimates the performance of Model 1 (1FB-HS). While the test performance of the model clearly produces the expected output when averaged across the 19 fixed initializations (dark gray bars), every set of parameters (i.e., for each discovered data pattern) performs no better than 52% in the test phase when averaged across 200 random initializations (light gray bars). Because of the model's II categorization learning failure, the hybrid categories were not explored.

[Figure caption fragment] … bottom row weights (Pattern 3) and the stimuli plotted in **Figure 3**, and also the weaker correspondence in the lower-performing weights.

#### **MODEL 2: SINGLE SOURCE OF FEEDBACK, SOFT-SWITCH**

Model 2 (1FB-SS) receives only one source of feedback [as in Model 1 (1FB-HS)], and uses soft switching (either system can potentially control the overall system response on every trial) [as in Model 0 (2FB-SS)].

#### *Information integration categories*

The partitioning for Model 2 (1FB-SS) was comparatively diverse, suggesting that this model is capable of a wider range of behaviors. A total of 6 data patterns were discovered for the II categories (**Figure 5**, fourth column). The percentage of volume associated with each data region is also shown in **Table 5**. Although the model is capable of responding with accuracy above 90%, this behavior occurs only for a very small proportion of the parameter space. In fact, accuracy in the test phase was below 70% for approximately 80% of the parameter space. Overall, each data pattern appeared relatively robust across the 200 simulations. Only data patterns 3 and 4 performed worse (10 and 6%, respectively) on the 200 simulations than on the 19 fixed initializations. Hence, the performance of this model seems to be reliable.

Although only a small proportion (approximately 1%) of the overall parameter space was associated with test performance at or above 80% after training, the results suggest that for a restricted range of parameters, this model can successfully learn II categories with only a single source of feedback.

**FIGURE 7 | Robustness of Model 1 (1FB-HS) in II categories for each of four discovered data patterns.** Dark gray bars represent the performance of the model (proportion correct in the test phase) for the 19 fixed initializations used in the PSP analysis; light gray bars represent the performance of the model across 200 random initializations. In every case, the stage 2 analysis using 200 simulations failed to reproduce the patterns observed in the PSP across the 19 fixed initializations, suggesting that the observed PSP performance was not robust. Note that the 60–69% performance decile was separated into two patterns (7a and 7b) for a more fine-grained evaluation.


**Table 5 | Percentage of parameter space volume for each observed data pattern (or test phase accuracy ranges) for Model 2 (1FB-SS).**

#### *Hybrid categories*

The performance of Model 2 (1FB-SS) on the hybrid categories was similar to its performance on the II categories. The same data patterns were observed, and the parameter space volumes associated with each of these patterns were similar (compare the 4th and 5th columns of **Figure 5**). When tested across the 200 random initializations, the model's performance was robust.

#### *Soft switching drives learning in the procedural system*

The ability of Model 2 (1FB-SS) to learn at all is in stark contrast to Model 1. The only difference between those models is the switching mechanism—Model 1 uses hard switching, whereas Model 2 uses soft switching. These results suggest that the switching mechanism is allowing the procedural system to learn in Model 2 (1FB-SS). For example, when the model switches to the procedural system, the procedural system receives veridical feedback, which should facilitate procedural learning. Note that this hypothesis predicts that model accuracy should increase with the proportion of trials controlled by the procedural system. To test this prediction, we computed the correlations between accuracy and the number of procedural system responses for Model 2 (1FB-SS) separately on the II and hybrid categories, both for the 19 training initializations and the 200 random initializations. All four of these correlations were *r* ≥ 0.97 (all *p* < 0.001).
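The correlation test described above can be sketched with a plain Pearson coefficient. The numbers below are made up for illustration only and are not data from the simulations; the variable names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only (not data from the simulations):
# number of procedural-system responses and test accuracy per data pattern.
procedural_responses = [50, 120, 260, 400, 520, 540]
test_accuracy = [0.45, 0.55, 0.65, 0.75, 0.85, 0.92]

print(round(pearson_r(procedural_responses, test_accuracy), 3))
```

A coefficient close to 1 on such data would indicate that accuracy rises almost linearly with the number of trials the procedural system controls, which is the relationship the hypothesis predicts.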

Given that the procedural system only learns when it is parameterized so that it generates the majority (nearly 90%) of responses, a follow-up test was conducted to evaluate whether the procedural system would take over control of responding with simple RB categories (**Figure 1**, bottom panel). The RB categories were created by rotating the II categories (**Figure 3**) counter-clockwise so that the optimal bound (solid black line) is vertical.

The model was run for 200 simulations on these RB categories using the parameters that produced the best II learning (i.e., pattern 6). In these simulations, the RB performance was hard-coded so that the explicit system performed at either 90, 95, 99, or 100% accuracy. These simulations revealed that the average proportion of trials where the procedural system controlled responding was 0.84, 0.76, 0.41, and 0.00, respectively. This is fundamentally problematic because the explicit system should overwhelmingly control responding in RB tasks, especially when it is performing at such high accuracy levels.

#### **DISCUSSION—SIMULATION SET 1**

The results of the PSP analyses reveal the strengths and limitations imposed by the feedback signal. First, when each system receives independent feedback (Model 0, 2FB-SS), the procedural system of COVIS readily learns to respond accurately in both II and hybrid categories. This is problematic because the model performs substantially better than humans in the hybrid task. Soft switching allows the model to pass control to the procedural system, which easily learns the hybrid categories. It is important to recall that Model 0 (2FB-SS) was designed deliberately to maximize learning. For this reason, our results represent a best-case scenario for this model.

In contrast, procedural learning was seriously compromised when both systems received the same feedback signal. Although the PSP for Model 1 hinted at a small amount of learning in the procedural system, that learning depended critically on the exact ordering of the stimuli during training, because during test the apparent learning disappeared. Note that Model 1 failed to learn even though the explicit system received correct feedback more than 75% of the time. This was because the feedback was independent of any activation within the procedural system, and thus it had the same effect as if the feedback were random.

Of course, the procedural system of Model 1 would learn *after* the hard switch occurs. But our results show clearly that the procedural system learns nothing until it controls responding in the task. Thus, in the II task simulated here, Model 1 predicts that accuracy should drop from around 75% correct to chance on the trial of the hard switch. As noted earlier, we know of no II studies that have reported such a mid-session drop in accuracy.

Finally, the results from Model 2 (1FB-SS) show that with soft switching, the procedural system of COVIS can learn provided that it controls responding for the majority of trials in the experiment. In other words, when the procedural system is allowed to generate the overall response for a large proportion of the training trials, it successfully learns to respond to the categories. That this occurs is unsurprising and essentially suggests that, within a narrow range of parameterizations, the model learns with one source of feedback in a serial fashion.

The empirical evidence (e.g., Ashby and Crossley, 2010) suggests that soft switching is not the dominant method via which humans resolve competition between declarative and procedural memory systems. However, other evidence (Erickson, 2008) suggests that a minority of humans seem to be able to switch trial-by-trial if given enough cues, so it may be the case that under some conditions, soft switching could occur naturally during the course of learning. When the task demands are clear, people may successfully adopt different strategies and flexibly shift among them. This is the idea behind knowledge partitioning, where participants learn to apply different strategies to different stimuli within one task (Lewandowsky and Kirsner, 2000; Yang and Lewandowsky, 2004).

Regardless of the plausibility of soft switching, the greatest problem with Model 2 (1FB-SS) is that it only predicts learning in the procedural system when the procedural system dominates responding in the task, regardless of whether that task is RB or II. This is problematic because the model predicts that even one-dimensional RB tasks will frequently be learned procedurally, a prediction that is strongly contradicted by the literature (e.g., Waldron and Ashby, 2001).

The results of these simulations largely suggest that with a single feedback source, simply modifying the system switching mechanism is not sufficient to account for human learning data. Regardless of the switching mechanism, the results show that learning in the procedural system is possible only if the procedural system controls responding throughout the majority of training.

## **ADDITIONAL COVIS MODIFICATIONS—THE BOOTSTRAPPING HYPOTHESIS**

All of the models so far investigated predict no procedural learning while the explicit system controls responding. If we take for granted that the procedural system learns even while it does not control responding, then it appears that another modification to the architecture of COVIS may be necessary. One possibility is that the explicit system somehow trains or bootstraps the procedural system while it controls responding.

Although speculative, this could occur if the procedural system is somehow informed of the response the explicit system generates. As discussed above, the problem with the single feedback models so far investigated is that when the explicit system controls responding, the feedback it elicits is independent of activity within the procedural system. In other words, any change in cortical-striatal weights is as likely to be rewarded as any other change. If the procedural system is somehow fed information about the explicit system response, however, then this independence could disappear, which might allow the procedural system to learn from the explicit system.

#### **IMPLEMENTING BOOTSTRAPPING IN COVIS**

In order to implement bootstrapping in COVIS, a simple modification was made. Recall from earlier that, on every trial, the explicit system generates a discriminant value, *hE*(*n*), that quantifies the overall output of the explicit system's response. Also recall that, on every trial, each striatal unit produces an overall activation value that is driven by the stimulus. For example, activation in striatal unit J on trial *n* is denoted by *SJ*(*n*) (i.e., see Equation 3). Using these values, the following modification was made to reflect the hypothesis that the procedural system is privy to the response generated by the explicit system whenever the explicit system controls responding. If the explicit system controls the overall response and emits a response corresponding to category *J* on trial *n*, set

$$S_J(n) = S_J(n) + \left| h_E(n) \right| \tag{12}$$

and make no changes to *SK*(*n*), for all *K* ≠ *J*.

This modification makes no changes to the parameters of the model; it only makes a new assumption about the flow of information when the explicit system is generating the overall system response. Specifically, it assumes that information about the explicit system's response is fed back into the procedural system at the level of the striatum and only to the striatal unit matching the response of the explicit system. Ramping up the activity of the striatal unit corresponding to the response generated by the explicit system translates into larger changes in the weights associated with that striatal unit (see the reinforcement learning equation, Equation 4). The timing of this ramping-up of activity only needs to occur before feedback is given to the model, and because the explicit system discriminant value *hE*(*n*) is related to the explicit system decision, one can further assume that this information transfer occurs in tandem with the explicit system decision.

One attractive property of this modification is that it does not necessarily override the output of the procedural system. For example, suppose that the procedural system has been partially trained in an II categorization task, so its weights have been modified to respond accurately. The explicit system may make an incorrect "A" response to a stimulus, but because the procedural system has been trained, it is suggesting a correct "B" response with high confidence. The explicit system discriminant value *hE*(*n*) would be added to the striatal unit response *SA*(*n*), but *SB*(*n*) could still be larger and thus, the procedural system's output would not be washed out by the explicit system.

Early in training when the procedural system has not yet learned anything, both *SA*(*n*) and *SB*(*n*) will be approximately equal, so the explicit system's output will nudge learning in the procedural system in the direction of the explicit system's response when the weights are updated. The end result should be procedural system weights that are updated to reflect the response strategy of the explicit system as long as the explicit system controls responding.
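Equation 12 and the two cases just described can be sketched as follows. This is a minimal illustration, not the model's implementation: the function names, the dictionary representation of striatal activations, and all numerical values are hypothetical and chosen only to show the two situations.

```python
def bootstrap_activation(striatal, h_e, explicit_response, explicit_in_control=True):
    """Apply Equation 12 to a dict of striatal activations S_K(n).

    Adds |h_E(n)| to the unit matching the explicit system's response and
    leaves every other unit unchanged. Returns a new dict.
    """
    boosted = dict(striatal)
    if explicit_in_control:
        boosted[explicit_response] += abs(h_e)
    return boosted


def preferred_response(striatal):
    """Category whose striatal unit is most active."""
    return max(striatal, key=striatal.get)


# Hypothetical activation values, chosen only to illustrate the two cases.
# Partially trained: the procedural system strongly favors "B".
trained = bootstrap_activation({"A": 0.2, "B": 0.9}, h_e=-0.4, explicit_response="A")
# Early in training: both units are about equally active.
naive = bootstrap_activation({"A": 0.5, "B": 0.5}, h_e=-0.4, explicit_response="A")

print(preferred_response(trained), preferred_response(naive))  # B A
```

In the trained case the boosted "A" unit still falls short of "B", so the procedural preference survives; in the naive case the boost tips the balance toward the explicit system's response.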

## **SIMULATION SET 2—BOOTSTRAPPED COVIS ARCHITECTURES**

This section explores two COVIS architectures (Models 1 and 2) with bootstrapping implemented. The general methodology and presentation of results will follow those already described and presented. Bootstrapping Model 0 (2FB-SS) is unnecessary as it receives independent feedback in each system.

#### **MODEL 1: SINGLE SOURCE OF FEEDBACK, HARD SWITCH, BOOTSTRAPPED**

This model is exactly as described above with the additional bootstrapping modification. It will be referred to as Model 1 (1FB-HS-B) because it receives only a single source of feedback (1FB), assumes hard switching, and is bootstrapped (B) by the explicit system.

#### **MODEL 2: SINGLE SOURCE OF FEEDBACK, SOFT SWITCH, BOOTSTRAPPED**

Model 2 (1FB-SS-B) is exactly as described above with the additional bootstrapping modification.

## **METHOD—SIMULATION SET 2**

The methods of assessment for both models were identical to those described for simulation set 1.

## **RESULTS—SIMULATION SET 2**

#### **MODEL 1 (1FB-HS-B)**

#### *Information integration categories*

The PSP on Model 1 (1FB-HS-B) was definitive: only one data pattern was discovered, corresponding to test phase accuracies greater than 90%. This suggests that across all parameter values (at least for the parameters defining the parameter space) the model learns II categories handily. The robustness analysis showed that the performance of the model across the 200 random simulations was essentially identical to that across the 19 training simulations. Thus, the model is highly robust. Recall, however, that the suboptimal explicit strategy only performs at about 78% correct in II categories. How, then, can the procedural system outperform the suboptimal strategy it was trained on?

A careful evaluation of the procedural system weights provides insight. The average trained procedural system weights across 200 simulations appear different than in previous models; the weights now show a residual trace of the vertical bound used by the explicit system (**Figure 8**, top). The effects of the explicit system training are most easily seen in the bottom of **Figure 8**, which shows a ratio of the A and B weights (i.e., A/B and B/A). Here, the solid red vertical line approximately corresponds to the boundary used by the explicit system. Note the small regions to the right and left of the bound in the A/B and B/A ratios, respectively. These are the regions where the procedural system weights are driven to zero due to the incorrect responses being made by the explicit system (top row, yellow arrows).
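The ratio maps can be sketched as an element-wise division of the two weight grids. The small constant `eps`, the function name, and the weight values below are assumptions made for illustration; they are not taken from the simulations. The point is only that a ratio becomes very large wherever the denominator weights have been driven close to zero.

```python
def weight_ratio(numerator, denominator, eps=1e-6):
    """Element-wise ratio of two weight grids (lists of rows).

    `eps` is an assumed small constant that guards against division by zero.
    """
    return [
        [n / (d + eps) for n, d in zip(n_row, d_row)]
        for n_row, d_row in zip(numerator, denominator)
    ]


# Hypothetical 1 x 4 slices of the A and B weight maps across the vertical bound.
w_a = [[0.80, 0.70, 0.001, 0.10]]
w_b = [[0.10, 0.001, 0.70, 0.80]]

a_over_b = weight_ratio(w_a, w_b)
b_over_a = weight_ratio(w_b, w_a)

print([round(v, 1) for v in a_over_b[0]])
print([round(v, 1) for v in b_over_a[0]])
```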

In other words, the procedural system not only learns to make the responses that the explicit system gets correct, but it also learns to avoid responses made by the explicit system that were incorrect. This analysis resolves the apparent paradox of how the procedural system was able to outperform the system that trained it.

#### *Hybrid categories*

The partitioning for Model 1 (1FB-HS-B) on the hybrid categories was similar to the II partitioning. The PSP algorithm discovered two data patterns corresponding to 80–89% test phase accuracy and greater than 90% test phase accuracy. The volume estimate was nearly 100% for the pattern corresponding to >90% test accuracy. The model's performance was also very robust across 200 simulations.

**FIGURE 8 |** … (1FB-HS-B) trained on II categories, averaged across 200 simulations. Blue (cool colors) represent small weights whereas red (warm colors) represent large weights. Note the qualitative difference between these weights and those in **Figure 6**. **Bottom:** Ratio of procedural system weights. Solid vertical line approximately corresponds to the explicit system rule-based bound. Note that large values in the ratio correspond to regions where the weights in the denominator are driven toward zero (darkest blue regions in the top row indicated by yellow arrows).

As with the II categories, the weights learned by the procedural system show a residual trace of the explicit system training. This is most easily seen in **Figure 9**, which shows a ratio of the A and B weights. Note that, as with the II categories, the procedural system weights for the striatal unit where incorrect responses are made appear to be driven toward zero in the region where the explicit system responds incorrectly. Specifically, notice in the bottom right panel that there are some large positive values to the left of the explicit system's approximated bound. These large positive values in the ratio are due to the A-weights being very close to zero in those regions. Again, the procedural system learns what response not to make in this region.

#### **MODEL 2 (1FB-SS-B)**

#### *Information integration categories*

The PSP analysis found only two data patterns corresponding to 80–89% and >90% test phase accuracy. Again, the volume of parameter space was nearly 100% for the >90% data pattern, and again, the model's performance was very stable across 200 simulations using random stimulus orderings and weight initializations. The pattern of weights learned by Model 2 (1FB-SS-B) was functionally identical to the weights learned by Model 1 (1FB-HS-B).

The only noteworthy difference between the models is that in Model 2 (1FB-SS-B), the procedural system is sometimes allowed to generate the overall system response. Recall that, in Model 2 (1FB-SS), the procedural system could learn, but only when the procedural system dominated responding (nearly 90% of all responses). This, however, was not a learning requirement in Model 2 (1FB-SS-B). Across 200 simulations, for pattern 9 (80–89% accuracy in the test phase), the procedural system only generated 7% of the overall responses on average; for pattern 10 (>90% test phase accuracy), the procedural system generated about 40% of the overall responses.

Thus, procedural system learning in Model 2 (1FB-SS-B) is not critically dependent on the procedural system taking over the task.

#### *Hybrid categories*

Again, the PSP analysis only found two data patterns for Model 2 (1FB-SS-B) in the hybrid categories and nearly 100% of the volume of the parameter space was assigned to the pattern corresponding to >90% test phase accuracy. The performance of the model was also very robust across 200 simulations. The learned weights were similar to those learned by Model 1 (1FB-HS-B).

Finally, as with the II categories, the success of Model 2 (1FB-SS-B) no longer hinged critically on the procedural system dominating the training phase. Across 200 simulations, the procedural system of Model 2 (1FB-SS-B) never accounted for more than 7% of the overall responses, on average (i.e., no parameterization caused the procedural system to dominate responding as before).

## **DISCUSSION—SIMULATION SET 2**

With the addition of bootstrapping, there was a striking difference in the model's performance using only a single source of feedback. Basically, the bootstrapping modification allowed the feedback elicited by the responses of the explicit system to become useful to the procedural system. Model 1 failed to learn without bootstrapping whereas, with bootstrapping, it learned the categories well, even without ever generating a response. Model 2's performance previously depended on soft switching, which allowed the procedural system to dominate responding, but with the bootstrapping mechanism implemented, the model no longer required control.

Notably, the bootstrapping models learn both to respond and to not respond in certain regions of stimulus space, effectively allowing the procedural system to learn the categories better than the explicit system even though the procedural system only gets feedback based on the suboptimal explicit system strategy. This is a unique and strong prediction, which should be interpreted with some caution: recall that the models were biased toward learning very well, so it may be that, with a full explicit system and noise, the procedural system might not show such a dramatic performance improvement. This would be more in line with observed learning in human studies (i.e., learning curves generally have no large discontinuities from sudden drops or jumps in accuracy). However, even with noise, the weights in the procedural system should change in the same general fashion, so it seems that, with bootstrapping, the procedural system is capable of learning a little better than the explicit system and, once it is allowed to take over, of refining its strategy.

Finally, it is interesting that in the soft switching Model 2 (1FB-SS-B), the procedural system was able to make as many as 40% of the overall responses with II categories, but not more than 7% with hybrid categories. This result is in line with the observation in Ashby and Crossley (2010) that humans tend to persist with suboptimal RB strategies with hybrid categories. Overall, learning in both models presented here is nearly identical. The current simulations, therefore, do not support a hard switching mechanism over a soft switching mechanism. With the assumption of bootstrapping, these simulations only confirm that the procedural system can learn with one source of feedback, regardless of the switching mechanism.

## **GENERAL DISCUSSION**

The simulations show clearly that the procedural system can learn with one source of feedback as long as the response generated by the explicit system is communicated back to the procedural system (i.e., via bootstrapping). Specifically, the model assumes that this information is passed back to the procedural system at the level of the striatum. COVIS is a model constrained by neurobiology, so although the simulations reported above verify that bootstrapping is a plausible computational mechanism that allows the procedural system to learn during explicit system control, an ideal model would identify a neurobiologically plausible explanation for how bootstrapping could work in a human brain.

There are a number of possible specific pathways via which the procedural system could receive an efferent copy of the explicit system motor response. In general, the challenge for all these accounts is that this efferent motor signal must project to the same striatal targets in the procedural system that receive the relevant visual input. Unfortunately, current neuroanatomy is not precise enough to draw any strong conclusions. Thus, the possibilities considered in this section must all be considered speculative. Hopefully, future research will clarify this issue.

The organizing scheme of the basal ganglia is characterized by parallel cortical-striatal-cortical projection loops (Alexander et al., 1986; Parent and Hazrati, 1995; DeLong and Wichmann, 2007). For this reason, one obvious hypothesis is that an efferent copy of the explicit system response is passed back to the striatum via direct cortical-striatal projections from premotor or motor areas of cortex to the striatal regions responsible for procedural learning. Within premotor areas, one intriguing possibility is that the signal originates in ventral premotor cortex (PMv), which is a likely candidate for the first motor-target of the explicit system.

Tracing and direct stimulation studies have found that premotor and motor regions project to distinct regions of the striatum (Takada et al., 1998; Nambu et al., 2002). Specifically, neurons from primary motor cortex (M1) send projections to medial aspects of the putamen, and neurons in premotor cortex project to dorsolateral regions of the putamen. Further evidence suggests that connectivity patterns are both segregated and overlapping (Draganski et al., 2008), and that they follow the same somatotopic organization as in cortex (Jones et al., 1977; Flaherty and Graybiel, 1993; Takada et al., 1998). Together, these findings suggest that the projections may indeed terminate in the general regions of the striatum responsible for executing the motor responses of the procedural system.

Recent studies have found different kinds of projections from M1 cortical layer V into the striatum. For example, Parent and Parent (2006) found not only direct projections from M1 into the striatum, but also indirect projections via long-range pyramidal tract neurons that form en passant synapses across wide regions. It is believed that these different projections actually stem from two different classes of layer V pyramidal neurons (Molnár and Cheung, 2006): those that send projections within the cortex and basal ganglia (IT-type) and those that send projections to deeper structures, brain stem, and spinal cord (PT-type; Reiner et al., 2003, 2010).

Another well-known organizing principle within the basal ganglia is the direct- and indirect-pathways (Albin et al., 1989; Gerfen, 1992; Pollack, 2001). Both pathways receive excitatory cortical input, but the direct pathway has the effect of increasing striatal output whereas the indirect pathway has the opposite effect. Some evidence now suggests that direct pathway striatal neurons receive input predominantly from IT-type neurons, whereas the indirect pathway neurons receive input largely from PT-type neurons (Reiner et al., 2003, 2010; Lei et al., 2004), and that short projecting IT-type neurons convey a different motor signal to the striatum than the motor signal conveyed to the spinal cord via long projecting neurons (Bauswein et al., 1989; Turner and DeLong, 2000). In contrast, the long projecting PT-type neurons may communicate to the striatum an efferent copy of the motor signals being sent to the spinal cord (Parent and Parent, 2006). Reiner et al. (2010) thus proposed a theory that these inputs ultimately lead to different kinds of motor modulation within the striatum. Specifically, they suggested that IT-type neuronal projections to the striatum facilitate planned motor actions along the direct pathway and that PT-type projections stymie conflicting motor actions along the indirect pathway.

This hypothesis suggests that projections to the indirect pathway might teach the procedural system what responses not to make early on during explicit system control by driving weights toward zero in regions where the suboptimal explicit strategy yields incorrect responses. Similarly, projections to the direct pathway could teach the procedural system when there is agreement between the explicit system's strategy and the categories by increasing weights. Broadly speaking, the differential effects of these cortical-striatal projections would translate to increasing one striatal response and decreasing another, which has the overall effect of increasing the output of the elicited motor response relative to the not-elicited response. Although there is no indirect pathway in COVIS, this is computationally the effect achieved by adding the output of the explicit system to the procedural system. Furthermore, it would be straightforward to implement COVIS with the addition of the indirect pathway. Even if this dichotomy between projections to the striatum is incorrect, the existence of projections from cortical motor regions to the striatal nuclei hypothesized to mediate procedural learning is well-established, and thus a plausible pipeline through which explicit system responses are communicated to the procedural system.

A very different alternative possibility is that the explicit system executes its motor response by activating the striatum, in which case the information about its response might automatically be communicated to the striatum. In other words, this hypothesis predicts that the striatum is involved in explicitly produced, volitional movements, either at the level of generation or execution of motor movements. Striatal involvement in volitional movement is corroborated by PET (Roland et al., 1982; Jueptner and Weiller, 1998), and fMRI (Cunnington et al., 2002) experiments. Neurophysiological studies in non-human primates provide more direct evidence (e.g., Romo et al., 1992; Schultz and Romo, 1992). In those experiments, striatal neurons responded to self-initiated movement, either leading up to the movement (suggesting involvement in generating the motor action), or with the movement (suggesting involvement in executing the motor action).

Although the exact computational mechanism is currently unknown, the available neurobiological evidence supports the possibility that the procedural system of COVIS can be bootstrapped by the explicit system before it takes over responding in a perceptual category-learning task. This bootstrapping could be from cortical-striatal projections from premotor or motor regions into the striatum, or possibly by the explicit system's control of motor responses through basal ganglia-mediated loops.

#### **ACKNOWLEDGMENTS**

This research was supported in part by AFOSR grant FA9550-12-1-0355, NIH (NINDS) Grant No. P01NS044393, and by Grant No. W911NF-07-1-0072 from the U.S. Army Research Office through the Institute for Collaborative Biotechnologies.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 September 2013; accepted: 22 November 2013; published online: 18 December 2013.*

*Citation: Paul EJ and Ashby FG (2013) A neurocomputational theory of how explicit learning bootstraps early procedural learning. Front. Comput. Neurosci. 7:177. doi: 10.3389/fncom.2013.00177*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2013 Paul and Ashby. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Linking reward processing to behavioral output: motor and motivational integration in the primate subthalamic nucleus

## *Juan-Francisco Espinosa-Parrilla, Christelle Baunez and Paul Apicella\**

*Institut de Neurosciences de la Timone, CNRS-Aix-Marseille Université, Marseille, France*

#### *Edited by:*

*Hagai Bergman, The Hebrew University—Hadassah Medical School, Israel*

#### *Reviewed by:*

*Peter Redgrave, The University of Sheffield, UK*

*Okihide Hikosaka, National Eye Institute, USA*

#### *\*Correspondence:*

*Paul Apicella, Institut de Neurosciences de la Timone, CNRS-Aix-Marseille Université, 27 boulevard Jean Moulin, 13005 Marseille, cedex 05, France e-mail: paul.apicella@univ-amu.fr*

The expectation and detection of motivationally relevant events is a major determinant of goal-directed behavior and there is a strong interest in the contribution of basal ganglia in the integration of motivational processes into behavioral output. Recent research has focused on the role of the subthalamic nucleus (STN) in the motivational control of action, but it remains to be determined how information about reward is encoded in this nucleus. We recorded the activity of single neurons in the STN of two behaving monkeys to examine whether activity was influenced by the delivery of reward in an instrumental task, a Pavlovian stimulus-reward association, or outside of a task context. We confirmed preliminary findings indicating that STN neurons were sensitive not only to rewards obtained during task performance, but also to the expectation of reward when its delivery was delayed in time. Most of the modulations at the onset of reaching movement were combined with modulations following reward delivery, suggesting the convergence of signals related to the animal's movement and its outcome in the same neurons. Some neurons were also influenced by the visuomotor contingencies of the task, i.e., target location and/or movement direction. In addition, modulations were observed under conditions where reward delivery was not contingent on an instrumental response, even in the absence of a reward predictive cue. Taken as a whole, these results demonstrate a potential contribution of the STN to motivational control of behavior in the non-human primate, although problems in distinguishing neuronal signals related to reward from those related to motor behavior should be considered. Characterizing the specificity of reward processing in the STN remains challenging and could have important implications for understanding the influence of this key component of basal ganglia circuitry on emotional and motivated behaviors under normal and pathological conditions.

**Keywords: basal ganglia, reinforcement, reward expectation, context processing, single-neuron activity, monkey**

## **INTRODUCTION**

Although the subthalamic nucleus (STN) is traditionally considered important in motor control, an increasing number of studies have investigated the role of this nucleus in the processing of reward-related information. Evidence in favor of STN involvement in motivational processes comes primarily from lesion studies in behaving rats showing that STN dysfunction leads to increased responding for a food reward (Baunez et al., 2002). In addition, STN lesions could have a differential impact on the incentive motivational properties of natural reinforcers and drugs of abuse: STN-lesioned rats became more motivated when working to obtain food reward and less motivated when cocaine was used as the reward (Baunez et al., 2005), suggesting that processing of different types of rewards can be dissociated at the STN level.

Clinical studies also support the notion that the STN is a component of reward circuitry. In particular, deep brain stimulation (DBS) of this nucleus, which is effective at alleviating motor symptoms in patients with Parkinson's disease, can also interfere with brain circuits that mediate mood and reward signals, leading to enhanced motivation and decreased apathy in some of these patients (Funkiewiez et al., 2003; Takeshita et al., 2005), although an interference with decreased dopaminergic medication cannot be excluded. It has also been reported that DBS of the STN can either increase (Houeto et al., 2002; Schüpbach et al., 2005) or decrease (Witjas et al., 2005; Lhommée et al., 2012; Eusebio et al., 2013) the addiction of parkinsonian patients to their levodopa treatment. Since abnormal repetitive behaviors (i.e., compulsions) include an emotional component, the observation that obsessive–compulsive disorders can be improved by DBS of the STN is also an argument in favor of the contribution of this nucleus to emotional and motivational processes (Mallet et al., 2002; Baunez et al., 2011). Taking these elements into account, it has been suggested that the STN may represent a promising target for the treatment of addiction (Pelloux and Baunez, 2013).

Although functional neuroimaging research in humans poses challenges to the interpretation of changes in activity of small subcortical brain structures (Keuken et al., 2013; reviewed in Péron et al., 2013), studies in this field have recently confirmed the role of the STN in behavioral inhibition highlighted by animal studies, such as the ability to cancel planned or already initiated actions (Aron and Poldrack, 2006; Li et al., 2008). While the role of this nucleus in motivation has received much less attention, the contribution of other parts of the basal ganglia, particularly the ventral striatum, to emotion and reward processes has been studied in many experiments using monetary or taste rewards under a variety of behavioral paradigms (Delgado, 2007). So far, hemodynamic changes restricted to the STN area which may be linked to anticipation and experience of reward have not been reported. However, as mentioned above, such an approach is still limited by the relatively poor spatial resolution of brain imaging techniques. On the other hand, neuronal recordings from the STN in patients with Parkinson's disease or obsessive–compulsive disorders during DBS surgery or after electrode implantation (local field potentials, LFPs) have provided evidence that the STN region is active during the processing of information related to emotional aspects of behavior (Kühn et al., 2005; Brücke et al., 2007; Burbaud et al., 2013). A potential disturbance of emotional information processing at the STN level could account for the mood changes reported in parkinsonian patients subjected to STN stimulation (Krack et al., 2001; Schneider et al., 2003).

More direct evidence for the involvement of STN in motivational processes is obtained from single-neuron recording experiments in animals performing controlled behavioral tasks. Several components of the basal ganglia circuitry, including the striatum, globus pallidus and substantia nigra pars reticulata, have been implicated in the processing of reward-related information and in linking motivation to action (reviewed in Schultz et al., 2000 and Hikosaka et al., 2006). In contrast, there are few data in direct support of reward processing at the STN level. Previous studies in behaving rats have reported that STN neurons can be modulated after the presentation of stimuli associated with reward and at the time of the reward delivery (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013). Moreover, STN neurons show differential responses to reward-related cues according to the quality of the expected reward, i.e., the degree of sweetness of a sucrose solution (Lardeux et al., 2009), and they can also discriminate between food and drug reinforcement, showing a specialization according to the relative preference for the reward (Lardeux et al., 2013). These findings are consistent with the existing literature on the impact of STN inactivation in rodents, supporting a role for this nucleus in reward processing. Conversely, much less information about the role of the STN in motivation is available in the primate. In a preliminary report, we have examined the relation of STN neuronal activity to movement and reward in one monkey performing an arm reaching task (Darbaky et al., 2005). In that study, we showed that the discharge of STN neurons was often modulated during the movement period of the task and at the time of the reward delivery, suggesting that STN neurons carrying signals related to motor activity could also be informed about the reception of the reinforcer. 
However, the details as to how the association of reward information with motor processes occurs in this nucleus remain unclear.

The purpose of our study was to further examine in monkeys the activity of STN neurons which could be related to reward delivered in distinct behavioral situations. The activity was analyzed during performance of visually-triggered movements leading to reward and in conditions with no contingency between movement and reward. Our objective was to determine whether the information contained in the discharge of individual STN neurons could be related to reward and influenced by the behavioral context in which rewards were experienced.

## **MATERIALS AND METHODS**

#### **ANIMALS**

Two adult macaque monkeys (G and P, *Macaca fascicularis*), weighing 5–6 kg, were used as subjects. Experimental setup, surgical procedures and recording procedures were the same as described previously (Deffains et al., 2010). Both animals were fully trained to perform visually triggered arm-reaching movements for liquid rewards before being surgically prepared for neuronal recordings. During the training and recording periods, the monkeys were deprived of water in their home cage and received apple juice during the experiments. Unlimited water access was allowed for at least one day each week. All experiments were conducted in accordance with the guidelines of the National Institutes of Health Guide for the Care and Use of Laboratory Animals and the French laws on animal experimentation.

## **BEHAVIORAL TESTING CONDITIONS**

Each monkey was seated in a specially designed restraining box, facing a panel 30 cm from its head. The panel contained two metal knobs (10 × 10 mm) separated by 20 cm horizontally, and two two-colored (red/green) light-emitting diodes (LEDs), one above each knob, at eye level of the animal (**Figure 1A**). An unmovable metal bar mounted at the center of the panel served as a starting point for the reaching movement. Each trial began with the monkey keeping its hand on the bar. There was no fixation point signaling the trial initiation or constraining the monkey's eye movements. After a period of at least 1 s, a visual cue (green light) was presented randomly, either to the right or left, for 0.5 s. Cue presentation was followed by a 1-s delay period at the end of which the trigger stimulus (red light) appeared in the same location, indicating that the monkey should release the bar and touch the corresponding target. The delivery of reward (0.3 ml of apple juice) occurred immediately after target contact, under the control of a solenoid valve placed outside the experimental room. After target contact, the monkey moved back to the bar and waited for the total duration of the current trial (5 s) to elapse before a new trial began. An error trial was recorded when monkeys took longer than 1 s to initiate or execute the movement (omission trials). The two monkeys received training until they achieved a consistent correct performance rate of > 90% in the reaching task before the neuronal recording started.

In the basic task condition, reward was delivered immediately after a correct reach to a target ("immediate reward") (**Figure 1A**, left). We also employed another condition in which a constant delay of 0.5 s was introduced between target contact and reward delivery ("delayed reward") (**Figure 1A**, middle), trials being similar to the immediate condition in all other aspects of appearance and timing. In particular, the trigger light remained illuminated until target contact, regardless of the moment of reward delivery. In a few cases, the 0.5-s delay was replaced by a 1.0-s delay when single neuron isolation was maintained long enough to complete an additional test. Because this happened infrequently, it was not possible for the monkey to predict the further lengthening of the target-reward interval. In all conditions, trials lasted 5 s so that a temporally less well-defined period began with reward delivery and ended with the cue of the subsequent trial (i.e., intervals between reward and the next cue varied from ∼2 s to 3 s in the 1.0-s delay and immediate reward conditions, respectively). The immediate and delayed reward conditions were conducted in separate blocks of 40–60 trials, the change in condition not being indicated by any explicit cues. However, because of the block design, it did not take monkeys many trials to adjust their expectation after a switch in testing condition. Before the recording experiments started, both monkeys were well-trained in the immediate reward condition, whereas the delayed reward condition was used only occasionally during recording sessions.

#### **FIGURE 1 | Sequences of events and behavior in the testing conditions.**

**(A)** In the reaching task, the monkey started a trial by keeping the hand on a bar. A first visual stimulus (green light) was presented for 0.5 s at one of the two locations. A second visual stimulus (red light) was presented 1 s later at the same location. The animal was then required to release the bar and reach for the target located below the light, which it had to touch to receive a liquid reward. In the *immediate reward* condition, the reward was given immediately after correct target contact. In the *delayed reward* condition, the reward was delivered at the end of a 0.5-s interval after target contact. In the *passively delivered reward* condition, the monkey remained motionless without any access to the bar. This is a Pavlovian protocol in which a visual stimulus signaled the delivery of reward 1 s later, independently of a motor action. The three conditions were tested in separate blocks of trials. RT, reaction time; MT, movement time. **(B)** Reaching task performance for the two monkeys. Each value of RT and MT was obtained by calculating the mean for all correct trials (± SEM) for the different locations of the target stimulus (*ipsi* and *contra* refer to the location of the stimulus ipsilateral and contralateral to the moving arm, respectively) and timing of reward delivery (*immed* and *delay* refer to immediate and delayed delivery of reward after target contact, respectively). An asterisk indicates that the value was significantly different between the two target locations or reward timing conditions (paired *t*-test, *P* < 0.05). **(C)** Licking behavior in the three conditions. Superimposed traces of mouth movement records are aligned on reward valve activation occurring with target contact in the immediate reward condition, 0.5 s after target contact in the delayed reward condition, and 1 s after the onset of the visual stimulus in the Pavlovian protocol, i.e., when reward was passively delivered.

We also employed a condition in which the liquid reward was delivered in a passive manner 1 s after one LED was illuminated with a red light (**Figure 1A**, right). This was a Pavlovian conditioning procedure in which reward was not contingent on behavior (Apicella et al., 1997). In this testing condition, the sliding door located at the front of the restraining box was closed to prevent manual access to the panel. In addition to the testing of the monkeys in the Pavlovian protocol, we also delivered the liquid at unpredictable times, in the absence of any cue to precisely time when the reward would be delivered. The change from instrumentally to passively delivered reward was indicated by the experimenter entering the experimental room to open or close the sliding door of the box. Conversely, there was no signal indicating that the Pavlovian protocol was about to be changed to a block of trials in which reward was delivered alone. Passively delivered reward conditions were presented using the same trial duration as used in the reaching task, thus preserving the overall temporal structure of the testing condition.

#### **NEURONAL RECORDINGS**

On completion of training, each monkey underwent sterile surgery under sodium pentobarbital anesthesia. An opening was made in the skull over the left hemisphere and a stainless steel recording chamber (25 mm OD) was implanted over the hole, its center being aimed at the anterior commissure (AC), approximately 5 mm anterior to the rostral pole of the STN. The chamber was held in place by dental acrylic anchored with stainless steel screws drilled into the skull. Two stainless steel cylinders were also embedded in dental acrylic for subsequent head fixation during recording sessions. Antibiotics and analgesics were administered after surgery. Extracellular activity of single neurons was recorded with tungsten microelectrodes, as described previously (Deffains et al., 2010). In both monkeys, with their heads restrained while performing the behavioral tasks, neuronal recordings were carried out first from the striatum. After several months of recording mainly targeted at the putamen, finding the accurate location of the STN was relatively easy using the striatal tracks, particularly at the level of the posterior putamen. Parallel electrode tracks were then made vertically, through the thalamus, zona incerta, STN and substantia nigra pars reticulata, in that order, the transition between these structures being obvious because of grossly different spontaneous neuronal activity. The electrode was driven by a hydraulic microdrive (MO-95; Narishige, Tokyo, Japan) through a stainless steel guide tube, which was used to penetrate the dura. After penetration of the dura, the electrode was advanced until the dorsal border of the STN was identified by an increase of background noise and typical large-amplitude irregular spike activity after passing the white matter area below the thalamus (i.e., fields of Forel) and zona incerta (Matsumura et al., 1992; Wichmann et al., 1994; Isoda and Hikosaka, 2008).
Signals from neuronal activity were conventionally amplified, filtered (bandpass, 0.3–1.5 kHz), and converted to digital pulses through a window discriminator. During the recording of any neuron, activity in the immediate reward version of the reaching task was generally studied first, and, if the isolation could be sustained for a sufficient period of time, the tests were continued in the other conditions. It needs to be pointed out that the relative lack of stability of recording in the primate STN did not always ensure the isolation of individual neurons over successive trial blocks for different conditions (Wichmann et al., 1994). We succeeded in holding stable neurons for two or more blocks in only a few cases. Presentation of visual stimuli, delivery of reward, collection of movement parameters, mouth movements and single-neuron activity were controlled by a computer, using custom software written by E. Legallet.

#### **DATA ANALYSIS**

Performance in the reaching task was assessed by measuring the reaction time (RT), i.e., the time between the onset of the trigger stimulus and release of the bar, and the movement time (MT), i.e., the time taken to contact a target after releasing the bar. The analysis included data from all correctly performed trials and all recording sessions, excluding error trials (omissions of the trigger stimulus) and premature responses (RTs < 100 ms). The tube conducting reward liquid to the spout positioned directly in front of the monkey's mouth was equipped with a strain gauge circuit with which we monitored the licking movements as analog signals (sampling rate: 100 Hz). The timing characteristics of the mouth movements that monkeys performed in the different conditions were assessed off-line by single-trial analysis.
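The RT and MT definitions above can be illustrated with a minimal sketch. It assumes per-trial event times in milliseconds; the field names (`trigger_on`, `bar_release`, `target_contact`), the example values, and the helper functions are hypothetical and are not taken from the authors' acquisition software. The 100-ms premature-response cutoff and the 1-s omission limit follow the text.

```python
from statistics import mean

# Hypothetical per-trial event times (ms from trial start); values are illustrative only.
trials = [
    {"trigger_on": 2500, "bar_release": 2820, "target_contact": 3050},
    {"trigger_on": 2500, "bar_release": 2550, "target_contact": 2800},  # premature: RT < 100 ms
    {"trigger_on": 2500, "bar_release": 3700, "target_contact": 3950},  # omission: RT > 1 s
    {"trigger_on": 2500, "bar_release": 2900, "target_contact": 3100},
]

MIN_RT_MS = 100    # premature-response cutoff stated in the text
MAX_TIME_MS = 1000 # limit for initiating or executing the movement

def rt_mt(trial):
    """Return (reaction time, movement time) in ms for one trial."""
    rt = trial["bar_release"] - trial["trigger_on"]
    mt = trial["target_contact"] - trial["bar_release"]
    return rt, mt

def correct_trials(trials):
    """Keep (RT, MT) pairs from trials that are neither premature nor omissions."""
    kept = []
    for trial in trials:
        rt, mt = rt_mt(trial)
        if rt < MIN_RT_MS or rt > MAX_TIME_MS or mt > MAX_TIME_MS:
            continue
        kept.append((rt, mt))
    return kept

kept = correct_trials(trials)
mean_rt = mean(rt for rt, _ in kept)
mean_mt = mean(mt for _, mt in kept)
```

With these illustrative values, two of the four trials are retained and the means are computed over those trials only.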

To analyze neuronal correlates of movement initiation and reward, we calculated the mean firing rate in two task periods: a 300-ms time window extending from 200 ms before movement onset until 100 ms after movement onset, called the "perimovement period", and another 300-ms time window extending from 400 to 700 ms after target contact, called the "postreward period". The mean discharge rate in each task period was compared with that in the control period (the 1 s duration before the cue onset) to examine whether the neuron showed significant task-related activities. If the mean discharge rate in a given period was significantly different from that in the control period (two-tailed Student *t*-test, *P* < 0.05), the neuron was considered to show task-related activity in that period. We tested for changes in activity during these two periods for each recorded neuron.
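As a rough illustration of the windowing described above, the following sketch computes the mean firing rate of one trial in the control, perimovement and postreward periods. Spike and event times are assumed to be in milliseconds; the function names and the example spike train are hypothetical, and the statistical comparison with the control period (the *t*-test) is not implemented here.

```python
def rate_in_window(spike_times_ms, start_ms, stop_ms):
    """Mean firing rate (spikes/s) in the half-open window [start_ms, stop_ms)."""
    n_spikes = sum(1 for t in spike_times_ms if start_ms <= t < stop_ms)
    return n_spikes * 1000 / (stop_ms - start_ms)

def epoch_rates(spike_times_ms, cue_on, move_on, contact):
    """Rates in the three periods defined in the text, for a single trial."""
    return {
        # control: the 1 s before cue onset
        "control": rate_in_window(spike_times_ms, cue_on - 1000, cue_on),
        # perimovement: 200 ms before to 100 ms after movement onset
        "perimovement": rate_in_window(spike_times_ms, move_on - 200, move_on + 100),
        # postreward: 400 to 700 ms after target contact
        "postreward": rate_in_window(spike_times_ms, contact + 400, contact + 700),
    }

# Illustrative spike train for one trial (ms from trial start).
spikes = [100, 400, 900, 2750, 2800, 2850, 2900, 3500, 3600]
rates = epoch_rates(spikes, cue_on=1000, move_on=2800, contact=3050)
```

Collecting these per-trial rates across all trials of a neuron would give the samples that are then compared between each task period and the control period.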

The selectivity of the task-related activity for a particular location of the target stimulus was judged to be present if the magnitudes of the perimovement activity were significantly different between the two locations (two-way ANOVA, period (control, perimovement) × location (left, right), *P* < 0.05).

To determine the response latency of a neuron to a particular task event, onset and offset times of statistically significant changes in activity were assessed using a previously established procedure based on a sliding time window analysis (Deffains et al., 2010). Briefly, baseline activity was determined in the 1-s period that preceded the onset of the cue (control period). A test window of 100 ms was moved in steps of 10 ms, starting at the onset of the cue. We then compared activity from the baseline period to activity in the sliding window. Neurons showing a statistically significant difference in activity during ≥ 20 consecutive steps (Wilcoxon signed-rank test, *P* < 0.05) were considered modulated. The latency of a significant change in neuronal activity was defined as the beginning of the first of 20 consecutive steps showing a significant difference relative to the baseline activity during the control period.
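The latency criterion can be sketched as a search for the first run of at least 20 consecutive significant steps. The sketch below assumes that a *P* value has already been obtained for each 10-ms step of the sliding window (for example from a Wilcoxon signed-rank test against baseline); the function names and the example *P* values are hypothetical, and the latency is expressed relative to the start of the sliding analysis.

```python
def first_sustained_run(significant, min_run=20):
    """Index of the first step that starts a run of >= min_run True values, else None."""
    run_start = None
    run_len = 0
    for i, flag in enumerate(significant):
        if flag:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0
    return None

def latency_ms(p_values, alpha=0.05, step_ms=10, min_run=20):
    """Latency (ms from the first window position) of the first sustained change."""
    flags = [p < alpha for p in p_values]
    start = first_sustained_run(flags, min_run)
    return None if start is None else start * step_ms

# Illustrative P values: a short run of 5 significant steps (too brief to count),
# one non-significant step, then a sustained run of 25 significant steps.
p_values = [0.5] * 12 + [0.01] * 5 + [0.3] + [0.01] * 25
onset = latency_ms(p_values)
```

Requiring a sustained run rather than a single significant step is what protects the latency estimate from isolated chance fluctuations.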

In addition to the assessment of activity changes of the individual STN neurons, we also summed activity of all neurons tested in a given condition or for a particular location of the target stimulus and made population histograms. For each neuron, a normalized perievent time histogram was obtained by dividing the content of each bin by the number of trials. The population histogram was obtained by averaging all normalized histograms referenced to a particular event. These histograms were constructed from neurons recorded in monkeys P and G in each testing condition.
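A minimal sketch of this normalization and averaging is given below. It assumes spike times in milliseconds aligned on the reference event, one list of spike times per trial; the bin width, the time window, the function names and the example data are illustrative assumptions rather than the parameters used in the study.

```python
def normalized_peth(trial_spikes, t_start, t_stop, bin_ms):
    """Perievent histogram with each bin count divided by the number of trials."""
    n_bins = (t_stop - t_start) // bin_ms
    counts = [0] * n_bins
    for spikes in trial_spikes:
        for t in spikes:
            idx = (t - t_start) // bin_ms
            if t >= t_start and idx < n_bins:
                counts[idx] += 1
    return [c / len(trial_spikes) for c in counts]

def population_histogram(peths):
    """Bin-by-bin average of the normalized histograms of all neurons."""
    n_neurons = len(peths)
    return [sum(column) / n_neurons for column in zip(*peths)]

# Two hypothetical neurons, spike times in ms relative to the alignment event.
neuron_a = [[-50, 10, 20], [30]]      # 2 trials
neuron_b = [[60], [70], [-80], [0]]   # 4 trials

peth_a = normalized_peth(neuron_a, t_start=-100, t_stop=100, bin_ms=50)
peth_b = normalized_peth(neuron_b, t_start=-100, t_stop=100, bin_ms=50)
population = population_histogram([peth_a, peth_b])
```

Dividing by the number of trials before averaging gives each neuron the same weight in the population histogram regardless of how many trials were recorded for it.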

## **HISTOLOGY**

We confirmed our recording sites by histological verification in one monkey, whereas the location of recorded neurons was determined solely on the basis of the electrophysiological information concerning the boundaries of the STN and surrounding structures in the other monkey. After the experiments had been completed, monkey P was sacrificed with an intravenous overdose of pentobarbital and perfused transcardially with 0.9% saline followed by a fixative (4% paraformaldehyde, pH 7.4 phosphate buffer). The brain was cut in 50-µm coronal sections, mounted on slides, and stained with cresyl violet. We identified the recording sites that had been marked with small electrolytic lesions in and around the STN at the end of neuronal data collection. Electrode penetrations were then reconstructed in serial sections through the STN by referring to the marking lesions.

## **RESULTS**

#### **TASK PERFORMANCE DATA**

The mean RT and MT for different target locations and reward timing for each monkey are presented in **Figure 1B**. Consistent with previous reports from our lab (Ravel et al., 2006; Deffains et al., 2010), RTs to target stimuli presented contralaterally to the moving arm were slightly but significantly longer than RTs to ipsilateral target stimuli in monkey G (Student's *t*-test, *P* < 0.01), whereas the effect of spatial location on RT was not significant in monkey P (*P* > 0.05). On the other hand, MTs for contralateral movements were longer than MTs for ipsilateral ones in both monkeys (*P* < 0.01). It is possible that monkey P used prior information about the target location efficiently to prepare the movement and then improve speed during initiation, thus explaining the lack of spatial response bias in the initiation phase of movement in this animal.

As also shown in **Figure 1B**, RTs were significantly longer in blocks where the reward was delayed by 0.5 s after target contact than in blocks where it was delivered immediately after target contact, in both monkeys (two-way ANOVA with the factors target location and reward timing; effect of reward timing: monkey G: *F* (1, 401) = 73.82, *P* < 0.01; monkey P: *F* (1, 364) = 96.16, *P* < 0.01). In contrast, for MT, there was no significant difference between the two conditions (monkey G: *F* (1, 401) = 0.47, *P* > 0.05; monkey P: *F* (1, 364) = 0.65, *P* > 0.05). There was no interaction between target location and reward timing regarding RT and MT in monkey G (*P* > 0.05), whereas such interactions were detected for RT (*P* < 0.05) and MT (*P* < 0.01) in monkey P. These findings indicate that monkeys took longer to initiate reaching movements when the timing of the reward outcome was delayed. These results differ from what was found in a previous study, using a similar reaching task, in which we reported that neither RT nor MT was influenced by the delayed delivery of reward after target contact (Ravel et al., 2001). This lack of concordance could be attributed to differences in training levels, since the small sample of neurons tested in the delayed reward condition in the present study did not allow enough training. In a few cases (1 and 4 trial blocks in monkeys G and P, respectively), a 1-s delay was employed in order to further lengthen the period of expectation of reward after target contact. This test was only used when stable recordings were obtained on neurons that were modulated during the target-reward interval of 0.5 s. In monkey P, two-way ANOVA performed with target location and reward timing (immediate, 0.5-s delay, 1.0-s delay) as factors revealed a significant main effect of reward timing on RT (*F* (2, 266) = 31.01, *P* < 0.01), with RT being longer in the 1.0-s delay than in the 0.5-s delay condition. In contrast, MTs did not vary significantly with reward timing (*F* (2, 266) = 1.97, *P* > 0.05). When monkeys were tested with delayed reward conditions, whether a 0.5-s or a 1.0-s delay, they immediately moved their hand back to the bar after having touched the target and then waited for the delay before reward. In this regard, there were no noticeable differences in gross behavior between immediate and delayed reward conditions while the animal was resting the hand on the bar in order to initiate the next trial.

Representative mouth movement recordings are illustrated in **Figure 1C**. They were essentially the same as those described in earlier studies using similar behavioral situations (Ravel et al., 2001; Apicella et al., 2009). The monkey started to lick the spout on or slightly before target contact in the instrumental task, regardless of the timing of reward, and these movements were prolonged until the receipt of liquid when a 0.5-s delay was introduced between target contact and reward delivery. Licking movements occurred during the 1-s interval separating the presentation of the visual stimulus from the delivery of reward, indicating that the stimulus served as a trigger for mouth movements in anticipation of the upcoming reward. This served as a well-established behavioral marker of the Pavlovian stimulus-reward association.

#### **GENERAL**

While the monkeys performed the reaching task, we recorded from 74 neurons in the STN (51 and 23 from monkeys G and P, respectively) with a background firing rate of 23.5 ± 16.4 spikes/s (mean ± SEM). Most neurons (*n* = 59) had medium to high firing rates (mean 28.6, range 10.3–71.3), whereas a small number (*n* = 15) fired below 10 spikes/s (mean 3.4). The firing characteristics of STN neurons were similar to what has been reported previously in behaving monkeys in terms of spiking irregularity and baseline firing rate (Matsumura et al., 1992; Wichmann et al., 1994; Isoda and Hikosaka, 2008).
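As a quick consistency check on the firing rates above, the overall mean can be recovered as the trial-independent, neuron-weighted average of the two subgroup means. The helper name `pooled_mean` is hypothetical; the subgroup sizes and means are those reported in the text.

```python
from typing import List, Tuple


def pooled_mean(groups: List[Tuple[int, float]]) -> float:
    """Weighted mean of subgroup means; each group is (n_neurons, mean_rate)."""
    total_n = sum(n for n, _ in groups)
    if total_n == 0:
        raise ValueError("groups must contain at least one neuron")
    return sum(n * m for n, m in groups) / total_n


# 59 neurons with a mean of 28.6 spikes/s and 15 neurons with a mean of 3.4 spikes/s
overall = pooled_mean([(59, 28.6), (15, 3.4)])
print(round(overall, 1))  # 23.5
```

The result agrees with the background firing rate of 23.5 spikes/s reported for the whole sample of 74 neurons.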

Based on the location of recording sites identified histologically in monkey P, these neurons were recorded between 5 and 7 mm posterior to the AC, over the mediolateral extent of the STN, except its most medial part. A summary plot of all neurons recorded is shown in **Figure 2**, including those that could not be recorded for a sufficient number of trials. On the basis of histological analysis of electrode tracks in this monkey, it appears that we failed to adequately sample the most medial portion of the nucleus in its more anterior parts. As regards the recording sites of well-isolated neurons, the two major groups of task-related changes in activity (i.e., those occurring during the perimovement period and/or the postreward period) did not show clear regional differences in the STN explored.

In general, STN neurons exhibited a variety of changes in activity during various periods of the reaching task, with some neurons being modulated before the onset of the movement and others being modulated later, after the monkey's hand contacted the target and was immediately followed by the delivery of reward. These modulations could consist in either excitation or inhibition. As we point out below, the same neurons often displayed changes in activity during two distinct task periods, namely around the initiation of movement and after target contact.

## **ACTIVITY AFTER TARGET CONTACT**

The activity of 70% (52/74) of neurons (13 of 23 in monkey P, 39 of 51 in monkey G) was significantly modulated after reaching toward the correct target. **Figure 3A** shows an example of a neuron with increased activity after target contact (right part). This neuron also showed increased activity, although to a weaker degree, before the initiation of movement (left part). As is demonstrated by this case, the temporal linkage to movement onset was clearly apparent by aligning neuronal activity on bar release. The activity decreased rapidly immediately after the onset of movement and increased again after movement termination, lasting more than 1 s after target contact. About three quarters of the detected changes occurring after target contact consisted of increases and one quarter of decreases.

As illustrated in **Figure 3A**, the majority of neurons (35/52) significantly modulated after target contact also showed a significant change in activity around the initiation of movement. Thirty-three neurons changed their activity in the same direction for both movement initiation and after target contact (28 increases, 5 decreases), whereas only 2 neurons had opposite changes in activity (an increase for movement initiation and a decrease after target contact). Changes in activity occurring specifically after target contact were observed in 17 neurons (10 increases, 7 decreases) and 15 neurons showed an exclusive change in activity around the initiation of movement (10 increases, 5 decreases).
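The neuron counts reported above can be cross-checked with a short tally. This sketch assumes that the four categories are mutually exclusive, which the text implies but does not state outright; the dictionary keys are illustrative labels, and the totals for perimovement modulation and for modulation in either period are derived here rather than quoted from the text.

```python
TOTAL_NEURONS = 74

# Counts taken from the text; the category labels are hypothetical.
counts = {
    "both_periods_same_direction": 33,
    "both_periods_opposite_direction": 2,
    "post_contact_only": 17,
    "perimovement_only": 15,
}

both_periods = (counts["both_periods_same_direction"]
                + counts["both_periods_opposite_direction"])
post_contact = both_periods + counts["post_contact_only"]
perimovement = both_periods + counts["perimovement_only"]
either = sum(counts.values())

print(both_periods, post_contact, perimovement, either)  # 35 52 50 67
print(f"{100 * post_contact / TOTAL_NEURONS:.0f}%")  # 70%
```

Under that assumption, the tally reproduces the 35/52 overlap and the 70% (52/74) of neurons modulated after target contact.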

We examined the time course of activity changes at the level of the whole sample of neurons modulated during the postreward period (**Figure 3B**). Only neurons showing increases in their firing rate were included in this population average. The average activity reached a first peak of increased activity slightly after the presentation of the trigger stimulus that elicited movements. A second peak of increased activity occurred later after target contact. It is noticeable that this latter peak was well located in the 300-ms time window we have chosen for analysis (i.e., 400–700 ms after target contact), indicating that the majority of activations following target contact were relatively late increases.
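For illustration, the firing rate in the 400–700 ms window after target contact could be computed as follows. This is a sketch under stated assumptions: spike times are expressed in milliseconds relative to target contact, the function name `window_rate` and the toy trials are hypothetical, and the statistical comparison against a control period used in the study is not reproduced here.

```python
from typing import Sequence


def window_rate(trials: Sequence[Sequence[float]],
                win_start_ms: float = 400.0,
                win_stop_ms: float = 700.0) -> float:
    """Mean firing rate (spikes/s) in [win_start_ms, win_stop_ms).

    `trials` holds one list of spike times per trial, in ms relative to
    target contact.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    if win_stop_ms <= win_start_ms:
        raise ValueError("window must have positive duration")
    duration_s = (win_stop_ms - win_start_ms) / 1000.0
    n_spikes = sum(1 for spikes in trials for t in spikes
                   if win_start_ms <= t < win_stop_ms)
    return n_spikes / (len(trials) * duration_s)


# Toy data (hypothetical): three trials, four spikes inside the window.
example_trials = [[100, 450, 600], [420, 699, 700], []]
rate = window_rate(example_trials)
print(f"{rate:.2f} spikes/s")  # 4.44 spikes/s
```

The window is half-open, so a spike falling exactly at 700 ms is not counted.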

**Figure 3** […] onset of movement (red squares in each row). Histogram scale: impulses/bin. Binwidth for histograms: 20 ms. **(B)** Modulation of population activity in the reaching task. Only neurons showing increases in firing rate after target contact, pooled from both monkeys, are included in this analysis. Average activity is aligned on the trigger stimulus (left) or target contact (right), which are marked by vertical dashed lines. Vertical scale: impulses/s. N: number of neurons included for the population histogram.

The relatively late onset of peak firing after target contact suggests that changes in STN neuronal activity did not occur directly in response to target contact. Also, peak firing did not appear to be associated with the monkey's movement back to the resting bar. Although a possible relationship to the consumption of liquid must be taken into account, it seems that the motor aspects of orofacial activity were not the only explanation, particularly for those neurons which began to be modulated with a relatively long latency after target contact. This explanation, however, does not hold for other neurons that became active earlier, during the reaching movement, a task period in which preparation and initiation of mouth movements took place.

We next examined whether the magnitude of the change in activity during the perimovement period was affected by the location of the target stimulus and/or the direction of the associated movement. Target location had a significant effect (*P* < 0.05; two-way ANOVA) on perimovement activity in 22 of the 74 neurons tested (19 and 3 neurons in monkeys G and P, respectively). Among them, 19 showed increases in activity around the initiation of movement and 3 showed decreases. The fact that most of the neurons sensitive to the location of the target stimulus were recorded in monkey G in which a spatial response bias was observed behaviorally (**Figure 1B**), suggests that a contralateral preference in STN modulations could emerge in parallel with slow movement initiation in the contralateral direction. We found that 14 of the 22 neurons displaying spatially selective modulations preferred the contralateral stimulus location, and 8 the ipsilateral one (*P* < 0.05; two-way ANOVA followed by Tukey's test). This is illustrated by data from two example neurons classified as spatially selective in **Figure 4A**. The first neuron (left part) is an example of directional preparatory activity, the selective activation being manifested by a sustained increase in discharge rate during the delay period prior to contralateral movement, whereas a corresponding reduction in activity occurred with ipsilateral movements. The other neuron (right part) showed a contralateral-selective activation in advance of trigger presentation and until movement onset. It is noteworthy, in the two example neurons, that the change in activity terminated with the onset of movement, rather than the presentation of the stimulus triggering movement. Moreover, a phasic component occurring just before the movement onset was still visible even in the case of ipsilateral movements, suggesting the presence of a neuronal signal related to movement initiation that did not depend on the spatial features of the task.

Separate population histograms were constructed from all neurons showing target location selectivity (**Figure 4B**). This analysis was confined to data from neurons recorded in monkey G showing increased activity during the perimovement period and in which a significant effect of spatial location was detected. Despite the small number of neurons, the spatial preference in terms of magnitude of change in the average activity was obvious when the stimulus was presented contralaterally to the reaching arm. As shown in **Figure 4B**, the selectivity for ipsilateral target location appeared less evident at the level of the population average, compared to contralateral location.

## **INFLUENCE OF DELAYING THE REWARD AFTER TARGET CONTACT**

Additional tests were performed to characterize changes in activity following target contact. In particular, we wanted to separate in time movement termination from the receipt of liquid by delaying the reward for 0.5 s, occasionally 1 s, after the contact of the monkey's hand with the target. In 21 neurons (8 and 13 from monkeys P and G, respectively), single neuron isolation could be successfully maintained during recording in both the immediate and delayed reward conditions. Of these, 16 neurons showed a sustained change in activity through the delay between target contact and reward delivery, consisting of increases and decreases in activity in 10 and 6 neurons, respectively. For 5 of the 16 neurons showing a sustained change in their activity during the 0.5-s delay (2 increases, 3 decreases), data were also collected when the target–reward interval was further lengthened to 1 s and we verified that the change in activity was prolonged accordingly. This is consistent with previous studies demonstrating that STN neurons recorded in behaving monkeys can display sustained changes in activity as a possible reflection of a state of expectation of the delivery of reward (Matsumura et al., 1992; Darbaky et al., 2005). **Figure 5** shows data from two neurons which were fully tested in three successive trial blocks, illustrating activity profiles dependent on the duration of the delay between reaching the target and receiving the reward. The rasters in the left panels represent the activity of a neuron which displayed elevated activity during the delay. The same pattern of activity is seen in the neuron illustrated in the right panels, but consisting of a suppression of activity during the delay. In both cases, modulations occurring during the delay began just before target contact and continued until reward delivery. Despite the limited size of our data set, it is noticeable that sustained changes in activity before reward delivery were frequently found in the ventral half of the nucleus (**Figure 2**), which has been shown previously to contain neurons related to reward expectation in monkeys (Matsumura et al., 1992).

In order to investigate whether the task-related changes in STN neuronal activity showed a dependency on reward timing, we tested the activity of each neuron in the perimovement period with a three-way ANOVA with factors: period × target location × reward timing. Of the 21 neurons recorded in both the immediate and delayed reward conditions, 16 were modulated in a similar manner and 5 showed a significant difference in the magnitude of perimovement activity between the two conditions. Specifically, 3 neurons were modulated more strongly when the reward was delivered immediately after the target contact, whereas 2 neurons were modulated more strongly when monkeys received reward 0.5 s after the target contact. It therefore appears that there was no systematic relationship between the level of modulation and reward timing, suggesting that changes in movement speed were not linked to perimovement changes in STN firing as a function of reward timing. On the other hand, a significant difference in the magnitude of postreward activity between the two conditions was detected in 15 of the 21 neurons, with 10 of them being modulated more strongly when the reward was delivered immediately after the target contact and 5 neurons being modulated more strongly when reward was delayed by 0.5 s. This suggests that modulations of postreward activity can be explained, at least for some STN neurons, by the subjective value of reward which decayed as the time to its delivery was delayed. The small number of neurons tested with the 0.5-s and 1.0-s delays prevented a similar analysis.
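The idea that the subjective value of a reward decays with the delay to its delivery is often formalized as hyperbolic discounting, V = A / (1 + kD). The sketch below is purely illustrative: the hyperbolic form, the discount parameter `k` and its value are assumptions and were not fitted or tested in this study.

```python
def discounted_value(amount: float, delay_s: float, k: float) -> float:
    """Hyperbolically discounted value of a reward delivered after delay_s.

    k (in 1/s) is a hypothetical discount parameter; larger k means that
    value falls off faster with delay.
    """
    if delay_s < 0 or k < 0:
        raise ValueError("delay_s and k must be non-negative")
    return amount / (1.0 + k * delay_s)


# Reward timings used in the task: immediate, 0.5-s delay and 1.0-s delay.
K_ASSUMED = 1.0  # illustrative value only
values = [discounted_value(1.0, d, K_ASSUMED) for d in (0.0, 0.5, 1.0)]
for delay, value in zip((0.0, 0.5, 1.0), values):
    print(f"delay {delay:.1f} s -> value {value:.3f}")
```

Any monotonically decreasing function of delay would make the same qualitative point, namely that an immediate reward carries a higher value than a reward delayed by 0.5 s or 1.0 s.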

#### **INFLUENCE OF DELIVERING REWARDS NOT CONTINGENT UPON INSTRUMENTAL REACTIONS**

The relationship of changes in neuronal activity to the expectation and detection of reward was further examined when the liquid was delivered outside of the reaching task. Twenty neurons (16 and 4 neurons in monkeys G and P, respectively) were studied in the Pavlovian protocol. **Figure 6A** (left panel) shows a neuron that displayed two consecutive components of neuronal modulation, i.e., a brief increase in firing after the presentation of the visual stimulus and another increase after the subsequent delivery of reward. The same neuron was also recorded when the reward was delivered alone, in the absence of the preceding stimulus (right panel), and an increase in firing was still detected after this single event.

A notable feature of STN neurons recorded in the Pavlovian protocol is the substantial variability in the temporal profile of their changes in activity. To quantitatively assess this, the times of onset and offset of modulation were determined for each neuron with the use of the sliding window procedure (see Section Materials and Methods). The distribution of temporal profiles of activity relative to stimulus onset for every recorded neuron is presented in **Figure 6B** (left). We found that 17 neurons were significantly modulated after the presentation of the visual stimulus (10 increases, 7 decreases), the remaining 3 neurons being modulated only after the delivery of reward (2 increases, 1 decrease). As can be seen in this figure, the change in activity after stimulus onset was maintained in the time period immediately following stimulus onset for 16 neurons and even extended beyond reward delivery for 15 neurons. Two neurons were modulated only after the presentation of the stimulus (2 increases). Some activity changes began after stimulus onset, ended before reward delivery and then restarted later after the delivery of reward, whereas others persisted through the delay until the delivery of reward. Overall, it appears that the distributions of onset of modulation overlapped substantially among neurons with task-related increases and decreases in activity. In 10 neurons, the receipt of reward produced a change in activity of the same sign as the change following stimulus onset, whereas 5 neurons had bidirectional changes in activity. These observations are consistent with the idea that STN neurons were not exclusively sensitive to the reinforcement of an instrumental response. We examined the population activity for the sample of 20 neurons recorded in the Pavlovian protocol (**Figure 6C**). Considering the complex pattern of changes in activity described above, we combined all neurons, regardless of their increase and/or decrease in firing rate. Two phases of increased activity were visible after each task event in the population average. The first phase appeared relatively homogeneous, whereas the second phase included multiple components.

Finally, it was of special interest to examine STN neuronal activity when the monkey received reward without any predictive cues. Among a sample of 18 neurons (3 and 15 neurons from monkeys P and G, respectively), 13 displayed changes in activity after the delivery of reward (11 increases, 2 decreases). The remaining 5 neurons showed no detectable change in activity during the test. As mentioned above, the example neuron shown in **Figure 6A** (right) increased its activity in response to reward given alone. The temporal parameters of the changes in activity were also analyzed for each neuron tested in this condition and the results of this analysis are illustrated in **Figure 6B** (right). It can be seen that the increases in firing occurred more frequently than the decreases. We then performed a population analysis for these 18 neurons and by comparing it with the average activity obtained in the Pavlovian protocol, it appears that the modulation in response to the delivery of reward was more homogeneous in the former condition.

**Figure 6** […] reward not signaled by a predictive cue (right). Same conventions as in **Figure 3A**. **(B)** Time course of changes in activity of all neurons tested in the Pavlovian protocol (left) and during the delivery of reward alone (right). Each colored horizontal line represents the period with a statistically significant increase (blue) or decrease (red) in activity for a single neuron. Lines are ordered according to onsets of activity change after the presentation of the visual stimulus or the delivery of reward. Gray horizontal dashed lines indicate a lack of significant change in discharge rate after a given event. **(C)** Modulation of population activity. Population included all neurons (i.e., both increases and decreases) recorded in the two testing situations. Same conventions as in **Figure 3A**.

## **DISCUSSION**

Although clinical and neurophysiological evidence has long pointed to the role of STN in regulating motor function, recent studies have also implicated this nucleus in the processing of reward-related information. Consistent with this idea, we found that STN neurons displayed changes in activity after the delivery of reward, regardless of the need to make a movement to obtain reward. When monkeys were actively engaged in target reaching, the reward signals carried by STN neurons were often combined with modulations around the time of movement onset, demonstrating that STN neurons are sensitive to the movement and its reward outcome. These observations confirmed earlier findings from only a single monkey performing a reaching task to obtain reward (Darbaky et al., 2005). In the present study, we have provided further details of reward-related changes in STN activity under varying contextual conditions, i.e., during instrumental and Pavlovian conditioning tasks and even in the absence of any reward predictive cue. In addition, our findings have emphasized the presence of STN neurons exhibiting sustained changes in activity in advance of reward whose delivery was delayed in the reaching task, indicating that STN neuronal activity is influenced by the state of expectation of future rewards. Altogether, these results obtained in non-human primates are consistent with the idea that the STN is a component of basal ganglia circuitry that mediates motivational processes.

#### **REWARD-RELATED ACTIVITY IN THE STN**

Although a growing number of clinical and animal lesion studies highlight the role of the STN in reward circuitry, only a few studies of neuronal activity in the STN have documented the sensitivity of individual neurons to the delivery of reward during the performance of motor tasks in both rodents (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013) and monkeys (Matsumura et al., 1992; Darbaky et al., 2005). We have previously shown that STN neuronal activity was modulated by reward delivered after an instrumental response, and preliminary work done at that time indicated that reward-related modulations were still detected when reward was passively delivered in a Pavlovian manner (Darbaky et al., 2005). The activity changes reported here confirm these preliminary findings and further indicate that STN neurons were responsive to unpredicted deliveries of reward. It therefore appears that STN neurons have firing rates that are sensitive to appetitive events themselves, whether or not a motor response was required to obtain reward and even outside of Pavlovian or instrumental response control. However, as we discuss below, circumstances in which rewards are passively experienced by animals do not exclude a possible influence of motor constraints to consume the liquid.

Changes in STN neuronal activity after reward delivery have been reported earlier in rats performing instrumental tasks (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013) and are comparable to those reported here in monkeys. In addition, Lardeux et al. (2009) have further demonstrated that STN activity may be related to reward quality (i.e., different sweetness of a liquid sucrose reward). These same authors have recently reported that different populations of STN neurons were modulated by the delivery of an appetitive liquid or administration of a psychoactive drug (Lardeux et al., 2013), suggesting that distinct neuronal circuits within the STN mediate the positive value of stimuli.

Several basal ganglia structures with neuronal activity linked to the rewarding significance of conditioned stimuli as well as to reward itself have been extensively investigated in both rodents and monkeys (Schultz et al., 2000; Hikosaka et al., 2006). It appears that the STN also contributes to the processing of motivational information. How reward sensitivity is encoded in the STN compared with that found in other components of the basal ganglia circuitry, particularly the striatum, remains to be clarified.

Although our results showed, at the single-neuron level, the contribution of the STN to the detection of rewarding events, they did not allow us to firmly establish whether the observed changes in activity were related to the hedonic nature of the reward. In particular, the possibility that the reward-related activity encoded some aspects of orofacial behavior was not totally eliminated. In an attempt to dissociate mouth movements from the delivery of reward *per se*, we have examined the timing characteristics of licks at the spout during neuronal recordings. As already pointed out in our previous study (Darbaky et al., 2005), the reward-related activity was not directly related to the licking patterns in terms of onset and duration. For example, in the reaching task, the majority of modulations of STN neuron firing which occurred after target contact were relatively late changes, suggesting that they were unlikely to be coupled to preparation or initiation of mouth movements which began earlier. We cannot exclude, however, that they could be involved in later phases of liquid consumption, such as swallowing. Moreover, the variability in the time course of the observed changes in neuronal activity during Pavlovian conditioned behavior challenges the notion that STN neurons may encode the highly stereotyped pattern of licking movements elicited in this condition.

## **HETEROGENEITY OF REWARD-RELATED ACTIVITIES IN THE STN**

Consistent with previous electrophysiological studies, we have found that STN neurons were either excited or inhibited by rewarding events, increases being more common than decreases. Interestingly, the prevalence of increases in STN neuronal activity became particularly evident when switching from a Pavlovian association between stimulus and reward to a situation in which reward was not signaled by a cue, thus suggesting a change in the way that the reward itself is processed by STN neurons. For comparison with our earlier study (Darbaky et al., 2005), it was also noticed that increases in STN neuronal activity were more frequent when reward was given outside of a learning context. In the rat experiments, modulations of STN firing following reward delivery also consisted of either an increase or a decrease in firing (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013), although the respective proportions of changes in opposite direction could be different from one study to another, and an influence of context on reward-related activities was also highlighted in the case of changes in reward values (Lardeux et al., 2009, 2013). Because behavior-related increases and decreases in the activity of STN neurons are assumed to have opposing effects on behavioral output, it is important to understand the significance of these opposing changes in activity. Indeed, increases in STN activity are thought to suppress movement execution by increasing inhibition of thalamic and brainstem targets via basal ganglia output structures and the reverse may occur in the case of decreases in STN activity.

## **CHANGES IN STN NEURONAL ACTIVITY RELATED TO SPECIFIC ASPECTS OF TARGET REACHING**

Our findings have shown two main and temporally distinct patterns of task-related activity in the STN during performance in the reaching task, one occurring around the initiation of movement and the other after target contact, immediately followed by reward delivery. Previous studies in behaving rats have also shown that most neurons in the STN carry signals related to both motor behavior and reward outcomes (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013). These observations point to an influence of motor and motivational aspects of task performance at the single-neuron level, possibly reflecting the fact that reward signals generated by STN neurons are linked to the performance of specific actions.

In the present study, we have examined the involvement of the STN in motor aspects of performance specifically regarding the visuospatial features of the reaching task. We found that a number of STN neurons displayed preferences for one location of the triggering stimulus and/or direction of its associated motor response, most of them preferring the location and/or direction opposite to the moving arm. In at least one monkey, this spatial preference was accompanied by a variation in the speed of the motor response to the stimulus. A link between STN neuronal activity and spatial information provided by stimuli eliciting movements in various directions has been reported in previous monkey studies (Georgopoulos et al., 1983; DeLong et al., 1985; Isoda and Hikosaka, 2008). However, it is unknown whether this spatial selectivity was related to the location of the trigger stimulus and/or the direction of the movement associated with that stimulus. Indeed, in these studies (including our own), the motor response was directed toward the spatial location of the trigger stimulus, so that it cannot be established whether the observed changes in activity were dependent on the "sensory" or "motor" constraints. Only a few studies have attempted to dissociate these two aspects while studying task-related neuronal activity in basal ganglia (Alexander and Crutcher, 1990; Ravel et al., 2006). However, in rats trained to make a constant motor reaction (i.e., cessation of lever pressing) to different spatial locations of a movement-triggering stimulus, no influence of location was reported on the activity of STN neurons (Lardeux et al., 2009).

#### **CHANGES IN STN NEURONAL ACTIVITY RELATED TO REWARD EXPECTATION DURING TASK PERFORMANCE**

We have found that STN neurons may exhibit persistent changes in firing throughout a time interval introduced between correct target contact and reward delivery in the reaching task. These findings are consistent with previous observations in monkeys that showed the influence of variations in reward timing on STN activity (Matsumura et al., 1992; Darbaky et al., 2005). In addition, we showed that monkeys were sensitive to the presence of a reward delay after target reaching, suggesting that they discounted the value of a reward when it was delayed in time. We found that a few neurons exhibited non-systematic differences in firing around the initiation of a movement leading to an immediate or a delayed reward. On the other hand, a relatively large number of neurons showed stronger changes in activity following reward delivered immediately after completion of movement, compared with delayed reward, suggesting that they may participate in the time-discounted encoding of reward value. In the present study, sustained changes in activity were extended when the target-reward interval was prolonged from 0.5 to 1 s, suggesting that this activity may be interpreted as reflecting a representation of outcomes which is crucial for the control of reward-guided behavior. Again, we cannot exclude completely the alternative explanation that the observed changes in neuronal activity may reflect preparation to consume the liquid (Roesch and Olson, 2003). The observed modulations in STN neuronal activity did not appear to be related to the simultaneously occurring mouth movements, suggesting that these modulations might be related more to reward expectation than to motor preparation, but this needs to be further clarified. In addition, our task manipulation did not allow a clear-cut dissociation between motivation and attention (Maunsell, 2004). Since the STN has been shown to be involved in attention (Baunez and Robbins, 1997, 1999), we cannot rule out a contribution of attentional processes to the neuronal changes reported here. Additional work is needed to clarify this issue.

#### **IS REWARD-RELATED ACTIVITY CONFINED TO SPECIFIC REGIONS OF THE STN?**

Based on anatomical studies of corticosubthalamic circuitry, it is thought that regions of the STN can be functionally delineated along distinct territories. Although neuroimaging data do not provide the spatial resolution required to examine these distinctions, recording studies at a single-neuron resolution level may be helpful to analyze the fine distribution of neurons displaying specific response properties. Previous studies have reported that neurons related to limb movements are located primarily in the dorsolateral STN region (Georgopoulos et al., 1983; DeLong et al., 1985; Wichmann et al., 1994), whereas neurons related to visuomotor functions were generally located more ventrally (Matsumura et al., 1992; Isoda and Hikosaka, 2008). Although anatomical connectivity defines an organization in subterritories (Groenewegen and Berendse, 1990), the functional evidence is less clear for the rodent STN for which researchers refer to task-related activities without attempting to parcel out the subdivisions (Teagarden and Rebec, 2007; Lardeux et al., 2009, 2013). Despite some limitations on the extent of the area over which we recorded STN neurons, our results based on histological examination in one monkey did not show regional differences that were particularly obvious in the pattern of distribution of task-related neurons. In particular, neurons sensitive to reward were scattered throughout the parts of the STN explored, without preferential location in the ventromedial part, which receives inputs from the orbitofrontal cortex and anterior cingulate cortex and is considered the « limbic » part of the STN in primates (Takada et al., 2001; Karachi et al., 2005; Haynes and Haber, 2013). This observation confirms what had been previously noted in our preliminary study (Darbaky et al., 2005).
Also, neurons sensitive to movement did not appear to be clustered in the dorsolateral part of the STN, which is connected to motor and premotor cortical areas, and corresponds to the « motor » part of the STN. Although the small number of neurons sampled did not allow us to state whether the present findings are representative of STN recordings in general, the presence of neurons sensitive to expectation of reward in the ventral part of the STN is consistent with the findings of Matsumura et al. (1992).

#### **FUNCTIONAL SIGNIFICANCE OF REWARD-RELATED ACTIVITY IN STN**

Although our findings add to the growing body of literature highlighting the role of the STN in motivation, it is not yet clear how reward signals found in the STN are used to influence behavioral output. One could speculate that changes in STN neuronal activity related to reward expectation or to the detection of the reward itself contribute to behaviors directed at obtaining the reward by maintaining the representation of an expected outcome and by monitoring the positive feedback which may serve to shape future behavior. An important feature of the reward signals generated by STN neurons in an instrumental task was their frequent combination with signals related to the movement. These combined changes in STN firing may reflect a mechanism linking the generation of motor behavior with the rewarding outcome. The specific function of STN in reward-guided behavior thus appears to be complex, and further investigations of changes in STN neuronal activity employing appropriately designed tasks are needed to understand how this nucleus contributes to motor and reward processing in the basal ganglia circuitry.

Recently, the STN has been considered to be a key component of the brain network which mediates behavioral inhibition, particularly under circumstances requiring the active suppression of inadequate movements when several conflicting actions compete (Frank et al., 2007). At the moment, it is unclear how this conception can be matched with reward signals generated by STN neurons such as those described here. However, even seemingly simple motor behaviors consist of a spatiotemporal organization of distinct motor components, and STN neurons may intervene to facilitate or suppress these components to complete the whole behavior. In this regard, combined increases and decreases in STN activity may reflect a mechanism that releases some actions from inhibition while maintaining inhibitory control over others, which is crucial for reward-oriented behavior.

#### **CONCLUSION**

In summary, the data gathered in the present study provide further evidence for a contribution of the STN to motivational control of behavior in the non-human primate. Detailed understanding of the specificity of reward processing within single STN neurons could lead to a better appreciation of the influence of this nucleus on motivated behaviors under normal and pathological conditions in humans. This is particularly relevant in view of the current interest for surgical therapy aimed at treating psychiatric disorders associated with impaired reward and motivational processes, including drug addiction.

#### **AUTHOR CONTRIBUTIONS**

Juan-Francisco Espinosa-Parrilla and Paul Apicella designed the study, collected the data, and performed the analyses. Paul Apicella wrote the manuscript with the help of Christelle Baunez.

#### **ACKNOWLEDGMENTS**

We thank E. Legallet for designing the computer programs, I. Balansard for assistance with surgery, and M. Deffains and S. Ravel for help with data collection. This work was supported by Centre National de la Recherche Scientifique. Juan-Francisco Espinosa-Parrilla was funded by a French ANR grant to Christelle Baunez.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 August 2013; accepted: 16 November 2013; published online: 17 December 2013.*

*Citation: Espinosa-Parrilla J-F, Baunez C and Apicella P (2013) Linking reward processing to behavioral output: motor and motivational integration in the primate subthalamic nucleus. Front. Comput. Neurosci. 7:175. doi: 10.3389/fncom.2013.00175 This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Espinosa-Parrilla, Baunez and Apicella. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Fronto-striatal gray matter contributions to discrimination learning in Parkinson's disease

## *Claire O'Callaghan<sup>1,2</sup>, Ahmed A. Moustafa<sup>3</sup>, Sanne de Wit<sup>4</sup>, James M. Shine<sup>5</sup>, Trevor W. Robbins<sup>6</sup>, Simon J. G. Lewis<sup>5</sup> and Michael Hornberger<sup>1,2,7,8</sup>\**

*<sup>1</sup> Neuroscience Research Australia, Sydney, NSW, Australia*


*<sup>5</sup> Parkinson's Disease Clinic, Brain and Mind Research Institute, University of Sydney, Sydney, NSW, Australia*


*<sup>8</sup> Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Carol Seger, Colorado State University, USA Todd Maddox, University of Texas Austin, USA*

#### *\*Correspondence:*

*Michael Hornberger, Neuroscience Research Australia, Cnr Barker and Easy Street, Randwick, Sydney, NSW 2031, Australia e-mail: m.hornberger@neura.edu.au*

Discrimination learning deficits in Parkinson's disease (PD) have been well-established. Using both behavioral patient studies and computational approaches, these deficits have typically been attributed to dopamine imbalance across the basal ganglia. However, this explanation of impaired learning in PD does not account for the possible contribution of other pathological changes that occur in the disease process, importantly including gray matter loss. To address this gap in the literature, the current study explored the relationship between fronto-striatal gray matter atrophy and learning in PD. We employed a discrimination learning task and computational modeling in order to assess learning rates in non-demented PD patients. Behaviorally, we confirmed that learning rates were reduced in patients relative to controls. Furthermore, voxel-based morphometry imaging analysis demonstrated that this learning impairment was directly related to gray matter loss in discrete fronto-striatal regions (specifically, the ventromedial prefrontal cortex, inferior frontal gyrus and nucleus accumbens). These findings suggest that dopaminergic imbalance may not be the sole determinant of discrimination learning deficits in PD, and highlight the importance of factoring in the broader pathological changes when constructing models of learning in PD.

**Keywords: Parkinson's disease, discrimination learning, goal-directed learning, computational modeling, voxel-based morphometry, fronto-striatal**

#### **INTRODUCTION**

Parkinson's disease (PD) is a neurodegenerative condition characterized by hallmark motor disturbances, with its primary neuropathology in the nigrostriatal pathway. This leads to severe dopamine depletion in the dorsal striatum, while the ventral striatum is relatively preserved in the earlier disease stages (Jellinger, 2001). In PD, both the progressive dopamine depletion in the basal ganglia and the concurrent beneficial and deleterious effects of dopamine replacement medications, have been associated with a range of distinct learning impairments (for reviews, see Price et al., 2009; Foerde and Shohamy, 2011b). These dopamine dependent learning deficits in PD have been informative in the development of theoretical accounts of learning function and have provided important advances and testable predictions for computational explanations of learning (Frank, 2005). In particular, PD has been associated with acquisition deficits in feedback-based discrimination learning (Myers et al., 2003; de Wit et al., 2011), which have also been described via computational approaches (Moustafa et al., 2010).

Feedback-based and trial-and-error learning is presumed to be mediated by relative patterns of tonic vs. phasic dopamine activity occurring in response to environmental reinforcers (Schultz, 2002; Bromberg-Martin et al., 2010). Indeed, current accounts of discrimination learning in PD have been derived through ON- vs. OFF-medication patient studies and through computational models, which have established a role for basal ganglia dopamine imbalance as a crucial factor underpinning the feedback-based learning deficits (Frank et al., 2004; Shohamy et al., 2006). Whilst such explanations of learning deficits based on dopaminergic imbalance do accord with the biological characteristics of PD, these theories have not addressed the potential contributions of other prevalent pathological effects in PD. For example, in addition to the characteristic dopamine depletion, PD is also associated with gray matter loss and reduced white matter integrity (Duncan et al., 2013). Significantly, regions of gray matter loss in PD involve systems that are implicated in a range of higher level cognitive functions (including learning), and it is only more recently that direct associations between volumetric reductions and specific cognitive deficits have been confirmed in early stage, non-demented PD (Filoteo et al., 2013; O'Callaghan et al., 2013).

Given the known volumetric brain changes in PD and the possibility that they may directly affect learning processes, it is now vital to explore this relationship in order to inform future learning theories and computational approaches that rely on PD as a model. In the current study, we directly examined this issue by combining voxel-based morphometry analysis with a computational modeling technique in order to determine how fronto-striatal gray matter reductions relate to acquisition efficiency on a discrimination learning task. We hypothesized that PD patients would show impaired learning acquisition rates and that these impairments would be associated with volumetric reductions in fronto-striatal regions that are crucial for feedback-based learning and reward processing.

*<sup>2</sup> Faculty of Medicine, School of Medical Sciences, University of New South Wales, Sydney, NSW, Australia*

## **MATERIALS AND METHODS**

#### **CASE SELECTION**

Seventeen non-demented PD patients were recruited from the Brain and Mind Institute Parkinson's Disease Research Clinic; all satisfied UKPDS Brain Bank criteria for diagnosis of PD (Gibb and Lees, 1988) and were between Hoehn and Yahr stages I and III (Hoehn and Yahr, 1967). Motor score from the Unified Parkinson's Disease Rating Scale (UPDRS-III) (Goetz et al., 2008) is also reported. One patient was untreated; three were on levodopa monotherapy and two were taking levodopa plus an adjuvant; nine patients were on levodopa plus a dopamine agonist, and in this group four were also taking an adjuvant and one was taking a monoamine oxidase inhibitor; one patient was on a dopamine agonist plus a monoamine oxidase inhibitor and one was taking a monoamine oxidase inhibitor only. Treated patients performed behavioral testing in the ON state, having taken their usual medications. L-dopa daily dose equivalents (DDE mg/day) were calculated for treated patients. Patients with overt clinical depression were not included in the study and a measure of affective disturbance was obtained (Beck Depression Inventory-II; BDI-II, Beck et al., 1996). Eleven age- and education-matched healthy controls were selected from a volunteer panel. See **Table 1** for demographic details and clinical characteristics.

The research study was approved by the Human Ethics Committees of the Central and South Eastern Sydney Area Health Services and the Universities of Sydney and New South Wales, and complies with the statement on human experimentation issued by the National Health and Medical Research Council of Australia.

#### **NEUROPSYCHOLOGICAL ASSESSMENT**

All patients and controls were administered the Mini Mental State Examination (MMSE; Folstein et al., 1975) to determine their overall cognitive functioning. For detailed measurement of executive function, patients and controls underwent a battery of tests including Verbal Fluency [measured by the number of words produced in 60 s, beginning with F, A, and S (Benton et al., 1994)]; the Trail-Making test (time B-A) to assess speeded set-shifting (Reitan and Wolfson, 1985); and a Digit Span task, with digits repeated in their original order (forwards) and in reverse order (backwards) (Wechsler, 1997) to assess attention span and working memory.

#### **DISCRIMINATION LEARNING TASK**

We administered a discrimination learning task developed by de Wit and colleagues, which was an abbreviated version of a more extensive instrumental learning measure described by de Wit et al. (2007). The task was computer based and programmed using Visual Basic 6.0, with keyboard response keys *z* and *m* programmed to register a *left* or *right* response.

**Table 1 | Mean (SD) values for Controls and PD patients on demographics, clinical characteristics and discrimination learning measures.**

*n.s., non significant; \*p < 0.05; \*\*p < 0.001. F-values indicate significant differences across groups; otherwise, due to unequal variance, χ<sup>2</sup> indicates differences across groups<sup>a</sup>. MMSE, Mini-Mental State Examination; UPDRS III, Motor score from the Unified Parkinson's Disease Rating Scale; BDI-II, Beck Depression Inventory II.*

Discrimination learning tasks involve a discriminative stimulus that signals whether or not a certain response will lead to a particular outcome; stimuli are presumed to have acquired discriminative control over instrumental performance when correct responding occurs in the presence of a given stimulus (i.e., when the stimulus: response-outcome contingency is acquired) (Bouton, 2007). In the current discrimination learning task, for each trial the discriminative stimulus consisted of a colored icon depicting a piece of fruit on the front of a box. There were six possible fruits that could be pictured on the outside of the box (i.e., strawberry, lemon, grape, kiwi, melon, and orange). Subjects were required to make either a *left* or *right* response in order to "open" the box and obtain the outcome/reward inside (the outcome being a different fruit, i.e., coconut, pear, pineapple, cherry, banana, and apple). Each of the six stimulus fruits was associated with a particular correct response (i.e., *left* or *right*) that would result in obtaining the reward/outcome. These contingencies were kept constant; for example, a *left* response to the strawberry stimulus would always result in the box opening to reveal an outcome/reward, whereas if a *right* response was made to the strawberry stimulus, the box would open to reveal nothing inside. Additional feedback was provided: the opened box revealing the reward was paired with a positive sound and points displayed on the screen, whereas the opened box with nothing inside was paired with a negative sound effect. The initial fruit stimulus remained on the screen until subjects made a response, and faster correct responses earned more points (in the range from 1 to 5). The outcome fruit was presented for 1 s, and inter-trial intervals were fixed at 1.5 s.

Subjects were instructed at the outset of the task that they would need to determine the correct response for each stimulus fruit via a trial and error process. It was emphasized that these contingencies would not change throughout the trials, so that it would be possible for them to learn these stimulus-response associations. They were also encouraged to memorise the stimulus: response-outcome associations, as they would be questioned on them at the end.

Each subject completed 96 trials, comprising eight 12-trial blocks during which each of the six possible stimulus-response pairs was presented twice in a randomized order; three of the stimulus fruits were associated with a correct *left* response and the other three were associated with a correct *right* response. Across subjects, the particular fruits that served as the stimulus and those that served as the outcome were counterbalanced. From the discrimination learning task, we derived a binary outcome measure of either 1 or 0 for each trial (1 indicating a correct response for that trial, 0 an incorrect response). Finally, after completing the trials, subjects were asked to fill in pencil and paper questionnaires that probed explicit knowledge of the stimulus: response-outcome contingencies. These questionnaires were divided into three parts (each with six items), assessing knowledge of: (1) stimulus-response; (2) response-outcome; and (3) stimulus-outcome contingencies. In part (1), subjects were shown pictures of each stimulus fruit one at a time and they were asked to verbally indicate whether a *left* or *right* response was associated with obtaining a reward for each stimulus. A similar procedure was followed in part (2), as subjects were shown each reward/outcome and asked to indicate whether a *left* or *right* response had been necessary to successfully achieve that reward. In part (3), subjects were shown each stimulus fruit alongside an array of all possible reward fruits and they selected the reward that had been paired with each particular stimulus.
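The block structure described above (eight blocks of 12 trials, each stimulus shown twice per block, three stimuli per response side) can be illustrated with a short sketch. This is not the original Visual Basic task code: the function names, the seed argument and the particular assignment of fruits to sides are hypothetical and serve only to make the design concrete.

```python
import random

# Stimulus fruits named in the task description.
STIMULI = ["strawberry", "lemon", "grape", "kiwi", "melon", "orange"]


def make_trial_sequence(seed=None, n_blocks=8, repeats_per_block=2):
    """Build a list of (block, stimulus, correct_response) tuples."""
    rng = random.Random(seed)
    # Assign three stimuli to 'left' and three to 'right'.
    # Which fruit goes with which side is a hypothetical choice here.
    shuffled = STIMULI[:]
    rng.shuffle(shuffled)
    correct = {s: ("left" if k < 3 else "right") for k, s in enumerate(shuffled)}
    trials = []
    for block in range(1, n_blocks + 1):
        # Each stimulus appears twice per block, in randomized order.
        block_stimuli = STIMULI * repeats_per_block
        rng.shuffle(block_stimuli)
        trials.extend((block, s, correct[s]) for s in block_stimuli)
    return trials


def score_response(correct_response, response):
    """Binary outcome measure: 1 for a correct response, 0 otherwise."""
    return 1 if response == correct_response else 0


trials = make_trial_sequence(seed=0)
```

The per-trial scores produced by `score_response` correspond to the binary outcome measure that the text feeds into the computational model.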

#### **COMPUTATIONAL MODEL**

Given the insufficiency of classical statistical methods in extracting learning rates from trial-by-trial responses, we applied the reinforcement Q-learning model to the outcome measures generated from the discrimination learning task, for each subject's pattern of correct and incorrect responses across the 96 trials (Sutton and Barto, 1998). The input of this model is a trial-by-trial sequence of responses for each subject, while the output is the learning rate and exploration parameter values, which cannot be obtained from regular statistical analysis of behavioral data. Previous research has used similar computational models to fit model parameter values for each subject in genetic (Frank et al., 2007) and patient studies (Gold et al., 2012). The rationale for applying the Q-learning model to the behavioral data is to disentangle each subject's performance into different components, and also to determine which model parameters can better account for variations in behavioral performance across different groups. Here, we fitted our behavioral data with the reinforcement Q-learning model (Watkins and Dayan, 1992; Sutton and Barto, 1998), following Frank et al. (2007).

By using the reinforcement Q-learning model, we fit each individual subject's trial-by-trial data, which yields two parameter values that correspond to the subject's learning rate and exploration/exploitation bias. The learning rate parameter modulates the degree to which feedback on the current trial is used to adjust expectations for future trials. The exploration parameter indicates whether the subject is more likely to choose the same or a different response as on previous trials with the same stimulus. A small exploration/exploitation parameter indicates exploitation (i.e., increased likelihood that subjects will choose the same response as previously made, when presented with the same stimulus), and a large value indicates exploration (i.e., increased likelihood they will choose a different response when presented with the same stimulus). In principle, impaired feedback learning can occur because of a small learning rate or because of a reduced tendency to explore alternative responses in favor of exploiting previously erroneous response strategies.

Specifically, we compute a weight (*W*) value for selecting each stimulus *i* during trial *t*, such that the value of the chosen stimulus is modified by reinforcement feedback:

$$PE(t) = US(t) - W_i(t)$$

where *PE*(*t*) is the prediction error at time *t*; *US*(*t*) is the feedback presented at time *t*, and is equal to 1 for positive and 0 for negative feedback. *W*-values are computed using the following equation:

$$W_i(t+1) = W_i(t) + \alpha PE(t)$$

where α is the learning rate (for more details, see Frank et al., 2007).
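As a minimal sketch of this update rule, the function below applies one prediction-error step to a single weight. The function name and the starting weight of 0 are assumptions made for illustration; the text does not state how weights were initialised.

```python
def update_weight(w, feedback, alpha):
    """Return the updated weight after one trial.

    w        -- current weight of the chosen option
    feedback -- 1 for positive feedback, 0 for negative feedback
    alpha    -- learning rate
    """
    # Prediction error: feedback received minus the current weight.
    prediction_error = feedback - w
    # Move the weight toward the feedback by a fraction alpha.
    return w + alpha * prediction_error


# Three consecutive rewarded trials, starting from a weight of 0
# (the initial value is an assumption, not taken from the text).
w = 0.0
history = []
for _ in range(3):
    w = update_weight(w, feedback=1, alpha=0.5)
    history.append(w)
```

With a larger learning rate the weight approaches the feedback value in fewer trials, which is the sense in which α indexes acquisition efficiency.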

We have modeled choice by using a softmax logistic function, with inverse gain (exploration) parameter β, such that the probability of choosing A over B was computed as:

$$P_A(t) = \frac{e^{W_A(t)/\beta}}{e^{W_A(t)/\beta} + e^{W_B(t)/\beta}}$$

Each participant's trial-by-trial choices were fitted with two free parameters, α and β, which were selected to maximize fit to participant's sequence of choices in the task. β is an inverse gain parameter and reflects the participant's tendency to either exploit (i.e., to choose the response with the currently highest *W*-value) or explore (i.e., to randomly choose a category).
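The effect of β on choice can be seen in a small sketch of the softmax rule. The function name and the example weights are hypothetical; subtracting the larger weight before exponentiating is a standard numerical safeguard that leaves the probability unchanged.

```python
import math


def choice_probability(w_a, w_b, beta):
    """Probability of choosing A over B under the softmax rule."""
    # Shift both weights by the larger one so the exponentials cannot
    # overflow; the ratio, and hence the probability, is unaffected.
    m = max(w_a, w_b)
    exp_a = math.exp((w_a - m) / beta)
    exp_b = math.exp((w_b - m) / beta)
    return exp_a / (exp_a + exp_b)


p_equal = choice_probability(0.5, 0.5, beta=0.1)     # equal weights
p_exploit = choice_probability(0.8, 0.2, beta=0.05)  # small beta
p_explore = choice_probability(0.8, 0.2, beta=1.0)   # large beta
```

For the same pair of weights, a small β pushes the probability of the higher-valued option toward 1 (exploitation), whereas a large β keeps it closer to 0.5 (exploration).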

We then fitted the model to each participant's data by searching through the space of each of these two parameters from 0 to 1 with a step size of 0.01, maximizing the log likelihood estimate (LLE) across trials:

$$\text{LLE} = \log\left(\prod_t P(t)\right)$$

where t is trial number (for a total of 96 trials). For each participant, the best fitting parameter values are those associated with maximum LLE. Equivalently, maximum LLE is the most predictive of the participant's responses in the task. In this model, the best fitting parameter values to each participant's behavioral data accommodate trial-by-trial adaptations in response to feedback given based on participants' choices. In addition, we predict that these values will explain differences in learning efficiency between patients and controls.
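The fitting procedure can be sketched as an exhaustive grid search. Several details below are assumptions rather than facts from the text: weights are kept per stimulus-response pair and start at 0, the grid starts at one step above 0 (β = 0 would divide by zero), and the toy data are invented for illustration.

```python
import math


def log_likelihood(trials, alpha, beta, n_stimuli=6):
    """Log likelihood of a sequence of (stimulus, response, feedback) trials.

    stimulus is an index in 0..n_stimuli-1, response is 0 (left) or
    1 (right), feedback is 1 (reward) or 0 (no reward).
    """
    # One weight per stimulus-response pair, initialised to 0 (assumption).
    w = [[0.0, 0.0] for _ in range(n_stimuli)]
    total = 0.0
    for stimulus, response, feedback in trials:
        w_chosen = w[stimulus][response]
        w_other = w[stimulus][1 - response]
        # log P(chosen) = -log(1 + exp((w_other - w_chosen) / beta)),
        # evaluated in a numerically stable way.
        diff = (w_other - w_chosen) / beta
        if diff > 0:
            total -= diff + math.log1p(math.exp(-diff))
        else:
            total -= math.log1p(math.exp(diff))
        # Prediction-error update of the chosen weight.
        w[stimulus][response] = w_chosen + alpha * (feedback - w_chosen)
    return total


def fit_grid(trials, step=0.01):
    """Return (best LLE, alpha, beta) over a grid from step to 1."""
    n = int(round(1 / step))
    grid = [round(k * step, 10) for k in range(1, n + 1)]
    best = (-math.inf, None, None)
    for alpha in grid:
        for beta in grid:
            lle = log_likelihood(trials, alpha, beta)
            if lle > best[0]:
                best = (lle, alpha, beta)
    return best


# Toy data (hypothetical): one stimulus, one error, then 11 correct responses.
toy_trials = [(0, 1, 0)] + [(0, 0, 1)] * 11
best_lle, best_alpha, best_beta = fit_grid(toy_trials, step=0.05)
```

Summing log probabilities trial by trial is equivalent to taking the log of the product in the LLE equation, and avoids numerical underflow over 96 trials.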

Finally, to validate our model we compared our results with a random responder model. Specifically, we calculated the pseudo-*R*<sup>2</sup> measure, which is (LLE-*r*)/*r*, where *r* is the log likelihood of the data under a model of purely random choices, in which *p* = 0.5 for all trials (Camerer and Ho, 1999; Daw et al., 2006). The resulting pseudo-*R*<sup>2</sup> statistic reveals how well the model fits the data compared to a model predicting chance performance and is independent of the number of trials to be fit in each set (see Frank et al., 2007, for discussion).
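A short sketch of the comparison against random responding is given below. It uses the McFadden-style form 1 − LLE/*r*, which has the same magnitude as (LLE − *r*)/*r* and is positive when the fitted model beats chance; treating this as the sign convention behind the reported values is an assumption. The function name and the example LLE of −50 are hypothetical.

```python
import math


def pseudo_r2(lle, n_trials, p_chance=0.5):
    """Pseudo-R^2 of a fitted model relative to random responding."""
    # Log likelihood of the data if every choice had probability p_chance.
    r = n_trials * math.log(p_chance)
    # 0 means no better than chance; values toward 1 mean a better fit.
    return 1.0 - lle / r


chance_fit = pseudo_r2(96 * math.log(0.5), 96)  # model identical to chance
better_fit = pseudo_r2(-50.0, 96)               # hypothetical fitted LLE
```

Because both log likelihoods are negative, a fitted LLE closer to zero than *r* yields a value between 0 and 1.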

#### **BEHAVIORAL ANALYSES**

Data were analyzed using SPSS 19.0 (SPSS Inc., Chicago, Ill., USA). Parametric demographic and neuropsychological data were compared across the groups via One-Way ANOVAs followed by Tukey *post-hoc* tests. A priori, demographic and learning variables were plotted and checked for normality of distribution by Kolmogorov-Smirnov tests. Variables showing non-parametric distribution were analyzed via Chi-square, Kruskal-Wallis and Mann-Whitney *U*-tests. A repeated measures ANOVA with Bonferroni *post-hoc* tests was used to explore group differences in learning accuracy across the eight blocks, with group (control vs. patient) as the between-subjects variable and block (blocks 1–8) as the within-subjects variable.

## **IMAGING ACQUISITION**

All patients and controls underwent the same imaging protocol with whole-brain T1 images acquired using 3T Philips MRI scanners with standard quadrature head coil (8 channels). The 3D T1-weighted sequences were acquired as follows: coronal orientation, matrix 256 × 256, 200 slices, 1 × 1 mm<sup>2</sup> in-plane resolution, slice thickness 1 mm, TE/TR = 2.6/5.8 ms.

#### **VOXEL-BASED MORPHOMETRY (VBM) ANALYSIS**

3D T1-weighted sequences were analyzed with FSL-VBM, a voxel-based morphometry analysis (Ashburner and Friston, 2000; Good et al., 2001) which is part of the FSL software package http://www.fmrib.ox.ac.uk/fsl/fslvbm/index.html (Smith et al., 2004). First, tissue segmentation was carried out using FMRIB's Automatic Segmentation Tool (FAST) (Zhang et al., 2001) from brain extracted images. The resulting gray matter partial volume maps were then aligned to the Montreal Neurological Institute standard space (MNI152) using non-linear registration with FNIRT (Andersson et al., 2007a,b), which uses a b-spline representation of the registration warp field (Rueckert et al., 1999). The registered partial volume maps were then modulated (to correct for local expansion or contraction) by dividing them by the Jacobian of the warp field. The modulated images were then smoothed with an isotropic Gaussian kernel with a standard deviation of 3 mm (FWHM: 8 mm). A region-of-interest (ROI) mask for prefrontal and striatal brain regions was created by using the Harvard-Oxford cortical and subcortical structural atlas. The atlas regions that comprise the entire prefrontal cortex and striatum were included in the mask: frontal pole, superior frontal gyrus, middle frontal gyrus, inferior frontal gyrus, frontal medial cortex, subcallosal cortex, paracingulate gyrus, cingulate gyrus (anterior division), frontal orbital cortex, caudate, putamen, and nucleus accumbens. Finally, a voxelwise general linear model (GLM) was applied and permutation-based non-parametric testing was used to form clusters with the Threshold-Free Cluster Enhancement (TFCE) method (Smith and Nichols, 2009), tested for significance at *p* < 0.05, corrected for multiple comparisons via Family-wise Error (FWE) correction across space, unless otherwise stated.
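For readers converting between the two ways of specifying a Gaussian smoothing kernel, the standard relation is FWHM = 2√(2 ln 2) σ ≈ 2.355 σ. The sketch below applies it to the 3 mm standard deviation given above; the function name is hypothetical. The relation gives roughly 7.1 mm for σ = 3 mm, so the FWHM quoted in the text is best read as an approximate figure.

```python
import math


def sigma_to_fwhm(sigma_mm):
    """Convert a Gaussian standard deviation to full width at half maximum."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma_mm


fwhm_mm = sigma_to_fwhm(3.0)
```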

## **RESULTS**

#### **DEMOGRAPHICS, CLINICAL CHARACTERISTICS AND NEUROPSYCHOLOGICAL ASSESSMENT**

Demographics and general cognitive scores can be seen in **Table 1**. Participant groups did not differ in terms of age, education or MMSE score (*p*'s > 0.1). Patients and controls did not differ in their Digit Span forwards score (*p* > 0.6), but patients were impaired relative to Controls for Digit Span backwards (*p* < 0.05). Groups were equivalent for Letter Fluency scores (*p* > 0.2) and although groups did not differ significantly on Trail Making B-A scores, there was a strong trend toward worse performance in the patients (*p* = 0.06). See **Table 1**.

#### **LEARNING MEASURES**

Overall accuracy scores on the discrimination learning task are shown in **Table 1** and learning accuracy across the eight blocks is shown in **Figure 1**. Overall accuracy across the 96 trials was not significantly different between the groups (*p* > 0.1). Results of the repeated measures ANOVA showed that there was no significant main effect of group [*F*(1, 26) = 2.6, *p* > 0.1]. Mauchly's test indicated that the assumption of sphericity had been violated [χ<sup>2</sup>(27) = 71.0, *p* < 0.001]; therefore, degrees of freedom were corrected using Huynh-Feldt estimates of sphericity (ε = 0.550). The results show a significant main effect for block [*F*(4.8, 124.2) = 20.3, *p* < 0.001], which reflected that, irrespective of group, accuracy in blocks 6, 7, and 8 was significantly higher than in blocks 1, 2, and 3 (*p*-values < 0.05), accuracy in block 5 was significantly higher than in blocks 1 and 2 (*p* < 0.05), and accuracy in block 4 was higher than accuracy in block 1 (*p* < 0.002). There was no significant group by block interaction [*F*(4.8, 124.2) = 1.6, *p* > 0.1]. *Post-hoc* between-group comparisons revealed that controls and PD patients differed significantly only in their accuracy in block 7, with controls having a higher accuracy score (*p* < 0.05); no significant differences were observed in other blocks (*p* > 0.05). Within-group *post-hoc* analysis showed that controls had consistent significant differences in accuracy between early and late blocks, with blocks 4, 5, 6, 7, and 8 all having higher accuracy than both blocks 1 and 2 (*p*-values < 0.05). PD patients showed a slightly less consistent pattern, with accuracy in blocks 5, 6, 7, and 8 higher than in block 1 (but not block 2) (*p*-values < 0.05); all other block accuracies were equivalent, except for blocks 7 and 8 being significantly higher than block 3 (*p*-values < 0.05).

Results of the learning rate and exploration parameters for the discrimination learning task, as derived from the computational model, are also shown in **Table 1**. Exploration parameters did not differ significantly between the groups (*p* > 0.3) and the small value of the parameter in both patients and controls suggested minimal exploration, which would be predicted based on the nature of the task. Learning Rate for the PD patients was significantly reduced relative to controls (*p* = 0.001) and these Learning Rate values were further analyzed in the VBM analysis. Results from the random responder model revealed the mean and standard deviation of pseudo-*R*<sup>2</sup> were 0.2901 and 0.173, respectively. This was significantly larger than zero, indicating our model performs better than chance at fitting individuals' data.

Participant groups did not differ in terms of explicit knowledge of Stimulus-Response-Outcome contingencies. The following mean (standard deviation) results on the three questionnaire sections were achieved, each section with a possible maximum score of 6 (i.e., 1 point per item). Stimulus-Response accuracy for controls was 5.6 (0.05) and for PD patients 5.3 (1.6); Response-Outcome accuracy for controls was 5.0 (1.2) and PD patients 4.6 (1.7); Stimulus-Outcome for controls was 3.5 (1.7) and PD patients 3.0 (2.2), with all *p*-values > 0.5. In a correlation analysis, none of the PD clinical variables (i.e., disease duration, Hoehn and Yahr stage, UPDRS III, DDE mg/day, BDI score) or the digits backward score, showed a significant relationship with the Learning Rate measure (*p*'s > 0.1).

#### **VBM ANALYSIS**

The PD group was initially contrasted with controls to reveal overall patterns of brain atrophy in the fronto-striatal mask. PD patients showed gray matter atrophy bilaterally in the frontal orbital cortex and subcallosal cortex, extending back to the left ventral striatal (nucleus accumbens) territory; as well as in the inferior frontal gyri bilaterally (see Supplementary Table 1).

Learning rate was then entered as a covariate in the design matrix of the VBM analysis. For PD patients, Learning Rate score covaried with gray matter atrophy in the frontal medial cortex/frontal pole, the right inferior frontal gyrus and the left subcallosal cortex/left nucleus accumbens (see **Table 2** and **Figure 2**).

**Table 2 | Region of interest Voxel-based morphometry (VBM) results showing areas of significant gray matter intensity decrease that covary with learning measures.**

*All results uncorrected at p* < *0.01; only clusters with at least 40 contiguous voxels included.*

**FIGURE 2 | VBM analysis showing the frontal and striatal regions that correlated with elevated learning rates in the patients in (A) frontal medial cortex (B) right inferior frontal gyrus (C) subcallosal/left nucleus accumbens.** Clusters are overlaid on the MNI standard brain (*t* > 2.50). Colored voxels show regions which were significant in the analyses for *p* < 0.01 uncorrected and a cluster threshold of 40 contiguous voxels.

Finally, a partial correlation analysis was used to explore whether common damage to the ventromedial prefrontal cortex, right inferior frontal gyrus and left subcallosal cortex/nucleus accumbens explained the significant correlations with Learning Rate. The ventromedial prefrontal region still correlated significantly with Learning Rate (*p* < 0.05) when right inferior frontal gyrus and left subcallosal cortex/nucleus accumbens were taken into account. In contrast, neither right inferior frontal gyrus nor left subcallosal cortex/nucleus accumbens regions correlated significantly with Learning Rate when atrophy in the other regions was partialled out (*p-*values ≥ 0.2).
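The logic of partialling out atrophy in the other regions can be sketched with the standard recursive formula for partial correlation. This is a minimal illustration only: the function names and the toy data in the usage note are hypothetical, and the software actually used for the analysis is not specified here.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, controls):
    """Correlation of x and y with every variable in `controls` partialled out.

    Uses the recursive formula
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied once per control variable.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For example, `partial_corr(learning_rate, vmpfc, [ifg, nacc])` (all four being hypothetical per-patient lists) would estimate the association between learning rate and one region after removing variance shared with the other two.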

### **DISCUSSION**

By employing a combined approach of computational modeling and VBM analysis, we show that PD patients have a learning acquisition deficit that is associated with volumetric reductions in discrete fronto-striatal regions. This is the first time that such learning deficits in PD have been probed via structural imaging techniques and our findings fit well with the broader learning literature, whilst highlighting a novel approach in order to further characterize discrimination learning in PD.

The nature of learning assessed in the current study reflects the formation of stimulus-response associations, which are learnt through incorporating feedback via a trial-and-error approach. Impaired learning acquisition rates on discrimination tasks have been demonstrated behaviorally in PD patients (Czernecki et al., 2002; Myers et al., 2003; de Wit et al., 2011; Shiner et al., 2012) and also in neurocomputational models of PD (Moustafa et al., 2010). Furthermore, Shohamy and colleagues (2004, 2006) have shown that in PD the feedback learning deficit is relatively specific, as patients are impaired when required to learn associations on the basis of feedback, but equivalent to controls when observational learning of the same associations was required.

Our results further confirm a feedback-based learning acquisition deficit in mild, non-demented PD. Patients and controls were equivalent in their exploration parameters, with both showing a minimal amount of exploration. This would be expected given the nature of the task wherein subjects are not encouraged to modify their responses as the stimulus-response-outcome contingencies do not change. Nevertheless, it further validates the utility of our model that it was able to identify this effect. Results from the analysis of learning accuracy across blocks indicated that deficient learning in the PD patients was mostly driven by poorer performance later in the task. We did not find a difference in explicit knowledge of stimulus-response-outcome contingencies, suggesting that despite a deficient learning rate the PD patients were ultimately able to attain a good level of knowledge of these contingencies (see also de Wit et al., 2011). The acquisition impairment did not correlate with any clinical disease variables; nor was a correlational relationship evident between learning rate and working memory (as assessed via the digit span backwards task), which was found to be mildly impaired. Importantly, on other executive domains assessed in the current study, the PD patients' performance was equivalent to controls, which supports the notion of a discrete discrimination learning deficit in this patient group.

The previous findings relating deficient feedback-based learning in PD to dopamine dysfunction have been somewhat equivocal, as comparisons between patients ON vs. OFF medication have found that performance on a variety of learning tasks is impaired in both scenarios (Czernecki et al., 2002; Ell et al., 2010; Moustafa and Gluck, 2011), or that performance differs based on task demands (Shohamy et al., 2006) or valence of feedback signals (Frank et al., 2004). A number of studies using feedback-based category learning in PD have suggested that respective demands on selective attention vs. working memory, which are differentially affected by dopamine therapy, may determine learning performance (Filoteo et al., 2005, 2007). Given that in the OFF state patients suffer severe depletion in dorsal striatum and its projection targets, whilst the ON state is associated with restoration of those levels and the possibility of dopaminergic "overdose" in ventral striatum and limbic regions (Cools et al., 2001), differential effects on discrimination learning would be expected. Nonetheless, the finding of similar effects arising from two ostensibly disparate conditions has been explained with respect to the "relative" rather than "absolute" levels of dopamine, as a reduced dynamic range of phasic dopamine activity can result from both the ON and OFF states (Frank, 2005).

In contrast to previous studies that have characterized discrimination learning deficits in PD with respect to dopaminergic dysfunction, our current results define these deficits with respect to the possible structural abnormalities that may be contributory. In addition to dopamine depletion, PD is also associated with gray matter loss and synaptic denervation in frontostriatal regions essential to broad aspects of learning and feedback processing, including the striatum (Rosenberg-Katz et al., 2013), medial temporal regions (Filoteo et al., 2013) and ventromedial prefrontal cortex (O'Callaghan et al., 2013). More specifically, prefrontal volume loss has been identified in nondemented PD, in comparison to healthy controls (Song et al., 2011; Melzer et al., 2012). Our findings reveal that discrete fronto-striatal regions, namely ventromedial prefrontal cortex, right inferior frontal gyrus and nucleus accumbens, are directly associated with acquisition deficits during feedback-based discrimination learning. The presence of underlying gray matter loss contributing to learning deficits may to some degree explain why discrimination learning can be affected both ON and OFF medication, and thus indicate that dopamine imbalance may not be the sole explanation for learning deficits in PD.

Our findings potentially shed light on previous reports that disease severity in PD is associated with specific learning impairments (Owen et al., 1993; Swainson et al., 2006). In particular, Swainson et al. (2006) found that early-stage, unmedicated patients were not impaired on a complex discrimination learning task; whilst early-stage, medicated patients *were* impaired on the task, their performance was mediated by deficient perceptual categorization of the complex stimuli, rather than a learning deficit *per se*. In contrast, only patients with severe, medicated PD showed impaired learning in the absence of perceptual categorization deficits. This raises the possibility that some factor other than inappropriate dopamine levels may intervene in later-stage PD to produce learning impairments on the task. Interestingly, the comparison groups of Huntington's disease and frontal lobe lesion patients included in the study showed the same pattern of intact perceptual categorization, but impaired learning, suggesting that more extensive fronto-striatal dysfunction may underpin the learning impairments. Taken together with our findings, it may be that fronto-striatal atrophy is a contributing factor to those learning impairments seen in PD with disease progression.

The possibility that fronto-striatal atrophy can mediate learning performance is also relevant to previous studies that have identified considerable variation within their PD cohorts. For example, using a rule-based category learning task, Ashby et al. (2003) found that PD patients were impaired at the group level; however, this effect was driven by impaired performance in only half of the patients, with the remainder performing equivalently to controls. The authors interpreted this as evidence of distinctive PD sub-groups. Indeed, differences in the clinical phenotypes of PD are well recognized (Lewis et al., 2005) and evidence is accumulating that the presence of more widespread fronto-subcortical atrophy may be characteristic of certain sub-groups (Feldmann et al., 2008; Melzer et al., 2012; Rosenberg-Katz et al., 2013). An admixture of PD patients with and without prefrontal volume loss may contribute to within-group variation in learning performance.

Results from our partial correlation analysis suggest that atrophy in the ventromedial prefrontal region may be driving the association with acquisition deficits. Although previous research using functional MRI in healthy controls has identified striatal activity as crucial during the acquisition phase of learning tasks (Pessiglione et al., 2006; Foerde and Shohamy, 2011a), others have shown ventromedial prefrontal cortex activity during learning acquisition (de Wit et al., 2009). Whereas the gradual learning of stimulus-response associations is presumed to reflect "habit" learning that is mediated by basal ganglia dopamine signals (Shohamy et al., 2008), "goal-directed" learning, which involves a focus on stimulus-response-outcome associations, has been linked to medial prefrontal regions (Balleine and O'Doherty, 2009). The interplay between the habitual and goal-directed modes can be explained by the "dual-systems" account, whereby instrumental learning can be supported by either modality (Dickinson and Balleine, 1994; de Wit and Dickinson, 2009). In line with the possibility that acquisition of instrumental discriminations is partly supported by goal-directed learning, de Wit et al. (2009) showed that engagement of the ventromedial prefrontal cortex during discrimination learning was predictive of goal-directed performance during a subsequent test phase. During that "instructed outcome-devaluation" test phase, participants were told that some of the fruit outcomes were no longer worth points. Participants with relatively strong engagement of the ventromedial prefrontal cortex during learning were better able to direct their responses toward the still-valuable outcomes and away from the devalued ones.
More recently, individual differences in the strength of the white-matter pathway between the ventromedial prefrontal cortex and caudate have also been implicated in goal-directed control, whilst connectivity between the posterior putamen and premotor cortex has been related to habit learning (de Wit et al., 2012). Given these previous investigations of the role of the ventromedial prefrontal cortex in action control, our results are in keeping with a deficit in goal-directed learning.

In the category learning literature, the Competition between Verbal and Implicit Systems model (COVIS; Ashby et al., 1998) has been proposed to explain the neural systems that mediate rule-based learning vs. procedural (information-integration) learning. Whilst both are inherently feedback-based, these learning mechanisms necessitate different strategies and depend on divergent systems. The former comprises an explicit hypothesis-testing system underpinned by a broad network including prefrontal cortex, anterior cingulate, hippocampus and caudate head; the latter, which requires perceptual information to be integrated at a pre-decisional level, is mediated by cortical-striatal synapses within the putamen and premotor cortex circuitry (Ashby and Maddox, 2011). However, there is growing consensus that prefrontal regions, in particular ventromedial prefrontal cortex, may play a role in both types of learning (Seger, 2008). Schnyer et al. (2009) explored this directly by contrasting ventromedial prefrontal cortex lesion patients on rule-based vs. information-integration learning and found that patients were impaired in both types of learning. Work by Seger and colleagues (Seger and Cincotta, 2005; Seger et al., 2010) has also highlighted the role of the ventral striatum in encoding feedback during unstructured category learning tasks. These findings suggest that the ventromedial prefrontal cortex and ventral striatum, important hubs in the cortico-striatal motivational loop, are critical for monitoring and integrating feedback, regardless of the learning strategy.

Ventromedial prefrontal regions and ventral striatum (particularly nucleus accumbens) are also more generally associated with reward processing (Kringelbach, 2005), which may further explain why these regions were implicated in acquisition learning deficits in our patients, as the feedback involved in the task was reward-oriented. Specific reward-learning deficits have previously been demonstrated in PD (Swainson et al., 2000; Housden et al., 2010), and based on the volumetric reductions we found in regions crucial to reward processing in our patient cohort, it is likely that deficient reward processing may have contributed to the acquisition deficits. Our finding that the right inferior frontal gyrus was also associated with the acquisition deficit may reflect the demands of more general cognitive control that is required in such a learning task. The right inferior frontal gyrus is well known to be implicated in inhibitory control of behavior (Aron et al., 2004); however, a broader interpretation of its action is that it is involved in the detection/monitoring of task-relevant cues (Hampshire et al., 2010) and, in terms of learning processes, the region is recruited during reversal learning (Cools et al., 2002).

From a mechanistic perspective, the involvement of prefrontal regions in learning from trial-by-trial feedback is also emphasized in computational models that seek to integrate basal ganglia and prefrontal function with respect to higher level executive processes. In the computational accounts proposed by O'Reilly and Frank (2006), the prefrontal cortex is active in maintaining information, whereby task-relevant information is determined via basal ganglia-prefrontal interactions that serve as a gating mechanism (see also Hazy et al., 2007). In these models, basal ganglia dopamine-dependent learning systems are presumed to trigger updates of working memory representations in the prefrontal cortex, whilst simultaneously inhibiting task-irrelevant information, thus allowing intrinsic prefrontal cortical mechanisms to actively maintain the contents of working memory. Our results suggest that direct atrophy in prefrontal regions may interfere with the updating and maintenance of task-relevant information in these models, which may therefore contribute to deficient acquisition on learning tasks.

The VBM technique utilized in this study is not without limitations, including registration and normalization issues and imperfect gray-white matter segmentation, particularly in relation to already atypical brains (Mechelli et al., 2005). In addition, the analysis we conducted does not measure the particular morphological changes brain structures undergo in PD and in interpreting findings of reduced gray matter density, it must be borne in mind that the precise mechanisms of cell degeneration in PD are still a matter of debate (Obeso et al., 2010). Nevertheless, VBM provides an important tool to further characterize learning systems in PD.

Together, our findings suggest that discrete fronto-striatal regions contribute to the feedback-based learning deficits in PD. It is likely that gray matter loss in these regions interacts with dopaminergic dysfunction to produce these deficits, and that the ultimate behavioral manifestation reflects an interplay between neurotransmitter imbalance and underlying structural changes. Our findings have important implications for the development of learning theories based on PD as a model of dopaminergic dysfunction. Whereas current theories and computational approaches have tended to focus on dopamine imbalance in intra-basal ganglia circuitry, a broader appreciation of the more distributed brain changes, such as gray matter loss, and how these may also affect learning processes is crucial in order to continue to refine these theoretical models. These results highlight that dysfunction in dopaminergic systems may not be the sole explanation for feedback-based learning deficits in PD, but that gray matter loss may also contribute to these deficits.

#### **AUTHOR CONTRIBUTIONS**

Claire O'Callaghan contributed to the design and conceptualization of the study, data collection, analysis and interpretation of data, drafting and revision of the manuscript. Ahmed A. Moustafa carried out the computational modeling and contributed to interpretation of the data and revision of the manuscript. Sanne de Wit contributed to study conceptualization and manuscript revision. James M. Shine contributed to data collection and manuscript revision. Trevor W. Robbins was involved in revision of the manuscript. Simon J. G. Lewis contributed to interpretation of the data and revision of the manuscript. Michael Hornberger contributed to design and conceptualization of the study, analysis and interpretation of data, and revision of the manuscript.

#### **ACKNOWLEDGMENTS**

Claire O'Callaghan is supported by an Alzheimer's Australia Dementia Research Foundation PhD scholarship. Ahmed A. Moustafa is supported by UWS Competitive Research Grant (P00021210). Simon J. G. Lewis is supported by a National Health and Medical Research Council Practitioner Fellowship (1003007). Michael Hornberger is supported by an Australian Research Council Research Fellowship (DP110104202).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fncom.2013.00180/abstract

#### **REFERENCES**


action control. *J. Neurosci.* 32, 12066–12075. doi: 10.1523/JNEUROSCI.1088- 12.2012


patients with Parkinson's disease or frontal or temporal lobe lesions: possible adverse effects of dopaminergic medication. *Neuropsychologia* 38, 596–612. doi: 10.1016/S0028-3932(99)00103-7


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 October 2013; accepted: 25 November 2013; published online: 12 December 2013.*

*Citation: O'Callaghan C, Moustafa AA, de Wit S, Shine JM, Robbins TW, Lewis SJG and Hornberger M (2013) Fronto-striatal gray matter contributions to discrimination learning in Parkinson's disease. Front. Comput. Neurosci. 7:180. doi: 10.3389/fncom.2013.00180*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2013 O'Callaghan, Moustafa, de Wit, Shine, Robbins, Lewis and Hornberger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The emergence of two anti-phase oscillatory neural populations in a computational model of the Parkinsonian globus pallidus

*Robert Merrison-Hort<sup>1</sup>\* and Roman Borisyuk<sup>1,2</sup>*

*<sup>1</sup> Centre for Robotics and Neural Systems, School of Computing and Mathematics, The University of Plymouth, Plymouth, UK <sup>2</sup> Neural Networks Laboratory, Institute of Mathematical Problems in Biology, Russian Academy of Sciences, Pushchino, Russia*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Robert Rosenbaum, University of Pittsburgh, USA Vignesh Muralidharan, Indian Institute of Technology Madras, India*

#### *\*Correspondence:*

*Robert Merrison-Hort, Centre for Robotics and Neural Systems, School of Computing and Mathematics, The University of Plymouth, A221 Portland Square, Drake Circus, Plymouth PL4 8AA, UK e-mail: robert.merrison@plymouth.ac.uk*

Experiments in rodent models of Parkinson's disease have demonstrated a prominent increase of oscillatory firing patterns in neurons within the Parkinsonian globus pallidus (GP) which may underlie some of the motor symptoms of the disease. There are two main pathways from the cortex to GP: via the striatum and via the subthalamic nucleus (STN), but it is not known how these inputs sculpt the pathological pallidal firing patterns. To study this we developed a novel neural network model of conductance-based spiking pallidal neurons with cortex-modulated input from STN neurons. Our results support the hypothesis that entrainment occurs primarily via the subthalamic pathway. We find that as a result of the interplay between excitatory input from the STN and mutual inhibitory coupling between GP neurons, a homogeneous population of GP neurons demonstrates a self-organizing dynamical behavior where two groups of neurons emerge: one spiking in-phase with the cortical rhythm and the other in anti-phase. This finding mirrors what is seen in recordings from the GP of rodents that have had Parkinsonism induced via brain lesions. Our model also includes downregulation of Hyperpolarization-activated Cyclic Nucleotide-gated (HCN) channels in response to burst firing of GP neurons, since this has been suggested as a possible mechanism for the emergence of Parkinsonian activity. We found that the downregulation of HCN channels provides even better correspondence with experimental data but that it is not essential in order for the two groups of oscillatory neurons to appear. We discuss how the influence of inhibitory striatal input will strengthen our results.

#### **Keywords: Parkinson's disease, globus pallidus, oscillation, synchronization, HCN, downregulation, deep-brain stimulation**

## **1. INTRODUCTION**

Parkinson's disease is a neurodegenerative disorder which (amongst other symptoms) causes a range of movement-related disturbances such as tremor and slowness (bradykinesia). The primary pathology of the disease is the death of dopaminergic neurons in the basal ganglia (BG), specifically those in the substantia nigra pars compacta (SNc). Since the dopaminergic neurons in the SNc provide widespread innervation to the other regions of the basal ganglia, it is not surprising that their loss results in profound changes to neuronal activity in these regions. What is not yet understood is the precise mechanism by which abnormal neuronal activity arises as a result of dopamine loss—and how this activity relates to motor symptoms. One very successful hypothesis for this was the so-called "rate" hypothesis (DeLong, 1990), which held that motor areas of the basal ganglia are divided into two feed-forward pathways that transfer information from the cortex to the thalamus: a pro-kinetic "direct" and an anti-kinetic "indirect" pathway. According to this model, loss of dopamine input to the striatum upsets the balance of these two pathways, resulting in movement abnormalities. While the rate hypothesis makes predictions that have resulted in successful treatments, such as lesioning of hyperactive nuclei on the indirect pathway (Lozano et al., 1995; Gill and Heywood, 1997), more recent electrophysiological studies have demonstrated that the changes in neuronal activity that underlie Parkinsonian motor impairment are likely to be considerably more complex than those implied by the rate model [see Rubin et al. (2012) for review].

One aspect of pathological activity in the Parkinsonian basal ganglia that is under active investigation is the increase in synchronous oscillatory firing. Local field potential (LFP) recordings from the subthalamic nucleus (STN) of patients with Parkinson's disease show a clear increase in power in the β frequency band (10–30 Hz) when patients are off medication [reviewed in Eusebio and Brown (2009)], and the size of the reduction in β power that occurs with dopamine-replacement medication is positively correlated with the concomitant improvement in severity of anti-kinetic symptoms (Kühn et al., 2006). There are a number of reasons why widespread pathological oscillations may cause motor deficits, for example they may impair the ability to relay information (Mallet et al., 2008b). It has also been proposed that, in health, sporadic β oscillations act as a global signal for maintenance of the current motor activity (Jenkinson and Brown, 2011).

What is the neural basis for the exaggerated oscillatory activity of the Parkinsonian basal ganglia? One possibility is that the reciprocally-connected neurons of the excitatory STN and inhibitory globus pallidus (GP; homologous to the external globus pallidus in primates) act as a neural oscillator. Several computational studies have suggested that this is a plausible mechanism (Gillies et al., 2002; Terman et al., 2002; Holgado et al., 2010). *In vitro* work in slices containing only STN and GP neurons has also shown that oscillatory firing is indeed possible (Plenz and Kitai, 1999), though only at frequencies much lower than the β band.

Another possible explanation for exaggerated BG β oscillations in Parkinson's disease is that dopamine acts to modulate the effects of rhythmic cortical activity on cortical-basal ganglia pathways, such that in conditions of reduced dopamine this network becomes pathologically entrained to cortical rhythms. Evidence for this comes from studies that have used signal processing techniques to attempt to determine whether β band coherence between the cortex and basal ganglia is directed from cortex to STN or vice versa. Such studies have shown that, in patients with Parkinson's disease (Litvak et al., 2011) or Parkinsonian rodents (Sharott et al., 2005), the oscillations arise in the cortex and drive STN activity. Computational studies that investigate the synchronization of basal ganglia neurons in Parkinson's disease often consider the neurons to be phase oscillators, which either synchronize themselves (Popovych and Tass, 2012) or become synchronized through common external inputs (Wilson et al., 2011).

Experiments using rodent models of Parkinson's disease provide compelling evidence that under Parkinsonian conditions the activity of neurons in the GP is much more susceptible to entrainment by cortical rhythms than in the healthy case. Under conditions of urethane anaesthesia, neurons in the GP of healthy rodents show uncorrelated tonic firing. However, in animals where Parkinsonism has been induced, either through chronic lesioning of the SNc with 6-hydroxydopamine (OHDA) (Ni et al., 2000; Magill et al., 2001) or acute inactivation of SNc projection fibers (Galati et al., 2009), the spiking activity of the majority of GP neurons becomes significantly correlated with cortical "slow wave activity" (SWA); this is the major cortical rhythm in the anaesthetized state and has a frequency of approximately 1 Hz. These experiments also reveal that, in the chronic lesioned case at least, the neurons in GP are split into two major groups, distinguished by whether they preferentially fire during the active phase (ECoG peaks) or inactive phase (ECoG troughs) of SWA in dopamine-deprived conditions. These will be referred to as the TA and TI groups, respectively. The underlying basis for this division is unknown, but the same division is seen with respect to the cortico-pallidal synchronization that occurs transiently at β frequencies in response to sensory stimulation in OHDA-lesioned rodents (Mallet et al., 2008a), which suggests that the same mechanism may be responsible for pathological entrainment in both behavioral states/frequency bands. If this is the case, then understanding this mechanism may lead to improved treatments for Parkinson's disease. Unfortunately we do not currently know the route through which oscillatory cortical input entrains the GP, although it is likely to involve the two major sources of synaptic input to GP neurons: the inhibitory medium spiny projection neurons of the striatum and excitatory STN neurons.
Both receive widespread cortical inputs and both show increased firing during the peaks of SWA under Parkinsonian conditions in rodents (Magill et al., 2001; Tseng et al., 2001). Given that the majority of GP neurons belong to the TI group it has been suggested that cortical oscillations are most effectively relayed via the inhibitory striatum (Walters et al., 2007), but this view is challenged by the fact that the entrainment of GP neurons to SWA appears to be critically dependent on a functioning STN (Ni et al., 2000; Galati et al., 2009).
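The TA/TI distinction described above amounts to asking at which phase of the slow wave a neuron prefers to fire. The sketch below is one simple way to express that idea. It assumes an idealized 1 Hz rhythm whose peaks fall at whole seconds, and uses the circular mean of spike phases. The function names, the `min_strength` cut-off and the "unmodulated" label are hypothetical. Experimental studies derive phase from the recorded ECoG instead of an idealized sinusoid.

```python
import math

def preferred_phase(spike_times_s, freq_hz=1.0):
    """Circular mean phase (radians, 0 = assumed peak) and vector strength."""
    phases = [2.0 * math.pi * ((t * freq_hz) % 1.0) for t in spike_times_s]
    c = sum(math.cos(p) for p in phases) / len(phases)
    s = sum(math.sin(p) for p in phases) / len(phases)
    return math.atan2(s, c), math.hypot(c, s)

def classify_neuron(spike_times_s, freq_hz=1.0, min_strength=0.2):
    """Label a spike train as 'TA', 'TI' or 'unmodulated'.

    TA: spikes cluster around the assumed peaks (active phase).
    TI: spikes cluster around the assumed troughs (inactive phase).
    min_strength is an arbitrary illustrative threshold on vector strength.
    """
    if not spike_times_s:
        return "unmodulated"
    mean_phase, strength = preferred_phase(spike_times_s, freq_hz)
    if strength < min_strength:
        return "unmodulated"
    return "TA" if math.cos(mean_phase) > 0 else "TI"
```

A spike train concentrated near whole seconds would be labeled TA under these assumptions, and one concentrated near half seconds TI.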

In this paper we test the hypothesis that the inhibitory network of GP neurons allows two anti-phase groups of oscillatory neurons to appear in response to rhythmic excitatory STN input only. To do this we consider a small neural network model of interconnected conductance-based GP neurons. Although the parameters of the neurons in this population are homogeneous, our simulations reveal a mechanism by which the two oscillatory groups can appear. This collective behavior is the result of a self-organization process that depends on the GP neurons' inhibitory dynamics and rhythmic STN modulation. We study the neural network model under healthy and Parkinsonian conditions and demonstrate a good correspondence between simulation results and experimental recordings. Special attention has been paid to studying the possible role of downregulation of hyperpolarization-activated cyclic nucleotide-gated (HCN) channels in this network, based on the effect that these channels appear to have on GP neurons' responses to synaptic input (Chan et al., 2004; Boyes et al., 2007) and the suggestion that oscillatory activity may not appear immediately after dopamine lesion and might instead depend on slower adaptive processes (Degos et al., 2009).

The structure of this paper is as follows. Section 2 describes our model including the cellular properties of model GP neurons, how STN neuron activity was generated, the nature of synaptic connectivity and our proposed model for HCN downregulation. Section 3 describes the results of simulations and demonstrates that model GP neurons behave realistically both in isolation and as part of a network. It also shows how the network's activity changes under simulated Parkinsonian conditions in a way that is similar to the results of previous biological experiments. Section 4 examines results in the context of what has been seen in animal models of Parkinson's disease and discusses what the results might mean in terms of potential improvements to treatments for the disease.

#### **2. MATERIALS AND METHODS**

**Figure 1** shows a simple representation of the neural network model which includes a population of 100 interconnected GP neurons (right panel, blue) which receive excitatory synaptic input from 50 STN neurons (left panel, red). Each GP neuron is described by a detailed single compartment conductance based model of the Hodgkin–Huxley type with inhibitory connections from other GP neurons. The STN neurons are described by a simple enhanced leaky integrate-and-fire model. Neurons in the STN population are not connected to each other but they are modulated by a common cortical slow-wave rhythm and make excitatory synapses onto GP neurons.

population of GP neurons. Inhibitory local synaptic connections between GP neurons have random connectivity.

#### **2.1. MODEL GP NEURONS**

The model GP neurons are of standard Hodgkin–Huxley type, with a single compartment per neuron. We included ten voltage-gated ionic channels as in the multicompartmental modeling work of Günay et al. (2008): fast and slow delayed rectifying K<sup>+</sup> (Kv3 and Kv2, respectively), fast and slow A-type K<sup>+</sup> (Kv4fast, Kv4slow), M-type K<sup>+</sup> (KCNQ), fast spike-producing Na<sup>+</sup> (NaF), persistent pacemaking Na<sup>+</sup> (NaP), high-voltage-activated Ca<sup>2+</sup> (HVA), and fast and slow mixed-conductance hyperpolarization-activated channels (HCNfast, HCNslow). For simplicity our model does not include calcium-gated potassium "SK" channels, as these channels' most significant effect on the activity of GP neurons appears to be a lengthening of spike afterhyperpolarization (AHP) (Deister et al., 2009), and we were able to achieve physiologically realistic AHPs without this channel. Equation 1 describes how the membrane potential (V) of a model GP neuron evolves in time.

$$\begin{aligned} C\frac{dV}{dt} = {} & g_{leak}(E_{leak} - V) + I_{naf} + I_{nap} \\ & + I_{kv2} + I_{kv3} + I_{kv4f} + I_{kv4s} + I_{kcnq} \\ & + I_{hva} + I_{hcnf} + I_{hcns} + I_{syn} + I_{ext} \end{aligned} \tag{1}$$

Here *C* and *gleak* are the total membrane capacitance and leak conductance and *Eleak* is the reversal potential of the leak channels. Values for these parameters are given in **Table 1**. *Isyn* is the total synaptic current received by the neuron (see below). *Iext* is an externally applied current that was only non-zero when testing the response of individual GP neurons to current injections. The remaining currents correspond to the voltage-dependent channels, each of which has an activation gate (represented by the state variable *m*) and, for most channels, an inactivation gate (state variable *h*). Two channels have slow inactivation gates (*s*) in addition to their activation and inactivation gates. Equation 2 shows the current due to a channel with all three gates:

$$I = m^{\mu} h^{\rho} s^{\phi} g_{X} (E_{X} - V) \tag{2}$$

Here *EX* is the reversal potential of the channel, *gX* is the maximum conductance of the channel, and μ, ρ and φ are integers that give the relative numbers of gating molecules. **Table 2** shows which channels contain each gate type and the corresponding values of μ, ρ and φ. Note that the parameters governing the dynamics of each gate vary from channel to channel. Since we use the same equations and parameters for channel gates as Günay et al. (2008) we do not reproduce these here and instead refer to the supplementary material of that paper where they are listed.

**Table 1 | Passive membrane parameters and channel conductances for the model GP neurons.**

**Table 2 | The relative proportion of gating molecules of each type for each channel in the model GP neurons.**
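As an illustration of Equation 2, the following minimal sketch evaluates the current through a single channel from its gate states. The function name, argument names and the numerical values in the usage note are hypothetical and are not taken from the model's source code or from Table 1; gates that a channel lacks are represented by an exponent of zero.

```python
def channel_current(v, m, h=1.0, s=1.0, *, g_max, e_rev, mu=1, rho=0, phi=0):
    """Current through one channel type, following Eq. 2.

    v      -- membrane potential (mV)
    m,h,s  -- activation, inactivation and slow-inactivation gate states (0..1)
    g_max  -- maximum conductance g_X (illustrative units)
    e_rev  -- reversal potential E_X (mV)
    mu, rho, phi -- integer gate exponents; 0 means the gate is absent
    """
    gate = (m ** mu) * (h ** rho) * (s ** phi)
    # The sign convention follows the paper: the driving force is (E_X - V).
    return gate * g_max * (e_rev - v)
```

For example, `channel_current(-60.0, 0.5, 0.5, g_max=2.0, e_rev=-90.0, mu=2, rho=1)` evaluates 0.5² × 0.5 × 2.0 × (−90 − (−60)); these numbers are purely illustrative.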

We could not directly use the channel maximum conductance parameters from Günay et al. (2008), since this was a multicompartmental model and the conductances varied widely between the different compartments. Instead we adjusted the channel conductances so that our model neurons exhibited intrinsic pacemaking and displayed realistic responses to depolarizing and hyperpolarizing current injections. The chosen conductances are shown in **Table 1**. In order to generate a range of intrinsic pacemaking frequencies we applied Gaussian noise to the maximum conductances of the persistent sodium and HCN channels (*gnap*, *ghcnf*, and *ghcns*). The mean values are given in **Table 1** and the standard deviation was 50% of the mean value for NaP and 30% for both HCN channels.

It has been proposed (Chan et al., 2011) that a homeostatic mechanism may reduce the intrinsic firing rate of GP neurons via downregulation of HCN channels in response to burst firing. Our model includes a possible mechanism for this downregulation by allowing the maximum conductance of HCN channels to decrease. This occurs during periods of elevated firing rate, which are indicated by high intracellular calcium concentration ([*Ca*]). In order to model the dynamics of this calcium concentration we use the equations from Terman et al. (2002). The rate of change of intracellular calcium is given by Equation 3:

$$\frac{d[Ca]}{dt} = \epsilon(I_{hva} - k_{Ca}[Ca])\tag{3}$$

Here ε represents calcium buffering and has the value 10<sup>−4</sup> Ms/CL, while *kCa* is the calcium pump rate and has the value 15.0 CL/Ms [parameter values from Rubin and Terman (2004)]. *Ihva* represents the instantaneous current due to HVA channels (these are the only Ca<sup>2+</sup> channels in the model). The maximum conductance of HCN channels remains constant during time steps and is adjusted between steps when [*Ca*] > *THCN*, where *THCN* is the threshold above which downregulation occurs. The amount by which the conductance is adjusted takes the form of a sigmoid curve and is given in Equation 4.

$$g_{\text{HCN}}(t+\Delta t) = \max\left[0,\; g_{\text{HCN}}(t) - \frac{k_{\text{HCN}}\Delta t}{1 + \exp\left(\frac{\theta - [\text{Ca}](t+\Delta t)}{\sigma}\right)}\right] \tag{4}$$

Here *kHCN* sets the maximum rate of conductance change (so that *kHCN*Δ*t* is the largest change that can occur in one step), θ is the level of intracellular calcium that gives half the maximum change, and σ is the slope of the sigmoid. We do not have any data from biological experiments to suggest values for the downregulation parameters. Since we hypothesize that downregulation occurs mainly during fast burst firing under Parkinsonian conditions, we chose parameters such that, in healthy conditions, downregulation only occurred in the very fastest firing GP neurons. The values of *kHCN* and σ that we chose give a fairly rapid reduction in maximum HCN conductance in response to elevated firing rates. The downregulation parameters that we chose are: *THCN* = 0.2, *kHCN* = 6 × 10<sup>−4</sup>, σ = 0.1, θ = 0.5.
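The update rule of Equation 4, together with the threshold condition described above, can be sketched as follows. This is a minimal illustration rather than the simulator's actual code: the function and constant names are hypothetical, and the units of the conductance and of Δ*t* are left as in the text.

```python
import math

# Downregulation parameters as quoted in the text.
T_HCN = 0.2    # calcium threshold above which downregulation occurs
K_HCN = 6e-4   # scales the maximum conductance change per step (k_HCN)
SIGMA = 0.1    # slope of the sigmoid
THETA = 0.5    # calcium level giving half the maximum change

def update_g_hcn(g_hcn, ca, dt):
    """Return the HCN maximum conductance after one step of length dt (Eq. 4).

    g_hcn -- current maximum HCN conductance
    ca    -- intracellular calcium concentration at the end of the step
    """
    if ca <= T_HCN:
        # Below threshold the conductance is left unchanged.
        return g_hcn
    decrement = K_HCN * dt / (1.0 + math.exp((THETA - ca) / SIGMA))
    # The conductance is never allowed to become negative.
    return max(0.0, g_hcn - decrement)
```

At [*Ca*] = θ the sigmoid equals one half, so the conductance falls by *kHCN*Δ*t*/2 in that step; for calcium levels well above θ the decrement approaches *kHCN*Δ*t*.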

Note that our model GP neurons are supposed to represent those in the rodent GP. This nucleus is usually considered to be equivalent to the so-called "external" pallidus (GPe) in primates.

#### **2.2. MODEL STN NEURONS**

Since the aim of this study is to investigate the effects of rhythmic STN input upon the neurons of the GP, we did not model STN neurons to the same level of biological detail as GP neurons. Instead, to simulate the SWA-modulated bursting of STN neurons that occurs under urethane anaesthesia we use an enhanced integrate-and-fire generator of neural activity, as described in Borisyuk (2002). The STN neurons include an exponentially decaying threshold, accumulation of membrane potential, stochastic noise and an absolute refractory period. Aside from approximating the SWA modulation of STN activity our model does not include any synaptic inputs to the STN. In particular we do not include the GP-STN projection because experiments in urethane-anaesthetized rats have shown that the changes in firing rate and pattern that occur in the STN following OHDA-lesion are not dependent on synaptic input from the GP (Hassani et al., 1996).

During urethane anaesthesia STN neurons display uncorrelated bursting activity that is modulated by a slow rhythm (Magill et al., 2001) which, for the purposes of this paper, we assume arises from excitatory cortical inputs. Since firing is uncorrelated, each STN spike train is generated independently using a procedure that results in spiking activity that is similar to the spiking activity of real neurons. The generated spike trains contain activity that is oscillatory with a period of 1300 ms (≈0.8 Hz), where each cycle is made up of an 800 ms "inactive" phase and a 500 ms "active" phase. The spike trains are constructed by alternately sampling from two intermediate spike trains, one with slow irregular firing and one with fast irregular firing (for the inactive and active periods, respectively). The average firing rates are 0.5 Hz for the inactive period and 30 Hz for the active period; under simulated Parkinsonian conditions the firing frequency during the active phase is increased to 60 Hz. **Figure 2** shows cross-correlations between two STN spike trains which demonstrate that within the active bursting period activity is not correlated (A), but that there is strong common modulation at 0.8 Hz (B). **Figures 2C,D** are raster plots of the generated spiking activity of 50 STN neurons in healthy and Parkinsonian conditions, respectively, demonstrating a clear increase in firing frequency during the active phase in the Parkinsonian case.
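The phase-dependent rate switching described above can be sketched as follows. Note that this is a simplified stand-in: it draws spikes from an inhomogeneous Poisson-like (Bernoulli per time step) process, whereas the model itself uses the enhanced integrate-and-fire generator of Borisyuk (2002), which also includes a decaying threshold and a refractory period. The function names, the time step and the choice to start each cycle with the inactive phase are assumptions made for illustration.

```python
import random

PERIOD_MS = 1300.0    # SWA period quoted in the text
INACTIVE_MS = 800.0   # assumed here to come first in each cycle

def in_active_phase(t_ms):
    """True if time t_ms falls in the 500 ms active part of the SWA cycle."""
    return (t_ms % PERIOD_MS) >= INACTIVE_MS

def stn_spike_train(duration_ms, rate_inactive_hz=0.5, rate_active_hz=30.0,
                    dt_ms=0.1, rng=None):
    """Generate one independent STN spike train (list of spike times in ms).

    Poisson-like stand-in for the paper's generator: in each small time
    step a spike occurs with probability rate * dt.
    """
    rng = rng or random.Random()
    spikes = []
    for i in range(int(duration_ms / dt_ms)):
        t = i * dt_ms
        rate_hz = rate_active_hz if in_active_phase(t) else rate_inactive_hz
        if rng.random() < rate_hz * dt_ms / 1000.0:
            spikes.append(t)
    return spikes
```

Calling `stn_spike_train(13000.0, rate_active_hz=60.0)` would correspond to ten SWA cycles under the simulated Parkinsonian condition; each call with a different random generator gives an independent train, matching the uncorrelated firing assumed in the text.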

#### **2.3. SYNAPTIC CONNECTIVITY**

Each model neuron contains two variables, *o*(*t*) and *c*(*t*), which represent synaptic opening and closing, respectively. When a neuron spikes (defined by the membrane potential crossing 0 mV in the positive direction), both variables are step-increased by 1. Following this, the variables decay exponentially to zero according to different time constants τ*<sup>o</sup>* and τ*c*, respectively (Eq. 5). Since τ*<sup>o</sup>* < τ*c*, a transient synaptic current arises in all post-synaptic neurons following a spike, according to Eq. 6:

$$\frac{do}{dt} = -\frac{1}{\tau_o} o \qquad \frac{dc}{dt} = -\frac{1}{\tau_c} c \tag{5}$$

$$I_{syn} = (c - o)\, g_{Syn} (e_{rev} - V) \tag{6}$$

Here *erev* is the reversal potential of the synapse (mV) and *V* is the membrane potential of the post-synaptic neuron (mV). The values of the parameters *erev*, τ*<sup>c</sup>* and τ*<sup>o</sup>* vary based on the type of the neuron (glutamatergic STN or GABAergic GP). *gSyn* denotes the maximum unitary conductance of a synapse (nS) and its value for a particular synapse is drawn from a Gaussian distribution. The mean of this distribution was *gSG* for STN-GP synapses and *gGG* for intra-GP synapses and the standard deviation was 30% of the mean in both cases.

For the intra-GP inhibitory synapses we used synaptic time constants τ*<sup>o</sup>* = 5 ms and τ*<sup>c</sup>* = 40 ms from unpublished current-clamp recordings of GP-GP IPSPs taken from slices containing rat GP, cortex and striatum [Alon Korngreen, personal communication]. Similarly, we chose *gGG* = 0.5 nS which gives a peak IPSP of 0.5 mV (measured as deflection away from a holding potential of −65 mV during injection of hyperpolarizing current) to match the same experimental recordings. We used a standard GABA reversal potential of −80 mV. Connectivity between GP neurons was uniformly random, with each neuron inhibiting 20 others (no self-connections).
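To make the time course implied by Eqs. 5 and 6 concrete, the sketch below evaluates the synaptic current following a single presynaptic spike at time zero, using the closed-form solution of Eq. 5. The function names are hypothetical, and the peak-time expression is simply obtained by setting the derivative of (*c* − *o*) to zero; it is not stated in the text.

```python
import math

def synaptic_gate(t_ms, tau_o_ms, tau_c_ms):
    """Value of (c - o) at time t_ms after a single spike at t = 0.

    Both variables are stepped to 1 by the spike and then decay
    exponentially with their own time constants (Eq. 5).
    """
    if t_ms < 0.0:
        return 0.0
    return math.exp(-t_ms / tau_c_ms) - math.exp(-t_ms / tau_o_ms)

def synaptic_current(t_ms, v_mv, g_syn_ns, e_rev_mv, tau_o_ms, tau_c_ms):
    """Synaptic current from Eq. 6 (nS * mV = pA) after one spike."""
    return synaptic_gate(t_ms, tau_o_ms, tau_c_ms) * g_syn_ns * (e_rev_mv - v_mv)

def time_of_peak(tau_o_ms, tau_c_ms):
    """Time at which (c - o) is maximal, from d(c - o)/dt = 0."""
    return (tau_o_ms * tau_c_ms / (tau_c_ms - tau_o_ms)
            * math.log(tau_c_ms / tau_o_ms))
```

With the intra-GP values τ*<sup>o</sup>* = 5 ms and τ*<sup>c</sup>* = 40 ms, `time_of_peak(5.0, 40.0)` is roughly 11.9 ms, and the current at a holding potential of −65 mV with a reversal potential of −80 mV is negative (hyperpolarizing) under the sign convention of Eq. 6.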

**(C,D)** the background is shaded to show the active (pink) and inactive (blue) phases of the SWA cycles. **(A,B)** are normalized using the

Anatomical data regarding the structure of the STN-GP projection is currently lacking. However, it is clear that there are many fewer STN neurons than GP neurons and that each GP neuron only samples the activity of a small proportion of STN neurons (Jaeger and Kita, 2011). We therefore arbitrarily chose to model 50 STN neurons, each of which makes excitatory synapses onto two randomly selected GP neurons. The time constants of STN-GP synapses in the model are τ*<sup>o</sup>* = 0.2 ms and τ*<sup>c</sup>* = 60 ms, based on the recordings shown in Loucif et al. (2005). The average maximum synaptic conductance (*gSG*) used for the healthy case was chosen to be just low enough that the majority of GP neurons did not show significant entrainment to the SWA rhythm, and we investigated the effects of increasing the value in the Parkinsonian case.
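The random connectivity described in this section can be sketched as below. This is an illustrative construction only: the function names are hypothetical, the two GP targets of each STN neuron are assumed to be distinct, and negative conductance draws are clipped to zero, which the text does not specify.

```python
import random

def build_connectivity(n_gp=100, n_stn=50, gp_out=20, stn_out=2, rng=None):
    """Return (gp_targets, stn_targets) as lists of target-index lists.

    gp_targets[i]  -- the gp_out GP neurons inhibited by GP neuron i
                      (chosen uniformly at random, no self-connections)
    stn_targets[j] -- the stn_out GP neurons excited by STN neuron j
    """
    rng = rng or random.Random()
    gp_targets = []
    for i in range(n_gp):
        others = [j for j in range(n_gp) if j != i]
        gp_targets.append(rng.sample(others, gp_out))
    stn_targets = [rng.sample(range(n_gp), stn_out) for _ in range(n_stn)]
    return gp_targets, stn_targets

def draw_conductance(mean_ns, rng, rel_sd=0.3):
    """Draw one unitary conductance (nS) from a Gaussian with SD = 30% of mean.

    Clipping at zero is an assumption made here to avoid negative values.
    """
    return max(0.0, rng.gauss(mean_ns, rel_sd * mean_ns))
```

With the default arguments each GP neuron contacts 20 of the 99 other GP neurons, which corresponds to the coupling level of roughly 20% discussed later for the 100-neuron network.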

#### **2.4. MODELING OF PARKINSONISM**

We simulate the OHDA-lesioned (Parkinsonian) rat basal ganglia by making three changes to the model's parameters: (i) faster STN firing during the active phase of SWA (**Figure 2D**), (ii) increased STN-GP synapse strength, and (iii) increased intra-GP inhibition strength. Although the changes that occur to functional connectivity in the basal ganglia in Parkinson's disease are currently under active investigation, there is experimental support for facilitation of both GP-GP (Johnson and Napier, 1997) and STN-GP (Johnson and Napier, 1997; Hernández et al., 2006) synapses. Similarly, under urethane anaesthesia it has been shown that spiking in the STN continues to be modulated by the cortical SWA rhythm, but that its firing becomes more intense during the active period (Magill et al., 2001; Galati et al., 2009).

correlations. The horizontal bars show the 95% confidence interval for significant correlations (Brillinger, 1976).

#### **2.5. CATEGORIZATION OF NEURON ACTIVITY**

We used a simple method to categorize GP neurons as being of type TA (in-phase with active SWA), TI (in-phase with inactive SWA), NM (not modulated by SWA) or QU (quiet). Each spike fired by the neuron to be categorized is represented by a complex number that indicates its phase in relation to SWA. The sum of these complex numbers then gives an indication of the average phase, ω, as shown in Eq. 7.

$$\omega_k = \sum_{s \in S_k} e^{i\theta(s)} \tag{7}$$

Here *Sk* is the set of spike times for neuron *k* (0 ≤ *k* < 100) and θ(*s*) is the phase of SWA at time *s* (0 ≤ θ(*s*) < 2π). The argument of the complex number ω*<sup>k</sup>* indicates the average SWA phase at which neuron *k* fires, while its modulus provides an indication of how strongly SWA-modulated the firing is. Normalizing the modulus by the number of spikes gives a confidence measure *ck* = |ω*k*|/|*Sk*|, where *ck* = 0 indicates that spikes did not fire preferentially at any one phase and *ck* = 1 indicates that every spike occurred at exactly the same phase. After visual inspection of spike trains, we decided to categorize neurons with confidence *c* < 0.1 as NM. We categorized neurons with *c* ≥ 0.1 as either TA or TI based on whether the average phase was during the active or inactive part of the SWA cycle. Neurons that fired fewer than one spike every SWA cycle on average were categorized as QU.
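The categorization procedure can be sketched in a few lines. This is not the analysis script used for the paper: the function names are hypothetical, and the phase origin is assumed to lie at the start of the 800 ms inactive part of each 1300 ms SWA cycle, so that the active part occupies the final 500 ms.

```python
import cmath
import math

PERIOD_MS = 1300.0    # SWA period
INACTIVE_MS = 800.0   # assumed: inactive part first, then active part

def swa_phase(t_ms):
    """Phase of SWA at time t_ms, in the range [0, 2*pi)."""
    return 2.0 * math.pi * ((t_ms % PERIOD_MS) / PERIOD_MS)

def categorize(spike_times_ms, duration_ms, c_threshold=0.1):
    """Classify one neuron as 'QU', 'NM', 'TA' or 'TI' from its spike times."""
    n_cycles = duration_ms / PERIOD_MS
    # Quiet: fewer than one spike per SWA cycle on average.
    if not spike_times_ms or len(spike_times_ms) < n_cycles:
        return "QU"
    # Eq. 7: sum of unit phasors, one per spike.
    omega = sum(cmath.exp(1j * swa_phase(t)) for t in spike_times_ms)
    confidence = abs(omega) / len(spike_times_ms)
    if confidence < c_threshold:
        return "NM"
    mean_phase = cmath.phase(omega) % (2.0 * math.pi)
    active_start = 2.0 * math.pi * INACTIVE_MS / PERIOD_MS
    return "TA" if mean_phase >= active_start else "TI"
```

A neuron that fires at perfectly regular intervals throughout the cycle gives phasors that cancel, so its confidence is close to zero and it is labeled NM, whereas a neuron whose spikes cluster in the last 500 ms of each cycle is labeled TA.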

#### **2.6. IMPLEMENTATION DETAILS**

We simulated the model using custom-written software developed by R Merrison-Hort. This software is written in C and uses the adaptive Runge–Kutta–Fehlberg ODE solver routine from the GNU Scientific Library (version 1.15). Absolute and relative error tolerances of 10<sup>−5</sup> and a maximum step size of 1 ms were used for all simulations. To analyse the results we used scripts written in the Python (2.7) programming language with routines from the NumPy (1.6.2) and SciPy (0.11.0rc1) libraries. For each set of parameters twelve simulations were performed, with different random neural connectivity, STN spike trains and parameter noise (as described above) in each simulation.

All reported mean values are given with their standard deviation.

#### **3. RESULTS**

#### **3.1. MODEL GP NEURONS BEHAVE REALISTICALLY UNDER HEALTHY CONDITIONS**

The characteristics of the model GP neurons qualitatively match those of real rodent GP neurons in a number of key ways and are illustrated in **Figure 3A**. When no synaptic or injected currents are present (dark blue trace in **Figure 3A**), most model neurons (96%; 481/500) pacemake at a range of frequencies (23.6 Hz ± 4.0). Depolarizing current injections increase the frequency of firing (green trace), with very high frequencies possible (up to approximately 200 Hz). Hyperpolarizing current injections result in a prominent and transient "sag" in membrane potential (red, cyan and pink traces). The first spike after hyperpolarizing current is removed occurs after a similar delay regardless of the size of the injected current. These properties match those seen in experiments with slices of rodent GP (Chan et al., 2004; Bugaysen et al., 2010).

The mixed-conductance HCN channels play an important role in the activity of the model GP neurons and their response to hyperpolarizing input. The combination of a reasonably depolarized reversal potential (−30 mV) (Lüthi and McCormick, 1998) and activation at hyperpolarized membrane potentials (lower than −60 mV) means that these channels act to return neurons to spiking threshold faster after hyperpolarizing current (or inhibitory synaptic input) is removed. **Figure 3B** shows how the simulated blockade of HCN channels affects the activity of a model GP neuron. When HCN channels are removed (*ghcnf* = *ghcns* = 0), the average pacemaking frequency decreases to 15.8 Hz ± 2.5 and 12% (58/500) of neurons do not pacemake. Without HCN channels the membrane potential sag is no longer seen, and hyperpolarizing current has a much stronger effect on membrane potential. The time between the removal of hyperpolarizing current and the return of spiking is also much longer, and much more sensitive to the hyperpolarization level. These results agree with previous work that has investigated the role of HCN channels using mouse GP slices and multicompartmental simulations (Chan et al., 2004).

#### **3.2. HEALTHY NETWORK ACTIVITY**

Whilst we were able to base STN firing rate and the conductance of GP-GP synapses directly on experimental evidence, we could only do this indirectly with the STN-GP synaptic conductance (*gSG*). We chose a value of 0.1 nS for this parameter in the healthy case as this gives similar proportions of neurons in the TI, TA and NM groups as seen in experiments. **Figure 4A** shows this distribution [cf. Figure 2A in Mallet et al. (2008b)] and **Figure 4B** shows the spiking activity of the TI, TA and NM neurons in one trial. A small proportion of neurons (9.9% ± 2.1) are categorized as QU because they fire spikes rarely or not at all; we are not sure if this is a biologically accurate result as such neurons may have been excluded from the results of electrophysiological studies. The majority of neurons (68.3% ± 3.9) are categorized as NM and neurons in this group spike with an average firing rate of 12.3 Hz ± 3.3 and coefficient of variation (CV) of 0.12 ± 0.04. These statistics are in good agreement with those of neurons recorded from mouse GP slices by Chan et al. (2004) (firing rate 12.5, CV 0.18). However, in contrast to these experimental results, which found no effect on firing rate or CV after blocking GABAA receptors, we would expect the average firing rate of the neurons in our model to increase slightly with inhibition blocked, as the average firing rate in the network is lower than the average pacemaking frequency of isolated model neurons. The average firing frequencies in the (small) TI and TA groups were 3.7 Hz ± 1.8 and 10.7 Hz ± 4.7, respectively.

Depolarizing current causes fast, regular spiking (green trace), while hyperpolarizing current reveals a sag in membrane potential and rebound firing (red, cyan, and purple traces). With no current injection the neuron fires regularly at approximately 22 Hz (blue trace). **(B)** Neuron with HCN channels removed. Sag in membrane potential is lost and pacemaking is slowed. Note the difference in scale for the injection currents between **(A)** and **(B)**.

#### **3.3. PARKINSONIAN NETWORK ACTIVITY**

The effects of dopamine lesion were simulated in the model by an increased intensity of STN firing and increased strength of STN-GP excitation and intra-GP inhibition. These changes have a profound effect on activity in the model GP that is similar to what is seen in experiments. As **Figure 5** shows, most neurons begin to preferentially fire during either the active or inactive phase of the SWA. In order to see proportions of TA and TI neurons that were similar to *in vivo* results it was necessary to double the strength of STN-GP and GP-GP synapses (*gSG* = 0.2 nS, *gGG* = 1.0 nS).

SWA-modulated firing patterns, either in-phase (TA) or anti-phase (TI). **(B)** Raster plot of Parkinsonian GP neuron activity (description as in **Figure 4B**).

The average firing frequency of NM neurons decreased slightly under Parkinsonian conditions while the average firing rates of the TI and TA groups increased to 6.3 Hz ± 3.6 and 11.1 Hz ± 5.1, respectively (**Figure 6A**). Although the variance of these statistics is fairly large, there does appear to be a trend for different firing rates between the different groups that is not seen *in vivo* (Magill et al., 2001). This difference is perhaps not too surprising given our simplistic and somewhat arbitrary choices for STN-GP and GP-GP connectivity.

In order to investigate the factors that determine whether a neuron becomes TA, TI, NM or QU we examined the following statistics of each neuron: maximum conductance of the NaP persistent sodium channel; initial (before downregulation) maximum conductance of fast and slow HCN channels; total maximum conductance of all excitatory (AMPA) synapses from STN neurons; total maximum conductance of all inhibitory synapses from other GP neurons. In general these statistics were remarkably similar between each of the groups, with two exceptions. Firstly, quiet (QU) neurons have, on average, much lower maximum conductances for their NaP channel (**Figure 6B**). These channels underlie autonomous pacemaking (Mercer et al., 2007) and the intrinsic pacemaking frequency is strongly dependent on the value of the NaP maximum conductance. Since QU neurons have low NaP conductance they are likely to pacemake very slowly or not at all, and it appears (from examination of voltage traces) that the incoming inhibition from other GP neurons is sufficient to prevent them from ever firing. Secondly, TA neurons receive on average more excitatory synaptic input from the STN than the other groups (**Figure 6C**). This result was expected, since STN firing occurs during the active phase of SWA and so it is not surprising that those GP neurons that receive more STN input also fire preferentially during the active phase. In general, however, the simple statistics we examined about membrane properties and synaptic connectivity are not enough to determine which group a particular neuron will fall into. Predicting the classification of a neuron involves knowing the classification of the other GP neurons that it receives inhibition from. This makes the problem complex and extremely difficult to resolve a priori. Instead, when the network is simulated, a dynamic process takes place in which the network self-organizes its activity into the different groups of neurons.

**neurons from all trials (***n* **= 1200). (A)** Average firing rate is rather variable but is in general slightly lower for TI neurons. **(B)** Maximum conductance of the persistent sodium channel (NaP) which underlies pacemaking. Quiet neurons can be easily categorized as those with very low NaP conductance. **(C)** Average total maximum conductance due to excitatory synapses. TA neurons receive more excitation on average, but it is highly variable.

HCN channel downregulation plays a significant, but not essential, role in the emergence of the different groups of neurons in our model. Without this mechanism, many neurons in the TA group continue to fire during the inactive phase due to their intrinsic pacemaker properties. Although this inactive-phase firing is slower than their active-phase firing (due to reduced excitatory input), it is still a source of inhibitory input to other GP neurons and may silence or slow the firing of some which may otherwise be categorized as TI. With the HCN downregulation mechanism those neurons that receive the most excitatory STN input, and therefore fire at the fastest rate during the active period, will have their maximum HCN conductances reduced. This reduction has the effect of decreasing the intrinsic pacemaking frequency and increasing the hyperpolarization that occurs in response to inhibitory input. To quantify the effect on pacemaking frequency, we ran four simulations using the Parkinsonian parameter values for a period of time long enough for downregulation to take effect (6.5 s) and then removed all synapses and recorded firing rates. The average pacemaker frequency after downregulation was 16.9 Hz ± 2.1, a clear reduction from the normal pacemaker frequency of our model neurons (23.6 Hz ± 4.0). The proportion of QU neurons after downregulation was 2%, lower than the 4% that we would expect based on the normal pacemaker properties, but we attribute this to statistical noise due to the rather small sample sizes.

These changes to pacemaking affect the competition dynamics during the inactive phase and mean that most TA neurons are no longer able to fire at all during this phase. Although the proportion of neurons classified as either TA or TI is similar with or without HCN downregulation (72.7% ± 5.4 normally, 68.7% ± 5.0 without downregulation), TA neuron firing is much more clearly restricted to the active phase with downregulation. This is seen in the average confidence measure (*c*) of TA neurons, which is 0.44 ± 0.22 with HCN downregulation and 0.32 ± 0.18 without. The effect is also shown by phase diagrams showing the spiking activity of typical neurons (**Figure 7**). In these plots the background is shaded to show the active (pink) and inactive (blue) parts of the SWA cycle, showing that TI neurons fire preferentially in the inactive part and TA neurons fire preferentially in the active part. The red bars show the average spike phase and their length indicates the confidence measure as a proportion of the total radius.

#### **3.4. LARGER NETWORKS**

The results described above were from simulations with 100 GP neurons, each of which made 20 inhibitory synapses onto 20 other (randomly chosen) GP neurons. This level of coupling (≈ 20%) is probably much higher than what is seen in the real GP (Sadek et al., 2007). We ran several simulations, using Parkinsonian parameter settings, where the coupling proportion was decreased by scaling up the number of GP and STN neurons but keeping the number of synapses that each neuron made constant. In the first set of simulations we used 200 GP neurons and 100 STN neurons and in the second set we used 300 GP neurons and 150 STN. These give GP-GP coupling levels of 20/199 ≈ 10% and 20/299 ≈ 7%, respectively. In each case we ran three simulations. For the simulations with 200 GP neurons there were an average of 71 ± 2.8 TI neurons and 73.7 ± 1.7 TA neurons. For the simulations (*n* = 3) with 300 GP neurons there were an average of 122 ± 4.1 TI neurons and 101.3 ± 2.5 TA neurons. These proportions (particularly in the latter case) are similar to the proportions in the smaller network (see **Figure 5**).

shows the phase confidence measure (ω) as a proportion of the total radius.

#### **3.5. OTHER FREQUENCY BANDS**

We briefly investigated to see if the activity of GP neurons could become entrained to higher frequency cortical rhythms, specifically those in the β band. To do this we generated STN spike trains that were modulated at approximately 14 Hz (70 ms period: 40 ms inactive phase, 30 ms active phase). Although biological experiments on OHDA lesioned rodents find that most neurons fall under the same TI or TA category regardless of whether the cortical rhythm is SWA or β, it was difficult and not effective to use our normal method to categorize neurons because the number of spikes fired by GP neurons in each β cycle was very low. However, examining spike cross-correlations between STN and GP neurons showed that the majority of GP neurons did show oscillatory firing that was in-phase with the STN input (**Figure 8A**). We examined auto-correlations for individual GP neurons and found that the frequency of these neurons' oscillations varied somewhat from neuron to neuron, which suggests that their firing becomes synchronized to some intermediate frequency between the 14 Hz input and their intrinsic pacemaking frequency. We did not see any GP neurons that showed anti-phase oscillations when using our standard Parkinsonian parameters. However, when the degree of intra-GP inhibition is dramatically increased (40% coupling, *gGG* = 3.0 nS) then a few neurons do begin to show a preference for anti-phase firing (**Figure 8B**), albeit at a very low rate. It is possible that with different synaptic parameters or connection topology (for example, STN input that preferentially makes contact with a particular group of GP neurons), the synchronized activity of one group could cause a second group to become synchronized in anti-phase.

#### **4. DISCUSSION**

#### **4.1. A NEW, BIOLOGICALLY DETAILED, MODEL HELPS US TO STUDY GP DYNAMICS**

We have presented what we believe to be a novel model of GP neurons that features much of the biological realism of previous detailed multi-compartmental models but considerably reduced complexity (both computationally and in terms of model construction). This makes our model well-suited to detailed modeling of the dynamics of networks of GP neurons and their connections with other nuclei. We have also introduced a possible computational mechanism for simulating the downregulation of HCN channels and shown that this improves how closely our results fit with biological evidence.

Our results demonstrate a mechanism whereby local inhibitory connections allow two anti-phase oscillatory subpopulations of GP neurons to emerge in response to rhythmic excitatory input from the STN. The two subpopulations appear due to a complex self-organization process and despite the homogeneity of the overall population. This effect is only seen when both the STN input and inhibitory GP-GP coupling are sufficiently strong, and there is good experimental evidence that both STN input to the GP and intra-GP coupling increase in rodent models of Parkinson's disease. We therefore claim that our model shows a plausible mechanism for those experimental results which show a prominent increase in the number of TA and TI neurons that occurs in the rodent GP after dopamine lesioning (Magill et al., 2001). In our model, HCN channel downregulation makes oscillatory entrainment of the in-phase group of neurons more prominent but is not essential for the two groups to appear. This may explain the result of Chan et al. (2011) whereby artificial up-regulation of HCN channels via viral transfection restored the cells' ability to autonomously pacemake but did not give any significant improvement to Parkinsonian motor impairment. The fact that we did not see an increase in the number of neurons that were unable to autonomously pacemake following simulations of the Parkinsonian network may indicate that we did not set the threshold for HCN downregulation low enough.

#### **4.2. RELATIONSHIP TO PREVIOUS STUDIES ON COUPLED OSCILLATORS**

Networks of coupled oscillators are found in many areas of science and the dynamics of such networks have therefore been widely studied from a theoretical standpoint. Our model can be thought of as a network of inhibitory coupled oscillators that receive random, sparse excitatory input with a particular global frequency. Although many theoretical studies of similar systems use reduced models, they may still provide insights into the different dynamical behaviors that our model is likely to exhibit.

Most previous theoretical studies do not include common external input to the coupled oscillators, although some may consider the effects of input that is constant in time. Chow (1998) describes the analysis of a neuronal network that consists of a number of oscillators with heterogeneous spiking frequencies that are all-to-all coupled by inhibitory connections. Such networks are capable of producing a range of dynamics, including almost-synchronous phase-locking, harmonic locking, and suppression. The stability of these states strongly depends on the details of the neurons' response to synaptic input. This network is similar to our model in the case where STN input is made constant in time, although inhibitory coupling in the GP network is random and relatively sparse rather than all-to-all. Since SWA oscillations are much slower than the GP neurons' intrinsic pacemaking, we can consider the GP network during the active and inactive phases separately, with constant STN input within each phase. Using the terminology of Chow (1998), during the active phase of SWA (high STN input) the TI neurons are in the suppressed state while in the inactive phase the TA neurons are suppressed. We have not observed synchrony between neurons during active or inactive phases, although it is possible that the system would converge to these states after a long period of time with constant input.

As the frequency of the cortical modulation is increased to be closer to the GP neurons' pacemaker frequencies (e.g., into the β band) it no longer makes sense to consider the scenario of constant STN input. In this scenario, theoretical results from oscillator models may be useful for suggesting conditions that support the emergence of in-phase and anti-phase groups. Golomb et al. (1992) describe a network of phase oscillators that all receive common global input that is a function of the phases of the oscillators. This is not explicitly the case in our model, but nevertheless their findings regarding the stability of solutions with clustered phase distributions may be relevant. In particular they show that the fewer clusters a state has, the more stable it is (larger basin of attraction). This could explain why the GP network under β modulation organises into just two anti-phase clusters. Kilpatrick and Ermentrout (2011) study a more biologically realistic model for the emergence of gamma rhythms in a network containing a large population of excitatory neurons with a smaller subpopulation of inhibitory interneurons. Interestingly, they show that the number of clusters that emerge in their model depends on the level of spike frequency adaptation in the excitatory neurons, which arises due to a calcium current. Our model contains a calcium channel that activates during fast firing and causes some degree of spike frequency adaptation, which raises the possibility that using different conductances for this channel may result in patterns with more than two clusters. It has also been shown that networks of neurons that have heterogeneous synaptic interconnectivity may display clustered dynamics if the connectivity structure satisfies certain conditions (Li et al., 2003), although this has only been shown for excitatory synapses and so it is not clear whether the same would apply to the GP network.

#### **4.3. STIMULATION OF THE STN MAY REDUCE OSCILLATIONS IN THE HYPERDIRECT PATHWAY**

The hypothesis that basal ganglia activity is entrained to cortical rhythms via the hyperdirect pathway in Parkinson's disease offers some explanation of the possible mechanism(s) underlying the clinical effectiveness of STN deep-brain stimulation (DBS), in which an implanted electrode provides constant electrical stimulation of approximately 120 Hz to the STN. The precise effects of this stimulation on neuronal activity in the basal ganglia are not fully understood and are likely to be many and varied (Kringelbach et al., 2010). The computational model of the basal ganglia of Kumar et al. (2011) included the effects of DBS through either a reduction of strength of cortex-STN synapses or inhibitory input onto STN and in both cases DBS was found to reduce oscillatory firing. In this model oscillations appear because the STN and GPe act as a pacemaker circuit due to the excitatory connections between STN neurons. It is clear that if DBS were added to our model in a similar manner then oscillations in the GP would be much reduced, as they are dependent on reasonably strong input from the STN. Another possible mechanism of DBS that has good experimental support is that it antidromically activates the fibers that project from the cortex to the STN (Li et al., 2007). The effect of this antidromic activation is a reduction of oscillatory activity in the cortical regions that project to the STN. Since our model only includes STN input to the GP, clearly a reduction of oscillatory STN activity would reduce GP oscillations as well. Wilson et al. (2011) found that in a relatively abstract model of the GP consisting of uncoupled phase oscillators synchronized to a common input, chaotic dynamics served to desynchronize the population at frequencies and intensities similar to those that are clinically effective for DBS. The same may also be true of our model when the GP neurons are entrained to β-frequency STN input but further investigation is required. Our model could also help to test and improve the effectiveness of new forms of DBS, such as that proposed by Popovych and Tass (2012) which involves using multiple electrodes to desynchronize groups of neurons that have become entrained to particular rhythms.

#### **4.4. THE EMERGENCE OF TI/TA GROUPS DEPENDS ON STRONG INHIBITORY COUPLING**

One possible weakness of our model is that it relies on intra-GP inhibition being much denser than is currently supported by experimental evidence. It has been estimated that the probability of a given GP neuron synapsing onto any other, randomly chosen, GP neuron is less than 1% (Sadek et al., 2007) but in most of our simulations this value is 20%. Our preliminary experiments with larger networks have suggested that the level of intra-GP coupling can be reduced while preserving the division into TI and TA groups by increasing the size of the network. Further work will involve investigating even larger (more computationally expensive) networks to see how much further the size of the network can be increased while maintaining the same division.

If further increases in network size cannot generate realistic activity with physiological levels of GP-GP coupling, there are several other possible reasons why the connectivity may be greater than has so far been measured experimentally. It has been suggested that the basal ganglia are organized into a series of partially overlapping "channels" (Alexander and Crutcher, 1990), where neurons preferentially synapse onto other neurons in the same channel. We have previously shown modeling evidence that increased coupling between channels may allow the STN-GP circuit to generate oscillations (Merrison-Hort et al., 2013), but in the present study we suggest that our small population of GP neurons could represent part of a single channel. Under this assumption, the proportion of coupled neurons might be much higher than would be seen from picking pairs of neurons from across the whole GP at random. It is also possible that the projection from GP to STN, which is not included in our model, may contribute to the effect of lateral inhibition since the trisynaptic GP-STN-GP pathway is a route by which GP neurons inhibit other GP neurons, and there is experimental evidence to suggest the strength of this pathway may be increased under Parkinsonian conditions (Johnson and Napier, 1997). However, it is hard to say whether or not this explanation is plausible without more detailed information about the topology of the STN-GP and GP-STN projections.

Similarly, the increase in GP-GP synaptic conductance that occurs under Parkinsonian conditions in our model may be larger than in reality. Although we have used data from paired-pulse experiments to choose the conductance of GP-GP synapses in the healthy case, it is not clear how much this increases by following loss of dopaminergic input. Miguelez et al. (2012) used an optogenetic technique to stimulate a number of GP neurons whilst recording IPSCs and found an increase of approximately 67% after dopamine lesioning. This is considerably smaller than the increase we use under Parkinsonian conditions, where the GP-GP conductance is doubled. However, the increase seen by Miguelez et al. (2012) may be lower than the increase in conductance at a single GP-GP synapse, since their method simultaneously activates many pre-synaptic GP neurons and the summation of the resulting IPSCs may not be linear.

#### **4.5. FUTURE WORK: STRIATAL INPUT, RECIPROCAL CONNECTIONS AND GP HETEROGENEITY MAY IMPROVE OUR RESULTS**

The aim of this study was primarily to investigate whether the hyperdirect pathway alone could account for one characteristic of the Parkinsonian GP and we have therefore only included STN-GP and GP-GP synaptic connectivity. However, the main source of synaptic input to the GP is the striatum, and it is clear that adding simulated inhibitory striatal synaptic input could improve our results. Galati et al. (2009) demonstrated that the delivery of a GABA<sub>A</sub> antagonist into the GP also causes cortical entrainment of the neurons there and that this effect is dependent on a functioning STN. They suggest that this demonstrates that inhibitory striatal input is also involved in oscillatory entrainment. This result is more difficult to explain in our model, since it is unlikely that decreasing the level of GP-GP inhibition would cause oscillations to appear in the (otherwise) healthy case. However, if the effect of GABA antagonism is to remove *tonic* background inhibition (probably from the striatum) then we could include this in our model as a depolarizing current injection to all GP neurons. This would move their membrane potentials closer to the spike threshold and make them more sensitive to the (weak) rhythmic STN input that is present in the healthy case. Furthermore, Tseng et al. (2001) showed that the activity of striatal projection neurons is modulated by cortical SWA and increases after 6-OHDA lesion. Including the effects of this in our model would probably increase the number of TI neurons and may allow us to reduce the amount of intra-GP inhibition to a more realistic level.

Another possible pathway that could be added to our model is the projection from the GP back to the STN. Computational models of networks that include this connection have shown that the STN and GP can work together to act as a pacemaker circuit (Gillies et al., 2002; Terman et al., 2002; Holgado et al., 2010). Terman et al. (2002) describe the results of simulating a spiking model of the interconnected STN-GPe network in which the tonic activity of both populations can become bursty with a regular bursting rhythm. In fact, the neurons in this model can self-organise into different sized clusters, which allows for the possibility of two anti-phase groups under some conditions. We expect that our model could support similar regimes if the GP-STN connection were added, provided that we also introduced a more realistic STN neuron model.

Although our model has demonstrated that discrete groups of neurons can emerge from a population of GP neurons with homogeneous (unimodally distributed) membrane properties, there is now some evidence that the neurons in the TA and TI groups are distinct in some ways, including the nature of local inhibitory connectivity, the basal ganglia nuclei that they project to, and in their chemistry (Mallet et al., 2012). Similarly, several studies (Nambu and Llinás, 1997; Cooper and Stanford, 2000; Bugaysen et al., 2010) have attempted to categorize GP neurons based on their electrophysiological properties, and their results seem to suggest that several distinct groups may exist (although the boundaries remain fuzzy). It would be very interesting to incorporate these results into our model, perhaps by making the parameter noise for the NaP or HCN channels bi- or tri-modal and by giving one group of neurons a higher degree of local connectivity than another. We expect that this would promote the emergence of the TI, TA and NM groups and would probably reduce the degree of GP-GP connectivity that is required in order to obtain results that are similar to experiments.

#### **ACKNOWLEDGMENTS**

We are grateful to A. Korngreen for sharing data regarding IPSPs of GP-GP synapses and P. Magill for friendly and useful feedback about our early experiments. Robert Merrison-Hort would like to acknowledge the generous support for workshop attendance given by the Fields Institute.

#### **FUNDING**

Robert Merrison-Hort and Roman Borisyuk are supported by the Faculty of Science and Technology, University of Plymouth.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 September 2013; accepted: 12 November 2013; published online: 02 December 2013.*

*Citation: Merrison-Hort R and Borisyuk R (2013) The emergence of two anti-phase oscillatory neural populations in a computational model of the Parkinsonian globus pallidus. Front. Comput. Neurosci. 7:173. doi: 10.3389/fncom.2013.00173*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Merrison-Hort and Borisyuk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Computational model of precision grip in Parkinson's disease: a utility based approach

## *Ankur Gupta†, Pragathi P. Balasubramani† and V. Srinivasa Chakravarthy\**

*Computational Neuroscience Laboratory, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Raju S. Bapi, University of Hyderabad, India Sebastien Helie, Purdue University, USA*

#### *\*Correspondence:*

*V. Srinivasa Chakravarthy, Computational Neuroscience Laboratory, Department of Biotechnology, Indian Institute of Technology Madras, Chennai 600036, India e-mail: schakra@ee.iitm.ac.in*

*†These authors have contributed equally to this work.*

We propose a computational model of Precision Grip (PG) performance in normal subjects and Parkinson's Disease (PD) patients. Prior studies on grip force generation in PD patients show an increase in grip force during ON medication and an increase in the variability of the grip force during OFF medication (Ingvarsson et al., 1997; Fellows et al., 1998). Changes in grip force generation in dopamine-deficient PD conditions strongly suggest a contribution of the Basal Ganglia, a deep brain system having a crucial role in translating dopamine signals to decision making. The present approach is to treat the problem of modeling grip force generation as a problem of action selection, which is one of the key functions of the Basal Ganglia. The model consists of two components: (1) the sensory-motor loop component, and (2) the Basal Ganglia component. The sensory-motor loop component converts a reference position and a reference grip force into lift force and grip force profiles, respectively. These two forces cooperate in grip-lifting a load. The sensory-motor loop component also includes a plant model that represents the interaction between the two fingers involved in PG and the object to be lifted. The Basal Ganglia component is modeled using Reinforcement Learning, with the significant difference that action selection is performed using a utility distribution instead of a purely Value-based distribution, thereby incorporating risk-based decision making. The proposed model is able to account accurately for the PG results from normal subjects and PD patients (Ingvarsson et al., 1997; Fellows et al., 1998). To our knowledge the model is the first model of PG in PD conditions.

**Keywords: precision grip, Parkinson's disease, basal ganglia, reinforcement learning, decision making**

**Abbreviations:** *AG/E/N*, Gains of Go/Explore/NoGo components of GEN equation in Equation (23); BG, Basal Ganglia; CE, cost function to evaluate the performance of lift; *CE*GEN, Cost function for optimizing GEN parameters; *DAhi* and *DAlo*, thresholds at which dynamics switches between Go, NoGo and Explore regimes; DM, decision making; DP, Direct Pathway; *EL*, position error; *expt*, experiment; *Ff*, frictional force; *FG*, grip force; *F*Gref, reference grip force; *FL*, Lift force; *F*slip, minimum force required to prevent the object from slipping; g, acceleration due to gravity; GEN, Go-Explore-NoGo; GPe, Globus Pallidus externa; GPi, Globus Pallidus interna; *h*, risk function; IP, Indirect Pathway; *KP,L*, *KI,L*, and *KD,L*, the proportional, integral and derivative gains of the Lift force controller; M, mass (subscript "O" and "fin" denote object and finger, respectively); *Mp*, maximum peak value of the response curve; PD OFF, Off medicated PD subjects; PD ON, On medicated PD subjects; PD, Parkinson's disease; PG, Precision Grip; PID, Proportional-Integral-Derivative controller; *r*, reward; RBFNN, Radial Basis Function Neural Networks; RL, Reinforcement Learning; *SGF*, Stable Grip Force; sim, simulation; SM, Safety margin; SNc, Substantia Nigra pars compacta; SNr, Substantia Nigra pars reticulata; STN, Subthalamic Nucleus; T, simulation time in milliseconds; *t*, trial; *Tp*, peaking time of the response curve; *U*, Utility function; *V*, Value function; *X*/*Ẋ*/*Ẍ*, position/velocity/acceleration (subscript "O" and "fin" denote object and finger, respectively); *X̄*, mean position (subscript "O" and "fin" denote object and finger, respectively); *X*ref, reference position; α, weight factor combining the value and the risk functions; δ, reward prediction error; δLim, clamped value of δ*U*; δMed, added δ*U* due to medication; δ*U*, change in Utility function; δ*U*, gradient of the Utility function; δ*V*, Temporal difference in value function; ζ, damping factor; λ*G/N*, sensitivities of the Go/NoGo component in Equation (23); μ, friction coefficient; μ*m*, means of RBFNN; ν, uniformly distributed noise; ξ, risk prediction error; π, policy; σ*E*, standard deviation used for the Explore component in Equation (23); σ*m*, standard deviations for RBFNN; φ, feature vector; ψ, random variable uniformly distributed between -1 and 1; ω*d*, the damped natural frequency; ω*n*, natural frequency.

#### **INTRODUCTION**

Precision grip (PG) is the ability to grip objects between the forefinger and thumb (Napier, 1956). Successful performance of PG requires delicate control of two forces (grip force, *FG*, and lift force, *FL*) exerted by two fingers on the object. In a grip-lift task, *FG* is kept sufficiently high to couple *FL* to the object through friction between the object and the fingers. An optimal *FL* is also required to overcome the object's weight and lift it off the surface of the table on which it rests. These forces (*FL* and *FG*) are thought to be generated in parallel by different subsystems in the brain (Ehrsson et al., 2001, 2003). The critical *FG* at which the object slips is called the slip force (*F*slip), and the difference between the actual steady-state *FG* (*SGF*) used in a successful lift and *F*slip is known as the safety margin (*SM* = *SGF* − *F*slip). Johansson and Westling (1988) demonstrated the SM in controls to be 40–50% of the slip force. A high SM is employed to prevent the object from slipping due to internal perturbations (accelerations due to arm motion) (Werremeyer and Cole, 1997) and external perturbations (random changes in object load) (Eliasson et al., 1995). Motor activity is optimized for the internal perturbations, and this optimality is lost on the addition of an external perturbation (Charlesworth et al., 2011; Sober and Brainard, 2012; Wolpert and Landy, 2012). An excessive SM is undesirable as it would cause muscle fatigue and may even lead to crushing of the object.
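As a small worked illustration of the safety margin definition above, the following sketch computes *SM* = *SGF* − *F*slip and expresses it as a fraction of the slip force. The function name and the numerical values (a slip force of 2.0 N and a stable grip force of 2.9 N) are hypothetical and chosen only so that the relative margin falls in the 40–50% range reported for controls.

```python
def safety_margin(stable_grip_force, slip_force):
    """Return the safety margin in newtons and as a fraction of the slip force.

    Both arguments are forces in newtons; the values used below are
    hypothetical and are not taken from the experimental literature.
    """
    if slip_force <= 0:
        raise ValueError("slip_force must be positive")
    sm = stable_grip_force - slip_force  # SM = SGF - F_slip
    return sm, sm / slip_force


# Hypothetical example: SGF = 2.9 N, F_slip = 2.0 N
sm_abs, sm_rel = safety_margin(stable_grip_force=2.9, slip_force=2.0)
print(f"SM = {sm_abs:.2f} N ({100 * sm_rel:.0f}% of slip force)")
```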

SM in grip force is a crucial and defining parameter of PG performance. *F*slip serves as the threshold below which the object cannot be lifted. Human subjects operate at *FG* that is much higher than *F*slip; operation at a small SM makes the gripping unstable. Therefore, this need to operate sufficiently far from the border of instability may be regarded as a strategy for minimizing risk. The need for a large SM indicates that concepts from theories of risk-dependent decision making (DM) may be applied to understand PG performance (Bell, 1995; D'Acremont et al., 2009). By definition, risk is the variance in reward outcome (Bell, 1995; D'Acremont et al., 2009). In the context of PG performance, a reward may be thought to be associated with successful lifts. The risk is maximum close to *F*slip and expected reward (value) saturates for grip forces much greater than *F*slip. Optimal PG consists of maximizing average reward while minimizing risk. The present study approaches the problem of PG in terms of risk minimization and describes it within the framework of Reinforcement Learning (RL). The model is used to explain PG performance in both controls and Parkinson's disease (PD) patients.
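The relationship between value and risk described above can be illustrated with a toy calculation. The sketch below is not the model developed in this paper: it assumes, purely for illustration, that a lift succeeds (reward 1) or fails (reward 0) with a logistic probability centered on *F*slip, with a hypothetical `steepness` parameter. Under that assumption the expected reward is the success probability *p* and the reward variance is *p*(1 − *p*), so the risk peaks at *F*slip and the value saturates for grip forces well above it.

```python
import math


def success_probability(grip_force, slip_force, steepness=4.0):
    """Hypothetical logistic model of lift success (an assumption, not from the paper)."""
    return 1.0 / (1.0 + math.exp(-steepness * (grip_force - slip_force)))


def value_and_risk(grip_force, slip_force, steepness=4.0):
    """Expected reward (value) and reward variance (risk) for a 0/1 reward."""
    p = success_probability(grip_force, slip_force, steepness)
    value = p              # mean of a Bernoulli reward
    risk = p * (1.0 - p)   # variance of a Bernoulli reward
    return value, risk


slip = 2.0  # hypothetical slip force in newtons
for fg in (1.0, 2.0, 3.0, 7.0):
    v, h = value_and_risk(fg, slip)
    print(f"F_G = {fg:.1f} N: value = {v:.3f}, risk = {h:.3f}")
```

In this toy setting, a decision rule that rewards high value and penalizes high risk favors grip forces above *F*slip, which is the qualitative behavior that motivates the utility-based formulation.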

PG studies in PD patients show a remarkable difference in SM patterns between PD patients and controls (Ingvarsson et al., 1997; Fellows et al., 1998). PD patients were shown to be capable of storing and recalling previous lift parameters (Muller and Abbs, 1990; Ehrsson et al., 2003). This allows them to scale *FG* when the load force on the object changes (Gordon et al., 1997; Fellows et al., 1998). Interestingly, even though *FG* scaling is preserved, scientific community is divided on the question of sensory deficits in PD patients being a cause for their altered SM. A sensory deficit would lead to suboptimal sensory-motor coordination. Some studies support the above theory of sensory deficit in PD (Moore, 1987; Schneider et al., 1987; Klockgether et al., 1995; Jobst et al., 1997; Nolano et al., 2008) and others (Gordon et al., 1997; Ingvarsson et al., 1997) reject it. Ingvarsson et al. (1997) demonstrated that the controls and PD OFF patients generated nearly similar *SGF*s. A higher *SGF* was generated in PD ON (L-Dopa Medication) case when the lifted object was covered with silk, suggesting a higher safety margin in PD ON condition (Ingvarsson et al., 1997). It has been suggested that this increase in *SGF* may be due to L-Dopa induced hyperkinesias (Ingvarsson et al., 1997; Gordon and Reilmann, 1999). In another study, Fellows et al. (1998) observed that PD ON subjects show a higher *SGF* than controls, but there was no mention of PD OFF results (Fellows et al., 1998). Reports also suggest a considerably higher *SGF* variance in PD OFF condition when compared to controls and PD ON condition (Ingvarsson et al., 1997). This may indicate the importance of the concepts of *risk sensitivity* in understanding the *SGF* in controls, PD ON and PD OFF conditions. Furthermore, recent evidence suggests that risk-takers are less prone to Parkinson's disease (Sullivan et al., 2012). 
PD medications such as L-Dopa (Ehrsson et al., 2003) and dopamine agonists (Claassen et al., 2011) increase impulsivity and risk-seeking behavior in PD patients. PD subjects under medication tend to show less sensitivity to negative outcomes and therefore tend to make risky choices (Wu et al., 2009). The effect of PD medication (dopamine agonist) in enhancing riskseeking tendency was also confirmed using the Balloon Analog Risk Task—an elegant assay for risk-related behavior (Claassen et al., 2011). The impairment of risk-processing in PD patients (Ehrsson et al., 2003; Wu et al., 2009; Claassen et al., 2011), and altered SM in PD, makes a risk-based motor control approach to PG performance even more compelling.

None of the previously proposed computational models for PG lift tasks (Kim and Inooka, 1994; Fagergren et al., 2003; de Gruijl et al., 2009) explains the grip force levels used by controls and PD patients. Hence, the need for a computational model that explains grip forces in controls and PD patients forms the motivation for the present work.

Drawing from the aforementioned presentation of facts, we propose to model PG performance, and its alterations in PD condition, using the mathematics of risk. We use the concept of utility function, a combination of value and risk components, embedded in the framework of Reinforcement Learning (RL) (Bell, 1995; Long et al., 2009; Wu et al., 2009; Wolpert and Landy, 2012). Concepts from RL have been used extensively in the past to model the function of the Basal Ganglia (BG) in control and PD conditions (Sridharan et al., 2006; Gangadhar et al., 2008, 2009; Chakravarthy et al., 2010; Krishnan et al., 2011; Magdoom et al., 2011; Kalva et al., 2012; Pragathi Priyadharsini et al., 2012; Sukumar et al., 2012). In a recent modeling study, we used the utility function to model the role of the BG in reward, punishment and risk based learning (Pragathi Priyadharsini et al., 2012).

We now present a computational model for human PG performance in controls and PD subjects in (ON/OFF) medicated states. Using risk-based DM to model PG performance, we show the alteration of *FG* in PD patients (Wolpert and Landy, 2012) in a modified RL framework. Modeling results match favorably with experimental PG performance.

The paper is organized as follows. Section "Model" presents the model. Section "The Precision Grip Control System" presents the PG control system. Section "The Utility Function Formulation and Computing *U*(*F*Gref)" presents the utility function formulation and Section "Modeling Precision Grip Performance as Risk-Based Action selection" presents a model of the BG based on the same (Magdoom et al., 2011; Pragathi Priyadharsini et al., 2012; Sukumar et al., 2012). In the results section, the model of the BG is used to explain the PG performance of PD patients described by Ingvarsson et al. (1997) and Fellows et al. (1998). A discussion of the proposed model and modeling results is presented in the final section.

## **MODEL**

## **THE COMPLETE PROPOSED PRECISION GRIP MODEL IN A NUTSHELL**

1. We first define a closed-loop control system in which the plant (the finger and object system) is controlled by two controllers: a *FL* controller and a *FG* controller. There are two inputs to the entire loop: a reference grip force (*F*Gref) and a reference position (*X*ref). The reference position, the position to which the object must be lifted, is predefined for a given task by the experimenter. We are now left with *F*Gref as the crucial parameter that determines the PG performance of the control system. *F*Gref is given as a step input to the *FG* controller; the output of the controller, *FG*(*T*), is used to grip the object ('*T*' denotes simulation time in *milliseconds*). The challenge consists of finding the *F*Gref that leads to successful lifts.


#### **THE PRECISION GRIP CONTROL SYSTEM**

PG performance consists of two fingers and an object interacting through friction (represented by friction coefficient μ). A free body diagram showing the various forces acting on the fingers and the object is shown in **Figure 1**. The fingers, shown in two parts on either side of the object, represent the index finger and the thumb. For simplicity we assume that the two fingers are identical in mass and shape.

*FG* is the grip force applied on the object horizontally acting in opposite directions. *FL* is the lift force acting at the finger-object interface, lifting the object up. By the action of the *FG* pressing on the object, and due to the friction between the finger and object, *FL* gets coupled to the object. The forces thus emerging between the finger and object are shown in **Figure 1B**. The frictional force *Ff* acts on the object in the upward direction, with *Ff* /2 on either side of the object.

**Figure 2A** illustrates the interaction of *FL* and *FG* in controlling the position of the object (*Xo*). The model consists of two controllers (*FL* and *FG* controllers) and a plant. Inputs to the plant are *FL* and *FG*, while the outputs are the object position (*Xo*), the finger position (*X*fin) and their first and second derivatives (*Ẋo*, *Ẍo*, *Ẋ*fin, *Ẍ*fin). The objective of the control task is to lift the object to a reference position, *X*ref. The full model thus receives *F*Gref and *X*ref as inputs and produces the positions, velocities and accelerations of the finger and the object as outputs. The plant is described in detail in Appendix A.

The following sections describe the design of the controllers (*FG* and *FL*) followed by their training method, respectively.

## *The grip force (FG) controller*

The *FG* controller (designed as a second order system) is used to generate the *FG* which couples the fingers to the object. Typical *FG* profiles of human subjects show a peak and a return to a steady state value, resembling the step-response of an underdamped second order system, thereby justifying the choice of an underdamped second order system as a minimal model. The *FG* controller for a step input (*F*Gref) is given in Equation (1).

$$F_G = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}\tag{1}$$
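The time-domain behavior implied by Equation (1) can be sketched with the standard closed-form step response of an underdamped second-order system (see, e.g., Ogata, 2002). The sketch assumes that Equation (1) acts as a unit-gain transfer function driven by a step of height *F*Gref; the function name and the numerical values of ω*n*, ζ and *F*Gref are hypothetical and are not the fitted parameters of the model.

```python
import math


def grip_force_step_response(t, f_gref, omega_n, zeta):
    """Step response of the underdamped second-order system in Equation (1).

    t is the time since step onset (in units consistent with omega_n),
    f_gref the step height, omega_n the natural frequency and zeta the
    damping factor (0 < zeta < 1).
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("zeta must lie strictly between 0 and 1 (underdamped case)")
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency
    phi = math.acos(zeta)                           # phase offset
    envelope = math.exp(-zeta * omega_n * t) / math.sqrt(1.0 - zeta ** 2)
    return f_gref * (1.0 - envelope * math.sin(omega_d * t + phi))


# Hypothetical parameters: omega_n = 10 rad/s, zeta = 0.5, F_Gref = 3 N
omega_n, zeta, f_gref = 10.0, 0.5, 3.0
t_peak = math.pi / (omega_n * math.sqrt(1.0 - zeta ** 2))
print("F_G at peak time:", grip_force_step_response(t_peak, f_gref, omega_n, zeta))
print("F_G at steady state:", grip_force_step_response(10.0, f_gref, omega_n, zeta))
```

The response starts at zero, overshoots the reference at the peak time and then settles to *F*Gref, which is the qualitative shape of the human *FG* profiles that motivates the second-order model.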

In order to determine the values of the natural frequency, ω*n*, and the damping factor, ζ, the maximum overshoot (*Mp*, defined as the maximum peak value of the response curve) and the time to peak (*Tp*, peaking time of the response curve) are required. Using previously published experimental values (Johansson and Westling, 1984) for *Mp* and *Tp*, the *FG* controller parameters are obtained using Equations (2, 3) (Ogata, 2002).

$$M_p = e^{-\left(\zeta \omega_n / \omega_d\right) \pi} \tag{2}$$

$$T_p = \frac{\pi}{\omega_d}\tag{3}$$

**FIGURE 2 | Block diagram showing (A) the interaction of the various components and their corresponding inputs and outputs.** *X*, *X*˙ , and *X*¨ are the position, velocity, and acceleration; subscript "fin" and "o" denote finger and object, respectively; **(B)** the control loop used for *FL* controller design. The grip force in the full system of panel **(A)** is set to a constant value of 10N.


where ω*d*, the damped natural frequency, is given by

$$
\omega_d = \omega_n \sqrt{1 - \zeta^2} \tag{4}
$$

See "Controller Training from the Model of The Precision Grip Control System" for the details of the above calculations.
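Equations (2)–(4) can be inverted in closed form to recover ζ and ω*n* from a measured overshoot and peak time. The sketch below assumes that *Mp* in Equation (2) is expressed as a fractional overshoot relative to the steady-state value (between 0 and 1); the function name and the example values of *Mp* and *Tp* are hypothetical and are not the experimental values of Johansson and Westling (1984).

```python
import math


def second_order_params(m_p, t_p):
    """Recover (zeta, omega_n, omega_d) from overshoot m_p and peak time t_p.

    From Equations (2) and (4): ln(m_p) = -zeta * pi / sqrt(1 - zeta**2),
    which gives zeta = -ln(m_p) / sqrt(pi**2 + ln(m_p)**2).
    From Equation (3): omega_d = pi / t_p, and Equation (4) then gives omega_n.
    """
    if not 0.0 < m_p < 1.0:
        raise ValueError("m_p must be a fractional overshoot between 0 and 1")
    if t_p <= 0.0:
        raise ValueError("t_p must be positive")
    log_mp = math.log(m_p)
    zeta = -log_mp / math.sqrt(math.pi ** 2 + log_mp ** 2)
    omega_d = math.pi / t_p
    omega_n = omega_d / math.sqrt(1.0 - zeta ** 2)
    return zeta, omega_n, omega_d


# Hypothetical example: 25% overshoot reached 0.3 time units after step onset
zeta, omega_n, omega_d = second_order_params(m_p=0.25, t_p=0.3)
print(f"zeta = {zeta:.3f}, omega_n = {omega_n:.3f}, omega_d = {omega_d:.3f}")
```

The time unit of ω*n* follows that of *Tp*, so a peak time given in milliseconds yields a natural frequency in radians per millisecond.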

#### *Lift force (FL) controller*

The *FL* controller, which is a Proportional-Integral-Derivative (PID) controller [Equation (5)], takes the position error (*EL*) as input [Equation (6)] and produces a time-varying *FL* profile (*FL*,PID) as output [Equation (5)], which in turn controls the object position.

$$F\_{L,\text{PID}} = K\_{P,L}E\_L + K\_{I,L} \int\_0^T E\_L(\mathbf{r}\_n)d\mathbf{r}\_n + K\_{D,L} \frac{dE\_L}{dT} \tag{5}$$

$$E\_L = X\_o - X\_{\text{ref}} \tag{6}$$

Here the *KP*,*L*, *KI*,*<sup>L</sup>* and *KD*,*<sup>L</sup>* are the proportional, integral and derivative gains, respectively, for the *FL* controller.

The PID controller output is non-zero initially, which is not realistic since the initial value of *FL* must be zero. Hence, *FL*,PID is smoothed using Equation (7).

$$
\tau\_s \frac{dF\_L}{dT} = -F\_L + F\_{L, \text{PID}} \tag{7}
$$

where *FL* (*T* = 0) = 0.
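As an illustration of Equations (5) and (7), the sketch below discretizes the PID law and the first-order smoothing with a forward-Euler step. It is a minimal sketch and not the implementation used in the study: the function name, the time step, and the choice of Euler integration are assumptions, and the error samples are supplied by the caller so that no sign convention for *EL* is imposed. The proportional gain and τ*<sup>s</sup>* in the example call are the values reported later in the paper.

```python
def smoothed_pid(errors, dt, kp, ki, kd, tau_s):
    """Forward-Euler discretization of Eqs. (5) and (7); returns the F_L samples."""
    f_l = 0.0                 # initial condition F_L(T = 0) = 0
    integral = 0.0
    prev_error = errors[0]    # makes the first derivative estimate zero
    trajectory = [f_l]
    for error in errors:
        integral += error * dt
        derivative = (error - prev_error) / dt
        f_pid = kp * error + ki * integral + kd * derivative   # Eq. (5)
        f_l += (dt / tau_s) * (f_pid - f_l)                    # Eq. (7)
        trajectory.append(f_l)
        prev_error = error
    return trajectory


# Illustrative call (assumed inputs): constant unit error, proportional term only.
traj = smoothed_pid([1.0] * 2000, dt=0.001, kp=6.938, ki=0.0, kd=0.0, tau_s=0.087)
```

With a constant error and only the proportional term active, the smoothed output starts at zero and rises toward the raw PID output instead of jumping to it, which is the behavior Equation (7) is introduced to obtain.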

In order to design the *FL* controller, we simplify the full system of **Figure 2A** as a *FL* controller with a high constant *FG* (10 N) to prevent the slip (**Figure 2B**). Note that if a constant and high value of *FG* is assumed, slip is completely prevented, and the *FG* controller is effectively eliminated from the complete system (**Figure 2B**). The *FL* controller design now involves lifting a simple inertial load straight up from an initial position (*Xo* = 0 m) to a final position (*Xo* = 0.05 m).

Performance of the lift is evaluated using the cost function, CE [Equation (8)]. This cost function comprises (1) the average position difference between the finger (*X*¯ fin) and the object (*X*¯ *<sup>o</sup>*) at the end of the trial and (2) the difference between the desired and actual average position of the object.

$$\text{CE (F}\_{\text{Gref}}) = 0.5 \left( \frac{\bar{X}\_{\text{fin}} - \bar{X}\_o}{\bar{X}\_{\text{fin}}} \right)^2 + 0.5 \left( \frac{X\_{\text{ref}} - \overline{X}\_o}{X\_{\text{ref}}} \right)^2 \tag{8}$$
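Equation (8) can be written as a short function. The sketch below is only illustrative; the function and argument names are assumptions, and the averaged positions are taken as already computed by the caller.

```python
def lift_cost(x_fin_mean, x_obj_mean, x_ref):
    """Cost CE of Eq. (8): equally weighted normalized slip and position errors."""
    slip_term = ((x_fin_mean - x_obj_mean) / x_fin_mean) ** 2
    position_term = ((x_ref - x_obj_mean) / x_ref) ** 2
    return 0.5 * slip_term + 0.5 * position_term


# Illustrative values in metres: a perfect lift, and an object left on the table.
perfect = lift_cost(x_fin_mean=0.05, x_obj_mean=0.05, x_ref=0.05)
failed = lift_cost(x_fin_mean=0.05, x_obj_mean=0.0, x_ref=0.05)
```

Both terms are normalized, so CE is dimensionless. Note that the first term is undefined if the averaged finger position is zero.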

The *FL* PID controller parameters were then optimized with respect to the cost function (CE) using a Genetic Algorithm (GA) (Goldberg, 1989; Whitley, 1994) (see **Figure 3** for the block diagram and Supplementary Material A for the parameters), keeping *FG* constant (=10 N) at a sufficiently high value so that the object does not slip.

The *FL* controller described above is designed assuming a constant and high grip force. However, when both the *FG* and *FL* controllers are inserted in the full system (**Figure 2A**), the system may behave very differently because *FG* builds up gradually from zero. When a step input of magnitude *F*Gref is given to the *FG* controller, *FG* starts from 0, approaches a peak value, and stabilizes at a steady-state value known as the Stable Grip Force (*SGF*). Even with the trained controllers, the object may slip for low values of *F*Gref. Once *F*Gref is sufficiently high, slip is prevented and the object can be lifted successfully. Therefore, for a successful lift, an optimal value of *F*Gref needs to be determined. The optimal *F*Gref, which is related to the *SGF* needed for a successful grip/lift performance, varies with the experimental setup, skin friction, etc. (Ingvarsson et al., 1997; Fellows et al., 1998). It is here that we use concepts from RL and the utility function formulation to search the *F*Gref state space.

#### **THE UTILITY FUNCTION FORMULATION**

The optimality of a decision is measured by the rewards it fetches. Selecting the choice that provides the maximum expected value of the rewards (value) is known as optimal decision making (DM). DM can be seen in stock exchanges, corporations, and even our daily lives (which may or may not involve explicit monetary rewards). Much DM is carried out subconsciously, with the person unaware of the reason for choosing one's decision (Ferber et al., 1967). Non-human primates also show DM capabilities (Lakshminarayanan et al., 2011; Leathers and Olson, 2012). A mathematical framework is essential to empirically understand subjective DM behavior. Various independent studies confirm the role of value (average reward) and risk (variance in the rewards) in DM (Milton and Savage, 1948; Markowitz, 1952; Hanoch and Levy, 1969; Kahneman and Tversky, 1979; Bell, 1995; Lakshminarayanan et al., 2011).

A search for components of DM in the living world led to the identification of neural correlates of value and risk in human and non-human primates (D'Acremont et al., 2009; Schultz, 2010; Lakshminarayanan et al., 2011; Leathers and Olson, 2012). The human DM process takes account of risk in addition to the mean rewards that decisions fetch. Furthermore, many neurobiological correlates of risk sensitivity have been found in support of risk-based DM in humans (Wu et al., 2009; Schultz, 2010; Zhang et al., 2010; Wolpert and Landy, 2012).

An important instance of a risk-based DM model is the utility function formulation, which combines value and risk information (D'Acremont et al., 2009). The utility function used in the current model of PG performance derives from studies (Bell, 1995; D'Acremont et al., 2009) that define utility (*U*) as a weighted sum of value (*V*), which represents expected reward, and risk (*h*), which denotes variance in the reward. The weighting factor, λ, in this combination of *V* and *h* denotes risk preference [Equation (9)].

$$U = V - \lambda\sqrt{h} \tag{9}$$

We here introduce a few terms from RL used in our study. Following a policy (π), the expected value (*V*) of the state (*s*) at trial (*t*) is given by Equation (10).

$$V(t) = E\_{\pi} \left( r(t+1) + \gamma\, r(t+2) + \gamma^2\, r(t+3) + \dots \mid s(t) = s \right) \tag{10}$$

where E is the expectation, γ is the discount factor denoting the myopicity in the prediction of future rewards, and the reward is denoted by "*r*." *V* is updated according to Equation (11).

$$V(t+1) = V(t) + \eta\_V \delta(t) \tag{11}$$

where η*<sup>V</sup>* is the learning rate for *V* and "δ" is the temporal difference error or the reward prediction error, and is given by Equation (12)

$$\delta(t) = r(t+1) + \gamma \, V(t+1) - V(t) \tag{12}$$

Similarly, the risk prediction error, "ξ(*t*)," is given by Equation (13).

$$
\xi(t) = \delta(t)^2 - h(t) \tag{13}
$$

where, the risk function denoted by (*h*) is the variance in the prediction error [Equation (14)]

$$h(t+1) = h(t) + \eta\_{\text{risk}} \,\xi(t) \tag{14}$$

Here ηrisk is the learning rate for risk. We now introduce a modified version of Utility (*U*) [Equation (9)], as follows,

$$U(t) = V(t) - \alpha \operatorname{sign}(V(t)) \sqrt{h(t)}\tag{15}$$

where "α sign(*V*(*t*))" is the risk preference (Pragathi Priyadharsini et al., 2012).

The sign(*V*) term achieves a familiar feature of human decision making viz., risk-aversion for gains and risk-seeking for losses (Kahneman and Tversky, 1979). In other words, when *V* is positive (negative), *U* is maximized (minimized) by minimizing (maximizing) risk. We now use the above basic framework for modeling the Utility as a function of *F*Gref.
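The updates in Equations (11)–(15) can be sketched for a single state as follows. This is a minimal illustration rather than the authors' code: the function names, the learning rates, and the constant reward are assumptions, and the risk prediction error is read as the squared TD error minus the current risk estimate.

```python
import math


def td_risk_update(value, risk, reward, eta_v, eta_risk, gamma=0.0, next_value=0.0):
    """One update of value (Eqs. 11-12) and risk (Eqs. 13-14)."""
    delta = reward + gamma * next_value - value   # reward prediction error, Eq. (12)
    xi = delta ** 2 - risk                        # risk prediction error, Eq. (13)
    return value + eta_v * delta, risk + eta_risk * xi


def utility(value, risk, alpha):
    """Utility with sign-dependent risk preference, Eq. (15)."""
    sign = (value > 0) - (value < 0)
    return value - alpha * sign * math.sqrt(max(risk, 0.0))


# Illustrative run (assumed settings): a constant reward of 1 and no successor state.
value, risk = 0.0, 0.0
for _ in range(500):
    value, risk = td_risk_update(value, risk, reward=1.0, eta_v=0.1, eta_risk=0.1)
```

With a deterministic reward the value estimate approaches the reward and the risk estimate decays toward zero, so utility approaches value. For a fixed non-zero risk, `utility` lowers *U* when *V* is positive and raises it when *V* is negative, which is the asymmetry described above.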

#### **COMPUTING U(***F***Gref)**

We now explain how the above-described Utility function formulation is applied in the present study. The Utility function, *U*, and its components *V* and *h*, are expressed as functions of *F*Gref, which is taken as the state variable. A given value of *F*Gref results in either slip or successful lift of an object. The measure of performance is expressed by CE [Equation (8)].

The value "*V*" and risk "*h*" are computed as functions of *F*Gref by repeatedly simulating the lift for a range of values around *F*Gref. This helps us analyze the variability in PG task performance associated with *F*Gref. The range of values was obtained by adding uniformly distributed noise (ν: refer **Table 1**) to *F*Gref [Equation (16)]. We modeled ν to scale with *F*slip, i.e., the value of ν is increased for a low μ.

$$
\hat{F}\_{\text{Gref}} = F\_{\text{Gref}} + \nu \tag{16}
$$

For a given value of *F*Gref, *VCE* is derived from the performance measure CE [Equation (8)] as given in Equation (17). Refer to **Figure 4** for the block diagram for determining *VCE*.

$$V\_{\text{CE}} \left( \hat{F}\_{\text{Gref}} \right) = e^{-\text{CE} \left( \hat{F}\_{\text{Gref}} \right)} \tag{17}$$

There is no explicit reward in the present formulation. *VCE* represents a reward-like quantity, the average of which is linked to value. Value [Equation (18)] and risk [Equation (19)] were calculated as the mean and variance of the *VCE*,

$$V(F\_{\text{Gref}}) = \text{mean}\left(V\_{\text{CE}}\left(\hat{F}\_{\text{Gref}}\right)\right) \tag{18}$$

$$h\left(F\_{\text{Gref}}\right) = \text{var}\left(V\_{\text{CE}}\left(\hat{F}\_{\text{Gref}}\right)\right) \tag{19}$$
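The sampling procedure of Equations (16)–(19) can be sketched as below. The lift simulation is replaced here by a hypothetical stand-in cost, `toy_cost`, which is high below an assumed slip force and near zero above it; this stand-in, the symmetric noise interval, the sample count, and all names are assumptions made for illustration and are not taken from the study.

```python
import math
import random
import statistics


def toy_cost(f_gref, f_slip=3.0, steepness=4.0):
    """Hypothetical stand-in for CE: close to 1 below f_slip, close to 0 above it."""
    return 1.0 / (1.0 + math.exp(steepness * (f_gref - f_slip)))


def value_and_risk(f_gref, noise_halfwidth, cost_fn, n_samples=2000, rng=None):
    """Estimate V (Eq. 18) and h (Eq. 19) by sampling noisy F_Gref values."""
    rng = rng or random.Random(0)
    v_ce = []
    for _ in range(n_samples):
        f_hat = f_gref + rng.uniform(-noise_halfwidth, noise_halfwidth)  # Eq. (16)
        v_ce.append(math.exp(-cost_fn(f_hat)))                           # Eq. (17)
    return statistics.fmean(v_ce), statistics.pvariance(v_ce)


v_low, h_low = value_and_risk(0.5, 0.5, toy_cost)     # well below the assumed slip force
v_mid, h_mid = value_and_risk(3.0, 0.5, toy_cost)     # at the assumed slip force
v_high, h_high = value_and_risk(8.0, 0.5, toy_cost)   # well above it
```

Under these assumptions the estimated value rises with *F*Gref while the estimated risk is largest near the slip force, where small perturbations of *F*Gref change the outcome most.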

**Table 1 | Parameters used in simulation: Mass of the object (***Mo***), friction coefficient (μ) and noise added (ν) for different cases simulated in the study.**


We now have a set of numerical values of *F*Gref and the corresponding *V* and *h* values. These numerical values are used to construct *V* and *h* as explicit functions of *F*Gref using Radial Basis Function Neural Networks (RBFNNs), explained in Appendix B. Equations (18, 19) above apply to all the trials. These processes are summarized in **Table 2**.

Specifically, for a *F*Gref chosen at trial, *t*, *U*(*F*Gref(*t*)) is a combination of the *V*(*F*Gref(*t*)) and *h*(*F*Gref(*t*)). Utility, *U*(*F*Gref(*t*)) [Equation (20)], is then obtained in terms of *V* and *h*, as shown below:

$$U\left(F\_{\text{Gref}}(t)\right) = V\left(F\_{\text{Gref}}(t)\right) - \alpha \sqrt{h\left(F\_{\text{Gref}}(t)\right)}\tag{20}$$

Decisions are made by choosing actions that maximize *U*(*F*Gref(*t*)). Increasing the value of α makes the decisions more risk-averse, while smaller values of α make them more risk-seeking. *U*(*F*Gref(*t*)) is maximized by a stochastic hill-climbing process called the "Go/Explore/NoGo" (GEN) method. This method is inspired by the dynamics of the BG and was described in greater detail in earlier work (Magdoom et al., 2011). We now present a brief account of the GEN method.

#### **MODELING PRECISION GRIP PERFORMANCE AS RISK-BASED ACTION SELECTION**

The key underlying idea of the proposed model is to treat the problem of choosing the right *FG* as an action selection problem and thereby suggest a link between the action selection function of the BG and PG performance. Impairment in action selection machinery of the BG under PD conditions is then invoked to explain *FG* changes in PD ON and OFF conditions.

In line with the tradition of the Actor-Critic approach to modeling the BG (Joel et al., 2002), we recently proposed a model of the BG in which value is thought to be computed in the striatum. Furthermore, we hypothesized that the action selection function of the BG is accomplished by performing a form of stochastic hill-climbing over the value function computed in the striatum. We dubbed this stochastic hill-climbing method the "Go/Explore/NoGo" (GEN) method (Magdoom et al., 2011; Kalva et al., 2012), which denotes a conceptual expansion of the classical Go/NoGo picture of BG function. As an extension of the above model, we recently proposed a model of the BG in which the striatum computes not just value but the Utility function (Pragathi Priyadharsini et al., 2012). Action selection is then achieved by applying the GEN method to the Utility function. In the present study, we apply the GEN method to the Utility function and estimate grip forces in control and PD conditions.

### **Table 2 | Steps to train value =** *V* **(***F***Gref) and risk =** *h***(***F***Gref).**

• *F*Gref is randomly chosen between 1 N and 10 N.


#### *Neurobiological interpretation of the GEN method in controls*

We now present some background and neurobiological interpretation of the GEN method in connection with functional understanding of the BG, following which details of the GEN method will be provided.

The BG system consists of 7 nuclei situated on two parallel pathways that form loops—known as the direct pathway (DP) and the indirect pathway (IP)—starting from the cortex and returning to the cortex via the thalamus. Striatum is the input port of the BG, which receives inputs from the cortex. The Globus Pallidus interna (GPi) and the Substantia Nigra pars reticulata (SNr) are the output ports that project to thalamic nuclei, which in turn project to the cortex, closing the loop. The DP consists of the striatum projecting directly to GPi/SNr, while the IP consists of a longer route starting from the striatum and returning to GPi/SNr via Globus Pallidus externa (GPe) and the Subthalamic Nucleus (STN) in that order. The striatum receives dopaminergic projections from the Substantia Nigra pars compacta (SNc) (Mink, 1996; Smith et al., 1998). Striatal dopamine levels are thought to switch between DP and IP, since the DP (IP) is selected for higher (lower) levels of dopamine (Sutton and Barto, 1998; Frank, 2005; Wu et al., 2009) (**Figure 5**).

In classical accounts of BG function, the DP is known as the Go pathway since it facilitates movement, and the IP is called the NoGo pathway since it inhibits movement. Striatal dopamine levels are thought to switch between Go and NoGo regimes (Albin et al., 1989). We have been developing a view of BG modeling in which the classical Go/NoGo picture is expanded to a Go/Explore/NoGo picture, wherein a new Explore regime is inserted between the classical Go and NoGo regimes (Kalva et al., 2012). This Explore regime corresponds to random exploration of the action space, which is a necessary ingredient of any RL machinery. Kalva et al. (2012) show that the Explore regime arises naturally due to the chaotic dynamics of the STN-GPe loop in the IP. The three regimes together amount to a stochastic hill-climbing, which we describe as the GEN method. The GEN method has been used in the past to describe a variety of functions of the BG, in control and PD conditions, including reaching movements (Magdoom et al., 2011) and spatial navigation (Sukumar et al., 2012).

**(Indirect Pathway) specified.**

Magdoom et al. (2011) used the GEN method to hill-climb over the value function. δ*V*(*t*) is defined as the temporal difference in the value function [Equation (21)].

$$\delta\_V(t) = V(F\_{\text{Gref}}(t)) - V(F\_{\text{Gref}}(t-1)) \tag{21}$$

where "*t*" is not "time" but "trial number."

The GEN method used in Magdoom et al. (2011) can be summarized using the following Equation (22),

$$\Delta F\_{\text{Gref}}(t) = \begin{cases} +\Delta F\_{\text{Gref}}(t-1), & \text{if } \delta\_V(t) > DA\_{hi} \quad \text{("Go")} \quad \text{(a)} \\ \psi, & \text{if } DA\_{lo} < \delta\_V(t) \le DA\_{hi} \quad \text{("Explore")} \quad \text{(b)} \\ -\Delta F\_{\text{Gref}}(t-1), & \text{if } \delta\_V(t) \le DA\_{lo} \quad \text{("NoGo")} \quad \text{(c)} \end{cases} \tag{22}$$

where ψ is a random vector with ||ψ|| = χ, a positive constant. *DAhi* and *DAlo* are the thresholds at which the dynamics switch between the Go, Explore, and NoGo regimes [Equation (22)]. The underlying logic of Equations (22a–c) is as follows. If δ*V*(*t*) > *DAhi*, the last update resulted in a sufficiently large increase in *V*; therefore continue in the same direction in the next step. This case is called the "Go" regime (Equation 22a). If δ*V*(*t*) ≤ *DAlo*, the last update resulted in a significant decrease in *V*; therefore proceed in the direction opposite to the previous step. This case is called the "NoGo" regime (Equation 22c). If *DAlo* < δ*V*(*t*) ≤ *DAhi*, there was neither a marked increase nor a marked decrease in *V*; therefore explore new directions. This case is called the "Explore" regime (Equation 22b). In Magdoom et al. (2011) we assumed a simple symmetry between *DAhi* and *DAlo*, such that *DAhi* > 0 and *DAlo* = −*DAhi*.
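The three-regime rule of Equation (22) can be sketched for a scalar *F*Gref as follows. Because *F*Gref is one-dimensional here, the random vector ψ of norm χ is assumed to reduce to a step of ±χ chosen at random; this reduction, the function name, and the numerical values in the example calls are illustrative assumptions.

```python
import random


def gen_step(delta_v, prev_step, da_hi, chi, rng=random):
    """Piecewise Go/Explore/NoGo rule of Eq. (22) with DA_lo = -DA_hi."""
    da_lo = -da_hi
    if delta_v > da_hi:                  # Go: keep moving in the same direction
        return prev_step
    if delta_v > da_lo:                  # Explore: random step of magnitude chi
        return rng.choice((-chi, chi))
    return -prev_step                    # NoGo: reverse the previous step


go = gen_step(delta_v=0.5, prev_step=0.2, da_hi=0.1, chi=0.05)
explore = gen_step(delta_v=0.0, prev_step=0.2, da_hi=0.1, chi=0.05)
nogo = gen_step(delta_v=-0.5, prev_step=0.2, da_hi=0.1, chi=0.05)
```

Applying the returned step to *F*Gref on each trial and recomputing δ*V* gives the stochastic hill-climbing described in the text.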

However, in the present study we introduce a small variation of the above formulation of the GEN method. Instead of *V*, we perform hill-climbing over the Utility landscape. The three separate Equations (22a–c) can be combined into a single Equation (23) [as in Sukumar et al. (2012)], as follows:

$$\begin{aligned} \Delta F\_{\text{Gref}}(t) = {} & A\_G \,\text{logsig}\left( \lambda\_G \delta\_U(t) \right) \Delta F\_{\text{Gref}}(t-1) \\ & - A\_N \,\text{logsig}\left(\lambda\_N \delta\_U(t)\right) \Delta F\_{\text{Gref}}(t-1) \\ & + A\_E\, \psi \exp\left(-\delta\_U^2(t)/\sigma\_E^2\right) \end{aligned} \tag{23}$$

where

$$\delta\_U(t) = U\left(F\_{\text{Gref}}(t)\right) - U\left(F\_{\text{Gref}}(t-1)\right) \tag{24}$$

and

$$\text{logsig}(n) = \frac{1}{1 + \exp(-n)}\tag{25}$$


Just as the TD error is interpreted as the dopamine signal in classical RL-based accounts of BG function, we interpret δ*<sup>U</sup>* as the dopamine signal in the present study. In Equation (23) above, the first term on the RHS corresponds to the "Go" regime, since it is significant for large positive values of δ*<sup>U</sup>* (since *AG* and λ*<sup>G</sup>* are positive). The second term on the RHS of Equation (23) corresponds to the "NoGo" regime, since it is significant for large negative values of δ*<sup>U</sup>* (due to *AN* > 0 and λ*<sup>N</sup>* < 0). The third term on the RHS corresponds to the "Explore" regime, since it is dominant for values of δ*<sup>U</sup>* close to 0.
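A minimal sketch of the combined update in Equation (23) is given below. The parameter values in the example calls are chosen only to make the three regimes visible and are not the trained values of the study; the function names are assumptions, and ψ is passed in as a scalar.

```python
import math


def logsig(n):
    """Logistic sigmoid of Eq. (25)."""
    return 1.0 / (1.0 + math.exp(-n))


def gen_update(delta_u, prev_step, a_g, a_n, a_e, lam_g, lam_n, sigma_e, psi):
    """Smooth Go/Explore/NoGo update of Eq. (23) for a scalar F_Gref."""
    go = a_g * logsig(lam_g * delta_u) * prev_step
    nogo = a_n * logsig(lam_n * delta_u) * prev_step
    explore = a_e * psi * math.exp(-(delta_u ** 2) / sigma_e ** 2)
    return go - nogo + explore


# Illustrative parameters (assumed): lam_g > 0 and lam_n < 0, as in the text.
params = dict(a_g=1.0, a_n=1.0, a_e=1.0, lam_g=10.0, lam_n=-10.0, sigma_e=0.1, psi=0.05)
step_go = gen_update(delta_u=2.0, prev_step=0.2, **params)
step_nogo = gen_update(delta_u=-2.0, prev_step=0.2, **params)
step_explore = gen_update(delta_u=0.0, prev_step=0.2, **params)
```

For these settings, a large positive δ*<sup>U</sup>* approximately repeats the previous step, a large negative δ*<sup>U</sup>* approximately reverses it, and δ*<sup>U</sup>* = 0 leaves only the exploratory term, because the Go and NoGo contributions cancel when their gains are equal.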

#### *Neurobiological Interpretation of the GEN Method in PD*

PD being a dopamine-deficient condition, a natural way to incorporate Parkinsonian pathology is to attenuate the dopamine signal, δ*U*. In earlier work (Magdoom et al., 2011; Sukumar et al., 2012), PD pathology was modeled by clamping the dopamine signal and preventing it from exceeding an upper threshold. The rationale behind such clamping is that with fewer dopaminergic neurons left, the SNc may not be able to produce a signal intensity that exceeds a certain threshold. In the present case of PG, such a constraint is applied to δ*U*. If [a, b] is the natural unconstrained range of values of δ*<sup>U</sup>* for controls, then for the PD OFF simulation, a clamped value of δLim changes the δ*<sup>U</sup>* range to [a, δLim], where δLim < b. Furthermore, to simulate the increase in dopamine levels in the PD ON condition due to administration of L-dopa, a positive constant is added to δ*U*, thereby changing the range of δ*<sup>U</sup>* in the PD ON condition to [a + δMed, δLim + δMed] [Equation (26)].

$$\delta\_{\rm U}(t) = \begin{cases} [a, b] & \text{(a) for controls} \\ [a, \delta\_{\rm Lim}] & \text{(b) for PD OFF} \\ [a + \delta\_{\rm Med}, \delta\_{\rm Lim} + \delta\_{\rm Med}] & \text{(c) for PD ON} \end{cases} \tag{26}$$

where δLim + δMed < *b* and *a* < *b*.
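The three cases of Equation (26) can be expressed as a small function. This is a sketch under the assumption that the PD ON signal is obtained by first clamping at δLim and then adding δMed, which reproduces the stated range [a + δMed, δLim + δMed]; the function name, the condition labels, and the numerical values are illustrative.

```python
def dopamine_signal(delta_u, condition, delta_lim=None, delta_med=0.0):
    """Map the raw delta_U to the control, PD OFF, or PD ON signal of Eq. (26)."""
    if condition == "control":
        return delta_u                       # unconstrained range [a, b]
    clamped = min(delta_u, delta_lim)        # upper clamp at delta_lim
    if condition == "pd_off":
        return clamped                       # range [a, delta_lim]
    if condition == "pd_on":
        return clamped + delta_med           # range shifted by delta_med
    raise ValueError(f"unknown condition: {condition}")


control = dopamine_signal(2.0, "control")
pd_off = dopamine_signal(2.0, "pd_off", delta_lim=0.5)
pd_on = dopamine_signal(2.0, "pd_on", delta_lim=0.5, delta_med=0.2)
```

Values of δ*<sup>U</sup>* below the clamp are unchanged in the PD OFF case and are shifted upward by δMed in the PD ON case.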

#### *Training the GEN parameters*

The output of the GEN system is *FG*, from which *SGF* is calculated as the average *FG* between 4000 and 5000 ms of simulation time, the mean and variance of which must match the mean and variance of *SGF* obtained from PG experiments under control and PD conditions (Ingvarsson et al., 1997; Fellows et al., 1998). The parameters to be trained are *AG*/*E*/*<sup>N</sup>* [gains of the Go/Explore/NoGo terms in Equation (23)], λ*G*/*<sup>N</sup>* [sensitivity of the Go/NoGo terms in Equation (23)], and σ*<sup>E</sup>* [sensitivity of the Explore term in Equation (23)]. The GEN parameters are determined by optimizing the cost function *CE*GEN given in Equation (27).

$$CE\_{\rm GEN} = 2\left(\overline{SGF}\_{\rm expt} - \overline{SGF}\_{\rm sim}\right)^2 + \left(\sigma\_{\rm expt} - \sigma\_{\rm sim}\right)^2 \tag{27}$$

*SGF* is the stable grip force generated and σ is the variance in the error. Subscripts *expt* and *sim* denote experimental and simulated values, respectively. *CE*GEN is formulated such that more weightage is given to the *SGF* error and lesser to the variance in the error [Equation (27)].
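Equation (27) can be sketched as a function of the experimental and simulated summary statistics. The names are assumptions; the factor of 2 on the first term is the extra weight on the *SGF* error described above.

```python
def gen_cost(sgf_expt_mean, sgf_sim_mean, sigma_expt, sigma_sim):
    """Cost CE_GEN of Eq. (27): squared mean-SGF error weighted twice as heavily."""
    mean_term = (sgf_expt_mean - sgf_sim_mean) ** 2
    spread_term = (sigma_expt - sigma_sim) ** 2
    return 2.0 * mean_term + spread_term


# Illustrative values (assumed): a unit error in the mean versus a unit error in sigma.
cost_mean_error = gen_cost(5.0, 4.0, 1.0, 1.0)
cost_sigma_error = gen_cost(5.0, 5.0, 1.0, 2.0)
```

A unit error in the mean *SGF* therefore costs twice as much as a unit error in σ.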

The six model parameters (*AG*/*E*/*N*, λ*G*/*N*, and σ*E*) of the GEN method [Equation (23)] are trained to capture the following experimental conditions: controls and PD ON from Fellows et al. (1998), and controls, PD ON, and PD OFF from Ingvarsson et al. (1997) for both sandpaper and silk surfaces. However, the parameters *AG*/*E*/*N*, λ*G*/*N*, and σ*<sup>E</sup>* are not all trained separately for every experimental condition. Initial parameter values for *AG*/*E*/*N*, λ*G*/*N*, σ*E*, and α were determined (**Figure 6A**) by matching the *SGF* results for the controls of Fellows et al. (1998); this matching is achieved by optimizing *CE*GEN [Equation (27)] using GA. This initial training of the GEN parameters is a kind of calibration of the parameters for a given experimental setup. Once an initial estimate of the parameters was obtained, *AG*/*E*/*N*, λ*G*/*N*, and σ*<sup>E</sup>* were fixed and optimal values of α, δLim, and δMed were obtained using GA for the PD conditions of Fellows et al. (1998). Similarly, for Ingvarsson et al. (1997), the initial parameter values for *AG*/*E*/*N*, λ*G*/*N*, σ*E*, and α were determined by matching the *SGF* results for controls using Equation (27). For the cases of PD ON and PD OFF (Ingvarsson et al., 1997), the search space was limited to α, δLim, and δMed, with the values of *AG*/*E*/*N*, λ*G*/*N*, and σ*<sup>E</sup>* fixed at those obtained from controls (Ingvarsson et al., 1997).

The model was tested (**Figure 6B**) using the trained GEN parameters to determine if the model generated outputs are close to the experimental values.

#### **RESULTS**

We now apply the model described in the previous section to explain the experimental results of two published studies (Ingvarsson et al., 1997; Fellows et al., 1998). For simplicity, we used only constant-weight trials (i.e., trials in which a single load was lifted repeatedly). The friction coefficient was calculated as load force/slip force (Forssberg et al., 1995) [Equation (28)].

$$
\mu = M\_o g / F\_{\text{slip}} \tag{28}
$$

Fellows et al. (1998) investigated 12 controls, 16 PD patients and four hemi-parkinsonian patients. All the PD subjects were in the medication ON state. The subjects were asked to lift the object to a height of 4–8 cm. The study comprised two loads, 3.3 N and 7.3 N. Various combinations of these two loads were used to give rise to the "light," "heavy," "unload," and "load" conditions. In our study, we simulated only the experimental results for the "light" condition (which featured lifting the 3.3 N object for all the trials) with a desired object lift height of 5 cm.

Ingvarsson et al. (1997) investigated the role of medication in 10 controls and 10 PD subjects under two object surface conditions (silk and sandpaper). The task required the object to be PG-lifted to a height of 5 cm above the table. The entire experiment was divided into 3 parts: (a) coordination of forces, (b) adaptation to friction, and (c) rapid load changes. "Coordination of forces" required the object to be lifted to 5 cm height and maintained at that position for 4–6 s using PG. "Adaptation to friction" required the subjects to gradually reduce the grip force on the object, thereby causing slip, from which *F*slip is calculated. Finally, in the "rapid load changes" task, a plastic disk was dropped in a padded plate, thereby causing abrupt changes in *FL*.

In the present study, we use *F*slip to determine the friction coefficient, which is used to match the experimentally obtained grip force values under the silk and sandpaper conditions for the coordination of forces case. Ingvarsson et al. (1997) reported the results in median ± Q3 quartile format. Hence, for simplicity, the results are assumed to be normally distributed with mean = median and Q3 = mean + 0.6745 × standard deviation. The entire text reports the results in terms of mean and variance.
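The conversion implied by this normality assumption can be sketched as follows; the function name and the numbers in the example call are illustrative and are not data from either study.

```python
def median_q3_to_mean_sd(median, q3):
    """Convert a (median, Q3) summary to (mean, standard deviation).

    Assumes a normal distribution, for which mean = median and
    Q3 = mean + 0.6745 * standard deviation.
    """
    mean = median
    sd = (q3 - median) / 0.6745
    return mean, sd


# Illustrative input (assumed): median 5.0 and a third quartile 1.349 above it.
mean, sd = median_q3_to_mean_sd(median=5.0, q3=5.0 + 0.6745 * 2.0)
variance = sd ** 2
```

The variance used in the comparisons is then the square of the recovered standard deviation.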

We now describe the simulation results starting from controller training to simulation of results from Ingvarsson et al. (1997) and Fellows et al. (1998).

## **CONTROLLER TRAINING FROM THE MODEL OF THE PRECISION GRIP CONTROL SYSTEM**

## *The F<sup>G</sup> controller*

In the present model, the grip force controller is designed as a reference-tracking controller that receives *F*Gref as the input and generates a time-varying *FG*(*T*) as the output. In the proposed configuration, the *FG* controller is therefore affected only by the input (*F*Gref) it receives. Using the overshoot ratio, *Mp* = 1.25 [Equation (2)], and time to peak, *Tp* = 530 ms [Equation (3)], as design criteria [*Mp* and *Tp* values obtained from Johansson and Westling (1984)], we determined ω*<sup>n</sup>* = 6.4 and ζ = 0.4 as the parameters for the transfer function of the *FG* controller [Equation (1) in The Precision Grip Control System; Ogata, 2002]. **Figure 7A** shows the grip force profile (for *T* = 5000 ms) for the input *F*Gref = 10 N. Since the *FG* controller is modeled as a transfer function that depends only on the *F*Gref value, the controller did not require retraining for different friction conditions.

et al. (1998) (μ = 0.44, *Mo* = 0.33 kg) are shown. Note that in generation of **(B–D)**, the *F*Gref is kept constant at 10 N for illustration purpose.
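As a rough check of this design step, the sketch below inverts the standard underdamped second-order relations for overshoot, time to peak, and damped natural frequency [Equations (2)–(4)] to recover ζ and ω*<sup>n</sup>*. It assumes that the reported overshoot ratio *Mp* = 1.25 denotes a peak 1.25 times the steady-state value, i.e., a fractional overshoot of 0.25 in Equation (2); that reading and the function name are assumptions.

```python
import math


def second_order_params(overshoot, t_peak):
    """Invert Eqs. (2)-(4): fractional overshoot and time to peak -> (zeta, omega_n)."""
    log_term = -math.log(overshoot)                  # equals zeta*pi / sqrt(1 - zeta^2)
    zeta = log_term / math.sqrt(math.pi ** 2 + log_term ** 2)
    omega_d = math.pi / t_peak                       # Eq. (3)
    omega_n = omega_d / math.sqrt(1.0 - zeta ** 2)   # Eq. (4)
    return zeta, omega_n


# Assumption: Mp = 1.25 is read as a fractional overshoot of 0.25; Tp = 0.53 s.
zeta, omega_n = second_order_params(overshoot=0.25, t_peak=0.53)
print(f"zeta = {zeta:.2f}, omega_n = {omega_n:.2f} rad/s")
```

Under this reading the recovered parameters are close to the reported ζ = 0.4 and ω*<sup>n</sup>* = 6.4.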

## *The F<sup>L</sup> controller*

The efficacy of the *FL* controller output can be observed in the resulting object and finger positions. If a suboptimal *FL* is generated, the object does not reach the reference position. The cardinal components affecting the object-finger slip are μ and M*<sup>o</sup>* (refer to **Table 1** for the values of μ and M*<sup>o</sup>* used in simulation). Since the *FL* controller is affected by μ and M*o*, a decreased μ or increased M*<sup>o</sup>* increases *F*slip. Therefore, in this study we trained the *FL* controller for the minimum μ and the maximum M*<sup>o</sup>* to prevent the object from slipping even when μ increases or M*<sup>o</sup>* decreases. Furthermore, to train the *FL* controller, we assume a sufficiently large *FG* (=10 N), thereby effectively decoupling the *FG* controller. With a large, constant *FG*, the cost function (CE) [Equation (8)] was optimized using the GA (Goldberg, 1989; Whitley, 1994) for the setup parameters from Fellows et al. (1998) (μ = 0.44 and M*<sup>o</sup>* = 0.33 kg) to obtain the PID parameter values. The PID parameter values obtained were *KP*,*<sup>L</sup>* = 6.938, *KI*,*<sup>L</sup>* = 14.484, *KD*,*<sup>L</sup>* = 1.387, τ*<sup>s</sup>* = 0.087. The same PID parameters were also used to simulate the results from Ingvarsson et al. (1997). In **Figure 7**, the simulated finger and object positions are shown for (**Figure 7B**) the silk condition of Ingvarsson et al. (1997), (**Figure 7C**) the sandpaper condition of Ingvarsson et al. (1997), and (**Figure 7D**) the light condition of Fellows et al. (1998).

Since we fixed the PID parameters for the *FL* controller, the efficacy of the PID parameters across the three control conditions, viz. Ingvarsson et al. (1997) silk, Ingvarsson et al. (1997) sandpaper, and Fellows et al. (1998), needs to be determined. Two important criteria for determining a successful lift are low slip (*X*¯ fin − *X*¯ *<sup>o</sup>*) and low position error (*X*ref − *X*¯ *<sup>o</sup>*). In all three cases (refer to **Table 1**), the finger–object slip distance was <0.005 m and *X*ref − *X*¯ *<sup>o</sup>* was <0.001 m, where *X*¯ *<sup>o</sup>* is the average position between simulation times (*T*) of 4000 and 5000 ms. The *FG* controller output is shown in **Figure 7A**; the object and hand positions for Ingvarsson et al. (1997) silk, Ingvarsson et al. (1997) sandpaper, and Fellows et al. (1998) are shown in **Figures 7B–D**. Note that *F*Gref is held constant at 10 N.

#### **OBTAINING V(***F***Gref), h(***F***Gref), AND U(***F***Gref)**

Utility based approach requires estimation of *V*(*F*Gref) and *h*(*F*Gref) [refer model Section "Computing *U*(*F*Gref)"]. *V* and *h* are generated for Ingvarsson silk and Ingvarsson sandpaper and were obtained using the parameters given in **Table 1**. We assumed the noise to be inversely related to the friction coefficient. Hence a higher noise was used in case of lower μ and vice versa.

Note that value functions [expressed as a function of *F*Gref; Appendix B, Equation (42)] in the case of PG are expected to have a sigmoidal shape, since an *FG* level that exceeds *F*slip always results in a successful grip-lift and therefore reward. In contrast, the risk function, *h* [Appendix B, Equation (43)], is expected to be bell-shaped, since risk is highest in the neighborhood of the slip force and zero far from it.

These observations are supported by the value and risk functions constructed for various experimental conditions—Fellows et al. (1998) light and Ingvarsson et al. (1997) silk and sandpaper cases. **Figure 8A** shows the value and risk functions for Fellows et al. (1998), **Figure 8B** shows the value and risk functions for Ingvarsson et al. (1997) silk and **Figure 8C** shows the value and risk functions for Ingvarsson et al. (1997) sandpaper condition.

#### **PERFORMING ACTION SELECTION USING THE BG AND SIMULATING THE CONTROL AND PD CASES OF THE STUDY**

Two features mark the difference in *V*(*F*Gref) and *h*(*F*Gref) between controls and PD. In the PD OFF case we apply the clamped δ*U* (= δLim) condition [Equation (26b)], whereas in the PD ON case we add a positive constant (δMed) to δ*<sup>U</sup>* [Equation (26c)]. In addition to these changes in the dopamine signal, δ*U*, we assume altered risk sensitivity in PD. Studies also suggest altered risk taking in PD patients (in particular, risk in PD ON > risk in PD OFF) when compared to healthy controls (Cools et al., 2003). Since α represents risk sensitivity in the utility function [Equation (20)], we use a smaller α in the PD case (both ON and OFF) (refer to **Tables 3**, **4** for the simulated values).

As described earlier, the GEN method [Equation (23)], produces a series of *SGF* values with a certain mean and standard deviation computed over the trials. It may be recalled, from "Model", that Equation (23) represents a form of stochastic hill-climbing over the utility function, *U*. It represents a map from *F*Gref(*t* − 1) to *F*Gref(*t*), where "*t*" represents the trial number. The three terms on the RHS of Equation (23) represent GO, NOGO and Explore regimes, in that order. The three regimes are active under conditions of high, low, and moderate dopamine (δ*U*), respectively. **Figure 9** shows some sample profiles of the three terms on the RHS of Equation (23).

**FIGURE 8 | Value, risk and utility (α = 0.3 and α = 0.5) of the RBF network as an average value of 20 runs for (A) Fellows et al. (1998), (B) Ingvarsson et al. (1997) silk, and (C) Ingvarsson et al. (1997) sandpaper.**



Using the GEN policy on δ*U*, we simulate our model with the parameters described in **Tables 3**, **4**. (Simulation with δ*<sup>V</sup>* of Equation (21), using the same parameter set described in this section, yields the results in Supplementary Material B.)

#### *Procedure to train the GEN parameters*

• The change in *F*Gref (i.e., Δ*F*Gref) from trial "*t*" to "*t* + 1" is given in Equation (23).

**Table 4 | Table showing the GEN parameters and Utility parameters for simulating both Ingvarsson et al. (1997) cases of silk and sandpaper in controls, PD OFF and PD ON conditions.**



the parameters from controls are used for PD ON only. In case of (Ingvarsson et al., 1997), the parameters from controls are used in PD ON and OFF for two surface conditions – silk and sandpaper.

• In the PD OFF case, only δLim and α are trained. In the PD ON case, δLim, δMed, and α are trained.

#### *Simulation of Fellows et al. (1998) results*

The Fellows et al. (1998) controls were simulated using the parameters (**Table 3**) obtained by optimizing *CE*GEN using GA.


A comparison of the experimental and simulated data obtained for Fellows et al. (1998) using the parameters in **Table 3** is given as **Figure 10**.

#### *Simulation of Ingvarsson et al. (1997) results*

Following the simulation of the Fellows et al. experiment:

(a) The results of the Ingvarsson et al. (1997) controls for both sandpaper and silk were simulated to obtain (*AG*/*E*/*N*, λ*G*/*N*, σ*E*, and α = 0.5) using GA (**Table 4**). The other parameters are set as δLim = 1 and δMed = 0.

(b) In PD ON condition, the same control parameters (*AG*/*E*/*N*,

for the control and PD ON groups are statistically significant with the *p*-value < 0.05 in both the experiment and simulation.


A comparison of the experimental and simulated data obtained for Ingvarsson et al. (1997) for silk and sandpaper using the parameters in **Table 4** is given in **Figures 11** and **12**, respectively. To be consistent with the experimental results, the simulation results in **Figures 11** and **12** are shown as median ± Q3. **Figures 10**–**12** replicate the empirical findings (both mean/median and variance profiles) well. The Fellows et al. (1998) results show that mean(*SGF*) is higher in the PD ON case than in controls (*SGF*norm<*SGF*PDON) (**Figure 10**). A similar result (*SGF*norm<*SGF*PDON) is also reported in Ingvarsson et al. (1997) for the silk (**Figure 11**) and sandpaper (**Figure 12**) cases. Furthermore, var(*SGF*) under PD OFF is observed to be greater than that of controls. It may be inferred that the increase in *SGF* in the PD conditions is due to an increased SM, reflecting risk-averse behavior in risk-based decision making. This increased SM could be needed to counter the increased internal perturbations and sensory-motor incoordination observed in PD patients.

#### **DISCUSSION**

In this paper, we present a computational model to explain the changes in *FG* in controls and PD ON/OFF conditions (Ingvarsson et al., 1997; Fellows et al., 1998). To our knowledge this is the first computational model of PG performance in PD conditions. A novel aspect of the proposed approach to modeling PG is to apply (based on the observation that PG performance involves a SM) concepts from risk-based DM to explain *FG* generation. To this end, we applied a recent model of the BG action selection based on the utility function formulation, instead of value function, to explain PG performance (Pragathi Priyadharsini et al., 2012).

**FIGURE 11 | Comparison of experimental (Ingvarsson et al., 1997) and simulation results for** *SGF* **for silk surface.** The bars represent the median (± Q3 quartile). The results for the control and PD ON groups are statistically significant with the *p*-value < 0.05.

simulation are non-significant in both the experiment and simulation.

There are significant challenges involved in developing a computational model of PG in PD conditions. The root cause of this difficulty is that the pathology in PD is located at a high level (dopaminergic neurons of the BG) in the motor hierarchy, while the motor expression is at the lowest level in the hierarchy (finger forces). Ideally, a faithful computational model must incorporate these two well-separated levels in the motor hierarchy, and all the levels in between. But development of such an extensive model would be practically infeasible, and may not be essential to the problem at hand. On the other hand, if model compactness is the sole governing principle, one may build an empirical, data-fitting model that links experimental parameters like friction and object weight, and abstract neural parameters like dopamine level and medication level, with observed data like the mean and variance of the grip force generated. But such an over-simplified model could be a futile mathematical exercise without much neurobiological content.

Therefore, a conservative approach to modeling PG performance in PD would have two components: (1) a minimal model of the sensory-motor loop dynamics involved in generating PG forces, and (2) a minimal model of the BG that incorporates the effect of dopamine changes on the BG dynamics. The BG level generates evaluations of actions (*FG*) based on the rewards (successful grip performance) obtained from PG performance, whereas the sensory-motor loop receives the command from the BG level and generates *FG*. Thus the BG component represents DM, while the sensory-motor component represents execution. These two components must then be integrated. A PD-related reduction in dopamine level in the integrated model must then manifest as appropriate changes in PG forces. This is the approach adopted in the present study. **Figure 13** presents the grand plan of the entire model, and the training/design steps followed to build its various components.

#### **MODELING THE SENSORIMOTOR LOOP**

The sensory-motor loop component consists of two controllers, for generating the grip force and lift force, and a plant model. *FG* and *FL* are given as inputs to the plant model, which simulates the dynamics of the object and the fingers. The error between actual and desired object positions is fed back to the *FL* controller. The *FG* controller receives *F*Gref as input and generates the *FG*(*T*) profile. *F*Gref is generated by the second component, the BG component, by a DM process. The BG component is built along the lines of Actor-Critic models of the BG, where the utility function is used instead of the value function, and the dopamine signal, δ, is the temporal difference of utility [Equation (24)].

#### **THE *FL* CONTROLLER FORMS A LOOP WITH THE PLANT**

The controller gives *FL* as input to the plant and receives position error as feedback. The controller is trained by GA as described in "The Precision Grip Control System". The grip force controller is designed as an open-loop second order system that gives *FG* as input to the plant (see "The Precision Grip Control System"). Plant dynamics are described in Appendix A. The controller and plant system, with its trained parameters, is then integrated with the appropriately trained BG models to simulate control and PD results (Ingvarsson et al., 1997; Fellows et al., 1998).

#### *Simulation of Fellows et al. (1998) results*

Incorporating the values of object mass (M*<sup>o</sup>* = 0.33 kg) and friction (μ = 0.44) from Fellows et al. (1998), the controller and plant system trained above is used to calculate the *V* and *h* functions (see "The Precision Grip Control System"). The *V* and *h* functions thus computed are explicitly modeled using RBFNNs (see "The Utility Function Formulation"). The *V* and *h* functions are combined to produce the utility function, which is used in the GEN method [see "Computing *U*(*F*Gref)"] to produce mean(*SGF*) and var(*SGF*) as outputs. The parameters of the GEN method must be calibrated to each experimental setup. Thus, the GEN parameters (*AG*/*E*/*N*, λ*G*/*N* and σ*E*) are trained by GA to produce the mean(*SGF*) and var(*SGF*) corresponding to the controls case. Then, for the PD ON case, only δLim and δMed are optimized (*AG*/*E*/*N*, λ*G*/*N* and σ*E* are unchanged). The simulated values of mean(*SGF*) and var(*SGF*) approximate the experimental values (see Results: **Figures 11**, **12**).

**FIGURE 13 | An illustration of the entire model, the training/design steps followed to build various components and to reproduce the results of Fellows et al. (1998) and Ingvarsson et al. (1997).** The text in {} denotes section names.

#### *Simulation of Ingvarsson et al. (1997) results*

The case of Ingvarsson et al. (1997) is somewhat more complicated since it involves two friction conditions: silk and sandpaper. It also involves both PD ON and OFF, unlike Fellows et al. (1998), which describes results for PD ON only. For the sandpaper condition, object mass (M*<sup>o</sup>* = 0.3 kg) and friction (μ = 0.94) are incorporated into the trained controller and plant system. The *V* and *h* functions are computed and explicitly modeled using RBFNNs ("The Utility Function Formulation"), and the utility function is computed by combining *V* and *h*. The utility function is used in the GEN method; the GEN parameters (*AG*/*E*/*N*, λ*G*/*N* and σ*E*) are trained to minimize *CE*GEN [Equation (27), **Figure 5**] so as to calibrate the model for the sandpaper case of Ingvarsson et al. (1997). The trained GEN parameters are used for the PD OFF case, and only δLim is trained to optimize *CE*GEN. The same GEN parameters are again used for the PD ON case, where both δLim and δMed are optimized. A similar procedure is followed for the "silk" case (M*<sup>o</sup>* = 0.3 kg and μ = 0.44) of Ingvarsson et al. (1997), as outlined in **Figure 13**.

Risk-based decision making can arise in both motor and cognitive domains (Claassen et al., 2011). The present study deals with risk in the motor domain. In this context, an interesting question naturally arises: is there a correlation between risk-sensitivity in the motor and cognitive domains? Does impaired risk-sensitivity in one domain carry over to the other? In other words, do PD patients show impaired decision making in motor and cognitive domains equally? To answer these questions, we propose to use a task, the Balloon Analog Risk Task (BART), which tests risk-sensitivity in the cognitive domain (Claassen et al., 2011). We then propose to adapt the BART to the motor domain.

In BART, the subject is asked to press a key and inflate a virtual balloon displayed on the monitor. For every key press, the virtual balloon is inflated by a fixed amount and the subject earns a fixed number of points. The catch lies in that, on inflation beyond a threshold volume, the balloon bursts and the subject loses all the points. Knowing when to stop and redeem all the points earned so far involves risk based decision making.

The above task, which is a cognitive task, can be redesigned in terms of motor function, specifically in terms of PG performance. Just as in the BART there is a threshold point at which the balloon bursts, in the PG task there is a threshold grip force at which the object slips. In the redesigned BART, the subject will earn more points as he/she grips the object with a grip force as close as possible to the slip force. Uncertainty can be incorporated by using objects that look identical but have different weights. It will be interesting to see possible parallels in the performance of controls and PD patients on both the cognitive and PG versions of the BART. If the above line of experimentation confirms that there is a correlation between risk-sensitivity in the motor and cognitive domains, it would place the risk-based decision making approach to understanding PD on a stronger foundation.

#### **AUTHOR CONTRIBUTIONS**

Ankur Gupta: Computational model development, analysis and manuscript preparation. Pragathi Priyadharsini B: Computational model development, analysis and manuscript preparation. V. Srinivasa Chakravarthy: Computational model development, analysis and manuscript preparation.

#### **ACKNOWLEDGMENTS**

This study was funded by Department of Biotechnology, Government of India.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fncom.2013.00172/abstract



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 July 2013; accepted: 07 November 2013; published online: 02 December 2013.*

*Citation: Gupta A, Balasubramani PP and Chakravarthy VS (2013) Computational model of precision grip in Parkinson's disease: a utility based approach. Front. Comput. Neurosci. 7:172. doi: 10.3389/fncom.2013.00172*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2013 Gupta, Balasubramani and Chakravarthy. This is an openaccess article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX A**

#### **PG MODEL**

#### *Plant*

The forces (*FL* and *FG*) obtained from the two controllers are used for determining the kinematics (position, velocity and acceleration) of the finger and object. The plant model uses *FL* and *FG* to obtain the net forces acting on both the finger (Ffin) and the object (F*o*), with the interaction at the finger-object interface mediated by friction (F*f*). The net forces acting on the finger and object are given in Equations (29, 30). Please note that *M*fin is kept constant at M*o*/10 in the model.

$$F\_{\text{fin}} = F\_L - F\_f - M\_{\text{fin}} \mathbf{g} \tag{29}$$

$$F\_o = F\_f + F\_n - M\_o \mathbf{g} \tag{30}$$

When the object is resting on the surface, the net force on the object is zero as there is no acceleration. So, the normal force is obtained by setting F*<sup>o</sup>* = 0 in Equation (30). When the object is lifted from the table, the normal force becomes zero. The determination of F*<sup>n</sup>* is given in Equation (31).

$$F\_n = \begin{cases} M\_o \mathbf{g} - F\_f, & \text{if } X\_o = 0 \land M\_o \mathbf{g} > F\_f \\ 0, & \text{else} \end{cases} \tag{31}$$

The frictional force (F*f*) coupling the finger and object is given in Equation (32)

$$F\_f = \begin{cases} F\_{\text{noslip}}, & \text{if } F\_{\text{noslip}} < F\_{\text{slip}} \\ F\_{\text{slip}}, & \text{else} \end{cases} \tag{32}$$

where *F*slip, the maximum frictional force that can be generated, is given in Equation (33).

$$F\_{\text{slip}} = 2\mu F\_G \tag{33}$$

The F*<sup>f</sup>* required to prevent slip is given in Equation (34)

$$F\_{\text{noslip}} = \frac{M\_o M\_{\text{fin}}}{M\_o + M\_{\text{fin}}} \left(\frac{F\_L}{M\_{\text{fin}}} - \frac{F\_n}{M\_o}\right) \tag{34}$$

According to Newton's second law of motion, force is the product of mass and acceleration. So, Equations (29, 30) can also be represented as Equations (35, 36).

$$F\_{\rm fin} = M\_{\rm fin} \frac{d^2 X\_{\rm fin}}{dt^2} \tag{35}$$

$$F\_o = M\_o \frac{d^2 X\_o}{dt^2} \tag{36}$$

The kinematic parameters can be obtained by integrating *d*<sup>2</sup>*X<sub>o</sub>*/*dt*<sup>2</sup> once to obtain the velocity and twice to obtain the position.
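The force balance above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only the airborne phase (object already off the table, so the normal force is zero), uses simple Euler integration, and the function names, time step, starting height and force values are assumptions chosen for illustration. Mass and friction are the Fellows et al. (1998) values quoted in the Discussion.

```python
G = 9.81  # gravitational acceleration (m/s^2)


def friction_force(F_L, F_G, F_n, M_o, M_fin, mu):
    """Frictional coupling between finger and object, Equations (32)-(34)."""
    F_slip = 2.0 * mu * F_G  # Eq. (33): maximum available friction
    F_noslip = (M_o * M_fin / (M_o + M_fin)) * (F_L / M_fin - F_n / M_o)  # Eq. (34)
    return min(F_noslip, F_slip)  # Eq. (32)


def simulate_airborne(F_L, F_G, M_o, mu, x0=0.05, dt=1e-3, n_steps=100):
    """Euler-integrate Equations (35, 36) with constant forces and F_n = 0.

    Assumption: the object starts at rest at height x0 (m), held by the finger.
    Returns the final finger and object positions.
    """
    M_fin = M_o / 10.0  # finger mass is fixed to M_o / 10 in the model
    x_fin = x_o = x0
    v_fin = v_o = 0.0
    for _ in range(n_steps):
        F_f = friction_force(F_L, F_G, 0.0, M_o, M_fin, mu)
        F_fin = F_L - F_f - M_fin * G  # Eq. (29)
        F_o = F_f - M_o * G            # Eq. (30) with F_n = 0
        v_fin += (F_fin / M_fin) * dt
        v_o += (F_o / M_o) * dt
        x_fin += v_fin * dt
        x_o += v_o * dt
    return x_fin, x_o


M_o, mu = 0.33, 0.44                 # Fellows et al. (1998) mass and friction
F_hold = (M_o + M_o / 10.0) * G      # lift force that exactly supports both masses
min_grip = M_o * G / (2.0 * mu)      # grip force at which F_slip equals the object weight
firm = simulate_airborne(F_hold, 5.0, M_o, mu)  # grip above min_grip: object is held
weak = simulate_airborne(F_hold, 2.0, M_o, mu)  # grip below min_grip: object slips
```

With these numbers the object stays in place for the firm grip and falls relative to the finger for the weak grip, which is the slip event that the safety margin is meant to avoid.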

## **APPENDIX B**

#### **TRAINING RBF**

In order to determine *U* in PG performance we first need to identify the state and the reward signal. Since *F*Gref is the key variable that decides the final outcome, *F*Gref at trial ('t') is treated as a state variable.

As described earlier, calculation of *U*(*F*Gref(*t*)) requires *V*(*F*Gref(*t*)) and *h*(*F*Gref(*t*)) as explicit functions of *F*Gref(*t*). To this end we use data-modeling capabilities of neural networks to implement *V*(*F*Gref) and *h*(*F*Gref) as explicit functions of *F*Gref.

Using the values of *V*(*F*Gref) and *h*(*F*Gref) as output and *F*Gref as input, an RBFNN (containing 60 neurons with the centroids distributed over the range [0.1, 12] in steps of 0.2, and a standard deviation (σRBF) of 0.7) was constructed and trained to approximate *V*(*F*Gref) and *h*(*F*Gref). For a given *F*Gref(*t*), a feature vector (φ) is represented using the RBFNN [Equation (37)].

$$\phi\_m(F\_{\text{Gref}}(t)) = \exp(-\left(F\_{\text{Gref}}(t) - \mu\_m\right)^2 / \sigma\_m^2) \tag{37}$$

Here, for the mth basis function, μ*<sup>m</sup>* denotes the center and σ*<sup>m</sup>* denotes the spread.

Using the φ obtained from Equation (37), the RBFNN weight for determining value, w*V*, is updated as in Equation (38). Hence the value is the mean of all the *VCE*'s obtained on *F̂*Gref.

$$
\Delta w\_V = \eta\_V \,\Delta V\_{CE}(F\_{\text{Gref}}) \,\phi(F\_{\text{Gref}}) \tag{38}
$$

where η*<sup>V</sup>* is the learning rate maintained to be 0.1, and the change in *VCE* is given as in Equation (39).

$$
\Delta V\_{\rm CE}(F\_{\rm Gref}) = e^{-\rm CE(\hat{F}\_{\rm Gref})} - e^{-\rm CE(F\_{\rm Gref})} \tag{39}
$$

The risk function (*h*) is then the variance in Δ*VCE*, as per Equation (40). Risk is the variance seen in all the *VCE*'s obtained on *F̂*Gref.

$$
\xi(F\_{\text{Gref}}) = \Delta V\_{\text{CE}}(F\_{\text{Gref}})^2 - h(F\_{\text{Gref}}) \tag{40}
$$

The weight for the risk function, *wh*, is updated as in Equation (41).

$$
\Delta w\_h = \eta\_h \,\,\xi(F\_{\text{Gref}}) \,\,\phi(F\_{\text{Gref}}) \tag{41}
$$

Here, η*<sup>h</sup>* is the learning rate for risk function = 0.1 and ξ is the risk prediction error [Equation (40)].

From the trained RBFNN, *V*(*F*Gref) and *h*(*F*Gref) are calculated using Equations (42, 43), respectively.

$$V(F\_{\text{Gref}}(t)) = w\_V \phi(F\_{\text{Gref}}(t)) \tag{42}$$

$$h(F\_{\text{Gref}}(t)) = w\_h \phi(F\_{\text{Gref}}(t)) \tag{43}$$
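The RBFNN updates of this appendix can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the change in *VCE* is passed in by the caller because computing *CE* requires the full grip simulation, and the function names and the alternating sample stream in the usage lines are assumptions.

```python
import numpy as np

CENTERS = 0.1 + 0.2 * np.arange(60)  # 60 centroids: 0.1, 0.3, ..., 11.9
SIGMA_RBF = 0.7                      # spread of every basis function
ETA_V = 0.1                          # learning rate for the value weights
ETA_H = 0.1                          # learning rate for the risk weights


def phi(f_gref):
    """Feature vector of Equation (37) for a scalar reference grip force."""
    return np.exp(-((f_gref - CENTERS) ** 2) / SIGMA_RBF ** 2)


def rbf_update(w_V, w_h, f_gref, delta_v_ce):
    """One update of the value and risk weights, Equations (38), (40), (41)."""
    features = phi(f_gref)
    xi = delta_v_ce ** 2 - w_h @ features      # Eq. (40), with h read out as in Eq. (43)
    w_V = w_V + ETA_V * delta_v_ce * features  # Eq. (38)
    w_h = w_h + ETA_H * xi * features          # Eq. (41)
    return w_V, w_h


def value_and_risk(w_V, w_h, f_gref):
    """Read out V and h from the trained weights, Equations (42, 43)."""
    features = phi(f_gref)
    return w_V @ features, w_h @ features


# Hypothetical sample stream: the change in V_CE alternates between +0.5 and -0.5
# at a single reference force, so its mean is 0 and its mean square is 0.25.
w_V = np.zeros(60)
w_h = np.zeros(60)
for i in range(200):
    sample = 0.5 if i % 2 == 0 else -0.5
    w_V, w_h = rbf_update(w_V, w_h, 5.0, sample)
V_5, h_5 = value_and_risk(w_V, w_h, 5.0)
```

In this toy stream the value read-out stays near zero while the risk read-out settles at the mean square of the samples, which is the sense in which *h* tracks variability.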

## Time representation in reinforcement learning models of the basal ganglia

#### *Samuel J. Gershman<sup>1</sup>\*, Ahmed A. Moustafa<sup>2</sup> and Elliot A. Ludvig<sup>3,4</sup>*

*<sup>1</sup> Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA*

*<sup>2</sup> School of Social Sciences and Psychology, Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia*

*<sup>3</sup> Princeton Neuroscience Institute and Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA*

*<sup>4</sup> Department of Psychology, University of Warwick, Coventry, UK*

#### *Edited by:*

*Hagai Bergman, The Hebrew University- Hadassah Medical School, Israel*

#### *Reviewed by:*

*Yoram Burak, Hebrew University, Israel*

*Daoyun Ji, Baylor College of Medicine, USA*

#### *\*Correspondence:*

*Samuel J. Gershman, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Room 46-4053, 77 Massachusetts Ave., Cambridge, MA 02139, USA e-mail: sjgershm@mit.edu*

Reinforcement learning (RL) models have been influential in understanding many aspects of basal ganglia function, from reward prediction to action selection. Time plays an important role in these models, but there is still no theoretical consensus about what kind of time representation is used by the basal ganglia. We review several theoretical accounts and their supporting evidence. We then discuss the relationship between RL models and the timing mechanisms that have been attributed to the basal ganglia. We hypothesize that a single computational system may underlie both RL and interval timing—the perception of duration in the range of seconds to hours. This hypothesis, which extends earlier models by incorporating a time-sensitive action selection mechanism, may have important implications for understanding disorders like Parkinson's disease in which both decision making and timing are impaired.

**Keywords: reinforcement learning, basal ganglia, dopamine, interval timing, Parkinson's disease**

## **INTRODUCTION**

Computational models of reinforcement learning (RL) have had a profound influence on the contemporary understanding of the basal ganglia (Joel et al., 2002; Cohen and Frank, 2009). The central claim of these models is that the basal ganglia are organized to support prediction, learning and optimization of long-term reward. While this claim is now widely accepted, RL models have had little to say about the extensive research implicating the basal ganglia in interval timing—the perception of duration in the range of seconds to hours (Buhusi and Meck, 2005; Jones and Jahanshahi, 2009; Merchant et al., 2013). However, this is not to say that time is ignored by these models—on the contrary, time representation has been a pivotal issue in RL theory, particularly with regard to the role of dopamine (Suri and Schultz, 1999; Daw et al., 2006; Ludvig et al., 2008; Nakahara and Kaveri, 2010; Rivest et al., 2010).

In this review, we attempt a provisional synthesis of research on RL and interval timing in the basal ganglia. We begin by briefly reviewing RL models of the basal ganglia, with a focus on how they represent time. We then summarize the key data linking the basal ganglia with interval timing, drawing connections between computational approaches to timing and their relationship to RL models. Our central thesis is that by incorporating a time-sensitive action selection mechanism into RL models, a single computational system can support both RL and interval timing. This unified view leads to a coherent interpretation of decision making and timing deficits in Parkinson's disease.

## **REINFORCEMENT LEARNING MODELS OF THE BASAL GANGLIA**

RL models characterize animals as agents that seek to maximize future reward (for reviews, see Maia, 2009; Niv, 2009; Ludvig et al., 2011). To do so, animals are assumed to generate a prediction of future reward and select actions according to a policy that maximizes that reward. More formally, suppose that at time *t* an agent occupies a state *st* (e.g., the agent's location or the surrounding stimuli) and receives a reward *rt*. The agent's goal is to predict the *expected discounted future return*, or *value*, of visiting a sequence of states starting in state *st* (Sutton and Barto, 1998):

$$V(s\_t) = E\left[\sum\_{k=0}^{\infty} \gamma^k r\_{t+k}\right],\tag{1}$$

where γ is a parameter that discounts distal rewards relative to proximal rewards, and *E* denotes an average over possibly stochastic sequences of states and rewards.

Typically, a state *st* is described by a set of *D* features, {*xt* (1),..., *xt* (*D*)}, encoding sensory and cognitive aspects of an animal's current experience. Given this state representation, the value can be approximated by a weighted combination of the features:

$$
\hat{V}\left(s\_t\right) = \sum\_d w\_t\left(d\right)x\_t\left(d\right),
$$

where *V*ˆ is an estimate of the true value *V*. According to RL models of the basal ganglia, these features are represented by cortical inputs to the striatum, with the striatum itself encoding the estimated value (Maia, 2009; Niv, 2009; Ludvig et al., 2011). The strengths of these corticostriatal synapses are represented by a set of weights {*wt* (1),..., *wt* (*D*)}.

These weights can be learned through a simple algorithm known as *temporal-difference (TD) learning,* which adjusts the weights on each time step based on the difference between received and predicted reward:

$$
w\_{t+1} \left( d \right) = w\_t \left( d \right) + \alpha \delta\_t e\_t \left( d \right),
$$

where α is a learning rate and δ*<sup>t</sup>* is a prediction error defined as:

$$
\delta\_t = r\_t + \gamma \hat{V} \left( s\_{t+1} \right) - \hat{V} \left( s\_t \right) \ .
$$

The *eligibility trace et*(*d*) is updated according to:

$$e\_{t+1}\left(d\right) = \gamma \lambda e\_t\left(d\right) + x\_t(d),$$

where λ is a decay parameter that determines the plasticity window of recent stimuli. The TD algorithm is a computationally efficient method that is known to converge to the true value function [see Equation (1) above] with enough experience and adequate features (Sutton and Barto, 1998).
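As a concrete illustration of the learning rule, the following Python sketch runs TD learning with eligibility traces on a short deterministic trial. It is a minimal sketch under stated assumptions: the function name, parameter values and the one-feature-per-time-step input are illustrative, and the trace is updated with the current features before the weight update (the standard accumulating-trace ordering, which may differ by one time step from the indexing used in the equations above).

```python
import numpy as np


def td_lambda_episode(w, X, r, alpha=0.1, gamma=0.98, lam=0.9):
    """Run one episode of linear TD learning with eligibility traces.

    X has shape (T, D): row t is the feature vector at time step t.
    r has shape (T,): the reward received at each time step.
    Returns the updated weights and the prediction error at each step.
    """
    w = w.copy()
    e = np.zeros_like(w)
    deltas = np.zeros(len(r))
    for t in range(len(r)):
        v_t = w @ X[t]
        v_next = w @ X[t + 1] if t + 1 < len(r) else 0.0  # value is 0 after the trial
        deltas[t] = r[t] + gamma * v_next - v_t           # prediction error
        e = gamma * lam * e + X[t]                        # eligibility trace
        w += alpha * deltas[t] * e                        # weight update
    return w, deltas


# Toy trial: 10 time steps, one feature per step, reward on the last step.
T = 10
X = np.eye(T)
r = np.zeros(T)
r[-1] = 1.0
w = np.zeros(T)
for _ in range(2000):
    w, deltas = td_lambda_episode(w, X, r)
```

After training, each weight approaches the discounted value of the upcoming reward (γ raised to the number of steps remaining) and the prediction errors shrink toward zero, as expected when the reward is fully predicted.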

The importance of this algorithm to neuroscience lies in the fact that the firing of midbrain dopamine neurons conforms remarkably well to the theoretical prediction error (Houk et al., 1995; Montague et al., 1996; Schultz et al., 1997; though see Redgrave et al., 2008 for a critique). For example, dopamine neurons increase their firing upon the delivery of an unexpected reward and pause when an expected reward is omitted (Schultz et al., 1997). The role of prediction errors in learning is supported by the observation that plasticity at corticostriatal synapses is gated by dopamine (Reynolds and Wickens, 2002; Steinberg et al., 2013), as well as a large body of behavioral evidence (Rescorla and Wagner, 1972; Sutton and Barto, 1990; Ludvig et al., 2012).

A fundamental question facing RL models is the choice of feature representation. Early applications of TD learning to the dopamine system assumed what is known as the *complete serial compound* (CSC; Moore et al., 1989; Sutton and Barto, 1990; Montague et al., 1996; Schultz et al., 1997), which represents every time step following stimulus onset as a separate feature. Thus, the first feature has a value of 1 for the first time step and 0 for all other time steps, the second feature has a value of 1 for the second time step and 0 for all other time steps, and so on. This CSC representation assumes a perfect clock, whereby the brain always knows exactly how many time steps have elapsed since stimulus onset.

The CSC is effective at capturing several salient aspects of the dopamine response to cued reward. A number of authors (e.g., Daw et al., 2006; Ludvig et al., 2008), however, have pointed out aspects of the dopamine response that appear inconsistent with the CSC. For example, the CSC predicts a large, punctate negative prediction error when an expected reward is omitted; the actual decrease in dopamine response is relatively small and temporally extended (Schultz et al., 1997; Bayer et al., 2007). Another problem with the CSC is that it predicts a large negative prediction error at the usual reward delivery time when a reward is delivered early. Contrary to this prediction, Hollerman and Schultz (1998) found that early reward evoked a large response immediately after the unexpected reward, but showed little change from baseline at the usual reward delivery time.
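These two CSC predictions can be reproduced with a small Python sketch. It is illustrative only: the trial starts at cue onset (so the cue-evoked response itself is not modeled), learning uses plain TD without eligibility traces, and the trial length, reward times, learning rate and function names are assumptions.

```python
import numpy as np


def csc_features(n_steps):
    """Complete serial compound: one feature per time step since cue onset."""
    return np.eye(n_steps)


def train_td0(X, r, alpha=0.2, gamma=0.98, n_episodes=2000):
    """Learn weights with one-step TD on repeated identical trials."""
    w = np.zeros(X.shape[1])
    for _ in range(n_episodes):
        for t in range(len(r)):
            v_next = w @ X[t + 1] if t + 1 < len(r) else 0.0
            delta = r[t] + gamma * v_next - w @ X[t]
            w += alpha * delta * X[t]
    return w


def td_errors(w, X, r, gamma=0.98):
    """Prediction errors on a probe trial, with learning switched off."""
    v = X @ w
    v_next = np.append(v[1:], 0.0)
    return r + gamma * v_next - v


n_steps, reward_step, early_step = 20, 12, 8
X = csc_features(n_steps)
r_usual = np.zeros(n_steps)
r_usual[reward_step] = 1.0
w = train_td0(X, r_usual)

r_early = np.zeros(n_steps)
r_early[early_step] = 1.0
deltas_rewarded = td_errors(w, X, r_usual)           # fully predicted reward
deltas_omitted = td_errors(w, X, np.zeros(n_steps))  # expected reward withheld
deltas_early = td_errors(w, X, r_early)              # reward delivered early
```

On the omission probe the only nonzero error is a single negative error at the usual reward step. On the early-reward probe there is a positive error at the early step followed by a negative error at the usual step, which is the prediction that conflicts with the recordings of Hollerman and Schultz (1998).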

It is possible that these mismatches between theory and data reflect problems with a number of different theoretical assumptions. Indeed, several theoretical assumptions have been questioned by recent research (see Niv, 2009). We focus here on alternative time representations as one potential response to the findings mentioned above.

We will discuss two of these alternatives (see also Suri and Schultz, 1999; Nakahara and Kaveri, 2010; Rivest et al., 2010): (1) the microstimulus representation and (2) states with variable durations (a *semi-Markov* formalism) and only partial observability. For the former, Ludvig et al. (2008) proposed that when a stimulus is presented, it leaves a slowly decaying memory trace, which is encoded by a series of temporal receptive fields. Each feature (or "microstimulus") *xt*(*d*) represents the proximity between the trace and the center of the receptive field, producing a spectrum of features that vary with time, as illustrated in **Figure 1A**. Specifically, Ludvig et al. endowed each stimulus with microstimuli of the following form:

$$x\_t(d) = \frac{y\_t}{\sqrt{2\pi}} \exp\left(-\frac{\left(y\_t - \frac{d}{D}\right)^2}{2\sigma^2}\right),$$

where *D* is the number of microstimuli, σ<sup>2</sup> controls the width of each receptive field, and *yt* is the stimulus trace strength, which was set to 1 at stimulus onset and decreased exponentially with a decay rate of 0.985 per time step. Both cues and rewards elicit their own set of microstimuli. This feature representation is plugged into the TD learning equations described above.
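The microstimulus features can be generated with a few lines of Python. This is a sketch under assumptions: the number of microstimuli (10), the width σ = 0.08, the trial length and the function name are illustrative choices, and the normalizing constant is taken as 1/√(2π), which only rescales the features.

```python
import numpy as np


def microstimuli(n_steps, D=10, sigma=0.08, decay=0.985):
    """Return an (n_steps, D) array of microstimulus features x_t(d).

    Row t holds the D features at t time steps after stimulus onset.
    """
    t = np.arange(n_steps)
    y = decay ** t                     # trace strength: 1 at onset, then decaying
    centers = np.arange(1, D + 1) / D  # receptive-field centers d / D
    diff = y[:, None] - centers[None, :]
    return (y[:, None] / np.sqrt(2 * np.pi)) * np.exp(-diff ** 2 / (2 * sigma ** 2))


X = microstimuli(300)
peak_times = X.argmax(axis=0)                        # time step of each feature's peak
widths = (X > 0.5 * X.max(axis=0)).sum(axis=0)       # steps spent above half maximum
```

Features whose centers lie near 1 peak right after stimulus onset, and features with smaller centers peak later and stay above half maximum for more time steps, so they are both weaker and more spread out in time.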

The microstimulus representation is a temporally smeared version of the CSC: whereas in the CSC each feature encodes a single time point, in the microstimulus representation each feature encodes a temporal range (see also Grossberg and Schmajuk, 1989; Machado, 1997). With the CSC, as time elapses after a stimulus, there is one, unique feature active at each time point. Learned weights for that time point therefore all accrue to that one feature. In contrast, at any time point, a subset of the microstimuli is differentially activated. These serve as the features that can be used to generate a prediction of upcoming reward (values). Note how the temporal precision of the microstimuli decreases with time following stimulus onset, so that later microstimuli are more dispersed than earlier microstimuli.

Recent data from Adler et al. (2012) have provided direct evidence for microstimuli in the basal ganglia (**Figure 1B**). Recording from the putamen while a monkey was engaged in a classical conditioning task, Adler et al. found clusters of medium spiny neurons with distinct post-stimulus time courses (for both cues and outcomes). As postulated by Ludvig et al. (2008), the peak response time varied across clusters, with long latency peaks (i.e., late microstimuli) associated with greater dispersion. Recording from the caudate nucleus, Jin et al. (2009) also found clusters of neurons that encode time-stamps of different events. These neurons carry sufficient information to decode time from the population response. Early time points are decodable with higher fidelity compared to late time points, as would be expected if the dispersion of temporal receptive fields increases with latency.

A different solution to the limitations of the CSC was suggested by Daw et al. (2006). They proposed that dopaminergic prediction errors reflect a state space that is *partially observable* and *semi-Markov*. The partial-observability assumption means that the underlying state is inferred from sensory data (cues and rewards), rather than using the features as a proxy for the state. Thus, prediction errors are computed with respect to a *belief state*, a set of features encoding the probabilistic inference about the hidden state. The semi-Markov assumption means that each state is occupied for a random amount of time before transitioning. In the simulations of Daw et al. (2006), only two states were postulated: an interstimulus interval (ISI) state and an intertrial interval (ITI) state. Rewards are delivered upon transition from the ISI to the ITI state, and cues occur upon transition from the ITI to the ISI state. The learning rule in this model is more complex than the standard TD learning rule [which is used by the Ludvig et al. (2008) model]; however, the core idea of learning from prediction errors is preserved in this model.

It is instructive to compare how these two models account for the data on early reward presented by Hollerman and Schultz (1998). In the Ludvig et al. (2008) model, the weights for all the microstimuli are updated after every time step: the late microstimuli associated with the cue (i.e., those centered around the time of reward delivery) accrue positive weights, even after the time of reward delivery (a consequence of the temporal smearing). These post-reward positive predictions generate a negative prediction error, causing the early microstimuli associated with the reward to accrue negative weights. When reward is presented early, the net prediction is close to zero, because the positive weights on the late cue microstimuli compete with the negative weights on the early reward microstimuli. This interaction produces a negligible negative prediction error, consistent with the data of Hollerman and Schultz (1998). The account of Daw et al. (2006) is conceptually different: when reward is presented early, the model infers that a transition to the ITI state has occurred early, and consequently no reward is expected.

Thus far, we have discussed time representations in the service of RL and their implications for the timing of the dopamine response during conditioning. What do RL models have to say about interval timing *per se*? We will argue below that these are not really separate problems: interval timing tasks can be viewed fundamentally as RL tasks. Concomitantly, the role of dopamine and the basal ganglia in interval timing can be understood in terms of their computational contributions to RL. To elaborate this argument, we need to first review some of the relevant theory and data linking interval timing with the basal ganglia.

## **TIME REPRESENTATION IN THE BASAL GANGLIA: DATA AND THEORY**

The role of the basal ganglia and dopamine in interval timing has been studied most extensively in the context of two procedures: the peak procedure (Catania, 1970; Roberts, 1981) and the bisection procedure (Church and Deluty, 1977). The peak procedure consists of two trial types. On fixed-interval trials, the subject is rewarded if a response is made after a fixed duration following cue presentation. On probe trials, the cue duration is extended, and no reward is delivered for responding. **Figure 2A** shows a typical response curve on probe trials: on average, the response rate peaks around the time at which food is ordinarily available (20 or 40 s in the figure) and then decreases. The peak time (a measure of the animal's interval estimate) is the time at which the response rate is maximal.

The other two curves in **Figure 2A** illustrate the standard finding that drugs (or genetic manipulations) that increase dopamine transmission, such as methamphetamine, shift the response curve leftward (Maricq et al., 1981; Matell et al., 2004, 2006; Cheng et al., 2007; Balci et al., 2010), whereas drugs that decrease dopamine transmission shift the response curve rightward (Drew et al., 2003; Macdonald and Meck, 2005).

**FIGURE 2 |** [...] procedures, methamphetamine leads to overestimation of the elapsing interval, producing early responding in the peak procedure and more "long" responses in the bisection procedure. Figure replotted from Maricq et al. (1981).

In the bisection procedure, subjects are trained to respond differentially to short and long duration cues. Unreinforced probe trials with cue durations between these two extremes are occasionally presented. On these trials, a psychometric curve is typically produced, with greater selection of the long option (i.e., the option reinforced following long duration cues) with longer probes, greater selection of the short option with shorter probes, and a gradual shift between the two (see **Figure 2B**). The indifference point, or point of subjective equality, is typically close to the geometric mean of the two anchor durations (Church and Deluty, 1977). As in the peak procedure, dopamine agonists usually produce a leftward shift in the psychometric curve in the bisection procedure (i.e., more "long" responses; see **Figure 2B**), whereas dopamine antagonists produce the opposite pattern (Maricq et al., 1981; Maricq and Church, 1983; Meck, 1986; Cheng et al., 2007). Under some circumstances, however, dopamine agonists induce temporal dysregulation with an overall flattening of the response curve and no shift in preference or peak times (e.g., Odum et al., 2002; McClure et al., 2005; Balci et al., 2008).
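As a small worked illustration of the geometric-mean rule for the indifference point, the sketch below compares the geometric and arithmetic means of two anchor durations. The anchor values (2 and 8 s) and the function name are hypothetical choices for illustration and are not taken from the studies cited above.

```python
import math

def geometric_mean_bisection_point(short_s, long_s):
    """Geometric mean of the two anchor durations (in seconds)."""
    return math.sqrt(short_s * long_s)

# Hypothetical anchors, chosen only for illustration.
short_anchor, long_anchor = 2.0, 8.0
geometric = geometric_mean_bisection_point(short_anchor, long_anchor)
arithmetic = (short_anchor + long_anchor) / 2.0
print(f"geometric mean: {geometric:.1f} s, arithmetic mean: {arithmetic:.1f} s")
```

With these anchors the geometric mean (4 s) lies below the arithmetic mean (5 s), so an indifference point near the geometric mean sits closer to the short anchor than the midpoint of the range would.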

The most influential interpretation of these findings draws upon the class of pacemaker-accumulator models (Gibbon et al., 1997), according to which a pacemaker (an "internal clock") emits pulses that are accumulated by a counter to form a representation of subjective time intervals. The neurobiological implementation of this scheme might rely on populations of oscillating neurons (Miall, 1989; Matell and Meck, 2004), integration of ramping neural activity (Leon and Shadlen, 2003; Simen et al., 2011), or intrinsic dynamics of a recurrent network (Buonomano and Laje, 2010). Independent of the neural implementation, the idea is that drugs that increase dopamine speed up the internal clock, while drugs that decrease dopamine slow the internal clock down.
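The clock-speed account can be made concrete with a deliberately minimal, noise-free sketch. It assumes that the animal stores a criterion pulse count during drug-free training and responds maximally when the accumulator reaches that count. The pulse rate and the drug "gain" values are hypothetical numbers chosen for illustration, not estimates from the literature.

```python
def time_to_criterion(criterion_pulses, pulse_rate_hz):
    """Real time (s) needed for the accumulator to reach the stored pulse count."""
    return criterion_pulses / pulse_rate_hz

baseline_rate = 5.0        # hypothetical pacemaker rate (pulses per second)
trained_interval = 20.0    # seconds, one of the intervals shown in Figure 2A
criterion = baseline_rate * trained_interval  # pulse count stored during training

# Hypothetical multiplicative effects of drugs on the pacemaker rate.
conditions = [("baseline", 1.0), ("faster clock", 1.2), ("slower clock", 0.8)]
for label, gain in conditions:
    peak = time_to_criterion(criterion, baseline_rate * gain)
    print(f"{label}: peak response at {peak:.1f} s")
```

Under these assumptions a faster clock reaches the criterion early (a leftward shift of the peak) and a slower clock reaches it late (a rightward shift), which is the qualitative pattern the pacemaker-accumulator interpretation attributes to dopamine agonists and antagonists.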

This interpretation is generally consistent with the findings from studies of patients with Parkinson's disease (PD), who have chronically low striatal dopamine levels. When off medication, these patients tend to underestimate the length of temporal intervals in verbal estimation tasks; dopaminergic medication alleviates this underestimation (Pastor et al., 1992; Lange et al., 1995). It should be noted, however, that some studies have found normal time perception in PD (Malapani et al., 1998; Spencer and Ivry, 2005; Wearden et al., 2008), possibly due to variations in disease severity (Artieda et al., 1992).

Pacemaker-accumulator models have been criticized on a number of grounds, such as lack of parsimony, implausible neurophysiological assumptions, and incorrect behavioral predictions (Staddon and Higa, 1999, 2006; Matell and Meck, 2004; Simen et al., 2013). Moreover, while the pharmacological data are generally consistent with the idea that dopamine modulates the speed of the internal clock, these data may also be consistent with other interpretations. One important alternative is the class of "distributed elements" models, which postulate a representation of time that is distributed over a set of elements; these elements come in various flavors, such as "behavioral states" (Machado, 1997), a cascade of leaky integrators (Staddon and Higa, 1999, 2006; Shankar and Howard, 2012), or spectral traces (Grossberg and Schmajuk, 1989). The effects of dopaminergic drugs might be explicable in terms of systematic changes in the pattern of activity across the distributed elements (see, for example, Grossberg and Schmajuk, 1989).

In fact, the microstimulus model of Ludvig et al. (2008) can be viewed as a distributed elements model embedded within the machinery of RL. This connection suggests a more ambitious theoretical synthesis: can we understand the behavioral and neurophysiological characteristics of interval timing in terms of RL?

#### **TOWARD A UNIFIED MODEL OF REINFORCEMENT LEARNING AND TIMING**

One suggestive piece of evidence for how RL models and interval timing can be integrated comes from the study of Fiorillo et al. (2008; see also Kobayashi and Schultz, 2008). They trained monkeys on a variation of the peak procedure with classical contingencies (i.e., water was delivered independent of responding), using five different cue-reward intervals spanning 1 to 16 s, while recording from dopamine neurons in the substantia nigra and ventral tegmental area. As shown in **Figure 3A**, they found that the dopamine response to the reward increased with the interval, and the dopamine response to the cue decreased with the interval.

Whereas the response to the cue can be explained in terms of temporal discounting, the response to the reward should not (according to the CSC representation) depend on the cue-reward interval. The perfect timing inherent in the CSC representation means that the reward can be equally well predicted at all time points. Thus, there should be no reward-prediction error, and no phasic dopamine response, at the time of reward regardless of the cue-reward interval. Alternatively, the dopamine response to reward can be understood as reflecting increasing uncertainty in the temporal prediction. **Figure 3B** shows that, using the microstimulus TD model as defined in Ludvig et al. (2008), there is indeed an increase in the simulated reward prediction error as a function of interval. In the model, with longer intervals, the reward predictions are less temporally precise, and greater prediction errors persist upon reward receipt.

**FIGURE 3 | (A)** Firing rates of dopamine neurons in monkeys to cues and rewards as a function of cue-reward interval duration. Adapted from Fiorillo et al. (2008) with permission from the publisher. **(B)** Prediction errors to the reward and cue as a function of the interval duration for the microstimulus model. The simulations of the microstimulus model were run for 100 trials and used the same parameters specified in Ludvig et al. (2008): λ = 0.95, α = 0.01, γ = 0.98, *D* = 50, σ = 0.08. We treated 20 time steps as a unit of 1 s, and each trial was separated by an interval of 500 time steps. Note the logarithmic scale on the x-axis.
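The kind of simulation summarized in the Figure 3 caption can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the code used to generate the figure: *D* = 50 is read as the number of microstimuli per stimulus, the trace decay of 0.985 per time step is an assumed value that the caption does not give, and the function names are hypothetical. Both the cue and the reward launch their own set of microstimuli, and weights are learned with TD(λ) and accumulating eligibility traces.

```python
import math

def microstimuli(steps_since_onset, m=50, decay=0.985, sigma=0.08):
    """Microstimulus levels for one stimulus (Gaussian basis over a decaying trace)."""
    if steps_since_onset < 0:              # stimulus has not occurred yet
        return [0.0] * m
    y = decay ** steps_since_onset         # decaying memory trace height
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    return [norm * math.exp(-((y - (i + 1) / m) ** 2) / (2.0 * sigma ** 2)) * y
            for i in range(m)]

def reward_prediction_errors(interval_s, n_trials=100, steps_per_s=20,
                             iti_steps=500, lam=0.95, alpha=0.01,
                             gamma=0.98, m=50):
    """Return the TD error at the moment of reward delivery on each trial."""
    reward_step = int(round(interval_s * steps_per_s))
    w = [0.0] * (2 * m)        # weights: cue microstimuli, then reward microstimuli
    e = [0.0] * (2 * m)        # eligibility traces
    x_prev = [0.0] * (2 * m)   # features from the previous time step
    errors = []
    for _ in range(n_trials):
        for t in range(reward_step + iti_steps):
            # Cue onset at t = 0; reward onset at t = reward_step.
            x = microstimuli(t, m) + microstimuli(t - reward_step, m)
            r = 1.0 if t == reward_step else 0.0
            v = sum(wi * xi for wi, xi in zip(w, x))
            v_prev = sum(wi * xi for wi, xi in zip(w, x_prev))
            delta = r + gamma * v - v_prev
            w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
            e = [gamma * lam * ei + xi for ei, xi in zip(e, x)]
            x_prev = x
            if t == reward_step:
                errors.append(delta)
    return errors
```

Calling `reward_prediction_errors(T)[-1]` for each cue-reward interval `T` gives the reward-time prediction error at the end of training, which is the quantity plotted for the model in Figure 3B. In pure Python the longer intervals take some time to run, so a vectorized implementation would be preferable for real use.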

Interval timing procedures, such as the peak procedure, add a further nuance to this problem by introducing instrumental contingencies. Animals must now not only predict the timing of reward, but also learn when to respond. To analyze this problem in terms of RL, we need to augment the framework introduced earlier to include actions. There are various ways to accomplish this (see Sutton and Barto, 1998). The Actor-Critic architecture (Houk et al., 1995; Joel et al., 2002) is one of the earliest and most influential approaches; it postulates a separate "actor" mechanism that probabilistically chooses an action *a<sub>t</sub>* given the current state **s***<sub>t</sub>*. The action probabilities *p*(**s***<sub>t</sub>*|*a<sub>t</sub>*) ∝ exp{*f*(**s***<sub>t</sub>*|*a<sub>t</sub>*)} are updated according to:

$$f(\mathbf{s}_t|a_t) \leftarrow f(\mathbf{s}_t|a_t) + \eta \delta_t [1 - p(\mathbf{s}_t|a_t)],$$

where η is a learning rate parameter and δ*<sub>t</sub>* is the prediction error defined earlier. The value estimation system plays the role of a "critic" that teaches the actor how to modify its action selection probabilities so as to reduce prediction errors.
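The actor update can be sketched for a single state with a small table of action preferences. This is an illustrative sketch only: the two action labels, the learning rate, and the function names are hypothetical, and the prediction error is supplied by hand rather than computed by a critic.

```python
import math

def action_probabilities(preferences):
    """Softmax over the action preferences f for one state."""
    peak = max(preferences.values())          # subtract the max for numerical stability
    exps = {a: math.exp(f - peak) for a, f in preferences.items()}
    total = sum(exps.values())
    return {a: v / total for a, v in exps.items()}

def actor_update(preferences, chosen, delta, eta=0.1):
    """Apply f <- f + eta * delta * (1 - p) to the chosen action only."""
    p = action_probabilities(preferences)[chosen]
    updated = dict(preferences)
    updated[chosen] += eta * delta * (1.0 - p)
    return updated

# Hypothetical example: two actions, and a positive prediction error after responding.
prefs = {"respond": 0.0, "wait": 0.0}
prefs = actor_update(prefs, "respond", delta=1.0)
probs = action_probabilities(prefs)
print(prefs, probs)
```

A positive prediction error raises the preference for the action that was taken, and the factor 1 − *p* shrinks the update as that action comes to dominate; a negative prediction error has the opposite effect.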

When combined with the microstimulus representation, the actor-critic architecture naturally gives rise to timing behavior: in the peak procedure, on average, responding will tend to increase toward the expected reward time and decrease thereafter (see **Figure 2**). Importantly, the late microstimuli are less temporally precise than the early microstimuli, in the sense that their responses are more dispersed over time. As a consequence, credit for late rewards is assigned to a larger number of microstimuli. Under the assumption that response rate is proportional to predicted value, this dispersion of credit causes the timing of actions to be more spread out around the time of reward as the length of the interval increases, one of the central empirical regularities in timing behavior (Gibbon, 1977; see also Ludvig et al., 2012 for an exploration of this property in classical conditioning). As described above, an analog of this property has also been observed in the firing of midbrain dopamine neurons: the response to reward increases linearly with the logarithm of the stimulus-reward interval, consistent with the idea that prediction errors are being computed with respect to a value signal whose temporal precision decreases over time (Fiorillo et al., 2008). To the best of our knowledge, pacemaker-accumulator models cannot account for the results presented in **Figure 3**, because they do not have reward-prediction errors in their machinery. Instead, they collect a distribution of past cue-reward intervals and draw from that distribution to create an estimate of the time to reward (e.g., Gibbon et al., 1984).

The partially observable semi-Markov model of Daw et al. (2006) can account for the findings of Fiorillo et al. (2008), but this account deviates from the normative RL framework. Daw et al. use an external timing signal with "scalar noise" (cf. Gibbon et al., 1984), implemented by adding Gaussian noise to the timing signal with standard deviation proportional to the interval. Scalar noise induces larger-magnitude prediction errors with increasing delays. However, these prediction errors are symmetric around 0 and hence cancel out on average. To account for the effects of cue-reward interval on the dopamine response, Daw et al. assume that negative prediction errors are rectified (see Bayer and Glimcher, 2005), resulting in a positive skew of the prediction error distribution. **Figure 4** shows how this asymmetric rectification results in average prediction errors that are increasingly positive for longer intervals. Note that rectification is not an intrinsic part of the RL framework, and in fact compromises the convergence of TD to the true value function. One potential solution to this problem is to posit a separate physiological channel for the signaling of negative prediction errors, possibly via serotonergic activity (Daw et al., 2002).
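The logic of the rectification argument can be illustrated with a toy Monte Carlo sketch. It is not the Daw et al. (2006) model: prediction errors are simply assumed to be zero-mean Gaussian with a standard deviation proportional to the interval, and rectification is modeled as clipping negative errors at a floor. The Weber fraction, the floor, and the function name are hypothetical, and the errors are in arbitrary units.

```python
import random

def mean_errors(interval_s, weber=0.05, floor=-0.05, n=20000, seed=0):
    """Mean raw and rectified prediction error for one cue-reward interval."""
    rng = random.Random(seed)
    sd = weber * interval_s                        # scalar noise: SD grows with the interval
    raw = [rng.gauss(0.0, sd) for _ in range(n)]
    rectified = [max(d, floor) for d in raw]       # negative errors are clipped at the floor
    return sum(raw) / n, sum(rectified) / n

for interval in (1, 4, 16):
    raw_mean, rect_mean = mean_errors(interval)
    print(f"{interval:>2} s: raw mean {raw_mean:+.3f}, rectified mean {rect_mean:+.3f}")
```

In this sketch the raw errors average out near zero at every interval, whereas the rectified errors have a positive mean that grows with the interval, because clipping removes more of the negative tail as the distribution widens.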

The microstimulus actor-critic model can also explain the effects of dopamine manipulations and Parkinson's disease. The key additional assumption is that early microstimuli (but not later ones) are primarily represented by the striatum. Timing in the milliseconds to seconds range depends on D2 receptors in the dorsal striatum (Rammsayer, 1993; Coull et al., 2011), suggesting that this region represents early microstimuli (whereas late microstimuli may be represented by other neural substrates, such as the hippocampus; see Ludvig et al., 2009). Because the post-synaptic effect of dopamine at D2 receptors is inhibitory, D2 receptor antagonists increase the firing of striatal neurons expressing D2 receptors, which mainly occur in the indirect or "NoGo" pathway and exert a suppressive influence on striatal output (Gerfen, 1992). Thus, the ultimate effect of D2 receptor antagonists is to reduce striatal output, thereby attenuating the influence of early microstimuli on behavior. As a result, predictions of the upcoming reward will be biased later, and responses will occur later than usual (e.g., in the peak procedure). This fits with the observation that the rightward shift (overestimation) of estimated time following dopamine antagonist administration is proportional to the drug's binding affinity for D2 receptors (Meck, 1986). In contrast, dopamine agonists lead to a selective enhancement of the early microstimuli, producing earlier than usual responding (see **Figure 2A**).

A similar line of reasoning can explain some of the timing deficits in Parkinson's disease. The nigrostriatal pathway (the main source of dopamine to the dorsal striatum) is compromised in Parkinson's disease, resulting in reduced striatal dopamine levels. Because D2 receptors have a higher affinity for dopamine, Parkinson's disease leads to the predominance of D2-mediated activity and hence reduced striatal output (Wiecki and Frank, 2010). Our model thus predicts a rightward shift of estimated time, as is often observed experimentally (see above).

The linking of early microstimuli with the striatum in the model also leads to the prediction that low striatal dopamine levels will result in poorer learning of fast responses (which depend on the early microstimuli). In addition, responding will in general be slowed because the learned weights to the early microstimuli will be weak relative to those of late microstimuli. As a result, our model predicts poorer learning of fast responses in Parkinson's disease. A study of temporal decision making in Parkinson's patients fits with this prediction (Moustafa et al., 2008). Patients were trained to respond at different latencies to a set of cues, with slow responses yielding more reward in an "increasing expected value" (IEV) condition and fast responses yielding more reward in a "decreasing expected value" (DEV) condition. It was found that the performance of medicated patients was better in the DEV condition, while performance of non-medicated patients was better in the IEV condition. If non-medicated patients have a paucity of short-timescale microstimuli (due to low striatal dopamine levels), then the model correctly anticipates that these patients will be impaired at learning about early events relative to later events.

Recently, Foerde et al. (2013) found that Parkinson's patients are impaired in learning from immediate, but not delayed, feedback in a probabilistic decision making task. This finding is also consistent with the idea that these patients lack a representational substrate for early post-stimulus events. Interestingly, they also found that patients with lesions of the medial temporal lobe show the opposite pattern, possibly indicating that this region (and in particular the hippocampus) is important for representing late microstimuli, as was suggested earlier by Ludvig et al. (2009). An alternative possibility is that the striatum and hippocampus both represent early and late microstimuli, but the stimulus trace decays more quickly in the striatum than in the hippocampus (Bornstein and Daw, 2012), which would produce a more graded segregation of temporal sensitivity between the two regions.

## **CONCLUSION**

Timing and RL have for the most part been studied separately, giving rise to largely non-overlapping computational models. We have argued here, however, that these models do in fact share some important commonalities and reconciling them may provide a unified explanation of many behavioral and neural phenomena. While in this brief review we have only sketched such a synthesis, our goal is to plant the seeds for future theoretical unification.

One open question concerns how to reconcile the disparate theoretical ideas about time representation that were described in this paper. Our synthesis proposed a central role for a distributed elements representation of time such as the microstimuli of Ludvig et al. (2008). Could a representation deriving from the semi-Markov or pacemaker-accumulator models be used instead? This may be possible, but there are several reasons to prefer the microstimulus representation. First, microstimuli lend themselves naturally to the linear function approximation architecture that has been widely used in RL models of the basal ganglia. In contrast, the semi-Markov model requires additional computational machinery, and it is not obvious how to incorporate the pacemaker-accumulator model into RL theory. Second, the semi-Markov model accounts for the relationship between temporal precision and interval length at the expense of deviating from the normative RL framework. Third, as we noted earlier, pacemaker-accumulator models have a number of other weaknesses (see Staddon and Higa, 1999, 2006; Matell and Meck, 2004; Simen et al., 2013), such as lack of parsimony, implausible neurophysiological assumptions, and incorrect behavioral predictions. Nonetheless, it will be interesting to explore what aspects of these models can be successfully incorporated into the next generation of RL models.

#### **ACKNOWLEDGMENTS**

We thank Marc Howard and Nathaniel Daw for helpful discussions. Samuel J. Gershman was supported by IARPA via DOI contract D10PC2002 and by a postdoctoral fellowship from the MIT Intelligence Initiative. Ahmed A. Moustafa is partially supported by a 2013 internal UWS Research Grant Scheme award P00021210. Elliot A. Ludvig was partially supported by NIH Grant #P30 AG024361 and the Princeton Pyne Fund.

#### **REFERENCES**


methamphetamine-induced horizontal shifts in peak-interval timing functions. *Psychopharmacology (Berl.)* 188, 201–212. doi: 10.1007/s00213-006-0489-x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 October 2013; accepted: 23 December 2013; published online: 09 January 2014.*

*Citation: Gershman SJ, Moustafa AA and Ludvig EA (2014) Time representation in reinforcement learning models of the basal ganglia. Front. Comput. Neurosci. 7:194. doi: 10.3389/fncom.2013.00194*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Gershman, Moustafa and Ludvig. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The visual corticostriatal loop through the tail of the caudate: circuitry and function

## *Carol A. Seger\**

*Program in Molecular, Cellular, and Integrative Neuroscience, Department of Psychology, Colorado State University, Fort Collins, CO, USA*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Erick J. Paul, University of Illinois Urbana Champaign, USA*

*Greg Ashby, University of California, Santa Barbara, USA*

#### *\*Correspondence:*

*Carol A. Seger, Program in Molecular, Cellular, and Integrative Neuroscience, Department of Psychology, Colorado State University, Mail code 1876, Fort Collins, CO 80523, USA e-mail: seger@lamar.colostate.edu*

Although high level visual cortex projects to a specific region of the striatum, the tail of the caudate, and participates in corticostriatal loops, the function of this visual corticostriatal system is not well understood. This article first reviews what is known about the anatomy of the visual corticostriatal loop across mammals, including rodents, cats, monkeys, and humans. Like other corticostriatal systems, the visual corticostriatal system includes both closed loop components (recurrent projections that return to the originating cortical location) and open loop components (projections that terminate in other neural regions). The article then reviews what previous empirical research has shown about the function of the tail of the caudate. The article finally addresses the possible functions of the closed and open loop connections of the visual loop in the context of theories and computational models of corticostriatal function.

**Keywords: striatum, caudate, category learning, basal ganglia, corticostriatal, recurrent neural network, reinforcement learning, Area TE**

## **INTRODUCTION**

Modern research in the basal ganglia has become increasingly focused on cognitive functions, an extension from early work that focused on the role of dorsal circuits in motor processing, and ventral circuits through the nucleus accumbens in reward and addiction. However, within the domain of cognition, researchers have concentrated on interactions between the prefrontal cortex and anterior regions of the striatum underlying executive functions. The interactions of temporal lobe cortex with the posterior striatum, specifically the tail of the caudate nucleus, have been minimally studied. This is likely due to a combination of factors, including the lack of a good rodent model for this system, and methodological difficulties in accessing, isolating, and measuring activity in the tail of the caudate. However, in recent years there has been a significant increase in research on the visual corticostriatal loop, which may signal that this structure's time has come. The goal of this paper is to provide a thorough review of the anatomy and function of the visual corticostriatal loop. It first provides a detailed review of what is known (and not known) about the corticostriatal circuitry passing through the tail of the caudate, and summarizes empirical studies investigating tail of the caudate function. It then surveys computational neuroscience to explore potential functions of these circuits, and proposes several future directions for research.

## **ANATOMY OF THE VISUAL CORTICOSTRIATAL SYSTEM**

The visual corticostriatal loop consists of the lateral and inferior temporal higher order visual cortex, its target regions in the tail and genu of the caudate, and subsequent projections through basal ganglia output nuclei to thalamus. This section traces this circuitry, beginning with the anatomy of the tail of the caudate, then describing the projections from visual cortex to caudate, caudate to substantia nigra pars reticulata (SNr), and finally from SNr to thalamus and back to cortex, or to superior colliculus, forming both open and closed loops. The focus is on the primate brain, both human and macaque monkey, though other species including rat and cat are also discussed where relevant studies are available. The focus is also on higher order visual projections, but because the temporal auditory cortex also projects to adjacent regions of the posterior caudate it will be discussed where relevant.

#### **THE TAIL OF THE CAUDATE**

The tail of the caudate nucleus is a subregion of the basal ganglia. Overall, the basal ganglia consist of three subcortical nuclei: the caudate, putamen, and globus pallidus. The caudate and putamen are collectively referred to as the striatum. The caudate nucleus is located immediately lateral to the ventricles, and has a rather unusual spiral shape, as illustrated in **Figure 1**. The largest portion of the caudate is the anterior and medial region, which is referred to as the head. From the head, the caudate extends in a posterior and lateral direction through the body, turns in an inferior direction through the genu, and finally projects in an anterior direction through the tail. As illustrated in **Figure 2**, the anterior portion of the tail of the caudate passes through the medial temporal lobe. It runs superior to the hippocampus, divided from this structure by only a narrow portion of the lateral ventricle. From posterior to anterior, the tail also passes close by the fimbria, immediately lateral to the lateral geniculate, inferior to the putamen, and finally adjacent to the amygdala. It is medial to deep portions of the middle temporal cortex, and dorsomedial to medial temporal cortex regions including the entorhinal cortex. Many of these adjacent structures are also associated with learning and memory, which makes it challenging to dissociate their functions from those of the tail of the caudate. These challenges are further discussed in sections Human Neuroimaging and Lesion Studies below.

In terms of chemical neuroanatomy, the tail and body of the caudate have higher levels of cholinergic interneurons than other regions of the striatum (Bernácer et al., 2007). Dopamine projections to the striatum overall show a gradient from highest density in the more antero-medial-inferior regions, out to the more postero-lateral-superior regions; the tail of the caudate thus appears likely to have dopamine projections similar to those of other posterolateral regions such as the body of the caudate and posterior putamen (Haber et al., 2006).

#### **PROJECTIONS FROM VISUAL CORTEX TO THE CAUDATE TAIL**

#### *General characteristics of the corticostriatal system: loops and projections*

The dominant view, since the classic paper by Alexander et al. (1986), has been that the corticostriatal system is structured as a set of independent recurrent loops. The primary loop structure includes projections from the cortex, to the striatum (caudate and putamen), to the globus pallidus and/or SNr, to the thalamus, and finally back to cortex. Tract tracing methods revealed that cortical regions projected topographically, such that different cortical regions projected to different parts of the striatum. Overall the corticostriatal projections form a continuous system with projections from medial-anterior regions of cortex (e.g., orbitofrontal cortex) out to lateral-posterior regions (e.g., superior parietal cortex) generally projecting along a ventro-anterior-medial (e.g., nucleus accumbens) to dorso-posterior-lateral (e.g., posterior putamen) gradient in the striatum. For heuristic purposes, this system has been divided into separate loops in order to highlight different functions, but any such division is ultimately arbitrary. The most common, and widely accepted, division is into three loops: limbic through the ventral striatum, motor through the middle and posterior putamen, and associative through the anterior striatum (including the head of the caudate and anterior putamen). This division includes the visual projections from temporal cortex in the associative loop. However, in primates projections from prefrontal regions differ from those from temporal cortex, with the latter projecting to more posterior caudate and putamen. Lawrence et al. (1998) recognized this and proposed a division into four basic loops, illustrated in **Figure 3**, including the visual loop as a separate network.

These four loops have additional meaningful subdivisions. For example, premotor and primary motor regions interact with different regions of the putamen. One additional loop that is particularly relevant for understanding the visual loop is the oculomotor loop (normally considered part of the executive loop, though it is similar functionally to parts of the motor loop as well). The oculomotor loop connects cortical regions involved in visual attention and eye movement planning to the body of the caudate nucleus. This region of the caudate has been shown to be sensitive to visual information as well (Ding and Gold, 2013; Watanabe and Munoz, 2013). The possible interactions of the oculomotor and visual loops are discussed further in section Visual Attention and Eye Movement Control below.

At the cellular level, individual corticostriatal projection neurons typically make multiple (though sparse) synapses on multiple striatal spiny neurons along an axon that projects longitudinally across the caudate and putamen in a roughly anterior to posterior direction (Selemon and Goldman-Rakic, 1985). As a result, within the visual loop, large scale projection zones from regions of cortex also tend to be oriented longitudinally along the caudate, and are often so narrow that they do not extend across the entire width of the tail of the caudate but rather are localized to a lateral, medial, dorsal, or ventral portion (as indicated in **Table 1** for monkey tracer studies).

The degree of convergence of cortical projection neurons on striatal neurons has been controversial. Some theories argued for broad convergence, so that axons from different cortical regions converge on the same striatal neurons. Later cellular-level studies cast doubt on that view (see Bar-Gad et al., 2003 for review). First, although tracer studies often find large projection zones, more fine grained tracer studies show that within these zones the projections are not evenly distributed, but rather have "patchy" patterns of innervation in which some subareas within the overall region are innervated and others are not (Goldman and Nauta, 1977). Second, although individual projection neurons can extend a considerable distance along the striatum, they make very sparse synapses, which reduces numerically the potential for convergence (Zheng and Wilson, 2002). Each striatal neuron receives input from at most 0.01% of the corticostriatal projection neurons (Bar-Gad et al., 2003), and adjacent striatal neurons likely do not share cortical afferents (Zheng and Wilson, 2002). When there is convergence, the likelihood is high that the input is from nearby cortical neurons (Kincaid et al., 1998).

#### *The corticostriatal projection from visual temporal cortex in non-human mammals*

An early influential identification of corticostriatal projections from temporal cortex to the tail of the caudate came from a study that examined axonal degeneration after ablation of cortical tissue (Kemp and Powell, 1970). More detailed information came from later studies using anterograde and retrograde tracers. The results of studies examining projections from the visual and auditory regions of the temporal lobe using anterograde tracers in monkeys are summarized in **Table 1**. In general, the target structures in the striatum of visual temporal cortex fall in three regions. First is the tail of the caudate nucleus, sometimes extending into the genu and body of the caudate. Second is the posterior putamen, which is adjacent to the tail of the caudate. Third is a discontinuous region of the dorsal head and/or body of the caudate. Relatively anterior temporal regions (e.g., region TE, **Table 1**) tend to project to areas further down the tail than more posterior temporal regions (e.g., region TEO, **Table 1**). These studies are complemented by a study (Saint-Cyr et al., 1990) that applied a retrograde tracer in the tail and genu of the caudate and found that it received broad projections from temporal visual regions, with relatively posterior regions projecting to the genu and relatively anterior to the tail. Little is known about convergence of extrastriate visual regions. As shown in **Table 1**, the different temporal regions project to similar but not identical striatal territories. However, only one study has examined individual axons projecting from higher order visual cortex to the tail of the caudate (Cheng et al., 1997). It found that convergence onto striatal modules was limited to input from cells within the same or adjacent cortical columns. Cheng et al. argued that these projections represent related but not identical features of an object and could be useful in forming an integrated visual representation.

Cats have well developed visual systems and served as model species in most early visual electrophysiological studies. Updyke (1993) examined the subcortical connections of 11 extrastriate visual areas commonly studied in electrophysiological studies. He found results broadly consistent with tracer studies in monkeys: visual cortex projected to longitudinal territories within the caudate nucleus that extended from the dorsal head, through the body, and into the tail. Visual cortex also projected to the posterolateral putamen.

**Table 1 | Tracer studies examining projections from extrastriate visual cortex to striatum in monkey.**

*[3H]-AA: radiolabeled amino acids, typically a mixture of [3H]-proline and [3H]-leucine. WGA-HRP: Wheat germ agglutinin conjugated to horseradish peroxidase. PVL: Phaseolus vulgaris leucoagglutinin.*

Visual projections have also been studied in rodents, though there are significant differences from primates. Rodents have very limited cortical vision compared with primates (Baker, 2013), and the gross anatomy of the rodent striatum is significantly different. In rodent striatum, the anterior caudate and putamen form a single structure, usually divided into dorsomedial and dorsolateral striatum, and the posterior region of the striatum differs in shape from the primate tail of the caudate, which makes it difficult to establish clear homology. Visual cortex in the rodent is on the lateral and posterior cortical surface, adjacent to auditory cortex. Faull and colleagues (Faull et al., 1986; McGeorge and Faull, 1989) placed retrograde tracers in multiple striatal regions, and found that auditory cortex projected to the most ventral and caudal region of the striatum, and visual cortex to an adjoining but more rostral region within the dorsomedial striatum. More recent research has argued that association areas, including visual and auditory cortex, also project to a distinct dorsocentral region of striatum, where their projections may converge with those from regions of parietal cortex important for the control of spatial attention (Cheatwood et al., 2003, 2005; Reep et al., 2003). There is evidence that primary visual cortex in the rodent also projects to the striatum, in contrast to the absence of such projections in monkey (López-Figueroa et al., 1995).

An alternative approach to mapping projections from cortex to striatum is through direct activation or deactivation of one region combined with a measure of activity from the other region. Glynn and Ahmad (2002) directly stimulated individual cortical regions in rats while recording from multiple striatal regions. Stimulation of secondary visual regions in the occipital lobe led to the greatest activity in posterior and slightly anterior medial regions, whereas stimulation of auditory regions led to strong posterior activity. These results are consistent with the tracer studies performed by Faull and colleagues summarized above. Cohen (1972) found that stimulating the inferior temporal cortex or tail of the caudate in the monkey had similar effects on discrimination learning, and these effects differed from those seen when stimulation was applied to the dorsolateral prefrontal cortex or anterior striatum.

#### *Visual corticostriatal projections in humans*

There has been little examination of the visual corticostriatal projection in humans. Anatomical imaging studies using diffusion tensor imaging (and related techniques) have typically focused on projections from the frontal cortex and have not reported connections with the temporal cortex (Draganski et al., 2008; Verstynen et al., 2012). One early study, although limited (it compared the entire caudate to the entire putamen), did report substantial temporal cortex connections with the caudate nucleus (Leh et al., 2007). Projections from temporal cortex to posterior caudate are understudied for two reasons: first, projections from temporal lobe to basal ganglia are hard to follow using current DTI methods because of twists or kinks in the pathways, and second, as is discussed in more detail in section Human Neuroimaging below, the atlases commonly used in neuroimaging do not include the tail of the caudate.

A new approach is utilizing resting state fMRI to identify circuits with intrinsic connectivity. Several recent studies have shown that resting state fMRI has a good correspondence with known anatomical connections; however, it should be noted that resting state cannot tell us which regions are directly connected, but rather just tells us which regions tend to coactivate (Hermundstad et al., 2013). Choi et al. (2012) examined connectivity between known cortical networks and the basal ganglia. As predicted from the anatomical connections, resting state networks that include the inferotemporal cortex were shown to correlate with relatively posterior regions of the caudate nucleus. One important caveat is that the caudate region examined did not include the tail of the caudate and only extended through part of the body of the caudate, and therefore is likely to underreport or completely miss connectivity with visual cortex.

#### **PATHWAYS THROUGH THE BASAL GANGLIA OUTPUT NUCLEI**

After the striatum, information passes through the basal ganglia output nuclei, including the globus pallidus, both internal (GPi) and external (GPe) portions, and the SNr. There are two primary pathways, which are termed the direct and indirect pathways. The direct pathway involves projections from striatum directly to SNr or GPi. The indirect pathway passes first to the GPe, and then to the SNr or GPi. In the primary visual loop, direct pathway projections target a lateral portion of the SNr (Saint-Cyr et al., 1990; Middleton and Strick, 1996; Maurin et al., 1999). The lateral SNr is also a target of auditory projections (Kolomiets et al., 2003). In the rodent, auditory and visual projections also target the SNr, but they extend to ventral as well as lateral subregions (Faull et al., 1986).

The motor and executive corticostriatal loops also have a third pathway, termed the hyperdirect pathway, that projects to the subthalamic nucleus (STN) rather than the striatum, and from there to the SNr/GPi. Studies have shown that associative prefrontal regions as well as motor regions project to the STN in a topographic manner in primates (Mathai and Smith, 2011; Haynes and Haber, 2013). However, only one study has explicitly examined whether temporal cortex projects to the STN; it found no projections from a posterior superior temporal region (Afsharpour, 1985). Researchers therefore assume that there is no visual hyperdirect pathway, though Coizet et al. (2009) argue that the STN may receive visual information indirectly, but nevertheless rapidly, from the superior colliculus.

All three pathways converge onto a common inhibitory projection to the thalamus. The subsequent projections from thalamus to cortex are excitatory, and therefore the net tonic effect of this inhibitory projection onto the thalamus is to keep activity levels low in both thalamus and cortex. The direct pathway phasically releases the thalamus from inhibition and allows excitatory output to cortex; the indirect and hyperdirect pathways increase the inhibition of thalamus and cortex across different time scales (DeLong, 1990; Mink, 1996; Frank, 2005).

#### **CLOSING THE LOOP: PROJECTIONS TO THALAMUS AND BACK TO CORTEX**

The canonical closed loop of the basal ganglia involves a return projection from the thalamus to the originating area of cortex. Tracing out the entire loop, and in particular the return projections from thalamus, is very difficult to do with tracer studies and typically requires the use of viruses that can map out multisynaptic pathways. The only study that has done this for the visual loop is Middleton and Strick (1996) who traced projections from the SNr to thalamus area VAmc to temporal lobe area TE, complementing the research finding projections from TE to basal ganglia. This closed loop is illustrated in **Figure 4**. In their subsequent review article Middleton and Strick (2000) mention research finding recurrent connections to parietal cortex (Clower et al., 2005), and conclude that while it is still unknown whether all the cortical areas that target BG also receive connections from BG, closed loop circuits may be a fundamental feature of the basal ganglia.

#### **OPEN LOOPS FROM THE VISUAL CORTICOSTRIATAL SYSTEM**

The basal ganglia also have a number of different open loop projections (Lopez-Paniagua and Seger, 2011); the primary ones are diagrammed along with the closed loop in **Figure 5**. An open loop projection is one that targets a different structure than the originating cortical region (Joel and Weiner, 1994). One well established open loop projection is to the superior colliculus, and allows visual information processed in the visual loop to directly elicit saccadic eye movements (Hikosaka et al., 2013).

Another group of open loop projections passes through the thalamus. VAmc, the region of the thalamus that receives visual loop projections, projects to more than just visual cortex. One important target of VAmc is the pre-supplementary motor area (pre-SMA) (Nakano et al., 1992), an area involved in processes integrating higher order motor control with executive functions in the dorsolateral prefrontal cortex. The COVIS model of visual categorization learning is based on this open loop connection between the visual loop and pre-SMA, and can successfully account for many aspects of learning (Ashby et al., 1998, 2007). The thalamus also makes direct projections to striatum, and information from the thalamus may therefore also affect other corticostriatal loops without returning to cortex (Joel and Weiner, 1994; McFarland and Haber, 2000).

Another type of corticostriatal loop involves direct projections from the SNr to striatum, bypassing the return projections through the thalamus and cortex. This projection has been shown to have both recurrent closed-loop aspects and open-loop aspects (Haber et al., 2000). Haber and colleagues refer to these projections as forming an "ascending spiral" because the open loop projections tend to target the striatal regions at the next step along the overall antero-medial-inferior to postero-lateral-superior gradient from motivational to associative to motor loops. These projections also exist in rodents, with return SNr connections both proximal to the originating region, as well as to more distal associative regions (Maurin et al., 1999; Mailly et al., 2003). These connections have been verified in the cat as well (Harting et al., 2001).

## **EMPIRICAL RESEARCH EXAMINING THE FUNCTIONS OF THE CAUDATE TAIL**

This section surveys what is known about the function of the tail of the caudate from neuroimaging, neurophysiological, and lesion studies. There are relatively few studies that specifically target tail of the caudate activity, especially in contrast to the number of studies examining the head of the caudate or putamen. One reason may be the experimental challenges posed by the unusual shape and location of the tail of the caudate for neuroimaging and lesion research, which are discussed in more detail in each section below.

**FIGURE 5 | Schematic diagram of the visual corticostriatal loop, including the best established open and closed loop connections.** Broad regions of extrastriate and visual temporal cortex project to the tail of the caudate. From the tail of the caudate, direct pathway projections go to the lateral SNr. There are open loop projections from SNr to superior colliculus that can directly release eye movements. There are also projections to the thalamus. From thalamus, there are projections back to cortex that close the loop, as well as open loop projections to other cortical regions, such as the pre-SMA. The hyperdirect pathway is not included because it is unknown if visual cortex projects to STN; see text for details. GPe: Globus Pallidus, external portion. Pre-SMA: Pre supplementary motor area. DLPFC: Dorsolateral Prefrontal Cortex. SNr: Substantia Nigra pars reticulata. VTA: Ventral Tegmental Area. SNc: Substantia Nigra pars compacta.

#### **HUMAN NEUROIMAGING**

Researchers in the fields of category learning and visual classification learning have targeted the tail of the caudate, inspired by theories of visual categorization proposing that the open loop projection from the visual loop to premotor cortex could serve as a plausible biological substrate for incremental and implicit category learning (Ashby et al., 1998; Seger, 2008). The tail and adjoining regions of the body of the caudate are recruited in these tasks, as illustrated in **Figure 6**. Activity often follows the time course of learning, increasing as accuracy continues to increase (Seger et al., 2010). Activity also correlates with learning, such that subjects who learn better have higher recruitment in this region (Seger and Cincotta, 2005, 2006), and activity is higher for correctly categorized trials than error trials (Nomura et al., 2007). The tail of the caudate is recruited across a variety of different category learning tasks, including rule based tasks and information integration tasks, and for both deterministic and probabilistic stimulus-category relationships (Seger and Cincotta, 2005), indicating it is not dependent on a particular stimulus type or category structure (Seger, 2008). Activity is present both when subjects are learning via trial and error and when learning via observation (Cincotta and Seger, 2007), and is greater at the time of stimulus-response processing than at the time of feedback receipt (Lopez-Paniagua and Seger, 2011).

Human neuroimaging typically is performed on a whole brain basis. However, for several reasons tail of the caudate activity can easily be missed. One reason is limitations in the normalization algorithms, which typically are optimized to maximize accuracy for cortical rather than subcortical structures. Without precise normalization, activity in the tail of the caudate in individuals will not overlap in a group analysis, and no apparent group activity will be detected. A second reason is that standard neuroimaging atlases such as the Harvard-Oxford structural atlas used with neuroimaging analysis programs such as FSL truncate the caudate at the body, and completely exclude the tail. Many studies use these ROIs as a template for spatial normalization, for region of interest based analyses, or for small volume correction for multiple comparisons. A final reason is that the tail of the caudate is close to the hippocampus, and could be misidentified as such, especially in tasks involving learning and memory. Therefore, the tail of the caudate may be recruited in additional cognitive tasks, yet not have been properly identified and reported in the neuroimaging literature. Future work should use high resolution scanning and may need to modulate parameters to maximize potential signal. One worrisome finding is that in a recent developmental study researchers report that they were unable to localize the tail of the caudate in 22% of a set of high resolution MR scans they examined (Nabavizadeh and Vossough, 2013). It is unclear whether the structure will prove to be easier to localize in adults, or if improved scanning methods can practically be developed that will allow for better localization.

receipt portion; data from Lopez-Paniagua and Seger (2011).

#### **NON-HUMAN ANIMALS: ELECTROPHYSIOLOGICAL STUDIES**

Until recently, only a few studies have targeted the tail of the caudate in animal research studies. In some early monkey studies researchers found activity in response to visual stimuli. Caan et al. (1984) found that tail of the caudate and posterior putamen cells responded to a variety of complex visual stimuli. Tail of the caudate neurons were also found to be active during visual discrimination learning (Brown et al., 1995).

In the last few years, the Hikosaka lab has begun a systematic investigation of the role of the tail of the caudate in learning to make saccades to visual stimuli (Hikosaka et al., 2006, 2013). They found that neurons in the tail of the caudate code for eye movements on the basis of both stimulus identity and location (Yamamoto et al., 2012). Tail of the caudate cells were also sensitive to value of the visual stimulus, with higher activity for stimuli associated with greater reward. This pattern of activity transferred into a free viewing task including multiple stimuli: monkeys looked at previously rewarded stimuli for longer than non-rewarded stimuli even though no gaze contingent rewards were given (Yamamoto et al., 2013). Activity in the SNr, which receives input from the tail of the caudate and projects to the superior colliculus to elicit saccades, showed the opposite pattern, consistent with the inhibitory GABAergic projections from caudate to SNr to superior colliculus (Yasuda et al., 2012). Finally, the tail of the caudate differed from the head of the caudate in that neurons in the tail were sensitive to stable long-term value of stimuli, whereas those in the head flexibly adapted to changes in stimulus value (Kim and Hikosaka, 2013); activity patterns in the body of the caudate were intermediate. They further verified that the tail of the caudate played a causal role in saccades to stable value stimuli by inactivating it with muscimol, which selectively impaired responses to stable value stimuli.

#### **LESION STUDIES**

Lesion work is challenging for a number of reasons, and as a result no specific lesion studies targeting only the tail of the caudate have been performed in non-human animals, and no human cases of brain damage limited to the tail of the caudate have been reported. However, there have been a number of isolated yet intriguing findings that imply the basal ganglia play a role in visual processing. For example, one study of premature infants found that basal ganglia damage was associated with visual impairment to a greater degree than occipital lobe damage (Mercuri et al., 1997). Section Tail of the Caudate and Amnesia discusses research that has examined the learning and memory consequences of medial temporal lobe damage, usually in the context of examining global amnesia. Section Basal Ganglia Disorders discusses what is known about visual processing deficits in the primary basal ganglia disorders, Parkinson's disease and Huntington's disease.

#### *Tail of the caudate and amnesia*

The only animal lesion study to specifically make a claim about the role of the tail of the caudate was a study by Teng et al. (2000). They compared monkeys with combined hippocampal and tail of the caudate lesions with monkeys whose lesions were limited to the hippocampus, and found that only the monkeys in which the tail of the caudate was lesioned were impaired on visual discrimination learning.

Given how close the tail of the caudate is to the hippocampus, the tail of the caudate may also have been damaged in some reported human cases of amnesia. The two most common etiologies of amnesia affecting the medial temporal lobe are herpes simplex encephalitis and anoxic damage. Herpes simplex encephalitis typically damages not only the hippocampus but adjacent temporal cortical regions and the amygdala (Kapur et al., 1994). Although anoxic injury is often thought to be selective for the hippocampus, there is evidence that the basal ganglia are also commonly damaged by anoxic injury (Caine and Watson, 2000; Hopkins and Bigler, 2012). One study found that patients with selective amnesia had damage limited to the hippocampus after anoxia (Di Paola et al., 2008), but not all published studies have been so careful in linking structure and function.

If the tail of the caudate is potentially damaged in amnesia, are there particular deficits currently thought to be due to damage to the hippocampus and/or medial temporal lobe cortex that may instead be due to tail of the caudate damage? The Teng et al. (2000) study described above found that impairments in visual discrimination learning were due to damage to the tail of the caudate rather than the hippocampus. Their task required animals to learn to discriminate between similar visual stimuli across multiple trials, via trial and error. This task is similar to human visual categorization tasks known to recruit the body and tail of the caudate (discussed in section Human Neuroimaging). Patients with amnesia have also been shown to be impaired on these tasks (Hopkins et al., 2004). However, this impairment could be because the task includes multiple demands, some of which may require hippocampus. One proposed contribution of the hippocampus to visual categorization learning that has received some empirical support is its role in processing novel visual stimuli, potentially in order to establish new memory traces that can then interact with the striatal learning processes (Meeter et al., 2008; Seger et al., 2011). Another is that the hippocampus can represent stimuli that are exceptions to the overall rule (Davis et al., 2012).

Patients with medial temporal lobe damage have also been reported to have other abnormalities in visual learning and memory. These are usually attributed to the specific computational functions that the damaged regions are thought to perform. Within the medial temporal lobe, multiple neocortical regions converge on the medial temporal cortex (including entorhinal, perirhinal, and posterior parahippocampal regions). From the medial temporal lobe cortex, information passes through the dentate-hippocampal loop, which implements functions including pattern separation and pattern completion that allow for the formation of new relational memories (Rolls, 2010; Jones and McHugh, 2011). In particular, the pattern separation functions of the dentate gyrus are important for being able to distinguish between very similar items across categories (LaRocque et al., 2013), and thus damage to the hippocampus leads to problems in forming new visual relational memories. The medial temporal cortex, in particular the perirhinal cortex, is thought to have an important role in visual memory; it is often linked to recognition memory for visual objects (Balderas et al., 2013). Some theories argue that this region should be thought of as a higher order visual processing region (Pagan et al., 2013), and studies have shown that people with damage to this region have problems with perceptual categorization and learning (Graham et al., 2006; Barense et al., 2012; Erez et al., 2013).

#### *Basal ganglia disorders*

Although no human lesions specific to the tail of the caudate have been reported, many studies of degenerative diseases that affect the basal ganglia in general have been performed. Currently little is known about how different diseases affect higher order visual processing, and if these effects are due to damage to the visual loop in particular or could be caused by damage to other loops. Patients with two of the major basal ganglia disorders, Parkinson disease and Huntington disease, are impaired in visual categorization learning (Knowlton et al., 1996; Shohamy et al., 2008) which relies on multiple corticostriatal loops including the visual loop (Seger, 2008), but it is unclear whether their impairments are specifically due to visual loop damage rather than impairments in other cognitive functions subserved by other corticostriatal loops such as feedback processing (Shohamy et al., 2008; Holl et al., 2012) or attentional shifting (Moustafa and Gluck, 2010).

Although both Parkinson and Huntington patients are impaired on visual categorization learning, the two disorders have very different underlying pathologies, different patterns of progression, and may affect the visual loop differently. In Parkinson disease there is some evidence that the visual loop should be affected relatively late because initial dopamine loss is primarily in rostral and lateral portions of the dopaminergic midbrain (Damier et al., 1999), which leads to dopamine depletion in the putamen (Kish et al., 1988). As the disease progresses it affects the anterior striatum, with the ventral striatum affected last. This pattern is reflected in shifts of functional connectivity in Parkinson's disease, with the motor loop the most strongly affected (Helmich et al., 2010). It is unclear when the tail of the caudate is primarily affected, though given overall patterns of connectivity it is most likely in parallel with the anterior striatum. In Parkinson disease the most unusual visual processing disturbance is the presence of visual hallucinations; approximately one third of the patients surveyed in one study reported visual hallucinations, generally images of animals or people that lasted on average for 5 min (Davidsdottir et al., 2005). Meppelink et al. (2009) found that visual hallucinations were associated with reduced object processing in higher order visual cortex and reduced bottom-up input to the prefrontal cortex. In addition, Parkinson disease patients have some deficits in eye movement and attentional control, though these could be due to oculomotor loop or other dysfunction (Chambers and Prescott, 2010; Archibald et al., 2013).

In Huntington's disease cell loss proceeds from dorso-medial to ventro-lateral regions, with the tail of the caudate (along with medial head of the caudate and dorsal putamen) having the greatest cell loss (Aylward et al., 2004). However, it should be noted that by the time Huntington disease is manifested overall damage to the basal ganglia is severe, and subsequent progression of the disease may be primarily due to increasing damage to cortex, white matter, and other subcortical structures (Georgiou-Karistianis et al., 2013). Overall, greater visual processing impairments have been reported in Huntington disease than Parkinson's disease, but typically only for difficult tasks, which leaves open the possibility that, as in probabilistic classification, the impairment is due to other cognitive processes besides visual processing. Gómez-Tortosa et al. (1996) found that visual deficits developed in parallel with other cognitive deficits, with early disease patients impaired only in a task involving complex visual integration. Lawrence et al. (2000) examined performance on a series of visuospatial and visual object perception tasks. Perception was unimpaired except for a very difficult object decision task (identifying degraded objects). There have been reports of deficits in recognition of emotional facial expressions, but again that could be due to known problems in processing emotion rather than problems in visual processing (Snowden et al., 2008). Several studies have found impairment on tasks that required working memory and/or recognition memory, including maintenance of individual patterns and a delayed match to sample task. These tasks all demand incorporation of visual perceptual information with selective behavioral choice, which could be dependent on the visual loop or on other corticostriatal loops (Mohr et al., 1991; Lawrence et al., 2000; Dumas et al., 2012).

## **POTENTIAL FUNCTIONS OF THE VISUAL CORTICOSTRIATAL LOOP**

As described in section Anatomy of the Visual Corticostriatal System, the visual corticostriatal loop has the same circuitry as other corticostriatal loops, with the possible exception of the lack of a hyperdirect pathway. Therefore, the computational functions carried out in the visual loop should be similar to those in other corticostriatal loops, though these computations may be applied to achieve different ends. For example, the basic selection function of corticostriatal circuitry in the motor loop is used to select specific motor programs, whereas in the executive systems it contributes to working memory and cognitive strategy selection.

#### **INSIGHTS FROM COMPUTATIONAL MODELS**

A good place to start, then, is to consider what the functions of the basal ganglia are in the better studied motor and executive loops. Computational models of these loops generally implement one or two of the following basic mechanisms. The first is a selection (sometimes termed gating or thresholding) function. Multiple representations exist at the cortical level. As described in more detail in section Pathways Through the Basal Ganglia Output Nuclei, the basal ganglia overall exert a strong inhibition onto the thalamus, which prevents activity in the excitatory projections from thalamus to cortex. The direct pathway within the basal ganglia selects or disinhibits the representation that is most suitable for the current situation. The indirect and hyperdirect pathways modulate inhibition across varying time scales (Mink, 1996; Frank, 2005; Humphries et al., 2006).
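The selection mechanism can be illustrated with a minimal sketch. This is a toy illustration under strong simplifying assumptions, not an implementation of any of the cited models: each cortical representation is treated as an independent channel, striatal direct pathway activity is assumed to be proportional to cortical salience, and the indirect and hyperdirect pathways are omitted. The function names and all parameter values (`tonic`, `direct_gain`, `thalamic_drive`) are hypothetical and were chosen only for illustration.

```python
def gate(saliences, tonic=1.0, direct_gain=2.0, thalamic_drive=0.5):
    """Return (output_nucleus_activity, thalamic_activity) for each channel."""
    # Output nuclei (GPi/SNr) fire tonically; direct pathway input from
    # striatum (assumed proportional to cortical salience) reduces that firing.
    output = [max(0.0, tonic - direct_gain * s) for s in saliences]
    # Thalamus receives a constant excitatory drive and is inhibited by the
    # output nuclei, so it is active only where inhibition has been released.
    thalamus = [max(0.0, thalamic_drive - o) for o in output]
    return output, thalamus


def selected_channels(saliences, **params):
    """Indices of channels whose thalamic relay is released from inhibition."""
    _, thalamus = gate(saliences, **params)
    return [i for i, activity in enumerate(thalamus) if activity > 0.0]


if __name__ == "__main__":
    # Three competing cortical representations; only the most salient one
    # drives enough striatal activity to disinhibit its thalamic channel.
    print(selected_channels([0.1, 0.45, 0.2]))
```

With no cortical input, every channel stays under tonic inhibition and nothing is selected; a sufficiently salient representation releases only its own channel.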

The second is a reinforcement learning function. The basal ganglia can learn to strengthen or weaken this selection process via dopaminergic input, in a manner consistent with reinforcement learning algorithms (Lee et al., 2012; Morita et al., 2012). Most models focus on selection and reinforcement learning within a single corticostriatal loop, and incorporate very simple representations of single regions of cortex, such as a motor cortex with two potential responses (Frank, 2005, 2011; Humphries et al., 2006). These models suggest potential ways to address the closed loop function of the visual corticostriatal loop, but open loop functions require more complex cortical representations that incorporate at least two cortical regions.
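A minimal sketch of the reinforcement learning mechanism follows, assuming a simple prediction-error rule in which the difference between received and expected reward stands in for the dopaminergic signal. It uses two potential responses, in the spirit of the simple cortical representations described above, but it is not the learning rule of any specific cited model; `update_weight`, `train`, and the learning rate are hypothetical names and values.

```python
def update_weight(weight, reward, learning_rate=0.5):
    """One prediction-error update of a corticostriatal weight."""
    # The prediction error is an assumed stand-in for the phasic dopamine signal.
    prediction_error = reward - weight
    return weight + learning_rate * prediction_error


def train(trials, n_responses=2, learning_rate=0.5):
    """Apply updates for a sequence of (response_index, reward) trials."""
    weights = [0.0] * n_responses
    for response, reward in trials:
        weights[response] = update_weight(weights[response], reward, learning_rate)
    return weights


if __name__ == "__main__":
    # Response 0 is always rewarded, response 1 never is.
    trials = [(0, 1.0), (1, 0.0), (0, 1.0), (1, 0.0), (0, 1.0)]
    print(train(trials))
```

Across trials the weight for the rewarded response grows toward the reward value while the unrewarded response stays at zero, so a selection stage driven by these weights would come to favor the rewarded response.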

With multiple cortical regions, it is possible to consider different applications of the selection process such that selection is directed to another cortical region, or affects how input from another cortical region is processed in the target region. These applications of selection are often referred to as "gating" or "routing" models. Two good examples are the FROST model developed by Ashby et al. (2005), and the PBWM (Prefrontal basal ganglia working memory) model developed by O'Reilly and colleagues (Hazy et al., 2006, 2007). The PBWM model includes multiple prefrontal regions or "stripes" that maintain and update working memory. Updating happens when the direct pathway disinhibits a prefrontal stripe and allows a new item to enter into working memory; this process is referred to as gating. Perceptual items are represented in a separate cortical module, which projects both directly to the PFC region and to the striatum. Because the focus of this model is the basal ganglia and prefrontal cortex components, perceptual information is represented in highly abstracted form (e.g., in terms of fully determined features and/or object identity).
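The gating idea can be sketched in a few lines, assuming a single prefrontal stripe and an externally supplied gate signal. This is a deliberately abstract illustration of the maintain-versus-update logic described above and is not the PBWM implementation; `run_stripe` and its arguments are hypothetical.

```python
def run_stripe(items, gate_open, initial=None):
    """Trace the contents of one working memory stripe across time steps.

    items: perceptual items presented at each step (abstract identities).
    gate_open: booleans; True means the direct pathway disinhibits the stripe.
    """
    memory = initial
    trace = []
    for item, is_open in zip(items, gate_open):
        if is_open:
            # Gate open: the new perceptual item enters working memory.
            memory = item
        # Gate closed: the stripe keeps maintaining its current contents.
        trace.append(memory)
    return trace


if __name__ == "__main__":
    # "B" is presented while the gate is closed, so "A" is maintained.
    print(run_stripe(["A", "B", "C"], [True, False, True]))
```

In a fuller model the gate signal would itself be the learned output of the striatum rather than a fixed input.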

Stocco et al. (2010) take an alternate approach, which they term "routing," in which interaction between multiple cortical regions is through closed loop mechanisms plus cortico-cortical input from other regions. In their model the cortical portion of the corticostriatal loop receives broad inputs from other cortical regions. The closed loop functions to select or inhibit these inputs through the direct and indirect pathways.

#### **POSSIBLE FUNCTION OF RECURRENT (CLOSED LOOP) PROJECTIONS IN THE VISUAL LOOP**

As summarized above, and illustrated in **Figure 4**, the main evidence for recurrent connections to temporal cortex in primates is the Middleton and Strick (1996) study finding a closed loop through temporal area TE. It is possible that TE is atypical, and most temporal regions don't have recurrent connections. This is not unprecedented: in the corticocerebellar system most cortical regions project to the cerebellum, but not all receive closed loop return projections (Bostan et al., 2013). Furthermore, anatomical connectivity is a necessary prerequisite, but is not sufficient. The brain has a plethora of connections between disparate regions, many of which are usually weak or dormant. One example is that the occipital lobe receives projections from auditory and somatosensory regions, but does not show significant sensitivity to these sensory modalities in sighted persons. However, in the blind these projections can strengthen and allow for the occipital lobe to be recruited for tactile and/or auditory processing (Pascual-Leone et al., 2005).

Assuming for the moment that there are robust and active closed loop recurrent projections to the visual cortex, what might be their function? This function should be consistent with what we know about the computations identified in the other corticostriatal loops, as summarized above, namely selection via the direct route, and modulation of inhibition through the indirect route. What use might selection and inhibition be in higher order visual processing? The most direct application of the concept of selection is that these connections would inhibit alternative representations of visual information, and select the dominant one for further processing. The visual cortex does have to resolve ambiguity (the input to the visual system is often consistent with many potential interpretations and can be parsed in many ways), and this mechanism is at least theoretically consistent with our knowledge of basal ganglia function. There are also cortico-cortical projections between visual regions that may provide inputs that are subject to these selection processes.

There has been little research investigating basal ganglia activity in conditions of visual ambiguity. The strongest evidence for a causal role of the visual corticostriatal system in this process would come from lesion studies in primates or studies of basal ganglia disorders in humans. As described above in section Basal Ganglia Disorders, visual symptoms of basal ganglia disorders have not been widely studied, and it is unclear whether the impairments that have been reported could be due to problems in resolving visual ambiguity.

#### **FUNCTION OF OPEN LOOP PROJECTIONS IN THE VISUAL LOOP**

Open loop models involve effects on other brain structures. As discussed in section Open Loops from the Visual Corticostriatal System, interactions between corticostriatal loops often follow the gradient within the striatum from motivational loop, through to the associative loops (including the visual loop), to the motor loop. Therefore, with respect to the visual loop, it makes sense to focus on potential open loop projections to the cortical regions participating in the associative and the motor loop, and how the selection or gating function of the striatum might be utilized in each cortical region. Largely, these regions are the frontoparietal networks underlying a hierarchy of executive control including both cognitive and motor functions (Badre, 2008), and selection may be utilized for motor or cognitive functions. This section discusses three potential frontoparietal network open loop targets. The first is open loop projections to premotor regions to enable behavioral choice. The second is projections to oculomotor networks enabling shifts in visual attention and eye movements. Control of eye movements allows for better perception (as items are foveated), and also the ability to ultimately make decisions about the visual stimulus and subsequently choose an appropriate course of action. Finally, open loop projections from the visual loop to executive regions such as the lateral prefrontal cortex may allow for visual information to be maintained or manipulated in working memory during extended cognitive processing, rather than being immediately used to select a motor or eye movement response.

#### *Visual conditional response performance and learning*

A logical extension of the idea that the basal ganglia are important for selection of motor programs is the idea that the non-motor loops ultimately have the function of interacting with the motor region to allow the organism to learn to select and execute motor responses that are appropriate to the current situation. The basal ganglia are involved in a variety of tasks in which subjects learn to perform a conditional response on the basis of a stimulus or situation (Seger, 2008, 2009), including arbitrary visuomotor association learning (Wise and Murray, 2000), category learning (Seger and Miller, 2010), habit learning (Yin and Knowlton, 2006; Graybiel, 2008), and decision making (Summerfield and Tsetsos, 2012; Seger and Peterson, 2013). These tasks all have in common a trial structure in which the subject is presented with a stimulus or cue, most often visually, makes a response conditional on the cue, then receives feedback or reward if the response was correct. Studies indicate that the striatum as a whole makes several contributions to category learning, resulting in recruitment of different striatal regions during different portions of a trial. For example, the putamen is most active when making the motor response, and the head of the caudate is most active when processing the stimulus and receiving feedback (Peterson and Seger, 2013).

Studies examining the tail of the caudate (summarized in section Human Neuroimaging and **Figure 6**) find that it is active during stimulus processing, consistent with it playing a role in visual processing for categorization. Conditional responses to visual stimuli could be supported by open loop projections from the visual loop to motor structures. One example described above in section Non-human Animals: Electrophysiological Studies is the work by Hikosaka and colleagues investigating the open loop direct projection to the superior colliculus which allows for eye movements to be sensitive to the learned value of the visual stimuli. Another example is visual categorization, in which visual stimuli provide the information to choose the appropriate category and motor response used to indicate the category. The COVIS model proposed by Ashby et al. (1998, 2007) models visual categorization through a direct open loop projection from the visual loop to the pre-SMA, consistent with known output projections from the VAmc region of the thalamus that participates in the visual loop.

However, it is possible that the role of the visual loop in conditional learning is indirect rather than via direct projections to motor regions. Anatomically, the pre-SMA is strongly interconnected with prefrontal regions, and does not directly project to SMA and other premotor regions (Picard and Strick, 1996). Instead of directly selecting motor responses, the visual loop projections to this region could serve to select more abstract categorical representations, which then contribute to motor response selection via projections from prefrontal to premotor regions. Recent work by the Ashby lab (Waldschmidt and Ashby, 2011) indicates that direct motor selection of categorical responses may be accomplished through the motor loop involving the putamen, possibly via the known corticostriatal projections from parietal lobe regions to the putamen. Overall, studies find a shift in networks during conditional response learning from executive to motor, and the visual loop may more strongly affect learning in the early stages in interaction with executive regions. This shift from associative to motor loops is consistent with a large body of research in rodents finding a shift from dorsomedial striatum (homolog of the anterior caudate) to dorsolateral striatum (homolog of the putamen) as learning progresses from being goal-directed to habitual (Balleine et al., 2009).

#### *Visual attention and eye movement control*

Another possible target of open loop projections from the visual loop is the set of frontoparietal regions involved in visual attention and eye movement control, including the frontal eye fields (FEF) and parietal cortex. These regions interact directly with another region of the caudate, the lateral body, in the oculomotor loop. Several recent reviews have considered the anatomy of this system and its function in regulating eye movements (McHaffie et al., 2005; Shires et al., 2010). Recent theories have argued that this system is important for visual attention more broadly. Visual attention involves collecting and integrating sensory and cognitive data about the world in order to focus processing on potentially important objects and their spatial locations. Perceptual and motor factors in visual attention are tightly connected, and as a result this system allows for guiding action, in particular eye movements, to objects and locations (Gottlieb and Balan, 2010). Within this network, area LIP (located around the intraparietal sulcus in humans) is often considered to represent a spatial salience or priority map (Bisley and Goldberg, 2010) in which spatial location is combined with other important information about objects including reward value, category membership, amount of information supporting a particular perceptual decision, etc. The oculomotor loop caudate neurons in this region are sensitive to many of the same factors as LIP and FEF (Watanabe and Munoz, 2013). Ding and Gold (2010, 2012, 2013) found that multiple relevant variables for perceptual decision making were coded for in the body of the caudate, including cells sensitive to information accumulation, decision threshold, and bias toward a left or right saccade before stimulus onset. Harsay et al. (2011) have examined the system in humans, and found that functional connectivity between the cortical oculomotor regions and caudate predicts learning in a saccade task.
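The decision variables reported in the caudate body (accumulated information, a decision threshold, and a bias toward one saccade direction) correspond to the quantities of a bounded-accumulation scheme. The Python sketch below is only an illustration of how those three quantities interact; the function name `accumulate_to_bound`, the parameter values, and the noise-free evidence stream are assumptions of this example and do not describe the analyses in the cited studies.

```python
def accumulate_to_bound(evidence, threshold=1.0, bias=0.0):
    """Sum signed evidence samples until a decision bound is crossed.

    Positive totals favor a rightward saccade, negative totals a leftward one.
    Returns (choice, samples_used); choice is None if no bound is reached.
    """
    total = bias  # a pre-stimulus bias shifts the starting point
    for step, sample in enumerate(evidence, start=1):
        total += sample
        if total >= threshold:
            return "right", step
        if total <= -threshold:
            return "left", step
    return None, len(evidence)


# Hypothetical evidence streams (arbitrary units).
rightward = [0.25] * 10
leftward = [-0.25] * 10

print(accumulate_to_bound(rightward))                 # unbiased
print(accumulate_to_bound(rightward, bias=0.5))       # biased toward "right"
print(accumulate_to_bound(rightward, threshold=2.0))  # more cautious
print(accumulate_to_bound(leftward, bias=0.5))        # bias opposes evidence
```

In this sketch a bias toward the eventual choice shortens the decision, while a higher threshold or an opposing bias lengthens it.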

The evidence for the importance of both visual loop and oculomotor loop processing in controlling eye movements raises the question of how the visual loop might interact with the oculomotor loop through open loop projections. One known target of open loop projections from the visual loop is the pre-SMA, which is adjacent to the FEF and considered along with FEF to fall within Brodmann's area 8. In addition, interaction between loops could be through cortical regions; there are known anatomical and functional connections between LIP and temporal lobe visual processing regions (Gold and Shadlen, 2007).

#### *Visual working memory*

Another possibility is that open loop projections are important for visual working memory, in particular selecting or gating which visual representations should be maintained and processed in working memory. This interpretation brings together two strands of research in working memory: the first one focusing on cortex and showing that working memory for visual items (e.g., objects or faces) involves interaction between frontoparietal working memory systems and the temporal lobe regions important for representing those items (Clapp et al., 2010; Gazzaley and Nobre, 2012). The second is research showing that the striatum is important for selecting what items should be gated into working memory. Several theories of gating emphasize open loop projections in which the selected items are gated as representations in higher order systems, but may also require recurrent projections combined with cortico-cortico projections.

There have been a large number of studies finding a role of the basal ganglia in working memory, but most of them have focused on the executive function components involving frontoparietal and anterior striatum interactions to maintain, select and update working memory. Most empirical work has supported the idea that the basal ganglia are especially important for selecting which items should enter working memory, often by filtering the possible inputs (McNab and Klingberg, 2008; Baier et al., 2010).
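The filtering idea can be made concrete with a very small sketch. The Python code below assumes that each candidate item carries a relevance signal and that a gate admits only items whose signal exceeds a threshold, up to a fixed capacity; the function name, the relevance scores, the threshold, and the capacity are all hypothetical and are not drawn from the cited studies.

```python
def gate_into_working_memory(items, gate_threshold=0.5, capacity=3):
    """Return the labels admitted to working memory, in presentation order.

    items: iterable of (label, relevance) pairs, with relevance in [0, 1]
           standing in for a hypothetical striatal gating signal.
    """
    memory = []
    for label, relevance in items:
        gate_open = relevance > gate_threshold
        if gate_open and len(memory) < capacity:
            memory.append(label)
    return memory


# Hypothetical display with two targets and one distractor.
display = [("target_A", 0.9), ("distractor", 0.2), ("target_B", 0.7)]
print(gate_into_working_memory(display))
```

Lowering `gate_threshold` in this sketch lets the distractor through and uses up capacity that would otherwise hold a target.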

#### **CONCLUSION AND FUTURE DIRECTIONS**

In summary, there is substantial anatomical and functional evidence for a visual loop through the tail of the caudate nucleus. The visual loop, however, has received much less attention from researchers than loops through the frontal cortex supporting executive, motor, and motivational functions. One goal of this review is to encourage basal ganglia researchers to consider how the visual loop might interact with other basal ganglia systems that they study. Another goal is to highlight important future directions of research in this area.

Our knowledge of the anatomy of the visual corticostriatal loop is limited in many ways that could fruitfully be addressed in future research. Our knowledge of the projections from cortex to striatum is based on a small number of studies in monkeys, and it is still unknown exactly which visual regions project to which striatal regions in humans (section Projections from Visual Cortex to the Caudate Tail). Although there appears to be no visual hyperdirect pathway in primates, the available data are not conclusive (section Pathways Through the Basal Ganglia Output Nuclei). Our knowledge of recurrent closed loop projections from temporal lobe is based on only one published study examining a single temporal region in monkey; although it is a plausible assumption that other visual cortical regions form similar closed loops, it has not been verified empirically (section Closing the Loop: Projections to Thalamus and Back to Cortex). Finally, we do not have complete knowledge of all the potential targets of open loop connections from the visual loop (section Open Loops from the Visual Corticostriatal System).

Empirical studies of the tail of the caudate have been hampered by methodological limitations. In neuroimaging, future research should focus on development of new high resolution scanning and spatial normalization processes to allow identification of the tail of the caudate on an individual subject level and support group analyses across subjects (section Human Neuroimaging). Researchers studying medial temporal lobe damage should develop ways to assess whether tail of the caudate damage has occurred in amnesic patients, and to distinguish between behavioral impairments due to tail of the caudate damage and those due to damage to adjoining structures (section Tail of the Caudate and Amnesia). Finally, researchers studying basal ganglia disorders should consider the potential effects of damage to the tail of the caudate and avoid an exclusive focus on motor and executive functions (section Basal Ganglia Disorders).

The fundamental functions of the visual corticostriatal loop are still unknown. No well-developed theories have addressed the role of recurrent closed-loop projections back to visual cortex (section Possible Function of Recurrent (Closed Loop) Projections in the Visual Loop). Several theories propose specific functions for some of the open loop projections from the visual loop, but because these projections have not been fully characterized anatomically, we do not yet have a full picture of their functions (section Function of Open Loop Projections in the Visual Loop). This paper suggested a number of possible closed and open loop functions of the visual corticostriatal loop, but developing and testing complete theories awaits future research.

#### **AUTHOR CONTRIBUTIONS**

Carol A. Seger determined the content and wrote the paper.

#### **ACKNOWLEDGMENTS**

I would like to thank Greg Ashby, Eu Young Choi, Deborah Budding, Michael Frank, Mark Humphries, Leonard Koziol, Howard Landman, Liz Race, and Timothy Verstynen for answering my questions and/or making suggestions on an earlier version of this article. Preparation of this review was funded by the National Institutes of Health, MH079182.

#### **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 September 2013; accepted: 18 November 2013; published online: 06 December 2013.*

*Citation: Seger CA (2013) The visual corticostriatal loop through the tail of the caudate: circuitry and function. Front. Syst. Neurosci. 7:104. doi: 10.3389/fnsys.2013.00104*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Seger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Exploring the cognitive and motor functions of the basal ganglia: an integrative review of computational cognitive neuroscience models

## *Sebastien Helie<sup>1</sup>\*, Srinivasa Chakravarthy<sup>2</sup> and Ahmed A. Moustafa<sup>3</sup>*

*<sup>1</sup> Department of Psychological Sciences, Purdue University, West Lafayette, IN, USA*

*<sup>2</sup> Department of Biotechnology, Indian Institute of Technology, Madras, India*

*<sup>3</sup> Marcs Institute for Brain and Behaviour and School of Social Sciences and Psychology, University of Western Sydney, Sydney, NSW, Australia*

#### *Edited by:*

*Izhar Bar-Gad, Bar-Ilan University, Israel*

#### *Reviewed by:*

*Thomas Boraud, Université de Bordeaux, CNRS, France*

*Arvind Kumar, University of Freiburg, Germany*

#### *\*Correspondence:*

*Sebastien Helie, Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907, USA e-mail: shelie@purdue.edu*

Many computational models of the basal ganglia (BG) have been proposed over the past twenty-five years. While computational neuroscience models have focused on closely matching the neurobiology of the BG, *computational cognitive neuroscience* (CCN) models have focused on how the BG can be used to implement cognitive and motor functions. This review article focuses on CCN models of the BG and how they use the neuroanatomy of the BG to account for cognitive and motor functions such as categorization, instrumental conditioning, probabilistic learning, working memory, sequence learning, automaticity, reaching, handwriting, and eye saccades. A total of 19 BG models accounting for one or more of these functions are reviewed and compared. The review concludes with a discussion of the limitations of existing CCN models of the BG and prescriptions for future modeling, including the need for computational models of the BG that can simultaneously account for cognitive and motor functions, and the need for a more complete specification of the role of the BG in behavioral functions.

**Keywords: basal ganglia, computational cognitive neuroscience, cognitive function, motor function, Parkinson's disease**

### **INTRODUCTION**

The basal ganglia (BG) are a group of nuclei at the base of the forebrain that are strongly connected to the cortex. While the role of the BG had historically been restricted to motor function, a substantive amount of recent research suggests that the BG are also involved in a variety of cognitive functions (Steiner and Tseng, 2010). Behavioral and neural experiments with human and non-human animals have informed our understanding of BG function for over a century, and the past two decades have seen an increased use of computational models to simulate the anatomy and functionality of the BG. The most anatomically detailed computational neuroscience models seldom go as far as simulating complex animal behavior (because of complexity issues), but simpler (less anatomically detailed) models can be used to simultaneously account for some anatomical details and complex animal behavior. The strength of these latter *computational cognitive neuroscience* (CCN) models lies in that they can simultaneously account for both neuroscience data and behavioral data (Ashby and Helie, 2011).

This review article focuses on CCN models of the BG and classifies existing models according to cognitive and motor function. The remainder of this article is organized as follows. First, the anatomy that is usually included in CCN models of the BG is reviewed. This anatomy section is incomplete by design, as only details that are simulated to account for specific cognitive or motor function are included. Next, we review CCN models used to simulate six different cognitive functions, namely categorization, instrumental conditioning, probabilistic learning, working memory, sequence learning, and automaticity. This presentation is followed by CCN models of motor function. Computational cognitive neuroscience models of motor functions are separated into models of reaching, handwriting, and eye saccades. The review concludes with a discussion of the limitations of existing CCN models of the BG and prescriptions for future modeling. Future directions emphasize the need for CCN models of the BG that can simultaneously account for cognitive and motor functions, and the need for a more complete specification of the role of the BG in the reviewed functions.

#### **NEUROANATOMY OF THE BASAL GANGLIA**

The BG include the striatum (caudate, putamen, nucleus accumbens), the globus pallidus (GP), the subthalamic nucleus (STN), the substantia nigra (SN), the ventral tegmental area, and the olfactory tubercle (see **Figure 1**). The striatum receives the majority of afferent connections, whereas the internal segment of the GP (GPi) and the SN pars reticulata (SNr) are the sources of the majority of efferent connections that target cortical regions via the thalamus. Based on both structural and functional evidence, the striatum is often divided into a ventral and a dorsal part. The ventral striatum includes the nucleus accumbens, ventromedial portions of the caudate and putamen, and the olfactory tubercle. The dorsal striatum, which is usually the main focus of CCN models of the BG, includes the remainder of the caudate and putamen.

Virtually all of the neocortex sends excitatory (glutamatergic) projections to the striatum (Reiner, 2010). Corticostriatal input is massively convergent with estimates ranging from 5,000 to 10,000 cortical neurons converging on a single striatal medium spiny neuron (MSN; the main striatal projection neurons) (Kincaid et al., 1998). Classically, corticostriatal organization is thought to follow a fairly strict spatial topography (Kemp and Powell, 1970). Along the rostral-to-caudal extent of the BG, the cortical afferents tend to be more prevalent from rostral-to-caudal cortical regions. For instance, ventral striatum receives input predominantly from orbitofrontal cortex, ventromedial prefrontal cortex, and anterior cingulate cortex (ACC). As one moves caudally within the striatum, inputs from areas 9, 46, and 8 become more prevalent (Haber et al., 2006; Calzavara et al., 2007), followed by inputs from premotor regions (area 6) with the most caudal motor and somatosensory cortical regions projecting preferentially to the caudal putamen (Flaherty and Graybiel, 1994). Spatial topography holds as you continue rostrally and ventrally through parietal and temporal cortices as well as other extrastriate visual areas (Kemp and Powell, 1970; Yeterian and Pandya, 1993, 1995, 1998).

The thalamus provides another major source of input to the BG (Wilson, 2004), with the majority of thalamostriatal projections originating from the parafascicular-centromedian (CMPf) complex (Smith et al., 2010). Thalamic input to the striatum synapses on both MSNs and cholinergic tonically active neurons (TANs; a class of large-body interneurons) (Smith et al., 2004), with the latter likely playing an important role in modulating cortico-striatal synaptic plasticity (Ashby and Crossley, 2011). Finally, thalamic input to the striatum is in a position to modulate BG function by virtue of cortico-thalamo-striatal connections and striatal-thalamo-striatal feedback (Smith et al., 2010).

The BG also receive dopaminergic input that plays a critical role in modulating striatal activity. Dopamine is projected from the ventral tegmental area and SN pars compacta to the BG and prefrontal cortex, among other brain regions. Dopamine firing patterns fluctuate between two different modes: phasic and tonic. While the phasic mode is fast-acting and spans milliseconds, the tonic mode is long-acting and can span minutes or hours. The dissociable function of both phasic and tonic dopamine is debatable (Dreher and Burnod, 2002; Assadi et al., 2009; Moustafa et al., 2013). However, various studies suggest that phasic dopamine plays a key role in synaptic plasticity and reinforcement learning (Wickens et al., 1996; Reynolds et al., 2001), while tonic dopamine is important for speeding up reaction times (Niv et al., 2007; Moustafa et al., 2008) and controlling the signal-to-noise ratio (Durstewitz and Seamans, 2008).
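One common way to formalize the link between phasic dopamine and reinforcement learning is as a reward prediction error that drives incremental value updates. The Python sketch below illustrates that idea under stated assumptions: the function name `update_value`, the learning rate, and the reward schedule are hypothetical choices made for this example and are not taken from any of the models reviewed here.

```python
def update_value(value, reward, learning_rate=0.1):
    """Return (new_value, prediction_error) after one trial.

    The prediction error stands in for the phasic dopamine signal
    (an assumption of this sketch): positive when the outcome is better
    than expected, negative when it is worse.
    """
    prediction_error = reward - value
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error


# Hypothetical schedule: 200 rewarded trials, then one reward omission.
value = 0.0
errors = []
for _ in range(200):
    value, error = update_value(value, reward=1.0)
    errors.append(error)

learned_value = value
_, omission_error = update_value(learned_value, reward=0.0)

print(f"first-trial error: {errors[0]:.2f}")
print(f"late-trial error:  {errors[-1]:.2e}")
print(f"omission error:    {omission_error:.2f}")
```

In this sketch the error is large when the reward is unexpected, shrinks toward zero once the reward is predicted, and turns negative when a predicted reward is omitted.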

Information flow through the BG follows two distinct pathways (see **Figure 1**). Striatal MSNs in the direct pathway project directly to the output nuclei (e.g., GPi) and selectively express D1-like receptors (i.e., D1 and D5; Gerfen et al., 1990). The striatal MSNs in the indirect pathway project to the external segment of the GP (GPe) prior to reaching the output nuclei of the BG (e.g., GPi), and selectively express D2-like receptors (i.e., D2, D3, and D4; Gerfen and Young, 1988). In addition to the direct and indirect pathways, the STN is another major input structure of the BG receiving extensive cortical and thalamic input. This so-called hyperdirect pathway provides a means by which frontal cortical regions can monosynaptically influence STN function (Nambu et al., 2002).
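The opposing effects of the direct and indirect pathways on BG output can be illustrated with a minimal steady-state rate sketch in Python. It is a deliberate simplification and not any of the reviewed models: the indirect route is collapsed into a single inhibitory GPe-to-GPi link, the hyperdirect pathway is omitted, and the function name and tonic rates are hypothetical values in arbitrary units.

```python
def relu(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)


def thalamic_output(direct_drive, indirect_drive,
                    gpe_tonic=1.0, gpi_tonic=2.0, thalamus_tonic=1.5):
    """Steady-state thalamic rate for given striatal MSN activity.

    direct_drive:   activity of direct-pathway MSNs
    indirect_drive: activity of indirect-pathway MSNs
    The tonic values are hypothetical and chosen only for illustration.
    """
    gpe = relu(gpe_tonic - indirect_drive)      # indirect MSNs inhibit GPe
    gpi = relu(gpi_tonic - direct_drive - gpe)  # direct MSNs and GPe inhibit GPi
    return relu(thalamus_tonic - gpi)           # GPi inhibits the thalamus


rest = thalamic_output(0.0, 0.0)
direct_only = thalamic_output(0.5, 0.0)
indirect_only = thalamic_output(0.0, 0.5)
print("rest:", rest)
print("direct pathway active:", direct_only)
print("indirect pathway active:", indirect_only)
```

With these assumed values, direct-pathway activity releases the thalamus from GPi inhibition, whereas indirect-pathway activity suppresses GPe, raises GPi output, and lowers thalamic activity below rest.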

With abundant dopamine receptors in the BG affecting the dynamics of the different pathways, most CCN models of the BG include a role for dopamine. One important way of testing whether the hypothesized role for dopamine in the model is adequate is to simulate the model under dopamine-depleted conditions. Specifically, reducing the amount of dopamine available in the model should produce Parkinsonian symptoms. Parkinson's disease (PD) is caused by the accelerated death of dopamine-producing neurons. The pattern of cell loss is opposite to that of, and more severe than in, normal aging. Within the SN pars compacta, cell loss is predominately found in the ventral tier with less (but still extensive) damage in the dorsal tier (Fearnley and Lees, 1991; Gibb and Lees, 1991). In contrast, normal aging yields substantially less cell loss and in a dorsal-to-ventral pattern. Parkinsonian motor symptoms appear after a loss of 60–70% of SN pars compacta cells and 70–80% of dopamine levels in striatal nuclei (Bernheimer et al., 1973; Gibb and Lees, 1991). Motor symptoms include tremor, rigidity, bradykinesia, and akinesia. In addition to motor deficits, non-demented PD patients present cognitive symptoms that resemble those observed in patients with frontal damage. Numerous studies documenting cognitive deficits of PD patients have revealed impairment in a variety of tasks related to memory, learning, visuospatial skills, and attention (e.g., Gotham et al., 1988; Price et al., 2009).
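The dopamine-depletion test can be sketched in a few lines of Python. The example assumes the commonly used simplification that dopamine boosts direct-pathway (D1-like) MSNs and dampens indirect-pathway (D2-like) MSNs; the function name, the linear gain terms, and the dopamine scale (1.0 for a healthy level, 0.0 for full depletion) are hypothetical and do not reproduce any specific model reviewed here.

```python
def net_facilitation(cortical_input, dopamine):
    """Direct-minus-indirect striatal drive for a given dopamine level.

    dopamine: 1.0 is the (hypothetical) healthy level, 0.0 is full depletion.
    Assumed modulation: dopamine boosts direct-pathway (D1-like) MSNs and
    dampens indirect-pathway (D2-like) MSNs.
    """
    direct = cortical_input * (0.5 + 0.5 * dopamine)
    indirect = cortical_input * (1.0 - 0.5 * dopamine)
    return direct - indirect


healthy = net_facilitation(1.0, dopamine=1.0)
depleted = net_facilitation(1.0, dopamine=0.25)  # roughly a 75% dopamine loss
print(f"healthy:  {healthy:+.2f}")
print(f"depleted: {depleted:+.2f}")
```

Under these assumptions the same cortical input yields net facilitation at the healthy dopamine level and net suppression after depletion, which is the qualitative direction a model would need to show to mimic reduced movement.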

## **COGNITIVE FUNCTION**

While many cognitive functions have been attributed to the BG (for a review, see Steiner and Tseng, 2010), relatively few have been modeled and numerically simulated using CCN models, i.e., models that can simultaneously account for both neurobiological and behavioral data. Hence, this review does not constitute an attempt at reviewing all the cognitive and motor functions attributed to the BG: the focus is on CCN models of the BG. Note that the model descriptions included are conceptual, in that implementation details and mathematical formulations are not discussed. The reader is referred to the cited original papers for model details and equations. **Table 1** summarizes the reviewed models along with their respective components.

#### **CATEGORIZATION**

Categorization is the ubiquitous process by which individual items are grouped to form categories. The massive convergence of cortico-striatal connectivity makes the BG an ideal site for categorization, and much research supports the role of the BG in category learning (for a review, see Seger, 2008).

#### *Models*

One of the earliest and most prominent neurobiological models of categorization is called COVIS (Ashby et al., 1998). COVIS is a multiple-system theory that was originally developed to account for the many behavioral dissociations between verbal and nonverbal categorization (as described by the general recognition theory; Ashby and Gott, 1988). COVIS includes a hypothesis-testing system and a procedural learning system. The hypothesis-testing system can quickly learn a small set of (e.g., verbal) categories (those that can be found by hypothesis-testing and often be verbally described) while the procedural learning system can learn any type of arbitrary categories in a slow trial-and-error manner (e.g., non-verbal). Each categorization system relies on a separate brain circuit but, interestingly, they both include the BG. In the hypothesis-testing system, the BG is used to support working memory maintenance and for rule switching. In the procedural system, the BG is used to learn stimulus-response associations. The COVIS model of categorization has been used to simulate a large number of category learning experiments and has made several behavioral predictions, many of which have been later confirmed by empirical experiments (for a review, see Maddox and Ashby, 2004). For example, COVIS predicts that delaying the feedback in verbal categorization should not affect performance (because the hypothesis-testing system relies on working memory) whereas delaying feedback in non-verbal categorization should impair learning (because the procedural learning system relies on dopamine-mediated reinforcement learning in the BG). This prediction was confirmed in Ashby et al. (2003). In addition, reducing dopamine levels in COVIS can account for many cognitive symptoms in PD patients such as perseveration, reduced sensitivity to negative feedback, and others (see Helie et al., 2012a,b). Likewise, dopamine elevation can account for the effect of positive affect on verbal category learning (Helie et al., 2012b). While most COVIS simulations have used a rate version of the model, a spiking version of the procedural-learning system has been used to account for some categorization results and extended to account for instrumental conditioning (Ashby and Crossley, 2011) and automaticity (Ashby et al., 2007).

#### **Table 1 | Summary of the basal ganglia components included in the reviewed models.**

*Note. DP = Direct pathway [(1) in Figure 1]; IP = Indirect pathway [(2) in Figure 1]; HP = Hyperdirect pathway [(3) in Figure 1]; Str = Striatum; GPi = Globus pallidus (internal); GPe = Globus pallidus (external); STN = Subthalamic nucleus. \* These models used the substantia nigra pars reticulata (SNr) as their output node of the basal ganglia. In this context, the SNr is functionally equivalent to the GPi.*

As an alternative, Moustafa and Gluck (2011a,b) proposed a computational model of the striatum and prefrontal cortex that focuses on the dopamine projections to these areas as well as their interactions during multi-cue category learning. In this task, participants learn to select and pay attention to a single cue among a multi-cue pattern, and then make a categorization response. Participants learn this task via corrective feedback. In the model, the prefrontal cortex is essential for attentional selection while the striatum is used for motor response selection. Similar to COVIS, the Moustafa and Gluck (2011a,b) model can account for categorization deficits in PD patients by reducing dopamine levels in both the BG and prefrontal cortex, which is in agreement with empirical results (Kaasinen et al., 2001; Silberstein et al., 2005). Additionally, the Moustafa and Gluck (2011a,b) model can account for some effects of dopaminergic and anticholinergic medication. The Moustafa and Gluck (2011a,b) model assumes that the administration of anticholinergic medications in PD interferes with hippocampal function, which is also in agreement with empirical studies (Meco et al., 1984; Pondal et al., 1996; Ehrt et al., 2010; Herzallah et al., 2010). In contrast, the current version of COVIS has not been used to simulate medication effects in PD.

#### *Synthesis*

Both reviewed models of categorization agree that the BG, and its interaction with the prefrontal cortex, are essential for category learning. Furthermore, they agree that dopamine levels in both the BG and prefrontal cortex are important. While COVIS (Ashby et al., 1998) has been used to simulate a wider range of categorization tasks, the Moustafa and Gluck (2011a,b) model has been used to simulate more details in a smaller subset. For example, one limitation of the Moustafa and Gluck (2011a,b) model is that it does not simulate complex multi-cue learning tasks that involve paying attention to more than one stimulus (which can be done using COVIS). However, the Moustafa and Gluck (2011a,b) model can simulate the effect of dopaminergic medication, whereas COVIS has not been used to simulate medication effects. One important difference between the COVIS and Moustafa and Gluck (2011a,b) models is that COVIS assigns different roles to BG and cortical dopamine, namely error signal and signal gain (respectively). In contrast, Moustafa and Gluck (2011a,b) assign both of these roles to dopamine in both the BG and the prefrontal cortex. In addition, an important limitation of both models is that they oversimplify the anatomy of the BG by not including the indirect and hyperdirect pathways. Future work aimed at increasing the biological accuracy of COVIS and the Moustafa and Gluck (2011a,b) models may highlight some additional key differences between the modeling approaches and allow for selecting the most appropriate BG model of categorization.

#### **INSTRUMENTAL CONDITIONING**

Instrumental conditioning (also called "operant" conditioning) is a process by which animals learn to behave in a way that will maximize reward and minimize punishment. In a typical instrumental conditioning experiment, a neutral environment is altered and begins generating rewards (acquisition phase). Next, the reward is removed from the environment and the environment reverts to its neutral state (extinction phase). Extinction is usually followed by a reacquisition phase, where the reward is introduced again in the same neutral environment. Typically, a new behavior is learned during the acquisition phase, and progressively disappears during the extinction phase. The behavior reappears during the reacquisition phase, usually at a much faster rate than during the initial acquisition phase. This phenomenon is called *fast reacquisition*. Much evidence implicates the BG in instrumental conditioning (e.g., O'Doherty et al., 2004; Yin et al., 2005), but the neurobiology underlying extinction and fast reacquisition is poorly understood.

#### *Models*

Ashby and Crossley (2011) proposed a spiking model of the direct pathway of the BG (see **Figure 1**) inspired by the COVIS procedural learning system (Ashby et al., 1998) to account for instrumental conditioning. The Ashby and Crossley (2011) model focuses on the TANs, a population of cholinergic interneurons in the striatum that is rarely included in CCN models of the BG. Pakhotin and Bracci (2007) have shown that TANs play an important role in inhibiting cortical activation of the MSNs (i.e., the projection neurons generally modeled in the direct and indirect pathways). As suggested by their name, TANs have a high baseline firing rate, but they learn to pause in rewarding contexts (Apicella, 2007). Ashby and Crossley (2011) suggest that one possible role for the TANs is to protect striatal learning from catastrophic interference and allow for fast reacquisition. In addition to the direct pathway, the Ashby and Crossley (2011) model includes a sensory association area, the supplementary motor area (SMA), and the CMPf complex.

The Ashby and Crossley (2011) model is an open loop through the BG (from sensory association cortex to the SMA). The stimulus activates the sensory association cortex, which in turn activates the striatum and the direct pathway of the BG. At the same time, the context activates the CMPf complex, which transmits activation to the TANs (this pathway is not included in **Figure 1**). At the beginning of an experiment, the simulated subject does not know that the context is rewarding. Hence, the TANs do not pause, and the MSNs in the direct pathway cannot be activated by the sensory association cortex. This prevents stimulus-response association learning. During the acquisition phase, the TANs quickly learn that the context is rewarding and pause. The MSNs are thus released from inhibition and the model learns to produce the rewarding behavior using reinforcement learning. Next, during the extinction phase, the TANs learn that the context is no longer rewarding and stop pausing. This change inhibits the MSNs and interrupts cortico-striatal learning. Hence, the associations that were learned during the acquisition phase are protected. Finally, during the reacquisition phase, the context becomes rewarding again, and the TANs learn to pause. The MSNs are released from inhibition, and the learned associations are in the same state as in the acquisition phase, which produces fast reacquisition. Using this process, the model has been used to reproduce the acquisition, extinction, and fast reacquisition phases typical of instrumental conditioning and single-cell recording data from TANs showing that the cells learn to pause in rewarding contexts (Ashby and Crossley, 2011).
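The gating logic described above can be illustrated with a deliberately simplified sketch. This is not the spiking model of Ashby and Crossley (2011): it is a rate-based toy in which a single `pause` variable stands in for the TANs, a single `weight` stands in for the cortico-striatal association, and the learning rates (`lr_tan`, `lr_msn`) and the 0.5 criterion are arbitrary assumptions chosen only for illustration.

```python
def simulate_tan_gating(phases, lr_msn=0.1, lr_tan=0.5):
    """Toy rate-based sketch of TAN-gated cortico-striatal learning.

    phases: list of (n_trials, rewarded) tuples, e.g. acquisition,
    extinction, reacquisition. Returns the response strength per trial.
    All names and parameter values are illustrative assumptions.
    """
    weight = 0.0   # cortico-striatal (stimulus-response) association
    pause = 0.0    # degree to which the TANs pause (0 = no pause)
    responses = []
    for n_trials, rewarded in phases:
        target = 1.0 if rewarded else 0.0
        for _ in range(n_trials):
            # The TANs quickly learn whether the context is rewarding.
            pause += lr_tan * (target - pause)
            # MSNs can only respond to the extent that the TANs pause.
            responses.append(pause * weight)
            # Plasticity is gated by the pause, so the association is
            # largely protected once the TANs stop pausing.
            weight += lr_msn * pause * (target - weight)
    return responses


def trials_to_criterion(responses, criterion=0.5):
    """Return the 1-based trial on which the response first reaches criterion."""
    for trial, response in enumerate(responses, start=1):
        if response >= criterion:
            return trial
    return None


phases = [(50, True), (50, False), (50, True)]
responses = simulate_tan_gating(phases)
acquisition = trials_to_criterion(responses[:50])
reacquisition = trials_to_criterion(responses[100:])
print(acquisition, reacquisition)
```

In this sketch, the response vanishes during extinction because the gate closes, not because the association is unlearned, so the criterion is reached in fewer trials during reacquisition than during the initial acquisition.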

#### *Synthesis*

The Ashby and Crossley (2011) model is the only CCN model of instrumental conditioning that can simultaneously account for behavioral (e.g., fast reacquisition) and single-cell data (from the TANs). This model constitutes an important step in that it provides an implementation and numerical simulation of the theory that TANs learn to pause in rewarding contexts, and how this can affect reinforcement learning in the BG. However, the neuroanatomy of the BG was simplified in that only the direct pathway through one of the cortico-BG loops was included. It is unclear at this time how the TANs' dynamics would affect the indirect pathway, or how the model would behave if more than one loop was included. Future work is needed to verify how the theory implemented in Ashby and Crossley (2011) behaves in a more detailed model of the BG.

#### **PROBABILISTIC LEARNING**

Probabilistic learning typically refers to tasks where the association between the response and the reward is uncertain. Unlike most categorization experiments, the same response to the same stimulus can result in different outcomes on different trials. While probabilistic learning has been shown to rely on the BG since the mid-1990s (Knowlton et al., 1996), it took a decade before CCN models of the BG were used to attempt to account for probabilistic learning.

#### *Models*

The Frank (2005) model was originally proposed to account for cognitive deficits in parkinsonism. The model includes both the direct and indirect pathways through the BG (see **Figure 1**), the premotor cortex, and an unspecified input area (probably located in posterior cortex), so the model is an open loop. In the Frank (2005) model, the input activates both the premotor cortex and the striatum. However, cortical activation is insufficient to produce a response, so BG processing is required to gate the correct response. The focus of the model is on: (1) the role of the indirect pathway in probabilistic learning and (2) the role of dopamine in probabilistic learning. In the Frank (2005) model, the direct pathway is in charge of selecting the appropriate action (Go) whereas the indirect pathway is in charge of inhibiting inappropriate actions (NoGo). The direct and indirect pathways converge in the GPi and compete to control GPi activation, and eventually the response. Simulation results show that removing the indirect pathway in the model reduces performance, suggesting that both the direct and indirect pathways are essential in probabilistic learning. In addition, the effect of the indirect pathway needs to be specific to each action (so that the indirect pathway can individually inhibit each action).

As described in the neuroanatomy section above, the competition between the direct and indirect pathways is modulated by dopamine (the second focus of the Frank (2005) model). Specifically, higher dopamine levels increase activation in the direct pathway (e.g., through D1 receptors) and reduce activation in the indirect pathway (e.g., through D2 receptors). Hence, dopamine release following unexpected rewards results in long-term potentiation (LTP) in the direct pathway and long-term depression (LTD) in the indirect pathway. In contrast, dopamine dips following the unexpected absence of a reward reduce activation and produce LTD in the direct pathway but increase activation and produce LTP in the indirect pathway. The simulation results suggest that the dynamic range of the dopamine signal is crucial in probabilistic learning and reversal learning (e.g., when the response-reward associations are changed during learning). Reducing (to simulate PD) or increasing (to simulate medication overdose) dopamine levels can result in simulated Parkinsonian symptoms (Frank, 2005).
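The opposing effects of dopamine bursts and dips on the two pathways can be summarized in a small sketch. This is not the neural network of Frank (2005); it is a two-weight abstraction in which `go` and `nogo` stand in for direct- and indirect-pathway synaptic strengths, and the `burst`, `dip`, and `lr` values are assumptions made purely for illustration.

```python
def update_pathways(go, nogo, action, dopamine, lr=0.1):
    """Apply one dopamine-dependent update to the chosen action.

    dopamine > 0 (burst): LTP in the direct (Go) pathway, LTD in the
    indirect (NoGo) pathway. dopamine < 0 (dip): the reverse.
    Weights are clipped to [0, 1]. Names and values are illustrative.
    """
    go[action] = min(1.0, max(0.0, go[action] + lr * dopamine))
    nogo[action] = min(1.0, max(0.0, nogo[action] - lr * dopamine))


def train(outcomes, n_actions=2, burst=1.0, dip=-1.0, lr=0.1):
    """outcomes: list of (action, rewarded) pairs experienced in order."""
    go = [0.5] * n_actions
    nogo = [0.5] * n_actions
    for action, rewarded in outcomes:
        update_pathways(go, nogo, action, burst if rewarded else dip, lr)
    return go, nogo


def net_drive(go, nogo):
    """Go minus NoGo: positive values favor facilitating the action."""
    return [g - n for g, n in zip(go, nogo)]


# Action 0 is rewarded on 4 of 5 trials, action 1 on 1 of 5 trials.
outcomes = [(0, True)] * 4 + [(0, False)] + [(1, False)] * 4 + [(1, True)]
normal = net_drive(*train(outcomes))
blunted = net_drive(*train(outcomes, burst=0.0))  # no bursts, dips intact
print(normal, blunted)
```

With both bursts and dips available, the frequently rewarded action ends with a positive net drive and the rarely rewarded action with a negative one. Setting `burst=0.0` is one crude way to mimic a compressed dopamine range: learning is then driven only by dips, and even the frequently rewarded action ends with a negative net drive.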

Another interesting model of probabilistic learning was recently proposed by Guthrie et al. (2013). The Guthrie et al. (2013) model is based on an earlier computational neuroscience model of the BG that focuses on the interaction between the direct and hyperdirect pathways (Leblois et al., 2006). The Guthrie et al. (2013) model includes two cortico-BG closed loops that interact in the striatum. The first loop is called the "cognitive" loop and is used to identify the visual symbols used in the probabilistic learning task. The second loop is called the "motor" loop and is used to select a response based on the observed symbols. Some of the corticostriatal projections affect both loops, but the rest of the circuit is segregated. In both loops, the direct pathway is in charge of selecting the correct channel (i.e., identifying the symbols or the response) while the hyperdirect pathway sends non-specific inhibition to the GPi to produce a center-surround decision process. All corticostriatal synapses are plastic (using dopamine-mediated reinforcement learning) and the cognitive loop gradually learns to bias the motor loop, thus producing faster reaction times. The model successfully reproduces both neural firing rates and behavioral data in the two-armed bandit task.
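The center-surround idea can be sketched in a few lines. The following is a static toy, not the dynamical model of Guthrie et al. (2013) or Leblois et al. (2006): the direct pathway is reduced to a channel-specific subtraction from GPi activity, the hyperdirect pathway to a term shared by all channels that raises GPi activity (and therefore suppresses every action), and the weights, baseline, and selection rule are assumptions chosen for illustration.

```python
def gpi_activity(cortex, w_direct=1.0, w_hyperdirect=0.5, baseline=0.5):
    """Toy GPi activity for each channel given cortical input.

    The hyperdirect term is shared by all channels (the "surround"),
    whereas the direct term is channel specific (the "center").
    All parameter names and values are illustrative assumptions.
    """
    surround = w_hyperdirect * sum(cortex)
    return [max(0.0, baseline + surround - w_direct * c) for c in cortex]


def selected_channels(cortex, w_hyperdirect=0.5, baseline=0.5):
    """A channel counts as selected if its GPi activity drops below baseline."""
    gpi = gpi_activity(cortex, w_hyperdirect=w_hyperdirect, baseline=baseline)
    return [i for i, g in enumerate(gpi) if g < baseline]


cortex = [1.0, 0.6, 0.2]
with_hyperdirect = selected_channels(cortex)
without_hyperdirect = selected_channels(cortex, w_hyperdirect=0.0)
print(with_hyperdirect, without_hyperdirect)
```

In this toy, removing the shared term lets every active channel disinhibit its target, whereas including it leaves only the most strongly driven channel below baseline, which is the contrast-enhancing effect that the center-surround description refers to.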

The categorization models reviewed earlier have also been applied to probabilistic learning. The Moustafa and Gluck (2011a) model focused on the role of dopamine in probabilistic learning. In addition to simulating probabilistic learning with normal dopamine levels, Moustafa and Gluck (2011a) have simulated the effect of decreased dopamine (as in PD) and the effect of dopaminergic medication in both the BG and prefrontal cortex. The COVIS model has also been used to simulate probabilistic learning (Helie et al., 2012a). While COVIS was not used to simulate medication effects, the model could account for probabilistic learning with normal and reduced (as in PD) dopamine levels (with a dosage effect such that the lowest levels of dopamine produced the worst performance; see Knowlton et al., 1996).

#### *Synthesis*

The reviewed models of probabilistic learning tend to be more biologically detailed than the reviewed models of categorization. Specifically, the Frank (2005) model includes the direct and indirect pathways, whereas the Guthrie et al. (2013) model includes the direct and hyperdirect pathways. In contrast, COVIS (Ashby et al., 1998) and the Moustafa and Gluck (2011a,b) models only included the direct pathway. Interestingly, however, the Frank (2005) model does not include the same details as the Guthrie et al. (2013) model. Both models include the direct pathway for action selection, and use dopamine-mediated reinforcement learning to learn corticostriatal synapses. However, the Frank (2005) model uses the indirect pathway as a channel-specific excitatory process to cancel inappropriate actions whereas Guthrie et al. (2013) use the hyperdirect pathway as a non-specific excitatory process to cancel inappropriate actions. Neither model includes both the indirect and hyperdirect pathways. While there is agreement on the need for an excitatory process to enhance the contrast between the selected and non-selected actions, the exact process is still to be determined.

While the categorization models only included the direct pathway of the BG, one of their strengths is that, in addition to their generality, they also include other brain areas. For instance, unlike the Frank (2005) model, the Moustafa and Gluck (2011a,b) model simulates the role of prefrontal cortex and its dopamine projections, which is in agreement with empirical studies (Mulder et al., 2003; Histed et al., 2009). Also, analysis of the parameter space in the COVIS simulations challenges the role of the BG for procedural learning in probabilistic learning, and suggests instead that the BG are used for hypothesis-testing in this task (Gluck et al., 2002). So, both categorization models agree on an important role for the prefrontal cortex in probabilistic learning, and this role is missing from both the Frank (2005) and the Guthrie et al. (2013) models. The most productive future approach might be to add a prefrontal cortex to the existing probabilistic learning models and see how this addition affects the dynamics of the different BG pathways.

#### **WORKING MEMORY**

Working memory is a cognitive function used to maintain and manipulate information in real-time for a short duration (seconds). While working memory has traditionally been associated with the prefrontal cortex (Fuster, 2008), Monchi et al. (2000) proposed that the BG may be required to maintain information in prefrontal cortex.

#### *Models*

The Monchi et al. (2000) model was originally proposed to account for working memory deficits in PD and schizophrenia. The model includes three BG-thalamocortical closed loops: two with the prefrontal cortex (one for spatial information and the other for object information), and one through the ACC (for strategy selection). In all three cases, only the direct pathway through the BG was included (see **Figure 1**). The role of the two prefrontal-BG loops is to maintain working memory information about the stimuli, whereas the ACC maintains the adopted strategy by inhibiting all the prefrontal cortex loops except one (i.e., representing the selected strategy). Visual input to the BG comes from the posterior parietal cortex (spatial) and inferior temporal cortex (object). The model output is located in the premotor cortex, and the nucleus accumbens (not shown in **Figure 1**) is used to provide feedback. In the model, the visual stimulus is input to the prefrontal cortex loops, and the stimulus activity is propagated through the direct pathway of the BG. As a result, the thalamus is released from GPi inhibition, and activation produced by the stimulus in the prefrontal cortex reverberates through closed loops with the thalamus. When a response is required, the prefrontal cortex transfers its activation to the premotor cortex. If the response is incorrect, the nucleus accumbens sends a feedback signal to the ACC loop, which selects a new strategy by switching its inhibition to different prefrontal cortex loops. The Monchi et al. (2000) model has been used to simulate a delayed response task and the Wisconsin Card Sorting Test. Interestingly, reducing the connection strengths within the BG-thalamocortical loops produces Parkinsonian symptoms, whereas reducing nucleus accumbens activity produces deficits similar to those observed in schizophrenia (Monchi et al., 2000).

Five years later, Ashby et al. (2005) proposed the FROST model to account for intact spatial working memory maintenance. Similar to the Monchi et al. (2000) model, FROST includes the direct pathway of the BG (see **Figure 1**), and working memory maintenance relies on reverberating activation between the prefrontal cortex and the thalamus. However, unlike in the Monchi et al. (2000) model, only one prefrontal cortex loop is included, and thalamic activation is not sufficient to maintain prefrontal activity: a second set of closed-loops between the prefrontal cortex and posterior cortex needs to be simultaneously activated to maintain working memory information. In Ashby et al. (2005), the focus is on simulating spatial working memory tasks, and the area of posterior cortex required for working memory maintenance is the posterior parietal cortex. However, it is likely the case that the specific location in posterior cortex depends on what information is being maintained. For instance, if the items being maintained in working memory were objects, then it is likely that the posterior cortex area involved would be inferior temporal cortex. Another difference between FROST and the Monchi et al. (2000) model is that the striatum in FROST is activated by a different population of prefrontal neurons (separate from the working memory maintenance prefrontal population) whereas the same prefrontal neurons are used to activate the striatum and maintain information in Monchi et al. (2000). As a result, the striatum becomes activated only after the stimulus has disappeared in FROST, whereas the striatum becomes activated as soon as the stimulus appears in Monchi et al. (2000). These differences between FROST and Monchi et al. (2000) were motivated by recent single-cell recording results reviewed in Ashby et al. (2005). 
FROST has been used to reproduce single-cell recordings from many experiments in several brain regions, in addition to accounting for working memory capacity limitation and the relation between memory span and the ability to ignore distracting information (Ashby et al., 2005).

One common theme of the two previous models is that working memory activity is maintained by closed loops between the thalamus and prefrontal cortex, and the main role of the BG is to release the thalamus from inhibition and allow for the reverberating activation to take place. However, this view was challenged by Frank et al. (2001) who proposed a model of BG-prefrontal cortex interaction in working memory. Specifically, Frank et al. (2001) argued that in order for the thalamus to contribute to working memory maintenance in the way described by the previous models, it would need to have a dedicated number of cells comparable to the number of cells dedicated to working memory in prefrontal cortex (which is unlikely). Instead, the Frank et al. (2001) model proposes that working memory maintenance is accomplished by reverberating loops between two cell populations within the prefrontal cortex. Similar to Monchi et al. (2000) and FROST (Ashby et al., 2005), the Frank et al. (2001) model includes the direct pathway through the BG (see **Figure 1**), but the role of the direct pathway is to "turn on the switch" on the prefrontal cortex cells and allow for reverberating activation. The "switch" can only be turned on if the prefrontal cortex cells from one population simultaneously receive activation from the BG and the other prefrontal cortex cell population. Once the switch is "on", the BG is no longer required for working memory maintenance. The Frank et al. (2001) model has been used to simulate the 1-2-AX task, a working memory task that requires maintenance but also switching and selecting new items (Frank et al., 2001). Specifically, the 1-2-AX task requires the subject to maintain two cues in working memory in order to correctly select a response to a target sequence. The identity of the target sequence depends on the previous cue, which is used to trigger selection and switching.
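The "switch" idea can be caricatured as a latch. The sketch below is not the Frank et al. (2001) network: it replaces the two prefrontal populations with a single stored item, treats the BG signal as a boolean, and assumes that an update requires a BG gating signal and a cortical input on the same time step. The function name and the example stream are hypothetical.

```python
def run_gated_memory(inputs, gates):
    """Toy latch illustrating BG-gated working memory updating.

    inputs: item presented on each step (None if nothing is presented).
    gates: True when the BG sends a gating ("Go") signal on that step.
    Returns the maintained content after each step. Illustrative only.
    """
    memory = None
    history = []
    for item, gate in zip(inputs, gates):
        # The switch flips only when the BG signal and an input coincide.
        if gate and item is not None:
            memory = item
        # Otherwise the content is maintained without any BG involvement.
        history.append(memory)
    return history


# A cue ("1") is gated in, later items are ignored, then a new cue ("2")
# is gated in and replaces the old one.
inputs = ["1", "A", "X", None, "2"]
gates = [True, False, False, False, True]
history = run_gated_memory(inputs, gates)
print(history)
```

The point of the sketch is only that the BG signal determines when the stored content changes, while maintenance between gating events does not depend on the BG.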

One topic that was not addressed by any of the previous models of working memory is learning. How can the brain learn what is important, and what needs to be maintained in working memory? Moustafa and Maida (2007) proposed a computational model of prefrontal cortex and BG interactions that is similar to the Frank et al. (2001) model except that Moustafa and Maida (2007) also simulate: (a) temporal difference learning based on phasic dopamine signaling and (b) multiple corticostriatal loops that are responsible for both motor and cognitive processes. Specifically, the model includes a cortico-striatal motor loop and a cortico-striatal cognitive loop whose functions are action selection (choosing motor responses) and cognitive selection (choosing the perceptual information to be maintained in working memory), respectively. The model can account for the separate roles of the motor and cognitive loops in working memory maintenance, including delayed-response tasks.

Schroll et al. (2012) recently proposed a CCN model of working memory to address the problem of learning complex working memory tasks. The Schroll et al. (2012) model includes two separate BG-thalamocortical closed loops, one through the prefrontal cortex for maintenance and another through motor cortex to produce a response. Only the direct pathways were used for maintenance and response selection, but the hyperdirect pathway was also included in the prefrontal loop as a reset mechanism (see **Figure 1**). Specifically, visual information enters the model through the inferior temporal cortex, which then activates the lateral prefrontal cortex. This activation is transferred through the direct pathway of the BG and releases the thalamus from inhibition, which in turn activates the lateral prefrontal cortex. In the Schroll et al. (2012) model, working memory activation in the prefrontal cortex is maintained by a reverberating activation loop through the direct pathway, so the BG does not only act as a gating mechanism but is part of the maintenance loop (unlike Monchi et al., 2000; Frank et al., 2001; Ashby et al., 2005). At any moment, the prefrontal cortex can activate the STN, which increases activation in the GPi and interrupts working memory maintenance (i.e., the reset mechanism). More importantly, the connectivity between the prefrontal cortex and the striatum and the connections between the prefrontal cortex and the STN are learned using dopamine-mediated reinforcement learning. Hence, the model can automatically adapt and only maintain information that is rewarded in working memory. The model has been used to simulate a delayed response task, a delayed alternation task, and the 1-2-AX task (Schroll et al., 2012).

#### *Synthesis*

Working memory is one of the most active areas for CCN modeling of the BG. Five different models were reviewed, each having both commonalities and differences. First, all five models focused on the interaction between the BG and the prefrontal cortex, but only included the direct pathway of the BG for working memory maintenance and response selection. Hence, the neuroanatomy of the BG was not very detailed. Also, all models except Schroll et al. (2012) used the BG as a gating mechanism that turns working memory maintenance "on" or "off". The main difference is that Monchi et al. (2000) and Ashby et al. (2005) used the BG to gate closed loops between the prefrontal cortex and the thalamus, whereas Frank et al. (2001) and Moustafa and Maida (2007) used the BG to gate closed loops between two populations of prefrontal cortex units. This differs from Schroll et al. (2012) where the BG was not used for gating, but instead was part of the working memory maintenance mechanism itself (i.e., the closed loop went through the BG). In all cases, however, working memory maintenance was achieved by a closed loop through the prefrontal cortex.

Another important difference between the models is that the Ashby et al. (2005) and the Moustafa and Maida (2007) models focused on simple maintenance tasks. In contrast, the Monchi et al. (2000), Frank et al. (2001), and Schroll et al. (2012) models were able to simulate more complex tasks involving hierarchical structures and switching. Only the Moustafa and Maida (2007) and the Schroll et al. (2012) models include learning mechanisms that allowed for selecting the relevant information that needs to be maintained in working memory. The other models assumed a pre-filtering of the information.

Interestingly, there seems to be a progression and a building up of knowledge related to CCN models of working memory. The Schroll et al. (2012) model is the most recent, and also the most detailed. It is the only model that can learn and simulate complex tasks. However, this model departs from all the others in that the BG is not used as a gating mechanism but is part of the maintenance mechanism. This departure from previous modeling is not extensively discussed in Schroll et al. (2012), and it is unclear at this point what motivated this departure. More work is needed to determine which of these two roles the BG play in working memory, but the overlap in the models, and the progression in functionality, suggest a steady progress in CCN modeling of working memory.

#### **SEQUENCE LEARNING**

Almost all our everyday behaviors and cognitive activities can be interpreted as a sequence of steps that bring us closer to achieving a goal. One key question is how we learn to chain these sequences of substeps. Miyachi et al. (1997, 2002) have gathered much data suggesting that the BG is important in learning such sequences.

#### *Models*

Nakahara et al. (2001) formalized the findings of Miyachi et al. (1997, 2002) into a CCN model. According to Nakahara et al. (2001), sequences are learned in both visual and motor coordinates. The visual sequences are learned by a BG-thalamocortical closed loop linking the anterior striatum with the dorsolateral prefrontal cortex while motor sequences are learned by a BG-thalamocortical closed loop linking the posterior striatum with the SMA. Only the direct pathway through the BG is included in each loop (see **Figure 1**), and both the visual and motor loops learn every sequence in parallel using reinforcement learning. The visual loop learns faster than the motor loop, and response coordination between the loops is controlled by the pre-SMA. According to Nakahara et al. (2001), the visual loop relies on working memory and is important for rapid acquisition of sequences. However, the motor loop is more reliable and produces movement more rapidly after training. As a result, control is gradually transferred from the visual loop to the motor loop in the Nakahara et al. (2001) model. The Nakahara et al. (2001) model has been used to account for: (1) the time course of learning (including single-cell recordings and lesion studies); (2) the effect of opposite hand use; (3) the effect of sequence reversal; and (4) the change in brain locus from early to late sequence production (Nakahara et al., 2001).

#### *Synthesis*

The Nakahara et al. (2001) model is interesting for several reasons. First, it successfully accounts for lesion data, single-cell recordings, and behavioral phenomena. In addition, the transition from a visual loop to a motor loop represents an early attempt at bridging the gap between cognitive and motor functions of the BG. However, a recent study by Desmurget and Turner (2010) challenges the generality of the Nakahara et al. (2001) model. Specifically, Desmurget and Turner (2010) had monkeys perform a sequence of visually-cued joystick movements aimed at moving a cursor into a pre-determined part of a computer screen. After some training, muscimol was injected into the motor segment of the GPi to functionally disconnect the BG from the frontal cortex. The results show that the kinematics of the movements were impaired, but not sequence knowledge. Desmurget and Turner (2010) interpreted these results as suggesting that the BG contributes to motor execution in automatic sequence production, but not to the motor sequencing or the storage of the overlearned sequence. This result is problematic for the Nakahara et al. (2001) model.

#### **AUTOMATICITY**

Automaticity results from overtraining in a task until performance requires little attentional resources and becomes highly rigid (Helie et al., 2010; Helie and Cousineau, 2011). Many computational models of automaticity development have assigned a role for the BG.

#### *Models*

First, in the Nakahara et al. (2001) model of sequence learning (above), automaticity in sequence production is characterized by a gradual transfer from the visual loop (which learns sequences in visual coordinates) to the motor loop (which learns sequences in motor coordinates). This corresponds well with the results of Miyachi et al. (1997, 2002), who showed using single-cell recordings that task-sensitive cells in early learning are mostly located in the anterior striatum whereas selective cells in late sequence production are mostly located in the posterior striatum (Miyachi et al., 2002). This specialization of the striatum is further supported by inactivation studies where muscimol (a GABA agonist) was injected in different parts of the striatum in early and late sequence production. Well-learned sequence production was selectively disrupted by muscimol injection in the middle-posterior putamen while early sequence production was selectively disrupted by muscimol injection in the anterior caudate and putamen (Miyachi et al., 1997).

However, a recent study by Desmurget and Turner (2010) challenges the generality of the Nakahara et al. (2001) model. Specifically, injecting muscimol into the motor segment of the GPi to functionally disconnect the BG from the frontal cortex affects the kinematics of the movements but not sequence knowledge. These results suggest that the BG contributes to motor execution in automatic sequence production, but not to the motor sequencing or the storage of the overlearned sequence. Interestingly, the results of Desmurget and Turner (2010) are consistent with a neurobiological model of automaticity in perceptual categorization (SPEED) (Ashby et al., 2007). SPEED uses the procedural system of COVIS (Ashby et al., 1998) (i.e., the direct pathway of an open loop between posterior cortex and premotor areas) but also includes a Hebbian learning mechanism between posterior cortex and premotor areas. The role of the BG in SPEED is to learn to produce the correct categorization responses early in training to ensure that the correct motor plan in the premotor areas is consistently activated shortly after the visual representation in associative cortex (using dopamine-mediated reinforcement learning). This consistent association between associative and premotor cortical activity triggers Hebbian learning between associative cortex (stimulus) and the premotor areas (response), and the direct cortico-cortical connections eventually become strong enough so that the BG is no longer required to produce a response. When responding becomes purely cortical, the skill is said to be "automatic" [note that this is different from Nakahara et al. (2001), in which the posterior striatum is still required for automatic sequence production]. SPEED has been used to simulate single-cell recording data in many categorization experiments, as well as human reaction times and accuracies in categorization (Ashby et al., 2007; Helie and Ashby, 2009).
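The transfer of control from the BG to cortico-cortical connections can be illustrated with a minimal sketch. This is not the SPEED implementation of Ashby et al. (2007): it assumes a single stimulus-response pair that is always rewarded, a fast BG weight (`w_bg`) trained toward its maximum, a slow Hebbian cortico-cortical weight (`w_cc`) driven by the co-activation of stimulus and premotor units, and an arbitrary response threshold. All names and parameter values are illustrative assumptions.

```python
def train_speed_like(n_trials, lr_bg=0.2, lr_hebb=0.02):
    """Toy two-route learner: fast BG route plus slow cortico-cortical route.

    Returns (w_bg, w_cc) after n_trials of a single, always rewarded
    stimulus-response pair. Illustrative assumptions throughout.
    """
    w_bg = 0.0  # cortico-striatal weight (reward-driven learning)
    w_cc = 0.0  # direct cortico-cortical weight (Hebbian learning)
    for _ in range(n_trials):
        stimulus = 1.0
        # Premotor activity is driven by both routes and saturates at 1.
        premotor = min(1.0, (w_bg + w_cc) * stimulus)
        # The BG route learns quickly from reward.
        w_bg += lr_bg * (1.0 - w_bg)
        # Hebbian growth requires stimulus and premotor co-activation,
        # which early in training is provided by the BG route.
        w_cc += lr_hebb * stimulus * premotor * (1.0 - w_cc)
    return w_bg, w_cc


def responds(w_bg, w_cc, bg_intact=True, threshold=0.5):
    """Return True if the premotor drive reaches the response threshold."""
    drive = (w_bg if bg_intact else 0.0) + w_cc
    return drive >= threshold


early = train_speed_like(10)
late = train_speed_like(200)
print(responds(*early, bg_intact=False), responds(*late, bg_intact=False))
```

In this sketch, removing the BG route after brief training abolishes the response, whereas removing it after extended training does not, because the cortico-cortical weight alone then exceeds the threshold. This mirrors, in a very reduced form, the claim that the BG is needed to train but not to express the automatic response.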

While the Hikosaka et al. (2000) and SPEED models can account for how behavior becomes automatic, they cannot account for how automatic responses are overridden by goal-directed behavior when needed (e.g., when the stimulus-response associations change). Chersi et al. (2013) proposed a computational model of automaticity in instrumental conditioning that can account for the change back to goal-directed behavior when needed. The Chersi et al. (2013) model includes the prefrontal cortex (for goal representation), the motor cortex (for action representation), the sensory cortex (for stimulus representation), the BG (for action selection), and the thalamus (to relay information between the BG and the motor cortex). Two sets of connections are plastic: (1) connections from the prefrontal cortex to the motor cortex (to learn goal-response associations) and (2) connections from the sensory cortex to the striatum (to learn stimulus-response associations). According to this model, the stimulus activates the sensory cortex, which in turn activates a goal in prefrontal cortex and action representations in the striatum. For automatic behavior, the striatal activation propagates through both the direct and indirect pathways (see **Figure 1**) of the BG and an action is selected by inhibiting all but one action at the output level (SNr, but it is functionally equivalent to the GPi shown in **Figure 1**). The action that is not inhibited activates the appropriate response in motor cortex (through the thalamus). For goal-directed behaviors, the prefrontal activation propagates to the appropriate action in motor cortex. When an automatic action needs to be overridden by a goal-directed behavior, the prefrontal cortex sends activation to the STN, which hyperpolarizes the SNr and prevents the BG from controlling the motor response (Chersi et al., 2013). The model has been successfully used to account for the development of automaticity in an instrumental conditioning task and the reversal of stimulus-response associations after automaticity had developed (Chersi et al., 2013).

#### *Synthesis*

The Nakahara et al. (2001) and the Chersi et al. (2013) models both assign the role of producing automatic behavior to the BG. However, this "classic" role of the BG in automaticity is difficult to reconcile with the Desmurget and Turner (2010) data. As an alternative, SPEED (Ashby et al., 2007) also assigns an important role to the BG in automaticity, but this role is restricted to training automatic cortico-cortical projections that can account for automaticity. Simply put, the BG is required to learn automatic behaviors, but the BG is no longer required to produce automatic behaviors once the cortico-cortical connectivity is sufficiently strong. The SPEED model can account for the Desmurget and Turner (2010) data, but it includes only the direct pathway of one loop through the BG. In contrast, the Nakahara et al. (2001) model includes two loops through the BG (only the direct pathways) and the Chersi et al. (2013) model includes both the direct and indirect pathways, but only one loop through the BG (similar to SPEED). In addition to being the most biologically detailed, the Chersi et al. (2013) model is the only reviewed model that can override automatic behavior using goal-directed behavior. This is an important function that was not accounted for by the previous models. However, like the Nakahara et al. (2001) model, the Chersi et al. (2013) model cannot account for the Desmurget and Turner (2010) data. To summarize, each one of these models was designed to account for a different aspect of automatic behavior, and successfully accounts for the aspect of automaticity for which it was designed. The next step is to explore how each one of these candidate models can account for the missing function/data that was the focus of the other candidate models.

## **MOTOR FUNCTION**

This section describes motor functions that have been attributed to the BG and that have been simulated using CCN models. Hence, computational models that focus only on modeling biological data or motor function (e.g., kinematics) were not included. Similar to the section reviewing cognitive functions above, the model descriptions are conceptual, in that implementation details and formalities are not discussed. The reader is referred to the cited original papers for details and equations. **Table 1** summarizes the reviewed models along with their respective components.

#### **REACHING**

The BG has been implicated in reaching movements for many years (for a review, see Bischoff, 1998). Not surprisingly, PD patients show unmistakable changes in reaching movements, which can be used for diagnostic purposes (Brown and Jahanshahi, 1996). Simple reaching movements in PD patients show longer reaction times and movement times than normal controls. This reduced movement speed seen in PD reaching is called bradykinesia. From a physiological perspective, a typical reaching movement under normal conditions consists of a sequence of agonist-antagonist bursts. In contrast, a PD reaching movement is generally multi-staged and involves multiple agonist bursts. Furthermore, PD reaching movements have greater variability of hand position for larger movements (Sheridan and Flowers, 1990). PD patients also show impairment in sequential movements (Weiss et al., 1997). For example, during movements aimed at reaching a glass of water, PD patients exhibit an inordinately long pause between the reaching and retrieval of the glass.

#### *Models*

Several computational models relating dopamine deficiencies to impaired reaching movements have been proposed. One of the first models of PD reaching movements is the model of Bischoff (1998). The Bischoff (1998) model includes the prefrontal cortex (for working memory/learning), the SMA (for sequence generation), the pre-SMA (for sequence preparation), the motor cortex (for movement parameters), the thalamus (to filter information from the BG to cortex), and the BG. The BG model assigns complementary roles to the direct and indirect pathways (see **Figure 1**). According to Bischoff (1998), the role of the direct pathway is to inform the motor cortex of the movement's next sensory state, while the role of the indirect pathway is to inhibit competing movements. The function of dopamine is to keep the balance between the two pathways, which is impaired in PD. The Bischoff (1998) model was used to simulate the reciprocal aiming task, a task where subjects are asked to alternately tap a stylus between two targets as quickly as possible. Reducing the dopamine levels in the simulation reproduced bradykinesia and the exaggerated pauses in sequential movements observed in PD.

Magdoom et al. (2011) also proposed a model of the role of the BG in PD reaching movements. The model is cast in the framework of reinforcement learning and focuses on interactions between the motor cortex and the BG. The Magdoom et al. (2011) model uses the classical interpretations of BG pathways according to which the direct pathway facilitates movement (i.e., the "Go" pathway), while the indirect pathway inhibits movements (i.e., the "NoGo" pathway). Switching between the two pathways is thought to be driven by striatal dopamine levels. However, Magdoom et al. (2011) also deviate from the classical "Go"/"NoGo" model of the BG by adding an intermediate regime called the *explore* regime. The explore regime is used to control the stochasticity of action selection when the gradient is absent or too weak to allow for reinforcement learning. The indirect pathway is proposed to be the substrate supporting the explore regime. Simulations show that under dopamine-deficient conditions of PD, the model spends less time in the "Go" regime while spending more time in the remaining two regimes. These regime changes were used to account for a variety of features of impaired reaching movements in PD including movement undershoot (Van Gemmert et al., 2003), bradykinesia, and increased path variability (Sheridan and Flowers, 1990).
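
The three-regime idea can be illustrated with a toy sketch. It is not the Magdoom et al. (2011) implementation: the thresholds, the function names, the scalar `delta` standing in for the dopamine signal, and the modeling of dopamine deficiency as a ceiling on `delta` are all assumptions made for illustration.

```python
import random

# Hypothetical thresholds separating the three regimes (illustrative values).
DA_HIGH = 0.1
DA_LOW = -0.1


def select_regime(delta, da_high=DA_HIGH, da_low=DA_LOW):
    """Map a scalar dopamine-like signal to a regime name."""
    if delta > da_high:
        return "go"
    if delta < da_low:
        return "nogo"
    return "explore"


def step(prev_move, delta, rng, noise=1.0):
    """Return the next movement increment given the previous one."""
    regime = select_regime(delta)
    if regime == "go":
        return prev_move  # keep moving in the direction that improved value
    if regime == "nogo":
        return -prev_move  # reverse the move that reduced value
    return rng.uniform(-noise, noise)  # weak signal: try a random move


def dopamine_deficient(delta, ceiling=0.08):
    """Assumed PD manipulation: the signal cannot rise above a ceiling."""
    return min(delta, ceiling)


def regime_counts(deltas):
    """Count how often each regime is selected for a sequence of signals."""
    counts = {"go": 0, "explore": 0, "nogo": 0}
    for d in deltas:
        counts[select_regime(d)] += 1
    return counts


if __name__ == "__main__":
    deltas = [0.3, 0.15, 0.05, -0.02, -0.2]
    print("intact:   ", regime_counts(deltas))
    print("deficient:", regime_counts([dopamine_deficient(d) for d in deltas]))
    print("explore move:", step(1.0, 0.0, random.Random(0)))
```

With the ceiling in place, signals that would have selected the "Go" regime fall into the explore band instead, so the sketch spends less time in "Go". This is only a qualitative analogue of the regime shift described above.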

#### *Synthesis*

Two models that highlight the role of the BG in reaching were reviewed. The Bischoff (1998) model includes, in addition to the BG, cortical areas such as prefrontal areas, the SMA and the pre-SMA. The model captures bradykinesia and abnormal pauses in sequential movements under Parkinsonian conditions. The reaching model of Magdoom et al. (2011) also incorporates the BG and the motor cortex. However, the Magdoom et al. (2011) model is cast in the framework of reinforcement learning, whereas there is no learning in the Bischoff (1998) model. This focus on learning makes the Magdoom et al. (2011) model consistent with the proposed role of the BG in error correction (Lawrence, 2000). As a result, the Magdoom et al. (2011) model is more general and is consistent with the view that a wide range of BG functions can be explained within the framework of reinforcement learning (Chakravarthy et al., 2010). The compatibility of the Magdoom et al. (2011) model with other CCN models of BG function may facilitate integration to achieve a more complete model of BG function.

#### **HANDWRITING**

Handwriting is a fine motor skill. PD patients often exhibit an impaired form of handwriting, known as micrographia, characterized by reduced letter size, a "kinky" handwriting contour, and abnormal fluctuations in velocity and acceleration (Teulings and Stelmach, 1991; Van Gemmert et al., 1999; Gangadhar et al., 2009). As a result, handwriting features like stroke size, peak acceleration, and stroke duration have been attributed to the BG and used for diagnosis of PD (Phillips et al., 1991; Van Gemmert et al., 2003).

#### *Models*

Although models of PD handwriting are scarce, extensive work has been done on modeling handwriting generation. One of the earliest insights into modeling handwriting consisted of performing a Fourier-like resolution of handwriting into oscillatory components (Hollerbach, 1981). This notion has led to the development of oscillatory or spiking neural network models of handwriting generation that can be trained to produce single characters (Schomaker, 1991; Kalveram, 1998). However, the models of Schomaker (1991) and Kalveram (1998) suffered from the absence of a plausible procedure for initializing the phases of neural oscillators, a difficulty that was solved in an oscillatory neural network model of handwriting generation proposed by Gangadhar et al. (2007).
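
The oscillatory view can be made concrete with a minimal sketch in the spirit of Hollerbach (1981), in which pen velocity is written as a horizontal and a vertical sinusoid plus a constant rightward drift. The parameter names and values, the forward-Euler integration, and the helper `stroke_height` are assumptions for illustration and are not taken from any of the cited models.

```python
import math


def oscillatory_trace(a=1.0, b=1.0, c=0.5, omega=2 * math.pi,
                      phase=math.pi / 2, duration=2.0, dt=0.001):
    """Integrate vx = a*sin(omega*t + phase) + c and vy = b*sin(omega*t).

    Returns the list of (x, y) pen positions, starting at the origin.
    """
    x = y = 0.0
    points = [(x, y)]
    steps = int(round(duration / dt))
    for i in range(steps):
        t = i * dt
        x += (a * math.sin(omega * t + phase) + c) * dt  # oscillation plus drift
        y += b * math.sin(omega * t) * dt                # vertical oscillation
        points.append((x, y))
    return points


def stroke_height(points):
    """Vertical extent of the trace, a simple proxy for letter size."""
    ys = [p[1] for p in points]
    return max(ys) - min(ys)


if __name__ == "__main__":
    normal = oscillatory_trace()
    small = oscillatory_trace(b=0.5)
    print("height with b=1.0:", round(stroke_height(normal), 4))
    print("height with b=0.5:", round(stroke_height(small), 4))
```

For these velocity equations the vertical extent is approximately `2 * b / omega`, so the stroke height scales linearly with the vertical amplitude `b`, while the drift term `c` sets how far the pen advances per cycle.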

While the above-described models did not explicitly include the BG, Gangadhar et al. (2008) combined the Gangadhar et al. (2007) handwriting generation model with a model of the BG. Similar to handwriting patterns observed in PD patients, the Gangadhar et al. (2008) model exhibits micrographia under conditions of reduced dopamine. Another significant feature of the model is the role of the dynamics of the STN and the GPe, which are connected as a loop to produce complex oscillations. Under pathological conditions, the oscillations of the STN and the GPe in the model are highly correlated, resembling the correlated neural firing from STN and GPe neurons under dopamine-deficient conditions observed in real electrophysiology experiments (Bergman et al., 1994; Brown et al., 2001). Under the influence of correlated oscillations of STN and GPe, the Gangadhar et al. (2008) model produces handwriting with large fluctuations in velocity in addition to diminutive letter size.

As another example, Contreras-Vidal and Stelmach (1995) attached a BG model to the VITE-WRITE model (Bullock et al., 1993) to simulate PD handwriting. The Contreras-Vidal and Stelmach (1995) model includes the direct, indirect, and hyperdirect pathways of the BG (see **Figure 1**), the SMA, and other motor and premotor areas. The role of the SMA is to read-in the next motor subprogram from the movement plan, while the role of the other motor and premotor areas is to produce the movement selected by the SMA. The role of the BG is to modulate the dynamics of the formation of movement trajectories (produced by VITE-WRITE). Reducing dopamine in the model to simulate PD results in reduced letter size, as observed in PD patients.

#### *Synthesis*

Two models of PD handwriting were reviewed above. The model of Contreras-Vidal and Stelmach (1995) combines the VITE-WRITE model (Bullock et al., 1993) with a BG model. The essence of the model consists of showing that dopamine reduction in PD causes an imbalance in the outputs of the direct and indirect pathways. Although constructed out of considerably different modeling components, the model of Gangadhar et al. (2008) also shows an imbalance in the activations of the direct and indirect pathways under simulated PD conditions, which causes a reduction in letter size. In addition, Gangadhar et al. (2008) account for the oscillations in STN-GPe interaction. Loss of complexity in these oscillations under PD conditions was linked to higher velocity fluctuations and a distorted handwriting contour in PD handwriting. To summarize, the indirect pathway appears to be critical in accounting for handwriting.

#### **EYE SACCADES**

Eye saccades are rapid, darting eye movements that shift the fovea to points of interest in the visual scene. There is an extensive cortical and subcortical network that is responsible for saccade generation, and the BG play a key role among the subcortical substrates for saccade generation (Hikosaka et al., 2000). The influence of BG on saccades is propagated via the superior colliculus, a midbrain nucleus with a central role in saccade generation (not shown in **Figure 1**). Studies on Parkinsonian monkeys prepared by MPTP (a neurotoxin used to destroy dopaminergic neurons) infusion have observed prolonged saccades, longer reaction times, smaller peak velocities, and smaller amplitudes (Kato et al., 1995). Smaller peak velocities and smaller amplitudes in PD saccades may be comparable to bradykinesia and hypometria found in PD reaching movements. Similarly, analogous to PD tremor in extremities, some PD patients exhibit square-wave jerks in visually guided saccades (Rascol et al., 1991).

#### *Models*

Computational modeling literature that specifically focuses on the role of BG in saccade generation is rather limited. Dominey and Arbib (1992) proposed a model of the role of the BG in sequential saccade generation. Their model includes a number of relevant neural substrates such as the superior colliculus, thalamus, frontal eye fields, and the BG. In the Dominey and Arbib (1992) model, the BG is used as an indirect link between the frontal eye field and the superior colliculus, and its main function is to prevent saccades while a target stimulus is foveated. As such, only the direct pathway through the BG is modeled. The Dominey and Arbib (1992) model has been used to simulate simple saccade data, memory saccade data, double saccade data, compensatory saccade data, and lesion data.

Two decades later, Krishnan et al. (2011) proposed a model of the role of the BG in saccade generation using the principle of reinforcement learning. Similar to their model of PD reaching movement (Magdoom et al., 2011), the indirect pathway serves as an explorer that drives the saccades toward more rewarding targets. The model was able to successfully simulate two forms of visual search, namely feature search and conjunction search, a sequential saccade task, and a directional saccade task. When PD-related changes were incorporated in the model by reducing BG output, the model exhibited increased search times (Krishnan et al., 2011).

#### *Synthesis*

Two models of the role of the BG in saccade generation were reviewed above (Dominey and Arbib, 1992; Krishnan et al., 2011). Both models can account for a range of saccade data in normal and lesioned/pathological conditions. The anatomical components incorporated by the two models are also quite similar. However, there are two main distinguishing features between the two models. One of these features is anatomical: the Dominey and Arbib (1992) model does not include the indirect pathway in the BG, whereas the indirect pathway plays a key role in the Krishnan et al. (2011) model. The second feature is functional: the Dominey and Arbib (1992) model does not involve learning, while the Krishnan et al. (2011) model is based on reinforcement learning. These key differences make the Krishnan et al. (2011) model more biologically and functionally detailed.

## **GENERAL DISCUSSION**

This article presented a review of CCN models of cognitive and motor functions. The 19 reviewed models were organized to highlight BG functionality and classified according to six cognitive functions (i.e., categorization, instrumental conditioning, probabilistic learning, working memory, sequence learning, and automaticity) and three motor functions (i.e., reaching, handwriting, and eye saccades). On the one hand, some of the reviewed models are standalone models of specific functions of the BG, e.g., the reaching model of Bischoff (1998). On the other hand, there are models that are specific instances of a more general learning framework applied to BG function. COVIS (Ashby et al., 1998, 2007; Ashby and Crossley, 2011) and the models of Chakravarthy and colleagues (Krishnan et al., 2011; Magdoom et al., 2011) belong to the second category. For example, the models of Krishnan et al. (2011) and Magdoom et al. (2011) both used a nearly identical reinforcement learning-based approach to account for the specific motor functions of saccade generation and reaching, respectively. A review article by Chakravarthy et al. (2010) proposes that an expanded framework based on reinforcement learning, adapted to BG anatomy and physiology, can be used to explain a wide variety of BG functions. Such a proposal needs more extensive modeling and experimental investigation for further confirmation. Interestingly, however, the CCN models that accounted for more than one function accounted either for several cognitive functions or for several motor functions. None of the reviewed CCN models could account for at least one motor and one cognitive function simultaneously. This may be a serious limitation as behavioral experiments are beginning to reveal important interactions between motor and cognitive processes. Below, we discuss how cognitive processes might impact motor function, and point to novel directions for computational modeling studies.

#### **INTERACTION OF MOTOR AND COGNITIVE PROCESSES**

While none of the models included simultaneously accounted for cognitive and motor functions, they all had a cognitive and a motor component. For example, the Ashby and Crossley (2011) model made a cognitive decision, but it also included premotor areas associated with the response. It just did not include a detailed model of the motor response (e.g., how the left button is pressed). Likewise, the Gangadhar et al. (2007) model has to include a cognitive component specifying what character is to be drawn. However, the focus is on how the movement is produced. Perhaps the model that comes closest to integrating motor and cognitive functions is the model of Guthrie et al. (2013). In this model, both a cognitive and a motor decision are taken, and the interaction between these decisions is accounted for. However, this model does not include a detailed simulation of how the movement is produced. Therefore, it was only discussed in the context of cognitive function.

One way to explore how cognitive and motor functions interact is to explore disease states. For example, akinesia and bradykinesia in PD are arguably associated with BG (and corticostriatal circuits) dysfunction, while tremor is perhaps associated with cerebellar, thalamic, and STN abnormalities (Kassubek et al., 2002; Probst-Cousin et al., 2003; Weinberger et al., 2009; Zaidel et al., 2009; Mure et al., 2011). For example, Schillaci et al. (2011) found that PD patients with akinesia and rigidity as the predominant symptoms have significantly more widespread dopamine loss in the striatum than PD patients with tremor as the predominant symptom. Because these different brain areas (e.g., striatum, cerebellum) are also involved in different cognitive functions, it is reasonable to hypothesize that different PD motor symptoms may be associated with different cognitive impairments. Accordingly, Jankovic et al. (1990) found that PD patients with predominant tremor are less cognitively impaired than patients with bradykinesia. Below we explore some specific PD motor symptoms and their relation to cognition.

#### *Akinesia*

Experimental studies have shown that PD patients with severe akinesia are generally more cognitively impaired than PD patients with predominant tremor (Vakil and Herishanu-Naaman, 1998; Poletti et al., 2011, 2012; Poletti and Bonuccelli, 2013). For instance, PD patients with severe akinesia and rigidity symptoms are more impaired than PD patients with severe resting tremor at working memory tasks (Moustafa et al., 2013). Likewise, studies have shown that PD patients with tremor are usually less cognitively impaired than PD patients with akinesia or gait dysfunction (Burn et al., 2006; Lyros et al., 2008; Oh et al., 2009; Domellof et al., 2011). For example, Vakil and Herishanu-Naaman (1998) found that tremor-dominant PD patients are less impaired at procedural learning than akinesia-dominant PD patients.

Most motor models of the BG and corticostriatal circuit function have been able to explain the occurrence of akinesia and bradykinesia, but not tremor (Obeso et al., 2008). We suggest that motor performance may rely on cognitive processes in two different ways: (a) maintenance of motor plans in working memory while performing a sequence of movements, such as hand/leg movement, grasping, or reaching (Hayhoe et al., 2002; Ohbayashi et al., 2003; Piek et al., 2004; Issen and Knill, 2012); or (b) maintenance of goals in working memory while performing a motor act, such as maintaining the goal of grasping the cup in working memory while moving the hands (Batuev, 1989; McIntyre et al., 1998; Matsumoto et al., 2003). This relationship between cognitive and motor processes could explain why some cognitive training programs are effective at treating motor dysfunction in PD patients (Disbrow et al., 2012). Although this is speculative, computational models are needed to explicitly study the complex relationship between motor and cognitive processes in healthy subjects and PD patients.

#### *Freezing of gait*

Freezing of gait—paroxysmal cessation of motor output—is a common symptom in advanced PD (Hoehn and Yahr stage 2+) (Giladi et al., 1992). Freezing of gait is debilitating since it often leads to falls and, importantly, is not manageable by common psychopharmacological medications (Giladi et al., 1992; Matar et al., 2013).

Research shows that perceptual and cognitive factors play a role in successful locomotion and the occurrence of freezing of gait episodes in PD patients (Lewis and Barker, 2009; Naismith et al., 2010; Matar et al., 2013). For example, providing auditory or visual cues or instructions can often reduce the occurrence of freezing behavior in PD patients (Lewis and Barker, 2009). Other studies found that walking dysfunction in PD is related to difficulty in resolving response interference produced by distractors (Plotnik et al., 2011; Vandenbossche et al., 2011). Locomotive dysfunction in PD is associated with brain volume changes (Kostic et al., 2012; Tessitore et al., 2012) and aberrant neural activity within the prefrontal cortex (Matsui et al., 2005; Shine et al., 2013), suggesting a role for cognitive processes in locomotion.

There are currently no computational models that simulate the role of cognitive processes in the occurrence of freezing of gait in PD patients. Prior computational models of BG-cortex interactions have focused on the simulation of cognitive processes (O'Reilly and Frank, 2006), learning, or simple action selection in static environments (Gurney et al., 2001) without considering how cognitive factors might affect motor actions such as locomotion. Future models should simulate how the cortex represents multiple inputs (including perceptual and cognitive) that feed into the BG, which is important for action selection (e.g., move right, left, forward, etc.). Future models should also be more dynamical in that they should continuously receive and update perceptual input from the environment and produce motor output (step right, left, ...), which then takes the model to a new perceptual input, and so forth.
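
The closed perception-action loop called for above can be sketched in a few lines. This is only a structural illustration, not a model of freezing of gait or of the BG: the grid world, the `ACTIONS` table, the distance-based salience, and the greedy selection rule standing in for BG action selection are all hypothetical.

```python
# Candidate actions and their displacement on a grid (hypothetical layout).
ACTIONS = {"forward": (0, 1), "left": (-1, 0), "right": (1, 0)}


def perceive(position, obstacles):
    """Perceptual input: for each action, is the target cell free?"""
    x, y = position
    return {name: (x + dx, y + dy) not in obstacles
            for name, (dx, dy) in ACTIONS.items()}


def select_action(position, percept, goal):
    """Stand-in for action selection: pick the free action with highest salience.

    Salience combines the perceptual input (free cells only) with a
    cognitive input (the goal), here as negative city-block distance.
    """
    x, y = position
    best_action, best_salience = None, None
    for name, (dx, dy) in ACTIONS.items():
        if not percept[name]:
            continue
        nx, ny = x + dx, y + dy
        salience = -(abs(goal[0] - nx) + abs(goal[1] - ny))
        if best_salience is None or salience > best_salience:
            best_action, best_salience = name, salience
    return best_action


def run(start, goal, obstacles=frozenset(), max_steps=20):
    """Closed loop: perceive, select, act, then perceive from the new position."""
    position, path = start, [start]
    for _ in range(max_steps):
        if position == goal:
            break
        action = select_action(position, perceive(position, obstacles), goal)
        if action is None:
            break  # no action is available, so the agent stops moving
        dx, dy = ACTIONS[action]
        position = (position[0] + dx, position[1] + dy)
        path.append(position)
    return path


if __name__ == "__main__":
    print(run((0, 0), (0, 3)))
    print(run((0, 0), (0, 3), obstacles={(0, 1)}))
```

The point of the sketch is the loop structure: each motor output changes the position, which changes the next perceptual input, so selection is re-evaluated at every step rather than once in a static environment.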

#### **WHAT IS THE ROLE OF THE BASAL GANGLIA IN COGNITIVE AND MOTOR FUNCTION?**

In addition to the current unavailability of CCN models of the BG that can simultaneously account for cognitive and motor function, another limitation of the current state of BG modeling is the absence of consensus on the specific function of the BG in a given task. For example, many CCN models of working memory assign a role for the BG, but some models use the BG as a gating mechanism allowing for thalamo-cortical loops (e.g., Monchi et al., 2000; Ashby et al., 2005), while others use the BG as a gating mechanism for cortico-cortical loops (e.g., Frank et al., 2001) or as the actual maintenance mechanism (Schroll et al., 2012). As with many other cognitive and motor functions, CCN models are critical in pinpointing the specific function of the BG in the cognitive task (e.g., working memory). Computational models can be simulated to identify the consequences of different design choices, and these predictions need to be tested empirically. While models tend to do very well at simulating the function that motivated the model, it is unclear at this point how the model can handle other (different) functions. One way to select useful BG CCN models is to consider generalization capabilities. Towards this end, general integrative frameworks are most useful. For example, the reinforcement learning approach of Chakravarthy and colleagues (Krishnan et al., 2011; Magdoom et al., 2011) or the COVIS-based approach of Ashby and colleagues (Ashby et al., 1998; Apicella, 2007; Ashby and Crossley, 2011) are useful because they have been used to account for functions that were outside of the original scope of the model. Other models of cognitive and motor functions need to be generalized to account for data for which they were not originally designed to build biological "cognitive architectures". Frameworks that are already general should attempt to bridge the gap between CCN models of cognitive function and CCN models of motor function. 
This could be achieved by integrating existing models. For example, a detailed CCN model of motor function could be added to the COVIS framework. Likewise, a detailed CCN model of cognitive function could be added to the reinforcement-learning-based approach of Chakravarthy and colleagues. While more data will help in eliminating some of the candidate CCN BG models, generalization and integration will be required to avoid overfitting the model to the available data.

## **REFERENCES**


for improvement of movement initiation in Parkinson's disease. *Brain Res.* 1452, 151–164. doi: 10.1016/j.brainres.2012.02.073


Zaidel, A., Arkadir, D., Israel, Z., and Bergman, H. (2009). Akineto-rigid vs. tremor syndromes in Parkinsonism. *Curr. Opin. Neurol.* 22, 387–393. doi: 10.1097/wco.0b013e32832d9d67

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 July 2013; accepted: 15 November 2013; published online: 06 December 2013*.

*Citation: Helie S, Chakravarthy S and Moustafa AA (2013) Exploring the cognitive and motor functions of the basal ganglia: an integrative review of computational cognitive neuroscience models. Front. Comput. Neurosci. 7:174. doi: 10.3389/fncom.2013.00174*

*This article was submitted to the journal Frontiers in Computational Neuroscience*.

*Copyright © 2013 Helie, Chakravarthy and Moustafa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Hemispheric differences in the mesostriatal dopaminergic system

## *Ilana Molochnikov and Dana Cohen\**

*The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Rachel Tomer, University of Haifa, Israel*

*Reuben Ruby Shamir, Case Western Reserve University, USA*

#### *\*Correspondence:*

*Dana Cohen, The Gonda Brain Research Center, Bar Ilan University, Building number 901, Ramat Gan 52900, Israel e-mail: danacoh@gmail.com*

The mesostriatal dopaminergic system, which comprises the mesolimbic and the nigrostriatal pathways, plays a major role in neural processing underlying motor and limbic functions. Multiple reports suggest that these processes are influenced by hemispheric differences in striatal dopamine (DA) levels, DA turnover and its receptor activity. Here, we review studies which measured the concentration of DA and its metabolites to examine the relationship between DA imbalance and animal behavior under different conditions. Specifically, we assess evidence in support of endogenous, inter-hemispheric DA imbalance; determine whether the known anatomy provides a suitable substrate for this imbalance; examine the relationship between DA imbalance and animal behavior; and characterize the symmetry of the observed inter-hemispheric laterality in the nigrostriatal and the mesolimbic DA systems. We conclude that many studies provide supporting evidence for the occurrence of experience-dependent endogenous DA imbalance which is controlled by a dedicated regulatory/compensatory mechanism. Additionally, it seems that the link between DA imbalance and animal behavior is better characterized in the nigrostriatal than in the mesolimbic system. Nonetheless, a variety of brain and behavioral manipulations demonstrate that the nigrostriatal system displays symmetrical laterality whereas the mesolimbic system displays asymmetrical laterality which supports hemispheric specialization in rodents. The reciprocity of the relationship between DA imbalance and animal behavior (i.e., the capacity of animal training to alter DA imbalance for prolonged time periods) remains controversial; however, if confirmed, it may provide a valuable non-invasive therapeutic means for treating abnormal DA imbalance.

**Keywords: dopamine, laterality, striatum, side preference, VTA, nucleus accumbens**

## **INTRODUCTION**

There is a general consensus regarding cerebral dominance and hemispheric specialization in the human brain in which the right hemisphere is dominant for analyzing complex visuo-spatial relationships and for the perception and expression of emotions, whereas language is represented in the left hemisphere (Kimura and Archibald, 1974; Gazzaniga, 1995; Davidson and Kenneth, 2004; Fox and Reed, 2008; Gotts et al., 2013). Healthy subjects display asymmetries in the DA system, which in humans have been associated, for example, with lateralized motor performance and hand preference (Bracha et al., 1987; De La Fuente-Fernandez et al., 2000; Mohr and Bracha, 2004). Pathological DA asymmetries occur in neuropsychiatric and neurodegenerative disorders such as schizophrenia, depression, and Parkinson's disease (Peterson et al., 1993; Seibyl et al., 1995; Gruzelier, 1999; Hietala et al., 1999; Van Dyck et al., 2002; Hsiao et al., 2003, 2013). Interhemispheric imbalance in DA concentration and its metabolites has been described in the rodent mesostriatal system and in related structures such as prefrontal cortex (PFC), entorhinal cortex (EC) and the hippocampus (Glick et al., 1982; Schneider et al., 1982; Denenberg and Rosen, 1983; Drew et al., 1986; Fride and Weinstock, 1987; Nowak, 1989; Rodriguez et al., 1994; Becker, 1999; Hietala et al., 1999; Thiel and Schwarting, 2001; Van Dyck et al., 2002; Silva et al., 2007; Vernaleken et al., 2007; Cannon et al., 2009; Martin-Soelch et al., 2011; Hsiao et al., 2013).
This DA imbalance in the mesostriatal system has been associated with a variety of motor and cognitive aspects of behavior among which are spatial performance such as side preference in a T maze or during lever-press (Zimmerberg et al., 1974; Glick et al., 1981; Castellano et al., 1987), locomotor activity (Glick and Ross, 1981; Glick et al., 1988; Cabib et al., 1995; Nielsen et al., 1997; Budilin et al., 2008), difference in sensitivity to intracranial self-stimulation, motivation, responsiveness to stress, and appetitive and aversive stimuli (Glick et al., 1981; Carlson et al., 1991, 1993; Besson and Louilot, 1995; Sullivan and Gratton, 1998; Berridge et al., 1999; Sullivan and Dufresne, 2006; Fox and Reed, 2008; Tomer et al., 2008, 2014; Laplante et al., 2013).

In this review, we address the midbrain DA system functionality by examining the laterality of the mesostriatal system in populations of animals as reflected in DA imbalance across hemispheres. To do so, we specify the type, origin and destination of primarily midbrain DA neurons, review evidence supporting DA related hemispheric laterality as assessed by measuring DA levels and its metabolite dihydroxyphenylacetic acid (DOPAC), examine correlations between DA imbalance and spatial and cognitive biases in animal behavior, and finally, we discuss the ways in which these correlations are altered in response to unilateral and systemic biochemical intervention and behavioral manipulation. We particularly ask whether or not manipulations generate a symmetrical effect across hemispheres (i.e., whether a unilaterally applied manipulation generates a mirror image of the effect generated by applying that manipulation to the other hemisphere). The search for papers was done using relevant key words such as asymmetry, laterality, DA, ventral tegmental area (VTA), substantia nigra (SN), striatum, nucleus accumbens (NAc), PFC, medial forebrain bundle (MFB), 6-hydroxydopamine (6-OHDA), unilateral, left, right, inter hemispheric, mesostriatal, mesolimbic, nigrostriatal, affective, limbic, etc. Papers were selected irrespective of results if reported data comprised left/right information without collapsing them across hemispheres.

## **ANATOMY AND CONNECTIVITY**

The two prime dopaminergic fiber groups, the A9 and A10, are located in the ventral midbrain (Dahlstroem and Fuxe, 1964). The A10 group is the largest group of DAergic cells in the ventral midbrain tegmentum which is located for the most part in the VTA. The A10 group represents about two thirds (∼65%) of VTA cell population. In addition, the VTA comprises a sizable population of GABAergic inhibitory neurons (∼30%) and excitatory glutamatergic neurons (∼5%) (Nauta et al., 1978; Johnson and North, 1992). This DAergic group gives rise to the mesolimbic DA system projecting through the MFB primarily to the ventral striatum (NAc) (Dahlstroem and Fuxe, 1964). The A9 group is the most densely packed group of DAergic cells, which is located in the substantia nigra pars compacta (SNc) and accounts for about 90% of the neuronal population in this relatively homogenous structure, in addition to a small population of GABAergic inhibitory neurons (10%). The A9 group constitutes the nigrostriatal DAergic pathway which projects through the MFB to the dorsal striatum (dStr) composed of the caudate and putamen. Another small group of DAergic cells in rodents and primates is the A8 group, located in the midbrain reticular formation.

The DAergic signal arising from these midbrain structures enables basal ganglia (BG) control of motor planning and action selection (Wurtz and Hikosaka, 1986; Berns and Sejnowski, 1998; Gurney et al., 2001) by two parallel pathways diverging in the striatum and separating the BG loop into the direct, striatonigral pathway, and the indirect, striatopallidal pathway, via medium-sized spiny projection neurons (MSNs) expressing D1 receptors and D2 receptors, respectively (Alexander and Crutcher, 1990; Gerfen et al., 1990). The subthalamic nucleus which is a part of the indirect pathway also receives direct cortical input via the hyper-direct fast pathway that bypasses the striatum and is therefore independent of striatal DA input (Kita, 1992; Nambu et al., 2000, 2002). The balance between the direct and indirect pathways is thought to be essential for proper BG function (Albin et al., 1989; Wichmann and Delong, 1996; Delong and Wichmann, 2007). The MSNs, which are GABAergic neurons, make up the vast majority (90–95%) of striatal neurons in rodents and primates (Kemp and Powell, 1971; Mensah and Deadwyler, 1974; Wilson and Groves, 1980; Graveland and Difiglia, 1985). The MSNs relay information via the direct and indirect pathways to the two major output stations of the BG, the substantia nigra pars reticulata (SNr) and the globus pallidus internal segment (or the entopeduncular nucleus in rodents) which are both primarily GABAergic (Van Der Kooy and Wise, 1980; Oertel and Mugnaini, 1984; Bolam et al., 1985, 1993; Parent and Hazrati, 1995).

#### **CONNECTIVITY WITHIN THE MESOSTRIATAL SYSTEMS**

The DA neurons in the SNc project predominantly to the striatum, which also receives a small percentage of DA inputs from the SNr and VTA (Van Der Kooy and Wise, 1980). Overall, at least 85% of the fibers reaching the striatum from the midbrain arise from the DA mesostriatal neurons. Of the 15% of the non-DA carrying fibers reaching the striatum, about 2% arrive from the VTA, 3–4% from the SNc, and 8–9% arrive from the SNr (Van Der Kooy et al., 1981; Swanson, 1982; Fallon et al., 1983; Gerfen et al., 1987). In turn, neurons located in the striatal patches (Gerfen, 1984, 1985) project back to the DAergic neurons in the SNc (Gerfen et al., 1987). The two types of SNc neurons also project onto the NAc and comprise about 38% of its midbrain input (Swanson, 1982). The NAc projections back to the nigral complex are substantially larger than the nigrostriatal fibers it receives (Nauta et al., 1978).

All three types of VTA neurons project onto the NAc and comprise 40% of its midbrain input (Phillipson, 1979; Swanson, 1982). Of these NAc afferents, 85% are primarily DAergic and the remaining 15% arise from GABA or glutamate releasing neurons (Swanson, 1982; Gerfen et al., 1987; Gonzalez-Hernandez et al., 2004, p. 23; Stuber et al., 2012; Ishikawa et al., 2013). In turn, the NAc sends inhibitory projections back to VTA GABAergic neurons including those that innervated it, thus enabling reciprocal information transfer (Nauta et al., 1978; Phillipson, 1979; Kalivas et al., 1993). In addition, the VTA DAergic neurons are inhibited by GABAergic interneurons located in the VTA (Stuber et al., 2012; Van Zessen et al., 2012).

Approximately 1–3% of the DA fibers reaching the striatum come from the contralateral SNc, the VTA and the posterior lateral hypothalamic area (Veening et al., 1980; Gerfen et al., 1982; Swanson, 1982; Altar et al., 1983; Fallon et al., 1983; Consolazione et al., 1985; Lieu and Subramanian, 2012). Less than 10% of the VTA efferents cross the midline to the contralateral side (Swanson, 1982) where they give rise to about 8% of the fibers reaching the NAc and about 1–3% of the fibers reaching the dStr (Carter and Fibiger, 1977; Altar et al., 1983; Douglas et al., 1987). This inter-hemispheric communication does not pass through the corpus callosum (Fass and Butcher, 1981) and instead decussates before the fibers reach the lateral hypothalamic area, close to the VTA (Altar et al., 1983; Douglas et al., 1987). It has been shown that VTA and SN neurons project either ipsilaterally or contralaterally but not bilaterally (Loughlin and Fallon, 1982).

#### **CONNECTIVITY BETWEEN THE MESOSTRIATAL SYSTEMS AND RELATED STRUCTURES**

The cortical targets of the SN include the PFC, the motor cortex and the EC (Loughlin and Fallon, 1984). All major regions of the cerebral cortex communicate with the striatum bilaterally, albeit with ipsilateral predominance (Veening et al., 1980; Fisher et al., 1986; McGeorge and Faull, 1989; Brog et al., 1993; Totterdell and Meredith, 1997; Alloway et al., 2006; Lieu and Subramanian, 2012). Importantly, a single SN neuron sends collaterals to limbic, striatal, and cortical structures, unlike VTA neurons that can project solely to one terminal field (Loughlin and Fallon, 1984).

Projection neurons in the VTA arrive at the PFC and synapse on pyramidal neurons that project to the contralateral PFC (Carr and Sesack, 2000; Hnasko et al., 2012; Stuber et al., 2012) suggesting that the PFC is ideally suited to modulate inter-hemispheric information flow between itself and the BG. Additionally, the PFC sends excitatory glutamatergic projections onto VTA GABAergic neurons which innervate the NAc, and also onto VTA DAergic neurons which project back to the PFC (Carr and Sesack, 2000).

The VTA neurons also project to the EC of which 46% of the projections are DAergic. The EC in turn sends glutamatergic projections to the striatum, mainly to the NAc core (Krayniak et al., 1981; Finch et al., 1995; Moser et al., 2010). Similar to the VTA-NAc connectivity, the majority of the fibers (92%) arrive at the ipsilateral EC whereas the remaining fibers reach the contralateral EC (Swanson, 1982). In addition to the bidirectional communication between the VTA and the NAc, and the VTA-EC-NAc circuit, these structures are also part of a complex loop engaging the hippocampus. Specifically, DAergic neurons in the VTA project to the hippocampus which through the subiculum, NAc and the ventral pallidum releases DA neurons in the VTA from tonic inhibitory ventral pallidum influence (Lisman and Grace, 2005). In addition, the glutamatergic neurons in the VTA project back to the ventral pallidum (Hnasko et al., 2012).

Observation of the mesostriatal and mesolimbic DA pathways reveals a symmetrical connectivity with a primary influence on the ipsilateral side and a minor influence on the contralateral side (illustrated in **Figure 1**). Symmetrical fiber connectivity entails that unilateral manipulations of the DA system will affect mainly the manipulated side and that the outcome of left vs. right manipulations will be symmetrical as long as physiological factors are not taken into account. Inter-hemispheric communication allowing bilateral influence of unilateral manipulation is supported primarily by the PFC without breaking the symmetry. It is noteworthy that information is lacking regarding fiber distribution differences in the left vs. right hemispheres. If fiber distribution differences exist, they may yield an asymmetrical, bilateral response to a variety of unilateral manipulations.

#### **ENDOGENOUS ASYMMETRY IN THE DAergic SYSTEM**

Endogenous differences in the DA system of population of animals were observed in adulthood by many groups. These differences were expressed in DA level and metabolites, D1 and D2 receptor concentration, binding potential of the receptors and DA transporters measured in the neocortex, striatum and NAc. However, reports by different laboratories do not always concur in occurrence site, magnitude and the directionality of the observed imbalance (Glick et al., 1982; Schneider et al., 1982; Denenberg and Rosen, 1983; Drew et al., 1986; Fride and Weinstock, 1987; Nowak, 1989; Rodriguez et al., 1994; Giardino, 1996; Becker, 1999; Thiel and Schwarting, 2001; Van Dyck et al., 2002; Silva et al., 2007; Vernaleken et al., 2007; Cannon et al., 2009; Martin-Soelch et al., 2011; Hsiao et al., 2013). Examination of the results and experimental conditions reveals, in addition to methodology of measurement, differences in animals' age, gender ratio, and in the species and strains of animals used. A clue for reconciling this discrepancy may be found in reports describing an inter-hemispheric regulatory/compensatory process which acts differentially in both hemispheres to restore DAergic balance. Such a process has been suggested to be active during development (Rodriguez et al., 1994; Giardino, 1996; Frohna et al., 1997; Vernaleken et al., 2007) and following tissue damage (Altar et al., 1983; Pritzel et al., 1983; Carlson et al., 1996; Roedter et al., 2001; Blesa et al., 2011). This regulatory/compensatory process which is time-dependent utilizes mechanisms of action such as controlling the ratio of D1/D2 receptors (Joyce, 1991a,b), changing the number of synaptic terminals (Jaeger et al., 1983; Lawler et al., 1995; Meshul et al., 1999; Gittis et al., 2011; Golden et al., 2013), and enhancing DA turnover and release from the available fibers (Ikegami et al., 2006). If the influence of this regulatory process on the expression of DA imbalance is experience dependent the discrepancy across laboratories could be reconciled because the variance in animals' experience is likely to be smaller within a laboratory than across laboratories. Hence, observation of hemispheric chemical imbalance may be obscured by the animal's unique experience and therefore studying DA imbalance in relation to animal behavior, preferably during a well-defined task, may minimize the influence of experience and could potentially reveal a robust effect that is consistent across laboratories.

*Figure 1 (caption fragment): … (solid lines) and contralateral (dashed lines) projections of fibers containing DA (blue), GABA (red), or glutamate (green). Abbreviations: PFC, Prefrontal Cortex; EP, Entopeduncular nucleus; VP, Ventral Pallidum; SNc, Substantia Nigra Pars Compacta; SNr, Substantia Nigra Pars Reticulata; VTA, Ventral Tegmental Area.*

## **BEHAVIORAL CORRELATES OF ENDOGENOUS DA ASYMMETRY**

The degree of right hand preference in humans positively correlates with left putamen dominance, whereas right caudate dominance positively correlates with the level of performance during simultaneous bimanual movements in right-handed healthy subjects (De La Fuente-Fernandez et al., 2000). Rodents also display paw preference yet this preference is correlated with higher DA and DOPAC concentrations in the NAc ipsilateral to the preferred limb (Cabib et al., 1995; Budilin et al., 2008). Additionally, lateralization of DOPAC/DA ratios in favor of the right ventral striatum was positively related to right-side thigmotaxis (Thiel and Schwarting, 2001). Generally, however, spatial preference in rodents appears to be the manifestation of difference in activity occurring between the nigrostriatal systems. Specifically, rats forced to select a side (left or right) in a T maze displayed significantly higher DA content in the striatum contralateral to the preferred side than in the ipsilateral striatum (Zimmerberg et al., 1974). Similarly, the preferred direction of rotation was directly linked to the asymmetry in baseline DA (see **Figure 2**): the preferred direction of rotation was contralateral to the striatum with higher DA (Glick et al., 1988). Further support for this observation comes from intracranial self-stimulation experiments showing that the side having a lower threshold for stimulation was contralateral to the direction of spontaneous rotations (Glick et al., 1981). Opposite to DA levels, lower DOPAC values were observed in the striatum contralateral to the preferred direction of rotation (Glick et al., 1977, 1988). However, haloperidol, which decreases the activation of DA receptors, reversed the relation between DOPAC concentration and direction of rotation without changing the animal's preferred direction (Jerussi and Taylor, 1982), suggesting that DOPAC concentration by itself may be a less reliable predictor of rotation direction. The preferred direction of rotation and paw preference in rodents are uncorrelated, suggesting that they reflect two distinct processes (Nielsen et al., 1997) each correlated with DA asymmetry in a different striatal sub-region. This distinction in rodents contrasts with data showing a relation between human handedness and preferred direction of rotation; right-handers preferred left-sided turning and non-right-handers preferred right-sided turning (Mohr et al., 2003).

Other brain areas were also linked with spatial preference in rodents. For example, in the two-lever operant conditioning task rats showed a right lever bias that was correlated with enhanced activity in the left frontal cortex (Glick and Ross, 1981). Additionally, enhanced activity in the left PFC was observed during right rotation preference. These findings are consistent with reports of PFC-striatum interactions. Specifically, phasic activation of the PFC has been shown to increase DA release in the ipsilateral NAc (Taber and Fibiger, 1995) probably via glutamate-induced activation of DA neurons in the VTA (Karreman and Moghaddam, 1996).

As opposed to the motor and sensory modalities displaying characteristic laterality, limbic entities such as stressors and natural rewards (food, liquid, or pleasure) lack laterality. Nonetheless, several studies suggest that endogenous DA asymmetry can be correlated with affective behaviors as well. For example, human baseline asymmetry in D2 receptor availability in the left relative to the right striatum was associated with greater positive incentive motivation (Tomer et al., 2008; Martin-Soelch et al., 2011). Moreover, human subjects with higher D2 binding in the left hemisphere were sensitive to learning from positive reinforcement whereas those with higher D2 binding in the right hemisphere were sensitive to learning from negative reinforcement (Tomer et al., 2014). Additionally, right-biased rats were significantly more active and had a stronger side preference than left-biased rats (Glick and Ross, 1981) which also may suggest attention/motivation influence.

## **THE INFLUENCE OF BRAIN MANIPULATION ON ANIMAL BEHAVIOR**

In the previous section we reviewed evidence showing correlations between endogenous DA imbalance and natural behavior. In the following section we examine how systemic and unilateral artificial alteration of the inter-hemispheric DA imbalance influences animals' behavior.

#### **MANIPULATIONS IN THE NIGROSTRIATAL SYSTEM**

#### *Systemic biochemical manipulations*

Systemic application of d-amphetamine, which enhances DA concentration at the DAergic terminals, was found to enhance the rate of rotations in the direction preferred prior to the injection (Jerussi and Glick, 1976), as well as the inter-hemispheric difference in striatal DA content in a dose-dependent manner (Glick et al., 1981). Moreover, d-amphetamine application further lowered the threshold for MFB activation and consequently further biased the side preference of self-stimulation (Glick et al., 1981). Apomorphine, which is a non-selective DA agonist, induced rotational behavior in rats similar to that induced by d-amphetamine. The induced rotation increased with apomorphine dosage until reaching a plateau (Jerussi and Glick, 1975). Conversely, apomorphine administration significantly decreased the number of times rats chose their preferred arm in a T-Maze but did not alter the side preference observed prior to drug application (Castellano et al., 1987) which may indicate drug spread into limbic regions. Inhibition of catecholamine synthesis by alpha-methyl-para-tyrosine (AMPT) markedly reduced or completely abolished d-amphetamine induced rotation (Jerussi and Glick, 1976), but did not affect rotation elicited by apomorphine (Jerussi and Glick, 1975), suggesting that these two drugs operate differently. Haloperidol prevented the rotation elicited by both d-amphetamine and apomorphine (Jerussi and Glick, 1976), indicating its robust influence.

#### *Local biochemical manipulations*

A more selective way to induce DA imbalance in the nigrostriatal pathway is by unilaterally injecting DA agonists and antagonists into the dStr or the SN. **Figure 2** summarizes the results obtained following unilateral manipulations. Local injection of DA into the SNc induced weak ipsiversive or mixed ipsiversive and contraversive rotation probably due to DA autoreceptors which locally inhibit DA neurons (Jang et al., 2011). Injection of DA into the SNr only induced contraversive circling (Kelly et al., 1984). Local injection of apomorphine to the SNr or SNc had a mild influence on the tendency of the rats to rotate (Kelly et al., 1984), suggesting activation of other neuronal mechanisms that counteract the influence of apomorphine on DAergic neurons. Unilateral injections of apomorphine into the dStr induced contraversive turning as did injections of d-amphetamine, NMDA (Ossowska and Wolfarth, 1995), and atropine which is a muscarinic acetylcholine receptor antagonist (Jerussi and Glick, 1976).

GABA-related drugs and GABA antagonists applied intranigrally also facilitated rotational behavior; GABA, GABAA agonist (e.g., muscimol) and GABAB agonists (e.g., γ-hydroxybutyric acid and baclofen) injected unilaterally into the SNr of rats elicited contraversive turning by disinhibiting the SNr influence on the striatum and enhancing DA activation ipsilaterally to the injection site, whereas unilateral injections of GABAA antagonist (e.g., bicuculline) produced ipsiversive turning by enhancing SNr inhibition of the striatum (Olpe et al., 1977; Scheel-Kruger et al., 1977).

Unilateral optical stimulation of MSNs in the direct or the indirect striatal pathways concurred with previously described manipulations which unilaterally enhanced or attenuated DA level; direct pathway activation led to contraversive rotation, whereas indirect pathway activation yielded ipsiversive rotation (Kravitz et al., 2010). Such a selective activation of the two pathways validates the influence of DA hemispheric imbalance on preferred rotation direction in the absence of a compensatory mechanism influence.

#### *Unilateral lesions using 6-hydroxydopamine*

6-hydroxydopamine (6-OHDA) is a neurotoxic compound that selectively destroys catecholamine neurons by penetrating their membrane via the DA or noradrenaline reuptake transporters (Simola et al., 2007). The 6-OHDA injection site determines the extent of damage caused by the neurotoxin and its specificity. Unilateral injection in the MFB has been shown to lesion both the uncrossed and crossed projections of the A9 and A10 cell groups converging on the ipsilateral dStr, and produces extensive catecholamine neuron destruction of about 97% of the cells primarily in the ipsilateral SNc and VTA. The contralateral SNc and VTA are less affected by the neurotoxin injection (Altar et al., 1983) because only a small percentage of their efferents (1% and 8% in the SNc and VTA, respectively) pass through the injection site (Iwamoto et al., 1976; Altar et al., 1983; Torres and Dunnett, 2012).

A more specific lesion can be produced by injecting the neurotoxin into the SNc, which leads to the destruction of only the nigrostriatal (A9) pathway, and more focal damage can be created by lesioning sub-regions of the dStr complex, which induces cell death of 50–99% of the SNc neurons depending on the affected dStr volume (reviewed in Deumens et al., 2002). These three procedures serve as classic animal models for Parkinson's disease (Ungerstedt and Arbuthnott, 1970; Simola et al., 2007). All of these lesions produce a biochemical imbalance in which lower DA levels are measured in the ipsilateral-to-lesion dStr than in the contralateral dStr. This biochemical imbalance is correlated with behavioral alterations such as the development of motor slowness and a turning preference bias toward the side of the lesion (Roedter et al., 2001; Simola et al., 2007). Specifically, 6-OHDA lesions of either the SNc, MFB, or the dStr induce ipsiversive-to-lesion rotation (Jerussi and Glick, 1976; Glick et al., 1977; Carman et al., 1991; Annett et al., 1992; Deumens et al., 2002). The lesion-induced turning behavior ceases within the first postoperative week (Pritzel et al., 1983) thus setting an upper bound on the time required for the compensatory processes to become evident.

It is interesting to note that unilateral optical activation of MSNs in the direct/indirect pathway induced contraversive/ipsiversive rotations similar to that induced by 6-OHDA lesions. This finding suggests that alterations in DA concentrations influence the balance in activation of the direct vs. indirect pathways; reduced DA concentration tips the scale toward activation of the striatopallidal pathway whereas enhanced DA concentration emphasizes the striatonigral pathway.

#### *Unilateral lesions and systemic biochemical manipulations*

Systemic administration of pharmacological agents influenced the direction of turning evoked by the 6-OHDA lesion by enhancing or reducing the lesion-induced biochemical imbalance. d-Amphetamine produced ipsilateral circling behavior whereas haloperidol and pimozide, which decrease the activation of DA receptors, produced contralateral circling behavior in rats with 6-OHDA lesion in the SNc (Iwamoto et al., 1976). Electrolytic lesions of the SNc induced a rotation preference similar to that induced by 6-OHDA (Arbuthnott and Crow, 1971; Iwamoto et al., 1976). Systemic administration of apomorphine generated a more complex outcome depending on the type and extent of the resulting tissue damage; apomorphine yielded contraversive circling following 6-OHDA lesions (Iwamoto et al., 1976; Glick et al., 1977; Meshul and Allen, 2000) and the number of rotations depended on the extent of tissue damage. By contrast, apomorphine led to ipsiversive circling following electrolytic lesion (Iwamoto et al., 1976).

Overall, artificial enhancement of striatal DA imbalance in rodents produces circling behavior in the direction toward the side with depleted DA concentration (see **Figure 2**). The outcome of this category of manipulations which ultimately influence striatal DA processing and transmission is consistent regardless of which component in the circuit has been manipulated and whether the manipulation enhanced or attenuated activity.
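The summary rule above (circling toward the hemisphere with the lower striatal DA concentration) can be written as a minimal sketch. The function name, the arbitrary-unit inputs, and the `tolerance` parameter are illustrative assumptions introduced here and do not come from any of the cited studies; the sketch also ignores the compensatory processes discussed earlier.

```python
def predicted_turning_side(left_da: float, right_da: float,
                           tolerance: float = 0.0) -> str:
    """Predict the circling direction from striatal DA levels.

    Encodes only the qualitative rule stated in the text: the animal
    circles toward the side with the lower (depleted) DA concentration.
    Inputs are in arbitrary units; `tolerance` is a hypothetical
    threshold below which the imbalance is treated as absent.
    """
    if abs(left_da - right_da) <= tolerance:
        return "none"  # no imbalance, so no predicted bias
    return "left" if left_da < right_da else "right"


# A left-side 6-OHDA lesion depletes DA in the left dStr, so the rule
# predicts ipsiversive-to-lesion (leftward) turning.
lesion_left = predicted_turning_side(left_da=0.1, right_da=1.0)
```

Under this encoding a unilateral lesion and a contralateral enhancement of DA give the same prediction, which is the sense in which the outcome is described as independent of the manipulated component.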

#### **MANIPULATIONS IN THE MESOLIMBIC SYSTEM**

Application of 6-OHDA to the mesolimbic system produced characteristic behavioral deficits different in nature from those produced by lesioning the nigrostriatal system. Unilateral injection of 6-OHDA to the right VTA or SN impaired acquisition of operant tasks (Hritcu et al., 2008). Bilateral small 6-OHDA lesions of the VTA produced a significant increase in spontaneous locomotor activity whereas large 6-OHDA lesions of the VTA or the NAc produced hypoactivity in the open field, a complete blockade of the locomotor stimulating effects of d-amphetamine, and a profound supersensitive response to apomorphine expressed as enhanced locomotion (Koob et al., 1981). Interestingly, a radiofrequency-VTA lesion caused a greater increase in spontaneous activity relative to the VTA 6-OHDA lesion, suggesting the presence of a powerful inhibitory influence of the mesolimbic DA system within the VTA (Koob et al., 1981).

#### *Manipulations of structures interacting with the mesolimbic system*

Experiments described thus far support the view that the relationship between the manipulation-induced DA imbalance and the animal's induced rotation direction is symmetric across hemispheres (see **Figure 2**). Experiments directly assessing the existence and function of DA imbalance in the mesolimbic pathway are lacking. However, some of the information can be deduced from experiments addressing the issue indirectly, for example, by applying pharmacological intervention in structures which interact with the mesolimbic system such as the PFC, EC, and hippocampus. Such experiments reveal that despite the apparent anatomical symmetry of the fibers unilateral manipulations influence the DA system asymmetrically. The results of these experiments are summarized in **Table 1**.

#### *The effect of PFC manipulations on the mesolimbic DA system*

The PFC represents information in the short term working memory (Kolb et al., 1994; Kesner et al., 1996; Goldman-Rakic, 2011) and is crucial for spatial and emotional response selection which are influenced by mesocortical DA (Thierry et al., 1976; Delatour and Gisquet-Verrier, 1999; Sullivan, 2004). As previously mentioned the interaction between mesostriatal DA and PFC is reciprocal (Carr and Sesack, 2000; Hnasko et al., 2012; Stuber et al., 2012). 6-OHDA lesions of the mesocortical DA innervating the right mPFC and anterior cingulate induced a significant bilateral reduction in DA content and an increase in DA turnover in the striatum (Sullivan and Szechtman, 1995). Similar lesions to the left mPFC and anterior cingulate did not affect striatal DA. Moreover, similar to unilateral lesion of the right PFC, bilateral lesions also reduced DA content and increased DA turnover but the response was limited to the right NAc (Sullivan and Szechtman, 1995). Evidence for the asymmetrical influence exerted by the PFC on the striatum also comes from studies performed on human subjects. For example, application of a continuous TMS theta burst stimulation (cTBS) to the left dorsolateral PFC of healthy young right-handed adults inhibited DA release bilaterally in the caudate and ipsilaterally in the putamen, whereas right dorsolateral PFC cTBS did not influence binding potential in the striatum (Ko et al., 2008).

A few studies correlated DA imbalance following chemical manipulations of the PFC with behavior. For example, correlations were observed between DA depletion induction in the right medial PFC (mPFC) and rats exhibiting the most severe stress-induced gastric pathology (ulcers) (Sullivan and Szechtman, 1995). Additionally, unilateral injection of DA (D1/D2) antagonist into the mPFC significantly enhanced the restraint stress-induced increases in corticotrophin and corticosterone in non-handled rats (Sullivan and Dufresne, 2006). However, similar experiments in handled rats showed that only right side DA receptor blockade induced elevation of peak stress hormone levels (Sullivan and Dufresne, 2006). These data provide supporting evidence for the asymmetrical relationship between DA imbalance in the mesolimbic system and the animals' affective behavior, and further emphasize that the animals' experience may hamper attempts to characterize this relationship.

**Table 1 | The reported directionality of DA imbalance in the mesolimbic system following biochemical and behavioral manipulations.**

*A circle denotes the spatial distribution of the manipulation outcome. Colored area indicates a significant increase (red) or decrease (blue) in DA or its metabolites in the corresponding hemisphere. Data were included from different species, ages, and measurement techniques.*

#### *The effect of EC and hippocampus manipulations on the mesolimbic DA system*

The EC and the ventral hippocampus (VH) receive DAergic projections from the VTA, and depending on event novelty (Lisman and Grace, 2005) act upon the VTA to regulate the DAergic transmission to the NAc (Louilot and Le Moal, 1994; Kurachi et al., 2000). Inter-hemispheric comparison of changes in DA and DOPAC concentration in the NAc and related regions following pharmacological intervention in the EC or VH revealed a time-dependent process with complex lateralization. Excitotoxic lesion of the left EC decreased DOPAC levels in the NAc and the mPFC 2 weeks post lesion (Kurachi et al., 2000). Moreover, measurements performed in adolescence following excitotoxic lesion of the left EC in newborn rats showed bilateral enhancement of DA concentration in the NAc and dStr, and a unilateral decrease in the right mPFC. The NAc also displayed a bilateral decrease in DOPAC/DA concentration ratio whereas the dStr did not. None of these alterations in the DA system was detected in the newborns (Uehara et al., 2000). When the left EC of adult rats was lesioned with 6-OHDA the outcome was reversed; DA tissue content decreased whereas DOPAC tissue content and the resultant DOPAC/DA ratio increased in the NAc (Louilot and Choulli, 1997). It remains to be seen whether the differences in outcome are due to the age at which the lesion was made or due to the lesion type. A similar lesion to the right EC caused elevation in DOPAC/DA ratio only in the left NAc (Louilot and Choulli, 1997). A more specific intervention by manipulation of both D1 and D2 receptors in the EC and VH significantly influenced DOPAC concentrations in the NAc such that the effect of the D2 manipulation was more pronounced than that of D1 (Louilot and Le Moal, 1994). Specifically, administration of a D2 receptor antagonist in the left EC increased DOPAC concentration in both NAc whereas administration of a D2 agonist caused the opposite effect (i.e., decreased DOPAC concentration in both NAc). 
Injection of a D2 antagonist or agonist into the right EC did not influence DOPAC levels in the NAc. Injections of a D2 antagonist and agonist into the VH evoked similar responses as left EC injections but the response was limited to the side ipsilateral to the injected hemisphere (Louilot and Le Moal, 1994). Moreover, injection of the neurotoxic compound Tetrodotoxin (TTX) into the left EC decreased DOPAC in both NAc whereas TTX in the right EC increased DA transmission in the ipsilateral NAc and decreased DOPAC in the contralateral NAc (Louilot and Le Moal, 1994). Unilateral TTX injections into the VH decreased DOPAC in the NAc of both hemispheres irrespective of injection side (Louilot and Le Moal, 1994). These side-dependent alterations in NAc DA release and its metabolites support the existence of an asymmetrical functional interdependence between midbrain DAergic pathways reaching the EC and the NAc (Louilot and Choulli, 1997).

Consistent with altered mesolimbic DAergic transmission, a left EC excitotoxic lesion in adult rats enhanced spontaneous and methamphetamine induced locomotor activity (Sumiyoshi et al., 2004; Kopniczky et al., 2006). Interestingly, the methamphetamine-induced normalized DA release in the ipsilateral NAc or dStr did not change in comparison to sham-operated animals possibly due to enhanced postsynaptic sensitivity rather than presynaptic alteration in DA levels (Uehara et al., 2000; Sumiyoshi et al., 2004). When the excitotoxic lesion to the left EC was combined with ipsilateral inactivation of the mPFC by lidocaine (a non-selective blocker of axonal fibers of passage as well as neurons) the methamphetamine-induced DA release in the ipsilateral NAc was substantially enhanced (Uehara et al., 2007). By contrast, ipsilateral inactivation of the mPFC by lidocaine alone reduced the methamphetamine-induced DA release in the NAc, which is consistent with previous data. Overall, inactivation of the mPFC together with structural abnormalities in the EC leads to deregulation of DAergic neurotransmission in limbic regions (Uehara et al., 2007).

The results described thus far strengthen assumptions concerning the lateralized involvement of the PFC, the EC, and the VH in the modulation and possibly regulation of mesostriatal DA asymmetry (Glick and Ross, 1981; Glick and Carlson, 1989; Fox and Reed, 2008). Overall, it seems that the nigrostriatal system displays symmetrical laterality (i.e., the outcome of a manipulation applied to one hemisphere is a mirror image of the outcome obtained when the same manipulation is applied to the other hemisphere), whereas the DA mesolimbic system and related structures display asymmetrical laterality (i.e., the outcome of a unilateral manipulation is not a mirror image of the same manipulation applied to the other hemisphere). The occurrence of an asymmetrical laterality supports the notion of specialized hemispheric function (Sullivan, 2004; Fox and Reed, 2008).
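The mirror-image criterion used here to separate symmetrical from asymmetrical laterality can be made explicit with a short sketch. The encoding is an assumption made for illustration only: each outcome is reduced to the sign of the reported change per hemisphere (+1 increase, -1 decrease, 0 no change), and the helper names `mirror` and `is_symmetrical` are hypothetical. The two examples paraphrase findings already described in the text.

```python
# Map each hemisphere label to its opposite.
MIRROR = {"left": "right", "right": "left"}


def mirror(outcome: dict) -> dict:
    """Swap the hemisphere labels of an outcome."""
    return {MIRROR[side]: change for side, change in outcome.items()}


def is_symmetrical(left_outcome: dict, right_outcome: dict) -> bool:
    """True if the outcome of the right-side manipulation is the
    mirror image of the outcome of the left-side manipulation."""
    return mirror(left_outcome) == right_outcome


# Nigrostriatal example: a unilateral 6-OHDA lesion lowers DA in the
# dStr ipsilateral to the lesion, relative to the contralateral dStr.
lesion_left = {"left": -1, "right": 0}
lesion_right = {"left": 0, "right": -1}
nigrostriatal_symmetrical = is_symmetrical(lesion_left, lesion_right)

# Mesolimbic example: a D2 antagonist in the left EC raised DOPAC in
# both NAc, whereas the same injection in the right EC had no effect.
antagonist_left_ec = {"left": 1, "right": 1}
antagonist_right_ec = {"left": 0, "right": 0}
mesolimbic_symmetrical = is_symmetrical(antagonist_left_ec, antagonist_right_ec)
```

Reducing each result to a sign per hemisphere discards magnitude and time course, so the check is a classification aid, not a statistical test.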

#### **THE INFLUENCE OF ANIMAL TRAINING ON DA IMBALANCE IN THE NIGROSTRIATAL SYSTEM**

Thus far, we have described how endogenous DA imbalance in the nigrostriatal system correlates with animal behavior and how accentuation of this biochemical imbalance can alter behavior. Below we address the opposite question of whether utilizing different behavioral tasks can induce or influence the basal DA imbalance.

Prior to training on an electrified T-maze (Zimmerberg et al., 1974) a relatively small percentage (about 54–59%) of animals displayed side preference (Castellano et al., 1987). After training the vast majority (85.71%) of the rats displayed a side preference with a strong bias toward the right arm (80%—right arm preference; 20%—left arm preference) (Castellano et al., 1987). Unilateral lesions using 6-OHDA ipsilaterally to the preferred side did not change preference parameters, whereas contralateral lesion massively and persistently decreased the choice of the side preferred preoperatively (Castellano et al., 1987). A similar outcome was observed when rats were trained to circle for a sucrose reward in a randomly assigned direction. Prior to training, caudate and NAc DA and DOPAC levels were the same in both hemispheres. By contrast, following training a significant increase occurred in both DA and DOPAC concentrations in the caudate and NAc contralateral to the turning direction whereas these concentrations did not change in the ipsilateral caudate compared to control animals (Yamamoto and Freed, 1982, 1984a). Amphetamine administration further enhanced turning in the trained direction regardless of the animals' previous circling preference. As in naive animals, amphetamine-induced circling led to increased DA concentrations in the caudate contralateral to the trained circling direction (Yamamoto and Freed, 1984b). These findings are congruent with the previously described relationship between DA imbalance and side preference.

Similar experiments (Szostak et al., 1986, 1988, 1989; Glick and Carlson, 1989) in which water-deprived animals were trained on circling behavior failed to replicate the results described by Yamamoto et al. (Yamamoto and Freed, 1982, 1984a,b). In particular, circling did not produce a biochemical imbalance in the nigrostriatal and the mesolimbic DA system; rats trained to circle using a continuous schedule of reinforcement did not exhibit any change in concentrations of striatal DA or DOPAC although alterations in DA and DA turnover were detected in the mPFC (Szostak et al., 1986; Glick and Carlson, 1989). Changing the reward schedule induced enhanced DAergic activity in both the NAc and the dStr (Szostak et al., 1986). A plausible explanation arises from a report which showed that different regions of the striatum exhibit different DA and DOPAC concentrations, thus suggesting striatal region specialization (Szostak et al., 1989). Importantly, the use of a water deprivation protocol without circling training was sufficient to induce a bilateral decrease in DOPAC/DA ratio in the NAc and mPFC (Glick and Carlson, 1989).

The following experiments further strengthen the relationship between DA imbalance and rotation direction. Unilateral 6-OHDA lesions of the MFB of rats trained to circle in their preferred direction for water reinforcement revealed a complex scenario that fits relatively well with the chemical imbalance described in control animals. Specifically, rats with a lesion contralateral to their trained direction stopped turning in that direction and often turned in the untrained direction. By contrast, rats lesioned ipsilaterally to the direction of reinforced circling exhibited only a 50% decrease in the rate of reinforced responding. These findings also highlight the effect of training on DA hemispheric imbalance. Following these procedures, the experimental contingencies were reversed such that all rats had to turn in the untrained direction. After the reversal, the contralaterally lesioned group had to turn toward the lesion and consequently easily acquired the reversal; however, reinforced rates of responding did not reach preoperative rates, possibly due to altered motivation (Mogenson et al., 1980; Smith et al., 2002; Wise, 2002; Solstad et al., 2008). Conversely, the ipsilaterally lesioned group had to turn away from the lesion and consequently was unable to acquire the reversal and continued to turn in the originally trained direction (Szostak et al., 1988). These findings are in line with the previously described rotation direction induced by 6-OHDA lesions. Moreover, these experiments suggest a causal relationship between DA imbalance and rotation direction because they imply that the direction of circling cannot be altered by training as long as the biochemical imbalance cannot be reversed.

Overall, by considering that alterations in DA/DOPAC concentrations resulting from training appear to be sensitive to variations in the training protocol such as reward schedule, severity and type of deprivation (food or water), type of reward delivered (sucrose, water, or food) and level of performance reached following training, the question whether DA imbalance may be effectively manipulated by training, which is a simple noninvasive procedure and as such may have a therapeutic capacity, remains controversial. These findings also raise the question whether the observed behavioral bias is a byproduct of the current DA imbalance or whether the regulatory mechanism promotes actions aimed to attenuate the inherent hemispheric laterality by overworking the less active regions.

#### **THE INFLUENCE OF ANIMAL TRAINING ON DA IMBALANCE IN THE MESOLIMBIC SYSTEM**

The question of whether behavioral manipulations can induce or influence DA imbalance in the mesolimbic system has been addressed by utilizing different behavioral paradigms involving stress. Stress induction lacks laterality and therefore, in all of the experiments, emergent results reflect internal network properties. In this section we focus primarily on experiments showing DA measures in the NAc and PFC in an attempt to identify whether laterality appears. A summary of the results is shown in **Table 1**.

DA responses to a naturally attractive olfactory stimulus were elevated in both NAc, most markedly in the right core. However, conditioned taste aversion following LiCl treatment resulted in elevation in DA levels in the left but not right NAc core, which further expanded to the left shell upon a second presentation of the now aversive stimulus (Besson and Louilot, 1995). Different results were obtained when physical stress was applied instead of presenting an aversive stimulus. The amplitude of DA release following tail pinch was higher in the right NAc compared to the left (Laplante et al., 2013), whereas in the mPFC DA release increased bilaterally (Sullivan and Gratton, 1998). Moreover, the duration of the DA response in the left mPFC was significantly longer than that in the right mPFC (Sullivan and Gratton, 1998).

DA measures in the right mPFC correlated with reduced stress levels following 5 repetitions of restraint stress induction in both handled and non-handled animals (Sullivan and Dufresne, 2006). However, in handled rats DA turnover in the right mPFC was significantly higher than in the left, whereas in non-handled rats DA turnover in the left mPFC was significantly higher relative to the right mPFC (Sullivan and Dufresne, 2006). It should be noted, however, that these results are incongruent with earlier findings regarding the cortical lateralization of emotional regulation, which is facilitated by early postnatal handling stimulation and fails to occur in the absence of postnatal handling (Denenberg, 1981; Sullivan and Dufresne, 2006). DA turnover in the right mPFC was also higher than in the left in animals that received a foot shock without being able to terminate it (Carlson et al., 1993). When animals could control foot shock termination, DA turnover increased in the mPFC bilaterally (Carlson et al., 1993). Similarly, when rats and mice were exposed to a brightly lit novel environment (novelty stress), DA turnover in the right PFC was significantly higher than in the left PFC in animals that were not given the option of engaging in a non-escape behavior (e.g., chewing an inedible object), compared to those that were (Berridge et al., 1999). Thus, once animals are able to attenuate the physiological stress by controlling stressor termination or by engaging in a coping or displacement behavior, the stress-induced DA response in the right PFC is attenuated (Berridge et al., 1999).

DA concentration measurements following different durations of restraint stress provide additional support for the correlation observed between stress level and DA imbalance in the mPFC. DA concentration increased in the right mPFC and reached significance relative to the left mPFC after the animals were kept restrained for an hour, and DA turnover was significantly higher in the left relative to the right mPFC following 15 min of restraint stress but not following prolonged restraint duration. These experiments suggest an alternative hypothesis in which a left to right shift in the mesocortical DA activation occurs with prolonged exposure to stress (Carlson et al., 1991).
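The left vs. right comparisons above can be made concrete with a small calculation. The following is a minimal sketch, assuming that DA turnover is expressed as the DOPAC/DA ratio and that hemispheric imbalance is summarized by a normalized right-minus-left index. The function names, the index definition, and the concentration values are hypothetical illustrations and are not taken from the cited studies, which may define turnover differently.

```python
def turnover(dopac, da):
    """DA turnover expressed as the DOPAC/DA concentration ratio (assumed definition)."""
    if da <= 0:
        raise ValueError("DA concentration must be positive")
    return dopac / da


def asymmetry_index(left, right):
    """Normalized right-minus-left index (an assumption, not the authors' measure).

    Positive values indicate right dominance, negative values left dominance,
    and 0 indicates no imbalance.
    """
    return (right - left) / (right + left)


# Hypothetical mPFC concentrations in arbitrary units, for illustration only.
left_turnover = turnover(dopac=0.5, da=2.0)     # 0.25
right_turnover = turnover(dopac=0.25, da=2.0)   # 0.125

idx = asymmetry_index(left_turnover, right_turnover)
print(f"left={left_turnover}, right={right_turnover}, index={idx:.3f}")
```

A negative index here corresponds to the pattern described for short restraint (turnover higher on the left); a shift toward positive values would correspond to the right-sided pattern described for other stress conditions.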

The lack of agreement between these reports suggests that laterality in the mesolimbic DA system depends on factors such as the type of induced stress, stress duration and animal's previous experience (Denenberg, 1981; Fox and Reed, 2008). Therefore, these experiments do not allow addressing the question of whether causality exists between DA imbalance in the mesolimbic system and behavior; whether DA imbalance can reduce stress levels or alternatively that DA imbalance is determined by the animals' sensitivity to stress induction should be addressed by utilizing different experimental methods. In any case, despite the absence of a consistent link between behavioral manipulations lacking directionality and DA imbalance in the mesolimbic system these experiments support an asymmetrical processing characteristic of a specialized left vs. right network.

## **DISCUSSION**

DA imbalance across brain hemispheres is inherent in newborns. With development, this imbalance decreases due to an interhemispheric time-dependent regulatory mechanism which differentially influences the DA system in each hemisphere (Rodriguez et al., 1994; Giardino, 1996; Frohna et al., 1997; Vernaleken et al., 2007). This mechanism also compensates for alterations in the endogenous DA imbalance following brain lesion and possibly neurodegenerative processes. The existence of such a mechanism raises the question whether there is a way to influence DA imbalance by utilizing specially designed behavioral manipulations that directly or indirectly activate the mechanism. Although controversial, studies in the nigrostriatal system suggest this possibility is feasible whereas in the mesolimbic system the feasibility of this approach remains to be explored. If successful, this approach can influence the development of novel non-invasive therapeutic means for the treatment of various disorders affected by alterations in DA imbalance.

The EC, the hippocampus, and PFC are part of a network which modulates NAc responses to DA arriving from the VTA. This network seems to display asymmetric left-right laterality rather than displaying the symmetric laterality observed in the nigrostriatal system. This difference in laterality is consistent with the roles played by each BG circuit: the sensorimotor and associative regions (dStr) display laterality which matches the laterality of their sensory inputs, whereas the limbic regions (NAc), which process abstract inputs supposedly lacking laterality, are sensitive to the laterality of prefrontal and temporal lobe structures. Such laterality fits well with the concept of hemispheric specialization described in the PFC in relation to various behaviors (Clark et al., 2003; Sullivan, 2004; Goel et al., 2007; Fox and Reed, 2008; Lupinsky et al., 2010). Additional studies are required to determine whether the observed left-right laterality in the mesolimbic system has functional implications for information processing in subcortical structures (e.g., BG) or whether it merely reflects the asymmetric functionality in structures such as the EC and the PFC.

The DA system plays a major role in the planning and execution of movements and in the acquisition and expression of learned appetitive behaviors, which allow the organism to adapt to its surroundings and are thus essential for the animal's survival. To enable a comprehensive understanding of the structure and function of this system, it is essential to plan and execute experiments which, in addition to factors such as age, gender, and previous experience, take into account the existence of hemispheric specialization, the endogenous DA imbalance and its influence on behavior, and the way in which behavior can influence this imbalance.

#### **ACKNOWLEDGMENT**

This study was financed in part by a REALNET (FP7-ICT270434) grant from the European Commission.

#### **REFERENCES**


population bias. *Physiol. Behav.* 40, 607–612. doi: 10.1016/0031-9384(87)90105-3


6-hydroxydopamine-induced axon terminal lesions: evidence for interhemispheric functional coupling of the two nigrostriatal pathways. *J. Comp. Neurol.* 432, 217–229. doi: 10.1002/cne.1098


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 October 2013; accepted: 24 May 2014; published online: 11 June 2014. Citation: Molochnikov I and Cohen D (2014) Hemispheric differences in the mesostriatal dopaminergic system. Front. Syst. Neurosci. 8:110. doi: 10.3389/fnsys.2014.00110 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Molochnikov and Cohen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Mechanism of parkinsonian neuronal oscillations in the primate basal ganglia: some considerations based on our recent work

## **Atsushi Nambu<sup>1,2</sup>\* and Yoshihisa Tachibana<sup>1,2</sup>**

<sup>1</sup> Division of System Neurophysiology, National Institute for Physiological Sciences, Okazaki, Japan <sup>2</sup> Department of Physiological Sciences, Graduate University for Advanced Studies, Okazaki, Japan

#### **Edited by:**

Ahmed A. Moustafa, University of Western Sydney, Australia

**Reviewed by:**

Alessandro Stefani, University of Rome, Italy Reuben R. Shamir, Case Western Reserve University, USA

#### **\*Correspondence:**

Atsushi Nambu, Division of System Neurophysiology, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki 444-8585, Japan e-mail: nambu@nips.ac.jp

Accumulating evidence suggests that abnormal neuronal oscillations in the basal ganglia (BG) contribute to the manifestation of parkinsonian symptoms. In this article, we would like to summarize our recent work on the mechanism underlying abnormal oscillations in the parkinsonian state and discuss its significance in the pathophysiology of Parkinson's disease. We recorded neuronal activity in the BG of parkinsonian monkeys treated with 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine. Systemic administration of L-DOPA alleviated parkinsonian motor signs and decreased abnormal neuronal oscillations (8–15 Hz) in the internal (GPi) and external (GPe) segments of the globus pallidus and the subthalamic nucleus (STN). Inactivation of the STN by muscimol (GABA<sub>A</sub> receptor agonist) injection also ameliorated parkinsonian signs and suppressed GPi oscillations. The blockade of glutamatergic inputs to the STN by local microinjection of a mixture of 3-(2-carboxypiperazin-4-yl)-propyl-1-phosphonic acid (glutamatergic NMDA receptor antagonist) and 1,2,3,4-tetrahydro-6-nitro-2,3-dioxo-benzo[f]quinoxaline-7-sulfonamide (glutamatergic AMPA/kainate receptor antagonist) suppressed neuronal oscillations in the STN. STN oscillations were also attenuated by the blockade of GABAergic neurotransmission from the GPe to the STN by muscimol inactivation of the GPe. These results suggest that cortical glutamatergic inputs to the STN and reciprocal GPe-STN interconnections are both important for the generation and amplification of the oscillatory activity of GPe and STN neurons in the parkinsonian state. The oscillatory activity in the STN is subsequently transmitted to the GPi and may contribute to the manifestation of parkinsonian symptoms.

**Keywords: Parkinson's disease, neuronal oscillation, globus pallidus, subthalamic nucleus, β-band, monkey, basal ganglia**

## **INTRODUCTION**

Parkinson's disease (PD) is a neurodegenerative disorder affecting motor and non-motor functions. Motor dysfunction in PD, including akinesia, tremor and rigidity is largely attributed to the progressive loss of dopaminergic (DAergic) neurons in the substantia nigra pars compacta. There are two hypotheses that explain the pathophysiology of PD. The "firing rate model" originally proposed that dopamine (DA) depletion reduces tonic excitation to striatal neurons projecting to the internal segment of the globus pallidus (GPi) (i.e., *direct* pathway) and tonic inhibition to striatal neurons projecting to the external segment of the globus pallidus (GPe) (*indirect* pathway) (DeLong, 1990; Mallet et al., 2006). Both of these changes are thought to increase average firing rates of GPi and substantia nigra pars reticulata neurons. This increased activity in the basal ganglia (BG) output nuclei induces decreased activity in thalamic and cortical neurons, resulting in akinesia. However, recent electrophysiological studies using 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP)-induced PD monkeys have failed to detect an expected increase in GPi activity (Wichmann et al., 1999; Raz et al., 2000; Rivlin-Etzion et al., 2008).

The firing rate model has now been largely supplanted by the "firing pattern model" that emphasizes oscillatory and/or synchronized activity. Oscillatory and/or synchronized activity is frequently observed in the BG of patients with movement disorders and animal models, which may cause the disturbance of information processing in the BG (Bergman et al., 1998). Unit activity and local field potentials recorded from PD animals and patients have shown oscillatory and synchronized activity in the GPe, GPi and subthalamic nucleus (STN; Bergman et al., 1998; Levy et al., 2000; Raz et al., 2000; Brown et al., 2001; Brown, 2007). The frequency bands include the tremor (4–9 Hz) and β (10–30 Hz) bands. The β-band oscillation may be a primary cause of akinesia, since the treatment of akinesia with drugs effectively suppresses the β-band oscillation. Recent studies also reported β-band synchronized activity in STN neurons of PD patients (Moshel et al., 2013), and a correlation between high β-band activity and freezing of gait in PD patients (Toledo et al., 2014). Deep brain stimulation (DBS), which has been widely accepted as an effective therapeutic option for PD, is suggested to improve motor symptoms by activation of efferent fibers (Hashimoto et al., 2003), changes of oscillatory activity (Vitek, 2008) and/or decoupling of STN-GPi oscillations (Moran et al., 2012). By contrast, in the course of MPTP-treatment of monkeys, the appearance of PD motor symptoms preceded that of oscillatory activity (Leblois et al., 2007), seeming to contradict the firing pattern model.

In this article, we would like to summarize our recent work on the mechanism regulating the abnormal BG oscillations (Tachibana et al., 2011) and discuss its significance in PD pathophysiology.

#### **OSCILLATORY ACTIVITY IN THE BG OF PD**

The firing properties of BG neurons were compared between the normal and PD states of macaque monkeys. PD states were induced by MPTP treatment (2.4–2.5 mg/kg, carotid artery injection and additional intravenous injections). The average firing rates of GPe neurons were significantly decreased (normal, 65.2 ± 25.8 Hz; PD, 41.2 ± 22.5 Hz) and those of STN neurons were significantly increased (normal, 19.8 ± 9.7 Hz; PD, 27.6 ± 11.4 Hz) in the PD state, whereas the firing rate of GPi neurons was not changed (normal, 67.0 ± 24.3 Hz; PD, 63.1 ± 26.9 Hz). These data contradict the firing rate model. Burst strength (Levy et al., 2001a; Wichmann and Soares, 2006) was increased in the GPi/GPe and STN in the PD state. The mean power (Soares et al., 2004; Rivlin-Etzion et al., 2006) of the 8–15 Hz (low-β) oscillations was increased in the GPi/GPe and STN, whereas there were no consistent changes in the 3–8 Hz and 15–30 Hz (high-β) oscillations. Oscillatory bursts of GPi/GPe and STN neurons were observed as multiple peaks in the autocorrelograms (e.g., **Figures 1B1, 2B1, 3B1**). The peak frequency with a maximum power of the oscillatory bursts of GPi/GPe and STN neurons was around 14 Hz (**Figures 1B2, 2B2, 3B2**).
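To illustrate how oscillatory firing shows up as multiple autocorrelogram peaks and as a spectral peak, the following is a minimal sketch using only the Python standard library. It is not the analysis used in the cited work (the burst-strength and mean-power measures follow the references given above, and no significance threshold is computed here). The spike train is synthetic, hypothetical data: a Poisson process whose rate is modulated at 14 Hz, with a 50-s duration and 40 Hz mean rate chosen only for illustration. All function names and parameters are assumptions.

```python
import cmath
import math
import random


def synthetic_oscillatory_train(duration_s=50.0, mean_rate_hz=40.0,
                                osc_hz=14.0, depth=0.8, seed=0):
    """Hypothetical test data: Poisson spikes with a rate modulated at osc_hz."""
    rng = random.Random(seed)
    peak_rate = mean_rate_hz * (1.0 + depth)
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(peak_rate)        # candidate event at the peak rate
        if t >= duration_s:
            return spikes
        rate = mean_rate_hz * (1.0 + depth * math.sin(2 * math.pi * osc_hz * t))
        if rng.random() * peak_rate < rate:    # thinning: keep with prob. rate/peak
            spikes.append(t)


def autocorrelogram(spikes, bin_s=0.005, max_lag_s=0.2):
    """Histogram of positive lags between all spike pairs closer than max_lag_s."""
    n_bins = int(round(max_lag_s / bin_s))
    counts = [0] * n_bins
    for i, t0 in enumerate(spikes):
        for j in range(i + 1, len(spikes)):
            k = int((spikes[j] - t0) / bin_s)
            if k >= n_bins:                    # spikes are sorted, so stop here
                break
            counts[k] += 1
    return counts


def power_spectrum(spikes, freqs_hz):
    """Power of the spike train at each frequency, normalized by spike count."""
    n = len(spikes)
    return [abs(sum(cmath.exp(-2j * math.pi * f * t) for t in spikes)) ** 2 / n
            for f in freqs_hz]


def band_mean(freqs_hz, power, lo, hi):
    """Mean power over frequencies f with lo <= f < hi."""
    vals = [p for f, p in zip(freqs_hz, power) if lo <= f < hi]
    return sum(vals) / len(vals)


spikes = synthetic_oscillatory_train()
acorr = autocorrelogram(spikes)
freqs = [0.5 * k for k in range(6, 61)]        # 3.0 to 30.0 Hz in 0.5 Hz steps
power = power_spectrum(spikes, freqs)

peak_hz = freqs[power.index(max(power))]
low_beta = band_mean(freqs, power, 8.0, 15.0)   # 8-15 Hz band
high_beta = band_mean(freqs, power, 15.0, 30.0)  # 15-30 Hz band
print(f"peak frequency: {peak_hz} Hz")
print(f"mean power 8-15 Hz: {low_beta:.1f}, 15-30 Hz: {high_beta:.1f}")
```

In this sketch the autocorrelogram counts rise again near lags of about 71 ms (one 14 Hz cycle) and dip near half a cycle, which is the kind of multi-peaked shape described above. The frequency grid uses multiples of 1/duration, so the mean firing rate does not leak into the evaluated frequencies.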

## **DA DEPENDENCE OF BG OSCILLATIONS**

We first tested whether the abnormal BG oscillations depend on DAergic inputs. L-DOPA was administered systemically to PD monkeys, and the effects on the neuronal activity of GPi/GPe and STN neurons were examined. The motor disability was ameliorated within 5 min after intravenous L-DOPA injections (2.5–3.5 mg/kg, iv). L-DOPA administration decreased 8–15 Hz oscillations in the GPi/GPe and STN. Approximately 30 min after L-DOPA injections, the monkeys returned to the PD state, and the abnormal oscillations reappeared. The overall firing rate was not changed throughout the injections. These results demonstrate that abnormal burst firing and 8–15 Hz oscillatory activity of GPi/GPe and STN neurons are DA-dependent. They also suggest that neuronal oscillations in the GPi/GPe and STN, rather than their spontaneous firing rate changes, may be critical for PD symptoms, supporting the firing pattern model.

## **ORIGINS OF ABNORMAL GPi/GPe OSCILLATIONS**

Then, the origins of 8–15 Hz GPi/GPe oscillations were examined. The GPi (Tachibana et al., 2008) and GPe (Kita et al., 2004) receive glutamatergic inputs from the STN and GABAergic inputs from the striatum and GPe (GPe-GPe projections via the intranuclear axon collaterals). To determine which inputs contribute to abnormal 8–15 Hz GPi oscillations, each input was selectively blocked. Firstly, the STN was inactivated by injection of a GABA<sub>A</sub> receptor agonist, muscimol (4.4 mM, 0.5–1.0 µL) while GPi neuronal activity was simultaneously recorded (**Figure 1A**). Inactivation of the STN ameliorated PD motor signs, such as bradykinesia and rigidity, as previously reported (Bergman et al., 1990; Wichmann et al., 1994; Levy et al., 2001b) and decreased the 8–15 Hz oscillations (**Figures 1B, C**) and the firing rate.

**FIGURE 1 | Effects of subthalamic nucleus (STN) inactivation on neuronal activity of the internal segment of the globus pallidus (GPi) under the parkinsonian state. (A)** A schematic diagram showing anatomical connections of the basal ganglia and the experimental method. Recording from GPi neurons was performed with muscimol injection into the STN to block STN inputs to the GPi. Open and filled arrows represent glutamatergic and GABAergic projections, respectively. Cx, cerebral cortex; GPe, external segment of the globus pallidus; Str, striatum; Th, thalamus. **(B)** A representative GPi neuron showing abnormal oscillations in the parkinsonian state. **(1)** Autocorrelograms calculated from a 50-s spike train and **(2)** power spectra of the same spike trains are shown. Gray dashed lines represent a confidence level of P = 0.01. **(C)** Muscimol inactivation of the STN decreased the firing rate and 8–15 Hz oscillatory activity of the GPi neuron. Modified from Tachibana et al. (2011).

Secondly, GABAergic inputs from the striatum and GPe were blocked, and the effects on the oscillatory activity of GPi/GPe neurons were examined. Microinjection of a GABA<sub>A</sub> receptor antagonist, gabazine (1 mM, 0.1–0.2 µL) in the vicinity of recorded GPi/GPe neurons increased the firing rate of GPi/GPe neurons, and augmented the 8–15 Hz GPi oscillations, but induced no changes in GPe oscillations. These results suggest that 8–15 Hz GPi/GPe oscillations are generated by glutamatergic inputs mainly from the STN, but not by GABAergic inputs from the striatum and GPe.

#### **ORIGINS OF ABNORMAL STN OSCILLATIONS**

Next, the origins of 8–15 Hz STN oscillations were examined. The STN receives glutamatergic inputs from the cerebral cortex and the thalamus, and GABAergic inputs from the GPe. Firstly, ionotropic glutamatergic inputs were blocked, and the effects on the oscillatory activity of STN neurons were examined (**Figure 2A**). Microinjection (0.1–0.2 µL) of a mixture of an *N*-methyl-D-aspartate receptor antagonist, 3-(2-carboxypiperazin-4-yl)-propyl-1-phosphonic acid (CPP, 1 mM) and an AMPA/kainate receptor antagonist, 1,2,3,4-tetrahydro-6-nitro-2,3-dioxo-benzo[f]quinoxaline-7-sulfonamide (NBQX, 1 mM) in the vicinity of recorded STN neurons decreased the 8–15 Hz oscillations (**Figures 2B, C**).

Secondly, GABAergic inputs from the GPe were blocked, and the effects on the oscillatory activity of STN neurons were examined (**Figure 3A**). Muscimol inactivation (1–2 µL) of the GPe attenuated the 8–15 Hz STN oscillations (**Figures 3B, C**) and increased the firing rate. However, the GPe inactivation induced no clear behavioral changes. These findings show that the 8–15 Hz STN oscillations are generated by glutamatergic inputs from the cortex and thalamus and GABAergic inputs from the GPe.

Previous studies reported coherence between the electrocorticogram and the STN LFPs/STN unit activity in the PD state and have suggested that cortical glutamatergic inputs can drive STN oscillations in frequency bands below 30 Hz (Magill et al., 2000, 2001; Sharott et al., 2005; Mallet et al., 2008b). It is hypothesized that cortical β-rhythm is preferentially transmitted to the BG (Brittain and Brown, 2014). This idea is also supported by an optogenetic study showing that selective stimulation of cortico-STN projections ameliorated PD symptoms (Gradinaru et al., 2009). The other glutamatergic inputs to the primate STN may come from the intralaminar thalamic nuclei (Lanciego et al., 2009). The parafascicular thalamic nucleus (PF) neurons in PD rats showed oscillatory activity (0.5–2.5 Hz), but PF firings lagged STN firings (Parr-Brownlie et al., 2009).

Another origin of STN oscillations may be the GABAergic inputs from the GPe (Baufreton et al., 2005a). An *in vivo* rat study indicated that 15–30 Hz oscillations between GPe and STN neurons developed during DA depletion (Mallet et al., 2008a). DAergic innervation in the GPe was decreased in PD monkeys (Schneider and Dacko, 1991), and the GPe-GPe GABAergic transmission was augmented (Watanabe et al., 2009). The oscillatory glutamatergic inputs mainly from the cortex and synchronized GABAergic inputs from the GPe may accelerate the oscillatory activity in STN neurons (Shen and Johnson, 2000, 2005; Baufreton et al., 2005b; Baufreton and Bevan, 2008).

**to STN neurons under the parkinsonian state. (A)** Recording from STN neurons was performed with intrasubthalamic microinjection of CPP and NBQX to block glutamatergic inputs to the STN. **(B)** A representative STN

**(C)** Intrasubthalamic microinjection of CPP and NBQX decreased 3–8 Hz and 8–15 Hz oscillations of the STN neuron. Modified from Tachibana et al. (2011).

**FIGURE 3 | Effects of GPe inactivation on STN neurons under the parkinsonian state**. **(A)** Recording from STN neurons was performed with muscimol injection into the GPe to block GABAergic inputs from the GPe. **(B)** A representative STN neuron showing abnormal 8–15 Hz oscillations under the parkinsonian state. **(C)** Muscimol inactivation of the GPe decreased the 8–15 Hz oscillations and increased the firing rate of the STN neuron. Modified from Tachibana et al. (2011).

## **BG OSCILLATIONS AND PD PATHOPHYSIOLOGY**

Our work shows the following results: (1) The loss of DA induced abnormal 8–15 Hz oscillations in GPi/GPe and STN neurons; (2) The abnormal 8–15 Hz GPi/GPe and STN oscillations were reversed by systemic DA administration; (3) The abnormal 8–15 Hz GPi/GPe oscillations originated from the STN oscillations; and (4) The STN oscillations were driven by glutamatergic inputs mainly from the cortex and GABAergic inputs from the GPe. These findings support the firing pattern model and suggest the following mechanism of BG oscillations: glutamatergic inputs to the STN and reciprocal GPe-STN interconnections generate and amplify the oscillatory activity of STN and GPe neurons in PD. Such oscillatory activity is subsequently transmitted to GPi neurons, and finally reaches the thalamus, cortex and brain stem, contributing to the expression of PD symptoms (**Figure 4**).

The causal relationship between the BG oscillations and PD symptoms is a fundamental question. Leblois et al. (2007) have reported that oscillatory activity of BG neurons does not precede the appearance of PD motor symptoms in the course of chronic MPTP treatment of monkeys, questioning such a causal relationship. Moreover, acute disruption of DA transmission did not induce oscillatory activity, in contrast to chronically depleted animals (Mallet et al., 2008b). The BG oscillations may merely reflect other fundamental activity changes. In PD, the balance between the cortico-STN-GPi *hyperdirect* (Nambu et al., 2000), cortico-striato-GPi *direct* and cortico-striato-GPe *indirect* pathways is lost owing to the lack of DA in the striatum, and the "dynamic" network properties of the BG are changed (Nambu et al., 2005; Kita and Kita, 2011). It is suggested that the imbalance between the *hyperdirect* and *direct* pathways generates the BG oscillations (Leblois et al., 2006). Further studies are needed to resolve this fundamental question.

In this article, we would like to emphasize a close relationship between the BG oscillations and PD symptoms. In fact, DAergic medication, STN-DBS, and voluntary movements in human patients are all reported to decrease the cortico-BG synchronization (Brown et al., 2001, 2004; Cassidy et al., 2002; Levy et al., 2002; Williams et al., 2002; Silberstein et al., 2005; Lafreniere-Roula et al., 2010). In a similar manner, the suppression of 8–15 Hz oscillations in the primate BG may be essential to ameliorate PD motor symptoms. These findings could shed light on the pathophysiology of PD and on the mechanisms of current therapies, and could lead to more rational treatments of PD.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 September 2013; accepted: 14 April 2014; published online: 23 May 2014*. *Citation: Nambu A and Tachibana Y (2014) Mechanism of parkinsonian neuronal oscillations in the primate basal ganglia: some considerations based on our recent work. Front. Syst. Neurosci. 8:74. doi: 10.3389/fnsys.2014.00074*

*This article was submitted to the journal Frontiers in Systems Neuroscience*. *Copyright © 2014 Nambu and Tachibana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Disrupting neuronal transmission: mechanism of DBS?

## **Satomi Chiken and Atsushi Nambu\***

Division of System Neurophysiology, National Institute for Physiological Sciences and Department of Physiological Sciences, Graduate University for Advanced Studies, Myodaiji, Okazaki, Japan

#### **Edited by:**

Ahmed A. Moustafa, University of Western Sydney, Australia

#### **Reviewed by:**

Alessandro Stefani, University of Rome, Italy

Robert S. Turner, University of Pittsburgh, USA

#### **\*Correspondence:**

Atsushi Nambu, Division of System Neurophysiology, National Institute for Physiological Sciences and Department of Physiological Sciences, Graduate University for Advanced Studies, 38 Nishigonaka, Myodaiji, Okazaki 444-8585, Japan e-mail: nambu@nips.ac.jp

Applying high-frequency stimulation (HFS) to deep brain structures, known as deep brain stimulation (DBS), is now recognized as an effective therapeutic option for a wide range of neurological and psychiatric disorders. DBS targeting the basal ganglia thalamo-cortical loop, especially the internal segment of the globus pallidus (GPi), subthalamic nucleus (STN) and thalamus, has been widely employed as a successful surgical therapy for movement disorders, such as Parkinson's disease, dystonia and tremor. However, the neurophysiological mechanism underlying the action of DBS remains unclear and is still under debate: does DBS inhibit or excite local neuronal elements? In this review, we examine this question and propose an alternative interpretation: DBS dissociates inputs and outputs, resulting in disruption of abnormal signal transmission.

**Keywords: deep brain stimulation, basal ganglia, subthalamic nucleus, globus pallidus, cortico-basal ganglia loop, electrophysiology**

#### **INTRODUCTION**

Applying high-frequency electrical stimulation (HFS) to a specific target in subcortical structures, known as deep brain stimulation (DBS), was introduced as a surgical treatment for movement disorders in the early 1990s (Benabid et al., 1991, 1994; Siegfried and Lippitz, 1994a,b; Limousin et al., 1995). Since then, DBS has been widely accepted as an effective therapeutic option. DBS targeting the ventral thalamus dramatically alleviates essential and resting tremor (Benabid et al., 1991, 1996; Siegfried and Lippitz, 1994b; Koller et al., 1997; Rehncrona et al., 2003). DBS targeting the subthalamic nucleus (STN) and the internal segment of the globus pallidus (GPi) has been widely used for the treatment of Parkinson's disease, and GPi-DBS markedly improves dystonic symptoms (Limousin et al., 1995; Deep-Brain Stimulation for Parkinson's Disease Study Group, 2001; Coubes et al., 2004; Wichmann and Delong, 2006; Kringelbach et al., 2007; Ostrem and Starr, 2008; Vitek, 2008; Vidailhet et al., 2013). However, the exact mechanism underlying its effectiveness remains to be elucidated.

Since DBS gives rise to effects similar to those of lesions, it was originally considered to inhibit local neuronal elements. In fact, firing of neighboring neurons was inhibited by STN- or GPi-DBS (Boraud et al., 1996; Dostrovsky et al., 2000; Wu et al., 2001; Filali et al., 2004; Lafreniere-Roula et al., 2010). On the other hand, recent studies have emphasized activation of neuronal elements. Indeed, STN-DBS increased activity of GPi neurons through the excitatory STN-GPi projections (Hashimoto et al., 2003; Galati et al., 2006; Reese et al., 2011), and GPi-DBS reduced activity of thalamic neurons through the inhibitory GPi-thalamic projections (Anderson et al., 2003; Pralong et al., 2003; Montgomery, 2006). In addition, recent studies reported multi-phasic responses consisting of excitation and inhibition in GPi neurons during GPi-DBS (Bar-Gad et al., 2004; Erez et al., 2009; McCairn and Turner, 2009; Leblois et al., 2010). In this article, we critically review recent studies and discuss the possible mechanism underlying the effectiveness of DBS.

## **DEEP BRAIN STIMULATION (DBS) INHIBITS LOCAL NEURONAL ELEMENTS**

Both DBS and lesions were found to produce similar benefits in alleviating symptoms. For example, the effects of STN-DBS on Parkinsonian motor signs (Benazzouz et al., 1993; Benabid et al., 1994; Limousin et al., 1995) are similar to those of STN lesions (Bergman et al., 1990; Aziz et al., 1991; Levy et al., 2001) and of blockade of synaptic transmission from the STN to the GPi (Graham et al., 1990; Brotchie et al., 1991). Thus, DBS was originally assumed to inhibit local neuronal elements. Indeed, the most common effect of STN- or GPi-HFS on neighboring neurons was a reduction in firing rate.

Distinct suppression of neuronal activity was recorded around the stimulating sites during STN-DBS in Parkinsonian patients undergoing stereotactic surgery (Filali et al., 2004; Welter et al., 2004). Similar results were also obtained in animal models, such as Parkinsonian monkeys (Meissner et al., 2005; Moran et al., 2011) and rats (Tai et al., 2003; Shi et al., 2006). Stimulus artifacts hinder detection of spikes for 2–3 ms after each stimulus pulse, and some spikes may be obscured when neuronal activity is recorded near the stimulating electrodes. Recent studies enabled detection of spikes just after stimulus pulses by removing stimulus artifacts with the template subtraction method (Wichmann, 2000; Hashimoto et al., 2002) and confirmed that STN-DBS decreased firing of neighboring neurons (Meissner et al., 2005; Moran et al., 2011). Although STN-HFS markedly decreased neuronal firing around the stimulation site, complete cessation of STN firing was observed in a limited number of neurons. STN-HFS at 140 Hz reduced the mean firing rate of STN neurons by 77% in Parkinsonian patients; among them, 71% of STN neurons exhibited residual neuronal activity, while only 29% exhibited total inhibition (Welter et al., 2004). Similar results were also observed in Parkinsonian monkeys (Meissner et al., 2005), and Parkinsonian and normal rats (Tai et al., 2003). Decreased abnormal oscillatory activity in the STN was also observed during STN-DBS in Parkinsonian monkeys (Meissner et al., 2005). Inhibitory effects sometimes outlasted the stimulus period (Tai et al., 2003; Filali et al., 2004; Welter et al., 2004).

**Figure 1.** … high-frequency stimulation (HFS; 30 µA, 100 Hz, 10 pulses) in a normal monkey. Raw traces of spike discharges after removing the stimulus artifacts by the template subtraction method **(1)** and raster and peristimulus time histograms (PSTHs; 100 trials; binwidth, 1 ms) **(2)** are shown. Arrows indicate the timing of local stimulation. Spontaneous discharges of the GPi neuron were completely inhibited by the stimulation. **(B)** Effect of local injection of gabazine (GABA<sub>A</sub> receptor antagonist) in the vicinity of the recorded GPi neuron on inhibition of spontaneous activity induced by GPi-HFS. The inhibition was abolished after gabazine injection. Modified from Chiken and Nambu (2013).
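The template subtraction method mentioned above can be illustrated with a minimal sketch. It is not the procedure of the cited studies: it simply assumes that the artifact has the same shape after every pulse, averages the post-stimulus segments to estimate that shape, and subtracts the average from each segment. The function name `subtract_artifact_template`, the window length, and the synthetic values in the usage note are all hypothetical choices made for illustration.

```python
import numpy as np

def subtract_artifact_template(signal, stim_samples, window):
    """Subtract the average post-stimulus waveform from every stimulus window.

    signal       : 1-D array holding the raw recording
    stim_samples : sample indices at which stimulus pulses were delivered
    window       : number of samples after each pulse treated as artifact
    Returns the cleaned recording and the estimated artifact template.
    """
    signal = np.asarray(signal, dtype=float)
    # Keep only pulses whose full window lies inside the recording.
    starts = [int(s) for s in stim_samples
              if 0 <= s and s + window <= signal.size]
    if not starts:
        raise ValueError("no stimulus window fits inside the recording")
    # One row per pulse; the artifact is time-locked to the pulse, whereas
    # spontaneous spikes are not and therefore largely average out.
    segments = np.stack([signal[s:s + window] for s in starts])
    template = segments.mean(axis=0)
    cleaned = signal.copy()
    for s in starts:
        cleaned[s:s + window] -= template
    return cleaned, template
```

For example, adding a synthetic decaying artifact to low-amplitude noise at regular pulse times and passing the result through `subtract_artifact_template` should return a template close to the synthetic artifact. One caveat of this simple form is that any activity strictly time-locked to the pulse is subtracted along with the artifact.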

Inhibitory effects of GPi-DBS on firing of the neighboring neurons were also reported (Boraud et al., 1996; Dostrovsky et al., 2000; Wu et al., 2001; McCairn and Turner, 2009). Complete inhibition of local neuronal firing was more commonly induced by GPi-DBS than by STN-DBS (**Figure 1A**). GPi-HFS at 100 Hz induced complete inhibition of 76% of neighboring neurons in normal monkeys (Chiken and Nambu, 2013), and the inhibition outlasted the stimulus period, sometimes over 100 ms after the end of stimulation. Similar post-train inhibition was also observed in Parkinsonian patients (Lafreniere-Roula et al., 2010).
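Firing-rate changes of the kind described above are typically read off a peristimulus time histogram (PSTH). The sketch below shows one way to build a PSTH with 1-ms bins and to express a rate change as a percentage reduction. The function names and the example rates are hypothetical; only the 1-ms bin width and the 77% figure come from the text.

```python
import numpy as np

def psth(spike_times_per_trial, t_start, t_stop, bin_width=0.001):
    """Trial-averaged firing rate (spikes/s) in fixed-width bins.

    spike_times_per_trial : list of arrays of spike times (s), aligned so
                            that stimulation occurs at t = 0 in every trial
    Returns the left edge of each bin and the rate in that bin.
    """
    n_bins = int(round((t_stop - t_start) / bin_width))
    edges = t_start + bin_width * np.arange(n_bins + 1)
    counts = np.zeros(n_bins)
    for spikes in spike_times_per_trial:
        counts += np.histogram(spikes, bins=edges)[0]
    # Dividing by (trials x bin width) converts counts into spikes/s.
    rate = counts / (len(spike_times_per_trial) * bin_width)
    return edges[:-1], rate

def percent_reduction(baseline_rate, test_rate):
    """Reduction of test_rate relative to baseline_rate, in percent."""
    return 100.0 * (baseline_rate - test_rate) / baseline_rate

# Hypothetical rates: a fall from 100 to 23 spikes/s is a 77% reduction.
print(percent_reduction(100.0, 23.0))
```

Comparing the mean of `rate` in a pre-stimulus window with its mean during and after the stimulus train gives the kind of summary quoted above, including how long inhibition outlasts the train.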

In contrast, multiphasic responses consisting of excitation and inhibition during GPi-HFS were recently observed in GPi neurons of Parkinsonian monkeys (Bar-Gad et al., 2004; Erez et al., 2009; McCairn and Turner, 2009) and dystonic hamsters (Leblois et al., 2010). The discrepant results may be due to differences in the stimulus parameters used in these experiments: larger axons are more easily activated by electrical stimulation than smaller ones (Ranck, 1975), and continuous repetitive stimulation might cause failure of postsynaptic events due to receptor desensitization and/or transmitter depletion (Wang and Kaczmarek, 1998; Zucker and Regehr, 2002). Such multiphasic responses may normalize abnormal firing, such as bursting and oscillatory activity in Parkinson's disease and dystonia, as described below.

## **MECHANISM OF INHIBITION**

Several possible mechanisms accounting for the inhibitory responses have been proposed, including depolarization block and inactivation of voltage-gated currents (Beurrier et al., 2001; Shin et al., 2007). However, these are less probable, because both single-pulse and low-frequency stimulation in the GPi evoked intense short-latency inhibition in neighboring neurons (Dostrovsky et al., 2000; Dostrovsky and Lozano, 2002; Chiken and Nambu, 2013). Another possible mechanism is that the inhibition is caused by activation of GABAergic afferents in the stimulated nucleus (Boraud et al., 1996; Dostrovsky et al., 2000; Dostrovsky and Lozano, 2002; Meissner et al., 2005; Johnson et al., 2008; Liu et al., 2008; Deniau et al., 2010). A recent study confirmed that inhibitory responses induced by GPi-HFS were mediated by GABA<sub>A</sub> and GABA<sub>B</sub> receptors (Chiken and Nambu, 2013; **Figure 1B**). GABAergic inhibition is strong and inhibits even spikes directly evoked by GPi stimulation, which are characterized by constant and short latency (**Figures 2A, B**; Chiken and Nambu, 2013).

The GPi receives excitatory glutamatergic inputs from the STN as well as inhibitory GABAergic inputs from the striatum and GPe (Smith et al., 1994; Shink and Smith, 1995). Afferent axon terminals from the STN are also activated by the stimulation, but the glutamatergic excitation is probably overwhelmed because of the predominance of GABAergic inputs in the GPi (Shink and Smith, 1995). On the other hand, many GPe neurons exhibited complex responses composed of both excitation and inhibition during GPe-HFS (Chiken and Nambu, 2013). The density of GPe terminals on GPi neurons is higher than that on GPe neurons (Shink and Smith, 1995), and the balance between GABAergic and glutamatergic inputs may explain the different effects of GPe-HFS and GPi-HFS. Similarly, STN-HFS stimulated both glutamatergic and GABAergic afferents and generated both excitatory and inhibitory post-synaptic potentials (EPSPs and IPSPs) in STN neurons (Lee et al., 2004). Thus, HFS activates afferent axons in the stimulated nucleus, and the effects vary depending on the composition of the inhibitory and excitatory axon terminals.

## **DEEP BRAIN STIMULATION (DBS) EXCITES LOCAL NEURONAL ELEMENTS**

It is reasonable to expect that local stimulation excites local neuronal elements. Indeed, directly evoked spikes, which are characterized by short and constant latency, are induced in GPi neurons by GPi-HFS (Johnson and McIntyre, 2008; McCairn and Turner, 2009). Such excitation may propagate through efferent projections. Thalamic activity was reduced during GPi-HFS through inhibitory GPi-thalamic projections in Parkinsonian monkeys (Anderson et al., 2003) and dystonia patients (Pralong et al., 2003; Montgomery, 2006). GPi activity was increased during STN-DBS through excitatory STN-GPi projections (Hashimoto et al., 2003; Galati et al., 2006; Reese et al., 2011). STN-DBS increased both glutamate and GABA levels in the substantia nigra pars reticulata (SNr) of normal rats in microdialysis studies (Windels et al., 2000; see also Windels et al., 2005). An intraoperative microdialysis study revealed that STN-DBS produced a significant increase in the extracellular concentration of cyclic guanosine monophosphate (cGMP) in the GPi (Stefani et al., 2005). Functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) studies in humans indicated that efferent outputs from the stimulated nucleus are excited during DBS (Jech et al., 2001; Hershey et al., 2003; Boertien et al., 2011). Changes in the firing rates and patterns of target nuclei may normalize abnormal firing, such as bursting and oscillatory activity, which is observed in the cortico-basal ganglia loop in Parkinson's disease and dystonia (Anderson et al., 2003; Hashimoto et al., 2003; Hammond et al., 2007; Johnson et al., 2008; Vitek, 2008; Deniau et al., 2010).

According to a modeling study (McIntyre et al., 2004), subthreshold HFS suppressed intrinsic firing in the cell bodies, while suprathreshold HFS generated efferent outputs at the stimulus frequency in the axon without representative activation of the cell bodies. Thus, although stimulation may fail to activate cell bodies of GPi neurons due to strong GABAergic inhibition, it can still excite the efferent axons and provide inhibitory inputs to the thalamus at the stimulus frequency.

DBS also antidromically excites afferent axons. Indeed, antidromic activation of GPi neurons induced by STN-DBS was observed in Parkinsonian monkeys (Moran et al., 2011), and antidromic activation of thalamic (Vop) neurons induced by GPi-DBS was observed in Parkinsonian patients (Montgomery, 2006). Low intensity STN-HFS induced GABAergic inhibition in the SNr through antidromic activation of GPe neurons projecting to both the STN and SNr (Maurice et al., 2003; see also Moran et al., 2011), whereas higher intensity stimulation induced glutamatergic excitation in the SNr through activation of STN-SNr projections. STN-HFS also activated motor cortical neurons antidromically and suppressed abnormal low frequency synchronization including beta band oscillation in Parkinsonian rats (Li et al., 2007, 2012; Degos et al., 2013). The recent development of optogenetics has enabled selective stimulation of afferent inputs or efferent outputs, and contributes to analyzing the mechanism underlying the effectiveness of DBS. A recent study has shown that selective stimulation of cortico-STN afferent axons can robustly ameliorate symptoms in Parkinsonian rats without activation of STN efferent axons (Gradinaru et al., 2009), suggesting that the therapeutic effects of STN-DBS may be accounted for exclusively by activation of cortico-STN afferent axons.

It is also probable that STN-DBS induces dopamine release through STN-SNc projections. STN-DBS induced dopamine release by activation of nigrostriatal dopaminergic neurons in rats (Meissner et al., 2003) and pigs (Shon et al., 2010); however, it did not increase the dopamine level of the striatum in human patients (Abosch et al., 2003; Hilker et al., 2003). DBS may also affect neurons whose axons pass near the stimulating site. A model-based study showed that clinically effective STN-DBS also activated the lenticular fasciculus, which is composed of GPi-thalamic fibers, in addition to STN neurons themselves (Miocinovic et al., 2006). Indeed, STN-DBS induced direct excitation of GPi neurons through activation of the lenticular fasciculus (Moran et al., 2011).

Participation of non-neuronal glial tissues should also be considered as one of the possible mechanisms underlying the effectiveness of DBS. DBS induced glutamate and adenosine triphosphate (ATP) release from astrocytes (Fellin et al., 2006; Tawfik et al., 2010). A recent study revealed that HFS applied to the thalamus induced an abrupt increase in extracellular ATP and adenosine (Bekar et al., 2008). Adenosine activation of A1 receptors depressed excitatory transmission in the thalamus, and alleviated tremor in a mouse model. Thus, it is possible that ATP and glutamate are released from astrocytes triggered by DBS and modulate neuronal activity in the stimulated nucleus (Vedam-Mai et al., 2012; Jantz and Watanabe, 2013).

## **DEEP BRAIN STIMULATION (DBS) DISRUPTS NEURONAL TRANSMISSION**

The striatum and STN are input stations of the basal ganglia and receive inputs from a wide area of the cerebral cortex (Mink, 1996; Nambu et al., 2002). The information is processed through the *hyperdirect*, *direct*, and *indirect* pathways and reaches the GPi/SNr, the output station of the basal ganglia (**Figure 3A**). During voluntary movements, neuronal signals originating in the cortex are considered to be transmitted through these pathways, reach the GPi/SNr and control movements (Mink, 1996; Nambu et al., 2002). Signal transmission through the *direct* pathway reduces GPi activity and facilitates movements by disinhibiting the thalamus, whereas the *hyperdirect* and *indirect* pathways increase GPi activity and suppress movements (Nambu et al., 2002; Nambu, 2007; Kravitz et al., 2010; Sano et al., 2013).
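The opposite effects of the three pathways follow from multiplying the signs of the synapses along each route. The sketch below makes that bookkeeping explicit. It is a deliberately crude illustration, not a model from the cited studies: each connection is reduced to +1 (excitatory) or -1 (inhibitory), and the sign table is an assumption based on the standard description of the circuit (glutamatergic cortical and STN projections; GABAergic striatal, GPe, and GPi projections).

```python
from math import prod

# +1 = excitatory (glutamatergic), -1 = inhibitory (GABAergic).
# Assumed textbook signs, used here for illustration only.
SYNAPSE_SIGN = {
    ("cortex", "striatum"): +1,
    ("cortex", "STN"): +1,
    ("striatum", "GPi"): -1,
    ("striatum", "GPe"): -1,
    ("GPe", "STN"): -1,
    ("STN", "GPi"): +1,
    ("GPi", "thalamus"): -1,
}

PATHWAYS = {
    "hyperdirect": ["cortex", "STN", "GPi"],
    "direct": ["cortex", "striatum", "GPi"],
    "indirect": ["cortex", "striatum", "GPe", "STN", "GPi"],
}

def net_sign(route):
    """Product of synaptic signs along a route: +1 excites, -1 inhibits."""
    return prod(SYNAPSE_SIGN[(a, b)] for a, b in zip(route, route[1:]))

def effect_on_thalamus(pathway):
    """Net sign of a cortical signal once it has passed through the GPi."""
    return net_sign(PATHWAYS[pathway] + ["thalamus"])

for name, route in PATHWAYS.items():
    print(name, "GPi:", net_sign(route), "thalamus:", effect_on_thalamus(name))
```

Under these assumptions the *direct* pathway has a net sign of -1 at the GPi and +1 at the thalamus (disinhibition), whereas the *hyperdirect* and *indirect* pathways have +1 at the GPi and -1 at the thalamus, which matches the description above.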

Chiken and Nambu (2013) recently examined responses of GPi neurons evoked by motor cortical stimulation during GPi-HFS in normal monkeys. In that study, both cortically evoked responses and spontaneous discharges were completely inhibited during GPi-HFS by strong GABAergic inhibition (**Figure 3B**), suggesting that GPi-HFS blocks information flow through the GPi. Since abnormal cortically evoked responses (Chiken et al., 2008; Kita and Kita, 2011; Nishibayashi et al., 2011) and abnormal bursts and oscillatory activity (Wichmann et al., 1994; Bergman et al., 1998; Starr et al., 2005; Brown, 2007; Chiken et al., 2008; Nishibayashi et al., 2011; Tachibana et al., 2011) were observed in GPi neurons in Parkinson's disease and dystonia, signal transmission of such abnormal activities to the thalamus and motor cortex would be responsible for motor symptoms. Thus, disruption of the abnormal information flow could suppress expression of motor symptoms. This mechanism may explain the paradox that GPi-DBS produces similar therapeutic effects to lesions of the GPi: both GPi-DBS and GPi-lesion interrupt abnormal information flow through the GPi.

STN-DBS may also interrupt neurotransmission of abnormal signals. Maurice et al. (2003) examined the effects of STN-DBS on cortically evoked responses of SNr neurons in normal rats. Cortically evoked early and late excitation was totally abolished during high intensity STN-HFS, and much reduced during low intensity STN-DBS, while cortically evoked inhibition was preserved (**Figure 4A**), suggesting that information flow through the trans-STN pathway was blocked by STN-DBS without interrupting other pathways. The response patterns of SNr neurons during STN-DBS are similar to those of GPi neurons during STN blockade by muscimol in normal monkeys (Nambu et al., 2000; **Figure 4B**). Thus, it is reasonable that STN-DBS has an effect similar to that of lesion or silencing of the STN. In Parkinson's disease, due to the loss of dopaminergic modulation, the information flow through the striato-GPi *direct* pathway is weakened, whereas the information flow through the striato-GPe *indirect* pathway is facilitated. Both STN-DBS and STN lesioning may alter the balance of inhibitory inputs through the *direct* pathway and excitatory inputs through the *hyperdirect* and *indirect* pathways to the GPi by disrupting information flow through the STN, and effectively alleviate the bradykinesia seen in Parkinson's disease. A similar idea, a functional disconnection of the stimulated elements, has also been proposed by other groups (Anderson et al., 2006; Deniau et al., 2010; Moran et al., 2011).

#### **CONCLUSION**

DBS has a variety of effects on neurons in the stimulated nucleus of the cortico-basal ganglia loop, through transmitter release, orthodromic activation of efferent axons, antidromic activation of afferent axons, and direct stimulation of axons passing near the stimulating electrode. The effects vary depending on the neural composition of the stimulated nucleus, and they extend much more widely than originally expected. However, a common mechanism may underlie the effectiveness of DBS: DBS dissociates inputs and outputs in the stimulated nucleus and disrupts abnormal information flow through the cortico-basal ganglia loop (**Figure 5**). This mechanism may explain the paradox that DBS produces therapeutic effects similar to those of lesions or silencing of the nucleus.

## **ACKNOWLEDGMENTS**

This work was supported by a Grant-in-Aid for Scientific Research (A) 21240039, the Core Research for Evolutional Science and Technology, Strategic Japanese-German Cooperative Programme, and Brain Machine Interface Development under the Strategic Research Program for Brain Sciences to Atsushi Nambu, and a Grant-in-Aid for Scientific Research (C) 25430021 to Satomi Chiken. We thank Kana Miyamoto, Shigeki Sato, Hitomi Isogai, and Keiko Matsuzawa for technical assistance.

#### **REFERENCES**


Zucker, R. S., and Regehr, W. G. (2002). Short-term synaptic plasticity. *Annu. Rev. Physiol.* 64, 355–405. doi: 10.1146/annurev.physiol.64.092501.114547

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 August 2013; accepted: 19 February 2014; published online: 14 March 2014.*

*Citation: Chiken S and Nambu A (2014) Disrupting neuronal transmission: mechanism of DBS? Front. Syst. Neurosci. 8:33. doi: 10.3389/fnsys.2014.00033 This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Chiken and Nambu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Cognitive-motor interactions of the basal ganglia in development

## *Gerry Leisman<sup>1,2,3,4</sup>\*, Orit Braun-Benjamin<sup>1,2</sup> and Robert Melillo<sup>1,3,5</sup>*

*<sup>1</sup> The National Institute for Brain and Rehabilitation Sciences, Nazareth, Israel*

*<sup>2</sup> Department of Mechanical Engineering, ORT-Braude College of Engineering, Karmiel, Israel*

*<sup>3</sup> F.R. Carrick Institute for Clinical Ergonomics, Rehabilitation, and Applied Neurosciences, Hauppauge, NY, USA*

*<sup>4</sup> Facultad Manuel Fajardo, Institute for Neurology and Neurosurgery, Universidad de Ciencias Médicas de la Habana, Habana, Cuba*

*<sup>5</sup> Nazareth Academic Institute, Nazareth, Israel*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Calixto Machado, Institute of Neurology and Neurosurgery, Cuba*

*Jorge L. Morales-Quezada, Laboratory of Neuromodulation, USA*

#### *\*Correspondence:*

*Gerry Leisman, The National Institute for Brain and Rehabilitation Sciences, ORT-Braude College of Engineering, 51 Snunit, PO Box 78, Karmiel 21982, Israel e-mail: gerry.leisman@staff.nazareth.ac.il*

Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, language comprehension, and other cognitive functions associated with frontal lobes. The basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human reasoning and adaptive function. The basal ganglia are key elements in the control of reward-based learning, sequencing, discrete elements that constitute a complete motor act, and cognitive function. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. We know that the relation between the basal ganglia and the cerebral cortical region allows for connections organized into discrete circuits. Rather than serving as a means for widespread cortical areas to gain access to the motor system, these loops reciprocally interconnect a large and diverse set of cerebral cortical areas with the basal ganglia. Neuronal activity within the basal ganglia associated with motor areas of the cerebral cortex is highly correlated with parameters of movement. Neuronal activity within the basal ganglia and cerebellar loops associated with the prefrontal cortex is related to the aspects of cognitive function. Thus, individual loops appear to be involved in distinct behavioral functions. Damage to the basal ganglia of circuits with motor areas of the cortex leads to motor symptoms, whereas damage to the subcortical components of circuits with non-motor areas of the cortex causes higher-order deficits. 
In this report, we review some of the anatomic, physiologic, and behavioral findings that have contributed to a reappraisal of function concerning the basal ganglia and cerebellar loops with the cerebral cortex and apply it in clinical applications to attention deficit/hyperactivity disorder (ADHD) with biomechanics and a discussion of retention of primitive reflexes being highly associated with the condition.

**Keywords: basal ganglia, frontal lobe, cognition, autism, ADHD, posture**

#### **BASAL GANGLIA AND COGNITIVE FUNCTION**

## **ORGANIZATION OF THE BASAL GANGLIA FOR COGNITION AND MOTOR FUNCTION**

It is known that the basal ganglia interact closely with the frontal cortex (Alexander et al., 1986) and that damage to the basal ganglia can produce many of the same cognitive impairments as damage to the frontal cortex (Brown and Marsden, 1990; Brown et al., 1997; Middleton and Strick, 2000; Leisman and Melillo, 2012; Leisman et al., 2013). This close relationship raises many questions regarding the cognitive role of the basal ganglia and how it can be differentiated from that of the frontal cortex itself. Are the basal ganglia and frontal cortex just two undifferentiated pieces of a larger system? Do the basal ganglia and the frontal cortex perform essentially the same function but operate on different domains of information processing? Are the basal ganglia an evolutionary predecessor to the frontal cortex, with the frontal cortex performing a more sophisticated version of the same function?

The basal ganglia are part of a neuronal system that includes the thalamus, the cerebellum, and the frontal lobes. Like the cerebellum, the basal ganglia were previously thought to be primarily involved in motor control. However, the role of the basal ganglia in both motor and cognitive functions has now been well established (Alexander et al., 1986; Middleton and Strick, 2000; Thorn et al., 2010; Leisman and Melillo, 2012; Leisman et al., 2013).

The basal ganglia surround the diencephalon and are made up of five subcortical nuclei (represented in **Figure 1**): globus pallidus, caudate, putamen, substantia nigra, and the subthalamic nucleus (STN) of Luys. The basal ganglia are thought to have expanded during the course of evolution and are therefore divided into the neostriatum and paleostriatum. The paleostriatum consists primarily of the globus pallidus, which is derived embryologically from the diencephalon. During the course of its development, it further divides into two distinct areas: the external and internal segments of the globus pallidus.

The neostriatum is made up of two nuclei: the caudate and the putamen. These two nuclei are fused anteriorly and are collectively known as the striatum. They are the input nuclei of the basal ganglia and they are derived embryologically from the telencephalon. The STN of Luys lies inferiorly to the thalamus at the junction of the diencephalon and the mesencephalon or midbrain. The substantia nigra lies inferiorly to the thalamus and has two zones similar to the globus pallidus. A ventral pale zone called the pars reticulata exists as well as a dorsal darkly pigmented zone called the pars compacta.

The pars compacta contains dopaminergic neurons that contain the pigment neuromelanin. The globus pallidus internum and the pars reticulata of the substantia nigra are the major output nuclei of the basal ganglia. They are similar in cytology, connectivity, and function, and can be considered to be a single structure divided by the internal capsule. Their relationship is similar to that of the caudate and the putamen. The basal ganglia are part of the extrapyramidal motor system as opposed to the pyramidal motor system that originates from the sensorimotor cerebral cortex. The pyramidal motor system is responsible for all voluntary motor activities, except for eye movement. The extrapyramidal system modifies motor control and is thought to be involved with higher-order cognitive aspects of motor control as well as in the planning and execution of complex motor strategies and the voluntary control of eye movements. There are two major pathways in the basal ganglia: the direct pathways that promote movement and the indirect pathways that inhibit movement (cf. Melillo and Leisman, 2009).

The basal ganglia receive afferent input from the entire cerebral cortex but especially from the frontal lobes. Almost all afferent connections to the basal ganglia terminate in the neo-striatum (caudate and putamen). The neo-striatum receives afferent input from two major sources outside of the basal ganglia: the cerebral cortex (cortico-striatal projections) and the intra-laminar nucleus of the thalamus. The cortico-striatal projections contain topographically organized fibers originating from the entire cerebral cortex. An important component of that input comes from the centro-median nucleus and terminates in the putamen. Because the motor cortex of the frontal lobes projects to the centromedian nucleus, this may be an additional pathway by which the motor cortex can influence the basal ganglia. The putamen appears to be primarily concerned with motor control, whereas the caudate appears to be involved in the control of eye movements and certain cognitive functions. The ventral striatum is related to limbic function and therefore may affect autonomic and emotional functions.

The major output of the basal ganglia arises from the internal segment of the globus pallidus and the pars reticulata of the substantia nigra. These nuclei project in turn to three nuclei in the thalamus: the ventral lateral nuclei, the ventral anterior nuclei, and the mesio-dorsal nuclei. The internal segment of the globus pallidus also projects to the centro-median nucleus of the thalamus. Striatal neurons may be involved with gating incoming sensory input to higher motor areas such as the intra-laminar thalamic nuclei and premotor cortex that arise from several modalities to coordinate behavioral responses. These different modalities may contribute to the perception of sensory input (Middleton and Strick, 2000) leading to motor response. The output of the basal ganglia is directed, in a way similar to that of the cerebellum, to premotor and motor cortices as well as the prefrontal cortex of the frontal lobes.

Experiments where herpes simplex virus-1 was administered into the dorsolateral prefrontal cortex of monkeys to determine its axonal spread or connection labeled the ipsilateral neurons in the internal segments of the globus pallidus and the contralateral dentate nucleus of the cerebellum (Chudler and Dong, 1995). It is therefore thought that this may show a role of both the cerebellum and the basal ganglia in higher cognitive functions associated with the prefrontal cortex. This would also substantiate a cortico-striato-thalamo-cortical loop, which would have a cognitive rather than a motor function, as exemplified in **Figure 2**.

The substantia nigra is also thought to connect to the superior colliculus through non-dopaminergic axons, which form an essential link in voluntary eye movement. It is thought that normal basal ganglia function results from a balance of the direct and indirect striatal output pathways and that the differential involvement of these pathways accounts for the hyperkinesia or hypokinesia observed in disorders of the basal ganglia (Middleton and Strick, 1994). Hyperkinesia is a disinhibition or increase in spontaneous movement (tics and tremors). It is thought that hypokinesia and hyperkinesia may relate to hypoactive behavior and hyperactive behavior associated with subcortical hypostimulation or hyperstimulation of medial and orbitofrontal cortical circuits (Vitek and Giroux, 2000). It is important to review these connections further to understand the role of the basal ganglia in the control of cognitive function.

Five fronto-subcortical circuits unite regions of the frontal lobe (the supplementary motor area; frontal eye fields; and dorsolateral prefrontal, orbitofrontal, and anterior cingulate cortices) with the striatum, the globus pallidus, and the thalamus in functional systems that mediate volitional motor activity, saccadic eye movements, executive functions, social behavior, and motivation (Litvan et al., 1998; Vitek and Giroux, 2000).

In general then, there exist a number of cortical loops through the basal ganglia that involve prefrontal association cortex and limbic cortex. Through these loops, the basal ganglia are thought to play a role in cognitive function that is similar to their role in motor control. That is, the basal ganglia are involved in selecting and enabling various cognitive, executive, or emotional programs that are stored in these other cortical areas. Moreover, the basal ganglia appear to be involved in certain types of learning. For example, in rodents the striatum is necessary for the animal to learn certain stimulus-response tasks (e.g., make a right turn if stimulus A is present and make a left turn if stimulus B is present). Recordings from rat striatal neurons show that early in training, striatal neurons fire at many locations while a rat learns in a T-shaped maze. This suggests that initially the striatum is involved throughout the execution of the task. As the animal learns the task and becomes exceedingly good at its performance, the striatal neurons change their activity patterns, firing only at the beginning of the trial and at the end. It appears that the learned programs to solve this task are now stored elsewhere; the firing of the striatal neurons at the beginning of the maze presumably reflects the enabling of the appropriate motor/cognitive plan in the cortex, and the firing at the end of the maze is presumably involved in evaluating the reward outcome of the trial.

Some circuits in the basal ganglia are involved in non-motor aspects of behavior. These circuits originate in the prefrontal and limbic regions of the cortex and engage specific areas of the striatum, pallidum, and substantia nigra. The dorsolateral prefrontal circuit originates in Brodmann's areas 9 and 10 and projects to the head of the caudate nucleus, which then projects directly and indirectly to the dorsomedial portion of the internal pallidal segment and the rostral substantia nigra pars reticulata. Projections from these regions terminate in the ventral anterior and medial dorsal thalamic nuclei, which in turn project back upon the dorsolateral prefrontal area. The dorsolateral prefrontal circuit has been implicated broadly in so-called "executive functions." These include cognitive tasks such as organizing behavioral responses and using verbal skills in problem solving. Damage to the dorsolateral prefrontal cortex or subcortical portions of the circuit is associated with a variety of behavioral abnormalities related to these cognitive functions.

The lateral orbitofrontal circuit arises in the lateral prefrontal cortex and projects to the ventromedial caudate nucleus. The pathway from the caudate nucleus follows that of the dorsolateral circuit (through the internal pallidal segment and substantia nigra pars reticulata and thence to the thalamus) and returns to the orbitofrontal cortex. The lateral orbitofrontal cortex appears to play a major role in mediating empathetic and socially appropriate responses. Damage to this area is associated with irritability, emotional lability, failure to respond to social cues, and lack of empathy. A neuro-psychiatric disorder thought to be associated with disturbances in the lateral orbitofrontal cortex and circuit is obsessive-compulsive disorder.

The anterior cingulate circuit arises in the anterior cingulate gyrus and projects to the ventral striatum. The ventral striatum also receives "limbic" input from the hippocampus, amygdala, and entorhinal cortices. The projections of the ventral striatum are directed to the ventral and rostromedial pallidum and the rostrodorsal substantia nigra pars reticulata. From there the pathway continues to neurons in the paramedian portion of the medial dorsal nucleus of the thalamus, which in turn project back upon the anterior cingulate cortex. The anterior cingulate circuit appears to play an important role in motivated behavior, and it may convey reinforcing stimuli to diffuse areas of the basal ganglia and cortex via inputs through the ventral tegmental area and the substantia nigra pars compacta (SNpc). These inputs may play a major role in procedural learning. Damage to the anterior cingulate region bilaterally can cause akinetic mutism, a condition characterized by profound impairment of movement initiation.

In general, the disorders associated with dysfunction of the prefrontal cortex and cortico-basal ganglia-thalamo-cortical circuits involve action rather than perception or sensation. These disturbances are associated with both intensified action (impulsivity) and flattened action (apathy). Obsessive-compulsive behavior can be viewed as a form of hyperactivity. The disturbances of mood associated with circuit dysfunction are believed to span the extremes of mania and depression. Both dopamine and serotonin, two biogenic amines that modulate neuronal activity within the circuits, are important to depression (Leisman and Melillo, 2013; Leisman et al., 2013).

These observations suggest that the neural mechanisms underlying complex behavioral disorders might be analogous to the dysfunctions of motor circuits. Thus, schizophrenia might be viewed as a "Parkinson disease of thought." By this analogy, schizophrenic symptoms would arise from disordered modulation of prefrontal circuits. Other cognitive and emotional symptoms may similarly be equivalents of motor disturbances such as tremor, dyskinesia, and rigidity.

In humans, the basal ganglia appear to be necessary for certain forms of implicit memory tasks. Like motor habit learning, many types of cognitive learning require repeated trials and are often unconscious. An example is probabilistic classification (**Figure 4**). In this type of task, people have to learn to classify objects based on the probability of belonging to a class, rather than on any explicit rule. In one experiment, subjects were shown a deck of cards with different symbols. Each symbol was associated with a certain probability of predicting rain or sunshine, and the subjects had to say on each trial whether the symbol was a predictor of rain or sunshine. Because the same symbol sometimes predicted sunshine and other times predicted rain, the subjects could not devise a simple rule, and they made many errors at first. Over time, however, they began to get better at classifying the symbols appropriately, although they still often claimed to be guessing. Patients with basal ganglia disorders were impaired at this task, suggesting that the processing of the cognitive loops of the basal ganglia is somehow involved in our ability to subconsciously learn the probabilities of predicted outcomes associated with particular stimuli.
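The logic of a probabilistic classification task can be illustrated with a minimal simulation. The sketch below is not the procedure or model used in the experiment described above: the cue names, the cue-outcome probabilities, the learning rate, and the simple delta-rule learner are all assumptions chosen for illustration. It shows only how trial-by-trial feedback can gradually yield mostly appropriate classifications even though no explicit rule is ever available.

```python
import random

# Hypothetical P(rain | cue) values; illustrative only, not taken from any study.
CUE_RAIN_PROB = {"square": 0.8, "circle": 0.6, "triangle": 0.4, "diamond": 0.2}


def run_task(n_trials=2000, learning_rate=0.02, seed=0):
    """Simulate one learner on a weather-prediction style task.

    Returns (flags, estimate): flags[i] is 1 if the choice on trial i matched
    the more probable outcome for that cue, and estimate maps each cue to the
    learner's final estimate of P(rain | cue).
    """
    rng = random.Random(seed)
    cues = sorted(CUE_RAIN_PROB)
    estimate = {cue: 0.5 for cue in cues}  # start with no preference
    flags = []
    for _ in range(n_trials):
        cue = rng.choice(cues)
        p_rain = CUE_RAIN_PROB[cue]
        # Choose the outcome currently believed more likely; guess on a tie.
        if estimate[cue] == 0.5:
            guess_rain = rng.random() < 0.5
        else:
            guess_rain = estimate[cue] > 0.5
        flags.append(int(guess_rain == (p_rain > 0.5)))
        # Probabilistic feedback: the same cue is sometimes followed by rain
        # and sometimes by sunshine.
        outcome = 1.0 if rng.random() < p_rain else 0.0
        # Delta rule: nudge the estimate toward the observed outcome.
        estimate[cue] += learning_rate * (outcome - estimate[cue])
    return flags, estimate


flags, estimate = run_task()
print("fraction of appropriate choices, first 100 trials:", sum(flags[:100]) / 100)
print("fraction of appropriate choices, last 500 trials:", sum(flags[-500:]) / 500)
```

Because the estimates are updated incrementally from noisy feedback, the learner in this sketch never holds a rule it could state; it simply accumulates outcome statistics for each cue.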

## **DEVELOPMENTAL MOTOR MILESTONES AND COGNITIVE FUNCTION**

#### **INHIBITION AND DISINHIBITION**

It has long been known that individuals who are markedly late in achieving developmental milestones are at high risk for subsequent cognitive impairment (von Wendt et al., 1984; Melillo, 2011). The mechanisms underlying infant motor and adult cognitive associations remain poorly characterized. One possibility is that the neural systems that subserve motor development in infancy also contribute to the development and operation of specific cognitive processes later in life. Factors related to efficiencies in such systems may be reflected in both rapid motor development early in life and subsequently in improved cognitive functions (Murray et al., 2006; Ridler et al., 2006). However, a number of questions remain concerning the specificity of associations between infant development and later cognitive functions, which, if they could be answered, could shed light on the reasons behind the associations. For example, is the effect confined to infant motor development, or does it also apply to other developmental domains, such as language? Is the effect confined to specific domains of cognition (e.g., executive function), or does it also apply to general intellectual function? Murray et al. (2007) examined these questions in a large British general population birth cohort in which measurements were available for development in language and motor domains in infancy, general intellectual function in childhood and adolescence, and specific neuropsychological function (e.g., verbal fluency, a test of executive/frontal lobe function) in adulthood. These authors (Murray et al., 2006) noted that faster attainment of motor developmental milestones is related to better adult cognitive performance in some domains, such as executive function.

The developing infant is concerned with navigating to items of interest and exploring the environment, ultimately to develop a sense of self, independent of the environment that he or she is navigating. The central idea of the mechanism being advocated concerns the influence on a proceeding (or currently planned) muscular act. That influence stems from motivation-triggered anticipation of the act's outcome, and it is conjectured to prevail only if "consciousness" is present.

Because motivation relates to the self, while an act's consequences can include environmental components, consciousness is seen as lying at the operational interface between body movement and the body's surroundings. Anticipation is mediated by specific anatomical features, the independent functioning of which underlies thought simulation of the body's (sometimes passive) transactions with its milieu. Only through those anatomical attributes can an individual possess consciousness.

When a child attempts its first step, prior attainment of the balanced upright position will have involved failed attempts, with attendant pain. What leads to discomfort will have been stored as memory of possible sensory feedback resulting from certain self-paced movements. Likewise, the fact that specific muscular movements can achieve forward motion will already be part of a repertoire accessible unconsciously. Ultimately, the child hits upon the correct combination and timing of elemental movements and the first successful step is taken. That consolidation into a more complex motor pattern is temporarily deposited in explicit memory (Squire, 1992), and subsequently transferred to long-term implicit memory (Schacter et al., 1990), probably during the frequent periods of sleep (Leisman, 2013a,b), characteristic in infancy. Soon, the toddler is able to walk without concentrating on every step, and more complicated foot-related scenarios will enjoy brief sojourns at the center of the explicit stage.

The system conjures up a simulated probable outcome of the intended motor pattern, and vetoes it if the prognosis is adverse. The simulated outcome lies below the threshold for actual movement, and the mimicking requires two-way interaction between the nervous system and the spindles (Matthews, 1972, 1982; Proske et al., 2000) associated with the skeletal musculature, particularly when the muscles are already in the process of doing something else. The interplay provides the basis of sensation, this always being in the service of anticipation.

The bottleneck in sensory processing (Broadbent, 1958) arises because planning of movement is forced to avoid potential conflict between the individual muscles. Because we learn about the world only through our actual or simulated muscular movements, this is postulated to produce the unity of conscious experience. Intelligence then becomes a measure of the facility for consolidating elementary movements (overt or covert) into more complex motor patterns, while creativity is the capacity for probing novel consolidations of motor responses.

We can think without acting, act without thinking, act while thinking about that act, and act while thinking about something else. Our acts can be composite, several muscular patterns being activated concurrently, though we appear not to be able to simultaneously maintain two streams of thought. When we think about one thing while doing something else, it is always our thoughts that are the focus of attention. This suggests that there are at least two thresholds, the higher associated with overt movement and the lower with thought. Assuming that the signals underlying competing potential thoughts must race each other to a threshold (Carpenter et al., 1999), it may be highly significant that cortical and thalamic projections form no strong loops (Crick and Koch, 1998). As mentioned earlier, the presence of strong loops could make overt movement too automatic. We can now add a second possible penalty: thoughts might otherwise establish themselves by default. One should note that overt movement and mere imagery (that is, covert preparation for movement) appear to involve identical areas (Jeannerod, 1999).

The competition (Posner and Rothbart, 1998; Leisman et al., 2012; Leisman and Melillo, 2013) is played out in a group of collectively functioning components, these being the sensory, motor and anterior cingulate areas of the cortex, the thalamic ILN (in conjunction with the NRT), the amygdala, and the striatum. In mammals, the latter has a heterogeneous structure (Graybiel and Ragsdale, 1978) in which the continuous matrix is inter-digitated with the isolated striosomes. The input to the striosomes appears to be more intimately connected to the components just identified. Given that their output reaches M1, via the GPi, whereas the matrix output does not, it seems that the striosomes may be more essentially related to consciousness and much like the individual motor elements of the infant (Leisman et al., 2012). Likewise, the pars intermedia seems to be the more intimately consciousness-related part of the cerebellum, because it has analogous projections. The threshold for overt movement may be exceeded only when both feeding components are dispatching signals concurrently. The matrix, conversely, appears to serve already-established motor patterns, because its output ultimately reaches the PMA/SMA and the prefrontal area. Its cerebellar partner is clearly the hemispherical region. It is worth noting that the cerebellar hemispheres are particularly prominent in the primates, and that they are preeminent in humans; they appear to bear much of the responsibility for making us what we are.

The focus of competition for attention appears to be the PMA/SMA, because it receives from all the thalamic nuclei handling BG/Cb output. More remote regions of the system, which feed signals to those BG/Cb components, influence attention. The inferior olive seems to play a complementary role for the Cb, sending signals through the climbing fibers when something unexpected occurs (De Zeeuw et al., 1998) and, because LTD will not yet have had time to develop for this novel situation, disturbing the permissive effect of the disinhibition. The periodic shifting of attention, as when we simultaneously converse (or merely think) and drive in a busy thoroughfare, must make considerable demands on the putative differential clutch mechanism and this could be the dual responsibility of the SNpc and the sub-thalamic nucleus, which appear to serve as gain controls for the striosome-related and matrix-related routes, respectively. This situation is exemplified by our ability to think of one thing while overtly doing another.

Thoughts, according to this scheme, are merely simulated interactions with the environment, and their ultimate function is the addition of new implicit memories, new standard routes from sensory input to permitted motor output or new optimized complex reflexes. For a given set of synaptic couplings between PMA/SMA and M1, a specific pattern of output signals from the former will produce a specific sequence of muscular movements. Efference copies of those output signals, dispatched through axon collaterals, will carry the full information sent to the muscles, via M1, but they will not directly produce movement because their target neurons are not immediately concerned with motor output. Those efference-copy signals may be above the threshold for thought, however, and the latter will thus be subtly tied to a pattern of motor output. The duality of routes, and the fact that these overlap in the PMA/SMA region, could well underlie the interplay between explicit and implicit in brain function.

A major problem confronting those who would explain consciousness is its apparently multifarious nature and the attendant difficulty in an effective operational definition. We attach great significance to the provision of context-specific reflexes, as occurs when one is learning to walk.

At the largest scale, one can see a number of parallel loops from the frontal cortex to the striatum to the globus pallidus internal segment (GPi) or substantia nigra pars reticulata (SNr) and then on to the thalamus, finally projecting back up to the frontal cortex (Alexander et al., 1986). Both the frontal cortex and the striatum also receive inputs from various areas of the posterior/sensory cortex. The critical point is that striatal projections to the GPi/SNr and from the GPi/SNr to the thalamus are inhibitory. Furthermore, the GPi/SNr neurons are tonically active, meaning that in the absence of any other activity, the thalamic neurons are inhibited by constant firing of GPi/SNr neurons. Therefore, when the striatal neurons fire, they serve to disinhibit the thalamic neurons (Chevalier and Deniau, 1990). This disinhibition produces a gating function: it enables other functions to take place but does not directly cause them to occur, as a direct excitatory connection would. The activation of striatal neurons thus enables, but does not directly cause, subsequent motor movements. A schematic of these inhibitory-disinhibitory functions may be seen in **Figure 3**.
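The gating logic of disinhibition can be made concrete with a toy rate model. The sketch below is an illustration only: the function names, the rectified-linear units, the unit weights, and the numerical rates are assumptions, not measured values or a model taken from the cited work. It captures just the sign structure described above, in which the striatum inhibits a tonically active GPi/SNr, which in turn inhibits the thalamus.

```python
def relu(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)


def thalamic_output(striatal_rate, cortical_drive, gpi_tonic=1.0, w_inhib=1.0):
    """Toy rate model of the striatum -> GPi/SNr -> thalamus chain.

    All parameters are hypothetical and dimensionless.
    """
    # The striatum inhibits the tonically active GPi/SNr.
    gpi_rate = relu(gpi_tonic - w_inhib * striatal_rate)
    # GPi/SNr inhibits the thalamus; cortical drive is the content being gated.
    return relu(cortical_drive - w_inhib * gpi_rate)


# Striatum silent: tonic GPi/SNr firing keeps the thalamus shut.
print(thalamic_output(striatal_rate=0.0, cortical_drive=0.8))  # 0.0
# Striatum active: GPi/SNr is silenced and the cortical drive passes through.
print(thalamic_output(striatal_rate=1.0, cortical_drive=0.8))  # 0.8
# Striatum active but nothing to relay: the gate is open, yet no output occurs.
print(thalamic_output(striatal_rate=1.0, cortical_drive=0.0))  # 0.0
```

The third case is the point of the gating idea: in this sketch striatal firing alone produces no thalamic output, so it enables rather than causes the downstream response.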

#### **ANATOMICAL CONSTRAINTS**

Here we will discuss the implications of a few important anatomical properties of the basal-ganglia-frontal-cortex system. A strong constraint on understanding basal ganglia function comes from the fact that the GPi and SNr have a relatively small number of neurons. There are approximately 111 million neurons in the human striatum (Fox and Rafols, 1976), whereas there are only 160,000 in the GPi (Lange et al., 1976) and a similar number in the SNr. This means that whatever information is encoded by striatal neurons must be vastly compressed or eliminated on its way up to the frontal cortex. This constraint coincides nicely with the gating hypothesis: The basal ganglia do not need to convey detailed content information to the frontal cortex; instead, they simply need to tell different regions of the frontal cortex when to update. As we noted in the context of motor control, damage to the basal ganglia appears to affect initiation, but not the details of execution, of motor movements—presumably, not that many neurons are needed to encode this gating or initiation information.

Another constraint to consider concerns the number of different sub-regions of the frontal cortex for which the basal ganglia can plausibly provide separate gating control. Crude estimates suggest that gating occurs at a relatively fine-grained level. Fine-grained gating is important for mitigating conflicts where two representations require separate gating control and yet fall within one gating region. The number of neurons in the GPi/SNr provides an upper limit estimate, which is roughly 320,000 in the human. This suggests that the gating signal operates on a region of frontal neurons, instead of individually controlling specific neurons.

An interesting possible candidate for the regions of the frontal cortex that are independently controlled by the basal ganglia is a set of distinctive anatomical structures consisting of interconnected groups of neurons, called stripes (Pucak et al., 1996). It is plausible that each stripe or cluster of stripes constitutes a separately controlled group of neurons; each stripe can be separately updated by the basal ganglia system.
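The degree of compression implied by these counts can be worked out directly. The short calculation below uses only the neuron counts quoted in the text and assumes, as a simplification, that the SNr count equals the GPi count (the text says only "a similar number").

```python
STRIATAL_NEURONS = 111_000_000  # human striatum (Fox and Rafols, 1976)
GPI_NEURONS = 160_000           # human GPi (Lange et al., 1976)
SNR_NEURONS = 160_000           # assumption: taken as equal to the GPi count

output_neurons = GPI_NEURONS + SNR_NEURONS
convergence = STRIATAL_NEURONS / output_neurons

print(f"GPi/SNr output neurons: {output_neurons:,}")               # 320,000
print(f"striatal neurons per output neuron: {convergence:.1f}")    # 346.9
```

On these figures, roughly 350 striatal neurons converge on each output neuron, which is the sense in which striatal information must be compressed or discarded on its way to the frontal cortex.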

Anatomical constraints are consistent with the selective gating hypothesis by suggesting that the basal ganglia interact with a large number of distinct regions of the frontal cortex. We hypothesize that these distinct stripe structures constitute separately gated collections of frontal neurons, extending the parallel loops concept of Alexander et al. (1986) to a much finer grained level (Beiser and Houk, 1998). Thus, it is possible to maintain some information in one set of stripes, while selectively updating other stripes.
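Selective gating of this kind can be sketched as a simple update rule over a set of stripes. The example below is a schematic under stated assumptions: the function name, the representation of each stripe as a single stored item, and the Boolean gate vector are hypothetical conveniences, not a description of the actual circuitry.

```python
def update_stripes(memory, new_input, gate):
    """Return the stripe contents after one gating step.

    memory    -- current contents, one item per stripe
    new_input -- candidate contents arriving at each stripe
    gate      -- True where the (hypothetical) basal ganglia signal opens
                 the gate; gated stripes update, all others hold their contents
    """
    if not (len(memory) == len(new_input) == len(gate)):
        raise ValueError("memory, new_input and gate must have equal length")
    return [new if is_open else old
            for old, new, is_open in zip(memory, new_input, gate)]


stripes = ["A", "B", "C"]
# Only the second stripe is gated, so "A" and "C" are maintained.
stripes = update_stripes(stripes, ["X", "Y", "Z"], [False, True, False])
print(stripes)  # ['A', 'Y', 'C']
```

In this sketch the gate carries no content of its own; it specifies only which stripes update, which matches the idea that few output neurons are needed to signal when to update rather than what to store.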

#### **THE DEVELOPMENT OF INHIBITION AND DISINHIBITION OF PRIMITIVE REFLEXES: MOTOR AND COGNITIVE EFFECTS**

The influence of primitive reflex development on both motor and cognitive function has been more extensively reviewed elsewhere (Melillo, 2011). A correlation has been shown between retained primitive reflexes and delayed motor development in very low birth weight (VLBW) infants (Marquis et al., 1984). The authors noted that VLBW infants retained stronger primitive reflexes and exhibited a significantly higher incidence of motor delays than did full-term infants. They confirmed a high incidence of motor delays among VLBW infants and demonstrated a clear association between retained reflexes and delayed motor development in VLBW infants. It is important to note that this was in the absence of any overt pathology in the brains of these children.

In another study (Burns et al., 2004), the relationship between motor and cognitive development at 1 and 4 years was studied in extremely low birth weight infants. The authors observed a relationship between motor ability and cognitive performance. Their study investigated the association between movement and cognitive performance at 1 and 4 years corrected age in children born weighing less than 1000 g, and whether developmental testing of movement at 1 year was predictive of cognitive performance at 4 years. Motor assessment at both ages was performed using the neurosensory motor developmental assessment (NSMDA). Cognitive performance was assessed on the Griffith Mental Developmental Scale at 1 year and the McCarthy Scales of Children's Abilities at 4 years. A significant association was found between NSMDA group classification at 1 year and cognitive performance at both 1 and 4 years, and between the subscales of each test. They also noted that group classification of motor development at 1 year was predictive of cognitive performance at 4 years, and that this was independent of biological and social factors and the presence of cerebral palsy.

In yet another study (Dutia et al., 1981), the relationship between a normal intact cerebellum and primitive reflexes was examined. Tonic labyrinth and neck reflexes were studied separately and in combination in the decerebrate cat before and after acute cerebellectomy. The investigators noted clear changes in these reflexes both before and after surgery. They concluded that the presence of the cerebellum is required for the occurrence of the normal asymmetric labyrinth reflexes. Decreased size and immaturity as well as dysfunction of the cerebellum and the inferior olive are seen in almost all children with neurobehavioral disorders, and these factors are thought to play a critical role in the development of normal coordination and synchronization of the motor system and the brain (Melillo, 2011; Leisman et al., 2013).

Romeo et al. (2009) examined the relationship between the acquisition of a postural reflex, the forward parachute reaction (FPR), and the age of acquisition of independent walking. They noted that most of the infants they examined had a two-step development pattern. The infants at first showed an incomplete and then a complete FPR, which was observed more frequently at nine months. An incomplete FPR only, without successive maturation to a complete FPR, was present in 21% of the whole sample. Infants with a complete FPR walked at a median age of 13 months, whereas those with only an incomplete FPR walked at a median age of 14 months. The investigators observed, in those with the incomplete pattern, a trend toward delayed acquisition of independent walking.

Teitelbaum et al. (2002) hypothesized that movement disturbances in infants can be interpreted as "reflexes gone astray" and may be early indicators of autism. They noted that in the children they reviewed, some reflexes persisted too long in infancy, whereas others first appeared much later than they should. The asymmetric tonic neck reflex is one reflex that they noted may persist too long in autism. Head verticalization in response to body tilt, they noted, is a reflex that does not appear when it should in a subgroup of "autistic-to-be" infants. They suggested that these reflexes may be used by pediatricians to screen for neurological dysfunction that may be a marker for autism.

#### **BASAL GANGLIA, ATTENTION, AND COGNITIVE FUNCTION**

Although there exist numerous definitions of intelligence beyond one's ability to perform on intelligence tests, in the context of our present discussion, it is possible to define intelligence operationally as "*the ability to consolidate already-learned motor patterns into more complex composites, such consolidation sometimes being merely covert, rather than overt*." This definition was discussed in the context of autism (Cotterill, 1998; Melillo and Leisman, 2009). A normal child, lying on its back and wanting to roll over onto its front, soon learns that this can be readily accomplished if first the head, then the shoulders, and finally the hips are swiveled in the same direction. If the timing of this sequence is correct, the supine-prone transition requires a minimum of effort. Autistic infants appear to experience considerable difficulty in learning this simple motor sequence. Indeed, the sequence does not even occur in their failed attempts. Instead, they awkwardly arch their backs and ultimately fall into the desired position.

When a new motor pattern is being acquired, both the means and the ends will be coded in currently active patterns of neuronal signals. There must be interactions between these patterns because the goal will influence the route through muscular hyperspace by which it is to be achieved. The PFC probably dictates patterns of elementary muscular sequences, but it must be borne in mind that the sophistication of the latter will depend upon what the individual has already learned. A ballet dancer would regard as an elementary motor pattern a muscular sequence that the novice would find quite difficult. The most spectacular feature to evolve thus far has been that seen in the mammals, and it permitted acquisition, during a creature's own lifetime, of novel context-specific reflexes, especially those relying on sequences of muscular movements. This mechanism makes heavy demands on the neural circuitry, because it requires an attentional mechanism. And because attention must, perforce, be an active process, there has to be feedback from the muscles, carrying information about their current state, including their current rate of change of state. Without such information, anticipation would be impossible, and without anticipation there could be no meaningful adjudication and decision as to the most appropriate way of continuing an on-going movement. Without such a mechanism, novel context-specific reflexes could not be acquired.

The fascinating thing is that access to such on-line information mediates consciousness, the gist of which is the ability to know that one knows. The ability to know that one knows is referred to by psychologists as first-order embedding. Higher embedding, such as that exemplified by knowing that one knows that one knows, merely depends upon the ability to hold things in separate patches of neuronal activity in working memory. This manifests itself in a creature's intelligence, which also dictates its ability to consolidate existing schemata into a new schema. When we know that we know, the muscular apparatus is not only monitoring its own state, it is monitoring the monitoring.

In short, one can think of the overall influence of the basal ganglia on the frontal cortex as "releasing the brakes" for motor actions and other functions. The basal ganglia are important for initiating motor movements, but not for determining the detailed properties of those movements.

#### **RELATIONSHIP BETWEEN MOTOR INCOORDINATION AND ADHD/AUTISM IN COGNITIVE FUNCTION**

We have elsewhere described how abnormal motor development can accurately be used as a marker to predict autism and other developmental disorders in later development (Leisman, 2011). Many authors have noted a relationship between incoordination and clumsiness, especially of posture and gait, and autism as well as other neurodevelopmental disorders. The type of gait and motor disturbance has been compared mostly to those that are either basal ganglionic or, most commonly, cerebellar in origin (Nayate et al., 2005). The most common of all comorbidities in practically all neurobehavioral disorders of childhood is developmental coordination disorder (*DCD*), or more simply put, "clumsiness" or motor incoordination. In fact, practically all children in this spectrum have some degree of motor incoordination. The incoordination is also usually of the same type, primarily involving the muscles that control gait and posture or gross motor activity. Sometimes, to a lesser degree, we find fine motor coordination also affected.

Postural sway during quiet stance is often assumed to be a resultant sum of internal noises generated in the postural control system, carrying little useful information (Ishida and Imai, 1980; Fitzpatrick et al., 1992). However, it has also been suggested that a small and slow sway, as a part of postural control during quiet stance, might be important in providing updated and appropriate sensory information helpful to standing balance, and that it is cognitively mediated (Gatev et al., 1999).

Although "time to maintain a given posture" is a useful clinical measure, "body sway" is used as a measure to characterize the performance of upright posture. Body sway is a kinematic term and can be derived from the sum of forces and moments acting on the human body. Many studies have shown that when various sensory systems are systematically manipulated, body sway is affected (Masani et al., 2006, 2013). For example, absence of visual input has been shown to result in an increase in body sway (Sarabon et al., 2013). Thus, postural sway can be analyzed neurologically as well as biomechanically (Melillo and Leisman, 2009) and the combination of both aspects can contribute to a more comprehensive understanding of the processes involved when maintaining body balance in general and the relationship between the basal ganglia and the frontal cortex in particular in developmental disorders. Before viewing the biomechanical considerations, let us first define some basic biomechanical notions represented in **Figure 4**.

The most simplified biomechanical model assumes the human as one rigid body, where the COM is located at the waist, a pivot axis at the ankle, and a COP where the GRF vector acts. The assumptions used in the presented model are those of the inverted pendulum model of human standing balance (Winter and Eng, 1995): (1) The balance problem can be completely described by the movement of the whole-body COM, (2) the distance l from the axis of rotation to the COM remains constant, and (3) the excursions of the COM are small with respect to l.

From Euler's equation:

$$
\sum M = I\alpha \tag{1}
$$

When the vertical projection of the COM is denoted as x, the position of the COP as x1, and the COM distance from the axis of rotation as l, then, taking $I \approx ml^2$ and, for small excursions, $\alpha \approx -\ddot{x}/l$, (1) can be written as (Winter and Eng, 1995):

$$
(x_1 - x)\,mg = I\alpha \approx -ml\,\ddot{x} \tag{2}
$$

This is a dynamically unstable process, as the structure of the inverted pendulum and the postural control cannot achieve moment equilibrium (*M* = 0). Where small body movements cause acceleration of the COM, a radial acceleration exists, leading to priority of equilibrium control during almost all motor tasks, including quiet standing, aimed at repositioning the COG over the COP (Gatev et al., 1999). The muscles around the ankle and hip joints work continuously as the human body struggles to maintain balance. One can see that as long as the COP is kept beyond the COM position, with respect to the rotation center at the ankle, the body is accelerated back to the upright position.

**FIGURE 4 | Summary of biomechanical principles.** Body center-of-mass (COM): the location where all of the mass of the system could be considered to be located. For a solid body it is often possible to replace the entire mass of the body with a point mass equal to that of the body's mass. This point mass is located at the center of mass. Center of gravity (COG): the location at which the body's weight, the resultant of the small attractive forces on the mass particles of which the body is composed, is assumed to act. Ground reaction force vector (GRF): the resultant of a pressure distribution under the foot or feet. Center-of-pressure (COP): the location point of the ground reaction force vector (GRF). Body center-of-mass (COM) is regulated through movement of the COP under the feet. In such a model the difference between body COM and COP will be proportional to the acceleration of body COM. Base of support (BoS): the possible range of the COP, which is loosely equal to the area below and between the feet (in two-feet standing) (Winter et al., 1988).
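The stabilizing role of the COP in the inverted pendulum description can be illustrated numerically. The following is only a minimal sketch: it assumes the small-angle point-mass relation in which the horizontal COM acceleration is proportional to the COM-COP difference with constant g/l, and it uses a hypothetical proportional-derivative rule (`leading_cop`, with illustrative gains `kp` and `kd`) to place the COP beyond the COM. The COM height of 1 m, the initial lean of 2 cm, and all function names are assumptions made for illustration, not values from the studies cited.

```python
G = 9.81  # gravitational acceleration, m/s^2


def com_acceleration(x_com, x_cop, l):
    """Horizontal COM acceleration for a point-mass inverted pendulum.

    Small-angle assumption: x'' = (g / l) * (x_com - x_cop), so a COP
    located beyond the COM accelerates the COM back toward upright.
    """
    return (G / l) * (x_com - x_cop)


def fixed_cop(x, v):
    """COP held under the ankle axis (no corrective action)."""
    return 0.0


def leading_cop(x, v, kp=3.0, kd=1.0):
    """Hypothetical PD rule: COP moves beyond the COM (illustrative gains)."""
    return x + kp * x + kd * v


def simulate(cop_rule, x0=0.02, v0=0.0, l=1.0, dt=0.001, t_end=0.5):
    """Integrate the COM position (semi-implicit Euler); return final x in m."""
    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        a = com_acceleration(x, cop_rule(x, v), l)
        v += a * dt
        x += v * dt
    return x


if __name__ == "__main__":
    print("COP fixed at ankle :", simulate(fixed_cop))
    print("COP leads the COM  :", simulate(leading_cop))
```

With the COP fixed under the ankle, the initial lean grows, whereas a COP that leads the COM brings the body back toward upright, which is the qualitative behavior described above.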

A major problem for human standing posture is the high center of gravity (COG) maintained over a relatively small base of support.

In attempting to understand motor mechanisms involved in the development of balance, research on postural control has focused mainly on two types of study: (a) balance with respect to external conditions, (b) postural adjustments to anticipated internal disturbances of balance. Unexpected external disturbances reveal centrally programmed patterns of postural responses. Afferent feedback also influences posture when the initial setting is disturbed. The second type of disturbance reveals feed-forward postural adjustments (for review, Dietz, 1992). By feed-forward, we mean that the controller predicts an external input or behaves using higher-order processing rather than simple negative feedback of a variable (Gatev et al., 1999).

Studies of the postural responses to unexpected small and slow external disturbances in the antero-posterior direction found that most people reposition the COG by swaying as a flexible inverted pendulum primarily about the ankles with little hip or knee motion. This stereotyped pattern of muscle activation is called "ankle strategy." When responding to larger, faster displacement of support, the primary action of most people occurs at the hip resulting in active trunk rotation or the so-called "hip strategy" (Nashner and McCollum, 1985). The choice of a postural strategy to disturbance was found to depend on the available appropriate sensory information (Nashner et al., 1989).

Locomotion is fundamental for optimal child development. The ability to navigate smoothly and adequately through the environment enables the child to interact with it. Children with developmental disabilities, including autism spectrum disorders and attention deficit/hyperactivity disorder (ADHD), demonstrate locomotor difficulties. Individuals with ADHD and autism spectrum disorders have reported significant motor difficulties, both fine and gross (Melillo and Leisman, 2009).

According to Patla et al. (1991), successful locomotion requires (1) producing a locomotor pattern for supporting the body against gravity and propelling it forward, while (2) maintaining the body in balance, and (3) adapting the pattern to meet environmental demands. The bipedal walking pattern that humans have adopted over time constitutes an elegant way to meet these requirements efficiently and economically. Several findings with respect to motor control in children with DCD and ADHD, however, indicate that they could have problems meeting some of these constraints related to neuromuscular control. Raynor (2001) observed decreased muscular strength and power in children with DCD, accompanied by increased levels of co-activation in a unilateral knee flexion and extension task.

Similar neuromuscular problems, indicating difficulties with the selective muscle control necessary for rhythmic coordination, were found in a unilateral tapping task by Lundy-Ekman et al. (1991). Likewise, Volman and Geuze (1998) showed that these rhythmic coordination difficulties of children with DCD are not restricted to the control of unilateral tapping. By means of a bimanual flexion-extension paradigm they found that the relative phase of children with DCD was less stable than in controls. Second, with regard to balance, various researchers agree that children with ASD/DCD show deficits in the control of posture, as observed in the increased levels of postural sway during quiet stance (Wann et al., 1998; Przysucha and Taylor, 2004). From studies where upright stance was perturbed by means of a sudden displacement of a moveable platform, it was concluded that the balance recovery strategy of children with DCD was different (Williams, 2002). Their strategy was characterized by a top-down muscular activation pattern, compared to the distal-proximal pattern displayed by children without DCD, which was argued to be more efficient. In stance, the projection of the center of mass has to be kept within the borders of the base of support in order to maintain balance. For locomotor balance, however, one must achieve a compromise between the forward propulsion of the body, which involves a highly destabilizing force, and the need to maintain overall stability (Winter and Eng, 1995). Taking into account this complexity with respect to the control of posture during locomotion, it can be hypothesized that the balance problems experienced by children with DCD might be a limiting factor for their locomotor activity.

So far, descriptions of the gait pattern of children with DCD are limited to some qualitative observations. Larkin and Hoare (1991) have noted, for example, poor head control, bent arms in a guard position, jerky limb to limb transitions, excessive hip flexion, pronounced asymmetry, wide base of support, short steps, foot strike with flat foot, and toe-walking. In an attempt to quantify the gait pattern of children with DCD (see **Figure 6**), Woodruff et al. (2002) developed an Index of Walking Performance. This index is based on a comparison of four spatiotemporal gait parameters (time of opposite toe-off, single stance time, total stance time, and step length) with reference parameters of the San Diego database (Sutherland, 1978). From their calculations Woodruff et al. concluded that the walking pattern of six out of seven children with DCD indeed was atypical. This one-dimensional measure of walking performance is useful for classification and evaluation of gait performance in clinical practice; however, it does not explain the nature or source of atypical gait (see **Figure 5**). In addition, comparison of gait variables with a reference population without controlling for stature (or leg length) and body weight might obscure deviations and lead to imprudent conclusions, since the walking pattern is highly dependent on anthropometrical characteristics (Hof, 1996; Stansfield et al., 2003). Therefore, in order to gain insight into the gait pattern of children with developmental disorders, more detailed and quantitative data are needed.
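One common way of controlling for stature before comparing spatiotemporal gait parameters is dimensionless scaling by leg length, in the spirit of the approach associated with Hof (1996). The sketch below is illustrative only: lengths are divided by leg length, times by the square root of leg length over g, and speeds by the square root of g times leg length. The function name, the choice of parameters, and the example values are assumptions and are not taken from the studies cited.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def normalize_gait(step_length_m, stance_time_s, speed_m_s, leg_length_m):
    """Return dimensionless gait parameters scaled by leg length.

    length -> length / l0
    time   -> time / sqrt(l0 / g)
    speed  -> speed / sqrt(g * l0)
    """
    return {
        "step_length": step_length_m / leg_length_m,
        "stance_time": stance_time_s / math.sqrt(leg_length_m / G),
        "speed": speed_m_s / math.sqrt(G * leg_length_m),
    }


if __name__ == "__main__":
    # Two hypothetical children whose gait differs only by body size.
    short_child = normalize_gait(0.45, 0.50, 1.00, leg_length_m=0.60)
    scale = math.sqrt(0.80 / 0.60)
    tall_child = normalize_gait(0.60, 0.50 * scale, 1.00 * scale,
                                leg_length_m=0.80)
    print(short_child)
    print(tall_child)
```

Two children whose raw parameters differ only because of leg length yield the same dimensionless values, so any remaining difference from a reference population is less likely to be an artifact of body size.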

Up to 50% of children and adolescents with ADHD exhibit motor abnormalities including altered balance (Buderatha et al., 2009). Balance testing in such studies has included a disruption of sensory signals. During dynamic posturography, ADHD participants showed mild balance problems, which correlated with findings in children with cerebellar pathology. ADHD children showed abnormalities in a backward walking task and minor abnormalities in the paced stepping test. They did not differ in treadmill walking from the controls. These findings support the notion that cerebellar dysfunction may contribute to the postural deficits seen in ADHD children. However, the observed abnormalities were minor. It needs to be examined whether balance problems become more pronounced in ADHD children exhibiting more prominent signs of clumsiness. Although it has been fairly well known that attention deficit disorders are comorbid with psychiatric disorders such as the ones described above, what is less known and more significant is the association between ADD/ADHD and motor control dysfunction (clumsiness) or DCD (American Psychiatric Association, 1994). In the past, motor clumsiness had been viewed as a neurological rather than a psychiatric disorder. Motor control problems were first noted in what were then called minimal brain dysfunction syndromes, or MBD. MBD was the term used to describe children of normal intelligence, but with comorbidity of attention deficit and motor dysfunction or "soft" neurological signs. Several studies by Denckla and others (Denckla and Rudel, 1978; Denckla et al., 1985; Gillberg et al., 1993; Kadesjö and Gillberg, 1998) have shown that comorbidity exists between ADHD and OCD, dyscoordination and/or motor perceptual dysfunction. Several studies have shown that 50% of children with ADHD also had OCD (Brown et al., 2001).

**FIGURE 5 | Stick-figures of the body configuration at initial FS (left) and TO (right).** Gray lines represent the TD-children without DCD, black lines represent the children with DCD. Feet with broken lines are the contralateral feet. Arrows indicate significant differences of the joint angles (*p* < 0.05) (from Deconinck, 2005).

In a Dutch study (Hadders-Algra and Towen, 1992), 15% of school age children were judged to have mild neural developmental deviations and another 6% demonstrated severe neural developmental deviations (occurring in boys twice as often as in girls). Minor developmental deviations were reported to consist of dyscoordination, fine motor deviations, choreiform movements, and abnormalities of muscle tone. Researchers who have dealt with these minor neural developmental deviations tend to look at motor dysfunction as a sign of neurological disorder that may be associated with other problems such as language and perception dysfunction.

In Asperger's syndrome, it has been noted that individuals have significant degrees of motor incoordination. In fact, in Wing's original paper, she noted that of the 34 cases that she had diagnosed based on Asperger's description, "90% were poor at games involving motor skill, and sometimes the executive problems affect their ability to write or draw." Although gross motor skills are most frequently affected, fine motor and specifically graphomotor skills were sometimes considered significant in Asperger's syndrome (Wing, 1988; Wing and Attwood, 1988). Wing and Attwood (1988) noted that posture, gait, and gesture incoordination were most often seen in Asperger's syndrome and that children with classic autism seem not to have the same degree of balancing and gross motor skill deficits. However, it was also noted that the agility and gross motor skills in children with autism seem to decrease as they get older and may eventually present at a similar or the same level as in Asperger's syndrome.

Gillberg and Gillberg (1989) also reported clumsiness to be almost universal among children that they had examined for Asperger's syndrome. The other associated symptoms noted consisted of severe impairment and social interaction difficulties, preoccupation with a topic, reliance on routines, pedantic language, comprehension, and dysfunction of nonverbal communication. In subsequent work, Gillberg included clumsiness as an essential diagnostic feature of Asperger's syndrome.

It has been reported (Gillberg and Kadesjö, 2003) that children with ADHD and autism spectrum problems, particularly those given a diagnosis of Asperger syndrome, have a very high rate of comorbid DCD. Klin et al. (1995) noted that a significantly higher percentage of Asperger's than non-Asperger's autistic individuals showed deficits in both fine and gross motor skills, either relative to norms or by clinical judgment. They further noted that all 21 Asperger's cases showed gross motor skill deficits, and that 19 of these also had impairment in manual dexterity, which seems to suggest that poor coordination was a general characteristic of Asperger's. With studies like these, many researchers have noted dysfunction of fine motor coordinative skills as a feature of autistic spectrum disorders.

Manjiviona and Prior (1995) noted that 50% of the autistic individuals and 67% of the Asperger's individuals studied presented with significant motor impairment as defined by norms on a test of motor impairment. Walker et al. (2004) also noted that autistic groups did not differ from Asperger's groups with respect to dominant hand speeds on type boards, although both were slower than psychiatric controls. Vilensky et al. (1981) analyzed the gait pattern of a group of children with autism. They used film records and identified gait abnormalities in these children that were not observed in a control group of normally developing children or in small groups of "hyperactive/aggressive children." Reported abnormalities were noted to be similar to those associated with Parkinson's. Hallett et al. (1993) assessed the gait of five high functioning adults with autism compared with age matched normal controls. Using a computer assisted video kinematic technique, they found that gait was atypical in these individuals. The authors noted that the overall clinical findings were consistent with a cerebellar rather than a basal ganglionic dysfunction.

Kohen-Raz et al. (1992) noted that postural control of children with autism differs from that of matched mentally handicapped and normally developing children and from adults with vestibular pathology. These objective measures were obtained using a computerized posturographic technique. It has also been noted that the pattern of atypical postures in children with autism is more consistent with a mesocortical or cerebellar than a vestibular pathology. Numerous investigators (Howard et al., 2000) have provided independent empirical evidence that basic disturbances of the motor systems of individuals with autism are especially involved in postural and lower limb motor control.

Makris et al. (2008) examined attention and executive systems abnormalities in adults with childhood ADHD. They noted that ADHD is hypothesized to be due, in part, to structural defects in brain networks influencing cognitive, affective, and motor behaviors (Leisman et al., 2013). Although the literature on fiber tracts is limited in ADHD, Makris and colleagues note that gray matter abnormalities suggest that white matter connections may be altered selectively in neural systems. A prior study (Ashtari et al., 2005) using diffusion tensor magnetic resonance imaging (DT-MRI) showed alterations within the frontal and cerebellar white matter in children and adolescents with ADHD. In this study of adults, the authors hypothesized that fiber pathways subserving attention and executive functions (EF) would be altered. To this end, the cingulum bundle (CB) and superior longitudinal fascicle II (SLF II) were investigated *in vivo* in 12 adults with childhood ADHD and 17 demographically comparable unaffected controls using DT-MRI. Relative to controls, the fractional anisotropy (FA) values were significantly smaller in both regions of interest in the right hemisphere, in contrast to a control region (the fornix), indicating an alteration of anatomical connections within the attention and EF cerebral systems in adults with childhood ADHD. The demonstration of FA abnormalities in the CB and SLF II in adults with childhood ADHD provides further support for persistent structural abnormalities into adulthood.
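For readers unfamiliar with the FA measure, the following minimal sketch shows the standard definition of fractional anisotropy computed from the three eigenvalues of a diffusion tensor. It is included only to make the quantity concrete: the function name and example eigenvalues are illustrative assumptions, and nothing here reproduces the processing pipeline used in the studies cited.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three diffusion tensor eigenvalues.

    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2))
    FA is 0 for isotropic diffusion and approaches 1 when diffusion
    occurs along a single axis.
    """
    mean = (l1 + l2 + l3) / 3.0
    numerator = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0.0:
        return 0.0  # no diffusion measured; define FA as 0
    return math.sqrt(1.5 * numerator / denominator)


if __name__ == "__main__":
    # Illustrative eigenvalues in arbitrary units (not measured data).
    print(fractional_anisotropy(1.0, 1.0, 1.0))  # isotropic
    print(fractional_anisotropy(1.7, 0.3, 0.2))  # strongly directional
```

Lower FA in a tract, as reported above for the CB and SLF II, therefore indicates less directionally coherent diffusion within that region of interest.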

Researchers at Stanford University have observed that in children with ADHD, also known as childhood hyperkinetic disorder (Wing and Attwood, 1988), frontal-subcortical connections are disrupted by subcortical dysfunction, showing decreased glucose consumption in frontal cortex and decreased nigrostriatal D2 receptor uptake ratios. The Stanford study used functional MRI to image the brains of boys between the ages of 8 and 13 while they played a mental game. Ten of the boys were diagnosed with ADHD and six were considered normal. When the boys were tested, there appeared to be a clear difference in the activity of the basal ganglia, with the boys with ADHD having less activity in that area than the control subjects. After administering methylphenidate, the participants were scanned again and it was found that boys with ADHD had increased activity in the basal ganglia whereas the normal boys had decreased activity in the basal ganglia. Interestingly, the drug improved the performance of both groups to the same extent.

This finding may be similar to that of PET scans of patients with hyperactivity disorder, in which normal appearing frontal metabolism existed with decreased caudate and putamen metabolism (Gillberg and Gillberg, 1989). Methylphenidate, a dopamine reuptake inhibitor, may increase function in previously dysfunctional basal ganglia, whereas raising dopamine levels in normal individuals would most likely result in decreased activity of the basal ganglia to prevent overproduction of dopamine. The previously dysfunctional basal ganglia would have most likely resulted in decreased frontal metabolism with increased thalamo-cortical firing; this would result in decreased cognitive function with increased hyperkinetic (hyperactive) behavior. Increasing dopamine levels may increase frontal metabolism due to increased activity of the striatum with decreased firing of the globus pallidus, thereby inhibiting thalamo-cortical firing, which in turn decreases hyperkinetic behavior. This would make sense based on the fMRI findings before and after treatment, and the fact that both groups showed equal improvement in performance.

Etiological theories suggest a deficit in cortico-striatal circuits, particularly those components modulated by dopamine, and ADHD is therefore discussed here in comparison with the other basal ganglia-related disorders. Teicher et al. (2000) developed a functional magnetic resonance imaging procedure (T2 relaxometry) to indirectly assess blood volume in the striatum (caudate and putamen) of boys 6–12 years of age in steady-state conditions. Boys with attention-deficit/hyperactivity disorder had higher T2 relaxation time measures in the putamen bilaterally than healthy control subjects. Daily treatment with methylphenidate significantly changed the T2 relaxation times in the putamen of children with ADHD. There was a similar but non-significant trend in the right caudate. Teicher and colleagues concluded that attention-deficit/hyperactivity disorder symptoms might be closely tied to functional abnormalities in the putamen, which is mainly involved in the regulation of motor behavior.

Converging evidence implies the involvement of dopaminergic fronto-striatal circuitry in ADHD. Anatomical imaging studies using MRI have demonstrated subtle reductions in volume in regions of the basal ganglia and prefrontal cortex (Castellanos et al., 2002). Cognitive functioning is mildly impaired in this disorder (Seymour et al., 2004). In particular, cognitive control, the ability to inhibit inappropriate thoughts and actions, is also affected and therefore we are again dealing with a disorder of inhibition. Several studies have shown that this impairment is related to the reduction in volume in fronto-striatal regions (Sergeant et al., 2002), and functional studies have suggested that older children and adults with ADHD may activate these regions less than controls during tasks that require cognitive control (Bush et al., 1999). Durston et al. (2002) showed that the development of this ability is related to the maturation of ventral fronto-striatal circuitry.

Volumetric abnormalities have also been associated with the basal ganglia and in turn with ADHD. To specify the localization of these abnormalities, Qiu et al. (2009) employed large deformation diffeomorphic metric mapping (LDDMM) to examine the effects of ADHD, sex, and their interaction on basal ganglia shapes. The basal ganglia (caudate, putamen, globus pallidus) were manually delineated on magnetic resonance imaging from typically developing children and children with ADHD. LDDMM mappings from 35 typically developing children were used to generate basal ganglia templates. These investigators found that boys with ADHD showed significantly smaller basal ganglia volumes compared with typically developing boys, and LDDMM revealed that the groups differed markedly in basal ganglia shapes. Volume compression was seen bilaterally in the caudate head and body and anterior putamen as well as in the left anterior globus pallidus and right ventral putamen. Volume expansion was most pronounced in the posterior putamen. They concluded that the shape compression pattern of basal ganglia in ADHD suggests an atypical brain development involving multiple frontal-subcortical control loops, including circuits with premotor, oculomotor, and prefrontal cortices.

Aaron et al. (2007) brilliantly outlined the nature of inhibition in fronto-basal-ganglia networks relative to cognition. Their paper was not about the problems of ADHD individuals *per se* but a thorough analysis of the neurophysiology of stopping. They indicated that sensory information about a stop signal is relayed to the prefrontal cortex, where the stopping command must be generated. They collected evidence indicating that the right inferior frontal cortex (IFC) is a critical region for stop signal response inhibition (Chambers et al., 2006), with the most critical portion likely being the pars opercularis (Brodmann area 44) in humans. The right IFC can send a stop command to intercept the Go process via the basal ganglia [represented in Figure 7B from Aaron et al. (2007)]. The Go process is likely generated by premotor areas that project via the direct pathway of the basal ganglia (through striatum, pallidum, and thalamus), eventually exciting primary motor cortex and generating cortico-spinal volleys to the relevant effector, each interacting with the globus pallidus (Aaron and Poldrack, 2006). The Stop process could activate the globus pallidus via a projection from the STN. As seen in **Figures 7A–C**, high resolution fMRI has shown activation of a midbrain region, consistent with the STN, when subjects successfully stop their responses (Aaron and Poldrack, 2006), and diffusion tractography shows that this STN region is directly connected to the right IFC via a white matter tract (Aaron et al., 2007) (**Figure 7C**). Thus, once the Stop command is generated in frontal cortex, it could be rapidly conveyed to the basal ganglia via the so-called "hyperdirect pathway" to intercept the Go process in the final stages of the race. Two recent studies identified a third critical node for the stopping process in the dorso-medial frontal cortex, including the pre-supplementary motor area (Floden and Stuss, 2006).
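The idea of a race between Go and Stop processes can be made concrete with a small simulation. The sketch below implements the simpler independent race model often used to describe the stop-signal task, not the interactive race model referred to in Figure 7A: a response escapes inhibition whenever the Go process finishes before the stop-signal delay (SSD) plus the stop-signal reaction time (SSRT). The Gaussian Go finishing times and all parameter values are illustrative assumptions, not estimates from the studies cited.

```python
import random


def p_respond_given_stop(ssd_ms, ssrt_ms=200.0, go_mean_ms=500.0,
                         go_sd_ms=100.0, n_trials=5000, seed=0):
    """Estimate P(respond | stop signal) under an independent race.

    On each simulated trial the Go process finishes at a Gaussian time;
    the Stop process finishes at ssd_ms + ssrt_ms. The response is
    executed only if Go wins the race.
    """
    rng = random.Random(seed)
    stop_finish = ssd_ms + ssrt_ms
    responded = 0
    for _ in range(n_trials):
        go_finish = rng.gauss(go_mean_ms, go_sd_ms)
        if go_finish < stop_finish:
            responded += 1
    return responded / n_trials


if __name__ == "__main__":
    # Inhibition function: later stop signals are harder to obey.
    for ssd in (50, 150, 250, 350):
        print(ssd, p_respond_given_stop(ssd))
```

In this sketch the probability of failing to stop rises as the stop signal is delayed, which is the behavioral signature that the fronto-basal-ganglia account of stopping is meant to explain.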

Balance deficits and problems of motor planning, motor coordination, and perceptual-motor function associated with other developmental disorders are, as we have noted, present in individuals with ADHD (Kaplan et al., 1998). As we noted earlier, there have been attempts to assume a single underlying disorder, such as atypical brain development, because of the high level of comorbidity between learning, attention, (developmental) coordination, and behavioral disorders (Kaplan et al., 1998).

The contribution of sensory organs to posture has been the object of much inquiry and for good reason. A malfunction in any of the three primary sensory subsystems (visual, vestibular, or somatosensory) can compromise integrative function and as a result limit adaptability of posture. A lack of optimal postural control limits the development of sensory strategies, anticipatory mechanisms, internal representations, neuromuscular synergies, and adaptive mechanisms (Shumway-Cook and Woolacott, 2001).

Inadequate input and the inability to integrate and prioritize information from different sources result in instability, poor motor planning, poor coordination, and perceptual motor problems. Although posture dysfunction among children with ADHD may not be easily identified, research indicates that balance is compromised with this population (Zang et al., 2002).

Posture and balance are accomplished through several mechanisms acting together to maintain orientation and stability (Shumway-Cook and Woolacott, 2001). Both the sensory and motor systems, along with the biomechanical properties of the organism provide the foundation for posture control (Palmeri et al., 2002).

Self-organizing properties of motor behavior evident in other biological and natural systems are evident in the developing human as well (Kamm et al., 1990). Various subcomponents within the individual, the task at hand, and the environment all interact to determine the movement that emerges, with no *a priori* determination of which system is the primary control parameter. Unlike hierarchical theories of motor control that adhere to a prescriptive system of generating behavior (i.e., maturation and/or information-processing theories, which purport that the brain or central nervous system dictates outcome responses or behavior), a Dynamical Systems framework suggests that the brain is one of many components but not the sole determinant of performance. Gravity, musculoskeletal properties, motion dependent torques and all other changing (dynamical) contexts, which can include the environment (e.g., ambient temperature, surface, initial position), task at hand, and arbitrary rules, play significant roles in shaping the resulting action. It is impossible to command all motor units, from infinite initial positions through all possible planes of motion (thus the "degrees of freedom problem"), and produce such elegant and quick motor responses as are routinely displayed by even the youngest of children. Thus, from a Dynamical Systems perspective, given the near infinite degrees of freedom involved, a parsimonious solution will involve the temporary organization of "coordinative structures" or units (Clark and Whittall, 1989).

**FIGURE 7 | (A)** The interactive race model between Go and Stop processes. The parameters were estimated by fitting the model to thousands of behavioral trials from a monkey neurophysiology study. **(B)** Schematic of fronto-basal-ganglia circuitry for Going and Stopping. The Go process is generated by premotor cortex, which excites striatum and inhibits globus pallidus, removing inhibition from thalamus and exciting motor cortex (see text for details). The stopping process could be generated by IFC leading to activation of the subthalamic nucleus, increasing broad excitation of pallidum and inhibiting thalamocortical output, reducing activation in motor cortex. **(C)** Diffusion-weighted imaging reveals putative white matter tracts in the right hemisphere between the dorsomedial preSMA, the ventrolateral PFC or IFC, and the putative region of the STN. **(D)** Regions of the rat brain implicated in behavioral stopping. Stopping is significantly impaired following excitotoxic lesions within the regions highlighted in red, whereas lesions within the gray-colored regions have no effect on stopping. OF, Orbitofrontal cortex; IL, infralimbic cortex; PL, prelimbic cortex; DM Str, dorsomedial striatum; NAC, nucleus accumbens (core); DH, dorsal hippocampus; VH, ventral hippocampus; GPi, globus pallidus pars interna (from Aaron and Poldrack, 2006).

In postural terms, early forms of coordinative units that allow infants to interact with the environment necessitate reflexes. Through development, more complex forms of control emerge, such as anticipatory postural responses (e.g., feed-forward mechanisms; Horak and Nashner, 1986) and postural synergies (e.g., probably mid-brain or brainstem reflexes; Sveistrup and Woollacott, 1996), that usher in more adaptive balance behaviors. Subsequently, voluntary motor control becomes possible as temporarily organized units or components within the organism perform at optimized levels. This includes sensory, perceptual and motor functions that collectively allow highly adaptive responses.

Dysfunction may arise because a subcomponent of the system is not functioning to its capacity, thus acting as a weak link. Children with ADHD interact with their environment, but not as consistently as the typical population, perhaps because of a less than adequate sensory apparatus, as has been suggested (Zang et al., 2002). The weakest component of the system serves as the control parameter in this case and determines the integrity of the coordinative unit.

### **CONCLUSIONS**

Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, language comprehension and other cognitive functions, including those associated with the frontal lobes. Many neocortical and subcortical regions support the cortical-striatal-cortical circuits that confer various aspects of language ability, for example. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of individuals with Tourette's syndrome, Obsessive-Compulsive Disorder, Broca's aphasia, Parkinson's disease, hypoxia, and focal brain damage, and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. Dobzhansky stated, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of cognition as it does to the human foot or jaw. The converse follows: the mark of evolution on the brains of human beings and other species provides insight into the evolution of the brain bases of human language. The neural substrate that regulated motor control in the common ancestor of apes and humans most likely was modified to enhance cognitive and linguistic ability.
Language and cognition played a central role in this process. However, the process that ultimately resulted in the human brain may have started when our earliest hominid ancestors began to walk.

## **ACKNOWLEDGMENTS**

This work has been supported, in part, by the Government of Israel, *Kamea Dor-Bet* program, by the Children's Autism Hope Project, and by the Richmond County Savings Foundation.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 October 2013; accepted: 24 January 2014; published online: 13 February 2014.*

*Citation: Leisman G, Braun-Benjamin O and Melillo R (2014) Cognitive-motor interactions of the basal ganglia in development. Front. Syst. Neurosci. 8:16. doi: 10.3389/fnsys.2014.00016*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Leisman, Braun-Benjamin and Melillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The thalamostriatal system in normal and diseased states

#### **Yoland Smith<sup>1,2,3</sup>\*, Adriana Galvan<sup>1,2,3</sup>, Tommas J. Ellender<sup>4</sup>, Natalie Doig<sup>4</sup>, Rosa M. Villalba<sup>1,3</sup>, Icnelia Huerta-Ocampo<sup>4</sup>, Thomas Wichmann<sup>1,2,3</sup> and J. Paul Bolam<sup>4</sup>**

<sup>1</sup> Yerkes National Primate Research Center, Emory University, Atlanta, GA, USA

<sup>2</sup> Department of Neurology, Emory University, Atlanta, GA, USA

<sup>3</sup> Udall Center of Excellence for Parkinson's Disease, Emory University, Atlanta, GA, USA

<sup>4</sup> Department of Pharmacology, MRC Anatomical Neuropharmacology Unit, Oxford, UK

#### **Edited by:**

Hagai Bergman, The Hebrew University, Israel

#### **Reviewed by:**

Jose L. Lanciego, University of Navarra, Spain

Antonio Pisani, Università di Roma "Tor Vergata", Italy

#### **\*Correspondence:**

Yoland Smith, Yerkes National Primate Research Center, Emory University, 954, Gatewood Road NE, Atlanta, GA 30329, USA e-mail: ysmit01@emory.edu

Because of our limited knowledge of the functional role of the thalamostriatal system, this massive network is often ignored in models of the pathophysiology of brain disorders of basal ganglia origin, such as Parkinson's disease (PD). However, over the past decade, significant advances have led to a deeper understanding of the anatomical, electrophysiological, behavioral and pathological aspects of the thalamostriatal system. The cloning of the vesicular glutamate transporters 1 and 2 (vGluT1 and vGluT2) has provided powerful tools to differentiate thalamostriatal from corticostriatal glutamatergic terminals, allowing us to carry out comparative studies of the synaptology and plasticity of these two systems in normal and pathological conditions. Findings from these studies have led to the recognition of two thalamostriatal systems, based on their differential origin from the caudal intralaminar nuclear group, the center median/parafascicular (CM/Pf) complex, or other thalamic nuclei. The recent use of optogenetic methods supports this model of the organization of the thalamostriatal systems, showing differences in functionality and glutamate receptor localization at thalamostriatal synapses from Pf and other thalamic nuclei. At the functional level, evidence largely gathered from thalamic recordings in awake monkeys strongly suggests that the thalamostriatal system from the CM/Pf is involved in regulating alertness and switching behaviors. Importantly, there is evidence that the caudal intralaminar nuclei and their axonal projections to the striatum partly degenerate in PD and that CM/Pf deep brain stimulation (DBS) may be therapeutically useful in several movement disorders.

**Keywords: thalamus, Parkinson's disease, intralaminar nuclei, glutamate, vesicular glutamate transporter, attention, striatum, Tourette's syndrome**

#### **INTRODUCTION**

Although the evolution of the thalamus and striatum pre-dates the expansion of the cerebral cortex (Butler, 1994; Reiner, 2010; Stephenson-Jones et al., 2011), our knowledge about the functional anatomy and behavioral role of the connections between them is minimal compared to the amount of information that has been gathered about the corticostriatal system (Kemp and Powell, 1971; Alexander et al., 1986; Parent and Hazrati, 1995). However, significant progress has been made in our understanding of the anatomical and functional organization of the thalamostriatal system since its first description in the early 1940's (Vogt and Vogt, 1941; Cowan and Powell, 1956). Research in the last decade has resulted in a much better understanding of various aspects of the synaptic properties of the thalamostriatal projection(s), and their potential roles in cognition. Furthermore, evidence that the center median/parafascicular (CM/Pf) complex, the main source of thalamostriatal connections, is severely degenerated in Parkinson's disease (PD), combined with the fact that lesion or deep brain stimulation (DBS) of this nuclear group alleviates some of the motor and non-motor symptoms of Tourette's syndrome (TS) and PD, has generated significant interest in these projections. In this review, we will discuss these topics and provide an overview of our current knowledge of the functional anatomy, synaptology, and physiology of the mammalian thalamostriatal system, as well as the involvement of these projections in disease processes. Because of space limitation, we will focus mostly on recent developments in this field. 
Readers are referred to previous reviews for additional information and a broader coverage of the early literature on the thalamostriatal projections (Groenewegen and Berendse, 1994; Parent and Hazrati, 1995; Mengual et al., 1999; Haber and Mcfarland, 2001; Van der Werf et al., 2002; Kimura et al., 2004; Smith et al., 2004, 2009, 2010, 2011; McHaffie et al., 2005; Haber and Calzavara, 2009; Halliday, 2009; Minamimoto et al., 2009; Sadikot and Rymar, 2009; Galvan and Smith, 2011; Bradfield et al., 2013b).

## **THALAMOSTRIATAL CIRCUITRY AND SYNAPTIC CONNECTIVITY**

#### **FUNCTIONALLY SEGREGATED BASAL GANGLIA-THALAMOSTRIATAL CIRCUITS THROUGH THE CENTER MEDIAN/PARAFASCICULAR (CM/Pf) COMPLEX**

Although the thalamostriatal system originates from most thalamic nuclei, the CM/Pf (or the Pf in rodents) is the main source of thalamostriatal projections in primates and non-primates (Smith and Parent, 1986; Berendse and Groenewegen, 1990; Francois et al., 1991; Sadikot et al., 1992a,b; Deschenes et al., 1996a,b; Mengual et al., 1999; McFarland and Haber, 2000; Mcfarland and Haber, 2001; Smith et al., 2004, 2009; Castle et al., 2005; McHaffie et al., 2005; Parent and Parent, 2005; Raju et al., 2006; Lacey et al., 2007). CM/Pf neurons send massive and topographically organized projections to specific regions of the dorsal striatum, but provide only minor inputs to the cerebral cortex (Sadikot et al., 1992a; Parent and Parent, 2005; Galvan and Smith, 2011; **Figure 1**). Single cell tracing studies in monkeys have shown that more than half of all CM neurons innervate densely and focally the striatum without significant input to the cerebral cortex, while about one third innervates diffusely the cerebral cortex without significant projections to the striatum, and the remaining neurons project to both targets with a preponderance of innervation of the dorsal striatum (Parent and Parent, 2005).

Based on its preferential targeting of specific functional territories, the primate CM/Pf complex is divided into five major subregions: (1) the rostral third of Pf, which innervates mainly the nucleus accumbens; (2) the caudal two thirds of Pf, which project to the caudate nucleus; (3) the dorsolateral extension of Pf (Pfdl), which selectively targets the anterior putamen; (4) the medial two thirds of CM (CMm), which innervate the post-commissural putamen; and (5) the lateral third of CM (CMl), which is the source of inputs to the primary motor cortex (M1). Through these projections, the CM/Pf gains access to the entire striatal complex, thereby making the CM/Pf-striatal system a functionally organized network that may broadly affect motor and non-motor basal ganglia functions (Smith et al., 2004, 2009, 2010; Galvan and Smith, 2011). In rodents, the lateral part of Pf is considered to be the homologue of the primate CM and projects mainly to the sensorimotor region (i.e., the dorsolateral part) of the caudate-putamen complex, whereas the medial rodent Pf displays strong similarities with the primate Pf, projecting to associative and limbic regions of the striatum (Groenewegen and Berendse, 1994).

#### **THALAMOSTRIATAL SYSTEMS FROM NON-CENTER MEDIAN/PARAFASCICULAR (CM/Pf) THALAMIC NUCLEI**

**FIGURE 1 | Summary of the anatomical, functional and pathological characteristics that differentiate thalamostriatal projections from the CM/Pf vs. other thalamic nuclei (i.e., non CM/Pf).**

In addition to the CM/Pf complex, thalamostriatal projections originate from several other rostral intralaminar and nonintralaminar thalamic nuclei. In primates and non-primates, the rostral intralaminar nuclei (central lateral, paracentral, central medial), the mediodorsal nucleus (MD), the pulvinar, the lateral posterior nucleus, the medial posterior nucleus, midline, anterior and the ventral motor nuclear group are prominent sources of thalamostriatal projections to the caudate nucleus
and putamen (Royce, 1978; Beckstead, 1984; Smith and Parent, 1986; Groenewegen and Berendse, 1994; Smith et al., 2004, 2009; Alloway et al., 2014). In contrast to the projections from the CM/Pf complex, these nuclei send major projections to the cerebral cortex, while contributing a modest or sparse innervation of the dorsal and ventral striatum (Royce, 1983; Macchi et al., 1984; Deschenes et al., 1996a,b; Smith et al., 2004, 2009; **Figure 1**). In rats, the topography of these projections corresponds to the functionally segregated organization of the striatum, so that sensorimotor-, associative- and limbic-related thalamic nuclei innervate functionally corresponding regions of the dorsal and ventral striatum (Berendse and Groenewegen, 1990, 1991; Groenewegen and Berendse, 1994). Although such detailed analyses have not been done in primates (except for projections from the ventral anterior/ventral lateral (VA/VL) complex), evidence from retrograde and anterograde labeling studies indicates that the primate non-CM/Pf thalamostriatal projections also display a strict functional topography (Parent et al., 1983; Smith and Parent, 1986; Fenelon et al., 1991; McFarland and Haber, 2000; Mcfarland and Haber, 2001). The thalamostriatal projections from the ventral motor thalamic nuclei have received particular attention in these studies. It appears that projections from the pars oralis of the VL (VLo), the main recipient of sensorimotor internal globus pallidus (GPi) outflow, terminate preferentially in the postcommissural putamen, whereas projections from the magnocellular division of the VA, the principal target of the substantia nigra pars reticulata (SNr) and associative GPi outflow, innervate the caudate nucleus (Mcfarland and Haber, 2001).
Within striatal territories, VA/VL projections terminate in a patchy manner, indicating that additional organizational principles may be at work that have not yet been elucidated (Groenewegen and Berendse, 1994; McFarland and Haber, 2000; Mcfarland and Haber, 2001; Smith et al., 2004, 2009; Raju et al., 2006).

Through the use of trans-synaptic viral tracing studies, Strick and colleagues have suggested that thalamostriatal projections from the VL and other intralaminar thalamic nuclei that receive cerebellar outflow from the dentate nucleus may be the underlying connections through which the cerebellum communicates with the basal ganglia (Bostan and Strick, 2010; Bostan et al., 2010, 2013). Dysfunctions of these cerebello-thalamo-basal ganglia interactions may underlie some aspects of the pathophysiology of dystonia and other movement disorders (Jinnah and Hess, 2006; Neychev et al., 2008; Calderon et al., 2011).

#### **THE SUPERIOR COLLICULUS: A POTENTIAL DRIVER OF THE THALAMOSTRIATAL SYSTEM IN MAMMALS**

Anatomical studies have suggested that additional sub-cortical tecto-basal ganglia loops exist that connect the superficial and deep layers of the superior colliculus with specific thalamic nuclei, which then gain access to the basal ganglia circuitry via thalamostriatal connections (McHaffie et al., 2005; Redgrave et al., 2010). The existence of these connections could resolve some of the fundamental issues associated with short-latency responses to biologically salient stimuli (Smith et al., 2011; Alloway et al., 2014). As discussed in more detail below, Pf neurons exhibit short-latency excitatory responses to salient stimuli. Redgrave and colleagues have suggested that the superior colliculus (optic tectum in lower species) displays the evolutionary profile, anatomical connectivity and physiological features that would allow it to mediate such effects upon thalamic neurons (McHaffie et al., 2005; Redgrave et al., 2010). The basal ganglia and the superior colliculus are neural structures that appeared early (>400 million years ago) and have been highly conserved throughout the evolution of the vertebrate brain (Reiner, 2010; Stephenson-Jones et al., 2011), thereby suggesting that they are part of fundamental processing units that play basic functions in mammalian behavior. The superior colliculus has direct access to primary sensory information, and studies have shown that stimuli associated with positive or negative outcomes activate different sub-regions of this nucleus that engage various tecto-thalamo-striatal loops (Redgrave et al., 1999; Alloway et al., 2014). 
Through these loops, the primary sensory events could be rapidly transmitted to the striatum and affect the basal ganglia circuitry which, in turn, could lead to basal ganglia-mediated disinhibition of different sub-regions of the superior colliculus that could help select and reinforce some sensory stimuli over others (Redgrave et al., 1999), most likely through regulation of corticostriatal plasticity via dopaminergic and cholinergic intrastriatal mechanisms (Ding et al., 2010; Smith et al., 2011). Through these processes, the short-latency sensory-driven activity in the superior colliculus could be used by the basal ganglia to reinforce the development of novel habits or procedures (Redgrave et al., 2010; Smith et al., 2011).

## **SYNAPTIC ORGANIZATION AND PREVALENCE OF THALAMOSTRIATAL VS. CORTICOSTRIATAL TERMINALS**

Anterograde tracing studies in several species have shown that the thalamostriatal projections give rise to asymmetric (or Gray's Type 1) synapses. In rodents, the principal synaptic targets of most non-CM/Pf thalamostriatal projections are dendritic spines of striatal medium spiny neurons (MSNs), a pattern of synaptic connectivity similar to that of the corticostriatal system (Kemp and Powell, 1971; Dube et al., 1988; Xu et al., 1991; Raju et al., 2006; Lacey et al., 2007; **Figures 1**, **2**). In contrast, striatal afferents from CM/Pf (or Pf in rodents) establish asymmetric synapses principally with dendritic shafts of MSNs (Dube et al., 1988; Sadikot et al., 1992b; Smith et al., 1994; Sidibe and Smith, 1996; Raju et al., 2006, 2008; Lacey et al., 2007) and several types of striatal interneurons including cholinergic interneurons (Meredith and Wouterlood, 1990; Lapper and Bolam, 1992; Sidibe and Smith, 1999) and parvalbumin-positive GABA interneurons (Rudkin and Sadikot, 1999; Sidibe and Smith, 1999; **Figure 1**). Overall, 70–90% of CM/Pf (or Pf) terminals form axo-dendritic synapses in the rat and monkey striatum (Dube et al., 1988; Sadikot et al., 1992b; Raju et al., 2006; Lacey et al., 2007). However, single cell filling studies have revealed that the pattern of synaptic connection of individual Pf neurons is highly variable in rats. For instance, some Pf neurons give rise to terminals that end almost exclusively on dendritic spines, whereas others predominantly target dendritic shafts (Lacey et al., 2007). It is not known whether these neurons represent functionally different subpopulations of Pf-striatal cells.

The cloning of the vesicular glutamate transporters 1 and 2 (vGluT1 and vGluT2) (Fremeau et al., 2001, 2004) and the demonstration that these transporters are differentially expressed in corticostriatal (vGluT1-positive) or thalamostriatal (vGluT2-positive) terminals (but see Barroso-Chinea et al., 2008) have helped with the assessment of the relative prevalence, and the characterization of the synaptic connectivity, of corticostriatal and thalamostriatal terminals in rodents and nonhuman primates. These studies showed that 95% of all vGluT1-positive corticostriatal projections terminate on dendritic spines of striatal neurons, and 5% on dendrites, in both rodents and primates. This pattern is different from that of vGluT2-containing thalamostriatal boutons. For instance, only 50–65% (depending on the striatal region) of vGluT2-containing terminals contact dendritic spines in monkeys, while the remainder form asymmetric synapses with dendritic shafts (Raju et al., 2008). In rats, as many as 80% of vGluT2-positive terminals form axo-spinous synapses in the striatum (Raju et al., 2006; Lacey et al., 2007). Whether these differences in the proportion of vGluT2 terminals in contact with striatal spines represent a genuine species difference in the microcircuitry of the thalamostriatal system between primates and non-primates remains to be determined.

It is also important to note that the synaptic connectivity of vGluT2-containing terminals differs between the patch/striosome and the matrix compartments of the striatum in rats. While the ratio of axo-spinous and axo-dendritic synapses for vGluT2 immunoreactive terminals is 90:10 in the patch compartment, it is about 55:45 in the matrix (Raju et al., 2006; **Figure 1**). The fact that the massive thalamostriatal projection from Pf, a predominant source of axo-dendritic glutamatergic synapses, terminates exclusively in the striatal matrix accounts for this difference in the overall synaptology of vGluT2-containing terminals between the two striatal compartments (Herkenham and Pert, 1981; Sadikot et al., 1992b; Raju et al., 2006). Evidence that activity imbalances between the patch/striosome and matrix compartments may be involved in various basal ganglia disorders (Crittenden and Graybiel, 2011) highlights the potential significance of this compartmental segregation of CM/Pf inputs to the mammalian striatum. In contrast to vGluT2, the pattern of synaptic innervation of corticostriatal vGluT1-positive terminals does not differ between patch/striosome and matrix compartments (Raju et al., 2006).

The use of vGluT1 and vGluT2 also allowed the quantification of the relative prevalence of cortical and thalamic terminals in the rat and monkey striatum. In the monkey post-commissural putamen, ∼50% of putative glutamatergic terminals (i.e., those forming asymmetric synapses) express vGluT1, ∼25% contain vGluT2 and ∼25% do not display immunoreactivity for either vGluT1 or vGluT2 (Raju et al., 2008). In rats, the difference in the prevalence of vGluT1 over vGluT2 terminals is not as striking (35% vGluT1 vs. 25% vGluT2), and the percentage of putative glutamatergic terminals unlabeled for either transporter subtype is higher (∼40%) than in monkeys (Kaneko and Fujiyama, 2002; Fujiyama et al., 2004, 2006; Lacey et al., 2005; Huerta-Ocampo et al., 2013). It remains unclear whether the large proportion of putative glutamatergic terminals that are not immunopositive for vGluT1 or vGluT2 are simply cortical and thalamic boutons that express undetectable levels of vGluT1 or vGluT2, whether they express other, yet unidentified, vGluT(s) or whether they are non-glutamatergic.
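The prevalence figures quoted above can be tabulated as a quick consistency check. The following is only an illustrative sketch and not part of the original analyses: the names `TERMINAL_PREVALENCE` and `vglut1_to_vglut2_ratio` are hypothetical, and the percentages are the approximate, rounded values given in the text.

```python
# Approximate share (%) of putative glutamatergic terminals, as quoted in the
# text (monkey: post-commissural putamen; rat: striatum). Values are rounded
# estimates from the cited studies, not new data.
TERMINAL_PREVALENCE = {
    "monkey": {"vGluT1": 50, "vGluT2": 25, "unlabeled": 25},
    "rat": {"vGluT1": 35, "vGluT2": 25, "unlabeled": 40},
}


def vglut1_to_vglut2_ratio(species: str) -> float:
    """Return the ratio of cortical (vGluT1) to thalamic (vGluT2) prevalence."""
    shares = TERMINAL_PREVALENCE[species]
    return shares["vGluT1"] / shares["vGluT2"]


for species, shares in TERMINAL_PREVALENCE.items():
    total = sum(shares.values())  # the three classes should account for ~100%
    ratio = vglut1_to_vglut2_ratio(species)
    print(f"{species}: total = {total}%, vGluT1:vGluT2 = {ratio:.1f}")
```

With these rounded values, labeled cortical terminals outnumber labeled thalamic terminals by about 2:1 in the monkey and about 1.4:1 in the rat, which is the less striking difference noted in the text.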

#### **THALAMIC INPUTS TO DIRECT VS. INDIRECT PATHWAY NEURONS**

The striatum comprises two main populations of output neurons characterized by their differential expression of dopamine receptors and neuropeptides (D1/substance P/dynorphin or D2/enkephalin) (Gerfen, 1984). These so-called "direct" and "indirect pathway" MSNs receive synaptic inputs from the thalamus and the cerebral cortex (Somogyi et al., 1981; Hersch et al., 1995; Sidibe and Smith, 1999; Lanciego et al., 2004; Lei et al., 2004, 2013; Huerta-Ocampo et al., 2013). In rats, the proportion of thalamic (vGluT2-positive) or cortical (vGluT1-positive) terminals in contact with direct or indirect pathway MSNs is very similar when considered as a population (Doig et al., 2010; Lei et al., 2013), but the total number of cortical terminals is higher than the number of thalamic boutons in contact with *individual* MSNs (Huerta-Ocampo et al., 2013). Tract-tracing studies in monkeys suggested that afferents from the CM preferentially innervate direct pathway MSNs in the putamen (Sidibe and Smith, 1996). However, because the CM/Pf (or Pf in rodents) has a unique pattern of synaptic connectivity in the striatum compared with other thalamic nuclei (**Figures 1**, **2**), and because axonal tracers labeled only a subset of CM terminals in this study, the potentially preferential innervation of direct pathway neurons needs to be further assessed using vGluT2 or other, more general markers of the CM/Pf-striatal system.

The convergence of thalamic and cortical inputs upon single MSNs is consistent with *in vivo* and *in vitro* electrophysiological analyses showing that single direct or indirect pathway neurons respond to both cortical and thalamic stimulation in rodents (Kocsis and Kitai, 1977; Vandermaelen and Kitai, 1980; Ding et al., 2008; Ellender et al., 2011, 2013; Huerta-Ocampo et al., 2013). Recent studies have suggested that thalamic inputs may gate corticostriatal transmission via regulation of striatal cholinergic interneurons, and that this interaction may modulate behavioral switching and attentional set-shifting (Kimura et al., 2004; Ding et al., 2010; Smith et al., 2011; Sciamanna et al., 2012; Bradfield et al., 2013a,b).

#### **RELATIONSHIPS BETWEEN THALAMIC OR CORTICAL TERMINALS AND DOPAMINERGIC OR HISTAMINERGIC AFFERENTS**

The modulation of excitatory inputs from the cerebral cortex by dopaminergic afferents from the substantia nigra pars compacta is central to our understanding of the functional properties of the basal ganglia. The post-synaptic cortical-induced excitatory responses are modulated by dopamine acting through a wide variety of pre- and/or post-synaptic mechanisms dependent on the type and localization of dopamine receptors and the physiological state of striatal MSNs (Gonon, 1997; Reynolds et al., 2001; Cragg and Rice, 2004; Surmeier et al., 2007; Rice and Cragg, 2008; Ma et al., 2012). Although the regulatory effects of dopamine on thalamic glutamatergic transmission have not been directly assessed, the similarity between the overall pattern of synaptic connectivity of non-CM/Pf thalamic and cortical terminals with MSNs (Moss and Bolam, 2008) suggests that these two pathways may be regulated in the same manner by nigrostriatal dopamine afferents (**Figure 1**). The dopaminergic modulation of corticostriatal transmission relies in part on the synaptic convergence of dopaminergic and cortical synapses on individual spines of striatal MSNs (Freund et al., 1984; Bolam and Smith, 1990; Smith et al., 1994) and/or pre-synaptic dopamine-mediated regulation of glutamate release from neighboring cortical terminals (Surmeier et al., 2007). Our recent quantitative ultrastructural analyses of rat tissue immunolabelled to reveal both dopaminergic axons and cortical (vGluT1-positive) or thalamic (vGluT2-positive) terminals showed that both glutamatergic systems display the same structural relationships with dopaminergic afferents, i.e., all thalamic and cortical glutamatergic terminals are located within 1 µm of a dopaminergic synapse, suggesting that synaptically released and spilled over dopamine may modulate most glutamatergic terminals in the rodent striatum
(Arbuthnott et al., 2000; Arbuthnott and Wickens, 2007; Moss and Bolam, 2008; Rice and Cragg, 2008; but see Xu et al., 2012). However, it is unclear whether this general concept of interaction between dopaminergic and thalamostriatal afferents also applies to CM/Pf-striatal terminals that form axo-dendritic synapses. In light of our previous tracing studies which showed that axodendritic thalamic inputs from CM and dopaminergic terminals do not display significant structural relationships on the dendritic surface of striatal neurons (Smith et al., 1994), it is likely that the interactions between dopaminergic afferents and CM/Pf or non-CM/Pf thalamostriatal synapses differ (**Figure 1**).

Glutamatergic inputs from the cerebral cortex and thalamus to both direct and indirect pathway MSNs are also modulated pre-synaptically by histamine (Ellender et al., 2011). Histaminergic projections to the striatum that originate in the hypothalamus negatively modulate corticostriatal and thalamostriatal transmission through histamine H3 receptors.

## **AFFERENT CONNECTIONS OF CENTER MEDIAN/PARAFASCICULAR (CM/Pf)**

In addition to the massive GABAergic projections from the GPi and SNr (see above), the CM receives inputs from motor, premotor and somatosensory cortices (Mehler, 1966; Kuypers and Lawrence, 1967; Kunzle, 1976, 1978; Catsman-Berrevoets and Kuypers, 1978; DeVito and Anderson, 1982), while the Pf is the main target of the frontal and supplementary eye fields (Huerta et al., 1986; Leichnetz and Goldberg, 1988) and associative areas of the parietal cortex (Ipekchyan, 2011). The CM/Pf complex also receives significant afferents from various subcortical sources, including the superior colliculus (Grunwerg et al., 1992; Redgrave et al., 2010), the pedunculopontine tegmental nucleus (Pare et al., 1988; Parent et al., 1988; Barroso-Chinea et al., 2011), the cerebellum (Royce et al., 1991; Ichinohe et al., 2000), the raphe nuclei, the locus coeruleus (Lavoie and Parent, 1991; Royce et al., 1991; Vertes et al., 2010), and from the mesencephalic, pontine and medullary reticular formation (Comans and Snow, 1981; Steriade and Glenn, 1982; Hallanger et al., 1987; Cornwall and Phillipson, 1988; Vertes and Martin, 1988; Royce et al., 1991; Newman and Ginsberg, 1994).

## **PHYSIOLOGY OF THE THALAMOSTRIATAL PROJECTIONS**

#### **IN VIVO RECORDING STUDIES**

Experiments to study the role of the projections from the CM/Pf to the striatum date back to the 1970s. These studies showed that electrical stimulation of the intralaminar nuclei induces short-latency excitatory post-synaptic potentials (EPSPs) in anesthetized cats and rats (Kocsis and Kitai, 1977; Vandermaelen and Kitai, 1980), confirming the existence of a direct glutamatergic thalamostriatal connection. Later, Wilson and colleagues demonstrated that these responses occurred in striatal MSNs (Wilson et al., 1983) and in cholinergic interneurons (Wilson et al., 1990). In addition to short-latency EPSPs, both types of neurons also showed prolonged inhibitions, or long-latency excitations (Wilson et al., 1983, 1990), indicating that the thalamic stimulation also engaged polysynaptic pathways.

In more recent studies in awake monkeys, we carried out extracellular recordings in the striatum during electrical stimulation of the CM/Pf complex (Nanda et al., 2009; **Figure 3**). We found that striatal cells did not respond to single-pulse stimulation, but that many neurons showed long-latency (tens of milliseconds) responses to burst stimulation (100 Hz, 1 s) of the CM/Pf. While phasically active neurons (PANs), which likely correspond to striatal MSNs, responded mainly with increases in firing, tonically active striatal neurons (TANs, likely to be cholinergic interneurons) often showed combinations of increases and decreases in firing (Nanda et al., 2009; **Figure 3**). Both types of responses were most likely generated by activation of the intrastriatal circuitry. As mentioned above, anatomical observations support the idea that CM/Pf terminals contact both striatal GABAergic and cholinergic elements in primates and non-primates (Sidibe and Smith, 1999), and that cholinergic interneurons receive GABAergic synaptic inputs from collaterals of direct and indirect pathway MSNs (Gonzales et al., 2013). These collaterals may mediate the decreases in firing after the activation of the presumably excitatory thalamostriatal projections.

**FIGURE 3 | Electrophysiological responses of striatal neurons to electrical stimulation of CM in awake monkeys.** Electrical stimulation (100 Hz, 100 pulses; shaded area) of CM evokes responses in PANs (putatively MSNs) and TANs (putatively cholinergic interneurons) in awake rhesus monkeys. **(A)** Example of a PAN responding with increased firing to electrical CM stimulation. **(B)** Example of a TAN responding with a brief decrease followed by an increase in firing to CM stimulation. The histograms and rasters are aligned to the start of stimulation trains. **(C)** Summary of responses. The majority of PANs display increases or decreases in firing rate, while most TANs present combinatory (increases and decreases) responses following CM stimulation (see Nanda et al., 2009 for more details).

The eventual effects of CM stimulation on the striatal circuitry may depend on the experimental conditions chosen. For instance, pharmacological stimulation of the Pf in rats, or electrical stimulation of the CM/Pf in monkeys, was shown to reduce striatal acetylcholine levels, an effect that can be reversed by intrastriatal administration of GABA<sub>A</sub> receptor antagonists (Zackheim and Abercrombie, 2005; Nanda et al., 2009). The reduction in acetylcholine levels could be explained assuming that the thalamic activation drives intrastriatal GABAergic neurons that then secondarily inhibit cholinergic interneurons. However, other studies showed that electrical stimulation of Pf increased the level of acetylcholine in the rat striatum (Consolo et al., 1996a) in an NMDA-receptor dependent manner.

#### **IN VITRO RECORDINGS IN BRAIN SLICES**

A new rat brain slice preparation that partly preserved thalamostriatal axons (Smeal et al., 2007) has enabled studies of the chemical and functional properties of thalamostriatal synapses and the potential relationships between thalamostriatal and corticostriatal systems in the normal state (Ding et al., 2008; Smeal et al., 2008). Using this preparation, the ratio of NMDA/non-NMDA glutamatergic receptors was found to be higher at thalamic than cortical synapses (Ding et al., 2008; Smeal et al., 2008), an observation that extends earlier neurochemical studies in adult rats (Baldi et al., 1995; Consolo et al., 1996a,b). This slice preparation has also led to additional data suggesting that the thalamostriatal system gates corticostriatal signaling via activation of striatal cholinergic interneurons, and that this functional interaction might be altered in a mouse model of dystonia (Ding et al., 2008; Sciamanna et al., 2012). However, it is a limitation of this preparation that thalamostriatal projections from CM/Pf cannot be distinguished from those originating in other parts of the thalamus.

The introduction of optogenetic methods helped to further characterize the properties of specific thalamostriatal synapses in rats (Ellender et al., 2013). Thus, neurons in the central lateral nucleus (CL) have bushy, frequently branching, dendrites and, under anesthesia, fire action potentials in the form of low-threshold Ca<sup>2+</sup> spike bursts, while Pf neurons have long, infrequently branching dendrites and give rise to action potentials that are only rarely in the form of low-threshold bursts (Lacey et al., 2007; **Figure 2**). In the striatum, thalamostriatal terminals from the CL terminate almost exclusively on dendritic spines, while Pf boutons target predominantly dendritic shafts (**Figure 2**). *In vitro* optogenetic activation of the different pathways combined with whole-cell, patch-clamp recordings of direct or indirect pathway MSNs in adult mice (Ellender et al., 2013) revealed that stimulation of CL synapses leads to large amplitude, predominantly AMPA-receptor mediated, excitatory responses that display short-term facilitation. In contrast, stimulation of Pf synapses gives rise to small amplitude responses that display short-term depression and are largely mediated by post-synaptic NMDA receptors (Ellender et al., 2013; **Figure 4**). The high frequency Ca<sup>2+</sup> spike bursts in CL neurons together with the synaptic properties of CL thalamostriatal synapses suggest that thalamic inputs from CL are well suited to driving MSNs to depolarization and hence firing (Ellender et al., 2013). In contrast, the firing characteristics and properties of Pf neurons and their thalamostriatal synapses suggest that these are better suited to exert modulatory effects on striatal MSNs, which could be in the form of facilitating Ca<sup>2+</sup>-dependent processes. Furthermore, pairing Pf pre-synaptic stimulation with action potentials in MSNs leads to NMDA receptor- and Ca<sup>2+</sup>-dependent long-term depression at these synapses (Ellender et al., 2013; **Figure 4**).
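The distinction between short-term facilitation and short-term depression described above can be made concrete with a paired-pulse ratio, i.e., the amplitude of the second response divided by that of the first. The following is a minimal sketch only: the paired-pulse ratio is a commonly used measure and is assumed here for illustration, not taken from the cited study. The function names, the tolerance, and the amplitudes are hypothetical and are not data from Ellender et al. (2013).

```python
def paired_pulse_ratio(first_amplitude, second_amplitude):
    """Return the second response amplitude divided by the first."""
    if first_amplitude <= 0:
        raise ValueError("first_amplitude must be positive")
    return second_amplitude / first_amplitude


def classify_short_term_plasticity(ppr, tolerance=0.05):
    """Label a paired-pulse ratio; the tolerance is an arbitrary illustrative choice."""
    if ppr > 1 + tolerance:
        return "facilitation"  # second response larger than the first
    if ppr < 1 - tolerance:
        return "depression"    # second response smaller than the first
    return "no clear change"


# Hypothetical amplitudes (arbitrary units), chosen only to illustrate the two cases.
examples = {
    "CL-like input": (100.0, 140.0),
    "Pf-like input": (40.0, 28.0),
}

for label, (first, second) in examples.items():
    ppr = paired_pulse_ratio(first, second)
    print(f"{label}: PPR = {ppr:.2f} ({classify_short_term_plasticity(ppr)})")
```

A ratio above 1 corresponds to the facilitating profile reported for CL synapses, and a ratio below 1 to the depressing profile reported for Pf synapses.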

## **THE ROLE OF THE CENTER MEDIAN/PARAFASCICULAR (CM/Pf) THALAMOSTRIATAL SYSTEM IN COGNITION**

The CM/Pf-striatal system is now thought to be critical in mediating basal ganglia responses to attention-related stimuli, and may be engaged in behavioral switching and reinforcement functions (Kimura et al., 2004; Minamimoto et al., 2009; Smith et al., 2011; Bradfield et al., 2013a).

## **RESPONSES OF CENTER MEDIAN/PARAFASCICULAR (CM/Pf) NEURONS IN ATTENTION-RELATED TASKS**

Because the intralaminar nuclei receive massive ascending projections from the reticular formation and various brainstem regions (see above), and have long been known as the source(s) of widely distributed "nonspecific" thalamocortical projections, these thalamic nuclei are considered part of the ascending "reticular activating system" that regulates arousal and attention (as reviewed in Van der Werf et al., 2002). In line with this concept, functional imaging studies in humans demonstrated a significant increase of activity in CM/Pf during processing of attention-related stimuli (Kinomura et al., 1996; Hulme et al., 2010; Metzger et al., 2010). More recent observations in primates showed that CM and Pf neurons respond to behaviorally salient visual, auditory and somatosensory stimuli (Matsumoto et al., 2001; Minamimoto and Kimura, 2002; Minamimoto et al., 2005; **Figure 5**). In these studies, the response latencies of Pf neurons were much shorter than those of CM neurons (**Figure 5**). Compatible with the view that responses of CM/Pf neurons to external events are related to attention, the initially vigorous responses fade quickly upon repeated stimulus presentation if the stimuli are not followed by reward and thus lose their salience (Matsumoto et al., 2001; Minamimoto and Kimura, 2002; Kimura et al., 2004; Minamimoto et al., 2005). Acute pharmacological inactivation of Pf in monkeys disrupts attention processing more efficiently than CM inactivation (Minamimoto and Kimura, 2002). The functional responses of CM/Pf neurons to attention-related stimuli thus suggest a role of the CM/Pf-striatal system in cognition, most particularly related to attention shifting, behavior switching and reinforcement processes (Matsumoto et al., 2001; Minamimoto and Kimura, 2002; Kimura et al., 2004; Minamimoto et al., 2005; Smith et al., 2011; Bradfield et al., 2013a,b). There is also evidence that sensory-responsive CM neurons may be involved in mechanisms needed for decision-making and biasing actions (Minamimoto et al., 2005).

Additional evidence for a "cognitive" role of the CM/Pf projections to the striatum comes from studies in mice in which selective immunotoxin lesions or pharmacological inactivation of the Pf-striatal projection impair performance in a discrimination learning task (Brown et al., 2010; Kato et al., 2011). Furthermore, recent evidence indicates that the Pf projection to the posterior dorsomedial striatum is involved in regulating the interaction between new and previously learned stimuli (Bradfield et al., 2013a,b). It is noteworthy that neither lesion nor pharmacological inactivation of CM/Pf in monkeys or rodents leads to motor impairments (Matsumoto et al., 2001; Minamimoto and Kimura, 2002; Brown et al., 2010; Kato et al., 2011).

**FIGURE 4 | Electrophysiological responses of striatal MSNs to optogenetic activation of thalamostriatal terminals from CL or Pf in mice. (A)** Channelrhodopsin-2 (ChR2) was delivered to either the CL or Pf thalamic nucleus using a stereotaxic injection of adeno-associated virus containing the double-floxed sequence for ChR2-YFP in CAMKII-cre mice. This approach enabled expression of ChR2 in the excitatory thalamic neurons of either the CL or Pf nucleus including in their axonal arbor. Thalamic axons expressing ChR2-YFP were readily visible in acute striatal slices and this method allowed for activation of only those synapses originating from neurons from the injected thalamic nucleus by illumination of these slices with blue (473 nm) laser or LED light. **(B)** Whole-cell patch-clamp recordings of striatal MSNs showed that activation of CL synapses led to consistently larger post-synaptic responses than activation of Pf synapses. **(C)** Detailed investigation of the glutamate receptor-mediated currents revealed that the CL synapses exhibit predominantly AMPA receptor-mediated currents in response to light activation and the Pf synapses exhibit predominantly NMDA receptor-mediated currents. **(D)** The short-term plastic properties of these synapses were investigated by repetitive activation of the synapses in close succession. This revealed that inputs from CL exhibit short-term facilitation, in which the response to the second activation has a larger amplitude than the response to the first activation, whereas inputs from Pf exhibit short-term depression. **(E)** The long-term plastic properties of these synapses were investigated using a spike timing-dependent plasticity protocol, consisting of the pairing of optical activation of a pre-synaptic thalamic input with a post-synaptic action potential in an MSN. This protocol induced a clear long-term depression of synaptic efficacy at Pf synapses, but no plasticity was observed at CL synapses. The same observation was made using either a pre-post or post-pre pairing, with only the pre-post pairing shown for clarity (see Ellender et al., 2013 for more details).

#### **CENTER MEDIAN/PARAFASCICULAR (CM/Pf)-STRIATAL SYSTEM REGULATION OF STRIATAL CHOLINERGIC INTERNEURONS**

Reward-associated events evoke pause responses in striatal TANs (which are likely to be cholinergic interneurons) (Goldberg and Reynolds, 2011). These responses are regulated in part by the CM/Pf-striatal system, because they are almost completely abolished by chemical inactivation of the CM/Pf complex in monkeys (Matsumoto et al., 2001; **Figure 5**). Furthermore, the removal of Pf inputs to cholinergic interneurons reduces the firing rate of these neurons and produces an enduring deficit in goal-directed learning (Bradfield et al., 2013a). These observations are consistent with the fact that cholinergic interneurons receive synaptic inputs from CM/Pf (Lapper and Bolam, 1992; Sidibe and Smith, 1999), that CM stimulation strongly affects TAN activity patterns (Wilson et al., 1990; Nanda et al., 2009) and that CM/Pf alterations affect striatal acetylcholine release (Consolo et al., 1996a,b; Zackheim and Abercrombie, 2005; Nanda et al., 2009). As mentioned above, several mechanisms have been proposed to explain how activation of the glutamatergic CM/Pf-striatal projection evokes inhibitions or pauses in TAN firing, including the involvement of intercalated GABAergic and dopaminergic elements, as well as intrinsic properties of cholinergic interneurons (Ding et al., 2010; Goldberg and Reynolds, 2011; Sciamanna et al., 2012; Threlfell et al., 2012).

## **DEGENERATION OF THE THALAMOSTRIATAL SYSTEM IN PARKINSON'S DISEASE (PD)**

Postmortem studies have shown that 30–40% of CM/Pf neurons are lost in PD patients with mild motor deficits, and that the extent of CM/Pf degeneration does not further progress with the severity of the Parkinsonian motor signs (Xuereb et al., 1991; Heinsen et al., 1996; Henderson et al., 2000a,b, 2005; Brooks and Halliday, 2009; Halliday, 2009). We have recently found a similarly robust loss of CM/Pf neurons in monkeys that were chronically treated with low doses of the neurotoxin 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP), even in motor asymptomatic animals with minimal nigrostriatal dopaminergic denervation (Villalba et al., 2014; **Figure 6**). In PD patients and MPTP-treated monkeys, this thalamic degeneration predominantly affects CM/Pf (Henderson et al., 2000a,b; Villalba et al., 2014), although significant neuronal loss was also reported in the parataenial, cucullar and central lateral nuclei of PD patients (Halliday, 2009). In the brain of PD patients, α-synuclein deposition was found in the latter three nuclei, but not as much in CM/Pf (Brooks and Halliday, 2009). Parvalbumin-negative neurons are particularly affected in CM (Halliday, 2009). It is noteworthy that robust CM/Pf neuronal loss is not only found in PD, but has also been found in other neurodegenerative diseases, including progressive supranuclear palsy and Huntington's disease (Heinsen et al., 1996; Henderson et al., 2000a,b). The cellular properties of CM/Pf neurons need to be studied in more detail to determine the potential factors that make them more sensitive to degeneration than other thalamic cells in these diseases (see also below).

In rodent and primate models of PD, striatal dopamine loss is associated with a loss of glutamatergic synapses (Ingham et al., 1998; Villalba et al., 2013), which is consistent with a loss of spines on striatal MSNs in PD postmortem material (see below). Stereological estimates of the number of glutamatergic synapses formed by vGluT1- (marker of cortical terminals) or vGluT2- (marker of thalamic terminals) containing terminals in the putamen of MPTP-treated monkeys showed that the number of vGluT2-positive terminals, but not that of vGluT1-containing boutons, is substantially reduced in Parkinsonian animals (Villalba et al., 2013). This suggests that the loss of thalamic glutamatergic inputs to the striatum outweighs the loss of cortical terminals in MPTP-treated monkeys (Villalba et al., 2013). Because CM/Pf inputs to striatal MSNs predominantly terminate on dendritic shafts (see above), these findings are in line with our previous studies which had demonstrated that MPTP treatment reduces the relative prevalence of vGluT2-positive axo-dendritic synapses in the putamen more strongly than that of axo-spinous synapses (Raju et al., 2008). Together, these results support the hypothesis that the loss of CM/Pf inputs to the striatum prominently contributes to the glutamatergic deafferentation of MSNs in PD. It is not yet clear whether the loss of striatal inputs from CM/Pf also affects the glutamatergic innervation of cholinergic interneurons.

The loss of glutamatergic afferents is reflected in a profound loss of dendritic spines of striatal MSNs in PD patients (Stephens et al., 2005; Zaja-Milatovic et al., 2005), rodent models of PD (Ingham et al., 1989; Day et al., 2006; Kusnoor et al., 2009) and in MPTP-treated monkeys (Raju et al., 2008; Smith et al., 2009; Villalba et al., 2009, 2013; Villalba and Smith, 2010, 2011, 2013). In the latter studies, a 40–50% spine loss was found in the sensorimotor striatum, similar to findings in PD patients (Zaja-Milatovic et al., 2005; Villalba et al., 2009). The link between the aforementioned striatal glutamatergic deafferentation and spine loss remains uncertain. For instance, it remains unclear whether the CM/Pf degeneration and the resulting loss of glutamatergic inputs to the striatum contribute to the changes in the number and morphology of dendritic spines in PD. As a further complication, the use of dopaminergic drugs can also affect spine growth and morphology. Thus, a recent study showed aberrant restoration of axo-spinous synapses that affects corticostriatal, but not thalamostriatal synapses, in unilaterally 6-hydroxydopamine (6-OHDA)-lesioned rats that developed L-3,4-dihydroxyphenylalanine (L-DOPA)-induced dyskinesias (Zhang et al., 2013), suggesting a differential impairment of the two glutamatergic systems in dyskinesia. Details about the extent of Pf neuronal loss in these animals are needed to translate these findings to human PD patients with L-DOPA-induced dyskinesia.

#### **POTENTIAL REASONS FOR CENTER MEDIAN/PARAFASCICULAR (CM/Pf) NEURON LOSS IN PD**

Although it remains unclear why CM/Pf neurons are particularly sensitive to neurodegeneration in PD and other disorders (Halliday et al., 2005; Halliday, 2009), their chemical phenotype and sensitivity to chronic MPTP administration in non-human primates (Villalba et al., 2014) may provide clues. The CM/Pf nuclei do not carry a particularly high level of α-synuclein deposits in PD patients (Brooks and Halliday, 2009), so their vulnerability in PD cannot be explained by the burden of α-synuclein aggregation.

In rodents, striatal terminals from the Pf express immunoreactivity for the protein cerebellin 1, a neurochemical feature that appears to be specific for the CM/Pf-striatal system, and may also regulate the morphology of striatal spines (Kusnoor et al., 2010). It remains to be established whether this unique molecular characteristic amongst thalamic neurons contributes to their susceptibility. It is also possible that the absence of calcium-binding protein expression in CM/Pf neurons determines their differential vulnerability. In humans, postmortem studies have shown that subpopulations of parvalbumin-containing neurons are mainly affected in Pf, while non-parvalbumin/non-calbindin neurons are more specifically targeted in CM (Henderson et al., 2000a). In MPTP-treated animals, the degeneration of CM/Pf could instead be due to direct toxic effects of the MPTP metabolite, 1-methyl-4-phenylpyridinium (MPP+), on the thalamus, independent of its effects on the nigrostriatal system (Villalba et al., 2014). Consistent with this possibility, injections of MPP+ into the rodent striatum resulted in a major neuronal loss in Pf, without significant effects on cortical neurons (Ghorayeb et al., 2002).

## **THE CENTER MEDIAN/PARAFASCICULAR (CM/Pf) AS A TARGET OF NEUROSURGICAL INTERVENTIONS IN BRAIN DISORDERS**

Although the physiological properties of the caudal intralaminar nuclei and their projections remain poorly characterized, these nuclei have been used as targets for surgical interventions, aimed at treating pain, seizures, impairments of consciousness, or movement disorders. We will focus our discussion on the use of neurosurgical procedures as treatment of movement disorders, because this use is most easily linked to the interactions between CM/Pf and the basal ganglia. These procedures have been used specifically in patients with disabling TS, or with PD. The mechanisms of action of CM/Pf interventions in these diseases, and the specifics of the optimal surgical approach and DBS characteristics remain matters of speculation. Furthermore, inclusion and exclusion criteria for trials of these interventions are only beginning to emerge for TS patients, while no formal criteria have yet been developed for trials in patients with PD.

#### **ABLATIVE SURGERIES OF CENTER MEDIAN/PARAFASCICULAR (CM/Pf)**

Since the 1960s, unilateral or bilateral lesions of the intralaminar and medial thalamic nuclei, as well as the nucleus ventro-oralis internus (Voi) have been empirically used to treat patients with TS (see below) (Hassler and Dieckmann, 1970, 1973; de Divitiis et al., 1977; Hassler, 1982). These studies have reported impressive reductions in tic frequency in these patients, along with a lesser reversal of compulsive symptoms. The effects of CM/Pf lesions in other movement disorders (such as Parkinsonism) have not been extensively characterized. However, in rodents, Pf lesions were shown to prevent the neurochemical changes produced by dopamine denervation in different basal ganglia nuclei (Kerkerian-Le Goff et al., 2009). Similar experiments in MPTP-treated primates have not resulted in significant antiparkinsonian effects (Lanciego et al., 2008).

#### **CM/Pf DEEP BRAIN STIMULATION AND TS**

TS is a neuropsychiatric disorder of childhood onset. Patients develop rapid, stereotyped movements (tics) which typically peak in preadolescence and decline in the later teenage years. Many TS patients also suffer from psychiatric comorbidities, such as obsessive compulsive disorder, attention-deficit hyperactivity disorder, or depression. Most patients are successfully (albeit partially) treated with neuroleptics and other drugs or with behavioral therapy. However, a few patients continue to experience severe tics in adulthood. These patients are candidates for neurosurgical procedures. Although still under considerable debate, TS may be the result of abnormal GABAergic or dopaminergic transmission in the basal ganglia (see, e.g., Buse et al., 2013; Worbe et al., 2013), involving both motor and limbic basal ganglia-thalamocortical circuitry.

Although there is little evidence linking the pathology or functional disturbances of CM/Pf to TS, this nuclear complex has been a major focus of surgical treatment of this condition, mostly because of the early empirical evidence with ablative treatments (see above). Early investigations of the use of stimulation of CM/Pf in movement disorders were carried out in patients that were enrolled in pain treatment studies, but also suffered from movement disorders (Andy, 1980; Krauss et al., 2002). Significant symptomatic improvements of motor dysfunctions were reported in these studies, but the stimulation parameters were not communicated. Since then, impressive reductions in tic frequency and severity, perhaps with greater effectiveness against motor than vocal tics, have been reported, although the number of TS patients treated with CM/Pf DBS remains small (Visser-Vandewalle et al., 2003, 2004, 2006; Temel and Visser-Vandewalle, 2004; Houeto et al., 2005; Ackermans et al., 2006, 2008, 2010, 2011; Bajwa et al., 2007; Maciunas et al., 2007; Servello et al., 2008, 2010; Shields et al., 2008; Porta et al., 2009; Hariz and Robertson, 2010; Sassi et al., 2011; Maling et al., 2012; Savica et al., 2012; Visser-Vandewalle and Kuhn, 2013). The time course of tic improvement varies between individuals, ranging from immediate effects (Visser-Vandewalle et al., 2003; Maciunas et al., 2007) to a more protracted time course (Maciunas et al., 2007; Servello et al., 2008). In addition to the motor symptoms of the disease, CM/Pf DBS also effectively alleviated some of the psychiatric components of TS, including obsessive-compulsive behaviors and anxiety (Houeto et al., 2005; Mink, 2006; Visser-Vandewalle et al., 2006; Neuner et al., 2009; Krack et al., 2010; Sassi et al., 2011). 
The mechanisms of action of CM/Pf stimulation on TS signs and symptoms remain unclear, but the anatomy and potential role of the thalamostriatal system from CM/Pf in cognition, as well as functional studies in animals indicating the key role of CM/Pf in regulating striatal cholinergic interneuron activity (see above), suggest that CM/Pf stimulation may mediate its effects through complex regulation of striatal microcircuits that influence both motor and non-motor basal ganglia-thalamocortical and thalamostriatal networks (Nanda et al., 2009; Kim et al., 2013).

## **CENTER MEDIAN/PARAFASCICULAR (CM/Pf) DBS AND PARKINSON'S DISEASE (PD)**

CM/Pf DBS was found to provide significant anti-parkinsonian benefits in 6-OHDA-treated rats (Jouve et al., 2010). CM/Pf DBS has also been used in a few PD patients. These studies have suggested that CM/Pf DBS may have anti-dyskinetic effects and reduce freezing of gait, a symptom that is not satisfactorily treated with either medications or conventional DBS approaches directed at subthalamic and pallidal targets (Caparros-Lefebvre et al., 1999; Mazzone et al., 2006). More recent studies have suggested that CM/Pf DBS may also reduce Parkinsonian tremor (Peppe et al., 2008; Stefani et al., 2009).

## **CONCLUDING REMARKS AND FUTURE STUDIES**

Despite significant progress in our understanding of the anatomy, physiology and pathophysiology of the thalamostriatal systems, many unresolved issues remain. The cloning of vGluT1 and vGluT2 has had a significant impact on our understanding of the anatomical and synaptic organization of the thalamostriatal systems, allowing us to further appreciate that the thalamus is a massive source of extrinsic glutamatergic inputs to the striatum that originate either in the CM/Pf nuclear complex, or in the numerous non-CM/Pf thalamic nuclei. Future studies aimed at understanding the physiological role of these multiple thalamostriatal circuits, and their functional interactions with the corticostriatal and nigrostriatal systems, are warranted to decipher the mechanisms by which these extrinsic afferents regulate basal ganglia functions.

Because of the complex relationships between the cerebral cortex, thalamus and striatum, the use of traditional stimulation or lesion methods has had a limited impact on our understanding of the physiology and synaptic properties of thalamostriatal connections. However, as presented in this review, optogenetic approaches help overcome some of these technical challenges, setting the stage for a rigorous and detailed characterization of the physiology and pathophysiology of the CM/Pf- vs. non-CM/Pf-striatal systems in normal and diseased states.

The major breakthroughs that were recently made in characterizing some aspects of the role of the CM/Pf-striatal system in regulating the physiological responses of striatal cholinergic interneurons to attention-related salient sensory stimuli provide a deeper understanding of the mechanisms by which the basal ganglia regulate activities such as behavioral switching, attentional set-shifting and reinforcement. Because the CM/Pf-striatal system undergoes massive degeneration in PD, future studies aimed at assessing the effects of this degeneration upon attention and other basal ganglia-related cognitive functions are needed. A better understanding of the respective roles played by the thalamostriatal vs. the nigrostriatal dopamine systems in the regulation of cholinergic interneuron activity is another area of great interest for future studies.

In light of neuropathology studies of the thalamus in human patients with PD (Halliday, 2009), combined with studies in MPTP-treated non-human primates, it appears that CM/Pf neurons are particularly sensitive to degeneration in PD, other neurodegenerative diseases, and after neurotoxic insults. Thus, future studies aimed at elucidating the chemical, physiological and pharmacological properties of CM/Pf neurons vs. other thalamic cells are essential to determine the basis for the selective vulnerability of CM/Pf neurons in brain disorders.

Finally, another area of great interest is the use of DBS of CM/Pf in movement disorders, most particularly in TS. We need to understand better how stimulation of the thalamostriatal system from CM/Pf alleviates tics and psychiatric symptoms of this disease. Furthermore, rigorous blinded trials in a large number of patients still need to be done before this treatment can be recommended for patients with TS.

## **REFERENCES**


ganglia pathways in the rat: Ipsi- and contralateral projections. *J. Comp. Neurol.* 483, 143–153. doi: 10.1002/cne.20421


Minamimoto, T., and Kimura, M. (2002). Participation of the thalamic CM-Pf complex in attentional orienting. *J. Neurophysiol.* 87, 3090–3101.


neurons of rat dorsal striatum. *Eur. J. Neurosci.* 28, 2041–2052. doi: 10.1111/j.1460-9568.2008.06505.x


median raphe nucleus in the rat. *J. Comp. Neurol.* 275, 511–541. doi: 10.1002/cne.902750404


Zhang, Y., Meredith, G. E., Mendoza-Elias, N., Rademacher, D. J., Tseng, K. Y., and Steece-Collier, K. (2013). Aberrant restoration of spines and their synapses in L-DOPA-induced dyskinesia: involvement of corticostriatal but not thalamostriatal synapses. *J. Neurosci.* 33, 11655–11667. doi: 10.1523/jneurosci.0288-13.2013

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 November 2013; accepted: 11 January 2014; published online: 30 January 2014.*

*Citation: Smith Y, Galvan A, Ellender TJ, Doig N, Villalba RM, Huerta-Ocampo I, Wichman T and Bolam JP (2014) The thalamostriatal system in normal and diseased states. Front. Syst. Neurosci. 8:5. doi: 10.3389/fnsys.2014.00005*

*This article was submitted to the journal Frontiers in Systems Neuroscience. Copyright © 2014 Smith, Galvan, Ellender, Doig, Villalba, Huerta-Ocampo, Wichman and Bolam. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Computational models of basal-ganglia pathway functions: focus on functional neuroanatomy

## *Henning Schroll 1,2,3,4\* and Fred H. Hamker 1,4\**

*<sup>1</sup> Bernstein Center for Computational Neuroscience, Charité – Universitätsmedizin Berlin, Berlin, Germany*

*<sup>2</sup> Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany*

*<sup>3</sup> Department of Neurology, Charité – Universitätsmedizin Berlin, Berlin, Germany*

*<sup>4</sup> Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia Andrea Stocco, University of Washington, USA*

#### *\*Correspondence:*

*Fred H. Hamker and Henning Schroll, Department of Computer Science, Chemnitz University of Technology, Straße der Nationen 62, 09111 Chemnitz, Germany e-mail: fred.hamker@informatik.tu-chemnitz.de; henning.schroll@informatik.tu-chemnitz.de*

Over the past 15 years, computational models have had a considerable impact on basal-ganglia research. Most of these models implement multiple distinct basal-ganglia pathways and assume them to fulfill different functions. As there is now a multitude of different models, it has become difficult to keep track of their various, sometimes only marginally different assumptions on pathway functions. Moreover, it has become a challenge to assess to what extent individual assumptions are corroborated or challenged by empirical data. Focusing on computational, but also considering non-computational models, we review influential concepts of pathway functions and show to what extent they are compatible with or contradict each other. Moreover, we outline how empirical evidence favors or challenges specific model assumptions and propose experiments that allow testing assumptions against each other.

**Keywords: dopamine, reinforcement learning, response selection, response timing, working memory, gating, stimulus-response association**

## **1. INTRODUCTION**

#### **1.1. INTRODUCTION TO THE CONCEPT OF BASAL-GANGLIA PATHWAYS**

Basal ganglia (BG) contain a variety of both glutamatergic and GABAergic fiber tracts. Why is BG organization that complex? Two influential theories, published more than 20 years back (Albin et al., 1989; DeLong, 1990), came up with a first idea: they proposed that BG control excitation and inhibition of cortex, therefore requiring two distinct pathways: a direct pathway (cortex→striatum→globus pallidus internus) was assumed to facilitate motor cortical activity, while an indirect pathway (cortex→striatum→globus pallidus externus→subthalamic nucleus→globus pallidus internus) was assumed to inhibit motor-cortical firing. These concepts provided an explanation of prominent BG motor disorders: over-activity of the excitatory direct pathway was assumed to result in overshoot of motor activity (as in Huntington's disease), while over-activity of the inhibitory indirect pathway was proposed to result in pathological motor inhibition (as in Parkinson's disease; Albin et al., 1989; DeLong, 1990). Inspired by this intuitive concept and the fact that it was later discovered to fail at explaining some prominent empirical findings (e.g., Marsden and Obeso, 1994), revised and extended models have been developed since. As part of this process, an additional, shorter route of the indirect pathway (cortex→striatum→globus pallidus externus→globus pallidus internus) has been proposed (Smith et al., 1998) as well as an additional hyperdirect pathway (cortex→subthalamic nucleus→globus pallidus internus; Nambu et al., 2002). If all of these pathways can be identified to fulfill distinct functions, the complexity of BG anatomy might be understood as a necessity to guarantee BG functionality.

According to general understanding, direct, indirect and hyperdirect BG pathways transmit cortical input to globus pallidus internus (GPi) and substantia nigra reticulata (SNr), two largely analogous BG output nuclei that tonically inhibit the thalamus (**Figure 1**). The direct pathway proceeds from cortex via striatum to GPi; information traversing this pathway has to pass a glutamatergic synapse first and a GABAergic synapse afterwards (**Figure 1**). Cortical input to the direct pathway thus reduces GPi firing, which in turn increases activity in thalamus and cortex. The short indirect pathway passes from cortex to GPi via striatum and globus pallidus externus (GPe); synapses are glutamatergic, GABAergic and GABAergic, respectively. The *long* indirect pathway, in contrast, additionally passes through the subthalamic nucleus (STN) and contains an additional glutamatergic synapse (**Figure 1**). Cortical input to either of the two indirect pathways thus increases GPi firing. The hyperdirect pathway, finally, passes from cortex via STN to GPi and contains glutamatergic synapses only; cortical input to this pathway therefore increases GPi activity as well. Pathways are usually assumed to transmit information in a feed-forward manner; existing feedback-projections (e.g., from GPe to striatum or from STN to GPe; cf. **Figure 1**) are either assumed to not be part of these pathways or are assumed to be required for stabilization of information transmission only.
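The sign logic described above can be illustrated with a minimal sketch: if each glutamatergic synapse is coded as +1 and each GABAergic synapse as -1, the net effect of cortical input on GPi is the product of the signs along a pathway. The names `PATHWAYS`, `net_sign` and `effect_on_thalamus` are hypothetical and chosen for illustration only; the sketch ignores synaptic strengths, timing and feedback projections.

```python
# Illustrative sign-only abstraction; not taken from any published model.
GLU, GABA = +1, -1  # excitatory vs. inhibitory synapse

# Synapse sequences from cortex to GPi, as described in the text.
PATHWAYS = {
    "direct": [GLU, GABA],                    # cortex->striatum->GPi
    "short indirect": [GLU, GABA, GABA],      # cortex->striatum->GPe->GPi
    "long indirect": [GLU, GABA, GABA, GLU],  # cortex->striatum->GPe->STN->GPi
    "hyperdirect": [GLU, GLU],                # cortex->STN->GPi
}


def net_sign(synapses):
    """Multiply synapse signs: +1 means GPi firing increases, -1 decreases."""
    sign = 1
    for s in synapses:
        sign *= s
    return sign


def effect_on_thalamus(synapses):
    """GPi tonically inhibits thalamus, so the sign flips once more."""
    return -net_sign(synapses)


for name, synapses in PATHWAYS.items():
    direction = "increases" if net_sign(synapses) > 0 else "decreases"
    print(f"{name}: cortical input {direction} GPi firing")
```

Under this abstraction only the direct pathway disinhibits thalamus, while both indirect routes and the hyperdirect pathway increase GPi firing.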

In the last decade, the rise of computational simulation techniques has boosted model development. Today, there is a multitude of different models, none of which yet accounts for all relevant empirical findings (section 9). Most of these models assume a clear anatomical separation between the different pathways. Although this is likely a simplification (Lévesque and Parent, 2005), physiological data corroborate the assumption of functionally separate pathways: electrical stimulation of cortex results in three temporally distinct changes of activity in GPi that can be traced back to the effects of direct, indirect and hyperdirect pathways, respectively (Nambu et al., 2000; Kita et al., 2006; Kita and Kita, 2011). Thus, even if pathways are not built out of distinct sets of neurons, they appear to be functionally separated.

#### **1.2. WHY COMPUTATIONAL MODELING?**

Most of the models and hypotheses we will review offer not just verbal and graphical descriptions, but an additional mathematical (i.e., computational) implementation. Such mathematical implementations offer important advantages, including, but not limited to the following: they allow computing the effects of non-linear interactions between simulated neurons that would be impossible to compute mentally. Moreover, they are innately precise, thus preventing fuzzy assumptions; if some of a model's various assumptions contradict each other or do not interact well, the model will fail to produce meaningful output. Finally, computational models produce predictions that do not immediately originate from their assumptions. Such predictions might, for instance, relate to model performance during specific behavioral tasks. As a note of caution, however, computational models are often hard to grasp intuitively: a set of mathematical formulas does not innately reveal what function a model serves. Rather, extensive and often iterative simulations are required to reveal these functions. To report and review computational models, thus, verbal and graphical descriptions of model assumptions and outputs are required as well. These, however, may suffer from lack of precision and in any case simplify a model's "real" computational details.

In the context of BG functioning, computational modeling has been particularly fruitful in recent years. The complexity of BG anatomy and physiology, in light of their substantial interactions with cortex, thalamus and other sub-cortical nuclei, makes them a good target for computational modeling.

## **2. ANATOMICAL AND PHYSIOLOGICAL CONSTRAINTS FOR INTERPRETATIONS OF PATHWAY FUNCTIONS**

#### **2.1. PATHWAY AFFERENTS FROM CORTEX AND THALAMUS**

The striatum (which is part of direct and indirect pathways) receives topographically organized inputs both from intratelencephalically-projecting cortical cells and from axon collaterals of cortical pyramidal-tract neurons (**Figure 2A**; Donoghue and Kitai, 1981; Lei et al., 2004; Parent and Parent, 2006; Shepherd, 2013). Cortico-striatal cells are predominantly located in cortical layer V, but also in layers II, III, and IV (Rosell and Giménez-Amaya, 1999). Striatal medium spiny neurons (MSNs) of the direct pathway have been shown to receive the majority of their inputs from intratelencephalically projecting cortico-striatal neurons, while striatal MSNs of the indirect pathways receive a greater proportion of inputs from axon collaterals of cortical pyramidal-tract neurons (Lei et al., 2004). The indirect pathways' inputs might thus largely consist of efference copies of motor output, informing this pathway about currently initiated responses.

Next to its MSNs, striatum contains cholinergic and several types of GABAergic interneurons (Tepper, 2010). While GABAergic interneurons receive extensive cortical input (Lapper et al., 1992; Kawaguchi, 1993; Ramanathan et al., 2002), cholinergic interneurons might receive more extensive input from thalamus than from cortex (Lapper and Bolam, 1992; Kawaguchi, 1993). Thalamic efferents to striatum are extensive and topographically organized (Berendse and Groenewegen, 1990; Lanciego et al., 2004).

STN (which gives rise to the hyperdirect pathway) receives topographically organized inputs from frontal and motor cortices (Hartmann-von Monakow et al., 1978; Afsharpour, 1985), again mainly from layer V (Canteras et al., 1990). Its cortical afferents have been described as deriving mainly from axon collaterals of cortico-fugal pyramidal-tract neurons (Giuffrida et al., 1985; Kita and Kita, 2012), thus potentially providing STN with efference copies of motor output. Potential inputs from sensory cortical areas have been both reported (Canteras et al., 1988) and repudiated (Afsharpour, 1985; Kolomiets et al., 2001). In any case, sensory cortices may modulate STN activity multi-synaptically via striatum and GPe of the long indirect pathway (Kolomiets et al., 2001). Like the striatum, STN receives topographically organized inputs from thalamus (Lanciego et al., 2004).

#### **2.2. PATHWAY CONDUCTION VELOCITIES**

Electrical stimulation of motor cortex results in triphasic changes of activity in GPi (**Figure 2B**; Nambu et al., 2000). Approximately 8 ms after stimulation of primary motor cortex, a fast excitation of GPi is observed that is followed by a short inhibition at about 21 ms after stimulation and a late excitation at about 30 ms after stimulation on average (Nambu et al., 2000). Chemical blocking of BG nuclei as well as parallel recordings in STN and GPe have shown that the fast excitation is caused by the hyperdirect pathway, the short inhibition by the direct pathway and the late excitation by the long indirect pathway (Nambu et al., 2000; Kita et al., 2006; Kita and Kita, 2011). The hyperdirect pathway's exceptionally fast response has been linked to unique properties of STN neurons, involving a slow decay of excitatory postsynaptic potentials (EPSPs) and a dynamic decrease in spike threshold after EPSPs (Farries et al., 2010; Kita and Kita, 2011).

#### **2.3. ARBORIZATION PATTERNS OF PATHWAY OUTPUTS**

BG pathways differ in how broadly they arborize in GPi and thus affect different numbers of GPi neurons: striatal neurons arborize with a high degree of specificity in globus pallidus in monkeys (Hazrati and Parent, 1992b), while STN neurons more uniformly excite large numbers of pallidal cells (Hazrati and Parent, 1992a,b; **Figure 2C**). Despite their different patterns of arborization, however, striatal and subthalamic cells were found to converge onto the same pallidal neurons in internal and external segments of globus pallidus (Hazrati and Parent, 1992a,c). Based on this evidence, the direct pathway and the short indirect pathway are usually assumed to influence relatively focused pallidal representations, whereas the hyperdirect pathway and the long indirect pathway likely exert relatively global effects (cf. Mink, 1996; Nambu et al., 2002; Brown et al., 2004; Nambu, 2004; Frank, 2006).

#### **2.4. DOPAMINERGIC IMPACTS ON BG PATHWAYS**

Synaptic plasticity in BG pathways is modulated by dopamine (Shen et al., 2008): while dopamine facilitates long-term potentiation (LTP) in cortico-striatal synapses of the direct pathway via D1-type dopamine receptors, it facilitates long-term depression (LTD) in cortico-striatal synapses of the indirect pathways via D2-type dopamine receptors (**Figure 2D**; Shen et al., 2008). Phasic BG dopamine signals, as emitted by neurons of substantia nigra compacta (SNc), have been hypothesized to encode error signals of reward prediction (Hollerman and Schultz, 1998): whenever an animal receives more reward than could be expected based upon previous reinforcement contingencies, dopamine neurons increase their firing above a low baseline rate; whenever less reward is received than could have been expected, firing decreases below this baseline. These findings inspired proposals that BG play an important role in reinforcement learning processes in the brain. Dopamine neurons, moreover, do not exclusively respond to rewarding events, but presumably also to salient non-rewarding and to aversive events (Bromberg-Martin et al., 2010). Dopaminergic effects on synaptic plasticity are well studied only for cortico-striatal synapses of direct and indirect pathways (Gerfen et al., 1990; Shen et al., 2008). For striatal outputs to GPe, GPi, and SNr, in contrast, the effects of dopamine have not yet been studied in similar detail. There are, however, hints that dopamine might modulate synaptic plasticity in these nuclei as well: all of them are innervated by axons of SNc dopamine neurons (Cossette et al., 1999; Gauthier et al., 1999); oral administration of the dopamine precursor levodopa modulates activity-dependent synaptic plasticity in SNr (Prescott et al., 2009). Moreover, it has been shown that SNr and entopeduncular nucleus (rat GPi equivalent) predominantly express D1 dopamine receptors (Boyson et al., 1986; Levey et al., 1993), that globus pallidus (rat GPe equivalent) expresses relatively high quantities of D2 receptors, but probably still more D1 dopamine receptors (Boyson et al., 1986; Levey et al., 1993) and that STN expresses both D1-type and D2-type receptors in considerable quantities (Flores et al., 1999). For connections from STN to SNr, moreover, D1 receptor agonists have been found to increase excitatory postsynaptic currents (EPSCs), whereas D2 receptor agonists decrease them (Ibañez-Sandoval et al., 2006). Based on these pieces of evidence, it has been assumed that increases in dopamine levels facilitate LTP along the entire direct pathway (involving both cortico-striatal and striato-GPi/striato-SNr synapses), LTD along the entire short indirect pathway and LTP along the entire hyperdirect pathway (Schroll et al., 2013). Although this interpretation is consistent with existing empirical data, it has not yet been proven directly.
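To make the interplay of reward-prediction errors and receptor-specific plasticity concrete, the following minimal sketch computes a prediction error as received minus expected reward and applies it with opposite signs to a direct-pathway (D1) and an indirect-pathway (D2) cortico-striatal weight. The function names, the scalar weights and the learning rate `lr` are assumptions made for illustration. Treating below-baseline dopamine as simply reversing the sign of plasticity is a further simplification of this sketch; it is not the learning rule of any specific model reviewed here.

```python
def prediction_error(reward, expected):
    """Positive if more reward than expected, negative if less."""
    return reward - expected


def update_weights(w_d1, w_d2, reward, expected, lr=0.1):
    """Hypothetical dopamine-modulated update of two cortico-striatal weights.

    Above-baseline dopamine (delta > 0): LTP at the D1 (direct) synapse,
    LTD at the D2 (indirect) synapse. The reverse for delta < 0 is an
    assumption of this sketch.
    """
    delta = prediction_error(reward, expected)
    w_d1 = w_d1 + lr * delta
    w_d2 = w_d2 - lr * delta
    return w_d1, w_d2


# Unexpected reward strengthens the direct and weakens the indirect weight.
print(update_weights(0.5, 0.5, reward=1.0, expected=0.0))
# A fully predicted reward leaves both weights unchanged.
print(update_weights(0.5, 0.5, reward=1.0, expected=1.0))
```

The sketch omits presynaptic and postsynaptic activity terms, which most models include so that only recently active synapses are modified.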

It is generally assumed that dopamine exerts additional short-term effects on striatal activity that are in line with dopamine's effects on long-term plasticity: high dopamine levels are assumed to excite D1 MSNs of the direct pathway, while low dopamine levels are hypothesized to excite D2 MSNs of the indirect pathways (e.g., Wichmann and DeLong, 1996; Frank et al., 2004; Frank, 2005). Additionally, dopamine is assumed to modulate MSNs' sensitivity to glutamatergic synaptic inputs from cortex, again oppositely for D1 and D2 MSNs (Humphries et al., 2009). Empirically, dopamine has indeed been shown to modulate ion-channel conductances in the striatum (Calabresi et al., 1987; Lin et al., 1996). Whether dopamine in fact spontaneously excites the direct and inhibits the indirect pathways, however, remains to be shown (Calabresi et al., 2007). Similarly, it needs to be clarified to what extent spontaneous effects of dopamine fulfill a behaviorally relevant function on their own or might simply support dopaminergic effects on long-term plasticity. A detailed model of how dopamine affects the membrane properties of striatal MSNs has been provided by Humphries et al. (2009). Extending their model by known effects of dopamine on synaptic plasticity and including it, as a module, in systems-level models of cortico-BG-thalamic circuitry might help to understand the complex effects of dopamine on the functions of BG pathways.

#### **2.5. CORTICO-BG-THALAMIC LOOPS**

As outlined in **Figure 1**, BG are organized in loops with cortex and thalamus. It has been proposed that separate cortico-BG-thalamic loops work in parallel and in relative independence of each other (Alexander et al., 1986). The number of independent loops is hard to estimate. Alexander et al. (1986) proposed the existence of at least five such loops (corresponding to motor, oculomotor, dorsolateral prefrontal, lateral orbitofrontal and anterior cingulate cortex). Frank et al. (2001) later suggested that each of these loops might be again subdivided into various sub-loops and estimated the human frontal cortex to contain around 20,000 such loops in total. Interestingly, the assumption of independent loops implies that each BG pathway has a variety of separate channels (i.e., one for each loop) and that each of these channels might subserve a different function (cf. Schroll et al., 2012). Thus, it might be more fruitful to search for superordinate principles of pathway functions than for specific pathway contributions related to individual loops. Along these lines, a distinction has been drawn between open and closed cortico-BG-thalamic loops (**Figure 2E**; Alexander et al., 1986; Joel and Weiner, 1994; Haber, 2003): while closed loops connect a particular area of cortex back to that same cortical area, open loops interconnect different areas of cortex. Anatomical crossovers between loops have indeed been described, in particular for cortico-striatal synapses (e.g., Inase et al., 1996; Takada et al., 1998; Calzavara et al., 2007) and cortico-thalamic synapses (Darian-Smith et al., 1999; McFarland and Haber, 2002). BG pathways might have entirely different functions in open loops than in closed loops. In particular, closed loops appear well fit for maintenance of information, while open loops might foster spread of information between cortical areas (Schroll et al., 2012; Trapp et al., 2012). For open loops, a hierarchy of information flow has been proposed that favors transmission of information from motivational via cognitive toward motor loops, but not vice versa (Haber, 2003).

## **3. HYPOTHESES ON FUNCTIONAL CONTRIBUTIONS OF BASAL GANGLIA**

The above-mentioned pieces of evidence restrict the degrees of freedom for plausible hypotheses on pathway functions, but still leave a lot of interpretive freedom. Before reviewing hypothesized contributions of individual pathways, we will outline proposed functions of BG as an entirety.

#### **3.1. SELECTION MACHINE**

BG have been hypothesized to contribute to the selection of motor responses (e.g., Mink, 1996; Hikosaka et al., 2000; Gurney et al., 2001a,b; Frank et al., 2004; Nambu, 2004; Ashby et al., 2007; Schroll et al., 2012). Allowing for context-appropriate selection, they have moreover been assumed to establish and maintain associations between stimulus representations and response representations (**Figure 3A**; e.g., Reading et al., 1991; Packard and Knowlton, 2002). In line with these hypotheses, patients with BG disorders (i.e., Parkinson's disease and Huntington's disease) are impaired in response selection (Lawrence et al., 1999; Wylie et al., 2009) and lesions of striatum result in impairments in acquiring stimulus-response rules (e.g., Reading et al., 1991; El Massioui et al., 2007). Recently, BG have been reported to be particularly involved in *learning* of stimulus-response associations, while they might be less important for execution of habitual stimulus-response behavior (Antzoulatos and Miller, 2011; Waldschmidt and Ashby, 2011).

In a generalization of the selection hypothesis, BG have been proposed to select any cortical representation (rather than just motor programs), including internal cognitive and emotional states, based upon activation of any other representation (Trapp et al., 2012). In another generalization, BG have been assumed to establish associations not only between stimuli and responses, but between stimuli, responses and outcomes (**Figure 3A**; Redgrave and Gurney, 2006).

#### **3.2. PERFORMANCE OF SEQUENCES**

Based upon the idea that BG encode stimulus-response associations (section 3.1), they have been hypothesized to establish and execute sequences of motor processes by linking each single response of a sequence to its respective predecessor (Berns and Sejnowski, 1998; Nakahara et al., 2001). According to this hypothesis, BG interlink the different elements of a sequence in a stimulus-response manner, such that each performed "response" of a sequence serves as a "stimulus" for the following response. BG thus do not contain a single "overall" representation of each sequence, but an array of individual associations between its subsequent elements.
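The chaining idea can be sketched as a lookup of pairwise associations, in which each executed element serves as the cue for its successor. The helper names `learn_chain` and `replay` and the example elements are hypothetical; this is a toy illustration of the hypothesis, not an implementation of the models by Berns and Sejnowski (1998) or Nakahara et al. (2001).

```python
def learn_chain(sequence):
    """Store one association per adjacent pair: predecessor -> successor."""
    return {prev: nxt for prev, nxt in zip(sequence, sequence[1:])}


def replay(associations, start, max_steps=100):
    """Execute a sequence by letting each response cue the next one."""
    performed = [start]
    current = start
    while current in associations and len(performed) < max_steps:
        current = associations[current]
        performed.append(current)
    return performed


chain = learn_chain(["reach", "grasp", "lift"])
print(replay(chain, "reach"))  # ['reach', 'grasp', 'lift']
```

Because no overall representation of the sequence exists in this scheme, an element that occurs twice with different successors cannot be disambiguated without additional context, which is one limitation of pure pairwise chaining.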

Evidence on BG involvement in sequence learning and execution has been provided for grooming in mammals (e.g., Berridge and Whishaw, 1992), song production in songbirds (e.g., Brainard and Doupe, 2000; Kao et al., 2005; Ölveczky et al., 2005) and sensorimotor production in humans (Doyon et al., 1997; Boecker et al., 1998).

#### **3.3. RESPONSE INITIATION AND TERMINATION**

BG have been hypothesized to provide initiation and termination signals for motor responding (**Figure 3B**; Nambu, 2004). According to this hypothesis, cortex sends a succession of corollary signals to the different BG pathways that ensure surround inhibition of (premature) responses, response initiation and response termination, respectively. According to Nambu's (2004) hypothesis, BG determine the timing of already selected responses based on their corollary input signals from cortex (see also Mink, 1996). Nambu (2004), however, did not develop a computational model, and we do not know of any such model that implements BG contributions to initiation and termination of motor responses in a loop linking the BG to primary motor cortex (M1).

Nambu's (2004) concept has been inspired by evidence of BG pathways' different conduction velocities as reviewed in section 2.2: stimulation of cortex first results in excitation, then inhibition, and finally again excitation of GPi. These three phases are assumed to correspond to inhibition of (premature) responses, response initiation and response termination, respectively (Nambu, 2004). However, pathway conduction velocities, by themselves, are no convincing proof of Nambu's (2004) assumptions: differences in conduction velocities are minute in magnitude and also relatively inflexible; they do not explain how response timing can be adapted to different contexts. Moreover, pathways not only have different conduction velocities but also receive different inputs that will likely set in at different times (section 2.1). These different set-ins of inputs may be far more decisive for latencies of pathway outputs than the pathways' conduction velocities. In line with this reasoning, Nambu (2004) hypothesized that pathways require temporally distinct corollary inputs from cortex to properly initiate and terminate responses.

#### **3.4. WORKING MEMORY GATING AND MAINTENANCE**

In the cognitive domain, BG have been hypothesized to control working-memory processes (**Figure 3C**). According to one proposal (O'Reilly and Frank, 2006), they guard the gate to working memory and thereby determine which stimuli will be maintained. Learning which stimuli to gate through initially requires random gating of working-memory contents according to this proposal. According to a different proposal (Schroll et al., 2012), BG are part of a working-memory maintenance system, allowing for reverberation of information in cortico-BG-thalamic loops. Here, BG are assumed to both determine which pieces of information enter working memory and to contribute to their actual maintenance. This model does not require an initially random selection of working-memory contents, but relies on shaping to learn complex working-memory paradigms. There is ample evidence on BG involvement in working memory tasks, both from human (Lewis et al., 2004; Alberts et al., 2008; Hershey et al., 2008; Moustafa et al., 2008b; Landau et al., 2009) and animal subjects (Levy et al., 1997). Empirical differentiation between the two hypotheses, however, is not yet possible as will be outlined in section 4.5.

#### **3.5. DIMENSIONALITY REDUCTION**

BG have been proposed to establish a focus on relevant (salient) information by reducing dimensionality of cortical information (Bar-Gad et al., 2000). Based on dopaminergic reinforcement signals, BG are assumed to learn efficient compression of information in such a way that its approximate reconstruction remains possible. In line with empirical data (Nelson et al., 1992; Nini et al., 1995; Bar-Gad et al., 2000), the model by Bar-Gad et al. (2000) predicts that correlations between neuronal activities decrease from cortex to globus pallidus. It might moreover explain why cortex contains far more neurons than striatum, which again contains far more neurons than GPi/SNr (Oorschot, 1996).

#### **3.6. REINFORCEMENT LEARNING**

Most computational models assume that BG pathways contribute to reinforcement learning (e.g., Berns and Sejnowski, 1998; Brown et al., 2004; Frank, 2006; O'Reilly and Frank, 2006; Ashby et al., 2007; Stocco et al., 2010; Schroll et al., 2012). According to this hypothesis, BG adapt behavior in such a way that reinforcements are maximized (**Figure 3D**). Specifically, they are assumed to foster repetition of those actions, emotions and cognitive processes that result in reinforcements.

Under the umbrella term "reinforcement learning," BG have been proposed to learn from unexpected rewards (e.g., Suri et al., 2001; Brown et al., 2004; Ashby et al., 2007; Vitay and Hamker, 2010), from punishments (e.g., Frank et al., 2004), and, generally, from unexpected sensory events (Redgrave and Gurney, 2006). The latter generalization implies that BG may learn any novel association between stimuli, actions and outcomes, even if not followed by reward. Via such a mechanism, animals and humans might learn contingencies that are relevant for obtaining positive outcomes in the future: for instance, they might find out that a particular action results in access to a safe sleeping place, even when currently foraging for food (cf. Redgrave and Gurney, 2006).

The reinforcement-learning hypothesis is based on findings that phasic dopamine signals in BG encode error signals of reward prediction (Hollerman and Schultz, 1998) and other salient unexpected events (e.g., Horvitz et al., 1997; Rebec, 1998), where value and salience of events might be signaled by distinct dopamine systems (Bromberg-Martin et al., 2010). The hypothesis is further based on evidence that dopamine modulates synaptic plasticity in BG (Shen et al., 2008). However, synaptic plasticity is well investigated only for cortico-striatal fibers. Some computational models thus limit dopamine-modulated learning processes to these cortico-striatal fibers (e.g., Brown et al., 2004; Ashby et al., 2007; Guthrie et al., 2009; Moustafa and Gluck, 2011), while others, more daringly, assume them to occur along more extensive parts of BG pathways (e.g., Vitay and Hamker, 2010; Schroll et al., 2012). In a particularly strong version of the reinforcement-learning hypothesis, BG refrain from processing as a particular function becomes automatized, i.e., after this function has been reliably learned via reinforcements (Ashby et al., 2007). This hypothesis is corroborated by single cell data from monkeys (Antzoulatos and Miller, 2011) and functional imaging data from humans (Waldschmidt and Ashby, 2011). Automatic functioning has instead been assumed to rely on cortico-cortical or cortico-thalamo-cortical connections (Ashby et al., 2007; Schroll et al., 2013). These connections might allow for faster information transfer because of fewer synaptic contacts than the route through the BG and might thus explain reduced reaction times in automatized tasks (Ashby et al., 2007).

The reinforcement-learning hypothesis is a meta-perspective that is fully compatible with any of the hypotheses outlined in sections 3.1 to 3.5, since it refers to how BG pathways arrive at a particular function and not to what that function is. In fact, all of the hypotheses outlined in sections 3.1 to 3.5 may be correct since BG might flexibly learn to establish exactly those functions that result in reinforcements in a given learning context. Empirical evidence on BG involvement in reinforcement learning is extensive for both animals (e.g., Featherstone and McDonald, 2004; El Massioui et al., 2007; Antzoulatos and Miller, 2011) and humans (e.g., Frank et al., 2004; Tanaka et al., 2004; Schönberg et al., 2007; Moustafa et al., 2008a). Computational models are particularly suitable for formalizing (and then simulating) reinforcement-learning processes because of these processes' iterative nature.

## **4. PROPOSED FUNCTIONS OF THE DIRECT PATHWAY**

As outlined in sections 2.3 and 2.4, the direct pathway facilitates cortical activity, is strengthened by dopamine, and arborizes in a focused rather than divergent manner. In the following sub-sections 4.1 to 4.6, we will review the direct pathway's proposed functions in detail. Sections 5 and 6 will then cover proposed functions of indirect and hyperdirect pathways. To provide a quick overview, **Table 1** summarizes which models interpret which aspects of BG anatomy.

#### **4.1. GLOBAL MOTOR FACILITATION**

Early non-computational models (Albin et al., 1989; DeLong, 1990) as well as a recent computational model of BG pathways (Stocco et al., 2010) proposed that the direct pathway unspecifically facilitates motor activity. And indeed, it has been confirmed optogenetically that stimulation of striatal MSNs of the direct pathway results in increased locomotion in mice (Kravitz et al., 2010). Whether this is a (relatively) unspecific effect, however, remains to be shown. The relatively sparse arborization of striatal MSNs in GPi intuitively challenges the global-facilitation hypothesis, although the degree of sparseness cannot yet be interpreted in functional terms and thus is no proof against the hypothesis.

#### **4.2. SPECIFIC MOTOR FACILITATION**

Mink (1996) proposed that the direct pathway specifically facilitates desired responses (rather than motor activity *per se*). This hypothesis therefore directly contradicts the global-facilitation hypothesis outlined in section 4.1. Recent computational models mostly follow Mink's (1996) suggestion (e.g., Gurney et al., 2001a,b; Suri et al., 2001; Brown et al., 2004; Frank, 2005; Ashby et al., 2007; Schroll et al., 2012) and have applied it to cognitive operations as well (e.g., O'Reilly and Frank, 2006; Schroll et al., 2012). Because of the multitude of parallel cortico-BG-thalamic loops (section 2.5), it indeed appears likely that different channels of the direct pathway may simultaneously facilitate different types of representations (Schroll et al., 2012). Most computational models moreover hypothesize that the direct pathway *learns* to facilitate specific cortical representations based on rewards (e.g., Suri et al., 2001; Brown et al., 2004; Frank, 2005; Ashby et al., 2007; Schroll et al., 2012). We do not know of any empirical data that favors the specific-facilitation hypothesis over the global-facilitation hypothesis or vice versa.

**Table 1 | Summary of model features and foci.**

#### **4.3. STIMULUS-RESPONSE MAPPING**

The direct pathway has been hypothesized to facilitate specific motor programs only if they are appropriate in a given stimulus context (e.g., Brown et al., 2004; Ashby et al., 2007; Vitay and Hamker, 2010; Schroll et al., 2012). This hypothesis is a more specific version of the specific-facilitation hypothesis outlined in section 4.2. It says that the direct pathway connects specific stimulus representations to specific response representations (i.e., motor programs) and then facilitates a particular motor program only if the corresponding stimulus representation is active. The stimulus-response hypothesis relies on the existence of open cortico-BG-thalamic loops that interlink cortical areas involved in stimulus processing with areas involved in motor responding. And indeed, striatal areas that receive inputs from both primary somatosensory and primary motor cortices have been reported (Flaherty and Graybiel, 1993). Clear evidence of open cortico-BG-thalamic loops that connect visual or auditory cortices to primary motor cortex, however, has not yet been shown, although both higher-order visual and higher-order auditory cortices are known to project to striatum (LeDoux et al., 1991; Bordi and LeDoux, 1992; Baizer et al., 1993).

In a generalization of the stimulus-response hypothesis, internal states (like emotions, mental images or abstract cognitive concepts) may serve as stimuli for stimulus-response associations as well. In an even broader generalization, the direct pathway may interlink any two cortical representations (cf. Trapp et al., 2012), potentially in a hierarchy from emotional via motivational, cognitive and premotor to motor regions (Haber, 2003).

Ashby and Crossley (2011) proposed that striatal cholinergic interneurons take part in the establishment of stimulus-response associations. They suggest that these interneurons tonically inhibit striatal MSNs of the direct pathway in the absence of stimulus inputs and that a stimulus-contingent release of inhibition is required for a direct-pathway induced activation of responses.

#### **4.4. TEMPORALLY PRECISE INITIATION OF RESPONSES**

Nambu (2004) hypothesized that the direct pathway determines the point in time when a particular response is initiated. According to his concept, the cortex selects an appropriate response and then first sends a corollary signal to the hyperdirect pathway, which globally inhibits all motor programs. Briefly afterwards, a second corollary signal to the direct pathway initiates a specific motor response at the appropriate point in time. This response-initiation hypothesis is, in its core, compatible with the specific-facilitation hypothesis (section 4.2) and the stimulus-response hypothesis (section 4.3): the direct pathway may well select appropriate responses (potentially based upon stimulus input) and also determine the exact time at which they are initiated. The specifics of these hypotheses, however, are incompatible: according to Nambu (2004), the *cortex* decides for a response, while it is the direct pathway and its specific connectivity according to the other two proposals.

Empirical evidence for the response-initiation hypothesis comes from patients with Parkinson's disease, a BG disorder that goes along with decreased activation of the direct pathway (but also with increased activation of the indirect pathways; Kravitz et al., 2010; Kita and Kita, 2011): as predicted by the response-initiation hypothesis, Parkinson's disease patients are impaired in initiating movements (Bloxham et al., 1984; Carli et al., 1985; Hikosaka et al., 1993), but not in completing them (Carli et al., 1985) or in performing pre-programmed movements (Bloxham et al., 1984).

The response-initiation hypothesis as well as, in fact, any of the hypotheses outlined in sections 4.1 to 4.3, has been challenged based on reports that BG activation mostly occurs only *after* overt responses are visible (e.g., Aldridge et al., 1980; Jaeger et al., 1995). However, some BG neurons *do* become active before EMG onset (Jaeger et al., 1995), and reports of delayed BG activity may result from the specific study designs involved: motor responses were trained for extensive amounts of time in these studies before BG activity was recorded. Recent evidence, however, points to an important role of the BG in initiating responses only while new response patterns are being *learned* (Antzoulatos and Miller, 2011; Waldschmidt and Ashby, 2011).

#### **4.5. WORKING-MEMORY GATING AND MAINTENANCE**

The direct pathway has been suggested to play an important role in working-memory functions. In particular, it has been suggested to gate information into working memory (Gruber et al., 2006; O'Reilly and Frank, 2006), but also to contribute to working-memory maintenance itself (Ashby et al., 2005; Schroll et al., 2012). According to the former hypothesis, the direct pathway is required for gating information into PFC, where it is then maintained independent of direct-pathway activity. According to the latter hypothesis, in contrast, maintenance of working-memory content requires reverberation of activity in cortico-BG-thalamic loops, explicitly involving the direct pathway. Empirical evidence does not clearly favor one interpretation over the other. The former hypothesis predicts a phasic change in direct-pathway activity only while working-memory content is gated, while the latter predicts sustained activity over delay periods of working-memory tasks. In favor of the latter hypothesis, a subset of striatal neurons has been empirically shown to exhibit sustained activities over delay periods of a spatial delayed response task (Cromwell and Schultz, 2003) and of a delayed saccade task (Hikosaka et al., 1989). In favor of the former hypothesis, however, the caudate nucleus of the striatum has been found more active during working memory manipulation than during working memory maintenance in a functional magnetic resonance imaging (fMRI) study (Lewis et al., 2004).

#### **4.6. DIMENSIONALITY REDUCTION**

The direct pathway has been proposed to perform the dimensionality reduction process outlined in section 3.5. No other BG pathway is assumed to take part in this process.

## **5. PROPOSED FUNCTIONS OF LONG AND SHORT INDIRECT PATHWAYS**

Two indirect pathways have been described, both of which inhibit cortical activity (**Figure 1**): a short route passes from GPe directly to GPi and arborizes there rather sparsely, while a longer route passes from GPe to STN and further from STN to GPi where it arborizes rather profusely (section 2.3). High dopamine levels foster LTD in cortico-striatal synapses that belong to these indirect pathways, while low dopamine levels facilitate LTP (section 2.4). As most models implement only one of the two indirect pathways, their interpretations might not be specific to the particular pathway included. To highlight the often-neglected fact that these pathways might establish entirely different functions, however, we nevertheless explicitly differentiate between the two pathways.

#### **5.1. GLOBAL INHIBITION OF MOTOR PROGRAMS (LONG INDIRECT PATHWAY)**

In early non-computational models, the long indirect pathway is assumed to globally inhibit motor behavior (Albin et al., 1989; DeLong, 1990). This hypothesis is in good agreement with the long indirect pathway's relatively global effects on GPi as outlined in section 2.3. Functional evidence for this hypothesis comes from an optogenetic study, where stimulation of striatal MSNs of the indirect pathways resulted in decreased motor initiation and increased bradykinesia (Kravitz et al., 2010). However, this study did not differentiate between long and short indirect pathways. Moreover, striatal MSNs of the indirect pathways were stimulated relatively globally which may be expected to cause global effects on behavior even if the long indirect pathway does not act as globally as hypothesized.

#### **5.2. INHIBITION OF SPECIFIC MOTOR PROGRAMS (SHORT INDIRECT PATHWAY)**

Based on its sparse arborization, the short indirect pathway has been suggested to inhibit specific motor programs (Brown et al., 2004; Frank et al., 2004; Frank, 2005; Schroll et al., 2013). More specifically, it has been hypothesized to *learn* this inhibition based on unfavorable outcomes, including omissions of expected rewards (e.g., Brown et al., 2004; Schroll et al., 2013) and aversive events (Frank et al., 2004). The chain of events between the occurrence of these unfavorable outcomes and the inhibition of motor programs is assumed to be the following: unfavorable outcomes cause phasic reductions in BG dopamine levels, which then activate the short indirect pathway to suppress the response that had resulted in the unfavorable outcome. Empirically, phasic decreases in dopamine have indeed been shown to strengthen cortico-striatal synapses of the indirect pathways (Shen et al., 2008). Moreover, it has indeed been shown that omissions of expected rewards result in phasic decreases in dopamine activity (Hollerman and Schultz, 1998); it thus appears plausible that the short indirect pathway inhibits responses based on omissions of expected rewards. The consequences of aversive events, however, might be more complex: while some SNc neurons indeed become less active following aversive events, others increase their activity (Matsumoto and Hikosaka, 2009). Insofar as SNc neurons respond differently to reward omissions and aversive events, it remains speculative if the short indirect pathway inhibits responses based on aversive events as well. In favor of such an effect, blocking of the indirect pathways (but not the direct pathway) in genetically modified mice has been shown to result in impaired shock avoidance (Hikida et al., 2010). Moreover, Frank et al. (2004) showed that Parkinson's disease patients (who suffer from dopamine loss) learn better from negative as opposed to positive outcomes than healthy controls (but see Shiner et al., 2012, for a challenge of their conclusions). Thus, the short indirect pathway may well learn to inhibit motor programs based both on omissions of rewards and on aversive events.

Omissions of expected rewards occur primarily during reversal learning and extinction (i.e., when expected stimulus-response-reward associations are no longer valid). In the model by Schroll et al. (2013), therefore, the short indirect pathway inhibits specific responses specifically during reversal learning. Pharmacological studies indeed show that D2 receptor agonists (which predominantly target indirect pathways; cf. section 2.4) impair reversal learning in humans (Mehta et al., 2001). Also, D2-type receptor *ant*agonists (but not D1-type antagonists) result in reversal-learning deficits in non-human primates (Lee et al., 2007). By assuming that both D2 agonists and D2 antagonists render D2 receptors insensitive to phasic changes in physiologically generated dopamine signals, these findings are in line with Schroll et al.'s (2013) hypothesis: according to their model, suppression of previously correct responses during reversal learning requires task-related, phasic unbinding of dopamine at D2 receptors after omissions of expected rewards. Both D2 agonists and D2 antagonists can be assumed to impair such a task-related unbinding. None of the above-cited studies, however, distinguished between long and short routes of the indirect pathway. It thus remains to be established to what extent it is indeed the short route that inhibits specific responses during reversal learning.

#### **5.3. TERMINATION OF EXECUTED RESPONSES (LONG INDIRECT PATHWAY)**

In consideration of the long indirect pathway's relatively slow conduction velocity (section 2.2), this pathway has been hypothesized to provide termination signals for motor execution (Nambu, 2004). In line with this hypothesis, GPi cells show a triphasic change of activity in response to cortical stimulation: an early excitation via the hyperdirect pathway is followed by a brief inhibition via the direct pathway and by a late excitation via the long indirect pathway (**Figure 2B**). According to the termination hypothesis, the late excitation terminates those motor responses that are initiated via the intermediate inhibition. Acknowledging the long indirect pathway's relatively broad effects on GPi (section 2.3), the proposed termination signal has been assumed to act relatively globally (Nambu, 2004).
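The triphasic sequence described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the baseline rate, the latencies, and the magnitudes of each component are hypothetical placeholders, not measured values, and the names `BASELINE_HZ`, `COMPONENTS`, `gpi_rate`, and `phase_sequence` are introduced here for illustration only.

```python
BASELINE_HZ = 60.0  # hypothetical resting GPi firing rate

# (pathway, onset_ms, offset_ms, change in firing rate in Hz).
# All latencies and magnitudes are illustrative assumptions.
COMPONENTS = [
    ("hyperdirect", 8, 12, +30.0),     # early excitation
    ("direct", 12, 22, -40.0),         # intermediate inhibition
    ("long indirect", 22, 40, +25.0),  # late excitation
]


def gpi_rate(t_ms):
    """Return the sketched GPi firing rate at t_ms after cortical stimulation."""
    rate = BASELINE_HZ
    for _name, onset, offset, delta in COMPONENTS:
        if onset <= t_ms < offset:
            rate += delta
    return max(rate, 0.0)


def phase_sequence(t_max_ms=50):
    """Return the ordered list of response phases relative to baseline."""
    sequence = []
    for t in range(t_max_ms):
        rate = gpi_rate(t)
        if rate > BASELINE_HZ:
            label = "excitation"
        elif rate < BASELINE_HZ:
            label = "inhibition"
        else:
            label = "baseline"
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence
```

Under the termination hypothesis, the window of the "direct" component would correspond to response release and the subsequent "long indirect" component to the proposed termination signal.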

The global-response-termination hypothesis is well compatible with the hypothesis that the long indirect pathway globally inhibits responses (section 5.1). Response termination, however, requires a delay in suppression such that responses can be initiated first.

We do not know of any direct functional evidence for a termination function of the long indirect pathway.

#### **5.4. DEFERRAL OF SELECTED PLANS (SHORT INDIRECT PATHWAY)**

According to Brown et al. (2004), the short indirect pathway defers execution of specific selected responses until appropriate. This hypothesis is an extension of the specific-inhibition hypothesis outlined in section 5.2 with regard to the dimension of time. Response deferral is assumed not to be a built-in function of the short indirect pathway, but to be learned from omissions of expected rewards. In other words, the default is to not defer chosen plans. If, however, premature release of a response via the direct pathway results in reward omission, the short indirect pathway learns to inhibit this response (Brown et al., 2004). According to the model, response deferral is learned in a context-based way by associating the deferral with any stimulus input that might be present during the deferral period. Thalamo-striatal feedback to the short indirect pathway is hypothesized to inform the short indirect pathway which response was selected before reward omission so that exactly this response can be inhibited. We do not know of any empirical evidence corroborating the deferral hypothesis.

#### **5.5. SURROUND-INHIBITION OF COMPETING MOTOR PROGRAMS**

Both indirect pathways have been hypothesized to establish a surround inhibition of unwanted motor programs during motor responding (Mink, 1996; Stocco et al., 2010). Mink (1996) proposed a particular role of the STN in this respect, thus referring to the long indirect pathway (but also to what is known today as the hyperdirect pathway). In contrast, Stocco et al. (2010) proposed that the short route of the indirect pathway is involved. Neither Mink (1996) nor Stocco et al. (2010) hypothesized on how broad the "space" of suppressed competing motor programs may be (i.e., if every other motor program would be inhibited or just a set of particularly strong competitors). Therefore, arborization patterns do not provide any consistent clue whether an involvement of the short or the long route is more realistic. As a challenge to both hypotheses, however, the effects of dopamine on long-term plasticity in cortico-striatal MSNs of the indirect pathways are exactly opposite those observed in cortico-striatal MSNs of the direct pathway (section 2.4; Shen et al., 2008). Thus, when facilitation of a response via the direct pathway is strengthened by dopamine, surround inhibition of its competitors may not be strengthened as well; center facilitation and surround inhibition cannot be established at the same time unless the activities of dopamine neurons that target the "center" increase, while those that target the "surrounds" may simultaneously decrease. Since such an effect has not yet been reported, the hyperdirect pathway might be a more suitable candidate for surround inhibition than any of the indirect pathways (cf. section 6.4).

#### **5.6. CONTROL SYSTEM**

Challenging the subdivision of BG fiber tracts into direct, indirect and hyperdirect pathways, BG have been proposed to consist of a selection pathway, containing what is referred to as direct and hyperdirect pathways in this review, and of a control pathway, vaguely consisting of what is termed long and short indirect pathways here (Gurney et al., 2001a,b; Humphries et al., 2006). More specifically, the control pathway is assumed to consist of the full short indirect pathway, the long indirect pathway up to the STN and an additional route from cortex via STN to GPe (Gurney et al., 2001a). According to Gurney et al. (2001a,b), the control pathway does not have a separable function itself, but rather supports direct and hyperdirect pathways in selecting responses. Among its functions, the control pathway is hypothesized to regulate the amount of activity in STN (and thereby also in GPi): according to the model, motor selection requires that the amount of global motor inhibition is neither so strong that it overrules any facilitation of a specific motor program via striatum, nor so weak that multiple responses are selected simultaneously. By regulating the activity of STN cells and thereby the amount of global motor inhibition, the control pathway ensures an appropriate balance between excitation and inhibition such that neither too many nor too few responses are released simultaneously.

A similar concept is based on the architecture of direct and long indirect pathways as specified in this review (cf. **Figure 1**): in Suri et al.'s (2001) model, the long indirect pathway is hypothesized to globally increase motor inhibition such that only significant contributions of the direct pathway will result in cortical activation, while insignificant direct-pathway effects will be suppressed.
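The balance that both proposals in this section rely on can be sketched as a simple thresholding operation. The following is a minimal illustration, assuming that global inhibition acts subtractively on direct-pathway drive; the function name `released_responses` and all numerical values are hypothetical and are not taken from Gurney et al. (2001a,b) or Suri et al. (2001).

```python
def released_responses(direct_drive, global_inhibition):
    """Return indices of responses whose direct-pathway drive exceeds
    the (assumed subtractive) global inhibition."""
    return [
        index
        for index, drive in enumerate(direct_drive)
        if drive - global_inhibition > 0
    ]


# Hypothetical direct-pathway drive for three candidate responses.
drive = [0.9, 0.4, 0.3]

too_weak = released_responses(drive, global_inhibition=0.1)    # several responses released
balanced = released_responses(drive, global_inhibition=0.6)    # only the strongest released
too_strong = released_responses(drive, global_inhibition=1.2)  # nothing released
```

In this sketch, regulating `global_inhibition` plays the role attributed to the control pathway: it sets how many responses can pass at once.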

#### **5.7. CLOSING THE GATE TO WORKING MEMORY (SHORT INDIRECT PATHWAY)**

With regard to working memory, the short indirect pathway has been proposed to prevent gating of information into working-memory storage systems (O'Reilly and Frank, 2006). According to this hypothesis, the short indirect pathway and the direct pathway oppose each other in a push-and-pull manner (section 7.1) to forbid or allow gating of information into working memory, respectively. The hypothesis is structurally similar to, and therefore compatible with, the idea that the short indirect pathway inhibits (gating of) specific motor programs (section 5.2); these functions could be performed by different cortico-BG-thalamic loops. Although there is evidence for BG involvement in working-memory functions (Levy et al., 1997; Lewis et al., 2004; Landau et al., 2009), there is, to the best of our knowledge, no data on the specific gating function of the short indirect pathway proposed by O'Reilly and Frank (2006).

## **6. PROPOSED FUNCTIONS OF THE HYPERDIRECT PATHWAY**

As outlined in sections 2.2 and 2.3, the hyperdirect pathway excites GPi relatively fast and relatively globally. A major proportion of its inputs derives from axon collaterals of pyramidal tract neurons (section 2.1), while synaptic plasticity in this pathway is not yet well understood (section 2.4).

#### **6.1. PREVENTION OF PREMATURE RESPONSES**

Based on its fast and relatively global effects on GPi, the hyperdirect pathway has been proposed to globally prevent premature responses until response selection has been completed (Frank, 2006; Stocco et al., 2010). Along these lines, the hyperdirect pathway has been predicted to be particularly vital in situations where extensive response conflict occurs (Frank, 2006; Frank et al., 2007), i.e., whenever multiple competing motor programs are simultaneously active in premotor cortex. Recordings of STN activity during high-conflict and low-conflict choices have been performed to investigate this prediction. A typical paradigm starts with a couple of training trials, in which subjects are presented with pairs of stimuli (e.g., A-B or C-D) and have to choose one stimulus of each pair. Each stimulus is associated with a fixed reward probability across trials (e.g., A: 20%—B: 80%, and C: 30%—D: 70%).

Being instructed to maximize rewards, subjects are supposed to learn to choose the stimulus of each pair that provides better average outcomes (i.e., B and D respectively, in our example). In subsequent test trials, pairs are shuffled such that high-conflict pairs (e.g., B–D) and low conflict pairs (e.g., A–D) may be presented. Using such a task, human patients ON deep brain stimulation (DBS) of the STN (which inhibits spiking activity in STN, Gradinaru et al., 2009, and thus eliminates task-related information processing in this nucleus) have been shown to make faster decisions under high response conflict than patients OFF DBS (Frank et al., 2007). Moreover, intracranial EEG recordings from DBS electrodes have revealed differences in STN oscillatory activity between high-conflict and low-conflict trials (Cavanagh et al., 2011), thus arguing for a contribution of STN to conflict processing, in line with Frank et al.'s (2007) prediction.
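The structure of this paradigm can be sketched in a few lines. The reward probabilities and stimulus pairs are those of the example above; everything else is an assumption made for illustration: the simple delta-rule learner, the random choice policy during training, the parameters `n_trials`, `alpha`, and `seed`, and the definition of conflict as one minus the absolute difference in learned values. None of this is meant to reproduce the model of Frank (2006).

```python
import random

# Reward probabilities from the example in the text.
REWARD_PROB = {"A": 0.2, "B": 0.8, "C": 0.3, "D": 0.7}
TRAINING_PAIRS = [("A", "B"), ("C", "D")]


def learn_values(n_trials=2000, alpha=0.02, seed=0):
    """Learn one value per stimulus with a delta rule (illustrative assumption)."""
    rng = random.Random(seed)
    values = {stimulus: 0.5 for stimulus in REWARD_PROB}
    for _ in range(n_trials):
        for pair in TRAINING_PAIRS:
            choice = rng.choice(pair)  # random exploration during training
            reward = 1.0 if rng.random() < REWARD_PROB[choice] else 0.0
            values[choice] += alpha * (reward - values[choice])
    return values


def conflict(pair, values):
    """Hypothetical conflict measure: high when both values are similar."""
    first, second = pair
    return 1.0 - abs(values[first] - values[second])


values = learn_values()
high_conflict = conflict(("B", "D"), values)  # test pair of two good stimuli
low_conflict = conflict(("A", "D"), values)   # test pair of a bad and a good stimulus
```

With this measure, the shuffled test pair B–D yields more conflict than A–D because the two learned values lie close together.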

#### **6.2. STOPPING PREPARED RESPONSES BEFORE EXECUTION**

Along similar lines, the hyperdirect pathway has been hypothesized to globally inhibit prepared responses when a stop signal is shown before response execution (Aron and Poldrack, 2006; Aron, 2011; Wiecki and Frank, 2013). This hypothesis is fully compatible with the premature-response-inhibition hypothesis outlined in section 6.1; the hyperdirect pathway may flexibly switch between both of these functions as required by context. Both functions require fast and global inhibition of motor programs, which would fit well with the hyperdirect pathway's fast conduction velocity and its relatively global effects on GPi. Indeed, not just the premature-response-inhibition hypothesis, but also the stop hypothesis is corroborated by empirical evidence: in an fMRI study, STN (which is part of the hyperdirect pathway, but also of the long indirect pathway) has been shown to be more active in humans in stop trials than in go trials of a stop-signal task (Aron and Poldrack, 2006). In that same study, STN has been found more active in subjects that show a fast inhibition of responses after a stop cue (i.e., a fast stop-signal reaction time, SSRT) than in subjects with slow response inhibition. In contrast, however, another fMRI study reported smaller STN activity in fast-inhibiting subjects than in slow inhibitors (Ray Li et al., 2008). As a challenge to the stop-signal hypothesis, PD patients (whose STN activity is increased; Kreiss et al., 1997; Huang et al., 2007) show slower (instead of faster) inhibition of responses in stop-signal tasks (Gauggel et al., 2004).

#### **6.3. DEACTIVATION OF BG TO ALLOW FOR TOP-DOWN CONTROL**

The hyperdirect pathway has been hypothesized to subdue BG influences on motor cortex in order to allow for top-down PFC control over motor-cortical activities (Chersi et al., 2013). According to this hypothesis, PFC inputs to the hyperdirect pathway decrease activities of all GPi/SNr neurons to similar levels via inhibitory interneurons, thereby overruling any response-activating effects caused by the direct pathway and preventing task-related BG outputs (Chersi et al., 2013). PFC may then control motor-cortical activities itself. By proposing that the hyperdirect pathway globally overrules any effects of the direct pathway, the deactivation hypothesis has a common assumption with the premature-response-inhibition hypothesis (section 6.1) and the response-stopping hypothesis (section 6.2). However, the deactivation hypothesis specifies that the hyperdirect pathway *decreases* activities in BG output nuclei (via inhibitory interneurons; Chersi et al., 2013), while the other two hypotheses assume it to increase firing in these nuclei. Since the major effect of the hyperdirect pathway on GPi/SNr is known to be excitatory (cf. **Figure 1**), we do not consider the deactivation hypothesis to be particularly plausible in this respect. It might, however, still hold true in its core: a global *increase* in GPi firing could deactivate BG to allow for top-down control just as well.

#### **6.4. SURROUND INHIBITION OF COMPETING MOTOR PROGRAMS**

Just like the indirect pathways (section 5.5), the hyperdirect pathway has been hypothesized to establish surround-inhibition of unwanted motor programs during responding (Gurney et al., 2001a; Nambu, 2004; Humphries et al., 2006; Schroll et al., 2013). Two versions of this hypothesis exist: according to the first, the hyperdirect pathway inhibits only those responses that compete for execution with the desired response, but not the desired response itself which is instead facilitated via the direct pathway (Schroll et al., 2013). According to the second version, the hyperdirect pathway globally inhibits all motor programs, including the desired one, which is distinguished only by its additional activation via the direct pathway (Gurney et al., 2001a; Nambu, 2004; Humphries et al., 2006).

Both hypotheses reflect well the different arborization patterns of direct and hyperdirect pathways, which have been found to be sparse and profuse, respectively (section 2.3). We do not know of any empirical study testing the strict surround-inhibition hypothesis against the more unspecific surround-inhibition hypothesis.

#### **6.5. WORKING-MEMORY UPDATE**

The hyperdirect pathway has been hypothesized to clear information from working memory and thus to allow for updating of working-memory content (Schroll et al., 2012). According to this hypothesis, the hyperdirect pathway breaks reverberation of activity in cortico-BG-thalamic loops (which is assumed to be the neuronal basis of working-memory maintenance, section 4.5). Empirical evidence for this hypothesis is as yet rather unspecific: DBS of the STN in Parkinson's disease patients (which is assumed to inhibit spiking activity in this nucleus; Gradinaru et al., 2009) impaired working-memory performance in a spatial delayed response task (Hershey et al., 2008) and in an n-back task (Alberts et al., 2008); to what extent these effects were in fact produced by failures in updating working-memory content, however, or may have been caused by other types of errors, was not delineated. The working-memory-update hypothesis may be generalized to an involvement of the hyperdirect pathway in updating of any information that may be maintained in closed cortico-BG-thalamic loops.

## **7. PROPOSED INTERACTION PATTERNS BETWEEN PATHWAYS**

Having reviewed prominent hypotheses on pathway functions, we will now outline to what extent these hypotheses may be grouped into general "principles" of pathway functions. In section 2.5, we reviewed evidence that BG are compartmentalized into a variety of largely independent loops related to motor, premotor, cognitive, motivational, and emotional functions. Since each of these loops is assumed to contain its own separate channel of each BG pathway, we pointed out that each pathway might contribute to a variety of different functions at the same time. We therefore concluded that it might be a more fruitful approach to search for general principles of pathway functions than for individual pathway contributions related to different loops. While such general principles may be defined from various viewpoints, we hold the view that the models reviewed in sections 3 to 6 differ most consistently from each other with regard to their assumptions on how pathways interact in their effects on cortex. While most models agree that the direct pathway somehow activates specific cortical representations, assumptions on how indirect and hyperdirect pathways interact with this activation are more controversial. In the following sub-sections, we will outline these different concepts.

Importantly, different concepts of pathway interactions are not always mutually exclusive. Rather, the BG might learn from reinforcements which patterns of interactions are most appropriate under different circumstances and might thereby flexibly adapt information processing to environmental demands.

#### **7.1. PUSH-AND-PULL OPPOSITION**

Direct and short indirect pathways have been hypothesized to oppose each other in a push-and-pull manner (**Figure 4A**; e.g., Brown et al., 2004; Frank, 2005; O'Reilly and Frank, 2006): while the direct pathway is assumed to activate specific cortical representations, the short indirect pathway might at the same time try to inhibit them. The relative balance between activation and inhibition might then determine if a particular representation is activated or inhibited overall. The direct pathway's activation is usually assumed to be strengthened by dopamine bursts (i.e., by unexpectedly strong reinforcements; Hollerman and Schultz, 1998), while the short indirect pathway's inhibition is assumed to be strengthened by dopamine dips that might either derive from omissions of expected rewards (Brown et al., 2004; Schroll et al., 2013) or from punishments (e.g., Frank et al., 2004).

The push-and-pull assumption underlies the specific-response-inhibition hypothesis outlined in section 5.2, the response-deferral hypothesis outlined in section 5.4 and the gate-closing hypothesis outlined in section 5.7.
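The push-and-pull principle can be sketched as two opposing weights for a single cortical representation. This is a deliberately reduced illustration, not the implementation of any of the cited models: the linear update rule, the learning rate `lr`, and the names `update_weights` and `net_drive` are assumptions introduced here.

```python
def update_weights(go, nogo, dopamine, lr=0.1):
    """Update opposing pathway weights from a phasic dopamine signal.

    dopamine > 0 stands for a burst (unexpected reward) and strengthens
    the direct-pathway weight `go`; dopamine < 0 stands for a dip (reward
    omission or punishment) and strengthens the short-indirect-pathway
    weight `nogo`. The linear rule is an illustrative assumption.
    """
    if dopamine > 0:
        go += lr * dopamine
    elif dopamine < 0:
        nogo += lr * -dopamine
    return go, nogo


def net_drive(go, nogo):
    """Relative balance: positive values favor activation, negative inhibition."""
    return go - nogo


go, nogo = 0.0, 0.0
for _ in range(5):       # five unexpectedly rewarded trials (bursts)
    go, nogo = update_weights(go, nogo, dopamine=+1.0)
after_reward = net_drive(go, nogo)

for _ in range(10):      # ten trials with omitted rewards (dips)
    go, nogo = update_weights(go, nogo, dopamine=-1.0)
after_omission = net_drive(go, nogo)
```

After the rewarded trials the balance favors activation; after repeated omissions it tips toward inhibition, which is the situation assumed for reversal learning and extinction.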

#### **7.2. CENTER-SURROUND COOPERATION**

Direct and hyperdirect pathways have been proposed to establish a center-surround system of activation and inhibition, where the direct pathway activates specific "central" cortical representations, while the hyperdirect pathway inhibits "surrounding" (i.e., competitive) representations (Nambu, 2004; Schroll et al., 2013). As outlined in section 6.4, different models assume the hyperdirect pathway to either inhibit only competing motor programs, sparing the center (**Figure 4C**; Schroll et al., 2013), or to inhibit the center-facilitated motor program as well (**Figure 4B**; Nambu, 2004). According to the latter hypothesis, the direct pathway needs to be powerful enough to overrule any effect by the hyperdirect pathway to still activate the "center" representation. According to the former hypothesis, center-activation and surround-inhibition may partially compensate for each other: by activating the center more strongly, surround inhibition may become less relevant, while strong surround inhibition may require less powerful center-activation.

Center-surround collaboration has been proposed for direct and indirect pathways as well (section 5.5). Mink (1996) assumed direct and long indirect pathways to interact as depicted in **Figure 4B**, while Stocco et al. (2010) proposed a different mechanism: according to their model, the direct pathway activates cortex relatively globally (i.e., unspecifically), while the short indirect pathway inhibits all undesired representations, resulting in activation of only the desired representation (**Figure 4D**).

#### **FIGURE 4 | Hypotheses on interactions between pathway outputs.**

Three-dimensional Gaussians depict neuronal activities (z-axis), as elicited by basal-ganglia pathways, for "central" and "surrounding" cortical representations (x- and y-axes). Direct-pathway effects are denoted by red arrows, while the effects of hyperdirect and indirect pathways are denoted by green and blue arrows, respectively. Pointed arrows denote excitatory, rounded arrows inhibitory effects. **(A)** Push-and-pull opposition: direct and short indirect pathways may oppose each other in a push-and-pull manner, where the effects of direct and short indirect pathways are equal in spatial extent (e.g., Brown et al., 2004; Frank, 2005; Schroll et al., 2013). In the example shown here, the direct pathway (thick red arrow) overpowers the short indirect pathway (thin blue arrow). **(B)** Center-surround cooperation. The direct pathway activates specific cortical representations, while either the hyperdirect pathway (Nambu, 2004) or the long indirect pathway (Mink, 1996) globally inhibit these representations. Since the direct pathway's effect is assumed to be more powerful, center-surround activation emerges. **(C)** Strict center-surround cooperation. The direct pathway activates specific cortical representations, while the hyperdirect pathway inhibits surrounding (i.e., "competitive") representations, but not the activated representation itself (Schroll et al., 2013). The resulting effect is mostly equivalent to **(B)**. **(D)** Center-surround cooperation with global activation. The direct pathway excites cortex relatively globally, while the short indirect pathway inhibits all but the "center" representation (Stocco et al., 2010). As a result, again, the central cortical representation is activated, while its "surrounds" are inhibited. **(E)** Global blocking of activation. The direct pathway tries to activate specific cortical representations, while the hyperdirect pathway globally inhibits them. In contrast to **(B)**, the hyperdirect pathway is more powerful than the direct pathway and thus overrules any direct-pathway effect (cf. Aron and Poldrack, 2006; Frank, 2006). Please note that pathway effects are depicted as Gaussians for merely illustrative purposes. Most models do not implement Gaussian functions, but rather assume "box-car" (i.e., all-or-nothing) effects.

#### **7.3. GLOBAL BLOCKING OF ACTIVATIONS**

According to a different set of hypotheses, global inhibition of cortical representations via the hyperdirect pathway is powerful enough to overrule any specific activation caused by the direct pathway (**Figure 4E**). Whenever the hyperdirect pathway globally inhibits cortical representations, thus, the direct pathway becomes powerless. Such a function of the hyperdirect pathway underlies the premature-response prevention hypothesis reviewed in section 6.1, the response-stopping hypothesis outlined in section 6.2 and the working-memory-update hypothesis outlined in section 6.5. By approximation, it also underlies the deactivation hypothesis reviewed in section 6.3, which, however, states that the hyperdirect pathway overrules any direct-pathway effects by globally *facilitating* activation of cortical representations.
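The difference between center-surround cooperation and global blocking can be sketched with "box-car" (all-or-nothing) effects on a small set of cortical representations. The sketch below is illustrative only: the number of representations, the relative strengths, and the names `boxcar`, `PATTERNS`, and `net_effect` are assumptions, and pathway effects are simply subtracted from each other.

```python
N_REPRESENTATIONS = 5
CENTER = 2
ALL = set(range(N_REPRESENTATIONS))
SURROUND = ALL - {CENTER}


def boxcar(indices, strength):
    """All-or-nothing pathway effect on the given representations."""
    return [strength if i in indices else 0.0 for i in range(N_REPRESENTATIONS)]


# (excitatory effect, inhibitory effect); strengths are hypothetical.
PATTERNS = {
    "B_center_surround": (boxcar({CENTER}, 2.0), boxcar(ALL, 1.0)),
    "C_strict_center_surround": (boxcar({CENTER}, 1.0), boxcar(SURROUND, 1.0)),
    "D_global_activation": (boxcar(ALL, 1.0), boxcar(SURROUND, 2.0)),
    "E_global_blocking": (boxcar({CENTER}, 1.0), boxcar(ALL, 2.0)),
}


def net_effect(name):
    """Net effect on each representation (excitation minus inhibition)."""
    excitation, inhibition = PATTERNS[name]
    return [e - i for e, i in zip(excitation, inhibition)]
```

In this sketch, patterns B, C, and D all leave the center with a positive and the surround with a negative net effect, whereas in pattern E the stronger global inhibition leaves every representation, including the center, suppressed.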

#### **7.4. MODULATION OF ACTIVATION**

Yet another set of hypotheses (section 5.6) suggests that the long indirect pathway modulates the direct pathway's effects. Suri et al. (2001) proposed that the long indirect pathway globally inhibits cortical representations to such an extent that only strong activations of specific desired representations via the direct pathway result in cortical activity, while weak activations are suppressed. According to this hypothesis, the direct pathway may thus overrule the long indirect pathway's effects only if it is powerful. The hypothesis therefore lies somewhere between the center-surround cooperation depicted in **Figure 4B** and the global-blocking hypothesis shown in **Figure 4E**. In a functionally related proposal, a control pathway (vaguely consisting of the two indirect pathways; Gurney et al., 2001a,b; Humphries et al., 2006) controls the number of cortical representations that can be activated simultaneously by a similar process (section 5.6).

## **8. FUTURE DIRECTIONS: TESTS OF MODEL ASSUMPTIONS**

Conflicting hypotheses on pathway functions may be empirically tested against each other. Critically, such tests will have to link brain processes to overt behavior and will thus have to be performed in awake animals or humans. In the following sub-sections 8.1 to 8.3, we suggest a few such experiments.

#### **8.1. THE DIRECT PATHWAY: UPDATE vs. MAINTENANCE OF WORKING-MEMORY CONTENT**

Models are relatively unanimous about the direct pathway's functional contribution to motor responding. With regard to its potential involvement in working-memory processes, however, two relatively incompatible hypotheses have been proposed (section 4.5): according to the first, the direct pathway takes part in gating working-memory content (Gruber et al., 2006; O'Reilly and Frank, 2006), while working-memory maintenance is subserved by the cortex. According to the second, the direct pathway contributes to working-memory maintenance, while working-memory updating is ensured by the hyperdirect pathway. Please note, however, that a direct-pathway involvement in both maintenance and updating of working memory content is conceivable: the direct pathway could, for instance, contribute to working-memory maintenance in closed loops and to working-memory updating in interlinked open loops.

To test the maintenance hypothesis against the updating hypothesis, an experimenter could inactivate direct-pathway MSNs in genetically modified mice (cf. Hikida et al., 2010) and observe the effects of this manipulation on working-memory performance. If the direct pathway is involved in gating of information, but not in its maintenance, updating of working-memory content should be impaired, while there should be no loss of information over time once the information is correctly gated into working memory. In brief, errors should mostly be of the perseverative type. If, in contrast, the direct pathway takes part in maintenance of information, gating might be relatively unimpaired, but working-memory content should decay over time. Animals should then show relatively random (rather than perseverative) errors. The experimenter might want to inactivate direct-pathway MSNs during *training* of working-memory tasks, since this might more consistently involve BG participation than an already automatized task (cf. Antzoulatos and Miller, 2011; Waldschmidt and Ashby, 2011).

Related studies could also be performed with human subjects: rather than *inactivating* the direct pathway, however, natural variation in direct-pathway gene expression could be related to working-memory performance (cf. Heinz et al., 1996, for such a study outside the context of working memory).

#### **8.2. THE SHORT INDIRECT PATHWAY: REVERSAL LEARNING vs. AVOIDANCE OF AVERSIVE EVENTS**

It has been proposed that the short indirect pathway learns response inhibition based on aversive events (Frank et al., 2004). Alternatively, the pathway has been hypothesized to learn inhibition based on omissions of expected rewards, which occur consistently during reversal learning and extinction (Schroll et al., 2013). While these functions do not in principle contradict each other, neither is yet firmly established empirically. To test them, an experimenter could design a stimulus-response learning paradigm. In a first phase of this experiment, animals would learn associations between stimuli and responses (i.e., button presses) based upon either rewards that are presented when the correct button is chosen or punishments that are presented when the incorrect button is chosen. In a second phase, previously learned stimulus-response associations would be reversed or extinguished. Measures of the short indirect pathway's strength would be recorded over the learning process (e.g., magnitudes of phasic firing-rate decreases in GPe during responding). The aversive-events hypothesis predicts that phasic decreases in GPe activity should become stronger after each aversive event, whereas the reward-omission hypothesis predicts that phasic decreases should become stronger during reversal of rewarded associations or during extinction.

#### **8.3. SURROUND INHIBITION: LONG INDIRECT vs. SHORT INDIRECT vs. HYPERDIRECT PATHWAY**

While some authors hypothesize surround inhibition of unwanted motor programs to be implemented via long or short indirect pathways (section 5.5; Mink, 1996; Stocco et al., 2010), others hypothesize the hyperdirect pathway to control such a function (section 6.4; Nambu, 2004; Schroll et al., 2013). While the mechanisms that are assumed to establish surround inhibition differ between models, their effects are mostly equivalent (**Figures 4B–D**): a "central" cortical representation is activated, while its surrounding representations (i.e., competitors) are suppressed. To find out which pathway (if any) is responsible for such surround inhibition, an experimenter could measure GPi firing rates in intact, GPe-lesioned and STN-lesioned animals during performance of clearly defined motor responses that have easily identifiable competitors (such as moving a limb toward left vs. right). The surround-inhibition hypothesis of unwanted motor programs implies that those GPi neurons that show a phasic decrease in activity with response *A* show an increase in activity during competitive response *B*, and that there are other neurons that behave vice versa. If the long indirect pathway is responsible for such surround inhibition, lesions of STN and GPe should each eliminate the phasic increase in GPi activity. If, however, the short indirect pathway is involved, only lesions of GPe should eliminate it. If the hyperdirect pathway is critical, only STN lesions should abolish the phasic increase in firing.
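The logic of these lesion predictions can be written down as a small lookup. The following is only an illustrative sketch: the dictionary, the function name and the boolean inputs are hypothetical conveniences, and the mapping encodes nothing beyond the three predictions stated above.

```python
# Illustrative encoding of the lesion predictions described in the text.
# For each hypothesis: the set of lesions predicted to eliminate the phasic
# increase in GPi firing during the competing response B.
# All names here are hypothetical; they are not taken from any model code.
PREDICTED_TO_ABOLISH = {
    "long_indirect": {"STN", "GPe"},   # either lesion should eliminate it
    "short_indirect": {"GPe"},         # only GPe lesions should eliminate it
    "hyperdirect": {"STN"},            # only STN lesions should eliminate it
}


def consistent_hypotheses(abolished_by_stn_lesion, abolished_by_gpe_lesion):
    """Return the hypotheses whose predictions match the observed lesion effects."""
    observed = set()
    if abolished_by_stn_lesion:
        observed.add("STN")
    if abolished_by_gpe_lesion:
        observed.add("GPe")
    return sorted(
        name
        for name, predicted in PREDICTED_TO_ABOLISH.items()
        if predicted == observed
    )
```

An empty result (neither lesion abolishes the phasic increase) would correspond to the "if any" case above, in which none of the three pathways accounts for surround inhibition.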

## **9. FUTURE DIRECTIONS: MODEL DEVELOPMENTS**

Although existing models of BG pathways account for a variety of anatomical, physiological and biochemical data, some major findings have not yet been implemented at all. **Table 2** contains some of these findings and specifies how computational modeling might help to understand their significance with regard to the functions of BG pathways.

It may be noted that **Table 2** repeatedly relates to synaptic plasticity. We hold the view that the mechanisms of synaptic plasticity in BG pathways are a key to understanding their functions. Many computational models rely on the assumption that BG vitally contribute to reinforcement learning. If this is correct, the pathways' mechanisms of synaptic learning must be of central importance. To date, however, only cortico-striatal plasticity has been investigated extensively (e.g., Shen et al., 2008), whereas potential mechanisms of plasticity in cortico-subthalamic, striato-pallidal, subthalamo-pallidal, and striato-striatal synapses remain elusive. Because of these knowledge gaps, computational models differ extensively in their assumptions on the mechanisms of synaptic plasticity in BG pathways. Combined empirical and modeling efforts will be required to unveil these mechanisms and to analyze how they contribute to the functions of BG pathways. Neurocomputational modeling in particular might be used to investigate how synaptic plasticity controls the emergence of pathway functions. Schroll et al. (2013) recently showed that by specifying the rules of synaptic plasticity in a computational model of BG pathways (but not the pathways' patterns of connectivity), pathway functions self-organized as the model learned a behavioral task from reinforcements. Such an approach of specifying plasticity and investigating the emergence of pathway functions could be repeated for refined mechanisms of plasticity (accounting, for instance, for spike-time-dependent effects) and extended to BG elements that are not core components of BG pathways (like striatal interneurons or the back-projections from GPe to the striatum). Different models may be compared against each other by analyzing their performance on relevant behavioral paradigms.

*Some empirical findings have not yet been implemented in computational models. We here highlight some of these findings and outline what computational modeling might contribute to their understanding.*

## **10. CONCLUSIONS**

In the Introduction we posed the question of why BG contain such a multitude of nuclei and fiber tracts. By reviewing influential hypotheses on the functions of BG pathways, we hope to have outlined potential functional advantages of such complexity. It has to be admitted, however, that all of the interpretations reviewed have been developed from a reverse-engineering standpoint, asking why BG are complex, given that this is the case. They do not answer why complexities evolved in the first place or whether there might have been simpler solutions that would have guaranteed equivalent functionality.

Most theorists assume that BG nuclei and fiber tracts give rise to separate pathways and that these pathways fulfill distinct functions. While they mostly agree that the direct BG pathway activates specific cortical representations, the functions of indirect and hyperdirect pathways are under intense debate. We have outlined various hypotheses on these pathways' functions and suggested that they may be grouped according to these pathways' hypothesized interactions with the direct pathway. Specifically, we have identified push-and-pull opposition, center-surround cooperation, global blocking of direct-pathway effects and modulation of direct-pathway effects as major proposed interaction patterns.

We hope to have motivated stringent empirical tests of hypotheses on pathway functions, believing that theory-based research promises exciting advances in the understanding of BG complexity.

#### **ACKNOWLEDGMENTS**

Fred H. Hamker and Henning Schroll have been supported by the German Research Foundation (Deutsche Forschungsgemeinschaft) grants "German-Japanese Collaboration Computational Neuroscience: The function and role of Basal Ganglia Pathways: From single to multiple loops" (DFG HA2630/8-1), and "Deep brain stimulation. Mechanisms of action, Cortex-Basal Ganglia-Physiology and Therapy Optimization" (KFO 247: DFG HA2630/7-1).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 July 2013; accepted: 11 December 2013; published online: 30 December 2013.*

*Citation: Schroll H and Hamker FH (2013) Computational models of basal-ganglia pathway functions: focus on functional neuroanatomy. Front. Syst. Neurosci. 7:122. doi: 10.3389/fnsys.2013.00122*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Schroll and Hamker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 25 December 2013 doi: 10.3389/fnsys.2013.00118

## Effects of deep brain stimulation of the subthalamic nucleus on inhibitory and executive control over prepotent responses in Parkinson's disease

## *Marjan Jahanshahi\**

*Cognitive Motor Neuroscience Group and Unit of Functional Neurosurgery, Sobell Department of Motor Neuroscience and Movement Disorders, UCL Institute of Neurology, The National Hospital for Neurology and Neurosurgery, London, UK*

#### *Edited by:*

*Ahmed A. Moustafa, University of Western Sydney, Australia*

#### *Reviewed by:*

*Benedicte Ballanger, Centre National de la Recherche Scientifique, France*

*Tamara Hershey, Washington University, USA*

#### *\*Correspondence:*

*Marjan Jahanshahi, Cognitive Motor Neuroscience Group and Unit of Functional Neurosurgery, Sobell Department of Motor Neuroscience and Movement Disorders, UCL Institute of Neurology, The National Hospital for Neurology and Neurosurgery, 33 Queen Square, London WC1N 3BG, UK e-mail: m.jahanshahi@ucl.ac.uk*

Inhibition of inappropriate, habitual or prepotent responses is an essential component of executive control and a cornerstone of self-control. Via the hyperdirect pathway, the subthalamic nucleus (STN) receives inputs from frontal areas involved in inhibition and executive control. Evidence is reviewed from our own work and the literature suggesting that in Parkinson's disease (PD), deep brain stimulation (DBS) of the STN has an impact on executive control during attention-demanding tasks or in situations of conflict when habitual or prepotent responses have to be inhibited. These results support a role for the STN in an inter-related set of processes: switching from automatic to controlled processing, inhibitory and executive control, adjusting response thresholds and influencing speed-accuracy trade-offs. Such STN DBS-induced deficits in inhibitory and executive control may contribute to some of the psychiatric problems experienced by a proportion of operated cases after STN DBS surgery in PD. However, direct evidence for a link between STN DBS-induced deficits in inhibitory and executive control and the post-surgical psychiatric complications experienced by operated patients is currently lacking and needs to be established.

**Keywords: subthalamic nucleus, Parkinson's disease, deep brain stimulation, inhibition, executive control, prepotent responses**

### **INTRODUCTION**

Parkinson's disease (PD) is the most typical basal ganglia disorder. In addition to the core motor symptoms of tremor, rigidity, bradykinesia, and akinesia, patients experience a host of non-motor symptoms which include cognitive impairment and psychiatric disorders, particularly depression, anxiety, apathy, hallucinations, and delusions. In relation to cognition, executive dysfunction can be present from the early stages of the illness, and this and other forms of mild cognitive impairment can evolve into dementia in the later phases in a proportion of cases (Emre et al., 2007; Litvan et al., 2011, 2012; Dirnberger and Jahanshahi, 2013).

There is now evidence from randomized controlled studies that surgical treatment of PD with deep brain stimulation (DBS) of the subthalamic nucleus (STN) is effective in controlling the motor symptoms of the disease and improving the quality of life of the patients (e.g., Deuschl et al., 2006; Weaver et al., 2009, 2012; Follett et al., 2010; Williams et al., 2010). Also, a number of controlled studies have established that STN DBS does not produce any major deficits in global aspects of cognition in PD (e.g., Smeding et al., 2006; Witt et al., 2008; Weaver et al., 2009; Follett et al., 2010; Williams et al., 2010, 2011). Furthermore, the impact of STN DBS on cognition has been examined in a number of studies which have followed up patients for 5 (Schüpbach et al., 2005), 8 (Fasano et al., 2010), or 10 (Castrioto et al., 2011) years, and the rates of dementia reported across these studies range from 5–6% to 17–22%. These rates are no higher than those found as part of the natural history and progression of PD in longitudinal studies of cognition (Hughes et al., 2000; Aarsland et al., 2003; Hely et al., 2008), and suggest that STN DBS does not alter the risk of cognitive decline. The effect of STN DBS on more specific aspects of cognition was examined by Parsons et al. (2006) in a meta-analysis of 28 studies published between 1999 and 2006 based on 612 patients. The deterioration of verbal fluency with STN DBS was the most consistently reported change and had the highest effect size. There was also a small but still significant effect on verbal functions and executive functions.

Against this background (STN DBS significantly improves the motor symptoms of PD and, apart from a deterioration of verbal fluency, has no major negative impact on global cognitive function and is not associated with an increased risk of cognitive decline), there is evidence that a specific aspect of executive function, executive control of action, is impaired with STN DBS in PD; this is the focus of the rest of this review. To highlight these STN DBS-induced deficits in executive control of action, the main part of the review reports the results of studies which have examined STN DBS effects on a range of tests including the Stroop, random number generation (RNG), stop signal task, go no go reaction times, and tasks involving decision-making under conflict. Where available, relevant imaging and electrophysiological recording studies are also discussed.

**Abbreviations:** STN, subthalamic nucleus; DBS, deep brain stimulation; PD, Parkinson's disease; RT, reaction time; SAT, speed accuracy trade-off; RNG, random number generation; CS1, count score 1; SSRT, stop signal reaction time; LFP, local field potential; pre-SMA, pre-supplementary motor area; ACC, anterior cingulate cortex; DLPFC, dorsolateral prefrontal cortex; IFC, inferior frontal cortex; GPi, internal segment of the globus pallidus; ERS, event-related synchronization.

## **EXECUTIVE CONTROL OVER PREPOTENT RESPONSES**

Executive control is considered to be achieved by the frontal cortex in non-routine and demanding situations. Norman and Shallice (1986) defined situations that require executive control as those that involve planning or decision making, situations where responses are not well-rehearsed or contain novel sequences of actions or are dangerous or technically difficult, those that involve error correction or troubleshooting, and finally situations that require resisting temptation or overcoming a strong habitual response. The latter situation requires inhibition of strong habitual responses to allow engagement in alternative behavior more suited to the context. Such strong habitual responses are prepotent, in that they are likely to be executed fast and automatically, without much attention or thought. Isoda and Hikosaka (2011) distinguished three mechanisms for development of prepotent responses. The first is an innate mechanism whereby a salient stimulus naturally and instinctively draws a response, such as the orienting response to a flashing light. The second, motivational prepotency, is elicited by a highly valued and immediately available reward, as in the delay discounting task or a food-deprived dieter reaching for and eating a doughnut when faced with a plate full of them. The third, habitual prepotency, is developed through repetition and practice; for example, a driver stopping the car when the traffic light turns red.

A key feature of habitual prepotent responses is that they are executed fast (Schneider and Chein, 2003), presumably because they reach the threshold for execution before alternative responses. According to evidence accumulation models, the response threshold reflects the amount of information that needs to be accumulated before a response is made (e.g., Ratcliff, 1978). In these models, the speed and accuracy of responses are controlled by a change in the distance between baseline and response threshold levels. If the distance is short, the threshold will be reached quickly, but noisy inputs and incorrect activations are likely to reach threshold first, resulting in fast but error-prone responses. In contrast, if the distance is large, the threshold will be reached more slowly, with a smaller probability of incorrect activations reaching threshold first, such that responses will be made slowly but accurately (Bogacz et al., 2010). Control of such speed accuracy trade-offs (SAT) has been attributed to changes in baseline activity in cortical areas (pre-supplementary motor area, pre-SMA, or dorsolateral prefrontal cortex), striatum or the STN, or to strengthening of synaptic cortico-striatal connections (for review see Bogacz et al., 2010). There is some support from fMRI studies showing increased activation of the pre-SMA, striatum and STN with changes in response caution and SAT under different experimental conditions (e.g., Forstmann et al., 2008; Mansfield et al., 2011). According to Isoda and Hikosaka (2011), inhibitory control over habitual prepotent responses may simply delay or postpone them, to allow time for alternative more controlled responses to reach threshold.
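The dependence of speed and accuracy on the baseline-to-threshold distance can be illustrated with a toy simulation. The sketch below is a generic two-boundary random-walk accumulator, not the specific model of any study cited above; the function name, the drift, noise and threshold values, and the use of step counts as a stand-in for reaction time are all assumptions chosen for illustration.

```python
import random


def simulate_decisions(threshold, drift=0.2, noise_sd=1.0, n_trials=2000, seed=0):
    """Return (accuracy, mean_steps) for a toy two-boundary accumulator.

    Evidence starts at a baseline of 0 and accumulates noisy samples until it
    crosses +threshold (correct response) or -threshold (incorrect response).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_correct = 0
    total_steps = 0
    for _ in range(n_trials):
        evidence = 0.0
        steps = 0
        while abs(evidence) < threshold:
            # Each sample is a small drift toward the correct boundary plus noise.
            evidence += drift + rng.gauss(0.0, noise_sd)
            steps += 1
        n_correct += evidence >= threshold
        total_steps += steps
    return n_correct / n_trials, total_steps / n_trials


# Short distance to threshold: fast but error-prone responding.
low_acc, low_rt = simulate_decisions(threshold=2.0)
# Large distance to threshold: slow but accurate responding.
high_acc, high_rt = simulate_decisions(threshold=6.0)
```

With these settings the larger threshold yields both higher accuracy and more accumulation steps per decision, which is the speed-accuracy trade-off described in the text.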

Thus, an important aspect of executive control is inhibitory control, which encompasses the ability (i) not to react automatically to external stimuli, (ii) to exert control over internal impulses, and (iii) to prevent automatic performance of habitual responses in situations where more controlled processing is required. Impulsive individuals tend to act fast without reflection or foresight. However, impulsivity is multifactorial and various forms of impulsivity have been described including reflection impulsivity (act fast without taking time to reflect), impulsive action (inability to control prepotent responses as reflected by premature responses in go no go RT tasks and failure of motor inhibition in stop signal tasks) and choice impulsivity (failure of delayed gratification); which, respectively, operate at the preparation, execution and outcome stages of behavioral control (Evenden, 1999). The neural and neurochemical bases of these different forms of impulsivity have been recently reviewed (see Dalley et al., 2011; Dalley and Roiser, 2012).

Such inhibitory control over internal impulses, responses triggered by external stimuli, or prepotent habitual responses is a cornerstone of self-control and essential for adaptive decision-making and appropriate social interaction. As outlined in the supervisory attentional system of Norman and Shallice (1986), inhibitory control over behavior is volitional and hence resource and attention-demanding. Inhibitory control over behavior can be reactive or proactive, operate globally or be selective, with these proposed to differentially engage the hyperdirect and indirect fronto-striatal pathways (Aron, 2011). Reactive inhibition is reflected for instance in the ability to stop oneself from continuing to cross the road if a fast car approaches and represents adaptive modification of behavior triggered by a sudden and unexpected stimulus. Proactive inhibition involves responding with restraint to meet goals and objectives. In the above example, proactive inhibition or action restraint would be the slowing down of one's walking pace when approaching the busy road. In daily life, proactive inhibition often concerns the preparedness to act with restraint in the face of temptation or situations that challenge self-control such as drinking, smoking or eating sweets. Proactive inhibition is considered essential for self-control and most often goes awry in psychiatric disorders (Jaffard et al., 2008; Aron, 2011). In real-life situations, inhibitory control is often a key process in conflict resolution. The necessity to decide between equally salient or valued or incompatible options can induce a conflict. When faced with such conflict between available options, inhibitory control is imposed on responding, to prevent hasty decisions and premature responses until an optimal decision is arrived at (Frank, 2006). These inter-related inhibitory processes, reactive and proactive inhibition and conflict resolution, are essential for executive control and to ensure adaptive behavior (Frank, 2006; Verbruggen and Logan, 2009; Aron, 2011).

A factor analysis of different behavioral measures of impulsivity and risk taking has revealed two main factors. The first related to "impulsive action" and measures of inhibition of prepotent responses on go no go or stop signal tasks. The second factor corresponded to "impulsive choice/decision" and measures of risk taking and delay discounting (Reynolds et al., 2006). In the same way that impulsivity is multi-faceted, inhibition is not a unitary concept and has also been shown to have different components. In their factor analytic study of nine different measures of inhibition on a sample of 220 students, Friedman and Miyake (2004) identified three factors which they labeled "inhibition of prepotent responses," as measured by tasks such as the Stroop or stop signal RT task; "resistance to distractor interference," with tasks such as the Eriksen flanker task; and finally protection from "proactive interference," which measures resistance to memory intrusions from previously learned information, with loadings from memory tasks such as the Brown-Peterson. Subsequently, the inhibition of prepotent responses and the resistance to distractor interference factors were shown to be related (*r* = 0.67) and were combined into a single factor. One aspect of RNG, suppression of habitual counting, was found to be related to response-distractor inhibition. Thus, both inhibition and impulsivity are multi-faceted, and here we are dealing with action impulsivity and inhibition of prepotent responses.

### **THE HYPERDIRECT, DIRECT AND INDIRECT PATHWAYS**

The connectivity between the cortex and the basal ganglia occurs via three pathways: the hyperdirect, direct and indirect pathways (see **Figure 1**). These pathways have been considered to constitute an ideal system for response selection under competition or conflict. In situations of conflict, the hyperdirect pathway via the STN is proposed to increase the response threshold to prevent premature responses and to allow time for information accumulation/reflection and selection of the appropriate response, while the indirect pathway via the STN inhibits inappropriate responses to allow selection of the appropriate response through the direct pathway (e.g., Chevalier and Deniau, 1990; Mink and Thach, 1993; Redgrave et al., 1999; Frank, 2006; Frank et al., 2007).

In exerting inhibitory control over prepotent or habitual responses, the priority is to stop the prepotent response from being executed. The hyperdirect route from the cortex to STN is the shortest and fastest route for influencing the tonic inhibition of the basal ganglia output pathways over the cortex and achieving inhibition of action. The STN receives input from many frontal areas including the motor cortex, pre-SMA, caudal and dorsal premotor cortex, dorsolateral prefrontal cortex (DLPFC), anterior cingulate cortex (ACC), and inferior frontal cortex (IFC) (Afsharpour, 1985; Parent and Hazrati, 1995; Nambu et al., 1997). Some of these areas such as the pre-SMA, IFC, ACC, and DLPFC are known to be involved in inhibitory control from investigations of the effects of accidental focal lesions in man (Devinsky et al., 1995; Aron et al., 2003; Dimitrov et al., 2003; Rieger et al., 2003; Sumner et al., 2007; Gläscher et al., 2012). This means that the STN is well-placed for a role in executive control through inhibition. Furthermore, recent imaging and tractography has revealed STN connectivity with the pre-SMA and IFC in man (Aron et al., 2007). Recent evidence from anterograde tracing studies in macaque monkeys suggests a topographically organized prefrontal-STN hyperdirect pathway: limbic areas project to the medial tip of the STN, straddling its border and extending into the lateral hypothalamus; associative areas project to the medial half and motor areas to the lateral half; and limbic projections terminate rostrally, motor projections more caudally. A high degree of convergence existed between projections from functionally diverse cortical areas, which was considered to allow both functional specificity and integration (Haynes and Haber, 2013). Similar parcellation of the STN into three distinct zones was achieved *in vivo* and noninvasively in a study of brain connectivity profiles with diffusion weighted imaging which showed distinct limbic, associative and motor regions in the anterior, middle and posterior sections of the STN (Lambert et al., 2012).

Ablation of D2 receptor expressing striatal neurons in mice resulted in motor hyperactivity and examination of neuronal activity in the globus pallidus (GP) and substantia nigra pars reticulata (SNr) demonstrated that this ablation induced dramatic changes in the cortically evoked triphasic response of early excitation, inhibition and late excitation in the GP and SNr. It was concluded that the phasic late excitation in the SNr through the striatopallidal indirect pathway plays a key role in stopping movement and preventing motor hyperactivity (Sano et al., 2013).

Of interest is a recent study by Cui et al. (2013) which developed and used a novel *in vivo* photometry method to measure activity of spiny projection neurons in the direct and indirect pathways. Contrary to the classical model of pro-kinetic "go" activity in the direct pathway and anti-kinetic "no go" activity in the indirect pathway, during free movements in genetically engineered mice there was co-activation of striatal neurons in both the direct and indirect pathways, and spiny projection neurons in both pathways were quiet during inactive states. In an editorial on the Cui et al. paper, it was proposed that instead of issuing simple, generalized go or no go/stop commands, the co-activation of the direct and indirect pathways may be signaling "what to do" and "what not to do" and hence making recommendations about specific movements and their likely outcomes, which would enable selection of the optimal course of action (Surmeier, 2013).

## **MODELS AND PREDICTIONS**

Two models are relevant to a putative role for the STN in inhibitory and executive control. The first model, proposed by Frank (2006) and Frank et al. (2007), considers the normal function of the STN to be issuing a "no go" signal that raises response thresholds during decision-making in situations of conflict, thereby preventing premature and impulsive responding and allowing time for further information accumulation and reflection before a decision is made and a response is selected and executed. According to this model, alteration of STN activity, as with STN DBS in PD, should interfere with this normal function of the STN in raising the response threshold in situations of conflict and therefore be associated with impulsive responding.

The second model concerns the ability to switch between automatic/habitual and controlled/goal-directed processing in a timely and efficient manner, which is a critical executive process (Shiffrin and Schneider, 1984). Automatic habitual processing is employed when executing well-learned behaviors that require little attention. In contrast, when engaging in an attentionally demanding behavior, such as when new learning is involved or when deliberately trying to override a well-learned habitual behavior, goal-directed controlled processing is necessary. It has been proposed that a critical function of the STN is switching between automatic and controlled strategies and that the STN receives a switching signal from regions of the prefrontal cortex, including the pre-SMA (Isoda and Hikosaka, 2008). Single cell recordings in primates during oculomotor tasks requiring such behavioral switching have revealed switch-selective neurons in both the pre-SMA and the STN. Importantly, when a controlled response was required by the context, neurons in the pre-SMA fired *before* those in the STN, which in turn fired *before* response execution. Thus, recordings of neuronal activity in primates suggest that the STN implements a switch signal from the pre-SMA which enables a shift from habitual to controlled processing (Isoda and Hikosaka, 2008). Consequently, high frequency stimulation of the STN during STN DBS in PD may interfere with the ability to switch between automatic and controlled processing when a situation/task demands it.

#### **STN DBS IN PD IS ASSOCIATED WITH DEFICITS IN INHIBITORY OR EXECUTIVE CONTROL OVER PREPOTENT RESPONSES**

In experimental animals such as the rat, lesions of the STN result in increased premature responses in reaction time tasks (Baunez et al., 1995), impulsivity and an inability to inhibit operant responses (Wiener et al., 2008), and a generalized impairment of stopping on a modified stop signal reaction time task (Eagle et al., 2008). In man, accidental lesions of the STN have resulted in hemiballism, hyperphagia, hypersexuality, logorrhea, euphoria and impulsivity, symptoms indicative of motor and behavioral disinhibition (Trillet et al., 1995; Absher et al., 2000; Park et al., 2011).

While PD is primarily a disorder of response initiation characterized by akinesia, or poverty of spontaneous and automatic actions such as blinking, gesturing and facial expression, and bradykinesia, or slowness of movement initiation and execution, some of the other symptoms of PD such as freezing of gait or medication-induced dyskinesias represent excessive inhibition or disinhibition of movement, respectively. Patients with PD have been shown to have deficits in inhibitory control on tasks requiring inhibition of prepotent motor responses such as go no go RT (Cooper et al., 1994) or stop signal RT (Gauggel et al., 2004; Obeso et al., 2011a,b), as well as on cognitive tasks requiring inhibition of prepotent or habitual responses such as the Stroop, RNG or the Hayling sentence completion task (Obeso et al., 2011a).

There is now an increasing body of evidence suggesting that treatment of PD with STN DBS is associated with deficits in inhibitory and executive control. This literature is largely based on an STN DBS on vs. off methodology, sometimes combined with imaging, with recording of local field potentials (LFPs) from electrodes surgically implanted in the STN, or with intraoperative recording of single neuronal activity from the STN. It has relied on a variety of tasks and is reviewed below according to the specific tasks employed.

#### **THE STROOP INTERFERENCE TASK**

The Stroop interference task (Stroop, 1935) is a classic example of a task that involves response selection under conflict and requires inhibitory control over habitual prepotent responses in order to select an alternative response. Reading words is a habitual prepotent response built up through years of exposure to printed words. In the Stroop interference task (**Figure 2**), the color words red, blue and green are presented in incongruent ink. For example, the word red is printed in blue ink. The participants' instruction is to name the color of ink the word is printed in. To do this, the participant has to suppress the more habitual and prepotent response of reading the word (red), in order to select the alternative response and name the color of ink it is printed in (blue). As a result, participants take longer to complete and make more errors on the Stroop interference task than on a control task in which they name the color of ink of colored rectangles printed in red, blue or green.

Jahanshahi et al. (2000a) used a DBS on vs. off methodology to assess seven patients with PD who had had STN DBS and 6 with GPi DBS. Patients were assessed after overnight withdrawal of dopaminergic medication in the off state. While PD patients were significantly faster on the Stroop control task with STN DBS on than off, with STN DBS on, they made significantly more errors on the Stroop interference task, compared to when the stimulators were off. These effects were not significant for the patients with GPi DBS. This was the first experimental demonstration that STN DBS induced deficits in inhibitory control and an inability to suppress habitual prepotent responses and to engage in response selection under conflict in PD. These findings were subsequently replicated by others (Witt et al., 2004).

With PET, Schroeder et al. (2002) assessed the neural substrates of this STN DBS-induced deficit in inhibitory control on the Stroop interference task. Seven patients with PD performed the Stroop task or a control task naming animals, also printed in colored ink. With STN stimulation on, the Stroop interference effect (the difference between completion of the Stroop interference and control tasks) was significantly larger than with DBS off. This greater interference effect and inability to inhibit the prepotent response with STN DBS was associated with increased activation of the left angular gyrus, an area related to processing of words. Its greater activation with STN stimulation was taken to reflect the increased inability to suppress processing of the words, probably related to the reduced activation of the ventral striatum and anterior cingulate observed with STN stimulation compared to DBS off. This suggests that the inability to inhibit prepotent word reading responses and to engage in response selection under conflict on the Stroop interference task is mediated by reduced activation of the limbic circuit induced by STN stimulation.

More recently, Brittain et al. (2012) recorded LFPs from the electrodes implanted in the STN of 12 medicated PD patients in the few days post-operatively when this is possible before the electrodes are connected to the impulse generating device. They used a computerized version of the Stroop interference task with color words presented in incongruent ink and a control task in which color words were presented in congruent ink. As expected, during incongruent trials, response times were longer (980 ms) and error rates higher (6.5%) due to the Stroop effect than on the congruent trials (807 ms, error rate 1.3%). They found stimulus driven beta desynchronization (15–35 Hz) that lasted throughout the verbal response, consistent with the idea that beta synchrony decreases to allow motor output to occur. On the incongruent trials there was a rebound in beta desynchrony between −300 and 100 ms prior to the response which was not present for the congruent trials. They then looked at response locked change in beta power for correct and incorrect incongruent trials relative to baseline. On correct incongruent trials, when the prepotent response was successfully inhibited, a beta resynchronization was seen before the response. During incorrect incongruent trials, when the patient failed to inhibit the prepotent response, the beta resynchronization occurred after the response onset. On correct trials, the beta power resynchronization occurred at a mean −138 ms before the response, whereas for the incorrect response, this occurred 132 ms after the response. It was suggested that this beta resynchronization or rebound during incongruent trials is an inhibitory signal via the hyperdirect pathway which pauses the motor system and delays the prepotent response until the conflict can be resolved and a correct response is selected and produced.
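As an arithmetic check on the behavioral figures quoted above, the Stroop effect can be written as the incongruent-minus-congruent difference. The short sketch below uses only the group means given in the text; the function name `stroop_effect` is illustrative and is not taken from Brittain et al. (2012).

```python
def stroop_effect(incongruent, congruent):
    """Return the interference cost: incongruent value minus congruent value."""
    return incongruent - congruent


# Group means quoted in the text (Brittain et al., 2012).
rt_cost_ms = stroop_effect(980, 807)       # response time cost, in ms
error_cost_pct = stroop_effect(6.5, 1.3)   # error rate cost, in percentage points

print(f"RT cost: {rt_cost_ms} ms")
print(f"Error cost: {error_cost_pct:.1f} percentage points")
```

On these means the interference cost is 173 ms in response time and about 5.2 percentage points in error rate.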

#### **RANDOM NUMBER GENERATION**

Random number generation (RNG) is procedurally simple. Participants are instructed to say the numbers 1–9 in a random fashion, as if picking them out of a hat, in synchrony with a pacing stimulus for 100 trials. RNG involves a number of executive processes. As there are nine possible responses to select from on each trial, RNG involves response selection under conflict. In addition, to engage in strategic response selection in a random fashion, participants have to suppress habitual counting in series (e.g., 123 or 987, measured as count score 1, CS1) which is a prepotent habitual response, developed through years of experience with numbers. Selection and switching of generation strategies, monitoring of the output and synchronizing responses with the pacer are other processes involved in the task. As a result, RNG is an attention-demanding task that interferes with performance of other attention-demanding tasks and in turn is subject to interference when performed concurrently with other such tasks under dual-task conditions (Baddeley, 1966; Robertson et al., 1996; Brown et al., 1998). In fact, during such dual task conditions or when paced RNG is performed at faster rates, which also increases attentional demands of the task, there is a significant increase in habitual counting (CS1) during RNG, indicating that participants are less able to suppress habitual counting and engage in strategic response selection (Brown et al., 1998; Jahanshahi et al., 1998, 2000b, 2006; Dirnberger et al., 2005). With imaging, it has been shown that in healthy young individuals, performance of paced RNG at the fastest rates, requiring a response once every 1 or 0.5 s, is associated with a significant increase in habitual counting (CS1) and a decrease in frontal activation, presumably because the response selection and synchronization demands of the task exceed capacity (Jahanshahi et al., 2000b).
Patients with PD, who show differentially greater increase in habitual counting at faster rates of paced RNG relative to age-matched controls, unlike the controls, fail to show task- or rate-dependent modulation of frontal activation, a dysfunction related to group differences in GPi activation across tasks and rates (Dirnberger et al., 2005).
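To make the idea of "habitual counting in series" concrete, the sketch below tallies adjacent responses that step up or down by one. This is a deliberately simplified stand-in: the published CS1 scoring rule may weight longer runs differently, and the function name and example sequences are hypothetical.

```python
def count_adjacent_steps(sequence):
    """Count successive responses that differ by exactly 1 (e.g., 3->4 or 8->7).

    Simplified illustration only; not the published CS1 scoring rule.
    """
    return sum(1 for a, b in zip(sequence, sequence[1:]) if abs(a - b) == 1)


# Hypothetical response sequences (digits 1-9), not data from any cited study.
habitual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8]   # mostly counting in series
mixed = [4, 9, 1, 6, 6, 2, 8, 3, 7, 5]      # no adjacent steps of one

print(count_adjacent_steps(habitual))
print(count_adjacent_steps(mixed))
```

A higher tally indicates more reliance on the prepotent counting habit and less strategic response selection.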

Thobois et al. (2007) used PET to investigate the effect of STN DBS in PD on patterns of brain activation during fast-paced RNG or a control counting task (counting in series from 1 to 9) both paced by a 1 Hz tone. While STN DBS significantly improved the motor symptoms of PD, compared to DBS off, patients engaged in significantly higher habitual counting (CS1) with STN stimulation. STN DBS did not influence synchronization with the pacing stimulus as measured by the total time taken to complete the RNG task. STN stimulation was associated with a significant increase in activation of the right GPi, and significantly decreased activation in the left DLPFC, the left posterior cingulate and anterior cingulate during fast-paced RNG (see **Figure 3**). Furthermore, the measure of habitual counting during RNG, CS1, was significantly and negatively correlated with activation in the left anterior cingulate (BA 32), left inferior frontal gyrus (BA 47) and the left posterior cingulate (BA 23), indicating that reduced activation in these areas was associated with increased habitual counting during RNG. Using the right GPi as the seed area, psychophysiological interactions showed that STN stimulation was associated with negative coupling between the GPi and the left inferior frontal gyrus, the left anterior cingulate and the right posterior cingulate (see **Figure 4**). These results were the first demonstration that STN stimulation interfered with inhibitory control over habitual responses and strategic response selection under conflict by altering pallidal-frontal-cingulate coupling during performance of the fast-paced RNG.

More recently, Anzak et al. (2013) recorded LFPs bilaterally from the electrodes implanted in the STN in 7 PD patients in the immediate post-operative phase while the patients performed 6 trials of either a paced (0.5 Hz) RNG or a control counting task. Performance of the paced RNG was associated with a significant increase in gamma band power in the 45–60 Hz range relative to the control counting task. Furthermore, STN LFP increases in the gamma band during RNG were significantly and positively correlated with the number of repeated pairs (a measure of controlled processing during RNG which participants engage in at slower rates of paced RNG when there is time for controlled processing and volitional repetition of the same number across successive trials) and negatively correlated with CS1 (the measure of habitual counting and automatic processing during RNG), suggesting that the higher gamma power change may represent a switch from automatic to controlled processing during the RNG task. These results directly relate measures of switching from automatic to controlled processing during the RNG task (indexed by the CS1 and repeated pairs measures, respectively) to modulation of activity in the STN itself.

#### **STOP SIGNAL RT TASK**

Another task that requires inhibition of prepotent responses is the stop signal reaction time (RT) task (Logan and Cowan, 1984), in which a stop signal, presented at variable stop signal delays after a go signal, instructs participants to inhibit the response prepared following the go signal which may be close to execution and hence prepotent (**Figure 5**). This task has been widely used to measure reactive inhibition, through estimation of the stop signal reaction time (SSRT), on the basis of the "horse race" model which proposes that the outcome of the race between the go and the stop process determines whether the participant successfully stops or fails to stop and responds on the stop trials. Imaging studies in healthy participants have shown that successful motor inhibition on the stop signal task is associated with increased activation of frontal areas such as the pre-supplementary motor area (pre-SMA), IFC and the anterior cingulate as well as the striatum and the STN (Rubia et al., 2003; Aron and Poldrack, 2006; Li et al., 2006, 2008; Aron et al., 2007; Zandbelt and Vink, 2010). Furthermore, in an fMRI study using a *conditional* version of the stop signal task, significant activation of a right-hemispheric "braking" network of STN, IFC, and pre-SMA was described, in association with both reactive inhibition in response to a stop signal on "critical" trials and conflict-induced slowing on "non-critical" trials when the stop signal was presented but had to be ignored (Aron et al., 2007).

**Figure 4 (caption fragment): … between the right internal segment of the globus pallidus (GPi) and the (A) inferior frontal cortex (IFC), (B) anterior cingulate cortex (ACC), (C) posterior cingulate cortex (PCC) and positive coupling between the right …**

Several studies have examined the effect of STN DBS in PD on performance of the standard version of the stop signal task and the SSRT, the measure of inhibition usually derived by applying the horse race model and using a staircase tracking procedure and integration method by subtracting the average stop signal delay (interval between go and stop signals) from the mean go RTs. van den Wildenberg et al. (2006), Swann et al. (2011) and Mirabella et al. (2012) reported that SSRTs were significantly shorter with stimulation than with STN DBS off, suggesting that STN DBS *improves* inhibitory control on the stop signal task. In contrast, Ray et al. (2009) found that when PD patients were equated with healthy controls for baseline SSRTs, STN DBS was associated with longer SSRTs relative to DBS off. While reactive inhibition (e.g., stopping a car when a traffic light turns red) is required in some real life situations, proactive inhibition, the ability to act with restraint (e.g., not to eat a piece of cake when dieting) and to inhibit impulsive responding in situations of conflict to allow time for reflection, is also relevant to daily life. Using the conditional version of the stop signal task, Obeso et al. (2013) examined the impact of STN DBS in PD on proactive inhibition and conflict-induced slowing as well as the SSRT measure of reactive inhibition. They found that SSRTs were significantly prolonged with STN stimulation relative to DBS off, whereas relative to healthy controls proactive action restraint was significantly lower with DBS off but not DBS on. While the mean measure of conflict-induced slowing was not altered by STN DBS, stimulation produced a significant differential effect on the slowest and fastest RTs on conflict trials, further prolonging the slowest RTs on the conflict trials relative to DBS off and to controls. These results indicate that STN DBS produces differential effects on reactive and proactive inhibition and on conflict resolution.
These differential effects of STN stimulation on the various measures of inhibitory and executive control may be mediated through the hyperdirect, indirect and direct pathways, consistent with imaging evidence that while the hyperdirect pathway via the STN is crucial for temporary "hold your horses" braking or global reactive inhibition, proactive and selective inhibitory control may be mediated via the striatum (Jahfari et al., 2010, 2011; Zandbelt and Vink, 2010; Aron, 2011).
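The SSRT derivation described above (mean go RT minus average stop signal delay) can be sketched in a few lines. The function name and the sample values are hypothetical and are not taken from any of the cited studies. The subtraction is usually taken to be meaningful only when the staircase has converged on stopping for roughly half of the stop trials, which is assumed here rather than checked.

```python
from statistics import mean


def ssrt_mean_method(go_rts_ms, stop_signal_delays_ms):
    """Estimate SSRT as mean go RT minus mean stop signal delay (both in ms)."""
    return mean(go_rts_ms) - mean(stop_signal_delays_ms)


# Hypothetical single-participant data, for illustration only.
go_rts = [520, 480, 510, 490, 500]   # go-trial RTs; mean is 500 ms
ssds = [250, 300, 250, 200, 250]     # staircase-tracked delays; mean is 250 ms

print(ssrt_mean_method(go_rts, ssds))
```

With these values the estimate is 250 ms. A shorter SSRT is read as faster reactive inhibition, which is the sense in which the studies above describe stimulation as improving or impairing stopping.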

A number of factors may contribute to these inconsistent results of STN DBS on SSRTs across studies. First, the specific type of stop signal RT task used, the nature of stimuli and responses, crucial timing features and the proportion of go and stop trials that would have influenced the prepotency of the response and hence the difficulty of stopping, varied across studies. Second, the results of Ray et al. (2009) indicate that baseline SSRTs relative to controls may be important in determining the direction of the effects of STN stimulation, with those having similar baseline (DBS off) SSRTs as controls showing prolongation of SSRT with stimulation, whereas STN stimulation speeded up SSRTs for patients with slower baseline SSRTs relative to controls. Third, the precise effects of STN DBS on SSRT are likely to depend on the exact location of the active contacts used for stimulation. It has been shown that active contacts which are in the dorsal vs. ventral parts of the STN produce distinct effects on inhibitory processing on a go no go RT task (Hershey et al., 2010), with stimulation of contacts in the ventral STN inducing a greater inhibitory deficit. However, with the stop signal RT task, Greenhouse et al. (2011) did not find any differences in SSRT with stimulation of the most ventral vs. the most dorsal contacts in PD patients. Future studies relating the exact location of contacts in the STN to effects on SSRT can clarify this issue. 
Fourth, procedural variations such as whether DBS was unilateral (Ray et al., 2009) or bilateral (van den Wildenberg et al., 2006; Swann et al., 2011; Mirabella et al., 2012), whether stop signal task performance was unimanual (present study; Ray et al., 2009; Mirabella et al., 2012) or bimanual (van den Wildenberg et al., 2006; Swann et al., 2011), the type of movement performed (reaching, Mirabella et al., 2012 or manual keypress, van den Wildenberg et al., 2006; Ray et al., 2009; Obeso et al., 2013), whether patients were assessed on (van den Wildenberg et al., 2006; Ray et al., 2009; Swann et al., 2011; Obeso et al., 2013) or off (Mirabella et al., 2012) medication are important methodological differences across studies that would have influenced the results.

Event-related potentials from surface EEG have been recorded during performance of the stop signal task with STN DBS on and off in one study of PD patients (Swann et al., 2011), while two other studies have recorded LFPs from electrodes implanted in the STN in PD patients during performance of the stop signal task (Ray et al., 2012; Alegre et al., 2013). Swann et al. (2011) examined stop signal RTs in 13 PD patients with bilateral DBS of the STN assessed on medication and 14 healthy controls and recorded 64 channels of scalp EEG during performance of the task. The only measure that was significantly altered by DBS of the STN was SSRT, which was significantly improved/faster with DBS on relative to DBS off. They found increased beta band power (considered to be "anti-kinetic") over the right frontal cortex around the time of stopping with STN DBS on relative to off stimulation. Furthermore, increased beta band activity over the right frontal cortex was noted on successful compared to unsuccessful stop trials. It was concluded that STN DBS alters the fidelity of information transmission in the subthalamic-cortical pathways which influences inhibitory control over action.

LFPs from the STN during a stop signal task with an auditory stop signal presented on 25% of trials were recorded by Ray et al. (2012) in 9 PD patients assessed on medication. Presentation of the stop signal was associated with an increase in beta activity or beta event-related synchronization (ERS), but beta ERS after the stop signal was not different for failed vs. successful inhibition trials. However, once the influence of stop signal delays was removed, the time point at which beta synchrony increased during successfully inhibited trials correlated with SSRT, suggesting that those with quicker onset of beta synchrony following the stop signal had shorter SSRTs. Gamma ERS was noted following go signals and was also evoked by stop signals. Gamma ERS was highest for failed stop trials and less for successfully inhibited trials, similar to the results of Alegre et al. (2013) discussed next. In 10 PD patients Alegre et al. (2013) recorded LFPs from the STN during performance of a stop signal task (50% stop trials) both on and off dopaminergic medication. Response preparation was associated with a decrease in beta power (12–30 Hz) and cortico-subthalamic coherence in the beta band, which was smaller and shorter when the response was successfully inhibited. In the theta band there was an increase in frontal-cortico-subthalamic coherence related to the presence of the stop signal which was higher when the response inhibition was unsuccessful, perhaps reflecting the conflict between performance and inhibition of an action. A differential pattern of gamma activity was seen on medication (see **Figure 6**). Performance of the response was associated with a significant increase in power (55–75 Hz) and cortico-subthalamic coherence, whereas successful inhibition of the response was associated with bilateral decrease in subthalamic power and cortico-subthalamic coherence.
Importantly, the inhibition related decrease in gamma activity was absent in the four patients with dopamine agonist related impulse control disorders (ICDs). These results were interpreted as supporting involvement of STN in response inhibition in the stop signal task and suggesting that this may be mediated by a reduction of gamma oscillations in the cortico-subthalamic connection which may reflect the suppression of the intention to move.

On a modified version of the stop signal task, following excitotoxic lesions of the STN, rats showed a failure to activate the stop process, which resulted in a larger number of errors. In contrast, the SSRT, the main measure of inhibition on this task, was not affected by such STN lesions (Eagle et al., 2008). STN lesions also speeded up responses on go trials and reduced the accuracy of stopping for all SSDs, suggesting a more generalized stopping impairment (Eagle et al., 2008). Of great interest is the recent study by Schmidt et al. (2013) who recorded neuronal activity from multiple basal ganglia structures in rats during a modified version of the stop signal task. Striatal neurons became active on presentation of go cues but not stop cues. STN neurons had low latency responses to stop cues, on both stop success and stop failure trials, suggesting that the STN provides fast signals to stop action, whether this is successful or not. SNr neurons responded to stop cues on trials on which the response was successfully inhibited. Based on these results and simulations, it was suggested that the results support the race model, particularly the interactive race model of the go and stop processes. The outcome of the race between the stop and go processes is determined by the relative timing of the distinct inputs to SNr neurons from the striatum and STN. If the GABAergic signal from the striatum arrives first and wins the race then SNr pauses firing and a response is made (stop failure trial). In contrast, if the glutamatergic excitatory signal from the STN wins the race, then the SNr increases firing and the response is withheld (stop success trial). It was further proposed that the STN may be part of a broader "interrupt" system mediated by the centromedian and parafascicular thalamic nuclei and/or pedunculopontine nucleus that coordinates responses to salient cues across multiple timescales and pathways.
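The timing rule described above, in which whichever input reaches the SNr first decides the trial, can be illustrated with a toy simulation. This is not the simulation used by Schmidt et al. (2013). The arrival-time distributions, their parameters and the function names are all assumptions chosen only to show how later stop signals make stop failures more likely.

```python
import random


def race_outcome(striatal_arrival_ms, stn_arrival_ms):
    """Classify a stop trial by which input reaches SNr first (ties count as success)."""
    if striatal_arrival_ms < stn_arrival_ms:
        return "stop failure"   # striatal input wins: SNr pauses, response is made
    return "stop success"       # STN input wins: SNr firing rises, response is withheld


def stop_failure_rate(stop_signal_delay_ms, n_trials=2000, seed=0):
    """Fraction of simulated stop trials lost by the stop process."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_trials):
        # Hypothetical latencies relative to the go cue (ms); not fitted to data.
        striatal = rng.gauss(300.0, 50.0)
        stn = stop_signal_delay_ms + rng.gauss(50.0, 10.0)
        if race_outcome(striatal, stn) == "stop failure":
            failures += 1
    return failures / n_trials


early = stop_failure_rate(100)   # stop cue soon after the go cue
late = stop_failure_rate(300)    # stop cue long after the go cue
print(f"failure rate, early stop cue: {early:.2f}; late stop cue: {late:.2f}")
```

Under these assumed latencies the stop process rarely loses when the stop cue follows the go cue closely, and loses on most trials when the cue comes late.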

#### **GO NO GO RTs**

While stop signal RTs measure volitional inhibition and cancellation of a prepared action, go no go RTs are a measure of withholding a response or action restraint. Many variations of the go no go RT task exist. Essentially, across trials patients are instructed to respond to most stimuli presented but to withhold their response to a specific stimulus. For example, press a button when a green square (go stimulus) is presented, but do not respond when a red square appears (no go stimulus). The proportion of go to no go stimuli in a block determines the degree of motor preparation, and withholding responses becomes more difficult when motor readiness is increased with a higher percent of go trials in a block.

A number of studies have examined the influence of STN DBS in PD on go no go RTs, using a variety of approaches: behavioral, imaging and recording of LFPs. Using go no go RT tasks with 83 or 50% go trials, Hershey et al. (2004) provided evidence that STN DBS in PD patients assessed off medication selectively interfered with action restraint during the blocks with a higher percent of go trials, when the response was more prepotent; as patients had significantly higher commission errors and lower discriminability index with DBS on only during these trials. These results suggest that STN stimulation interferes with action restraint under conditions of high demand on executive control. Subsequently, Kuhn et al. (2004) recorded LFPs from the STN in 8 PD patients performing a go no go RT task (20% no go trials). As expected, beta band activity decreased prior to movement on go trials and was followed by a late post-movement increase in beta power. In contrast, on no go trials, the beta power drop following presentation of the imperative signal was prematurely terminated and reversed into beta power increase. When go trials were subtracted from no go trials, the difference was evident as beta power increase. These results suggest that changes in LFP activity in the STN in the beta band may be important for determining whether movement is initiated or withheld.

In a second behavioral study with the go no go task (83% go), Hershey et al. (2010) assessed the effects of unilateral stimulation through electrodes contralateral to the worst side of the body to compare the effects of stimulation through contacts which were in the dorsal vs. ventral sections of the STN on performance of 10 PD patients tested off medication. As in their previous study, patients responded to all letter stimuli but withheld a response when the number 5 was presented on 17% of the 150 trials. The same stimulation parameters were used for all dorsal and ventral contacts. Compared to a no stimulation condition, both dorsal and ventral STN stimulation resulted in significant reduction of UPDRS scores and improvement of motor symptoms of PD. While the go RTs did not differ for ventral vs. dorsal stimulation, the discriminability index, which is based on the proportion of hits minus the proportion of false alarms, was significantly lower with DBS through the ventral than the dorsal contacts. Only ventral stimulation decreased hits and increased false alarms on the go no go task. These results were interpreted as indicating that the ventral part of the STN is involved in the balance between selection and inhibition of prepared responses.
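The discriminability index mentioned above is described as the proportion of hits minus the proportion of false alarms. A minimal sketch of that description follows. The exact formula used by Hershey et al. (2010) may include further corrections, and the function name and trial counts are hypothetical.

```python
def discriminability_index(hits, go_trials, false_alarms, nogo_trials):
    """Proportion of hits on go trials minus proportion of false alarms on no go trials."""
    hit_rate = hits / go_trials
    false_alarm_rate = false_alarms / nogo_trials
    return hit_rate - false_alarm_rate


# Hypothetical counts for two stimulation conditions, for illustration only.
dorsal = discriminability_index(hits=95, go_trials=100, false_alarms=5, nogo_trials=20)
ventral = discriminability_index(hits=90, go_trials=100, false_alarms=8, nogo_trials=20)

print(f"dorsal: {dorsal:.2f}, ventral: {ventral:.2f}")
```

Fewer hits and more false alarms both lower the index, which matches the pattern the text reports for ventral stimulation.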

Using PET, Ballanger working with the Toronto group (Ballanger et al., 2009) examined the neural correlates of inhibitory control with STN DBS in PD to compare two different but not mutually exclusive frameworks of phasic "hold your horses" reactive inhibition in situations of conflict and more tonic "proactive inhibition" relating to situations of uncertainty, which make different predictions in terms of the brain areas involved and the time course of inhibition. According to the proactive inhibition model, a simple RT task which does not entail any conflict nevertheless engages proactive inhibitory control to withhold the response in advance of presentation of the go stimulus which signals release of inhibitory control over the response. They used an STN DBS on vs. off methodology to investigate STN modulation of go no go RTs (Go: white circle, No Go: white X, 40% no go) and simple RT (Go: white circle) in 7 PD patients tested off medication. As expected, STN DBS improved the motor symptoms of PD and speeded up RTs for the simple RT block as well as blocks with a mixture of Go and No Go trials. But at the same time STN stimulation increased errors of commission on the no go trials, indicating that patients were less able to inhibit the prepotent response. STN stimulation was associated with a significant increase in activation in subgenual ACC (BA 24/32) and decreased activation in the medial posterior cingulate cortex (BA 29/30), pre-SMA (BA 6), dorsal ACC (BA 24/32) and left primary motor cortex (BA 4), inferior parietal lobe (BA 40), dorsal premotor cortex (BA 6) and right ventral premotor cortex (BA 6) and IFC (BA 44) (see **Figure 7**). The main effect of Task and the Task x Stimulation interaction were not significant, suggesting the STN-induced changes in brain activation held for both the simple and go no go RTs, respectively, involving predominantly proactive and reactive inhibitory control. The number of commission errors was significantly associated with activation in the precuneus. The increased activation in the subgenual ACC, an area associated with impulsive behavior in bipolar disorder (e.g., Swann et al., 2003), was considered to reflect increased motivational drive induced by STN DBS in PD. It was concluded that the results supported STN stimulation having an effect on brain areas mediating both reactive (ACC, pre-SMA, premotor cortex, IFC) and proactive (posterior cingulate, precuneus, inferior parietal cortex) inhibitory processes.

Figure 7 (caption fragment): … simple reaction time (GO) tasks **(B)** commission errors (CE) in the go no go task with subthalamic nucleus deep brain stimulation on or off **(C)** areas showing decreased and increased activation with … Ballanger et al. (2009). ACC, anterior cingulate cortex; PCC, posterior cingulate cortex; R, right; L, left; ∗ denotes significant differences (see text).

In a subsequent behavioral study by some of the same group, Favre et al. (2013) formulated bradykinesia in PD in terms of a deficit in release of proactive inhibitory control. They used warned (150, 350, 550 ms warning-go signal interval) and unwarned simple RT tasks in a mixed block or a pure block of unwarned RTs. They reported that relative to controls, the PD patients were impaired in releasing proactive inhibition when this was internally driven in an unwarned simple RT condition which was considered responsible for their slowness in movement initiation. While dopaminergic medication generally improved RTs, medication status did not influence the internal control of the proactive inhibition, whereas DBS of the STN restored voluntary release of the proactive inhibition. The results of Favre et al. (2013) together with the findings of Obeso et al. (2013), suggest that STN DBS in PD also influences proactive inhibition. As proactive inhibitory control is more relevant to self-control in daily life, this may be more pertinent to understanding development of psychiatric problems such as emergence of de novo ICDs following STN surgery (Smeding et al., 2007; Halbig et al., 2009; Lim et al., 2009). However, paradoxically, both Favre et al. (2013) and Obeso et al. (2013) suggest that proactive inhibitory control is better with STN stimulation than with DBS off.

#### **DECISION MAKING AND CONFLICT RESOLUTION**

Normally, when making decisions in situations of conflict, individuals take time to reflect and weigh up the available options and the desirability of their likely outcomes, which slows down their responses before a decision is made. To test the hypothesis outlined above (Frank, 2006; Frank et al., 2007) that in situations of conflict, the STN issues a global "no go" signal to temporarily brake responding and prevent impulsive action, to allow time for information accumulation and reflection before a choice is made, Frank et al. (2007) and Cavanagh et al. (2011) have completed a number of studies on patients with PD with bilateral STN DBS. The same task, which involved probabilistic decision-making, was used in each. When faced with high conflict stimuli with similarly high (e.g., 80 vs. 70%) probability of reward, elderly controls and unoperated PD patients on and off medication and the operated patients tested with DBS off slowed their response and had longer RTs relative to the low conflict condition with stimulus pairs with more discriminable reward values. In contrast, the patients with STN DBS on had faster RTs in the high than low conflict situations and so acted impulsively, particularly when the stimuli were associated with a high probability of reward.

A later study from the same group using the same probabilistic decision making task, incorporated recording of scalp EEG with an STN DBS on vs. off methodology (*N* = 17) and intraoperative recording from a stimulating electrode inserted in the STN (*N* = 8). Cavanagh et al. (2011) showed that ordinarily the increase in theta band activity (4–8 Hz) over the medial prefrontal cortex is associated with raising the decision threshold for high conflict trials but not low conflict trials and that STN DBS in PD reversed this relationship such that increased theta band activity was associated with a decrease of the decision threshold on high conflict trials. It was proposed that the medial prefrontal cortex and the STN communicate in low frequency bands to represent decision conflict and that STN DBS interferes with the normal ability of the STN to react to decision conflict by modulating the decision threshold.
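The link between a decision threshold and response speed can be illustrated with a generic evidence accumulation sketch. This is an assumption made for illustration, not the model fitted in the cited studies. The parameter values, function names and the use of a simple two-boundary random walk are all hypothetical.

```python
import random


def simulate_decision_time(threshold, rng, drift=1.0, noise=1.0, dt=0.01):
    """Accumulate noisy evidence until it crosses +threshold or -threshold.

    Returns the elapsed time in seconds (arbitrary illustrative units).
    """
    evidence, elapsed = 0.0, 0.0
    while abs(evidence) < threshold:
        evidence += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
    return elapsed


def mean_decision_time(threshold, n_trials=500, seed=0):
    """Average decision time over simulated trials for a given threshold."""
    rng = random.Random(seed)
    total = sum(simulate_decision_time(threshold, rng) for _ in range(n_trials))
    return total / n_trials


low = mean_decision_time(threshold=1.0)    # lowered threshold: faster, more impulsive
high = mean_decision_time(threshold=2.0)   # raised threshold: slower, more cautious
print(f"low threshold: {low:.2f} s, high threshold: {high:.2f} s")
```

In this sketch a higher threshold means more evidence must accumulate before a choice is made, so decisions take longer. That is the sense in which raising the threshold on high conflict trials slows responding, and lowering it yields faster, more impulsive choices.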

Further evidence for the Frank model and the STN role in decision-making under conflict has been provided by other groups. Fumagalli et al. (2011) recorded LFPs from the STN in 16 PD patients during processing of moral conflictual (e.g., "abortion is murder"), moral non-conflictual (e.g., "all men have a right to live") and neutral (e.g., "a piano has black and white keys") sentences. RTs were significantly longer for moral conflictual than moral non-conflictual statements. Relative to baseline, there was a significant increase in low frequency power in the 5–13 Hz band in all conditions, which was significantly higher for the moral conflictual sentences. Intraoperative microelectrode recordings of single unit activity from the STN during DBS surgery in 14 PD patients off medication performing a probabilistic decision-making task showed increased spiking activity in the STN when the patients engaged in a decision. Importantly, the level of STN spiking activity increased with the level of decision conflict and seemed to be related to choice difficulty and accuracy rather than reward association (Zaghloul et al., 2012). Coulthard et al. (2012) employed a task involving learning and probabilistic decision-making under conflict and examined the effect of STN DBS in 11 PD patients. They found that while STN stimulation did not affect the learning phase, it influenced the information integration phase, when participants needed to update their decision on the basis of previously presented pieces of information: with STN DBS on, patients failed to slow down to revise their plan, reflecting impulsivity.

To date, only one study has directly related STN activity during decision-making to presence of ICDs in PD. Rosa et al. (2013) used an economic decision-making task with conflictual and nonconflictual trials and recorded LFPs from the STN in PD patients, 8 of whom had pathological gambling. Based on an index of risk, 3 types of behavioral strategy (risky, random and nonrisky) were distinguished, which were, respectively, employed by 6 patients with pathological gambling, 5 patients (3 with pathological gambling and 2 without), and 6 patients without pathological gambling. The subgroup with pathological gambling who adopted a risky strategy had a significantly higher change in low frequency power in the STN when evaluating conflictual vs. nonconflictual stimulus pairs. Such a difference was not observed for the patients without pathological gambling who adopted a nonrisky strategy. These results directly relate low frequency STN activity to adoption of risky strategies in patients with PD and pathological gambling. Like Rosa et al. (2013), Rodriguez-Oroz et al. (2011) also merged the two strands of research on impulsivity in PD, focused respectively on ICDs and on STN DBS induced inhibitory deficits. They recorded LFPs from the STN and reported that on medication, 10 PD patients who had ICDs before surgery had theta-alpha (4–10 Hz) activity generated 2–8 mm below the intercommissural line from the ventral contacts and cortico-subthalamic coherence in the 4–7.5 Hz range in scalp electrodes over the prefrontal cortex.

Another task that requires perceptual decision-making is the "moving dots" task, which consists of a cloud of dots, a proportion of which move coherently in one direction, whereas the rest move randomly. The participant has to decide whether the dots are moving to the right or left. The task has been shown to involve modulation of response thresholds under speed and accuracy instructions, with response caution being associated with increased activation of the pre-SMA and the putamen (Forstmann et al., 2008). The moving dots task was used by Green et al. (2013) to investigate the impact of STN DBS in PD on modulation of response thresholds and speed-accuracy trade-offs under speed vs. accuracy instructions, with six levels of task difficulty achieved by altering the degree of coherence of the moving dots; the low coherence conditions were considered equivalent to situations of high conflict. They found that with STN stimulation patients were faster but less accurate compared to DBS off, and STN DBS altered the degree to which patients were able to adjust their decision thresholds as a function of task difficulty/conflict. Similar changes in RTs and accuracy using this task have been obtained in our laboratory (Pote et al., in preparation) in 12 PD patients with STN DBS. Furthermore, application of the drift diffusion model showed that stimulation of the STN was associated with lowering of response thresholds under speed instructions only, such that the patients had differentially faster RTs but were less accurate under speed instructions compared to DBS off. While non-decision time was higher with DBS off than on, STN DBS did not influence drift rate, which for the PD patients was almost half that observed for the healthy controls. With somewhat different decision-making tasks, the results of Cavanagh et al. (2011), Green et al. (2013), and Pote et al. (in preparation) concur that STN DBS in PD lowers response thresholds during decision-making tasks, which accounts for the fast and error-prone reactions of the patients with the stimulators switched on when faced with decision conflict, task difficulty or time pressure.
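The link between a lowered response threshold and faster but less accurate responding can be illustrated with a minimal simulation of a drift diffusion process. This is only a sketch under stated assumptions: the function name `simulate_ddm` and all parameter values (drift rate, thresholds, non-decision time, noise, time step) are hypothetical and chosen for illustration, and they are not the values fitted in any of the studies cited above.

```python
import math
import random


def simulate_ddm(drift, threshold, non_decision=0.3, noise=1.0,
                 dt=0.005, n_trials=500, seed=0):
    """Simulate a symmetric drift diffusion process.

    Evidence starts at 0 and accumulates until it reaches +threshold/2
    (correct response) or -threshold/2 (error). All values are illustrative.
    Returns (mean RT in seconds, proportion correct).
    """
    rng = random.Random(seed)
    half = threshold / 2.0
    step_sd = noise * math.sqrt(dt)  # SD of the noise added per time step
    total_rt = 0.0
    n_correct = 0
    for _ in range(n_trials):
        evidence, elapsed = 0.0, 0.0
        while abs(evidence) < half:
            evidence += drift * dt + step_sd * rng.gauss(0.0, 1.0)
            elapsed += dt
        total_rt += elapsed + non_decision
        n_correct += int(evidence >= half)
    return total_rt / n_trials, n_correct / n_trials


# Same drift rate, two hypothetical threshold settings.
high = simulate_ddm(drift=1.0, threshold=2.0)  # cautious responding
low = simulate_ddm(drift=1.0, threshold=1.0)   # lowered threshold
print("high threshold: mean RT %.3f s, accuracy %.3f" % high)
print("low threshold:  mean RT %.3f s, accuracy %.3f" % low)
```

With the drift rate held constant, lowering only the threshold shortens mean RT and reduces accuracy in this toy simulation, which is the qualitative pattern described for STN DBS on vs. off.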

#### **OTHER TASKS INVOLVING INHIBITORY CONTROL OR RESPONSE SELECTION UNDER CONFLICT**

The impact of STN DBS in PD on a number of other tasks that require inhibition of prepotent responses has been examined. The anti-saccade task requires volitional suppression of an innately prepotent response toward a peripherally presented target in order to make a saccade in the opposite direction (Isoda and Hikosaka, 2011). In 32 PD patients with STN DBS tested on medication, Yugeta et al. (2010) found that STN DBS *improved* the ability to suppress unwanted saccades to the cue stimulus in a memory-guided saccade task, but in the anti-saccade task prosaccades were not suppressed. The lack of effect of STN DBS on the anti-saccade task is unexpected, and the results with the memory-guided saccades suggest that STN stimulation in PD improves inhibitory control over reflexive saccades, similar to the findings with manual responses on the stop signal task in some studies (van den Wildenberg et al., 2006; Swann et al., 2011; Mirabella et al., 2012).

STN DBS in PD produced some interesting results on the Simon task. In this task, an irrelevant stimulus dimension (e.g., the side of the screen on which the stimulus is presented) can trigger a strong prepotent response that can interfere with selection and execution of the correct response. Wylie et al. (2010) presented blue or green circles to the left or right of a central fixation point to which patients had to respond by pressing right or left buttons with the right or left thumbs, respectively. While the side of the presentation of the stimulus is irrelevant, RTs to stimuli presented in the left visual field, for example, are faster with the left than with the right hand. The irrelevant stimulus dimension triggers a response with the hand corresponding to the side of stimulus presentation, which needs to be inhibited so that the correct response can be selected and executed. The interference with performance due to this response conflict is called the Simon effect. RTs were on average 61 ms faster but accuracy on conflict trials was reduced with STN DBS on than off. By investigating the entire RT distribution, they found that in the fastest part of the RT distribution, STN DBS increased the number of fast premature response captures by the irrelevant stimulus feature, i.e., increased errors relative to DBS off. STN DBS also significantly reduced the magnitude of the Simon effect for the slowest incongruent responses; that is, it improved the efficiency of inhibition of the incongruent responses for the slowest part of the RT distribution in the correct trials. These results suggested two temporally dissociable effects of STN stimulation in PD: an early increased automatic response capture by the irrelevant stimulus dimension reflecting impulsivity but a later improved interference control of the slowest responses, which the authors speculated may, respectively, reflect greater responsiveness of the STN with stimulation to inputs from the pre-SMA and IFC. As the STN-induced change in accuracy was limited to the conflict trials and accuracy on nonconflict trials was not altered, it was proposed that this selective effect argues against STN DBS producing a global shift in SATs.
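One common way to examine how the Simon effect changes across the RT distribution is to sort correct RTs into bins from fastest to slowest and compute the incongruent minus congruent difference within each bin (often called a delta plot). The sketch below illustrates that idea only; the helper names `bin_means` and `simon_effect_by_bin` and the RT values are hypothetical, and it is an assumption that this matches the exact distributional procedure used by Wylie et al. (2010).

```python
from statistics import mean


def bin_means(rts, n_bins):
    """Sort RTs and return the mean of each of n_bins equal-sized bins.

    Any trials left over after equal division are dropped for simplicity.
    """
    ordered = sorted(rts)
    size = len(ordered) // n_bins
    return [mean(ordered[i * size:(i + 1) * size]) for i in range(n_bins)]


def simon_effect_by_bin(congruent, incongruent, n_bins=4):
    """Return the incongruent minus congruent mean RT for each bin,
    ordered from the fastest to the slowest part of the distribution."""
    con = bin_means(congruent, n_bins)
    inc = bin_means(incongruent, n_bins)
    return [i - c for c, i in zip(con, inc)]


# Made-up correct-trial RTs in milliseconds, for illustration only.
congruent_rts = [400, 420, 440, 460, 480, 500, 520, 540]
incongruent_rts = [450, 470, 490, 500, 510, 520, 530, 545]

effects = simon_effect_by_bin(congruent_rts, incongruent_rts, n_bins=4)
print(effects)
```

In these made-up data the difference shrinks from the fastest to the slowest bin, which is the kind of pattern interpreted as more effective interference control for slow responses.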

In the latest study, Zavala et al. (2013) recorded LFPs from the STN while PD patients performed an Eriksen flanker task. They reported that correct fast incongruent trials, like congruent trials, had cue-locked STN theta band activity which showed phase alignment across trials followed by a peri-response increase in theta power, suggesting that the distractor flankers were successfully ignored. In contrast, correct incongruent trials with longer RTs had a relative reduction in theta phase alignment followed by higher theta power. It was concluded that the STN is involved in processing of congruent and incongruent responses and response selection under conflict and that STN LFPs reflect conflict-related changes.

#### **SUMMARY AND CONCLUSIONS**

The studies which have investigated the effect of STN DBS on tasks requiring inhibition of prepotent responses are summarized in **Table 1**. As reviewed above, the majority suggest that STN DBS impairs inhibitory control over prepotent responses or in situations of decision conflict. The most inconsistent results are for the stop signal and go no go RT tasks. For the stop signal task, some studies suggest that inhibitory control on this task as measured by the SSRT is improved (van den Wildenberg et al., 2006; Swann et al., 2011; Mirabella et al., 2012), while others found prolongation/worsening of SSRTs (Ray et al., 2009; Obeso et al., 2013) with STN stimulation. As discussed above, there are many methodological factors that could have contributed to these divergent results. For go no go RTs, while no effects of STN DBS were found by van den Wildenberg et al. (2006), Hershey et al. (2004, 2010) and Ballanger et al. (2009) found that STN DBS induced inhibitory deficits reflected in increased commission errors. Furthermore, the work of Hershey and colleagues suggests that such STN DBS induced deficits in action restraint on the go no go task were only present when the response was prepotent (83% target rate) but not when there were fewer go trials (50%), which reduced the prepotency of the response. This group's subsequent work also suggests that stimulation through the ventrally located contacts of the implanted electrodes is associated with reduced discriminability in the go no go task (Hershey et al., 2010). The precise location of the active electrode contact in the STN is an important consideration in determining the effects of STN DBS on executive control. With exceptions (Hershey et al., 2010), in the majority of studies deficits in inhibitory and executive control have been reported for STN stimulation through contacts that are effective in controlling the motor symptoms of PD, suggesting that the active contacts are located in or near the sensorimotor section of the STN. However, the influence of contact position in the STN on the observed effects on inhibitory and executive control needs to be directly examined in future studies.

**Table 1 | Studies which have investigated the effect of deep brain stimulation (DBS) of the subthalamic nucleus (STN) in Parkinson's disease on tasks involving inhibition of prepotent responses, response selection under conflict or decision-making under conflict.**

In the context of the majority of the studies summarized in **Table 1** demonstrating STN stimulation induced deficits in inhibitory control, it is surprising that STN DBS did not adversely affect the inhibition of pro-saccades in the anti-saccade task although it improved inhibition on the memory guided task, by reducing reflexive saccades to cue presentation. Similarly, negative results were reported by Torta et al. (2012) who examined risk taking behavior and delay aversion, both characteristics of impulsivity, on the Cambridge Gambling task. STN DBS in PD had no effect on delay aversion or risk taking on this task. Using the Iowa Gambling task, which involves decision making under uncertainty, Czernecki et al. (2005) and Oyama et al. (2011) also found that DBS surgery or acute manipulation of stimulation had no overall effect on decision making and risk taking and choice from advantageous vs. non-advantageous decks on this task. The only effect found was that in the Oyama et al. (2011) study performance was worse on the last block of trials with DBS on relative to DBS off. STN DBS-induced worsening of performance on the last block did not correlate with levodopa equivalent dose but was associated with depression scores and DBS through ventral contacts. van Wouwe et al. (2011) reported that STN DBS improved learning of stimulus-action-reward associations on a probabilistic task. From the results of these studies, it is clear that not all forms of impulsivity are detrimentally affected by STN DBS in PD. In light of the multi-faceted nature of impulsivity noted above, future studies need to examine the impact of STN DBS on other components such as the ability to delay gratification and performance on tasks such as the delay discounting task.

**Table 2** summarizes the studies that have recorded LFPs or single neuronal activity of the STN during tasks that involve inhibitory control of prepotent responses, most of which were also described above. A range of tasks, go no go RTs, moralistic or probabilistic or economic decision making under conflict, the stop signal RT task, the Stroop, RNG and the Eriksen flanker task were used across these studies. The results of all of these studies concur that activity in the STN itself is modulated in relation to presence of conflict or inhibition of prepotent responses and counteract alternative interpretations such as antidromic stimulation of the cortex or stimulation spread to other structures. Taken together, the results of these electrophysiological studies provide evidence for direct involvement of the STN in inhibitory and executive control and suggest that while theta band activity may reflect evaluative processes and presence of conflict, beta band activity signals preparation and motor readiness, whereas oscillation in the gamma band represents more discrete motor readiness or vigor and gating of action performance/cancellation (Cavanagh and Frank, 2013).

In the imaging studies, as STN DBS alters behavior as well as the pattern of brain activation, it is not clear whether this altered pattern is a cause or effect of the altered behavior. However, both are induced by manipulation of STN activity and output. Nevertheless, there is convergent evidence from behavioral and imaging studies of the effect of STN DBS in PD and LFP or intraoperative neuronal recordings from the STN in PD, confirming a role for the STN in inhibitory control over prepotent responses and response selection under conflict during a range of tasks.

*?, not specified; mPFC, medial prefrontal cortex; SSRTs, stop signal reaction times; RT, reaction time; RNG, random number generation.*

## **THEORETICAL IMPLICATIONS AND FUTURE DIRECTIONS**

The evidence reviewed supports both of the theoretical frameworks outlined above. The evidence supports the proposal (Frank, 2006; Frank et al., 2007) that alteration of STN activity with STN DBS interferes with the normal function of the STN to increase the response threshold depending on context or situation, to prevent premature and impulsive responses and to allow time for further information accumulation before a decision is made. Support is also provided based on the evidence reviewed above for speed-accuracy trade-off models which attributed a role to the STN in modulating response thresholds and influencing speed-accuracy trade-offs (Bogacz et al., 2010; Mansfield et al., 2011). As recently noted (Jahanshahi, 2013), what remains unclear is whether it is conflict *per se* (Cavanagh et al., 2011; Fumagalli et al., 2011; Zaghloul et al., 2012; Zavala et al., 2013), choice difficulty (Zaghloul et al., 2012; Green et al., 2013), choice accuracy (Zaghloul et al., 2012), the appetitive/aversive valence of the choices (Frank et al., 2007), information integration (Coulthard et al., 2012), adoption of a risk-taking strategy (Rosa et al., 2013) or simply time pressure (Pote et al., in preparation) that influences STN activity and engages it to dynamically modulate response thresholds. These possibilities need to be directly investigated and disentangled in future studies. Furthermore, it is possible that the STN involvement in modulating response thresholds and inhibitory and executive control operates across domains, motor, cognitive and limbic. Such cross-domain generality of the inhibitory role of STN also would be an interesting topic for investigation in future studies. To date, with exceptions (Favre et al., 2013; Obeso et al., 2013), the major focus of the literature on the impact of STN DBS on inhibitory control has been mainly on global reactive inhibition. 
Proactive and selective inhibition have greater parallels in daily life and, therefore, their investigation is more relevant to understanding the impact of STN DBS on the post-surgical behavior and functioning of PD patients and should be the focus of future study.

Successful performance on the Stroop interference task, fast-paced RNG, the Simon effect task, and the Eriksen flanker task necessitates suppression of habitual and automatic prepotent responses and controlled and strategic selection of alternative responses. The above evidence suggests that STN DBS interferes with this process and thus also supports the proposal that ordinarily the STN implements a switch signal from the frontal cortex to shift from automatic to controlled processing (Isoda and Hikosaka, 2008). In fact, an alternative interpretation of all the evidence showing STN modulated activity with decision or response conflict or task difficulty could be that the STN is signaling to the thalamus and the cortex to "bring more attentional resources" (Whitmer and White, 2012), which would also be consistent with a switch from automatic to controlled processing. The task-specific increase in LFP gamma band activity in the STN in relation to attention-demanding tasks such as paced RNG (Anzak et al., 2013) or verbal fluency (Anzak et al., 2011) may signify such demand for increased attentional resources. An unresolved question is whether STN DBS has detrimental effects only on higher order aspects of executive and inhibitory control or if relatively lower level modulation of response speed is at the heart of the observed deficits. There is some indication from the available evidence that the STN DBS induced deficits in executive and inhibitory control are observed only in conditions of high demand for cognitive control, for example on go no go RTs with high but not low target rates (response more prepotent in the former case) (Hershey et al., 2004), or when decision-making on win-win but not lose-lose high conflict trials (higher motivational salience in the former case) (Frank et al., 2007). This question needs to be addressed in future studies.

In their influential "paradox of surgery" paper, Marsden and Obeso (1994) posed the question of why disruption of basal ganglia output to the cortex with DBS or lesioning of the STN, GPi or thalamus does not impair movement or behavior. They suggested that "Loss of their output to premotor regions might not grossly impair routine movement; the remainder of the distributed system could cope adequately in ordinary circumstances. However, loss of this basal ganglia contribution might impair motor flexibility and adaptation." The evidence reviewed above and summarized in **Table 1** indicates that STN DBS surgery does impair aspects of non-routine behavior in PD and results in a deficit in inhibitory control over prepotent responses and executive control in situations that require response selection under conflict. These deficits resolve the "paradox of surgery."

## **CLINICAL IMPLICATIONS AND FUTURE DIRECTIONS**

A number of case studies have documented that STN DBS induces problems with inhibition of prepotent complex behaviors such as pathological laughter (Krack et al., 2001), pathological crying (Bejjani et al., 1999), pathological gambling (Smeding et al., 2007) or an architect's compulsion to draw female nude figures after surgery (Witt et al., 2006). Hypomania in the immediate post-operative phase has been documented (e.g., Herzog et al., 2003). Some of the other psychiatric complications induced by STN DBS in PD are debated. While STN DBS has been linked with post-surgical de novo emergence of ICDs such as pathological gambling or shopping in some samples (Smeding et al., 2007; Halbig et al., 2009; Lim et al., 2009), others have reported improvement of ICDs with STN DBS in PD (Ardouin et al., 2006; Lim et al., 2009). Similarly, while increased impulsivity and impaired executive control with STN DBS may also contribute to the increased risk of suicides documented in a retrospective study in a minority of cases following STN DBS surgery (Voon et al., 2008), recent prospective evidence has not found any such increased suicide risk in association with STN DBS in PD (Weintraub et al., 2013). Social disintegration, with breakdown of marriage and failure to resume work despite improvement of motor symptoms and function, has been documented following STN DBS in several centers (Houeto et al., 2006; Schüpbach et al., 2006). However, it remains unclear whether the deficits in executive control and heightened impulsivity documented above contribute to these psychiatric and social problems following STN DBS surgery in PD, as there is no direct evidence available linking the two, or whether other more complex psychological or social processes are responsible. To date, only the investigations by Rodriguez-Oroz et al. (2011) and Rosa et al. (2013) relate STN activity to ICDs in PD, albeit in patients who had these problems prior to surgery.
It is necessary to directly examine the association between deficits in inhibitory and executive control on experimental tasks and psychiatric and behavioral side-effects of STN DBS in PD in future studies. Furthermore, the clinical significance of the deficits in inhibitory and executive control induced by STN DBS reviewed above for the everyday cognitive functioning of the patients in their daily lives has not been examined to date and remains unknown and is clearly a topic for future investigation.

The improvement of the motor symptoms of PD associated with STN DBS often results in reduction of dopaminergic medication after surgery, which can cause apathy, a motivational deficit. Such alteration of motivational state, which can in turn change salience or conflict detection, is also relevant to the study of the effect of STN DBS on inhibitory and executive control, and associations of post-surgical apathy with STN stimulation induced deficits in inhibition of prepotent responses and response selection under conflict should be considered.

Furthermore, the key question that arises is whether the deficits in inhibitory and executive control documented above are present in varying degrees in all operated cases. If the answer to this question is "yes," then a further pertinent question is why are the sequelae of such deficits in executive control not particularly evident in everyday life? Is there some compensatory mechanism operational? If the answer to the first question is "no" and not every operated patient shows deficits in inhibitory and executive control, then a further clinically relevant question is what factors determine which patients develop deficits in inhibitory and executive control. The pre-operative levels of executive functioning, individual differences in executive control (Braver et al., 2010) and predisposition to impulsivity, and the precise location of the implanted electrodes in the STN are likely to be some of the pertinent factors.

STN DBS has been used in treatment of other patient groups, such as those with dystonia (e.g., Kleiner-Fisman et al., 2007) or obsessive compulsive disorder (OCD) (Mallet et al., 2008). In dystonia, reduced cortico-cortical inhibition has been documented (e.g., Edwards et al., 2003). In OCD, obsessions reflect loss of inhibitory control over thoughts that become recurrent, intrusive and distressing, and compulsions represent loss of inhibitory control over "safety" behaviors that become repetitive and are engaged in to reduce the anxiety associated with the inflated sense of perceived danger. In OCD both obsessions and compulsions are prepotent and although they are resisted by the patient, compulsions are nevertheless executed. Does STN DBS in dystonia or OCD produce similar or different effects on inhibitory and executive control as in PD? In OCD, it has been proposed that STN DBS may change rigidity to impulsivity/flexibility and that with such increased flexibility the patients are no longer bothered by their obsessions and are no longer driven to execute their compulsions (Krack et al., 2010). Patients with OCD have prolonged SSRTs on the stop signal RT task and show deficits on the Stroop task (Chamberlain et al., 2005), indicative of deficits in inhibitory and executive control. Evidence suggests that the STN is also overactive in OCD, but perhaps not to the same extent as in PD (Piallat et al., 2011), and neuronal recordings from the STN revealed associations with doubt and checking behavior during performance of a matching to sample task (Burbaud et al., 2013). It would be interesting to determine how performance on tasks entailing inhibitory and executive control is altered by change of STN overactivity by STN DBS in OCD.

Disruption of the STN activity with DBS could have other as yet unknown implications for the ability of PD patients to exert executive control in making adaptive decisions during conditions of high-conflict in daily life, causing them to revert to automatic/status quo responses even though these may be sub-optimal. Future research would be well placed to examine the potential link between the inhibitory and executive control impairments reviewed here and psychiatric outcomes and everyday cognitive functioning in the course of daily life following STN DBS which nevertheless is highly effective in controlling the motor symptoms of PD.

### **REFERENCES**


idiopathic dystonia: impact on severity, neuropsychological status, and quality of life. *J. Neurosurg.* 107, 29–36. doi: 10.3171/JNS-07/07/0029


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 October 2013; accepted: 06 December 2013; published online: 25 December 2013.*

*Citation: Jahanshahi M (2013) Effects of deep brain stimulation of the subthalamic nucleus on inhibitory and executive control over prepotent responses in Parkinson's disease. Front. Syst. Neurosci. 7:118. doi: 10.3389/fnsys.2013.00118*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Jahanshahi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **Ahmed A. Moustafa<sup>1,2</sup>\* and Michele Poletti<sup>3</sup>**

<sup>1</sup> Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA

<sup>2</sup> School of Social Sciences and Psychology and Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia

<sup>3</sup> Department of Mental Health and Pathological Addiction, AUSL of Reggio Emilia, Reggio Emilia, Italy

#### **Edited by:**

Alon Korngreen, Bar-Ilan University, Israel

#### **Reviewed by:**

Nicola B. Mercuri, University of Rome, Italy

Gabriella Santangelo, Second University of Naples, Italy

#### **\*Correspondence:**

Ahmed A. Moustafa, School of Social Sciences and Psychology and Marcs Institute for Brain and Behaviour, University of Western Sydney, Room no. 24.1.58, Locked Bag 1797, Sydney, NSW 2751, Australia e-mail: a.moustafa@uws.edu.au

Parkinson's disease (PD) is a neurological disorder associated with rigidity, bradykinesia, and resting tremor, among other motor symptoms. In addition, patients with PD also show cognitive and psychiatric dysfunction, including dementia, mild cognitive impairment (MCI), depression, and hallucinations, among others. Interestingly, the occurrence of these symptoms (motor, cognitive, and psychiatric) varies among individuals, such that one subgroup of PD patients might show some of the symptoms while another subgroup does not. This has prompted neurologists and scientists to subtype PD patients depending on the severity of symptoms they show. Neural studies have also mapped different motor, cognitive, and psychiatric symptoms in PD to different brain networks. In this review, we discuss the neural and behavioral substrates of the most common subtypes of PD patients, which are related to the occurrence of: (a) resting tremor (vs. non-tremor-dominant); (b) MCI; (c) dementia; (d) impulse control disorders (ICD); (e) depression; and/or (f) hallucinations. We end by discussing the relationship among PD subtypes, and the relationship among motor, cognitive, and psychiatric factors in PD.

**Keywords: Parkinson's disease, tremor, dementia, mild cognitive impairment, hallucinations, depression, impulse control disorders**

#### **INTRODUCTION**

Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor (bradykinesia, rigidity and resting tremor) and non-motor symptoms, such as cognitive impairment and autonomic, affective and behavioral disturbances (Jankovic, 2008). The complex clinical picture of motor and non-motor symptoms not only differs between PD patients, but also changes over the course of disease progression in each patient. The neurodegenerative nature of PD and its pharmacological management are involved in this clinical heterogeneity. PD is characterized by a progressive widespread diffusion of the Lewy body neuropathology from subcortical to cortical structures (Braak et al., 2003); therefore at different disease stages PD patients present different loads of Lewy body neuropathology and different involvements of subcortical and cortical structures. Moreover, drugs used to manage clinical symptoms, such as dopaminergic drugs (Bonuccelli and Pavese, 2006; Poewe et al., 2010), have motor and non-motor effects that change with disease progression (Poletti and Bonuccelli, 2013).

Classifications of PD into different subtypes have been proposed to reduce the heterogeneity of clinical features associated with PD, and thus better investigate their neural correlates and provide better treatment options. Two approaches are used to achieve these classifications: empirically assigned and data-driven (Marras and Lang, 2013). Empirically assigned classifications of specific clinical motor and non-motor symptoms in PD patients (e.g., rigidity, cognitive impairment, psychosis, impulse control disorder (ICD), autonomic dysfunction) compare samples of patients with vs. without the investigated clinical symptom; for example, on the basis of the predominant motor symptoms as indicated by the Unified Parkinson's Disease Rating Scale (UPDRS) motor subscores, PD patients were categorized into patients with predominant postural instability and gait difficulty and patients with predominant tremor (Jankovic et al., 1990). On the other hand, the data-driven approach searches for variables that group together each subtype without an a priori hypothesis; for example, the cluster analysis of Lewis et al. (2005) identified four subtypes of PD patients: (1) young onset; (2) nontremor dominant with cognitive impairment and depression; (3) rapid progression without cognitive impairment; and (4) tremor dominant.

The empirically assigned and the data-driven classification approaches may identify partially overlapping subtypes (van Rooden et al., 2010), as in the case of the motor subtypes proposed by Jankovic et al. (1990) and the clusters proposed by Lewis et al. (2005). These approaches have been applied to motor, cognitive and psychopathological features of PD patients, but they have not been applied to findings of neuropathological and neuroimaging assessments.

A recent comparison of the empirically assigned and data-driven classification approaches (Marras and Lang, 2013) underlined that the former has the advantages of a small number of subtypes, ease of implementation and easy assignment of patients to one or another subtype; in contrast, clusters derived from the data-driven approach are inherently more complicated, incorporating more variables that are often not regularly measured in clinical practice, which makes it harder to assign some patients to subtypes.

Furthermore, the empirically assigned classification is probably more informative on the pathophysiology of specific PD symptoms, but it could also hamper a global view on the complex clinical patterns of motor and non-motor symptoms that characterize PD patients. Since data-driven subtypes have been reviewed elsewhere (van Rooden et al., 2010), this review aims at presenting an overview of some of the main PD subtypes derived from the empirically assigned classification; we briefly present their neural and behavioral correlates to show how these subtypes may correlate with each other. For this purpose, in the next sections we describe the following subtypes: motor (postural instability and gait difficulty vs. tremor) and non-motor symptoms (mild cognitive impairment (MCI) vs. dementia; with/without ICD, depression, or hallucinations). Next, we briefly discuss existing data on how these subtypes correlate with each other.

## **COGNITIVE SUBTYPES IN PD: FROM MILD COGNITIVE IMPAIRMENT TO DEMENTIA**

From the early clinical stages, PD patients present an increased risk of cognitive impairment, with prevalences of MCI ranging from 14.8–18.9% in newly diagnosed drug-naïve patients (Aarsland et al., 2009, 2010; Poletti et al., 2012a,b) up to 50% at 5 years after clinical diagnosis (Broeders et al., 2013).

MCI may be diagnosed when a neuropsychological impairment is demonstrated by performances 1–2 standard deviations below appropriate norms in at least two tests of the same cognitive domain (MCI single domain; for example dysexecutive MCI when only the executive domain is impaired; amnestic MCI when only episodic memory is impaired) or in at least one test in two different cognitive domains (MCI multiple domains: for example executive and episodic memory domains or executive and visuospatial domains are impaired) and there is a preserved functional level in everyday activities, not considering difficulties related to the motor symptoms (Litvan et al., 2012).
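The decision rule described above can be sketched in code. The following is only an illustrative reading of the criteria as summarized in this section, not a validated diagnostic tool: the function name `classify_mci`, the domain labels, the use of z-scores, and the single cutoff of 1 standard deviation below the norm (the lower bound of the 1–2 standard deviation range) are all assumptions made for the example.

```python
from typing import Dict, List


def classify_mci(
    z_scores: Dict[str, List[float]],
    functionally_preserved: bool,
    cutoff: float = -1.0,
) -> str:
    """Illustrative MCI classification from per-domain test z-scores.

    z_scores maps a cognitive domain name (hypothetical labels) to the
    z-scores of the tests given in that domain. A test counts as impaired
    when its z-score is at or below `cutoff` (assumed here: -1.0 SD).
    """
    # Preserved everyday functioning is a precondition for MCI.
    if not functionally_preserved:
        return "criteria not met"

    # Count impaired tests per domain.
    impaired = {
        domain: sum(1 for z in scores if z <= cutoff)
        for domain, scores in z_scores.items()
    }
    impaired_domains = [d for d, n in impaired.items() if n >= 1]

    # At least one impaired test in two different domains.
    if len(impaired_domains) >= 2:
        return "MCI multiple domains"
    # At least two impaired tests within a single domain.
    if len(impaired_domains) == 1 and impaired[impaired_domains[0]] >= 2:
        return "MCI single domain"
    return "no MCI"


if __name__ == "__main__":
    example = {"executive": [-1.5, -1.2], "memory": [0.1, -0.3]}
    print(classify_mci(example, functionally_preserved=True))
```

In this sketch a patient with two impaired executive tests and intact memory is labeled as single-domain MCI (the dysexecutive case in the text). The sketch does not model the upper bound of the 1–2 standard deviation range or the exclusion of difficulties related to motor symptoms, both of which require clinical judgment.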

In the early clinical stage, cognitive impairment in PD is primarily characterized by deficits of executive functions, caused by loss of dopaminergic neurons in the nigrostriatal pathway (Kish et al., 1988), which results in the reduction of dopamine levels in the striatum. This dopamine reduction negatively impacts the functioning of the dorsolateral frontostriatal loop (linking dorsolateral prefrontal cortex, dorsolateral caudate nucleus of the striatum, dorsomedial globus pallidus and thalamus), which is mainly involved in executive functions such as working memory, planning and set-shifting (Alexander et al., 1986); dopaminergic drugs have beneficial effects on these functions in the early stages of disease by impacting striatal dopamine levels (Cools, 2006).

The presence of MCI from the early PD stages is associated not only with the frontostriatal deficit but also with an early involvement of parietal and occipital cortices (Pappata et al., 2011); this finding has been consistently reported in patients with MCI by structural neuroimaging studies, detecting atrophic changes in a number of cortical regions, including occipital, parietal, medial temporal and prefrontal cortices (Song et al., 2011; Weintraub et al., 2011a; Lee et al., 2012a) and cortical hypometabolism in frontostriatal loops and parietal and occipital regions (Nobili et al., 2011; Ekman et al., 2012; Garcia-Garcia et al., 2012; Nagano-Saito et al., 2013).

Advanced PD stages, usually presenting dementia in 75–80% of patients (Aarsland et al., 2003; Williams-Gray et al., 2013), are characterized by widespread cortical and subcortical atrophic changes (Burton et al., 2004; Nagano-Saito et al., 2005; Summerfield et al., 2005; Beyer et al., 2007; Weintraub et al., 2011b; Melzer et al., 2012) and more severe temporal, parietal and occipital hypometabolism in comparison with MCI patients (Garcia-Garcia et al., 2012).

In sum, the cognitive profile of PD is usually characterized by an early executive impairment, a sign of nigrostriatal degeneration, and subsequently by impairment of visuospatial functions, memory and/or language, a sign of cortical diffusion of Lewy body pathology, evidenced by cortical atrophy and hypometabolism: this second feature in particular, in comparison with the first one, is associated with an increased risk of developing MCI and subsequently dementia (Jellinger, 2013; Kehagia et al., 2013).

## **TREMOR VS. NON-TREMOR IN PD**

As discussed above, PD involves a spectrum of motor symptoms that include akinesia, bradykinesia, and resting tremor, among others. A few studies categorize patients into different subgroups, depending on the motor symptoms they present. This usually involves dividing the patients into tremor-dominant and non-tremor-dominant groups, with the latter comprising patients with predominant akinesia, bradykinesia, or postural instability and gait symptoms (Jankovic et al., 1990; Zaidel et al., 2009; Mure et al., 2011; Schillaci et al., 2011; Lee et al., 2012b; Wylie et al., 2012). Bradykinesia, postural instability and gait dysfunction are more common in patients with rapid disease progression than in PD patients with a slower progression rate (Jankovic et al., 1990).

Studies have generally shown that PD patients with tremor are usually less cognitively impaired than PD patients with akinesia or gait dysfunction (Burn et al., 2006; Lyros et al., 2008; Oh et al., 2009; Domellof et al., 2011). For example, Vakil and Herishanu-Naaman (1998) found that tremor-dominant patients are less impaired at procedural learning tasks than akinesia-dominant patients. Studies also showed that PD patients with tremor are less impaired than PD patients with other motor subtypes on perceptual tasks, including peripheral vision and visual processing speed (Seichepine et al., 2011). Interestingly, we also found that akinesia-dominant patients were more impaired than tremor-dominant patients on various working memory (Moustafa et al., 2013a) and learning (Moustafa et al., 2013b) measures. Prior studies have reported significant correlations between bradykinesia severity and cognitive measures in newly diagnosed PD patients (Domellof et al., 2011; Poletti et al., 2012b). For example, Domellof et al. (2011) found that bradykinesia scores correlate with Wisconsin Card Sorting Test, digit span, and Trail Making Test performance. Along the same lines, Poletti et al. (2012a,b) reported a correlation between bradykinesia and Trail Making Test performance as well as the number of categories achieved in the Modified Card Sorting Test.

This pattern of results also applies to psychiatric symptoms. One neuropsychological study found that, unlike tremor-dominant patients, PD patients with non-tremor symptoms show increased rates of depression, apathy, and hallucinations (Reijnders et al., 2009). Further, neuropsychological studies found that depression is more common in akinesia-dominant patients than in tremor-dominant patients (Starkstein et al., 1998). Recently, a neuropsychological study found that, unlike patients with dominant tremor symptoms, patients with postural instability and gait deficits show more impulsive behavior (Wylie et al., 2012). Clinical and neuropsychological studies also suggest that the severity of akinesia symptoms is a risk factor for the development of dementia and MCI in PD patients (Poletti et al., 2011; Poletti and Bonuccelli, 2013).

For more than two decades, it has been shown that patients with dominant akinesia show more neural damage than PD patients with dominant tremor (Paulus and Jellinger, 1991). Recent neuropsychological studies showed that non-tremor symptoms in PD, including postural instability and gait deficits, are associated with grey matter degeneration (Rosenberg-Katz et al., 2013). Other neuropsychological and animal studies suggest that akinesia and bradykinesia are arguably associated with dysfunction of the basal ganglia (and corticostriatal circuits), while tremor is perhaps associated with cerebellar, thalamic, and subthalamic nucleus abnormalities (Kassubek et al., 2002; Probst-Cousin et al., 2003; Weinberger et al., 2009; Zaidel et al., 2009; Mure et al., 2011). For example, Schillaci et al. (2011) found that PD patients with akinesia and rigidity as the predominant symptoms have significantly more widespread dopamine loss in the striatum than PD patients with tremor as the predominant symptom (also see Eggers et al., 2011). These results support a relationship between motor variables (including akinesia and bradykinesia) and cognitive performance in PD patients.

## **IMPULSE CONTROL DISORDERS IN PD**

Dopaminergic medications, especially some dopamine agonists, can trigger ICDs, such as hypersexuality, hobbyism, dopamine dysregulation syndrome, binge eating and pathological gambling, in a considerable subpopulation of PD patients (Dodd et al., 2005; Voon et al., 2007). It is also important to note that ICDs can be caused by factors other than the administration of dopaminergic medications. For example, a few studies reported the occurrence of ICDs in drug-naïve PD patients (Antonini et al., 2011). Interestingly, studies also report that some PD patients present with either single or multiple ICDs, and that each of these subgroups has a different cognitive and neural profile (Vitale et al., 2011). Further, a few studies have investigated the prevalence and predictors of ICDs. For example, it was reported that alexithymia is a predictor of ICDs in PD patients (Goerlich-Dobre et al., 2013). It was also found that frontal executive function is a predictor of the occurrence of pathological gambling (Santangelo et al., 2009b).

Prior studies show that ICDs are observed more often in patients on D2 dopamine agonists (Weintraub et al., 2006; Voon and Fox, 2007). It is suggested that patients vulnerable to ICDs have a lower D2 receptor density, even before onset of PD (Dagher and Robbins, 2009). In these vulnerable patients, the density of D2 receptors might decrease further through overstimulation of ventral striatal D2 receptors in PD, increasing the rate of ICDs. Other studies suggest that binding to dopamine D3 receptors is responsible for the occurrence of ICDs (Vilas et al., 2012). A recent study found that patients with ICDs were more likely to be on antidepressant medications and had more motor complications than those without ICDs (Mack et al., 2013). In addition to dopaminergic medications, studies also show that ICDs can be caused by deep brain stimulation of the subthalamic nucleus (Frank et al., 2007; Callesen et al., 2013a; but also see Santangelo et al., 2013). Studies also show that cognitive behavioral therapy can ameliorate ICDs in PD patients (Okai et al., 2013).

ICDs in PD are also associated with cognitive and psychiatric symptoms. Unlike patients without ICDs, PD patients with ICDs show increased discounting in delay discounting tasks (Housden et al., 2010; Voon et al., 2010b; Leroi et al., 2013), increased reward learning (Voon et al., 2011), and impaired performance on the Iowa gambling task (Gescheidt et al., 2012). Studies have also shown that PD patients with ICDs are more impaired than patients without ICDs on working memory (Djamshidian et al., 2010; Voon et al., 2010a), but they did not differ on executive functioning (Siri et al., 2010). A recent study also showed that PD patients with ICDs are more impaired at planning and set-shifting tasks than patients without ICDs (Vitale et al., 2011), although the study did not include a healthy control group. Another study showed that ICDs in PD are associated with executive dysfunction (Voon et al., 2010b) and working memory impairment (Voon et al., 2011). Further, it was found that PD patients with the hypersexuality subtype show more impairment on the Stroop task than patients with pathological gambling (Vitale et al., 2011), suggesting that the different impulsive behaviors are associated with different behavioral, and potentially neural, profiles. Studies also found that ICDs in PD are associated with depression and irritability (Pontone et al., 2006).

Neural studies have implicated cortical and subcortical structures in the occurrence of ICDs in PD. Many studies show that the underlying neural substrates of ICDs in PD are mostly in the ventral striatum, including the nucleus accumbens (Cools et al., 2007; Dagher and Robbins, 2009; Steeves et al., 2009; Voon et al., 2010a). For example, using PET imaging, Steeves et al. (2009) found greater decreases in binding potential in the ventral striatum in PD patients on dopamine agonists with pathological gambling. Additionally, Voon et al. (2010a,b) reported impaired dopamine signaling in ventral striatal blood oxygen level dependent (BOLD) activity in PD patients with ICDs. It has been suggested that restoration of dopamine transmission in the dorsal striatum might lead to overdosing of the ventral striatum, which results in excessive dopamine receptor stimulation in the ventral striatum (Swainson et al., 2000; Cools et al., 2001) that induces ICDs in some subjects (Cools et al., 2003; Dagher and Robbins, 2009). A recent study showed that an increase of striatal dopamine might lead to ICDs in PD (Voon et al., 2013). Studies have also implicated the hippocampus in the occurrence of ICDs in PD (Calabresi et al., 2013). Besides subcortical structures, neural studies have shown that patients with ICDs show dopamine reduction in the ventromedial prefrontal cortex (and nucleus accumbens) compared with those without ICDs (Lee et al., 2013).

## **DEPRESSION IN PD**

Depression is one of the most common non-motor symptoms in PD patients (Schrag et al., 2007; Picillo et al., 2009). It is estimated that roughly 40% of PD patients show depressive symptoms (Slaughter et al., 2001; McDonald et al., 2003; Leentjens, 2004; Schrag et al., 2007). These depressive symptoms include social withdrawal and anhedonia (inability to experience pleasure). Depression in PD is often diagnosed as follows: patients with a Beck Depression Inventory (BDI) score greater than 13/14 are assigned to the depression group, while the others are assigned to the non-depressed group (Leentjens et al., 2000; Schrag et al., 2007; Herzallah et al., 2010). Furthermore, patients with major depressive disorder have a threefold higher risk of developing PD later in life (Schuurman et al., 2002). As with ICDs, it is well recognized that the incidence of depression among PD patients is much higher than among age-matched healthy participants (Cummings, 1992; Veiga et al., 2009). Selective serotonin reuptake inhibitors as well as dopamine agonists, such as pramipexole and pergolide, were shown to have antidepressant effects in PD depression (Picillo et al., 2009).
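The BDI-based grouping described above amounts to a simple threshold rule, sketched below. This is only an illustration: the function name `assign_depression_group` is hypothetical, and because the text gives the cutoff as "13/14", the cutoff is left as a parameter (defaulting to 13, which is an assumption) rather than fixed. The 0–63 range check reflects the standard 21-item BDI, with each item scored from 0 to 3.

```python
def assign_depression_group(bdi_score: int, cutoff: int = 13) -> str:
    """Assign a patient to a group from a BDI total score.

    A score strictly greater than `cutoff` places the patient in the
    depression group. The default cutoff of 13 is an assumption; pass
    cutoff=14 to apply the stricter reading of "13/14".
    """
    # The 21-item BDI yields totals between 0 and 63.
    if not 0 <= bdi_score <= 63:
        raise ValueError("BDI total score must be between 0 and 63")
    return "depressed" if bdi_score > cutoff else "non-depressed"


if __name__ == "__main__":
    for score in (9, 13, 14, 27):
        print(score, assign_depression_group(score))
```

Keeping the cutoff explicit makes it easy to check how sensitive group sizes are to the choice between 13 and 14 in a given sample.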

Research has shown that depressive symptoms in PD have a negative impact on PD patients' quality of life (Karlsen et al., 1999; Schrag et al., 2000). Studies also suggest depression in PD is associated with intellectual impairment and inattention (Mayeux et al., 1981). Furthermore, depression in PD is associated with cognitive impairment (Kuzis et al., 1997; Bader and Hell, 1998; Errea and Ara, 1999). Current studies in PD patients with depression largely focus on executive functioning, working memory, or memory (Kuzis et al., 1997; Norman et al., 2002; Kummer et al., 2009; Santangelo et al., 2009a). We also found that PD patients with depression were more impaired at learning tasks than PD patients without depression (Herzallah et al., 2010).

It is debated in the literature which mechanism contributes to the occurrence of depression in PD, and whether depressive symptoms are related to other psychiatric symptoms in PD, including apathy and anxiety. Further, it is debated whether depression in PD is related to dopaminergic dysfunction (Eskow Jaunarajs et al., 2011). Some also argue that depressive symptoms in PD can be due to psychosocial factors or secondary to motor impairment (Aarsland et al., 2012). Further, it is not clear which neurotransmitter system dysfunction contributes to PD depression, as many argue it could be related to dopaminergic, noradrenergic and/or serotonergic dysfunction (Aarsland et al., 2012). It is also argued that depressive symptoms are more related to the severity of motor symptoms in PD, particularly akinesia and bradykinesia (Reijnders et al., 2009).

Neurobiological studies have also investigated the neural substrates of depression in PD. For example, imaging studies found that patients with PD who develop depression show structural changes that reflect dopaminergic dysfunction (Walter et al., 2010). This is in agreement with case reports showing that deep brain stimulation of the substantia nigra can trigger depressive symptoms in PD (Bejjani et al., 1999). Other neural studies suggest that while the primary neural dysfunction of PD is in the dorsal striatum and its dopaminergic afferents (Kish et al., 1988), depression in PD is associated with deficits in the ventral regions within the striatum, including the nucleus accumbens (Remy et al., 2005). Interestingly, Voon et al. (2011) argue that depression in PD (hypoactive state) is perhaps the antithesis of ICDs (hyperactive state). For example, research has shown that depression in PD can result from a reduction of dopamine levels in the brain (Thobois et al., 2010). This contrasts with ICDs, which are often associated with increased dopamine levels, due mostly to the administration of dopamine agonist therapy (Voon et al., 2007); but also see Callesen et al. (2013b) for evidence of an association between impulsivity and depression in PD.

It is debated whether depression is caused by motor abnormalities or other neuropathology in PD. Imaging studies suggest that patients with depression who show structural abnormalities at the level of the substantia nigra are possibly at an elevated risk of later developing definite PD (Hoeppner et al., 2009; Shen et al., 2013). Nonmotor manifestations of PD (such as depression) are the earliest to appear (Simuni and Sethi, 2008).

## **HALLUCINATIONS IN PD**

Psychosis and visual and olfactory hallucinations occur in approximately 20–30% of PD patients (Rabey, 2009). In PD patients, visual hallucinations are more common than auditory or olfactory hallucinations (Diederich et al., 2009). Psychosis is usually defined as involving one or more of the following symptoms: (a) illusions (misinterpretations of existing stimuli), (b) hallucinations (perceptions in the absence of an external stimulus), and/or (c) delusional symptoms. Psychosis in PD is usually confirmed using the Parkinson's Psychosis Rating Scale (Friedberg et al., 1998). Although the administration of dopaminergic drugs was previously considered the main cause of psychosis and hallucinations (Morgante et al., 2012), recent studies additionally show that sleep disturbance, dementia, longer disease duration, and advanced stage of the disease can also exacerbate psychotic symptoms in PD patients (Poewe, 2003; Fenelon et al., 2006; Fenelon, 2008; Bannier et al., 2012; Lee and Weintraub, 2012; Morgante et al., 2012; Gibson et al., 2013). So, unlike in earlier clinical practice, a diagnosis of psychosis in PD can also be made in patients who have not received any dopaminergic medication. Among risk factors, one longitudinal study found that frontal dysfunction was also a predictor for the development of hallucinations in PD (Santangelo et al., 2007).

Prior reports also suggest that hallucinations in PD patients are the main reason for admission to nursing homes (Diederich et al., 2009). Psychosis in PD is also a risk factor for the occurrence of severe cognitive dysfunctions such as dementia (Factor et al., 2003) and is associated with a diminished quality of life (Zahodne and Fernandez, 2008; Rabey, 2009). Prior studies have shown strong links between medication dosage, dementia, sleep disturbance, and psychosis in PD patients (Poewe, 2003; Fenelon, 2008; Bannier et al., 2012; Gibson et al., 2013).

There have been very few studies investigating the perceptual and cognitive correlates of psychosis in PD. Studies generally report greater cognitive and behavioral dysfunction in PD patients with psychosis than in patients without psychosis (Baydas et al., 2005; Shine et al., 2011). For example, studies have shown that PD patients with hallucinations are more impaired than PD patients without hallucinations on recognition memory (Barnes et al., 2003), executive function (Grossi et al., 2005), frontal function (as measured using the frontal assessment battery), attentional processes (Meppelink et al., 2008), and semantic fluency (Ramirez-Ruiz et al., 2006). Specifically, Botha and Carr (2012) argue that the gating of irrelevant information to working memory is the mechanism underlying the occurrence of hallucinations. Prior studies have shown that hallucinations and psychosis in PD patients are associated with visual disturbance (Gallagher and Schrag, 2012; Shine et al., 2012). Behavioral differences between PD patients with and without psychosis can be observed during the performance of complex perceptual tasks (Shine et al., 2011).

Prior studies suggest that hallucinations in PD patients can be due to either cortical or subcortical atrophy (Papapetropoulos et al., 2006). Specifically, psychosis and hallucinations can stem from dysfunction of the prefrontal cortex (Corlett et al., 2007; Fletcher and Frith, 2008), basal ganglia (Frank, 2008; Howes et al., 2012), or hippocampal region (Bogerts et al., 1985; Weinberger, 1999; Goldman and Mitchell, 2004; Keri, 2008; Grace, 2010). Studies also report temporal lobe dysfunction in PD patients with visual hallucinations (Botha and Carr, 2012), in line with studies showing that early onset of hallucinations in PD patients is associated with dysfunction of the parahippocampus and inferior temporal cortex (Harding et al., 2002). It was found that carrying the *APOE* ε4 allele, which is associated with a small hippocampal volume in healthy older subjects (Alexopoulos et al., 2011), is also a risk factor for the development of psychotic episodes in PD patients (de la Fuente-Fernandez et al., 1999; Goldman et al., 2004; Feldman et al., 2006).

## **DISCUSSION: RELATIONSHIPS AMONG PD SUBTYPES**

In the previous sections we briefly reviewed some clinical subtypes of PD derived from the empirically assigned classification approach, which has the advantage of describing a small number of subtypes based on specific clinical symptoms, with ease of implementation and assignment of patients to one or another subtype. On the other hand, because it cannot investigate relationships between subtypes, this approach does not provide a global view on complex clinical patterns of motor and non-motor symptoms.

This study aimed at reviewing the most common PD subtypes derived from the empirically assigned classification to attempt a possible integration of them, through the identification of their possible relationships and their possible common neuropathological causes. Indeed, the clinical heterogeneity of PD has led to classification into many subtypes as regards motor symptoms (e.g., postural instability and gait difficulty vs. tremor), cognition (MCI vs. dementia), psychopathological features (e.g., with ICD vs. without ICD; with psychosis vs. without psychosis), demographic features (e.g., young onset vs. late onset) and disease features (e.g., rapid progression vs. slow progression).

A first issue involves the relationship between motor subtypes and subtypes of non-motor symptoms, in particular of the cognitive and psychopathological domains. The non-tremor dominant motor subtype usually presents a more severe clinical pattern of non-motor symptoms in comparison to the tremor dominant subtype. From the early untreated stages of PD, this subtype is characterized by a higher risk of MCI (Poletti et al., 2012a) and longitudinally is associated with faster motor worsening (Vu et al., 2012), cognitive decline (Burn et al., 2006), and a higher risk of developing dementia (Alves et al., 2006; Sollinger et al., 2010). This subtype is also associated with an increased risk of developing psychopathological features, including affective features such as depression and alexithymia (Starkstein et al., 1998; Reijnders et al., 2009; Poletti et al., 2011; Burn et al., 2012) and psychotic features such as hallucinations (Reijnders et al., 2009). Conversely, the tremor dominant subtype appears to be characterized by a less severe clinical picture, with a slower progression of motor and cognitive symptoms and a lower risk of developing dementia and psychopathological features. The more severe clinical picture of the non-tremor dominant PD subtypes is due to a more severe Lewy body neuropathological load, as found at post-mortem pathological examination (Selikhova et al., 2009), and a more severe gray matter atrophy of cortical and limbic structures, as indicated by neuroimaging studies (Rosenberg-Katz et al., 2013).

A second issue regards the relationship between cognitive subtypes and psychopathological subtypes. Few empirical findings are available from studies based on the classification of cognitive subtypes: only two studies directly compared cognitive subtypes of PD patients in relation to other non-motor subtypes. One study compared 54 cognitively preserved patients, 48 PD patients with MCI and 25 PD patients with dementia (Leroi et al., 2012). Apathy was reported in almost 50% of MCI patients and PD patients with dementia, and was the only psychopathological manifestation differentiating cognitively preserved patients from MCI patients. Moreover, the prevalence of psychotic symptoms such as hallucinations and delusions progressively increased according to the degree of cognitive impairment (12.9% in cognitively preserved patients, 16.7% in MCI patients and 48% in PD patients with dementia). Another study compared different subtypes of MCI (Goldman et al., 2012) in 128 PD patients with MCI; according to the cognitive profile, patients were classified as nonamnestic single domain (47.7% of the sample), amnestic multiple domain (24.2%), amnestic single domain (18.8%), and nonamnestic multiple domain, and executive functions and visuospatial functions were the most frequently impaired domains. In comparison to other subtypes, nonamnestic multiple domain MCI patients showed the most pronounced difficulties with postural instability and gait; subtypes did not differ in relation to age, PD duration, medication use, mood or behavioral disturbances.

These few findings are in agreement with the more robust empirical literature on the classification in psychopathological subtypes, suggesting that different subtypes may present different clinical patterns along the disease course. The subtype characterized by affective features (e.g., depression, apathy and anxiety) is very common at each disease stage, because it is associated with several non-disease specific risk factors (Leentjens et al., 2013); moreover it is also associated with the executive impairment (Poletti et al., 2012a), and it is only partially modified by dopaminergic therapies (Eskow Jaunarajs et al., 2011).

Different patterns characterize psychotic features such as hallucinations and delusions. Hallucinations may be present in patients with MCI (Shin et al., 2012; Hepp et al., 2013) but are particularly prevalent in advanced PD stages, in association with dementia (Rana et al., 2012) and with cortical atrophy at neuroimaging (Papapetropoulos et al., 2006). Delusions may also be associated with hallucinations and dementia in advanced patients but may be present in isolation in cognitively preserved patients, probably due to the interaction between individual susceptibility (such as a family psychiatric history) and dopaminergic therapy (Poletti et al., 2012a).

ICDs are more common in patients without dementia than in patients with dementia (9.6% vs. 3.8% in a recent cross-sectional study: Poletti et al., 2013). However, an impairment of both orbitofrontal and dorsolateral executive functions may represent a risk factor (Poletti and Bonuccelli, 2012) and probably interplays with individual (premorbid level of impulsivity), pharmacological (dopamine agonist therapy) and disease-related (striatal dopaminergic alteration) risk factors (Dagher and Robbins, 2009).

A controversial issue involving motor, cognitive and psychopathological PD subtypes regards the possible influence of the side of motor onset. Empirical findings principally involve cognition: the side of motor onset does not influence cognition in newly diagnosed untreated patients (Erro et al., 2013; Poletti et al., 2013); in patients "on" dopaminergic therapy a right-side motor symptom predominance is typically associated with difficulties in tasks of language and verbal memory, whereas a left-side motor symptom predominance is typically associated with difficulties in visuospatial tasks (Verreyt et al., 2011). More heterogeneous and controversial are findings on the role of the side of motor onset on motor subtypes (e.g., Stewart et al., 2009; Baumann et al., 2013) and especially on psychopathological subtypes (e.g., Foster et al., 2011; Dewey et al., 2012), therefore more empirical studies are needed on this issue.

Further empirical studies are also needed on other clinical features that have been scarcely investigated in relation to motor, cognitive and psychopathological subtypes. For example, data-driven classification methods (van Rooden et al., 2010) suggested that PD patients could also be classified as "rapid disease progression" vs. "slow disease progression" or according to age of PD onset (e.g., young vs. late onset); in this perspective these features could be investigated by studies based on empirically assigned classifications. Moreover, other features suggested by empirically assigned classifications, such as freezing of gait and autonomic dysfunctions, should also be investigated in relation to motor, cognitive and psychopathological subtypes.

In sum, all these psychopathological subtypes may be present in all disease stages, and the presence of cognitive impairment represents a risk factor for their occurrence. Overall, PD is a heterogeneous disorder encompassing many subtypes along motor, cognitive, and psychiatric dimensions. Further, as discussed here, some of these subtypes are related (e.g., isolated delusions and ICD are more common in patients without dementia while hallucinations are more common in patients with dementia).

#### **ACKNOWLEDGMENTS**

This research is partially supported by a 2013 internal UWS Research Grant Scheme award P00021210 to Ahmed A. Moustafa.

#### **REFERENCES**


patterns of FP-CIT single photon emission computed tomography. *Mov. Disord.* 26, 416–423. doi: 10.1002/mds.23468


according to the timing of cognitive dysfunction. *J. Neurol.* 259, 469–473. doi: 10.1007/s00415-011-6203-x


diagnosed drug-naive patients with Parkinson's disease. *J. Neurol. Neurosurg. Psychiatry* 83, 601–606. doi: 10.1136/jnnp-2011-301874


Zaidel, A., Arkadir, D., Israel, Z., and Bergman, H. (2009). Akineto-rigid vs. tremor syndromes in Parkinsonism. *Curr. Opin. Neurol.* 22, 387–393. doi: 10.1097/wco.0b013e32832d9d67

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 October 2013; accepted: 05 December 2013; published online: 24 December 2013.*

*Citation: Moustafa AA and Poletti M (2013) Neural and behavioral substrates of subtypes of Parkinson's disease. Front. Syst. Neurosci. 7:117. doi: 10.3389/fnsys.2013.00117*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Moustafa and Poletti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Clémentine Bosch-Bouju<sup>1,2</sup>, Brian I. Hyland<sup>2,3</sup> and Louise C. Parr-Brownlie<sup>1,2</sup>\**

*<sup>1</sup> Department of Anatomy, Otago School of Medical Science, University of Otago, Dunedin, New Zealand*

*<sup>2</sup> Brain Health Research Centre, Otago School of Medical Science, University of Otago, Dunedin, New Zealand*

*<sup>3</sup> Department of Physiology, Otago School of Medical Science, University of Otago, Dunedin, New Zealand*

#### *Edited by:*

*Hagai Bergman, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Robert S. Turner, University of Pittsburgh, USA; Rea Mitelman, The Hebrew University of Jerusalem, Israel; Jesse Goldberg, Cornell University, USA*

#### *\*Correspondence:*

*Louise C. Parr-Brownlie, Department of Anatomy, University of Otago, Lindo Ferguson Building, PO Box 913, 270 Great King Street, Dunedin 9054, New Zealand. e-mail: louise.parr-brownlie@otago.ac.nz*

Motor thalamus (Mthal) is implicated in the control of movement because it is strategically located between motor areas of the cerebral cortex and motor-related subcortical structures, such as the cerebellum and basal ganglia (BG). The role of BG and cerebellum in motor control has been extensively studied but how Mthal processes inputs from these two networks is unclear. Specifically, there is considerable debate about the role of BG inputs on Mthal activity. This review summarizes anatomical and physiological knowledge of the Mthal and its afferents and reviews current theories of Mthal function by discussing the impact of cortical, BG and cerebellar inputs on Mthal activity. One view is that Mthal activity in BG and cerebellar-receiving territories is primarily "driven" by glutamatergic inputs from the cortex or cerebellum, respectively, whereas BG inputs are modulatory and do not strongly determine Mthal activity. This theory is steeped in the assumption that the Mthal processes information in the same way as sensory thalamus, through interactions of modulatory inputs with a single driver input. Another view, from BG models, is that BG exert primary control on the BG-receiving Mthal so it effectively relays information from BG to cortex. We propose a new "super-integrator" theory where each Mthal territory processes multiple driver or driver-like inputs (cortex and BG, cortex and cerebellum), which are the result of considerable integrative processing. Thus, BG and cerebellar Mthal territories assimilate motivational and proprioceptive motor information previously integrated in cortico-BG and cortico-cerebellar networks, respectively, to develop sophisticated motor signals that are transmitted in parallel pathways to cortical areas for optimal generation of motor programmes. Finally, we briefly review the pathophysiological changes that occur in the BG in parkinsonism and generate testable hypotheses about how these may affect processing of inputs in the Mthal.

**Keywords: motor thalamus, basal ganglia, motor cortex, cerebellum, Parkinson's disease, LTS burst**

#### **INTRODUCTION**

Motor thalamus (Mthal) encompasses thalamic nuclei that are strategically located between motor areas of the cerebral cortex and two subcortical networks, the basal ganglia (BG) and the cerebellum, generally considered to be related to the complex cognitive and proprioceptive control of movement, respectively (Middleton and Strick, 2000). Lesion studies indicate that the Mthal has a role in maintaining posture, general movements and motor learning (Bornschlegl and Asanuma, 1987; Canavan et al., 1989). The main current paradigm for understanding information processing in thalamic nuclei comes from studies in sensory thalamus, and centers on the concept of contrasting functions of different inputs characterized as "drivers" and "modulators" with specific anatomical characteristics (Sherman and Guillery, 1998, 2006). However, there is considerable debate in the literature about which inputs are the drivers and which are the modulators in the Mthal, which raises the question of whether this organization maps directly onto Mthal. Here, we first review the principal anatomical features and known physiology of Mthal, and the driver/modulator concept as derived from sensory thalamus. We then address whether existing anatomical and physiological evidence for cortical, BG and cerebellar inputs is consistent with driver/modulator functions, or if not, what the role of these inputs might be. We propose a new integrated model, in which cortical, cerebellar and BG afferents are considered to be of similar importance in determining Mthal activity. In this new model, Mthal acts as a "super-integrator" of motor information converging from cortex and BG, and from cortex and cerebellum, rather than simply a relay of driver signals as is thought to occur in the sensory thalamus. Finally, we consider how the different functions of inputs might impact thalamic processing in the generation of the symptoms of Parkinson's disease (PD).
By laying a platform of current knowledge about the Mthal and then speculating on a new way of thinking we aim to encourage debate and renewed experimental attention to this disregarded area of neuroscience research.

## **THE MOTOR THALAMUS**

#### **ANATOMICAL ORGANIZATION OF MTHAL**

Mthal is well conserved across vertebrates, indicating that it is likely to play an important role in the control of movement. In mammals, it is represented by a relatively consistent region of the ventral thalamus, strongly interconnected with cerebral motor cortex, and receives extensive afferent inputs from prominent motor related structures such as the cerebellum and BG. In birds, the equivalent region according to connectivity is in the medial nucleus of the dorsolateral region (DLM; Medina et al., 1997; Luo and Perkel, 1999a,b), but the ventral location is highly conserved across many mammalian species. There are two functional subdivisions of Mthal, the BG and cerebellar receiving territories, that are also relatively conserved across species with some specific differences. In cats, four regions are generally distinguished: the ventral anterior (VA) nucleus, the anterior and posterior subdivisions of the ventral lateral region (VLa, ventral lateral anterior; VLp, ventral lateral posterior) and the ventral medial (VM) nucleus. In rats, anatomical distinction between VA and VL is more difficult, and these are often considered together (VA/VL). However, recent studies have found molecular markers able to more easily distinguish VA and VL nuclei in rats based on their afferents (Kuramoto et al., 2011; Nakamura et al., 2012). In humans and other primates, Mthal is further subdivided into numerous nuclei and the nomenclature is not yet consistent (Hirai and Jones, 1989; Krack et al., 2002; Helmich et al., 2012). To enable data to be compared across mammalian species and nomenclatures, we have used the VA, VM, VLa and VLp scheme, and applied Mthal sub-regions to it guided by Krack et al. (2002, see their Table 1).

Mthal is interconnected with the cerebral cortex, and receives major inputs from the deep cerebellar nuclei, namely the dentate and interposed nucleus, and from the output nuclei of BG, namely the substantia nigra pars reticulata (SNpr) and internal segment of the globus pallidus (GPi). Mthal also receives major input from the reticular thalamic nucleus (Pare et al., 1987; Hazrati and Parent, 1991) and to a lesser extent, from the superior colliculus (Sommer, 2003), pedunculopontine nucleus (Steriade et al., 1988), and somatosensory spinal cord (Jones, 2007). In this review we focus on its main afferents from the cortex, cerebellum and BG.

The vast majority of Mthal neurons are glutamatergic and project out of the nucleus onto dendrites of pyramidal neurons of layers I and II and to a lesser extent layer V of the cerebral cortex (McFarland and Haber, 2002; Hooks et al., 2013). In cats and monkeys, a small population of GABAergic interneurons exists, but not in rodents (Arai et al., 1994; Jones, 2007). Mthal output neurons have a distinctive "bushy shape" with medium sized and rounded somata, and dense, circular dendritic arborizations about 300–500 µm in diameter (Williams and Faull, 1987; Kultas-Ilinsky and Ilinsky, 1991; Yamamoto et al., 1991; Sawyer et al., 1994a). Mthal neurons also have a low spine density on both distal and proximal dendrites (Sawyer et al., 1994a).

At the cellular level, cortical inputs from motor and motor-related areas innervate all Mthal neurons, whereas Mthal neurons receive inputs from either the BG or cerebellum (Ueki et al., 1977; Ueki, 1983; Yamamoto et al., 1984; Nambu et al., 1988, 1991). Mthal neurons receive cerebellar and BG afferents primarily on proximal dendrites, whereas cortical afferents terminate differently according to their laminar origin (**Figure 1**). Layer V cortical neurons innervate proximal dendrites, and layer VI neurons preferentially innervate distal dendrites (Kakei et al., 2001; Kultas-Ilinsky et al., 2003). The proximal location of cerebellar, BG and layer V cortical inputs indicates they are all likely to have a powerful effect on Mthal activity but the role of distal inputs from layer VI on Mthal activity is less clear.

At the level of Mthal territories, the cerebral cortex innervates all Mthal nuclei. In contrast, BG and cerebellar afferents segregate along a rostrocaudal continuum, with GABAergic inputs from BG more rostral and cerebellar glutamatergic inputs more caudal (**Figure 2**). This rostrocaudal continuum is conserved across mammals, but is easier to distinguish in cats and monkeys than in rodents. Afferents from SNpr are found mainly in VA and VM nuclei, afferents from GPi preferentially target the VLa nucleus, and afferents from the cerebellum are concentrated in the VLp (Anderson and Devito, 1987; Sakai et al., 1996; Kuramoto et al., 2011; Nakamura et al., 2012). Consequently, Mthal neurons appear unlikely to directly integrate information from BG and cerebellum because these two afferents do not converge at the neuronal level or within a territory.

**FIGURE 1 | Synaptic organisation of cortical, BG and cerebellar afferents on Mthal neurons**. Schematic diagram summarizing afferent inputs onto Mthal neurons (yellow) in BG-receiving (orange background) and cerebellum-receiving (purple background) territories. Afferents from the cerebral cortex (blue) innervate Mthal neurons in both BG (left) and cerebellar (right) receiving territories. Afferents from layer V of the cortex (large dark blue terminals) innervate somatic and perisomatic areas. Conversely, cortical layer VI afferents (small light blue terminals) innervate distal dendrites. In contrast, BG afferents (red terminals) innervate somatic and perisomatic areas of Mthal neurons, only in the BG receiving territory. The inset shows a multiple synapse formed by BG inputs. Cerebellar afferents (purple terminals) are located on primary dendrites of Mthal neurons in the cerebellar-receiving territory.

Connections between the cortex and Mthal form reciprocal or non-reciprocal loops depending on the laminar origin of cortical pyramidal neurons. Projections from layer V neurons of associative, premotor and motor cortical areas to Mthal are generally reciprocated (Rouiller et al., 1999; Sakai et al., 2000; McFarland and Haber, 2002; Fang et al., 2006), while layer VI axons from cortex diffusely target Mthal neurons that do not project back to the same area, but do project to other cortical regions (Rouiller et al., 1998; Kakei et al., 2001; McFarland and Haber, 2002; Haber and Calzavara, 2009). Inputs from different cortical regions are also segregated to some extent relative to the BG and cerebellar receiving areas (Anderson and Devito, 1987; Percheron et al., 1996; Rouiller et al., 1998; McFarland and Haber, 2002; Akkal et al., 2007; Haber and Calzavara, 2009; Kuramoto et al., 2009). Mthal territories receiving from BG (VM, VA and VLa) are mainly interconnected with associative and premotor cortices, whereas the cerebellar receiving territory (VLp) is preferentially interconnected with primary motor areas of the cortex (**Figure 2**).

It is likely that the complex connectivity between the Mthal and its afferents is important for processes related to movement preparation to be efficiently transformed to final motor commands in the motor cortex. However, the exact mechanism of the transfer of information in this network from associative to motor territories in the Mthal is not yet fully understood. One current model is that thalamic nuclei are involved in open feedback loops that facilitate integration of information coding preparatory and performance aspects of movement by "spiraling" information, first from limbic areas to non-motor thalamic nuclei (mediodorsal), thence to associative cortex, then, via Mthal to motor cortex (McFarland and Haber, 2002; Haber and Calzavara, 2009). This hypothesis is consistent with studies showing that the reaction time in rats and monkeys is about 300 ms (Baunez et al., 1995; Kurata, 2005), allowing time for development and refinement of the motor programme within corticothalamic connections. A testable prediction of this anatomically-based hypothesis is that the onset of movement-related activity during preparation for movement should be earlier in VM and VA nuclei than VLa and VLp, but no studies have addressed this point. Although it is not explicit in this theory, anatomical evidence also suggests a possible reverse transfer from motor to premotor and associative areas via the same thalamic structures (Rouiller et al., 1998, 1999; Kakei et al., 2001; McFarland and Haber, 2002; Fang et al., 2006), which may have an important feedback role for motor learning.

#### **PHYSIOLOGY OF MTHAL NEURONS**

Recordings from thalamic neurons in anesthetized animals are characterized by large amplitude, slow oscillations in membrane potential with bursts of action potentials during up-states (Connelly and Errington, 2012; Ushimaru et al., 2012). This bursty activity is mainly due to the intrinsic capacity of thalamic neurons to exhibit high frequency bursts of spikes, called low threshold calcium spike (LTS) bursts, following a prolonged hyperpolarization of the membrane potential. This fundamental property of thalamic neurons relies on T-type calcium channels that have distinct dynamics (Jahnsen and Llinas, 1984a,b; Huguenard and McCormick, 1992; McCormick and Huguenard, 1992).

The T-type calcium channels in the thalamus are depolarizing channels that are activated following a prolonged hyperpolarization of the membrane potential under −70 mV (**Figure 3**). This initial prolonged hyperpolarization is necessary to de-inactivate the channel (**Figure 3**), thus making the channel responsive to depolarizing events. When the membrane is depolarized following a prolonged period of hyperpolarization, the T-type channel is activated (opens) briefly and the influx of calcium ions (I*<sup>T</sup>* calcium current) further depolarizes the membrane leading to activation of voltage-gated sodium channels underlying the generation of action potentials (**Figure 3**). Because the membrane potential of Mthal neurons remains depolarized by the T-type calcium channels for a relatively long duration (∼50 ms), the prolonged depolarized state triggers multiple spikes in the LTS bursts. The T-type channel then inactivates and the gate closes, permitting repolarization of the membrane potential (**Figure 3**). LTS burst activity is a characteristic firing pattern in the thalamus in certain brain states and has raised much interest in the field of thalamus physiology since its discovery. LTS bursts are frequently observed during slow wave activity (Hirsch et al., 1983; Llinas and Steriade, 2006) or when the animal becomes drowsy (Joffroy and Lamarre, 1974; Strick, 1976; Schmied et al., 1979). As far as we are aware, the occurrence of LTS bursts in the Mthal within a complete BG-thalamocortical network in awake mammals has only been reported in an abstract (Postupna and Anderson, 2002), therefore, the role of LTS bursts in awake states remains to be fully characterized.

**FIGURE 3 | Mechanism of LTS bursts in thalamic neurons**. **(A)** Membrane potential of a thalamic neuron recorded in current-clamp during and after injection of negative current (I Inj), which hyperpolarizes the membrane. There is a prominent sag associated with continued injection of the negative current. At the offset of the injected current, the membrane potential depolarizes and a short burst of action potentials is evoked. Note, to clearly illustrate the spikes in the LTS burst different time scales are used for the first hyperpolarization phase, prior to the dashed part of the trace (100 ms scale bar) and the spiking phase (20 ms scale bar). **(B)** Diagram represents the major conductances underlying the membrane potential changes shown in **(A)**. The I*<sup>h</sup>* cationic current is activated by hyperpolarization of the membrane potential, which depolarizes the membrane potential and is the current mainly responsible for the sag. When the neuron repolarizes after current injection, the I*<sup>T</sup>* current is activated, which augments depolarization of the membrane potential. When the threshold for voltage-gated sodium channels is reached, action potentials occur and they are represented in **(B)** by the successive I*<sup>Na</sup>* and I*<sup>K</sup>* currents. Progressively, the I*<sup>T</sup>* current is reduced, which favors activation of I*<sup>K</sup>* currents and the neuron is repolarized to its resting membrane potential. **(C)** Conductances associated with inactivation (red) and activation (blue) gates of the T-type calcium channel underlying the I*<sup>T</sup>* current. Once the membrane potential is hyperpolarized, the inactivation gate opens slowly. As the membrane potential depolarizes, the inactivation gate closes slowly and at the same time the activation gate is opened. When both gates are open, the I*<sup>T</sup>* current occurs. Finally, the activation gate closes as the membrane potential returns to rest. **(D)** Illustration of a T-type calcium channel with its activation (blue) and inactivation (red) gates. Calcium ions are represented in yellow. The left example illustrates the configuration of the channel when the neuron is hyperpolarized. The middle and right examples illustrate the channel configuration when the I*<sup>T</sup>* current occurs and when the neuron is at its resting membrane potential, respectively. a.u: arbitrary units.
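The two-gate logic described above (a slow inactivation gate that must first be opened by hyperpolarization, and a fast activation gate that opens on depolarization) can be illustrated with a minimal numerical sketch. This is not a model taken from the studies cited here: it idealizes the membrane potential as a clamped voltage step, and every name and parameter value (`boltzmann`, `relax`, `peak_t_conductance`, the half-activation voltages, slopes and time constants) is an illustrative assumption rather than a value fitted to Mthal neurons.

```python
import math


def boltzmann(v_mv, v_half, slope):
    """Steady-state opening (0..1) of a gate at membrane potential v_mv."""
    return 1.0 / (1.0 + math.exp((v_mv - v_half) / slope))


def relax(x, x_inf, dt, tau):
    """Move gate variable x toward its steady state x_inf over one time step."""
    return x_inf + (x - x_inf) * math.exp(-dt / tau)


def peak_t_conductance(v_hold, hold_ms, v_test=-50.0, v_rest=-60.0,
                       test_ms=50.0, dt=0.1):
    """Peak relative T-type conductance (m^2 * h) during a step to v_test.

    The cell starts at v_rest, is held at v_hold for hold_ms, and is then
    stepped to v_test. All parameters are illustrative assumptions.
    """
    m_half, m_slope, tau_m = -57.0, -6.0, 1.0   # activation gate: fast
    h_half, h_slope, tau_h = -81.0, 4.0, 30.0   # inactivation gate: slow
    m = boltzmann(v_rest, m_half, m_slope)
    h = boltzmann(v_rest, h_half, h_slope)
    peak = 0.0
    for phase, (v, duration) in enumerate(((v_hold, hold_ms), (v_test, test_ms))):
        m_inf = boltzmann(v, m_half, m_slope)
        h_inf = boltzmann(v, h_half, h_slope)
        for _ in range(int(round(duration / dt))):
            m = relax(m, m_inf, dt, tau_m)
            h = relax(h, h_inf, dt, tau_h)
            if phase == 1:  # only record during the depolarizing test step
                peak = max(peak, m * m * h)
    return peak


if __name__ == "__main__":
    for label, v_hold, hold_ms in (("no hyperpolarization", -60.0, 500.0),
                                   ("brief hyperpolarization", -90.0, 5.0),
                                   ("prolonged hyperpolarization", -90.0, 500.0)):
        print(label, round(peak_t_conductance(v_hold, hold_ms), 4))
```

Under these assumptions, the same depolarizing step evokes a large transient conductance only when it follows a prolonged hyperpolarization, because only then has the slow inactivation gate had time to open. A full LTS burst model would additionally need the membrane equation and the I*<sup>h</sup>*, sodium and potassium currents shown in Figure 3.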

A rapid sequence of spikes, such as occurs in LTS bursts, has different consequences on network activity than a single spike and these modes may encode specific aspects of information, depending on the system and/or the context (Xu et al., 2012). In the brain, the probability of neurotransmitter release at synapses by a single spike is generally low (Borst, 2010; Tarr et al., 2013). With a burst of spikes, the probability of neurotransmitter release is augmented considerably, increasing the reliability of synaptic transmission (Lisman, 1997). This is true for both LTS bursts, and other kinds of bursts elsewhere in the brain that are not triggered by a prolonged inhibition. One function of LTS bursts during awake states may be to trigger state changes such as between inattentive rest and active movement (Crick, 1984; Sherman, 2001; Bezdudnaya et al., 2006). Thus, while tonic firing seems to dominate Mthal activity during awake states, LTS bursts may occur at a particular moment in time to increase reliability of synaptic transmission to downstream neurons in the cortex. LTS bursts may also contribute to the plasticity of responses in thalamic neurons (Hsu et al., 2012) as they are triggered by a strong influx of calcium, a fundamental activator of synaptic and intrinsic plasticity (Xia and Storm, 2005).
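The reliability argument can be made concrete with a small calculation. As a deliberately simplified sketch, assume each spike in a burst triggers release independently with the same probability `p_single`; the chance that at least one spike in a burst of `n_spikes` causes release is then 1 − (1 − p)^n. The function name and the example probability are hypothetical, and the independence assumption ignores short-term facilitation and depression at real synapses.

```python
def release_probability(p_single, n_spikes):
    """Probability that at least one of n_spikes triggers transmitter release.

    Assumes every spike releases independently with probability p_single,
    which is a simplification of real synaptic dynamics.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie between 0 and 1")
    if n_spikes < 0:
        raise ValueError("n_spikes must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_spikes


if __name__ == "__main__":
    p = 0.2  # hypothetical single-spike release probability
    for n in (1, 2, 3, 5):
        print(n, "spike(s):", round(release_probability(p, n), 3))
```

Even with a low single-spike probability, a few spikes in quick succession raise the probability of transmission substantially under this assumption, which is the sense in which a burst is more reliable than an isolated spike.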

Despite extensive studies about the role and mechanisms of LTS bursts in the sensory thalamus, very little data exist on LTS bursts in Mthal. Interestingly, in the songbird homolog of Mthal (DLM), LTS bursts can be triggered by GABAergic synaptic events coming from Area X (equivalent to GPi in mammals), in the *in vitro* slice preparation (Luo and Perkel, 1999a; Person and Perkel, 2005). Although it remains unknown if this is true in mammalian brains, it raises the possibility that GABAergic BG inputs may be able to have an excitatory effect on Mthal neurons, as discussed in more detail in the section entitled Characteristics of Cortical, Cerebellar and BG Afferents in the Mthal.

#### **ACTIVITY OF THE MTHAL AND ITS MAIN AFFERENTS DURING MOTOR BEHAVIOR**

Mthal is defined as "motor" because of the extensive inputs from motor cortex, BG, and cerebellum; brain regions that exhibit changes in activity related to, and are essential for, preparation and execution of movements. Motor cortex and its associated cortical areas exhibit changes in spiking activity related to the parameters of the movement, such as velocity, orientation or force (Thach, 1978; Georgopoulos et al., 1982, 1983; Kalaska et al., 1983; Georgopoulos et al., 1986; Georgopoulos, 1988; Kalaska et al., 1989; Caminiti et al., 1990; Wickens et al., 1994; Quallo et al., 2012). The dominant current theory is that all parameters of the movement are coded in the motor cortex by disparate subpopulations of pyramidal neurons that may be functionally synchronized during preparation and execution of a movement by brief oscillatory patterns to form the motor programme (Sanes and Donoghue, 1993; Vaadia et al., 1995; Baker et al., 1997; Churchland et al., 2012). Consequently, neurons recorded individually in the motor cortex exhibit complex and variable movement-related modulations in activity.

Deep cerebellar nuclei, and the dentate nucleus in particular, display mainly increases in activity, preceding the movement or associated with the visual cue (Thach, 1978; Mink and Thach, 1991; Mushiake and Strick, 1993; Horne and Butler, 1995; Middleton and Strick, 2000; Ebner et al., 2011). In contrast to the motor cortex, there remains considerable debate on whether the cerebellum specifically codes parameters of a movement such as direction or amplitude (Trouche and Beaubaton, 1980; Mink and Thach, 1991; Thach et al., 1992; Horne and Butler, 1995; Ebner et al., 2011). Neural changes in movement-related activity in cerebellar nuclei are generally thought to coordinate movements across multiple joints, have a role as a temporal pattern generator, code proprioceptive information and error signalling to optimize movements and motor learning possibly through feed-forward and/or adaptive filter models, but do not specifically control initiation of movements (Sieb, 1989; Thach et al., 1992; Horne and Butler, 1995; Braitenberg et al., 1997; Ohyama et al., 2003; Jacobson et al., 2008; Dean et al., 2010; D'Angelo et al., 2011; Ebner et al., 2011).

The BG output nuclei, GPi and SNpr, display a variety of changes in activity related to movement, with GPi seemingly more related to movement execution whereas SNpr neurons tend to change their activity in relation to the preceding cue and the movement-related reward (Nambu et al., 1990; Mink and Thach, 1991; Jaeger et al., 1995; Mushiake and Strick, 1995; Turner and Anderson, 1997, 2005; Wichmann and Kliem, 2004; Nambu, 2007; Nevet et al., 2007; Fan et al., 2012). Like the cortex, BG output nuclei code direction and amplitude of movements (Georgopoulos et al., 1983; Turner and Anderson, 1997; Turner and Desmurget, 2010). At a more complex level, activity in the BG is dramatically modified between the first and last trials of a motor learning task (Jog et al., 1999; Barnes et al., 2005; Fan et al., 2012; Lemaire et al., 2012), indicating that it may play a role in motor learning. The BG receiving territory of Mthal is thus ideally situated to be involved in motor learning because of its connections with prefrontal cortex and premotor cortex (McFarland and Haber, 2002; Xiao et al., 2009; Redgrave et al., 2010).

Neuronal recordings in the Mthal of awake animals display a wide range of activity from low to high frequencies (1–80 Hz), with brief modulations in activity in relation to movements (Anderson and Turner, 1991; Forlano et al., 1993; Macia et al., 2002; Pessiglione et al., 2005). This mode of firing differs dramatically from the activity in the Mthal during slow wave activity in EEGs or local field potential recordings during anesthesia, where activity is organized in bursts that are repeated with consistent periodicity (Steriade et al., 1971; Nakamura et al., 2012). Given that the inputs to Mthal all display some movement-related modulation in activity, information processing in Mthal during preparation or execution of movement is expected to also be reflected in temporally specific modulations. The activity of Mthal neurons during motor behavior has been mainly studied in primates and in behavioral paradigms requiring a movement triggered by a cue. These studies report that Mthal neurons change their activity in the period between presentation of the cue and onset of the movement, then activity returns to baseline levels (Strick, 1976; Schmied et al., 1979; Horne and Porter, 1980; MacPherson et al., 1980; Anderson and Turner, 1991; Nambu et al., 1991; Butler et al., 1992, 1996; Forlano et al., 1993; Vitek et al., 1994; Inase et al., 1996; Ivanusic et al., 2005; Kurata, 2005). These changes are mainly increases in activity, but decreases or complex patterns are also reported. The activity of Mthal neurons is correlated with movement duration, velocity or force, but only in a minority of cells (Butler et al., 1996; Ivanusic et al., 2005). Interestingly, despite anatomical segregation of information in BG and cerebellar territories, neurons across all regions of Mthal exhibit similar ranges of firing rate and movement-related activity (Anderson and Turner, 1991; Nambu et al., 1991).
This may reflect the fact that both BG output nuclei and deep cerebellar nuclei display complex responses with similar temporal characteristics during comparable tasks (Mushiake and Strick, 1993; Fan et al., 2012), but the precise role of each afferent on Mthal activity is still unknown. It can also be explained by inputs common to all motor thalamic nuclei such as afferents from premotor and motor cortices or the reticular thalamic nucleus (Pare et al., 1987; Hazrati and Parent, 1991). Another hypothesis is that glutamatergic synapses from cerebellum onto Mthal neurons are depressed and thus are unlikely to increase the firing frequency of Mthal neurons (Nakamura et al., 2012).

Insight about the role of the Mthal in the control of movement can be obtained from animal studies that have examined the effect of lesion or intrathalamic drug injection on behavior. Lesion effects are dependent on the site of the thalamic lesion, but in general these studies suggest that Mthal has a role in maintaining posture, controlling general movements and in motor learning. Mammals exhibit akinesia and bradykinesia, and posture is impaired following VA, VLa, VLp and VM electrolytic lesions or injection of GABA agonists or glutamate antagonists intrathalamically (Di Chiara et al., 1979; Starr and Summerhayes, 1983a,b; Klockgether et al., 1986a,b; Wullner et al., 1987; Canavan et al., 1989; Jeljeli et al., 2003). Large lesions of the VA, VLa and VLp nuclei produce ataxia and dysmetria in the contralateral arm of primates (Bornschlegl and Asanuma, 1987). The VLa appears to be particularly important in learning motor tasks because large lesions that included the VLa severely impaired relearning in primates, whereas lesions confined to VA did not (Canavan et al., 1989). Similarly in songbirds, the DLM is critical for motor learning, but not the production of song. DLM lesions exclusively impaired motor practice (babbling) and development of complex, mature syllables that are typical of adult male birds (Johnson and Bottjer, 1993; Goldberg and Fee, 2011).

Despite these advances, many questions remain to be addressed about the nature of information processing in Mthal and its role in motor control. In particular, Mthal activity needs to be explored during more complex motor tasks and during motor learning, and little data are available about how this activity is regulated by the various inputs. Finally, the role of Mthal in regulating activity in its efferent targets remains unknown.

## **CHARACTERISTICS OF CORTICAL, CEREBELLAR AND BG AFFERENTS IN THE MTHAL**

Much is known of the physiology of sensory thalamus, and principles derived from sensory thalamus have been applied to Mthal because of their physical proximity and similarly intense interconnectivity with cortex. A major organizing principle for sensory thalamic nuclei is the classification of afferents as being either *drivers* or *modulators* (Sherman and Guillery, 1998, 2006). Driver afferents define the sensory receptive field properties of thalamic neurons and dictate spiking activity, whereas modulator afferents influence the activity of thalamus cells without directly triggering spikes. Activity in the sensory thalamus is thus strongly correlated with the activity of driver inputs but not with modulator inputs (Sherman and Guillery, 1998). A typical example is the lateral geniculate nucleus, which relays visual information from the retina to the visual cortex (Sherman and Guillery, 1998, 2006). Here, retinal ganglion inputs are drivers because they define the receptive field properties of relay neurons in the lateral geniculate nucleus, whereas inputs from the parabrachial region, reticular thalamic nucleus, layer VI of the cortex and local interneurons are considered modulators, because they do not fulfil the driver criteria (Sherman, 2007).

To determine if an input is a driver or a modulator, Sherman and Guillery (Sherman and Guillery, 1998, 2006, 2011; Sherman, 2007) defined several criteria. Basically, these criteria state that driver inputs have anatomical and physiological features that ensure information is reliably transmitted to, and controls the activity of, downstream thalamic neurons. An afferent input that does not fulfil all of these criteria is considered by default to be a modulator.
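The decision rule stated here (an input counts as a driver only if it meets every criterion, and is otherwise a modulator by default) can be written as a simple checklist. The sketch below is only an illustration of that all-or-nothing logic: the field names are our own shorthand for the anatomical and physiological criteria enumerated in the paragraphs that follow, and the example profiles are hypothetical placeholders, not measured properties of any real afferent.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class AfferentProfile:
    """Checklist of driver criteria (field names are illustrative shorthand)."""
    large_axons_dense_terminals: bool   # anatomical
    targets_proximal_dendrites: bool    # anatomical
    no_reticular_collaterals: bool      # anatomical
    glutamatergic: bool                 # physiological
    ionotropic_receptors: bool          # physiological
    large_synaptic_events: bool         # physiological
    paired_pulse_depression: bool       # physiological


def failed_criteria(profile):
    """Names of the criteria that the afferent does not meet."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def classify(profile):
    """'driver' only if every criterion is met; 'modulator' by default."""
    return "driver" if not failed_criteria(profile) else "modulator"


if __name__ == "__main__":
    # Hypothetical input that satisfies every criterion.
    all_met = AfferentProfile(True, True, True, True, True, True, True)
    # Same hypothetical input, but not glutamatergic.
    not_glutamatergic = replace(all_met, glutamatergic=False)
    print(classify(all_met), failed_criteria(all_met))
    print(classify(not_glutamatergic), failed_criteria(not_glutamatergic))
```

The sketch makes one consequence of the scheme explicit: failing a single criterion, for example by using a neurotransmitter other than glutamate, is enough to place an input in the modulator class regardless of how it scores on the remaining criteria.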

Anatomically, driver inputs have large diameter axons, with a dense terminal arborization, and preferentially target proximal dendrites and perisomatic areas of postsynaptic thalamic neurons. An additional anatomical criterion is that drivers do not send any collateral axon to the reticular thalamic nucleus. The reticular thalamic nucleus is intimately connected to the cortex and its GABAergic projection neurons innervate most thalamic nuclei, including Mthal (Pare et al., 1987; Hazrati and Parent, 1991). The significance of this additional criterion is not explicitly explained in the literature, but if a driver also sends collateral input to the reticular thalamic nucleus, feed forward inhibition from the reticular thalamic nucleus may simultaneously reduce the strength of the primary driver signal to Mthal neurons.

Although these anatomical features identify good candidate driver inputs of thalamic activity, physiological features must also be considered. For example, while it is generally assumed that afferents proximal to the cell body have a stronger effect on neuronal activity than distal ones due to degradation of the electrical signal along dendrites, this is not always the case because synaptic events can be electrically maintained by active conductances along dendrites (Gulledge et al., 2005). Four main physiological criteria characterize driver inputs (Sherman and Guillery, 1998, 2006, 2011; Sherman, 2007). First, in the sensory thalamus, it is considered that a driver has to be glutamatergic because only excitatory neurotransmitters are able to directly trigger action potentials in adult postsynaptic neurons. Second, the transmitter must act via ionotropic receptors for temporally precise and rapid onset/offset of conductances at the postsynaptic membrane. Conversely, transduction of the synaptic signal via metabotropic receptors can last for several hundreds of milliseconds, which is not compatible with temporally precise transmission of information across a synapse but could contribute to synaptic plasticity (Luscher and Huber, 2010). Third, the synaptic input has to be significant to produce large synaptic events that reliably trigger an action potential in the postsynaptic neuron. Fourth, a driver input should display paired pulse depression, which is a decrease in the amplitude of current for successive synaptic events when they are triggered at frequencies between 10 and 250 Hz. The reason for this criterion is that paired pulse depression means there is a high probability of neurotransmitter release at the first synaptic event, leading to a lower probability that a second synaptic event will be reliable (Zucker and Regehr, 2002). 
However, this last criterion is complex to interpret since paired pulse depression can be due to multiple factors with both pre and postsynaptic origins (Klug et al., 2012).
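
The link between a high initial release probability and paired pulse depression can be illustrated with a minimal vesicle-depletion sketch. The model is an assumption made for illustration and is not taken from the cited studies: the first pulse releases a fraction `p_release` of a readily releasable pool, a fraction `recovery` of the released vesicles is replenished before the second pulse, and postsynaptic contributions are ignored. Both parameter names are hypothetical.

```python
def paired_pulse_ratio(p_release: float, recovery: float = 0.0) -> float:
    """Ratio of second to first response amplitude (A2 / A1).

    Hypothetical depletion model (an assumption, not from the source):
    the first pulse releases a fraction `p_release` of the vesicle pool,
    and a fraction `recovery` of those vesicles is replenished before
    the second pulse arrives.
    """
    if not 0.0 < p_release <= 1.0:
        raise ValueError("p_release must be in (0, 1]")
    if not 0.0 <= recovery <= 1.0:
        raise ValueError("recovery must be in [0, 1]")
    # Fraction of the pool available at the second pulse.
    pool_at_second_pulse = (1.0 - p_release) + p_release * recovery
    # A1 is proportional to p_release and A2 to pool_at_second_pulse * p_release,
    # so the ratio A2 / A1 reduces to the available pool fraction.
    return pool_at_second_pulse


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.9):
        print(f"p_release={p:.1f}  A2/A1={paired_pulse_ratio(p):.2f}")
```

In this sketch a ratio below 1 corresponds to paired pulse depression, and the ratio falls as `p_release` rises, which is the reasoning behind the fourth criterion. Because depression can also arise postsynaptically, a measured ratio below 1 does not by itself establish a high release probability.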

Several studies examining Mthal physiology have attempted to address whether cortical, cerebellar and BG afferents are more likely to be driver or modulator inputs, but the results to date are inconclusive (Anderson and Turner, 1991; Smith and Sherman, 2002; Person and Perkel, 2005; Bodor et al., 2008; Goldberg and Fee, 2012; Gulcebi et al., 2012; Nakamura et al., 2012; Rovo et al., 2012). Therefore, the following sections summarize the available anatomical and physiological data for cortical, cerebellar and BG inputs to Mthal and compare their features to the characteristics of drivers and modulators in an attempt to understand what their role might be in the Mthal. We also introduce the term "driver-like", when afferents fulfil most, but not all, of the criteria of a driver input. Notably, whereas traditional models of thalamic function based on sensory nuclei assume that each nucleus receives only one driver input (Sherman and Guillery, 2006), this review highlights that Mthal may integrate inputs from multiple sources, including driver-like afferents from BG.
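
The classification logic described above can be sketched in a few lines. This is a minimal illustration, assuming that each of the eight anatomical and physiological criteria can be scored as met or not met; the field names, the function `classify` and the threshold used for "most" criteria (`driver_like_fraction`) are hypothetical choices, not definitions from the literature.

```python
from dataclasses import dataclass, fields


@dataclass
class AfferentFeatures:
    # Anatomical criteria
    large_diameter_axon: bool
    dense_terminal_arborization: bool
    targets_proximal_dendrites_or_soma: bool
    no_reticular_collateral: bool
    # Physiological criteria
    glutamatergic: bool
    ionotropic_receptors: bool
    large_synaptic_events: bool
    paired_pulse_depression: bool


def classify(afferent: AfferentFeatures, driver_like_fraction: float = 0.5) -> str:
    """Label an afferent as 'driver', 'driver-like' or 'modulator'."""
    met = [getattr(afferent, f.name) for f in fields(afferent)]
    if all(met):
        return "driver"
    # Assumption: "most" criteria means more than `driver_like_fraction` of them.
    if sum(met) / len(met) > driver_like_fraction:
        return "driver-like"
    return "modulator"


if __name__ == "__main__":
    # Purely hypothetical afferent: large, proximal, ionotropic, depressing,
    # but not glutamatergic and with a reticular collateral.
    example = AfferentFeatures(True, True, True, False, False, True, True, True)
    print(classify(example))
```

Under the strict scheme of Sherman and Guillery, the middle category does not exist and any afferent failing a single criterion is a modulator; setting `driver_like_fraction` to 1.0 reproduces that behavior. The sketch also assumes every criterion has been measured, which is rarely the case for Mthal afferents.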

## **CORTICAL AFFERENTS HAVE DRIVER AND MODULATOR CHARACTERISTICS, DEPENDING ON THE LAYER OF ORIGIN**

Cortical afferents to Mthal arise from pyramidal neurons in layers V and VI. Layer V afferents to the Mthal are collaterals from major descending axons that project to the brainstem and spinal cord. These layer V afferents, which represent a small proportion of all cortical inputs to Mthal, have been reported to match the anatomical criteria of a driver (Grofova and Rinvik, 1974; Rouiller et al., 1998; Kakei et al., 2001; Kultas-Ilinsky et al., 2003; Rouiller et al., 2003). However, a recent study did not report any large glutamatergic afferents from the cortex in the Mthal based on immunohistochemical staining for a glutamate transporter (vGLUT1), a stain for cortical driver afferents (Rovo et al., 2012). Nevertheless, layer V axons are thick (up to 3 µm) and terminate with very large boutons on the perisomatic area and proximal dendrites of thalamic neurons (Kultas-Ilinsky et al., 2003). Moreover, layer V afferents do not terminate in the reticular thalamic nucleus (Kakei et al., 2001; Kultas-Ilinsky et al., 2003). Functionally, one study reports that these layer V corticothalamic neurons from the motor cortex have a fast conduction velocity (∼40 m/s) typical of large diameter myelinated axons (Sirota et al., 2005), but further physiological data are not available and the postsynaptic receptors have not been characterized. While anatomical and physiological features of layer V afferents to Mthal are consistent with a driver role, full characterization is not yet available.

In contrast, cortical afferents from layer VI meet several modulator criteria. They preferentially target distal dendrites of Mthal neurons and have small diameter axons with small bouton terminals in the Mthal (Kakei et al., 2001; Kultas-Ilinsky et al., 2003). Moreover, layer VI cortical afferents send axon collaterals to the reticular thalamic nucleus (Kakei et al., 2001). Physiological data are scarce, but the conduction velocity of axons from layer VI cortical pyramidal neurons is less than 5 m/s, which is significantly slower than axons from layer V (Sirota et al., 2005). Nothing is known functionally, including whether synapses formed by layer VI inputs have ionotropic or metabotropic receptors. In summary, anatomical data suggest that layer VI cortical afferents to Mthal may have a less prominent role than layer V afferents.

The consequences of two functional inputs from the cortex on information processing in the Mthal during the preparation and execution of movements need further consideration in the context of reciprocal and non-reciprocal thalamocortical connections. From the criteria described above, it appears that the features of layer V and VI afferents to Mthal are consistent with driver and modulator roles, respectively. There is a reciprocal feedback loop between cortical layer V and Mthal. Layer V afferents directly innervate the Mthal, and Mthal sends axons back to the corresponding cortical region (**Figure 2**). In contrast, layer VI afferents have an important role in integrating information across functional cortical boundaries because their axons target Mthal neurons that project to other cortical regions (McFarland and Haber, 2002; Haber and Calzavara, 2009). We can suppose from these data that Mthal activity of a specific nucleus is driven by its corresponding cortical area with layer V afferents and modulated at the same time by neighboring cortical areas with layer VI afferents, enabling the spiraling of information between the cortex and Mthal to facilitate the best functional movement outcome. In that case, layer VI cortical afferents would play a critical role in the corticothalamic network because they would be responsible for ensuring the Mthal integrates information across functional cortical areas. Currently, movement-related responses of layer VI pyramidal neurons in motor areas vary markedly (Sawaguchi et al., 1989; Matsumura et al., 1992; Beloozerova et al., 2003a,b; Isomura et al., 2009); therefore, further studies need to determine the precise role of layer VI cortical inputs on Mthal activity during execution of movements.

## **CEREBELLAR AFFERENTS IN THE MTHAL EXHIBIT SEVERAL DRIVER CHARACTERISTICS**

The connection between deep cerebellar nuclei and Mthal has been less studied than cortical afferents but the anatomical and physiological features indicate that cerebellar afferents to Mthal have several driver characteristics (Sherman and Guillery, 2006; Rovo et al., 2012). Indeed, on application of the standard model developed from arrangements in sensory thalamus, in which first order nuclei involve an ascending pathway whereas higher order nuclei are mainly implicated in corticocortical communications (Sherman and Guillery, 2006), the cerebellar receiving territory of Mthal is considered to be a first order nucleus, driven by the cerebellum and modulated by layer VI cortical afferents (Sherman and Guillery, 2006; Rovo et al., 2012).

Consistent with a driver role, cerebellar afferents in Mthal form large boutons that mainly synapse on primary dendrites (Rinvik and Grofova, 1974; Kultas-Ilinsky and Ilinsky, 1991; Aumann et al., 1994; Sawyer et al., 1994b; Kuramoto et al., 2011; Rovo et al., 2012). These anatomical features are corroborated by intracellular studies showing that stimulation of cerebellar afferents produces strong, fast, excitatory events in Mthal neurons, that are even faster than cortical ones (Uno et al., 1970; Shinoda et al., 1985; Sawyer et al., 1994a). However, two important criteria of driver inputs that remain unknown are if the cerebellum innervates the reticular thalamic nucleus and the type of glutamatergic receptors involved at cerebellothalamic synapses. Further, the cerebellar receiving territory of Mthal also receives inputs from layer V of the cortex (Rouiller et al., 1998; McFarland and Haber, 2002), which, as noted above, have characteristics consistent with a driver role.

## **BG AFFERENTS IN THE MTHAL HAVE SEVERAL DRIVER-LIKE CHARACTERISTICS**

BG afferents in the Mthal are from the GPi and SNpr. Although these two BG afferents do not terminate on exactly the same nuclei within Mthal, we will consider them together because they are both GABAergic and the temporal aspects of their neural activity are similar at rest and during execution of movements (Wichmann et al., 1999; Boraud et al., 2002; Wichmann and Kliem, 2004).

Currently, the role of BG inputs on Mthal activity remains an enigma. One model classifying thalamic nuclei by their inputs has proposed that the BG receiving territory of Mthal is a higher order nucleus, driven by layer V cortical afferents and modulated by BG and layer VI cortical afferents (Sherman and Guillery, 2006; Gulcebi et al., 2012). However, several anatomical and physiological features of BG inputs to Mthal are consistent with a driver-like role and an alternative model proposes that BG inputs have a strong impact on Mthal activity (Albin et al., 1989; Alexander and Crutcher, 1990).

The main argument for BG providing modulatory input is that BG projection neurons are GABAergic and thus, by definition, cannot be drivers (Sherman and Guillery, 2006). Intracellular recordings of Mthal activity have demonstrated that electrical stimulation in the GPi/SNpr induces strong inhibitory synaptic events in the Mthal (Deniau et al., 1978; Uno et al., 1978; Anderson and Yoshida, 1980; Chevalier and Deniau, 1982; Ueki, 1983; Tanibuchi et al., 2009). Moreover, the membrane potential of neurons in Mthal is about −60 mV during the anesthetized state (Paz et al., 2007), some 10 mV from the reversal potential for chloride, which favors generation of an inhibitory current. In the awake state, this will be exacerbated because neurons are less hyperpolarized (Franks, 2008) with firing rates between 1 and 80 Hz (Anderson and Turner, 1991; Forlano et al., 1993; Macia et al., 2002; Pessiglione et al., 2005), making GABAergic inputs from BG more likely to hyperpolarize Mthal neurons than excite them. While it is possible for GABAergic afferents to depolarize neurons depending on the equilibrium potential for chloride (Viitanen et al., 2010; Kim et al., 2011), to date, an excitatory effect of BG input onto Mthal neurons mediated by direct GABAergic activation of ionotropic receptors has not been found in mammals. Another argument for a modulator role of BG afferents is that the SNpr sends axons to the reticular thalamic nucleus (Pare et al., 1990; Pazo et al., 2013), although it remains unclear whether these are collaterals of axons innervating the Mthal or a different set of neurons.
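
The dependence of the GABAergic effect on the chloride reversal potential can be made concrete with a simple ohmic driving-force sketch. The membrane potential of −60 mV is taken from the text; the chloride reversal potential of −70 mV is an assumption inferred from "some 10 mV from the reversal potential", and the conductance of 1 nS and the awake membrane potential of −55 mV are hypothetical values chosen only for illustration.

```python
def gaba_a_current(g_ns: float, v_m_mv: float, e_cl_mv: float) -> float:
    """Ohmic GABA_A current in pA (nS * mV = pA).

    Sign convention: positive = outward current (hyperpolarizing),
    negative = inward current (depolarizing).
    """
    return g_ns * (v_m_mv - e_cl_mv)


if __name__ == "__main__":
    g = 1.0        # nS, hypothetical conductance
    e_cl = -70.0   # mV, assumed chloride reversal potential
    for label, v_m in (("anesthetized", -60.0), ("awake (assumed)", -55.0)):
        i = gaba_a_current(g, v_m, e_cl)
        print(f"{label}: V_m={v_m} mV, I={i:+.1f} pA")
    # If E_Cl sat above the membrane potential, the same input would depolarize.
    print(gaba_a_current(g, -60.0, -50.0))
```

The sketch shows why a less hyperpolarized membrane increases the inhibitory current for the same conductance, and why the sign of the effect reverses only if the chloride reversal potential lies above the membrane potential.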

In distinct contrast to the assumption of a modulator role for BG inputs that develops from a thalamic perspective, the second theory assumes that the BG exert driver-like control on Mthal activity. This theory is based on models of BG function and circuitry, which treat the BG-territory of the Mthal as a "relay" responsible for transmitting BG output to cortex (Albin et al., 1989; Alexander and Crutcher, 1990; DeLong, 1990; Boraud et al., 2002; Bar-Gad et al., 2003; Nambu, 2004). Indeed, some anatomical and physiological characteristics of BG synapses onto Mthal neurons are consistent with a driver-like role. First, the synapse between the GPi and the Mthal in cat exhibits paired pulse depression (Uno et al., 1978). Second, receptors involved in the transmission between BG and the Mthal in birds are exclusively ionotropic (Luo and Perkel, 1999a). Third, and particularly importantly, electron microscopy studies show that synapses formed by GPi and SNpr terminals onto Mthal neurons in mammals are notably large and have giant (or multiple) synapses (Grofova and Rinvik, 1974; Kultas-Ilinsky and Ilinsky, 1990; Sakai et al., 1998; Bodor et al., 2008; Kuramoto et al., 2011; Rovo et al., 2012). The term "multiple synapses" has been chosen because every SNpr individual terminal forms between 5 and 20 synaptic contacts on Mthal neurons that are closely spaced on one bouton (Bodor et al., 2008; **Figure 1**). These multiple synapses are also concentrated on proximal dendrites and somata of Mthal neurons (Sakai et al., 1998; Bodor et al., 2008; Rovo et al., 2012). Moreover, these synaptic contacts are not separated by astrocytes (Bodor et al., 2008), which favors GABA spillover to perisynaptic ionotropic receptors and induces much larger inhibitory currents with a tonic inhibitory conductance (Farrant and Nusser, 2005). 
While ionotropic receptors are not generally thought to underlie LTS burst spiking because the GABA<sub>A</sub> current is too fast to cause the prolonged hyperpolarization required to deinactivate the T-type calcium channel, the multiple synapse arrangement, combined with the lack of astrocytes in close proximity to the synapse, favors a large amplitude hyperpolarization that is long enough to promote generation of an LTS burst (Jahnsen and Llinas, 1984a,b). Whether an LTS burst is generated at these giant synapses in mammals is not known, but it is one of several factors, such as synchronized GABAergic BG inputs and expression of GABA<sub>A</sub> versus GABA<sub>B</sub> receptors on Mthal neurons, that will affect generation of LTS bursts.
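
Why the duration of hyperpolarization matters for an LTS burst can be illustrated with a first-order sketch of T-type channel deinactivation. Everything numerical here is an assumption for illustration: the Boltzmann parameters (`v_half`, `slope`), the fixed time constant `tau_h_ms` (in real neurons it is voltage dependent), and the resting and hyperpolarized potentials are not values reported in the studies cited above.

```python
import math


def h_inf(v_mv: float, v_half: float = -81.0, slope: float = 4.0) -> float:
    """Steady-state availability of the T-type inactivation gate
    (illustrative Boltzmann function; parameters are assumptions)."""
    return 1.0 / (1.0 + math.exp((v_mv - v_half) / slope))


def availability_after_step(v_rest_mv: float, v_hyper_mv: float,
                            duration_ms: float, tau_h_ms: float = 50.0) -> float:
    """Gate availability at the end of a hyperpolarizing step.

    Uses the exact solution of dh/dt = (h_inf(V) - h) / tau_h for a
    constant voltage, starting from the steady state at rest.
    """
    h_start = h_inf(v_rest_mv)
    h_target = h_inf(v_hyper_mv)
    return h_target + (h_start - h_target) * math.exp(-duration_ms / tau_h_ms)


if __name__ == "__main__":
    for duration in (5.0, 50.0, 200.0):
        h = availability_after_step(-60.0, -85.0, duration)
        print(f"hyperpolarization for {duration:5.1f} ms -> availability {h:.2f}")
```

In this sketch a brief hyperpolarization leaves most channels inactivated, whereas a prolonged one recovers a large fraction of them, which is the condition a rebound burst requires once inhibition ends.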

The specialized synapse structure of GABAergic inputs to Mthal is even more obvious in birds. Indeed, Area X (equivalent of GPi in mammals) connects DLM neurons (equivalent of Mthal neurons in mammals) with calyx-like synapses at a 1:1 ratio, in which GABAergic multiple synaptic contacts are closely spaced and distributed all around the soma (Luo and Perkel, 1999a,b; Doupe et al., 2005). The avian brain provides a strategic advantage to examine communication between BG and Mthal because activity in the calyx-like synaptic terminals from Area X can be simultaneously recorded with the soma of DLM neurons (Luo and Perkel, 1999a,b; Doupe et al., 2005). Indeed, recording extracellular spikes in axon terminals is rare as these electrical signals are usually very small (up to a thousand times smaller) compared to somatic spikes (Hubel and Wiesel, 1961; Schomburg et al., 2012). Recordings in the avian brain show reliable transmission from BG to Mthal with some temporal specificity (Person and Perkel, 2007; Kojima and Doupe, 2009; Leblois et al., 2009; Goldberg and Fee, 2012), consistent with the idea that the calyx-like terminals from the BG can provide a driver-like input that controls Mthal spiking. However, the state of the animal may critically determine the effect of BG input on Mthal activity. In anesthetized birds where BG inputs have high firing rates (∼100 Hz) and the firing rate of Mthal neurons is low (∼5 Hz), BG inputs dominate spiking activity in Mthal (Person and Perkel, 2007; Kojima and Doupe, 2009; Leblois et al., 2009). In contrast, in awake singing birds, when BG firing rates are very high (∼300 Hz) and firing rates in the Mthal are also high (∼100 Hz), inputs from the cortex determine spiking activity in Mthal (Goldberg and Fee, 2012). These data indicate that excitatory and inhibitory inputs may have different consequences on Mthal activity depending on pre- and postsynaptic firing rates (Smith and Sherman, 2002; Guo et al., 2008; Goldberg et al., 2012). 
Notably, the firing rate of thalamic neurons in the awake bird is too high (∼100 Hz) to allow deinactivation of the T-type calcium channel for LTS bursts to occur (Goldberg and Fee, 2012).

The avian BG-thalamic synapse is a special case, and it remains necessary to determine the physiological characteristics of BG inputs to Mthal in mammals to further understand how Mthal neurons process inputs, particularly in the awake state and during movement execution. Nevertheless, the available data in Mthal show that BG afferents have several characteristics that are consistent with both modulator and driver-like roles. Therefore, like cerebellum, the BG-territory of Mthal appears to receive more than one source of input that could play a driver-like role. Because of the importance of the BG-Mthal circuit in BG pathology, such as PD, we now consider in more detail the roles BG inputs may play in modulating Mthal activity.

#### **CONTROL OF MTHAL ACTIVITY BY BG INPUTS: POSSIBLE MECHANISMS AND IMPLICATIONS**

There are currently three main mechanisms proposed for how BG could control Mthal activity: the rebound model focusing on LTS bursts, the gating model focusing on the disinhibitory role of BG inputs and the entrainment model focusing on the temporal role of BG inputs. The strongest evidence for the rebound model comes from the anesthetized or *in vitro* avian brain, where inhibition of Mthal neurons by BG inputs reliably evokes LTS bursts that are locked in time (Person and Perkel, 2005, 2007; Kojima and Doupe, 2009; Leblois et al., 2009). Through the specialized calyx-like terminal structure, these BG inputs cause prolonged hyperpolarization of Mthal neurons that triggers a rebound LTS burst of spikes. Therefore, BG afferents can be seen as indirect excitatory inputs due to their ability to trigger LTS bursts. We hypothesize that the role of LTS bursts may be complex and context dependent, in a similar way to synchronization of neuronal populations (Baker et al., 2001), occurring at precise, discrete periods during a movement. In this sense, BG GABAergic inputs to Mthal are not only consistent with most of the anatomical and physiological criteria of a driver input, but they may have the additional function of augmenting a functional movement-related signal when they trigger a high-frequency burst of spikes. BG inputs could thus increase the reliability of the synaptic transmission of Mthal neurons to downstream neurons in the cortex because they trigger bursts of spikes temporally locked to the offset of inhibitory inputs.

The rebound model has not yet been investigated in mammals but a model study has shown that LTS bursts may allow detection of an inhibitory drive (Smith and Sherman, 2002). The few available data report that LTS bursts in the Mthal of awake primates do occur, but at very low rates (Postupna and Anderson, 2002) or particularly when animals are drowsy (Joffroy and Lamarre, 1974; Strick, 1976; Schmied et al., 1979). This may reflect the fact that there are fewer LTS bursts in freely moving animals but it may also be due to limitations in detecting LTS single spikes using extracellular recording techniques. Following a prolonged hyperpolarized state, T-type calcium channels in Mthal neurons can partially activate and trigger one spike but not necessarily a burst of spikes (Llinas and Steriade, 2006). This issue will only be resolved when the changes in membrane potential underlying all spikes are recorded in awake animals, which requires extremely challenging patch clamp or intracellular recordings in behaving animals. Given the lack of data investigating the significance of LTS bursts in Mthal activity during execution of movements, studies need to analyze neuronal recordings for LTS bursts and LTS spikes in behaving animals to understand the impact of this firing pattern on downstream structures.

Another mechanism proposed for how BG could exert powerful control on Mthal activity is the gating model (Horak and Anderson, 1984; Deniau and Chevalier, 1985; Chevalier and Deniau, 1990; Hikosaka, 2007). In this model, BG outputs can indirectly excite Mthal neurons, through disinhibition. BG are assumed to inhibit the thalamus under basal conditions because SNpr and GPi display high spontaneous spiking rates (between 10 and 70 Hz in mammals) (Wichmann and Kliem, 2004; Avila et al., 2010), releasing GABA in the Mthal. However, when the BG network is activated by cortical input, SNpr and GPi outputs are transiently suppressed and downstream targets, including the Mthal, are disinhibited.

Consideration of the complete cortico-BG network leads to a more precise formulation of the expected functional impact of BG input in this model. There are three main pathways for information transmission through the BG: the hyperdirect, direct and indirect pathways (for review, see Nambu, 2004). These pathways vary in the number of synapses from input to output, leading to the possibility that a single input signal could lead to successive waves of varying output. Thus, the hyperdirect pathway directly excites the subthalamic nucleus, which in turn excites the BG output nuclei, the GPi and SNpr. The direct pathway synapses in the striatum, which then inhibits the BG output nuclei, with a longer latency than the hyperdirect pathway. Finally, the indirect pathway, which synapses in both the striatum and the external part of the globus pallidus, leads to disinhibition of BG output nuclei, at the longest latency (Fujimoto and Kita, 1992; Maurice et al., 1998, 1999; Kolomiets et al., 2003). At the level of the GPi and SNpr, the consequence of the sequential activation of these three pathways is thus an excitation–inhibition–excitation sequence (Maurice et al., 1998, 1999; Kolomiets et al., 2003). According to the gating model, this BG output would be expected to cause a mirror sequence of Mthal activity. Following phasic input from BG, baseline firing rate in Mthal activity would first be inhibited, then disinhibited, and finally inhibited again before activity returns to baseline levels (Schneider and Rothblat, 1996; Nambu, 2004). The putative disinhibition of the Mthal produced by the direct BG pathway is the key element of the gating model. The BG may act as a gate that "decides" when cortical afferents freely drive Mthal activity.
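
The mirror relationship predicted by the gating model can be sketched as a toy rate model. This is an illustration under stated assumptions, not a fitted model: the latencies, window width, amplitudes, baseline rates and the linear `gain` are all hypothetical, chosen only so that the three pathway effects arrive in the order hyperdirect, direct, indirect and so that the baselines fall within the ranges of firing rates quoted in the text.

```python
def bg_output_rate(t_ms: float, baseline_hz: float = 60.0,
                   latencies_ms=(5.0, 15.0, 25.0),
                   width_ms: float = 8.0, amplitude_hz: float = 30.0) -> float:
    """GPi/SNpr rate after a cortical input at t = 0 (hypothetical values).

    The three windows stand for the hyperdirect (excitation), direct
    (inhibition) and indirect (excitation) pathways, in order of latency.
    """
    signs = (+1.0, -1.0, +1.0)
    rate = baseline_hz
    for sign, onset in zip(signs, latencies_ms):
        if onset <= t_ms < onset + width_ms:
            rate += sign * amplitude_hz
    return max(rate, 0.0)


def mthal_rate(bg_hz: float, bg_baseline_hz: float = 60.0,
               mthal_baseline_hz: float = 20.0, gain: float = 0.5) -> float:
    """Mthal rate as a sign-inverted copy of BG deviations from baseline
    (assumed linear; the inversion reflects the GABAergic projection)."""
    return max(mthal_baseline_hz - gain * (bg_hz - bg_baseline_hz), 0.0)


if __name__ == "__main__":
    for t in (0.0, 8.0, 18.0, 28.0, 50.0):
        bg = bg_output_rate(t)
        print(f"t={t:4.1f} ms  BG={bg:5.1f} Hz  Mthal={mthal_rate(bg):5.1f} Hz")
```

With these assumed values, Mthal activity falls below baseline, rises above it and falls again, reproducing the inhibition, disinhibition, inhibition sequence described above. The middle window is the putative gate during which cortical afferents could drive Mthal neurons.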

The third mechanism to explain how BG control Mthal activity is the entrainment model. In this model, Mthal spiking is not positively correlated with BG activity, which would be expected for a driver input, but is instead restrained to a precise temporal window in which thalamic neurons can fire (Goldberg et al., 2012). It is postulated that BG input has an entrainment role because excitatory inputs that drive spiking of Mthal neurons interact with brief pauses of BG inhibition (Goldberg and Fee, 2012; Goldberg et al., 2012). This model has been extrapolated from studies in birds (Goldberg and Fee, 2012; Goldberg et al., 2012), where the activity of BG and Mthal is very high compared to mammals (∼300 and 100 Hz during singing, respectively). A comparable role of BG inputs on the temporal precision of Mthal activity has not yet been shown in mammals. However, in other brain areas receiving both glutamatergic and GABAergic inputs, interplay between the timing of these inputs has been shown to increase temporal precision of spiking activity in the postsynaptic neuron (Mainen and Sejnowski, 1995; Baufreton et al., 2005). The role of GABAergic inputs on Mthal activity needs to be explored further to determine if this temporal refining role of BG inputs applies in the Mthal of mammals.
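
The entrainment idea, that excitatory inputs evoke thalamic spikes only during brief pauses in BG inhibition, can be sketched with spike times alone. The rule used here is a deliberate simplification and an assumption: an excitatory event is taken to evoke a spike only if no BG spike occurred within the preceding `inhibition_window_ms`. The function name, the window length and the example spike times are hypothetical.

```python
from bisect import bisect_right


def permitted_spikes(excitatory_times_ms, bg_spike_times_ms,
                     inhibition_window_ms: float = 3.0):
    """Return the excitatory event times that fall in a pause of BG input.

    Assumption: each BG spike suppresses thalamic firing for
    `inhibition_window_ms` after it occurs.
    """
    bg = sorted(bg_spike_times_ms)
    permitted = []
    for t in excitatory_times_ms:
        i = bisect_right(bg, t)  # number of BG spikes at or before time t
        if i == 0 or t - bg[i - 1] > inhibition_window_ms:
            permitted.append(t)
    return permitted


if __name__ == "__main__":
    # Hypothetical BG spike train with a pause between 9 and 20 ms.
    bg_spikes = [0, 3, 6, 9, 20, 23, 26]
    excitatory_events = [4, 10, 15, 18, 24]
    print(permitted_spikes(excitatory_events, bg_spikes))
```

In this sketch the thalamic output is set by the timing of excitatory events, while the BG input only determines when those events are allowed through, so thalamic spiking is not positively correlated with BG rate.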

In summary, it remains unknown if BG inputs control Mthal activity by the rebound, gating or entrainment models in mammals. The few available data indicate that the mechanism of information processing from BG to Mthal is dependent on the behavioral state of the animal, which is also supported by a study suggesting that the excitability of Mthal neurons is a critical factor for determining how the Mthal processes inputs (Goldberg et al., 2012). Therefore, to significantly advance our understanding of how the Mthal processes information underlying the control of movement, it is important that future studies are conducted in mammals executing movement tasks.

## **A NEW THEORY: MTHAL INTEGRATES, RATHER THAN RELAYS, INFORMATION FROM CORTICAL, CEREBELLAR AND BG AFFERENTS**

Several features of Mthal connectivity indicate it may process information very differently to sensory thalamus. The anatomical and physiological features of inputs to Mthal are not consistent with the dichotomous driver/modulator characteristics derived from sensory thalamus. In particular, evidence suggests the presence of multiple drivers, from cortex and cerebellum in the cerebellar territory and from cortex and BG in the BG territory, as reviewed above. Therefore, we propose a new theory of information processing in the Mthal summarized diagrammatically in **Figure 4**, where layer V of the cortex, cerebellum and BG can all have a strong influence on Mthal activity.

In this model, Mthal acts as a "super-integrator", actively assimilating information from multiple inputs. We propose the term "super-integrator" for Mthal because BG and cerebellar networks also receive information about the initial motor programme from the cortex, and independently integrate the cortical signal before forwarding it to their respective Mthal territories. Functionally, BG input is assumed to be responsible for adding motivational context, due to the dense dopaminergic inputs they receive, to select the best action needed to achieve the required behavioral outcome (Hassler, 1978; Redgrave et al., 2010). Similarly, cerebellar input will provide complex proprioceptive information, processed from sensory afferents of the spinal cord, vestibular apparatus etc., so that the current position of the body in space is used to optimize the motor programme (Eccles, 1973; Braitenberg et al., 1997). **Figure 4** also shows that the BG and cerebellar territories of the Mthal receive a copy of the developing motor programme directly from respective functional areas of the associative, premotor and motor cortices via projections from pyramidal neurons in layer V. Mthal processes information from all of these highly integrated inputs, with the weighting of each input dependent on the context and required motor outcome. Then, projections from Mthal return highly refined "super-integrated" motor plans back to the cortex to update development of preparatory and performance parameters of the motor programme.

The model outlined above proposes that information from BG and cerebellar territories project in parallel pathways to their recipient cortical areas (see **Figures 2**, **4**). However, we need to consider in this model that BG and cerebellar territories of Mthal exhibit broadly similar movement-related modulations in activity during preparation and execution of movements (Anderson and Turner, 1991; Nambu et al., 1991). Given the lack of overlap of cerebellar and BG input to single Mthal neurons, this uniform activity may reflect very complex integration and information processing in the Mthal such as "spiraling" of information from the cortex to Mthal, other inputs that innervate both territories, and disparate inputs having similar functional consequences on Mthal activity. As discussed in section entitled Cortical Afferents have Driver and Modulator Characteristics, Depending on the Layer of Origin and summarized in **Figure 2**, corticothalamic efferents originating in layer VI transfer information from one functional area of cortex in a non-reciprocal relationship to another functional area in the Mthal, for example from cerebellum receiving Mthal to BG receiving Mthal, and *vice versa*, as depicted in **Figure 4**. Such iterative spiraling has been proposed to be important for fine-tuning an appropriate activation pattern of cortical neuronal networks that control agonist and antagonist muscles for a movement, and to contribute to response reaction times (Wickens et al., 1994). Inputs to both BG and cerebellar territories from other afferents, notably from the reticular thalamic nucleus, but also from the pedunculopontine nucleus, superior colliculus and locus coeruleus, may contribute to the similar movement-related activity in these territories as well (Lindvall et al., 1974; Rivner and Sutin, 1981; Pare et al., 1987; Steriade et al., 1988; Hazrati and Parent, 1991; Sommer, 2003). Furthermore, as discussed in section entitled Activity of the Mthal and its Main Afferents during Motor Behavior, while BG and cerebellum differ in the precise coding of a motor plan, they also have some similar movement-related modulations in activity, probably because they both receive input from motor cortex. Thus, integration of diverse and complex inputs by the Mthal could explain similarities in movement-related activity across both territories at the gross level, but still allow fine spatiotemporal differences, which are discussed below.

Importantly, various physiological mechanisms underpin this processing. Rather than uniformly additive integration in the Mthal, specific features of the inputs mean that they could converge to either boost a signal (e.g., cortical and cerebellar glutamatergic inputs) or act competitively (e.g., cortical glutamatergic inputs and GABAergic BG afferents). Furthermore, the input-output relationship resulting from this integration will vary greatly depending on the respective timing and the firing rate of both presynaptic afferents and the postsynaptic neuron, and on brain state, accounting for the variety of features seen under different experimental conditions. For instance, such interactions could account for the fact that Mthal activity in the awake, resting state is asynchronous throughout corticothalamic, cortico-BG-thalamic and cerebellothalamic networks, while it is highly synchronized under anesthesia (Steriade, 2006; Crunelli and Hughes, 2010; Rowland et al., 2010). In awake, resting states, cerebellar and BG output neurons have the ability to generate action potentials autonomously (Eccles, 1973; Raman and Bean, 1997; Atherton and Bevan, 2005; Bosch et al., 2011) and therefore have high spontaneous firing rates, whereas cortical inputs do not (Chen et al., 1996; Barth and Poulet, 2012). Inputs from BG and cerebellar afferents thus bombard their respective Mthal territories to prevent the cortex and Mthal from oscillating in synchrony.

In addition to the anatomical-functional considerations that underpin integration in the Mthal, as summarized in **Figure 4**, it is also important to consider temporal aspects of inputs and link these to spatial aspects to fully understand how the patterns of Mthal activity might arise. These temporal-spatial dimensions are likely to be important for determining the precise contribution BG and cerebellar Mthal territories make to motor programme development. With respect specifically to BG inputs, as reviewed in section entitled Control of Mthal Activity by BG Inputs: Possible Mechanisms and Implications, there is considerable evidence in support of a gating model in which temporal organization of activity in specific afferent pathways plays a crucial role; in particular, it has been proposed that Mthal is first inhibited by the hyperdirect pathway, then disinhibited by the direct BG pathway (Maurice et al., 1998, 1999; Kolomiets et al., 2003). This is followed by a further period of inhibition, mediated by the indirect BG pathway. Such serial events could underpin temporally precise coding of activity of specific individual muscles. In addition, there is also a superimposed spatial dimension to Mthal processing. Neurons in BG output nuclei define multiple parallel channels of motor information (Hoover and Strick, 1993), and in support of this idea, there is a lack of correlated activity between SNpr neurons during a cue-reward movement task (Nevet et al., 2007). These parallel channels of motor information from BG will also be reflected in the spatial location of Mthal neurons whose activity is modulated in relation to a movement, and each channel could have its own specific temporally organized pattern of activity on these separate BG pathways. 
Thus, we postulate that each Mthal neuron or small group of neurons receives a unique, integrated signal from BG output nuclei, and temporal and spatial mechanisms enhance the contrast between motor information to be promoted and suppressed in the Mthal. This could allow coding of temporally precise interrelationships of activity across different muscle groups, represented by spatially defined Mthal micro-domains.
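The serial order of events in the gating model described above can be made concrete with a minimal sketch. The phase windows (in ms) and the function names are hypothetical and chosen only for illustration; they are not values reported in the studies cited. The only facts taken from the text are the order of the three pathways and the inhibitory (GABAergic) sign of BG output onto Mthal.

```python
# Toy timeline of the gating model: hyperdirect -> direct -> indirect.
# Window edges (ms after cortical activation) are HYPOTHETICAL placeholders.
# The sign is the change in BG output activity: +1 = increased, -1 = decreased.
PHASES = [
    ("hyperdirect", 0, 10, +1),
    ("direct", 10, 30, -1),
    ("indirect", 30, 50, +1),
]


def active_phase(t_ms):
    """Return (pathway name, change in BG output) at time t_ms."""
    for name, start, end, sign in PHASES:
        if start <= t_ms < end:
            return name, sign
    return "baseline", 0


def mthal_effect(t_ms):
    """Translate the BG output change into its effect on Mthal.

    BG afferents are GABAergic, so more BG output means Mthal is
    inhibited and less BG output means Mthal is disinhibited.
    """
    _, sign = active_phase(t_ms)
    if sign > 0:
        return "inhibited"
    if sign < 0:
        return "disinhibited"
    return "unchanged"


def gating_sequence(times_ms):
    """Effect on Mthal at each requested time point."""
    return [mthal_effect(t) for t in times_ms]
```

Sampling one time point inside each hypothetical window, for example `gating_sequence([5, 20, 40, 60])`, walks through inhibition, disinhibition, renewed inhibition and the return to baseline. In a multi-channel version, each parallel BG channel would carry its own set of windows.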

The integrating role proposed for the Mthal is in stark contrast to the driver/modulator dichotomy and the relay function of sensory thalamic nuclei. This is perhaps not surprising given the different requirements that are likely to apply to motor and sensory functions, which would be expected to be reflected in the way these nuclei process information. Sensory thalamic nuclei need to relay information that accurately reflects stimulation of sensory receptors so that an animal correctly perceives its environment. In this case, relaying information through the sensory thalamus to the cortex with little integration will ensure that the neural code most accurately reflects stimuli at the original sensory receptor. This is achieved by each sensory thalamic nucleus having one defined driver that determines the overall activity in this nucleus, whereas all other afferents are modulators that fine-tune the neural activity. In contrast, for optimal motor function, a complex set of contextual information needs to be integrated so that the best motor programme can be activated. BG and cerebellar territories of the Mthal provide key sites to enable that to occur because they receive motivational and complex proprioceptive information about the movement that has been integrated in the BG and cerebellum and that will enhance activation of the motor programme to be promoted and suppress unwanted motor programmes. For the Mthal to effectively select the best motor programme, it cannot simply relay the information from one driver input. Instead, the precise spatial and temporal pattern of activation in Mthal will reflect a functional movement goal.

## **INFORMATION PROCESSING IN THE MTHAL IN PARKINSON'S DISEASE**

PD is characterized by three major motor symptoms: rigidity, slowing of movements and tremor at rest. In this section, we focus on the possible roles of BG inputs in the Mthal because BG activity is profoundly altered in PD and, although the cerebellum is a major input to Mthal, there is a paucity of data reporting cerebellar changes in PD.

PD is caused by the progressive degeneration of dopaminergic neurons from the substantia nigra pars compacta that innervate the striatum and the other BG nuclei (Albin et al., 1989). Because dopamine plays a crucial role in BG physiology, the dynamics of the BG network are profoundly altered in the PD condition (Obeso et al., 2000; Boraud et al., 2002; Walters et al., 2007). Given that the BG are a major afferent of Mthal, understanding Mthal pathological activity is likely to be central to fully understanding the neurophysiological origins of PD symptoms.

From the numerous studies that have examined BG activity in PD model animals and patients, the key pathological features of neural activity across BG nuclei can be summarized as follows:


The cortex has been studied less than the BG in PD animal models, but changes in bursty activity and synchronization (Goldberg et al., 2002; Parr-Brownlie and Hyland, 2005; Pasquereau and Turner, 2011) and loss of specificity (Goldberg et al., 2002) have also been found in the primary motor cortex, which is both an afferent to and recipient of inputs from the Mthal. Moreover, it appears that motor cortex activity is more synchronized with BG nuclei such as the subthalamic nucleus, striatum or SNpr, in PD than in control conditions (Magill et al., 2001; Tseng et al., 2001; Sharott et al., 2005; Dejean et al., 2008; Brazhnik et al., 2012). However, the relative contributions of altered cortico-BG-thalamocortical activity, loss of direct dopaminergic input to the cortex, or changes in wider networks in causing these cortical changes remain unclear.

Despite this evidence for widespread disruption of activity in structures afferent to Mthal, in the few studies performed to date, Mthal activity was not dramatically changed after dopamine depletion in animal models of PD. We might expect that Mthal neurons have lower firing rates, display bursty and oscillatory activity and exhibit LTS bursts for prolonged periods of time in PD, due to the increased activity of the output nuclei of the BG, and hence increased inhibitory tone in Mthal (Albin et al., 1989; Alexander and Crutcher, 1990; DeLong, 1990). One study in the awake monkey reports less specificity in receptive fields (Pessiglione et al., 2005), which is similar to reports in BG and cortex (Filion et al., 1988; Bergman et al., 1994; Abosch et al., 2002; Goldberg et al., 2002; Guehl et al., 2003). Other studies in anesthetized cats found decreases in Mthal firing rate (Voloshin et al., 1994; Schneider and Rothblat, 1996). These changes in Mthal activity are relatively mild, given the profound changes in neuronal activity in BG output nuclei. At present, there is no consensus on the effect of dopamine depletion on LTS bursts. Of two studies examining LTS burst occurrence in the Mthal of PD patients, one reported very few neurons exhibiting LTS bursts (Zirh et al., 1998), whereas the other found a high occurrence of LTS bursts (Magnin et al., 2000); this discrepancy is probably due to the different parameters used to detect LTS bursts. To date, no animal studies have addressed changes in, or the role of, LTS bursts in Mthal in PD.

Data recorded from patients during surgery to implant electrodes to treat motor symptoms, such as tremor or rigidity, provide useful information about the Mthal, but they do not enable a direct comparison between control and PD patients; typically, data from PD patients are compared to data from patients with another neurological disorder such as essential tremor or multiple sclerosis (Zirh et al., 1998; Raeva et al., 1998; Magnin et al., 2000; Brodkey et al., 2004; Hanson et al., 2012). A major finding of these human studies is that oscillatory signals occurring mainly in VLp neurons are coherent with the 3–8 Hz tremor and in synchrony with other Mthal neurons (Lenz et al., 1985, 1988, 1994; Zirh et al., 1998; Magnin et al., 2000; Marsden et al., 2000; Brodkey et al., 2004; Hanson et al., 2012). The origin of this oscillatory pattern of activity remains unknown; in particular, it has not been established whether the oscillatory activity is a cause or a consequence of the tremor. One hypothesis is that tremor arises from aberrant re-afferentation from cerebellar pathways to VLp, a pathway that is normally used for rapid voluntary movements (Volkmann et al., 1996). Another theory suggests that the cerebellothalamocortical network produces the signal underlying the tremor and the BG network triggers when tremor occurs (Helmich et al., 2012). Because the BG and the cerebellum do not converge directly in the Mthal, this transfer presumably involves the cortex (Helmich et al., 2012).

Mthal lesions have been used for almost sixty years to treat tremor in PD (Hassler and Riechert, 1955). In recent years deep brain stimulation (DBS) of the thalamus has superseded thalamotomy primarily because side effects can be reduced by changing the stimulation parameters or stopped altogether by turning the stimulator off (Benabid et al., 1991; Koller et al., 1999; Okun and Vitek, 2004). Comparison of thalamotomy and DBS effects across studies is difficult, even when MRI has been used to determine anatomical landmarks, because of the inconsistent nomenclatures and subtleties in the placement of nuclei boundaries used for the human thalamus between research groups (Okun and Vitek, 2004) and difficulty visualizing some nuclei using MRI (Marsden et al., 2000; Bardinet et al., 2011). DBS has the advantage that the electrodes can cover a large area of thalamic volume and postsurgical testing can determine the best leads for effective stimulation (Katayama et al., 2005). Thalamotomy and DBS are highly effective for reducing the amplitude of tremor (Beuter and Titcombe, 2003; Duval et al., 2006; Mure et al., 2011). The position of the thalamotomy is critical, rather than the size of the lesion, and correlates with the degree of improvement (Atkinson et al., 2002). Notably, VLp lesion effectively treats tremor (Markham et al., 1966; Atkinson et al., 2002; Okun and Vitek, 2004; Klein et al., 2012). Similarly, high frequency (100–150 Hz) DBS within VLp (and to a lesser extent, VLa) achieved the best improvement in tremor (Yamamoto et al., 2004; Katayama et al., 2005; Klein et al., 2012). Clinical improvement of parkinsonian tremor occurs within 2–4 weeks of thalamotomy or DBS surgery and remains for 5–10 years without lasting side effects (Kelly and Gillingham, 1980; Nagaseki et al., 1986; Pahwa et al., 2006). 
A strategy used in thalamotomy and DBS surgery to improve outcomes is to target VLp based on the presence of oscillatory neuronal activity in the tremor range (Lenz et al., 1995; Garonzik et al., 2002). VLp thalamotomy and DBS are also effective for treating rigidity and improving quality of life (Markham et al., 1966; Benabid et al., 1991; Atkinson et al., 2002; Okun and Vitek, 2004; Klein et al., 2012). In contrast, the effect of VLp thalamotomy or DBS on bradykinesia, akinesia and fine motor control remains unclear, with some studies reporting improvements (Perret, 1968; Perret et al., 1970), and other studies reporting no changes (Markham et al., 1966; Benabid et al., 1991; Beuter and Titcombe, 2003; Duval et al., 2006) or deleterious effects (van Someren et al., 1993; Boecker et al., 1997). Although the pathophysiology underlying PD tremor remains unknown, thalamotomy and DBS treatments support the idea that the VLp preferentially propagates oscillatory signals associated with tremor. It is possible that neural signals in other regions of Mthal, such as the VA and VLa, play a role in parkinsonian akinesia and bradykinesia (Bornschlegl and Asanuma, 1987; Canavan et al., 1989; Okun and Vitek, 2004).

There is accumulating evidence that Mthal oscillatory activity in beta (12–30 Hz) and gamma (30–100 Hz) ranges is altered in PD. In the cortex and BG of PD patients and animal models, beta and gamma range oscillatory activities are routinely recorded, and are altered following administration of dopaminergic drugs (Levy et al., 2000, 2002; Brown et al., 2001; Sharott et al., 2005; Marceglia et al., 2006; Weinberger et al., 2006; Kuhn et al., 2009; Avila et al., 2010; Giannicola et al., 2010; Jenkinson and Brown, 2011; Brazhnik et al., 2012; Jenkinson et al., 2013), which supports the view that these signals are pathological in PD. Similarly, oscillatory activity in the beta and gamma ranges has been reported in the cerebellar territory of the Mthal of patients (Paradiso et al., 2004; Kempf et al., 2009; Holdefer et al., 2010; Hanson et al., 2012; Brucke et al., 2013), and may also be pathological because beta and gamma activity decrease and increase, respectively, during movements (Paradiso et al., 2004; Brucke et al., 2013), and gamma activity increases following administration of dopaminergic medication (Kempf et al., 2009). These surgeries were performed to implant stimulating electrodes into the VLp to treat tremor, so we do not know if VM, VA or VLa also show changes in oscillatory activity in beta and gamma ranges. To better understand the pathophysiology of PD, we need to know if synchronized activity in the beta and gamma ranges in Mthal is caused by a specific property of Mthal neurons in only the cerebellar territory or arises from inputs from the BG, cortex and/or cerebellum.
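For readers who analyse spectral peaks from such recordings, the frequency ranges quoted in this section can be encoded as a small lookup. This is only a sketch: the band limits are those given in the text (tremor 3–8 Hz, beta 12–30 Hz, gamma 30–100 Hz), whereas the function name and the half-open interval convention, which assigns 30 Hz to gamma and leaves 8 Hz and 100 Hz unassigned, are assumptions made here to avoid overlap.

```python
# Band limits in Hz as quoted in the text. Treating each band as the
# half-open interval [low, high) is an ASSUMPTION, not a convention
# stated by the cited studies.
BANDS = {
    "tremor": (3.0, 8.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 100.0),
}


def classify_frequency(freq_hz):
    """Return the name of the band containing freq_hz, or None."""
    for name, (low, high) in BANDS.items():
        if low <= freq_hz < high:
            return name
    return None
```

Frequencies between the listed bands, such as 10 Hz, deliberately return `None` rather than being forced into the nearest band.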

Taken together, the available data suggest that in normal conditions Mthal receives unsynchronized and non-bursty afferent signals from the cortex and BG, and that in PD this is profoundly altered so that the Mthal receives highly synchronized, oscillatory inputs. To date, the impact of this change in afferent activity on Mthal does not appear to have been fully characterized. For example, an issue that needs to be explored during movements in PD patients or animal models is whether Mthal neurons display the bursty and oscillatory activity seen in the BG and cortex. Bursty activity in the Mthal could arise in two ways. First, if the cortex were the main driver, a bursty input would cause a coincident burst of activity in Mthal neurons, without necessarily causing LTS bursts. Second, if we consider the BG as a main driver, the profoundly bursty input from BG could produce a long inhibition in the Mthal that evokes LTS bursts during the interburst period, i.e., when BG output is silent. In addition, if we consider that the cortex, cerebellum and BG are all drivers, the occurrence of bursts and LTS bursts in Mthal will mainly depend on the timing of these inputs. It remains unclear how cerebellar activity is altered in PD. However, because BG and cortex seem to be synchronized and in-phase in PD (Williams et al., 2002; Goldberg et al., 2004; Walters et al., 2007; Dejean et al., 2008; Brazhnik et al., 2012), bursts from the cortex could be cancelled out by the strong, synchronized bursts from the BG. If this is the case, the occurrence and intensity of LTS bursts would not be as great as predicted. Recently, it has been proposed that pathological BG activity produces noise in the Mthal that compromises the fidelity of the Mthal output to the cortex, notably when a spike from the cortex gives rise to an LTS burst in the Mthal, due to pathological BG input (Guo et al., 2008; Rubin et al., 2012).

## **CONCLUSION AND FUTURE DIRECTIONS**

Mthal is a structure strategically situated between the BG, cerebellum and cortex, but how it processes information from these three structures remains to be determined. We have reviewed two current approaches to information processing in the Mthal. The first applies driver/modulator concepts from sensory thalamus to Mthal, and proposes that Mthal subregions are driven by one input and modulated by others. However, the data do not support such a simple functional distinction, and suggest instead the possibility that there may be multiple driver or driver-like inputs, including the possibility that BG may strongly influence activity in the BG territory of Mthal.

The second approach, derived mainly from considerations of the cortico-BG circuit, treats the BG-receiving Mthal as a conduit for BG activity by rebound, gating or entrainment mechanisms. To date, studies exploring these mechanisms have not considered the role of other inputs. The rebound mechanism proposes that BG trigger LTS bursts following a period of inhibition. The main drawback of this rebound model, however, is that LTS bursts seem to occur rarely in awake conditions. The gating model proposes that BG control Mthal activity through disinhibition when the direct pathway of BG is activated. In this model, the BG may act in cooperation with the cortex to drive Mthal neurons. A more recent model, called the entrainment model, proposes that when Mthal and BG neurons have a very high firing rate, Mthal is driven by the cortex and BG play an essential role in determining the time window in which thalamic spikes can occur.

Here, we propose a new approach that raises the possibility that BG inputs have a driver-like function in Mthal. Instead of treating Mthal as simply a relay structure, all inputs are important, with Mthal acting as an integrator of multiple inputs (each of which has already integrated multiple signals concerned with different aspects of motor control). In this view, Mthal emerges as a "super-integrator" of information from the cortex, the BG and the cerebellum. The cortex would initiate development of the motor programme, the cerebellar territory of the Mthal would process the complex proprioceptive information needed to produce an appropriate movement and the BG territory would process motivational information. All three pathways are necessary for motor learning and to evoke the optimal movement, and both Mthal territories send super-integrated signals back to the cortex (**Figure 4**). Furthermore, the open feedback loops involving the BG, cerebellum, Mthal and cortex ensure that motivational and proprioceptive aspects of the movement are incorporated into the highly integrated motor programme that develops in the Mthal.

This new approach to understanding Mthal function needs to apply across a range of behavioral states and pathophysiology. The Mthal shows modulations in firing rate with respect to movement, and although motor deficits are severe in most animal models of PD, the reported changes in Mthal activity are relatively minor. However, it is critical that future experiments investigating Mthal function are conducted in behaving animals with simultaneous recordings of Mthal, cortex, BG and cerebellum, to ensure that the processing of context-relevant information is explored in the Mthal.

Many aspects of the super-integrator model require further investigation. In particular, to test the hypothesis it is necessary to elucidate the respective weights and roles of cortical, cerebellar and BG afferents on Mthal activity and how they interact to fine-tune Mthal activity. Moreover, it is essential to know the functionality of the BG-Mthal giant synapse in mammals and to determine how and when LTS bursts occur in the Mthal. Ultimately, understanding how the Mthal processes information and the role of cortico-BG and corticocerebellar loops will improve our understanding of how the brain controls movement and the mechanisms underlying movement disorders such as PD.

#### **ACKNOWLEDGMENTS**

Supported by a grant from the Neurological Foundation of New Zealand.

#### **REFERENCES**


Georgopoulos, A. P. (1988). Neural integration of movement: role of motor cortex in reaching. *FASEB J.* 2, 2849–2857.


recipient and cerebellar-recipient zones of the motor thalamus. *Cereb. Cortex* doi: 10.1093/cercor/bhs287. [Epub ahead of print].


downstream effects. *Eur. J. Neurosci.* 36, 2213–2228. doi: 10.1111/j.1460-9568.2012.08108.x


chronization. *J. Neurosci.* 32, 1730–1746. doi: 10.1523/JNEUROSCI.4883-11.2012


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 July 2013; accepted: 24 October 2013; published online: 11 November 2013.*

*Citation: Bosch-Bouju C, Hyland BI and Parr-Brownlie LC (2013) Motor thalamus integration of cortical, cerebellar and basal ganglia information: implications for normal and parkinsonian conditions. Front. Comput. Neurosci. 7:163. doi: 10.3389/fncom.2013.00163*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2013 Bosch-Bouju, Hyland and Parr-Brownlie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*