
HYPOTHESIS AND THEORY article

Front. Neural Circuits, 27 June 2012
Volume 6 - 2012 | https://doi.org/10.3389/fncir.2012.00038

Oculomotor learning revisited: a model of reinforcement learning in the basal ganglia incorporating an efference copy of motor actions

  • Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA

In its simplest formulation, reinforcement learning is based on the idea that if an action taken in a particular context is followed by a favorable outcome, then, in the same context, the tendency to produce that action should be strengthened, or reinforced. While reinforcement learning forms the basis of many current theories of basal ganglia (BG) function, these models do not incorporate distinct computational roles for signals that convey context, and those that convey what action an animal takes. Recent experiments in the songbird suggest that vocal-related BG circuitry receives two functionally distinct excitatory inputs. One input is from a cortical region that carries context information about the current “time” in the motor sequence. The other is an efference copy of motor commands from a separate cortical brain region that generates vocal variability during learning. Based on these findings, I propose here a general model of vertebrate BG function that combines context information with a distinct motor efference copy signal. The signals are integrated by a learning rule in which efference copy inputs gate the potentiation of context inputs (but not efference copy inputs) onto medium spiny neurons in response to a rewarded action. The hypothesis is described in terms of a circuit that implements the learning of visually guided saccades. The model makes testable predictions about the anatomical and functional properties of hypothesized context and efference copy inputs to the striatum from both thalamic and cortical sources.

Introduction

One of the most fundamental problems an animal faces is how to modify its future actions based on the consequences of its past actions. One solution to this problem was first formulated by Edward Thorndike as the Law of Effect (Thorndike, 1911), according to which: “Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation.” The implementation of this principle, which forms the basis of reinforcement learning, instrumental learning, and stimulus-response learning (Sutton and Barto, 1998; Packard and Knowlton, 2002; Graybiel, 2008), requires three pieces of information: the action (response) that the animal makes, the context (situation) in which an action takes place, and an evaluation of the outcome (effect) of the action. Without any one of these components, learning of this type cannot occur.

Neural circuitry in the basal ganglia (BG) is intimately involved in the control of learned behaviors (Graybiel et al., 1994; Graybiel, 1998), and is thought to be essential for the modification of behavior through reinforcement (Barto, 1995; Doya, 2000; Daw and Doya, 2006). In the past few decades, a great deal of progress has been made in understanding the neural pathways that convey two of the key pieces of information noted above—outcome and context. An evaluation of the outcome of past actions is thought to be transmitted to the BG by neurons in dopaminergic brain centers (Wickens and Kötter, 1995; Reynolds et al., 2001; Hikosaka et al., 2006), whose firing rate signals the appearance of unexpected rewards or the absence of rewards that were expected (Montague et al., 1996; Schultz et al., 1997; Schultz, 2002; Cohen et al., 2012). The current situation, or context, in which the animal finds itself is thought to be transmitted to the striatum (the primary input structure of the BG) by a massive input from nearly all areas of sensory, motor, and premotor cortex (Graybiel et al., 1994; Wickens and Arbuthnott, 2010). In this framework, the term context includes all relevant information that might determine whether a particular action will lead to a positive outcome, including sensory stimuli, memory of recent past events, the time within a motor sequence, behavioral state, social context, and many others1.

The combination of reward signals with these cortical signals allows the BG to determine which patterns of cortical activity are associated with, or predictive of, a favorable outcome and thus to bias motor and cognitive circuits toward favorable actions (Lauwereyns et al., 2002; Watanabe et al., 2003; Samejima et al., 2005; Frank and O'Reilly, 2006). In many current models of BG function, the learning about which cortical states are favorable is thought to proceed by a modification of the strength of corticostriatal inputs under the control of dopaminergic inputs (Kreitzer and Malenka, 2008; Shen et al., 2008). In other words, cortical inputs that are associated with subsequent reward become strengthened, thus allowing the BG to detect patterns of cortical activity that lead to reward.

The difficulty with this conception of BG function is that the ultimate function of the BG is to shape behavior, and in order to learn which particular actions led to a reward, BG circuitry must know what actions the animal actually made. One possibility is that the “actor” that generates exploratory behaviors during learning is within the BG itself (Berns and Sejnowski, 1998; Hikosaka et al., 1999; Ito and Doya, 2011). This view is based on the highly influential “actor/critic” model of reinforcement learning (Barto, 1995; Sutton and Barto, 1998). However, it is unlikely that the BG is the origin of exploratory behaviors during learning. First, the brain contains many behavior-generating circuits distributed throughout motor cortex and the brainstem (Swanson, 2000). Second, many studies suggest that the BG may be involved in the generation of learned behaviors after learning, but that it is not a generator of spontaneous motor actions before learning. For example, lesions of the striatum can affect the ability of rats to learn the association between a visual cue and the correct response in a maze, but these lesions have little effect on their ability to engage in motor aspects of task training and navigation in the maze (Packard et al., 1989; McDonald and White, 1994; Packard and McGaugh, 1996). Nor do striatal lesions prevent the animal from learning spatial aspects of the task, which are instead affected by hippocampal lesions (Packard et al., 1989; Packard and Knowlton, 2002; Featherstone and McDonald, 2004). Similarly, lesions of the vocal-related BG in juvenile songbirds prevent subsequent vocal learning (Sohrabji et al., 1990; Scharff and Nottebohm, 1991), but have little immediate effect on the generation of exploratory vocal variability during learning (Goldberg and Fee, 2011). Lesions in adult birds, after learning, likewise have no effect on song production (Scharff and Nottebohm, 1991). Together, these findings are consistent with an emerging view that the BG may not be the source of motor actions, but may serve to select from, or modify through learning, motor programs generated elsewhere (Cools, 1980; Mink, 1996; Redgrave et al., 1999; Gurney et al., 2001; Brown et al., 2004; Grillner et al., 2005; Atallah et al., 2007).

If the BG does not originate motor actions, then, in order to have information about what actions were taken, it must receive a copy of motor commands generated by motor circuits located elsewhere. Indeed, the songbird BG receives an efference copy of signals generated by cortical premotor vocal circuits (Vates and Nottebohm, 1995; Hessler and Doupe, 1999; Fee and Scharff, 2010). These findings have inspired a simple model of vocal learning in the songbird in which BG circuitry receives three separate inputs: an efference copy of motor signals that drive exploratory song variations, a context signal indicating the time in the song, and a reinforcement signal carrying song performance information (Fee and Goldberg, 2011). The BG then integrates this information to determine which vocal variations at each time in the song result in better performance (Fee and Goldberg, 2011). In the songbird model, the results of this computation are then transmitted through the output pathways of the songbird BG to bias neural activity and direct synaptic plasticity within cortical motor circuits. In this view, the BG is not the generator of motor actions (the “actor”); rather, its role is to evaluate the outcome of actions generated elsewhere and to use the results of this computation to bias (or direct) activity in other motor circuits to improve the likelihood that future actions will lead to reward.

To explore the implications of this view for reinforcement learning in the mammalian BG, here I combine these same elements into a simple model of oculomotor learning in the primate that incorporates an efference copy of actions (eye movements). I chose the oculomotor system because there is a tremendous amount of information about the firing patterns of neurons in different parts of the BG pathway during oculomotor learning. I will first describe the basic elements of the current view of BG function in oculomotor control and highlight a potential weakness in our current mechanistic understanding of oculomotor learning. I will then review recent insights from the songbird that may solve this problem. Finally, I will return to the oculomotor system to show how a songbird-inspired model of BG function works well in this system, and potentially in other behaviors.

A Model of BG Control of Oculomotor Behavior

Much is known about the basic functional organization of the BG, which has been reviewed thoroughly elsewhere (Mink, 1996; Graybiel, 2005; Tseng and Steiner, 2010). Of particular importance are a series of studies from Hikosaka and Wurtz in which monkeys were trained to produce saccadic eye movements toward a visual target (Hikosaka and Wurtz, 1983a; Hikosaka et al., 2000). The BG controls eye movements via an inhibitory projection of the substantia nigra pars reticulata (SNr) onto intermediate layers of the superior colliculus (SC; Jayaraman et al., 1977; Graybiel, 1978; Chevalier et al., 1984; May and Hall, 1984). Neurons in the SNr exhibit a pause in their tonic activity prior to eye movement into a particular part of the visual field (Hikosaka and Wurtz, 1983a; Basso and Wurtz, 2002), releasing the SC from inhibition, thus driving or augmenting the performance of a saccade (Figures 1A,B). The movement fields of SNr neurons match the movement fields in the region of the superior colliculus to which they project (Robinson, 1972; Hikosaka and Wurtz, 1983b), consistent with the idea that SNr neurons are segregated into different output “channels” that can influence saccades in different directions (Figure 1A).

FIGURE 1

Figure 1. The basal ganglia can drive learned changes in visually guided saccades. (A) Schematic diagram of the direct pathway of an oculomotor circuit in the BG. The output of the BG can be thought of as having discrete motor “channels.” Shown are two channels that project to the superior colliculus and can drive saccades to the left or right. In this simple model, these channels can be driven by sensory inputs from cortex, illustrated here by neurons responding to the appearance of visual targets 1 and 2. (B) Neurons in the substantia nigra pars reticulata (SNr) are tonically active and inhibit the generation of saccades by the superior colliculus. SNr neurons can be inhibited by spiking in medium spiny neurons (MSNs) in the striatum, thus releasing the superior colliculus from inhibition (adapted from Hikosaka et al., 2000). (C) Illustration of a stimulus-response task in which only saccades in one direction (e.g., leftward saccades) are rewarded while saccades in the other direction are not rewarded. (D) During training, saccades in the rewarded direction become faster and are generated with a shorter latency than saccades in the unrewarded direction. This behavioral change is thought to be mediated by activation of MSNs in the rewarded “channel” by the appropriate cortical inputs. Images in panels (C,D) are taken from Lauwereyns et al. (2002).

SNr neurons receive inhibitory input from medium spiny neurons (MSNs) in the striatum, allowing MSNs to influence the behavior of downstream motor circuits2. For example, when monkeys are trained to make a saccade to a visual target, some MSNs generate a burst of activity (Hikosaka et al., 1989a) that inhibits downstream SNr neurons. The resulting “pause” in the tonic activity of SNr neurons leads to a dis-inhibition of saccade-generating neurons in the SC (Hikosaka et al., 2000; Figure 1B). The firing of MSNs is associated with increased speed and reduced latency of saccades (Watanabe et al., 2003).
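The disinhibition chain described above (MSN burst, SNr pause, SC release) can be illustrated with a toy rate model. This is only a sketch under stated assumptions: the function name `direct_pathway`, the threshold-linear units, and every numerical value (tonic SNr rate, synaptic weights, SC drive) are hypothetical choices made for illustration and are not taken from the recordings cited here.

```python
def relu(x):
    """Threshold-linear activation: firing rates cannot be negative."""
    return max(0.0, x)


def direct_pathway(msn_rate, snr_tonic=60.0, w_msn_snr=1.0,
                   sc_drive=30.0, w_snr_sc=0.5):
    """Return (snr_rate, sc_rate) for a given MSN rate (arbitrary units).

    All parameter values are illustrative assumptions.
    """
    # MSNs inhibit the tonically active SNr neurons.
    snr_rate = relu(snr_tonic - w_msn_snr * msn_rate)
    # SNr neurons in turn inhibit saccade-generating neurons in the SC.
    sc_rate = relu(sc_drive - w_snr_sc * snr_rate)
    return snr_rate, sc_rate


quiet = direct_pathway(msn_rate=0.0)   # MSN silent: SNr tonic, SC suppressed
burst = direct_pathway(msn_rate=50.0)  # MSN burst: SNr pauses, SC released
print("MSN silent ->", quiet)
print("MSN burst  ->", burst)
```

With these assumed numbers, a silent MSN leaves the SC fully suppressed, whereas an MSN burst lowers the SNr rate and lets the same SC drive produce activity.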

Importantly, MSNs are not active prior to spontaneous saccades (Hikosaka et al., 1989a), but become active and exhibit complex firing patterns under conditions in which the animal is rewarded for some behaviors and not others (Hikosaka et al., 1989b,c). For example, if a monkey is trained to make saccades to several different targets, but only saccades to one of these targets are rewarded (Figure 1C), the animal begins to make saccades more rapidly in the rewarded direction and more slowly in the unrewarded direction (Kawagoe et al., 1998, 2004; Figure 1D). In these experiments, many MSNs became active prior to saccades only in the rewarded direction, and only when the rewarded direction was into a particular part of the visual field (to the left, for example). A similar degree of selectivity for reward and saccade direction was expressed by neurons in the SNr, which exhibited a suppression of activity only for rewarded saccades in a particular direction (Sato and Hikosaka, 2002), and in the superior colliculus, which exhibited increased activity in the rewarded direction (Ikeda and Hikosaka, 2003). These neurons exhibit the precise firing patterns expected for the striatal, SNr, and SC neurons in the model shown in Figure 1A (Hikosaka et al., 2006).

MSNs that, after learning, exhibit enhanced activity prior to a saccade in only one direction (say, to the left), and only in blocks of trials in which that direction was rewarded, can be thought of as signaling the value of a leftward saccade in a particular context. As a population, these neurons signal the expected value of all the different actions the animal might perform (Samejima et al., 2005). It is specifically these “action-value” MSNs that are used in the circuits described in this paper (Figures 1–4). Other MSNs develop a range of responses to different aspects of the target stimuli and the task (Hikosaka et al., 1989b,c; Lauwereyns et al., 2002). For example, some neurons become active only after a specific action is taken, and the size of the response correlates with the size of the reward the animal received from that choice of action (“chosen-value” neurons; Lau and Glimcher, 2008; Cai et al., 2011). Yet other MSNs developed a large visual response to targets that cue a saccade in any rewarded direction, independent of saccade direction (Kawagoe et al., 1998; Kobayashi et al., 2007), thus signaling the predicted value of the cue. The activity patterns of “chosen-value” and “cue-value” MSNs likely play a key role in computing the difference between the actual and predicted rewards (reward prediction error; Schultz et al., 1997). These MSNs may reside in the patch/striosome part of the striatum, which projects to dopaminergic centers rather than to the SNr (Graybiel and Ragsdale, 1978; Gerfen et al., 1987).

In the oculomotor learning task described above, the target stimuli represent the context that determines whether a particular action (e.g., a saccade in a particular direction) will be rewarded. For example, the appearance of a stimulus (target 1), followed by a saccade to the left results in reward, but the appearance of another stimulus (target 2) followed by a saccade to the left does not result in reward. In general, the association between a context (stimulus) and a response is arbitrary. For example, monkeys can be trained to saccade toward or away from a particular stimulus (Kunimatsu and Tanaka, 2010), or can be trained to make saccades in a particular direction depending on the identity of an object image, rather than its location (Pasupathy and Miller, 2005).

Information about target stimuli in the oculomotor learning task is thought to be transmitted to the striatum by cortical inputs (Hikosaka et al., 2006). Thus the model shown in Figure 1A contains two cortical “units” that signal the appearance of each of the two targets3. Because the two cortical units represent arbitrary stimuli to which saccades can become associated, we assume that the connectivity from the cortical units to the MSNs is potentially all-to-all, and is initially weak. Let us imagine now that saccades to the left are rewarded when they occur after the appearance of target 1. In this case, we want to strengthen the connection from Ctx-1 to the left MSN (MSN-L), since activation of the left MSN will enhance the probability of generating (or increase the velocity of) a leftward saccade. One synaptic learning rule that has been hypothesized to implement this plastic change can be summarized as follows: strengthen a corticostriatal synapse whenever its presynaptic input is coincident with activity in the postsynaptic MSN, and is followed by reward (Houk et al., 1995; Wickens and Kötter, 1995; Hikosaka et al., 2006). In this case, the Ctx-1 to MSN-L synapse will be strengthened whenever target 1 appears (activating Ctx-1), and MSN-L happens to be active, causing the monkey to make a left saccade and resulting in a global dopamine reward signal. As desired, the strengthening of the Ctx-1 to MSN-L synapse will lead to activation of the left MSN, and a bias toward leftward saccades on future appearances of target 1.
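The learning rule summarized above (presynaptic cortical input, coincident postsynaptic MSN activity, subsequent reward) can be sketched in a few lines. The sketch assumes that the more active MSN determines the saccade, that reward is delivered only for a leftward saccade after target 1, and that MSN variability is uniform noise added to the synaptic drive. The function name, learning rate, noise amplitude, initial weight, and weight ceiling are all hypothetical and do not come from the cited models.

```python
import random


def train_classic_rule(n_trials=300, lr=0.1, noise=0.5, w_init=0.05, seed=0):
    """Pre x post x reward rule in which noisy MSN activity selects the saccade."""
    rng = random.Random(seed)
    msns = ("MSN-L", "MSN-R")
    # Initially weak, all-to-all corticostriatal weights.
    w = {(c, m): w_init for c in ("Ctx-1", "Ctx-2") for m in msns}
    for _ in range(n_trials):
        ctx = rng.choice(["Ctx-1", "Ctx-2"])  # which target appears
        # MSN activity = cortical drive plus trial-to-trial variability (assumed).
        act = {m: w[(ctx, m)] + rng.uniform(0.0, noise) for m in msns}
        winner = max(act, key=act.get)  # the more active MSN sets the saccade
        # Assumed contingency: only target 1 followed by a left saccade is rewarded.
        reward = 1.0 if (ctx == "Ctx-1" and winner == "MSN-L") else 0.0
        # Strengthen the synapse whose pre- and postsynaptic cells were both
        # active, but only when reward follows; cap the weight at 1.0.
        w[(ctx, winner)] = min(1.0, w[(ctx, winner)] + lr * reward)
    return w


weights = train_classic_rule()
for synapse, value in sorted(weights.items()):
    print(synapse, round(value, 3))
```

In this sketch only the Ctx-1 to MSN-L weight grows. The rule works here only because the MSNs themselves are assumed to be the source of the exploratory saccades.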

Implicit in this model of oculomotor learning is the idea that the two MSNs are the “actors” that choose which way the monkey will saccade during learning. Early in the learning process, the monkey does not know which way to saccade in order to receive a reward, and will thus tend to make a random choice on each trial. In this model there must be some mechanism that adds a “randomness” or variability to MSN activity. In one prescription of the learning process (Hikosaka et al., 2006), MSNs are thought to be activated by the sensory cortical inputs even early in learning, so the randomness involved in the activation of MSNs is specified to arise from trial-to-trial variations in the strength of these corticostriatal synapses. Of course, with the learning rule stated above, any mechanism would work in which random variations in saccade directions are caused by random variations in MSN activity.

It is unlikely, however, that MSNs in the striatum are the “variability generators” responsible for driving the random trial-and-error saccades early in the learning process. First, MSNs are largely silent in untrained animals (Hikosaka et al., 1989a). Second, SNr neurons do not appear to produce pauses prior to spontaneous saccades, as they do after learning (Hikosaka and Wurtz, 1983a). Furthermore, lesions of the SNr lead to increased spontaneous saccade generation (Hikosaka and Wurtz, 1985), rather than the decrease that might be predicted if the BG were the motor source of spontaneous saccades early during learning. It appears, therefore, that such trial-and-error saccades are not initiated in the striatum, but are likely generated by one of the many brain circuits that project to the superior colliculus and are capable of triggering or influencing saccade generation (Wurtz and Albano, 1980).

In order to learn the outcome of actions, striatal MSNs must know what actions an animal just took. But if MSNs don't generate the actions during learning, how do they get this information? The resolution of this paradox was made explicit in a recent model of vocal learning in the songbird (Fee and Goldberg, 2011). In the songbird, exploratory variability during singing is generated by a cortical brain region that also transmits an efference copy of motor commands to the BG. By receiving an efference copy of actions generated elsewhere in the brain, MSNs in the model are able to evaluate the outcome of these actions, and then appropriately influence the “actors,” even when those actors are circuits outside of the BG. I will now turn to a description of the songbird model before applying this principle to the problem of oculomotor learning.

A Model of Songbird BG Incorporating an Efference Copy of Motor Actions

Songbirds acquire their songs by vocal imitation, and it has been proposed that this is achieved by a reinforcement learning mechanism (Doya and Sejnowski, 1995, 2000; Tumer and Brainard, 2007; Fee and Goldberg, 2011). An essential brain area underlying vocal learning in the songbird is Area X (Sohrabji et al., 1990; Scharff and Nottebohm, 1991), a BG circuit with a high degree of homology with the mammalian BG (Figures 2A,B; Jarvis, 2004; Reiner et al., 2004; Doupe et al., 2005; Person et al., 2008). Area X includes both striatal medium spiny neurons (Farries and Perkel, 2002; Goldberg and Fee, 2010), as well as pallidal neurons that project to the thalamus (Luo and Perkel, 1999; Farries et al., 2005; Goldberg et al., 2010).

FIGURE 2

Figure 2. A model of vocal learning in the songbird. (A) Schematic diagram of nuclei involved in song production and song learning. (B) Hypothesized homology between songbird and mammalian brain areas (MC, motor cortex). (C) Firing patterns of a single corticostriatal neuron in LMAN during singing. Each row of the raster plot shows the spikes produced during a different rendition of the song (spectrogram shown at top). The high degree of variability in LMAN activity is thought to drive exploratory song variations during learning. (D) Firing patterns of seven different corticostriatal neurons in HVC during singing. Raster plot shows spikes produced during 10 sequential song renditions for each neuron. Note the highly stereotyped and sparse burst pattern of each neuron. The spiking of MSNs shows a similar degree of sparseness, potentially allowing MSNs to compute the value of LMAN fluctuations independently at each time in the song. (E) A simple hypothesized circuit for song learning. LMAN and HVC inputs converge, together with dopaminergic inputs from the VTA, onto a single medium spiny neuron (MSN) in Area X. The LMAN input to the MSN (hollow circle) arises as an axon collateral of the projection of LMAN to the motor pathway (RA). This efference copy input does not drive spiking in the MSN, but gates synaptic plasticity at the HVC input (filled circle). If LMAN activity is coincident with the HVC input, and leads to improved song performance (signaled by increased dopamine input), then the HVC-MSN synapse is strengthened. On future song renditions, the HVC input drives the MSN to spike, thus disinhibiting the thalamus and biasing the LMAN neuron to be more active at that time. (F) Schematic showing the closed topographical loops between LMAN, Area X and the thalamic nucleus DLM (Luo et al., 2001). This allows Area X to independently evaluate and bias activity in each different subregion of LMAN. Also shown are the hypothesized divergent inputs from HVC.

Area X receives two distinct glutamatergic inputs. One input arises from the lateral nucleus of the anterior nidopallium (LMAN; Vates and Nottebohm, 1995), a cortical area known to be important for vocal learning (Bottjer et al., 1984; Scharff and Nottebohm, 1991; Brainard and Doupe, 2002). A key function of LMAN in vocal learning is the generation of vocal babbling and exploratory variability in learning birds (Kao et al., 2005; Olveczky et al., 2005; Kao and Brainard, 2006; Tumer and Brainard, 2007; Aronov et al., 2008; Hampton et al., 2009; Stepanek and Doupe, 2010). Individual LMAN neurons project to the motor pathway and produce a collateral that terminates in Area X. During singing, these LMAN neurons generate highly variable patterns of activity (Figure 2C; Hessler and Doupe, 1999; Kao et al., 2005, 2008; Olveczky et al., 2005; Aronov et al., 2008) that drive variability in the vocal motor pathway (Sober et al., 2008; Olveczky et al., 2011). Thus, the input to Area X from LMAN is an efference copy of an ongoing motor signal that drives vocal exploration.

Importantly, complete bilateral lesions of Area X have little effect on vocal variability in juvenile birds, suggesting that the BG circuitry is not directly involved in the generation of vocal exploration during learning (Goldberg and Fee, 2011). Furthermore, local brain cooling within LMAN results in slowing of the characteristic timescales of vocal babbling, suggesting that the biophysical and circuit dynamics within LMAN are involved in generating vocal variability (Aronov et al., 2011). These experiments support the idea that the cortical nucleus LMAN is the “variability generator” that drives vocal exploration during learning.

A second input to Area X comes from nucleus HVC (used as a proper name), a cortical region that controls the temporal structure of the song (Margoliash and Yu, 1996; Hahnloser et al., 2002; Long and Fee, 2008; Long et al., 2010). The HVC neurons projecting to Area X burst very sparsely, many generating a single highly reliable burst of spikes at one or a few specific moments in the song (Kozhevnikov and Fee, 2007; Prather et al., 2008; Fujimoto et al., 2011; Figure 2D). This input has been hypothesized to serve as a “context” signal that carries information about the current time in the song (Kozhevnikov and Fee, 2007; Fee and Goldberg, 2011). Interestingly, MSNs in Area X also fire extremely sparsely during singing, producing at most one burst of spikes at a specific moment of the song (Goldberg and Fee, 2010). This pattern suggests that the spiking of MSNs is likely driven by HVC (“context”) inputs rather than LMAN (“efference copy”) inputs. Furthermore, it indicates that spiking in any one MSN is driven by a very small subset of HVC inputs that are coactive at one moment in the song.

In songbirds, Area X also receives a large dopaminergic projection from VTA (Gale et al., 2008), and several lines of evidence suggest that this input could be important for song learning (Ding et al., 2003; Harding, 2004; Gale and Perkel, 2005; Kubikova and Kostal, 2010; Kubikova et al., 2010). In particular, it has been suggested that this input may serve as a fast reward prediction error signal that carries real-time information about song performance (Gale and Perkel, 2010; Fee and Goldberg, 2011). The evaluation of song performance in auditory cortical areas could be transmitted to VTA via descending forebrain projections (Gale et al., 2008; Gale and Perkel, 2010; Las et al., 2011). With such a song evaluation signal, and an efference copy of the LMAN activity leading to vocal variability, Area X would be in a position to determine which variations lead to a better song outcome.

Of course, just as a leftward saccade might lead to reward after the appearance of target 1 but lead to no reward after target 2, a particular song variation generated by LMAN might make the song better at one point in the song, but make it worse at another point. Thus, the evaluation of LMAN activity would need to be carried out independently at every time point in the song. The sparse firing of MSNs and of HVC inputs to Area X would facilitate the temporal specificity of this computation (Fiete et al., 2004). It has been proposed that such a context-specific evaluation of LMAN activity could be implemented with a synaptic learning rule, perhaps related to gated Hebbian “triplet” learning rules (Farries and Fairhall, 2007; Fiete et al., 2007; Izhikevich, 2007; Redondo and Morris, 2011), that detects coincident activation of LMAN and HVC inputs, followed by a dopaminergic reward signal (Fee and Goldberg, 2011; Figure 2E). The result of this pattern of coincident inputs would be to strengthen HVC-to-MSN synapses such that the activity of a medium spiny neuron at a particular time would indicate that activity in the LMAN neuron at that time consistently led to a better song performance. Finally, the output of Area X would be transmitted back to LMAN through the thalamus to bias future LMAN-generated song variations in the direction of better song performance (Kao et al., 2005; Olveczky et al., 2005; Fee and Goldberg, 2011). Such biased vocal variability has been directly observed in learning birds (Andalman and Fee, 2009; Warren et al., 2011).
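The time-specific evaluation described above can be sketched for a single LMAN-MSN channel. The sketch makes several assumptions: the song is divided into a handful of discrete time steps, each with its own sparse HVC input; LMAN activity improves the song at only one hypothetical time step (`good_time`); the dopaminergic signal is binary; and the thalamic feedback simply raises the probability of LMAN activity in proportion to the learned HVC-to-MSN weight. None of the names or parameter values come from Fee and Goldberg (2011).

```python
import random


def train_song(n_renditions=300, n_times=5, good_time=2,
               lr=0.05, bias_gain=0.4, seed=0):
    """Gated rule: HVC(t) -> MSN weight grows when HVC, LMAN and reward coincide."""
    rng = random.Random(seed)
    w_hvc = [0.0] * n_times  # one HVC -> MSN weight per time step in the song
    for _ in range(n_renditions):
        for t in range(n_times):
            # Area X output biases LMAN through the thalamus (assumed linear bias).
            p_lman = 0.5 + bias_gain * w_hvc[t]
            lman = 1.0 if rng.random() < p_lman else 0.0  # variable LMAN burst
            hvc = 1.0  # the sparse HVC input for time t is active only at time t
            # Assumed evaluation: the song improves only if LMAN fires at good_time.
            dopamine = 1.0 if (lman and t == good_time) else 0.0
            # The efference copy (lman) gates potentiation of the context input.
            w_hvc[t] = min(1.0, w_hvc[t] + lr * hvc * lman * dopamine)
    p_final = [0.5 + bias_gain * w for w in w_hvc]
    return w_hvc, p_final


w_hvc, p_final = train_song()
print("HVC->MSN weights by time step:", [round(w, 2) for w in w_hvc])
print("P(LMAN active) by time step:  ", [round(p, 2) for p in p_final])
```

In this toy version the weight grows only at the time step where LMAN activity improves the song, and LMAN becomes biased toward firing at that time and no other.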

An important feature of the HVC and LMAN inputs to Area X is their pattern of axonal projections (Figure 2F). The projection from LMAN to Area X is local and topographically organized, as is the projection of LMAN to the robust nucleus of the arcopallium (RA), the myotopically organized output nucleus of the motor pathway (Vicario, 1991; Iyengar et al., 1999). Thus Area X can be considered as divided into discrete motor “channels.” Furthermore, the projections from Area X to the pallidorecipient thalamic nucleus DLM (medial portion of the dorsolateral thalamus) and from DLM back to LMAN are also topographically organized, such that the connections among these three nuclei form discrete closed loops (Johnson et al., 1995; Luo et al., 2001). The closed-loop nature of these projections allows MSNs to feed back and selectively influence the activity of the particular subset of LMAN neurons from which they receive inputs (Figure 2F).

A Novel Model of Oculomotor Learning That Incorporates Efference Copy of Motor Actions

Returning now to the problem of oculomotor learning, we can imagine that early during learning—before the monkey has learned the association between a visual target and the saccade direction that leads to reward—there is a “variability generator” that generates random “guesses” at saccade direction during each behavioral trial. It could do this by transmitting a command signal to the superior colliculus, analogous to the commands sent by LMAN to the songbird vocal motor pathway. In order to explain how the striatum can learn the value of the saccade guesses, I hypothesize that MSNs must receive an efference copy of these saccade commands, just as Area X receives an efference copy of LMAN activity. There are several brain regions that could potentially generate saccade “guesses” during learning, but one likely possibility is the cortical frontal eye fields (FEF). The FEF sends a topographically organized projection to the superior colliculus (Bruce and Goldberg, 1985; Komatsu and Suzuki, 1985), an efference copy of which is transmitted to the striatum (Kunzle and Akert, 1977; Alexander et al., 1986). In the model shown in Figure 3A, the role of the efference copy input is not to drive spiking activity in the MSN, but rather to gate synaptic plasticity at cortical context inputs. The efference copy input is envisioned as a glutamatergic synapse that could operate at a mechanistic level by depolarizing MSN dendrites sufficiently to “enable” corticostriatal plasticity (Charpier and Deniau, 1997; Reynolds et al., 2001; Plotkin et al., 2011), but not necessarily enough to drive spiking.

FIGURE 3

Figure 3. A model of oculomotor learning in the BG incorporating an efference copy signal. (A) “Random” saccades during learning are generated in cortical frontal eye fields (dark yellow, FEF). Efference copy inputs to the MSN (hollow circle, analogous to LMAN inputs to Area X) arise from a collateral of the descending motor commands from the FEF to the superior colliculus (SC). Context inputs to the MSN (filled circles, analogous to HVC inputs to Area X) arise from cortical neurons conveying sensory inputs. The output of the SNr biases saccade generation by a projection to intermediate “motor” layers of SC. (B) The hypothesized learning rule that incorporates efference copy, context, and reward signals. Coincident activation of context (CX) and efference copy (EC) inputs activates a transient eligibility trace (Etrace). If a reward signal (Reward) coincides with the eligibility trace, then the CX input is strengthened (ΔW_CX-MSN > 0). (C) Hypothesized sequence of events during learning. (1) Cortical neuron Ctx-1 becomes active, indicating the appearance of a particular target (i.e., Target 1). (2) The FEF generates a “random guess” at a saccade direction, in this case, to the left. This combination activates an eligibility trace in the Ctx-1 to MSN-L synapse. (3) If leftward saccades are rewarded in response to Target 1, the monkey receives a reward, resulting in increased spiking in dopaminergic VTA neurons. (4) The coincidence of the reward and eligibility trace results in strengthening of the Ctx-1 to MSN-L synapse. Thus, future appearances of Target 1 will bias the monkey to make a leftward saccade.

If a particular action occurring within a particular context leads to reward, then future occurrences of that context should cause the action to occur with a higher probability (Thorndike, 1911). At the synaptic level this could be achieved by the same learning rule described above for the songbird: strengthening the context inputs that were active simultaneously with the arrival of a motor efference copy signal (action), and are followed by a reward signal (Figure 3B). In the model of oculomotor learning shown in Figure 3A, the sequence of events during learning would occur as follows (Figure 3C): on one particular trial, the appearance of target 1 (which activates the Ctx-1 neurons) may be followed by a chance activation of the FEF-L neuron that initiates a saccade to the left. The left MSN will then simultaneously receive a context input from Ctx-1 and an efference copy input from FEF-L. If left saccades are rewarded after the appearance of target 1, this context-action pairing will be followed by a widespread dopaminergic reward signal from VTA/SNc. This combination of inputs would strengthen the Ctx-1 to MSN-L synapse. The Ctx-1 to MSN-R synapse will not be strengthened because, in this model, the efference copy of left saccades is transmitted only to the left MSN, and corticostriatal plasticity will only be enabled in this MSN (Figure 3C, right panel). The result of this learning rule is that future appearances of target 1 will activate the left MSN neuron, which would initiate leftward saccades by the direct action of SNr neurons on the SC. In short, by incorporating an efference copy signal, the model is able to learn, in a highly specific manner, the value of any action in any context, as long as MSNs controlling that action receive CX inputs from a neuron signaling that context.
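
The sequence of events in Figure 3C can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not an implementation from the paper: the second target (Ctx-2), the weight table `w`, the learning rate `lr`, the exploration probability `epsilon`, the greedy readout of the learned weights, and the function name `train` are all hypothetical choices made for illustration.

```python
import random

N_CONTEXTS = 2  # Ctx-1 and a second, hypothetical Ctx-2 (two visual targets)
N_ACTIONS = 2   # index 0 = leftward saccade (MSN-L), 1 = rightward (MSN-R)


def train(rewarded_action, n_trials=200, lr=0.1, epsilon=0.3, seed=0):
    """Return CX->MSN weights; rewarded_action[c] is the rewarded saccade for target c."""
    rng = random.Random(seed)
    # w[c][a]: strength of the synapse from context neuron c onto the MSN for action a.
    w = [[0.0] * N_ACTIONS for _ in range(N_CONTEXTS)]
    for _ in range(n_trials):
        c = rng.randrange(N_CONTEXTS)  # (1) a target appears
        # (2) The "variability generator" guesses; learned weights bias the guess
        # (assumed here to be a simple greedy choice with random exploration).
        if rng.random() < epsilon or w[c][0] == w[c][1]:
            a = rng.randrange(N_ACTIONS)
        else:
            a = 0 if w[c][0] > w[c][1] else 1
        # The efference copy reaches only the MSN in the chosen action's channel,
        # so only the (c, a) synapse acquires an eligibility trace.
        eligibility = [[0.0] * N_ACTIONS for _ in range(N_CONTEXTS)]
        eligibility[c][a] = 1.0
        # (3) Reward is a global signal, identical at every synapse.
        reward = 1.0 if a == rewarded_action[c] else 0.0
        # (4) Only CX synapses where trace and reward coincide are strengthened.
        for ci in range(N_CONTEXTS):
            for ai in range(N_ACTIONS):
                w[ci][ai] += lr * reward * eligibility[ci][ai]
    return w


weights = train(rewarded_action=[0, 1])  # target 1 -> left, target 2 -> right
```

In this sketch only the synapse linking each target to its rewarded saccade direction grows, while the synapse onto the other MSN stays at zero, which is the specificity described above.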

MSNs could also influence saccade direction via the pallidal projection to thalamic nuclei that project back to the FEF (Figure 4A; Alexander et al., 1986). This could serve to bias the “variability generator” in the FEF to generate leftward saccades after the appearance of target 1 through a BG-thalamocortical loop, in much the same way that Area X has been proposed to bias vocal variability from LMAN during learning (Andalman and Fee, 2009; Fee and Scharff, 2010; Fee and Goldberg, 2011; Warren et al., 2011). After learning, the BG would consistently drive activity in the FEF-L neuron after the appearance of target 1 (signaled by the activation of Ctx-1). Indeed, recordings of FEF neurons have revealed reward-related bias similar to that observed in the striatum (Ding and Hikosaka, 2006). This consistent pairing of activity in Ctx-1 and FEF-L could lead to strengthening of a direct connection from Ctx-1 to FEF-L, similar to the consolidation of LMAN driven bias into the direct HVC-to-RA projection, hypothesized in a recently proposed model of songbird vocal learning (Andalman and Fee, 2009; Fee and Scharff, 2010; Fee and Goldberg, 2011; Warren et al., 2011). This is also a mechanism by which often-repeated stimulus-response associations could be transformed into a cortically driven habitual behavior (Hikosaka et al., 2002; Yin and Knowlton, 2006; Graybiel, 2008).

Figure 4. Three additional models of BG circuits incorporating efference copy. (A) Efference copy comes from the FEF, as in Figure 3A, but the output of the BG acts to bias saccade generation in the FEF through the pallido-thalamo-cortical loop, rather than acting directly on the SC. (B) A model in which “random” saccades may be driven by any input to the SC, and efference copy signals to the BG arise from ascending tectothalamic and thalamostriatal pathways. In this model, the BG biases saccade generation by acting directly on the SC. (C) A model in which both efference copy inputs and context inputs arise from thalamostriatal pathways. This model is hypothesized to represent an evolutionarily early role for the BG in controlling brainstem-generated behavior.

Asymmetries Between Context and Efference Copy Inputs

This model incorporates two fundamental asymmetries between context (CX) inputs and efference copy (EC) inputs to the striatum. First, plasticity is only produced at CX inputs. In this model, the function of the BG circuit is to drive or bias a particular action (leftward saccade) in a given context (i.e., appearance of target 1). Such bias is naturally produced by strengthening the context input onto MSNs. From the perspective of learning an oculomotor stimulus-response association, it makes no sense to also strengthen the EC input to the MSN, the result of which would be that a spontaneous leftward saccade would tend to initiate another leftward saccade, independent of context.

A second essential asymmetry between CX and EC inputs relates to the convergence and divergence of these inputs onto MSNs. Namely, the projection of EC signals must be local within one motor channel of the BG, while the projection of CX signals must be highly divergent across many motor channels. Local projection of EC inputs within one motor channel of the striatum is required because, in order to learn the value of a saccade in a particular direction, the efference copy signal indicating a saccade in a particular direction needs to project precisely to the same MSNs that can influence that saccade direction in the future. For example, in the model shown in Figure 3A, if the efference copy inputs from the FEF each projected to both MSNs, there would be no specificity in the synaptic learning rule. Most generally, crosstalk of EC inputs across motor channels would have a detrimental effect, causing spurious actions to be learned. The highly topographic projection from LMAN to Area X exhibits precisely the kind of specificity suggested (Johnson et al., 1995; Luo et al., 2001), allowing MSNs to evaluate the effect of variability introduced into distinct “channels” of the motor pathway.
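
The loss of specificity caused by EC crosstalk can be made concrete with a single weight update. The following is a hedged sketch: the projection matrices `local` and `diffuse`, the function `update_weights`, and the learning rate `lr` are assumptions introduced here for illustration only.

```python
def update_weights(w_cx, context, action, reward, ec_projection, lr=0.1):
    """Apply one rewarded-trial update to a copy of the CX->MSN weights.

    ec_projection[a][m] is 1 if the efference copy of action a reaches MSN m,
    and 0 otherwise. The EC input only gates plasticity; it is never changed.
    """
    new_w = [row[:] for row in w_cx]
    for m in range(len(new_w[context])):
        gate = ec_projection[action][m]
        new_w[context][m] += lr * reward * gate
    return new_w


local = [[1, 0], [0, 1]]    # each FEF neuron projects only to its own channel
diffuse = [[1, 1], [1, 1]]  # hypothetical crosstalk: each projects to both MSNs

w0 = [[0.0, 0.0], [0.0, 0.0]]
# Target 1 (context 0), leftward saccade (action 0), rewarded.
w_local = update_weights(w0, 0, 0, 1.0, local)
w_diffuse = update_weights(w0, 0, 0, 1.0, diffuse)
```

With the local projection only the Ctx-1 to MSN-L weight increases. With the diffuse projection the Ctx-1 weights onto both MSNs increase equally, so the context no longer favors the rewarded direction.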

There is evidence for this type of channel specificity in mammalian BG circuits. In primates, for example, there is a coarse topographic organization of separate cortico-BG-thalamocortical loops for skeletomotor, oculomotor, prefrontal, and limbic circuits (Alexander et al., 1986). There is even evidence for some finer-grained topographic specificity within these larger loops. For example, multiple distinct pathways have been identified within the skeletomotor BG-thalamocortical circuit (Hoover and Strick, 1993), and there is some evidence for discrete topography in the output pathways of frontal eye fields (Robinson and Fuchs, 1969; Bruce and Goldberg, 1985; Komatsu and Suzuki, 1985; Schlag-Rey et al., 1992). It is unknown if motor and other coarse loops of the BG-cortical circuits exhibit the kind of fine-grained functional topography observed in the songbird, and predicted by the model proposed here.

In contrast to the local projections of EC inputs, the projection of CX signals must be highly divergent across many motor channels because context inputs to the striatum should have no intrinsic meaning in relation to actions. A particular action might lead to a reward in one context, but lead to an undesirable outcome in another context, and there may be little a priori knowledge about which contexts require which actions. It therefore seems adaptive to build in an enormous divergence in the projection of cortical context inputs onto MSNs. Indeed, the large degree of convergence of cortical projections onto striatal MSNs is widely recognized to be important (Goldman-Rakic and Selemon, 1986; Flaherty and Graybiel, 1991, 1993, 1994) in part because it may endow MSNs with an enormous capacity for pattern recognition (Kincaid et al., 1998; Zheng and Wilson, 2002; Bar-Gad et al., 2003) and identification of cortical “states” (Houk and Wise, 1995; Houk, 1995; Graybiel, 1998). From the perspective of our model, this could be used to link a wide variety of contexts to any action. In the songbird, for example, the projections from HVC to Area X are not topographically organized (Nottebohm et al., 1982; Luo et al., 2001), thus potentially allowing MSNs in every motor channel to evaluate LMAN activity at every time point in the song (Fee and Goldberg, 2011).

The view that the striatum receives functionally distinct cortical signals has already been proposed on the basis that cortical neurons produce two distinct types of projections to the striatum (Reiner et al., 2010)—one from pyramidal tract (PT) neurons in deep cortical layers and another from intratelencephalic (IT) neurons in layer 3 and upper layer 5 (Ramon y Cajal, 1911; Wilson, 1987; Cowan and Wilson, 1994; Levesque et al., 1996a,b; Levesque and Parent, 1998; Reiner et al., 2003; Parent and Parent, 2006). PT neurons (by definition) project to the spinal cord or to brainstem motor structures, and mediate the descending “motor” output of cortex. The axons of these neurons produce a fine collateral axon that terminates focally within the striatum, and typically forms a dense terminal plexus no more than 500 μm in diameter (Cowan and Wilson, 1994; Kincaid and Wilson, 1996; Parent and Parent, 2006). In contrast, IT neurons project only within the telencephalon, often to the contralateral cortical hemisphere (Gerfen and Wilson, 1996; Wright et al., 2001), and produce an axon collateral that projects diffusely within the striatum, typically over distances of several millimeters (Cowan and Wilson, 1994; Kincaid and Wilson, 1996; Parent and Parent, 2006). This pattern of striatal projections is consistent with the notion that PT neurons carry efference copy information and IT neurons carry context information. Thus, in the models shown in Figures 3A and 4A, the efference copy inputs to the striatum are hypothesized to be carried by topographically localized PT fibers from the FEF, while the context inputs would be carried by the diffuse IT fibers from a wide range of cortical areas, including potentially sensory and task-related frontal cortical areas.

Notably, the different firing patterns of PT and IT neurons in the motor and premotor cortex of primates also suggest that these inputs serve different functions (Reiner et al., 2010), perhaps related to their hypothesized role as efference copy and context signals, respectively. For example, the activity of PT neurons is more dense and more closely related to variations in motor activity, while that of IT neurons is sparser and perhaps more related to movement planning (Bauswein et al., 1989; Turner and DeLong, 2000; Beloozerova et al., 2003).

A Model of Motor Learning with a Thalamic Source of Efference Copy Signals

In relation to oculomotor learning, thalamostriatal projections are another particularly attractive candidate source of efference copy information. Several thalamic nuclei receive projections from the superior colliculus (McHaffie et al., 2005), in particular from the intermediate and deep layers in which neurons exhibit saccade-related motor activity (Harting et al., 1980; Krout et al., 2001). In primates, electrophysiological recordings in one of these thalamic nuclei (the lateral portion of the mediodorsal nucleus, MD) have confirmed the presence of neurons with robust saccade-related corollary discharge activity, producing a premotor burst of spikes tuned to saccades of a particular magnitude and direction (Sommer and Wurtz, 2002). While the focus of this work was on the role of MD in carrying efference copy signals to the frontal eye fields (Sommer and Wurtz, 2008), there is some evidence from anatomical studies in the cat that neurons in the homologous portions of the MD may also project to the striatum (Royce, 1983).

Figure 4B illustrates a simple model of oculomotor learning incorporating a thalamic source of efference copy information originating in the SC. The learning rule, and the logic by which action-specific learning happens in this circuit, is identical to the model with a cortical efference copy shown in Figures 3A and 4A. However, these previous models suffer from the difficulty that a large number of brain areas, besides the FEF, project to deep and intermediate layers of the superior colliculus (Wurtz and Albano, 1980), and can potentially influence saccade decisions. Thus, if efference copy information comes only from the FEF, the BG would not have access to information about saccades generated by these other components of the oculomotor system, and would not be able to learn from saccades generated by these other circuits. The advantage of the model shown in Figure 4B, in which efference copy signals arise from low-level brainstem motor systems and are transmitted back to the striatum through thalamostriatal pathways, is that the efference copy informs the striatum what action actually took place, not just what was instructed by the FEF.

Thalamostriatal inputs exhibit some features that may be consistent with a possible role in transmitting efference copy signals. The thalamostriatal projection forms excitatory glutamatergic synapses onto both direct- and indirect-pathway MSNs (Kemp and Powell, 1971; Wilson et al., 1983; Smith and Bolam, 1990; Doig et al., 2010; Reiner et al., 2010) that have properties distinct from corticostriatal synapses (Ding et al., 2008) and have been hypothesized to play a distinct role in striatal computations (Smith et al., 2011). Most importantly, striatal projections from some thalamic nuclei have been shown to produce a localized and dense terminal plexus (Deschenes et al., 1995; Deschenes and Bourassa, 1996; McFarland and Haber, 2001), similar to that described above for cortical PT neurons. Thus, there is some evidence that direct pathway MSNs receive topographically localized projections from the thalamus and topographically diffuse projections from IT-type cortical neurons, as required for the model shown in Figure 4B. We will return to a discussion of the indirect pathway in a later section of the paper.

The Ultrastructure of Striatal Inputs: Context and Efference Copy

In the model described above, context and efference copy inputs have a fundamental asymmetry in how they drive spiking activity in MSNs and how they undergo plasticity. Namely, context inputs are the primary drivers of spiking activity in MSNs, and are the only site of corticostriatal plasticity during learning. The model, as formulated, does not require that EC inputs drive spiking or undergo plastic changes. Such functional differences would likely be reflected in a structural asymmetry at the synaptic level. Of particular interest are reports that many thalamostriatal fibers terminate on the dendritic shafts of MSNs (Sadikot et al., 1992; Smith et al., 1994; Sidibe and Smith, 1996; Doig et al., 2010), while IT-type corticostriatal fibers synapse primarily on dendritic spines (Kemp and Powell, 1971; Reiner et al., 2003). How does this pattern relate to the hypothesized function of cortical and efference copy inputs? Because CX inputs onto MSNs are highly divergent, with each IT fiber probably contacting only a single spine on a given MSN (Kincaid et al., 1998; Zheng and Wilson, 2002), neighboring spines likely carry very different context signals. Thus, in order to avoid crosstalk between neighboring CX inputs, the postsynaptic signals mediating plasticity at these inputs should be highly restricted to a single synapse. Such localization of synaptic signals and synaptic apparatus is thought to be one of the most important physiological functions of synaptic spines (Yuste, 2011).

In contrast, in the proposed models, EC inputs can be treated computationally as a single cell-wide input to an entire MSN; there is no reason to isolate EC synapses from either CX inputs or other EC inputs. Thus, based on the computational role of CX and EC inputs, it would make sense for context inputs (e.g., from IT fibers) to terminate on MSN spines and for efference copy inputs to terminate on dendritic shafts. According to the earlier identification of LMAN inputs to Area X as an efference copy signal and HVC inputs as a context signal, two clear predictions of this model are that (1) LMAN axons should terminate preferentially onto dendritic shafts of Area X MSNs, and (2) the dendritic spines of these MSNs should be contacted primarily by HVC axons.

How might efference copy and context inputs to MSNs interact to produce the desired learning rule, depicted in Figure 3B? It has previously been observed that corticostriatal plasticity is strongly modulated by the hyperpolarization state of the postsynaptic MSN (Charpier and Deniau, 1997). EC inputs could therefore act to directly depolarize MSN dendrites, providing a widely distributed intracellular signal that could “enable” plasticity at corticostriatal context inputs, perhaps by pushing the dendrite into the “up” state (Wilson and Kawaguchi, 1996; Stern et al., 1998). Corticostriatal LTP is dependent on postsynaptic calcium (Charpier and Deniau, 1997), and postsynaptic calcium influx into spines following glutamatergic activation can be localized to a single spine, and is enhanced when the neuron is in a depolarized up state (Carter and Sabatini, 2004). Additionally, corticostriatal plasticity exhibits a strong dependence on the relative timing between cortical input and depolarization induced by backpropagating action potentials (Pawlak and Kerr, 2008) such that corticostriatal input followed by MSN spiking leads to long-term potentiation of cortical input. While Pawlak and Kerr interpreted these findings in terms of a Hebbian relation between presynaptic input and postsynaptic activity, such a mechanism could also result in selective potentiation of corticostriatal CX inputs at which presynaptic input is followed by dendritic depolarization due to an efference copy input, rather than from backpropagating action potentials.

Synaptic spines that received corticospinous CX input followed by depolarization from dendritic EC input would then be eligible for synaptic potentiation depending on the subsequent arrival of a reinforcement signal. It is known that corticostriatal LTP is dependent on dopaminergic signaling through D1-type receptors (Wickens and Kötter, 1995; Pawlak and Kerr, 2008). It has been suggested that postsynaptic calcium may constitute, or initiate, an “eligibility trace” (Houk et al., 1995; Wickens and Kötter, 1995; Suri and Schultz, 1999) that serves as a memory of prior corticostriatal activation until the later arrival of a dopaminergic reward signal. If the context signal is sufficiently sparse, such an eligibility trace allows for the strengthening of the correct synapses even if the reward signal is significantly delayed after the action occurs (Fiete et al., 2007; Fee and Goldberg, 2011). Dopaminergic inputs into the striatum have been reported to synapse preferentially onto the necks of dendritic spines (Freund et al., 1984), placing them in closer proximity to the site of cortical inputs than to the thalamostriatal synapses located on dendritic shafts (Smith et al., 1994, 2004).
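
The role of the eligibility trace in bridging a delayed reward can be sketched as follows. This is an illustrative assumption rather than a mechanism specified in the text: the exponential form of the decay, the time constant `tau`, the pairing window `pairing_window`, the learning rate `lr`, and the function name `weight_change` are all hypothetical.

```python
import math


def weight_change(cx_time, ec_time, reward_time,
                  tau=1.0, pairing_window=0.05, lr=0.1):
    """Change in one CX->MSN weight for a single trial (times in seconds).

    A trace is created only if the CX input is followed by EC-driven
    depolarization within `pairing_window`. The trace is assumed to decay
    exponentially with time constant `tau` until the reward arrives.
    """
    ec_lag = ec_time - cx_time
    if not (0.0 <= ec_lag <= pairing_window):
        return 0.0  # no CX-then-EC pairing, so the synapse is not eligible
    delay = reward_time - ec_time
    if delay < 0.0:
        return 0.0  # a reward that precedes the action cannot reinforce it
    trace_at_reward = math.exp(-delay / tau)
    return lr * trace_at_reward


immediate = weight_change(cx_time=0.0, ec_time=0.02, reward_time=0.02)
delayed = weight_change(cx_time=0.0, ec_time=0.02, reward_time=1.02)
unpaired = weight_change(cx_time=0.0, ec_time=0.5, reward_time=0.6)
```

Under these assumptions a delayed reward still strengthens the eligible synapse, only less than an immediate one, and a synapse whose CX input was not paired with an EC input is unchanged regardless of reward.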

Further evidence for the hypothesized relation between synaptic ultrastructure and the division of striatal inputs into context and efference copy comes from a closer examination of different types of thalamostriatal projections. While I have focused so far on potential thalamic sources of efference copy signals, it is possible that some thalamostriatal projections carry context signals. Context inputs from the thalamus might be expected to form diffuse widespread axonal arborizations in the striatum, just like cortical IT neurons. Notably, the literature on the thalamostriatal system in both rats (Deschenes et al., 1995) and in monkeys (McFarland and Haber, 2001) provides strong evidence for the existence of both diffuse and focal axonal arborizations within the striatum. In rats, for example, projections from the caudal intralaminar nuclei tend to make a focal, dense cluster of terminations that may make multiple contacts onto single MSNs (Deschenes et al., 1995; Parent and Parent, 2005). These thalamic areas receive input from middle and deep layers of the superior colliculus, and the striatal targets of these thalamic nuclei project to regions of the SNr that in turn project to the deep layers of the SC, thus forming a topographically ordered subcortical loop (McHaffie et al., 2005). This organization forms the anatomical basis of the feedback loop shown in Figure 4B, in which the topographically localized thalamostriatal projection serves as a motor efference copy signal.

In contrast to the focal, clustered terminal arborizations produced by neurons in the caudal intralaminar nuclei, projections to the striatum arising from several other thalamic nuclei make diffuse sparse projections (Deschenes et al., 1995; McFarland and Haber, 2001). One source of diffuse projections is the lateral posterior thalamus (LP), part of the extrageniculate visual system that receives input from the superficial exclusively visual layers of the SC (Graybiel, 1972; Abramson and Chalupa, 1988; Berson and Graybiel, 1991). The fact that the striatal projection from LP likely carries sensory information is consistent with the hypothesis that context inputs should project diffusely within the striatum.

Remarkably, studies of the synaptic ultrastructure of these two thalamostriatal inputs reveal an asymmetric pattern that correlates with the pattern of their axonal arborizations. As described earlier, the focal, clustered axonal arborizations of the caudal intralaminar nuclei—those potentially carrying efference copy information from deep layers of the SC—preferentially terminate on dendritic shafts of MSNs. In contrast, the diffuse projections from LP—thought to carry visual information—produce en-passant synapses preferentially onto the spines of striatal MSNs (Ichinohe et al., 2001; McHaffie et al., 2005), just as IT-type cortical fibers terminate preferentially onto synaptic spines of direct-pathway MSNs. Thus, a number of diverse findings on the circuit connectivity, axonal arborization patterns, and synaptic ultrastructure of different thalamostriatal circuits can be interpreted in light of the functional asymmetry between context and efference copy inputs hypothesized above.

One interesting corollary of the hypothesis that LTP at corticostriatal synapses is “gated on” by an excitatory input to dendritic shafts (such as an efference copy signal) is that LTP could also be “gated off” by an inhibitory input. Indeed, most inhibitory synapses onto MSNs, including those from other MSNs, terminate on dendritic shafts rather than on spines (Wilson and Goves, 1980), and are thus well suited to serve this function. In this case, spiking activity in one MSN would inhibit synaptic potentiation in the other MSNs to which it projects. It has long been suspected that inhibitory interactions between MSNs might introduce a winner-take-all mechanism that could increase sparseness and selectivity in MSN responses to cortical inputs (Wilson and Goves, 1980; Wickens et al., 1991). While recent evidence suggests that the lateral inhibition between MSNs is probably too sparse and weak to generate winner-take-all firing rate dynamics (Maass, 2000; Wilson, 2007; Plenz and Wickens, 2010), a competitive interaction that suppresses LTP of cortical inputs would likely require weaker lateral interactions, and could implement the previously hypothesized dimensionality reduction in the cortical-to-striatal transformation (Bar-Gad et al., 2003). Furthermore, when such lateral inhibition is coupled to the plasticity-promoting effect of an efference copy input, this competitive mechanism would tend to produce a compact, low-dimensional representation of the cortical context in which a particular action leads to reward.

The Indirect Pathway

It is interesting to speculate on how the indirect pathway might be integrated into the proposed view of BG function. In the classical division of the BG into direct and indirect pathways, tonically active pallidal output neurons in the GPi and SNr receive an inhibitory input from tonically active neurons in the external segment of the globus pallidus (GPe; Mink, 1996). These GPe neurons can, in turn, be inhibited by a distinct population of “indirect-pathway” MSNs expressing D2-type dopamine receptors. Cortical activation of indirect-pathway MSNs thus inhibits GPe neurons, causing increased spiking in the GPi/SNr output neurons and increased inhibition of downstream thalamic and motor targets (Alexander and Crutcher, 1990; Smith et al., 1998). Thus, activation of indirect-pathway MSNs has an effect exactly opposite that of activating direct-pathway MSNs, and is thought to be a mechanism to put a “brake” on downstream motor targets (Nambu, 2004).

For example, if a particular motor action produces a worse-than-expected outcome in a particular context, then strengthening of the corticostriatal CX inputs onto the appropriate indirect pathway MSNs would allow the context inputs to suppress that motor action in that context. More specifically, one can imagine a simple indirect-pathway counterpart to the models shown in Figures 3 and 4 in which EC and CX inputs converge onto indirect-pathway MSNs. Of course, for this model to work, the corticostriatal synapses onto indirect pathway MSNs would require a different learning rule than those onto direct pathway neurons. Namely, indirect-pathway corticostriatal LTP should result from simultaneous activation of a CX and EC input followed by the unexpected absence of a reward [signaled by a transient decrease in dopamine input (Schultz et al., 1997)], rather than the unexpected appearance of a reward (signaled by a transient increase in dopamine). Indeed, such differences in learning rules for direct- and indirect-pathway MSNs have been reported in studies of corticostriatal plasticity that distinguish between MSNs on the basis of the different types of dopamine receptors expressed in these two pathways (Shen et al., 2008).
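
The two complementary learning rules can be written side by side. In this minimal sketch, the representation of the dopamine signal as the difference between received and expected reward, the function name `pathway_updates`, and the learning rate `lr` are assumptions made for illustration.

```python
def pathway_updates(eligibility, reward, expected_reward, lr=0.1):
    """Return (direct, indirect) changes in CX->MSN weight for one trial.

    `eligibility` is 1.0 when CX and EC inputs were paired at the synapse
    and 0.0 otherwise. Dopamine is modeled as a transient increase
    (positive value) or a transient dip (negative value).
    """
    dopamine = reward - expected_reward
    # Direct pathway: LTP when the outcome is better than expected.
    d_direct = lr * eligibility * max(dopamine, 0.0)
    # Indirect pathway: LTP when an expected reward fails to appear.
    d_indirect = lr * eligibility * max(-dopamine, 0.0)
    return d_direct, d_indirect


unexpected_reward = pathway_updates(eligibility=1.0, reward=1.0, expected_reward=0.0)
omitted_reward = pathway_updates(eligibility=1.0, reward=0.0, expected_reward=1.0)
```

In this sketch an unexpected reward strengthens only the direct-pathway synapse, biasing the action in that context, while an omitted reward strengthens only the indirect-pathway synapse, allowing the same context to suppress the action.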

An additional prediction of this extended model is that EC inputs should form topographically localized projections onto both direct- and indirect-pathway MSNs. Using the arguments made above about thalamostriatal inputs, EC inputs might also be expected to form synapses onto the dendritic shafts in both MSN types. Indeed, some studies have indicated that thalamostriatal inputs form axodendritic synapses on both MSN types with roughly equal probability (Doig et al., 2010). However, other studies suggest that, while thalamic inputs to the striatum terminate on the dendritic shafts of direct-pathway MSNs, they terminate significantly less often onto indirect-pathway MSNs (Sidibe and Smith, 1996; Smith et al., 2004). In addition, the PT fibers that are the hypothesized cortical source of efference copy inputs preferentially contact indirect-pathway MSNs, and, furthermore, preferentially contact synaptic spines (Reiner et al., 2003, 2010). Of course, it is possible that PT fibers make sufficient contacts with direct-pathway MSNs (perhaps even onto dendritic shafts) to function as an EC input. But it is not clear, at this point, how the various reported differences between cortical and thalamic innervation of direct and indirect pathway MSNs can be related to the model proposed here.

One possibility, suggested by reports that PT inputs preferentially contact synaptic spines of indirect-pathway MSNs, is that these motor efference copy signals may also serve as a “context” signal in the indirect pathway, perhaps acting (as suggested by Reiner et al., 2010) to suppress specific motor channels that interfere with ongoing motor actions. More specifically, in the context of an ongoing motor action (represented in this case by a PT fiber acting as a context input), the occurrence of a conflicting motor action (represented by an efference copy input) would lead to a lower probability of reward. By the learning rule described above, this would lead to potentiation of the PT input onto the indirect-pathway MSN, such that during future occurrences of the ongoing motor action (the context), there would be a lower probability of generating the conflicting action. While this picture might explain the termination of PT axons onto the spines of indirect-pathway MSNs, it still violates the “principle” that context inputs should be topographically diffuse. Thus, while the anatomy of cortical and thalamic inputs to the direct pathway fits the proposed model reasonably well, a number of reported anatomical features of the inputs to the indirect pathway are not predicted by the model, as it is currently formulated.

Other Motor Systems

I have presented the argument that information about eye movements important for oculomotor learning may be transmitted from brainstem circuits as an efference copy through thalamostriatal pathways. Of course, the same view would likely apply to other motor systems as well. Central pattern generator circuits in the brainstem are capable of generating a wide range of behaviors: locomotion, turning, innate vocalizations, feeding, postural tone, and possibly even facial expressions, and other displays of emotions (Russell and Bullock, 1985; Grillner et al., 2005). It has been suggested (Grillner et al., 2005) that the BG play an important role in the selective and flexible control of this “tool box of motor infrastructure” (Takada et al., 1994; Swanson, 2000; Grillner, 2003; Hikosaka, 2007). If the hypothesis presented here for the oculomotor system is correct, then other brainstem motor circuits under the control of the BG should also send an efference copy of ongoing behaviors back to the striatum.

For example, the BG are thought to exert control over posture and locomotion via projections of the SNr to the pedunculopontine tegmental nucleus (PPN; Garcia-Rill, 1986; Takakusaki et al., 2003; Hikosaka, 2007) and the mesencephalic locomotory region (MLR; Takakusaki et al., 2004). Indeed, it has been noted that the PPN and perhaps other brainstem motor structures exhibit a remarkably parallel pattern of interactions with the BG as that seen with the superior colliculus (Winn et al., 2010). This includes the presence of feedback connections to the striatum passing through the thalamus (Erro et al., 1999; Mengual et al., 1999) that could potentially carry some forms of efference copy signals useful for learning.

Another area of motor function for which the BG have been hypothesized to be important is in learning action sequences (Berns and Sejnowski, 1998; Hikosaka et al., 1999). Efference copy signals transmitted to the striatum about ongoing motor actions could in principle be used to learn such sequences. More specifically, let us imagine a situation in which action B has a higher probability of yielding a reward when it follows action A. In this case, if an efference copy of action A is available as one of the context inputs to an MSN that controls action B, then the learning rule described above (Figure 3B) will lead to a strengthening of this action A → action B context input. In this way, simple pairs or short sequences of actions could be learned. The potential utility of efference copy signals as context inputs suggests that axons carrying EC signals may possibly serve both of these functions. It would be interesting to determine if some EC projections have axons that form a diffuse projection that synapses onto dendritic spines of MSNs and a focal projection that synapses onto dendritic shafts. Indeed, it has been suggested that individual thalamostriatal axons from the ventral anterior and the ventral lateral (VA/VL) thalamic nuclei may form both a focal and a diffuse projection (McFarland and Haber, 2001).
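
The idea that an efference copy of one action can act as the context for the next can be sketched with the same rule. The trial format, the action indices, the function name `learn_pairs`, and the learning rate `lr` below are hypothetical and serve only as an illustration.

```python
def learn_pairs(trials, n_actions=3, lr=0.1):
    """Learn action-pair weights from (previous, current, reward) trials.

    w[prev][cur] is the weight of the diffuse copy of action `prev`, acting
    as a context input, onto the MSN controlling action `cur`. The focal
    copy of `cur` gates plasticity in its own channel only.
    """
    w = [[0.0] * n_actions for _ in range(n_actions)]
    for prev, cur, reward in trials:
        w[prev][cur] += lr * reward
    return w


A, B, C = 0, 1, 2
trials = [
    (A, B, 1.0),  # B after A is rewarded
    (A, C, 0.0),  # C after A is not
    (C, B, 0.0),  # B after C is not
    (A, B, 1.0),
]
w_pairs = learn_pairs(trials)
```

After these trials only the weight from the copy of action A onto the MSN for action B has grown, so performing A would bias the system toward B.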

Evolutionary Implications

The basal ganglia and its subcomponents are highly conserved throughout vertebrate evolution, as are its interactions with downstream motor structures (Ganz et al., 2012; Stephenson-Jones et al., 2012). The potential role of the thalamostriatal system in transmitting efference copy signals arising from brainstem motor circuitry suggests an argument related to the evolution of the BG. The BG could have evolved to evaluate and reinforce brainstem-generated behaviors. This function could initially have been carried out using context and efference copy signals originating entirely in the brainstem and transmitted through the thalamus, rather than involving corticostriatal systems (Figure 4C). Consistent with this view, excitatory inputs to the striatum in amphibians originate almost entirely from thalamic nuclei, with comparatively little cortical input (Wilczynski and Northcutt, 1983; Reiner et al., 1998). For example, dorsal thalamic sensory relay nuclei, which in mammals and birds project principally to primary sensory cortices, project almost exclusively to the striatum in amphibians (Butler, 1994). Indeed, it has been suggested that the relatively minor projections to the striatum from specific sensory thalamic nuclei in mammals may be a remnant of the much larger striatal projection from these nuclei in ancestral amphibians (Reiner et al., 1998). In light of the model presented here, I would predict that the thalamostriatal projection in amphibians would also include substantial efference copy signals from intermediate layers of the tectum and other brainstem motor circuits. It would be further expected that sensory inputs and efference copy inputs from the thalamus in the frog would share the projection patterns and ultrastructural features described above for the mammalian LP and caudal intralaminar areas, respectively.

It is interesting that the amphibian pallium (the likely evolutionary precursor of neocortex) does not contain neurons that project out of the forebrain (Nieuwenhuys et al., 1998; Roth et al., 2007), and thus does not have the equivalent of pyramidal tract neurons by which mammalian cortical outputs can directly influence brainstem or spinal motor functions. In these animals, “cortical” access to brainstem/spinal motor circuits may be mediated, at least in part, by the small but extant telencephalic projection to the striatopallidum (i.e., the BG; Nieuwenhuys et al., 1998; Roth et al., 2007). A prediction of the model I describe here is that these “corticostriatal” projections would act as context inputs, and would exhibit the anatomical, ultrastructural, and functional properties of IT fibers in mammalian striatum.

Of course, sensory context signals arising from thalamic and subcortical circuitry would tend to be more transient and much simpler than the kinds of responses produced by cortex. Mammalian cortical circuits can generate highly sophisticated representations of behaviorally important context information, including short-term memory (Funahashi et al., 1991; Rainer et al., 1998; Romo et al., 1999), complex receptive fields (Tanaka, 1996), sensitivity to high-order combinations of sensory stimuli (Fitzpatrick et al., 1993), and object invariance (Freiwald and Tsao, 2010; Li and DiCarlo, 2010). The massive expansion of the pallium in amniotes (reptiles, birds, and mammals) could have been driven by the advantage of having such a rich set of context signals with which the striatum could evaluate the animal's actions. In turn, the expansion of context representations in cortex would have necessitated a corresponding increase in the number of MSNs, as reflected in the parallel evolutionary expansion of striatal and cortical sizes (Reiner et al., 1998).

As cortex evolved to carry out motor and executive functions, its inputs to the striatum would also need to include efference copy signals of descending motor commands, as well as richer context signals. Examples of the latter include representations of temporal order within sequential tasks, such as the signals transmitted from HVC to Area X in the songbird (Kozhevnikov and Fee, 2007; Fujimoto et al., 2011), and representations of the task or behavioral rules by circuits in prefrontal cortex (Miller and Cohen, 2001). These signals would allow even more sophisticated evaluations of actions, not just within the context of the external state of the world, but also in relation to the internal state of the animal, including emotional and social states and long-term goals.

Summary

The model I have presented here provides a very general framework by which the BG could learn to link specific contexts to actions, even if those actions are generated outside the striatum. It has long been hypothesized that the striatum receives two signals necessary for reinforcement learning: context signals that indicate the current state of the world and the animal, and an evaluation signal carrying information about rewards. Based on our developing understanding of the mechanisms of vocal learning in the songbird, here I hypothesize that the striatum receives an additional signal: an efference copy of motor command signals generated either in cortical or brainstem motor circuits. The role of this input is to allow the striatum to determine which actions, in which contexts, lead to a reward. I have proposed that this learning can be accomplished by a simple synaptic learning rule in which motor efference copy signals “enable” synaptic plasticity in context inputs to striatal MSNs. The proposed model generates a number of predictions about differences in the divergence of the projections of context and efference copy striatal inputs, namely that efference copy inputs should be topographically localized and that context inputs should be diffuse. The model also makes predictions about synaptic plasticity in these inputs that may be consistent with the known ultrastructure of cortical and thalamic inputs. Specifically, it is hypothesized that context inputs terminate on dendritic spines to provide precise localization of synaptic plasticity, while efference copy inputs terminate preferentially on the dendritic shafts of MSNs, providing a widely distributed cellular signal that controls plasticity at MSN spines, perhaps by transiently driving the MSN into an “up” state.
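As a compact restatement of the learning rule summarized above, one possible formalization is given below. It is an illustrative assumption rather than an equation taken from the model: the multiplicative form and all symbols are introduced here only for clarity.

```latex
\Delta w_{ij} = \eta \, R \, E_i \, C_j
```

Here $w_{ij}$ is the strength of context input $j$ onto MSN $i$, $C_j$ is the activity of that context input, $E_i \in \{0, 1\}$ indicates whether MSN $i$ received an efference copy of its associated action, $R$ is the reward signal, and $\eta$ is a learning rate. Because $E_i$ multiplies the update, context synapses change only on MSNs whose action was actually taken, while the efference copy synapses themselves are left unmodified.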

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

I thank Michael Farries, Jesse Goldberg, Anusha Narayan, and Michael Stetner for helpful discussions and comments on earlier versions of the manuscript. I also thank the three reviewers for their insightful comments and suggestions. This work was supported by funding from the NIH (R01MH067105).

Footnotes

  1. In this paper, the model of song learning incorporates the time in the song as the context, whereas the model of oculomotor learning incorporates a visual stimulus as the context.
  2. There are several different types of striatal MSNs, typically classified by their output targets and their expression of different dopamine receptor subtypes (Gerfen et al., 1990; Smith et al., 1998). So-called “direct pathway” MSNs project to motor centers, such as the SNr, and preferentially express D1-type dopamine receptors. “Indirect pathway” MSNs project to the external segment of the globus pallidus (GPe) and preferentially express D2-type dopamine receptors. The activity of indirect-pathway MSNs has a net inhibitory effect on motor output. Yet another group of MSNs resides in the patch/striosome part of the striatum (Graybiel and Ragsdale, 1978; Gerfen et al., 1987), and appears to project preferentially to midbrain dopaminergic centers rather than to motor-related pallidal outputs (Gerfen, 1985). The models shown in Figures 1–4 apply only to the direct-pathway D1 MSNs of the matrix (motor output) regions of the BG. I will return to a discussion of the indirect pathway in a later section.
  3. In this description of context inputs, I have made the simplification that the cortical neurons represent a visual response to the target cue. In the actual experiments carried out by Kawagoe et al. (1998), the monkeys performed a memory-guided saccade task, in which the cue was presented several seconds before the saccade was made. Thus, it may be more correct to think of the context inputs as coming from neurons that represent a short-term memory of the cue, rather than from neurons that have a direct visual response.

References

Abramson, B. P., and Chalupa, L. M. (1988). Multiple pathways from the superior colliculus to the extrageniculate visual thalamus of the cat. J. Comp. Neurol. 271, 397–418.

Alexander, G. E., and Crutcher, M. D. (1990). Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13, 266–271.

Alexander, G. E., DeLong, M. R., and Strick, P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381.

Andalman, A. S., and Fee, M. S. (2009). A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc. Natl. Acad. Sci. U.S.A. 106, 12518–12523.

Aronov, D., Andalman, A. S., and Fee, M. S. (2008). A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320, 630–634.

Aronov, D., Veit, L., Goldberg, J. H., and Fee, M. S. (2011). Two distinct modes of forebrain circuit dynamics underlie temporal patterning in the vocalizations of young songbirds. J. Neurosci. 31, 16353–16368.

Atallah, H. E., Lopez-Paniagua, D., Rudy, J. W., and O'Reilly, R. C. (2007). Separate neural substrates for skill learning and performance in the ventral and dorsal striatum. Nat. Neurosci. 10, 126–131.

Bar-Gad, I., Morris, G., and Bergman, H. (2003). Information processing, dimensionality reduction and reinforcement learning in the basal ganglia. Prog. Neurobiol. 71, 439–473.

Barto, A. G. (1995). “Adaptive critics and the basal ganglia,” in Models of Information Processing in the Basal Ganglia, eds J. C. Houk, J. L. Davis, and D. G. Beiser (Cambridge, MA: The MIT Press), 215–232.

Basso, M. A., and Wurtz, R. H. (2002). Neuronal activity in substantia nigra pars reticulata during target selection. J. Neurosci. 22, 1883–1894.

Bauswein, E., Fromm, C., and Preuss, A. (1989). Corticostriatal cells in comparison with pyramidal tract neurons: contrasting properties in the behaving monkey. Brain Res. 493, 198–203.

Beloozerova, I. N., Sirota, M. G., Swadlow, H. A., Orlovsky, G. N., Popova, L. B., and Deliagina, T. G. (2003). Activity of different classes of neurons of the motor cortex during postural corrections. J. Neurosci. 23, 7844–7853.

Berns, G. S., and Sejnowski, T. J. (1998). A computational model of how the basal ganglia produce sequences. J. Cogn. Neurosci. 10, 108–121.

Berson, D. M., and Graybiel, A. M. (1991). Tectorecipient zone of cat lateral posterior nucleus: evidence that collicular afferents contain acetylcholinesterase. Exp. Brain Res. 84, 478–486.

Bottjer, S. W., Miesner, E. A., and Arnold, A. P. (1984). Forebrain lesions disrupt development but not maintenance of song in passerine birds. Science 224, 901–903.

Brainard, M. S., and Doupe, A. J. (2002). What songbirds teach us about learning. Nature 417, 351–358.

Brown, J. W., Bullock, D., and Grossberg, S. (2004). How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades. Neural Netw. 17, 471–510.

Bruce, C. J., and Goldberg, M. E. (1985). Primate frontal eye fields. I. Single neurons discharging before saccades. J. Neurophysiol. 53, 603–635.

Butler, A. B. (1994). The evolution of the dorsal thalamus of jawed vertebrates, including mammals: cladistic analysis and a new hypothesis. Brain Res. Brain Res. Rev. 19, 29–65.

Cai, X., Kim, S., and Lee, D. (2011). Heterogeneous coding of temporally discounted values in the dorsal and ventral striatum during intertemporal choice. Neuron 69, 170–182.

Carter, A. G., and Sabatini, B. L. (2004). State-dependent calcium signaling in dendritic spines of striatal medium spiny neurons. Neuron 44, 483–493.

Charpier, S., and Deniau, J. M. (1997). In vivo activity-dependent plasticity at cortico-striatal connections: evidence for physiological long-term potentiation. Proc. Natl. Acad. Sci. U.S.A. 94, 7036–7040.

Chevalier, G., Vacher, S., and Deniau, J. M. (1984). Inhibitory nigral influence on tectospinal neurons, a possible implication of basal ganglia in orienting behavior. Exp. Brain Res. 53, 320–326.

Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B., and Uchida, N. (2012). Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88.

Cools, A. R. (1980). Role of the neostriatal dopaminergic activity in sequencing and selecting behavioural strategies: facilitation of processes involved in selecting the best strategy in a stressful situation. Behav. Brain Res. 1, 361–378.

Cowan, R. L., and Wilson, C. J. (1994). Spontaneous firing patterns and axonal projections of single corticostriatal neurons in the rat medial agranular cortex. J. Neurophysiol. 71, 17–32.

Daw, N. D., and Doya, K. (2006). The computational neurobiology of learning and reward. Curr. Opin. Neurobiol. 16, 199–204.

Deschenes, M., and Bourassa, J. (1996). Striatal and cortical projections of single neurons from the central lateral thalamic nucleus in the rat. Neuroscience 72, 679–687.

Deschenes, M., Bourassa, J., and Parent, A. (1995). Two different types of thalamic fibers innervate the rat striatum. Brain Res. 701, 288–292.

Ding, J., Peterson, J. D., and Surmeier, D. J. (2008). Corticostriatal and thalamostriatal synapses have distinctive properties. J. Neurosci. 28, 6483–6492.

Ding, L., and Hikosaka, O. (2006). Comparison of reward modulation in the frontal eye field and caudate of the macaque. J. Neurosci. 26, 6695–6703.

Ding, L., Perkel, D. J., and Farries, M. A. (2003). Presynaptic depression of glutamatergic synaptic transmission by D1-like dopamine receptor activation in the avian basal ganglia. J. Neurosci. 23, 6086–6095.

Doig, N. M., Moss, J., and Bolam, J. P. (2010). Cortical and thalamic innervation of direct and indirect pathway medium-sized spiny neurons in mouse striatum. J. Neurosci. 30, 14610–14618.

Doupe, A. J., Perkel, D. J., Reiner, A., and Stern, E. A. (2005). Birdbrains could teach basal ganglia research a new song. Trends Neurosci. 28, 353–363.

Doya, K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr. Opin. Neurobiol. 10, 732–739.

Doya, K., and Sejnowski, T. (1995). A novel reinforcement model of birdsong vocalization learning. Adv. Neural Inf. Process. Syst. 7, 101–108.

Doya, K., and Sejnowski, T. J. (2000). “A computational model of avian song learning,” in The New Cognitive Neurosciences, ed M. Gazzaniga (Cambridge, MA: MIT Press), 469–482.

Erro, E., Lanciego, J. L., and Gimenez-Amaya, J. M. (1999). Relationships between thalamostriatal neurons and pedunculopontine projections to the thalamus: a neuroanatomical tract-tracing study in the rat. Exp. Brain Res. 127, 162–170.

Farries, M. A., Ding, L., and Perkel, D. J. (2005). Evidence for “direct” and “indirect” pathways through the song system basal ganglia. J. Comp. Neurol. 484, 93–104.

Farries, M. A., and Fairhall, A. L. (2007). Reinforcement learning with modulated spike timing dependent synaptic plasticity. J. Neurophysiol. 98, 3648–3665.

Farries, M. A., and Perkel, D. J. (2002). A telencephalic nucleus essential for song learning contains neurons with physiological characteristics of both striatum and globus pallidus. J. Neurosci. 22, 3776–3787.

Featherstone, R. E., and McDonald, R. J. (2004). Dorsal striatum and stimulus-response learning: lesions of the dorsolateral, but not dorsomedial, striatum impair acquisition of a stimulus-response-based instrumental discrimination task, while sparing conditioned place preference learning. Neuroscience 124, 23–31.

Fee, M. S., and Goldberg, J. H. (2011). A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience 198, 152–170.

Fee, M. S., and Scharff, C. (2010). The songbird as a model for the generation and learning of complex sequential behaviors. ILAR J. 51, 362–377.

Fiete, I. R., Fee, M. S., and Seung, H. S. (2007). Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. J. Neurophysiol. 98, 2038–2057.

Fiete, I. R., Hahnloser, R. H. R., Fee, M. S., and Seung, H. S. (2004). Temporal sparseness of the premotor drive is important for rapid learning in a neural network model of birdsong. J. Neurophysiol. 92, 2274–2282.

Fitzpatrick, D. C., Kanwal, J. S., Butman, J. A., and Suga, N. (1993). Combination-sensitive neurons in the primary auditory cortex of the mustached bat. J. Neurosci. 13, 931–940.

Flaherty, A. W., and Graybiel, A. M. (1991). Corticostriatal transformations in the primate somatosensory system. Projections from physiologically mapped body-part representations. J. Neurophysiol. 66, 1249–1263.

Flaherty, A. W., and Graybiel, A. M. (1993). Two input systems for body representations in the primate striatal matrix: experimental evidence in the squirrel monkey. J. Neurosci. 13, 1120–1137.

Flaherty, A. W., and Graybiel, A. M. (1994). Input-output organization of the sensorimotor striatum in the squirrel monkey. J. Neurosci. 14, 599–610.

Frank, M. J., and O'Reilly, R. C. (2006). A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behav. Neurosci. 120, 497–517.

Freiwald, W. A., and Tsao, D. Y. (2010). Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science 330, 845–851.

Freund, T. F., Powell, J. F., and Smith, A. D. (1984). Tyrosine hydroxylase-immunoreactive boutons in synaptic contact with identified striatonigral neurons, with particular reference to dendritic spines. Neuroscience 13, 1189–1215.

Fujimoto, H., Hasegawa, T., and Watanabe, D. (2011). Neural coding of syntactic structure in learned vocalizations in the songbird. J. Neurosci. 31, 10023–10033.

Funahashi, S., Bruce, C. J., and Goldman-Rakic, P. S. (1991). Neuronal activity related to saccadic eye movements in the monkey's dorsolateral prefrontal cortex. J. Neurophysiol. 65, 1464–1483.

Gale, S. D., and Perkel, D. J. (2005). Properties of dopamine release and uptake in the songbird basal ganglia. J. Neurophysiol. 93, 1871–1879.

Gale, S. D., and Perkel, D. J. (2010). A basal ganglia pathway drives selective auditory responses in songbird dopaminergic neurons via disinhibition. J. Neurosci. 30, 1027–1037.

Gale, S. D., Person, A. L., and Perkel, D. J. (2008). A novel basal ganglia pathway forms a loop linking a vocal learning circuit with its dopaminergic input. J. Comp. Neurol. 508, 824–839.

Ganz, J., Kaslin, J., Freudenreich, D., Machate, A., Geffarth, M., and Brand, M. (2012). Subdivisions of the adult zebrafish subpallium by molecular marker analysis. J. Comp. Neurol. 520, 633–655.

Garcia-Rill, E. (1986). The basal ganglia and the locomotor regions. Brain Res. 11, 47–63.

Gerfen, C. R. (1985). The neostriatal mosaic. I. Compartmental organization of projections from the striatum to the substantia nigra in the rat. J. Comp. Neurol. 236, 454–476.

Gerfen, C. R., Engber, T. M., Mahan, L. C., Susel, Z., Chase, T. N., Monsma, F. J., and Sibley, D. R. (1990). D1 and D2 dopamine receptor-regulated gene expression of striatonigral and striatopallidal neurons. Science 250, 1429–1432.

Gerfen, C. R., Herkenham, M., and Thibault, J. (1987). The neostriatal mosaic: II. Patch- and matrix-directed mesostriatal dopaminergic and non-dopaminergic systems. J. Neurosci. 7, 3915–3934.

Gerfen, T. S., and Wilson, C. J. (1996). “The basal ganglia,” in Handbook of Chemical Anatomy. Integrated System of the CNS, Part III, eds L. W. Swanson, A. Bjorklund, and T. Hokfelt (Amsterdam: Elsevier), 360–466.

Goldberg, J. H., Adler, A., Bergman, H., and Fee, M. S. (2010). Singing-related neural activity distinguishes two putative pallidal cell types in the songbird basal ganglia: comparison to the primate internal and external pallidal segments. J. Neurosci. 30, 7088–7098.

Goldberg, J. H., and Fee, M. S. (2010). Singing-related neural activity distinguishes four classes of putative striatal neurons in the songbird basal ganglia. J. Neurophysiol. 103, 2002–2014.

Goldberg, J. H., and Fee, M. S. (2011). Vocal babbling in songbirds requires the basal ganglia-recipient motor thalamus but not the basal ganglia. J. Neurophysiol. 105, 2729–2739.

Goldman-Rakic, P. S., and Selemon, L. D. (1986). Topography of corticostriatal projections in nonhuman primates and implications for functional parcellation of the neostriatum. Cereb. Cortex 5, 447–466.

Graybiel, A. M. (1972). Some ascending connections of the pulvinar and nucleus lateralis posterior of the thalamus in the cat. Brain Res. 44, 99–125.

Graybiel, A. M. (1978). Organization of the nigrotectal connection: an experimental tracer study in the cat. Brain Res. 143, 339–348.

Graybiel, A. M. (1998). The basal ganglia and chunking of action repertoires. Neurobiol. Learn. Mem. 70, 119–136.

Graybiel, A. M. (2005). The basal ganglia: learning new tricks and loving it. Curr. Opin. Neurobiol. 15, 638–644.

Graybiel, A. M. (2008). Habits, rituals, and the evaluative brain. Annu. Rev. Neurosci. 31, 359–387.

Graybiel, A. M., Aosaki, T., Flaherty, A. W., and Kimura, M. (1994). The basal ganglia and adaptive motor control. Science 265, 1826–1831.

Graybiel, A. M., and Ragsdale, C. W. J. (1978). Histochemically distinct compartments in the striatum of human, monkeys, and cat demonstrated by acetylthiocholinesterase staining. Proc. Natl. Acad. Sci. U.S.A. 75, 5723–5726.

Grillner, S. (2003). The motor infrastructure: from ion channels to neuronal networks. Nat. Rev. Neurosci. 4, 573–586.

Grillner, S., Hellgren, J., Menard, A., Saitoh, K., and Wikstrom, M. A. (2005). Mechanisms for selection of basic motor programs–roles for the striatum and pallidum. Trends Neurosci. 28, 364–370.

Gurney, K., Prescott, T. J., and Redgrave, P. (2001). A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biol. Cybern. 84, 401–410.

Hahnloser, R. H. R., Kozhevnikov, A. A., and Fee, M. S. (2002). An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419, 65–70.

Hampton, C. M., Sakata, J. T., and Brainard, M. S. (2009). An avian basal ganglia-forebrain circuit contributes differentially to syllable versus sequence variability of adult Bengalese finch song. J. Neurophysiol. 101, 3235–3245.

Harding, C. F. (2004). Brief alteration in dopaminergic function during development causes deficits in adult reproductive behavior. J. Neurobiol. 61, 301–308.

Harting, J. K., Huerta, M. F., Frankfurter, J., Strominger, N. L., and Royce, G. J. (1980). Ascending pathways from the monkey superior colliculus: an autoradiographic analysis. J. Comp. Neurol. 192, 853–882.

Hessler, N. A., and Doupe, A. J. (1999). Singing-related neural activity in a dorsal forebrain-basal ganglia circuit of adult zebra finches. J. Neurosci. 19, 10461–10481.

Hikosaka, O. (2007). GABAergic output of the basal ganglia. Prog. Brain Res. 160, 209–226.

Hikosaka, O., Nakahara, H., Rand, M. K., Sakai, K., Lu, X., Nakamura, K., Miyachi, S., and Doya, K. (1999). Parallel neural networks for learning sequential procedures. Trends Neurosci. 22, 464–471.

Hikosaka, O., Nakamura, K., and Nakahara, H. (2006). Basal ganglia orient eyes to reward. J. Neurophysiol. 95, 567–584.

Hikosaka, O., Rand, M. K., Nakamura, K., Miyachi, S., Kitaguchi, K., Sakai, K., Lu, X., and Shimo, Y. (2002). Long-term retention of motor skill in macaque monkeys and humans. Exp. Brain Res. 147, 494–504.

Hikosaka, O., Sakamoto, M., and Usui, S. (1989a). Functional properties of monkey caudate neurons. I. Activities related to saccadic eye movements. J. Neurophysiol. 61, 780–798.

Hikosaka, O., Sakamoto, M., and Usui, S. (1989b). Functional properties of monkey caudate neurons. II. Visual and auditory responses. J. Neurophysiol. 61, 799–813.

Hikosaka, O., Sakamoto, M., and Usui, S. (1989c). Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J. Neurophysiol. 61, 814–832.

Hikosaka, O., Takikawa, Y., and Kawagoe, R. (2000). Role of the basal ganglia in the control of purposive saccadic eye movements. Physiol. Rev. 80, 953–978.

Hikosaka, O., and Wurtz, R. H. (1983a). Visual and oculomotor functions of monkey substantia nigra pars reticulata. I. Relation of visual and auditory responses to saccades. J. Neurophysiol. 49, 1230–1253.

Hikosaka, O., and Wurtz, R. H. (1983b). Visual and oculomotor functions of monkey substantia nigra pars reticulata. IV. Relation of substantia nigra to superior colliculus. J. Neurophysiol. 49, 1285–1301.

Hikosaka, O., and Wurtz, R. H. (1985). Modification of saccadic eye movements by GABA-related substances. II. Effects of muscimol in monkey substantia nigra pars reticulata. J. Neurophysiol. 53, 292–308.

Hoover, J. E., and Strick, P. L. (1993). Multiple output channels in the basal ganglia. Science 259, 819–821.

Houk, J. C. (1995). “Information processing in modular circuits linking basal ganglia and cerebral cortex,” in Models of Information Processing in the Basal Ganglia, eds J. C. Houk, J. L. Davis, and D. G. Beiser (Cambridge, MA: The MIT Press), 3–10.

Houk, J. C., Adams, J. L., and Barto, A. G. (1995). “A model of how the basal ganglia generate and use neural signals that predict reinforcement,” in Models of Information Processing in the Basal Ganglia, eds J. C. Houk, J. L. Davis, and D. G. Beiser (Cambridge, MA: The MIT Press), 249–270.

Houk, J. C., and Wise, S. P. (1995). Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: their role in planning and controlling action. Cereb. Cortex 5, 95–110.

Ichinohe, N., Iwatsuki, H., and Shoumura, K. (2001). Intrastriatal targets of projection fibers from the central lateral nucleus of the rat thalamus. Neurosci. Lett. 302, 105–108.

Ikeda, T., and Hikosaka, O. (2003). Reward-dependent gain and bias of visual responses in primate superior colliculus. Neuron 39, 693–700.

Ito, M., and Doya, K. (2011). Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Curr. Opin. Neurobiol. 21, 368–373.

Iyengar, S., Viswanathan, S. S., and Bottjer, S. W. (1999). Development of topography within song control circuitry of zebra finches during the sensitive period for song learning. J. Neurosci. 19, 6037–6057.

Izhikevich, E. M. (2007). Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb. Cortex 17, 2443–2452.

Jarvis, E. D. (2004). Learned birdsong and the neurobiology of human language. Ann. N.Y. Acad. Sci. 1016, 749–777.

Jayaraman, A., Batton, R. R., and Carpenter, M. B. (1977). Nigrotectal projections in the monkey: an autoradiographic study. Brain Res. 135, 147–152.

Johnson, F., Sablan, M. M., and Bottjer, S. W. (1995). Topographic organization of a forebrain pathway involved with vocal learning in zebra finches. J. Comp. Neurol. 358, 260–278.

Kao, M. H., and Brainard, M. S. (2006). Lesions of an avian basal ganglia circuit prevent context-dependent changes to song variability. J. Neurophysiol. 96, 1441–1455.

Kao, M. H., Doupe, A. J., and Brainard, M. S. (2005). Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature 433, 638–643.

Kao, M. H., Wright, B. D., and Doupe, A. J. (2008). Neurons in a forebrain nucleus required for vocal plasticity rapidly switch between precise firing and variable bursting depending on social context. J. Neurosci. 28, 13232–13247.

Kawagoe, R., Takikawa, Y., and Hikosaka, O. (1998). Expectation of reward modulates cognitive signals in the basal ganglia. Nat. Neurosci. 1, 411–416.

Kawagoe, R., Takikawa, Y., and Hikosaka, O. (2004). Reward-predicting activity of dopamine and caudate neurons– a possible mechanism of motivational control of saccadic eye movement. J. Neurophysiol. 91, 1013–1024.

Kemp, J. M., and Powell, T. P. (1971). The structure of the caudate nucleus of the cat: light and electron microscopy. Philos. Trans. R. Soc. Lond. B Biol. Sci. 262, 383–401.

Kincaid, A. E., and Wilson, C. J. (1996). Corticostriatal innervation of the patch and matrix in the rat neostriatum. J. Comp. Neurol. 374, 578–592.

Kincaid, A. E., Zheng, T., and Wilson, C. J. (1998). Connectivity and convergence of single corticostriatal axons. J. Neurosci. 18, 4722–4731.

Kobayashi, S., Kawagoe, R., Takikawa, Y., Koizumi, M., Sakagami, M., and Hikosaka, O. (2007). Functional differences between macaque prefrontal cortex and caudate nucleus during eye movements with and without reward. Exp. Brain Res. 176, 341–355.

Komatsu, H., and Suzuki, H. (1985). Projections from the functional subdivisions of the frontal eye field to the superior colliculus in the monkey. Brain Res. 327, 324–327.

Kozhevnikov, A. A., and Fee, M. S. (2007). Singing-related activity of identified HVC neurons in the zebra finch. J. Neurophysiol. 97, 4271–4283.

Kreitzer, A. C., and Malenka, R. C. (2008). Striatal plasticity and basal ganglia circuit function. Neuron 60, 543–554.

Krout, K. E., Loewy, A. D., Westby, G. W., and Redgrave, P. (2001). Superior colliculus projections to midline and intralaminar thalamic nuclei of the rat. J. Comp. Neurol. 431, 198–216.

Kubikova, L., and Kostal, L. (2010). Dopaminergic system in birdsong learning and maintenance. J. Chem. Neuroanat. 39, 112–123.

Kubikova, L., Wada, K., and Jarvis, E. D. (2010). Dopamine receptors in a songbird brain. J. Comp. Neurol. 518, 741–769.

Kunimatsu, J., and Tanaka, M. (2010). Roles of the primate motor thalamus in the generation of antisaccades. J. Neurosci. 30, 5108–5117.

Kunzle, H., and Akert, K. (1977). Efferent connections of cortical, area 8 (frontal eye field) in Macaca fascicularis. A reinvestigation using the autoradiographic technique. J. Comp. Neurol. 173, 147–163.

Las, L., Denissenko, N. I., Mandelblat-Derf, Y., and Fee, M. S. (2011). “A cortical projection to the ventral tegmental area (VTA) is necessary for vocal learning in the songbird,” in Neuroscience Meeting Planner, Program #303.12 (Washington, DC: Society for Neuroscience).

Lau, B., and Glimcher, P. W. (2008). Value representations in the primate striatum during matching behavior. Neuron 58, 451–463.

Lauwereyns, J., Watanabe, K., Coe, B., and Hikosaka, O. (2002). A neural correlate of response bias in monkey caudate nucleus. Nature 418, 413–417.

Levesque, M., Charara, A., Gagnon, S., Parent, A., and Deschenes, M. (1996a). Corticostriatal projections from layer V cells in rat are collaterals of long-range corticofugal axons. Brain Res. 709, 311–315.

Levesque, M., Gagnon, S., Parent, A., and Deschenes, M. (1996b). Axonal arborizations of corticostriatal and corticothalamic fibers arising from the second somatosensory area in the rat. Cereb. Cortex 6, 759–770.

Levesque, M., and Parent, A. (1998). Axonal arborization of corticostriatal and corticothalamic fibers arising from prelimbic cortex in the rat. Cereb. Cortex 8, 602–613.

Li, N., and DiCarlo, J. J. (2010). Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron 67, 1062–1075.

Long, M. A., and Fee, M. S. (2008). Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature 456, 189–194.

Long, M. A., Jin, D. Z., and Fee, M. S. (2010). Support for a synaptic chain model of neuronal sequence generation. Nature 468, 394–399.

Luo, M., Ding, L., and Perkel, D. J. (2001). An avian basal ganglia pathway essential for vocal learning forms a closed topographic loop. J. Neurosci. 21, 6836–6845.

Luo, M., and Perkel, D. J. (1999). Long-range GABAergic projection in a circuit essential for vocal learning. J. Comp. Neurol. 403, 68–84.

Maass, W. (2000). On the computational power of winner-take-all. Neural Comput. 12, 2519–2535.

Margoliash, D., and Yu, A. C. (1996). Temporal hierarchical control of singing in birds. Science 273, 1871–1875.

May, P. J., and Hall, W. C. (1984). Relationships between the nigrotectal pathway and the cells of origin of the predorsal bundle. J. Comp. Neurol. 226, 357–376.

McDonald, R. J., and White, N. M. (1994). Parallel information processing in the water maze: evidence for independent memory systems involving dorsal striatum and hippocampus. Behav. Neural Biol. 61, 260–270.

McFarland, N. R., and Haber, S. N. (2001). Organization of thalamostriatal terminals from the ventral motor nuclei in the macaque. J. Comp. Neurol. 429, 321–336.

McHaffie, J. G., Stanford, T. R., Stein, B. E., Coizet, V., and Redgrave, P. (2005). Subcortical loops through the basal ganglia. Trends Neurosci. 28, 401–407.

Mengual, E., de las Heras, S., Erro, E., Lanciego, J. L., and Gimenez-Amaya, J. M. (1999). Thalamic interaction between the input and the output systems of the basal ganglia. J. Chem. Neuroanat. 16, 187–200.

Miller, E. K., and Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202.

Mink, J. W. (1996). The basal ganglia: focused selection and inhibition of competing motor programs. Prog. Neurobiol. 50, 381–425.

Montague, P. R., Dayan, P., and Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947.

Nambu, A. (2004). A new dynamic model of the cortico-basal ganglia loop. Prog. Brain Res. 143, 461–466.

Nieuwenhuys, R., Donkelaar, H. J., and Nicholson, C. (1998). The Central Nervous System of Vertebrates. Heidelberg: Springer Verlag.

Nottebohm, F., Kelley, D. B., and Paton, J. A. (1982). Connections of vocal control nuclei in the canary telencephalon. J. Comp. Neurol. 207, 344–357.

Olveczky, B. P., Andalman, A. S., and Fee, M. S. (2005). Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biol. 3:e153. doi: 10.1371/journal.pbio.0030153

Olveczky, B. P., Otchy, T. M., Goldberg, J. H., Aronov, D., and Fee, M. S. (2011). Changes in the neural control of a complex motor sequence during learning. J. Neurophysiol. 106, 386–397.

Packard, M. G., Hirsh, R., and White, N. M. (1989). Differential effects of fornix and caudate nucleus lesions on two radial maze tasks: evidence for multiple memory systems. J. Neurosci. 9, 1465–1472.

Packard, M. G., and Knowlton, B. J. (2002). Learning and memory functions of the basal ganglia. Annu. Rev. Neurosci. 25, 563–593.

Packard, M. G., and McGaugh, J. L. (1996). Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiol. Learn. Mem. 65, 65–72.

Parent, M., and Parent, A. (2005). Single-axon tracing and three-dimensional reconstruction of centre median-parafascicular thalamic neurons in primates. J. Comp. Neurol. 481, 127–144.

Parent, M., and Parent, A. (2006). Single-axon tracing study of corticostriatal projections arising from primary motor cortex in primates. J. Comp. Neurol. 496, 202–213.

Pasupathy, A., and Miller, E. K. (2005). Different time courses of learning-related activity in the prefrontal cortex and striatum. Nature 433, 873–876.

Pawlak, V., and Kerr, J. N. (2008). Dopamine receptor activation is required for corticostriatal spike-timing-dependent plasticity. J. Neurosci. 28, 2435–2446.

Person, A. L., Gale, S. D., Farries, M. A., and Perkel, D. J. (2008). Organization of the songbird basal ganglia, including area X. J. Comp. Neurol. 508, 840–866.

Plenz, D., and Wickens, J. R. (2010). “The striatal skeleton: medium spiny projection neurons and their lateral connections,” in Handbook of Basal Ganglia Structure and Function, eds H. Steiner and K. Y. Tseng (Amsterdam: Elsevier), 99–112.

Plotkin, J. L., Day, M., and Surmeier, D. J. (2011). Synaptically driven state transitions in distal dendrites of striatal spiny neurons. Nat. Neurosci. 14, 881–888.

Prather, J. F., Peters, S., Nowicki, S., and Mooney, R. (2008). Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature 451, 305–310.

Rainer, G., Asaad, W. F., and Miller, E. K. (1998). Memory fields of neurons in the primate prefrontal cortex. Proc. Natl. Acad. Sci. U.S.A. 95, 15008.

Ramon y Cajal, S. (1911). Histologie du Systeme Nerveux de l'Homme et des Vertebres. Paris: Maloine 1, 174–192.

Redgrave, P., Prescott, T. J., and Gurney, K. (1999). The basal ganglia: a vertebrate solution to the selection problem? Neuroscience 89, 1009–1023.

Redondo, R. L., and Morris, R. G. (2011). Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30.

Reiner, A., Hart, N. M., Lei, W., and Deng, Y. (2010). Corticostriatal projection neurons - dichotomous types and dichotomous functions. Front. Neuroanat. 4:142. doi: 10.3389/fnana.2010.00142

Reiner, A., Jiao, Y., Del Mar, N., Laverghetta, A. V., and Lei, W. L. (2003). Differential morphology of pyramidal tract-type and intratelencephalically projecting-type corticostriatal neurons and their intrastriatal terminals in rats. J. Comp. Neurol. 457, 420–440.

Reiner, A., Medina, L., and Veenman, C. L. (1998). Structural and functional evolution of the basal ganglia in vertebrates. Brain Res. Brain Res. Rev. 28, 235–285.

Reiner, A., Perkel, D. J., Bruce, L. L., Butler, A. B., Csillag, A., Kuenzel, W., Medina, L., Paxinos, G., Shimizu, T., Striedter, G., Wild, M., Ball, G. F., Durand, S., Gunturkun, O., Lee, D. W., Mello, C. V., Powers, A., White, S. A., Hough, G., Kubikova, L., Smulders, T. V., Wada, K., Dugas-Ford, J., Husband, S., Yamamoto, K., Yu, J., Siang, C., Jarvis, E. D., and Avian Brain Nomenclature Forum. (2004). Revised nomenclature for avian telencephalon and some related brainstem nuclei. J. Comp. Neurol. 473, 377–414.

Reynolds, J. N. J., Hyland, B. I., and Wickens, J. R. (2001). A cellular mechanism of reward-related learning. Nature 413, 67–70.

Robinson, D. A. (1972). Eye movements evoked by collicular stimulation in the alert monkey. Vision Res. 12, 1795–1808.

Robinson, D. A., and Fuchs, A. F. (1969). Eye movements evoked by stimulation of frontal eye fields. J. Neurophysiol. 32, 637–648.

Romo, R., Brody, C. D., Hernandez, A., and Lemus, L. (1999). Neuronal correlates of parametric working memory in the prefrontal cortex. Nature 399, 470–473.

Roth, G., Laberge, F., Muhlenbrock-Lenter, S., and Grunwald, W. (2007). Organization of the pallium in the fire-bellied toad Bombina orientalis. I: morphology and axonal projection pattern of neurons revealed by intracellular biocytin labeling. J. Comp. Neurol. 501, 443–464.

Royce, G. J. (1983). Single thalamic neurons which project to both the rostral cortex and caudate nucleus studied with the fluorescent double labeling method. Exp. Neurol. 79, 773–784.

Russell, J. A., and Bullock, M. (1985). Multidimensional scaling of emotional facial expressions: similarity from preschoolers to adults. J. Pers. Soc. Psychol. 48, 1290.

Sadikot, A. F., Parent, A., Smith, Y., and Bolam, J. P. (1992). Efferent connections of the centromedian and parafascicular thalamic nuclei in the squirrel monkey: a light and electron microscopic study of the thalamostriatal projection in relation to striatal heterogeneity. J. Comp. Neurol. 320, 228–242.

Samejima, K., Ueda, Y., Doya, K., and Kimura, M. (2005). Representation of action-specific reward values in the striatum. Science 310, 1337–1340.

Sato, M., and Hikosaka, O. (2002). Role of primate substantia nigra pars reticulata in reward-oriented saccadic eye movement. J. Neurosci. 22, 2363–2373.

Scharff, C., and Nottebohm, F. (1991). A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning. J. Neurosci. 11, 2896–2913.

Schlag-Rey, M., Schlag, J., and Dassonville, P. (1992). How the frontal eye field can impose a saccade goal on superior colliculus neurons. J. Neurophysiol. 67, 1003–1005.

Schultz, W. (2002). Getting formal with dopamine and reward. Neuron 36, 241–263.

Schultz, W., Dayan, P., and Montague, P. R. (1997). A neural substrate of prediction and reward. Science 275, 1593–1599.

Shen, W., Flajolet, M., Greengard, P., and Surmeier, D. J. (2008). Dichotomous dopaminergic control of striatal synaptic plasticity. Science 321, 848–851.

Sidibe, M., and Smith, Y. (1996). Differential synaptic innervation of striatofugal neurones projecting to the internal or external segments of the globus pallidus by thalamic afferents in the squirrel monkey. J. Comp. Neurol. 365, 445–465.

Smith, A. D., and Bolam, J. P. (1990). The neural network of the basal ganglia as revealed by the study of synaptic connections of identified neurones. Trends Neurosci. 13, 1–7.

Smith, Y., Bennett, B. D., Bolam, J. P., Parent, A., and Sadikot, A. F. (1994). Synaptic relationships between dopaminergic afferents and cortical or thalamic input in the sensorimotor territory of the striatum in monkey. J. Comp. Neurol. 344, 1–19.

Smith, Y., Bevan, M. D., Shink, E., and Bolam, J. P. (1998). Microcircuitry of the direct and indirect pathways of the basal ganglia. Neuroscience 86, 353–387.

Smith, Y., Raju, D. V., Pare, J.-F., and Sidibe, M. (2004). The thalamostriatal system: a highly specific network of the basal ganglia circuitry. Trends Neurosci. 27, 520–527.

Smith, Y., Surmeier, D. J., Redgrave, P., and Kimura, M. (2011). Thalamic contributions to Basal Ganglia-related behavioral switching and reinforcement. J. Neurosci. 31, 16102–16106.

Sober, S. J., Wohlgemuth, M. J., and Brainard, M. S. (2008). Central contributions to acoustic variation in birdsong. J. Neurosci. 28, 10370–10379.

Sohrabji, F., Nordeen, E. J., and Nordeen, K. W. (1990). Selective impairment of song learning following lesions of a forebrain nucleus in the juvenile zebra finch. Behav. Neural Biol. 53, 51–63.

Sommer, M. A., and Wurtz, R. H. (2002). A pathway in primate brain for internal monitoring of movements. Science 296, 1480–1482.

Sommer, M. A., and Wurtz, R. H. (2008). Brain circuits for the internal monitoring of movements. Annu. Rev. Neurosci. 31, 317–338.

Stepanek, L., and Doupe, A. J. (2010). Activity in a cortical-basal ganglia circuit for song is required for social context-dependent vocal variability. J. Neurophysiol. 104, 2474–2486.

Stephenson-Jones, M., Ericsson, J., Robertson, B., and Grillner, S. (2012). Evolution of the basal ganglia; dual output pathways conserved throughout vertebrate phylogeny. J. Comp. Neurol. doi: 10.1002/cne.23087. [Epub ahead of print].

Stern, E. A., Jaeger, D., and Wilson, C. J. (1998). Membrane potential synchrony of simultaneously recorded striatal spiny neurons in vivo. Nature 394, 475–478.

Suri, R. E., and Schultz, W. (1999). A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91, 871–890.

Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: The MIT Press.

Swanson, L. W. (2000). Cerebral hemisphere regulation of motivated behavior. Brain Res. 886, 113–164.

Takada, M., Tokuno, H., Ikai, Y., and Mizuno, N. (1994). Direct projections from the entopeduncular nucleus to the lower brainstem in the rat. J. Comp. Neurol. 342, 409–429.

Takakusaki, K., Habaguchi, T., Ohtinata-Sugimoto, J., Saitoh, K., and Sakamoto, T. (2003). Basal ganglia efferents to the brainstem centers controlling postural muscle tone and locomotion: a new concept for understanding motor disorders in basal ganglia dysfunction. Neuroscience 119, 293–308.

Takakusaki, K., Saitoh, K., Harada, H., and Kashiwayanagi, M. (2004). Role of basal ganglia-brainstem pathways in the control of motor behaviors. Neurosci. Res. 50, 137–151.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109–139.

Thorndike, E. L. (1911). Animal Intelligence. Darien, CT: Hafner.

Tseng, K. Y., and Steiner, H. (2010). Handbook of Basal Ganglia Structure and Function. Amsterdam: Elsevier.

Tumer, E. C., and Brainard, M. S. (2007). Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature 450, 1240–1244.

Turner, R. S., and DeLong, M. R. (2000). Corticostriatal activity in primary motor cortex of the macaque. J. Neurosci. 20, 7096–7108.

Vates, G. E., and Nottebohm, F. (1995). Feedback circuitry within a song-learning pathway. Proc. Natl. Acad. Sci. U.S.A. 92, 5139–5143.

Vicario, D. S. (1991). Organization of the zebra finch song control system: functional organization of outputs from nucleus robustus archistriatalis. J. Comp. Neurol. 309, 486–494.

Warren, T. L., Tumer, E. C., Charlesworth, J. D., and Brainard, M. S. (2011). Mechanisms and time course of vocal learning and consolidation in the adult songbird. J. Neurophysiol. 106, 1806–1821.

Watanabe, K., Lauwereyns, J., and Hikosaka, O. (2003). Neural correlates of rewarded and unrewarded eye movements in the primate caudate nucleus. J. Neurosci. 23, 10052–10057.

Wickens, J., and Kötter, R. (1995). “Cellular models of reinforcement,” in Models of Information Processing in the Basal Ganglia, eds J. C. Houk, J. L. Davis, and D. G. Beiser (Cambridge, MA: The MIT Press), 187–214.

Wickens, J. R., Alexander, M. E., and Miller, R. (1991). Two dynamic modes of striatal function under dopaminergic-cholinergic control: simulation and analysis of a model. Synapse 8, 1–12.

Wickens, J. R., and Arbuthnott, G. W. (2010). “Gating of cortical input to the striatum,” in Handbook of Basal Ganglia Structure and Function, a Decade of Progress, eds H. Steiner and K. Y. Tseng (Amsterdam: Elsevier), 341–352.

Wilczynski, W., and Northcutt, R. G. (1983). Connections of the bullfrog striatum: afferent organization. J. Comp. Neurol. 214, 321–332.

Wilson, C. J. (1987). Morphology and synaptic connections of crossed corticostriatal neurons in the rat. J. Comp. Neurol. 263, 567–580.

Wilson, C. J. (2007). GABAergic inhibition in the neostriatum. Prog. Brain Res. 160, 91–110.

Wilson, C. J., Chang, H. T., and Kitai, S. T. (1983). Origins of post synaptic potentials evoked in spiny neostriatal projection neurons by thalamic stimulation in the rat. Exp. Brain Res. 51, 217–226.

Wilson, C. J., and Groves, P. M. (1980). Fine structure and synaptic connections of the common spiny neuron of the rat neostriatum: a study employing intracellular injection of horseradish peroxidase. J. Comp. Neurol. 194, 599–615.

Wilson, C. J., and Kawaguchi, Y. (1996). The origins of two-state spontaneous membrane potential fluctuations of neostriatal spiny neurons. J. Neurosci. 16, 2397–2410.

Winn, P., Wilson, D. I. G., and Redgrave, P. (2010). “Subcortical connections of the basal ganglia,” in Handbook of Basal Ganglia Structure and Function, eds H. Steiner and K. Y. Tseng (Amsterdam: Elsevier), 397–408.

Wright, A. K., Ramanathan, S., and Arbuthnott, G. W. (2001). Identification of the source of the bilateral projection system from cortex to somatosensory neostriatum and an exploration of its physiological actions. Neuroscience 103, 87–96.

Wurtz, R. H., and Albano, J. E. (1980). Visual-motor function of the primate superior colliculus. Annu. Rev. Neurosci. 3, 189–226.

Yin, H. H., and Knowlton, B. J. (2006). The role of the basal ganglia in habit formation. Nat. Rev. Neurosci. 7, 464–476.

Yuste, R. (2011). Dendritic spines and distributed circuits. Neuron 71, 772–781.

Zheng, T., and Wilson, C. J. (2002). Corticostriatal combinatorics: the implications of corticostriatal axonal arborizations. J. Neurophysiol. 87, 1007–1017.

Keywords: context, corticostriatal, efference copy, motor learning, songbird, striatum, thalamostriatal

Citation: Fee MS (2012) Oculomotor learning revisited: a model of reinforcement learning in the basal ganglia incorporating an efference copy of motor actions. Front. Neural Circuits 6:38. doi: 10.3389/fncir.2012.00038

Received: 27 January 2012; Accepted: 01 June 2012;
Published online: 27 June 2012.

Edited by:

Massimo Scanziani, University of California, San Diego, USA

Reviewed by:

Naoshige Uchida, Harvard University, USA
Okihide Hikosaka, National Eye Institute, USA
Anatol Kreitzer, University of California, San Francisco, USA

Copyright: © 2012 Fee. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.

*Correspondence: Michale S. Fee, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 46-5133, 77 Massachusetts Ave, Cambridge, MA 02139, USA. e-mail: fee@mit.edu
