%A Stewart, Terrence
%A Bekolay, Trevor
%A Eliasmith, Chris
%D 2012
%J Frontiers in Neuroscience
%C
%F
%G English
%K 2-armed bandit, Basal Ganglia, neural engineering framework, reinforcement learning, ventral striatum
%Q
%R 10.3389/fnins.2012.00002
%W
%L
%M
%P
%7
%8 2012-January-31
%9 Original Research
%+ Dr Terrence Stewart, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada, tcstewar@uwaterloo.ca
%#
%! Learning to Select Actions
%*
%<
%T Learning to Select Actions with Spiking Neurons in the Basal Ganglia
%U https://www.frontiersin.org/articles/10.3389/fnins.2012.00002
%V 6
%0 JOURNAL ARTICLE
%@ 1662-453X
%X We expand our existing spiking neuron model of decision making in the cortex and basal ganglia to include local learning on the synaptic connections between the cortex and striatum, modulated by a dopaminergic reward signal. We then compare this model to animal data in the bandit task, which is used to test rodent learning in conditions involving forced choice under rewards. Our results indicate a good match in terms of both behavioral learning results and spike patterns in the ventral striatum. The model successfully generalizes to learning the utilities of multiple actions, and can learn to choose different actions in different states. The purpose of our model is to provide both high-level behavioral predictions and low-level spike timing predictions while respecting known neurophysiology and neuroanatomy.