**NEURONAL AND PSYCHOLOGICAL UNDERPINNINGS OF PATHOLOGICAL GAMBLING**

**Topic Editors Bryan F. Singer, Patrick Anselme, Mike J. F. Robinson and Paul Vezina**

BEHAVIORAL NEUROSCIENCE

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

Cover image provided by Ibbl sarl, Lausanne CH

**ISSN** 1664-8714 **ISBN** 978-2-88919-320-2 **DOI** 10.3389/978-2-88919-320-2


## **NEURONAL AND PSYCHOLOGICAL UNDERPINNINGS OF PATHOLOGICAL GAMBLING**

Topic Editors:

**Bryan F. Singer,** University of Michigan, USA

**Patrick Anselme,** University of Liège, Belgium

**Mike J. F. Robinson,** Wesleyan University, USA

**Paul Vezina,** The University of Chicago, USA

As with drugs, gambling hijacks reward circuits in a brain that is not prepared to receive such intense stimulation. Dopamine is normally released in response to reward and uncertainty in order to allow animals to stay alive in their environment, where rewards are relatively unpredictable. In this case, behavior is regulated by environmental feedback, leading animals to persevere or to give up. In contrast, drugs provide a direct, intense pharmacological stimulation of the dopamine system that operates independently of environmental feedback, and hence causes "motivational runaways". With respect to gambling, the confined environment experienced by gamblers favors the emergence of excitatory conditioned cues, so that positive feedback overrides negative feedback. Although drugs and gambling may act differently, their abnormal activation of reward circuitry leads to an underestimation of negative consequences and promotes the development of addictive/compulsive behavior. In Parkinson's and Huntington's disease, dopamine-related therapies may disrupt this feedback on dopamine signalling, potentially leading to various addictions, including pathological gambling. The goal of this Research Topic is to further our understanding of the neurobiological mechanisms underlying the development of pathological gambling. This eBook contains a cross-disciplinary collection of research and review articles, ranging in scope from animal behavioral models to human imaging studies.

# Table of Contents


Martin Zack, Robert E. Featherstone, Sarah Mathewson and Paul J. Fletcher


Martin Zack


Ruud van den Bos, Ruben Taris, Bianca Scheppink, Lydia de Haan and Joris C. Verster

*The Role of Dopamine in Risk Taking: A Specific Look at Parkinson's Disease and Gambling*

Crystal A. Clark and Alain Dagher


Stephanie E. Tedford, Nathan A. Holtz, Amanda L. Persons and T. Celeste Napier

### Neuronal and psychological underpinnings of pathological gambling

#### *Bryan F. Singer<sup>1</sup>\*, Patrick Anselme<sup>2</sup>, Mike J. F. Robinson<sup>3</sup> and Paul Vezina<sup>4</sup>*

*<sup>1</sup> Department of Psychology, The University of Michigan, Ann Arbor, MI, USA*

*<sup>2</sup> Department of Psychology, The University of Liège, Liège, Belgium*

*<sup>3</sup> Psychology Department, Wesleyan University, Middletown, CT, USA*

*<sup>4</sup> Department of Psychiatry and Behavioral Neuroscience, The University of Chicago, Chicago, IL, USA*

*\*Correspondence: bfsinger@umich.edu*

#### *Edited and reviewed by:*

*Allan V. Kalueff, International Stress and Behavior Society, USA*

**Keywords: dopamine, gambling, reward, uncertainty, addiction, ventral striatum, stress, conditioning (psychology)**

Although pathological gambling (PG) is a prevalent disease, its neurobiological and psychological underpinnings are not well characterized. As legal gambling increases in prominence in a growing number of casinos as well as on the internet, the potential for a rise in PG diagnoses warrants investigation of the disorder. The recent reclassification of PG as a behavioral addiction in the DSM-5 raises the possibility that similar cognitive and motivational phenotypes may underlie both gambling and substance use disorders. Indeed, in this Research Topic, Zack et al. (2014) tested the hypothesis that exposure to reward unpredictability can recruit brain dopamine (DA) systems in a similar way to chronic exposure to drugs of abuse (see also Singer et al., 2012). Over the years a variety of models have proposed that alterations in DA signaling may mediate the transition from drug use to dependence; similarly, the hypothesis that aberrant DA responses may influence the transition from recreational, to problematic, and finally PG has only recently begun to be tested. The collection of articles in this Research Topic highlights the complexity of PG and posits several theories of how dopaminergic signaling may contribute to behavioral maladaptations that contribute to PG.

In this Research Topic, Paglieri et al. (2014) report a growing incidence of PG with a lack of effective treatments. As described by Goudriaan et al. (2014) (this Research Topic), PG is thought to result from "diminished cognitive control over the urge to engage in addictive behaviors" that manifests in the inability to control desire to gamble despite negative consequences. PG is characterized by several cognitive dysfunctions, including increased impulsivity and cognitive interference. Similar to drug addictions, gambling behavior is powerfully modulated by exposure to gambling-related conditioned stimuli. In this Research Topic both Anselme and Robinson (2013) as well as Linnet (2014) describe the supporting role of gambling-related cues in this behavioral addiction. Anselme and Robinson (2013) present a series of findings suggesting that surprising non-rewards enhance incentive salience attribution to conditioned cues in conditioning procedures as well as during gambling episodes. They discuss a possible evolutionary origin of this counterintuitive process. Linnet (2014) reviews the contribution of DA signaling to incentive salience and reward prediction. Noting the research demonstrating brain activation during gambling tasks despite the possibility of a loss, he suggests a role for DA dysfunction in reward "wanting" and anticipation.

Ventral striatal activation is thought to be critical for the attribution of incentive salience to reward-related cues. In this Research Topic, Lawrence and Brooks (2014) found that healthy individuals who are more likely to display disinhibitory personality traits, such as financial extravagance and irresponsibility, show increased capacity for ventral striatal DA synthesis. Thus it is possible that individual variation in DA signaling due to genetics or environmental factors may influence PG. Porchet et al. (2013) (this Research Topic) also investigated whether physiological and cognitive responses observed during the performance of gambling tasks could be altered in recreational gamblers with pharmacological manipulations. As the commentary from Zack (2013) suggests, the Porchet et al. (2013) results may reflect important differences in neurobiological function between recreational and pathological gamblers. This hypothesis, along with the results of Lawrence and Brooks (2014) demonstrating increased DA capacity in individuals thought to be more prone to gambling, illustrates the complexity of PG as a disease and the need to sample different populations with different techniques and behavioral tasks.

Two papers in this Research Topic suggest a role for cortisol in modulating incentive motivation in the ventral striatum. Li et al. (2014) demonstrate an imbalanced sensitivity to monetary vs. non-monetary incentives in the ventral striatum of pathological gamblers. They show that cortisol levels in PG positively correlate with ventral striatal responses to monetary cues. van den Bos et al. (2013) provide further evidence for the importance of cortisol by highlighting the strong positive correlation observed in men between salivary cortisol levels and risk-taking measures. This contrasted markedly with the weak negative correlation seen in women. Their findings highlight important gender differences in how stress hormones affect risky decision-making and, by extension, the role of stress in gambling.

In this Research Topic, Clark and Dagher (2014) provide a review of the literature investigating the relationship between DA agonists and impulse control disorders in Parkinson's patients, and how this relates to potential gains and losses within a decision-making framework. They provide the beginnings of a hypothetical model of how DA agonist treatments affect value and risk assessments. While a variety of research suggests that dopaminergic treatments for Parkinson's disease may affect PG, few have probed whether individuals with Huntington's disease (HD) display gambling-related phenotypes. Kalkhoven et al. (2014) (this Research Topic) show that HD patients exhibit symptoms of behavioral disinhibition similar to those observed in PG. However, HD patients do not typically develop problem gambling. Based on neurobehavioral evidence, these authors suggest why HD patients are unlikely to start gambling but have a higher chance of developing PG if they encounter a situation that promotes such behavior.

The investigation of neural mechanisms underlying PG is currently at an early stage. As emphasized by Potenza (2013) in this Research Topic, while previous research and the present findings suggest that DA may underlie gambling-related behaviors, other neurotransmitters and signaling pathways may also play vital roles in the emergence of the disease. Individual variation in PG populations (e.g., differing levels of impulsivity, compulsivity, decision making, and DA pathology) has produced discrepancies in the PG literature, warranting a systematic approach to investigating the disease in the future. Paglieri et al. (2014) also suggest the need for greater methodological integration of animal studies (rodents and primates) to better understand the mechanisms underlying PG. In particular, Tedford et al. (2014) note in this Research Topic that gambling activity involves costs/benefits decision-making and that intracranial self-stimulation provides experimental advantages over traditional reinforcement methods used to model PG in animals. Finally, Paglieri et al. (2014) suggest that computational modeling, already used to account for other psychiatric diseases, might be applied to PG as well. Taken together, this collection of articles suggests novel avenues for future research of PG to improve treatment options for the disease.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 May 2014; accepted: 06 June 2014; published online: 01 July 2014. Citation: Singer BF, Anselme P, Robinson MJF and Vezina P (2014) Neuronal and psychological underpinnings of pathological gambling. Front. Behav. Neurosci. 8:230. doi: 10.3389/fnbeh.2014.00230*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Singer, Anselme, Robinson and Vezina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Chronic exposure to a gambling-like schedule of reward predictive stimuli can promote sensitization to amphetamine in rats

#### *Martin Zack<sup>1</sup>\*, Robert E. Featherstone<sup>2</sup>, Sarah Mathewson<sup>3</sup> and Paul J. Fletcher<sup>3</sup>*

*<sup>1</sup> Cognitive Psychopharmacology Laboratory, Neuroscience Department, Centre for Addiction and Mental Health, Toronto, ON, Canada*

*<sup>2</sup> Translational Neuroscience Program, Department of Psychiatry, School of Medicine, University of Pennsylvania, Philadelphia, PA, USA*

*<sup>3</sup> Biopsychology Section, Neuroscience Department, Centre for Addiction and Mental Health, Toronto, ON, Canada*

#### *Edited by:*

*Bryan F. Singer, University of Michigan, USA*

#### *Reviewed by:*

*Louk Vanderschuren, University of Utrecht, Netherlands*

*Ruud Van Den Bos, Radboud University Nijmegen, Netherlands*

*Patrick Anselme, University of Liège, Belgium*

#### *\*Correspondence:*

*Martin Zack, Cognitive Psychopharmacology Laboratory, Neuroscience Department, Centre for Addiction and Mental Health, 33 Russell Street, Toronto, ON M5S 2S1, Canada e-mail: martin.zack@camh.ca*

Addiction is considered to be a brain disease caused by chronic exposure to drugs. Sensitization of brain dopamine (DA) systems partly mediates this effect. Pathological gambling (PG) is considered to be a behavioral addiction. Therefore, PG may be caused by chronic exposure to gambling. Identifying a gambling-induced sensitization of DA systems would support this possibility. Gambling rewards evoke DA release. One episode of slot machine play shifts the DA response from reward delivery to onset of cues (spinning reels) for reward, in line with temporal difference learning principles. Thus, conditioned stimuli (CS) play a key role in DA responses to gambling. In primates, DA response to a CS is strongest when reward probability is 50%. Under this schedule the CS elicits an expectancy of reward but provides no information about whether it will occur on a given trial. During gambling, a 50% schedule should elicit maximal DA release. This closely matches reward frequency (46%) on a commercial slot machine. DA release can contribute to sensitization, especially for amphetamine. Chronic exposure to a CS that predicts reward 50% of the time could mimic this effect. We tested this hypothesis in three studies with rats. Animals received 15 × 45-min exposures to a CS that predicted reward with a probability of 0, 25, 50, 75, or 100%. The CS was a light; the reward was a 10% sucrose solution. After training, rats received a sensitizing regimen of five separate doses (1 mg/kg) of d-amphetamine. Lastly they received a 0.5 or 1 mg/kg amphetamine challenge prior to a 90-min locomotor activity test. In all three studies the 50% group displayed greater activity than the other groups in response to both challenge doses. Effect sizes were modest but consistent, as reflected by a significant group × rank association (φ = 0.986, *p* = 0.025). Chronic exposure to a gambling-like schedule of reward predictive stimuli can promote sensitization to amphetamine much like exposure to amphetamine itself.

#### **Keywords: pathological gambling, sensitization, amphetamine, dopamine, uncertainty**

#### **INTRODUCTION**

Addiction has been characterized as a brain disease caused by chronic exposure to drugs of abuse (Leshner, 1997). Neuroplasticity is thought to mediate the effects of such exposure (Nestler, 2001). Sensitization of brain dopamine (DA) systems is a form of neuroplasticity implicated in hyper-reactivity to conditioned stimuli (CS) for drugs, and compulsive drug seeking (Robinson and Berridge, 2001). Sensitization has been operationally defined by increased DA release in response to a CS for reward and by increased locomotor response to pharmacological DA challenge (Robinson and Berridge, 1993; Pierce and Kalivas, 1997; Vanderschuren and Kalivas, 2000). Although sensitization is only one of many brain changes linked with addiction (cf. Robbins and Everitt, 1999; Koob and Le Moal, 2008), changes in presynaptic dopamine release have been suggested to represent common neuroadaptations involved in addiction-based drug-seeking (e.g., relapse), in that drugs that induce locomotor sensitization to opiate (e.g., morphine) or stimulant challenge (e.g., amphetamine) also cause reinstatement of extinguished operant responses for heroin or cocaine self-administration, an animal model of relapse (Vanderschuren et al., 1999). Evidence that incentive sensitization (increased value of drug reward) is most pronounced after initial exposure to addictive drugs further suggests that sensitization may be involved in the early stages of addiction as well (Vanderschuren and Pierce, 2010).

Pathological gambling (PG) has been described as a behavioral addiction and recently reclassified to the same category as substance dependence disorders in the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (Frascella et al., 2010; A.P.A., 2013). This implies that PG may be caused by chronic exposure to gambling-like activity, that common mechanisms may mediate the effects of gambling and drug exposure (Zack and Poulos, 2009; Leeman and Potenza, 2012), and that sensitization of brain DA pathways may be one important element of this process.

Clinical evidence indirectly supports this possibility: Using positron emission tomography (PET), Boileau and colleagues found that male PG subjects exhibit significantly greater striatal DA release in response to amphetamine (0.4 mg/kg) than healthy male controls (Boileau et al., 2013). Overall group differences were significant in the associative and somatosensory striatum. In the limbic striatum, which includes the nucleus accumbens, the groups did not differ. However, in PG subjects, DA release in the limbic striatum correlated directly with the severity of PG symptoms. These findings are consistent with sensitization of brain DA pathways in PG, but also suggest some important differences with human substance dependent individuals and with the classic animal model of amphetamine sensitization. Unlike PG subjects and animals exposed to low doses of amphetamine (cf. Robinson et al., 1982), humans with substance dependence consistently exhibit decreased DA release to a stimulant challenge (Volkow et al., 1997; Martinez et al., 2007), and evidence from animals suggests that this may reflect deficits in DA function during the initial stages of abstinence following binge patterns of substance abuse (Mateo et al., 2005). In studies where stimulant sensitization is demonstrated in animals, enhanced DA release is usually observed in the limbic striatum rather than the dorsal (associative, somatosensory) striatum (Vezina, 2004). However, cue-induced (i.e., conditioned) drug-seeking in animals repeatedly exposed to cocaine has been linked with enhanced DA release in the dorsal striatum, a result thought to indicate a more habitual form of motivated behavior (Ito et al., 2002). Thus, the overall elevation in DA release in dorsal regions in PG subjects may be related to habit-based (inflexible, routinized) reward seeking involving "a progression from ventral to more dorsal domains of the striatum" (Everitt and Robbins, 2005, p. 1481), whereas the severity-dependent DA release in limbic striatum in these subjects may correspond more closely to incentive sensitization as typically modeled in animals. The PET findings cannot reveal whether DA hyper-reactivity was a pre-existing feature of these PG subjects, a consequence of gambling exposure, or a result of some other process entirely. To address this question, it is necessary to demonstrate induction of sensitization by chronic gambling exposure in subjects that are normal prior to exposure. This raises questions as to what features of gambling are most likely to induce sensitization.

Skinner noted that the variable schedule of reinforcement was fundamental to gambling's allure (or at least its persistence) (Skinner, 1953). Betting behavior in a slot machine game conforms well to the basic principles of instrumental conditioning, as reflected by a prospective correlation between monetary payoff and bet size on consecutive spins (Tremblay et al., 2011). Thus, variable ratio operant responding appears to provide an externally valid model of slot machine gambling.

Recent research with animals provides strong initial support for a causal effect of gambling exposure on sensitization. Singer and colleagues examined the effects of 55 1–h daily sessions of fixed (FR20) or variable (VR20) saccharin reinforcement in an operant lever-press paradigm on subsequent locomotor response to low dose (0.5 mg/kg) amphetamine in healthy male (Sprague Dawley) rats (Singer et al., 2012). They hypothesized that, if gambling leads to sensitization, rats exposed to the variable schedule, which mimics gambling, should exhibit greater response to amphetamine than rats exposed to the fixed schedule. As predicted, the VR20 group displayed 50% greater locomotor response to amphetamine than the FR20 group. In contrast, the groups displayed equivalent locomotion following a saline injection. These findings confirm that chronic exposure to variable reinforcement is sufficient to induce hyper-reactivity to a DA challenge in healthy animals randomized to the respective schedules.

A number of questions arise from this result: First, to what extent does the perceived contingency—or lack thereof—between the operant response and its outcome mediate these effects? In learning terms, does this effect involve a "response-outcome expectancy," or might a similar effect be seen in the absence of an operant response, i.e., "a stimulus-outcome expectancy" in a Pavlovian paradigm (cf. Bolles, 1972)? Second, does the degree of contingency between the antecedent event (response or stimulus) and its outcome influence the degree of sensitization?

The second question concerns the role of uncertainty in sensitization. For example, do games whose outcome is truly random—completely unpredictable—have greater potential to induce sensitization than games where the odds of winning are clearly defined but not random, even if the absolute rate of reward is low? The present research addressed these questions.

The experimental design was informed by a seminal study on reward expectancy and DA neuron response in monkeys (Fiorillo et al., 2003). The animals in that study received a juice reward (US) under 0, 25, 50, 75, or 100% variable ratio schedules. The schedules were designated by 1 of 4 different CS (icons). The 0% schedule delivered reward as often as the 100% schedule, but omitted the CS. Firing rate of DA neurons during the interval between CS onset and US delivery or omission was the key dependent measure. The study found that DA response increased as a function of the uncertainty of reward delivery. Thus, under the 100% schedule the CS evoked little activity, under the 25 and 75% schedules, the CS evoked moderate and similar levels of activity, and under the 50% schedule the CS evoked maximal activity. In each case, firing rate escalated over the course of the CS-US interval, i.e., as the expectancy approached fruition.

These findings indicate that DA activity not only varies with whether reward is certain (Fixed Ratio) or uncertain (Variable Ratio), but also varies in inverse proportion to the amount of information about reward delivery conveyed by the CS. In the 100% condition, the CS evokes the reward expectancy and also perfectly predicts its delivery. In the 25 and 75% conditions, the CS evokes the expectancy and correctly predicts the outcome (omission or delivery, respectively) three out of four times. In the 50% condition the CS evokes the expectancy but provides no information about reward delivery beyond chance alone. Based on their findings, Fiorillo et al. concluded: "This uncertainty-induced increase in dopamine could contribute to the rewarding properties of gambling" (p. 1901).
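The relation between reward probability and the information carried by the CS can be made concrete by treating each trial as a binary (Bernoulli) event. The sketch below is an illustration only and is not an analysis from Fiorillo et al. or from the present study. It computes two common uncertainty measures, the outcome variance p(1 − p) and the Shannon entropy in bits, for a range of reward probabilities. The function names are hypothetical.

```python
import math

def bernoulli_variance(p):
    """Variance of a reward that occurs with probability p."""
    return p * (1.0 - p)

def bernoulli_entropy_bits(p):
    """Shannon entropy (bits) of a binary outcome; taken as 0 when p is 0 or 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# Reward probabilities signalled by the CS (illustrative values).
schedules = [0.0, 0.25, 0.5, 0.75, 1.0]
for p in schedules:
    print(f"p = {p:.2f}  variance = {bernoulli_variance(p):.4f}  "
          f"entropy = {bernoulli_entropy_bits(p):.3f} bits")
```

Under either measure, uncertainty is zero when reward is certain, equal at 25 and 75%, and maximal at 50%, which is the ordering of CS-evoked DA activity described above.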

The effects of 50% variable reward in a single session should not change over the course of multiple sessions because the likelihood of reward remains entirely unpredictable on every trial. Thus, when considering the conditions that would maximize chronic activation of DA neurons over repeated episodes of gambling, the 50% schedule should engender the most enduring as well as the most robust effect. This is noteworthy given that the long-run rate of reward (payoff > 0) observed over thousands of spins on a commercial slot machine was 45.8% (Tremblay et al., 2011). Thus, 50% variable reward appears to accurately reflect the payoff schedule administered by actual gambling devices.

The present study used the same conditioning schedules as Fiorillo et al. in a chronic exposure, between-groups design with rats. Animals underwent ∼3 weeks of daily conditioning sessions, where a CS (light) was paired with a US (small amount of sucrose). After the training phase, animals rested prior to assessment of sensitization indexed by locomotor response to amphetamine. Based on the literature, it was predicted that rats exposed to different reward schedules would not differ in their drug-free locomotor behavior but would exhibit significantly different levels of locomotion following amphetamine. Specifically, the 50% group was expected to display a greater locomotor response to the drug than the other groups over the course of doses, a pattern that would be expected if the 50% animals had been previously exposed to additional doses of amphetamine itself (i.e., cross-sensitization).

### **EXPERIMENT 1 MATERIALS AND METHODS**

#### *Subjects*

Four groups (*n* = 8/group) of adult (300–350 g) male Sprague-Dawley rats (Charles River, St. Constant, Quebec, Canada) were housed individually in clear polycarbonate boxes (20 × 43 × 22 cm) under a reverse 12:12 light-dark cycle. They received *ad libitum* access to food and water, and daily handling by an experimenter for 2 weeks prior to the study. Each group was conditioned under one of four variable reward schedules: 0, 25, 50, or 100%. The 75% group was omitted in this initial study, as Fiorillo et al. (2003) found equivalent post-CS DA release under 25 and 75% reward schedules, such that both conditions led to greater DA release than did the 100% CS-US condition, but less than the 50% condition.

#### *Apparatus*

Access to sucrose presentations and to the CS was provided individually in operant conditioning boxes (33 × 31 × 29 cm). Each box was equipped with a reinforcer magazine, located on the front wall. A light in the top of the magazine served as the CS. A motorized, solenoid-controlled liquid dipper could be elevated to the floor of the magazine. Events in the box were controlled by Med Associates equipment and software, using an in-house program written in MED-PC. Locomotor testing was conducted individually in Plexiglas cages (27 × 48 × 20 cm). Each cage was equipped with a monitoring system consisting of six photo-beam cells to detect horizontal movement.

#### *Procedure*

*Training.* The study was conducted in compliance with the ethical guidelines set out by the Canadian Council on Animal Care. Rats were food-restricted to 90% of their body weight for the duration of the study and housed individually. Each rat received 15 days of sucrose reward training (10% water solution at 0.06 ml per reward): 5 consecutive days × 3 weeks, with weekends off. Animals were maintained on standard chow before and after the training phase; sucrose exposure was restricted to the fifteen ∼40 min training sessions. Each daily session consisted of 15 stimulus presentations (a light; CS), each separated by an inter-trial interval of 120 s. The light was located in the top panel of the magazine, and remained on for 25 s, with sucrose made available during the last 5 s. In the case of group 0 the sucrose dipper was raised every 140 s (for 5 s) but the stimulus light was not illuminated. This equated the interval between presentations of the dipper in group 0 and the other groups (120 + 25 s). Each treatment session lasted ∼40 min. On average, group 25 received sucrose once for every four CS presentations; group 50 received sucrose once for every two CS presentations, and group 100 received sucrose after every CS presentation.
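The trial timing described above can be sketched as a simple schedule generator. This is a minimal illustration, not the in-house MED-PC program used in the study. It assumes that reward is drawn independently on each trial with the group's probability and that a session begins with one inter-trial interval; neither detail is stated in the text. The names `build_session` and `session_length_min` are hypothetical, and the uncued dipper schedule of group 0 is not modeled.

```python
import random

CS_DURATION_S = 25       # light stays on for 25 s
REWARD_WINDOW_S = 5      # sucrose available during the last 5 s of the CS
ITI_S = 120              # inter-trial interval
TRIALS_PER_SESSION = 15

def build_session(reward_probability, rng):
    """Return a list of (cs_onset_s, dipper_onset_s or None) for one session."""
    trials = []
    t = ITI_S  # assumption: the session starts with one inter-trial interval
    for _ in range(TRIALS_PER_SESSION):
        rewarded = rng.random() < reward_probability  # assumption: independent draws
        dipper_onset = t + CS_DURATION_S - REWARD_WINDOW_S if rewarded else None
        trials.append((t, dipper_onset))
        t += CS_DURATION_S + ITI_S
    return trials

def session_length_min():
    """Total session time implied by 15 trials of CS plus inter-trial interval."""
    return TRIALS_PER_SESSION * (CS_DURATION_S + ITI_S) / 60.0

if __name__ == "__main__":
    session = build_session(0.5, random.Random(1))
    print(f"{sum(d is not None for _, d in session)} rewarded trials of {len(session)}")
    print(f"session length: {session_length_min():.2f} min")
```

The implied session length is 15 × 145 s, or about 36 min, in line with the ∼40 min sessions reported.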

*Testing.* Two weeks after the last sucrose access (or "conditioning") session, the locomotor response to d-amphetamine (AMPH; i.p.) was assessed. Rats were given three 2-h sessions to habituate to the locomotor boxes, followed by six AMPH test sessions. AMPH test days occurred at 1-wk intervals. On test days, rats were given 30 min to habituate to boxes then received a single 0.5 mg/kg dose of AMPH followed, on separate weekly sessions, by five 1.0 mg/kg doses (one dose per day) on test days 1 through 5. Post-AMPH locomotion was assessed for 90 min on each session.

#### *Data analytic approach*

Statistical analyses were conducted with SPSS (v. 16 and v. 21; SPSS Inc., Chicago IL). Immediate behavioral response to the CS was assessed in terms of nose pokes into the aperture where the sucrose was dispensed. The mean number of nose pokes during this interval (5 s per trial) was then compared to the mean number of nose pokes for the same duration (5 s) averaged over the time when the CS was absent. Group × Session ANOVAs of nose-pokes with CS present and absent tracked the acquisition of discriminative responding to the cue and indiscriminate nose poke responses under the different schedules over the course of the 15 sucrose training sessions.
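The comparison of CS-present and CS-absent responding requires both counts to be expressed over the same 5-s window. A minimal sketch of that rescaling follows; the function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def pokes_per_window(total_pokes, observed_seconds, window_s=5.0):
    """Rescale a nose-poke count to the number expected in a window of window_s seconds."""
    if observed_seconds <= 0:
        raise ValueError("observed_seconds must be positive")
    return total_pokes * window_s / observed_seconds


# Hypothetical example: 48 pokes over a 120-s CS-absent period
cs_absent_rate = pokes_per_window(48, 120)
```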

Effects of treatment on locomotor responses were assessed with Group × Session ANOVAs for the drug-free habituation phase (three sessions), pre-sensitization 0.5 mg/kg AMPH challenge (one session), and during the five-session 1 mg/kg AMPH sensitization regimen, when groups were expected to differ in response to repeated doses of AMPH. Group × Session ANOVAs also assessed drug-free locomotor responses during the 30-min pre-injection habituation phase from each AMPH test session. Planned comparisons assessed the difference in mean performance for group 50 vs. group 0 (no expectancy control) and group 100 (no uncertainty control), by means of *t*-tests (Howell, 1992), using the MS error and df error terms for the relevant effect (i.e., group or group × session interaction) from the ANOVA (Winer, 1971). Polynomial trend analyses tested the profile of changes over the course of sessions.
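The planned comparisons described above can be sketched as a *t* statistic that uses the ANOVA error term in place of a two-group pooled variance. This is the standard textbook form; the exact SPSS procedure used by the authors is not specified here, and the numbers in the example are made up for illustration.

```python
import math


def planned_comparison_t(mean_a, mean_b, ms_error, n_a, n_b):
    """t statistic for a two-group contrast using the ANOVA MS error.

    The statistic is evaluated against the df of the ANOVA error term,
    not against n_a + n_b - 2.
    """
    standard_error = math.sqrt(ms_error * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error


def between_subjects_df_error(n_groups, n_per_group):
    """Error df for a between-subjects effect with equal group sizes."""
    return n_groups * n_per_group - n_groups


# Hypothetical values: group means of 10 and 6, MS error of 16, n = 8 per group
t_value = planned_comparison_t(10.0, 6.0, 16.0, 8, 8)
df_error = between_subjects_df_error(4, 8)
```

With four groups of eight rats the between-subjects error term has 28 df, which is the df of the Group effect reported for experiment 1.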

To determine if approach responses in the presence and absence of the CS during the 15 sucrose training sessions contributed to variation in locomotor response to AMPH, or mediated group differences in AMPH response, follow-up analyses of covariance (ANCOVAs) were performed on the AMPH locomotor data, including total nose pokes (sum for 15 sessions) when the CS was absent as the covariate. A significant effect of the covariate would indicate that drug-free approach responses moderated (influenced the strength of) the effects of group or session. A decline in the significance of the effects of group or session in the presence of a significant covariate would indicate that approach responses mediated (accounted for) the effects of group or session. A decline in the significance of group or session effects in the absence of a significant covariate effect would simply reflect a loss of statistical power due to the reallocation of df from the error term to the covariate, and would not have bearing on the interpretation of the effects of group or session.

#### **RESULTS**

#### *Nose pokes during sucrose conditioning sessions*

*CS present.* **Figure 1A** shows the mean nose pokes for groups 25, 50, and 100 while the CS was present on the 15 sucrose conditioning sessions (nose pokes were not coded for group 0, which received no CS). A 3 Group × 15 Session ANOVA yielded significant main effects of Group, *F*(2, 21) = 5.63, *p* = 0.011, and Session, *F*(14, 294) = 14.00, *p* < 0.001, along with a significant Group × Session interaction, *F*(28, 294) = 2.93, *p* < 0.001. **Figure 1A** indicates that the main effect of Session reflected an increase in nose pokes across sessions in all three groups, and the main effect of Group reflected generally higher overall scores in group 100 vs. group 25 with intermediate scores in group 50. A significant Group × Session interaction for the cubic trend, *F*(2, 21) = 4.42, *p* = 0.030, indicated a rapid rise, dip, and leveling off in nose pokes over sessions in group 100, as against a linear increase over sessions in group 50, and a shallower linear increase over sessions in group 25.

*CS absent.* **Figure 1B** shows the mean nose pokes for all four groups for an equivalent duration (5 s × 15 trials) averaged over the time when the CS was absent. A 4 Group × 15 Session ANOVA yielded significant main effects of Group, *F*(3, 28) = 7.06, *p* = 0.001, and Session, *F*(14, 392) = 2.84, *p* < 0.001, along with a significant Group × Session interaction, *F*(42, 392) = 3.93, *p* < 0.001. A significant Group × Session interaction for the quadratic trend, *F*(3, 28) = 3.91, *p* = 0.019, along with no interaction for the cubic trend, *F*(3, 28) < 0.93, *p* > 0.44, reflected an "inverted-U" profile of nose pokes over sessions in group 0, as against a generally stable profile over sessions in the other groups.

#### *Habituation to locomotor chambers*

A 4 Group × 3 Session ANOVA yielded a main effect of Session, *F*(2, 56) = 5.67, *p* = 0.006, and no other significant effects, *F*(3, 28) < 1.60, *p* > 0.21. Mean (SE) beam breaks per 2 h in the locomotor boxes were 1681 (123) on session 1, 1525 (140) on session 2, and 1269 (96) on session 3. Planned comparisons found no significant differences between group 50 and group 0 or group 100 on the first or final habituation session, *t*(84) < 1.69, *p* > 0.05. Thus, in the absence of AMPH, repeated exposure to the test boxes was associated with a consistent decline in spontaneous locomotor activity in the four groups (i.e., Session effect), and no differential response as a function of sucrose training schedule (no interaction).

#### *Test sessions*

#### *Effects of pre-sensitization 0.5 mg/kg AMPH challenge.*

*Pre-injection locomotion.* A 4 Group one-way ANOVA of locomotor response during the 30-min pre-injection habituation phase yielded no significant effects, *F*(3, 28) < 1.05, *p* > 0.38. Planned comparisons found no significant difference between group 50 and group 0 or group 100, *t*(32) < 0.87, *p* > 0.40. Therefore, baseline differences in pre-injection locomotion did not account for group differences in locomotor response to AMPH. Mean (SE) beam breaks for the sample were 559 (77).

*Post-injection locomotion vs. final drug-free habituation session.* A 4 Group × 2 Session ANOVA compared the groups' locomotor responses on the final habituation session, and immediately after the pre-sensitization 0.5 mg/kg AMPH challenge. Scores for the habituation session (120 min) were scaled to correspond with the duration of the AMPH test session (90 min) (raw habituation score × 90/120). The analysis yielded a significant main effect of Session, *F*(1, 28) = 34.16, *p* < 0.001, and no other significant effects, *F*(3, 28) < 2.26, *p* > 0.10. The Session effect reflected an increase in mean (SE) beam breaks in response to the dose, from 952 (72) to 1859 (151). Planned comparisons found no significant differences between group 50 and group 0 or group 100 in response to the dose, *t*(56) < 1.72, *p* > 0.10. However, the rank order of beam break scores (M; SE) aligned with the hypothesis: group 50 (2205; 264) > group 0 (2025; 203) > group 100 (1909; 407) > group 25 (1296; 299).
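The duration scaling applied to the habituation scores is simple proportional rescaling. The sketch below reproduces it for the session-3 habituation mean reported earlier (1269 beam breaks per 120 min); the function name is hypothetical.

```python
def scale_to_duration(raw_score, from_minutes=120.0, to_minutes=90.0):
    """Rescale a beam-break count recorded over from_minutes to a to_minutes window."""
    return raw_score * to_minutes / from_minutes


# Session-3 habituation mean from the text, rescaled from 120 to 90 min
scaled_mean = scale_to_duration(1269)
```

Because the scaling is linear, rescaling the sample mean gives the same value as averaging the rescaled individual scores, and the result (951.75) is consistent with the pre-dose mean of 952 reported for this analysis.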

#### *Effects of 1 mg/kg AMPH.*

*Pre-injection locomotion.* A 4 Group × 5 Session ANOVA of locomotor response during the 30-min pre-injection habituation phase on 1 mg/kg AMPH test sessions yielded a main effect of Session, *F*(4, 112) = 43.64, *p* < 0.0001, and no other significant effects, *F*(3, 28) < 0.97, *p* > 0.42. Planned comparisons found no significant difference between group 50 and group 0 or group 100 on the first or final test session, *t*(140) < 0.84, *p* > 0.30. Therefore, baseline differences in locomotion did not account for group differences in locomotor response to AMPH. Mean (SE) beam break scores for the pre-dose habituation phase on sessions 1–5 were: 454 (30), 809 (53), 760 (36), 505 (35), 756 (39).

*Post-injection locomotion.* **Figure 2** shows the effects of five injections of 1 mg/kg AMPH (one per week) on locomotor activity scores in the four groups. A 4 Group × 5 Session ANOVA yielded a main effect of Session, *F*(4, 112) = 8.21, *p* < 0.001, a marginal main effect of Group, *F*(2, 45) = 3.28, *p* = 0.085, and no significant interaction, *F*(12, 112) < 0.77, *p* > 0.68.

Planned comparisons revealed that group 50 scores differed significantly from group 0, *t*(28) = 2.19, *p* = 0.037, and group 100, *t*(28) = 2.36, *p* = 0.025 [and differed marginally from group 25, *t*(28) = 2.03, *p* = 0.051]. Thus, in group 50, locomotor response to 1 mg/kg AMPH exceeded that of groups 0 and 100 significantly, and that of group 25 marginally, across the five test sessions. Polynomial trend analysis detected a significant quadratic trend across sessions, *F*(1, 28) = 32.47, *p* < 0.0001, and no other significant trends, *F*(1, 28) < 1.78, *p* > 0.19. **Figure 2** shows that this result reflected an "inverted U" pattern across sessions.
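To make the trend terminology concrete, the sketch below applies the standard orthogonal polynomial coefficients for five equally spaced sessions to a set of session means. The means are invented to show an inverted-U profile and are not the study's data; the actual trend tests were run in SPSS.

```python
# Standard orthogonal polynomial coefficients for 5 equally spaced levels
TREND_COEFFICIENTS = {
    "linear": (-2, -1, 0, 1, 2),
    "quadratic": (2, -1, -2, -1, 2),
    "cubic": (-1, 2, 0, -2, 1),
}


def trend_contrast(session_means, trend):
    """Weighted sum of five session means for the named polynomial trend."""
    coefficients = TREND_COEFFICIENTS[trend]
    if len(session_means) != len(coefficients):
        raise ValueError("expected exactly five session means")
    return sum(c * m for c, m in zip(coefficients, session_means))


# Hypothetical inverted-U profile: rises to session 3, then falls symmetrically
example_means = [3, 5, 6, 5, 3]
contrasts = {name: trend_contrast(example_means, name) for name in TREND_COEFFICIENTS}
```

For this symmetric profile the linear and cubic contrasts are zero and the quadratic contrast is negative, which is the signature of an inverted U.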

#### *Control for variation in nose poke responding during sucrose training*

The follow-up ANCOVA of locomotor responses to 1 mg/kg AMPH, with nose pokes (CS present) as the covariate, in the three groups that received the CS, yielded a marginal main effect of Group, *F*(2, 20) = 3.07, *p* = 0.069, and no significant covariate-related effects, *F*(4, 80) < 0.05, *p* > 0.85. Thus, cued approach responding during training did not explain significant variation in the locomotor response to 1 mg/kg AMPH in groups 25, 50, or 100.

The follow-up ANCOVA of locomotor responses to 1 mg/kg AMPH, with nose pokes (CS absent) as a covariate, yielded a significant effect of the covariate, *F*(1, 27) = 6.17, *p* = 0.020, a significant main effect of Group, *F*(3, 27) = 4.13, *p* = 0.016, a marginal Session × Covariate interaction, *p* = 0.080, and no other significant effects, *F*(4, 108) < 1.48, *p* > 0.21. Thus, un-cued (indiscriminate) approach responding during training explained significant variation in locomotor response to 1 mg/kg AMPH. However, this variation was non-overlapping with group-related variance, because inclusion of the covariate in the analysis increased rather than decreased the significance of the group effect.

**FIGURE 2 | Mean (SE) locomotor response (number of beam breaks in an electronic array per 90 min) to 1 mg/kg d-amphetamine (i.p.) on 5 weekly sessions in groups of Sprague Dawley rats (*n* = 8/group) previously exposed to 15 daily conditioning sessions with sucrose reward (10% solution) delivered under 0, 25, 50, or 100% variable schedules.** The conditioned stimulus was a light (120 s). Group 0 received the same number of rewards as group 100 in the absence of conditioned stimuli. ∗*p* < 0.05 for mean difference between group 50 and group 0 as well as group 100, based on planned comparisons.

#### **DISCUSSION**

The nose poke data while the CS was present show that groups acquired the association between CS and sucrose delivery as reflected by an increase in cued responses over training sessions. The profile of responding over sessions while the CS was present suggested that 100 and 50% CS-US schedules were equally effective in eliciting approach, whereas the 25% schedule elicited a more modest increase in cue-induced approach. The nose poke data while the CS was absent suggest that groups that received any of the three CS-sucrose training schedules (group 25, 50, 100) rapidly learned to reduce their nose pokes in the absence of the CS, whereas animals in group 0, which received no CS, only learned to decrease their approach behavior to a limited degree after extensive training.

The habituation data show that the groups did not differ prior to AMPH and that repeated exposure to the test boxes was associated with decreased drug-free locomotor response. Therefore, between-group differences and increased responding over repeated doses of AMPH cannot be attributed to pre-existing differences in locomotor behavior.

Results of the pre-sensitization challenge with 0.5 mg/kg AMPH confirmed that the drug increased locomotor activity relative to the final drug-free habituation day. In line with the hypothesis, group 50 ranked higher than groups 0 or 100 (as well as group 25) in terms of mean response to the dose, although the mean differences between groups were not significant.

For the sensitization sessions, the between-groups' planned comparisons showed that prior exposure to 50% conditioned sucrose reward led to a significant increase in locomotor response to a 1.0 mg/kg dose of amphetamine relative to the other three schedules. This effect was evident from the first dose and did not change appreciably over repeated doses. The trend analysis indicated a biphasic response (for the full sample) to repeated doses of AMPH, increasing up to the third dose and decreasing thereafter. The results of the follow-up ANCOVA with nose-pokes (CS absent) as the covariate confirmed that differences in the four groups' locomotor responses to 1 mg/kg AMPH were not mediated by un-cued approach responding during the sucrose training sessions.

The group effect during the sensitization sessions is consistent with our hypothesis. The biphasic session effect is not consistent with the expected continued escalation in locomotor responses with repeated AMPH doses. This may be related to the dosing interval. To address this issue, a procedure (alternate daily doses) shown to induce consistent escalation in locomotor response to 1.0 mg/kg doses of AMPH (i.e., behavioral sensitization) should be employed. Assessing the impact of a sensitizing regimen of AMPH on subsequent response to a second 0.5 mg/kg challenge would test the generality of this effect. Inclusion of a saline challenge prior to AMPH would determine the role of expectancy or injection-related (e.g., stress) effects on the locomotor response to AMPH. Inclusion of a 75% conditioned sucrose group would help to clarify the role of reward uncertainty vs. reward infrequency on the pattern of responses for groups 50 and 25. In addition, to permit assessment (by ANCOVA) of the contribution of drug-free cued approach responses to locomotion under AMPH (using nose pokes with CS present as the covariate), nose pokes should also be coded for group 0 during the interval when the CS is present in the other four groups (i.e., so that nose pokes from all five groups, including group 0 which received no CS, can be included in the analysis of covariance with CS present as the covariate). These refinements were incorporated in experiment 2.

#### **EXPERIMENT 2**

#### **MATERIALS AND METHODS**

The methodology of experiment 2 was similar to that of experiment 1 but revised to better approximate a regimen found to reliably induce AMPH sensitization (Fletcher et al., 2005). Changes were as follows: (a) The 75% CS-sucrose group (*n* = 8) was included; (b) During sucrose training, rats (except for group 0) received 20 CS (light) presentations (as opposed to 15 in experiment 1); (c) CS presentations were each separated by an average inter-trial interval of 90 s; range: 30–180 s (vs. 120 s in experiment 1), which offset the increase in training trials to equate the duration of each training session to that of experiment 1; (d) the duration of each of the three habituation sessions was decreased from 120 to 90 min to correspond with the duration of the test sessions; (e) A saline (i.p., 1 ml/kg) challenge (90 min) was added (post-sucrose training day 8), to assess the locomotor effects of injection *per se* (e.g., expectation, stress); (f) The 1 mg/kg sensitization sessions were held on alternate weekdays (post-training days 12–21) rather than at weekly intervals as in experiment 1; (g) Along with the pre-sensitization 0.5 mg/kg AMPH challenge (post-training day 9) a second post-sensitization 0.5 mg/kg AMPH challenge was added (post-sucrose training day 28), to test the generality of the sensitization effect across doses; (h) nose pokes while CS was present were coded for all groups (including group 0); (i) nose pokes while CS was absent were recorded specifically from the 5-s interval immediately prior to the onset of the CS to index premature approach responding.
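The trade-off in change (c), more trials with a shorter inter-trial interval, can be checked with a short calculation. The sketch assumes the 25-s CS duration from experiment 1 also applied in experiment 2, which the text implies but does not state; the function names are hypothetical.

```python
def session_minutes(trials, mean_iti_s, cs_s=25):
    """Approximate session length in minutes: trials x (mean ITI + CS duration)."""
    return trials * (mean_iti_s + cs_s) / 60.0


def proportional_reduction(old_value, new_value):
    """Fractional decrease from old_value to new_value."""
    return (old_value - new_value) / old_value


exp1_minutes = session_minutes(trials=15, mean_iti_s=120)   # experiment 1
exp2_minutes = session_minutes(trials=20, mean_iti_s=90)    # experiment 2 (CS duration assumed)
iti_reduction = proportional_reduction(120, 90)
```

Under these assumptions both sessions come out close to the ∼40 min stated for experiment 1 (about 36 and 38 min), and the mean inter-trial interval is 25% shorter in experiment 2.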

#### **RESULTS**

#### *Nose pokes during sucrose conditioning sessions*

A 5 Group × 15 Session × 2 Phase (CS present, CS absent) ANOVA of nose pokes yielded significant main effects of Group, *F*(4, 19) = 2.89, *p* = 0.050, Session, *F*(14, 266) = 2.28, *p* = 0.006, and Phase, *F*(1, 19) = 14.72, *p* = 0.001, as well as a significant three-way interaction, *F*(56, 266) = 1.38, *p* = 0.050. Panels **(A,B)** of **Figure 3** plot the groups' mean nose poke scores for the CS present and CS absent phases, respectively. Comparison of the two panels reveals that the main effect of Phase reflected more overall nose poke responses when the CS was present vs. absent. Therefore, cued responses occurred significantly more often than did premature un-cued responses. The main effects of Group and Session were not readily interpreted due to the higher order interaction. This latter result reflected a convergence of scores for the five groups at a relatively stable low level across sessions when the CS was absent (**Figure 3B**), together with a divergence of scores into high (group 75, group 100), intermediate (group 50), and low (group 0, group 25) levels of nose poke responding over sessions when the CS was present (**Figure 3A**). Of the lower order polynomial trends (linear, quadratic, cubic) only the three-way interaction for the linear trend approached significance, *F*(4, 19) = 2.32, *p* = 0.094, reflecting the generally monotonic increase in nose pokes over sessions in group 75 and relatively more rapid stabilization at high, intermediate, and low levels of responding in the other groups when the CS was present.

#### *Habituation to locomotor boxes*

A 5 Group × 3 Session ANOVA of drug-free locomotor responses yielded a significant main effect of Session, *F*(2, 70) = 60.01, *p* < 0.0001, and no other significant effects, *F*(4, 35) < 0.70, *p* > 0.60. Planned comparisons of group 50 with group 0 and with group 100 on the first and final habituation sessions yielded no significant effects, *t*'s < 0.84, *p* > 0.40. Therefore, mean drug-free locomotor response in the key groups did not differ prior to testing. Mean (SE) number of beam breaks per 90 min were 2162 (118) on session 1, 1470 (116) on session 2, and 1250 (98) on session 3.

#### *Test sessions*

*Saline.* A 5 Group × 2 Session ANOVA compared locomotor response on the final habituation session and saline challenge session. The ANOVA yielded a main effect of Session, *F*(1, 35) = 62.46, *p* < 0.0001, and no other significant effects, *F*(4, 35) < 0.65, *p* > 0.64. **Figure 4** plots the group means and shows that the Session effect reflected an overall decrease in locomotor response from the final drug-free habituation session to the saline session, which did not vary by group. Thus, the decline in locomotor response seen over the three habituation sessions continued on the fourth drug-free exposure to the test boxes.

#### *Effects of 0.5 mg/kg AMPH.*

*Pre-injection locomotion.* A 5 Group × 2 Session ANOVA of pre-injection locomotion (30-min) on the pre- and post-sensitization 0.5 mg/kg AMPH test days yielded a significant main effect of Session, *F*(1, 35) = 13.39, *p* = 0.001, and no other significant effects, *F*(4, 35) < 1.79, *p* > 0.15. Planned comparisons found no significant differences between group 50 and group 0 or group 100 on the first session, *t*(70) < 1.00, *p* > 0.30. However, on the second (post-sensitization) session group 50 (1203; 121) displayed significantly more pre-injection beam breaks (M; SE) than did group 100 (756; 103), *t*(70) = 5.11, *p* < 0.001, but did not differ from group 0 (1126; 211), *t*(70) < 0.88, *p* > 0.40. Therefore, baseline differences in locomotion did not account for group differences in locomotor response to the first 0.5 mg/kg dose of AMPH but may have contributed to differences between group 50 and group 100 in locomotor response to the second 0.5 mg/kg dose of AMPH. Mean (SE) beam breaks for the pre-injection phase on the first and second 0.5 mg/kg AMPH test sessions were 757 (41) and 974 (59).

*Post-injection locomotion.* A 5 Group × 2 Session ANOVA of locomotor response to 0.5 mg/kg AMPH before and after the 5-dose sensitizing regimen yielded a main effect of Session, *F*(1, 35) = 76.05, *p* < 0.0001, and no other significant effects, *F*(4, 35) < 1.10, *p* > 0.37. **Figure 5** shows the mean scores for each group and session.

The figure shows that the Session effect involved a significant increase in overall mean (SE) beam breaks per 90 min from 0.5 mg/kg dose 1, 3674 (216) to 0.5 mg/kg dose 2, 6123 (275). The lack of interaction or group effect suggested that sensitization to AMPH did not vary reliably across groups. Despite the lack of significant group-related effects in the ANOVA, inspection of the figure reveals that group 50 displayed the greatest response to both the first and second 0.5 mg/kg doses. Planned comparisons of response to the first 0.5 mg/kg dose revealed no significant difference between group 50 and group 0 or group 100, *t*'s(35) < 0.48, *p* > 0.50. However, in response to the second (post-sensitization) 0.5 mg/kg dose, group 50 displayed significantly greater locomotion than group 0, *t*(35) = 2.00, *p* < 0.05, as well as group 100, *t*(35) = 3.29, *p* < 0.01.

In light of the significant group difference in pre-injection locomotion on the second 0.5 mg/kg AMPH session reported above, a follow-up 5 Group × 2 Session ANCOVA of locomotor response to 0.5 mg/kg AMPH was conducted, controlling for pre-injection locomotion on the second session. This analysis yielded a significant effect of the covariate, *F*(1, 34) = 8.65, *p* = 0.006, a main effect of Session, *F*(1, 34) = 10.83, *p* = 0.002, and no other significant effects, *F*(4, 34) < 0.85, *p* > 0.50. Importantly, planned comparisons based on the MS error and df error from the ANCOVA confirmed that mean locomotor response to the second 0.5 mg/kg dose of AMPH remained significantly greater in group 50 than group 100, *t*(34) = 3.09, *p* < 0.01, and group 0, *t*(34) = 1.88, *p* < 0.05 (one-tailed), when pre-injection variation from session 2 was controlled. Thus, group 50 displayed significantly greater post-sensitization locomotor response to 0.5 mg/kg AMPH than did group 100 or group 0, and these group differences were not mediated by pre-injection locomotion on test days.

**Dawley rats (*n* = 8/group) previously exposed to 15 daily conditioning sessions with sucrose reward (10% solution) delivered under 0, 25, 50, 75, or 100% variable schedules.** The conditioned stimulus was a light (120 s). Group 0 received the same number of rewards as group 100 in the absence of conditioned stimuli. ∗*p* < 0.05 for mean difference between group 50 and group 0 as well as group 100, based on planned comparisons.

#### *Effects of 1.0 mg/kg AMPH.*

*Pre-injection locomotion.* A 5 Group × 5 Session ANOVA of 30-min pre-injection scores for the 1 mg/kg AMPH sensitization sessions yielded a main effect of Session, *F*(4, 140) = 16.70, *p* < 0.0001, and no other significant effects, *F*(4, 35) < 0.94, *p* > 0.45. Planned comparisons found no significant difference in pre-injection locomotion between group 50 and group 0 or group 100 on the first session, *t*(175) < 1.66, *p* > 0.10. However, on the final session, group 50 (1167; 140) displayed significantly more beam breaks (M; SE) than did group 100 (1000; 99), *t*(175) = 2.35, *p* < 0.05, but did not differ from group 0 (1085; 120), *t*(175) < 1.16, *p* > 0.20. Therefore, differences in pre-injection locomotion may have contributed to any differences between groups 50 and 100 in locomotor response to the final 1 mg/kg AMPH dose. Mean (SE) overall beam breaks for the sample during the pre-injection phase for Sessions 1 through 5 were: 810 (46), 784 (52), 760 (53), 726 (46), 1009 (51).

*Post-injection locomotion.* A 5 Group × 5 Session ANOVA of responses to 1 mg/kg AMPH yielded a significant main effect of Session, *F*(4, 140) = 6.72, *p* < 0.001, a marginal Group × Session interaction, *F*(16, 140) = 1.57, *p* = 0.085, and no main effect of Group, *F*(4, 35) < 0.44, *p* > 0.77. Polynomial trend analyses revealed a significant linear trend, *F*(1, 35) = 9.19, *p* = 0.005, and cubic trend, *F*(1, 35) = 21.63, *p* < 0.001, over sessions 1 through 5. **Figure 6** shows the mean locomotor scores for each group and session.

The figure shows that the Session effect reflected a significant increase in overall mean (SE) beam breaks for the full sample from session 1, 4624 (213) to session 5, 5736 (272), confirming the emergence of sensitization to AMPH. The cubic trend denoted relative maxima on sessions 1, 3, and 5, with dips on sessions 2 and 4, particularly for groups 0 and 50. The figure also reveals that, despite the lack of significant interaction, group 25 displayed progressively greater locomotor response over sessions and differed considerably from the other groups on sessions 4 and 5 (9 and 22% greater respectively, than next highest group). Planned comparisons found that group 50 did not differ significantly from groups 0 or 100, *t*(175) < 0.89, *p* > 0.40, on the first or final 1 mg/kg AMPH test session.
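The size of the overall sensitization effect can be expressed as a percent change between the reported full-sample means. The sketch below does this for the 1 mg/kg sessions (4624 to 5736) and for the two 0.5 mg/kg challenges (3674 to 6123); the helper name is hypothetical and the percentages are derived here rather than reported by the authors.

```python
def percent_change(baseline, later):
    """Percent increase (positive) or decrease (negative) from baseline to later."""
    return (later - baseline) / baseline * 100.0


# Full-sample means taken from the text
change_1mg = percent_change(4624, 5736)     # 1 mg/kg, session 1 vs. session 5
change_half_mg = percent_change(3674, 6123) # 0.5 mg/kg, dose 1 vs. dose 2
```

On these figures the increase is roughly 24% across the 1 mg/kg sessions and roughly 67% between the two 0.5 mg/kg challenges.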

#### *Control for variation in nose poke responding during sucrose training*

Two 5 Group × 2 Session ANCOVAs of locomotor response to 0.5 mg/kg AMPH before and after the sensitization regimen, including total nose pokes during sucrose training with CS present and with CS absent as separate covariates, found no significant effects for either covariate, *F*(1, 18) < 1.03, *p* > 0.31. Therefore, approach responding during training did not mediate group differences in response to 0.5 mg/kg AMPH.

Two 5 Group × 5 Session ANCOVAs of locomotor response to 1 mg/kg AMPH during the sensitization sessions with total nose pokes (CS present, CS absent) as separate covariates yielded no significant effects of the covariate while the CS was present, *F*(4, 104) < 1.04, *p* > 0.38, and a marginal main effect of the covariate while the CS was absent, *F*(1, 18) = 3.32, *p* = 0.085.

#### **DISCUSSION**

The results of this study did not consistently support the hypothesis that group 50 would demonstrate higher locomotor response over sessions compared to the other groups. The 1 mg/kg AMPH data confirmed the emergence of sensitization with the alternate-day dosing regimen. The pattern across groups indicated a trend for greater sensitization during the latter sessions in group 25, with no such evidence for group 50. In contrast, the 0.5 mg/kg dose results indicated a trend for greater sensitization in group 50, while at the same time confirming a significant overall increase in locomotor response across groups to the second vs. the first 0.5 mg/kg AMPH dose. The null effect of saline injection confirmed that expectancy or injection-related stress did not contribute to the AMPH effects.

The nose poke data again revealed an overall increase in approach responding over the course of training sessions when the CS was present, with no corresponding increase when the CS was absent. Therefore, the animals appeared to acquire the association between the CS and the prospect of sucrose reward. Group differences in the frequency of nose pokes when the CS was present conformed roughly to the frequency of reward delivery under the respective schedules, with groups 75 and 100 displaying the most nose pokes, group 50 displaying intermediate numbers of nose pokes, and groups 0 and 25 displaying the fewest nose pokes. These results suggest that the CS came to control approach responding in a manner consistent with the overall probability of reward. Although speculative, one possible explanation for the lower nose poke rates with CS present in group 50 in experiment 2 vs. experiment 1 may be the shortening of the inter-trial interval, as longer inter-trial intervals (experiment 1) appear to encourage impulsive tendencies and this is associated with increased turnover of DA in anterior cingulate, prelimbic and infralimbic cortices (Dalley et al., 2002). Therefore, the 25% reduction in inter-trial interval (from 120 to 90 s) in experiment 2 (and 3) may have altered cortical DA levels and promoted more selective (i.e., guided by the relative frequency of reward) vs. impulsive (not guided by reward frequency) approach responding in group 50 during training trials in experiment 2 as compared with experiment 1.

The lack of significant covariate-related effects for nose pokes in the CS present condition in the ANCOVAs indicates that approach responding during sucrose training did not mediate the effects of the different CS-sucrose schedules on responses to AMPH. The marginally significant effect of the covariate for the CS absent condition in the ANCOVA of locomotor responses to 1 mg/kg AMPH suggests that the tendency toward premature drug-free responding explained some of the variability in locomotor effects of AMPH during the sensitization sessions.

Together, the evidence suggests that the effects of conditioning history may be more discernible with 0.5 mg/kg AMPH than with 1 mg/kg AMPH, and that a protocol that generates sensitization in the absence of any other manipulation may obscure or render redundant the effects of a putative sensitization-promoting behavioral manipulation (i.e., chronic variable reward).

Behavioral sensitization to AMPH is a robust effect in the laboratory. However, outside the laboratory, only a minority of individuals who gamble chronically escalate to pathological levels. Although risk for sensitization is related to risk for addiction (or drug seeking), especially for psychostimulants (Vezina, 2004; Flagel et al., 2008), many factors aside from sensitization risk may predispose one to addiction (e.g., Verdejo-Garcia et al., 2008; Conversano et al., 2012; Volkow et al., 2012). Nevertheless, trait factors that confer vulnerability to sensitization may interact with conditioning history to accentuate the effects of unpredictable reward (i.e., 50% CS-US schedule) on DA system reactivity. To investigate this possibility, experiment 3 employed the same procedure as experiment 2 but used Lewis strain instead of Sprague Dawley strain rats.

Sprague Dawley rats display intermediate levels of DA transporters, with lower levels than Wistar strain rats (Zamudio et al., 2005), but higher levels than Wistar-Kyoto rats (a "depressive" like strain) in the nucleus accumbens, amygdala, ventral tegmental area and substantia nigra (Jiao et al., 2003). This profile may render Sprague Dawley rats only moderately sensitive to environmental or pharmacological manipulations of DA function. In contrast, Lewis rats exhibit low levels of DA transporters as well as D2 and D3 DA receptors in the nucleus accumbens and dorsal striatum compared to other strains (e.g., F344) (Flores et al., 1998). These morphological differences may contribute to Lewis rats' differential response to DA manipulations. Lewis rats also exhibit a range of accentuated responses to experimental drug manipulations compared to other strains (e.g., F344). Most importantly, Lewis rats display greater sensitization to methamphetamine, characterized by low response to initial doses but higher response to later doses (Camp et al., 1994). Lewis rats also exhibit greater locomotor sensitization to a range of doses of cocaine (Kosten et al., 1994; Haile et al., 2001). Based on this pattern of effects, we surmised that Lewis rats would enable us to investigate whether susceptibility to sensitization amplifies the effects of conditioning schedule on subsequent response to AMPH.

#### **EXPERIMENT 3**

#### **MATERIALS AND METHODS**

The methodology was the same as in experiment 2, aside from the use of Lewis rats (200–225 g on arrival, Charles River, Quebec, Canada).

#### **RESULTS**

#### *Nose pokes during sucrose conditioning sessions*

A 5 Group × 15 Session × 2 Phase (CS present, CS absent) ANOVA of nose pokes yielded significant main effects of Group, *F*(4, 34) = 6.12, *p* = 0.001, Session, *F*(14, 476) = 3.42, *p* < 0.001, and Phase, *F*(1, 34) = 20.83, *p* < 0.001, as well as a significant three-way interaction, *F*(56, 476) = 1.56, *p* = 0.008. Panels **(A,B)** of **Figure 7** plot the groups' mean nose poke scores for the CS present and CS absent phases, respectively. Comparison of the two panels reveals that the main effect of Phase reflected more overall nose poke responses when the CS was present vs. absent. Therefore, cued responses occurred significantly more often than did premature responses. The main effects of Group and Session were not readily interpreted due to the higher order interaction. The three-way interaction reflected a convergence of scores for the five groups at a relatively stable low level across sessions when the CS was absent [Panel **(B)**], together with a divergence of scores when the CS was present into relatively discrete profiles for each group that paralleled their rank order of reward frequency: from highest (group 100) to lowest (group 25) [Panel **(A)**]. Only the linear trend for the interaction was significant, *F*(4, 34) = 4.03, *p* = 0.009, reflecting the generally consistent increase in nose pokes over sessions in group 100 when the CS was present as against the relatively inconsistent profile of increase in nose pokes across sessions in the other groups during this phase.

**75, or 100% variable schedules.** The conditioned stimulus was a light (120 s). Group 0 received the same number of rewards as group 100 in the absence of conditioned stimuli. **(A)** Scores when CS was present (5 s × 20 trials). **(B)** Scores when CS was absent (average for 5 × 20 s while light was off).

**an electronic array per 90 min) on the last of 3 drug-free habituation sessions and on a subsequent session after saline injection (i.p., 1 ml/kg) in groups of Lewis rats (***n* **= 8/group) previously exposed to 15 daily conditioning sessions with sucrose reward (10% solution) delivered under 0, 25, 50, 75, or 100% variable schedules.** The conditioned stimulus was a light (120 s). Group 0 received the same number of rewards as group 100 in the absence of conditioned stimuli.

#### *Habituation to locomotor boxes*

A 5 Group × 3 Session ANOVA yielded a main effect of Session, *F*(2, 70) = 23.07, *p* < 0.0001, and no other significant effects, *F*(8, 70) < 1.47, *p* > 0.18. A curvilinear pattern of mean (SE) locomotor scores emerged from session 1, 1076 (74), through session 2, 644 (48), to session 3, 762 (59). Planned comparisons of group 50 with group 0 and with group 100 on the first and final habituation sessions revealed significantly fewer beam breaks in group 50 (*M* = 911; *SE* = 109) vs. group 0 (*M* = 1103; *SE* = 176) on habituation session 1, *t*(105) = 2.02, *p* < 0.05, but no difference between group 50 and group 100 (*M* = 1066; *SE* = 150), *t*(105) < 1.20, *p* > 0.20, on this session. Group 50 did not differ significantly from either group 0 or group 100 on the final habituation session, *t*(105) < 0.93, *p* > 0.30. Therefore, mean drug-free locomotor response in the key groups did not differ consistently prior to testing.

#### *Test sessions*

*Saline.* A 5 Group × 2 Session ANOVA of locomotor responses on the final habituation session and the saline test session yielded a significant main effect of Session, *F*(1, 35) = 50.12, *p* < 0.0001, and no other significant effects, *F*(4, 35) < 0.57, *p* > 0.68. **Figure 8** shows the group mean scores for the two sessions and indicates that the Session effect reflected a significant decline from habituation to saline test. Thus, receipt of the injection *per se* (e.g., expectancy, stress) did not enhance locomotor responding.

#### *Effects of 0.5 mg/kg AMPH.*

*Pre-injection locomotion.* A 5 Group × 2 Session ANOVA of pre-injection locomotion yielded a significant main effect of Session, *F*(1, 35) = 15.04, *p* < 0.001, and no other significant effects, *F*(4, 35) < 1.19, *p* > 0.33. Planned comparisons found no significant difference between group 50 and group 0 or group 100 on either test session, *t*(70) < 0.99, *p* > 0.30. Therefore, baseline differences in pre-injection locomotion did not account for group differences in locomotor response to 0.5 mg/kg AMPH. Mean (SE) beam breaks for the pre-injection phase for the first and second (post-sensitization) 0.5 mg/kg sessions were 325 (25) and 473 (36).

*Post-injection locomotion.* A 5 Group × 2 Session ANOVA of locomotor response to 0.5 mg/kg doses delivered before and after chronic 1 mg/kg AMPH yielded a main effect of Session, *F*(1, 34) = 87.44, *p* < 0.0001, and no other significant effects, *F*(4, 34) < 0.94, *p* > 0.45. **Figure 9** plots the mean locomotor scores for each group and session and shows that the Session effect reflected an increased overall response to the second 0.5 mg/kg dose, consistent with sensitization. The figure also shows that the groups performed very similarly on session 1, but that group 50 displayed more locomotor activity than the other groups on session 2. Planned comparisons in response to the first 0.5 mg/kg dose revealed no significant differences between group 50 and group 0 or group 100, *t*(35) < 1.28, *p* > 0.20. However, group 50 displayed significantly greater locomotor response to the second 0.5 mg/kg dose than did group 0, *t*(35) = 4.32, *p* < 0.001, or group 100, *t*(35) = 2.24, *p* < 0.05.

#### *Effects of 1 mg/kg AMPH.*

*Pre-injection locomotion.* A 5 Group × 5 Session ANOVA of 30 min pre-injection scores for the sensitization sessions yielded a main effect of Session, *F*(4, 140) = 4.10, *p* = 0.004, and no other significant effects, *F*(4, 35) = 1.25, *p* > 0.31. Planned comparisons found that beam breaks during the pre-injection phase (M; SE) were significantly lower in group 50 (395; 62) than in group 100 (508; 62), *t*(175) = 2.58, *p* < 0.01, but not group 0, *t*(175) < 1.83, *p* > 0.10, on 1 mg/kg AMPH session 1. On the final 1 mg/kg AMPH session, planned comparisons also found that pre-injection locomotion in group 50 (378; 60) was significantly lower than in group 100 (650; 75), *t*(175) = 6.17, *p* < 0.001, but not in group 0, *t*(175) < 1.84, *p* > 0.10. As the direction of these group differences (control group = group 50) was opposite to the hypothesized pattern, group differences in post-injection locomotion that align with the hypothesis cannot be attributed to pre-injection baseline differences. Mean (SE) overall beam breaks during the pre-injection phase for Sessions 1 through 5 were: 442 (34), 452 (32), 542 (40), 411 (26), 504 (37).

*Post-injection locomotion.* A 5 Group × 5 Session ANOVA of responses to the 1 mg/kg doses yielded a significant main effect of Session, *F*(4, 140) = 6.15, *p* < 0.001, and no other significant effects, *F*(4, 35) < 0.57, *p* > 0.68. Polynomial trend analyses revealed a significant linear trend, *F*(1, 35) = 9.34, *p* = 0.004, and cubic trend, *F*(1, 35) = 5.08, *p* = 0.031, the latter result denoting relative maxima on sessions 3 and 5. **Figure 10** plots these scores and shows that, despite the lack of significant interaction in the ANOVA, group 50 exhibited substantially greater locomotion than the other four groups in response to the final 1 mg/kg dose. Accordingly, planned comparisons revealed significantly greater mean scores on session 5 in group 50 than in all other groups, *t*(35) > 3.68, *p* < 0.001.

#### *Control for variation in nose poke responding during sucrose training*

Two 5 Group × 2 Session ANCOVAs of locomotor response to 0.5 mg/kg AMPH before and after the sensitization regimen, including total nose pokes during sucrose training with CS present and with CS absent as separate covariates, found no significant effects for either covariate, *F*(1, 32) < 0.44, *p* > 0.51. Two 5 Group × 5 Session ANCOVAs of locomotor response to 1 mg/kg AMPH during the sensitization sessions with total nose pokes (CS present, CS absent) as separate covariates yielded no significant effects of the covariate while the CS was present or absent, *F*(1, 33) < 0.14, *p* > 0.71. Therefore, drug-free approach responding did not account for group differences in locomotor responses to either dose of AMPH.

**FIGURE 10 | Mean (SE) locomotor response (number of beam breaks in an electronic array per 90 min) to 1 mg/kg d-amphetamine (i.p.) on 5 weekly sessions in groups of Lewis rats (***n* **= 8/group) previously exposed to 15 daily conditioning sessions with sucrose reward (10% solution) delivered under 0, 25, 50, 75, or 100% variable schedules.** The conditioned stimulus was a light (120 s). Group 0 received the same number of rewards as group 100 in the absence of conditioned stimuli. ∗*p* < 0.05 for mean difference between group 50 and group 0 as well as group 100, based on planned comparisons.

#### **DISCUSSION**

Sensitization developed to the effects of repeated 1.0 mg/kg amphetamine. The habituation and saline data confirm that this effect was not due to pre-existing differences, expectancy, or stress-related responses to the injection. The ANCOVAs with nose pokes confirm that these effects were not due to drug-free approach behavior. The nose poke data themselves indicated that the groups acquired the association between the CS and prospect of sucrose reward. The groups' rank level of nose-poke responding at the end of training matched the overall frequency of reward under the different schedules from highest (group 100) to lowest (group 0), as it did in experiment 2. The relatively lower overall mean nose poke levels in this experiment compared to experiments 1 and 2 may reflect more selective approach responding to cues for reward in Lewis rats (Kosten et al., 2007).

The 0.5 mg/kg dose data showed that initial locomotor response to AMPH in Lewis rats (**Figure 9**) was somewhat suppressed compared to Sprague Dawley rats (experiment 2; **Figure 5**), but the within-group increase in response to the second dose in Lewis rats was considerable (nearly double the response to the first 0.5 mg/kg dose) following the 5-session AMPH regimen. Most notably, group 50 displayed a greater locomotor response than all groups except group 25 to the second (i.e., post-sensitization) 0.5 mg/kg AMPH dose and a greater locomotor response than all other groups, including group 25, to the final 1 mg/kg AMPH dose (final sensitization session).

#### *Summary analysis of group rankings across experiments*

To determine the reliability of group differences in sensitization, a non-parametric analysis assessed the contingency between group and rank of mean locomotor response to the second (post-chronic AMPH) 0.5 mg/kg dose and the final 1.0 mg/kg dose of AMPH from the 3 experiments. The analysis yielded a significant effect, ϕ = 0.986, *p* = 0.025, reflecting the fact that group 50 ranked first in all but one of the comparisons. The superior rank of group 50 compared to all other groups in response to the second (post-chronic AMPH) 0.5 mg/kg dose is depicted in **Figure 5** (experiment 2) and **Figure 9** (experiment 3). The superior rank of group 50 relative to other groups in response to the final 1.0 mg/kg dose is depicted in **Figure 2** (experiment 1) and **Figure 10** (experiment 3). The only exception to this pattern was the response to the final 1.0 mg/kg dose in Sprague-Dawley rats in experiment 2.
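As a rough intuition check on the rank pattern described above, the following sketch asks how often one pre-specified group would rank first in at least four of five comparisons by chance alone. It is not the contingency (ϕ) analysis reported here. It assumes five groups in every comparison (the design of experiment 1 is not restated in this section), equal chances of ranking first under the null, and independent comparisons, which is only approximate because the same animals contribute to both doses within an experiment. The function and variable names are illustrative.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumptions (not taken from the reported analysis):
n_groups = 5        # groups 0, 25, 50, 75, 100 assumed in every comparison
n_comparisons = 5   # 0.5 mg/kg: experiments 2 and 3; 1.0 mg/kg: experiments 1, 2 and 3
n_first = 4         # group 50 ranked first in all but one comparison

p_first = 1 / n_groups  # chance that a given group ranks first under the null
p_chance = prob_at_least(n_first, n_comparisons, p_first)
print(f"P(first in >= {n_first} of {n_comparisons} by chance) = {p_chance:.5f}")
```

Under these simplifying assumptions the chance probability falls well below 0.01, which agrees in direction with the reported result but should not be read as a reproduction of it.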

#### **GENERAL DISCUSSION**

The present series of experiments tested the hypothesis that chronic exposure to a gambling-like schedule of reward can sensitize brain DA pathways much like chronic exposure to drugs of abuse. Evidence for such an effect would suggest that neuroplasticity, of the same kind thought to contribute to drug addiction, can be induced by chronic exposure to unpredictable reward schedules. In line with the literature on drug addiction, locomotor response to 0.5 and 1.0 mg/kg doses of AMPH indexed DA system reactivity, with greater locomotion in response to later doses operationally defining sensitization (cf. Robinson and Berridge, 1993; Pierce and Kalivas, 1997; Vanderschuren and Kalivas, 2000).

Overall, the results are in line with our hypothesis. However, they also indicate considerable variability in experimental effects due to procedural factors. The effects of conditioning schedule were modest but consistent, with group 50 demonstrating greater response than the other four groups to both doses following the five-dose regimen. Although overall *F*-values for group-related effects in the variance analyses were often non-significant, key group differences were confirmed with pairwise planned comparisons. In this regard it should be noted that, "Current thinking, however, is that overall significance [for *F* in the ANOVA] is not necessary. First of all, the hypotheses tested by the overall test and a multiple-comparison test are quite different, with quite different levels of power. For example, the overall *F* actually distributes differences among groups across the number of degrees of freedom for groups. This has the effect of diluting the overall *F* in the situation where several group means are equal to each other but different from some other mean" (Howell, 1992, p. 338). This is precisely the situation that applied in the present experiments, where group 50 was expected to differ from the group 0 and group 100 controls, but group 25 and group 75 were not predicted to differ from these controls.
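The dilution that Howell describes can be illustrated numerically. The sketch below uses purely hypothetical values (four group means of 100, one of 130, *n* = 8 per group, within-group SD of 25); none of these numbers come from the present experiments, and the function names are illustrative. It computes the omnibus *F*, a single-df contrast of group 50 against the mean of the other four groups, and a pairwise planned comparison of group 50 with group 0 using the pooled error term.

```python
from math import sqrt


def omnibus_f(means, n, ms_within):
    """One-way ANOVA F for equal group sizes, given group means and MS within."""
    k = len(means)
    grand = sum(means) / k
    ss_between = n * sum((m - grand) ** 2 for m in means)
    return (ss_between / (k - 1)) / ms_within


def contrast_f(means, weights, n, ms_within):
    """F (1 df in the numerator) for a planned contrast with equal group sizes."""
    estimate = sum(w * m for w, m in zip(weights, means))
    ss_contrast = n * estimate**2 / sum(w**2 for w in weights)
    return ss_contrast / ms_within


# Hypothetical values for groups 0, 25, 50, 75, 100 (not data from this study)
means = [100.0, 100.0, 130.0, 100.0, 100.0]
n = 8                   # animals per group
ms_within = 25.0 ** 2   # assumed pooled within-group variance

f_all = omnibus_f(means, n, ms_within)
f_50_vs_rest = contrast_f(means, [-0.25, -0.25, 1.0, -0.25, -0.25], n, ms_within)
t_50_vs_0 = sqrt(contrast_f(means, [-1.0, 0.0, 1.0, 0.0, 0.0], n, ms_within))

print(f"omnibus F(4, 35)          = {f_all:.3f}")
print(f"contrast F(1, 35), 50 vs rest = {f_50_vs_rest:.3f}")
print(f"planned t(35), 50 vs 0    = {t_50_vs_0:.3f}")
```

With these made-up numbers, all of the between-group variation lies in the one contrast, so the omnibus *F* is exactly one quarter of the contrast *F* because the same sum of squares is spread over four degrees of freedom. The omnibus value falls short of the conventional 0.05 critical value for *F*(4, 35) (roughly 2.64), whereas the pairwise *t* exceeds the two-tailed critical value for 35 df (roughly 2.03).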

The nose poke data confirmed that, in every experiment, the animals acquired the association between the CS and the prospect of sucrose reward. The correspondence between nose poke frequency for the different groups and overall frequency of reward under their respective training schedules suggests that the average rate of sucrose reward guided drug-free approach responding. However, the lack of mediating effect of nose pokes on group-related locomotor responses to AMPH in the ANCOVAs indicated that separate processes underlie the two behaviors.

In some cases, the effect of conditioning schedule was evident in response to the first AMPH dose; in other cases it only emerged after repeated doses. Group differences in locomotor response to the first AMPH dose suggest that exposure to gambling-like reward schedules is sufficient by itself to induce sensitization. Group differences in locomotion following multiple AMPH doses indicate a more subtle effect that could be characterized as "susceptibility," which only manifests when combined with ongoing exposure to the primary sensitizing agent (i.e., amphetamine).

Differences in the pattern of response across experiments suggest that a longer interval between training and initial AMPH challenge may maximize the opportunity to detect the inherent sensitizing effect of the conditioning treatment. This in turn suggests that effects of conditioned reward exposure may incubate over time, a phenomenon also seen with stimulant sensitization (Grimm et al., 2006). The pattern of response to the two doses of amphetamine suggests that the 0.5 mg/kg dose may be more effective in revealing the effects of conditioning history. This in turn suggests that conditioning effects under the current training protocol are somewhat subtle and may be camouflaged by ceiling effects under doses of AMPH and conditions that generate *de novo* sensitization.

In experiment 3, the biphasic pattern of response to the 0.5 mg/kg doses and progressive emergence of superiority in group 50 is consistent with the expected profile for Lewis rats in response to methamphetamine (Camp et al., 1994). This lends support to the validity of the present findings and suggests overlap between the factors that moderate vulnerability to psychostimulant sensitization and to gambling-like schedules of reward.

Across experiments, the post-sensitization locomotor response of group 50 generally exceeded that of the other groups under different doses of amphetamine and in different strains of animals. However, the high within-group variability and modest between-group effect sizes indicate a role for other factors in DA system reactivity to amphetamine following exposure to varying schedules of conditioned sucrose reward. Although responses of DA neurons to reward signals may provide a coarse model of gambling (Fiorillo et al., 2003), like all models, there is a loss of information for the sake of parsimony, i.e., to demonstrate a key process. As a result, the pattern of effects across CS-US conditions in the original Fiorillo et al. study does not fully generalize to locomotor response to amphetamine. Further refinements of the model are called for to fully capture the aspects of gambling that impact on DA system function.

Taken together, the results of this series of experiments provide provisional support for the hypothesis that chronic exposure to gambling-like schedules of reward enhances the reactivity of the brain DA system to psychostimulant challenge. As such, they extend the findings of Singer et al. (2012) who demonstrated that, relative to a fixed schedule, prior exposure to a variable reinforcement schedule in an operant paradigm enhances subsequent locomotor response to amphetamine. More specifically, the present findings point to uncertainty of reward delivery as the critical factor underlying the effects of variable reward. The magnitude of effects in the operant paradigm was substantially greater than the effects found in the present experiments. This may reflect greater chronic exposure to the gambling-like activity (55 vs. 15 days); alternatively, it may reflect the effects of requiring an operant response to elicit the reward (i.e., a role for agency) rather than passive exposure, as in the present study. Increasing the duration of training in the present paradigm would help to resolve these questions.
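One way to make "uncertainty of reward delivery" concrete is to treat each CS presentation as a Bernoulli trial with reward probability *p* and compare standard uncertainty measures across schedules. The sketch below is an illustration under that assumption, not an analysis from the present experiments. Group 0 is omitted because its rewards were delivered without the CS, so the CS carried no reward probability to evaluate. The function names are illustrative.

```python
from math import log2


def outcome_variance(p):
    """Variance of a Bernoulli reward outcome with probability p."""
    return p * (1 - p)


def outcome_entropy_bits(p):
    """Shannon entropy (in bits) of a Bernoulli reward outcome with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * log2(p) + (1 - p) * log2(1 - p))


# CS-reward probabilities for groups 25, 50, 75 and 100
schedules = [0.25, 0.5, 0.75, 1.0]

for p in schedules:
    print(f"p = {p:.2f}: variance = {outcome_variance(p):.4f}, "
          f"entropy = {outcome_entropy_bits(p):.4f} bits")

most_uncertain = max(schedules, key=outcome_entropy_bits)
print("schedule with maximal uncertainty:", most_uncertain)
```

Both measures peak at *p* = 0.5, are equal for the 25% and 75% schedules, and fall to zero at 100%, which is the sense in which the 50% schedule is the most unpredictable of those used here.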

The validity of variable reward and reinforcement schedules as models of gambling cannot be gleaned from these experiments. Future research that examines the impact of conditioning history on risk-taking behavior in rodent gambling tasks could address this issue. Similarly, the correspondence between the behavioral sensitization found here and the elevated striatal DA response to amphetamine recently found in pathological gamblers must await further investigation (Boileau et al., 2013). Microdialysis could address this question, and the prediction based on the human data would be that greater DA release in the group 50 "gambling phenotype" would be most clearly observed in the dorsal (sensorimotor) striatum rather than the ventral (limbic) striatum. Validation of 50% variable CS + reward exposure in these other paradigms would support its utility as a bona fide experimental model of PG.

Whereas some forms of gambling clearly entail an instrumental response (e.g., slot machines), in other forms of gambling (e.g., lottery) the link between the action (purchasing the ticket, i.e., placing the bet), the cues for reward (i.e., lottery numbers) and the reward itself (the winning number and monetary payoff) is much more diffuse. Nevertheless, activation of DA during the CS-US interval may well occur. This may explain why, when the "winning number" is announced, attention is riveted as each individual lottery ball drops in succession to compose the specific sequence of digits in the winning number. Although the probability of a specific digit occurring is mathematically defined, the outcome for each individual lottery ball is binary—hit (matches the player's number) or miss (does not match the player's number) and the outcome on any given trial is unknown. Such a scenario may better characterize the experience of group 50 in the present experiments, where reward was provided non-contingently but also unpredictably and the CS merely indicated the potential for reward without revealing whether it would occur on a given trial. Slot machines are more strongly linked with PG than are lottery tickets (Cox et al., 2000; Bakken et al., 2009), indicating an important role for instrumental factors (and immediacy) in the rewarding aspects of gambling for this population (Loba et al., 2001). Nonetheless, the Pavlovian process modeled in the present experiments (CS + uncertain reward) appears to be a necessary if not sufficient element of the gambling experience.

Along with the lack of a clear instrumental requirement, a number of other design features may have contributed to the relatively modest and variable pattern of experimental effects. The groups differed in overall sucrose exposure as well as the contingency between CS and sucrose reward. Although this may have contributed to inter-group variability, it cannot readily explain why animals with the greatest sucrose exposure (group 100) displayed less sensitization than group 50. In addition, group 0 received no stimulus before sucrose exposure on every trial. Although this precluded a cue-induced expectation of reward, it did not control for the presence of a stimulus before reward delivery, which existed in all other groups. To address this issue, future research should include a condition where animals receive reward on every trial following random exposure to a neutral stimulus (i.e., whose presence does not signal the potential for reward).

Another design limitation is the potential emergence of adjunctive behavior that could influence the effects of training schedule. In the face of uncertainty, animals may develop superstitious behaviors designed to enhance perceived control and reduce uncertainty-induced DA activation (cf. Harris et al., 2013). It is therefore possible that uncontrolled aspects of the experimental design enabled the animals to offset the effects of conditioning schedule. Such an effect could contribute to the relatively modest and variable response to amphetamine in group 50 following CS + sucrose training. Future research should record spontaneous behavior, aside from nose pokes, during training sessions to test this possibility, and control for it statistically should it emerge. Because such behavior would be expected to counteract or dampen the effects of schedule-induced uncertainty, locomotor response to amphetamine in group 50 should be enhanced when it is controlled (procedurally or statistically). Therefore, the present (uncontrolled) design provides a conservative test of the effects of 50% CS + reward on amphetamine sensitization.

In terms of external validity, the use of male rats also limits the generalizability of the results. The lack of a clear "punishment" condition also differs from gambling, where large monetary losses are common and exert important motivational effects (Nieuwenhuis et al., 2005; Singh and Khan, 2012). The ability to accumulate reward is also absent from the present paradigm and cumulative winnings in a slot machine game have been found to interact with DA manipulations in humans (Tremblay et al., 2011; Smart et al., 2013). Similarly, the opportunity for a jackpot is an important difference between the present model and actual gambling.

Despite these limitations, the present results suggest that 50% variable CS + reward can engage DA pathways implicated in the reinforcing effects of gambling (Fiorillo et al., 2003; Anselme, 2013). Cross-sensitization of response to AMPH following this gambling-like schedule is consistent with a pivotal role for DA in gambling and psychostimulant drug effects (Zack and Poulos, 2009), and extends earlier studies on cross-priming of motivation to gamble by AMPH in pathological gamblers (Zack and Poulos, 2004). The present results also indirectly suggest that modest doses of AMPH, which do not cause supra-physiological DA release, may better model brain activity in response to intermittent reward signals (i.e., during gambling) than exposure to high (i.e., binge-like) doses of stimulant drugs (cf. Vanderschuren and Pierce, 2010). Direct support for this correspondence could be derived by assessing DA release in response to the 50% variable CS-US schedule and different doses of AMPH using microdialysis.

From an experimental standpoint, the present Pavlovian model and the previous operant model of variable reinforcement both appear to engender a phenotype resembling the human pathological gambler. As such, they provide a valuable complement to rodent gambling tasks which model gambling behavior (as a dependent measure) but have, until now, only employed healthy animals, the equivalent of human social gamblers. Based on the literature, the animals chronically exposed to variable reward may well differ in these tasks, particularly in response to DA-ergic drugs. Combining the rat gambling phenotype with gambling tasks may permit systematic development of medications for the treatment of PG, which might not be fully accomplished with healthy animals alone. Further refinements in the experimental design and training regimen, as described above, should improve the correspondence between animals trained in this paradigm and actual pathological gamblers.

From the clinical-sociological standpoint, the finding that exposure to 50% variable CS + reward, which closely matches the reward schedule on a commercial slot machine (Tremblay et al., 2011), changes the brain DA system in reliable and enduring ways suggests that, in some cases, gambling activity, like drugs of abuse, may be a "pathogen" capable of causing addiction. However, the modest effect size and high variability in response to 50% CS + reward suggest that, like drugs of abuse, the tendency for gambling-like reward schedules to promote addiction will depend greatly on the pre-existing risk profile of the gambler. Nevertheless, to spare high risk individuals exposure to potential adverse gambling-related effects, it seems reasonable that policies applied to deter use and minimize harm from drugs of abuse could be extended to gambling as well.

#### **ACKNOWLEDGMENTS**

This research was funded by grants from The Natural Sciences and Engineering Research Council of Canada to Paul J. Fletcher. We sincerely thank Ms. Djurdja Djordjevic for preparing the figures.

#### **REFERENCES**


Singh, V., and Khan, A. (2012). Decision making in the reward and punishment variants of the Iowa gambling task: evidence of "foresight" or "framing"? *Front. Neurosci.* 6:107. doi: 10.3389/fnins.2012.00107


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2013; accepted: 23 January 2014; published online: 11 February 2014.*

*Citation: Zack M, Featherstone RE, Mathewson S and Fletcher PJ (2014) Chronic exposure to a gambling-like schedule of reward predictive stimuli can promote sensitization to amphetamine in rats. Front. Behav. Neurosci. 8:36. doi: 10.3389/fnbeh.2014.00036*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Zack, Featherstone, Mathewson and Fletcher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

Skinner, B. F. (1953). *Science and Human Behavior*. New York, NY: Free Press.

## Nonhuman gamblers: lessons from rodents, primates, and robots

#### **Fabio Paglieri<sup>1</sup>\*, Elsa Addessi<sup>1</sup>, Francesca De Petrillo<sup>2</sup>, Giovanni Laviola<sup>3</sup>, Marco Mirolli<sup>1</sup>, Domenico Parisi<sup>1</sup>, Giancarlo Petrosino<sup>1</sup>, Marialba Ventricelli<sup>2</sup>, Francesca Zoratto<sup>3,4</sup> and Walter Adriani<sup>3</sup>\***

<sup>1</sup> Goal-Oriented Agents Lab (GOAL), Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche (ISTC-CNR), Rome, Italy

<sup>2</sup> Department of Environmental Biology, University of Rome "La Sapienza", Rome, Italy

<sup>3</sup> Department of Cell Biology and Neurosciences, Istituto Superiore di Sanità, Rome, Italy

<sup>4</sup> Bambino Gesù Children's Hospital IRCCS, Rome, Italy

#### **Edited by:**

Patrick Anselme, University of Liège, Belgium

#### **Reviewed by:**

Francesca Cirulli, Istituto Superiore di Sanità, Italy Alicia Izquierdo, University of California, Los Angeles, USA

#### **\*Correspondence:**

Fabio Paglieri, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche (ISTC-CNR), Goal-Oriented Agents Lab (GOAL), Via S. Martino della Battaglia 44, 00185 Rome, Italy e-mail: fabio.paglieri@istc.cnr.it Walter Adriani, Department of Cell Biology and Neurosciences, Istituto Superiore di Sanità, Viale Regina Elena 299, 00185 Rome, Italy e-mail: walter.adriani@iss.it

The search for neuronal and psychological underpinnings of pathological gambling in humans would benefit from investigating related phenomena also outside of our species. In this paper, we present a survey of studies in three widely different populations of agents, namely rodents, non-human primates, and robots. Each of these populations offers valuable and complementary insights on the topic, as the literature demonstrates. In addition, we highlight the deep and complex connections between relevant results across these different areas of research (i.e., cognitive and computational neuroscience, neuroethology, cognitive primatology, neuropsychiatry, evolutionary robotics), to make the case for a greater degree of methodological integration in future studies on pathological gambling.

**Keywords: pathological gambling, risk sensitivity, uncertain reward, animal models, nonhuman primates, neurocomputational models, evolutionary models**

#### **INTRODUCTION**

Gambling can be defined as betting money, or other equivalent goods, upon the future outcome of an event which presents a high degree of uncertainty, with a view to winning a prize. Winning is mainly (or exclusively) due to chance and not much (or not at all) to individual abilities. While betting may represent a recreational activity for the majority of people, it may become a serious behavioral disorder for others (Petry et al., 2005). The rapid worldwide growth of legalised gaming opportunities (Wilber and Potenza, 2006; McCormack et al., 2012; Donati et al., 2013), including the increasing possibility of online gambling through the Internet, has raised concerns over the impact of exaggerated gambling and its detrimental consequences on public health (Shaffer and Korn, 2002; Carragher and McWilliams, 2011). Thus, due to the increasing number of affected people, pathological gambling represents a growing concern for society.

In fact, this behavior is clinically characterized as a pathology: in DSM-IV-TR (American Psychiatric Association, 2000), it was described as a persistent, recurrent and maladaptive behavior, which disrupts personal, family, professional or vocational pursuits (Potenza, 2001). The personal and social consequences of this disorder often include job loss, family problems and divorce, financial and legal problems, and criminal behavior (Lowengrub et al., 2006). Pathological gambling affects 0.2–5.3% of adults in Western societies (Bastiani et al., 2013) and is highly comorbid with a range of other psychiatric disorders, such as attention-deficit/hyperactivity disorder (ADHD), other impulse-control disorders, and obsessive-compulsive disorders (Hollander et al., 2005), and with substance abuse (Petry et al., 2005; Hodgins et al., 2011). Some pathological features of gambling are similar to those of drug addiction, such as the need to gamble increasing amounts of money (escalation) in order to achieve the desired excitement or "rush" (tolerance), the irritability that accompanies the abstention from the activity (withdrawal), and the failure of attempts to control or stop the behavior (loss of control). Notably, whilst pathological gambling has been classified until recently (in DSM-III and DSM-IV) among the "Impulse-Control Disorders Not Elsewhere Classified", it has been turned into a "no substance addiction" in DSM-V (American Psychiatric Association, 2013), that is, a "behavioral addiction". Pathological gambling is also associated with increased suicidal ideation and attempts compared to the general population: approximately one out of five pathological gamblers attempts suicide (Volberg, 2002). Such rates among pathological gamblers are higher than for any other addictive disorder. Thus, gambling is a public concern, being both a social and a psychiatric issue.

Far from being only an adult concern, gambling is also becoming a serious behavioral problem among adolescents (Cunningham-Williams and Cottler, 2001; Dickson et al., 2002), whose involvement has increased substantially over the past 20 years (Huang and Boyer, 2007). Epidemiological studies show that the prevalence of pathological gambling is 2–4 times higher among adolescents than among adults, with 3.5–8.0% of adolescents meeting the criteria for such pathology (Felsher et al., 2004; Ellenbogen et al., 2007; Hodgins et al., 2011; Caillon et al., 2012). Adolescence and young adulthood may be periods of especially heightened vulnerability for the development of gambling disorders, which are therefore receiving increasing attention from clinicians and preclinical researchers (Jazaeri and Habil, 2012; Zoratto et al., 2013).

The etiology of pathological gambling is multi-factorial; both genetic (e.g., a polymorphism in the serotonin transporter gene; Ibanez et al., 2003) and socio-environmental (e.g., Donati et al., 2013; Potenza, 2013) risk-factors have been identified. Moreover, cognitive models of gambling argue that irrational beliefs and erroneous perceptions may play a key role (Reid, 1986; Clark, 2010). Indeed, some authors argue that expectancies of winning, illusions of control, and subsequent entrapment do contribute to the development and the maintenance of gambling patterns (Joukhador et al., 2003). Psycho-genetic studies have revealed that, among genes involved in altered serotonergic and dopaminergic neurotransmission, the most significant for pathological gambling are serotonin transporter (SERT; Ibanez et al., 2003; Reuter et al., 2005) and dopamine transporter (DAT; Comings et al., 2001).

Methods for treating pathological gambling include various counselling-based approaches and pharmacological therapy, although no drug has been officially approved by the U.S. Food and Drug Administration (FDA) for the specific treatment of pathological gambling. Therefore, in pathological gamblers, drugs are mainly prescribed for the treatment of the comorbid conditions and not for the pathology itself (Hollander et al., 2005). Pathological gamblers respond well to treatment with selective serotonin reuptake inhibitors (SSRIs, particularly paroxetine; Kim et al., 2002), mood stabilizers, and opioid antagonists (such as nalmefene), which are commonly used in the treatment of alcoholism (for a review, see Lowengrub et al., 2006).

In view of the growing incidence of pathological gambling, its severe mental and social consequences, and the still preliminary nature of its treatment, it is urgent to mobilize various approaches and methods to further deepen our understanding of the neuronal and psychological underpinnings of this condition. Indeed, the present Research Topic constitutes an important and timely initiative towards that end. The contribution we offer in this review concerns how evidence obtained on nonhuman subjects is crucial for investigating pathological gambling in humans. In particular, we make the case for studying three widely different populations of agents: rodents (Section Rodents as an Animal Model of Gambling Behavior), nonhuman primates (Section Risky Choices in Nonhuman Primates: Implications for Human Pathological Gambling), and robots (Section Risk Attitudes, Environmental Uncertainty and Addictive Behavior: Perspectives From Computational Neuroscience and Evolutionary Robotics). While each of these populations offers valuable insights on the topic, their true worth is revealed only by looking at how they relate to each other. Hence we will review the literature across all these areas of research (i.e., cognitive and computational neuroscience, neuroethology, cognitive primatology, neuropsychiatry, evolutionary robotics), with the aim of showing the need for greater methodological integration in future studies on laboratory modeling of pathological gambling.

#### **RODENTS AS AN ANIMAL MODEL OF GAMBLING BEHAVIOR**

In the field of behavioral neuroscience, animal models enable the investigation of brain-behavior relations under controlled conditions (e.g., standardized housing and testing), with the aim of gaining insight into normal and abnormal human behavior and its underlying neural, psychobiological and neuroendocrinological processes (van der Staay, 2006). In particular, they are well suited for the dissection of precise mechanisms involved in decision-making processes, for the analysis of inter-individual differences with a tight control of environmental and genetic conditions, and for follow-up studies (de Visser et al., 2011). As we shall see in what follows, these considerations also apply to the study of gambling behavior, and especially to the use of rodents (mostly rats) as an animal model for risk proneness (e.g., Adriani et al., 2009, 2010).

#### **ASSESSMENT OF GAMBLING PRONENESS: CLINICAL AND PRECLINICAL APPROACHES**

In humans, Probability Discounting can be studied by means of either questionnaires or operant paradigms. The "South Oaks Gambling Screen" (for adults, Lesieur and Blume, 1987; for adolescents, Wiebe et al., 2000), the "Gambling Attitudes and Beliefs Survey" (Strong et al., 2004) and the "Canadian Problem Gambling Index" (Young and Wohl, 2011) are some examples of personality tests and reports that are widely used in clinical psychology and experimental research. In these protocols, gamblers are characterized with scores that represent their averaged behavior over periods of weeks, months or years, whilst the time spans that most naturally correspond to the expression of gambling behavior are those of seconds, minutes or hours. The main limitation of these traditional methods is therefore the lack of an appropriate temporal dimension (van den Bos et al., 2013). By contrast, controlled experimental or clinical paradigms such as the "Iowa Gambling Task" (IGT; Bechara et al., 1994), the "Balloon Analogue Risk Task" (Lejuez et al., 2002) and the "Probability Discounting Task" (e.g., Scheres et al., 2006; Shead and Hodgins, 2009) overcome this limitation. However, as extensively discussed in van den Bos et al. (2013), they are characterized by a second limitation, i.e., the lack of appropriate context due to the artificial conditions of a laboratory environment. It should also be noted that these paradigms can be performed either with real rewards over limited time intervals (e.g., minutes, hours) or with questions about hypothetical ones (e.g., huge amounts of money) over months or years.

Due to the complexity of human studies, preclinical investigations in laboratory animal models are necessary for a deeper understanding of pathological gambling. Specifically, it is relevant to exploit preclinical models for (i) the symptoms; (ii) their neurobiological determinants; and (iii) their possible modulation by pharmacological manipulation. These studies are crucial as they allow the dissection of processes and factors associated with normal and pathological gambling in a controlled way (de Visser et al., 2011; Winstanley et al., 2011; Koot et al., 2012). Furthermore, animal models have added value from a translational perspective because they permit approaches that are virtually impossible with humans, as in the case of *in vivo* transgenic approaches that make it possible to directly reach and modulate the expression of target genes in relevant brain areas (Adriani et al., 2010).

Many operant paradigms have been developed to study tolerance to uncertainty and/or gambling proneness in animal models (Mobini et al., 2000; Cardinal and Howes, 2005; Adriani et al., 2006; Wilhelm and Mitchell, 2008; Winstanley et al., 2011). Specifically, by exploiting uncertainty of reward delivery, these tasks probe individual (in)tolerance to frustration, linked to missing an anticipated reward (i.e., the "loss"). The "IGT" involves the choice between a low probability of a large reward vs. a high probability of a small food reward (van den Bos et al., 2006). The "Probabilistic-Delivery Task" (PDT; which belongs to the broader category of Probability Discounting) is based on a choice between either a certain, small amount of food reward or larger amounts delivered (or not) depending on a given (and progressively decreasing) probability (Adriani and Laviola, 2006; Adriani et al., 2006). The "Risky Decision-Making Task" (RDT) implies the choice between a small, "safe" food reward and a larger food reward associated with the risk of punishment (e.g., footshock; Simon et al., 2009). The "rodent Slot Machine Task" (rSMT) evaluates whether the experimental subject discriminates a complete signal (e.g., three lights turned on, indicative of a win) from a nearly complete one (e.g., two lights out of three, indicative of a loss): by means of this task, it has recently been demonstrated that rats are susceptible to putative-win signals in non-winning trials (Winstanley et al., 2011; Cocker et al., 2013). Such a phenomenon might resemble the so-called "near-miss effect", one of the cognitive distortions regarding gambling outcomes that are thought to confer vulnerability to pathological gambling (Reid, 1986; Clark, 2010; see also Section Normative (Algorithmic) Models).

Notably, the "IGT" and the "Probability Discounting Task" are widely used in experimental or clinical research on humans. Obviously, when performed on animals, these paradigms involve real, ethologically relevant rewards over limited time intervals. Symbolic rewards (such as money in humans) or time intervals longer than a few hours cannot be used. Moreover, the contrast between alternative rewards (e.g., small vs. large) cannot be as marked as would be needed to mimic the 1000-fold prizes available to humans. In these tasks, in which a moderate food restriction is usually applied to increase subjects' motivation to work for food delivery, the rewards' magnitude must be accurately calibrated in order to (i) allow animals to eat enough food; (ii) prevent them from being fully satiated; and (iii) enable them to discriminate between rewards. The first aspect is especially relevant in "closed" (compared to "open") economies, in which subjects have to obtain all their daily food from the operant panels and no extra food is given at the end of each experimental session (Timberlake and Peden, 1987; Zoratto et al., 2012). The second one is necessary to avoid a potential recovery from the consequences of the food loss (occurring because of the probabilistic delivery). The last one can be crucial for the establishment of basal preference in developing rats (Zoratto et al., 2013). We have recently shown that a high contrast between rewards (one pellet vs. five pellets instead of two pellets vs. six pellets) and a high probability initially associated, during training, with the large reward (66% instead of 50%) are essential to shorten the overall testing period: namely, far fewer sessions are required for the development of baseline large-reward preference (which is otherwise slow in young animals). This is of paramount importance to overcome the developmental constraint associated with the short duration of the adolescent phase (Laviola et al., 2003).

These operant-behavior tasks involve a series of discrete decisions between two reward alternatives (Adriani et al., 2012a). In terms of automation, the experimental apparatus requires two alternative *operanda* (e.g., levers or nose-poking holes, where the animal can express its choice), and computer-controlled delivery of reinforcers (e.g., food or liquids) that differ in size and actual probability of delivery (uncertainty). Other important features of the task are inherent to the trial/session schedule. For instance, the total number of choice opportunities (i.e., trials) given to the subject may be fixed (i.e., the session ends after the last trial) and independent of the total time needed to complete the task. Alternatively, the total duration of the experimental session may be fixed (minutes, hours) and thus independent of the total number of trials actually completed within such a time window (Koot et al., 2012).
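The difference between the two session schedules can be made concrete with a minimal sketch. The function name `run_session`, its parameters, and the trial durations of the two animals are hypothetical and chosen only for illustration; they are not taken from any of the cited protocols.

```python
from typing import Iterable, Optional, Tuple


def run_session(
    trial_durations_s: Iterable[float],
    max_trials: Optional[int] = None,
    max_seconds: Optional[float] = None,
) -> Tuple[int, float]:
    """Return (completed trials, elapsed seconds) for one session.

    max_trials models a fixed-trial schedule, max_seconds a fixed-duration one.
    """
    if max_trials is None and max_seconds is None:
        raise ValueError("set max_trials, max_seconds, or both")
    completed, elapsed = 0, 0.0
    for duration in trial_durations_s:
        if max_trials is not None and completed >= max_trials:
            break
        if max_seconds is not None and elapsed + duration > max_seconds:
            break  # the next trial would not finish within the time window
        elapsed += duration
        completed += 1
    return completed, elapsed


# Hypothetical animals: one completes a trial every 10 s, the other every 30 s.
fast, slow = [10.0] * 100, [30.0] * 100

# Fixed number of trials: same trial count, different session length.
fixed_trials = [run_session(animal, max_trials=20) for animal in (fast, slow)]

# Fixed duration: same session length, different trial count.
fixed_time = [run_session(animal, max_seconds=300.0) for animal in (fast, slow)]
```

Under these assumptions, a fixed-trial schedule equates the number of choices across subjects but lets session length vary, whereas a fixed-duration schedule does the opposite, so slower animals contribute fewer choices to the analysis.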

The protocols reviewed above probe animals for the balance between "innate, sub-cortical" drives and "evolved, cortical" processes (Adriani and Laviola, 2009). In other words, these operant tasks evaluate a cognitive ability, i.e., the ability to inhibit sub-cortical drives and to express a more controlled response. Self-control is known to require intact serotonergic function (Wogar et al., 1993; Harrison et al., 1997; Puumala and Sirvio, 1998; Dalley et al., 2002), especially within the prefrontal cortex (McClure et al., 2004; Ridderinkhof et al., 2004) and its cortico-striatal projections (Cardinal et al., 2004; Christakou et al., 2004).

#### **THE PROBABILISTIC-DELIVERY TASK (PDT)**

The "PDT" (Mobini et al., 2000; Adriani and Laviola, 2006) involves a larger but probabilistic reinforcer which is randomly withheld by the feeding device, and delivered only occasionally, so that experimental subjects face a "loss". The "losses" that progressively accumulate over time clearly reduce the long-term payoff. Such a task also provides information reflecting the ability to cope with irregularly delivered, randomly missing reinforcement. We have recently shown that laboratory rodents are not only tolerant of this random delivery, but are also sub-optimally attracted by this probabilistic uncertainty (Adriani and Laviola, 2006, 2009). Indeed, if the very frequent omission of food delivery is masked by the same cue (e.g., a light flash) that normally accompanies occasional food delivery, this cue may come to act as a secondary reinforcer. As such, as in second-order schedules, this conditioned stimulus may sustain continued responding for the large/uncertain reward, even though this implies decreased overall foraging in the long term. Gambling proneness may thus be sustained by the cue-induced secondary reward, which renews in the subject the expectation of an eventual delivery of the binge reward (Adriani and Laviola, 2006, 2009). Translated to human subjects, this would suggest that it is the thrill (associated with whatever physical stimuli accompany both successful and unsuccessful gambling experiences) that sustains the motivation to gamble, in spite of abysmal odds and past (mostly negative) experience: watching the ball spin madly on the roulette wheel and waiting for the crucial card to be turned, with a mix of hope for success and fear of loss, become rewarding in themselves, and it is in view of these (certain) rewards that people start enjoying gambling activities. As long as the individual can keep the desire under control, there is nothing wrong with these activities in themselves. However, in vulnerable individuals, a loss of control over these activities may eventually intervene: pathological gamblers keep on gambling as this compulsive "urge" becomes a strong habit, not unlike other kinds of addiction (van den Bos et al., 2013).

#### **Methodological remarks on the probabilistic-delivery task (PDT)**

A theoretical framework has recently been formulated to interpret the performance of laboratory rats in this kind of two-choice task (Adriani and Laviola, 2006). Specifically, a landmark in the PDT protocol is the "indifference" point, i.e., the specific level of uncertainty at which the animals can choose either option freely with no effect on the overall economic convenience. As an example, if the large reward is five times the size of the small one, then the indifference point is at *p* = 20%, because a five-fold reward delivered on one trial out of five yields, on average, the same amount as the small reward delivered on every trial. Therefore, once the "indifference" point is established, the informative range of *p* values is easily recognized as the one beyond the indifference point (i.e., 20% > *p* > 0%), where economic benefit (i.e., maximization of payoff) is attained unequivocally by repeatedly choosing the small-reward option. Thus, to maximize the payoff, subjects should be flexible enough to abandon their innate large-reward preference. As optimal performance in terms of benefit takes the form of a choice-shift towards the small reward, this requires a self-control effort in order to overcome the "innate drive" that makes the large reward attractive (Adriani et al., 2006). By contrast, a sustained preference for the large reward denotes "temptation by risk".
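The arithmetic behind the indifference point can be sketched in a few lines. This is only an illustration under the assumption that payoff is measured as expected reward per trial; the function names are hypothetical and do not come from the cited papers.

```python
import math


def indifference_probability(small: float, large: float) -> float:
    """Probability at which the uncertain large reward pays, on average,
    as much as the certain small reward (p * large == small)."""
    if small <= 0 or large <= small:
        raise ValueError("expected 0 < small < large")
    return small / large


def payoff_maximizing_option(small: float, large: float, p: float) -> str:
    """Option with the higher expected reward per trial at probability p."""
    expected_large = p * large
    if math.isclose(expected_large, small):
        return "indifferent"
    return "large" if expected_large > small else "small"


# Five-fold ratio between rewards, as in the example in the text.
p_star = indifference_probability(small=1, large=5)  # 0.2, i.e., 20%
```

For instance, with one pellet vs. five pellets, `payoff_maximizing_option(1, 5, 0.10)` returns `"small"`: below the indifference point, a sustained preference for the large reward is sub-optimal in terms of payoff.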

In this kind of two-choice task, details of the schedule can be calibrated appropriately (Adriani and Laviola, 2009), so that one option leads to "optimal" benefit (i.e., the raw convenience in terms of quantitative foraging or any other measurable revenue), while the other provides an "affective" benefit, with a more emotional outcome (i.e., better feeling and/or avoidance of adverse mood). In brief, to run a protocol providing useful information, any "inner drive" of interest (e.g., gambling proneness) must push animals into a choice that necessarily leads to a sub-optimal outcome. Self-control is then defined as the ability to effect an optimal response (Stephens and Anderson, 2001) by directing choices onto the opposite *operandum* (nose-poking hole or lever to press). The protocol must never load both instances (i.e., the inner drive and the optimal payoff) on the same *operandum*, because it would then be impossible to discriminate whether any preference for that *operandum* is due to payoff-detecting processes ("economical efficiency") or to the "inner drive" itself.

#### **Probabilistic-delivery task (PDT) at very low probability levels**

Many factors can act together to push animals towards a suboptimal preference for a large reward, even though this is delivered quite rarely. One factor is insensitivity to risk, whereby the subjects are unable (i) to represent the uncertainty of the outcome (usually, they should anticipate that the reward is not guaranteed, which acts as a source of aversion immediately before choice) or (ii) to perceive the punishment of "losses" (represented by the random and frequent omission of reward delivery).

Another factor is habit-induced rigidity, under which the subject seems to behave according to a well consolidated strategy. Such a form of inflexibility may be due to a failure of negative reinforcement, namely to a lack of adaptation and feedback reaction to the aversion (for an anticipated "unsure" prize) and/or to the punishment (due to an actually "omitted" prize) just described.

A third factor is temptation to gamble, whereby the motivational impact of the reward magnitude ("bingeing") seems to monopolize the subject's attention over any other reward feature. It is also possible that risk of punishment under conditions of uncertainty becomes attractive as a secondary conditioned feature, because the "binge" reward (eventually delivered) may well generate an overwhelming peak of positive reinforcement. The latter could extend a secondary rewarding property to all cues and surrounding stimuli that predict uncertain features. Whichever of these factors prevails in the PDT and in similar tasks, the sub-optimal preference for a big, rare reward is taken as an index of "gambling proneness" (namely, the innate attraction for a "rare but binge" event).

#### **"RISK OF LOSING" vs. "FAILING TO WIN"**

A crucial component of human gambling is the "risk of losing", that is, "the resources staked on a favorable outcome are lost when a wager is unsuccessful" (Zeeb et al., 2009). This is distinct from "failing to win", that is, the absence of any additional gain, causing a "frustration" but only compared to one's expectation.

Most paradigms of risky decision-making (Mobini et al., 2000; Cardinal and Howes, 2005; Adriani and Laviola, 2006; van den Bos et al., 2006) deal exclusively with "failing to win": i.e., complete omission of reward delivery, or delivery of an unpalatable reward. Thus, there is frustration of an expectation but no risk of "negative payoff", i.e., of finishing the session at a disadvantage compared with the start. In other words, every unsuccessful trial is an "unlucky event" but not necessarily a "risk". Therefore, while the attraction for uncertain reward may resemble the features of "gambling proneness", it does not necessarily fit the construct of "risk proneness" (on this point, see Anselme, 2012). It should also be noted that "uncertainty" and "risk" are not synonymous:<sup>1</sup> indeed, the PDT and similar tasks do offer a stochastic lack of success, which may even act as a "punishment", but not necessarily a "risk", which would require a construct implying a potential for overtly adverse consequences (e.g., footshock).

Recently, however, choice behavior has also been studied in a setting where a greater reward was associated with the probability of an overtly adverse event (i.e., the "risk"), represented by a foot shock (Simon et al., 2009). This can represent a promising methodological refinement of paradigms tailored for gambling proneness, although its ethical implications (especially when dealing with non-human primates) should be carefully evaluated.

Another attempt to deal with this issue is represented by the "Rat gambling task" (rGT; Zeeb et al., 2009). In this task, subjects have a limited amount of time to maximize the number of pellets earned, and loss is signaled by punishing timeouts during which reward cannot be obtained. On each trial, animals can choose from four options, each associated with a different number of sugar pellets; each subject then receives either the associated reward or a punishing timeout. Larger reward options are associated with a higher chance of longer timeouts, resulting in less reward earned overall per session. To maximize their earnings, rats must learn to avoid these risky options.
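The trade-off between reward size and timeouts in such a task can be sketched numerically. All numbers below (pellets, win probabilities, timeout lengths, the 5-s trial time and the 30-min session) are hypothetical placeholders, chosen only to reproduce the qualitative pattern described above; they are not taken from the text, and the names `Option` and `pellets_per_second` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    pellets: int      # reward size on a winning trial
    p_win: float      # probability of receiving the reward
    timeout_s: float  # punishing timeout on a losing trial


def pellets_per_second(opt: Option, trial_s: float = 5.0) -> float:
    """Expected pellets earned per second when always choosing `opt`.

    trial_s is an assumed fixed time cost of every trial.
    """
    expected_pellets = opt.p_win * opt.pellets
    expected_time = trial_s + (1.0 - opt.p_win) * opt.timeout_s
    return expected_pellets / expected_time


# Hypothetical option set: larger rewards carry rarer wins and longer timeouts.
OPTIONS = [
    Option("P1", pellets=1, p_win=0.9, timeout_s=5.0),
    Option("P2", pellets=2, p_win=0.8, timeout_s=10.0),
    Option("P3", pellets=3, p_win=0.5, timeout_s=30.0),
    Option("P4", pellets=4, p_win=0.4, timeout_s=40.0),
]

ranking = sorted(OPTIONS, key=pellets_per_second, reverse=True)
best = ranking[0]

# Expected pellets over an assumed 30-min (1800-s) session for each option.
session_pellets = {o.name: pellets_per_second(o) * 1800.0 for o in OPTIONS}
```

With these placeholder values, the two largest-reward options yield the fewest pellets per session because the time lost to timeouts outweighs the larger reward, which is the sense in which the tempting options are "risky" in a time-limited session.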

#### **THE ECOLOGICAL VALIDITY OF ANIMAL MODELS OF HUMAN (PATHOLOGICAL) GAMBLING**

Classically, the performance of laboratory animals on tasks tailored for gambling proneness is investigated by placing the animals (in most cases laboratory rodents, primarily rats) individually in operant chambers for a short daily session (Evenden and Ryan, 1996, 1999; Mobini et al., 2000, 2002; Adriani et al., 2009). Thus, differences across laboratories in working environments and in human interventions (e.g., handling and transport to a novel testing room) may compromise the reliability and reproducibility of behavioral data (Crabbe et al., 1999; Wahlsten et al., 2003).

Therefore, for the ecological validity of animal models of human (pathological) gambling, it is critical to address some crucial issues (van den Bos et al., 2013). Firstly, confounding factors such as stress due to handling, facing a new environment and social isolation should be avoided (e.g., de Visser et al., 2006; Spruijt and de Visser, 2006; Koot et al., 2009, 2012; Zoratto et al., 2013). Secondly, the level of task automation should be increased, since the involvement of the experimenter during testing procedures (and for scoring behavior) may be difficult to standardize: indeed, results may often vary strongly between laboratories (Crabbe et al., 1999; Chesler et al., 2002). Thirdly, tasks incorporating a social component should be used, to assess the impact of social factors on gambling proneness. It is well known, indeed, that the social environment in humans may have an undeniable effect on the development and maintenance of pathological gambling. Finally, innovative tasks should be developed that allow the investigation of the normal time budget (and its potential disruptions) devoted to social interaction, foraging, and other activities. This aspect, which is as yet unexplored in animal models, would be highly relevant. The goal is to identify an altered time budget possibly analogous to the disruption of personal, professional or financial life that is widely reported in human pathological gamblers (DSM-IV-TR, American Psychiatric Association, 2000; Potenza, 2001).

To address the issues mentioned above, different automated social home-cage systems have recently been developed for permanent monitoring of subjects' operant choices and spontaneous (social and non-social) behavior (e.g., Adriani et al., 2012b). For instance, the Home-Cage Operant Panels (HOPs, PRS Italia) are new low-cost computer-controlled operant panels (Koot et al., 2009), which can be placed inside the home-cage, enabling rodents to operate them 24 h/day. It is particularly interesting to run operant-choice tasks during adolescence (Adriani and Laviola, 2003; Adriani et al., 2004), but social deprivation during this ontogenetic period may produce changes in reward sensitivity (Van den Berg et al., 1999), as well as psychotic-like symptoms (Leussis and Andersen, 2008). To solve this problem, Zoratto et al. (2013) recently developed a considerable methodological improvement that allows adolescent rats to be tested in the home-cage with a task tailored for gambling proneness, while living socially and within the limited span of this developmental phase.

#### **RISKY CHOICES IN NONHUMAN PRIMATES: IMPLICATIONS FOR HUMAN PATHOLOGICAL GAMBLING**

Laboratory studies in nonhuman primates can inform the research on human pathological gambling in at least four ways. First, the behavioral tasks employed in laboratory rodents (see Section The Probabilistic-Delivery Task (PDT)) may be implemented in non-human primates for studying the psychobiological bases and evolutionary roots of human gambling behavior. Second, the comparison of risk preferences between phylogenetically closely related nonhuman primate species with different ecologies can shed light on the selective pressures that shaped decision-making under risk in the course of evolution. Third, the study of how nonhuman primates make decisions under risk may provide important information on the contextual and social factors determining the occurrence of similar risky choices in humans. Fourth, since nonhuman primates are our closest relatives, but are not constrained by the socio-cultural system of beliefs and attitudes that characterizes humans, their study may make it possible to assess whether biases in decision-making under risk emerged before the human lineage diverged from the other primates, or whether they are a more recent, and possibly culturally determined, acquisition.

As noted above (see The Ecological Validity of Animal Models of Human (Pathological) Gambling), in studies with nonhuman primates, the term "risk" is typically understood as the frustration of a positive expectation (failure to receive a reward), rather than as the occurrence of a negative event (a loss of valuable resources, or the infliction of physical damage). This happens because the second type of "risk" cannot be implemented in nonhuman primate experiments, mostly due to ethical considerations. However, it is clear that nonhuman primates are also exposed, in their own environment, to "true" risks of the second type (e.g., predation). Note that, in humans, the risks involved in pathological gambling include the loss of job, family and social reputation; in a laboratory model, the appropriate meaning of "risk" should therefore encompass the possibility of overtly adverse outcomes as a consequence of "high stakes". In any case, a comparative approach has much to offer to our understanding of human attitudes towards such "high stakes risks", once appropriate methodologies for studying them are developed.

<sup>1</sup>Another common way of distinguishing between risk and uncertainty is in terms of how measurable the odds are: Knight (1921) proposed to consider as "risky" those choices where the odds are measurable and known to the subject, whereas the term "uncertainty" should be reserved for probabilistic outcomes with unknown odds. While this distinction has become canonical in behavioral economics (e.g., Camerer and Weber, 1992; Tversky and Kahneman, 1992), its application to animal studies is highly problematic, due to obvious difficulties in establishing how much the odds are known (that is, precisely understood and quantitatively assessed) by experimental subjects.

#### **THE PROBABILISTIC-DELIVERY TASK (PDT) IN THE COMMON MARMOSET**

The behavioral tasks mentioned in Section Rodents as an Animal Model of Gambling Behavior, used to focus on particular gambling-related aspects, are classically performed in laboratory rodents, primarily rats. However, the implementation of these tasks in species other than rats (that is, non-human primates) may be relevant for studying the psychobiological bases and evolutionary roots of human gambling behavior. Moreover, very little is known about the possibility of running such tasks by means of automated operant panels. This possibility is especially relevant with a view to increasing the ecological validity of these models (see above). The HOPs, originally developed for rodents, have recently been adapted to small non-human primates like the common marmoset (*Callithrix jacchus*; Adriani et al., 2013). In this recent experiment, in which the *operandum* was adapted, for example, into hand-poking holes, we showed that HOPs can be reliably exploited to model operant-choice behavior in a delayed-reward setting. The aim of future studies will be to evaluate marmosets as possible models for gambling behavior, using a PDT and drawing a comparison with rats.

#### **THE "ECOLOGICAL RATIONALITY" OF RISK PREFERENCES**

According to normative economic models, mainly formulated in mathematical terms, rational decision makers should be indifferent when choosing between a safe option and a risky option leading on average to the same payoff (e.g., von Neumann and Morgenstern, 1947). In practical terms, this means that a rational decision maker has no reason to prefer either option when offered a choice between, e.g., a certain, small reward and an uncertain, larger one whose size is five-fold and whose probability of delivery is *p* = 20% (i.e., at the indifference point). However, neither human nor nonhuman animals resemble such a "rational" entity, as their instinct guides their choice towards some kind of preference: they are generally risk-averse for gains (e.g., Kahneman and Tversky, 1979; Kacelnik and Bateson, 1996), with the notable exception of nonhuman primates, for which the picture is more complicated (Stevens, 2010). To explain this pattern of behavior, it has been proposed that risk-related preferences could reflect the environments in which species evolved and, in particular, their feeding ecology (Heilbronner et al., 2008), leading to "ecologically rational" decisions (Gigerenzer and Todd, 1999). In order to test this ecological hypothesis, risk preferences were compared in phylogenetically closely related primate species employing two main paradigms.

In the simplest paradigm, the subject is given a series of choices between two options: the "safe" option yields a reward that is constant in amount, whereas the "risky" option yields a reward that varies probabilistically around the mean, with the two options leading on average to the same payoff. Individuals' attitude towards risk is inferred on the basis of their preference for the safe option (indicating risk aversion), for the risky option (indicating risk seeking) or for neither option (indicating risk neutrality) (Kacelnik and Bateson, 1996, 1997). Bonobos (*Pan paniscus*) and chimpanzees (*Pan troglodytes*), two closely related species that evolved behavioral differences possibly as a result of their different ecologies (Wrangham and Pilbeam, 2001), received an experimental schedule whereby they were offered choices between two different upside-down bowls, covering the safe option (always four food items) and the risky option (either one or seven food items with equal probability; Heilbronner et al., 2008). The two species differed markedly in their risk preferences: chimpanzees were risk-seeking, whereas bonobos were risk-averse. Their feeding ecology offers a plausible explanation for this difference: bonobos feed mainly on terrestrial herbaceous vegetation, an abundant and reliable food source, whereas chimpanzees feed primarily on fruit, a more variable food source (Wrangham and Peterson, 1996). Thus, since chimpanzees often rely on more unpredictable food sources than bonobos, this selective pressure may have shaped their behavior so as to render them tolerant of, if not attracted to, reward uncertainty. As such, an ecological feature may have led them to be more risk-seeking than their sister species (Heilbronner et al., 2008; Stevens, 2010).
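
The payoffs of the bowl task described above (always four items vs. one or seven items with equal probability) can be checked numerically. The following is a minimal sketch; the helper name `describe` is hypothetical and chosen only for illustration. It shows that the two options share the same mean and differ only in variance, which is what allows a preference between them to be read as a risk attitude.

```python
from statistics import mean, pvariance

# Equally likely outcomes of each option, in food items
# (values as reported for Heilbronner et al., 2008, in the text above).
SAFE_OUTCOMES = [4, 4]
RISKY_OUTCOMES = [1, 7]


def describe(outcomes):
    """Return (mean, population variance) of equally likely outcomes."""
    return mean(outcomes), pvariance(outcomes)


safe_mean, safe_var = describe(SAFE_OUTCOMES)
risky_mean, risky_var = describe(RISKY_OUTCOMES)

print(f"safe:  mean={safe_mean}, variance={safe_var}")
print(f"risky: mean={risky_mean}, variance={risky_var}")
```

Because the means are equal, an agent that maximizes expected payoff alone has no basis for choosing between the bowls; any systematic preference reflects sensitivity to the variance.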

A methodologically similar study conducted on individuals belonging to different lemur species (*Lemur catta*, *Eulemur mongoz*, *Varecia rubra*) showed that, like bonobos, lemurs were clearly risk-averse (MacLean et al., 2012). Subjects were required to choose between two images on a touch-screen, associated with a safe option and a risky option, respectively. The safe option always led to one food item, whereas the payoff of the risky option varied across two experiments. In a first experiment, the risky option corresponded either to two food items or to zero food items with equal probability (leading to an average payoff of one food item, as for the safe option). In a second experiment, the payoff of the risky option was gradually increased across trials up to 7.5 times that of the safe option. In the first experiment, lemurs strongly preferred the safe option; in the second experiment, half of the subjects switched to risk seeking only when the potential payoff of the uncertain option was at least five times higher than that of the safe option. These results are somewhat puzzling if compared to the findings obtained by Heilbronner et al. (2008) in chimpanzees. However, it can be hypothesized that animals living in a relatively productive environment compared to lemurs, like chimpanzees, can also exploit risky resources, and thus evolve a risk-seeking attitude, without incurring the danger of starvation. In contrast, for animals living in very harsh environments, like lemurs (which have also evolved several anatomical and behavioral traits as adaptations to their unpredictable habitats; Wright, 1999), risk proneness is not advantageous in the long term and it is better to rely on low-quality, yet stable resources (Caraco, 1981; McNamara, 1996).

In a more complex paradigm, Haun et al. (2011) investigated whether, when choosing between a safe and a risky option, the four nonhuman great ape species (*Pan paniscus*, *Pan troglodytes*, *Gorilla gorilla*, and *Pongo abelii*) make decisions based on the expected value, defined as the probability of receiving the reward multiplied by the amount of the reward. In each trial, subjects chose between a safe option, consisting of a small food item hidden under a yellow cover positioned to the right of the subject, and a risky option, consisting of a large food item put in one of four brown bowls placed in a row in front of the subject and hidden under a blue cover. The probability of receiving the reward was manipulated by increasing the number of blue cups covering the four brown bowls (varying from *P* = 100%, when one blue cup covered the brown bowl containing the risky option, to *P* = 25%, when four blue cups covered all the brown bowls), whereas the relative value of the risky option was increased by decreasing the size of the small food item. Overall, apes preferred the risky option, although their preferences were influenced by the expected value. In fact, subjects chose the safe option more often when (i) the safe reward increased in size compared to the risky reward, and (ii) the probability of receiving the risky reward decreased. As for species differences, chimpanzees were more risk-seeking than bonobos (as in Heilbronner et al., 2008) also when tested in this more complex paradigm, and orang-utans, whose feeding ecology is somewhat similar to that of chimpanzees (Knott, 1999), were also risk-seeking.
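
The expected-value manipulation in this paradigm can be illustrated with a short sketch. The reward sizes below are hypothetical (arbitrary units, not taken from Haun et al., 2011), and the sketch assumes that the reward probability with *n* blue cups is 1/*n*, which matches the two endpoints given above (100% with one cup, 25% with four) but is an assumption for the intermediate cases.

```python
from fractions import Fraction

# Hypothetical reward sizes in arbitrary units (illustrative only).
RISKY_AMOUNT = Fraction(4)
SAFE_AMOUNT = Fraction(2)


def expected_value(probability, amount):
    """Expected value: probability of receiving the reward times its amount."""
    return probability * amount


def ev_table(risky_amount=RISKY_AMOUNT, safe_amount=SAFE_AMOUNT):
    """For 1 to 4 cups, return (cups, p, EV of risky option, risky EV > safe)."""
    rows = []
    for n_cups in (1, 2, 3, 4):
        p = Fraction(1, n_cups)  # assumed: reward equally likely under any cup
        ev_risky = expected_value(p, risky_amount)
        rows.append((n_cups, p, ev_risky, ev_risky > safe_amount))
    return rows


for n_cups, p, ev_risky, risky_better in ev_table():
    print(f"{n_cups} cup(s): p={float(p):.2f}, EV(risky)={float(ev_risky):.2f}, "
          f"risky has higher EV: {risky_better}")
```

With these illustrative numbers, an expected-value maximizer would choose the risky option only with one cup, be indifferent with two, and choose the safe option with three or four. Apes that still prefer the risky option at lower probabilities are therefore risk-seeking in the sense used above.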

Interestingly, similar differences in risk preferences have been observed in human small-scale societies, possibly as an effect of cultural differences and environmental conditions (Kuznar, 2001; Henrich and McElreath, 2002) that deserve further investigation.

#### **CONTEXTUAL AND SOCIAL FACTORS AFFECTING RISK PREFERENCES IN NONHUMAN PRIMATES**

Several neurophysiological studies in nonhuman primates have employed risk preference tasks to understand whether single neurons track the subjective value rather than the objective value of a chosen option (McCoy and Platt, 2005; O'Neill and Schultz, 2010; So and Stuphorn, 2010; but see Yamada et al., 2013). In a first study, McCoy and Platt (2005) tested rhesus macaques (*Macaca mulatta*) in a visual gambling task and measured the activity of single neurons in the posterior cingulate cortex. Macaques were presented with choices between visual targets offering on average the same reward but differing in reward uncertainty. They had to choose whether to direct their gaze to a safe target (offering a 150 ms access to fruit juice) or to a risky target (randomly offering either a shorter or longer than 150 ms access to juice, resulting on average in 150 ms access). Overall, monkeys strongly preferred the risky target and its selection increased with the degree of risk, regardless of the internal state of the subjects. Neuronal activity also increased with increasing variance in the payoff of the risky option, mirroring the macaques' risk proneness observed at the behavioral level. Interestingly, macaques continued to prefer the risky option even when the probability of receiving the larger outcome was reduced from 50 to 30% and thus its payoff was smaller than that of the safe option.

In the above study, rhesus macaques were consistently risk-seeking and the same pattern was also observed in subsequent studies carried out by the same authors and in other neurophysiological laboratories (Hayden et al., 2008b, 2010; Long et al., 2009; Watson et al., 2009; O'Neill and Schultz, 2010; So and Stuphorn, 2010; Heilbronner et al., 2011; but see Yamada et al., 2013). Interestingly, macaques' choices are not explained by nonlinear utility functions (as proposed by Lee, 2005) since they preferred an uncertain option, in which the delivery of the larger payoff was unpredictable, to an alternating option, in which the delivery of the larger payoff was predictably alternating across trials (Hayden et al., 2008a). Thus, borrowing the distinction between uncertainty and risk favored in the field of behavioral economics (Knight, 1921; Camerer and Weber, 1992; Tversky and Kahneman, 1992), macaques are not only risk-prone, but also uncertainty-seeking.

However, rhesus macaques do not exhibit a preference for risky options in all conditions. In fact, when another macaque sample was tested in a risk preference task under different conditions, their behavior ranged from risk aversion to risk neutrality, but none of them was risk-seeking (Behar, 1961). Thus, although rhesus macaques' ecology may suggest a general predisposition for risk proneness (Goldstein and Richard, 1989; Richard et al., 1989), Heilbronner and Hayden (2013) proposed that macaques' risk preferences are driven by some features of the task design typically used in neurophysiological studies, such as (i) the small stakes involved in these experiments (typically 0.1–0.3 ml of juice); (ii) the large number of trials (the same decision problem is typically presented hundreds or thousands of times to the same subject); and (iii) the short intertrial intervals (ITIs).

At least for the latter point, an experiment showed that this might be the case. Whereas in McCoy and Platt (2005), where macaques were risk-seeking, the average ITI was 3 s, in other nonhuman animal studies, where individuals were risk-averse (reviewed in Kacelnik and Bateson, 1996), the ITI was much longer (usually 30 s). Thus, Hayden and Platt (2007) presented rhesus macaques with a novel version of the visual gambling task in which the variance of the risky option was kept constant and the ITI varied from 1 s to 90 s. Interestingly, they found that, as the ITI increased, macaques' preference for the risky option decreased and monkeys turned to risk neutrality at 90 s ITI. To explain this pattern, Hayden and Platt (2007) hypothesized that macaques interpreted the risky option as a certain reward available at a future time and, since the higher payoff may occur on the next trial, the subjective expected utility of the risky option depends on the length of the ITI. Interestingly, when humans were tested with a paradigm as similar as possible to that usually employed with macaques, they were more risk-seeking than in typical one-shot gambling experiments employing questionnaires (Hayden and Platt, 2009).

However, the above factors cannot explain the risk-seeking behavior observed in chimpanzees and orangutans (Heilbronner et al., 2008; Haun et al., 2011), where the stakes involved were comparatively high, the number of trials lower, and the ITIs longer than in the macaque studies. Although the results on chimpanzees appear to be very robust and have been replicated with larger samples (Rosati and Hare, 2012, 2013), it cannot generally be excluded that the different risk preferences obtained in the nonhuman primate studies reviewed so far were due to individual differences. In fact, in rhesus macaques, risk sensitivity appears to be partly determined by the serotonergic system: serotonin depletion increases risk proneness (Long et al., 2009), a finding consistent with recent rodent data (Koot et al., 2012). Similarly, the length polymorphism of the serotonin transporter gene promoter (known as 5-HTTLPR, the serotonin-transporter-linked polymorphic region) is crucial as well (Watson et al., 2009), in relation to interspecific and intraspecific behavioral variability. Wendland and colleagues (2006) found, in macaque species, that the 5-HTTLPR was responsible for interspecific behavioral variability. In contrast, Chakraborty et al. (2010) proposed that this particular polymorphism had a role in intraspecific variability, which in turn may account for the greater ecological success of 5-HTTLPR polymorphic species. An example of its consequences in the wild is represented by the presumed selective emigration of rhesus macaques over the Himalayan Mountains into China in the early history of the species (Champoux et al., 1997; Heinz et al., 1998). According to Belsky et al. (2009), this particular polymorphism may confer an advantage when dealing with novel, possibly hostile environments.
Relative to Indian-derived monkeys, Chinese-hybrid macaques with higher prevalence of the long repeat allele of the 5-HTTLPR show predispositions to aggressive and risk-taking behaviors, as well as lower levels of serotonin as indicated via its metabolite (Champoux et al., 1997; Heinz et al., 1998). Nonetheless, although feeding ecology and inter-individual differences are likely to influence risk preferences, the findings obtained in rhesus macaques underline the importance of carefully controlling all task and environmental parameters when comparing risk preferences among different species.

Finally, as observed in humans (Bault et al., 2008; Ermer et al., 2008; Hill and Buss, 2010), another important factor affecting nonhuman primates' risk preferences seems to be the social context in which the individuals make decisions. To our knowledge, there is only one study evaluating this aspect in nonhuman primates (Rosati and Hare, 2012). Chimpanzees and bonobos were presented with choices between a safe option, yielding an intermediately preferred food item, and a risky option, yielding either a low-preferred or a high-preferred food item, in a competitive context and in a play context. In both contexts an experimenter interacted with the subject before the presentation of the decision-making task: in the competitive context, the experimenter first offered the subject a food item and then, when the subject attempted to take it, immediately pulled it out of the subject's reach; in the play context, the experimenter tickled or chased the subject. Apes' behavior in each condition was compared with a neutral context, in which the experimenter was present but not interacting with the subject. All subjects chose the risky option more in the competitive than in the neutral context, whereas the play context did not increase risk proneness. An eco-ethological explanation seems likely, given that feeding competition and the consequent loss of resources are a potential problem for all group-living species. Within this framework it can be proposed that, in the competitive context, the salience and attractiveness of the larger option would be increased notwithstanding its uncertainty.

#### **THE EVOLUTIONARY ORIGINS OF BIASES IN DECISIONS UNDER RISK**

When making choices between risky options, humans show the so-called "reflection effect", i.e., the tendency to evaluate gambles in relation to an arbitrary reference point. The same individual can decide differently, being risk-seeking when some options are framed as losses and risk-averse when the same, identical options are framed as gains (Kahneman and Tversky, 1979; Tversky and Kahneman, 1981).

Nonhuman animals apparently share with humans the reflection effect and other behavioral biases (e.g., Waite, 2001; Marsh and Kacelnik, 2002; Shafir et al., 2002). This can be either because of an early emergence of economic biases during evolution, or because of convergent evolution. Only the study of nonhuman primates, our closest relatives, can help to disentangle these two hypotheses. To this end, in recent years a series of studies investigated decision-making under risk in capuchin monkeys (*Sapajus* spp., formerly *Cebus apella*<sup>2</sup>) that, despite 35 million years of independent evolution, show many striking analogies with humans in terms of encephalization index, ontogeny, lifespan, and various cognitive traits (Fragaszy et al., 2004).

In a first study (Chen et al., 2006), capuchins were tested in a token exchange task, in which they were provided with a starting budget of 12 tokens that could be exchanged with one of two experimenters, as they preferred. Preliminary experiments demonstrated that capuchins can behave rationally in this framework: when the two experimenters provided the same amount of two equally preferred different food types, capuchins exchanged a similar number of tokens with each of them; however, when one experimenter doubled the amount of food provided in exchange for one token or showed two food items and delivered either one or two pieces with the same probability, capuchins reliably shifted their preference towards her, showing that they were able to maximize their payoff. In the main experiment, capuchins were presented with choices between experimenters providing a risky "trade" of either one or two food items with equal probability, but the amount of food initially displayed to the subject was different: one experimenter showed one food item and added a "gain" of one additional food item in half of the trials, whereas the other experimenter showed two food items and subtracted a "loss" of one food item in half of the trials. Although the two experimenters provided on average the same payoff, capuchins preferred to exchange their tokens with the first experimenter, although, according to a rational perspective, they should have been indifferent between the two options. These results demonstrate that, as in humans, they chose on the basis of an arbitrary reference point (namely, the initial food amount shown by the two experimenters), therefore preferring the experimenter who was framing the "trade" as a gain.

<sup>2</sup>Recent molecular analysis has revealed that capuchin monkeys, formerly identified as the single genus *Cebus*, are two genera, with the robust (tufted) forms (including libidinosus, xanthosternos, apella and several other species) now recognized as the genus *Sapajus*, and the gracile forms retained as the genus *Cebus* (Lynch Alfaro et al., 2012). The nomenclature for *Sapajus* is registered with ZooBank (urn:lsid:zoobank.org:act:3AAFD645-6B09-4C88-B243-652316B55918). Animals identified as *Cebus apella* in laboratory colonies outside of South America may be any combination of the several species (e.g., *C. apella*, *C. libidinosus*, *C. nigritus*) recognized as separate species since 2001 (Groves, 2001; Fragaszy et al., 2004), but previously considered *C. apella*.

In a subsequent study (Lakshminarayanan et al., 2011), capuchins were tested with a similar paradigm, in which they were presented with choices between a risky option and a safe option yielding the same average payoff (of two food items) in two conditions: (i) *Losses*: both experimenters initially displayed three food items, but the first experimenter always delivered two food items, whereas the second experimenter delivered either one or three food items with equal probability; and (ii) *Gains*: both experimenters initially displayed one food item, but the first experimenter always delivered two food items, whereas the second experimenter delivered either one or three food items with equal probability. Overall, capuchins showed clear-cut evidence of the "reflection effect", since they were risk-seeking when options were framed as losses, and risk-averse (although to a lesser extent) when options were framed as gains. Again, decisions appear to be made by subjects relative to their initial reference point.

In sum, the above findings suggest that humans and capuchin monkeys share the reflection effect, as is reported with other behavioral biases (Chen et al., 2006; Lakshminarayanan et al., 2008). However, a very recent "up-linkage" replication of Lakshminarayanan et al. (2011), in which adult humans were tested with exactly the same procedure employed with capuchin monkeys, failed to find a reflection effect (Silberberg et al., 2013). Nonetheless, it should be noted that such a replication may have had a low ecological validity for cognitively sophisticated adult humans, especially because of the repeated interactions with the experimenters, which the participants may have found boring or embarrassing. Future studies should investigate biases in decisions under risk in closely-related non-human primate species with different ecologies (Clutton-Brock and Harvey, 1979; Rosati and Stevens, 2009; Rosati and Hare, 2012) in order to understand whether these behavioral patterns are maladaptive, suboptimal, or instead "ecologically rational" (Todd and Gigerenzer, 2000).

### **RISK ATTITUDES, ENVIRONMENTAL UNCERTAINTY AND ADDICTIVE BEHAVIOR: PERSPECTIVES FROM COMPUTATIONAL NEUROSCIENCE AND EVOLUTIONARY ROBOTICS**

Computational models are a new way of doing science which can be very useful for theorizing about extremely complex systems like vertebrate organisms and their brains. The usefulness of computational models comes largely from two factors: (i) they express hypotheses in a formal, precise, and unambiguous way, so that from those hypotheses a number of detailed predictions can be unequivocally derived which can then be tested through empirical experimentation; (ii) they allow for a degree of direct manipulation of all relevant variables which is unparalleled by naturalistic methods.

The vast majority of computational models deal with the normal functioning of the brain and normal cognitive phenomena, but since the 1990s a number of models have been proposed that address psychiatric and neurological disorders, and recently these models have been attracting increasing interest, so that several scholars have started to discuss the prospects, challenges, and limitations of *computational psychiatry* (Maia and Frank, 2011; Montague et al., 2012; Huys, 2013). There are many ways in which computational models may help research on decision-making in general and on pathological gambling in particular. Here, we will focus on three different kinds of models: (1) normative (algorithmic) models; (2) neural models; and (3) evolutionary robotics models.

#### **NORMATIVE (ALGORITHMIC) MODELS**

A first class of relevant models is what we can call "normative" or "algorithmic" models. These models derive from the computational reinforcement learning literature (Sutton and Barto, 1998) and are normative because they are based on machine learning algorithms, which prescribe how an agent *should* behave in order to maximize its future rewards. They became famous in the mid 1990s when it was discovered that the dynamics of dopamine, which is highly involved in motivation and learning (Wise, 2004; Schultz, 2006; Berridge, 2007), as well as in drug addiction, could be modeled by the *reward prediction error* signal postulated in Temporal Difference (TD) reinforcement learning (Barto, 1995; Schultz et al., 1997). The reward prediction error of TD learning is a signal that quantifies "surprise", that is, the difference between expected and actual rewards, and it is used in reinforcement learning models as the learning signal that drives action learning. In a nutshell, the theory holds that an agent continually evaluates the current states (situations) with respect to the reward that it expects to achieve in those states. If it gets more or less reward than expected, then a prediction error signal is generated that is used to update both its prediction and its action policy, that is, the way the agent selects its actions. The idea is that the probability of selecting an action again, in a given context, is increased if that action leads to more reward than expected and is decreased if it leads to less reward than expected. Dopamine behaves just like the reward prediction error: its release is triggered by unexpected rewards or unexpected stimuli that predict reward, but it is not released when the reward is perfectly predictable and it is inhibited (a dip in dopamine levels occurs) when an expected reward is omitted. This has led to the hypothesis that dopamine plays the same role as the reward prediction error in reinforcement.
Phasic dopamine release would have the role of making the agent learn (1) the value ("saliency") of the stimuli and (2) which are the actions ("strategies") to be deployed in each circumstance in order to maximize future rewards. In mammals, these two roles are attributed to mesolimbic vs. nigrostriatal dopamine pathways, respectively. This theory has guided an enormous amount of empirical research and has received so much empirical support that it is now an important tenet of contemporary neuroscience, and it has become one of the most successful examples of using computational models in the behavioral and brain sciences (e.g., Montague et al., 2004; Ungless, 2004; Wise, 2004; Sugrue et al., 2005; Graybiel, 2008; Glimcher, 2011).
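
The prediction-error logic described above can be sketched in a few lines. This is a minimal tabular TD(0) illustration under stated assumptions, not the implementation used in any of the cited studies: the two-state episode (a cue followed by an outcome), the learning rate `alpha`, the discount factor `gamma` and the function names are all hypothetical choices made for the example.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Reward prediction error: (reward + discounted next value) - current value."""
    return reward + gamma * value_next - value_current


def train_cue_reward(n_trials=300, reward=1.0, alpha=0.1, gamma=0.9):
    """Repeat the episode cue -> outcome (reward delivered) -> end.

    Returns the learned values of the cue and outcome states and the
    prediction error recorded at reward delivery on every trial.
    """
    v_cue, v_outcome = 0.0, 0.0
    errors_at_reward = []
    for _ in range(n_trials):
        # Transition cue -> outcome: no reward is delivered yet.
        delta = td_error(0.0, v_outcome, v_cue, gamma)
        v_cue += alpha * delta
        # Transition outcome -> end of trial: the reward is delivered.
        delta = td_error(reward, 0.0, v_outcome, gamma)
        v_outcome += alpha * delta
        errors_at_reward.append(delta)
    return v_cue, v_outcome, errors_at_reward


v_cue, v_outcome, errors = train_cue_reward()
# After learning, omit the expected reward once and inspect the error.
omission_error = td_error(0.0, 0.0, v_outcome)

print(f"error at reward, first trial: {errors[0]:.3f}")
print(f"error at reward, last trial:  {errors[-1]:.3f}")
print(f"error when reward is omitted: {omission_error:.3f}")
```

The three printed quantities correspond to the three dopamine signatures listed in the text: a large positive error for an unexpected reward, an error near zero once the reward is fully predicted, and a negative error (the "dip") when a predicted reward is omitted.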

What is most interesting for our purposes is that the reward prediction error hypothesis for dopamine has not only been used to predict and explain behavioral and brain dynamics in normal conditions, but also to explain pathological phenomena. In particular, normative algorithmic models have been used to interpret brain imaging data related to various mental pathologies like schizophrenia and depression-related anhedonia (Smith et al., 2007; Kumar et al., 2008; Murray et al., 2008; Huys et al., 2013).

Moreover, a seminal work by David Redish (2004) used a TD model to explain drug addiction. In particular, the model explained addiction as the consequence of the pharmacological effect that certain drugs of abuse, like amphetamines, cocaine or nicotine, may have on forebrain dopamine circuits. Indeed, these drugs are known to increase dopamine levels upon acute administration. According to Redish's model, the addictive effect of these drugs stems from specific consequences of the dopamine elevation produced by the drug. With natural rewards, a phasic release of dopamine is present only when the reward is unpredicted. In this perspective, the normal process of reinforcement, produced by any reward, can be cancelled out by accurate predictions. In contrast, the model postulates that drugs of abuse also generate a pharmacologically-induced dopamine release, a term that cannot be compensated by predictions. Since, in this way, the dopamine prediction error never disappears, as if drug-related pleasure were always "unexpected", the subjective values of the drug-related internal states keep on increasing indefinitely, and the actions that lead to drug consumption keep on being reinforced, hence becoming a strong habit and ultimately resulting in the development of addiction. This model explains several aspects of addiction including, for example, the fact that the pursuit of both drugs and natural rewards is sensitive to effort-related cost, but drug seeking is much less sensitive than the pursuit of natural rewards. However, one of the key predictions of the theory has been falsified by subsequent research. In particular, the theory predicted that drugs should prevent blocking, i.e., the phenomenon whereby a stimulus that already predicts a reward, if paired with a new stimulus before presenting the reward, prevents the second stimulus from being conditioned, as it stops the learning-inducing dopamine prediction error from occurring.
If a drug always produced a dopamine prediction error, as postulated by Redish's model, then the conditioning of the second stimulus should occur, but it does not (Panlilio et al., 2007).
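
The core idea of this account, a prediction-error term that learning cannot cancel, can be illustrated with a deliberately simplified sketch. It is not a reproduction of the published model: the single-state setting, the parameter values and the function name `learn_value` are assumptions made for illustration, and the pharmacological effect is represented simply as a floor `drug_effect` below which the prediction error cannot fall.

```python
def learn_value(reward, drug_effect, n_trials=500, alpha=0.1):
    """Learn the value of a single rewarded state over repeated trials.

    drug_effect = 0 gives ordinary prediction-error learning. A positive
    drug_effect adds a dopamine surge that the learned prediction cannot
    compensate: the error used for learning never drops below drug_effect.
    """
    value = 0.0
    for _ in range(n_trials):
        natural_error = reward - value            # cancelled once value == reward
        error = max(natural_error + drug_effect,  # drug surge adds to the error...
                    drug_effect)                  # ...and sets a floor under it
        value += alpha * error
    return value


natural_500 = learn_value(reward=1.0, drug_effect=0.0, n_trials=500)
drug_500 = learn_value(reward=1.0, drug_effect=0.2, n_trials=500)
drug_1000 = learn_value(reward=1.0, drug_effect=0.2, n_trials=1000)

print(f"natural reward, 500 trials: value = {natural_500:.3f}")
print(f"drug reward,    500 trials: value = {drug_500:.3f}")
print(f"drug reward,   1000 trials: value = {drug_1000:.3f}")
```

Under these assumptions, the value of the naturally rewarded state settles at the reward magnitude, whereas the value of the drug state keeps growing by `alpha * drug_effect` on every trial. The same non-vanishing error is what leads to the blocking prediction discussed above: a stimulus added to an already trained drug cue would still experience a positive error and should therefore become conditioned.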

Building on this computational interpretation of drug addiction, Redish et al. (2007) proposed a model that provides a possible explanation of pathological gambling. This model adds to the basic TD prediction error model, which learns the values of states and actions, a second "situation recognition" system that learns to categorize the states. In particular, this system learns to categorize as different states all those situations in which, after having received high rewards, those rewards are not present anymore. Notably, this addition was made to accommodate in the TD framework basic reinforcement learning phenomena related to the extinction of behaviors and their renewal. However, it also provides an explanation of gambling. Indeed, many pathological gamblers became addicted after having experienced an unlikely sequence of wins or a single very high win (Custer, 1984; but see Kassinove and Schare, 2001, for empirically founded doubts on the strength of this big win effect). The model assumes that, when the gambler experiences such a huge success (or the feeling of having almost succeeded, the so-called "near miss" effect; Kassinove and Schare, 2001), he forms a very strong and unrealistic expectation that he can win again (or finally; on the similarity in neural processing of wins and near misses, see Chase and Clark, 2010; Winstanley et al., 2011). When the gambler starts to lose, instead of unlearning and cancelling this (false) expectation, by negative reinforcement, his situation recognition system starts to create new "associative" states, namely looking for cues that are supposed to distinguish the winning situation from the losing ones. Hence, according to this model, pathological gambling results from a misclassification of the situation, with the irrational belief that there are contingencies in which the gambler can win that differ from those in which he loses. This explanation can also account for two related phenomena: (1) the "hindsight bias" effect, where gamblers analyze their losses and (*post-hoc*) identify the cues that differed from the situation when they won, as well as (2) the "illusion of control" phenomenon, in which they believe that they can control an otherwise random situation by identifying and following the right cues that, in their mind, distinguish winning from losing situations (Custer, 1984; Wagenaar, 1988). The most common superstitions of pathological gamblers are thus accounted for.

One limitation of this model is that it tries to explain pathological gambling as a unitary phenomenon with a unique cause, while it is likely that there are several different causes that underlie this complex behavior, both in the same individual and across different individuals. For example, many pathological gamblers keep on gambling even if they report knowing that they will lose, something that is in contrast with the model (but see the results on cue-induced secondary rewards in rodents and their potential implications for human gambling, discussed in Section Assessment of Gambling Proneness: Clinical and Preclinical Approaches). However, the most important limitation of this kind of normative, algorithmic model is that it provides abstract explanations of what computations may go awry in pathological conditions, but does not explain the actual brain mechanisms that may underlie these phenomena: hence the range of phenomena that it can account for and predict is limited. In order to investigate the details of the brain processes that underlie the phenomena under study, we need models that simulate those details. This is the province of neural models.

#### **NEURAL MODELS**

Neural models explain a cognitive phenomenon by simulating (with a variable degree of abstraction) neurons and their connections, and making the simulated neural network reproduce the phenomenon. The first models of this kind were called "connectionist" models (McClelland and Rumelhart, 1989): they included very simple neural networks, which were supposed to perform computations in a brain-like manner, but whose structure was not meant to replicate the structure of real brains. More recently, much more biologically realistic models have been developed in computational neuroscience. In these models, different groups of nodes are meant to represent neurons belonging to different parts of the brain, and the connections between the different groups correspond to the connections between those brain areas. The architecture and functioning of the model are thus based on the anatomy and physiology of the same brain areas that are known to be relevant for the phenomenon under study. If the model is able to reproduce the phenomenon, this gives us a detailed explanation of which brain mechanisms may be responsible for it. The plausibility of such an explanation rests on two foundations: (i) how many anatomical and physiological constraints are considered, and how closely they are respected; and (ii) how many different phenomena the model is able to account for. Furthermore, the model can be used to derive a number of predictions that can then be tested in humans as well as animal models, through further empirical experiments.

To the best of our knowledge, no neural models have been developed so far to explain pathological gambling, although there is evidence of a role of midbrain dopamine in the coding of reward uncertainty (Fiorillo et al., 2003), which suggests an influence of the dopaminergic system on risk-taking behavior. On the other hand, several models, both connectionist (e.g., Cohen and Servan-Schreiber, 1992; Cohen et al., 1996; Braver et al., 1999) and biologically detailed ones (Frank et al., 2004, 2007a,b,c; Gutkin et al., 2006; Waltz et al., 2007; Rolls et al., 2008; Ahmed et al., 2009; Maia and Frank, 2011), have been developed to describe neurological and psychiatric pathologies, including schizophrenia, Parkinson's disease, Tourette's syndrome, ADHD, and drug addiction. Briefly reviewing these existing models can provide useful suggestions on how to apply the same methods to the investigation of pathological gambling.

Most of these models deal with the dopaminergic system and its interactions with the basal ganglia-thalamo-cortical circuits that implement action selection. A notable example is the work of Frank and colleagues on modeling several aspects of Parkinson's disease (e.g., Frank et al., 2004, 2007a; Moustafa et al., 2008). Parkinson's disease is known to depend on the degeneration of nigro-striatal dopamine cells. This work is based on a detailed model of the basal ganglia-thalamo-cortical circuit that is assumed to implement action selection and reinforcement learning (e.g., Frank et al., 2001). The main idea behind the model is that two sub-systems, a Go and a no-Go system, are present in the basal ganglia, and that together they implement action selection. In particular, neurons in the basal ganglia are supposed to allow the release of actions in the cortex by selectively disinhibiting a certain action (through the Go system) while inhibiting the others (through the no-Go system). Furthermore, a third structure of the basal ganglia (the subthalamic nucleus) is supposed to dynamically exert a global inhibitory role and to modulate the threshold at which actions are selected depending on the level of cortical conflict. Importantly, neurons belonging to the different systems have different dopamine receptor distributions: Go neurons have receptors through which dopamine excites the neuron, whereas no-Go neurons have receptors through which dopamine inhibits the neuron. Through such a model, Frank and colleagues have been able to reproduce and explain a number of detailed behavioral and neural data, and to predict new data that have been empirically verified, such as the effects of dopaminergic medication and of deep brain stimulation of the subthalamic nucleus (a procedure that is known to improve motor symptoms) on different cognitive tasks in Parkinson patients (Frank et al., 2007a), and why medication can lead those patients to develop pathological gambling (Dodd et al., 2005).
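The logic of the Go/no-Go account can be illustrated with a deliberately abstract sketch. It is not the network model of Frank and colleagues: it collapses the two pathways into a single net value with two hypothetical learning rates, `alpha_go` (learning from better-than-expected outcomes, standing in for dopamine bursts acting on Go neurons) and `alpha_nogo` (learning from worse-than-expected outcomes, standing in for dopamine dips acting on no-Go neurons). The gamble, its win probability, and all parameter values are assumptions chosen for illustration only. Under these assumptions, a learner whose dips are blunted (as might be the case under dopaminergic medication) is expected to overvalue a gamble whose true expected value is negative.

```python
import random


def learned_value(alpha_go, alpha_nogo, p_win=0.3, n_trials=5000, seed=0):
    """Average learned value of one gamble under asymmetric learning.

    alpha_go   : learning rate after better-than-expected outcomes
                 (hypothetical stand-in for bursts strengthening Go)
    alpha_nogo : learning rate after worse-than-expected outcomes
                 (hypothetical stand-in for dips strengthening no-Go)
    """
    rng = random.Random(seed)
    value = 0.0  # net Go-minus-no-Go propensity towards the gamble
    late_values = []
    for trial in range(n_trials):
        # The gamble wins one unit with probability p_win, else loses one unit,
        # so its true expected value is 2 * p_win - 1 (here -0.4).
        outcome = 1.0 if rng.random() < p_win else -1.0
        error = outcome - value  # reward prediction error
        rate = alpha_go if error > 0 else alpha_nogo
        value += rate * error
        if trial >= n_trials // 2:  # discard the initial transient
            late_values.append(value)
    return sum(late_values) / len(late_values)


# Same outcome sequence (same seed), different burst/dip sensitivity.
balanced = learned_value(alpha_go=0.05, alpha_nogo=0.05)
blunted_dips = learned_value(alpha_go=0.10, alpha_nogo=0.02)
print(f"balanced learner:     {balanced:+.2f}")
print(f"blunted-dips learner: {blunted_dips:+.2f}")
```

With equal rates the learned value tracks the true expected value of the gamble; with blunted dips the expected update vanishes at a positive value, so the losing gamble comes to look attractive. A full model would of course represent the striatal pathways, the subthalamic nucleus, and action competition explicitly.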

In order to explain other facets of this complex behavior and its neural basis, many more details should be added to these models. For example, pathological gambling is known to be associated with dysfunction not only of dopamine, but also of other neuromodulators such as serotonin (e.g., Nordin and Eklundh, 1999) and noradrenaline (e.g., Meyer et al., 2004). For this reason, the role of these two neuromodulators should be modeled in future research, possibly by incorporating findings from other computational models that deal with the interactions between these neuromodulators and dopamine (e.g., Daw et al., 2002). Furthermore, beyond the anomalies in the basal ganglia and in associated fronto-cortical areas, recent evidence suggests that deficits in amygdala functioning may also be responsible for gambling behavior, by significantly reducing loss aversion (De Martino et al., 2010). For this reason, modeling pathological gambling may require modeling the interactions between the amygdala and the basal ganglia, as done in recent neuro-robotic models of the role of the amygdala in conditioning (Mannella et al., 2007, 2008, 2010; Mirolli et al., 2010).

Finally, factors related to *intrinsic motivations* (i.e., motivations related to novelty, surprise, and competence acquisition: Ryan and Deci, 2000; Baldassarre and Mirolli, 2013) may also play a role in pathological gambling. For example, Parkinson patients who develop pathological gambling are distinguished from those who do not by tests that measure impulsivity and novelty seeking (Voon et al., 2007). Recent computational models assume that intrinsic motivations work by hijacking the brain systems that also underlie extrinsic motivations, in particular the dopaminergic system and the action selection system in the basal ganglia (e.g., Kakade and Dayan, 2002; Mirolli et al., 2013). Some of these models are detailed neural models very similar to the ones discussed above on dopamine in Parkinson's disease, including basal ganglia-thalamo-cortical circuits, the dopaminergic system, and other relevant areas (e.g., Baldassarre et al., 2013; Fiore et al., 2014). Merging the two kinds of models may be a promising way to further understand the brain mechanisms underlying pathological gambling.

#### **EVOLUTIONARY ROBOTICS MODELS**

Evolutionary robotics provides a valuable platform to test evolutionary hypotheses on the ecological pressures behind the emergence of specific behaviors and traits. Such hypotheses, like those already discussed in Sections Rodents as an Animal Model of Gambling Behavior and Risky Choices in Nonhuman Primates: Implications for Human Pathological Gambling with respect to risk attitudes, are often plausible, but also hard to verify directly. They rely on key assumptions about the environment in which the evolution of a given species occurred, and yet it is typically hard to observe with precision the effects of a given ecological variable (e.g., dangers of predation) on the behavior under study (e.g., risk proneness/aversion). Moreover, these assumptions refer to ancestral environments, not present-day ecologies: while there are methods to acquire data on living conditions in ancestral times (e.g., through paleobiology and primate archeology; Haslam et al., 2009), they are bound to deliver incomplete information at best, in spite of substantial research efforts. Recent work has demonstrated the viability and fruitfulness of computational methods, e.g., experimental evolutionary robotics: the basic idea is to let populations of simulated robots evolve under specific ecological pressures, and then observe their behavior with the aim of drawing implications for the understanding of processes in natural organisms faced with similar, uncertainty-based tasks (Da Rold et al., 2011; Saglimbeni and Parisi, 2011). This approach makes it possible to observe how several forms of risk introduced in the evolutionary environment affect choice behavior, both in ecology and in experimental settings.

Moreover, robots are controlled by simple neural networks, whose evolution and effects on behavior can be studied with extreme precision and flexibility: not only by recording their activity during behavior, but also by "lesioning" a well-adapted neural network and observing the impact on risk-related choices, hence drawing new insights into pathological gambling. These are all key advantages of computational evolutionary models, as opposed to purely mathematical and game-theoretical approaches, for putting forward hypotheses regarding the evolution of certain aspects of risk attitudes in uncertain environments (e.g., McNamara et al., 2013). While mathematical and theoretical models certainly provide valuable contributions to bridging the gap between laboratory studies and ecological observations, they lack the opportunities for direct manipulation and experimental observation granted instead by robotics platforms, be they purely simulated or physically implemented.
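The basic idea of letting a population evolve under a specific ecological pressure, and then reading off its risk attitude, can be sketched in a few lines. The sketch below is not taken from any of the cited studies and involves no robot or neural controller: each agent is reduced to a single hypothetical gene `p_risky` (the probability of choosing a variable food patch over a constant one with the same mean), and all payoffs, thresholds, and evolutionary parameters are assumptions made for illustration. The only manipulated ecological variable is the energy requirement for reproduction.

```python
import random


def forage(p_risky, n_trials, rng):
    """Energy collected by one agent over a lifetime of patch choices."""
    energy = 0.0
    for _ in range(n_trials):
        if rng.random() < p_risky:
            # Risky patch: 3 units with probability 1/3, else nothing (mean 1).
            energy += 3.0 if rng.random() < 1.0 / 3.0 else 0.0
        else:
            energy += 1.0  # Safe patch: always 1 unit.
    return energy


def evolve(threshold, generations=60, pop_size=200, n_trials=10, seed=0):
    """Return the mean evolved p_risky for a given energy requirement."""
    rng = random.Random(seed)
    population = [0.5] * pop_size  # everyone starts risk-neutral
    for _ in range(generations):
        # Agents that meet the requirement reproduce; a small floor keeps
        # the selection weights from ever being all zero.
        fitness = [
            1.0 if forage(p, n_trials, rng) >= threshold else 0.01
            for p in population
        ]
        parents = rng.choices(population, weights=fitness, k=pop_size)
        # Offspring inherit the parental gene with Gaussian mutation,
        # clipped to the valid probability range.
        population = [
            min(1.0, max(0.0, p + rng.gauss(0.0, 0.05))) for p in parents
        ]
    return sum(population) / pop_size


# Harsh ecology: the safe patch alone (10 units) cannot meet the requirement.
harsh = evolve(threshold=12)
# Benign ecology: the safe patch alone comfortably meets the requirement.
benign = evolve(threshold=8)
print(f"mean p_risky, harsh ecology:  {harsh:.2f}")
print(f"mean p_risky, benign ecology: {benign:.2f}")
```

Under these assumptions one would expect the harsh ecology to favor risk proneness and the benign ecology to favor risk aversion, because gambling is the only route to reproduction in the former and an unnecessary hazard in the latter. Changing `threshold` is the computational analogue of manipulating one ecological variable while holding everything else fixed.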

To the best of our knowledge, no evolutionary (computational) model of pathological gambling has yet been proposed. However, there are several interesting simulations of how risk attitudes in general might have evolved: some of these works already have important implications for our understanding of gambling behavior, and point towards promising research directions. For instance, Niv et al. (2002) used evolutionary computation techniques to evolve near-optimal neuronal learning rules in a simple neural network model of reinforcement learning in bumblebees foraging for nectar. This resulted in a replication of two well-documented choice strategies in these animals: risk aversion and probability matching. Moreover, risk aversion evolved even in a completely risk-less environment. These results suggest that risk aversion may be a direct consequence of near-optimal reinforcement learning, with no need to assume further evolutionary constraints, such as the existence of a nonlinear subjective utility function for rewards. Their results were also demonstrated in real-world situations, using experiments with a Khepera wheeled robot, and they dovetail nicely with the evidence on the role of the reward prediction error in determining various choice behaviors (see Section Normative (Algorithmic) Models).
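The claim that risk aversion can fall out of simple reinforcement learning, without any nonlinear utility function, can be illustrated with a minimal sketch. This is not the evolved network of Niv et al. (2002): it is a plain prediction-error learner with a mostly greedy choice rule, and the learning rate `alpha`, exploration rate `epsilon`, and payoffs are all assumed values. The two options have exactly the same mean reward, so a risk-neutral agent should be indifferent between them.

```python
import random


def fraction_safe(alpha=0.3, epsilon=0.1, n_trials=20000, seed=0):
    """Fraction of trials on which the learner picks the constant option."""
    rng = random.Random(seed)
    q_safe, q_risky = 0.5, 0.5  # both estimates start at the true mean
    safe_choices = 0
    for _ in range(n_trials):
        if rng.random() < epsilon:
            choose_safe = rng.random() < 0.5      # occasional random exploration
        else:
            choose_safe = q_safe > q_risky        # otherwise pick the higher estimate
        if choose_safe:
            reward = 0.5                          # constant reward
            q_safe += alpha * (reward - q_safe)   # prediction-error update
            safe_choices += 1
        else:
            reward = 1.0 if rng.random() < 0.5 else 0.0  # variable, same mean
            q_risky += alpha * (reward - q_risky)
    return safe_choices / n_trials


share = fraction_safe()
print(f"share of safe choices: {share:.2f}")
```

The asymmetry arises from sampling: after a run of bad luck the risky estimate drops below the safe one and is then rarely revisited, so underestimates persist, whereas overestimates are quickly corrected because the agent keeps choosing the option. The learner therefore ends up preferring the constant option even though rewards enter the update linearly.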

Other models do not explicitly focus on any particular species, but rather try to address general issues pertaining to the evolution of risk attitudes. Arbilly et al. (2011) used agent-based evolutionary simulations to investigate an important connection between environmental features, risk aversion, and the evolution of social learning. They started from the observation that, in environments with significant risks associated with higher value rewards (e.g., an ecology in which the most valuable food is rare and difficult to obtain), acquiring such rewards is likely to require a certain number of failed attempts before success is achieved. In these circumstances, risk aversion would lead individuals to neglect such rewards, even if doing so may be sub-optimal in the long run (Real, 1991). However, Arbilly and colleagues noted that this situation also creates an important (and often overlooked) evolutionary advantage of social learning over individual learning, since social learners can bypass the problem of risk aversion by learning where to forage from individuals that have already found food. The results of their evolutionary simulations, which combined a producer–scrounger game with explicit individual and social learning rules for associating different food patch types with experienced reward, confirmed the key role of social learning in such situations, as an antidote to the adverse effects of risk aversion in this type of environment. Incidentally, this also provides an explanation of why many species, humans included, continue to rely heavily on social learning even when it produces disastrous effects, e.g., in escape panic scenarios (Helbing et al., 2000). It also illustrates how this reliance on social learning can be exploited to produce "contagious gambling": this is precisely what happens when con artists and casinos employ confederates who (falsely) win huge sums, in order to lure unsuspecting potential gamblers into the game.

While the number of computational evolutionary models of risk attitudes is still too limited to permit any universal conclusions on the evolution of this complex suite of behaviors, some important methodological implications stand out, and are worth noting. This methodology has both advantages and limitations, but what matters is that they tend to be complementary to those exhibited by naturalistic methods. Thus, integrating evolutionary simulations with naturalistic studies has the potential for huge scientific payoffs. With respect to experimental evolutionary robotics (Da Rold et al., 2011; Saglimbeni and Parisi, 2011), the advantages of this method include the following. First, full observability means that robots' behavior can be observed in extreme detail both "in the wild" (i.e., in the ecological setting where robots evolve), and "in the lab" (i.e., under specific test conditions). Second, there is full control, meaning that all variables can be easily and precisely manipulated, regarding both ecology and test conditions, including the possibility of "counterfactual experiments" (that is, studying how ecological pressures for which no natural correlate is known might affect behavior). Third, there is neurocomputational transparency, in that the internal dynamics of the robots' control system (e.g., a neural network) can also be precisely measured (which is not entirely the case for natural, living organisms). Fourth, individual differences emerge, since robots differ in how they cope with their ecology and in their level of proficiency (which also opens the way to the study of artificial pathologies). Interestingly, non-deterministic responses are present, since evolutionary robots typically respond to external stimuli in a non-deterministic way, which facilitates comparison with natural, living organisms (which also do not always react in the same way to identical inputs from the environment). Finally, there is potential for embodied implementation, since simulated robots are based on simulators of real physical platforms, thus allowing easy implementation in real-world scenarios.

In contrast, the method is mostly vulnerable to the following problems and limitations. First, there is abstraction, since both the ecology and the artificial laboratory are much simpler than most natural counterparts (and the same is true for the structure of the robot's body and its control system). Second, there is much arbitrariness, since a huge variety of parameters needs to be set by the experimenter, concerning the ecology, the robot's structure, and the test conditions (and these are likely to have some impact on the resulting behavior). Last, there is a need to start small: given the number of variables directly controlled by the experimenter and the amount of data obtained, an incremental approach is unavoidable if the results are to be understood. As mentioned, however, most of these drawbacks can be overcome by allying computational evolutionary models with naturalistic studies (see Sections Risky Choices in Nonhuman Primates: Implications for Human Pathological Gambling and Risk Attitudes, Environmental Uncertainty and Addictive Behavior: Perspectives from Computational Neuroscience and Evolutionary Robotics).

#### **CONCLUSIONS**

In this review, we first discussed how the development of refined operant protocols, to reproduce and to evaluate the gambling proneness phenotype in animal models, is fundamental to increase our understanding of the neurobiological determinants underlying the etiology of pathological gambling and/or to develop new treatment strategies. Then, we surveyed the role of comparative studies on choice behavior in other species, in particular in nonhuman primates, in informing us about the evolutionary origins and cognitive underpinnings of human attitudes towards risk and uncertainty. Finally, we summarized various ways in which computational models can be of assistance in the study of gambling behaviors: while results in this area are still preliminary, we were able to point out several substantial indications originating from the combination of naturalistic observations and artificial modeling.

Reviewing such diverse studies together is meant to have an impact on the methodology of future gambling research: while looking at each of these three rich areas of research in isolation is certainly useful, the potential benefits are compounded by integrating all these methods. What one learns from an animal model (about the neurobiological underpinnings of pathological gambling) should immediately be verified via computational techniques, and the further predictions generated by that computational model should be tested empirically in natural, living organisms. Similarly, any evolutionary hypothesis on what adaptive pressures shaped risk attitudes, and generated (possibly as a by-product) gambling behavior, should be verified via computational evolutionary models, which in turn should be informed by naturalistic data coming from ethological studies. Only by bringing to the table both human and nonhuman gamblers shall we understand what makes us so vulnerable to such a self-destructive behavioral pattern.

#### **ACKNOWLEDGMENTS**

Fabio Paglieri and Elsa Addessi received funding for this research from an ISTC-CNR intramural grant and from an American Society of Primatologists General Small Grant. Elsa Addessi also gratefully acknowledges the support of the PNR/CNR Aging Program 2012-2014. Walter Adriani received support for this research as Principal Investigator of the ERA-net "NeuroGenMRI", and Giovanni Laviola and Walter Adriani received funding from the project "GAMBLING—Fattori psicobiologici alla base di comportamenti di ricerca del rischio, disturbi nel controllo degli impulsi e gioco d'azzardo patologico" of the Department of Antidrug Policies, Presidency of the Council of Ministers (Italy).

#### **REFERENCES**


computational models of striatal-cortical dysfunction. *Biol. Psychiatry* 62, 756–764. doi: 10.1016/j.biopsych.2006.09.042


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 30 November 2013; paper pending published: 07 January 2014; accepted: 22 January 2014; published online: 11 February 2014.*

*Citation: Paglieri F, Addessi E, De Petrillo F, Laviola G, Mirolli M, Parisi D, Petrosino G, Ventricelli M, Zoratto F and Adriani W (2014) Nonhuman gamblers: lessons from rodents, primates, and robots. Front. Behav. Neurosci. 8:33. doi: 10.3389/fnbeh.2014.00033*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Paglieri, Addessi, De Petrillo, Laviola, Mirolli, Parisi, Petrosino, Ventricelli, Zoratto and Adriani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Getting a grip on problem gambling: what can neuroscience tell us?

#### **Anna E. Goudriaan <sup>1</sup>\*, Murat Yücel <sup>2</sup> and Ruth J. van Holst <sup>1,3</sup>**

<sup>1</sup> Department of Psychiatry and Amsterdam Institute for Addiction Research, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands

<sup>2</sup> Monash Clinical and Imaging Neuroscience (MCIN) Laboratory, Monash Biomedical Imaging and School of Psychological Sciences, Monash University, Monash, VIC, Australia

<sup>3</sup> Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands

#### **Edited by:**

Paul Vezina, The University of Chicago, USA

#### **Reviewed by:**

Douglas L. Delahanty, Kent State University, USA; Susan Ferguson, University of Washington, USA; Matthias Brand, University Duisburg-Essen, Germany; Marco Leyton, McGill University, Canada

#### **\*Correspondence:**

Anna E. Goudriaan, Department of Psychiatry and Amsterdam Institute for Addiction Research, Academic Medical Center, University of Amsterdam, PO Box 22660, 1100 DD Amsterdam, Meibergdreef 5, 1105 AZ Amsterdam, Netherlands e-mail: agoudriaan@gmail.com

In problem gamblers, diminished cognitive control and increased impulsivity are present compared to healthy controls. Moreover, impulsivity has been found to be a vulnerability marker for the development of pathological gambling (PG) and problem gambling (PrG) and to be a predictor of relapse. In this review, the most recent findings on the functioning of the brain circuitry relating to impulsivity and cognitive control in PG and PrG are discussed. Diminished functioning of several prefrontal areas and of the anterior cingulate cortex (ACC) indicates that cognitive-control related brain circuitry functions are diminished in PG and PrG compared to healthy controls. The available cue reactivity studies on PG and PrG indicate increased responsiveness towards gambling stimuli in fronto-striatal reward circuitry and in brain areas related to attentional processing compared to healthy controls. At this point it is unresolved whether PG is associated with hyper- or hypo-activity in the reward circuitry in response to monetary cues. More research is needed to elucidate the complex interactions for reward responsivity in different stages of gambling and across different types of reward. Conflicting findings from basic neuroscience studies are integrated in the context of recent neurobiological addiction models. Neuroscience studies on the interface between cognitive control and motivational processing are discussed in light of current addiction theories.

**Clinical implications**: We suggest that innovation in PG therapy should focus on improvement of dysfunctional cognitive control and/or motivational functions. The implementation of novel treatment methods like neuromodulation, cognitive training and pharmacological interventions as add-on therapies to standard treatment in PG and PrG, in combination with the study of their effects on brain-behavior mechanisms, could prove an important clinical step forward towards personalizing and improving treatment results in PG.

**Keywords: pathological gambling, disordered gambling, reward sensitivity, impulsivity, cue reactivity, response inhibition, review, addictive behaviors**

#### **GAMBLING, COGNITIVE CONTROL, AND IMPULSIVITY: ON GAMBLING AND THE CONCEPT OF SELF-CONTROL**

Pathological gambling (PG) has a relatively stable prevalence in western countries, with estimates ranging from 1.4% (lifetime prevalence) in the USA to 2% in Canada (Welte et al., 2002; Cox et al., 2005). Prevalence rates are comparable and relatively stable between countries and across survey instruments (Stucki and Rihs-Middel, 2007), with a cumulative rate of around 3% for PG and problem gambling (PrG) together.

Diminished cognitive control over the urge to engage in addictive behaviors is a central characteristic of PG. It is central to the *phenomenology* of PG as defined in several of the diagnostic criteria of PG (e.g., unsuccessful efforts to control, cut back, or stop gambling). From a neurocognitive perspective, the overarching notion of cognitive control can be defined as the ability to control one's actions. Cognitive control can be divided into several (sub)processes, such as the ability to inhibit automatic responses (referred to as response inhibition, measured by tasks like the stop signal task) and the ability to ignore irrelevant interfering information (referred to as cognitive interference, measured by tasks such as the Stroop task). In verbal descriptions of cognitive control, the term "impulsivity" is regularly used to indicate a tendency to act on a whim and to display behavior that is characterized by little or no forethought, reflection, or consideration of the consequences (Daruna and Barnes, 1993). Impulsivity is a multi-faceted construct that is often decomposed into "impulsive action", characterized by diminished motor inhibition, and "impulsive choice", represented by a propensity to favor immediate rewards over delayed, larger, or more beneficial rewards in decision-making processes (Lane et al., 2003; Reynolds, 2006; Reynolds et al., 2006; Broos et al., 2012). Impaired response inhibition is thought to predispose to impulsive behavior, and diminished cognitive control has been implicated as an endophenotypic vulnerability marker for addictive disorders in recent years.
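The notion of "impulsive choice" can be made concrete with a small worked example. The sketch below assumes the hyperbolic discounting form commonly used to score delay discounting tasks, in which the subjective value of an amount A delivered after a delay D is A / (1 + kD); this functional form, the discount rates `k`, and the monetary amounts are assumptions introduced here for illustration and are not taken from the studies cited above. A larger `k` corresponds to steeper discounting, i.e., more impulsive choice.

```python
def discounted_value(amount, delay, k):
    """Subjective present value under hyperbolic discounting (assumed form)."""
    return amount / (1.0 + k * delay)


def prefers_immediate(immediate, delayed, delay, k):
    """True if the smaller immediate reward is valued above the delayed one."""
    return immediate > discounted_value(delayed, delay, k)


def indifference_k(immediate, delayed, delay):
    """Discount rate at which both options are valued equally."""
    return (delayed / immediate - 1.0) / delay


# Hypothetical choice: 50 units now versus 100 units in 30 days.
for k in (0.01, 0.1):
    choice = "immediate" if prefers_immediate(50, 100, 30, k) else "delayed"
    value = discounted_value(100, 30, k)
    print(f"k = {k}: delayed reward worth {value:.1f} now, chooses {choice}")

print(f"indifference at k = {indifference_k(50, 100, 30):.4f}")
```

In this example the shallow discounter waits for the larger reward while the steep discounter takes the smaller immediate one; the indifference point marks the discount rate that separates the two patterns for this particular pair of options.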

Numerous self-report and neurocognitive studies in PG indicate increased impulsivity on measures such as the Barratt Impulsiveness Scale or Eysenck's Impulsiveness Questionnaire (Eysenck et al., 1985), and diminished cognitive control as evidenced by poorer performance on response inhibition, cognitive interference, and delay discounting tasks (for reviews see: Goudriaan et al., 2004; Verdejo-Garcia et al., 2008; van Holst et al., 2010a,b). Clinically, diminished control over one's own behavior could lead to a higher vulnerability to develop PrG or PG. For instance, a diminished ability to inhibit responses (response inhibition) could be associated with a faster progression into PrG, due to the diminished ability to stop gambling when one's money runs out. Similarly, a diminished ability to resist cognitive interference could lead to a diminished ability to ignore cues for gambling in the environment. For example, experiencing high cognitive interference could lead to a higher responsivity towards gambling advertisements, which could lead to a higher likelihood of engaging in gambling, whereas diminished cognitive control could result in a diminished ability to stop gambling despite high losses.

Several reviews have already been published with a focus on cognitive control or impulsivity studies in PG (van Holst et al., 2010a,b; Conversano et al., 2012; Leeman and Potenza, 2012). This review therefore focuses on more recent neurocognitive and neuroimaging studies on PG and PrG. Specifically, it focuses on neuroimaging studies of motivational aspects (e.g., cue reactivity), on neuroimaging studies of cognitive functions (e.g., impulsivity), and on neuroimaging studies addressing the interaction between cognitive and motivational processes.

Whereas PG is clearly defined by fulfillment of the (usually latest version of the) DSM diagnostic criteria for PG, there is no clear definition of PrG. Usually, PrG refers to a less severe form of PG, or is used when no clinical diagnosis can be determined because questionnaires were administered instead of structured clinical interviews. Some studies define PrG by a score of 5 or higher on the South Oaks Gambling Screen (SOGS) or by a score of 3 or higher on a short version of the SOGS (Slutske et al., 2005). In other studies, gamblers who are in treatment for problematic gambling and fulfill up to four of the PG criteria are defined as problem gamblers (Scherrer et al., 2005), or the entire studied group is defined as "problem gamblers" when not all of the participants who are in treatment fulfill five or more of the PG criteria (e.g., de Ruiter et al., 2012). In this review, therefore, the term PrG is used when no information is given on a DSM diagnosis of PG, but questionnaire data indicate that PrG is present.

As concluded in Conversano et al. (2012), several studies indicate diminished cognitive control in PG as evidenced in stop-signal tasks, Go-NoGo tasks, and also in Stroop task performance. Ledgerwood et al. (2012), however, assessed response inhibition with a Stroop and a stop signal task and reported no differences between pathological gamblers and controls on these tasks, although differences were present in planning tasks (Tower of London) and in cognitive flexibility (Wisconsin Card Sorting Test). As the sample included both community-recruited pathological gamblers (not in treatment) and treatment-seeking pathological gamblers, differences with other studies may be related to a less severe cognitive profile in non-treatment seeking pathological gamblers. Indeed, in another study by the same group, lower impulsivity scores (Barratt Impulsiveness Scale), fewer past-year illegal behaviors, lower rates of depression and dysthymic disorders, and lower preoccupation with gambling were present in community-recruited pathological gamblers vs. pathological gamblers in treatment (Knezevic and Ledgerwood, 2012).

Despite the number of neuropsychological studies indicating diminished cognitive control, the number of neuroimaging studies focusing on the neural mechanisms underlying diminished cognitive control is very limited, and therefore all neuroimaging studies on cognitive control are discussed here. In an fMRI study by Potenza et al., a Stroop task was administered to 14 pathological gamblers and 13 healthy controls (HCs) (Potenza et al., 2003a). Diminished BOLD responsivity in the left ventromedial PFC and in the superior OFC was reported in pathological gamblers compared to HCs, despite a lack of behavioral differences. This lack of behavioral differences may have been related to the modified version of the Stroop that was used: participants silently named the colors of the letters, and behavioral performance was measured by the participants' self-report after performing the Stroop task. In a recent study by de Ruiter et al. (2012), diminished neural responsivity after failed inhibitions was found in the anterior cingulate cortex (ACC) in 17 problem gamblers compared to 17 HCs. Of note, reduced activity was also observed following successful inhibitions in similar regions (right dorso-medial PFC bordering on ACC) compared to HCs. In this study, as in the study by Potenza et al., no behavioral differences were found for the PrG group compared to the HCs, which may be related to power issues due to the smaller sample sizes of fMRI studies in PrG and PG compared to neuropsychological studies. Both these fMRI studies on cognitive control in PG and PrG show diminished functioning of several prefrontal areas and of the ACC, indicating that cognitive-control related brain circuitry functions are diminished in PG and PrG compared to HCs. These results imply that diminished frontal functions may contribute to the pathophysiology of PG and PrG, in which diminished control over gambling behavior is central.

Another line of studies shows that impulsivity also plays an important role as a vulnerability factor for the development of PrG. Several longitudinal studies in adolescents and adults from a research group in Montreal, Canada, show that the level of impulsivity is a predictor of both gambling and PrG (Vitaro et al., 1997, 1999; Wanner et al., 2009; Dussault et al., 2011). Specifically, increasing impulsivity levels were associated with higher levels of PrG (Vitaro et al., 1997). In one of the more recent studies, a positive predictive link between impulsivity at age 14 and depressive symptoms and gambling problems at age 17 was present (Dussault et al., 2011). In another study using two male community samples, behavioral disinhibition and deviant peers were related to PrG, but also to substance use and delinquency, indicating similar risk factors for vulnerability to several externalizing problem behaviors (Wanner et al., 2009). These studies focused on adolescents and the predictive role of impulsivity for PrG; very recently, two large-scale longitudinal birth cohort studies investigated the role of impulsivity in early childhood and PrG during adulthood. In one of these studies (Shenassa et al., 2012), psychologists rated impulsive and shy/depressed behaviors at age 7, and related these to life-time self-reported PrG in adulthood at follow-up. Whereas impulsive behavior at age 7 predicted PrG, shy/depressed behavior did not predict PrG in adulthood, in this US based cohort of 958 offspring from the Collaborative Perinatal Project. In a large birth cohort study from Dunedin, New Zealand, temperament was assessed at age 3, and disordered gambling was assessed in this cohort at ages 21 and 32. Remarkably, children with a (behaviorally and emotionally) undercontrolled temperament at age 3 were more than twice as likely to evidence disordered gambling in adulthood, compared to children who were well-adjusted at age 3. This relationship was even stronger in boys compared to girls (Slutske et al., 2012). Several other studies show that impulsivity is also a vulnerability marker for engaging in gambling (Pagani et al., 2009; Vitaro and Wanner, 2011).

In conclusion, this line of studies provides strong evidence that impulsivity and diminished behavioral control play an important promoting role at every stage, from the initial engagement in gambling to the development and persistence of at-risk gambling and PrG.

Given this crucial role of cognitive control in promoting gambling and PrG, as evidenced by the birth cohort studies and neurocognitive studies, more neuroimaging studies in PrG and PG should focus on cognitive control, in order to elucidate which neurophysiological mechanisms may underlie diminished cognitive control in problematic gambling. Thus, studying interactions between (novel) psychological, pharmacological, or neuromodulation interventions in PG, and their effect on the neurocircuitry of cognitive control in PG, is a very relevant avenue for future neuroimaging and clinical intervention studies in PG (detailed in the Discussion section).

#### **RIGHT ON CUE? CUE-REACTIVITY STUDIES IN PROBLEM GAMBLING**

Compared to the small number of neuroimaging studies on cognitive control or impulsivity in PG and PrG, the neural mechanisms of cue-reactivity in PG and PrG are relatively well-studied. Five neuroimaging studies on cue-reactivity in PG and PrG (Potenza et al., 2003b; Crockford et al., 2005; Goudriaan et al., 2010; Miedl et al., 2010; Wölfling et al., 2011) and several studies focusing on cue reactivity relating to subjective craving and/or peripheral physiological responses in PrGs have been published (Freidenberg et al., 2002; Kushner et al., 2007; Sodano and Wulfert, 2010). For the purpose of this review, we focus on the neuroimaging findings.

Of the five neuroimaging studies in PG and PrG related to cue reactivity, the first (Potenza et al., 2003b) used a cue reactivity paradigm consisting of videos designed to evoke emotional and motivational antecedents to gambling. In these videos, actors mimicked emotional situations (e.g., happy, sad), after which the actor described driving to or walking through a casino and experiencing the feeling of gambling. In this study, timeframes in which the participants experienced craving were analyzed for 10 pathological gamblers compared to 11 HCs. In all cases, this was before actual gambling cues were present and in response to the actors' descriptions of the emotional situation (i.e., gambling scenarios). Less activation in the cingulate gyrus, (orbito) frontal cortex (OFC), caudate, basal ganglia, and thalamic areas was present in the 10 pathological gamblers compared to the 11 HCs. In another study using gambling-related videos to elicit cue-reactivity, 10 pathological gamblers and 10 HCs were compared on brain responsivity to these gambling-related videos compared to watching nature-related videos (Crockford et al., 2005). Higher activation in dorsal prefrontal areas, inferior frontal areas, the parahippocampal areas, and occipital lobe was found in pathological gamblers compared to HCs. In a subsequent fMRI cue-reactivity study, Goudriaan et al. (2010) found elevated activity of similar regions when comparing 17 pathological gamblers vs. 17 HCs using gambling-related and gambling-unrelated photos. In this last study, a positive relationship was found between subjective craving for gambling in pathological gamblers and activity of the frontal and parahippocampal regions when viewing gambling pictures vs. neutral pictures. In an EEG study by Wölfling et al. (2011), 15 pathological gamblers were compared to 15 HCs on EEG responsivity to gambling pictures compared to neutral, positive, and negative emotional pictures. Compared to HCs, pathological gamblers showed significantly larger late positive potentials (LPPs) induced by gambling stimuli when compared to neutral stimuli, but displayed comparable LPPs towards negative and positive emotional pictures. In contrast, in HCs there was a larger response towards positive and negative stimuli compared to both neutral and gambling stimuli. Higher LPPs were present in the parietal, central, and frontal electrodes in pathological gamblers compared to HCs, interpreted as a higher overall psychophysiological responsivity towards gambling stimuli in pathological gamblers.

Finally, in an fMRI study comparing brain responsivity towards high-risk vs. low-risk gambling situations in 12 problem gamblers vs. 12 HCs, problem gamblers showed an increased BOLD response in thalamic, inferior frontal, and superior temporal regions during high-risk trials and a signal decrease in these regions during low-risk trials. The opposite pattern was observed in the non-problem gamblers (Miedl et al., 2010). The authors argue that this frontal-parietal activation pattern during high-risk trials compared to low-risk trials in problem gamblers reflects a cue-induced addiction memory network, triggered by gambling-related cues. The findings of this study imply that high-risk wagers may be attractive to problem gamblers, eliciting cue-reactivity and craving, whereas low-risk wagers, which represent a high chance of winning a smaller amount of money, may elicit higher reward expectations in non-problem gamblers. One possible interpretation of the diminished responsiveness to low-risk wagers in problem gamblers is diminished reward sensitivity, reflected in a blunted brain response to low-risk monetary rewards.

When summarizing the neuroimaging studies on cue-reactivity in PG and PrG, a convergent picture emerges regarding the studies that employ gambling pictures or gambling movies in which actual gambling scenes are included. In these studies, increased responsiveness in fronto-striatal reward circuitry and brain areas related to attentional processing towards gambling stimuli is present in pathological gamblers/problem gamblers compared to HCs (Crockford et al., 2005; Goudriaan et al., 2010; Miedl et al., 2010; Wölfling et al., 2011). In contrast, in the one study employing stress-provoking situations, followed by verbal descriptions of wanting to engage in gambling, diminished responsiveness in fronto-striatal circuitry was found (Potenza et al., 2003b). These findings imply that cue-reactivity elicited by gambling stimuli engages reward- and motivation-related circuitry, thus potentially enhancing the chance of engaging in gambling. On the other hand, negative mood states induced by stressful situations may induce a relatively diminished activity in the same reward- and motivation-related circuitry in pathological gamblers, which in turn may elicit craving for gambling, in order to relieve this depletion in reward experience (or anhedonia). The one finding of diminished fronto-striatal reactivity (Potenza et al., 2003b) relates to the "allostatic" negative emotional state (e.g., dysphoria, anxiety, irritability) reflecting a motivational withdrawal syndrome state as hypothesized by Koob and Le Moal and as recently integrated in a review by Koob and Volkow (2010). The remainder of the neuroimaging findings in response to gambling cues relate to the preoccupation and anticipation of engaging in addictive behavior, characterized by craving. Thus, both increased responsivity in the brain's reward system to gambling cues and decreased responsivity of the reward system to stress-provoking cues in anticipation of gambling could lead to craving and (relapse in) gambling. This combination is also consistent with a behavioral study by Kushner et al. (2007), in which diminished cue reactivity was reported after negative mood induction.

Together, these cue-reactivity studies and addiction theories indicate that an important area to investigate in PG and PrG is how positive mood states and negative mood states/stress reactivity relate to both craving for gambling and gambling behavior. From the studies comparing gambling stimuli to neutral stimuli, increased frontal-striatal reactivity relating to increased cue-reactivity is evident. However, the role of the amygdala and negative emotional mood states (i.e., as a "motivational withdrawal syndrome") in inducing craving and relapse in PG and PrG should receive additional research attention.

The "withdrawal/negative affect" part of the addiction cycle, which consists of re-engagement in addictive behaviors in order to diminish withdrawal effects and/or negative affect (Koob and Volkow, 2010), can be linked to the emotionally vulnerable problem gambler, one of the three subtypes of problem gamblers proposed by Blaszczynski and Nower (2002), which is characterized by stress reactivity and negative mood as a pathway to PrG. The "preoccupation/anticipation" part of the addiction cycle, which is characterized by enhanced attention and cue-reactivity towards addiction-relevant cues, links to the "antisocial, impulsivist" subgroup of problem gamblers as defined by Blaszczynski and Nower (2002). They describe the latter subgroup of problem gamblers as characterized by higher impulsivity and by clinical impulsive behaviors such as ADHD and substance abuse, which promote and hasten processes of classical and operant conditioning in developing PrG (Blaszczynski and Nower, 2002).

So far, these three subtypes of pathological gamblers have hardly been studied empirically: Ledgerwood and Petry (2010) investigated these three gambling subtypes, classified on the basis of self-report questionnaires, within a group of 229 pathological gamblers. Although the subtypes differed on PrG severity, subtyping did not predict a differential treatment response. Several behavioral studies indicate differences between problem gamblers and HCs in stress reactivity. For instance, in a recent study (Steinberg et al., 2011), uncontrollable noise (stress induction) led to diminished craving for gambling in problem gamblers, whereas it increased craving for alcohol use in problem gamblers, participants with an alcohol use disorder, and HCs. This finding, although in a small sample (12 participants in each clinical group), indicates that differential changes in craving for different addictive behaviors may result from stress (here: gambling vs. alcohol use). In a self-report study (Elman et al., 2010), the only measure positively related to gambling urges in problem gamblers was a daily stress inventory, indicating a positive relation between stress and craving for gambling. Interestingly, in a recent pilot study with a pharmacological challenge with yohimbine, significant left amygdala activation in response to yohimbine was observed across all four PG subjects, whereas this effect was not present in the five HCs, suggesting pharmacologically induced stress sensitization in the brain of pathological gamblers. Thus, studies focusing on the relation between stress reactivity and gambling cues, gambling urges, and gambling behavior are needed, in order to elucidate the etiology of both the withdrawal/negative affect (stress reactivity) and the preoccupation/anticipation (cue reactivity) part of the addiction cycle in PG and PrG. Based on the results of these behavioral and physiological studies, and the negative finding from the one study focusing on the three subtypes of pathological gamblers (Ledgerwood and Petry, 2010), it is clear that more (neuro)biological research is needed into subtyping of PG. It may well be that one problem gambler subtype can be identified in whom gambling urges emerge through negative affect (with amygdala circuit abnormalities as a neural mechanism) and another problem gambler subtype in whom gambling urges emerge through gambling cues (with a hyperactive orbitofrontostriatal circuitry as the underlying neural mechanism). This subtyping of pathological gamblers based on endophenotype (negative affect/stress reactivity vs. positive affect/gambling cue reactivity) could then be compared to the three subtypes as defined by Nower and Blaszczynski (2010): behaviorally conditioned, emotionally vulnerable, and antisocial-impulsive.

Although a minimal number of neuroscience studies on stress reactivity in PG and PrG exist, a related issue is the presence of either increased or decreased reward sensitivity in neuroimaging studies in PG and PrG, and these studies will be discussed next.

#### **EXCESSIVE OR DIMINISHED REWARD SENSITIVITY IN PROBLEM GAMBLING: IS IT ALL IN THE GAME OR ALL IN THE MONEY?**

A popular hypothesis of addiction is that substance-dependent persons suffer from a reward deficiency syndrome, which makes them pursue strong reinforcers (i.e., drugs) to overcome this deficiency (Comings and Blum, 2000). The first fMRI studies in PG focusing on reward processing have reported results consistent with such decreased reward sensitivity. For example, in response to monetary gains compared to monetary losses, pathological gamblers showed blunted activation of the ventral striatum and ventral prefrontal cortex (Reuter et al., 2005). Similarly, attenuated activation of ventral prefrontal cortices was found with a cognitive switching paradigm in which problem gamblers could win or lose money depending on their performance (de Ruiter et al., 2009).

Recently, more detailed studies investigating *different phases of reward processing* have been conducted. Using a modified monetary incentive delay (MID) task (Knutson et al., 2000) in which subjects have to make speeded responses to acquire points/money or to prevent losing points/money, pathological gamblers showed attenuated ventral striatal responses during reward anticipation as well as in response to monetary wins (Balodis et al., 2012; Choi et al., 2012). Whereas results from these two studies are consistent with the reward deficiency hypothesis, other fMRI studies have found increased responses in anticipation of reward or after receiving rewards in fronto-striatal reward related brain areas.

For instance, using a probabilistic choice game to model anticipatory processing, pathological gamblers showed greater dorsal striatum activity during anticipation of large rewards compared to small rewards (van Holst et al., 2012c). In addition, pathological gamblers compared to controls showed higher activity in the dorsal striatum and OFC for gain-related expected value. Hyperreactivity after receiving monetary rewards in high-risk bets was also found in the medial frontal cortex in an ERP study using a blackjack task (Hewig et al., 2010). In an fMRI study by Miedl et al. (2012), subjective value coding for delay discounting and probability discounting in pathological gamblers and HCs was investigated. The subjective value for each task was computed for each participant individually and correlated with brain activity in the ventral striatum. Compared to controls, pathological gamblers showed a greater subjective value representation in the ventral striatum on the delay discounting task, but a reduced subjective value representation during the probability discounting task. This indicates that pathological gamblers evaluate values and probabilities differently than controls. These results suggest that abnormal choice behavior with regard to future delayed rewards in problem gamblers could be related to different value coding.
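To make the notion of subjective value in discounting tasks concrete, the following is a minimal sketch that assumes the widely used hyperbolic discounting model. The text above does not state which model Miedl et al. (2012) fitted, and the symbols $A$, $D$, $p$, $\theta$, $k$, and $h$ are introduced here for illustration only.

$$
SV_{\text{delay}} = \frac{A}{1 + kD}, \qquad SV_{\text{prob}} = \frac{A}{1 + h\,\theta}, \qquad \theta = \frac{1 - p}{p}
$$

Here $A$ is the objective reward amount, $D$ the delay until the reward is received, $p$ the probability of receiving it, and $\theta$ the odds against receiving it. The parameters $k$ and $h$ are participant-specific discount rates estimated from each participant's choices. Under this assumption, a larger $k$ means that delayed rewards lose subjective value more steeply, and a larger $h$ means that uncertain rewards do so. The trial-by-trial values of $SV$ are the kind of quantity that can then be related to ventral striatal activity.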

At this point it is unresolved whether PG is associated with hyper- or hypo-activity in the reward circuitry in response to monetary cues, an issue that also exists in the substance dependence literature (Hommer et al., 2011). Several methodological issues could explain the hyper- or hypo-activity findings in the reward circuitry in the above-mentioned studies. For example, in the MID task subjects have to respond as quickly as possible to a target to obtain a reward, whereas in the task used by van Holst et al. (2012c) subjects have no influence on their wins or losses. This difference in control over the task outcomes could have influenced the striatal responses during the task. Furthermore, the graphic designs of the two studies also differed markedly: the MID task used in the study by Balodis et al. (2012) used non-monetary abstract pictograms, whereas the task by van Holst et al. (2012c) featured familiar playing cards and Euro coins and bills. These gambling-associated cues may elicit cue reactivity responses leading to hyperresponsivity in the striatal regions (for a discussion, see Leyton and Vezina, 2012; van Holst et al., 2012c,d). This hypothesis of diminished striatal reactivity in the absence of addiction-relevant cues, and striatal overactivity in the presence of such cues, was recently reviewed in depth by Leyton and Vezina (2013).

The reward deficiency hypothesis of addiction has received considerable support from PET studies measuring dopamine functioning, consistently showing lower dopamine D2/D3 receptor binding potential in drug-dependent subjects (Martinez et al., 2004, 2005, 2011; Volkow et al., 2004, 2008; Lee et al., 2009). Whether this D2/D3 receptor binding potential underlies PG is still unclear because PET techniques have only recently been utilized in PG. Currently, no significant differences in baseline DA binding in pathological gamblers compared to HCs seem to be present (Linnet et al., 2010; Joutsa et al., 2012; Boileau et al., 2013), but other studies indicate positive correlations between DA binding and gambling severity and impulsivity (Clark et al., 2012; Boileau et al., 2013). In addition, a PET study measuring DA activity during the Iowa gambling task found that DA release in pathological gamblers was related to excitement (Linnet et al., 2011a) and poor performance (Linnet et al., 2011b). Overall, these results do suggest a role for abnormal DA binding in PG, but not to the same extent as that found in drug addiction, in which clearly diminished binding potentials are consistently reported (Clark and Limbrick-Oldfield, 2013). Missing from the literature are studies measuring more stable baseline DA synthesis capacity: existing studies have only focused on aspects related to highly state-dependent DA D2/3 receptor availability. Studies measuring DA synthesis capacity could test the hypothesis of a higher DA synthesis capacity in PG and PrG. Higher DA synthesis could lead to higher dopaminergic *reactivity* when confronted with addiction-related cues (e.g., games, money, risk). Furthermore, PG studies directly manipulating DA and measuring fMRI BOLD responses during reward processing could provide important information about the causal role of DA in PG.

An alternative to the reward deficiency hypothesis for PG and PrG is that, similar to substance use disorders (SUDs; Robinson and Berridge, 2001, 2008), pathological gamblers and problem gamblers suffer from an enhanced incentive salience for gambling-related cues. This enhanced incentive salience for gambling cues could be so strong that it overrides the incentive salience of alternative sources of reward, leading to an imbalance in incentive motivation. To test whether pathological gamblers suffer from an overall reward deficiency or from an imbalance in incentive salience, Sescousse et al. (2013) compared neural responses to both financial gains and primary rewards (erotic pictures) in pathological gamblers and HCs. In line with the latter hypothesis, hypo-reactivity was observed for the erotic cues, in contrast with normal reactivity to the financial rewards, indicating an imbalanced incentive salience attribution in PG. Taking all the above studies together, at this point it seems most likely that pathological gamblers do not suffer from a reward deficiency in general but appraise gambling-related stimuli differently, presumably because of enhanced incentive salience of these stimuli.

Recently, fMRI studies have focused on specific gambling-related cognitive biases. This is important because problem gamblers often display a number of cognitive biases regarding gambling games (Toneatto et al., 1997; Toneatto, 1999; Clark, 2010; Goodie and Fortune, 2013). For example, gamblers are known to falsely believe that they can influence outcome probabilities of games ("illusion of control") (Langer, 1975). Various intrinsic features of games of chance promote these biases (Griffiths, 1993), for example "near-miss" events (Kassinove and Schare, 2001). These near-win or near-miss outcomes (which are actually losses) occur when two reels of a slot machine display the same symbol and the third reel displays that symbol immediately above or below the pay-off line. A study investigating near-miss effects in problem gamblers found that near-miss outcomes (compared to full-miss outcomes) activated brain reward regions, such as the striatum and insular cortex, similar to those activated during win outcomes (Chase and Clark, 2010). Habib and Dixon (2010) found that near-miss outcomes lead to more win-like brain responses in pathological gamblers, whereas HCs activated brain regions associated with losses to a larger extent. These studies contribute to a better understanding of the addictiveness of gambling games and its underlying neuronal mechanism.

#### **CAN ENHANCED SALIENCE FOR GAMBLING RELATED STIMULI LEAD TO LOSS OF CONTROL OVER BEHAVIOR?**

An influential and empirically grounded neurobiological model for substance dependence, the Impaired Response Inhibition and Salience Attribution (I-RISA) model, postulates that repeated drug use triggers a series of adaptations in neuronal circuits involved in memory, motivation, and cognitive control (Volkow et al., 2003). If an individual has used drugs, memories of these events are stored as associations between the stimulus and the elicited positive (pleasant) or negative (aversive) experiences, facilitated by dopaminergic activation caused by the drug of abuse. This results in an enhanced (and long-lasting) salience for the drug and its associated cues at the expense of decreased salience for natural reinforcers (Volkow et al., 2003). In addition, the I-RISA model assumes loss of control (disinhibition) over drugs due to enhanced salience and pre-existing deficiencies (as discussed in part 1 of the review), which renders individuals suffering from addictive disorders vulnerable to relapse into addictive behavior.

In addictive disorders including PG, there is evidence that both affective and motivational systems are more sensitive to addiction relevant material. For example, studies have shown that addiction related cues attract more attention than other salient stimuli, a phenomenon known as "attentional bias" (McCusker and Gettings, 1997; Boyer and Dickerson, 2003; Field and Cox, 2008). As discussed in the "cue reactivity" section of this review, in problem gamblers, enhanced brain responsiveness towards gambling related cues ("cue reactivity") has also been found in brain areas related to motivational processing and cognitive control (amygdala, basal ganglia, ventrolateral prefrontal cortex and dorsolateral prefrontal cortex; Crockford et al., 2005; Goudriaan et al., 2010).

As discussed in the first section of this review, PG is associated with impaired cognitive control. However, how cognitive control interacts with motivational processes is still a subject of investigation. Only recently have studies started to test the interaction between cognitive control and salience attribution in PG. In one of our recent studies, we employed a modified Go/NoGo task by including affective stimulus blocks (gambling, positive, and negative), in addition to the standard affectively neutral block, in problem gamblers and HCs (van Holst et al., 2012b). Subjects were requested to respond or withhold a response to specific types of pictures with a different emotional loading, allowing the investigation of the interaction between motor inhibition and salience attribution. Whereas we found no behavioral differences on neutral response inhibition trials, problem gamblers compared to controls showed greater dorsolateral prefrontal and ACC activity. In contrast, during gambling and positive pictures, problem gamblers made fewer response inhibition errors than controls and showed reduced activation of the dorsolateral prefrontal cortex and ACC. This study indicated that problem gamblers rely on compensatory brain activity to achieve similar performance during neutral response inhibition. However, in a gambling-related or positive context, response inhibition appears to be *facilitated*, as indicated by lower brain activity and fewer response inhibition errors in problem gamblers. Data from this Go/NoGo study were further analyzed to test the effect of affective stimuli on functional connectivity patterns during the task (van Holst et al., 2012a). As expected, adequate response inhibition was related to functional connectivity within the sub-regions of the dorsal executive system as well as to functional connectivity between the dorsal executive and the ventral affective system in both HCs and problem gamblers. Compared to HCs, problem gamblers showed a stronger positive correlation between the dorsal executive system and task accuracy during inhibition in the gambling condition. These findings suggest that increased accuracy in problem gamblers during the gambling condition was associated with increased connectivity with the dorsal executive system (van Holst et al., 2012a). It seems likely that DA function plays an important role in these findings. Salient stimuli enhance DA transmission in the mesolimbic system (Siessmeier et al., 2006; Kienast et al., 2008) and DA is known to modulate prefrontal cortex functioning (Robbins and Arnsten, 2009). Indeed, in humans, DA transmission has an effect on functional connectivity within the cortico-striatal-thalamic loops (Honey et al., 2003; Cole et al., 2013). More research is needed to further clarify the interaction between motivation, DA, and cognitive control in PG. In the earlier mentioned review by Leyton and Vezina (2013), a model is proposed that integrates the influence of these opposite striatal responses on the expression of addictive behaviors. Central to their model is the idea that low striatal activity leads to an inability to sustain focused goal-directed behavior, whereas in the presence of high striatal activity (when drug cues are present) a sustained focus and drive to obtain rewards is present. The findings reviewed above (van Holst et al., 2012a,b) fit this model well: better performance was present in problem gamblers in the positive and gambling conditions, and more functional connectivity was found with the dorsal executive system in problem gamblers in the gambling condition. This could be an indication of normalization in problem gamblers of the underactive striatal system, in the presence of salient motivational cues in the positive and gambling Go/NoGo conditions.

It is clinically relevant to further investigate whether increased activity in the reward system indeed has the effect of transiently restoring prefrontal cortex functioning in problem gamblers. This could be tested by pharmacological challenges or by enhancing activity in the reward system more locally, for example by using real-time fMRI neurofeedback (deCharms, 2008) or Transcranial Magnetic Stimulation (TMS; Feil and Zangen, 2010). However, we suggest that enhanced salience of rewarding stimuli could also lead to *impaired* task performance. For example, when too much attention is allocated to salient stimuli, this can result in attenuated executive control resources (Pessoa, 2008). Enhanced reward-seeking behavior and enhanced responsiveness to potential rewards could therefore be an important concept in understanding why gamblers show diminished cognitive performance especially on tasks with contingencies (Brand et al., 2005; Goudriaan et al., 2005, 2006; Labudda et al., 2007; Tanabe et al., 2007; de Ruiter et al., 2009).

#### **SUMMARY NEUROIMAGING FINDINGS: SELF-CONTROL, CUE-REACTIVITY, REWARD SENSITIVITY AT DIFFERENT STAGES OF GAMBLING, AND THE INTERACTION BETWEEN SELF-CONTROL AND MOTIVATIONAL URGE**

When trying to reach an overarching conclusion with regard to the studies reviewed, it is clear that for some topics, consistent findings have been established over the years. For instance, the notion of increased impulsivity in PG and PrG is firmly established and the first neuroimaging studies show that this heightened impulsivity is accompanied by diminished prefrontal and ACC functioning. It is clear that the field of cognitive functions in PG needs more neuroimaging studies to investigate which cognitive functions are most affected. Neuroimaging cue-reactivity studies indicate that when gambling cues are present, the motivational system of the brain is overactive in PG and PrG, as evidenced by higher parahippocampal, amygdala, basal ganglia, and OFC activation. With regard to either enhanced or diminished neural reward sensitivity, the first studies seem to indicate that whereas enhanced activation of the brain's reward circuitry is present in *anticipation of* winning or in experiencing risky gambling situations, diminished reward responsiveness is present in this same circuitry *after* winning and/or losing money. Finally, the interaction of cue-reactivity and cognitive control suggests that the activation of the cognitive control system in problem gamblers may be enhanced by activating the motivational circuit. However, this finding is in need of replication, and the role of DA in facilitating or diminishing cognitive control in PG deserves further study.

#### **CLINICAL IMPLICATIONS**

Cognitive behavioral therapy (CBT) for problem gamblers focuses on behavioral and cognitive interventions to curb the motivational lure of gambling behavior and has been shown to be effective in the treatment of PG (Petry, 2006; Petry et al., 2006), although relapse rates are still high, at around 50–60% in treatment studies, with rates of continuous abstinence for a year as low as 6% (Hodgins et al., 2005; Hodgins and el Guebaly, 2010). Thus, there is still room for major improvement in treatment results for PG/PrG. CBT focuses on enhancing cognitive control over gambling and on changing the tendency to engage in gambling when encountering gambling cues or experiencing craving. Specific techniques used in CBT for PG and PrG include learning coping strategies, applying stimulus control strategies, and handling high-risk situations by implementing behavioral strategies, for instance those written on emergency cards. Thus, in CBT for PG and PrG, a substantial part of the intervention depends on engagement of executive functions by implementing behavior and emotion regulation strategies. In other psychiatric disorders, neuroimaging studies have shown that differences in pre-treatment brain functioning can predict CBT treatment effects. For instance, better frontal-striatal brain functions during a response inhibition task were associated with a better response to CBT in posttraumatic stress disorder (Falconer et al., 2013). Increased activity at baseline in the ventromedial PFC as well as valence effects in emotional tasks (e.g., social threat tasks) in the (anterior) temporal lobe, ACC, and DLPFC promote treatment success in major depressive disorder (Ritchey et al., 2011) and in social anxiety disorder (Klumpp et al., 2013). These findings not only suggest that brain functions may be important new biomarkers for indicating the chance of treatment success with CBT, but also point to the potential value of new interventions targeting neurobiological vulnerabilities of PG and PrG. By studying brain functions that are biomarkers for CBT success in PG and subsequently improving these brain functions by neuromodulation or pharmacological interventions, treatment results for PG and PrG may improve.

Several interventions targeted at neurobiological vulnerabilities of PG and PrG are promising and may result in additional treatment effects by interacting with and improving the functions that are a prerequisite for CBT success. Recently, neuromodulation interventions have gained interest in addiction research. Specifically, neurostimulation methods such as repetitive Transcranial Magnetic Stimulation (rTMS) and transcranial Direct Current Stimulation (tDCS) were evaluated in a meta-analysis (Jansen et al., 2013). In this meta-analysis, a medium effect size was found for neurostimulation with either rTMS or tDCS in reducing craving for substances or highly palatable food. In a study with multiple sessions of rTMS in 48 heavy smokers, 10 daily sessions of active rTMS over the DLPFC resulted in diminished cigarette consumption and nicotine dependence, compared to a control condition of sham rTMS (Amiaz et al., 2009). Related to neurostimulation, EEG neurofeedback in SUDs has recently gained renewed interest, with some pilot studies showing positive results of EEG neurofeedback training in cocaine dependence (Horrell et al., 2010) and opiate dependence (Dehghani-Arani et al., 2013). Thus, studies of neurostimulation or neurofeedback interventions in PG and PrG are warranted as well, to investigate whether these interventions also hold promise in this behavioral addiction.

As a potential non-pharmacological intervention, changes in the motivational system in PG could be targeted by "attentional retraining" (MacLeod et al., 2002; Wiers et al., 2006). During attentional retraining, patients are trained to reverse their attentional bias by performing computer tasks, with the aim of reducing cue reactivity and changing habitual behaviors. A related intervention is retraining of automatic action tendencies, in which approach behavior towards addiction-related stimuli is retrained to avoidance behavior (Wiers et al., 2006, 2010; Schoenmakers et al., 2007). In alcohol use disorders, results from these interventions are promising (Wiers et al., 2006, 2010). However, these interventions have not yet been tested in PG, and the long-term effects of attentional and action tendency retraining are not yet known and need to be assessed in future research.

#### **PHARMACOLOGICAL INTERVENTIONS**

In addition to the potential of neurostimulation, neurofeedback and attentional retraining interventions, a number of promising pharmacological interventions for the treatment of PG have been reported (for a review see van den Brink, 2012). Neurobiological findings indicate a pivotal role of the mesolimbic pathway, comprising the ventral striatum, and ventromedial prefrontal cortex (VMPFC) in PG. Because the VMPFC is a structure that mainly depends on DA projections that communicate with limbic structures to integrate information, dysfunctional DA transmission could be the underlying deficit causing the VMPFC dysfunctions in PG. However, numerous other neurotransmitter systems are probably also engaged and may interact during the processing of positive and negative feedback. For example, opiates are known to increase DA release in the reward pathway, and the opiate antagonists naltrexone and nalmefene, which are known to decrease DA release, have been found to reduce reward sensitivity and probably increase punishment sensitivity as well (Petrovic et al., 2008). Moreover, treatment with opiate antagonists has been shown to be effective in PG and to diminish gambling urges (Kim and Grant, 2001; Kim et al., 2001; Modesto-Lowe and Van Kirk, 2002; Grant et al., 2008a,b, 2010b).

Whereas in substance addictions, drugs and drug-associated stimuli may elicit DA release in the ventral striatum and reinforce drug intake during the acquisition of a substance use disorder, *chronic* drug intake is associated with neuroadaptation of glutamatergic neurotransmission in the ventral and dorsal striatum and limbic cortex (McFarland et al., 2003). In addition, cue exposure has been found to depend on projections of glutamatergic neurons from the prefrontal cortex to the nucleus accumbens (LaLumiere and Kalivas, 2008). Blocking the release of glutamate has prevented drug-seeking behavior in animals as well as in substance-dependent humans (Krupitsky et al., 2007; Mann et al., 2008; Rösner et al., 2008). Therefore, the first promising results from pilot studies with N-acetyl cysteine (Grant et al., 2007) and memantine (Grant et al., 2010a), which modulate the glutamate system, warrant larger studies that investigate the effects of these glutamate-regulating compounds in the treatment of PG.

Besides the focus on improving cognitive functions and diminishing craving by neuromodulation or pharmacological techniques, recently, interest in the influence of protective factors has grown. For instance, low impulsivity and active coping skills have been linked to a more positive outcome for SUDs. Thus, not only a focus on risk factors, but also on the role of protective factors and environmental variables that promote them may foster our understanding of the brain-behavior relationships and the pathways in developing and recovering from PG and PrG. A potential application of a focus on both risk and protective factors may be to monitor cognitive-motivational and brain functions during treatment, investigate which functions spontaneously normalize, and which functions need additions from novel interventions such as cognitive training, neuromodulation, or pharmacological interventions.

#### **CONCLUSIONS**

PG and PrG are clearly associated with cognitive and motivational differences in neuropsychological and brain functioning. Specifically, higher impulsivity and impaired executive functioning are present, which are associated with diminished functioning of the cognitive control circuitry in the brain, such as the ACC and dorsolateral prefrontal cortex. In addition, motivational functions are affected, which is associated with differential functioning in medial frontal areas and in the thalamo-striatal circuitry that links to the frontal cortex. More research is needed to investigate the interaction between cognitive and motivational functions, as including gambling cues in cognitive tasks sometimes also improves cognitive functions. Research into the efficacy of novel interventions that target these neurobiological mechanisms, such as neuromodulation, cognitive training, and pharmacological interventions, is needed to establish their potential to improve treatment outcome. In addition, research focusing on protective factors and the spontaneous recovery of risk factors could indicate which mechanisms to target in order to improve the course of PG.

#### **AUTHOR CONTRIBUTIONS**

Anna E. Goudriaan, Murat Yücel, and Ruth J. van Holst contributed to the design of the review, Anna E. Goudriaan and Ruth J. van Holst drafted parts of the manuscript, Anna E. Goudriaan, Ruth J. van Holst, and Murat Yücel revised this work critically for important intellectual content. Final approval of the version to be published was given by all authors and all authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

#### **REFERENCES**


Langer, E. J. (1975). The illusion of control. *J. Pers. Soc. Psychol.* 32, 311–328.

Ledgerwood, D. M., Orr, E. S., Kaploun, K. A., Milosevic, A., Frisch, G. R., Rupcich, N., et al. (2012). Executive function in pathological gamblers and healthy controls. *J. Gambl. Stud.* 28, 89–103. doi: 10.1007/s10899-010-9237-6


Pessoa, L. (2008). On the relationship between emotion and cognition. *Nat. Rev. Neurosci.* 9, 148–158. doi: 10.1038/nrn2317


longitudinal study of a complete birth cohort. *Psychol. Sci.* 23, 510–516. doi: 10.1177/0956797611429708


to young adulthood: additive and moderating effects of common risk factors. *Psychol. Addict. Behav.* 23, 91–104. doi: 10.1037/a0013182


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 November 2013; accepted: 08 April 2014; published online: 20 May 2014*.

*Citation: Goudriaan AE, Yücel M and van Holst RJ (2014) Getting a grip on problem gambling: what can neuroscience tell us? Front. Behav. Neurosci. 8:141. doi: 10.3389/fnbeh.2014.00141*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 Goudriaan, Yücel and van Holst. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

**OPINION ARTICLE** published: 02 December 2013 doi: 10.3389/fnbeh.2013.00182

### What motivates gambling behavior? Insight into dopamine's role

#### *Patrick Anselme1 \* and Mike J. F. Robinson2,3*

*<sup>1</sup> Département de Psychologie, Université de Liège, Liège, Belgium*

*<sup>2</sup> Department of Psychology, University of Michigan, Michigan, MI, USA*

*<sup>3</sup> Department of Psychology, Wesleyan University, Connecticut, CT, USA*

*\*Correspondence: panselme@ulg.ac.be*

#### *Edited by:*

*Bryan F. Singer, University of Michigan, USA*

#### *Reviewed by:*

*Nichole Neugebauer, University of Chicago, USA*

**Keywords: dopamine, motivation, gambling, loss, reward uncertainty**

It is commonly believed that monetary gain is the cause of gambling behavior in humans. Mesolimbic dopamine (DA), the chief neuromediator of incentive motivation, is indeed released to a larger extent in pathological gamblers (PG) than in healthy controls (HC) during gambling episodes (Linnet et al., 2011; Joutsa et al., 2012), as in other forms of compulsive and addictive behavior. However, recent findings indicate that the interaction between DA and reward is not so straightforward (Blum et al., 2012; Linnet et al., 2012). In PG and HC, DA release seems to reflect the unpredictability of reward delivery rather than reward *per se*. This suggests that the motivation to gamble is strongly (though not entirely) determined by the inability to predict reward occurrence. Here we discuss several views of the role of DA in gambling, and attempt to provide an evolutionary framework to explain its role in uncertainty.

#### **TRADITIONAL VIEW: MONEY DRIVES GAMBLING**

Common sense suggests that if gambling at casinos is attractive for many people, it is because it offers an opportunity to win money (Dow Schüll, 2012). Of course, a "big win" is rare, but the random component behind most games and the publicizing of big winners let people believe that winning a lot is not so unlikely. In this traditional view, money is a gambler's primary motivation, and randomness in games allows the gambler to hope that the gains will overcome the losses.

This view is compatible with the evidence that DA released in the nucleus accumbens, a mesolimbic region in the brain, magnifies the attractiveness of rewards and conditioned cues (Berridge, 2007). Mesolimbic DA transforms neutral cues into conditioned cues when they come to reliably predict reward delivery (Melis and Argiolas, 1995; Peciña et al., 2003; Flagel et al., 2011). Money is certainly a strong conditioned cue, which has been associated with abundance and power in all human civilizations. As with other reward sources, money is known to enhance mesolimbic DA levels in the human striatum during gambling episodes, suggesting that money is what motivates gamblers (Koepp et al., 1998; Zald et al., 2004; Zink et al., 2004; Pessiglione et al., 2007). For example, Joutsa et al. (2012) showed that DA is released in the ventral striatum during instances of high- but not low-reward, in both PG and HC, and that the severity of symptoms in PG is associated with larger DA responses.

#### **THE ATTRACTIVENESS OF LOSSES**

Although the traditional view is in agreement with neuroscientific data, it fails to explain why people often describe gambling as a pleasant activity rather than as an opportunity to gain money. During gambling episodes, PG report euphoric feelings comparable to those experienced by drug users (van Holst et al., 2010), and the more money PG lose, the more they tend to persevere in this activity, a phenomenon referred to as loss-chasing (Campbell-Meiklejohn et al., 2008). Such results are hardly compatible with the traditional view. Animal and human studies indicate that the role of DA in reward is, at least in gambling, more complex than initially believed (Linnet, 2013).

Determining the exact timing of subjective feelings or how losses spur on a gambler's desire to play during gambling episodes is difficult because different emotions and cognitions constantly overlap. Nevertheless, Linnet et al. (2010) were able to measure mesolimbic DA release in PG and HC winning or losing money. Unexpectedly, they found no difference in dopaminergic responses between PG and HC who won money. Dopamine release in the ventral striatum, however, was more pronounced for the losses in PG relative to HC. Given the motivational impact of mesolimbic DA, Linnet and colleagues argue that this effect could explain loss-chasing in PG. In addition, they point out that "PG are not hyperdopaminergic *per se*, but have increased DA susceptibility toward certain types of decisions and behavior" (p. 331). This finding that DA release is higher in PG losing money than in PG winning money is consistent with the evidence that "near misses" enhance the motivation to gamble and recruit the brain reward circuit more than "big wins" (Kassinove and Schare, 2001; Clark et al., 2009; Chase and Clark, 2010). Possibly related to this phenomenon is the evidence that, compared with gains, the amount of monetary losses has limited effect on the extent to which probabilistic (and delayed) losses are discounted in humans (Estle et al., 2006). This suggests that a lower probability (and a longer delay) reduces a gambler's motivation less when losses rather than gains are involved. In contrast, the big win hypothesis suggests that pathological gambling develops in individuals who initially experienced large monetary gains, but attempts to demonstrate this effect on the persistence of gambling have failed (Kassinove and Schare, 2001; Weatherly et al., 2004). Current evidence therefore suggests that losses contribute more than gains to motivating gambling.

#### **THE ATTRACTIVENESS OF REWARD UNCERTAINTY**

One of the main factors underlying loss-chasing may be the importance of reward uncertainty. Studies have shown that reward uncertainty, rather than reward *per se*, magnifies mesolimbic DA, both in monkeys (Fiorillo et al., 2003; de Lafuente and Romo, 2011) and in healthy human participants (Preuschoff et al., 2006). In PG, accumbens DA is maximal during a gambling task when the probabilities of winning and losing money are identical, a 50% chance for a two-outcome event representing maximal uncertainty (Linnet et al., 2012). Although non-dopaminergic neurons might also be involved in the coding of reward uncertainty (Monosov and Hikosaka, 2013), these results based on electrophysiological and neuroimaging techniques indicate that DA is crucial for the coding of reward uncertainty. This suggestion is corroborated by a large number of behavioral studies, showing that mammals and birds respond more vigorously to conditioned cues predicting uncertain rewards (Collins et al., 1983; Anselme et al., 2013; Robinson et al., under review) and tend to prefer an uncertain food option over a certain food option in dual-choice tasks (Kacelnik and Bateson, 1996; Adriani and Laviola, 2006), sometimes despite a lower reward rate (Forkman, 1991; Gipson et al., 2009). According to Greg Costikyan, an award-winning game designer, games cannot hold our interest in the absence of uncertainty, which can take many forms, occurring in the outcome, the game's path, analytical complexity, perception, and so on (Costikyan, 2013). Discussing the game of *Tic-Tac-Toe*, Costikyan (p. 10) notes that this game is dull for anyone beyond a certain age because its solution is trivial. The reason why children play this game with enjoyment is that they do not understand that the game has an optimal strategy; for children, the game of *Tic-Tac-Toe* produces an uncertain outcome. A predictable game is dull, just like a detective novel in which the identity of the murderer is known in advance. Based on this assumption, Zack and Poulos (2009) note that several payoff schedules (slot machines, roulette, and the dice game of craps) have a probability of winning close to 50%, so that they are expected to elicit maximal DA release and, therefore, reinforce the act of gambling.

The evidence that uncertainty itself appears to be a source of motivation is visible in the growing trend of pathological gambling that involves extended play at video poker or slot machines (Dow Schüll, 2012). Individuals are playing to play rather than to win, and monetary wins are conceived as the opportunity to extend the duration of play, rather than the game's main objective. In addition, game programmers have uncovered a profitable trend toward an ever larger number of bets per round of a given game (in Australia, >100 bets on a given roll), with ever smaller amounts (going as low as one cent), giving rise to a "losses disguised as wins" effect, where players win less than they wagered (Dixon et al., 2010). It is almost as if players were drawn to placing bets or trying to uncover the algorithm that determines the wins and losses (this is often reported by players, see Dow Schüll, 2012). Recently, we have shown in adult rats that an initial exposure (8 days) to conditioned cues predicting highly uncertain rewards sensitizes responding to those cues in the long term (for at least 20 days) despite a gradual reduction in the level of uncertainty (Robinson et al., under review). No behavioral sensitization was apparent following a later exposure to high uncertainty (rewards were provided with certainty during the first 8 days). This result is compatible with other findings showing that persistent gambling behavior is more likely to occur in individuals who experience unpredictable environments and gambling situations early in life (Scherrer et al., 2007; Braverman and Shaffer, 2012).
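The "losses disguised as wins" effect described above can be made concrete with a small sketch. It assumes only the definition given in the text (a payout that is smaller than the total amount wagered on that round); the function name, the outcome labels and the example amounts are hypothetical and chosen for illustration.

```python
def classify_spin(bets, payout):
    """Label one round of play from the player's net result.

    bets:   list of individual wagers placed on the round (in cents)
    payout: total amount returned to the player on that round (in cents)
    The labels are illustrative, not taken from the cited studies.
    """
    wagered = sum(bets)
    if payout == 0:
        return "loss"
    if payout < wagered:
        # Money comes back, but less than was staked: a net loss
        # that can still be presented to the player as a win.
        return "loss disguised as win"
    if payout == wagered:
        return "break-even"
    return "win"


# Hypothetical round: 100 one-cent bets, 30 cents returned.
print(classify_spin([1] * 100, 30))
```

With many small bets per round, some payout occurs on a large share of rounds, so a net loss is frequently accompanied by feedback that resembles winning.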

#### **A POSSIBLE EVOLUTIONARY ORIGIN OF GAMBLING BEHAVIOR**

Since wins are rare and often small during gambling episodes, it is unlikely that they are sufficient to motivate people to persevere in the task. The fact that losses can motivate gambling more than gains is also difficult to understand. So, why do people gamble? Pathological gambling is certainly maladaptive behavior, but the attractiveness of uncertain rewards is so widespread in the animal kingdom that this tendency should have an adaptive origin. Here we suggest a hypothesis—referred to as the compensatory hypothesis—developed by one of the authors, that describes gambling-like behavior in an evolutionary framework (Anselme, 2013).

In nature, animals are subject to a lack of cognitive control in many circumstances; they are often unable to predict what is going to happen. This essentially occurs for two reasons. First, the distribution of natural resources is random, so that a large number of responses must be produced before finding vital resources. Second, the reliability of conditioned cues is often imperfect—e.g., for some species, fruit-trees may act as conditioned cues because of their association with reward (the presence of fruits), but this association is unreliable since fruit-trees have no fruits for most of the year. Given this lack of cognitive control about objects and events, it can be argued that if reward uncertainty were not a source of motivation, most behaviors would extinguish because of the high failure rate (and energy loss) experienced by animals. The compensatory hypothesis suggests that, when a significant object or event's predictability is low, motivational processes are recruited to compensate for the inability to make correct predictions; motivation would act as a mechanism to delay extinction (Anselme, 2013). In other words, allowing an animal to persevere in a task is only possible if its behavior is motivated by the lack of predictability (i.e., uncertainty) rather than by reward itself. The compensatory hypothesis could explain why losses are so important in motivating human gamblers: without the opportunity of receiving no reward, gains become predictable and hence most games become dull (Costikyan, 2013). In addition, this hypothesis provides an interpretation to the evidence that, like physiological deprivations (Nader et al., 1997), psychosocial deprivations such as a lack of maternal care enhance mesolimbic DA release and, correlatively, incentive motivation to seek food (Lomanowska et al., 2011). Psychosocial deprivations also seem to be a cause of gambling-like behavior in both pigeons and humans (van Holst et al., 2010; Pattison et al., 2013). 
In fact, all forms of deprivation result from the inability to predict how to find or obtain appropriate stimuli, whether food, social relationships, opportunities to work and play, etc. In most cases, this inability is a consequence of environmental poverty. On account of this, poor environments resemble unpredictable environments and the compensatory hypothesis suggests that, in both cases, a higher motivation is recruited to persevere in the laborious task of finding resources.

Assuming that this interpretation is correct, gambling behavior in humans could be phylogenetically inherited from older mammalian species whose members, motivated by reward uncertainty, had a better chance of survival in complex, dynamic environments. Pathological gambling might be the exaggeration of a natural tendency exploited by casinos and games of chance. Of course, uncertainty-driven motivation is no longer required to survive within most western cultures. However, gambling might be hijacking an evolutionary system designed to resolve uncertainty by spurring pulses of motivation, despite or because of repeated losses. How could pathological gambling be addressed? We think that this psychopathology should certainly be treated on a case-by-case basis, depending on the vulnerability of each PG. For example, enriching a PG's daily environment by varying leisure activities and social relations may reduce the desire to seek a surplus of stimulation. At a societal level, one way to address pathological gambling might be to let gamblers at casinos win more often than they lose, but only very small gains (similar to the wagered amounts), in order to render gambling persistence less attractive. More thorough investigations are needed to identify the parameters underpinning the addictive power of games and to promote the development of games which do not exploit our phylogenetic vulnerability.

#### **REFERENCES**

Adriani, W., and Laviola, G. (2006). Delay aversion but preference for large and rare rewards in two choice tasks: implications for the measurement of self-control parameters. *BMC Neurosci.* 7:52. doi: 10.1186/1471-2202-7-52


a twin cohort. *J. Nerv. Ment. Dis.* 195, 72–78. doi: 10.1097/01.nmd.0000252384.20382.e9


striatal responses to monetary reward depend on saliency. *Neuron* 42, 509–517. doi: 10.1016/S0896-6273(04)00183-7

*Received: 20 October 2013; accepted: 12 November 2013; published online: 02 December 2013.*

*Citation: Anselme P and Robinson MJF (2013) What motivates gambling behavior? Insight into dopamine's role. Front. Behav. Neurosci. 7:182. doi: 10.3389/fnbeh.2013.00182*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2013 Anselme and Robinson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Neurobiological underpinnings of reward anticipation and outcome evaluation in gambling disorder

#### **Jakob Linnet 1,2,3,4\***

<sup>1</sup> Research Clinic on Gambling Disorders, Aarhus University Hospital, Aarhus, Denmark

<sup>2</sup> Center of Functionally Integrative Neuroscience, Aarhus University, Aarhus, Denmark

<sup>3</sup> Division on Addiction, Cambridge Health Alliance, Cambridge, MA, USA

<sup>4</sup> Department of Psychiatry, Harvard Medical School, Harvard University, Cambridge, MA, USA

#### **Edited by:**

Bryan F. Singer, University of Michigan, USA

#### **Reviewed by:**

Alexis Faure, Centre Neurosciences Paris Sud (CNPS), CNRS, France Jonathan David Morrow, University of Michigan, USA

#### **\*Correspondence:**

Jakob Linnet, Research Clinic on Gambling Disorders, Aarhus University Hospital, Nørrebrogade 44, Building 30, DK-8000 Aarhus C, Denmark e-mail: linnet@cfin.au.dk; jakolinn@rm.dk; jlinnet@mac.com

Gambling disorder is characterized by persistent and recurrent maladaptive gambling behavior, which leads to clinically significant impairment or distress. The disorder is associated with dysfunctions in the dopamine system. The dopamine system codes reward anticipation and outcome evaluation. Reward anticipation refers to dopaminergic activation prior to reward, while outcome evaluation refers to dopaminergic activation after reward. This article reviews evidence of dopaminergic dysfunctions in reward anticipation and outcome evaluation in gambling disorder from two vantage points: a model of reward prediction and reward prediction error by Wolfram Schultz et al. and a model of "wanting" and "liking" by Terry E. Robinson and Kent C. Berridge. Both models offer important insights on the study of dopaminergic dysfunctions in addiction, and implications for the study of dopaminergic dysfunctions in gambling disorder are suggested.

**Keywords: anticipation, reward prediction error, reward prediction, incentive salience, dopamine, gambling disorder, pathological gambling**

#### **NEUROBIOLOGICAL UNDERPINNINGS OF REWARD ANTICIPATION AND OUTCOME EVALUATION IN GAMBLING DISORDER**

Gambling disorder is characterized by persistent and recurrent maladaptive gambling behavior, which leads to clinically significant impairment or distress (American Psychiatric Association [DSM 5], 2013). Gambling disorder was recently reclassified from "pathological gambling" (an impulse control disorder) to a "behavioral addiction" under the substance use classification, which emphasizes the association between gambling disorder and other types of addiction.

Gambling disorder is associated with dysfunctions in the dopamine system. The dopamine system is sensitive to behavioral stimulation related to monetary reward, particularly in the ventral striatum (Koepp et al., 1998; Delgado et al., 2000; Breiter et al., 2001; de la Fuente-Fernández et al., 2002; Zald et al., 2004). Dopaminergic dysfunctions in the ventral striatum are linked to gambling disorder (Reuter et al., 2005; Abler et al., 2006; Linnet et al., 2010, 2011a,b, 2012; van Holst et al., 2012; Linnet, 2013).

The dopamine system codes *reward anticipation* and *outcome evaluation*. Reward anticipation refers to dopaminergic activation prior to reward, while outcome evaluation refers to dopaminergic activation after the reward. This article reviews evidence on dopaminergic dysfunctions in reward anticipation and outcome evaluation in gambling disorder from two vantage points: a model of reward prediction and reward prediction error by Schultz et al. (Fiorillo et al., 2003; Schultz, 2006; Tobler et al., 2007; Schultz et al., 2008), and a model of "wanting" and "liking" by Robinson and Berridge (Robinson and Berridge, 1993, 2000, 2003, 2008; Berridge and Aldridge, 2008; Berridge et al., 2009). It is suggested that gambling disorder may provide a "model disorder" of addiction for the two approaches, which is not confounded by ingestion of exogenous substances.

The ventral striatum and the nucleus accumbens (NAcc) play a central role in both models, which is consistent with findings of dopamine dysfunctions in the ventral striatum in gambling disorder. Therefore, this review focuses on the ventral striatum in relation to gambling disorder. Other relevant areas include the prefrontal cortex (e.g., orbitofrontal cortex) and other areas of the basal ganglia (e.g., the putamen or caudate nucleus).

#### **REWARD PREDICTION AND REWARD PREDICTION ERROR**

Reward prediction refers to the anticipation of reward, while reward prediction error refers to the outcome evaluation. Reward prediction and reward prediction error are associated with the learning of reward properties of stimuli. According to Wolfram Schultz (2006), reward prediction and reward prediction error derive from Kamin's *blocking rule* (Kamin, 1969), which suggests that a reward that is fully predicted does not contribute to learning. A stimulus that can be entirely predicted contains no new information, and the reward prediction error is therefore zero. Rescorla and Wagner described the so-called *Rescorla-Wagner learning rule* (Rescorla and Wagner, 1972), which states that learning slows progressively as the reinforcer becomes more predicted.
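The blocking effect and the progressive slowing of learning can be illustrated with a minimal simulation. The sketch below assumes the common textbook form of the rule, in which each cue present on a trial is updated by a learning rate times the difference between the outcome and the summed prediction of all cues present. The function name, the parameter values (`alpha`, `lam`) and the trial counts are illustrative assumptions and are not taken from the cited papers.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Simulate associative strengths under a textbook Rescorla-Wagner update.

    trials: list of (cues, rewarded) pairs, where cues is a tuple of cue names.
    Returns the final associative strength of every cue and the list of
    prediction errors, one per trial.
    """
    strengths = {}
    errors = []
    for cues, rewarded in trials:
        # The prediction is the summed strength of all cues present.
        prediction = sum(strengths.get(c, 0.0) for c in cues)
        outcome = lam if rewarded else 0.0
        error = outcome - prediction  # reward occurred minus reward predicted
        errors.append(error)
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + alpha * error
    return strengths, errors


# Blocking design: cue A alone is rewarded, then the compound A+B is rewarded.
blocking = [(("A",), True)] * 30 + [(("A", "B"), True)] * 30
strengths, errors = rescorla_wagner(blocking)

# Control design: the compound A+B is rewarded without pre-training on A.
control, _ = rescorla_wagner([(("A", "B"), True)] * 30)

print(f"B after pre-training on A: {strengths['B']:.4f}")
print(f"B without pre-training:    {control['B']:.4f}")
```

In this sketch, cue B acquires almost no associative strength when A already predicts the reward, because the prediction error on compound trials is close to zero, whereas B reaches about half of the asymptote in the control design.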

In random binary outcome conditions, e.g., reward vs. no reward, the *expected value* (EV) is the average value that can be expected from a given stimulus, which is a linear function of reward probability. In contrast, *uncertainty*, which can be defined as the variance (σ<sup>2</sup>) of a probability distribution (Schultz et al., 2008), is the mean squared deviation from the EV, which is an inverse U-shaped function of reward probability. Midbrain and striatal dopamine coding of EV and uncertainty follow linear and quadratic functions of reward prediction similar to their mathematical expressions (Fiorillo et al., 2003; Preuschoff et al., 2006; Schultz, 2006). The dopamine system also codes deviations in outcome from the reward prediction, i.e., reward prediction error: ". . .dopamine neurons emit a positive signal (activation) when an appetitive event is better than predicted, no signal (no change in activity) when an appetitive event occurs as predicted, and a negative signal (decreased activity) when an appetitive event is worse than predicted. . .[and] dopamine neurons show bidirectional coding of reward prediction errors, following the equation Dopamine response = Reward occurred − Reward predicted" (Schultz, 2006, pp. 99–100).
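As a worked illustration of these functions, consider a binary outcome in which a reward of magnitude $m$ is delivered with probability $p$ and nothing is delivered otherwise. The symbols $m$, $p$, $r$ and $\delta$ are introduced here for illustration and are not taken from the cited papers. Under that assumption:

$$
\begin{aligned}
\mathrm{EV} &= p\,m + (1-p)\cdot 0 = p\,m,\\
\sigma^2 &= p\,(m - p\,m)^2 + (1-p)\,(0 - p\,m)^2 = m^2\,p\,(1-p),\\
\delta &= r - \mathrm{EV}, \qquad r \in \{0, m\}.
\end{aligned}
$$

EV grows linearly with $p$, whereas $\sigma^2$ is zero at $p = 0$ and $p = 1$ and peaks at $p = 0.5$, where it equals $m^2/4$. The last line restates the quoted relation between dopamine response, reward occurred and reward predicted: $\delta = m(1-p) > 0$ on a rewarded trial and $\delta = -p\,m < 0$ on an unrewarded one.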

Fiorillo et al. (2003) investigated dopamine activation in reward prediction and reward prediction error in relation to EV and uncertainty (i.e., variance in outcome). In the study, two monkeys were exposed to stimuli with varying reward probabilities (*P* = 0, *P* = 0.25, *P* = 0.5, *P* = 0.75 and *P* = 1.0). The rate of anticipatory licking and the activation of dopamine neurons in the ventral midbrain (areas A8, A9 and A10) were recorded. Dopaminergic coding of reward prediction was measured as a *phasic* signal immediately after stimulus presentation, while coding of reward prediction error was measured as a phasic signal immediately after the outcome of the stimulus (reward or no reward). Dopaminergic coding of uncertainty was measured as a *sustained* signal from stimulus presentation to outcome.

The authors reported three main results. First, the reward probabilities of stimuli were correlated with the anticipatory licking rate and the anticipatory phasic dopamine response. This suggests that the reward probability reinforced the dopaminergic activation and the behavioral response. Second, the sustained dopamine response toward uncertainty followed the properties of variance, i.e., it was largest toward stimuli with 50% reward probability (*P* = 0.5), smaller toward stimuli with *P* = 0.75 and *P* = 0.25, and smallest toward stimuli with *P* = 1.0 and *P* = 0.0. Third, rewarded stimuli with lower reward probability had a larger phasic dopamine response following the reward, which suggests a larger positive reward prediction error signal; rewarded stimuli with higher reward probability had a smaller phasic dopamine response following the reward, which suggests a smaller reward prediction error signal.
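The qualitative pattern of these three results can be reproduced with the idealized quantities themselves. The sketch below is not a model of the recorded neuronal data; it tabulates EV, variance and prediction error for the five reward probabilities, assuming a unit reward magnitude. The function names and the `magnitude` parameter are hypothetical.

```python
# Reward probabilities used by Fiorillo et al. (2003).
PROBABILITIES = [0.0, 0.25, 0.5, 0.75, 1.0]


def expected_value(p, magnitude=1.0):
    """Expected value of a reward delivered with probability p."""
    return p * magnitude


def uncertainty(p, magnitude=1.0):
    """Variance of the binary outcome (reward of given magnitude, or nothing)."""
    return magnitude ** 2 * p * (1.0 - p)


def prediction_error(p, rewarded, magnitude=1.0):
    """Reward occurred minus reward predicted."""
    outcome = magnitude if rewarded else 0.0
    return outcome - expected_value(p, magnitude)


for p in PROBABILITIES:
    # Note: a reward at P = 0 or an omission at P = 1 cannot occur in the
    # task; those columns are printed only to show the formula's range.
    print(f"P={p:.2f}  EV={expected_value(p):.2f}  "
          f"variance={uncertainty(p):.4f}  "
          f"RPE if rewarded={prediction_error(p, True):+.2f}  "
          f"RPE if unrewarded={prediction_error(p, False):+.2f}")
```

EV rises with probability (first result), variance is largest at *P* = 0.5 and equal at *P* = 0.25 and *P* = 0.75 (second result), and the prediction error on rewarded trials shrinks as reward probability increases (third result).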

Neurobiological studies of gambling in humans support the evidence of reward prediction and reward prediction error. Abler et al. (2006) used functional magnetic resonance imaging (fMRI) to investigate reward prediction and reward prediction error in an incentive task where participants were shown five figures associated with different reward probabilities (*P* = 0.0, *P* = 0.25, *P* = 0.50, *P* = 0.75, and *P* = 1.0). The results showed a significant anticipatory blood oxygen level dependent (BOLD) activation in the NAcc, which was proportional to the reward probability. Furthermore, there was a significant interaction between outcome and BOLD activation in the NAcc, where the BOLD activation was higher when low probability stimuli were rewarded, and lower when high probability stimuli were rewarded.

Preuschoff et al. (2006) used a card guessing task to investigate the relationship between risk and uncertainty in relation to anticipated reward. The task consisted of 10 cards ranging from 1 to 10, where two cards were drawn in succession. Before the drawing of the second card participants had to guess whether the first card would be higher or lower than the second card. The results showed that reward probability was linearly associated with immediate BOLD activation: higher reward probability was associated with a higher immediate anticipatory BOLD signal, and lower reward probability was associated with a lower immediate anticipatory BOLD signal. In contrast, uncertainty showed an inverse U-shaped relation with late BOLD activation: the highest anticipatory BOLD signals were seen around maximum uncertainty (*P* = 0.5) and the lowest anticipatory BOLD signals were seen around maximum certainty (*P* = 1.0 and *P* = 0.0).
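The way reward probability and uncertainty dissociate in such a task can be sketched numerically. The example below is illustrative only: it assumes the second card is drawn without replacement from the remaining nine cards and that the bet is "second card lower than first"; the function names are hypothetical.

```python
from fractions import Fraction

DECK_SIZE = 10  # cards numbered 1..10, as in the task description


def p_second_lower(first_card):
    """Probability that the second card is lower than the first,
    assuming it is drawn without replacement from the remaining nine cards."""
    return Fraction(first_card - 1, DECK_SIZE - 1)


def risk(p_win):
    """Outcome variance of a win/lose bet with win probability p_win."""
    return p_win * (1 - p_win)


for card in range(1, DECK_SIZE + 1):
    p = p_second_lower(card)
    print(card, p, risk(p))
```

Under these assumptions the win probability rises linearly with the first card, whereas the risk term is zero for cards 1 and 10 and largest for cards 5 and 6. With a ten-card deck, *P* = 0.5 is never reached exactly; the closest values are 4/9 and 5/9.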

Neurobiological studies support the notion of dopaminergic dysfunctions of reward anticipation in gambling disorder. van Holst et al. (2012) compared 15 gambling disorder sufferers with 16 healthy controls in an fMRI study investigating reward anticipation in a card guessing task. Gambling disorder sufferers showed a significant increase in BOLD activation in the bilateral ventral striatum and in the left orbitofrontal cortex toward gain-related EV. This suggests an increased BOLD activation toward reward anticipation. No differences in BOLD activation were found toward outcome evaluation. Linnet et al. (2012) compared 18 gambling disorder sufferers and 16 healthy controls in a positron emission tomography (PET) study using the Iowa Gambling Task (IGT). Dopamine release in the striatum of gambling disorder sufferers showed a significant inverted U-curve with the probability of advantageous IGT performance. Gambling disorder sufferers with maximum uncertainty of outcome (*P* = 0.5) had a larger dopamine release than individuals with IGT performance closer to certain gains (*P* = 1.0) or certain losses (*P* = 0.0). This is consistent with the notion of dopaminergic coding of uncertainty. No interaction was found between dopamine release and uncertainty among healthy control subjects, which could suggest a stronger reinforcement of gambling behavior among gambling disorder sufferers. Therefore, in gambling disorder dopaminergic anticipation of reward and uncertainty might represent a dysfunctional reward anticipation, which reinforces the gambling behavior despite losses.

In outcome evaluation the evidence suggests a blunted dopamine response in gambling disorder sufferers. Reuter et al. (2005) compared 12 gambling disorder sufferers with 12 healthy controls in a card guessing task. Gambling disorder sufferers showed a significantly lower BOLD response in the ventral striatum toward winning compared with healthy controls. Furthermore, gambling disorder sufferers showed a significant negative correlation between the BOLD activation and severity in gambling symptoms, which suggests a blunted outcome evaluation in gambling disorder.

One of the limitations of the reward prediction and reward prediction error model is that it is not a theory of addiction or gambling disorder, *per se.* In other words, while the increased dopaminergic activation toward uncertainty might be a central mechanism in the reinforcement of gambling behavior, it does not explain why some individuals become addicted to gambling, while others do not. In contrast, the incentive-sensitization model suggests that addictive behavior is associated with a combination of dopaminergic reinforcement and changes to the dopamine system (sensitization) following repeated drug exposure.

#### **INCENTIVE-SENSITIZATION MODEL OF "WANTING" AND "LIKING"**

Terry E. Robinson and Kent C. Berridge (Robinson and Berridge, 1993, 2000, 2003, 2008; Berridge and Aldridge, 2008; Berridge et al., 2009) have proposed an *incentive-sensitization* model, which distinguishes pleasure ("liking") from incentive salience ("wanting") in addiction. "Wanting" is associated with anticipation of reward, while "liking" is associated with outcome evaluation.

The incentive-sensitization model focuses on the dopamine system as a core neurobiological basis of addiction. The ventral striatum and its main component the NAcc are associated with addiction. Changes in the dopamine system associated with drug exposure render the brain circuits hypersensitive or "sensitized" to drugs or drug cues. Sensitization from repeated drug exposure may also occur at the level of psychomotor or locomotor activity. Sensitization is linked with increased incentive salience, which is the cognitive process associated with drug seeking and drug taking behavior. Incentive salience ("wanting") refers to a motivational state, which can be conscious or unconscious, goal-oriented or non goal-oriented, and pleasurable or non-pleasurable:

"The quotation marks around the term "wanting" serve as caveat to acknowledge that incentive salience means something different from the ordinary common language sense of the word wanting. For one thing, "wanting" in the incentive salience sense need not have a conscious goal or declarative target. . . . Incentive salience is separable from beliefs and declarative goals that constitute cognitive aspects of "wanting"" (Berridge and Aldridge, 2008, pp. 8–9).

Incentive salience ("wanting") increases after repeated exposure to drugs and drug cues, while pleasure ("liking") remains the same or decreases over time. The incentive-sensitization model of "wanting" and "liking" offers an explanation for the apparent paradox that individuals with substance use disorder have an increased desire for drugs despite getting less pleasure from taking them. Incentive "hotspots" have been identified in the NAcc: activation in the medial NAcc shell is distinctly associated with "liking", while activation throughout the NAcc (particularly around the ventral pallidum) is associated with "wanting" (Berridge et al., 2009).

Incentive sensitization defines the relationship between incentive salience and sensitization. Incentive salience must be coupled with sensitization to account for addictive behavior: an increase in dopamine binding does not define incentive sensitization, but an increase in dopamine binding in relation to particular drug cues does; locomotor activity does not indicate incentive sensitization, but running around to get drugs does; psychomotor preoccupation does not indicate incentive sensitization, but an obsession with taking drugs does. Therefore, simple reinforcement of behavior is insufficient to account for addictive behavior.

"The central idea is that addictive drugs enduringly alter NAcc-related brain systems that mediate a basic incentive-motivational function, the attribution of incentive salience. As a consequence, these neural circuits may become enduringly hypersensitive (or "sensitized") to specific drug effects and to drug-associated stimuli (via activation by S-S associations). The drug-induced brain change is called neural sensitization. We proposed that this leads psychologically to excessive attribution of incentive salience to drug-related representations, causing pathological "wanting" to take drugs" (Robinson and Berridge, 2003, p. 36).

Berridge and Aldridge (2008) provide an example of the incentive-sensitization approach to research in addiction. In this approach, animals are trained under two conditions: first, the animals are conditioned to work (press a lever) for rewards (e.g., food pellets), and must persist working to earn rewards. In a separate training session the animals receive rewards without having to work for them, where each reward is associated with an auditory tone cue for 10–30 s, which is the conditioned stimulus (CS+). After training, the animals are tested in an extinction paradigm where "wanting" is measured as the number of lever presses the animal is willing to perform without receiving a reward. Since the animals receive no rewards, the "wanting" is not confounded by consumption of reward. The key to the paradigm is to test changes in behavior when the conditioned auditory stimulus is introduced during different drug-induced states. In a series of studies, Wyvell and Berridge (2000, 2001) showed that rats given amphetamine microinjections in the NAcc shell made significantly more lever presses when the conditioned auditory stimulus was introduced than rats given saline microinjections. In a related experiment, Wyvell and Berridge (2000, 2001) found that the measures of "liking" (facial reaction to receiving a sugar reward) did not differ whether the animals received saline or amphetamine microinjections. These findings suggest that amphetamine is associated with an increased cue-triggered "wanting", but not with increased pleasure ("liking") from receiving the reward.

The incentive-sensitization model's suggestions of increased "wanting" and decreased "liking" in addiction are consistent with the findings from the gambling disorder literature of increased dopamine activation to anticipated reward (Fiorillo et al., 2003; Abler et al., 2006; Preuschoff et al., 2006; Linnet et al., 2011a, 2012) and blunted dopamine activation to outcome of reward (Reuter et al., 2005). These findings suggest that dopaminergic dysfunctions toward *anticipated* rewards, rather than actual rewards, reinforce gambling behavior among gambling disorder sufferers. The sensitization of the dopamine system toward anticipated rewards rather than incurred rewards can explain why gambling disorder sufferers continue gambling despite losses, and might play a central role in the formation of erroneous perceptions about the likelihood of winning from gambling (Benhsain et al., 2004).

One of the limitations of the incentive-sensitization model is that individuals with substance use disorder have lower dopamine release and lower dopamine receptor availability despite having increased incentive-sensitization:

"However, it must be acknowledged that the current literature contains conflicting results about brain dopamine changes in addicts. For example, it has been reported that detoxified cocaine addicts actually show a decrease in evoked dopamine release rather than the sensitized increase described above. . . . Another finding in humans that seems inconsistent with sensitization is that cocaine addicts are reported to have low levels of striatal dopamine D2 receptors even after long abstinence. . . . This suggests a hypodopaminergic state rather than a sensitized state" (Robinson and Berridge, 2008, p. 3140).

While lower binding potentials are reported in substance use disorders, there is no evidence of decreased binding potentials in the gambling disorder literature (Linnet, 2013). Therefore, gambling disorder might serve as a "model" disorder for the incentive-sensitization model, as gambling is not confounded by the ingestion of exogenous substances.

#### **IMPLICATIONS OF REWARD ANTICIPATION AND OUTCOME EVALUATION IN GAMBLING DISORDER**

The models by Schultz et al. and Robinson and Berridge provide important insights into the study of gambling disorder. The reward prediction and reward prediction error model by Schultz et al. offers an explanation for the behavioral reinforcement of reward anticipation in addiction, while the incentive-sensitization model by Robinson and Berridge explains the mechanisms of "wanting" and "liking" in addiction. At the same time, gambling disorder may serve as a "model" disorder in addressing certain aspects of the two models.

First, the lower levels of binding potentials reported in substance use disorder are not seen in gambling disorder (Linnet et al., 2010, 2011a,b, 2012; Clark et al., 2012; Boileau et al., 2013). This might suggest that incentive sensitization can occur independently of baseline dopamine binding in support of the incentive-sensitization model.

Second, while the studies by Fiorillo et al. (2003) and Preuschoff et al. (2006) support the notion of sustained anticipatory dopamine activation toward uncertainty, more research is needed to determine whether or not this mechanism is associated with dopaminergic dysfunctions in gambling disorder.

Third, the gambling disorder literature suggests increased brain activation toward reward anticipation and blunted activation toward outcome evaluation. This is consistent with the incentive-sensitization model's suggestion of increased "wanting" but decreased "liking" in addiction and the notion of sustained anticipatory dopamine activation in reward prediction. Dopaminergic dysfunction in reward anticipation might constitute a common mechanism of addiction, because it occurs in the absence of reward. Therefore, reward anticipation may have a similar (dys)function, whether the reward is food, drugs or gambling. Further studies should address reward anticipation and outcome evaluation in gambling disorder.

#### **ACKNOWLEDGMENTS**

This study was supported by funding from the Danish Agency for Science, Technology and Innovation grant numbers 2049-03-0002, 2102-05-0009, 2102-07-0004, 10-088273 and 12-130953; and from the Ministry of Health grant numbers 1001326 and 121023.

#### **REFERENCES**


"wanting" without enhanced "liking" or response reinforcement. *J. Neurosci.* 20, 8122–8130.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 January 2014; accepted: 10 March 2014; published online: 25 March 2014. Citation: Linnet J (2014) Neurobiological underpinnings of reward anticipation and outcome evaluation in gambling disorder. Front. Behav. Neurosci. 8:100. doi: 10.3389/fnbeh.2014.00100*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Linnet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Ventral striatal dopamine synthesis capacity is associated with individual differences in behavioral disinhibition

#### **Andrew D. Lawrence<sup>1</sup>\* and David J. Brooks<sup>2,3</sup>**

<sup>1</sup> School of Psychology, Cardiff University, Cardiff, UK

<sup>2</sup> Division of Brain Sciences, Department of Medicine, Imperial College, London, UK

<sup>3</sup> Department of Nuclear Medicine, PET Centre, Aarhus University, Aarhus, Denmark

#### **Edited by:**

Mike James Ferrar Robinson, Wesleyan University, USA

#### **Reviewed by:**

Frederic Boy, Swansea University, UK

Marco Leyton, McGill University, Canada

#### **\*Correspondence:**

Andrew D. Lawrence, School of Psychology, Cardiff University, Tower Building, 70 Park Place, CF10 3AT Cardiff, UK

e-mail: lawrencead@cardiff.ac.uk

Pathological gambling, alongside addictive and antisocial disorders, forms part of a broad psychopathological spectrum of externalizing disorders, which share an underlying genetic vulnerability. The shared externalizing propensity is a highly heritable, continuously varying trait. Disinhibitory personality traits such as impulsivity and novelty seeking (NS) function as indicators of this broad shared externalizing tendency, which may reflect, at the neurobiological level, variation in the reactivity of dopaminergic (DAergic) brain reward systems centered on the ventral striatum (VS). Here, we examined whether individual differences in ventral striatal dopamine (DA) synthesis capacity were associated with individual variation in disinhibitory personality traits. Twelve healthy male volunteers underwent 6-[18F]Fluoro-L-DOPA (FDOPA) positron emission tomography (PET) scanning to measure striatal DA synthesis capacity, and completed a measure of disinhibited personality (NS). We found that levels of ventral, but not dorsal, striatal DA synthesis capacity were significantly correlated with inter-individual variation in disinhibitory personality traits, particularly a propensity for financial extravagance and irresponsibility. Our results are consistent with preclinical models of behavioral disinhibition and addiction proneness, and provide novel insights into the neurobiology of personality-based vulnerability to pathological gambling and other externalizing disorders.

**Keywords: addiction, dopamine, externalizing, impulsivity, positron emission tomography, pathological gambling, reward, ventral striatum**

#### **INTRODUCTION**

Patterns of systematic co-occurrence ("comorbidity") between substance misuse and antisocial disorders are best accounted for by a model positing a shared underlying genetic vulnerability, known as externalizing (Krueger et al., 2002, 2007). This broad externalizing vulnerability is a highly heritable, continuously varying dimension of risk (Krueger et al., 2007). Pathological gambling [now called gambling disorder] systematically co-occurs with both substance misuse and antisocial disorders (Kessler et al., 2008; Oleski et al., 2011) and this co-variation likewise reflects a shared genetic vulnerability (Slutske et al., 2001, 2013; Blanco et al., 2012). Thus, pathological gambling can be considered one variant of an externalizing spectrum of disorders.

The broad personality trait of disinhibition reflects individual differences in the tendency to behave in a disinhibited vs. controlled fashion (Dindo et al., 2009). Disinhibitory personality traits are strongly linked with externalizing disorders (Ruiz et al., 2008), including pathological gambling (MacLaren et al., 2011). Importantly, a shared genetic diathesis underlies the associations between trait disinhibition and externalizing disorders (Krueger et al., 2002; Hicks et al., 2011).

Furthermore, prospective studies suggest that trait disinhibition, measured early in life, predates and predicts the emergence of externalizing pathology, including pathological gambling (Elkins et al., 2006; Slutske et al., 2012) and mediates the co-variation between externalizing disorders (Ruiz et al., 2008). Thus the antecedent trait of disinhibition provides the temperamental core of the externalizing disorders, and disinhibitory personality traits such as impulsivity and sensation seeking function as indicators of the general externalizing propensity (Krueger et al., 2002, 2007).

The genetic liability to externalizing may be related, at the neurobiological level, to brain mechanisms underpinning sensitivity to reward (Iacono et al., 2008). Brain dopamine (DA) systems have long been hypothesized to underlie individual variation in reward sensitivity. According to Gray (1987), individual differences in trait impulsivity reflect individual variation in the reactivity of a neural "behavioral activation system" (BAS), centered on the ventral striatum (VS) and its dopaminergic (DAergic) irrigation, which is triggered by cues for reward. Likewise, in Cloninger's (1986) model of temperament, novelty-seeking (NS) tendencies reflect genetically determined variation in reward-seeking behaviors, mediated by DAergic modulation of the BAS. When activated, the BAS functions as an impulsive "go" motivational system, and variation in BAS reactivity is potentially a potent source of inter-individual variation in behavioral disinhibition (Newman and Wallace, 1993).

Recent research highlights that genetic variation in DA synthesis pathways may play a key role in the etiology of externalizing liability. DA synthesis occurs within DA neurons. Tyrosine is transported into the cell via amino acid carriers in the blood-brain barrier and cell membranes. Once in the intracellular space it is hydroxylated to L-3,4-dihydroxyphenylalanine (L-DOPA) by tyrosine hydroxylase (TH). L-DOPA is then decarboxylated by aromatic L-amino acid decarboxylase (AADC) (also called dopa decarboxylase, DDC) to DA (Elsworth and Roth, 2009). In an important study, Derringer et al. (2010) found that a combination of multiple common variants (single nucleotide polymorphisms, SNPs) in the DDC gene predicted individual variation in sensation seeking traits, suggesting that genetic variation in DA synthesis contributes to the broad externalizing liability, of which sensation seeking functions as one indicator (Krueger et al., 2007).

Positron Emission Tomography (PET) can be used to study the activity of AADC in pre-synaptic DA terminals in the living brain. The PET tracer 6-[18F]fluoro-L-DOPA (FDOPA), a radioactive analog of L-DOPA, the precursor of DA, is taken up by pre-synaptic DAergic neurons and is metabolized by AADC to <sup>18</sup>F-DA, which is trapped and stored within vesicles in the nerve terminals (Kumakura and Cumming, 2009). FDOPA uptake, quantified as the influx constant *K<sub>i</sub>*, can be used as a measure of AADC activity and vesicular storage capacity (Brown et al., 1999). High values for FDOPA *K<sub>i</sub>* are observed in areas of dense DA nerve terminal innervation, such as the striatum (Kumakura and Cumming, 2009).

Consistent with the notion that externalizing propensity reflects, neurobiologically, inter-individual variation in DAergic modulation of the BAS, we recently found, in a group of Parkinson's disease patients, that individual differences in ventral striatal FDOPA *K<sub>i</sub>* values were related to individual differences in disinhibitory personality traits, particularly a propensity for financial extravagance (Lawrence et al., 2013). The patients in that study were, however, being treated with DA agonist medication, which could potentially have influenced levels of both striatal DA synthesis (Rowlett et al., 1993) and behavioral disinhibition (Lawrence et al., 2003). Thus, it is important to ascertain whether the relationship between ventral striatal DA synthesis capacity and disinhibitory personality traits holds in a sample of healthy, medication-free individuals. Based on our previous findings, we predicted that increased FDOPA uptake in ventral, but not dorsal, striatum would be related to increased levels of trait disinhibition, in particular propensities for financial extravagance and irresponsibility.

#### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twelve right-handed healthy male volunteers (mean age 38 years, SD ± 7 years, range 29–49 years) participated, all with a normal neurological history and examination. A trained psychiatrist assessed participants, and current and past psychiatric morbidity, including alcohol or drug dependency, was excluded by routine psychiatric interview and the General Health Questionnaire (Jackson, 2006) with a cut-off of 5 points or fewer.

The study was limited to men as there are gender differences in the prevalence and clinical presentation of gambling disorder and its relation to the externalizing spectrum (Blanco et al., 2006; Oleski et al., 2011) and in DA synthesis capacity (Laakso et al., 2002). Additionally, fMRI studies suggest a stronger relationship between ventral striatal activity to reward cues and impulsivity in men than women (Lahey et al., 2012).

Permission to undertake the study was granted by the Hammersmith Hospitals Research Ethics Committee and all participants gave written informed consent following a full explanation of the procedure. The Administration of Radioactive Substances Advisory Committee (ARSAC) of the UK approved radioisotope use.

#### **PERSONALITY TRAIT MEASUREMENT**

Our measure of trait behavioral disinhibition was based on NS from Cloninger's Tridimensional Personality Questionnaire (TPQ; Cloninger, 1987). The version of the TPQ used here was a 100-item, self-administered, true-false instrument. The questionnaire is scored so that higher scores reflect greater NS tendencies.

As originally constructed (Cloninger, 1987) TPQ-NS comprised four narrow facet-level scales: Exploratory Excitability vs. Stoic Rigidity (NS1), Impulsiveness vs. Reflection (NS2), Extravagance vs. Reserve (NS3), and Disorderliness vs. Regimentation (NS4). When Ando et al. (2004), however, examined the genetic and environmental factor structure of NS, factor analysis of the genetic inter-correlations yielded factors that did not fully resemble the phenotypic structure of NS as proposed by Cloninger (1987). NS was revised (r-NS) to consist of Impulsiveness vs. Reflection (NS2), Extravagance vs. Reserve (NS3) and Disorderliness vs. Regimentation (NS4), excluding Exploratory Excitability vs. Stoic Rigidity (NS1). Further, Flory and Manuck (2009), using factor analysis in a large normative sample of adults, found Impulsiveness vs. Reflection (NS2) and Extravagance vs. Reserve (NS3) to have high loadings on a "disinhibition" factor, along with the Barratt Impulsiveness Scale (BIS), whereas Exploratory Excitability vs. Stoic Rigidity (NS1) and Disorderliness vs. Regimentation (NS4) loaded on a distinct "Experience seeking" factor.

Hence, in the current study, we focused on those r-NS facets most strongly linked to trait disinhibition: Impulsiveness (vs. Reflection) (NS2) (8 items) and Extravagance (vs. Reserve) (NS3) (7 items). Sample items include "*I often follow my instincts, hunches, or intuition without thinking through all the details"* (Impulsivity, NS2) and "*I often spend money until I run out of cash or get into debt from using too much credit"* (Extravagance, NS3).

In addition to NS, we also measured Harm Avoidance (HA) traits using the TPQ. We calculated a total HA score based on the sum of the four individual HA facet-level scales, as Ando et al. (2004) confirmed Cloninger's (1987) claim that the subscales used to define HA share a common genetic basis. According to Cloninger (1986), although NS and HA are genetically independent traits, at the phenotypic level high levels of HA should inhibit the expression of NS tendencies, since activation of the HA system results in a "reflexive" or "reactive" form of behavioral inhibition (Carver, 2008), dampening the expression of appetitive approach behavior and NS, given cues of potential punishment (Newman and Wallace, 1993; Nikolova and Hariri, 2012). Indeed, meta-analysis reveals a consistent strong negative correlation between NS and HA (Miettunen et al., 2008; here the relation between HA and NS3 for example was *r* = −0.44), a relationship that is environmentally (i.e., through experience) and not genetically mediated (Ando et al., 2002). Hence, we controlled for the influence of HA when examining the relation between striatal FDOPA *K<sub>i</sub>* values and NS traits.

#### **POSITRON EMISSION TOMOGRAPHY (PET) SCANNING PROTOCOL**

Participants were pre-treated with 150 mg carbidopa and 400 mg entacapone 1 h prior to radioisotope administration (to block peripheral metabolism of FDOPA and so enhance specific signal detection) and underwent three-dimensional FDOPA PET using an ECAT EXACT HR++ (CTI/Siemens 966) camera, which covers an axial field of view of 23.4 cm and provides 95 transaxial planes. The tomograph has a spatial resolution of 4.8 ± 0.2 mm FWHM (transaxial, 1 cm off axis) and 5.6 ± 0.5 mm (axial, on axis) after image reconstruction (Spinks et al., 2000). A transmission scan, which corrects for attenuation of emitted radiation by skull and tissues, was acquired using a single rotating photon point source of 150 MBq of <sup>137</sup>Cs. Thirty seconds after the start of the emission scan, 110 (range 102–135) MBq of FDOPA in 10 ml normal saline was infused intravenously over 30 s. Three-dimensional sinograms of emission data were then acquired over 90 min as 26 time frames. Participants were placed in the scanner with the orbitomeatal line parallel to the transaxial plane of the tomograph. Head position was monitored via laser crosshairs and video camera.

#### **IMAGE QUANTIFICATION**

Parametric images of specific FDOPA influx constants (*K<sub>i</sub>* maps) were created at a voxel level for the whole brain using linear graphical analysis (Patlak and Blasberg, 1985) of time activity curves with an occipital cortex (Brown et al., 1979) non-specific reference input function. Qualitative summated ADD images created from the dynamic FDOPA time series by integrating all 26 frames of the dynamic image were also produced and then transformed into standard stereotaxic (Montreal Neurological Institute, MNI) space using an FDOPA template created in-house from a healthy volunteer database. These ADD images contain both tracer delivery and specific uptake information and provide adequate anatomical detail to allow them to be stereotaxically normalized into standard MNI space. Subsequently, the *K<sub>i</sub>* maps were individually normalized to MNI stereotaxic space by applying the transformation parameters defined during the normalization of their respective ADD images. This spatial transformation of parametric images made it possible to perform a region of interest (ROI) analysis as described below.
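The graphical analysis mentioned above can be illustrated with a minimal sketch, which is not the authors' pipeline. With a reference-tissue input, the Patlak plot graphs the ratio of target to reference activity against the running integral of the reference curve divided by its current value; the slope of the late, linear portion estimates the influx constant. The curves below are synthetic, and the 30 min linearity cut-off `t_star` and all function names are assumptions made for the example.

```python
def cumulative_trapezoid(times, values):
    """Running integral of values over times (trapezoidal rule), starting at 0."""
    integral = [0.0]
    for i in range(1, len(times)):
        step = 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
        integral.append(integral[-1] + step)
    return integral


def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def patlak_ki(times, tissue, reference, t_star):
    """Slope of the reference-tissue Patlak plot for frames at or after t_star (min)."""
    integral = cumulative_trapezoid(times, reference)
    x, y = [], []
    for t, c_t, c_ref, area in zip(times, tissue, reference, integral):
        if t >= t_star and c_ref > 0:
            x.append(area / c_ref)  # "stretched time" axis
            y.append(c_t / c_ref)   # target-to-reference ratio
    return least_squares_slope(x, y)


# Synthetic, made-up curves purely for illustration (not study data).
times = [float(t) for t in range(0, 95, 5)]  # frame times in minutes, 0..90
reference = [0.0] + [10.0 / (1.0 + 0.05 * t) for t in times[1:]]
true_ki, intercept = 0.013, 0.8  # hypothetical values used to build the tissue curve
area = cumulative_trapezoid(times, reference)
tissue = [true_ki * a + intercept * c for a, c in zip(area, reference)]

estimate = patlak_ki(times, tissue, reference, t_star=30.0)
print(estimate)
```

Because the synthetic tissue curve is constructed to follow the Patlak relationship exactly, the fitted slope recovers the value used to generate it; with real data, noise and the choice of linear window both affect the estimate.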

#### **REGION OF INTEREST (ROI) ANALYSIS**

Standard ROI object maps sampling the ventral and dorsal striatum were defined on the MNI single-subject ROI in stereotaxic space. For our striatal ROIs, the volume was subdivided as follows: all planes containing striatal structures below the anterior commissure-posterior commissure plane were operationally defined as the ventral striatum (VS) ROI, and all planes above the anterior commissure-posterior commissure plane containing striatal structures formed the dorsal striatum (DS) ROI. The standard object map was applied to the transformed *K<sub>i</sub>* maps and values of FDOPA *K<sub>i</sub>* (units: ml · g<sup>−1</sup> · min<sup>−1</sup>) were obtained for the two striatal ROIs for each individual (McGowan et al., 2004). When performing our ROI analysis a manual correction for head movement was applied as previously described (Whone et al., 2003).

#### **STATISTICAL ANALYSIS**

We used Pearson partial correlations to examine the relationships between striatal FDOPA *K<sub>i</sub>* values and disinhibitory NS traits, controlling for relevant nuisance variables (Spector and Brannick, 2011). Statistical significance was set at a Bonferroni-corrected *P* < 0.0125 (i.e., 0.05/4).
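As a minimal sketch of this kind of analysis (not the authors' code), a partial correlation controlling for two nuisance variables can be built recursively from ordinary Pearson correlations, and the Bonferroni threshold follows from dividing the nominal alpha by the number of tests (here four: two striatal ROIs by two NS facets). All data values and function names below are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_one(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_two(x, y, z, w):
    """Second-order partial correlation of x and y, controlling for z and w."""
    r_xy_z = partial_one(pearson(x, y), pearson(x, z), pearson(y, z))
    r_xw_z = partial_one(pearson(x, w), pearson(x, z), pearson(w, z))
    r_yw_z = partial_one(pearson(y, w), pearson(y, z), pearson(w, z))
    return partial_one(r_xy_z, r_xw_z, r_yw_z)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Made-up values for eight hypothetical participants (not study data).
ki = [0.0120, 0.0125, 0.0131, 0.0128, 0.0140, 0.0135, 0.0122, 0.0138]
ns3 = [3, 4, 4, 3, 6, 5, 3, 6]
age = [45, 40, 38, 42, 30, 33, 48, 29]
ha = [10, 8, 9, 12, 5, 6, 11, 4]

r_partial = partial_two(ki, ns3, age, ha)
threshold = bonferroni_alpha(0.05, 4)
print(r_partial, threshold)
```

The recursive formula is one of several equivalent routes; regressing both variables on the covariates and correlating the residuals gives the same coefficient.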

#### **RESULTS**

Mean ± SD scores in our sample for NS2 (Impulsivity) and NS3 (Extravagance) were 3.5 ± 2.5 and 4.3 ± 1.1, respectively. These results are comparable to those obtained in a normative sample of 106 UK men (mean age 31, SD ± 11.5) by Otter et al. (1995) (NS2 mean 3.1, SD ± 2.2; NS3 mean 3.8, SD ± 2.0). HA scores (HA mean 8.1, SD ± 4.7) were somewhat lower than those reported by Otter et al. (HA mean 10.7 ± 6.2), perhaps reflecting self-selection bias in individuals who volunteer for PET scanning (Oswald et al., 2013).

Mean ± SD FDOPA *K<sub>i</sub>* values for the VS and DS ROIs were 0.0131 ± 0.001 and 0.0125 ± 0.002 ml · g<sup>−1</sup> · min<sup>−1</sup>, respectively.

Since, in adults, NS shows a significant decrease with increasing age (Otter et al., 1995), we controlled for the effects of age when examining the relationship between striatal FDOPA *K<sub>i</sub>* and disinhibitory NS traits (Impulsivity and Extravagance). Furthermore, for the reasons outlined above, we additionally controlled for HA scores.

When controlling for the influence of age and HA there was a significant relationship between VS FDOPA *K<sub>i</sub>* and NS3 (Extravagance) (*r* = 0.78, bootstrap 95% CI 0.52–0.98, *P* = 0.008), but not between VS FDOPA *K<sub>i</sub>* and NS2 (Impulsivity) (*r* = 0.44, *P* = 0.2). There were no significant relations between DS FDOPA *K<sub>i</sub>* and either NS3 (*r* = 0.28, *P* = 0.40) or NS2 (*r* = 0.003, *P* = 0.99) when controlling for age and HA (see **Figure 1**). Examination of **Figure 1** suggests that one individual data point may be an outlier. When this data point was removed, however, the relationship between VS FDOPA *K<sub>i</sub>* and NS3, controlling for age and HA, remained significant (*r* = 0.71, bootstrap 95% CI 0.51–0.92, *P* = 0.014). We found identical results when using a Spearman partial correlation (Schemper, 1991).
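The bootstrap procedure behind the reported confidence intervals is not described in the text. As an illustration only, the sketch below shows one common variant, a percentile bootstrap, applied for brevity to a simple (not partial) Pearson correlation. The data, the number of resamples, the seed and the function names are all assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns None if either variable has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return None if denom == 0 else sxy / denom


def bootstrap_ci(x, y, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the Pearson correlation."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    estimates = []
    while len(estimates) < n_resamples:
        # Resample participants (pairs) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples with zero variance
            estimates.append(r)
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


# Made-up values for eight hypothetical participants (not study data).
ki = [0.0120, 0.0125, 0.0131, 0.0128, 0.0140, 0.0135, 0.0122, 0.0138]
ns3 = [3, 4, 4, 3, 6, 5, 3, 6]

ci_low, ci_high = bootstrap_ci(ki, ns3)
print(ci_low, ci_high)
```

With samples as small as the one studied here, percentile intervals can be unstable, which is one reason to read them alongside the point estimate and the outlier check described above.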

#### **DISCUSSION**

Consistent with our hypothesis, we found that, controlling for the effects of age and HA, variation in trait disinhibition was associated with levels of striatal DA synthesis capacity. Individuals with greater levels of trait disinhibition, in particular, tendencies to financial irresponsibility and extravagance, had greater DA synthesis capacity, as indexed by FDOPA *K<sup>i</sup>* values, in the ventral but not dorsal striatum.

We (Lawrence et al., 2013) recently found that individual differences in behavioral disinhibition (using the same personality trait measure as used here) were similarly related to individual differences in ventral striatal DA synthesis capacity in individuals with Parkinson's disease. Those individuals were, however, being treated with DA agonist medication, which could potentially have influenced both striatal DA synthesis (Rowlett et al., 1993) and externalizing behaviors, including pathological gambling (Weintraub et al., 2006). The current results importantly extend our earlier findings to healthy, non-medicated individuals, showing a relationship between disinhibitory traits and ventral striatal DA synthesis capacity in the absence of potential DAergic drug-induced effects. Taken together with the finding that genetic variation in DDC activity predicts disinhibitory sensation seeking tendencies in healthy individuals (Derringer et al., 2010), our results suggest that the link between behavioral disinhibition and ventral striatal DA synthesis capacity is likely to be, to a significant extent, genetically mediated. At the same time, we acknowledge that there are substantial (potentially shared) environmental influences on ventral striatal DA synthesis capacity (Stokes et al., 2013), behavioral disinhibition (Lomanowska et al., 2011) and externalizing (Hicks et al., 2013).

As in our earlier study of Parkinson's disease, here we found that only the r-NS facet-level scale NS3 (Extravagance vs. Reserve) was related to ventral striatal DA synthesis capacity. There was no significant relation with the NS2 subscale (Impulsivity vs. Reflection). The reasons for this are unclear. It is notable, however, that, of the NS facet-level scales, NS3 shows the strongest relation to both pathological gambling (Kim and Grant, 2001; Nordin and Nylander, 2007) and substance abuse (Etter et al., 2003). It may be that, of the disinhibitory NS facets, NS3 most closely indexes those traits (irresponsibility, problematic impulsivity) that lie at the core of the broad externalizing factor (Krueger et al., 2007).

Consistent with the proposal that externalizing vulnerability reflects, at least in part, individual differences in reward sensitivity (Iacono et al., 2008), the influence of variation in ventral striatal DA synthesis capacity on externalizing propensity likely reflects DA's role in one particular aspect of reward processing, namely the attribution of incentive salience (Berridge, 2012). Incentive salience is a motivational component of reward, one that transforms sensory information about rewards and reward cues into attractive, "wanted" incentives, motivating pursuit (Berridge, 2012). Notably, Flagel et al. (2010) found in rats that incentive salience attribution and behavioral disinhibition are genetically influenced, correlated traits. Available data suggest that animals prone to attribute incentive salience to reward cues have a more active DA system than those that do not (Flagel et al., 2010). In humans, VS FDOPA *K<sup>i</sup>* values have been found to positively correlate with BOLD-fMRI activity to reward cues in limbic brain regions linked to incentive salience attribution (Siessmeier et al., 2006), and limbic BOLD-fMRI responses to reward cues are correlated with both disinhibitory personality traits (Beaver et al., 2006; Buckholtz et al., 2010) and externalizing symptomatology (Bjork et al., 2010). One possibility is that individuals high on externalizing risk show exaggerated phasic DA release to reward cues, resulting from a larger releasable pool of DA generated by increased DA synthesis capacity (Bello et al., 2011; Anzalone et al., 2012), triggering excessive attribution of incentive salience to environmental cues and their associated rewards, leading to behavioral disinhibition (Flagel et al., 2010; Lovic et al., 2011) (but see Huys et al., 2014 for an alternative proposal).

At first glance, our findings seem inconsistent with an earlier study of detoxified alcoholics, which found no differences in ventral striatal DA synthesis capacity relative to a healthy control group (Heinz et al., 2005). Alcohol misuse, however, is multiply determined, and influenced to a greater extent by factors unique to alcohol, than by the general tendency to externalizing (Krueger et al., 2007). Further, it is possible that chronic alcohol use may produce potentially neurotoxic effects on DA neurons (Gilman et al., 1998), obscuring any pre-morbid trait influence on DA synthesis capacity.

It is important to note that FDOPA is not a specific ligand for DA neurons but rather is metabolized by all neurons that contain AADC (Brown et al., 1999). Hence, it is a marker for all tissues that take up and store monoamines, including serotonin (5-hydroxytryptamine, 5-HT) as well as DA neurons (Hashemi et al., 2012). 5-HT has been implicated in various aspects of impulsivity (Carver et al., 2008; Cools et al., 2008). Notably, however, Jupp et al. (2013) failed to find a relationship between levels of trait impulsivity (defined by premature responding on a 5-choice serial reaction time task) and levels of accumbens 5-HT in rats. It is likely, therefore, that individual differences in trait disinhibition are primarily related to individual differences in ventral striatal DA synthesis capacity.

In conclusion, we have found that personality-based vulnerability to externalizing problems, including pathological gambling, is related to relatively increased DA synthesis capacity in the ventral, but not dorsal, striatum in a sample of healthy men. Our results are consistent with preclinical models of behavioral disinhibition and addiction proneness, and may prove informative in understanding the neurobiological and psychological mechanisms underlying personality risk for phenotypically diverse forms of disinhibitory psychopathology.

#### **ACKNOWLEDGMENTS**

The UK Medical Research Council funded this research. We are grateful to the technical PET staff for their assistance and to all the volunteers who participated in this study. This paper is dedicated to the memories of Dr Stephen McGowan and Dr Andy Calder.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 February 2014; accepted: 28 February 2014; published online: 14 March 2014.*

*Citation: Lawrence AD and Brooks DJ (2014) Ventral striatal dopamine synthesis capacity is associated with individual differences in behavioral disinhibition. Front. Behav. Neurosci. 8:86. doi: 10.3389/fnbeh.2014.00086*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Lawrence and Brooks. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Opioidergic and dopaminergic manipulation of gambling tendencies: a preliminary study in male recreational gamblers

*Roseline I. Porchet<sup>1,2</sup>, Linde Boekhoudt<sup>1</sup>, Bettina Studer<sup>1,2,3</sup>, Praveen K. Gandamaneni<sup>4,5</sup>, Nisha Rani<sup>4,5</sup>, Somashekar Binnamangala<sup>4,5</sup>, Ulrich Müller<sup>2,4</sup> and Luke Clark<sup>1,2</sup>\**

*<sup>1</sup> Department of Psychology, University of Cambridge, Cambridge, UK*

*<sup>2</sup> Department of Psychology, Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, UK*

*<sup>3</sup> Institute of Cognitive Neuroscience, University College London, London, UK*

*<sup>4</sup> Department of Psychiatry, University of Cambridge, Cambridge, UK*

*<sup>5</sup> Cambridgeshire and Peterborough NHS Foundation Trust, Cambridge, UK*

#### *Edited by:*

*Bryan F. Singer, University of Michigan, USA*

#### *Reviewed by:*

*Martin Zack, Centre for Addiction and Mental Health, Canada*

*Harriet De Wit, University of Chicago, USA*

#### *\*Correspondence:*

*Luke Clark, Department of Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, Cambridge, UK e-mail: lc260@cam.ac.uk*

Gambling is characterized by cognitive distortions in the processing of chance and skill that are exacerbated in pathological gambling. Opioid and dopamine dysregulation is implicated in pathological gambling, but it is unclear whether these neurotransmitters modulate gambling distortions. The objective of the current study was to assess the effects of the opioid receptor antagonist naltrexone and the dopamine D2 receptor antagonist haloperidol on gambling behavior. Male recreational gamblers (*n* = 62) were assigned to receive single oral doses of naltrexone 50 mg, haloperidol 2 mg or placebo, in a parallel-groups design. At 2.5 h post-dosing, participants completed a slot machine task to elicit monetary wins, "near-misses," and a manipulation of personal choice, and a roulette game to elicit two biases in sequential processing, the gambler's fallacy and the hot hand belief. Psychophysiological responses (electrodermal activity and heart rate) were taken during the slot machine task, and plasma prolactin increase was assessed. The tasks successfully induced the gambling effects of interest. Some of these effects differed across treatment groups, although the direction of effect was not in line with our predictions. Differences were driven by the naltrexone group, which displayed a greater physiological response to wins, and marginally higher confidence ratings on winning streaks. Prolactin levels increased in the naltrexone group, but did not differ between haloperidol and placebo, implying that naltrexone but not haloperidol may have been functionally active at these doses. Our results support opioid modulation of cognition during gambling-like tasks, but did not support the more specific hypothesis that naltrexone may act to ameliorate cognitive distortions.

**Keywords: naltrexone, haloperidol, pathological gambling, addiction, reward, motivation, decision-making, psychophysiology**

#### **INTRODUCTION**

Gambling is a widespread form of recreational risk-taking that becomes excessive and pathological in a subset of the population (around 1%; Wardle et al., 2010). Pathological gambling is increasingly viewed as a "behavioral addiction" and has been reclassified within the Addictions category in the DSM-5 (Petry et al., 2013). Recent work on pathological gambling has studied its underlying neurobiological basis, highlighting the similarities with substance use disorders (Potenza, 2008) and focusing on the neuroimaging of reward-based tasks (Limbrick-Oldfield et al., 2013) and changes in neurotransmitter function (Leeman and Potenza, 2012). A distinct cognitive approach to gambling has emphasized the role of erroneous thinking styles ("cognitive distortions") during gambling (Ladouceur and Walker, 1996; Clark, 2010): gamblers experience a variety of biases and erroneous thoughts during play, pertaining in particular to their perceived level of skill in controlling the outcomes ("the illusion of control"; Langer, 1975) and their tendency to detect patterns in random sequences ("the Gambler's Fallacy"; Oskarsson et al., 2009). While the gambling cognitions are apparent in non-problem gamblers and student populations, the overall level of distorted thinking is elevated in people with gambling problems (Miller and Currie, 2008; Emond and Marmurek, 2010; Michalczuk et al., 2011) and these cognitions can be targeted effectively by cognitive-behavioral therapies (Fortune and Goodie, 2012). The neurobiological mechanisms that underlie these gambling-related distortions have received minimal attention to date, and the aim of the present study was to examine their pharmacological basis, looking at dopamine and opioid receptor manipulations, in a sample of mild recreational gamblers.

The opioid system is the target of growing interest in pathological gambling, primarily on the basis of clinical trials showing significant benefits of the opioid receptor antagonists naltrexone and nalmefene on gambling symptom severity and self-reported craving (Kim et al., 2001; Grant et al., 2006, 2008). These medications are well established in the clinical management of opiate and alcohol dependence (O'Brien, 2005). Preclinical evidence indicates that opioid receptors are distributed widely in the mesolimbic system, and can modulate dopamine transmission (Spanagel et al., 1992). Endogenous opioids are implicated particularly in hedonic aspects of reward processing (Pecina et al., 2006; Barbano and Cador, 2007). Of relevance to gambling behavior, a pharmacological fMRI study of the μ-opioid antagonist naloxone found attenuated reward-related responses in the ventral striatum, and enhanced loss-related activity in the medial prefrontal cortex, on a wheel of fortune task in healthy volunteers (Petrovic et al., 2008). Thus, the treatment effect in pathological gambling may be mediated by a dual action of enhancing aversive processing and attenuating positive processing of gambling outcomes. The present study employed the opioid receptor antagonist naltrexone, which is a competitive antagonist at μ- and κ-opioid receptors, and to a lesser extent at δ-opioid receptors (Kreek, 1996). We used a 50 mg single dose that is widely used in other cognitive studies in healthy volunteers (Katzen-Perez et al., 2001; Mitchell et al., 2007; Boettiger et al., 2009).

Dopamine dysregulation has also been indicated in problem gambling, based on genetic data (Lobo and Kennedy, 2009) and studies measuring peripheral markers (Bergh et al., 1997; Meyer et al., 2004), as well as the provocative syndrome in Parkinson's Disease where medications acting at the dopamine D2/D3-receptor are linked to the emergence of disordered gambling as a side-effect (Voon et al., 2009; Djamshidian et al., 2011). Dynamic PET studies with the dopamine D2/D3 radiotracer [11C]raclopride have confirmed that monetary reinforcement induces dopamine release in healthy volunteers performing gambling-like tasks (Zald et al., 2004; Martin-Soelch et al., 2011), and the magnitude of dopamine release is elevated in at least a subset of patients with pathological gambling (Steeves et al., 2009; Linnet et al., 2011; Joutsa et al., 2012). In addition, acute administration of the dopamine stimulant amphetamine, and the D2-receptor antagonist haloperidol, were both seen to modulate gambling tendencies in pathological gamblers (Zack and Poulos, 2004, 2007). In the present study, we sought to manipulate dopamine transmission with haloperidol, a first generation antipsychotic with high D2 binding affinity in the striatum (Kapur et al., 1996; Xiberas et al., 2001). We selected a low (2 mg) dose of haloperidol that we expected to act preferentially on the presynaptic D2 auto-receptors to *increase* dopamine transmission (Frank and O'Reilly, 2006).

We examined a number of gambling variables that can be elicited with laboratory tasks. We used a slot machine task that delivered unpredictable monetary wins as well as "near-miss" outcomes: non-wins that are spatially proximal to a jackpot win (Reid, 1986). Relative to "full-misses," near-misses are rated as unpleasant but increase motivations to continue gambling, despite their objective non-win status (see also Kassinove and Schare, 2001). Previous neuroimaging of this task showed that near-misses recruited overlapping brain circuitry to the win outcomes, including the ventral striatum and insula, in both healthy volunteers and regular gamblers (Clark et al., 2009; Chase and Clark, 2010). In the present study, we measured the subjective response to these wins and near-misses with trial-by-trial ratings. We also recorded psychophysiological activity following these outcomes using electrodermal activity (EDA) and heart rate (HR) recording, which have established sensitivity to gambling outcomes (Dixon et al., 2011; Lole et al., 2012; Studer and Clark, 2011; Clark et al., 2012a). In addition, the slot machine task measures one example of illusory control, the effect of personal choice, by comparing the expectancies of winning under conditions where the participant either chose, or was not able to choose, the "play icon." Subjects rate their expectancy of winning as higher on participant-chosen trials (Clark et al., 2009, 2012a), and fMRI signals to monetary wins are enhanced under this choice manipulation (Coricelli et al., 2005; Studer et al., 2012).

We also administered a second task, based upon roulette, which involved binary predictions of red or black outcomes and a subsequent confidence rating (Ayton and Fischer, 2004). The Gambler's Fallacy is observed as the reduced choice of one color (e.g., red) after a "run" of consecutive outcomes of that color (e.g., four successive reds). In addition, participants' confidence ratings are sensitive to their prediction accuracy, with "streaks" of consecutive correct guesses (i.e., wins) increasing self-reported confidence, and incorrect predictions (i.e., a losing streak) leading to decreased confidence. These are known as "hot hand" effects (Gilovich et al., 1985; Ayton and Fischer, 2004). Past neuroimaging studies found modulation of caudate, insula and medial prefrontal cortex activity by streaks of wins and losses in binary choice games (Elliott et al., 2000; Akitsuki et al., 2003).

As a preliminary investigation, we examined the effects of haloperidol and naltrexone on these gambling variables in a group of healthy male volunteers, who reported recreational gambling involvement. There is evidence that both gambling-related cognitive distortions, and problem gambling symptom severity, exist on a continuum, such that recreational gamblers are considered at some degree of risk for later problematic gambling (Toce-Gerstein et al., 2003; Raylu and Oei, 2004). Rates of gambling involvement and the prevalence of pathological gambling are typically higher in males (Bland et al., 1993; Shaffer et al., 1999).

The overarching hypothesis was that the gambling cognitions under scrutiny would be modulated by the dopamine and opioid-based treatments. Given preclinical evidence that μ-opioid blockade exerts a downstream effect on dopamine transmission (Spanagel et al., 1992), we were further interested in the overlap between the cognitive variables affected by naltrexone and haloperidol. Previous work afforded a number of more specific predictions. First, there are some indications that dopamine may modulate near-miss effects and illusory control, specifically. Using a rodent version of a slot machine, amphetamine and the dopamine D2 agonist quinpirole increased erroneous lever presses on a game with near-misses (2 of 3 identical symbols) (Winstanley et al., 2011). Dopamine is also implicated in perceptions of control (Declerck et al., 2006; Redgrave and Gurney, 2006); for example, levodopa increased the sense of agency ("action-effect binding") on a timing task in patients with Parkinson's Disease (Moore et al., 2010). As such, we predicted that the low dose of haloperidol would potentiate subjective and physiological responses to win and near-miss outcomes, and enhance the influence of personal choice, on the slot machine task. Second, drawing on Petrovic et al. (2008), we predicted that the naltrexone group would show attenuated responses to winning outcomes, coupled with enhanced negative processing (affect following near-misses) on the two tasks. Given the lack of past work to guide predictions about neurotransmitter effects on the Roulette task, these data were analyzed in an exploratory manner.

#### **METHODS**

#### **PARTICIPANTS**

Male participants (*n* = 62) were recruited through the University and community advertisements. Participants were aged 18–49 years, and reported past-year gambling involvement and at least 5 lifetime gambling experiences. Exclusion criteria (confirmed through telephone interview) were a score ≥ 8 (indicative of probable pathological gambling) on the Problem Gambling Severity Index (Ferris and Wynne, 2001), significant neurological or physical illness, and current or past mental health problems, including substance use and heavy smoking (>10 cigarettes/day). The study was approved by the Cambridgeshire 4 Research Ethics Committee (10/H0305/79). All participants gave written informed consent and were paid £35 for their participation (plus a task-related bonus of £6).

#### **STUDY DESIGN**

The study was a double-blind, parallel-groups, placebo-controlled design, involving a single session at a clinical research facility. Subjects were randomly allocated to the three treatment groups: 2 mg haloperidol, 50 mg naltrexone or placebo (microcrystalline cellulose) hidden in identical gelatine capsules. Upon arrival, a urine sample was taken to confirm absence of recent opiate use, and participants completed trait questionnaires assessing impulsivity (UPPS-P; Cyders et al., 2007) and susceptibility to gambling biases (Gambling Related Cognition Scale; Raylu and Oei, 2004). Participants also completed the electronic Mini International Neuropsychiatric Interview (eMINI) (Sheehan et al., 1998) for further investigation of current and lifetime psychiatric disorders. Mental health problems were detected in 17 participants: 7 subjects in the placebo group (alcohol dependence *n* = 4, obsessive-compulsive disorder *n* = 1, hypomanic episode *n* = 1, bulimia nervosa *n* = 1), 7 subjects in the haloperidol group (alcohol dependence *n* = 1, alcohol abuse *n* = 2, obsessive-compulsive disorder *n* = 2, cocaine abuse *n* = 1, generalized anxiety disorder *n* = 1), and 3 subjects in the naltrexone group (alcohol dependence *n* = 1, hypomanic episode *n* = 1, major depressive episode *n* = 1). The proportion of participants meeting eMINI diagnoses did not differ across the three treatment groups (χ<sup>2</sup> = 2.77, *p* = 0.25). Given that participants had disclosed no past or current mental health problems in the telephone interviews, we cannot rule out the possibility that the eMINI detections were false positives.

Following dosing, participants rested for 2.5 h to allow drug absorption. This timing was based upon pharmacokinetic data showing that haloperidol reaches maximal plasma concentrations after 3 h (plasma half-life: 24 h) (Darby et al., 1995), whereas naltrexone reaches maximal plasma concentration after 45 min with a plasma half-life of 4 h (Crabtree, 1984; Meyer et al., 1984). After this rest period, volunteers completed the Slot Machine Task (Clark et al., 2009) with concurrent psychophysiological monitoring of HR and EDA, followed by the Roulette Task (Ayton and Fischer, 2004). Blood samples were taken pre-dosing (T1) and at the start of the testing period (T2, +2.5 h) to measure serum prolactin levels as a marker of dopaminergic tone (Ben-Jonathan and Hnasko, 2001). Blood pressure (BP) and HR were measured with a wrist cuff, and mood was measured with Visual Analogue Scales (Bond and Lader, 1974), at T1, T2, and on completion of testing (T3, +4 h). VAS data were unavailable for a single subject.

#### **PROLACTIN ANALYSIS**

Blood samples (4.7 ml) were centrifuged at 4000 rpm for 5 min at room temperature to obtain serum and then distributed into two aliquots of about 1.5 ml. The samples were frozen at −80°C until analysis. Prolactin levels were analyzed by the National Institute for Health Research Cambridge Biomedical Research Center Core Biochemistry Assay Laboratory, Addenbrooke's Hospital, and were tested with immunofluorometric assay (ADVIA Centaur prolactin assay, Siemens). Results are reported in mU/L. Prolactin samples were unavailable or contaminated by macroprolactin in two subjects.

#### **TASKS**

#### *Slot machine task*

Participants completed 60 trials (following 4 practice trials) on a simplified two-reel slot machine task, described in detail in Clark et al. (2009) (see **Figure 1**). Psychophysiological signals (EDA and HR) were monitored during the task using a Biopac MP36 (see below). The screen background color (white or black) designated two choice conditions: participant-chosen trials, in which the participant selected the "play icon" on the left reel by scrolling the reel up or down, and computer-chosen trials, in which the play icon was selected automatically. Following icon selection, the right reel spun and decelerated (mean spin time: 4.2 s) to deliver a win (£1), near-miss, or full-miss outcome (outcome duration 6 s). Current earnings were displayed in the inter-trial interval (duration 5 s), with an initial endowment of £5. The outcomes and choice condition (participant-chosen, computer-chosen) occurred in a fixed pseudo-random sequence such that wins occurred on 1/6, and near-misses on 1/3, of trials. As a consequence of the fixed sequence, all participants completed the task with £6, which they received as a bonus.

**FIGURE 1 | The slot machine task displayed two reels, with the same six icons on each reel.** Each trial involved a fixed £0.15 wager. After a selection phase in which either the computer or the participant chose one of the icons on the left reel as the "play icon," the right reel spun for a variable anticipation phase. The right reel decelerated and came to a standstill. If the right reel stopped on the chosen play icon, i.e., the reels were aligned on the central payline, the subject won £1. If the right reel stopped on a different icon (5/6 trials), the participant lost their wager. In the analysis of these non-wins, we distinguished near-misses (with the play icon either side of the payline) and full-misses (with the play icon more than one position from the payline).
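The fixed outcome sequence implies a fixed final balance, which can be checked with simple arithmetic. The sketch below assumes that the £0.15 wager is deducted on every trial (including winning trials) and that £1 is credited on each win; under that assumption the stated £5 endowment, 60 trials and 1/6 win rate reproduce the reported £6 bonus. The function name and parameter names are illustrative.

```python
from fractions import Fraction

def final_balance(endowment, wager, win_payout, n_trials, win_rate):
    """Balance after a fixed sequence, assuming the wager is paid on every
    trial and the payout is credited on winning trials only."""
    n_wins = win_rate * n_trials
    return endowment - wager * n_trials + win_payout * n_wins

# Values taken from the task description; Fractions avoid rounding error.
n_trials = 60
n_wins = Fraction(1, 6) * n_trials          # 10 winning trials
n_near_misses = Fraction(1, 3) * n_trials   # 20 near-miss trials
n_full_misses = n_trials - n_wins - n_near_misses  # remaining 30 trials

balance = final_balance(
    endowment=Fraction(5),
    wager=Fraction(15, 100),
    win_payout=Fraction(1),
    n_trials=n_trials,
    win_rate=Fraction(1, 6),
)
```

Total wagers come to £9 and total winnings to £10, so the balance moves from £5 to £6 regardless of the participant's choices.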

On each trial, three Likert ratings were taken: following icon selection, "How do you rate your chances of winning?" (0 to +100), and following the outcome, "How pleased are you with the result?" (−100 to +100) and "How much do you want to continue to play?" (0 to +100).

#### *Roulette task*

This binary choice task was modified from Ayton and Fischer (2004). The roulette wheel displayed an equal number of red and blue segments (see **Figure 2**), and on each trial, the participant first guessed red or blue, and then gave a confidence rating on a 21-point scale. A history bar during the color choice presented the 10 previous outcomes, to minimize working memory demands that may be independently affected by the drug treatments.

Following the color choice and confidence rating, the wheel spun for 800–1200 ms, and the outcome was presented (e.g., "Blue: you win"). Participants received £0.10 for correct guesses, with no reinforcement (i.e., losses) for incorrect guesses. Participants completed 3 practice trials, followed by a total of 90 trials, using a pre-specified color sequence in order to deliver runs of 1–5 consecutive outcomes of the same color. This fixed sequence had an equal probability of either color, and a probability of alternation of 0.48 (see Oskarsson et al., 2009 for derivation). We refer to consecutive outcomes of the same color as "outcome runs" (i.e., blue, red, red, red is an outcome run of length 3), and consecutive correct or incorrect predictions as "feedback streaks." Two dependent variables were derived: (1) the probability of choosing either color as a function of the outcome run of that color, indicative of the Gambler's Fallacy, (2) the confidence rating as a function of feedback streak, indicative of the Hot Hand Beliefs.
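The quantities defined above can be made concrete with a short sketch. It computes the outcome-run length at each trial, the probability of alternation of a color sequence, and the proportion of trials on which a participant predicts the color of the preceding run, grouped by run length (a falling proportion with longer runs would correspond to the Gambler's Fallacy). The function names and the toy sequences are assumptions for illustration; the study's actual 90-trial sequence is not reproduced here.

```python
from collections import defaultdict

def run_lengths(outcomes):
    """Length of the same-color run ending at each trial (starting at 1)."""
    lengths = []
    for i, color in enumerate(outcomes):
        if i > 0 and color == outcomes[i - 1]:
            lengths.append(lengths[-1] + 1)
        else:
            lengths.append(1)
    return lengths

def alternation_probability(outcomes):
    """Proportion of consecutive outcome pairs in which the color changes."""
    switches = sum(a != b for a, b in zip(outcomes, outcomes[1:]))
    return switches / (len(outcomes) - 1)

def repeat_choice_by_run(predictions, outcomes):
    """For each preceding run length, the proportion of trials on which the
    prediction matched the color of that run."""
    lengths = run_lengths(outcomes)
    counts = defaultdict(lambda: [0, 0])  # run length -> [matches, trials]
    for t in range(1, len(outcomes)):
        k = lengths[t - 1]
        counts[k][0] += predictions[t] == outcomes[t - 1]
        counts[k][1] += 1
    return {k: hits / total for k, (hits, total) in sorted(counts.items())}

# Toy example using the sequence given in the text.
outcomes = ["blue", "red", "red", "red"]
predictions = ["red", "blue", "red", "blue"]   # hypothetical guesses
runs = run_lengths(outcomes)
p_alt = alternation_probability(outcomes)
by_run = repeat_choice_by_run(predictions, outcomes)
```

Feedback streaks can be derived the same way by applying `run_lengths` to the sequence of correct/incorrect flags, for example `run_lengths([p == o for p, o in zip(predictions, outcomes)])`.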

#### **PSYCHOPHYSIOLOGICAL MEASUREMENT**

During the slot machine task, electrodermal activity (EDA) and HR calculated from electrocardiogram (ECG) were recorded via a BIOPAC MP36 unit (BIOPAC Systems Ltd, Goleta, CA, USA), following methods described previously (Clark et al., 2012a,b). The BIOPAC unit, sampling at 1000 Hz, was connected to the stimulus delivery computer and to a second recording computer running AcqKnowledge 4.1 software. Task events were marked on the psychophysiological trace via a parallel port connection. EDA was recorded through fingertip electrodes attached to the index and middle fingers of the non-dominant hand. Heart rate was recorded using ECG electrode patches applied to the right wrist and left ankle. The psychophysiological data were extracted using in-house scripts developed in Microsoft Visual Basic (v6.0): activity on the slot machine task was modeled to the time of outcome delivery, using change from baseline scores calculated from the mean activity in the final 2 s of reel spin. Mean EDA was extracted in 6 × 2 s bins from the onset of the outcome phase. An EDA summary measure was calculated from the maximum change from baseline value in bins 2–4 (i.e., 2–8 s post-outcome), given the typical time-course for EDA changes (Dawson et al., 2000). HR responses were calculated using the median HR in 12 × 0.5 s bins from the onset of the outcome phase. Two HR summary measures isolated the initial HR deceleration component (the minimum value in bins 1–6, i.e., 0–3 s post-outcome, minus the baseline) and the subsequent HR acceleration component (the maximum in bins 7–12, i.e., 3–6 s post-outcome, minus the deceleration minima) (Hodes et al., 1985; Bradley, 2000).
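The summary measures described above can be sketched as follows. This is an illustrative reimplementation in Python, not the in-house Visual Basic scripts; the function names are hypothetical, the inputs are assumed to be already binned (6 mean EDA values, 12 median HR values) with a single baseline value per trial, and the numbers in the example are invented.

```python
import numpy as np

def eda_summary(eda_bins, baseline):
    """Maximum change from baseline in bins 2-4 (2-8 s post-outcome).

    eda_bins: mean EDA in 6 consecutive 2 s bins from outcome onset.
    baseline: mean EDA in the final 2 s of reel spin.
    """
    change = np.asarray(eda_bins, dtype=float) - baseline
    return float(change[1:4].max())   # zero-based slice 1:4 = bins 2, 3, 4

def hr_summaries(hr_bins, baseline):
    """Deceleration and acceleration components of the HR response.

    hr_bins: median HR in 12 consecutive 0.5 s bins from outcome onset.
    Returns (deceleration, acceleration), where deceleration is the minimum
    of bins 1-6 minus baseline, and acceleration is the maximum of bins 7-12
    minus that deceleration minimum.
    """
    hr = np.asarray(hr_bins, dtype=float)
    decel_min = hr[:6].min()
    deceleration = decel_min - baseline
    acceleration = hr[6:].max() - decel_min
    return float(deceleration), float(acceleration)

# Invented single-trial values, for illustration only.
eda_peak = eda_summary([0.1, 0.3, 0.5, 0.2, 0.9, 0.0], baseline=0.1)
decel, accel = hr_summaries(
    [70, 69, 68, 67, 68, 69, 70, 72, 74, 73, 72, 71], baseline=70.0
)
```

Note that the large value in the fifth EDA bin is ignored by design, because the summary window covers bins 2–4 only.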

#### **STATISTICAL ANALYSIS**

Statistical analysis was performed in SPSS version 19.0. Demographic and trait variables were compared across groups using One-Way ANOVA. Fisher's least significant difference test was used for *post-hoc* comparisons, as is appropriate for 3-group designs (Cardinal and Aitken, 2006). Mood scales, cardiovascular measures, and prolactin levels were assessed with mixed-model ANOVAs including Timepoint as a within-subjects factor.

On the slot machine task, the subjective ratings and psychophysiology summary measures were analyzed with mixed-model ANOVA, with Outcome (wins, near-misses, full-misses) and Choice (participant-chosen, computer-chosen) as within-subjects factors, and Treatment (3 levels: haloperidol, naltrexone, placebo) as a between-subjects factor. Data from the roulette task were analyzed using two mixed-model ANOVAs, with Treatment (3 levels: placebo, haloperidol, naltrexone) as the between-subjects factor. For analysis of color predictions, Outcome Run length was the within-subjects factor. For the analysis of confidence ratings, Feedback Streak length and Outcome (winning, losing) were within-subjects factors. Simple main effects analysis of the roulette task data compared shorter runs/streaks (1–2 successive events) against longer runs/streaks (4–5 successive events). As the feedback streaks were not pre-specified, some subjects did not experience any longer streaks. For participants missing only streaks of length five, we imputed their streak length 4 value for their missing value (this is a conservative approach that underestimates any effect of the longer streaks). Three participants who did not experience streaks longer than three events were excluded. In addition, one further participant was excluded who did not vary either his color choice or confidence ratings across the task.
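The handling of missing feedback streaks can be expressed as a small rule. The sketch below is one reading of the description above, not the authors' procedure as implemented: it treats one streak type (winning or losing) at a time, uses a hypothetical dictionary mapping streak length to mean confidence rating, and returns `None` for a participant who would be excluded.

```python
def fill_streak_means(mean_conf_by_length):
    """Apply the imputation/exclusion rule to one participant.

    mean_conf_by_length: dict mapping streak length (1-5) to mean confidence,
    with unobserved streak lengths omitted.
    Returns a completed dict, or None if the participant never experienced a
    streak longer than three events (excluded from the analysis).
    """
    if 4 not in mean_conf_by_length:
        return None
    filled = dict(mean_conf_by_length)
    # Carry the length-4 value forward only when length 5 was never observed.
    filled.setdefault(5, filled[4])
    return filled

# Hypothetical participants (values invented for illustration).
missing_five = fill_streak_means({1: 10.0, 2: 11.0, 3: 12.0, 4: 13.0})
excluded = fill_streak_means({1: 10.0, 2: 11.0, 3: 12.0})
```

Carrying the length-4 value forward sets the imputed 4-to-5 difference to zero for that participant, which is why the text describes the approach as conservative.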

As the primary aim of this study was to compare the effects of haloperidol and naltrexone relative to the placebo condition, rather than the direct comparison of the two active treatments, the omnibus 3-group model was decomposed using two planned comparisons of the haloperidol group vs. placebo, and the naltrexone group vs. placebo. For all analyses, the Greenhouse-Geisser correction was applied when sphericity assumptions were violated, and the Huynh-Feldt correction was reported when the Greenhouse-Geisser estimate was greater than 0.75 (Cardinal and Aitken, 2006). All tests were thresholded at *p <* 0*.*05 two-tailed.

#### **RESULTS**

The three treatment groups did not differ significantly in age, years of education, trait gambling distortions, or impulsivity (see **Table 1**). The overall level of problem gambling was low on the PGSI (mean 1.5; *SD* 1.73; range 0–7; a score ≥ 8 is indicative of probable pathological gambling), but GRCS scores were in the range of previous data in recreational gamblers (Raylu and Oei, 2004; Billieux et al., 2012).

*The values are reported as means and standard deviations; PGSI, Problem Gambling Severity Index (range 0–27); GRCS, Gambling-Related Cognitions Scale (range 23–161); UPPS-P Impulsivity Scale (range 59–236); NS, not significant.*

#### **PROLACTIN LEVELS**

For plasma prolactin levels, there was a significant Treatment × Time interaction [*F*(2, 57) = 4.09, *p* = 0.022, η<sub>p</sub><sup>2</sup> = 0.13]. The main effects of Treatment [*F*(2, 57) = 2.42, *p* = 0.098, η<sub>p</sub><sup>2</sup> = 0.08] and Time [*F*(1, 57) = 3.44, *p* = 0.069, η<sub>p</sub><sup>2</sup> = 0.06] approached significance. Analysis of change scores (T2 minus T1) indicated a prolactin increase in the naltrexone group compared to placebo [*t*(38) = −2.78, *p* = 0.008], consistent with downstream dopaminergic blockade by naltrexone. The haloperidol group did not differ significantly from placebo (*p* > 0.1) (see **Figure 3**).
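As a reading aid, the partial eta squared values reported alongside each *F* statistic can be recovered from the *F* value and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below (the function name is chosen here for illustration) applies it to the three prolactin effects.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    ss_ratio = f_value * df_effect  # proportional to SS_effect / MS_error
    return ss_ratio / (ss_ratio + df_error)


# Prolactin effects as reported in the text.
interaction = partial_eta_squared(4.09, 2, 57)  # Treatment x Time
treatment = partial_eta_squared(2.42, 2, 57)    # main effect of Treatment
time = partial_eta_squared(3.44, 1, 57)         # main effect of Time
```

Rounded to two decimals, these reproduce the reported effect sizes of 0.13, 0.08, and 0.06.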

#### **MOOD AND CARDIOVASCULAR MEASURES**

On the subjective mood ratings, there were no differences between treatment groups (i.e., the Treatment × Time interaction term) for Alertness [*F*(4, 116) = 1.06, NS], Happiness [*F*(4, 116) = 1.70, NS] or Calmness [*F*(4, 116) = 0.09, NS]. Main effects of Time were observed on Alertness [*F*(2, 116) = 23.10, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.29] and Happiness [*F*(2, 116) = 7.26, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.11], reflecting a general decrease over time across all groups.

On the cardiovascular measures, there were no differences between treatment groups (i.e., Treatment × Time interactions) on HR [*F*(4, 116) = 1.07, NS], systolic BP [*F*(3.3, 96.2) = 1.76, NS] or diastolic BP [*F*(4, 116) = 1.65, NS]. Systolic BP and HR decreased over time across all groups [main effect of Time: *F*(1.7, 96.2) = 3.92, *p* < 0.030, η<sub>p</sub><sup>2</sup> = 0.06; *F*(2, 116) = 45.49, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.44, respectively].

#### **SLOT MACHINE TASK**

#### *Subjective effects of wins and near-misses*

On the ratings of "pleased with outcome," the omnibus ANOVA revealed a significant main effect of Outcome [*F*(1.0, 59.0) = 189.66, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.77], such that participants were more pleased after wins compared to near-misses [*t*(59) = 13.15, *p* < 0.001] and full-misses [*t*(59) = 13.40, *p* < 0.001] (see **Table 2**). Near-misses were more pleasant than full-misses [*t*(59) = 2.19, *p* = 0.033]. An Outcome × Choice interaction was observed [*F*(1.5, 84.0) = 4.32, *p* = 0.026, η<sub>p</sub><sup>2</sup> = 0.07], such that for near-misses and full-misses, participant-chosen outcomes were significantly less pleasant than computer-selected outcomes [*t*(59) = 2.32, *p* = 0.024; *t*(59) = 2.84, *p* = 0.006; respectively], whereas for wins, pleasantness ratings did not differ by choice condition [*t*(59) = 1.46, *p* = 0.149]. An Outcome × Treatment interaction was observed [*F*(2.1, 59.0) = 4.15, *p* = 0.020, η<sub>p</sub><sup>2</sup> = 0.13], driven by an effect of haloperidol [haloperidol model: *F*(1.0, 39.1) = 6.56, *p* = 0.014, η<sub>p</sub><sup>2</sup> = 0.15; naltrexone model: *F*(1.0, 38.6) = 0.85, NS]. The haloperidol group rated higher pleasure after wins [*t*(38) = −2.20, *p* = 0.034] and greater unpleasantness after non-wins [near-misses: *t*(38) = 2.16, *p* = 0.038; full-misses: *t*(35.7) = 2.36, *p* = 0.024] compared to the placebo group (see **Table 2**). Thus, on a subjective rating, haloperidol appeared to potentiate both the positive affect to winning as well as negative affect following non-winning outcomes.

*Figure caption fragment: error bars indicate standard errors of the means.*

On the rating of "continue to play," the omnibus ANOVA revealed a significant main effect of Outcome [*F*(1.2, 67.0) = 45.5, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.44], reflecting higher ratings after wins compared to non-wins [near-misses: *t*(59) = 6.60, *p* < 0.001; full-misses: *t*(59) = 7.53, *p* < 0.001]. There was an Outcome × Choice interaction [*F*(2, 114) = 13.5, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.19]: the desire to play was higher after participant-chosen near-misses, compared to computer-chosen near-misses [*t*(59) = 5.00, *p* < 0.001], and participant-chosen full-misses [*t*(59) = 4.78, *p* < 0.001], as previously observed on this task (Clark et al., 2009, 2012a,b). There was also an Outcome × Choice × Treatment interaction [*F*(4, 114) = 3.39, *p* = 0.012, η<sub>p</sub><sup>2</sup> = 0.11], driven by an effect of naltrexone [*F*(2, 74) = 3.57, *p* = 0.033, η<sub>p</sub><sup>2</sup> = 0.09] [haloperidol model: *F*(2, 76) = 0.42, NS]. In the placebo group, the participant-chosen near-misses were rated as more motivating than either computer-chosen near-misses [*t*(18) = 3.27, *p* < 0.001] or participant-chosen full-misses [*t*(18) = 3.66, *p* < 0.001]. These differences were not observed in the naltrexone group (all *p*s > 0.1) and the calculated difference score between participant-chosen and computer-chosen near-misses was marginally higher in the placebo group than in the naltrexone group [*t*(37) = 1.80, *p* = 0.08]. Thus, naltrexone had a modest effect of attenuating the motivational ratings after self-selected near-misses (see **Figure 4**).

*Values are reported as mean (SD), separated by the participant-chosen condition and the computer-chosen condition.*

#### *Psychophysiological responses to wins and near-misses*

For EDA max, there were significant main effects of Outcome [*F*(1.4, 77.4) = 24.8, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.31] and Choice [*F*(1, 56) = 28.0, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.33]. Across all groups, participants experienced higher EDA responses after wins compared to non-wins [near-misses: *t*(58) = 5.42, *p* < 0.001; full-misses: *t*(58) = 5.14, *p* < 0.001]. There was a marginal increase in EDA after near-misses in comparison to full-misses [*t*(58) = 1.98, *p* = 0.053]. Participants showed higher EDA on participant-chosen outcomes compared to computer-chosen outcomes [wins: *t*(58) = 2.06, *p* = 0.044; near-misses: *t*(58) = 4.32, *p* < 0.001; full-misses: *t*(58) = 2.72, *p* = 0.009]. There was a significant Outcome × Treatment interaction [*F*(2.8, 77.4) = 3.52, *p* = 0.022, η<sub>p</sub><sup>2</sup> = 0.11], which was driven by the naltrexone group [*F*(1.4, 51.9) = 5.09, *p* = 0.018, η<sub>p</sub><sup>2</sup> = 0.12] [haloperidol model: *F*(1.3, 46.0) = 0.15, NS]. The EDA change to wins relative to full-misses was greater in the naltrexone group than the placebo group [*t*(37) = −2.47, *p* = 0.018], as well as marginally higher for the near-miss vs. full-miss change score [*t*(37) = −1.80, *p* = 0.081] (see **Figure 5A**). Thus, naltrexone increased the physiological responsiveness to wins in comparison to full-misses (**Table 3**).

**FIGURE 4 | Motivational ratings on the slot machine task showed an outcome (near-miss, full-miss) by control (participant-chosen, computer-chosen) interaction, whereby participant-chosen near-misses increased motivation to continue relative to the computer-chosen near-misses, in the placebo group (and haloperidol group), and this was attenuated in the naltrexone group.** Error bars indicate standard error of the mean. ∗*p* < 0.08.

#### **Table 3 | Psychophysiological responses to outcomes on the slot machine task (change scores from baseline).**


*Values are reported as mean (SD), separated by the participant-chosen condition and the computer-chosen condition. EDA, electrodermal activity, in µS, where the Max refers to the maximum value across bins 2–4 (i.e., 2–8 s post-outcome) minus the pre-trial baseline. HR, heart rate, in beats per minute, where the Deceleration value refers to the minimum value in bins 1–6 (i.e., 0–3 s post-outcome) minus the baseline; and the subsequent HR acceleration component refers to the maximum in bins 7–12 (i.e., 3–6 s post-outcome) minus the deceleration minimum.*
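The three summary measures defined in the note above can be expressed compactly in code. The following Python sketch assumes that each trial is already available as a list of per-bin values (bin 1 at list index 0) together with a pre-trial baseline; the function names and data layout are hypothetical and are not taken from the study's analysis pipeline.

```python
from typing import Sequence, Tuple


def eda_max(eda_bins: Sequence[float], baseline: float) -> float:
    """EDA max: maximum across bins 2-4 minus the pre-trial baseline."""
    return max(eda_bins[1:4]) - baseline


def hr_components(hr_bins: Sequence[float], baseline: float) -> Tuple[float, float]:
    """Return (deceleration, acceleration) for one trial.

    Deceleration: minimum of bins 1-6 minus the baseline.
    Acceleration: maximum of bins 7-12 minus the deceleration minimum.
    """
    trough = min(hr_bins[0:6])
    deceleration = trough - baseline
    acceleration = max(hr_bins[6:12]) - trough
    return deceleration, acceleration
```

With a baseline of 70 bpm and bin values `[70, 69, 68, 67, 68, 69, 70, 71, 72, 73, 72, 71]`, the deceleration is −3 bpm and the acceleration is 6 bpm. Note that acceleration is referenced to the trough rather than to the baseline, so it indexes the size of the rebound.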

On HR deceleration, there was a main effect of Outcome [*F*(2, 106) = 6.44, *p* = 0.002, η<sub>p</sub><sup>2</sup> = 0.11], with greater HR decelerations after wins and near-misses in comparison to full-misses [*t*(55) = −3.23, *p* = 0.002; *t*(55) = −3.03, *p* = 0.004; respectively]. A trend Outcome × Treatment interaction was observed [*F*(4, 106) = 2.11, *p* = 0.085, η<sub>p</sub><sup>2</sup> = 0.074], driven by an effect of naltrexone [*F*(2, 70) = 2.86, *p* = 0.064, η<sub>p</sub><sup>2</sup> = 0.08] [haloperidol model: *F*(2, 68) = 1.64, NS]. The HR deceleration to wins relative to full-misses was greater in the naltrexone group than the placebo group [*t*(35) = 2.17, *p* = 0.03] (see **Figure 5B**), similar to the EDA effect. For HR acceleration, there was a significant effect of Outcome [*F*(1.8, 93.4) = 8.12, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.13], reflecting higher HR acceleration after near-misses relative to wins [*t*(55) = 2.45, *p* = 0.018] and full-misses [*t*(55) = 4.99, *p* < 0.001]. There was also a main effect of Choice [*F*(1, 53) = 4.17, *p* = 0.046, η<sub>p</sub><sup>2</sup> = 0.07], indicating higher HR acceleration for participant-chosen outcomes compared to computer-chosen outcomes [wins: *t*(55) = 2.02, *p* = 0.048; full-misses: *t*(55) = 2.89, *p* = 0.006]. The HR acceleration effects did not interact with Treatment group.

#### *Subjective effects of personal choice (the illusion of control)*

On the ratings of "chances of winning," participants reported a greater expectancy of winning when they chose the play icon, compared to the computer-chosen condition [*F*(1, 57) = 44.59, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.44]. This effect did not vary across treatment groups [Treatment × Choice: *F*(2, 57) = 2.19, NS; Treatment: *F*(2, 57) = 0.59, NS] (see **Table 2**).

#### **ROULETTE TASK**

#### *Gambler's fallacy*

The analysis of color choice yielded a main effect of Run Length [*F*(1, 55) = 7.84, *p* = 0.007, η<sub>p</sub><sup>2</sup> = 0.13], reflecting a reduced tendency to choose either color after a longer run of that color (*M* = 43.1, *SD* = 23.6) compared to a short run (*M* = 50.9, *SD* = 9.5). This represents a typical Gambler's fallacy pattern. Treatment group did not moderate this effect [Run Length × Treatment: *F*(2, 55) = 1.07, NS; Treatment: *F*(2, 55) = 0.74, NS].
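To make the dependent measure concrete, the sketch below shows one way the choice data could be scored: for each prediction, find the length of the run of identical outcomes that preceded it, and compute how often the participant chose the color of that run. The data layout (parallel lists of outcomes and predictions), the function names, and the expression of the result as a percentage are assumptions for illustration; the text does not state the scoring procedure or the units of the reported means.

```python
from typing import List, Optional, Sequence, Set


def run_lengths(outcomes: Sequence[str]) -> List[int]:
    """Length of the run of identical outcomes ending at each trial."""
    lengths: List[int] = []
    current = 0
    for i, colour in enumerate(outcomes):
        # Extend the run if the colour repeats, otherwise start a new run.
        current = current + 1 if i > 0 and colour == outcomes[i - 1] else 1
        lengths.append(current)
    return lengths


def pct_choosing_run_colour(outcomes: Sequence[str],
                            choices: Sequence[str],
                            lengths_of_interest: Set[int]) -> Optional[float]:
    """Percentage of predictions matching the colour of the preceding run.

    choices[i] is the prediction for trial i, made after outcomes[:i].
    Returns None if no run of the requested lengths occurred.
    """
    runs = run_lengths(outcomes)
    hits = total = 0
    for i in range(1, len(choices)):
        if runs[i - 1] in lengths_of_interest:
            total += 1
            hits += choices[i] == outcomes[i - 1]
    return 100.0 * hits / total if total else None
```

Under this scoring, a value below 50% after long runs (here, lengths 4–5) relative to short runs (lengths 1–2) corresponds to the negative recency pattern described above.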

#### *Hot hand belief*

Analysis of confidence ratings as a function of feedback streak showed a weak effect of Outcome [*F*(1, 55) = 3.34, *p* = 0.073, η<sub>p</sub><sup>2</sup> = 0.06], whereby confidence was higher after correct predictions compared to incorrect predictions, in line with the hot hand belief. The Streak Length × Outcome × Treatment interaction approached significance [*F*(2, 55) = 2.51, *p* = 0.091, η<sub>p</sub><sup>2</sup> = 0.08]. This effect was driven by the naltrexone group, in which a significant 3-way interaction [*F*(1, 35) = 5.41, *p* = 0.026, η<sub>p</sub><sup>2</sup> = 0.13] and a trend Outcome × Treatment interaction [*F*(1, 35) = 3.81, *p* = 0.059, η<sub>p</sub><sup>2</sup> = 0.10] were observed [haloperidol model: *F*(1, 37) = 2.37, NS]. Analysing winning and losing streaks separately in the naltrexone model, the Streak Length × Treatment interaction approached significance for wins [*F*(1, 35) = 3.43, *p* = 0.073, η<sub>p</sub><sup>2</sup> = 0.09], but not for losses [*F*(1, 35) = 1.91, NS], such that the naltrexone group showed a greater increase in confidence on longer winning streaks, compared to placebo (see **Figure 6**).

#### **DISCUSSION**

In this study, we assessed the effects of the opioid antagonist naltrexone and the dopamine D2-receptor antagonist haloperidol on two gambling tasks in male recreational gamblers. A slot machine task was used to deliver near-miss outcomes, elicit perceptions of control, and to measure physiological responses to winning outcomes. A roulette task was used to study the impact of outcome runs and feedback streaks on choice behavior and confidence ratings, respectively. Collapsing across the three treatment groups, both tasks were reasonably successful at inducing these gambling phenomena. On the slot machine task, the jackpot wins were rated as pleasurable and increased the motivation to play, and the winning outcomes were also associated with increased EDA and HR deceleration, relative to the non-wins. Comparing near-misses to full-misses, we confirmed our previous results on this task, that motivation ratings were higher after near-misses, and this effect depended on personal choice over the gamble (Clark et al., 2009, 2012a). The perceived chances of winning were also higher on participant-chosen trials than computer-chosen trials, consistent with an illusion of control. Near-misses were associated with increased EDA and rebound HR acceleration, as we have described previously (Clark et al., 2012a, 2013). On the roulette task, there was an expected Gambler's Fallacy effect, such that the choice of either color decreased after long runs of that color (i.e., negative recency) (replicating Ayton and Fischer, 2004). There was also a weaker effect of increased confidence after wins compared to losses, consistent with the "Hot Hand" belief (Ayton and Fischer, 2004).

In terms of the pharmacological effects, several differences were observed between the treatment groups, although generally, these were not in line with our predictions. The three groups were demographically matched and did not differ significantly on impulsivity, a relevant personality trait, or level of gambling involvement (PGSI) or trait gambling cognitions (GRCS). Prolactin levels increased in the naltrexone-treated group, but did not differ significantly between the haloperidol group and the placebo group. This implies that the single low dose of naltrexone (50 mg) was functionally active, but that the 2 mg haloperidol dose may not have been. Indeed, on the two gambling tasks, the majority of the detected group differences were between the naltrexone and placebo groups: the naltrexone group had a greater physiological response to winning outcomes on the slot machine task, in terms of EDA (significant) and HR deceleration (marginally significant). On the roulette task, the naltrexone group showed marginally higher confidence ratings after winning streaks compared to the placebo group, indicating a possible enhancement of the hot hand effect. At the same time, the motivational effect of the near-misses on participant-chosen trials was significantly attenuated in the naltrexone group. By contrast, in the haloperidol group, the only observed effect was a greater disparity in pleasantness ratings between the win and non-win outcomes (i.e., a treatment by outcome interaction). Neither group showed differences in the effect of personal control on the slot machine task, or the Gambler's Fallacy on the roulette task.

#### **EFFECTS OF NALTREXONE ON GAMBLING BEHAVIOR**

Based upon the reported clinical efficacy of naltrexone in the treatment of pathological gambling (Kim et al., 2001; Grant et al., 2006, 2008), our overarching hypothesis for the naltrexone group was that cognitive effects characteristic of excessive gambling would be ameliorated by naltrexone. In addition, we predicted that these participants would show blunted responses to wins (cf. Petrovic et al., 2008). Our data indicated that naltrexone did modulate the responsivity to wins, but in the opposite direction to that predicted: the naltrexone group displayed *higher* EDA following wins, and this hyper-reactivity was substantiated by a trend effect for HR deceleration. *Prima facie*, these results are difficult to reconcile with the substantial literature reporting that opioid blockade reduces reward processing in laboratory models (Drewnowski et al., 1995; Zhu et al., 2011; Langleben et al., 2012a), and reduces cravings and drug self-administration in groups with substance use disorders (Davidson et al., 1999; Drobes et al., 2004; Myrick et al., 2008; Langleben et al., 2012b; Miranda et al., 2013).

A number of methodological differences may be pertinent here, and may be useful to inform the design of future experiments. A key point is that our participants were recreational gamblers with modest levels of gambling involvement; it is possible that pathological gamblers may show a qualitatively different response to opioid blockade. Our decision to use recreational gamblers was based on several factors: the ease of recruitment to achieve sufficient group sizes, ethical considerations about the use of gambling simulations in individuals with disordered gambling, and evidence that gambling severity is dimensional (Toce-Gerstein et al., 2003). However, within the context of substance addictions (namely alcohol dependence), the response to naltrexone is known to vary as a function of genetics (the OPRM1 polymorphism) (Ray and Hutchison, 2007) and family history of alcoholism (Krishnan-Sarin et al., 2007). Family history of alcoholism is also a predictor of a positive treatment response to naltrexone in pathological gamblers (Grant et al., 2008). In the study by Krishnan-Sarin et al. (2007), while naltrexone acted to decrease drinks consumed in a laboratory test in heavy drinkers with a family history of alcoholism, naltrexone actually *increased* drinking in those who were family history negative, similar to the effects observed here. The authors speculated that this effect may have been linked to individual differences in kappa-opioid action, which increases alcohol consumption in a rodent model (Mitchell et al., 2005).

In the most comparable study to the present experiment, Petrovic et al. (2008) found reduced brain responses to winning outcomes following opioid blockade in healthy participants, coupled with greater activation to monetary losses. However, the Petrovic et al. (2008) study (and also Drewnowski et al., 1995) used naloxone rather than naltrexone, delivered intravenously rather than orally. The more rapid changes in brain concentrations associated with intravenous injection as opposed to oral dosing may cause divergent effects on behavior, as in the case of methylphenidate (Volkow and Swanson, 2003). Naltrexone may also exert partial agonist effects (Ignar et al., 2011), and along with naloxone and nalmefene, it is only moderately selective for the μ-opioid receptor, which may modify its effects on reward-seeking behavior (Giuliano et al., 2012). As a third notable difference, the majority of past work in clinical groups has employed either subchronic (e.g., 7-day) dosing (e.g., Davidson et al., 1999; Drobes et al., 2004; Myrick et al., 2008) or slow-release depot formulations (Langleben et al., 2012a). Compensatory effects can occur in single-dose designs; for example, single-dose citalopram treatment in healthy volunteers induced an impairment in reversal learning that was comparable to (rather than opposite to) effects observed in patients with major depression (Chamberlain et al., 2006). Nevertheless, the single administration of naltrexone used in the present study was seen to increase plasma prolactin levels, replicating Shaw and Al'Absi (2010). Given that prolactin release is inhibited by hypothalamic dopamine transmission (Freeman et al., 2000), a prolactin rise is presumed to reflect downstream dopamine blockade, indicative of overall opioid down-regulation.

In terms of the other gambling distortions under study, our findings were mixed. Consistent with the increased responsivity to wins, there was also an indication of enhanced confidence after winning streaks (i.e., increased "hot hand" effect). However, the motivational effects of near-miss outcomes were blunted in the naltrexone group. Given that the naltrexone effect on near-misses was restricted to a subjective rating ("continue to play") and did not generalize to the psychophysiological measures, this result should be treated with caution. Moreover, the naltrexone group did not differ from placebo on two cardinal gambling distortions, the Gambler's Fallacy (on the roulette task) and the illusion of control (the manipulation of personal choice on the slot machine task), despite the fact that these distortions were robustly elicited in the overall study group. Related to the possibility that pathological gamblers may show a distinctive response to naltrexone, it is also conceivable that pathological and recreational gamblers may differ in their responses to gambling effects like near-misses (Habib and Dixon, 2010) or illusory control (Orgaz et al., 2013).

#### **EFFECTS OF HALOPERIDOL ON GAMBLING BEHAVIOR**

Prior research has shown that the stimulation of dopamine transmission can induce (Voon et al., 2009) and exacerbate (Zack and Poulos, 2004) gambling tendencies, as well as specific distortions including the sense of agency (relevant to the illusion of control) (Moore et al., 2010) and the behavioral response to near-misses (Winstanley et al., 2011). There is some evidence that these effects are D2-receptor specific (Zack and Poulos, 2007; Weintraub et al., 2010; Winstanley et al., 2011). Based on the argument by Frank and O'Reilly (2006) that lower doses of dopamine D2 receptor antagonists act preferentially on presynaptic D2 auto-receptors to *increase* dopamine transmission (see also Zack and Poulos, 2007), we predicted that low dose haloperidol would enhance the reactivity to win and near-miss outcomes on the slot machine task, and increase the influence of personal choice. We found limited support for these predictions, and haloperidol showed few effects in this study. The only statistically significant difference from the placebo group was on the pleasantness ratings on the slot machine task, where the haloperidol group showed increased pleasantness ratings after wins and increased ratings of unpleasantness after non-win outcomes. This effect was not corroborated by any change in physiological reactivity under haloperidol. It should also be noted that collapsing across treatment groups, the pleasantness ratings varied significantly as a function of personal choice (i.e., an Outcome × Choice interaction), but no 3-way interaction was evident with treatment group. We infer that the haloperidol group may have been more extreme in their affective ratings, but that this may not constitute a genuine drug action.

Notably, the lack of any observed effect of haloperidol on prolactin levels raises the possibility that the 2 mg dosage may not have been functionally active. In the study by Frank and O'Reilly (2006), 2 mg haloperidol significantly increased prolactin levels in a crossover design. While we note that our post-dose plasma sample was obtained slightly earlier (at 2.5 h) than the expected peak (at 3 h in Darby et al., 1995; at 4 h in Frank and O'Reilly, 2006), we also observed no cardiovascular or mood effects, unlike past reports (Zack and Poulos, 2007; Pine et al., 2010). A number of other studies have employed low doses of haloperidol (1–3 mg) in 3-arm studies that have included a group treated with the dopamine precursor levodopa (Pessiglione et al., 2006; Pleger et al., 2009; Pine et al., 2010; Oei et al., 2012). These studies have generally succeeded in demonstrating linear effects (i.e., haloperidol < placebo < levodopa) on reinforcement-related parameters, although in several instances, the specific haloperidol vs. placebo contrast was either non-significant (Pine et al., 2010), or not reported (Pessiglione et al., 2006; Oei et al., 2012). Of course, based upon the argument of presynaptic upregulation, an intermediate dose may exist where the presynaptic and post-synaptic actions cancel each other out. It is also recognized that both phasic and tonic components of dopamine signaling are implicated in reward-driven behavior, and that a presynaptic manipulation may primarily affect phasic firing (Grace, 1991; Niv et al., 2007). Overall, we find limited evidence for functional effects of the 2 mg dose, and the absence of a significant prolactin response is particularly concerning; we recommend that future studies in healthy participants opt for higher doses (≥3 mg).

#### **LIMITATIONS AND CONCLUSION**

This study was the first to assess the effects of an opioid antagonist, naltrexone, and a dopamine D2-receptor antagonist, haloperidol, on gambling tendencies. The indications of increased gambling proclivity following naltrexone (increased physiological reactivity to wins, increasing confidence ratings on winning streaks) are at odds with the reported clinical efficacy of naltrexone in pathological gambling, although the non-clinical study population and single dose administration design necessarily limit any direct comparison. As a strength of the study, the two tasks were successful at inducing the key cognitive distortions of interest in the overall study group. While the group comparisons involved no correction for the multiple dependent variables (hence risk of Type I error), we sought to corroborate effects on behavioral measures and subjective ratings with the acquisition of event-related psychophysiology, which successfully demonstrated significant EDA and HR reactivity to wins and near-misses. We opted to use a 3-arm, parallel-groups design, because our tasks were not known to be suitable for repeated testing, although this decision had several consequences. First, the direct comparisons involved non-independent tests against the same placebo group, and some of the specific gambling effects (HR deceleration to wins, the hot hand effect) were not selectively evident in the placebo group. In addition, between-groups analysis limits any examination of individual differences in drug responses; for example whether dopamine or opioid effects varied with age or trait impulsivity (Zack and Poulos, 2009). As further limitations, we acknowledge that laboratory-based gambling simulations entail some compromises to ecological validity (Gainsbury and Blaszczynski, 2011). While our slot machine task delivered real monetary wins, which is important for establishing physiological arousal (Ladouceur et al., 2003), our tasks did not involve a variable wager. With regard to the limited effects of haloperidol on the gambling tasks, we highlight the nonsignificant change in prolactin as an indication that our low dose may not have achieved functional effectiveness, and as such, the null effects for haloperidol on the gambling tasks may say little about the relevance of dopamine signaling pathways to the neurobiology of gambling or the treatment potential of dopaminergic medications. However, the observed actions of naltrexone substantiate the relevance of opioid transmission to human decision-making and reinforcement processing, with treatment implications for a range of addictive and impulse control-related disorders.

#### **ACKNOWLEDGMENTS**

The study was supported by research grants from the Royal Society (RG62223; 2010/R2), the University of Cambridge Isaac Newton Trust (10.44) and the Medical Research Council (G1100554). Roseline Porchet received a PhD studentship from the Behavioral and Clinical Neuroscience Institute (director TW Robbins), supported by a consortium award from the Medical Research Council and the Wellcome Trust. The study was run at the Wellcome Trust Clinical Research Facility at Addenbrooke's Hospital, Cambridge, and we are grateful to the nursing staff at the WTCRF for their assistance.

#### **REFERENCES**

dopamine function in pathological gambling. *Psychol. Med.* 27, 473–475. doi: 10.1017/S0033291796003789


midbrain response to near-miss outcomes. *J. Neurosci.* 30, 6180–6187. doi: 10.1523/JNEUROSCI.5758-09.2010


Gambling near-misses enhance motivation to gamble and recruit win-related brain circuitry. *Neuron* 61, 481–490. doi: 10.1016/j.neuron.2008.12.031


in Parkinson's disease–a review of the literature. *Mov. Disord.* 26, 1976–1984. doi: 10.1002/mds.23821


seeking and binge-like eating. *Neuropsychopharmacology* 37, 2643–2652. doi: 10.1038/npp.2012.128


reactivity and sensitivity: an initial randomized trial. *Addict. Biol.* doi: 10.1111/adb.12050. [Epub ahead of print].


antipsychotic drugs in patients with schizophrenia. *Br. J. Psychiatry* 179, 503–508. doi: 10.1192/bjp.179.6.503


transmission in the human striatum during monetary reward tasks. *J. Neurosci.* 24, 4105–4112. doi: 10.1523/JNEUROSCI.4643-03.2004

Zhu, J., Spencer, T. J., Liu-Chen, L. Y., Biederman, J., and Bhide, P. G. (2011). Methylphenidate and mu opioid receptor interactions: a pharmacological target for prevention of stimulant abuse. *Neuropharmacology* 61, 283–292. doi: 10.1016/j.neuropharm.2011.04.015

**Conflict of Interest Statement:** Luke Clark is a consultant for Cambridge Cognition Ltd. Ulrich Müller has been a consultant for Janssen Cilag and Eli-Lilly; he has received travel expenses and honoraria from the British Association for Psychopharmacology (BAP), Janssen Cilag, Eli-Lilly, Shire, UCB Pharma and the UK Adult ADHD Network (UKAAN) for talks at scientific and educational meetings. The other co-authors have no financial disclosures.

*Received: 10 July 2013; accepted: 16 September 2013; published online: 07 October 2013.*

*Citation: Porchet RI, Boekhoudt L, Studer B, Gandamaneni PK, Rani N, Binnamangala S, Müller U and Clark L (2013) Opioidergic and dopaminergic manipulation of gambling tendencies: a preliminary study in male recreational gamblers. Front. Behav. Neurosci. 7:138. doi: 10.3389/fnbeh.2013.00138*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2013 Porchet, Boekhoudt, Studer, Gandamaneni, Rani, Binnamangala, Müller and Clark. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Opioid and dopamine mediation of gambling responses in recreational gamblers

#### *Martin Zack\**

*Centre for Addiction and Mental Health, Neuroscience Research, Toronto, ON, Canada \*Correspondence: martin.zack@camh.ca*

#### *Edited by:*

*Rainer Spanagel, Central Institute of Mental Health, Germany*

**Keywords: gambling, opioid, dopamine, cognition, motivation, emotion**

#### **A commentary on**

**Opioidergic and dopaminergic manipulation of gambling tendencies: a preliminary study in male recreational gamblers** *by Porchet, R., Boekhoudt, L., Studer, B., Gandamaneni, K., Rani, N., Binnamangala, S., et al. (2013). Front. Behav. Neurosci. 7:138. doi: 10.3389/fnbeh.2013.00138*

Cognitions play an important role in addictive behavior. This may be especially true for "behavioral addictions," like pathological gambling, where reinforcement derives from environmental events whose value is, for the most part, learned. The study by Porchet and colleagues examines the roles of dopamine and the endogenous opioids in response to tasks designed to evoke gambling-related cognitive distortions in recreational gamblers. The investigators report that the dopamine D2 receptor antagonist, haloperidol, had little effect on subjective responses to near-misses (outcomes that closely approximate wins) but slightly enhanced physiological response to these stimuli. In contrast, the mixed opioid receptor antagonist, naltrexone, increased physiological reactivity to these stimuli and also increased subjective confidence to predict future outcomes following a winning streak on a roulette task. The findings for haloperidol are consistent with the increased physiological response and lack of subjective effects of this drug on response to gambling activity previously seen in healthy individuals. The findings for naltrexone are counterintuitive, given that naltrexone and the opioid antagonist nalmefene have proven effective in curbing urges to gamble in pathological gamblers. Although not entirely predicted, the results confirm that, like drugs of abuse, gambling activity reliably engages the dopamine and opioid systems. Together with other evidence, they also indirectly suggest that recreational gamblers may respond differently to drug manipulations than pathological gamblers due to functional differences in the brains of these two populations. Whereas the effects in recreational gamblers reflect a perturbation from homeostatic baseline function, the increase in dopamine cell firing induced by haloperidol and increase in stress axis responding induced by naltrexone may act to restore or mitigate deviations from normal brain function that represent the new baseline or "allostatic" brain state of the pathological gambler. Replication of this experiment in pathological gamblers would be a valuable complement to this important study.

*Received: 17 September 2013; accepted: 26 September 2013; published online: 14 October 2013.*

*Citation: Zack M (2013) Opioid and dopamine mediation of gambling responses in recreational gamblers. Front. Behav. Neurosci. 7:147. doi: 10.3389/fnbeh.2013.00147*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2013 Zack. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Endogenous cortisol levels are associated with an imbalanced striatal sensitivity to monetary versus non-monetary cues in pathological gamblers

#### **Yansong Li<sup>1,2</sup>\*, Guillaume Sescousse<sup>1</sup>† and Jean-Claude Dreher<sup>1,2</sup>**

<sup>1</sup> Reward and Decision Making Team, Centre de Neurosciences Cognitives, CNRS, UMR 5229, Lyon, France <sup>2</sup> Neuroscience Department, Université Claude Bernard Lyon 1, Lyon, France

#### **Edited by:**

Mike James Ferrar Robinson, Wesleyan University, USA

#### **Reviewed by:**

Guido Van Wingen, Academic Medical Center Amsterdam, Netherlands Eve Limbrick-Oldfield, University of Cambridge, UK

#### **\*Correspondence:**

Yansong Li, Reward and Decision Making Team, Centre de Neurosciences Cognitives, CNRS, UMR 5229, 67 Boulevard Pinel, 69675, Lyon, France e-mail: yansong.li@isc.cnrs.fr

#### **†Present address:**

Guillaume Sescousse, Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen, Nijmegen, Netherlands

Pathological gambling is a behavioral addiction characterized by a chronic failure to resist the urge to gamble. It shares many similarities with drug addiction. Glucocorticoid hormones including cortisol are thought to play a key role in the vulnerability to addictive behaviors, by acting on the mesolimbic reward pathway. Based on our previous report of an imbalanced sensitivity to monetary versus non-monetary incentives in the ventral striatum of pathological gamblers (PGs), we investigated whether this imbalance was mediated by individual differences in endogenous cortisol levels. We used functional magnetic resonance imaging (fMRI) and examined the relationship between cortisol levels and the neural responses to monetary versus non-monetary cues, while PGs and healthy controls were engaged in an incentive delay task manipulating both monetary and erotic rewards. We found a positive correlation between cortisol levels and ventral striatal responses to monetary versus erotic cues in PGs, but not in healthy controls. This indicates that the ventral striatum is a key region where cortisol modulates incentive motivation for gambling versus non-gambling related stimuli in PGs. Our results extend the proposed role of glucocorticoid hormones in drug addiction to behavioral addiction, and help understand the impact of cortisol on reward incentive processing in PGs.

**Keywords: cortisol, reward, pathological gambling, fMRI, ventral striatum, addiction, incentive, glucocorticoid hormones**

#### **INTRODUCTION**

Glucocorticoid hormones (cortisol in humans and corticosterone in rodents) are produced by the adrenal cortex after the hypothalamic-pituitary-adrenal (HPA) axis is stimulated by psychologically or physiologically arousing stimuli (Sapolsky et al., 2000; Herman et al., 2005; Ulrich-Lai and Herman, 2009). These hormones have essential roles in normal physiological processes, such as acting on anti-stress and anti-inflammatory pathways, and, by doing so, have wide-ranging effects on behavior. Over the past few years, the potential role of glucocorticoid hormones in mental disorders has gained increased attention (Meewisse et al., 2007; Wingenfeld and Wolf, 2011). In particular, in the search for risk factors for drug addiction, increasing evidence points to an interaction between HPA functioning and drug exposure (Stephens and Wand, 2012). For example, a positive correlation between glucocorticoid levels and self-administration of psychostimulants has been observed in rodents (Goeders and Guerin, 1996; Deroche et al., 1997). In addition, drug administration produces stress-like cortisol responses (Broadbear et al., 2004) and similarly, acute administration of cortisol promotes cocaine craving in cocaine-dependent individuals (Elman et al., 2003). These findings not only point to the link between glucocorticoid hormones and addiction (Lovallo, 2006), but also emphasize the need to develop integrative theories explaining the mechanisms by which they affect addictive behavior.

Animal and human neuroimaging studies have demonstrated that addiction involves altered functioning of the mesolimbic reward system (Koob and Le Moal, 2008; Koob and Volkow, 2010; Schultz, 2011). Another line of research has shown that altered HPA response is associated with changes in dopaminergic regulation (Oswald and Wand, 2004; Alexander et al., 2011) and that glucocorticoid hormones have modulatory effects on dopamine release in the mesolimbic pathway, especially in the nucleus accumbens (NAcc; Oswald et al., 2005; Wand et al., 2007). Building on this evidence, it has been proposed that glucocorticoid hormones have facilitatory effects on behavioral responses to drugs of abuse, and that these effects are implemented *via* action on the mesolimbic reward system (Marinelli and Piazza, 2002; de Jong and de Kloet, 2004). Furthermore, on the basis of the incentive sensitization theory stating that the mesolimbic reward system mediates addiction-related cue hypersensitivity (Robinson and Berridge, 1993; Vezina, 2004, 2007; Robinson and Berridge, 2008), it has been proposed that glucocorticoid hormones contribute to drug addiction by modulating this neural system directly (Goodman, 2008; Vinson and Brennan, 2013).

Pathological gambling is a behavioral addiction characterized by compulsive gambling behavior and loss of control, which has gained much attention recently (van Holst et al., 2010; Conversano et al., 2012; Achab et al., 2013; Clark and Limbrick-Oldfield, 2013; Petry et al., 2013; Potenza, 2013). Since pathological gambling behavior shares many similarities with drug addiction in terms of clinical phenomenology (e.g., craving, tolerance, compulsive use, or withdrawal symptoms), heritability, and neurobiological profile (Potenza, 2006, 2008; Petry, 2007; Wareham and Potenza, 2010; Leeman and Potenza, 2012), it may be similarly under the influence of glucocorticoid hormones. However, little is known about the interaction between glucocorticoid hormones and incentive reward processing in pathological gambling. In the present study, we examined how endogenous cortisol modulates the processing of monetary and non-monetary cues in PGs. To achieve this goal, we re-analyzed previously published data using an incentive delay task manipulating both monetary and erotic rewards in PGs and healthy controls (Sescousse et al., 2013), and performed further correlation analyses between basal cortisol levels and neural responses. Based on the role of glucocorticoid hormones in drug addiction, we expected endogenous cortisol levels to be associated with neural responses to addiction-related cues versus non-addiction-related cues. Specifically, since our previously published analysis found a differential response to monetary versus erotic cues in the ventral striatum of gamblers (Sescousse et al., 2013), we expected that higher cortisol levels would be associated with an increased differential response in anticipation of monetary versus erotic rewards in PGs.

#### **MATERIALS AND METHODS**

#### **SUBJECTS**

We evaluated 20 healthy control subjects and 20 PGs. All were right-handed heterosexual males. We chose to study only men because men generally respond more to visual sexual stimuli than women (Hamann et al., 2004; Rupp and Wallen, 2008) and because there is a higher prevalence of pathological gambling among men than among women (Blanco et al., 2006; Kessler et al., 2008). The dataset from these subjects has already been used in our published functional magnetic resonance imaging (fMRI) study aiming at comparing primary and secondary rewards in healthy controls and pathological gamblers (PGs; Sescousse et al., 2013). Our current analysis focuses specifically on the relationship with cortisol levels and is therefore entirely original. As described in Sescousse et al. (2013), our published analysis excluded data from two PGs, due to technical problems with the task presentation in one case, and due to a highly inconsistent behavior in terms of hedonic ratings throughout the task in the other case. In the current analysis, we further discarded the data from one pathological gambler, because of a failure in successfully collecting blood samples. Therefore, the results reported are based on 20 healthy control subjects and 17 PGs. All subjects gave written informed consent to participate in the experiment. The study was approved by the local ethics committee (Centre Léon Bérard, Lyon, France).

Subjects underwent a semi-structured interview (Nurnberger et al., 1994) performed by a psychiatrist. All PGs met the DSM-IV-TR [Diagnostic and Statistical Manual of Mental Disorders (fourth edition, text revision)] criteria for pathological gambling diagnosis. Patients had a minimum score of 5 on the South Oaks Gambling Screen questionnaire (SOGS; range: 5–14) (Lesieur and Blume, 1987). Importantly, all were active gamblers, and none were under therapy or treatment of any type. Healthy control subjects had a score of 0 on the SOGS questionnaire, except one subject who had a score of 1. In both groups, a history of major depressive disorder or substance abuse/dependence (except nicotine dependence) in the past year was considered an exclusion criterion. All other DSM-IV-TR axis I disorders were excluded based on lifetime diagnosis.

We used a number of questionnaires to assess our subjects. The Fagerstrom Test for Nicotine Dependence (FTND; Heatherton et al., 1991) measured their nicotine dependence severity; the Alcohol Use Disorders Identification Test (AUDIT; Saunders et al., 1993) was employed to estimate their alcohol consumption; the Hospital Anxiety and Depression scale (HAD; Zigmond and Snaith, 1983) was used to evaluate current depressive and anxiety symptoms; and finally the Sexual Arousability Inventory (SAI; Hoon and Chambless, 1998) was used to assess their sexual arousal. Both groups were matched on age, nicotine dependence, education, alcohol consumption, and depressive symptoms (**Table 1**). PGs scored slightly higher on the anxiety subscale of the HAD questionnaire. Importantly, the two groups did not differ on income level and sexual arousability (**Table 1**), thereby ensuring a comparable motivation across groups for monetary and erotic rewards.

To assess the subjects' motivation for money, we asked them about the frequency with which they would pick up a 0.20€ coin from the street on a scale from 1 to 5 (Tobler et al., 2007) and matched the two groups based on this criterion (**Table 1**). To ensure that all subjects would be in a similar state of motivation to see erotic stimuli, we asked them to avoid any sexual contact during a period of 24 h before the scanning session. Finally, we also sought to enhance the motivation for money by telling subjects that the financial compensation for their participation would add up the winnings accumulated in one of the three runs. For ethical reasons, however, and unbeknownst to the subjects, they all received a fixed amount of cash at the end of the experiment.

All subjects were medication-free and instructed not to use any substance of abuse other than cigarettes on the day of the scan.

#### **EXPERIMENTAL TASK**

We used an incentive delay task with both erotic and monetary rewards (**Figure 1A**). The total number of trials was 171. Each of them consisted of two phases: reward anticipation and reward outcome. During anticipation, subjects saw one of 12 cues announcing the type (monetary/erotic), probability (25/50/75%) and intensity (low/high) of an upcoming reward. An additional control cue was associated with a null reward probability. After a variable delay period (question mark representing a pseudorandom draw), subjects were asked to perform a visual discrimination task. If they answered correctly within less than 1 s, they were then allowed to view the outcome of the pseudorandom draw. In rewarded trials, the outcome was either an erotic image (with high or low erotic content) or the picture of a safe mentioning the amount of money won (high [10/11/12€] or low [1/2/3€]). Following each reward outcome, subjects were asked to provide a hedonic rating on a 1–9 continuous scale (1 = very little pleased; 9 = very highly pleased). In non-rewarded and control trials, subjects were presented with "scrambled" pictures. A fixation cross was finally used as an inter-trial interval of variable length.

#### **Table 1 | Demographic and clinical characteristics of PGs and healthy controls**.

SAI, Sexual Arousability Inventory; AUDIT, Alcohol Use Disorders Identification Test; FTND, Fagerström Test for Nicotine Dependence; HADS, Hospital Anxiety and Depression Scale; SOGS, South Oaks Gambling Screen. Groups were compared using independent sample t-tests for normally distributed variables, and with Mann-Whitney U-tests for non-normally distributed variables.

#### **STIMULI**

Two categories (high and low intensity) of erotic pictures and monetary gains were used. Nudity being the main criterion driving the reward value of erotic stimuli, we separated them into a "low intensity" group displaying females in underwear or bathing suits and a "high intensity" group displaying naked females in an inviting posture. Each erotic picture was presented only once during the course of the task to avoid habituation. A similar element of surprise was introduced for monetary rewards by randomly varying the amounts at stake (low amounts: 1, 2, or 3€; high amounts: 10, 11, or 12€). The pictures displayed in non-rewarded and control trials were scrambled versions of the pictures used in rewarded trials and hence contained the same information in terms of chromaticity and luminance.

#### **PLASMA CORTISOL MEASUREMENTS**

In order to minimize the effect of circadian hormone rhythms, we conducted all fMRI sessions between 8:50 and 11:45 AM. Just prior to the scanning session, blood samples were collected (mean time, 9:24 AM ± 0.27 min) to measure the levels of plasma cortisol for each subject. Cortisol concentrations were measured by radioimmunoassay using an antiserum raised in rabbit immunized with cortisol 3-O (carboxy-methyl oxime) bovine serum albumin conjugate, <sup>125</sup>I cortisol as tracer and buffer containing 8-anilino-1-naphthalene sulfonic acid (ANS) for cortisol-corticosteroid-binding globulin dissociation. The procedure was as follows. 100 µL of <sup>125</sup>I cortisol (10000 dpm) was mixed with the standard or the sample (10 µL), buffer (500 µL) and 100 µL of antiserum solution. Samples were incubated for 45 min at 37°C and 1 h at 4°C. Bound and free cortisol were separated by addition of a mixture of PEG and anti-rabbit gamma globulin. After centrifugation, the radioactivity of the supernatant, containing the cortisol bound to antibody, was counted in a gamma counter. The intra- and inter-assay coefficients of variation were less than 3.5 and 5.0% respectively at 300 nmol/L cortisol level. This method has been validated by gas chromatography/mass spectrometry measurements (Chazot et al., 1984).

#### **FUNCTIONAL MAGNETIC RESONANCE IMAGING (fMRI) DATA ACQUISITION**

Imaging was conducted on a 1.5 T Siemens Sonata scanner, using an eight-channel head coil. The scanning session was divided into three runs. Each of them included four repetitions of each cue, with the exception of the control condition, repeated nine times. This yielded a total of 171 trials. Within each run, the order of the different conditions was pseudorandomized and optimized to improve signal deconvolution. The order of the runs was counterbalanced between subjects. Before scanning, all subjects were given oral instructions and familiarized with the cognitive task in a short training session. Each of the three functional runs consisted of 296 volumes. Twenty-six interleaved slices parallel to the anterior commissure–posterior commissure line were acquired per volume (field of view, 220 mm; matrix, 64 × 64; voxel size, 3.4 × 3.4 × 4 mm; gap, 0.4 mm), using a gradient-echo echo-planar imaging (EPI) T2\*-weighted sequence (repetition time, 2500 ms; echo time, 60 ms; flip angle, 90°). To improve the local field homogeneity and hence minimize susceptibility artifacts in the orbitofrontal area, a manual shimming was performed within a rectangular region including the orbitofrontal cortex (OFC) and the basal ganglia. A high-resolution T1-weighted structural scan was subsequently acquired in each subject.
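As a quick consistency check on the design figures quoted above, the following minimal sketch recomputes the trial count and the nominal duration of one functional run from the stated numbers. The variable names are illustrative and the run duration ignores any scanner overhead, so it should be read as an approximation rather than as part of the original protocol.

```python
# Design constants taken from the task and acquisition descriptions.
N_REWARD_CUES = 12       # 2 reward types x 3 probabilities x 2 intensities
REPS_PER_CUE = 4         # repetitions of each reward cue per run
CONTROL_REPS = 9         # repetitions of the control cue per run
N_RUNS = 3
VOLUMES_PER_RUN = 296
TR_SECONDS = 2.5         # repetition time of 2500 ms

# Trials in one run: all reward cues plus the control cue.
trials_per_run = N_REWARD_CUES * REPS_PER_CUE + CONTROL_REPS
total_trials = trials_per_run * N_RUNS

# Nominal acquisition time of one run (volumes x repetition time).
run_duration_s = VOLUMES_PER_RUN * TR_SECONDS

print(trials_per_run, total_trials, run_duration_s)
```

With these inputs the count comes to 57 trials per run and 171 trials overall, matching the total given in the text, and each run lasts about 740 s of acquisition.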

#### **FUNCTIONAL MAGNETIC RESONANCE IMAGING (fMRI) DATA ANALYSIS**

The analysis of the data was conducted using Statistical Parametric Mapping (SPM2). The pre-processing procedure included the deletion of the first four functional volumes of each run, slice-timing correction for the remaining volumes and spatial realignment to the first image of each time series. Subsequently, we used the tsdiffana utility<sup>1</sup> to search for residual artifacts in the time series and modeled them with dummy regressors in our general linear model. Then, the functional images were normalized to the Montreal Neurological Institute (MNI) stereotaxic space using the EPI template of SPM2 and spatially smoothed with a 10 mm full-width at half-maximum isotropic Gaussian kernel. Anatomical scans were normalized to the MNI space using the icbm152 template brain and averaged across the subjects. The averaged anatomical image was used as a template to display the functional activations.

**FIGURE 1 | Incentive delay task and behavioral results. (A)** Subjects first saw a cue informing them about the type (pictogram), intensity (size of pictogram) and probability (pie chart) of an upcoming reward. Three cases are represented here: a 75% chance of receiving a large amount of money (left), a 25% chance of seeing a low erotic content picture (middle) and a sure chance of getting nothing (control trials, right). Then the cue was replaced by a question mark, symbolizing a delay period during which a pseudorandom draw was performed according to the announced probability. Following this anticipation phase, participants had to perform a target discrimination task within <1 s. The target was either a triangle (left button press required) or a square (right button press required). Both their performance and the result of the pseudorandom draw determined the nature of the outcome. In rewarded trials, subjects saw a monetary amount displayed on a safe (high or low amount, left) or an erotic picture (with high or low erotic content, middle), and had to provide a hedonic rating on a continuous scale. In non-rewarded and control trials, subjects saw a scrambled picture (right). **(B)** Plot of mean reaction times according to reward type (monetary/erotic) and group (healthy controls/gamblers) in the discrimination task. There is a significant interaction between group and reward type, driven by slower reaction times for erotic compared to monetary cues in gamblers. Error bars indicate SEM. Asterisks denote significance of Tukey's HSD tests (\*\* p < 0.01).

Following the preprocessing step, the functional data from each subject were subjected to an event-related statistical analysis. Responses to monetary and erotic cues were modeled separately with 2.5 s box-car functions time-locked to the onset of the cue. For each cue, two orthogonal parametric regressors were added to account for the trial-to-trial variations in reward probability and intensity. The control condition was modeled in a separate regressor. Outcome-related responses were modeled as events time-locked to the appearance of the reward. The two rewards (monetary/erotic) and two possible outcomes (rewarded/non-rewarded) were modeled as four separate conditions. Two covariates linearly modeling the probability and the ratings were further added to each rewarded condition, while another covariate modeling the probability was added to each of the non-rewarded conditions. A last regressor modeled the appearance of a scrambled picture in the control condition. All regressors were subsequently convolved with the canonical hemodynamic response function and entered in a first-level analysis. A high-pass filter with a cut-off of 128 s was applied to the time series. Contrast images were calculated based on the parameter estimates output by the general linear model, and were then entered into a second-level group analysis.
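The orthogonality of the probability and intensity modulators can be illustrated with a small sketch. It assumes that, for one reward type, the six cues fully cross three probabilities with two intensities and that each modulator is mean-centered. The 0/1 coding of intensity and the helper names are hypothetical choices made for illustration, not the SPM2 settings used in the study.

```python
from itertools import product

probabilities = [0.25, 0.50, 0.75]
intensities = [0.0, 1.0]  # hypothetical coding: 0 = low, 1 = high

# One entry per cue of a given reward type (3 probabilities x 2 intensities).
cues = list(product(probabilities, intensities))

def mean_center(values):
    """Subtract the mean so the modulator is independent of the main regressor."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

prob_modulator = mean_center([p for p, _ in cues])
intensity_modulator = mean_center([i for _, i in cues])

# A zero dot product means the two modulators are orthogonal across cues.
dot_product = sum(a * b for a, b in zip(prob_modulator, intensity_modulator))
print(dot_product)
```

Because every cue is repeated equally often within a run, the same balance holds across trials, so neither modulator can absorb variance that belongs to the other.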

Second-level analyses focused on the anticipation phase. First, we examined the contrast "monetary > erotic cue" in gamblers minus control subjects. This contrast was thresholded using a cluster-wise family-wise error (FWE) corrected *p* < 0.05. Then, based on our hypothesis, we investigated the relationship between basal cortisol levels and the differential brain response to monetary versus erotic cues. This correlation was computed separately for each group, and was then compared between groups. Based on our *a priori* hypotheses regarding the role of the ventral striatum in attributing incentive salience to reward cues, we used a small volume correction (SVC) based on 7 mm radius spheres centered around the peak voxels reported in a recent meta-analysis on reward processing (*x*, *y*, *z* = 12, 10, −6; *x*, *y*, *z* = −10, 8, −4) (Liu et al., 2011). We used a cluster-wise FWE corrected threshold of *p* ≤ 0.05. To further describe the patterns of activation, we used the EasyROI toolbox to extract the parameter estimates from significant clusters in the ventral striatum.
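The geometry of the small volume described above can be sketched as a simple membership test. This is a minimal illustration assuming coordinates in MNI millimetres and a volume formed by the union of the two 7 mm spheres; the function name is hypothetical, and the actual correction was performed within SPM rather than with code like this.

```python
# Peak coordinates (MNI, mm) and sphere radius quoted in the text.
PEAKS_MM = [(12, 10, -6), (-10, 8, -4)]
RADIUS_MM = 7.0

def in_small_volume(coord_mm, peaks=PEAKS_MM, radius=RADIUS_MM):
    """Return True if an MNI coordinate lies within `radius` mm of any peak."""
    x, y, z = coord_mm
    # Compare squared distances to avoid an unnecessary square root.
    return any(
        (x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2 <= radius ** 2
        for px, py, pz in peaks
    )

# Usage: a voxel 7 mm from the right-hemisphere peak is on the boundary
# and counts as inside, while a voxel 8 mm away is outside.
print(in_small_volume((12, 17, -6)), in_small_volume((12, 18, -6)))
```

Restricting the search to such a volume reduces the number of voxels over which the family-wise error correction is applied.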

#### **RESULTS**

#### **HORMONAL DATA**

No significant differences between PGs and healthy control subjects were observed in basal cortisol levels (PGs: mean = 511.59, SD = 137.46; Healthy controls: mean = 588.7, SD = 121.61; *t*(35) = −1.81, *p* > 0.05). This is consistent with findings from recent studies reporting no difference in basal cortisol levels between recreational gamblers and PGs (Franco et al., 2010; Paris et al., 2010a,b). In addition, we performed a correlation analysis between cortisol levels and gambling symptom severity in PGs as indexed by the SOGS scale. Our result did not reveal a significant correlation between these variables (*r* = −0.35, *p* = 0.17).
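The group comparison can be checked from the summary statistics alone. The sketch below assumes a pooled-variance independent-samples t-test, which is consistent with the reported 35 degrees of freedom, and takes the group sizes (17 PGs, 20 controls) from the Subjects section. It also converts the reported correlation into its equivalent t statistic. The function names are illustrative only.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

def r_to_t(r, n):
    """t statistic equivalent to a Pearson correlation r with n observations."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2), df

# PGs versus healthy controls, basal cortisol.
t_groups, df_groups = pooled_t(511.59, 137.46, 17, 588.7, 121.61, 20)

# Cortisol versus SOGS score within the 17 PGs.
t_corr, df_corr = r_to_t(-0.35, 17)

print(round(t_groups, 2), df_groups)
print(round(t_corr, 2), df_corr)
```

Under these assumptions the first line reproduces the reported *t*(35) = −1.81, and the correlation corresponds to a t of about −1.45 on 15 degrees of freedom.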

<sup>1</sup>http://imaging.mrc-cbu.cam.ac.uk/imaging/215DataDiagnostics

#### **BEHAVIOR**

In our previous study (Sescousse et al., 2013), the main behavioral finding was a group × reward type interaction in the reaction time data, reflecting a weaker motivation for erotic compared with monetary rewards in gamblers. Given that one subject was discarded from our current analysis due to a failure to collect hormonal data, we performed this analysis again without this subject. The previous group × reward type interaction remained significant without this subject (*F*(1,35) = 7.85, *p* < 0.01). In addition, Tukey's *post-hoc t*-tests confirmed that the interaction was due to slower reaction times for erotic (mean = 547.54, SD = 17.22) compared with monetary rewards (mean = 522.91, SD = 14.29) in gamblers relative to healthy controls (*p* < 0.01) (**Figure 1B**). However, there was no significant correlation between basal cortisol levels and the performance on the discrimination task in either group.

#### **BRAIN-CORTISOL CORRELATION**

Our previously published analysis revealed a group × reward type interaction in the ventral striatum, reflecting a larger differential response to monetary versus erotic cues in PGs compared with controls (Sescousse et al., 2013). In our current analysis, the results of the group × reward type interaction were still significant after removing the discarded subject (*x*, *y*, *z* = −9, 0, 3, *T* = 4.11; 18, 0, 0, *T* = 3.88; *p*(SVC) < 0.05, FWE). The present analysis focused on how this differential response relates to endogenous cortisol levels. Between-subject correlation analyses revealed a positive relationship between cortisol levels and BOLD responses to monetary versus erotic cues in the ventral striatum of gamblers (*x*, *y*, *z* = 3, 6, −6, *T* = 4.76, *p*(SVC) < 0.05, FWE; **Figure 2A**), but no such relationship in healthy controls. The direct comparison between groups was also significant (*x*, *y*, *z* = −3, 6, −6, *T* = 3.10, *p*(SVC) ≤ 0.05, FWE; **Figure 2B**). We additionally examined whether cortisol levels were correlated with brain activity elicited by each reward cue separately, as compared to the control cue. This analysis did not reveal any significant correlation in the ventral striatum in either group (at *p* < 0.001 uncorrected).

#### **DISCUSSION**

To the best of our knowledge, this is the first study exploring the relationship between cortisol levels and brain activation during an incentive delay task in PGs. In line with our *a priori* hypothesis, we observed that higher endogenous cortisol levels were associated with an increased differential neural response to monetary versus erotic cues in the ventral striatum of gamblers as compared to healthy controls. This indicates a specific role of cortisol in biasing gamblers' motivation towards monetary relative to non-monetary cues. Thus, cortisol may contribute to the addictive process in PGs by enhancing the saliency of gambling-related cues over other stimuli. Because enhanced incentive salience of gambling-related cues in PGs triggers gambling urges, this supports a link between cortisol and PGs' motivation to pursue monetary rewards.

One potential mechanism through which cortisol might act to influence cue-elicited brain activity is glucocorticoid receptors in the NAcc. It has been shown that glucocorticoid hormones act on the brain through binding with two main intracellular receptors: the mineralocorticoid receptor (MR) and the glucocorticoid receptor. Glucocorticoid hormones play a fundamental role in reward-related behavior *via* their influence on mesolimbic dopamine circuitry and the NAcc in particular. For example, animal evidence shows that glucocorticoid hormones facilitate dopamine transmission in the NAcc shell through glucocorticoid receptors (Marinelli and Piazza, 2002). Microdialysis studies reported that corticosterone has stimulant effects on dopamine transmission in the NAcc (Piazza et al., 1996). Furthermore, infusion of glucocorticoid receptor antagonists has an inhibitory effect on drug-induced dopamine release in the NAcc (Marinelli et al., 1998). In line with these findings in animals, human studies found evidence that cortisol levels were positively associated with amphetamine-induced dopamine release in the ventral striatum (Oswald et al., 2005).

It is important to note that we did not observe differences in basal cortisol levels between PGs and controls. Although this finding is in agreement with previous reports showing no difference in basal cortisol levels between PGs and recreational gamblers (Meyer et al., 2004; Paris et al., 2010a,b), it does not imply that there is no HPA dysfunction in PGs. Indeed, while most previous studies investigating cortisol levels in PGs have focused on HPA responses to stress-inducing cues, such as gambling cues (Ramirez et al., 1988; Meyer et al., 2000; Franco et al., 2010), in the current study we measured baseline cortisol and its relationship with striatal activations. Moreover, other factors, such as the time of day when blood or saliva is collected for cortisol level assessment, need to be considered because there is known endogenous diurnal variation in cortisol levels, which may vary between PGs and healthy controls or recreational gamblers. In particular, PGs may have a greater cortisol rise following waking than do recreational gamblers (Wohl et al., 2008).

Another important aspect to consider is that although cortisol is frequently used as a biomarker of psychological stress, a linear relationship between cortisol and other measures of HPA-related endocrine signals does not necessarily exist (Hellhammer et al., 2009). Moreover, the absence of a relationship between reward-related activity and basal cortisol levels in healthy controls is consistent with the variable effects of both acute stress and cortisol levels observed in the neuroimaging literature on reward processing in healthy individuals. For example, a recent study reported that stress reduces NAcc activation in response to reward cues, but that cortisol suppresses this relationship, as high cortisol was related to stronger NAcc activation in response to reward (Oei et al., 2014). Another study reported that acute stress decreased the response of the dorsal (not ventral) striatum and OFC to monetary outcomes (Porcelli et al., 2012), while no difference was observed in the NAcc between a stress group and control group using an emotion-induction procedure (Ossewaarde et al., 2011). Together, the evidence from fMRI studies indicates non-trivial relationships between stress, cortisol levels and brain activation and suggests that stress and cortisol may play distinct mediating roles in modulating sensitivity to potentially rewarding stimuli through the ventral striatum.

Several limitations of the present study need to be considered. First, only male PGs were involved in the current study. It remains unclear whether our current findings would extend to female gamblers. This is an important question because sex differences exist in several aspects of gambling activity (Tschibelu and Elman, 2010; Grant et al., 2012; González-Ortega et al., 2013; van den Bos et al., 2013). Moreover, the modulatory effect of a number of hormonal factors on cognitive functioning varies between sexes (Kivlighan et al., 2005; Reilly, 2012; Vest and Pike, 2013). The current study only included men because they are generally more responsive to visual sexual stimuli than women (Stevens and Hamann, 2012; Wehrum et al., 2013) and show an elevated risk for gambling problems or severity of gambling compared to women (Toneatto and Nguyen, 2007; Wong et al., 2013). Second, we cannot make causal inferences regarding the effects of cortisol on neural responses because our results are based on correlational analyses. A pharmacological design with external cortisol administration compared to a placebo condition would be needed to assess the causal role of cortisol in gambling addiction. Despite these limitations, we believe that our current findings provide a foundation for further research on the interaction between cortisol and brain responses to incentive cues.

#### **CONCLUSIONS**

We have found that, in PGs, endogenous cortisol levels are associated with a differential activation of the ventral striatum in response to gambling-related incentives relative to nongambling-related incentives. Our results point to the importance of integrating endocrinology with a cognitive neuroscience approach to elucidate the neural mechanisms underlying maladaptive gambling behavior. Finally, this study may have important implications for further research investigating the role of cortisol on vulnerability to develop behavioral addictions such as pathological gambling.

#### **ACKNOWLEDGMENTS**

This work was performed within the framework of the LABEX ANR-11-LABEX-0042 of Université de Lyon, within the program "Investissements d'Avenir" (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR). Yansong Li was supported by a PhD fellowship from Pari Mutuel Urbain (PMU). Guillaume Sescousse was funded by a scholarship from the French Ministry of Research and the Medical Research Foundation. We thank P. Domenech and G. Barbalat for clinical assessment of PGs. We thank Dr. I. Obeso for helpful revision on the manuscript and the staff of CERMEP–Imagerie du Vivant for helpful assistance with data collection.

#### **REFERENCES**




Petry, N. M. (2007). Gambling and substance use disorders: current status and future directions. *Am. J. Addict.* 16, 1–9. doi: 10.1080/10550490601077668


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 November 2013; accepted: 25 February 2014; published online: 25 March 2014.*

*Citation: Li Y, Sescousse G and Dreher J-C (2014) Endogenous cortisol levels are associated with an imbalanced striatal sensitivity to monetary versus non-monetary cues in pathological gamblers. Front. Behav. Neurosci. 8:83. doi: 10.3389/fnbeh.2014.00083 This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Li, Sescousse and Dreher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Salivary cortisol and alpha-amylase levels during an assessment procedure correlate differently with risk-taking measures in male and female police recruits

#### *Ruud van den Bos<sup>1</sup>\*, Ruben Taris<sup>2</sup>, Bianca Scheppink<sup>2</sup>, Lydia de Haan<sup>3</sup> and Joris C. Verster<sup>3,4</sup>*

*<sup>1</sup> Department of Organismal Animal Physiology, Radboud University Nijmegen, Nijmegen, Netherlands*

*<sup>2</sup> Police Academy, Recruitment and Selection, Apeldoorn, Netherlands*

*<sup>3</sup> Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands*

*<sup>4</sup> Centre for Human Psychopharmacology, Swinburne University of Technology, Melbourne, Australia*

#### *Edited by:*

*Paul Vezina, The University of Chicago, USA*

#### *Reviewed by:*

*Kelly Lambert, Randolph-Macon College, USA; Jessica Weafer, University of Chicago, USA*

#### *\*Correspondence:*

*Ruud van den Bos, Department of Organismal Animal Physiology, Faculty of Science, Radboud University Nijmegen, Heyendaalseweg 135, NL-6524 AJ Nijmegen, Netherlands e-mail: ruudvandenbos1@gmail.com; ruudvdbos@science.ru.nl*

Recent laboratory studies have shown that men display more risk-taking behavior in decision-making tasks following stress, whilst women are more risk-aversive or become more task-focused. In addition, these studies have shown that sex differences are related to levels of the stress hormone cortisol (indicative of activation of the hypothalamus-pituitary-adrenocortical-axis): the higher the levels of cortisol the more risk-taking behavior is shown by men, whereas women generally display more risk-aversive or task-focused behavior following higher levels of cortisol. Here, we assessed whether such relationships hold outside the laboratory, correlating levels of cortisol obtained during a job-related assessment procedure with decision-making parameters in the Cambridge Gambling Task (CGT) in male and female police recruits. The CGT allows for discriminating different aspects of reward-based decision-making. In addition, we correlated levels of alpha-amylase [indicative of activation of the sympatho-adrenomedullary-axis (SAM)] and decision-making parameters. In line with earlier studies men and women only differed in risk-adjustment in the CGT. Salivary cortisol levels correlated positively and strongly with risk-taking measures in men, which was significantly different from the weak negative correlation in women. In contrast, and less strongly so, salivary alpha-amylase levels correlated positively with risk-taking in women, which was significantly different from the weak negative correlation with risk-taking in men. Collectively, these data support and extend data of earlier studies indicating that risky decision-making in men and women is differently affected by stress hormones. The data are briefly discussed in relation to the effects of stress on gambling.

#### **Keywords: cortisol, alpha-amylase, decision-making, Cambridge Gambling Task, sex, humans**

#### **INTRODUCTION**

Recently we have reviewed whether sex differences are present in the occurrence and development of disordered gambling (van den Bos et al., 2013a); an area of research still poorly studied (see also van den Bos et al., 2013b). Among others, stress may promote gambling episodes in men and women (Tschibelu and Elman, 2011), and, in addition, may (be expected to) affect gambling behavior as stress has been shown to disrupt reward-based decision-making under laboratory conditions (review: Starcke and Brand, 2012). In particular, studies encompassing both sexes have shown that men display more risk-taking behavior following stress, whilst women are more risk-aversive or become more task-focused (Preston et al., 2007; Lighthall et al., 2009; van den Bos et al., 2009; Mather and Lighthall, 2012). In addition, it has been found that the higher the levels of cortisol [indicative of activation of the hypothalamic-pituitary-adrenal cortex (HPA) axis] the more risk-taking behavior men show (van den Bos et al., 2009), while in general women show more risk-aversive or task-focused behavior (Lighthall et al., 2009; van den Bos et al., 2009). A recent study in men has shown that activation of the sympathetic nervous system [releasing catecholamines, i.e., (nor)adrenaline] is associated with decreased risk-taking, while this study confirmed that cortisol is associated with increased risk-taking (Pabst et al., 2013).

While laboratory data obtained with standardized protocols, such as the Trier Social Stress Test, begin to reveal the relationship between sex, neuro-endocrine status and decision-making, they may not be indicative of the effects occurring in real life, where currently circulating levels of cortisol and catecholamines, related to earlier events, context and time of the day, may determine the outcome of decision-making (see for discussion: van den Bos et al., 2013a,c). Next to understanding the relationship to activities such as gambling, this knowledge may also be of relevance for decision-making behavior in the military, police force, financial business or health care, where decisions often have to be made under highly stressful conditions. When wrong decisions are taken due to changes in risk-perception under stress they may have a highly negative personal, financial and societal impact (Taylor et al., 2007; LeBlanc et al., 2008; LeBlanc, 2009; Arora et al., 2010; Akinola and Mendes, 2012). Therefore, given the limited body of current knowledge, and in order to assess the effects of circulating levels of cortisol and catecholamines on risk-taking, we correlated spontaneously occurring variation in stress hormones during a job-assessment procedure in male and female police recruits with reward-based decision-making parameters in the Cambridge Gambling Task (CGT) (Rogers et al., 1999). Thus, we chose to conduct the study in an applied setting to assess whether laboratory findings would hold under real-life conditions.

The CGT allows for discriminating different aspects of reward-based decision-making, such as risk-taking, impulsivity and risk-adjustment (e.g., Rogers et al., 1999; Deakin et al., 2004; Newcombe et al., 2011; van den Bos et al., 2012). Male and female subjects performed the CGT during their assessment for the Master of Criminal Investigation at the Police Academy. This assessment is generally considered to be stressful by candidates. Thus, rather than using a laboratory set-up with a separate stress group and control group, we used spontaneously occurring variation in levels of salivary cortisol (activation of the HPA-axis; review: Foley and Kirschbaum, 2010) and alpha-amylase [activation of the sympatho-adrenomedullary (SAM) axis; review: Nater and Rohleder, 2009] to correlate physiological changes and behavior. We predicted that the higher the current levels of salivary cortisol in men, the more risk-taking behavior they would display, while in women the opposite effect was expected (in line with Lighthall et al., 2009; van den Bos et al., 2009). As no data exist regarding sex differences for current salivary alpha-amylase levels and risk-taking behavior, no specific predictions were made for these correlations.

#### **MATERIALS AND METHODS**

#### **SUBJECTS AND PROCEDURE**

Physically and psychologically healthy men [*n* = 49; age (mean ± *SD*): 28.5 ± 5.4 years; range 22–43 years] and women (*n* = 34; age: 26.7 ± 4.1 years; range 22–37 years; Student *t*-test; *t* = 1.516, *df* = 81, *p* = 0.133) were recruited from subjects who applied for the Master of Criminal Investigation. All subjects signed an informed consent before participating in this study. The study was performed in accordance with the ethical standards as formulated in the 1964 Declaration of Helsinki [*The Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans* http://www.wma.net/en/30publications/10policies/b3/index.html].

Candidates were subjected to a two-day assessment at the Police Academy (Apeldoorn, Netherlands) containing a series of physical tests (day 1) and psychological tests (day 2). Only candidates who passed the physical tests enrolled into the second day of psychological tests. The psychological tests encompassed cognitive ability tests, a personality inventory, a psychological interview and a job-related simulation [Fact Finding Decision-Making (FFDM) task]. For logistic reasons inherent to the assessment procedure at the Police Academy the order of tests varied between subjects. Therefore, we scheduled the CGT to follow the FFDM task for each candidate, such that each candidate had the same test immediately before the CGT.

To determine daytime cortisol and alpha-amylase levels in saliva, samples were collected using Salivettes® Cortisol (Sarstedt, Nümbrecht, Germany) at four moments during the assessment procedure according to the procedures and recommendations of the manufacturer: (1) when subjects arrived early in the morning (8:15–8:45 AM), (2) directly before the start of the FFDM task (8:45 AM, 10:15 AM, or 2:15 PM), (3) after the FFDM task, which lasted 1 h 45 min, i.e., directly *before* the CGT (10:30 AM, 12:15 PM, or 4:00 PM), and (4) *after* the CGT (11:00 AM, 1:00 PM, or 4:30 PM; see below). In cases where subjects started with the FFDM task as their first assignment of the day, saliva samples 1 and 2 coincided. As only levels *before* (3) and *after* (4) the CGT are of relevance for the present paper, only these values will be reported here. We chose to obtain levels of salivary cortisol and alpha-amylase *before* and *after* the CGT to optimize correlations between these levels and task-performance. It should be noted that the CGT by itself is not a stress-inducing task.

#### **CAMBRIDGE GAMBLING TASK**

The CGT was developed to assess different aspects of decision-making (Rogers et al., 1999). Detailed information on the task and procedure can be found in the manual of the CGT (www.cantab.com) and earlier published papers (Rogers et al., 1999; Deakin et al., 2004; Newcombe et al., 2011; van den Bos et al., 2012). In brief, in each trial the subject is presented with an array of 10 red and blue boxes. The subject must guess if a yellow token is hidden in a red or blue box by touching one of two rectangles, with the word "red" or "blue," on the screen. The ratio of red to blue boxes varies from trial to trial. Some trials have highly favorable odds (e.g., nine blue boxes/one red box), while others have less favorable odds (e.g., six blue boxes/four red boxes). In the gambling stages the subjects start with 100 points. Subjects can select a proportion of these points (5, 25, 50, 75, or 95%), displayed in an ascending or a descending order, to bet on whether the yellow token is hidden in a blue or red box. In the ascending order subjects start with the option to gamble 5% of their credit points on their choice (blue or red) after which percentages increase (as indicated above; about 2 s delay between options) until subjects press the button on the screen, which is then taken as their choice for this trial. In the descending order subjects start with the option to gamble 95% of their credit points on their choice (blue or red) after which percentages decrease (as indicated above; about 2 s delay between options) until subjects press the button on the screen, which is then taken as their choice for this trial.

The task contains five stages. The first stage is a decision-making stage. Subjects have to choose whether the token is hidden in a blue or red box (four trials). The second stage is a gambling training stage (ascending order; four trials). Subjects have to choose whether the token is hidden in a blue or red box and then select the amount they wish to bet, both by touching the screen. The third stage is a gambling test stage (ascending order; four series of nine trials). The fourth stage is a gambling training stage (descending order; four trials). The fifth stage is a gambling test stage (descending order; four series of nine trials). The subjects must try to accumulate as many points as possible. Whether subjects start with the ascending order followed by the descending order or the other way round is randomized across test-subjects. The task takes 20–25 min to complete.

The following measures are extracted: (1) *Quality of decision-making (QDM)*: a measure which reflects the ability of subjects to judge the likelihood of events (cognition), i.e., it measures the proportion of trials on which the subject chose to gamble on the more likely outcome. The higher the value the more appropriately subjects behave according to the situation. (2) *Overall proportion bet (OPB)* and *Risk taking (Likely Proportion Bet; LPB)*: both parameters are measures of risk tolerance, i.e., the higher the value the more subjects tolerate risks. *OPB* measures the average proportion of the current points total that the subject chose to risk on each gamble test trial, including trials on which they bet on the less likely outcome. However, differences may exist regarding betting behavior on likely or unlikely options. For instance, subjects may bet a lower amount of credit points when choosing an unlikely option than a likely option. Therefore, the CGT also includes a second parameter, which is labeled *Risk taking* in the manual, but will be labeled *LPB* here to stay in line with the previous parameter. This measure reports the mean proportion of the current points total that the subject chose to risk on gamble test trials for which they had chosen the more likely outcome, i.e., trials on which they had a higher chance of winning than losing. *OPB* approaches *LPB* when subjects hardly ever choose the unlikely option, i.e., in such a case the two are highly correlated (van den Bos et al., 2012). In line with our earlier studies (van den Bos et al., 2012) we used both measures. (3) *Deliberation time (DT)* and *Delay Aversion (DA)*: two measures which may reflect impulsivity. *DT* is the mean latency from presentation of the colored boxes to the subject's choice of which color to bet on. The higher the value the longer subjects take to decide. This parameter measures reflection impulsivity although the CGT is not a task in which delay increases the information available. Subjects who are unable/unwilling to wait will bet larger amounts when they are presented in descending order than in ascending order. This is reflected in *DA*, which is calculated as the difference between the risk-taking score in the descending condition and the ascending condition. This measure reflects DA, but may also reflect motor impulsivity. The higher the value the more impulsive subjects are or the more they avoid delays. (4) *Risk adjustment (RA)*: the ability to adjust betting behavior according to the likelihood of winning (interaction cognition-reward), i.e., subjects will gamble more of their current points when the odds are strongly in favor of them. A low RA score could be interpreted as a failure to use the available information when making a decision. This measure reflects the tendency to bet a higher proportion of points on trials when the large majority of the boxes are of the color chosen (e.g., 9:1) than when a small majority of the boxes are of the color chosen (e.g., 6:4). This RA score was calculated as the degree to which the risk differed across the ratios, as a proportion of the overall amount risked by that subject: RA = [2 × (% bet at 9:1) + (% bet at 8:2) − (% bet at 7:3) − 2 × (% bet at 6:4)]/average % bet. An RA score of approximately zero reflects no systematic tendency to take differential risks across the ratios, whereas a high positive score indicates a tendency to bet a larger proportion of the available points on the higher ratio (9:1 and 8:2) trials than on the lower ratio (7:3 and 6:4) trials.
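
As an illustration only, the RA and DA definitions above can be written out as a short Python sketch. This is not the CANTAB implementation: the function names and example values are hypothetical, and the sketch assumes that "average % bet" is the unweighted mean of the proportions bet at the four box ratios.

```python
# Illustrative sketch of the RA and DA definitions given in the text.
# Assumptions (not from the CGT manual): function names, example values, and
# that "average % bet" is the unweighted mean over the four box ratios.

RATIOS = ("9:1", "8:2", "7:3", "6:4")


def risk_adjustment(bet_by_ratio):
    """RA = [2*b(9:1) + b(8:2) - b(7:3) - 2*b(6:4)] / average bet.

    bet_by_ratio maps each box ratio to the mean proportion of the current
    points (0-1) that the subject bet on trials with that ratio.
    """
    b91, b82, b73, b64 = (bet_by_ratio[ratio] for ratio in RATIOS)
    average_bet = (b91 + b82 + b73 + b64) / 4
    return (2 * b91 + b82 - b73 - 2 * b64) / average_bet


def delay_aversion(risk_descending, risk_ascending):
    """DA = risk-taking score (descending order) minus score (ascending order)."""
    return risk_descending - risk_ascending


# Hypothetical subjects: one bets the same at every ratio, one scales bets with the odds.
flat_bettor = dict.fromkeys(RATIOS, 0.5)
graded_bettor = {"9:1": 0.8, "8:2": 0.6, "7:3": 0.4, "6:4": 0.2}

print(risk_adjustment(flat_bettor))
print(round(risk_adjustment(graded_bettor), 2))
print(round(delay_aversion(0.65, 0.50), 2))
```

In this sketch the flat bettor obtains an RA of zero (no differential risk-taking across ratios), whereas the graded bettor obtains a clearly positive RA of about 2.8, matching the interpretation of the score given above.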

#### **PHYSIOLOGICAL MEASUREMENTS**

Saliva samples were stored at −20 °C directly following collection and remained at this temperature for a maximum period of 4 months until processing at the Specieel Laboratorium Endocrinologie (UMCU, Utrecht, Netherlands).

Cortisol in saliva was measured without extraction using an in-house competitive radio-immunoassay employing a polyclonal anti-cortisol antibody (K7348). [1,2-3H(N)]-Hydrocortisone (PerkinElmer NET396250UC) was used as a tracer. The lower limit of detection was 1.0 nmol/l and inter-assay variation was < 6% at 4–29 nmol/l (*n* = 33). Intra-assay variation was < 4% (*n* = 10). Samples with levels > 100 nmol/l were diluted 10× with assay buffer.

Alpha-amylase in saliva was measured on a Beckman-Coulter AU5811 chemistry analyzer (Beckman-Coulter Inc., Brea, CA). Saliva samples were diluted 1000× with 0.2% BSA in 0.01 M phosphate buffer pH 7.0. Inter-assay variation was 3.6% at 200,000 U/l (*n* = 10).

Although cortisol and alpha-amylase levels may differ between women who use oral contraceptives and those who do not, and cortisol levels vary across the menstrual cycle (Foley and Kirschbaum, 2010), we did not take these differences into account here as we were interested in the effects of the current levels of cortisol and alpha-amylase on decision-making behavior (see also van den Bos et al., 2009; de Visser et al., 2010). However, the number of male and female subjects was counterbalanced across morning and afternoon periods to account for differences in morning and afternoon values (Nater et al., 2007).

#### **STATISTICAL ANALYSIS**

All statistical analyses were carried out using SPSS 16.0 for Windows or the Vasserstats website (www.vasserstats.net) where needed. Tests are indicated in the Results section. Significance (two-tailed) was set at *p* ≤ 0.05; *p*-values > 0.05 and ≤ 0.10 were considered trends, while *p*-values > 0.10 were considered non-significant (NS).

#### **RESULTS**

#### **CAMBRIDGE GAMBLING TASK**

No differences were found between men and women for choosing the most likely option [QDM: men vs. women (mean ± *SD*): 0.96 ± 0.06 vs. 0.95 ± 0.06; Student *t*-test, NS], for risk-taking measures [OPB: 0.53 ± 0.09 vs. 0.54 ± 0.11 (Student *t*-test, NS); LPB: 0.58 ± 0.10 vs. 0.58 ± 0.11 (Student *t*-test, NS)] and for impulsivity measures [DT: 2019.6 ± 1132.8 ms vs. 1749.8 ± 565.2 ms (Student *t*-test, NS); DA: 0.14 ± 0.12 vs. 0.19 ± 0.16 (Student *t*-test, NS)]. Only risk-adjustment differed significantly between men and women (1.82 ± 0.80 vs. 1.46 ± 0.74; Student *t*-test: *t* = 2.098, *df* = 81, *p* = 0.039). As subjects chose the most likely option often (QDM ≥ 0.95) it should be noted that *OPB* and *LPB* are virtually identical. These measures were strongly correlated in men and women: men: *r* = 0.975, *n* = 49, *p* < 0.001; women: *r* = 0.979, *n* = 34, *p* < 0.001.

#### **SALIVARY CORTISOL AND ALPHA-AMYLASE**

**Table 1A** shows the levels of salivary cortisol and alpha-amylase *before* the CGT at the different time-points across the day, while **Table 1B** shows the levels of salivary cortisol and alpha-amylase *after* the CGT at the different time-points across the day. While cortisol levels decreased across time-points in both cases [*before*: two-way ANOVA; time-points: *F*(2, 77) = 6.552, *p* = 0.002; *after*: *F*(2, 77) = 6.345, *p* = 0.003], no differences were found between men and women [*before*: sex: *F*(1, 77) = 0.801, NS; sex × time-points: *F*(2, 77) = 0.612, NS; *after*: sex: *F*(1, 77) = 0.011, NS; sex × time-points: *F*(2, 77) = 1.186, NS]. In both cases no differences were observed for time-points or sex for alpha-amylase levels (*before*: *F* values < 0.671, *p*-values > 0.415; *after*: *F* values < 1.566, *p*-values > 0.215).

#### **CORRELATION BETWEEN CGT PARAMETERS AND SALIVARY CORTISOL AS WELL AS ALPHA-AMYLASE**

In both men and women cortisol as well as alpha-amylase levels *before* and *after* the CGT were highly correlated: men, cortisol: *r* = 0.971, *n* = 49, *p* < 0.001; women, cortisol: *r* = 0.953, *n* = 34, *p* < 0.001; men, alpha-amylase: *r* = 0.716, *n* = 49, *p* < 0.001; women, alpha-amylase: *r* = 0.926, *n* = 34, *p* < 0.001. To reduce the number of correlations we therefore decided to calculate the mean of the levels *before* and *after* the CGT to capture the average levels of salivary cortisol and alpha-amylase *during* the task and correlate these average levels with the CGT parameters.

**Figure 1A** shows the correlations between salivary cortisol levels and CGT measures. Salivary cortisol levels were positively and significantly correlated with LPB (*r* = 0.408, *n* = 49, *p* = 0.004) and OPB (*r* = 0.378, *n* = 49, *p* = 0.007) in men, which were significantly different from the negative, but non-significant, correlations in women (LPB: *r* = −0.241, *n* = 34, NS; Fisher *r*-to-*z*, *z* = 2.92, *p* = 0.004; OPB: *r* = −0.196, *n* = 34, NS; Fisher *r*-to-*z*, *z* = 2.57, *p* = 0.01). Cortisol levels in men tended to correlate negatively with RA (*r* = −0.271, *n* = 49, *p* = 0.06). No other significant differences or trends were found. It should be noted that the significant correlations in men remain even when we correct for the number of correlations (corrected threshold: *p* = 0.05/6 = 0.0083). In addition, we confirmed that the findings for LPB and OPB in men were not due to differences in levels of cortisol across time-points *per se* (see **Tables 1A,B**), as correlations remained significant following correction for differences in time-points: *before* CGT: no correction: OPB: *r* = 0.365, *df* = 47, *p* = 0.01; LPB: *r* = 0.395, *df* = 47, *p* = 0.005; with correction (partial correlations): OPB: *r* = 0.287, *df* = 46, *p* = 0.048; LPB: *r* = 0.329, *df* = 46, *p* = 0.023; *after* CGT: no correction: OPB: *r* = 0.387, *df* = 47, *p* = 0.006; LPB: *r* = 0.418, *df* = 47, *p* = 0.003; with correction (partial correlations): OPB: *r* = 0.314, *df* = 46, *p* = 0.030; LPB: *r* = 0.355, *df* = 46, *p* = 0.013.
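
For readers who wish to retrace the sex comparison of correlation coefficients, the following is a minimal Python sketch of the standard Fisher *r*-to-*z* test for two independent correlations. It is offered as an illustration under the assumption that this standard formula underlies the reported *z*-values; the function name is hypothetical and the code is not the routine actually used for the analysis.

```python
import math


def fisher_r_to_z(r1, n1, r2, n2):
    """Compare two independent Pearson correlations (standard Fisher r-to-z test).

    Returns the z statistic and the two-tailed p-value under the normal
    approximation. Hypothetical helper written for illustration.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


# Cortisol vs. LPB: men (r = 0.408, n = 49) against women (r = -0.241, n = 34).
z_lpb, p_lpb = fisher_r_to_z(0.408, 49, -0.241, 34)
# Cortisol vs. OPB: men (r = 0.378, n = 49) against women (r = -0.196, n = 34).
z_opb, p_opb = fisher_r_to_z(0.378, 49, -0.196, 34)

print(round(z_lpb, 2), round(z_opb, 2))
```

With the coefficients reported above, this sketch gives *z*-values of approximately 2.92 (LPB) and 2.57 (OPB), in agreement with the values in the text.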

**Figures 2A,B** show the significant correlations between salivary cortisol levels and LPB as well as OPB scores in men and the non-significant correlations in women. The panels show that risk-taking measures and cortisol levels were within the same range in men and women. The mean values of cortisol were not different between men and women (men vs. women; mean ± *SD*; nmol/l): 15.50 ± 6.20 vs. 15.24 ± 5.18 (Student *t*-test, NS).

**Figure 1B** shows the correlations between salivary alpha-amylase levels and CGT measures. In women, salivary alpha-amylase levels correlated positively and significantly with LPB (*r* = 0.336, *n* = 34, *p* = 0.05), while a trend was observed for the correlation with OPB (*r* = 0.324, *n* = 34, *p* = 0.06); these correlations were significantly different from the negative, but non-significant, correlations in men (LPB: *r* = −0.184, *n* = 49, NS; Fisher *r*-to-*z*, *z* = −2.31, *p* = 0.02; OPB: *r* = −0.178, *n* = 49, NS; Fisher *r*-to-*z*, *z* = −2.22, *p* = 0.03). Risk-adjustment tended to correlate negatively with alpha-amylase levels in women (*r* = −0.312, *n* = 34, *p* = 0.07), which tended to differ from the non-significant positive correlation in men (*r* = 0.112, *n* = 49, NS; Fisher *r*-to-*z*, *z* = 1.87, *p* = 0.06). No other significant differences or trends were found. It should be noted that the significant correlation in women disappears when we correct for the number of correlations (corrected threshold: *p* = 0.05/6 = 0.0083).
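
The correction for the number of correlations mentioned in the text (0.05/6 = 0.0083) corresponds to a Bonferroni-type threshold. A minimal Python sketch, using only *p*-values reported in this section and a hypothetical helper name, shows why the cortisol correlations in men survive the correction while the alpha-amylase correlation in women does not.

```python
ALPHA = 0.05
N_CORRELATIONS = 6  # six CGT parameters (QDM, LPB, OPB, DT, DA, RA) per hormone


def survives_correction(p_value, alpha=ALPHA, n_tests=N_CORRELATIONS):
    """True if p_value stays at or below the Bonferroni-type threshold alpha / n_tests."""
    return p_value <= alpha / n_tests


# p-values as reported in the text.
reported_p = {
    "men, cortisol vs. LPB": 0.004,
    "men, cortisol vs. OPB": 0.007,
    "women, alpha-amylase vs. LPB": 0.05,
}

for label, p_value in reported_p.items():
    print(label, survives_correction(p_value))
```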

**Table 1A | Salivary cortisol and alpha-amylase levels (mean ±** *SD***)** *before* **the CGT in men and women at different time-points during the day; number of subjects is indicated between brackets.**


**Table 1B | Salivary cortisol and alpha-amylase levels (mean ±** *SD***)** *after* **the CGT in men and women at different time-points during the day; number of subjects is indicated between brackets.**


the CGT and CGT parameters (x-axis). **(B)** Correlations (*r*-values; y-axis) between alpha-amylase levels *during* the CGT and CGT parameters (x-axis). *For both panels:* QDM, quality of decision-making; LPB, likely proportion bet; OPB, overall proportion bet; DT, deliberation time; DA, delay aversion; RA, risk-adjustment. Gray bars indicate significant differences between *r*-values of men and women (see text for details); asterisks indicate significant *r*-values (see text for details).

**Figures 2C,D** show the significant correlations between salivary alpha-amylase levels and LPB as well as OPB scores in women and the non-significant correlations in men. The panels show that risk-taking measures and alpha-amylase levels were within the same range in men and women. The mean values of alpha-amylase were not different between men and women (men vs. women; mean ± *SD*; U/l): 379,859 ± 219,974 vs. 324,397 ± 201,199 (Student *t*-test, NS).

A significant negative correlation was found between salivary cortisol and alpha-amylase levels in women (*r* = −0.394, *n* = 34, *p* = 0.02); this was not the case in men (*r* = −0.137, *n* = 49, NS). We therefore used multiple regression to assess whether the combination of the two explained more of the variance in risk-taking. This was not the case (not shown). Since it was observed earlier that in women curvilinear relationships may exist between cortisol and risk-taking (van den Bos et al., 2009), this possibility was also explored for cortisol and alpha-amylase and LPB as well as OPB scores. However, no such curvilinear relationships were found (not shown).

**Figures 2A,B** suggest that the risk-taking measures are lower in men than in women at the low end of cortisol levels, while the opposite is the case at the high end of cortisol levels. To capture this as well as to further underpin the correlations we calculated the quartiles for the cortisol values and assessed risk-taking measures according to these quartiles. We only compared the low end (quartile 1) and the high end values (quartile 4). **Table 2A** shows that no difference existed between men and women regarding the cortisol levels when quartiles for men and women were calculated. In contrast, risk-taking measures changed differently in men and women between the low and high end quartiles. While in men LPB and OPB increased significantly from quartile 1 to 4, in women they did not, in line with the correlations reported above. Furthermore, LPB and OPB values in women were higher than values in men at the low end, while the opposite was true at the high end of cortisol quartiles. In addition, alpha-amylase levels tended to be lower at the high end of the cortisol levels in men, but not in women.

**Figures 2C,D** suggest that the risk-taking measures are lower in women than in men at low levels of alpha-amylase, while the opposite is the case at high levels. To capture this, as well as to further underpin the correlations, we calculated the quartiles for the alpha-amylase values and assessed risk-taking measures according to these quartiles. We only compared the low end (quartile 1) and the high end (quartile 4). **Table 2B** indicates that women showed overall slightly lower alpha-amylase levels. Risk-taking measures changed differently in men and women between the low and high end quartiles. While in women LPB and OPB increased significantly, in men they did not, in line with the correlations reported above. Furthermore, LPB and OPB values in men were higher than those in women at the low end, while this was not the case at the high end of alpha-amylase levels. In addition, cortisol levels tended to be lower at the high end of the alpha-amylase quartiles in women, but not in men.
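The quartile-based comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name and the data are hypothetical, and in the study the procedure would be applied separately to men and women and to each hormone.

```python
from statistics import mean, quantiles

def extreme_quartile_means(hormone, score):
    """Mean score of subjects in the lowest (Q1) and highest (Q4) hormone quartile."""
    q1_cut, _, q3_cut = quantiles(hormone, n=4)
    low = [s for h, s in zip(hormone, score) if h <= q1_cut]
    high = [s for h, s in zip(hormone, score) if h > q3_cut]
    return mean(low), mean(high)

# Hypothetical data for illustration only (not values from the study).
hormone_levels = [1, 2, 3, 4, 5, 6, 7, 8]
bet_scores = [10, 20, 30, 40, 50, 60, 70, 80]
low_mean, high_mean = extreme_quartile_means(hormone_levels, bet_scores)
print(low_mean, high_mean)
```

Comparing only quartiles 1 and 4 discards the middle half of the sample, so it serves as a supporting check on the correlations and does not replace them.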

#### **DISCUSSION**

The aim of this study was to determine whether individual differences in current levels of salivary cortisol (activation of the HPA-axis) and/or alpha-amylase (activation of the SAM-axis) in an assessment procedure were related to differences in decision-making-related parameters in the CGT in men and women. The main findings of this study were that (1) men and women differed in risk-adjustment in the CGT, (2) cortisol levels correlated strongly positively with risk-taking measures in men, which was significantly different from the weak negative correlation in women, and (3) alpha-amylase levels correlated positively, but not strongly, with risk-taking in women, which was significantly different from the weak negative correlation with risk-taking in men. Collectively, these data support and extend data of earlier studies indicating that risky decision-making in men and women is differently affected by stress hormones (Lighthall et al., 2009; van den Bos et al., 2009).

#### **GENERAL**

Men and women only differed in risk-adjustment in the CGT. This difference between sexes matches the outcome of earlier studies (Deakin et al., 2004; van den Bos et al., 2012), indicating that this is a robust sex difference in decision-making (review: van den Bos et al., 2013b,c). As we did not include a control group we cannot address the question whether CGT parameters, for instance those related to risk-taking, were generally higher or lower in the job assessment group. However, earlier data of a group of subjects within the same age range (van den Bos et al., 2012) suggest that LPB and OPB scores were overall higher in the present study.

**Table 2A | Risk-taking parameters and salivary alpha-amylase levels (mean ±** *SD***) in men and women calculated according to cortisol-related quartiles (see text).**

*Red: values significantly different between men and women; Blue: values trend between men and women.*

**Table 2B | Risk-taking parameters and salivary cortisol levels (mean ±** *SD***) in men and women calculated according to alpha-amylase-related quartiles (see text).**

*Red: values significantly different between men and women; Blue: values trend between men and women.*

We did not assess levels of (psychological or subjective) stress experienced by our test subjects, as this was not the objective of this study. However, the assessment procedure is generally considered to be stressful by the candidates. As increased levels of subjective stress and increased levels of stress hormones co-occur (e.g., Starcke and Brand, 2012; van den Bos et al., 2013c), the levels of salivary cortisol and alpha-amylase that we observed here suggest that subjects may have been psychologically stressed: levels were above what may normally be found across the day (e.g., Nater et al., 2007; Nater and Rohleder, 2009; van den Bos et al., 2009; de Visser et al., 2010). Therefore, the discussion that follows should be considered against the background of possibly psychologically stressed subjects.

#### **CGT, CORTISOL, AND ALPHA-AMYLASE**

A striking finding was that while risk-taking measures and current salivary cortisol levels during the assessment procedure were not different between men and women, current salivary cortisol levels were strongly and positively correlated with risk-taking measures in men, which was significantly different from the non-significant negative correlation between current salivary cortisol levels and risk-taking parameters in women. These correlations and differences between sexes were supported by the analysis of differences in risk-taking parameters related to the low and high end of cortisol quartiles. In conjunction with the trend for a negative correlation with risk-adjustment, the data in men suggest that, with increasing HPA-axis activation, men increase their bets across the entire range of odds ratios without adjusting betting behavior according to the odds of winning. This increased risk-taking may be related to a cortisol-induced increase in reward-processing and decrease in punishment-processing (Putman et al., 2010; Mather and Lighthall, 2012).

An obvious limitation of our study is that we did not explicitly use a control and stress group as in laboratory studies to manipulate cortisol levels (Lighthall et al., 2009; van den Bos et al., 2009). Still, our data are in line with data obtained in the laboratory, where it has been shown, using a stress and control group, that higher levels of salivary cortisol are associated with higher levels of risk-taking behavior in men and with risk-aversive and/or task-focused behavior in women (Lighthall et al., 2009; van den Bos et al., 2009; Pabst et al., 2013). Thus, this study confirms and extends earlier reports and points to a general difference between sexes. Furthermore, these data add to the validity of laboratory studies by showing that differences in cortisol levels in daily life also affect the behavior of men and women differently. In contrast to an earlier study (van den Bos et al., 2009) we did not observe a curvilinear relationship between cortisol and task-performance in women. This may be related to differences between the (parameters of the) CGT and the Iowa Gambling Task or to the way stress was elicited (short-lasting Trier Social Stress Test vs. long-lasting assessment procedure).

A second striking finding, though less pronounced than the first, was that while current salivary alpha-amylase levels were not different between men and women, they were differently correlated with risk-taking measures in men and women: salivary alpha-amylase levels correlated positively with risk-taking in women, which was significantly different from the non-significant negative correlations with risk-taking in men. These correlations and differences between sexes were supported by the analysis of differences in risk-taking parameters related to the low and high end of the alpha-amylase quartiles. In conjunction with the trend for a negative correlation with risk-adjustment, the data in women suggest that, with increasing SAM-axis activation, women increase their bets across the entire range of odds ratios without adjusting betting behavior according to the odds of winning. Although measuring salivary alpha-amylase may be indicative of SAM-axis activation (Nater and Rohleder, 2009; but see Bosch et al., 2011 for critical remarks), the present results should be confirmed using other parameters indicative of SAM-axis activation such as heart rate and heart rate variability.

A recent study in men showed that an increase in SAM-axis activation was associated with a decrease in risk-taking behavior (Pabst et al., 2013). While we did not observe a clear-cut relation between SAM-axis activation and risk-taking in men here, the sign of the correlation was in the same direction as in the study by Pabst et al. (2013). Currently, no studies have examined SAM-axis activation in relation to reward-based decision-making in both men and women. These data thus await further confirmation in laboratory studies. However, one recent study clearly showed a difference between men and women regarding amygdala activation, emotional memory and noradrenaline (Schwabe et al., 2013), hinting at differences between men and women in the way SAM-axis activation may affect behavior.

It would be tempting to suggest from the present data that in men low levels of cortisol (low HPA-axis activation) and high levels of alpha-amylase (high SAM-axis activation) are associated with lower risk-taking levels than in women, while the opposite is the case for high levels of cortisol and low levels of alpha-amylase. Similarly, it would be tempting to suggest that in women low levels of cortisol (low HPA-axis activation) and high levels of alpha-amylase (high SAM-axis activation) are associated with higher risk-taking levels than in men, while the opposite is the case for high levels of cortisol and low levels of alpha-amylase. While we observed an inverse relationship between cortisol and alpha-amylase in women, the relationship in men was less strong and clear, although the analysis using quartiles did suggest such an inverse relationship. At present, this precludes drawing strong conclusions regarding the interplay of HPA-axis and SAM-axis activation as well as the role of differences in coping styles in men and women (for a discussion see van den Bos et al., 2013c). Thus, while the data do not allow for extensive speculation as yet, they do suggest differences in the effects of SAM-axis and HPA-axis activation on risk-taking behavior in men and women. Future studies should focus in more detail on differences in the interaction between HPA-axis and SAM-axis activation in men and women.

The present study extends data of previous studies, as the CGT also measures other aspects of decision-making. We did not observe any correlation of cortisol or alpha-amylase levels with other measures of decision-making, such as impulsivity as measured by *DT* (speed of decisions; reflective impulsivity) and delay-aversion (the inability to wait; motor impulsivity), or the ability to assess whether events are more or less likely to happen (QDM; cognition). It has been suggested that acute stress may increase the speed with which subjects make choices, indicative of a loss of top-down control (Keinan et al., 1987; Porcelli and Delgado, 2009). While we did observe that stress increased decision-making speed in women in our earlier study (van den Bos et al., 2009), this effect was independent of cortisol levels. In a delay discounting task, which measures aspects of impulsivity or levels of self-control, it was shown that low levels of salivary alpha-amylase correlate with high levels of impulsivity in men (Takahashi et al., 2007). These data seem in line with the weak correlation between alpha-amylase levels and risk-taking in men that we observed here. In another study it was shown that high and low impulsive male subjects did not differ in basal or gambling-induced increases in cortisol levels (Krueger et al., 2005), suggesting no direct relationship between impulsivity and cortisol, which is in line with the data observed here. Future studies should examine the relationship between speed of decision-making, different forms of impulsivity and stress in more detail.

#### **NEURONAL UNDERPINNINGS**

As to the underlying neural substrates, sex differences in the regulation of the balance between prefrontal areas and subcortical areas may underlie behavioral differences as we have recently discussed extensively elsewhere (van den Bos et al., 2013c; see also Wang et al., 2007). We refer therefore to this review for detailed information. Here, we only allude to general conclusions, especially related to the effects of cortisol as this has been studied in more detail than adrenergic effects (Schwabe et al., 2013). The increase in risk-taking behavior in men in reward-related decision-making under high levels of cortisol may be associated with a loss of top-down control of prefrontal (lateral orbitofrontal cortex and dorsolateral prefrontal cortex) over subcortical structures. Furthermore, within the limbic system high levels of cortisol may shift the balance of the activity of the ventral striatum (reward-related behavior) and amygdala (punishment-related behavior) toward the ventral striatum. In line with this, it was recently observed that systemic injections of corticosterone in male rats in a rodent analog of the Iowa Gambling Task disrupted decision-making performance, which was associated with changes in activity in prefrontal structures (Koot et al., 2013). As to the underlying neural substrate in women it seems that top-down control may actually be increased under stress, related to levels of cortisol, with among others a lower striatal and a stronger amygdala activity. It has been suggested that the persistent activity in, for instance, the anterior cingulate cortex following a stressful experience in women may be associated with the development of depressive symptoms in women related to tendencies of ruminative thinking. The menstrual cycle has a strong effect on the outcome of stress-related changes in neuronal activity (Goldstein et al., 2010; Ter Horst et al., 2013).
At present, changes in neuronal activity in women are less clear and straightforward than in men. By and large, however, these changes in women seem compatible with a shift toward risk-aversive behavior. Given the current lack of studies that have assessed the behavior of women in decision-making tasks, changes in decision-making behavior are better documented in men than in women. Clearly, there is a need for more studies measuring stress, stress hormones and decision-making behavior in men and women under the same conditions using fMRI to assess task-related changes in neuronal activity (Lighthall et al., 2011; Mather and Lighthall, 2012; Porcelli et al., 2012).

#### **IMPLICATIONS**

The data of this study add to the growing number of studies showing differences between men and women in task-performance encompassing emotional regulation (Cahill, 2006; van den Bos et al., 2012, 2013a,b,c). Related to gambling we have elsewhere discussed that more attention should be given to assessing sex differences in the tendency to engage in gambling and develop disordered gambling (van den Bos et al., 2013a). While stress may trigger gambling episodes, underlying reasons for this may be different, e.g., excitement in men vs. overcoming negative mood in women (van den Bos et al., 2013a). In addition, here we show that depending on neuro-endocrine status the consequences in men and women may be different when being involved in gambling episodes. It is clear that studies are needed to assess whether these neuro-endocrine differences also relate to patterns of problematic gambling behavior in real-life.

Finally, the data suggest that some individuals in the military, police force, financial business or health care, who may experience high levels of work-related stress throughout the day, may be at risk of making wrong decisions due to strong HPA-axis and/or SAM-axis induced changes in risk-perception (Taylor et al., 2007; LeBlanc et al., 2008; LeBlanc, 2009; Arora et al., 2010; Akinola and Mendes, 2012). Both high tendencies to take risks and high tendencies to avoid them may not be optimal for job fulfillment (van den Bos et al., 2013c). Given that police officers may have to make decisions at unexpected time-points during a potentially stressful day, the design of the study mimics this situation. Laboratory conditions may not adequately address such a dynamic situation. By doing so, our study revealed differences in patterns between men and women due to (long-term) activation of the HPA-axis and SAM-axis. These data may in turn lead to new laboratory designs for testing the effects of stress on decision-making.

#### **CONCLUSION**

In conclusion, the data of this study show that high levels of HPA-axis and SAM-axis activation may have different effects in men and women on risk-taking behavior. Future studies should concentrate on the underlying mechanisms of these sex differences.

#### **AUTHOR CONTRIBUTIONS**

Ruud van den Bos, Ruben Taris, Lydia de Haan, Joris C. Verster, and Bianca Scheppink designed the experiment. Bianca Scheppink and Ruben Taris conducted the research. Bianca Scheppink, Ruben Taris, and Ruud van den Bos analyzed the data. Ruud van den Bos, Ruben Taris, Bianca Scheppink, Lydia de Haan, and Joris C. Verster wrote the manuscript.

#### **ACKNOWLEDGMENTS**

The authors would like to acknowledge the financial support of the Police Academy (analyses of cortisol and alpha-amylase). The authors wish to thank Inge Maitimu from the Specieel Laboratorium Endocronologie of the Wilhelmina Children's Hospital of UMC Utrecht (Utrecht, Netherlands) for analysis of the cortisol and alpha-amylase samples. Furthermore, the authors would like to thank Dr. Judith Homberg for critically reading an earlier version of the manuscript.

#### **REFERENCES**


for monoaminergic mechanisms. *Neuropsychopharmacology* 20, 322–339. doi: 10.1016/S0893-133X(98)00091-8


emotion and cognitive control," in *Handbook on Psychology of Decision-making,* eds K. O. Moore and N. P. Gonzalez (Hauppage, NY: Nova Science Publisher Inc.), 179–198.


**Conflict of Interest Statement:** Joris C. Verster has received research support from Takeda Pharmaceuticals, Red Bull GmbH, and acted as consultant for Sanofi-Aventis, Transcept, Takeda, Sepracor, Red Bull GmbH, Deenox, Trimbos Institute, and CBD. Ruud van den Bos acts as consultant for Chardon Pharma. The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 October 2013; paper pending published: 21 November 2013; accepted: 19 December 2013; published online: 16 January 2014.*

*Citation: van den Bos R, Taris R, Scheppink B, de Haan L and Verster JC (2014) Salivary cortisol and alpha-amylase levels during an assessment procedure correlate differently with risk-taking measures in male and female police recruits. Front. Behav. Neurosci. 7:219. doi: 10.3389/fnbeh.2013.00219*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 van den Bos, Taris, Scheppink, de Haan and Verster. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The role of dopamine in risk taking: a specific look at Parkinson's disease and gambling

#### **Crystal A. Clark and Alain Dagher \***

Montreal Neurological Institute, McGill University, Montreal, QC, Canada

#### **Edited by:**

Paul Vezina, The University of Chicago, USA

#### **Reviewed by:**

Walter Adriani, Istituto Superiore di Sanita, Italy Andrew David Lawrence, Cardiff University, UK

#### **\*Correspondence:**

Alain Dagher, Montreal Neurological Institute, McGill University, 3801 University St., Montreal, QC H3A 2B4, Canada e-mail: alain.dagher@mcgill.ca

An influential model suggests that dopamine signals the difference between predicted and experienced reward. In this way, dopamine can act as a learning signal that can shape behaviors to maximize rewards and avoid punishments. Dopamine is also thought to invigorate reward seeking behavior. Loss of dopamine signaling is the major abnormality in Parkinson's disease. Dopamine agonists have been implicated in the occurrence of impulse control disorders in Parkinson's disease patients, the most common being pathological gambling, compulsive sexual behavior, and compulsive buying. Recently, a number of functional imaging studies investigating impulse control disorders in Parkinson's disease have been published. Here we review this literature, and attempt to place it within a decision-making framework in which potential gains and losses are evaluated to arrive at optimum choices. We also provide a hypothetical but still incomplete model on the effect of dopamine agonist treatment on these value and risk assessments. Two of the main brain structures thought to be involved in computing aspects of reward and loss are the ventral striatum (VStr) and the insula, both dopamine projection sites. Both structures are consistently implicated in functional brain imaging studies of pathological gambling in Parkinson's disease.

**Keywords: impulse control disorders, impulsivity, reward, loss aversion, insula, ventral striatum**

#### **GAMBLING AS A DISORDER OF REWARD AND PUNISHMENT PROCESSING**

Pathological gambling can be conceptualized as a disorder of reward and punishment processing, whereby the gambler selects an immediate but risky opportunity to obtain money over the larger, more probable opportunity to save money (Ochoa et al., 2013). Indeed, gambling is typically conceptualized as a disorder of impulsivity, in which decision-making is rash and relatively uninfluenced by future consequences. Pathological gamblers demonstrate increased impulsivity and increased delay discounting on laboratory measures (Verdejo-Garcia et al., 2008). The coupling of increased reward seeking behavior with insensitivity to negative consequences may explain the persistence of gambling in the face of overall monetary losses (Vitaro et al., 1999; Petry, 2001b; Cavedini et al., 2002). This conceptual framework is similar to that used in drug addiction, where seeking immediate gains while minimizing potential risks is ubiquitous. Hallmarks of addiction include cravings or compulsions, a loss of control, and continued engagement in behaviors that maintain the addiction despite repeated negative consequences (American Psychiatric Association, 2000). Similarly, pathological gambling can be referred to as a behavioral addiction because it shares many common features with drug addiction, such as compulsion and loss of control over one's behavior, as well as continuation of the behavior in the face of negative consequences (Grant et al., 2006; Goodman, 2008). Pathological gamblers exhibit uncontrollable cravings, tolerance, habituation, and withdrawal symptoms, similar to those of drug addicts (Wray and Dickerson, 1981; Castellani and Rugle, 1995; Duvarci and Varan, 2000; Potenza et al., 2003).
Moreover, both pathological gambling and substance abuse are associated with the same specific personality traits, namely sensation seeking and impulsivity (Zuckerman and Neeb, 1979; Castellani and Rugle, 1995), which index heightened arousal to potential rewards and reduced self-control and inhibitory function. The high comorbidity between substance dependence (drugs and alcohol) and pathological gambling (Petry, 2001a; Petry et al., 2005), and evidence for common genetic factors, point to the two disorders having overlapping etiologies (Slutske et al., 2000; Goodman, 2008).

One useful model views reward and punishment learning as inherent components in the decision-making process. Decision-making can be broken down to the weighing of the probability and value of reward against potential costs (e.g., negative consequences). Other factors such as outcome ambiguity and variance (sometimes referred to as risk) also affect individual choices (Huettel et al., 2006), but here we will only consider potential gains and losses as determinants of decision-making while gambling. We will also take "risk" to mean the potential loss attached to any choice. Risk, as so defined, increases with the magnitude and probability of potential losses. In fact, risk-taking may be seen as an indicator of the balance existing between computations of potential gains and losses. Two of the main brain structures thought to be involved in these computations are the ventral striatum (VStr) and the insula, both dopamine projection sites. Both have been linked to computations of value, with the VStr being especially responsive to reward prediction error (RPE), encoding gain anticipation positively and loss anticipation negatively (Rutledge et al., 2010; Bartra et al., 2013), and the insula responding predominantly to losses and loss anticipation in some studies (Knutson and Greer, 2008) or to both positive and negative outcomes in others (Campbell-Meiklejohn et al., 2008; Rutledge et al., 2010). Bartra et al.'s meta-analysis (**Figure 1**) suggests that the insula encodes arousal or salience as opposed to value, as it responds positively to both gains and losses. This meta-analysis also raises the possibility of a greater role for the insula in the assessment of risk and losses than gains (compare panels A and B in **Figure 1**). Alteration of the balance between these gain and loss anticipation systems may underlie the inappropriate choice behaviors that occur in disorders such as addiction, gambling and impulse control disorders.
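One simple way to write down this weighing, offered only as an illustrative sketch, is the expression below. The symbols are assumptions introduced here and do not appear in the original text: $p_g$ and $G$ are the probability and magnitude of the potential gain, and $p_l$ and $L$ are those of the potential loss. The product form for risk is the simplest choice that increases with both the magnitude and the probability of loss.

$$
V = p_g\,G - p_l\,L, \qquad \text{risk} \propto p_l\,L
$$

On this reading, a choice is attractive when $V > 0$, and a shift in the weight given to either term changes how much risk is accepted.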

Recent research suggests that differences in brain function, structure, and biochemistry are present in those who develop gambling problems, with dopamine being a common etiological factor. Imaging studies have demonstrated an increase in mesolimbic dopamine release during gambling tasks in healthy subjects (Thut et al., 1997; Zald et al., 2004; Hakyemez et al., 2008). However, it should be noted that unpredictable reward tasks can cause both suppression and enhancement of dopamine transmission in different regions of the striatum (Zald et al., 2004; Hakyemez et al., 2008). Earlier research on pathological gamblers suggested altered dopaminergic and noradrenergic systems, as found through a decrease in concentration of dopamine and an increase in cerebrospinal fluid levels of 3,4-dihydroxyphenyl-acetic acid and homovanillic acid (Bergh et al., 1997). Pathological gamblers have also been reported to have higher cerebrospinal fluid levels of 3-methoxy-4-hydroxyphenylglycol, a major metabolite of norepinephrine, as well as significantly greater urinary outputs of norepinephrine in comparison to controls (Roy et al., 1988), indicative of a functional disturbance of the noradrenergic system. In addition, there is evidence that genetic polymorphisms affecting dopaminergic neurotransmission act as risk factors for problem gambling (Lobo and Kennedy, 2006).

#### **DOPAMINE IN REINFORCEMENT**

Considerable evidence from animal studies, implicating dopamine in behavioral reinforcement, provides a neurobiological substrate that could encompass processing of natural rewards, such as food and sex, as well as drugs of abuse and pathological gambling (Di Chiara and Imperato, 1988; Wise and Rompre, 1989; Wise, 1996, 2013). The observations of Schultz and others (Schultz et al., 1998; Schultz, 2002) confirmed a role for dopamine neurons in response to rewards; however the current model of dopamine signaling can be traced to a seminal paper by Montague, Dayan and Schultz (Schultz et al., 1997), where it was argued that the firing pattern of dopamine neurons did not signal reward *per se*, but a RPE signal, similar to those used in machine learning. This finding, along with evidence that dopamine could modulate synaptic plasticity (Calabresi et al., 2007; Surmeier et al., 2010) led to the theory that dopamine acts as a learning (or reinforcement) signal that shapes future motivated behavior. Subsequent research has shown that dopamine may also encode predictions about upcoming rewards and reward rate, thus acting as a value signal in the mesocortical and mesolimbic dopaminergic pathways (Montague and Berns, 2002).

**FIGURE 1 | Meta-analysis of fMRI studies of value (taken from Bartra et al., 2013)**. The authors extracted peak coordinates of activation from 206 published fMRI studies that investigated value computations. **(A)** Significant clustering of positive responses. **(B)** Significant clustering of negative responses. **(C)** Conjunction maps, showing regions with significant clustering for both positive and negative responses. **(D)** Results of a between-category comparison, showing regions with significantly greater clustering for positive than negative effects. **(E)** Detail of the striatum, illustrating overlap between the conjunction map (**Panel C**) and the difference map (**Panel D**). These data demonstrate the relative response of anterior insula, striatum and ventromedial PFC to positive and negative value.
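The idea of an RPE acting as a learning signal can be illustrated with a minimal delta-rule sketch of the kind used in machine learning. This is a generic simplification chosen for illustration and not the specific model of Schultz et al. (1997); the function names and the learning rate are hypothetical.

```python
def rpe_update(value, reward, learning_rate=0.1):
    """Update a reward prediction from one outcome; return (new_value, rpe)."""
    rpe = reward - value              # experienced minus predicted reward
    return value + learning_rate * rpe, rpe

def simulate(rewards, learning_rate=0.1):
    """Track the RPE and the prediction over a sequence of outcomes."""
    value, trace = 0.0, []
    for reward in rewards:
        value, rpe = rpe_update(value, reward, learning_rate)
        trace.append((rpe, value))
    return trace

# Fifty rewarded trials followed by one omitted reward.
trace = simulate([1.0] * 50 + [0.0])
print(trace[0][0], trace[49][0], trace[50][0])
```

In this sketch the RPE is large when the reward is first encountered, shrinks toward zero as the reward becomes predicted, and turns negative when an expected reward is omitted.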

The main projection site of dopamine neurons is the striatum, whose connectivity to frontal, limbic and insular cortex provides a mechanism whereby dopamine can act as a prediction error signal driving both "Go" learning, which relates to actions with positive outcomes, and "No Go" or avoidance learning, which relates to actions that lead to punishment or an absence of reward. First, dopamine signaling operates in two modes (Grace, 2000): slow constant release of dopamine regulates tonic levels, which mostly signal via dopamine D2 receptors on striatal medium spiny neurons; phasic bursts of dopamine firing lead to large increases in synaptic dopamine which signal via both the D1 and D2 receptor systems. D1 receptors have low affinity for dopamine (Marcellino et al., 2012) and only respond to large increases in synaptic dopamine released during phasic dopamine neuron bursts that reflect positive RPEs, supporting learning to approach rewarding stimuli (Frank, 2005). Dopamine D2 receptors, on the other hand, have a higher affinity for dopamine, allowing them to respond to tonic dopamine signaling, and to detect transient reductions in tonic dopamine levels that follow pauses in dopamine neuron firing during negative RPEs. This facilitates learning to avoid negative outcomes (Frank, 2005). The cortico-striatal system can be divided into a direct and an indirect pathway (**Figure 2**), which have opposite effects on the thalamus and hence cortex (Albin et al., 1989). In the dorsal striatum, receptors are segregated, with the D1 receptors within the direct pathway, related to action selection, while the D2 receptors control response inhibition within the indirect pathway (Mink, 1996). This separation allows dopamine to drive both reward (increases in dopamine signaling a better outcome than expected) and punishment (reductions in tonic dopamine indicating a worse outcome than expected).
Frank proposed a model in which phasic dopamine bursts following rewards promote positive reinforcement while reductions in tonic dopamine levels lead to negative reinforcement, each controlled by the D1/direct pathway and the D2/indirect pathway, respectively (Cohen and Frank, 2009). This computational model suggests that the RPE dopamine signal promotes learning from positive outcomes via stimulation of D1 receptors, whereas learning to avoid negative outcomes is mediated via disinhibition of indirect pathway striatal neurons secondary to a reduction of D2 receptor stimulation during dopamine pauses (Cohen and Frank, 2009). A negative outcome (punishment or lack of an expected reward) leads to a pause in the firing of dopamine neurons, which then leads to a transient reduction in tonic dopamine. It should also be noted that D2 receptor stimulation reduces excitability of neurons in the indirect pathway (Hernandez-Lopez et al., 2000); therefore, reductions in D2 receptor signaling have the effect of activating the inhibitory "No Go" pathway. This allows for bidirectional positive and negative reinforcement signaling by dopamine neurons. Support for this model has been provided by numerous experiments. Parkinson's disease patients show enhanced positive learning when on their medications, but improved negative learning while off medication (Frank et al., 2004). Pharmacological manipulations also support the model (Frank and O'Reilly, 2006; Pizzagalli et al., 2008). The striatal release of dopamine is linked to associative learning and habit formation via control of corticostriatal synaptic plasticity, which is affected in an opposite manner by D1 and D2 signaling (Shen et al., 2008). D1 dopamine receptor signaling promotes long-term potentiation (Reynolds et al., 2001; Calabresi et al., 2007), whereas D2 receptor signaling promotes long-term depression (Gerdeman et al., 2002; Kreitzer and Malenka, 2007). Note that this model has been tested most thoroughly at the level of the striatum. Multivariate analysis of fMRI data shows that reinforcement and punishment signals are ubiquitous in the brain, most notably in the entire frontal cortex and striatum (Vickery et al., 2011). Less is known about the information signaled by dopamine projections to brain areas other than the striatum, such as frontal cortex, insula, hippocampus and amygdala, or how the RPE signal is used by these areas.

**FIGURE 2 | Basal ganglia model.** A possible model whereby basal ganglia compute the utility of gains and losses via two segregated pathways in the corticostriato-thalamocortical circuit. Striatal output neurons of the direct pathway express D1 receptors and project to the internal globus pallidus (GPi) and the substantia nigra pars reticulata (SNr), and have an action selection effect on cerebral cortex. Striatal output neurons in the indirect pathway express D2 receptors and reduce the tonic inhibition of the external globus pallidus (GPe) on the GPi/SNr, which leads to action inhibition in the cortex. D1 receptors respond mainly to phasic (high concentration) dopamine signaling due to their low affinity for dopamine. D2 receptors have high affinity for dopamine and respond to lower tonic dopamine levels. Excitatory projections in green, inhibitory in red.
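The separation of learning from positive and negative prediction errors described above can be illustrated with a deliberately simplified sketch. It is not the neural network model of Frank and colleagues; it only assumes a single learned value that is updated with one learning rate after positive RPEs (standing in for D1/direct-pathway "Go" learning) and another after negative RPEs (standing in for D2/indirect-pathway "No Go" learning). The function names, the learning rates `alpha_go` and `alpha_nogo`, and the outcome sequence are all hypothetical.

```python
def update_value(q, outcome, alpha_go, alpha_nogo):
    """One trial of learning from a signed reward prediction error (RPE)."""
    rpe = outcome - q                 # positive: better than expected
    if rpe >= 0:
        return q + alpha_go * rpe     # burst: stands in for D1/direct "Go" learning
    return q + alpha_nogo * rpe       # dip: stands in for D2/indirect "No Go" learning


def learn(outcomes, alpha_go, alpha_nogo, q0=0.0):
    """Run the update over a sequence of outcomes and return the final value."""
    q = q0
    for outcome in outcomes:
        q = update_value(q, outcome, alpha_go, alpha_nogo)
    return q


# Hypothetical option that pays off on every other trial.
outcomes = [1, 0] * 50

# A learner weighted toward positive RPEs versus one weighted toward negative RPEs.
gain_biased = learn(outcomes, alpha_go=0.3, alpha_nogo=0.1)
loss_biased = learn(outcomes, alpha_go=0.1, alpha_nogo=0.3)
print(f"gain-biased value: {gain_biased:.2f}, loss-biased value: {loss_biased:.2f}")
```

Under these assumptions the same outcome history yields a higher learned value for the gain-biased learner than for the loss-biased one, loosely mirroring the on-medication and off-medication learning profiles reported by Frank et al. (2004).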

#### **STRIATUM AND MONETARY REWARD**

In human functional neuroimaging studies, changes in brain activation have been demonstrated consistently in response to monetary rewards (Thut et al., 1997; Elliott et al., 2000; Knutson et al., 2000; Breiter et al., 2001; O'Doherty et al., 2007). Further, studies have teased apart the different brain areas involved in the various components of monetary reward, such as anticipation, feedback, winning and losing. There seems to be a specialization within dopamine projection sites in relation to monetary reward: anticipation of monetary reward increases activation in the VStr, which includes the nucleus accumbens, while rewarding outcomes increase activation in the ventral medial prefrontal cortex, dorsal striatum, and posterior cingulate, with deactivation in the aforementioned regions during reward omission (Elliott et al., 2000; Breiter et al., 2001; Knutson et al., 2001b; Tricomi et al., 2004). Neuroimaging experiments in humans suggest that VStr activity strongly correlates with expected value, as well as magnitude and probability (Breiter et al., 2001; Knutson et al., 2001a, 2005; Abler et al., 2006; Yacubian et al., 2006; Rolls et al., 2008). Work by D'Ardenne et al. (2008) supports a role for the mesolimbic dopamine system in monetary RPE signaling. Activation of the ventral tegmental area, the origin of the mesolimbic dopamine circuit, reflected positive RPEs, whereas the VStr encoded positive and negative RPEs. Similarly, Tom et al. (2007) showed that VStr activity reflected potential monetary gains and losses bidirectionally. This study also demonstrated that these neural signals reflected individual variations in loss aversion, the tendency for losses to be more impactful than potential gains. 
Finally, the influential actor-critic model (Sutton and Barto, 1998) proposes that the VStr uses prediction errors to update information about expected future rewards while the dorsal striatum uses this same prediction error signal to encode information about actions that are likely to lead to reward. This distinction has found support from fMRI experiments (O'Doherty et al., 2004; Kahnt et al., 2009). Interestingly, the ability to update behavior in response to RPE was shown to correlate with functional connectivity between dorsal striatum and dopaminergic midbrain (Kahnt et al., 2009). The imaging studies mentioned here support the theory of dopamine as an RPE signal, at least in its striatal projection.
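The actor-critic idea can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not an implementation from the cited work: a single state, two actions with fixed hypothetical payoffs, a critic that tracks expected reward, and an actor whose action preferences are updated by the same prediction error. The parameter names and values are assumptions chosen for illustration.

```python
import math
import random


def actor_critic(payoffs, n_trials=2000, alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Two-action, single-state actor-critic driven by one shared prediction error."""
    rng = random.Random(seed)
    value = 0.0            # critic: expected reward (the role proposed for the VStr)
    prefs = [0.0, 0.0]     # actor: action preferences (the role proposed for dorsal striatum)
    for _ in range(n_trials):
        # Softmax over two actions, written as the probability of choosing action 1.
        p_action1 = 1.0 / (1.0 + math.exp(prefs[0] - prefs[1]))
        action = 1 if rng.random() < p_action1 else 0
        rpe = payoffs[action] - value         # shared reward prediction error
        value += alpha_critic * rpe           # critic update
        prefs[action] += alpha_actor * rpe    # actor update using the same error
    return value, prefs


# Hypothetical payoffs: action 1 is the better option.
value, prefs = actor_critic(payoffs=[0.2, 0.8])
print(f"critic value: {value:.2f}, actor preferences: {prefs}")
```

In this sketch the critic never selects actions and the actor never estimates reward directly; the only link between them is the prediction error, which is the division of labor the model attributes to ventral and dorsal striatum.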

#### **INSULA AND RISK**

The insula is frequently activated in functional neuroimaging experiments (Duncan and Owen, 2000; Yarkoni et al., 2011). Functionally it can be divided into three distinct subregions: a ventroanterior region associated with chemosensory (Pritchard et al., 1999) and socio-emotional processing (Sanfey et al., 2003; Chang and Sanfey, 2009), a dorsoanterior region associated with higher cognitive processing (Eckert et al., 2009), and a posterior region associated with pain and sensorimotor processing (Craig, 2002; Wager et al., 2004). Different functional insular areas project to different striatal targets: the VStr receives insular projections primarily related to food and reward, whereas the dorsolateral striatum receives insular inputs related to somatosensation (Chikama et al., 1997).

The insular cortex is involved in decision-making processes that involve uncertain risk and reward. Specifically, fMRI studies have reported insular cortex involvement in risk-averse decisions (Kuhnen and Knutson, 2005), risk avoidance and the representation of loss prediction (Paulus et al., 2003), monetary uncertainty (Critchley et al., 2001), and encoding a risk prediction error (Preuschoff et al., 2008). Patients with insular cortex damage place higher wagers in comparison with healthy participants and their betting is less sensitive to the odds of winning, with high wagers even at unfavorable odds (Clark et al., 2008). Other research suggests that optimum decisions involving risk depend on the integrity of the insular cortex, showing that insula lesion patients have altered decision-making involving both risky gains and risky losses (Weller et al., 2009; but see Christopoulos et al., 2009). Specifically, insula damage was associated with a relative insensitivity to expected value differences between choices. Previous research has shown that there is a dissociation between insula and VStr, with VStr activation preceding risk-seeking choices, and anterior insula activation predicting risk-averse choices (Kuhnen and Knutson, 2005), suggesting that the VStr represents gain prediction (Knutson et al., 2001a), while anterior insula represents loss prediction (Paulus et al., 2003). While imaging studies also demonstrate a more general role of the anterior insula in signaling the valence (positive or negative) of potential rewards (Litt et al., 2011; Bartra et al., 2013), the lesion data argue that the anterior insular cortex has a role in risk evaluation, specifically in making risk-averse decisions. Indeed, in healthy subjects, the insula is part of a value network that appears to track potential losses in a way that correlates with individual loss aversion level (Canessa et al., 2013). 
It is possible that an imbalance between prefrontal-striatal circuitry and insular-striatal circuitry may lead to suboptimal choices when weighing potential gains and losses, as observed in pathological gamblers (Petry, 2001a; Goudriaan et al., 2005).

#### **PATHOLOGICAL GAMBLING AMONG PATIENTS WITH PARKINSON'S DISEASE**

Pathological gambling was first reported in the context of Parkinson's disease and dopamine replacement therapy in 2000 (Molina et al., 2000). The lifetime prevalence of pathological gambling in the general public is approximately 0.9 to 2.5% (Shaffer et al., 1999). In Parkinson's disease, the prevalence rates are higher, from 1.7 to 6.1% (Ambermoon et al., 2011; Callesen et al., 2013). The risk factors associated with the occurrence of pathological gambling in Parkinson's disease are young age of Parkinson's disease onset, a personal or family history of drug or alcohol abuse, depression, and relatively high impulsivity and novelty seeking personality scores (Voon et al., 2007b). Interestingly, these are similar to the risk factors for drug addiction and pathological gambling in the general population. Also, there have been reports of addiction to L-dopa in certain patients (e.g., Giovannoni et al., 2000), a phenomenon that had already been noted in the 1980s. It was perhaps initially surprising to find that Parkinson's disease patients can become addicted to their own medication or develop behavioral addictions because they were thought to not possess the personality type typical of addicted individuals. They are generally described as industrious, punctual, inflexible, cautious, rigid, introverted, slow-tempered, with lack of impulsiveness and novelty seeking, and they have low lifetime risks for cigarette smoking, coffee drinking, and alcohol use predating Parkinson's disease onset (Menza et al., 1993; Menza, 2000).

Dopamine replacement therapy has been implicated in the development of pathological gambling in Parkinson's disease (Gschwandtner et al., 2001; Dodd et al., 2005) and a remission or reduction of pathological gambling is typically noted after reduction or cessation of dopamine agonist medication (Gschwandtner et al., 2001; Dodd et al., 2005). A broader set of behavioral addictions termed impulse control disorders, including but not limited to pathological gambling, compulsive sexual behavior, and compulsive buying, has been reported in association with dopamine replacement therapy (Weintraub et al., 2006; Voon et al., 2007a; Dagher and Robbins, 2009). Dopamine agonists (pramipexole, ropinirole and pergolide) appear to pose a greater risk than L-Dopa monotherapy (Seedat et al., 2000; Dodd et al., 2005; Pontone et al., 2006). Reducing the dopamine agonist and increasing L-Dopa to achieve the same motor response abolished pathological gambling in affected individuals (Mamikonyan et al., 2008), while a cross-sectional study of over 3000 Parkinson's disease patients found that taking a dopamine agonist increased the odds of developing an impulse control disorder by a factor of 2.72 (Weintraub et al., 2010). Finally, these side-effects of dopamine agonist therapy have recently been noted in other diseases, such as restless legs syndrome, fibromyalgia and prolactinomas (Davie, 2007; Driver-Dunckley et al., 2007; Quickfall and Suchowersky, 2007; Tippmann-Peikert et al., 2007; Falhammar and Yarker, 2009; Holman, 2009). It should be noted, however, that some studies have reported behavioral addictions and/or impulsivity and compulsivity in association with high-dose L-Dopa monotherapy (Molina et al., 2000), deep brain stimulation for Parkinson's disease (Smeding et al., 2007), and in drug-naïve Parkinson's disease patients (Antonini et al., 2011), all in the absence of dopamine agonists. 
Nonetheless, the clinical evidence overwhelmingly supports the theory that dopamine agonism at the D2 receptor family is sufficient to cause impulse control disorders.
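Because the effect of dopamine agonists reported by Weintraub et al. (2010) is expressed as a change in odds rather than in probability, a short worked conversion may help in reading the figure of 2.72. The sketch below assumes that this figure acts as an odds ratio, and it uses a purely hypothetical baseline probability of 5% that is not taken from any of the cited studies.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a baseline probability to the probability implied by an odds ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.05  # hypothetical baseline probability, for illustration only
with_agonist = apply_odds_ratio(baseline, odds_ratio=2.72)
print(f"baseline: {baseline:.1%}, with odds ratio 2.72: {with_agonist:.1%}")
```

The point of the conversion is that an odds ratio of 2.72 does not multiply the probability itself by 2.72; the implied change in probability depends on the baseline.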

### **BRAIN IMAGING STUDIES**

#### **NEUROTRANSMITTER IMAGING**

Positron emission tomography (PET) imaging allows for changes in endogenous levels of dopamine to be inferred from changes in the binding of [11C]raclopride to dopamine D2 receptors. The first [11C]raclopride PET study in this area was on Parkinson's patients with dopamine dysregulation syndrome. Dopamine dysregulation syndrome is characterized by the compulsive taking of dopaminergic drugs, which is often comorbid with impulse control disorders (Lawrence et al., 2003). Patients with dopamine dysregulation syndrome exhibited enhanced L-Dopa induced VStr dopamine release compared to similarly treated Parkinson's disease patients not compulsively taking dopaminergic drugs (Evans et al., 2006). This was the first study to provide evidence for sensitization of mesolimbic dopamine circuitry in Parkinson's disease patients prone to compulsive drug use. Subsequent studies have supported a relative hyperdopaminergic state in Parkinson's disease patients with pathological gambling. Three studies mapping the concentration of dopamine reuptake transporters (DAT) have shown reduced levels in the VStr of Parkinson's disease patients with impulse control disorders compared to unaffected patients (Cilia et al., 2010; Lee et al., 2014; Voon et al., 2014). Unfortunately, the finding is non-specific, as reduced DAT concentration can index either reduced nerve terminals (and reduced dopamine signaling) or reduced DAT expression (and therefore increased tonic dopamine levels). Supporting the latter hypothesis, impulse control patients demonstrate reduced [11C]raclopride binding in the VStr compared to Parkinson's controls (Steeves et al., 2009), which is also consistent with elevated tonic dopamine in this group. Note, however, that this result failed to be replicated in a similar study (O'Sullivan et al., 2011).

However, these two [11C]raclopride PET studies reported a greater reduction of VStr binding potential (an index of dopamine release) during gambling (Steeves et al., 2009) and following reward-related cue exposure (images of food, money, sex) compared to neutral cues (O'Sullivan et al., 2011) in Parkinson's disease patients with impulse control disorders compared to unaffected patients. This suggests an increased responsiveness of striatal reward circuitry to gambling and reward-related cues in those patients with impulse control disorders. In O'Sullivan et al. (2011) dopamine release was only detected in the VStr and only when subjects received a dose of oral L-Dopa just prior to scanning, consistent with post-mortem data in Parkinson's disease showing that brain dopamine levels are much lower in the dorsal striatum than in the VStr (Kish et al., 1988). These results are therefore consistent with the sensitization hypothesis proposed by Evans et al. (2006). More recently it was reported that Parkinson's disease patients with pathological gambling have a reduced concentration of dopamine autoreceptors in the midbrain (Ray et al., 2012), which is known to correlate with elevated dopaminergic responsivity and increased impulsivity (Buckholtz et al., 2010). Finally, in Parkinson's disease patients, dopamine synthesis capacity, as measured by [18F]DOPA PET, correlates with a personality measure of disinhibition, itself a risk factor for pathological gambling and other addictions (Lawrence et al., 2013). In summary, PET studies provide converging evidence of heightened dopaminergic tone and increased dopamine response to reward cues as the underlying vulnerability in Parkinson's disease patients who develop pathological gambling during dopamine agonist treatment.

#### **FUNCTIONAL MAGNETIC RESONANCE IMAGING**

Parkinson's disease patients with pathological gambling show enhanced hemodynamic responses to gambling-related visual cues in the bilateral anterior cingulate cortex, left VStr, right precuneus and medial prefrontal cortex (Frosini et al., 2010). This is in line with similar experiments in pathological gambling without Parkinson's disease (Crockford et al., 2005; Ko et al., 2009) and drug addiction (Wexler et al., 2001), supporting the view that impulse control disorders in Parkinson's disease may be conceptualized as behavioral addictions.

Parkinson's disease patients with an impulse control disorder show diminished BOLD activity in the right VStr during risk taking and significantly reduced resting cerebral blood flow in the right VStr compared to Parkinson's disease patients without an impulse control disorder (Rao et al., 2010). Similarly, it was found that Parkinson's disease patients with impulse control disorders showed a bias toward risky gambles compared to control patients, and that dopamine agonists enhanced risk taking while decreasing VStr activity (Voon et al., 2011). The authors suggested that dopamine agonists may decouple brain activity from risk information in vulnerable patients, thus favoring risky choices. Another fMRI study reported that, relative to Parkinson's controls, impulse control disorder Parkinson's patients had decreased anterior insular and orbitofrontal cortex RPE signals. They also showed that dopamine agonists increased the rate of learning from gain outcomes, and increased striatal RPE activity, suggesting that dopamine agonists may skew neural activity to encode "better than expected" outcomes in Parkinson's disease patients susceptible to impulse control disorders (Voon et al., 2010).

While differences in striatal dopamine signaling may distinguish Parkinson's disease patients who do and do not develop pathological gambling, the mechanism of action by which dopamine agonists change risk assessment remains unclear. Dopamine agonists change the way in which the brains of healthy individuals respond to the anticipation and feedback of rewards. During reward feedback, administration of a single dose of pramipexole to healthy adults caused decreased VStr activity in a lottery game (Riba et al., 2008). Similarly, there was reduced VStr activation when Parkinson's patients received a dose of L-Dopa compared to placebo (Cools et al., 2007). This pattern of hypoactivation is reminiscent of that found in pathological gamblers without Parkinson's disease (Reuter et al., 2005): during a simulated gambling task, pathological gamblers showed decreased activation with respect to controls in the ventromedial prefrontal cortex and the VStr. Severity of gambling was negatively correlated with the BOLD effect in the VStr and ventromedial prefrontal cortex, suggesting that hypoactivity is a predictor of gambling severity. As noted above, impulse control disorder Parkinson's patients were found to have diminished resting perfusion as well as diminished BOLD activity during risk taking in the VStr compared to Parkinson's controls (Rao et al., 2010). These studies suggest that dopamine agonists cause individuals to seek rewards and make risky choices (Riba et al., 2008), in the face of suppressed VStr response to rewards.

It should be noted however that reduced VStr activation in fMRI experiments does not necessarily indicate reduced dopaminergic signaling. There is evidence to support relatively spared mesolimbic dopamine signaling as the risk factor for pathological gambling in Parkinson's disease. First, the repeated taking of a dopaminergic medication for the treatment of Parkinson's disease could lead to sensitization of dopamine signaling. VStr sensitization has been shown following repeated amphetamine administration in humans (Boileau et al., 2006). Moreover, in Parkinson's disease the ventral portion of striatum is relatively spared by the disease compared to the dorsal areas (Kish et al., 1988), and thus dopamine replacement therapy, while correcting the dopamine deficiency in the dorsal striatum to normal levels, has the potential to raise dopamine levels in the VStr circuit to higher than optimal levels (Cools et al., 2007). This "overdose" theory was first proposed by Gotham et al. (1988) to explain the fact that L-Dopa administration to Parkinson's disease patients, while improving some cognitive deficits, could also cause specific impairments in other fronto-striatal cognitive tasks. In the case of impulse control disorders, we propose that excessive dopaminergic stimulation in the VStr obscures the dips in dopamine signaling related to negative prediction errors.

The insula has also been implicated in imaging studies of pathological gambling in Parkinson's disease. In an fMRI study, Ye et al. (2010) found that during the anticipation of monetary rewards, a single dose of pramipexole (compared to placebo) increased the activity of the VStr, enhanced the interaction between the VStr and the anterior insula, but weakened the interaction between the VStr and the prefrontal cortex, leading to increased impulsivity. Cilia et al. (2008) found Parkinson's patients with pathological gambling showed resting over-activity in brain areas in the mesocorticolimbic network, including the insula. In an fMRI study, relative to Parkinson's controls, impulse control disorder patients had decreased anterior insular and orbitofrontal cortex activity (van Eimeren et al., 2009; Voon et al., 2010). Finally, in a study of Parkinson's disease patients with and without hypersexuality, a single dose of L-Dopa abolished the normal insular deactivation seen in response to erotic pictures, only in the hypersexual patients (Politis et al., 2013). Taken together these results may suggest an imbalance between the prefrontal-striatum connectivity and insula-striatum connectivity, favoring the influence of potential gains over that of potential risks (losses) in decision-making.

#### **RISK TAKING AND LOSS AVERSION**

An influential framework for studying risky decision making is prospect theory, developed by Kahneman and Tversky (1979). A key finding of their work is loss aversion, a tendency for losses to loom larger than potential gains, and for individuals to typically forego risky choices when less valuable safer alternatives exist. For example, most people will reject the offer of a coin flip unless the potential gain is considerably larger than the potential loss. Impulsiveness, at least in a gambling context, can be characterized as a reversal of loss aversion, and an overweighting of potential rewards relative to losses. It remains to be seen whether loss aversion results from asymmetrical weighting of gains and losses along a single value axis (Tom et al., 2007), or from a competitive interaction between separate systems for gains and losses (Kuhnen and Knutson, 2005; De Martino et al., 2010). Possibly, both models are correct: recent fMRI evidence (Canessa et al., 2013) shows bidirectional responses to losses and gains in the VStr and ventromedial prefrontal cortex (positive for gains) and the amygdala and insula (positive for losses). In both cases, there is greater activation to potential losses, correlating with individual loss aversion measured using prospect theory (Kahneman and Tversky, 1979). However, there are also brain regions that respond uniquely to potential losses, namely the right insula and the amygdala, once again reflecting individual variation in loss aversion (Canessa et al., 2013). In sum, a network of regions centered on VStr, insula and amygdala seems to compute gain and loss anticipation in a way that typically results in loss aversion. Interestingly, these structures, along with dorsal anterior cingulate, form an intrinsic connectivity network as identified by resting state fMRI. This network is thought to be involved in detecting and processing emotionally salient events (Seeley et al., 2007).
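The coin-flip example above can be made concrete with a small sketch of loss aversion. It uses a piecewise-linear value function in which losses are multiplied by a loss aversion coefficient; this is a simplification that leaves out the curvature and probability weighting of full prospect theory. The coefficient values and the stakes are hypothetical and chosen only for illustration.

```python
def subjective_value(amount, loss_aversion=2.0):
    """Piecewise-linear value: gains count at face value, losses are scaled up."""
    return amount if amount >= 0 else loss_aversion * amount


def accepts_coin_flip(gain, loss, loss_aversion=2.0):
    """Accept a 50/50 gamble only if its expected subjective value is positive."""
    expected = (0.5 * subjective_value(gain, loss_aversion)
                + 0.5 * subjective_value(-loss, loss_aversion))
    return expected > 0


# A loss-averse chooser rejects a gamble with positive expected monetary value...
print(accepts_coin_flip(gain=15, loss=10))
# ...but accepts once the gain is considerably larger than the loss.
print(accepts_coin_flip(gain=25, loss=10))
# A chooser who underweights losses (coefficient below 1) accepts the first gamble.
print(accepts_coin_flip(gain=15, loss=10, loss_aversion=0.8))
```

In this framing, the reversal of loss aversion proposed for impulsive gambling corresponds to a coefficient at or below 1, so that gambles a loss-averse chooser would decline become acceptable.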

Loss aversion can be explained on an emotional basis, with both potential gains and losses influencing behavior via different emotions (Loewenstein et al., 2001), namely motivation on the gain side and anxiety for losses. Such a model might tie the former to the nucleus accumbens and the latter to the amygdala and insula. In either case, it is conceivable that individuals who are relatively less loss averse may also be at risk for impulsive behaviors such as drug addiction and gambling, due to relative undervaluation of losses, although surprisingly this has yet to be formally tested.

There is some evidence implicating the striatum in reversal of normal loss aversion in pathological gamblers. Loss of striatal dopamine neurons in Parkinson's disease is associated with reduced risk-taking behavior compared to control subjects (Brand et al., 2004; Labudda et al., 2010), while chronic administration of dopamine agonists, especially in high doses, reverses this tendency and promotes risky behavior and impulsivity (Dagher and Robbins, 2009). In the healthy brain, acute administration of D2 dopamine agonists may also cause an increase in risky choices in humans (Riba et al., 2008) and rats (St Onge and Floresco, 2009). Acute D2/D3 receptor stimulation has been found to produce complex changes in the value of losses judged worth chasing (chasing being the continued gambling to recover losses) (Campbell-Meiklejohn et al., 2011). Taken together, this suggests dopamine, acting on the striatum and possibly other mesolimbic structures, may modulate loss aversion. Two studies in Parkinson's disease patients not affected by impulse control disorders found that a single dose of the dopamine agonist pramipexole reduced loss prediction error coding in the orbitofrontal cortex in one case (van Eimeren et al., 2009) and the orbitofrontal cortex and insula in the other (Voon et al., 2010). In sum, tonic dopamine activity appears to reduce loss prediction signaling, and may therefore reduce loss aversion.

We propose a general framework based on prospect theory, in which the anticipation of potential losses and rewards is computed, possibly in separate brain regions initially, and integrated to compute a decision value (**Figure 3**). We speculate that gain anticipation might be computed in the ventral medial prefrontal cortex, based on numerous imaging studies implicating this area in computation of value (Kable and Glimcher, 2007; Plassmann et al., 2007; Bartra et al., 2013). As reviewed above, the amygdala and insula may be involved in computing loss anticipation. A possible site for the final computation of value, at least for the purpose of updating choices and action plans, is the striatum, which has fairly direct access to brain regions involved in action planning (van der Meer et al., 2012). The striatum has inherent roles in both response-reward associations (dorsal striatum) (Alexander and Crutcher, 1990) and creating stimulus-reward contingencies (VStr), which afford it the unique opportunity for computation of value (Packard and Knowlton, 2002). Striatal value signals can promote reinforcement processes leading to the updating of future actions, strategies and habits, mediated by the dorsal striatum, while also driving appetitive reward seeking behavior via the VStr. For a review of the role of the striatum in value coding see Knutson et al. (2008); Bartra et al. (2013). The balance between gain and loss evaluation systems may be modulated at least in part by dopamine. We propose a model in which tonic dopamine, acting via the indirect basal ganglia pathway (**Figure 2**) regulates inhibitory control manifesting as loss aversion. Here lower levels of tonic dopamine would be associated with increased loss aversion. Conversely, phasic dopamine, acting via the direct pathway, would increase the value of gains. 
This is based on the finding that young healthy subjects given a single dose of the dopamine agonist cabergoline show reduced learning in response to gains (positive feedback), due presumably to a presynaptic effect (in low doses, cabergoline, a D2 agonist, reduces phasic dopamine neuron firing via actions on the high affinity D2 autoreceptor, located pre-synaptically on dopamine neurons) (Frank and O'Reilly, 2006). Conversely, haloperidol, a D2 antagonist, increased learning from gains, probably due to its ability to enhance phasic dopamine firing. With respect to Parkinson's disease, if a patient has an individual vulnerability to undervalue losses, then dopamine agonist therapy, which tonically stimulates D2 receptors and blocks sensing of the phasic dopamine dips associated with negative rewards (Frank et al., 2004, 2007), could result in even lower loss aversion. One interpretation is that the intensity of phasic activity sets the gain on the value of potential rewards, while the tonic stimulation of D2 receptors blocks the negative feedback associated with losses.
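The idea that tonic D2 stimulation blocks the sensing of dopamine dips can be sketched as a learning rule in which negative prediction errors are partly or fully "filled in" before they update value. This is an illustrative toy under stated assumptions, not a model taken from the cited studies: `agonist_level` is a hypothetical parameter expressing how much of a dip is masked, the learning rate is arbitrary, and the gamble is a made-up sequence of equal wins and losses.

```python
def update_value(q, outcome, alpha, agonist_level):
    """Update a learned value; dips are attenuated by tonic stimulation."""
    rpe = outcome - q
    if rpe >= 0:
        sensed = rpe                              # bursts are sensed in full
    else:
        sensed = min(0.0, rpe + agonist_level)    # dips are masked up to agonist_level
    return q + alpha * sensed


def learned_value(outcomes, agonist_level, alpha=0.1):
    """Return the value learned from a sequence of outcomes, starting at zero."""
    q = 0.0
    for outcome in outcomes:
        q = update_value(q, outcome, alpha, agonist_level)
    return q


# Hypothetical gamble with equal wins and losses (zero expected value).
gamble = [1.0, -1.0] * 50

untreated = learned_value(gamble, agonist_level=0.0)
treated = learned_value(gamble, agonist_level=1.0)
print(f"no masking: {untreated:.2f}, dips masked: {treated:.2f}")
```

With no masking, the learned value of this gamble stays close to zero; when dips are masked, the same gamble acquires a clearly positive value, which is one way to picture reduced loss aversion under tonic D2 stimulation.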

Parkinson's disease patients show enhanced positive learning when on dopaminergic medications, and improved negative learning while off medication, compared to age-matched controls (Frank et al., 2004). Treatment with dopamine D2 agonists is now accepted as the cause of impulse control disorders in Parkinson's disease, in which problem gambling is phase locked to medication use. In the model proposed here, D2 stimulation would reduce loss aversion via the indirect corticostriatal pathway. We suggest that under D2 agonist treatment, these patients have a tendency to undervalue losses and be more risk seeking. This is consistent with the observation that Parkinson's disease patients' deficits in risky decision making are dominated by an impaired ability to use negative feedback (Labudda et al., 2010). The effect on gain, risk, and loss processing of dopamine signaling in other parts of the mesolimbic and mesocortical system, notably the vmPFC, OFC, insula and amygdala, remains to be investigated in greater depth.

Loss tolerance profile may also be affected by norepinephrine signaling. In healthy volunteers, a single dose of the centrally acting beta blocker propranolol reduced the perceived magnitude of losses (Rogers et al., 2004), and normal variations in norepinephrine reuptake transporter in the thalamus, as assessed by PET, correlate with loss aversion (Takahashi et al., 2013). An explanation for this is that norepinephrine increases the arousal response to potential losses, and low norepinephrine signaling may therefore reduce loss aversion. While norepinephrine neurons are also affected in Parkinson's disease, their role in the motivational and impulsive aspects of the disease has yet to be investigated (Vazey and Aston-Jones, 2012).

#### **CONCLUSION**

The causal association between dopamine D2 receptor agonism and impulse control disorders in Parkinson's disease has implications for addiction more generally. First, not all individuals develop addictive syndromes following dopamine replacement therapy; those who do appear to have relatively preserved dopamine signaling in the mesolimbic pathway, possibly through a combination of their specific pattern of neurodegeneration, sensitization and pre-morbid vulnerability (as evidenced by the fact that a family history of addiction is a risk factor). It is conceivable that enhanced mesolimbic transmission is also a risk factor in the general population (Buckholtz et al., 2010). Second, it is clear that D2 receptor agonism alone is sufficient for the development of the addictive syndrome. While combined D1/D2 agonists such as L-Dopa may themselves be addictive (Lawrence et al., 2003), D2 agonists are not typically administered compulsively; rather, they have the ability to promote other addictions such as pathological gambling (O'Sullivan et al., 2011). This is supported by animal experiments (Collins and Woods, 2009), computational neuroscience models (Cohen and Frank, 2009), and molecular biology evidence (Shen et al., 2008) suggesting that D1 receptor stimulation is reinforcing while D2 receptor stimulation inhibits the inhibitory indirect pathway. We suggest that D2 agonism, in vulnerable individuals, has the effect of "releasing the brake" on reinforcement systems, thus facilitating the development of impulse control disorders. The time-locked nature of the D2 effect, and the fact that addictive behaviors typically resolve upon discontinuation of the dopamine agonist, is consistent with the theory that tonic dopamine has an invigorating effect on reward seeking behavior (Niv et al., 2007; Dagher and Robbins, 2009).

We note, however, that other mechanisms besides dopamine-mediated disruption of responses to reinforcing events and stimuli may play a role. For example, Averbeck et al. (2014) have proposed that Parkinson's disease patients with impulse control disorders are uncertain about using future information to guide behavior, which could lead to impulsivity (a tendency to privilege immediate action). In addition, frontal lobe deficits (Djamshidian et al., 2010) could lead to impulsivity through impaired self-control. These mechanisms need not be mutually exclusive.

#### **ACKNOWLEDGMENTS**

This work was supported through grants from the Canadian Institutes of Health Research and Parkinson Society Canada to Alain Dagher and fellowships from the Natural Sciences and Engineering Research Council of Canada to Crystal A. Clark.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 March 2014; accepted: 12 May 2014; published online: 30 May 2014*. *Citation: Clark CA and Dagher A (2014) The role of dopamine in risk taking: a specific look at Parkinson's disease and gambling. Front. Behav. Neurosci. 8:196. doi: 10.3389/fnbeh.2014.00196*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 Clark and Dagher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Risk-taking and pathological gambling behavior in Huntington's disease

#### **Carla Kalkhoven<sup>1</sup>, Cor Sennef<sup>1</sup>, Ard Peeters<sup>1</sup> and Ruud van den Bos<sup>2</sup>\***

<sup>1</sup> Chardon Pharma, Herpen, Netherlands

<sup>2</sup> Department of Organismal Animal Physiology, Faculty of Science, Radboud University Nijmegen, Nijmegen, Netherlands

#### **Edited by:**

Patrick Anselme, University of Liège, Belgium

#### **Reviewed by:**

Damien Brevers, Université Libre de Bruxelles, Belgium

Bryan F. Singer, University of Michigan, USA

#### **\*Correspondence:**

Ruud van den Bos, Department of Organismal Animal Physiology, Faculty of Science, Radboud University Nijmegen, Heyendaalseweg 135, NL-6524 AJ Nijmegen, Netherlands e-mail: ruudvdbos@science.ru.nl; ruudvandenbos1@gmail.com

Huntington's disease (HD) is a genetic neurodegenerative disorder that specifically affects striatal neurons of the indirect pathway, resulting in a progressive decline in muscle coordination and loss of emotional and cognitive control. Interestingly, predisposition to pathological gambling and other addictions involves disturbances in the same cortico-striatal circuits that are affected in HD, and the two conditions display similar disinhibition-related symptoms, including changed sensitivity to punishments and rewards, impulsivity, and an inability to weigh long-term advantages against short-term rewards. Both HD patients and pathological gamblers also show similar performance deficits on risky decision-making tasks, such as the Iowa Gambling Task (IGT). These similarities suggest that HD patients are a likely risk group for gambling problems. However, such problems have only incidentally been observed in HD patients. In this review, we aim to characterize the risk of pathological gambling in HD, as well as the underlying neurobiological mechanisms. Especially with the current rise of easily accessible Internet gambling opportunities, it is important to understand these risks and provide appropriate patient support accordingly. Based on neuropathological and behavioral findings, we propose that HD patients may not have an increased tendency to seek risks and start gambling, but that they do have an increased chance of developing an addiction once they engage in gambling activities. Therefore, current and future developments of Internet gambling possibilities and related addictions should be regarded with care, especially for vulnerable groups like HD patients.

**Keywords: Huntington's disease, risk-taking, gambling, prefrontal cortex, basal ganglia, disinhibition**

#### **INTRODUCTION**

Huntington's disease (HD) is a genetic neurodegenerative disorder, inherited in an autosomal dominant fashion. The disease is characterized by progressive motor, cognitive and behavioral symptoms, which usually become apparent between 30 and 50 years of age, and lead to premature death in 10–20 years after disease onset. HD is caused by a mutation in the Huntingtin gene (HTT), which leads to protein aggregation, deregulation of several cellular processes, and eventually cell death. Neuronal degeneration initially occurs selectively in the striatum (caudate nucleus and putamen), where it affects cortico-striatal pathways that serve to control motor and cognitive functions (Reiner et al., 2011; Vonsattel et al., 2011). At the motor level, this degenerative process is expressed as disorganized movements (chorea), while at the cognitive/behavioral level patients display an "executive dysfunction syndrome", encompassing amongst others impulsivity, poor risk assessment and an inability to halt a poor course of action (Hamilton et al., 2003; Duff et al., 2010b). Similar behavioral and cognitive symptoms are seen in addictive behavior related to substances or activities (Newman, 1987; Rosenblatt, 2007; Iacono et al., 2008). Therefore, it may be expected that HD patients are at risk of developing addictions. Decision-making paradigms in laboratory settings have indeed suggested deficits in risky decision-making in advanced HD patients (e.g., Stout et al., 2001), and pathological gambling has incidentally been observed in this patient group (De Marchi et al., 1998). However, these findings are rare, and surprisingly few studies have directly examined symptoms and consequences of, for instance, behavioral disinhibition in HD.

In this review we will argue that HD patients may be a risk group for developing problematic gambling. Firstly, problematic gambling is characterized by subjects' inability to stop gambling despite financial, personal or professional problems. Based on neurobiological disturbances and behavioral symptoms the capacity to stop gambling behavior seems diminished or absent in HD patients. Secondly, due to the more liberal attitudes towards gambling and increasing possibilities of legal and illegal Internet gambling (see e.g., Griffiths, 2003), we may expect the occurrence of gambling problems to increase in the coming years. Increased accessibility may specifically pose a risk to vulnerable groups, such as HD patients, that have not been previously exposed to such risks.

In general, changing external conditions and treatment methods can have unexpected and undesirable effects on patient behavior, especially in complex neurological diseases. Such effects are easily missed when behavioral symptoms are not regularly reevaluated. This may be best illustrated by the case of Parkinson's disease, where the introduction of drug treatment with dopamine agonists led to impulse control disorders such as compulsive gambling, shopping, eating, and hypersexuality, caused by overstimulation of the mesolimbic dopaminergic system (Dodd et al., 2005; Witjas et al., 2012; Weintraub et al., 2013). However, these side effects were not recognized until years after the introduction of dopamine agonist therapies in combination with societal changes related to (the availability of) shopping, food consumption, sexuality, Internet, and gambling. This example illustrates that reassessment of risk factors is important to be able to provide effective treatment and guidance to patients in face of a changing environment.

Here, we will explore the disease profile of HD in relation to addiction, gambling problems, and decision-making deficits. In Section *HD: Neuropathology, Symptoms, and Progression*, progression of HD symptoms will be discussed in relation to disturbances in cortico-striatal circuits involved in task learning, sensitivity to punishment, and cognitive/impulse control. In Section *Risk Taking and Pathological Gambling Behavior in HD*, the neurobiological profile of HD patients will be discussed in the context of gambling and well-established risk-taking and decision-making tests, such as the Iowa Gambling Task (IGT) and the Cambridge Gambling Task (CGT). In Section *Discussion*, we will discuss how a characterization of gambling risks may lead to recommendations for HD patients and their caretakers on how to deal with this issue and which situations are best avoided. We also aim to identify yet unanswered questions, which may act as a starting point for future research into the occurrence and risks of gambling problems in HD patients.

### **HD: NEUROPATHOLOGY, SYMPTOMS, AND PROGRESSION**

#### **NEUROBIOLOGICAL DISEASE MECHANISMS**

HD is caused by an unstable CAG (trinucleotide; cytosine-adenine-guanine) repeat in the coding region of the HTT gene, which leads to the production of mutant huntingtin protein (Htt) with an expanded polyglutamine (polyQ) stretch (MacDonald et al., 1993). The number of trinucleotide repeats is inversely correlated with the age of onset of disease (Snell et al., 1993; Stine et al., 1993). The majority of HD patients have 40–55 repeats, which cause typical adult-onset disorder, while expansions of more than 70 repeats lead to juvenile-onset disorder. Individuals with fewer than 35 CAG repeats in the HTT gene will not develop HD. Although the exact mechanisms of HD pathogenesis remain unknown and cannot be discussed here in detail, they involve the formation of protein aggregates by polyQ-expanded Htt, as well as the interaction of mutant Htt with numerous proteins that are involved in energy metabolism, protein and vesicle transport, and regulation of gene transcription (Li and Li, 2004; Jones and Hughes, 2011). The resulting deregulation of these cellular processes eventually leads to neuronal degeneration through mechanisms involving excitotoxicity and apoptosis.
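The repeat-length ranges quoted above can be summarized in a small lookup. The following is only an illustrative sketch: the function name and category labels are hypothetical, and the ranges that the text does not characterize (35–39 and 56–70 repeats) are deliberately left unclassified rather than guessed. It is not a diagnostic tool.

```python
def classify_cag_repeats(n_repeats: int) -> str:
    """Map a CAG repeat count to the category named in the text.

    Hypothetical helper for illustration only; thresholds are taken
    directly from the paragraph above and nothing else.
    """
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats < 35:
        return "no HD expected"            # fewer than 35 repeats
    if 40 <= n_repeats <= 55:
        return "typical adult-onset range"  # majority of patients
    if n_repeats > 70:
        return "juvenile-onset range"       # more than 70 repeats
    # 35-39 and 56-70 repeats are not characterized in the text
    return "not specified in the text"


if __name__ == "__main__":
    for n in (20, 37, 45, 60, 80):
        print(n, "->", classify_cag_repeats(n))
```

Leaving the intermediate ranges unclassified keeps the sketch faithful to what the paragraph actually states.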

Neuronal degeneration is initially restricted to the basal ganglia, where the medium spiny neurons in the striatum (caudate nucleus and putamen) are specifically affected (Vonsattel and DiFiglia, 1998; Kassubek et al., 2004). The striatum receives its main excitatory (glutamatergic) input from cortical areas, while it receives its dopaminergic input from the substantia nigra. The striatum has two main inhibitory (GABA-ergic) outputs: a direct and an indirect pathway **(Figure 1A)**. Striatal neurons of the direct pathway project to the internal globus pallidus (GPi), which in turn has inhibitory projections to the thalamus. The thalamus gives rise to the main excitatory input to the cortex. Thus, in effect, activation of the direct striatal pathway inhibits GPi activity, which in turn disinhibits thalamocortical activity, thereby facilitating movement and cognitive functions. The indirect striatal pathway, on the other hand, projects to the external GP (GPe), which in turn sends inhibitory projections to the subthalamic nucleus (STN). The STN sends excitatory projections to the GPi. Accordingly, activation of the indirect striatal pathway disinhibits the STN, allowing it to activate the GPi, which in turn inhibits thalamocortical activity, suppressing movement and cognitive functions. Adaptive behavior results from a (delicate) balance of activity in the direct and indirect pathways. Pathology in the indirect pathway is key to HD and disrupts the balance in striatal control, resulting in a loss of inhibitory control over motor functioning and behavior (**Figure 1B**; Albin et al., 1989; Alexander and Crutcher, 1990).
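The net effect of each pathway on the cortex follows from multiplying the signs of the successive projections described above. The sketch below encodes that reasoning under a strong simplifying assumption: each projection is reduced to a sign (+1 excitatory, -1 inhibitory), with no firing rates or dynamics. The names `DIRECT`, `INDIRECT` and `net_effect` are illustrative and not taken from the literature.

```python
# Each projection is (source, target, sign):
#   -1 = inhibitory (GABA), +1 = excitatory (glutamate).
# Connections are those listed in the paragraph above.
DIRECT = [
    ("striatum", "GPi", -1),
    ("GPi", "thalamus", -1),
    ("thalamus", "cortex", +1),
]

INDIRECT = [
    ("striatum", "GPe", -1),
    ("GPe", "STN", -1),
    ("STN", "GPi", +1),
    ("GPi", "thalamus", -1),
    ("thalamus", "cortex", +1),
]


def net_effect(pathway):
    """Return +1 if activating the first node facilitates the last node,
    -1 if it suppresses it (product of the signs along the chain)."""
    sign = 1
    for _source, _target, projection_sign in pathway:
        sign *= projection_sign
    return sign


if __name__ == "__main__":
    print("direct pathway:  ", net_effect(DIRECT))    # facilitation
    print("indirect pathway:", net_effect(INDIRECT))  # suppression
```

In this simplified picture, the direct pathway yields a positive product (two inhibitory steps cancel), whereas the indirect pathway yields a negative product, which is why its loss in HD reduces inhibitory control over thalamocortical activity.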

Cortico-basal ganglia circuits, encompassing connections between cortical areas, striatal areas, pallidal areas and thalamic areas, are organized in a parallel fashion subserving different functions in the organization of behavior. As many excellent reviews exist on the anatomy and function of these circuits (e.g., Alexander et al., 1986, 1990; Alexander and Crutcher, 1990; Yin and Knowlton, 2006; Verny et al., 2007; Yin et al., 2008; Haber and Knutson, 2010; Sesack and Grace, 2010), we only highlight a few issues here conducive to our review. First, roughly speaking a dorsal to ventral topographical organization in both cortical and striatal areas exists. Thus, the dorsal prefrontal areas are associated with dorsal striatal areas while the more ventral prefrontal areas are associated with more ventral striatal areas (including the nucleus accumbens). Second, broadly three functionally different circuits may be described. The sensorimotor circuit encompasses the sensorimotor striatum (putamen) and sensorimotor cortices associated with the execution of motor behavior. The associative/cognitive control circuit involves the dorsolateral prefrontal cortex, anterior cingulate cortex, and associative striatum (caudate nucleus). This circuit is especially relevant for executive functioning, i.e., it is involved in cognitive control, planning and working memory. In addition it is involved in promoting long-term adaptive behavior by reinforcing or stopping (punishing) instrumental behavior, i.e., sequences of behavioral acts, learned in interaction with the environment (Kravitz et al., 2012; Paton and Louie, 2012). The limbic circuit includes the orbitofrontal cortex, ventromedial prefrontal cortex, amygdala, and limbic striatum (nucleus accumbens). 
This circuit is especially relevant for evaluating the affective value of stimuli, signaling the expected reward or punishment of an upcoming stimulus, choice or event, emotional control, and adaptive (emotional) learning (O'Doherty et al., 2001; Rushworth et al., 2007; van den Bos et al., 2013b, 2014).

**FIGURE 1 |** … the indirect pathway (X) in HD leads to a decrease in inhibitory control over cortical functions. GPe: external globus pallidus; GPi: internal globus pallidus; STN: subthalamic nucleus. Red: inhibitory (GABA) pathways, Blue: excitatory (glutamate) pathways.

Pathology in HD is observed in both the putamen and caudate nucleus (Vonsattel and DiFiglia, 1998; Kassubek et al., 2004; Vonsattel, 2008; Vonsattel et al., 2011; Hadzi et al., 2012). In both structures, atrophy follows a characteristic pattern, starting in the dorsal and caudal regions and moving towards the ventral and rostral regions as the disease progresses (Vonsattel and DiFiglia, 1998; Kassubek et al., 2004; Vonsattel, 2008). However, early atrophy has also been observed in the nucleus accumbens and globus pallidus in some studies (van den Bogaard et al., 2011; Sánchez-Castañeda et al., 2013). While disturbances in the sensorimotor circuit (putamen) may be related to the motor symptoms, disturbances in the associative/cognitive control circuit (caudate nucleus) may be related to executive dysfunction, and cause deficits in e.g., working memory in early HD patients (Lawrence et al., 1996; Bonelli and Cummings, 2007; Wolf et al., 2007). Disturbances in the limbic circuit, for example those due to early atrophy in the nucleus accumbens, may be related to apathy and depression (Bonelli and Cummings, 2007; Unschuld et al., 2012). Progressive atrophy in the striatum may lead to a successive dysfunction of cortico-striatal circuits. For instance, the ventral caudate nucleus is also part of the orbitofrontal circuit, which is affected as the disease progresses. Dysfunction of this circuit is related to behavioral disinhibition (Bonelli and Cummings, 2007). Eventually, degeneration may spread to other brain areas, including other parts of the basal ganglia (pallidal areas and thalamus), hippocampus, amygdala and cortical areas at the late stages of the disease.

In sum, HD is characterized by a specific degeneration of striatal neurons belonging to the indirect pathway. As the disease progresses, atrophy of the striatum spreads along a caudal-rostral and dorsal-ventral gradient causing a sequential disturbance of cortico-striatal circuits. The resulting loss of inhibitory control in these circuits is directly related to the progression of motor, cognitive and behavioral symptoms in HD, as discussed below.

#### **SYMPTOMS OF HD**

HD is characterized by a variety of progressive motor, cognitive and behavioral symptoms. The first symptoms usually arise in midlife, with an average onset age of 40, although a small percentage of patients suffer from juvenile-onset HD, which starts before the age of 20. As the symptoms and progression of juvenile-onset HD are somewhat distinct from adult-onset disorder, we will focus on the latter patient group in this review. One of the first symptoms to become apparent in HD is chorea (involuntary movement disorder), and a clinical diagnosis is usually made after onset of movement abnormalities (Shannon, 2011). Some studies, however, report subtle cognitive and emotional changes before onset of motor symptoms, and the exact order of occurrence and progression of HD symptoms remains a subject of debate. Nevertheless, several comprehensive reviews of the clinical manifestations of HD are available (Roos, 2010; Anderson, 2011; Shannon, 2011).

#### **Motor symptoms**

Motor symptoms start to become apparent in the early stages of HD, and are usually the first symptoms to be noticed in laboratory settings and by first-degree relatives of HD patients (de Boo et al., 1997; Kirkwood et al., 1999, 2001). Motor disturbances appear to begin as a dysfunction in error feedback control (Smith et al., 2000), consistent with the role of the cortico-striatal motor circuit in sensorimotor learning and control (Graybiel et al., 1994). The first signs of motor abnormalities are often subtle involuntary movements (chorea) of e.g., facial muscles, fingers and toes ("twitching"), hyperreflexia, and exaggerated voluntary movements (Young et al., 1986; Shannon, 2011), which lead to a general appearance of restlessness and clumsiness in early HD patients. These abnormal movements are subtle and often go unnoticed at first, but gradually worsen and spread to all other muscles over time. Other early motor symptoms include slow or delayed saccadic eye movements (Peltsch et al., 2008) and dysarthria (Ramig, 1986; Young et al., 1986). Dysarthria, a motor speech disorder, leads to difficulty with articulation and slurring of words, which makes speech progressively more difficult to understand. Dysphagia (swallowing difficulties) is observed in most patients with an onset at mid-disease stages, and gradually worsens until patients can no longer eat unassisted and often require a feeding tube in late-stage HD (Heemskerk and Roos, 2011). Other, non-choreic motor symptoms that usually become apparent at mid-stage disease include complex gait disorder, postural instability, and dystonia (involuntary muscle contractions that cause slow repetitive movements and abnormal postures), which is often accompanied by frequent falls (Koller and Trimble, 1985; Tian et al., 1992; Louis et al., 1999; Grimbergen et al., 2008). 
Rigidity and bradykinesia (slowness of movement and reflexes) are sometimes observed, but are mostly restricted to cases of juvenile-onset HD (Bittenbender and Quadfasel, 1962; Hansotia et al., 1968). These motor symptoms are consistent with dysfunction of the sensorimotor (and associative/cognitive control) cortico-striatal circuits that are commonly affected in HD.

#### **Behavioral and psychiatric symptoms**

Behavioral disorders in HD can be complex and difficult to classify, and their occurrence and onset are highly variable between individuals. Moreover, it can sometimes be difficult to distinguish behavioral disorders from normal coping with a distressing disease (Caine and Shoulson, 1983). The number of studies that have characterized behavioral symptoms in HD is limited, and as a result there is relatively little insight into their prevalence in the disease (van Duijn et al., 2007). The most frequently and consistently reported behavioral and emotional symptoms in HD are irritability, apathy, and depression, which occur with a prevalence of approximately 50% (Caine and Shoulson, 1983; Folstein and Folstein, 1983; Craufurd et al., 2001; Kirkwood et al., 2001; van Duijn et al., 2007, 2014; Tabrizi et al., 2009). Both irritability and apathy are sometimes observed in pre-manifest HD patients (Tabrizi et al., 2009; van Duijn et al., 2014), and depression has also been reported at early clinical stages (Shiwach, 1994; Julien et al., 2007; Epping et al., 2013). These affective symptoms are among the first non-motor symptoms to be noticed by first-degree relatives (Kirkwood et al., 2001). Typical apathy-related symptoms, which gradually become worse during the course of the disease, include lack of energy, motivation and initiative, decreased perseverance and quality of work, impaired judgment, poor self-care and emotional blunting (Craufurd et al., 2001; Kirkwood et al., 2001). Depressive symptoms have been related to increased activity in the ventromedial prefrontal cortex (Unschuld et al., 2012). Irritability is associated with orbitofrontal circuit dysfunction, which leads to decreased control over emotional responses in the amygdala (Klöppel et al., 2010).

Other, less commonly observed psychiatric symptoms and disorders in HD are anxiety, obsessive-compulsive disorder, mania, and schizophrenia-like psychotic symptoms such as paranoia, hallucinations, and delusions (Caine and Shoulson, 1983; Folstein and Folstein, 1983; Craufurd et al., 2001; Kirkwood et al., 2001; van Duijn et al., 2007). These symptoms usually do not occur until mid or late stages of the disease, although they have incidentally been reported to occur in preclinical HD patients (Duff et al., 2007). Obsessive-compulsive disorder has been associated with damage to the orbitofrontal cortex and anterior cingulate cortex, while schizophrenia, a disorder which involves deficits in organizing, planning and attention, is related to dorsolateral prefrontal cortex dysfunction (Tekin and Cummings, 2002).

It is suggested that most psychiatric symptoms in HD are in fact part of a broad, ill-defined "frontal lobe syndrome" or "executive dysfunction syndrome", which includes symptoms such as apathy, irritability, disinhibition, impulsivity, obsessiveness, and perseveration (Lyketsos et al., 2004; Rosenblatt, 2007), all of which are commonly observed in HD patients (Hamilton et al., 2003; Duff et al., 2010b). Taken together, the literature indicates that onset and progression of behavioral symptoms in HD is heterogeneous, with affective disorders occurring most often and with early onset, while anxiety, obsessive-compulsive disorder, and psychotic symptoms are less common and usually occur later in the disease. These psychiatric symptoms are associated with dysfunction of limbic and associative/cognitive control corticostriatal circuits that are commonly affected in HD.

#### **Cognitive symptoms**

Cognitive decline is another important aspect of HD pathology. Many studies have focused specifically on the occurrence of cognitive symptoms in preclinical and early clinical stages of HD, in the hope of discovering early clinical biomarkers of the disease (reviewed in Papp et al., 2011; Dumas et al., 2013). Overall, results suggest that subtle cognitive changes may be observed up to 5–10 years before onset of motor symptoms with sufficiently sensitive methods. One study even found that, at preclinical and early clinical stages of HD, about 40% of patients already meet the criteria for mild cognitive impairment (a disorder associated with limited memory loss, not meeting the criteria for diagnosis of dementia; Duff et al., 2010a). However, not all studies support these findings (Blackmore et al., 1995; Giordani et al., 1995; de Boo et al., 1997; Kirkwood et al., 2001). In general, the literature agrees that information processing and psychomotor speed are especially affected at this early stage (Rothlind et al., 1993; Kirkwood et al., 1999; Verny et al., 2007; Paulsen et al., 2008). Other commonly observed early cognitive impairments include problems with attention, (working) memory, and visuospatial performance (Jason et al., 1988; Rothlind et al., 1993; Foroud et al., 1995; Lawrence et al., 1996; Hahn-Barma et al., 1998; Verny et al., 2007; Paulsen et al., 2008; Tabrizi et al., 2009; Papp et al., 2011; Stout et al., 2011). Cognitive inflexibility has been observed in early disease patients (Jason et al., 1988), at which stage extra-dimensional shifts are specifically impaired, while reversal learning is still intact (Lawrence et al., 1996). Thus, patients are still able to reevaluate stimulus value and learn new stimulus-reward contingencies within the same dimension (e.g., shape or color), but have problems shifting their attention to a different dimension (e.g., from color to shape) as required by the new task rule to obtain reward. 
In later stages of the disease, cognitive inflexibility and perseveration also cause impaired reversal learning in HD patients (Josiassen et al., 1983; Lange et al., 1995). This progression of symptoms is consistent with specific dysfunction of the dorsolateral prefrontal circuit early in the disease, since extradimensional set shifting is mediated by the dorsolateral prefrontal cortex, while reversal learning is mediated by the orbitofrontal cortex (Dias et al., 1996; McAlonan and Brown, 2003). Other early impairments include disorganized behavior, impaired planning, poor judgment, and reduced behavioral and emotional control (Watkins et al., 2000; Paradiso et al., 2008; Duff et al., 2010b). Disinhibition has been observed in early HD patients, whose performance is impaired on tasks that require inhibition of prepotent but inappropriate responses (Holl et al., 2013). Finally, several studies have found that preclinical HD patients are impaired in the recognition of negative emotions such as anger, disgust, fear and sadness. Emotional recognition declines progressively, and can spread to problems with neutral emotions in early clinical stages of the disease (Johnson et al., 2007; Tabrizi et al., 2009; Labuschagne et al., 2013). This phenotype is related to dysfunction of the orbitofrontal cortex, which is involved in processing emotional and reward information (Henley et al., 2008; Ille et al., 2011).

Studies with animal models of HD show similar cognitive impairments to those observed in human patients. Although not all studies find robust cognitive deficits (Fielding et al., 2012), findings in rat and mouse models of HD include anxiety, increased responsiveness to negative emotional stimuli, and impairments in reversal learning and strategy shifting (Faure et al., 2011; Abada et al., 2013). One study found specific early deficits in reversal learning before onset of motor symptoms in a rat model of HD (Fink et al., 2012). Interestingly, HD animals appear to have an increased responsiveness to negative emotional stimuli, while human patients show decreased recognition of negative emotions. At present it is unclear whether this reflects differences in the tasks administered (recognizing emotions *versus* behavioral responses to threatening stimuli), species-related differences in the outcome of pathology, or a fundamental difference between the rat model and the human condition. In general, studies in both human patients and animal models of HD demonstrate that a wide range of cognitive functions can already be impaired in early HD. Early abnormalities mainly include deficits in attention, memory, cognitive flexibility, and emotional recognition. At this early stage, patients often have impaired awareness of their own (decline in) cognitive abilities (Hoth et al., 2007). Over time, cognitive symptoms progressively worsen, eventually leading to severe subcortical dementia in late stages of the disease. Although the occurrence of symptoms is generally consistent with successive impairment of the associative/cognitive control and limbic cortico-striatal circuits, respectively, specific functions that are related to the limbic circuit can also already be affected in early-stage HD.

#### **Conclusion**

Motor, behavioral and cognitive symptoms in HD have been studied extensively in the past, and continue to be a topic of interest due to the wide variety and variability in the occurrence and onset of these symptoms across patients. In general, behavioral and cognitive symptoms are related to three frontal behavioral categories: apathy, executive dysfunction, and disinhibition. The combination of these symptoms is sometimes referred to as "executive dysfunction syndrome". All of these symptoms are related to deficits in the cortico-striatal circuits involving the orbitofrontal cortex, dorsolateral prefrontal cortex and anterior cingulate cortex. As discussed above, neuropathological studies have observed a gradual degeneration of the striatum in a dorsal to ventral direction in HD patients. Although the behavioral and cognitive observations partly agree with a progressive impairment of cortico-striatal circuits, the symptomatic findings appear to be more diffuse than expected based on pathological observations. Onset and progression of behavioral and cognitive symptoms in HD is highly heterogeneous, indicating that damage to striatal regions may be more variable and widespread in early stages of HD than previously thought. This view is supported by evidence from several structural imaging studies (Thieben et al., 2002; Rosas et al., 2005; van den Bogaard et al., 2011).

### **RISK TAKING AND PATHOLOGICAL GAMBLING BEHAVIOR IN HD**

#### **PATHOLOGICAL GAMBLING**

While many people are able to gamble recreationally, gambling may become an overt problem for some, as they develop pathological forms of this behavior. Pathological gambling is characterized by an excessive urge to gamble despite clear negative financial, personal and professional consequences. It has recently been classified as an addiction in DSM-5, as it closely resembles substance abuse disorders in both diagnostic criteria and neuropathology (van Holst et al., 2010; Clark and Goudriaan, 2012). Pathological gambling will be the first and only "behavioral addiction" recognized within the category "*Addiction and Related Disorders*". Nevertheless, it should be noted that differences exist between addiction to psychoactive substances and addiction to gambling. First, the craving for psychoactive substances is satisfied by consuming a substance whose effect is known, whereas satisfying the craving for gambling has an uncertain outcome, as money may be won or not, unless it is the act of gambling itself, for instance as an exciting activity, that satisfies the craving. Thus, pathological gambling may be more heterogeneous in this respect, and its outcome more uncertain, than substance abuse. It should be noted that outcome variability, including both wins and losses, may be crucial to the development of gambling addiction, as it presents a variable intermittent pattern of reinforcement, which is the most powerful form of instrumental/classical conditioning (Sharpe, 2002; Fiorillo et al., 2003). Second, psychoactive substances may change activity in the brain and peripheral nervous system more strongly than gambling does, due to their direct pharmacological activity at several neurotransmitter systems, thereby accelerating addictive processes and making substance abuse a more powerful form of addiction.

The underlying neurobiological mechanisms of gambling are complex and involve many different brain regions and neurotransmitter systems (reviewed in Raylu and Oei, 2002; Goudriaan et al., 2004; Potenza, 2013). Predisposition to addiction has been related to a reduced level of dopamine D2 receptors in the striatum, which function in a feedback loop to inhibit further dopamine release. The resulting hyperactivity of dopaminergic pathways increases sensitivity to reward, motivation, and positive reinforcement of the addictive behavior (Volkow et al., 2002; Di Chiara and Bassareo, 2007). Specific motivational changes that occur when pathological gambling develops include increased motivation to gamble (van Holst et al., 2012) and enhanced attention to gambling-related stimuli (Brevers et al., 2011a,b). In addition, pathological gamblers have reduced cognitive control over behavior in general, as exemplified by decreased performance on response inhibition tasks, increased impulsivity, and a preference for immediate over delayed rewards in neurocognitive tasks (Goudriaan et al., 2004; Brevers et al., 2012a; van den Bos et al., 2013a).

Pathological gamblers perform poorly compared to controls on formal reward-related risky decision-making tasks (e.g., Cavedini et al., 2002; Brand et al., 2005; Brevers et al., 2012b; review: Brevers et al., 2013). This poor performance is independent of whether tasks contain explicit and stable rules for wins and losses, such as the Game of Dice Task (Brand et al., 2005), or whether subjects have to learn by trial-and-error which choices are advantageous in the long run, such as the IGT (Cavedini et al., 2002; Brevers et al., 2012b; see Section *Risky Decision-Making by HD Patients on Laboratory Tasks* for details of this task). However, gambling severity was correlated with performance on decision-making tasks in which the probability of outcome is unknown (IGT) rather than with performance on tasks with explicit rules (Brevers et al., 2012b). This observation is interesting because, in normal subjects, the second half of the IGT, when subjects have learned the task contingencies, is akin to tasks with explicit rules. Collectively, these data therefore suggest that in pathological gambling impairments in decision-making may result from both decreased executive control, which is related to more explicit rules, and disturbed reward-punishment (emotional) processing, which is more related to trial-and-error learning to assess the long-term value of options (van den Bos et al., 2013a, 2014). In addition, they suggest that disturbances in the latter may be a predisposing factor to escalation of gambling behavior.

From these studies it is clear that neurobiological predisposition for developing pathological gambling behavior involves disturbances in both the associative/cognitive control circuit and the limbic circuit (van den Bos et al., 2013a). As a result, pathological gamblers display reduced cognitive control, increased impulsivity, and increased sensitivity to reward, all of which are aspects of behavioral disinhibition (Iacono et al., 2008). The chance that an individual develops an addiction during his or her life, however, also depends on many other factors, such as early-life experiences and environmental risks.

#### **PATHOLOGICAL GAMBLING IN HD: EPIDEMIOLOGICAL EVIDENCE**

With the increasing number of possibilities offered by the Internet, there has also been a rise in both legal and illegal online gambling opportunities in recent years. These easily accessible and often uncontrolled gambling activities may pose a risk to anyone who has an increased susceptibility to gambling addiction but who might otherwise not become involved in such activities (Griffiths, 2003). HD patients are one of the groups for which Internet gambling may pose such a risk, because behavioral disinhibition, a common feature of the disease, is an important factor in the development of addictions (Iacono et al., 2008). Indeed, as mentioned above, HD patients show several signs of disinhibition, such as irritability, impaired response inhibition, and reduced emotional recognition, at an early stage in the disease. Other symptoms that have been observed in HD, and can influence patients' ability to make rational decisions, are cognitive inflexibility, perseveration, poor judgment, and reduced self-awareness. Besides these symptomatic similarities between HD patients and pathological gamblers, both groups display structural and functional abnormalities in similar cortico-striatal circuits.

In view of these similarities between pathological gamblers and HD patients, we may expect the incidence of gambling problems to be increased among HD patients compared to the normal population. Nevertheless, only one study so far has reported cases of pathological gambling, in an Italian family with HD (De Marchi et al., 1998). In this family, two individuals were diagnosed with pathological gambling around the age of 18, well before the onset of clinical signs of HD. Other epidemiological studies have not reported on this issue, although impaired decision-making, risk taking, and poor judgment have been shown to pose a risk for HD patients handling important life decisions and financial affairs (Klitzman et al., 2007; Shannon, 2011). Similarly, reports on related issues such as substance abuse and addiction to Internet use are missing in the current literature on HD pathology. At this moment, it is unclear whether the absence of reports of gambling problems in the HD literature is caused by a lack of attention to this phenomenon, or whether there really is no increased prevalence of pathological gambling among HD patients. Several reasons may explain why such problems have not been reported more frequently. Firstly, even if the incidence of pathological gambling is increased in HD, this will likely still only affect a small percentage of patients. In combination with the fact that the HD-affected population itself is limited in number, this may cause gambling problems to go unnoticed as a specific issue in this patient group. Secondly, the lack of gambling problems in HD may be related to the inability or unwillingness of patients to leave the house due to motor disorders and frequently observed signs of apathy and depression. Before the advent of Internet gambling, this may have kept HD patients from visiting public gambling places such as casinos.
Finally, adolescence appears to be a sensitive period for developing gambling problems (van den Bos et al., 2013a), while most HD patients do not start to show disinhibition-related symptoms until later in life. However, with the rise of Internet-related activities among adolescents, future HD patients may acquire forms of recreational behavior such as online gambling early on, which may then develop into a problem when HD symptoms become manifest later in life. Thus, while the environment in which gambling-susceptible HD patients find themselves may not have promoted such behavior in the past, it is clear that an increased accessibility and availability of gambling opportunities from the home may change the prevalence of related problems in the HD population.

#### **RISKY DECISION-MAKING BY HD PATIENTS ON LABORATORY TASKS**

Laboratory tasks are commonly used to assess cognitive and behavioral abnormalities in neurological disorders. To gain insight into the processes and impairments involved in decision-making and risk-taking behavior, several tasks have been developed, including the IGT (Bechara et al., 1994) and the CGT (Rogers et al., 1999). On the IGT, participants are presented with four decks of cards. They are instructed to choose cards from these decks, with which they can win or lose money; the goal of the task is to win as much money as possible. The decks differ from each other in the frequency and amount of wins and losses. Two of these are "bad" decks, leading to an overall loss in the long run, and two are "good" decks, leading to an overall gain. The participants are not given this information, however, and need to discover during the experiment which decks are most advantageous. Healthy participants successfully learn the rules of the task after a certain amount of sampling, and eventually start to prefer the two "good" decks. Nevertheless, there are significant individual differences in performance even among healthy participants, including for example clear sex differences (van den Bos et al., 2013b). On the CGT, participants are presented with a row of 10 boxes of two different colors, and need to make a probabilistic decision about which color of box hides a token. They must then gamble credit points on their confidence in this decision. In this task, all relevant information is presented to the participant during the experiment, and trials are independent, thus minimizing working memory and learning demands. Both gambling tasks are well established, and the IGT is accepted as a valid simulation of real-life decision-making (Buelow and Suhr, 2009), while the CGT is especially useful for studying decision-making outside a learning context.
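The deck structure of the IGT described above can be made concrete with a small simulation. The following is a minimal sketch rather than the task as administered: the payoff values (a fixed gain per card, with losses spread over each block of ten cards) are assumptions chosen only to reproduce the qualitative pattern of two "bad" and two "good" decks, and the names `DECKS`, `net_per_block` and `play` are hypothetical.

```python
import random

# Hypothetical payoff schedule: (gain per card, losses within each block of 10 cards).
# These values are illustrative assumptions, not taken from the text.
DECKS = {
    "A": (100, [150, 200, 250, 300, 350] + [0] * 5),  # bad: frequent losses
    "B": (100, [1250] + [0] * 9),                      # bad: one rare, large loss
    "C": (50, [25, 50, 50, 50, 75] + [0] * 5),         # good: frequent small losses
    "D": (50, [250] + [0] * 9),                        # good: one rare, moderate loss
}


def net_per_block(deck):
    """Net outcome of drawing one full block of 10 cards from a deck."""
    gain, losses = DECKS[deck]
    return gain * len(losses) - sum(losses)


def play(choices, seed=0):
    """Return the total winnings for a sequence of deck choices."""
    rng = random.Random(seed)
    remaining = {name: [] for name in DECKS}  # losses left in the current block
    total = 0
    for deck in choices:
        gain, losses = DECKS[deck]
        if not remaining[deck]:
            # Start a new block with the losses in a shuffled order.
            remaining[deck] = rng.sample(losses, len(losses))
        total += gain - remaining[deck].pop()
    return total


if __name__ == "__main__":
    for name in DECKS:
        print(name, net_per_block(name))
    print("only bad decks: ", play(["A", "B"] * 10))
    print("only good decks:", play(["C", "D"] * 10))
```

Under these assumed values, the "bad" decks pay more per card but lose money over each block of ten cards, while the "good" decks pay less per card but yield a net gain, which is the contingency participants have to discover by sampling.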

HD patients have been tested on both the Iowa and Cambridge Gambling Tasks. In a study with intermediate-stage patients, Stout et al. (2001) found that performance on the IGT was reduced compared to normal subjects. The difference in performance became apparent in the second part of the task; where subjects normally start to show a preference for the good decks, HD patients continued to make frequent selections from the bad decks. This suggests that HD patients either did not learn which decks were advantageous, or continued to choose cards from the bad decks despite this knowledge. The authors noted that several HD participants indicated that they knew some decks were disadvantageous but still continued to select cards from those decks, suggesting that HD patients can learn the rules of the task but are not able to enforce an advantageous selection pattern and resist responding to individual punishments and rewards. Nevertheless, reduced performance was found to be associated with impaired memory and conceptualization, leading the authors to speculate that HD patients may have trouble learning or remembering the long-term consequences of choosing cards from a particular deck. HD patients also scored higher on disinhibition than healthy controls, but this measure was not correlated with task performance. In a follow-up analysis of the same data, Stout and colleagues compared three cognitive decision models to explain the performance deficit of HD patients, and found that this was best explained by deficits in working memory and by increases in recklessness and impulsivity (Busemeyer and Stout, 2002). Impaired performance of HD patients on the IGT may also be related to a reduced impact of losses on these patients, which was found by measuring skin conductance responses during the IGT (Campbell et al., 2004).
This finding is consistent with impaired recognition of negative emotions in HD patients (Johnson et al., 2007; Ille et al., 2011), and suggests that they may be less sensitive to large punishments, and therefore less likely to turn away from the bad card decks. The second part of the IGT in particular requires the ability to suppress disadvantageous courses of action in response to punishments, while reinforcing profitable actions (de Visser et al., 2011; van den Bos et al., 2013b, 2014).
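To illustrate how a reduced impact of losses could keep a decision-maker from turning away from a bad deck, the following is a minimal sketch of a simple delta-rule learner in which a single weight controls how strongly losses count against a deck. It is an illustrative simplification and not a reproduction of the models compared by Busemeyer and Stout (2002); the parameter names `loss_weight` and `learning_rate`, and the example outcomes, are assumptions.

```python
def update_expectancy(expectancy, gain, loss, loss_weight, learning_rate=0.2):
    """Move a deck's expectancy toward the weighted value of the latest card.

    loss_weight lies in [0, 1]: 0.5 weighs gains and losses equally,
    values near 0 mean that losses barely register (an assumed
    stand-in for reduced sensitivity to punishment).
    """
    valence = (1 - loss_weight) * gain - loss_weight * loss
    return expectancy + learning_rate * (valence - expectancy)


def learn_deck(outcomes, loss_weight, learning_rate=0.2):
    """Return the expectancy for a deck after a sequence of (gain, loss) cards."""
    expectancy = 0.0
    for gain, loss in outcomes:
        expectancy = update_expectancy(
            expectancy, gain, loss, loss_weight, learning_rate
        )
    return expectancy


# Hypothetical "bad" deck: steady gains of 100 and one large loss per 10 cards.
bad_deck = [(100, 0)] * 9 + [(100, 1250)]

balanced = learn_deck(bad_deck, loss_weight=0.5)  # losses count as much as gains
blunted = learn_deck(bad_deck, loss_weight=0.1)   # losses are heavily discounted

if __name__ == "__main__":
    print("balanced learner:", round(balanced, 1))
    print("blunted learner: ", round(blunted, 1))
```

In this sketch the balanced learner ends with a negative expectancy for the bad deck, whereas the learner with blunted loss weighting still rates the same deck positively, which is one way of picturing continued selection from disadvantageous decks.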

A limited number of other studies have tested risky decision-making in early stages of HD, but did not find performance difficulties in these patients on either the IGT or the CGT (Watkins et al., 2000; Holl et al., 2013). Thus, it appears that impairments in decision-making and risk of gambling problems do not develop until intermediate stages of the disease. However, these studies did find impairments in tasks that required planning and inhibition of pre-potent responses in early HD patients. It thus appears that HD patients first develop subtle problems with inhibition, planning, emotional recognition, and working memory. In some patients this can already lead to problems with judgment and decision-making in early stages of the disease, but most HD patients do not have problems with risky decision-making tasks until they reach an intermediate stage of the disease.

#### **NEUROBIOLOGICAL MECHANISMS OF DECISION MAKING IN HD**

#### **Neurobiological pathways underlying normal decision-making processes in the IGT**

The neurobiological mechanisms underlying decision-making processes in the IGT have been well studied and described (see e.g., Bechara et al., 2000; Doya, 2008; de Visser et al., 2011; van den Bos et al., 2013b, 2014). Normal execution of this task requires an interaction between the limbic and associative/cognitive control cortico-striatal circuits. Activity in the limbic circuit is thought to be dominant during the first phase of the IGT, during which it is involved in exploratory behavior, responding to rewards and punishments, and learning the affective values of short- and long-term outcomes of decisions in the task (Manes et al., 2002; Clark and Manes, 2004; Fellows and Farah, 2005; Gleichgerrcht et al., 2010; de Visser et al., 2011; van den Bos et al., 2014). The associative/cognitive control circuit, on the other hand, is more important during the second part of the IGT, when it is necessary to suppress impulsive responses to rewards and punishments for long-term benefit, reinforce advantageous behavioral patterns and suppress disadvantageous patterns (Manes et al., 2002; Clark and Manes, 2004; Fellows and Farah, 2005; Gleichgerrcht et al., 2010; de Visser et al., 2011; van den Bos et al., 2014).

#### **Neurobiological abnormalities in IGT decision-making processes in HD**

Since decision-making processes in the IGT involve an interaction of limbic and associative/cognitive control cortico-striatal circuits, it is not surprising that HD patients are impaired in the performance of this task. One of the observations by Stout and colleagues is that the impact of loss on decision-making is reduced in HD patients (Campbell et al., 2004). This is consistent with findings that these patients are impaired in the recognition of negative emotions, and may be explained by disturbances in the orbitofrontal cortex (Ille et al., 2011). The orbitofrontal cortex is important for emotional processing, and is activated in normal subjects in response to punishments and rewards in a decision-making task (O'Doherty et al., 2001). Another finding by Stout et al. (2001) is that the performance of HD patients on the IGT is correlated with decreased conceptualization and long-term memory measures on the Mattis Dementia Rating Scale. A failure to learn or remember which decks are advantageous in the long term may be associated with decreased activity of the associative/cognitive control circuit, which is required for long-term planning and impulse control (Manes et al., 2002; Clark and Manes, 2004; Fellows and Farah, 2005; Gleichgerrcht et al., 2010). This is also consistent with specific deficits of the indirect pathway in HD, since a recent study shows that the indirect pathway is important for sensitivity to punishment in a reinforcement-learning task (Kravitz et al., 2012; Paton and Louie, 2012). Insensitivity to the future consequences of a decision may also be caused by ventromedial prefrontal cortex dysfunction, since similar insensitivity is observed in patients with damage to this prefrontal area (Bechara et al., 1994). Thus, decreased performance of HD patients on the IGT may be caused by a combination of dysfunctions in cortico-striatal circuits involving the orbitofrontal cortex, ventromedial prefrontal cortex and dorsolateral prefrontal cortex.
This leads to reduced responsiveness to punishment in the first phase of the task, and to a failure to learn which decks are advantageous in the long term, plan accordingly, and suppress impulsive responses in the second phase of the IGT.

#### **DISCUSSION**

#### **HD AND PATHOLOGICAL GAMBLING: WHAT ARE THE RISKS?**

The typical array of motor, emotional, and cognitive symptoms of HD is caused by progressive striatal atrophy that affects the different cortico-striatal circuits. Although onset and progression of behavioral and cognitive symptoms appear to be highly heterogeneous, motor and cognitive circuits are typically affected early in the disease, while the limbic circuit is affected at a later stage. Interestingly, neurobiological predisposition to pathological gambling and other addictions involves disturbances in the same cortico-striatal circuits that are affected in HD. Despite these striking similarities, however, in the medical literature HD has not been associated with pathological gambling or other addictive behaviors. Only one study so far has described a family in which gambling problems occurred in several HD-affected family members (De Marchi et al., 1998). We speculate that patients' motor symptoms, as well as their age and social environment, may thus far have prevented them from developing pathological gambling, despite their increased susceptibility to such problems. On the other hand, the frequently diagnosed depression may be expected to increase impulsivity and the risk of gambling problems, based on correlation studies (Clarke, 2006). Another explanation for the lack of observations of gambling problems in HD may be related to differences in underlying neuropathology. While the cognitive disturbances appear to be highly similar between pathological gamblers and HD patients, the emotional changes are of a different nature. Pathological gamblers mainly show an increased sensitivity to rewards, urging them to start and continue gambling. HD, on the other hand, has been associated with a decreased sensitivity to punishments and negative emotions. This difference may be an important reason why HD patients do not appear to have an increased tendency to start gambling or engage in other rewarding, addictive behaviors.

Nevertheless, disturbances in the limbic cortico-striatal circuit of HD patients may still promote risky decision-making in situations with uncertain outcome, as demonstrated in the IGT (Doya, 2008). Moreover, the combination of decreased sensitivity to punishment, failure to inhibit impulsive responses to immediate rewards, and inability to consider long-term delayed rewards and enforce advantageous behavioral patterns accordingly makes it likely that HD patients will develop gambling problems when they encounter a situation that promotes such behavior. Characteristic problems of HD patients with strategy shifting and symptoms of cognitive inflexibility and perseveration may contribute to the progression of pathological behavior in these situations. Thus, we propose that HD patients do not have an increased tendency, inherent to their neuropathology, to start gambling or other addictive behaviors, but that they do have an increased risk of developing an addiction once they engage in gambling. In accordance with this idea, it has been observed that frontal lesion patients become impulsive and often make poor decisions, but that they do not exhibit increased risk-taking behavior (Miller, 1992; Bechara et al., 2000). This suggests that impaired decision-making and risk-taking or risk-seeking behavior do not necessarily occur together, and that different combinations of limbic and associative/cognitive control circuit disturbances can have different effects on risky decision-making and gambling behavior. Our hypothesis would also explain why HD patients have not been observed to perform worse on the CGT. Since all information about chances and values of wins and losses is available up front in this task, HD patients may not develop disadvantageous strategies, because they are not actively seeking risks. However, this would need to be tested in patients with more advanced disease.

If HD patients indeed have an increased risk of developing pathological gambling behavior when presented with the appropriate situation, the rise of easily accessible Internet gambling opportunities may pose a specific risk for this patient group. Even if they do not actively seek out these situations, HD patients are now much more likely to come across gambling opportunities than they were in the past. This is especially true for patients who spend most of their time at home due to their symptoms, where the Internet may be an important means to occupy them. A higher probability of engaging in gambling behavior may therefore cause a disproportionate increase in related problems in the HD population. We suggest that caretakers should be aware of these possible risks, and preferably try to prevent HD patients from engaging in (online) gambling activities. Moreover, we argue that clinicians should regularly assess the risk and prevalence of gambling-related problems in the HD population, to be able to provide appropriate treatment and guidance to patients and caretakers.

#### **FUTURE DIRECTIONS**

Besides epidemiological studies to assess the prevalence of pathological gambling and other addictions in HD, several lines of research can be suggested to increase our understanding of the issues discussed in this paper. First of all, it would be interesting to link performance deficits on the IGT directly to disturbances in cortico-striatal activity in HD patients. To this end, HD patients' brain activation patterns can be studied with functional magnetic resonance imaging while they perform the IGT, and compared to activity in normal subjects. Activity in the striatum, dorsolateral prefrontal cortex and orbitofrontal cortex is expected to be decreased in HD patients during decision-making on the IGT. To study the behavioral and neurobiological aspects of gambling behavior in HD in more detail, currently available rodent disease models can be utilized. On a behavioral level, these animals can be expected to show decreased performance on the IGT, similar to human patients. Rodent versions of the IGT are available (review: de Visser et al., 2011) and the involvement of different neuronal structures in these models is well characterized (de Visser et al., 2011; van den Bos et al., 2013a, 2014). Therefore, such experiments are feasible, and can be combined with in-depth analysis of underlying neuronal changes in rodent models of HD using a variety of techniques. Furthermore, with the advent of more ecologically valid research methods and tools to assess the development of pathological behaviors, the risk for developing pathological gambling may be studied under (semi)natural conditions in both humans and animals (van den Bos et al., 2013a). Together, these studies of gambling-related symptoms and underlying neuropathology in both human patients and animal models of HD will provide us with a better understanding of the risks related to gambling, and possibly other addictive behaviors, in HD, and improve our ability to provide appropriate treatment and guidance.

#### **REFERENCES**


study of gene carriers. *J. Neurol. Neurosurg. Psychiatry* 64, 172–177. doi: 10.1136/jnnp.64.2.172


prefrontal cortex. *Psychiatry Res.* 203, 166–174. doi: 10.1016/j.pscychresns.2012.01.002


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 30 November 2013; paper pending published: 18 January 2014; accepted: 12 March 2014; published online: 02 April 2014.*

*Citation: Kalkhoven C, Sennef C, Peeters A and van den Bos R (2014) Risk-taking and pathological gambling behavior in Huntington's disease. Front. Behav. Neurosci. 8:103. doi: 10.3389/fnbeh.2014.00103*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 Kalkhoven, Sennef, Peeters and van den Bos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

### How central is dopamine to pathological gambling or gambling disorder?

#### *Marc N. Potenza\**

*Departments of Psychiatry, Neurobiology, and Child Study Center, Yale University School of Medicine, New Haven, CT, USA \*Correspondence: marc.potenza@yale.edu*

*Edited by:*

*Bryan F. Singer, University of Michigan, USA*

#### *Reviewed by:*

*Luke Clark, University of Cambridge, UK*

**Keywords: dopamine, gambling, addiction, PET, serotonin, glutamate, opioids**

Pathological gambling [PG—now termed "gambling disorder" in DSM-5 (APA, 2013; Petry et al., 2013)] is characterized by maladaptive patterns of gambling that are associated with significant impairments in functioning. Over the past decade, significant advances have been made in understanding the pathophysiology of PG (Potenza, 2013). Similarities between PG and substance-use disorders (Petry, 2006; Potenza, 2006; Leeman and Potenza, 2012) prompted the reclassification of PG in DSM-5 as an addictive disorder (rather than an impulse-control disorder, as was the case in DSM-IV).

Multiple neurotransmitter systems have been implicated in PG including serotonergic, noradrenergic, dopaminergic, opioidergic, and glutamatergic (Potenza, 2013). An understanding of these systems as they relate to PG is important clinically for drug development as presently there are no FDA-approved medications with indications for PG. Dopamine has long been implicated in substance addictions and early articles postulated a similarly important role for dopamine in PG (Potenza, 2001). However, a precise role for dopamine in PG remains unclear. Studies of cerebrospinal fluid samples indicated low levels of dopamine and high levels of dopamine metabolites in PG, raising the possibility of increased dopamine turnover (Bergh et al., 1997). However, medications that target dopamine function have not demonstrated clinical effects in PG. For example, medications that block dopamine D2-like receptor function (e.g., olanzapine) have shown negative results in small, randomized clinical trials (Fong et al., 2008; McElroy et al., 2008). Furthermore, a D2-like dopamine receptor antagonist widely used in the treatment of psychotic disorders (haloperidol) was found to increase gambling-related motivations and behaviors in individuals with PG (Zack and Poulos, 2007). However, administration of the pro-dopaminergic (and pro-adrenergic) drug amphetamine also led to increased gambling-related thoughts and behaviors in PG (Zack and Poulos, 2004).

Recent imaging studies have begun to use radioligands and positron-emission tomography to investigate dopamine function in PG. In contrast to findings in cocaine dependence in which between-group differences were observed in [11C]raclopride-binding in the striatum, similar levels were observed in PG and comparison subjects by two investigative groups (Linnet et al., 2010, 2011; Clark et al., 2012). Similarly, no difference between PG and comparison subjects was observed using [11C]raclopride or the D3-preferring agonist-radioligand [11C]-(+)-propyl-hexahydro-naphthooxazin (PHNO) (Boileau et al., 2013). However, in these studies, relationships with mood-related or generalized impulsivity, disadvantageous decision-making or problem-gambling severity were reported, suggesting that dopamine function may relate to specific aspects of PG (Potenza and Brody, 2013). These findings are consistent with the idea that PG represents a heterogeneous condition and that identifying biologically relevant individual differences or subgroups may help advance treatment development or the appropriate targeting of therapeutic interventions.

A now well-documented association between dopamine and PG exists in Parkinson's disease (PD) (Leeman and Potenza, 2011). Specifically, dopamine agonists (e.g., pramipexole, ropinirole) have been associated with PG and excessive or problematic behaviors in other domains (relating to sex, eating, and shopping) in individuals with PD (Weintraub et al., 2010). Furthermore, levodopa dosing has also been associated with these conditions in PD (Weintraub et al., 2010). However, factors seemingly unrelated to dopamine (e.g., age of PD onset, marital status and geographic location) have also been associated with these conditions in PD (Voon et al., 2006; Weintraub et al., 2006, 2010; Potenza et al., 2007), highlighting the complicated etiologies of these disorders. Nonetheless, in a study using [11C]raclopride, individuals with PD and PG as compared to those with PD alone demonstrated in the ventral (but not dorsal) striatum diminished D2-like binding at baseline and greater [11C]raclopride displacement during a gambling/decision-making task (suggesting greater dopamine release in the PG group during task performance) (Steeves et al., 2009). These findings are reminiscent of those suggesting blunted levodopa-induced displacement of [11C]raclopride in the ventral but not dorsal striatum in PD subjects who self-administer dopamine-replacement therapies to excess (as compared to those who do not) (Evans et al., 2006). As other findings have identified relatively reduced signal in the ventral striatum at baseline and during risk-taking in individuals with PD and behavioral addictions (vs. those with PD alone) (Rao et al., 2010), a question arises as to whether dopamine might relate to these processes in PD. Similar questions exist about the relatively blunted ventral striatal activation seen in non-PD PG in non-ligand-based imaging during simulated gambling (Reuter et al., 2005) and monetary reward processing (Balodis et al., 2012a; Choi et al., 2012).
Although multiple studies have found blunted ventral striatal activation during the monetary-reward-anticipation phase (particularly during the performance of Monetary Incentive Delay tasks) across multiple addictive disorders [e.g., alcohol-use (Wrase et al., 2007; Beck et al., 2009) and tobacco-use (Peters et al., 2011) disorders] and other conditions characterized by impaired impulse control [e.g., binge-eating disorder (Balodis et al., 2013, in press)], other studies have found relatively increased ventral striatal activation during reward processing in individuals with PG and those with other addictions (Hommer et al., 2011; van Holst et al., 2012a), further raising questions about how striatal function contributes precisely to PG and addictions and how dopamine may be involved in these processes (Balodis et al., 2012b; Leyton and Vezina, 2012; van Holst et al., 2012b).

Although much of the radioligand-related data described above investigate D2/D3 receptor function, other dopamine receptors warrant consideration in PG. For example, on a rodent slot-machine task, the D2-like receptor agonist quinpirole enhanced erroneous expectations of reward on near-miss trials, and this effect was attenuated by a selective D4 (but not D3 or D2) dopamine receptor antagonist (Cocker et al., 2013). These preclinical findings complement human studies that suggest a role for the D4 dopamine receptor in gambling behaviors. For example, allelic variation at the gene coding for the D4 dopamine receptor has been associated with differential responses to levodopa-related increases in gambling behaviors (Eisenegger et al., 2010). These findings complement a larger literature linking the D4 dopamine receptor to impulsivity-related constructs and disorders like attention-deficit/hyperactivity disorder, albeit somewhat inconsistently (Ebstein et al., 1996; Gelernter et al., 1997; DiMaio et al., 2003). As preclinical (Fairbanks et al., 2012) and human (Sheese et al., 2012) data suggest gene-by-environment interactions involving the gene encoding the D4 dopamine receptor and aspects of impulsive or poorly controlled behaviors, further research should examine a role for the D4 dopamine receptor in PG, particularly in studies employing careful assessments of environmental and genetic factors. Although several D4-preferring/selective agonist compounds (e.g., PD-168,077 and CP-226,269) have been used in preclinical studies to study D4 receptors, additional research is needed to study human D4 dopamine receptors, as might be accomplished through positron-emission-tomography studies; this represents an important line of future research (Bernaerts and Tirelli, 2003; Tarazi et al., 2004; Basso et al., 2005).
Additionally, as the D1 dopamine receptor has been implicated in addictions like cocaine dependence (Martinez et al., 2009), a role for the D1 dopaminergic system in PG warrants exploration.

The above findings indicate that our understanding of how dopaminergic function may contribute to PG and other addictions is currently at an early stage. Current data suggest that individual variability in dopamine function may obscure differences between PG and non-PG populations, with arguably the strongest between-group differences to date observed in a group with dopaminergic pathology (PD). The individual characteristics (e.g., impulsivity, decision-making and gambling-related behaviors) linked to dopamine function in PG and non-PG subjects also warrant consideration from a clinical perspective and suggest that these might represent novel treatment targets that link particularly closely to biological function [raising the possibility that they may be particularly amenable to targeting with medications (Berlin et al., 2013)]. Additionally, other potential endophenotypes like compulsivity (Fineberg et al., 2010, in press) warrant consideration given their preliminary links to treatment outcome in PG (Grant et al., 2010). Furthermore, systems that may regulate dopamine function warrant further consideration in treatment development. For example, in randomized clinical trials, opioid antagonists like nalmefene and naltrexone have been found to be superior to placebo in treating PG (Grant et al., 2006, 2008b), particularly amongst individuals with strong gambling urges or familial histories of alcoholism (Grant et al., 2008a). Similarly, glutamatergic systems warrant consideration in this regard (Kalivas and Volkow, 2005), with preliminary data linking the nutraceutical N-acetylcysteine to positive treatment outcome in PG (Grant et al., 2007). As dissecting the dopamine system is providing insight into PG, similar approaches should be used to investigate serotonin function in PG (Potenza et al., 2013), particularly given inconsistent findings with serotonergic medications in the treatment of PG (Bullock and Potenza, 2012).
A systematic approach to investigating the neurobiology and clinical characteristics of PG should help advance prevention and treatment strategies for PG.

#### **DISCLOSURES**

Dr. Marc N. Potenza has no financial conflicts of interest with respect to the content of this manuscript and has received financial support or compensation for the following: Dr. Marc N. Potenza has consulted for and advised Boehringer Ingelheim, Ironwood, and Lundbeck; has consulted for and has financial interests in Somaxon; has received research support from Mohegan Sun Casino, the National Center for Responsible Gaming, Forest Laboratories, Ortho-McNeil, Oy-Control/Biotie, Psyadon, Glaxo-SmithKline, the National Institutes of Health and Veteran's Administration; has participated in surveys, mailings or telephone consultations related to drug addiction, impulse control disorders or other health topics; has consulted for law offices and the federal public defender's office in issues related to impulse control disorders; provides clinical care in the Connecticut Department of Mental Health and Addiction Services Problem Gambling Services Program; has performed grant reviews for the National Institutes of Health and other agencies; has guest-edited journal sections; has given academic lectures in grand rounds, CME events and other clinical or scientific venues; and has generated books or book chapters for publishers of mental health texts.

#### **ACKNOWLEDGMENTS**

This study was funded by the National Institute on Drug Abuse (NIDA) grant P20 DA027844, National Institute on Alcohol Abuse and Alcoholism grant RL1 AA017539, Connecticut Department of Mental Health and Addiction Services, Connecticut Mental Health Center, and the National Center for Responsible Gaming's Center of Excellence in Gambling Research at Yale University.


*Received: 15 October 2013; accepted: 02 December 2013; published online: 23 December 2013.*

*Citation: Potenza MN (2013) How central is dopamine to pathological gambling or gambling disorder? Front. Behav. Neurosci. 7:206. doi: 10.3389/fnbeh.2013.00206*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2013 Potenza. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### A new approach to assess gambling-like behavior in laboratory rats: using intracranial self-stimulation as a positive reinforcer

#### *Stephanie E. Tedford<sup>1,2</sup>\*, Nathan A. Holtz<sup>1,2</sup>, Amanda L. Persons<sup>1,2†</sup> and T. Celeste Napier<sup>1,2,3</sup>*

*<sup>1</sup> Department of Pharmacology, Center for Compulsive Behavior and Addiction, Rush University Medical Center, Chicago, IL, USA*

*<sup>2</sup> Department of Pharmacology, Rush University Medical Center, Chicago, IL, USA*

*<sup>3</sup> Department of Psychiatry, Rush University Medical Center, Chicago, IL, USA*

#### *Edited by:*

*Patrick Anselme, University of Liège, Belgium*

#### *Reviewed by:*

*Christelle Baunez, Centre National de la Recherche Scientifique, France*

*Yueqiang Xue, The University of Tennessee Health Science Center, USA*

#### *\*Correspondence:*

*Stephanie E. Tedford, Department of Pharmacology, Center for Compulsive Behavior and Addiction, Rush University, 1735 W. Harrison St., Suite 463, Chicago, IL 60612, USA e-mail: stephanie\_e\_tedford@rush.edu*

*†Previously published as A. L. Mickiewicz.*

Pathological gambling is one manifestation of impulse control disorders. The biological underpinnings of these disorders remain elusive and treatment is far from ideal. Animal models of impulse control disorders are a critical research tool for understanding this condition and for medication development. Modeling such complex behaviors is daunting, but by its deconstruction, scientists have recapitulated in animals critical aspects of gambling. One aspect of gambling is cost/benefit decision-making wherein one weighs the anticipated costs and expected benefits of a course of action. Risk/reward, delay-based and effort-based decision-making all represent cost/benefit choices. These features are studied in humans and have been translated to animal protocols to measure decision-making processes. Traditionally, the positive reinforcer used in animal studies is food. Here, we describe how intracranial self-stimulation can be used for cost/benefit decision-making tasks and overview our recent studies showing how pharmacological therapies alter these behaviors in laboratory rats. We propose that these models may have value in screening new compounds for the ability to promote and prevent aspects of gambling behavior.

**Keywords: cost/benefit decision-making, discounting, effort-based decision-making, gambling, intracranial self-stimulation**

#### **INTRODUCTION**

Problem or maladaptive gambling, including the extreme condition termed pathological gambling, is characterized by behaviors that often persist over extended periods. Problem gambling can have a significant negative impact on personal, professional and financial well-being. In the last two decades, gambling opportunities have increased through changes in legislation and the introduction of new venues (e.g., internet gambling). Accordingly, the prevalence of problem gambling has been on the rise. There are no FDA-approved treatments for this disorder, and thus, it is critical to better understand these behaviors in order to develop efficacious therapies.

Problem gambling is a complex phenomenon, which includes increased levels of impulsive decision-making (Alessi and Petry, 2003; Dixon et al., 2003; Holt et al., 2003; Kraplin et al., 2014) that stem from disadvantageous evaluations of costs and benefits. Clinical assessments of decision-making, which often employ survey and interactive computer-based tools, have been instrumental in determining suboptimal decision-making profiles in various clinical populations, including pathological gamblers (Ledgerwood et al., 2009; Madden et al., 2009; Michalczuk et al., 2011; Petry, 2011; Miedl et al., 2012). Clinical assessments are frequently based on three differing, albeit overlapping, aspects of cost/benefit decision-making: (i) the amount of risk in obtaining a reward (risk/reward decision-making), (ii) a delay experienced before reward delivery (delay-based decision-making), and (iii) the amount of effort required to obtain a reward (effort-based decision-making). Several tasks have been developed to measure these critical features of suboptimal decision-making to further understand processes that comprise problem gambling. In these tasks, the subject chooses between a small and large reward, each associated with specific response contingencies. In risk/reward decision-making (i.e., probability discounting), subjects choose between a small reward delivered consistently at high probabilities (e.g., 100% probability of receiving \$10) and a large reward delivered at varying probabilities (e.g., 10–80% probability of receiving \$100). In clinical and preclinical studies, the absence of an expected reward is an aversive event which elicits corresponding physiological responses (Douglas and Parry, 1994; Papini and Dudley, 1997).
Preference for the larger, "risky" option over the small, certain option is considered to reflect suboptimal risk/reward decision-making, and has been reported for several human pathologies that display enhanced impulsivity (Reynolds et al., 2004; Rasmussen et al., 2010; Dai et al., 2013). In delay-based decision-making (i.e., delay discounting, a measure of impulsive choice), the small reward is delivered soon after the option is selected, whereas the large reward is delivered following a variable delay (e.g., \$10 now or \$100 in 2 weeks). Individuals who exhibit high impulsivity prefer immediately available rewards (even if smaller) over delayed rewards (even if larger), although the latter option may be more beneficial to the individual (Crean et al., 2000; Reynolds et al., 2004; Bickel et al., 2012). In effort-based decision-making, the subject chooses between a small reward delivered following small amounts of effort, or a large reward delivered after a greater amount of effort has been exerted. In this task, individual preference for the high effort/large reward option and the "point" at which the individual switches to the low effort/small reward option are determined. Studies of effort-based decision-making in human gamblers have yet to be conducted, but would be of significant interest to assess cognitive function in this population.
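The delay and probability contingencies described above are often summarized with hyperbolic discounting functions. The following is a minimal sketch under that assumption; the hyperbolic forms, the function names, and the discounting parameters `k` and `h` are illustrative choices and are not taken from the studies cited here.

```python
def delay_discounted_value(amount, delay, k):
    """Subjective value of a delayed reward, assuming V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def probability_discounted_value(amount, probability, h):
    """Subjective value of an uncertain reward, assuming V = A / (1 + h * theta),
    where theta = (1 - p) / p is the odds against receiving the reward."""
    odds_against = (1.0 - probability) / probability
    return amount / (1.0 + h * odds_against)


def prefers_large(small_value, large_value):
    """True when the (discounted) large reward is valued above the small one."""
    return large_value > small_value


# Delay example from the text: $10 now vs. $100 in 2 weeks (14 days).
# The k values below are hypothetical: a shallow and a steep discounter.
shallow = delay_discounted_value(100, 14, k=0.01)
steep = delay_discounted_value(100, 14, k=1.0)
print(prefers_large(10, shallow), prefers_large(10, steep))
```

In this sketch a larger `k` stands for steeper delay discounting, so the hypothetical steep discounter values \$100 in 2 weeks below \$10 now, whereas the shallow discounter does not.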

Decision-making protocols used in clinical assessments can be modified to study decision-making in laboratory rats, and these models are critical for exploring the behavioral and neuropharmacological aspects of pathological gambling. In rats, decision-making can be assessed by placing the animal in an operant conditioning chamber, and allowing the animal to choose between two levers (or two nose-poke hoppers) that are made available at the same time. The established reward modality for the positive reinforcer in these rodent tasks is food (Stopper and Floresco, 2011; Eubig et al., 2014). We discuss here a novel method used in our laboratory which employs direct electrical stimulation of brain reward pathways (intracranial self-stimulation; ICSS) to assess cost/benefit decision-making in rats and the contribution of monoaminergic neurotransmitters in decision-making (Rokosik and Napier, 2011, 2012; Tedford et al., 2012; Persons et al., 2013).

#### **INTRACRANIAL SELF-STIMULATION**

An operant reinforcer is a stimulus that, when made dependent upon some action, increases the likelihood of the recurrence of that action. Intracranial self-stimulation (ICSS) is an operant behavior in which animals self-administer electrical stimulation to brain regions known to be involved in positive reinforcement. ICSS was first studied in the 1950s when James Olds and Peter Milner (Olds and Milner, 1954) determined that rats would repeatedly return to a location in a box where they received electrical stimulation to reward-related regions in the brain. They allowed rats to work for this electrical brain stimulation (EBS) by responding on an operant manipulandum (e.g., pressing a lever, spinning a wheel) (Olds and Milner, 1954). The discovery of this technique has been instrumental in mapping reward pathways throughout the brain, and while there are many regions of the brain that can be used to support ICSS (Olds and Milner, 1954; Wise and Bozarth, 1981; Wise, 1996), it is well-documented that stimulation of the medial forebrain bundle (MFB) promotes profound and reliable behavioral outputs (Corbett and Wise, 1980; Pirch et al., 1981; McCown et al., 1986; Tehovnik and Sommer, 1997). Stimulation current parameters can be manipulated to affect the reinforcing value of the EBS and therefore alter ICSS behavior. These parameters include the intensity (i.e., amperes) of the electrical current and the current frequency (i.e., hertz). Elevation of either parameter typically results in increased excitation of the reward-relevant neurons being stimulated, either by increasing the number of neurons engaged by the stimulation (amperes) (Keesey, 1962; Wise et al., 1992) or by increasing the frequency at which a set population of neurons fires (hertz) (Wise and Rompre, 1989; Wise, 2005). Manipulations of current intensity alter the number of neurons activated, i.e., larger current intensities affect a wider population of neurons than smaller currents.
Thus, when this parameter is kept constant, the population of neurons excited by EBS is relatively similar regardless of current frequency. The stimulation parameter variable of choice for these protocols is current frequency, as this selection allows us to manipulate the firing rate of the same group of neurons with minimal effects on the time or space of stimulation integration. By manipulating these EBS parameters, we have developed sophisticated models of cost/benefit decision-making that employ ICSS (Rokosik and Napier, 2011, 2012; Tedford et al., 2012; Persons et al., 2013). This application represents a radical departure from the traditionally used reinforcing stimulus (i.e., food) in tasks assessing decision-making in rodents. ICSS may provide several experimental advantages over traditional reinforcement methods. To facilitate operant responding for food, daily intake is often restricted (Feja and Koch, 2014; Hosking et al., 2014; Mejia-Toiber et al., 2014). This practice can confound outcome measures, as there is substantial overlap in the neurobiological systems that are altered during chronic food restriction and those that mediate impulsive decision-making (Schuck-Paim et al., 2004; Minamimoto et al., 2009). Additionally, animals reinforced with food become increasingly satiated throughout a session, which decreases the value of food reinforcement (Bizo et al., 1998), although this effect may be dependent on reinforcer size (Roll et al., 1995). In contrast to food reinforcement, the reinforcer value of the EBS remains stable throughout a session, allowing for more extensive and consistent behavioral assessments (Trowill et al., 1969). This feature allows for testing sessions to occur repeatedly throughout a day, which can be beneficial when studying the effects of pharmacological therapies, specifically chronic drug treatment. 
Our published probability discounting studies (discussed below) were conducted several times a day throughout chronic dopamine agonist (pramipexole) treatments. We propose that this procedural benefit better reflects the human condition, in which dopamine agonist therapy is chronic, and thus enhances the translational value of the findings. To date, similar studies assessing dopamine agonist effects on impulsive decision-making using food reward have only assessed acute drug treatments (St Onge and Floresco, 2009; Zeeb et al., 2009; Madden et al., 2010; Johnson et al., 2011; Koffarnus et al., 2011) and it will be of significant interest to compare the behavioral outcomes following both acute and chronic drug treatment between these different reinforcers. While ICSS provides several advantages over food reinforcement, ICSS also presents several disadvantages. For example, ICSS requires invasive brain surgery and recovery, and ill-fitted head stages can result in loss of subjects throughout the behavioral paradigm. Despite these drawbacks, we hold that ICSS is a viable alternative to food reinforcement and presents considerable advantages over food reinforcement in these behavioral tasks.

Cost/benefit decision-making tasks require choices to be made between options associated with varying reward magnitudes. Accordingly, reinforcers used in these tasks should demonstrate the ability to produce such changes in reward magnitude and subsequently rats must be able to discriminate between the small reinforcer (SR) and large reinforcer (LR) option. In procedures that use food reinforcement, this is achieved by altering the number of food pellets obtained after a response. In ICSS, the EBS can be varied by changing stimulation current intensity or current frequency. **Figure 1** illustrates lever-press responding obtained when current intensity is varied (i.e., current frequency was held constant; **Figure 1A**) or when current frequency is varied (i.e., current intensity was held constant; **Figure 1B**). When either parameter is altered, rats exhibit moderate lever pressing for small EBS values and show increased lever-pressing rates for large EBS values, suggesting that the reinforcer value of the larger stimulation is greater (independent of whether current intensity or frequency is manipulated). EBS can therefore be tailored for the small and large reinforcer necessary for cost/benefit decision-making protocols. These reinforcer values can be determined in individual rats by generating stable lever-pressing rate response curves for each animal (Rokosik and Napier, 2011, 2012). Alternatively, a population curve can be generated from a group of rats from which a standardized SR and LR value can be determined (Tedford et al., 2012; Persons et al., 2013). This latter approach provides a more time-efficient and yet reliable means to derive the SR and LR. In a second series of studies, we used either manipulations of current intensity or frequency to establish SR/LR values in a probability discounting task (i.e., risk/reward decision-making). Changes in current intensity reinforcer values (i.e., current frequency was held constant) and current frequency values (i.e., current intensity was held constant) both produce significant discounting behavior in rats (**Figures 1C,D**). Based in part on the steepness of the discounting curve, current frequency was determined to be the appropriate parameter for manipulating reinforcement values. Once it is established that rats can distinguish between the standardized current frequencies used for the SR and LR, they can be tested in any one of our ICSS-mediated decision-making paradigms: (i) risk/reward decision-making (Rokosik and Napier, 2011, 2012), (ii) delay-based decision-making (Tedford et al., 2012), or (iii) effort-based decision-making (Persons et al., 2013).

#### **VALIDATING THE USE OF ICSS TO EVALUATE MEASURES OF IMPULSIVITY AND DECISION-MAKING**

The development of new animal models requires careful consideration regarding validity. Thus, in designing these ICSS-mediated decision-making tasks, we have strived to verify face and construct validity, and to ascertain the likelihood for predictive validity.

Face validity refers to the extent to which a test subjectively appears to measure its intended phenomenon. The design of each ICSS-mediated decision-making task was based on current protocols employed in humans for delay and probability discounting (Rasmussen et al., 2010; Leroi et al., 2013) and other effort-based decision-making tasks (Treadway et al., 2009; Buckholtz et al., 2010; Wardle et al., 2011). In humans, measures of cost/benefit decision-making are derived from asking individuals to select between several options available with specific contingencies placed on each selection (i.e., risk, delay, or effort). We emulate this scenario by presenting rats with two simultaneously extended levers, wherein a selection of either lever is associated with small or larger rewards that are also delivered under particular parameters of contingency. Thus, each of our ICSS-mediated decision-making tasks demonstrates face validity.

Construct validity refers to the ability of the paradigm to accurately assess what it proposes to measure. In risk/reward and delay-based decision-making, preference for the large reward is decreased as the probability of delivery is lowered, or the delay toward reward delivery is increased, respectively. In effort-based decision-making, individuals demonstrate initial preference for the high effort/large reward option when the effort associated with the large reward is deemed reasonable. A shift in preference to the low effort/small reward is observed when the high effort is no longer worth the energy expenditure. It is well-documented that rodents exhibit similar patterns of risk/reward, delay-based and effort-based decision making compared to humans (Rachlin et al., 1991; Buelow and Suhr, 2009; Jimura et al., 2009), and we have observed these profiles in each of our tasks (Rokosik and Napier, 2011, 2012; Tedford et al., 2012; Persons et al., 2013) (for example, see **Figure 2**).
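The shift in preference described for effort-based decision-making can be expressed as a switch point. The sketch below is only an illustration: the 50% criterion, the function name, and the example effort levels and choice percentages are hypothetical and are not values reported in the cited studies.

```python
def switch_point(effort_levels, percent_high_effort_choice, criterion=50.0):
    """Return the lowest effort requirement at which selection of the
    high effort/large reward option falls below `criterion` percent,
    or None if preference never drops below the criterion."""
    for effort, pct in sorted(zip(effort_levels, percent_high_effort_choice)):
        if pct < criterion:
            return effort
    return None


# Hypothetical session: effort is the number of lever presses required
# for the large reward; percentages are made up for illustration.
efforts = [1, 5, 10, 20]
percent_large = [95.0, 80.0, 45.0, 20.0]
print(switch_point(efforts, percent_large))
```

Under these assumptions, a treatment that reduces preference for the high effort/large reward option would appear as a switch point at a lower effort requirement.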

**FIGURE 2 | Effects of pramipexole on risk/reward decision-making using a probability discounting task.** Chronic (±)PPX decreases discounting in PD-like **(A)** and sham control **(B)** rats. Briefly, PD-like (*n* = 11) and sham control (*n* = 10) rats were trained in the probability discounting task using ICSS. Probabilities associated with delivery of the large reinforcer (LR) were presented in a pseudo-randomized order. Once stable behavior was observed, rats were treated chronically with twice daily injections of 2 mg/kg (±)PPX for 13 days. Data shown were collected at the time point at which we observed the peak effect on the final day of treatment (i.e., 6 h post injection) and are compared with the pretreatment baseline (BL). Shown is the percent selection of the LR (i.e., free-choice ratio) vs. the probability that the LR was delivered. A Two-Way rmANOVA with *post hoc* Newman-Keuls revealed significant increases in % selection of the uncertain LR following chronic PPX treatment (\**p* < 0.05) for both PD-like and sham rat groups. Although the group averages indicate a PPX-induced increase in suboptimal risk/reward decision-making, two rats in each group showed less than a 20% increase from baseline at the lowest probability tested; therefore, some rats appeared to be insensitive to the ability of the drug to modify probability discounting. Figure modified from Rokosik and Napier (2012) and reprinted with permission from the publisher.

Predictive validity refers to the ability of models to foresee future relationships, and we posit that our models can be used to predict the capacity of novel pharmacological treatments to alter cost/benefit decision-making. That is, by demonstrating proof-of-concept through replicating the effects of pharmacological agents on decision-making behaviors that have already been established in humans, we propose that our models may be efficacious in predicting how other drugs may mediate these behaviors in the clinic. For example, a subset of patients with Parkinson's disease (PD) who are treated with dopamine agonist therapies demonstrate an increased prevalence of gambling behavior (Weintraub et al., 2010) and increased discounting in delay-based decision-making (Housden et al., 2010; Milenkova et al., 2011; Voon et al., 2011; Leroi et al., 2013; Szamosi et al., 2013). Thus, our laboratory set out to model PD in rats and study the effects of pramipexole, a commonly employed dopamine agonist associated with gambling behaviors (Weintraub et al., 2010), on cost/benefit decision-making in the rat using the probability discounting task (risk/reward decision-making) (Rokosik and Napier, 2012).
To do so, rats were rendered "PD-like" by selective lesioning of dopaminergic terminals within the dorsolateral striatum *via* bilateral infusions of 6-OHDA, while control rats received infusions of the 6-OHDA vehicle (Rokosik and Napier, 2012). Neurons in the dorsolateral striatum of only the 6-OHDA treated rats show a decrease in tyrosine hydroxylase (Rokosik and Napier, 2012), a marker of dopamine. PD-like rats exhibit motor disturbances similar to humans with early-stage PD, which can be reversed dose-dependently with pramipexole treatment. The dose of pramipexole we administered to study risk/reward decision-making alleviates motor deficits, and thus is therapeutically relevant (Rokosik and Napier, 2012). While we find no difference in baseline "risky" behavior between control rats and PD-like rats, chronic pramipexole treatment increases selection of the risky LR in both groups of rats when probabilities of delivery were small (**Figures 2A,B**), indicating that pramipexole induces suboptimal risk/reward decision-making. These data concur with studies that have assessed the effects of pramipexole in humans (Spengos et al., 2006; Pizzagalli et al., 2008; Riba et al., 2008). On this basis, we infer the predictive validity of our rodent models in indicating other pharmacological agents that may mediate cost/benefit decision-making in humans.
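The dependent measure plotted in **Figure 2**, the percent selection of the LR at each delivery probability, can be computed from free-choice trial records. The following is a minimal sketch; the record format (pairs of LR delivery probability and chosen option) and the function name are assumptions made for illustration and do not describe the authors' analysis software.

```python
from collections import defaultdict


def percent_large_choice(trials):
    """Percent selection of the large reinforcer (LR) per delivery probability.

    `trials` is an iterable of (lr_probability, choice) pairs, where
    choice is "LR" or "SR" (a hypothetical record format).
    """
    counts = defaultdict(lambda: [0, 0])  # probability -> [LR choices, total]
    for probability, choice in trials:
        counts[probability][1] += 1
        if choice == "LR":
            counts[probability][0] += 1
    return {p: 100.0 * lr / total for p, (lr, total) in sorted(counts.items())}


# Made-up free-choice trials at two LR delivery probabilities.
example_trials = [
    (1.0, "LR"), (1.0, "LR"),
    (0.1, "LR"), (0.1, "SR"), (0.1, "SR"), (0.1, "SR"),
]
print(percent_large_choice(example_trials))
```

Comparing such per-probability percentages between baseline and treatment sessions corresponds to the comparison shown in the figure.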

We have also tested mirtazapine, an atypical anti-depressant, in the effort-based decision-making task. Behavioral addictions and substance abuse share many overlapping characteristics, including suboptimal decision-making, and new studies in humans and non-human animals illustrate that mirtazapine is effective at reducing behaviors motivated by abused drugs (e.g., opiates and psychostimulants), even those that are associated with relapse during periods of abstinence (for review, see Graves et al., 2012). Data collected from our ICSS-mediated effort-based decision-making task indicate that mirtazapine effectively reduced preference for a high effort/LR, with rats switching to a low effort/SR, suggesting that the amount of effort required for the LR was no longer "worth it," or that the reward value of the LR was diminished (Persons et al., 2013). These results suggest that it may be of interest to study the effects of mirtazapine on suboptimal decision-making in problem gamblers in the clinic.

#### **CONCLUSION**

In summary, we have utilized ICSS as a positive reinforcer in several novel tasks designed to measure separate, yet overlapping, aspects of cost/benefit decision-making exhibited in problem gambling. These measures can be used to further explore the contribution of various neuroanatomical substrates and neurotransmitter systems in problem gambling. ICSS-mediated tasks provide a viable alternative to food reinforcement in these complex operant paradigms. We believe that the validity of these tasks indicates that they can aid in screening drugs for their potential to induce impulse control disorders, such as problem gambling, and to help identify drugs that reduce these disorders.

#### **ACKNOWLEDGMENTS**

This work was supported by the National Center for Responsible Gaming, the Michael J. Fox Foundation, the Daniel F. and Ada L. Rice Foundation, and USPHSGs NS074014 to T. Celeste Napier and DA033121 to Stephanie E. Tedford and T. Celeste Napier.


**Conflict of Interest Statement:** Dr. Napier has received research support from the National Institutes of Health, the Michael J. Fox Foundation and the National Center for Responsible Gaming. Dr. Napier has received compensation for the following: consulting for a not-for-profit health education center and for law offices on issues related to addictions and impulse control disorders; speaking on addictions at community town hall meetings, public high schools, community-based not-for-profits, and professional meetings of drug courts; providing grant reviews for the National Institutes of Health and other agencies; and academic lectures and grand rounds. Dr. Napier is a member of the Illinois Alliance on Problem Gambling, and she provides expert advice on medication development to the Cures Within Research Foundation. Dr. Holtz, Dr. Persons, and Ms. Tedford declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 March 2014; accepted: 27 May 2014; published online: 11 June 2014.*

*Citation: Tedford SE, Holtz NA, Persons AL and Napier TC (2014) A new approach to assess gambling-like behavior in laboratory rats: using intracranial self-stimulation as a positive reinforcer. Front. Behav. Neurosci. 8:215. doi: 10.3389/fnbeh.2014.00215 This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Tedford, Holtz, Persons and Napier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*