# REWARD PROCESSING IN MOTIVATIONAL AND AFFECTIVE DISORDERS

EDITED BY: Frank Ryan and Nikolina Skandali

PUBLISHED IN: Frontiers in Psychology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-986-0 DOI 10.3389/978-2-88919-986-0


# **REWARD PROCESSING IN MOTIVATIONAL AND AFFECTIVE DISORDERS**

Topic Editors:

**Frank Ryan,** Imperial College, UK

**Nikolina Skandali,** University of Cambridge, UK

*SUMA surface mapping of brain image. Image: National Institute of Mental Health, National Institutes of Health, Department of Health and Human Services*

Preferential reward processing is the hallmark of addiction, where salient cues become overvalued and trigger compulsion. In depression, rewards appear to lose their incentive properties or become devalued. In the context of schizophrenia, aberrations in neural reward signalling are thought to contribute to the overvaluation of irrelevant stimuli on the one hand and the onset of negative symptoms on the other. Accordingly, reward processing has emerged as a key variable in contemporary, evidence-based diagnostic frameworks, such as the Research Domain Criteria launched by the United States National Institute of Mental Health. Delineation of the underlying mechanisms of aberrant or blunted reward processing can be of trans-diagnostic importance across several neuropsychiatric disorders. Reward processing can become automatic, thus raising the question of cognitive control, a core theme of this Topic, which aims at justifying the necessity of reward processing as a potential therapeutic target in clinical settings. Empirical and theoretical contributions on the following themes were expected to:


**Citation:** Ryan, F., Skandali, N., eds. (2016). Reward Processing in Motivational and Affective Disorders. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-986-0


# Editorial: Reward Processing in Motivational and Affective Disorders

Frank Ryan<sup>1</sup>\* and Nikolina Skandali<sup>2</sup>

*<sup>1</sup> Centre for Mental Health, Division of Brain Sciences, Imperial College London, London, UK, <sup>2</sup> Department of Psychiatry, Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, UK*

Keywords: reward, anhedonia, depression, schizophrenia, addiction, computational neuroscience, Bayesian models

**The Editorial on the Research Topic**

#### **Reward Processing in Motivational and Affective Disorders**

Reward prediction and valuation are central to decision making (Schultz et al., 1997; O'Doherty, 2004), and thus motivate and guide human action. Faulty reward processing can be pragmatically viewed as compromised decision-making, reflected in making suboptimal choices. A primary aim of this Research Topic is to provide a dimensional approach to reward processing. In accordance with the Research Domain Criteria, a newly proposed research classification of mental health disorders based on behavioral dimensions and neurobiological findings (Insel et al., 2010), reward processing deficits can be of transdiagnostic importance. This prominence has been largely influenced by the evolution of novel neuroimaging techniques, experimental cognitive psychology findings, and the application of computational modeling in simulating human behavior. This body of work has increased knowledge of the neural mechanisms underlying aberrant reward processing.

We maintain that construing patterns or expressions of reward processing as potential biomarkers, or indices of psychological vulnerability, can facilitate early detection and intervention in the clinical arena. Additionally, therapeutic approaches originating in the psychology laboratory aimed at modifying or reversing cognitive biases or behavioral approach biases linked to aberrant reward processes are showing promise in preventing relapse in the context of addiction (see Gladwin et al., 2016). It is this twin promise of enhanced prediction of vulnerability or risk and ultimately improved clinical outcomes, combined with a deeper understanding of brain functions, that motivated us to gather together this unique series of articles linked by the common thread of reward processing.

The Topic includes four original articles exploring reward processing in schizophrenia, depression, addiction and in the context of stress or anxiety. These empirical contributions are complemented by three review articles, two theoretical contributions and an opinion piece. In tandem with laboratory findings, these more conceptual articles put emphasis on the role of the integrity of neuromodulatory systems implicated in reward processing as well as the remarkable insights that can be derived from the implementation of computational modeling.

Rømer Thomsen set the scene by outlining the subcomponents of reward processing: wanting, liking and learning. This parsing of reward processing enables a critical analysis of the concept of anhedonia, suggesting that deficits in reward processing are not restricted to "liking" (the subjective, pleasurable experience of being rewarded). Other components, namely "wanting" and the mechanisms underlying learning about rewards, can also be disrupted and contribute to the development and maintenance of disorders such as addiction and depression.

#### Edited and reviewed by:

*Gianluca Castelnuovo, Catholic University of the Sacred Heart, Italy*

> \*Correspondence: *Frank Ryan f.ryan@imperial.ac.uk*

#### Specialty section:

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

Received: *18 March 2016* Accepted: *12 August 2016* Published: *30 August 2016*

#### Citation:

*Ryan F and Skandali N (2016) Editorial: Reward Processing in Motivational and Affective Disorders. Front. Psychol. 7:1288. doi: 10.3389/fpsyg.2016.01288*

Arrondo et al. demonstrated blunted reward anticipation in people with a diagnosis of schizophrenia or depression. This attenuated striatal response to the prospect of monetary reward correlated with depressive symptoms in the schizophrenia group, but did not cohere with clinical symptoms of depression in the depressed cohort. Di Lemma et al. found that, contrary to predictions, approach and avoidance tendencies following positively and negatively themed videos did change in parallel, challenging the expectation that these are independent processes. Woud et al. investigated broadly similar processes in cohorts of current or former tobacco smokers. The researchers reported no attentional or behavioral approach biases in current smokers, nicotine-deprived smokers or ex-smokers, thus challenging incentive theories of addiction. These three innovative studies raise important questions for further research, and refine experimental methods in the process.

Robinson et al. applied a stress manipulation paradigm in order to study the effect of acute stress on two well-established biases in decision-making, temporal discounting and the framing effect. The researchers observed mood alterations in response to experimentally induced stress, but no effects on decision-making processes. Acute stress impacted on low-level "bottom-up" perceptual biases, but higher-level executive processes were unperturbed. The findings support the application of psychotherapeutic approaches aiming to enhance cognitive control as an apparently resilient component of therapeutic intervention for affective disorders.

Chekroud discussed the distortions in reward sensitivity and/or reward learning that contribute to the development of depression. He described the implementation of the free-energy principle, which views the brain as a "predictive machine," aimed at reducing surprise (i.e., free energy) by constructing congruent cognitive models and optimizing actions. One potential clinical application is that changing cognitive representations using pharmacological agents or psychotherapy will be a necessarily gradual rather than immediate process. Also on the topic of depression, Dillon assigned a pivotal role to reward processing in the formation of long-term memories. He concluded that impaired reward processing reflected distorted mesolimbic dopaminergic transmission, thus impeding the transfer of short-term memories into long-term episodic memory storage. Consequently, in order to recover from depression, not only will somebody who is depressed need to overcome a memory bias for the recall of negative events, they will also struggle to recall positive events.

Cousijn highlighted the role of fronto-parietal and limbic brain networks that are implicated in vulnerability to cannabis use disorders, other substance misuse disorders and increased risk of anxiety and depression. When these cognitive control systems are compromised, individuals are more likely to reach out for immediately available rewards, whether they are linked to substance use or to the powerful negative reinforcement that occurs when emotional distress is alleviated by avoidance, thus increasing the possibility of anhedonia and depression.

Moutoussis et al. differentiated between optimal decisions delivering the best possible rewards and a conceptualization of psychiatric disorder based on suboptimal reward processing. In this context, rewards are milestones or surrogates to strategic goals such as health, wellbeing and social affiliation. The researchers emphasized the importance of considering the patient's autonomy by pointing out the need for the clinician to engage in a dialog to elicit the patient's values and goals. Story et al. identified two factors involved in delayed reward discounting in psychiatric disorders: the opportunity cost that waiting for a delayed reward entails and the associated uncertainty of reward delivery.

The theoretical insights and experimental findings presented in this topic justify further exploration of the mechanisms underlying the anticipation, valuation and pursuit of rewards. In tandem with this, adapting and applying these findings in clinical settings could, we believe, provide additional therapeutic benefit.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## FUNDING

Dr. NS is supported by a UK Medical Research Council Doctoral grant studentship. Dr. FR wishes to acknowledge the support and encouragement he received from his employer Camden & Islington NHS Foundation Mental Health Trust and the Centre for Mental Health, Imperial College.

## REFERENCES

Schultz, W., Dayan, P., and Montague, P. R. (1997). A neural substrate of prediction and reward. Science 275, 1593–1599.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Ryan and Skandali. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Measuring anhedonia: impaired ability to pursue, experience, and learn about reward**

#### *Kristine Rømer Thomsen\**

*Centre for Alcohol and Drug Research, Department of Psychology and Behavioural Sciences, Aarhus University, Aarhus C, Denmark*

Ribot's (1896) long-standing definition of anhedonia as "the inability to experience pleasure" has been challenged recently following progress in affective neuroscience. In particular, accumulating evidence suggests that reward consists of multiple subcomponents of wanting, liking and learning, as initially outlined by Berridge and Robinson (2003), and these processes have been proposed to relate to appetitive, consummatory and satiety phases of a pleasure cycle. Building on this work, we recently proposed to reconceptualize anhedonia as "impairments in the ability to pursue, experience, and/or learn about pleasure, which is often, but not always accessible to conscious awareness." (Rømer Thomsen et al., 2015). This framework is in line with Treadway and Zald's (2011) proposal to differentiate between motivational and consummatory types of anhedonia, and stresses the need to combine traditional self-report measures with behavioral measures or procedures. In time, this approach may lead to improved clinical assessment and treatment. In line with our reconceptualization, increasing evidence suggests that reward processing deficits are not restricted to impaired hedonic impact in major psychiatric disorders. Successful translations of animal models have led to strong evidence of impairments in the ability to pursue and learn about reward in psychiatric disorders such as major depressive disorder, schizophrenia, and addiction. It is of high importance that we continue to systematically target impairments in all phases of reward processing across disorders using behavioral testing in combination with neuroimaging techniques. This in turn has implications for diagnosis and treatment, and is essential for the purposes of identifying the underlying neurobiological mechanisms. Here I review recent progress in the development and application of behavioral procedures that measure subcomponents of anhedonia across relevant patient groups, and discuss methodological caveats as well as implications for assessment and treatment.

#### **Keywords: anhedonia, reward, pleasure, motivation, learning, depression, schizophrenia, addiction**

## **Introduction**

The generally accepted understanding of the term anhedonia has remained almost unaltered since Ribot (1896) first defined it as the "inability to experience pleasure" over a century ago. However, during the last 5 years the term has been subject to debate and some progress has been made in terms of elucidating the underlying neurobiological mechanisms. A number of recent reviews (Treadway and Zald, 2011; Der-Avakian and Markou, 2012; Whitton et al., 2015) summarize this progress and offer improved understanding of the underlying neurobiology. However, their conceptual understandings of anhedonia diverge. Treadway and Zald (2011) made a convincing case to differentiate between motivational and consummatory types of anhedonia and introduced the term decisional anhedonia to emphasize the influence of anhedonic symptoms on decision-making. In contrast, Der-Avakian and Markou (2012) recently argued that deficits in motivational and decision-making processes (albeit disturbed, e.g., in depressed patients) should not be labeled under the umbrella of anhedonia.

#### *Edited by:*

*Frank Ryan, Imperial College, UK*

#### *Reviewed by:*

*Richard J. Tunney, University of Nottingham, UK*

*Mike J. F. Robinson, Wesleyan University, USA*

#### *\*Correspondence:*

*Kristine Rømer Thomsen, Centre for Alcohol and Drug Research, Department of Psychology and Behavioural Sciences, Aarhus University, Bartholins Allé 10, 8000 Aarhus C, Denmark krt.crf@psy.au.dk*

#### *Specialty section:*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

*Received: 29 May 2015 Accepted: 03 September 2015 Published: 17 September 2015*

#### *Citation:*

*Rømer Thomsen K (2015) Measuring anhedonia: impaired ability to pursue, experience, and learn about reward. Front. Psychol. 6:1409. doi: 10.3389/fpsyg.2015.01409*

Overall, findings from affective neuroscience have challenged Ribot's (1896) definition, which is restricted to subjectively experienced pleasure. Accumulating evidence suggests that reward consists of multiple subcomponents and processes of wanting, liking and learning (Robinson and Berridge, 2003; Berridge and Kringelbach, 2008) and these processes have been proposed to relate to appetitive, consummatory and satiety phases of a pleasure cycle (Kringelbach et al., 2012). Building on this work, we recently proposed to reconceptualize anhedonia as "impaired ability to pursue, experience and/or learn about pleasure, which is often, but not always accessible to conscious awareness" (Rømer Thomsen et al., 2015, p. 2).

The parsing of reward into wanting, liking and learning components was originally introduced by Robinson and Berridge (1993) in their influential incentive sensitization theory of drug addiction. The theory has received support in animal and human studies of drug addiction (Vezina and Leyton, 2009; Leyton and Vezina, 2013) and recently also in behavioral addictions such as Gambling Disorder (Leyton and Vezina, 2012; Rømer Thomsen et al., 2014). Robinson and Berridge's taxonomy differentiates between core reactions that are not necessarily conscious ("wanting," "liking," and "learning") and their conscious counterparts (wanting, liking, and learning, i.e., denoted without quotation marks; Berridge and Robinson, 2003; Berridge and Kringelbach, 2008). In other words, reward can be parsed into three main components (motivation, hedonic impact and learning), and each of these components consists of both conscious and unconscious subcomponents (see **Figure 1A**). For example, motivation consists of "(1) core incentive salience "wanting" processes that are not necessarily conscious (e.g., cue-triggered "wanting" for food or drugs) and (2) conscious desires for incentives or cognitive goals" (Berridge and Kringelbach, 2008, p. 2). Hedonic impact consists of "(1) core "liking" reactions that need not necessarily be conscious and (2) conscious experiences of pleasure, in the ordinary sense of the word, which may be elaborated out of core "liking" reactions by brain mechanisms of awareness" (Berridge and Kringelbach, 2008, p. 2). Similarly, learning (or learned predictions) includes "(1) implicit knowledge as well as associative conditioning, such as basic pavlovian and instrumental associations and (2) explicit and cognitive predictions" (Berridge and Kringelbach, 2008, p. 2).

The subcomponents of reward constantly interact through the appetitive, consummatory and satiety phases of a pleasure cycle, but can be teased apart using systematic scientific analysis. Self-report measures can help identify the conscious components (wanting, liking, and learning) and provide valuable information on this level of processing. However, self-report measures are of course limited in their ability to capture unconscious processes, as well as in their ability to parse out contributions that may have been made by any of the unconscious processes, considering that these processes interact strongly. In contrast, behavioral procedures from animal studies provide useful markers of the core "wanting," "liking," and "learning" reactions (**Figure 1B**). For example, "liking" reactions have been studied in rodents by measuring the affective orofacial expressions that are elicited in response to sweet tastes (Pfaffmann et al., 1977; Grill and Norgren, 1978a,b), and a number of procedures have been developed to study "wanting" in rodents, e.g., by measuring the effort exerted to obtain rewards (Salamone et al., 2007) or the ability of reward-related cues to act as motivational magnets (Wyvell and Berridge, 2000). In recent years, some of these animal models have been successfully translated to human studies and provide valuable behavioral measures of subcomponents of reward, which can complement traditional self-report measures (**Figure 1C**).

Overall, findings from animal and human studies applying these types of measures support the view that reward is a complex process consisting of several psychological components that correspond to partly dissociable neurobiological mechanisms (Berridge and Robinson, 2003; Berridge and Kringelbach, 2008, 2015). For example, there is strong evidence that dopamine plays an important role in "wanting," but not in "liking" reactions. In animal and human studies where "wanting" and "liking" reactions have been systematically teased apart, specific manipulation of dopamine signaling has failed to shift "liking" reactions to rewards (Berridge and Valenstein, 1991; Peciña et al., 2003; Ward et al., 2012). In contrast, there is accumulating evidence that dopamine plays an important role in "wanting" processes. For example, elevation of dopamine has been shown to increase willingness to work for a food reward in rodents (Bardgett et al., 2009), while dopamine attenuation or blockade has the opposite effect (Cousins and Salamone, 1994; Salamone et al., 2007). Similarly, evidence from human studies suggests that amphetamine/L-Dopa-induced elevated dopamine increases subjective ratings of drug wanting, but not subjective ratings of drug liking during consumption (Leyton et al., 2002, 2007; Liggins et al., 2012). Recently, Wardle et al. (2011) provided evidence that elevated levels of dopamine increase willingness to work for reward in humans using a behavioral measure.

Building on the framework set forward by Berridge and Robinson (2003) suggesting that reward consists of multiple subcomponents of wanting, liking, and learning and recent proposals relating these processes to the appetitive, consummatory and satiety phases of a pleasure cycle (Kringelbach et al., 2012), we recently proposed to reconceptualize anhedonia as "impairments in the ability to pursue, experience and/or learn about pleasure" (Rømer Thomsen et al., 2015, p. 2). In this conceptualization of anhedonia, impairments in each of the subcomponents can lead to a malfunctioning pleasure system. Normally, wanting, liking, and learning processes are balanced over time; however, this balance can be compromised by impairments in each of the components. Depending on which of the subcomponents are most affected, and how the components are affected, this can lead to distinct subtypes of anhedonia that are associated with distinct imbalances of the pleasure system (Rømer Thomsen et al., 2015). For example, patients suffering from major depressive disorder often describe a diminished ability to pursue and experience pleasure, i.e., a progressive decrease in some (or all) of the reward components. In contrast, drug addiction can be characterized by excessive wanting for the drug of choice, which grows over time independently of drug liking. While anhedonia has traditionally been conceived as diminished responses (typically, diminished subjectively experienced pleasure), "our proposed framework acknowledges that both too much and too little activity in specific parts of the pleasure system can lead to pathological changes. This is for example illustrated in the excessive wanting for drugs in drug addiction or in disorders with hypersexuality" (Rømer Thomsen et al., 2015, p. 15).

It is important to note that in this terminology (Rømer Thomsen et al., 2015) *pleasure* and *pleasure system* are not restricted to hedonic impact, but are instead used to encompass all of the phases of reward processing. This is in contrast to the dominating terminology, where *pleasure* is restricted to the hedonic impact of a reward, while *reward* is used to encompass all of the reward-related processes (see, e.g., Berridge and Kringelbach, 2008). In an attempt to avoid misunderstandings, I have changed the wording of our definition in the present paper to reflect the dominating terminology. Hence, anhedonia is defined here as "impairments in the ability to pursue, experience, and/or learn about *reward*."

In line with our proposed reconceptualization of anhedonia, a growing body of evidence suggests that reward processing deficits are not restricted to impaired hedonic impact in psychiatric disorders typically associated with anhedonia. Findings from the past 5 years suggest that motivational and learning processes are impaired, e.g., in patients suffering from major depressive disorder (subsequently referred to as depression) and schizophrenia (Treadway and Zald, 2011; Fervaha et al., 2013b; Rømer Thomsen et al., 2015; Whitton et al., 2015). Part of this work is based on successful translations of animal models, thereby paving the way for validated behavioral paradigms that can supplement traditional self-report measures. These efforts are exciting and hold promise in terms of elucidating the role of subcomponents of anhedonia and the underlying neurobiological mechanisms across major psychiatric disorders. Here I review recent progress in the development and application of behavioral procedures that measure subcomponents of anhedonia in relevant patient groups (including patients suffering from depression, schizophrenia and addiction) and discuss implications for clinical assessment and treatment.

## **Measuring Subcomponents of Anhedonia**

In line with the generally accepted understanding of anhedonia as "decreased subjective experience of pleasure" [as per Ribot's (1896) definition], the most popular way of measuring anhedonia has been self-report scales or questionnaires like The Fawcett–Clark Pleasure Scale (FCPS; Fawcett et al., 1983) or The Snaith–Hamilton Pleasure Scale (SHAPS; Snaith et al., 1995). The majority of these instruments are restricted to measuring subjective experiences of hedonic impact (i.e., liking), but some of the more recently developed questionnaires also include aspects of reward motivation (i.e., wanting). For example, The Temporal Experience of Pleasure Scale (TEPS; Gard et al., 2006) differentiates between anticipatory and consummatory experiences of pleasure, and The Sensitivity To Reinforcement of Addictive and other Primary Rewards (STRAP-R; Goldstein et al., 2010) measures liking and wanting of drug and non-drug rewards under various situations (e.g., current and hypothetical). Building on Robinson and Berridge's incentive sensitization theory (Robinson and Berridge, 1993; Robinson et al., 2013), Lende (2005) developed a short Incentive Salience Scale that measures key aspects of drug wanting and has been used to predict addiction status.

While these instruments provide useful information about the conscious components of anhedonia, they are of course limited in their ability to capture unconscious components. Similar to research on reward (Berridge and Robinson, 2003; Berridge and Kringelbach, 2008), it is crucial to differentiate between conscious and unconscious components of anhedonia (Rømer Thomsen et al., 2015). Accumulating evidence suggests that we do not always know what motivates our behavior or brings us pleasure (Aharon et al., 2001; Winkielman et al., 2005; Moeller et al., 2009; Parsons et al., 2011), and there is convincing evidence that reward (also) affects our behavior on an unconscious level (Winkielman et al., 2005; Pessiglione et al., 2007, 2008; Aarts et al., 2008).

During the last 5 years a number of validated and useful behavioral procedures have been developed that can be used to measure impairments in the described subcomponents of reward (**Figure 1**). Of particular relevance are recent developments in behavioral procedures that can be used to measure impairments in the ability to pursue and learn about reward.

## **Impaired Ability to Pursue Reward**

A large number of animal models have been developed to study motivational processes by looking at behavior related to obtainment of rewards such as food. Of particular relevance here are models of the effort exerted to obtain rewards [e.g., by measuring how eagerly the animal runs for rewards in a runway (Berridge and Valenstein, 1991; Peciña et al., 2003) or its willingness to exert effort in exchange for more palatable food rewards (Salamone et al., 1994, 2007)] and of the ability of reward-related cues to act as motivational magnets [e.g., by measuring pavlovian instrumental transfer (Wyvell and Berridge, 2000, 2001)]. Recently, some of these models have been successfully translated to studies of humans and the reported findings offer intriguing information on the role of reward motivation across major psychiatric disorders.

The effort expenditure for rewards task (EEfRT), which was developed by Treadway et al. (2009), represents a good example of how a validated animal model of motivation (Salamone et al., 1994) can be successfully translated to human studies. The EEfRT is an effort-based decision-making task, where reduced reward motivation is operationalized as a decreased willingness to choose greater-effort/greater-reward over less-effort/less-reward options with varying probability (Treadway et al., 2009). Recently, the task has been applied to relevant clinical populations and has provided evidence of reduced willingness to expend effort for rewards in patients with subsyndromal depression, first-episode depression and remitted depression, compared to controls (Treadway et al., 2012; Yang et al., 2014).
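The operationalization described above can be illustrated with a minimal sketch that summarizes choices as the proportion of greater-effort/greater-reward selections at each reward probability. The record layout, the function name `hard_choice_rate`, the probability levels, and the trial data are all hypothetical and chosen only for illustration; they are not taken from the published task materials.

```python
from collections import defaultdict

def hard_choice_rate(trials):
    """Return the proportion of greater-effort choices per reward probability.

    `trials` is an iterable of (probability, chose_hard) pairs. This record
    layout is a hypothetical simplification, not the task's real data format.
    """
    counts = defaultdict(lambda: [0, 0])  # probability -> [hard choices, total trials]
    for probability, chose_hard in trials:
        counts[probability][0] += int(chose_hard)
        counts[probability][1] += 1
    return {p: hard / total for p, (hard, total) in sorted(counts.items())}

# Made-up trials for illustration only
example_trials = [
    (0.12, False), (0.12, False), (0.12, True), (0.12, False),
    (0.50, True), (0.50, False), (0.50, True), (0.50, False),
    (0.88, True), (0.88, True), (0.88, True), (0.88, False),
]
rates = hard_choice_rate(example_trials)
```

In a summary of this kind, reduced reward motivation would appear as lower proportions of greater-effort choices, or as a flatter increase across probability levels, relative to a comparison group.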

A recent longitudinal study of reward-seeking behavior in individuals at risk of depression provides intriguing evidence of diminished reward motivation as a potential precursor of depression (Rawal et al., 2013). Adolescent offspring of depressed parents performed the Cambridge Gambling Task in order to measure betting behavior under different odds. Compared to healthy adolescents and adolescents with externalizing disorders, the adolescent offspring of depressed parents showed diminished reward seeking (i.e., they bet less at favorable odds). Importantly, the magnitude of this diminished response predicted depressive symptoms, depression onset and functional impairment 1 year later (Rawal et al., 2013).

Several recent studies have reported decreased willingness to work for rewards using the EEfRT or similar tasks in patients suffering from schizophrenia (Fervaha et al., 2013c; Gold et al., 2013; Barch et al., 2014). For example, Barch et al. (2014) reported that patients with schizophrenia were less willing to work harder when the size of the rewards increased or when the rewards were more probable, compared to control participants. Furthermore, among patients with schizophrenia, there was an association between choosing fewer greater-effort/greater-reward choices in the task and having more severe negative symptoms (self-reported) and worse community and work function (reported by caretaker; Barch et al., 2014).

Overall, these findings are promising and provide strong evidence of reduced reward motivation across major psychiatric disorders. However, as we have previously stressed (Rømer Thomsen et al., 2015), participants are working for abstract rewards in these tasks, and not fundamental rewards (as in the animal models). Whether abstract and fundamental rewards are treated in the same way remains an open question, but emerging evidence suggests that there are important differences in the underlying brain processing (Sescousse et al., 2013a,b).

Other groups have used a related measure of reward motivation in humans by combining a key-pressing procedure with salient face stimuli (Aharon et al., 2001; Parsons et al., 2011). In these tasks, "wanting" is operationalized as the amount of work participants perform (i.e., by pressing a key) in order to change the duration they view images of adult/infant faces on a screen. Findings from these studies provide support for a dissociation of conscious liking ratings of salient face stimuli and the behavioral measure of "wanting" (Aharon et al., 2001; Parsons et al., 2011). For example, heterosexual males used more effort to keep female compared to male faces on a screen, but in a self-report task they rated the faces as equally attractive (Aharon et al., 2001). The use of salient face stimuli in combination with a key-pressing procedure represents a promising way to study possible impairments in the ability (or willingness) to work for fundamental social rewards in humans.

Moeller et al. (2009) used a similar key-pressing paradigm in combination with salient drug-related stimuli and provided evidence for increased "wanting" of drug-related stimuli in drug-addicted individuals: cocaine-addicted individuals used more effort to view cocaine-related stimuli in a behavioral choice task, compared to control participants. Furthermore, they reported a dissociation between a self-report measure of hedonic impact and a behavioral measure of motivation in cocaine-addicted individuals: in the self-report task they rated pictures of pleasant scenes as more pleasant than cocaine-related pictures; however, in the behavioral choice task they did not show this preference (Moeller et al., 2009). These findings are in line with the incentive sensitization theory (Robinson and Berridge, 1993; Robinson et al., 2013), which argues that cue-triggered "wanting" of drug-related stimuli is enhanced in drug-addicted individuals, and that these "wanting" processes are partly dissociable from "liking" processes in the brain, and in behavior. The reported dissociation between self-reported hedonic impact and a behavioral measure of motivation also reflects the impaired insight that characterizes drug-addicted individuals (Goldstein et al., 2009; Moeller and Goldstein, 2014).

A related and promising measure of effort is the use of force-grip procedures, which allow us to quantify various aspects of effort, including how much effort is exerted over time, how fast participants start to squeeze, and how fast the force is increased (Aarts et al., 2008). By combining force-grip procedures with subliminal priming paradigms it becomes possible to study motivational processes that we are not aware of (Pessiglione et al., 2007; Aarts et al., 2008). For example, it has been shown that subliminal priming of the concept of exertion (i.e., words such as "exert" or "vigorous") can prepare people for forceful action, and when primes are accompanied by a rewarding stimulus (i.e., a consciously visible positive word) they are motivated to spend more effort (Aarts et al., 2008). In a similar set-up, Pessiglione et al. (2007) studied unconscious motivation with an Incentive Force Task, where the amount and reportability of monetary rewards participants could gain through physical effort varied. Pessiglione et al. (2007) reported that even when participants were unable to report how much money was at stake, they still used more effort for larger rewards. These paradigms have yet to be applied to samples of relevant patients, but they represent a promising way to investigate impairments in unconscious reward motivation.

Another important component of reward motivation is the ability of reward-related cues to capture our attention and act as motivational magnets. In human studies, one way of operationalizing the ability of certain stimuli to capture our attention is by using variants of the attentional blink paradigm. Studies using this type of measure provide evidence that drug-addicted individuals display an attentional bias toward drug-related visual cues and that this bias is correlated with self-reported craving (Wiers and Stacy, 2006; Field et al., 2009; Tibboel et al., 2010). For example, heavy social drinkers have reduced attentional blink for alcohol-related stimuli, which is consistent with the hypothesis of enhanced attentional bias for salient drug-related cues (Tibboel et al., 2010). Recent evidence suggests that a similar mechanism is present in behavioral addictions such as Gambling Disorder (Brevers et al., 2011a,b; Rømer Thomsen et al., 2014). For example, in an attentional blink paradigm problem gamblers exhibited enhanced processing of gambling-related cues compared to neutral cues (Brevers et al., 2011b), and in a change detection task problem gamblers were faster at detecting gambling-related stimulus changes compared to neutral changes (Brevers et al., 2011a). Taken together, these findings support the hypothesis of an attentional bias toward addiction-related stimuli in drug and behavioral addiction.

## **Impaired Ability to Learn About Reward**

There is an extensive literature from animal and human studies investigating the ability to learn from experiences with reward and punishment. Recently, some of these paradigms have been applied to relevant patient groups and provide evidence of impairments in the ability to learn about reward in patients suffering from depression and schizophrenia.

In a series of studies, Pizzagalli and colleagues have investigated impairments in the ability, or propensity, to develop a response bias toward stimuli that are more frequently rewarded than others using a probabilistic reward task (Pizzagalli et al., 2005, 2008; Pechtel et al., 2013; Vrieze et al., 2013). The task has been applied to patients with varying degrees of depressive and anhedonia symptoms, and findings from these studies consistently show evidence of impaired reward learning. In the first study, Pizzagalli et al. (2005) showed that in participants with low levels of depressive symptoms there was an increase in the response bias over time, which was not present in participants with high levels of depressive symptoms. Subsequent studies of clinical populations show evidence of diminished reward responsiveness in depressed patients compared with controls (Pizzagalli et al., 2008; Vrieze et al., 2013), in patients with remitted depression compared with controls (Pechtel et al., 2013), and in depressed patients with high vs. low levels of anhedonia symptoms (Vrieze et al., 2013).
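To make the notion of a response bias concrete, the sketch below computes a signal-detection style log bias from counts of correct and incorrect responses to the more frequently rewarded ("rich") and less frequently rewarded ("lean") stimulus. The formula, the base-10 logarithm, and the 0.5 cell correction are assumptions based on a commonly used formulation and are not stated in the text; the function name and the counts are hypothetical.

```python
import math

def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Log response bias toward the more frequently rewarded ("rich") stimulus.

    Assumed formula (not given in the text):
        log b = 0.5 * log10((rich_correct * lean_incorrect) /
                            (rich_incorrect * lean_correct))
    Adding 0.5 to every cell is a common correction that avoids division by zero.
    """
    rc, ri = rich_correct + 0.5, rich_incorrect + 0.5
    lc, li = lean_correct + 0.5, lean_incorrect + 0.5
    return 0.5 * math.log10((rc * li) / (ri * lc))

# Hypothetical counts: equal accuracy for both stimuli gives no bias,
# whereas better accuracy for the rich stimulus gives a positive bias.
unbiased = response_bias(40, 10, 40, 10)
biased = response_bias(45, 5, 30, 20)
```

Under this formulation, reward learning would show up as a bias that grows across successive blocks of trials, which is the pattern reported for participants with low levels of depressive symptoms.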

Of relevance here are also studies using probabilistic learning tasks that differentiate between reward-guided and punishment-guided learning. So far, this type of paradigm has not been systematically applied to clinically depressed patients, but one study reported evidence of blunted reward- and punishment-guided learning in depressed patients compared with controls (Chase et al., 2010). More data is available from patients suffering from schizophrenia. Compared to controls, patients suffering from schizophrenia consistently show deficits in reward-guided learning, while findings regarding punishment-guided learning are conflicting (Waltz et al., 2007, 2011; Strauss et al., 2011; Gold et al., 2012; Yilmaz et al., 2012; Fervaha et al., 2013a).

Studies targeting impairments in unconscious reward learning are intriguing, considering recent evidence of reward learning occurring outside our awareness. Pessiglione et al. (2008) used a subliminal instrumental conditioning task, where cues predicting monetary reward or punishment are subliminally presented, and showed that participants develop a propensity to choose cues associated with reward, even though the cues are not consciously perceived. These findings support the notion that cues related to reward and punishment (also) affect behavior and decision-making processes on an unconscious level and underscore the need to study reward processing deficits on both conscious and unconscious levels. This type of paradigm has yet to be applied to relevant patients, but represents a promising method to study potential impairments in unconscious reinforcement learning.

In animal studies, the conditioned place preference (CPP) procedure has long been used to study the development of preferences for environments or stimuli which have previously been associated with rewarding drug intake through the process of classical conditioning (Tzschentke, 1998, 2007). Recently, Mayo et al. (2013) successfully translated the CPP procedure into a human drug conditioning task and showed that healthy participants develop a behavioral preference for cues that have been paired with drug intake (a dose of methamphetamine), compared with cues that have been paired with placebo. These findings were recently replicated and extended by including a broader range of measures of the conditioned drug response, including self-report, behavioral and psychophysiological measures. After the conditioning procedure, participants showed an increase in behavioral preference, positive emotional reactivity, and attentional bias toward the cue associated with drug intake, compared with the cue associated with placebo (Mayo and de Wit, 2015). This paradigm represents a promising method to study individual determinants of classical conditioning and is therefore highly relevant for the disorders discussed here. For example, this type of paradigm can shed light on individual risk factors in the development of sensitized responses to drugs/drug-related cues and blunted responses to other types of rewards (e.g., social, sexual, and sensory) in drug addiction, and similarly in behavioral addiction such as Gambling Disorder. This procedure is also highly promising in terms of studying deficient associative learning in patients suffering from depression and schizophrenia, preferably by using different types of rewards.

In line with the strong evidence suggesting that "wanting," "liking," and "learning" processes are dissociable in the brain and in behavior, it is important to note that the impaired reward learning reviewed here is not necessarily related to impairments in the ability to learn about "liking" (i.e., the hedonic impact of a reward), but could as easily be due to a reduced or modified sensitivity to the rewarding properties of the stimulus in the absence of "liking." Future studies are needed to tease these differences apart.

## **Impaired Ability to Experience Pleasure**

While successful models have been developed to study aspects of reward motivation and reward learning in humans, it has proven more difficult to develop behavioral procedures that measure hedonic impact in humans. In animal studies, hedonic impact of pleasurable stimuli has been successfully studied by measuring affective orofacial expressions elicited by the hedonic impact of sweet tastes. Studies applying taste-reactivity paradigms have convincingly shown that sweet tastes elicit rhythmic licking of lips (i.e., facial "liking" reactions) and bitter tastes elicit gapes (i.e., facial "disliking" reactions) in rodents and human infants (Steiner, 1973, 1974; Pfaffmann et al., 1977; Grill and Norgren, 1978a,b; Steiner et al., 2001). However, these affective orofacial measures are not easily translated to (adult) human studies, because we learn to control and mimic orofacial reactions to food as we grow up.

The hedonic impact of rewards other than food appears to be easier to measure behaviorally or physiologically. Although mostly taboo, there is an increasing interest in the mechanisms underlying sexual pleasures (Georgiadis and Kringelbach, 2012; Georgiadis et al., 2012), and a number of measures have been developed to quantify pleasure-elicited "liking" reactions to sexual pleasures, e.g., by measuring rectal pressure variability and self-reported level of sexual arousal (Georgiadis et al., 2006). Although impairments related to sexual activity and sexual pleasures are still taboo, they represent a promising area of research that can help shed light on impairments in hedonic impact in relevant patient groups, including patients suffering from depression, schizophrenia and addiction.

A number of studies have measured facial reactions to pictures of emotional facial expressions and there is some evidence of a blunted response to positive facial expressions in depressed patients (Bylsma et al., 2008). For example, Dimberg (1982, 1990) has used electromyographic (EMG) recordings to detect emotion-related facial movements and shown that we display distinct facial reactions in response to emotional facial expressions, which partly reflects a tendency to mimic the facial expression. Studies have shown that these reactions are elicited very rapidly (Dimberg and Thunberg, 1998) and even when participants are not aware that they are being exposed to facial stimuli (Dimberg et al., 2000). Although it is unlikely that all changes in facial musculature are related to emotion, EMG recordings of facial reactions may provide a way to investigate deficits in "liking" reactions to social pleasure (e.g., happy facial expressions). These rapid facial reactions have already been related to empathy (Dimberg et al., 2011); however, more work is needed to confirm that they are in fact indicators of pleasure "liking."

So far, the most popular way of measuring hedonic impact in humans has been to measure self-reported hedonic reactivity (i.e., subjective ratings of pleasure) to pleasant solutions and odors in a here-and-now setting. Surprisingly, the majority of studies report similar, or higher, pleasantness ratings in depressed patients compared to controls in response to sweet solutions (Amsterdam et al., 1987; Berlin et al., 1998; Scinska et al., 2004; Swiecicki et al., 2009; Dichter et al., 2010) and various odors (Steiner et al., 1993; Pause et al., 2001; Lombion-Pouthier et al., 2006; Scinska et al., 2008; Clepce et al., 2010). Similarly, evidence from studies of patients suffering from schizophrenia does not suggest that this patient group experiences lower levels of hedonic reactivity to pleasurable stimuli compared with controls (Heerey and Gold, 2007; Barch and Dowd, 2010; Strauss and Gold, 2012).

Interestingly, the same patient groups (depressed and schizophrenic) report diminished enjoyment in studies where they are asked to rate prospective, retrospective, or hypothetical experiences (McFarland and Klein, 2009; Watson and Naragon-Gainey, 2010; Strauss and Gold, 2012). One way of interpreting this discrepancy is that anhedonic patients retain core "liking" reactions, but do not cognitively value them in the same way as they did before (Dichter et al., 2010; Berridge and Kringelbach, 2011). This interpretation should, however, be seen in the light of standard clinical examinations where depressed patients often present with behavioral characteristics that do not only imply impairments in cognitive evaluations of their experiences. For example, clinicians often report less smiling and less reactivity to stimuli (in general), which might reflect diminished "liking." The disagreement between laboratory-based studies using taste-reactivity paradigms and clinical observations of patients underscores the need to consider methodological aspects. The laboratory-based studies reviewed here (which failed to show diminished "liking" reactions to pleasurable solutions and odors in depressed and schizophrenic patients) were all based on self-reported ratings of hedonic impact. It remains an open question whether behavioral or physiological measures of "liking" will inform us differently.

## **Implications for Assessment and Treatment**

The generally accepted understanding of anhedonia as "diminished subjectively experienced pleasure" is reflected in current diagnostic classification systems. For example, in the DSM-5 anhedonia is one of two main symptoms needed for the diagnosis of depression and is defined as "decreased interest and pleasure in most activities most of the day" (American Psychiatric Association, 2013). In this definition of anhedonia, wanting and liking components are collapsed, which is in contrast to the accumulating evidence suggesting that these processes are in fact dissociable in the brain and in behavior. For example, findings from animal and human studies suggest that dopamine plays a crucial role in reward motivation ("wanting" and wanting), but not in hedonic impact ("liking" and liking; Berridge and Robinson, 2003; Berridge and Kringelbach, 2008).

Further, findings from affective neuroscience suggest that reward processing deficits are *not* restricted to impaired hedonic impact. As reviewed here, increasing evidence suggests that the ability to pursue and learn about reward is compromised in patients suffering from depression, schizophrenia, and drug/behavioral addiction (Treadway and Zald, 2011; Rømer Thomsen et al., 2015; Whitton et al., 2015). In contrast, it is less clear whether core "liking" reactions are in fact compromised in, e.g., depression and schizophrenia.

The growing evidence that reward processing deficits are not restricted to diminished experience of pleasure across major psychiatric disorders stresses the need to consider impairments in reward wanting and reward learning in clinical assessments. As a start, self-report instruments could be elaborated to reflect all phases of reward processing. The motivational aspect has already been included in some of the more recent questionnaires (e.g., the TEPS and the STRAP-R), but so far the learning component has been absent. Considering the growing evidence that unconscious components of reward affect our behavior, and are not always accompanied by conscious awareness (Berridge and Winkielman, 2003; Pessiglione et al., 2007, 2008; Aarts et al., 2008), it is highly debatable whether self-report instruments are sufficient in clinical assessments, or whether they should be complemented by behavioral procedures. For example, behavioral measures of "wanting" could usefully complement self-report questionnaires in clinical assessments and help guide subsequent treatment. Depending on which subcomponents of reward processing are mainly affected, different medical treatments may be appropriate. For example, depressed patients characterized by impaired ability to pursue pleasurable activities may benefit from medical interventions that target neurotransmitter systems such as the mesolimbic dopamine system and the opioid system, which have been shown to play a crucial role in reward motivation (Treadway and Zald, 2011; Soskin et al., 2013; Rømer Thomsen et al., 2015).

These insights are also relevant in terms of psychological treatment options. For example, cognitive behavioral therapy (CBT) has so far shown more promising treatment effects than pharmacological treatments in patients suffering from drug and behavioral addiction (Gooding and Tarrier, 2009; Potenza et al., 2011; Bullock and Potenza, 2012). In the context of addiction, CBT is expected to improve the individual's control over motivation by increasing awareness of cues that trigger craving and by learning skills that enable new patterns of thinking and acting (Potenza et al., 2011). These efforts are important and efficiently target conscious feelings of craving. However, this type of cognitive intervention has limited efficacy in terms of targeting unconscious mechanisms. In particular, cue-induced craving reactions that occur outside our awareness are not likely to be targeted in CBT, but play an important role in maintaining the addictive behavior as outlined, e.g., by the incentive sensitization theory of addiction (Robinson et al., 2013). Hence, although CBT reduces some of the cognitive layers of responsiveness to drug cues, it is very likely that unconscious layers persist (Robinson et al., 2013).

Other types of psychological interventions may provide a way to target unconscious "wanting" (or "craving") mechanisms, such as mindfulness-based interventions that aim to improve the individual's awareness of bodily and emotional signals (Garland et al., 2014). There is some evidence to suggest that mindfulness-based interventions can reduce consumption and craving of a number of substances in substance users, although more randomized controlled trials are warranted (Chiesa and Serretti, 2014). For example, in a recent randomized controlled trial Tang et al. (2013) reported that brief meditation training reduced smoking by 60% in smokers who wanted to quit smoking, which was accompanied by increased activity in brain regions related to self-control and self-awareness. These findings foster hope that mindfulness-based interventions can improve self-control and awareness of otherwise unconscious "wanting" reactions, and stress the need to consider these types of treatments in combination with CBT, although more randomized controlled studies are warranted.

## **Concluding Remarks**

Ribot's (1896) long-standing definition of anhedonia as "the inability to experience pleasure" has been challenged following progress in affective neuroscience, and in particular following pioneering work suggesting that reward consists of multiple subcomponents that can be divided into the processes of wanting, liking and learning (Berridge and Kringelbach, 2008). Recent proposals to reconceptualize anhedonia as motivational or consummatory subtypes of anhedonia (Treadway and Zald, 2011), or as impaired ability to pursue, experience, and/or learn about pleasure (Rømer Thomsen et al., 2015) have paved the way for objective behavioral measures to complement traditional self-report measures of anhedonia. As reviewed here, a number of behavioral procedures have been developed that can be used to measure impairments in reward motivation and reward learning, while behavioral measures of hedonic impact have proven more difficult. Findings from studies applying these methods support the new conceptualizations of anhedonia by providing robust evidence that reward processing deficits are not restricted to impaired hedonic impact in major psychiatric disorders. Instead, there is increasing evidence of impairments in the ability to pursue and learn about reward in, e.g., depression and schizophrenia. This progress is essential for the purposes of identifying the underlying neurobiological mechanisms of anhedonia, and has important clinical implications for assessment and treatment of anhedonia. For example, self-report measures of anhedonia could be elaborated to reflect all phases of reward processing and it is debatable whether self-report measures of anhedonia are sufficient, or whether they should be complemented by behavioral measures in clinical assessments.

## **Acknowledgments**

I am grateful for the support of the Ministry of Children, Gender Equality, Integration and Social Affairs (Denmark).

## **References**

behavior and judgements of value. *Pers. Soc. Psychol. Bull.* 31, 121–135. doi: 10.1177/0146167204271309


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Rømer Thomsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Unifying treatments for depression: an application of the Free Energy Principle

### *Adam M. Chekroud<sup>1,2</sup>\**

*<sup>1</sup> Department of Psychology, Yale University, New Haven, CT, USA*

*<sup>2</sup> Department of Neuroscience, Oxford University, Oxford, UK*

#### *Edited by:*

*Nikolina Skandali, University of Cambridge, UK*

#### *Reviewed by:*

*Philip R. Corlett, Yale University School of Medicine, USA Christoph Daniel Mathys, University College London, UK*

#### *\*Correspondence:*

*Adam M. Chekroud, Department of Psychology, Yale University, 2 Hillhouse Avenue, New Haven, CT 06511, USA*

*e-mail: adam.chekroud@yale.edu*

Major Depressive Disorder is a debilitating and increasingly prevalent psychiatric condition (Compton et al., 2006; Andersen et al., 2011). At present, its primary treatments are antidepressant medications and psychotherapy. Curiously, although the pharmacological effects of antidepressants manifest within hours, remission of clinical symptoms takes a number of weeks—if at all. Independently, support has grown for an idea—proposed as early as Helmholtz (von Helmholtz, 1924)—that the brain is a prediction machine, holding generative models<sup>1</sup> for the purpose of inferring causes of sensory information (Dayan et al., 1995; Rao and Ballard, 1999; Knill and Pouget, 2004; Friston et al., 2006; Friston, 2010). If the brain does indeed represent a collection of beliefs about the causal structure of the world, then the depressed phenotype may emerge from a collection of depressive beliefs. These beliefs are modified gradually through successive combinations of expectations with observations. As a result, phenotypic remission ought to take some time as the brain's relevant statistical structures become less pessimistic.

**Keywords: major depressive disorder, predictive coding, free-energy principle, antidepressants, computational psychiatry, generative models, antidepressants efficacy**

#### **THE FREE-ENERGY PRINCIPLE**

The free-energy principle has been proposed as a unifying framework that simultaneously links perception and action, and formalizes the roles of brain theories including attention, motor control, and perceptual learning (Friston et al., 2006; Friston, 2010; Clark, 2013). It is a mathematical description whereby the brain is a predictive device that builds statistical models of the world and then seeks to minimize "free energy," an approximation of surprise. Free energy depends on a number of quantities, including the internal states of the brain, the external environment, and exchanges between the two (through action and perception). Mathematically, free energy is a statistical quantity that approximates the surprise in a sensory input, and it rests on two probability densities: a recognition density and a generative model (Friston et al., 2006; Friston, 2010; Clark, 2013).

The first component, the recognition density, is an approximate probability distribution of the causes of sensory data (Friston, 2010). This is quite a simple concept to apply: when a visual neuron fires in response to a horizontal bar in an area of the visual field, we could think that the stimulus caused the neural response. Equally, from the point of view of the neuron, we can consider its firing to reflect a probability that our sensation of a stimulus was caused by a horizontal bar in a particular area. The second component, the generative model, is a joint probability density between data and their causes from which samples can be drawn (Friston, 2010). In the present context, the generative model seeks to capture the statistical structure of its sensory environment by tracking the web of causes of that statistical structure. Crucially, the inferences we make about causes are not restricted to immediate sensory signals (e.g., "the switch caused light"), nor even time-varying/transitive inferences (e.g., "that bird is flying") (Friston, 2010). The brain's model of the world also includes time-invariant regularities that afford structure to our world, e.g., "gravity makes things fall," but could equally be "I am in control of my own actions." Having said this, it is important to note that Bayes' rule and Bayesian brain theory do not guarantee veridical associations.
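For readers who prefer symbols, one standard way of writing the relation between free energy, the recognition density, and the generative model is sketched below. The notation is an assumption introduced here for illustration and does not appear in the text: $s$ denotes sensory data, $\vartheta$ their hidden causes, $q(\vartheta)$ the recognition density, and $p(s, \vartheta \mid m)$ the generative model $m$.

$$
F \;=\; \mathbb{E}_{q(\vartheta)}\big[\ln q(\vartheta) - \ln p(s, \vartheta \mid m)\big]
\;=\; -\ln p(s \mid m) \;+\; D_{\mathrm{KL}}\big[\,q(\vartheta)\,\big\|\,p(\vartheta \mid s, m)\,\big]
\;\ge\; -\ln p(s \mid m)
$$

Because the Kullback-Leibler term cannot be negative, $F$ is an upper bound on surprise, $-\ln p(s \mid m)$, and the bound becomes tight as the recognition density approaches the true posterior over causes. This is the sense in which free energy approximates surprise.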

#### **MODELS IN THE BRAIN?**

A "belief" in the context of the free-energy principle is formalized as a probability distribution of an external state as internally represented by its sufficient statistics (Huys and Dayan, 2009; Friston, 2010; Mathys et al., 2011). For the purposes of this article, a "belief" is simplified as a prediction about the cause of an observation, given a particular circumstance and some previous experience. A "depressive belief" can then be considered as any consistent (negative) bias in these predictions, or vice versa<sup>2</sup>. If our beliefs are to be useful, and reflect genuine associations rather than random co-occurrences, we must consider the prior observations of all elements concerned (Fletcher and Frith, 2009). A simple thought experiment illustrates the concept of belief well: upon meeting a three-legged dog, one needs to recall all the previous times one encountered four-legged dogs to avoid the false prediction that dogs only have three legs.

<sup>1</sup>Technically there is only one generative model, but for the purpose of this essay I refer to multiple internal models since the hierarchical structure supports many processing levels (as in Clark, 2013).

<sup>2</sup>Interested readers are encouraged to read an excellent formal treatment of model inversion by FitzGerald et al. (2014), which explores the implications of approximate Bayesian inference on behavior.
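The three-legged dog thought experiment can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the belief in question is the proportion of dogs with four legs and that it is stored as a Beta distribution updated by counting; the class `BetaBelief`, the helper `belief_after` and the observation counts are all hypothetical and are not taken from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Hypothetical belief about the proportion of dogs that have four legs.

    The belief is a Beta(a, b) distribution; a and b act as pseudo-counts.
    """
    a: float = 1.0  # four-legged dogs seen so far (plus a flat prior of 1)
    b: float = 1.0  # other dogs seen so far (plus a flat prior of 1)

    def update(self, four_legged: bool) -> "BetaBelief":
        # Conjugate Bayesian update: add one count to the matching side.
        if four_legged:
            return BetaBelief(self.a + 1.0, self.b)
        return BetaBelief(self.a, self.b + 1.0)

    @property
    def mean(self) -> float:
        # Posterior expected proportion of four-legged dogs.
        return self.a / (self.a + self.b)


def belief_after(observations, start=None):
    """Fold a sequence of observations (True = four legs) into a belief."""
    belief = BetaBelief() if start is None else start
    for four_legged in observations:
        belief = belief.update(four_legged)
    return belief


# An observer who recalls 99 four-legged dogs before meeting a three-legged one.
experienced = belief_after([True] * 99 + [False])
# An observer whose only experience of dogs is the three-legged one.
naive = belief_after([False])

print(f"experienced observer: {experienced.mean:.3f}")
print(f"naive observer:       {naive.mean:.3f}")
```

In this sketch the experienced observer barely shifts away from expecting four legs, whereas the observer with no prior observations is pulled strongly by the single encounter, which is the point the thought experiment makes about recalling previous experience.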

Bayes' rule, a mathematical theorem, offers a mechanism for how beliefs should develop over time: updating as a function of past experiences (the prior), and the current experience (the likelihood) to produce a posterior belief or expectation. This interplay between likelihoods and priors may sound abstract, but it has the very practical implication that all our experiences depend on our knowledge of their predictability. The connection between the free-energy principle, predictive coding<sup>3</sup> and the Bayesian brain rests on the fact that minimizing free energy corresponds to variational Bayesian inference. This may sound technical; however, it brings an important insight to the table: namely, all quantities involved in making predictions must jointly minimize surprise or free-energy. Notably, proposed quantities include synaptic activity (encoding beliefs about the current state of the world), synaptic efficacy (encoding regularities and causal structure) and synaptic gain (encoding the precision of beliefs) (Corlett et al., 2009, 2011; Adams et al., 2013). This three-way split provides a natural framework to understand perceptual inference, learning, and the encoding of uncertainty, respectively. Crucially, to optimize any one set of these quantities one needs the optimal values of the others. The implicit circular dependency means that disruptions to inference, learning or the encoding of uncertainty will necessarily cause abnormalities in the other domains. Of particular importance here is the notion of precision. In predictive coding, precision corresponds to the (synaptic) gain applied to prediction errors and plays the role of a learning rate. We will return to this later when considering the link between neuromodulators, synaptic gain and their effects on perceptual inference and learning.

<sup>3</sup>Predictive coding refers to a class of theories in which the brain is held to continually generate models of the world based on context and information from memory to predict sensory input (Rao and Ballard, 1999; Friston and Kiebel, 2009; Clark, 2013).
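The claim that precision plays the role of a learning rate can be illustrated with the simplest possible case: a single Gaussian belief updated by a single Gaussian observation. The sketch below is an assumption-laden toy, not the predictive coding scheme of the cited papers; the function name and its parameters are hypothetical.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """One Gaussian belief update written in prediction-error form.

    Precision is the inverse variance. The result is the exact Bayesian
    posterior for a Gaussian prior and a Gaussian likelihood with known
    precisions.
    """
    prediction_error = observation - prior_mean
    # The relative precision of the data sets the gain on the prediction
    # error, so it behaves like a learning rate between 0 and 1.
    learning_rate = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + learning_rate * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision, learning_rate


# Same surprising observation, three different balances of precision.
for prior_pi, sensory_pi in [(1.0, 1.0), (1.0, 9.0), (9.0, 1.0)]:
    mean, _, rate = precision_weighted_update(0.0, prior_pi, 10.0, sensory_pi)
    print(f"prior precision {prior_pi}, sensory precision {sensory_pi}: "
          f"learning rate {rate:.2f}, posterior mean {mean:.2f}")
```

When the prior is held with high precision relative to the data, the same prediction error moves the belief very little. This is one way to picture how an overly precise belief could resist revision.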

#### **PERCEPTION AND BELIEF: WE SEE WHAT WE WANT TO SEE**

Exchanges between our brain's internal states and our external environment are bidirectional. That is, the brain draws its input through perception as it forms a model of the world, and then engages the external environment through action. It is this sampling of the environment that dictates our sensations, thus completing an action-perception cycle. Consider an intuitive example that occurs as we wander through our bedroom in complete darkness. We anticipate what we might touch in the world around us (expectations), and then feel around accordingly as we attempt to confirm these expectations (selective sampling). This process—whereby an agent selectively samples the sensory inputs that it expects—is known as active inference (Friston et al., 2009; Friston and Kiebel, 2009). In most real-life cases, there is already considerable contextual (i.e., prior) information in place when we encounter new information (Friston et al., 2006; Clark, 2013). There is, therefore, the potential for many prior expectations to be primed, alter the processing of incoming sensory information, and influence future environment sampling through action (Friston, 2010; Clark, 2013).

## **WHEN THINGS GO WRONG**

Predictions are only as good as the model that generates them. Disadvantages begin to creep into the system when beliefs become either inaccurate or inflexible (Fletcher and Frith, 2009; Ma, 2012). Recall that all of our experiences are influenced by our beliefs. Experiences that are in line with beliefs become predictable, strengthen the original belief and eliminate the need for the energy-consuming processing of predictable sensations, because they have already been predicted and provide no "newsworthy" information. When an incorrect belief gains strength, it can result in one ignoring potentially informative experiences, or a range of other misattributions (Fletcher and Frith, 2009). The bidirectional belief-action relationship means that any inaccuracies in our model of the world might result in abnormal perception or action, and vice versa (Fletcher and Frith, 2009; Friston, 2010). Additionally, since the model must have a neural basis, correct predictions (in the form of some distributed neural network) could plausibly be disrupted by neurobiological changes.

#### **CHANGING THE MODEL: MINIMIZING FREE-ENERGY**

Friston's original proposition offered two mechanisms for minimizing free-energy: through optimizing actions, and optimizing representations (Friston, 2010). In other words, we must either change the inputs to the model, or change the internal states. Returning to the example of walking through a dark room, there are two ways in which we might minimize surprise. One could sample differently (through action), e.g., turn on a light. Alternatively, one could change expectations (perceptual inference), e.g., entertain the alternative belief that you have woken up in a hotel room as opposed to your bedroom. It is critical to note here that both action and perception constitute an iterative cycle and depend upon each other. This contextualizes the three way dependencies between perceptual inference, learning and precision noted above: in other words, any changes in action rest upon changes in perception that—at some level—depend upon perceptual learning. Because perceptual learning proceeds at a much slower timescale than inference, our beliefs (which underlie action) do not change immediately; rather we successively combine past and current experience to optimize our generative model of the world. As such, rectifying a depressive model of the world (and thus the depressive phenotype) will be a gradual process. More specifically, this gradual process corresponds to the acquisition of generative models and involves the suppression of free-energy or prediction errors (over time) by changing connection strengths in the generative model. It is this process that one might consider to be the target of therapeutic interventions (e.g., by increasing learning rates—as is discussed later).

#### **ANTIDEPRESSANTS: REPAIRING REPRESENTATIONS?**

Interesting parallels arise when considering depression from a free energy viewpoint. Anhedonia—a decreased interest in rewarding stimuli—is a cardinal symptom in the diagnosis of depression. Computational theories of reward-guided learning hold that future reward expectations depend heavily on the difference between actual and expected reward outcomes, i.e., prediction errors (Rescorla and Wagner, 1972; Sutton and Barto, 1998). Neurally, a close link has long been noted between prediction error signals and the firing of dopaminergic neurons during associative learning (Schultz et al., 1997). For instance, one recent study used optogenetic techniques to demonstrate a causal relationship between dopamine and anhedonia (Tye et al., 2013). Here, optogenetic silencing of midbrain (VTA) dopaminergic neurons was shown to induce a lack of sucrose preference (a homolog of anhedonia) in mice, while optogenetic stimulation of the same neurons relieved anhedonia (Hayes, 2013; Tye et al., 2013). However, while animal models suggest that phasic prediction error signaling is impaired in anhedonia, a recent behavioral meta-analysis of human data suggests otherwise (Huys et al., 2013).
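The idea that reward expectations are driven by prediction errors can be sketched with a delta rule in the spirit of Rescorla and Wagner (1972). This is a minimal single-cue version under assumed parameter values; the function name, the default learning rate and the reward sequence are illustrative choices, not values from any of the cited studies.

```python
def rescorla_wagner(rewards, learning_rate=0.1, initial_value=0.0):
    """Track an expected reward with a delta rule.

    Returns the expectation after each trial and the prediction error
    computed on each trial.
    """
    value = initial_value
    values, errors = [], []
    for reward in rewards:
        error = reward - value            # actual minus expected reward
        value = value + learning_rate * error
        errors.append(error)
        values.append(value)
    return values, errors


# A cue that is rewarded on every trial: prediction errors shrink as the
# expectation catches up with the outcome.
values, errors = rescorla_wagner([1.0, 1.0, 1.0], learning_rate=0.5)
print(values)
print(errors)
```

The prediction error is largest when the reward is unexpected and falls toward zero once it is fully predicted, which is the pattern that has been linked to phasic dopaminergic firing.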

Computationally, anhedonia can arise through either a primary insensitivity to reward, or a disrupted ability to learn about rewards (Huys et al., 2013). Huys et al. (2013) directly contrasted these alternatives, conducting a model-based Bayesian meta-analysis of six datasets where depressed patients completed a probabilistic reward learning experiment. They found that reward sensitivity<sup>4</sup>, but not learning, was impaired in MDD patients, whereas the dopamine agonist pramipexole showed the opposite pattern (Huys et al., 2013). Serotonin, on the other hand, is understood to selectively modulate behavioral and neural representations of reward value (Seymour et al., 2012). Specifically, it has been shown through acute depletion of a serotonin precursor (tryptophan) that serotonin depletion leads to impaired reward sensitivity in humans (Seymour et al., 2012). Other model-based differences in reward processing between depressed patients and controls have also been shown: depressed patients have blunted prediction error signals compared to healthy controls (Kumar et al., 2008; Gradin et al., 2011), and fail to adjust reaction times (e.g., post-error slowing) in the same way as control participants (Steele et al., 2007). As a brief but important note, either deficit would lead to inaccuracies in our recognition model, which would no longer faithfully reflect reward causalities in our interactions with the world. Here we see important examples of how neuromodulation (serotonin and dopamine) can adversely affect beliefs about precision (sensitivity or gain and learning rates) to produce suboptimal inference and learning, respectively.

<sup>4</sup>It may be important to clarify that while the Huys paper demonstrated impaired reward sensitivity for *wanting* in MDD patients, it seems that *liking* rewards remains intact in MDD (Dichter et al., 2010).

Seemingly abstract differences between computational quantities may carry important implications for the treatment of depression. Disruptions to neural representations of either reward sensitivity or reward learning would introduce inaccuracies to our generative model of the world. Indeed, a distorted mapping between actions and rewards could conceivably explain a number of depressive symptoms, particularly the feelings of hopelessness, distorted appetite, and anhedonia/decreased interest in pleasurable stimuli. However, the precise mechanism by which this occurs is critical to treatment strategies: while either impairment might lead to similar behavioral deficits, exactly which impairment a patient has carries implications for their treatment. At present, serotonin-targeting treatments are the first-line antidepressant method, but do not seem to alleviate depressive symptoms in many patients (Harmer and Cowen, 2013). After failed SSRI treatment, dopamine-targeting treatments can be attempted (Rush et al., 2006; Trivedi et al., 2006). It is entirely plausible that depressive symptoms, which broadly result from impaired reward processing, might stem from either, or both, impairments to serotonergic reward sensitivity and dopaminergic reward learning.

Perhaps, therefore, a reinforcement learning experiment might have predictive value over which treatment will be effective. If model-based analyses show a patient has a learning rate impairment, they may be more suited to dopaminergic treatment. If a patient has impaired reward sensitivity, perhaps serotonergic interventions ought to work. Considering behaviorally dissociable recognition density distortions offers an interesting re-appraisal of inconsistent antidepressant success, with potential therapeutic implications. In fact, there is no reason to limit our investigation to these parameters. In one recent example (Diaconescu et al., 2014), sophisticated computational modeling was applied to a social learning task (based on Behrens et al., 2008) to investigate the mechanisms by which we infer the intentions of others. Such analyses characterize both social and non-social aspects of learning behavior extensively, and would enable researchers to consider potential abnormalities in MDD in a rich fashion. This kind of computational psychiatric approach is becoming increasingly popular, and has enjoyed recent success across a range of disorders (Montague et al., 2012; Corlett and Fletcher, 2014; Stephan and Mathys, 2014), including psychosis (Corlett et al., 2009, 2011), borderline personality disorder (Fineberg et al., 2014), schizophrenia (Fletcher and Frith, 2009), and delusions (Moutoussis et al., 2011).
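To see why a learning-rate impairment and a reward-sensitivity impairment might be told apart behaviorally, consider a toy simulation. It assumes a delta rule in which reward is scaled by a sensitivity parameter before the prediction error is computed; this is offered in the spirit of the model-based analyses discussed above, not as a reproduction of any published model, and the function name and all parameter values are hypothetical.

```python
def learn_value(rewards, learning_rate, reward_sensitivity, initial_value=0.0):
    """Delta-rule learning in which reward is scaled by a sensitivity term."""
    value = initial_value
    trajectory = []
    for reward in rewards:
        # Sensitivity changes how large the outcome feels; the learning
        # rate changes how quickly the expectation moves toward it.
        error = reward_sensitivity * reward - value
        value += learning_rate * error
        trajectory.append(value)
    return trajectory


rewards = [1.0] * 200  # a reliably rewarded option (illustrative)

typical = learn_value(rewards, learning_rate=0.2, reward_sensitivity=1.0)
low_sensitivity = learn_value(rewards, learning_rate=0.2, reward_sensitivity=0.5)
slow_learning = learn_value(rewards, learning_rate=0.05, reward_sensitivity=1.0)

for label, run in [("typical", typical),
                   ("low sensitivity", low_sensitivity),
                   ("slow learning", slow_learning)]:
    print(f"{label:16s} trial 5: {run[4]:.3f}   trial 200: {run[-1]:.3f}")
```

In this sketch, reduced sensitivity lowers the value the learner eventually settles on, whereas a reduced learning rate only delays arrival at the same value. Both produce a weaker preference early in a task, so trial-by-trial modeling rather than summary performance would be needed to separate them.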

## **REWIRING BELIEFS**

Since beliefs must be stored in the brain, using antidepressants to correct aberrant models of the world ought also to require some neurophysiological restructuring. This is in line with extant explanations for the delay in antidepressant efficacy. One early hypothesis for the delay in clinical effect of SSRIs argued that the desensitization of serotonin autoreceptors on serotonergic cell bodies and terminals is required before SSRIs can fully increase serotonergic neurotransmission (Blier and de Montigny, 1994). In line with this suggestion, clinical trials combining SSRIs with serotonin autoreceptor antagonists have shown a faster and enhanced antidepressant effect (Whale et al., 2010). More recent alternative suggestions concerned the effects of SSRIs on neurotrophins and cellular processes generating new neurons and synapses. Animal models of depression highlight decreased BDNF production, neurogenesis, and synaptic plasticity: effects that are reversed by repeated administration of SSRIs (Santarelli et al., 2003; Castrén, 2005; Castrén and Rantamäki, 2010). Although interesting attempts have been made to apply the free-energy principle to monoaminergic (dopamine) transmission and inference (Friston et al., 2012), the present essay makes no prescriptions as to what specific neurobiological changes would happen as models become "less depressed."

#### **PSYCHOTHERAPY: BREAKING THE ACTION-PERCEPTION CYCLE?**

Correcting representations (the perceptual side to free-energy) might be one way of treating a depressive model of the world, but it is not the only way. Earlier I described the notion of active inference, whereby an agent selectively samples the environment in line with its model of the world, using the intuitive example of wandering in the dark (Friston, 2010). Another way in which we can influence the model is by changing its inputs, that is, optimizing actions to sample the environment differently. For instance, it may be that interactions with certain people or objects are further enhancing depressive symptoms, and/or (conversely) that a lack of positive actions is having a similar effect, reminiscent of learned helplessness models of depression. There is evidence in line with this; several studies have noted depressed patients spend significantly longer looking at negative stimuli (Matthews and Antes, 1992; Eizenman et al., 2003; Caseras et al., 2007; Seth, 2013), and perhaps this excessive negative sampling is skewing the inputs to our models. But this notion also extends beyond an individual's physical actions; excessively sampling negative causal relationships might also distort an agent's model of the world. Indeed, this idea of active sampling concurs with recent theoretical work linking interoceptive inference and emotion, where emotion is held to emerge from cognitive appraisals of physiological states (Seth, 2013). One recent computational paper attempted to model emotional valence as the second time-derivative of free-energy, where emotional valence regulates the learning rate of the causes of sensory inputs (Joffily and Coricelli, 2013). In more plain terms: "when sensations increasingly violate the agent's expectations, [emotional] valence is negative and increases the learning rate. Conversely, when sensations increasingly fulfill the agent's expectations, [emotional] valence is positive and decreases the learning rate" (Joffily and Coricelli, 2013).
Put simply, active inference requires us to sample the world in accordance with our expectations. If expectations imply the world is a rather hostile place, then it will be sampled as such. There are clear analogies with learned helplessness models of depression here. Interestingly, learned helplessness can be (Bayes) optimal, if the world is indeed persistently hostile and has a low volatility.

Note again the crucial role of the learning rate in facilitating the (re)learning of a generative model. Under predictive coding, an implementation of the free-energy principle, the learning rate increases with the expected volatility of environmental contingencies, but volatility is only one factor influencing it (Mathys et al., 2011, 2014). Behaviorally, it has been shown that healthy human subjects assess volatility in an optimal manner; that is, they increase their learning rate when the environment is more volatile (Behrens et al., 2007). In this study, the authors demonstrated that the optimal estimate of environmental volatility was reflected in the fMRI signal in the anterior cingulate cortex (ACC), and variations in this signal predicted between-subject variations in learning rate. Although no study has specifically investigated the ability of depressed patients to optimally update learning rate according to their environment, one study showed that controls, but not patients, significantly activated the ACC when given negative feedback during a gambling task (Gradin et al., 2011). Of course, this in itself does not mean that ACC activity significantly differed between controls and patients (Gelman and Stern, 2006).
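The link between volatility and learning rate can be illustrated with a deliberately simple stand-in. The sketch below is not the hierarchical model of Mathys et al. or the Bayesian learner of Behrens et al.; it swaps in a scalar Kalman filter tracking a hidden quantity that drifts as a random walk, where the drift variance plays the part of volatility. The function name and the numerical settings are assumptions made for illustration.

```python
def steady_state_learning_rate(volatility, observation_noise, n_steps=200):
    """Iterate the scalar Kalman gain for a random-walk hidden state.

    volatility:         variance of the step the hidden state takes per trial
    observation_noise:  variance of the noise on each observation
    Returns the gain (learning rate) after n_steps updates.
    """
    uncertainty = 1.0   # initial posterior variance (arbitrary starting point)
    gain = 0.0
    for _ in range(n_steps):
        # The world may have changed since the last trial, so uncertainty grows.
        predicted = uncertainty + volatility
        # The gain is the weight given to the new prediction error.
        gain = predicted / (predicted + observation_noise)
        # Observing the outcome shrinks uncertainty again.
        uncertainty = (1.0 - gain) * predicted
    return gain


stable = steady_state_learning_rate(volatility=0.01, observation_noise=1.0)
volatile = steady_state_learning_rate(volatility=1.0, observation_noise=1.0)
print(f"stable environment:   learning rate {stable:.3f}")
print(f"volatile environment: learning rate {volatile:.3f}")
```

With everything else held equal, the learner that expects more change settles on a higher learning rate, because old observations go out of date faster. A learner that underestimated volatility would, on this toy account, revise its beliefs more slowly than the environment warrants.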

Nonetheless, it seems that optimizing actions in order to change a model of the world is reflected in psychotherapy approaches. APA (2010)<sup>5</sup> guidelines for depression psychotherapy include helping people "gradually incorporate enjoyable, fulfilling activities back into their lives," and "improve patterns of interacting with other people that contribute to their depression," both of which would constitute optimization of actions under the free-energy framework. Essentially, the free-energy framework recommends breaking any actions or sampling mechanisms that further a depressive model of the world, and this is a recommendation that psychotherapy treatments have already adopted.

#### **FREE-ENERGY: A HOLISTIC APPROACH?**

It is worth briefly setting this approach in the context of other accounts of depression. It is true that we already have elegant emotional/cognitive accounts of depression (Harmer and Cowen, 2013), and there are many putative biological explanations of disruption and restoration at cellular (Castrén and Rantamäki, 2010) and molecular (Berman et al.) levels. However, pharmacological level explanations often lose sight of the multidimensional nature of the depressive phenotype; and emotional or high-level explanations are difficult to relate directly back to the neurobiology. In fact, a predictive coding approach resembles a previous information-processing level approach, illustrated in **Figure 1**, known as the "network hypothesis of depression" (Nestler et al., 2002; Castrén, 2005). At a minimum, this is encouraging: it suggests that the free-energy framework is largely consistent with theories of depression at multiple levels, and offers a plausible alternative that also unifies global brain theories in biological and physical sciences. In future, it may offer an opportunity for researchers to directly transition between depression's many levels of research in a principled, model-based fashion (Montague et al., 2012; Friston et al., 2014; Stephan and Mathys, 2014). For example, there may be different underlying problems in MDD, with different behavioral ways of testing, with specific therapeutic implications.

It is also interesting to relate the free-energy approach to the literature on Depressive Realism, a claim that depressed people are sometimes better at evaluating instrumentality than nondepressed people (Alloy and Abramson, 1979; Alloy et al.). The claim appears robust, if small: a recent meta-analysis of 75 studies indicated a small overall depressive realism effect, although both depressed and non-depressed individuals showed a substantial "optimism bias" (Moore and Fresco, 2012). Some compelling model-driven research suggests the effect may be driven by contextual processing differences, rather than depressed individuals having consistently low expectation of control (Msetfi et al., 2005). This is partially supported by one recent pharmacological study showing that amongst a group of 15 non-depressed participants, acute tryptophan depletion improved contingency judgments for participants with particularly low scores on the Beck Depression Inventory (BDI < 6; Chase et al., 2011). In a free-energy view, "control" in the clinical psychological context corresponds to outcome entropy, and it directly influences an individual's belief about what kinds of outcome distributions are likely. "Maladaptive" priors or generalization tendencies could equally result in differences in perceived control, although "maladaptive" here requires some clarification. Since both depressed and non-depressed individuals typically show an optimism bias, "maladaptive" is simply with reference to non-depressed individuals, rather than a comment on optimality. Although a detailed analysis of entropy and perceived control is beyond the scope of the current article, Huys and Dayan (2009) offer an excellent mathematical treatment of behavioral control from a Bayesian perspective.

The free-energy approach detailed in this review is not, however, an exhaustive account of depression. Symptoms of low mood and anhedonia may be cardinal symptoms in MDD but they are not the only ones: the accompanying loss of appetite, sleep disturbance, diurnal fluctuation, low energy and somatic symptoms are a key part of the illness. Furthermore, these additional symptoms can sometimes be the ones that are slowest to resolve. It is possible that wider symptoms may emerge as a behavioral consequence of a distorted generative model: for instance, if food rewards are no longer subjectively rewarding then loss of appetite or motivation to eat is understandable, if not predictable. In addition, although this review focused on the most common treatments for depression (monoaminergic antidepressants and psychotherapy), there is now preliminary evidence that intravenous administration of ketamine and other glutamatergic drugs can have remarkably quick, but transient, antidepressant effects in unipolar and bipolar depression (aan het Rot et al., 2010, 2012; McGirr et al., 2014). The speed of ketamine's antidepressant efficacy here may appear problematic for a free-energy interpretation at first glance. However, few treatments in psychiatry or medicine are effective after a single dose, and ketamine is no exception: patients often return to the depressed state without a course of treatment over a number of weeks (aan het Rot et al., 2010; McGirr et al., 2014). From a free-energy perspective, ketamine can be considered a faster vehicle for repairing representations, but one that nonetheless takes some time to repair the generative model. In addition, from a neurobiological perspective, ketamine's acute and sustained antidepressant effects have been hypothesized to depend on synaptogenesis (Li et al., 2010), in a fashion reminiscent of monoaminergic antidepressants. Further insight comes from Bayesian treatments of psychosis using ketamine as a model (Corlett et al., 2009, 2011). Here, distinct influences have been proposed for ketamine in the short and long term. In the short term, it is thought that ketamine briefly disturbs cortical inference by blocking NMDA receptors, and impairing the specification of top-down prior expectancies (Corlett et al., 2011). With chronic ketamine use, however, there is a compensatory increase in the number and function of NMDA receptors; longer-lasting changes that can give way to a delusional phenotype and depressed mood rather than remission from depression (Morgan et al., 2010; Corlett et al., 2011).

<sup>5</sup>Available online at: http://psychiatryonline.org/pb/assets/raw/sitewide/practice\_guidelines/guidelines/mdd.pdf

## **CONCLUSION**

Under the free-energy principle the brain is an active prediction engine that seeks to establish a model of the causal structure of our environment, and minimize long-term surprise. The brain makes inferences about causal relationships at many levels of abstraction, and there is growing neural evidence in line with this theory. If the brain does indeed represent a collection of beliefs about the causal structure of the world, then the depressed phenotype emerges from a collection of depressive beliefs. The two mechanisms by which free-energy is minimized (and perhaps, how agents survive) are by optimizing actions, and optimizing representations. The two are markedly reminiscent of depression's two main therapies: psychotherapy and antidepressants, respectively. Distorted representations of the world might stem from distortions in reward representation, and correcting these through monoaminergic interventions might be a solution to anhedonia symptoms in particular. Similarly, a distorted sampling mechanism may exacerbate depressed mood, and require psychotherapies in an attempt to break the spiral of self-defeating actions. Either way, solutions ought not to be immediate: beliefs are changed gradually through successive combinations of past experiences and current observations. Irrespective of the formal insights into putative pathophysiology in depression, it may be the case that the holistic (theoretical) framework on offer here may be useful in cognitive behavior therapy. In other words, it may provide a rationale for the conjoint use of psychotherapeutic and pharmacological approaches that could be useful for both the therapist and patient alike. One thing is clear: depression is a multi-faceted illness in which disruptions to beliefs, emotions, perception and action are intertwined. Perhaps, therefore, our approach must intertwine beliefs, emotions, perceptions and actions accordingly.

## **ACKNOWLEDGMENTS**

I would like to thank Prof. Karl Friston, as well as the two reviewers, for their detailed comments on the manuscript. In addition, I would like to thank Profs. Catherine Harmer, Phil Burnet, Jutta Joorman, and Gregory McCarthy for their thoughtful discussions and advice. A. M. Chekroud was supported by the Philip Wright Scholarship, offered by Wadham College (Oxford).

## **REFERENCES**


Sutton, R. S., and Barto, A. G. (1998). *Reinforcement Learning*. Cambridge, MA: MIT Press.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 November 2014; accepted: 30 January 2015; published online: 20 February 2015.*

*Citation: Chekroud AM (2015) Unifying treatments for depression: an application of the Free Energy Principle. Front. Psychol. 6:153. doi: 10.3389/fpsyg.2015.00153*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology.*

*Copyright © 2015 Chekroud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **The neuroscience of positive memory deficits in depression**

## *Daniel G. Dillon\**

*Motivated Learning and Memory Laboratory, Center for Depression, Anxiety and Stress Research, McLean Hospital, Harvard Medical School, Belmont, MA, USA*

Adults with unipolar depression typically show poor episodic memory for positive material, but the neuroscientific mechanisms responsible for this deficit have not been characterized. I suggest a simple hypothesis: weak memory for positive material in depression reflects disrupted communication between the mesolimbic dopamine pathway and medial temporal lobe (MTL) memory systems during encoding. This proposal draws on basic research showing that dopamine release in the hippocampus is critical for the transition from early- to late-phase long-term potentiation (LTP) that marks the conversion of labile, short-term memories into stable, long-term memories. Neuroimaging and pharmacological data from healthy humans paint a similar picture: activation of the mesolimbic reward circuit enhances encoding and boosts retention. Unipolar depression is characterized by anhedonia–loss of pleasure–and reward circuit dysfunction, which is believed to reflect negative effects of stress on the mesolimbic dopamine pathway. Thus, I propose that the MTL is deprived of strengthening reward signals in depressed adults and memory for positive events suffers accordingly. Although other mechanisms are important, this hypothesis holds promise as an explanation for positive memory deficits in depression.

#### **Keywords: depression, reward, memory, anhedonia, hippocampus**

## **Introduction**

Unipolar depression impairs episodic memory (Burt et al., 1995; Zakzanis et al., 1998), and the dominant hypothesis is that stress-induced changes in the hippocampus are responsible (Sapolsky, 1996, 2000; Sheline et al., 1999; MacQueen et al., 2003; MacQueen and Frodl, 2011; Huang et al., 2013; Travis et al., 2014). The idea is straightforward: stress is a potent risk factor for depression (Kendler et al., 1999; Monroe and Harkness, 2005), and the dense concentration of glucocorticoid receptors in the human hippocampus (Wang et al., 2013) makes it a prime candidate for stress-induced neurotoxicity<sup>1</sup>. Indeed, the hippocampus is smaller in adults with recurrent depression (MacQueen and Frodl, 2011), and post-mortem exams reveal shrunken hippocampal neurons and glial cells in depressed individuals (Stockmeier et al., 2004). Because the hippocampus is the seat of episodic memory (Squire, 1992), a causal chain running from stress to depression to hippocampal volume reductions provides an appealing account of memory deficits in depression.

However, there are reasons to think the hippocampal stress hypothesis would benefit from supplementation. First, although adults with recurrent depression typically show memory deficits and hippocampal volume reductions, strong evidence for a direct relationship between these two phenomena is lacking (MacQueen et al., 2003; Travis et al., 2014). Second, research on emotional memory suggests an important role for neural mechanisms implicated in positive emotional responses. Excessive sadness is one of two cardinal symptoms of major depressive disorder (MDD; American Psychiatric Association, 2013), and since there is a tendency for depressed adults to preferentially attend to and ruminate on negative information (Beck et al., 1979; Nolen-Hoeksema, 1991), one might expect better memory for negative material in depressed adults, as information consistent with one's mood is preferentially encoded (Bower, 1981, 1987). Indeed, this effect is often found (e.g., Gotlib et al., 2004; Hamilton and Gotlib, 2008; Matt et al., 1992), but positive memory deficits are also robust. Healthy adults often show better memory for positive versus negative or neutral material, but this advantage is reduced in depression (e.g., Gotlib et al., 2004; Hamilton and Gotlib, 2008), and a seminal meta-analysis found that this group difference is more reliable than enhanced memory for negative material in depression (Burt et al., 1995). Why is memory for positive material impaired in depressed adults? I propose that it reflects anhedonia, the second cardinal symptom of MDD, and its association with dysfunction in mesolimbic dopamine circuits that respond to reward (Schultz, 1998).

<sup>1</sup>For brevity, I will use the terms "depression" and "memory" to refer to unipolar depression and episodic memory, respectively, throughout the remainder of the manuscript unless otherwise noted.

#### *Edited by:*

*Frank Ryan, Imperial College London, UK*

#### *Reviewed by:*

*Carlos E. Norte, Federal University of Rio de Janeiro, Brazil Roger C. McIntosh, University of Miami, USA*

#### *\*Correspondence:*

*Daniel G. Dillon, Motivated Learning and Memory Laboratory, Center for Depression, Anxiety and Stress Research, McLean Hospital, Harvard Medical School, 115 Mill Street, Belmont, MA 02478, USA ddillon@mclean.harvard.edu*

#### *Specialty section:*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

*Received: 29 June 2015 Accepted: 13 August 2015 Published: 07 September 2015*

#### *Citation:*

*Dillon DG (2015) The neuroscience of positive memory deficits in depression. Front. Psychol. 6:1295. doi: 10.3389/fpsyg.2015.01295*

In this article I review key studies from cellular, behavioral, and human neuroscience that underscore the critical role of dopamine transmission in the persistence of episodic memories (for more extensive reviews, see Frey and Morris, 1998a; Lisman and Grace, 2005; Shohamy and Adcock, 2010; Lisman et al., 2011). When considered alongside growing evidence of reward system dysfunction in depression (Eshel and Roiser, 2010; Treadway and Zald, 2011; Dillon et al., 2014b), these data invite the following inference: memory for positive material is impaired in depressed adults because, on average, mesolimbic dopamine circuits do not mount an adequate response to reward in such individuals, compromising interactions between reward and memory systems that ensure memory retention. As stress is the most likely cause of weak dopaminergic reward responses in depression (Dillon et al., 2014b; Pizzagalli, 2014), this hypothesis extends the existing literature: stress can not only induce hippocampal volume reductions, it can also perturb reward circuits and preferentially disrupt the formation of positive memories.

## **Synaptic Tagging and Capture**

The synaptic tagging and capture (STC) hypothesis forms the foundation of this proposal (Frey and Morris, 1997, 1998a,b; Frey and Frey, 2008). STC solves a routing problem confronted by the brain's cellular learning and memory mechanism, long-term potentiation (LTP; Bliss and Lømo, 1973). To appreciate the nature of the problem, consider a population of hippocampal neurons communicating across a synapse. If an experimenter applies weak electrical stimulation to the pre-synaptic neurons and measures the post-synaptic response, she will observe an increase in activation (specifically, excitatory field potentials) that will decay back to baseline within about 3 to 6 h; this transient response is called early-LTP (Frey and Morris, 1997). But if she applies strong stimulation, she will record increased post-synaptic activation that can be maintained for days, weeks, or months (Abraham, 2003); this sustained response is called late-LTP. Late-LTP reflects the operation of molecular processes that structurally remodel the connection between pre- and post-synaptic neurons, transforming a country path into an eight-lane highway and making information trafficking easier (Baudry et al., 2011). The routing problem arises because most of the plasticity-related proteins (PRPs) needed for remodeling are synthesized in the body of the neuron, but LTP is synapse-specific. There are thousands of synapses per neuron (Pakkenberg et al., 2003), all of them far downstream from the cell body, so getting PRPs to the right synapses is a significant challenge. In other words, if late-LTP amounts to building a bridge between neurons, then the concrete is mixed at a rural plant and must be delivered to a construction site in a busy neighborhood downtown. How does the delivery driver find his way?

Synaptic tagging and capture provides an appealing answer (**Figure 1**). It turns out that even weak activation is sufficient to set molecular tags that mark a synapse as a candidate for strengthening. If only weak activation occurs, PRPs will not be synthesized and the tags will fade away, consistent with the transient nature of early-LTP (Frey and Morris, 1998a,b; Rogerson et al., 2014). However, if PRPs are synthesized they can begin their trek down the axon without a determined destination, simply stopping at any synapse that displays a tag. When they do so, structural remodeling occurs and a stronger synaptic connection is in place: late-LTP.

This raises another issue. Neurons use an on-demand inventory system and synthesize PRPs as they are needed. In the laboratory, strong electrical stimulation drives PRP synthesis, but what triggers their production in nature?

Dopamine is critical (Frey and Morris, 1998a; Smith et al., 2005; Lisman et al., 2011). Direct evidence for this claim comes from an *in vitro* study that used fluorescence imaging to track the production of new proteins in hippocampal neurons (Smith et al., 2005). This work showed that applying D1/D5 dopamine receptor *agonists* to the hippocampus results in increased protein synthesis, an effect that can be blocked by applying D1/D5 receptor *antagonists*. Furthermore, the newly synthesized proteins included a subunit of the AMPA receptor, which is a key contributor to late-LTP: increased post-synaptic AMPA receptor density is a major part of the "bridge" between neurons (Malinow, 2003). Finally, if dopamine-driven protein synthesis contributes to LTP, then one would expect application of D1/D5 receptor agonists and antagonists to influence the post-synaptic response to pre-synaptic stimulation in opposite directions. Indeed, this was observed: for a given level of stimulation, application of the agonists doubled the response frequency of post-synaptic neurons, but application of the antagonists blocked this effect. This is compelling evidence that hippocampal dopamine release is crucial for the synthesis of proteins that mediate late-LTP.

There is also a wealth of evidence regarding the role of dopamine in the synthesis of PRPs from studies that inferred their presence (or absence) by examining late-LTP. For instance: (1) the concentration of hippocampal dopamine increases following late-LTP induction (Frey et al., 1990); (2) the application of D1/D5 receptor agonists can directly induce late-LTP, skipping early-LTP entirely (Huang and Kandel, 1995); and (3) the oral administration of L-DOPA—a dopamine precursor used to treat Parkinson's Disease—lowers the stimulation necessary to transition from early to late-LTP in the rodent hippocampus, as does application of a D1/D5 receptor agonist (Kusuki et al., 1997). All of these effects can be blocked by administration of D1/D5 receptor antagonists or protein synthesis inhibitors.

In summary, dopamine release drives PRP synthesis, which enables the transition from early to late-LTP (Frey and Morris, 1998a). Critically, any event capable of driving dopamine release is also strong enough to place tags on the synapses it activates. Since PRPs are sequestered by tag-bearing synapses, this is a mechanism for memory formation: dopamine release stabilizes LTP for the events that caused its release. If dopamine release is disrupted, memory for those events will suffer accordingly. Because many of the events that cause dopamine release also elicit positive emotional responses, disrupting the mesolimbic dopamine circuit should preferentially impair long-term memory for emotionally positive events.

## **Dopamine Supports the Retention of Episodic Memory in Non-human Animals**

One could accept the findings reviewed above and yet question whether dopamine is relevant to episodic memory. After all, most of the work just described was conducted *in vitro* rather than in behaving animals. Even if dopamine proves important for hippocampal function in rodents, classic accounts of episodic memory hinge on conscious experience (Tulving, 1993). Given the challenges associated with assessing consciousness in lab rats, one could fairly ask: Can we study episodic memory in animals?

The answer is yes. An alternative approach to episodic memory conceptualizes it not in terms of consciousness, but rather as memory for events-in-context: not only knowing that an event happened, but also knowing the spatial and temporal circumstances in which it happened (Clayton and Dickinson, 1998; Allen and Fortin, 2013). In other words, episodic memory is memory for an episode, defined as events taking place in a certain space or time, or in a particular sequence (MacDonald et al., 2013).

An elegant study used a method that met these criteria to show that memory persistence depends on the activation of hippocampal dopamine receptors (Bethus et al., 2010). The study used a sand-filled rectangular "event arena" with six wells in which rats could dig. On each training trial, rats were placed in one of four start boxes and given a food pellet with one of six flavors. The rat was then allowed to enter the arena and search for food. Critically, the "cue" pellet given in the start box determined which well contained additional pellets. The pairing of cues to wells was stable, and after 16 days the rats were running to the correct wells on about 80% of trials. This degree of accuracy is especially impressive because the use of four start boxes located in different places meant that a strategy based on landmarks would necessarily fail. Instead, the animals must have developed a map into which the cue-well associations were embedded.

With training complete, the experimenters tested episodic memory by introducing a novel cue flavor and placing two new wells in the arena, along with the original six. Only one of the two new wells was loaded with pellets. The rats explored the arena until they found the pellets, thus encoding a new cue-well association. Next, an identical trial was used to test memory, following delays of either 30 min or 24 h. Rats showed excellent memory regardless of the delay, running to the new cued well and avoiding both the new uncued well and the original six wells. However, performance after 24 h was at chance following hippocampal lesions administered after training but prior to encoding (Tse et al., 2007), and a similar impairment was observed when NMDA receptor blockers were injected into the hippocampus (Bethus et al., 2010). Because activation of NMDA receptors is essential to LTP (Lynch, 2004), these results indicate that 24-h memory for material learned in the event arena requires hippocampal LTP. In other words, performance in this task displays the hallmarks of episodic memory: rapid encoding of events-in-context, with long-term retention dependent on LTP in the hippocampus.

Having established that (episodic) memory for novel cue-well pairs depends on hippocampal LTP, the experimenters next demonstrated a critical role for dopamine in memory persistence (**Figure 2**). When they injected D1/D5 receptor antagonists into the hippocampus prior to encoding and tested retrieval after a 30-min delay, performance was unaffected. However, delaying the retrieval test by 24 h revealed a profound impairment, with performance no better than chance: given the new cue, rats ran to the uncued well as often as to the cued well (importantly, memory for the original six cue-well pairings was unimpaired). Control experiments showed that this effect did not reflect state-dependent retrieval: if the antagonists were injected into the hippocampus at encoding then 24-h memory was impaired whether or not the antagonists were also injected at retrieval, and injecting the antagonists only at retrieval had no effect. In other words, using a drug to block dopamine signaling in the hippocampus at encoding impaired long-term memory, and this effect could not be rescued by injecting the drug again just prior to the memory test. These data extend *in vitro* studies reviewed in the prior section to behavior *in vivo*: just as dopamine is more important for late than early LTP, so dopamine release in the hippocampus is more important for long-term versus short-term retention of episodic memories (for additional evidence, see Wang et al., 2010).

One need not inject dopaminergic agents to observe these effects, as well-chosen behavioral interventions yield similar results. For instance, novelty elicits mesolimbic dopamine release in non-human animals (Lisman and Grace, 2005), and exposure to novelty enhances memory persistence in the event arena. In a modified version of the task, rats encoded a new association between a cue flavor and the location of a single well—the only well available—before completing a cued retrieval test in which six wells were available (Wang et al., 2010). This procedure was repeated daily for 6 months, with a different cue-well association learned and tested each day (the experimenters drew an analogy to parking one's car in different spots within the same lot). When a single pellet served as the reward for finding the correct well during learning, retrieval for cue-well pairings was adequate after a 30-min delay but decayed substantially after 24 h. However, 24-h memory was rescued by a simple manipulation: exposing the rats to a novel environment 30 min after learning.

Synaptic tagging and capture can account for this finding. Encoding the cue-well pairing results in synaptic activation that is sufficient to set a tag but not strong enough to drive PRP synthesis because of the meager reward (1 pellet) delivered for finding the correct well. Consequently, if nothing else happens, late-LTP will not occur and the cue-well memory will fade away within 24 h. However, exposure to a novel context drives dopamine release in the mesolimbic circuit, and this is sufficient to trigger PRP synthesis. The molecular tags set during cue-well learning will not fade in the 30 min post-encoding, so when novelty exposure triggers dopamine release and PRP synthesis, the tags will still be set and can sequester PRPs, leading to late-LTP and robust 24 h memory. This account was strengthened by the fact that the positive effect of novelty on memory retention was blocked if novelty exposure was accompanied by the injection of D1/D5 receptor antagonists or protein synthesis inhibitors; either of these agents can disrupt PRP synthesis and thus deny the synapse a chance at late-LTP.

Note that enhanced memory persistence following a post-encoding manipulation (here, novelty exposure) is consistent with STC: as long as synaptic tags are present when PRPs are made available, the transition from early to late-LTP will occur and a lasting memory will be formed, regardless of whether the PRPs are synthesized before or after the tags are set (Frey and Morris, 1998a). STC also predicts a time window governed by the decay of the tags, and this study found evidence for such a window: novelty exposure 6 h post-encoding did not rescue 24-h memory and injecting protein synthesis inhibitors 6 h post-encoding did not impair 24-h memory. Thus, tag setting and PRP capture are complete within 6 h after encoding (other experiments find evidence for considerably narrower windows, e.g., 2 h in Moncada and Viola, 2007).

Finally, using a larger reward (3 pellets) at encoding had the same effect as novelty exposure: memory accuracy was sustained at the 24-h test, and injection of D1/D5 receptor antagonists into the hippocampus blocked this effect. Remarkably, this blockade of "strong" reward memory could be prevented by novelty exposure. When rats explored a novel environment prior to completing encoding trials in which 3-pellet rewards were given, 24-h memory was intact even if encoding occurred under dopamine receptor blockade. This confirms a central prediction of STC: once PRPs are made available (here, by novelty exposure), they can be captured by tags that are set shortly afterward, even if PRP production is blocked during tag setting (see also Moncada and Viola, 2007; Ballarini et al., 2009).
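The logic of these event-arena findings can be made concrete with a toy model. The sketch below is not taken from any of the studies cited; it simply encodes the STC rule that late-LTP requires a synaptic tag and available PRPs to overlap in time. The class `Event`, the function `late_ltp`, and the two lifetime constants are hypothetical names and values chosen for illustration (a 2-h lifetime is assumed for both tags and PRPs).

```python
from dataclasses import dataclass

# Illustrative assumptions only; these are not measured values.
TAG_LIFETIME_H = 2.0  # hypothetical: hours a synaptic tag can still capture PRPs
PRP_LIFETIME_H = 2.0  # hypothetical: hours PRPs stay available after synthesis


@dataclass
class Event:
    time_h: float    # time of the event, in hours relative to encoding
    sets_tag: bool   # does the event activate (and tag) the synapse of interest?
    dopamine: bool   # does the event trigger dopamine release (reward, novelty)?


def late_ltp(events, d1d5_blocked_at=()):
    """Return True if any tag overlaps in time with available PRPs.

    PRP synthesis is assumed to require dopamine release that is not
    blocked by a D1/D5 antagonist at the time of the event.
    """
    tag_windows = [(e.time_h, e.time_h + TAG_LIFETIME_H)
                   for e in events if e.sets_tag]
    prp_windows = [(e.time_h, e.time_h + PRP_LIFETIME_H)
                   for e in events
                   if e.dopamine and e.time_h not in d1d5_blocked_at]
    return any(t_start < p_end and p_start < t_end
               for t_start, t_end in tag_windows
               for p_start, p_end in prp_windows)


weak = Event(time_h=0.0, sets_tag=True, dopamine=False)    # 1-pellet encoding
strong = Event(time_h=0.0, sets_tag=True, dopamine=True)   # 3-pellet encoding
novelty_before = Event(time_h=-0.5, sets_tag=False, dopamine=True)
novelty_30min = Event(time_h=0.5, sets_tag=False, dopamine=True)
novelty_6h = Event(time_h=6.0, sets_tag=False, dopamine=True)

results = {
    "weak only": late_ltp([weak]),
    "weak + novelty at 30 min": late_ltp([weak, novelty_30min]),
    "weak + novelty at 6 h": late_ltp([weak, novelty_6h]),
    "strong": late_ltp([strong]),
    "strong under blockade": late_ltp([strong], d1d5_blocked_at=(0.0,)),
    "novelty, then strong under blockade":
        late_ltp([novelty_before, strong], d1d5_blocked_at=(0.0,)),
}
for label, outcome in results.items():
    print(f"{label}: late-LTP = {outcome}")
```

Under these assumptions the model reproduces the qualitative pattern described above: weak encoding alone fails, novelty 30 min later rescues it, novelty 6 h later does not, and prior novelty protects strong encoding from receptor blockade. The hard-edged time windows are a simplification; real tags presumably decay gradually.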

In summary, rodents can form episodic memories, the persistence of which depends on dopamine release into the hippocampus. Furthermore, memory strength can be bolstered by behavioral manipulations that trigger dopamine release, such as the opportunity to explore a new environment or the receipt of 3 pellets rather than one for successful encoding. The next section highlights the fact that these manipulations have similar effects in humans.

## **Anticipation of Reward and Novelty Support Episodic Memory in Healthy Humans**

A role for dopamine in episodic memory formation is consistent with anatomical studies in non-human animals (Lisman and Grace, 2005). In the rodent, there are direct projections from the ventral tegmental area (VTA) to the hippocampus, along with indirect projections from the hippocampus to the VTA that go through the nucleus accumbens and pallidum. The VTA-to-hippocampus connection permits the dopaminergic modulation of hippocampal LTP that has been discussed thus far. Meanwhile, the hippocampus-to-VTA connection is critical for triggering dopamine bursts in the first instance. The hippocampus holds a representation of the current context and detects deviations from that context. When a novel stimulus or an unexpected reward is detected, the hippocampus registers the deviation and transmits that information to the VTA, leading to burst firing of dopamine neurons.

Do humans have the same pathways? Definitive anatomical studies have not been done, but an investigation of spontaneous functional connectivity in fMRI data is suggestive (Kahn and Shohamy, 2013). In two large samples (*n* = 100 and *n* = 894), analysis of resting state fMRI signals revealed significant correlations among the body of the hippocampus, the nucleus accumbens, and the VTA. Although functional connectivity does not imply anatomical connectivity, the fact that correlated activation among these regions was detectable at rest is encouraging because it implies synchronous patterns of activation in the absence of external stimulation, which suggests stable communication between these regions in humans.

Furthermore, pharmacological work in humans has shown that dopaminergic agents can influence episodic memory. Specifically, enhancing dopamine transmission improves memory in a manner that appears consistent with STC. Chowdhury et al. (2012) administered L-DOPA to healthy older adults prior to an encoding session in which they viewed two categories of images, with one category reliably predicting delivery of monetary reward. Memory for half the items was tested after a 2-h delay, with memory for the remaining items tested after a 6-h delay. The study generated a striking finding—namely, a quadratic relationship between L-DOPA levels and delayed memory for neutral images (i.e., images that did not predict reward delivery), such that a moderate dose of L-DOPA improved 6-h memory for the neutral images relative to small or large doses. No such curve was evident after the shorter delay or for reward-predicting images at either delay. This is intriguing because 6-h memory for neutral images is the condition in which maximal forgetting would be expected, and thus it is also the condition in which heightened levels of dopamine could most easily rescue performance, much in the way that novelty exposure rescued 24-h memory in the 1-pellet encoding condition tested by Wang et al. (2010), as described earlier.

Additional pharmacological studies in humans are needed, as most of the human data pertinent to dopamine and memory come from task-based fMRI research (for review, see Shohamy and Adcock, 2010). Of course, fMRI cannot directly measure dopamine transmission and interpretations should be made cautiously (but for evidence of a relationship between dopamine receptor occupancy and fMRI signal from simultaneous fMRI/PET, see Mandeville et al., 2013). Nonetheless, the parallels with rodent data are striking. The earliest work on this topic demonstrated activation of the midbrain, including the VTA and substantia nigra (SN), in response to novel configurations of familiar images and during the encoding of successfully recalled words (Schott et al., 2004). Robust hippocampal activation was also observed, consistent with the hypothesis that these two regions form a functional unit in humans as well as in rodents.

The interpretation of these data is predicated on the hypothesis that the human mesolimbic dopamine network responds to novelty in much the same way as it responds to reward. A subsequent study from the same team provided compelling evidence for this hypothesis. The dopaminergic midbrain fires strongly to reward-predicting cues and to unexpected reward delivery (Schultz, 1998). To determine whether the human midbrain shows similar functionality with respect to novelty, Wittmann et al. (2007) presented participants with two colored squares that predicted the appearance of novel and familiar images, respectively. The predictions were accurate 75% of the time; on the remaining 25% of trials, unexpected novel or familiar pictures were presented. As hypothesized, fMRI data from the VTA/SN showed a strong response to the novelty-predicting cue and to the unexpected delivery of novel pictures following the familiarity-predicting cue. This is the pattern expected from the reward literature. Meanwhile, the bilateral hippocampus showed a strong response to the cue predicting novel (versus familiar) images, and activation of the VTA/SN and right hippocampus in response to the novelty-predicting cue was correlated across participants. Finally, memory was tested after a 24-h delay, and higher rates of recollection versus familiarity were observed for expected versus unexpected novel pictures. Taken together, the fMRI and behavioral data suggest that coactivation of the VTA/SN and hippocampus during novelty anticipation—prior to image presentation—facilitated encoding success.

Additional fMRI research has found evidence consistent with this interpretation, although it has used monetary reward rather than novel images to drive the VTA/SN (**Figure 3**). For example, in another study, Wittmann et al. (2005) presented images from two categories, only one of which reliably predicted a chance to win money, and then tested memory for the images at two times: immediately following encoding and again 3 weeks later. The fMRI data showed a strong VTA/SN response to the reward-predicting images, which were better remembered than the non-rewarded images when memory was tested after 3 weeks but not when memory was tested immediately. Moreover, when the experimenters probed brain regions whose encoding activation predicted 3-week memory, they found activation in the VTA/SN and the hippocampus. Furthermore, VTA/SN responses showed a *Reward × Memory* interaction: activation was higher for subsequently remembered images that predicted reward delivery relative to both images that (1) did not predict reward and (2) predicted reward but were ultimately forgotten. These data indicate that long-term memory is supported by encoding activation in the hippocampus and the dopaminergic midbrain, and they highlight reward anticipation as a potent means for driving the human VTA/SN (see also Adcock et al., 2006; Wolosin et al., 2012, 2013).

Why does the prospect of earning a reward influence encoding? Murty and Adcock (2014) advance the following argument with respect to the incidental encoding of task-irrelevant but salient events: if you are pursuing a valuable goal and you notice something unusual, it may be worth remembering in case it bears on goal-attainment. To test this hypothesis, they showed participants high (\$2.00) and low (\$0.10) value cues before repeatedly presenting color versions of trial-unique "target" images. The participants' task was to press a button when the target image changed from color to grayscale. On a subset of trials, a novel image was inserted into the series of targets, allowing Murty and Adcock to pose this question: Are participants more likely to remember novel images following presentation of the high versus the low-value cue? The authors predicted that this effect would emerge, based on the hypothesis that the hippocampus would signal expectancy violations (random presentation of a novel image in a repeating sequence) especially vigorously when a desired goal was at stake.

**dopaminergic midbrain predict episodic memory after 3 weeks' delay.** In the hippocampus, stronger encoding activation was observed for pictures that predicted reward delivery and for pictures that were subsequently recognized, but no interaction was observed. By contrast, in the midbrain [including the substantia nigra and ventral tegmental area (VTA)], a *Reward × Memory* interaction was seen: encoding activation was highest for reward-predicting pictures that were ultimately remembered. Thus, in healthy controls the midbrain and medial temporal lobe memory regions appear to work together to support episodic memory for images that predict reward delivery. Reprinted from Wittmann et al. (2005) with permission from Elsevier.

A memory test administered 30 min later confirmed expectations: memory was better for novel images presented after the high-value versus the low-value cue. Furthermore, only one brain region showed a stronger response to novel images following high- versus low-value cues—namely, the left hippocampus. Hippocampal activation was predicted by the VTA response to high-value cues, with subsequent analyses indicating that the VTA-to-hippocampal relationship was not direct, but was instead mediated by several cortical regions, including the visual cortex, medial PFC, ventrolateral PFC, and subgenual cingulate. In summary, this work showed that reward anticipation can serve as a context that facilitates incidental encoding, and it also provided a rationale for the existence of this mechanism: the brain is frugal, allocating memory space to unexpected events only if it seems like they might help the organism reap reward more effectively.

Overall, the human literature is consistent with studies in hippocampal slices and rodents. Novel configurations of familiar stimuli, stimuli presented when novelty is expected, and images shown during reward anticipation are all well-retained after a delay, although dissociating the effect of these manipulations on short- versus long-term memory has not received as much attention as in the rodent literature. Limited data from pharmacological studies indicate that dopamine transmission may drive these effects, and fMRI data are consistent with this hypothesis, as they show that coactivation of the hippocampus and dopaminergic midbrain supports memory in these tasks. Although more research is needed, the existing data indicate a similar role for dopamine vis-à-vis memory persistence in humans and rodents.

## **Depression**

Based on the evidence reviewed thus far, disruption of the mesolimbic dopamine pathway should have significant, negative consequences for episodic memory in humans, with stronger negative effects on long-term versus short-term memory. Because reward delivery is a potent trigger of dopamine release, memory for positive (i.e., rewarding) events should be preferentially disrupted. The putative relationship between anhedonic depression and dysfunction in mesolimbic dopamine circuitry is now widely known and has been the subject of numerous reviews (Eshel and Roiser, 2010; Treadway and Zald, 2011; Dillon et al., 2014b; Pizzagalli, 2014), to which the interested reader is directed. The novel question at hand is whether this dysfunction has the expected effect on memory for rewarded/positive material. Surprisingly, there is virtually no work on this topic. However, a recent study from our group yielded encouraging results.

We scanned healthy controls and unmedicated, depressed adults as they viewed drawings followed by reward and zero (non-reward) tokens (Dillon et al., 2014a). A source memory test administered directly after encoding revealed better memory for rewarded versus non-rewarded drawings in the controls, but this effect was absent in the depressed group (**Figure 4**), consistent with the loss of the positive memory advantage in depression (Burt et al., 1995). We also found a stronger response to the reward versus zero tokens in the VTA/SN and right parahippocampus in the controls, but not the depressed group (**Figure 5**). Finally, the memory advantage for rewarded versus non-rewarded stimuli was strongly correlated with encoding activation in the VTA/SN in the controls, but no correlation was observed in the depressed adults. Thus, we obtained evidence consistent with the hypothesis that disrupted activation of the VTA/SN and MTL memory regions compromised episodic memory for rewarded material in depressed adults relative to controls.

However, there is much work to be done. Our study, like many studies in the human literature on reward and memory, tested memory directly after encoding. Future studies in depression should test memory after a delay of at least 6 h, as this would give dopamine a chance to exert its effects on the transition from early to late-LTP. Furthermore, an incidental encoding design, where memory testing is unannounced beforehand, would be preferable (participants in our study knew their memory would be tested prior to encoding). This is because intentional encoding designs like the one we used invite group differences in encoding strategy, which make accurate interpretation challenging. Finally, although it is not possible to directly assess dopamine transmission using fMRI, the reinforcement learning literature has productively used computational modeling to extract a putative dopamine signal from fMRI data (O'Doherty et al., 2007), and there is reason to believe that a computational approach to psychiatric disorders will prove fruitful (Maia and Frank, 2011). In particular, computational modeling may be able to provide increasingly sensitive tests of the proposed role for dopamine abnormalities in human episodic memory failures (for theory on the role of prediction errors in memory, see Henson and Gagnepain, 2010). For example, one could look for a positive relationship between (positive) prediction errors at encoding and accuracy on delayed memory tests in healthy controls, since positive prediction errors are known to elicit burst firing in VTA dopamine neurons (Schultz, 1998). Obtaining such evidence would then allow a test of the hypothesis that this relationship is disrupted in unipolar depression, either because prediction errors are not generated appropriately in depressed adults, or because, once generated, they are not signaled effectively to medial temporal lobe memory regions.
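To illustrate what a trial-wise prediction error looks like in practice, the sketch below uses a Rescorla-Wagner update, one common choice in the reinforcement learning literature. The text above does not commit to a particular model, so the update rule, the function name `prediction_errors`, the learning rate, and the reward sequence are all assumptions made for illustration.

```python
def prediction_errors(rewards, alpha=0.3, v0=0.0):
    """Return per-trial prediction errors for a single cue.

    Rescorla-Wagner rule (assumed here, not specified by the source):
        delta = reward - expected_value
        expected_value += alpha * delta
    alpha (learning rate) and v0 (initial expectation) are hypothetical.
    """
    value = v0
    deltas = []
    for reward in rewards:
        delta = reward - value   # outcome minus expectation
        deltas.append(delta)
        value += alpha * delta   # shift the expectation toward the outcome
    return deltas


# Hypothetical reward sequence for one cue: 1 = reward token, 0 = zero token.
example_rewards = [1, 1, 0, 1]
example_deltas = prediction_errors(example_rewards)

# Keep only positive prediction errors, the signal proposed to matter here.
positive_deltas = [max(d, 0.0) for d in example_deltas]
print(example_deltas)
print(positive_deltas)
```

In an analysis of the kind proposed, the positive prediction error on each encoding trial would serve as a trial-level predictor of whether that item is remembered at a delayed test, with the strength of that relationship compared between healthy controls and depressed adults.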

At this point some readers may wonder whether this hypothesis, even if true, is clinically relevant. After all, the cardinal symptoms of major depression are excessive sadness and anhedonia, not memory deficits, and most people probably do not think of depression as a memory disorder. However, research with patients tells a different story. MacQueen et al. (2002) administered a survey to 100 outpatients and found that memory problems were rated as the third most troublesome aspect of depression, behind low libido and weight gain but ahead of sad mood, low energy, poor sleep, and many other symptoms that might be considered more characteristic of depression. The same study found reason to believe patient reports: patients' self-reported assessment of memory problems was correlated with performance on recollection-based memory tests, which depend heavily on hippocampal function. In other words, the participants thought that their memories were failing, and they were right (but see Mowla et al., 2008, for evidence that depressed adults have limited insight into the extent of their memory deficits).

A new field focused on memory therapeutics for depression is emerging in response to findings like these (for review, see Dalgleish and Werner-Seidler, 2014; for a review of work directed at improving working memory in depressed adults, see Becker et al., 2015). The field is oriented around the fact that autobiographical memory retrieval has several striking qualities in depressed adults. First, it is oriented toward emotionally negative material. Second, it is frequently overly general: given a cue and the explicit instruction to retrieve a specific, time-limited memory, depressed adults are instead prone to recall categorical memories that span several discrete events and that often have a negative theme. Third, autobiographical memory retrieval is apt to set off a downward spiral of rumination and self-recrimination, as the depressed individual perseverates on past failures and fails to see how events may play out differently in the future. The field of memory therapeutics takes these qualities of autobiographical memory retrieval as targets, training depressed adults to rapidly retrieve positive memories and elaborate upon them, being as specific as possible and avoiding cycles of negative rumination. Furthermore, when unintentional retrieval of negative memories does occur, clients are instructed to regard them with the nonjudgmental, accepting perspective taught in mindfulness-based cognitive therapy (Teasdale et al., 2000), as a way to defuse the memories' emotional charge. Although the field of memory therapeutics is very new, there is already some evidence that these methods are clinically effective (e.g., Watkins et al., 2009, 2012; Neshat-Doost et al., 2012).

The proposal developed here is complementary to work on memory therapeutics because its focus is different. Memory therapeutics are primarily aimed at improving the precision and selectivity of retrieval, whereas the hypothesis advanced here proposes that dopamine dysfunction compromises memory formation. If this hypothesis proves true, there is no reason why a clinician could not target both encoding and retrieval for maximum benefit. One can imagine a scenario in which improved dopaminergic tone in mesolimbic circuits could enhance reward responses in depressed adults, boosting late-LTP and thus improving long-term memory for positive events. Simultaneously, a memory therapeutics approach focused on directing retrieval searches toward concrete, specific, positive material from a client's life could enhance positive mood while simultaneously decreasing the propensity to ruminate on overgeneral negative memories. Together, these two strategies could have a powerful effect on mood that would place clients on an upward spiral toward better outcomes.

## **Qualifications and a Role for Dopamine in Negative Memories**

In order to advance the central argument of this article, I have glossed over several important points that deserve mention. First, this proposal should not be read as equating depression with dopamine dysfunction. Depression is a heterogeneous condition and a diagnosis of MDD can reflect a wide range of symptoms, many of which have little (if anything) to do with dopamine. Thus, this proposal should be narrowly read: it is strictly about how dopamine dysfunction may compromise memory persistence in depression. Second, the mechanism proposed here is complementary to the hippocampal stress hypothesis, because stress is thought to cause dopaminergic abnormalities that are central to depression (Dillon et al., 2014b). In other words, hippocampal volume reductions and reward-based memory deficits may both be downstream consequences of stress, and it may be useful to determine whether and how they interact with one another (i.e., do changes in D1/D5 receptor distributions figure in hippocampal volume reductions?), particularly because recent rodent studies indicate that antagonizing glucocorticoid receptors can disrupt the acquisition, retrieval, and reconsolidation of conditioned place preferences for rewarding events (Dong et al., 2006; Fan et al., 2013; Achterberg et al., 2014). Third, chronic stress models used to induce anhedonia in rodents result in tonically reduced dopamine concentrations (Willner, 2005), but memory for individual events is probably influenced by phasic dopamine bursting. How tonic and phasic dopamine levels interact to influence memory retention is unclear and an important topic for future study (Shohamy and Adcock, 2010). Fourth, this proposal is focused on encoding and consolidation, but it will be important to examine memory retrieval in depression as well, as the discussion of memory therapeutics implies. Healthy adults retrieve positive memories to repair negative moods, and in doing so they activate the striatum and the medial PFC (Speer et al., 2014), two regions that figure prominently in mechanistic accounts of depression (Pizzagalli et al., 2009; Lemogne et al., 2012). Thus, poor memory for positive material in depression may frequently reflect problems with retrieval. Indeed, as retrieval depends heavily on PFC function (Dobbins et al., 2002) and depression is characterized by hypofrontality (e.g., Mayberg et al., 1994; Pizzagalli, 2011), problems mounting successful retrieval attempts may be a general issue in depressed adults, extending beyond memory for positive material. Fifth, although there is a wealth of evidence indicating that depression is typically associated with weak responses to rewarding stimuli and dysfunction in brain reward systems (Treadway and Zald, 2011; Pizzagalli, 2014; Whitton et al., 2015), evidence linking these findings to dopamine is usually indirect and often based on inference from studies in non-human animals; thus, the link between anhedonic symptoms of depression and dopamine needs strengthening. Other neurochemical systems, including the opioid, endocannabinoid, serotonergic, and glutamatergic systems, are likely relevant, and of course humans experience a diverse range of positive emotions that may be more (e.g., excitement) or less (e.g., tranquility) tightly linked to phasic dopamine bursting. Similarly, the account of LTP offered earlier amounts to a thumbnail sketch, as the process of memory formation is remarkably complex and involves many factors at the molecular level. Nonetheless, dopamine appears to play a central role in anhedonic depression and the transition from early- to late-LTP, and thus it is an excellent starting point for a mechanistic account of positive memory deficits in depression.

Finally, it is very clear in the behavioral neuroscience literature that the positive effects of dopamine on retention are not limited to memory for positive events (e.g., Zweifel et al., 2011), although they may be most easily detected and studied in such cases. As an example, Sariñana et al. (2014) showed that D1 receptors in the dentate gyrus are crucial for contextual fear conditioning in mice. They created D1 and D5 receptor knockouts and administered shock in one box (context A). Twenty-four hours later, the mice were exposed to the original box and another, similar box (context B). D5 knockouts and control mice froze readily in context A but not context B, consistent with contextual fear conditioning, but D1 knockouts did not discriminate: they froze to a similar degree in both contexts. Markers of early gene activity (c-fos counts) in the dentate gyrus told a similar tale: D5 knockouts discriminated between their home cage, context A (where they had been shocked), and an entirely novel context, but no such differentiation was seen in the D1 knockouts. Consistent with many findings reviewed earlier, no deficit was seen in D1 knockouts when memory was tested 1 to 3 h after acquisition, implying that the effect is specific to memory persistence, and no deficit was seen in amygdala-dependent cue-based fear conditioning, implying that the effect is dependent on D1 receptors in the hippocampus. In short, this study provided evidence that D1 receptors in the dentate gyrus are critical for contextual fear conditioning.

In the current context, I wish to use the technically elegant work of Sariñana et al. (2014) to make a simpler point: despite the strong link between dopamine and reward, this study underscores the fact that dopamine (and D1 receptors in particular) can be important for retaining memory for aversive experiences so long as they depend on hippocampal activation. Presumably these findings depend on the subset of midbrain dopaminergic neurons that respond to salient stimuli whether those stimuli are rewarding or punishing (Matsumoto and Hikosaka, 2009; Bromberg-Martin et al., 2011). Consequently, although studying interactions between dopamine networks and the MTL memory system may prove especially valuable for understanding positive memory deficits in depression, it may help explain poor memory more broadly. Along these lines, an intriguing study in rodents found that if phasic dopamine bursting is blocked, a simple light-shock fear conditioning paradigm can result in behavior consistent with generalized anxiety, presumably because the tight relationship between illumination of the light and shock delivery is not well-encoded, leading to overgeneralization of the fear response (Zweifel et al., 2011).

## **Conclusion**

Depressed adults typically present with episodic memory deficits, and they rate these deficits as a particularly troublesome aspect of the illness (MacQueen et al., 2002). Furthermore, memory for positive material is especially impaired in depression, but the neural mechanisms responsible for this deficit are not well characterized. I propose that poor memory for positive material in depression emerges because of anhedonia and its association with dysfunction in mesolimbic dopamine networks widely associated with reward processing.

Although there is little direct work on this topic, there is a compelling body of evidence, across several levels of analysis, implicating dopamine transmission in memory persistence. The STC hypothesis presents dopamine as the instigator of protein synthesis that cements the transition from early- to late-LTP in hippocampal neurons. Experiments probing episodic memory in rodents show that blocking D1/D5 receptors in the hippocampus during encoding has little effect on tests of immediate memory but exerts a powerfully negative effect on delayed tests. By contrast, administration of D1/D5 agonists and exposure to novelty reliably boost memory retention. A growing literature in healthy humans is consistent with findings in rats and hippocampal slices, as reward anticipation and novelty exposure enhance encoding and retention. These mechanisms are space-saving devices: the brain can only store so much material, and it gives privileged access to events that are proximal to dopamine release. In this way, episodes that culminate in reward delivery are well-retained, presumably to allow the organism to behave adaptively should similar circumstances arise in the future.

In sum, there is a widespread consensus that anhedonic depression is associated with dysfunction in brain reward circuitry, with an emphasis on stress-induced disruption of the mesolimbic dopamine system. Initial results suggest that this has consequences for memory, as one would predict based on the molecular, behavioral, and human neuroscience literatures. Given the potential to make a meaningful difference in the way memory problems in depression are understood and treated, and in light of the considerable supporting evidence marshaled here, thoroughly testing and refining this proposal would constitute time and effort well-spent.

## **Acknowledgments**

This author is indebted to Drs. Diego Pizzagalli, Randy Auerbach, and Michael Treadway for their helpful comments on drafts of this manuscript. Funding: The author is supported by a grant from the National Institute of Mental Health (K99/R00 MH094438).


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Dillon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Reduction in ventral striatal activity when anticipating a reward in depression and schizophrenia: a replicated cross-diagnostic finding**

*Gonzalo Arrondo<sup>1</sup>, Nuria Segarra<sup>1</sup>, Antonio Metastasio<sup>1</sup>, Hisham Ziauddeen<sup>1,2,3</sup>, Jennifer Spencer<sup>1,3</sup>, Niels R. Reinders<sup>1</sup>, Robert B. Dudas<sup>1,3,4</sup>, Trevor W. Robbins<sup>5,6</sup>, Paul C. Fletcher<sup>1,2,3,6</sup> and Graham K. Murray<sup>1,3,6</sup>\**

#### *Edited by:*

*Eduardo A. Garza-Villarreal, National Institute of Psychiatry, Mexico*

#### *Reviewed by:*

*Sharna Jamadar, Monash University, Australia; Kristine Rømer Thomsen, Aarhus University, Denmark; Tracy Barbour, Massachusetts General Hospital, USA*

#### *\*Correspondence:*

*Graham K. Murray, Department of Psychiatry, University of Cambridge, Box 189 Addenbrooke's Hospital, Cambridge, CB2 0QQ, UK; gm285@cam.ac.uk*

#### *Specialty section:*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

> *Received: 04 March 2015 Accepted: 11 August 2015 Published: 26 August 2015*

#### *Citation:*

*Arrondo G, Segarra N, Metastasio A, Ziauddeen H, Spencer J, Reinders NR, Dudas RB, Robbins TW, Fletcher PC and Murray GK (2015) Reduction in ventral striatal activity when anticipating a reward in depression and schizophrenia: a replicated cross-diagnostic finding. Front. Psychol. 6:1280. doi: 10.3389/fpsyg.2015.01280*

*<sup>1</sup> Department of Psychiatry, University of Cambridge, Cambridge, UK, <sup>2</sup> Wellcome Trust-MRC Institute of Metabolic Science, Cambridge, UK, <sup>3</sup> Cambridgeshire and Peterborough NHS Foundation Trust, UK, <sup>4</sup> Psychiatric Liaison Service, Ipswich Hospital, Norfolk and Suffolk NHS Foundation Trust, UK, <sup>5</sup> Department of Psychology, University of Cambridge, Cambridge, UK, <sup>6</sup> Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, UK*

In the Research Domain Criteria (RDoC) framework, dysfunctional reward expectation has been proposed to be a cross-diagnostic domain in psychiatry, which may contribute to symptoms common to various neuropsychiatric conditions, such as anhedonia or apathy/avolition. We used a modified version of the Monetary Incentive Delay (MID) paradigm to obtain functional MRI images from 22 patients with schizophrenia, 24 with depression and 21 controls. Anhedonia and other symptoms of depression, and overall positive and negative symptomatology were also measured. We hypothesized that the two clinical groups would show reduced activity in the ventral striatum when anticipating reward (compared to anticipation of a neutral outcome) and that striatal activation would correlate with clinical measures of motivational problems and anhedonia. Results were consistent with the first hypothesis: two clusters in both the left and right ventral striatum were found to differ between the groups in reward anticipation. *Post-hoc* analysis showed that this was due to higher activation in the controls compared to the schizophrenia and the depression groups in the right ventral striatum, with activation differences between depression and controls also seen in the left ventral striatum. No differences were found between the two patient groups, and there were no areas of abnormal cortical activation in either group that survived correction for multiple comparisons. Reduced ventral striatal activity was related to greater anhedonia and overall depressive symptoms in the schizophrenia group, but not in the participants with depression. Findings are discussed in relation to previous literature, but overall they provide supporting evidence of reward system dysfunction across the neuropsychiatric continuum, even if the specific clinical relevance is still not fully understood. We also discuss how the RDoC approach may help to solve some of the replication problems in psychiatric fMRI research.

**Keywords: reward system, ventral striatum, monetary incentive delay, depressive symptoms, research domain framework**

## **Introduction**

Current psychiatric diagnostic manuals divide psychopathology into separate diagnostic categories based on the co-occurrence of signs and symptoms rather than on the basis of underlying physiology. However, within each of these categories there is great heterogeneity in the type and severity of symptoms and, similarly, it is common to have overlap in symptoms between the different diagnoses (Lilienfeld, 2014). Recently the Research Domain Criteria (RDoC) framework has been proposed as an alternative framework in order to advance the understanding of mental disorders (Insel et al., 2010). The specific aim of the RDoC project is to increase research that validates new cross-diagnostic dimensions and biological and behavioral measures to carry out better classifications of mental problems. To achieve this improved classification, the RDoC group has proposed a set of five domains or functional systems that are typically affected in psychopathology, and seven units of analysis at which these five constructs can be studied, thus creating a 2-dimensional matrix that can guide research. Additionally, this matrix also includes a further column of "paradigms," that is, tools that can be used to measure abnormalities in the domains (Cuthbert, 2014a).

Schizophrenia patients, as defined in current diagnostic manuals, typically suffer from symptoms of psychosis (delusions, hallucinations, and disordered behavior, thought, and speech), whereas depression patients are diagnosed on the basis of low mood, anhedonia, and accompanying physical symptoms such as reduced energy. However, psychotic symptoms can appear in the course of a major depressive episode and low mood, blunted affect, alogia, or anhedonia are also characteristic of schizophrenia (Bedwell et al., 2014). Moreover, the boundary between depression and schizophrenia is frequently unclear, and the diagnosis of schizoaffective disorder is often used in cases in which a patient has features of mood disorder and schizophrenia. Additionally, the existence of mild symptoms of psychosis in young people, previously thought to confer a high risk for schizophrenia, has been shown to be a more general risk factor for different psychiatric disorders including depression (Murray and Jones, 2012; Hui et al., 2013).

At the cognitive-behavioral construct level, disrupted reward processing has been implicated in both schizophrenia and depression. Moreover, it has been suggested that whereas reward receipt may only be subtly affected in both disorders (Cohen and Minor, 2010; Arrondo et al., 2015), the anticipation of reward (Juckel et al., 2006b; Sherdell et al., 2012) and its motivational aspects (e.g., the effort that a subject is willing to make to get it) (Treadway et al., 2012; Barch et al., 2014; Gard et al., 2014) may be markedly dysfunctional in schizophrenia and in depression (for in-depth reviews of the issue see Barch and Dowd, 2010; Kring and Caponigro, 2010; Treadway and Zald, 2011; Der-Avakian and Markou, 2012; Argyropoulos and Nutt, 2013; Kring and Barch, 2014; Whitton et al., 2015). Abnormalities in reward prediction error signaling in the striatum in schizophrenia are also a well-known finding that could be involved in the pathogenesis of psychotic symptoms (Fletcher and Frith, 2009; Ziauddeen and Murray, 2010). Similar changes have also been found in depression (Kumar et al., 2008), and indeed when participants in both patient groups were studied within the same prediction error learning paradigm researchers found brain activation differences between controls and the two groups of patients (Gradin et al., 2011).

Hence, consistent with the RDoC proposal, there appears to be dimensional continuity between schizophrenia and depression with respect to at least some aspects of reward processing. According to the RDoC matrix some of the stated similarities would fit in the Positive Valence systems domain, which is defined as involving "Systems primarily responsible for responses to positive motivational situations or contexts, such as reward seeking, consummatory behavior, and reward/habit learning," and specifically the Expectancy/Reward Prediction Error component within the Approach Motivation subsystem. The ventral striatum/nucleus accumbens is considered a key area involved in the processing of reward anticipation, and the main neurotransmitter thought to be involved in predicting rewards and learning from them is dopamine. The Monetary Incentive Delay task (MID) is a paradigm that is well known to elicit strong striatal activations related to the expectation and salience of rewards (Knutson et al., 2001). It has been widely used both in the healthy population and in patients with neuropsychiatric symptoms, with several results pointing toward both schizophrenia and depression patients having decreased activity in the striatum when anticipating rewards (Juckel et al., 2006a,b; Nielsen et al., 2012; Stoy et al., 2012). Moreover, reanalyzing data from a set of studies, Hägele and colleagues showed that depression, schizophrenia and alcohol disorders were all associated with reduced activity in the right ventral striatum, with this effect correlated with depressive symptoms as measured by the Beck Depression Inventory (Hägele et al., 2015). However, given that patient groups were not matched to each other or to controls in age and gender in the study by Hägele and colleagues, we sought to replicate and extend it using a specifically-designed study with matched groups; we also wished to relate neural responses to additional key symptoms, as previous work has indicated that ventral striatum (de)activation could also relate to other symptoms such as anhedonia (Simon et al., 2010; Stoy et al., 2012) and, more generally, to positive (Nielsen et al., 2012) and negative (Juckel et al., 2006a; Waltz et al., 2009) symptoms.

In brief, we compared patients with depression and schizophrenia to healthy controls in a Monetary Incentive Delay task. The aim was to further understand how perturbation of the anticipation of reward relates to anhedonia, depression, and overall positive and negative symptoms. Consistent with the results of Hägele and colleagues, we hypothesized that the two clinical groups would show reduced activity in the ventral striatum when anticipating reward and that the striatal activation would correlate with clinical measures (in a direction such that patients with the least activation would have the most pronounced psychopathology).

## **Materials and Methods**

## **Participants**

Sixty-seven participants were recruited for the study, of whom 22 had a diagnosis of schizophrenia, 24 a diagnosis of major depressive disorder and 21 were healthy volunteers. DSM-IV criteria were used for group classification. Inclusion criteria were to be aged between 18 and 65 and speak English proficiently. Exclusion criteria were any contraindication for entering an MRI scanner and history of neurological disorder, physical illness, and alcohol or drug dependence. Participants from both clinical groups had subjective symptoms of anhedonia. Demographic and clinical details of participants are provided in **Table 1**. Patients with schizophrenia were recruited through psychiatric community services of the Cambridgeshire and Peterborough NHS Foundation Trust. Patients with depression were recruited through psychiatric and psychological community services of the Cambridgeshire and Peterborough NHS Foundation Trust, and through public advertisement. Diagnoses for patients from psychiatric services of the Cambridgeshire and Peterborough NHS Foundation Trust and suitability for the study were confirmed by the review of all available clinical and anamnestic information by each individual's psychiatrist (an experienced psychiatrist with several years of postgraduate experience who had passed the membership examination of the Royal College of Psychiatrists). Diagnoses for patients recruited from psychology services or advertisement were confirmed by a psychiatric interview including PANSS (Kay et al., 1987) and assessment with the Mini-International Neuropsychiatric Interview (Sheehan et al., 1998); the interview was conducted either by an experienced research psychiatrist with several years of postgraduate experience who had passed the membership examination of the Royal College of Psychiatrists or by a registered clinical psychologist with several years of postgraduate experience.

Thirteen of the depressed participants were taking antidepressant medication: citalopram 30–60 mg daily, mirtazapine 30–45 mg, and venlafaxine 75–225 mg. All patients with schizophrenia were taking atypical antipsychotic medication (specifically clozapine, aripiprazole, risperidone, quetiapine, or olanzapine); two patients were taking a combination of typical and atypical medication. The mean chlorpromazine equivalent dose was 401.24 (*sd* 91.43) mg/day (Kroken et al., 2009). Eight patients with schizophrenia were additionally taking antidepressant medication: citalopram 20–40 mg, fluoxetine 20 mg, mirtazapine 45 mg, venlafaxine 150–225 mg.

The study was conducted at the University of Cambridge (Wolfson Brain Imaging Centre and Department of Psychiatry). All participants were evaluated using the following clinical scales: Brief Psychiatric Rating Scale (BPRS, Overall and Gorham, 1962); Positive and Negative Syndrome Scale (PANSS, Kay et al., 1987); Scale for the Assessment of Negative Symptoms (SANS); Beck Depression Inventory (BDI, Beck et al., 1996); Snaith–Hamilton Pleasure Scale (SHAPS, Snaith et al., 1995); and the Temporal Experience of Pleasure Scale (TEPS, Gard et al., 2006). Scales were selected to measure constructs with a possible striatal neural substrate and also according to previous findings of significant correlations with ventral striatum activity during the MID. The Cattell Culture Fair Intelligence Test (CFIT) was used to measure IQ (Cattell et al., 1973).

The study was approved by the Cambridgeshire 3 National Health Service research ethics committee. Written informed consent was obtained from all participants prior to participation.

## **fMRI Paradigm**

The fMRI paradigm was a variation of the Monetary Incentive Delay (MID) task (**Figure 1**). It used an event-related design in which stimuli served as cues signaling the subsequent outcome. Overall, there were 60 trials in the experiment, which was conducted in a single scanning session. There were two types of cues (after Kirsch et al., 2003): a reward cue (an arrow pointing upwards; 30 events) and a neutral cue (a horizontal bar with arrows at both ends; 30 events), and the participants were instructed to press a button rapidly when requested, after the cues disappeared but before the outcome was known. After a 1–4 s random interval showing a fixation cross, the image of a coin indicated the amount of reward (£1 in 70% of the events and 1 penny in 30%; 21 and 9 events) in the case of the reward cue, whereas a yellow or orange circle (70 and 30% of events, respectively; 21 and 9 events) was shown after the neutral cue. Hence, despite our instruction to the participants (which was designed to help engagement with the task), rewards did not depend on the subject's performance while pressing the button. This alteration from the original MID task was intended to reduce the confounds of motor preparation and task-induced anxiety, which have been proposed as possible reasons for the previously inconsistent results in depression using the MID (Treadway and Zald, 2011). The inter-trial interval, in which a black screen was shown, lasted between 2 and 6 s. Reward and neutral cues, as well as ensuing outcomes, were pseudo-randomly presented. Thus, the design was optimized to detect differences between the two anticipation conditions. Behavioral information obtained from the task included reaction time and responses. Data on response times were lost for 8 participants due to programming problems that did not affect acquisition of other data.
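The trial counts and timing ranges described above can be sketched as a simple schedule generator. This is only an illustration under stated assumptions: the function name `build_mid_schedule`, the outcome labels, the plain shuffle, and the uniform jitter are hypothetical and are not taken from the original task code, which used a pseudo-random order whose constraints are not reported here.

```python
import random


def build_mid_schedule(seed=0):
    """Return 60 trial dicts with the cue/outcome counts given in the text.

    Assumptions (not from the original task code): a plain shuffle stands in
    for the pseudo-random ordering, and intervals are drawn uniformly.
    """
    rng = random.Random(seed)
    # 30 reward cues: 21 high (one pound) and 9 low (one penny) outcomes.
    # 30 neutral cues: 21 yellow-circle and 9 orange-circle outcomes.
    template = (
        [("reward", "one_pound")] * 21
        + [("reward", "one_penny")] * 9
        + [("neutral", "yellow_circle")] * 21
        + [("neutral", "orange_circle")] * 9
    )
    trials = [{"cue": cue, "outcome": outcome} for cue, outcome in template]
    rng.shuffle(trials)
    for trial in trials:
        trial["fixation_s"] = rng.uniform(1.0, 4.0)  # 1-4 s fixation cross
        trial["iti_s"] = rng.uniform(2.0, 6.0)       # 2-6 s inter-trial interval
    return trials


if __name__ == "__main__":
    schedule = build_mid_schedule(seed=1)
    print(len(schedule), "trials; first trial:", schedule[0])
```

Because outcomes are fixed in advance and independent of the button press, the schedule can be generated entirely before the session, which is what makes the anticipation contrast insensitive to performance.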

## **MRI Acquisition and Preprocessing**

A Siemens Trio Tim 3 T scanner with a 12 channel head coil was used for image acquisition. Functional images were obtained using a gradient-echo T2\*-weighted echo planar sequence and consisted of 32 non-contiguous oblique axial planes (in order to minimize signal drop-out in ventral regions, which was especially important according to our hypothesis). Other parameters included repetition time = 2000 ms; echo time = 30 ms; flip angle = 78°; voxel size = 3.14 × 3.14 × 3.75 mm<sup>3</sup>; matrix size 64 × 64; bandwidth 2232 Hz/Px. The structural image was obtained using a high-resolution T1-weighted three-dimensional MP-RAGE sequence.

Imaging preprocessing and analysis were carried out using the FEAT v5.98 (FMRI Expert Analysis Tool) routine within the FSL program (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl). Functional time-series were sequentially realigned, coregistered to a whole brain echo-planar image and finally to the structural high resolution T1 image, and non-brain components were removed. Functional images were also spatially smoothed using a Gaussian kernel of 6 mm full width at half maximum (FWHM) and frequency filtered (130 s cut off). Images were normalized to the Montreal Neurological Institute (MNI) standard template and the first six volumes were discarded to allow for T1 equilibration effects.
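For readers less familiar with smoothing kernels, the relation between the FWHM quoted above and the standard deviation of the corresponding Gaussian is sigma = FWHM / (2·sqrt(2·ln 2)). The short sketch below applies this standard identity to the 6 mm kernel; the function name `fwhm_to_sigma` is hypothetical, and the snippet does not reproduce any FSL routine.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


if __name__ == "__main__":
    sigma_mm = fwhm_to_sigma(6.0)
    # Expressed in units of the 3.14 mm in-plane voxel size reported above.
    sigma_voxels = sigma_mm / 3.14
    print(f"sigma = {sigma_mm:.3f} mm ({sigma_voxels:.3f} in-plane voxels)")
```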

## **Statistical Analysis**

**Non-imaging Data**

Non-imaging comparisons were carried out in SPSS 21 (IBM, Armonk, NY, USA). Results were considered significant if *p* < 0.05. Normality of all variables was initially evaluated through visual inspection of histograms, whisker plots, and Q-Q plots in the three groups.

Results from clinical scales did not follow a normal distribution in at least one of the groups. Hence, differences between groups were tested using Kruskal-Wallis ANOVA. To investigate differences in proportions (gender, handedness, and ethnicity), Chi-square comparisons were carried out. Finally, variables which a priori were considered to be more likely to meet ANOVA assumptions, and for which the results of the initial inspection were less clear (age, intelligence, and education), were entered into a one-way ANOVA, with residuals saved and explored through inspection of histograms, whisker plots, and Q-Q plots. In the case of these three variables there was some evidence for a non-fully normal distribution of the residuals. However, since ANOVA is considered to be robust to deviations from normality, parametric results are reported in the main article. Additionally, results from Kruskal-Wallis ANOVAs for these variables can be found in Supplementary Material.

Whenever an ANOVA was significant, we conducted *post-hoc* tests consisting of pair-wise comparisons corrected for multiple testing. In the case of Kruskal-Wallis ANOVA, adjusted significance levels were calculated by multiplying the unadjusted significance values by the number of comparisons, with a maximum *p*-value of 1 (SPSS standard method). Gabriel (due to the slightly unequal sample sizes) or Games-Howell procedures were used for one-way ANOVAs depending on the result of Levene's test for equality of variances; the latter was used if the test was significant.
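The adjustment described for the Kruskal-Wallis pairwise comparisons (multiply each unadjusted *p*-value by the number of comparisons and cap the result at 1) amounts to a Bonferroni-style correction. A minimal sketch follows; the function name `adjust_pairwise_p` and the example *p*-values are hypothetical and are not results from this study.

```python
def adjust_pairwise_p(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1.0."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


if __name__ == "__main__":
    # Three groups give three pairwise comparisons; these values are made up.
    unadjusted = [0.01, 0.20, 0.50]
    for raw, adj in zip(unadjusted, adjust_pairwise_p(unadjusted)):
        print(f"unadjusted p = {raw:.3f} -> adjusted p = {adj:.3f}")
```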

In the case of response times, whisker plots representing data in the reward and neutral conditions showed that a participant with depression had much longer response times in both conditions. However, the difference between the RT in both conditions was within normal parameters (as confirmed by the evaluation of the residuals of the RT differences between conditions), and therefore not likely to influence the results. Nevertheless, we carried out a repeated-measures ANOVA (within subjects effect: two levels of condition; between subjects effect: group) with and without this participant.

**Imaging Data**

We used a single statistical linear regression model with 6 explanatory variables (reward cue, neutral cue; high reward outcome, low reward outcome, and the two neutral outcomes) and their temporal derivatives. Movement parameters from the realignment step were also included in the first-level model.

The a priori contrast of interest was the anticipation of reward and consisted of the comparison of the BOLD levels during the reward cue and the neutral cue. The reward anticipation contrast uses all cue events in the experiment, with 30 neutral cues and 30 reward cues. Other possible contrasts included the comparison between outcomes, but it was decided not to investigate them at the group level due to the reduced number of events that they involved, and because well-predicted rewards often evoke limited brain activity at the time of reward delivery (Berns et al., 2001).
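In general linear model terms, the anticipation contrast is a weighted sum of the first-level parameter estimates, with weight +1 on the reward cue, −1 on the neutral cue, and 0 elsewhere. The sketch below illustrates this arithmetic for a single voxel. The regressor labels, the function name `contrast_estimate`, and the example parameter estimates are hypothetical, and the temporal derivatives and movement regressors are omitted on the assumption that they carry zero weight in this contrast.

```python
# Hypothetical labels for the six explanatory variables named in the text.
EV_NAMES = [
    "reward_cue",
    "neutral_cue",
    "high_reward_outcome",
    "low_reward_outcome",
    "neutral_outcome_yellow",
    "neutral_outcome_orange",
]

# Reward anticipation: reward cue minus neutral cue; all other weights are 0.
ANTICIPATION_CONTRAST = {"reward_cue": 1.0, "neutral_cue": -1.0}


def contrast_estimate(betas, weights):
    """Weighted sum of parameter estimates for one voxel."""
    return sum(weights.get(name, 0.0) * betas[name] for name in EV_NAMES)


if __name__ == "__main__":
    # Made-up parameter estimates for one voxel, for illustration only.
    example_betas = {
        "reward_cue": 0.8,
        "neutral_cue": 0.3,
        "high_reward_outcome": 0.1,
        "low_reward_outcome": 0.0,
        "neutral_outcome_yellow": -0.1,
        "neutral_outcome_orange": 0.2,
    }
    print(contrast_estimate(example_betas, ANTICIPATION_CONTRAST))
```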

The reward anticipation contrast (reward cue vs. neutral cue) from the first level was taken to the group-level analysis, where it was included in a one-sample analysis (control group only, to illustrate this contrast in the healthy population), and a one-way between-groups ANOVA (to investigate group differences). Differences were evaluated at the whole brain level and within an a priori volume of interest mask of the ventral striatum previously used by our group (Bernacer et al., 2013). This region of interest (ROI) included the nucleus accumbens and ventral aspects of the caudate nucleus and putamen (blue regions in **Figure 4**). Comparisons at the whole brain level and within the ROI were cluster-thresholded using a family-wise error (FWE) correction of *p* < 0.05 after a strict initial cluster-forming threshold of *Z* > 3 (Woo et al., 2014). Uncorrected results are also displayed in Supplementary Material as part of exploratory analyses that may be of use in future hypothesis generation and meta-analyses, using the same *Z* > 3 threshold and a minimum cluster size of 10 across the whole brain.

We extracted the mean parameter estimates for all clusters of differential activation between groups in the imaging ANOVA analysis (using the FSL tool Featquery in the normalized individual images). Then, *post-hoc* pairwise comparisons (two-tailed *t*-tests; equality of variances was not assumed if Levene's test was statistically significant), aimed at exploring the group differences that were driving the significant ANOVA results, were carried out in SPSS. The same *post-hoc* analysis procedure was used for the significant clusters within the ROI comparison of the ventral striatum region.
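The decision rule described above (pooled-variance test unless Levene's test is significant, otherwise a test that does not assume equal variances) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the SPSS procedure itself: the Levene p-value is taken as an input, the data are hypothetical, and the sketch returns only the *t* statistic and degrees of freedom, since the p-value requires a *t* distribution that the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b, equal_var):
    """Return (t, df) for a two-sample t-test.

    equal_var=True gives the pooled-variance (Student) test;
    equal_var=False gives the Welch test with Satterthwaite df.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)
    if equal_var:
        df = na + nb - 2
        pooled = ((na - 1) * va + (nb - 1) * vb) / df
        se = sqrt(pooled * (1 / na + 1 / nb))
    else:
        sa, sb = va / na, vb / nb
        se = sqrt(sa + sb)
        df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return diff / se, df


def posthoc_t(a, b, levene_p, alpha=0.05):
    """Drop the equal-variance assumption when Levene's test is significant."""
    return two_sample_t(a, b, equal_var=levene_p >= alpha)


# Hypothetical parameter estimates for two groups (not the study's data).
controls = [0.9, 1.1, 0.7, 1.3, 1.0]
patients = [0.4, 0.6, 0.2, 0.8, 0.5]
t_stat, df = posthoc_t(controls, patients, levene_p=0.40)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

When the two groups have equal sizes and equal variances, as in this toy example, both variants give the same *t* and the same degrees of freedom; they diverge as the variances or group sizes become unequal.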

Activation tables were created using Autoaq (Automatic atlas queries for fsl: http://brainder.org/2012/07/30/automatic-atlas-queries-in-fsl). Number of voxels, maximum voxel *Z*-value (Z max), MNI coordinates of the maximum peak (MAX X,Y,Z), anatomical label of the max peak and significant pairwise *post-hoc t*-test comparisons are reported within tables. Anatomical labels were the most probable location of the highest peak according to the Harvard-Oxford cortical and subcortical structural atlases included in FSL (Desikan et al., 2006). Images were created using MRIcroN (C. Rorden; http://www.mccauslandcenter.sc.edu/mricro/mricron/index.html) and presented in neurological form (right in the image corresponds to the right hemisphere).

A secondary analysis consisted of carrying out Spearman correlations between the mean parameter estimates in the right ventral striatum and clinical symptoms in each of the groups separately. The right ventral striatum was selected as it was the region with reduced activation in schizophrenia and in depression.
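For readers unfamiliar with the statistic, a Spearman correlation is the Pearson correlation computed on ranks. The minimal sketch below implements it with the standard library only, using average ranks for ties; the function names and the example values are hypothetical and are not the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: symptom scores and striatal parameter estimates.
symptom_scores = [5, 12, 9, 20, 15]
parameter_estimates = [0.9, 0.5, 0.7, 0.1, 0.3]
rho = spearman(symptom_scores, parameter_estimates)
print(f"rho = {rho:.2f}")
```

In this toy example the ordering of the two variables is exactly reversed, so the coefficient is at its negative extreme; real data would fall somewhere in between.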

## **Results**

There were no group differences in age, gender, handedness, years of education, or ethnicity. As expected, participants with schizophrenia had a lower IQ and patients had greater psychiatric symptomatology than healthy controls (**Table 1**, Table S1).

Response times were shorter in the reward condition (*F* = 36.71, *p* ≤ 0.001) but did not differ between groups (*F* = 0.384, *p* = 0.683). **Figure 2** shows whisker plots for the three groups and the two conditions and **Table 2** summarizes means and standard deviations. This difference in response time between conditions is characteristic of the paradigm, indicating that the experimental manipulation was effective in the whole group of participants (Hägele et al., 2015). Results did not change when a subject with slow RTs was taken out (within-subject factor *F* = 40.348, *p* ≤ 0.001; between-subject factor *F* = 0.624, *p* = 0.540).

Reward anticipation (contrast of reward cue vs. neutral cue) in the control group activated areas in the frontal lobe (medial frontal cortex, anterior cingulate cortex), striatum, thalamus, and cerebellum (FWE cluster corrected across the whole brain, *p* < 0.05; see **Figure 3** and Table S2). When the three groups were compared using ANOVA, no clusters survived the multiple comparison correction at the whole brain level. However, the ROI analysis in the ventral striatum led to the appearance of two clusters (significant when corrected for multiple comparisons) bilaterally in the accumbens nuclei (**Figure 4**, **Table 3**). As ANOVA does not indicate the direction of group difference effects, we employed *post-hoc* tests. *Post-hoc* analysis showed activation was significantly greater in the controls than the depressed patients in the right and left accumbens; controls' activation was significantly higher in the right, but not left, accumbens when compared to schizophrenia patients; the two patient groups did not differ from each other (**Figure 5** and **Table 3**).

**FIGURE 2 | Reaction time.** Repeated measures ANOVA (within-subject factor: condition, between-subject factor: group): reaction times in reward trials were shorter (*F* = 36.71, *p* = 0.001) but did not differ between groups (*F* = 0.384, *p* = 0.683). For graphical purposes a subject from the depression group was eliminated from the image due to a much greater mean RT (around 800 ms) in both conditions, but was included in the statistical analysis (its elimination from the ANOVA did not change the significant results). The central line represents the median value (second quartile, Q2) and the box borders indicate the 1st (Q1) and 3rd quartile (Q3). Hence, the total length of the box is the interquartile range (IQR). Whiskers mark the last value in the sample located between 1.5 × IQR below Q1 and 1.5 × IQR above Q3. Circles represent data between 1.5 and 3 × IQR below Q1 or above Q3.

*C is the control group, D depression and S schizophrenia. p Levene is the p-value of Levene's test for the inequality of variances. Standard deviations are denoted as* ± *sd.*

There were two clusters with *Z*-values above 3 and a cluster size greater than 10 in the uncorrected whole-brain analysis (Table S3). One was located in the right accumbens, the other in the frontal pole. A cluster of eight voxels in the left accumbens was the next biggest cluster. The frontal pole result was derived from a reduced activation in the depression group compared to the other two groups.

**FIGURE 3 |** [...] indicates voxels within significant clusters corrected at the whole brain level (cluster family wise error corrected *p* < 0.05 after a cluster-inducing primary threshold of *Z* > 3). Numbers under slices indicate mm in the MNI coordinate system. The left side of the image represents the left side of the brain.

An analysis of correlations between anhedonia (SHAPS and TEPS scales), depression (BDI), psychiatric symptoms (BPRS), and positive and negative symptoms (PANSS and SANS scales) and the parameter estimates in the right ventral striatal cluster of significant differences in the ANOVA was carried out (**Table 4** and Figures S1–S9 in Supplementary Material). A negative correlation between severity of depression and anhedonia symptoms and ventral striatum activity was found in the schizophrenia group. Regarding anhedonia, SHAPS and the total TEPS score were statistically significant, whereas TEPS' subscales showed a trend toward significance (marginal significance). The correlation with SANS was also close to significance. Significant results found in schizophrenia patients did not hold in the other groups and the direction of other correlations, such as the BPRS and PANSS negative symptoms within the depression group, was opposed to that expected.

**FIGURE 4 | Differences between groups in reward anticipation (One-Way ANOVA of the first level contrast between reward and neutral cues).** Yellow color indicates voxels within significant clusters: cluster family wise error corrected *p* < 0.05 within the ventral striatum after a cluster-inducing primary threshold of *Z* > 3. The region of interest is shown in blue on the left of the image. For improved visualization cluster limits are not circumscribed to our ROI. Differences were driven by greater activations in the healthy controls group compared to both groups of patients (right ventral striatum) or only the depression group (left ventral striatum). Numbers under slices indicate mm in the MNI coordinate system. The left side of the image represents the left side of the brain.

## **Discussion**

We used a Monetary Incentive Delay (MID) paradigm in a single experiment designed to examine activation during reward anticipation in controls, schizophrenia, and depression. Both clinical groups had a reduction in right ventral striatal activity when anticipating rewards, as predicted; activity in the left ventral striatum was reduced in depression but not definitively reduced in schizophrenia. We were not able to find a clear correlation between striatal activation and clinical symptoms of depression or anhedonia in the depression group, but such a correlation was present in the schizophrenia group. The design of the study is consistent with the Research Domain Criteria (RDoC) project: one of the designs that the RDoC framework has proposed is to include participants with disorders from different sections of the DSM/ICD diagnostic manuals with the aim of exploring an abnormal neurobehavioral construct to further understand its pathological mechanisms (Cuthbert, 2014a). The rationale underlying the study was the fact that neurocognitive domains of motivation and reward processing have been proposed to be abnormal in both disorders, and such abnormalities may be at the root of some of the common features between schizophrenia and depression.

Our results of reduced right ventral striatal activity during reward anticipation in both depression and schizophrenia can be considered a replication of the recently published work of Hägele et al. (2015). The combined evidence of both studies suggests that reduced BOLD signaling in the right nucleus accumbens is indeed a hallmark of pathological reward anticipation. The results in the left accumbens are more equivocal; unlike Hägele and colleagues, we demonstrated reduced left accumbens activation in depression, but similar to Hägele and colleagues we failed to demonstrate a conclusive abnormality here in schizophrenia. Further study will be required to investigate whether there is any fundamental pathological importance in this small laterality effect or whether it relates to more trivial issues such as the precise sensitivity of the paradigm.

**TABLE 3 | Differences between groups in reward anticipation compared to the anticipation of a neutral outcome (ANOVA F test).**

*Corrected clusters (p < 0.05 FWE after a primary cluster inducing threshold of Z > 3 within the ventral striatum) are displayed. Number of voxels, maximum voxel Z-value (Z max), MNI coordinates of the maximum peak (MAX X,Y,Z), anatomical label of the max peak and post-hoc pair-wise t-tests are reported. C is the control group, D depression, and S schizophrenia. Any significant post-hoc test results for two-group (pair-wise) comparisons (p < 0.05) are also indicated by the use of "greater than" symbols (e.g., S&D>C indicates that results in pair-wise comparisons S vs. C and D vs. C were significant).*

In the case of schizophrenia, our findings are not new, as previous studies had also found decreased activity in the basal ganglia and specifically in the ventral striatum (Nielsen et al., 2012), although other smaller studies did not find such differences (Walter et al., 2009; Waltz et al., 2010). On the other hand, results in the MID paradigm with depression patients have been surprisingly inconclusive. For example, the first study using the MID in depression (*n* = 14 patients and 12 controls) showed no evidence of abnormality in the basal ganglia (Knutson et al., 2008). Our study and the work by Hägele et al. are among those with the biggest sample sizes and, taken together, strongly suggest that abnormalities in the reward circuit are not limited to patients with a diagnosis of schizophrenia, but also relate to patients with depression symptoms. Other data, such as a meta-analysis on reward and depression (Zhang et al., 2013) and a study by Pizzagalli et al. (2009), indicate that reduced basal ganglia activations may not only (or even primarily) be located in the ventral striatum, but could also involve other subcortical regions such as the caudate head or the posterior putamen. We were not able to confirm the results of the meta-analysis by Zhang and colleagues suggesting frontal and anterior cingulate over-activation in depression during reward anticipation.

One reason that has been put forward to account for the discrepancies between MID depression studies has been that reward is usually contingent on a speeded performance on the MID, which might be influencing some patients through a stress-related, possibly dopaminergic, response (Treadway and Zald, 2011); these authors argue that the necessity to respond rapidly may enhance activation particularly in anxious patients, whose motivation may relate to hypersensitivity to perceived failure. To deal with this potential confound we modified the original task so reward did not depend on the speed of the response. This change may be related to the finding of reduced accumbens activation, which has also been shown in other experiments in which a speeded motor response was not involved (Smoski et al., 2009). It must be noted, however, that the paradigm used in the studies reported by Hägele et al. (2015) was closer to the original design of the MID and included the necessity of a fast response to obtain the reward. Similarly, we did not include a loss condition in our design, as it has been proposed that the strongest activations in healthy controls and when comparing them to patients come from contrasting the gain and neutral cues (Hägele et al., 2015). Hence, our design may be suitable and sensitive for psychiatric research when the main objective is to study reward processing. Although reward was delivered irrespective of the speed of button pressing, response time in the reward condition was shorter. Valence effects on response time can occur irrespective of a direct consequence of speeded responses (O'Doherty et al., 2004; Pessiglione et al., 2006; Murray et al., 2008b). The phenomenon has been termed "reinforcement related speeding" (Cools et al., 2005; Murray et al., 2008a), and it is thought that a potential reward leads to enhanced motivation and hence faster responding (Crespi, 1942).

**FIGURE 5 |** [...] are data further apart from the median.

*C is the control group, D depression, and S schizophrenia. Parameter estimates correspond to the mean of the parameter estimates in the clusters of differential activation between groups in the accumbens (obtained from the right significant cluster in the ANOVA comparing reward anticipation activity between groups). R is the Spearman correlation between the measures. BPRS is Brief Psychiatric Rating Scale, BDI is Beck Depression Inventory, TEPS is the Temporal Experience of Pleasure Scale (ant, anticipatory subscale; con, consummatory subscale), SHAPS is Snaith–Hamilton Pleasure Scale, PANSS is Positive and Negative Syndrome Scale (+, positive symptoms; −, negative symptoms), and SANS is Scale for the Assessment of Negative Symptoms.*

Whilst the right accumbens was a site of common underactivation in both patient groups compared to controls, no differences between the two patient groups were found. This result indicates that the two groups of psychiatric participants may be more similar to each other than when compared to healthy controls, which would be in accordance with the RDoC perspective of a common abnormal domain. Comparisons between patient groups were not reported in Hägele et al., but a qualitative analysis of their plotted results shows them to be in line with ours; the two patient groups had a similar activity reduction in the striatum. Regarding the effects of the medications for psychosis, it is a limitation that all of the schizophrenia patients were taking antipsychotic medication. Thus, we cannot exclude the possibility that the results may in part be secondary to medication effects. However, we note that as a previous study from the Berlin group found that the striatal deactivation normalized when changing from typical to atypical antipsychotics (Schlagenhauf et al., 2008), and as all of our schizophrenia patients were taking atypical antipsychotic medication, it is unlikely that the results are solely due to medication. Nevertheless, it will be important, albeit challenging, to study medication-free samples in future research.

The second hypothesis was that the BOLD signal in the striatum would negatively correlate with clinical symptoms of depression and anhedonia. Previous studies have only reported one or two clinical measures per study, but clinical constructs that correlated with the activity of the ventral striatum have included depression (Hägele et al., 2015), anhedonia and apathy (Simon et al., 2010; Stoy et al., 2012), positive (Nielsen et al., 2012) or negative (Juckel et al., 2006a; Waltz et al., 2009) symptoms, and severity of overall psychiatric symptoms measured by the BPRS (Waltz et al., 2009). In accordance with our cross-diagnostic approach and the aim of further investigating the mechanisms of abnormal reward processing, we decided to include a broad range of clinical measures that encompassed most of the constructs previously reported to correlate with striatal activity in the MID. Results in this regard were mixed. On the one hand, we were able to replicate the finding of Hägele et al. of reduced activity in the right accumbens nucleus in those schizophrenia participants with more depressive symptoms. We also extend the results of Hägele and colleagues to relationships between less activity and more anhedonia in schizophrenia, as measured by the TEPS and SHAPS scales. A limitation of our work is that we did not correct our correlation analyses for multiple comparisons, and considering the modest effect sizes observed, those significant correlations we do find may be vulnerable to Type I error. In contrast to results from Hägele et al., neither the BDI nor any of the other measures were associated with ventral striatal activation in both patient groups. However, the lack of brain-symptom associations demonstrated in the depression group, and the control group (as some of the scales were designed to assess patients only), cannot be considered as evidence of absence of brain-symptom relations.
Some correlations, such as the significant positive relationship between activation and the BPRS and PANSS (negative symptoms subscale) scales in the depression group, were counterintuitive (and not maintained across groups). This may reflect a chance finding, or, as Treadway and Zald (2011) have speculated, the activation elicited by the MID task may be a composite of activation associated with reward anticipation and anticipatory anxiety about potential failure. We attempted to address this possibility by our modification of the task to dissociate reinforcement from performance, but it remains possible that, especially in depression, some symptoms may relate to striatal overactivity (leading to greater activation with more psychopathology) and other symptoms may relate to underactivity. The relationship between psychopathology and striatal reward processing activation may be complex, and it is possible that striatal (dys)function may contribute to symptom expression only through interactions with other regional dysfunction and other psychological processes. The concept of dysfunction of one psychological process in one brain area leading to expression of one symptom has the attraction of being a testable hypothesis but is necessarily an oversimplification.

An important challenge in assessing brain-symptom relationships is accurate symptom measurement. This is especially challenging when experts disagree about what constitutes a particular symptom. Anhedonia is defined in the DSM-IV-TR as a loss of interest or pleasure (American Psychiatric Association, 2000), which arguably reflects the consensus use of the term over the past 100 or so years (e.g., Myerson, 1922; see also Berrios, 1996). However, the DSM-5 contains a new definition within the schizophrenia (not depression) chapter, "the decreased ability to experience pleasure from positive stimuli or a degradation in the recollection of pleasure previously experienced," and arguments continue as to whether a broad or narrow use of the term is more helpful (recently discussed by Treadway and Zald, 2011; Der-Avakian and Markou, 2012; Romer Thomsen et al., 2015). Consensus appears to be building that in depression and schizophrenia, anticipatory and motivational aspects of reward are more compromised than consummatory (reward receipt) aspects, possibly related to dopaminergic abnormalities in both conditions (Argyropoulos and Nutt, 2013; Kring and Barch, 2014; Whitton et al., 2015; though see a recent study by Gard et al., 2014, documenting enhanced anticipation of pleasure in schizophrenia). However, it remains possible that this relative consensus may in part reflect the methods that have been recently used to investigate the issue. Variants of the monetary incentive delay task such as the one we used here may be more sensitive to reward anticipation effects than reward delivery effects, as well-predicted rewards tend to evoke less strong brain responses than surprising rewards (e.g., Berns et al., 2001). In addition, given the limited temporal resolution of fMRI it can be hard to dissociate anticipatory and consummatory aspects of reward.
Furthermore, most fMRI patient studies have, as we did, used monetary rewards, but processing of primary and secondary rewards may differ in important respects (Sescousse et al., 2013).

## **Strengths and Limitations**

As noted, our work is similar to existing research, such as that of Hägele and colleagues. However, there are differences between the two studies. Our study was at higher field strength (we used a 3 Tesla magnet vs. the 1.5 Tesla magnet of Hägele et al.). Our study uses a matched control group whereas Hägele and colleagues, because theirs is a retrospective synthesis and reanalysis of previously published separate works, use a control group that is not matched to their patients in the basic features of age and gender. Our study uses different (slightly simpler) stimuli to Hägele and colleagues and, while both studies require a button press, in ours the reinforcement is not actually contingent on the button press reaction time (as discussed above, this was suggested by Treadway and Zald (2011) as being advantageous in reducing anticipatory anxiety). Our study includes a more detailed assessment of psychopathology (with the limitations that, as previously mentioned, when utilizing the psychopathology for correlation analyses we did not correct for multiple correlations, and that our sample size is modest for correlation analysis).

## **Conclusion**

In summary, while a reduced activation in the ventral striatum when anticipating rewards is a common endophenotype in psychopathology, the mechanisms underpinning this finding and related symptoms are not completely clear. It will be crucial to further pinpoint the clinical relevance of this finding, but this will require further studies and replications. Although some evidence from both the previous literature and our work points toward negative or depressive symptoms being more related to the reported finding, this requires further confirmation. The future of this line of research fits nicely within the specifications laid out by the RDoC project, although it also faces similar challenges, such as measurement error and the biological and psychometric limitations of proposed endophenotypes and their relationship to behavior (Cuthbert, 2014b; Lilienfeld, 2014; Weinberger and Goldberg, 2014). Upcoming studies on reward and psychopathology will have to use bigger sample sizes and a broader range of clinical measurements in order to obtain compelling evidence of the relationship between brain activation and everyday behavior. As shown in our work, future studies could benefit from including participants with a range of diagnoses. This aim can be best achieved by a large-scale collaboration across different research groups. Moreover, the wide use of the MID task makes it a good candidate measure for such collaboration, although a common "official" version would be important. Our results indicate that a paradigm that does not base reward on performance might be better fitted for research with stress-prone participants.

Our overall findings are further evidence of reward system dysfunction across the neuropsychiatric continuum, even if the specific clinical relevance is still not fully understood. Studies on this line of fruitful research could provide new insights on the cross-diagnostic mechanisms of psychopathological symptoms, especially if conducted in a way that minimizes the challenges posed to the Research Domain Criteria approach.

## **Acknowledgments**

Supported by the Wellcome Trust Institutional Strategic Support Fund [097814/Z/11], an MRC Clinician Scientist [G0701911], a Brain and Behavior Research Foundation Young Investigator, and an Isaac Newton Trust award to GM; an award to NS from the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the European Union; by the University of Cambridge Behavioural and Clinical Neuroscience Institute, funded by a joint award from the Medical Research Council [G1000183] and Wellcome Trust [093875/Z/10/Z]; by awards from the Wellcome Trust [095692] and the Bernard Wolfe Health Neuroscience Fund to PF, and by the Cambridge NIHR Biomedical Research Centre. The authors are grateful for the help of clinical staff in CAMEO, in the Cambridge Rehabilitation and Recovery service and Pathways, and in the Cambridge IAPT service, for help with participant recruitment.

## **References**


## **Supplementary Material**

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2015.01280


**Conflict of Interest Statement:** Trevor W. Robbins has received research support from or served as a consultant to Cambridge Cognition, Eli Lilly, GlaxoSmithKline, and Lundbeck. Hisham Ziauddeen has been jointly funded by the Wellcome Trust and GlaxoSmithKline on the Translational Medicine and Therapeutics programme. Paul C. Fletcher has received funds from GlaxoSmithKline for consultation services and from Astra Zeneca for a lecture. Gonzalo Arrondo, Nuria Segarra, Antonio Metastasio, Jennifer Spencer, Robert B. Dudas, Graham K. Murray, and Niels R. Reinders have no financial interests.

*Copyright © 2015 Arrondo, Segarra, Metastasio, Ziauddeen, Spencer, Reinders, Dudas, Robbins, Fletcher and Murray. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The impact of threat of shock on the framing effect and temporal discounting: executive functions unperturbed by acute stress?

*Oliver J. Robinson\*, Rebecca L. Bond and Jonathan P. Roiser*

*Institute of Cognitive Neuroscience, University College London, London, UK*

#### *Edited by:*

*Nikolina Skandali, University of Cambridge, UK*

#### *Reviewed by:*

*Xochitl Angelica Ortiz, Universidad Autónoma de Nuevo León, Mexico Scott A. Langenecker, The University of Illinois at Chicago, USA*

#### *\*Correspondence:*

*Oliver J. Robinson, Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London, UK oliver.j.robinson@gmail.com*

#### *Specialty section:*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

> *Received: 22 May 2015 Accepted: 17 August 2015 Published: 31 August 2015*

#### *Citation:*

*Robinson OJ, Bond RL and Roiser JP (2015) The impact of threat of shock on the framing effect and temporal discounting: executive functions unperturbed by acute stress? Front. Psychol. 6:1315. doi: 10.3389/fpsyg.2015.01315*

Anxiety and stress-related disorders constitute a large global health burden, but are still poorly understood. Prior work has demonstrated clear impacts of stress upon basic cognitive function: biasing attention toward unexpected and potentially threatening information and instantiating a negative affective bias. However, the impact that these changes have on higher-order, executive, decision-making processes is unclear. In this study, we examined the impact of a translational within-subjects stress induction (threat of unpredictable shock) on two well-established executive decision-making biases: the framing effect (*N* = 83), and temporal discounting (*N* = 36). In both studies, we demonstrate (a) clear subjective effects of stress, and (b) clear executive decision-making biases but (c) no impact of stress on these decision-making biases. Indeed, Bayes factor analyses confirmed substantial preference for decision-making models that *did not* include stress. We posit that while stress may induce subjective mood change and alter low-level perceptual and action processes (Robinson et al., 2013c), some higher-level executive processes remain unperturbed by these impacts. As such, although stress can induce transient affective biases and altered mood, these need not result in poor financial decision-making.

Keywords: threat of shock, stress, temporal discounting, framing effect, anxiety, depression, executive function, Bayesian models

## Introduction

Stress can significantly alter the way that we perceive and react to the world, promoting the processing of threatening and unexpected information (Robinson et al., 2013b,c). This threat bias can be adaptive – improving the ability to detect and avoid further sources of stress – but this bias also likely contributes, at least in part, to the facilitatory role that stress plays in the onset of mood and anxiety disorders (Kendler et al., 2004). Threat of unpredictable shock is a reliable within-subject method of inducing stress in both humans and experimental animals (Grillon, 2008; Davis et al., 2010). While the impact of threat of shock on basic perceptual processes is relatively well studied, its impact on higher-level executive processes such as decision-making is surprisingly poorly understood (Robinson et al., 2013c). Here, we explore the impact of threat of shock on two classic decision-making biases: the framing effect and temporal discounting.

Threat of shock is a translational stress-induction procedure in which an individual anticipates an unpredictable and unpleasant electrical shock (Schmitz and Grillon, 2012). In animals threat of shock has been shown to engage neural circuitry distinct from that engaged during fear conditioning (Davis et al., 2010), another widely used but conceptually different aversive processing paradigm. More precisely, *anxiety* (or stress) is operationally defined as the prolonged apprehensive response to a context in which threats *may* occur, whereas *fear* is the acute response to a discrete, defined and *predictable* aversive stimulus or cue (Davis et al., 2010; Robinson et al., 2013c). Critically, stress induced by threat of shock has well documented psychological (Robinson et al., 2013c), psychophysiological (Grillon et al., 1991), and neural effects (Cornwell et al., 2007; Robinson et al., 2012, 2013b). Perhaps more importantly there is also emerging evidence that threat of shock evokes mechanisms related to those that participate in pathological anxiety, for example Generalized Anxiety Disorder (Robinson et al., 2013c, 2014). As such, it is hoped that exploring the impact of threat of shock on cognitive functions may also provide a window into the mechanisms which contribute to pathological stress-related disorders in healthy individuals, prior to disorder onset (Robinson et al., 2014).

Executive function is an umbrella term encompassing cognitive functions that do more than passively process information. Such non-automatic functions might integrate information from multiple sensory domains along with information stored in memory. Executive function therefore encompasses higher order processes, such as planning and decision-making. The impact of threat of shock on executive function has not been comprehensively studied. Here we explore one aspect of executive function: financial decision-making. We are aware of only two studies in which threat of shock was shown to alter financial choices. In the first, threat of shock promoted risk-avoidant decision-making (Clark et al., 2012). However, in this study, the threat cues were discrete, of short duration (5–5.5 s) and possibly more comparable to a fear cue than an anxiety/stress condition. In the second study (Robinson et al., 2015), threat of shock had no main effect on gambling choices, but did interact with trait anxiety, promoting 'harm-avoidant' decisions (i.e., playing fewer 'disadvantageous' decks) under stress in those with low anxiety symptoms. However, the Iowa gambling task used in this second study confounds a number of decision-making and basic cognitive processes, making the causes of this result unclear. In the small number of remaining studies that have addressed this question, the effects seemed to be largely restricted to reaction times (Murphy, 1959; Keinan, 1987; Engelmann et al., 2015), with stress having no impact on the decisions themselves.

Microeconomic theory has outlined a number of biases – or heuristics – which have been shown to guide individual financial decision-making behavior. In this study, we explore two well-established biases: the framing effect and temporal discounting. The framing effect describes the reliable propensity of individuals to alter their decisions, dependent on whether the same choice is 'framed' as a loss or a gain. Specifically, individuals tend to avoid risk when a choice is framed as a gain (e.g., keep £2 out of £10 vs. a 20% chance to win £10), and become risk-seeking when the exact same choice is framed as a loss (e.g., lose £8 out of £10 vs. a 20% chance to win £10). That is to say, all other things being equal, individuals are biased against certain outcomes 'framed' as losses (Kahneman and Tversky, 1979; De Martino et al., 2006). Temporal discounting is another such bias, in which an individual assigns less value to gains in the future relative to the present (Rachlin and Green, 1972; Berns et al., 2007). For instance, offered £10 today or £11 in a month, there is a bias toward accepting the lower value of £10 now. In other words, temporal distance causes devaluation of potential gains. In both paradigms, the subjective utility of financially identical options is biased by the context in which the options are presented. Given that both of these biases can be financially suboptimal and result in reduced gains/increased losses, it is plausible that they might be shifted by contexts such as stress-induced biases toward negative stimuli (Robinson et al., 2013c). Clinical support for this hypothesis comes from the observation of altered decision-making and negative biases in disorders associated with anxiety and negative affect such as major depression (Eshel and Roiser, 2010; and it should be noted that stress, negative mood, and anxiety are relatively diffuse, likely overlapping, concepts).

Therefore, we explored the impact of stress on the framing effect and temporal discounting. We predicted that threat of shock would induce a state of adaptive anxiety (Robinson et al., 2013c) and promote harm-avoidant decisions (Clark et al., 2012; Robinson et al., 2013c), thereby increasing both framing and temporal discounting. Specifically, in the context of uncertain threat an individual might be more loss averse, resulting in an increased framing effect (Porcelli and Delgado, 2009). At the same time in the context of an uncertain future, an individual might be biased toward immediate vs. future gains resulting in increased temporal discounting (Pulcu et al., 2014). This study explored these hypotheses with conventional significance testing as well as a Bayesian approach to enable a more nuanced comparison of different behavioral models.

## Materials and Methods

#### Sample and Screening

Participants were recruited from the UCL Institute of Cognitive Neuroscience Subject Database, of which *N* = 83 (49 female: 34 male; mean age = 24, *SD* = 5) completed the framed gamble task and *N* = 36 (18 female: 18 male; mean age = 24, *SD* = 6) completed the temporal discounting task (*N* = 35 completed both). All participants completed a prior phone screen in which they reported no personal history of/treatment for psychiatric or neurological disorders or drug use (from a detailed specific checklist of all disorders), along with no cardiovascular problems, pacemakers, or cochlear implants. The demographics represented the naturalistic sample of individuals who responded to our call for participants and who passed screening. All subjects provided written informed consent (UCL Research Ethics Committee Project ID Number: 1764/001). Both decision tasks were presented on a desktop computer using the Cogent toolbox for Matlab (Wellcome Trust Centre for Neuroimaging and Institute of Cognitive Neuroscience, UCL, London, UK). To incentivise performance, subjects were informed that additional compensation would be provided based upon task performance. Shocks were delivered to the non-dominant wrist using a DS7 stimulator (Digitimer Ltd, Welwyn Garden City, UK). Prior to testing, all subjects completed a shock work-up procedure in which shocks were titrated (over approximately 3–5 stimulations) to a level that was 'unpleasant but not painful' (Schmitz and Grillon, 2012).

## Anxiety Measures

At the end of each block, participants indicated how anxious they had felt during each of the threat and safe conditions on a scale from 1 ("not at all") to 10 ("very much so") as a subjective manipulation check. Participants also provided self-report measures of depression (Beck Depression Inventory: BDI; Beck and Steer, 1987) and trait anxiety (State Trait Anxiety Inventory: STAI; Spielberger et al., 1970) at the end of the session.

## Framed Gamble Task

This task was adapted from that used by De Martino et al. (2006). The task consisted of eight blocks (four safe, four threat), each comprising 14 trials. "YOU ARE NOW SAFE FROM SHOCK" or "YOU ARE NOW AT RISK OF SHOCK" was presented for 3 s at the beginning of each new block. Threat blocks had a red background, whilst safe blocks had a blue background. A single shock was delivered at a pseudorandom time in each threat block. Each trial began with a message "You receive £X", where X was a varying monetary amount. Participants then had 4 s to choose between a certain option, which would leave them with a guaranteed portion of the total £X, or an option to gamble, which could lead to either winning the entire amount or winning nothing. The participant did not discover the outcome of any gambles, but was instructed to consider which option they would choose to maximize wins and minimize losses. In gain frames the participant would have the 'sure' option to "Keep £Y," a certain portion of the total, whereas in loss frames participants were told they would "Lose £Z" (where Z + Y = X), implying that they would retain the rest of the total (i.e., £Y) – note that this represents precisely the same decision. Alternatively, participants could choose a gamble option, which was presented with a pie chart indicating the probabilities of keeping or losing the entire £X amount. In experimental trials, the expected values (sum of all possible outcomes weighted by their respective probabilities) of the gamble and sure options were matched (**Figure 1A**). Expected outcomes were 20, 40, 60, or 80% of the initial total £X, which was set as £25, £50, £75, or £100. Monetary parameters were counterbalanced across decision frames and between threat and safe blocks. 'Catch' trials were also included to verify that the participants were attending to and had understood the task. These trials were designed such that the expected outcome of one option was much larger than that of the other option, so that participants should always choose the option of higher value.
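The matching of expected values in the experimental trials can be illustrated with a short sketch. It assumes that, because the gamble is all-or-nothing, the probability of keeping the full £X equals the expected-outcome fraction (20, 40, 60, or 80%); the function name `sure_and_gamble` is hypothetical and is not taken from the original task code.

```python
from fractions import Fraction

def sure_and_gamble(total, keep_fraction):
    """Return (keep, lose, gamble_ev) for one experimental trial.

    Assumption: the gamble pays the whole total with probability
    keep_fraction and nothing otherwise, so that its expected value
    matches the sure option.
    """
    keep = total * keep_fraction   # gain frame: "Keep £Y"
    lose = total - keep            # loss frame: "Lose £Z", with Z + Y = X
    gamble_ev = keep_fraction * total + (1 - keep_fraction) * 0
    return keep, lose, gamble_ev

# Totals and expected-outcome fractions as listed in the task description.
totals = [25, 50, 75, 100]
fractions = [Fraction(20, 100), Fraction(40, 100),
             Fraction(60, 100), Fraction(80, 100)]

for total in totals:
    for f in fractions:
        keep, lose, ev = sure_and_gamble(total, f)
        # The sure option and the gamble are financially identical,
        # and the two frames describe the same retained amount.
        assert keep == ev
        assert keep + lose == total
```

Exact fractions are used rather than floats so that the equality checks are not affected by rounding.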

Each block consisted of 10 standard trials (five in each frame) and four catch trials (one in each combination of frame and preferred option) in a random order, with the certain and gamble options randomized to the left or right of the screen. Choices were indicated by pressing the left or right arrow key. The chosen option was highlighted by a star for 1 s. Analysis was conducted on the proportion of trials on which participants chose to gamble, and reaction times to choose each of the gamble and the certain options. Before the main task, six practice trials (one of each type of catch trial, plus a standard trial in each of the gains and losses frames) were completed without a time limit or threat of shock to ensure participants understood the task. Task duration was approximately 15 min.

### Temporal Discounting Task

On each trial subjects were presented with a (self-paced) choice between an immediate reward (e.g., "£5.00 now") and a delayed reward (e.g., "£10.00 in 25 years"). The value of the delayed reward was fixed whilst the immediate reward was adjusted based upon previous choices until an indifference point was reached. Adjustment was based on Yi et al.'s (2010) abbreviation of Johnson and Bickel's (2002) algorithm. Indifference points were recorded for three delayed values (£10, £100, and £1000), three delays (1 day, 1 year, and 25 years) and for both gain and loss frames ("Which would you prefer to receive?" or "Which would you prefer to lose?"), yielding a total of 18 indifference points (**Figure 1B**). The task consisted of six blocks (three safe, three threat), during each of which six (randomly selected without replacement) indifference points were reached. As the algorithm adjusted the choices presented based upon previous responses, the number of trials per block varied (range ∼60–80). The screen displayed "YOU ARE NOW SAFE FROM SHOCK" with a blue background at the start of safe blocks or "YOU ARE NOW AT RISK OF SHOCK" with a red background at the start of threat blocks. Shocks (*N* = 4) were presented in a pseudo-random order during threat blocks. Task duration was approximately 15 min.
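The structure of this design, and the general idea of adjusting an immediate offer until indifference is reached, can be sketched as follows. The staircase below is a generic bisection procedure for the gain frame and is explicitly not the Johnson and Bickel (2002) algorithm or its abbreviation by Yi et al. (2010); the names `titrate` and `prefers_immediate` and the number of steps are assumptions made purely for illustration.

```python
import itertools

# Design cells as described in the text: 3 values x 3 delays x 2 frames.
VALUES = [10, 100, 1000]
DELAYS = ["1 day", "1 year", "25 years"]
FRAMES = ["gain", "loss"]
conditions = list(itertools.product(VALUES, DELAYS, FRAMES))
assert len(conditions) == 18

def titrate(delayed_value, prefers_immediate, n_steps=6):
    """Generic bisection staircase for a gain-frame indifference point.

    prefers_immediate(offer) is a hypothetical callback returning True
    when the immediate offer is chosen over the fixed delayed reward.
    """
    low, high = 0.0, float(delayed_value)
    for _ in range(n_steps):
        offer = (low + high) / 2
        if prefers_immediate(offer):
            high = offer   # offer was accepted: indifference lies at or below it
        else:
            low = offer    # offer was rejected: indifference lies above it
    return (low + high) / 2

# Simulated chooser (hypothetical) who values "£10 later" at £4 now.
estimate = titrate(10, lambda offer: offer > 4.0)
print(round(estimate, 3))
```

Each step halves the interval containing the indifference point, so six steps locate it to within a small fraction of the delayed value; the real algorithm's step rule and stopping criterion may differ.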

## Analysis

Conventional frequentist significance tests were run in SPSS version 22 (IBM Corp, Armonk, NY) whilst Bayesian analyses were run in JASP, employing the default prior (Rouder et al., 2012; Love et al., 2015; Morey and Rouder, 2015). Frequentist and Bayesian repeated-measures analysis of variance (ANOVA) models were constructed in exactly the same manner for all analyses (see below), with frequentist ANOVAs used to generate *F*-statistics, *p*-values and effect sizes for interactions of interest, and Bayesian ANOVAs used to generate log Bayes factors (logBF10)<sup>1</sup> for models of interest relative to a null model (main effect of subject).

In our Bayesian analyses, the 'winning' model was defined as the model with the highest BF10 relative to the null, and the relative predictive success of one model over another was computed by dividing the BF10 for one model by the other. Any ratio greater than one (equivalently, any difference in logBF10 greater than zero) indicates a model *better* than the comparison. Semantic labels were assigned to the magnitude of these comparisons to aid interpretation, ranging from anecdotal (1–3), to substantial (3–10), to strong (10–30) to decisive (>100; Jeffreys, 1998). Where reported for interactions, the Bayes factors represent a model including the interaction plus the main effect of each component of the interaction.

<sup>1</sup>Note that the 'BF10' nomenclature in JASP refers to the Bayes factor for H1 vs. H0 (model relative to null); as distinct from 'BF01,' which is the Bayes factor for H0 vs. H1 (null relative to model). It does *not* refer to log10. Where we refer to 'logBF10,' we report the natural log of the BF10 values. This log is not required; it is simply used to make the frequently very large numbers more interpretable.
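Because the reported values are natural logs, dividing one BF10 by another is the same as exponentiating the difference between the two logBF10 values. The sketch below shows that conversion together with the semantic labels; the function names are hypothetical, and the 'very strong' band (30–100) is not listed in the text above and is filled in from the commonly cited version of the Jeffreys scale.

```python
import math

def relative_bf(log_bf10_a, log_bf10_b):
    """Bayes factor of model A over model B from their natural-log BF10s."""
    return math.exp(log_bf10_a - log_bf10_b)

def jeffreys_label(bf):
    """Semantic label for a Bayes factor in favor of the first model."""
    if bf < 1:
        return "favors comparison model"
    if bf < 3:
        return "anecdotal"
    if bf < 10:
        return "substantial"
    if bf < 30:
        return "strong"
    if bf < 100:
        return "very strong"   # assumption: band not given in the text
    return "decisive"

# Rounded logBF10 values for a frame-only model (46) and a model that
# also includes the stress x frame interaction (44).
bf = relative_bf(46, 44)
print(round(bf, 2), jeffreys_label(bf))
```

With rounded inputs the ratio comes out near 7.4 rather than the 7.6 reported from unrounded values, which is why ratios computed from published logBF10 figures should be treated as approximate.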

#### Statistical Models

*Manipulation efficacy* was assessed using a paired *t*-test (and Bayesian equivalent) to compare retrospective ratings across stress conditions.
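For readers less familiar with the test, a paired *t*-statistic is simply the mean of the within-subject differences divided by its standard error. The following is a minimal sketch using only the Python standard library; the ratings are hypothetical values invented for illustration and are not data from this study.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t-statistic and degrees of freedom for x minus y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample SD (n - 1 denominator)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical 1-10 anxiety ratings for six imaginary participants.
safe_ratings = [2, 1, 3, 2, 2, 1]
threat_ratings = [6, 5, 8, 7, 4, 6]

t, df = paired_t(safe_ratings, threat_ratings)
print(f"t({df}) = {t:.2f}")
```

Subtracting threat from safe yields a negative statistic when threat ratings are higher, which is the sign convention that appears to be used in the Results.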

For the *framed gamble task*, the proportion of trials on which participants chose the gamble option was assessed using a repeated-measures ANOVA with stress condition (threat/safe) and decision frame (gains/losses domains) as within-subject factors. Reaction time was analyzed in a similar model, with the addition of choice (gamble/sure) as an additional within-subjects factor.

For the *temporal discounting task*, we analyzed normalized indifference points (indifference point/fixed delayed value) in a repeated-measures ANOVA with stress condition (threat/safe), decision valence (gain/lose), delayed value (£10/£100/£1000) and time (1 day, 1 year, 25 years) as within-subjects factors. Reaction times for choices across each indifference point (mean reaction time of all responses required to reach indifference) were analyzed in a separate model with the same factors.
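The normalization step puts indifference points for £10, £100, and £1000 on a common 0–1 scale, and the reaction time measure is a mean over all choices made while titrating to a given indifference point. A minimal sketch, with a hypothetical function name and made-up example values:

```python
from statistics import mean

def summarise_titration(indifference_point, delayed_value, reaction_times):
    """Summarise one titration: normalised indifference point and mean RT."""
    return {
        # Indifference point as a proportion of the fixed delayed value.
        "normalised_ip": indifference_point / delayed_value,
        # Mean RT over every response needed to reach indifference.
        "mean_rt": mean(reaction_times),
    }

# Hypothetical example: indifference at £400 for a delayed £1000,
# reached over three choices with the listed reaction times (seconds).
summary = summarise_titration(400, 1000, [1.0, 2.0, 3.0])
print(summary)
```

After normalization, an indifference point of £4 for a delayed £10 and one of £400 for a delayed £1000 are directly comparable, which is what allows delayed value to enter the ANOVA as a within-subjects factor.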

Finally, for both tasks, additional exploratory between-subjects analyses were also run including measures of mood symptoms and order of threat-safe counterbalancing as covariates. These were separate models, run after the *a priori* within-subject models. BDI symptom data were not normally distributed, and were square-root transformed prior to analysis. Further exploratory analyses suggested during peer review were also run: gender, age, and threat potentiated (threat minus safe) subjective ratings were included as additional between-subject factors.

## Results

Data for these tasks are freely available for download<sup>2</sup>.

## Manipulation Check

Subjects reported feeling significantly more anxious in the stress relative to the safe condition in both the framed gamble [a mean rating (±*SD*) of 6 ± 2/10 relative to 2 ± 1/10; *t*(82) = −16, *p* < 0.001] and temporal discounting tasks [6 ± 2/10 relative to 2 ± 1/10; *t*(34) = −10, *p* < 0.001]. Bayes factors indicated that models including stress conditions were decisively better than the null model for both the framed gamble (logBF10 = 57) and temporal discounting tasks (logBF10 = 21).

## Framed Gamble Task

### (A) Within-Subjects Effects

## *Choice behavior*

A significant framing effect was demonstrated. Specifically, participants gambled more in the losses frame (probability of gambling = 0.54 ± 0.2) than in the gains frame [probability of gambling = 0.37 ± 0.2; main effect of frame: *F*(1,82) = 83, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.5]. However, this did not interact with threat of shock [stress × frame interaction: *F*(1,82) = 0.13, *p* = 0.72, η<sub>p</sub><sup>2</sup> = 0.002; **Figure 2A**] and there was no main effect of stress [*F*(1,82) = 3.7, *p* = 0.06, η<sub>p</sub><sup>2</sup> = 0.043]. Bayes factor analysis revealed the winning model to be one including only a main effect of frame (logBF10 = 46), which was substantially (7.6 times) better than a model additionally including the stress × frame interaction (logBF10 = 44).

<sup>2</sup>http://dx.doi.org/10.6084/m9.figshare.1423293

## *Reaction times*

Subjects (only *N* = 74 had RTs in all cells, since some subjects never chose at least one of the options) were significantly faster to choose in the loss than gain frame [main effect of frame: *F*(1,74) = 7.5, *p* = 0.008, η<sub>p</sub><sup>2</sup> = 0.091] and under threat of shock [main effect of stress: *F*(1,74) = 5.0, *p* = 0.029, η<sub>p</sub><sup>2</sup> = 0.063]. These effects were qualified by a significant stress × frame interaction [*F*(1,74) = 7.0, *p* = 0.01, η<sub>p</sub><sup>2</sup> = 0.086]. Analyses of the simple main effects revealed that stress induced quicker responses in the gains domain [*F*(1,74) = 12, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.14] but not in the losses domain [*F*(1,74) = 0.11, *p* = 0.74, η<sub>p</sub><sup>2</sup> < 0.002; **Figure 2B**]. Bayes factor analysis revealed that the winning model comprised individual main effects of stress and frame (logBF10 = 2.1), without the interaction; but this was only anecdotally (1.1 times; Jeffreys, 1998) better than the model also including a stress × frame interaction (logBF10 = 2.0).

### (B) Between-Subjects Effects

## *Choice behavior*

Neither safe–threat order [frame × order interaction: *F*(1,81) = 0.20, *p* = 0.66, η<sub>p</sub><sup>2</sup> = 0.002] nor baseline symptoms [frame × STAI interaction: *F*(1,81) = 1.4, *p* = 0.24, η<sub>p</sub><sup>2</sup> = 0.017; frame × BDI interaction: *F*(1,80) = 0.28, *p* = 0.60, η<sub>p</sub><sup>2</sup> = 0.003] interacted with any of the main effects of interest. There was no exploratory stress × frame × gender interaction [*F*(1,81) = 0.07, *p* = 0.8, η<sub>p</sub><sup>2</sup> = 0.001], stress × frame × age interaction [*F*(1,81) = 0.62, *p* = 0.53, η<sub>p</sub><sup>2</sup> = 0.008] or stress × frame × threat potentiated (threat minus safe) anxiety rating interaction [*F*(1,80) = 0.40, *p* = 0.53, η<sub>p</sub><sup>2</sup> = 0.005].

## *Reaction times*

Neither safe–threat order [stress × frame × order interaction: *F*(1,73) = 0.20, *p* = 0.65, η<sub>p</sub><sup>2</sup> = 0.003] nor baseline symptoms [stress × frame × STAI interaction: *F*(1,73) = 0.11, *p* = 0.74, η<sub>p</sub><sup>2</sup> = 0.001; stress × frame × BDI interaction: *F*(1,72) = 1.3, *p* = 0.26, η<sub>p</sub><sup>2</sup> = 0.018] interacted with any of the interaction effects of interest.

## Temporal Discounting Task

### (A) Within-Subjects Effects

## *Choice behavior*

Temporal discounting was demonstrated by a significant main effect of delay on indifference points [*F*(2,70) = 79, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.7]. This varied depending upon whether subjects were asked about wins or losses [time × valence interaction: *F*(2,70) = 8, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.2] but did not differ across the different values [time × value interaction: *F*(4,140) = 1.3, *p* = 0.26, η<sub>p</sub><sup>2</sup> = 0.04]. Critically, this also did not differ under stress [time × stress interaction: *F*(2,70) = 0.24, *p* = 0.79, η<sub>p</sub><sup>2</sup> = 0.007, **Figure 3A**; main effect of stress: *F*(1,35) = 0.8, *p* = 0.37, η<sub>p</sub><sup>2</sup> = 0.02; time × valence × stress: *F*(2,70) = 0.16, *p* = 0.86, η<sub>p</sub><sup>2</sup> = 0.004; time × valence × value × stress: *F*(4,140) = 0.73, *p* = 0.58, η<sub>p</sub><sup>2</sup> = 0.02]. Bayes factor analysis revealed a winning indifference point model comprising a time by valence interaction (logBF10 = 294) that was decisively (>150 times) better than a model additionally including a stress by time interaction (logBF10 = 264), a stress by valence by time model (logBF10 = 271) or a time alone model (logBF10 = 271).

## *Reaction times*

There was a main effect of time on RTs [*F*(2,70) = 9.9, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.22], a main effect of valence [*F*(1,35) = 9.6, *p* = 0.004, η<sub>p</sub><sup>2</sup> = 0.22] and a significant valence × time interaction [*F*(2,70) = 7.9, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.18]. Simple main effects analyses revealed that this interaction was driven by a main effect of time in the gain domain [subjects became progressively slower with increasing time: *F*(2,34) = 15, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.47] but not the loss domain [subjects were always as slow as the slowest (i.e., 25 years) time point in the gains domain: *F*(2,34) = 1.8, *p* = 0.19, η<sub>p</sub><sup>2</sup> = 0.09]. There was no main effect of stress [*F*(1,35) = 2.6, *p* = 0.12, η<sub>p</sub><sup>2</sup> = 0.07] or stress by time interaction [*F*(2,70) = 0.8, *p* = 0.4, η<sub>p</sub><sup>2</sup> = 0.02]. There was a trend toward a stress × valence interaction [*F*(1,35) = 4.1, *p* = 0.050, η<sub>p</sub><sup>2</sup> = 0.11] but Bayes factor analysis revealed this model (logBF10 = 9) to be decisively (>150 times) worse than the winning valence × time model (logBF10 = 23).

### (B) Between-Subject Effects

## *Choice behavior*

No effects of interest interacted with task order [for indifference points, time × order interaction: *F*(2,68) = 1.1, *p* = 0.34, η<sub>p</sub><sup>2</sup> = 0.03] or baseline symptoms [for indifference points, time × STAI interaction: *F*(2,68) = 0.51, *p* = 0.6, η<sub>p</sub><sup>2</sup> = 0.02; time × BDI interaction: *F*(2,68) = 0.27, *p* = 0.8, η<sub>p</sub><sup>2</sup> = 0.008]. There was no exploratory time × stress × gender interaction [*F*(2,68) = 0.26, *p* = 0.8, η<sub>p</sub><sup>2</sup> = 0.007], time × stress × age interaction [*F*(2,68) = 0.13, *p* = 0.88, η<sub>p</sub><sup>2</sup> = 0.004] or time × stress × threat potentiated (threat minus safe) anxiety rating interaction [*F*(2,66) = 1.4, *p* = 0.30, η<sub>p</sub><sup>2</sup> = 0.039].

## *Reaction times*

There was no time × order interaction [*F*(2,68) = 0.94, *p* = 0.4, η<sub>p</sub><sup>2</sup> = 0.03] or time × valence × order interaction [*F*(2,68) = 0.043, *p* = 0.96, η<sub>p</sub><sup>2</sup> = 0.001], but there was a significant stress × order interaction [*F*(1,34) = 16, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.32] driven by those who experienced the safe condition first responding significantly faster under threat [*F*(1,34) = 17, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.33; no difference between conditions in those who received threat first: *F*(1,34) = 2.5, *p* = 0.12, η<sub>p</sub><sup>2</sup> = 0.07]. There was no interaction between trait anxiety and any of the effects of interest (all *p* > 0.1), but there was a significant interaction between stress and BDI scores [*F*(1,34) = 9.1, *p* = 0.005, η<sub>p</sub><sup>2</sup> = 0.21] driven by a negative correlation [*r*(35) = −0.5] between BDI scores and the difference between threat and safe RTs (a model including the correlation was substantially better than the null: logBF10 = 2.3; **Figure 3B**). In other words, the more depression symptoms an individual reported, the faster they responded under threat relative to safe conditions.

## Discussion

In this study, we were able to replicate two well-established biases in decision-making: the framing effect and temporal discounting. Moreover, we demonstrated a clear impact of threat of shock on subjective mood and choice reaction times. However, contrary to our predictions, stress did *not* alter the observed decision-making biases, perhaps because these executive decision-making biases are traits that are impervious to, or are able to override, the lower-level state affective biases induced by stress (Robinson et al., 2013c).

We first replicated the well-established framing effect, in which there is a bias toward risky behavior in the losses domain, and toward risk-aversion in the gains domain (Kahneman and Tversky, 1979). However, this bias was not altered by threat of shock in this study. To the best of our knowledge there is no previous literature exploring the impact of threat of shock on this effect, but there are a number of studies exploring the impact of different manipulations on framing. Using the cold pressor task, Porcelli and Delgado (2009) found that the framing effect was enhanced by stress relative to a non-stress control condition. Given our sample size (*N* = 81), we had 99.1% statistical power (with alpha = 0.05; two-tailed) to replicate this interaction (with effect size |*d*| = 0.487; Porcelli and Delgado, 2009). One possible explanation for the discrepancy is simply that they used a different manipulation; very little, if any, work has directly compared threat of shock and cold pressor on stress responses (Robinson et al., 2013c). One key difference between paradigms, however, is that the cold pressor is generally completed *prior to* the task, since it requires the individual to submerge their hand in cold water. As such, it is plausible that it explores the impact of *recovery from* stress (Robinson et al., 2013c) rather than a current stressful context (which is a key advantage of the threat of shock technique used here). Another possibility highlighted by the authors is that their effect was a learning effect since the stress block always followed the no-stress block (Porcelli and Delgado, 2009).
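The order of magnitude of this power figure can be checked with a simple normal approximation. The sketch below assumes a within-subject (one-sample on differences) test, which the text does not state explicitly, and the function name `approx_power` is hypothetical; an exact calculation would use the noncentral *t* distribution and can differ slightly.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n, alpha=0.05):
    """Normal-approximation power for a two-tailed within-subject test.

    Assumption: the test statistic is roughly normal with mean
    |d| * sqrt(n) under the alternative hypothesis.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-tailed critical value
    ncp = abs(d) * sqrt(n)                 # approximate noncentrality
    # Probability of exceeding the critical value in either tail.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

power = approx_power(d=0.487, n=81)
print(f"approximate power = {power:.3f}")
```

Under these assumptions the approximation gives a value of roughly 0.99, in line with the figure quoted above.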

Two further studies have explored the effects of another stress manipulation – the Trier social stressor test – on framing. Pabst et al. (2013) found the *opposite* pattern to Porcelli and Delgado (2009): a *reduced* framing effect under stress, albeit only in a subsample of their participants. Buckert et al. (2014) also showed *reduced* framing under social stress on a game of dice task. This discrepancy across manipulations could be attributed to the specific type of stress; whilst the cold pressor is physically painful, the Trier task asks subjects to ready themselves for unprepared public speaking, which is a social form of stress. It is possible that the specific domain of anxiety influences the outcome; e.g., social rewards might be particularly influenced by a social anxiety induction. Nevertheless, adding our null effect with a further stressor to these inconsistent effects of stress leaves the role of stress on framing unclear (**Table 1**). This lack of clarity also highlights the need for work directly comparing different stress manipulations across the same tasks.

In our second experiment we replicated the temporal discounting effect. Specifically, participants assigned less utility to outcomes in the future. Again, however, we failed to detect an impact of stress on this bias. Temporal discounting has not been assessed under threat of shock to our knowledge, but this null finding is consistent with at least three prior studies (Lempert et al., 2012; Haushofer et al., 2013; Jenks and Lawyer, 2015) utilizing the Trier social stressor test. In all three studies stress increased subjective and/or hormonal indicators of stress, but had no effect on discounting (**Table 1**). Where anxiety *has* been associated with temporal discounting it is in studies exploring *between-subject* individual differences in social anxiety (Rounds et al., 2007) or clinically defined depression (Pulcu et al., 2014). In both cases this is consistent with temporal discounting being a stable, *trait* measure (Odum, 2011). Temporary or acute mood fluctuations such as stress or anxiety may therefore be unable to influence such traits. That said, it should be noted that we fail to see an interaction with trait depression or anxiety symptoms in our sample and the Rounds effect cited above recently failed to replicate (Jenks and Lawyer, 2015), so firm conclusions are perhaps unwarranted.

We do, however, see some evidence of stress impacting reaction time across both tasks. This is critical because it suggests that the threat of shock manipulation was effective at instantiating behavioral change. In other words, in addition to the subjective reports of anxiety, subjects' responses *were* impacted by threat, even though these did not carry over into their decisions. Reaction time effects in the absence of decision effects are consistent with prior work (Murphy, 1959; Keinan, 1987; Engelmann et al., 2015) and perhaps reflect a bias toward making some decisions faster under conditions of stress. From an evolutionary perspective, such a mechanism could be adaptive: a faster decision about which direction to go when running away from a predator, for instance, may improve survival chances. It should be noted, moreover, that the correlation with depression symptoms that we see in the temporal discounting reaction times somewhat mimics our previously reported effect in the Iowa Gambling Task (Robinson et al., 2015). In that study, we saw increased selection of disadvantageous decks under threat relative to safe in individuals with high anxiety or depression symptomatology (Robinson et al., 2015). Here we observed (in a partially overlapping sample) quicker responses under threat relative to safe conditions in those who reported greater depressive symptoms. It is plausible that these effects may represent some form of underlying vulnerability in individuals with subclinical depressive symptoms. Having said that, the effect in the present task was not also observed in a relationship with trait anxiety scores (unlike our prior report), which is surprising because trait anxiety and BDI scores are highly correlated in most samples (including this one: *R* = 0.8, *p* < 0.001). Whilst it is possible that the present effect is specific to depression vs. anxiety symptoms, we feel that this conclusion would be unwarranted based on the current data, and it requires replication in a larger sample. Reaction time effects can, however, have multiple underlying causes; the effects could be driven by altered decision-making processes, but could also be driven by the time it takes to encode or instantiate a reaction toward a stimulus (Ratcliff and McKoon, 2008) and it is not possible to fully distinguish between these possibilities. Recent work has in fact highlighted the need for researchers to be extremely cautious when using reverse-inference to infer cognitive states from reaction time (Krajbich et al., 2015). Indeed, in general, the exact nature of the reaction time effects seen here was not predicted *a priori*, and in one instance the Bayesian and frequentist tests are partially discrepant (within-subject temporal discounting *p* = 0.050 vs. >150 times worse) and as such we do not wish to draw firm conclusions beyond observing that the effects indicate that the manipulation was having some effect during the tasks.

TABLE 1 | Review of findings cited in this paper (**=**, null effect; N/A, not cited; **↑**, increased effect; **↓**, reduced effect).

This raises the question as to why there are clear affective biases under threat of shock (Robinson et al., 2013c), yet these biases do not impact decisions. One speculation is that this is because such 'bottom–up' biases do not influence some higher level, executive processes. Or, if the biases *are* processed later in the hierarchy, it may be that the executive *overrides* lower level biases. Evolutionarily, an ability of executive function to ignore or override lower level fear and stress responses might be adaptive in certain circumstances. Alternatively, these biases might reflect the use of highly efficient heuristics/rules of thumb which are robust to the effects of stress on affective processing. Lower level affective biases may therefore constitute *state* effects of mood disorders that change with symptoms, whilst the executive decision-making biases constitute stable *traits*. Such traits may contribute to stress-related disorder susceptibility (Pulcu et al., 2014), but not change with mood symptoms. Understanding the distinction between different levels of cognitive function that are impacted by stress might plausibly inform our treatments for stress-related disorders. Specifically, the focus might be on shifting lower-level *state* affective biases rather than *trait* executive biases. Either way, in contrast to our hypotheses the present study provides evidence for the proposition that certain higher order executive decision-making functions are impervious to stress induced by threat of shock.

#### Limitations

It should be noted that one explanation for our lack of effect is that our stress manipulation was not strong enough to elicit change. Perhaps under conditions of extreme threat (e.g., a warzone or high-stakes work environment) decision-making of this type can be shifted by threat. Alternatively, given higher time pressure, higher financial gambles, or explicit feedback about the outcomes of gambles, individuals' decisions might have been shifted by threat of shock. Moreover, as discussed above, there are many different 'stress' manipulations across social, pain, and other domains and it is unclear exactly how these overlap. The extent to which this non-significant effect of stress generalizes across stress manipulations is unclear. In addition, these findings do not rule out the impact of threat of shock on other types of decision-making such as those that involve working memory or inhibitory control [both of which have in fact been shown to be influenced by threat (Robinson et al., 2013a,c)]. Overall, a non-significant effect of this nature is difficult to prove as it may simply be that we have failed to find the correct context in which stress impairs the executive functions explored here. A further limitation is that these individuals were not screened using a diagnostic interview. As such, some individuals may have previous diagnoses which they had forgotten, or which may have been missed by using a checklist screening instrument. Finally, these findings do not of course rule out the possibility that positive mood might have effects on these sorts of decision-making tasks. Indeed there is preliminary (albeit complex) data suggesting that positive mood can influence temporal discounting (Hirsh et al., 2010) and Iowa Gambling Task performance (De Vries et al., 2008).

## Acknowledgments

A Medical Research Council Career Development Award to OR (MR/K024280/1) funded this research. We thank Benedetto De Martino for the original framing task code.

## References

Jeffreys, H. (1998). *The Theory of Probability*. Oxford: Oxford University Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Robinson, Bond and Roiser. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Assessment of Tobacco-Related Approach and Attentional Biases in Smokers, Cravers, Ex-Smokers, and Non-Smokers

Marcella L. Woud<sup>1</sup>\*, Joyce Maas<sup>2</sup>, Reinout W. Wiers<sup>3</sup>, Eni S. Becker<sup>2</sup> and Mike Rinck<sup>1,2</sup>

<sup>1</sup> Department of Psychology, Mental Health Research and Treatment Center, Ruhr-Universität Bochum, Bochum, Germany, <sup>2</sup> Behavioural Science Institute, Radboud University Nijmegen, Nijmegen, Netherlands, <sup>3</sup> Addiction Development and Psychopathology Lab, University of Amsterdam, Amsterdam, Netherlands

#### Edited by:

Frank Ryan, Imperial College, UK

#### Reviewed by:

Sally Adams, University of Bath, UK

Charlotte Elisabeth Wittekind, University Medical Center Hamburg-Eppendorf, Germany

#### \*Correspondence:

Marcella L. Woud, marcella.woud@rub.de

This study was conducted while Marcella L. Woud held a position at Behavioural Science Institute, Radboud University Nijmegen, Montessorilaan 3, 6525 HR Nijmegen, Netherlands.

#### Specialty section:

This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology

Received: 07 May 2015; Accepted: 29 January 2016; Published: 26 February 2016

#### Citation:

Woud ML, Maas J, Wiers RW, Becker ES and Rinck M (2016) Assessment of Tobacco-Related Approach and Attentional Biases in Smokers, Cravers, Ex-Smokers, and Non-Smokers. Front. Psychol. 7:172. doi: 10.3389/fpsyg.2016.00172

According to theories of addictive behaviors, approach and attentional biases toward smoking-related cues play a crucial role in tobacco dependence. Several studies have investigated these biases by using various paradigms in different sample types. However, this heterogeneity makes it difficult to compare and evaluate the results. The present study aimed to address this problem, via (i) a structural comparison of different measures of approach-avoidance and a measure of smoking-related attentional biases, and (ii) using within one study different representative samples in the context of tobacco dependence. Three measures of approach-avoidance were employed: an Approach Avoidance Task (AAT), a Stimulus Response Compatibility Task (SRC), and a Single Target Implicit Association Test (ST-IAT). To assess attentional biases, a modified Stroop task including smoking-related words was administered. The study included four groups: n = 58 smokers, n = 57 non-smokers, n = 52 cravers, and n = 54 ex-smokers. We expected to find strong tobacco-related approach biases and attentional biases in smokers and cravers. However, the general pattern of results did not confirm these expectations. Approach responses assessed during the AAT and SRC did not differ between groups. Moreover, the Stroop did not show the expected interference effect. For the ST-IAT, cravers had stronger approach associations toward smoking-related cues, whereas non-smokers showed stronger avoidance associations. However, no such differences in approach-avoidance associations were found in smokers and ex-smokers. To conclude, these data do not provide evidence for a strong role of implicit approach and attentional biases toward smoking-related cues in tobacco dependency.

Keywords: tobacco dependence, approach-avoidance, attention, AAT, SRC, STIAT, Stroop

## INTRODUCTION

> *I need more cigarettes / Give me more cigarettes, I need / Gotta get more cigarettes / I want more cigarettes / Cigarettes / Cigarettes*
>
> (More Cigarettes–Replacements–2012)

The inability to control drug use is a hallmark symptom of a drug addiction (Diagnostic and Statistical Manual of Mental Disorders; DSM-5; 2013). Smoking, for example, represents such an addictive behavior, and it is considered one of the most difficult addictions to break. According to the World Health Organization (WHO) report in 2008, tobacco smoking causes 5.4 million deaths per year, and it remains the leading preventable cause of death worldwide (World Health Organization, 2008). Furthermore, research has also shown that smoking increases the risk of engaging in other addictive behaviors (Merrill et al., 1999; Creemers et al., 2009). Hence, it is not surprising that there is growing interest in elucidating the motivational and reward mechanisms underlying this destructive behavior.

According to dual process models of addiction (e.g., Deutsch and Strack, 2006; Wiers et al., 2007; Gladwin et al., 2011; for critical discussion see Gladwin and Figner, 2014), addictive behaviors can be understood best as the output of two distinct types of processes. On the one hand, reflective processes with limited (cognitive) capacity involve processes that are slower, more deliberate and explicit. On the other hand, impulsive processes do not require limited (cognitive) capacity and involve processes that are fast and automatic. It has been suggested that the latter processes are particularly involved in emotional and motivational aspects of behavior. Such dual process models of addiction posit that addictive behaviors are the result of an imbalance between these two processes, i.e., there is no cooperative interplay: There are easily activated, drug-oriented impulsive processes, in combination with relatively slow reflective processes that are not strong enough to control or regulate the impulsive process. Furthermore, and in line with the incentive-sensitization theory of Robinson and Berridge (1993, 2003, 2008), dual process models of addiction hypothesize that the impulsive processes become sensitized with repeated drug use. Drug-related cues acquire incentive salience, which results in an activation of the mesolimbic dopamine system and an increase in dopamine levels. As a consequence of this neurological chain, the brain "interprets" drug-related cues as rewarding cues, and therefore prepares the corresponding motivational state, i.e., an approach action tendency, aimed at consuming the drug of interest. From an information processing perspective, this explains behavioral phenomena such as attentional and approach biases for drug-related cues: Due to the incentive salience of these cues, they automatically capture an individual's attention and activate approach-related behaviors.

Over the last decades, there has been a surge of interest in tobacco-related information processing biases (for an overview and meta-analysis, see e.g., Waters and Sayette, 2006; Field and Cox, 2008; Rooke et al., 2008). Such investigations are important from a theoretical but also from a clinical perspective: On the one hand, such studies can test specific hypotheses derived from models of addiction, and on the other hand, these studies can advance our understanding of factors related to the high relapse rates in tobacco dependence. For example, Waters et al. (2003) found that smokers who showed a greater attentional bias for smoking-related words were more likely to lapse in the short-term.

Before summarizing studies investigating tobacco-related approach biases, an important distinction has to be made. This distinction concerns the operationalization of approach biases, namely whether they are operationalized as symbolic or actual motor responses. Regarding the assessment of symbolic motor responses, the Stimulus-Response Compatibility (SRC, Mogg et al., 2003) task has been used to assess symbolic tobacco-related approach biases. During the SRC, participants are instructed to move a manikin figure toward (approach) or away (avoidance) from, for example, smoking-related or neutral pictures. The time needed to initiate the manikin's approach and avoidance movements serves as the dependent variable. Studies employing the SRC showed that smokers are faster to approach than to avoid smoking-related cues (e.g., Mogg et al., 2003, 2005; Bradley et al., 2004, 2008; Thewissen et al., 2007). Regarding the assessment of actual motor responses, the Approach-Avoidance Task (AAT; Rinck and Becker, 2007) is a suitable paradigm. Indeed, it also has been used to assess tobacco-related approach biases. During the AAT, participants are instructed to pull (approach) and to push (avoidance) a joystick in response to, for example, smoking-related or neutral pictures that appear on the computer screen. Here, the time needed to execute the push and pull movements serves as the dependent variable. Most AATs apply an indirect task version. That is, the instructions do not ask participants to respond to the pictures' content. Instead, participants are required to respond to an unrelated feature such as the pictures' orientation or format. The advantage of such an indirect task version is that participants respond to a stimulus feature that is independent of the stimulus dimension that the task aims to assess, which disguises the research question and makes the use of response strategies less likely (Rinck and Becker, 2007).
In the context of tobacco dependence, the AAT is a rather novel paradigm, and to the best of our knowledge, only three studies have employed the AAT so far (but for more AAT studies in the context of alcohol dependency, see e.g., Palfai and Ostafin, 2003; Wiers et al., 2010, 2011; Eberl et al., 2013; Kersbergen et al., 2015). The study by Wiers C. E. et al. (2013) examined tobacco-related approach biases in heavy smokers, non-smokers, and ex-smokers. Results showed that heavy smokers were faster to approach smoking-related pictures compared to non-smokers and ex-smokers. Moreover, this approach bias was correlated with levels of craving. The study by Machulska et al. (2015) compared smokers to non-smokers, and found that smokers, unlike non-smokers, exhibited an approach bias toward smoking-related pictures compared to food-related control pictures (see also Larsen et al., 2014). Finally, according to results of Watson et al. (2013), tobacco-related approach biases can also be conditional. They tested a group of deprived cigarette smokers and found that the bias assessed at baseline was associated with participants' level of craving. After the baseline assessment, half of the participants were allowed to smoke a cigarette. These participants reported a reduction in craving but an increase in approach bias.

Beyond the studies examining actual and symbolic tobacco-related approach biases, there are also studies targeting tobacco-related approach associations. Word categorization tasks such as the Implicit Association Test (IAT; Greenwald et al., 1998) have been employed in this type of research. During the IAT, participants simultaneously categorize target stimuli (e.g., smoking-related vs. control stimuli) and attribute words (e.g., approach- or avoidance-related words) as fast as possible into the appropriate superordinate category. The difference in reaction times between the possible combinations (e.g., smoking-related stimuli and approach attributes share the same response key, and neutral stimuli and avoidance-related words share the same response key) is assumed to reflect whether smoking is associated more strongly with either attribute category, with relatively fast responses reflecting relatively strong associations. The study by De Houwer et al. (2006) examined such associations and found that smokers indeed had stronger approach- than avoidance-related tobacco associations. However, most of the IAT studies compared general positive vs. negative smoking-related associations, and here the evidence is less clear (e.g., Swanson et al., 2001; Sherman et al., 2003; Huijding et al., 2005).

Regarding tobacco-related attentional biases, several studies found that smokers are slower to respond to smoking-related pictures (visual probe task) and words (Stroop task), compared to neutral pictures or words (visual probe task: e.g., Mogg et al., 2003, 2005; Bradley et al., 2008; Stroop task: e.g., Munafò et al., 2003, 2005; Larsen et al., 2014; and for a meta-analysis, see Cox et al., 2006). There is also evidence which further specifies these findings. Results of Mogg and Bradley (2002) showed a positive correlation between smoking-related attentional biases and daily cigarette consumption. Moreover, Wertz and Sayette (2001) found a greater Stroop interference in participants who were told that they were allowed to smoke during the study, compared to those who were told they were not allowed to smoke. Finally, smoking-related attentional biases seem to be related to levels of self-reported craving (Zack et al., 2001; Mogg and Bradley, 2002), and increase after participants have been deprived of smoking (Cox et al., 2006). For example, using a visual probe task, Field et al. (2004) found that deprived smokers maintained their gaze toward smoking-related cues compared to neutral cues.

In summary, there is evidence showing that tobacco dependency is characterized by smoking-related approach and attentional biases. Despite the importance of these findings, however, there are two significant limitations: First, within previous studies, only a limited number of groups have been compared. Second, previous studies employed only a limited number of tasks. This heterogeneity makes it difficult to compare and evaluate these studies, particularly in relation to the underlying theory. The present study aimed to address this problem via (i) a structural comparison of different measures of approach-avoidance and the most commonly used paradigm to assess attentional biases (i.e., the Stroop), and (ii) using within one study different representative samples in the context of tobacco dependence. To assess smoking-related approach-avoidance biases, three different measures were used: the Approach Avoidance Task (AAT), the Stimulus Response Compatibility Task (SRC), and a Single Target Implicit Association Test (STIAT; Wigboldus et al., 2004). Given the fact that smoking does not have an inherently meaningful contrast category (such as, for example, alcohol vs. soft drinks), we chose to use a STIAT instead of an IAT. The AAT and SRC used pictorial stimuli (smoking-related and matched control pictures), whereas the STIAT used word stimuli (targets: smoking-related words, attributes: approach and avoidance words). A modified Stroop including smoking-related words was administered to assess tobacco-related attentional biases. Finally, we also assessed explicit attitudes toward smoking and levels of craving over the course of the study. The study included four groups: smokers, cravers, ex-smokers, and non-smokers. Following the predictions of theories of addictive behaviors and the existing empirical evidence in this context, our main hypothesis was that we would find strong tobacco-related approach and attentional biases in smokers and cravers, compared to ex-smokers and non-smokers.
Moreover, we expected that tobacco-related approach and attentional biases would be correlated positively across smokers, cravers, and ex-smokers.

## METHODS

## Participants

A total of 232 students from Radboud University (NL) were tested (M<sub>age</sub> = 22.36, SD = 3.2, 158 females). Within this group, there were n = 59 smokers, n = 59 non-smokers, n = 56 cravers, and n = 58 ex-smokers. The selection criteria were as follows: Smokers were included if they were smoking at least six cigarettes a day for at least 2 months. The same criteria applied for cravers. In order to avoid craving, smokers were instructed to smoke a cigarette prior to the study. However, cravers were instructed to not smoke for 6 h prior to the study. The group of non-smokers included individuals who had never smoked a cigarette or a joint. Ex-smokers were included if they had stopped smoking at least 6 months earlier and had smoked a minimum of six cigarettes a day while actively smoking. Prior to the analyses, 11 participants were excluded: Two non-smokers (one because of technical problems during testing and another who was actually smoking once in a while), four cravers (one did not smoke six or more cigarettes a day, one was tested too early and thus did not crave for 6 h, and two did not comply with the rule to not smoke for 6 h prior to the study), four ex-smokers and one smoker (technical problems during testing), leaving a total sample of N = 221 (n = 58 smokers, n = 57 non-smokers, n = 52 cravers, and n = 54 ex-smokers)<sup>1</sup>.
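The sample accounting above can be checked with a short tally. This is only an illustrative sketch: the dictionary names are hypothetical, and the numbers are copied from the paragraph above rather than taken from any data file.

```python
# Group sizes as tested (numbers taken from the Participants paragraph).
recruited = {"smokers": 59, "non-smokers": 59, "cravers": 56, "ex-smokers": 58}

# Exclusions per group, as listed in the same paragraph.
excluded = {"smokers": 1, "non-smokers": 2, "cravers": 4, "ex-smokers": 4}

# Analysed sample per group after exclusions.
final = {group: recruited[group] - excluded[group] for group in recruited}

print("tested:", sum(recruited.values()))
print("excluded:", sum(excluded.values()))
print("analysed:", final, "total:", sum(final.values()))
```

The tally reproduces the reported totals (232 tested, 11 excluded, 221 analysed) and the four final group sizes.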

## Materials

## Self-Report Measures

#### **Fagerström test for nicotine dependence (FTND)**

The FTND (Heatherton et al., 1991) is a self-report measure assessing the degree of nicotine dependence. It contains six items, e.g., "How many cigarettes per day do you smoke?" "How soon after waking do you smoke your first cigarette?" The higher the FTND sum score, the higher participants' level of dependence.

### **Explicit attitudes toward smoking**

To assess explicit attitudes toward smoking, participants were asked to evaluate eight adjective pairs (e.g., smoking is "good–bad," "sociable–unsociable," "sexy–unsexy") on a 7-point scale (see Huijding et al., 2005; Huijding and de Jong, 2006).

<sup>1</sup>Please note that there are additional missing data for the self-report measures and reaction time data due to technical problems during testing.

#### **Pictorial stimuli**

The pictorial stimuli included 20 smoking-related pictures and 20 matched control pictures (for examples, see Supplementary Material). These 40 pictures were divided across two sets, i.e., set A and B, each containing 10 smoking-related pictures and the corresponding 10 matched control pictures.

#### **Approach avoidance task (AAT)**

During the AAT (Rinck and Becker, 2007), participants responded to pictures presented on the computer screen by approaching and avoiding them using a joystick. The joystick was positioned in front of the computer screen, tightly fastened to the table. The instructions said that all pictures were tilted either slightly to the left or right, and that the tilt determined whether the pictures had to be pulled (approach movement) or pushed (avoidance movement; for a similar procedure, see Cousijn et al., 2011). Within each of the four participant groups, half of the participants pulled left-tilted and pushed right-tilted pictures, whereas the other half pushed left-tilted and pulled right-tilted pictures. Participants initiated each trial by pressing a button of the joystick with their index finger while holding the joystick in the central position. When the picture appeared, participants had to decide quickly whether the picture was tilted to the left or to the right, and had to respond according to their instructions. During pushing, the pictures became smaller, whereas they became larger during pulling. This zoom supported the approach-avoidance effect visually. Moreover, participants were instructed to "pull the joystick toward themselves," and to "push it away from them." Via these instructions, the movements' reference point was the participant's body. This disambiguated the movements, and labeled them as clear approach or avoidance movements. After pushing or pulling the joystick all the way in the correct direction, participants had to bring it back to the central position and start the next trial. Pictures disappeared only when the joystick was pulled or pushed in the correct direction by an angle of 30 degrees.

The AAT started with a practice block during which two practice pictures were pushed and pulled 10 times each. After that, 160 assessment trials followed, including 10 smoking-related pictures and 10 matched control pictures. The assessment was divided into two blocks of 80 trials each. Within each block, the smoking-related pictures and matched control pictures were pushed and pulled four times each [i.e., (4 × 10) + (4 × 10) = 80 trials per block, and 80 × 2 = 160 trials in total].

#### **Stimulus response compatibility task (SRC)**

In each trial of the SRC task (Mogg et al., 2003), a picture appeared in the center of the screen. In addition, a manikin figure was displayed either below or above the picture. Participants were instructed to move the manikin figure either toward or away from the picture by making use of the keys "2" (manikin moved downwards) and "8" (manikin moved upwards) on the numeric part of the keyboard. There were two blocks with two different stimulus-response assignments: One block required participants to move the manikin toward smoking-related pictures (approach movement) and to move the manikin away from control pictures (avoidance movement), whereas the other block required participants to move the manikin away from smoking-related pictures and toward control pictures. For the sake of brevity, the following terms will be used to describe these two different stimulus-response assignments: compatible block: manikin approaches smoking-related pictures and avoids control pictures; incompatible block: manikin avoids smoking-related pictures and approaches control pictures. The latency between picture onset and the participant's response served as the dependent variable. All participants completed both blocks. However, the order of blocks was counterbalanced: Within each of the four participant groups, half of the participants started with the compatible block and then completed the incompatible block, whereas the other half started with the incompatible block and then completed the compatible block. Within each block, the manikin appeared below the picture in 50% of the trials, and above the picture in the other 50%. When the manikin appeared below the picture, 50% of the trials required a down response, whereas the other 50% required an up response, and the same was true when the manikin appeared above the picture. The manikin position and picture type varied randomly over trials.

The SRC started with a practice block during which the manikin approached one picture four times and also avoided one picture four times. After that, 160 assessment trials followed, including 10 smoking-related pictures and 10 matched control pictures. The assessment was divided into two blocks of 80 trials each. Within each block, the smoking-related pictures and matched control pictures were approached and avoided four times each [i.e., (4 × 10) + (4 × 10) = 80 trials per block, and 80 × 2 = 160 trials in total].
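The stimulus-response assignments of the SRC described above can be summarized as a small lookup. The following is a minimal sketch, not the original task code: the function name `required_key` and the string labels are hypothetical, and only the key meanings ("8" moves the manikin up, "2" moves it down) come from the task description.

```python
from itertools import product

KEY_UP, KEY_DOWN = "8", "2"  # numeric keypad keys named in the task description


def required_key(block, picture, manikin_position):
    """Return the key that moves the manikin as the block's rule demands.

    block: "compatible" or "incompatible" (hypothetical labels)
    picture: "smoking" or "control"
    manikin_position: "below" or "above" the picture
    """
    # Compatible block: approach smoking pictures, avoid control pictures.
    # Incompatible block: the reverse assignment.
    approach = (picture == "smoking") == (block == "compatible")
    # Approaching means moving toward the picture: up if the manikin is
    # below the picture, down if it is above the picture.
    toward = KEY_UP if manikin_position == "below" else KEY_DOWN
    away = KEY_DOWN if manikin_position == "below" else KEY_UP
    return toward if approach else away


for block, picture, position in product(
    ("compatible", "incompatible"), ("smoking", "control"), ("below", "above")
):
    print(block, picture, position, "->", required_key(block, picture, position))
```

Because each block contains equal numbers of smoking-related and control pictures, this mapping yields the balance noted in the text: for either manikin position, half of the trials call for an up response and half for a down response.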

#### **Single target implicit association test (STIAT)**

The STIAT (Wigboldus et al., 2004) consisted of a complete sequence of five blocks: (a) attribute discrimination, (b) practice combined block, (c) first combined block, (d) practice reversed combined block, and (e) reversed combined block. Each block started with instructions describing the discrimination category and the assignment of the response keys (left vs. right). The procedure started with (a) the attribute discrimination block, in which participants had to sort words that belonged to two categories, namely approach or avoidance. Participants were asked to press one key in response to approach-related words, and the other key in response to avoidance-related words (i.e., either key "A" on the very left part of the keyboard or key "6" on the numeric part of the keyboard). The stimuli in this block consisted of six approach-related attribute words and six avoidance-related attribute words. Words were presented one after another in a fixed random order. In the second block, six smoking-related target words were also presented. Participants therefore practiced the combined block (b). There were two different response assignments: one assignment required participants to categorize smoking-related target words with the same key as approach-related attribute words. The other assignment required participants to categorize smoking-related target words with the same key as avoidance-related attribute words. For the sake of brevity, the following terms will be used to describe these two response assignments: compatible block: smoking-related targets and approach-related attributes shared the same response key; incompatible block: smoking-related targets and avoidance-related attributes shared the same response key. The combined practice block included 24 trials: The six target words were presented once (6 trials), the six attribute words that required the same response were also shown once (6 trials), and the six words that required the opposite response key were presented twice each (12 trials).
Because targets were assigned to only one key during combined blocks, there would otherwise be fewer responses on the opposite key. Hence, to balance the number of responses on the left and right key, attributes assigned to the opposite side of the targets were presented twice as often, resulting in an equal number of left and right key responses in each of the combined blocks. The key-assignment was counterbalanced within each of the four participant groups: Half of the participants were told to press the left key ("A" key) in response to all targets, and the other half were told to press the right key ("6" key) in response to all targets. Moreover, we controlled for the sequence of the combined blocks: Within each of the four participant groups, half of the participants started with the compatible block and then completed the incompatible block, whereas the other half started with the incompatible block and then completed the compatible block.

After the practice trials, the actual combined block followed (c). This block included 72 trials: The six target words were presented three times (18 trials), the six attribute words which required the same response were also shown three times (18 trials), and the six words which required the opposite response were presented six times each (36 trials). Next, participants practiced the reversal of the response assignment for target words (d). That is, participants who had pressed the approach key in response to smoking-related targets now had to respond with the avoidance key, and vice versa for the other half of the participants. This combined reversed practice block also consisted of 24 trials: The six target words were presented once, the six attribute words that required the same response were also shown once, and the six words that required the opposite response were presented twice each (12 trials). Finally, the actual reversed combined block followed (e). This block again included 72 trials: The six target words were presented three times, the six attribute words which required the same response were also shown three times, and the six words which required the opposite response were presented six times each. During each trial, reminder labels (appropriate category names positioned in the top left and top right corner of the screen) remained visible. Within each block, stimuli appeared in the same fixed random order for each participant. After incorrect responses, a red "X" appeared in the center of the screen. Given the high number of German students at Radboud University, we had two STIATs: a Dutch and a German version.
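The composition of the combined STIAT blocks, and the way doubling the opposite-key attributes equalizes left and right responses, can be verified with a brief sketch. The function name `combined_block` is hypothetical, and the sketch assumes, as the 24-trial total of the practice blocks implies, that each opposite-key attribute word appeared twice there.

```python
def combined_block(target_reps, same_key_reps, opposite_key_reps, n_words=6):
    """Count trials per response key in one STIAT combined block.

    Each argument is the number of presentations per word; n_words is the
    number of words in each of the three word sets (six in this study).
    """
    targets = n_words * target_reps
    same_key_attributes = n_words * same_key_reps
    opposite_key_attributes = n_words * opposite_key_reps
    return {
        "total": targets + same_key_attributes + opposite_key_attributes,
        # Targets share a key with one attribute category.
        "target_key": targets + same_key_attributes,
        # The other attribute category is alone on the opposite key.
        "other_key": opposite_key_attributes,
    }


practice = combined_block(1, 1, 2)    # practice combined blocks (b) and (d)
assessment = combined_block(3, 3, 6)  # combined blocks (c) and (e)
print(practice)
print(assessment)
```

Under these assumptions the practice blocks come to 24 trials and the assessment blocks to 72, with the same number of responses on each key in both cases.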

#### **Emotional Stroop**

During this task, participants categorized word stimuli according to their print color. The stimuli were presented on cards. There were five print colors: white, blue, red, green, and yellow. There were three types of cards. All participants started with the practice card. Here, meaningless colored strings of "XXX" were presented. After that, the smoke card and the neutral card were presented in random order. On the smoke card, eight smoking-related words were shown (e.g., cigarette, smoke, cigar). These words differed from the smoking-related words used during the STIAT. On the neutral card, eight household-related words were shown (e.g., towel, broom, spoon). All cards contained 40 stimuli each, i.e., eight stimuli distributed across five columns. Each card appeared on the screen after a mouse click initiated by the experimenter. As soon as the participant had named the last word's print color, the experimenter clicked again and the card disappeared. Reaction times were saved on the computer, for each card separately, and these reaction times were used in the analyses. Participants' errors were recorded by the experimenter, who was blind to the type of card that was presented. Given the high number of German students at Radboud University, we used a German and a Dutch Stroop version.

## **Procedure**

Participants were tested individually in separate testing cubicles. After having signed informed consent, participants' level of carbon monoxide (CO) was assessed by means of the piCO+ smokelyzer (Bedfont Scientific, Kent, England). For smokers, CO levels were assessed 10 min after they had smoked their cigarette. Next, smokers, cravers and ex-smokers answered a question about their level of craving ("How strong is your urge to smoke a cigarette right now?") using a scale from 0 (= no urge) to 100 (= strong urge). Moreover, smokers and cravers had to indicate how many cigarettes they would smoke on a normal day. Ex-smokers were asked to indicate this for the time they were still smoking. Then, the four computer tasks followed. There were two task orders, which were counterbalanced: Within each of the four participant groups, half of the participants received order one (STIAT, SRC, Stroop, AAT), the other half order two (AAT, Stroop, SRC, STIAT). The tasks' order was linked to the picture set (A or B, example: if a participant started with the AAT, picture set A was used for the AAT, and picture set B was used for the SRC, and vice versa if a participant started with the SRC). That is, for task order one, the AAT always included picture set A and the SRC included picture set B. For task order two, the AAT always included picture set B and the SRC included picture set A. After the computer tasks, smokers, cravers and ex-smokers completed a second craving question and the FTND. Ex-smokers received an adapted version of the FTND that was related to their past smoking behavior. The smoking attitude rating was then completed by all participants. Finally, cravers were asked to smoke a cigarette and after 10 min, their CO levels were assessed a second time. This second assessment served as an extra check for the cravers' temporary abstinence, i.e., we expected their CO value to be higher than their CO value assessed before the start of the study.
The present study had the necessary ethical approvals via the Behavioural Science Institute.

## RESULTS

## Participant Characteristics

**Table 1** gives an overview of the samples' characteristics and the means and standard deviations of the following measures: average of daily smoked cigarettes, levels of carbon monoxide (CO) pre study, craving pre and post-study, and scores on the Fagerström Test for Nicotine Dependence (FTND). A chi-square test revealed that the four groups did not differ concerning gender, χ<sup>2</sup>(3) = 4.84, p = 0.18. Univariate ANOVAs were conducted to examine the following baseline measures (please note that not all groups were involved in all comparisons): Age, F(3, 217) = 7.81, p < 0.001, eta<sup>2</sup> = 0.1; Average of daily smoked cigarettes, F(2, 161) = 0.51, p = 0.6; CO levels pre study, F(3, 202) = 94.91, p < 0.001, eta<sup>2</sup> = 0.59; Craving pre study, F(2, 160) = 91.2, p < 0.001, eta<sup>2</sup> = 0.53; FTND scores, F(2, 161) = 7.84, p < 0.01, eta<sup>2</sup> = 0.09.

These outcomes were treated as follows: For age, we repeated the main analyses (i.e., for the AAT, SRC, STIAT, and Stroop) including age as a covariate. This did not change the results. Hence, for clarity and brevity, and given the lack of specific hypotheses regarding age, we report all analyses without this covariate and did not analyze this baseline imbalance further. Moreover, we did not further analyze FTND scores, given the fact that the ex-smokers' score is a retrospectively assessed score and thus not an optimal measure. However, we did further examine the findings concerning the CO levels and craving scores pre study. Regarding the pre study CO levels, Bonferroni post-hoc tests including all four groups (i.e., smokers, non-smokers, cravers, ex-smokers) revealed that all group comparisons were significant (p's < 0.002), except for the non-smokers vs. ex-smokers comparison (p = 1). Regarding the craving scores pre study, Bonferroni post-hoc tests including smokers, cravers, and ex-smokers revealed significant differences for all three comparisons, p's < 0.03.

## Craving Over the Course of the Study

We also assessed participants' level of craving over the course of the study. A repeated-measures ANOVA including the between-subjects factor Group (smokers, cravers, ex-smokers) and the within-subjects factor Time (craving pre, craving post) revealed a significant main effect of Time, F(1, 158) = 23.88, p < 0.001, eta<sup>2</sup> = 0.13, and Group, F(2, 158) = 99.91, p < 0.001, eta<sup>2</sup> = 0.56. Moreover, there was a marginally significant Time × Group interaction, F(2, 158) = 2.79, p = 0.065, eta<sup>2</sup> = 0.03. This interaction was further examined by three paired-samples t-tests, i.e., one for each group comparing craving scores pre vs. post: smokers: t(56) = 3.86, p < 0.001; cravers: t(51) = 2.76, p < 0.01; ex-smokers: t(51) = 1.92, p = 0.06. Thus, smokers' and cravers' level of craving significantly increased over the course of the study. In the group of ex-smokers, this increase was marginally significant, although it would no longer be after application of a Bonferroni correction (for means and standard deviations, see **Table 1**).
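The Bonferroni remark can be made concrete with a minimal sketch. It assumes that the three paired t-tests form one family of comparisons, and it uses the reported p-values; for smokers and cravers only upper bounds were reported (p < 0.001 and p < 0.01), so those bounds serve as conservative stand-ins. The function name is hypothetical and not part of the original analysis.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant if p is below alpha / number of tests."""
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}

# Reported p-values (upper bounds for smokers and cravers, an assumption
# made here because exact values are not given in the text).
reported = {"smokers": 0.001, "cravers": 0.01, "ex-smokers": 0.06}
result = bonferroni_significant(reported)
```

With three tests the corrected threshold is 0.05 / 3 ≈ 0.017, so the ex-smokers' value of 0.06 falls well outside it.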

## Analyses Approach-Avoidance Biases

For the analyses of the AAT, SRC, and STIAT, the effects of potential outliers were corrected by computing the median reaction time (RT) of each participant. Thus, the means reported below are means of medians.

## Approach Avoidance Task (AAT)

The analysis included only trials during which a participant pushed or pulled the joystick all the way in the correct direction in a single movement. As a first step, we examined the groups' error trials by means of a univariate ANOVA. Results showed that the groups did not differ here: F(3, 217) = 1.64, p = 0.18 (smokers: M = 0.05, SD = 0.04; non-smokers: M = 0.05, SD = 0.05; cravers: M = 0.07, SD = 0.09; ex-smokers: M = 0.05, SD = 0.06).

Next, a difference score per participant was calculated. As a first step, RTs of pull movements were subtracted from RTs of push movements, for both picture types (i.e., smoking and control). As such, a positive difference score reflects an approach bias. After that, we subtracted the control pictures' difference score from that of smoking-related pictures. Here, a positive difference score indicates a stronger approach bias toward smoking-related pictures. Finally, participants with an error percentage greater than 20% were excluded from the analysis (non-smokers n = 2, cravers n = 3, ex-smokers n = 2).
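The scoring steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the trial format, the function names, and the example RTs are hypothetical, and only correct trials are assumed to be passed in.

```python
from statistics import median

def aat_bias(trials):
    """Overall AAT difference score for one participant.

    trials: list of (picture_type, movement, rt_ms) tuples for correct
    trials, with picture_type in {"smoking", "control"} and movement in
    {"push", "pull"} (hypothetical format).
    """
    def med(picture_type, movement):
        # Median RT per cell limits the influence of outlying trials.
        return median(rt for p, m, rt in trials
                      if p == picture_type and m == movement)

    smoking = med("smoking", "push") - med("smoking", "pull")  # > 0: approach
    control = med("control", "push") - med("control", "pull")
    return smoking - control  # > 0: stronger approach bias for smoking cues

def keep_participant(n_errors, n_trials, max_error_rate=0.20):
    """Participants with an error percentage above 20% are excluded."""
    return n_errors / n_trials <= max_error_rate

# Made-up RTs in milliseconds, for illustration only.
example = [
    ("smoking", "push", 700), ("smoking", "push", 720), ("smoking", "push", 900),
    ("smoking", "pull", 640), ("smoking", "pull", 660), ("smoking", "pull", 650),
    ("control", "push", 680), ("control", "push", 690), ("control", "push", 700),
    ("control", "pull", 670), ("control", "pull", 680), ("control", "pull", 690),
]
score = aat_bias(example)
```

In this made-up example the smoking score is 720 − 650 = 70 ms and the control score is 690 − 680 = 10 ms, giving an overall difference score of 60 ms. Group means are then means of such per-participant scores.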

To analyze the AAT data, a univariate ANOVA was conducted with Group (smokers, non-smokers, cravers, ex-smokers) and Order Tasks (one, two) as between-subjects factors, and the overall difference score as dependent variable. Of most interest was the main effect of Group. However, this effect did not reach significance, F(3, 206) = 0.9, p = 0.44. As such, the groups did not differ in their approach-avoidance responses toward smoking-related and control pictures (for an overview of means, standard deviations and n's per group, see **Table 2**)<sup>2</sup>.

## Stimulus Response Compatibility (SRC) task

As a first step, we examined the groups' error scores by means of a univariate ANOVA. Results showed that the groups did not differ here: F(3, 216) = 0.41, p = 0.75 (smokers: M = 0.06, SD = 0.04; non-smokers: M = 0.06, SD = 0.04; cravers: M = 0.06, SD = 0.04; ex-smokers: M = 0.07, SD = 0.05). Based on this error check, we excluded two participants from the analysis because their error percentage was greater than 20% (smokers: n = 1, ex-smokers: n = 1). To analyze the RT data, we subtracted RTs of the compatible block from RTs of the incompatible block. As such, a positive difference score indicates faster approach of smoking-related pictures. Next, we conducted a univariate ANOVA with

<sup>2</sup> For the sake of clarity, we only report the outcome of main interest in the text. Hence, please find the additional outcomes here (i.e., main effects and interactions) of the analyses of the AAT, SRC, STIAT and Stroop data: AAT: Order Tasks, F(1, 206) = 0.25, p = 0.62, Group × Order Tasks, F(3, 206) = 0.89, p = 0.45. SRC: Order Tasks, F(1, 202) = 0.49, p = 0.49, Order SRC, F(1, 202) = 0.41, p = 0.52, Group × Order Tasks, F(3, 202) = 1.59, p = 0.19, Group × SRC order, F(3, 202) = 2.88, p = 0.04, eta<sup>2</sup> = 0.04, Order Tasks × SRC Order, F(1, 202) = 0.02, p = 0.9, Group × Order Tasks × SRC Order, F(3, 202) = 0.59, p = 0.63. STIAT: Order Tasks, F(1, 202) = 4.1, p = 0.04, eta<sup>2</sup> = 0.02, STIAT Order, F(1, 202) = 1.1, p = 0.3, Group × Order Tasks, F(3, 202) = 0.64, p = 0.59, Group × STIAT Order, F(3, 202) = 4.43, p < 0.01, eta<sup>2</sup> = 0.06, Order Tasks × STIAT Order, F(1, 202) = 18.62, p < 0.001, eta<sup>2</sup> = 0.08, Group × Order Tasks × STIAT Order, F(3, 202) = 4.13, p < 0.01, eta<sup>2</sup> = 0.06. Stroop: Order Tasks, F(1, 192) = 0.56, p = 0.45, Order Stroop, F(1, 192) = 14.32, p < 0.001, eta<sup>2</sup> = 0.07, Group × Order Tasks, F(3, 192) = 0.53, p = 0.67, Group × Order Stroop, F(3, 192) = 3.87, p = 0.01, eta<sup>2</sup> = 0.06, Order Tasks × Order Stroop, F(1, 192) = 0.48, p = 0.49, Group × Order Tasks × Order Stroop, F(3, 192) = 2.51, p = 0.06, eta<sup>2</sup> = 0.04.

#### TABLE 1 | Descriptives of the four groups.


CO levels pre: Levels of carbon monoxide (CO) assessed before the study (please note Footnote 1); CO levels post: Levels of carbon monoxide (CO) assessed again in cravers after the study; Average daily smoking: Average of daily smoked cigarettes for smokers, cravers and ex-smokers (retrospective); Craving pre study: Levels of cigarette craving in smokers, cravers and ex-smokers before the study; Craving post-study: Levels of cigarette craving in smokers, cravers and ex-smokers after the study; FTND: Mean sum score of the Fagerström Test for Nicotine Dependence (FTND) in smokers, cravers, and ex-smokers (retrospective).


AAT, Approach Avoidance Task; Score 1, RTs push movements − RTs pull movements smoke pictures; Score 2, RTs push movements − RTs pull movements control pictures; Overall difference score, Score 1 − Score 2, i.e., a positive difference score indicates a stronger approach bias toward smoking-related pictures. SRC: Score 1, Compatible block (manikin approaches smoking-related pictures and avoids control pictures); Score 2, Incompatible block (manikin avoids smoking-related pictures and approaches control pictures); Overall difference score, Incompatible block − compatible block, i.e., a positive difference score indicates faster approach of smoking-related pictures. STIAT: Score 1, Compatible block (smoking-related targets and approach-related attributes shared the same response key); Score 2, Incompatible block (smoking-related targets and avoidance-related attributes shared the same response key); Overall difference score, Incompatible block − compatible block, i.e., a positive difference score indicates faster approach-related associations toward smoking-related pictures. Stroop: Score 1, RTs smoke card; Score 2, RTs neutral card; Overall difference score, Score 1 − Score 2, i.e., a positive difference score indicates a greater interference for smoking-related stimuli.

Group (smokers, non-smokers, cravers, ex-smokers), Order SRC (compatible-incompatible, incompatible-compatible) and Order Tasks (one, two) as between-subjects factors, and the difference score as dependent variable. Of most interest here was the main effect of Group. Results showed that this effect was marginally significant, F(3, 202) = 2.24, p = 0.085, eta<sup>2</sup> = 0.03. However, post-hoc Bonferroni tests revealed that none of the between-group comparisons were significant (p's > 0.05). As such, the groups did not differ in their approach-avoidance responses toward smoking-related and control pictures (for an overview of means, standard deviations and n's per group, see **Table 2**).

## Single Target Implicit Association Test (STIAT)

As a first step, we examined the groups' error score by means of a univariate ANOVA. Results showed that the groups did not differ here: F(3, 216) = 1.31, p = 0.27 (smokers: M = 0.04, SD = 0.03; non-smokers: M = 0.05, SD = 0.04; cravers: M = 0.04, SD = 0.04; ex-smokers: M = 0.04, SD = 0.04). Based on this error check, we excluded two participants from the analysis because their error percentage was greater than 20% (non-smokers: n = 1, ex-smokers: n = 1). To analyze the RT data, we subtracted RTs of the compatible block from RTs of the incompatible block. As such, a positive difference score indicates faster approach-related associations toward smoking-related pictures. Next, we conducted a univariate ANOVA with Group (smokers, non-smokers, cravers, ex-smokers), Order STIAT (compatible-incompatible, incompatible-compatible) and Order Tasks (one, two) as between-subjects factors, and the difference score as dependent variable. Of most interest here was the main effect of Group. Results showed that this effect was significant, F(3, 202) = 5.74, p < 0.01, eta<sup>2</sup> = 0.08. Post-hoc Bonferroni tests revealed the following: cravers vs. non-smokers: p < 0.001, cravers vs. smokers: p = 0.051, cravers vs. ex-smokers: p = 0.094. None of the other comparisons reached significance (p's > 0.05). Thus, cravers had a significantly stronger approach bias toward smoking-related cues than non-smokers, and a marginally stronger one than smokers and ex-smokers (for an overview of means, standard deviations and n's per group, see **Table 2**).

## Attentional Bias

## Emotional Stroop

Prior to the analysis, a difference score was calculated per participant. Here, we only used RTs of the neutral card and the smoke card, not the practice card including the meaningless colored "XXX" strings. More precisely, RTs of the neutral card were subtracted from RTs of the smoke card. As such, a positive difference score indicates greater interference for smoking-related stimuli. We excluded 6 participants because their difference score deviated more than 3 SD from their group's mean difference score (non-smokers n = 2, cravers n = 4). We conducted a univariate ANOVA including the between-subjects factors Group (smokers, non-smokers, cravers, ex-smokers), Order Stroop Card (smoke-neutral, neutral-smoke) and Order Tasks (one, two), and the difference score as dependent variable. Of main interest here was the main effect of Group. However, results showed that this effect was not significant, F(3, 192) = 0.4, p = 0.75. As such, there were no group differences concerning the interference of smoking-related vs. neutral stimuli (for an overview of means, standard deviations and n's per group, see **Table 2**).
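The interference score and the 3 SD exclusion rule can be sketched as below. This is a minimal illustration, assuming the rule is applied within each group separately and using the sample standard deviation; the text does not state which SD formula was used. Function names and data are hypothetical.

```python
from statistics import mean, stdev

def stroop_interference(rt_smoke_card, rt_neutral_card):
    """Positive values indicate greater interference for smoking words."""
    return rt_smoke_card - rt_neutral_card

def exclude_outliers(scores, n_sd=3.0):
    """Split one group's {participant_id: score} into kept and excluded ids."""
    values = list(scores.values())
    m = mean(values)
    sd = stdev(values)  # sample SD: an assumption, not stated in the text
    kept, excluded = {}, []
    for pid, score in scores.items():
        if abs(score - m) > n_sd * sd:
            excluded.append(pid)
        else:
            kept[pid] = score
    return kept, excluded

# Made-up difference scores for one group of 20 participants:
# 19 small scores and one extreme value.
group = {f"p{i}": (5 if i % 2 == 0 else -5) for i in range(19)}
group["p19"] = 200
kept, excluded = exclude_outliers(group)
```

Note that with small groups a single extreme score inflates the SD it is compared against, so a 3 SD rule can only flag a value when the group is reasonably large.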

## Ratings Explicit Attitudes Toward Smoking

Prior to analysis, scores on the eight adjective pairs were collapsed into one overall score. Here, higher sum scores signal a more negative attitude toward smoking. To investigate whether the groups differed concerning their explicit attitude toward smoking, a univariate ANOVA was conducted with Group (smokers, non-smokers, cravers, ex-smokers) as between-subjects factor and the collapsed attitude sum score as dependent variable. Results showed a significant main effect of Group, F(3, 217) = 52.32, p < 0.001, eta<sup>2</sup> = 0.42. Post-hoc Bonferroni tests revealed significant results for all group comparisons (p's < 0.01), except for the smoker vs. craver comparison, p = 0.1 (smokers: M = 33.66, SD = 4.47; non-smokers: M = 45.7, SD = 6.65; cravers: M = 33.82, SD = 5.68; ex-smokers: M = 38.11, SD = 6.41). Moreover, one-sample t-tests showed that all four group means deviated significantly from zero, p's < 0.001. This result pattern generally shows that non-smokers have the most negative attitude toward smoking, followed by ex-smokers, and then by smokers and cravers, who did not differ from each other.

## Correlations

Across the group of smokers, cravers, and ex-smokers, correlations were calculated for the following measures: AAT, SRC, STIAT, Stroop, explicit attitudes toward smoking, daily smoking, FTND scores, urge pre study, urge post-study. **Table 3** gives an overview of these findings. We particularly expected to find positive correlations between tobacco-related approach and attentional biases. However, there were only two marginally significant correlations, i.e., between the AAT and the STIAT (r = 0.15), showing that the stronger the approach bias toward smoking-related pictures on the AAT, the stronger the approach-associations toward smoking-related words on the STIAT; and between the SRC and the Stroop (r = 0.15), showing that the stronger the approach bias toward smoking-related pictures on the SRC, the greater the smoking-related attentional bias on the Stroop. When looking at the correlations including explicit attitudes toward smoking, daily smoking, FTND scores, urge pre study, urge post-study, we found the following: Both the AAT and the SRC correlated with explicit smoking attitudes (r = −0.15, marginally significant, and r = −0.18, respectively), showing that the stronger the approach bias toward smoking-related pictures, the less negative participants' attitude toward smoking was. Moreover, there was a marginally significant correlation between the Stroop and FTND scores (r = 0.16), indicating that the greater the smoking-related attentional bias on the Stroop, the higher participants' levels of nicotine dependence. Urge assessed before and after the computer tasks correlated with the STIAT (urge pre: r = 0.27, urge post: r = 0.25), and the Stroop (urge post: r = 0.17), showing that the higher the level of urge, the stronger the approach-associations toward smoking-related words on the STIAT, and the stronger the smoking-related attentional bias on the Stroop.
Please note, however, that most of the correlations would not remain significant after controlling for multiple comparisons.

## DISCUSSION

The present study examined the role of approach and attentional biases in tobacco dependence. We tested four groups, namely smokers, cravers, ex-smokers, and non-smokers. The following tasks were employed: To assess approach-related biases, we used the Approach Avoidance Task (AAT), the Stimulus Response Compatibility Task (SRC), and a Single Target Implicit Association Test (STIAT). A modified Stroop including smoking-related words was administered to assess attentional biases. Moreover, we assessed explicit attitudes toward smoking and levels of craving over the course of the study. We expected to find strong tobacco-related approach and attentional biases in smokers and cravers compared to ex-smokers and non-smokers. However, the general pattern of results did not confirm these expectations. Approach responses assessed during the AAT and SRC did not differ between groups. Moreover, the Stroop did not show the expected interference effect. Regarding the data of the STIAT, results were partly in line with our expectations: Cravers showed stronger approach associations toward smoking-related cues, whereas non-smokers showed stronger avoidance associations. However, no such differences


TABLE 3 | Correlations between AAT, SRC, STIAT, Stroop, explicit attitudes, daily smoking, FTND, urge pre-study, and urge post-study in smokers, cravers, and ex-smokers (N = 226).

AAT, Approach-Avoidance Task (difference score smoke-related pictures − difference score control pictures); SRC, Stimulus Response Compatibility (incompatible − compatible); STIAT, Single Target Implicit Association Test (incompatible − compatible); Daily smoking, cigarettes per day; FTND, Fagerström Test for Nicotine Dependence.

#p < 0.100, \*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.

in approach-avoidance associations were found in smokers and ex-smokers. Generally, correlational analyses did not reveal the expected positive correlations between tobacco-related approach and attentional biases among smokers, cravers, and ex-smokers. However, we did find some patterns that are in line with the theory, e.g., the stronger the approach bias toward smoking-related pictures, the stronger the smoking-related approach associations. Regarding the assessment of participants' explicit smoking-related attitudes, results were generally indicative of a negative attitude toward smoking, with non-smokers and ex-smokers having the most negative attitudes toward smoking. Finally, results showed that smokers' and cravers' level of craving significantly increased over the course of the study. In the group of ex-smokers, this increase was marginally significant.

To summarize, our data do not provide strong evidence for the role of approach and attentional biases in tobacco dependency, except for findings on the STIAT. Given the large sample size of each group, a lack of statistical power does not seem to be a likely explanation. Hence, a closer inspection of the tested groups, the tasks and their stimuli could help to understand these null-findings. Regarding the groups, their average smoking behavior is the first index to check and compare. Across our groups, smokers, cravers, and ex-smokers smoked 12–13 cigarettes a day. These scores are rather low when comparing them, for example, with the samples tested by Wiers C. E. et al. (2013): In that study, smokers and ex-smokers reported an average between 22 and 24 cigarettes a day. Thus, one could argue that our groups were not "smoking enough" in order to show tobacco-related approach and attentional biases. However, other studies found such biases in samples that exhibited smoking behavior similar to ours (e.g., Munafò et al., 2003; Bradley et al., 2004; Mogg et al., 2005). Moreover, in our study, smokers and cravers were supposed to be active smokers for at least two years. Another index, i.e., the groups' score on the Fagerström Test for Nicotine Dependence (FTND), is also rather inconclusive. Our groups scored around five on the FTND, which does not deviate much from the values in other studies (e.g., Munafò et al., 2003; Bradley et al., 2008; Wiers C. E. et al., 2013). To conclude, the sample's general smoking-related characteristics match those of other studies, and thus do not provide a sufficient explanation for the null-findings. Regarding the tasks we employed, we used well-established tasks in the context of tobacco-related approach and attentional biases (i.e., AAT, SRC, STIAT, and emotional Stroop). The AAT is a rather novel task for this specific type of addictive behavior.
However, it has been proven successful in the assessment of alcohol-related approach biases (e.g., Palfai and Ostafin, 2003; Wiers et al., 2010, 2011; Eberl et al., 2013; Kersbergen et al., 2015). Therefore, given the fact we tapped into similar processes (i.e., approach biases), in combination with the successful results reported by the three previous studies (Wiers C. E. et al., 2013; Larsen et al., 2014; Machulska et al., 2015), the AAT seemed a promising instrument. Only the STIAT provided results that partly supported our predictions. That is, we found stronger approach associations toward smoking-related cues in cravers, whereas stronger avoidance associations were found in non-smokers. From a theoretical perspective, this is in line with assumptions put forward by dual process models of addiction (e.g., Deutsch and Strack, 2006; Wiers et al., 2007) and the incentive-sensitization model (Robinson and Berridge, 1993, 2003, 2008): For cravers who were deprived of smoking, smoking-related cues had a high incentive salience, which in turn automatically elicited an approach association. For non-smokers, in contrast, for whom smoking-related cues did not have incentive salience and were rather associated with negativity and unpleasantness, smoking-related cues automatically elicited an avoidance association. Interestingly, however, our STIAT version slightly deviated from that of other STIATs as it included a high number of trials. To summarize, the details of the specific tasks used cannot explain the present null-results. Finally, the choice of stimulus material needs to be analyzed. The AAT and SRC included pictures that depicted clear smoking-related scenes or attributes, and the corresponding matched control picture. A problem with such matched control pictures could be that they were in fact "too good." That is, given their high similarity with the smoking-related pictures, they were possibly not distinctive enough.
This, in combination with participants' instruction to react to the pictures as quickly as possible, could be partly responsible for not finding any differences in approach-related responses between the tested groups. Moreover, some of the pictures contained food-related objects, so the control pictures could have elicited approach tendencies too. In this context, Machulska et al. (2015) suggest that it might be beneficial to use pictures that depict the commencement of smoking behavior (following findings by Stippekohl et al., 2012, and for similar reasoning when using pleasant vs. unpleasant smoking-related pictures, see Bradley et al., 2008). Our picture set included only three such pictures, which could partly explain the null-findings of the AAT and SRC. Finally, there are three additional limitations that could partly explain the present findings. A first limitation is the low reliability of some of the tasks we applied. Second, we did not use baseline CO levels as an inclusion criterion. Third, we cannot rule out that the smokers who were asked to smoke a cigarette prior to testing experienced a smoking-related priming effect during testing. Especially the latter two issues could have affected the results in an unfortunate manner.

To conclude, although the study has some limitations as highlighted above, the present findings remain rather puzzling. Our results neither replicate earlier findings, nor support predictions of dual process models of addiction (e.g., Deutsch and Strack, 2006; Wiers et al., 2007) or the incentive-sensitization model (Robinson and Berridge, 1993, 2003, 2008). Following this, our findings do not provide support for studies aiming to re-train approach and attentional biases, a development which has revealed promising findings in the area of alcohol addiction. Here, results showed that computerized trainings, i.e., procedures derived from "Cognitive Bias Modification" techniques (cf. Koster et al., 2009; Woud and Becker, 2014), are able to reduce alcohol-related approach biases (e.g., via Alcohol-AAT-Training, AAATT). Most importantly, however, results showed that such trainings improve treatment outcomes even at one-year follow-up (Wiers et al., 2010, 2011; Eberl et al., 2013; Gladwin et al., 2014; and for an overview of CBM-related results in addiction, see Wiers R. W. et al., 2013). In fact, CBM training could be



also quite useful in the context of tobacco-related biases, as they operate comparably to those reported in the alcohol literature. Indeed, one published study applied a computerized re-training in the context of tobacco dependence (Wittekind et al., 2014). This study found that tobacco avoidance training reduced levels of cigarette consumption and dependence. However, the study is only a pilot study without a control group. Hence, these data should be interpreted with caution.

Following our null-findings, we suggest that future research should address a number of issues. To start with, studies should further examine the exact conditions of tobacco-related approach and avoidance biases. Moreover, this has to be examined among various relevant groups, e.g., smokers, cravers, ex-smokers, and non-smokers, while taking craving into account (Watson et al., 2013). Although the empirical evidence is rather supportive of the existence of tobacco-related biases, a few studies also found results that do not support a strong role of such biases in smoking. To illustrate, Munafò et al. (2003) and Mogg and Bradley (2002) did not find differences in information processing biases between abstinent smokers and non-abstinent smokers, whereas Larsen et al. (2014) did not find differences in biases between smokers and non-smokers. Hence, it is possible that there are a number of subtle boundary conditions that are not yet fully understood.

Taken together, we did not find the expected tobacco-related approach and attentional biases, and therefore encourage future research to advance our understanding of the nature of these phenomena.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00172


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Woud, Maas, Wiers, Becker and Rinck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Priming of conflicting motivational orientations in heavy drinkers: robust effects on self-report but not implicit measures

Lisa C. G. Di Lemma<sup>1,2</sup>, Joanne M. Dickson<sup>1</sup>, Pawel Jedras<sup>1</sup>, Anne Roefs<sup>3</sup> and Matt Field<sup>1,2</sup>\*

*<sup>1</sup> Department of Psychological Sciences, University of Liverpool, Liverpool, UK, <sup>2</sup> UK Centre for Tobacco and Alcohol Studies, Liverpool, UK, <sup>3</sup> Clinical Psychological Science, Maastricht University, Maastricht, Netherlands*

#### Edited by:

*Frank Ryan, Imperial College, UK*

## Reviewed by:

*Thomas Edward Gladwin, Ministry of Defense, Netherlands; Kim Mitchell Caudwell, Curtin University, Australia*

#### \*Correspondence:

*Matt Field, Department of Psychological Sciences, University of Liverpool, Bedford Street South, Liverpool L69 7ZA, UK; mfield@liv.ac.uk*

#### Specialty section:

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

Received: *19 June 2015* Accepted: *14 September 2015* Published: *02 October 2015*

#### Citation:

*Di Lemma LCG, Dickson JM, Jedras P, Roefs A and Field M (2015) Priming of conflicting motivational orientations in heavy drinkers: robust effects on self-report but not implicit measures. Front. Psychol. 6:1465. doi: 10.3389/fpsyg.2015.01465*

We report results from three experimental studies that investigated the independence of approach and avoidance motivational orientations for alcohol, both of which operate within controlled and automatic cognitive processes. In order to prime their approach or avoidance motivational orientations, participants watched brief videos, the content of which (positive or negative depictions of alcohol, or neutral) varied by experimental group. Immediately after watching the videos, participants completed self-report (Approach and Avoidance of Alcohol Questionnaire; all studies) and implicit (visual probe task in study 1, stimulus-response compatibility task in studies 2 and 3) measures of alcohol-related approach and avoidance. In study 3, we incorporated an additional experimental manipulation of thought suppression in an attempt to maximize the influence of the videos on implicit measures. Findings were consistent across all three studies: increases in self-reported approach inclinations were mirrored by decreases in avoidance inclinations, and vice versa. However, a combined analysis of data from all studies demonstrated that changes in approach inclinations were partially independent of changes in avoidance inclinations. There were no effects on implicit alcohol-related processing biases, although methodological issues may partially account for these findings. Our findings demonstrate that subjective approach and avoidance inclinations for alcohol tend to fluctuate in parallel, but changes in approach inclinations may be partially independent from changes in avoidance inclinations. We discuss methodological issues that may partially account for our findings.

Keywords: alcohol, ambivalence, approach, automatic, avoidance, implicit, thought suppression

## Introduction

According to the ambivalence model of craving (Breiner et al., 1999; McEvoy et al., 2004), the decision to consume alcohol is determined by the balance between motivational inclinations to indulge ("approach") and to abstain ("avoidance"). Approach and avoidance inclinations might arise from the desire for intoxication or the wish to keep a clear head for the next day, respectively. Motivational conflict (or ambivalence), which plays an important role in alcohol use disorders and their treatment (Hettema et al., 2005), arises when a person has the motivation to drink and to abstain at the same time. Importantly, these motivational orientations can operate in both controlled (or explicit) and automatic (or implicit) cognitive processes. Controlled processes are rule-based and reflective, they operate within conscious awareness and they can be assessed with self-report measures. Automatic processes are activated spontaneously and they are typically assessed with indirect tasks such as computerized measures of attentional bias and automatic approach tendencies (Stacy and Wiers, 2010). One theoretical model proposed that subjective craving (a controlled process) and attentional bias (an automatic process) have reciprocal causal influences on each other (Field and Cox, 2008), although an alternative account is that automatic and controlled processes are both outputs of underlying processes that cannot be measured directly (motivational orientations; see Christiansen et al., 2015). In this paper we report results from three studies in which we experimentally manipulated motivational orientations for alcohol in order to thoroughly investigate the independence of approach and avoidance in both controlled and automatic processing.

Regarding controlled processes, the Approach and Avoidance of Alcohol Questionnaire (AAAQ; McEvoy et al., 2004) was developed to capture the strength of approach and avoidance inclinations for alcohol. The initial factor analysis of non-dependent drinkers' responses on the AAAQ yielded three subscales, two representing approach inclinations (inclined-indulgent and obsessed-compelled subscales, corresponding to mild and strong inclinations, respectively), and one representing inclinations to avoid drinking alcohol (the resolved-regulated subscale). Subsequent studies employed the AAAQ with different populations of drinkers and performed factor analysis on participants' responses. Each of these studies confirmed that approach and avoidance represent distinct underlying factors, although some studies with alcohol-dependent patients (Klein et al., 2007; Schlauch et al., 2013c; but see Klein and Anker, 2013) identified a single underlying factor for approach inclinations rather than the qualitative distinction between mild and strong approach that was reported in the initial study (McEvoy et al., 2004). Many of these studies demonstrated that both approach and avoidance inclinations are independently associated with drinking-related variables. For example, approach and avoidance inclinations account for unique variance in quantity and frequency indices of alcohol consumption in both non-dependent (McEvoy et al., 2004) and alcohol-dependent (Klein et al., 2007) drinkers. Approach and avoidance inclinations also have differential predictive validity in alcohol-dependent patients: following treatment, relapse to drinking is predicted by the strength of approach inclinations, but avoidance inclinations are not predictive (Schlauch et al., 2012; Klein and Anker, 2013; see also Schlauch et al., 2013c). On the other hand, avoidance inclinations (but not approach inclinations) predict the likelihood of entering into and engaging with treatment (Schlauch et al., 2012).
Taken together, these findings provide support for the ambivalence model of craving (Breiner et al., 1999) because they demonstrate that self-reported approach and avoidance inclinations for alcohol are separable constructs that are uniquely associated with past and future drinking behavior (see also Curtin et al., 2005; Schlauch et al., 2013a,b).

Regarding automatic processes, there is evidence for coexistence of appetitive (approach) and aversive (avoidance) alcohol-related processing biases in problem drinkers in a variety of sub-domains, including affective associations (Dickson et al., 2013), attentional bias (Stormark et al., 1997), and approach and avoidance tendencies (Barkby et al., 2012). Regarding attentional bias, heavy drinkers who are not seeking treatment have an attentional bias for alcohol cues (Townshend and Duka, 2001; Field et al., 2004). The strength of this attentional bias is reliably associated with the strength of subjective craving (Field et al., 2009) and is potentiated by experimental manipulations that increase the motivation to drink, such as induction of negative mood and exposure to alcohol-related cues (see Field and Cox, 2008). By contrast, alcohol-dependent patients who are tested in treatment contexts show initial attentional bias that is quickly followed by attentional avoidance (Stormark et al., 1997; Noël et al., 2006; Townshend and Duka, 2007; Vollstädt-Klein et al., 2009; Field et al., 2013). The latter pattern of attentional bias may reflect ambivalence, with appetitive motivational processes mapped to the initial attentional bias and aversive motivational processes mapped on to the subsequent attentional avoidance (see Field et al., 2013, for discussion). Consistent with this interpretation, a recent eye tracking study demonstrated that heavy drinkers who were identified as ambivalent (as assessed with the AAAQ) had an approach-avoidance pattern of attentional bias for alcohol cues (i.e., the initial attentional bias quickly followed by attentional avoidance that is characteristic of alcohol-dependent patients), whereas heavy drinkers who were not ambivalent maintained their attentional bias for alcohol cues (Lee et al., 2014).

Automatic approach and avoidance tendencies evoked by alcohol-related cues have been assessed with the alcohol-related stimulus-response compatibility (SRC) task (Field et al., 2008) and related tasks (Wiers et al., 2009). These tasks reveal that in heavy drinkers who are not seeking treatment, alcohol cues evoke automatic approach tendencies (Field et al., 2008, 2011; Wiers et al., 2009; Christiansen et al., 2012; Sharbanee et al., 2013a,b; Kersbergen et al., 2015), and in some studies the strength of these approach tendencies was associated with the strength of subjective craving (Field et al., 2005, 2008). A different pattern is seen in alcohol-dependent patients: one study reported no reliable tendency toward approach or avoidance (Barkby et al., 2012) whereas another study found an automatic avoidance tendency, the strength of which was predictive of subsequent relapse (Spruyt et al., 2013). One explanation for these findings is that the standard version of the SRC task yields an index of automatic approach that is relative to avoidance. This means that the pattern that is observed in heavy drinkers who are not seeking treatment (Field et al., 2008, 2011; Christiansen et al., 2012; Kersbergen et al., 2015) could be attributed to strong automatic approach, weak automatic avoidance, or a combination of the two. Among alcohol-dependent patients, if alcohol cues simultaneously evoke strong automatic approach at the same time as strong automatic avoidance, this may explain why this population displays either no overall bias (Barkby et al., 2012) or an avoidance bias (Spruyt et al., 2013) depending on the strength of their motivational orientations to avoid alcohol at the time of testing.

Findings from the cross-sectional and prospective studies described above are consistent with the ambivalence model (Breiner et al., 1999) because they suggest that approach and avoidance motivational orientations for alcohol may exist independently of each other, rather than lying at opposite ends of a single continuum. More compelling evidence for the independence of approach and avoidance can be derived from experimental studies that attempt to influence one motivational orientation (approach or avoidance) in order to investigate if the opposing motivational orientation is (un)affected. Regarding automatic processes, we recently demonstrated that subliminal priming of approach or avoidance motivational orientations for alcohol had no effect on attentional biases or automatic approach or avoidance tendencies, although methodological issues complicated interpretation of those findings (Baker et al., 2014). Regarding controlled processes, several studies investigated the effects of exposure to alcohol cues on self-reported approach and avoidance inclinations for alcohol, and all reported findings that were suggestive of partially independent approach and avoidance responses to those cues (Curtin et al., 2005; Jones et al., 2013; Schlauch et al., 2013a,b). For example, in one study exposure to alcohol cues (pouring, holding, and sniffing a beer) led to increases in approach inclinations (AAAQ inclined-indulgent and obsessed-compelled subscales), but avoidance inclinations (AAAQ resolved-regulated subscale) were unaffected (Jones et al., 2013).

Although these studies are informative, a more rigorous experimental test of the independence of approach and avoidance would be to contrast the effects of experimental manipulations that are intended to increase approach or avoidance motivational orientations for alcohol. To achieve this, we were inspired by methods used in a previous study (Roefs et al., 2006) in which participants' automatic processing of food-related words was assessed in contexts that were intended to activate either approach (focusing on the preparation of a tasty meal) or avoidance (focusing on the importance of a healthy diet, and therefore avoiding unhealthy foods). In the present studies, participants viewed short videos that depicted either the positive or negative aspects of alcohol consumption, which should in principle activate approach or avoidance, respectively. Control groups of participants viewed videos that were unrelated to alcohol consumption. Immediately after watching the videos, participants completed the AAAQ (all studies) followed by computerized measures of attentional bias (study 1) and automatic approach and avoidance tendencies (studies 2 and 3). In addition, in study 3 we investigated if thought suppression (see Moss et al., 2015) would moderate the influence of videos on implicit measures.

Our general hypotheses were that the video depicting the positive consequences of alcohol consumption would increase self-reported approach (inclined-indulgent and obsessed-compelled subscales of the AAAQ) and indices of automatic approach (attentional bias in study 1, automatic approach tendencies in studies 2 and 3), but would not influence self-reported and automatic avoidance, as assessed by the resolved-regulated subscale of the AAAQ and attentional avoidance (study 1) and automatic avoidance tendencies (studies 2 and 3), respectively. By contrast, the video depicting the negative consequences of alcohol consumption would increase self-reported and automatic indices of avoidance, but indices of approach would be unaffected.

## Study 1

The alcohol-related visual probe task (see Field et al., 2004) is a computerized measure of attentional bias that can distinguish between attentional bias toward and attentional bias away from alcohol-related pictorial stimuli (hereafter referred to as attentional avoidance). In each trial of the task, an alcohol-related picture and a matched neutral picture are briefly presented on opposite sides of a computer screen before a visual probe replaces one of the pictures. Participants' manual reaction times to probes are used to infer biases in the allocation of visuospatial attention. An attentional bias for alcohol cues is inferred if the participant is faster to react to probes that replace alcohol pictures (congruent trials), rather than probes that replace neutral pictures (incongruent trials). If, however, this pattern is reversed (i.e., if the participant is faster to respond on incongruent trials), this is interpreted as attentional avoidance of alcohol cues. Biases in automatic attentional capture or delayed disengagement of attention can be inferred by comparing reaction times on these trials with those on other trials in which only neutral pictures are presented (Koster et al., 2004; see Baker et al., 2014). Although the literature on group differences is inconsistent (see Field and Cox, 2008), several studies demonstrated that heavy drinkers who are not seeking treatment have an attentional bias for alcohol cues when those cues are presented for 500 ms or longer (Townshend and Duka, 2001; Field et al., 2004; Baker et al., 2014), and this has been corroborated by studies of eye movements toward those cues (Lee et al., 2014). 
Conversely, alcohol-dependent patients who are tested in treatment settings show initial attentional bias for briefly-presented alcohol cues (50–100 ms), that is followed by attentional avoidance when those cues are presented for longer periods (upwards of 500 ms; Stormark et al., 1997; Noël et al., 2006; Townshend and Duka, 2007; Vollstädt-Klein et al., 2009; Field et al., 2013).
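The inference rule described above can be illustrated with a minimal sketch. It assumes a simple difference score (mean incongruent RT minus mean congruent RT), which is a common convention but is not spelled out in the text; the function names and the reaction times are hypothetical.

```python
from statistics import mean

def attentional_bias(congruent_rts, incongruent_rts):
    """Difference score in ms: mean incongruent RT minus mean congruent RT.

    Assumed convention: positive values mean faster responses to probes that
    replace alcohol pictures; negative values mean the reverse.
    """
    return mean(incongruent_rts) - mean(congruent_rts)

def interpret(bias_ms):
    """Map the sign of the difference score onto the labels used in the text."""
    if bias_ms > 0:
        return "attentional bias toward alcohol cues"
    if bias_ms < 0:
        return "attentional avoidance of alcohol cues"
    return "no bias"

# Hypothetical reaction times (ms) for one participant at one SOA.
congruent = [480, 500, 520]
incongruent = [510, 530, 550]
score = attentional_bias(congruent, incongruent)
print(score, interpret(score))
```

In practice the score would be computed separately for each SOA (50 and 500 ms), so that an initial bias followed by avoidance can be detected.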

In the present study, participants watched a brief video that depicted either the positive consequences of alcohol consumption (alcohol-positive group), the negative consequences of alcohol consumption (alcohol-negative group), or that had no alcohol-related content (control group). Immediately after watching the video, participants completed the AAAQ and an alcohol-related visual probe task in which picture pairs were presented for 50 or 500 ms. We hypothesized that, relative to the control group, participants in the alcohol-positive group would have elevated scores on the inclined-indulgent and obsessed-compelled subscales of the AAAQ, and elevated attentional bias for alcohol cues presented for both 50 and 500 ms; however, scores on the resolved-regulated subscale of the AAAQ would not differ between alcohol-positive and control groups. By contrast, compared to the control group, participants in the alcohol-negative group would have elevated scores on the resolved-regulated subscale of the AAAQ and would exhibit an "approach-avoidance" pattern of attentional bias on the visual probe task, with bias toward alcohol cues presented for 50 ms followed by attentional avoidance of those cues presented for 500 ms; however, scores on the inclined-indulgent and obsessed-compelled subscales of the AAAQ would not differ between the alcohol-negative and control groups.

## Methods

#### Participants

Ninety participants (69 Female, mean age 21.70, SD = 5.04) were recruited from the students and staff at the University of Liverpool via online and poster advertising. Inclusion criteria included fluency in English, age between 18 and 45, normal or corrected-to-normal vision and self-reported alcohol consumption in excess of the current UK government guidelines for safe drinking (these are 14 units per week for females and 21 units per week for males, where 1 unit equals 8 g of alcohol). Exclusion criteria included any history of alcohol use disorders. Participants who had taken part in studies 2 or 3 were ineligible to participate. All participants provided informed consent before taking part in the study, which was approved by the University of Liverpool Research Ethics Committee.

### Materials

#### **Self-report measures**

#### **Timeline followback drinking diary (Sobell and Sobell, 1992)**

Participants indicated their alcohol consumption over the previous 2 weeks. From this, we were able to calculate the total amount of alcohol consumed in standard UK units.
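As a rough illustration of how diary entries translate into units, the sketch below uses the conventional UK calculation (volume in ml multiplied by percentage ABV, divided by 1000), which is an assumption here because the text only states that 1 unit equals 8 g of alcohol. The weekly thresholds are those given in the inclusion criteria; the function names and the diary entries are hypothetical.

```python
# Weekly guideline thresholds as stated in the inclusion criteria.
WEEKLY_GUIDELINE_UNITS = {"female": 14, "male": 21}

def uk_units(volume_ml, abv_percent):
    """Units in one drink, assuming units = volume (ml) x ABV (%) / 1000."""
    return volume_ml * abv_percent / 1000

def weekly_units(diary, weeks=2):
    """Average units per week from a diary of (volume_ml, abv_percent) entries."""
    return sum(uk_units(v, a) for v, a in diary) / weeks

def exceeds_guideline(units_per_week, sex):
    """True if weekly consumption is in excess of the guideline for that sex."""
    return units_per_week > WEEKLY_GUIDELINE_UNITS[sex]

# Hypothetical 2-week diary: fourteen 500 ml drinks at 4% ABV (2 units each).
diary = [(500, 4.0)] * 14
per_week = weekly_units(diary)
print(per_week, exceeds_guideline(per_week, "female"))
```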

#### **Alcohol use disorders identification test (AUDIT; Saunders et al., 1993)**

This 10-item self-report questionnaire contains questions about frequency and quantity of alcohol consumption, and alcohol-related problems and harms. It yields a total score ranging between 0 and 40, with scores of 8 or above indicative of hazardous drinking.

#### **Approach and avoidance of alcohol questionnaire, right now version (AAAQ; McEvoy et al., 2004)**

This 14-item questionnaire assesses subjective tendencies to approach or avoid drinking at that moment in time. Respondents are asked to rate how strongly they agree with each item on a 9-point Likert scale, from 0 (not at all) to 8 (very strong). There are three underlying sub-scales: "Inclined-Indulgent" (mild approach, akin to desire to drink); "Obsessed-Compelled" (strong approach, akin to obsessive thoughts about drinking); and "Resolved-Regulated" (motivation to avoid drinking).

#### **Positive and negative affect schedule (PANAS; Watson et al., 1988)**

The PANAS is a 20-item Likert scale that yields scores on positive affect (PA) and negative affect (NA). Results are reported in the Supplementary Materials.

### **Video questionnaire**

This eight-item questionnaire was developed to measure participants' perception of and engagement with the videos. Participants responded to each item on a 5-point Likert scale, with labels ranging from "strongly disagree" to "strongly agree." Items are shown in Tables S2A–C.

### **Visual probe task (for similar tasks see Field et al., 2004; Koster et al., 2004)**

This task was programmed in Psychopy v.1.74 (Peirce, 2007) and was administered on a desktop computer with a 15-inch monitor. On each trial, a small white fixation cross was presented in the center of the screen for 500 ms. Immediately after offset, a pair of pictures (each 65 mm high × 80 mm wide) was presented on the left and right of the screen, 130 mm apart, for either 50 or 500 ms. Immediately after the screen was cleared, the visual probe (a small white arrow that pointed up or down) was presented on either the left or right side of the screen, in the position that had been occupied by one of the pictures. The probe remained on the screen until participants made a response by pressing a key labeled "up" or "down" on the computer keyboard.

Participants were instructed to rest the index fingers of their left and right hands on the "up" and "down" keys, to fixate on the fixation cross at the beginning of each trial, and to rapidly categorize the visual probe as soon as it appeared. The latency and accuracy of responses were recorded. There was an initial practice block of 10 trials in which four pairs of affectively neutral pictures, taken from the International Affective Picture System (IAPS; Lang et al., 2008) were presented. The main block of trials then followed, and this comprised two different types of trials: alcohol-neutral trials, and neutral-neutral trials. For alcohol-neutral trials, a set of seven alcohol-related pictures were each paired with neutral pictures that depicted items of stationery. We used a subset of picture pairs that had been used in earlier studies (Field et al., 2011; Barkby et al., 2012) and the pictures in each pair were matched on perceptual characteristics including brightness and complexity. On neutral-neutral trials, we used four pairs of affectively neutral pictures from the IAPS, as described above. During the main block of trials, there were 112 alcohol-neutral trials and 64 neutral-neutral trials. Each picture pair was presented 16 times, and picture location (left or right), stimulus onset asynchrony (SOA; 50 or 500 ms), probe position (left or right) and probe type (up or down arrow) were counterbalanced for all picture pairs. Trials were presented in a random order.
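The trial counts in the main block follow from fully crossing the four counterbalanced factors (2 × 2 × 2 × 2 = 16 presentations per picture pair). The sketch below reconstructs such a trial list under that assumption; the pair identifiers, dictionary keys, and function name are hypothetical, and the original PsychoPy implementation may differ.

```python
import random
from itertools import product

# Hypothetical identifiers; the actual picture files are not named in the text.
ALCOHOL_PAIRS = [f"alcohol_pair_{i}" for i in range(1, 8)]  # 7 alcohol-neutral pairs
NEUTRAL_PAIRS = [f"neutral_pair_{i}" for i in range(1, 5)]  # 4 neutral-neutral pairs

def build_main_block(seed=None):
    """Cross every picture pair with all 16 factor combinations, then shuffle."""
    combos = list(product(
        ("left", "right"),  # picture location
        (50, 500),          # stimulus onset asynchrony in ms
        ("left", "right"),  # probe position
        ("up", "down"),     # probe type (arrow direction)
    ))
    trials = []
    for trial_type, pairs in (("alcohol-neutral", ALCOHOL_PAIRS),
                              ("neutral-neutral", NEUTRAL_PAIRS)):
        for pair in pairs:
            for picture_location, soa_ms, probe_position, probe_type in combos:
                trials.append({
                    "trial_type": trial_type,
                    "pair": pair,
                    "picture_location": picture_location,
                    "soa_ms": soa_ms,
                    "probe_position": probe_position,
                    "probe_type": probe_type,
                })
    random.Random(seed).shuffle(trials)  # trials are presented in random order
    return trials

trials = build_main_block(seed=1)
print(len(trials))
```

On alcohol-neutral trials, a trial is congruent when `probe_position` matches `picture_location` (taking the latter as the side of the alcohol picture), so half of those trials are congruent at each SOA.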

## **Video stimuli**

We created three different videos in order to manipulate participants' inclinations to drink alcohol or to refrain from drinking. Videos were created in Windows Movie Maker (version 2.6) and were presented in Windows Media Player (version 7) in full-screen mode on the computer. Participants wore headphones while watching the videos, all of which were 3 min and 45 s in duration. All video files are available from the Corresponding Author on request.

The alcohol-positive video was intended to evoke motivational inclinations to approach alcohol. It comprised still images depicting people having fun while drinking alcohol, together with some text slides that provided information about the positive consequences of drinking and was accompanied by an upbeat soundtrack. The alcohol-negative video was intended to evoke motivational inclinations to avoid alcohol. It comprised still images depicting the negative consequences of drinking, including scenes of alcohol-related violence and vomiting, and other slides depicting graphic government advertisements that warned of the consequences of drink-driving and alcohol-related organ damage and was accompanied by a downbeat soundtrack. The neutral video comprised still photos of office equipment and furniture, and was accompanied by non-descript jazz music. All images were obtained using a Google Images search.

#### Procedure

Participants were randomly allocated to experimental condition. They were tested in a laboratory in the Department of Psychological Sciences at the University of Liverpool. After providing informed consent, participants completed the timeline followback drinking diary, AUDIT, AAAQ, and PANAS (time 1). Then, participants put on the headphones and watched one of the videos (depending on experimental condition), before completing the Video Questionnaire and the AAAQ and PANAS again (time 2). Finally, participants completed the visual probe task. After completing the study, participants were debriefed and offered either course credit or a £5 shopping voucher to compensate them for their time.

#### Results

#### Group Characteristics

Participants reported consuming 20.55 (SD = 11.53) units of alcohol per week, and the mean score on the AUDIT was 12.18 (SD = 5.28). There were no between-group differences in weekly alcohol consumption or AUDIT scores (Kruskal–Wallis tests ps > 0.09), although there was a trend for participants in the alcohol-positive group to be older than participants in the other two groups (Kruskal–Wallis p = 0.05). There were no group differences in gender ratio (χ<sup>2</sup> = 0.45, p > 0.1).

#### Effects of Video Manipulation on AAAQ Ratings (Figure 1A)

AAAQ ratings were analyzed using a mixed design ANOVA, with within-subject factors of sub-scale (3: inclined-indulgent, obsessed-compelled, resolved-regulated) and time (2: before video, after video), and a between-subjects factor of group (3: alcohol-positive, alcohol-negative, control). The sub-scale × time × group interaction was statistically significant [F(4, 174) = 27.05, p < 0.001]. Subsequent post-hoc ANOVAs confirmed that the time × group interaction was significant for all three sub-scales [inclined-indulgent F(2, 87) = 25.29, p < 0.001; obsessed-compelled F(2, 87) = 5.72, p < 0.01; resolved-regulated F(2, 87) = 28.32, p < 0.001].

There were no group differences on any of the AAAQ subscales before participants watched the video [inclined-indulgent F(2, 89) = 1.12, p > 0.1; obsessed-compelled F(2, 89) = 0.26, p > 0.1; resolved-regulated F(2, 89) = 0.22, p > 0.1]. As predicted, groups differed after watching the video on the inclined-indulgent and resolved-regulated sub-scales [inclined-indulgent F(2, 89) = 9.13, p < 0.001; resolved-regulated F(2, 89) = 17.57, p < 0.01], although the group difference fell short of significance for the obsessed-compelled sub-scale [F(2, 89) = 2.93, p = 0.059]. Post-hoc LSD contrasts confirmed that scores on both the inclined-indulgent and obsessed-compelled sub-scales were higher in the alcohol-positive group compared to the alcohol-negative group (ps < 0.01), but this pattern was reversed for the resolved-regulated subscale (p < 0.01). The direct test of our hypotheses requires contrasts between these groups and the control group. These contrasts revealed that alcohol-positive and control groups did not differ on any subscale (p > 0.1). However, scores on the resolved-regulated subscale were higher, and scores on the inclined-indulgent subscale lower, in the alcohol-negative compared to the control group (ps < 0.01). Alcohol-negative and control groups did not differ on the obsessed-compelled subscale (p > 0.1).

Paired-samples t-tests revealed that among participants in the alcohol-positive group, scores on the inclined-indulgent and obsessed-compelled sub-scales increased after watching the video [t(28) = 2.92, p < 0.01 and t(28) = 2.29, p < 0.05], whereas scores on the resolved-regulated sub-scale did not change [t(28) = 0.61, p > 0.1]. A different pattern was seen in the alcohol-negative group: inclined-indulgent and obsessed-compelled scores decreased [t(31) = 6.41, p < 0.001 and t(31) = 2.24, p < 0.05], whereas scores on the resolved-regulated sub-scale increased [t(31) = 6.34, p < 0.001]. In the control group, scores on both the inclined-indulgent and resolved-regulated sub-scales decreased after watching the video, although the former failed to reach significance [t(28) = 1.95, p = 0.06 and t(28) = 2.66, p < 0.05]; scores on the obsessed-compelled sub-scale did not change [t(29) = 0.30, p > 0.1].

We also re-ran the omnibus three-way ANOVA on AAAQ scores but added PANAS positive and PANAS negative affect after the video as covariates. The three-way sub-scale × time × group interaction remained statistically significant [F(4, 170) = 17.25, p < 0.001]. Therefore, statistically controlling for positive and negative mood at the time did not modify the influence of the videos on the AAAQ.

#### Visual Probe Task (Table 1)

TABLE 1 | Mean reaction times (in milliseconds) from the different trials of the visual probe task in study 1.

*Values are means ± SD.*

Data were analyzed in accordance with previous studies (e.g., Field et al., 2004). Firstly, trials with errors were discarded, and then outlying reaction times were removed if they were faster than 200 ms, slower than 2000 ms, and then if they were more than three standard deviations above the individual mean. All data from three participants were excluded as they had an outlying high rate (>28%) of missing data due to errors and outliers. For the remainder of the sample, on average 7% of trials were missing due to errors and a further 1% due to outliers, and these values did not differ between groups (ps > 0.1).
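The reaction time cleaning steps for the visual probe task can be sketched as follows. This is an illustrative reconstruction rather than the original analysis script: it assumes the individual mean and standard deviation are computed on correct trials that survive the 200 to 2000 ms window, and that the sample standard deviation is used. The function name, trial format, and data are hypothetical.

```python
from statistics import mean, stdev

def clean_rts(trials, fast_ms=200, slow_ms=2000, sd_cutoff=3):
    """Return the reaction times retained for one participant.

    Each trial is a dict with keys "rt" (ms) and "correct" (bool).
    """
    # Step 1: discard trials with errors.
    correct = [t["rt"] for t in trials if t["correct"]]
    # Step 2: drop RTs faster than 200 ms or slower than 2000 ms.
    in_window = [rt for rt in correct if fast_ms <= rt <= slow_ms]
    if len(in_window) < 2:
        return in_window  # too few trials to estimate a standard deviation
    # Step 3: drop RTs more than 3 SD above the individual mean (assumed order).
    ceiling = mean(in_window) + sd_cutoff * stdev(in_window)
    return [rt for rt in in_window if rt <= ceiling]

# Hypothetical participant: 20 typical trials plus one error, two trials
# outside the window, and one slow trial that exceeds the 3 SD ceiling.
trials = [{"rt": 500, "correct": True} for _ in range(20)]
trials += [
    {"rt": 450, "correct": False},
    {"rt": 150, "correct": True},
    {"rt": 2500, "correct": True},
    {"rt": 1900, "correct": True},
]
cleaned = clean_rts(trials)
print(len(cleaned), "of", len(trials), "trials retained")
```

The proportion of trials lost per participant (here 4 of 24) is the quantity that would be compared against an exclusion threshold such as the 28% rate mentioned above.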

Mean reaction times for different trial types and SOAs were analyzed using a 3 × 2 × 3 mixed design ANOVA, with within-subject factors of trial type (3: congruent alcohol trials, incongruent alcohol trials, neutral-neutral trials) and SOA (2: 50, 500 ms), and a between-subjects factor of group. The predicted trial type × SOA × group interaction was not statistically significant [F(4, 170) = 1.16, p > 0.1]. There was a significant main effect of SOA [F(1, 85) = 390.10, p < 0.001], indicating faster reaction times on 500 ms trials compared to 50 ms trials. Importantly, the non-significant main effect of trial type [F(2, 84) = 1.09, p > 0.1], and trial type × SOA interaction [F(2, 84) = 2.12, p > 0.1] demonstrate that there was no reliable attentional bias for alcohol cues overall, at either SOA.

#### Discussion

Overall, results from this study did not support the independence of approach and avoidance orientations for alcohol in either controlled or automatic processes. Data from the AAAQ could be interpreted as independence of self-reported approach and avoidance inclinations for alcohol after watching a video depicting the positive consequences of alcohol consumption, because participants who watched this video reported an increase in self-reported approach inclinations (inclined-indulgent and obsessed-compelled subscales) but no corresponding reduction in avoidance inclinations (the resolved-regulated subscale). However, a video that depicted the negative consequences of alcohol consumption prompted an increase in self-reported avoidance inclinations (the resolved-regulated subscale) in parallel with a decrease in self-reported approach inclinations (the inclined-indulgent and obsessed-compelled subscales). Comparisons between these groups and a control group revealed that approach and avoidance inclinations were similar in the control group and the group that had watched the video depicting the positive consequences of alcohol consumption, whereas approach inclinations were lower, and avoidance inclinations higher, in the group that had watched the video depicting the negative consequences of alcohol consumption, compared to the control group.

The visual probe task revealed no evidence of attentional bias or attentional avoidance of alcohol cues in any group (or in the sample as a whole), therefore our hypotheses regarding the influence of the videos on attentional bias can be rejected. One interpretation is that attentional bias is insensitive to experimental manipulations of the motivation to drink or to avoid alcohol, although the absence of attentional bias in the control group argues against this interpretation. In the next study we repeated the general methodology of study 1 so that we were again able to investigate the effects of the different videos on self-reported approach and avoidance of alcohol. Given the null results from the visual probe task, we omitted this task and replaced it with a measure of automatic approach and avoidance tendencies for alcohol cues.

## Study 2

In the alcohol version of the stimulus-response compatibility (SRC) task (Field et al., 2008), a manikin is presented on a computer screen either above or below an alcohol-related or neutral picture. Participants must move the manikin toward or away from the pictures as quickly as possible. On some blocks of the task, participants must make the manikin move toward alcohol pictures and away from neutral pictures, whereas these instructions are reversed in other blocks of the task. Automatic approach tendencies for alcohol cues are inferred if participants are faster to respond on blocks of the task when alcohol pictures require the "approach" movement in comparison to blocks when alcohol pictures require the "avoidance" movement. By contrast, if participants are faster on the "avoid alcohol" blocks compared to the "approach alcohol" blocks, this would suggest that alcohol cues evoke automatic avoidance tendencies. Heavy drinkers who are not seeking treatment display automatic approach tendencies for alcohol cues (Field et al., 2008, 2011; Christiansen et al., 2012; Kersbergen et al., 2015), whereas alcohol-dependent patients may show the opposite pattern, i.e., they are faster to avoid rather than approach alcohol-related pictures (Spruyt et al., 2013; but see Barkby et al., 2012).

Findings obtained from the standard version of the SRC task must be interpreted cautiously because this task yields an index of automatic approach that is relative to automatic avoidance, therefore an apparent bias in automatic approach could be attributed to either strong automatic approach, weak automatic avoidance, or a combination of the two. Among alcohol-dependent patients, if alcohol cues simultaneously evoke strong automatic approach at the same time as strong automatic avoidance, this may explain why they display either no reliable bias on the task (Barkby et al., 2012) or a bias to faster avoidance (Spruyt et al., 2013) depending on the strength of their automatic tendencies to avoid alcohol at the time of testing. In the present study, we overcame this limitation by modifying the task so that it is able to distinguish automatic alcohol approach and avoidance tendencies from each other. This modified version of the task includes neutral movements (to the side) in addition to the standard approach and avoidance movements, and is split into four blocks instead of two (see Baker et al., 2014).

The present study was identical to study 1 with the important difference that participants completed a modified SRC task instead of the visual probe task that was used in study 1. We hypothesized that we would replicate the effects of the videos on the AAAQ that were observed in study 1. Regarding the indices of approach and avoidance tendencies from the SRC task, we hypothesized that, relative to the control group, participants in the alcohol-positive group would show stronger automatic alcohol approach tendencies but these groups would not differ in automatic avoidance tendencies. By contrast, relative to the control group we anticipated stronger automatic avoidance tendencies in the alcohol-negative group, but these two groups would not differ in automatic approach tendencies.

## Methods

#### Participants

Ninety participants (56 Female, mean age 24.56, SD = 5.34) were recruited from the local community and students and staff at the University of Liverpool via online and poster advertising. Inclusion and exclusion criteria were identical to those described for study 1. Participants who had taken part in studies 1 or 3 were ineligible to participate. Participants provided informed consent before taking part in the study, which was approved by the University of Liverpool Research Ethics Committee.

## Materials

## **The modified stimulus-response compatibility task**

The modified stimulus-response compatibility task (Baker et al., 2014) is used to measure automatic approach and avoidance responses evoked by alcohol-related cues. Participants are instructed to rapidly categorize alcohol-related and stationery-related (control) pictures by moving a manikin either toward or away from the pictures, or to the left (neutral movement), as quickly as possible by pressing one of three specific keys on the keyboard, which are labeled with arrows pointing up, down, and left. The task was programmed in Inquisit software (Millisecond Software, 2006) and presented on a laptop computer with a 13 inch screen.

The format of the task, trial structure, and perceptual characteristics of the pictorial stimuli were identical to those used in previous studies (Field et al., 2011; Barkby et al., 2012). Fourteen colored pictures (a subset of the picture set used in Barkby et al., 2012) were used in the task: seven pictures of alcoholic drinks and close-ups of individuals holding or consuming those drinks, and seven control pictures of stationery items and close-ups of models interacting with those items.

There were four sub-blocks of the task, which differed according to task instructions. In the "approach alcohol" block, participants were required to move the manikin toward alcohol pictures, and to the left for stationery pictures. In the "avoid alcohol" block, participants moved away from alcohol pictures and to the left for stationery pictures. In the "approach control" block, participants moved toward stationery pictures and to the left for alcohol pictures. Finally, in the "avoid control" block, participants moved away from stationery pictures and to the left for alcohol pictures. Note that in the case of approach and avoidance movements, the position of the manikin was crucial: if the manikin was above the picture, an "approach" response required participants to press the "down" key, and an "avoidance" response required participants to press the "up" key. This was reversed if the manikin was below the picture. Participants were instructed to respond quickly and accurately on each trial. If they pressed the correct key, the manikin moved up, down or to the left in an animation lasting 500 ms. If they pressed the wrong key, error feedback was provided in the form of a large red cross presented in the center of the screen for 500 ms. There was an inter-trial interval of 500 ms.
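The dependence of the correct key on the manikin's position can be made explicit with a small sketch. This is an illustration only, not the Inquisit script used in the study; the function name `required_key` and the string labels are hypothetical.

```python
def required_key(manikin_position: str, movement: str) -> str:
    """Return the arrow key that produces the requested movement.

    manikin_position: "above" or "below" (relative to the picture).
    movement: "approach", "avoid", or "neutral" (the sideways response).
    All labels are hypothetical names chosen for this sketch.
    """
    if manikin_position not in ("above", "below"):
        raise ValueError(f"unknown manikin position: {manikin_position}")
    if movement not in ("approach", "avoid", "neutral"):
        raise ValueError(f"unknown movement: {movement}")

    # The neutral (sideways) response never depends on manikin position.
    if movement == "neutral":
        return "left"

    # Moving toward the picture means "down" when the manikin starts above
    # it and "up" when the manikin starts below it; avoidance is the reverse.
    toward = "down" if manikin_position == "above" else "up"
    away = "up" if manikin_position == "above" else "down"
    return toward if movement == "approach" else away
```

For example, `required_key("below", "avoid")` returns `"down"`, because a manikin that starts below the picture must move further down to get away from it.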

Each sub-block of the task comprised four practice trials, in which two alcohol pictures and two control pictures were presented, once with the manikin above each picture type and once with the manikin below. If participants did not understand the task, this practice block was repeated. There then followed 28 "critical" trials, in which each of the 14 pictures was presented twice: once with the manikin above the picture and once with the manikin below. Trials were presented in a new random order for each participant. Participants completed the sub-blocks in a counterbalanced order. Responses and reaction times (in milliseconds) to initiate the manikin movement were recorded on each trial.
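The structure of the critical trials in one sub-block (14 pictures, each shown once with the manikin above and once below, in random order) can be sketched as follows. The picture names and the function `build_critical_trials` are placeholders for illustration; the actual stimulus files and randomization routine are not described in the text.

```python
import random


def build_critical_trials(alcohol_pictures, control_pictures, rng=None):
    """Build one sub-block of critical trials.

    Each picture appears twice: once with the manikin above it and once
    with the manikin below it. The trial order is then shuffled.
    """
    rng = rng or random.Random()
    pictures = [(name, "alcohol") for name in alcohol_pictures]
    pictures += [(name, "control") for name in control_pictures]

    trials = [
        {"picture": name, "picture_type": ptype, "manikin_position": position}
        for name, ptype in pictures
        for position in ("above", "below")
    ]
    rng.shuffle(trials)  # a new random order each time the function is called
    return trials


# Hypothetical picture labels: seven alcohol and seven stationery pictures.
alcohol = [f"alcohol_{i}" for i in range(1, 8)]
stationery = [f"stationery_{i}" for i in range(1, 8)]
example_trials = build_critical_trials(alcohol, stationery, random.Random(0))
```

With seven pictures of each type, the list contains 28 trials, matching the number of critical trials per sub-block described above.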

#### Procedure

Participants were randomly allocated to experimental conditions. They were tested in a laboratory in the Department of Psychological Sciences or in quiet public places in which alcohol was not available (e.g., cafes and libraries). After providing informed consent, participants completed the timeline followback drinking diary, AUDIT, AAAQ, and PANAS (time 1). Then, participants put on the headphones and watched one of the videos, before completing the Video Questionnaire and the AAAQ and PANAS again (time 2). Finally, participants completed the SRC task. After completing the study, participants were debriefed and offered either course credit or a £5 Shopping Voucher to compensate them for their time.

#### Results

#### Group Characteristics

Participants reported consuming 30.21 (SD = 23.53) units of alcohol per week, and the mean score on the AUDIT was 12.85 (SD = 5.34). There were no between-group differences in age, weekly alcohol consumption, or AUDIT scores (all Kruskal–Wallis tests p > 0.1). There were no group differences in gender ratio (χ<sup>2</sup> = 0.66, p > 0.1).

#### Effects of Video Manipulation on AAAQ Ratings (Figure 1B)

AAAQ ratings were analyzed using a mixed design ANOVA, with within-subject factors of sub-scale (3: inclined-indulgent, obsessed-compelled, resolved-regulated) and time (2: before video, after video), and a between-subject factor of group (3: alcohol-positive, alcohol-negative, control). The sub-scale × time × group interaction was statistically significant [F(4, 174) = 19.16, p < 0.001]. Subsequent post-hoc ANOVAs confirmed that the time × group interaction was significant for all three sub-scales [inclined-indulgent F(2, 87) = 19.67, p < 0.001; obsessed-compelled F(2, 87) = 5.28, p < 0.01; resolved-regulated F(2, 87) = 13.80, p < 0.001].

Groups did not differ on the inclined-indulgent [F(2, 89) = 1.56, p > 0.1] or obsessed-compelled [F(2, 89) = 1.26, p > 0.1] sub-scales before watching the video. However, there was a group difference in the resolved-regulated sub-scale before the video [F(2, 89) = 3.90, p < 0.05], and post-hoc LSD contrasts revealed that scores were lower in the control group compared to both the alcohol-positive and alcohol-negative groups (ps < 0.05), who did not differ from each other (p > 0.1). As predicted, groups differed on all three sub-scales after watching the video [inclined-indulgent F(2, 89) = 10.78, p < 0.001; obsessed-compelled F(2, 89) = 4.85, p = 0.01; resolved-regulated F(2, 89) = 20.78, p < 0.001]. Post-hoc LSD contrasts revealed that scores on both inclined-indulgent and obsessed-compelled sub-scales were higher in the alcohol-positive group compared to both alcohol-negative and control groups (ps < 0.01), who did not differ from each other (p > 0.1). On the other hand, scores on the resolved-regulated sub-scale were higher in the alcohol-negative group compared to both alcohol-positive and control groups (ps < 0.01), who did not differ from each other (p > 0.08).

Paired-samples t-tests revealed that among participants in the alcohol-positive group, scores on the inclined-indulgent and obsessed-compelled sub-scales increased after watching the video [t(29) = 2.74, p = 0.01 and t(29) = 2.84, p < 0.01], whereas scores on the resolved-regulated sub-scale decreased [t(29) = 2.90, p < 0.01]. The reverse pattern was seen in the alcohol-negative group: the decrease in inclined-indulgent ratings and the increase in resolved-regulated ratings after watching the video were statistically significant [t(29) = 6.70, p < 0.001 and t(29) = 3.15, p < 0.01], although there was no significant change in scores on the obsessed-compelled sub-scale [t(29) = 0.32, p > 0.1]. In the control group, scores on both the inclined-indulgent and resolved-regulated sub-scales decreased after watching the video [t(29) = 2.56, p < 0.05 and t(29) = 2.63, p < 0.05], but scores on the obsessed-compelled sub-scale did not change [t(29) = 0.27, p > 0.1].

We also re-ran the omnibus three-way ANOVA on AAAQ scores but added PANAS positive and PANAS negative affect after the video as covariates. The three-way sub-scale × time × group interaction remained statistically significant [F(4, 170) = 9.55, p < 0.001]. Therefore, statistically controlling for positive and negative mood after the video did not alter the influence of the videos on the AAAQ.

#### SRC Task (Table 2)

Data were analyzed in accordance with previous studies (e.g., Field et al., 2011). Firstly, trials with errors were discarded, and then outlying reaction times were removed if they were faster than 200 ms, slower than 2000 ms, and then if they were more than three standard deviations above the individual mean. All data from three participants were excluded as they had an outlying high rate (>40%) of missing data due to errors and outliers. For the remainder of the sample, on average 5% of trials were missing due to errors and a further 9% due to outliers, and these values did not differ between groups (ps > 0.1).
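The trimming steps described above can be sketched for a single participant as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: it assumes that the individual mean and standard deviation are computed over the correct trials that survive the 200 ms and 2000 ms cut-offs, that the sample standard deviation is used, and that the cut-offs are exclusive. The function names are hypothetical.

```python
from statistics import mean, stdev


def clean_reaction_times(trials, fast_cutoff=200, slow_cutoff=2000, sd_limit=3.0):
    """Apply the trimming steps to one participant's trials.

    trials: list of (rt_ms, correct) tuples.
    Returns (kept_rts, n_errors, n_outliers).
    """
    # Step 1: discard trials with errors.
    correct_rts = [rt for rt, correct in trials if correct]
    n_errors = len(trials) - len(correct_rts)

    # Step 2: remove responses faster than 200 ms or slower than 2000 ms.
    in_range = [rt for rt in correct_rts if fast_cutoff <= rt <= slow_cutoff]

    # Step 3: remove responses more than three SDs above the individual mean
    # (assumption: mean and SD are taken over the trials left after step 2).
    if len(in_range) >= 2:
        ceiling = mean(in_range) + sd_limit * stdev(in_range)
        kept = [rt for rt in in_range if rt <= ceiling]
    else:
        kept = in_range

    n_outliers = len(correct_rts) - len(kept)
    return kept, n_errors, n_outliers


def should_exclude(n_trials, n_errors, n_outliers, max_missing=0.40):
    """Flag a participant whose share of missing trials exceeds the threshold."""
    return (n_errors + n_outliers) / n_trials > max_missing
```

The `max_missing` default of 0.40 mirrors the exclusion threshold reported for this study; a different threshold can be passed for other data sets.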

TABLE 2 | Mean reaction times (in milliseconds) from the different blocks of the SRC task in study 2.


*Values are means* ± *SD.*

Mean reaction times in the different blocks of the task were then analyzed using a mixed design 2 × 2 × 3 ANOVA, with within-subject factors of movement type (2: approach, avoidance) and picture type (2: alcohol, stationery; this refers to the type of picture that the approach or avoidance movement had to be directed toward or away from, with the sideways movement required for the other type of picture), and a between-subjects factor of group. The hypothesized three-way interaction was not statistically significant [F(2, 84) = 0.80, p > 0.1]. There were, however, significant main effects of picture type [F(1, 84) = 4.75, p < 0.05; participants were faster to respond on blocks when the approach or avoidance movement had to be made in response to alcohol pictures rather than stationery pictures], and movement type [F(1, 84) = 4.82, p < 0.05; participants were faster on "approach" blocks than "avoid" blocks of the task]. The picture type × movement type interaction was not statistically significant [F(1, 84) = 0.19, p > 0.1]. Overall, these results show that participants were faster to make approach rather than avoidance movements, and they were faster to make both approach and avoidance movements in response to alcohol pictures in comparison to stationery pictures. However, the video manipulation had no effect on performance on the task.

### Discussion

Consistent with results from study 1, results from this study did not provide clear support for the independence of self-reported approach and avoidance inclinations for alcohol following an experimental manipulation of those inclinations. Between-group contrasts demonstrated that, relative to the control group, scores on the inclined-indulgent and obsessed-compelled subscales of the AAAQ were elevated in participants who had watched a video depicting the positive consequences of alcohol consumption, but this video did not influence scores on the resolved-regulated subscale. The complete opposite pattern was seen in participants who had watched a video depicting the negative consequences of alcohol consumption: increased scores on the resolved-regulated subscale, but there was no difference between this group and the control group in scores on the inclined-indulgent and obsessed-compelled subscales. These contrasts support predictions made by the ambivalence model of craving (Breiner et al., 1999), because they suggest that it is possible to experimentally manipulate subjective approach and avoidance inclinations for alcohol independently of each other. Unfortunately, this conclusion must be heavily caveated given the presence of group differences on the resolved-regulated subscale at baseline, and because within-subject contrasts suggest that increases in approach inclinations were accompanied by decreases in avoidance inclinations in the alcohol-positive group, and vice versa for the alcohol-negative group.

The data from the SRC task also did not support our hypotheses: there was no evidence that participants were faster to approach or slower to avoid alcohol cues compared to stationery (control) cues, and the experimental manipulation did not influence performance on the task. Therefore, even though our experimental manipulation had a clear influence on self-reported approach and avoidance inclinations for alcohol, there were no parallel changes in automatic approach or avoidance tendencies evoked by alcohol cues. In our third and final study, we again investigated the influence of alcohol-positive and alcohol-negative videos on self-reported (AAAQ) and automatic (modified SRC task) indices of approach and avoidance motivational orientations for alcohol, but we combined this with an experimental manipulation of thought suppression in an attempt to maximize the influence of the video manipulation on automatic measures of approach and avoidance.

## Study 3

People who are attempting to reduce their alcohol consumption often attempt to suppress unwanted thoughts, such as intrusive cravings, in order to achieve their goal (Moss et al., 2015). Unfortunately, thought suppression has unwelcome consequences because it increases, rather than decreases, the frequency of intrusive thoughts that are the target of suppression, but only when competing demands are placed on cognitive resources (Wenzlaff and Wegner, 2000). Indeed, attempting to suppress thoughts about alcohol paradoxically increases the accessibility of alcohol-related cognitions as evidenced by increased attentional bias for alcohol words (Klein, 2007) and accessibility of alcohol-related semantic associations (Palfai et al., 1997).

We hypothesized that if participants were primed to think about the positive or negative consequences of alcohol consumption and were then instructed to suppress those thoughts, this should provoke an increase in the accessibility of those thoughts that would manifest itself as a bias in automatic approach or avoidance tendencies in response to alcohol cues. Specifically, participants who viewed a video depicting the positive consequences of alcohol and then attempted to suppress thoughts about alcohol should have stronger automatic approach tendencies for alcohol cues compared to participants who watched the same video but did not attempt to suppress their thoughts. We expected comparable moderating effects of thought suppression on automatic avoidance tendencies in participants who watched a video depicting the negative consequences of alcohol.

## Methods

#### Participants

One hundred participants (53 Female, mean age 27.87, SD = 6.97) were recruited from the local community and students and staff at the University of Liverpool via online and poster advertising. Inclusion and exclusion criteria were identical to those described for studies 1 and 2. Participants who had taken part in studies 1 or 2 were ineligible to participate. All participants provided informed consent before taking part in the study, which was approved by the University of Liverpool Research Ethics Committee.

#### Procedure

Participants were randomly allocated to one of four experimental conditions: (1) alcohol-positive video combined with thought suppression, (2) alcohol-positive video combined with control manipulation, (3) alcohol-negative video combined with thought suppression, or (4) alcohol-negative video combined with control manipulation. Note that none of the participants in this study watched the neutral video that we used in studies 1 and 2. Participants were tested in a laboratory in the Department of Psychological Sciences or in quiet public places in which alcohol was not served (e.g., cafes, libraries). After providing informed consent, participants completed the timeline followback drinking diary, AUDIT, AAAQ, and PANAS (time 1). Then, participants put on the headphones and watched one of the videos, before completing the Video Questionnaire and the AAAQ and PANAS again (time 2).

Participants in the thought suppression groups were then instructed to think about anything, but to make every effort to suppress thoughts of alcohol; the importance of the latter was emphasized. Participants in the control groups were instructed to think about anything that came to mind, including alcohol. All participants were then given 5 min to think freely and to write notes about what they were thinking about on a piece of paper. They were also instructed to place a mark in the right-hand margin of the paper each time they thought about alcohol; these marks were subsequently counted up and cross-checked with the content of participants' notes.

Immediately after this 5-min period, participants were asked to respond to two questions by placing a mark on 100 mm visual analog scales (VAS). The questions, which were the same for both groups, were: "To what extent did you think about alcohol?" (anchors "I did not think about alcohol at all" and "I thought about alcohol a lot") and "To what extent did you succeed in complying with the instructions?" (anchors "totally unsuccessful" and "totally successful").

Participants then completed the SRC task. The thought suppression or control instructions were reiterated to participants before each block of the task. In order to increase demands on working memory, participants were given a seven-digit number at the beginning of each sub-block of the task. They were given 50 s to memorize the number and then instructed to hold it in memory, as they would be asked to recall it at the end of each sub-block of the task. Recall of this number was recorded at the end of each sub-block and the process was repeated with a different number at the beginning of each sub-block (see Bryant et al., 2011). After completing all blocks of the SRC task, participants again completed the two 100 mm VAS to indicate the extent to which they had thought about alcohol, and had complied with instructions, whilst they were doing the task. Finally, participants were debriefed and offered either course credit or a £5 Shopping Voucher to compensate them for their time.

#### Results

#### Group Characteristics

Participants reported consuming 22.53 (SD = 15.63) units of alcohol per week, and the mean score on the AUDIT was 10.56 (SD = 4.52). There was no group difference in gender ratio (χ<sup>2</sup> = 4.62, p > 0.1), although there were group differences in both weekly alcohol consumption and AUDIT scores (Kruskal–Wallis tests, ps < 0.05). Participants in both thought suppression groups had higher weekly alcohol consumption and higher scores on the AUDIT compared to participants in both control groups. Therefore, we repeated all primary analyses (detailed below) with the addition of weekly alcohol consumption and AUDIT scores as covariates.

### Effects of Video Manipulation on AAAQ Ratings (Figure 1C)

AAAQ ratings were analyzed using a mixed design ANOVA, with within-subject factors of sub-scale (3: inclined-indulgent, obsessed-compelled, resolved-regulated) and time (2: before video, after video), and between-subject factors of video group (2: alcohol-positive, alcohol-negative) and thought suppression group (2: thought suppression, control). The sub-scale × time × video group interaction was statistically significant [F(2, 95) = 41.60, p < 0.001] but the four-way interaction sub-scale × time × video group × thought suppression group was not [F(2, 95) = 1.17, p > 0.1]. The three-way interaction remained significant after adding AUDIT scores and weekly alcohol consumption as covariates [F(2, 93) = 38.40, p < 0.001]. Subsequent post-hoc ANOVAs confirmed that the time × video group interaction was significant for all three sub-scales [inclined-indulgent F(1, 98) = 40.51, p < 0.001; obsessed-compelled F(1, 98) = 23.84, p < 0.001; resolved-regulated F(1, 98) = 34.35, p < 0.001].

Groups did not differ on any of the sub-scales before watching the video [inclined-indulgent t(98) = 0.08, p > 0.1; obsessed-compelled t(98) = 1.62, p > 0.1; resolved-regulated t(98) = 1.46, p > 0.1]. As predicted, groups differed on all three sub-scales after watching the video [inclined-indulgent t(98) = 4.06, p < 0.001, higher in the alcohol-positive video group; obsessed-compelled t(98) = 2.15, p < 0.05, also higher in the alcohol-positive video group; resolved-regulated t(98) = 5.68, p < 0.001, higher in the alcohol-negative video group].

Paired-samples t-tests revealed that in the alcohol-positive video group, scores on the inclined-indulgent and obsessed-compelled sub-scales increased after watching the video [t(49) = 1.96, p < 0.05 and t(49) = 3.12, p < 0.01], whereas scores on the resolved-regulated sub-scale decreased [t(49) = 2.20, p < 0.05]. The reverse pattern was seen in the alcohol-negative group: scores on the inclined-indulgent and obsessed-compelled sub-scales decreased after watching the video [t(49) = 6.16, p < 0.001 and t(49) = 3.95, p < 0.001] whereas scores on the resolved-regulated sub-scale increased [t(49) = 5.45, p < 0.001].

We also re-ran the omnibus four-way ANOVA on AAAQ scores but added PANAS positive and PANAS negative affect after the video as covariates. The three-way sub-scale × time × group interaction remained statistically significant [F(2, 93) = 30.79, p < 0.001]. Therefore, statistically controlling for positive and negative mood after the video did not alter the influence of the videos on the AAAQ.

#### SRC Task (Table 3)

All data were missing from two participants due to an experimenter error. Data were analyzed as described for study 2. All data from two additional participants were excluded as they had an outlying high rate (>35%) of missing data due to errors and outliers. For the remaining participants, on average 4% of trials were missing due to errors and a further 7% due to outliers; these values did not differ between groups (ps > 0.1).

TABLE 4 | The effect of thought suppression on self-reported alcohol-related thoughts and task instructions.

*Visual analog scales are 100 mm VAS. Values are means* ± *SD.*

Mean reaction times in the different blocks of the task were then analyzed using a mixed design 2 × 2 × 2 × 2 ANOVA, with within-subject factors of movement type (2: approach, avoidance) and picture type (2: alcohol, stationery; this refers to the type of picture that the approach or avoidance movement had to be directed toward or away from, with the sideways movement required for the other type of picture), and between-subject factors of video group (2: alcohol-positive, alcohol-negative) and thought suppression group (2: thought suppression, control). The hypothesized four-way interaction was not statistically significant [F(1, 92) = 0.99, p > 0.1]. There was, however, a significant main effect of picture type [F(1, 92) = 11.22, p < 0.01] which was subsumed under a significant picture type × movement type interaction [F(1, 92) = 7.85, p < 0.01]. Paired-samples t-tests revealed that participants were significantly faster to approach alcohol rather than control pictures [t(95) = 4.45, p < 0.001], but reaction times to avoid alcohol and control pictures did not differ [t(95) = 0.97, p > 0.1].

There were no other significant main effects or interactions (Fs < 1.68, ps > 0.1), and results were unaffected when the analysis was repeated with AUDIT scores and weekly alcohol consumption added as covariates. Overall, these results show that participants were faster to make approach movements to alcohol pictures than control pictures, but there was no difference in the speed of avoidance movements. Most importantly, neither the video manipulation, nor the thought suppression manipulation, nor the interaction between the two had any effect on performance on the task.

#### Thought Suppression and Working Memory Load Manipulation Checks

Responses on the visual analog scales, and the number of alcohol-related thoughts that participants recorded, are shown in **Table 4**. Each VAS was analyzed using a separate 2 × 2 ANOVA, with between-subject factors of video group (2: alcohol-positive, alcohol-negative) and thought suppression group (2: thought suppression, control). There were no main effects or interactions for the "To what extent did you succeed in complying with the instructions?" VAS at either time (Fs < 2.99, ps > 0.08). There were no main effects or interactions for the "To what extent did you think about alcohol?" question immediately after the thought suppression and thought listing exercise (Fs < 2.24, ps > 0.1), which suggests that the thought suppression manipulation was not effective. However, the main effect of thought suppression group was statistically significant for this question immediately after participants had completed the SRC task [F(1, 99) = 8.74, p < 0.001], as participants in the thought suppression group reported significantly fewer alcohol-related thoughts while completing the SRC task than participants in the control group. There were no other main effects or interactions (Fs < 1.80, ps > 0.1). Finally, there were no significant main effects or interactions for the number of alcohol-related thoughts that participants recorded during the thought suppression and thought listing exercise (Fs < 0.72, ps > 0.1).

Overall, these results indicate that the thought suppression manipulation was not successful, because there were no differences in perceived suppression success or the number of alcohol-related thoughts recorded between thought suppression and control groups during the thought listing exercise. However, when asked immediately after completing the SRC task, participants in the thought suppression group reported that they thought about alcohol significantly less than participants in the control group. Furthermore, the lack of significant thought suppression × video group interactions for these measures suggests that the alcohol-positive and alcohol-negative videos did not have differential effects on the success of attempted thought suppression.

Finally, 97% of participants successfully recalled the 7-digit number at the end of each sub-block of the SRC task, which demonstrates that compliance with the manipulation of working memory load was very high.

#### Discussion

Consistent with the findings from studies 1 and 2, results from this study demonstrated that participants who watched a video depicting the positive consequences of alcohol reported an increase in approach inclinations for alcohol that was accompanied by a reduction in avoidance inclinations; the converse pattern was seen among participants who watched a video depicting the negative consequences of alcohol consumption.

The primary novel feature of this study was the incorporation of a thought suppression manipulation in an attempt to magnify the influence of the videos on automatic approach and avoidance responses evoked by alcohol-related cues. Contrary to our hypotheses, we observed no evidence that the thought suppression manipulation led to an increase in alcohol-related thoughts; by contrast, participants' self-reports indicated that they were able to suppress thoughts about alcohol when instructed to do so. We observed that participants were faster to approach alcohol rather than control pictures, thereby replicating previous findings using a related task (Field et al., 2008, 2011; Christiansen et al., 2012; Kersbergen et al., 2015). However, this pattern of results was unaffected by the videos, the thought suppression manipulation, or the interaction between the two.

## Combined Analysis

Results from all three studies demonstrated that self-reported inclinations to approach and avoid alcohol were not independent of each other: increases in approach inclinations were accompanied by parallel decreases in avoidance inclinations, and vice versa. This interpretation could be bolstered by investigating the strength of the associations between changes in subjective approach and avoidance inclinations for alcohol after exposure to videos depicting the positive and negative consequences of alcohol consumption.

To this end, we combined the AAAQ data from all three studies, but disregarded data from the control groups (participants who watched the neutral video) in studies 1 and 2. Given that the thought suppression manipulation had no influence on the AAAQ in study 3, we collapsed these data across thought suppression groups. This combined analysis, with a total sample size of 221, confirmed that the sub-scale × time × group interaction was highly statistically significant, and that the time × group interactions were highly significant for all three of the AAAQ subscales (all ps < 0.001). Furthermore, paired samples t-tests confirmed that in the combined alcohol-positive group, scores on the inclined-indulgent and obsessed-compelled subscales increased, whereas scores on the resolved-regulated subscale decreased, after watching the video. The reverse pattern was seen in the combined alcohol-negative group. Data are shown in Table S4.

We then computed change scores to capture the change in each AAAQ subscale after participants watched the videos. By correlating these change scores with each other we were able to investigate the strength of the association between changes in self-reported approach and avoidance inclinations: as approach inclinations increase, do avoidance inclinations decrease by a similar magnitude (and vice versa)? Overall, intercorrelations between these change scores were statistically significant but small. After watching a video depicting the positive consequences of alcohol consumption, the magnitude of the increase in scores on the inclined-indulgent and obsessed-compelled subscales was associated with the magnitude of the decrease in scores on the resolved-regulated subscale, although the size of the correlation coefficients (r = −0.20) suggests only 4% shared variance. After watching a video depicting the negative consequences of alcohol consumption, the magnitude of the increase in scores on the resolved-regulated subscale was weakly associated with the reduction in scores on the inclined-indulgent subscale (r = −0.36; 13% shared variance), but was unrelated to the reduction in scores on the obsessed-compelled subscale.
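The link between the correlation coefficients and the shared-variance percentages quoted above is the squared correlation, r². The sketch below shows the arithmetic together with a generic change-score and Pearson correlation computation. It assumes that change scores are post-video minus pre-video ratings, which the text does not state explicitly, and the helper names are hypothetical.

```python
from math import sqrt


def change_scores(before, after):
    """Post-video minus pre-video score for each participant (assumed direction)."""
    return [a - b for b, a in zip(before, after)]


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def shared_variance_percent(r):
    """Percentage of variance shared by two variables: 100 * r squared."""
    return 100 * r ** 2


# The two coefficients reported in the text.
print(round(shared_variance_percent(-0.20)))  # 4
print(round(shared_variance_percent(-0.36)))  # 13
```

Because r is squared, the sign of the correlation does not affect the shared variance; it only indicates that the two change scores move in opposite directions.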

Finally, we are grateful to the reviewer who suggested the following additional analyses, which are detailed in Table S5. We performed hierarchical linear regression analyses to investigate if the change score for self-reported approach inclinations would be predicted by the video manipulation, even after entering the change score for self-reported avoidance inclinations in the first step of the model. With the score on the inclined-indulgent subscale as the dependent variable, the score on the resolved-regulated subscale accounted for 22% of variance [F(1, 279) = 77.29, p < 0.01], but addition of experimental group as a predictor in the subsequent step of the regression accounted for an additional 8% of variance [ΔF(1, 277) = 29.73, p < 0.01]. Similarly, with the score on the obsessed-compelled subscale as the dependent variable, the score on the resolved-regulated subscale accounted for 7% of variance [F(1, 279) = 22.20, p < 0.01], but addition of experimental group as a predictor accounted for an additional 6% of variance [ΔF(1, 277) = 18.42, p < 0.01].

However, we did not observe the same pattern when scores on the resolved-regulated subscale were entered as the dependent variable, and scores on the inclined-indulgent and obsessed-compelled subscales were entered as independent variables (in separate analyses). In the first case, after accounting for the 22% of variance attributable to the inclined-indulgent subscale [F(1, 279) = 77.29, p < 0.01], the addition of experimental group as a predictor did not account for additional variance in scores on the resolved-regulated subscale [<1% of additional variance; ΔF(1, 277) = 1.07, p = 0.30]. Similarly, after accounting for the 7% of variance attributable to the obsessed-compelled subscale [F(1, 279) = 22.20, p < 0.01], the addition of experimental group as a predictor did not account for additional variance in scores on the resolved-regulated subscale [<1% of additional variance; ΔF(1, 277) = 0.20, p = 0.66].
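The logic of these hierarchical regressions is that the second step is judged by how much variance it adds beyond the first. A minimal sketch of the standard F-change statistic is given below; it takes the variance explained at each step as input rather than refitting the models, and the function name and arguments are hypothetical. The example call uses the rounded percentages quoted in the text, so its result only approximates the reported statistic.

```python
def f_change(r2_reduced, r2_full, n_added, df_residual):
    """F statistic for the variance added by the second step of a regression.

    r2_reduced:  proportion of variance explained by the step-1 model.
    r2_full:     proportion of variance explained after adding predictors.
    n_added:     number of predictors added in the second step.
    df_residual: residual degrees of freedom of the full model.
    """
    if not 0 <= r2_reduced <= r2_full < 1:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    added_variance = (r2_full - r2_reduced) / n_added
    unexplained_variance = (1 - r2_full) / df_residual
    return added_variance / unexplained_variance


# Rounded values from the text: 22% at step 1, a further 8% at step 2,
# with one added predictor and 277 residual degrees of freedom.
approx_f = f_change(0.22, 0.30, n_added=1, df_residual=277)
```

When the added predictor explains almost no extra variance, as in the analyses with the resolved-regulated subscale as the dependent variable, the numerator approaches zero and the F-change statistic becomes small.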

These analyses reveal that self-reported approach and avoidance inclinations for alcohol can operate at least partly independently of each other, but the effect is not symmetrical. Changes in self-reported avoidance inclinations after watching videos depicting the positive or negative consequences of alcohol consumption were completely accounted for by changes in self-reported approach inclinations. However, the changes in self-reported approach inclinations that were evoked by these videos were at least partly independent of changes in self-reported avoidance inclinations.

## General Discussion

A number of consistent findings emerged from the three studies reported here. When participants viewed short videos that depicted the positive or negative consequences of alcohol consumption, their self-reported approach and avoidance inclinations for alcohol tended to change in parallel: as approach inclinations increased, avoidance inclinations decreased, and vice versa. However, between-group contrasts suggested some degree of independence of approach and avoidance inclinations, although these findings were not consistent across studies. In addition, although changes in approach and avoidance inclinations tended to be inversely correlated (as one increased, the other decreased), these correlations were small and there was some evidence that changes in approach inclinations were partly independent of changes in avoidance inclinations. Finally, we found no evidence in support of our predictions that alcohol-related implicit cognitions would be influenced by experimental manipulations of motivational orientations for alcohol, regardless of whether we measured attentional biases or automatic approach / avoidance tendencies, or whether the experimental manipulation was combined with a thought suppression exercise.

One of the primary aims of these studies was to expand on findings from previous studies that used the Approach and Avoidance of Alcohol Questionnaire (AAAQ) in order to test predictions made by the ambivalence model of craving (Breiner et al., 1999). Specifically, if subjective approach and avoidance inclinations for alcohol are independent of each other, it should be possible to dissociate them by exposing participants to experimental manipulations that are designed to increase one but not the other. Overall, our findings demonstrated that approach and avoidance inclinations tend to change in parallel: as one increased, the other tended to decrease. This casts doubt on the independence of these constructs as predicted by the ambivalence model. Specifically, within-subject contrasts revealed that, after watching a video depicting the positive consequences of alcohol consumption, approach inclinations increased and avoidance inclinations decreased, whereas the reverse pattern was seen in participants who watched a video depicting the negative consequences of alcohol consumption. Although there were some minor inconsistencies between studies, the results of a combined analysis of data from all studies confirmed that approach and avoidance inclinations tended to fluctuate alongside each other. Furthermore, the combined analysis revealed that the magnitudes of changes in approach and avoidance inclinations over time were reliably negatively correlated with each other: the magnitude of the increase in the strength of approach inclinations was associated with the magnitude of the corresponding decrease in avoidance inclinations, and vice versa. Finally, the regression analysis confirmed that variation in the change in avoidance inclinations after watching the videos was completely accounted for by the change in approach inclinations.

However, other analyses suggested that self-reported approach and avoidance inclinations could be characterized as at least partly independent of each other. Although the combined analysis confirmed inverse correlations between the magnitude of changes in approach and avoidance inclinations over time, these relationships were weak (with, at most, 13% shared variance), and regression analyses demonstrated that changes in approach inclinations were at least partly independent of changes in avoidance inclinations. More importantly, between-subject contrasts with a control group suggested that approach inclinations increased without a corresponding change in avoidance inclinations, and vice versa. Unfortunately, we have limited confidence in these group differences because they were only seen in study 2; in study 1, group differences in approach inclinations were accompanied by group differences in avoidance inclinations.

Our findings are consistent with some previous observations that subjective approach and avoidance inclinations for alcohol tend to change in parallel after exposure to appetitive alcohol cues (Curtin et al., 2005), although one study reported a dissociation between approach and avoidance inclinations after cue exposure (Jones et al., 2013). Importantly, the studies reported here are the very first to investigate the influence of an experimental manipulation that was intended to activate motivational orientations to avoid drinking; findings from participants in the alcohol-negative groups in all studies clearly suggest that this manipulation led to the predicted increase in self-reported avoidance inclinations that was accompanied by a decrease in approach inclinations. However, it is important to clarify that previous studies demonstrated that approach and avoidance inclinations have independent predictive validity for individual differences in drinking behavior and prospective drinking behavior (Curtin et al., 2005; Schlauch et al., 2012, 2013a,b,c; Klein and Anker, 2013). The three studies reported here cannot speak to the predictive validity of these constructs.

In contrast to the robust effects on self-reported approach and avoidance motivational orientations (assessed with the AAAQ), the videos depicting the positive and negative consequences of alcohol consumption had no effect on measures of motivational orientations operating within automatic processes, that is attentional biases (study 1) and approach / avoidance tendencies (studies 2 and 3). To our knowledge, these are the first studies that experimentally manipulated the motivation to avoid drinking and our findings suggest that these automatic processing biases are impervious to motivational orientations to avoid drinking. This interpretation is consistent with findings from an earlier study (Baker et al., 2014) in which we attempted to prime motivational orientations by presenting alcohol-positive and alcohol-negative primes below the threshold of conscious awareness; this manipulation also failed to influence attentional biases and approach / avoidance tendencies. However, on the basis of previous demonstrations of robust, albeit weak associations between subjective craving (typically assessed with self-report instruments that capture only "approach" inclinations) and attentional bias (Field et al., 2009), and demonstrations that experimental manipulations of craving such as negative mood induction, exposure to alcohol cues, and acute alcohol intoxication all lead to increases in attentional bias (see Field and Cox, 2008), we anticipated elevated attentional biases in the "attend positive" vs. the control groups. This pattern of results was not seen.
Perhaps most importantly, with one exception (the bias to more rapidly approach alcohol rather than control images in study 3), we found no evidence of attentional or approach or avoidance biases in any of the studies, and no evidence that individual differences in alcohol consumption, hazardous drinking or scores on the AAAQ were associated with these implicit processing biases in any of the studies (see Supplementary Materials). These findings cast doubt on the validity and sensitivity of the tasks that were used in the current studies, all of which were slightly modified versions of tasks that are more commonly used in the literature. Future investigations of this research question should attempt to develop measures of approach and avoidance inclinations operating in automatic processes that have acceptable construct validity and sensitivity for this purpose.

The studies reported here have other weaknesses in addition to the questionable construct validity of the implicit measures. All participants consumed alcohol in excess of UK government guidelines and therefore their alcohol consumption was placing their health at risk. However, we did not attempt to recruit participants who were concerned about or attempting to limit their alcohol consumption, and we did not measure participants' motivation to change using a validated self-report measure, so it is possible that our participants were relatively insensitive to our experimental manipulations that were designed to exaggerate their ambivalence about alcohol consumption. Future studies could investigate this issue by recruiting heavy drinking participants who are currently motivated to reduce their alcohol consumption (and are actively attempting to do so), because motivational orientations in these participants might be expected to be more sensitive to the experimental manipulations that were used in the present study. A further limitation is that we did not record participants' occupational or socioeconomic status so we are unable to fully characterize participants who took part in these studies. In addition, it is possible that the videos had robust effects on self-report but not computerized measures because participants always completed the former before the latter; this could be investigated by counterbalancing the order in which assessments are administered in future studies. Our study also had strengths, including measurements of participants' subjective mood after they had watched the videos, which enabled us to rule out changes in mood as a contributor to the influence of those videos on self-reported and automatic motivational orientations.

In conclusion, findings from the three studies reported here question the degree of independence of self-reported approach and avoidance inclinations for alcohol, which tended to co-vary in response to experimental manipulations of inclinations to drink or inclinations to avoid alcohol. However, results from a combined analysis of data from all studies suggest that changes in inclinations to drink may be at least partially independent of changes in inclinations to avoid alcohol. Our findings also suggest that measures of alcohol-related motivational orientations that operate in automatic processes are impervious to these experimental manipulations, although the modified tasks that we used here have questionable construct validity and sensitivity which suggests that these findings should be interpreted with caution.

## Funding

Funded by a research grant from the Wellcome Trust, reference 086247/Z/08/Z, awarded to MF and JD.

## Acknowledgments

We thank Abigail Kirkland, Georgina Schwarz, and Michael Tapp for their help with data collection.

## Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2015.01465



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Di Lemma, Dickson, Jedras, Roefs and Field. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Embracing comorbidity: a way toward understanding the role of motivational and control processes in cannabis use disorders

Janna Cousijn\*

*Departments of Developmental and Experimental Psychology, Utrecht University, Utrecht, Netherlands*

Keywords: cannabis use disorders, depression, anxiety, comorbidity, motivation, control

## Background

Although the general public perceives cannabis as one of the less harmful illicit drugs, the past decades saw a surge in treatment demands for CUDs (UNODC, 2014). Cannabis nowadays is the primary illicit drug of concern in drug treatment services across North America, Oceania and Africa (UNODC, 2014). The low perceived harms of cannabis use are reflected in the small number of studies investigating the neurocognitive processes underlying CUDs [e.g., only 3 published functional Magnetic Resonance Imaging (fMRI) studies in individuals with a diagnosed CUD compared to controls, contrasting more than 1000 studies in individuals with an Alcohol Use Disorder]. Most studies on the mechanisms underlying cannabis abuse, including my own, investigated heterogeneous groups of chronic or heavy cannabis users with various levels of cannabis use related problems, not groups with diagnosed CUDs.

#### Edited by:

*Frank Ryan, Imperial College London, UK*

#### Reviewed by:

*Anne Marije Kaag, University of Amsterdam, Netherlands*

> \*Correspondence: *Janna Cousijn, j.cousijn@gmail.com*

#### Specialty section:

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

Received: *06 February 2015* Accepted: *08 May 2015* Published: *27 May 2015*

#### Citation:

*Cousijn J (2015) Embracing comorbidity: a way toward understanding the role of motivational and control processes in cannabis use disorders. Front. Psychol. 6:677. doi: 10.3389/fpsyg.2015.00677*

Even though a substantial part of regular cannabis users will not experience any clear negative social and health consequences of cannabis, this does not imply that CUDs are less severe than other Substance Use Disorders (SUDs). The mental health issues associated with CUDs are substantial and often include comorbid psychiatric disorders including depression and anxiety (Stinson et al., 2006). Moreover, CUDs are difficult to treat and long-term abstinence is achieved by fewer than 20% (Danovitch and Gorelick, 2012). This urgently calls for a better understanding of CUDs. It is therefore time to reach out to those coping with CUDs by studying the mechanisms underneath. The goal of this opinion article is twofold: First, I want to address the strong need for neurocognitive studies in CUDs. Second, I propose that studying neurocognitive commonalities and differences between CUDs and comorbid disorders like depression and anxiety has great potential to unravel the mechanisms underlying CUDs and to eventually reveal new treatment targets.

## Motivational and Control Processes in Cannabis Use Disorders

Strong motivations towards drug use (e.g., craving, automatic tendencies to attend to and approach the drug), paired with an insufficient capacity to keep these under control, are thought to play a prominent role in SUDs (Goldstein and Volkow, 2002; Robinson and Berridge, 2003; Dawe and Loxton, 2004; Everitt and Robbins, 2005; Wiers et al., 2007; Verdejo-Garcia and Bechara, 2009). Recent behavioral studies suggest that this is also the case in CUDs: confrontation with cannabis or related objects and contexts (i.e., cues) can trigger craving (e.g., Gray et al., 2011; Lundahl and Johanson, 2011), capture attention (attentional bias; e.g., Cousijn et al., 2013b; Asmaro et al., 2014), and activate approach tendencies (approach bias; e.g., Field et al., 2006; Cousijn et al., 2011). In addition, cognitive control-related functions like planning, organizing, problem solving, decision-making, and working-memory appear to be impaired in individuals with a CUD (Fernandez-Serrano et al., 2011). Chronic cannabis exposure may (temporarily) impair cognitive control, but cognitive control deficits may also be a risk factor for the onset of cannabis use and escalation into CUDs (Cousijn et al., 2014).

## Embracing Comorbidity as a Tool

While the comorbidity between CUDs and other psychiatric disorders is widely accepted, neurocognitive studies mostly study disorders in isolation. Comorbid symptoms are often even controlled for by excluding such participants. A 3-year longitudinal epidemiological study specifically investigated the role of mental health factors in non-dependent versus dependent heavy cannabis use (Van Der Pol et al., 2013). Although externalizing psychiatric disorders like ADHD and conduct disorder were common to non-dependent and dependent users, internalizing psychiatric disorders such as mood and anxiety disorders were uniquely associated with dependence. Combined, the 283 almost daily cannabis users that participated in my previous studies revealed a correlation of r = 0.50 between cannabis use-related problems and depression symptoms (e.g., Cousijn et al., 2012; Beraha et al., 2013; Cousijn et al., 2013a,b). As in SUDs, neurocognitive models of depression (Weir et al., 2012) and anxiety disorders (Bruhl et al., 2014) stress the importance of dyscontrol over motivational processes and abnormal functioning of the underlying brain systems in the emergence of these disorders. Fronto-parietal and fronto-limbic brain networks are thought to play a key role in this (**Figure 1**; Seeley et al., 2007). The fronto-parietal network is thereby the main substrate for relatively cold executive control (e.g., working memory, attention, inhibition). The fronto-limbic network is primarily involved in emotion regulation, salience attribution and the integration of motivational information (e.g., reward, emotions) into decision processes.

The overlap in neurocognitive mechanisms underlying SUDs with depression and anxiety appears evident. Little is known, however, about why certain symptoms cluster together and what differentiates disorders. From a clinical perspective, the vague boundaries between psychiatric disorders, the heterogeneity in psychiatric problems within patient groups and the poor treatment response in a substantial number of patients also underline the need to look beyond dichotomous disorder classifications (Casey et al., 2013). The new edition of the diagnostic statistical manual (DSM-V) introduced stages of disorder severity but still relies on self-reports (American Psychiatric Association, 2013). In the quest to identify more objective biomarkers for psychiatric problems, the US National Institute of Mental Health recently called for a transdiagnostic dimensional approach in the study of psychiatric disorders, in which the neurobiology underlying symptom dimensions is central, not the disorder classification itself (Casey et al., 2013). Embracing comorbid psychiatric problems, rather than factoring them out in neurocognitive studies, is an important step in this and I believe that such an approach has great potential to advance our knowledge of psychiatric disorders, including CUDs. An additional advantage of such an approach is that participants with comorbid problems are more representative of individuals with (sub-threshold) psychiatric problems in the general population and of patients in treatment.

Studying the common and unique neurocognitive mechanisms underlying psychiatric disorders may help us to identify new biomarkers that could advance prevention and treatment. In the case of CUDs, we can only speculate about the neurocognitive mechanisms underlying CUDs, let alone understand why depression and anxiety disorders are associated with CUDs. Cognitive control deficits and malfunctioning of the underlying fronto-parietal brain networks may be shared between all three disorders, posing a general risk factor for the development of CUDs, depression and anxiety disorders (Koob and Volkow, 2010; Weir et al., 2012; Cousijn et al., 2013b; Bruhl et al., 2014; Peterson et al., 2014). In contrast, motivational processes within specific emotional and rewarding contexts may differentiate disorders. Although abnormal approach-avoidance behavior is common to all three disorders, depression and anxiety are associated with overactive avoidance of certain social and emotional situations (Trew, 2011; Caouette and Guyer, 2014), whereas CUDs may be associated with overactive approach of cannabis cues (Cousijn et al., 2011). Moreover, SUDs including CUDs and depression are both characterized by low positive affect (anhedonia) and abnormal reward responsiveness within various fronto-limbic brain areas (Koob and Volkow, 2010; Hatzigiakoumis et al., 2011; Elman et al., 2013; Morgan et al., 2013; Telzer et al., 2014). Interestingly, a recent PET study among 14 heavy cannabis users showed a link between anhedonia and reduced dopamine transmission in the striatum (Bloomfield et al., 2014). Unlike CUDs, depression and anxiety further show abnormal processing of social emotional stimuli in the amygdala (Burghy et al., 2012; Weir et al., 2012; Caouette and Guyer, 2014).
Further, amygdala connectivity with the ventral medial prefrontal cortex may differentiate between anxiety and depression by uniquely contributing to certain symptoms (Mcclure et al., 2007; Beesdo et al., 2009; Burghy et al., 2012).

Genetics are also known to play an important role in the risk for CUDs, depression and anxiety disorders. Motivational and control processes are influenced by genetic factors, including genes involved in drug metabolism and neurotransmission (Sweitzer et al., 2012). Motivational and control processes may thereby, at least partly, mediate the genetic vulnerability to all three disorders. For example, the D2 dopamine receptor gene (DRD2) Taq1 A polymorphism affects dopamine binding in the striatum and is consistently associated with SUDs, depression and anxiety disorders (gene-disorder association indices retrieved from Gene Prospector; Yu et al., 2008). The A1 allele of the DRD2 Taq1 A polymorphism has been linked to reduced dopamine D2 receptor availability in the striatum, which could in turn reduce general reward responsiveness (Belcher et al., 2014). Another polymorphism consistently associated with all three disorders is COMTval158met (Yu et al., 2008). The COMT gene encodes an enzyme that is involved in the inactivation of catecholamine neurotransmitters like dopamine, epinephrine, and norepinephrine. The COMTval158met polymorphism has been linked to altered dopamine signaling in the prefrontal cortex, thereby influencing cognitive control (Bruder et al., 2005). It is important to note that CUDs, depression and anxiety disorders are polygenetic. Single genes are often only weakly associated with the risk for certain psychiatric disorders. To investigate genetic factors underlying polygenetic disorders, large-scale multicenter genome-wide studies are needed. To allow small studies to contribute DNA data to large-scale consortia, DNA data collection should be facilitated for new studies, even though the primary objectives do not necessarily comprise genetics. Moreover, epigenetics should be considered, that is, the processes involved in long-term changes in gene expression that are heritable to daughter cells.
Interestingly, a recent study in mice showed that a single epigenetic mechanism (histone methylation of fosb) can influence gene expression in the nucleus accumbens

## A Critical Note

Although knowledge of the common and unique neurobiology underlying comorbid disorders could identify biomarkers, researchers and clinicians should carefully evaluate and compare the clinical value of such measures for the individual patient. Our group-based findings may not necessarily translate to the individual. Also, neuroimaging techniques are expensive compared to questionnaires and neuropsychological tests. It is therefore important to explicitly test if certain neural indices explain unique variance on top of simpler (and cheaper) methods.

## Conclusions

The high worldwide treatment demand for CUDs, combined with the significant lack of studies investigating them, warrants new studies of neurocognitive functions in cannabis users with a clinically diagnosed CUD. Uncovering the common and unique neurocognitive mechanisms and associated (epi)genetics underlying CUDs and highly comorbid disorders like depression and anxiety can provide valuable knowledge for improving current state-of-the-art treatments and for developing new neuroscience based interventions, such as neurocognitive training (e.g., approach-action retraining; Wiers et al., 2011), neuromodulation (e.g., stimulating brain areas involved in control; Ressler and Mayberg, 2007; Berlim et al., 2013; Da Silva et al., 2013) and pharmacotherapy (e.g., medication that enhances emotion regulation; Ressler and Mayberg, 2007; Sofuoglu, 2010; Mohler, 2012; Farb and Ratner, 2014). I reiterate that it is vital to study motivational processes and cognitive control in ecologically valid groups of individuals, that is, by including those coping with comorbid psychiatric problems.

## Acknowledgments

JC is supported by The Consortium Individual Development (CID). CID is funded through the Gravitation program of the Dutch Ministry of Education, Culture, and Science and the Netherlands Organization for Scientific Research (NWO grant number 024.001.003 awarded to Chantal Kemner, Utrecht University, The Netherlands).



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Cousijn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The computational psychiatry of reward: broken brains or misguided minds?

#### *M. Moutoussis1\*, G. W. Story1,2 and R. J. Dolan1,3*

*<sup>1</sup> Wellcome Trust Centre for Neuroimaging, University College London, London, UK, <sup>2</sup> Centre for Health Policy, Institute of Global Health Innovation, Imperial College, London, UK, <sup>3</sup> Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK*

Research into the biological basis of emotional and motivational disorders is in danger of riding roughshod over a patient-centered psychiatry and falling into the dualist errors of the past, i.e., by treating mind and brain as conceptually distinct. We argue that a psychiatry informed by computational neuroscience, computational psychiatry, can obviate this danger. Through a focus on the reasoning processes by which humans attempt to maximize reward (and minimize punishment), and how such reasoning is expressed neurally, computational psychiatry can render obsolete the polarity between biological and psychosocial conceptions of illness. Here, the term 'psychological' comes to refer to information processing performed by biological agents, seen in light of underlying goals. We reflect on the implications of this perspective for a definition of mental disorder, including what is entailed in asserting that a particular disorder is 'biological' or 'psychological' in origin. We propose that a computational approach assists in understanding the topography of mental disorder, while cautioning that the point at which eccentric reasoning constitutes disorder often remains a matter of cultural judgment.

#### *Edited by:*

*Nikolina Skandali, University of Cambridge, UK*

## *Reviewed by:*

*Michelle Dow Keawphalouk, Harvard University – Massachusetts Institute of Technology, USA Stefan Borgwardt, University of Basel, Switzerland*

#### *\*Correspondence:*

*M. Moutoussis, Wellcome Trust Centre for Neuroimaging, University College London, 12 Queen Square, London WC1N 3BG, UK m.moutoussis@ucl.ac.uk*

#### *Specialty section:*

*This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology*

*Received: 26 June 2015 Accepted: 09 September 2015 Published: 29 September 2015*

#### *Citation:*

*Moutoussis M, Story GW and Dolan RJ (2015) The computational psychiatry of reward: broken brains or misguided minds? Front. Psychol. 6:1445. doi: 10.3389/fpsyg.2015.01445*

Keywords: computational psychiatry, dualism, optimality, psychiatric nosology, Bayesian inference

*I'm gonna, I'm gonna lose my baby/So I always keep a bottle near [The psychiatrist] said, "I just think you're depressed."/This, me, yeah, baby, and the rest. A. Winehouse (2007), musician who died of alcohol intoxication in 2011*

## Introduction

The idea that reward processing is important in emotional and motivational psychiatric disorders comes from a view of the mind-as-decision-maker. This idea has been developed within the nascent field of computational psychiatry, the clinical offshoot of computational neurobiology. Within this framework, 'psychiatric disorder' entails a breakdown in the brain's ability to optimize decisions. Thus, to the extent that good decisions set up the individual to optimally obtain reward, 'psychiatric disorder' entails a suboptimal seeking of reward within an environment. As an approach, computational psychiatry promises much by way of future diagnostic and therapeutic applications (Huys et al., 2011; Montague et al., 2012).

We are of course mindful that psychiatry has seen many promising directions that have delivered much less than hoped. In this article we argue that computational psychiatry has already made major contributions in resolving important conceptual divides in mental health. These have been expressed in varying ways but are located around biological/psychological – diagnostic/whole-person polarities (Boyle and Johnstone, 2014; Hayes and Bell, 2014). This has led to a situation where biological research is accused of shocking oversimplification of the mind, and psychosocial research accused of an equally shocking neglect of the brain ('mindlessness vs. brainlessness'). Intimately related is the question of when psychiatric intervention is justified to address mental symptoms<sup>1</sup>. Here medical professionals may inappropriately diagnose and prosecute biological interventions (Szasz, 1960), while psychological therapists can be just as disempowering (Dolnick, 1998; Romito, 2008). These splits, like old religious conflicts in Europe, concern resource or power struggles among 'denominations' as much as they concern disagreements of substance (Bentall, 2009). It is important to note that a resolution of the latter, to which our present work contributes, may only make slow inroads into the former.

An unhealthy mind is one disposed to make bad decisions and there is no end of examples in psychiatry. Decisions are not just the sine qua non of overt actions, such as a decision of a patient with depression to stay all day in bed, or drink a lethal quantity of vodka and die. We are also 'deciding' when we believe a proposition such as 'my wife has been replaced by a double,' or believe our senses when they inform us that 'I look fat' – as in the body image distortion seen in anorexia – right through to a conclusion that 'the voice is real' in psychosis. Good decisions on the other hand entail those (among others) that lead to a healthy life, maintain safety and successful reproduction. Computational psychiatry goes further, postulating that *healthy organisms take optimal decisions, given their resources*. 'Good decisions' cannot but be those that successfully obtain the 'best reward,' those which are good in life and for life. We can call this the Leibnitz<sup>2</sup> principle – the best possible world of decision-making is with us. Within this framework, 'psychiatric disorder' entails an inability to optimize decisions. Thus, to the extent that good decisions set up an individual to optimally obtain reward, 'psychiatric disorder' entails a suboptimal reaping of possible reward.

Common sense tells us that motivational and emotional disorders are central to most psychiatric disorders such as drug dependence, clinical depression or schizophrenia. For example, craving is a key motivational disturbance and DSM5 rightly includes it in the 11 criteria of Substance Use Disorder (APA, 2013). In disorders of mood, the symptom of anhedonia is a core criterion of clinical depression, marking it as a motivational and emotional disorder. It is also likely that the fears expressed within persecutory delusions are signs of deeply disordered emotional processing, whereby a diseased brain has recruited basic motivational and emotional mechanisms, originally meant to warn and protect the individual against dire threat, in a completely unwarranted fashion. It may also seem obvious that psychopaths can be construed as individuals inadequately motivated by the pain of others.

At the same time the study of reward in Psychiatry necessitates a widening of the scope of classic computational neurobiology to take seriously the *subjective experience* of motivational and emotional symptoms. Psychiatry is first and foremost a branch of medicine, not of engineering. Psychiatrists recommend biological, psychological and social interventions first and foremost in order to alleviate the suffering of a patient, and those around the patient. Unlike other disciplines, changing people's behavior is not the final goal but a part – usually a very important part – of restoring health. Conversely, understanding *behavior* motivated by reward and loss is important for psychiatric research. If we were concerned with physical trauma or viral illnesses, a thorough understanding of the body's mechanisms of immunity and tissue repair would be important, while supporting and correcting such processes would constitute practicing medicine. On the one hand, health research strives to understand both the physiology (the healthy function) and the pathophysiology (function-in-illness) of an underlying biological substrate. On the other, the clinician helps people who suffer as best as possible, while neither over- nor under-applying their craft, as condensed in the dictum 'only the expert surgeon knows when *not* to operate.' As there is much suffering which medical interventions do not help, much maladaptive behavior and loss-related suffering is within the frame of scientific interest but outside the clinical scope of psychiatry.

Computational psychiatry focuses on those brain-based mechanisms which strive to optimize reward within the environment. We claim that this indivisible coupling of brain-function-environment has already transcended the troublesome polarities of biological vs. psychological, diseased brain vs. maladjusted mind. Furthermore, once an optimizing of function is understood in relation to an individual patient's needs, the approach also transcends polarities of normative vs. libertarian and reductionist vs. anti-scientific psychiatry. If the study of reward-related decision-making is our new analytic tool, then the goal of this article is to clarify how much emotional and motivational disorders might yield to its explanatory power. Here we also consider foreseeable pitfalls in addition to how this new way of seeing disorder transcends the old polarities that still haunt psychiatry.

## Methods: Review of the Normative Account

One opportunity that a working hypothesis of optimal reward-seeking, given one's resources, affords is that of normativity. Behavior (be it choice between A and B or free, creative expression) is no longer judged in comparison to a reference sample, a 'healthy control' group with all the limitations this entails. Instead behavior is compared to demonstrably optimal solutions in face-valid but solvable tasks.

<sup>1</sup>A most curious term: a symptom without a mental dimension is not a symptom but a sign.

<sup>2</sup>Leibniz claimed that we live in the best world that could possibly (logically, self-consistently) exist. He was famously satirized by Voltaire in 'Candide' [en.wikipedia.org/wiki/Candide], exemplifying the normativity-pathology dialectic that is highly relevant to us.

#### The Bayesian Approach

In an *uncertain* world, each piece of *information* is used to *update* the person's *beliefs* about the *reality underlying appearances* according to this person's rule-book of *how reality gives rise to appearance*. This is what is called 'Bayesian inference'; see **Table 1** for an illustrative toy example.

This toy example does not include decisions about which action to take as yet, only decisions about which state the world is in (here, a self-worth state). Neither motivation nor reward, the central topics of this work, is as yet explicit.
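The belief update described above can be sketched in a few lines of code. This is only a minimal illustration: the hidden states (`liked`, `disliked`), the observations (`smile`, `frown`) and all probabilities are hypothetical values chosen for the sketch and are not taken from Table 1.

```python
def bayes_update(prior, likelihood, observation):
    """Return the posterior over hidden states after one observation.

    prior:      dict mapping state -> P(state)
    likelihood: dict mapping state -> {observation: P(observation | state)}
    """
    unnormalized = {s: prior[s] * likelihood[s][observation] for s in prior}
    evidence = sum(unnormalized.values())  # P(observation)
    return {s: p / evidence for s, p in unnormalized.items()}


# Hypothetical 'rule-book' of how reality gives rise to appearance.
likelihood = {
    "liked": {"smile": 0.8, "frown": 0.2},
    "disliked": {"smile": 0.3, "frown": 0.7},
}

# Start undecided, then update the belief after each piece of information.
belief = {"liked": 0.5, "disliked": 0.5}
for observation in ["frown", "frown", "smile"]:
    belief = bayes_update(belief, likelihood, observation)
    print(observation, round(belief["liked"], 3))
```

Each posterior serves as the prior for the next update, so the order and number of observations shape the final belief about the underlying state.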

It is still necessary to write down the stages of information processing leading to normative decisions, and therefore classify where the process may break down in psychiatric disorders. Agents must


We now consider decisions about actions, rather than passive beliefs. If, as computational neuroscience claims, brains seek the best possible decisions then they should have values that they seek to optimize, values which are meaningful even if not explicitly represented.

## Utility as Consistent Probability Representation

If the value of different outcomes that can be obtained via different behavioral strategies in a given context is well defined for an agent, we can call these values the 'utilities' of the different outcomes and map them to the probability of an agent adopting the corresponding strategy. Rewards are outcomes that reinforce human behavior or are reported as appetitive, desirable, hedonic or pleasant by healthy humans. Confronted with known choices A, B, and C an agent will ascribe 'utilities' *u*(A), *u*(B), *u*(C) such that they can choose by applying a well-defined choice probability. For choice A, this would be π(A; *u*(A),*u*(B),*u*(C)). Here we operationalize the motivational value of an outcome as the relative (but otherwise consistent) probability with which it is chosen. Let's call this 'consistent probability representation' on the part of the agent. The fact that this can be well defined is a hypothesis with extremely productive consequences, which we presently describe. It is consistent probability representation that makes it possible to construct a full Bayesian Decision-making psychiatry (BDP) (Montague et al., 2012; Huys et al., 2014). In addition to 1–3 above, agents need -


We immediately note that consistent probability representation firmly maps utility to probability – which is, in Bayesian terms, just another kind of belief (Friston et al., 2013); where the model of control is just part of the generative model. Thus (4) and (5) are not additions to the Bayesian schema but are special cases of its elements.
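To make the mapping from utilities to a choice probability π(A; *u*(A),*u*(B),*u*(C)) concrete, the sketch below uses a softmax rule. The softmax form, the function name `choice_probabilities` and the utility values are assumptions made for illustration; the text above only requires that some well-defined, consistent mapping exists.

```python
import math


def choice_probabilities(utilities, inverse_temperature=1.0):
    """Map a dict of option -> utility to option -> choice probability (softmax)."""
    largest = max(utilities.values())  # subtracted for numerical stability
    weights = {
        option: math.exp(inverse_temperature * (u - largest))
        for option, u in utilities.items()
    }
    total = sum(weights.values())
    return {option: w / total for option, w in weights.items()}


# Hypothetical utilities for three known options.
with_c = choice_probabilities({"A": 2.0, "B": 1.0, "C": 0.5})
without_c = choice_probabilities({"A": 2.0, "B": 1.0})

# Under this rule the relative preference for A over B does not depend on
# whether C is also on offer: one sense of 'consistent' representation.
print(with_c["A"] / with_c["B"], without_c["A"] / without_c["B"])
```

Under this assumed rule the ratio of choice probabilities for any two options depends only on their utility difference, which is the kind of consistency that the order effects discussed later appear to violate.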

'Mental disorder' can be said to exist when this decision-making apparatus itself is impaired, rather than reflecting any issues with its inputs. Note, however, that the decision-making apparatus by virtue of its Bayesian nature accumulates experience. Every updated belief contains the weight of its priors and forms the prior of the next update. Hence an 'impairment' may consist in the development of a decision-making apparatus poorly adapted for the circumstance in question so that there is no firm distinction between a maladapted and a diseased decision-making apparatus. At the same time there is no guarantee that brain development will not encode posteriors into irreversible structure. As an example, the accent with which one speaks is part of the posteriors about the world encoded in childhood. It is very difficult to learn to pronounce a foreign language like a native in adulthood. Hence with respect to an environment where speaking this new language without a foreign accent is optimal, child development has in this broad definition "damaged" the brain.

We can now illustrate this scheme by locating motivational and emotional problems to distinct parts of this apparatus:

(1) Development may not have equipped the patient with an adequate repertoire of classes. A traumatic life which has set the prior probability that others will dislike me as equal to one can be seen to correspond to the extreme of Beck's notion of a core belief, which says 'I am worthless.' In the opening quote, "I'm gonna lose my baby" might be an exemplar of such a (prior) certainty.


As above, focusing on reward contingent on actions yields two further potential problem areas:


However, not all is perfect with the utility-maximizing Bayesian schema: human beings appear to violate systematically<sup>3</sup> the hypothesis of 'consistent probability representation.' We now briefly consider an alternative proposal of how people may represent their preferences, which offers a potential solution to these violations.

## Busemeyer: Preference for Reward as a Mixture State

Suppose that decision-making is probed first according to one of three options, for example by asking how much one prefers A over (B or C). We then probe how much B is preferred over C. It turns out that the choice probabilities<sup>4</sup> (and implied utilities) experimentally measured are not consistent with performing the same experiment in the alternative possible orders. This violation of consistent probability representation – here an order effect – is one of several apparent inconsistencies in probabilistic reasoning that people display. Various explanations have been put forward, ranging from an erroneous bias to invocation of specialized context effects.

But what if, before *making* the choice between A, B and C, a person does not in fact encode all decision probabilities π(A; *u*(A),*u*(B),*u*(C)) etc.? It may be that their psychological and neurobiological state is better described as the subject being in two or more minds, in a so-called "mixture state" *s* = (*a*(A),*a*(B),*a*(C)) where *a* are the amplitudes of the mixture. In this scenario the process of *making the choice* is implemented as a reduction or projection of the mixture state. The rigorous formalism used to describe the dynamics of mixture states, and what happens at the point of reduction, was first described in quantum physics and recently introduced into decision making by Busemeyer et al. (2009). The reduction or projection process naturally produces order effects: essentially, enquiring about A automatically affects B and C, and so on. The framework is referred to as 'quantum probability' (QP) – an unfortunate term in the current context as no physical quanta are involved at all.
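How projection can produce an order effect may be easier to see in a toy calculation. The sketch below is a two-dimensional illustration with real amplitudes, not the model of Busemeyer et al. (2009): the state angle, the two question axes and all function names are assumptions chosen so that the effect is visible.

```python
import math


def unit(angle_deg):
    """Unit vector in the plane at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


def project(state, axis):
    """Project `state` onto the one-dimensional subspace spanned by unit vector `axis`."""
    overlap = state[0] * axis[0] + state[1] * axis[1]
    return (overlap * axis[0], overlap * axis[1])


def squared_norm(v):
    return v[0] ** 2 + v[1] ** 2


def prob_yes_then_yes(state, first_axis, second_axis):
    """Probability of answering 'yes' to the first question and then to the second."""
    return squared_norm(project(project(state, first_axis), second_axis))


state = unit(80)   # hypothetical initial state of mind
axis_a = unit(0)   # 'yes' direction for the question about A
axis_b = unit(45)  # 'yes' direction for the question about B

p_a_then_b = prob_yes_then_yes(state, axis_a, axis_b)
p_b_then_a = prob_yes_then_yes(state, axis_b, axis_a)
print(p_a_then_b, p_b_then_a)  # the two orders give different probabilities
```

Because the two projections do not commute, asking about A first changes the state on which the question about B then operates, which a single fixed set of choice probabilities cannot reproduce.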

The consequences of this framework have not been worked out nearly as fully as those of, for example, a Bayesian framework. We suggest that the experimental evidence supporting it casts some doubt on the fundamental idea that people represent preference probabilities corresponding to a well-defined utility function. There are, however, many cases in which the two frameworks concur (Busemeyer and Bruza, 2012) and so in this instance we proceed with the better-worked-out Bayesian framework, mindful that there are good reasons why its assumptions might provide poor approximations of psychology.

## Marr: Process Models in the Brain

So far we have considered reward and emotion at the level of information processing. Computational psychiatry, however, is not just behavioral economics or behaviorist psychology. Following Marr (1982), we seek evidence that the specific, normative computations we hypothesize are instantiated in neural wetware. This in turn raises the thorny issue that a specific computation – in the sense of a specific normative solution – can be achieved with different problem-solving techniques. It is the signatures of these algorithms that we look for in the neural substrate, and the complete account – from stimulus to neural response, to neural computation to its representation in experimental data – is the 'process model.' The best-established process models relevant to the computational psychiatry of reward are arguably those that posit the basal ganglia as representing reward-based learning prediction errors (Seymour et al., 2004) and the ventral and medial prefrontal cortex as representing the values of different actions available to the subject (Rushworth et al., 2011).

## Modeling Motivation and Emotion

There is one issue in motivational and emotional research which has been relatively overlooked within the framework of reward processing, and if unaddressed might reinforce dualist splits. The working definition of motivation within the Bayesian framework appears to claim something trivial, namely defining the motivational power of an outcome as the frequency with which it is chosen. We can choose to call this 'motivation,' and this would be fine if we were talking about math or physics, where there is no danger of confusing a rigorously defined quantity, say the charge of a quark, with a property of the mind. However, here we are *also* talking about motivation as experienced by patients, so we need to be clear about what sort of claim we are making about the semantic referents to which the term 'motivation' belongs. More specifically, are we claiming that the choice-frequency definition of 'motivation' is to be taken for granted, while the phenomenal experience of 'motivation' is a subject for a future, maybe more optional, clarification or research? This would constitute a linguistic coup d'etat! The hard problem of consciousness need not concern us here: we only need to avoid dualism and – like good Bayesians – optimally combine both linguistic and decision-behavioral evidence.

<sup>3</sup>Though not grossly.

<sup>4</sup>Consistent choice utilities map to consistent choice probabilities. In a Bayesian world we would require such consistency of utility. In reality the decision-making processes people use mean that this isn't so. This is called 'violation of revealed preference theory.'

People place great importance on the distinction between 'I can't' and 'I don't care.' 'He doesn't care about me' is a much more serious accusation than 'he can't understand me.' Yet our measurement of motivation as the currency between observable outcomes and decision probability often makes this distinction quite difficult. Suppose button A gives me a piece of jellied eel four times out of 10, and button B six times out of 10. If I choose them equally often, is it that I am very good at working out frequencies but I don't care about jellied eel (no motivation), or that I'm very keen on eel but I am incapable of working out frequencies (no ability)? Similarly, if task performance depends on some other psychiatric variable (say on anxiety) we could easily confuse performance at the left side of the Yerkes–Dodson curve (arousal and motivation too low) with performance on the right (high motivation, but arousal detrimentally high). It is not, of course, impossible to distinguish between the 'I can't' and 'I don't care' but ideally both phenomenological and behavioral enquiry are needed. It is interesting to note that the individual's 'I can't' may be the genetic pool's 'I haven't learnt to appreciate.'

Models traditionally address the issue of motivation-per-outcome by fitting a single parameter (often called 'temperature') for each agent. More recently models have parametrized two different aspects of how motivating rewards are, even before considering the phenomenological level. The first relates to how often a choice would be made if the reward emanating from it were immediately obtained with great certainty. Even an obviously preferable outcome ('do you want £5 or £0?') may not be chosen 100% of the time due, for example, to lapses in attention or misunderstanding. The second aspect has to do with how motivation to make a decision changes as the outcomes of these decisions are, with time, more reliably inferred. This can be seen as a 'motivational exchange rate' or 'decision temperature' pertaining to a unit change of outcome away from the point of indifference. This pair of concepts is codified as 'lapse rate and inverse temperature' in the classic RL framework (Guitart-Masip et al., 2012) and 'goal priors and action precision' in an active inference framework (Friston et al., 2013). Note that these are not just different names for the same variables: although they refer to related concepts, they have subtly different computational roles.
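The two aspects can be separated in a simple two-option choice rule. The sketch below is one common way of combining a lapse rate with an inverse temperature; the exact functional form, the function name `p_choose_b` and the parameter values are assumptions for illustration rather than the specification used in the cited studies.

```python
import math


def p_choose_b(u_a, u_b, inverse_temperature, lapse_rate):
    """Probability of choosing B over A.

    inverse_temperature: how steeply choice follows the utility difference.
    lapse_rate: fraction of trials on which the choice is made at random.
    """
    softmax_b = 1.0 / (1.0 + math.exp(-inverse_temperature * (u_b - u_a)))
    return lapse_rate / 2.0 + (1.0 - lapse_rate) * softmax_b


# An obviously better option (a utility gap of 5, assumed units) is still
# not chosen on every trial when the lapse rate is non-zero.
print(p_choose_b(0.0, 5.0, inverse_temperature=5.0, lapse_rate=0.1))

# A small difference in payoff probability (0.4 vs. 0.6, as in the two-button
# example) is followed weakly or strongly depending on inverse temperature.
print(p_choose_b(0.4, 0.6, inverse_temperature=1.0, lapse_rate=0.0))
print(p_choose_b(0.4, 0.6, inverse_temperature=10.0, lapse_rate=0.0))
```

In this form the lapse rate caps how often even a clearly superior option is chosen, while the inverse temperature governs sensitivity near the point of indifference, so the two parameters leave different signatures in choice data.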

Although we have a working definition of motivation, we have less of a handle on the term emotion. Our *implied* definition of emotion: a positive or negative utility attaches a value upon the outcomes with which it is associated, and thus upon the states and decisions that lead to them, corresponding to more positive or negative emotional states respectively. Emotion contains *as inseparable parts of each unitary phenomenological state* not only valence and magnitude but rich information about context, intention etc. The desire for sex and the desire for knowledge are not just differently tagged emotions, they are different emotions.

At the moment the way that researchers relate computational variables to emotions (if at all) is haphazard; yet tentative progress is being made. In one path-breaking study, Rutledge et al. (2014) related changes in subjective well-being to several aspects of a participant's reward – such as their cumulative reward ('wealth'), immediate reward and most importantly *immediate reward compared to expectations –* their reward prediction error (RPE). Here changes in subjective well-being, 'how happy do you feel at the moment,' were best predicted by RPEs. In a bold formulation, Joffily and Coricelli (2013) posited that the phenomenology of several emotions, not just the single dimension of higher vs. lower well-being, is intimately linked to *both the temporal dynamics and the certainty* of the beliefs about how one's state evolves relative to one's goals or desires. Thus not only does a person feel 'positive' as their beliefs shift toward a desired state (as a positive RPE would entail); but this positive emotion has the color of happiness if the current belief is certain but the color of hope if the corresponding belief is uncertain.
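As a rough illustration of 'immediate reward compared to expectations', the sketch below computes RPEs with a simple delta rule and then forms a recency-weighted sum of them. It is loosely inspired by the idea that recent RPEs track momentary well-being; the learning rate, the forgetting factor and both function names are illustrative assumptions, and this is not the fitted model of Rutledge et al. (2014).

```python
def reward_prediction_errors(rewards, learning_rate=0.3, initial_expectation=0.0):
    """Return the RPE on each trial, updating the expectation with a delta rule."""
    expectation = initial_expectation
    rpes = []
    for reward in rewards:
        rpe = reward - expectation           # outcome relative to expectation
        rpes.append(rpe)
        expectation += learning_rate * rpe   # move expectation toward outcome
    return rpes


def recency_weighted_sum(values, forgetting=0.7):
    """Sum of values in which older entries are discounted by `forgetting` per trial."""
    total = 0.0
    for value in values:
        total = forgetting * total + value
    return total


rewards = [1.0, 1.0, 1.0, 1.0, 0.0]  # four wins followed by an omission
rpes = reward_prediction_errors(rewards)
print([round(r, 3) for r in rpes])
print(round(recency_weighted_sum(rpes), 3))
```

The same reward produces a smaller RPE once it has come to be expected, and an omitted reward after a run of wins yields a negative RPE, which is the pattern that the well-being findings above rely on.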

This experimental and theoretical progress attests to the feasibility of unifying the 'client' (subjective) and the 'decisionmaker' (objective) perspectives on emotion. The links between the dynamics of reward and the dynamics of emotion show great promise and need a lot of experimental testing, but the first steps of clinical importance have been taken.

## Results

Approaching motivational and emotional disorders through the lens of (computational) reward processing furnishes a number of important results with respect to two of the polarities that have plagued psychiatry but has not made as much progress with respect to a third.

## Biological vs. Psychosocial

Computational psychiatry simultaneously addresses the computational level of what the problem is, the algorithmic level of how it can be operationalized in terms of information processing, and the implementation level in terms of the neural substrate. More practical considerations, such as the behavioral economics of interpersonal exchanges (Camerer, 2003), have obliged scientists to integrate social psychology and neuroscience with basic, or impersonal, reward processing. Let us consider two findings: first, that subjective well-being follows RPEs (Rutledge et al., 2014) as above. Second, during interpersonal exchange people may encode not only ordinary RPEs (e.g., I'm pleasantly surprised with what she gave me) but also person-representation prediction errors [She will be pleasantly surprised about me, as I'm about to reciprocate generously (Xiang et al., 2012)]. If ordinary RPEs drive some aspects of emotion, it would be strange indeed if person-representation RPEs were unrelated to the strong emotions we experience in an interpersonal sphere: for example, their fragility in emotionally unstable personality or their presumed dearth in psychopathy.

On the other hand, of course, we are far from elucidating the actual way in which social emotions and non-social emotions are represented in their neurological substrates and, inversely, how social and non-social emotional processing changes this substrate, be it through trauma (Chen and Etkin, 2013), learning in psychosis (Murray et al., 2008) or subtle plasticity (Garvert et al., 2015). Signatures of biased reward processing have been found in several disorders but they are far from explaining these disorders either in the sense of explaining symptoms in the here and now or in the sense of predicting the course of the disorder much better than traditional methods (Whelan et al., 2014).

## Disease vs. Maladjustment

Learning about reward takes place at different levels of information processing. Let us consider the example of psychosis. The early, and celebrated, aberrant salience hypothesis of psychotic disorders (Kapur et al., 2005) postulated a disease level wherein dopamine discharges might be epileptic-like, unrelated to information processing, leading to the establishment of psychotic associations (both beliefs and choices) at the phenomenological and behavioral levels. Such an account separates the diseased brain, which reports aberrantly increased salience, from the healthy brain downstream that tries to make sense of this abnormal salience. However, no epileptiform activity has been demonstrated. Increased aberrant salience has been demonstrated in association with schizotypy in healthy individuals and in medicated patients with delusions (Roiser et al., 2009); however, it does not seem to be prominent in prepsychotic and early psychotic states, where no changes in aberrant salience have been found so far (Smieskova et al., 2015). At the same time there is evidence that exaggerated dopamine reactivity to stress is associated with psychotic experiences in predisposed individuals (Hernaus et al., 2015).

Therefore the evidence points toward disease being an overall brain-state, the result of adjustment to psychobiological challenges performed by the individual's neural phenotype. Computationally, this is inference about salient stimuli at the developmental timescale; while genetically it is likely to be based on 'intermediate phenotypes,' e.g., of atypical connectivity (Cao et al., 2016). We can see that this framework renders the dualistic view of 'disease' and 'maladjustment' obsolete. The canonical teaching of an illness being explained in terms of predisposing, precipitating and perpetuating factors fits much more comfortably with the dynamical view of computational psychiatry, wherein dopamine reactivity or the interplay of prior and posterior beliefs are meaningful (if suboptimal) at different but intimately linked Marrian levels. The computational models of Ruppin and coworkers (Horn and Ruppin, 1994) provide a beautiful early example of such thinking. They suggested that the brain performed compensatory adjustments to long-range dysconnectivity in order to preserve the ability to activate appropriate perceptions in response to stimuli. However, these compensatory adjustments result in a propensity for percepts that bear small correlation to stimuli (i.e., hallucinations) to arise. Neurobiological and computational research has greatly refined these insights. We close this brief foray into psychosis research by pointing out a promising theme relevant to the role of reward, the focus of this issue. From the early theories of dopamine-dependent signal-to-noise (Cohen and Servan-Schreiber, 1992; Servan-Schreiber et al., 1996) to the influential analysis of the role of precision at sensory vs. cognitive levels (Adams et al., 2013) to the findings of exaggerated dopamine reactivity to stress (Hernaus et al., 2015), psychosis has been about aberrations and compensatory changes in synaptic gain. The original aberrant salience theory of psychosis has opened new horizons regarding the role of reward- and threat-anticipation in psychosis; yet it may be the increasingly sophisticated understanding of synaptic gain, especially in its guise as precision calculated in cortical NMDA fields (Adams et al., 2013), that helps us go beyond the oversimplified aspects of salience theory.

At a theoretical level some biological factors are so dominant that to call them 'predisposing factors' is misleading (e.g., Down's syndrome causing Alzheimer's disease). These can be thought of as maladjustments at another level of the hierarchy – where an evolving reproductive apparatus has not learnt to avoid trisomies. Such maladjustments may be chance events or indeed the result of optimizing compromises between priorities.

At the same time the normative view of reward processing contains an ambiguity that needs acknowledgment and resolution. This is that for *any* input-output behavioral pattern a cost structure can be found for which this pattern of behavior is optimal (Daunizeau et al., 2010). For *any* behavior we can simply say that the person in question emits it because it genuinely optimizes their happiness. This is analogous to the psychological assertion that a patient 'refuses to change because it would be too painful for them,' or that an addict or pedophile simply finds indulging too rewarding to trade it against an alternative. Given a conception of what is valuable, e.g., making the most money, we can offer to explain how people attempt to optimize their behavior, and which parts of the process may go wrong. In current practice most research that investigates abnormalities of reward processing takes as a starting point an assumption that there are rewards out there, which have a normative relationship with the individual's behavior and that people *should* value and *should* seek. When a rat or human is hungry, two lumps of sugar are more rewarding than one, and we can measure how much harder subjects are willing to work for the chance to get them. We have a normative yardstick: our subject should work just hard enough to maximize the utility of (sugar + effort). Motivational disorder is then defined as a statistically significant deviation from this norm. However, in the real world it is hard to know what people *should* care about and computational, biological and psychosocial research agendas could do well to take seriously what we don't know.
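The ambiguity can be made tangible with the sugar-and-effort example. In the sketch below, net utility is assumed to be linear in sugar earned and quadratic in effort; this functional form, the parameter values and all function names are hypothetical choices for illustration, not a model taken from the cited work.

```python
def net_utility(effort, value_per_lump, effort_cost):
    """Assumed utility: sugar earned is proportional to effort, cost grows quadratically."""
    return value_per_lump * effort - effort_cost * effort ** 2


def optimal_effort(value_per_lump, effort_cost, effort_levels):
    """Effort level on the grid that maximizes net utility."""
    return max(effort_levels, key=lambda e: net_utility(e, value_per_lump, effort_cost))


def rationalizing_cost(observed_effort, value_per_lump):
    """Effort cost under which the observed (non-zero) effort is exactly optimal."""
    return value_per_lump / (2 * observed_effort)


effort_levels = [i / 10 for i in range(0, 101)]  # 0.0, 0.1, ..., 10.0

# With an experimenter-assumed cost, there is a single 'normative' effort.
print(optimal_effort(1.0, 0.1, effort_levels))

# Yet any observed effort is optimal under some other cost parameter.
for observed in (2.0, 8.0):
    cost = rationalizing_cost(observed, 1.0)
    print(observed, cost, optimal_effort(1.0, cost, effort_levels))
```

Deviation from the yardstick therefore only counts as disorder if the assumed cost structure is independently justified; otherwise low or high effort can always be re-described as optimal for that individual.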

Computational psychiatry does not do as well, as yet, when dealing with the complexity of human emotion. The problem is acute not because we should address emotion in its huge complexity, but because we have so far dealt with it by a simplification into positive vs. negative emotion, albeit tagged according to experimental tasks in question. If human emotion relevant to psychopathology contains multiple facets as inseparable parts of a phenomenological state, and if these rich states have computational relevance, then current studies are likely to be very remote from actual clinical relevance. Paradiso and Rudrauf (2012) put it eloquently: "... the fear experienced by a mountain climber in potential danger has levels of social complexity unlikely to be reached in mice. In addition to fearing his own end, the mountain climber anticipating a possible death is equally likely also to be scared of losing his spouse and children, leaving them fatherless and exposed to dangers, of the financial consequences of his death on them, of the emotional effects on his parents, and so on. He may simultaneously experience shame (another social emotion) and danger (perhaps toward his self) for having neglected what he thinks were routine safety measures. A human facing the possibility of ceasing to exist has emotions that encompass the inescapable social nature and interconnectedness of our species and multiple levels of self-representation and projection." We don't really know which complex emotional constellations found in psychiatric disorders are most relevant, especially for decision-making that can be considered pathological. At the moment we haven't developed a good way of addressing this most important question scientifically either.

## Discussion

A computational psychiatry of emotional disorders has begun to put on the table key issues that have plagued psychiatry. It provides a framework for bridging biological-psychological-social divides and offers novel perspectives on the question of emotional-motivational 'diseases' versus 'problems.' This is rendered possible by formulating disorders of motivation and emotion within a normative probabilistic framework which offers sophisticated and neurobiologically plausible accounts of how rewards motivate decisions. Many challenges remain. Phenomenology is only tentatively connected to computation; highly promising theoretical concepts have not been put to experimental test, while their normative basis is not understood. For example, we have no rigorous normative account of what utility structures correspond to mental health. A key example is how reward *should* be discounted in the face of time (inter-temporal discounting), valence (complex discounting of negative future events, including dread) or social distance (social discounting). Therefore the statistical connections that have been found between temporal discounting and addictive disorders lack a true normative basis.
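To illustrate why the form of temporal discounting matters, the sketch below compares two functional forms that are widely used in the discounting literature, exponential and hyperbolic. The amounts, delays and rate parameters are hypothetical, and neither form is put forward here as the normative one.

```python
import math


def exponential_value(amount, delay, rate):
    """Present value under exponential discounting."""
    return amount * math.exp(-rate * delay)


def hyperbolic_value(amount, delay, k):
    """Present value under hyperbolic discounting."""
    return amount / (1.0 + k * delay)


def prefers_larger_later(value_fn, param, front_delay):
    """Compare 50 units at `front_delay` with 100 units ten time steps later."""
    smaller_sooner = value_fn(50.0, front_delay, param)
    larger_later = value_fn(100.0, front_delay + 10.0, param)
    return larger_later > smaller_sooner


# Hyperbolic discounter: the preference reverses as both options recede in time.
print(prefers_larger_later(hyperbolic_value, 0.2, 0.0),
      prefers_larger_later(hyperbolic_value, 0.2, 30.0))

# Exponential discounter: the preference is the same at both horizons.
print(prefers_larger_later(exponential_value, 0.1, 0.0),
      prefers_larger_later(exponential_value, 0.1, 30.0))
```

Which of these patterns, if either, a healthy agent *should* show is exactly the kind of question for which a normative account is currently lacking.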

Let us now consider a libertarian (or Szaszian) critique of reward processing as a basis for psychiatric research. Szasz protested against a medicalization of deviant behavior, believing that so-called psychiatric disorders lack an adequate biological basis. Hence 'medicalizing' unjustifiably transgressed people's autonomy (Szasz, 1960). Deciding *a priori* what reward people should value more (as manifested in their choices) or what reward they should care about (as manifested in their phenomenology) is just as much 'playing god' once we move beyond trivial choices: in many cases psychiatrically relevant situations are complex enough to negate a dream of finding a normative standard against which to measure motivational disorder. Reward processing should maximize long-term outcomes and so in research practice we use paradigms that have well-defined ends or may be thought of as going on 'for ever' (as for example near the beginning of a task with hundreds of trials). Yet what sorts of long-term outcomes are involved in the long-term reward processing important for psychiatry? Individual reproductive fitness? We have no clear idea, and the temptation is to import convenient social norms, rendering our framework only pseudo-normative. Even in the simple example of working for lumps of sugar, mentioned above, there will usually be *some* evaluation of effort and sugar<sup>5</sup> that renders behavior optimal. This evaluation may be normative with respect to the person's history, not the task. In any psychiatrically relevant situation considerations rapidly multiply. For example, what if our hungry human is overweight? And what if the reproductive fitness associated with slimness (attracting mates) is socially constructed?

Thomas Szasz and the libertarian tradition (to which the authors belong) argue that rather than impose norms on people – say about which reward would maximize their life expectancy or their reproductive success – we should respect the priorities they have and, by definition, accept a person's autonomy to seek their own reward by deploying their own motivational structures. So is there no such thing as a motivational or emotional disorder and in fact everyone is just doing the best they can? Szasz would claim that the dream of aberrant reward processing pinning down what's essential about motivational and emotional disorders is no more solid than the 'chemical imbalance' theory of depression or Freud's 'unconscious motivation' theory of mental illness. To be more specific, if the Reward processing domain of the otherwise promising 'Research Domain Criteria' framework (Casey et al., 2013) is applied too simplistically we may end up with exactly the same mistakes as in previous biological or psychoanalytic normative straitjackets.

If a Szaszian position simply accepts people's choices for what they are, its extreme opposite would be a 1984 world where people have been taught, through social, psychological and biological interventions, not only *what to decide* but actually *what to desire.* While we recoil from the Szaszian extreme as it is dismissive of the importance of psychiatric suffering, psychiatrists cannot dictate what patients should care about – even about their symptoms. The so-called recovery movement can already teach computational neuroscientists that the rewards that patients really care about are not so much to do with their symptoms as with their life goals and values. In that case perhaps the priorities for researching archetypal motivational disorders like depression are not about 'what motivational disturbance underpins depression' but 'what decision structures of the depressed can help them fulfill their values' (Hayes et al., 1999). Here we have a dialectic, because the scientific baby should not be thrown out with the essentialist bathwater. The clinician could bring to the patient a biopsychosocial assessment of 'wrong priors,' 'wrong models,' or 'wrong utilities.' They would then decide *in dialog with a patient*, with a diagnosis of say the successor of 'Depressive Episode,' now defined in computational terms, what key needs must be targeted and optimized. When it comes to the severe mental illnesses, formulations that go beyond 'Schizophrenia,' or indeed 'Abnormal Salience syndrome' (Van Os, 2009) will help clinicians and patients consider emotional and motivational dispositions both as threats and as instruments toward recovery. Of course this account assumes patients with some capacity to consider the issues in question, which may itself be severely compromised – for example in acute psychosis.

<sup>5</sup>i.e., the person's goal or preference priors: Friston et al. (2013).

Why bring in the concept of need when considering reward and emotional disorder? Because biologically reward is not an end in itself, but a good surrogate toward longer-term biological goals. The stability properties of a self-perpetuating system, like a species in an ecosystem, can be conceptualized in terms of having the 'purpose' or 'goal' to keep perpetuating the system (e.g., the species). One has to be careful philosophically to avoid false teleological justifications, but in the first instance this is small print. We assert that there are physiological homeostatic needs, reproductive/sexual needs, and more complex ones such as needs for social contact. Furthermore, people are motivated by rewards that extend beyond their own lives. They will often, in fact, sacrifice their life for much less than 'two brothers or four cousins,' as mathematical evolutionary biologists have put it (Maynard Smith, 1993). Each of these needs entails goals, desires and rewards; all are relevant to psychiatry; but probably few can be the target of fruitful intervention for each particular patient.


## Conclusion: Computational Psychiatry must be Profoundly Biopsychosocial

In the best possible world, scientists will take seriously the question of what needs really matter for patients and what rewards form the best surrogates or milestones toward the fulfillment of such needs, and will do so in open collaboration with relevant stakeholders. At first sight the rigorous, biologically based discipline of computational psychiatry seems far from patients' expressed needs, yet the fact that it puts reward and motivation at the center of understanding psychiatric disorder gives it a privileged vantage point toward serving patients.

## Acknowledgments

RD is supported by a Wellcome Trust Senior Investigator Award (ref 098362/Z/12/Z). The current work is funded by a Strategic Award by the Wellcome Trust (ref 095844/7/11/Z). MM is also supported by the Biomedical Research Council.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Moutoussis, Story and Dolan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A Computational Analysis of Aberrant Delay Discounting in Psychiatric Disorders

Giles W. Story<sup>1,2,3</sup>\*, Michael Moutoussis<sup>1,2</sup> and Raymond J. Dolan<sup>1,2</sup>

<sup>1</sup> Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, UK, <sup>2</sup> Wellcome Trust Centre for Neuroimaging, University College London, London, UK, <sup>3</sup> Centre for Health Policy, Imperial College London, Institute of Global Health Innovation, St. Mary's Hospital, London, UK

Impatience for reward is a facet of many psychiatric disorders. We draw attention to a growing literature finding greater discounting of delayed reward, an important aspect of impatience, across a range of psychiatric disorders. We propose these findings are best understood by considering the goals and motivation for discounting future reward. We characterize these as arising from either the opportunity costs of waiting or the uncertainty associated with delayed reward. We link specific instances of higher discounting in psychiatric disorder to heightened subjective estimates of either of these factors. We propose these costs are learned and represented based either on a flexible cognitive model of the world, an accumulation of previous experience, or through evolutionary specification. Any of these can be considered suboptimal for the individual if the resulting behavior results in impairments in personal and social functioning and/or in distress. By considering the neurochemical and neuroanatomical implementation of these processes, we illustrate how this approach can in principle unite social, psychological and biological conceptions of impulsive choice.

Keywords: discounting, time preference, psychiatric, computational psychiatry, mental illness, biopsychosocial

## INTRODUCTION

Vitae summa brevis spem nos vetat incohare longam

Life's short span forbids our embracing far-reaching hopes - Horace, Odes (23BC)

Humans and animals often accept a smaller reward immediately, rather than wait to receive a larger reward in the future (Ainslie, 1974; Thaler, 1981; Thaler and Shefrin, 1981; Fishburn and Rubinstein, 1982; Frederick et al., 2002; McClure et al., 2007; Kalenscher and Pennartz, 2008; Pine et al., 2009). In economic terms, this behavior indicates that the subjective value of reward decreases as it is delayed, a process referred to as temporal discounting (for reviews see Frederick et al., 2002; Kalenscher and Pennartz, 2008). As we will discuss, biological agents have good reason to discount delayed rewards, since these might either fail to materialize or arrive too late to satisfy the organism's current needs. Indeed, as pointed out by the Roman poet Horace in the quotation above, the ultimate motive for discounting is that the agent will die before deferred rewards are realized.

#### Edited by:

Gianluca Castelnuovo, Università Cattolica del Sacro Cuore, Italy

#### Reviewed by:

Michelle Dow Keawphalouk, Harvard and Massachusetts Institute of Technology, USA

Warren K. Bickel, Virginia Polytechnic Institute and State University, USA

> \*Correspondence: Giles W. Story g.story@ucl.ac.uk

#### Specialty section:

This article was submitted to Psychology for Clinical Settings, a section of the journal Frontiers in Psychology

Received: 29 June 2015 Accepted: 04 December 2015 Published: 13 January 2016

#### Citation:

Story GW, Moutoussis M and Dolan RJ (2016) A Computational Analysis of Aberrant Delay Discounting in Psychiatric Disorders. Front. Psychol. 6:1948. doi: 10.3389/fpsyg.2015.01948

In humans, temporal discounting can be measured by examining choices between quantities of money at varying delays (Mazur, 1987; Kirby and Maraković, 1995; Myerson et al., 2001; Green and Myerson, 2004). The most commonly used method elicits choices between a larger, delayed amount of money (e.g., "\$100 in 6 months") and a series of immediate amounts of decreasing magnitude (e.g., "\$80 today"). By observing at each delay the magnitude of smaller-sooner reward at which the participant switches to preferring the later reward, the decrease in value of the later reward can be plotted as a function of delay. A non-parametric estimate of discounting can be derived by taking the area beneath this indifference curve (Myerson et al., 2001). Alternatively, the shape of the curve can be fitted with a discount function.
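The area-under-the-curve estimate described above can be illustrated with a minimal sketch. The indifference points below are invented for two hypothetical participants, and the convention of normalizing both axes and anchoring the curve at zero delay with full value is an assumption of this sketch rather than a detail given in the text.

```python
def area_under_curve(delays, indifference_points, delayed_amount):
    """Normalized area under the indifference curve (trapezoid rule).

    delays: increasing delays (any unit).
    indifference_points: immediate amount judged equal in value to the
        delayed amount at each delay.
    Returns a value in [0, 1]; smaller values mean steeper discounting.
    """
    # Normalize both axes and anchor the curve at (0, 1): an assumption
    # that an undelayed reward keeps its full value.
    xs = [0.0] + [d / delays[-1] for d in delays]
    ys = [1.0] + [p / delayed_amount for p in indifference_points]
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


# Hypothetical data: "$100 after d months" versus a smaller amount today.
delays = [1, 3, 6, 12]
patient = [95, 90, 80, 70]      # invented indifference points
impatient = [80, 60, 40, 25]    # invented indifference points

auc_patient = area_under_curve(delays, patient, 100)
auc_impatient = area_under_curve(delays, impatient, 100)
print(f"patient AUC = {auc_patient:.3f}, impatient AUC = {auc_impatient:.3f}")
```

Because the measure makes no assumption about the shape of the curve, it can be compared across participants whose choices are poorly described by any single discount function.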

Samuelson (1937), and later Strotz (1957), showed that a decision-maker who discounts future benefits according to an exponentially decreasing function (and behaves as if to maximize the sum of exponentially discounted reward) allocates resources across time in a self-consistent manner. Under the classical model, the effect of delay, d, is described by an (exponential) discount function, here denoted by Δ(d), such that:

$$
\Delta \left( d \right) = e^{-kd} \tag{1}
$$

Where k is an exponential discount rate, such that higher values of k lead to a steeper decrease in reward value with delay. The effect of reward magnitude, here signified by r, is independently described by an instantaneous utility function, u(r), such that the subjective utility of a stream of future rewards is then given by:

$$U(r_t, r_{t+1}, r_{t+2}, \dots, r_{T-1}, r_T) = \sum_{\tau = t}^{T} u(r_\tau)\, \Delta(\tau - t) \tag{2}$$
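As a minimal sketch of the classical model, the following evaluates the discounted utility of a reward stream using the exponential discount function of Equation 1. The reward stream, the two discount rates and the linear instantaneous utility function are illustrative assumptions, not values taken from the literature.

```python
import math


def exponential_discount(d, k):
    """Equation 1: weight given to a reward delayed by d."""
    return math.exp(-k * d)


def discounted_utility(rewards, k, u=lambda r: r):
    """Equation 2 evaluated from the present (t = 0).

    rewards[tau] is the reward arriving tau periods from now; u is the
    instantaneous utility function (linear by default, an assumption).
    """
    return sum(u(r) * exponential_discount(tau, k)
               for tau, r in enumerate(rewards))


stream = [0, 0, 10, 0, 10]   # hypothetical rewards at periods 0..4
patient_value = discounted_utility(stream, k=0.05)
impatient_value = discounted_utility(stream, k=0.5)
print(f"k=0.05: {patient_value:.2f}   k=0.5: {impatient_value:.2f}")
```

The same stream is worth less to the agent with the higher k, which is the sense in which higher values of k correspond to steeper discounting.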

As reviewed by Frederick et al. (2002), the above account was not intended as a veridical psychological model of choice over time. In keeping with this, many experimental studies have shown that discounting is better approximated by a hyperbolic than by an exponential function (e.g., Green et al., 1994; Kirby and Herrnstein, 1995; Kirby and Maraković, 1995; Myerson and Green, 1995; Laibson, 1997; van der Pol and Cairns, 2002; Rubinstein, 2003), of the form:

$$
\Delta \left( d \right) = \frac{1}{1 + kd} \tag{3}
$$

Here k denotes a hyperbolic discount rate (though for alternative accounts see Read, 2001; Kable and Glimcher, 2010; Read et al., 2012; Luhmann, 2013).
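One way to see how the two functional forms differ is to compare the fraction of value retained when a delay is extended by one further unit. The sketch below does this for Equations 1 and 3; the rate k = 0.2 is an arbitrary illustrative choice, and the exponential and hyperbolic rates are not on a common scale.

```python
import math


def exponential(d, k):
    """Equation 1."""
    return math.exp(-k * d)


def hyperbolic(d, k):
    """Equation 3."""
    return 1.0 / (1.0 + k * d)


def one_step_ratio(discount, d, k):
    """Fraction of value retained when the delay grows from d to d + 1."""
    return discount(d + 1, k) / discount(d, k)


k = 0.2  # illustrative rate only
exp_ratios = [one_step_ratio(exponential, d, k) for d in (0, 5, 20)]
hyp_ratios = [one_step_ratio(hyperbolic, d, k) for d in (0, 5, 20)]
print("exponential:", [round(x, 3) for x in exp_ratios])
print("hyperbolic: ", [round(x, 3) for x in hyp_ratios])
```

Under the exponential form the ratio is the same at every delay, whereas under the hyperbolic form an extra unit of waiting costs proportionally less the further away the reward already is.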

Temporal discounting has received considerable attention in human behavioral neuroscience, not least because many forms of maladaptive behavior are readily characterized as pursuit of immediate gratification at the expense of reaping greater rewards in the future (Critchfield and Kollins, 2001; Bickel et al., 2007, 2014a; Koffarnus et al., 2013; Story et al., 2014). Indeed, lending validity to the discounting construct, steeper discounting is positively associated with behaviors with potentially harmful long-term consequences such as tobacco smoking (Odum et al., 2002; Epstein et al., 2003; Reynolds et al., 2004; Bickel et al., 2008; MacKillop and Kahler, 2009; Fields et al., 2009a,b; Reynolds and Fields, 2012), alcohol use (Van Oers et al., 1999; Mazas et al., 2000; Petry, 2001; Field et al., 2007; Reynolds et al., 2007; Rossow, 2008; MacKillop and Kahler, 2009; Moore and Cusens, 2010), illicit drug misuse (Kirby et al., 1999; Petry and Casarella, 1999; Kollins, 2003; Petry, 2003; Kirby and Petry, 2004; Washio et al., 2011; Stanger et al., 2012), credit card debt (Meier and Sprenger, 2012) and risky sexual or drug-taking practices (Odum et al., 2000; Dierst-Davies et al., 2011). Also, many authors have explored how discounting relates to demographic variables, finding that measured discounting decreases across the lifespan (Green et al., 1996, 1999; Chao et al., 2009; Steinberg et al., 2009), is negatively correlated with income (Green et al., 1996; Eckel et al., 2005; Reimers et al., 2009), and tends to be lower in individuals living in the developed world than in the developing world (Wang et al., 2010). 
Furthermore, although discounting is sensitive to a gamut of contextual factors (for a review see Koffarnus et al., 2013), the level of discounting has been shown to exhibit high test-retest reliability when measured under similar conditions (Odum, 2011), and the extent of individual discounting for different forms of reward is correlated (Odum, 2011), suggesting that discounting has a substantial trait component.

More recently, researchers have taken an interest in comparing discounting behavior in groups who exhibit symptoms of a given psychiatric disorder and those who do not. These studies have found evidence for steeper discounting amongst patients with symptoms of schizophrenia (Heerey et al., 2007, 2011; Ahn et al., 2011; MacKillop and Tidey, 2011; Wing et al., 2012; Avsar et al., 2013; Weller et al., 2014), depression (Takahashi et al., 2008; Dennhardt and Murphy, 2011; Dombrovski et al., 2012; Imhoff et al., 2014; Pulcu et al., 2014), mania (Mason et al., 2012), attention deficit hyperactivity disorder (ADHD) (Barkley et al., 2001; Tripp and Alsop, 2001; Bitsakou et al., 2009; Paloyelis et al., 2010a,b; Scheres et al., 2010; Scheres and Hamaker, 2010), anxiety disorder (Rounds et al., 2007) and cluster B personality disorder (Dougherty et al., 1999; Moeller et al., 2002; Petry, 2002; Dom et al., 2006a,b; Lawrence et al., 2010; Coffey et al., 2011). This line of enquiry is not without theoretical justification, for example the broader construct of impulsivity, defined as taking action without forethought or regard for consequences (Moeller et al., 2001), of which discounting is an element, is a defining feature of some psychiatric disorders, for example borderline personality disorder (Moeller et al., 2001; DSM V, 2013) and mania (Swann, 2009). Also, psychiatric disorders are strongly associated with poor health choices, including but not limited to cigarette smoking, and drug and alcohol misuse (Robson and Gray, 2007), which have themselves been associated with steeper discounting (Bickel et al., 2012b, 2014a,b; Story et al., 2014). However, in many cases this research, although clearly valuable, appears to have been opportunist.

In this article we attempt to understand increases in discounting seen across a range of psychiatric disorders in light of the reasons why people should discount the future in the first place. We propose that the study of intertemporal impulsivity in psychiatric disorders would benefit from fractionating these underlying motives, and that parsing discounting in this manner can assist in drawing out the contributing psychological and biological processes. Our approach follows that of the neuroscientist David Marr (Marr, 1982), who proposed that information processing systems can be understood at three levels of analysis: a "computational" level, specifying what information processing problem is being solved by the system, an "algorithmic" level, formalizing how the system attempts to solve the problem, and an "implementational" level, denoting how these processes are realized physically.

For the case of discounting, the computational problem is easily defined in economic terms: to optimize the sum of future reward. However, this definition obscures a difficult question as to what constitutes "reward" (Moutoussis et al., 2015). It is convenient here to assume that all biological agents share some fundamental objective function. Rather than attempting to characterize the objective function directly, we assume some consensus on the kinds of outcome that organisms often seek, and that can therefore be considered "rewarding." We then consider a subset of generic scenarios under which behavior consistent with discounting would indeed optimize the sum of future "reward." This will give us some insight as to the contexts that agents, who discount future reward in different ways, including humans deemed to have mental disorders, might be adapted to.

We go on to speculate as to the broad classes of algorithms that biological agents might use to optimize reward, and where relevant their possible neural implementation. We argue that the application of this approach to psychiatric disorders, the bedrock of the emerging field of computational psychiatry (Huys et al., 2011; Montague et al., 2012; Friston et al., 2014; Stephan and Mathys, 2014; Wang and Krystal, 2014), can help to bridge a gap between psychological and biological conceptions of mental ill health (for further discussion see Moutoussis et al., 2015).

## MARR'S COMPUTATIONAL LEVEL: REASONS TO DISCOUNT FUTURE REWARD

The discount function estimated from the analysis of intertemporal choice paradigms is likely to reflect the influence of factors jointly serving to make impatience potentially advantageous. A key ambiguity in the classical economic model concerns whether these factors should be properly assigned to the time series of future rewards, or to the discount function (Frederick et al., 2002; Frederick and Loewenstein, 2008; Friston et al., 2013; for a review of contextual influences on discounting see Koffarnus et al., 2013). The following discussion illustrates that if they are made fully explicit in the utility function, behavior consistent with temporal discounting emerges.

## Opportunity Cost

#### Growth and Missed Investment

For most organisms growth and development are necessary to reach reproductive capacity (Williams, 1957). For humans, development also extends to furthering one's social status. Growth potential motivates obtaining rewards sooner rather than later, since earlier rewards can be invested, effectively loaned out at some rate of interest (see Rachlin, 2006; Kacelnik, 2011). The form of discounting that results depends on whether or not interest can be re-invested. Under the most straightforward scenario, referred to as simple interest, interest is not reinvested during the term of the loan. Consider a reward with utility r (for simplicity we omit the instantaneous utility function) invested for a period of time, d, to yield a larger payout, R. With simple interest:

$$R = r + krd\tag{4}$$

Solving for r and expressing as a ratio of the payout gives:

$$\frac{r}{R} = \frac{1}{1+kd} \tag{5}$$

A decision-maker should therefore be indifferent between a larger reward of utility, R, received after a delay, d, and a smaller reward, r, received immediately. Thus, linear growth (simple interest) motivates hyperbolic discounting (see Read, 2004; Rachlin, 2006).

In the above example, after the delay has lapsed the agent ought to reclaim their money and re-invest the entire payout to avoid losing out to a lower rate of interest. Compound interest represents a continual reinvestment of the payout, and generates exponential growth, such that the payout accrued at time d after choosing r is given by:

$$R = r e^{gd} \tag{6}$$

Where g reflects the interest rate. Rearranging as before gives:

$$\frac{r}{R} = e^{-gd} \tag{7}$$

Thus, compound interest motivates exponential discounting.
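The two growth scenarios can be checked numerically. In this sketch the principal, interest rate and delay are hypothetical, and the same numerical rate is used for k and g purely for comparison.

```python
import math


def simple_interest_payout(r, k, d):
    """Equation 4: interest accrues on the principal only."""
    return r + k * r * d


def compound_interest_payout(r, g, d):
    """Equation 6: interest is continually reinvested."""
    return r * math.exp(g * d)


r, rate, d = 50.0, 0.1, 6.0   # hypothetical principal, rate and delay

# Ratio of the immediate reward to the payout it would grow into.
ratio_simple = r / simple_interest_payout(r, rate, d)
ratio_compound = r / compound_interest_payout(r, rate, d)

print(f"simple:   r/R = {ratio_simple:.4f}  vs 1/(1+kd) = {1 / (1 + rate * d):.4f}")
print(f"compound: r/R = {ratio_compound:.4f}  vs exp(-gd) = {math.exp(-rate * d):.4f}")
```

The ratio r/R is the factor by which the delayed payout must be scaled to match the immediate reward, so it plays the role of the discount function in Equations 5 and 7.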

#### Missed Income

In the natural world, delay often entails inactive waiting, during which other sources of reward cannot be harvested. The cost associated with an inactive delay can be quantified as the reward that is missed out on while waiting (Kacelnik, 2011). Under one such formulation, organisms should consequently choose an action which maximizes a rate of reward per unit time, a concept that has arisen in ecological theory independently from the notion of discounting (Stevens and Krebs, 1986). Under this formulation, discounted value is simply inversely proportional to delay (Chung and Herrnstein, 1967). It can be easily shown however that if even "immediate" rewards are associated with some small delay, m, where m = 1/k, this is equivalent to hyperbolic discounting (Daw and Touretzky, 2000). Thus, at indifference:

$$\frac{r}{m} = \frac{R}{m+d} \tag{8}$$

Rearranging as previously:

$$\frac{r}{R} = \frac{m}{m+d} = \frac{1}{1+d/m} = \frac{1}{1+kd} \tag{9}$$

A corollary of this theory is that the opportunity cost of delaying reward on a particular option depends on the average rate of reward from all other options (Chung and Herrnstein, 1967; Daw and Touretzky, 2000; Niv et al., 2007).
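The rate-maximization account can be sketched as follows, assuming a hypothetical minimum delay m attached to every option. The function names and numerical values are illustrative only.

```python
def reward_rate(magnitude, delay, m):
    """Reward per unit time when every option carries a minimum delay m."""
    return magnitude / (m + delay)


def choose_by_rate(r, R, d, m):
    """Pick the option with the higher rate; 'equal' at indifference (Equation 8)."""
    now, later = reward_rate(r, 0.0, m), reward_rate(R, d, m)
    if abs(now - later) < 1e-12:
        return "equal"
    return "sooner" if now > later else "later"


m = 2.0                 # hypothetical minimum delay, so k = 1/m = 0.5
R, d = 100.0, 6.0       # hypothetical larger-later reward and its delay

# Equation 9: immediate amount at which the two rates are equal.
indifference_r = R / (1.0 + d / m)

print(indifference_r, choose_by_rate(indifference_r, R, d, m))
print(choose_by_rate(30.0, R, d, m), choose_by_rate(20.0, R, d, m))
```

An agent that only compares reward rates therefore behaves as if it applied a hyperbolic discount function with k = 1/m, without representing that function anywhere.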

Inactive waiting leads to interesting results if other options become available only once the delays associated with the current choice have lapsed. Consider for example a lawyer who is paid by the hour for seeing clients at weekdays, but does not work at weekends. Say that he or she has two lunch options, either waiting in a long queue for a tasty lunch at a popular café, or being able to buy an equally calorific but less enjoyable meal straightaway at a sandwich bar. The lawyer might be optimally inclined to choose the sandwich bar on weekdays, so as to facilitate a sooner return to work, but might choose to wait at the café if faced with the same choice on a weekend. Here the intertemporal choice is influenced by other available sources of reward, which are inaccessible during the delay. In ecological terms, if an organism is foraging in a reward-rich area, the opportunity cost of delaying foraging by engaging in other activities is greater than when foraging in a reward-poor area (Niv et al., 2007).

Thus, expressed in terms of the total reward received, and letting the average rate of reward available after the delay be signified by ρ, then at indifference:

$$R = r + \rho d\tag{10}$$

Thus:

$$r = R - \rho d \tag{11}$$

This arrangement allows for the possibility that a delayed reward carries negative value, whereby a decision-maker would be willing to pay so as to be able to resume seeking rewards at the average rate, rather than wait for the delayed reward.
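The lawyer example can be put into numbers with a small sketch of Equation 11. All values (the utility of the café lunch, the queue length and the hourly income) are hypothetical and expressed in arbitrary common units.

```python
def immediate_equivalent(R, rho, d):
    """Equation 11: immediate reward worth the same as R after a delay d,
    when waiting forgoes income at an average rate rho."""
    return R - rho * d


# Hypothetical numbers: a lunch worth 30 units after a 0.5 h queue.
weekday = immediate_equivalent(R=30.0, rho=100.0, d=0.5)   # billable time is lost
weekend = immediate_equivalent(R=30.0, rho=0.0, d=0.5)     # nothing is forgone

print(f"weekday value of queuing: {weekday}, weekend value: {weekend}")
```

On the weekday the delayed lunch has negative net value, so any immediately available meal of non-negative value is preferred, whereas at the weekend the queue costs nothing.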

## Uncertainty

#### Probability and Hazard

Whenever reward (capital) is stored for the future, for example when a person lends money to another person or when an animal stores food, there is some possibility that the capital will be lost (for example if a conspecific raids the food store or the debtor defaults on their loan). If there is some constant probability per unit time, referred to as a hazard rate, that future rewards do not materialize as promised, the expected value of reward (magnitude × probability) decreases exponentially with delay and gives rise to exponential discounting (Sozou, 1998).

Following the notation above at indifference:

$$r = R e^{-\lambda t} \tag{12}$$

Rearranging:

$$\frac{r}{R} = e^{-\lambda t} \tag{13}$$

Where λ denotes a constant hazard rate.

Thus, the agent choosing whether to store reward should adopt a discount rate appropriate to the estimated hazard rate. For example a creditor ought to demand a rate of interest that is commensurate with the risk of the debtor's chance of default per unit time. Interestingly, where the appropriate hazard rate is uncertain, decision-makers ought to weight each possible hazard rate by its probability of being the true rate; such a weighted average of exponential rates approximates hyperbolic discounting (Sozou, 1998; Kurth-Nelson and Redish, 2009). As shown by Sozou (1998), hyperbolic discounting results exactly if:

$$\int_0^\infty f\left(\lambda\right) e^{-\lambda t} \,d\lambda = \frac{1}{1+kt} \tag{14}$$

Where f (λ) is a probability density function over hazard rates. The above is satisfied if:

$$f\left(\lambda\right) = \frac{1}{k}e^{-\lambda/k} \tag{15}$$

i.e., if there is an exponential prior distribution over hazard rates, where k determines the shape of this distribution. In support of Sozou's theory, Takahashi et al. (2007) find that the subjective probability of receiving delayed reward in standard intertemporal choice tasks indeed decays hyperbolically.
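The equivalence between Equations 14 and 15 can be verified numerically. The sketch below integrates over hazard rates with a simple trapezoid rule; the value of k, the truncation point of the integral and the number of steps are arbitrary choices made for illustration.

```python
import math


def hazard_prior(lam, k):
    """Equation 15: exponential prior over hazard rates, with mean k."""
    return math.exp(-lam / k) / k


def expected_survival(t, k, upper=None, steps=100_000):
    """Trapezoid-rule estimate of the integral in Equation 14."""
    # Truncate the infinite upper limit where the prior is negligible.
    upper = 50.0 * k if upper is None else upper
    h = upper / steps

    def integrand(lam):
        return hazard_prior(lam, k) * math.exp(-lam * t)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h


k = 0.5
for t in (0.0, 1.0, 4.0, 10.0):
    numeric = expected_survival(t, k)
    closed_form = 1.0 / (1.0 + k * t)
    print(f"t={t:>4}: numeric={numeric:.6f}  hyperbolic={closed_form:.6f}")
```

Each individual hazard rate implies exponential decay, yet averaging over the agent's uncertainty about which rate applies yields the hyperbolic form, with k equal to the mean of the prior.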

As the quotation at the start of this article encapsulates, death creates a fundamental motive not to defer rewards for too long. In computational terms death can be considered to be an absorbing state, from which no future reward can be harvested. Notably a hazard rate for the event of dying can be seen to depend on the organism's current state, such that a greater physiological deficit is associated with a greater probability of dying per unit time. The fundamental value of reward is then its effect to reduce the hazard rate for dying (before successfully securing one's legacy). This argument suggests that it is optimal for biological agents to discount future reward more steeply when they are currently far from a physiological set point, based simply on an increased probability of their dying before future reward is attained.

#### Volatility

In summary, environmental hazards create a motive to discount the future, since future rewards might not materialize as promised. In addition, the utility of future rewards might be more uncertain, in the sense of having higher variance than immediate rewards (when the variance is known the resulting uncertainty is referred to as risk). Many behavioral economic studies have shown that people tend to be risk averse (Kahneman and Tversky, 1979; Holt and Laury, 2002; Trepel et al., 2005; Andersen et al., 2008; Platt and Huettel, 2008; Jones and Rachlin, 2009), in so far as they will accept a smaller expected payoff over a larger expected payoff with higher variance. If future events tend to evolve with a random component, the uncertainty associated with future events increases with delay (Mathys et al., 2011). To take an example, a decision-maker responding to a discounting questionnaire might have some degree of uncertainty about the subjective utility of a \$20 payout received immediately (if this appears implausible, imagine being paid in a foreign currency, whose worth is uncertain). However, owing to volatility governing future events in their lives (e.g., becoming ill, falling into debt, national economic collapse), uncertainty regarding the utility of the \$20 ought to increase as it is delayed. In combination with risk aversion this motivates delay discounting. In support of this idea, individual discount rates are correlated with risk aversion (Leigh, 1986; Anderhub et al., 2001; Eckel et al., 2005; Jones and Rachlin, 2009; Dohmen et al., 2010).
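The volatility argument can be sketched under two explicit assumptions that are not specified in the text: that the variance of a payoff's utility grows linearly with delay, as under a random walk, and that a risk-averse agent values a prospect by its mean minus a penalty proportional to its variance. All parameter values are hypothetical.

```python
def variance_at_delay(d, initial_var, volatility):
    """Variance of the payoff's utility after delay d under a random walk
    (assumption: variance grows linearly with delay)."""
    return initial_var + volatility * d


def certainty_equivalent(mean, variance, risk_aversion):
    """Mean-variance valuation (assumption): variance is penalized in
    proportion to the agent's risk aversion."""
    return mean - 0.5 * risk_aversion * variance


def value_of_delayed_payoff(mean, d, initial_var=1.0, volatility=2.0,
                            risk_aversion=0.5):
    variance = variance_at_delay(d, initial_var, volatility)
    return certainty_equivalent(mean, variance, risk_aversion)


values = [value_of_delayed_payoff(20.0, d) for d in (0, 2, 4, 8)]
risk_neutral = [value_of_delayed_payoff(20.0, d, risk_aversion=0.0)
                for d in (0, 2, 4, 8)]
print(values, risk_neutral)
```

In this sketch the expected payoff never changes, so all of the apparent discounting comes from the interaction of growing uncertainty with risk aversion; a risk-neutral agent shows none.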

Notably, risk aversion can be expressed in terms of probability discounting, which is found to be hyperbolic in the odds against receiving a reward. Whilst probability discounting and temporal discounting are often found to be correlated across individuals (e.g., Jones and Rachlin, 2009), they are subject to distinct influences. For example, increasing reward magnitude increases probability discounting (i.e., risk aversion) and decreases temporal discounting (Green and Myerson, 2004). This is often taken as evidence that temporal discounting does not encompass an estimate of the risk associated with future rewards. However, pertinent to discounting is how a person estimates risk to be dependent on delay. Probability discounting offers a measure of risk aversion but does not access this time-dependent representation of risk. In support of this idea Takahashi et al. (2007) find that while probability and temporal discounting are uncorrelated across individuals, temporal discounting does correlate with the rate of decay in the subjective probability of receiving reward after increasing delay. This may help explain why psychiatric disorders are often associated with increased inter-temporal discounting but not necessarily with excessive probability discounting.

## MARR'S ALGORITHMIC LEVEL: PROCESSES SUB-SERVING INTERTEMPORAL CHOICE

In the preceding analysis we have outlined some generic scenarios under which behavior consistent with discounting would be optimal. These scenarios illustrate that discounting need not be considered as a unitary process, but rather as (implicitly or explicitly) reflecting an expectation of different environmental contingencies. Under reinforcement learning formulations, such contingencies are seen as engendering transitions in a state-space (Sutton and Barto, 1998; Dayan and Balleine, 2002; Dayan and Daw, 2008; Kurth-Nelson and Redish, 2009). That is, an action is assumed to move the agent from one (discrete) state to another, where each state may be associated with a varying quantity of reward. The state-space is equivalent to the vector of rewards described in the classical economic model (Equation 2), though it may also be made contingent on the agent's future behavior, giving rise to a matrix, or "decision-tree." A key question for this account is whether the (discounted) utility of a delayed reward is directly parameterized, which is to say that there is no more inference or learning beyond the state where this utility is considered, or whether the delayed reward is instead considered as part of a cascade of preceding states.

## A Parametric Discount Function?

If higher organisms indeed represent a discount function parametrically, they would require a widespread and efficient system for making this information accessible for decision-making. Neuromodulatory systems, with their diffuse connections to many areas of the brain, would be well placed to achieve this, and several authors have speculated that neuromodulators, such as dopamine and norepinephrine, might represent some of the relevant parameters. For example, Niv et al. (2007) have proposed that the average rate of reward is signaled in the mammalian brain by tonic levels of extracellular dopamine in the striatum, suggesting that increased striatal dopamine availability might increase discounting by increasing the implicit opportunity cost of delay. Consistent with this hypothesis, systemic administration in humans of the dopamine precursor L-DOPA increases discount rates (Pine et al., 2010). Potentially countervailing evidence, however, is that decreasing dopamine transmission in rats by administration of haloperidol (Denk et al., 2005) or flupenthixol (Floresco et al., 2008) has been found to increase discounting, or in other studies to exert no significant effect on discounting (Winstanley et al., 2005).

Similarly, a good deal of decision-making neuroscience seeks to uncover how uncertainty is represented neurally (see Behrens et al., 2007; Wilson et al., 2010; Mathys et al., 2011; Nassar et al., 2012). A recent suggestion is that operating in an unstable environment is associated with tonic release (over a time course of minutes) of norepinephrine (Yu and Dayan, 2003, 2005). This would suggest that tonic norepinephrine might signal environmental volatility, and thus influence discounting. Clearly, further psychopharmacological work is needed to fully uncover the role of monoaminergic signaling in discounting behavior. Also, if organisms indeed have a parametric model of discounting in the strictest sense, then this ought to be revealed in the manner in which estimates of discounting are updated in light of changes in the environment, and careful behavioral work is required to probe this possibility.

## Discounting as a Revealed Phenomenon

According to a second possibility outlined above, choosing a delayed reward leads to a cascade of states, and may (or may not) lead to the promised reward, which if it occurs, may be delivered in a variety of future states (just in time for Christmas, after I've been killed by a bus, etc.) (see Peters and Büchel, 2010). If an agent uses this cascade of states to evaluate their actions, only the resulting transitions will endow this action with whatever value percolates through from the end states. Here discounting takes place due to learning and/or inference, where the value of the reward gradually evaporates as inference (or learning) propagates through a cascade of states. Given the properties of organisms and their environments, as outlined above, behavior consistent with discounting would simply emerge as the end result of applying these learning processes to situations where there is delay in the receipt of reward. Under this possibility, in terms of the economic model, all relevant information is summarized in an agent's utility function, which then implicitly incorporates the discount function. It appears likely that organisms use parallel mechanisms to calculate the value of the resulting state-space, operating across different timescales of information integration, ranging from updating innate behaviors through evolution, through learning from experience, to inferring future states via deployment of a cognitive map or model of the world.

Reliable valuations may be refined and passed on through genetic inheritance and evolution. For example, the possibility of death, and its associated opportunity cost, is likely incorporated through evolution, whereby internal states deviating from a homeostatic ideal, such as hunger and thirst, are assigned an innate cost as a proxy (see Keramati and Gutkin, 2011). Thus, discounting for food would be expected to increase when hungry, due to innate negative value associated with prolonging a state of hunger. Furthermore, actions themselves might in some cases be selected from an innately determined repertoire. Through Pavlovian conditioning, a stimulus (termed unconditioned stimulus, US, e.g., food) that elicits an innate response (the unconditioned response, e.g., salivation), can become associated with another stimulus (conditioned stimulus, CS, e.g., a tone), such that the latter subsequently becomes capable of eliciting an appropriate innate response independently (Rescorla and Solomon, 1967; Williams and Williams, 1969; Hershberger, 1986; Pavlov, 2003). Here the conditioning process, whereby CS becomes associated with US, can incorporate the cost of delay to conform to the optimal adaptations of some of the computational processes above. For example, if delivery of food follows a tone, with an intervening delay of 10 s, the "Pavlovian value" of the tone may be temporally discounted by a given proportion per unit time relative to that of the food (Domjan, 2003). Algorithmic accounts of classical conditioning, such as temporal difference learning, thus incorporate an exponential discount factor (O'Doherty et al., 2003; Moutoussis et al., 2008; Dayan, 2009; Kurth-Nelson and Redish, 2009). Exactly how such discounting is represented at a neurobiological process level remains unclear, but the influences outlined must be important. 
For example, the incremental process of temporal-difference learning, including Rescorla-Wagner learning (Domjan, 2003), means that the strength of the association between CS and US comes to reflect their probabilistic relationship.
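How an exponential discount factor enters temporal difference learning can be illustrated with a minimal sketch of TD(0) value learning on a deterministic chain of states running from tone onset to food delivery. The chain length, learning rate, discount factor and number of episodes are illustrative assumptions and do not model any particular experiment.

```python
def td_chain_values(delay_steps, reward, gamma, alpha=0.1, episodes=2000):
    """TD(0) value learning on a deterministic chain: tone -> wait ... -> food.

    State 0 is tone onset; leaving the last state delivers `reward`.
    """
    values = [0.0] * delay_steps
    for _ in range(episodes):
        for s in range(delay_steps):
            last = s == delay_steps - 1
            r = reward if last else 0.0
            next_value = 0.0 if last else values[s + 1]
            # Temporal-difference error: discounted outcome minus prediction.
            delta = r + gamma * next_value - values[s]
            values[s] += alpha * delta
    return values


values = td_chain_values(delay_steps=10, reward=1.0, gamma=0.9)
tone_value = values[0]
print(f"learned value of the tone: {tone_value:.4f}")
print(f"gamma ** 9 * reward:       {0.9 ** 9:.4f}")
```

After learning, the value of each state falls by the factor gamma for every step separating it from the food, so the tone's value is an exponentially discounted copy of the reward even though no delay is ever represented explicitly.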

Organisms can also learn the value of actions based simply on whether or not they yielded benefits in the past, referred to as instrumental conditioning (Domjan, 2003). In algorithmic terms, this can be most parsimoniously achieved by integrating the history of reinforcement following a given action, without representing an explicit model of the relationship between actions and their outcomes (Watkins and Dayan, 1992; Daw et al., 2005; Seymour et al., 2005; Schultz, 2006; Moutoussis et al., 2008; McDannald et al., 2011). This is referred to as model-free reinforcement learning, and corresponds to the "Thorndikian" Law of Effect (Thorndike, 1927), or "habit" learning (Dickinson et al., 1995; Ouellette, 1998; Neal, 2006; Tricomi et al., 2009; Dolan and Dayan, 2013; Orbell and Verplanken, 2014). Instrumental learning would be expected to incorporate discounting, to the extent that the environmental influences described earlier in this article affect the timecourse of reward contingent on a particular action.

Finally, biological agents can avail themselves of a cognitive map, or model, of the world, detailing the results of different actions and their respective values (Dickinson and Balleine, 1994; Balleine and Dickinson, 1998; Gläscher et al., 2010; Daw et al., 2011; McDannald et al., 2011). The choice of action proceeds by thinking forward through the map (or tree), and considering the consequences of alternative actions (see Seymour and Dolan, 2008). This mode of control is referred to in reinforcement learning applications as model-based (Gläscher et al., 2010; Daw et al., 2011; Wunderlich et al., 2012; Smittenaar et al., 2013; Lucantonio et al., 2014), and corresponds to the definition of goal-directed behavior in animal learning as being rapidly sensitive to changes in the contingency between action and outcome, or to devaluing the outcome (Dickinson and Balleine, 1994; Balleine and Dickinson, 1998). An advantage of the model-based approach lies in its flexibility. In particular, this approach is necessary to generate appropriate intertemporal choices in esoteric scenarios, to which a smooth discount function is not well adapted. For example, say a generous experimenter offers me a choice between \$100 today and \$125 4 weeks from today. The knowledge that I will be receiving my monthly pay of \$1000 exactly 4 weeks from today, and that without additional income I am likely to exceed my overdraft limit next week by around \$50, incurring a heavy fine, would likely encourage me to choose the immediate money. If I were to try to choose between the immediate and delayed money according to a parametric discount function alone, without considering extraneous sources of (dis)utility, I might lose out to the overdraft fine. In summary, through the above innate and instrumental learning processes, given appropriate experience of the cost of delay, an organism can behave in a manner consistent with discounting without directly computing discounted value at all.
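
The overdraft scenario can be worked through numerically. The sketch below is illustrative only: the size of the fine (\$60) and the hyperbolic discount rate (0.005 per day) are assumed values that do not appear in the text, and the function names are hypothetical. It contrasts a decision made from a parametric discount function alone with one made by looking forward through the known events of the coming month.

```python
def hyperbolic_value(amount, delay_days, k):
    """Parametric (hyperbolic) discounted value of a single reward."""
    return amount / (1.0 + k * delay_days)


def net_outcome(amount, delay_days, shortfall=50.0, shortfall_day=7, fine=60.0):
    """Money received minus any overdraft fine, found by thinking forward.

    The fine (an assumed 60) is avoided only if the money arrives on or before
    the day of the shortfall and is large enough to cover it.
    """
    covered = delay_days <= shortfall_day and amount >= shortfall
    return amount - (0.0 if covered else fine)


k = 0.005  # assumed discount rate per day
print(hyperbolic_value(100, 0, k), hyperbolic_value(125, 28, k))  # delayed option looks better
print(net_outcome(100, 0), net_outcome(125, 28))                  # immediate option is better
```

Under these assumptions the discount function alone favors waiting for \$125, whereas the forward-looking evaluation favors the immediate \$100, because any fine above \$25 outweighs the extra money gained by waiting.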

## (MAL)ADAPTIVE DISCOUNTING IN PSYCHIATRIC DISORDERS

We propose that, whether parametric or revealed through the above valuation processes, discounting nevertheless represents an encoding of different environmental contingencies. Where changes in discounting are observed, for example in psychiatric disorders, it is therefore worthwhile to consider such changes in light of the environment to which a given individual might be "tuned" (see also Del Giudice, 2014). The key point here is that the decision-maker brings to a laboratory intertemporal choice task their previous experience of delay, and may also consider the rewards of the task in the context of other future outcomes they expect to receive. We consider particular instances of this below.

## Mania as a State of Increased Opportunity Cost

Might steeper discounting in some pathological states reflect increased estimates of opportunity cost? In support of discounting being sensitive to changes in opportunity cost, discount rates for money have been shown to increase in line with increases in inflation (Ostaszewski et al., 1998). More speculatively, steeper discount rates in childhood and adolescence which decline into adulthood (Green et al., 1999; Chao et al., 2009; Steinberg et al., 2009) might even reflect greater potential for growth in adolescence. We propose that the pathological state of mania is associated with perceived high rates of reward and high growth potential, creating a heightened opportunity cost associated with inaction. Mania is known to be associated with impulsive behavior, such as overspending, rash financial decision-making or drug-taking (Swann, 2009), and one study (Mason et al., 2012) finds evidence for steeper discounting in an intertemporal choice task with real-time delays in the order of seconds in individuals prone to hypomanic symptoms.

Notably, growth potential creates something of a paradox. On the one hand, investing reward to achieve growth implies that the decision maker has adopted a long-term view. On the other hand, having something worthwhile to invest in favors choices that obtain rewards sooner rather than later, so that they too can be invested. For example, imagine you are starting a new business venture. Whilst this is necessarily a long-term project, you might sacrifice other potential rewards, such as your health or relationships, in order to invest resources in the business, which can be seen as borrowing predicated on a high level of return from your new business. Manic individuals generate novel, and often unrealistically ambitious, goals, for example, enrolling on education courses, or indeed starting new business ventures (DSM V, 2013). We propose that these goals create high opportunity costs to delaying reward, increasing preference for immediate rewards, so as to enlist resources for goal-pursuit. This offers a putative psychological explanation for why increased impulsivity in mania (Swann, 2009), including steeper discounting (Mason et al., 2012), manifests alongside an apparent increase in goal-directed activity.
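
The link between perceived growth potential and impatience can be illustrated with simple compounding. In the sketch below, which rests on assumed numbers and a hypothetical function name, a reward taken now is reinvested at a believed growth rate per week; the immediate option is preferred whenever its compounded value exceeds the delayed reward.

```python
def prefers_immediate(small_now, large_later, delay_weeks, growth_rate):
    """True if reinvesting the immediate reward beats waiting for the larger one.

    growth_rate is the believed return per week on resources invested now
    (an assumed, illustrative quantity).
    """
    compounded = small_now * (1.0 + growth_rate) ** delay_weeks
    return compounded > large_later


# Modest believed growth (2% per week): 100 grows to about 108, so waiting for 125 is better
print(prefers_immediate(100, 125, 4, 0.02))
# Inflated believed growth (10% per week): 100 grows to about 146, so taking 100 now is better
print(prefers_immediate(100, 125, 4, 0.10))
```

On this view an inflated belief about growth, rather than any change in the discount function itself, is sufficient to produce choices that look steeply discounted.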

The investment in apparently long-term goals in mania seems to occur at the expense of patients correctly "playing out" or "forward modeling" future scenarios themselves. This explains why the same (mal)adaptation is found across several behavioral domains. McClure and colleagues (McClure et al., 2004, 2007) have suggested that the explicit influence of larger-later options on behavior is associated with greater cognitive control, which is reduced in mania in tandem with prefrontal activation (Murphy et al., 1999; Townsend et al., 2010). This reduction in "forward modeling" is in fact consistent with, if not necessary for, the suggestion we make here. That is, if a person with mania were to consider in detail the path ahead leading to their goals, they would realize that the projection implicit in their growth estimate is unrealistic and they would feel able to afford to be patient. A further interesting possibility, discussed further below, is that such forward modeling itself takes time, and that in the face of high opportunity costs, the depth of such model-based strategies is reduced in favor of more rough-and-ready heuristics, or more Pavlovian or habitual responding (Dezfouli, 2009; Huys et al., 2012). Future investigations of mania might focus on measuring beliefs about growth and opportunity cost directly, and whether such beliefs correlate with changes in discounting. Interestingly, Dezfouli (2009) similarly proposes that the abnormally high rewards engendered by drugs of abuse lead to an artificially elevated estimate of the average reward rate in the environment, and that this accounts for increased discounting seen amongst substance abusers (e.g., Kirby et al., 1999; Kollins, 2003; Kirby and Petry, 2004).

Finally, we have shown above how an increase in the rate of reward available from activities other than those currently on offer increases impatience to complete the current activity as soon as possible (i.e., increases discounting for rewards obtained from the task in hand). Niv et al. (2007) use the same approach to explain variations in response vigor. In their model they propose that the agent can choose to reduce latency of its responses, at some energetic cost that is proportional to the latency reduction. Thus, choosing how quickly to perform a particular action itself becomes an intertemporal choice. As their model illustrates, greater vigor (shorter response latency) is then optimal where the average reward rate is higher, in order that agents can resume reward seeking as soon as possible. This description accords well with that of mania, where sufferers often describe the need to complete various tasks with great urgency and where the general vigor of behavior is markedly increased. Furthermore, the model of Niv and colleagues incorporates a latency-independent cost associated with switching tasks. As the authors show, at high reward rates latency-dependent costs tend to dwarf the switching cost, leading to greater task switching than at low reward rates. This too is in keeping with behavior exhibited in manic states, where sufferers have difficulty sustaining tasks.
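
The trade-off between the energetic cost of acting quickly and the opportunity cost of acting slowly can be sketched as follows. This is a simplified illustration, not a reproduction of the cited model: it assumes that the energetic cost of responding with latency `tau` takes the form `vigor_cost / tau`, and that each unit of time spent costs the average reward rate forgone. The function and parameter names are hypothetical.

```python
import math


def total_cost(latency, vigor_cost, avg_reward_rate):
    """Energetic cost of fast responding plus opportunity cost of time spent."""
    return vigor_cost / latency + avg_reward_rate * latency


def optimal_latency(vigor_cost, avg_reward_rate):
    """Latency that minimizes total_cost (found by setting its derivative to zero)."""
    return math.sqrt(vigor_cost / avg_reward_rate)


print(optimal_latency(1.0, 1.0))  # 1.0 at a low average reward rate
print(optimal_latency(1.0, 4.0))  # 0.5 at a higher average reward rate
```

Under this assumed cost function the optimal latency shrinks with the square root of the average reward rate, so a belief that the environment is richly rewarding is enough to make faster, more vigorous responding the best policy.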

## Economic Poverty as a Deficit State

In keeping with the normative notion that deficit states increase a hazard rate for losing out on future reward, discounting indeed tends to be higher in states of monetary or physiological deficit. For example, steeper discounting is observed in individuals with lower incomes (Green et al., 1996; Reimers et al., 2009), an effect which remains after controlling for level of education. Of course, such studies are correlational, making it difficult to conclude that changes in income directly alter discounting. However, an interesting study by Callan et al. (2011) provides indirect support for a more causal role of low income in increasing discounting. The authors found that a manipulation which led people to believe that their income was lower than that of their peers brought about an increase in discounting, relative to a group who were led to believe that their income was similar to that of their peers. The manipulation was interpreted as priming personal notions of deservedness, though this might just as easily be formalized as a shift toward a perceived deficit state. In a conceptually related study, Haushofer et al. (2013) performed an experiment in which subjects performed an effort task for monetary reward, after which different groups received either an increase in income from a low starting endowment, or a decrease in income from a high starting endowment. The design thus allowed the effect of (experimental) wealth changes to be dissociated from absolute wealth. Subjects' temporal discount rates were measured before and after the task, with the finding that negative income shocks led to an increase in discounting, while positive income shocks effected a small decrease in discounting. Starting wealth was found to be unrelated to discounting. Notably, the size of an experimental endowment might not be expected to have an effect on discounting, since the endowment was likely to be small in comparison to subjects' total real-world wealth.
The effect of negative income shocks, which might be interpreted as having primed an increased hazard rate for future earnings, suggests that instability in earnings, rather than simply total wealth, is an important determinant of the relationship between socioeconomic status and discounting.

A study in women deprived of food and water (for 4 h after their usual waking time) found that women given a preloading meal prior to testing chose an option leading to the delayed, rather than immediate, delivery of juice significantly more often than women who had not received a preloading meal (Kirk and Logue, 1997). Also, Wang and Dvorak (2010) measured monetary discounting before and after participants drank either a sugary or a sugar-free drink (both caffeine-free), finding a significant decrease in discounting in the group who drank the sugary drink and a significant increase in the control group. This finding suggests that raising blood glucose decreases discounting, an idea congruent with increased discounting associated with deficit states.

Economic poverty may well underlie some of the steeper discounting seen in psychiatric disorders, through an association between mental illness and lower socioeconomic status (e.g., Weich and Lewis, 1998; Lorant et al., 2007), although in several studies associations remain after controlling for socioeconomic characteristics. Notably, there may be an interdependent relationship between low socioeconomic status, discounting, and mental ill health, whereby impatience for rewards leads to maladaptive choices such as substance misuse, which in turn are associated with worsening finances, further increases in discounting and increased risk of psychiatric disorder (e.g., Fields et al., 2009b; Leitão et al., 2013). A similar idea has been championed by Bickel et al. (2014b), who propose that the environment associated with low socioeconomic status promotes steeper discounting, which in turn engenders unhealthy choices, thus contributing to known socioeconomic gradients in health status (Adler and Rehkopf, 2008). This is supported by evidence that cigarette smoking, obesity, alcohol use and illicit drug use all exhibit negative relationships with socioeconomic status (Conner and Norman, 2005), that these behaviors are associated with poor executive functioning (e.g., Bickel et al., 2012a), and that economic poverty is prospectively associated with poor executive functioning (Lupien et al., 2007; Noble et al., 2007; Evans and Schamberg, 2009). We discuss this interaction between environment and cognition in Section The Cost of Thinking in Economic Poverty, Borderline Personality Disorder and Schizophrenia below.

## ADHD as a Deficit State

Interestingly, the effects of deprivation appear to cross modalities of reward. For example, mild opioid deprivation in opioid dependent individuals increases discounting for money as well as heroin (Giordano et al., 2002). Arguably this might be motivated by a desire on the part of subjects to obtain money sooner so as to buy drugs. However, it might equally be attributable to a more global alteration in decision-making associated with physiological deficit states (see also Loewenstein, 1996; Metcalfe and Mischel, 1999). In further support of this idea, exposure to erotic cues increases discounting for money, as well as for candy bars or soda drinks in men (Van den Bergh et al., 2008). Furthermore, the effect of sex cues to increase discounting for food and drink rewards was attenuated by satiation with money, providing evidence for a global physiological signaling mechanism. Niv et al. (2007) propose that this "global drive" mechanism might involve modulation in tonic dopamine signaling.

In some cases steeper discounting observed in psychiatric disorders might reflect processes associated with normal deficit states. ADHD is a possible example. ADHD is defined by behavioral symptoms of inattentiveness, over-activity and impulsivity, of long-standing duration and is most commonly diagnosed in school-aged children (DSM V, 2013). Many studies have shown that children with ADHD have a greater tendency than controls to choose immediate over delayed rewards in single choices (e.g., Sonuga-Barke et al., 1992; Schweitzer and Sulzer-Azaroff, 1995; Kuntsi et al., 2001; Bitsakou et al., 2009; for reviews see Luman et al., 2005; Paloyelis et al., 2009) and (relative to controls) are biased toward choosing tasks which yield earlier, rather than delayed, reinforcement (Tripp and Alsop, 2001). Also, on delay of gratification tasks (Mischel et al., 1989) children with hyperactivity exhibit a greater tendency to terminate the delay to obtain a smaller reward, rather than waiting an allotted time for a larger reward (Rapport et al., 1986). Furthermore, several studies now report steeper monetary discounting in children with ADHD (Paloyelis et al., 2009; Scheres et al., 2010; Wilson et al., 2011; Demurie et al., 2012) or in adults with previous ADHD (Hurst et al., 2011).

We hypothesize that the increased discounting rates found in ADHD reflect not only the well-known genetic vulnerability for this disorder but also the more deprived environments that lead to increased expression of this disorder (Apperley and Mittal, 2013; Russell et al., 2015). In support of this, in one study boys with ADHD symptoms who had been reared in deprived institutions showed increased aversion to delay compared with less deprived ADHD controls (Loman, 2012). Thus, seeking of immediate reward in ADHD might reflect underlying mechanisms linking increased discounting with states of internal deprivation. One such mechanism would be that outlined above of higher rates of reward available from alternative tasks. For example, say that children with ADHD have an internal state resembling a deprivation of loving attention; their performance of tasks that do not offer this attention, such as quiet private study, is likely to be more impatient, so as to more quickly return to actions that do command attention from others.

## Increased Estimates of Uncertainty and Hazard

Although conventional discounting tasks offer choices between rewards that are promised to be delivered with certainty, decision-makers likely come to the task with a prior belief regarding the level of hazard in the environment, and so tend to implicitly distrust the experimenter's assertion that the future rewards are guaranteed. In support of this, discount rates amongst cigarette smokers have been shown to correlate positively with their belief that the future reward will be delivered (Reynolds et al., 2007). Also, within a standard discounting questionnaire, people discount more steeply when rewards are framed as being received from fictive characters rated as untrustworthy, as opposed to from characters perceived as trustworthy (Michaelson et al., 2013).
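
How a prior belief about hazard translates into discounting can be illustrated numerically. The sketch below makes two assumptions that are not taken from the studies cited above: that a promised reward is lost at a constant hazard rate per unit time, and that a decision-maker who is unsure of that rate holds an exponential prior over it. The function names and parameter values are hypothetical. A known hazard gives exponential discounting, whereas averaging over the uncertain hazard gives the hyperbolic form `1 / (1 + mean_hazard * t)`.

```python
import math


def survival_known_hazard(t, hazard):
    """Probability that the reward is still available at time t for a known hazard."""
    return math.exp(-hazard * t)


def survival_uncertain_hazard(t, mean_hazard, n=100_000, upper_mult=40.0):
    """Average of exp(-hazard * t) over an exponential prior on the hazard.

    Uses a midpoint rule on [0, upper_mult * mean_hazard]; the prior mass
    beyond that range is negligible.
    """
    step = upper_mult * mean_hazard / n
    total = 0.0
    for i in range(n):
        h = (i + 0.5) * step
        prior = math.exp(-h / mean_hazard) / mean_hazard
        total += prior * math.exp(-h * t) * step
    return total


t, mean_hazard = 10.0, 0.1
print(survival_known_hazard(t, mean_hazard))      # exponential: exp(-1)
print(survival_uncertain_hazard(t, mean_hazard))  # numerically close to the line below
print(1.0 / (1.0 + mean_hazard * t))              # hyperbolic: 0.5
```

In this framing, a person who arrives at the task believing the world to be more hazardous (a larger `mean_hazard`) will discount a nominally guaranteed reward more steeply, whatever the experimenter says.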

In an interesting study, Callan et al. (2009) measured discounting in 56 undergraduate students who first watched an interview with an HIV-positive woman. One group were told that she had acquired HIV through unprotected sex and the other group that she had acquired the virus via an infected blood transfusion. The latter group exhibited significantly steeper discounting, an effect which was proposed to result from the story of the infected blood transfusion having primed a belief that the world is unjust. A related explanation, independent of feelings of injustice per se, would be that the transfusion scenario increased the perceived hazard rate for adverse life events.

Finally, as described previously, the ultimate hazard is that one will die before the future reward occurs. In keeping with this, in a South African population, discounting was found to be higher amongst individuals with the lowest perceived survival probability than amongst those with average survival probability (Chao et al., 2009), and to correlate with the number of bereavements of close family members reported by North Americans (a factor putatively increasing perceived mortality risk) (Pepper and Nettle, 2013). Furthermore, discounting has been shown to increase on conscription into the Israeli army (Lahav et al., 2011), and to be higher in youths living in slums in Rio de Janeiro than in an age-matched sample of university students (Ramos et al., 2013).

Populations with psychiatric disorders might well believe that future rewards are less likely to materialize (a higher hazard rate) than do healthy control populations, for quite rational reasons, given their life experiences (Hill et al., 2008). In other words, the past is the best predictor of the future, and this may be why psychiatric disorders associated with hazardous development are characterized by higher discounting rates. Populations with psychiatric illness have experienced an excess of major life events compared with the healthy population (Paykel, 1978), and have excess mortality from physical health conditions compared with the general population (Robson and Gray, 2007). The latter would be expected to be associated with lower perceived survival probability, given correlations between perceived and actual mortality in the general population (Idler and Benyamini, 1997). To our knowledge no previous studies have examined this. Such lowered expectations of the future may in turn result in decisions that perpetuate or worsen the disorder. Indeed, Sonuga-Barke has hypothesized that the high discounting rates measured in the laboratory in youths with conduct disorder represent an accurate summary, and hence one that is adaptive in their native environment, of the increased hazards that these youths so commonly have experienced (Barke, 2014). An interesting possibility for future research would be to elicit beliefs of groups with psychiatric disorder about the likelihood that future reward will be forthcoming, and to regress this against their discounting choices. Similarly, further research is needed to examine relationships between an individual's experience of significant life events, their confidence in the future, and their level of temporal discounting.

## The Cost of Thinking in Economic Poverty, Borderline Personality Disorder and Schizophrenia

It appears that a greater engagement of model-based control, a faculty tightly dependent on working memory, is associated with more future-oriented responses on discounting paradigms. Promoting mental simulations of future outcomes by cueing participants with episodes in their lives corresponding to the timing of the options decreases measured discount rates (Peters and Büchel, 2010). Higher working memory capacity is associated with both lower discounting (Shamosh et al., 2008), and an increased emphasis on model-based control (Eppinger et al., 2013), while working memory training in substance misusers has been found to decrease their delay discounting (Bickel et al., 2011b).

In keeping with the above, functional neuroimaging studies have found that the dorsolateral prefrontal cortex (dlPFC), an area often implicated in tasks dependent on working memory (Curtis and D'Esposito, 2003), is sensitive to model-based learning signals (Gläscher et al., 2010). This area is also known to be active when choosing delayed rewards on intertemporal choice paradigms (McClure et al., 2004, 2007). Furthermore, disrupting dlPFC function (using either transcranial magnetic stimulation or transcranial direct current stimulation) both decreases the emphasis on model-based control (Smittenaar et al., 2013) and increases temporal discounting (Hecht et al., 2013). The process of mentally simulating future outcomes is also known to be dependent on the hippocampus (Hassabis et al., 2007; Johnson et al., 2007; Schacter et al., 2008; Schacter and Schacter, 2008), and rats with hippocampal lesions have been found to exhibit increased discounting (Mariano et al., 2009). Taken together these results suggest that mental simulation of the future tends to generate more patient intertemporal choices, and that this process is working memory dependent.

A plausible explanation for the above is that mentally simulating the future resolves uncertainty about the utility of larger-later rewards (see Daw et al., 2005). For example, I might be uncertain about how much I am likely to require money in 7 months' time, but if I remember that my partner's birthday is in seven and a half months' time, and I anticipate needing the money to buy him or her an expensive present, I might revise my estimate of the utility of the future money. An interesting possibility is that decision-makers face a trade-off between making the best possible decisions and doing so in a timely manner with the minimum of effort. Model-based simulation of the future is computationally costly, i.e., consumes time and energy. If conditions are sufficiently unpredictable, then attempting to explicitly plan out future possibilities is futile, and may even be disadvantageous (see Daw et al., 2005). Thus, prolonged exposure to an unstable environment during development ought to both discourage the use of model-based strategies and increase discounting via greater uncertainty associated with future rewards. This possibility would conceptually bind together an unstable childhood environment, diminished cognitive ability and steeper discounting of reward, providing a tentative theoretical basis for explaining the association between these factors in several psychiatric disorders. For example, people with borderline personality disorder are likely to have experienced childhood abuse (Lewis and Christopher, 1989; Ogata et al., 1990; Zanarini et al., 1997), exhibit below average cognitive function (Swirsky-Sacchetti et al., 1993) and discount the future more steeply than healthy controls (Lawrence et al., 2010).

A similar interaction might in part underlie associations between low socioeconomic status, steeper discounting and psychiatric disorder. Bickel et al. (2011a, 2014a) propose a neuropsychological explanation for relationships between low socioeconomic status and unhealthy lifestyle choices, in terms of a dual-systems model of cognition, whereby low socioeconomic status encourages engagement of a more "impulsive" decision-making system, putatively mediated by limbic brain structures, over an "executive" decision-making system, mediated by parts of frontal cortex. The authors point to evidence that several neurocognitive abilities including working memory, declarative memory, and cognitive control exhibit socioeconomic gradients (Noble et al., 2007). This association appears to hold in prospective analyses too. On a developmental timescale, Evans and Schamberg (2009) show that childhood poverty predicts lower working memory in young adulthood, and that high levels of childhood stress mediate this relationship. State-based effects of poverty on cognitive function are also evident. For example, Indian sugar-cane farmers exhibit worse cognitive performance before their harvest, when they are poor, than after their harvest, when they are richer, even controlling for levels of stress (Mani et al., 2013). The dual-systems approach is not incompatible with our three-way division of behavioral control. The model-based system for instance appears to depend on executive functions such as working memory, but has the advantage of carrying a specific algorithmic meaning. Also, we envisage the three controllers as sharing the mutual goal of maximizing reward (Dayan et al., 2006), and suggest that their relative deployment is also subject to a cost-benefit trade-off (Daw et al., 2005; Dezfouli, 2009; Huys et al., 2012).
We therefore go as far as to propose that diminished deployment of model-based control in states of deprivation might reflect an evolutionary milieu in which such changes were approximately optimal, for example in response to irreducible future uncertainty.

Deficits in future thinking appear likely to underlie steeper discounting seen in patients diagnosed with schizophrenia compared with healthy controls (Heerey et al., 2007, 2011), in keeping with observations that such patients often exhibit cognitive and executive dysfunction. Furthermore, patients with schizophrenia exhibit atrophy of frontal and temporal brain regions (Madsen et al., 1999; Velakoulis et al., 2001; van Haren et al., 2008), a pattern which would be expected to be accompanied by shortened time perspective, given the role of these structures in imagining future scenarios (Hassabis et al., 2007; Johnson et al., 2007; Schacter et al., 2008; Schacter and Schacter, 2008). Heerey et al. (2011) present evidence to support this view, comparing measures of discounting, cognitive function and "future representation" in 39 patients with schizophrenia and 25 healthy control participants. Patients discounted more steeply than controls, and when asked to list events which they thought might happen to them in their lives, on average reported future life-events that were nearer in time. This shortened future perspective correlated with lower working memory scores in both patients and controls, to the extent that controlling for working memory abolished the effect of schizophrenia status on discounting. These results suggest that discounting deficits in schizophrenia are attributable to an impaired ability to imagine the future, a faculty that is limited by working memory capacity.

## FUTURE DIRECTIONS

The above account leaves considerable room for future research. The foregoing discussion has largely focused on appetitive processes evoked in the appraisal of future rewards. A complementary, but distinct, set of principles might apply to how humans evaluate future punishment. For example, as a complement to the theory that tonic dopamine signals the average reward rate, it has been proposed that tonic serotonin signals the long run average punishment rate, and thus controls the vigor of avoidance behavior (Dayan, 2012a,b, see also Crockett et al., 2012). This idea might hold relevance for increased discounting in depression, which is associated with both marked avoidance (Ferster, 1973) and possible serotonergic abnormalities (e.g., Mann et al., 2000). Although a normative account of the role of serotonin in depression remains elusive, it is interesting that decreasing serotonin availability (achieved by tryptophan depletion) in healthy subjects acts to increase discounting (Tanaka et al., 2007; Schweighofer et al., 2008), commensurate with increased discounting seen in depression (Takahashi et al., 2008; Dennhardt and Murphy, 2011; Dombrovski et al., 2011, 2012; Imhoff et al., 2014; Pulcu et al., 2014) (For further discussion of temporal preferences for punishment see Berns et al., 2006; Story et al., 2013, 2015).

A further area for future research concerns the effect of stress on discounting (e.g., Diller et al., 2011; Kimura et al., 2013). A recent meta-analysis (Fields et al., 2014) of 16 studies examining the relationships between delay discounting or delay of gratification and subjective or physiological measures of stress found that stress was associated with steeper discounting, with a large aggregate effect size (Hedges' g = 0.59). Seemingly contradicting these findings, low baseline cortisol levels have been associated with increased delay discounting (Takahashi, 2004), and similarly predict higher discounting at 6 month follow up (Takahashi et al., 2009). A possible explanation would be that baseline stress and responsivity to stress manipulations exert distinct influences on discounting. In part supporting this idea, Lempert et al. (2012) found that when placed under stressful conditions, individuals with low trait perceived stress showed higher discounting than those with high trait perceived stress, perhaps reflecting greater responsiveness to acute stressors in subjects with low trait stress. In addition, acute administration of hydrocortisone, a key hormone involved in stress response, has been found to cause a short-lived increase in discounting (Cornelisse et al., 2013). Further work is required to understand the relationships between baseline and induced stress and their interaction with discounting, as well as to characterize stress in terms of the information content of stressful situations.

The above account has not specifically addressed willpower. Several lines of evidence point to the fact that humans often renege on best-laid plans, in favor of immediate consumption. We propose that this arises because people are poor at predicting in advance the effect of conditioned cues and motivational state changes on their behavior (see also Loewenstein, 1996; Metcalfe and Mischel, 1999; Read, 2001; Chapman, 2005; Dayan et al., 2006; Story et al., 2014). Thus, one might plan to abstain from eating dessert as part of a diet plan, but find it harder to resist when presented with a piece of cake (see for example Read and Van Leeuwen, 1998; Allan et al., 2010), and relapses in drug-taking behavior following abstinence commonly occur after exposure to a previous drug-taking environment (O'Brien et al., 1998). Similarly, people appear poor at predicting their behavior in future motivational states that differ from their current motivational state. For example, in a study of analgesic preferences for childbirth (Christensen-Szalanski, 1984), women asked roughly 1 month in advance of labor preferred to avoid invasive spinal anesthesia in favor of less invasive but less effective pain relief methods; however, during active labor women frequently reversed preference and opted for anesthesia. "Battles of will" then consist in the attempt to punish or extinguish existing habitual or Pavlovian responses through the imposition of countervailing model-based (goal-directed) valuations. Hyperbolic discounting theoretically gives rise to similar intertemporal choice conflicts, but considered alone has difficulty accounting for the state-dependence of real world failures of self-control. Thus, in the study of Christensen-Szalanski (1984) it seems likely to be the transition into a painful state that brings about a shift in women's preferences for analgesia, rather than the time preceding childbirth per se as hyperbolic discounting would suggest.
An interesting direction for future research will be to examine whether individuals with psychiatric disorders, for example borderline personality disorder, exhibit greater choice inconsistency over time, relative to controls. This possibility would accord with a well-esteemed theory that individuals with borderline personality disorder are impaired in modeling mental states (Bateman and Fonagy, 2004).
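
The purely time-driven preference reversal that hyperbolic discounting predicts can be illustrated numerically. The following is a minimal sketch, assuming the standard hyperbolic form V = A / (1 + kD) and, for comparison, an exponential form V = A exp(-rD). The amounts, delays, and discount parameters are hypothetical values chosen only for illustration and are not taken from any of the studies cited above.

```python
import math

# Hypothetical amounts and delays (arbitrary units), chosen only for illustration.
SMALL_AMOUNT, SMALL_DELAY = 50.0, 0.0
LARGE_AMOUNT, LARGE_DELAY = 100.0, 10.0


def hyperbolic_value(amount, delay, k):
    """Discounted value under hyperbolic discounting: A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, rate):
    """Discounted value under exponential discounting: A * exp(-rate * D)."""
    return amount * math.exp(-rate * delay)


def preferred_option(value_fn, lead_time, param):
    """Return the option with the higher discounted value when both
    outcomes are pushed a further lead_time units into the future."""
    small = value_fn(SMALL_AMOUNT, SMALL_DELAY + lead_time, param)
    large = value_fn(LARGE_AMOUNT, LARGE_DELAY + lead_time, param)
    return "larger-later" if large > small else "smaller-sooner"


# Compare the choice made well in advance (lead time 30) with the choice
# made when the smaller reward is immediately available (lead time 0).
for lead_time in (30.0, 0.0):
    print(lead_time,
          preferred_option(hyperbolic_value, lead_time, 0.2),
          preferred_option(exponential_value, lead_time, 0.1))
```

With these assumed values the hyperbolic chooser prefers the larger-later option in advance and switches to the smaller-sooner option once it becomes immediate, whereas the exponential chooser is consistent. The reversal here depends only on elapsed time, which is the feature argued above to be insufficient on its own to explain state-dependent failures of self-control.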

Another interesting direction not explored here concerns discounting of past rewards (Yi et al., 2006; Bickel et al., 2008). Discounting of past rewards has been shown to be systematic and hyperbolic in form, and is correlated with the degree of future discounting across individuals (Yi et al., 2006). Furthermore, cigarette smokers are found to discount past, as well as future, rewards more steeply than non-smokers (Bickel et al., 2008). Symmetry between past and future discounting is in keeping with evidence that remembering the past and imagining the future are both dependent on the hippocampus (Hassabis et al., 2007; Johnson et al., 2007; Schacter et al., 2008; Schacter and Schacter, 2008). Notably, past discounting is difficult to account for directly in terms of some of the informational influences suggested in this article. Growth potential, for example, ought to motivate having received rewards in the distant past, since these should have had time to accrue greater value. Further work is clearly needed to understand the possible normative basis of past discounting. One possibility is that factors tending to foreshorten model-based consideration of future outcomes, such as uncertainty, also diminish retrieval of episodic memories, leading to a narrowing of temporal perspective. Notably, the learning rate in model-free reinforcement learning algorithms corresponds to an exponential discount factor for past reward. Yechiam et al. (2005) have shown that substance misusers and individuals with ventromedial prefrontal cortex lesions both exhibit increased learning rates on the Iowa gambling task, where an excessive focus on recent reinforcement is disadvantageous. This suggests that high learning rates might reflect a form of "retrospective impulsivity," through assigning too little weight to distant past experience. Further work is required to explore this possibility.
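
The correspondence between learning rate and exponential discounting of past reward can be made concrete with a simple delta-rule learner. The sketch below assumes a value update of the form V ← V + α(r − V), which is a generic model-free update and not necessarily the specific model fitted by Yechiam et al. (2005). Unrolling this update shows that a reward received k trials ago carries weight α(1 − α)^k. The reward sequence and learning rates are hypothetical.

```python
def delta_rule(rewards, alpha, initial_value=0.0):
    """Apply the update V <- V + alpha * (r - V) over a reward sequence."""
    value = initial_value
    for reward in rewards:
        value += alpha * (reward - value)
    return value


def past_weight(alpha, steps_back):
    """Weight carried by a reward received steps_back trials ago."""
    return alpha * (1.0 - alpha) ** steps_back


def weighted_sum(rewards, alpha, initial_value=0.0):
    """Same value as delta_rule, written as an exponentially weighted sum."""
    total = (1.0 - alpha) ** len(rewards) * initial_value
    for steps_back, reward in enumerate(reversed(rewards)):
        total += past_weight(alpha, steps_back) * reward
    return total


# Hypothetical reward history, oldest first.
rewards = [1.0, 0.0, 0.5, 2.0, 1.5, 0.0, 1.0]

for alpha in (0.1, 0.5):
    print(alpha,
          delta_rule(rewards, alpha),
          weighted_sum(rewards, alpha),
          past_weight(alpha, 5))
```

The two formulations agree up to floating-point error, and a higher learning rate places less weight on rewards five trials back (0.5 × 0.5^5 versus 0.1 × 0.9^5), which is the sense in which a high learning rate discounts the distant past more steeply.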

A final consideration is how discounting differs between different forms of outcome. Discounting for several forms of appetitive outcome shows consistency across individuals; for example, discount rates for money are strongly and significantly correlated with those for other forms of appetitive outcome, such as the discounting of cigarettes for cigarette smokers, the discounting of heroin for opioid-dependent outpatients and the discounting of food amongst college students (Odum, 2011; Pearson r = 0.93; p = 0.0007 for money vs. the mean of all other outcomes). However, rates are not identical across commodities: people tend to discount primary reinforcers such as food, water and sex more steeply than money (Lawyer et al., 2010; Odum, 2011; Jarmolowicz et al., 2013), and a number of studies have shown that people with substance dependence discount their drug of abuse more steeply than money (e.g., Madden et al., 1997; Bickel et al., 1999; Petry, 2001). Steeper discounting for primary reinforcers might reflect their greater engagement of innate appetitive systems. In other words, deliberative consideration of primary reinforcers might increase attention to the relevant underlying deficit state (drive). Steeper discounting then putatively results from the negative Pavlovian value associated with prolonging the deficit state. Further research is needed to examine this possibility.

Interesting results have been obtained when discounting choices are made across different commodities, for example in choices between money now vs. cigarettes later, termed cross-commodity discounting (CCD), as opposed to single-commodity discounting (SCD). For instance, Bickel et al. (2011a, 2007) examined discounting in cocaine-dependent individuals between cocaine now vs. cocaine later (C-C), money now vs. money later (M-M), cocaine now vs. money later (C-M), and money now vs. cocaine later (M-C) conditions, where the amounts of money and cocaine across conditions were equated in immediate worth. Consistent with previous findings, C-C discount rates were significantly greater than M-M discount rates; indeed there was a significant main effect of changing the delayed commodity to cocaine, consistent with cocaine being discounted more steeply than money. However, the authors found that, whilst C-M and M-M discounting were statistically indistinguishable, M-C discount rates were significantly higher than C-C discount rates. Wesley et al. (2014) broadly replicate this result, and Jarmolowicz et al. (2014) find a similar pattern of findings for money vs. sex CCD, wherein an M-S condition was associated with the steepest discounting. A possible explanation in terms of the classical economic model would be that cocaine (or sex) is both discounted more steeply and has a less concave utility function than money. Bickel et al. (2011a, 2007) illustrate this possibility, though they favor an explanation in terms of a framing effect. We propose a framing hypothesis whereby primary reinforcers are associated with a steeper implicit hazard rate than money (this might in part underlie their steeper discounting, but is of itself insufficient to explain the above findings); SCD then hypothetically diminishes the implicit hazard rate, by priming the idea that the commodity will definitely be received sooner or later. By contrast, the implicit exchange of money for primary reinforcement in CCD hypothetically amplifies the hazard rate for the delayed commodity, by priming the notion that the delayed commodity is not guaranteed. This hypothesis leads to the observed interaction, with the steepest discounting for CCD in which primary reinforcement is delayed, and is eminently testable. The possible modulation of such cross-commodity effects in various psychiatric disorders might offer further clues as to the underlying decision mechanisms at play.
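
One way to see how this framing hypothesis could produce the qualitative ordering of conditions is a toy calculation. The sketch below rests on several assumptions that are not part of the studies cited: a constant hazard rate (so that the discounted value is A exp(-hD) and the hazard rate acts as the discount rate), a multiplicative effect of framing on the hazard rate of the delayed commodity, and entirely hypothetical baseline hazards and framing factors. It makes no attempt to fit the reported data.

```python
import math

# Hypothetical baseline implicit hazard rates per unit delay
# (the primary reinforcer is assumed to carry the higher hazard).
BASELINE_HAZARD = {"money": 0.001, "cocaine": 0.02}

# Hypothetical multiplicative framing effects on the delayed commodity's hazard.
SCD_FACTOR = 0.5  # single-commodity framing: receipt seems assured
CCD_FACTOR = 2.0  # cross-commodity framing: receipt seems less assured


def effective_hazard(immediate, delayed):
    """Hazard rate applied to the delayed commodity under SCD or CCD framing."""
    factor = SCD_FACTOR if immediate == delayed else CCD_FACTOR
    return BASELINE_HAZARD[delayed] * factor


def discounted_value(amount, delay, hazard):
    """Expected value when the outcome survives the delay with prob. exp(-h * D)."""
    return amount * math.exp(-hazard * delay)


# Condition labels follow the "immediate-delayed" convention used in the text.
conditions = {
    "M-M": ("money", "money"),
    "C-C": ("cocaine", "cocaine"),
    "C-M": ("cocaine", "money"),
    "M-C": ("money", "cocaine"),
}
rates = {name: effective_hazard(*pair) for name, pair in conditions.items()}

for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(name, rate, discounted_value(100.0, 30.0, rate))
```

Under these assumed values the M-C condition has the steepest effective discount rate, followed by C-C, with C-M and M-M both shallow. The C-M rate remains somewhat above the M-M rate in this sketch, so the near-equivalence of those two conditions emerges only because the assumed baseline hazard for money is small.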

In summary, we have reviewed motivations for steeper discounting of delayed reward. Discounting tends to be increased across a broad range of disorders, including ADHD, schizophrenia, bipolar disorder, hypomania, depression, borderline personality disorder and substance misuse disorders. We have proposed that these findings can be parsimoniously understood by examining the reasons why people should discount the future, namely the opportunity costs of delay, uncertainty associated with future outcomes and the cognitive costs of resolving this uncertainty. We have detailed different types of information processing in the brain that can take these factors into account, broadly distinguishing "parametric discounting," whereby rewards labeled as delayed are automatically discounted as a function of delay, vs. "planful discounting," where the factors associated with the delay are accounted for in the course of learning. Where possible we have attempted to map these normative influences onto putative, albeit broad, neurobiological mechanisms. More generally we propose that this approach, that is, attempting to understand the biological substrates of psychiatric disorder in terms of their physiological function, and in light of a person's life history, is key to bridging psychosocial and biological conceptions of mental illness. We accept that our use of this approach here might appear speculative. In essence, we feel this is justified given the emerging nature of the field, and we await further research developments with eager interest.

## FUNDING STATEMENT

This work was supported by the Wellcome Trust [Ray Dolan Senior Investigator Award 098362/Z/12/Z]. The Wellcome Trust Centre for Neuroimaging is supported by core funding from the Wellcome Trust 091593/Z/10/Z. Dr. Moutoussis is also supported by the UCLH Biomedical Research Council.

## REFERENCES


discounting a measure of impulsivity? Drug Alcohol Depend. 96, 256–262. doi: 10.1016/j.drugalcdep.2008.03.009


in smokers. Exp. Clin. Psychopharmacol. 11, 131–138. doi: 10.1037/1064-1297.11.2.131


discounting and earlier reproduction. Evol. Hum. Behav. 34, 433–439. doi: 10.1016/j.evolhumbehav.2013.08.004


in chronic schizophrenia. Biol. Psychiatry 50, 531–539. doi: 10.1016/S0006-3223(01)01121-0


J. Child Psychol. Psychiatry 52, 256–264. doi: 10.1111/j.1469-7610.2010.02347.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Story, Moutoussis and Dolan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.