# Avoidance: From Basic Science to Psychopathology

Edited by: Richard J. Servatius, Kevin C. H. Pang, Gregory J. Quirk and Catherine E. Myers

Published in: Frontiers in Behavioral Neuroscience

### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-828-3 DOI 10.3389/978-2-88919-828-3

### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

## **Avoidance: From Basic Science to Psychopathology**

Topic Editors:

**Richard J. Servatius,** Syracuse VA Medical Center, Syracuse and Rutgers University-Rutgers Biomedical Health Sciences, USA

**Kevin C. H. Pang,** New Jersey Health Care System and Rutgers University-Rutgers Biomedical Health Sciences, USA

**Gregory J. Quirk,** University of Puerto Rico School of Medicine, USA

**Catherine E. Myers,** New Jersey Health Care System and Rutgers University-Rutgers Biomedical Health Sciences, USA

Coping has a myriad of facets: knowledge concerning the circumstances of threats to emotional and physical well-being, the ability to meet immediate needs to mitigate, the potential for recurrence, the ability to apply efforts and resources to manage recurrence, and the complex assessment of competing motivations and changing circumstances. Successful coping is measured in the efficiency of efforts in balance with the degree of threat and likelihood of future occurrence. As one means of coping, avoidance encompasses thoughts and efforts toward prevention of future aversive experiences and events. Anxiety disorders exemplify an extreme bias toward avoidance. A diathesis learning model focuses research efforts on individual vulnerabilities to acquire and express avoidance, the neurobiology of avoidance learning and its attendant circuitry. A fundamental understanding of avoidance through a diathesis learning model will facilitate the development of effective treatment protocols for alleviating anxiety disorders.

**Citation:** Servatius, R. J., Pang, K. C. H., Quirk, G. J., Myers, C. E., eds. (2016). Avoidance: From Basic Science to Psychopathology. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-828-3

# Table of Contents


Richard J. Servatius, Pelin Avcu, Nora Ko, Xilu Jiao, Kevin D. Beck, Thomas R. Minor and Kevin C. H. Pang

*Contribution of emotional and motivational neurocircuitry to cue-signaled active avoidance learning*

Anton Ilango, Jason Shumake, Wolfram Wetzel and Frank W. Ohl

*Persistent active avoidance correlates with activity in prelimbic cortex and ventral striatum*

Christian Bravo-Rivera, Ciorana Roman-Ortiz, Marlian Montesinos-Cartagena and Gregory J. Quirk

*Altered activity of the medial prefrontal cortex and amygdala during acquisition and extinction of an active avoidance task*

Xilu Jiao, Kevin D. Beck, Catherine E. Myers, Richard J. Servatius and Kevin C. H. Pang


Xilu Jiao, Kevin D. Beck, Amanda L. Stewart, Ian M. Smith, Catherine E. Myers, Richard J. Servatius and Kevin C. H. Pang


Catherine E. Myers, Ian M. Smith, Richard J. Servatius and Kevin D. Beck


Michael Todd Allen, Catherine E. Myers and Richard J. Servatius


Margaret G. McCue, Joseph E. LeDoux and Christopher K. Cain

*Enhanced discriminative fear learning of phobia-irrelevant stimuli in spider-fearful individuals*

Carina Mosig, Christian J. Merz, Cornelia Mohr, Dirk Adolph, Oliver T. Wolf, Silvia Schneider, Jürgen Margraf and Armin Zlomuzica


Kazumi Osada, Sadaharu Miyazono and Makoto Kashiwayanagi

## Editorial: Avoidance: From Basic Science to Psychopathology

### Richard J. Servatius \*

*Neuroscience, Syracuse DVA Medical Center, Stress and Motivated Behavior Institute, Rutgers Biomedical Health Sciences, Newark, NJ, USA*

Keywords: avoidance learning, anxiety disorders, depressive disorder, research domain criteria, expectancy, animal models of mental disorders, association learning, escape

**The Editorial on the Research Topic**

### **Avoidance: From Basic Science to Psychopathology**

As a means of coping, avoidance encompasses thoughts and efforts toward prevention of future aversive experiences and events. Avoidance has been and remains controversial. Avoidance is accepted as a construct in many areas of research, but is roundly disdained in others. Why is such a critical feature of coping both acknowledged as such, but almost reluctantly studied?

For one, avoidance is often conflated with fear. Fear is an emotion. Threat conditions which engender fear also engender a host of physiological and behavioral responses (Ledoux, 2013). In animals, exposure to aversive stimuli or cues associated with aversive stimuli induces freezing, fleeing, or aggressive displays depending on the context of exposure, all of which are behavioral manifestations of threat (Osada et al.). Responses to threat are relatively simple, engendered and refined through a circumscribed neural circuitry (Ledoux and Muller, 1997; Delgado et al., 2008). Fear and defensive responses to threat are readily and almost universally acquired. Those who are under threat (Shors and Servatius, 1997), under stress (Servatius and Shors, 1994), or fearful (Mosig et al.) show a generalized facilitation of associative learning, making threat and fear more pervasive. The engendering of fear and its expression is a highly researched concept; advances in the study of fear and the neurobiology subserving it are among the most notable and exhaustive neurobiological achievements of the past half century.

By comparison, avoidance is a fairly sophisticated construct. Avoidance is the situational evaluation of likelihoods, efficacy of responses, and costs. Avoidance is often weighed against alternatives with differing or competing motivations (Beck et al., 2011; Fernando et al., 2014; Ilango et al.; Sheynin et al.). For many applications and circumstances fear and avoidance seem to be inseparable, so the terms become conflated. In the vernacular, fear is an immediate response to stressors and fear motivates avoidance. Therefore, in many circumstances those avoiding are expected to be experiencing fear. However, the empirical literature provides ample evidence that the processes are distinct (Bolles, 1968; Seligman and Johnston, 1973; Rio-Alamos et al.), and while the neurocircuitry, such as the lateral habenula (Shumake et al., 2010; Ilango et al., 2013) and cerebellum (Steinmetz et al., 1993), overlaps (Freeman et al., 1996, 1997; Bravo-Rivera et al., 2014; Campese et al.; McCue et al.; Jiao et al.), the influences of these structures on the two processes potentially do not. Further distinguishing fear and threat from avoidance, septal (Thomas and Van Atta, 1972; Hedges et al., 1975) and hippocampal lesions (Cominski et al.) are known to facilitate avoidance acquisition, whereas these brain regions are critically involved in fear conditioning when intact (Kim et al., 1993; Desmedt et al., 1998; Knight et al., 2004).

As a research topic, avoidance all but disappeared through the 1990s, a phenomenon that has been noted in a number of recent reviews (Dymond and Roche, 2009; Krypotos et al.). Reduction in the study of avoidance stemmed from theoretical and practical considerations. In humans, the rise of institutional review boards and the reluctance of institutions and investigators to study reactions to aversive, painful stimulation or uncomfortable situations stymied progress. Added to these concerns, there were growing controversies regarding the role of awareness and instructional sets in human associative learning. Explicit information stemming from the consent form and instructions complicated experimental designs and interpretations of acquisition. Now there are but a few laboratories across the world with a vested interest in studying avoidance acquisition and extinction in humans; this Research Topic highlights several (Myers et al., 2013; Schlund et al., 2013; Sheynin et al., 2015; Cameron et al.; Moustafa et al.). Otherwise, avoidance and coping are primarily studied through self-report survey instruments which document coping strategies (Snell et al., 2011; Ayers et al., 2014).

### Edited and reviewed by:

*Nuno Sousa, University of Minho, Portugal*

### \*Correspondence:

*Richard J. Servatius richard.servatius@va.gov*

Received: *08 January 2016* Accepted: *28 January 2016* Published: *12 February 2016*

### Citation:

*Servatius RJ (2016) Editorial: Avoidance: From Basic Science to Psychopathology. Front. Behav. Neurosci. 10:15. doi: 10.3389/fnbeh.2016.00015*

In animals, the meteoric rise of electrophysiological and molecular techniques made reductionistic procedures ever more popular. This was in the face of Bolles' formulation of species-specific defense reactions (SSDRs) (Bolles, 1970). A reading of Bolles strongly suggests that the most popular applications of avoidance learning in animals were reducible to reflexes. Avoidance that relied on SSDRs would be difficult to distinguish from fear responses or their modification and would be better studied in clearer procedures. Bolles did not negate avoidance learning, but argued that avoidance was obscured by SSDRs and that arbitrary avoidance responses, which would be slowly and incrementally acquired, provided clear evidence of avoidance. The Bolles position muddled already difficult discussions concerning reinforcement in avoidance acquisition (Bersh, 2001; Dinsmoor, 2001; Hineline, 2001). The many criticisms of avoidance learning and its proper interpretation became more and more inaccessible to the average reader and more esoteric in argument. The zeitgeist is that avoidance responses either are SSDRs or require the suppression of SSDRs. SSDRs reflect fear, and fear is more clearly examined in freezing (Fanselow and Poulos, 2005) or by examining its exaggeration of acoustic startle responses (Davis, 2006) under conditions in which control procedures are established to reveal associativity. Although arbitrary responses provide clear evidence of avoidance (Avcu et al.; Bravo-Rivera et al., 2014; Servatius et al.), these procedures became more and more unpopular. An increase in demand for throughput (self-contained, relatively short, and easily scored procedures) is at odds with the seemingly slow development of avoidance. By unfortunate happenstance, "passive avoidance" remains in the parlance of behavioral neuroscience, but the high-throughput tasks and protocols used to study "passive avoidance" are essentially assessing punishment.

Modern theorists of avoidance have moved away from response dynamics to cognitive processes driving response dynamics. Humans and mammals form expectancies. Avoidance expression reflects propositional knowledge but also the context in which knowledge is to be expressed (Seligman and Johnston, 1973; Lovibond et al., 2008, 2009; Dymond and Roche, 2009). Knowledge is subject to error and error correction (Myers et al.; Sheynin et al.). The difficulties encountered in learning arbitrary responses may not rest in how unnatural such responses are to humans and animals (Dinsmoor, 2001), but in the pressures of time/distance (Fanselow and Lester, 1988) and a cost/benefit analysis. There is a need for conceptual bridges between propositional knowledge central to expectancy models of avoidance and animal research in which processes are resolvable to response dynamics (response selectivity, strength of responding, and probability of responding; Krypotos et al.).

Recently, the National Institute of Mental Health (NIMH) in the United States embarked on research domain criteria (RDoC) to facilitate integration across levels of analysis and between diagnostic boundaries. The Negative Valence System encompasses acute responses to threat (fear) and inferred threat (anxiety), with escape/avoidance learning and expression emerging with sustained threat. In the NIMH working group discussion, ambivalence was expressed concerning whether sustained threat is distinct from acute threat, except for the time dimension. An undercurrent is that the sustained threat dimension, and by implication avoidance and escape, is not distinct from acute conditions. The bounding conditions of avoidance are not only the duration of threat (acute/sustained), but the perceived intensity of threat, its perceived proximity, and the utility of responses or efforts. For perceived proximity in time, parametric manipulations of signal-shock intervals illustrate this point. Shuttling as the requisite response (a modified SSDR) is efficiently acquired with CS-US intervals of 10–20 s (Black, 1963). In lever press (not an SSDR) avoidance, escape behaviors predominate when signal-shock intervals are less than 20 s (Berger and Brush, 1975), with very few avoidance responses expressed after days of training (Servatius et al.). However, knowledge about avoidance is acquired even when avoidance is not expressed (Servatius et al.). Using a crossover design, those trained with a 10-s warning signal and exhibiting nominal avoidance rates displayed greater than 60% avoidance when switched to a 60-s warning signal, nearly the asymptotic performance of those trained initially with a 60-s warning signal. As to stressor intensity, shuttle box avoidance is efficiently acquired with foot shocks of moderately low intensity (0.2–0.5 mA) (Levine, 1966), with decrements apparent at shock intensities greater than 1.0 mA (Moyer and Korn, 1964). In contrast, lever press avoidance is efficiently acquired with shock intensities of 1.0–2.0 mA (Berger and Brush, 1975; Servatius et al., 2008; Avcu et al.). These features illustrate that avoidance acquired with arbitrary responses differs in a number of parameters from avoidance that modifies reflexive or "natural" responses, which are in turn distinct from fear responses. On the other hand, recent work also shows that fear is more nuanced, as fear contributes to sustained processes such as foraging (Kim et al., 2014).

In subsequent position papers concerning RDoCs, fear and threat processes feature prominently, whereas avoidance and coping do not (Cuthbert et al., 2003; Insel et al., 2010; Cuthbert, 2015). This is indeed unfortunate. An opportunity to intensify efforts in avoidance research is being missed. The mental health implications are extensive. Psychologically healthy coping strikes a balance between avoidance (responding in anticipation of aversive stimulation) and escape (responding in the presence of the stimulation) and competing motivations of approach (Ilango et al.; Ilango et al.). Deviant forms of avoidance are evident in autism (Richer, 1976), anxiety (Ly and Roelofs, 2009; Kashdan et al., 2014), phobias, posttraumatic stress disorder (PTSD; North et al., 2004; Kashdan et al., 2009), major depression (Ottenbreit et al., 2014) and suicide (Dixon et al., 1991). Over-expression of avoidance, as in anxiety disorders and PTSD, insulates one from aversive thoughts or experiences at the expense of self-limiting interpersonal and environmental interactions. Under-expression of avoidance, as in depression or suicidality, unduly exposes one to aversive thoughts and experiences that would be otherwise controllable, severely depleting resources and progressing down a demoralizing spiral.

Diathesis models of mental illness capture avoidance biases as dynamic interactions of vulnerabilities (genes, epigenetics, personality, and developmental phases) with risk factors (psychological stressors, physical injuries) ultimately expressed as psychopathology. For example, behaviorally inhibited temperament, withdrawal in the face of social and nonsocial challenges, is a vulnerability factor for anxiety disorders (Moffitt et al., 2007). Humans expressing behavioral inhibition (BI) display enhanced avoidance expression (Sheynin et al.), and enhanced new motor learning (Caulfield et al.; Holloway et al., 2014), especially under degraded contingencies (Holloway et al., 2014; Allen et al.). Facilitated avoidance acquisition (Avcu et al.; Beck et al.; Jiao et al.; Servatius et al.) and new motor learning (Ricart et al., 2011a,b) are also apparent in Wistar-Kyoto rats, an animal model of BI temperament. Further, avoidance extinction is typically more difficult to obtain than extinction of fear. This is likely amplified by individual differences (Avcu et al.; Cominski et al.). Uncovering of neurobiological processes biasing avoidance expression and extinction has the promise of providing targets for individualized therapeutics and treatments for a number of psychopathological disorders.

Hence, there continues to be a need for an integration of human and animal research focused on coping and in particular avoidance coping. Model systems of avoidance that allow for bidirectional modifications of acquisition, expression, and extinction (protocols that allow for increased as well as decreased expression) are useful in translating basic science to psychopathology. By extension, RDoC constructs should be sensitive to individual differences that either accentuate or diminish the appearance of avoidance.

An open discussion of what features constitute fear, threat, anxiety, and avoidance would benefit not only basic science and psychopathology, but also areas of research that are otherwise ignoring the infighting and are making substantial progress in improving health (e.g., the fear-avoidance model of pain; Vlaeyen et al., 1995; Crombez et al., 2012).

Approach Avoidance: Have no fear!

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

### FUNDING

Supported by the Stress and Motivated Behavior Institute through funding from the Department of Defense, Armaments Research Development Engineering Center.

### REFERENCES

neuronal activity during concurrent discriminative approach and avoidance training in rabbits. J. Neurosci. 16, 1538–1549.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Servatius. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Avoidance learning: a review of theoretical models and recent developments

Angelos-Miltiadis Krypotos<sup>1,2</sup>, Marieke Effting<sup>1,2</sup>, Merel Kindt<sup>1,2</sup> and Tom Beckers<sup>1,2,3</sup>\*

*<sup>1</sup> Department of Clinical Psychology, University of Amsterdam, Amsterdam, Netherlands, <sup>2</sup> Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands, <sup>3</sup> Department of Psychology, KU Leuven, Leuven, Belgium*

Avoidance is a key characteristic of adaptive and maladaptive fear. Here, we review past and contemporary theories of avoidance learning. Based on the theories, experimental findings and clinical observations reviewed, we distill key principles of how adaptive and maladaptive avoidance behavior is acquired and maintained. We highlight clinical implications of avoidance learning theories and describe intervention strategies that could reduce maladaptive avoidance and prevent its return. We end with a brief overview of recent developments and avenues for further research.

Keywords: avoidance, fear, anxiety, learning, neuroscience

### Edited by:

*Richard J. Servatius, Syracuse DVA Medical Center, USA*

### Reviewed by:

*Christine A. Rabinak, Wayne State University, USA*

*Hadley C. Bergstrom, National Institutes of Health, USA*

### \*Correspondence:

*Tom Beckers, Department of Clinical Psychology, University of Amsterdam, Weesperplein 4, 1018 XA Amsterdam, Netherlands T.R.J.Beckers@uva.nl*

Received: *20 May 2015* Accepted: *06 July 2015* Published: *21 July 2015*

### Citation:

*Krypotos A-M, Effting M, Kindt M and Beckers T (2015) Avoidance learning: a review of theoretical models and recent developments. Front. Behav. Neurosci. 9:189. doi: 10.3389/fnbeh.2015.00189*

### Introduction

Avoidance of genuinely threatening stimuli or situations is a key characteristic of adaptive fear. People will typically not enter a building after a major earthquake nor approach a stray lion. At the same time, excessive avoidance in the absence of real threat can severely impair individuals' quality of life and may stop them from encountering anxiety-correcting information (Barlow, 2002). In such cases, avoidance loses its adaptive value and may transform into a maladaptive response. Maladaptive avoidance is in fact a central characteristic of a wide spectrum of mental disorders (World Health Organization, 2004; American Psychiatric Association, 2013). Individuals with Obsessive Compulsive Disorder (OCD), for instance, tend to avoid situations in which the potential for contact with contaminants is high (Rachman, 2004), Post-Traumatic Stress Disorder (PTSD) patients will try to avoid intrusive memories (Brewin and Holmes, 2003; Williams and Moulds, 2007), and social phobics will refuse to attend group gatherings (Bögels et al., 2010; Schneier et al., 2011).

Given the key role of avoidance in normal and disordered psychological functioning, it is critical to better understand the relevant conditions and psychological mechanisms responsible for the learning of avoidant reactions. Alas, although avoidance learning was once a central topic in basic psychological research, interest has waned since the 1970s, leaving important questions unanswered. Only recently has there been a resurgence of theoretical, experimental and clinical interest in the study of avoidance (see **Figure 1**). In recent years, new psychological theories of avoidance learning have been proposed (e.g., De Houwer et al., 2005; Lovibond, 2006) and avoidance is quickly becoming a topic of prime empirical interest not only in experimental psychology but also in clinical psychology and psychiatry as well as in behavioral neuroscience (see the present special issue). The latest edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013) includes avoidance in several diagnostic criteria that previously referred to fear only. In parallel, recent years have brought rapid increases in our understanding of the brain processes involved in the learning (e.g., Delgado et al., 2009), expression (e.g., Cominski et al., 2014), and reduction (e.g., McCue et al., 2014) of avoidance behavior.

In this paper we review the main historical and modern theories of avoidance learning and present a set of principles of avoidance learning that integrate those theoretical propositions with the strongest experimental support. We also address the clinical implications of those principles and relate them to current and novel interventions for maladaptive avoidance such as in anxiety disorders or PTSD. Lastly, we consider recent findings from behavioral neuroscience.

The outline of the paper is as follows: We first describe how avoidance learning is studied in laboratory settings and how functionally similar behaviors can serve the avoidance of or the escape from an aversive event. Next, we discuss traditional theories of avoidance learning, including Mowrer's two-factor theory. In the third section we describe Bolles' (1970, 1971) Species-Specific Defense Reactions (SSDR) theory. We then review more recent theories of avoidance learning that address informational factors (e.g., expectancies) in avoidance. Next, we propose a set of principles for avoidance learning that incorporates the most well-validated propositions of the aforementioned theories. We end our review with suggestions for closer alignment between basic and clinical science and a few avenues for future research.

### Laboratory Procedures for Studying Avoidance Learning

Avoidance learning procedures typically entail the cancelation of an impending aversive event by either the emission or inhibition of an experimenter-designated response. In active avoidance procedures, for example, an antecedent stimulus is followed by an aversive event unless an experimenter-designated response is executed, a response that typically also terminates the antecedent stimulus. For example, dogs will learn to jump a barrier following the presentation of a light, previously associated with shock administration (Solomon and Wynne, 1953).

By contrast, in passive avoidance procedures, the aversive event occurs only if an experimenter-designated response is executed during the antecedent stimulus presentation. For example, in a standard passive avoidance procedure for rats, a rat is placed in a brightly lit compartment of a two-compartment box, with the second compartment being dark and the two compartments separated by a closed door (Venable and Kelly, 1990; Kaminsky et al., 2001). Given that rats have a preference for dark compared to lit environments (see Costall et al., 1989; Bourin and Hascoët, 2003), they will move to the dark compartment once the door is opened, an action that will be followed by shock administration. This procedure often results in the rats passively avoiding the shock by remaining in the light compartment on future occasions.

In avoidance procedures, the experimenter-designated response is not necessarily performed prior to the aversive outcome, but can also be performed in its presence. In such cases, it would be more accurate to categorize the performed response as escape rather than avoidance. Escape responses involve distancing oneself from an ongoing aversive event, while avoidance refers to behavior that causes the omission of a forthcoming noxious outcome predicted by an antecedent stimulus (Bower and Hilgard, 1981; see also the distinction between antecedent events and aversive outcomes in Lovibond and Rapee, 1993). Thus, what differentiates avoidance from escape is the proximity of threat (imminent vs. ongoing).
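The distinction between avoidance and escape in an active avoidance trial can be sketched as a simple scoring rule. The following is only an illustrative sketch, not a procedure taken from any of the cited studies: the names `Trial` and `classify_active_trial`, the field names, and the assumption that the aversive event begins exactly when the antecedent (warning) stimulus period ends are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """One active avoidance trial (hypothetical representation)."""
    warning_s: float   # time from antecedent stimulus onset to aversive event onset
    aversive_s: float  # maximum duration of the aversive event if no response occurs
    response_latency_s: Optional[float]  # time from antecedent onset to response; None if none


def classify_active_trial(trial: Trial) -> str:
    """Label a trial as 'avoidance', 'escape', or 'failure'.

    Assumption: a response before the aversive event starts cancels it
    (avoidance); a response while the event is ongoing terminates it (escape).
    """
    latency = trial.response_latency_s
    if latency is None:
        return "failure"  # no designated response was emitted
    if latency < trial.warning_s:
        return "avoidance"  # threat was imminent, outcome omitted
    if latency < trial.warning_s + trial.aversive_s:
        return "escape"  # threat was ongoing, outcome cut short
    return "failure"  # response came after the aversive event had ended
```

Under these assumptions, `classify_active_trial(Trial(10.0, 5.0, 4.0))` returns `"avoidance"`, whereas a latency of `12.0` with the same timing returns `"escape"`, which mirrors the imminent versus ongoing contrast described above.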

An elegant animal laboratory procedure to differentiate between avoidance and escape responses, especially relevant to behavioral neuroscience, is the elevated T-maze task (ETM; Pellow et al., 1985). The ETM was primarily designed for testing rats' defensive reactions in innately fearful environments (i.e., open spaces and heights; Montgomery, 1955). It typically consists of three elevated arms, with one arm surrounded by a wall (enclosed arm) and the other two being open (open arms). Initially, the rat is placed in the enclosed arm. While exploring, the rat will eventually end up in the open arms. Following repeated trials, the rat will tend to remain longer in the enclosed arm after being placed there (i.e., passive avoidance) or run toward the enclosed arm after being placed in one of the open arms (i.e., escape).

By using this procedure, research has illuminated the differences in the neurobiology of avoidance and escape. Specifically, serotonin, an anxiogenic neurotransmitter relevant for defensive responses (Graeff, 2002), seems to play a different role in the two types of behaviors (Zangrossi et al., 2001), with serotonin administration enhancing avoidance and inhibiting escape (Graeff, 1991). That observation supports the argument that avoidance and escape may constitute distinct types of defensive behaviors, differently elicited as a function of the proximity of threat (imminent or ongoing), a hypothesis also in line with models associating defensive response selection with predatory imminence (Fanselow, 1994). Of note, the difference between escape and avoidance is also relevant for clinical practice. Specifically, Deakin and Graeff (1991) suggested that (passive) avoidance is mainly related to generalized anxiety disorder (GAD), where a threatening event is typically anticipated, and escape to panic disorder, where panic reactions could be considered as responses to an ongoing perceived danger (Shuhama et al., 2007). This suggestion has gathered partial support from pharmacological studies. It has been demonstrated that commonly prescribed anxiolytic drugs (e.g., diazepam) result in passive avoidance deficiencies while leaving escape behavior intact. By contrast, cholecystokinin agonists, which typically provoke panic attacks, facilitate escape behaviors (Pinheiro et al., 2007; Graeff and Zangrossi, 2010). Taken together, avoidance and escape seem to be distinct subtypes of defensive behaviors, and they might play different roles in mental disorders.

Active and passive avoidance and escape procedures have proved valuable for testing avoidance and/or escape learning in laboratory settings. Based on findings obtained with those procedures, theories have emerged that address the underlying psychological mechanisms. We now turn to a discussion of early theories of avoidance/escape learning in psychology.

### Early Theories of Avoidance/Escape Learning and the Two-factor Theory

In the early days of psychology, learned avoidance was considered an example of a Pavlovian conditioned reflex (Bekhterev, 1907, 1913; Watson, 1916). Just like Pavlov's dogs would salivate upon the sound of a metronome previously associated with food administration (Pavlov, 1927), in the studies of Bekhterev (1913), a dog would flex its leg after the presentation of an antecedent stimulus, previously associated with shock administration (Herrnstein, 1969; Bolles, 1972). Since leg flexion would occur in the presence of the antecedent stimulus and prior to shock delivery, the acquired response was considered to reflect Pavlovian learning.

Nonetheless, two procedural characteristics differentiated the acquired responses from learned Pavlovian reflexes. First, what constituted the avoidance response (e.g., leg flexion) was usually an experimenter-defined voluntary response, whereas in Pavlov's experiment the learned response toward an initially neutral stimulus (i.e., salivation upon sound of the metronome) would typically consist of the automatic response toward an evolutionary relevant stimulus (i.e., salivation during food presentation; Unconditioned Stimulus or US). Second, the emitted response would lead to the cancelation of the impending event, making the (non-)presentation of the aversive stimulus dependent on the organism's response (Herrnstein, 1969). This procedural aspect is at odds with the standard Pavlovian procedure in which the presentation of food, or of any other US, would not depend on the animal's response (i.e., food would be presented independently of whether dogs salivated or not). Those procedural differences pointed to the potential operation of instrumental processes during avoidance learning, since in instrumental learning procedures an experimenter-defined action of the organism is necessary for outcome presentation or omission (Rescorla and Solomon, 1967). The potential involvement of instrumental processes, however, raised the question as to how avoidance responses are reinforced. Although one might intuitively argue that the source of reinforcement is the omission of the impending aversive event (i.e., the dogs flex their legs because this cancels the shock), assigning the cause of behavior to an event that has not yet occurred (i.e., shock administration) violated the dominant scientific principles of psychology at the time (i.e., the behaviorist paradigm; Watson, 1913).

A solution to that conundrum was offered in the two-factor theory formulated by Orval Mowrer (Mowrer, 1951), who proposed that the performed response was reinforced by fear reduction (Hull, 1943). Specifically, Mowrer argued that as a result of Pavlovian fear conditioning (first factor), i.e., an antecedent stimulus (e.g., a tone) being associated with the administration of an aversive event (e.g., a shock), presentation of the antecedent stimulus will come to evoke fear. Subsequently, during the instrumental phase (second factor), escape responses that are emitted in the presence of the antecedent stimulus will be negatively reinforced by fear reduction, due to increased distance to or cessation of the antecedent stimulus. This idea was heavily inspired by the avoidance learning procedures used at the time, where avoidance responses led to the termination of the antecedent stimulus by locomotion (e.g., moving away from a shock area of a box) or by the antecedent stimulus being turned off. Of note, according to Mowrer, the omission of the aversive outcome event was to be regarded as a mere by-product of the performed CS escape behavior (Schöenfeld, 1950; Mowrer, 1960).

Krypotos et al. Avoidance learning

Two-factor theory quickly gained popularity in experimental psychology. By resorting to the concept of fear reduction during the instrumental phase, Mowrer's proposition was in line with the dominant drive reduction theories of the time (e.g., the drive reduction theory of Hull, 1943). Concurrently, by suggesting that the initial fear learning was based on Pavlovian processes, rather than drive reduction, the theory was better applicable to experimental data than the competing theory of Neal Miller (Miller, 1948), according to which all reinforcement during escape/avoidance learning originated from fear reduction.

Mowrer's theory has been used not only for explaining how maladaptive avoidance is acquired (Levis, 1981), but also as a basis for clinical interventions (Eysenck and Rachman, 1965). For example, in exposure therapy a patient is repeatedly confronted with a fearful situation or stimulus, in order to reduce that fear. It is commonly suggested that patients should be kept in the exposure situation until fear or anxiety levels have declined. This suggestion is rooted in two-factor theory and the notion that if the exposure session is terminated while fear levels remain high, the fear reduction caused by the termination of the session could promote escape or avoidance of similar situations in the future (Mathews et al., 1981; Emmelkamp, 1982; see next section for arguments against this notion).

Despite its wide influence on basic research and clinical science, two-factor theory had trouble explaining later data (see Rescorla and Solomon, 1967; Herrnstein, 1969; McAllister and McAllister, 1991, for extended discussions of two-factor theory). We now turn to some of the key criticisms against the two-factor theory.

### Criticisms Against the Two-factor Theory

One of the strongest criticisms against the two-factor theory concerned the purported role of fear in motivating the emission of a learned escape/avoidance response. According to Mowrer's proposition, escape/avoidance is motivated by high fear levels. This notion implies that no such actions should be performed in the absence of fear. One of the first experiments to show that this may not be true was done by Solomon et al. (1953). In their experiment, dogs were first trained to jump across a barrier in response to the sounding of a buzzer previously paired with shock. The dogs then received a fear extinction treatment in which the buzzer was repeatedly presented without shock. Such an extinction procedure typically leads to a reduction of fear levels (Hermans et al., 2006). If, as assumed by Mowrer, it is fear that motivates escape/avoidance, it would be expected that following Pavlovian fear extinction, dogs would also stop performing the avoidance response. The results contradicted this hypothesis: Dogs continued to jump upon sounding of the buzzer, even when the shock device had long been turned off permanently.

The observation that fear may not be necessary for avoidance has clear clinical implications. As mentioned earlier, patients are typically prevented from prematurely terminating exposure out of concern that the fear reduction resulting from termination of the session could otherwise serve as negative reinforcement for escape (Eysenck and Rachman, 1965). Experimental data, however, indicate that patients undergoing exposure therapy show similar clinical improvement regardless of whether they ended exposure while fear levels were high or low (De Silva and Rachman, 1984; Rachman et al., 1986). Taken together, both experimental data and clinical findings suggest that fear may sometimes, but not always, be involved in the maintenance of avoidance, and as such, fear and avoidance may not always "synchronize" with each other (Rachman and Hodgson, 1974).

Two-factor theory also had trouble explaining how avoidance can be acquired in the absence of an explicit antecedent stimulus. Specifically, in unsignaled avoidance procedures (Sidman, 1953a, 1962; see Lázaro-Muñoz et al., 2010; McCue et al., 2014 for more recent examples), rats learn to avoid shocks presented at fixed time intervals, in the absence of a discrete antecedent stimulus (Sidman, 1953a,b; see Hassoulas et al., 2013, for examples in humans). In its initial form, two-factor theory assumed the operation of explicit antecedent stimuli during the Pavlovian and the instrumental phases, stimuli that during unsignaled procedures appear to be absent.

A potential explanation for the observation of unsignaled avoidance procedures is that although not explicit, warning stimuli may still be present in a "silent" form. Temporal and proprioceptive stimuli (e.g., the passage of time), for example, could be associated with the aversive outcome (Schöenfeld, 1950; Dinsmoor, 1977). Subsequently, those stimuli could signal the presentation of an aversive event (for an alternative account centering on the role of US omission during avoidance learning see Herrnstein, 1969).

By assuming that avoidance is based on reinforcement learning, the two-factor theory also failed to explain how avoidance is acquired in naturalistic settings, where the first encounter of an organism with danger could prove fatal (Bolles, 1970; Osada et al., 2014). Similarly, when it comes to maladaptive avoidance, patients do not always report a direct traumatic event as the source of their symptomatology (Rachman, 1990). An explanation for those observations is that avoidance need not always be acquired through direct experience but can be acquired via other pathways as well (Rachman, 1991, 1977; Olsson and Phelps, 2004, 2007). Those pathways include vicarious learning (e.g., learning to be afraid of dogs after observing someone being afraid of a dog) and instructional learning (e.g., learning to be afraid of a dog after someone suggesting that dogs often attack people; Bandura and Rosenthal, 1966; Rachman, 1977). Recent evidence shows that avoidance learning can be achieved even more indirectly, such as through symbolic generalization. One demonstration of that was presented by Augustson and Dougher (1997). They first trained individuals to categorize eight different stimuli (i.e., A1, B1, C1, D1, A2, B2, C2, and D2) into two arbitrary categories (i.e., 1 and 2), using standard conditional discrimination procedures (Sidman, 1987): Participants first saw a target stimulus (e.g., A1) and were then asked to choose one from three stimuli presented on screen. One of those stimuli was arbitrarily assigned to the same category (e.g., B1), one stimulus to the other category (e.g., C2) and another one was irrelevant (e.g., X). Participants' task was to learn to choose the stimulus from the same category as the target stimulus (e.g., B1). In case of a correct response, the word "correct" was presented and in case of an incorrect response, the word "wrong" was presented. 
For example, when stimulus A1 was presented, selecting the stimulus that belonged to the same category (e.g., B1, C1 or D1) was positively reinforced whereas the selection of any other stimulus (e.g., B2 or X) was punished. Such a procedure typically results in people learning that the stimuli within each category are functionally equivalent. During a subsequent fear conditioning phase, B1 was paired with shock and B2 with the absence of shock. An instrumental phase followed, where participants could avoid shocks with a button press. Critically, results showed that, in addition to performing more avoidance responses in the presence of B1 than B2, participants also performed more avoidance responses to (unreinforced) presentations of C1 and D1 (the stimuli arbitrarily related to B1) than in the presence of C2 and D2 (the stimuli related to B2). Apparently, avoidance responses to B1 symbolically generalized to C1 and D1, which had been previously trained as equivalent to B1, without an avoidance schedule being trained for those stimuli. Those findings have recently been replicated and extended (Dymond and Roche, 2009; Dymond et al., 2011). Taken together, direct traumatic experiences may not always be necessary for the acquisition of avoidance, a proposition that is in line with contemporary views on fear learning and psychopathology (Mineka and Zinbarg, 2006).

Another argument against the two-factor theory regards the extent to which Pavlovian and instrumental learning are both necessary for the acquisition of avoidance. As a result of Pavlovian fear conditioning, a previously neutral cue will come to evoke various fear responses (e.g., enhanced physiological arousal; Beckers et al., 2013). According to emotion theories, action tendencies are an essential component of emotions (Lang, 1985; Frijda, 1988). As such, "to fear is to want to avoid." Therefore, it could be assumed that the tendency to avoid a stimulus or situation may be acquired via purely Pavlovian learning, without instrumental reinforcement. We have recently tested this assumption. Following a differential fear conditioning procedure, during which pictures of one geometrical object were always followed by shock (CS+) whereas pictures of another object were never followed by shock (CS−), participants were faster to avoid the CS+ and approach the CS− than vice versa in a symbolic approach-avoidance reaction time task (AAT; Krypotos et al., 2014b). Crucially, shock electrodes were detached from participants' hands during the AAT, and responses had no influence on the duration or presence of the CS, eliminating any instrumental basis for avoidance. Those results suggest that avoidance tendencies can be acquired via mere Pavlovian associations, in the absence of instrumental learning. Such acquisition of motor responses toward a CS in the absence of instrumental learning has long been established in the appetitive domain. In auto-shaping procedures (Brown and Jenkins, 1968), for example, pairings of a visual CS with food would typically result in the animal (e.g., a rat or a pigeon) producing consumption responses toward the CS (e.g., licking or pecking, respectively), despite those responses being irrelevant for food presentation.

Lastly, recent findings suggest that avoidance can be evoked by the identification of predator-related stimuli (e.g., smells) even in the absence of a previous encounter with the predator. Specifically, mice (Osada et al., 2013) and deer (Osada et al., 2014) tend to avoid areas where the active components of predators' urine odors are presented (or to show other defense-like behaviors there), without any previous experience with that specific predator. An explanation for those findings is that such avoidance may be the result of "evolutionary memory" (Provenza, 1995) and, as such, avoiding such stimuli (e.g., odors or blood) does not require the learning of associations between a stimulus (i.e., smell) and the aversive event. Of note, similar effects are yet to be demonstrated in humans.

To sum up, although influential, the two-factor theory of Mowrer proves unable to account for a series of experimental results and clinical observations. In response to those shortcomings of the two-factor theory, alternative theories have been proposed. One of those, with major influence in the experimental field, is the SSDR theory of Bolles (1970, 1971).

### Species-specific Defense Reactions

It has long been demonstrated that in Pavlovian conditioning, evolutionarily relevant stimuli (e.g., spiders) are more readily associated with an aversive event (e.g., shock) than stimuli without evolutionary relevance (e.g., flowers; Öhman and Mineka, 2001). Similarly, in the area of taste aversion, associations between tastes and induced sickness are acquired more readily than associations between audiovisual cues and sickness, whereas audiovisual cues are associated more readily with shock than tastes are (the Garcia effect; Garcia et al., 1955; Davis and Riley, 2010).

An explanation for those findings is that by being wired to preferentially associate aversive events (i.e., shock and sickness) with phylogenetically relevant stimuli (i.e., spiders and tastes), an organism is better prepared to learn about likely cues for danger. Such an ability would equip the organism with an evolutionary advantage for surviving potentially harmful cues or situations. An interesting question is whether individuals are similarly predisposed to associate the cancelation or termination of an aversive event with particular behavioral responses.

This seems to be the case. Rats will learn much more rapidly to avoid an aversive outcome by running—a behavior commonly used for avoiding a predator—than by moving their tail (Maatsch, 1959; Meyer et al., 1960; Theios et al., 1966; Masterson, 1970). Observations such as those inspired Bolles in his formulation of the Species-Specific Defense Reactions (SSDRs) theory. According to this theory, under a state of fear, organisms are phylogenetically predisposed to emit specific types of responses (e.g., fleeing, freezing, or fighting) rather than others. Bolles went one step further to suggest that for such responses, reinforcement learning is actually unnecessary; the organism just needs to learn that a stimulus predicts an aversive outcome for it to elicit an SSDR (Bolles, 1970, 1971).

The SSDR theory could indeed explain the fast acquisition of specific avoidance responses that served the evolutionary survival of the organism. Nonetheless, sometimes, an avoidance response has to be acquired that does not belong to an organism's SSDR repertoire (Crawford and Masterson, 1982). Rats, for example, can learn to avoid an aversive outcome by rearing on a wheel, a response that arguably does not usually serve survival purposes, although at a much slower rate than learning to avoid that same outcome by running on the wheel (Bolles and Grossen, 1969). In those cases, it is suggested that although under a state of fear SSDRs will initially be performed, those SSDRs will be punished by the occurrence of the aversive event, allowing for non-SSDRs to subsequently emerge. In other words, non-SSDRs would not (or not merely) be negatively reinforced, as Mowrer suggested, but they would arise because of SSDRs being positively punished by the presentation of the aversive event.

To sum up, Bolles provided a theory that could explain why some avoidance responses are acquired more readily than others, and why reinforcement may often be unnecessary for the acquisition of avoidance. Nonetheless, the theory has limitations. For one, fear states are on a continuum, ranging from low to high levels, making it difficult to define at which specific level the restriction of the behavioral repertoire to SSDRs occurs (Masterson and Crawford, 1982). Also, by rejecting reinforcement as a source for SSDRs, Bolles' theory makes it hard to explain findings suggesting that bar pressing is acquired faster when it leads to access to a safe place than when it merely leads to US omission (Crawford and Masterson, 1978). Thus, although SSDR theory provided answers to important avoidance learning questions, it fails to accurately define the conditions under which avoidance learning will occur and to provide a comprehensive account for all instances of avoidance acquisition.

### Informational Factors in Avoidance Learning

The theories we have discussed so far provided functional explanations of avoidance learning, denying any role for cognitive or informational factors in the interpretation of avoidance acquisition. As such, those theories were in harmony with the dominant functional accounts of learning of the behaviorist paradigm. However, those functional accounts of learning failed to explain a series of laboratory phenomena such as blocking (i.e., impaired CS1-US acquisition if CS1 is paired with the US in compound with a CS2 that has previously been paired with the US on its own; Kamin, 1956, 1967, 1969) and conditioned inhibition (i.e., learned inhibitory responses toward CS1 if a CS1–CS2 compound is repeatedly presented without the US while CS2 is paired with the US when presented on its own; Rescorla, 1969). Such phenomena challenged traditional associative theories according to which CS-US contiguity is sufficient for Pavlovian acquisition (Rescorla, 1972; Miller et al., 1995).

As a result, a general shift in theories of learning has been observed, such that informational factors (e.g., outcome expectancies or stimulus surprisingness) started to be considered as potential explanations of acquired behavior (Rescorla, 1972).

One theory of learning with a lasting influence that was developed in that period, is the Rescorla-Wagner model (RWM; Rescorla and Wagner, 1972; Wagner and Rescorla, 1972). The basic premise of the RWM is that the rate of conditioning to a CS depends on whether the ensuing presentation of the US is surprising or not. If the CS did not elicit an accurate prediction of the (non-)occurrence of the US (negative or positive prediction error), learning about the CS occurs; if it did, no learning occurs. This model clearly deviates from earlier theories of learning in that it recognizes the role of informational factors (i.e., predictions) in conditioning, and despite justified criticisms (Miller et al., 1995), it is a model with great heuristic and predictive value (Beckers and Vervliet, 2009).
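To make this premise concrete, the RWM update is commonly written as ΔV = αβ(λ − ΣV), where V is the associative strength of a cue, ΣV is the summed strength of all cues present on the trial (the prediction), λ is the maximum strength the US can support, and αβ is a learning rate. The following is a minimal sketch of that rule applied to the blocking design described earlier. The learning rate of 0.3 (standing in for the product αβ), the asymptote of 1, the trial counts, and all function and variable names are illustrative assumptions, not values taken from Rescorla and Wagner (1972).

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    Each trial is (cues, us_present), where cues is a tuple of cue names.
    Returns a dict mapping each cue to its associative strength V.
    alpha and lam are assumed, illustrative parameter values.
    """
    strengths = {}
    for cues, us_present in trials:
        # Prediction: summed strength of all cues present on this trial.
        prediction = sum(strengths.get(c, 0.0) for c in cues)
        # Prediction error: what occurred minus what was predicted.
        error = (lam if us_present else 0.0) - prediction
        # Every cue present on the trial shares the same error term.
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + alpha * error
    return strengths


# Blocking: CS2 alone is paired with the US, then the CS1+CS2 compound is.
blocking = [(("CS2",), True)] * 50 + [(("CS1", "CS2"), True)] * 50
# Control: compound training only, without CS2 pretraining.
control = [(("CS1", "CS2"), True)] * 50

v_blocking = rescorla_wagner(blocking)
v_control = rescorla_wagner(control)
print("CS1 after blocking:", round(v_blocking["CS1"], 3))
print("CS1 after control: ", round(v_control["CS1"], 3))
```

Under these assumed parameters, CS1 ends with almost no associative strength in the blocking condition, because the pretrained CS2 already predicts the US and leaves little prediction error to drive learning. In the control condition the two cues share the available strength.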

In the next section we present avoidance learning models that, like the RWM for Pavlovian learning, rely on informational factors to account for avoidance learning.

### The Role of Safety Signals

In our examples of Pavlovian fear conditioning, we have so far discussed situations in which an antecedent stimulus signals an aversive event. It might be assumed that in such situations, knowledge is acquired about the relation between the antecedent stimulus and the aversive event only. However, if there are also signals present that predict the absence of an otherwise expected aversive event (i.e., safety signals), those will be learned about as well, and fear responses will be inhibited in the presence of those stimuli (conditioned inhibition, see above).
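Within an error-correction model such as the RWM, a safety signal of this kind corresponds to a conditioned inhibitor, that is, a cue that acquires negative associative strength. The sketch below illustrates this under an assumed schedule in which a warning stimulus alone is followed by shock while the warning stimulus presented together with a safety signal is not. The schedule, the parameter values, and all names are hypothetical and serve only as an illustration.

```python
def train(trials, rate=0.3, asymptote=1.0):
    """Error-correction (Rescorla-Wagner style) learning over (cues, us_present) trials."""
    v = {}
    for cues, us_present in trials:
        target = asymptote if us_present else 0.0
        # Shared prediction error for all cues present on the trial.
        error = target - sum(v.get(c, 0.0) for c in cues)
        for c in cues:
            v[c] = v.get(c, 0.0) + rate * error
    return v


# Assumed schedule: warning alone -> shock; warning + safety signal -> no shock.
schedule = [(("warning",), True), (("warning", "safety"), False)] * 200
v = train(schedule)

expectancy_alone = v["warning"]
expectancy_with_safety = v["warning"] + v["safety"]
print("warning strength:", round(v["warning"], 2))
print("safety strength: ", round(v["safety"], 2))
print("summed expectancy with safety signal:", round(expectancy_with_safety, 2))
```

In this sketch the safety signal ends with a negative strength that cancels the excitatory strength of the warning stimulus, so the summed expectancy of the aversive event is close to zero whenever the safety signal is present.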

A relevant question then is whether, in addition to reducing fear levels, safety signals might also be able to inhibit avoidance behavior. This hypothesis was tested in a seminal study of Rescorla and LoLordo (1965), who showed that dogs would learn to jump a small barrier after the presentation of a tone previously associated with shock but would withhold avoidance upon the presentation of another tone previously associated with the absence of shock (see Weisman and Litner, 1969, for a replication). Those findings indicate that during avoidance learning, knowledge is acquired not only about stimuli that signal an aversive event, so that those stimuli come to elicit defensive behaviors (e.g., avoidance, escape), but also about stimuli that signal the absence of a forthcoming aversive event, with the latter stimuli inhibiting defensive behaviors.

Assigning a role to safety stimuli in avoidance learning makes it possible to explain a series of data that previous theories could not account for. For example, it has been shown that avoidance can be acquired readily even if the CS is not terminated upon the performance of the experimenter-designated avoidance response (Soltysik et al., 1983; Avcu et al., 2014). This observation contradicts two-factor theory, according to which the termination of the antecedent event is necessary for fear reduction. A safety signal account can explain those data by assuming that changes in contextual or internal cues upon performance of the experimenter-designated response can serve as safety signals. The reinforcing value of safety cues can also explain why avoidance is acquired faster if the avoidance response leads to a safe environment than if it does not (Crawford and Masterson, 1978; see also Morris, 1975; Kim et al., 2006, for related evidence).

Another observation partially supportive of a safety signal account is that rats prefer a box compartment in which a light is presented before the administration of an (unavoidable) shock, over a compartment in which shocks occur unannounced (Lockard, 1963). According to a safety signal account, the predictability of the aversive event makes the situation less unpleasant because the absence of the warning stimulus signals that the situation is safe (Seligman, 1968; Seligman et al., 1971), a view also in line with the RWM approach to inhibitory learning.

The safety signal account can also help to explain why exposure therapy does not necessarily lead to a reduction of avoidance. Specifically, it has been suggested that while undergoing exposure therapy, patients will try to reduce their unpleasantly high fear levels by engaging in overt or covert safety behaviors ("within situation safety"; Wells et al., 1996; Salkovskis et al., 1999). Such safety behaviors may involve the generation of safety signals to reduce fear levels. For instance, individuals who are afraid of experiencing a panic attack while flying might endure traveling in an airplane as long as they can carry anxiolytics with them. Having such medication in their pocket can serve as a signal that they can avoid a potential panic attack. No matter how helpful such a strategy is in reducing momentary fear levels, the engagement in safety behaviors typically preserves the dysfunctional belief that the situation is inherently dangerous. By implication, on future occasions where they cannot engage in their customary safety behaviors, individuals may revert to avoidance behavior in those situations.

### The Cognitive Theory of Seligman and Johnston

One of the most influential theories of avoidance learning, which explicitly addressed the role of informational factors, is the cognitive theory of Seligman and Johnston (1973).

In spite of its name, the theory actually contains both a cognitive and an emotional component. The cognitive component revolves around the assumption that human and non-human animals would prefer not receiving an aversive stimulus over receiving one. The cognitive component also contains the notion that as a result of an avoidance learning schedule, humans or animals would learn to expect an aversive stimulus if an avoidance response is not performed and not to expect an aversive stimulus if an avoidance response is performed. The emotional component mainly refers to the Pavlovian fear responses that develop during an avoidance learning procedure as previously described by Mowrer and others.

By incorporating a role for expectancies in avoidance learning and maintenance, the cognitive model could account for data not easily explained by two-factor theory. For example, in the experiment of Solomon et al. (1953) mentioned previously, dogs would keep jumping a barrier subsequent to a buzzer presentation previously associated with shock, even when the experimenter stopped shock administration. Since shocks no longer followed buzzer presentation, it would be expected that fear would extinguish (see Pavlovian extinction). According to two-factor theory, fear motivates escape/avoidance responses, and since fear was assumed to have extinguished, it should be expected that avoidance would diminish as well, a hypothesis that was at odds with the data. Those data, however, can be explained by the cognitive theory. According to it, avoidance is maintained despite the potential reduction in fear levels because by jumping the barrier, the animal does not directly experience that the antecedent stimulus is not followed by the aversive outcome. As such, the expectancy that an aversive event would occur after the presentation of the antecedent stimulus if an avoidance response were not emitted, is preserved.

Disconfirmation of expectancies, on the other hand, could explain why avoidance diminishes when exposure is combined with response prevention (Baum, 1966, 1970). During such a schedule, individuals encounter a phobic stimulus while the execution of all escape responses is blocked. As a result, avoidance responses are typically reduced when the individual encounters the antecedent stimulus in the future. This reduction in avoidance response execution, not achieved by traditional exposure programs, can be explained by considering that by not executing the escape response during the antecedent stimulus presentation, patients can realize that the expected noxious event was not going to occur anyway. This, then, removes the need to execute any defensive reaction during future encounters with the antecedent stimulus.

The cognitive model is not without limitations, some of which Seligman and Johnston noted themselves (Seligman and Johnston, 1973, pp. 100–101). One limitation is that the theory implicitly treats all types of avoidance responses as equivalent. Research on SSDRs suggests that this notion is untenable. Also, as noted by Lovibond (2006), the model remains silent as to how fear and expectancies relate to each other<sup>1</sup>. As a result, it cannot explain experimental observations such as the return of extinguished fear when the performance of the avoidance response is prevented (Solomon et al., 1953).

### Negative Occasion Setter Account

Informational factors also play an important role in the avoidance learning theory of De Houwer and colleagues (De Houwer et al., 2005; Declercq and De Houwer, 2008). According to those authors, an avoidance response serves as a signal that a CS is not going to be followed by an aversive event, which in associative learning language is called a "negative occasion setter" (see Holland, 1992; Schmajuk and Holland, 1998, for reviews on occasion setting).

Functionally, in negative occasion setting experiments, an antecedent stimulus (e.g., a sound; CS) is followed by an outcome (e.g., a shock; US) unless another stimulus X is presented (the occasion setter, OS; e.g., a light). Accordingly, the antecedent stimulus is going to be followed by an outcome only when the occasion setter is absent and vice versa. Stimulus X thus helps to disambiguate the relationship between the antecedent stimulus and the forthcoming outcome.
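The contingency just described can be laid out as a small truth table. The sketch below captures only the experimental contingency, not a learning mechanism. The function name is hypothetical, and the rows in which the CS is absent, including the occasion setter presented alone, are assumed to produce no US.

```python
from itertools import product


def us_follows(cs_present: bool, os_present: bool) -> bool:
    """Return True if the US follows under a negative occasion-setting contingency.

    The US occurs only when the CS is presented without the occasion setter (OS).
    Trials without the CS are assumed to produce no US.
    """
    return cs_present and not os_present


# Enumerate all four trial types of the contingency.
table = {
    (cs, os_): us_follows(cs, os_)
    for cs, os_ in product((True, False), repeat=2)
}
for (cs, os_), us in table.items():
    print(f"CS={cs!s:5} OS={os_!s:5} -> US={us}")
```

Only the trial type with the CS present and the occasion setter absent ends in the US, which is the sense in which the occasion setter disambiguates what the CS predicts.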

Negative occasion setting can be translated to clinical situations. It could be argued, for example, that the engagement in safety behaviors, such as avoidance, serves as a signal that a specific phobic stimulus or dreaded situation is not going to result in an aversive event. Going back to our earlier example, the administration of an anxiolytic pill could signal that being in an airplane is not going to be accompanied by a panic attack. Such a proposition is different from the safety signal account. According to this latter account, a specific stimulus is supposed to predict the absence of an aversive event directly. According to the negative occasion setter account, however, the presence of a specific event (e.g., avoidance) predicts that the otherwise valid relation between a threatening stimulus and an aversive event does not hold.

<sup>1</sup>Notably, and in contrast to more recent models of avoidance learning (e.g., the expectancy model; Lovibond, 2006), Seligman and Johnston (1973) did not assign a mediating role to expectancies for Pavlovian learning, although they did not fully exclude such a possibility either (see footnote 14 in Seligman and Johnston, 1973).

De Houwer and Declercq tested whether an avoidance response functions as a negative occasion setter by comparing the properties of avoidance behavior to properties of negative occasion setters identified in the Pavlovian literature (Holland, 1992; Schmajuk and Holland, 1998). Those properties are (a) modulation (i.e., CRs toward the CS are stronger in the absence than in the presence of the OS), (b) resistance to counterconditioning (i.e., CRs toward the CS are attenuated by the OS even if the OS has been paired with the US itself), and (c) selective transfer (i.e., the OS will modulate responding to other CSs that have been subject to modulation before; Holland, 1992). The experiment of De Houwer et al. (2005) was as follows.

In the first phase of their experiment, stimuli A and B were always followed by a US (shock) whereas a third stimulus C was followed by a US 50% of the time. In the second phase, the US could be prevented by pressing key X when the A stimulus was present and by pressing key Y when the B stimulus was present. The third phase consisted of the presentation of all the events occurring in the previous phases, in addition to trials in which the US occurred upon pressing the X key (X-US trials). In the crucial test phase, individuals were presented with all possible stimulus combinations, without the US ever occurring, and they were asked to rate their expectancy of a US.

Results from the test phase confirmed that avoidance responses exhibit the properties of negative occasion setters. First, participants reported higher US expectancies when the avoidance response was not available (i.e., A and B test trials) than when it was (i.e., AX and BY trials; modulation). This result shows that the function of the antecedent stimulus as a predictor of a negative outcome is dependent on the availability of the avoidance response, a finding at odds with two-factor theory but in line with other theories, such as Seligman and Johnston's cognitive model. Second, modulation was not affected by trials in which avoidance responses were followed by the US (i.e., the X-US trials; resistance to counterconditioning). This finding is in partial contrast with both the safety signal hypothesis and the cognitive model, given that the avoidance response has now become a predictor of the aversive event rather than of its omission. Third, participants would generalize this trained modulation to new stimuli, particularly those stimuli that had been involved in avoidance training (i.e., higher modulation for the AY and BX trials than for the CX or CY trials; selective transfer). Other models cannot easily account for this selectivity in trained modulation.
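To keep the comparisons among test trials apart, the trial types named above can be sorted into categories. The short Python sketch below does only that; the category labels (`cs_alone`, `trained`, `transfer`, `control`) and the function name are hypothetical conveniences introduced here, not terminology from De Houwer et al. (2005).

```python
# Cue-response pairs practised during avoidance training (second phase).
TRAINED_PAIRS = {("A", "X"), ("B", "Y")}
# Cues that took part in avoidance training at all.
AVOIDANCE_CUES = {"A", "B"}


def trial_type(cue, response=None):
    """Classify a test trial by the cue shown and the key (if any) available."""
    if response is None:
        return "cs_alone"   # e.g., A or B: no avoidance response available
    if (cue, response) in TRAINED_PAIRS:
        return "trained"    # AX, BY: compared with cs_alone to assess modulation
    if cue in AVOIDANCE_CUES:
        return "transfer"   # AY, BX: a trained cue with the other trained key
    return "control"        # CX, CY: a cue never involved in avoidance training


for cue, key in [("A", None), ("A", "X"), ("A", "Y"), ("C", "X")]:
    print(cue, key, trial_type(cue, key))
```

In these terms, modulation is the difference between `cs_alone` and `trained` trials, and selective transfer is the difference between `transfer` and `control` trials.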

Despite subsequent replication and extension of those findings (Declercq and De Houwer, 2008), more recent evidence (e.g., Declercq and De Houwer, 2011) argues against the negative occasion setting account, as the property of selective transfer could also be explained by the lesser reinforcement of C compared to A and B during the first phase of the experiment. Still, despite this limitation, insights from the negative occasion setting account could prove clinically important. Current therapeutic techniques mainly target the relation between the avoidance response and the omission of an unpleasant event. The negative occasion setting account assumes that there is a hierarchical structure in avoidance learning, with the avoidance response disambiguating the relation between the antecedent stimulus and the omission, or presentation, of the aversive event (see Declercq and De Houwer, 2009, for relevant evidence). If that is the case, interventions should perhaps focus more on challenging patients' beliefs about whether their avoidance response leads to the omission of an otherwise to-be-expected noxious outcome in the presence of some antecedent stimulus, rather than on their beliefs as to whether the antecedent stimulus is a reliable predictor of the noxious event (Declercq and De Houwer, 2009).

### Lovibond's Expectancy Model

The most recent avoidance learning theory to include informational factors is the expectancy model of Lovibond (2006), which is basically an extension of the cognitive model of Seligman and Johnston (1973).

The expectancy model agrees with the idea that avoidance is acquired by a combination of Pavlovian and instrumental learning processes. It also accepts Seligman and Johnston's notion that during the instrumental phase, knowledge is acquired about the effects of avoidance (e.g., the omission of an expected unpleasant event) as well as non-avoidance (i.e., the presentation of an expected unpleasant event). Lastly, it aligns with the safety signal account in that it accepts that during the presentation of safety stimuli avoidance behavior is inhibited. However, the expectancy model rejects the notion that safety signals serve as positive reinforcers of the emitted response.

A major deviation of the expectancy model from the aforementioned theories is the assumption that both Pavlovian and instrumental learning are based on propositional knowledge. According to the propositional learning account, all learning reflects higher-order, reason-based processes, whereas earlier associative learning theories, explicitly or implicitly, rely on automatic association formation as the mechanism of learning. As such, the expectancy model assumes that avoidance learning is achieved by the accumulation of explicit knowledge about all the stimulus contingencies (e.g., that a CS is followed by a US during a Pavlovian phase) involved in avoidance learning protocols. In the same vein, the expectancy account assumes that expectancies play a crucial role not only in instrumental learning (as hypothesized by Seligman and Johnston) but in Pavlovian conditioning as well.

Most of the data we have presented so far can be accommodated by the expectancy model. Similarly to the model of Seligman and Johnston (1973), it can explain how avoidance is acquired, how it can be maintained despite Pavlovian extinction, and why response prevention during extinction will reduce avoidance. In addition, by reserving a central role for expectancies in Pavlovian learning, the expectancy model can better account for the relation between fear and avoidance. The expectancy model can explain, for example, why fear returns during response prevention, because in the absence of the avoidance response the aversive event is to be expected, and outcome expectancy is what generates fear.

Lastly, although not explicitly mentioned by Lovibond (2006), the fact that avoidance relies on propositional knowledge also allows for the acquisition of avoidance via pathways other than direct experience, rendering the account able to accommodate observations of avoidance acquired through observation, instruction, and symbolic generalization (see above).

In summary, Lovibond's expectancy model is a hybrid of earlier theories that accounts for the majority of experimental findings. Still, the model does not readily explain why some avoidance responses are learned more readily than others. It also cannot account for data in which escape/avoidance responses are acquired and expressed in the absence of instrumental reinforcement (Krypotos et al., 2014b). Lastly, it is debatable whether all elements of the expectancy model, and specifically the notion that propositional knowledge is a prerequisite for learning, could be generalized to non-human animals (Castro and Wasserman, 2009; but see Beckers et al., 2006).

### Principles of Avoidance Learning

We have reviewed early and more recent theories of avoidance learning, describing the strengths and limitations of each theoretical account. In this section we synthesize the strongest points of each theoretical account, distilling a set of principles of avoidance learning that collectively can account for the majority of existing data. We then consider recent experimental findings on the acquisition of avoidance tendencies in the light of those principles and discuss their potential clinical implications.

For illustration, we will describe the various phases of avoidance learning by reference to a modified version of the procedure used in the third experiment of Rescorla and LoLordo (1965) that we have briefly described above (see the section on safety signals). We use this experiment because it resembles a typical active avoidance learning procedure while also illustrating the potential inhibition of avoidance responses by safety stimuli.

In the experiment, dogs were placed in a large two-compartment box, separated by a small barrier. The experimental procedure included the presentation of a short tone (threat signal; CS+) always followed by shock administration (US) and the presentation of a long tone (safety signal; CS−) never followed by shock administration. Dogs could avoid shock administration by jumping across the barrier (i.e., the avoidance response) during the presentation of the CS+. As described, such a procedure would typically lead to reliable execution of the avoidance response during the CS+ and absence of such responses during the CS−. We now turn to a step-by-step description of how those acquired responses can be theoretically explained.

In line with most of the theoretical accounts we have presented above, we propose that avoidance learning involves a Pavlovian component. Specifically, we assume that following a Pavlovian conditioning procedure, the presentation of the CS+ will elicit the expectancy of an aversive outcome, whereas the CS− will elicit the expectancy of absence of the aversive outcome. As a result of those expectancies, fear responses will be evoked (e.g., physiological arousal, avoidance tendencies) in the presence of the CS+, whereas responses of relief (e.g., physiological relaxation, approach tendencies) are expressed in the presence of the CS−.

In extension to the models we have presented above, we maintain that the described Pavlovian component is sufficient for evoking avoidance tendencies. As such, we essentially treat avoidance tendencies as conditioned responses, similar to physiological arousal or subjective apprehension, elicited by the presentation of a threat cue. Such an explanation fits the data of Krypotos et al. (2014b), where avoidance tendencies were expressed in the absence of any instrumental reinforcement. Going back to our example, we propose that dogs will acquire a tendency to exhibit avoidance in the presence of the CS+ sound prior to any instrumental learning.

Although acquired, we propose that those avoidance tendencies need not be translated into overt behavior (Strack and Deutsch, 2004). At the same time, we argue that if expressed, they will take the form of an SSDR (Bolles, 1970, 1971). In other words, the dogs in our example are expected to start jumping across the barrier upon presentation of the CS+, prior to the operation of any instrumental processes. Such an idea might explain the rapid learning of specific types of responses that serve avoidance, relative to other responses (Bolles and Grossen, 1969). In such cases, however, it would be more accurate to classify those responses as escape responses, expressed in a reflex-like manner, as their primary aim is to distance the organism from the CS+ (i.e., CS escape).

Despite the rapid acquisition of SSDRs by mere Pavlovian learning, the maintenance of those SSDRs, or the learning of non-SSDRs, will in fact depend on instrumental learning. As such, we adhere to the spirit of Mowrer's two-factor theory and more recent reformulations in assuming that avoidance learning depends on both Pavlovian and instrumental processes. Yet, we propose that instrumental learning plays a different role depending on whether the to-be-learned avoidance response belongs to an organism's SSDR repertoire or not. Revisiting our example, dogs are expected to start jumping across the barrier upon presentation of the CS+, with such a response being maintained if the US stops being administered (i.e., negative reinforcement). If, however, such a response does not lead to cancelation of the US, that particular avoidance response will cease to be performed. For illustration, assume that the experimenter-designated response changes from jumping (i.e., an SSDR) to rearing (i.e., a non-SSDR). As jumping does not result in omission of the US, another SSDR will be performed (e.g., running within the same compartment). However, since that novel response is followed by US presentation as well, the dog will cease that response too. Eventually, when all SSDRs have been abandoned, the dog may incidentally perform a non-SSDR (i.e., rearing). This response is then hypothesized to be maintained through negative reinforcement by US omission. Although such treatment of non-SSDRs bears similarities with Bolles' (1971) theory, in the sense that non-SSDRs are supposed to emerge after SSDRs are no longer performed, it also deviates from SSDR theory in that it accepts a role for reinforcement learning in the maintenance of SSDRs and the learning of non-SSDRs.
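The sequence in this example can be traced with a deliberately simplified Python sketch. It is not a model from the literature. It assumes that SSDRs are tried before non-SSDRs, that a response is abandoned after a single trial on which the US still occurs, and that the first response followed by US omission is maintained; the function and variable names are hypothetical.

```python
def settle_on_response(ssdrs, non_ssdrs, effective_response):
    """Return (history, maintained): the responses tried and the one that is kept.

    Assumptions (for illustration only): SSDRs come first, one failure is
    enough to abandon a response, and US omission maintains the response.
    """
    history = []
    for response in list(ssdrs) + list(non_ssdrs):
        us_omitted = response == effective_response
        history.append((response, us_omitted))
        if us_omitted:
            # Negative reinforcement by US omission: the response is maintained.
            return history, response
    return history, None  # nothing in the repertoire prevented the US


# The experimenter-designated response is rearing (a non-SSDR).
history, maintained = settle_on_response(
    ssdrs=["jumping", "running"], non_ssdrs=["rearing"], effective_response="rearing"
)
print(history, maintained)
```

If the designated response is instead an SSDR such as jumping, the same function returns after the first trial, which mirrors the rapid acquisition of SSDRs described above.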

In contrast to existing models, we maintain that for the instrumental component of avoidance learning, there are multiple sources of reinforcement. Those sources include (1) the non-occurrence of an expected aversive event after performance of the avoidance response; (2) the reduction of fear due to escape from the CS+, at least in the initial stages of reinforcement learning; (3) the occurrence of discrete stimuli upon performance of an avoidance response that become safety signals. Although this is a bottom-up suggestion, we believe that by assuming multiple sources of reinforcement, the described principles fit a larger amount of data.

In line with both the cognitive and the expectancy model, we propose that during instrumental learning, knowledge is acquired that (1) the performance of an avoidance response in the presence of the antecedent stimulus leads to the omission of the otherwise expected aversive event; and (2) the non-performance of an avoidance response in the presence of the antecedent stimulus leads to the administration of the aversive event. Those two theoretical accounts also assume that an individual or an animal will prefer not encountering an aversive event (here a shock) over encountering one.
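These two pieces of knowledge, together with the preference for not encountering the aversive event, can be sketched in a few lines of Python. The sketch rests on assumptions introduced here: expectancies are coded as subjective probabilities of the US, and they are revised with a simple incremental rule. That rule is a convenience for illustration only and is not meant to take a side on whether the underlying learning is propositional or associative.

```python
def update_expectancy(expectancy, action, us_occurred, rate=0.2):
    """Move the expected probability of the US for `action` toward the observed outcome."""
    expectancy[action] += rate * (float(us_occurred) - expectancy[action])


def preferred_action(expectancy):
    """Prefer the action with the lowest expected probability of the aversive event."""
    return min(expectancy, key=expectancy.get)


# Expected probability of the US in the presence of the antecedent stimulus,
# starting from an uninformed value of 0.5 for both actions (an assumption).
expectancy = {"avoid": 0.5, "no_avoid": 0.5}
for _ in range(10):
    update_expectancy(expectancy, "avoid", us_occurred=False)    # knowledge (1)
    update_expectancy(expectancy, "no_avoid", us_occurred=True)  # knowledge (2)

print(expectancy, preferred_action(expectancy))
```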

Lastly, we take no position here in the debate as to whether avoidance learning (and in particular the Pavlovian component thereof) is based exclusively on propositional knowledge or also relies on non-propositional association formation processes. The recent literature provides strong evidence to support a single-process propositional account of learning (see De Houwer, 2009; Mitchell et al., 2009) as well as intriguing evidence to the contrary (e.g., Sevenster et al., 2014).

### Clinical Implications

The set of principles described above can help to detect sources of maladaptive avoidance. By building on those principles, clinical interventions for maladaptive avoidance can also be improved.

For instance, given that avoidance learning is assumed to depend on multiple pathways, the causes of maladaptive behavior should not be assumed to be limited to direct experience with threat. Maladaptive avoidance may just as well be the result of instructions (e.g., kids being instructed to stay away from dogs leading to a phobia toward dogs) or vicarious learning (e.g., watching a plane crash on television as an onset of avoidance of flying). Another potential pathway of avoidance learning concerns symbolic generalization (e.g., avoidance of thoughts of contamination that only symbolically relate to actual contamination; Dymond et al., 2012).

Independent of the pathway, however, patients will often be able to articulate purported relations between the antecedent stimuli and the presentation or omission of an aversive outcome, as well as the relationship between the cancelation of that aversive outcome and the performance of an avoidance response (Beck et al., 2005). Given that those relationships often reflect misconceptions of the patient (e.g., that social interactions will always end in embarrassment), those beliefs should be challenged and modified during clinical interventions, a suggestion in line with current cognitive-behavior therapy protocols.

In terms of interventions, a common therapeutic technique for anxiety disorders and phobias is exposure therapy, which entails confronting the individual repeatedly with the phobic stimulus. Clinical results suggest that exposure is necessary, but not sufficient, for the reduction of avoidance. A more effective technique for diminishing avoidance is the combination of exposure therapy with response prevention. This is consistent with existing studies that show that in order to extinguish an avoidance behavior it is not enough to omit the aversive outcome after an antecedent stimulus; it is imperative that subjects are meanwhile not allowed to perform the avoidance response (see Solomon and Wynne, 1953; Brush, 1957; Seligman and Campbell, 1965). This suggestion is based on experimental findings showing that participants engaging in avoidance and other safety-like behaviors during extinction will keep avoiding the fear-conditioned stimulus when avoidance is again allowed (see the "protection-from-extinction" phenomenon; e.g., Lovibond et al., 2009). Similarly, in clinical cases, patients with social phobia will often endure an exposure session by performing subtle behaviors that reduce the level of experienced fear ("within-situation safety"; Wells et al., 1996). Helpful as that may be for enduring the phobic situation, it does not help in the modification of the initial fear beliefs (Salkovskis et al., 1999). Therefore, the combination of exposure therapy with response prevention could lead to stronger symptom reduction, despite causing sustained fear levels during exposure (Abramowitz, 1996; Neziroglu et al., 2011). In that direction, and in line with the expectancy model, a change in irrational beliefs of the clients, as typically done in cognitive-behavior therapies, could also prove helpful (Whittal et al., 2005).

Finally, we have shown that avoidance tendencies can be established by mere Pavlovian association, without any instrumental reinforcement. Those tendencies are prone to return after Pavlovian extinction (Krypotos et al., 2014b). Given that such avoidance tendencies may act as precursors of overt avoidance, it might be helpful if therapeutic protocols also target the modification of action tendencies in addition to overt behavior. Encouraging results in that direction come from experimental studies on the modification of action tendencies using approach-avoidance training tasks (training AATs; see Wiers et al., 2010; Wittekind et al., 2015). In those tasks, participants have to primarily approach one type of stimulus and avoid another type of stimulus by using a joystick or keyboard buttons. Following such procedures, participants typically exhibit stronger approach tendencies toward the former type of stimuli and stronger avoidance tendencies toward the latter type of stimuli than before. Training AATs seem to influence corresponding overt behavior as well, at least in nonclinical populations (Taylor and Amir, 2012; Amir et al., 2013). However, translation of those findings to clinical populations has proven unsuccessful so far (e.g., Asnaani et al., 2014; Krypotos et al., in press; van Uijen et al., 2015). More research is clearly warranted here.

### Current Challenges and Future Directions

The renewed interest in avoidance learning is apparent not only in experimental psychology research, but also in clinical psychology, psychiatry, and behavioral neuroscience (see above). With the increased interest in avoidance learning in clinical science and behavioral neuroscience come new challenges, such as a need for better communication between basic researchers and clinicians and integration of insights from neuroscience in psychological theories of avoidance learning.

### Enhancing the Communication between Basic Scientists and Clinicians

A first step for enhancing communication between basic scientists and clinicians (clinical psychologists and psychiatrists) is convergence on a common language. As an illustration of the discrepancy between the experimental and clinical use of similar terms we refer to the definition of active avoidance as used in DSM-5 (American Psychiatric Association, 2013). There it is stated that "Active avoidance means the individual intentionally behaves in ways that are designed to prevent or minimize contact with phobic objects or situations (e.g., takes tunnels instead of bridges on daily commute to work for fear of heights; avoids entering a dark room for fear of spiders; avoids accepting a job in a locale where a phobic stimulus is more common)." This definition does not distinguish between active and passive avoidance or between escape and avoidance; yet those behaviors might reflect the operation of distinct psychological mechanisms (see above). We believe that a common definition of the different expressions of avoidance and of the conditions under which they occur may enable clearer communication between experimental and clinical science and enhance the translation of experimental findings to clinical interventions and back.

Throughout our review, we have attempted to translate basic experimental findings to clinical situations. It is equally important that clinical observations feed into basic research. A good example of that is a recent study in which avoidance behavior led to US cancelation on some, rather than all, trials. This experimental protocol may map better on real-life situations where avoidance is not always successful in preventing the occurrence of an unpleasant event (Vervliet, 2014). This should therefore provide better ecological validity to experimental findings (Vervliet and Raes, 2013).

The avoidance learning procedures we have described so far mainly focus on how someone learns to avoid an expected aversive event. This is, however, quite different from what is mostly of interest in clinical practice. Take for example someone with symptoms of spider phobia. In that case, the spider could serve as the antecedent stimulus of the aversive event, such as a venomous bite. In experimental settings, that would entail that the individual would escape the spider and avoid the bite. However, it is quite common for an individual to try and avoid the spider altogether, for example by keeping away from places where spiders are commonly found (e.g., forests). Therefore, experimental studies should develop protocols in which it is possible for someone to avoid the antecedent stimulus rather than the aversive event. Procedures that include sequential CSs (e.g., CS1 → CS2 → US relationships) could prove useful in this regard (e.g., Levis and Boyd, 1979; Malloy and Levis, 1988).

### Future Developments

The theories of avoidance learning reviewed above are utterly silent with regard to the role of individual difference factors. Likewise, most experimental studies focus on how avoidance responses are acquired in general, without attending to differences between participants. However, investigating how individual differences in trait characteristics or biological factors (e.g., sex hormones) affect avoidance learning could be important for theoretical and clinical reasons.

Theoretically, the consideration of differences in biological factors or trait characteristics may help in predicting distinct avoidance learning patterns. With respect to biological factors, for example, animal research has demonstrated sex and strain differences in the rate at which avoidance is acquired (Avcu et al., 2014) and extinguished (Jiao et al., 2011). Sex differences have also been documented in humans (e.g., in Sheynin et al., 2014). Regarding trait characteristics, Lommen et al. (2010) have shown that individuals with high levels of neuroticism tend to show higher levels of avoidance toward ambiguous stimuli than individuals with low levels of neuroticism. Of importance, differences in the rate of avoidance learning might be associated with differences in other learning processes. For example, behaviorally inhibited individuals exhibit stronger conditionability between a CS and a US compared to noninhibited individuals (Allen et al., 2014).

Clinically, the investigation of individual differences could inform the development of treatments that are tailored to specific disorders. Indeed, it has been suggested that avoidance learning patterns may differ across mental disorders (Mosig et al., 2014). Support for this suggestion comes from findings demonstrating differences in conditionability (i.e., the tendency to "acquire a larger and more persistent (autonomic) differential response to an aversive CS"; Wegerer et al., 2013) between individuals with low and high levels of spider phobia (Mosig et al., 2014). It may be hypothesized that those differences could also generalize to distinct avoidance learning patterns that differ across mental disorders. Future studies might try to unravel whether distinct patterns of avoidance constitute prognostic factors for specific disorders, allowing more targeted interventions.

Another avenue for research might be the investigation of how avoidance responding turns from goal-directed to habitual (Dickinson, 1980; Wood and Neal, 2007). If we assume that persistent maladaptive avoidance behavior (e.g., in anxiety disorders) is often instrumentally reinforced, the principles described above imply that avoidance is performed in a goal-directed manner. In other words, it is assumed that avoidance is based on knowledge about the consequences of each action and the desirability of each outcome event. However, especially in the area of psychopathology, behavior is often performed in a habitual manner, that is, in a more automatic/reflexive way than goal-directed actions. Specifically, in cases such as that of anxiety disorders, maladaptive avoidance is performed repetitively, over long periods of time. Such overtraining of maladaptive avoidance might result in that behavior taking the form of a habit, performed in an automatic, as opposed to goal-directed, manner (Wood and Neal, 2007). This distinction is important as habitual responses have been shown to be less sensitive to extinction (Yin and Knowlton, 2006). As such, it would be worthwhile both experimentally and clinically to address the factors that turn a goal-directed avoidance action into a habit. In that direction, Ilango et al. (2014) proposed that in cases of habitual avoidance in rats, active avoidance recruits and depends on the same neural structures also involved in habit formation (e.g., the striatal-nigral-striatal circuitry), suggesting that this region should be targeted in the investigation of persistent avoidance.

Recent computational models of avoidance learning are another potential source of novel insight. One such model is the actor-critic model of Maia (2010) that is strongly inspired by the two-factor theory of Mowrer but also takes the role of prediction error in learning into account. Myers et al. (2014) also recently presented a reinforcement learning computational model that could successfully predict the acquisition of avoidance behavior in Sprague-Dawley and Wistar-Kyoto rats. Lastly, we have provided a Bayesian drift-diffusion model decomposition of performance in AATs (Krypotos et al., 2014a). Such models often allow a more accurate assessment of performance and a more precise investigation of the psychological mechanisms involved in avoidance learning/performance. Of importance, knowledge of the cognitive mechanisms in play during maladaptive behavior expression could serve the development of more targeted clinical interventions (Kazdin, 2008).
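To give a flavor of how prediction errors can drive avoidance in such models, the Python sketch below implements a generic, textbook-style actor-critic learner for a single warning-signal state. It is not the model of Maia (2010) or of Myers et al. (2014). The two-action setup, the outcome coding (−1 for shock, 0 for no shock), the learning rates, and all names are assumptions chosen for illustration.

```python
import math
import random


def train_actor_critic(n_trials=200, alpha=0.1, beta=0.1, seed=0):
    """Generic one-state actor-critic sketch (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    value = 0.0                            # critic: learned value of the warning signal
    prefs = {"avoid": 0.0, "stay": 0.0}    # actor: action preferences
    for _ in range(n_trials):
        # Softmax choice between the two actions.
        p_avoid = 1.0 / (1.0 + math.exp(prefs["stay"] - prefs["avoid"]))
        action = "avoid" if rng.random() < p_avoid else "stay"
        outcome = 0.0 if action == "avoid" else -1.0  # shock only when not avoiding
        delta = outcome - value            # prediction error (the trial ends here)
        value += alpha * delta             # critic update
        prefs[action] += beta * delta      # actor update for the chosen action
    return value, prefs


value, prefs = train_actor_critic()
print(value, prefs)
```

Once the warning signal has acquired a negative value, the absence of shock after avoiding yields a positive prediction error, which strengthens the avoidance response even though no explicit reward is ever delivered.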

Over the past decade, behavioral neuroscience research has greatly expanded our insight into the neural correlates of avoidance learning. It has been shown, for example, that avoidance learning in humans correlates with activation of the amygdala, which is known to play a key role in the acquisition of Pavlovian fear responses (Phelps and LeDoux, 2005), and the striatum (O'Doherty, 2004), which is involved in learning about rewards (Delgado et al., 2009). In rodent studies, areas such as the amygdala (Cain and LeDoux, 2008), the infralimbic prefrontal cortex (Moscarello and LeDoux, 2013), and the prefrontal striatal circuits (Bravo-Rivera et al., 2014; see also Phelps and LeDoux, 2005; Ilango et al., 2014 for reviews) have also been shown to play a role in avoidance acquisition. Findings such as these help to further increase our understanding of how Pavlovian and instrumental learning processes shape avoidance behavior.

Exciting possibilities for the deeper investigation of the neural correlates of avoidance acquisition open up with the introduction of new methodologies. For example, in order to mimic the behavior of natural predators, Choi and Kim (2010) used the Robogator (LEGO Mindstorms robot) while rats were approaching or avoiding food stimuli. The results showed that rats' foraging behavior increased during amygdala inactivation and decreased during amygdala activation. Also, Bravo-Rivera et al. (2014) attempted to dissociate the neural circuits mediating active avoidance in rats. Importantly, the researchers argued against the use of the traditional shuttlebox, where both compartments can predict shock, making it difficult to differentiate between circuits involved in active avoidance and escape. Instead, they used a new paradigm where a US could be avoided if animals would step on a nearby platform that had never been shocked before. The results showed that active avoidance is mediated by the prefrontal-striatal circuit. Further neurobiological dissection of the differences between passive avoidance, active avoidance, and escape is likely to necessitate updating of psychological theories of avoidance learning, which typically assume that they rely on similar processes. In that respect, exciting possibilities for studying the neural circuits of avoidance acquisition open up with the use of optogenetics (see Kravitz et al., 2012; Nabavi et al., 2014; Namburi et al., 2015 for recent examples).

### Conclusions

Research on avoidance learning had waned since the 1970s, leaving key questions unanswered. In light of the recent renewal of interest in avoidance in behavioral and brain research and in clinical science, we have provided a review of the most prominent historical and modern avoidance learning theories and relevant empirical findings. This review has yielded a number of essential principles of avoidance learning that should be useful for experimental researchers as well as clinicians.

Many questions remain to be answered. We have highlighted topics for future research as well as ways to enhance communication between basic and clinical science. By doing so, we hope that the present paper contributes to further research on the psychological and biological basis of avoidance and helps to pave the way for novel interventions.

### Author Contributions

AMK and TB wrote the manuscript. MK and ME provided critical feedback to the manuscript. All authors have approved the final version of the manuscript.

### Acknowledgments

Preparation of this paper was supported by Innovation Scheme (Vidi) Grant 452-09-001 of the Netherlands Organization for Scientific Research (NWO) awarded to TB. While writing parts of this paper, AMK was a scholar of the Alexander S. Onassis Public Benefit Foundation (FZE 039/2011-2012) and a visiting fellow at the school of Psychology at the University of New South Wales (Sydney, Australia). We would like to thank Peter Lovibond for his invaluable contribution to the present paper and Frederic Westbrook for useful discussions related to the theories reviewed here.

### References


Beckers, T., and Vervliet, B. (2009). The truth and value of theories of associative learning. Behav. Brain Sci. 32, 200–201. doi: 10.1017/S0140525X09000880

Bekhterev, V. (1907). Objective Psychology. St. Petersburg: Soikin.


vulnerability: insights from computational modeling. Front. Behav. Neurosci. 8:283. doi: 10.3389/fnbeh.2014.00283


Pavlov, I. P. (1927). Conditioned Reflexes. London: Courier Dover Publications.


in the maintenance of panic disorder with agoraphobia. Behav. Res. Ther. 37, 559–574.


depression. Behav. Res. Ther. 45, 1141–1153. doi: 10.1016/j.brat.2006.09.005


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Krypotos, Effting, Kindt and Beckers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### Avoidance as expectancy in rats: sex and strain differences in acquisition

### *Pelin Avcu1,2, Xilu Jiao2,3, Catherine E. Myers 1,2,3,4, Kevin D. Beck1,2,3,4, Kevin C. H. Pang1,2,3,4 and Richard J. Servatius 1,2,3,4\**

*<sup>1</sup> Graduate School of Biomedical Sciences, New Jersey Medical School, Rutgers Biomedical and Health Sciences, Newark, NJ, USA*

*<sup>2</sup> Stress and Motivated Behavior Institute, New Jersey Medical School, Rutgers Biomedical and Health Sciences, Newark, NJ, USA*

*<sup>3</sup> Department of Neurology and Neurosciences, New Jersey Medical School, Rutgers Biomedical and Health Sciences, Newark, NJ, USA*

*<sup>4</sup> Neurobehavioral Research Lab, Department of Veterans Affairs Medical Center – New Jersey Health Care System, East Orange, NJ, USA*

### *Edited by:*

*Gregory J. Quirk, University of Puerto Rico, USA*

#### *Reviewed by:*

*Bronwyn Margaret Graham, University of New South Wales, Australia*

*Marie-France Marin, Massachusetts General Hospital, Harvard Medical School, USA*

### *\*Correspondence:*

*Richard J. Servatius, Department of Veterans Affairs, Syracuse VA Medical Center, Research 151, 800 Irving Avenue, Syracuse, NY 13210, USA*

*e-mail: richard.servatius@va.gov*

Avoidance is a core feature of anxiety disorders, and factors that increase avoidance expression or its resistance represent a source of vulnerability for anxiety disorders. Outbred female Sprague Dawley (SD) rats and inbred male and female Wistar-Kyoto (WKY) rats expressing behaviorally inhibited (BI) temperament learn avoidance faster than male SD rats. The training protocol used in these studies had a longstanding interpretive flaw: a lever-press had two outcomes, termination of the warning signal (WS) and prevention of foot shock. To disambiguate between these two explanations, we conducted an experiment in which: (a) a lever-press terminated the WS and prevented shock, and (b) a lever-press only prevented shock, but did not influence the duration of the WS. Thus, a 2 × 2 × 2 (Strain × Sex × Training) design was employed to assess the degree to which the response contingency of the WS termination influenced acquisition. Male and female SD and WKY rats were matched on acoustic startle reactivity within strain and sex and randomly assigned to the training procedures. In addition, we assessed whether the degree of avoidance acquisition affected estrous cycling in female rats. Consistent with earlier work, avoidance performance of female rats was generally superior to males and WKY rats were superior to SD rats. Moreover, female SD and male WKY rats were roughly equivalent. Female sex and BI temperament were confirmed as vulnerability factors in faster acquisition of avoidance behavior. Avoidance acquisition disrupted estrous cycling, with female WKY rats recovering faster than female SD rats. Although termination of the WS appears to be reinforcing, male and female WKY rats still achieved a high degree (greater than 80% asymptotic performance) of avoidance in the absence of the WS termination contingency. Such disambiguation will facilitate determination of the neurobiological basis for avoidance learning and its extinction.

**Keywords: lever-press avoidance, anxiety, vulnerability, behavioral inhibition, Wistar-Kyoto (WKY) rat, expectancy, shock avoidance**

### **INTRODUCTION**

Avoidance in its various forms (experiential, emotional, and cognitive) is a common feature of all anxiety disorders (American Psychiatric Association, 2000). Development of anxiety disorders can be best explained by diathesis models, accounting for a complex interaction of individual vulnerabilities with environmental risk factors. A greater focus on individual differences in avoidance learning theories has the promise of providing insights into the epidemiology and course of anxiety disorders.

A number of organizing principles have been offered to understand acquisition, maintenance, and extinction of avoidance: those focused on associative learning of signs and signals (Seligman and Johnston, 1973; Lovibond, 2006), those focused on reinforcement (Mowrer, 1956; Bersh and Alloy, 1978; Hineline, 2001), and those using a cognitive framework (De Houwer et al., 2005; Declercq and DeHouwer, 2008; Dymond and Roche, 2009; Mitchell et al., 2009). Avoidance learning is most efficient when the subject's response terminates the WS and prevents the shock (shock avoidance). However, concurrent WS termination and shock prevention does not allow a clear interpretation of increases in responses. Is the animal responding to terminate the WS or is it responding to avoid shock? Classic work attempted to separate the reinforcing effects of WS termination and shock avoidance on avoidance learning (Sidman, 1955; Kamin, 1956; Sidman and Boren, 1957; Keehn, 1959; Lockard, 1963; Bower et al., 1965; Bolles et al., 1966; Owen et al., 1977). If WS termination was the primary reinforcement, then responses were controlled by the present, not the future. Converging evidence documented that an inability to terminate the WS merely affected the acquisition of avoidance response (Sidman, 1955; Kamin, 1956; Sidman and Boren, 1957; Keehn, 1959; Lockard, 1963; Bower et al., 1965; Bolles et al., 1966).

These previous studies examined avoidance learning in procedures that utilized shuttling, wheel running, and jumping as responses. These responses are all species-specific defense reactions (SSDRs), and therefore the learning of the avoidance response has been previously questioned (Bolles, 1970). A specific case of avoidance is signaled lever-press avoidance; that is, a WS contingent with shock presentation provides the opportunity to learn that shock may be terminated (escape) or prevented (avoidance). Unlike SSDRs, arbitrary avoidance responses, like lever-press, are slowly and incrementally acquired. Through autoshaping, lever-presses are reinforced for shock termination and prevention.

Classic literature concentrates on explanations for normal avoidance acquisition, expression, and extinction in outbred strains. Yet, it is well-documented that abnormal avoidance expressions form the core of all anxiety disorders. Normal acquisition may only partially, and therefore incompletely, inform the acquisition and maintenance of avoidance in anxiety disorders. A greater susceptibility to acquire pathological avoidance symptoms may cause some individuals to be more vulnerable to develop anxiety disorders. Factors that accelerate acquisition and promote perseveration of avoidance represent a diathesis for anxiety disorders (a learning-diathesis model). Accordingly, BI temperament (Rosenbaum et al., 1991, 1993; Schneier, 2003; Fox et al., 2005) and female sex (Wittchen and Hoyer, 2001; Vogt et al., 2005; Bleich et al., 2006; Foa et al., 2006; Hapke et al., 2006; Karamustafalioglu et al., 2006; Smith et al., 2008) are identified as independent vulnerability factors for the development of anxiety disorders.

As one moves toward understanding vulnerability factors in rapid avoidance acquisition and its persistence in animal models, a clear interpretation of the source of sensitivity is essential. Similar to human literature, female sex is also associated with greater risk for anxiety disorders in animal models. Female rats acquire lever-press avoidance faster than male rats (Heinsbroek et al., 1983; Beck et al., 2010). Resembling humans expressing BI temperament, inbred WKY rats display extreme withdrawal in the face of novel social challenges (Ferguson and Cada, 2004; Braw et al., 2006) and non-social challenges (McCarty et al., 1984; Pare and Schimmel, 1986; Pare, 1989; McAuley et al., 2009); and greater sensory motor reactivity compared to SD rats (Servatius et al., 1998; Beck et al., 2010). Acquisition of lever press avoidance is faster and expressed to a greater degree in WKY rats than SD rats (Servatius et al., 2008; Beck et al., 2010; Jiao et al., 2011; Perrotti et al., 2013). Furthermore, both females (Shors et al., 1986, 1998; Wood and Shors, 1998; Beck et al., 2008, 2012), and WKY rats (Ricart et al., 2011a,b) exhibit faster associative learning evident in eyeblink conditioning. Thus, enhanced avoidance acquisition may be through enhanced sensitivity to WS termination (driven by WS-shock associations) or prevention of shock. The existing data—in which WS co-terminated with the shock—fail to disambiguate these two possibilities.

The present study compared avoidance acquisition under two training procedures: (1) a lever-press terminated the WS and prevented shock occurrence (contingent WS) and (2) a lever press only prevented shock occurrence (non-contingent WS). Considering the interpretation of early findings (Sidman, 1955; Kamin, 1956; Sidman and Boren, 1957; Keehn, 1959; Lockard, 1963; Bower et al., 1965; Bolles et al., 1966), we expected acquisition of avoidance to be evident in the absence of the WS termination contingency, albeit slower than with a WS termination contingency. Further, we expected the female rats to express avoidance acquisition to a greater degree than male rats regardless of the training parameters. Similarly, we expected WKY rats to express avoidance to a higher degree than SD rats. Moreover, we expected an interaction of female sex and BI temperament such that female WKY rats would express avoidance to the highest degree of those tested. Thus, the vulnerability factors examined will demonstrate a specific enhanced sensitivity to the avoidance of impending shock, as opposed to momentary responses to the presence of the WS.

Additionally, we investigated the effect of avoidance acquisition on the estrous cycle in female SD and WKY rats. Typically, physiological stress levels are maximal early in training, which is commensurate with a relatively high degree of shock exposure. As avoidance is acquired, stress levels are reduced (Coover et al., 1973; Berger et al., 1981). Exposure to stress has been shown to have adverse effects on several aspects of the reproductive system (e.g., estrous cycle) in both humans (Genazzani et al., 1991) and animals (Gonzalez et al., 1994; Norman et al., 1994). Thus, it is expected that a disruption of the estrous cycle would be observed during the initial sessions of avoidance training, with regular cycling returning during the later sessions when avoidance is learned and stress has presumably been reduced. With regard to WKY rats, the disrupted estrous cycle may recover faster than in SD rats because WKY rats learn avoidance faster and to a greater extent than SD rats. On the other hand, if high asymptotic levels of avoidance are pathological, female WKY rats may continue to show irregular cycling throughout avoidance training.

### **MATERIALS AND METHODS**

### **ANIMALS**

Forty-nine SD and forty-six WKY male (*n* = 48) and female (*n* = 47) rats (300–350 g, 8–10 weeks old; Harlan Sprague-Dawley Laboratories, Indianapolis, IN) were single-housed and kept on a 12:12 h light cycle (lights on 0700) with access to laboratory chow and water *ad libitum*. Upon arrival, rats were acclimated to the housing conditions for at least 2 weeks prior to experimentation. All experiments occurred between 0700 and 1200 h, in the light portion of the cycle. All studies were approved by the Institutional Animal Care and Use Committee (IACUC) in accordance with AAALAC standards.

### **OPEN-FIELD TEST**

Naive rats were evaluated for locomotor activity in the open-field test, consistent with previous work (Servatius et al., 1995). The apparatus consisted of a gray cylindrical arena, 82 cm in diameter with 30 cm high aluminum walls. The arena floor was divided into three concentric circles, demarcated by black paint. The smallest inner circle had a diameter of 20 cm measured from the center of the arena. The second circle had a diameter of 50 cm and the arena wall defined the outer limit of the third circle. Each of the outer circles was divided by radial lines into equally sized areas of approximately 251 cm². A light (100 W) was located 150 cm directly above the center of the open field. Performance in the open field was scored based on latency to leave the center segment of the open field and number of line crossings (segments entered by all four limbs) made during a 2 min time window. The open field was wiped with a soap solution between the testing of each rat.

### **ACOUSTIC STARTLE REACTIVITY**

Twenty-four hours after the open-field test, all rats were evaluated for sensorimotor reactivity in the acoustic startle reactivity test (previously described in Servatius et al., 1998). Rats were placed on platform accelerometers (Coulbourn Instruments, Langhorne, PA) in restrainers and allowed to acclimate to the testing apparatus for 10 min prior to the onset of testing. Each 40 min testing session consisted of 60 trials of exposure to a single white noise burst (102 dB, 100 ms) against a continuous ambient background noise level of 68 dB. The inter-stimulus interval varied between 25 and 35 s. The test chambers were wiped with a soap solution between the testing of each rat. The test was performed with the ventilation fans on and the lights off. A startle response was scored if activity following the stimulus exceeded a threshold amplitude derived from a 250 ms baseline window prior to stimulus onset. The threshold was defined as 6 times the standard deviation of the baseline activity. If movement did not exceed the threshold, no startle response was scored for that trial. Startle magnitude was calculated by correcting the response amplitude by the body weight of the rat. Data were analyzed as mean startle magnitude of 60 consecutive trials.
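
The scoring rule described above can be sketched in code. The following Python fragment is an illustration only and is not the software used in the study. Several details are assumptions, because the text specifies only the 250 ms baseline window, the threshold of 6 standard deviations, and the correction by body weight: the function name `score_startle_trial` is hypothetical, the population standard deviation is used, response amplitude is taken as the peak absolute deviation from the baseline mean, and the body weight correction is implemented as a simple division.

```python
import statistics

THRESHOLD_SD_MULTIPLIER = 6  # threshold of 6 x SD of baseline activity (from the text)


def score_startle_trial(baseline, response, body_weight_g):
    """Return the weight-corrected startle magnitude, or None if no startle is scored.

    baseline: accelerometer samples from the 250 ms pre-stimulus window
    response: accelerometer samples recorded after stimulus onset
              (the length of this window is not given in the text)
    """
    # Assumption: population SD; the text does not say which estimator was used.
    threshold = THRESHOLD_SD_MULTIPLIER * statistics.pstdev(baseline)
    baseline_mean = statistics.fmean(baseline)
    # Assumption: amplitude is the peak absolute deviation from the baseline mean.
    amplitude = max(abs(sample - baseline_mean) for sample in response)
    if amplitude <= threshold:
        return None  # movement did not exceed threshold: no startle scored
    # Assumption: "correcting by body weight" is a division by weight.
    return amplitude / body_weight_g
```

How trials without a scored response enter the session mean is not stated in the text, so the sketch stops at the single-trial level.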

### **LEVER-PRESS ESCAPE/AVOIDANCE (E/A) TRAINING**

Training was conducted in 16 identical operant chambers (Coulbourn Instruments, Langhorne, PA), enclosed in sound-attenuated boxes (previously described in Servatius et al., 1998). Foot-shocks (1.0 mA) were delivered through a grid floor (Coulbourn Instruments, Langhorne, PA). The auditory WS was a 1000 Hz, 75 dB tone, against a 10 dB background noise. A 3 min inter-trial interval (ITI) was identified with a 5 Hz flashing light located above the lever. Graphic State Notation software (v. 3.02, Coulbourn Instruments, Langhorne, PA) controlled the stimuli and recorded response times.

Each session began with a 60 s stimulus free period, which was followed by the presentation of the WS (60 s maximum). A lever-press during the WS was considered an *avoidance response* and prevented shock exposure for both training conditions. In the contingent WS protocol, the avoidance response immediately terminated the WS and initiated the ITI. In the non-contingent WS protocol, the WS remained on for the full 60 s regardless of an avoidance response. A maximum of 99 foot-shocks (0.5 s in duration) could be delivered in the absence of an avoidance response with an inter-shock-interval of 3 s. A lever-press during the shock delivery was considered an *escape response* and immediately terminated the shock train and initiated the ITI for both training conditions. Each training session consisted of 20 trials.
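
The difference between the two training conditions can be made concrete with a short Python sketch. It is a simplified illustration under stated assumptions, not the Graphic State program used in the study: the names `Trial` and `classify_trial` are hypothetical, the shock train is assumed to begin once the 60 s WS period has elapsed, and any lever-press at or after that point is treated as an escape response. The fate of the WS on escape trials is not specified in this section, so it is left undefined.

```python
from dataclasses import dataclass
from typing import Optional

WS_MAX_S = 60.0  # maximum warning-signal (WS) duration, from the text


@dataclass
class Trial:
    response: str                # "avoidance", "escape", or "none"
    ws_on_s: Optional[float]     # WS duration on avoidance trials, else None
    ws_delay_s: Optional[float]  # time from the avoidance response to WS offset


def classify_trial(press_latency_s: Optional[float], contingent_ws: bool) -> Trial:
    """Classify one trial from the latency of the first lever-press after WS onset."""
    if press_latency_s is None:
        return Trial("none", None, None)
    if press_latency_s < WS_MAX_S:
        # Contingent WS: the press ends the WS immediately.
        # Non-contingent WS: the WS stays on for the full 60 s.
        ws_on = press_latency_s if contingent_ws else WS_MAX_S
        return Trial("avoidance", ws_on, ws_on - press_latency_s)
    # Assumption: presses after the WS period fall within the shock train.
    return Trial("escape", None, None)
```

For example, a press 12 s after WS onset is an avoidance response in both conditions, but the WS ends at 12 s in the contingent protocol and at 60 s (a 48 s delay) in the non-contingent protocol.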

### **ESTROUS CYCLE**

The estrous cycle of all female SD and WKY rats was determined through vaginal smears in order to explore the possible effects of avoidance learning on estrous cycling. Vaginal smears were taken 24 h after each training session, on days that E/A training did not occur. To be able to account for a pre-training difference in the cycling patterns of SD and WKY rats, 3 additional data points (each separated by 24 h) were obtained prior to E/A training. The smear procedure consisted of sampling the cells of the vaginal canal with sterile saline using a glass pipette. The recovered solution was placed on microscope slides, stained with cresyl violet, and dried. Dried slides were histologically examined under a medium power microscope (Leica, 20×/0.70 of magnification). Each slide was classified as being in the proestrus, estrus, or diestrus phase of the estrous cycle as described by Sharp and La Regina (1998).

Taking the length of each estrous phase into account, each session block of the training was assigned numerical values ("1" = if a rat was in a different estrous phase in each data point of a given session block, "2" = if a rat was in the same estrous phase for two consecutive data points of a given session block, "3" = if a rat was in the same estrous phase for all 3 data points of a given session block). Mean and SEM values of both strains were determined for each session block. Higher values indicated stagnancy and therefore greater irregularity in estrous cycle. Pre-training values of the estrous cycle were included as a covariate factor in the analysis.
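
One way to read the scoring rule above is as the longest run of identical phases across the 3 consecutive data points of a session block. The Python sketch below implements that reading. The function name `cycle_irregularity_score` is hypothetical, and the handling of a pattern such as estrus, diestrus, estrus (scored here as 1 because no two consecutive data points match) is an assumption, since the text does not address that case.

```python
def cycle_irregularity_score(phases):
    """Score one session block from its 3 consecutive smear classifications.

    Returns 1, 2, or 3: the longest run of consecutive identical phases.
    Higher values indicate stagnancy, i.e., greater irregularity.
    """
    if len(phases) != 3:
        raise ValueError("expected exactly 3 data points per session block")
    longest = current = 1
    for previous, phase in zip(phases, phases[1:]):
        # Extend the current run if the phase repeats, otherwise restart it.
        current = current + 1 if phase == previous else 1
        longest = max(longest, current)
    return longest
```

Under this reading, a rat that stays in diestrus across all three smears of a block scores 3, while a rat that moves through proestrus, estrus, and diestrus scores 1.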

### **TESTING SCHEDULE**

Prior to E/A training, SD and WKY rats were evaluated in the open-field test and then for sensory reactivity in the acoustic startle test. Rats from each strain were stratified based on the magnitude of their startle response and then randomly assigned within each stratum to either "Contingent WS" or "Non-contingent WS" training protocols for E/A training. A total of 15 E/A training sessions occurred 3 times per week (every 2–3 days). Rats that failed to emit a single lever-press response by the end of the 4th session were omitted from the study (local IACUC standard). Two SD and two WKY rats were omitted from the study for this reason.

### **DATA ANALYSIS**

The number and the latency of all lever-press responses as well as the number of shocks delivered were collected by Graphic State (Coulbourn Instruments, Langhorne, PA). Data were subsequently processed and collated using S-Plus (Insightful Corp.). All data were expressed as means ± standard error of the mean (SEM). Statistical results were reported only where significant differences were found. Data from the open field and acoustic startle reactivity test were analyzed using a *t*-test for independent groups. For avoidance acquisition, mean values were obtained for each of five session blocks consisting of 3 consecutive training sessions per block. Acquisition of avoidance responses was analyzed as percent avoidance (percentage of trials per session block for which an avoidance response was emitted). Avoidance latencies were naturally skewed to the onset of the WS, especially with the WS termination contingency. Therefore, all latencies were log transformed prior to analysis.
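
The two derived measures described above, percent avoidance per session block and log-transformed latency, can be sketched as follows. This is a minimal Python illustration rather than the S-Plus code used in the study. The function names are hypothetical, and the base of the logarithm is an assumption (base 10 is used here), since the text states only that latencies were log transformed.

```python
import math

SESSIONS_PER_BLOCK = 3   # 3 consecutive sessions per block (from the text)
TRIALS_PER_SESSION = 20  # 20 trials per session (from the text)


def percent_avoidance_by_block(avoidances_per_session):
    """Collapse per-session avoidance counts into percent avoidance per block.

    avoidances_per_session: one count (0-20) per training session, in order.
    """
    blocks = []
    for start in range(0, len(avoidances_per_session), SESSIONS_PER_BLOCK):
        chunk = avoidances_per_session[start:start + SESSIONS_PER_BLOCK]
        trials = TRIALS_PER_SESSION * len(chunk)
        blocks.append(100.0 * sum(chunk) / trials)
    return blocks


def log_latencies(latencies_s):
    """Log-transform avoidance latencies (seconds); assumes base 10 and latencies > 0."""
    return [math.log10(latency) for latency in latencies_s]
```

With 15 sessions the first function returns five values, one per session block, matching the five-level Session Block factor used in the analyses.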

### **RESULTS**

### **OPEN-FIELD TEST**

WKY rats exhibited less activity compared to SD rats in the open-field test. WKY rats exhibited longer latencies (12.59 ± 1.36 s) to leave the center of the open field compared to SD rats (7.061 ± 0.57 s), *t*(93) = −3.822, *p* < 0.001. Moreover, WKY rats exhibited reduced activity (26.04 ± 2.27 segments) compared to SD rats (65.39 ± 3.78 segments), *t*(93) = 8.78, *p* < 0.001. Furthermore, females exhibited greater activity (60.23 ± 30.34 segments) compared to males (32.73 ± 3.03 segments), *t*(93) = −5.15, *p* < 0.05.

### **ACOUSTIC STARTLE REACTIVITY**

WKY rats had greater magnitudes of acoustic startle responses (3.90 ± 0.24 AU) compared to SD rats (2.12 ± 0.1 AU), *t*(93) = −7.15, *p* < 0.001.

### **AVOIDANCE ACQUISITION**

Under the contingent WS protocol, all rats acquired avoidance incrementally over training, with an asymptotic performance of at least ∼80%. Overall, female rats showed superior avoidance acquisition and expression compared to male rats (**Figure 2A**). Moreover, WKY rats acquired avoidance responses faster and to a greater asymptotic level than SD rats regardless of the training parameters (**Figure 1**). Consistent with previous data, strain differences were also observed in within-session performance (Servatius et al., 2008; Beck et al., 2010, 2011; Jiao et al., 2011). Strain differences in within-session avoidance performance were most notable in session blocks 4 and 5, especially in the early trials of those session blocks (**Figure 1**). In the non-contingent WS protocol, a decrement in avoidance performance was observed for all groups compared to the contingent WS protocol (**Figure 2B**). However, the slowest group (male SD rats) still attained a 50% avoidance rate by the last session block of the training. These impressions were confirmed by a 2 × 2 × 2 × 20 × 5 (Strain × Sex × Training × Trial × Session Block) mixed-design analysis of variance (ANOVA). The main effect of Sex [*F*(1, 83) = 14.98], as well as the Training × Session Block interaction [*F*(4, 332) = 3.77] and a triple interaction of Strain × Trial × Session Block [*F*(76, 6308) = 1.29], were all significant (*p*s < 0.05). The higher level of avoidance responding in females compared to males in the first session block could be due to faster learning by the females or to greater general activity leading to increased lever pressing. Therefore, an additional analysis examined sex differences in the acquisition and expression of avoidance responses in the first training session. In the first training session, the rate of avoidance responding was comparable between female (24 ± 3%) and male rats (16 ± 3%) [*t*(41) = 1.93, *p* = 0.06]. Because first-session responding did not differ reliably between the sexes, greater general activity is an unlikely explanation for the female advantage; this further analysis thus provides evidence that females learned avoidance responses faster than male rats.
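
As a consistency check, the degrees of freedom reported for the mixed-design ANOVA can be reproduced from the design. The Python sketch below assumes that 91 rats entered the analysis (95 rats minus the 4 omitted under the IACUC criterion) and that standard univariate mixed-ANOVA error terms were used; both points are inferences from the text rather than statements in it, and the function name `effect_and_error_df` is hypothetical.

```python
from math import prod


def effect_and_error_df(n_subjects, between_levels, effect_between, effect_within):
    """Degrees of freedom for one effect in a mixed-design ANOVA.

    between_levels: levels of every between-subject factor in the design
    effect_between: levels of the between-subject factors in the effect
    effect_within:  levels of the within-subject factors in the effect
    """
    n_groups = prod(between_levels)
    subjects_df = n_subjects - n_groups              # subjects within groups
    within_df = prod(k - 1 for k in effect_within)   # equals 1 with no within factor
    effect_df = prod(k - 1 for k in effect_between) * within_df
    return effect_df, subjects_df * within_df


# Strain x Sex x Training are the between-subject factors (2 x 2 x 2).
print(effect_and_error_df(91, [2, 2, 2], [2], []))       # Sex
print(effect_and_error_df(91, [2, 2, 2], [2], [5]))      # Training x Session Block
print(effect_and_error_df(91, [2, 2, 2], [2], [20, 5]))  # Strain x Trial x Session Block
```

Under these assumptions the three calls give (1, 83), (4, 332), and (76, 6308), which agree with the reported *F* ratios.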

Early in training, contingent and non-contingent WS protocols revealed similar avoidance latencies. However, only the contingent WS group exhibited reduced latencies to avoid across E/A training, while the non-contingent WS group continued to exhibit relatively unchanging avoidance latencies throughout the training. As a result, in the last session block of the training, the non-contingent WS group had greater avoidance latencies (18.65 ± 0.38 s) compared to the contingent WS group (12.27 ± 0.29 s). Furthermore, males had greater latencies (16.70 ± 0.41 s) compared to females (14.52 ± 0.30 s). A 2 × 2 × 2 × 5 (Strain × Sex × Training × Session Block) mixed-ANOVA confirmed these impressions. The test revealed a main effect of Training, *F*(1, 83) = 32.95, as well as a main effect of Sex, *F*(1, 83) = 4.25 (*p*s < 0.05).

### **ESTROUS CYCLE**

For both strains, avoidance training significantly disrupted regular cycling patterns that were observed prior to training. The highest rate of irregularity was evident during the 2nd and the 3rd session blocks of avoidance training both for SD and WKY rats (**Table 1**). However, WKY rats appeared to exhibit a faster recovery of their estrous cycle compared to SD rats. A 2 × 4 (Strain × Session Block) mixed-ANOVA confirmed these impressions as an interaction of Session Block × Strain was evident [*F*(1, 176) = 3.77; *p* < 0.05].

### **DISCUSSION**

Discrete-trial lever-press avoidance has had a long, but intermittent history. Lever-press avoidance is acquired slowly compared to other types of responses; a virtue from the standpoint of specificity, a bane in terms of throughput. However, relatively rapid avoidance acquisition was demonstrated by the work of Berger and Brush (1975) with the introduction of a set of procedures recapitulated decades later. Among the procedures was using a protocol in which the WS co-terminated with the shock. Another modification was having multiple sensory modalities contingent upon the requisite response (lack of the presence of shock with the passage of time, termination of the WS, and initiation of a flashing light as explicit safety). These multiple lines of reinforcement may have constituted a successful solution to the behavioral problem of observing acquisition through auto-shaping of the lever-press response. However, multiple reinforcement contingencies obscure the source of reinforcement and hinder efforts to understand neurobiological underpinnings. Which motivation drives rats to lever press? Is it escape from the WS? Prevention of impending shock? Or experiencing the safety signal?

Although experiments in shuttle box and jump-up avoidance have addressed many of these concerns, an explicit test was warranted. Herein, rats were trained without reinforcement in the form of WS termination. Similar to results obtained with shuttle box (Sidman, 1955; Kamin, 1956; Sidman and Boren, 1957; Keehn, 1959; Lockard, 1963; Bower et al., 1965; Bolles et al., 1966; Bolles and Grossen, 1969), avoidance responses were acquired when lever presses were not associated with WS termination. Clearly, these data rule out one possible confounding explanation of acquisition of the lever-press avoidance, that is, escape from the WS. However, that is not to say that WS termination did not have reinforcement value. This was most evident in male SD rats, which exhibited the largest performance differential between the two contingency protocols. The reinforcement value was especially evident in terms of avoidance latencies, which were considerably longer in the non-contingent WS protocol. These data are generally in line with previous work; WS durations in other protocols varied from 5 to 10 s. Even in a trace procedure the WS was 2 s and the trace interval was 8 s (Bolles and Grossen, 1969). With avoidance latencies in the non-contingent WS protocol generally between 5 and 25 s in the final session block of training, the delay between avoidance responding and WS termination was considerably longer. Thus, the contrast between the contingent and non-contingent conditions was far greater herein.

### **Table 1 | The effect of avoidance acquisition on the estrous cycle over the 5 session blocks of the training.**

*Irregular cycling was evident for both SD and WKY rats starting from block 1. Moreover, both strains showed recovery of the estrous cycle to a similar degree by the end of block 5. Yet, WKY rats exhibited a faster recovery of their estrous cycle compared to SD rats.*

In addition to the effects on avoidance acquisition, contingent and non-contingent WS could also affect other aspects of avoidance, such as extinction. The persistence of avoidance responses is clearly an important issue and a contributing factor in the difficulty in treating anxiety disorders. Moreover, WKY rats display persistent avoidance responding as compared to SD rats, and this persistence is dependent on shock intensity (Jiao et al., 2011). With regard to contingent and non-contingent WS protocols, extinction of avoidance following the contingent protocol might be expected to be slower than following the non-contingent protocol because avoidance was learned to a greater degree in the contingent protocol. Alternatively, delayed feedback in the non-contingent protocol could serve to slow the extinction process. It will be important to compare extinction in contingent and non-contingent protocols for SD and WKY rats in future studies.

With an emphasis on diathesis as expressed in avoidance acquisition, two vulnerabilities were studied: female sex and strain differences in temperament. Analysis showed that these factors were independent sources of vulnerability, so they will be separately discussed.

Consistent with the literature, females acquired avoidance faster and to a higher degree than male rats. This is most apparent in SD rats. Moreover, females are generally quicker to respond than male rats, regardless of the WS contingency. Beyond sex differences, we compared strain differences in learning within female rats. To our knowledge, this is the first report detailing the effect of avoidance acquisition on the estrous cycle. Exposure to the avoidance context is stressful and hence can disrupt regular cycling. Irregular cycling was evident in female rats early in training; however, over the course of training (which covers numerous cycles), cycling returned to normal in almost all rats. The rate of recovery did not reflect the different training parameters. That is, in female rats the degree of difference between contingent and non-contingent WS termination was minimal. A similar pattern was observed in WKY females, although WKY females normalized faster. Although it is tempting to relate the faster normalization to overall better avoidance performance in female WKY rats, female WKY rats generally received more shock than female SD rats. This was apparent early in training (the first two session blocks), with the two strains experiencing shock at essentially the same rate during the latter session blocks. Therefore, our data suggest that stress is reduced (as indexed by the recovery of regular cycling) as the avoidance response is acquired and reliably performed, and give no indication of continued stress in the pathological avoidance represented by WKY rats. The relationship between avoidance acquisition and the estrous cycle would be clearer with appropriate yoked controls for number, density, control, and prediction. Nonetheless, the process of physiological adaptation to stress in females confronted with avoidance learning is an intriguing finding.

Consistent with previous demonstrations (Servatius et al., 2008; Beck et al., 2010, 2011; Jiao et al., 2011), WKY rats acquired and expressed avoidance to a higher degree than SD rats; the strains exhibited substantially different patterns of avoidance responding that were evident in both between- and within-session performance. One recurring pattern in within-session performance is that, in contrast to SD rats, WKY rats exhibit avoidance on the first trial of each session. This first trial avoidance is expressed soon after avoidance responses become numerous and insulates WKY rats from experiencing changes in response contingency. As a result, avoidance responding perseverates throughout the process of extinction, even to the extent that avoidance performance is 5%, that is, only the first trial of a session (Servatius et al., 2008; Beck et al., 2011; Jiao et al., 2011). Initial avoidance, once established, is apparent in feral rats, those with septal lesions, or when shocks are delivered prior to the first WS, leading to the conclusion that first trial avoidance is secondary to differences in arousal. The first trial avoidance of WKY rats does not seem to fit this characterization. Inspection of the avoidance latency data shows avoidance latencies are similar between the first trial and subsequent trials. This similarity is also evident in WKY rats trained under the non-contingent WS protocol. These longer latencies in the first trial of those trained with a termination contingency and those without suggest that first trial avoidance is considered, not evoked by the contextual elements, which are the same in both groups. The robustness of first trial avoidance in WKY rats provides a potential therapeutically relevant target for pharmacological manipulations.

Discrete-trial lever press avoidance can inform us about the development of both adaptive and maladaptive coping strategies. Stripped of reinforcement from WS termination, it is clear that avoidance reinforcement is from future events, whether through absence of shock or the inherent conception of safety. With expectations come decisions on whether to respond or not, knowing failure could result in the experience of foot shock. Outbred rats temper the avoidance with escape; escape providing actual evidence of shock presence. Willingness to experience the shock allows for increased sensitivity to changes in shock presence, thereby allowing for and facilitating extinction. Female SD rats, although expressing avoidance to a higher degree than male rats, do not express avoidance to the degree that they are insulated from the presence or absence of shock. Thus, sex differences are confined to rates of acquisition and expression.

The expectation of such aversive future events does not depend on the amygdala (unpublished observations). The essential neural circuitry for normal acquisition of avoidance expectancies awaits further research. Inbred WKY rats are more driven by the expectation of shock, expressing avoidance from the first trial of a session. This motivation may contribute to the facilitation of associative learning evident in classical eyeblink conditioning and the lack of latent inhibition by WKY rats (Ricart et al., 2011a), or it may be additional to such enhanced associative processes. Facilitation of avoidance expressed by WKY rats appears to depend on the amygdala (unpublished observations). This distinction echoes amygdala differences observed in those with anxiety disorders compared to otherwise healthy individuals (Liberzon et al., 1999; Shin et al., 2004).

In summary, sex and strain differences are apparent in avoidance acquisition and expression regardless of a WS termination contingency, indicating enhanced sensitivity for the expectation of future aversive events. In contrast to simple non-associative and associative processes, the neurobiology of avoidance and expectancy is not fully defined. This finding has the promise of providing critical insights into the neurobiology of extreme avoidance expression, not only in anxiety disorders but also in other stress-related pathology in which maladaptive coping is a common feature.

### **ACKNOWLEDGMENTS**

Supported by the Stress and Motivated Behavior Institute; Biomedical Laboratory Research and Development Service of the VA Office of Research and Development Award Numbers I01BX000218 to Kevin D. Beck and I01BX007080 to Kevin C. H. Pang; and New Jersey Commission on Brain Injury Research grant CBIR11PJT003.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 June 2014; accepted: 05 September 2014; published online: 06 October 2014.*

*Citation: Avcu P, Jiao X, Myers CE, Beck KD, Pang KCH and Servatius RJ (2014) Avoidance as expectancy in rats: sex and strain differences in acquisition. Front. Behav. Neurosci. 8:334. doi: 10.3389/fnbeh.2014.00334*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Avcu, Jiao, Myers, Beck, Pang and Servatius. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Avoidance expression in rats as a function of signal-shock interval: strain and sex differences

Richard J. Servatius<sup>1,2,3\*</sup>, Pelin Avcu<sup>1,3</sup>, Nora Ko<sup>1,3</sup>, Xilu Jiao<sup>4</sup>, Kevin D. Beck<sup>2,3,4</sup>, Thomas R. Minor<sup>5</sup> and Kevin C. H. Pang<sup>2,3,4</sup>

*<sup>1</sup> Syracuse Veterans Affairs Medical Center, Stress and Motivated Behavior Institute, Syracuse, NY, USA, <sup>2</sup> Department of Neuroscience, Stress and Motivated Behavior Institute, Rutgers Biomedical Health Sciences, Newark, NJ, USA, <sup>3</sup> Rutgers Biomedical Health Sciences, Graduate School of Biomedical Sciences, Newark, NJ, USA, <sup>4</sup> New Jersey Health Care System, East Orange, NJ, USA, <sup>5</sup> Psychology, University of California at Los Angeles, Los Angeles, CA, USA*

Inbred Wistar Kyoto (WKY) rats express inhibited temperament, increased sensitivity to stress, and exaggerated expressions of avoidance. A long-standing observation for lever press escape/avoidance learning in rats is that the duration of the warning signal (WS) determines whether avoidance is expressed over escape. Outbred female Sprague-Dawley (SD) rats trained with a 10-s WS efficiently escaped, but failed to exhibit avoidance; avoidance was exhibited to a high degree with WSs longer than 20 s. We examined this long-standing WS duration function and extended it to male SD and male and female WKY rats. A cross-over design with two WS durations (10 or 60 s) was employed. Rats were trained (20 trials/session) in four phases: acquisition (10 sessions), extinction (10 sessions), re-acquisition (8 sessions) and re-extinction (8 sessions). Consistent with the literature, female and male SD rats failed to express avoidance to an appreciable degree with a 10-s WS. When these rats were switched to a 60-s WS, performance levels in the initial session of training resembled the peak performance of rats trained with a 60-s WS. Therefore, the avoidance relationship was acquired, but not expressed, with a 10-s WS. Further, poor avoidance at 10 s does not adversely affect expression at 60 s. Failure to express avoidance with a 10-s WS likely reflects contrasting reinforcement value of avoidance, not a reduction in the amount of time available to respond or competing responses. In contrast, WKY rats exhibited robust avoidance with a 10-s WS, which was most apparent in female WKY rats. Exaggerated expression of avoidance by WKY rats, especially female rats, further confirms this inbred strain as a model of anxiety vulnerability.

Keywords: avoidance learning, motivation, anxiety disorders, diathesis-stress model, WKY, extinction learning, shock, temperament

### Edited by:

*Regina Marie Sullivan, Nathan Kline Institute and New York University School of Medicine, USA*

### Reviewed by:

*Seth Davin Norrholm, Emory University School of Medicine, USA; Jee Hyun Kim, University of Melbourne, Australia*

### \*Correspondence:

*Richard J. Servatius, Syracuse Veterans Affairs Medical Center, 600 Irving Ave, Mail Stop 151, Syracuse, NY 13210, USA, richard.servatius@va.gov*

> Received: *31 March 2015* Accepted: *15 June 2015* Published: *06 July 2015*

### Citation:

*Servatius RJ, Avcu P, Ko N, Jiao X, Beck KD, Minor TR and Pang KCH (2015) Avoidance expression in rats as a function of signal-shock interval: strain and sex differences. Front. Behav. Neurosci. 9:168. doi: 10.3389/fnbeh.2015.00168*

### Introduction

Avoidance encompasses efforts, thoughts, and behaviors to forestall or eliminate a predicted aversive event or state. Abnormal or irrational expressions of avoidance are a core feature of anxiety disorders such as separation anxiety disorder, acute stress disorder and posttraumatic stress disorder (PTSD) (American Psychiatric Association, 2013). Neurobiological processes that increase the expression of avoidance or its resistance to extinction represent vulnerabilities to develop anxiety disorders, in keeping with diathesis models of anxiety disorders (Mineka and Zinbarg, 2006). Using animal models of avoidance as a means to understand the etiology of anxiety disorders is consistent with recent organizational efforts in psychopathology for research domain criteria (RDoC) (Sanislow et al., 2010).

One animal model is discrete-trial lever-press (or bar-press) avoidance. A lever press is not among the species-specific defense reactions that confound the interpretation of avoidance learning and expression (Bolles, 1970). In a signaled version, trials begin with a WS. A lever press during the WS prevents foot shock and constitutes an avoidance response. In the absence of an avoidance response, intermittent foot shocks are delivered for a period of time. A lever press after the initiation of foot shock terminates shock and initiates a safety period, which may be signaled. Lever presses during the safety period are not reinforced. Because the lever press is an arbitrary response, rates of nonspecific responding are generally low, which enhances sensitivity (Servatius et al., 2008; Avcu et al., 2014). An escape response is generally acquired early in training, with avoidance emerging and reaching asymptotic performance over several sessions of training (Servatius et al., 2008; Jiao et al., 2011, 2014; Pang et al., 2011; Avcu et al., 2014; Beck et al., 2014). Although avoidance is not expressed as quickly as in other preparations (e.g., shuttle box), the slowness in acquisition is cited as a virtue (Bolles, 1970).

However, one area that is both basic and particularly troublesome is the apparent influence of WS durations on avoidance performance (Cole and Fantino, 1966; Jones and Swanson, 1966; Berger and Brush, 1975). Efficient avoidance is observed with WSs greater than 20 s. As the length of the WS decreases, the predominant behavior is escape; training with a fixed interval 10-s WS produced few avoidance responses, even fewer than a variable interval 10-s WS (Berger and Brush, 1975). The predominance of escape responding with WS durations shorter than 20 s may reflect an inability to acquire avoidance (resulting from response competition or failure to encounter the avoidance contingency to the degree necessary to support acquisition) or reduced expression of avoidance. The acquisition/expression issue is critical for interpreting and understanding the neurosubstrates of avoidance. This anomaly was never pursued or elaborated; instead, subsequent studies exploited the escape/avoidance patterns for understanding physiological concomitants (Brennan et al., 1992).

Beyond the basic science understanding of avoidance, strain differences in avoidance may be used to illustrate vulnerabilities. Among strains of rats exhibiting an abnormally high degree of avoidance learning and expression, the WKY rat is unusual. Typically, rats selectively bred for rapid avoidance acquisition are less emotional and less stress reactive (Brush, 2003; Steimer and Driscoll, 2003). Further, rats selectively bred for emotionality exhibit an inverse relationship between emotionality and avoidance performance (Powell and North-Jones, 1974). However, the WKY strain is quintessentially stress-reactive (Paré, 1994), with an extensive literature linking the WKY rat as a model of depression (Paré, 1989; Carr et al., 2010). Yet, despite their behaviorally inhibited temperament (Servatius et al., 1998; Ferguson and Cada, 2004), inbred WKY rats acquire lever press avoidance faster or to a higher degree than outbred Sprague-Dawley (SD) rats, especially female WKY rats (Servatius et al., 2008; Avcu et al., 2014). Previous work has exclusively examined avoidance acquisition with a 60-s WS. In that the relatively long CS provides ample opportunity for the WKY rat to overcome its inherent inhibited temperament, faster and greater expression of avoidance may be an artifact of the CS duration chosen in initial studies as opposed to a generalized bias in avoidance acquisition and expression.

Therefore, the current study was conducted comparing acquisition of male and female SD and WKY rats with a 60-s or 10-s WS. To determine whether behavioral patterns engendered during initial training interfere with learning once conditions are more conducive, a cross-over design was employed. After repeated sessions of extinction, rats were retrained with the WS duration not experienced during initial training. Thus, there were four phases: initial acquisition, initial extinction, re-acquisition, and re-extinction. For female SD rats, the experiment represents a replication and extension of research in the 1970s (Berger and Brush, 1975). We expected that male and female SD rats would not exhibit avoidance to an appreciable degree with a 10-s WS, and that this poor avoidance performance would transfer to subsequent training with a 60-s WS. For WKY rats, we expected the faster associativity of WKY rats (Ricart et al., 2011a,b) to offset the reduced exploratory time represented by the shorter WS period so that WKY rats would still acquire and express avoidance to a higher degree than SD rats. The cross-over design would reveal whether poor avoidance performance inhibits future avoidance (10–60 s cross) and the extent to which generally high avoidance performance is affected by a shorter WS duration (60–10 s cross).

### Methods and Materials

### Animals

Sprague-Dawley (SD) and Wistar-Kyoto (WKY) male and female rats (approximately 60–80 days of age at the start of the experiment) were obtained from Harlan Laboratories (Indianapolis, IN). Rats were housed in individual cages with free access to food and water in a room maintained on a 12:12 h day/night cycle for at least 2 weeks prior to experimentation. Experiments occurred between 0700 and 1900 h in the light portion of the cycle. All procedures received prior approval from the VA-NJHCS Institutional Animal Care and Use Committee in accordance with AAALAC standards.

### Group Assignment

Naïve rats were tested for their acoustic startle response (ASR) as previously described (Servatius et al., 1998). A 15-min test session consisted of the presentation of 24 white noise bursts (100 ms with a 5-ms rise/fall time) at an intensity of 82, 92, or 102 dB, with 8 trials at each sound level. The inter-stimulus interval varied between 15 and 25 s. Startle magnitudes at 102 dB white noise were used to match rats within strain and sex for random assignment to receive initial training with a 10-s or 60-s WS. Therefore, the overall design was a 2 × 2 × 2 (Strain × Sex × WS duration) with 8 rats in each group. There were four phases of training: 10 acquisition sessions, 10 extinction sessions, 8 re-acquisition sessions, and 8 re-extinction sessions. Sessions occurred 3 times per week (every 2–3 days). A rat that failed to emit a lever-press response by the end of the fourth training session of initial acquisition was removed from the study. One male SD and one WKY rat in the 60-s group were dropped from the study for this reason (N = 7 for these two groups).
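The matching step can be illustrated with a minimal sketch. The text states only that startle magnitudes at 102 dB were used to match rats before random assignment; the rank-and-pair scheme below, the function name `matched_assignment`, and the rat identifiers are assumptions made for illustration, not the authors' documented procedure.

```python
import random
from typing import Dict


def matched_assignment(startle: Dict[str, float], seed: int = 0) -> Dict[str, int]:
    """Assign rats from ONE strain x sex cell to a 10-s or 60-s WS.

    Assumed scheme (not stated in the source): rank rats by startle
    magnitude, pair neighbours in the ranking, and randomly split each
    pair between the two WS durations.
    """
    rng = random.Random(seed)
    ranked = sorted(startle, key=startle.get)  # lowest to highest startle
    assignment: Dict[str, int] = {}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random order within the matched pair
        assignment[pair[0]] = 10
        assignment[pair[1]] = 60
    if len(ranked) % 2:  # an unpaired rat is assigned at random
        assignment[ranked[-1]] = rng.choice((10, 60))
    return assignment


# Hypothetical startle magnitudes for the 16 rats of one strain x sex cell.
example = {f"rat{i:02d}": 100.0 + 7.0 * i for i in range(16)}
groups = matched_assignment(example, seed=1)
```

Under this scheme each strain-by-sex cell of 16 rats yields 8 rats per WS duration, with the two groups balanced on startle magnitude.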

### Lever-press Escape/Avoidance Training

The apparatus was described previously (Servatius et al., 2008). Training was conducted in 16 identical operant chambers (Coulbourn Instruments, Langhorne, PA). Each operant chamber was enclosed in a sound-attenuated box. Scrambled 2.0-mA foot-shock was delivered through the grid floor. The auditory WS was a 1000-Hz, 75-dB tone (10 dB above background noise). A 3-min intertrial interval (ITI) was explicitly signaled with a 5-Hz blinking cue light located above the lever. Graphic State Notation software (v. 3.02, Coulbourn Instruments, Langhorne, PA) controlled the stimuli and recorded response times.

Each session began with a 60-s stimulus-free period. A trial commenced with the presentation of the auditory WS. After either 10 s or 60 s, shocks (0.5-s, 1.0 mA) were delivered with a 3-s intershock interval until a lever press or 99 shocks were delivered. If a lever press occurred prior to the initiation of shock, the shocks were prevented, the WS terminated, and the safety period commenced; this event constituted an avoidance response. If a lever press occurred after the initiation of shock, the shock train was immediately terminated, the WS ended and the safety period commenced; this event constituted an escape. During extinction training, both shock and the blinking cue light were deactivated. Each session consisted of 20 trials.
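The trial contingencies described above can be sketched as a small classifier. This is an illustration rather than the Graphic State program used in the study; the function name `classify_trial` is hypothetical, and the spacing of shock onsets (shock duration plus intershock interval, 3.5 s) is an assumption, since the text does not say whether the 3-s interval is measured from shock onset or offset.

```python
import math
from typing import Optional, Tuple

SHOCK_DURATION_S = 0.5       # from the Methods
INTERSHOCK_INTERVAL_S = 3.0  # from the Methods
MAX_SHOCKS = 99              # from the Methods


def classify_trial(press_latency_s: Optional[float],
                   ws_duration_s: float) -> Tuple[str, int]:
    """Return (outcome, shocks_received) for one acquisition trial.

    press_latency_s is the time of the first lever press measured from
    WS onset, or None if no press occurred before the shock train ended.
    """
    # Assumption: onset-to-onset spacing = shock duration + intershock interval.
    period = SHOCK_DURATION_S + INTERSHOCK_INTERVAL_S
    if press_latency_s is None:
        return "no_response", MAX_SHOCKS
    if press_latency_s < ws_duration_s:
        # Press before the first shock: shocks are prevented.
        return "avoidance", 0
    # Press after shock onset: count the shocks that had started by then.
    shocks = math.floor((press_latency_s - ws_duration_s) / period) + 1
    return "escape", min(shocks, MAX_SHOCKS)


print(classify_trial(11.5, 10.0))  # press 1.5 s after the first shock began
print(classify_trial(11.5, 60.0))  # same latency, but within a 60-s WS
```

The same response latency is scored as an escape under a 10-s WS and as an avoidance under a 60-s WS, which is why response latencies are examined separately from avoidance rates later in the paper.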

### Data Analysis

All data are expressed as means ± the standard error of the mean. Statistical results are reported only where significant differences were found. For avoidance training, the number of avoidance responses and the number of shocks received for each training session were compiled; shocks received pertain only to the acquisition and reacquisition phases. Phases of the experiment were analyzed separately. F-tests for simple effects and Dunnett's and Dunn's tests were used for understanding contrasts.
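As a minimal sketch of the descriptive statistics named above, the following computes a session's percent avoidance and a group mean ± standard error of the mean. The function names and the example counts are hypothetical; the standard error is taken as the sample standard deviation divided by the square root of the group size, which is the usual definition and is assumed here.

```python
import math
import statistics
from typing import Sequence, Tuple

TRIALS_PER_SESSION = 20  # from the Methods


def percent_avoidance(n_avoidances: int,
                      n_trials: int = TRIALS_PER_SESSION) -> float:
    """Avoidance responses in a session as a percentage of trials."""
    return 100.0 * n_avoidances / n_trials


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Group mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations for an SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical avoidance counts for a group of 8 rats in one session.
counts = [12, 15, 16, 16, 17, 18, 19, 20]
mean, sem = mean_sem([percent_avoidance(c) for c in counts])
print(f"{mean:.1f} ± {sem:.1f} % avoidance")
```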

### Results

### Acquisition

Acquisition with a 60-s WS progressed over the 10 training sessions with all rats attaining asymptotic levels of greater than 80% by the end of training (see **Figure 1**). In contrast, acquisition was poor with a 10-s WS in male and female SD rats; each exhibited less than 20% avoidance by the end of 10 sessions of training. WKY rats, however, generally acquired avoidance with the 10-s WS. Male WKY rats achieved a modest degree of avoidance (∼60%) by the end of the 10th session, whereas acquisition by female WKY rats with a 10-s WS was similar to that expressed with a 60-s WS. These impressions were confirmed with a 2 × 2 × 2 × 10 (Strain × Sex × WS × Sessions) mixed analysis of variance (ANOVA). Two triple interactions superseded the subordinate interactions and main effects: Strain × WS × Sessions, F(9, 486) = 7.4 and Strain × Sex × Sessions, F(9, 486) = 2.02, all ps < 0.05.

Although the WS duration affected the rates of avoidance acquisition, numbers of shocks received differed only as a function of Strain and Sex. These impressions were confirmed with a 2 × 2 × 2 × 10 (Strain × Sex × WS × Sessions) mixed-ANOVA. The triple interaction of Strain × Sex × Sessions, F(9, 486) = 4.8, p < 0.05, superseded the subordinate interactions and main effects. Although all rats dramatically reduced the number of shocks received over training, male WKY rats only received roughly 60% of the shocks received by male and female SD and female WKY rats during the initial training sessions (See **Figure 2**).
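The reported degrees of freedom can be checked against the design with a short calculation. The sketch assumes the standard between-within (mixed) ANOVA formulas with no sphericity correction, and a total of 62 rats (8 groups of 8, minus the two rats dropped during initial acquisition); the function names are hypothetical.

```python
from math import prod
from typing import Sequence, Tuple


def session_interaction_df(n_subjects: int,
                           between_levels: Sequence[int],
                           n_sessions: int,
                           effect_levels: Sequence[int]) -> Tuple[int, int]:
    """df for an effect that includes the repeated Sessions factor.

    between_levels lists the levels of every between-subject factor in
    the design; effect_levels lists the levels of the between-subject
    factors that enter the effect being tested.
    """
    n_groups = prod(between_levels)
    numerator = prod(k - 1 for k in effect_levels) * (n_sessions - 1)
    denominator = (n_subjects - n_groups) * (n_sessions - 1)
    return numerator, denominator


def between_effect_df(n_subjects: int,
                      between_levels: Sequence[int],
                      effect_levels: Sequence[int]) -> Tuple[int, int]:
    """df for a purely between-subject effect."""
    return prod(k - 1 for k in effect_levels), n_subjects - prod(between_levels)


N = 64 - 2          # 8 groups x 8 rats, two rats dropped
DESIGN = (2, 2, 2)  # Strain x Sex x WS duration

print(session_interaction_df(N, DESIGN, 10, (2, 2)))  # 10-session phases
print(session_interaction_df(N, DESIGN, 8, (2, 2)))   # 8-session phases
print(between_effect_df(N, DESIGN, (2,)))             # a between-subject main effect
```

Under these assumptions the three calls give (9, 486), (7, 378) and (1, 54), which match the F(9, 486), F(7, 378) and F(1, 54) values reported for the 10-session phases, the 8-session phases and the between-subject main effects.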

### Extinction

For the 60-s WS, two dominant patterns were evident: WKY rats extinguished slower than SD rats, and female rats extinguished slower than male rats (see **Figure 3**). Thus, the slowest group to reduce avoidance responses was female WKY rats. For the 10-s WS, the relatively poor performance of all but the WKY female group precluded direct comparisons. All rats trained with a 10-s WS exhibited less than 20% avoidance responding by the end of the 10th session of extinction. These impressions were confirmed with a mixed ANOVA from which the four-way interaction was significant, F(9, 486) = 2.64, p < 0.05. Of note were the extinction patterns of the female WKY rats trained with either a 10-s or 60-s WS. Although avoidance performance in the last session of acquisition was similar, extinction with a 10-s WS was considerably faster than with a 60-s WS in female WKY rats.

### Reacquisition

Reintroduction of the US and crossover to a new WS duration induced different patterns of avoidance responding in all groups except WKY females (see **Figure 4**). Switching from the 10-s WS to the 60-s WS had an immediate impact on avoidance responding; avoidance responding was greater than 60% for all groups during the initial training session under the 60-s WS interval. There was little incremental change in performance over the remaining sessions of training. Similar to initial acquisition with a 10-s WS, female WKY rats performed better than all other groups at 60 s, reaching nearly perfect asymptotic performance. Improved performance was evident in male WKY rats retrained with a 60-s WS, with asymptotic performance comparable to male WKY rats initially trained with a 60-s WS. The most dramatic differences were apparent in male and female SD rats, with each exhibiting avoidance performance levels in the initial session of reacquisition comparable to the asymptotic performance of male and female SD rats initially trained with a 60-s WS, respectively. Switching from a 60-s to a 10-s WS affected the avoidance performance of all groups, with only subtle changes in the performance of female WKY rats. WKY males and SD females remained at modest levels of performance without improvement over training sessions. In contrast, SD males reduced their avoidance rates over sessions. Avoidance performance of SD males retrained with a 10-s WS dipped to the levels exhibited by rats initially trained with a 10-s WS. Overall, two general patterns were evident: WKY rats performed better than SD rats and females performed better than males. The mixed ANOVA yielded triple interactions of Sex × WS × Sessions, F(7, 378) = 2.23, and Strain × WS × Sessions, F(7, 378) = 2.06, all ps < 0.05.

As for the number of shocks received, those trained with the 10-s WS received more shocks than those with the 60-s WS. Moreover, SD rats received more shocks than WKY rats. These impressions were confirmed with a mixed ANOVA, which only yielded main effects of WS, F(1, 54) = 4.41, and Strain, F(1, 54) = 5.13, all ps < 0.05. Interestingly, SD rats that exhibited efficient escape responding during initial training with a 60-s WS (rarely experiencing more than a single shock on a given trial) were less efficient during reacquisition with a 10-s WS. SD rats retrained with a 10-s WS had numerous trials beyond the first trial of a session with more than one shock received, whereas WKY rats retrained under similar conditions rarely experienced more than one shock on a given trial during reacquisition (see **Figure 5**).

### Re-extinction

The dominant patterns exhibited during extinction were expressed during re-extinction, albeit with subtle differences (see **Figure 6**). Extinction after reacquisition with a 60-s WS showed clear strain differences with WKY rats exhibiting relatively slow extinction compared to SD rats. As with the initial extinction phase, the rates of extinction with the 10-s WS reflected asymptotic performance in the presence of the US. Thus, WKY females exhibited the slowest extinction; female SD and male WKY exhibited rapid rates from moderate levels of avoidance performance at the end of training. The mixed ANOVA revealed a triple interaction of Sex × WS × Sessions, F(7, 378) = 4.75, which superseded the subordinate interactions and main effects, and a main effect of Strain, F(1, 54) = 21.1, all ps < 0.05.

### Avoidance Latency

The impact of WS duration on latency to respond was cited as a potential explanation for performance differences between a 10-s and a 60-s WS. To facilitate interpretation of avoidance performance during training with a 10-s or 60-s warning, avoidance latencies were examined during the last session of acquisition (see **Figure 7**). Inasmuch as there were substantial differences between the two strains in avoidance performance in acquisition and reacquisition, we grouped rats by strain to ease presentation. Both SD (60%) and WKY (53%) rats exhibited disproportionately high rates of avoidance latencies less than 10 s during acquisition at 60 s (see **Figure 7**, bottom row). Moreover, the disproportionately high rate increased for SD (67%) and WKY (79%) rats during reacquisition with 60 s after initial training at 10 s. As for avoidance latency distributions for training with a 10-s WS, the bulk of avoidance latencies were between 2 and 5 s (see **Figure 7**, top row).
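The latency summary above amounts to asking what share of avoidance responses fall within the first 10 s of a 60-s WS. A minimal sketch follows; the function names and the example latencies are hypothetical, and the uniform-responding baseline is one possible reading of "disproportionately high", not a comparison stated by the authors.

```python
from typing import Sequence


def fraction_below(latencies_s: Sequence[float], cutoff_s: float = 10.0) -> float:
    """Share of avoidance latencies shorter than cutoff_s."""
    if not latencies_s:
        raise ValueError("no avoidance latencies supplied")
    return sum(1 for t in latencies_s if t < cutoff_s) / len(latencies_s)


def uniform_share(cutoff_s: float, ws_duration_s: float) -> float:
    """Share expected below cutoff_s if responses were spread evenly over the WS."""
    return cutoff_s / ws_duration_s


# Hypothetical avoidance latencies (s) from one session with a 60-s WS.
latencies = [3.2, 4.8, 6.1, 7.5, 9.9, 12.0, 18.4, 25.0, 41.3, 55.0]
observed = fraction_below(latencies)   # 5 of the 10 latencies are under 10 s
baseline = uniform_share(10.0, 60.0)   # one sixth of the WS interval
```

With evenly spread responding, only about 17% of latencies would fall in the first 10 s of a 60-s WS, so observed shares above 50% indicate that most avoidance responses occur early in the signal.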

### Discussion

In previous work it was observed that female SD rats failed to acquire avoidance with a 10-s WS (Berger and Brush, 1975). Consistent with this early work, female SD rats failed to appreciably acquire an avoidance response after 10 sessions of training with a fixed number of trials per session, which is considerably more extensive training than in that previous work. Moreover, male SD rats performed at a comparable level. Female and male SD rats did acquire efficient escape responses; both sexes received about one shock per trial during the last session of initial acquisition with a 10-s WS. The poor performance could be simply attributed to a lack of experience with the avoidance contingencies; however, all SD rats had experience with the avoidance contingency over the initial acquisition phase.

Failure to express avoidance was specific to the WS duration, in that after several sessions of extinction (absence of both shock and safety signal), avoidance acquisition was extremely rapid; avoidance rates during the last session of initial acquisition with a 10-s WS were nominal at 10–20%, but rose in the initial session of reacquisition with the introduction of a 60-s WS to 75% for female SD rats and 66% for male SD rats. The high avoidance response rates during reacquisition could have been attributable to expectancy to escape, that is, habitual responses that provided escape with a 10-s WS, but now constituted avoidances with a 60-s WS. Response latencies during the last acquisition session with a 10-s WS for male (11.5 ± 0.3 s) and female (13.4 ± 3.2 s) SD rats, when both predominantly exhibited escape responses, reflect rapid responses with the onset of shock. In contrast, response latencies in the first session of reacquisition with a 60-s WS were considerably longer for both male (47.6 ± 6.0 s) and female (26.5 ± 2.0 s) SD rats. Failure to acquire or express avoidance with a relatively short WS duration did not interfere with acquisition once conditions were more conducive. These data argue against the speculation that failure with a 10-s WS is the result of proactive interference accruing from experience with unavoidable shock (Berger and Brush, 1975), similar to descriptions of interference in escape acquisition after experience with inescapable shock (Seligman, 1972). The high avoidance rates in the initial session of reacquisition by SD rats trained with a 10-s WS then switched to a 60-s WS argue strongly that SD rats acquired knowledge concerning the avoidance contingency, but did not express that knowledge through behavioral responses. The abrupt increase in avoidance responding from the marginal levels at a 10-s WS to greater than 60% in the first session of re-acquisition with a 60-s WS resembles classic descriptions of latent learning (Tolman and Honzik, 1930).

The cross-over design also evaluated re-acquisition with a 10-s WS in those previously trained with a 60-s WS. Curiously, two patterns were then evident: (1) SD males decreased avoidance response rates over the next several sessions to near nominal levels, whereas (2) response rates of SD females remained steady, without incrementing with further exposure to the US during the reacquisition phase. This decline in male SD rats was evident even though avoidance response latencies of less than 10 s were virtually identical between the last session of acquisition with a 60-s WS and reacquisition with a 10-s WS (∼50%). For female rats, no change in rates was evident between the last sessions of acquisition with a 60-s WS (∼50%) and reacquisition with a 10-s WS (43–53% throughout reacquisition). Again, these data suggest that the avoidance contingency may be acquired, but not expressed, with a 10-s WS.

In the past, poor performance was suggested to be the result of: (a) reduced opportunity to respond, (b) schedule-induced differences in nonspecific responding, and (c) response interference (Berger and Brush, 1975). For example, the longer interval would presumably allow for greater opportunity to adventitiously encounter the lever through trial and error. However, acquisition curves under varying WS durations show an abrupt increase in rates above a 20-s WS, with no advantage conferred by the greater opportunity. The supposition concerning indiscriminate responding (responding irrespective of the warning or safety signals) was convincingly refuted by previous work, a refutation fully supported here. The last, response interference, may take different forms and will be more carefully addressed. One form of response interference postulates reflexive responses or response competition as the source of poor performance. Avoidance latencies may be used to address this point. In the simple case, an incompatible response would be evident early in the WS interval that dissipates as the interval lengthens (e.g., freezing). A similar alternative form would postulate that incompatible responses would be engendered more specifically with the shorter WS interval. In both, an incompatible response, whether associative or non-associative in nature, would interfere with the otherwise arbitrary lever press response and its acquisition. First, there is little empirical support that incompatible behaviors are differentially conditioned by a 10-s or 60-s WS. For example, a comparison of short and long WSs paired to a foot shock found similar degrees of freezing to the WS (Quinn et al., 2002; Barnet and Hunt, 2005). Given the assumption that an incompatible response such as freezing would be engendered by the WS, inhibiting either general exploration or specific responding, one would expect that avoidances in the 60-s group would have a considerably longer latency than 10 s. However, 60% of avoidance responses of SD rats trained with a 60-s WS occur with latencies less than 10 s. Thus, most of the avoidance responses of rats trained at 60 s have latencies within the window that would also be coincidental with incompatible freezing responses. Thus, response interference must be considered a highly unlikely explanation.

FIGURE 5 | Shock received during re-acquisition. Shock rates per trial are depicted for the first session (RA1, left panel) and 8th session of re-acquisition (RA8; right panel) as a function of strain, sex, and WS duration. Initial male WKY rats experienced less shocks. Cross over from 60 to 10 s led

Typically, avoidance has a feed-forward impact on performance; that is, once the rat experiences the avoidance contingency, avoidance responses tend to become more numerous, with asymptotic performance generally exceeding 60%. Rats exceeding a plateau of 50% or better during a particular session go on to refine performance at better than 60% in subsequent sessions. However, this pattern, evident in those SD rats trained with a 60-s WS herein, was not evident in those trained with a 10-s WS. Of six SD rats that attained 50% or better in a particular session, not one exhibited asymptotic performance better than 60%. More dramatic was the decrease in avoidance performance over sessions in male SD rats initially trained with a 60-s WS then retrained with a 10-s WS. Together, these data suggest that for SD rats avoidance during a 10-s WS was not as reinforcing as avoidance with a 60-s signal duration.

One assumes that reinforcement of avoidance is the absence of foot shock, which is the same whether training with a 10 or 60-s WS. However, the perceived aversiveness of the foot shock may differ. Exposure to foot shock induces conditional and unconditional reductions in pain sensitivity (Fanselow and Sigmundi, 1986; Helmstetter and Fanselow, 1987). Conditional hypoalgesia is apparent through associations with discrete cues (Fanselow, 1986; Hagen and Green, 1988) and contextual elements (Matzel and Miller, 1987). Conditional hypoalgesia shows gradations in appearance relative to the duration of the cue (Seo et al., 2008). Consistent with these data, one may postulate that conditional hypoalgesia in the present studies is maximal more proximal to shock delivery. Accordingly, the differences between 10 and 60-s are postulated to reflect differences in motivation to avoid related to the imposition of conditioned analgesia; further studies are necessary to support this contention.

Two vulnerability factors for anxiety disorders were directly compared in their influence on avoidance learning and expression: temperament and sex. Consistent with early work, females express avoidance to a higher degree than males (Van Oyen et al., 1981; Heinsbroek et al., 1983; Servatius et al., 2008; Avcu et al., 2014), although this sex difference was only evident in initial acquisition with a 60-s WS and reacquisition with a 10-s WS. In addition, extinction was slower in female rats, which was particularly evident in initial extinction after training with a 60-s WS.

A stronger factor is temperament, represented by the inbred WKY rat, which shows robust patterns of avoidance responding. With regard to acquisition with a 60-s WS, several patterns noted previously were evident here (although not stressed in the results section): (1) WKY rats lacked warm-up; as acquisition progressed, WKY rats avoided on the first trial of a session, whereas SD rats generally exhibited an escape response (Servatius et al., 2008); and (2) WKY rats exhibited slower rates of extinction (Servatius et al., 2008; Jiao et al., 2014). In stark contrast to SD rats, WKY rats acquired avoidance with a 10-s WS, with female WKY rats expressing avoidance to a higher degree than male WKY rats. Although avoidance expression of male WKY rats initially trained with a 60-s WS was substantially faster than that with a 10-s WS, female WKY rats showed no differences between a 10-s and a 60-s WS. Those similar learning curves allowed for the comparison of extinction; rates of extinction with a 10-s WS were faster than those with a 60-s WS for female WKY rats. Regardless of WS duration, WKY rats exhibited perseveration of avoidance for the first trial of an extinction session.

Unlike SD rats, WKY rats increased their expression of avoidance during the reacquisition phase with a 10-s WS. With the introduction of the 10-s WS period before initiation of shock, WKY rats not only matched the number of responses shorter than 10 s during initial training with the 60-s WS, but both male and female WKY rats increased the number of avoidance responses. For male WKY rats, the rate of avoidances with latencies shorter than 10 s during the last sessions of training with the 60-s WS was 39%, with avoidance rates in reacquisition with the 10-s WS ranging from 48 to 60%. Similarly, female WKY rats' rates were 64% during initial acquisition, increasing to 70–86% during training with the 10-s WS. These data suggest that WKY rats are more flexible in modifying established avoidance responses.

For WKY rats, it is not just the degree of avoidance expression but the manner in which avoidance is expressed. Once avoidance is acquired, WKY rats begin each session with avoidance (lack of warm-up). Early avoidance is a double-edged sword. Fewer foot shocks are experienced by the rat. However, the rat is insensitive to environmental changes. This insensitivity is clearly evident if there is continued immediate contingent feedback either in the form of WS termination or initiation of a safety signal. Under such conditions, avoidance expression continues during extinction (disconnection of the US) without appreciable decline for at least 8 sessions (Servatius et al., 2008). If contingent feedback is discontinued during extinction (as was done herein), avoidance rates gradually decline to nominal levels. Whereas a number of SD rats (male and female) extinguish to the degree that entire sessions lapse without a single lever press, almost all WKY rats continue to respond on the first trial of a session, even when that is the only response for the entire session. These early session responses, highly specific to the presence of the WS, continue even though the environmental conditions over the training session are essentially the same. Such lever presses to the WS in extended absence of foot shock are reminiscent of excessive worry.

Worry is a core feature of generalized anxiety disorder. Inhibited temperament is strongly associated not only with social anxiety (Hudson et al., 2011) and generalized anxiety disorder (Moffitt et al., 2007; Hudson et al., 2011), but also with obsessive-compulsive disorder (Ivarsson and Winge-Westholm, 2004) and posttraumatic stress disorder (Myers et al., 2012a,b); these anxiety disorders are more prevalent in females (Pigott, 2003; Steel et al., 2014). Female inbred WKY rats are a homogeneous group in which to understand neurobiological influences on avoidance and expressions of avoidance in the development of anxiety disorders. Perseveration of early session avoidance responses during extinction could provide a specific target for therapeutics aimed at reducing worry.

### Summary

Extending long-standing work, the avoidance performance of SD rats trained with a 10-s WS was poorer than when training was accomplished with a 60-s WS. The cross-over design showed that the poorer performance reflects expression, not acquisition, of avoidance. Reduced avoidance expression in SD rats likely reflects reduced reinforcement value with a 10-s WS. As a model of inhibited temperament, inbred WKY rats (especially female WKY rats) expressed avoidance to a greater degree than outbred SD rats regardless of WS duration.

### Acknowledgments

Supported by the Stress and Motivated Behavior Institute, Biomedical Laboratory Research and Development Service of the VA Office of Research and Development Award Number I01BX000218 to KB and I01BX007080 to KP. New Jersey Commission on Brain Injury Research grant CBIR11PJT003. Research also supported by Research Fellowship from Rutgers Graduate School of Biomedical Sciences (Avcu and Ko) and Rutgers Foundation Scholars Award (Ko).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Servatius, Avcu, Ko, Jiao, Beck, Minor and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

OPINION ARTICLE published: 27 October 2014 doi: 10.3389/fnbeh.2014.00372

### Contribution of emotional and motivational neurocircuitry to cue-signaled active avoidance learning

### **Anton Ilango<sup>1</sup>\*, Jason Shumake<sup>2</sup>\*, Wolfram Wetzel<sup>1</sup> and Frank W. Ohl<sup>1,3,4</sup>**

<sup>1</sup> Leibniz Institute for Neurobiology, Magdeburg, Germany

<sup>2</sup> Department of Psychology, The University of Texas, Austin, USA


### **Edited by:**

Richard J. Servatius, Department of Veterans Affairs Medical Center, USA

### **Reviewed by:**

Amy Poremba, University of Iowa, USA

Claudio Da Cunha, Universidade Federal do Paraná, Brazil

**Keywords: punishment, active avoidance, safety signal, neural circuits, dopamine, amygdala, striatum**

### **INTRODUCTION**

Many animal and human subjects can learn to avoid punishment or noxious stimuli by exploiting the sensory cues predicting them. In cue-signaled active avoidance (AA) learning, subjects first learn about the predictive properties of cues and subsequently learn a behavioral strategy of an avoidance response (e.g., crossing a hurdle that divides a two-compartment cage in order to avoid a mild but unpleasant footshock). AA learning develops with fear reduction as an intervening variable (Mowrer and Lamoreaux, 1946; Miller, 1948). If the avoidance action is executed in pursuit of safety, an overlap in the recruitment of neurocircuitry essential for reward processing and for avoidance can be expected on the basis of two-process theory. Recent insights from two-way AA (2WAA) studies, which integrate both Pavlovian and instrumental components, provide strong evidence for the recruitment of emotional circuitry centered on the amygdala and motivational circuitry centered on midbrain dopaminergic structures. In this review, we address the following: (1) the role of emotional neurocircuitry in the formation of AA, (2) the involvement of reward circuitry and its input-output pathways in AA, and (3) the possible serial and parallel processing within and between these circuitries.

### **ROLE OF EMOTIONAL CIRCUITRY IN AA**

Pavlovian learning increases synaptic plasticity in amygdala neurons (Tye et al., 2008) that respond to cues predicting either appetitive or aversive outcomes (Paton et al., 2006; Tye and Janak, 2007). While the amygdala has been studied extensively for its involvement in fear processing (LeDoux, 2000), recent studies highlight the amygdala as a key substrate during acquisition of AA. Studies indicate that different subnuclei of the amygdala contribute differently to the acquisition of an avoidance strategy and to the consolidation of avoidance memories. Bilateral electrolytic lesions of the basolateral amygdala (BLA) (Segura-Torres et al., 2010) and pre-training infusion of an NMDA antagonist into the BLA impair the acquisition of 2WAA learning (Savonenko et al., 2003). Central amygdala (CeA) lesions disrupt the acquisition of an AA response but have no effect on the retrieval of a previously acquired AA response (Roozendaal et al., 1993). However, for animals that failed to acquire 2WAA after 3 days of training, CeA lesions actually improved AA learning (Choi et al., 2010). Thus, the BLA appears important for all phases of AA learning whereas the role of the CeA appears to be limited to acquisition and complex, potentially facilitating or impeding AA learning depending on the specific AA response, and/or innate individual differences in AA learning ability.

During the initial stages of AA acquisition, the expression of conditioned freezing to the CS can interfere with AA learning. In such situations, the infralimbic prefrontal cortex (IL) exerts feed-forward inhibition of the amygdala to reduce the expression of freezing responses to the CS while sparing the predictive association between the CS and US, which is necessary for AA learning (Moscarello and LeDoux, 2013). Studies on rabbits trained to induce wheel rotation to avoid shock during CS<sup>+</sup> presentations confirm the involvement of the amygdala (LA, B, and Ce), cingulate cortex, thalamus, and auditory cortex in the acquisition and retention of AA (Smith et al., 2001). Intra-BLA infusion of muscimol significantly affected the acquisition of discriminative avoidance with two tones but did not affect the CR to either CS after overtraining (Poremba and Gabriel, 1997, 1999).

### **ROLE OF MOTIVATIONAL CIRCUITRY IN AA**

Given the heterogeneity of the ventral tegmental area (VTA), it is not surprising that dopamine (DA) neurons play different roles ranging from signaling reward and mediating motivation to coding aversion, salience, uncertainty, and novelty (Horvitz, 2000; Bromberg-Martin et al., 2010; Ilango et al., 2012; Lammel et al., 2014). Additional complexity became evident after the discovery of subsets of DA neurons co-transmitting glutamate in the nucleus accumbens (NAc) shell (Stuber et al., 2010), GABA in the dorsal striatum (Tritsch et al., 2012) and GABA in the lateral habenula (LHb) (Stamatakis et al., 2013) raising several specific questions about the involvement of this system.

Here, we will discuss the tonic and phasic DA release associated with signaled AA. Briefly, DA antagonists impair AA responses, and electrical stimulation of reward circuitry facilitates AA [see reviews by Salamone (1994) and Ilango et al. (2012)]. NAc DA release increases during the first training block of 2WAA and progressively decreases in the second training block as the number of AA responses increases (Dombrowski et al., 2013). Signaled AA learning progresses with the increase of tonic DA release in medial prefrontal cortex (mPFC) and reaches its peak during the formation of the first successful avoidance trials (Stark et al., 1999, 2001). Elevated DA release in mPFC was also found in a training scenario in which two different cues were first associated with the same meaning (signaling Go response) and were subsequently associated with different meanings (Go and NoGo) (Stark et al., 2004).

Recent studies point to the relevance of the phasic DA signal to punishment prediction and avoidance. During the safety period, in both avoidance and escape, DA release in the NAc was increased (Oleson et al., 2012). Furthermore, brief electrical stimulation applied to the LHb contingent on the AA response (i.e., at the initiation of the safety period) impaired acquisition but not retention of 2WAA (Shumake et al., 2010; Ilango et al., 2013; Shumake and Gonzalez-Lima, 2013). Studies utilizing viral gene therapy in DA-deficient mice showed that shock escape and learning 2WAA require DA signaling in both the amygdala and striatum. After overtraining (which makes avoidance more resistant to extinction), DA signaling in the striatum alone was sufficient to maintain 2WAA. In contrast, restoring DA signaling in the PFC and amygdala was insufficient to maintain AA (Darvas et al., 2011). Careful lesion experiments in different regions of the striatum confirmed that NAc core and dorsolateral striatum (DLS) lesions delayed 2WAA acquisition without disrupting the ability to acquire AA. In contrast, dorsomedial striatum (DMS) lesion did not affect the early phase but decreased 2WAA after six training sessions (Wendler et al., 2014). Pre-training infusion of D1 or D2 antagonist into the DLS did not affect the number of AA responses during training but significantly decreased AA responses during the retention test 24 h later (Boschen et al., 2011; Wietzikoski et al., 2012).

Also, the laterodorsal tegmental nucleus (LDTg) and pedunculopontine tegmental nucleus (PPTg), which send glutamatergic and cholinergic projections to the midbrain, play an important role in 2WAA (Mena-Segovia et al., 2008; Lammel et al., 2012). Bilateral lesions of the PPTg completely abolished the acquisition of 2WAA (Fujimoto et al., 1989). Rats with ipsilateral disconnection of the SNc from the PPTg learned the 2WAA, but contralateral disconnection blocked learning even after 3 days of conditioning, suggesting that PPTg-SNc communication is necessary to acquire 2WAA (Bortolanza et al., 2010).

### **PERSPECTIVES**

The majority of DA neurons are inhibited by aversive USs, or by CSs signaling aversive USs [for details see Ilango et al. (2012)]. Hypothetically, once the aversive CS–US association is repeated several times, the tonic inhibition mode changes, and the change in DA dynamics prepares the organism to perform successful avoidance responses. Perhaps this learning-relevant change in DA signaling also leads to changes in synaptic plasticity between hippocampal→amygdala neurons and the neurons of the direct pathway that are active at the same time, thus guiding the organism to repeat the instrumental response. There are several pathways that could provide midbrain DA neurons with information about aversive events. Nociceptive projections from the spinal cord convey pain-related signals to the parabrachial nucleus (PBN). Through its direct glutamatergic projection or indirect glutamatergic route to the rostromedial tegmental nucleus (RMTg) and VTA GABA neurons, the aversive signal is relayed to midbrain VTA/SNc DA neurons. Inactivation of the PBN either reduces the amplitude or completely abolishes the inhibitory response of DA neurons to footshock (Coizet et al., 2010). In addition, modulation of pain- and aversion-related signals conveyed by the LHb reaches DA neurons directly or indirectly through the RMTg (Jhou et al., 2009). This informative signal is highly processed and capable of assigning motivational valence based on prior events. Indeed, unexpected footshock increased LHb-to-RMTg glutamate release, and optogenetic activation (60 Hz) of this pathway promoted place aversion and AA learning to prevent the activation (Stamatakis and Stuber, 2012).

There is also evidence that CeA–DA interactions may be important for AA learning. The CeA projects to substantia nigra DA neurons and to the DLS. Moreover, electrical stimulation of the CeA modulates the firing of SNc DA neurons (Rouillard and Freeman, 1995). The CeA is also known to strengthen the effect of Pavlovian stimuli on instrumental performance, and the CeA→DLS pathway is implicated in habit acquisition (Corbit and Balleine, 2005; Lázaro-Muñoz et al., 2010; Lingawi and Balleine, 2012). Reciprocally, DA neurons in the lateral part of the SNc project to the CeA, playing a role in surprise-induced enhancement of attention and learning (Lee et al., 2006, 2008). Accordingly, the first successful avoidance trial in an AA paradigm likely constitutes a "pleasant surprise" for the animal (a violation of the expectation that the CS is always followed by the US), and we hypothesize that early AA success may recruit the SNc→CeA pathway and consequently enhance attention and learning of the AA response.

From the evidence reviewed above, it is clear that parallel streams of information might reach the amygdala and DA midbrain structures, and both systems might interact at the striatum level (**Figure 1**). Blocking transmission of glutamate signals by knocking out the NMDA receptor in medium spiny neurons of the striatum impaired learning of a simple FR1 operant task for food reward as well as 2WAA. This confirms that 2WAA recruits simple motor coordination circuits (Beutler et al., 2011). Moreover, striatum-specific deletion of adenosine A(2A) receptors impaired 2WAA. This is interesting given the known role of these receptors in fine-tuning the DA-glutamate balance in the striatum (Singer et al., 2013).

Unlike reward learning, when a subject masters AA, it no longer receives external motivation in the form of a US, providing a mystery to early learning theorists. How can behavior be sustained in the absence of reinforcement? One solution is that a fear memory of the US continues to be evoked by the CS, and alleviation of this fear state can continue to motivate the AA response in the absence of the US. But if the CS is no longer paired with the US, why does the CS–US association not undergo extinction? Moreover, animals that have mastered an AA task no longer show strong physiological signs of fear or distress. For these animals, the execution of the avoidance response takes on the quality of a habit. Therefore, we propose that AA learning ultimately recruits and depends on the same circuitry involved in habit formation, such as the so-called spiraling loop of striatal–nigral–striatal circuitry (Yin and Knowlton, 2006; Belin and Everitt, 2008; Ilango et al., 2014). We believe that this circuitry is a prime target for investigating the neural mechanisms that sustain avoidance behavior, and it may reveal novel ways of facilitating its extinction.

### **REFERENCES**


impaired learning and altered neuronal morphology in mice lacking NMDA receptors in medium spiny neurons. *PLoS ONE* 6:e28168. doi:10.1371/journal.pone.0028168


cingulothalamic training-induced neuronal activity. *Neurobiol. Learn. Mem.* 76, 403–425. doi:10.1006/nlme.2001.4019


Yin, H. H., and Knowlton, B. J. (2006). The role of the basal ganglia in habit formation. *Nat. Rev. Neurosci.* 7, 464–476. doi:10.1038/nrn1919

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 June 2014; accepted: 08 October 2014; published online: 27 October 2014.*

*Citation: Ilango A, Shumake J, Wetzel W and Ohl FW (2014) Contribution of emotional and motivational neurocircuitry to cue-signaled active avoidance learning. Front. Behav. Neurosci. 8:372. doi: 10.3389/fnbeh.2014.00372*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Ilango, Shumake, Wetzel and Ohl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Persistent active avoidance correlates with activity in prelimbic cortex and ventral striatum

Christian Bravo-Rivera†, Ciorana Roman-Ortiz†, Marlian Montesinos-Cartagena and Gregory J. Quirk\*

Departments of Psychiatry and Anatomy and Neurobiology, University of Puerto Rico School of Medicine, San Juan, Puerto Rico

Persistent avoidance is a prominent symptom of anxiety disorders and is often resistant to extinction-based therapies. Little is known about the circuitry mediating persistent avoidance. Using a recently described platform-mediated active avoidance task, we assessed activity in several structures with c-Fos immuno-labeling. In Task 1, rats were conditioned to avoid a tone-signaled shock by moving to a safe platform, and then were extinguished over two days. One day later, failure to retrieve extinction correlated with increased activity in the prelimbic prefrontal cortex (PL), ventral striatum (VS), and basal amygdala (BA), and decreased activity in infralimbic prefrontal cortex (IL), consistent with pharmacological inactivation studies. In Task 2, the platform was removed during extinction training and fear (suppression of bar pressing) was extinguished to criterion over 3–5 days. The platform was then returned in a post-extinction test. Under these conditions, avoidance levels were equivalent to those in Task 1 and correlated with increased activity in PL and VS, but there was no correlation with activity in IL or BA. Thus, persistent avoidance can occur independently of deficits in fear extinction and its associated structures.

Keywords: infralimbic, fear extinction, amygdala, c-Fos, freezing

### Introduction

To ensure survival, individuals must learn to actively avoid cues predictive of danger. Active avoidance can be extinguished when the cues no longer predict danger. A long-standing theory proposes that avoidance is initially reinforced by fear, and it is subsequently reinforced by fear reduction (Mowrer and Lamoreaux, 1946). However, other studies have dissociated fear from avoidance (Lolordo and Rescorla, 1966; Riccio and Silvestri, 1973; Overmier and Brackbill, 1977). The majority of studies focus on neural mechanisms of avoidance acquisition (Gabriel et al., 1983; Orona and Gabriel, 1983; Gabriel, 1990; Maren et al., 1991; Savonenko et al., 1999; Holahan and White, 2002; Lázaro-Muñoz et al., 2010; Shumake et al., 2010; Darvas et al., 2011; Moscarello and LeDoux, 2013; Beck et al., 2014; Lichtenberg et al., 2014; Ramirez et al., 2015), but few focus on its extinction (Gabriel et al., 1983; Pang et al., 2010; Jiao et al., 2011; Bravo-Rivera et al., 2014b; Wendler et al., 2014). In a platform-mediated avoidance task (Bravo-Rivera et al., 2014b), rats learn to avoid a tone-signaled footshock by stepping onto a nearby platform at the expense of sucrose pellets. Pharmacological inactivation of prelimbic prefrontal cortex (PL) or ventral striatum (VS), but not infralimbic cortex (IL), impaired avoidance expression, whereas inactivation of the IL prior to avoidance extinction impaired retrieval of extinction the following day.

### Edited by:

Allan V. Kalueff, ZENEREI Institute, USA; Guangdong Ocean University, China

### Reviewed by:

Rick Richardson, The University of New South Wales, Australia

Ewelina Anna Knapska, Nencki Institute of Experimental Biology, Poland

Justin Moscarello, New York University, USA

#### \*Correspondence:

Gregory J. Quirk, Departments of Psychiatry and Anatomy and Neurobiology, University of Puerto Rico School of Medicine, Americo Miranda Avenue San Juan 00936-5067, PO Box 365067, San Juan, Puerto Rico gregoryjquirk@gmail.com

†These authors have contributed equally to this work.

> Received: 09 April 2015 Accepted: 01 July 2015 Published: 15 July 2015

#### Citation:

Bravo-Rivera C, Roman-Ortiz C, Montesinos-Cartagena M and Quirk GJ (2015) Persistent active avoidance correlates with activity in prelimbic cortex and ventral striatum. Front. Behav. Neurosci. 9:184. doi: 10.3389/fnbeh.2015.00184

In anxiety patients, persistent avoidance is maladaptive and can severely impair quality of life (Aupperle et al., 2012). A percentage of rats persist in platform-mediated avoidance, despite extinction training (Bravo-Rivera et al., 2014b). We therefore sought to identify structures involved in persistent avoidance using c-Fos immuno-labeling. We focused on structures previously linked to avoidance expression, such as PL (Beck et al., 2014; Bravo-Rivera et al., 2014b) and VS (Darvas et al., 2011; Bravo-Rivera et al., 2014b; Ramirez et al., 2015), as well as structures linked to conditioned fear and fear extinction such as basal amygdala (BA; Herry et al., 2006; Laurent et al., 2008; Sierra-Mercado et al., 2011), and IL (Milad and Quirk, 2002; Sierra-Mercado et al., 2011; Do-Monte et al., 2015a), respectively. We used a correlational analysis to reveal individual variation across rats. Extinction of signaled avoidance has two components: (1) extinction of the tone-shock association (Rescorla and Heth, 1975); and (2) extinction of avoidance responding (Pang et al., 2010; Beck et al., 2011; Todd et al., 2014; Wendler et al., 2014). We therefore modified our extinction task to dissociate these two components.

### Materials and Methods

### Bar-Press Training

A total of 47 male Sprague–Dawley rats (Harlan Laboratories, Indianapolis, IN, USA) weighing 300–360 g were used in this study. Rats were restricted to 18 g/day of standard laboratory chow, followed by 10 days of training to press a bar for sucrose pellets on a variable interval schedule of reinforcement averaging 30 s (VI–30 s). Rats were trained until they reached a criterion of >10 presses/min. All procedures were approved by the Institutional Animal Care and Use Committee of the University of Puerto Rico School of Medicine, in compliance with National Institutes of Health's Guide for the Care and Use of Laboratory Animals (Eighth Edition).

### Platform-Mediated Avoidance Training

We used the same parameters for the platform-mediated avoidance task as in our previous study (Bravo-Rivera et al., 2014b). Rats were trained in standard operant chambers (26.7 cm long, 27.9 cm wide, 27.9 cm tall; Coulbourn Instruments, Allentown, PA, USA) located in sound-attenuating cubicles (Med Associates, Burlington, VT, USA). The floor of the chambers consisted of stainless steel bars delivering a scrambled electric footshock. Shock grids and floor trays were cleaned with soap and water, and chamber walls were cleaned with wet paper towels. Sucrose pellets were available on a VI-30 s schedule throughout all phases of training and tests. Rats were conditioned with a pure tone (30 s, 4 kHz, 75 dB) co-terminating with a shock delivered through the floor grids (2 s, 0.4 mA). The inter-trial interval was variable, averaging 3 min. An acrylic square platform (14.0 cm each side, 0.33 cm tall) located in the corner opposite the sucrose-delivering bar protected rats from the shock. The platform was fixed to the floor and was present during bar-press training prior to conditioning to reduce novelty. Rats were trained in platform-mediated avoidance for 10 days to reduce freezing and return spontaneous press rates to pre-conditioning levels. Each day, rats received three sessions consisting of three tone-shock trials each (9 tone-shock pairings per day). Rats were left in the training chamber between sessions for 5 min to reinforce bar-press training and to reduce contextual fear.

### Extinction Training

For Task 1, rats were presented daily with 15 unreinforced tones in the same conditioning chamber with the platform present. After two extinction training days, rats were presented with an avoidance test (two tones). For Task 2, the platform was removed prior to extinction, and rats were presented daily with 15 unreinforced tones. Given that freezing decreases to low levels with platform-mediated avoidance training, we used bar-press suppression as an index of fear memory (Bouton and Bolles, 1980; Quirk et al., 2000). A criterion of <25% suppression during the first two extinction trials was used to ensure adequate extinction, and rats were given 3–5 days of extinction training (without platform) to reach criterion. Two out of 24 rats were excluded from the experiment for failing to reach this criterion. One day following the conclusion of extinction training, the platform was returned and rats were presented with an avoidance test (two tones). Task 1 was designed to reveal rats that fail to extinguish despite a fixed amount of training that typically results in successful extinction, whereas Task 2 extinguished rats to criterion to normalize Pavlovian extinction such that it could not be a contributing factor to persistence. After the avoidance test in each task, a subset of rats exhibiting a wide range of avoidance values was selected for c-Fos immuno-labeling.
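The Task 2 stopping rule can be sketched in code. This is a minimal illustration, not the authors' scoring procedure: the function name `day_criterion_met` is hypothetical, and averaging the first two trials of a session is an assumption, since the text does not say whether the <25% criterion applies to the mean or to each trial. The 3-day minimum and 5-day maximum follow the stated 3–5 days of extinction training.

```python
from statistics import mean
from typing import Optional, Sequence


def day_criterion_met(
    early_suppression_by_day: Sequence[Sequence[float]],
    threshold_pct: float = 25.0,
    min_days: int = 3,
    max_days: int = 5,
) -> Optional[int]:
    """Return the first extinction day (1-indexed) on which criterion is met.

    early_suppression_by_day[d] holds the percent suppression values for the
    first two tone trials of extinction day d + 1. Averaging those two values
    is an assumption of this sketch.
    Returns None if criterion is not reached within max_days (such a rat
    would be excluded).
    """
    for day, trials in enumerate(early_suppression_by_day[:max_days], start=1):
        if day < min_days:
            continue  # every rat receives at least min_days of extinction
        if mean(trials) < threshold_pct:
            return day
    return None
```

For example, a rat with hypothetical early-trial suppression of 80/70, 50/40 and 20/10 percent on days 1 to 3 would stop after day 3, whereas a rat that never drops below 25% by day 5 returns `None`.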

### Data Analysis

Behavior was recorded with digital video cameras (Micro Video Products, Bobcaygeon, ON, Canada) and freezing was automatically analyzed from video images (Freezescan, Clever Systems, Reston, VA, USA). The amount of time freezing to the tone was expressed as a percentage of the tone presentation (% Freezing). Avoidance was defined as the time spent with at least two paws on the platform, and was expressed as a percentage of the tone presentation (% Time on platform). We scored two paws on platform as avoidance because rats typically test the bars for shock with the two front paws during tone presentations. Moreover, rats cannot reach the sucrose-delivering bar from the platform, which constitutes a behavioral cost of avoidance. Avoidance was scored from videos by a trained observer. We also measured the percent of bar-press suppression to the tone (Bouton and Bolles, 1980; Quirk et al., 2000), calculated as follows: (pretone rate − tone rate)/(pretone rate + tone rate) × 100. A value of zero percent indicates no suppression, whereas a value of 100% indicates complete suppression. We performed regression analyses with the corresponding F-test and Student's t-test (Statistica, StatSoft, Tulsa, OK, USA) throughout the study.
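The two behavioral measures defined above reduce to simple arithmetic, sketched here for concreteness. The function names are hypothetical, the 30-s default tone duration comes from the conditioning parameters, and returning NaN when no presses occur in either window is an assumption of this sketch (the text does not say how that case was handled).

```python
def suppression_percent(pretone_rate: float, tone_rate: float) -> float:
    """Percent bar-press suppression: (pretone - tone) / (pretone + tone) * 100."""
    total = pretone_rate + tone_rate
    if total == 0:
        # Ratio is undefined when the rat did not press in either window.
        # Returning NaN is an assumption, not a rule stated in the text.
        return float("nan")
    return (pretone_rate - tone_rate) / total * 100.0


def percent_of_tone(seconds: float, tone_duration_s: float = 30.0) -> float:
    """Express a duration (freezing or time on platform) as % of the tone."""
    return seconds / tone_duration_s * 100.0
```

Equal press rates before and during the tone give 0% (no suppression), and a tone rate of zero gives 100% (complete suppression), matching the interpretation in the text.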

### Immunohistochemistry

We used a c-Fos immuno-labeling protocol that we previously described (Padilla-Coreano et al., 2012). One hour after the end of the behavioral test, rats were anesthetized with sodium pentobarbital (450 mg/kg, i.p.) and then perfused transcardially with 250 ml of 0.9% saline followed by 500 ml of cold 4% paraformaldehyde (PFA) in 0.1 M phosphate-buffered saline (PBS) at pH 7.4. Brains were removed and fixed overnight in 4% PFA, and transferred to 30% sucrose in 0.1 M PBS for 48 h, for cryoprotection. Frozen sections were cut coronally (40 µm) with a cryostat (CM 1850; Leica) at different levels of the prefrontal cortex, striatum, and amygdala.

All sections were washed in 0.1 M PBS with 0.1% tween (Tween-20, Sigma Aldrich, USA) between reactions three times for 5 min each. Sections were initially blocked in a solution of 2% normal goat serum (NGS, Vector Laboratories, Burlingame, CA, USA) and 0.1% tween in 0.1 M PBS (pH 7.4) for 1 h. Afterwards, sections were incubated overnight at room temperature with anti-c-Fos serum raised in rabbit (Ab-5, Oncogene Science, USA) at a concentration of 1:20,000. Sections were then incubated for 2 h at room temperature in a solution of biotinylated goat anti-rabbit IgG (Vector Laboratories, USA) and placed in a mixed avidin biotin horseradish peroxidase complex solution (ABC Elite Kit, Vector Laboratories, USA) for 90 min. Black/brown immuno-labeled nuclei labeled for c-Fos were visualized after 15 min of exposure to a solution containing 0.02% diaminobenzidine tetrahydrochloride with 0.3% nickel ammonium sulfate in 0.05 M Tris buffer, pH 7.6, followed by a 10 min incubation period in a chromogen solution with glucose oxidase (10%) and D-Glucose (10%). The reaction was stopped using three 5 min washes of 0.1 M PBS without tween. Sections were mounted on gelatin coated slides, dehydrated, and coverslipped. Counter sections were collected, stained for Nissl bodies, cover-slipped, and used to determine the anatomical boundaries of each structure analyzed.

c-Fos immuno-labeled neurons were automatically counted at 20× magnification with an Olympus microscope (Model BX51) equipped with a digital camera. Micrographs were generated for prelimbic cortex (PL, +3.00 to +3.70 AP), infralimbic cortex (IL, +3.00 to +3.70 AP), orbitofrontal cortex (OFC, +3.00 to +3.70 AP), ventral striatum (VS, +2.00 to 0.00 AP), basolateral nucleus of the amygdala (BLA, −3.00 to −2.00 AP) and central nucleus of the amygdala, divided into lateral (CeL, −3.00 to −2.00 AP) and medial (CeM, −3.00 to −2.00 AP) portions, and paraventricular thalamic nucleus (PVT, −3.00 to −2.00 AP). Example micrographs are shown in **Figure 3B**. The c-Fos immuno-labeled neuron counts were averaged for each hemisphere in 2–3 different sections for each structure (Metamorph software version 6.1). Density was calculated by dividing the number of c-Fos positive neurons by the total area of each region.
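The density calculation can be illustrated with a short sketch. The function name `cfos_density` is hypothetical, and the area unit (mm²) is an assumption, since the text does not state the unit. The counts passed in stand for the per-hemisphere, per-section counts that the text says were averaged for each structure.

```python
from statistics import mean
from typing import Sequence


def cfos_density(counts: Sequence[float], region_area_mm2: float) -> float:
    """Mean c-Fos-positive count across sampled sections divided by region area.

    counts: one count per hemisphere/section sampled for a structure.
    region_area_mm2: area of the counted region; mm^2 is an assumed unit.
    """
    if not counts:
        raise ValueError("at least one count is required")
    if region_area_mm2 <= 0:
        raise ValueError("region area must be positive")
    return mean(counts) / region_area_mm2
```

With hypothetical counts of 40, 50 and 60 labeled neurons in a region of 0.5 mm², the mean count is 50 and the density is 100 neurons per mm².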

### Results

### Task 1: Extinction with Platform Present

Rats were given 10 days of avoidance conditioning, followed by 2 days of extinction training (tones without shocks; **Figure 1A**). On the first day of extinction (day 11), the percent of time spent on the platform during the tone dropped from 66% to 10% (**Figure 1B**). The following day (day 12), rats started the session with relatively low levels of avoidance (27% of time on platform), indicating good retrieval of extinction (**Figure 1B**). During the avoidance test (day 13), rats showed minimal avoidance expression on average (24% of time on platform), again indicating retrieval of extinction (**Figure 1B**). However, 30% (n = 7) of the rats spent >40% of the time on the platform, indicating persistence of avoidance (**Figure 1C**). Freezing and avoidance at test were not significantly correlated (r = 0.29, p = 0.17).

### Task 2: Extinction with Platform Absent

Persistent avoidance could be due to an inability of rats to extinguish the tone-shock association across 2 days. Alternatively, persistent rats may be incapable of suppressing avoidance despite adequate extinction of the tone-shock association. In order to distinguish between these two possibilities, we modified the task by removing the platform during extinction training, and fully extinguishing the tone-shock association to criterion (**Figure 2A**). Given that freezing decreases to low levels with platform-mediated avoidance training, we used bar-press suppression as an index of tone-shock association memory (Bouton and Bolles, 1980; Quirk et al., 2000). Extinction sessions were repeated daily until each rat exhibited <25% suppression at the start of the session. Forty-five percent of rats reached this criterion by the 3rd extinction session, 23% of rats required a 4th session, and 32% required a 5th session. One day following the last extinction session, the platform was returned and rats were tested for avoidance. We reasoned that persistent avoidance should be reduced if impairment in tone-shock extinction was the main factor driving persistence.

Extinction training reduced suppression of bar-pressing from 80 to 24% by the last day of extinction. Surprisingly, however, persistent avoidance still occurred at test (**Figure 2C**). The percentage of rats spending >40% of the time on the platform was similar to Task 1 (27% vs. 30% of rats). The average avoidance values for the group were also similar (Task 1: 24.2%, Task 2: 23.6%; **Figure 2D**). Freezing at test, however, was significantly lower in Task 2 compared to Task 1 (Task 1: 26.0%, Task 2: 6.9%, t<sub>44</sub> = 2.40, p = 0.020; **Figure 2D**), despite equivalent freezing levels prior to extinction training (Task 1: 11.9%, Task 2: 14.3%, t<sub>44</sub> = 0.31, p = 0.75). Thus, persistent avoidance can occur despite successful tone-shock extinction (**Figure 2B**). Next, we used c-Fos immuno-labeling to identify structures in which activity correlated with persistent avoidance.

### c-Fos Expression at Test in Task 1

In Task 1, we selected a subset of rats (16 of 23) that underwent extinction training and avoidance testing, to assess neural activation using the activity marker c-Fos (**Figures 3A,B**). The subset was chosen to express a wide range of avoidance values at test, in order to assess co-variance with c-Fos expression. We observed significant positive correlations between avoidance at test and c-Fos density in PL (r = 0.79, p < 0.001), VS (r = 0.86, p < 0.001) and BA (r = 0.67, p = 0.0046; **Figure 3C**), consistent with findings from pharmacological inactivation studies of platform-mediated avoidance (Bravo-Rivera et al., 2014b).

Furthermore, avoidance at test correlated negatively with c-Fos density in IL (r = −0.74, p < 0.001, **Figure 3C**), consistent with impaired retrieval of extinction following inactivation of IL (prior to extinction) in this task (Bravo-Rivera et al., 2014b). Moreover, c-Fos density in IL correlated negatively with c-Fos density in BA (r = −0.65, p = 0.007; **Figure 3E**), with rats separating into two clusters: (1) those with high avoidance (with increased and decreased activity in BA and IL, respectively); and (2) those with low avoidance (with decreased and increased activity in BA and IL, respectively). Given the established roles of IL in fear extinction (Fontanez-Nuin et al., 2011; Sierra-Mercado et al., 2011; Do-Monte et al., 2015a) and BA in fear expression (Anglada-Figueroa and Quirk, 2005; Herry et al., 2008), persistent avoidance in Task 1 appears to correlate with poor extinction of fear.

### c-Fos Expression at Test in Task 2

We next characterized activity patterns during persistent avoidance in Task 2 by selecting a subgroup of rats (12 out of 22) showing a wide range of avoidance values at test. Similar to Task 1, we observed positive correlations between avoidance expression and c-Fos density in PL (r = 0.85, p < 0.001) and VS (r = 0.77, p = 0.0043; **Figure 3D**). Unlike Task 1, however, avoidance expression was not correlated with c-Fos density in IL (r = 0.41, p = 0.19) or BA (r = −0.17, p = 0.60; **Figure 3D**). In fact, all rats showed high activity levels in IL, and most rats showed low activity levels in BA, consistent with successful fear extinction. Furthermore, there was no clustering of rats into behavioral subgroups in the BA vs. IL plot (**Figure 3F**): rats with high IL and low BA activity exhibited both high and low avoidance.
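The correlational analysis reported for both tasks can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the data values are hypothetical, and the assumption that the Bonferroni correction spans eight structures is taken only from the eight regions named in the table legend.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical per-rat values (not data from the study)
avoidance_pct = [5.0, 12.0, 20.0, 35.0, 48.0, 60.0]       # % time on platform
cfos_per_area = [40.0, 55.0, 52.0, 80.0, 95.0, 110.0]     # c-Fos density

r = pearson_r(avoidance_pct, cfos_per_area)
threshold = bonferroni_alpha(0.05, 8)   # 8 structures is an assumption
print(f"r = {r:.2f}; corrected alpha = {threshold:.5f}")
```

Under this assumed threshold, a p-value would need to fall below 0.00625 to count as significant after correction.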

### Discussion

In this study, we sought to characterize persistent avoidance using a platform-mediated avoidance task. When extinction tones were delivered in the presence of the platform (Task 1), persistent avoidance resembled deficient extinction of tone-shock associations, as evidenced by high freezing and c-Fos expression patterns. However, when extinction tones were delivered in the absence of the platform (Task 2), persistent avoidance still occurred despite successful extinction of tone-shock associations.

In Pavlovian fear extinction, there is considerable variability in the subsequent retrieval of extinction (Bush et al., 2007; Sotres-Bayon et al., 2008), and such failure is associated with a pattern of c-Fos similar to what we observed in Task 1 (Herry and Mons, 2004; Berretta et al., 2005; Knapska and Maren, 2009; Kim et al., 2010). Thus, persistent avoidance in Task 1 could simply reflect poor tone-shock extinction. However, correlations do not imply causality, and there was no correlation between freezing and avoidance at test in Task 1. The relationship between extinction of fear and avoidance was specifically addressed in Task 2, in which tone-shock associations were separately extinguished to criterion prior to the avoidance test. This manipulation reduced freezing at test (as expected) but did not reduce avoidance. Thus, persistent avoidance in Task 2 (and perhaps Task 1 as well) could be operating independently from fear extinction.

The only structures in both tasks that were positively correlated with avoidance were PL, VS, and OFC (see **Tables 1**, **2**). PL and VS likely play a role in avoidance expression, given that inactivation of PL (Beck et al., 2014; Bravo-Rivera et al., 2014b) or VS (Bravo-Rivera et al., 2014b; Ramirez et al., 2015) impairs the expression of avoidance. PL may mediate avoidance through projections to VS (Bravo-Rivera et al., 2014a; Lee et al., 2014). In contrast, activity in BA and IL was only correlated with avoidance in Task 1, suggesting that these areas do not mediate avoidance expression per se. IL mediates extinction of avoidance (Bravo-Rivera et al., 2014b), as well as extinction of conditioned fear (Burgos-Robles et al., 2007; Laurent and Westbrook, 2009; Amir et al., 2011; Fontanez-Nuin et al., 2011; Sierra-Mercado et al., 2011; Do-Monte et al., 2015a). Our observation that IL activity was deficient in persistent rats is likely due to a reduction in IL activity during extinction training, rather than at test (Do-Monte et al., 2015a). BA has been reported to mediate avoidance (Lázaro-Muñoz et al., 2010; Ramirez et al., 2015) and inactivation of BA reduces expression of platform-mediated avoidance (Bravo-Rivera et al., 2014b). However, the lack of correlation in Task 2 instead suggests that BA signals the tone-shock association, which is reduced in Task 2 by extinction to criterion in the absence of the platform. In fact, at test, the majority of rats in Task 2 (9 of 12) showed the pattern of c-Fos consistent with successful tone-shock extinction (low BA, high IL, **Figure 3F**; Knapska and Maren, 2009), yet were mixed in their expression of avoidance. Activity in the CeM and the PVT correlated with avoidance in Task 1, as shown in **Table 1**. Both CeM (LeDoux et al., 1988; Goosens and Maren, 2001) and PVT (Do-Monte et al., 2015b; Penzo et al., 2015) are key mediators of conditioned fear. Moreover, a recent study showed that activity in CeM correlated with shuttle avoidance (Martinez et al., 2013).

If not a deficiency in tone-shock extinction, what factors might be generating persistent avoidance in Task 2? (1) Returning the platform at test could trigger renewal of fear, if rats associated the platform with the occurrence of shock (Bouton and King, 1983). This is unlikely, however, because all rats showed low freezing at test. (2) Avoidance may have been driven by non-fearful motivations such as habit learning (Atallah et al., 2007; Balleine and O'Doherty, 2010), or increased value estimation (Berridge et al., 2009). Interestingly, the latter is dependent on the OFC, which was positively correlated with avoidance in both tasks. However, our data do not dissociate possible roles of OFC in avoidance prior to extinction vs. after extinction. Other studies have shown that avoidance could be extinguished independently from fear (Lolordo and Rescorla, 1966; Riccio and Silvestri, 1973), suggesting that fear and avoidance circuits are dissociable.

Legend: PL, prelimbic cortex; IL, infralimbic cortex; VS, ventral striatum; BA, basal amygdala; CeM, centromedial nucleus of the amygdala; CeL, centrolateral nucleus of the amygdala; PVT, paraventricular nucleus of the thalamus; OFC, orbitofrontal cortex. \*Depicts significant correlation after Bonferroni correction.

Avoidance is a core symptom of anxiety disorders (Kashdan et al., 2006) and a prominent feature of Post Traumatic Stress Disorder (PTSD; Friedman et al., 2011; American-Psychiatric-Association, 2013). Prolonged exposure therapy is based on fear extinction and it is the standard of care for PTSD (Davis et al., 2006; Foa, 2006; Kearns et al., 2012). However, extinction-based therapies often do not reduce avoidance behaviors (Sripada et al., 2013). Therefore, avoidance can occur independent of fear, suggesting that therapies that reduce fear may not be useful in reducing persistent avoidance behaviors.

Human neuroimaging studies of active avoidance are beginning to emerge, and recent findings implicate a prefrontal-cingulate-striatal circuit (Delgado et al., 2009; Aupperle et al., 2015), consistent with the involvement of PL and VS in active avoidance (Bravo-Rivera et al., 2014b; Lee et al., 2014). Distinguishing extinction of fear from extinction of avoidance could help identify substrates of persistent avoidance in humans, and may help guide treatments for avoidance-related disorders such as PTSD.


### Acknowledgments

This work was supported by NIH grants MH102968 to CB-R, MH092912 (ENDURE) to MM-C, MH058883, MH086400 to GJQ, and the University of Puerto Rico President's Office.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Bravo-Rivera, Roman-Ortiz, Montesinos-Cartagena and Quirk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Altered activity of the medial prefrontal cortex and amygdala during acquisition and extinction of an active avoidance task

Xilu Jiao<sup>1</sup>\*, Kevin D. Beck<sup>2,3</sup>, Catherine E. Myers<sup>2,3</sup>, Richard J. Servatius<sup>3,4</sup> and Kevin C. H. Pang<sup>2,3</sup>

<sup>1</sup> Neurobehavioral Laboratory, Veterans Bio-Medical Research Institute (VBRI), East Orange, NJ, USA, <sup>2</sup> Neurobehavioral Research Laboratory, Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA, <sup>3</sup> Department of Pharmacology, Physiology and Neuroscience, Rutgers Biomedical Health Sciences, Newark, NJ, USA, <sup>4</sup> Syracuse VA Medical Center, Department of Veterans Affairs, Syracuse, NY, USA

Altered medial prefrontal cortex (mPFC) and amygdala function is associated with anxiety-related disorders. While the mPFC-amygdala pathway has a clear role in fear conditioning, these structures are also involved in active avoidance. Given that avoidance perseveration represents a core symptom of anxiety disorders, the neural substrate of avoidance, especially its extinction, requires better understanding. The present study was designed to investigate neural activity, particularly inhibitory neuronal activity, in the mPFC and amygdala during acquisition and extinction of lever-press avoidance in rats. Neural activity was examined in the mPFC, intercalated cell clusters (ITCs), and lateral (LA), basal (BA) and central (CeA) amygdala, at various time points during acquisition and extinction, using induction of the immediate early gene product, c-Fos. Neural activity was greater in the mPFC, LA, BA, and ITC during the extinction phase as compared to the acquisition phase. In contrast, the CeA was the only region that was more activated during acquisition than during extinction. Our results indicate that inhibitory neurons in the mPFC and LA are more activated during the late phase of acquisition and during extinction, suggesting the dynamic involvement of inhibitory circuits in the development and extinction of the avoidance response. Together, these data start to identify the key brain regions important in active avoidance behavior, areas that could be associated with avoidance perseveration in anxiety disorders.

Keywords: c-Fos, gamma-aminobutyric-acid (GABA), intercalated cell (ITC), glutamic acid decarboxylase (GAD), parvalbumin, lever-press, rat

### Introduction

Avoidance is a common feature of anxiety disorders (American Psychiatric Association, 2000). As avoidance behavior is a key behavioral component of anxiety disorders, learning to extinguish such behavior is a fundamental concept embedded in cognitive behavioral therapy for anxiety disorders, including post-traumatic stress disorder (PTSD) and phobias (Rau et al., 2005; Rauch et al., 2006). Thus, a better understanding of the neurobiological basis of active avoidance and its extinction will provide important insights into future behavioral and pharmacological treatment for clinical anxiety.

### Edited by:

Israel Liberzon, University of Michigan, USA

### Reviewed by:

Barry Setlow, University of Florida, USA Shane Alan Perrine, Wayne State University, USA

### \*Correspondence:

Xilu Jiao, Neurobehavioral Laboratory, Veterans Bio-Medical Research Institute (VBRI), VA Medical Center, 385 Tremont Avenue, East Orange, NJ 07018, USA xilu.jiao@va.gov

> Received: 31 March 2015 Accepted: 27 August 2015 Published: 15 September 2015

### Citation:

Jiao X, Beck KD, Myers CE, Servatius RJ and Pang KCH (2015) Altered activity of the medial prefrontal cortex and amygdala during acquisition and extinction of an active avoidance task. Front. Behav. Neurosci. 9:249. doi: 10.3389/fnbeh.2015.00249

Malfunctions in the medial prefrontal cortex (mPFC)-amygdala circuit have been identified in patients suffering from PTSD, social anxiety disorder (SAD) and generalized anxiety disorder (GAD; Schwartz and Rauch, 2004; Cottraux, 2005; Guyer et al., 2008). Imaging studies indicate that one of the most consistent findings in PTSD patients is hypoactive ventral mPFC combined with hyperactive amygdala following provocation (Milad et al., 2006; Phan et al., 2006; Rauch et al., 2006). Avoidance develops slowly over time in anxiety disorders, so avoidance learning in animals may provide an opportunity to study the dynamic and progressive neurobiological changes associated with the development of anxiety disorders.

In animal studies, brain regions associated with avoidance behavior include prefrontal cortex (PFC) and amygdala, as well as hippocampus, striatum, medial septum and periaqueductal gray (Kirkby and Kimble, 1968; Bailey et al., 1986; Quirk and Gehlert, 2003; Mobbs et al., 2007; Straube et al., 2009; Pang et al., 2011; Cominski et al., 2014). Electrolytic lesion of the infralimbic cortex (IL) impaired active avoidance learning but facilitated freezing behavior in rats, while central amygdala (CeA) lesion resulted in the opposite behavioral changes (Moscarello and LeDoux, 2013). Rats that previously failed to learn shuttle avoidance can acquire the task following CeA lesion, suggesting that ventral mPFC and CeA play opposite roles in avoidance learning (Choi et al., 2010). Using an immunocytochemistry (ICC) approach, Duncan et al. (1996) reported that shuttle-box avoidance elicited c-Fos activity in the mPFC, cingulate cortex (CG), and medial amygdala in rats. We recently showed that elevated c-Fos activation in mPFC is associated with faster extinction in rats (Jiao et al., 2011). Elevated and prolonged neural activity in mPFC, represented by delta-FosB accumulation measured with Western blot, was also observed in well-trained SD rats (Perrotti et al., 2013). In addition, we found that rats that exhibited deficits in avoidance extinction also displayed lower gamma-aminobutyric acid (GABA) neuron counts and neuronal activation in basolateral amygdala, suggesting that inhibitory modulation is important to ensure successful extinction (Jiao et al., 2011). The present study was conducted to further define the activity of inhibitory neurons in the mPFC and amygdala during the acquisition and extinction of lever-press avoidance.

### Materials and Methods

### Animals

Sixty-six male Sprague-Dawley (SD) rats (approximately 60 days of age at the start of the experiment) were obtained from Harlan Laboratories (Indianapolis, IN) and housed in individual cages with free access to food and water. Rats were housed in a room maintained on a 12:12 h light/dark cycle for at least 2 weeks prior to the start of the experiment. Experiments occurred between 0700 and 1700 h in the light portion of the cycle (light onset occurred at 0600 h). All procedures received prior approval by the VA NJ Health Care System Institutional Animal Care and Use Committee in accordance with AAALAC standards.

### Lever-Press Avoidance Training

As previously described (Servatius et al., 2008), training was conducted in 16 identical operant chambers (Coulbourn Instruments, Langhorn, PA) enclosed in 16 sound-attenuated boxes. The unconditional stimulus (US) was a scrambled 1.0-mA electric foot-shock delivered through the grid floor (Coulbourn Instruments, Langhorn, PA). The CS was a 1000-Hz 75-dB tone (10 dB above background noise). The inter-trial interval (ITI) was 3 min in duration and signaled by a blinking light above the lever.

The avoidance training procedure was composed of 10 sessions of acquisition (A01–A10) and six sessions of extinction (E01–E06), based on previous studies (Servatius et al., 2008; Beck et al., 2010). Avoidance training consisted of 20 trials per session. A session occurred three times per week (sessions separated by 2–3 days). Each session began with a 60 s stimulus-free period. A trial commenced with the delivery of the auditory CS. During the acquisition phase, a lever press during the first 60 s shock-free (warning) period turned off the CS and prevented the delivery of the US; this response was designated an "avoidance" response. If no avoidance response was made, a shock period (shock duration = 0.5 s, inter-shock interval = 3 s, 100 shocks maximum/trial) was initiated 60 s after the start of the trial. The CS was presented during the warning and shock periods. Following a lever press during the shock period, or if the maximum shock period elapsed, the CS and shock co-terminated and the ITI was initiated. A lever press during the shock period was designated an "escape" response. Extinction sessions were similar to acquisition sessions except that no shocks were delivered during trials. "Avoidance" responses during the extinction sessions were lever presses with latencies less than 60 s. A rat that failed to emit a lever press response by the end of the fourth acquisition session was removed from the study. Six rats were dropped from the study for this reason.
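The trial contingencies above can be summarized in a short Python sketch that classifies a trial from the lever-press latency. The timing constants come from the text; everything else is an assumption for illustration: the function and label names are hypothetical, the first shock is assumed to start exactly at 60 s, and the 3 s inter-shock interval is assumed to run from the end of one shock to the start of the next.

```python
WARNING_S = 60.0       # shock-free warning period (from the text)
SHOCK_S = 0.5          # shock duration (from the text)
INTER_SHOCK_S = 3.0    # inter-shock interval (from the text)
MAX_SHOCKS = 100       # maximum shocks per trial (from the text)


def classify_trial(press_latency_s, extinction=False):
    """Return (response, shocks_delivered) for one trial.

    `press_latency_s` is the lever-press latency from CS onset,
    or None if the rat never pressed during the trial.
    """
    if press_latency_s is not None and press_latency_s < WARNING_S:
        return "avoidance", 0
    if extinction:
        # No shocks are delivered in extinction, so later presses are not escapes.
        return "none", 0
    cycle = SHOCK_S + INTER_SHOCK_S          # assumed onset-to-onset spacing
    if press_latency_s is None:
        return "none", MAX_SHOCKS
    elapsed = press_latency_s - WARNING_S
    if elapsed >= MAX_SHOCKS * cycle:
        return "none", MAX_SHOCKS            # trial already ended at the shock cap
    # Shock onsets are assumed at 0, cycle, 2*cycle, ... within the shock period.
    return "escape", int(elapsed // cycle) + 1


# Hypothetical latencies for four acquisition trials
session = [12.0, 45.5, 67.5, None]
results = [classify_trial(t) for t in session]
avoidance_ratio = sum(resp == "avoidance" for resp, _ in results) / len(results)
print(results, avoidance_ratio)
```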

Neural activity was assessed at four times in acquisition and two times in extinction. Rats were randomly assigned to be sacrificed after the 2nd, 4th, 8th or 10th acquisition session (session A02, A04, A08 or A10) or after the 1st or 6th extinction session (session E01 or E06) based on A% data with stratification after session A01, and are referred to as group ACQ02, ACQ04, ACQ08, ACQ10, EXT01 and EXT06, respectively.
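One way such a stratified random assignment could be carried out is sketched below. The procedure shown (rank rats by A01 avoidance percentage, cut the ranking into blocks of six, and shuffle the six group labels within each block) is an assumption; the text does not describe the exact algorithm, and the rat identifiers and scores are hypothetical.

```python
import random

GROUPS = ["ACQ02", "ACQ04", "ACQ08", "ACQ10", "EXT01", "EXT06"]


def stratified_assign(avoidance_pct, seed=0):
    """Assign rats to sacrifice groups, balancing session-A01 avoidance.

    `avoidance_pct` maps a rat identifier to its A01 avoidance percentage.
    """
    rng = random.Random(seed)
    ranked = sorted(avoidance_pct, key=avoidance_pct.get)
    assignment = {}
    for start in range(0, len(ranked), len(GROUPS)):
        stratum = ranked[start:start + len(GROUPS)]
        labels = GROUPS[:]
        rng.shuffle(labels)                  # random order within the stratum
        for rat, label in zip(stratum, labels):
            assignment[rat] = label
    return assignment


# Hypothetical A01 scores for 60 rats (not data from the study)
scores = {f"rat{i:02d}": (i * 7) % 100 for i in range(60)}
assignment = stratified_assign(scores)
group_sizes = {g: sum(1 for v in assignment.values() if v == g) for g in GROUPS}
print(group_sizes)
```

Because every block of six contains each label once, group sizes stay equal whenever the number of rats is a multiple of six.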

Data analysis. A one-way ANOVA with group as the main factor was used for each session to determine whether groups differed in avoidance ratio, escape ratio, and shock number during the acquisition phase, and in avoidance ratio during the extinction phase. A second ANOVA with session as a repeated measure was conducted for each group to assess the change in behavior across acquisition and extinction sessions. Post hoc testing was conducted using Tukey's test for pairwise comparison between groups. All data are expressed as means ± the standard error of the mean. Due to recording errors, data from two rats in group EXT01 and one rat in group EXT06 were missing from session A01. Therefore, the missing subjects were not included in the analysis for session A01.
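For readers unfamiliar with the between-group test, the F statistic of a one-way ANOVA can be computed as in the following minimal sketch. It is not the authors' analysis code, the example values are hypothetical, and it omits the p-value, the repeated-measures analysis and the Tukey post hoc test.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical avoidance ratios for three groups (not data from the study)
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```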

### Immunocytochemistry (ICC)

Ninety to 120 min after the end of a session (A02, A04, A08, A10, E01 or E06), rats were deeply anesthetized with sodium pentobarbital (150 mg/kg) and transcardially perfused with 200 ml of saline solution, followed by 200 ml of 4% paraformaldehyde solution. Brains were extracted, post-fixed in 4% paraformaldehyde at 4 °C overnight, and then stored at 4 °C in 0.1 M phosphate buffer (PB) solution containing 30% (w/v) sucrose until the brains sank.

Coronal brain sections (50 microns) were prepared on a freezing microtome, and every 6th section collected from the mPFC (bregma: +4.20 mm ∼ +2.53 mm) and the amygdala (bregma: −2.04 mm ∼ −3.24 mm; Paxinos and Watson, 1998) was immunostained. To reveal the neural activity during acquisition and extinction of avoidance learning, we quantified c-Fos immunoreactivity (ir; a product from the expression of the immediate early gene c-fos) as a marker of neural activity (Chaudhuri et al., 2000). Given the important role of inhibitory circuits, especially in anxiety (McCabe et al., 2004; Berretta et al., 2005), we were particularly interested in the activation of inhibitory (mostly GABAergic) neurons. We stained for parvalbumin (PV), as previously described, to detect GABAergic neurons (Jiao et al., 2011). PV, a calcium-binding protein, is expressed in more than 55% of GABAergic neurons in the basal amygdala (BA) in various species (Sorvari et al., 1996; Ambalavanar et al., 1999; Kemppainen and Pitkänen, 2000; Gabbott et al., 2006; Dávila et al., 2008), and in about 35% of mPFC GABAergic neurons in rat (Gabbott and Bacon, 1996; Gabbott et al., 2006). However, some PV-ir negative neurons could be GABAergic neurons. In the intercalated cell clusters (ITCs), which are composed of groups of small to medium size fast-firing GABAergic neurons located between BA and CeA (Royer et al., 1999; Royer and Paré, 2002; Mańko et al., 2011), anti-glutamic acid decarboxylase isoform 67 (GAD67, the rate-limiting enzyme of GABA synthesis) antibody was used to define GABAergic neurons (Izumi et al., 2011). Double labeling of c-Fos and PV or GAD67-ir was assessed to evaluate selective neuronal activation in each region of interest (ROI).

ICC procedures were conducted as previously described (Pang et al., 2001; Miller et al., 2005; Jiao et al., 2011). Sections were stained for c-Fos, followed by a second staining for PV or GAD67. Briefly, sections were incubated in rabbit anti-c-Fos IgG (sc-52, 1:1000, Santa Cruz, CA), mouse anti-PV (P3088, 1:1000, Sigma-Aldrich, MO), or mouse anti-GAD67 (MAB5406, 1:1000, Chemicon, CA); sections for c-Fos and PV staining were incubated overnight at room temperature while sections for GAD67 staining were incubated for 48 h at 4 °C. Following incubation in primary antibodies, sections were incubated in secondary antibodies (biotinylated donkey anti-rabbit IgG or biotinylated donkey anti-mouse IgG; 1:200, Jackson ImmunoResearch, PA) for 2 h at room temperature. Visualization was performed using the avidin-biotin method (Vector Laboratories, Burlingame, CA) with nickel-enhanced diaminobenzidine for c-Fos and diaminobenzidine alone for PV or GAD67.

c-Fos-ir nuclei were counted in all ROIs; double-labeled c-Fos/PV-ir perikarya were counted in the anterior CG, prelimbic (PL), and IL cortices of the mPFC, and the lateral amygdala (LA)/BA; only c-Fos nuclei were counted in the ITCs (defined by darker GAD-ir area). Estimates of the number of immunostained neurons or nuclei were obtained using standard stereology procedures (West, 1993; West et al., 2009) and were conducted blind to the training conditions of the animal. Volume measures for each of the brain regions were also determined. The optical fractionator method (Stereo Investigator v.7.0, MicroBrightField, Colchester, VT) was used to obtain the estimates of cell number on a microscope with an x-, y-, z-axis motorized stage (ASI MS-2000, Applied Scientific Instrumentation, Eugene, OR). Cells containing c-Fos-ir and c-Fos/PV-ir double labeling were identified using a 40× objective lens. Double-labeled cells were defined by observing a PV-positive soma (light brown in cytoplasm) with a black nucleus in the center (c-Fos). The counting frame had a height of 10 µm and was 80 µm × 80 µm in size for BA and LA, 150 µm × 100 µm in size for CeA, 50 µm × 100 µm in size for ITC, and 50 µm × 50 µm in size for medial prefrontal area. Seven to eight animals per group were counted for analysis in mPFC regions, 5–7 animals per group were counted in BA, and 5–6 animals per group were counted in the CeA and ITC.
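The optical fractionator scales the raw count by the inverse of each sampling fraction (section, area and thickness). The sketch below illustrates that arithmetic only; it is not the Stereo Investigator implementation. The section sampling interval, counting-frame size and frame height are taken from the text, whereas the 200 µm grid step, the use of the 50 µm cut thickness in place of a measured mounted thickness, and the raw count of 120 are hypothetical.

```python
def optical_fractionator(counted, ssf, asf, tsf):
    """Estimate total cell number from a raw count and sampling fractions.

    ssf: section sampling fraction, asf: area sampling fraction,
    tsf: thickness sampling fraction; each must lie in (0, 1].
    """
    for name, frac in (("ssf", ssf), ("asf", asf), ("tsf", tsf)):
        if not 0 < frac <= 1:
            raise ValueError(f"{name} must be in (0, 1], got {frac}")
    return counted / (ssf * asf * tsf)


ssf = 1 / 6                      # every 6th section (from the text)
asf = (80 * 80) / (200 * 200)    # 80 x 80 um frame (text); 200 um grid step is hypothetical
tsf = 10 / 50                    # 10 um frame height (text); 50 um thickness is an assumption
estimate = optical_fractionator(120, ssf, asf, tsf)   # 120 counted cells is hypothetical
print(f"Estimated total cells: {estimate:.0f}")
```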

Data analysis. A one-way ANOVA with main factor of group (sacrifice time) was used to assess differences in neural activity within each ROI. General neural activity was represented by the density of c-Fos-ir cells (number of c-Fos-ir cells/volume) while GABAergic activation was represented by the ratio of the density of c-Fos/PV-ir double-labeled neurons to the density of single-labeled PV-ir neurons. Neural activation was also compared between phases (acquisition or extinction). Post hoc testing was conducted using Tukey's test for pairwise comparison between groups. Additional t-tests were performed to compare c-Fos-ir cells and activation in PV-ir neurons between ACQ10 and EXT01 or EXT06. All data are expressed as means ± the standard error of the mean.
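The two activity measures defined above reduce to simple ratios, shown here as a minimal sketch. The function names, the cell numbers and the region volume are hypothetical, and mm³ is an assumed unit.

```python
def density(cell_count, volume_mm3):
    """Cells per unit volume of the region of interest."""
    if volume_mm3 <= 0:
        raise ValueError("volume must be positive")
    return cell_count / volume_mm3


def gabaergic_activation(double_labeled, pv_single_labeled, volume_mm3):
    """Ratio of c-Fos/PV double-labeled density to single-labeled PV density."""
    if pv_single_labeled <= 0:
        raise ValueError("need at least one single-labeled PV-ir neuron")
    return density(double_labeled, volume_mm3) / density(pv_single_labeled, volume_mm3)


# Hypothetical stereological estimates for one ROI (not data from the study)
roi_volume_mm3 = 0.4
general_activity = density(900, roi_volume_mm3)                   # c-Fos-ir cells per mm^3
inhibitory_activation = gabaergic_activation(30, 120, roi_volume_mm3)
print(general_activity, inhibitory_activation)
```

Because both densities share the same ROI volume, the ratio equals the ratio of the underlying cell numbers.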

### Results

### Acquisition and Extinction of Lever-Press Avoidance

As judged by avoidance ratio, all groups acquired the task similarly at sacrifice time (main factor = group, ps > 0.05). Groups EXT01 and EXT06 extinguished similarly in the first extinction session, p > 0.05 (**Figure 1A**). No group differences were found for any of the other measures (i.e., escape ratio and shock number per session, **Figures 1B,C**; ps > 0.05). As expected, rats avoided more in later acquisition sessions compared to early sessions in all groups (ACQ04, F(3,27) = 33.48; ACQ08, F(7,63) = 20.23; ACQ10, F(9,81) = 5.43; EXT01, F(8,72) = 5.66; EXT06, F(8,72) = 6.15; ps < 0.0001) except for group ACQ02. Moreover, the number of shocks decreased with training (ACQ04, F(3,27) = 30.69; ACQ08, F(7,63) = 16.08; ACQ10, F(9,81) = 6.1; EXT01, F(8,72) = 3.04; EXT06, F(8,72) = 3.18; ps < 0.01). During extinction, rats avoided less in later extinction sessions compared to early extinction sessions (EXT06, F(5,45) = 3.89, p < 0.01). These data suggest that rats in each of the groups acquired and extinguished avoidance responses similarly and that the observed differences in immediate early gene product likely result from training phase (i.e., acquisition vs. extinction) and training stage (i.e., session).

[Figure caption, partial] ... and 1 subject from group EXT06 were lost from session A01 due to a power failure during training).

### Neural Activity of the mPFC

### Acquisition

Rats from ACQ10 exhibited the highest number of c-Fos-ir cells compared to other acquisition groups (CG: F(3,25) = 5.78; IL: 4.69; PL: 7.29; ps < 0.01; post hoc ps < 0.05; **Figure 2A**), suggesting that mPFC neurons are still active during asymptotic avoidance performance. Importantly, greater activation of PV-ir neurons of the CG, PL and IL was also observed in ACQ10 compared to earlier acquisition sessions (CG: F(3,25) = 3.87, p < 0.05; PL: F(3,25) = 5.46, p < 0.005; IL: F(3,25) = 7.75, p < 0.001; post hoc ps < 0.05), suggesting enhanced inhibitory tone in the mPFC as the active avoidance response is fully developed (**Figure 2B**; for detailed analysis results, see **Table 1**).

### Extinction

All three sub-regions of the mPFC had greater neural activity during the extinction phase than during the acquisition phase (CG: F(1,42) = 10.79; IL: 10.3; PL: 8.31; ps < 0.01; **Figure 2A**). However, c-Fos-ir did not differ between ACQ10 and EXT01 nor between EXT01 and EXT06, suggesting that the enhanced mPFC activity might be a continuation of mPFC activity in late acquisition, with c-Fos-ir cell counts sustained during extinction even as responding dropped. In addition, a greater proportion of PV-ir neurons was activated in all three sub-regions of mPFC during extinction compared to the acquisition phase (CG: F(1,42) = 18.25; IL: 8.68; PL: 10.61; ps < 0.01; **Figure 2B**). Interestingly, there was a trend for PV-ir neurons to be more activated in EXT01 compared to ACQ10 in CG (t(13) = 1.77, p = 0.10), and compared to EXT06 in CG and IL (t(13) = 1.68 and 1.76, ps = 0.11 and 0.10), suggesting that greater inhibitory activity is associated with the transition to extinction.

<sup>∗</sup>Denotes ps < 0.05.

### Neural Activity in Sub-Nuclei of the Amygdala

### BA and LA

### Acquisition

In the LA and BA nuclei, the numbers of c-Fos-ir cells remained the same during acquisition sessions (ps > 0.05; for detailed analysis results, see **Table 1**). However, activity of inhibitory PV-ir neurons in the BA increased as acquisition proceeded (F(3,25) = 4.24, p = 0.0172), while activity of inhibitory PV-ir neurons in the LA did not change across acquisition sessions, p > 0.05. The increased activity of inhibitory neurons in the BA as avoidance is acquired may be due to increased glutamatergic inputs from mPFC. Post hoc analysis showed greater BA PV-ir activation in ACQ10 compared to the ACQ02 group, p < 0.05 (**Figure 3B**).

### Extinction

Compared to the acquisition phase, BA and LA were activated to a greater extent during the extinction phase (F(1,34) = 11.87 (BA) and 7.33 (LA), ps < 0.005 and 0.05 respectively), suggesting enhanced BA activity during extinction learning (**Figure 3A**). Particularly in the BA, there were more c-Fos-ir cells in EXT01 compared to ACQ10, t(11) = 2.76, p < 0.05, suggesting that increased BA activity is associated with the transition to extinction. Greater activity of inhibitory PV-ir neurons was observed in both the BA and LA during extinction compared to the acquisition phase, F(1,34) = 11.64 (BA) and 11.65 (LA), ps < 0.005 (**Figure 3B**). However, neither c-Fos-ir nor the activation of PV-ir neurons changed between early and late extinction sessions.

### ITCs

### Acquisition

ITC area was defined by GAD67 staining as depicted in **Figure 4B**. The number of c-Fos-ir neurons increased as acquisition proceeded in both medial and lateral ITC (lITC: F(1,33) = 11.81; mITC: F(1,32) = 8.82; ps < 0.05; **Table 1**). Particularly in the mITC, the number of c-Fos-ir cells was higher in later sessions compared to early sessions, post hoc analysis ps < 0.05 (**Figure 4A**; for detailed statistical analysis results, see **Table 1**). Since ITCs are composed mainly of GABAergic neurons, elevated ITC activity during acquisition strongly suggests that inhibitory tone develops as rats acquire the avoidance task.

### Extinction

ITC activity was greater during the extinction phase than during the acquisition phase. Although c-Fos-ir cell counts did not differ between ACQ10 and EXT01, a significant increase in c-Fos-ir cell counts was observed in EXT06 in both lITC and mITC, t(10) = 2.53 and t(10) = 2.44, ps < 0.05, suggesting that the transition to extinction increased ITC activity in a delayed manner rather than immediately.

### CeA

### Acquisition

During the acquisition phase, the number of c-Fos-ir neurons increased with training and peaked in session A08, then reduced to the early acquisition level in session A10. This pattern indicates that CeA may be actively involved in learning active avoidance, but is less involved as learning proceeds to asymptotic performance.

### Extinction

In contrast to activity in other ROIs, CeA was activated to a lower degree during the extinction phase compared to the acquisition phase, p < 0.005 (**Figure 5**). Moreover, CeA neural activity remained at this low level during the entire extinction phase, suggesting that CeA activation may be inhibited when avoidance is acquired and when shock is no longer present (**Figure 5**; for detailed statistical analysis results, see **Table 1**).

### Discussion

Here we report differential activity of mPFC and amygdalar sub-regions during lever-press avoidance and extinction. In the mPFC and most amygdalar sub-regions, activity increased in late acquisition sessions (A08–A10), when the avoidance response was acquired, and peaked in the extinction phase, when shock was no longer present. GABAergic neurons in the mPFC showed a similar pattern: they were more activated in later acquisition, even more in early extinction (E1), but less activated in late extinction (E6). In contrast, activity in the CeA increased during early acquisition sessions, peaked in A08, and declined in late acquisition and extinction. Therefore, different patterns of activity were observed in mPFC, BA, LA and ITC compared to CeA. These data suggest that general activity, and particularly inhibitory neuronal activation, within the mPFC-amygdala circuit shifts in a time-dependent manner during acquisition and extinction of lever-press avoidance. Together, altered activity is observed in similar regions in the present study using an avoidance paradigm in rats and in imaging studies of patients with anxiety disorders (Schwartz and Rauch, 2004; Cottraux, 2005; Guyer et al., 2008).

The role of the mPFC and amygdala in avoidance tasks has been studied previously, but mainly using lesion techniques in rodents (Choi et al., 2010; Moscarello and LeDoux, 2013; Beck et al., 2014). Pre-training lesions provide a useful tool to evaluate the well-defined structure-dependence of a task (Wan et al., 1994). However, compensatory changes in response to lesions may complicate interpretation. We recently reported that mPFC, striatum and amygdala neural activity assessed by c-Fos and delta-FosB was associated with avoidance and extinction in lever-press avoidance (Jiao et al., 2011; Perrotti et al., 2013). However, these studies only evaluated time points at asymptotic avoidance performance and at the end of extinction learning. In order to understand the role of the amygdala and mPFC in the acquisition and extinction processes of avoidance, the present study monitored neural activity at various time points during avoidance acquisition and extinction.

The importance of mPFC neural activity in active avoidance and extinction has been recognized and appreciated in recent works using lever-press or shuttle-box avoidance paradigms (Duncan et al., 1996; Jiao et al., 2011; Moscarello and LeDoux, 2013). These studies demonstrated that avoidance learning induced prominent c-Fos expression in the mPFC and CG (Duncan et al., 1996) while IL lesion impaired avoidance learning (Moscarello and LeDoux, 2013). We reported that rats that failed to extinguish lever-press avoidance exhibited lower c-Fos expression in the mPFC compared to rats that successfully extinguished such response (Jiao et al., 2011). Thus mPFC is actively involved in both acquisition and extinction of active avoidance task in rats.

It is known that the mPFC is a heterogeneous structure (Gabbott et al., 1997; Vertes, 2004). In fear conditioning and extinction, PL is associated with fear learning while IL is important in extinction learning (Milad and Quirk, 2002; Quirk et al., 2006; Sierra-Mercado et al., 2011). If fear and avoidance share the same pathway, we would expect greater PL activation during acquisition and greater IL activation during extinction. However, we found that the pattern of c-Fos-ir changes was similar in the PL and IL in avoidance. In support of our data, Moscarello and LeDoux (2013) reported that IL is needed to acquire shuttle avoidance and to reduce warning signal-elicited freezing, yet PL lesion did not affect acquisition. In shuttle avoidance, the cue that is initially paired with the shock induces fear and facilitates freezing behavior, subsequently preventing a shuttle response. Thus fear needs to be overcome while shuttle avoidance is being acquired. Our results support and extend those of Moscarello and LeDoux in that IL activity increased as lever-press avoidance was acquired. In contrast to Moscarello and LeDoux, PL neural activity also increased during acquisition of lever-press avoidance; these differences may be due to the different avoidance paradigms used in these studies. We also observed an interesting trend in the activation of PV-ir neurons in this area. While there are more activated PV-ir neurons following A10, there is a trend toward an increased number of activated PV-ir neurons after E1 (e.g., CG) and a decrement after E6 (e.g., IL). Given that c-Fos-ir cell counts remained similar following A10, we speculate that there might be increased excitatory activity in the mPFC during late extinction. However, this speculation requires further investigation.

In the ITCs, c-Fos-ir expression progressively increased as learning proceeded from acquisition to extinction of lever-press avoidance. Accumulated evidence demonstrates that ITCs are critical for fear extinction, specifically for the expression of extinction (Herry et al., 2008; Likhtik et al., 2008; Mańko et al., 2011). This cell cluster receives input from vmPFC and modulates fear extinction through the CeA, a feed-forward inhibition mechanism of extinction (Quirk and Gehlert, 2003; Milad et al., 2004; Hefner et al., 2008; Likhtik et al., 2008; Mańko et al., 2011). Thus the greater ITC activity here could lead to a reduced "fear" component in late acquisition sessions and in extinction via increasing excitatory input from mPFC neurons. Based on the present data, we speculate that when animals reach near asymptotic avoidance performance (i.e., receiving very few shocks), CeA activity is suppressed by increased ITC input induced by enhanced mPFC activity.

As described above, we observed an inverse relationship in neural activity between CeA and mPFC-ITC circuits, namely an increase of c-Fos-ir cell counts in the mPFC and non-CeA amygdala in late acquisition accompanied by a decrease of c-Fos-ir cell counts in the CeA following A08. The inverse relationship of c-Fos-ir in the mPFC and CeA is supported by the anatomical connection between these two structures and their physiological roles in aversive learning that we addressed earlier (Morgan and LeDoux, 1999; Rosenkranz et al., 2003; Amano et al., 2010). Lesion/deactivation of the CeA facilitated shuttle avoidance by reducing freezing (Choi et al., 2010; Moscarello and LeDoux, 2013), suggesting that CeA activity inhibits the acquisition of an active avoidance task. Thus rats exposed to shocks would be expected to have high CeA activity during the early acquisition phase, when the avoidance response has not yet been fully acquired. However, our study showed that peak CeA activation occurred on session A08, when avoidance responding is near asymptote, and not on A02. Similarly, higher c-Fos-ir in the CeA has been reported in rats that are "good" avoiders compared to "poor" avoiders in shuttle-box avoidance, suggesting that elevated CeA activity is associated with active avoidance learning (Martinez et al., 2013). Other than freezing, CeA is associated with arousal and with sympathetic and parasympathetic responses to stimuli (LeDoux, 2007). It is possible that elevated CeA activity is due to other factors such as valence (i.e., bad/good behavioral outcome) state (Moul et al., 2012). In addition, it is known that CeA is highly heterogeneous; for instance, a large portion of CeA neurons are interneurons that inhibit CeA output (McDonald and Augustine, 1993; Pitkanen et al., 1997; Sah et al., 2003). Thus it is possible that the high CeA activity on session A08 could be due to increased interneuron activation and lead to reduced CeA output. Therefore, the highest CeA activation observed on A08 may be the result of accumulated neural activation and does not necessarily indicate the highest level of fear.

In addition, our findings indicate that both LA and BA regions remained active during extinction of lever-press avoidance. The involvement of BA and LA in acquisition is expected since these regions are necessary to acquire and perform an active avoidance task (Silveira et al., 2001; Anglada-Figueroa and Quirk, 2005). However, the extended activity during extinction suggests that the extinction of active avoidance requires both structures. We also found elevated activity in inhibitory PV-ir cells in the LA during extinction. As BA receives robust inputs from LA, increased inhibitory activity in the LA may lead to decreased output to the BA. Moreover, increased PV-ir neuronal activity was observed in BA following A10 and remained the same during extinction while overall BA activity was higher following E1, suggesting that different neuronal populations may be involved. For instance, BA neurons that fired to fear-associated CS or extinction-associated CS are innervated by projections from heterogeneous origins such as ventral hippocampus or mPFC (Repa et al., 2001; Herry et al., 2008). Thus, it is possible that activities from distinct neuronal populations associated with acquisition or extinction of avoidance learning overlap during late acquisition and early extinction, resulting in different activity patterns in the BA through LA.

It is important to note that the neuronal activity of the brain areas investigated here did not change in relation to shock number. Rats experienced the greatest number of shocks in A01 and then received progressively fewer shocks throughout the acquisition phase. In contrast, neuronal activity in the mPFC and amygdala subregions, except the CeA, increased throughout the acquisition phase, and in some regions even through extinction, when no shocks were experienced. Even activity in the CeA did not exactly reflect shock number, as its activity was highest at A08, not A01. Previous studies have indicated that fear is greatest early in avoidance training and gradually reduces as avoidance is learned (Coover et al., 1973; Servatius et al., 2008). Thus, the neuronal activity reported in this study does not exactly correlate with the expected dynamics of fear during avoidance learning. Moreover, previous studies show that activity in the amygdala increases during fear conditioning in humans and animals, paralleling the conditioned response (Quirk et al., 1997; Cheng et al., 2007). Another study reported that c-Fos activity in the mPFC was significantly increased in rats acquiring wheel-turn avoidance compared to yoked and house-exposure control rats (Coco and Weiss, 2005). In addition, an increased number of mPFC c-Fos-ir neurons was reported following fear extinction compared to unpaired CS/US control groups (Kim et al., 2010). Thus, these data suggest that the changes in neural activity likely result from the development of behavioral avoidance and extinction, as our observations indicate a lack of association between c-Fos-ir cell counts and the number of shocks received across groups (**Figures 1C**, **2A**, **3A**, **4A**, **5**). Taken together, the results of the present study suggest that activity of the mPFC-amygdala circuit during avoidance learning does not merely reflect fear.

Limitations: The evaluation of c-Fos expression only indicates an association of these regions with avoidance acquisition and extinction, not the necessity of these regions for avoidance learning or extinction. Further investigation is needed to draw mechanistic conclusions about these brain regions in avoidance and its extinction. Our results further suggest that, to move forward, selective lesions of different cell populations within the mPFC and amygdala are necessary. Future studies should also include other regions such as the nucleus accumbens (Ramirez et al., 2015) and striatum, as they are important in motivation and stress controllability.

In conclusion, we demonstrated that the dynamic interaction between mPFC and amygdalar sub-regions could partially be the underlying mechanism of avoidance acquisition and extinction. Thus the network activity in avoidance may not be the same as in fear conditioning, although common structures are involved. Our findings on GABAergic neural activity in acquisition and extinction of active avoidance may shed light on the mechanism of avoidance and benefit extinction-based therapy for anxiety.

### Acknowledgments

The present work was supported by Merit Review Awards I01BX000132, I01BX000218 and I01CX000771 from the US Department of Veterans Affairs Biomedical Laboratory and Clinical Sciences Research and Development Services, NIH grant RO1-NS44373, and the Stress and Motivated Behavior Institute. The contents do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.

### References

Anglada-Figueroa, D., and Quirk, G. J. (2005). Lesions of the basal amygdala block expression of conditioned fear but not extinction. J. Neurosci. 25, 9680–9685. doi: 10.1523/jneurosci.2600-05.2005

posttraumatic stress disorder. Arch. Gen. Psychiatry 63, 184–192. doi: 10.1001/archpsyc.63.2.184


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Jiao, Beck, Myers, Servatius and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The role of the hippocampus in avoidance learning and anxiety vulnerability

#### **Tara P. Cominski<sup>1</sup>\*, Xilu Jiao<sup>2</sup>, Jennifer E. Catuzzi<sup>1</sup>, Amanda L. Stewart<sup>2</sup> and Kevin C. H. Pang<sup>1,3</sup>**

<sup>1</sup> Department of Neurology and Neurosciences, Rutgers – New Jersey Medical School, Rutgers, The State University of New Jersey, Newark, NJ, USA

<sup>2</sup> Neurobehavioral Research Laboratory, Veterans Affairs Biomedical Research Institute, East Orange, NJ, USA

<sup>3</sup> Neurobehavioral Research Laboratory, Veterans Affairs New Jersey Health Care System, East Orange, NJ, USA

### **Edited by:**

Angela Roberts, University of Cambridge, UK

### **Reviewed by:**

Phillip R. Zoladz, Ohio Northern University, USA

Hannah Frances Clarke, University of Cambridge, UK

### **\*Correspondence:**

Tara P. Cominski, Veterans Affairs New Jersey Healthcare System, 385 Tremont Avenue, 11th Floor, Room 11-160, East Orange, NJ 07018, USA e-mail: tara.cominski@gmail.com; tara.cominski@va.gov

The hippocampus has been implicated in anxiety disorders and post-traumatic stress disorder (PTSD); human studies suggest that a dysfunctional hippocampus may be a vulnerability factor for the development of PTSD. In the current study, we examined the effect of hippocampal damage in avoidance learning, as avoidance is a core symptom of all anxiety disorders. First, the effect of hippocampal damage on avoidance learning was investigated in outbred Sprague Dawley (SD) rats. Second, the function of the hippocampus in Wistar-Kyoto (WKY) rats was compared to SD rats. The WKY rat is an animal model of behavioral inhibition, a risk factor for anxiety, and demonstrates abnormal avoidance learning, marked by facilitated avoidance acquisition and resistance to extinction. The results of the current study indicate that hippocampal damage in SD rats leads to impaired extinction of avoidance learning similar to WKY rats. Furthermore, WKY rats have reduced hippocampal volume and impaired hippocampal synaptic plasticity as compared to SD rats. These results suggest that hippocampal dysfunction enhances the development of persistent avoidance responding and, thus, may confer vulnerability to the development of anxiety disorders and PTSD.

**Keywords: hippocampus, avoidance, PTSD, anxiety, WKY, synaptic plasticity, LTP**

### **INTRODUCTION**

The development of post-traumatic stress disorder (PTSD) and anxiety disorders is a function of an individual's experience and inherent vulnerability. While much research effort has been devoted to the effects of traumatic stress on individuals, less effort has been devoted to the study of vulnerability factors. Vulnerability or risk factors may be inherited (i.e., personality traits or genetic variations) or due to prior experiences (i.e., abuse or experience of a previous trauma).

The hippocampus is a brain region implicated in PTSD. Patients with PTSD have reduced hippocampal volume (Gurvits et al., 1996; Villarreal et al., 2002). A recent study, using high resolution MRI, showed that reduced hippocampal volume in PTSD patients is localized to the CA3/DG region of the hippocampus (Wang et al., 2010). These findings agree with animal studies that showed severe chronic stress leads to atrophy of apical dendrites in the CA3 region, reduced neurogenesis, and mature granule cell death in the dentate gyrus (DG) of the hippocampus due to elevated levels of glucocorticoids (Gould et al., 1990, 1998; McEwen et al., 1995; Gould and Tanapat, 1999). Based on these studies, it was hypothesized that reduced hippocampal volume in individuals with PTSD was a consequence of the traumatic event and subsequent development of PTSD (Bremner, 2001). However, more recent human research has challenged this hypothesis.

Rather than a consequence of the traumatic experience, reduced hippocampal volume may be a risk factor for developing PTSD. The first suggestion of this was a study of identical twins discordant for combat experience (Gilbertson et al., 2002). In this study, individuals with combat experience were divided into those diagnosed with PTSD and those not diagnosed and then paired with their twin siblings who were not exposed to combat and were not diagnosed with PTSD. The results of this study replicated the previous finding of reduced hippocampal volume in combat-exposed individuals with PTSD compared to combat-exposed individuals without PTSD (Gurvits et al., 1996). Importantly, the twin sibling of the combat-exposed PTSD subject had reduced hippocampal volume compared to the twin sibling of the combat-exposed non-PTSD subjects. Thus, these data suggested that decreased hippocampal volume pre-existed trauma exposure and diagnosis of PTSD. A subsequent study linked the reduced hippocampal volume to a learning impairment (Gilbertson et al., 2007). Therefore, the evidence suggests that reduced hippocampal volume, with concomitant dysfunction, is a risk factor for PTSD. Despite this relationship, the manner in which hippocampal dysfunction contributes as a risk factor for PTSD is unclear.

Excessive avoidance is a core feature of all anxiety disorders and is a core component of PTSD diagnosis (American Psychiatric Association, 2013). Moreover, pathological avoidance symptoms increase with time after a trauma and parallel the trajectory of PTSD (O'Donnell et al., 2007). Once developed, avoidant responses are notoriously difficult to treat, and are resistant to pharmacological and cognitive behavioral therapy. The growth of avoidance suggests a learning component to the pathological avoidance. Thus, knowledge of the mechanisms involved in avoidance learning may lead to important insights into the development of avoidance symptoms in anxiety disorders and PTSD.

Although the role of the hippocampus in anxiety-related behaviors like elevated plus maze and fear conditioning has been studied extensively, its role in active avoidance behavior is not well established [for review, see Barkus et al. (2010)]. An abnormal hippocampus may confer risk for the development of anxiety disorders and PTSD by enhancing sensitivity to active avoidance behaviors. Hippocampal damage leads to facilitated avoidance learning in shuttle avoidance [for review, see Olton (1973) and Black et al. (1977)] and lever-press avoidance (Schmaltz and Giulian, 1972). In addition, we previously showed that damage of GABAergic neurons in the medial septum, a major non-cortical input to the hippocampus, prior to avoidance training impaired extinction but not acquisition of the avoidance response (Pang et al., 2011). Thus, dysfunction of the hippocampus may enhance the rate of avoidance acquisition and the development of persistent avoidant responding, thereby resulting in risk for anxiety disorders and PTSD.

The Wistar-Kyoto (WKY) rat is an animal model of behavioral inhibition and displays many characteristics related to anxiety disorders. Trait behavioral inhibition is a vulnerability factor for the development of anxiety disorders, as behaviorally inhibited children are more likely to develop anxiety disorders (Kagan et al., 1987). WKY rats demonstrate the trait behavioral inhibition phenotype, observed as decreased activity and withdrawal in novel social (Pare, 2000) and non-social challenges (Pare, 1994). WKY rats display low activity in the open field (Pare, 1994) and have enhanced sensitivity to stress-induced ulcer formation (Pare, 1989), hyper-responsive peripheral and central stress responses (Pardon et al., 2002), and learning and memory alterations (Ferguson and Cada, 2004). Of particular relevance to this study, WKY rats acquire lever-press avoidance faster and to a higher degree than Sprague Dawley (SD) rats (Servatius et al., 2008). Avoidant behaviors of WKY rats are also more persistent during extinction training than in SD rats, especially at high shock intensity (Jiao et al., 2011). In fact, extinction following avoidance learning at high shock intensity was virtually non-existent in WKY rats, a pattern that was not displayed by SD rats.

The present study was performed to further elucidate the role of the hippocampus in acquisition and extinction of lever-press avoidance using two approaches. First, the effect of hippocampal damage on avoidance learning was investigated in outbred SD rats. Second, hippocampal synaptic plasticity and hippocampal volume were assessed in WKY rats since human studies suggested impaired hippocampal function in those individuals with vulnerability for PTSD. The results of the current study suggest that a dysfunctional hippocampus enhances the development of persistent avoidant responses.

### **MATERIALS AND METHODS**

### **SUBJECTS**

Male SD rats (*n* = 43) were 300–350 g and male Wistar-Kyoto (WKY, *n* = 8) rats were 200–250 g at the start of the behavioral study. Thirty-five SD rats underwent surgery for lesions or sham procedures. Eight SD and eight WKY rats were behaviorally tested without surgery. All rats were housed individually on a 12-h light/dark cycle with lights turning on at 7:00 a.m. Training and testing were performed during the light phase of the light/dark cycle. All procedures were conducted in accordance with the NIH Guide for the Care and Use of Laboratory Animals and were approved by the IACUC of the Veterans Affairs Medical Center at East Orange, New Jersey.

### **LESION SURGERY**

Rats were anesthetized with isoflurane (2%). Burr holes were drilled into the skull overlying the hippocampus or entorhinal cortex. The coordinates (in mm) for the entorhinal cortex lesion sites in relation to bregma were as follows (four sites per hemisphere): AP -5.3, ML ±6.5, DV -5.0; AP -6.0, ML ±6.5, DV -5.0; AP -6.7, ML ±5.0, DV -6.5; AP -7.4, ML ±5.0, DV -6.5. The coordinates for the hippocampal lesion sites in relation to bregma were as follows (five sites per hemisphere): AP -2.5, ML ±1.6, DV -3.8; AP -4.2, ML ±2.6, DV -3.1; AP -5.3, ML ±4.4, DV -3.4; AP -5.8, ML ±5.6, DV -4.1; AP -6.0, ML ±5.6, DV -4.1. Injections were made bilaterally. The needle of a Hamilton syringe was inserted into the desired location to infuse saline for sham surgery or ibotenic acid (10 µg/µl) to damage hippocampus or entorhinal cortex. Infusions occurred at a rate of 0.1 µl/min with a volume of 0.5 µl dispensed per site. Rats were allowed at least 10 days to recover from surgery. Extent and location of lesions are depicted in **Figure 1**.
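As a quick check on the infusion parameters above, the per-site infusion time and ibotenic acid dose follow directly from the stated rate, volume, and concentration. The sketch below is illustrative arithmetic only: the function and variable names are hypothetical, and the totals count pump time alone, since the text does not state any needle dwell time.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text.
CONCENTRATION_UG_PER_UL = 10.0   # ibotenic acid concentration
VOLUME_PER_SITE_UL = 0.5         # volume dispensed per site
RATE_UL_PER_MIN = 0.1            # infusion rate
SITES_PER_HEMISPHERE = {"entorhinal cortex": 4, "hippocampus": 5}


def infusion_summary(region: str) -> dict:
    """Return per-site and bilateral totals for one lesion target."""
    n_sites = SITES_PER_HEMISPHERE[region] * 2  # injections were bilateral
    minutes_per_site = VOLUME_PER_SITE_UL / RATE_UL_PER_MIN
    dose_per_site_ug = CONCENTRATION_UG_PER_UL * VOLUME_PER_SITE_UL
    return {
        "minutes_per_site": minutes_per_site,
        "dose_per_site_ug": dose_per_site_ug,
        "total_sites": n_sites,
        "total_dose_ug": dose_per_site_ug * n_sites,
        "total_infusion_min": minutes_per_site * n_sites,
    }


if __name__ == "__main__":
    for region in SITES_PER_HEMISPHERE:
        print(region, infusion_summary(region))
```

Under these values each site takes about 5 min of infusion and receives about 5 µg of ibotenic acid.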

### **BEHAVIOR**

### **Avoidance learning**

Rats were trained in an operant box with a lever (10.5 cm above the floor), a cue light (20.5 cm above the grid floor), and a speaker (26 cm above the grid floor) on one wall and a light (14 W) on the opposite wall that was constantly lit during the session. Scrambled footshocks were delivered through the grid floor (Coulbourn Instruments, Langhorn, PA, USA). The operant box was enclosed in a sound-attenuating box.

Avoidance learning occurred as described previously (Pang et al., 2011). Briefly, each session was separated by 2–3 days (3 sessions/week). Each session began with a 60-s stimulus-free period, followed by 20 trials. A trial started with the presentation of the warning signal (1000-Hz 75-dB tone, 10 dB above background noise). A lever response made during the first 60 s of the trial immediately terminated the warning signal and initiated 3-min intertrial interval (ITI). This response was an avoidance response, as the rat avoided the footshock. If an avoidance response was not made, foot shocks (2 mA, 0.5 s duration, 3 s intershock interval) were delivered starting at 60 s and continued until a lever response was made (scored as an "escape response") or 100 shocks were delivered. Immediately following an escape response or the maximum number of foot shocks, the warning signal was terminated and a 3-min ITI was initiated. All ITIs were signaled by a flashing light (ITI signal, 5 Hz, 50% duty cycle). Responses during the ITI had no effect but were recorded. The acquisition phase consisted of 12 sessions.

During the extinction phase, all procedures were the same as in the acquisition phase except the foot shock and ITI signal were omitted. Although shocks were omitted, responses during the first 60 s of the trial were designated as "avoidance" responses, and those with latencies greater than 60 s were designated as "escape" responses. The extinction phase consisted of six sessions.
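The response scoring described above can be sketched in a few lines. This is a minimal illustration, not the authors' acquisition software: the function names are hypothetical, a missing lever press is represented as `None`, and treating a latency of exactly 60 s as an avoidance response is an assumption, since the text only specifies "the first 60 s" and "greater than 60 s".

```python
from typing import Optional, Sequence

AVOIDANCE_WINDOW_S = 60.0  # warning-signal period before shocks begin (from the text)


def classify_response(latency_s: Optional[float]) -> str:
    """Label one trial from its lever-press latency (None = no lever press)."""
    if latency_s is None:
        return "no response"
    if latency_s <= AVOIDANCE_WINDOW_S:  # boundary handling is an assumption
        return "avoidance"
    return "escape"


def proportion_avoidance(latencies: Sequence[Optional[float]]) -> float:
    """Proportion of trials in a session that ended with an avoidance response."""
    labels = [classify_response(x) for x in latencies]
    return labels.count("avoidance") / len(labels)


if __name__ == "__main__":
    # Hypothetical latencies (s) for a short session; real sessions had 20 trials.
    session = [12.0, 75.5, None, 30.2]
    print([classify_response(x) for x in session])
    print(proportion_avoidance(session))
```

The same scoring applies in extinction, where the labels are kept even though no shock is delivered.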

### **Data analysis**

Data are expressed as mean ± standard error of the mean. Performance was assessed by calculating the proportion of trials in each session with an avoidance response. A mixed design analysis of variance (ANOVA) with session as the within subject factor and lesion/strain as the between subjects factor was performed. All lesion groups and unoperated SD and WKY groups were included in this overall ANOVA and are captured in the lesion/strain factor. Separate analyses were performed for the acquisition and the extinction phases. To determine whether non-specific responding might be increased by lesion or strain, lever presses during each minute of the ITI were analyzed. Mean lever presses per trial per minute were determined for the ITI and assessed statistically using a mixed design ANOVA with session and ITI minute as within subject factors and lesion or strain as a between subjects factor. Statistical analysis was performed with α = 0.05 using SPSS for Windows (version 12.0.1, SPSS, Inc., Chicago, IL, USA). Mauchly's test was used to determine violations in the assumptions of sphericity for repeated measure factors and Greenhouse–Geisser correction was used in the appropriate situations to correct for violations (Geisser and Greenhouse, 1958). Corrected statistics are only reported when the uncorrected and corrected *p*-values disagreed with respect to significance; otherwise only the uncorrected values are reported. Tukey's *post hoc* tests were performed to specify group differences. Analysis of covariance (ANCOVA) was performed to determine whether significant effects in extinction remained after covarying performance on the last acquisition session. Interactions were evaluated by *F*-test.
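For readers unfamiliar with the summary statistic used throughout, the mean ± standard error of the mean for one group in one session can be computed as sketched below. The analysis in the paper was run in SPSS; this Python helper is a hypothetical illustration, and the example values are invented for demonstration rather than taken from the study.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for one group in one session."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two observations")
    # SEM = sample standard deviation (n - 1 denominator) divided by sqrt(n).
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical per-rat proportions of avoidance trials in a single session.
    proportions = [0.2, 0.4, 0.6]
    m, sem = mean_sem(proportions)
    print(f"{m:.3f} ± {sem:.3f}")
```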

### **ELECTROPHYSIOLOGY**

### **Recording**

Six naïve SD and six WKY male rats were obtained from Harlan Laboratories (Indianapolis, IN, USA) at three months of age. All rats were given at least one week to acclimate to the new surroundings prior to recordings. Recordings were performed during the light phase of the light/dark cycle. Procedures were as described previously (Yoder and Pang, 2005). Rats were anesthetized with urethane (1.5 g/kg, i.p.) and immediately placed in a stereotaxic apparatus. A recording electrode (75 µm, Teflon coated stainless steel wire) was placed in the hilar region of the DG (SD rats: 4.0 mm posterior and 2.5 mm lateral from bregma, 2.8–3.2 mm ventral from the brain surface; WKY rats: 4.0 mm posterior, 2.8 mm lateral from bregma, 2.8–3.2 mm ventral from the brain surface) and a stimulating electrode (125 µm, Teflon coated stainless steel wire) was inserted into the medial perforant pathway (mPP) (SD rats: 8.1 mm posterior and 3.1 mm lateral from bregma, 2.0–2.8 mm ventral from brain surface; WKY rats: 8.1 mm posterior and 3.6 mm lateral from bregma, 2.0–2.8 mm ventral from brain surface). The response was optimized within the dorsal/ventral coordinate range specified. Constant current stimulation (biphasic pulse, 300 µs duration; AM Systems Isolated Pulse Stimulator, Model 2100, Carlsborg, WA, USA) was applied to the mPP at a rate of 1/10 s. Evoked field potentials were recorded from the DG after amplification of 1000× and bandpass filtering between 0.1 Hz and 5 kHz (AM Systems Differential AC Amplifier, Model 1700, Carlsborg, WA, USA). Evoked responses were visualized on a digital oscilloscope and off-line analysis was performed using SciWorks software (version 7.2 SP1, DataWave Technologies).

A total of six input–output (i/o) response curves were generated to monitor changes in slope of field EPSP (fEPSP) and population spike amplitude (leading peak to valley) before and after high frequency stimulation (HFS). The i/o curves were generated using 100–1000µA stimulation. Average waveforms were generated from five evoked responses at each stimulus intensity, and slopes of fEPSP and population spike were measured from these averaged waveforms. Two i/o curves were used to establish baseline. HFS to induce LTP was based on parameters established previously (Messaoudi et al., 2002). Stimulation for HFS was delivered at the lowest intensity that generated the maximal population spike for each animal and consisted of three sets of four trains, with each train consisting of eight pulses given at a frequency of 400 Hz, an intertrain interval of 10 s, and an interset interval of 5 min. Early phase LTP was determined from averaged i/o curves at 15 min and 1 h after HFS. Late phase LTP was evaluated from averaged i/o curves generated at 2 and 3 h after HFS. At the end of the recording session, small lesions were made to mark electrode placement (30 s, 500µA) and brains were processed as described below.
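For illustration only, the timing of the HFS protocol described above can be sketched in a few lines of Python. The function name `hfs_pulse_times` and its parameters are hypothetical, and the sketch assumes that the intertrain and interset intervals are measured onset to onset, which the text does not specify.

```python
def hfs_pulse_times(n_sets=3, trains_per_set=4, pulses_per_train=8,
                    pulse_hz=400.0, intertrain_s=10.0, interset_s=300.0):
    """Return the onset time (s) of every pulse in the HFS protocol.

    Assumption: intervals run from train onset to train onset and from
    set onset to set onset; the source does not state this explicitly.
    """
    times = []
    for s in range(n_sets):
        set_start = s * interset_s
        for t in range(trains_per_set):
            train_start = set_start + t * intertrain_s
            for p in range(pulses_per_train):
                # Pulses within a train are spaced 1/pulse_hz apart.
                times.append(train_start + p / pulse_hz)
    return times


times = hfs_pulse_times()
print(len(times))            # total number of pulses delivered
print(times[7] - times[0])   # first-to-last pulse onset within one train (s)
```

Under these assumptions the protocol delivers 3 × 4 × 8 pulses in total, and each train lasts 17.5 ms from first to last pulse onset.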

### **Statistical analysis**

Data are expressed as mean ± standard error of the mean. Statistics were performed on raw values of fEPSP slope and population spike amplitude determined from averaged waveforms. fEPSP slope and population spike amplitude were each assessed separately for each strain using a 3 (phase) × 2 (time) × 7 (stimulus intensity) repeated measures ANOVA. Statistical analysis was performed with SPSS, as described for the behavioral studies.
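As a minimal sketch of the summary statistic used throughout, the mean ± standard error of the mean can be computed as follows. The helper name `mean_sem` and the example values are hypothetical and are not data from the study.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # stdev uses n - 1
    return mean, sem


# Hypothetical fEPSP slopes (arbitrary units), for illustration only.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
m, sem = mean_sem(example)
print(f"{m:.2f} ± {sem:.2f}")
```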

### **HISTOLOGY**

At the end of behavioral testing or recording, all animals were perfused intracardially with saline followed by formalin. Brains were extracted and submerged overnight in formalin followed by 30% sucrose. Brains were sectioned (50µm) through the hippocampus and entorhinal cortex with a sliding microtome. Sections were stained with cresyl violet. For the lesion study, location and extent of brain damage were assessed (**Figure 1**). For the electrophysiology study, placement of the electrode tips was confirmed under a light microscope.

In a separate group of rats, volumetric comparisons were made between SD and WKY rats. Rats (*n* = 5 for each strain) were sacrificed, perfused, and brains extracted. Brains were sectioned at 50µm thickness and every fifth section was collected and stained with cresyl violet. Volume of the hippocampus, neocortex, corpus callosum, and striatum was estimated using the Cavalieri method (Slomianka and West, 2005) (StereoInvestigator, v 7.0, MicroBrightField, Colchester, VT, USA). A MANOVA was used to compare volumes of the various brain regions between strains (SPSS for Windows).
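The Cavalieri estimate can be illustrated with a short Python sketch: the volume is approximated as the distance between sampled sections multiplied by the summed cross-sectional areas. The function name `cavalieri_volume` and the example areas are hypothetical. The sketch assumes areas have already been measured on each sampled section; the actual StereoInvestigator workflow may instead derive them from point counts.

```python
def cavalieri_volume(section_areas_mm2, section_thickness_um=50.0,
                     sampling_interval=5):
    """Estimate volume (mm^3) from areas measured on every k-th section.

    Spacing between sampled sections = thickness x sampling interval
    (50 um x 5 = 250 um here, matching the sectioning described above).
    """
    spacing_mm = section_thickness_um * sampling_interval / 1000.0
    return spacing_mm * sum(section_areas_mm2)


# Hypothetical areas (mm^2) on four sampled sections, not study data.
areas = [2.0, 4.0, 6.0, 4.0]
volume = cavalieri_volume(areas)
print(volume)
```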

### **RESULTS**

### **AVOIDANCE ACQUISITION**

### **Avoidance responses**

Hippocampal and entorhinal cortex lesions did not alter acquisition of avoidance (**Figure 2A**). Similarly, acquisition of avoidance in WKY rats did not differ from SD rats (**Figure 2B**). Rats from all groups increased avoidance responding with training [**Figure 2**; main effect of session: *F*(11,495) = 30.55, *p* < 0.001]. Acquisition of avoidance did not differ between lesion groups nor between strains [main effect of lesion/strain, *F*(5,45) = 1.91, *p* = 0.111; session × lesion/strain interaction, *F*(55,495) = 1.14, *p* = 0.237] (**Figures 2A,B**).

### **ITI responses**

Intertrial interval responses were analyzed because they represent non-reinforced responses. The number of ITI responses generally increased with training, peaking around session 4 or 5, then leveling off [main effect of session, *F*(11,484) = 6.02, *p* < 0.001]. ITI responding was greater in the first minute of the ITI as compared to the second or third minutes [main effect of ITI, *F*(2,88) = 871.10, *p* < 0.001]. Whereas the main effect of lesion/strain [*F*(1,44) = 1.6, *p* = 0.18] was not significant, the lesion/strain × session × ITI interaction [*F*(110,968) = 1.54, *p* = 0.001] did reach significance. In *post hoc* analysis of each ITI minute, lesion/strain affected the first minute ITI response [lesion/strain × session interaction, *F*(55,495) = 1.78, *p* = 0.001; main effect of lesion/strain, *F*(5,45) = 1.62, *p* = 0.174] (**Figure 3**), but not second or third minute ITI responses [main effects, *F*(5,45) ≤ 1.61, *p* > 0.171; lesion/strain × session interaction, *F*(55,495) ≤ 1.17, *p* > 0.196]. Comparisons made between sham and lesions and between unoperated SD and WKY rats for first minute ITI responses revealed that hippocampal but not entorhinal lesions facilitated responding [main effect, *F*(1,13) = 8.32, *p* = 0.013], and strain differences trended toward significance with WKY tending to make more ITI responses than SD rats [session × strain interaction, *F*(11,154) = 2.42, corrected *p* = 0.051]. Thus, hippocampal but not entorhinal cortex lesions increased ITI responding during the first, but not second or third, minutes of the ITI (**Figure 3A**). A similar trend was present for WKY rats as compared to SD rats (**Figure 3B**).

**FIGURE 2 | [...] hippocampal and entorhinal cortex lesion and in WKY rats**. Hippocampal and entorhinal cortex lesions did not alter avoidance acquisition **(A)**. Rats with hippocampal lesions were impaired in extinction learning compared to sham controls **(A)**. Acquisition of avoidance in WKY rats did not differ from SD rats **(B)**. WKY rats exhibited a trend toward impaired extinction of avoidant responding **(B)**. Although all six groups were statistically analyzed together, lesion **(A)** and unoperated **(B)** groups are displayed separately for clarity.

### **AVOIDANCE EXTINCTION**

### **Avoidance responses**

Hippocampal lesion and WKY rats were impaired in extinction of avoidance responding (**Figures 2A,B**). Overall, rats decreased avoidance responding during extinction training [*F*(5,225) = 13.73, *p* < 0.001]. Lesion/strain groups differed [*F*(1,45) = 9.09, *p* < 0.001], but session did not interact with lesion/strain [*F*(25,225) = 1.12, *p* = 0.322]. *Post hoc* analysis demonstrated that rats with hippocampal lesions and WKY rats extinguished their avoidant responding more slowly than the other groups (*p* < 0.05) (**Figures 2A,B**). Moreover, extinction did not differ between hippocampal lesion and WKY rats. The rate of extinction can be affected by performance immediately prior to extinction training. Because groups differed in their asymptotic level of avoidance performance at the end of acquisition, extinction was analyzed with performance on session 12 of acquisition as a covariate. Even after covarying performance on session 12, main effects of lesion and strain still differed [*F*(5,44) = 5.71, *p* < 0.001], demonstrating persistent avoidant responding in rats with hippocampal lesions and WKY rats but not rats with entorhinal cortex lesions.

### **ITI responses**

Similar to the acquisition phase, ITI responses during extinction were greater during the first minute of ITI as compared to the second and third minutes [main effect of ITI, *F*(2,90) = 306.47, *p* < 0.001]. ITI responses were most numerous during early extinction sessions and gradually decreased with extinction training [main effect of session, *F*(5,225) = 13.75, *p* < 0.001] (**Figures 3A,B**). Lesion/strain interacted with ITI [*F*(10,90) = 2.53, *p* = 0.01], but did not interact with session [*F*(25,225) = 0.60, *p* = 0.934]. The triple interaction did not reach significance [*F*(50,450) = 2.94, *p* = 0.058]. Upon further analysis, it was determined that lesion/strain was significantly different during the first minute of ITI [*F*(5,44) = 3.249, *p* = 0.014] (**Figures 3A,B**). *Post hoc* analysis revealed a significant group difference between hippocampal lesion and hippocampal sham during the first minute (**Figure 3A**). During the third minute of ITI, lesion/strain was also different [*F*(5,11) = 2.44, *p* = 0.048]; however, *post hoc* analysis found no significant group differences. Thus, hippocampal lesions increased ITI responding during the first, but not second or third, minute of the ITI (**Figure 3A**).

**FIGURE 3 | ITI responding during the first minute of the intertrial interval (ITI)**. Hippocampal lesion increased ITI responding during the first minute of the ITI during acquisition and extinction **(A)**. Group differences were not present during the second or third minute of the ITI (not shown). WKY rats showed a trend for increased ITI responding during the first minute of the ITI in the avoidance phase, but not extinction **(B)**. Strain differences were not observed during the second or third minute of the ITI (not shown).

To summarize, hippocampal lesions slowed extinction of avoidant responses similar to that observed with WKY rats (**Figures 2A,B**). Moreover, non-reinforced ITI responding (during minute one) was increased in hippocampal lesion and WKY rats (**Figures 3A,B**). These effects were not observed following damage to the entorhinal cortex.

### **HIPPOCAMPAL VOLUME**

Because SD rats with hippocampal damage mimicked the persistent avoidant behaviors of WKY rats, we investigated whether WKY rats might have an abnormal hippocampus as demonstrated by a smaller hippocampus and impaired hippocampal synaptic plasticity. Hippocampal and cortical volume was reduced in WKY rats compared to SD rats (**Figure 4**). The volume of the hippocampus, neocortex, corpus callosum, and striatum was estimated using the Cavalieri method. Regional brain volumes in WKY rats differed from SD rats [main effect of strain, Wilks' Lambda, *F*(4,5) = 6.348, *p* < 0.05]. WKY rats had significantly smaller hippocampus [*F*(1,8) = 25.396, *p* < 0.01] and cortex [*F*(1,8) = 9.017, *p* < 0.05] compared to SD rats (**Figure 4**). Corpus callosum and striatum were not different between strains.

### **HIPPOCAMPAL ELECTROPHYSIOLOGY**

Long-term potentiation (LTP) of the mPP to DG synapse was impaired in WKY rats. Evoked field potentials had similar waveforms in SD and WKY rats (**Figure 5**). LTP of the fEPSP was observed in SD rats, but not in WKY rats (**Figures 6A,B**). In SD rats, both early phase LTP (15 min and 1 h after HFS) and late phase LTP (2 and 3 h after HFS) were observed, as the main effect of phase [*F*(2,10) = 5.229, *p* = 0.028] and the phase × stimulus intensity interaction [*F*(12,60) = 4.507, *p* < 0.001] were significant (**Figure 6A**). The main effect of stimulus intensity was also significant [*F*(6,30) = 13.139, *p* < 0.001]. In contrast to SD rats, LTP of the fEPSP was not observed in WKY rats (**Figure 6B**). Neither the main effect of phase [*F*(2,10) = 1.913, *p* = 0.198] nor the phase × stimulus intensity interaction [*F*(12,60) = 1.794, *p* = 0.07] was significant. The main effect of stimulus intensity was significant [*F*(6,30) = 22.234, *p* < 0.001].

Similar to fEPSP, LTP of the population spike was observed in SD but not in WKY rats (**Figures 5** and **7A,B**). In SD rats, early and late phase LTP were observed (**Figure 7A**), as the main effect of phase [*F*(2,10) = 22.393, *p* < 0.001] and the phase × stimulus intensity interaction [*F*(12,60) = 7.014, *p* < 0.001] were significant. The main effect of stimulus intensity was also significant [*F*(6,30) = 14.660, *p* < 0.001]. LTP of the population spike was not observed in WKY rats (**Figure 7B**). Neither the main effect of phase [*F*(2,10) = 4.291, corrected *p* = 0.085] nor the phase × stimulus intensity interaction [*F*(12,60) = 1.543, *p* = 0.134] was significant, although the main effect of stimulus intensity was significant [*F*(6,30) = 3.081, *p* = 0.018].

### **DISCUSSION**

An abnormal hippocampus may confer risk for developing PTSD. Smaller hippocampal volume and associated poorer learning were observed in soldiers with PTSD and their non-combat, non-PTSD twin siblings (Gurvits et al., 1996; Gilbertson et al., 2002). The present study investigated whether impaired hippocampal function might enhance anxiety risk by increasing the sensitivity and persistence of avoidance learning, as avoidance is a core symptom of all anxiety disorders and PTSD (American Psychiatric Association, 2013). Our results show that hippocampal damage enhances the formation of persistent lever-press avoidance and non-reinforced responding, similar to an animal model of anxiety vulnerability, the WKY rat. Moreover, reduced hippocampal volume and impaired hippocampal synaptic plasticity were evident in the WKY rat, potentially contributing to their persistent avoidant responding.

The role of the hippocampus in lever-press avoidance learning is an understudied topic. In one study, hippocampal damage caused enhanced acquisition of lever-press avoidance (Schmaltz and Giulian, 1972). The present study found a trend for rats with hippocampal damage to acquire lever-press avoidance more rapidly and to a greater asymptotic level, although these results were not statistically reliable. The Schmaltz and Giulian study found no effect of hippocampal lesions on extinction of lever-press avoidance, which is in contrast to the results of the present study. This discrepancy can be explained by several differences between the two studies. Schmaltz and Giulian made their hippocampal lesions by aspiration after acquisition was stable. In the current study, hippocampal lesions using ibotenic acid were performed prior to the start of avoidance acquisition. Results from both studies are consistent with the view that persistent avoidant responding is set during acquisition due to abnormal learning of the avoidance response, and not specifically due to effects of hippocampal lesions on extinction learning. In addition, the shock intensity used in the Schmaltz and Giulian study was much lower than the current study. We have previously shown that shock intensity is particularly important in the persistence of avoidance responding during extinction in WKY rats (Jiao et al., 2011). Thus, the present study extends the work of Schmaltz and Giulian in elucidating the effect of hippocampal lesion on lever-press avoidance and its extinction.

In contrast to active avoidance, the role of the hippocampus in anxiety-related behaviors as assessed in behavioral tests like the elevated plus maze and fear conditioning is better characterized [for review Barkus et al. (2010)]. Complete lesions of the hippocampus and selective lesions of the ventral hippocampus lead to reduced conditioned freezing (Richmond et al., 1999). Rats with ventral hippocampal lesions, but not dorsal hippocampal lesions, enter the open arms in the elevated plus maze more freely than sham rats (Bannerman et al., 2002; Kjelstrup et al., 2002). While these results might suggest an anxiolytic nature of hippocampal damage, the effects of hippocampal damage on elevated plus maze are not always clear; they depend on the extent of hippocampal damage, location of the damage, and the dependent measure evaluated. Still, enhanced persistent avoidant responding following hippocampal damage in the present study is more indicative of anxiogenic rather than anxiolytic action. Future studies are needed to disentangle the role of the hippocampus in different symptoms and tests of anxiety.

The persistent avoidance responding of SD rats with hippocampal damage and intact WKY rats suggests that WKY rats may have an abnormal hippocampus. In order to investigate this possibility, we compared hippocampal volume in SD and WKY rats. WKY rats had reduced hippocampal and cortical volume, but similar striatum and corpus callosum volume to SD rats. The reduction in hippocampal volume was similar in magnitude to that reported for soldiers with PTSD and their non-combat, non-PTSD twin siblings (Gilbertson et al., 2002). One difference between the present study and the human studies is the difference in cortical volume in WKY rats. In the human studies, cortical volume was not reported; however, total brain volume was not affected in these studies. Thus, WKY rats appear to replicate some aspects of PTSD risk factors, but there may be additional impairments exhibited by this animal model.

**FIGURE 6 | LTP of the dentate gyrus field EPSP (fEPSP) following HFS of the medial perforant pathway in SD and WKY rats**. SD rats exhibited early and late phase LTP of the fEPSP **(A)**. In contrast, WKY rats did not demonstrate LTP at either early or late time points **(B)**. Displayed are input–output (i/o) curves showing the baseline response, early phase LTP and late phase LTP. Shown are an average of two i/o curves generated prior to HFS (baseline), an average of i/o curves generated 15-min and 1-h after HFS (early), and an average of i/o curves generated 2- and 3-h after HFS (late).

Combat PTSD patients and their twin siblings had impaired configural learning that was associated with reduced hippocampal volume (Gilbertson et al., 2007). In order to determine whether the reduced hippocampal volume in WKY rats amounted to a functional impairment, we assessed hippocampal synaptic plasticity. LTP is currently the best model of the synaptic changes hypothesized to occur during learning (Morris et al., 1986). The waveform of the evoked response to stimulation of the medial perforant path was similar in SD and WKY rats. However, the lack of LTP in WKY rats following HFS was dramatic and supports the idea that impaired hippocampal synaptic plasticity may underlie the impairments in hippocampal-dependent learning displayed by WKY rats (Clements and Wainwright, 2007; Clements et al., 2007). Furthermore, the impaired hippocampal synaptic plasticity in WKY rats may contribute to the persistent avoidance learning, as SD rats with a damaged hippocampus behaved similarly.

WKY rats normally demonstrate enhanced acquisition of lever-press avoidance, as well as the perseveration of this response (Servatius et al., 2008; Jiao et al., 2011). In the current study, a significant difference was not found between WKY and SD rats in avoidance acquisition, although the general direction was for WKY rats to learn avoidance to a greater level than SD rats. Importantly, WKY rats still displayed more resistance to extinction training than SD rats. Thus, WKY rats had more persistent avoidance responding, despite the lack of strain differences in avoidance acquisition.

In addition to persistent responding during extinction, WKY rats and SD rats with hippocampal damage made more ITI responses, another type of non-reinforced responding. During the acquisition phase, lever-press responses during the first but not second or third minutes of ITI were higher for SD rats with hippocampal lesion and WKY rats, as compared to sham lesions and unoperated SD rats, respectively. During the extinction phase, ITI responses were higher in rats with hippocampal lesions (minute 1, not minutes 2 and 3) but not in WKY rats. One explanation for persistent avoidance responding during extinction training is that hippocampal lesions increase general activity, including lever-press responding. However, the lack of group differences during minutes 2 and 3 of the ITI suggests that this is not the case. The increase in ITI responding in WKY rats during the acquisition phase has been previously reported (Beck et al., 2010). WKY rats are not prone to higher general activity compared to SD rats given the behaviorally inhibited temperament of WKY rats (Pare, 2000; McAuley et al., 2009), but the increased ITI responding during avoidance learning may be a result of enhanced stress behaviors demonstrated by WKY rats.

Depression and anxiety are commonly comorbid (Kessler et al., 2003), and a smaller hippocampus has been associated with both disorders (Sheline et al., 1996). In addition to the anxiety-like traits the WKY rat exhibits, it has previously been considered as a model of depression as it displays depressive-like behavior in the forced-swim test (Lopez-Rubalcava and Lucki, 2000). However, excessive avoidance is typically not associated with depression (Chase et al., 2010), but with anxiety (Mineka and Zinbarg, 2006). In those cases where a relationship between depression and avoidance is found, it is passive, not active, avoidance that is related to depression (Ottenbreit and Dobson, 2004). Moreover, avoidance symptoms in anxiety disorders may be the cause of depression in patients with comorbidity (Moitra et al., 2008). Therefore, the enhanced and persistent active avoidance observed in rats with hippocampal damage and in WKY rats is more consistent with a model of anxiety than depression.

In summary, previous human studies with PTSD patients have suggested that an abnormal hippocampus may be a risk factor for developing PTSD. Here, we present evidence that hippocampal damage facilitates the development of persistent avoidance responding, similar to symptoms of anxiety disorders in humans. Moreover, we provide support that an animal model of behavioral inhibition, a risk factor for anxiety disorders (Kagan et al., 1987) and associated with self-reported avoidance symptomology in combat veterans (Myers et al., 2012), has reduced hippocampal volume and impaired hippocampal synaptic plasticity. The present findings support the idea that hippocampal dysfunction due to impaired synaptic plasticity and reduced volume leads to abnormally persistent avoidance learning, which in and of itself is a risk factor for developing anxiety disorders.

### **ACKNOWLEDGMENTS**

The research presented in the current study was supported by the Biomedical Laboratory Research and Department of Veterans Affairs Office of Research & Development grant I01BX000132, NIH grant RO1-NS44373, New Jersey Commission on Brain Injury Research grant CBIR11PJT003 and the Stress and Motivated Behavior Institute. Also, the authors would like to thank Dr. Sara Fazelinik for her assistance with the lesion experiments.

### **REFERENCES**


in chronic, combat-related posttraumatic stress disorder. *Biol. Psychiatry* 40, 1091–1099. doi:10.1016/S0006-3223(96)00229-6


differing in their neuroendocrine and behavioral responses to stress: implications for susceptibility to stress-related neuropsychiatric disorders. *Neuroscience* 115, 229–242. doi:10.1016/S0306-4522(02)00364-0


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 June 2014; accepted: 23 July 2014; published online: 08 August 2014. Citation: Cominski TP, Jiao X, Catuzzi JE, Stewart AL and Pang KCH (2014) The role of the hippocampus in avoidance learning and anxiety vulnerability. Front. Behav. Neurosci. 8:273. doi: 10.3389/fnbeh.2014.00273*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Cominski, Jiao, Catuzzi, Stewart and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# BEHAVIORAL NEUROSCIENCE

## Effects of psychotropic agents on extinction of lever-press avoidance in a rat model of anxiety vulnerability

**Xilu Jiao<sup>1,2,3</sup>\*, Kevin D. Beck<sup>1,2,4</sup>, Amanda L. Stewart<sup>3</sup>, Ian M. Smith<sup>1,3</sup>, Catherine E. Myers<sup>1,2,4</sup>, Richard J. Servatius<sup>1,2,4</sup> and Kevin C. H. Pang<sup>1,2,4</sup>**

<sup>1</sup> Neurobehavioral Research Laboratory, Veteran Affairs New Jersey Health Care System, VA Medical Center, East Orange, NJ, USA

<sup>2</sup> Stress and Motivated Behavior Institute, Rutgers – New Jersey Medical School, Rutgers Biomedical and Health Sciences, The State University of New Jersey, Newark, NJ, USA

<sup>3</sup> Veterans Bio-Medical Research Institute (VBRI), VA Medical Center, East Orange, NJ, USA

<sup>4</sup> Department of Neurology and Neurosciences, Rutgers – New Jersey Medical School, Rutgers Biomedical and Health Sciences, The State University of New Jersey, Newark, NJ, USA

### **Edited by:**

Gregory J. Quirk, University of Puerto Rico, USA

### **Reviewed by:**

Gregg Stanwood, Vanderbilt University, USA

Christina Dalla, University of Athens, Greece

### **\*Correspondence:**

Xilu Jiao, Neurobehavioral Laboratory, Veterans Bio-Medical Research Institute (VBRI), VA Medical Center, 385 Tremont Avenue, East Orange, NJ 07018, USA

e-mail: xilu.jiao@va.gov

Avoidance and its perseveration represent key features of anxiety disorders. Both pharmacological and behavioral approaches (i.e., anxiolytics and extinction therapy) have been utilized to modulate avoidance behavior in patients. However, the outcome has not always been desirable. Part of the reason is attributed to the diverse neuropathology of anxiety disorders. Here, we investigated the effect of psychotropic drugs that target various monoamine systems on extinction of avoidance behavior using a lever-press avoidance task. We used the Wistar-Kyoto (WKY) rat, a unique rat model that exhibits facilitated avoidance and extinction resistance along with malfunction of the dopamine (DA) system. Sprague Dawley (SD) and WKY rats were trained to acquire lever-press avoidance. WKY rats acquired avoidance faster and to a higher level compared to SD rats. During pharmacological treatment, bupropion and desipramine (DES) significantly reduced avoidance response selectively in WKY rats. However, after the discontinuation of drug treatment, only those WKY rats that were previously treated with DES exhibited lower avoidance response compared to the control group. In contrast, none of the psychotropic drugs facilitated avoidance extinction in SD rats. Instead, DES impaired avoidance extinction and increased non-reinforced response in SD rats. Interestingly, paroxetine, a widely used antidepressant and anxiolytic, exhibited the weakest effect in WKY rats and no effects at all in SD rats. Thus, our data suggest that malfunctions in the brain catecholamine system could be one of the underlying etiologies of anxiety-like behavior, particularly avoidance perseveration. Furthermore, pharmacological manipulation targeting DA and norepinephrine may be more effective in facilitating extinction learning in this strain. The data from the present study may shed light on new pharmacological approaches to treat patients with anxiety disorders who are not responding to serotonin re-uptake inhibitors.

**Keywords: avoidance perseveration, anxiolytic, behavioral inhibition, dopamine, norepinephrine, serotonin, transporter inhibitors**

### **INTRODUCTION**

Anxiety disorders are the most common psychiatric disorders, with a lifetime prevalence of over 15% in the U.S. (Kessler et al., 2005; Somers et al., 2006). Although the etiopathology of anxiety disorders remains elusive, the core characteristic of all anxiety disorders is pathological avoidance (American Psychiatric Association, 2000). Compared to normal strategic avoidance, psychopathological avoidance is hypersensitive to stimuli, resistant to extinction, and often results in poor productivity and inefficiency that hinder daily activity (Beck et al., 2010; Berman et al., 2010). However, therapies targeting pathological avoidance are quite underdeveloped and problematic for people suffering from clinical anxiety.

Current treatment for anxiety disorder includes psychological [i.e., cognitive behavioral therapy (CBT)], pharmacological approaches, and the combination of both. Extinction-resistant avoidance is one of the target symptoms in CBT, which is largely based on changing behavior through different learning approaches (Holmes and Quirk, 2010; Nic Dhonnchadha and Kantak, 2011; Schneier, 2011; Melo et al., 2012). Clinical evidence shows that combined approaches yield the highest success rate compared to each approach alone [for review, see Pollack et al. (2008)]. Among the Food and Drug Administration (FDA) approved anxiolytic agents, selective serotonin re-uptake inhibitors (SSRIs) are the drugs of first choice due to their mild side effects allowing for better compliance in patients. However, a large group of patients do not respond to SSRI treatment (>40%) or relapse after initial effective treatment (Pollack et al., 2006, 2008), providing a need for new therapeutic agents and strategies for refractory cases.

A better understanding of the neuropathology of core anxiety symptoms is essential for developing more effective treatments. We recently described a rat model of anxiety-like behavior, the Wistar-Kyoto (WKY) rat (Servatius et al., 2008; Jiao et al., 2011b). This inbred rat strain differs from normal outbred strains, such as Sprague Dawley (SD) rats, in avoidance propensity and perseveration and neuronal activity in brain regions critical for fear learning and anxiety (Beck et al., 2011; Jiao et al., 2011b). The WKY strain also exhibits behavioral inhibition temperament in the face of social and non-social stressful stimuli, heightened physiological and neuroendocrine responsiveness to stressful stimuli, and negative bias toward external cues, indicating greater anxiety vulnerability compared to normal outbred strains (Athey and Iams, 1981; Pare, 1992a,b; Redei et al., 1994; Lopez-Rubalcava and Lucki, 2000).

Although the exact mechanism for the altered behavior of the WKY rats is not well understood, much focus has been directed toward malfunction of central monoaminergic systems. The WKY rat exhibits altered dopamine (DA) and norepinephrine (NE) receptors and transporter levels in cortical and subcortical regions compared to SD and Wistar (WIS) rats (Jiao et al., 2011b). Pharmacological studies demonstrate that repeated treatment with drugs that enhance catecholaminergic transmission reverses abnormal behavior in WKY but not in outbred rats (Pare et al., 2001; Tejani-Butt et al., 2003). For instance, bupropion (BUP) [dopamine transporter (DAT) inhibitor] increases locomotion in the open field test (OFT), while nomifensine [DAT and norepinephrine transporter (NET) inhibitor] and desipramine (DES) (a tricyclic antidepressant that mainly blocks NET) facilitate OFT activity and swimming behavior in the Forced Swim Test (FST) of WKY rats (Pare et al., 2001; Tejani-Butt et al., 2003). However, neither fluoxetine nor paroxetine (PAR) (selective serotonin re-uptake inhibitors, SSRIs) is effective on similar behaviors (Durand et al., 1999; Lopez-Rubalcava and Lucki, 2000; Tejani-Butt et al., 2003). The data suggest that anxiety-like symptoms in WKY rats are SSRI-resistant but may be modified by psychotropic drugs acting on NE and/or DA (Lahmame et al., 1997; Tejani-Butt et al., 2003).

In the present study, we compared the effects of monoaminergic transporter inhibitors on avoidance extinction in SD and WKY rats. We predicted that NET and DAT inhibitors but not SSRI would facilitate avoidance extinction and reduce active-avoidance behavior selectively in WKY rats.

### **MATERIALS AND METHODS**

### **ANIMALS**

Forty male SD (body weight = 321 ± 2.8 g) and 40 male WKY (body weight = 239 ± 1.8 g) rats (approximately 60 days of age at the start of the experiment) were obtained from Harlan Sprague-Dawley Laboratories (Indianapolis, IN, USA). Rats were housed in individual cages with free access to food and water in a room maintained on a 12:12 h day/night cycle for 2 weeks prior to experimentation. Experiments occurred between 07:00 and 15:00 h in the light portion of the cycle. One WKY rat treated with DES was eliminated from the study due to significant weight loss in the last three extinction sessions. All procedures received prior approval by the Institutional Animal Care and Use Committee at the VA New Jersey Health Care System and were conducted in accordance with the NIH Guide for the Care and Use of Laboratory Animals.

### **LEVER-PRESS ESCAPE/AVOIDANCE TRAINING**

The apparatus was described previously (Servatius et al., 2008). Training was conducted in 16 identical operant chambers (Coulbourn Instruments, Langhorn, PA, USA). Each operant chamber was enclosed in a sound-attenuated box. Scrambled 2.0-mA foot-shocks were delivered through the grid floor (Coulbourn Instruments, Langhorn, PA, USA). Despite an increased sensitivity to stress in WKY rats, WKY and SD rats exhibited a similar threshold to vocalize in response to foot-shock (unpublished observation), suggesting similar pain sensitivity to foot-shocks. The auditory warning signal was a 1000-Hz, 75-dB tone (10 dB above background noise). A 3-min intertrial interval (ITI) was explicitly signaled with a 5-Hz blinking cue light (safety signal) located above the lever. Graphic State Notation software (v. 3.02, Coulbourn Instruments, Langhorn, PA, USA) controlled the stimuli and recorded responses.

Each session began with a 60-s stimulus-free period. A trial commenced with the presentation of the auditory warning signal. During avoidance acquisition training, a lever-press during the first 60 s of the warning signal constituted an "avoidance" response, terminated the warning signal, and triggered the ITI period. In the absence of a lever-press in the first 60 s of the warning signal, 0.5-s foot-shocks were delivered with an inter-shock interval of 3 s. A lever-press during the shock period constituted an "escape" response, terminated the shock and warning signal, and triggered the ITI. A maximum of 100 foot-shocks could be delivered on each trial. During avoidance extinction training, foot-shock was not delivered and the safety signal was not presented. A lever-press made during the first 60 s of the warning signal constituted an avoidance response, while a lever-press made during the rest of the warning signal constituted an escape response. Both responses terminated the warning signal and initiated a non-signaled ITI period. Each session consisted of 20 trials. Extinction training occurred in the same training box as acquisition learning.
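The response definitions above can be summarized in a short sketch. The function and constant names are hypothetical, and the code only illustrates how a lever-press latency maps onto the response categories described in this section; it is not the Graphic State program that ran the sessions. Treating a press at exactly 60 s as an escape is an assumption, since the text does not specify the boundary.

```python
from typing import Iterable, Optional

# First 60 s of the warning signal (from the text); the boundary handling is an assumption.
AVOIDANCE_WINDOW_S = 60.0


def classify_response(latency_s: Optional[float]) -> str:
    """Classify one trial from the lever-press latency (s) after warning-signal onset.

    latency_s is None when no lever-press occurred during the trial.
    """
    if latency_s is None:
        return "no response"
    if latency_s < 0:
        raise ValueError("latency must be non-negative")
    if latency_s < AVOIDANCE_WINDOW_S:
        return "avoidance"  # press within the first 60 s of the warning signal
    return "escape"  # press after the 60-s window (shock period during acquisition)


def session_summary(latencies: Iterable[Optional[float]]) -> dict:
    """Count each response type over the trials of one session."""
    counts = {"avoidance": 0, "escape": 0, "no response": 0}
    for latency in latencies:
        counts[classify_response(latency)] += 1
    return counts


# Hypothetical latencies for four trials, for illustration only.
print(session_summary([12.5, 61.0, None, 30.0]))
```

The same classification applies in both phases; only the consequences differ, because shock and the safety signal are omitted during extinction.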

### **DRUG ADMINISTRATION**

Bupropion hydrochloride (a DAT blocker, 20 mg/ml/kg, i.p., Sigma-Aldrich, St Louis, MO, USA), desipramine hydrochloride (a NET blocker, 10 mg/ml/kg, i.p., Sigma-Aldrich, St Louis, MO, USA), paroxetine hydrochloride (an SSRI, 10 mg/ml/kg, i.p., Toronto Research Chemicals, Toronto, ON, CA), or saline vehicle solution (1 ml/kg, i.p.) was injected daily between 16:00 and 17:00 (after the avoidance session on days of training) to avoid acute drug effects on behavioral testing. The doses used were those found to be effective in the open field test and FST in WKY rats (Tejani-Butt et al., 2003; Jiao et al., 2006).

### **SEQUENCE OF BEHAVIORAL PROCEDURES**

Avoidance training sessions occurred three times per week (every 2–3 days). Avoidance acquisition training continued for 12 sessions. After the acquisition phase, rats were administered BUP, DES, PAR, saline (SAL) treatment, or no treatment. For each strain, rats were stratified on avoidance performance during acquisition session 12 (A12) and then randomly assigned within each stratum to BUP, DES, PAR, SAL treatment, or no injection group. Rats were treated daily from the day following the last acquisition session (A12) to the day before the sixth extinction session (E06).
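The stratified assignment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the stratum size (one rat per group in each stratum), and the example scores are all assumptions introduced here.

```python
import random
from collections import Counter


def stratified_assignment(scores, groups, seed=0):
    """Rank subjects by score, then randomize group labels within each stratum.

    scores: dict mapping subject id -> avoidance performance on session A12.
    groups: list of group labels; each stratum holds len(groups) subjects (assumption).
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    assignment = {}
    for start in range(0, len(ranked), len(groups)):
        stratum = ranked[start:start + len(groups)]
        labels = list(groups)
        rng.shuffle(labels)  # random order of group labels within this stratum
        for subject, label in zip(stratum, labels):
            assignment[subject] = label
    return assignment


# Hypothetical A12 avoidance counts for the 40 rats of one strain.
example_scores = {f"rat{i:02d}": (i * 7) % 21 for i in range(40)}
groups = ["BUP", "DES", "PAR", "SAL", "NO_INJECTION"]
assignment = stratified_assignment(example_scores, groups)
print(Counter(assignment.values()))
```

With 40 rats per strain and five groups, this yields eight rats per group, so that each group spans the full range of acquisition performance.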

Extinction training (absence of shock and intertrial-interval signal) began 2 weeks after the last acquisition session (session 13) and continued for nine sessions. Therefore, the first six sessions of extinction were with drug or SAL and the last three extinction sessions were drug-free. Thus, extended drug effect was evaluated in the last three drug-free sessions.

### **DATA ANALYSIS**

Mixed design analysis of variance (ANOVA) was used to analyze behavioral aspects of acquisition and extinction. Ratios of avoidance and escape responses, lever-presses during the first minute of each session [anticipated responses (AR)] and non-reinforced intertrial-interval responses (ITRs) during the first, second, and third minute ITI (ITI-1, ITI-2, and ITI-3 min) were examined in both phases. Both between-session and within-session avoidance responses were examined to illustrate the main effects of strain, drug treatment, and interactions.

In the acquisition phase, two-way ANOVA with repeated measures of session and between-groups measure of strain (2 × 12) was conducted to analyze all behavioral features using Tukey-Kramer for *post hoc* comparisons. Within-session avoidance responses were examined in early (A01–04), mid (A05–08), and late (A09–12) session-blocks with four sessions/block across trials (2 × 20).
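As a worked check on this design, the degrees of freedom of a mixed ANOVA with one between-subjects factor and one repeated measure follow directly from the subject, group, and level counts. The helper below is a hypothetical sketch that assumes complete data for all 80 rats and no sphericity correction.

```python
def mixed_anova_df(n_subjects, n_groups, n_levels):
    """Degrees of freedom for a one-between, one-within mixed ANOVA."""
    df_group = n_groups - 1
    df_group_error = n_subjects - n_groups
    df_repeated = n_levels - 1
    df_repeated_error = df_group_error * df_repeated
    return {
        "group": (df_group, df_group_error),
        "repeated": (df_repeated, df_repeated_error),
        "interaction": (df_group * df_repeated, df_repeated_error),
    }


# Strain (2) x session (12) design with 80 rats.
by_session = mixed_anova_df(n_subjects=80, n_groups=2, n_levels=12)
# Strain (2) x trial (20) design used for the within-session analysis.
by_trial = mixed_anova_df(n_subjects=80, n_groups=2, n_levels=20)
print(by_session)
print(by_trial)
```

Under these assumptions the strain effect is tested on (1, 78) degrees of freedom, the session effect on (11, 858), and the trial effect on (19, 1482), which agrees with the values reported for the acquisition analyses.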

In the extinction phase, mixed design ANOVA was used to analyze all the behavioral aspects. Analysis of rats receiving SAL injection compared to non-injection animals revealed no differences (all *F*-values <1 and *p*-values >0.2), and so data from subjects in these two groups were combined into one control (CTL) group within each strain for analysis and figure illustration during extinction. A mixed ANOVA with between-subjects factors of strain and treatment and repeated measures of session (2 × 4 × 9) was used to analyze main factors of strain, treatment, session, and interactions. In order to examine the immediate (i.e., when treatment was administered) and lasting (i.e., when treatment was discontinued) drug effects on extinction learning in each strain, the mean avoidance responses were separately analyzed within the first six sessions and the last three sessions of extinction, using two-way ANOVA with repeated measures of session and between-groups measure of treatment design, treatment × extinction session (4 × 6 and 4 × 3, respectively). To evaluate drug effects on within-session extinction learning, within-session avoidance response was analyzed in early (E01–03), mid (E04–06), and late (E07–09) extinction, respectively, using strain × treatment × trial (2 × 4 × 20) design with strain and treatment as between-subjects factors and trial as a within subject factor. *Post hoc* analysis was conducted using Dunnett's test to identify interactions.

All data are expressed as means ± SEM. An alpha level equal to 0.05 was used to determine significance across all analyses. Statistical results are reported only where significant differences were found.
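For reference, a group mean and its standard error can be computed as in the sketch below. The function name and the example values are hypothetical; the SEM is taken as the sample standard deviation divided by the square root of the group size, which is the usual definition but is not spelled out in the text.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM) for one group; SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical avoidance counts for a group of eight rats.
group_mean, group_sem = mean_sem([2, 4, 4, 4, 5, 5, 7, 9])
print(f"{group_mean:.2f} ± {group_sem:.2f}")
```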

### **RESULTS**

### **ACQUISITION**

### **Avoidance responding**

In all respects, strain differences in avoidance learning in this study replicate what has been described previously (Servatius et al., 2008; Beck et al., 2010, 2011). Rats from both strains emitted greater numbers of avoidance responses as acquisition proceeded, Session, *F*(11,858) = 101.8, *p* < 0.001 (**Figure 1A**). Compared to SD rats, WKY rats acquired avoidance response to a greater extent, strain, *F*(1,78) = 17.8, *p* < 0.001.

Within-session analysis was conducted to compare avoidance responses in three session-blocks (i.e., early, mid, and late blocks). Within-session avoidance responses were averaged across early (A01–04), mid (A05–08), and late (A09–12) acquisition sessions. The data indicate that both strains emitted more avoidance responses in later trials of the session, Trial, *F*(19,1482) = 23.3 (early, sessions A01–04), 13.9 (mid, sessions A05–08), and 7.2 (late, sessions A09–12), *p*s < 0.001. WKY rats exhibited superior within-session avoidance learning compared to SD rats, strain, *F*(1,78) = 24.8 (early), 5.6 (mid), and 15.2 (late), *p*s < 0.001. Consistent with our previous findings, within-session acquisition learning was more obvious in SD rats, as WKY rats emitted similar or greater avoidance responding on the first trial of a session compared to the last trial of the previous session, suggesting a lack of "warm-up" that plays a pivotal role in the development of avoidance perseveration during the extinction phase in the WKY strain (Servatius et al., 2008) (**Figure 1B**).

### **Non-reinforced response**

In terms of ARs, WKY rats made more lever-presses during the first minute of each session as compared to SD rats, strain, *F*(1,78) = 4.3, *p* < 0.05; both strains of rats emitted more responses as acquisition proceeded, session, *F*(11,858) = 18.2, *p* < 0.001 (**Figure 4A**). The number of intertrial-interval responses (ITRs) in the first, second, and third minute of the ITI period changed as the acquisition phase proceeded, *F*(11,858) = 24.4 (ITI-first minute), 13.0 (ITI-second minute), and 14.5 (ITI-third minute), *p*s < 0.001 (**Figure 3A**). Both strains of rats emitted more ITRs in the first minute compared to the second and third minute. ITRs differed between strains for all ITIs, strain × session, *F*(11,858) = 16.3 (ITI-first minute), 11.60 (ITI-second minute), and 15.1 (ITI-third minute), respectively, *p*s < 0.001; WKY rats responded more frequently in early than in late acquisition sessions, whereas SD rats emitted a similar number of ITRs across acquisition sessions.

### **EXTINCTION**

### **Avoidance responding**

During extinction, all rats made fewer avoidance responses as extinction proceeded across sessions, *F*(8,573) = 47.96, *p* < 0.001 (**Figure 2A**). Overall, WKY rats emitted more avoidance responses compared to SD rats, *F*(1,72) = 14.88, *p* < 0.001. Similar to our previous findings, WKY rats without drug treatment exhibited more avoidance responses compared to SD rats without drug, as reflected by significant strain × treatment interaction, *F*(3,72) = 3.91, *p* < 0.05. In an analysis of only WKY rats, DES treatment facilitated extinction compared to the un-drugged (CTL) group as reflected by treatment × session, *F*(24,285) = 2.01, *p* < 0.005, *post hoc p* < 0.05; in contrast, DES treatment in SD rats enhanced avoidance responses compared to CTL as reflected by treatment, *F*(3,36) = 3.48, *p* < 0.05, *post hoc p* < 0.05.

Rats were treated with drug or SAL for the first six sessions of extinction, and then untreated for another three sessions of extinction. In order to assess immediate drug effects and extended drug effects on extinction learning in both strains, mean avoidance responses were compared between treatment groups within each strain for extinction sessions with treatment (E01–06) and extinction sessions without treatment (E07–09).

*[Figure 1, caption fragment: … lever-press responding significantly increased in both strains although WKY rats acquired avoidance responses significantly faster and reached greater asymptotic performance compared to SD rats. **(B)** Within-session avoidance response. Both strains emitted more avoidance responses as an acquisition session proceeded. During early, mid, and late … avoidance responding in the first trial of a session compared to the last trial of a previous session; however, this phenomenon is not evident in WKY rats. Each data point represents group mean ± SEM (n = 40/strain).]*

*[Figure 2, caption fragment: … responses than SD CTL rats. In WKY rats, BUP treatment significantly decreased avoidance responses during injection sessions while DES treatment significantly reduced such responses during all extinction sessions, compared to CTL group. In SD rats, DES treatment impaired extinction of avoidance response compared to CTL. **(B)** Within-session avoidance … facilitation in late sessions while no injection was administered. In contrast, in SD rats, DES treatment trended to produce increased avoidance response in mid and late extinction sessions compared to CTL. Each data point represents group mean ± SEM (n = 8/treatment group, 16/CTL group). Gray shade on x-axis indicates sessions in which drugs were administered (E01–06).]*

### **Extinction sessions with treatment (E01–06)**

During the first six extinction sessions, all rats reduced avoidance response as extinction proceeded, reflected by main effect of session, *F*(5,360) = 52.76, *p* < 0.001; WKY rats emitted more avoidance responses compared to SD rats, as reflected by a main effect of strain, *F*(1,72) = 8.96, *p* < 0.005, and a strain × treatment interaction, *F*(2,72) = 3.36, *p* < 0.05. In a separate analysis of WKY rats, BUP treatment decreased avoidance responding in all extinction sessions except E02, DES treatment decreased avoidance responding in sessions E03–06, and PAR treatment decreased avoidance responding in the last two extinction sessions, as reflected by treatment × session interaction, *F*(15,180) = 2.31, *p* < 0.01, *post hoc*, *p*s < 0.05, suggesting all three drugs facilitated extinction when daily treatment was administered. In an analysis of SD rats, DES treatment led to the highest number of avoidance response among all treatment groups; the remaining groups (i.e., BUP, PAR, and CTL) did not differ, main effect of treatment, *F*(3,36) = 3.41, *p* < 0.05, *post hoc p* < 0.05. These results suggest DES treatment is detrimental for extinction learning in SD rats.

### **Extinction sessions without treatment (E07–09)**

During the last three sessions of extinction when treatment was discontinued, WKY CTL rats emitted more avoidance responses compared to SD CTL rats [strain × treatment, *F*(3,71) = 4.0, *p* < 0.05]. In the analysis of WKY rats, the DES group exhibited a trend toward fewer avoidance responses compared to the CTL group, as reflected by a main effect of treatment that just missed significance (*p* = 0.066). In the analysis of SD rats, the DES group did not differ from other treatment groups when drug administration was discontinued, suggesting that enhanced avoidance following DES treatment (E01–06) was not long-lasting.

Within-session avoidance responses were averaged across early (E01–03), mid (E04–06), or late (E07–09) extinction sessions. Within-session analysis demonstrated that avoidance responses decreased significantly in both strains with increasing trials in a session, *F*(19,1368) = 23.07 (early, sessions E01–03), 46.45 (mid, sessions E04–06), and 44.75 (late, sessions E07–09), *p*s < 0.001 (**Figure 2B**). CTL WKY rats exhibited significantly higher avoidance responses compared to CTL SD rats throughout mid and late sessions, indicating slower within-session extinction in WKY rats without drug treatment, *F*(3,72) = 4.89 (mid) and 4.00 (late), *p*s < 0.005 and 0.05, respectively, *post hoc p*s < 0.05. However, in both strains, drug treatment affected avoidance responding differently in early, mid, and late session-blocks. In WKY rats, BUP and DES significantly facilitated within-session extinction during early, middle, and late phases of extinction compared to CTL (early: treatment × trial interaction, *F*(57,684) = 1.47, *p* < 0.05; middle: *F*(3,36) = 4.31, *p* < 0.05; late: *F*(57,684) = 1.75, *p* < 0.001). Moreover, first-trial avoidance was not altered regardless of treatment, suggesting that between-session extinction may not be apparent when measured by first-trial avoidance. In SD rats, none of the drugs affected avoidance response in early extinction sessions. However, in mid and late phases of extinction, DES impaired within-session extinction (enhanced avoidance responding) compared to the other treatments [middle: main effect of treatment, *F*(3,36) = 5.0, *p* < 0.005, and treatment × trial interaction, *F*(57,684) = 1.51, *p* < 0.05, *post hoc p*s < 0.05; late: treatment × trial interaction, *F*(57,684) = 1.45, *p* < 0.05, *post hoc p*s < 0.05]. Neither BUP nor PAR significantly altered within-session extinction in SD rats.

*[Figure 3, caption fragment: … proceeded and more ITRs during ITI-second and -third minute windows in mid acquisition sessions. **(B)** During extinction, DES facilitated ITRs selectively in SD rats across all three ITI windows compared to CTL. None of the treatments affected ITRs in WKY rats regardless of ITI windows. Each data point represents group mean ± SEM (n = 8/treatment group, 16/CTL group). Gray shade on x-axis indicates sessions in which drugs were administered (E01–06).]*

*[Figure 4, caption fragment: … increased ARs as acquisition continued. **(B)** During the extinction phase, the ARs did not change as extinction continued regardless of strain or … Gray shade on x-axis indicates sessions in which drugs were administered (E01–06).]*

### **Non-reinforced responding**

Anticipated responses decreased during the extinction phase, *F*(8,573) = 2.09, *p* < 0.05 (**Figure 4B**). Although ARs in WKY rats were not affected by any drug treatment, DES treatment increased lever-presses compared to CTL treatment in SD rats, *F*(3,36) = 3.11, *p* < 0.05, *post hoc p* < 0.05. On the other hand, fewer lever-presses were emitted by all rats in each minute of the ITI as extinction proceeded, as reflected by a main effect of session, *F*(8,573) = 31.5 (ITI-first minute), 26.25 (ITI-second minute), and 29.78 (ITI-third minute), respectively, *p*s < 0.001 (**Figure 3B**). More ITRs were emitted during the ITI-first minute compared to responses emitted during the second and third minute regardless of strain or treatment. In WKY rats, DES-treated rats emitted fewer ITRs than CTL-treated peers, *F*(24,285) = 2.47 (ITI-first minute), 1.89 (ITI-second minute), and 1.76 (ITI-third minute), *p*s < 0.05, *post hoc*, *p*s < 0.05. In contrast, DES treatment in SD rats enhanced ITRs compared to the other treatments, *F*(3,36) = 3.07 (ITI-first minute), 6.40 (ITI-second minute), and 5.14 (ITI-third minute), *p*s < 0.05, *post hoc*, *p*s < 0.05.

### **DISCUSSION**

Avoidance and its perseveration represent key features of anxiety disorders. Pharmacological approaches that reduce avoidance behavior could facilitate recovery in patients with anxiety disorders. The present study used an animal model of behavioral inhibition, a risk factor for anxiety disorders, to test the effectiveness of pharmacological intervention on perseverative avoidance behavior. WKY rats exhibited facilitated acquisition and delayed extinction of lever-press avoidance compared to SD rats, consistent with our previous reports (Servatius et al., 2008; Beck et al., 2010, 2011; Jiao et al., 2011a). Moreover, drugs targeting distinct monoamine systems affected extinction learning differently and in a strain-dependent manner. BUP and DES significantly facilitated extinction of avoidance selectively in WKY rats while none of the drugs enhanced extinction learning in SD rats. Instead, DES impaired extinction learning in SD rats by increasing avoidance responding. These data suggest that drugs can facilitate extinction of the avoidance response selectively in animals with innate vulnerability to anxiety.

In addition, the sensitivity to pharmacological manipulations was not necessarily similar between different behavioral measures. For instance, BUP and DES reduced avoidance responses in WKY rats without affecting ARs or ITRs. In SD rats, DES not only increased avoidance responses but also enhanced ITRs. Despite the fact that AR, ITR, and avoidance responding during extinction all constitute non-reinforced behaviors, the results suggest that these behaviors may be under different neurochemical control and that pharmacological treatment could be designed to selectively alleviate psychopathological avoidance and leave other behavioral features intact to reduce side effects.

The results of the present study provide important information regarding the neurobiological mechanisms of extinction of avoidance behavior. Several neurochemical pathways have been implicated in the development of anxiety and its neuropsychopharmacology, while the neural mechanism of avoidance and its extinction is far less understood. Converging literatures demonstrate that aberrant DA circuitry and/or defective noradrenergic function is associated with anxiety disorders (Mathew et al., 1981; Hamner and Diamond, 1996; Ballenger, 2001). Early studies on the neurobiology of active avoidance also focused on catecholamine systems (Beer and Lenard, 1975; Ashford and Jones, 1976; Fibiger and Mason, 1978; Oei and King, 1978; Raskin et al., 1983; Koob et al., 1984). In general, DA has mostly been implicated in the acquisition and expression of avoidance responses (Lenard and Beer, 1975; Fibiger and Mason, 1978) while NE may be more involved in the extinction of such responses (Lenard and Beer, 1975; Fibiger and Mason, 1978; Raskin et al., 1983). However, selective lesions that targeted either the DA or NE system have yielded inconsistent results. For instance, NE depletion did not appreciably alter avoidance learning, but rather led to impairment of extinction (Lenard and Beer, 1975; Fibiger and Mason, 1978), while mice that lacked NE extinguished more rapidly compared to intact controls (Thomas and Palmiter, 1997). The inconsistent effects of lesions may be due to different lesion procedures and compensatory mechanisms after lesion.

Extinction deficiency has been associated with malfunction of various brain regions, especially the medial prefrontal cortex (mPFC) and amygdala. The evidence is mainly obtained from fear extinction paradigms. Hypoactive mPFC and hyperactive limbic system, including nucleus accumbens (NAc) and amygdala, are susceptibility factors in psychopathology of anxiety disorders (Milad et al., 2006; Rauch et al., 2006). Historically, dysfunctional catecholamine transmission in the mPFC and NAc has been associated with abnormal active-avoidance behavior and implicated in anxiety pathology (Giorgi et al., 1994; Duncan et al., 1996; Lacroix et al., 1998; Weiss et al., 2001). However, results in rodents are inconsistent due to the wide variety of animal models, behavioral procedures and techniques employed. Here we utilized (1) pharmacological agents that modulate NE, DA, and 5-HT neurotransmission in the brain by blocking corresponding transporters in order to identify their role in extinction of avoidance and (2) a unique rat model, the WKY rat strain that exhibits innate abnormalities in DA and NE systems.

In the present study, DES, a tricyclic antidepressant that increases synaptic NE level, facilitated extinction in WKY rats after 2 weeks of treatment. In previous studies, increased locomotion in OFT and swimming time in FST were reported following DES treatment in WKY rats (Lopez-Rubalcava and Lucki, 2000; Tejani-Butt et al., 2003). Thus, the present results, together with those obtained previously, suggest that blocking NET can ameliorate anxiety- and depression-like behaviors in the WKY rat. WKY rats exhibit higher NE transporter (NET) binding in hippocampus and amygdala compared to SD rats, and repeated exposure to novel stressors reduced β- and α2-adrenergic receptors selectively in WKY rats, suggesting that a pre-existing vulnerability to stress is associated with malfunctions in the noradrenergic system (Tejani-Butt et al., 1994). In the present study, the 2-mA foot-shock during acquisition may function as a repeated physical stressor, while the context and the warning signals may function as repeated psychological stressors; both could alter NET and receptor function in WKY rats, which could lead to an exaggerated avoidance response. DES exerts its pharmacological effects via inhibition of NET and auto-receptor desensitization in rats (Sacchetti et al., 2001; Zhao et al., 2008; Zhang et al., 2009). Chronic DES treatment (i.e., more than 10 days) not only reduces NET binding, but also alters β- and α-adrenergic receptor binding in a region-specific manner in rats (Hancock and Marsh, 1985; Zhao et al., 2008; Zhang et al., 2009). Treatment with DES at the same dose that altered NE receptor and NET (i.e., 10 mg/kg/day) blocked stress- and alcohol-induced anxiety-like behavior in WKY rats (Durand et al., 2000; Getachew et al., 2008). Thus, DES administration was expected to improve extinction in WKY rats, possibly through pharmacological changes in the NE system. In contrast, the same treatment appeared to retard avoidance extinction in SD rats.
The differential effects of DES on avoidance response in the two strains could be attributed to different innate noradrenergic function. DES-treated SD rats also emitted more non-reinforced response (i.e., ITRs) during extinction, suggesting the retardation of avoidance extinction may be due to elevated general locomotor activity, which has also been reported previously following DES treatment (Maj et al., 1987; Tejani-Butt et al., 2003). Thus, altering NE function could yield different outcomes depending upon baseline NE activity and individual variability in noradrenergic system following chronic drug administration.

Here, we also report that BUP, a selective DAT blocker and a weaker NET blocker with antagonism of adrenergic receptors and acetylcholine receptors (Carroll et al., 2014), facilitates extinction of active-avoidance selectively in the WKY strain. These effects of BUP may be related to the fact that WKY rats have an altered DA system associated with the distribution of transporter and receptors in the brain (Jiao et al., 2003, 2006; Yaroslavsky et al., 2006; Novick et al., 2008; Yaroslavsky and Tejani-Butt, 2010). Given the role of the mesolimbic DA system in cognitive, emotional, and motivational behaviors, we previously examined the distribution of DAT sites in the brains of WKY compared to Wistar (WIS) and SD rats and reported that WKY rats exhibited a differential pattern of distribution of DAT binding sites in terminal field regions versus the cell body areas in comparison to WIS and SD rats (Jiao et al., 2003). At the time, we speculated that the observed differences in the density and distribution of DAT sites in WKY rats may lead to altered modulation of synaptic DA levels in the cell body and mesolimbic regions and contribute to behavioral differences previously observed. In terms of the role of DA in active avoidance, the results are not clear and may depend on which region is investigated. NAc DA depletion leads to a substantial reduction in learning to lever-press to avoid or escape a shock (McCullough et al., 1993), while mPFC 6-OH-DA lesions, which reduce DA level to 13% of control levels, do not affect avoidance responding (Koob et al., 1978; Sokolowski et al., 1994). Although the involvement of DA in the extinction of active avoidance is unknown, DA in mPFC and amygdala is actively involved in extinction of conditioned fear in rodents through modulation of GABAergic neurons in the intercalated cell cluster (ITC) of amygdala and basolateral amygdala (Morrow et al., 1999; Fernandez Espejo, 2003; de la Mora et al., 2010; Rey et al., 2014).
WKY rats exhibit slower extinction of lever-press avoidance and lower mPFC activity and amygdalar GABAergic activity compared to SD rats (Jiao et al., 2011a), suggesting that dysfunctional DA transmission in mPFC and amygdala could be a possible mechanism. Repeated treatment with DAT blockers (i.e., BUP and nomifensine) not only increases DAT levels in mesolimbic regions (Jiao et al., 2006) but also facilitates extinction learning (in the present report) and reduces anxiety-like behavior in the OFT (Tejani-Butt et al., 2003), further supporting the view that DA plays a critical role in modulating emotion and responses associated with aversive stimuli.

Nucleus accumbens, a heavily DA-innervated limbic area, is another region of interest involved in BUP-associated effects because of its role in motivated behavior and emotion (Koob, 1992; Ahn and Phillips, 2007). Higher DA turnover rate and receptor binding combined with lower DAT binding in the NAc leads to elevated DA activity in the NAc in WKY rats (Jiao et al., 2003; De La Garza and Mahoney, 2004; Novick et al., 2008; Scholl et al., 2010), and this condition is often associated with increased emotionality and greater avoidance responding (Ikemoto and Panksepp, 1999). In the present study, BUP administration accelerated extinction in WKY rats, supporting a positive DA involvement in extinction learning. Moreover, PFC DA is important for cognitive processes such as decision-making and avoidance, but PFC has very low DAT distribution in rats and the reuptake of DA in this region mainly relies on NET (Wayment et al., 2001; Moron et al., 2002).

Therefore, both DES and BUP may elicit similar effects (i.e., increased synaptic DA and NE levels) within PFC, which is a possible mechanism underlying their similar effects on extinction in WKY rats.

The role of 5-HT in avoidance is less clear. Earlier pharmacological studies using one-way avoidance in a shuttle box demonstrated that increased 5-HT transmission is associated with a deficit in acquisition and retention and that decreased 5-HT leads to facilitated acquisition through the hippocampus and prefrontal cortex (Ogren, 1986a,b). However, pharmacological agents that facilitate serotonin transmission either impaired passive avoidance and facilitated its extinction (Shugalev et al., 2008) or had no effect on a two-way avoidance task (Sun et al., 2010) in rats. Moreover, chronic fluoxetine treatment reverses generalized avoidance in a mouse model of post-traumatic stress disorder (PTSD) (Pamplona et al., 2011), suggesting serotonergic agents modulate avoidance and its extinction via influencing the emotional response. Thus, serotonergic agents seem to be in a good position to alter behavioral abnormalities such as persistent avoidance. Most importantly, although SSRIs are the first-line medication to treat anxiety symptoms, they are found to be ineffective in many patients (Pollack et al., 2006, 2008). Similar to those SSRI-refractory cases, WKY rats do not respond to chronic SSRI treatment as measured by OFT or FST at baseline or following stress challenge (Sanchez and Meier, 1997; Durand et al., 1999; Lopez-Rubalcava and Lucki, 2000; Tejani-Butt et al., 2003; Rosenzweig-Lipson et al., 2007). Here, we observed that PAR facilitated within-session extinction learning in WKY rats only in the mid extinction sessions; however, both BUP and DES facilitated within-session extinction in early, mid, and late extinction. Moreover, WKY rats treated with PAR appeared to resume avoidance responses in extinction sessions in which the drug was not on board. Therefore, the WKY rat may be a useful model for SSRI-resistant anxiety.

We also found that PAR, at a dose that is effective in reducing stress-induced abnormalities in OFT/FST in SD rats (Tejani-Butt et al., 2003), did not change avoidance responding in SD rats in the present study. This discrepancy could be due to the different behavioral procedures and paradigms used in previous studies and the present study. Previously, relatively short behavioral tests such as OFT and FST were used to evaluate emotional response following various durations of stress (i.e., acute versus 7 days to weeks of chronic stress). Here, we trained rats to acquire lever-press avoidance using foot-shock for 12 sessions, and each session lasted for over an hour depending on performance. This paradigm allows the development of effective coping mechanisms in normal rat strains but promotes avoidance perseveration in rats that are vulnerable to stress, such as the WKY strain (Jiao et al., 2011a). Therefore, the lack of effect in SD rats following PAR treatment here may be explained by normal coping behaviors being more resilient to pharmacological intervention due to homeostasis in brain neurochemistry. However, the possibility that a higher dose may have facilitated avoidance extinction cannot be ruled out since only a single dose was tested in the present study. Thus, our findings of the distinctive roles of monoamines in avoidance behavior will, hopefully, shed light on the neurochemical mechanisms underlying anxiety disorders.

Anticipated responses are often associated with fear and anxiety disorders in humans and experimental animals (Conrod, 2006; Bailey and Crawley, 2009; Straube et al., 2009). However, whether and how this behavioral feature responds to pharmacological manipulation has not been thoroughly studied. Consistent with our previous report (Perrotti et al., 2013), here we found that WKY rats exhibited more ARs during acquisition compared to SD rats, suggesting a positive relationship between ARs and avoidance-prone behavior. However, this strain difference disappeared during extinction in CTL-treated WKY and SD rats, suggesting ARs may be labile depending on environmental factors (i.e., both foot-shock and the flashing light were removed during extinction). In addition, none of the agents significantly altered ARs in drugged groups compared to CTL groups, regardless of strain. The present data provide little support for associating ARs with avoidance perseveration. However, evaluating ARs in anxiety is beyond the scope of this study since we only measured lever-presses as the main response to evaluate AR. Physiological and autonomic responses such as skin conductance and heart rate may be more appropriate to study anticipatory responding. In the future, these measurements may be used to better characterize pharmacological effects on ARs.

Lastly, we believe that the effects of the agents used in the present study are due to chronic pharmacodynamic changes at transporter and receptor levels instead of neurochemical concentration changes at the synaptic level. Given that all three agents have relatively short half-lives in rat brain tissue, from 5 h (PAR) to 8 h (DES) (Suckow et al., 1986; Caccia et al., 1993; Cox et al., 2011), we treated animals over 12 h before the start of the first post-treatment extinction session. Moreover, other post-treatment extinction sessions occurred days and weeks after the last administration, which provides sufficient clearance to elucidate non-drug effects on extinction. Further examination of neuronal activation in limbic regions will provide direct evidence illustrating how these agents affect avoidance behavior in both strains. On the other hand, only one dosage for each agent was used in this study. Although SSRIs have a relatively flat dose-response curve in treating social anxiety disorder and fixed doses of SSRIs have been used as a standard treatment strategy, clinical evidence suggests that optimal effect may be obtained with higher doses of SSRI (van der Linden et al., 2000; Baker et al., 2003; Lader et al., 2004). Thus, the lack of effectiveness of PAR in extinction training may reflect an insufficient dose used in WKY rats.
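To make the clearance argument concrete, the fraction of drug remaining after a given delay can be approximated with a first-order decay formula. This sketch assumes simple single-compartment, first-order elimination, which the study itself does not claim; the function name is hypothetical, the half-lives are the 5 h and 8 h bounds quoted above, and the 48-h delay is an illustrative value for a session two days after the last injection.

```python
def fraction_remaining(hours_elapsed: float, half_life_h: float) -> float:
    """Fraction of the initial drug level left after hours_elapsed (first-order decay)."""
    if hours_elapsed < 0 or half_life_h <= 0:
        raise ValueError("hours_elapsed must be >= 0 and half_life_h must be > 0")
    return 0.5 ** (hours_elapsed / half_life_h)


# Half-life bounds from the text; delays are illustrative.
for label, half_life in (("5-h half-life", 5.0), ("8-h half-life", 8.0)):
    for delay in (12.0, 48.0):
        remaining = fraction_remaining(delay, half_life)
        print(f"{label}, {delay:.0f} h after dosing: {remaining:.1%} remaining")
```

Under these assumptions, roughly 19% (5-h half-life) to 35% (8-h half-life) of the drug would remain 12 h after dosing, and less than 2% after 48 h, which is in line with the point that sessions held days after the last injection were effectively drug-free.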

In summary, this study examined the effects of three classes of psychotropic agents commonly used in treating anxiety and depression-like symptoms in humans on extinction of a lever-press active-avoidance task in rats. Given the behavioral, neurochemical, and pharmacological features demonstrated in the WKY rat, NET and DAT inhibitors were more effective in facilitating extinction of avoidance behaviors, whereas the SSRI was the least effective. Thus, the WKY rat could be used as a powerful tool to examine novel treatments targeting anxiety symptoms in patient populations that are resistant to conventional SSRI treatment. Similar to the enhanced prevalence of anxiety disorders in females (Pigott, 2003), we have reported that female SD rats are more sensitive to learning avoidance than male SD rats, while female and male WKY rats learn avoidance to similar degrees (Beck et al., 2011). It would be important to assess the effects of monoaminergic drugs on female rats in the future.

### **ACKNOWLEDGMENTS**

This study was supported by research funds from the Biomedical Laboratory Research and Development Service of the VA Office of Research Development (I01BX000218 and I01BX007080), NIH (NS044373), and the Stress and Motivated Behavior Institute.

### **REFERENCES**


the mesolimbic and nigrostriatal dopamine systems. *Brain Res.* 303, 319–329. doi:10.1016/0006-8993(84)91218-6


Pare, W. P. (1992b). The performance of WKY rats on three tests of emotional behavior. *Physiol. Behav.* 51, 1051–1056. doi:10.1016/0031-9384(92)90091-F


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 June 2014; accepted: 29 August 2014; published online: 15 September 2014. Citation: Jiao X, Beck KD, Stewart AL, Smith IM, Myers CE, Servatius RJ and Pang KCH (2014) Effects of psychotropic agents on extinction of lever-press avoidance in a rat model of anxiety vulnerability. Front. Behav. Neurosci. 8:322. doi: 10.3389/fnbeh.2014.00322*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Jiao, Beck, Stewart, Smith, Myers, Servatius and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### ITI-signals and prelimbic cortex facilitate avoidance acquisition and reduce avoidance latencies, respectively, in male WKY rats

### **Kevin D. Beck 1,2,3\*, Xilu Jiao1,2,4, Ian M. Smith1,4, Catherine E. Myers 1,2,3, Kevin C. H. Pang1,2,3 and Richard J. Servatius 1,2,3**

<sup>1</sup> Neurobehavioral Research Laboratory, VA New Jersey Health Care System, East Orange, NJ, USA

<sup>2</sup> Stress and Motivated Behavior Institute, Rutgers – New Jersey Medical School, Rutgers Biomedical and Health Sciences, Rutgers – The State University of New Jersey, East Orange, NJ, USA

<sup>3</sup> Department of Neurology and Neurosciences, Rutgers – New Jersey Medical School, Rutgers Biomedical and Health Sciences, Rutgers – The State University of New Jersey, Newark, NJ, USA

<sup>4</sup> Veterans Biomedical Research Institute, East Orange, NJ, USA

### **Edited by:**

Gregory J. Quirk, University of Puerto Rico, USA

### **Reviewed by:**

Silvia Middei, Consiglio Nazionale delle Ricerche, Italy

Christian Bravo-Rivera, University of Puerto Rico School of Medicine, Puerto Rico

### **\*Correspondence:**

Kevin D. Beck, Neurobehavioral Research Laboratory, East Orange VA Medical Center, 385 Tremont Avenue, East Orange, NJ 07018, USA e-mail: kevin.d.beck@njms.rutgers.edu

As a model of anxiety disorder vulnerability, male Wistar-Kyoto (WKY) rats acquire lever-press avoidance behavior more readily than outbred Sprague-Dawley rats, and their acquisition is enhanced by the presence of a discrete signal presented during the inter-trial intervals (ITIs), suggesting that it is perceived as a safety signal. A series of experiments was conducted to determine if this is the case. Additional experiments investigated if the avoidance facilitation relies upon processing through medial prefrontal cortex (mPFC). The results suggest that the ITI-signal facilitates acquisition during the early stages of the avoidance acquisition process, when the rats are initially acquiring escape behavior and then transitioning to avoidance behavior. Post-avoidance introduction of the visual ITI-signal into other associative learning tasks failed to confirm that the visual stimulus had acquired the properties of a conditioned inhibitor. Shortening the signal from the entirety of the 3 min ITI to only the first 5 s of the 3 min ITI slowed acquisition during the first four sessions, suggesting the flashing light (FL) is not functioning as a feedback signal. The prelimbic (PL) cortex showed greater activation during the period of training when the transition from escape responding to avoidance responding occurs. Only combined PL + infralimbic cortex lesions modestly slowed avoidance acquisition, but PL-cortex lesions slowed avoidance response latencies. Thus, the FL ITI-signal is not likely perceived as a safety signal nor is it serving as a feedback signal. The functional role of the PL-cortex appears to be to increase the drive toward responding to the threat of the warning signal. Hence, avoidance susceptibility displayed by male WKY rats may be driven, in part, both by external stimuli (ITI signal) and by enhanced threat recognition to the warning signal via the PL cortex.

**Keywords: prelimbic cortex, infralimbic cortex, lever-press avoidance, safety signals, conditioned inhibitor, anxiety vulnerability**

### **INTRODUCTION**

Anxiety disorders are a product of experience and underlying vulnerabilities (Merikangas et al., 1999). Since avoidance is a prime symptom of all anxiety disorders, avoidance susceptibility can be considered a vulnerability factor for maladaptive coping and anxiety disorder development (Kashdan et al., 2006). The underlying source of avoidance susceptibility is unknown, but it may involve inherent differences in the perception of threat versus safety. Some avoidance models have utilized discrete stimulus cues to represent oncoming noxious stimuli (threat) and/or periods when aversive stimuli are never present (i.e., safety). Individuals with anxiety disorders commonly do not react to signals associated with safety in the same manner as controls (Rachman, 1984; Grillon, 2002; Schmidt et al., 2006; Lohr et al., 2007; Jovanovic et al., 2010), and regions of prefrontal cortex that have been implicated in the perception of threat versus safety in animals may also be involved in the expression of anxiety disorders (Schiller et al., 2008). Therefore, a model system that can show both prefrontal cortex activation and threat-signal and/or safety-signal influences upon the acquisition of avoidance behavior would be advantageous in order to gain a greater understanding of potential sources of anxiety vulnerability.

Male Wistar-Kyoto (WKY) rats exhibit facilitated acquisition of lever-press avoidance when there is a flashing light (FL) presented during the non-shock inter-trial intervals (ITIs); this does not appear to be the case for female WKY rats or Sprague-Dawley (SD) rats of either sex (Beck et al., 2011). Others have documented strain (Powell, 1972; Sutterer et al., 1981; Berger and Starzec, 1988; Overstreet et al., 1990; Escorihuela et al., 1995; Blizard and Adams, 2002; Brush, 2003; Servatius et al., 2008) and sex (Beatty and Beatty, 1970; Gray and Lalljee, 1974; Archer, 1975; Van Oyen et al., 1981; Steenbergen et al., 1990; Heinsbroek et al., 1991; Díaz-Véliz et al., 2000; Beck et al., 2010) differences in avoidance susceptibility; however, WKY rats are unique in that they exhibit qualities of behavioral inhibition (low exploration of novel spaces and stimuli), yet they also exhibit rapid acquisition of active-avoidance behavior, which becomes resistant to extinction (Pare, 1989, 1994a,b, 2000; Servatius et al., 2008; McAuley et al., 2009; Beck et al., 2011; Jiao et al., 2011). This paradoxical combination of behaviorally inhibited temperament and facilitated avoidance acquisition could be due to an added sensitivity to stimuli that predict safety, not just those that predict threat.

The medial prefrontal cortex (mPFC) has a significant role in the acquisition of avoidance behavior in animals (Gabriel and Orona, 1982; Sparenborg and Gabriel, 1990; Shibata, 1993; Kubota et al., 1996; Joel et al., 1997). The infralimbic (IL) cortex region of the mPFC is specifically implicated in the acquisition of two-way shuttle avoidance, by inhibiting the reflexive freezing response that conflicts with running to the safe-side of the apparatus (Moscarello and LeDoux, 2013). Thus, *failure* to exhibit avoidance can be a product of excessive freezing, caused by an overactive central amygdala and/or underactive IL cortex (Choi et al., 2010; Lazaro-Munoz et al., 2010; Martinez et al., 2013). However, the role of prefrontal areas in *increased susceptibility* to acquire active-avoidance behavior (i.e., facilitated avoidance learning) has not been elucidated. Making such determinations is important because avoidance behavior alone is not pathological, but it is the overexpression of avoidance that is pathological.

In a series of six experiments, we sought to determine the role a discrete FL ITI signal has in facilitating active-avoidance learning in WKY rats and the potential neurobiological role of the mPFC in that process. Based on our prior findings (Beck et al., 2011), it appears that the FL ITI-signal facilitates male WKY rat avoidance acquisition in the early phase of training. Hence, our initial experiment was to test whether removing or introducing the ITI-signal mid-acquisition affected overall acquisition in male WKY rats. Next, we conducted two tests of proactive interference to assess the possibility that the FL ITI-signal is perceived as a safety signal. A safety signal in animal behavior is operationalized as a conditioned inhibitor of fear (Rescorla and Lolordo, 1965; Moscovitch and Lolordo, 1968; Rescorla, 1969a); therefore, a safety signal is a stimulus that has acquired certain properties in the animal. If WKY rats perceive the FL ITI-signal as a conditioned inhibitor, then having that same FL serve as a key feature during the training of an unrelated behavior should cause interference (retardation) of the acquisition rate of that newly acquired behavior (Rescorla, 1969b). Similarly, as a conditioned inhibitor of fear reactions, if the ITI-signal is introduced in a separate fear-eliciting situation, its presence should reduce the magnitude of an elicited fear response (summation) (Rescorla, 1969b). Therefore, we examined whether the FL, post-avoidance training, would slow the acquisition of a new conditional response, where a similar FL predicts the occurrence of an unconditional stimulus (US), using eyeblink conditioning (retardation). This was combined with a parallel experiment to assess whether the avoidance ITI-signal is a conditioned inhibitor of fear using a summation test.
Here, we planned to use the avoidance warning signal as an inducer of a fear state using a fear-potentiated startle paradigm, then introduce a FL compound to determine if that state is reduced by the added presence of the light (as a conditioned inhibitor of fear). Following inconclusive results of the retardation/summation tests, we examined whether the ITI-signal served as a feedback signal for the male WKY rats; if so, then shortening the duration of the FL to only the first 5 s of the ITI should be sufficient to facilitate avoidance learning. This was not the case.

As stated above, the inhibition of fear during certain avoidance procedures has been linked to the IL cortex, whereas others have proposed that the dorsal prelimbic (PL) cortex increases threat detection. We used these distinctions to try to understand whether the IL or PL cortex serves a role in the acquisition of lever-press active avoidance in male WKY rats when an ITI-signal is present. First, we examined whether avoidance training with and without an ITI-signal causes differentially expressed neuronal activation (c-Fos expression) across the acquisition process. Again, we hypothesized ITI-induced differential activation of the vmPFC would be most apparent in the early sessions of acquisition. Specifically, the IL cortex of the mPFC should be more activated if there is active processing of "safety," whereas the more dorsal PL cortex should be more activated if there is a significant difference in the perception of threat. This was followed by an experiment where either or both the IL and PL cortices were lesioned prior to avoidance training. The expectation was that IL cortex lesions would slow acquisition of lever-press avoidance responding if there is an important role for conditioned inhibition of fear, whereas PL-cortex lesions would slow acquisition if the PL cortex is specifically required to perceive threat during the acquisition of lever-press avoidance. Finally, if there needs to be a comparison of safety versus threat in order to acquire the lever-press avoidance behavior, combined lesions may be required to slow acquisition. In sum, these experiments were designed to try to elucidate the function of the ITI-signal in this learning paradigm for the male WKY rats, as well as determine the functional role the mPFC may have in those processes.

### **MATERIALS AND METHODS**

### **SUBJECTS**

Two hundred fifty-six male WKY rats (2–3 months of age upon arrival) were obtained from Harlan Labs (Indianapolis, IN) to serve in one of six possible experiments. Upon arrival, all subjects were maintained on a 12:12 light:dark cycle (lights on 07:00) and had free access to food and water while in the homecages. Room temperatures were maintained in the acceptable ranges as set forth by the NIH Guide for the Care and Use of Animals. Behavioral training occurred at least 14 days post-arrival. In the case of the surgical procedure required prior to eyeblink conditioning (Experiment 2), animals were first trained in lever-press avoidance, and were subjected to the EMG-electrode implantation surgery shortly thereafter (within 1 week of the last session). They were then tested 1 week following the surgery in eyeblink conditioning. All procedures were approved by the VA New Jersey Health Care System Institutional Animal Care and Use Committee, in accordance with *The NIH Guide for the Care and Use of Animals*.

### **AVOIDANCE LEARNING**

Rats were trained in discrete lever-press avoidance behavior for varying durations (ranging from 1 session to 12 sessions, depending on the experiment). To accomplish this, Coulbourn Instruments (Allentown, PA, USA) operant chambers, containing a grid floor, a lever, a white light, and a speaker, were used in conjunction with Graphic State software. The software controlled the stimulus states in the chamber and recorded responses upon the bar within those designated states.

The same parameters previously reported to elicit differences in the acquisition of the avoidance lever-press behavior in WKY rats (Beck et al., 2011) were used for these experiments. Each lever-press avoidance-training session was separated by 1–2 days, with 20 trials conducted per session. For each session, rats were placed in the operant chambers, and, following an initial 60 s non-stimulus period, were exposed to a 60 s warning signal (1 kHz frequency tone at 75 dB(A) intensity). Following the initial 60 s of the warning signal, intermittent (every 3 s), scrambled shocks (1.0 mA and 0.5 s in duration) were applied to the grid floor. Depressing a lever located on one wall in the test chamber ceased shock presentation (i.e., an escape response). If the lever was depressed in the 60 s period preceding a trial's first footshock, the shock was avoided. Following each trial, there was a 3 min ITI, when a white cue light, located 10 cm directly above the lever, flashed at a 5 Hz rate (80 lux) for those subjects assigned to the ITI-signal condition. This signal was presented for the entire 3 min ITI, only the first 5 s of the ITI (only Experiment 4), or not at all; however, at no time were shocks administered during the ITI, regardless of the presence/absence of the FL. On the wall opposite the lever was a mounted house-light that provided a low baseline level of luminance (approximately 40–50 lux), enough for the experimenter to observe the rats when the ITI-signal was not flashing.
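
The trial contingencies described above can be sketched in code. This is a minimal illustration only, not the Graphic State protocol used in the study: it classifies a trial from the latency of the first lever press after warning-signal onset. The names `classify_trial`, `TrialResult`, and `session_avoidances` are hypothetical, and the sketch assumes that the first shock arrives exactly 60 s after tone onset, with later shocks every 3 s until the press.

```python
from dataclasses import dataclass
from typing import List

WARNING_ONLY_S = 60.0   # tone-alone period before the first shock (from the text)
SHOCK_INTERVAL_S = 3.0  # interval between shocks once they begin (from the text)


@dataclass
class TrialResult:
    outcome: str          # "avoidance" or "escape"
    shocks_received: int  # shocks delivered before the lever press


def classify_trial(press_latency_s: float) -> TrialResult:
    """Classify one trial from the first lever-press latency after tone onset."""
    if press_latency_s < 0:
        raise ValueError("latency must be non-negative")
    if press_latency_s < WARNING_ONLY_S:
        # Press during the warning period: the shock is avoided entirely.
        return TrialResult("avoidance", 0)
    # Assumption: shocks occur at 60 s, 63 s, 66 s, ... until the press.
    shocks = int((press_latency_s - WARNING_ONLY_S) // SHOCK_INTERVAL_S) + 1
    return TrialResult("escape", shocks)


def session_avoidances(latencies_s: List[float]) -> int:
    """Count avoidance responses in one session (20 trials in the study)."""
    return sum(classify_trial(t).outcome == "avoidance" for t in latencies_s)
```

For example, under these assumptions a press at 12.5 s would be scored as an avoidance response, whereas a press at 66.2 s would be scored as an escape response after three shocks.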

### **EYEBLINK CONDITIONING – RETARDATION TEST**

In Experiment 2, rats were trained in avoidance behavior for 12 sessions prior to the implantation of the necessary electrodes for eyeblink conditioning. Under surgical anesthesia, EMG electrodes were implanted into the orbicularis oculi, and the associated headstages were fixed with acrylic to the surface of the skull (Servatius, 2000). One week following surgery, each rat was first tested for signal quality while habituating to the test chamber (day 1). For the next 2 days, all rats were exposed to eyeblink conditioning with a 500 ms 82 dB(A) white-noise CS and a 10 ms 10 V eyelid muscle stimulation unconditioned stimulus (US). Every 10 trials comprised a trial-block, which included 1 CS-alone trial, 1 US-alone trial, and 8 CS-US pairings where the CS coterminated with the US (Servatius, 2000; Servatius et al., 2001; Servatius and Beck, 2003). There were 10 trial-blocks per daily session. For the retardation test, we added an 80 lux 5 Hz FL (at approximately the same height above the floor as in the avoidance chambers) that signaled a US-containing trial. The light flashed for 5 s immediately prior to the CS (on paired trials) and the US on US-alone trials. As such, the FL could be learned as an occasion-setter (OS) for the US (in addition to the acoustic CS) or as a primary CS with a 500 ms trace-interval. We also included a condition where a 1 kHz tone was presented as the OS for 5 s.
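
The trial-block structure above implies that the occasion setter predicts the US more reliably than the acoustic CS does. The sketch below, which is illustrative and not the authors' software, builds one daily session of 10 trial-blocks and computes both conditional probabilities. The function names are hypothetical, and the random ordering of trials within a block is an assumption, since the text does not specify the order.

```python
import random
from typing import Dict, List


def build_trial_block(rng: random.Random, occasion_setter: bool = True) -> List[Dict]:
    """One 10-trial block: 8 CS-US pairings, 1 CS-alone, 1 US-alone."""
    kinds = ["CS-US"] * 8 + ["CS-alone", "US-alone"]
    rng.shuffle(kinds)  # assumption: trial order within a block is random
    return [
        {
            "kind": k,
            "cs": k != "US-alone",
            "us": k != "CS-alone",
            # The 5 s occasion setter precedes only US-containing trials.
            "os": occasion_setter and k != "CS-alone",
        }
        for k in kinds
    ]


def build_session(seed: int = 0, n_blocks: int = 10) -> List[Dict]:
    """One daily session of n_blocks trial-blocks (10 in the study)."""
    rng = random.Random(seed)
    return [t for _ in range(n_blocks) for t in build_trial_block(rng)]


def p_us_given(trials: List[Dict], cue: str) -> float:
    """Proportion of trials containing the cue that also contain the US."""
    with_cue = [t for t in trials if t[cue]]
    return sum(t["us"] for t in with_cue) / len(with_cue)


session = build_session()
print(len(session))               # 100 trials per session
print(p_us_given(session, "os"))  # 1.0: every OS trial contains the US
print(p_us_given(session, "cs"))  # 8/9: CS-alone trials lack the US
```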

### **STARTLE REACTIVITY – SUMMATION TEST**

In Experiment 3, rats were exposed to 60 startle test trials, once prior to avoidance training and once following avoidance training (2 days following the last training session). Following avoidance training, it was expected that a tone similar to the avoidance warning signal, preceding the startle pulse, would increase startle reactivity, whereas a co-occurring FL (similar to the ITI signal from avoidance) would reduce that potentiation (Davis and Astrachan, 1978; Hitchcock and Davis, 1987; Grillon et al., 1991). For the startle tests, all rats were given 5 min to acclimate to the testing chamber prior to the initiation of the startle trials. In this test protocol, four trial types were presented in a pseudorandom order, such that no trial type occurred more than twice within each six trials. The trial types were the following: startle-pulse alone, tone/startle pulse, FL/startle pulse, and tone + FL/startle pulse. Each white-noise startle-pulse stimulus was 100 ms in duration with a 5 ms rise/fall. In the three trial types where there was a preceding stimulus, the preceding stimuli were presented for 5 s. The FL was produced by a similar wall-mounted bulb (as in the avoidance chambers) with a 5 Hz flash rate at 80 lux intensity. The 1000 Hz tone, at 75 dB(A) intensity, was produced from two speakers on the ceiling of the startle chamber. The 102 dB(A) startle pulse, produced from the same speakers, followed less than 0.5 s thereafter. Stimulus presentation and the collection of data from the weight displacement upon the accelerometers (Coulbourn Instruments, Allentown, PA, USA) were conducted through A/D conversion and a custom program written in LabVIEW (National Instruments Corp, Austin, TX, USA). For each startle stimulus presentation, a response threshold for whole body response was computed as the average rectified activity 200 ms prior to stimulus onset plus six times the SD of that rectified activity. Response amplitudes, the maximum rectified activity within 125 ms after stimulus onset, were only recorded when post-stimulus activity exceeded the response threshold. For trials in which activity did not reach this criterion, "not available" was recorded; for all others, the calculated value was corrected by each rat's body weight measured immediately post-testing. These methods for calculating startle reactivity are described in detail elsewhere (Servatius et al., 1994, 1995).
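
The response-scoring rule described above can be written out as a short sketch. It is not the custom LabVIEW program used in the study. The function name `startle_amplitude` is hypothetical, the sampling rate (1 sample per ms) is an assumption, the SD is computed as a population SD, and the body-weight correction is assumed to be a simple division, because the text does not state how the correction was applied.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence


def startle_amplitude(
    signal: Sequence[float],
    onset: int,
    samples_per_ms: int = 1,               # assumption: 1 kHz sampling
    body_weight_g: Optional[float] = None,
) -> Optional[float]:
    """Return the startle amplitude, or None when no response is scored."""
    n_base = 200 * samples_per_ms  # 200 ms pre-stimulus baseline
    n_resp = 125 * samples_per_ms  # 125 ms post-stimulus response window
    if onset < n_base or onset + n_resp > len(signal):
        raise ValueError("signal too short for baseline/response windows")
    rectified = [abs(x) for x in signal]
    baseline = rectified[onset - n_base:onset]
    response = rectified[onset:onset + n_resp]
    # Threshold: mean baseline activity plus six SDs of that activity.
    threshold = mean(baseline) + 6 * pstdev(baseline)
    peak = max(response)
    if peak <= threshold:
        return None  # scored as "not available"
    # Assumption: the body-weight correction is a simple division.
    return peak / body_weight_g if body_weight_g else peak
```

Returning `None` for sub-threshold trials keeps "not available" trials distinct from true zero-amplitude values, so that they can be excluded rather than averaged in.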

### **IMMUNOHISTOCHEMISTRY**

In Experiment 5, all subjects were randomly assigned to have their brains harvested following a specific training session. Ninety minutes following the assigned session, each rat was prepared for perfusion-fixation via an injection of 150 mg/kg sodium pentobarbital. Once deeply anesthetized, the rats were subjected to transcardial perfusion of 0.9% saline, followed by 10% buffered formalin. Brains were removed, post-fixed in 10% formalin at 4°C overnight, and placed in 30% (weight/volume) sucrose in 0.1 M phosphate buffer solution until the brains sank. A sliding microtome was utilized to slice coronal brain sections of the mPFC (4.20–2.53 mm anterior to Bregma), each with a thickness of 50 µm. All slices were stored in cryoprotectant (0.2 M phosphate buffer solution, glycerin, and ethylene glycol) at −20°C for approximately 4 months. As described elsewhere (Jiao et al., 2011), immunohistochemistry for c-Fos was conducted on every fourth mPFC brain section with rabbit anti-c-Fos antibody (1:1000, #sc-52, Santa Cruz Biotechnology, Dallas, TX, USA) for 18 h. Sections were then incubated in biotinylated donkey anti-rabbit secondary antibody (1:200, Jackson ImmunoResearch Laboratories, West Grove, PA, USA) solution for 3 h, followed by incubation in avidin-biotin complex at 4°C overnight (Vectastain Standard kit, Vector Laboratories, Burlingame, CA, USA). A chromogenic peroxidase oxidation–reduction reaction was performed utilizing nickel-enhanced DAB. Estimates of c-Fos immunoreactive nuclei were obtained using unbiased stereology procedures (Optical fractionator method, Stereo Investigator v. 9.0, MicroBrightField, Colchester, VT, USA). Volumes of the ACc, PL cortex, and IL cortex were also obtained to calculate the density of c-Fos immunoreactive cells. A Leica microscope with an *x*-, *y*-, and *z*-motorized stage was used. The counting frame had a consistent length-width-height dimension of 80 × 60 × 10. Cell counts and volume regions were counterbalanced across hemispheres: half of the rats were counted on the left hemisphere, while the remaining half were analyzed on the right hemisphere. Counts were performed by an individual blind to the treatment of each analyzed brain. Data are expressed as means of density (cell count/volume) per mPFC region per animal.

### **VENTROMEDIAL PREFRONTAL CORTEX LESIONS**

In Experiment 6, male WKY rats were randomly assigned to have lesions to the IL cortex, PL cortex, both PL + IL cortex, or a sham control condition (saline injection). Two to three weeks prior to avoidance acquisition, bilateral lesions to either or both the IL and PL cortex were administered under sodium pentobarbital anesthesia (50 mg/kg, i.p.). The surgical site was prepared and the rat was placed into the stereotaxic apparatus. The coordinates were A/P: +2.9, L: ±1.0 (PL) or 1.5 (IL). Hamilton microsyringes (30 gauge needle) were lowered on a 10° angle (PL) or 15° angle (IL). The lowering speed was 0.2 mm/min. Needles were lowered 4.0 mm for PL cortex and 4.5 mm for IL cortex (4.2 mm for combined). Ibotenic acid (5 mg/ml) was delivered in 0.2 µl volumes to single sites and 0.4 µl volumes for the combined PL + IL lesions. All rats recovered under daily administration of Banamine and fluids (as necessary).

### **DATA ANALYSIS**

Avoidance behavior training in Experiments 1, 4, and 6 was assessed for differences in the emission of lever-press avoidance responses with respect to between-session acquisition and within-session acquisition. Between-session analyses utilized mean session avoidance responses as the dependent measure over sessions, whereas within-session analyses utilized the mean percent of subjects emitting an avoidance response on each trial (collapsed over session blocks of two sessions each) as the dependent measure. A mixed analysis of variance (ANOVA), with session as the repeated measure, was the statistical model used for analysis of the former, and a mixed ANOVA with session block and trial serving as the repeated measures was used for the latter analysis. The between-session analysis provides an overall assessment of avoidance learning over the 4 weeks of acquisition, whereas the within-session analyses provide a means of assessing differences in acquisition within sessions.
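
The two dependent measures defined above can be illustrated with a small sketch. It is not the authors' analysis code and covers only how the measures are derived from trial-level data, not the ANOVAs themselves. The data layout, the function names, and the toy values in the usage note are hypothetical, and the number of sessions per block is left as a parameter.

```python
from typing import Dict, List

# data[subject][session][trial] is True when the rat avoided on that trial.
Data = Dict[str, List[List[bool]]]


def mean_avoidances_per_session(data: Data) -> List[float]:
    """Between-session measure: mean number of avoidances per session."""
    subjects = list(data.values())
    n_sessions = len(subjects[0])
    return [
        sum(sum(subj[s]) for subj in subjects) / len(subjects)
        for s in range(n_sessions)
    ]


def percent_avoiding_by_trial(data: Data, sessions_per_block: int = 2) -> List[List[float]]:
    """Within-session measure: percent of subjects avoiding on each trial,
    averaged over the sessions in each session block."""
    subjects = list(data.values())
    n_sessions = len(subjects[0])
    n_trials = len(subjects[0][0])
    blocks = []
    for start in range(0, n_sessions, sessions_per_block):
        sessions = range(start, min(start + sessions_per_block, n_sessions))
        block = []
        for t in range(n_trials):
            per_session = [
                100.0 * sum(subj[s][t] for subj in subjects) / len(subjects)
                for s in sessions
            ]
            block.append(sum(per_session) / len(per_session))
        blocks.append(block)
    return blocks
```

With two hypothetical rats, two sessions, and three trials, `{"A": [[False, False, True], [False, True, True]], "B": [[False, False, False], [True, True, True]]}` gives session means of `[0.5, 2.5]` and a single session block of `[25.0, 50.0, 75.0]` percent avoiding across trials.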

For Experiments 2 and 3 (retardation and summation), mixed designs were also required. A mixed ANOVA, with day and trial block as the repeated measures, was used to assess group differences in the acquisition of conditioned eyeblink responses post-avoidance learning. Similarly, a mixed ANOVA was used to assess group differences in startle magnitude for Experiment 3. All groups experienced all four trial types both pre- and post-avoidance training.

In Experiment 5, the brain analysis required rats to be sacrificed after 1, 2, 4, or 8 sessions of avoidance training. Thus, in order for the avoidance behavior measurement to parallel that of the c-Fos measurements, mean avoidance responses on the day of brain harvest were analyzed via a between-subjects ANOVA (no repeated measures). The c-Fos densities in the three regions of the mPFC (IL cortex, PL cortex, and anterior cingulate cortex) were each analyzed via a between-subjects ANOVA.

When ANOVAs in the above experiments were significant, specific group comparisons were conducted with Fisher's LSD multiple comparison test. The probability of making a Type I error was set at 0.05 for all levels of analysis.

### **RESULTS**

### **EXPERIMENT 1: AVOIDANCE ACQUISITION WITH ITI-SIGNAL SWITCH**

Thirty-two male WKY rats were randomly assigned to begin lever-press avoidance training with or without a FL ITI-signal for half of acquisition (initial condition, sessions 1–6); half of each group subsequently had that signal present or absent for sessions 7–12, thus creating four distinct groups (final condition). The hypothesis was that a safety signal would enhance the acquisition rate of the avoidant behavior. As shown in **Figure 1**, the expected difference in acquisition between those training with and without the ITI signal was replicated in the first half of training; WKY rats acquire more quickly when there is a FL presented during the ITIs. This impression was confirmed by a significant main effect of session, *F*(5, 110) = 20.7, *p* < 0.001 and a significant group × session interaction, *F*(5, 110) = 4.3, *p* < 0.001. However, through the second half of acquisition sessions, the group differences ceased to exist. To this end, only a main effect of session was detected, *F*(5, 100) = 5.0, *p* < 0.001. Additional analyses were conducted on the number of non-reinforced responses emitted during each min of the ITI. These failed to detect any difference in the amount of responding during the ITI that was attributable to the presence/absence of the FL during the ITI.

Within-session acquisition was assessed through two repeated measures ANOVAs, one analyzing the first half of training (shown in **Figures 1B,C**) and a second analyzing the second half of training (shown in **Figures 1D,E**). The first half of acquisition was analyzed using a 2 (initial condition) × 2 (session block) × 20 (trial) ANOVA. Each three consecutive sessions comprised a session block. This analysis yielded significant main effects of session block, *F*(1, 22) = 41.2, *p* < 0.001 and trial, *F*(19, 418) = 4.2, *p* < 0.001, complemented by an initial condition × session block interaction, *F*(1, 22) = 5.0, *p* < 0.03. The *post hoc* analyses found that these significant effects recapitulate the between-session analyses showing acquisition over sessions (with the ITI-signal group acquiring faster), and there is the added confirmation of within-session learning as well. However, there was no interaction between within-session learning (trial) and initial condition. The second phase of training was analyzed via a 2 (initial condition) × 2 (final condition) × 4 (session block) × 20 (trial) ANOVA. These analyses yielded significant main effects of session block, *F*(1, 20) = 8.1, *p* < 0.01 and trial, *F*(19, 380) = 1.8, *p* < 0.02, with an additional significant final condition × trial interaction, *F*(19, 80) = 1.7, *p* < 0.03. *Post hoc* analyses found that the groups with the ITI-signal in sessions 7–12 exhibited more avoidance responses in the later trials of those sessions (trials > 13). Thus, differences in within-session learning were evident across acquisition, with the differences due to the presence/absence of the ITI-signal being predominantly reflected in the latter trials within those sessions.

**sessions**. At the mid-point of training, following session 6, half of the subjects had their flashing light (FL) ITI-signal status switched. Shown in **(A)** are the mean avoidance responses per condition per session. The facilitation of lever-press avoidance learning by the presence of a FL ITI-signal is evident through session 6, with significant differences between the two initial conditions denoted by an asterisk (\*). In the latter half of training, there were no significant differences between the groups, regardless of the

percentage of subjects avoiding on each trial through the first (sessions 1–3) and second (sessions 4–6) session blocks, respectively. **(D)** (Sessions 7–9) and **(E)** (sessions 10–12) show the percentage of subjects avoiding after half of each group had an ITI-signal switch. This within-session analysis demonstrates that the presence/absence of the FL during the ITIs does not affect the between-session retention of the learning, as much as the within-session acquisition process.

The results of this experiment suggest that the greatest effect the ITI-signal has on acquisition of avoidance behavior is during those first few sessions after the transition from mostly escape responding to predominantly avoidance responding. Moreover, any ITI-signal associated differences in acquisition, within sessions, are reflected through differences in attaining asymptotic response levels.

### **EXPERIMENT 2: ITI-SIGNAL PROACTIVE INTERFERENCE TEST – RETARDATION**

Sixty-four male WKY rats were initially trained for 12 sessions of lever-press avoidance (data not shown). Seven rats were removed because they failed to acquire the avoidance behavior. The remaining 57 were subsequently trained in eyeblink conditioning to determine if the presentation of stimuli experienced in avoidance training would cause proactive interference for the acquisition of conditioned eyeblink responses (a retardation effect). Either the warning signal (tone) or the ITI signal (FL) from the avoidance training was presented 5 s prior to the introduction of conditional stimuli (CSs) and/or unconditional stimuli (USs). The CS-preceding stimulus (either tone or FL) was more predictive of the US than the CS (because there are CS-alone trials and US-alone trials). The expectation was that the presentation of the warning signal or ITI signal as an occasion setter for the US would slow eyeblink conditioning (i.e., retardation through proactive interference) because the stimulus will have already been associated with conditions from avoidance learning. A control group with no additional OS stimulus presentation was included to discern if the novel experience of the FL during eyeblink conditioning, as an occasion setter for the US, would facilitate acquisition above that of those without any occasion setter (the normal control condition). Of the 57 WKY rats trained in eyeblink conditioning, five additional rats were removed following analysis of the signal (poor signal quality).

As shown in **Figure 2**, all groups of rats emitted more conditioned responses over each session of both conditioning days. This impression was confirmed by significant main effects of day, *F*(1, 48) = 31.4, *p* < 0.001 and trial block, *F*(9, 432) = 9.3, *p* < 0.001. Still, it is clear that the four groups differed in their rate of conditioned response expression over training. This impression was confirmed by a significant group × trial-block interaction, *F*(27, 432) = 1.7, *p* < 0.02. *Post hoc* analyses confirmed that there were specific group differences in the acquisition of eyeblink conditioned responses over trial blocks each day. As predicted, the groups that had a FL occasion setter differed based on prior experience with the FL during avoidance training. Those that had previously experienced the FL, as an ITI signal in avoidance training, emitted significantly fewer conditioned eyeblinks in trial blocks 6, 7, 8, 9, and 10 than the group for whom the FL was a novel stimulus. Moreover, those rats with experience of the FL during avoidance emitted fewer conditioned responses than the no-OS control group on trial blocks 1, 2, 5, 6, and 10. The tone occasion setter also caused fewer conditioned eyeblinks, comparing the tone OS group to the no-occasion-setter group in trial blocks 2, 3, and 6. Thus, although the 5 s FL occasion setter did not appreciably facilitate learning of the conditioned response above that of the no-occasion-setter condition, rats that had prior experience with the FL were less likely to emit conditioned responses. This suggests acquisition of a second learned response to the ITI-signal, as well as the warning signal, is mildly retarded due to the prior exposure to those stimuli during avoidance learning (i.e., proactive interference).
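The degrees of freedom reported for this experiment can be cross-checked against the design. The sketch below is illustrative only: the function name `mixed_anova_df` is hypothetical, and it assumes 52 analyzed rats (57 trained minus 5 removed), four groups, two conditioning days, ten trial blocks per day, and no sphericity correction.

```python
def mixed_anova_df(n_subjects, n_groups, within_levels):
    """Degrees of freedom for a mixed ANOVA with one between-subjects factor.

    Returns (effect df, error df) for the within-subject main effect and for
    the group x within-subject interaction. Assumes no sphericity correction.
    """
    between_error = n_subjects - n_groups      # subjects-within-groups df
    within_df = within_levels - 1              # df of the repeated factor
    error_df = within_df * between_error       # shared within-subject error term
    return {
        "within": (within_df, error_df),
        "group_x_within": ((n_groups - 1) * within_df, error_df),
    }


# Assumed sample: 57 rats entered eyeblink conditioning, 5 were removed.
n_rats = 57 - 5

day = mixed_anova_df(n_rats, n_groups=4, within_levels=2)
block = mixed_anova_df(n_rats, n_groups=4, within_levels=10)

print(day["within"])            # (1, 48)
print(block["within"])          # (9, 432)
print(block["group_x_within"])  # (27, 432)
```

Under these assumptions the computed values agree with the reported *F*(1, 48), *F*(9, 432), and *F*(27, 432).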

### **EXPERIMENT 3: ITI-SIGNAL PROACTIVE INTERFERENCE TEST – SUMMATION**

Sixteen male WKY rats were matched on their baseline startle magnitudes on pulse-alone trials and randomly assigned to be trained in lever-press avoidance with or without a FL ITI-signal. The day following the 12th and final avoidance-training session, all rats were re-tested for startle reactivity. As with the pretest, there were four trial types: pulse alone, 5 s tone/pulse, 5 s FL/pulse, and 5 s FL + tone/pulse. Assuming the warning tone from the avoidance protocol would enhance startle reactivity post-acquisition (i.e., fear-potentiated startle), the FL was expected to dampen that enhancement if the FL acquired the properties of a safety signal (for those rats trained with the FL as the ITI-signal).

One rat from the ITI-signal-trained group was removed from the study, as it did not meet the requirements of having acquired the avoidance behavior, leaving seven ITI-signal-trained and eight non-ITI-signal-trained rats to be analyzed across both pre- and post-avoidance startle tests to determine whether the acquired properties of the tone and FL during avoidance learning can potentiate or dampen the subsequently elicited startle reflex. The resulting 2 (group) × 2 (session) × 4 (trial type) mixed ANOVA failed to detect any differences due to being trained in avoidance with or without the FL ITI-signal. Only main effects of session, *F*(1, 13) = 18.9, *p* < 0.001 and trial type, *F*(3, 39) = 47.5, *p* < 0.001 were evident. As evidenced in **Figure 3**, post-avoidance startle tests had significantly lower startle magnitudes, independent of avoidance-training ITI-signal group assignment. Moreover, preceding startle pulses with a tone equivalent to the avoidance warning signal reduced the magnitudes of the elicited startle responses. Unexpectedly, this was the case even prior to the avoidance training. Subsequent avoidance training with the tone as a warning signal did not change this pattern. These findings suggest the tone had startle-dampening properties that 12 avoidance acquisition sessions were not sufficient to overcome through eliciting a fear- or anxiety-potentiated startle response prior to the startle pulse. The FL alone had no discernible effect on the elicited startle response. Thus, for those rats trained with the FL ITI-signal, subsequent exposure to the FL prior to startle pulses does not reduce the vigilance of the rats.

### **EXPERIMENT 4: SHORTENED ITI-SIGNAL**

In order to determine whether the duration of the ITI-signal is a critical element for the facilitation of avoidance acquisition in male WKY rats, the duration of the FL was shortened to the first 5 s of the 3 min ITI. Sixteen male WKY rats were randomly assigned to be trained with either a shortened signal or no signal during the 3 min ITI. As shown in **Figure 4**, the groups differed in performance over the first four sessions of acquisition. This impression was confirmed by significant main effects of group, *F*(1, 14) = 5.7, *p* < 0.05 and session, *F*(11, 134) = 60.2, *p* < 0.001, as well as a significant group × session interaction, *F*(11, 154) = 2.0, *p* < 0.05. *Post hoc* analyses confirmed that the two groups differed in the percentage of avoidance responses emitted during sessions 1, 2, and 4, with the no-signal group emitting more avoidance responses during those three sessions.

that previously were trained in avoidance but did not have a FL during the

group and the tone occasion-setter group (p < 0.05 Fishers LSD).

post-avoidance test, regardless of being trained with or without a FL ITI signal. However, exposure to the 1 kHz tone reduced startle magnitudes approximately 50–60%. The FL did not appear to influence the magnitude of the elicited startle response.

### **EXPERIMENT 5: PREFRONTAL ACTIVATION DURING AVOIDANCE ACQUISITION**

Based upon past findings with fear conditioning and lever-press avoidance, the mPFC was expected to exhibit sub-region-specific responding during the acquisition process and based on the presence/absence of a FL ITI-signal. The anterior cingulate cortex was expected to exhibit greater activation once avoidance was acquired. The IL cortex was predicted to exhibit greater neuronal activation following perceived safety, whereas the PL cortex was predicted to exhibit greater activation following perceived threat. Thus, greater activation in IL cortex would support the theory that the ITI-signal is being processed as a safety signal, but greater PL-cortex activation would suggest greater perceived threat.

difference between groups for a particular session (p < 0.05, Fishers LSD).

Sixty-four male WKY rats were randomly selected to be sacrificed at different stages of lever-press avoidance acquisition, with the goal of determining the subregions of the mPFC that are most activated at each stage and whether the presence of the ITI-signal influences that neuronal activation. As shown in **Figure 5A**, acquisition of the lever-press escape-avoidance behavior occurred in both the ITI-signal and non-ITI-signal groups. These data were analyzed via a 2 (group) × 4 (session) between-subjects ANOVA, which yielded significant main effects of group, *F*(1, 54) = 5.5, *p* < 0.02, and session, *F*(3, 54) = 20.6, *p* < 0.001. *Post hoc* tests suggested the ITI-signal/no-signal groups differed from each other prior to session 4, with the non-signal group exhibiting less avoidance behavior. The general pattern of the ITI-signal groups exhibiting faster acquisition early in training was replicated. Further, differences in ITI-responding were assessed via a 2 (group) × 4 (session) between-subjects ANOVA. These analyses detected a significant main effect of session, *F*(3, 50) = 5.9, *p* < 0.002, but no effect of group. Overall, more non-reinforced ITI responses were emitted during the second session compared to all other sessions, regardless of signal condition (data not shown).

**FIGURE 5 | Brains were extracted from subjects, trained in the lever-press avoidance protocol with either the FL ITI-signal or no explicit ITI-signal (nFL), following a randomly assigned number of training sessions**. Shown in **(A)** is the percentage of avoidance responses emitted from the rats on the day they were sacrificed. **(B–D)** provide the density of c-Fos labeling in the anterior cingulate (AC) cortex, prelimbic (PL) cortex, and infralimbic (IL) cortex, respectively, for those subjects depicted in **(A)**. Behaviorally, the groups with a flashing light (FL) ITI-signal emitted more lever-press avoidance responses during sessions 1 (†) and 4 (\*) than the groups without an ITI-signal. Neurochemically, only the PL cortex exhibited significant differences between groups. Overall, the density of c-Fos labeling was significantly different between sessions 2 and 4, regardless of the ITI signal (‡).

Group × session (2 × 4) between-subjects analysis of variance was utilized to determine regional differences in c-Fos immunoreactivity density in anterior cingulate cortex, PL cortex, and IL cortex. The only significant difference was found in the c-Fos density in PL cortex (see **Figure 5C**). A main effect of session was found, *F*(3, 49) = 2.7, *p* < 0.05. Session 2 c-Fos densities were significantly different from session 4 c-Fos densities, regardless of group assignment. No significant differences were detected in anterior cingulate cortex (**Figure 5B**) or IL cortex (**Figure 5D**). This suggests mPFC is involved in the process of acquisition but its level of activation is not appreciably influenced by the presence/absence of an explicit ITI-signal, even when that ITI-signal appears to facilitate the transition from escape to avoidance responding.

### **EXPERIMENT 6: VENTROMEDIAL PREFRONTAL CORTEX LESION EFFECTS ON AVOIDANCE**

Sixty-four male WKY rats were subjected to excitotoxic lesions to the PL, IL, or combined PL and IL cortex. Confirmation of target site damage (see **Figure 6A**) required one IL, three PL, and eight PL + IL lesion rats to be dropped due to failure of bilateral lesions in both targets. In addition, two PL-lesion rats were dropped because they failed to acquire an escape response within five sessions. This yielded the following groups: sham (16), IL (15), PL (11), and PL + IL (8). The percentage of avoidance responses emitted by those rats was analyzed via a 4 (group) × 12 (session) mixed ANOVA. The complete analysis only produced a main effect of session, *F*(1, 11) = 60.8, *p* < 0.001. As observed in **Figure 6B**, all four groups exhibited a significant increase in lever-press avoidance responding over the 12 acquisition sessions. Hence, it is clear that neither the PL nor the IL cortex is necessary for the acquisition of lever-press avoidance behavior in male WKY rats. Still, there appears to be a general trend for larger lesions (PL + IL) to somewhat slow the acquisition process. Given this observation, we removed the PL cortex-alone and IL cortex-alone groups from the analysis to assess whether a vmPFC lesion (PL + IL cortex) significantly slows acquisition of lever-press avoidance behavior. This analysis yielded significant main effects of group, *F*(1, 22) = 8.7, *p* < 0.01 and session, *F*(11, 241) = 28.3, *p* < 0.001. These results suggest more extensive bilateral damage to the vmPFC slows acquisition, but avoidance acquisition is still quite significant over sessions. Thus, the vmPFC is not necessary for lever-press avoidance in male WKY rats, but its actions may contribute to the rapid acquisition and higher asymptotic performance levels normally displayed in male WKY rats.

As above, we also conducted within-session analyses to determine if the lesions affected one of the most prominent features of lever-press avoidance learning in WKY rats – the absence of avoidance warm-up. Acquisition was separated into three phases (session blocks) for the within-session analyses. Thus, a 4 (group) × 3 (session block) × 20 (trial) mixed ANOVA determined differences in the percentage of subjects that emitted avoidance responses on specific trials within the session blocks (early acquisition, mid-acquisition, and late acquisition). As evidenced in **Figure 6C**, the session differences in **Figure 6B** predominantly reflect differences in within-session acquisition. This impression is supported by significant main effects of session block, *F*(2, 92) = 116.1, *p* < 0.001 and trial, *F*(19, 874) = 2.8, *p* < 0.001, as well as a significant session block × trial interaction, *F*(38, 1748) = 2.5, *p* < 0.001. Despite the obvious difference in the PL + IL lesion condition, a main effect of group and a group × trial interaction failed to attain significance (*p* = 0.09 and *p* = 0.08, respectively).
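The degrees of freedom reported for this within-session analysis can be checked by hand. The worked arithmetic below is a sketch that assumes all 50 retained rats (16 sham, 15 IL, 11 PL, and 8 PL + IL) contributed to the analysis and that no sphericity correction was applied; the symbols $N$ (rats), $g$ (groups), $b$ (session blocks), and $t$ (trials) are introduced here for illustration only.

$$
\begin{aligned}
df_{\text{subjects within groups}} &= N - g = 50 - 4 = 46 \\
df_{\text{session block}} &= b - 1 = 2, &\quad df_{\text{error}} &= 2 \times 46 = 92 \\
df_{\text{trial}} &= t - 1 = 19, &\quad df_{\text{error}} &= 19 \times 46 = 874 \\
df_{\text{session block} \times \text{trial}} &= 2 \times 19 = 38, &\quad df_{\text{error}} &= 38 \times 46 = 1748
\end{aligned}
$$

Under these assumptions the values agree with the reported *F*(2, 92), *F*(19, 874), and *F*(38, 1748).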

Across-session analyses for detecting early trial warm-up versus first-trial avoidance failed to detect any significant differences that would be indicative of warm-up at the beginning of either session block 2 or 3.

A within-subjects analysis was also conducted on the emission of non-reinforced lever-presses during each of the 3 min of the ITI to determine whether lesions of the mPFC influence those responses. Thus, a 4 (group) × 3 (session block) × 3 (ITI Min) mixed ANOVA was used to determine group differences across acquisition, as well as across the 3 min of the ITI. There was only a significant main effect of ITI Min, *F*(2, 92) = 265.6, *p* < 0.001. Over acquisition, the mean number of lever-presses emitted in each of the ITI minutes was 2.0 ± 0.05, 0.7 ± 0.03, and 0.6 ± 0.03, respectively. There were no significant effects of group or session block (all *p*'s > 0.08).

Finally, we assessed whether the lesions may have affected the timing of the avoidance responses. A group × session mixed ANOVA determined that there was the expected main effect of session, *F*(11, 396) = 9.4, *p* < 0.001, as latencies of the avoidance responses decreased over sessions (see **Figure 7**). However, an additional significant group × session interaction, *F*(33, 396) = 2.0, *p* < 0.005 suggested that there was a differential decrease among the lesion groups. The *post hoc* analyses found the combined PL + IL lesion group was significantly different from the Sham-lesion group for the fourth session, but the PL-lesion group was significantly different from the Sham-lesion group for sessions 4, 6, 7, and 8. The IL-lesion group was never significantly different from the Sham-lesion group.

### **DISCUSSION**

The use of explicit ITI signals has occurred in active-avoidance learning for over 40 years (Dillow et al., 1972; Berger and Brush, 1975; Berger and Starzec, 1988; Brennan et al., 2003); yet, rarely has the actual neurobehavioral role of extra stimuli been specifically studied. Previously, we established that male WKY rats acquire lever-press active-avoidance behavior quicker than male SD rats, and that this facilitation was completely attributable to the presence/absence of a FL during the 3-min ITIs that followed each trial (Beck et al., 2011). Here, we established that the FL ITI-signal appears to acquire some mnemonic properties in the male WKY rats, as evidenced by the mild proactive interference it causes when it is introduced into another learning paradigm following avoidance training (i.e., a mild retardation effect on eyeblink conditioning). However, the enhancing effect of the ITI-signal has a temporal component. When the duration of the signal was reduced to the first 5 s of the ITI, male WKY rats were slower to acquire lever-press avoidance compared to their non-signaled counterparts. This temporal element to the ITI-signal could be viewed as support for its role as a safety signal, but higher c-Fos activation of the IL cortex would be expected for a conditioned inhibitor of fear, which was not observed. Instead, a significant change in c-Fos activation was observed in the PL cortex between sessions 2 and 4, the period where the rats transition from escape responding to avoidance responding. The first 4–6 sessions are also the period where the presence of the FL during the ITI could facilitate or slow acquisition (depending on duration of exposure). Yet, lesions to either the PL or IL cortex did not appreciably slow acquisition of the lever-press avoidance behavior, but lesions to both regions did cause slower acquisition compared to sham controls. One interpretation of this additive effect is that either a cortical evaluation that inhibits fear (IL) or one that enhances perceived threat (PL) is capable of supporting rapid acquisition in male WKY rats, but when neither cortical evaluation is functional, the acquisition is slowed (although still clearly evident). In all, these data suggest the mPFC can modulate the acquisition of lever-press avoidance in male WKY rats, and a FL, which lasts the duration of the ITI, acquires associative properties and enhances avoidance acquisition; yet the mPFC neuronal activation and the critical period for its effectiveness do not necessarily support its role as a conditioned inhibitor of fear (i.e., safety signal).

**FIGURE 7 | The change in avoidance response latency was affected by the lesion in male WKY rats**. As denoted by the asterisk (\*), the PL-cortex lesion group has slower avoidance latencies beginning with session 4, the session where equal or more avoidance responses are typically emitted (Fisher's LSD, p < 0.05). Thus, the shorter latencies prior to session 4 reflect far fewer avoidance responses being represented in the calculation; the more representative mean values for the latencies are then longer through session 8. The cross (†) represents the combined PL + IL lesion group being significantly different from the Sham-lesion condition during session 4 only.

The IL cortex, not PL cortex, has been shown to be positioned in the cortico-limbic network of emotional/motivational responsiveness as a primary inhibitor of amygdala-dependent fear reactions (e.g., conditioned freezing) (Quirk et al., 2006). Recent work has expanded the role of the IL cortex to a necessary structure for shuttle-avoidance learning, in that conditioning to the warning signal can cause the rats to freeze, thus impeding their ability to run to the "safe" chamber (Moscarello and LeDoux, 2013). This finding complements those that have shown poor shuttle-avoidance learners can be "saved" by causing a lesion to the central amygdala, which releases the rats from emitting species-specific conditioned freezing responses (Choi et al., 2010). However, this competing freezing effect may be paradigm specific. For example, inactivation of IL cortex increases freezing in a step-up avoidance paradigm, but not to the extent that it impairs emitting the avoidance behavior (Bravo-Rivera et al., 2014). Still, IL cortex may be required for the extinction-learning that instills a lasting reduction in avoidance behavior (Bravo-Rivera et al., 2014). This was also recently observed in a fear/safety versus reward discrimination paradigm. In that paradigm, IL cortex inactivation did not affect the reduction of freezing to a compound fear-inducing CS with a safety signal; however, IL cortex inactivation increased freezing to both the CS and compound CS/safety signal during extinction recall (Sangha et al., 2014). In the current study, we did not observe any difference in IL c-Fos activation throughout acquisition of lever-press active-avoidance behavior, and lesions to the IL cortex did not appreciably affect acquisition of lever-press avoidance. The lack of difference over sessions and between ITI-signal/no ITI-signal male WKY rats suggests any role the IL cortex has in lever-press avoidance is not particularly sensitive to changes in stimulus perception or motor responding that occur over acquisition sessions. This further supports the growing literature suggesting the role for IL cortex in avoidance behavior may be paradigm specific. If a motor response needs to be inhibited (e.g., freezing) then IL cortex is quite important; however, if the paradigm is less sensitive to conflicting reflexive fear responses, then the IL cortex may not be required. Thus, the role of IL cortex may be specific to the processes involved in the acquisition of motor inhibition, rather than serving a role of acquiring associations linked to conditions of perceived safety (versus threat).

Despite the fact that the IL cortex may not be involved in the processing of perceived safety, the FL ITI-signal may still be processed as a safety signal through other brain regions. Tests of retardation and summation were conducted to test whether the ITI-signal is perceived as a safety signal. Introducing a similar FL in another associative learning paradigm, following avoidance acquisition with a FL ITI signal, caused proactive interference of the newly acquired reflexive conditional eyeblink response. As a novel stimulus, the FL did not appreciably influence the acquisition of eyeblink CRs; therefore, prior experience with the FL did cause interference whereas it normally would not affect the acquisition. This apparent demonstration of proactive interference is important for two distinct reasons. First, it confirms the FL, as an external stimulus, likely acquires associations to some aspects of the avoidance learning situation. Even though habituation to the FL may appear to be a parsimonious explanation for both the lack of difference over time in avoidance acquisition and no additional acquisition-enhancing effect in eyeblink conditioning, we would have expected dishabituation to the FL in the novel eyeblink conditioning test chambers, as habituation to repeated external stimuli, during emotional learning, is generally context dependent (Hermitte et al., 1999; Tomsic et al., 2009). Second, previous work suggests abnormal proactive interference in WKY rats. Specifically, latent inhibition to an auditory CS, used subsequently in eyeblink conditioning, could be elicited in SD rats following 30 pre-exposures; WKY rats were unaffected by the same amount of pre-exposure (Ricart et al., 2011a). In contrast, in the current study, we appear to have proactive interference in male WKY rats. 
Granted, although the FL was positioned as an added CS in eyeblink conditioning, we cannot conclude that the rats specifically utilized the FL signal in combination with or instead of the auditory CS. Moreover, the group with the novel FL OS did not differ from those trained without an OS. Previously, we observed a similar non-significant trend toward facilitation of eyeblink conditioning in WKY rats when a 5 Hz light was flashed throughout the entire session (Beck et al., 2011). In contrast to male SD rats, which exhibited a significant facilitation of acquisition with the constant FL, the effects in WKY rats suggest the additional stimulus may only be providing a relatively mild increase in arousal or attention to the subsequent CS. Interestingly, since the re-introduction of the avoidance tone as an OS for eyeblink conditioning had an effect comparable to that of the re-introduction of the FL, we can conclude that any retardation effect is not specific to an association formed to the ITI signal. Further, this suggests the proactive interference observed pertains to all the signals acquired during the avoidance training, not just the signal of "safety." Therefore, the retardation of learning cannot be attributed to the ITI-signal specifically serving as a safety signal.

Next, we tested whether the ITI-signal could pass the criterion of summation. Following a design similar to human tests of safety-signal inhibition of fear-potentiated startle (Grillon et al., 1994), we unexpectedly found the mere exposure to a 1-kHz tone prior to a startle pulse was sufficient to significantly dampen the magnitude of the motor response. Historically, rodent fear-potentiated startle paradigms have utilized either a light or a sound as the CS. For those studies that have utilized an acoustic CS, 3–5 s is a common CS duration (Brown et al., 1951; Kurtz and Siegel, 1966; Hitchcock and Davis, 1987; Fendt et al., 2005). In order to best match the warning signal, we had the tone produced at an intensity of 75 dB(A). This intensity is higher than that used in some studies (Kurtz and Siegel, 1966) but not others (Fendt et al., 2005). We considered that this dampening of startle reactivity may have been due to the use of male WKY rats, as others have reported poor fear conditioning in this strain (Pardon et al., 2002), but we tested this startle protocol with male SD rats and obtained similar pre-avoidance results (unpublished observation). It is a common procedure to habituate the startle reflex prior to the assessment of startle potentiation (Rosen et al., 1996); however, in this case, the tone CS suppressed the startle response more than any within-session habituation to the pulse alone (50% of the pulse-alone trial magnitudes). Moreover, male WKY rats have been shown to exhibit higher startle magnitudes than male SD rats (Ricart et al., 2011b); hence, even extended habituation may not have equated pulse-alone trials to those with the preceding tone. 
Further research will need to determine what factors led to this substantial suppression of the startle response by the tone that was to serve as the CS, but, nonetheless, after the surprising results of the pretest, there was the possibility that the tone would acquire aversive properties during avoidance training. This was clearly not the case. It instead confirms previous experimentation, with other endpoints, which were interpreted to reflect a decrease in elicited fear to the CS in avoidance paradigms (Starr and Mineka, 1977). Any future test of summation with respect to the FL ITI signal will undoubtedly require a distinct aversive learning situation from that of the avoidance paradigm, thereby removing the additional issues of using the acoustic warning signal and any reduction in fear to that signal that develops over training.

With respect to the ITI-signal, the data derived from these experiments do not support the role of the ITI-signal as a conditioned inhibitor of fear. Although we could argue a summation or retardation test following the initial 1 or 2 sessions of escape-avoidance training may provide stronger evidence that the FL ITI-signal acquires the properties of a conditioned inhibitor, i.e., safety signal, the IL cortex is positioned to be a likely neuronal source of inhibition upon the amygdala (Quirk et al., 2006). The lack of differential IL cortex activation throughout training and the lack of an effect of IL cortex lesions on avoidance acquisition suggest the presence or absence of the FL during the ITIs is not activating an inhibitory response upon the amygdala via the IL cortex. Granted, other areas of the brain have also been suggested to process aspects of "learned safety" outside of the vmPFC and the amygdala, such as the insular cortex and bed nucleus of the stria terminalis (Christianson et al., 2008, 2011), but those circuits have been specifically implicated in coping with uncontrollable stressors, which is not the case in this avoidance paradigm.

Differences in neuronal activation were evident from session 2 to session 4 in the PL cortex. This period of acquisition is particularly interesting for the WKY rats. As mentioned above, session 4 is an average point where more WKY rats exhibit more avoidance responding than escape responding (Servatius et al., 2008; Beck et al., 2011; Jiao et al., 2011; Perrotti et al., 2013); however, session 4 is also proximal to when non-reinforced responding changes in WKY rats (Perrotti et al., 2013). This pattern was observed in the rats sacrificed following session 2 versus session 4, with the decrease in responding during the ITI occurring in both those with a FL-signaled and unsignaled ITI. Yet, neither reinforced nor non-reinforced behavior was correlated with c-Fos density measured in any of the three subregions of the mPFC. Moreover, lesions to either or both the PL and IL cortex failed to significantly affect the emitting of non-reinforced responses during the ITI period. Therefore, if the expression changes of c-Fos activation, which we observed in the PL cortex of male WKY rats, were caused by the acquisition of lever-press avoidance, it is not likely tied to changes in the specificity of responding (i.e., less during the ITIs). Instead, the analyses of the avoidance latencies suggest that proximal to the fourth session (when avoidance responses equate that of escape responses) the PL cortex is involved in driving the avoidance response to be emitted quicker. In contrast, the IL lesions do not affect the latencies, and the combined lesions only statistically affected latency during the fourth session. 
The PL cortex has been hypothesized as having a role in detecting threat to facilitate fear responses through the amygdala (Stern et al., 2010), but given the type of non-species-specific behavior required to avoid the shock, it may be that the PL cortex is driving the behavioral response through other pathways not specifically associated with the amygdala (Bravo-Rivera et al., 2014). Combined PL + IL lesions produced greater variability in avoidance latencies than the PL-lesions alone, but the trend was similar to that exhibited by those with lesions limited to the PL cortex. Thus, lesions of both PL and IL cortex suggest the combined activation of both cortices modulates the *rate and speed* by which lever-press active avoidance is acquired and emitted, but, at the same time, neither cortical area is *required* for that acquisition process.

These data are somewhat similar to shuttle-avoidance lesion studies in that damage to the PL cortex is reported to not be detrimental (Moscarello and LeDoux, 2013), but unlike that paradigm, it appears that only more diffuse PL and IL cortex lesions can significantly slow active-avoidance acquisition. This may be due to the type of response that is required, running versus lever-pressing, species-specific versus non-species-specific behavior. Still, c-Fos activation in PL cortex and IL cortex has been shown to be correlated with shuttle behavior alone and/or shuttling combined with freezing, respectively, in a non-cued Sidman avoidance procedure (Martinez et al., 2013). These c-Fos measures, however, occurred following a session where the aversive stimulus (shock) was absent, and, in that study, the average amount of shuttling decreased by nearly 50% during that shock-free session (Martinez et al., 2013). Thus, it cannot be ruled out that the activation measured in the brains of those rats may have been due to the new learning that the shock was not on the same temporal schedule. As mentioned above, recent inactivation studies suggest the IL cortex may be particularly critical for response inhibition in fear and avoidance paradigms (Bravo-Rivera et al., 2014; Sangha et al., 2014). The current data also suggest that increases in PL c-Fos may reflect a motivational drive to avoid; therefore, it is possible that during extinction both PL and IL cortices may be activated as a re-evaluation of predictive threat occurs.

Another consideration that needs to be recognized is that the current study focused its efforts on understanding particular aspects of avoidance learning in an anxiety disorder vulnerability model. The WKY rats acquire active avoidance differently than SD rats and clearly differ in their ability to extinguish the lever-press avoidance behavior (Servatius et al., 2008; Beck et al., 2010, 2011; Jiao et al., 2011). Information pertaining to threat evaluation is transmitted from the basolateral amygdala (BLA) to PL cortex through a direct projection that changes its directional plasticity in response to stress (Maroun and Richter-Levin, 2003; Maroun, 2006). Those studies were conducted in SD rats, which are not as stress sensitive as WKY rats (Pare, 1989; Tejani-Butt et al., 1994; Bravo et al., 2011; Jiao et al., 2011). Therefore, it is entirely possible that these regions of the WKY mPFC are not functioning in the same manner as those of male SD rats during avoidance learning. For example, latencies to respond to the warning signal with a lever-press avoidance response are shorter in male WKY rats than in male SD rats (Beck et al., 2010). Also, increases in anticipatory responding occur earlier in WKY rats than in SD rats, suggesting the processing of prospective threat may occur more readily in WKY rats (Perrotti et al., 2013). Thus, although the necessary processing of the FL ITI-signal does not appear to require the IL cortex, differences in other aspects of avoidance learning displayed between WKY and SD rats may be due to neurotransmission differences between the BLA and PL cortex. Amygdala–mPFC connectivity and functioning are documented to be different in humans with anxiety disorders, although the particular pattern of difference is still not resolved (Gilboa et al., 2004; Monk et al., 2006; McClure et al., 2007; Liberzon and Sripada, 2008; Etkin et al., 2010; Tromp et al., 2012; Demenescu et al., 2013; Stevens et al., 2013; Killgore et al., 2014). 
Therefore, the WKY rat can be a very useful model to study how *abnormal* prefrontal functioning can lead to avoidance susceptibility, avoidance extinction resistance, and overall anxiety vulnerability.

### **CONCLUSION**

The paradoxical behavioral inhibition and active-avoidance susceptibility demonstrated by WKY rats provide a unique opportunity to examine how an intact, albeit abnormal, brain can produce behaviors akin to those expressed by individuals with pathological anxiety. Although we reaffirmed that male WKY rats more readily acquire active avoidance when a discrete signal is presented during the "safe" ITIs, the behavioral and neuroimmunohistological data do not readily support the hypothesis that the ITI signal acquires the properties of a "safety signal". In fact, we observed that brief exposures to the same stimulus can facilitate eyeblink conditioning in the male WKY rats, suggesting intermittent exposures to the FL may be serving to increase arousal, if it is novel. This may explain why the facilitation of active-avoidance learning is apparent in the early phases of training. Changes in the activation of the PL cortex occurring at the end of that early acquisition period may represent a change in how the rats were responding to the signals in the environment. Specifically, it appears to occur at a period when damage to the PL cortex is associated with longer avoidance latencies. Thus, the activation of the PL cortex, following the ITI-signal enhanced period, may represent the acquired association of threat to the elicitation of the avoidance behavior. Future work will be focused on understanding how the PL cortex contributes to the acquisition of active-avoidance learning, and whether abnormal neural activity in the PL cortex of WKY rats contributes to their avoidance-susceptible behavioral phenotype. This research, utilizing a behaviorally inhibited model, complements other recent work that has begun to dissociate functions of mPFC subregions in the human mPFC and the adoption of avoidance (Bzdok et al., 2013).

### **ACKNOWLEDGMENTS**

The authors thank Shane Mahabir for conducting portions of the research. This work was accomplished through support from award 1I01BX000218 from the Biomedical Laboratory Research & Development Service of the VA Office of Research and Development (KDB) and funds from the Stress & Motivated Behavior Institute. The opinions and conclusions presented are those of the authors and are not the official position of the U.S. Department of Veterans Affairs.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 June 2014; accepted: 04 November 2014; published online: 21 November 2014.*

*Citation: Beck KD, Jiao X, Smith IM, Myers CE, Pang KCH and Servatius RJ (2014) ITI-signals and prelimbic cortex facilitate avoidance acquisition and reduce avoidance latencies, respectively, in male WKY rats. Front. Behav. Neurosci. 8:403. doi: 10.3389/fnbeh.2014.00403*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Beck, Jiao, Smith, Myers, Pang and Servatius. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### Absence of "warm-up" during active avoidance learning in a rat model of anxiety vulnerability: insights from computational modeling

#### **Catherine E. Myers<sup>1,2</sup>\*, Ian M. Smith<sup>1</sup>, Richard J. Servatius<sup>1,2</sup> and Kevin D. Beck<sup>1,2</sup>**

<sup>1</sup> Department of Veterans Affairs, VA New Jersey Health Care System, East Orange, NJ, USA

<sup>2</sup> Stress and Motivated Behavior Institute, Department of Neurology and Neurosciences, New Jersey Medical School, Rutgers, The State University of New Jersey, Newark, NJ, USA

### **Edited by:**

Gregory J. Quirk, University of Puerto Rico, USA

### **Reviewed by:**

Seth Davin Norrholm, Emory University School of Medicine, USA

Christopher Cain, Nathan S. Kline Institute for Psychiatric Research, USA

### **\*Correspondence:**

Catherine E. Myers, NeuroBehavioral Research Lab, VA New Jersey Health Care System, 385 Tremont Avenue, Mailstop 127A, East Orange, NJ 07042, USA e-mail: catherine.myers2@va.gov

Avoidance behaviors, in which a learned response causes omission of an upcoming punisher, are a core feature of many psychiatric disorders. While reinforcement learning (RL) models have been widely used to study the development of appetitive behaviors, less attention has been paid to avoidance. Here, we present an RL model of lever-press avoidance learning in Sprague-Dawley (SD) rats and in the inbred Wistar Kyoto (WKY) rat, which has been proposed as a model of anxiety vulnerability. We focus on "warm-up," transiently decreased avoidance responding at the start of a testing session, which is shown by SD but not WKY rats. We first show that an RL model can correctly simulate key aspects of acquisition, extinction, and warm-up in SD rats; we then show that WKY behavior can be simulated by altering three model parameters, which respectively govern the tendency to explore new behaviors vs. exploit previously reinforced ones, the tendency to repeat previous behaviors regardless of reinforcement, and the learning rate for predicting future outcomes. This suggests that several dissociable mechanisms may contribute independently to strain differences in behavior. The model predicts that, if the "standard" inter-session interval is shortened from 48 to 24 h, SD rats (but not WKY) will continue to show warm-up; we confirm this prediction in an empirical study with SD and WKY rats. The model further predicts that SD rats will continue to show warm-up with inter-session intervals as short as a few minutes, while WKY rats will not show warm-up, even with inter-session intervals as long as a month. Together, the modeling and empirical data indicate that strain differences in warm-up are qualitative rather than just the result of differential sensitivity to task variables. Understanding the mechanisms that govern expression of warm-up behavior in avoidance may lead to better understanding of pathological avoidance, and potential pathways to modify these processes.

**Keywords: reinforcement learning model, anxiety vulnerability, acquisition, extinction, learning and memory**

Anxiety disorders are the most common psychiatric disorders, with a worldwide lifetime prevalence of 16–29% (Kessler et al., 2005; Somers et al., 2006). Although each subtype (e.g., generalized anxiety disorder, obsessive-compulsive disorder, panic disorder, and social phobia) has unique features, a core symptom of all anxiety disorders is excessive avoidance. Avoidance is also a defining symptom for posttraumatic stress disorder (PTSD), and the growth of avoidance behaviors traces the full expression of PTSD (North et al., 2004; Karamustafalioglu et al., 2006; O'Donnell et al., 2007; Kashdan et al., 2009). Given this prominent position, acquisition and maintenance of avoidance behaviors may represent an endophenotype for a variety of anxiety- and stress-related mental disorders (Gould and Gottesman, 2006).

Among a variety of neurobiological and neurobehavioral factors representing a source of risk for pathological avoidance, some have been amenable to study in animal models. For example, the personality trait of behavioral inhibition, characterized as extreme withdrawal in the face of social and non-social challenges (Kagan et al., 1987; Rosenbaum et al., 1991; Fox et al., 2005), is consistently linked to anxiety disorders (Kagan et al., 1987; Hirshfeld et al., 1992; Biederman et al., 1993; Rosenbaum et al., 1993; Fox et al., 2005; Hirshfeld-Becker et al., 2007). Behavioral inhibition can be studied via an animal model, the inbred Wistar Kyoto (WKY) rat strain, which displays behavioral withdrawal, propensity to avoid, hyper-responsiveness to stress, and hypervigilance, compared to outbred strains such as the Sprague-Dawley (SD) rat (Pare, 1992, 1993; Solberg et al., 2001; Drolet et al., 2002; McAuley et al., 2009; Lemos et al., 2011). Thus, WKY rats represent an animal model of behavioral withdrawal in the face of social and non-social challenges (Jiao et al., 2011b).

It has therefore been useful to compare the acquisition and maintenance of avoidance behavior in the SD and WKY rat models. For example, in lever-press avoidance, a rat is placed in a conditioning chamber for several acquisition trials; on each trial, a warning signal *W*, such as a tone, is presented for some interval (warning period), and then remains on during a subsequent shock period during which electric shocks are delivered every few seconds. If the animal presses a lever during the shock period, this is defined as an escape response: both *W* and shocks are terminated, and the trial moves immediately to an intertrial interval (ITI). If the animal presses the lever during the warning period, this is defined as an avoidance response: *W* is terminated, no shocks are delivered, and the trial moves immediately to the ITI. Behaviorally inhibited WKY rats acquire avoidance responses more quickly than SD rats (Servatius et al., 2008; Beck et al., 2011; Jiao et al., 2011b; Perrotti et al., 2013). WKY rats also typically show impaired extinction of responding when *W* is no longer paired with shock (Servatius et al., 2008; Beck et al., 2011; Jiao et al., 2011b; Perrotti et al., 2013). This impaired extinction indicates that the WKY rat is an overly avoidant animal that is willing to expend energy and continue displaying the avoidance response during extinction rather than occasionally testing whether the reinforcement contingency is still present. Such resistance to extinction has been implicated in the neuropathology of human anxiety (Myers and Davis, 2002; Barad, 2005).

A curious feature that appears across avoidance learning paradigms emerges when one looks at behavior within, rather than across, sessions. Specifically, SD rats typically show less avoidance responding at the start of a daily session, compared to their performance at the end of the prior session or later in the current session (Servatius et al., 2008). This phenomenon has been termed "warm-up," and is shown by a number of species in a range of avoidance paradigms (for reviews, see Kamin, 1963; Spear et al., 1973; Hineline, 1978). In contrast, WKY rats tend to respond on the first trial of each session at approximately the same rate as at the end of the prior session (Servatius et al., 2008; Perrotti et al., 2013). It is possible that the absence of warm-up contributes to the generally faster acquisition, and slower extinction, of avoidance in the WKY rats compared to SD rats. Thus, understanding the nature of the warm-up phenomenon may have implications for the study of avoidance learning, and may in turn provide insight into how pathological avoidance is acquired and maintained in anxiety-vulnerable humans.

Several general classes of explanation for warm-up have been presented (for review, see McSweeney and Roll, 1993; Beck et al., 2010). Perhaps the simplest explanation invokes simple forgetting of the avoidance response during the inter-session interval, with warm-up reflecting reacquisition during the beginning of the next session. However, simple forgetting does not appear to be an adequate explanation, since warm-up can occur with inter-session intervals as short as 30 min (Hineline, 1978). Another early explanation for warm-up was that the decrement in responding on early trials of a session could be the result of a context shift, as the animal is moved from the home cage into the testing chamber, and these contextual effects need time to dissipate before the animal can begin executing avoidance responses. However, this explanation also appears unlikely since warm-up is not reduced if the animals are given a period of confinement in the experimental chamber before the session begins (Hoffman et al., 1961), nor is warm-up abolished if the animals are housed round-the-clock in the experimental chamber to eliminate context effects (Hineline, 1978).

Another class of explanations for the warm-up effect suggests that it reflects emotional processing. On the one hand, some researchers have suggested that presentation of shocks, early in a testing session, might produce arousal that needs to be overcome before the animal can begin executing avoidance responses (Hoffman and Fleshler, 1962); such arousal might produce a species-specific response such as freezing that could transiently interfere with the animal's ability to execute a lever-press response. However, this explanation fails to account for the fact that warm-up is relatively unaffected by shock intensity (Hoffman et al., 1961), or for the decrement in responding observed on the very first trial of a session, before any shock has yet been delivered. On the other hand, researchers have suggested that presentation of several shocks may be required before arousal accumulates sufficiently to motivate responding (Hoffman et al., 1961; Powell, 1972). However, this explanation fails to account for the fact that warm-up can be observed even during extinction sessions, when no shocks are presented (e.g., Bullock, 1960; Nakamura and Anderson, 1962). Thus, while emotional effects, including freezing, may certainly occur during and contribute to acquisition and extinction of avoidance, they alone do not appear sufficient to fully account for the phenomenon of warm-up (Nakamura and Anderson, 1962; Spear et al., 1973).

A final class of explanations for the warm-up effect invokes the concept of interference. For example, Spear et al. (1973) conducted a series of studies showing that warm-up could be reduced by pretest treatments that appeared to affect memory of the prior session(s) rather than affecting motivation in the current session. They concluded that an important factor contributing to warm-up was the lingering influence of "unspecified events" occurring between learning and testing, such as the intervention of other behaviors during the inter-session period, which interfered with retrieval of the memory trace for the avoidance response. An interference account of warm-up avoids many of the difficulties inherent in the other explanations, since it presumes interference is possible even with a relatively short inter-session interval, should be relatively independent of shock intensity, and should indeed be maximal on the first trial of a session, even before shock has occurred. On the other hand, the central weakness of this account is that it invokes the influence of hypothetical events that occur during the inter-session interval, when the animal's behavior is often not observed and may be difficult to qualify, much less quantify. Evaluating the nature and impact of such unspecified events has therefore proven understandably difficult in empirical studies, but computational modeling provides a possible tool to approach this issue, and to determine whether such hypothetical interference from prior behaviors could indeed replicate the existing data on warm-up effects.

Many computational models of associative learning exist, often using a reinforcement learning (RL) model which consists of two modules, the actor and the critic (Barto et al., 1983). The critic receives as input the current state, defined as the configuration of external and internal stimuli, and learns to output the "goodness" or reward value of each state. In the absence of explicit reward or punishment, learning can also be driven by changes in the prediction of future reward or punishment (Sutton, 1988; Dayan and Balleine, 2002). The critic sends these prediction values to the actor, which learns through trial and error to select from a set of possible responses in order to maximize future reward and minimize future punishment (Dayan and Balleine, 2002). Such models therefore embody aspects of several theories of avoidance learning, including two-factor theory (Mowrer, 1951), which posits separate stimulus–stimulus and stimulus–outcome learning processes, and cognitive expectancy theories, which posit that organisms learn to select among possible responses based on the expected long-term outcome from each (Tolman, 1932; Seligman and Johnston, 1973). Actor–critic models have been widely used by many researchers to understand the roles of brain substrates, such as the nigrostriatal dopamine system, the dorsal striatal action selection system, the prefrontal cortex, and the hippocampus (e.g., Houk and Wise, 1995; Schultz, 1998; Daw et al., 2005; Moustafa et al., 2009, 2010), and to simulate classical conditioning data and/or category learning data (e.g., Moustafa et al., 2009, 2010), or appetitive conditioning (for review, see Dayan and Balleine, 2002). Such models have also been successfully used to simulate shuttlebox avoidance (Johnson et al., 2002; Smith et al., 2004; Moutoussis et al., 2008; Maia, 2010) and can capture various features of empirical data including negatively accelerated learning curves, reduced latency to respond with extended training, and resistance to extinction when the shocks are no longer administered.

Here, we show that such an RL model incorporating actor and critic modules can also successfully capture many aspects of lever-press avoidance in SD rats, including the transition from escape to avoidance responding and the phenomenon of warm-up. The model thus provides one possible explanation of warm-up based purely on learning mechanisms, without requiring additional assumptions about motivational or emotional processes. We also show that WKY performance can be simulated by adjusting several parameters in the model, which have largely independent effects on aspects of avoidance. The model further predicts that SD rats will show warm-up, but WKY rats will show first-trial avoidance, under a range of inter-session intervals. As a partial test of this prediction, we tested SD and WKY rats in the lever-press paradigm with the inter-session interval reduced from the "standard" 48 h to 24 h (daily sessions); results confirm the model predictions. The model therefore suggests that multiple, interacting mechanisms may underlie pathological avoidance in WKY rats, which in turn may provide insight into how such mechanisms could confer risk for anxiety disorders in humans.

### **MODELING METHODS**

### **WITHIN-TRIAL EVENTS**

In a canonical version of the lever-press avoidance paradigm (e.g., Servatius et al., 2008), the warning signal *W* is a tone that comes on at the start of a trial and remains present for a 60-s warning period; a lever-press during this warning period is scored as an avoidance response and terminates the trial, triggering a 3-min safe period (ITI) signaled by a flashing light (*S*). Otherwise, once the 60-s warning period has elapsed, *W* remains on and scrambled 1 mA, 0.5 s footshocks (*U*) are delivered through the grid floor every 3 s for a maximum of 99 shocks. A lever-press during the shock period is scored as an escape response, terminating both *W* and *U*, and triggering the ITI. Twenty trials are typically delivered in a daily session, with sessions occurring on alternating days (48-h inter-session interval); between sessions, the animal is removed to the home cage. Each session begins with a 60-s stimulus-free period in the testing chamber.

To simulate this paradigm, each trial is divided into 54 timesteps that each represents approximately 10 s of simulated time. At each timestep, inputs signal the presence or absence of *W*, *S*, *U*, and the context (home cage or experimental chamber). The acquisition phase of the task consists of 12 sessions; **Figure 1** shows a schematic representation of the events in one acquisition session. Under standard conditions, each acquisition session starts with six timesteps in the experimental context, followed by 20 trials. On each trial, *W* is presented for 6 timesteps (warning period) and persists through a further 30 timesteps where *U* is also presented (shock period), followed by 18 timesteps with *S* (ITI period).

At each timestep, the actor receives inputs and can choose a response from among a set of possible actions, with one action arbitrarily designated as lever-press. A lever-press response during the warning period, but before onset of shock, is scored as an avoidance response and terminates *W* and causes the trial to move directly to the ITI period; a lever-press response during the shock period is scored as an escape response and terminates *W* and *U*, and causes the trial to move directly to the ITI period.

Following the end of each session, an "overnight" period is simulated during which the home cage context input is present instead of the testing chamber context, no other inputs (*W*, *U*, *S*) are present, and the lever-press response is disabled. This overnight period is 18,000 timesteps in length, to simulate the relative ratio of home cage time to testing sessions in animals given testing sessions on alternating days.

The last acquisition session is followed by 12 extinction sessions that are the same as acquisition except that *U* is never presented.
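The timestep counts given above can be checked with a short sketch. This is only an illustration of the schedule as described in the text, not the authors' simulation code; the function and constant names are hypothetical, and the sketch assumes the animal makes no lever-press so that every period runs to its full length.

```python
# Illustrative sketch of the simulated schedule; names are hypothetical.
# Timestep counts come from the text; one timestep is roughly 10 s.
SECONDS_PER_STEP = 10
PRE_STEPS = 6            # stimulus-free period at the start of a session
WARNING_STEPS = 6        # W alone (warning period)
SHOCK_STEPS = 30         # W together with U (shock period)
ITI_STEPS = 18           # S alone (ITI period)
TRIALS_PER_SESSION = 20
OVERNIGHT_STEPS = 18_000  # home-cage period between sessions


def trial_inputs(extinction=False):
    """Per-timestep (W, U, S) flags for one trial with no lever-press.

    During extinction sessions U is never presented.
    """
    u = 0 if extinction else 1
    steps = [(1, 0, 0)] * WARNING_STEPS
    steps += [(1, u, 0)] * SHOCK_STEPS
    steps += [(0, 0, 1)] * ITI_STEPS
    return steps


def session_length():
    """Maximum number of timesteps in one session (no responses made)."""
    return PRE_STEPS + TRIALS_PER_SESSION * len(trial_inputs())


def overnight_hours():
    """Simulated duration of the home-cage period, in hours."""
    return OVERNIGHT_STEPS * SECONDS_PER_STEP / 3600


if __name__ == "__main__":
    print(len(trial_inputs()))   # 54 timesteps per trial
    print(session_length())      # 1086 timesteps per session
    print(overnight_hours())     # 50.0 hours, close to the 48-h interval
```

Under these assumptions a full trial spans 54 timesteps, and the 18,000-timestep home-cage period corresponds to about 50 h of simulated time, in line with the alternating-day schedule.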

### **ACTOR MODULE**

At every timestep *t*, the actor module chooses a response *r* from a set of *A* possible actions, of which one is arbitrarily designated to represent lever-press (**Figure 2**). To capture the fact that lever-press is only one of a large number of possible actions available to an animal (e.g., grooming, rearing), *A* = 100 in these simulations. The probability of selecting a particular response *r* at timestep *t* is defined as

$$\Pr(r) = \frac{f\left(\frac{M_r}{T}\right)}{\sum_{a} f\left(\frac{M_a}{T}\right)}$$

where *f*(*x*) = *e*<sup>*x*</sup>, *a* = 1..*A*, and *T* is an explore/exploit parameter (sometimes called the "inverse temperature") which governs the tendency to repeat previously reinforced responses vs. explore the effect of new ones. At each timestep *t*, the values *M*<sub>a</sub> are computed as

$$M_a = \sum_{i} m[a][i] \ast I_i + P \ast c[a][i].$$

Here, *I*<sub>i</sub> is the current value of input *i*; *m*[*a*][*i*] is the strength of the connection from input *i* to action *a*, with all *m*[*a*][*i*] initialized to a small value (0.01) at the beginning of a simulation run. *P* is a perseveration factor governing the tendency to repeat a prior action (values of *P* < 0 confer a tendency for spontaneous alternation) and *c* is a working memory trace that records prior actions in response to the inputs: *c*[*r*][*i*] = 1 for the action *r* which was executed at time *t*; for all actions *a* ≠ *r*, *c*[*a*][*i*] ← *c*[*a*][*i*] \* 0.95. All *c*[*a*][*i*] are initialized to 0 at the start of a simulation run.

**FIGURE 1 | Schematic of events during one acquisition session in the model**. Each training session begins with a short stimulus-free period (Pre) in the testing chamber context. Then, on each trial, the warning signal W is presented for several timesteps ("warning period," white boxes), with each timestep representing about 10 s. Next, W and the shock U are presented together for several timesteps ("shock period," red boxes); finally, both W and U are removed and the ITI signal S is presented for several timesteps ("ITI period," green boxes), after which the next trial begins with another presentation of W. At each timestep, the actor module chooses and executes a response from a large set of possible actions; one of these is arbitrarily designated as lever-press. Lever-press during the warning period is scored as an avoidance response: in this case, W is terminated, U is omitted, and the trial proceeds directly to the ITI. Lever-press during the shock period is scored as an escape response: in this case, W and U are terminated and the trial proceeds directly to the ITI. Lever-press responses during the stimulus-free period at the start of the session (Pre) are scored as anticipatory responses. Events during extinction sessions are identical except that U is never presented.

**FIGURE 2 |** … provided to the critic module, which also contains weighted connections from each input, and calculates V, a prediction of future reward (or punishment). The prediction error PE, which is the difference between expected outcome V and actual outcome R, is then used to train the weights in both the actor and critic modules.

### **CRITIC MODULE**

Based on the action *r* selected by the actor module at timestep *t*, external reinforcement *R* is provided. If shock is present at *t* + 1, then *R* is set to *R*<sub>shock</sub>, a large negative value (e.g., −4); otherwise *R* = 0 unless the action selected was lever-press, in which case *R* is set to *R*<sub>press</sub>, a small negative value (e.g., −0.2) representing the cost of lever-press in energy expenditure and missed opportunity to engage in other behaviors.

Based on *R*, the critic module computes prediction error *PE*, defined as

$$\text{PE} = R + 0.9 \ast V - V'$$

where *V* is the predicted future value of *R*, calculated as

$$V = \sum_{i} v[i] \ast I_i$$

and where *V*′ is the value of *V* from the prior timestep. All *v*[*i*] are initialized to 0 at the start of a simulation run, and updated as

$$
\Delta v[i] = \alpha \ast \text{PE} \ast I_i
$$

where α is a learning rate that governs rate of weight change in the critic. The values of *v*[*i*] are clipped at ±*R*<sub>shock</sub>, to prevent *v* from growing out of bounds.

The weights in the actor module *m*[*r*][*i*] for the chosen action *r* are also updated based on PE:

$$
\Delta m[r][i] = \varepsilon \ast (\text{PE} - m[r][i]) \ast I_i,
$$

where ε is the learning rate that governs rate of weight change in the actor. The values of *m* are restricted to be ≥0.
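The actor and critic update rules described above can be put together in a minimal sketch of one simulated timestep. This is an illustrative reading of the equations, not the authors' implementation: the function names and the dictionary layout are hypothetical, the sum over inputs is assumed to cover both the weight term and the perseveration term, the trace for the chosen action is assumed to be set to 1 for every input, and the weight updates are assumed to use the current input vector.

```python
import numpy as np

A = 100             # number of possible actions (from the text)
GAMMA = 0.9         # discount applied to V in the prediction error
TRACE_DECAY = 0.95  # decay of the working memory trace c
R_SHOCK = -4.0      # example value from the text


def init_state(n_inputs):
    """Weights and traces initialized as described in the text."""
    return {
        "m": np.full((A, n_inputs), 0.01),  # actor weights
        "c": np.zeros((A, n_inputs)),       # working memory trace
        "v": np.zeros(n_inputs),            # critic weights
    }


def action_probs(state, I, T, P):
    """Softmax over M_a (assumes the sum over i covers both terms)."""
    M = (state["m"] * I + P * state["c"]).sum(axis=1)
    z = M / T
    z = z - z.max()  # numerical stability only; probabilities are unchanged
    e = np.exp(z)
    return e / e.sum()


def choose_action(state, I, T, P, rng):
    """Sample a response, then update the trace c."""
    probs = action_probs(state, I, T, P)
    r = int(rng.choice(A, p=probs))
    state["c"] *= TRACE_DECAY  # all traces decay ...
    state["c"][r, :] = 1.0     # ... and the chosen action is reset to 1
    return r, probs


def learn(state, I, r, R, V_prev, alpha, eps):
    """Compute PE and update critic and actor weights for chosen action r."""
    V = float(state["v"] @ I)
    pe = R + GAMMA * V - V_prev
    bound = abs(R_SHOCK)
    state["v"] = np.clip(state["v"] + alpha * pe * I, -bound, bound)
    new_m = state["m"][r] + eps * (pe - state["m"][r]) * I
    state["m"][r] = np.maximum(new_m, 0.0)  # actor weights kept >= 0
    return pe, V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = init_state(n_inputs=3)
    I = np.array([1.0, 0.0, 1.0])  # hypothetical input pattern
    r, probs = choose_action(state, I, T=1.0, P=0.25, rng=rng)
    pe, V = learn(state, I, r, R=R_SHOCK, V_prev=0.0, alpha=0.05, eps=0.005)
    print(r, pe, state["v"])
```

In a full simulation, the returned `V` would be passed as `V_prev` on the next timestep, and `R` would be set from the shock and lever-press rules given at the start of this section.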

### **SIMULATING BEHAVIOR**

For each trial, the dependent variables are the latency to first lever-press response on that trial (calculated in timesteps since the onset of *W*), and whether that first lever-press constitutes an avoidance response (occurring within the warning period), an escape response (occurring within the shock period), or neither (occurring during the ITI). If no lever-press responses are made during the trial, latency defaults to the maximum number of timesteps in the trial. In addition, anticipatory responses are defined as lever-press responses occurring during the stimulus-free period at the beginning of each session.
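The scoring rules above can be sketched as follows. The function names are hypothetical, and the sketch assumes zero-based timesteps counted from the onset of *W*, so that steps 0 to 5 fall in the warning period and steps 6 to 35 in the shock period; the original simulation may draw these boundaries differently.

```python
# Illustrative scoring of simulated trials; names and the zero-based
# boundary convention are assumptions, not taken from the source.
WARNING_STEPS = 6
SHOCK_STEPS = 30
ITI_STEPS = 18
TRIAL_STEPS = WARNING_STEPS + SHOCK_STEPS + ITI_STEPS


def score_trial(first_press_step):
    """Return (latency, label) for one trial.

    first_press_step is the timestep (since W onset) of the first
    lever-press, or None if no lever-press occurred during the trial.
    """
    if first_press_step is None:
        # No response: latency defaults to the maximum trial length.
        return TRIAL_STEPS, "none"
    if first_press_step < WARNING_STEPS:
        return first_press_step, "avoidance"
    if first_press_step < WARNING_STEPS + SHOCK_STEPS:
        return first_press_step, "escape"
    return first_press_step, "neither"


def percent_avoidance(first_press_steps):
    """Percent of trials in a session scored as avoidance responses."""
    labels = [score_trial(s)[1] for s in first_press_steps]
    return 100.0 * labels.count("avoidance") / len(labels)


if __name__ == "__main__":
    session = [None, 20, 7, 3, 2]  # hypothetical first-press timesteps
    print([score_trial(s) for s in session])
    print(percent_avoidance(session))  # 40.0
```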

To simulate the behavior of SD rats, parameter space was explored for four free parameters: α (learning rate in the critic), ε (learning rate in the actor), *T* (explore/exploit), and *P* (perseveration). Parametric explorations are shown in the Supplementary Material; in brief, manipulations of *T* tended to affect rate of avoidance acquisition, without much effect on extinction or warm-up; manipulations of α tended to affect rate of extinction, without much effect on acquisition or warm-up; and manipulations of *P* tended to affect warm-up without much effect on either acquisition or extinction. Manipulations of ε also tended to affect acquisition rate, but these effects were more dramatic than the effects of manipulating *T*, and realistic learning curves were only obtained within a fairly small range of values. Simulations that best reproduced key features of SD behavior were obtained when α = 0.05, ε = 0.005, *T* = 1.0, and *P* = 0.25, and these values were subsequently "fixed" for the SD simulations reported below.

Next, the model was adjusted to simulate behaviorally inhibited WKY rats. While WKY rats have a number of phenotypic differences compared to control strains, there are three in particular that appear to relate in a fairly straightforward way to RL model parameters. First, because WKY rats are behaviorally inhibited, and behavioral inhibition implies a tendency to repeat previously reinforced (familiar) responses rather than explore new ones, we reduced the value of *T*. Second, given data suggesting that WKY rats have reduced mesolimbic dopamine function (Jiao et al., 2003), a system which has been implicated in generating the prediction error signal in RL (Hollerman and Schultz, 1998; Schultz and Dickinson, 2000), we reduced the learning rate α at which the critic updates weights based on prediction error. Third, given data suggesting that WKY rats have reduced dopamine function in prefrontal cortex (De La Garza and Mahoney, 2004), a brain area implicated in working memory, such as would maintain a trace of recent responses (Goldman-Rakic, 1992; Bussey et al., 2001), we reduced the perseveration parameter *P*. As described below, simulations with these three parameter values (i.e., α = 0.005, *T* = 0.25, and *P* = 0) produced behavior that reproduced key features of the WKY rat.
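To illustrate why reducing *T* biases the model toward exploiting familiar responses, the following sketch compares the selection probability of a single favored action under the SD value (*T* = 1.0) and the WKY value (*T* = 0.25). The activation pattern is a hypothetical toy case, assumed only for illustration: one action has *M* = 1 and the remaining 99 actions have *M* = 0.

```python
import math


def softmax(values, T):
    """Softmax with divisor T, matching the form Pr(r) ~ exp(M_r / T)."""
    top = max(values)  # subtracting the max leaves the probabilities unchanged
    exps = [math.exp((v - top) / T) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def p_favored(T, n_actions=100, advantage=1.0):
    """Probability of the one favored action in the toy activation pattern."""
    M = [advantage] + [0.0] * (n_actions - 1)
    return softmax(M, T)[0]


if __name__ == "__main__":
    for label, T in (("SD", 1.0), ("WKY", 0.25)):
        print(label, T, round(p_favored(T), 3))
```

With the same activation values, the smaller *T* concentrates probability on the favored action, so a response that has acquired even a modest advantage is selected far more consistently.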

All modeling results reported are averaged over 10 simulation runs.

### **MODELING RESULTS**

### **BASIC FEATURES OF AVOIDANCE ACQUISITION AND EXTINCTION IN SD AND WKY**

**Figure 3A** shows typical acquisition and extinction curves obtained in male SD and WKY rats, expressed as percent of trials with an avoidance response, with WKY rats acquiring faster (sessions 1–10) and to a higher asymptotic level, compared to SD rats (Jiao et al., 2011a); WKY rats also extinguish slower when shock no longer occurs (sessions 11–23). Another way to assess learning is by considering latency from onset of the warning signal to first lever-press response; responses occurring before shock onset (during the warning period) are avoidance responses, and those occurring during the subsequent shock period are escape responses. As shown in **Figure 3B**, during the first few acquisition sessions, both SD and WKY rats rapidly decrease average latency, so that on most trials, responses occur within the warning period; during extinction, latency rapidly increases in SD rats while WKY rats continue to give responses during the warning period for several sessions, even though the shock no longer occurs (Servatius et al., 2008). **Figure 3C** shows acquisition and extinction curves obtained in the SD and WKY models, with fast acquisition and slow extinction in the WKY model. Similarly, the SD model shows decreasing response latency across the 12 acquisition sessions, so that by the end of acquisition, most responses are avoidance responses that occur within the warning period (here, within timesteps 0–6); during extinction, latencies quickly increase (**Figure 3D**). However, in the WKY model, response latencies remain within the warning period for several extinction sessions, similar to the rat data shown in **Figure 3B**.

**FIGURE 3 | Acquisition and extinction of avoidance**. **(A)** Male WKY rats acquire avoidance, expressed as percent of trials with an avoidance response, faster (sessions 1–10) and to a higher asymptotic level, and extinguish slower (sessions 11–23), compared to male SD rats. Adapted from Figure 5 of Jiao et al. (2011a). **(B)** The same strain difference is reflected in latency to respond: male WKY rats respond faster than male SD rats during acquisition, and continue to give short-latency responses during the first few extinction sessions. Here, latency is defined as average time from onset of warning signal to first avoidance response; responses occurring within first 60 s after warning signal onset (below dotted line) are avoidance responses. Adapted from Figure 1 of Servatius et al. (2008). **(C)** As in the rat data, the WKY model acquires faster (sessions 1–12) and extinguishes slower (sessions 13–24) than the SD model. **(D)** Similarly, the WKY model gives faster latency responses than the SD model, and continues to give short-latency avoidance responses for the first several sessions of extinction. Avoidance responses occur within the first six timesteps after warning signal onset (below dotted line). Here and in subsequent figures, simulation results are shown averaged over 10 simulation runs; error bars show SEM computed across runs.

As mentioned above, warm-up is exhibited in the SD but not WKY rats. **Figure 4A** shows typical within-session avoidance responding patterns, plotted as trial-by-trial responding averaged across several blocks of training sessions (Perrotti et al., 2013). As illustrated in the figure, avoidance responding typically increases across trials within a session, but particularly in later sessions, SD rats generally make fewer avoidance responses on the first few trials of a session, compared to their performance at the end of the previous session or later in the same session. **Figure 4B** shows similar within-session data from the SD and WKY models. During the first three sessions of acquisition, the SD model does not show much avoidance responding (**Figure 4B1**); however, as the avoidance response is acquired in sessions 4–6 and beyond, the SD simulations reliably show warm-up (**Figures 4B2**–**4**). WKY simulations do not show warm-up during these acquisition sessions. During early extinction (**Figures 4B5,6**), the SD model continues to show warm-up, meaning that avoidance responses increase over the first few trials of an extinction session, even though no reinforcer is delivered; this pattern of paradoxical increases in responding across the first few trials of early extinction sessions has also been observed in SD rats (Beck et al., 2011).

### **EFFECTS OF MANIPULATING SHOCK INTENSITY**

One possible reason for faster learning in the WKY strain could be increased sensitivity to shock, since stronger punishers should tend to produce faster associative learning. However, increasing the shock amplitude, e.g., from 1 to 2 mA, does not significantly alter acquisition speed in either WKY or SD rats, with SD rats continuing to learn more slowly than WKY rats at either amplitude (**Figure 5A**; Jiao et al., 2011b), although extinction in the WKY rats is worse after training with the higher amplitude shock. **Figure 5B** shows a similar pattern in the model: when the shock amplitude (value of *R*shock in the model) is doubled, WKY simulations still learn faster than SD simulations; however, extinction in the WKY model is severely attenuated following training with the greater shock amplitude. The modeling results suggest that differences in shock sensitivity do not have to be assumed to explain strain differences in learning and extinction.

### **EFFECTS OF MANIPULATING WARNING SIGNAL**

In outbred rat strains such as SD, learning of lever-press avoidance is affected when the length of the warning signal (interstimulus interval or ISI) is varied (Cole and Fantino, 1966; Berger and Brush, 1975; Berger and Starzec, 1988). For example, on a leverpress avoidance task similar to the paradigm described above, SD rats trained with a fixed-interval 60-s warning signal (F-60) acquired the avoidance response, but those trained with a 10-s warning signal (F-10) exhibited low levels of avoidance responding, although escape responding was robust (**Figure 6A**; Berger and Brush, 1975). Reduced avoidance responding under the 10-s ISI is sometimes attributed to motivational factors, such as a fear response to the warning signal which causes freezing that must be overcome before lever-pressing can be initiated; such explanations assume that a 60-s ISI is enough to allow this fear response to dissipate but a 10-s ISI is not. However, such explanations need not necessarily be invoked to explain reduced avoidance acquisition under a shorter ISI. Specifically, when ISI in the SD model is reduced from 60 s of simulated time to 10 s, avoidance acquisition is greatly reduced, although not abolished (**Figure 6B**). This is simply due to the probabilistic nature of response selection in the actor module of the model; with a longer ISI there is greater probability that lever-press will be selected at least once during the warning period, compared to a shorter ISI which provides fewer timepoints at which to select actions. On the other hand, WKY rats can acquire robust avoidance responses even under the shorter ISI (Berger and Starzec, 1988); **Figure 6B** shows that the WKY model is less impaired under the 10 s ISI than is the SD model, although performance is not as good as under the longer ISI for either model.
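The probabilistic argument above can be made concrete with a small calculation. The sketch below is an illustration only, not the article's actor module: it assumes a fixed, independent per-timestep probability of selecting lever-press (in the actual model this probability changes with learning), and it assumes one timestep corresponds to 10 s of simulated time, which is inferred from the article's statement that 30 days corresponds to 259,200 timesteps. The function name and the value 0.1 are hypothetical.

```python
def prob_at_least_one_press(p_press: float, n_timesteps: int) -> float:
    """Probability that lever-press is selected at least once in
    n_timesteps independent action selections, each with probability p_press."""
    if not 0.0 <= p_press <= 1.0:
        raise ValueError("p_press must be between 0 and 1")
    if n_timesteps < 0:
        raise ValueError("n_timesteps must be non-negative")
    return 1.0 - (1.0 - p_press) ** n_timesteps


# Assumption: 10 s of simulated time per timestep (inferred, see lead-in).
SECONDS_PER_TIMESTEP = 10

# Hypothetical per-timestep press probability early in training.
P_PRESS = 0.1

for isi_seconds in (10, 60):
    n = isi_seconds // SECONDS_PER_TIMESTEP
    p = prob_at_least_one_press(P_PRESS, n)
    print(f"ISI {isi_seconds:>2} s ({n} timestep(s)): P(at least one press) = {p:.3f}")
```

Under these assumptions, the same per-timestep tendency to press yields a much higher chance of at least one avoidance response within the longer warning period, without any appeal to freezing or fear dissipation.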

Manipulating the ISI provides a way to explore another possible explanation for the absence of warm-up shown in WKY rats and the WKY model, which is that warm-up occurs only while the avoidance response is still being acquired; thus, SD rats (and model) which learn slowly continue to show warm-up behavior throughout the acquisition sessions, but WKY rats (and model) which quickly reach a higher level of performance do not show warm-up behavior. However, **Figure 6C** shows that, while the SD model shows warm-up under both the 10 and 60-s ISI conditions, there continues to be an absence of warm-up in the WKY model, even under the 10-s condition, where a relatively low performance criterion is reached even in the final session block of acquisition training (**Figure 6C4**).

The model therefore makes the novel prediction that the presence of warm-up in SD, and the absence of warm-up in WKY, should be independent of whether high or low performance levels are reached.

### **MANIPULATIONS OF INTER-SESSION INTERVAL**

Another feature of warm-up observed in early studies with outbred rats is that it appears even when the inter-session interval is fairly short, e.g., 30 min (Hineline, 1978) or 1 h (Kamin, 1963), and occurs whether or not the animal is removed to the home cage between sessions, or is housed round-the-clock in the conditioning chambers to eliminate possible contextual effects (Hineline, 1978). The SD model is able to capture these effects as well. As the length of the inter-session interval is varied from 0 min to the "standard" 48 h, and even up to the equivalent of 30 days of simulated time (259,200 timesteps) between testing sessions, there is little effect on acquisition or extinction rate in the SD model (**Figure 7A1**; for clarity, only a few representative curves are shown); **Figure 7C1** plots the eventual asymptote (avoidance rates in training session 12) for all values of inter-session interval explored in the model, and shows that all simulations reached approximately the same asymptote. However, inter-session interval does affect warm-up in the SD model, evident as a sharp decrease in response on the first trial of a session (**Figure 7A2**; again, for clarity, only a few representative curves are shown); **Figure 7C2** shows data from all inter-session intervals explored, plotted as a difference score representing the average difference in responding on trial 2 vs. trial 1 of sessions 10–12. There is no warm-up in the SD model when sessions are continuous, but warm-up emerges with inter-session intervals as short as a few minutes of simulated time, and reaches what appears to be a maximum with intervals of 30 min or longer. The same general pattern of results is obtained in the SD model when "round-the-clock" housing is simulated; i.e., when contextual inputs remain the same throughout the experiment rather than switching to the

WKY rats is independent of shock intensity (1 vs. 2 mA), although male WKY rats trained with the 2 mA shock extinguish more slowly than counterparts trained with 1 mA shock, or SD rats at either intensity. Adapted from Figure 1

value of Rpress. As in the animal data, increasing the shock intensity (from Rpress = -4 vs. -8) strongly attenuates extinction in the WKY model, with relatively little effect on extinction in the SD model.

home cage context during the inter-session interval (simulations not shown). Therefore, the model can also capture this feature of warm-up in the SD rat.
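The warm-up difference score described above (responding on trial 2 minus responding on trial 1, averaged over sessions 10–12) can be sketched as follows. This is a minimal illustration; the function name, the data layout (one list of per-trial avoidance proportions per session), and the numbers are hypothetical and are not taken from the article's simulations.

```python
from statistics import mean
from typing import Sequence


def warmup_difference_score(sessions: Sequence[Sequence[float]]) -> float:
    """Average, across sessions, of (responding on trial 2) - (responding on trial 1).

    Each element of `sessions` holds per-trial avoidance proportions for one
    session, with index 0 = trial 1 and index 1 = trial 2.
    A positive score indicates warm-up; a score near 0 indicates none.
    """
    if not sessions:
        raise ValueError("at least one session is required")
    return mean(session[1] - session[0] for session in sessions)


# Hypothetical per-trial avoidance proportions for sessions 10, 11, and 12.
sd_like = [[0.4, 0.8, 0.9], [0.5, 0.9, 0.9], [0.3, 0.8, 0.85]]
wky_like = [[0.9, 0.9, 0.95], [0.95, 0.95, 0.95], [0.9, 0.9, 0.9]]

print(f"SD-like score:  {warmup_difference_score(sd_like):.3f}")
print(f"WKY-like score: {warmup_difference_score(wky_like):.3f}")
```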

On the other hand, the model predicts that changes in intersession interval will affect acquisition and eventual asymptote in the WKY, without producing warm-up. **Figure 7B1** shows that

tested. contin = continuous sessions, m = min, h = hours, d = days; \* = "standard" inter-session interval. Note that, for all conditions except contin, each session started with a 1 min pre-stimulus interval, in addition to the explicit inter-session interval.

avoiding, session 12) for all intervals tested, in both the SD and WKY models. Inter-session interval does affect warm-up in the SD but not WKY model; again, for illustration **(A2)** and **(B2)** show data obtained under a

shorter inter-session intervals (e.g., ≤1 h of simulated time) produce faster learning to a higher asymptote, and slower extinction, than longer inter-session intervals (e.g., ≥6 h). However, in no case do WKY simulations exhibit warm-up (**Figure 7B2**).

The simulations with varying inter-session interval have implications for empirical studies. In particular, lever-press avoidance in rats is typically run with sessions on alternating days (i.e., 48-hour inter-session interval); this is primarily due to a tacit assumption that more frequent sessions (e.g., 24-hour inter-session interval) might be too stressful or otherwise impair learning. However, the simulations in **Figure 7** suggest that, over a wide range of inter-session intervals, there is little effect on acquisition, extinction, or warm-up in either the SD or WKY model, at least for inter-session intervals longer than about 30 min. In particular, if the model can adequately account for the major processes underlying avoidance learning in SD and WKY rats, then data obtained under daily training should show the same basic features of faster acquisition in WKY, with SD but not WKY showing warm-up. This prediction was tested with an empirical study, as described next.

### **EMPIRICAL METHODS**

As a test of the model prediction that strain differences in acquisition and warm-up observed under the "standard" inter-session interval of 48 h appear also with a 24-hour inter-session interval, an empirical study was conducted with SD and WKY rats given daily training sessions of lever-press avoidance. Materials and procedures generally followed those of prior studies described above (Servatius et al., 2008; Beck et al., 2010) except for inter-session interval which was reduced to 24 h, as described below. The study methods were approved by the IACUC at VA New Jersey Health Care System and conformed to Federal standards set in the NIH Guide for the Care and Use of Laboratory Animals.

### **ANIMALS**

Eight male WKY rats (10 weeks old) and 8 male SD rats (10 weeks old) were obtained from Harlan Labs Inc. (Indianapolis, IN, USA). Rats were individually housed in cages on a 12:12 light cycle (lights on at 0700). All rats had at least 2 weeks to acclimate to their living conditions prior to the start of training and had free access to water and food in their home cages. The Institutional Animal Care and Use Committee approved all procedures in accordance with AAALAC standards.

### **APPARATUS**

Training was conducted in 30 cm × 25 cm × 30 cm operant avoidance chambers. The chambers were sound attenuated and had clear Plexiglas front doors. One wall was fitted with a lever (10.5 cm above the grid floor), a speaker (26 cm above the floor), and a light cue (20.5 cm above the floor) that designated the ITI, and blinked at a rate of 0.5 Hz when illuminated. On the opposing wall, a house light (26 cm above the floor) was continually lit for illumination. A scrambled 1.0 mA electric footshock was delivered via a shocker (Coulbourn Instruments, Langhorn, PA, USA).

### **AVOIDANCE CONDITIONING**

Twelve acquisition sessions occurred during the light cycle over twelve consecutive days. Each session began with a 1 min stimulus-free period, followed by 20 escape-avoidance trials. A trial began with a 75 dB, 1000 Hz tone (warning signal) that preceded the first shock by 1 min. Lever-press responses during this tone-alone warning period terminated the tone and were scored as avoidance responses. If no avoidance response was made, the tone remained on and a series of 1.0 mA footshocks (0.5 s in duration every 3 s) were delivered through the grid floor; lever-press responses during this period caused termination of both tone and shock and were scored as escape responses. In the absence of an escape response, shocks terminated after 300 s. Each trial was followed by a 3 min ITI, during which the blinking light cue (ITI signal) was presented. Typically, any rats that fail to produce at least five lever-press responses by the end of Session 5 are excluded; in the current experiment, no animals met this criterion and none were excluded.
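The scoring rule in this procedure can be sketched in a few lines. The example below is illustrative only and is not the Graphic State or S-Plus code used in the study. It assumes that latency is measured in seconds from tone onset, that a response at exactly 60 s counts as an escape, and that the 300 s shock period is measured from the first shock; the function names and example latencies are hypothetical.

```python
from typing import Optional, Sequence

WARNING_PERIOD_S = 60.0     # tone-alone period before the first shock
MAX_SHOCK_PERIOD_S = 300.0  # assumption: measured from first-shock onset


def classify_trial(latency_s: Optional[float]) -> str:
    """Classify one trial from the latency (s from tone onset) of the first
    lever press; None means no press occurred on the trial."""
    if latency_s is None:
        return "no response"
    if latency_s < 0:
        raise ValueError("latency must be non-negative")
    if latency_s < WARNING_PERIOD_S:
        return "avoidance"
    if latency_s < WARNING_PERIOD_S + MAX_SHOCK_PERIOD_S:
        return "escape"
    return "no response"


def avoidance_rate(latencies: Sequence[Optional[float]]) -> float:
    """Proportion of trials in a session scored as avoidance responses."""
    if not latencies:
        raise ValueError("session has no trials")
    n_avoid = sum(classify_trial(lat) == "avoidance" for lat in latencies)
    return n_avoid / len(latencies)


# Hypothetical latencies for a short block of trials.
example = [75.2, 61.0, 48.5, 12.3, None, 30.0]
print([classify_trial(lat) for lat in example])
print(f"avoidance rate: {avoidance_rate(example):.2f}")
```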

### **DATA ANALYSIS**

Graphic State (Coulbourn Instruments, Langhorn, PA, USA) was used to control the testing apparatus and to record avoidance responses and response latency on each trial. Custom algorithms in S-Plus were used to detect all actions on the lever during the entire session. Avoidance responses were ascertained from these data, and they were analyzed using mixed-design ANOVA with between-subjects factor of strain and within-subjects factors of trial and/or session.

### **EMPIRICAL RESULTS**

Given daily testing sessions (24-h inter-session interval), there were main effects of Strain, *F*(1, 14) = 33.5, *p* < 0.0001 and Session, *F*(11,154) = 32.9, *p* < 0.0001, indicating acquisition of the avoidance response occurred in both strains, but the strains differed in their overall performance (**Figure 8A**). WKY rats acquired the avoidance behavior quicker and to a higher asymptotic level than SD. Thus, as in the model (**Figure 8B**), decreasing the inter-session interval from 48 to 24 h preserved the faster acquisition normally observed in WKY rats.

Next, to examine effects of the shorter inter-session interval on warm-up, avoidance responses were analyzed within a session, averaged across three sessions for each of four session blocks. There were main effects of strain, *F*(1, 14) = 33.5, *p* < 0.0001, Session block, *F*(3, 42) = 56.8, *p* < 0.0001, and trial, *F*(19, 266) = 5.1, *p* < 0.0001, as well as an interaction between strain and trial, *F*(19, 266) = 5.1, *p* < 0.0001; specifically, as shown in **Figure 9A**, WKY rats tended to outperform SD rats, particularly on the early trials of a session; by the later trials of a session block (particularly later session blocks, **Figures 9A3**,**4**), SD rats approximated the performance levels of WKY rats. As evidenced by the average of the first two trials of the last session block vs. the last two trials of the previous session block, WKY rats show absolutely no evidence of warm-up, whereas the SD rats clearly exhibit warm-up. Thus, the empirical data support the model predictions (**Figure 9B**) that warm-up is preserved in SD rats, but absent in WKY rats, even under the shorter inter-session interval.
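The degrees of freedom reported above follow from the design (16 rats, 2 strains, repeated measures on session, session block, or trial). The sketch below shows the standard arithmetic for a mixed design with one between-subjects factor and one within-subjects factor; it only checks degrees of freedom and does not reproduce the analysis itself. The function name is hypothetical.

```python
from typing import Dict, Tuple


def mixed_design_df(n_subjects: int, n_groups: int,
                    n_within_levels: int) -> Dict[str, Tuple[int, int]]:
    """Numerator and denominator degrees of freedom for a mixed-design ANOVA
    with one between-subjects factor and one within-subjects factor."""
    between_effect = n_groups - 1
    between_error = n_subjects - n_groups          # subjects within groups
    within_effect = n_within_levels - 1
    within_error = within_effect * between_error   # within factor x subjects
    return {
        "between": (between_effect, between_error),
        "within": (within_effect, within_error),
        "interaction": (between_effect * within_effect, within_error),
    }


# 8 WKY + 8 SD rats, 2 strains.
print("12 sessions:     ", mixed_design_df(16, 2, 12))
print("4 session blocks:", mixed_design_df(16, 2, 4))
print("20 trials:       ", mixed_design_df(16, 2, 20))
```

With 16 animals in 2 strains, this gives (1, 14) for strain, (11, 154) for session, (3, 42) for session block, and (19, 266) for trial and for the strain by trial interaction, matching the values reported in the text.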

### **DISCUSSION**

The current work demonstrates that a RL model can capture many aspects of avoidance acquisition and extinction of leverpress responding in outbred SD rats, including the phenomenon of warm-up, which correctly appears in the SD model even when the inter-session interval is fairly short (≥30 min of simulated time) and even if inter-session intervals occur in the training environment, with no context shift (removal to home cage) between sessions. As in the empirical data, warm-up in the SD model does not require explanations invoking emotional effects, contextual shift effects, or simple forgetting; rather, warm-up in the SD model reflects a tendency to perseverate or repeat behaviors that have occurred during the inter-session interval at the expense of avoidance responding, similar to the interpretation proposed by Spear et al. (1973). Thus, when the parameter *P*, which governs perseveration, is reduced to 0, warm-up is abolished without much effect on other aspects of behavior in the model, such as rate of acquisition or extinction (see Figures S1C,D in Supplementary Material).

The model also provides an explanation of the finding that SD rats show reduced avoidance acquisition under short ISI. This poor learning is sometimes attributed to motivational factors such as a fear response to the warning signal that causes freezing which must be overcome before an operant avoidance response can be initiated; under this theory, the shorter ISI simply does not leave enough time for this emotional response to dissipate before shock onset. However, the model provides a simpler interpretation: a shorter warning signal is simply shorter, making it less likely that a probabilistic response selection process will choose a lever-press response at least once within that time period, compared to the probability under a longer warning signal.

The model can also address data from behaviorally inhibited WKY rats, which typically show faster acquisition, slower extinction, and lack of warm-up. WKY-like behavior is produced when the model is altered by reducing the default values of three model parameters: reducing the explore/exploit parameter *T*, which causes a decrease in behavioral exploration similar to behavioral inhibition, and increases acquisition rates; reducing the learning rate α, which impairs extinction; and reducing the perseveration parameter *P*, which reduces warm-up. The model also correctly captures the effect of increasing the intensity of the punisher, which causes little facilitation of acquisition in either rat strain but greatly retards extinction in the WKY rats.
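The roles described for *T* and *P* can be illustrated with a generic softmax action-selection rule. This is a sketch under stated assumptions, not the article's model: the exact equations are not given in this section, so the softmax form, the additive perseveration bonus for the most recent action, the action names, and the preference values below are all hypothetical.

```python
import math
from typing import Dict, Optional


def action_probabilities(preferences: Dict[str, float],
                         temperature: float,
                         perseveration: float = 0.0,
                         last_action: Optional[str] = None) -> Dict[str, float]:
    """Softmax over action preferences.

    temperature (T): lower values concentrate probability on the preferred
        action (less exploration).
    perseveration (P): bonus added to the preference of the most recently
        chosen action (assumed additive form).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    adjusted = dict(preferences)
    if last_action is not None:
        adjusted[last_action] += perseveration
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(adjusted.values())
    weights = {a: math.exp((v - top) / temperature) for a, v in adjusted.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical preferences: lever-press is mildly favored over other behavior.
prefs = {"press": 1.0, "other": 0.0}

high_t = action_probabilities(prefs, temperature=1.0)
low_t = action_probabilities(prefs, temperature=0.5)
sticky = action_probabilities(prefs, temperature=1.0,
                              perseveration=1.0, last_action="other")

print(f"T=1.0:           P(press) = {high_t['press']:.3f}")
print(f"T=0.5:           P(press) = {low_t['press']:.3f}")
print(f"T=1.0, P=1.0:    P(press) = {sticky['press']:.3f}")
```

In this toy setting, lowering the temperature raises the probability of the preferred lever-press, while a perseveration bonus on a non-press behavior just performed pulls probability away from pressing, which is the kind of first-trial dip the article attributes to perseveration.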

The ability of the model to simulate these strain differences suggests that differences in behavior between SD and WKY rats may be best understood as resulting from distinct associative learning mechanisms, each of which may be amenable to independent study. If the mechanisms underlying pathological avoidance in WKY rats are similar to those underlying avoidance vulnerability in humans, then avoidance vulnerability may similarly reflect a confluence of several mechanisms which, together, produce the endophenotype.

The RL model also makes several novel predictions. First, it predicts that the impaired extinction observed in WKY rats is not simply an artifact of their higher response asymptote during acquisition, compared to SD rats. Instead, even under a short ISI where a fairly low response asymptote is reached during acquisition, the WKY model continues to show impaired extinction compared to the SD model trained under the same conditions (**Figure 6**).

Second, the model predicts that the accelerated avoidance in WKY rats is not simply a reflection of the absence of warm-up. As shown by parametric manipulations (Figures S1C,D in Supplementary Material), altering the perseveration parameter *P* at least within a range from neutral (*P* = 0) to mildly positive (*P* ≤ 0.25) values affects warm-up but has little effect on rates of either acquisition or extinction of avoidance responding. Even under conditions where the WKY model shows degraded learning, such as the short ISI training simulated in **Figure 6**, the SD model nevertheless still shows warm-up, and the WKY model does not.

Third, while continuous sessions abolish warm-up, for intersession intervals ranging from 30 min to 30 days of simulated time, warm-up is robust in the SD model, but never appears in the WKY model. This prediction was partially confirmed by our empirical data, which show that when the inter-session interval is halved, from the "standard" 48 to 24 h (daily sessions), WKY rats still acquire the avoidance response faster than SD rats, while SD but not WKY still show warm-up. While the daily testing sessions may arguably be more stressful for the animal, in neither the empirical study nor the model simulations did this change affect associative learning.

Limitations of the current work include the fact that the RL model is a fairly abstract model; although parameters can be manipulated which bear some resemblance to known features of SD vs. WKY rats, the RL model cannot provide a complete account of the underlying biology that gives rise to strain differences in avoidance behavior. In addition, while the current study focused on comparing SD and WKY, there are other strain differences that could be modeled. For example, the outbred C57BL mouse strain appears to acquire a lever-press avoidance response about as well as outbred SD rats, but an inbred strain, the FVB/NJ mouse, learns to escape but not avoid (Brennan, 2004). The RL model could be used to examine possible mechanisms underlying this behavioral phenotype, which may be relevant to understanding comparable phenotypes in human anxiety and depression.

Further, although strain is indeed an important determinant of variability in learning and behavior, there are other important individual differences that affect acquisition and maintenance of avoidance too; among these are sex differences (Beck et al., 2010, 2011), which the current model does not address, although some aspects of sex differences might be in principle amenable to future study to determine which parametric differences best capture behavioral differences observed between male and female rats. In particular, while female rats generally outperform male rats of the same strain on lever-press avoidance acquisition, male and female rats are differentially affected by the presence of the safety signal during the ITI (Beck et al., 2011), and computational modeling might help elucidate some of the mechanisms underlying this difference.

Finally, although the RL model provides simple explanations for many features of avoidance that do not require invoking motivation or emotion as constructs, nevertheless SD and WKY rats clearly differ in emotional responding; in fact, one of the defining characteristics of behavioral inhibition in WKY rats is exaggerated freezing after initial placement in the center of a brightly lit open field or when faced with an electrified probe [for review, see Jiao et al. (2011a)]. Such freezing would obviously be expected to facilitate passive avoidance in WKY rats, although it would actually be expected to impair – not facilitate – active avoidance compared to SD rats. Although freezing to the warning signal has not to our knowledge been explicitly assessed in WKY rats during leverpress avoidance, there are no differences between WKY and SD rats in freezing to a tone stimulus that has been paired with an electric shock in a classical conditioning paradigm (LeDoux et al., 1983). In addition, increasing the shock intensity, which should presumably increase emotional responding, does not greatly affect acquisition in either strain (**Figure 5A**; Jiao et al., 2011b). For these

reasons, freezing alone does not appear to adequately explain the strain differences in warm-up. However, freezing is an important species-specific response to threatening stimuli, and may play an important role in strain differences in active avoidance; in fact, given the higher freezing in WKY rats placed in the open field, it is theoretically possible that manipulations which reduce freezing would actually magnify strain differences observed in avoidance acquisition and extinction. On the other hand, avoidance learning is known to be facilitated following exposure to stressors (Brennan et al., 2005, 2006). The existing RL model does not consider how learning might be modulated by emotional and/or neurochemical states brought on by prior experiences, and thus it cannot directly address these concepts. However, the model simulations and empirical study both suggest that reducing inter-session interval, which might arguably cause an increase in stress – by increasing absolute shock frequency and/or allowing less time for arousal to dissipate between sessions – is not of itself sufficient to affect strain differences in avoidance acquisition and warm-up.

Future modeling work could address some of these ideas. Despite these limitations, the current work shows that a fairly simple RL model can simulate key features of lever-press avoidance, and parametric manipulations can capture a range of observed phenomena in acquisition, extinction, and warm-up, without needing to invoke additional motivational or emotional mechanisms. The model may thus provide a framework for further exploration of these mechanisms and their role in pathological avoidance, and in future could be used to explore the space of possible potential pathways (e.g., behavioral interventions) to remediate pathological avoidance. Such exploration can be done cheaply and quickly in a computational model, and paradigms identified as of interest could then be targeted for future study in rat models and also in humans. This in turn might help in the development of more sophisticated behavioral therapies to promote extinction of pathological avoidance or even prevent the initial development of pathological avoidance in anxiety-vulnerable individuals.

### **AUTHOR CONTRIBUTIONS**

Catherine E. Myers, Kevin D. Beck, and Richard J. Servatius contributed to the design of the modeling work; Ian M. Smith and Kevin D. Beck contributed to the design and implementation of the empirical study. Catherine E. Myers conducted the computational modeling and model analysis; Ian M. Smith and Kevin D. Beck conducted the empirical data collection and analysis. All authors contributed to drafting and revising the manuscript, approved the final version, and agree to be accountable for all aspects of the work.

### **ACKNOWLEDGMENTS**

This work was supported by Award Number I01CX000771 from the Clinical Science Research and Development Service of the VA Office of Research and Development, Award Number I01BX000218 from the Biomedical Laboratory Research and Development Service of the VA Office of Research and Development, and by the SMBI. Opinions expressed herein are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs or the U.S. Government.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Journal/10.3389/fnbeh.2014.00283/abstract

### **REFERENCES**


Sutton, R. (1988). Learning to predict by the methods of temporal differences. *Mach Learn* 3, 9–44. doi:10.1007/BF00115009

Tolman, E. (1932). *Purposive behavior in animals and men*. New York: Appleton-Century-Croft.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 April 2014; accepted: 01 August 2014; published online: 18 August 2014.*

*Citation: Myers CE, Smith IM, Servatius RJ and Beck KD (2014) Absence of "warm-up" during active avoidance learning in a rat model of anxiety vulnerability: insights from computational modeling. Front. Behav. Neurosci. 8:283. doi: 10.3389/fnbeh.2014.00283*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Myers, Smith, Servatius and Beck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Not so bad: avoidance and aversive discounting modulate threat appraisal in anterior cingulate and medial prefrontal cortex

Michael W. Schlund<sup>1</sup>\*, Adam T. Brewer<sup>2</sup>, David M. Richman<sup>3</sup>, Sandy K. Magee<sup>1</sup> and Simon Dymond<sup>4</sup>

*<sup>1</sup> Barrett Translational Behavioral and Neurobehavioral Laboratory, Department of Behavior Analysis, University of North Texas, Denton, TX, USA, <sup>2</sup> Department of Psychology and Liberal Arts, Florida Institute of Technology, Melbourne, FL, USA, <sup>3</sup> Department of Educational Psychology and Leadership, Texas Tech University, Lubbock, TX, USA, <sup>4</sup> Experimental Psychopathology Laboratory, Department of Psychology, Swansea University, Swansea, UK*

Edited by:

*Richard J. Servatius, Syracuse DVA Medical Center, USA*

### Reviewed by:

*Seth Davin Norrholm, Emory University School of Medicine, USA Christine A. Rabinak, Wayne State University, USA*

### \*Correspondence:

*Michael W. Schlund, Barrett Translational Behavioral and Neurobehavioral Laboratory, Department of Behavior Analysis, University of North Texas, 1155 Union Circle, Box 310919, Denton, TX 76203-0919, USA michael.schlund@unt.edu*

> Received: *21 April 2015* Accepted: *15 May 2015* Published: *10 June 2015*

#### Citation:

*Schlund MW, Brewer AT, Richman DM, Magee SK and Dymond S (2015) Not so bad: avoidance and aversive discounting modulate threat appraisal in anterior cingulate and medial prefrontal cortex. Front. Behav. Neurosci. 9:142. doi: 10.3389/fnbeh.2015.00142*

The dorsal anterior cingulate (adACC) and dorsal medial prefrontal cortex (dmPFC) play a central role in the discrimination and appraisal of threatening stimuli. Yet, little is known about what specific features of threatening situations recruit these regions and how avoidance may modulate appraisal and activation through prevention of aversive events. In this investigation, 30 healthy adults underwent functional neuroimaging while completing an avoidance task in which responses to an Avoidable CS+ threat prevented delivery of an aversive stimulus, but not to an Unavoidable CS+ threat. Extinction testing was also completed where CSs were presented without aversive stimulus delivery and an opportunity to avoid. The Avoidable CS+ relative to the Unavoidable CS+ was associated with reductions in ratings of negative valence, fear, and US expectancy and activation. Greater regional activation was consistently observed to the Unavoidable CS+ during avoidance, which declined during extinction. Individuals exhibiting greater aversive discounting (that is, those more avoidant of immediate monetary loss compared to a larger delayed loss) also displayed greater activation to the Unavoidable CS+, highlighting aversive discounting as a significant individual difference variable. These are the first results linking adACC/dmPFC reactivity to avoidance-based reductions of aversive events and modulation of activation by individual differences in aversive discounting.

Keywords: avoidance, threat, fear, anterior cingulate, medial prefrontal cortex, loss discounting, anxiety, neuroimaging

### Introduction

Discriminating and appraising situations as threatening or non-threatening is important for adaptive approach-avoidance decision-making. Equally important is rapidly and flexibly altering appraisals and associated negative emotional responses following actions that successfully change stimuli/contexts from threats to non-threats. Numerous nonhuman and human investigations on instructed and conditioned fear (for reviews, see Sehlmeyer et al., 2009; Mechias et al., 2010) and anticipatory anxiety and aversion (Straube et al., 2007) highlight central roles for the dorsal anterior cingulate (adACC) and dorsal medial prefrontal cortex (dmPFC) in threat appraisal and fear expression (Milad et al., 2007; Rushworth et al., 2007; Etkin et al., 2011; Shackman et al., 2011; Bravo-Rivera et al., 2014; Kalisch and Gerlicher, 2014). However, a critical gap in our knowledge base concerns what variables and characteristics of threatening situations recruit these regions (Rushworth et al., 2007; Etkin et al., 2011; Shackman et al., 2011; Kalisch and Gerlicher, 2014). Accordingly, this investigation employed functional magnetic resonance imaging (fMRI) to examine the effects of avoidance behavior, a prominent emotional coping strategy and core feature of anxiety (Dymond and Roche, 2009; Aldao et al., 2010), as well as trauma- and stress-related disorders (American Psychiatric Association, 2013), and extinction on threat appraisal and regional activation. Findings obtained will contribute to contemporary theories of adACC/dmPFC function and development of an empirically grounded model of the endophenotypic expressions of pathological avoidance in anxiety, which is fundamental to advancing our understanding of its etiology, correlates, and prevention.

Anxiety disorders are characterized by exaggerated negative emotional responses to threat, and chronic, ritualized forms of cognitive and behavioral avoidance (Craske et al., 2009). Theories of avoidance highlight central roles for Pavlovian and instrumental learning processes in identifying and coping with threat. During Pavlovian learning, a neutral cue that predicts an aversive unconditioned stimulus (US) will become a conditioned stimulus (CS+) capable of eliciting a conditioned response (CR), while another cue (CS−) does not. Avoidance is then negatively reinforced via instrumental conditioning when it removes the fear-eliciting CS+ threat and subsequently prevents US delivery. In classic two-factor theory, fear and avoidance are closely associated such that CS+ termination and fear reduction are the assumed mechanisms driving and maintaining avoidance (Mowrer, 1947; Bolles, 1973). Instrumental based accounts underscore reduction in the relative frequency of US contact as the key mechanism maintaining avoidance (Herrnstein and Hineline, 1966; Dymond and Roche, 2009). Alternatively, cognitive expectancy theory suggests CS > US expectancies acquired through Pavlovian learning and CS > noUS expectancies acquired during avoidance learning may maintain active avoidance (Lovibond, 2006).

Several contemporary theoretical perspectives highlight a role for adACC/dmPFC in regulating threat appraisal and fear expression based on fear conditioning studies showing greater regional responses to a CS+ threat relative to a CS− (Rushworth et al., 2007; Etkin et al., 2011; Shackman et al., 2011; Kalisch and Gerlicher, 2014). These views may also be extended to human and nonhuman investigations on avoidance that show adACC/dmPFC recruitment to CS+ threats that prompt avoidance (Jensen et al., 2003; Kim et al., 2006; Mobbs et al., 2007, 2009; Delgado et al., 2009; Schlund et al., 2010, 2011, 2013; Bravo-Rivera et al., 2014). However, our current knowledge of relations between adACC/dmPFC and avoidance is limited because most human neuroimaging studies employ avoidance paradigms that restrict imaging analyses to activation associated with both CS+ onset and the decision to avoid. Using a different approach, Schlund et al. (2013) found that when a CS+ was repeatedly presented during a 16 s threat period and avoidance successfully prevented US deliveries, analyses focusing on temporal dynamics showed CS+ activation initially increased, but then decreased during the threat period even though avoidance responding continued. These findings revealed that adACC activation is not necessarily sustained during avoidance, but instead shows an experience-dependent change when avoidance successfully prevents US deliveries. Regression analysis also revealed that the magnitude of adACC activation was negatively correlated with total avoidance responses. Importantly, the experience-dependent change was observed when avoidance was well-learned, thereby eliminating trial and error learning as an explanation. Such findings suggest adACC/dmPFC can flexibly regulate threat appraisal and fear expression based on avoidance-based local changes in the likelihood of experiencing an aversive outcome. However, the significance of these findings is somewhat limited because CS+s associated with unsuccessful avoidance were not employed as negative controls.

Results suggesting adACC/dmPFC is sensitive to local reductions in US probability through avoidance seem reasonable given the dynamic relationship between avoidance and the CS > US association. More specifically, it is plausible to suggest that one consequence of successful avoidance is that it transforms the CS+ into a safety-like CS− cue by preventing US delivery. Thus, successful avoidance adds to the CS+ an additional inhibitory association (CS+ > noUS) that coexists with the original excitatory association (CS+ > US) established through prior Pavlovian pairings (see Craske et al., 2014). Another consequence of avoidance is that it produces an immediate local reduction in US probability, the net effect of which is a fundamental change in the reinforcement history and associated CS+ threat value, with the duration and extent of change entirely dependent upon the participant's avoidance behavior. In some ways, successful avoidance models extinction, which involves learning an inhibitory association to the CS+ and in turn alters the CS+ reinforcement history. This view is consistent with human and non-human investigations of avoidance that report reductions in cognitive US expectancies and physiological fear to CS+s (Starr and Mineka, 1977; Lovibond et al., 2007; Dymond et al., 2011, 2012).

The primary aim of this investigation was to further our understanding of how changes in the CS+ > US association through avoidance modulate adACC/dmPFC responses and subjective ratings of US expectancies, stimulus valence, and fear. Our specific question was to what extent adACC/dmPFC responses are differentially controlled by the prevailing excitatory CS+ > US association versus the temporary inhibitory CS+ > noUS association governed by successful avoidance. This question speaks to the adaptive ability to flexibly alter appraisals and associated negative emotional responses following actions that effectively change stimuli/contexts from threats to nonthreats. Evidence showing adACC/dmPFC activation to a CS+ associated with successful avoidance (i.e., avoidance that prevents US delivery) would suggest control by the prevailing excitatory CS+ > US association and adACC/dmPFC insensitivity to local reductions in US probability. Alternatively, evidence showing the absence of adACC/dmPFC activation to a CS+ associated with successful avoidance would highlight control by the inhibitory CS+ > noUS association and adACC/dmPFC sensitivity to local changes in US probability. The latter finding would help bridge views emphasizing adACC/dmPFC in regulating threat appraisal and fear expression (Etkin et al., 2011; Shackman et al., 2011; Kalisch and Gerlicher, 2014) with views that regional responses reflect an extended choice-outcome history with response dependent positive and negative outcomes (i.e., CS+ reinforcement history) (Rushworth et al., 2007).

The secondary aim of this investigation was to bring a clinically-relevant individual-differences approach to advancing our understanding of relations between adACC/dmPFC function and human avoidance. One important gap in our knowledge concerns how vulnerability factors implicated in the pathogenesis of chronic avoidance coping modulate human avoidance neurocircuitry (Schlund et al., 2011, 2013). Emerging evidence on discounting of rewards and aversive outcomes highlights discounting as a candidate individual difference variable in psychopathology (Rounds et al., 2007; Bickel et al., 2009; Salters-Pedneault and Diller, 2013; Tanaka et al., 2014). In research on anxiety, for example, Salters-Pedneault and Diller (2013) used a behavioral delay discounting task where participants made choices between electric shocks delivered immediately vs. shocks delivered after various time delays and found that increased anxiety and experiential avoidance scores were associated with avoidance of immediate shocks (see also Deluty, 1978). Evidence from human neuroimaging studies examining discounting of gains and losses implicates the anterior cingulate, striatum, posterior cingulate, and lateral prefrontal cortex (Bickel et al., 2009) and has highlighted regional differences in the magnitude of activation to losses. For example, Xu et al. (2009) reported choices involving losses were associated with greater activation in posterior parietal areas, insula, thalamus, and dorsal striatum and choices involving immediate losses differentially activated anterior cingulate cortex, insula, and superior frontal gyrus. Similarly, Tanaka et al. (2014) examined the neural correlates of gain and loss asymmetry (i.e., the "sign effect") and found the sign effect was associated with a greater insular response to the magnitude of loss than gain and a greater striatal response to the delay of loss than gain. Collectively, more immediate losses are perceived as more aversive or threatening and choice patterns recruit brain regions implicated in threat appraisal and fear expression. Here, we sought to characterize the relation between discounting of delayed losses and adACC/dmPFC activation to CS+ threat to evaluate aversive discounting as a candidate individual difference variable. We hypothesized that individuals exhibiting greater aversive discounting in the form of greater avoidance of immediate losses would display greater adACC/dmPFC activation to a CS+ threat.

Using a within-subjects design, we coupled fMRI with a novel delayed avoidance task to examine the effects of avoidance and extinction on threat appraisal and adACC/dmPFC regional activation in healthy adults. The delayed avoidance task was developed to temporally separate CS presentation from avoidance responses to better isolate regional activation to CSs. Prior to neuroimaging, our participants underwent threat conditioning in which two visual CS+ threats predicted US delivery and a safe CS− predicted its absence. Afterwards, participants learned through trial and error they could avoid the US associated with one CS+ threat (Avoidable CS+) but not a second CS+ threat (Unavoidable CS+). Pretraining established CSs as threats and eliminated learning related changes in activation from analyses. Neuroimaging occurred during the delayed avoidance task with CSs presented randomly. Within the same session, extinction testing was performed in which the US was withheld and CSs presented without an opportunity to avoid. We hypothesized the Unavoidable CS+ threat would be associated with greater regional activation and greater ratings of negative valence, fear, and US expectancy compared to both the Avoidable CS+ and Safe CS−.

### Materials and Methods

### Participants

Thirty right-handed adults (Mage = 24.1, SD = 4.3, 16 males) without any reported clinical disorders, metal in the body, pregnancy, or use of medications altering central nervous system functioning provided written informed consent. Participants were compensated with a fixed amount for participation and could earn money during the experimental tasks. The Institutional Review Boards for the Protection of Human Subjects at the University of North Texas and Texas Tech University approved this investigation.

### Conditioned Stimuli

Three images of spaceships served as CSs (see **Figure 1** for an example). The US was an empirically validated compound aversive stimulus consisting of the simultaneous presentation of a \$1.00 loss prompt and a 600 ms female scream (see Delgado et al., 2006; Lau et al., 2008; Schlund et al., 2010, 2011, 2013; Glenn et al., 2012a,b). One CS+ was arbitrarily designated the "Avoidable CS+" threat. For this CS+, participants learned through threat conditioning (see below) that it predicted US delivery and through trial and error learning (see below) that an avoidance response could prevent US delivery. The second CS+ was designated the "Unavoidable CS+" threat. For this CS+, participants learned through threat conditioning that it predicted US delivery and through trial and error learning that no amount of responding could prevent US delivery. Thus, instructions were not used to establish CS+s as threats or direct avoidance responding. Lastly, a "Safe CS−" spaceship was established by not pairing it with US delivery.

### Design

The procedure consisted of completing four consecutive steps prior to neuroimaging: (a) Completion of a discounting task with hypothetical money losses; (b) CS pretesting to ensure CSs were viewed as neutral and responding was undifferentiated; (c) Threat conditioning, which established CS+s as threats by pairing them with the US, and establishing the CS− as safe by pairing it with the absence of the US; (d) Avoidance learning, where US presentations could be prevented to one CS+ but not another. Lastly, neuroimaging occurred while participants completed a delayed avoidance task during one scanning run and extinction testing during a second run.

### fMRI Data Acquisition

The avoidance task and extinction testing were performed during two separate fMRI scans sensitive to blood oxygen level dependent (BOLD) contrast with a 3T Siemens Magnetom Skyra equipped with a 20 channel head coil. T2*-weighted echo-planar images consisted of 41 axial oriented slices with voxels measuring 3.5 mm<sup>3</sup> (repetition time = 2000 ms, echo time = 20 ms, 90° flip angle, field of view = 221 mm, 64 × 64 matrix, 272 dynamics). To minimize equilibrium effects, the first four EPI volumes for each acquisition were discarded. Additionally, a high-resolution T1-weighted image was obtained for anatomical reference (192 sagittal slices, voxels 0.9 mm<sup>3</sup>, repetition time 1900 ms, echo time 2.49 ms, field of view 240 mm).

### Procedure

### Discounting Task

Aversive discounting was assessed using an adjusting amount delay discounting task (Du et al., 2002) with hypothetical monetary losses. The rationale for the task rests on the supposition that remote aversive events are less aversive or threatening than proximal ones (e.g., McNaughton and Corr, 2004). As such, responses to delayed losses may provide a novel individual difference measure of threat sensitivity and anxiety (e.g., Salters-Pedneault and Diller, 2013). In the task, participants were asked what they would prefer to lose by way of having to pay an amount of money, with paying reflecting avoidance of a more aversive alternative with a higher threat value. Participants were given repeated choices between paying (a) a large \$500 delayed loss (under randomized delay conditions of 0.08, 0.50, 1, 3, 5, and 10 years) or (b) a smaller \$250 immediate loss. Following choice of the large delayed loss, the amount of the small immediate loss decreased by 50% on the subsequent trial. Following choice of the small immediate loss, the amount of the small immediate loss increased by 50% on the subsequent trial, but never exceeded \$500. This adjusting procedure determined the point at which the subjective value of the options was equivalent (indifference point) under each delay condition. Values were the resulting small immediate loss after six trials for each delay condition. Aversive discounting was characterized for each subject using area under the curve (AUC; Myerson et al., 2001). An AUC of 1.0 would highlight consistent choice of the immediate loss, meaning delayed losses had a higher threat value and were avoided. By comparison, an AUC of 0 would highlight consistent choice of the large delayed loss, meaning smaller immediate losses had a higher threat value and were avoided.
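The adjusting procedure and the AUC summary can be sketched in a few lines of Python. This is only an illustrative reading of the description above, not the authors' task code: the function names are hypothetical, the 50% adjustment is taken literally as multiplying the current immediate amount by 0.5 or 1.5 (the cited procedure may adjust differently), and the AUC is computed with normalized trapezoids plus an assumed starting point at zero delay with full value.

```python
from typing import Sequence

DELAYED_LOSS = 500.0      # large delayed loss, in dollars
START_IMMEDIATE = 250.0   # starting small immediate loss, in dollars
DELAYS_YEARS = (0.08, 0.50, 1, 3, 5, 10)


def titrate(chose_delayed: Sequence[bool],
            start: float = START_IMMEDIATE,
            cap: float = DELAYED_LOSS) -> float:
    """Return the immediate amount left after a series of choices.

    Assumption: "decreased/increased by 50%" is read multiplicatively.
    True means the participant chose the large delayed loss.
    """
    amount = start
    for delayed in chose_delayed:
        if delayed:
            amount *= 0.5                    # immediate loss made smaller
        else:
            amount = min(amount * 1.5, cap)  # made larger, never above the cap
    return amount


def auc(delays: Sequence[float],
        indifference: Sequence[float],
        delayed_amount: float = DELAYED_LOSS) -> float:
    """Normalized trapezoidal area under the discounting curve.

    Assumptions: delays are sorted ascending, and a point at delay 0
    with normalized value 1 anchors the curve.
    """
    max_delay = max(delays)
    xs = [0.0] + [d / max_delay for d in delays]
    ys = [1.0] + [v / delayed_amount for v in indifference]
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))
```

Under these assumptions, six consecutive choices of the immediate loss drive the indifference point to the \$500 cap, and a participant at the cap for every delay has an AUC of 1.0, in line with the interpretation given in the text.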

### CS Pretesting

Participants viewed each CS spaceship for 10 s and provided a rating in three different categories: negative valence ("How much do you dislike the [CS]?"), fear ("How much do you fear the [CS]?"), and US expectancy ("How much money did you lose to the [CS]?"). Ratings were made using a 9-point scale (1 = Not at all, 9 = A lot).

### Threat Conditioning

A modified Pavlovian fear conditioning paradigm was utilized to establish two excitatory CS+ > US relations for two CS+ spaceships (i.e., threat cues) and establish a CS− spaceship as a safe cue (see **Figure 1**). The delayed avoidance task shown in **Figure 1** was modified for this purpose. Trials lasted 20 s and consisted of a 12 s threat phase during which one presented CS physically enlarged over time (15–90 mm; 8 mm/s), followed by a 2 s outcome phase and 4 s intertrial interval (the 2 s choice phase shown was omitted). Participants were given a stipend of \$10.00 and instructed to watch and learn which spaceships predicted the US and which did not during the 4 min task. CSs were presented for five trials in a randomized order with equal probability and CS+s were always followed by the US. The CS− was followed by a blank screen. Dependent measures included valence, fear, and US expectancy ratings for each CS. Threat conditioning was considered successful when both CS+s were more disliked and feared than the CS− and US expectancies for CS+s were greater compared to the CS−. Ratings provide evidence of the conscious knowledge of differences in cue-outcome contingencies among CSs and differences in associated negative appraisal processes.

### Avoidance Acquisition

**Figure 1** provides a schematic of the 4 min task used to establish avoidance prior to neuroimaging. The goal was to train participants to learn that avoidance could prevent the US following the Avoidable CS+ but not following the Unavoidable CS+, and to press response button #3 to the CS−. Thus, pretraining was designed to facilitate learning an inhibitory CS+ > noUS association for the Avoidable CS+. Pretraining also eliminated learning related activation during subsequent neuroimaging. Participants were given a stipend of \$10.00 and told that their task was to keep aliens from taking their supplies and money, that they may be able to stop an alien ship by choosing between shield #1 or shield #2 depending on which spaceship was present, and that they should press #3 to allow a "Friendly ship" [CS−] to refuel. Trials lasted 20 s and consisted of a 12 s phase during which one CS (Unavoidable CS+, Avoidable CS+ or Friendly CS−) enlarged over time, a 2 s choice phase, a 2 s outcome phase and 4 s intertrial interval. During the choice phase, when a CS+ was presented participants were prompted to make a choice between two shields (#1, #2) that may prevent the aversive stimulus. Choosing shield #1 after the Avoidable CS+ prevented the US and any other response produced the US. Therefore, avoidance was acquired through trial and error learning. Regardless of the shield chosen, the Unavoidable CS+ was always followed by the US. Dependent measures included button/shield choice and reaction time (RT) for each CS. Acquisition ended when successful avoidance to the Avoidable CS+ was >80% correct during a 5 trial block (generally two blocks were required). All participants were required to meet criterion before proceeding to neuroimaging.
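The response-outcome contingencies and the acquisition criterion described above can be written as a small sketch. The constant and function names are hypothetical and chosen only for illustration, and treating the criterion as a strict proportion over the Avoidable CS+ trials of one block is an assumption about how ">80% correct" was scored.

```python
AVOIDABLE = "Avoidable CS+"
UNAVOIDABLE = "Unavoidable CS+"
SAFE = "Safe CS-"


def us_delivered(cs: str, button: int) -> bool:
    """Return True if the aversive US follows this CS and button press."""
    if cs == UNAVOIDABLE:
        return True            # no response prevents the US
    if cs == AVOIDABLE:
        return button != 1     # only shield #1 prevents the US
    return False               # the Safe CS- is never paired with the US


def met_criterion(avoidable_buttons, threshold: float = 0.8) -> bool:
    """Check whether avoidance on Avoidable CS+ trials exceeds the threshold.

    Assumption: the criterion is the strict proportion of shield #1
    choices within one block of Avoidable CS+ trials.
    """
    n_correct = sum(1 for b in avoidable_buttons if b == 1)
    return n_correct / len(avoidable_buttons) > threshold
```

Read this way, a block such as `met_criterion([1, 1, 1, 1, 2])` evaluates to `False`, because four of five correct choices equals 80% rather than exceeding it.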

### Neuroimaging

Two consecutive 9 min imaging scans were completed, separated by a ∼3 min break. Participants were given a button box with three buttons arranged vertically and described as #1, #2, and #3. Responses were made with the right thumb. During the first scan, the delayed avoidance task was presented (see **Figure 1**). Here, participants were given a \$13.00 stipend and told their task was (again) to keep aliens from taking their supplies by applying what they learned during training. CS order was randomized in blocks of three trials and 10 blocks were presented. Dependent measures included button choice and RT for each CS along with valence, fear, and US expectancy ratings obtained at task completion.

During the second scan, extinction testing with response prevention was completed. The extinction testing/task was the delayed avoidance task modified to exclude any opportunity to respond and with all US deliveries withheld. Instructions stated that the shields (buttons) were inoperable, so no avoidance was possible; however, participants still held the button box in the scanner. Participants were not informed about US omission. CS order was randomized in blocks of three trials and 10 blocks were presented. Dependent measures included button presses to each CS (none occurred or were predicted) and valence, fear, and US expectancy ratings obtained at task completion.
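Both scans randomized CS order in blocks of three trials across 10 blocks. A minimal sketch of such a block-randomized sequence follows; the function name and the seeding are illustrative assumptions, and the presentation software actually used is not specified in the text.

```python
import random

CS_TYPES = ("Unavoidable CS+", "Avoidable CS+", "Safe CS-")


def block_randomized_order(n_blocks=10, cs_types=CS_TYPES, seed=None):
    """Return a trial order in which each block contains every CS once."""
    rng = random.Random(seed)   # local generator keeps the order reproducible
    order = []
    for _ in range(n_blocks):
        block = list(cs_types)
        rng.shuffle(block)      # shuffle within the block only
        order.extend(block)
    return order


trial_order = block_randomized_order(seed=0)
```

Shuffling within blocks guarantees that each CS appears exactly 10 times in the 30 trials and that no CS is concentrated in one part of the scan.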

### Analyses

### Neuroimaging

Neuroimaging data analyses were performed using SPM 8 (Wellcome Department of Cognitive Neurology, London UK, http://www.fil.ion.ucl.ac.uk/). Preprocessing procedures included reorientation, slice acquisition time correction, coregistration, within-subject realignment, spatial normalization to the standard Montreal Neurological Institute EPI template with resampling to 2 × 2 × 2 mm voxel sizes, and spatial smoothing using a Gaussian kernel (6 mm full width at half-maximum). High pass filtering was applied to the time series of EPI images to remove any low frequency drift in EPI signal. Head motion was restricted to <3.0 mm in any dimension using the first acquisition as a reference. No participants were excluded.

At the first level, individual subject time series data were analyzed using a multiple regression model. Events of interest modeled included an Unavoidable CS+, Avoidable CS+ and a Safe CS−, which served as baseline for sensory and motor responses and absence of US delivery. Only trials with correct responses were used in the analysis. CS+ specific activation was highlighted by creating Unavoidable CS+ > CS− and Avoidable CS+ > CS− contrast images. The effects of avoidance during the delayed avoidance task and during extinction testing were assessed by highlighting differences between CS+s with the contrast [(Unavoidable CS+ > CS−) – (Avoidable CS+ > CS−)]. Localization of activation during the 12 s threat period was revealed by convolving a 12 s boxcar function with the hemodynamic response function, which produced a parameter estimate reflecting the magnitude of activation. Additionally, a secondary time course verification of 12 s CS presentations was completed with a supplemental analysis involving a Finite Impulse Response (FIR) model with a 2 s sampling rate. The FIR analysis generated parameter estimates for each voxel every 2 s over the 12 s CS duration. Participant-specific head movement parameters were also modeled as covariates of no interest.
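The regressors and contrasts described above can be illustrated with a short pure-Python sketch. It does not reproduce the SPM 8 implementation: the double-gamma response function with shape parameters 6 and 16 is an assumed approximation, the convolution is done at the 2 s scan resolution for simplicity, and all function names are hypothetical.

```python
import math

TR = 2.0            # repetition time in seconds
CS_DURATION = 12.0  # length of the threat phase in seconds


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero at or before time zero."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def hrf(t):
    """Assumed double-gamma response: early peak minus a late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def boxcar_regressor(onsets, n_scans, tr=TR, duration=CS_DURATION):
    """Convolve a boxcar covering each CS with the response function."""
    box = [0.0] * n_scans
    for onset in onsets:
        first = int(round(onset / tr))
        last = int(round((onset + duration) / tr))
        for i in range(first, min(last, n_scans)):
            box[i] = 1.0
    kernel = [hrf(k * tr) for k in range(int(32.0 / tr))]
    return [sum(box[i - k] * kernel[k]
                for k in range(len(kernel)) if i - k >= 0)
            for i in range(n_scans)]


def fir_design(onsets, n_scans, tr=TR, duration=CS_DURATION):
    """One indicator column per 2 s bin across the 12 s CS."""
    n_bins = int(duration / tr)
    design = [[0.0] * n_bins for _ in range(n_scans)]
    for onset in onsets:
        first = int(round(onset / tr))
        for b in range(n_bins):
            if first + b < n_scans:
                design[first + b][b] = 1.0
    return design


# Contrast weights over (Unavoidable CS+, Avoidable CS+, Safe CS-).
unavoidable_vs_safe = [1, 0, -1]
avoidable_vs_safe = [0, 1, -1]
difference = [u - a for u, a in zip(unavoidable_vs_safe, avoidable_vs_safe)]
```

Subtracting the two weight vectors shows that the Safe CS− baseline cancels, so the difference contrast compares the Unavoidable CS+ directly with the Avoidable CS+.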

Functional imaging analyses proceeded through three stages: anatomically restricted localization of sustained activation to the 12 s CSs, time course verification of sustained activation and correlation of brain activation with a measure of aversive delay discounting. First level individual contrast images were carried to a second level for group analyses. Because our a priori hypotheses focused on the anterior cingulate and anterior, medial and ventral frontal regions, a regions-of-interest (ROIs) mask was created using the Automated Anatomic Labeling atlas (AAL; Tzourio-Mazoyer et al., 2002) of the WFU Pickatlas toolbox (Maldjian et al., 2003). Consequently, analyses were restricted to these regions and employed SPM's small volume correction function. Activation for the Unavoidable CS+ and the Avoidable CS+ was separately evaluated relative to the Safe CS− with one-sample t-tests thresholded at p < 0.005 uncorrected and 20 contiguous voxels. However, no significant differences were found for the Avoidable CS+. The effect of successful avoidance on CS activation during avoidance and extinction was highlighted using the contrast [(Unavoidable CS+ > CS−) – (Avoidable CS+ > CS−)] and one-sample t-tests thresholded at p < 0.005 uncorrected and 20 contiguous voxels. While these thresholds balance concerns of Type I and Type II error (Lieberman and Cunningham, 2009), all clusters reported during avoidance exceeded a cluster level family-wise error (FWE) correction set at p < 0.05. Lastly, multiple regression examined relations between regional activation identified with the Unavoidable CS+ > CS− contrast (via inclusive masking) and AUC discounting measures using the thresholds p < 0.01 uncorrected and 20 contiguous voxels. Parameter estimates and contrast values plotted are from significant peak voxels. The location of voxels with significant activation was summarized by their local maxima separated by at least 8 mm, and by converting the maxima coordinates from MNI to Talairach coordinate space using conventional transformations implemented in GingerALE 2.0 (http://www.brainmap.org/ale/). MNI coordinates are reported and regions are assigned neuroanatomic labels using the Talairach atlas for guidance. Statistical parametric maps displayed were overlaid onto a reference brain using MRIcron (http://www.sph.sc.edu/comd/rorden/mricron/).

### Behavioral

For each condition (except Avoidance Acquisition), differences among CS ratings were evaluated via three planned comparisons performed using paired t-tests and a criterion alpha set at p < 0.05/3, Bonferroni corrected. During neuroimaging of the delayed avoidance task, three planned comparisons were performed to evaluate differences among choice distributions and reaction times using paired t-tests and a criterion alpha set at p < 0.05/3, Bonferroni corrected.
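The planned comparisons can be sketched as a paired t statistic evaluated against the Bonferroni-adjusted alpha. This is a hedged illustration only: the function name is hypothetical, and converting the statistic to a p-value requires a t distribution, for which a statistics package would be needed.

```python
import math

N_COMPARISONS = 3
ALPHA = 0.05 / N_COMPARISONS   # Bonferroni-corrected criterion


def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1
```

Each of the three CS pairings would be tested this way, with a comparison declared significant only if its p-value falls below `ALPHA`.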

### Results

### Behavioral Performance

During the delayed avoidance task, participants chose shield #1 significantly more often to the Avoidable CS+ threat (M = 97%, SD = 6.4%), which successfully prevented US delivery, and chose button #3 significantly more often for the Safe CS− (M = 99%, SD = 2.7%), consistent with instructions (**Figure 2**). No significant differences were found between choices of shields #1 (M = 48%, SD = 6.6%) and #2 (M = 52%, SD = 6.8%) to the Unavoidable CS+ threat, highlighting variable responding as participants tried and failed to prevent US delivery. No significant differences were found among RTs when choosing to avoid the Avoidable CS+ threat (M = 576 ms, SD = 122 ms) or the Unavoidable CS+ threat (M = 562 ms, SD = 161 ms) or responding to the Safe CS− (#3 M = 563 ms, SD = 195 ms). Significant RT differences were not expected given the lengthy 12 s threat period that preceded choices.

### Ratings

Ratings of negative valence, fear and US expectancy provided clear evidence of differential threat conditioning, CS+ modulation by successful avoidance, and extinction learning (**Figure 3**; Supplemental Table 1). Following pretesting and extinction, all ratings in each category were low and no significant differences among CSs were present. Threat conditioning produced significantly higher CS+ ratings in each category relative to the Safe CS−, showing both CS+s functioned as threats. Importantly, no significant differences were observed between CS+s, indicating similar threat values. Ratings in each category for CS+s presented during the delayed avoidance task were significantly higher than the CS−, again demonstrating CS+s acted as threats. However, ratings in each category for the Avoidable CS+ were significantly lower compared to the Unavoidable CS+ and significantly higher compared to the CS−, demonstrating that avoidance significantly reduced ratings of valence, fear and US expectancy, but not to CS− levels.

FIGURE 2 | Response accuracy during avoidance task. The plot shows the distribution of choices among available responses (buttons 1, 2, and 3) to each CS during neuroimaging. Buttons 1 and 2 were described as "shields" that could help avoid alien attacks. The plot shows subjects consistently (and correctly) chose #3 to the CS− and correctly chose #1 to the Avoidable CS+, which prevented US delivery. In contrast, choices were distributed between #1 and #2 to the Unavoidable CS+ as subjects tried and failed to avoid the aversive US. (Brackets highlight significant differences, *p* < 0.05 corrected. Bars represent 95% confidence intervals).

indicative of successful differential threat conditioning. For the delayed avoidance task, CS+s were rated significantly higher than the CS− and the Avoidable CS+ threat was rated significantly lower compared to the Unavoidable CS+, demonstrating that successful avoidance reduced threat appraisal. (Brackets highlight significant differences, *p* < 0.05 corrected. Bars represent 95% confidence intervals. See Supplemental Table 1 for details).

### Neuroimaging

### Delayed Avoidance Task-related Activation

Significantly greater activation to the Unavoidable CS+ threat relative to the Safe CS− was observed in the adACC and dmPFC (**Figure 4A**; **Table 1**), but not for the Avoidable CS+ threat. Similarly, adACC and dmPFC activation was significantly greater to the Unavoidable CS+ threat relative to the Avoidable CS+ threat (**Figure 4B**; **Table 1**; see **Figure 5** for individual subject contrast values). Plots of contrast values for the session and early and late phases reveal that regional activation was sustained and did not decline during the session. Activation to the Unavoidable CS+ threat also did not decline within the imaging session, highlighting that the Unavoidable CS+ remained a threat and the US remained aversive. Consequently, the reduced activation observed to the Avoidable CS+ cannot be attributed to time or US habituation. Finally, the greater activation to the Unavoidable CS+ threat suggested by results obtained with a 12 s boxcar regressor was verified through FIR time course validation (**Figure 6**). Plots for adACC, dmPFC, and APFC reveal Unavoidable CS+ activation was sustained during the 12 s threat period while Avoidable CS+ and CS− activation showed markedly similar declines during the 12 s threat period.

### Extinction Testing-related Activation

Greater activation to the Unavoidable CS+ threat relative to the Avoidable CS+ threat across the session was restricted to the dmPFC (**Figure 7**; see **Figure 5** for individual subject contrast values). However, analyses of within session changes in activation revealed there was significantly greater activation to the Unavoidable CS+ threat relative to the Avoidable CS+ threat during the early phase (first half) of the session in pregenual anterior cingulate (pgACC) and ventromedial prefrontal cortex (vmPFC). These within session changes are consistent with changes in threat appraisal that would be predicted to occur under extinction when CSs are presented without the US.

### Brain-behavior Relations

Grouped data showed evidence of discounting of losses with increased delay (**Figure 8A**). A regression analysis constrained to regions showing activation for the Unavoidable CS+ > CS− contrast was used to examine how individual differences in discounting, expressed as AUC, modulated activation. Regional activation and discounting were negatively correlated in dmPFC and bilateral adACC (**Figure 8B**; **Table 1**). Therefore, individuals with a lower AUC and who were more avoidant of immediate losses displayed increased activation to threat in adACC and dmPFC.

### Discussion

Using a within-subjects design and fMRI, we examined the effects of avoidance and extinction on threat appraisal and regional activation. Major findings were (a) an Avoidable CS+ threat relative to the Unavoidable CS+ threat was associated with reductions in ratings of negative valence, fear, and US expectancy and reduced regional activation and (b) individuals who exhibited greater aversive discounting and were more avoidant of immediate losses displayed greater activation to an Unavoidable CS+ threat. Moreover, Unavoidable CS+ activation was sustained or increased during CS presentation and activation was sustained throughout the avoidance task but declined during extinction. These findings suggest adACC/dmPFC supports flexible threat appraisals through sensitivity to avoidance based reductions in the local probability of US delivery. They also bridge views that adACC/dmPFC plays a central role in regulating threat appraisal and fear expression (Etkin et al., 2011; Shackman et al., 2011; Kalisch and Gerlicher, 2014) with views that regional responses reflect an extended choice-outcome history with response dependent positive and negative outcomes (Rushworth et al., 2007).

The differential adACC/dmPFC responses observed to Avoidable/Unavoidable CS+s identify characteristics/variables associated with threatening situations that control regional activation. In particular, our findings identify successful avoidance and associated reductions in US delivery as an important variable mediating threat appraisal and fear expression. The differences in activation observed between the Avoidable CS+ and Unavoidable CS+ parallel results of fear generalization studies that report a reduction in regional activation along the stimulus continuum from CS+ to CS− (Lissek et al., 2014), the effects of controllability of immediate and proximal aversive events and reductions in the negative impact of aversive events (Maier, 2015), studies on anticipatory anxiety showing regional activation during anticipation of phobia-relevant stimuli (Straube et al., 2007) and studies showing how reappraisal of anticipated threat recruits medial and lateral prefrontal regions and reduces anxiety (Yoshimura et al., 2014). Consistent with studies on extinction learning (Phelps et al., 2004; Quirk and Mueller, 2008), we also observed significantly greater pgACC and vmPFC activation to the Unavoidable CS+ threat relative to the Avoidable CS+ threat during the early phase of extinction. During extinction, participants learned an additional CS+ inhibitory association through US omission. The reduced pgACC and vmPFC activation observed to the Avoidable CS+ suggests the inhibitory CS+ > noUS association had been acquired through avoidance before extinction testing, which corresponds with the consequences of avoidance discussed in the Introduction.


These results also contribute to translational research on anxiety pathology. Our approach highlights active avoidance as a potentially useful model for elucidating the brain mechanisms supporting the dynamic relationship between threat appraisal, fear expression and response outcomes that alter threatening situations. Using a novel avoidance paradigm which delayed avoidance responding following CS presentation, we showed avoidance success modulated adACC/dmPFC activation along with ratings of negative valence, fear and US expectancies. Previous investigations on anticipation of aversion have also reported adACC activation along with results highlighting phasic and sustained activation patterns in different brain regions (e.g., Grupe et al., 2013). The reduction in activation we observed to the Avoidable CS+ threat seems quite reasonable in light of the effectiveness of avoidance coping in anxiety disorders. One might even speculate that the decreasing activation we observed during the threat period to the Avoidable CS+ and Safe CS− reflects an active dampening process. Thus, adACC/dmPFC dysfunction in anxiety may manifest as insensitivity to response produced local changes in US probability, an inability to accurately associate long term changes in US probability with a CS+ or an inability to engage an active dampening process to avoidable and non-threatening stimuli.

We found support for aversive discounting as an individual difference variable that may contribute to research on threat and anxiety pathology. Clinical applications of reward discounting have advanced our understanding of dysfunction in various clinical populations, especially in substance abuse (Bickel et al., 2012; for a meta-analysis see MacKillop et al., 2011). Evidence from human neuroimaging research on discounting also suggests different brain mechanisms are involved at different temporal delays (McClure et al., 2004; Ballard and Knutson, 2009). We found individuals who exhibited steeper loss discounting, that is, subjects who were more avoidant of immediate losses, also displayed greater activation to an Unavoidable CS+ threat. These findings support and extend a growing literature investigating relations between aversive discounting, threat appraisal-reactivity and anxiety pathology (Rounds et al., 2007; Salters-Pedneault and Diller, 2013; Tanaka et al., 2014).

The present investigation has potential limitations and the findings raise empirical questions that should be addressed in future studies. First, the CS+ differences observed as a function of successful and unsuccessful avoidance might be enhanced with an aversive US, such as electric shock. Despite the practical and ethical barriers that exist in its application with vulnerable populations such as children, adolescents and those with anxiety disorders (e.g., Britton et al., 2011), it would be salutary to replicate the present findings with a shock US. Second, the inclusion of other independent physiological measures, such as skin-conductance, pupil dilation or fear-potentiated startle responses, would supplement the existing measures of fear conditioning and avoidance. Third, the present paradigm has utility as a translational model of putative neurobehavioral differences in avoidance and threat appraisal in those with and without an anxiety disorder. Finally, an important area for future investigations will be to use parametric designs that manipulate US probability and delays to examine the effects on regional activation and approach-avoidance decision making.

FIGURE 7 | Differences in activation between the Unavoidable CS+ and Avoidable CS+ during extinction. Plots show contrast values for the session and early (first half) and late (last half) phases. Values highlighted within yellow boxes correspond to differences appearing in activation maps. The top plot highlights a significant difference in activation in dmPFC for the session. The remaining two plots highlight significant differences in activation in pregenual anterior cingulate (pgACC) and ventromedial prefrontal cortex (vmPFC) for the early phase. (Bars reflect 95% confidence intervals).

### Conclusions

Altering threat appraisals and associated negative emotional reactions following actions that change situations from threats to non-threats is important for adaptive approach-avoidance decision-making and emotional health. This investigation employed fMRI and healthy adults to examine the effects of avoidance, which is prominent in anxiety, and extinction on threat appraisal and adACC/dmPFC regional activation. Findings were consistent with and extend a number of contemporary theories of adACC/dmPFC function. We concluded that differences in CS+ activation associated with successful avoidance reflect a regional sensitivity to avoidance-based reductions in the local US probability. We propose that adACC/dmPFC dysfunction in anxiety may manifest as insensitivity to response-produced local changes in US probability, an inability to accurately associate long term changes in US probability with a CS+ or an inability to engage an active dampening process to avoidable and nonthreatening stimuli. Another finding with translational value was that individuals exhibiting greater aversive discounting (more avoidant of immediate loss compared to a larger delayed loss) also displayed greater activation to the Unavoidable CS+. We concluded that aversive discounting may be a candidate individual difference variable that modulates regional activation to CS+ threat in anxiety pathology. Future investigations are necessary to further elucidate relations between adACC/dmPFC sensitivity and variables, such as delay and US probability, which also influence approach-avoidance decision-making.

### Acknowledgments

We express our gratitude to Texas Tech Neuroimaging Center for their invaluable assistance. This work was supported by Beatrice H. Barrett Research Endowment to the University of North Texas.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnbeh.2015.00142/abstract

### References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Schlund, Brewer, Richman, Magee and Dymond. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Avoidance prone individuals self reporting behavioral inhibition exhibit facilitated acquisition and altered extinction of conditioned eyeblinks with partial reinforcement schedules

### **Michael Todd Allen<sup>1,2</sup>\*, Catherine E. Myers<sup>2,3</sup> and Richard J. Servatius<sup>2,3</sup>**

<sup>1</sup> School of Psychological Sciences, University of Northern Colorado, Greeley, CO, USA

<sup>2</sup> Stress and Motivated Behavior Institute, NJMS-UMDNJ, Newark, NJ, USA

<sup>3</sup> Neurobehavioral Research Lab, DVA Medical Center, NJHCS, East Orange, NJ, USA

### **Edited by:**

Djoher Nora Abrous, Institut des Neurosciences de Bordeaux, France

### **Reviewed by:**

David Belin, University of Cambridge, UK

Véronique Deroche-Gamonet, INSERM, France

### **\*Correspondence:**

Michael Todd Allen, School of Psychological Sciences, University of Northern Colorado, 501 20th St, Greeley, CO 80634, USA e-mail: michael.allen@unco.edu

Avoidance in the face of novel situations or uncertainty is a prime feature of behavioral inhibition, which has been put forth as a risk factor for the development of anxiety disorders. Recent work has found that behaviorally inhibited (BI) individuals acquire conditioned eyeblinks faster than non-inhibited (NI) individuals in omission and yoked paradigms in which the predictive relationship between the conditioned stimulus (CS) and unconditional stimulus (US) is less than optimal as compared to standard training with CS-US paired trials (Holloway et al., 2014). In the current study, we explicitly tested partial reinforcement schedules in which half the trials were CS alone or US alone trials in addition to the standard CS-US paired trials. One hundred and forty nine college-aged undergraduates participated in the study. All participants completed the Adult Measure of Behavioral Inhibition (AMBI), which was used to group participants as BI and NI. Eyeblink conditioning consisted of three US alone trials, 60 acquisition trials, and 20 CS-alone extinction trials presented in one session. Conditioning stimuli were a 500-ms tone CS and a 50-ms air puff US. Behaviorally inhibited individuals receiving 50% partial reinforcement with CS alone or US alone trials produced facilitated acquisition as compared to NI individuals. A partial reinforcement extinction effect (PREE) was evident with CS alone trials in BI but not NI individuals. These current findings indicate that avoidance prone individuals self-reporting behavioral inhibition overlearn an association and are slow to extinguish conditioned responses (CRs) when there is some level of uncertainty between paired trials and CS or US alone presentations.

**Keywords: behavioral inhibition, partial reinforcement, eyeblink conditioning, associative learning, anxiety disorders**

### **INTRODUCTION**

Anxiety disorders are the most common form of mental illness. However, the development of anxiety disorders is unclear. Two individuals can experience the same event and yet one develops an anxiety disorder while the other does not. Many factors including genetics, gender, personality and prior experiences are hypothesized to play a role in the development of anxiety disorders (Mineka and Zinbarg, 2006). Recent work has focused on a learning diathesis model that involves differences in learning based upon specific temperament factors such as behavioral inhibition or BI.

Behavioral inhibition has been put forth as a possible risk factor for the development of anxiety disorders (Fox et al., 2005). Behavioral inhibition is defined as a temperamental tendency to withdraw from or avoid novel social and non-social situations (Kagan et al., 1987; Morgan, 2006). Another feature of BI is a sensitivity to forming associations between stimuli. Recent studies examining classical conditioning with individuals expressing behavioral inhibition have found enhanced acquisition of conditioned eyeblinks (Myers et al., 2011; Caulfield et al., 2013; Holloway et al., 2014). Classical eyeblink conditioning involves the pairing of a conditioned stimulus (CS) tone with an unconditional stimulus (US) corneal air puff, which results in learning a conditioned response (CR) eyeblink to the previously neutral CS. There is also a long history indicating that classical conditioning of the eyeblink or eyelid response is affected by anxiety (Hilgard et al., 1951; Spence and Taylor, 1951; Taylor, 1951; Spence and Farber, 1953; Baron and Connor, 1960; King et al., 1961; Beck, 1963; Spence et al., 1964; Spence and Spence, 1966). Consistent with recent findings with BI, these studies revealed enhanced CR acquisition including greater asymptotic performance and a greater number of CRs overall compared to individuals reporting low anxiety.

In addition to a long history of behavioral work in humans and animals, the neural substrates of classical eyeblink conditioning are well understood. Cerebellar and brainstem circuits are known to underlie acquisition, retention, and extinction of eyeblink conditioning across several mammalian species including rabbits, rodents, and humans (for review see Thompson and Steinmetz, 2009). The present study utilized delay conditioning in which the CS and US partially overlap and co-terminate. This form of eyeblink conditioning is known to require the cerebellum, but not other brain structures such as the hippocampus (Schmaltz and Theios, 1972; Gabrieli et al., 1995) or cerebral cortex (Mauk and Thompson, 1987). However, strong evidence also exists for the associative learning in the cerebellum during delay conditioning to be modified by septo-hippocampal (Berry and Thompson, 1979; Allen et al., 2002) and amygdala inputs (Whalen and Kapp, 1991; Weisz et al., 1992; Blankenship et al., 2005). Stein et al. (2007) found that anxiety prone subjects exhibited enhanced amygdala activity during the processing of emotional stimuli. If anxiety vulnerable individuals have greater amygdala activity than non-vulnerable individuals in response to the mildly aversive corneal air puff, this activity could facilitate associative learning in the cerebellum for eyeblink conditioning. These limbic systems may be one mechanism through which temperamental factors such as behavioral inhibition facilitate acquisition of classically conditioned eyeblinks.

Another possible explanation for enhanced acquisition of eyeblink CRs in behaviorally inhibited (BI) individuals is an avoidance of the US air puff by eye closure in response to the CS tone. Holloway et al. (2014) tested the possibility of enhanced avoidance learning in BI individuals using a delay, omission, or yoked conditioning schedule. Omission training was identical to delay, except that the performance of a CR by the participant resulted in omission of the US on that trial. Avoidance learning in eyeblink conditioning has been defined as the degree to which learning during omission training exceeds that of the yoked controls (Logan, 1951; Gormezano et al., 1962; Moore and Gormezano, 1963). Holloway et al. (2014) failed to observe avoidance learning in BI individuals, but did observe enhanced acquisition relative to non-inhibited (NI) individuals. The greater facilitation of learning in the omission and yoked groups was evident in situations of partial reinforcement due to the omission of the US on some trials. These findings were interpreted as an increased sensitivity to uncertainty in BI individuals in the case of partial reinforcement.

In addition to avoidance, BI also includes social reticence and enhanced reactivity to novelty, threat, and uncertainty (Hirshfeld et al., 1992; Schwartz et al., 2003a,b). Grupe and Nitschke (2013) defined anxiety as "anticipatory affective, cognitive, and behavioral changes in response to uncertainty about a potential future threat" (p.489). Anxiety disorders may come about due to how an individual learns to respond to environmental cues, especially when there is some uncertainty about relationships between stimuli. Examples of uncertainty in classical conditioning would include schedules of partial reinforcement.

Partial reinforcement for classical eyeblink conditioning has been defined by Leonard and Theios (1967) based on the US air puff being the reinforcing event. Therefore, partial reinforcement in eyeblink conditioning involves CS tone alone trials that omit the US air puff. Various manipulations of schedules of partial reinforcement involving CS alone and CS-US paired trials in human eyeblink conditioning have produced three major findings. These results include a significant decrement in acquisition in the partial reinforcement group as compared to the continuous reinforcement group (Reynolds, 1958; Ross, 1959; Hartman and Grant, 1960; Ross and Spence, 1960; Runquist, 1963; Perry and Moore, 1965), a partial reinforcement extinction effect (PREE; Longenecker et al., 1952; Perry and Moore, 1965; Newman, 1967; Leonard, 1975), and a null effect of no significant differences in acquisition between partial and continuous reinforcement schedules (Humphreys, 1939; Grant et al., 1950; Hake and Grant, 1951; Grant and Schipper, 1952; Moore and Gormezano, 1963; Price et al., 1965; Foth and Runquist, 1970).

In the current study, we investigated the effects of BI on two forms of partial reinforcement. Based on the levels of responding in omission and yoked groups in the Holloway et al. (2014) study, we chose a 50% partial reinforcement schedule with CS alone trials intermixed with CS-US paired trials. This schedule is also the most common partial reinforcement schedule from the human eyeblink conditioning literature. In addition, we included a 50% partial US group to test the effects of US alone rather than CS alone trials inter-mixed with CS-US paired trials. The inclusion of un-signaled air puff USs would be a different type of unexpected trial type.

Based on the omission and yoked results of Holloway et al. (2014), we hypothesized that 50% partial reinforcement (either with CS alone or US alone trials) would result in reduced conditioned responding as compared to 100% paired trials. In addition, we hypothesized there would be enhanced acquisition of CRs in BI individuals as compared to NI individuals. We also hypothesized that partial reinforcement with CS alone trials would result in a PREE based on previous eyeblink conditioning experiments (Longenecker et al., 1952; Perry and Moore, 1965; Newman, 1967; Leonard, 1975).

### **MATERIALS AND METHODS**

### **PARTICIPANTS**

One hundred forty nine college-aged students were recruited from the University of Northern Colorado, School of Psychology. Students voluntarily participated to receive class credit or extra credit for psychology classes. Ninety eight females and 51 males with mean age of 19.9 (SD = 3.0, range 18–38) and mean education of 13.5 years (SD = 1.4, range 11–19) were included in the study. Informed consent was obtained in accordance with procedures approved by the University of Northern Colorado Institutional Review Board adhering to the federal regulations on research involving human subjects.

### **MATERIALS AND APPARATUS**

The eyeblink conditioning apparatus and procedures were similar to those previously described (Beck et al., 2008). The tone stimulus was produced with Coulbourn Instruments (Allentown, PA, USA) signal generators and passed to a David Clark aviation headset (Model H10–50, Worcester, MA, USA). Sound levels were verified with a Realistic sound meter (RadioShack, Fort Worth, TX, USA). The headset was fitted with a boom placed 1 cm from the cornea that delivered a 5 psi air puff US via Silastic tubing connected to a regulator and released by a computer controlled solenoid valve (Clipper Instruments, Cincinnati, OH). To record the eyelid electromyographic (EMG) signal, pediatric silver/silver chloride EMG electrodes with solid gel were placed above and below the left eye, with the ground electrode placed on the neck. The EMG signal was passed to a medically isolated physiological amplifier (UFI, Morro Bay, CA, USA), low-pass filtered and amplified 10 K. The EMG signal was sampled at 500 Hz by an A/D board (PCI 6025E, National Instruments, Austin, TX, USA) connected to an IBM-compatible computer. Software control of stimulus generation was performed by LabView (National Instruments).

### **PSYCHOMETRIC SCALES**

Study participants completed the Adult Measure of Behavioral Inhibition or AMBI (Gladstone and Parker, 2005). The AMBI is a 16-item self-report inventory that assesses current tendency to respond to new stimuli with inhibition and/or avoidance, and has also been shown to be a measure of anxiety proneness.

### **BI GROUPS**

Participants were divided into BI and NI groups based on a median split of the AMBI score. This methodology was based on previous eyeblink conditioning studies with BI (Caulfield et al., 2013; Holloway et al., 2014) and allowed for equal sample sizes in our BI and NI groups.
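The grouping step can be illustrated with a minimal sketch. It assumes a dictionary of AMBI totals keyed by hypothetical participant IDs, and it assumes that scores tied with the median are assigned to the NI group, since tie handling is not described here. The scores below are invented for illustration and are not study data. The Results report a separate median for each training protocol, so a function like this would presumably be applied within each protocol.

```python
from statistics import median

def median_split(scores):
    """Split participants into BI (above the median) and NI (at or below it).

    Assumption: ties at the median are assigned to NI; the text does not
    describe tie handling.
    """
    cut = median(scores.values())
    groups = {pid: ("BI" if s > cut else "NI") for pid, s in scores.items()}
    return groups, cut

# Hypothetical AMBI totals, invented for illustration (not study data).
ambi = {"p1": 8, "p2": 11, "p3": 12, "p4": 15, "p5": 17, "p6": 9}
groups, cut = median_split(ambi)
print(cut, groups)
```

With an even number of participants and no tied scores at the cut point, the split yields equal group sizes, which is the property the authors cite as a reason for the method.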

### **CONDITIONING SESSION**

Upon arrival, participants provided informed consent and were instructed that the study was going to evaluate responses to tones and air puffs to the eye, that they were to watch a silent video of their choice (e.g., a nature video with sound muted), and that they were to remain awake during the testing session. Participants were then fitted with EMG electrodes and headphones, EMG signal quality was verified, and the conditioning program was started. The program began with three US-alone (50 ms, 5.5 psi air puff) exposures to assess UR quality and magnitude for all participants. The acquisition session began immediately following the US exposures. Delay training consisted of 60 acquisition trials and 20 CS-alone extinction trials. The inter-trial interval varied pseudorandomly within 30 ± 5 s for all contingencies. Participants received either 100% CS-US paired trials or a 50% partial reinforcement schedule for acquisition training. Paired CS-US trials included a 500 ms/1200 Hz pure tone CS overlapping and co-terminating with the US air puff. Partial reinforcement schedules included 30 CS-US paired trials inter-mixed with 30 pseudorandom presentations of either a CS alone or US alone trial in which no more than three of the same trial types were presented consecutively.
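One way to build a trial order that satisfies the constraints described above is sketched below. The study's stimulus control ran in LabView, and its actual randomization routine is not described, so this is only an illustration. The function names, the reshuffle-until-valid strategy, and the uniform draw of inter-trial intervals between 25 and 35 s are all assumptions.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def make_schedule(alone_type="CS", n_paired=30, n_alone=30, max_run=3, seed=0):
    """Return (trial labels, inter-trial intervals in seconds) for acquisition.

    alone_type is "CS" or "US" for the two partial reinforcement groups.
    Assumption: intervals are drawn uniformly from 25-35 s (30 +/- 5 s).
    """
    rng = random.Random(seed)
    trials = ["CS-US"] * n_paired + [alone_type] * n_alone
    while True:
        # Rejection sampling: reshuffle until no trial type repeats > max_run times.
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            break
    itis = [rng.uniform(25.0, 35.0) for _ in trials]
    return trials, itis

trials, itis = make_schedule(alone_type="US", seed=1)
print(trials[:10])
```

Reshuffling until the run constraint holds is simple and keeps the trial counts exact, at the cost of discarding many candidate orders before one is accepted.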

### **SIGNAL PROCESSING AND DATA REDUCTION**

Electromyography data were evaluated on a trial-by-trial basis for all participants. Processing of eyeblink responses followed methods previously reported (Beck et al., 2008). To determine the occurrence of an eyeblink, EMG activity was first low-pass filtered with a Lowess filter (Stat-Sci, Tacoma, WA, USA) using a time constant of 0.025, and a smoothing interval of 5. With these filter values, activity greater than 0.2 (unitless) corresponded to an eyeblink response. For a response to be counted, smoothed EMG activity in a 500-ms window beginning at the onset of the CS had to exceed the mean activity, plus four times the standard deviation, of the activity in a 125-ms comparator window that immediately preceded the CS window. A CR was scored when an eyeblink occurred at least 80 ms after CS onset but before US onset. A UR was scored when an eyeblink was produced 0–100 ms after US onset. Those sessions with excessive signal noise (loss of more than 10% of trials), equipment malfunction, or incomplete session data (e.g., falling asleep), were discarded and not used for further analysis. Inspection of the eyeblink conditioning data therefore resulted in rejection of data from 41 participants. The final groups that were analyzed were delay (*n* = 35), 50% partial CS (*n* = 43), and 50% partial US (*n* = 30) for a total of 108.
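The baseline-relative scoring rule can be sketched as follows. This is not the authors' processing code. It assumes the EMG trace has already been smoothed, it omits both the Lowess filtering step and the absolute 0.2 amplitude criterion, and it uses the population standard deviation of the comparator window, which is an assumption. The US onset of 450 ms after CS onset follows from a 500-ms CS co-terminating with a 50-ms US. The function name and the synthetic traces are hypothetical.

```python
from statistics import mean, pstdev

FS_HZ = 500                    # sampling rate stated in the text
MS_PER_SAMPLE = 1000 / FS_HZ   # 2 ms per sample

def score_cr(emg, cs_onset_idx, us_onset_ms=450, min_latency_ms=80,
             baseline_ms=125, k_sd=4.0):
    """Return True if smoothed EMG exceeds baseline mean + k_sd * SD
    at least min_latency_ms after CS onset and before US onset."""
    n_base = int(baseline_ms / MS_PER_SAMPLE)
    if cs_onset_idx < n_base:
        raise ValueError("not enough pre-CS samples for the comparator window")
    baseline = emg[cs_onset_idx - n_base:cs_onset_idx]
    threshold = mean(baseline) + k_sd * pstdev(baseline)
    start = cs_onset_idx + int(min_latency_ms / MS_PER_SAMPLE)
    stop = cs_onset_idx + int(us_onset_ms / MS_PER_SAMPLE)
    return any(v > threshold for v in emg[start:stop])

# Synthetic traces (invented): low-level noise, CS onset at sample 100.
quiet = [0.01 if i % 2 else 0.0 for i in range(400)]
blink = list(quiet)
for i in range(250, 280):      # deflection starting 300 ms after CS onset
    blink[i] = 0.5
early = list(quiet)
for i in range(110, 120):      # deflection 20-40 ms after CS onset, too early
    early[i] = 0.5

print(score_cr(blink, 100), score_cr(quiet, 100), score_cr(early, 100))
```

The early deflection is not scored because it falls before the 80-ms latency criterion, which separates reflexive responses to the tone from conditioned responses.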

### **DATA ANALYSIS**

To examine the main effects and interactions of anxiety vulnerability and CR acquisition, the 80-trial conditioning session was divided into 10-trial blocks and evaluated independently for 60 acquisition trials and 20 extinction trials. Between group measures included Group (100% CS-US paired trials, 50% CS partial reinforcement, and 50% US partial reinforcement), and BI (BI vs. NI), with Block as a within subject measure. Significant effects from the ANOVAs were followed up with planned F-tests. The planned comparisons included comparisons of BI and NI individuals within each conditioning protocol. In addition, planned comparisons were done between the partial reinforcement schedules with the standard 100% CS-US paired trials condition. The level of significance was set at *p* < 0.05.
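The blocking step that produces the within-subject Block factor can be sketched as below. It assumes one CR flag per trial and treats trials marked `None` as not scorable, which is how US alone trials might be handled since no CS is present. That handling is an assumption, as the text does not say how such trials entered the block scores. The function name and example flags are hypothetical.

```python
def percent_cr_by_block(cr_flags, block_size=10):
    """Percent of scorable trials with a CR in each consecutive block.

    cr_flags holds True/False per trial, or None for trials that cannot
    be scored (an assumption used here for US alone trials).
    """
    percents = []
    for i in range(0, len(cr_flags), block_size):
        block = [f for f in cr_flags[i:i + block_size] if f is not None]
        percents.append(100.0 * sum(block) / len(block) if block else float("nan"))
    return percents

# Invented 80-trial session: 60 acquisition trials then 20 extinction trials.
flags = [False] * 10 + [True] * 5 + [False] * 5 + [True] * 60
blocks = percent_cr_by_block(flags)
acquisition, extinction = blocks[:6], blocks[6:]
print(acquisition, extinction)
```

An 80-trial session yields eight blocks, six for acquisition and two for extinction, matching the six-level and two-level Block factors in the analyses reported in the Results.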

### **RESULTS**

### **PSYCHOMETRIC DATA**

Psychometric and demographic data for BI and NI groups for the 100% paired trial, 50% CS partial reinforcement and 50% US partial reinforcement groups are summarized in **Table 1**. There were no significant gender differences between groups on any of the measures (all *p*'s > 0.13). The AMBI score used for the median split was 11.5 for the 100% paired trials group, 14.5 for the 50% US partial reinforcement group, and 12.5 for the 50% CS Partial reinforcement group.

### **ACQUISITION**

Participants acquired CRs across the conditioning session in all three training protocols as shown in **Figure 1**. This was confirmed with a 3 (Group) × 2 (BI) × 6 (Block) repeated measures ANOVA which revealed a main effect of Block (*F*(5,510) = 29.069, *p* < 0.001). There were significant differences in CR acquisition between the three training protocols. A 3 (Group) × 2 (BI) × 6 (Block) repeated measures ANOVA revealed a main effect of group (*F*(1,102) = 3.226, *p* < 0.05) for conditioned eyeblink response acquisition. Further analysis revealed that the conditioned responding in the 50% CS partial reinforcement protocol was significantly lower than in the 100% paired trial protocol (*F*(1,74) = 6.01, *p* < 0.02). There was no significant group difference between the 100% paired trial protocol and the 50% US partial reinforcement protocol (*p* > 0.70). All interactions for these pairwise comparisons between training protocols were nonsignificant (*p*'s > 0.25).

As shown in **Figure 2**, individuals self-reporting high AMBI scores exhibited more CRs across the six acquisition blocks than did those self-reporting low AMBI scores. This was confirmed by a 3 (Group) × 2 (BI) × 6 (Block) repeated measures ANOVA which revealed a significant main effect of BI (*F*(1,102) = 10.596, *p* < 0.005). None of the interactions between these three variables were significant (*p*'s > 0.425).

The BI effect was further analyzed for each of the individual training protocols separately. Training with the 100% paired trial protocol did not produce a significant difference in conditioned eyeblinks between the high and low AMBI groups as shown in **Figure 3**. Training with the 50% CS partial reinforcement protocol (*F*(1,41) = 6.469, *p* < 0.05) as well as the 50% US partial reinforcement protocol (*F*(1,28) = 4.358, *p* < 0.05) produced significantly more CRs in the high AMBI group as compared to the low AMBI group as shown in **Figures 4**, **5**.

### **EXTINCTION**

Individuals in all three training protocols exhibited extinction defined by a decrease in conditioned responding across the 20 CS alone trials as evident in **Figure 1**. This observation was confirmed by a 3 (Group) × 2 (BI) × 2 (Block) repeated measures ANOVA which revealed a main effect of Block (*F*(1,102) = 30.242, *p* < 0.001). Behaviorally inhibited individuals also exhibited more CRs across CS alone trials (i.e., less extinction) than NI individuals as shown in **Figure 2**. This finding was confirmed by a main effect of BI (*F*(1,102) = 6.263, *p* < 0.05). There was a non-significant trend towards a Group by BI interaction (*F*(2,102) = 3.722, *p* = 0.116).

However, due to significant differences in levels of asymptotic performance across the three training protocols for the high and low AMBI groups, it was necessary to evaluate extinction with respect to the asymptotic performance at the end of acquisition training. The conditioned responding for the last block of acquisition training was used as a covariate for further ANOVAs. A 3 (Group) × 2 (BI) ANOVA of these data revealed a significant interaction between Group and BI in conditioned responding during extinction between the three training groups (*F*(1,101) = 4.25, *p* < 0.05). Based on this interaction, the individual training protocols were evaluated. A significant main effect of BI was evident in the 50% CS alone training protocol (*F*(1,40) = 5.74, *p* < 0.05), but not in the 100% paired trial protocol or the 50% US partial reinforcement protocol (all *p*'s > 0.70).


To analyze for a PREE, pairwise comparisons of the partial reinforcement schedules to the standard 100% CS-US paired trials were conducted. Comparisons of the 50% CS partial reinforcement protocol and the 100% CS-US paired trial protocol revealed a main effect of BI (*F*(1,73) = 5.198, *p* < 0.05) such that BI individuals exhibited more CRs than NI individuals. There was also a significant interaction of Group × BI when the 50% CS partial reinforcement protocol and the 100% CS-US paired trial protocol were compared (*F*(1,73) = 6.272, *p* < 0.05) such that there were more CRs in the behavioral inhibition group in the 50% CS partial reinforcement condition. There were no significant differences in extinction between the 50% US partial reinforcement schedule and the standard 100% CS-US paired trial protocol.

### **DISCUSSION**


Prior work by Holloway et al. (2014) found enhanced eyeblink conditioning in individuals self-reporting behavioral inhibition in learning situations such as omission and yoked training in which the pairing of conditioning stimuli was less than optimal. The omission of the US on trials in which a CR was exhibited to the CS resulted in various patterns of partial reinforcement. As conditioning progressed and CRs were acquired, the pairing of the CS and US was reduced. This progressive omission of the US increased participant uncertainty about stimulus pairings. Behaviorally inhibited individuals appeared to be overly sensitive to partial reinforcement schedules in which there is some uncertainty about stimulus pairings and presentations as evidenced by increased conditioned responding as compared to NI individuals.

### **ACQUISITION EFFECTS**

Partial reinforcement schedules with either 50% CS alone or US alone trials pseudo-randomly inter-mixed with CS-US paired trials produced enhanced acquisition in BI individuals as compared to NI individuals. Our finding of a magnified BI effect in the partial reinforcement schedules with either CS or US alone trials matches the findings of Holloway et al. (2014) with omission and yoked protocols. In addition, our findings with 100% paired trials were similar to those of Holloway et al. (2014) in that while there was a pattern of enhanced acquisition in BI individuals in the 100% paired trials, this difference was not significant. It appears both in our present work with partial reinforcement schedules and prior work with omission and yoked controls that the effects of BI are most evident in non-optimal conditions in which CS-US pairings are less than 100%.

One feature of BI is an enhanced reactivity to novelty, threat, and uncertainty (Hirshfeld et al., 1992; Schwartz et al., 2003a,b). The current partial reinforcement protocols with either CS or US alone presentations pseudo-randomly intermixed with CS-US paired trials produced uncertainty for both stimulus presentation and timing of CS-US paired trials for the participants. During the training session, it is not apparent to the participant when the next CS-US paired trial will occur. In the case of the CS alone partial reinforcement protocol, the next trial could either be a CS paired with a US and should be responded to or it could be a CS without an US and does not need to be responded to. In the case of the US alone partial reinforcement protocol, the next trial could either be a CS paired with a US which should be responded to or be an un-signaled US. This uncertainty was also found with the yoked group in Holloway et al. (2014) when the participants received what appeared to be a random arrangement of CS-US paired trials and CS alone trials based on the CR performance of their matched omission participant. The current findings support the further exploration of uncertainty as an important feature of enhanced learning in BI individuals.

In addition to our findings with BI, the schedules of partial reinforcement differed from the standard 100% CS-US paired training. Partial reinforcement training with CS alone trials was found to produce less conditioned responding than 100% paired CS-US training. This finding corresponds to prior human eyeblink conditioning studies with partial reinforcement with CS alone trials (Reynolds, 1958; Ross, 1959; Hartman and Grant, 1960; Ross and Spence, 1960; Runquist, 1963; Perry and Moore, 1965).

In contrast to the findings with CS alone trials, eyeblink conditioning with the 50% US partial reinforcement protocol did not differ from 100% paired CS-US trials. This finding was somewhat surprising in that only half of the trials were training trials (i.e., CS-US pairings). However, the presentation of a corneal air puff alone could be viewed as unexpected. The unexpected nature of these US alone trials could facilitate conditioning through several neural mechanisms involving attention.

Several theories have proposed the reinforcement system for different forms of motor learning in the cerebellum (including classical eyeblink conditioning) to be the climbing fiber system from the inferior olive (Albus, 1971; Eccles, 1977; Ito, 1982; Thompson, 1989; Swaim et al., 2011). The inferior olive climbing fiber system has been hypothesized as a teaching signal for the cerebellum. This cerebellar circuitry has been hypothesized by several theories and computational models to work in an error correction manner similar to the Rescorla-Wagner rule (e.g., Kenyon et al., 1998; Gluck et al., 2001). In these models, the error correction between the actual US and a prediction of the US (i.e., the CR) is instantiated as an inhibitory connection between the cerebellum and the inferior olive. The enhanced conditioning in the case of partial reinforcement training with US alone trials may be due to this circuit. Sears and Steinmetz (1991) found that inferior olive activity is inhibited on CR trials but is present on trials in which a CR does not occur. This pattern of US firing during US alone presentations intermixed with CS-US paired trials may produce the higher numbers of CRs in the partial reinforcement protocol with US alone trials. The random inferior olive activity could "spark" plasticity in the cerebellum leading to more CRs than would be expected with only 50% CS-US paired trials.
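As a rough illustration of the error-correction idea referred to above, the following sketch applies a Rescorla-Wagner-style update to a 100% paired schedule and to a 50% CS alone schedule. It is not the cerebellar model of Kenyon et al. (1998) or Gluck et al. (2001); the learning rate, the trial count, the strict alternation of trial types (the study used a pseudo-random order), and all function and variable names are assumptions made for illustration only.

```python
def rescorla_wagner(us_present_by_trial, learning_rate=0.1, us_magnitude=1.0):
    """Return the associative strength of the CS after each CS trial.

    Each entry of us_present_by_trial is True for a CS-US paired trial
    and False for a CS alone trial. The update is
    V <- V + learning_rate * (target - V), where target is us_magnitude
    on paired trials and 0 on CS alone trials.
    """
    strength = 0.0
    history = []
    for us_present in us_present_by_trial:
        target = us_magnitude if us_present else 0.0
        strength += learning_rate * (target - strength)  # error correction
        history.append(strength)
    return history


# Hypothetical schedules of 60 CS trials each (assumed, not from the study).
paired_100 = [True] * 60
cs_alone_50 = [True, False] * 30  # strict alternation for reproducibility

final_paired = rescorla_wagner(paired_100)[-1]
final_partial = rescorla_wagner(cs_alone_50)[-1]

print(f"100% paired: {final_paired:.3f}")
print(f"50% CS alone: {final_partial:.3f}")
```

Under these assumptions the CS alone schedule settles at a lower associative strength than the fully paired schedule, in line with the reduced conditioned responding reported for CS alone partial reinforcement. The sketch does not model US alone trials or any effect of BI.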

Another way in which attentional mechanisms may modulate the cerebellum is via theta activity from the septo-hippocampal system. Theta activity enhances eyeblink conditioning in rabbits (Berry and Seager, 2001) while disruption of the septo-hippocampal system via medial septal lesions or administration of cholinergic antagonists slows delay eyeblink conditioning (Berry and Thompson, 1979; Allen et al., 2002). Gray and McNaughton (2000) proposed that theta is also associated with anxiety in that the septo-hippocampal system responds to competing options or motivations (possibly due to uncertainty) by increasing vigilance. Theta activity, thus, may be a source of the enhanced acquisition observed in BI individuals in conditioning situations where there are less than optimal relationships between stimuli.

### **EXTINCTION EFFECTS**

In addition to findings of BI and partial reinforcement effects on acquisition of conditioned eyeblinks, the current study also revealed differences in extinction to CS alone presentations for BI individuals. Previous work with omission and yoked protocols (Holloway et al., 2014) did not produce any effects of BI on extinction even though about 50% of the trials were CS alone due to the omission of the US air puff on CR trials. In the present study, BI individuals exhibited a partial reinforcement extinction effect (i.e., PREE) following CS alone partial reinforcement such that they responded more during CS alone extinction trials as compared to individuals trained with 100% CS-US paired trials. However, NI individuals did not show PREE: i.e., those trained with CS alone partial reinforcement did not differ in extinction from those trained with 100% CS-US trials. A subset of prior classical conditioning studies with partial reinforcement with CS alone trials have reported a PREE (Longenecker et al., 1952; Perry and Moore, 1965; Newman, 1967; Leonard, 1975). Based on the current findings, the inconsistency in the past in obtaining a PREE in human eyeblink conditioning could be explained by temperament factors such as behavioral inhibition. It is of interest to note that while there is a large body of anxiety work and partial reinforcement studies with eyeblink conditioning from the 1950's and 1960's, the current study is the first to combine both elements.

Some aspects of the BI effect on extinction can be explained through the hippocampal modulation of eyeblink conditioning. Hippocampal lesions have been found to disrupt extinction of conditioned eyeblinks to tone alone training in rabbits (Schmaltz and Theios, 1972; Akase et al., 1989). Hippocampal theta activity also plays a role in the PREE. Gray (1972) found that theta is highest on non-reward trials and that medial septal lesions or electrical stimulation that block theta activity also disrupt the PREE in rat straight alley maze running for a water reward.

Additionally, Penick and Solomon (1991) found that the hippocampus is involved in encoding context. The hippocampal encoding of context may be responsible for the differences in extinction between 50% CS partial and the 100% CS-US paired protocols found with BI individuals. Spence et al. (1964) found an inverse relationship between the rate of extinction and recognition of changes in trial type between acquisition and extinction training. In the case of our CS alone partial reinforcement protocol, the extinction phase was similar to the acquisition phase in that both included CS alone presentations. This similarity in context may have contributed to the continued conditioned responding to the CS alone trials in high BI individuals. The PREE observed with CS alone training could be interpreted as being due to consistencies in context between acquisition and extinction training. Differences in extinction between BI and NI may be due to differential hippocampal activity based on continued vigilance to the CS alone presentations.

### **LIMITATIONS AND CONCLUSIONS**

The sample for the current study had a few limitations. First, the participants were undergraduates in psychology courses who voluntarily participated for research credit for coursework. While it is possible the participants had some preconceptions about the nature of the study, they were blind to the fact that they were going to do eyeblink conditioning and were also blind to the type of training protocol with which they would be presented.

Second, the sample included a majority of female participants (i.e., 98 females as compared to 51 males). While anxiety disorders are more prevalent in females, and females have also been reported as exhibiting facilitated eyeblink conditioning (Spence and Spence, 1966), the present study did not observe a gender effect for eyeblink acquisition which matches with other recent eyeblink conditioning studies concerning BI (Caulfield et al., 2013; Holloway et al., 2014). There were also no significant differences between males and females for any of the demographic measures. Third, the present study utilized a non-clinical population of college undergraduates who self-reported anxiety vulnerability on the AMBI scale. One unanswered question is whether the current findings would generalize to a post-traumatic stress disorder (PTSD) population or other anxiety disorder populations. Myers et al. (2011) found enhanced eyeblink conditioning in a delay paradigm with 100% paired trials among veterans self-reporting severe PTSD symptoms. It would be of interest to test the current findings of even greater enhancement of eyeblink conditioning in the partial reinforcement conditions in a population that has been clinically diagnosed with PTSD or some other anxiety disorder. The pattern of faster acquisition and slower extinction in partial reinforcement is similar to the symptoms of PTSD.

Our working hypothesis was that temperament factors like BI may alter associative learning, thus leading to increased risk of development of anxiety disorders when presented with aversive stimuli. The current findings with partial reinforcement protocols match previous findings with omission and yoked protocols (Holloway et al., 2014). Behaviorally inhibited individuals exhibited greater facilitation of eyeblink conditioning (i.e., associative learning) in partial reinforcement protocols than in standard 100% paired trials. Additionally, the partial reinforcement protocol with CS alone trials revealed a PREE in only the high BI individuals. The current study furthers our understanding of enhanced associative learning in individuals self-reporting behavioral inhibition. Overall, this work supports a growing literature in which enhanced associative learning, especially in the cases where there is some uncertainty, is an elemental component of anxiety disorders.

### **ACKNOWLEDGMENTS**

This work was supported by the University of Northern Colorado and the Stress and Motivated Behavior Institute.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

### *Received: 12 June 2014; accepted: 16 September 2014; published online: 06 October 2014*.

*Citation: Allen MT, Myers CE and Servatius RJ (2014) Avoidance prone individuals self reporting behavioral inhibition exhibit facilitated acquisition and altered extinction of conditioned eyeblinks with partial reinforcement schedules. Front. Behav. Neurosci. 8:347. doi: 10.3389/fnbeh.2014.00347*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*.

*Copyright © 2014 Allen, Myers and Servatius. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Generalization of socially transmitted and instructed avoidance

#### Gemma Cameron<sup>1</sup> , Michael W. Schlund<sup>2</sup> and Simon Dymond<sup>1</sup> \*

<sup>1</sup> Experimental Psychopathology Laboratory, Department of Psychology, Swansea University, Swansea, UK, <sup>2</sup> Department of Behavior Analysis, University of North Texas, Denton, TX, USA

Excessive avoidance behavior, in which an instrumental action prevents an upcoming aversive event, is a defining feature of anxiety disorders. Left unchecked, both fear and avoidance of potentially threatening stimuli may generalize to perceptually related stimuli and situations. The behavioral consequences of generalization mean that aversive learning experiences with specific threats may lead to the inference that classes of related stimuli are threatening, potentially dangerous, and need to be avoided, despite differences in physical form. Little is known however about avoidance generalization in humans and the learning pathways by which it may be transmitted. In the present study, we compared two pathways to avoidance—instructions and social observation—on subsequent generalization of avoidance behavior, fear expectancy and physiological arousal. Participants first learned that one cue was a danger cue (conditioned stimulus, CS+) and another was a safety cue (CS−). Groups were then either instructed that a simple avoidance response in the presence of the CS+ cancelled upcoming shock (instructed-learning group) or observed a short movie showing a demonstrator performing the avoidance response to prevent shock (observational-learning group). During generalization testing, danger and safety cues were presented along with generalization stimuli that parametrically varied in perceptual similarity to the CS+. Reinstatement of fear and avoidance was also tested. Findings demonstrate, for the first time, generalization of socially transmitted and instructed avoidance: both groups showed comparable generalization gradients in fear expectancy, avoidance behavior and arousal. Return of fear was evident, suggesting that generalized avoidance remains persistent following extinction testing. The utility of the present paradigm for research on avoidance generalization is discussed.

### Keywords: instructed-learning, observational-learning, avoidance, generalization, fear-conditioning, anxiety disorders

### Edited by:

Richard J. Servatius, Syracuse DVA Medical Center, USA

### Reviewed by:

Roee Admon, McLean Hospital, Harvard Medical School, USA; Jony Sheynin, University of Michigan, USA; Brian Van Meurs, University of Minnesota, USA

#### \*Correspondence:

Simon Dymond, Experimental Psychopathology Laboratory, Department of Psychology, Swansea University, Singleton Park, Swansea SA2 8PP, UK; s.o.dymond@swansea.ac.uk

> Received: 02 April 2015 Accepted: 01 June 2015 Published: 18 June 2015

#### Citation:

Cameron G, Schlund MW and Dymond S (2015) Generalization of socially transmitted and instructed avoidance. Front. Behav. Neurosci. 9:159. doi: 10.3389/fnbeh.2015.00159

Chronic or excessive avoidance behavior, in which an overt action postpones or prevents an upcoming aversive event, is a defining feature of anxiety disorders (Craske et al., 2009). In the laboratory, avoidance learning is usually studied within the fear-conditioning paradigm (Dymond and Roche, 2009; Vervliet and Raes, 2013). Fear conditioning involves an initially neutral stimulus (the conditioned stimulus or CS) being paired with an aversive unconditioned stimulus (US), such as electric shock. After only a few CS−US pairings, presentations of the CS alone will elicit a conditioned fear response (CR), measured in humans via physiological arousal, expectancy ratings or action tendencies. Moreover, performing a simple motor response in the presence of the CS that predicts US delivery (CS+) might lead to acquisition of steady rates of avoidance behavior because doing so successfully prevents contact with the US, while rates of avoidance will be low or zero in the presence of the CS that predicts absence of the US (CS−). Several decades of research have been conducted on fear and avoidance learning using variants of this basic paradigm (Boddez et al., 2014; LeDoux, 2014).

A direct instrumental/operant learning history with an avoidance response preventing upcoming US delivery through trial and error may not actually be necessary to learn avoidance. Little is known however about these so-called alternative pathways by which avoidance may be acquired in adults, and to date, much of the basic research has focused on fear learning. Rachman (1977) proposed several vicarious learning pathways to fear other than directly experienced CS−US pairings, such as verbal instructions, in which participants are instructed about the CS−US pairings, and social observation, in which participants observe another individual experience the CS−US pairings. Olsson and Phelps (2004) compared fear learning acquired through direct (CS−US pairings) and indirect experience (verbal instructions and social observation). Participants in the observational-learning group observed a demonstrator's fearful expression when receiving shocks paired with the angry face CS+, while those in the instructed-learning group were simply informed that the CS+ would be paired with shock. Results showed similar levels of fear learning across all three groups, as measured by skin conductance response (SCR), and similar studies have replicated and extended this basic effect (e.g., Olsson and Phelps, 2007; Raes et al., 2014; Golkar et al., 2015; Mertens et al., 2015). Vicarious learning of fear may help explain how fear is acquired in common childhood fears (e.g., Askew and Field, 2007, 2008; Muris and Field, 2010) and is consistent with the clinical observation that individuals with anxiety do not always report prior direct conditioning episodes like those modeled in fear-conditioning paradigms (Merckelbach et al., 1989; Ollendick and King, 1991).

The evidence to date therefore indicates that both fear and avoidance learning can occur through indirect learning pathways of the kind proposed by Rachman (Field et al., 2001; Askew and Field, 2007; Kelly et al., 2010; Muris and Field, 2010). Avoidance has, however, tended to be measured as a behavioral output of fear, and remains relatively under-investigated in its own right. Indeed, few studies have compared the vicarious pathways through which an avoidance response may be initially acquired. Preliminary evidence for the idea that avoidance may in fact be acquired via alternative pathways was found by Dymond et al. (2012), who tested whether avoidance acquired indirectly via verbal instructions results in similar levels of avoidance behavior and expectancy of shock to avoidance acquired after direct instrumental learning. Following fear conditioning, participants either learned or were instructed to make a response that cancelled upcoming shock. Three groups were then tested with presentations of a directly learned CS+ and CS− (learned group) or instructed CS+ (instructed group). Results showed similar levels of avoidance behavior and expectancy ratings across each of the pathways despite the different routes (i.e., experience vs. instructions) through which they were acquired. These preliminary findings are important because the fear conditioning history with the same danger and safety cues was common across the different pathways; the groups only differed by how the instrumental-avoidance response was acquired before it was subsequently tested under extinction.

No two situations are ever the same, and fear and avoidance acquired in one setting or situation may generalize to perceptually related situations. Generalization of conditioned fear based on formal, perceptual similarity is relatively well studied in humans (Dymond et al., in press) and nonhumans (Kheirbek et al., 2012). Drawing on classic studies of stimulus generalization in nonhumans (Honig and Urcuioli, 1981), systematic tests of fear generalization present an array of stimuli that vary along a specifiable physical continuum (e.g., color or size) from the CS+ (Dunsmoor and Paz, 2015). Generalization of fear and avoidance is adaptive when elicited by stimuli with a high probability of threat. However, the behavioral consequences of fear and avoidance generalization mean that aversive learning experiences with one cue may lead people to infer that classes of related cues are fearful, potentially threatening and need to be avoided, despite differences in physical form. If left unchecked, the focus of fear soon becomes excessive and can lead to debilitating anxiety, impaired social functioning and diminished quality of life. Indeed, the unrestricted generalization or "overgeneralization" of maladaptive fear and avoidance is now widely considered to be a defining feature of anxiety disorders (American Psychiatric Association, 2013). Overgeneralization of conditioned fear has been observed in panic disorder (Lissek et al., 2010), generalized anxiety disorder (Lissek et al., 2014; Tinoco-González et al., in press) and post-traumatic stress disorder (Lissek and Grillon, 2012). Yet, surprisingly little research has been conducted on the generalization of avoidance with healthy participants (Lommen et al., 2010; van Meurs et al., 2014; see also, Geschwind et al., 2015). Lommen et al. (2010) first identified participants who scored high and low for neuroticism and then used white and black colored circles as CS+ and CS−, respectively.
During the generalization test, circles with grey values that ranged between black and white were presented as generalization stimuli (GSs) and participants were informed that shocks could be avoided within a latency of 1 or 5 s. Findings showed that participants who scored high on neuroticism only avoided shocks on the 5 s trials compared to the group scoring low on neuroticism.

Recently, van Meurs et al. (2014) devised a "virtual farmer" task to investigate the inter-relationship between Pavlovian fear learning, a passive-emotional process, and the operant/instrumental avoidance it motivates, which may be considered the active-behavioral component of maladaptive coping. The participants' task was to plant and harvest crops by selecting one of two routes to the field that differed in the likelihood of a successful harvest and the delivery of shock. In the course of the task, circles of differing size (Lissek et al., 2008) appeared onscreen and predicted the delivery of shock during Pavlovian fear and instrumental avoidance generalization trials. During the instrumental avoidance trials, participants had to choose between taking either the short route, which always resulted in a successful harvest but was followed by shock on CS+ trials, or taking the long route, which was never followed by shock but resulted in a reduced likelihood of a successful harvest. Avoidance in the presence of the CS+, by taking the long route, is considered adaptive because it prevents shock, but the extent to which the GSs evoked a maladaptive generalized avoidance tendency was the focus of investigation. van Meurs et al. (2014) found generalization in risk ratings and fear potentiated startle EMG responses obtained on Pavlovian generalization trials and in the proportion of avoidance responses made on instrumental generalization trials. van Meurs et al. (2014) determined patterns of overgeneralized maladaptive avoidance by plotting their measures along a continuum from the CS− via the GSs to the CS+.
Similar to studies on the generalization of conditioned fear (Lissek et al., 2005, 2008, 2014), participants' avoidance behavior resembled a generalization gradient in which conditioned responding reached a maximum in the presence of the CS+, declined as the GSs gradually became more dissimilar, and reached a minimum in the presence of the CS− and a physically unrelated safety cue. Generalization gradients presented in this manner allow for an examination of the strength of generalization by charting the steepness of the gradient: the less steep the gradient, the greater the generalization.
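One simple way to express the steepness of a generalization gradient numerically is the least-squares slope of the response measure across the ordered stimuli. The sketch below is an illustration under assumptions rather than the analysis used in the studies cited: the linear slope is only one possible summary, equal spacing of the stimuli is assumed, and the two response profiles are invented values.

```python
def gradient_slope(responses):
    """Least-squares slope of responses across equally spaced stimuli.

    responses is ordered from the CS- through the GSs to the CS+.
    A smaller slope means a flatter gradient, i.e., more generalization.
    """
    n = len(responses)
    positions = range(n)
    mean_x = sum(positions) / n
    mean_y = sum(responses) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in zip(positions, responses))
    denominator = sum((x - mean_x) ** 2 for x in positions)
    return numerator / denominator


# Hypothetical proportions of avoidance for CS-, GS1, GS2, GS3, CS+.
steep_profile = [0.0, 0.1, 0.3, 0.7, 1.0]    # little generalization
shallow_profile = [0.5, 0.6, 0.7, 0.85, 1.0]  # broad generalization

steep_slope = gradient_slope(steep_profile)
shallow_slope = gradient_slope(shallow_profile)

print(f"steep profile slope: {steep_slope:.3f}")
print(f"shallow profile slope: {shallow_slope:.3f}")
```

With these invented profiles the second set of responses yields the smaller slope, which corresponds to the flatter gradient and therefore to the greater generalization.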

While generalization gradients have been used to directly compare overgeneralization of fear in healthy participants and individuals with clinical disorders (Lissek, 2012), little is known about the overgeneralization or otherwise of avoidance behavior and the mechanisms by which it may be learned and generalized. Studies conducted to date have tended to employ avoidance behavior as a discrete measure of either the motivative properties of fear or as an instantiation of fear itself. For instance, van Meurs et al. (2014) only tested avoidance once in a generalization phase that interspersed Pavlovian fear learning and generalization trials with instrumental-avoidance generalization trials because they were interested in the relationship between passive-emotional Pavlovian and active-behavioral instrumental avoidance. Overlooking the acquisition of avoidance as a signaled operant response (Hurwitz et al., 1972; Higgins and Morris, 1984) may limit our understanding of how maladaptive avoidance coping first comes to be established before it subsequently generalizes. Once established, avoidance may appear to be partially independent of the contribution of Pavlovian processes, or may not necessitate the simultaneous probing of Pavlovian and instrumental components.

In the present study, we sought to investigate the generalization of signaled operant avoidance following a direct Pavlovian fear learning history in which one cue was established as a danger cue (CS+) and another as a safety cue (CS−). We then compared the effects on subsequent generalization of different routes or pathways by which the avoidance response is learned. Our aim was to contrast instructed-learning and observational-learning pathways of generalized avoidance. Following preacquisition and fear conditioning phases, groups were either instructed that a simple instrumental response in the presence of the CS+ cancels upcoming shock or observed a short movie showing a demonstrator in the same experimental context performing the avoidance response to prevent shock. In the generalization test phase, learned danger and safety cues were presented along with generalization stimuli (GS1, GS2, GS3) in the absence of the US (extinction), and avoidance behavior, US expectancy and SCR were measured.

In addition, we then sought to test whether, after the end of the extinction test block, three unsignaled US presentations would prompt reinstatement of generalized avoidance if the generalization test was repeated. Reinstatement tests like this model the real world return of fear that often interferes with the long-term effectiveness of exposure-based therapy (Haaker et al., 2014). Interestingly, reinstatement studies with humans have shown a post-extinction increase in outcome measures to the CS+ and also the generalization of reinstatement effects to the CS− (Kull et al., 2012). To our knowledge, reinstatement of generalized avoidance has not been tested before in humans. A secondary aim of the present study was therefore to investigate the effects of reinstatement on generalized avoidance both in terms of the physically similar stimuli resembling CS+ and the safety cue, CS−.

We hypothesized higher trial-by-trial US expectancy ratings, avoidance behavior and SCRs to CS+ than CS−, and generalization of these outcome measures to stimuli that are more physically similar to the CS+ than CS−. We also hypothesized that there would be no differences in outcome measures during extinction testing between individuals who have acquired avoidance via either instructed-learning or observational-learning. Moreover, we hypothesized that reinstatement testing following extinction would result in less steep gradients overall to GSs arranged along the physical continuum between CS− and CS+ in both groups. Given that the present study predicted a degree of equivalence between the instructed-learning and observational-learning pathways, conventional null hypothesis testing is somewhat limited. For this reason, and because a non-significant p-value does not provide support for the null hypothesis, we used an additional Bayesian analysis to establish the statistical likelihood of the null hypothesis being valid over the alternative hypothesis. The Bayesian framework has several theoretical advantages over classical frequentist statistics (Dienes, 2014) and allows us to quantify the probability of the null hypothesis being true (Wagenmakers, 2007).
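The text does not state how the Bayesian quantities were computed. One approach described by Wagenmakers (2007) approximates the Bayes factor from the difference in BIC between the null and alternative models, and the sketch below illustrates that approximation under the assumption of equal prior odds. Whether the authors used this exact procedure is not confirmed by the source; the function names and the sums of squared error are hypothetical.

```python
import math


def bf01_from_sse(sse_null, sse_alt, n_obs, extra_params):
    """Approximate Bayes factor in favor of the null (BF01) via BIC.

    delta_bic_10 = BIC(alternative) - BIC(null)
                 = n * ln(SSE_alt / SSE_null) + extra_params * ln(n)
    BF01 is approximately exp(delta_bic_10 / 2).
    """
    delta_bic_10 = (n_obs * math.log(sse_alt / sse_null)
                    + extra_params * math.log(n_obs))
    return math.exp(delta_bic_10 / 2.0)


def posterior_prob_null(bf01):
    """Posterior probability of the null, assuming equal prior odds."""
    return bf01 / (1.0 + bf01)


# Hypothetical values: 54 observations, one extra parameter (group),
# and an alternative model that barely reduces the error.
bf01 = bf01_from_sse(sse_null=100.0, sse_alt=99.0, n_obs=54, extra_params=1)
p_null = posterior_prob_null(bf01)

print(f"BF01 = {bf01:.2f}, P(H0 | data) = {p_null:.2f}")
```

A BF01 above 1 indicates that the data are more likely under the null than under the alternative, which is the kind of evidence for group equivalence that a non-significant p-value cannot supply.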

### Materials and Methods

### Participants

Fifty-four healthy participants, 15 men and 39 women (Mage = 20.13 years, SD = 3.30) without a self-reported history of anxiety or depression, were randomly assigned to one of two groups: Instructed-learning or Observational-learning. All participants provided informed consent and were compensated with either course credits or the opportunity to win a £15 voucher. The Department of Psychology Ethics Committee at Swansea University approved the study and all procedures were conducted in accordance with the Declaration of Helsinki for the protection of human participants.

### Apparatus and Stimuli

Five gray colored circles of increasing size were used as the conditioned and generalization stimuli, with the largest and smallest circles serving as the CS+ and CS−, in a counterbalanced order of conditions (see **Figure 1**). The remaining three circles served as the generalization stimuli (GS1, GS2, GS3). The smallest circle had a diameter of 5 cm, with each successive circle 15% larger than the one before it (i.e., 5 cm, 5.8 cm, 6.6 cm, 7.6 cm, 8.7 cm). A black isosceles triangle served as a perceptually dissimilar novel safety cue (ΔCS−); it had a width and height of 6.6 cm, comparable to that of the GS2, and remained the same across groups. Stimuli were presented on a 17" computer monitor with a 60 Hz refresh rate, and the stimulus sequence, presentation and timings were controlled using OpenSesame (Mathôt et al., 2012).
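The reported diameters follow from repeated 15% increases starting at 5 cm. The short sketch below reproduces that arithmetic; the function name and its parameters are illustrative and not part of the original method.

```python
def circle_diameters(smallest_cm=5.0, step=0.15, count=5):
    """Diameters of circles that each grow by `step` over the previous one."""
    return [smallest_cm * (1.0 + step) ** i for i in range(count)]


diameters = circle_diameters()
reported_cm = [5.0, 5.8, 6.6, 7.6, 8.7]  # values given in the text

for computed, reported in zip(diameters, reported_cm):
    print(f"computed {computed:.3f} cm, reported {reported} cm")
```

Each computed value agrees with the reported one to within rounding to one decimal place.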

Electric shock (250 ms duration) was delivered via a bar electrode fitted to the participant's dominant forearm and controlled by an isolated stimulator (STM200–1, BIOPAC Systems, Santa Barbara, CA, USA). At the outset, all participants underwent a shock calibration procedure in which they were given an example shock and instructed to either select or retain a shock level that was "uncomfortable, but not painful". The shock level selected by each participant was used throughout the experiment. SCRs were acquired from the distal phalanx of the second and third digits on the non-dominant hand and recorded using the BIOPAC MP-150 SCR module (BIOPAC Systems, Santa Barbara, CA, USA). Isotonic recording gel was applied to the Ag-AgCl 4 mm electrodes prior to their application.

### Procedure

Following informed consent, participants were fitted with the SCR and shock electrodes and undertook shock calibration. Participants were then given general procedural instructions explaining that on each trial one of two colored circles would appear, that some may be followed by shock, and that when the US expectancy rating questions appeared on screen they should use the mouse to rate the likelihood of shock (where 0 = certainly no shock, 5 = uncertain and 10 = certainly shock).

The procedure consisted of six phases: preacquisition, fear conditioning, avoidance learning, generalization test, US reinstatement, and reinstatement test (see **Table 1**). Both groups experienced all phases, but contingencies differed between groups in the avoidance learning phase only.

### Preacquisition

The CS+ and CS− were each presented once in the center of the screen for 2 s followed by the rating scale, which remained on screen until a response occurred or for a maximum of 4 s, whichever happened first. The CS duration was therefore 6 s and the intertrial interval (ITI), which varied between 6 s and 8 s, was indicated by a black fixation cross. No shocks were presented in this phase.

### Fear Conditioning

In this phase, which continued uninterrupted following the previous ITI, CS+ and CS− were each presented six times in a randomized order (CS duration was 6 s). The termination of every CS+ trial (either by a rating response or by reaching its maximum duration) was always followed by shock. No shocks ever followed the CS−.

### Avoidance Learning

During this phase, the Instructed-learning group was told that their task was to learn to make a response to prevent shock (**Figure 2**). They were told that on some trials a black border would appear around the edge of the screen and would signal the availability of the avoidance response, which consisted of pressing the right mouse button once with the cursor hovering over the CS+. Participants were presented with the CS+ and CS− a further eight times; on half of the trials, the avoidance cue was presented, which signaled the availability of the mouse button response (CS duration was 6 s). On the avoidable trials, the stimulus (CS+ or CS−) was presented for 2 s and followed by the avoidance cue for 2 s: during this time the stimuli remained on screen and participants could use the mouse to click on the image to prevent pending shock. Participants made ratings on the US expectancy scale before making or not any avoidance response. On unavoidable trials, the stimulus (CS+ or CS−) was presented for 2 s and, 2 s later, shock always followed CS+ trials only. Shocks never followed any CS− trials. Participants were informed they should only make the avoidance response if they believed that shock would follow the image on the screen and that once the rating scale appeared, the avoidance response, when available, could no longer be performed and that they should instead make a rating on the scale. The mouse cursor was hidden until available to use, either at the onset of the avoidance cue or rating scale.

TABLE 1 | Trial types and number of stimulus presentations during preacquisition, fear conditioning, avoidance learning, generalization test, and reinstatement test phases.

Note: Three unsignaled US presentations occurred between the Generalization Test and Reinstatement Test phases. "US?" and "NoUS?" ask whether shock was or was not presented. \*Indicates that the US occurred on unavoidable trials (for the Instructed-learning group only).

The Observational-learning group did not take part in any learning trials in this phase. Instead, they viewed a short (4 min) film of a male demonstrator taking part in the avoidance-learning phase of the same experiment (**Figure 2**). They were told that they would observe a person taking part in an experiment similar to the one that they themselves would be taking part in after the video had ended. They were also told that the person in the film would learn to cancel an upcoming shock using the mouse and that they should pay close attention to the screen because they too would have to learn to cancel upcoming shocks. These participants observed a total of 16 trials (i.e., CS+ and CS− each presented eight times) in which the CS+ was always avoided when the border appeared and the CS− was never avoided (the demonstrator, but not the participant, received a total of four shocks on CS+ unavoidable trials). The demonstrator made ratings on every trial (which were always high for CS+ unavoidable trials and low for all CS− and CS+ avoidable trials). The Observational-learning group made no ratings during this phase.

### Generalization Test

This phase continued uninterrupted and without further instructions. The CS+, CS−, GS1, GS2, and GS3 were each presented four times (two avoidable and two unavoidable trials of each cue) along with two presentations of the 1CS− for a total of 22 trials. The cue signaling an avoidable trial appeared on all CS+, CS− and GS trials, but never on 1CS− trials. As this was a test phase, shock was withheld on all trials.
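As an illustration only, the trial composition described above can be sketched in Python. The function name, the tuple layout, and the use of a shuffled order are assumptions made for this sketch; the text does not state how trial order was determined in this phase.

```python
import random


def build_generalization_trials(seed=None):
    """Return a list of (cue, avoidable) tuples for the generalization test.

    Hypothetical helper: five cues appear four times each (two avoidable,
    two unavoidable) and the 1CS- appears twice, always unavoidable.
    """
    cues = ["CS+", "CS-", "GS1", "GS2", "GS3"]
    trials = []
    for cue in cues:
        trials += [(cue, True)] * 2   # avoidance cue shown
        trials += [(cue, False)] * 2  # avoidance cue not shown
    trials += [("1CS-", False)] * 2   # never signaled as avoidable
    # Assumption: order is shuffled; the source does not specify this.
    rng = random.Random(seed)
    rng.shuffle(trials)
    return trials


trials = build_generalization_trials(seed=1)
print(len(trials))
```

The count follows directly from the description: 5 cues × 4 presentations + 2 presentations of the 1CS− gives 22 trials.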

### US Reinstatement

Following a short interval (1250 ms), the US was presented three times without warning and in the absence of any onscreen stimuli. Each US presentation was separated by a delay of 1 s.

### Reinstatement Test

Finally, a short interval (1 s) commenced before the scheduled ITI and trials were re-presented from the generalization test phase.

### Data Analysis

Skin conductance data were continuously recorded at a rate of 1000 samples per second, and off-line analysis of the analog SCR waveforms was conducted with AcqKnowledge (BIOPAC Systems Inc., Goleta, CA, USA). SCRs were measured for each trial as the peak-to-peak amplitude difference in SCR to the first response (in microsiemens, µS) in the 0.5–6 s latency window following stimulus onset. The minimal response criterion was 0.02 µS. To normalize the SCR data, scores were square-root transformed. Statistical analysis of SCR data involved repeated-measures ANOVA.
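A minimal Python sketch of this per-trial scoring rule is given below. It is not the AcqKnowledge procedure itself: the function and argument names are hypothetical, and scoring sub-criterion responses as zero is an assumption, since the text states the criterion but not how responses below it were handled.

```python
import math

MIN_RESPONSE_US = 0.02  # minimal response criterion in microsiemens (from the text)


def score_scr(trough_us, peak_us, min_response=MIN_RESPONSE_US):
    """Return the square-root-transformed SCR amplitude for one trial.

    trough_us and peak_us are hypothetical inputs: the onset (trough) and
    peak conductance of the first response in the 0.5-6 s window after
    stimulus onset, both in microsiemens.
    """
    amplitude = peak_us - trough_us
    # Assumption: responses below the criterion are scored as zero.
    if amplitude < min_response:
        return 0.0
    return math.sqrt(amplitude)


print(score_scr(1.00, 1.25))  # amplitude 0.25 uS
print(score_scr(1.00, 1.01))  # amplitude below the 0.02 uS criterion
```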

Online US expectancy ratings appeared on every trial for the Instructed-learning group and on all trials excluding the avoidance learning phase for the Observational-learning group. Where participants responded within the time allowed, ratings for each stimulus were analyzed within test phases and separately for avoidable and unavoidable trials. The analysis of US expectancy ratings and SCRs focused on avoidable trials during avoidance learning and test phases. All 1CS− trials were unavoidable during these phases and were not included in the final analysis. For all phases, excluding the avoidance learning phase, a two-way repeated-measures ANOVA was used to compare within- and between-subject differences for the dependent measures. For the generalization and reinstatement test phases only, a polynomial trend analysis was conducted to determine the linear and quadratic terms used to describe the shape of the generalization gradients obtained (only significant trends are reported). A paired-samples t-test was used to analyze data from the Instructed-learning group during avoidance learning. Avoidance behavior was measured as a percentage of trials avoided for all avoidable CS+, CS− and GS stimuli. For all analyses, the alpha level was set at 0.05. Where necessary, p-values reflect the Greenhouse-Geisser correction for sphericity, and Bonferroni correction was used to control for multiple comparisons.
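The Results sections report partial eta squared alongside each F statistic. As a reading aid, the standard identity relating the two can be sketched in Python; the function name is hypothetical, and this is the textbook relationship rather than a description of the software used in the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom.

    Uses the standard identity: eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Example using a value reported for the generalization test, F(4,192) = 18.507
print(round(partial_eta_squared(18.507, 4, 192), 3))  # 0.278
```

Applying the same identity to other reported F values offers a quick consistency check on the effect sizes.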

To further investigate the predicted absence of between-group differences, we performed repeated-measures Bayesian ANOVA with JASP (Love et al., 2015) and used default priors to estimate the Bayes Factor (BF; Rouder et al., 2012). The BF compares the likelihood of the data under the null hypothesis with the likelihood of the data under the alternative hypothesis. In our analysis, we compared the null hypothesis against the alternative (BF01), where the greater the BF value, the greater the likelihood of the data fitting the null hypothesis (e.g., a BF greater than 3 indicates substantial evidence for the null hypothesis, 1 indicates no evidence for either hypothesis, and less than 1 indicates increasing evidence for the alternative hypothesis; Wetzels and Wagenmakers, 2012).
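The reading rule for BF01 described above can be expressed as a short Python sketch. The function names and label strings are hypothetical, and the "anecdotal" label for values between 1 and 3 is an assumption based on the conventional evidence categories rather than something stated in the text.

```python
def interpret_bf01(bf01):
    """Map a BF01 value onto the verbal categories described in the text."""
    if bf01 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf01 > 3:
        return "substantial evidence for the null"
    if bf01 > 1:
        # Assumption: values between 1 and 3 are treated as anecdotal support.
        return "anecdotal evidence for the null"
    if bf01 == 1:
        return "no evidence for either hypothesis"
    return "evidence for the alternative"


def bf10_from_bf01(bf01):
    """BF10 is the reciprocal of BF01 (alternative relative to null)."""
    return 1.0 / bf01


print(interpret_bf01(4.306))
print(interpret_bf01(0.013))
```

Very small BF01 values, such as those reported for main effects of stimulus, correspond to very large BF10 values in favor of the alternative.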

### Results

A total of four participants were removed from all analyses (three from the Instructed-learning group and one from the Observational-learning group) due to a programming error, while data from one further participant in the Instructed-learning group were excluded from analysis of the avoidance learning phase only. The final sample sizes were: Instructed-learning (n = 25) and Observational-learning (n = 25). Of these, SCR data from three participants (one Instructed-learning and two Observational-learning) were removed from the analysis as they were deemed non-responders; due to a programming error, data were missing from a further two participants in each group; and two further participants from the Observational-learning group were removed from analysis of the reinstatement test phase only because they removed the electrodes.

### Preacquisition

### US Expectancy Ratings

As expected, ratings of the likelihood of shock did not differ across stimuli during preacquisition, F(1,34) = 1.969, p = 0.170, η 2 <sup>p</sup> = 0.055, BF<sup>01</sup> = 1.945, there was no interaction with group, F(1,34) = 0.362, p = 0.552, η 2 <sup>p</sup> = 0.011, BF<sup>01</sup> = 3.811, and no differences between groups, F(1,34) = 0.049, p = 0.826, η 2 <sup>p</sup> = 0.001, BF<sup>01</sup> = 1.985.

### SCR

Analysis of SCR revealed a similar pattern, with no differences between stimulus type, F(1,46) = 0.468, p = 0.497, η 2 <sup>p</sup> = 0.010, BF<sup>01</sup> = 3.687, no interaction, F(1,46) = 0.569, p = 0.454, η 2 <sup>p</sup> = 0.012, BF<sup>01</sup> = 13.425, and no differences between groups, F(1,46) = 0.049, p = 0.827, η 2 <sup>p</sup> = 0.001, BF<sup>01</sup> = 3.783.

**Table 2** shows the means (and standard deviations) for US expectancy ratings and SCR for CS+ and CS− during Preacquisition, Fear Conditioning, and Avoidance Learning phases (avoidable trials only) for both groups.

The expectancy ratings and SCR findings were predicted given the absence of shock during preacquisition, and showed that the groups had a similar, low expectancy of shock and undifferentiated SCR profile at the outset.

### Fear Conditioning

### US Expectancy Ratings

During fear conditioning, expectancy ratings differed across stimuli, F(1,48) = 65.342, p < 0.001, η 2 <sup>p</sup> = 0.577, BF<sup>01</sup> = 7.007, but no interaction with group was found, F(1,48) = 0.374, p = 0.544, η 2 <sup>p</sup> = 0.008, BF<sup>01</sup> = 2.830. The instructed-learning and observational-learning groups did not differ


TABLE 2 | Means (and standard deviations) for US expectancy ratings and SCR for CS+ and CS- during preacquisition, fear conditioning, and avoidance learning phases (avoidable trials only) for the instructed-learning and observational-learning groups.

in their expectancy of shock, F(1,48) = 0.200, p = 0.657, η 2 <sup>p</sup> = 0.004, BF<sup>01</sup> = 4.306. Analysis of trial-by-trial ratings for this phase, with trial order as the within-subjects factor and group as the between-subjects factor, revealed significantly higher expectancy across trials for CS+, F(5,420) = 26.519, p < 0.001, η 2 <sup>p</sup> = 0.356, BF<sup>01</sup> = 2.388e-19, which did not differ between the groups, F(1,48) = 1.235, p = 0.272, η 2 <sup>p</sup> = 0.025, BF<sup>01</sup> = 3.535. As predicted, this indicates that both groups demonstrated an increase in US expectancy across trials (see **Figure 3A**).

### SCR

Analysis of SCR revealed no significant main effect of stimulus, F(1,40) = 3.313, p = 0.076, η 2 <sup>p</sup> = 0.076, BF<sup>01</sup> = 0.973, and no interaction with group, F(1,40) = 0.162, p = 0.690, η 2 <sup>p</sup> = 0.004, BF<sup>01</sup> = 0.749. The groups showed a near-significant difference in overall SCR, F(1,40) = 3.878, p = 0.056, η 2 <sup>p</sup> = 0.088, BF<sup>01</sup> = 0.781, but were similar in SCRs elicited to CS− (p = 0.178) and CS+ (p = 0.077; see **Figure 3B**).

### Avoidance Learning

### US Expectancy Ratings

The instructed-learning group's ratings of CS+ and CS− differed during both avoidable, t(23) = 10.429, p < 0.001, and unavoidable trials, t(23) = 10.854, p < 0.001. This indicated a higher expectancy of shock following CS+ than CS−, irrespective of the availability of the avoidance response.

### Avoidance Behavior

The instructed-learning group performed the avoidance response on 73.9% of CS+ trials (SD: 37.9) and 25% of CS− trials (SD: 38.3). The proportion of avoidance behavior evoked by the cues was significantly different, t(23) = 4.579, p < 0.001, indicating a higher proportion of avoidance responses made to CS+ compared to CS− during avoidable trials.

### SCR

The SCR elicited by CS+ and CS− during avoidable, t(20) = 2.482, p < 0.05, and unavoidable trials, t(21) = 2.327, p < 0.05,

FIGURE 3 | Fear conditioning results. Trial by trial unconditioned stimulus (US) expectancy (A) and mean skin conductance response (SCR) (µS) (square-root transformed) (B) results for CS+ and CS− presentations during fear conditioning for the instructed-learning and observational-learning groups. Error bars indicate SEM.

differed, which indicated a greater physiological response to the danger cue (CS+) than to the safety cue (CS−) during avoidable and unavoidable trials. Interestingly, the availability of avoidance did not modulate SCRs to the CS+.

### Generalization Test

### US Expectancy Ratings

Ratings made on avoidable trials revealed a significant main effect of stimulus, F(4,192) = 18.507, p < 0.001, η 2 <sup>p</sup> = 0.278, BF<sup>01</sup> = 5.044e-11, with a quadratic increase from CS− to CS+ (p < 0.001), but no interaction with group, F(4,192) = 0.372, p = 0.745, η 2 <sup>p</sup> = 0.008, BF<sup>01</sup> = 3.871e-9. Groups did not differ on the ratings they made, F(1,48) = 0.071, p = 0.791, η 2 <sup>p</sup> = 0.001, BF<sup>01</sup> = 4.069, suggesting similar patterns of generalized expectancy (see **Figure 4A**). Follow-up tests revealed a significant difference between the safety cue, CS−, and the generalized cue it most resembled, GS<sup>1</sup> (p < 0.05). Similar differences in expectancy were found between GS<sup>2</sup> and GS<sup>3</sup> (p < 0.01) and GS<sup>3</sup> and CS+ (p < 0.05). Mean ratings made to GS<sup>2</sup> were not significantly greater than those to GS<sup>1</sup> (p = 0.205) (**Figure 4A**).

Ratings on unavoidable trials displayed a similar pattern with a main effect of stimulus, F(5,240) = 39.457, p < 0.001, η 2 <sup>p</sup> = 0.451, BF<sup>01</sup> = 8.452e-27, as well as both linear (p < 0.01) and quadratic trends (p < 0.001) found in the generalization gradient, but no interaction with group, F(5,240) = 2.173, p = 0.084, η 2 <sup>p</sup> = 0.043, BF<sup>01</sup> = 1.102e-26. However, a marginally significant difference was found between groups, F(1,48) = 3.738, p = 0.059, η 2 <sup>p</sup> = 0.072, BF<sup>01</sup> = 1.057. Pairwise comparisons revealed that the instructed-learning and observational-learning groups did differ in ratings made during CS+ trials (p < 0.01), with higher ratings from the instructed-learning group, but no difference for CS− (p = 0.347), GS<sup>1</sup> (p = 0.286), GS<sup>2</sup> (p = 0.800), or GS<sup>3</sup> trials (p = 0.107). The difference between groups on CS+ unavoidable trials likely stems from the different number of directly experienced shock deliveries during avoidance learning. For the observational-learning group, ratings made on CS+ and GS<sup>3</sup> trials did not differ (p = 0.376), but did differ in the instructed-learning group (p < 0.05). Similarly, there was no difference in ratings to GS<sup>3</sup> and GS<sup>2</sup> in the observational-learning group (p = 0.061), but a difference was found in ratings made by the instructed-learning group (p < 0.001). Ratings on CS− and GS<sup>1</sup> trials did not differ for either the instructed-learning (p = 0.598) or observational-learning (p = 0.792) group, but ratings to GS<sup>1</sup> and GS<sup>2</sup> did differ for both groups (instructed-learning: p < 0.01; observational-learning: p < 0.001).

### Avoidance Behavior

Avoidance evoked by generalization test stimuli was significantly different, F(4,192) = 12.839, p < 0.001, η 2 <sup>p</sup> = 0.211, BF<sup>01</sup> = 2.560e-7, with a linear trend increase in avoidance from CS− to CS+ (p < 0.001), but no interaction with group, F(4,192) = 1.230, p = 0.301, η 2 <sup>p</sup> = 0.025, BF<sup>01</sup> = 4.677e-6. This reflects no differences in avoidance between the instructed-learning and observational-learning groups, F(1,48) = 0.248, p = 0.620, η 2 <sup>p</sup> = 0.005, BF<sup>01</sup> = 3.048. Pairwise comparisons revealed no differences

FIGURE 4 | Generalization test results. Trial by trial US expectancy (A), proportion of avoidance behavior (B), and mean SCR (µS) (square-root transformed) (C) results for conditioning (CS+ and CS−) and generalization stimuli (G1, G2, G3) during generalization testing for the instructed-learning and observational-learning groups (avoidable trials only). Error bars indicate SEM. Linear and/or quadratic terms are also shown.

between avoidance evoked by CS− and GS<sup>1</sup> (p = 0.671), GS<sup>1</sup> and GS<sup>2</sup> (p = 0.263), and GS<sup>3</sup> and CS+ (p = 0.169), but significantly higher levels of avoidance to GS<sup>3</sup> than GS<sup>2</sup> (p < 0.01). These results suggest a shallow generalization gradient from CS− to GS<sup>2</sup>, but a steep incline from GS<sup>2</sup> to GS<sup>3</sup>, which then flattened between GS<sup>3</sup> and CS+ (see **Figure 4B**).

### SCR

Results from avoidable trials showed no main effect of stimulus, F(4,160) = 1.284, p = 0.278, η 2 <sup>p</sup> = 0.031, BF<sup>01</sup> = 10.339, no interaction, F(4,160) = 1.822, p = 0.127, η 2 <sup>p</sup> = 0.044, BF<sup>01</sup> = 40.510, and no significant differences between groups, F(1,40) = 1.810, p = 0.186, η 2 <sup>p</sup> = 0.043, BF<sup>01</sup> = 1.777 (see **Figure 4C**).

Results from unavoidable trials also produced no significant effects of either stimulus type, F(4,160) = 0.251, p = 0.909, η 2 <sup>p</sup> = 0.006, BF<sup>01</sup> = 49.218, group, F(1,40) = 1.094, p = 0.302, η 2 <sup>p</sup> = 0.027, BF<sup>01</sup> = 1.883, or any interaction, F(1,40) = 1.240, p = 0.296, η 2 <sup>p</sup> = 0.030, BF<sup>01</sup> = 647.127.

### Reinstatement Test

### US Expectancy Ratings

Analysis of avoidable trials revealed a significant main effect of stimulus type, F(4,192) = 15.110, p < 0.001, η 2 <sup>p</sup> = 0.239, BF<sup>01</sup> = 5.99e-90, characterized by a quadratic trend (p < 0.001), but no interaction with group, F(4,192) = 0.139, p = 0.9013, η 2 <sup>p</sup> = 0.003, BF<sup>01</sup> = 5.704e-7. The instructed-learning and observational-learning groups did not differ in their expectancy ratings during this phase, F(1,48) = 0.048, p = 0.827, η 2 <sup>p</sup> = 0.001, BF<sup>01</sup> = 3.617. Pairwise comparisons revealed that CS+ and GS<sup>3</sup> (p = 0.437) and CS− and GS<sup>1</sup> (p = 0.907) were rated similarly, but significantly higher ratings were seen to GS<sup>2</sup> over GS<sup>1</sup> (p < 0.001) and GS<sup>3</sup> over GS<sup>2</sup> (p < 0.05), indicating generalization from both CS+ and CS− to stimuli physically closest on the continuum (see **Figure 5A**).

Analysis of unavoidable trials displayed a similar pattern; the stimuli presented evoked differential levels of expectancy, F(5,240) = 14.324, p < 0.001, η 2 <sup>p</sup> = 0.230, BF<sup>01</sup> = 2.187e-10, with a quadratic trend (p < 0.001), and no interaction with group, F(5,240) = 0.438, p = 0.757, η 2 <sup>p</sup> = 0.009, BF<sup>01</sup> = 1.672e-8. Similarly, the groups did not differ, F(1,48) = 0.865, p = 0.357, η 2 <sup>p</sup> = 0.018, BF<sup>01</sup> = 2.792, and neither ratings of CS+ and GS<sup>3</sup> (p < 0.001) nor CS− and GS<sup>1</sup> (p < 0.05) differed. However, there were significant differences between GS<sup>2</sup> and GS<sup>3</sup> (p < 0.001) and GS<sup>2</sup> and GS<sup>1</sup> (p < 0.05), demonstrating similar expectancy ratings to stimuli most physically similar to the CS− and to CS+ respectively, but not for stimuli too dissimilar or far removed from both CS+ and CS− (i.e., GS2).

### Avoidance Behavior

Consistent with the generalization test phase, analysis of avoidance revealed a significant main effect of stimulus, F(4,192) = 5.599, p < 0.01, η 2 <sup>p</sup> = 0.104, BF<sup>01</sup> = 0.013, no interaction, F(4,192) = 0.586, p = 0.606, η 2 <sup>p</sup> = 0.012, BF<sup>01</sup> = 0.529, and no differences between groups, F(1,48) = 0.385, p = 0.538, η 2 <sup>p</sup> = 0.008, BF<sup>01</sup> = 2.653. However, unlike the generalization test phase, both linear (p < 0.01) and quadratic (p < 0.05) increases in avoidance from CS− to CS+ were found. Pairwise comparisons revealed no significant difference between CS− and GS<sup>1</sup> (p = 0.104) or between GS<sup>2</sup> and GS<sup>3</sup> (p = 0.304), but significant differences between GS<sup>1</sup> and GS<sup>2</sup> (p < 0.05) and between GS<sup>3</sup> and CS+ (p < 0.05), which indicates a shift in avoidance from GS<sup>2</sup> towards the CS+, while the GS<sup>3</sup> generalization gradient became steeper towards CS+ (see **Figure 5B**).

### SCR

Analysis of avoidable trials revealed no main effect of stimulus type, F(4,152) = 1.961, p = 0.103, η 2 <sup>p</sup> = 0.049, BF<sup>01</sup> = 4.168, and no interaction with group, F(4,152) = 1.661, p = 0.162, η 2 <sup>p</sup> = 0.042, BF<sup>01</sup> = 5.398. Interestingly, a significant difference between groups was found, F(1,38) = 5.219, p < 0.05, η 2 <sup>p</sup> = 0.121, BF<sup>01</sup> = 0.435, which pairwise comparisons suggested was driven by differences in SCR amplitude to CS−, GS<sup>2</sup>, and GS<sup>3</sup> (all p's < 0.05), but not to GS<sup>1</sup> (p = 0.795) or CS+ (p = 0.406). Overall,

FIGURE 5 | Reinstatement test results. Trial by trial US expectancy (A), proportion of avoidance behavior (B), and mean SCR (µS) (square-root transformed) (C) results for conditioning (CS+ and CS−) and generalization stimuli (G1, G2, G3) during reinstatement testing for the instructed-learning and observational-learning groups (avoidable trials only). Error bars indicate SEM. Linear and/or quadratic terms are also shown.

it appeared that the observational-learning group produced consistently higher SCRs to all stimuli (**Figure 5C**).

For the unavoidable trials, analysis revealed a significant main effect of stimulus type, F(4,152) = 3.485, p < 0.01, η 2 <sup>p</sup> = 0.084, BF<sup>01</sup> = 0.536, an interaction with group, F(4,152) = 3.148, p < 0.05, η 2 <sup>p</sup> = 0.077, BF<sup>01</sup> = 0.031, and significant differences between the groups, F(1,38) = 8.258, p < 0.01, η 2 <sup>p</sup> = 0.179, BF<sup>01</sup> = 0.169. SCRs differed between groups to CS−, GS<sup>2</sup>, and CS+ (all p's < 0.05), with higher SCRs elicited in the observational-learning than in the instructed-learning group, but not to GS<sup>1</sup> (p = 0.222) or GS<sup>3</sup> (p = 0.923).

### Return of Fear: Comparing Generalization and Reinstatement Tests

To assess return of fear, the final presentation of each stimulus in the generalization test phase was compared to the first


TABLE 3 | Means (and standard deviations) for US expectancy ratings, proportion of avoidance and SCR for CS−, GS1, GS2, GS3 and CS+ during avoidable trials in the generalization test and reinstatement test phases for the instructed-learning (instructed) and observational-learning (observed) groups.

presentation following US reinstatement. Repeated measures ANOVA was run with group as the between groups variable and trial order as the within subjects factor. **Table 3** shows the mean US expectancy ratings, proportion of avoidance, and SCR for all stimuli presented during the generalization and reinstatement test phases.

### US Expectancy Ratings

Results revealed a significant main effect of trial order, F(9,423) = 12.953, p < 0.001, η 2 <sup>p</sup> = 0.216, BF<sup>01</sup> = 3.283e-16, but no interaction between group and trial, F(9,423) = 0.568, p = 0.744, η 2 <sup>p</sup> = 0.012, BF<sup>01</sup> = 1.091e-13, and no differences between groups, F(1,47) = 0.149, p = 0.702, η 2 <sup>p</sup> = 0.003, BF<sup>01</sup> = 4.031. Ratings did not differ between groups for the CS− (p = 0.143), GS<sup>1</sup> (p = 0.579), or CS+ (p = 0.963), but there was a significant increase in ratings to GS<sup>2</sup> (p < 0.01), and a near-significant decrease for GS<sup>3</sup> (p = 0.051) from generalization test to reinstatement test. These results indicate a return of fear towards more ambiguous stimuli, but stable responding to those with a prior history of either shock or no shock, which generalized only to the most physically similar stimuli (i.e., GS<sup>1</sup> and GS<sup>3</sup>).

Results from the unavoidable trials followed a similar pattern: a significant main effect of trial, F(11,495) = 17.360, p < 0.001, η 2 <sup>p</sup> = 0.278, BF<sup>01</sup> = 3.149e-27, no interaction between group and trial, F(11,495) = 0.553, p = 0.798, η 2 <sup>p</sup> = 0.012, BF<sup>01</sup> = 5.732e-25, and no difference between groups, F(1,45) = 2.920, p = 0.094, η 2 <sup>p</sup> = 0.061, BF<sup>01</sup> = 1.397. These results indicate that in the absence of the availability of the avoidance response, no change in US expectancy ratings in either the instructed-learning or observational-learning groups occurred following US reinstatement.

### Avoidance Behavior

When testing for return of avoidance, we found a significant main effect of trial order, F(9,423) = 4.720, p < 0.001, η 2 <sup>p</sup> = 0.091, BF<sup>01</sup> = 4.386e-4, no interaction, F(9,423) = 0.568, p = 0.644, η 2 <sup>p</sup> = 0.015, BF<sup>01</sup> = 0.077, and no difference between groups, F(1,47) = 0.388, p = 0.537, η 2 <sup>p</sup> = 0.008, BF<sup>01</sup> = 3.288. There was a significant increase in avoidance responding to CS− (p < 0.05), but no change to GS<sup>1</sup> (p = 0.375), GS<sup>2</sup> (p = 0.963), GS<sup>3</sup> (p = 0.129), or CS+ (p = 0.980).

### SCR

The analysis of SCR during avoidable trials revealed no main effect, F(9,342) = 1.462, p = 0.161, η 2 <sup>p</sup> = 0.037, BF<sup>01</sup> = 20.990, no interaction, F(9,342) = 0.938, p = 0.492, η 2 <sup>p</sup> = 0.024, BF<sup>01</sup> = 1128.745, and no significant differences between groups, F(1,38) = 1.894, p = 0.177, η 2 <sup>p</sup> = 0.047, BF<sup>01</sup> = 2.375. There was a significant increase in SCR during presentations of CS− (p < 0.05) and GS<sup>3</sup> (p < 0.05), but no significant change to presentations of GS<sup>1</sup> (p = 0.805), GS<sup>2</sup> (p = 0.912) and CS+ (p = 0.666).

Analysis of unavoidable trials revealed no main effect, F(9,342) = 1.481, p = 0.183, η 2 <sup>p</sup> = 0.038, BF<sup>01</sup> = 19.863, no interaction, F(9,342) = 1.165, p = 0.325, η 2 <sup>p</sup> = 0.030, BF<sup>01</sup> = 384.817, and no difference between groups, F(1,38) = 3.581, p = 0.066, η 2 <sup>p</sup> = 0.086, BF<sup>01</sup> = 1.551, indicating no change in physiological responding for both groups following US reinstatement in the absence of avoidance.

### Shapes of Generalization Gradients

To assess the shape of the generalization gradient, we adopted the method of linear departure described by van Meurs et al. (2014) to determine the extent to which the gradients departed from linearity: (average GS1, GS2, GS3)−(average CS+, CS−). The average of the CS+ and CS− reflects the directly trained midpoint of the generalization gradient (overall avoidance) and the average responses of the GSs (maladaptive avoidance) could fall either above (positive departure) or below (negative departure) this midpoint. For the combined sample only (i.e., both groups), a significant positive correlation was found between gradients of avoidance behavior and US expectancy ratings (r = 0.515, p < 0.001), but not SCR (r = 0.168, p = 0.287), during the reinstatement test phase only. No significant correlations between avoidance and either SCR (p = 0.670) or US expectancy ratings (p = 0.389) were found during the generalization test phase.
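The linear departure score defined above can be illustrated with a brief Python sketch. The function name, the dictionary layout, and the example values are hypothetical and serve only to show the arithmetic of the formula given in the text.

```python
from statistics import mean


def linear_departure(responses):
    """Linear departure: mean of GS1-GS3 minus mean of CS+ and CS-.

    responses is a hypothetical dict mapping stimulus labels to one
    participant's mean response (e.g., proportion of trials avoided).
    """
    gs_mean = mean(responses[k] for k in ("GS1", "GS2", "GS3"))
    cs_mean = mean(responses[k] for k in ("CS+", "CS-"))
    return gs_mean - cs_mean


# Illustrative values only: a perfectly linear gradient has zero departure.
linear_gradient = {"CS-": 0.0, "GS1": 25.0, "GS2": 50.0, "GS3": 75.0, "CS+": 100.0}
# Illustrative values only: broad generalization yields a positive departure.
broad_gradient = {"CS-": 0.0, "GS1": 100.0, "GS2": 100.0, "GS3": 100.0, "CS+": 100.0}

print(linear_departure(linear_gradient))
print(linear_departure(broad_gradient))
```

A positive score means the generalization stimuli evoked more responding than the trained midpoint, and a negative score means they evoked less.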

### Discussion

The aim of the present study was to compare, for the first time, instructed-learning and observational-learning pathways of avoidance on generalized avoidance behavior in humans. Participants first underwent fear conditioning and were then divided into two groups that differed in how avoidance was acquired (either through instructions about the avoidance response for the instructed-learning group or through watching a video recording of a demonstrator performing the avoidance response for the observational-learning group). Both groups were then tested in extinction for generalization of avoidance to stimuli physically resembling the CS+ along the formal dimension of size. Return of fear and avoidance was then tested in a reinstatement phase after unsignaled US presentations. Results showed that groups did not differ by the end of fear conditioning, with each group showing enhanced expectancy of shock following CS+ relative to CS−, although SCR measures fell short of statistical significance for this phase. Results also showed that both groups demonstrated comparable levels of US expectancy, avoidance behavior and physiological measures in a gradient-like manner from CS+ across the generalization stimuli to CS−. Return of fear was evident during reinstatement testing, suggesting that generalized avoidance remains persistent following the completion of extinction testing. Taken together, these findings are the first demonstration of similar generalization gradients of a signaled operant avoidance response following instructed and observational learning. We will now discuss these findings and the limitations of the present study in more detail below.

### Instructed-Learning and Social Transmission of Generalized Avoidance

Previous studies have shown that fear learning may be acquired vicariously via instructions and social observation in both adults (Olsson and Phelps, 2004, 2007; Mechias et al., 2010) and children (e.g., Askew and Field, 2007; Reynolds et al., 2014). Effects of vicarious learning on fear-related cognitions and approach-avoidance behavior have also been reported with children (Broeren et al., 2011). The present study is the first, however, to directly contrast Rachman's (1977) social observation pathway with instructed avoidance in adults, and to examine generalization of avoidance established via these pathways. Participants in the instructed-learning group were informed about the availability of the avoidance response, which was cued by an onscreen border, the avoidance response was fully described, and the CS+ and CS− were presented in random order a fixed number of times. These procedures therefore likely employed a combination of instructed and instrumental learning processes (Raes et al., 2014). Some studies have shown that explicit instructions about the avoidance response are not necessary for successful acquisition of avoidance, even in studies requiring multiple acquisition sessions to maintain a predetermined training criterion (e.g., Sheynin et al., 2014). In the present study, some form of instruction about avoidance was deemed necessary to facilitate comparison with avoidance acquired via social learning. It is possible, therefore, that the conditions under which the instructed-learning group acquired avoidance were mismatched with the contingencies experienced by the observational-learning group, who passively viewed a movie of a demonstrator performing the correct avoidance response. Future research would be well advised to determine whether or not the task instructions given to the instructed-learning group were either necessary or sufficient for the acquisition of avoidance and, if so, what effects they may have had on subsequent generalization. Moreover, a demonstration of generalized avoidance following trial and error instrumental learning with minimal or no instructions about the avoidance response would also be salutary (Dymond et al., 2012).

Both groups were exposed to an identical generalization test where the effects of the different avoidance pathways were tested. Findings indicated that both groups responded in a similar manner during the generalization test phase, with avoidance behavior and US expectancy ratings falling along a generalization gradient of responding (van Meurs et al., 2014). In both groups, the generalization cue GS<sup>3</sup> elicited similar expectancy ratings and levels of avoidance to CS+, with less pronounced differences seen in SCR (**Figure 4**). This cue was physically closest to CS+ along the continuum of size but had never been paired with shock. Yet, it elicited levels of fear and prompted actual avoidance behavior in a manner resembling a directly learned danger cue. Although GS<sup>3</sup> and the other generalization stimuli lacked a direct conditioning history with shock, their presentation during the generalization test was sufficient to prompt maladaptive avoidance behavior by both groups as participants clearly adopted a "better safe than sorry" approach (Lommen et al., 2010). That is, rather than wait to see whether or not withholding avoidance in the presence of the GSs would be followed by shock, a significant proportion of avoidance behavior was seen and a high expectancy of shock was simultaneously recorded. Taken together, these findings suggest that participants in both groups had a high expectancy that shock would follow nonavoidance, which motivated the levels of avoidance behavior seen. This is the first such demonstration of generalized avoidance in a signaled operant task without additional, ongoing task conflict (e.g., approach-avoidance conflict; van Meurs et al., 2014).

The present findings support the use of analyses of the slope of individual participants' generalization gradients to determine the extent to which they depart from linearity. We adopted the van Meurs et al. (2014) method of calculating linear departure to describe the shape of the gradient where the mean of the CS+ and CS− reflects the directly trained midpoint and where the mean of the GSs could fall either above (positive departure) or below (negative departure) this midpoint. Similar to van Meurs et al. (2014), we found correlations between our measures of generalized avoidance, with the slope of participants' expectancy ratings positively correlating with the proportion of avoidance in the reinstatement test phase only. The absence of these correlations in the generalization test phase may indicate either early effects of extinction or, as will be discussed below, inadequate power in the number of stimulus presentations of the GSs used to calculate departure.

### Reinstatement of Generalized Fear and Avoidance

As described above, reinstatement or return of fear (and avoidance) was tested by unsignaled US presentations following the extinction generalization test, which was then repeated. Reinstatement research with humans is still very much in its infancy (Haaker et al., 2014) and the present study represents the first such investigation in the context of avoidance generalization. As might be expected, a brief reintroduction of fear induced avoidance behavior, expectancy of shock and SCR, but not at levels significantly different from earlier (**Figure 5**). In fact, our analyses indicated that reinstatement testing boosted levels of maladaptive responding but only to the CS− across all performance measures, suggesting it was deemed a potential threat. Such a finding has been observed previously in the context of fear learning and extinction (Kull et al., 2012), but this is the first such demonstration in a study of avoidance generalization in healthy humans. It remains to be seen whether the reported transient effects of reinstatement and its susceptibility to methodological factors such as stimulus sequence effects (Haaker et al., 2014) are observed in other avoidance learning pathways and generalization paradigms (see also Mertens et al., 2015).

### Social Observational Pathway to Generalized Avoidance

It is well known that nonhumans can acquire fears via observation. For instance, Mineka et al. (1984) had observer monkeys without snake-fear watch their model parent monkeys interact with real, toy and model snakes. Five of the six observer monkeys readily acquired fear and avoidance of snakes, which generalized to snake-related stimuli and novel contexts. It is also relatively well known that human adults and infants can acquire fear vicariously via social observation. For example, in a neuroimaging study, Olsson et al. (2007) showed participants a movie of another person experiencing fear and distress when receiving shocks paired with a CS+. These authors found that similar neural systems were recruited during acquisition (observation) and expression (test) of learned fear, highlighting a common neurobehavioral mechanism supporting directly learned and observed fear pathways. Moreover, facial fear expression readily functions as a US in human adults (Vaughan and Lanzetta, 1980) and nonhumans (Mineka et al., 1984). Behavioral research has also highlighted similar findings in typically developing young infants. Broeren et al. (2011), for example, exposed young children to a peer modeling intervention in which they viewed either positive or negative modeling films showing peers approach the same wooden box used in their behavioral approach/avoidance task. They found that positive modeling decreased avoidance tendencies towards known and unknown animals, while negative modeling had little effect on avoidance of the modeled animal but did decrease avoidance tendencies towards the non-modeled animal.

The present findings are the first to show that avoidance may initially be acquired via observational-learning and then subsequently generalize to exemplars perceptually related to the conditioned danger cue in a manner resembling that seen in the generalization of instructed-learning of avoidance. This is, therefore, the first study to investigate avoidance behavior as both the acquisition pathway of comparison and the means of testing potential similarities between pathways during a common generalization test. Previous studies have tended to compare different pathways to fear or to use avoidance as a one-off behavioral outcome of fear; the present study is unique then for its emphasis on both avoidance acquisition and generalization.

The present paradigm affords several opportunities to further investigate generalized avoidance and the social transmission of avoidance. First, our paradigm may prove useful in detecting social learning effects on acquisition and generalization of avoidance. For instance, varying the expressive details, accuracy or racial group membership of the facial expression modeled (e.g., Golkar et al., 2015) may influence the persistence of avoidance and might even produce pronounced effects with fear relevant stimuli in individuals with and without an anxiety disorder. Second, extending the observational phase to include modeling of unsignaled US presentations prior to a reinstatement test phase would be a novel synthesis of observational-learning of avoidance with human reinstatement research and allow for the detection of potentially transient effects on generalization. Third, exposing participants to a movie where the US is removed (extinction) or where the avoidance response is prevented and the US presented independent of responding (Higgins and Morris, 1984) would permit an examination of the relative effectiveness of these separate, operant extinction methods. Moreover, effects of presenting either the generalized or learned cues in tandem with these extinction methods could be tested and applied to analog analyses of exposure-based therapy for reducing levels of problematic avoidance that is often seen in anxiety disorders. Fourth, once further validated, the present paradigm may prove useful in identifying the neurobehavioral mechanisms of avoidance generalization and testing for potential differences in generalization in those with and without an anxiety disorder.

### Limitations

A limitation of the present study was the failure to detect significant effects of stimulus type in SCR during fear conditioning and subsequent generalization test phases. The Bayesian analysis conducted of SCR data obtained during fear conditioning indicated that the data were insensitive in distinguishing the alternative hypothesis from the null. This may be related to the small sample size, the loss of some SCR data, and the number of trials presented in the generalization test and reinstatement test phases. This was exacerbated by the use of both avoidable and unavoidable trials for CSs and GSs, which meant the number of analyzed trials was reduced by more than 50%. The design of the present task also meant that trials involving 1CS− were never avoidable, and consequently no data for these trials were included in our final analysis. The inclusion of avoidable and unavoidable trials was intended as a form of within-subject contrast to help ensure reliable acquisition of discriminated avoidance (for the instructed-learning group and which was observed indirectly for the observational-learning group) and to maintain generalized avoidance when the US was withheld in extinction and reinstatement testing. A limitation of this approach was that our design did not allow for analysis of both avoidable and unavoidable trials of the 1CS−, which we might predict would evoke a low level of generalized avoidance comparable to the CS−. Indeed, the reported difference in CS+ ratings on unavoidable trials may have resulted from the fact that the instructed group received four more unavoidable shocks than the observational group during avoidance learning. Further research on the relative roles of avoidable and unavoidable trials on the generalization of avoidance is therefore warranted.

The generalization test phase was also an extinction test since the US was withheld on all trials. Previous work on fear generalization (see Dymond et al., in press) and avoidance generalization (van Meurs et al., 2014) has tended to employ variants of the "steady state" generalization test by continuing to present the US on some trials because doing so prevents extinction and gives participants the opportunity to learn that the CS+ is still dangerous and the GSs are, at least putatively, safe. Because we were interested in reinstatement of avoidance, we chose to conduct generalization testing in extinction; future research should investigate other paradigms to probe for generalization that continue to present the US on some trials.

An additional limitation is that performing the avoidance response may have influenced SCR recording during the test phases. Since the SCR interval analyzed was 6 s post-stimulus onset, the peak SCR may not have occurred until after the avoidance response was made. It is possible therefore that performing the avoidance response within the 6 s interval reduced peak amplitude SCR, but would not have influenced US expectancy ratings, which were made before avoidance responding. This was reflected during the reinstatement test phase during unavoidable trials, where US expectancies and SCR responding to CS+ remained higher than CS−. In the absence of avoidance we observed differences in SCR to CSs and GSs, which diminished in the presence of avoidance. As a result, the current study does not allow strong claims to be made regarding the physiological responding of avoidance generalization. In future research, the avoidance response should ideally be separated from SCR recording until such a time that the peak SCR is recorded, and additional physiological measures, such as fear-potentiated startle, should be incorporated into the analysis of generalized avoidance.

Finally, the results of the present study would be strengthened by undertaking tests for retrospective identification of the CS+ and determining the extent to which participants discriminated between the danger cue and the cue it most resembled (i.e., GS3). Also, employing a greater number of generalization stimuli, which could be combined to make classes of GSs (Lissek et al., 2008), may serve to facilitate potentially larger perceptual generalization differences between the cues. Accurate and high post-experimental recognition of the CS+ would thereby confirm both discrimination of danger and safety and ensure that the generalization gradient obtained was the one intended by the experimenter.

### Conclusions

The present study demonstrated, for the first time, the equivalence of instructed-learning and observational-learning pathways of avoidance acquisition on the generalization of avoidance behavior. After fear conditioning, groups either were instructed that a simple instrumental response in the presence of the CS+ cancelled upcoming shock or observed a short movie showing a demonstrator in the same experimental context performing the avoidance response to prevent shock. In two test phases, in the absence of the US, danger and safety cues were first presented along with GSs and a profile consisting of avoidance behavior, US expectancy and SCR measured. Return of fear was then probed in a reinstatement test following unsignaled US presentations. Findings revealed a generalization gradient in responding with the greatest proportion of avoidance and fear expectancy elicited by the CS+, with decreasing levels of avoidance, fear and SCR to the GSs of decreasing similarity to the CS+. Reinstatement testing demonstrated that generalized avoidance remained remarkably intact following a brief reintroduction of fear. The present findings show that generalized avoidance is a resilient behavioral consequence of fear learning and may emerge in the absence of a direct avoidance learning history. These findings also contribute to the literature on alternatives to frequentist statistical inference approaches by reporting a Bayesian analysis as an alternative to null hypothesis significance testing (NHST; Wagenmakers, 2007; Masson, 2011; Jarosz and Wiley, 2014). A non-significant result cannot provide evidence against the alternative hypothesis but is regularly used in such a way (Dienes, 2014). Similar to previous analyses (Krypotos et al., 2011, 2014), we used Bayesian analysis to determine if the absence of any differences between our groups supported the null hypothesis over the alternative hypothesis, which is not possible using conventional NHST.
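One widely used way to quantify support for the null hypothesis, described by Wagenmakers (2007) and Masson (2011), approximates the Bayes factor from the Bayesian information criterion (BIC). The sketch below illustrates that approximation only; it is an assumption that this matches the exact computation used in the present study, and the function names and example values are hypothetical.

```python
import math


def bf01_from_sse(n, sse_null, sse_alt, extra_params=1):
    """BIC approximation to the Bayes factor favoring the null (BF01).

    n            -- number of independent observations
    sse_null     -- unexplained sum of squares under the null model
    sse_alt      -- unexplained sum of squares under the alternative model
    extra_params -- free parameters the alternative adds over the null
    """
    delta_bic = n * math.log(sse_alt / sse_null) + extra_params * math.log(n)
    return math.exp(delta_bic / 2.0)


def posterior_null(bf01):
    """Posterior probability of the null, assuming equal prior odds."""
    return bf01 / (1.0 + bf01)


# Hypothetical example: the group factor explains almost no variance.
bf01 = bf01_from_sse(n=40, sse_null=100.0, sse_alt=99.0)
print(bf01, posterior_null(bf01))
```

A BF01 above 1 favors the null and a value below 1 favors the alternative, which is the kind of graded evidence that a non-significant *p* value alone cannot supply.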

### Acknowledgments

This work was supported by funding from the Department of Psychology, Swansea University. We thank Gary Freegard for technical assistance, Miriam Lommen for assistance with the figures, and Angelos Krypotos for helpful discussion of the Bayesian analysis.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Cameron, Schlund and Dymond. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### Acquisition and extinction of human avoidance behavior: attenuating effect of safety signals and associations with anxiety vulnerabilities

### **Jony Sheynin<sup>1,2,3</sup>\*, Kevin D. Beck<sup>1,2,3</sup>, Richard J. Servatius<sup>1,2,3</sup> and Catherine E. Myers<sup>1,2,3</sup>**

<sup>1</sup> Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA

<sup>2</sup> Joint Biomedical Engineering Program, New Jersey Institute of Technology and Graduate School of Biomedical Sciences, Rutgers, The State University of New Jersey, Newark, NJ, USA

<sup>3</sup> Stress and Motivated Behavior Institute, Department of Neurology and Neurosciences, New Jersey Medical School, Rutgers, The State University of New Jersey, Newark, NJ, USA

### **Edited by:**

Gregory J. Quirk, University of Puerto Rico, USA

### **Reviewed by:**

Carsten T. Wotjak, Max-Planck-Institute of Psychiatry, Germany

Christopher Cain, Nathan S. Kline Institute for Psychiatric Research, USA

### **\*Correspondence:**

Jony Sheynin, Department of Veterans Affairs, New Jersey Health Care System, 385 Tremont Avenue, Mail Stop 15, East Orange, NJ 07018, USA

e-mail: jony.sheynin@rutgers.edu, jony.sheynin@gmail.com

While avoidance behavior is often an adaptive strategy, exaggerated avoidance can be detrimental and result in the development of psychopathologies, such as anxiety disorders. A large animal literature shows that the acquisition and extinction of avoidance behavior in rodents depends on individual differences (e.g., sex, strain) and might be modulated by the presence of environmental cues. However, there is a dearth of such reports in human literature, mainly due to the lack of adequate experimental paradigms. In the current study, we employed a computer-based task, where participants control a spaceship and attempt to gain points by shooting an enemy spaceship that appears on the screen. Warning signals predict on-screen aversive events; the participants can learn a protective response to escape or avoid these events. This task has been recently used to reveal facilitated acquisition of avoidance behavior in individuals with anxiety vulnerability due to female sex or inhibited personality. Here, we extended the task to include an extinction phase, and tested the effect of signals that appeared during "safe" periods. Healthy young adults (n = 122) were randomly assigned to a testing condition with or without such signals. Results showed that the addition of safety signals during the acquisition phase impaired acquisition (in females) and facilitated extinction of the avoidance behavior. We also replicated our recent finding of an association between female sex and longer avoidance duration and further showed that females continued to demonstrate more avoidance behavior even on extinction trials when the aversive events no longer occurred. This study is the first to show sex differences on the acquisition and extinction of human avoidance behavior and to demonstrate the role of safety signals in such behavior, highlighting the potential relevance of safety signals for cognitive therapies that focus on extinction learning to treat anxiety symptoms.

**Keywords: avoidance, anxiety disorders, anxiety vulnerability, safety signal, individual differences, sex differences, inhibited temperament, computer-based task**

### **INTRODUCTION**

Avoidance behavior is the performance or the withholding of a specific response to prevent an upcoming aversive event (active or passive avoidance, respectively). Although normally an adaptive behavior that protects one from harm, avoidance can be overexpressed and become pathological. Indeed, exaggerated avoidance behavior is a predominant symptom in all anxiety disorders (e.g., American Psychiatric Association, 2000) and its severity often parallels the overall growth and persistence of the disorders (Karamustafalioglu et al., 2006). Much of our current understanding of avoidance behavior is based on animal literature. A common approach to assess avoidance in animals is to expose a rodent to an aversive event (e.g., electric shock), which is preceded by a warning signal (e.g., tone) and which can be avoided by performing or withholding a specific operant response (e.g., lever-press and step-down on an electrified grid, respectively). Responding (or withholding the response) during the aversive event represents an escape response (ER) that terminates the aversive event, whereas responding during the warning signal completely prevents the aversive event and thus represents an avoidance response (AR).

Avoidance behavior in rodents has been shown to depend on individual differences. The strain (Sutterer et al., 1980; Bond, 1981; Kuribara, 1982; Berger and Starzec, 1988; Servatius et al., 2008) and sex (Beatty and Beatty, 1970; Scouten et al., 1975; Van Oyen et al., 1981; Heinsbroek et al., 1983; Beck et al., 2010) of the tested animals affect the rate and overall level of active avoidance behavior acquisition in rodents. In addition, features of the protocols, such as the interstimulus interval duration (Berger and Brush, 1975) and the properties of the aversive event (D'Amato and Fazzaro, 1966) can also influence active avoidance learning. In some cases, individual differences in active avoidance learning can interact with differences in the avoidance training protocol (e.g., Beck et al., 2011). These findings suggest susceptibility to acquire avoidant behavior is not uniform; instead susceptibility is determined by sensitivity to specific stimuli or reactions to stimuli experienced during training.

Safety periods, i.e., periods free from aversive events, represent an appetitive component of avoidance behavior (Denny and Weisman, 1964), and can also modulate avoidance behavior in rodents (Berger and Brush, 1975). It was argued that signals associated with safety periods [i.e., safety signals (SSs)] provide positive reinforcement for an AR (Seligman and Johnston, 1973; Rachman, 1984) and may become inhibitors of fear (Falls and Davis, 1997; Myers and Davis, 2004). Research that examined the effect of SSs on avoidance behavior showed that by introducing a visual SS during the intertrial period, acquisition of ARs was facilitated (Bower et al., 1965; Dillow et al., 1972; Hurwitz et al., 1972; Candido et al., 1991). It has been argued that the facilitation was the result of the feedback stimulus, contingent on the animal's AR (Bolles and Grossen, 1969; Dillow et al., 1972). In agreement with this idea, when a non-contingent SS was used, no facilitation was shown (Fernando et al., 2013). Interestingly, the length of the SS did not affect acquisition of avoidance responding (Galvani and Twitty, 1978; Candido et al., 1991; Brennan et al., 2003).

While a large rodent literature on SS processing can be found, reports often lack a standardized methodology, which makes interpretation difficult. For instance, some researchers administered SSs specifically during the acquisition phase (e.g., Bower et al., 1965) or during the extinction phase (e.g., Grossen and Bolles, 1968), or both (e.g., Dillow et al., 1972). Further, although most of the rodent studies tested female animals, evidence suggests the existence of sex-related differences in safety processing in avoidance learning (Beck et al., 2011). Avoidance paradigms themselves also vary; some studies used lever-press discriminated avoidance (e.g., Dillow et al., 1972), free-operant avoidance (e.g., Hurwitz et al., 1972), shuttle-box avoidance (e.g., Galvani and Twitty, 1978), or jumping avoidance (e.g., Candido et al., 1991). In addition, the SS is usually a white or flashing light (e.g., Candido et al., 1991; Beck et al., 2011), but a "darkness SS" (e.g., Jacobs et al., 1983) or auditory SS (e.g., Fernando et al., 2014) has also been used. Furthermore, data on extinction learning, in which the aversive events no longer occur and the previously learned responding is expected to gradually decline, are inconsistent. While some researchers reported facilitation of extinction by the administration of SSs (Grossen and Bolles, 1968; Moscovitch and LoLordo, 1968; Weisman and Litner, 1969; Roberts et al., 1970; Jacobs et al., 1983), others found no effect (Dillow et al., 1972; Candido et al., 1991; Fernando et al., 2014). In light of the described methodological heterogeneity and inconsistent findings, translation of animal research into a clinical population is very limited. While a few attempts to test SS processing in humans have been reported (Jovanovic et al., 2005; Schiller et al., 2008; Pollak et al., 2010), all were based on classical fear conditioning, rather than operant avoidance paradigms.

The current study is the first to test the role of SSs in the acquisition and extinction of conditioned avoidance behavior in humans. We used a computer-based task that captures key features of common paradigms used to assess avoidance behavior in rodents (Sheynin et al., 2014a). On this task, which is reminiscent of a spaceship videogame, participants control a spaceship, shoot an enemy spaceship to obtain points, and hide in designated screen areas to protect against on-screen aversive events. Prior work has tested learning of SSs on this task using a conditioned discrimination procedure, where one visual signal predicted the occurrence of an aversive event, whereas another signal was associated with its non-occurrence (Molet et al., 2006; Sheynin et al., 2014a). While both prior studies demonstrated that participants successfully discriminated between on-screen stimuli and showed minimal responding during the SS, the specific contribution of the SS to avoidance behavior was not assessed. Another study that used a similar paradigm provided evidence that participants' learning is sensitive to the visual context; manipulation of the context had a prominent effect on associative learning (Byron Nelson and del Carmen Sanjuan, 2006). Here, we extended these prior studies to test how the inclusion of visual SSs affects avoidance behavior. Importantly, the SS was a discrete on-screen cue, which was presented during the intertrial interval (ITI) and explicitly signaled a period of non-threat. Here, we refer to such cues as SSs, although there are open questions as to whether such stimuli are actually perceived and/or processed as SSs [see Beck et al. (under review)].

In addition to the effect of SSs, we were interested in investigating how individual differences affect avoidance behavior on the current paradigm. In a recent study, Beck et al. (2010) showed that female sex and behaviorally inhibited temperament (i.e., a tendency to avoid or withdraw from novel social and non-social situations), two factors associated with vulnerability to anxiety in humans (Kagan et al., 1989; Pigott, 1999), were each associated with facilitated acquisition of avoidance responding in rats. Using a computer-based task similar to the one employed in the current study, we have recently paralleled these animal findings and demonstrated that sex and inhibited temperament similarly affect avoidance behavior in humans (Sheynin et al., 2014a). Here, we expected to replicate these findings and further extend them to extinction learning. Given that animal models of anxiety vulnerability show resistance to extinction of avoidance behavior (Servatius et al., 2008), we expected anxiety-vulnerable individuals to persist with the exaggerated avoidance responding even when aversive events no longer occur. In sum, in this study, we tested how anxiety vulnerability and SSs affect acquisition and extinction of avoidance behavior in humans. We hypothesized that both anxiety vulnerability and presence of SSs would facilitate learning of avoidance behavior. If the effect of SSs is dependent on individual differences, this could suggest a personalized approach to treat mental disorders associated with pathological avoidance.

### **MATERIALS AND METHODS**

### **PARTICIPANTS**

Participants were 122 healthy young adults (Rutgers University-Newark undergraduate students; mean age 20.7 years, SD 3.6; 54.1% female). Participants were recruited via a departmental subject pool, in which available research studies are posted and students sign up to participate in exchange for research credits in a psychology class. Participants were randomly but evenly assigned to one of two experimental groups (*n* = 61 each) given different versions of the computer-based task (with or without the presence of an SS). Participants were tested individually; the participant and experimenter sat in a quiet testing area during the experiment. All participants provided written informed consent and the experiment was approved by the local research ethics committee and conducted in accordance with guidelines established by the Federal government and the Declaration of Helsinki for the protection of human participants.

### **QUESTIONNAIRE**

All participants completed the tridimensional personality questionnaire (TPQ), a self-report questionnaire, which consists of 100 true/false items asking how the individual feels or behaves in various daily situations, and provides scores relating to three orthogonal personality dimensions (Cloninger et al., 1991). One personality dimension assessed by the TPQ, which the authors termed harm avoidance (HA), is defined as behavioral inhibition in response to novel or aversive situations (Cloninger, 1986, 1987). In line with our recent work (Sheynin et al., 2014a), and in agreement with reports from other groups (e.g., Mardaga and Hansenne, 2007; Baeken et al., 2009; Wilson et al., 2011; Bailer et al., 2013), we used this subscale to assess inhibited temperament in the current study. The other two dimensions assessed by the TPQ are reward dependence (RD), defined as marked response to rewarding stimuli, and novelty seeking (NS), defined as exploratory activity in response to novel stimulation. Based on our recent findings (Sheynin et al., 2014a), we predicted that HA scores would be related to avoidance learning in the current study, whereas RD and NS scores were not expected to show significant relationships with learning.

### **ESCAPE–AVOIDANCE TASK**

To test escape–avoidance behavior, participants from both experimental groups were administered a computer-based task, which took the form of a spaceship videogame. The task was conducted on a Macintosh computer programed in the SuperCard language (Solutions Etcetera, Pollock Pines, CA, USA) and followed a similar design as recently described (Sheynin et al., 2014a; **Figure 1**). The keyboard was masked except for three keys, labeled "←," "→," and "FIRE," which the participants used to perform the task. In the task, participants controlled a spaceship and could move it to one of five horizontal locations at the bottom of the screen, by using the left and the right arrow keys. An enemy spaceship appeared randomly in one of six locations on the screen. Participants were instructed to gain points by using the "FIRE" key to shoot at and destroy this enemy spaceship, which appeared in a specific location for approximately 1 s unless destroyed by the participant. Every successful hit caused an explosion of the enemy spaceship and provided a reward of 1 point.

**FIGURE 1 | Computer-based escape–avoidance task: one enemy spaceship appears randomly in one of six locations on the screen, approximately every 1 s**. The participant's goal is to gain points by shooting and destroying this spaceship (1 point for each hit). **(A,B)** The experimental groups differ in the appearance of the ITI. **(A)** In the first group (without-SS), background was the same as the one during the other task periods, **(B)** whereas in the second group (with-SS), two lights were visualized at both upper corners of the screen. **(C)** The warning period includes two colored rectangles at the top of the screen, which appear every 20 s and remain visible for 5 s. **(D)** On acquisition trials, the warning period is always followed by appearance of a bomb, which remains on-screen for 5 s (bomb period). The bomb period is divided into five segments of equal duration; during each segment, there is an explosion and loss of 5 points to a maximum of 25 points. **(E)** At the bottom corners of the screen, there are two box-shaped areas representing "safe areas." Moving the participant's spaceship to one of those boxes is defined as "hiding." While hiding, the participant's spaceship cannot be destroyed and no points can be lost, but neither can the participant shoot the enemy spaceship and gain points. Labels shown in white text are for illustration only and do not appear on the screen during the task.

Every 20 s, two colored rectangles (the warning signal) appeared for 5 s in a designated area at the top of the screen (warning period; **Figure 1C**). The color of the rectangles (pink or blue) was randomly assigned, but remained constant for each participant. Each task session consisted of 24 trials. During the first 12 acquisition trials, the warning period was always followed by appearance of a bomb for another 5 s (bomb period). The bomb period was divided into five 1-s segments; during each segment, there was an explosion and a loss of 5 points (**Figure 1D**), to a maximum of 25 points. The bomb period was followed by a 10-s ITI during which participants could gain points without any risk of aversive events. During the subsequent 12 extinction trials, no bombs appeared and each warning period was followed by a 15-s ITI. The two experimental groups differed in the appearance of the ITI on the acquisition trials; in the first group (without-SS), the background was the same as during the other task periods (**Figure 1A**), while in the second group (with-SS), the background during ITI included two lights at the two upper corners of the screen (SS; **Figure 1B**).
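The trial timing described above can be laid out as a simple schedule. The following is a minimal sketch rather than the task's actual SuperCard implementation: the names are hypothetical, time zero is assumed to be the onset of the first warning signal, and the 1-min practice period is omitted.

```python
WARNING_S = 5            # warning signal duration (s)
BOMB_S = 5               # bomb period on acquisition trials (s)
ITI_ACQ_S = 10           # intertrial interval on acquisition trials (s)
ITI_EXT_S = 15           # intertrial interval on extinction trials (s)
BOMB_SEGMENTS = 5        # 1-s segments within the bomb period
POINTS_PER_SEGMENT = 5   # points lost per exposed segment
MAX_POINT_LOSS = BOMB_SEGMENTS * POINTS_PER_SEGMENT  # 25 points per trial


def build_schedule(n_acq=12, n_ext=12):
    """Return one dict per task period, in chronological order."""
    schedule = []
    t = 0
    for trial in range(1, n_acq + n_ext + 1):
        phase = "acquisition" if trial <= n_acq else "extinction"
        if phase == "acquisition":
            periods = [("warning", WARNING_S), ("bomb", BOMB_S), ("iti", ITI_ACQ_S)]
        else:
            periods = [("warning", WARNING_S), ("iti", ITI_EXT_S)]
        for name, duration in periods:
            schedule.append({"trial": trial, "phase": phase, "period": name,
                             "start_s": t, "end_s": t + duration})
            t += duration
    return schedule


schedule = build_schedule()
print(schedule[0])            # first warning period
print(schedule[-1]["end_s"])  # total scheduled time in seconds
```

Under these assumptions every trial lasts 20 s in both phases, so the warning signal recurs at a fixed interval whether or not a bomb follows.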

Both experimental groups included two box-shaped areas representing "safe areas" at the bottom corners of the screen (**Figure 1E**). Moving the participant's spaceship to either one of those boxes was defined as "hiding." While hiding, the participant's spaceship could not be destroyed and no points could be lost, but neither could the participant shoot the enemy spaceship and gain points. Hiding during the bomb period represented an ER and terminated point loss, while hiding during the warning period represented an avoidance behavior and could cause the complete omission of point loss; in both cases, if the participant emerged from hiding before the end of the bomb period, point loss resumed and the response could not be recorded as an AR. Importantly, participants were not given any explicit instructions about the safe areas or the hiding response. At the beginning of the experiment, the participants saw the following instructions: "You are about to play a game in which you will be piloting a spaceship. You may use LEFT and RIGHT keys to move your spaceship, and press the FIRE key to fire lasers. Your goal is to score as many points as you can. The number of points will appear on the top of the screen. Good luck!" Participants were then given 1 min of practice time, during which they could shoot the enemy spaceship but no signals or bombs appeared. This practice period also included an SS in the with-SS group. Twelve trials followed, each defined by the appearance of the warning signal; the start of a new trial was not explicitly signaled to the participant. A running tally at the top of the screen showed the current points accumulated; this tally was not allowed to fall below 0, to minimize frustration among participants.

### **DATA ANALYSIS**

Every 100 ms, the program recorded whether the participant's spaceship was inside or outside one of the boxes. To assess avoidance behavior, percentage of time spent hiding during the 5-s warning period was recorded on each of the 12 acquisition and 12 extinction trials. In addition, following Sheynin et al. (2014a), two dependent variables were defined to describe specific aspects of avoidance: AR rate (percentage of acquisition trials on which an AR was made) and AR duration (percentage of the warning period during which the participant's spaceship was hidden, averaged across trials). Importantly, to consider only hiding that was part of an AR, only acquisition trials where an AR was made were included in the analyses of AR duration. By definition, all ARs resulted in avoidance of any point loss on a specific trial; longer AR duration indicated that a participant made a response earlier during the warning period and remained hiding longer overall on that trial. To assess ERs, the percentage of each bomb period during which the participant's spaceship was hidden was recorded for each acquisition trial. Finally, to analyze overall performance on the task, total points gained during the entire session, number of shooting attempts (presses on the FIRE key), and participants' locomotion (presses on the LEFT or RIGHT keys) were recorded. Due to a computer failure, number of shooting attempts and locomotion data for one participant were not recorded.
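The hiding measures defined above can be sketched from the 100-ms samples. This is a minimal illustration with hypothetical names and made-up data. The AR criterion used here (hidden at the last warning sample and throughout the bomb period) is an assumption inferred from the task description, not the authors' scoring code.

```python
def percent_hidden(samples):
    """Percentage of 100-ms samples with the spaceship inside a safe area."""
    return 100.0 * sum(samples) / len(samples)


def is_avoidance_response(warning, bomb):
    """Assumed AR criterion: hidden at the end of the warning period and
    for the whole bomb period, so that no points are lost on the trial."""
    return bool(warning[-1]) and all(bomb)


def ar_metrics(trials):
    """trials: list of (warning_samples, bomb_samples) for acquisition trials.

    Returns (AR rate in %, mean AR duration in % of the warning period).
    AR duration is averaged over AR trials only and is None if there are none.
    """
    ar_warnings = [w for w, b in trials if is_avoidance_response(w, b)]
    ar_rate = 100.0 * len(ar_warnings) / len(trials)
    if not ar_warnings:
        return ar_rate, None
    ar_duration = sum(percent_hidden(w) for w in ar_warnings) / len(ar_warnings)
    return ar_rate, ar_duration


# Made-up data: 50 samples per 5-s period (one sample every 100 ms).
trials = [
    ([False] * 20 + [True] * 30, [True] * 50),                 # AR, hid for 60% of warning
    ([False] * 30 + [True] * 20, [True] * 50),                 # AR, hid for 40% of warning
    ([False] * 50, [False] * 10 + [True] * 40),                # escape response only
    ([False] * 40 + [True] * 10, [True] * 30 + [False] * 20),  # emerged early, not an AR
]
print(ar_metrics(trials))
```

Restricting AR duration to AR trials mirrors the rule stated above that only trials on which an AR was made enter that analysis.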

To compare the two experimental groups (with-SS versus without-SS), we used *t*-tests for continuous values and chi-square tests for categorical values, with Yates continuity correction for 2 × 2 tables. To test the association of sex, personality, and presence of SS with the escape–avoidance behavior, we used stepwise linear regressions. Predictor variables were sex, score on the TPQ subscales (NS, HA, and RD), and experimental group. Dependent variables were average hiding during the warning period on acquisition and extinction trials, and average hiding during the bomb period on acquisition trials. Similar analyses were also conducted on AR rate, AR duration, and the different task performance variables (total points, shooting, and locomotion). Internal consistency of the different questionnaire subscales was analyzed using Cronbach's α with reverse scoring for individual questions taken into account. Statistical analyses were conducted using SPSS version 17.0 (SPSS Inc., Chicago, IL, USA). Alpha was set to 0.050; effects that did not approach significance (*p* > 0.100) were not reported.
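As an illustration of the Yates-corrected chi-square for a 2 × 2 table, the sketch below implements the standard textbook formula in plain Python. The analyses reported here were run in SPSS, so this is not the authors' code; the function name is hypothetical, and the cell counts are the sex-by-group sizes given in the Figure 6 caption, used purely as example input.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square statistic for the 2 x 2 table [[a, b], [c, d]].

    Formula: N * (|ad - bc| - N/2)^2 / (row1 * row2 * col1 * col2).
    The correction is clipped at zero so that it never exceeds |ad - bc|
    (a common convention, assumed here).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: with-SS, without-SS; columns: males, females (n from the Figure 6 caption).
chi2 = yates_chi_square(26, 34, 29, 32)
```

A statistic this small, on 1 degree of freedom, is in line with the report that the two groups did not differ on sex.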

### **RESULTS**

On the NS, HA, and RD subscales of the TPQ questionnaire, mean (SD) values were 16.8 (4.8), 12.8 (7.8), and 18.4 (4.3), respectively. For the 34, 34, and 30 questions comprising NS, HA, and RD subscales, inter-item reliability was 0.689, 0.900, and 0.678, respectively. No correlations were found between TPQ subscales (Pearson correlations, all *p* ≥ 0.600). Participants assigned to the two experimental groups did not differ on sex, age, or any of the TPQ subscale scores (all *p* > 0.100).

On the computer task, one participant gained only one point during the entire session (more than 2.5 SD from group mean) and demonstrated extremely high locomotive activity (more than 8 SD from group mean); data from this participant were excluded from all the behavioral analyses reported below.

For the remaining participants, to assess avoidance behavior on the task, we first analyzed percentage of time spent hiding during the 5-s warning period (**Figure 2**). On the acquisition phase, stepwise linear regression revealed that hiding could be predicted by a model including sex as the only predictor variable [*R* <sup>2</sup> = 0.059, *R* = 0.244, *F*(1,119) = 7.524, *p* = 0.007]; females acquired avoidance behavior faster and to a higher degree than males. Further, due to an apparent effect of experimental group on females' acquisition learning (**Figure 2B**), we performed *post hoc* regression analyses, separately for each sex (predictor variables were TPQ subscales and experimental group). As hypothesized, analyses revealed that females' hiding could be predicted by experimental group [*R* <sup>2</sup> = 0.078, *R* = 0.280, *F*(1,64) = 5.438, *p* = 0.023]; females in the "with-SS" group acquired avoidance more slowly than those in the "without-SS" group. In males, however, a similar analysis identified no variables as significant predictors (all *p* > 0.300).

On the extinction phase, hiding could be predicted by sex [*R* <sup>2</sup> = 0.064, *R* = 0.253, *F*(1,119) = 8.123, *p* = 0.005], as well as by both sex and experimental group [*R* <sup>2</sup> = 0.128, *R* = 0.358, *F*(2,118) = 8.647, *p* < 0.001]. Adding experimental group to the model accounted for significant additional variance (*p* = 0.004); males and participants in the "with-SS" group extinguished faster than females and those in the "without-SS" group. Crucially, since participants' hiding on the extinction phase was positively correlated with their earlier hiding during the acquisition phase (Pearson correlation, *r* = 0.655, *p* < 0.001), we used a hierarchical multiple regression to repeat the latter analysis while controlling for behavior during that phase. First, average hiding on the acquisition phase was entered as predictor variable. As expected, hiding during extinction could be predicted solely by amount of hiding during acquisition [*R* <sup>2</sup> = 0.429, *R* = 0.655, *F*(1,119) = 89.434, *p* < 0.001]. On the next step of the analysis, a stepwise linear regression was used to test whether any of the other variables (sex, TPQ subscales, and experimental group) could account for significant additional variance in hiding during extinction, beyond that accounted for by hiding during acquisition. When experimental group was added as a predictor variable, the model could account for significant additional variance (*p* = 0.045); therefore, extinction behavior could be best predicted by a model including both acquisition hiding and the presence of SS [*R* <sup>2</sup> = 0.448, *R* = 0.670, *F*(2,118) = 47.954, *p* < 0.001]. Participants in the "with-SS" group, who showed less avoidance acquisition, demonstrated faster extinction learning than their counterparts.
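The "significant additional variance" test in a hierarchical regression is conventionally an *F* test on the change in *R*<sup>2</sup> between nested models. The sketch below applies that standard formula to the rounded *R*<sup>2</sup> values reported above; the function name is hypothetical, the sample size of 121 is inferred from the reported degrees of freedom, and the result is only approximate because the inputs are rounded.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added=1):
    """F statistic for the R^2 increase when k_added predictors are added.

    n is the number of participants and k_full the number of predictors
    in the larger model; returns (F, (df1, df2)).
    """
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df2)
    return f, (k_added, df2)

# Step 1: acquisition hiding only (R^2 = 0.429).
# Step 2: acquisition hiding plus experimental group (R^2 = 0.448).
f_value, dfs = f_change(0.429, 0.448, n=121, k_full=2)
```

With these inputs the statistic is roughly 4.06 on (1, 118) degrees of freedom, which is in the range expected for the reported *p* = 0.045.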

We next used stepwise linear regression to examine two specific aspects of AR (**Figure 3**). AR duration (calculated only on trials where an AR was made) could be predicted by a model including sex as the only predictor variable [*R* <sup>2</sup> = 0.105, *R* = 0.323, *F*(1,94) = 10.985, *p* = 0.001; **Figures 3A,B**]; females demonstrated longer duration of hiding during the warning period. Considering AR rate (**Figures 3C,D**), highly inhibited individuals (i.e., those scoring in the top third of HA scores) demonstrated more ARs than their uninhibited counterparts (i.e., those scoring in the lower third of HA scores). However, stepwise linear regression indicated that neither HA nor any of the other potential predictor variables accounted for significant variability in AR rate (all *p* > 0.200). Interestingly, when AR rate data were displayed separately for each trial (as the percentage of "avoiders," i.e., participants exhibiting an AR on that trial; **Figure 4**), on some trials the "with-SS" group included numerically fewer female avoiders than the "without-SS" group (**Figure 4B**). However, *post hoc* regression analyses separately for each sex (predictor variables were TPQ subscales and experimental group) identified no variables as significant predictors (all *p* > 0.100).

Then, we assessed ER on the task by analyzing hiding during the 5-s bomb period on the acquisition phase (**Figure 5**). Stepwise linear regression indicated that no predictor variables could significantly predict variance in ER, meaning that there were no significant effects of sex, personality, or experimental group (all *p* > 0.100).

**FIGURE 3 |** [...] AR was made) in males versus females [**(A)** n = 42 and 54, respectively] and in inhibited versus uninhibited participants [**(B)** participants scoring in the upper and lower thirds on HA; n = 37 and 26, respectively]; AR duration could be predicted by a model including sex as the only predicting variable [...] versus uninhibited participants [**(D)** n = 43 and 36, respectively]; AR rate was higher in the inhibited than in the uninhibited participants. However, the relationship between AR rate and HA did not reach statistical significance (all p > 0.200). Error bars indicate SEM.

Lastly, we assessed overall task performance (**Figure 6**). Stepwise linear regression revealed that a model that included sex as the only predictor variable could be used to predict total points [*R* <sup>2</sup> = 0.423, *R* = 0.650, *F*(1,119) = 87.271, *p* < 0.001], shooting [*R* <sup>2</sup> = 0.092, *R* = 0.304, *F*(1,118) = 12.023, *p* = 0.001], and locomotion [*R* <sup>2</sup> = 0.127, *R* = 0.356, *F*(1,118) = 17.113, *p* < 0.001]. Across the entire task session, females earned fewer points (**Figure 6A**), made fewer attempts to shoot (i.e., fewer FIRE key presses; **Figure 6B**), and showed less locomotion (i.e., fewer LEFT and RIGHT key presses; **Figure 6C**) than males. Adding experimental group and personality variables into the models did not account for additional variance (all *p* > 0.100).

### **DISCUSSION**

The purpose of the current study was to examine the effects of an SS on avoidance acquisition and extinction in humans. Participants were tested on a computer-based escape–avoidance task meant to capture several key features of avoidance paradigms commonly used in rodents. Here, participants were divided into two experimental groups that differed in whether an SS was presented during acquisition. Results showed that the presence of an SS during the acquisition phase of the task impaired acquisition (in females) and facilitated extinction of the learned avoidance behavior (**Figure 2**). Results also generally replicated our prior findings with this task (Sheynin et al., 2014a); specifically, females demonstrated longer duration of hiding on trials where an AR was made (AR duration; **Figure 3A**), and participants with inhibited temperament showed a higher AR rate than uninhibited participants (**Figure 3D**), although the latter relationship fell short of statistical significance in the current study. These findings, and limitations of the current study, are discussed further below.

**FIGURE 6 | Total points, total shooting attempts (presses on the FIRE key), and locomotion (presses on the LEFT or RIGHT keys) on the computer-based task in male and female participants with-SS (n = 26 and 34, respectively) and without-SS (n = 29 and 32, respectively).** Due to a computer failure, shooting and locomotion data for one participant were not recorded. For all three performance measures, scores could be predicted by a model that included sex as the only predictor variable (all p ≤ 0.001); **(A)** females earned fewer points, **(B)** made fewer attempts to shoot, and **(C)** showed less locomotion than males. Error bars indicate SEM.

### **SAFETY SIGNALS AND ACQUISITION OF AVOIDANCE**

Prior studies in rodents had demonstrated that the presence of an SS during acquisition facilitated acquisition of the AR (e.g., Bower et al., 1965; Dillow et al., 1972; Hurwitz et al., 1972; Candido et al., 1991). However, in the current study, presence of an SS did not affect (in males) or even impaired (in females) acquisition of avoidance. One possible explanation for this discrepancy is that, in the rat studies, the SS is usually contingent on (and appears immediately after) a successful AR or ER; as such, it may provide positive reinforcement for the behavioral response (Bolles and Grossen, 1969; Dillow et al., 1972). In contrast, the SS in the current study was non-contingent, and appeared at the end of each bomb period, whether or not an AR or ER was made. In fact, when non-contingent SSs are used in rodent studies, facilitated acquisition is not observed (Fernando et al., 2013). An interesting avenue for future work may be to compare the effect of contingent versus non-contingent SSs on avoidance acquisition in humans. Moreover, this work suggests a sex-related differential effect of the SS; it is possible that the SS caused greater relaxation in females (Denny and Weisman, 1964), which generalized to the warning period and resulted in reduced avoidance. Such a differential effect of an SS in males and females is in agreement with the differential utilization of SSs by male and female rodents (Beck et al., 2011), and emphasizes the need to include both sexes in future animal and human research.

### **SAFETY SIGNALS AND EXTINCTION OF AVOIDANCE**

In contrast to the sex-dependent effect of SSs on acquisition in the current task, there was a main effect of SSs on extinction. Specifically, participants for whom the SS was present during the ITI on acquisition trials subsequently extinguished avoidance faster than those with no SS. This finding is consistent with several reports showing that inclusion of SSs in the acquisition phase of free-operant avoidance facilitates extinction (Roberts et al., 1970; Jacobs et al., 1983; Beck et al., 2011; Fernando et al., 2014). One explanation for this effect is that there is a context shift between acquisition, where SS is present during the ITI, and extinction, where it is not (Roberts et al., 1970). Consistent with this explanation, rodent studies that administered the SS during both acquisition and extinction failed to show any effect of SSs on extinction (Dillow et al., 1972; Candido et al., 1991).

### **SEX DIFFERENCES**

Females in the current study showed more avoidance behavior, regardless of the presence of an SS. This is consistent with previous reports, which showed that females acquire avoidance behavior faster than their male counterparts, in both rodents (Beatty and Beatty, 1970; Van Oyen et al., 1981; Beck et al., 2010) and humans (McLean and Hope, 2010; Sheynin et al., 2014a). As in our prior study (Sheynin et al., 2014a), the current study also found a specific association between sex and AR duration, with females hiding longer than males during the warning signal on trials where an AR was demonstrated. The current study further showed that females continued to demonstrate more avoidance behavior even on extinction trials when the aversive events no longer occurred.

An important feature of the current paradigm is the motivational conflict between the option to avoid the aversive event by hiding in one of the safe areas versus the option to gain points by staying in the central area and shooting the enemy spaceship. It is possible that the observed sex differences are the result of distinct sensitivities to these appetitive and aversive components of the task. This idea is consistent with recent work, which suggested that female and male rats process reward and punishment differently on a decision-making task (van den Bos et al., 2012). It is also possible that males in the current study had greater reward sensitivity, which caused them to delay the AR and remain in the open to accrue points, until the last possible moment before the bomb arrived (Li et al., 2007). This idea of higher reward-seeking in males is supported by the fact that males scored more total points than females (**Figure 6A**), and made more attempts to obtain such reward (i.e., increased shooting rate; **Figure 6B**), but did not differ from females on responding during the bomb period, when no reward was available (i.e., ERs; **Figure 5**).

In the current task, increased hiding during the warning period typically means that the participant entered the safe area soon after the onset of the warning signal, and remained there throughout the remainder of the warning period and the subsequent bomb period. Thus, increased duration of hiding during the warning signal did not serve to better avoid the upcoming point loss, but rather prevented the participant from obtaining further reward (points). This pattern is in line with recent findings by van den Bos et al. (2012), who showed that female rats demonstrated a disadvantageous strategy on a decision-making paradigm, which resulted in less reward (fewer sugar pellets) than obtained by male rats. This non-optimal behavior might be related to the pathological avoidance behavior demonstrated by anxious individuals, and might represent a behavioral risk factor that underlies females' vulnerability to develop anxiety disorders (Pigott, 1999).

Interestingly, we also found that females gained fewer points overall and made fewer attempts to gain these points, indexed as number of shooting attempts (**Figures 6A,B**). This could represent a decreased reward sensitivity in females (Li et al., 2007), but it might also be the case that females were simply less experienced or less motivated at playing videogames (Pfister, 2011). While it seems reasonable to believe that most college-age participants had at least some prior exposure to computer games, future studies should specifically address and control for this variable. In the current study, it appears unlikely that male–female differences simply reflected differences in experience with computer games, since AR duration assesses the timing of a learned response, rather than the response itself. In fact, both sexes executed ARs at the same rate (**Figure 3C**). In addition, following a report that showed that exaggerated locomotor activity can mask avoidance differences in rodents (Aguilar et al., 1998), we analyzed locomotion in the current study. However, females' exaggerated avoidance on the current task cannot be simply attributed to increased locomotor activity making them more likely to enter the hiding areas, since females actually showed less locomotor activity than males (**Figure 6C**).

### **INHIBITED TEMPERAMENT DIFFERENCES**

Participants who reported an inhibited temperament in the current study (as assessed by the HA subscale of the TPQ questionnaire) tended to show more ARs than uninhibited participants. This is consistent with our recent report, where AR rate could be reliably predicted by inhibited temperament on a similar computer-based task (Sheynin et al., 2014a). Interestingly, this relationship did not reach statistical significance in the current study. While this could merely represent minor differences across participant samples, it could also be the result of subtle variations in the task design (e.g., presence of a "control-signal" in the earlier study versus the SS in the current study). Future studies should follow up on this finding and further investigate the exact nature of this relationship.

### **IMPLICATIONS FOR THERAPY**

The overall effect of SSs on avoidance behavior in the current study has potential therapeutic relevance. Anxiety disorders, as well as post-traumatic stress disorder, are characterized by impaired extinction learning, reflected in patients' tendency to keep emitting ARs, even when the aversive outcomes no longer occur (Graham and Milad, 2011). An attempt to promote extinction is often made in cognitive–behavioral therapies via exposure techniques, where individuals are exposed to the feared stimulus or outcome in the absence of actual threat (Balooch et al., 2012). The current study, in which extinction was facilitated by the presence of an SS during a prior acquisition phase, suggests that individuals might benefit from the exposure to non-threat cues during or near the time of the traumatic experience. Importantly, given the slower extinction learning exhibited by female participants in the current study, females might benefit more from the use of SSs. Further, future work should test whether a similar positive effect on extinction learning can be obtained if SSs are administered during the extinction phase itself. Indeed, it was argued that therapeutic procedures that improve patients' general sense of safety and security would reduce avoidance in agoraphobic patients (Rachman, 1984; Sartory et al., 1989).

It is also important to note that most of the research on processing of SSs in humans is based on classical fear conditioning rather than avoidance learning, mainly because of the dearth of adequate tools to investigate conditioned avoidance in humans. The current study investigated a purely cognitive form of avoidance learning that involved a point loss in a computer game. While the current and previous studies suggested that such paradigms are sufficient for triggering more avoidance behavior in individuals with anxiety vulnerabilities (Sheynin et al., 2013, 2014a), a direct comparison of cognitive versus fear-evoked avoidance should be a focus of future work. In addition, future work could include screening for drug use, to control for its possible involvement in the reported behavioral differences (Sheynin et al., 2014b). Lastly, future work should consider adapting the current task to further promote the study of behavioral differences in anxious individuals. For instance, manipulating the Pavlovian contingency between the warning signal and the aversive event (e.g., compare probabilistic versus deterministic designs) or the instrumental contingency of the hiding response [e.g., manipulate the frequency of the protective outcome (AR)] might add uncertainty to the task, and thus, better dissociate individual differences (McEvoy and Mahoney, 2012). Moreover, adapting the current task for acquisition across multiple sessions would allow the study of the "warm-up" phenomenon, where the subject starts a training session at a lower performance level than what was performed at the end of the previous training session. Since a lack of warm-up is exhibited by the inhibited WKY rat strain (Servatius et al., 2008), we hypothesize that inhibited human subjects might show a similar impairment.

### **ACTIVE VERSUS PASSIVE AVOIDANCE**

It is important to discuss the type of avoidance behavior that is addressed by the current computer-based task. In the current task, the hiding response protects the participant from the aversive event (AR); participants who enter the safe area soon after onset of the warning signal typically have longer AR duration than those who remain in the central area until right before the bomb appears. AR duration is therefore roughly comparable to response latency in rodent active avoidance tasks, where a rat can emit a lever-press or other response (AR) after onset of the warning signal but before arrival of the aversive shock. However, although AR in the current task is clearly an active behavioral strategy that requires the initial move of the participant's spaceship from the central to a safe area, it also includes an important passive property. By definition, both AR rate and AR duration require a persistent hiding state, where the initial active response (entering the safe area) is followed by a passive response (staying in the safe area through the rest of the warning period and through the entire bomb period to completely avoid any explosion and any point loss). Thus, in this task participants can learn a unique avoidance behavior that includes both active and passive properties, both of which have been demonstrated to be abnormal in rodents with increased anxiety levels (Dubrovina and Tomilenko, 2007; Beck et al., 2010).

This idea of a mixed avoidance pattern has been investigated in humans; for example, adolescent running away behavior may reflect passive avoidance in males but both passive and active avoidance in females (De Man et al., 1994). Clinically, agoraphobia may be associated with strong passive avoidance but weak active avoidance (Zinbarg et al., 1992). These studies suggest that inhibited temperament and female sex might be differentially associated with active and passive avoidance. In future, the current computer-based task could be adapted to specifically target these types of behavior, and potentially, analyze them separately within each tested individual. Specifically, active avoidance could be assessed by a single key press that would terminate/prevent the aversive event (parallel to the rat lever-press response), whereas passive avoidance would be the requirement to withdraw from the shooting response (Arcediano et al., 1996).

### **CONCLUSION**

This is the first study to examine how non-contingent SSs affect acquisition and extinction of escape–avoidance behavior in male and female humans. In this study, we found that administering such signals during the acquisition phase specifically attenuated avoidance behavior, without affecting other behavioral measures such as acquisition of ERs or overall performance on the computer-based task. As the participants in the current study were healthy young adults, our findings shed light on specific vulnerability factors that confer risk to develop anxiety disorders in future, and also suggest how a better understanding of SSs may promote therapeutic approaches in individuals who develop pathological avoidance.

### **AUTHOR CONTRIBUTIONS**

All authors participated in the design of the project. Jony Sheynin conducted the data analysis and prepared the initial draft of the manuscript. All authors contributed to revising the manuscript, approved the final version, and agreed to be accountable for all aspects of this work.

### **ACKNOWLEDGMENTS**

For assistance with data collection, authors wish to thank Saima Shikari, Jacqueline Ostovich, Barbara Ekeh, and Yasheca Ebanks-Williams. This work was supported by Award Number I01CX000771 from the Clinical Science Research and Development Service of the VA Office of Research and Development, by the NSF/NIH Collaborative Research in Computational Neuroscience (CRCNS) Program, by NIAAA (R01 AA018737), and by additional support from the SMBI. The views in this paper are those of the authors and do not represent the official views of the Department of Veterans Affairs or the U.S. government.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 May 2014; accepted: 29 August 2014; published online: 15 September 2014. Citation: Sheynin J, Beck KD, Servatius RJ and Myers CE (2014) Acquisition and extinction of human avoidance behavior: attenuating effect of safety signals and associations with anxiety vulnerabilities. Front. Behav. Neurosci. 8:323. doi: 10.3389/fnbeh.2014.00323*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Sheynin, Beck, Servatius and Myers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The influence of trial order on learning from reward vs. punishment in a probabilistic categorization task: experimental and computational analyses

Ahmed A. Moustafa<sup>1,2</sup>\*, Mark A. Gluck<sup>3</sup>, Mohammad M. Herzallah<sup>3,4</sup>\* and Catherine E. Myers<sup>2,5,6</sup>

### Edited by:

Lars Schwabe, University of Hamburg, Germany

### Reviewed by:

Hanneke E. Den Ouden, Radboud University Nijmegen, Netherlands Vivian V. Valentin, University of California, Santa Barbara, USA

#### \*Correspondence:

Ahmed A. Moustafa, School of Social Sciences and Psychology and Marcs Institute for Brain and Behaviour, University of Western Sydney, Locked bag 1797, Penrith 2751, Sydney, NSW, Australia a.moustafa@uws.edu.au; Mohammad M. Herzallah, Al-Quds Cognitive Neuroscience Lab, Palestinian Neuroscience Initiative, Faculty of Medicine, Al-Quds University, University Avenue, Abu Dis, Jerusalem 20002, Palestine mohammad.m.herzallah@gmail.com

> Received: 05 March 2015 Accepted: 26 May 2015 Published: 24 July 2015

#### Citation:

Moustafa AA, Gluck MA, Herzallah MM and Myers CE (2015) The influence of trial order on learning from reward vs. punishment in a probabilistic categorization task: experimental and computational analyses. Front. Behav. Neurosci. 9:153. doi: 10.3389/fnbeh.2015.00153

<sup>1</sup> School of Social Sciences and Psychology and Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia, <sup>2</sup> Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA, <sup>3</sup> Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ, USA, <sup>4</sup> Al-Quds Cognitive Neuroscience Lab, Palestinian Neuroscience Initiative, Faculty of Medicine, Al-Quds University, Jerusalem, Palestine, <sup>5</sup> Department of Pharmacology, Physiology and Neuroscience, Rutgers-New Jersey Medical School, Newark, NJ, USA, <sup>6</sup> Department of Psychology, Rutgers University-Newark, Newark, NJ, USA

Previous research has shown that trial ordering affects cognitive performance, but this has not been tested using category-learning tasks that differentiate learning from reward and punishment. Here, we tested two groups of healthy young adults using a probabilistic category learning task of reward and punishment in which there are two types of trials (reward, punishment) and three possible outcomes: (1) positive feedback for correct responses in reward trials; (2) negative feedback for incorrect responses in punishment trials; and (3) no feedback for incorrect answers in reward trials and correct answers in punishment trials. Hence, trials without feedback are ambiguous, and may represent either successful avoidance of punishment or failure to obtain reward. In Experiment 1, the first group of subjects received an intermixed task in which reward and punishment trials were presented in the same block, as a standard baseline task. In Experiment 2, a second group completed the separated task, in which reward and punishment trials were presented in separate blocks. Additionally, in order to understand the mechanisms underlying performance in the experimental conditions, we fit individual data using a Q-learning model. Results from Experiment 1 show that subjects who completed the intermixed task paradoxically valued the no-feedback outcome as a reinforcer when it occurred on reinforcement-based trials, and as a punisher when it occurred on punishment-based trials. This is supported by patterns of empirical responding, where subjects showed more win-stay behavior following an explicit reward than following an omission of punishment, and more lose-shift behavior following an explicit punisher than following an omission of reward. In Experiment 2, results showed similar performance whether subjects received reward-based or punishment-based trials first. 
However, when the Q-learning model was applied to these data, there were differences between subjects in the reward-first and punishment-first conditions on the relative weighting of neutral feedback. Specifically, early training on reward-based trials led to omission of reward being treated as similar to punishment, but prior training on punishment-based trials led to omission of reward being treated more neutrally. This suggests that early training on one type of trials, specifically reward-based trials, can create a bias in how neutral feedback is processed, relative to those receiving early punishment-based training or training that mixes positive and negative outcomes.

Keywords: category learning, reward, punishment, Q-learning computational model, intermixed trials

### Introduction

Prior research has shown that the arrangement of task trials affects learning. For example, acquisition of fear conditioning in humans depends on the ordering of presentation of fear (CS+) and safety (CS−) trials (Esteves et al., 1994; Ohman and Soares, 1998; Katkin et al., 2001; Morris et al., 2001; Wiens et al., 2003). In a study by Wiens et al. (2003), one group of subjects received CS+ and CS− trials in a random order (differential group) and another group received CS+ and CS− trials in a nonrandom restricted manner. Skin conductance responses to CS+ and CS− were not significantly different in the random condition, but skin conductance responses to CS+ were significantly larger than those to CS− in the restricted condition. Similarly, although the forward and backward blocking paradigms have the same trials, albeit arranged differently, research has shown that the backward blocking effect is weaker than the forward blocking effect (Chapman, 1991; Lovibond et al., 2003). Other studies show that trial order also affects motor learning by observation (Brown et al., 2010). Using a concurrent discrimination task, we have previously found that training subjects to discriminate among a series of pairs of stimuli simultaneously (concurrent condition) takes more trials than learning to discriminate among each pair individually (shaping condition; Shohamy et al., 2006). In short, a number of studies suggest that trial order might impact cognitive performance.

However, the mechanisms underlying these effects remain a matter of debate. Chapman (1991) argues that associative learning models (e.g., Rescorla-Wagner's 1972 model; Rescorla and Wagner, 1972) and statistical models (e.g., multiple linear regression) cannot account for trial order effects.

Computational models of decision-making are increasingly being used to interpret behavioral results and help understand underlying information-processing mechanisms that could produce individual patterns of behavior (Frank et al., 2007; Dickerson et al., 2011). One class of models, reinforcement learning (RL) models, assumes that trial-and-error learning results in the learner coming to choose actions that are expected to maximize reward and/or minimize punishment. Prediction error (PE), the difference between expected and experienced outcomes, is used to update the learner's expectations and guide action selection. PE is positive when there is unexpected reward (or an expected punisher fails to occur) and negative when there is unexpected punishment (or an expected reward fails to occur). Learning can be affected by a number of free parameters in RL models, such as LR+, the learning rate when PE > 0, LR−, the learning rate when PE < 0, and β, an explore/exploit parameter which governs the tendency to repeat previously-successful responses or explore new ones. For each individual subject, values of the free parameters that led the model to display behavior that best mimicked that individual's observed behavior are identified; differences in the obtained parameters suggest mechanisms underlying different performance as a result of task condition. Previous research has used similar computational models to fit model parameter values for each subject to genetic (Frank et al., 2007), brain imaging (O'Doherty et al., 2003; Dickerson et al., 2011) and patient data (Moustafa et al., 2008; Myers et al., 2013).

In this study, we test the effect of trial ordering on a probabilistic categorization task that involves both reward and punishment-based category learning (Bódi et al., 2009). This task has the feature that reward-based trials, which can result in either reward or no-feedback outcomes, are intermixed with punishment-based trials, which can result in either punishment or no-feedback outcomes; thus, the no-feedback outcome is ambiguous as it can signal either missed reward (similar to a punishment) or missed punishment (similar to a reward). Prior studies with this task have documented differential learning from reward and punishment in patient populations including medicated and unmedicated patients with Parkinson's disease (Bódi et al., 2009), major depressive disorder (Herzallah et al., 2013), schizophrenia (Somlai et al., 2011), and symptoms of post-traumatic stress disorder (Myers et al., 2013), as well as individual differences in learning as a function of genetic haplotypes (Kéri et al., 2008) and of personality traits such as novelty seeking (Bódi et al., 2009) and behavioral inhibition (Sheynin et al., 2013). However, the effects of trial order on this task have not heretofore been considered.

Here, in Experiment 1, we started by considering the ''standard'' task in which reward-based and punishment-based trials are intermixed in each training block. Then, we fit subjects' behavioral data with an RL model (Watkins and Dayan, 1992; Sutton and Barto, 1998) to investigate mechanisms underlying subjects' performance. Based on prior computational modeling of this task (Myers et al., 2013), we expected that subjective valuation of the ambiguous no-feedback outcome might vary considerably across subjects. In Experiment 2, we considered a ''separated'' version of the task, in which subjects are administered reward-based and punishment-based trials in different blocks, and the same model was applied to see how different trial order might affect these mechanisms. We hypothesized that both learning and valuation of the ambiguous no-feedback outcome might differ, depending on whether reward-based or punishment-based training occurred first.

### Methods

### Experiment 1

### Participants

Experiments 1 and 2 were run concurrently, with participants randomly but evenly assigned to one experimental group. For experiment 1, participants included 36 healthy young adults (college undergraduates, mean age 20.0 years, SD 1.4; 66.7% female). For their participation, subjects received research credit in a psychology class. Procedures conformed to ethical standards laid down in the Declaration of Helsinki for the protection of human subjects. All participants signed statements of informed consent prior to inclusion in the study.

### Behavioral Task

The task was as previously described (Bódi et al., 2009; Myers et al., 2013) and was conducted on a Macintosh computer, programmed in the SuperCard language (Allegiant Technologies, San Diego, CA, USA). The participant was seated in a quiet testing room at a comfortable viewing distance from the computer. The keyboard was masked except for two keys, labeled ''A'' and ''B'' which the participant used to enter responses. A running tally at the bottom of the screen showed the current points accumulated; this tally was initialized to 500 points at the start of the experiment.

On each trial, participants viewed one of four images, and guessed whether it belonged to category A or category B (**Figure 1A**). For each participant, the four images were randomly assigned to be stimuli S1, S2, S3, and S4. On any given trial, stimuli S1 and S3 belonged to category A with 80% probability and to category B with 20% probability, while stimuli S2 and S4 belonged to category B with 80% probability and to category A with 20% probability. Stimuli S1 and S2 were used in the reward-learning task. Thus, if the participant correctly guessed category membership on a trial with either of these stimuli, a reward of +25 points was received (**Figure 1B**); if the participant guessed incorrectly, no feedback appeared (**Figure 1C**). Stimuli S3 and S4 were used in the punishment-learning task. Thus, if the participant guessed incorrectly on a trial with either of these stimuli, a punishment of –25 was received (**Figure 1D**); correct guesses received no feedback. Thus, the no-feedback outcome, when it arrived, was ambiguous, as it could signal lack of reward for an incorrect response (if received during a trial with S1 or S2) or lack of punishment for a correct response (if received during a trial with S3 or S4). Participants were not informed which stimuli were to be associated with reward vs. punishment.
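The feedback contingencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original SuperCard program: the names `OPTIMAL`, `REWARD_STIMULI`, `true_category`, and `feedback` are hypothetical, and the only facts taken from the text are the 80/20 category probabilities and the +25 / −25 / no-feedback outcomes.

```python
import random

# Most likely (80%) category for each stimulus, as described in the text.
OPTIMAL = {"S1": "A", "S2": "B", "S3": "A", "S4": "B"}
# S1 and S2 are reward-based; S3 and S4 are punishment-based.
REWARD_STIMULI = {"S1", "S2"}


def true_category(stimulus, rng):
    """Draw this trial's category: the optimal one with probability 0.8."""
    optimal = OPTIMAL[stimulus]
    other = "B" if optimal == "A" else "A"
    return optimal if rng.random() < 0.8 else other


def feedback(stimulus, response, category):
    """Return the change in points: +25, -25, or 0 (no feedback shown)."""
    correct = response == category
    if stimulus in REWARD_STIMULI:
        return 25 if correct else 0   # incorrect guess: no feedback
    return 0 if correct else -25      # correct guess: no feedback


# Example: one simulated punishment-based trial with a seeded generator.
rng = random.Random(0)
category = true_category("S3", rng)
points = feedback("S3", "A", category)
```

Note that a return value of 0 is the ambiguous outcome: it follows an incorrect guess on S1 or S2 but a correct guess on S3 or S4.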

At the start of the experiment, the participant saw the following instructions: in this experiment, you will be shown pictures, and you will guess whether those pictures belong to category ''A'' or category ''B''. A picture doesn't always belong to the same category each time you see it. If you guess correctly, you may win points. If you guess wrong, you may lose points. You'll see a running total of your points as you play. (We'll start you off with a few points now.)

FIGURE 1 | The reward- and punishment-learning task (Bódi et al., 2009). (A) On each trial, a stimulus appears and the subject guesses whether this stimulus belongs to category "A" or category "B." For two stimuli, correct responses are rewarded (B) but incorrect responses receive no feedback (C); and for the other two stimuli, incorrect responses are punished (D) but correct responses receive no feedback. In Experiment 1, reward-based and punishment-based trials were interleaved, as in the original Bódi et al. (2009) study; in Experiment 2, reward-based and punishment-based trials were presented in separate blocks.

The task included a short practice phase, which showed the participant an example of correct and incorrect responses to sample punishment-based and reward-based trials. These practice trials used images other than S1–S4. The practice phase was followed by 160 training trials, divided into four blocks of 40 trials, with each stimulus appearing 10 times per block. Trials were separated by a 2 s interval, during which the screen was blank. At the end of the experiment, if the subject's total had fallen below the starting tally of 500 points, additional trials with S1 and S2 were added until the tally reached 525 points; these extra trials were not included in the data analysis.

The probabilistic nature of the task meant that an optimal response across trials (i.e., ''A'' for S1 and S3; ''B'' for S2 and S4) might not be correct on a particular trial. Therefore, on each trial, the computer recorded reaction time (in ms) and whether the participant's response was optimal, regardless of actual outcome (points gained or lost). In addition, for each stimulus, we recorded number of win-stay responses (defined as trials on which the subject repeated a response that had received reward or non-punishment on the prior trial with that stimulus) and number of lose-shift responses (defined as trials on which the subject did not repeat a response that had received punishment or non-reward on the prior trial with that stimulus).
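The win-stay and lose-shift definitions above can be made concrete with a short Python sketch. It assumes a hypothetical trial record of `(stimulus, response, points)` tuples in presentation order; the function name and data layout are illustrative and are not taken from the original analysis scripts. A "win" is taken to be reward or non-punishment, and a "loss" to be punishment or non-reward, as in the text.

```python
def count_win_stay_lose_shift(trials, reward_stimuli=("S1", "S2")):
    """Count win-stay and lose-shift responses separately for each stimulus.

    `trials` is a list of (stimulus, response, points) tuples, where points
    is +25, -25, or 0 (no feedback).
    """
    last = {}    # stimulus -> (previous response, whether it was a "win")
    counts = {}  # stimulus -> {"win_stay": int, "lose_shift": int}
    for stimulus, response, points in trials:
        counts.setdefault(stimulus, {"win_stay": 0, "lose_shift": 0})
        if stimulus in last:
            prev_response, prev_win = last[stimulus]
            if prev_win and response == prev_response:
                counts[stimulus]["win_stay"] += 1
            elif not prev_win and response != prev_response:
                counts[stimulus]["lose_shift"] += 1
        # Reward-based trials: win = reward, loss = no feedback.
        # Punishment-based trials: win = no feedback, loss = punishment.
        if stimulus in reward_stimuli:
            was_win = points > 0
        else:
            was_win = points == 0
        last[stimulus] = (response, was_win)
    return counts


example = [
    ("S1", "A", 25), ("S1", "A", 0), ("S1", "B", 25),
    ("S3", "A", -25), ("S3", "B", 0), ("S3", "B", 0),
]
result = count_win_stay_lose_shift(example)
```

In this toy sequence each stimulus contributes one win-stay and one lose-shift response.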

Mixed ANOVA with within-subject factors of block and trial type (reward vs. punishment) and between-subjects factor of gender were used to analyze the data; for analyses of reaction time, response type (optimal vs. non-optimal response) was also included as a between-subjects factor. Levene's test was used to confirm assumptions of homogeneity of variance. Where Mauchly's test indicated violations of assumptions of sphericity, Greenhouse-Geisser correction was used to adjust degrees of freedom for computing p-values from F-values. The threshold for significance was set at α = 0.05 with Bonferroni correction used to protect significance values under multiple comparisons (e.g., post hoc testing).

### Computational Model

Here, we modeled the observed behavioral results using a Q-learning model (Watkins and Dayan, 1992; Sutton and Barto, 1998; Frank et al., 2007) which uses the difference between expected and experienced outcomes to calculate PE which is then used to update predictions and guide action selection. The variant used here is the gain-loss model, which allows separate learning rates when PE is positive-valued or negative-valued (Frank et al., 2007).

Specifically, given stimulus s, each possible action r (here, choice to categorize the stimulus as ''A'' or ''B'') has a value Qr,s(t) at trial t. All Q-values are initialized to 0 at the start of a simulation run (t = 0). The Q-values are used to determine the probability of choosing each response via a softmax logistic function:

$$\Pr(r = \text{"A"}) = \frac{e^{Q_{A,s}(t)/\beta}}{e^{Q_{A,s}(t)/\beta} + e^{Q_{B,s}(t)/\beta}} \tag{1}$$

$$\Pr(r = \text{"B"}) = 1 - \Pr(r = \text{"A"}) \tag{2}$$

As described above, β reflects the participant's tendency to either exploit (i.e., to choose the category with the currently highest Q value) or explore (i.e., to randomly choose a category).
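As an illustration of the softmax rule in Equations 1 and 2, the following Python sketch computes the two choice probabilities from a pair of Q-values. The function name is hypothetical, and the sketch assumes β > 0, since the expression is undefined at β = 0.

```python
import math


def choice_probabilities(q_a, q_b, beta):
    """Return (Pr(r = "A"), Pr(r = "B")) for Q-values q_a, q_b and beta > 0."""
    # Subtracting the larger Q-value before exponentiating avoids overflow
    # and leaves the ratio in Equation 1 unchanged.
    m = max(q_a, q_b)
    e_a = math.exp((q_a - m) / beta)
    e_b = math.exp((q_b - m) / beta)
    p_a = e_a / (e_a + e_b)
    return p_a, 1.0 - p_a


# Equal Q-values give chance responding regardless of beta.
p_equal = choice_probabilities(0.0, 0.0, 0.5)
# A smaller beta makes the choice more deterministic (more exploitation).
p_explore = choice_probabilities(0.5, 0.0, 1.0)[0]
p_exploit = choice_probabilities(0.5, 0.0, 0.1)[0]
```

With the same Q-values, `p_exploit` is closer to 1 than `p_explore`, which is the sense in which β governs the explore/exploit balance.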

PE is then computed for that trial based on the subject's actual response r* and the feedback R(t) that the subject received on that trial:

$$PE(t) = R(t) - Q_{r^*,s}(t) \tag{3}$$

Here, R(t) is +1 for reward feedback, −1 for punishing feedback, and R0 for the no-feedback outcome. In the prior (Myers et al., 2013) paper, R0 was a free parameter that could vary between −1 (similar to punishment) and +1 (similar to reward).

Finally, the Q-value for the selected stimulus-response pair was updated based on PE:

$$Q_{r^*,s}(t+1) = Q_{r^*,s}(t) + \alpha \ast PE(t) \tag{4}$$

Here, α is the learning rate, set to LR+ if PE > 0 and to LR− if PE < 0.
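The prediction-error computation and the gain-loss update (Equations 3 and 4) can be sketched as follows. The dictionary-based Q-table and the function name are assumptions made for illustration; when PE is exactly 0 the choice of learning rate does not matter, because the update is zero either way.

```python
def update_q(q, stimulus, response, outcome, lr_pos, lr_neg):
    """Update the Q-value of the chosen response in place; return the PE.

    `q` maps (stimulus, response) pairs to Q-values; `outcome` is R(t).
    """
    pe = outcome - q[(stimulus, response)]      # Equation 3
    rate = lr_pos if pe > 0 else lr_neg         # LR+ for gains, LR- for losses
    q[(stimulus, response)] += rate * pe        # Equation 4
    return pe


q = {("S1", "A"): 0.0, ("S1", "B"): 0.0}
pe_first = update_q(q, "S1", "A", 1.0, lr_pos=0.5, lr_neg=0.1)    # reward
pe_second = update_q(q, "S1", "A", -1.0, lr_pos=0.5, lr_neg=0.1)  # punishment
```

Only the Q-value of the response actually made is changed; the unchosen response keeps its previous value.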

First, we considered the model previously applied to data from this task in Myers et al. (2013); this model included four free parameters: LR+, LR−, β, and R0. We also considered a five-parameter model in which the value of R0 could be different on reward-based (R0rew) than on punishment-based (R0pun) trials, allowing for the possibility that subjects might value the no-feedback outcome differently on these two types of trial. Theoretically optimal performance would be obtained if R0rew approached −1 (similar to punishment, and maximally different from R+) while R0pun approached +1 (similar to reward, and maximally different from R−). Simulations (not shown) confirmed that this pattern indeed obtained when the model was run on hypothetical subject data in which optimal responses were always, or nearly always, executed.

Finally, we considered an alternate four-parameter model in which R0rew was a free parameter and R0pun was constrained so that R0rew = −1 ∗ R0pun, i.e., the value of the no-feedback outcome is equal in magnitude but opposite in valence for the two trial types.<sup>1</sup>
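The three ways of treating the no-feedback value can be summarized in a small helper. This is a sketch only: the function name and the labels `"R0"`, `"five"`, and `"R0inv"` are illustrative shorthand for the standard four-parameter model, the five-parameter model, and the alternate four-parameter model, respectively.

```python
def no_feedback_values(model, r0=None, r0_rew=None, r0_pun=None):
    """Return (R0rew, R0pun) implied by each model's constraint."""
    if model == "R0":
        # Single shared value for both trial types.
        return r0, r0
    if model == "five":
        # Separate free values for reward-based and punishment-based trials.
        return r0_rew, r0_pun
    if model == "R0inv":
        # Equal magnitude, opposite valence: R0pun = -1 * R0rew.
        return r0_rew, -r0_rew
    raise ValueError(f"unknown model label: {model}")


shared = no_feedback_values("R0", r0=0.2)
separate = no_feedback_values("five", r0_rew=-1.0, r0_pun=1.0)
inverse = no_feedback_values("R0inv", r0_rew=0.4)
```

The `separate` example uses the theoretically optimal pattern described above (R0rew near −1, R0pun near +1), which only the five-parameter and inverse models can express.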

For each of the three models under consideration, values of the free parameters were estimated for each participant, based on that participant's trial-by-trial choices and the feedback received. To do this, we searched through parameter space, allowing LR+, LR− and β to vary from 0 to 1 in steps of 0.05 and R0 to vary from −1 to +1 in steps of 0.1, to find the configuration of parameter values that minimized the negative log likelihood estimate (negLLE) across n trials:

$$\text{negLLE} = -\sum_{t=1}^{n} \log \Pr(r = r^*) \tag{5}$$

In plotting results, for clarity of interpretation, this value is transformed into a probability value, p(choice) = exp(-negLLE/n), where p(choice) = 0.5 means chance and p(choice) = 1 means perfect replication of subject data.
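The fitting procedure can be sketched in Python as below, for the five-parameter model. This is a minimal illustration rather than the code used in the study. The function names and the `(stimulus, response, feedback)` trial layout are assumptions, feedback is coded as +1, −1, or 0 for no feedback, and β = 0 must be excluded from any grid because the softmax is undefined there. The demonstration uses a deliberately coarse grid; the full grid described in the text is much larger.

```python
import itertools
import math


def neg_lle(trials, lr_pos, lr_neg, beta, r0_rew, r0_pun,
            reward_stimuli=("S1", "S2")):
    """Negative log likelihood of the observed choices (Equation 5)."""
    q = {}
    total = 0.0
    for stimulus, response, fb in trials:
        q_a = q.get((stimulus, "A"), 0.0)   # Q-values start at 0
        q_b = q.get((stimulus, "B"), 0.0)
        m = max(q_a, q_b)                   # shift for numerical stability
        e_a = math.exp((q_a - m) / beta)
        e_b = math.exp((q_b - m) / beta)
        p_a = e_a / (e_a + e_b)
        p = p_a if response == "A" else 1.0 - p_a
        total -= math.log(max(p, 1e-12))    # guard against log(0)
        # The no-feedback outcome takes its subjective value R0rew or R0pun.
        if fb == 0:
            r = r0_rew if stimulus in reward_stimuli else r0_pun
        else:
            r = float(fb)
        q_chosen = q.get((stimulus, response), 0.0)
        pe = r - q_chosen
        rate = lr_pos if pe > 0 else lr_neg
        q[(stimulus, response)] = q_chosen + rate * pe
    return total


def grid_search(trials, rate_grid, beta_grid, r0_grid):
    """Return (lowest negLLE, parameters) over an exhaustive grid."""
    best = None
    grid = itertools.product(rate_grid, rate_grid, beta_grid, r0_grid, r0_grid)
    for params in grid:
        score = neg_lle(trials, *params)
        if best is None or score < best[0]:
            best = (score, params)
    return best


def p_choice(neg_lle_value, n_trials):
    """Rescale negLLE to a per-trial probability: exp(-negLLE / n)."""
    return math.exp(-neg_lle_value / n_trials)


# Toy data: a subject who always answers "A" to S1 and is always rewarded.
toy = [("S1", "A", 1)] * 20
score, params = grid_search(toy, [0.1, 0.5, 0.9], [0.1, 0.5, 1.0], [-1, 0, 1])
fit = p_choice(score, len(toy))
```

When several parameter settings tie, as happens here for LR− and the R0 values that the toy data never exercise, this sketch keeps the first one encountered; the text describes a different tie-breaking choice for R0rew in the actual analysis.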

To compare the three models, we used the Akaike Information Criterion (AIC; Akaike, 1974), which compares goodness-of-fit in terms of minimal negLLE while penalizing models that have more free parameters: AIC = 2 ∗ negLLE + 2 ∗ k, where k is the number of free parameters. We also used the Bayesian Information Criterion (BIC; Schwartz, 1978) which additionally considers the number of observations:

$$BIC = 2 \ast \text{negLLE} + k \ast \ln(x) \tag{6}$$

where x is the number of trials. Note that BIC assumes that one of the models being compared is the correct model, which is an assumption that is not necessarily provable for this type of dataset, while AIC only assesses which of the models is most efficient at describing the data while not necessarily assuming any are probably correct.
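Both criteria are simple functions of negLLE, as the following Python sketch shows. It assumes the standard forms AIC = 2·negLLE + 2·k and BIC = 2·negLLE + k·ln(x); the function names and the example negLLE values are hypothetical and are not results from the study.

```python
import math


def aic(neg_lle, k):
    """Akaike Information Criterion for a model with k free parameters."""
    return 2.0 * neg_lle + 2.0 * k


def bic(neg_lle, k, n_trials):
    """Bayesian Information Criterion, with x taken as the number of trials."""
    return 2.0 * neg_lle + k * math.log(n_trials)


# Hypothetical fits for one subject over 160 trials.
aic_four = aic(92.0, k=4)
aic_five = aic(90.0, k=5)
bic_four = bic(92.0, k=4, n_trials=160)
bic_five = bic(90.0, k=5, n_trials=160)
```

With 160 trials, each extra parameter costs ln(160) ≈ 5.08 under BIC but only 2 under AIC. In this made-up example the five-parameter fit therefore has the lower AIC, while the four-parameter fit has the lower BIC.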

In addition to evaluating the three models described above, we also considered several additional variants: a three-parameter model where R0 was held constant at 0 (leaving LR+, LR−, and β free to vary), a two-parameter model where LR+ and LR− are constrained to be the same value, as in a standard Q-learning model (leaving only a single LR and β free to vary), and models where R0 (singly, or separately for R0rew and R0pun) were free to vary but the other parameters were fixed using mean values derived from the five-parameter model; none of these other variants performed as well as the four- and five-parameter models, and for conciseness results with these variants are not described further here.

<sup>1</sup>We thank an anonymous reviewer of this article for the suggestion.

Further, to compare models we used the random effects Bayesian model selection procedure described in Stephan et al. (2009) and Penny et al. (2010), which takes into account the possibility that different models may have generated different subjects' data. Based on prior studies, we consider one model the winning model when protected exceedance probability for that model is larger than 0.9.

### Results

### Behavioral Task

**Figure 2A** shows performance on reward-based and punishment-based trials, over the four blocks of the experiment. There was significant learning, indicated by a within-subjects effect of block (F(2.47,84.12) = 6.22, p = 0.002) with no effect of trial type or sex and no interactions (all p > 0.200), indicating that average learning accuracy was similar across reward-based and punishment-based trials. **Figure 2B** shows reaction times (RT) for optimal and non-optimal responses on each trial type, over the course of the experiment. While all subjects made at least one optimal response to each trial type in each block, eight subjects did not make any non-optimal responses to at least one trial type in at least one block, meaning average RT could not be computed. Rather than dropping these eight subjects from analysis, we did separate rmANOVA of RT on optimal responses (calculated for all 36 subjects) and on non-optimal responses (for those 28 subjects who made at least one non-optimal response on each trial type in each block); Bonferroni correction was used to adjust alpha to 0.025 to protect significance levels under multiple tests. For optimal responses, there was a significant decrease in RT over blocks (F(1.75,59.45) = 21.92, p < 0.001) as well as a main effect of trial type, with RT slower on punishment-based than reward-based trials (F(1,34) = 26.61, p < 0.001). For non-optimal responses, the same pattern was observed: a significant decrease in RT over blocks (F(1.85,48.02) = 34.97, p < 0.001) and significantly slower responding on punishment-based than reward-based trials (F(1,26) = 8.48, p = 0.007). However, the interaction between block and trial type and all effects and interactions involving gender did not reach corrected significance.

However, **Figure 3A** shows that there was considerable individual variability in performance on reward-based and punishment-based trials, with many subjects performing considerably better on one type of trial than another. Following Sheynin et al. (2013), we considered a ''bias'' measurement, defined as the difference between a subject's performance on reward-based trials and on punishment-based trials in the final training block; thus, a negative bias indicates better performance on punishment-based trials, a positive bias indicates better performance on reward-based trials, and a bias of 0 indicates equally good performance on both types of trial. **Figure 3B** shows that, although bias was near 0 when averaged across subjects, many individual subjects showed a bias for either reward- or punishment-based trials that persisted through block 4 of the experiment.
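The bias measure can be computed from final-block data as in the sketch below. The function name and the `(stimulus, was_optimal)` record layout are hypothetical, and expressing performance as percent optimal responses is an assumption about units made for illustration.

```python
def final_block_bias(block_trials, reward_stimuli=("S1", "S2")):
    """Percent optimal on reward-based trials minus percent optimal on
    punishment-based trials, for one subject's final-block trials.

    `block_trials` is a list of (stimulus, was_optimal) pairs.
    """
    reward = [opt for stim, opt in block_trials if stim in reward_stimuli]
    punish = [opt for stim, opt in block_trials if stim not in reward_stimuli]
    pct_reward = 100.0 * sum(reward) / len(reward)
    pct_punish = 100.0 * sum(punish) / len(punish)
    return pct_reward - pct_punish


# Hypothetical block 4: 18/20 optimal on reward trials, 12/20 on punishment.
block4 = (
    [("S1", True)] * 8 + [("S1", False)] * 2 + [("S2", True)] * 10
    + [("S3", True)] * 5 + [("S3", False)] * 5
    + [("S4", True)] * 7 + [("S4", False)] * 3
)
bias = final_block_bias(block4)
```

A positive value, as in this made-up example, corresponds to better performance on reward-based trials.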

Finally, we examined win-stay and lose-shift behavior. It would be expected that subjects would generally show win-stay after an explicit reward, and generally show lose-shift after an explicit punishment (although, due to the probabilistic nature of the task, not every punishment should trigger abandonment of a response rule). If the no-feedback outcome were treated as similar to a punisher on reward-based trials, then it should also trigger lose-shift; conversely, if the no-feedback outcome were treated as similar to a reward on punishment-based trials, then it should also trigger win-stay. However, **Figure 4** shows that, on average, subjects exhibited more win-stay responses on reward-based than punishment-based trials, and more lose-shift responses on punishment-based than reward-based trials. Mixed ANOVA confirmed these impressions: there was a main effect of response, with subjects producing more win-stay than lose-shift responses overall (F(1,34) = 43.93, p < 0.001), as well as a main effect of trial type (F(1,34) = 18.73, p < 0.001), and an interaction (F(1,34) = 101.96, p < 0.001). Post hoc pairwise t-tests to examine the interaction, with alpha adjusted to 0.025, revealed significantly more win-stay behavior on reward-based than punishment-based trials (t(35) = 8.22, p < 0.001) but significantly more lose-shift behavior on punishment-based than reward-based trials (t(35) = 7.60, p < 0.001). The omnibus ANOVA also revealed a main effect of sex, with males generally exhibiting more win-stay and lose-shift behaviors than females (F(1,34) = 4.83, p = 0.035), and a three-way interaction between response type, trial type, and gender (F(1,34) = 5.36, p = 0.027); however, none of the specific comparisons in the interaction reached significance on post hoc testing.

FIGURE 4 | Win-stay and lose-shift behavior. Lose-shift occurs when subjects make a response to a stimulus and receive punishment (or non-reward) and then make a different response on the next trial with that stimulus. Subjects exhibited more lose-shift responses following an explicit punishment (R−) than following a non-reward (R0rew).

### Computational Model


Prior work with this task in a different population (veterans with and without severe PTSD symptoms) led us to note that an important source of individual differences might be variability in how people assigned reinforcement value to the ambiguous no-feedback outcome (Myers et al., 2013). The individual differences observed in this experiment, together with the finding that win-stay occurred significantly more often following a reward than a no-punishment outcome, while lose-shift occurred significantly more often following a punishment than a no-reward outcome, led us to consider whether individual differences in valuation of the ambiguous outcome might similarly underlie the behavior observed in the current study.

Following the earlier Myers et al. (2013) paper, we considered an RL model with four free parameters, the learning rates LR+ and LR−, the ''temperature'' parameter β, and the reinforcement value of the no-feedback outcome R0. We also considered a more elaborate five-parameter model, where the no-feedback outcome could be valued differently when it occurred on a reward-learning trial (R0rew, signaling failure to obtain reward) and when it occurred on a punishment-learning trial (R0pun, signaling successful avoidance of punishment). We also considered a second four-parameter model where R0rew was free to vary while R0pun was set at −1 ∗ R0rew.

All models generated unique combinations of best-fit parameters for every subject with the exception of the five-parameter model, which generated multiple sets of best-fit parameters for three subjects. In each of these cases, there were unique best-fit values for all estimated parameters except R0rew; however, for two subjects, the five-parameter model produced equally low negLLE for any value of R0rew ≥ 0 (given best-fit values for the remaining parameters), and for one subject, the model produced equally low negLLE for any value of R0rew ≤ 0 (given best-fit values for the remaining parameters). For subsequent analyses, the neutral value of R0rew = 0 was assigned as the best-fit parameter for these three subjects.

**Figure 5** shows that the models were similar in their ability to reproduce the data, both in terms of negLLE (**Figure 5A**) and also in terms of AIC and BIC (**Figure 5B**).

**Figure 6A** shows mean estimated parameter values for the three models; although R0rew and R0pun are plotted separately, note that R0rew = R0pun for the R0 model, while R0rew = −1 ∗ R0pun for the R0inv model. While all models had similar estimated mean values for LR+ (rmANOVA, F(1.52,53.21) = 1.21, p = 0.297) and LR− (F(1.34,46.86) = 3.22, p = 0.068), there were differences on estimated values of β (F(1.22,42.66) = 6.88, p = 0.008), which was significantly larger in the R0 model than in the five-parameter (t(35) = 2.48, p = 0.018) or R0inv models (t(35) = 2.86, p = 0.007), which did not differ (t(35) = 1.47, p = 0.151). Finally, the largest differences between models were observed in estimated values of R0rew and R0pun. Specifically, as shown in **Figure 6A**, the five-parameter model produced a mean value of R0rew that was greater than 0, and a mean value of R0pun that was less than 0. This pattern was echoed in the R0inv model but not in the R0 model, where both R0rew and R0pun were constrained to be equal, resulting in a weakly positive value for both. rmANOVA confirmed that estimated values of R0rew did not differ across the three models (F(1.48,51.00) = 1.56, p = 0.217) but values of R0pun did (F(1.62,56.57) = 16.25, p < 0.001). Specifically, the value of R0pun in the R0 model was significantly greater than in the five-parameter or R0inv models (all t > 4.5, all p < 0.001), but the value in the latter two models did not differ (t(35) = 1.65, p = 0.108).

Based on these analyses of mean scores, the R0inv model was a closer approximation to the five-parameter model than the R0 model. However, **Figure 6B** shows that there was considerable individual variability in values of R0rew and R0pun in the five-parameter model, such that mean values may not adequately capture the qualitative patterns in the data. Specifically, while the R0 model constrained subjects to have equal values for R0rew and R0pun, and the R0inv model constrained them to be opposite in valence, **Figure 6B** shows that neither constraint adequately described the values generated by the five-parameter model. Rather, while a majority of subjects had estimated values of R0rew > 0 and R0pun < 0 (as also indicated in **Figure 6A**), some individual subjects assigned the same valence to these parameters while others did not. Interestingly, for no subject was the theoretically optimal pattern (R0rew < 0 and R0pun > 0) observed. There were also differences in the relative magnitude of R0rew and R0pun. **Figure 6C** shows R0 bias, defined as the difference between estimated values of R0rew and R0pun, for individual subjects. While only 2 of 36 subjects (5.6%) had R0 bias < −0.5, 22 of 36 subjects (61.1%) had R0 bias ≥ +0.5.

In addition to conducting simulations over all 160 training trials, we also conducted separate simulations to determine best-fit parameters over the first two blocks (first 80 trials) and over the last two blocks (last 80 trials). As shown in **Figure 7A**, model fit was better (lower negLLE, reflected in higher p(choice)) when the model was fit to blocks 3 and 4; this is unsurprising since subjects should have developed more consistent response rules later in training. As shown in **Figure 7B**, the values of estimated parameters R0rew and R0pun show the same qualitative pattern of R0rew > 0 and R0pun < 0 in the early blocks of training, and also in the later blocks of training, as when the model is applied to all 160 trials.

Frontiers in Behavioral Neuroscience | www.frontiersin.org July 2015 | Volume 9 | Article 153 |

Importantly, random effects Bayesian model selection comparing the five-parameter model with the two four-parameter models indicated strong evidence in favor of the five-parameter model, with posterior probability for this model calculated as r = 0.63, compared with r = 0.11 and r = 0.26 for the two smaller models. Further, we compared protected exceedance probabilities among the three models as well as among each two models separately. We found that comparing all three models at once yielded exceedance probabilities of 0.9998, 0.0001 and 0.0001 for the five-parameter model, the standard four-parameter model, and the four-parameter model in which R0rew = −1 ∗ R0pun, respectively.

### Experiment 2

Experiment 1 was examined here as a standard baseline task for reward and punishment learning, as used in prior studies (Bódi et al., 2009; Kéri et al., 2010; Myers et al., 2013; Sheynin et al., 2013). Because reward-based and punishment-based trials were intermixed in Experiment 1, the no-feedback outcome was ambiguous. The central finding of the modeling was that, contrary to what might be defined as ''optimal'' behavior, subjects tended to value the ambiguous feedback as positive (similar to reward) on reward-based trials, and as negative (similar to punishment) on punishment-based trials.

In Experiment 2, we use a separated task design in which reward and punishment trials are conducted separately, in different training blocks. The no-feedback outcome is arguably unambiguous here, since in a block of reward-based trials it always signals missed reward (similar to a punishment) while in a block of punishment-based trials it always signals missed punishment (similar to reward). We here predicted that estimated values of R0 might differ accordingly both early in training, while subjects were experiencing only a single trial type, as well as later in training, as a function of early learning.

### Participants

Participants were drawn from the same population as Experiment 1 and included 36 healthy young adults (college undergraduates, mean age 19.6 years, SD 1.6; 63.9% female). As in Experiment 1, participants received research credit in a psychology class. Procedures conformed to ethical standards laid down in the Declaration of Helsinki for the protection of human subjects. All participants signed statements of informed consent prior to inclusion in the study.

FIGURE 7 | (A) Model fit plotted as p(choice) is better in last 80 trials compared to first 80 trials, reflecting greater consistency in subject responding as training progressed. (B) The value of estimated parameters R0rew and R0pun, showing mean values of R0rew > 0 and R0pun < 0, whether assessed over the first 80 trials (Blocks 1–2), last 80 trials (Blocks 3–4), or entire experiment (all 160 trials).

### Behavioral Task

The task was the same as in Experiment 1 except that subjects were randomly assigned to either a Reward-First (n = 17) or Punish-First (n = 19) condition. For those in the Reward-First condition, all 40 reward-based trials (stimuli S1 and S2) appeared in blocks 1 and 2 while all 40 punishment-based trials (stimuli S3 and S4) appeared in blocks 3 and 4. For the Punish-First condition this order was reversed. Thus, the no-feedback outcome was no longer ambiguous, as it consistently signaled lack of reward during the reward-learning blocks and consistently signaled lack of punishment during the punishment-learning trials. Subjects were not informed that trials were blocked by type, nor were subjects explicitly signaled of the shift between blocks 2 and 3.

### Computational Model

The same five-parameter model described in Experiment 1 above was applied to the data from this experiment. In addition to calculating best-fit parameters based on the data from the complete set of 160 trials, we also applied the models to just the first 80 trials (blocks 1 and 2) and just the last 80 trials (blocks 3 and 4), when subjects were learning either the reward-based or punishment-based task.

### Results

### Behavioral Task

**Figure 8A** shows performance across all four blocks, for subjects assigned to the Reward-First and Punish-First conditions. Mixed ANOVA confirmed no significant change in performance across the four blocks (F(2.22,75.42) = 0.70, p = 0.553), no effect of condition (F(1,34) = 0.31, p = 0.579) and no interaction (F(2.22,75.42) = 1.55, p = 0.208). **Figure 8B** shows individual subject performance on the reward-based and punishment-based trials, and again shows considerable individual variation on performance to the two trial types for subjects in either experimental condition.

**Figures 8C,D** show mean RT for subjects in each condition. Again, not all subjects made all response types on every block; for example, four subjects (two in each condition) made no non-optimal responses in block 4. Thus, as in Experiment 1, separate mixed ANOVAs of RT were conducted on optimal and non-optimal responses, with alpha adjusted to 0.025 to protect significance. For optimal responses, there was a significant effect of block (F(2.09,69.09) = 25.85, p < 0.001) but no effect of condition (F(1,33) = 0.02, p = 0.904) and no interaction (F(2.09,69.09) = 2.29, p = 0.107). For non-optimal responses, the pattern was similar: a within-subjects effect of block (F(1.37,38.35) = 20.29, p < 0.001) but no effect of condition (F(1,28) = 0.14, p = 0.714) and no interaction (F(1.37,38.35) = 0.23, p = 0.708). Specifically, for both optimal and non-optimal responses, RT decreased from block 1 to block 2 (all t > 5, all p < 0.001) and from block 3 to block 4 (all t > 4, all p < 0.001) but did not change when the trial types were shifted from block 2 to 3 (all t < 1.5, all p > 0.100).

Finally, we again examined win-stay and lose-shift behavior. As suggested by **Figure 9**, there was no main effect of condition (F(1,34) = 0.04, p = 0.842); however, subjects exhibited more win-stay than lose-shift behavior (F(1,34) = 27.69, p < 0.001). There was also an interaction between trial type and experimental condition (F(1,34) = 9.41, p = 0.004); however, no post hoc comparisons to explore this interaction survived corrected significance. Thus, the pattern seen in Experiment 1 (**Figure 4**), where there was more win-stay behavior on reward-based than punishment-based trials, and more lose-shift behavior on punishment-based than reward-based trials, was not observed here.

### Computational Model

Given that the analyses in Experiment 1 identified the five-parameter model as providing the lowest negLLE, with comparable AIC or BIC to simpler models, we applied the same five-parameter model here to data from the two experimental conditions. Again, we ran simulations to fit data from all 160 trials, as well as additional simulations based on data from just the first 80 or last 80 trials, while subjects were experiencing only a single trial type. Results from both sets of simulations are reported here.
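For orientation, a model of this kind can be sketched as a Q-learning rule with separate learning rates for positive and negative prediction errors (LR+, LR−), a softmax parameter (β), and separate subjective values for the no-feedback outcome on reward-based and punishment-based trials (R0rew, R0pun). The Python sketch below is illustrative only and is not the fitting code used in the study. The function name, the trial encoding, the initial Q-values of 0, the coding of explicit reward and punishment as +1 and −1, and the use of β as a softmax temperature are all assumptions made for this example.

```python
import math

def neg_log_likelihood(trials, lr_pos, lr_neg, beta, r0_rew, r0_pun):
    """Negative log-likelihood of a subject's choices under a simple Q-learner.

    trials: list of (stimulus, choice, outcome, trial_type) tuples, where
      choice is 0 or 1, outcome is "reward", "punish" or "none", and
      trial_type is "rew" or "pun" (hypothetical encoding).
    beta must be > 0 (assumed here to act as a softmax temperature).
    """
    q = {}      # Q-values per stimulus, one entry per response option
    nll = 0.0
    for stim, choice, outcome, trial_type in trials:
        qa = q.setdefault(stim, [0.0, 0.0])   # assumption: Q starts at 0
        # Softmax choice probability (shifted by the max for numerical stability)
        z = [v / beta for v in qa]
        m = max(z)
        expz = [math.exp(v - m) for v in z]
        nll -= math.log(expz[choice] / sum(expz))
        # Subjective value of the outcome; no-feedback depends on trial type
        if outcome == "reward":
            r = 1.0
        elif outcome == "punish":
            r = -1.0
        else:
            r = r0_rew if trial_type == "rew" else r0_pun
        # Prediction error, with a sign-dependent learning rate
        delta = r - qa[choice]
        qa[choice] += (lr_pos if delta > 0 else lr_neg) * delta
    return nll
```

In a fitting procedure, a function like this would be minimized over the five parameters for each subject; with R0rew = −1, for instance, a no-feedback outcome on a reward-based trial drives the chosen option's value down exactly as an explicit punishment would.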

Across just the first 80 trials (blocks 1 and 2), the model fit data from the Reward-First and Punish-First conditions equally well (t(34) = 0.03, p = 0.977); similarly, across all 160 trials there was no difference in negLLE between the two conditions (t(34) < 0.01, p > 0.99). **Figure 10** shows these data, rescaled as p(choice) which is normalized for number of trials and so can be compared across calculations based on 80 vs. 160 trials.
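The rescaling can be illustrated with a short sketch. It assumes that p(choice) is the geometric-mean likelihood per trial, exp(−negLLE/n), which is one common way to put fits to 80 and 160 trials on the same scale; the exact formula is not spelled out here, so that definition and the function names are assumptions. The standard AIC and BIC penalties are included for reference.

```python
import math

def per_trial_probability(neg_lle, n_trials):
    """Geometric-mean probability the model assigns to each observed choice.

    Assumed definition of p(choice): exp(-negLLE / n). A value of 0.5
    corresponds to chance on a two-choice task.
    """
    return math.exp(-neg_lle / n_trials)

def aic(neg_lle, n_params):
    """Akaike information criterion: 2k + 2 * negLLE."""
    return 2 * n_params + 2 * neg_lle

def bic(neg_lle, n_params, n_trials):
    """Bayesian information criterion: k * ln(n) + 2 * negLLE."""
    return n_params * math.log(n_trials) + 2 * neg_lle

# Illustrative numbers only (not data from the study): the same per-trial
# fit quality gives the same p(choice) for 80 and 160 trials.
p80 = per_trial_probability(40.0, 80)
p160 = per_trial_probability(80.0, 160)
```

Because negLLE grows with the number of trials, only a per-trial quantity of this sort allows the 80-trial and 160-trial fits to be compared directly.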

At the end of the first 80 trials (blocks 1 and 2), there were no differences in the value of LR+, LR−, or β between Reward-First and Punish-First conditions (**Figure 11A**; t-tests, all t < 1.5, all p > 0.100). Because subjects in each condition had only experienced one type of trial, those in the Reward-First condition had an estimated value for R0rew but not R0pun (which they had never experienced), while those in the Punish-First condition had an estimated value for R0pun but not R0rew. As might be expected, for the former group, R0rew < 0 (one-sample t(16) = 3.29, p = 0.005), indicating the no-feedback outcome was valued as similar to a punishment (missed reward); however, for the latter group, R0pun was not significantly different from 0 (one-sample t(18) = 1.82, p = 0.086).


FIGURE 9 | Win-stay and lose-shift behavior for subjects in the (A) Reward-First and (B) Punish-First conditions. In both conditions, there was more win-stay than lose-shift behavior for both reward and punishment trials, but (unlike Experiment 1) there was no difference in win-stay behavior to explicit reward vs. non-punishment, or in lose-shift behavior to explicit punishment vs. non-reward.

For subjects in the Punish-First condition, there was a strong correlation between the value of R0pun over the first 80 trials and performance over those same trials (r = 0.603, p = 0.006), indicating that those subjects who assigned R0pun a more positive value performed better at avoiding the actual punisher in preference to the no-feedback outcome (**Figure 11C**). However, for subjects in the Reward-First condition, the correlation between R0rew and performance was not significant (r = −0.337, p = 0.185); as shown in **Figure 11B**, many subjects who valued R0rew close to −1 nevertheless performed near chance on the reward-learning trials.

When applying the model to data from all 160 trials, during which all subjects had experienced the same number of both trial types, there were no differences in the value of any estimated parameter between Reward-First and Punish-First conditions (**Figure 12A**; t-tests, all t < 1.5, all p > 0.100). Here, neither R0rew nor R0pun differed significantly from 0 in either condition (all p > 0.200). **Figure 12B** shows that the separated task design did qualitatively shift R0 bias: whereas in Experiment 1, only 2 of 36 subjects (5.6%) had R0 bias < −0.5 (**Figure 6C**), here 8 of 17 subjects (47.1%) in the Reward-First condition and 5 of 19 subjects (26.3%) in the Punish-First condition had R0 bias ≤ −0.5. The distribution did not differ across Reward-First and Punish-First conditions (Yates-corrected chi-square test, χ<sup>2</sup> = 0.90, p = 0.344).

Finally, it can be argued that the first 80 and last 80 trials represent two separate tasks, and so rather than fitting the model to all 160 trials, it is reasonable to fit it once to the first 80 trials and again to the last 80 trials. When the model was fit to just the last two blocks (last 80 trials), there were no differences between conditions on any estimated parameter (**Figure 13**; all p > 0.05), and neither the estimated value of R0rew in the Punish-First group (which was now experiencing reward-based trials) nor the estimated value of R0pun in the Reward-First group (which was now experiencing punishment-based trials) differed significantly from 0 (all p > 0.05).

Thus, comparing the first task (**Figure 11A**) to the last task (**Figure 13**) learned, there were effects of task order. Specifically, the estimated value of R0rew was greater when reward trials occurred after punishment trials (i.e., in the Punish-First group) than when they were trained without prior experience (i.e., in the Reward-First group; **Figure 13**; t-test, p = 0.011). Such differences were not evident in estimated values of R0pun; that is, values were similar whether punishment training occurred in naïve subjects or in subjects who had already experienced reward-based training (t-test, p = 0.411). No other parameters changed significantly within or across conditions from the first two blocks to the last two blocks (all p > 0.08). Thus, training order had a significant effect on R0rew but not R0pun or any other parameter.

### Discussion

In the current study, we found the following. First, when we applied a Q-learning model to the "standard" (intermixed) version of the task in Experiment 1, we found that the five-parameter model weighted the no-feedback outcome differently when it appeared on reward-based trials (R0rew, the alternative to explicit reward) than when it appeared on punishment-based trials (R0pun, the alternative to explicit punishment). Contrary to what one might think (and, in fact, contrary to what would be "optimal"), subjects tended to value R0rew > 0 and R0pun < 0. That is, the no-feedback outcome on a reward trial was valued similar to a small reward, while the no-feedback outcome on a punish trial was valued similar to a small punishment. This pattern was similar whether the model was applied to data from all 160 trials or separately to the first 80 trials (blocks 1 and 2) and the second 80 trials (blocks 3 and 4). This suggests that, rather than treating the no-feedback outcome as a contrast to explicit reward, subjects instead tended to value it based on trial type: positively on trials where reward was available, and negatively on trials where punishment might occur.

Second, when we looked at individual subject data from Experiment 1, there was no correlation between estimated values of R0rew and R0pun; that is, while the group valued these inversely on average, individual subjects did not. One implication of this is that, although we explored several simpler RL models, these generally did not adequately capture the qualitative range of solutions found to describe individual subjects. We also found that individual subjects tended to have R0 bias > 0, indicating a greater absolute value of R0rew than R0pun. This would potentially produce somewhat better learning on punish than reward trials in the intermixed group, since a strong positive value of R0rew means that the actual reward might only be viewed as marginally more reinforcing than the no-feedback outcome. Such a trend is visible in **Figure 1A**, although it was not significant here. Prior work with this task, however, has often shown slightly better learning on punishment trials than reward trials in control groups given intermixed training (e.g., Somlai et al., 2011; Myers et al., 2013; Sheynin et al., 2013), although as in the current study this difference does not always reach significance. Such a bias to learn from punishment more readily than from reward is of course consistent with loss aversion theory (Kahneman and Tversky, 2000), which essentially states that losses are psychologically more powerful than gains. We also found that males show more win-stay and lose-shift behaviors than females in this task. The win-stay behavior in our data is similar to other findings from rat studies, although the lose-shift data are different (van den Bos et al., 2012).

The curious finding that subjects in Experiment 1 valued R0rew similar to a reward and R0pun similar to a punishment warrants further investigation. As a first step, in Experiment 2, we examined subject behavior when reward-based trials were trained first vs. when punishment-based trials were trained first. Visual comparison of learning curves from Experiment 1 (**Figure 2A**) and Experiment 2 (**Figure 8A**) shows somewhat better learning in the latter, as might be expected given that the separated conditions of Experiment 2 involve reduced working memory load (only two trial types trained at any given time) and reduced ambiguity of the no-feedback outcome (only one meaning within any given block of trials). However, we found no significant difference in behavioral performance (in terms of percent optimal responding or reaction time) whether the reward-based or punishment-based trials were trained first. This contrasts with observations in other tasks showing an effect of trial ordering on behavior (Esteves et al., 1994; Ohman and Soares, 1998; Katkin et al., 2001; Morris et al., 2001; Lovibond et al., 2003; Wiens et al., 2003).

There was, however, an effect on win-stay and lose-shift behavior. Specifically, as shown by **Figure 9**, there was no main effect of whether reward-based or punishment-based trials were trained first, but post hoc tests found no significant difference in win-stay responding on reward trials (where subjects obtained explicit reward) vs. non-punishment trials (where subjects received R0pun), and no significant difference in lose-shift responding on punishment trials (where subjects obtained explicit punishment) vs. non-reward trials (where subjects received R0rew). This suggests that subjects were treating R0rew similar to punishment (missed reward) and R0pun similar to reward (missed punishment), in contrast to the results from Experiment 1.

This pattern was echoed when the Q-learning model was applied to data from the first 80 trials of Experiment 2, when subjects were experiencing either reward-based or punishment-based trials. Specifically, and as might be expected, subjects in the Reward-First condition had estimated values of R0rew below 0, indicating that the no-feedback outcome was treated similar to a punishment (missed opportunity for reward). In the Punish-First condition, estimated values of R0pun were numerically, but not significantly, greater than 0.

It might have been expected that "switching" tasks in the second half of Experiment 2 would lead the Punish-First group (which was now experiencing only reward-based trials) to similarly develop a positively-valued R0rew, and the Reward-First group (which was now experiencing only punishment-based trials) to develop a positively-valued R0pun. But this was not the case. Specifically, the estimated value of R0rew was greater (closer to 0) when reward-based trials occurred after punishment-based trials (as in the Punish-First group) than when they occurred in a naïve subject (as in the Reward-First group). This suggests that non-reward is valued more negatively in subjects who have only ever experienced reward, compared to subjects who have previously experienced explicit punishment. By contrast, prior exposure to explicit reward did not affect valuation of non-punishment. Thus, training order had a significant effect on R0rew, but not R0pun or any other estimated parameter.

Finally, when the Q-learning model was applied to data from all 160 trials in Experiment 2, there were no differences in values of estimated parameters for subjects from the Reward-First or Punish-First conditions, and in neither case did estimated values of R0rew or R0pun differ significantly from 0. Again, the chief contrast with Experiment 1, where trial types were intermixed, appears to be in valuation of R0rew, which was strongly positive following intermixed training, but more neutrally-valued after separated training.

The negatively-valued estimates of R0 in the Reward-First condition are potentially interesting because they are reminiscent of those observed by Myers et al. (2013) in a prior study of veterans with symptoms of post-traumatic stress disorder (PTSD), a disorder that includes pathological avoidance among its defining symptoms. In that earlier study, in which all subjects received intermixed reward and punishment trials, control subjects (with few or no PTSD symptoms) had estimated values of R0 near +0.5, slightly larger than those obtained in the intermixed group of the current study. By contrast, subjects with severe PTSD symptoms had significantly lower (but still positive) values of R0, and those with severe PTSD symptoms who were not receiving psychoactive medication for their symptoms had estimated values of R0 near 0, similar to that observed in the Reward-First condition of Experiment 2 here. In the current study, we did not assess PTSD symptoms, so we cannot definitively rule out the possibility that PTSD symptoms contributed to the current pattern of results; however, it seems unlikely that severe PTSD would occur at high rates in the college population from which our sample was drawn, nor that such cases if they existed would have been disproportionately assigned to the Reward-First condition.

An alternate hypothesis is that prior training on reward-based trials only created a bias to view neutral outcomes as negatively-valenced, although this bias could be partly remediated by later exposure to punishment-based trials. As current therapy for PTSD often focuses on providing positive and/or neutral feedback, it may be possible that alternate approaches, which explicitly contrast neutral and negative feedback, might be more successful in helping these individuals to reframe their interpretation of neutral outcomes. However, future work should confirm or disconfirm these speculations.

Another relevant prior study has suggested that subjects' RT depends on the rate of experienced reward, possibly reflecting tonic levels of dopamine (Guitart-Masip et al., 2015). This study differed from ours in many ways: specifically, it was an oddball detection task, with subjects required to respond quickly in order to obtain monetary rewards; by comparison, our task involved a forced-choice categorization with no explicit instruction for subjects to respond quickly. Nevertheless, it might have been expected that the RT results from Guitart-Masip et al. (2015) would generalize to a probabilistic categorization task such as the current paradigm. In our Experiment 2, (most) subjects got frequent reward during the reward-based trial blocks, and got no reward (or at best lack-of-punishment) during the punishment-based trial blocks, so arguably relative rate of reward changed across the course of the experiment. However, our RT analysis did not find significant effects of condition on RT nor any block-condition interactions. One possible explanation for this discrepancy is simply that our small sample was underpowered to examine RT data. A second explanation might be that subjects in the current study viewed the no-feedback outcome as reinforcing, which meant that (for most subjects) relative rates of reward were similar across reward and punishment blocks, particularly since performance levels were approximately equal (**Figures 2A**, **8A**). However, this explanation is not supported by the computational modeling, which suggested that, although the no-feedback outcome was positively-valued during punishment-based trials in the Punish-First condition, it was not positively-valued during punishment-based trials in the Reward-First condition.
Future studies could be designed to further elucidate this issue, by explicitly varying the rate of reward in this task, perhaps especially following overtraining and the achievement of steady-state response behavior (Niv et al., 2007).

In summary, our study shows that probabilistic category learning is impacted by ordering of trials, and specifically by whether reward-based and punishment-based trials occur first or are intermixed. Our computational modeling suggests that these differences are reflected in the relative weighting of neutral feedback, and further suggests that early training on one type of trials, specifically reward-based trials, can create a difference in how neutral feedback is processed, relative to those receiving only punishment trials or intermixed reward-based and punishment-based trials. This may create conditions that facilitate subsequent learning of avoidance responses, when punishment-based learning is introduced, which in turn may suggest a way in which early experiences could confer later vulnerability to facilitated avoidance, which is a feature of anxiety disorders.

### Acknowledgments

For assistance with data collection and management, the authors gratefully acknowledge Monica Andrawis-Valentín, Yasheca Ebanks-Williams, Lisa Haber-Chalom, Priyanka Khanna, and Roshani Patel. This work was partially supported by the NSF/NIH Collaborative Research in Computational Neuroscience (CRCNS) Program and by NIAAA (R01 AA 018737).

### References

is affected by presence of PTSD symptoms in male veterans: empirical data and computational model. PLoS One 8:e72508. doi: 10.1371/journal.pone.0072508


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Moustafa, Gluck, Herzallah and Myers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Medial amygdala lesions selectively block aversive Pavlovian–instrumental transfer in rats

#### **Margaret G. McCue<sup>1</sup>, Joseph E. LeDoux<sup>1,2</sup> and Christopher K. Cain<sup>1,3</sup>\***

<sup>1</sup> Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA

<sup>2</sup> Center for Neural Science, New York University, New York, NY, USA

<sup>3</sup> Child and Adolescent Psychiatry, New York University Medical School, New York, NY, USA

### **Edited by:**

Richard J. Servatius, DVA Medical Center, USA

### **Reviewed by:**

Anne-Marie Mouly, Centre de Recherche en Neurosciences de Lyon, France

Hadley Bergstrom, National Institutes of Health, USA

### **\*Correspondence:**

Christopher K. Cain, Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA

e-mail: ccain@nki.rfmh.org

Pavlovian conditioned stimuli (CSs) play an important role in the reinforcement and motivation of instrumental active avoidance (AA). Conditioned threats can also invigorate ongoing AA responding [aversive Pavlovian–instrumental transfer (PIT)]. The neural circuits mediating AA are poorly understood, although lesion studies suggest that lateral, basal, and central amygdala nuclei, as well as infralimbic prefrontal cortex, make key, and sometimes opposing, contributions. We recently completed an extensive analysis of brain c-Fos expression in good vs. poor avoiders following an AA test (Martinez et al., 2013, Learning and Memory). This analysis identified medial amygdala (MeA) as a potentially important region for Pavlovian motivation of instrumental actions. MeA is known to mediate defensive responding to innate threats as well as social behaviors, but its role in mediating aversive Pavlovian–instrumental interactions is unknown. We evaluated the effect of MeA lesions on Pavlovian conditioning, Sidman two-way AA conditioning (shuttling) and aversive PIT in rats. Mild footshocks served as the unconditioned stimulus in all conditioning phases. MeA lesions had no effect on AA but blocked the expression of aversive PIT and 22 kHz ultrasonic vocalizations in the AA context. Interestingly, MeA lesions failed to affect Pavlovian freezing to discrete threats but reduced freezing to contextual threats when assessed outside of the AA chamber. These findings differentiate MeA from lateral and central amygdala, as lesions of these nuclei disrupt Pavlovian freezing and aversive PIT, but have opposite effects on AA performance. Taken together, these results suggest that MeA plays a selective role in the motivation of instrumental avoidance by general or uncertain Pavlovian threats.

**Keywords: medial, amygdala, Pavlovian, instrumental, transfer, avoidance, freezing, ultrasonic**

### **INTRODUCTION**

Instrumental active avoidance (AA) is a major mechanism for coping with threats. As with all forms of defensive conditioning, AA mechanisms evolved because they were adaptive. Indeed, AA gives subjects control in dangerous situations and likely contributes to adaptive active coping strategies and resilience (LeDoux and Gorman, 2001). However, when active avoidance responses (ARs) are inappropriate, or occur too frequently, they can interfere with normal activities and contribute to anxiety pathology (McGuire et al., 2012). Compared to related phenomena like Pavlovian threat conditioning (Johansen et al., 2011), very little is known about the brain mechanisms of AA.

In a typical signaled AA paradigm, rats first learn that a conditioned stimulus (CS, sometimes called a "warning signal"; e.g., tone) predicts the occurrence of an aversive unconditioned stimulus (US; e.g., footshock). This Pavlovian phase transforms the CS into a threat that triggers defensive reactions (e.g., freezing). Then, on subsequent trials, rats gradually learn to suppress Pavlovian reactions and emit a specific instrumental action (AR; e.g., shuttle) that terminates the CS and prevents US delivery. Although the reinforcement mechanism in AA is unknown, one prominent theory hypothesizes that "fear reduction" associated with CS termination reinforces the AR (Mowrer and Lamoreaux, 1946; Miller, 1948; Rescorla and Solomon, 1967; Levis, 1989). Conditioned threats also play an important role in AA expression; once the instrumental contingency is acquired, CS presentations provide the motivation to perform the AR (Rescorla, 1990).

Active avoidance learning is also possible without an explicit CS or warning signal (Sidman, 1953). In the unsignaled AA paradigm, rats learn to emit ARs at regular intervals to delay US presentations (Bolles and Popp, 1964). In this task, similar Pavlovian and instrumental processes are hypothesized; however, the CS is a contextual cue that increases in intensity with time (Anger, 1963; Rescorla, 1968). Although unsignaled AA is more difficult to learn than signaled AA, it has proven useful for addressing some key questions about AA mechanisms. For instance, we have exploited the variability in unsignaled AA behavior to demonstrate that AA performance reflects a competition between motivations to react (e.g., Pavlovian freezing) or act (e.g., instrumental shuttle) in the face of threat (Lazaro-Munoz et al., 2010). Further, since unsignaled AA produces a steady rate of ARs, it is ideal for studying aversive conditioned motivation mechanisms in isolation with Pavlovian–instrumental transfer tasks (PIT; Rescorla and Lolordo, 1965; Patterson and Overmier, 1981; Laroche et al., 1987; Campese et al., 2013). In the aversive PIT procedure, Pavlovian and instrumental conditioning occur separately. Then, during the critical PIT test, AR rates are compared during CS and CS-free periods. Aversive CSs facilitate AA responding, most likely by activating a central arousal-like state (LeDoux, 2014).
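The comparison made in a PIT test can be sketched as a simple rate calculation. The Python below is a hypothetical illustration, not the analysis used in the study: the function name and data layout are assumptions, CS periods are assumed not to overlap, and every response time is assumed to fall within the session.

```python
def response_rates(response_times, cs_intervals, session_s):
    """Return (rate during CS, rate outside CS) in responses per minute.

    response_times: times (s) of avoidance responses.
    cs_intervals: list of (start, end) times (s) for each CS presentation.
    session_s: total session length (s).
    """
    cs_time = sum(end - start for start, end in cs_intervals)
    # Count responses that fall inside any CS presentation
    in_cs = sum(
        any(start <= t < end for start, end in cs_intervals)
        for t in response_times
    )
    out_cs = len(response_times) - in_cs
    return 60.0 * in_cs / cs_time, 60.0 * out_cs / (session_s - cs_time)

# Made-up example: one 60-s CS in a 180-s session
cs_rate, baseline_rate = response_rates([5, 65, 70, 100, 130], [(60, 120)], 180)
```

A higher rate during the CS than during CS-free periods is the transfer effect: the Pavlovian threat invigorates ongoing avoidance responding.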

We have used signaled AA, unsignaled AA, and aversive PIT tasks to help reveal the neural circuitry of AA and to identify areas that may contribute to conditioned motivation or response competition. In a recent study, we evaluated expression of the immediate-early gene *c-fos* after unsignaled AA training (Martinez et al., 2013). Good avoiders showed high AR rates and low freezing, whereas poor avoiders showed an opposite pattern. Although we examined a number of brain regions, we found that c-Fos expression correlated with freezing and AA behavior in only five regions: lateral amygdala (LA), basal amygdala (BA), central amygdala (CeA), infralimbic prefrontal cortex (IL-PFC), and medial amygdala (MeA). Involvement of the first four regions in AA converges with lesion studies. Lesions of LA or BA block AA acquisition and impair AA expression (Poremba and Gabriel, 1997, 1999; Choi et al., 2010; Lazaro-Munoz et al., 2010). Lesions of CeA block Pavlovian freezing and facilitate AA in poor avoiders, but have little effect in good avoiders (Choi et al., 2010; Lazaro-Munoz et al., 2010; Moscarello and LeDoux, 2013). Lesions of IL-PFC enhance freezing and impair AA acquisition (Moscarello and LeDoux, 2013). And lesions of LA or CeA, but not BA, impair aversive PIT (Campese et al., 2014). Considered with findings from Pavlovian conditioning studies (reviewed by Cain and LeDoux, 2008a), this has led us to a hypothetical model where LA is critical for learning and storing Pavlovian CS–US associations, and this information can be used in different ways to: (1) elicit Pavlovian reactions (via CeA), (2) motivate specific instrumental actions linked to the CS (via BA), or (3) generally motivate instrumental actions (via CeA). IL-PFC contributes by suppressing CeA-mediated reactions that compete with ARs. 
This is an incomplete working model as much remains unknown; however, these studies begin to address how the neural circuits of Pavlovian and instrumental aversive conditioning interact to produce behavior in the AA paradigm.

The only brain region identified by our c-Fos analysis that has not been investigated with lesions in the AA task is MeA. MeA is part of the extended amygdala (Alheid et al., 1995), a collection of structures that have been generally implicated in risk assessment and low-level defensive behaviors to uncertain or distant threats (Kemble et al., 1984; Davis et al., 2010). MeA has also been clearly implicated in innate defensive responses to predator cues (Rosen et al., 2008; Takahashi et al., 2008), as well as aggression and sexual behavior (Newman, 1999). MeA disruption has been studied with Pavlovian conditioning, although the results have been mixed (Nader et al., 2001; Walker et al., 2005). To our knowledge, the effects of MeA lesions on AA or aversive PIT have never been evaluated. MeA receives projections from LA and CeA and could mediate CS-elicited reactions that compete with ARs (Pitkänen, 2000). MeA also receives inputs from IL-PFC and could be necessary for suppressing CeA-mediated reactions that compete with ARs (McDonald et al., 1999). Finally, MeA projects to regions like the ventral tegmental area and striatum that may be important for instrumental learning and conditioned motivation to act (Pardo-Bellver et al., 2012).

Given this sparse information, we tentatively hypothesized that MeA is required for Pavlovian motivation of AA performance, but not for Pavlovian defensive reactions. To test this, we used electrolytic lesions of MeA and evaluated unsignaled AA, Pavlovian conditioning, and aversive PIT behaviors. Pre- and post-training lesions were used to differentiate between effects on learning and performance in the AA task. Further, we designed the studies to measure a range of defensive reactions to learned and innate threat stimuli in order to clarify the role of MeA in defensive conditioning. Finally, we included several control measures to determine whether MeA lesions affect basic sensorimotor functions. The results suggest that MeA selectively mediates low-level defensive reactions and motivation of instrumental avoidance by general or uncertain threats.

### **MATERIALS AND METHODS**

### **SUBJECTS**

Subjects were 74 male Sprague-Dawley rats (Hilltop Lab Animals Inc., Scottsdale, PA, USA) weighing ~300 g at the start of the study. Rats were housed two per cage and maintained on a 12:12-h light:dark schedule with free access to food and water. All experiments were approved by the Nathan Kline Institute Animal Care and Use Committee and were in accordance with NIH guidelines.

### **APPARATUS**

All avoidance, avoidance extinction, and PIT sessions occurred in standard rat two-way shuttleboxes (H10-11R-SC; Coulbourn Instruments, Whitehall, PA, USA). Shuttleboxes were equipped with infrared beam arrays to automatically detect movement between chamber sides, and bat detectors for analysis of 22 kHz ultrasonic vocalizations (USVs; Noldus Ultravox system, Leesburg, VA, USA). Pavlovian threat conditioning and context freezing tests occurred in standard rat conditioning boxes (H10-11R-TC; Coulbourn Instruments). Shuttleboxes and conditioning boxes also contained house lights, infrared indicator lights, video cameras, 8 ohm speakers (one per conditioning box, two per shuttlebox on opposite ends) and stainless steel grid floors for scrambled footshock delivery (shock source: Precision Animal Shocker, model H13–15, Coulbourn Instruments). Tone stimuli were delivered to speakers by programmable tone generators (Coulbourn Instruments, model A12–33). Shuttleboxes and conditioning chambers were enclosed in sound attenuating chambers (H10-24A). All conditioning procedures were controlled by Graphic State software (v3.03, Coulbourn Instruments). Predator odor tests occurred in a custom two-compartment chamber with wire mesh floors. Each chamber measured 28 cm × 28 cm × 43 cm (L × W × H) and was open at the top to allow for recording of animal behavior via an overhead video camera. The internal walls were painted gray and chamber sides were indistinguishable. Chamber sides were separated by a small open passage (10 cm × 19 cm). Pavlovian cue freezing tests occurred in Coulbourn conditioning boxes modified to mask salient contextual cues. Modifications included: plastic inserts to cover grid floors, high contrast visual cues added to transparent walls, and the addition of a novel odor (floor pans cleaned with 6% ethanol before test). Key behavioral sessions were recorded to DVD for offline analyses.

### **PROCEDURE**

Five sequential behavioral phases comprised the major experiments: (1) Sidman active avoidance conditioning, (2) predator odor tests, (3) Pavlovian threat conditioning, (4) avoidance extinction, and (5) Pavlovian–instrumental transfer tests. Two experiments were conducted, which differed mainly in the timing of MeA lesions. In Experiment 1, MeA lesions occurred before all behavioral phases. In Experiment 2, MeA lesions occurred after avoidance training and before all other behavioral phases. Additionally, cat hair served as the predator odor in Experiment 1 and fox urine served as the predator odor in Experiment 2. Finally, in Experiment 2, poor avoiders were identified after AA training and excluded from further analysis; thus, MeA lesions were evaluated only in good avoiders. After these experiments were completed, a third experiment was conducted to determine if MeA lesions affect pain threshold. At the completion of behavioral testing, rats were transcardially perfused under deep anesthesia and brains were removed for histological verification of lesions. See **Figure 1** for experimental timelines.

### **Unsignaled AA conditioning**

Rats received unsignaled Sidman AA training (4–5 sessions per week, 1 session per day, 25 min per session) as previously described (Lazaro-Munoz et al., 2010; Campese et al., 2013, 2014) where every shuttling response (movement to the opposite chamber side) delayed the delivery of the unconditioned stimulus (US; 0.5 s × 1 mA footshock) by 30 s (R–S or response–shock interval). In the absence of shuttling, the US was delivered every 5 s (S–S or shock–shock interval). Avoidance responses (ARs) were defined as shuttles during the R–S interval; shuttles during the S–S interval were considered escape responses (ERs). All shuttles were marked by a brief feedback stimulus (house lights blink off for 0.3 s). The number of ARs, ERs, shocks, and USVs were automatically recorded for all sessions. In Experiment 1, all rats received eight post-lesion training sessions. In Experiment 2, rats received seven training sessions, then surgery and recovery, followed by two additional AA sessions (identical to training). Note that poor avoiders were identified after session seven and excluded from further analysis as previously described (Lazaro-Munoz et al., 2010).
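The schedule described above can be replayed offline from a list of shuttle timestamps. The following is a minimal sketch, not the authors' Graphic State program: the function name `score_sidman_session` is hypothetical, shock duration is ignored, and it assumes that the session opens on the S–S timer and that a shock scheduled at the same instant as a shuttle is delivered first. Only the 30-s R–S interval, the 5-s S–S interval, and the 25-min session length come from the text.

```python
RS_INTERVAL = 30.0  # s, response-shock interval (from the text)
SS_INTERVAL = 5.0   # s, shock-shock interval (from the text)


def score_sidman_session(shuttle_times, session_s=25 * 60):
    """Replay one session; return (shock_times, n_avoidance, n_escape)."""
    shocks = []
    n_ar = 0
    n_er = 0
    next_shock = SS_INTERVAL  # assumption: the session opens on the S-S timer
    in_rs = False             # True while the R-S timer is running
    for t in sorted(shuttle_times):
        if t >= session_s:
            break
        # Deliver every shock that falls due before (or at) this shuttle.
        while next_shock <= t:
            shocks.append(next_shock)
            next_shock += SS_INTERVAL
            in_rs = False  # after a shock, the S-S timer runs
        if in_rs:
            n_ar += 1  # shuttle during the R-S interval: avoidance response
        else:
            n_er += 1  # shuttle during the S-S interval: escape response
        next_shock = t + RS_INTERVAL  # every shuttle postpones the US by 30 s
        in_rs = True
    # Shocks that fall due after the last shuttle, until the session ends.
    while next_shock < session_s:
        shocks.append(next_shock)
        next_shock += SS_INTERVAL
    return shocks, n_ar, n_er


shocks, n_ar, n_er = score_sidman_session([12.0, 20.0, 61.0], session_s=70)
print(shocks, n_ar, n_er)
```

In this toy session the rat is shocked at 5 and 10 s, escapes at 12 s, avoids at 20 s, lets the R–S interval lapse at 50 s, and escapes again at 61 s.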

### **Predator odor**

Predator odor tests were included as a positive control for effective MeA lesions (reviewed by Takahashi et al., 2005). Rats received two habituation sessions and two predator odor test sessions on consecutive days. Each session lasted 10 min and the percentage of total time spent in each chamber side was measured from video files. In Experiment 1, cat hair and a segment of cat collar were placed in a receptacle under the wire mesh floor in one compartment of the chamber (Blanchard et al., 2005). Since no effect of cat odor was found, in Experiment 2, 50 µl of 100% fox urine (Leg Up Enterprises, Lovell, ME, USA) was pipetted onto a Kimwipe and placed in a receptacle under the wire mesh floor of one compartment. No attempt was made to actively control odor flow between the compartments. To evaluate the effect of MeA lesions on predator odor, both the habituation sessions and predator odor tests were analyzed from video files to quantify the time spent in each chamber. To evaluate potential effects of MeA lesions on locomotor activity, habituation sessions were analyzed from video files by bisecting each chamber into quadrants with lines on the video monitor and counting the number of line crossings during the session.

### **Pavlovian threat conditioning**

Rats received three pairings of the conditioned stimulus (CS: 30 s, 5 kHz, 80 dB tone) and co-terminating US (0.7 mA × 1 s footshock) with 3 min acclimation and inter-trial intervals. One day later, rats received counterbalanced cue and context tests separated by 3 h. For the context test, rats were returned to the conditioning boxes for 8 min. For the cue test, conditioning chambers were modified to remove salient contextual cues, and a 30-s CS was presented 3 min after entry to the chamber. Freezing was rated from DVD files by an experienced observer blind to treatment condition. For cue tests, freezing was rated continuously and percent freezing was calculated by dividing the total seconds freezing by 30 and multiplying by 100. For context tests (including freezing during AA training), freezing was rated by time-sampling; every 5 s the rater determined whether the rat was freezing or not, and percent freezing was calculated by dividing the number of freezing observations by the total number of observations and multiplying by 100.
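The two percent-freezing rules used for the Pavlovian cue and context tests (continuous rating over the 30-s CS, and time-sampling every 5 s) reduce to simple ratios. The sketch below is illustrative only; the function names and the example ratings are hypothetical, and only the 30-s CS duration, the 5-s sampling interval, and the 8-min context test come from the text.

```python
def percent_freezing_continuous(freezing_seconds, cs_duration_s=30.0):
    """Cue test: total seconds freezing divided by the CS duration, times 100."""
    return 100.0 * freezing_seconds / cs_duration_s


def percent_freezing_time_sampled(observations):
    """Context test: one True/False judgment per 5-s sample.

    Returns the number of freezing observations divided by the total
    number of observations, times 100.
    """
    obs = list(observations)
    if not obs:
        raise ValueError("at least one observation is required")
    return 100.0 * sum(obs) / len(obs)


# Hypothetical ratings: 21 s of freezing during the 30-s CS.
print(percent_freezing_continuous(21.0))

# Hypothetical ratings: an 8-min context test sampled every 5 s gives
# 96 observations; here 24 of them are scored as freezing.
context_ratings = [True] * 24 + [False] * 72
print(percent_freezing_time_sampled(context_ratings))
```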

### **AA extinction**

Rats were returned to the shuttleboxes for 60 min with shockers turned off. Feedback was provided with each shuttle response. Long-term memory for AA extinction was assessed during the first 5 min of the PIT test session, 1 day after extinction training.

**FIGURE 1 |** Experimental timelines. […] received Sham or MeA-lesion surgery prior to all behavioral testing. **(B)** Rats received unsignaled Sidman AA training for seven daily sessions. Poor avoiders were identified after session 7 and excluded from further study. […] testing. **(C)** In a final, separate experiment, rats received Sham or MeA-lesion surgery followed by pain reactivity testing only. Red behavioral stages all occurred post-lesion.

### **Pavlovian–instrumental transfer**

Rats received two PIT tests separated by 1 day. PIT test sessions involved a single presentation of the aversive CS in the shuttleboxes while rats shuttled under extinction (US presentations absent, response feedback present). For each individual, the CS presentation was triggered when the shuttling rate fell below two responses per minute (RPMs) for two full minutes. Previous work found that PIT effects were greatest when baseline response rates were low (~2 RPMs), but not absent (Campese et al., 2013). Since rats vary greatly in their rates of AA extinction, this protocol ensured similar baseline response rates when PIT was assessed. Additionally, since some rats freeze when initially placed in the shuttleboxes, the CS trigger was disabled for the first 15 min of USAA extinction. Once triggered, the CS presentation remained on until 10 shuttles were performed. Immediately after the 10th shuttle response, the CS was terminated, the house light turned off and the session ended. For each rat in each test, a PIT score was calculated by the following equation: (shuttling rate during the CS / shuttling rate during an equivalent Pre-CS period) × 100.
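The PIT score and the CS trigger rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' control program: the function names are hypothetical, and the trigger is assumed to be evaluated on whole 1-min bins, firing once two consecutive bins after the 15-min lockout each contain fewer than two shuttles. The text does not specify how the two-minute criterion was computed.

```python
def pit_score(cs_shuttles, cs_duration_s, pre_shuttles, pre_duration_s):
    """(shuttling rate during CS / rate during the Pre-CS period) x 100."""
    pre_rate = pre_shuttles / pre_duration_s
    if pre_rate == 0:
        raise ValueError("Pre-CS rate is zero; the PIT score is undefined")
    cs_rate = cs_shuttles / cs_duration_s
    return 100.0 * cs_rate / pre_rate


def cs_trigger_minute(shuttles_per_minute, lockout_min=15,
                      criterion_rpm=2, window_min=2):
    """Return the 0-based minute bin at whose end the CS would start.

    Assumption: the rate criterion is checked on whole 1-min bins, and
    only bins after the lockout count toward the two-minute window.
    Returns None if the criterion is never met.
    """
    run = 0
    for minute, count in enumerate(shuttles_per_minute):
        if minute < lockout_min:
            continue  # trigger disabled during the first 15 min
        run = run + 1 if count < criterion_rpm else 0
        if run == window_min:
            return minute
    return None


# Hypothetical rat: 10 shuttles in a 120-s CS vs. 4 in the matched Pre-CS period.
print(pit_score(10, 120.0, 4, 120.0))

# Hypothetical per-minute shuttle counts for one extinction session.
counts = [5] * 15 + [3, 1, 2, 1, 0, 4]
print(cs_trigger_minute(counts))
```

A score of 100 means the CS did not change the shuttling rate; values above 100 indicate facilitation of shuttling by the CS.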

### **Shock reactivity**

Rats were placed individually into the conditioning boxes and scrambled 0.5 s footshocks were delivered every 30 s, beginning 60 s after entry to the chamber. The initial shock intensity was 0.1 mA and each subsequent shock increased by 0.1 mA. Thresholds to flinch, vocalize, and jump were recorded as described by others (Swedberg, 1994). The session was terminated once a jump was observed or 1.5 mA was reached; however, all rats emitted jump responses prior to reaching the 1.5-mA maximum.
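The ascending shock series and the threshold bookkeeping can be written out explicitly. In this sketch the function names and the example response record are hypothetical; the timing (first shock at 60 s, one shock every 30 s), the 0.1-mA steps, the 1.5-mA ceiling, and the stop-on-jump rule come from the text.

```python
def shock_series(start_s=60, interval_s=30, step_ma=0.1, max_ma=1.5):
    """Return [(time_s, intensity_mA), ...] for the full ascending series."""
    n_steps = round(max_ma / step_ma)
    return [(start_s + i * interval_s, round((i + 1) * step_ma, 1))
            for i in range(n_steps)]


def thresholds(responses_by_shock, series):
    """Return the lowest intensity at which each response was first seen.

    responses_by_shock holds one set of observed responses per shock, in
    order. Scoring stops at the first jump, as in the described session.
    """
    found = {}
    for (_, intensity_ma), responses in zip(series, responses_by_shock):
        for response in responses:
            found.setdefault(response, intensity_ma)
        if "jump" in responses:
            break
    return found


series = shock_series()
print(len(series), series[0], series[-1])

# Hypothetical record for one rat.
record = [set(), {"flinch"}, {"flinch"}, {"flinch", "vocalize"},
          {"flinch", "vocalize", "jump"}]
print(thresholds(record, series))
```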

### **SURGERY**

Rats were anesthetized with isoflurane (3–4%) (Henry Schein, Melville, NY, USA), and placed in a stereotaxic apparatus (David Kopf Instruments, Tujunga, CA, USA). Small burr holes were drilled in the skull above MeA. A stainless steel monopolar electrode covered with epoxy (exposed tip of 500 µm; model NE-300X, David Kopf Instruments) was lowered through an incision in the dura into MeA. Bilateral lesions were created with a lesion maker (model 53500, Ugo Basile, Italy) by passing current (+0.5 mA, 12 s) through the electrode at four different drop sites (relative to Bregma in millimeters): (1) AP: −1.9, ML: ±3.2, DV: −9.2; (2) AP: −2.4, ML: ±3.2, DV: −9.3; (3) AP: −2.9, ML: ±3.4, DV: −9.0; (4) AP: −3.4, ML: ±3.4, DV: −8.9. Post-operative pain was managed with subcutaneous Buprenorphine SR (0.5 mg/kg; ZooPharm, Windsor, CO, USA). Sham animals underwent the same procedure, but no current was passed through the electrode. Animals recovered in their homecages, singly housed, for 14 days following surgery, and then were returned to pair housing for the remainder of the experiment.
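For bookkeeping, the four drop sites expand to eight electrode placements per rat once both hemispheres are counted. The sketch below only restates the coordinates and current parameters given above; the names, the dictionary layout, and the sign convention (negative ML = left) are assumptions made for illustration.

```python
# Coordinates copied from the text (mm relative to Bregma).
SITES = [
    {"ap": -1.9, "ml": 3.2, "dv": -9.2},
    {"ap": -2.4, "ml": 3.2, "dv": -9.3},
    {"ap": -2.9, "ml": 3.4, "dv": -9.0},
    {"ap": -3.4, "ml": 3.4, "dv": -8.9},
]
CURRENT_MA = 0.5  # lesion current (from the text)
DURATION_S = 12   # current duration per site (from the text)


def bilateral_targets(sites):
    """Expand each site into a left and a right placement."""
    targets = []
    for site in sites:
        # Assumption: negative ML denotes the left hemisphere.
        for sign, side in ((-1, "left"), (1, "right")):
            targets.append({"side": side, "ap": site["ap"],
                            "ml": sign * site["ml"], "dv": site["dv"]})
    return targets


def charge_mc(current_ma, duration_s):
    """Charge passed at one site in millicoulombs (mA x s = mC)."""
    return current_ma * duration_s


targets = bilateral_targets(SITES)
print(len(targets), charge_mc(CURRENT_MA, DURATION_S))
```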

### **LESION VERIFICATION**

At the completion of behavioral testing, rats were given an anesthetic overdose and perfused transcardially with 10% phosphate-buffered formalin. Brains were removed and stored in 10% phosphate-buffered formalin and 30% sucrose for at least 3 days and were then cut in 50 µm sections using a freezing microtome (every other section was collected). Nissl stains were then performed and tissue images were collected (Nikon Microphot-FXA). Damage to target brain regions and adjacent areas was assessed using a rat brain atlas as a guide (Paxinos and Watson, 2005).

### **STATISTICAL ANALYSIS**

Data are presented as group means (±SEM). Bar graphs with two groups were analyzed with unpaired, two-tailed Student's *t*-tests. All other data were analyzed with two-way repeated measures ANOVAs (GraphPad Prism 6.0, GraphPad Software Inc., La Jolla, CA, USA). Planned *post hoc* comparisons were analyzed using Bonferroni's Multiple Comparison test. Differences were considered significant if *p*-values were less than 0.05. Note that behavioral results from Experiments 1 and 2 were initially analyzed separately. Data from Experiments 1 and 2 were combined only if: (1) testing occurred post-lesion in both experiments, (2) MeA lesions produced the same outcome (effect vs. non-effect) in both experiments, and (3) direct comparisons revealed no statistically significant differences between Sham groups or MeA-lesion groups from each experiment.
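The summary statistics named above can be sketched with the Python standard library. This is not the authors' analysis, which was run in GraphPad Prism. It is a minimal illustration of mean ± SEM, the pooled-variance (Student's) *t* statistic for two independent groups, and a Bonferroni adjustment. Function names and example data are hypothetical, and the *p*-value lookup from the *t* distribution is left out because the standard library does not provide it.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Unpaired Student's t with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical scores for two small groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
print(mean_sem(group_a))
print(student_t(group_a, group_b))
print(bonferroni([0.01, 0.04, 0.5]))
```

For two independent groups the degrees of freedom are n1 + n2 − 2, so a comparison of 21 MeA-lesion rats with 14 Sham rats has 33 degrees of freedom.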

### **RESULTS**

Avoidance data were analyzed separately for Experiments 1 and 2, since MeA lesions occurred pre- or post-training. With the exception of the predator odor data, all other tests combined data from Experiments 1 and 2 for analysis since these tests all occurred post-lesion and there were no differences in MeA lesion effects between the experiments. The predator odor tests were also analyzed separately since they used different odors (cat hair vs. fox urine).

### **LESION VERIFICATION**

Twenty-two rats received Sham lesion surgery and 42 rats received electrolytic lesions targeted to MeA. Twelve rats died post-surgery and all of these were in the MeA lesion group. **Figure 2** depicts the extent of acceptable lesions to MeA in the final dataset. One rat was excluded because of insufficient bilateral damage to MeA or excessive damage to adjacent regions. Thus, the final groups included 22 shams (Experiment 1: *n* = 9; Experiment 2: *n* = 5; Experiment 3: *n* = 8) and 27 MeA lesions (Experiment 1: *n* = 13; Experiment 2: *n* = 8; Experiment 3: *n* = 6).

### **ACTIVE AVOIDANCE MEASURES**

Pre-training lesion effects on AA measures were analyzed using group (Sham vs. lesion) × session (1–8) ANOVAs. Session was treated as a repeated measure. Bonferroni post-tests evaluated group effects for individual sessions. Rats in both groups acquired the AA task equally (**Figure 3**); AA responses increased with training [group: *F*(1,20) = 1.0, *p* = 0.32; session: *F*(7,140) = 17.2, *p* < 0.01; group × session: *F*(7,140) = 0.32, *p* = 0.94] and escape responses (ERs) decreased with training [group: *F*(1,20) = 1.6, *p* = 0.22; session: *F*(7,140) = 9.3, *p* < 0.01; group × session: *F*(7,140) = 0.71, *p* = 0.66]. Rats in both groups also saw a decline in the number of shocks as the AR was acquired, although MeA-lesion rats received fewer shocks throughout training [group: *F*(1,20) = 6.4, *p* = 0.02; session: *F*(7,140) = 14.8, *p* < 0.01; group × session: *F*(7,140) = 0.85, *p* = 0.54; Figure S1A in Supplementary Material]. To better understand how MeA-lesion rats could experience fewer shocks than Sham rats, but exhibit similar numbers of ARs and ERs, we more closely evaluated the patterns of responding during session 1 of training. Interestingly, MeA-lesion rats were more likely to escape following a shock presentation (Figure S1A in Supplementary Material). Since shocks are delivered every 5 s in the absence of shuttling, it is possible to receive significantly fewer shocks while still performing similar numbers of AR and ER shuttles. Note that MeA-lesion rats performed slightly more ARs and ERs in each session of training, though this difference was not statistically significant.

For Experiment 2, pre-lesion AA data were analyzed as above, with repeated measures group (Sham vs. lesion) × session (1–7) analyses. To evaluate the effect of lesions on AA in good avoiders, we compared the average of the final two AA training sessions (6–7) to the average of the two post-lesion test sessions (8–9) with group (Sham vs. lesion) by phase (pre- vs. post-lesion) ANOVAs, treating Phase as a repeated measure. Poor avoiders, identified after session 7, were excluded from all analyses. Rats in both groups acquired AA equally prior to lesion surgeries; there were no differences in ARs, ERs or shocks [group effects: *F*(1,11) < 1.9, *p* > 0.19; session effects: *F*(6,66) > 5.0, *p* < 0.01; group × session effects: *F*(6,66) < 0.77, *p* > 0.59]. Although there was a slight dropoff in ARs following the lesion and recovery period for both groups [Phase: *F*(1,11) = 0.01, *p* = 0.01], MeA lesions had no effect on ARs, ERs, or shocks post-lesion [group × phase interactions: *F*(1,11) < 2.6, *p* > 0.14].

**FIGURE 2 |** […] distance from Bregma in millimeters. Brain slides adapted from Paxinos and […] optic tract; CP, caudate putamen.

**FIGURE 3 | MeA lesions have no effect on AA learning or performance**. **(A,C)** AA responses (ARs) and escape responses (ERs) during AA training for rats with pre-training MeA (filled circles; n = 13) or sham (open circles; n = 9) lesions. **(B,D)** ARs and ERs during AA training (sessions 1–7) and AA tests (sessions 8–9) for rats with post-training MeA (filled squares; n = 8) or sham (open squares; n = 5) lesions. Dotted vertical lines represent lesion surgeries in relation to AA training and testing.

### **22 kHz USVs AND FREEZING DURING AA**

Ultrasonic vocalizations were automatically recorded throughout AA training and testing in the shuttleboxes. We also rated freezing behavior during the first 2 min of each training and test session. Statistical analyses of USV and freezing data for Experiments 1 and 2 were identical to those described for AA measures above (see Active Avoidance Measures). In Experiment 1, USVs declined as ARs were acquired, and rats with pre-training MeA lesions showed profound impairments in USVs throughout AA training [**Figure 4A**; group: *F*(1,20) = 24.00, *p* < 0.01; session: *F*(7,140) = 9.20, *p* < 0.01; group × session: *F*(7,140) = 0.68, *p* = 0.69]. In Experiment 2, USVs also declined as ARs were acquired, and there were no differences between the groups prior to lesions [**Figure 4B**; group: *F*(1,11) = 0.06, *p* = 0.81; session: *F*(6,66) = 2.32, *p* = 0.04; group × session: *F*(6,66) = 1.67, *p* = 0.14]. Due to equipment failure, USV data were lost during the post-lesion test for three rats (two Sham and one MeA-lesion rat). Rats with MeA lesions showed a decline in USVs compared to Shams; however, the differences were not statistically significant [group: *F*(1,8) = 1.8, *p* = 0.21, phase: *F*(1,8) = 5.0, *p* = 0.06, group × phase: *F*(1,8) = 1.3, *p* = 0.29].

**FIGURE 4 |** […] number of 22 kHz USVs per session during AA training for rats with pre-training MeA (filled circles; n = 13) or sham (open circles; n = 9) lesions. **(B)** Total USVs during AA training (sessions 1–7) and AA tests (sessions 8–9) for rats with post-training MeA (filled squares; n = 8) or sham (open squares; n = 5) lesions. \*p < 0.05 vs. Sham controls.

In Experiment 1, freezing in the shuttleboxes increased with AA training and MeA-lesion rats froze significantly less than Sham controls [**Figure 5A**; group: *F*(1,20) = 6.55, *p* = 0.02; session: *F*(7,140) = 4.65, *p* < 0.01; group × session: *F*(7,140) = 1.30, *p* = 0.26]. Bonferroni post-tests indicated that the strongest group differences were toward the end of AA training [sessions 7 and 8: *t*(160) > 1.9, *p* < 0.05]. In Experiment 2, freezing also increased with AA training, and there were no differences between the groups prior to lesions [**Figure 5B**; group: *F*(1,11) = 3.09, *p* = 0.11; session: *F*(6,66) = 3.56, *p* < 0.01; group × session: *F*(6,66) = 0.90, *p* = 0.50]. However, after surgery, Sham rats showed slightly increased freezing rates in the AA context whereas MeA-lesion rats showed decreased freezing [group × phase: *F*(1,11) = 43.71, *p* < 0.01]. Bonferroni post-tests confirmed the absence of a group difference pre-lesion [*t*(22) = 1.83] and a significant group difference post-lesion [*t*(22) = 3.88, *p* < 0.01].

Since rats with pre-training MeA lesions received fewer shocks than Sham controls in Experiment 1 (Figure S1A in Supplementary Material), we conducted an additional analysis to determine if this explained the reduction in USVs and freezing in the shuttleboxes during AA training. Sham and MeA-lesion groups were divided in half by the number of shocks received during session 1 of AA training. Rats in the bottom half of the Sham group (Sham-Low; *n* = 5) and those in the top half of the MeA-lesion group (MeA-High; *n* = 6) had nearly identical mean shock scores (114 vs. 115). We compared Shocks, ARs, ERs, USVs, and freezing with two-way group (Sham-Low vs. MeA-High) × session (1–8) ANOVAs, treating session as a repeated measure. We found no differences in Shocks, ARs, ERs, or freezing [group effects: *F*(1,9) ≤ 1.41, *p* > 0.27]. However, MeA-lesion rats still showed significantly fewer USVs even after controlling for shock levels [Figure S2 in Supplementary Material; group: *F*(1,9) = 32.60, *p* < 0.01; session: *F*(7,63) = 14.57, *p* < 0.01; group × session: *F*(7,63) = 2.56, *p* = 0.02].
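The shock-matching step amounts to ranking each group by session-1 shocks and keeping one half. The sketch below is illustrative: the function name and all shock counts are invented, and placing the median animal in the low half is an assumption, chosen because it reproduces the reported subgroup sizes (5 of 9 Sham rats, 6 of 13 MeA-lesion rats).

```python
def split_by_shocks(shocks_by_rat):
    """Rank rats by session-1 shocks; return (low_half, high_half) of rat IDs.

    Assumption: with an odd group size the median rat joins the low half.
    """
    ranked = sorted(shocks_by_rat, key=shocks_by_rat.get)
    n_low = (len(ranked) + 1) // 2
    return ranked[:n_low], ranked[n_low:]


# Hypothetical session-1 shock counts (not data from the study).
sham = {f"S{i}": 90 + 10 * i for i in range(9)}
mea = {f"M{i}": 40 + 8 * i for i in range(13)}

sham_low, _ = split_by_shocks(sham)
_, mea_high = split_by_shocks(mea)
print(len(sham_low), len(mea_high))
```

The two retained subgroups can then be compared on mean shocks before running the group × session analyses.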

### **PREDATOR ODOR**

As a positive control for MeA lesions, we also tested avoidance of predator odors by comparing the percent time spent in the predator odor chamber during two habituation sessions (no predator odor) and two test sessions (predator odor in one side). For the statistical analysis, we took the average of the habituation and test sessions and conducted two-way group (Sham vs. lesion) × phase (habituation vs. odor test) ANOVAs, treating phase as a repeated measure. In Experiment 1, we found no effect of cat hair odor on the time spent in the cat hair chamber, so it was impossible to evaluate the effects of MeA lesions on cat hair avoidance [Phase: *F*(1,20) = 1.4, *p* = 0.26]. Thus, in Experiment 2, we switched to fox urine as our predator odor, as this may be a more salient natural threat cue (Takahashi et al., 2005; Fendt, 2006). In this experiment, Sham rats showed a reduction in time spent in the fox urine chamber, and MeA-lesion rats did not [Figure S3 in Supplementary Material; group × phase interaction: *F*(1,11) = 6.7, *p* = 0.03]. Bonferroni post-tests confirmed that there were no differences between the groups during habituation [*t*(22) = 0.06], but MeA-lesion rats spent more time in the fox urine chamber during the test [*t*(22) = 2.52, *p* < 0.05]. Interestingly, MeA-lesion rats appeared to prefer the fox urine chamber, perhaps because they experience the odor as less aversive than shams and are more likely to investigate this novel stimulus.

### **LOCOMOTOR ACTIVITY**

To evaluate potential MeA lesion effects on baseline locomotor activity, we measured line crossings during the predator odor habituation sessions. For each animal, an average of the two sessions was calculated. There were no differences in locomotor activity for the groups in Experiments 1 and 2, so these were combined into a single analysis. We found no differences in locomotor activity between Sham and MeA-lesion rats [Figure S4A in Supplementary Material; *t*(33) = 1.08, *p* = 0.29].

### **PAVLOVIAN THREAT CONDITIONING**

Pavlovian threat conditioning occurred outside of the shuttleboxes in a neutral context, followed by counterbalanced context freezing and cue freezing tests 1 day later. Since there were no differences in the pattern of lesion effects in Experiments 1 and 2, these data were combined into a single analysis. For the cue test, we used a two-way group (Sham vs. lesion) by TestPhase (Pre-CS vs. CS) ANOVA, treating TestPhase as a repeated measure. Rats in both groups showed little freezing pre-CS and strong freezing during the CS [**Figure 6A**; TestPhase: *F*(1,33) = 258.5, *p* < 0.01]; however, MeA lesions did not significantly alter pre-CS or CS freezing [group × TestPhase: *F*(1,33) = 1.2, *p* = 0.27]. For the context test, MeA-lesion rats froze less than Sham rats [**Figure 6B**; *t*(33) = 2.29, *p* = 0.03].

**FIGURE 5 | MeA lesions impair freezing in the AA context**. **(A)** Percent time spent freezing during the first 2 min of each training session for rats with pre-training MeA (filled circles; n = 13) or sham (open circles; n = 9) lesions. **(B)** Percent freezing during the first 2 min of AA training (sessions 1–7) and AA tests (sessions 8–9) for rats with post-training MeA (filled squares; n = 8) or sham (open squares; n = 5) lesions. \*p < 0.05 vs. Sham controls.

**FIGURE 6 |** […] context freezing. **(A)** Percent time spent freezing during the 30-s pre-CS and CS periods for all MeA (filled bars; n = 21) and sham (open bars; n = 14) rats in Experiments 1 and 2. **(B)** Percent time spent freezing during the 8-min exposure to the Pavlovian conditioning context for same rats. \*p < 0.05 vs. Sham controls.

### **AA EXTINCTION**

Since our PIT test involves AA extinction, we evaluated AA extinction directly in Sham and MeA-lesion rats. Rats were placed in the shuttleboxes and allowed to respond with shockers turned off for 60 min, then returned to the chambers 1 day later for the first aversive PIT test. PIT testing begins with AA extinction; thus, the first 5 min of the PIT test was used as the long-term memory test. Since there were no differences in the pattern of responding between Experiments 1 and 2, data from the two experiments were combined for analyses. Shuttling data are presented in Figure S4C in Supplementary Material in 5 min blocks. Within-session learning was assessed with a two-way group (Sham vs. lesion) × block (1–12) ANOVA, treating block as a repeated measure. Long-term memory was assessed by comparing shuttles during the last 5 min block of extinction acquisition (learning) with shuttles during the 5-min extinction test (memory) in a separate two-way group (Sham vs. lesion) × phase (learning vs. memory) ANOVA, treating phase as a repeated measure. Bonferroni post-tests were used to evaluate group differences during individual 5 min blocks. Sham and MeA-lesion rats shuttled equally during the first 5 min block of AA extinction [*t*(396) = 0.37]. Shuttling decreased during the extinction session, and MeA-lesion rats extinguished slightly faster than Sham rats [group: *F*(1,33) = 7.2, *p* = 0.01; block: *F*(11,363) = 20.53, *p* < 0.01; group × block: *F*(11,363) = 0.89, *p* = 0.55]. However, there was significant spontaneous recovery and both groups showed equivalent shuttling during the long-term memory test 1 day later [group: *F*(1,33) = 1.97, *p* = 0.17; phase: *F*(1,33) = 111.3, *p* < 0.01; group × phase: *F*(1,33) = 0.02, *p* = 0.88].
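Binning a 60-min extinction session into twelve 5-min blocks is a simple counting exercise. The sketch below assumes shuttle timestamps in seconds from session start; the function name and the example timestamps are hypothetical.

```python
def shuttles_per_block(shuttle_times_s, session_min=60, block_min=5):
    """Count shuttles in consecutive blocks; returns one count per block."""
    block_s = block_min * 60
    n_blocks = session_min // block_min
    counts = [0] * n_blocks
    for t in shuttle_times_s:
        idx = int(t // block_s)
        if 0 <= idx < n_blocks:  # ignore timestamps outside the session
            counts[idx] += 1
    return counts


# Hypothetical timestamps (s) for one rat.
counts = shuttles_per_block([10, 299.9, 300, 3599, 3600])
print(len(counts), counts[0], counts[1], counts[-1])
```

The last element of the list corresponds to the final block of extinction acquisition, which is the value compared with the first 5 min of the next day's session in the long-term memory analysis.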

### **AVERSIVE PIT**

Aversive PIT was evaluated by allowing rats to shuttle in extinction and comparing shuttling rate during the CS to the shuttling rate immediately preceding the CS. Since there were no differences in the pattern of responding between Experiments 1 and 2, data were combined into a single analysis. PIT data are presented as a percentage of pre-CS responding in **Figure 7**. For simplicity, and because there were no differences in PIT within the groups between tests, a mean PIT score was determined for each animal for the two PIT tests. Sham rats showed a significant increase in shuttling rate with the aversive CS presentation, and this PIT effect was absent in MeA-lesion rats [*t*(33) = 3.915, *p* < 0.01].

### **SHOCK REACTIVITY**

To ensure that MeA lesions do not affect US (footshock) reactivity, a separate group of rats received Sham (*n* = 8) or MeA-lesion (*n* = 6) surgery prior to a pain threshold test. Rats received footshocks in ascending intensity steps of 0.1 mA and the thresholds to flinch, audibly vocalize, or jump were recorded for each rat. Data were analyzed with a two-way group (Sham vs. lesion) × threshold (flinch, vocalize, jump) ANOVA, treating threshold as a repeated measure. There were increasing thresholds for eliciting flinch, vocalization, and jump responses; however, no differences in shock reactivity were observed between the groups [Figure S4B in Supplementary Material; group: *F*(1,12) = 0.67, *p* = 0.43; threshold: *F*(2,24) = 46.94, *p* < 0.01; group × threshold: *F*(2,24) = 0.45, *p* = 0.65].

### **DISCUSSION**

The present experiments expand our understanding of aversive conditioned motivation and provide novel information regarding the role of MeA in generating defensive responses. Our major novel results are: (1) MeA lesions abolish aversive PIT without affecting Pavlovian freezing to the PIT CS or baseline AA behavior, and (2) MeA lesions impair USV and freezing reactions to contextual threats. Control experiments and secondary analyses suggest that these effects are not explained by differences in locomotor activity, shock reactivity, total shocks received, or AA extinction. We also confirmed a role for MeA in predator odor avoidance. Together, these data suggest that MeA processes uncertain threats and may motivate ARs through activation of a general arousal-like state. These points are discussed in more detail below.

**FIGURE 7 |** […] where a single CS presentation occurred after a baseline of AA responding in extinction. PIT is presented as the percent of pre-CS responding for the two sessions (see Material and Methods) for all MeA (filled bars; n = 21) and sham (open bars; n = 14) rats in Experiments 1 and 2. Dashed line represents the absence of PIT (pre- and post-CS AR rates were equal). \*p < 0.05 vs. Sham controls.

### **SELECTIVITY OF MeA LESIONS**

Electrolytic lesions were created by passing current through a monopolar electrode tip at four MeA sites per hemisphere. Histology revealed significant bilateral damage to MeA that completely spared LA and BA, and largely spared adjacent CeA and accessory basal nucleus. The cortical nucleus was moderately damaged in some animals, and the optic tract medial to MeA was damaged in nearly all cases. Damage to the optic tract may have affected vision in MeA-lesion animals; however, unsignaled AA depends critically on feedback stimuli (Bolles and Popp, 1964), which were visual in our paradigm, and rats with MeA lesions had no impairment in AA learning or performance. Visual cues are likely important for contextual conditioning; thus, context data should be interpreted with caution. With MeA lesions, there is also some concern that amygdalofugal fibers running between MeA and CeA are damaged. However, others have reported that simultaneous bilateral lesions of the amygdalofugal pathway lead to aphagia, adipsia, and death (Liang et al., 1990). Although we recorded no mortality post-surgery for our Sham rats, 12 rats died post-surgery in the MeA-lesion group. Thus, we suspect that MeA lesions that significantly damaged the amygdalofugal pathway led to premature death and assume that our final MeA-lesion groups had minimal damage to this pathway.

### **MeA IS NOT REQUIRED FOR ACTIVE AVOIDANCE**

To our knowledge, MeA function has never been evaluated with AA paradigms. The present experiments were largely inspired by results from a recent c-Fos analysis following training with an identical unsignaled AA protocol (Martinez et al., 2013). That study, which focused on individual differences in AA behavior and competing Pavlovian reactions, found greater MeA c-Fos activation after an AA test in good vs. poor avoiders. This led us to hypothesize that MeA is required for AA performance. However, in the present studies, we found no effects on ARs or ERs with pre- or post-training lesions. This strongly suggests that MeA is not required for the reinforcement or motivation of AA responding. Since c-Fos studies are only correlational, it is quite possible that differences in MeA c-Fos simply reflect differences in afferent regions that directly mediate AA behavior. Indeed, in our previous study, we also found that AA behavior correlated with c-Fos in LA, BA, CeA, and IL-PFC. Each of these regions has been implicated in AA performance with loss of function studies (Poremba and Gabriel, 1997, 1999; Choi et al., 2010; Lazaro-Munoz et al., 2010; Moscarello and LeDoux, 2013) and each sends projections to MeA (Hurley et al., 1991; Pitkänen et al., 1997; Pitkänen, 2000).

### **ROLE OF MeA IN PAVLOVIAN DEFENSIVE REACTIONS**

Our present data suggest that MeA is not required for the learning or expression of conditioned freezing to a discrete auditory cue. However, in several experiments, we found impairments in conditioned freezing to contextual cues, both in the AA context, and in a second conditioning context where the AR was not available. This was true even in our post-training lesion experiment with good avoiders, where total shock levels did not differ between groups (**Figure 5B**). It is notable that MeA lesions did not completely block context freezing, and in Experiment 1, when we controlled for total shocks, MeA lesions did not significantly impair freezing in the AA context (Figure S2 in Supplementary Material). Thus, the results suggest that MeA plays a peripheral, not essential, role in Pavlovian context freezing.

MeA has received some attention in Pavlovian threat conditioning studies. Two studies, using a conditioning procedure similar to ours, found that pre-training MeA lesions failed to affect freezing to a tone CS previously paired with footshock (Nader et al., 2001; Holahan and White, 2002). However, another study found that inactivation of MeA blocks the expression of fear-potentiated startle to olfactory, visual, and contextual cues (Walker et al., 2005). A fourth study found that post-conditioning lesions of MeA had no effect on context freezing, but did block context-elicited neuroendocrine responses (Yoshida et al., 2014). The notion that MeA participates in contextual threat reactions appears to be supported by studies of neural activity in rats (Knapska et al., 2007; Trogrlic et al., 2011) and humans (Alvarez et al., 2008). Together, these findings suggest that MeA at least modulates contextual threat reactions, but has little role in Pavlovian reactions to discrete threat cues. This interpretation seems consistent with a role for MeA in extended amygdala processing of uncertain threats (Sullivan et al., 2004), defined as threats that are weakly correlated with the US, threats that lack temporal precision, or threats unlinked to any particular AR (Seligman et al., 1971; Rosen and Donley, 2006; Rau and Fanselow, 2007; Davis et al., 2010).

We also found that MeA lesions severely impaired USV responding in the AA context, even when the number of shocks was similar between MeA-lesion and Sham groups. We are aware of no studies that evaluated the role of MeA in conditioned USV reactions; however, USVs have been observed with stimulation of the basolateral amygdala complex (BLA: LA + BA) and periaqueductal gray (PAG; Kim et al., 2013). Lesion studies suggest that BLA is necessary for conditioned USVs (Koo et al., 2004). Interestingly, this same study found that electrolytic lesions of CeA impaired USVs, but excitotoxic lesions had only a modest effect. The authors suggest that BLA fibers passing through CeA to some unknown effector region are important for conditioned USVs. Since MeA receives inputs from BLA and projects to PAG (Canteras et al., 1995; Pitkänen et al., 1997), our data raise the interesting possibility that MeA links contextual threat representations to USV effector regions. Our data also suggest that MeA-mediated defensive reactions, like USVs, are not incompatible or directly competing with active ARs; MeA-lesion rats emitted comparatively few USVs, but were no better at acquiring or performing the AR. This contrasts with CeA-mediated reactions like freezing, which constrain AA performance (Lazaro-Munoz et al., 2010; Moscarello and LeDoux, 2013).

Lastly, our data are not inconsistent with studies showing that conditioned freezing and USVs are correlated, and proportional to anxiety states in rats (e.g., Borta et al., 2006). Defensive responses are believed to be arranged hierarchically, and are often mediated by different brain regions. However, these brain regions are components of larger survival circuits that produce coordinated and dynamic responses to threats (LeDoux, 2014). It is likely that factors responsible for trait anxiety influence multiple parts of the circuit and multiple defensive behaviors, especially in response to similar threats.

### **MeA IS NECESSARY FOR PIT TO A GENERAL THREAT CUE**

Pavlovian–instrumental transfer procedures have been widely employed in appetitive studies to elucidate the psychological and neural mechanisms of conditioned motivation. Although instrumental procedures themselves rely on conditioned motivation for response performance, they are not ideal for studying conditioned motivation because learning is gradual and it is difficult to differentiate between reinforcement and motivation processes. The PIT test is entirely performance-based and allows one to study Pavlovian motivation of instrumental actions in isolation (Estes, 1948; Lovibond, 1983).

Using our recently developed aversive PIT procedure (Campese et al., 2013), where Pavlovian threats facilitate unsignaled (Sidman) AA responding, we found that electrolytic lesions of LA or CeA blocked PIT, but lesions of BA did not (Campese et al., 2014). Importantly, in this experiment, unsignaled AA was overtrained, which leads to amygdala-independent AA performance (Poremba and Gabriel, 1999; Lazaro-Munoz et al., 2010). This allowed us to use lesions to evaluate PIT even though the same pretraining lesions normally impair AA acquisition (Lazaro-Munoz et al., 2010). In the present studies, we found that MeA lesions completely blocked aversive PIT, but had no effect on Pavlovian freezing to the CS or baseline instrumental avoidance. Thus, MeA is the first region to show a selective role in aversive transfer. LA is required for Pavlovian conditioning, AA, and PIT (Nader et al., 2001; Choi et al., 2010; Lazaro-Munoz et al., 2010; Campese et al., 2014). BA is required for AA and expression of Pavlovian conditioning (Anglada-Figueroa and Quirk, 2005; Choi et al., 2010; Lazaro-Munoz et al., 2010). And CeA is required for Pavlovian conditioning and PIT, but opposes AA expression (Nader et al., 2001; Choi et al., 2010; Lazaro-Munoz et al., 2010).

The present data may help refine our understanding of the complicated role that CeA plays in aversive conditioned motivation. It is unclear how CeA could mediate both Pavlovian reactions like freezing and facilitate instrumental actions like shuttling (PIT). CeA is known to mediate different response types via cell-type specific projections to different effector regions (Huber et al., 2005; Viviani et al., 2011). CeA has also been shown to mediate both active and passive defensive responses (Gozzi et al., 2010), depending on local circuit activity and, perhaps, regulation by IL-PFC processes (Moscarello and LeDoux, 2013). Our results suggest alternative possibilities: (1) direct CeA projections could relay conditioned threat information to MeA even while outputs mediating Pavlovian freezing are inhibited, or (2) direct projections from LA to MeA could relay conditioned threat information necessary for PIT. Note that CeA has been implicated in aversive PIT only with electrolytic lesions that damage fibers of passage. Our USV data combined with previous findings (Koo et al., 2004; Kim et al., 2013) suggest that LA fibers coursing through CeA to MeA are important for conditioned USVs, and our MeA lesions impaired aversive PIT and conditioned USVs in the same animals, suggesting a common mechanism. These pathway-specific hypotheses could be tested with disconnection lesions, inactivation of CeA, or more precise targeting of projections with optogenetic or chemogenetic techniques (Rogan and Roth, 2011; Aston-Jones and Deisseroth, 2013).

It is important to mention that appetitive PIT procedures have identified both outcome-specific and general forms of conditioned motivation. These complex procedures simultaneously evaluate multiple responses, CS and US combinations in the same animal during the same session (e.g., Corbit and Balleine, 2005). In brief, CSs selectively facilitate responses that are linked to the same US (specific PIT). Thus, when presented with a cue predicting sucrose, rats will selectively increase pressing on a bar that previously earned sucrose over a bar that earned food pellets. However, a CS linked to a third appetitive US (e.g., polycose) that was not available during bar-press training will facilitate responding on both sucrose and food-pellet bars (general PIT). In appetitive studies, specific PIT depends on associations between the CS and specific sensory features of the US, and is BLA-dependent (Corbit and Balleine, 2005). General PIT depends on associations between the CS and "affective" properties of the US, and is CeA-dependent (Hall et al., 2001; Holland and Gallagher, 2003; Corbit and Balleine, 2005). Thus in general PIT, CS presentations are assumed to activate a general arousal-like state that can motivate many instrumental responses linked to USs of the same valence. These complex procedures are more difficult to develop with aversive studies; however, there is reason to believe our procedure produces general PIT. First, although the reinforcement mechanism in AA is unknown, it is clearly different from the reinforcement in Pavlovian threat conditioning. In AA, learning occurs on trials where the US is omitted, whereas in threat conditioning, learning occurs on trials where the US is presented. This mismatch between reinforcers suggests that specific PIT is not possible with our procedure. Second, appetitive studies suggest that a response choice is necessary for specific PIT (Corbit and Balleine, 2005); even when USs match, PIT is CeA-dependent when only one instrumental response is available, as in our procedure (Holland and Gallagher, 2003). Thus, we hypothesize that threats in our simple PIT procedure activate a general defensive state that can motivate any avoidance response available to the animal.

Finally, our combined studies on AA and PIT suggest that there is another distinction in conditioned motivation mechanisms that relates to the role of the CS in the instrumental associative structure. Both AA and PIT rely on conditioned motivation mechanisms to generate AA responding, so why would they depend on such different neural pathways? Early in AA training, the CS (or warning signal) is transformed into a threat by pairing with the US. However, once the AR is learned, the CS functions as a discriminative, or occasion-setting, stimulus that signals when the instrumental contingency is in operation (Ross and LoLordo, 1987; Rescorla, 1990). In our PIT procedure, the PIT CS is never present during AA training and cannot be part of the instrumental memory structure. Thus, our data are consistent with a model where: (1) LA is necessary for threat learning, (2) BA is necessary for signaling when an AR is available to avoid a specific US, and (3) CeA and MeA are necessary for motivation of ARs when threats are uncertain or unlinked to available ARs, through activation of a central defensive state (**Figure 8**).

### **LIMITATIONS**

We chose to use electrolytic lesions to evaluate the role of MeA in learned and innate defensive responses. Electrolytic lesions are often preferred for initial investigations of the necessity of brain regions (Cain and LeDoux, 2008b). There are several reasons for this: (1) they can clearly rule out a necessary role for a brain region, since effective lesions leave no functional brain tissue, (2) compared to chemical lesions, inactivations, or techniques that depend on viral infection, it is easier to control the spatial extent of affected tissue, and (3) it is easy to confirm the manipulation with basic histological techniques. However, electrolytic lesions are permanent and damage fibers of passage (Kim et al., 2013), which can sometimes lead to misleading results if there are compensatory changes in the brain or if fibers of passage in a region, but not cell bodies, are necessary for a particular function. Electrolytic lesions may be most problematic for the interpretation of context freezing deficits, as the optic tract was clearly damaged in most animals. Although rodents likely use all sensory modalities in creating a representation of context, visual cues are clearly important, and these results should be interpreted with caution. Ultimately, it is important to confirm the effects of electrolytic lesions with techniques that are reversible and do not damage fibers of passage. Exciting new techniques also allow for control of neural activity that is cell-type specific, reversible, and even pathway specific (by controlling projections between brain regions) (Rogan and Roth, 2011; Aston-Jones and Deisseroth, 2013). We are currently pursuing such studies to confirm the roles of LA, BA, CeA, MeA, and IL-PFC in threat conditioning, AA, and aversive PIT.

**FIGURE 8 | Working model of amygdala pathways mediating defensive reactions, AA, and aversive PIT**. LA is primarily involved in learning and storing Pavlovian CS–US associations. Once the CS gains affective valence, it can be used by downstream areas to generate wide-ranging defensive behaviors. LA and BA are required for instrumental AA, whereas CeA and MeA are not. CeA is necessary for expressing Pavlovian reactions to imminent threats. MeA, as part of the medial extended amygdala, mediates defensive reactions to uncertain or distant threats and aversive PIT to general threats that are not part of the AA memory associative structure. IL-PFC can regulate amygdala-mediated defensive reactions and facilitate AA performance. Blue lines denote pathways that can promote instrumental action and block defensive motivational states. Red lines denote pathways that can promote defensive reactions and defensive motivational states. The line connecting IL-PFC to MeA is dashed because little is known about the influence of this pathway on defensive reactions and aversive PIT. The line connecting LA to MeA passes through CeA since it is not yet clear whether CeA is necessary for conditioned USVs and PIT or whether fibers passing through CeA relay critical information directly to MeA. LA, lateral amygdala; BA, basal amygdala; CeA, central amygdala; MeA, medial amygdala; IL-PFC, infralimbic prefrontal cortex; USV, 22 kHz ultrasonic vocalizations; PIT, Pavlovian-instrumental transfer; Ctxt, context; CS, conditioned stimulus.

As mentioned above, our PIT procedure cannot differentiate between outcome-specific and general forms of conditioned motivation. Although we are developing procedures that may ultimately address these issues, these procedures are inherently more difficult to develop than appetitive PIT procedures. This is mainly because hungry rats are much more likely to behave actively when presented with multiple food options, whereas rats experiencing multiple threats and aversive USs tend to cease active behavior and freeze. It is important to point out that aversive PIT studies have lagged far behind appetitive PIT studies, and it will take time to develop the ideal procedures. However, our simple PIT procedure is already generating novel and important information about aversive conditioned motivation, as did the early appetitive PIT studies that also used simple procedures (Estes, 1948; Lovibond, 1983).

Lastly, our interpretation of the USV findings assumes that these are conditioned reactions elicited by Pavlovian contextual cues. This is largely because prior studies interpret USVs this way (e.g., Koo et al., 2004), because USVs were elicited in the shock-paired AA context, and because post-shock responses like freezing are known to be conditioned, not unconditioned, reactions (Fanselow, 1986). However, it is possible that USVs represent unconditioned reactions to US presentation in the AA context. Others have reported 22 kHz USV responses to unconditioned threats, including predators (Blanchard et al., 1991), and direct stimulation of pathways believed to relay US information to the amygdala also triggers USVs (Kim et al., 2013). Further, it was not uncommon in our studies that rats began emitting USVs after receiving the first shock during AA training sessions (not upon entering the chamber). However, this alternate interpretation of USVs would not significantly change our conclusions and would only suggest that MeA has a dual role in processing conditioned threats and mediating unconditioned responses to naturally aversive stimuli. This seems likely anyway, given the clear role in aversive PIT and in defensive responses to predator odor cues (e.g., Rosen et al., 2008; Figure S3 in Supplementary Material).

### **CONCLUSION AND CLINICAL IMPLICATIONS**

In conclusion, our studies reveal an essential and selective role for MeA in aversive PIT. They also suggest that MeA is critical for processing uncertain or general threats and generating lower-level "anxiety-like" defensive responses. Although we cannot know what the rat is feeling during these tasks (LeDoux, 2014), it is likely that these forms of threat processing relate to human anxiety disorders. Human anxiety is characterized by defensive reactions to often uncertain threats (Tolin et al., 2003), and AA mechanisms likely relate to both adaptive (LeDoux and Gorman, 2001) and maladaptive coping strategies (McGuire et al., 2012). Aversive PIT demonstrates how threat cues can invigorate, or re-invigorate, AA behavior, even after it is extinguished. In the case of adaptive ARs, PIT mechanisms could contribute to beneficial active coping strategies in resilient individuals. However, in the case of maladaptive ARs, PIT mechanisms could trigger a relapse to pathological behavior even after seemingly successful treatment. Several recent reports demonstrate that aversive PIT occurs in humans and may depend on similar neural pathways (Nadler et al., 2011; Geurts et al., 2013; Lewis et al., 2013). These studies, along with mechanistic studies in rodents, hold promise for discovering novel and improved treatments for human anxiety disorders characterized by impaired or inappropriate avoidance responding.

### **AUTHOR CONTRIBUTIONS**

Ms. Margaret Grace McCue and Dr. Christopher K. Cain designed the experiments. Ms. Margaret Grace McCue conducted all of the surgeries, behavioral testing, and histology for lesion verification. Ms. Margaret Grace McCue, Dr. Christopher K. Cain, and Dr. Joseph E. LeDoux analyzed the results, wrote the manuscript, and approved the final manuscript for submission.

### **ACKNOWLEDGMENTS**

Research supported by an NIMH grant (R21 MH097125) and a NIDA subaward (F6761-01) to Christopher K. Cain, and an NIMH grant (R01 MH38774), an NSF grant (0920153), and a NIDA grant (R01 DA029053) to Joseph E. LeDoux. The authors would like to thank Justin Moscarello and Vincent Campese for helpful discussions and advice, and Jeanny Kim for assistance with data collection.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Journal/10.3389/fnbeh.2014.00329/abstract

### **REFERENCES**


functions of the amygdala. *Trends Neurosci.* 20, 517–523. doi:10.1016/S0166-2236(97)01125-9


contextual conditioned fear in male rodents. *Endocrinology* 155, 2996–3004. doi:10.1210/en.2013-1411

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 July 2014; accepted: 03 September 2014; published online: 18 September 2014.*

*Citation: McCue MG, LeDoux JE and Cain CK (2014) Medial amygdala lesions selectively block aversive Pavlovian–instrumental transfer in rats. Front. Behav. Neurosci. 8:329. doi: 10.3389/fnbeh.2014.00329*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 McCue, LeDoux and Cain. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Enhanced discriminative fear learning of phobia-irrelevant stimuli in spider-fearful individuals

#### **Carina Mosig<sup>1</sup>, Christian J. Merz<sup>2</sup>, Cornelia Mohr<sup>1</sup>, Dirk Adolph<sup>1</sup>, Oliver T. Wolf<sup>2</sup>, Silvia Schneider<sup>1</sup>, Jürgen Margraf<sup>1</sup> and Armin Zlomuzica<sup>1</sup>\***

<sup>1</sup> Mental Health Research and Treatment Center, Ruhr-University Bochum, Bochum, Germany

<sup>2</sup> Department of Cognitive Psychology, Institute of Cognitive Neuroscience, Ruhr-University Bochum, Bochum, Germany

### **Edited by:**

Richard J. Servatius, DVA Medical Center, USA

### **Reviewed by:**

Kevin D. Beck, Rutgers New Jersey Medical School, USA; Seth Davin Norrholm, Emory University School of Medicine, USA

### **\*Correspondence:**

Armin Zlomuzica, Mental Health Research and Treatment Center, Ruhr-University Bochum, Universitätsstr. 150, Bochum 44780, Germany; e-mail: armin.zlomuzica@rub.de

Avoidance is considered a central hallmark of all anxiety disorders. The acquisition and expression of avoidance, which leads to the maintenance and exacerbation of pathological fear, is closely linked to Pavlovian and operant conditioning processes. Changes in conditionability might represent a key feature of all anxiety disorders, but the exact nature of these alterations might vary across different disorders. To date, no information is available on specific changes in conditionability for disorder-irrelevant stimuli in specific phobia (SP). The first aim of this study was to investigate changes in fear acquisition and extinction in spider-fearful individuals as compared to non-fearful participants by using the de novo fear conditioning paradigm. Secondly, we aimed to determine whether differences in the magnitude of context-dependent fear retrieval exist between spider-fearful and non-fearful individuals. Our findings point to an enhanced fear discrimination in spider-fearful individuals as compared to non-fearful individuals at both the physiological and subjective level. The enhanced fear discrimination in spider-fearful individuals was not mediated by increased state anxiety, depression, or stress tension. Spider-fearful individuals displayed no changes in extinction learning and/or fear retrieval. Surprisingly, we found no evidence for context-dependent modulation of fear retrieval in either group. Here, we provide first evidence that spider-fearful individuals show an enhanced discriminative fear learning of phobia-irrelevant (de novo) stimuli. Our findings provide novel insights into the role of fear acquisition and expression for the development and maintenance of maladaptive responses in the course of SP.

**Keywords: differential fear conditioning, anxiety disorders, specific phobia, spider fear, conditionability, extinction, fear renewal, virtual reality**

### **INTRODUCTION**

Patients with anxiety disorders and stressor-related disorders exhibit an increased avoidance of fear-related stimuli and situations. An increased tendency to avoid novel situations might constitute an important risk factor for the development and maintenance of clinical anxiety, as shown in anxiety-vulnerable individuals (e.g., behaviorally inhibited individuals, Fox et al., 2005) and animal models of anxiety vulnerability (Beck et al., 2010). Findings from these studies emphasized the importance of increased conditionability as a functional mechanism contributing to a strong avoidance behavior (Ricart et al., 2011; Myers et al., 2012; Holloway et al., 2014). Conditionability refers to the capacity to acquire new associations between a neutral (conditioned) stimulus (CS) and an aversive (unconditioned) stimulus (UCS) or outcome. Conditionability also comprises the ability to extinguish this association if it becomes invalid (CS-noUCS). Evidence from psychophysiological, behavioral, and imaging studies showed that individuals with high trait anxiety (Caulfield et al., 2013), patients with anxiety disorders (Lissek et al., 2005), as well as traumatized individuals (Milad et al., 2009; Norrholm et al., 2011; Jovanovic et al., 2013; Stevens et al., 2013) show systematic changes in the acquisition and extinction of conditioned fear.

Although a wide variety of methods have been used [see Lissek et al. (2005)], these studies typically assessed conditionability in a differential fear conditioning paradigm. Here, conditioned responses (CR) are operationalized as the difference between responses to the aversively paired CS<sup>+</sup> and the unpaired CS<sup>−</sup>, as measured on the psychophysiological [e.g., skin conductance responses (SCRs), startle amplitudes] and/or subjective level (shock expectancy and subjective valence ratings) (Hermans et al., 2002; Arnaudova et al., 2013).
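The difference-score definition of the CR can be illustrated with a minimal sketch. This is not the scoring pipeline of any cited study: the function name `differential_cr`, the use of simple per-stimulus means, and the example amplitudes are all assumptions made for illustration.

```python
from statistics import mean


def differential_cr(cs_plus_responses, cs_minus_responses):
    """Differential conditioned response: mean CS+ response minus mean CS- response.

    Averaging over trials is an illustrative assumption; the responses can be
    any trial-wise measure (e.g., SCR amplitudes or expectancy ratings).
    """
    return mean(cs_plus_responses) - mean(cs_minus_responses)


# Hypothetical SCR amplitudes in arbitrary units, not data from any study.
scr_cs_plus = [0.5, 0.75, 1.0]
scr_cs_minus = [0.25, 0.25, 0.25]

# A positive value indicates stronger responding to CS+ than to CS-.
print(differential_cr(scr_cs_plus, scr_cs_minus))
```

A value near zero would indicate little discrimination between the two stimuli, whereas larger positive values correspond to stronger differential conditioning.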

Given that the fear-inducing stimuli and situations as well as the associated symptoms vary between different anxiety disorders, the *de novo* fear conditioning paradigm (where participants are conditioned to unfamiliar and disorder-irrelevant stimuli) has been employed to detect alterations in general conditionability in patients with anxiety disorders and stressor-related disorders. A stronger acquisition (Orr et al., 2000; Norrholm et al., 2011) as well as a delayed extinction (Peri et al., 2000; Blechert et al., 2007) was found in patients diagnosed with post-traumatic stress disorder (PTSD) as compared to participants without trauma exposure and to healthy controls, respectively. The delayed extinction as indicated on the psychophysiological level and the level of UCS-expectancy ratings is paralleled by a weaker extinction of conditioned negative valence in PTSD (Blechert et al., 2007). In contrast, patients with panic disorder (PD) showed no differences in CR during acquisition compared to control participants (Grillon et al., 1994; Michael et al., 2007), but displayed larger SCRs to CS<sup>+</sup> stimuli during extinction (Michael et al., 2007).

Overall, these findings imply that fear learning as measured on the behavioral, psychophysiological, and neuronal level is specifically altered in anxiety and stressor-related disorders and might represent a key feature of these disorders. However, there is also evidence for clear differences in fear conditioning between PD (Grillon et al., 1994; Michael et al., 2007) and PTSD patients (Orr et al., 2000; Blechert et al., 2007; Milad et al., 2009; Norrholm et al., 2011; Jovanovic et al., 2013). This suggests that changes in the ability to acquire and extinguish conditioned fear might be disorder-specific and might resemble some core symptomatic features characteristic of a certain disorder. To allow for a more general conclusion, however, comparison to yet another disorder group would be valuable. Given that the pathogenesis of SP is likely to involve cued fear conditioning, individuals with a SP would be an appropriate comparison group. SP is characterized by exaggerated fear of specific objects or situations, and cued conditioning is thought to play a central role in the etiology of this condition (e.g., Grillon, 2002). Presently, only little information is available about possible changes in general fear conditionability for *de novo* stimuli in spider-fearful individuals (Schweckendiek et al., 2011). The investigation of a group that shows a specific fear of spiders might provide valuable information on the integrity of the fear conditioning system in individuals with SP that would allow predictions on the speed of fear extinction through exposure therapy. It would also allow for comparison on differences in the magnitude and characteristics of fear learning between different forms of anxiety. For instance, the symptomatology of individuals showing a cue-specific fear (e.g., spider fear) is quite different relative to the symptomatology of individuals suffering from PTSD or PD. 
Besides such differences in symptomatology, there are also substantial differences between PTSD and PD on the one hand and SP on the other hand with respect to psychophysiological (Cuthbert et al., 2003; Lang and McTeague, 2009) and neuronal reactivity (Rauch et al., 2003; Etkin and Wager, 2007) during the processing of neutral and negative stimuli. Furthermore, in contrast to PD and PTSD, SP is associated with lower levels of anxiety and depressive symptoms (Cook et al., 1988; Cuthbert et al., 2003). Acute stress exposure (Merz et al., 2013), higher levels of tension-stress (Arnaudova et al., 2013), as well as increased anxiety levels (Dibbets et al., 2014) are linked to deficits in discriminatory fear learning. This poses another potential problem with the interpretation of previous findings on fear conditioning in clinical anxiety samples (e.g., Blechert et al., 2007; Michael et al., 2007) because differences in conditionability might be confounded by comorbid depressive symptoms and/or differences in stress levels.

In recent years, the examination of contextual effects on fear conditioning processes has become a matter of extensive clinical research because findings from these studies bear the potential to optimize exposure-based therapies in anxiety disorders (Craske et al., 2014). With respect to the treatment of anxiety disorders, the extinction of a learned association or CR leading to maladaptive behavior is equally important as learning new behavior-outcome associations, which support appropriate or "normal" behavior. Therefore, exposure-based therapy seems to be primarily based on fear extinction learning (Michael et al., 2009; Vervliet et al., 2013; Craske et al., 2014). However, extinction is a complex multi-level process. Conditioned fear responses can reoccur after extinction learning over time (spontaneous fear recovery) or when an excitatory CS is presented in an unfamiliar context (fear renewal) (Bouton, 2004, 2006). Renewal after extinction learning in experimental settings corresponds to one form of relapse after exposure therapy (Rachman, 1989; Craske et al., 2014), representing a serious problem in psychotherapy (Laborda et al., 2011). Despite the high clinical relevance, significant demonstrations of fear renewal after successful exposure therapy have been scarce so far and have yielded conflicting results (e.g., Mineka et al., 1999; Mystkowski et al., 2002, 2006). Our present knowledge of the underlying behavioral and neurobiological mechanisms governing context-dependent conditioning is primarily based on findings from animal studies and/or studies with healthy human participants (Bouton and Bolles, 1979; Bouton, 1988, 1991, 1994; Bouton and Nelson, 1998; Milad et al., 2005). To the best of our knowledge, however, there is a lack of studies assessing context-dependent fear conditioning in patients with anxiety disorders or in individuals with high levels of trait anxiety.

A promising tool for the study of contextual influences on fear conditioning is virtual reality (VR) technology (Grillon et al., 2006; Alvarez et al., 2007; Huff et al., 2011; Dunsmoor et al., 2014). The VR approach allows for systematic manipulation of context conditions and is more likely to induce a strong fear renewal since participants are provided with multisensory input in an experimental setup that more closely corresponds to real-world experiences (Huff et al., 2011). Thus, besides high ecological validity, VR techniques offer the possibility to conduct translational research on contextual effects during fear conditioning. For instance, VR environments to a great extent resemble physical multisensory contexts implemented in animal studies, as participants, in a manner analogous to rodent exploratory behavior, are engaged in the exploration of the VR environment (Huff et al., 2011). This is especially important with regard to the cross-species translational approaches examining context-dependent fear conditioning in animals and humans (Soliman et al., 2010; Haaker et al., 2013).

The present study sought to examine whether spider-fearful individuals would show alterations in the acquisition and extinction of conditioned fear. These findings could help to disentangle whether possible alterations in fear conditioning processes in participants with a specific fear of spiders are different relative to findings obtained in PTSD (Blechert et al., 2007) and PD (Michael et al., 2007). To allow for some comparability across studies, we examined differential fear conditioning in spider-fearful participants by using a modified version of the recently used differential fear conditioning paradigm (Blechert et al., 2007, 2008; Michael et al., 2007). Our paradigm simultaneously assesses the CR on the autonomic (SCRs), cognitive (UCS-expectancy ratings), and affective (valence ratings) levels (Hermans et al., 2002; Blechert et al., 2007, 2008; Michael et al., 2007).

Clear differences in the amount of comorbid depression and anxiety symptoms exist across different anxiety and stressor-related disorders (Cook et al., 1988; Cuthbert et al., 2003), which might influence discriminative fear learning processes (Otto et al., 2007; Gazendam and Kindt, 2012; Arnaudova et al., 2013). Therefore, we used the Depression Anxiety Stress Scales (DASS) to control for possible effects of negative emotional states, such as anxiety, stress, and depression, on fear conditioning in spider-fearful individuals. The DASS has recently been shown to provide valuable information on the link between negative emotional states and inter-individual variability in discriminative fear learning [see Arnaudova et al. (2013)].

Given that extinction is a highly context-dependent process, another aim of this study was to determine whether spider-fearful individuals show differences in the context-dependent re-emergence of fear responses as compared to non-fearful individuals. We used VR environments as external contexts, as has been done previously (e.g., Alvarez et al., 2007; Huff et al., 2011; Dunsmoor et al., 2014), and assessed context-dependent retrieval of extinguished CR at the subjective (expectancy and valence ratings of CS) and psychophysiological level (SCRs).

### **MATERIALS AND METHODS**

### **PARTICIPANTS**

Individuals with a specific fear of spiders and non-fearful individuals were recruited to participate in a study dealing with "clinical implications of spider fear". Recruiting was performed via bulletin board notices on the campus of the Ruhr-University Bochum (Germany) and by postings in social media networks. All participants were further screened using the Fear of Spiders Questionnaire [FSQ; Szymanski and O'Donohue, 1995; German version by Rinck et al. (2002)]. Only participants who explicitly reported a moderate to severe specific fear of spiders on the FSQ [cut-off score >15, according to Cochrane et al. (2008)] were assigned to the spider-fearful group. Individuals who explicitly reported to have no fear of spiders and in the FSQ scored below the cut-off were assigned to the non-fearful group. Exclusion criteria for both groups included a severe acute or chronic disease, current pharmacological or behavioral treatment for mental disease, drug/alcohol abuse or dependence, or other use of medications.

Three participants were excluded from data analyses due to technical errors during the experimental procedure. Our final sample consisted of 43 participants: 25 spider-fearful participants (mean age of 24.1, SD = 5.8) and 18 non-fearful individuals (mean age of 23.4, SD = 2.7), with a mean FSQ score of 61.1 (SD = 21.1) and 2.8 (SD = 3.2), respectively (see **Table 1**). All participants provided written informed consent. The study was approved by the local ethics committee of the Ruhr-University Bochum and conducted according to the guidelines of the Declaration of Helsinki. Each participant received a payment of €20 as reimbursement.

### **EXPERIMENTAL DESIGN**

We used an adapted version of the differential fear conditioning paradigm previously developed by Blechert et al. (2007). In particular, differential fear conditioning was assessed by using a set of different dependent measures including SCRs, as well as affective (valence ratings) and cognitive (UCS-expectancy ratings) responses [see Blechert et al. (2007)].

DASS, depression anxiety stress scale; FSQ, fear of spiders questionnaire; SPQ, spider phobia questionnaire; UCS, unconditioned stimulus.

\*Groups differed from each other in post hoc tests (p < 0.01); \*\*Groups differed from each other in post hoc tests (p < 0.001); T test for independent groups.

A high-frequency tone (300 Hz) and a low-frequency tone (135 Hz) served as CS<sup>+</sup> and CS<sup>−</sup>. CSs were counterbalanced and presented via headphones (60 dB). The presentation of the CS<sup>+</sup> lasted for 8 s and co-terminated with the UCS. The UCS was a mild electrical stimulation applied to the skin of the lower arm for a duration of 500 ms. The CS<sup>−</sup> was never paired with the UCS. The conditioning task consisted of a habituation, an acquisition, an extinction, and a retrieval phase (the latter in both the former acquisition and extinction contexts). During all phases, the sequence of CSs was pseudorandom, subject to the constraint that no more than two identical CSs could occur consecutively. The inter-stimulus interval (ISI) was set randomly at 18–22 s.
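The trial-ordering constraint described above can be illustrated with a short sketch. This is not the authors' stimulus-control script (the study used Presentation software); it is a minimal Python illustration that assumes rejection sampling for the ordering and a uniform draw for the 18–22 s intervals, neither of which is specified in the text. The function names are hypothetical.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = None
    for item in seq:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


def make_cs_sequence(n_plus, n_minus, max_run=2, rng=None):
    """Shuffle CS+/CS- trials until no CS type repeats more than max_run times.

    Rejection sampling is an assumption; the source only states the constraint.
    """
    rng = rng if rng is not None else random.Random()
    # Guard against trial counts for which the constraint can never hold.
    if max(n_plus, n_minus) > max_run * (min(n_plus, n_minus) + 1):
        raise ValueError("run-length constraint cannot be satisfied")
    trials = ["CS+"] * n_plus + ["CS-"] * n_minus
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)


def draw_intervals(n_trials, low=18.0, high=22.0, rng=None):
    """Draw one interval per trial; the uniform distribution is an assumption."""
    rng = rng if rng is not None else random.Random()
    return [rng.uniform(low, high) for _ in range(n_trials)]
```

For the acquisition phase, `make_cs_sequence(10, 10)` would yield one admissible ordering of the 20 trials, and `draw_intervals(20)` the accompanying intervals in seconds.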

VR software was used to examine the effects of contextual change during the phases of fear acquisition and extinction. After habituation, each participant was subjected to the entire conditioning procedure within a VR-based format. We used an AB (AB) renewal setup with a within-subject design [according to Alvarez et al. (2007)]. Each participant experienced fear acquisition in context A, but extinction was conducted in context B. Subsequently, participants were re-exposed to contexts A and B for a retrieval test. The order of presentation of context A and context B was matched across the participants. Context presentation during the acquisition and extinction phases was counterbalanced across participants and groups (for half of the participants, context A served as the acquisition context and context B as the extinction context, and vice versa for the other half). The order of context presentation during the fear retrieval test was also counterbalanced (i.e., half of the participants were returned to context A first and then entered context B, while for the other half the context order was reversed).
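The counterbalancing scheme above crosses two binary factors: which VR environment hosts acquisition, and which context is visited first at retrieval. The sketch below shows one way such an assignment could be rotated within each group. The rotation rule, the function name, and the environment labels ("apartment", "cafeteria") are illustrative assumptions; the article does not report how participants were allocated to cells.

```python
from itertools import product

# Hypothetical labels for the two VR environments.
ENVIRONMENTS = ("apartment", "cafeteria")
# Order in which contexts A and B are revisited during the retrieval test.
RETRIEVAL_ORDERS = (("A", "B"), ("B", "A"))


def assign_condition(index_within_group):
    """Rotate participants through the 2 x 2 counterbalancing cells.

    index_within_group counts participants separately in each group
    (0, 1, 2, ...), so that both groups fill the cells evenly.
    """
    cells = list(product(ENVIRONMENTS, RETRIEVAL_ORDERS))
    acquisition_env, retrieval_order = cells[index_within_group % len(cells)]
    # The remaining environment hosts extinction (context B).
    extinction_env = ENVIRONMENTS[1 - ENVIRONMENTS.index(acquisition_env)]
    return {
        "context_A": acquisition_env,
        "context_B": extinction_env,
        "retrieval_order": retrieval_order,
    }
```

With group sizes that are not multiples of four (here 25 and 18), the cells can only be approximately balanced.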

Max Payne software was used to create VR contexts (see Cyberpsychology Lab, University of Quebec, Outaouais, http://w3.uqo.ca/cyberpsy/en/index\_en.htm). The VR environment was presented with a 3D head-mounted display (Z800, eMagin, USA). During the conditioning procedure, two different contexts were presented while the CSs were delivered via headphones (see **Figure 1**). The fear conditioning experiment consisted of three sessions with a break of 15 min between sessions. The first session consisted of a habituation phase and the acquisition phase and lasted about 15 min. Habituation served to reduce orienting responses to the CSs and to allow participants to acclimate to the experimental environment. During habituation, two CS<sup>+</sup> and two CS<sup>−</sup> were presented while the head-mounted display depicted only a black screen. During acquisition, a total of 10 CS<sup>+</sup> and 10 CS<sup>−</sup> were presented in context A. Six out of the 10 CS<sup>+</sup> were paired with the UCS. In the second session, participants underwent extinction in context B. Both CSs were presented eight times each, but were never paired with the UCS. The second session lasted about 10 min. In the third session, the fear retrieval test was run in both context A and context B. In each context, 3 CS<sup>+</sup> and 3 CS<sup>−</sup> were presented. The UCS was not administered during the fear retrieval phase.

**differential fear conditioning procedure**. VR software was used for the operationalization of external context change during the phases of fear acquisition and extinction. Context 1 featured an apartment and context 2 showed a cafeteria. Participants were instructed to freely explore the virtual context. The order of context presentation was matched across participants and counterbalanced during fear retrieval in contexts A (ret A) and B (ret B).

### **APPARATUS AND PHYSIOLOGICAL RECORDINGS**

The experiment was conducted in a sound-attenuated room electrically connected to an adjacent control room where the experimental apparatus was stationed. Experimenter and participant were able to communicate via headphones and microphone. A constant current electrical stimulator delivered the UCS via Ag/AgCl electrodes placed on the left lower arm of the participant. SCRs were measured via 5-mm inner diameter Ag/AgCl electrodes that were filled with non-hydrating electrode paste and attached to the distal phalanges of the index and middle finger of the non-dominant hand. Stimulus delivery was controlled with Presentation software (Neurobehavioral Systems, USA). Physiological data were obtained in a continuous mode using a 16-bit Brain Amp ExG amplifier and were analyzed with Brain Vision Recorder Software (Brain Products, Gilching, Germany).

### **ASSESSMENTS**

### **Questionnaires**

Differences in anxiety, stress, and depression levels between spider-fearful and non-fearful participants were assessed with the DASS (21-item version; Lovibond and Lovibond, 1995). The DASS-21 comprises three 7-item self-report scales (depression, anxiety, stress), measuring acute symptoms of depression, anxiety, and stress on a 4-point scale (0 = did not apply to me at all, 3 = applied to me very much). Sum scores for each scale as well as a total sum score were calculated for each participant. The DASS-21 has previously been associated with very good reliability estimates (Antony et al., 1998; Clara et al., 2001). Internal consistencies (Cronbach's alpha) were in the good to excellent range: 0.88 for the depression scale, 0.82 for the anxiety scale, 0.90 for the stress scale, and 0.93 for the total scale (Henry and Crawford, 2005). Convergent and discriminant validity was good when compared with other validated measures of depression and anxiety (e.g., Hospital Anxiety and Depression Scale, Zigmond and Snaith, 1983; Personal Disturbance Scale, Bedford and Foulds, 1978; Henry and Crawford, 2005).

The FSQ [Szymanski and O'Donohue, 1995; German version by Rinck et al. (2002)] consists of 18 items depicting spider-fear-relevant statements. Agreement with each statement is rated on a 7-point scale (0 = does not apply to me at all, 6 = applies to me very much). A sum score was calculated for each participant. Internal consistency (Cronbach's alpha) and retest-reliability of the German version of the FSQ were excellent: 0.96 and 0.95, respectively (Rinck et al., 2002).
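The FSQ scoring and the group assignment described in the participant section can be sketched as follows. The item count, rating range, and cut-off (>15) come from the text; the function names, the handling of a score of exactly 15, and the `reports_no_fear` flag are illustrative assumptions.

```python
FSQ_ITEMS = 18      # number of items, as stated in the text
FSQ_MAX_RATING = 6  # ratings range from 0 to 6
FSQ_CUTOFF = 15     # scores above this value indicate spider fear


def fsq_sum(ratings):
    """Sum score of one completed FSQ (possible range 0 to 108)."""
    if len(ratings) != FSQ_ITEMS:
        raise ValueError("expected 18 item ratings")
    if any(r < 0 or r > FSQ_MAX_RATING for r in ratings):
        raise ValueError("ratings must lie between 0 and 6")
    return sum(ratings)


def assign_group(score, reports_no_fear):
    """Return the group label, or None if the participant fits neither group.

    Treating a score of exactly 15 as "below the cut-off" is an assumption.
    """
    if score > FSQ_CUTOFF:
        return "spider-fearful"
    if reports_no_fear:
        return "non-fearful"
    return None
```

For example, a participant who rates every item with 4 obtains a sum score of 72 and would fall into the spider-fearful group.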

In addition to the FSQ, the Spider Phobia Questionnaire [SPQ; Watts and Sharrock, 1984, German version by Rinck et al. (2002)] was administered to provide further information about the magnitude of spider-fear-related cognitions and avoidance behavior in spider-fearful participants. The SPQ contains 43 items describing spider-relevant situations as well as possible reactions and attitudes toward spiders. Each item is either confirmed (correct) or refused (incorrect) by the participant. A sum score was calculated. Internal consistency (Cronbach's alpha) of the German version of the SPQ was 0.84 and retest-reliability was 0.94 (Rinck et al., 2002). Muris and Merckelbach (1996) tested both the FSQ and the SPQ and confirmed adequate reliability and validity. Both questionnaires could discriminate phobics from non-phobics, were sensitive to therapeutic change after cognitive behavior therapy, and correlated significantly with other subjective and behavioral indices of spider fear. As the FSQ and the SPQ tap somewhat different aspects of spider fear, it is recommended to administer both questionnaires in order to get a clearer picture of the nature of spider fear (Szymanski and O'Donohue, 1995; Muris and Merckelbach, 1996).

### **UCS expectancy and CS valence ratings**

At the end of each phase of the differential fear conditioning paradigm, ratings of CS valence and UCS expectancy were obtained. For this purpose, each CS type was presented once again via headphones followed by a standardized, pre-recorded rating instruction that was likewise presented via headphones. Pursuant to the instruction, participants had to evaluate the valence of the particular CS ("How do you feel when you hear this tone?") on a 5-point vertical visual analog scale ranging from −2 = very uncomfortable to +2 = very comfortable (0 = neutral). UCS expectancy ("Do you think that this tone is paired with an electrical stimulation?") was rated from −2 = highly unlikely to +2 = most likely (0 = equiprobable).

### **Skin conductance responses**

Skin conductance responses were obtained by subtracting the average SC level (SCL) during the 1000 ms preceding CS onset (baseline) from the maximum SCL recorded during the last 7 s of CS presentation. SCR data were *z*-transformed to obtain a normal distribution.
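The SCR definition above (peak level in the last 7 s of the 8-s CS minus the mean level in the 1000 ms before CS onset) can be written out as a short sketch. The sampling rate is not reported in this section, so it is passed in as a parameter; the function names are hypothetical, and applying the *z*-transformation to one list of values (e.g., all responses of one participant) is an assumption about the unit of standardization.

```python
from statistics import mean, stdev


def scr_amplitude(scl, cs_onset_s, sampling_rate_hz,
                  cs_duration_s=8.0, baseline_s=1.0, window_s=7.0):
    """Peak SCL in the last window_s of the CS minus the pre-CS baseline mean.

    scl is a list of skin conductance samples; cs_onset_s is the CS onset
    in seconds from the start of the recording.
    """
    onset = round(cs_onset_s * sampling_rate_hz)
    baseline_start = onset - round(baseline_s * sampling_rate_hz)
    window_start = onset + round((cs_duration_s - window_s) * sampling_rate_hz)
    window_end = onset + round(cs_duration_s * sampling_rate_hz)
    if baseline_start < 0 or window_end > len(scl):
        raise ValueError("recording does not cover baseline and CS window")
    baseline = mean(scl[baseline_start:onset])
    return max(scl[window_start:window_end]) - baseline


def z_transform(values):
    """Rescale values to a mean of 0 and a standard deviation of 1."""
    m = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("values have zero variance")
    return [(v - m) / sd for v in values]
```

Note that this sketch keeps negative differences as they are; whether such values were set to zero in the original analysis is not stated.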

### **PROCEDURE**

Upon arrival, each participant was informed about the content and goal of the experiment. In the laboratory room, participants were seated in a comfortable chair and electrodes for the measurement of SCRs as well as for the application of the electric current were attached. Together with the experimenter, each participant individually adjusted the intensity of the electric stimulation to a level they subjectively perceived as "uncomfortable but not painful" [adapted from Blechert et al. (2007)]. The experimenter explained that participants would be exposed to virtual environments via the head-mounted display while tones of different frequencies would be presented via the headphones and an electric current would be administered once in a while. Finally, the experimenter introduced and explained the vertical visual analog scale for the CS valence and UCS-expectancy rating procedure. Rating instructions were repeated and the ratings were trained with each participant several times to ensure that the rating procedure was fully understood. Thereafter, each participant was equipped with the head-mounted display and headphones, the room light was switched off, and the experimenter left the room.

The experimenter controlled and monitored the experiment from the control room. CS valence and UCS-expectancy ratings were sampled online by the experimenter. After the end of the experiment, all electrodes were removed. The participants filled out the above-mentioned self-report questionnaires and were fully debriefed.

### **STATISTICAL ANALYSES**

Statistical comparisons were conducted separately for each phase (habituation, acquisition, extinction, fear retrieval) using IBM SPSS Statistics for Windows 22.0 via analyses of variance (ANOVA). For valence and UCS-expectancy ratings, the between-subjects factor group (spider-fearful vs. non-fearful) as well as the within-subjects factor CS (CS<sup>+</sup> vs. CS<sup>−</sup>) were entered. SCRs were subjected to a group × CS × trial ANOVA, separately for the four phases. For all dependent measures, the within-subjects factor context (context A vs. context B) was added in the fear retrieval phase.
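For the rating data, the design is a 2 (group) × 2 (CS) mixed ANOVA. In such a design the CS × group interaction is equivalent to an independent-samples *t* test (pooled variance) on each participant's CS<sup>+</sup> minus CS<sup>−</sup> difference score, with *F*(1, *N* − 2) = *t*². The sketch below illustrates this equivalence in plain Python; it is not the SPSS procedure used by the authors, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def interaction_f(diffs_group1, diffs_group2):
    """F and error df for the CS x group interaction in a 2 x 2 mixed design.

    Each argument holds one CS+ minus CS- difference score per participant.
    The F value equals the squared pooled-variance t statistic.
    """
    n1, n2 = len(diffs_group1), len(diffs_group2)
    df_error = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(diffs_group1)
                  + (n2 - 1) * variance(diffs_group2)) / df_error
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(diffs_group1) - mean(diffs_group2)) / standard_error
    return t * t, df_error
```

With 25 spider-fearful and 18 non-fearful participants, the error degrees of freedom are 25 + 18 − 2 = 41, matching the *F*(1,41) values reported for the rating data.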

Greenhouse-Geisser correction was applied where indicated; the corresponding (corrected) degrees of freedom are given in parentheses. The statistical significance level was set to α = 0.05. Significant main or interaction effects were followed by appropriate *post hoc* tests.

### **RESULTS**

No significant differences in age and other important control variables, such as depression, stress, and anxiety levels, were evident between spider-fearful and non-fearful participants (see **Table 1**).

### **VALENCE RATINGS**

After habituation, no differences were found in valence ratings between the CS<sup>+</sup> and the CS<sup>−</sup> or between groups. After acquisition, a significant CS<sup>+</sup>/CS<sup>−</sup> differentiation emerged [main effect CS; *F*(1,41) = 22.91; *p* < 0.001], which differed between groups [CS × group interaction; *F*(1,41) = 4.95; *p* = 0.032]: the CS<sup>+</sup> was rated more negatively than the CS<sup>−</sup> in the spider-fearful group [*t*(24) = 5.36; *p* < 0.001], but not in the non-fearful group. After extinction, no significant effects were observed. However, when both groups were tested separately, the spider-fearful group still rated the CS<sup>+</sup> more negatively than the CS<sup>−</sup> [*t*(24) = 2.15; *p* = 0.041]; this differentiation was not seen in the non-fearful group. During the fear retrieval phase, the ANOVA with the factors CS, context, and group revealed only trends toward a main effect of the CS [*F*(1,41) = 3.81; *p* = 0.058] and toward a CS × group interaction [*F*(1,41) = 3.36; *p* = 0.074]. The CS<sup>+</sup> was rated more negatively than the CS<sup>−</sup>; this was the case for the spider-fearful group [*t*(24) = 6.77; *p* = 0.016], but not for the non-fearful group.

Taken together, spider-fearful participants reported a more negative valence toward the CS<sup>+</sup> as compared to the CS<sup>−</sup> after the acquisition, extinction and during the fear retrieval phase (cf. **Figure 2**).

### **UCS-EXPECTANCY RATINGS**

The CS<sup>+</sup> and CS<sup>−</sup> were not rated differently with regard to UCS expectancy after habituation. All participants rated the CS<sup>+</sup> as significantly more likely to be followed by the UCS than the CS<sup>−</sup> after acquisition [main effect CS; *F*(1,41) = 118.49; *p* < 0.001], after extinction [main effect CS; *F*(1,41) = 13.97; *p* = 0.001], and during the fear retrieval phase [main effect CS; *F*(1,41) = 19.31; *p* < 0.001]. In general, the spider-fearful group stated a higher UCS expectancy during fear retrieval as compared to the non-fearful group [main effect group: *F*(1,41) = 4.41; *p* = 0.042]. No other main or interaction effects were observed.

In conclusion, spider-fearful participants only differed from non-fearful participants in their reported UCS expectancy during fear retrieval (cf. **Figure 3**).

### **SKIN CONDUCTANCE RESPONSES**

During habituation, a main effect of trial was found [*F*(1,41) = 15.87; *p* < 0.001], indicating a decrease in SCRs over the two trials. During acquisition, the main effect of trial persisted over ten trials [*F*(6.6,268.6) = 2.10; *p* = 0.048]. Importantly, fear acquisition was successful as indicated by a significant differentiation between the CS<sup>+</sup> and the CS<sup>−</sup> [*F*(1,41) = 28.31; *p* < 0.001]. Furthermore, groups differed in fear learning [CS × group interaction: *F*(1,41) = 7.61; *p* = 0.009], which was driven by significantly higher SCRs toward the CS<sup>+</sup> as compared to the CS<sup>−</sup> in spider-fearful participants [*F*(1,24) = 42.55; *p* < 0.001], but not in non-fearful persons. Additional analyses of the CS<sup>+</sup> and CS<sup>−</sup> trials separately showed that the spider-fearful group displayed marginally enhanced responding to the CS<sup>+</sup> [*F*(1,41) = 3.67; *p* = 0.062] and significantly attenuated responding to the CS<sup>−</sup> [*F*(1,41) = 6.14; *p* = 0.017] compared to the non-fearful group.

A main effect of trial occurred during extinction [*F*(5.1,210.1) = 9.90; *p* < 0.001] and fear retrieval [*F*(1.6,65.5) = 25.60; *p* < 0.001]. No further main or interaction effects were observed.

In conclusion, spider-fearful participants displayed higher conditioned SCRs than non-fearful participants during acquisition only, and not during the other conditioning phases (cf. **Figure 4**).

### **DISCUSSION**

Alterations in fear acquisition and extinction have been found in patients with PD (Michael et al., 2007) and PTSD (Blechert et al., 2007). Some of these alterations seem to reflect a general deficit that is shared by both PD and PTSD, whereas other deficits might be disorder-specific. The *de novo* paradigm seems well suited to compare general conditionability across disorders. While PD and PTSD are characterized by high trait anxiety and comorbid depressive symptoms, SP is marked by fear, rather than anxiety (Grillon, 2002), of a specific object or situation, and thus seems a valuable comparison group for the investigation of shared and specific factors of fear learning in anxiety and stressor-related disorders.

In the present study, we examined changes in fear conditionability and context-dependent fear renewal in spider-fearful individuals. We found an enhanced aversive discrimination learning for *de novo* stimuli in spider-fearful individuals as evidenced on the level of electrodermal responses. This was accompanied by a more negative evaluation of the CS<sup>+</sup> as compared to the CS<sup>−</sup> (at the subjective valence level) in spider-fearful individuals throughout the whole conditioning procedure, i.e., the acquisition, the extinction, and the fear retrieval phase. No specific difference in extinction learning was found between spider-fearful and non-fearful participants.

Our results are in partial accordance with the propositions made by previous etiological models of anxiety disorders (Öhman and Mineka, 2001; Lissek et al., 2005). In the present study, we could demonstrate an increased capability of spider-fearful individuals to detect and respond to stimuli that signal aversive consequences. Although we found a more negative evaluation of the CS<sup>+</sup> compared to the CS<sup>−</sup> in spider-fearful individuals, the SCR data suggest that superior aversive discrimination learning in spider-fearful individuals was presumably not mediated by increased physiological responding to fear-eliciting stimuli. Hence, our findings do not correspond to similar investigations in other anxiety and stressor-related disorders (e.g., Blechert et al., 2007; Michael et al., 2007; Milad et al., 2009; Norrholm et al., 2011; Jovanovic et al., 2013). For instance, we did not find evidence for increased SCRs to the CS<sup>+</sup> or CS<sup>−</sup> in spider-fearful individuals relative to non-fearful individuals. Thus, while spider-fearful individuals rated the CS<sup>+</sup> as more negative on the subjective valence level, the physiological expression of fear (at the level of SCR) in the presence of the CS<sup>+</sup> was not affected in these individuals. Rather, the spider-fearful group seems to exhibit a lower threshold for the detection of cues that signal aversive consequences and, as a result, displays enhanced fear discrimination learning.

The mechanisms underlying the enhanced fear discrimination for *de novo* fear stimuli in spider-fearful individuals remain elusive. Evidence from neurobiological studies in animals and humans suggests that the amygdala represents the most critical structure involved in the acquisition and expression of conditioned fear. Selective lesions to the amygdala impair both cued and contextual fear conditioning in animals (LeDoux, 2000). Similarly, amygdala activity increases during the acquisition relative to the extinction phase (Phelps et al., 2001; Knight et al., 2004), and there is a strong correlation between amygdala reactivity and conditioned SCRs during fear acquisition (Cheng et al., 2003; Phelps et al., 2004) in humans. The amygdala is also involved in the fast detection of potentially harmful stimuli (LeDoux, 2000; Öhman and Mineka, 2001), which might represent a highly adaptive process. Spider-phobics detect and respond to phobia-relevant stimuli more rapidly (Globisch et al., 1999; Öhman et al., 2001), which might be mediated by an increased activation of the amygdalar network after confrontation with fear-related material (Dilger et al., 2003; Larson et al., 2006). This is in line with our findings on differential responding in spider-fearful individuals during the fear acquisition phase. In particular, non-fearful participants show a slight habituation of SCR during the fear acquisition phase, which is compatible with findings on habituation of amygdala activation during conditioning (LaBar et al., 1998; Phelps et al., 2001; Wright et al., 2001). In contrast, spider-fearful individuals continue to show differential CS<sup>+</sup>/CS<sup>−</sup> responding throughout the entire acquisition phase. This implies an exacerbated amygdalar reactivity in spider-fearful individuals associated with both the rapid detection of threatening cues and a lack of habituation when repeatedly confronted with these cues. Such deficient habituation of fear responses might be maladaptive in that pathological anxiety is maintained and further reinforced by the avoidance of cues that signal aversive consequences (Globisch et al., 1999; Öhman et al., 2001). Interestingly, it has been reported that the hyperactivity of the amygdala that is observed in patients with SP can be normalized after successful exposure therapy (Goossens et al., 2007).

The present findings extend our knowledge on specific differences in fear acquisition and extinction between different anxiety and stressor-related disorders. Unlike previous studies in PTSD and PD, which utilized similar methodological approaches, we did not find clear evidence for changes in fear extinction learning in spider-fearful individuals. For instance, stronger fear acquisition was found in PTSD (Orr et al., 2000), but not in patients with PD as compared to control participants (Grillon et al., 1994; Michael et al., 2007). Furthermore, PTSD but not PD patients (Michael et al., 2007) exhibited enhanced responding to the CS<sup>−</sup> during extinction (Grillon and Morgan, 1999; Peri et al., 2000; Blechert et al., 2007, 2008). This finding is interpreted as a general deficit in the ability to extract information from safety cues (Davis et al., 2000) and might represent a central feature of the PTSD psychopathology (Ehlers and Clark, 2000). Our results, by contrast, rather suggest that SP might be primarily characterized by an increased ability to discriminate between fear-related and fear-unrelated cues, which reflects the core symptomatology of SP. Namely, fear associated with specific phobias (SPs) is usually restricted to the phobic stimuli, and individuals with SP exhibit an increased bias for identifying threatening material (Miltner et al., 2004). These findings are in accordance with the propositions made by "vigilance–avoidance" models of anxiety (Amir and Foa, 2001). In SP, the quick detection of aversive cues that signal threat, which is presumably devoid of cognitive control, might lead to an automatic initiation of avoidance behavior, which in turn hampers the habituation to these cues.

It should be noted, however, that the generalization of our findings warrants further replication with other measures of fear (e.g., fear-potentiated startle, neuroimaging, attention bias) to rule out the possibility that the effects observed here are related to the specific methodology used. Nevertheless, our results imply that although SP, PTSD, and PD might share some common features (e.g., increased amygdalar activity; Larson et al., 2006; Etkin and Wager, 2007; Fani et al., 2012; Stevens et al., 2013), which are highly related to the symptomatology and psychopathology of these disorders, it remains at least questionable whether deficits in extinction learning represent a common biomarker of all anxiety and stressor-related disorders. Longitudinal studies could help to gain more insight into the etiological role of fear learning in different anxiety and stressor-related disorders (e.g., Lommen et al., 2013).

Despite high clinical relevance, only one study so far has assessed changes in conditionability in spider phobia (Schweckendiek et al., 2011). Schweckendiek et al. (2011) reported that, compared to healthy controls, spider-phobic patients show enhanced neuronal activations within the fear network (e.g., medial prefrontal cortex, amygdala) in response to CSs that were paired with phobia-related pictures (UCS). Moreover, spider-phobic participants displayed higher amygdala activation in response to the phobia-related CS than to the non-phobia-related CS. The results on differences in conditionability for non-phobia-related CSs between patients and healthy controls, however, were less clear. In fact, none of the groups showed differential SCRs with respect to CSs that were paired with non-phobia-relevant but otherwise aversive UCSs (pictures of mutilations). The authors stated that this might be attributed to the use of pictorial stimuli as UCS instead of electrical stimulation. Hence, the present findings can be considered the first evidence that, in addition to an enhanced conditionability on the neural level for phobia-relevant stimuli [see Schweckendiek et al. (2011)], spider-fearful individuals also show enhanced fear discrimination for phobia-irrelevant CSs. Our findings were presumably not mediated by an increased trait anxiety, concomitant increases in state depression, or changes in stress tension, since we did not find differences in these control variables between spider-fearful and non-fearful participants. Thus, consistent with previous findings, changes in cue-related anxiety responses rather than generally increased levels of anxiety (Otto et al., 2007) might be responsible for inter-individual differences in conditionability.

While spider-fearful individuals continued to rate the CS<sup>+</sup> valence as negative during the fear retrieval phase, we did not observe context-induced fear renewal after extinction learning. This finding was rather unexpected and several methodological factors might account for it. In the present study, we developed a modified version of an ABA fear-conditioning task and used a relatively short delay between acquisition, extinction, and fear retrieval [according to Grillon et al. (2006) and Alvarez et al. (2007)]. External context change was operationalized by VR environments. It is possible that the external context manipulation via VR technology is not suitable to reliably induce a context-dependent re-emergence of fear responses. However, given that several studies successfully demonstrated fear renewal even when using subtle changes in contextual features as an operationalization of "external context change", this assumption is quite unlikely [reviewed in Vervliet et al. (2013)]. Another explanation might be that extinction generalized across the extinction and acquisition contexts in our task because extinction was conducted shortly after acquisition [see also Myers et al. (2006)]. In this regard, it should be noted that in previous studies on human fear conditioning, the delay between the extinction phase and the renewal test was 24 h [see Maren et al. (2013)]. In the present study, where we utilized a much shorter delay, extinction training might not only have weakened the association between the CS<sup>+</sup> and the UCS, but might also have induced a sensory habituation process to the CS<sup>+</sup> stimuli (e.g., Lloyd et al., 2012). Thus, during the renewal test shortly after the extinction session, the CS<sup>+</sup> elicited weaker processing in the sensory system and concomitantly a weaker fear response compared to the CS<sup>−</sup>. This might be the reason why the renewal response is blocked after a short but not after a long delay between the extinction and renewal phase. The presentation of the CS<sup>+</sup> after 24 h, in contrast, might be associated with a recovery of the sensory response to the CS<sup>+</sup>, which in turn is more likely to induce a significant fear renewal. However, more research is certainly needed to disentangle the temporal dynamics of contextual effects on fear acquisition, extinction, and retrieval processes.

The absence of a clear clinical diagnosis for SP by means of a clinical interview in our sample of spider-fearful individuals might limit the validity of our findings. However, mean SPQ and FSQ scores in spider-fearful individuals were very high and correspond to clinical sample means (Pflugshaupt et al., 2007; Müller et al., 2011; Fisler et al., 2013; Gerdes and Alpers, 2014; Peperkorn et al., 2014; Soravia et al., 2014), suggesting that our results can be generalized to clinically significant spider phobia. Furthermore, a closer inspection of demographic data revealed that most of the spider-fearful participants indicated at least a moderate spider fear that was perceived as disturbing and accompanied by clear avoidance behavior in real-life environments. Finally, the majority of spider-fearful participants were interested in participating in a future follow-up exposure therapy study with the aim of reducing their fear of spiders. However, future studies are needed to exclude the possibility that the finding of enhanced conditionability in our study is restricted to individuals who display only subclinical levels of spider fear. Although none of the participants exhibited clinically significant depressive or anxiety symptoms as evidenced by DASS scores, we cannot completely rule out that individual participants suffered from other, as yet undiagnosed, psychiatric disorders.

To our knowledge, this is the first study showing significant changes in conditionability for disorder-irrelevant stimuli in spider-fearful individuals at both the subjective and electrodermal level. Our data suggest that spider-fearful individuals show an enhanced fear discrimination while fear extinction seems to be unaffected. More research is needed, however, to understand the underlying neurobiological foundation of altered conditioning processes in spider fear. Future longitudinal studies would be valuable to provide a more causal link between altered fear learning and the development of specific fear. A better understanding of fear conditioning processes in SP and other anxiety disorders is of therapeutic significance and might help to contribute to the refinement of exposure-based treatments.

### **ACKNOWLEDGMENTS**

This study was funded by project P9 (ZL 59/2-1) of the German Research Foundation (DFG) Research Unit "Extinction learning: Neural mechanisms, behavioral manifestations, and clinical implications" (FOR 1581). The funders had no role in study design, data collection, analysis, and interpretation, decision to publish, or preparation of the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 July 2014; accepted: 03 September 2014; published online: 01 October 2014.*

*Citation: Mosig C, Merz CJ, Mohr C, Adolph D, Wolf OT, Schneider S, Margraf J and Zlomuzica A (2014) Enhanced discriminative fear learning of phobia-irrelevant stimuli in spider-fearful individuals. Front. Behav. Neurosci. 8:328. doi: 10.3389/fnbeh.2014.00328*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience.*

*Copyright © 2014 Mosig, Merz, Mohr, Adolph, Wolf, Schneider, Margraf and Zlomuzica. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Neonatal handling decreases unconditioned anxiety, conditioned fear, and improves two-way avoidance acquisition: a study with the inbred Roman high (RHA-I)- and low-avoidance (RLA-I) rats of both sexes

### Edited by:

Richard J. Servatius, Syracuse DVA Medical Center, USA

### Reviewed by:

Deborah Suchecki, Universidade Federal de Sao Paulo, Brazil; Michael Arthur Van Der Kooij, Johannes Gutenberg University Mainz, Germany

### \*Correspondence:

Cristóbal Río-Alamos and Alberto Fernández-Teruel, Medical Psychology Unit, Department of Psychiatry and Forensic Medicine, School of Medicine, Autonomous University of Barcelona, 08913 Bellaterra, Barcelona, Spain cristobal.delrio@uab.cat; albert.fernandez.teruel@uab.cat

> Received: 28 March 2015 Accepted: 19 June 2015 Published: 10 July 2015

### Citation:

Río-Alamos C, Oliveras I, Cañete T, Blázquez G, Martínez-Membrives E, Tobeña A and Fernández-Teruel A (2015) Neonatal handling decreases unconditioned anxiety, conditioned fear, and improves two-way avoidance acquisition: a study with the inbred Roman high (RHA-I)- and low-avoidance (RLA-I) rats of both sexes. Front. Behav. Neurosci. 9:174. doi: 10.3389/fnbeh.2015.00174

Medical Psychology Unit, Department of Psychiatry and Forensic Medicine, School of Medicine, Institute of Neurosciences, Autonomous University of Barcelona, Barcelona, Spain

Cristóbal Río-Alamos\*, Ignasi Oliveras, Toni Cañete, Gloria Blázquez, Esther Martínez-Membrives, Adolf Tobeña and Alberto Fernández-Teruel\*

The present study evaluated the long-lasting effects of neonatal handling (NH; administered during the first 21 days of life) on unlearned and learned anxiety-related responses in inbred Roman High- (RHA-I) and Low-avoidance (RLA-I) rats. To this end, untreated and neonatally-handled RHA-I and RLA-I rats of both sexes were tested in the following tests/tasks: a novel object exploration (NOE) test, the elevated zero maze (ZM) test, a "baseline acoustic startle" (BAS) test, a "context-conditioned fear" (CCF) test and the acquisition of two-way active (shuttle box) avoidance (SHAV). RLA-I rats showed higher unconditioned (NOE, ZM, BAS) and conditioned (CCF, SHAV) anxiety. NH increased exploration of the novel object in the NOE test as well as exploration of the open sections of the ZM test in both rat strains and sexes, although the effects were relatively more marked in the (high anxious) RLA-I strain and in females. NH did not affect BAS, but reduced CCF in both strains and sexes, and improved shuttle box avoidance acquisition, especially in RLA-I rats (particularly females) and in female RHA-I rats. These are completely novel findings, which indicate that even some genetically-based anxiety/fear-related phenotypes can be significantly modulated by previous environmental experiences such as the NH manipulation.

Keywords: neonatal handling, anxiety, inbred roman rats, two-way avoidance acquisition, coping style

### Introduction

Neonatal handling (NH), typically administered to rodents during the first 3 weeks of life, is an environmental treatment that has often been used to study behavioral and neurobiological plasticity. The effects of this manipulation have been well documented since the 1950s, when Seymour Levine provided the first demonstration that NH induced an enduring improvement in the ability of rats to learn a two-way active avoidance task (Levine, 1956, 1957). These results have been confirmed by many studies showing that the improving effects of NH extend to a wide variety of tests/tasks and to different strains/lines of rats (and mice), with remarkably long-lasting effects. Thus, a large number of studies have shown that NH increases activity and specific exploratory behavior in rodents in a variety of unconditioned anxiety/emotionality tests involving different degrees of novelty (e.g., Bodnoff et al., 1987; Escorihuela et al., 1994; Ferré et al., 1995a; Núñez et al., 1995, 1996; McIntosh et al., 1999; Fernández-Teruel et al., 2002a; Cañete et al., 2015), although absence of effects, task-specific effects, and sex-specific effects have also been reported (see review by Raineki et al., 2014). In addition, concerning its effects on conditioned fear/anxiety-related measures, studies with unselected rats have shown that NH enduringly reduces conflict-induced lick suppression and conditioned freezing (Núñez et al., 1996), accelerates two-way active avoidance acquisition (Escorihuela et al., 1992, 1994; Núñez et al., 1995), thus replicating and extending Levine's original findings, and reduces learned helplessness (Tejedor-Real et al., 1998). NH has also been reported to decrease stress-induced corticosterone, ACTH, and prolactin secretion (e.g., Levine, 1957; Meaney et al., 1988, 1991; Núñez et al., 1996; Anisman et al., 1998; Raineki et al., 2014).
Thus, from a behavioral and neuroendocrine perspective, NH-treated rodents appear to have an improved ability to adapt to, or to efficiently cope with, challenging/stressful environmental conditions. Finally, NH manipulation generally improves cognition in rats and mice under different spatial learning/memory paradigms, although such effects are strain- and sex-dependent (e.g., Wilson and Jamieson, 1968; Meaney et al., 1988; Zaharia et al., 1996; Fernández-Teruel et al., 2002a; Stamatakis et al., 2008; Raineki et al., 2014; Cañete et al., 2015). However, results in the cognitive (learning and memory) domain are mixed: with the exception of shuttle box avoidance acquisition (see refs. above), the NH procedure generally impairs aversive learning in several tasks (see review by Raineki et al., 2014).

One of the best-validated genetic rat models for the study of fear/anxiety- and stress-related phenotypes is provided by the Roman High- and Low-avoidance (RHA and RLA, respectively) rat lines/strains. They were initially selected and bred on the basis of their very good (RHA) vs. extremely poor (RLA) acquisition of the two-way active (shuttle box) avoidance response (Bignami, 1965; Driscoll and Bättig, 1982; Driscoll et al., 1998). Two inbred strains (RHA-I and RLA-I), derived from the original outbred (RHA/Verh and RLA/Verh) lines, have been maintained at the Autonomous University of Barcelona since 1997 (Escorihuela et al., 1999; Driscoll et al., 2009), while colonies of the outbred RHA/RLA rat lines are maintained in Geneva (Switzerland; Dr. Steimer; e.g., Steimer and Driscoll, 2005) and Cagliari (Italy; Prof. Giorgi and Corda; e.g., Giorgi et al., 2007).

Learning a two-way avoidance task in a shuttle box involves a "passive avoidance/active avoidance" conflict during the initial stages of acquisition (i.e., a tendency to freeze, and thus receive the electric shock, competes with a tendency to actively cross to the opposite compartment, thereby avoiding the shock), which is mediated by anxiety (e.g., Wilcock and Fulker, 1973; Gray, 1982; Gray and McNaughton, 2000; Vicens-Costa et al., 2011). Accordingly, shuttle box avoidance acquisition has been shown to be inversely related to anxiety/fear (e.g., Weiss et al., 1968; Gray, 1982; Fernández-Teruel et al., 1991a,b; Escorihuela et al., 1993; Gray and McNaughton, 2000; López-Aumatell et al., 2009a,b, 2011; Vicens-Costa et al., 2011; Díaz-Morán et al., 2012). Not surprisingly, therefore, the extensive research conducted with the RLA and RHA rats over nearly four decades has led to the conclusion that anxiety/fearfulness and stress sensitivity are among the most prominent behavioral traits separating the two lines/strains. In fact, RLAs (both from the outbred lines and from the inbred strain) are more anxious and/or fearful than their RHA counterparts in a wide series of unconditioned and conditioned tests/tasks (e.g., Ferré et al., 1995b; Escorihuela et al., 1999; Steimer and Driscoll, 2003, 2005; Driscoll et al., 2009; López-Aumatell et al., 2009a,b; Díaz-Morán et al., 2012; Martinez-Membrives et al., 2015). Moreover, RLA rats display enhanced frustration responses following reward down-shift (e.g., Torres et al., 2005; Rosas et al., 2007; Sabariego et al., 2013) and higher stress-induced HPA-axis and prolactin responses than RHAs (e.g., Steimer and Driscoll, 2003, 2005; Carrasco et al., 2008; Díaz-Morán et al., 2012).
To sum up, it is commonly accepted that, compared with RHAs, RLA rats display increased anxiety, fearfulness, stress sensitivity, and a predominantly passive (reactive) coping style when facing situations involving conflict (e.g., Steimer and Driscoll, 2003, 2005; Díaz-Morán et al., 2012).

As mentioned earlier, the NH procedure generally appears to improve the subjects' ability to adapt to, or to efficiently cope with, conflicting and/or stressful conditions. However, most of the research on NH effects has been performed in one sex, usually male rats or mice. Interactions between NH and sex have been observed in some reports that evaluated NH effects in unselected rats of both sexes. To give just a few examples (see also "Discussion"): NH improved spatial learning (in the Morris Water Maze; MWM) only in males (Stamatakis et al., 2008) while, in other studies, spatial learning in the "Y" maze was improved by NH in females and impaired in males (Noschang et al., 2012), and long-term retention of inhibitory avoidance was impaired only in females (Kosten et al., 2007). These striking sex differences in the effects of NH indicate that sex must be considered an important (or even crucial) variable in behavioral and neurobiological studies of NH-induced effects and/or mechanisms.

Thus, the present study aimed to evaluate whether the NH procedure is able to improve coping ability in both inbred Roman strains and sexes, with a special focus on RLA-I rats. If so, we would expect handled RLA-I rats to present a more active coping style than untreated RLA-I animals, which would be reflected in unlearned and/or learned anxiety/fear measures. To this end, non-handled (undisturbed) and NH-treated inbred Roman Low- (RLA-I) and High-avoidance (RHA-I) rats of both sexes were evaluated in a test battery designed to measure several types of unconditioned and conditioned anxiety/fear-related responses: a "novel object exploration" (NOE) test, the elevated zero-maze (ZM), a baseline acoustic startle response test (BAS), a context-conditioned fear (CCF) test and the acquisition of the two-way active avoidance (SHAV) task. This is the first time that the effects of NH on both unconditioned and conditioned anxiety/fear (including shuttle box avoidance acquisition) have been evaluated in "inbred" Roman rats from both strains and sexes.

### Materials and Methods

### Animals

Pregnant inbred Roman High- (RHA-I) and Low-Avoidance (RLA-I) rats from our permanent colony at the Autonomous University of Barcelona (Medical Psychology Unit, Department of Psychiatry and Forensic Medicine) were used in the present study. They were individually housed and maintained with food and water freely available, under a 12-h light-dark cycle (lights on at 08:00 h) and controlled temperature (22 ± 2 °C). They were randomly distributed across the following experimental groups, to which their offspring would be assigned: control animals, which were not disturbed until weaning (C), and animals that received neonatal handling (NH, see procedure below). Care was taken to avoid litter effects by using a sufficiently large number of litters per group; thus, each experimental group contained animals from at least 6 different litters. At postnatal day 1, litters were culled to a maximum of 12 pups (without any compensation for the number of males or females). After weaning (postnatal day 21) the pups were housed in pairs of the same litter, sex and group in standard macrolon cages (50 × 25 × 14 cm) under the above conditions. Experiments were performed using 50 RLA-I and 29 RHA-I rats from the 59th generation of inbreeding. At the beginning of the experiments subjects were 2 months old (weight, 167 ± 20 g; mean ± SD; see **Table 1** for details of the sample). Experiments were performed during the light cycle, between 09:00 and 19:00 h, in accordance with the Spanish legislation on "Protection of Animals Used for Experimental and Other Scientific Purposes" and the European Communities Council Directive (86/609/EEC) on this subject.

### Procedure and Apparatus

### Neonatal Handling (NH)

NH was given twice daily between postnatal days 1 and 21 (see Fernández-Teruel et al., 1992; Escorihuela et al., 1995; Steimer et al., 1998). The first daily handling session, administered in the morning (approximately between 09:30 and 10:30 h), consisted of first removing the mother from the litter and then placing the pups gently and individually in plastic cages (35 × 15 × 25 cm) lined with paper towel for a total period of 8 min. After 4 min in this situation, each pup was individually (and gently) handled and stroked for 3–4 s and returned to the same cage for the remaining 4 min. At the end of the 8-min period, each pup was gently handled for another 3–4 s and then returned to its home cage. When all the pups from one litter were back in their home cage, the mother was returned to it. The same procedure was repeated in the evening (at approximately 17:00 h). NH was carried out in a room different from the animal room, with the temperature maintained at 24 °C. Weaning took place at postnatal day 21, after the last NH session. Control (C) non-handled groups were left undisturbed, except for regular cage cleaning once a week, until weaning.

\*Final n = 9 and n = 6 in RLA-I and RHA-I control groups because of technical problems in several tests/tasks (note to **Table 1**).

### Test 1: Novel Object Exploration Test (NOE)

In order to assess emotional reactivity (or behavioral inhibition under novelty, or "curiosity"), a novel object exploration (NOE) test was conducted. The test evaluated the exploratory response of rats when a novel object was introduced into their home cage. Rats were 60 days old at the beginning of the NOE test, and they were housed in pairs of the same sex, strain, and treatment condition. The test started by removing the food from the home cage (leaving only four pellets in each cage). One hour later, the novel object (graphite pencil, Staedtler Noris, HB no. 2) was introduced perpendicularly into the home cage through the grid cover until it made contact with the cage bedding. To facilitate observation of the rats, each cage was pulled about 15 cm out of the rack, which allowed scoring, for each individual rat, of the latency to the first exploration (LAT-NOE; time elapsed until the first exploration of the novel object) and the total time spent exploring the pencil (TIME-NOE). The experimenter/observer stood 50 cm from the cage front. The NOE test lasted 3 min (see **Figure 1**).

### Test 2: Elevated Zero Maze (ZM)

The maze, similar to that described by Shepherd et al. (1994), comprised an annular platform (i.e., a circular corridor; 105 cm diameter; 10 cm width) made of black plywood and elevated 65 cm above ground level. It had two open sections (quadrants) and two enclosed ones (with walls 40 cm high). The subject (80 days old) was placed in an enclosed section facing the wall. The apparatus was situated in a black testing room, dimly illuminated with red fluorescent light, and behavior was videotaped and scored from outside the testing room. Time spent in open sections (ZM-T), number of entries into open sections (ZM-E), and number of episodes of exploratory activity at the edge of the maze, namely "head dips" (ZM-HD), were measured for 5 min (see López-Aumatell et al., 2008, 2009a; see **Figure 1**).

### Test 3: Baseline Acoustic Startle Response (BAS)

Four sound-attenuated boxes (90 × 55 × 60 cm; Sr-Lab Startle Response system, San Diego Inst., San Diego, USA), diffusely illuminated (10 W), were used. Each box housed a Plexiglas cylinder (8.2 cm in diameter, 25 cm in length) with a grid placed in the bottom, resting on a plastic frame. In each test session the animal was placed in the cylinder, and movements of the cylinder resulting from startle responses were transduced by a piezoelectric accelerometer (Cibertec S.A., Madrid) into a voltage, which was amplified, digitized and saved on a computer for analysis. The session started with 5 min of habituation. A white noise generator provided background noise of 55 dB. Then, 25 trials of acoustic startle stimuli (105 dB, 40 ms duration) were delivered by a loudspeaker mounted 23 cm above the Plexiglas cylinder. The inter-trial interval (ITI) was 15 s on average (range 10–20 s). Startle response amplitude was defined as the maximum accelerometer voltage during the first 200 ms after startle stimulus onset (see López-Aumatell et al., 2008; see **Figure 1**).
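The amplitude definition above can be illustrated with a minimal Python sketch. It assumes that the digitized accelerometer voltage of each trial is available as a list of samples beginning at stimulus onset; the function names, the sampling rate and the data layout are hypothetical and are not taken from the startle system's software. Grouping the 25 trials into 5-trial blocks follows the analysis described in this article.

```python
def startle_amplitude(samples, sampling_rate_hz, window_ms=200):
    """Peak voltage within the first `window_ms` after stimulus onset.

    Assumption (hypothetical layout): samples[0] is recorded at stimulus onset.
    """
    n_window = int(sampling_rate_hz * window_ms / 1000)
    window = samples[:n_window]
    if not window:
        raise ValueError("no samples inside the scoring window")
    return max(window)


def block_means(amplitudes, block_size=5):
    """Mean amplitude in consecutive, complete blocks (25 trials -> 5 blocks)."""
    return [
        sum(amplitudes[i:i + block_size]) / block_size
        for i in range(0, len(amplitudes) - block_size + 1, block_size)
    ]
```

With 25 per-trial amplitudes, `block_means` yields the five block scores that enter the repeated measures analysis; incomplete trailing blocks are ignored.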

### Tests 4 and 5: Context Conditioned Freezing (CCF) and Two-way Active—Shuttle Box—Avoidance Acquisition (SHAV)

The experiment was carried out with two identical shuttle boxes (Letica, Panlab, Barcelona, Spain), each placed within an independent sound-attenuating box constructed of plywood. Dim, diffuse illumination was provided by a fluorescent bulb placed behind the opaque wall of the shuttle boxes. The experimental room was kept dark. The shuttle boxes consisted of two equally sized compartments (25 × 25 × 28 cm), connected by an opening (8 × 10 cm). Training consisted of a single 50-trial session for the RHA-I strain, and two 50-trial sessions, spaced 24 h apart, for RLA-I rats. RLA-I rats received twice as much training as RHA-I rats because we did not expect any NH effect on RHA-I rats, due to ceiling effects (i.e., they usually attain avoidance levels above 60% in the first 50-trial session). A 2400-Hz, 63-dB tone plus a light (from a small 7-W lamp) functioned as the CS (conditioned stimulus). The US (unconditioned stimulus), which commenced at the end of the CS, was a scrambled electric shock of 0.7 mA delivered through the grid floor. Once the rats were placed into the shuttle box, a 4-min familiarization period (without any stimulus) elapsed before training commenced. Each of the 50 (or 100, in the case of RLA-I rats) training trials consisted of a 10-s CS, followed by a 20-s US. The CS or US was terminated when the animal crossed to the other compartment; a crossing during the CS was scored as an avoidance response and a crossing during the US as an escape response. Once a crossing had been made or the shock (US) discontinued, a 60-s inter-trial interval (ITI) followed, during which inter-trial crossings (ITCs) were scored within each block of trials.
Freezing behavior, defined as the complete absence of movements except for breathing, was also scored (by a well-trained observer) during the 60-s inter-trial intervals of trials 2–5 as an index of context-conditioned fear (CCF; during trials 2–5 no rat made any avoidance response, i.e., all rats received electric shock in these trials). The measure of freezing during the inter-trial interval of trial 1 was excluded because it is not a proper measure of context conditioning.

The variables recorded were the number of avoidances (SHAV) and inter-trial crossings (ITCs), either grouped in blocks of 10 trials or accumulated in one (SHAV50, ITC50), or two (SHAV100, ITC100) sessions (e.g., see López-Aumatell et al., 2011; Díaz-Morán et al., 2012; see **Figure 1**).
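The scoring rule described above (a crossing during the 10-s CS counts as an avoidance, a crossing during the 20-s US as an escape) and the grouping of avoidances into 10-trial blocks can be sketched as follows. This is only an illustration under stated assumptions: the latency-based input format, the function names and the label "failure" for trials without a crossing are hypothetical, as is the choice to treat a crossing at exactly 10 s as an escape.

```python
CS_DURATION_S = 10.0  # tone plus light, as stated in the text
US_DURATION_S = 20.0  # scrambled shock, as stated in the text


def classify_trial(crossing_latency_s):
    """Classify a trial from the latency (s from CS onset) of the first crossing.

    None means the rat did not cross before the US ended (hypothetical encoding).
    """
    if crossing_latency_s is None:
        return "failure"
    if crossing_latency_s < CS_DURATION_S:
        return "avoidance"
    if crossing_latency_s < CS_DURATION_S + US_DURATION_S:
        return "escape"
    return "failure"


def avoidances_per_block(latencies, block_size=10):
    """Number of avoidance responses in consecutive blocks of trials."""
    outcomes = [classify_trial(latency) for latency in latencies]
    return [
        outcomes[i:i + block_size].count("avoidance")
        for i in range(0, len(outcomes), block_size)
    ]
```

Summing the block counts of one 50-trial session gives a SHAV50-style total, and summing over two sessions a SHAV100-style total.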

### Statistical Analysis

Statistical analysis was performed using the "Statistical Package for Social Science" (SPSS, version 17).

Pearson's correlation coefficients were computed among the main variables.
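For readers unfamiliar with the statistic, a Pearson coefficient between two behavioral variables can be computed as in the minimal sketch below. The function name is hypothetical, and the code illustrates the standard formula only; it is not the SPSS procedure used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(ss_x * ss_y)
```

A negative coefficient between, for example, a startle measure and the number of avoidances would indicate that animals with higher startle tend to avoid less.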

Factorial 2 × 2 × 2 ANOVAs ("2 strain" × "2 treatment conditions" × "2 sex") were applied to measures from NOE, ZM, and CCF tests, as well as for total measures of the shuttle box avoidance task. Appropriate repeated measures ANOVAs with "5-trial blocks" as within-subject factor were applied to BAS test ("2 strain" × "2 treatment conditions" × "2 sex" × "5 block" ANOVA), and to shuttle box avoidance acquisition with "10 trial blocks" as within-subject factor ("2 strain" × "2 treatment conditions" × "2 sex" × "10 block" ANOVAs).

Post-hoc Duncan's multiple range tests were applied to all dependent variables following significant ANOVA effects. A Student's t-test (independent samples) was also applied to avoidance results from male "control" and "NH" RLA-I groups, because we had the a priori hypothesis that NH treatment would improve avoidance acquisition in RLA-I rats. Significance level was set at p < 0.05.
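The independent-samples Student's t-test mentioned above can be sketched as follows, assuming the classical pooled-variance form. The function name and any data passed to it are hypothetical, and the sketch returns only the t statistic and its degrees of freedom; it does not compute a p-value.

```python
import math


def student_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df
```

In this form the degrees of freedom equal the total number of animals in the two groups minus two.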

### Results

### "Novel Object Exploration" Test (NOE)

The results of the NOE test (**Figures 2A,B**) showed that, compared to RHA-I rats, RLA-I animals presented a higher latency to explore the novel object for the first time (LAT-NOE) and spent less time exploring it (TIME-NOE) ["Strain" effect on both parameters, F(1, 78) = 17.36, p < 0.001, and F(1, 78) = 118.30, p < 0.001, respectively]. As expected, NH significantly reduced LAT-NOE and increased TIME-NOE in both rat strains ["NH" effect, F(1, 78) = 9.66, p ≤ 0.003, and F(1, 78) = 80.40, p < 0.001, respectively]. A "Sex" effect was found only on TIME-NOE [F(1, 78) = 13.08, p = 0.001], indicating that females (particularly RHA-Is) spent overall less time exploring the novel object than males (**Figure 2B**). There were also "Strain × NH" interactions for LAT-NOE and TIME-NOE [F(1, 78) = 6.37, p ≤ 0.01, and F(1, 78) = 5.32, p = 0.02, respectively], as NH effects were globally stronger in RLA-I rats of both sexes.

### "Elevated Zero Maze" Test (ZM)

The results of the ZM test (**Figures 3A–C**) showed "Strain" effects on ZM-E [F(1, 78) = 12.13, p ≤ 0.001], ZM-T [F(1, 78) = 7.29, p ≤ 0.009] and ZM-HD [F(1, 78) = 41.55, p < 0.001], with RHA-I rats showing overall higher scores in the three parameters (**Figures 3A–C**). "NH" effects were found in ZM-T [F(1, 78) = 8.60, p ≤ 0.005] and in ZM-HD [F(1, 78) = 11.85, p ≤ 0.001], reflecting that neonatally-handled groups globally spent more time in open sections and performed more head dips than untreated animals (**Figures 3B,C**; see also Duncan's tests in **Figures 3B,C**).

Figure legend (fragment): … the "Strain" effect (see text for significance); \*p < 0.05 between the groups indicated (Duncan's multiple range tests following significant ANOVA effects); #p < 0.05 vs. respective control (C) group of the RLA-I strain (Duncan's multiple range tests following significant ANOVA effects). Group symbols: C, control non-handled group; H, neonatally handled (NH) group.

### "Baseline Acoustic Startle Response" Test (BAS)

**Figure 4** shows the results of the BAS test. The repeated measures ANOVA ("2 strain" × "2 treatment conditions" × "2 sex" × "5 blocks of trials") indicated a "Strain" effect: taking the session as a whole, the RLA-I strain displayed a higher acoustic startle response than the RHA-I strain ["Strain" effect, F(1, 71) = 12.26, p ≤ 0.001]. The ANOVA also showed significant "Block" and "Block × Strain" effects [F(3, 222) > 22.22, p < 0.001, and F(3, 222) = 9.11, p < 0.001, respectively], indicating both a habituation effect (in both strains) and that this habituation was relatively more marked in RLA-I rats (**Figure 4**). Further one-way ANOVAs for each 5-trial block showed between-strain differences (i.e., overall higher BAS scores in RLA-I than RHA-I rats) in all blocks except the last one [Block1–Block4, all Fs(7, 78) > 2.10, all p ≤ 0.05; see **Figure 4**]. No NH effect was observed.

### "Context-Conditioned Freezing" ("CCF") and "Two-way Active—Shuttle Box—Avoidance Acquisition" Test ("SHAV")

Results of the "context conditioned freezing" (CCF) test are shown in **Figure 5**. One-Way ANOVA showed a global "Strain" effect, with the RLA-I groups performing more freezing behavior than the RHA-I strain ["Strain" effect, F(1, 76) = 6.79, p ≤ 0.011, **Figure 5**]. Interestingly, a global "NH" effect was also present, as NH decreased the time spent freezing in both strains ["NH" effect, F(1, 76) = 4.11, p = 0.046; **Figure 5**]. There was also a "Strain × Sex" effect, mainly because RLA-I female groups tended to show less freezing than their respective male groups, a tendency that was not present in RHA-I rats ["Strain × Sex" effect, F(1, 76) = 5.39, p = 0.023; **Figure 5**].

**Figures 6A–C** show the results of two-way avoidance (SHAV) acquisition. The repeated measures ANOVA applied to results from the first 50-trial session ("2 Strain" × "2 treatment conditions" × "2 Sex" × "5 blocks of 10 trials") showed that RHA-I rats performed more avoidance responses than RLA-I rats ["Strain" effect, F(1, 76) = 462.7, p < 0.001; **Figures 6A,B**] and also a global "NH" effect [F(1, 76) = 6.10, p = 0.016; **Figures 6A,B**], with neonatally-handled animals performing overall more avoidances than untreated/control rats (**Figures 6A,B**). Duncan's test showed statistical differences between control and handled RHA-I females (in several 10-trial blocks, **Figure 6A**, as well as in the whole 50-trial session, **Figure 6B**) and between control and handled RLA-I females (in several 10-trial blocks, **Figure 6A**, and in the whole 50-trial session, **Figure 6B**), while a Student's t-test for independent samples showed differences between control and handled RLA-I males in the whole 50-trial session [t(24) = 2.68, p = 0.014; **Figure 6B**]. This t-test was applied because we had the directional a priori hypothesis that the NH procedure would improve avoidance acquisition in RLA-I rats.

The repeated measures ANOVA of the first 50-trial session also showed "Block" and "Block × Strain" effects [both ANOVAs, Fs(4, 69) > 93.96, p ≤ 0.001; **Figure 6A**], thus respectively reflecting (i) the overall significant learning curves as well as (ii) that RHA-I rats learned much faster than RLA-I rats. There was also a "NH × Sex" effect [F(1, 76) = 5.24, p = 0.025; **Figure 6B**], mainly because NH induced positive effects on avoidance acquisition of all groups except RHA-I males (**Figure 6B**).

Analysis of the whole 100 acquisition trials (i.e., the two training sessions) in RLA-I groups (repeated measures ANOVA, "2 treatment conditions" × "2 sex" × "10 blocks of 10 trials" as within-subject factor; SHAV100 trials in **Figure 6C**) showed a "NH" effect [F(1, 46) = 10.68, p = 0.002; **Figures 6A,C**], with handled animals performing overall better than control rats (**Figure 6C**), and a "NH × Sex" effect [F(1, 46) = 4.24, p = 0.045], as NH more markedly increased the number of avoidances in RLA-I females (see Duncan's test in **Figure 6C**) than in males. ANOVA also showed "Block," "Block × NH," and "Block × Sex" effects [all Fs(6, 266.11) > 2.18, p ≤ 0.05] (**Figure 6A**), thus respectively indicating that (i) RLA-I rats show a significant acquisition curve along the 100 training trials, (ii) this acquisition curve depends on the treatment condition (as NH-induced acquisition improvements differ depending on which 10-trial block is considered), and (iii) this acquisition curve depends on sex (particularly because of the pronounced NH effect on females, across different 10-trial blocks, which is not present in RLA-I males) (see **Figure 6A**).

Figure legend (fragment): … the same sex (Duncan's tests after significant ANOVA effects). (B,C) &, indicates the "Strain" effect (see text for significance); \*p < 0.05 between the groups indicated (Duncan's tests following significant ANOVA effects); #p < 0.05 vs. respective control (C) group of the RLA-I strain (Duncan's tests following significant ANOVA effects). Group symbols: C, control non-handled group; H, neonatally handled (NH) group.

**Figures 7A,B** show the results for ITCs (inter-trial crossings) during avoidance acquisition training. The repeated measures ANOVA applied to results from the first 50-trial session ("2 Strain" × "2 treatment conditions" × "2 Sex" × "5 blocks of 10 trials") showed that RHA-I rats performed more ITCs than RLA-I rats ["Strain" effect, F(1, 76) = 96.4, p < 0.001; **Figures 7A,B**], a global "NH" effect [F(1, 76) = 5.3, p = 0.024; **Figures 6A,B**], with neonatally-handled animals performing overall more ITCs than untreated rats (**Figures 7A,B**), and a "Sex" effect [F(1, 76) = 7.5, p = 0.008] indicating that females of both strains performed more ITCs than male rats (**Figures 7A,B**). Similar to SHAV50 results, there were also "Block" and "Block × Strain," as well as "Block × Sex" effects on ITCs [for all parameters, Fs(4, 252.63) > 2.89, p ≤ 0.03].

Analysis of ITCs along the whole 100 training trials (repeated measures ANOVA, "2 treatment conditions" × "2 sex" × "10 blocks of 10 trials" as within-subject factor; SHAV-ITC 100; **Figure 7C**), only in the RLA-I groups, showed a NH effect [F(1, 46) = 4.75, p = 0.035], as neonatally-handled animals performed more ITCs than untreated ones (see **Figure 7C**). There was also a "Block" effect [F(3, 122.06) = 5.96, p = 0.001] (**Figure 7A**), reflecting the overall ascending progression of ITCs across successive 10-trial blocks.

Figure legend (fragment): … indicates the "Strain" effect (see text for significance); \*p < 0.05 between the groups indicated (Duncan's tests following significant ANOVA effects). Group symbols: C, control non-handled group; H, neonatally handled (NH) group.

### Correlations among Variables

Pearson correlations are shown in **Table 2**. The most relevant trends to highlight are between-test correlations. In this regard, significant correlations are observed between ZM and NOE variables (from r = −0.26 to r = 0.53), indicating that both tests might be partly measuring similar anxiety-related traits. There are also low but significant correlations of both NOE and ZM variables with BAS parameters (NOE with BAS variables, from r = 0.22 to r = 0.28; ZM with BAS variables, r = −0.25 between ZM-E and BAS21-25 and r = −0.29 between ZM-HD and BAS; see **Table 2**). Most importantly, there were highly relevant correlations between ZM variables and SHAV and ITCs (ranging from r = 0.29 to r = 0.56; **Table 2**), as well as between NOE variables and SHAV and ITCs (ranging from r = −0.30 to r = 0.58; **Table 2**) and between BAS (acoustic startle) responses and SHAV and ITCs (ranging from r = −0.25 to r = −0.39; **Table 2**), thus suggesting that unconditioned anxiety-related traits are negatively associated with two-way avoidance acquisition, i.e., the higher the unconditioned anxiety levels in those three tests, the poorer the acquisition levels in the avoidance task.

### TABLE 2 | Pearson correlation coefficients are shown.

Significant values are in bold. \*p < 0.05; \*\*p < 0.01; \*\*\*p < 0.001 (two-tailed). (\*) Refers to the RLA-I groups only (thus n = 50), which were the only groups performing 100 trials in the shuttle box avoidance task.

### Discussion

In the present study we have investigated, for the first time, NH effects in inbred RHA-I/RLA-I rats of both sexes, using a test battery that included both unconditioned anxiety/fear tests (NOE, ZM, and BAS) and, most importantly, a context-conditioned fear test and shuttle box avoidance acquisition (i.e., the trait which constitutes the basis of genetic selection of RLA-I and RHA-I rats). We have found that, compared with their RHA-I counterparts, RLA-I rats show higher unconditioned anxiety/fear-related responses in the novel object exploration (NOE) and elevated zero-maze (ZM) tests, as well as in the baseline acoustic startle (BAS) test. These results agree with previous reports showing similar differences between the RLA-I and RHA-I strains in a variety of novelty/conflict tests (e.g., Driscoll et al., 2009; López-Aumatell et al., 2009a,b; Díaz-Morán et al., 2012; Martinez-Membrives et al., 2015; see "Introduction" for further references). As expected, and also in agreement with previous reports, RLA-I rats also displayed an overall increase of context-conditioned freezing and markedly impaired acquisition of the two-way active avoidance response compared with RHA-I rats (e.g., López-Aumatell et al., 2009a,b; Díaz-Morán et al., 2012; Martinez-Membrives et al., 2015; see "Introduction").

The main novel findings of the present study concern the effects of neonatal handling. Regarding the unconditioned tests, i.e., NOE and ZM, we have found that NH increases exploration of both the novel object (NOE) and the open sections of the ZM test in both rat strains, although in the NOE test such effects are apparently more marked in RLA-I rats, which are more behaviorally inhibited (i.e., more anxious) than RHA-I rats in both tests (compare untreated rats of both strains in **Figures 2**, **3**). Indeed, the levels attained by NH-treated RLA-I rats in NOE measures tend to approach the response levels of untreated RHA-I rats. These NH effects are overall in agreement with those previously reported on several (unconditioned) novelty/anxiety-related traits in unselected rats (e.g., Escorihuela et al., 1994; Ferré et al., 1995a; Núñez et al., 1995, 1996; Fernández-Teruel et al., 2002a; Raineki et al., 2014; see further references in the "Introduction") as well as in the Roman rats from the "outbred" lines (e.g., Fernández-Teruel et al., 1992; Steimer et al., 1998).

Importantly, the present study is the first demonstration that NH enduringly improves two-way avoidance acquisition in "inbred" RLA-I rats of both sexes and in female RHA-I rats (see "NH × sex" interactions in "Results"). The positive effect of NH manipulation on avoidances in RLA-I rats is also more pronounced in females, as reflected by significant "NH × sex" effects on SHAV100 (see **Figures 6A–C**). NH also induced a significant increase of ITCs, whether considering all groups (see **Figures 7A,B**) or only the RLA-I groups (see **Figure 7C**). It should be noted here that the relevant literature shows that ITCs are positively related with (and are a positive predictor of) two-way avoidance acquisition, i.e., ITCs are "pseudoavoidance" responses indicating that animals are developing active coping strategies to solve the "passive avoidance/active avoidance" conflict involved in the task (for review see Castanon et al., 1995; Aguilar et al., 2004), as is also suggested by the positive SHAV-ITC correlations observed here (see **Table 2**). In parallel to these results, NH overall decreased context-conditioned freezing (i.e., classically conditioned fear) in both rat strains. Fear to the context during the initial stages of shuttle box avoidance training is known to be inversely related to effective avoidance acquisition (e.g., López-Aumatell et al., 2011; Vicens-Costa et al., 2011; Díaz-Morán et al., 2012; Martinez-Membrives et al., 2015). The negative correlations between context-conditioned freezing and number of avoidances and ITCs (see **Table 2**) give further support to that contention.

In the only previous study with Roman rats in which NH effects were evaluated on shuttle box avoidance, only "outbred" RLA males (from the Swiss RLA/Verh outbred line) were used (Escorihuela et al., 1995). That study indicated a slight trend toward a positive treatment effect on avoidance responses, which failed to reach significance in the overall ANOVA (Escorihuela et al., 1995). Therefore, the present study in inbred Roman rats of both sexes is the first demonstration of a significant NH-induced modulation of the trait which is the criterion for selection of the RLA-I and RHA-I strains (i.e., shuttle box avoidance acquisition).

Conditioned freezing and two-way avoidance acquisition (as well as ITCs) are apparently less affected by NH than the unconditioned anxiety measures (NOE, ZM). This would be congruent with the view that, in the Roman rat strains, two-way avoidance acquisition and conditioned freezing are more strongly linked to their genetic constitution than unconditioned anxiety/fearfulness traits (e.g., Castanon et al., 1995; Fernández-Teruel et al., 2002b; Steimer and Driscoll, 2003, 2005; Driscoll et al., 2009). Related to that, it was already reported in early behavioral genetic studies in rats that two-way active avoidance acquisition is probably among the behavioral traits having the highest heritability coefficients (e.g., Wahlsten, 1972; Wilcock and Fulker, 1973; Wilcock et al., 1981; see also Castanon et al., 1995; Fernández-Teruel et al., 2002b; Johannesson et al., 2009; Baud et al., 2013, 2014). With regard to the Roman rats, it has been suggested that the "warm up" phase, i.e., the performance during the initial 10–20 trials of each shuttle box training session, is the aspect that most markedly differentiates the two lines/strains (e.g., Driscoll and Bättig, 1982; Fernández-Teruel et al., 1991b; Escorihuela et al., 1995, 1999; Ferré et al., 1995b; Driscoll et al., 2009). In particular, the extremely slow "warm up" effect typically shown by RLA rats seems to stem from their proneness to fear conditioning (e.g., Escorihuela et al., 1995; López-Aumatell et al., 2009a,b; Estanislau et al., 2013), and thus to freeze when facing an aversively-conditioned context (as is the case during the initial trials in the shuttle box task), which is known to run against actively searching for a more adaptive (active) response like escape or avoidance (e.g., Weiss et al., 1968; Wilcock and Fulker, 1973; Fernández-Teruel et al., 1991a,b; Gray and McNaughton, 2000; López-Aumatell et al., 2009a,b; Vicens-Costa et al., 2011; Díaz-Morán et al., 2012).
Hence, it seems possible that a more proactive (or less reactive) coping style of NH-treated RLA-I rats (as suggested by NH effects on conditioned freezing and ZM and NOE tests) might be partly responsible for their improved ability to acquire the two-way avoidance task.

As noted in the Introduction, some studies on NH that have used rats of both sexes have shown that "treatment × gender" interactions are common, and that NH effects are often observed in just one gender or show divergent patterns in the two sexes. A few examples: (1) Stamatakis et al. (2008) reported that, among acutely-stressed Wistar rats, NH-treated males showed better place learning performance than females, while no sex differences were observed in a spatial memory trial. (2) Likewise, handling-induced changes in hippocampal mineralocorticoid receptors were found in males only (Stamatakis et al., 2008). (3) Learning of a spatial "Y" maze task was impaired by NH in male and improved in female Wistar rats (Noschang et al., 2012) while, in the same study, (4) only NH-treated females (but not males) showed a decreased SOD/CAT (superoxide dismutase/catalase) ratio in prefrontal cortex. (5) Impairing NH effects on long-term retention of inhibitory avoidance were observed in female, but not male, Sprague-Dawley rats (Kosten et al., 2007). (6) In another study, NH produced sex-dependent effects on stress-induced corticosterone and brain c-fos expression in adolescent Sprague-Dawley rats (Park et al., 2003). (7) Furthermore, Papaioanou et al. (2002) reported that NH treatment interacts with stress type (i.e., short-term or long-term) and with sex to induce changes in the concentration and turnover of brain serotonin and dopamine in Wistar rats. In this context, it is remarkable that in the present study the positive effects of NH on avoidance acquisition have also been shown to diverge depending on gender.
Thus, there are significant "NH × sex" effects on SHAV50 and SHAV100 (avoidances after 50 or 100 trials, respectively), which reflect the fact that NH improved avoidance acquisition more markedly in female rats of both strains during the first 50 trials (SHAV50; see **Figure 6B**) or in RLA-I females (compared with RLA-I males) after completing the 100 trials (SHAV100; see **Figure 6C**).

There is evidence, from factor-analytical studies using very large samples of F2 rats (derived from the "outbred" Roman lines, n = 800; Aguilar et al., 2003) or heterogeneous NIH-HS rats (n = 1600; López-Aumatell et al., 2011) that females' responses when facing conflicting situations might be more driven by activity-related responses (i.e., more "proactive" responses) than males' responses, which would be more driven by anxiety/freezing (i.e., "reactive" coping strategies; e.g., Fernandes et al., 1999; Aguilar et al., 2003; López-Aumatell et al., 2011). In this connection, it is tempting to suggest that the more marked NH effects observed in females, particularly in the two-way avoidance task, might be partly due to the fact that NH is able to disinhibit conflict-induced behavior (i.e., so changing a "reactive" to a more "proactive" coping strategy) more easily in females than in males.

The present positive results of NH on two-way avoidance acquisition are in contrast with several lines of research carried out using psychogenetically-selected strains/lines of rats possessing divergent abilities to acquire shuttle box avoidance (i.e., Maudsley reactive vs. non-reactive rats, Levine and Broadhurst, 1963; RLA/Lu vs. RHA/Lu rats, Satinder and Hill, 1974), which failed to show acquisition improvements following neonatal handling. Possible reasons for the different results of these and the present study could be the more intensive neonatal handling procedure used here (i.e., two handling sessions/day in the present study vs. one session/day in those studies), or the fact that the present shuttle box training parameters (i.e., composite "light + tone" CS; CS, US and inter-trial interval of longer durations than in those studies; no overlapping between CS and US) were specifically selected to facilitate the emergence of escape (or avoidance) responses and to minimize the presence of "response failures" (see details in Escorihuela et al., 1995).

The observed between-strain differences in baseline acoustic startle (BAS) are in agreement with previous reports (e.g., López-Aumatell et al., 2009a,b). Notably, however, neonatal handling did not affect BAS responses in any rat strain. The baseline acoustic startle is a reflex response that is mediated by a fast "cochlear root nucleus—caudal pontine reticular nucleus" pathway (e.g., see review by Koch and Schnitzler, 1997). To the best of our knowledge the effects of NH treatment on BAS have been evaluated for the first time in the present study, and the absence of changes in NH-treated rats, which contrasts with the positive effects observed in the other tests/tasks, suggests that brainstem-mediated reflex responses (i.e., BAS) are less sensitive to (NH) manipulation influences than more cognitively elaborated conflict-based responses (like NOE, ZM, CCF, or SHAV), which are thought to be under hippocampal control (e.g., Gray and McNaughton, 2000; López-Aumatell et al., 2008, 2009a, and references therein). Possibly in line with that contention, in a study in which rats were treated with environmental enrichment (EE) for several months, the treatment produced the expected long-lasting positive effects on several stress/anxiety-related and cognitive responses, but EE did not affect baseline acoustic startle (Peña et al., 2009).

A more active/functional hippocampus has been related to increased anxiety when facing "approach-avoidance" or "passive avoidance/active avoidance" conflict situations (such as the cases of NOE-ZM and CCF tests and the SHAV task, respectively) (Gray and McNaughton, 2000). In line with that, it is remarkable that the high anxious (and passive/reactive coper) RLA-I rat strain has a more functional hippocampus than the (low anxious) RHA-I strain (Meyza et al., 2009; Garcia-Falgueras et al., 2012). It would be interesting to investigate how hippocampal function during (unconditioned or conditioned) conflict could be affected by neonatal handling and how such an effect on the hippocampus would be relevant for the NH-induced changes in RLA-I rats. Would NH manipulation influence septo-hippocampal function in a manner similar to anxiolytic drugs (i.e., benzodiazepine agonists), which reduce conflict and improve shuttle box avoidance acquisition? (e.g., Fernández-Teruel et al., 1991a; Gray and McNaughton, 2000).
A number of effects of neonatal handling on different neurobiological aspects within the hippocampal formation have been reported, for example: (i) increased hippocampal long-term potentiation (e.g., Wilson et al., 1986) and decreased hippocampal neuronal loss with age in NH-treated rats (e.g., Meaney et al., 1988; see reviews by Fernández-Teruel et al., 1997, 2002a); (ii) enhanced hippocampal type II glucocorticoid receptors, linked to decreased HPA-axis responses to stress (e.g., Meaney et al., 1988); (iii) increased GAP-43 (growth associated protein 43) expression in rat pups (Zhang et al., 2012); (iv) increases in hippocampal but not cortical 5-HT and 5-HIAA in rats (e.g., reviewed by Anisman et al., 1998; Fernández-Teruel et al., 2002a), as well as in hippocampal nerve growth factor mRNA (Mohammed et al., 1993; Pham et al., 1997); (v) enhancement of NADPH-diaphorase-positive neurons (a potential marker of nitric oxide-producing neurons) (Vaid et al., 1997); (vi) increases of central benzodiazepine and GABA-A receptors (Bodnoff et al., 1987; Bolden et al., 1990; see review by Raineki et al., 2014). Preliminary results from our laboratory suggest that RLA-I rats have reduced content of hippocampal PSA (polysialic acid, related to neural cell adhesion molecules, NCAM), which is raised to RHA-I levels by NH. Thus, given that all these forms of hippocampal plasticity (and others not listed here) have been shown to be sensitive to NH effects, it does not seem unreasonable to expect that hippocampal function during conflict (i.e., under anxiety-inducing situations, conditioned or unconditioned) could also be enduringly modulated by neonatal handling, thus inducing changes in coping strategies/responses. Testing such a hypothesis should be a matter of further research.

In summary, in the present study, several long-lasting effects of NH are reported for the first time: (i) NH manipulation is able to partially counteract the genetically-based two-way avoidance acquisition deficit of (inbred) RLA-I rats, the effect being more evident in females. (ii) NH manipulation improves acquisition in females (but not males) of the RHA-I strain. (iii) NH effects on shuttle box avoidance acquisition are paralleled by a treatment-induced reduction of context-conditioned freezing (during intertrial intervals 2–5 of the training session) in both rat strains, which may suggest that the treatment has produced some change toward more adaptive (i.e., proactive) coping strategies, and that such an effect may underlie (at least partly) the avoidance acquisition improvement, particularly in RLA-I rats. (iv) The positive effects of NH on SHAV, ITCs, CCF, NOE, and ZM test measures also agree with the contention that the treatment induces changes toward more proactive coping strategies. (v) Baseline acoustic startle is not influenced by NH, in line with findings obtained with other anxiety-reducing environmental treatments (Peña et al., 2009), thus suggesting that brainstem-mediated responses like BAS could be less sensitive to chronic treatment influences than conflict-based hippocampus-mediated responses.

### Acknowledgments

Supported by grants PSI2013-41872-P, 2014SGR-1587 and "ICREA-Academia 2013" (to AF-T). IO is recipient of a PhD FI fellowship (DGR 2014).

### References


the male and female rat brain. Neuroscience 114, 195–206. doi: 10.1016/S0306-4522(02)00129-X


adult mice: genetic and maternal factors. Psychopharmacology 128, 227–239. doi: 10.1007/s002130050130

Zhang, Z., Zhang, H., Du, B., and Chen, Z. (2012). Neonatal handling and environmental enrichment increase the expression of GAP-43 in the hippocampus and promote cognitive abilities in prenatally stressed rat offspring. Neurosci. Lett. 522, 1–5. doi: 10.1016/j.neulet.2012.05.039

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Río-Alamos, Oliveras, Cañete, Blázquez, Martínez-Membrives, Tobeña and Fernández-Teruel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Pyrazine analogs are active components of wolf urine that induce avoidance and fear-related behaviors in deer

#### **Kazumi Osada<sup>1</sup>† , Sadaharu Miyazono<sup>2</sup>† and Makoto Kashiwayanagi <sup>2</sup>\***

<sup>1</sup> Division of Physiology, Department of Oral Biology, School of Dentistry, Health Sciences University of Hokkaido, Ishikari-Tobetsu, Hokkaido, Japan <sup>2</sup> Department of Sensory Physiology, Asahikawa Medical University, Asahikawa, Hokkaido, Japan

### **Edited by:**

Richard J. Servatius, DVA Medical Center, USA

### **Reviewed by:**

Akshay Anand, Post Graduate Institute of Medical Education and Research, India Christopher Cain, Nathan S. Kline Institute for Psychiatric Research, USA

### **\*Correspondence:**

Makoto Kashiwayanagi, Department of Sensory Physiology, Asahikawa Medical University, Midorigaoka Higashi 2-1-1-1, Asahikawa, Hokkaido 078-8510, Japan e-mail: yanagi@asahikawa-med.ac.jp †These authors have contributed equally to this work.

Our previous studies indicated that a cocktail of pyrazine analogs, identified in wolf urine, induced avoidance and fear behaviors in mice. The effects of the pyrazine cocktail on Hokkaido deer (Cervus nippon yesoensis) were investigated in field bioassays at a deer park in Hokkaido, Japan. A set of feeding bioassay trials in August and September 2013 tested the effects of the pyrazine cocktail odor on the behavior of deer located around a feeding area. This odor effectively suppressed the approach of the deer to the feeding area. In addition, the pyrazine cocktail odor provoked fear-related behaviors, such as "tail-flag", "flight" and "jump" actions, in the deer around the feeding area. This study is the first experimental demonstration that the pyrazine analogs in wolf urine have robust and continual fearful aversive effects on ungulates as well as mice. The pyrazine cocktail might be suitable as a chemical repellent that could limit damage to forests and agricultural crops by wild ungulates.

**Keywords: pyrazine analog, wolf, Hokkaido deer, field bioassay, avoidance, fear, repellent, kairomone**

### **INTRODUCTION**

Wild animals frequently infiltrate human habitats, where they can cause serious trouble. For example, the damage that deer cause to agricultural, horticultural, and forest resources is an economic problem not only in Hokkaido (Masuko et al., 2011) but around the world (Trdan et al., 2003; Killian et al., 2009; Kimball et al., 2009; Baasch et al., 2010; Gheysen et al., 2011). Rather than eliminating deer, it is ideal to control their behavior so that humans can coexist with wild animals without damage to human habitats and natural environments.

The detection of predator phenotypic traits by prey species is a vitally important function of communication among mammals. How prey discerns a predator remains to be elucidated; it most likely involves a range of sensory and behavioral signals. For animals that rely on chemical communication to regulate social and sexual interactions, there is some indication that the presence of a predator can be detected by its scent. When the recipient benefits from the signal, the molecules involved are called kairomones (Wyatt, 2003; Rodriguez, 2010).

Many studies have shown that the odors of a predator induce avoidance and fear in various kinds of herbivores. For instance, black-tailed deer (*Odocoileus hemionus columbianus*) and/or white-tailed deer (*Odocoileus virginianus*) respond aversively to the odor of the urine of several predators, including wolf (*Canis lupus*), coyote (*Canis latrans*), fox (*Vulpes vulpes*), wolverine (*Gulo gulo*), lynx (*Lynx canadensis*), and bobcat (*Lynx rufus*), as well as to the odor of the feces of cougar (*Puma concolor*), coyote, and wolf (Sullivan et al., 1985b; Swihart et al., 1991). Similarly, odors emitted by several kinds of predators induce defensive behaviors in hare (*Lepus americanus*) (Sullivan et al., 1985a) and experimental rats (*Rattus norvegicus*) (Fendt, 2006). Moreover, American beaver (*Castor canadensis*), cattle (*Bos taurus*), and marsupials that are exposed to the odor of wolf or dingo (*Canis lupus dingo*) showed defensive or avoidance responses (Lindgren et al., 1995; Kluever et al., 2009; Parsons and Blumstein, 2010). Those studies clearly indicate that the urine and feces of many carnivores, including the wolf, contain kairomones that repel their prey animals. As a practical matter, wolf urine is used to drive away these animals without killing them (Sullivan et al., 1985a,b; Lindgren et al., 1995; Severud et al., 2011).

According to our recent study (Osada et al., 2013), urine odors of the common gray wolf induce aversive and fear-related responses in mice in an experimental setting. In addition, these responses are caused mainly by the presence of certain volatile pyrazine compounds, namely 2,6-dimethyl pyrazine (DMP), trimethyl pyrazine (TMP), and 3-ethyl-2,5-dimethyl pyrazine (EDMP), in wolf urine. The cocktail of DMP, TMP, and EDMP (pyrazine cocktail) is more potent than any one component alone. These pyrazine analogs, which give characteristic roasted aromas to various foods, are known as safe compounds with no carcinogenicity and with low acute toxicity (EFSA Panel on Food Contact Materials, Enzymes, Flavourings and Processing Aids (CEF), 2011). In fact, some alkylpyrazines are widely used in the food industry as flavor ingredients (Burdock and Carabin, 2008). Therefore, the pyrazine analogs are expected to be favorable herbivore repellents that do not harm natural habitats or agriculture.

We hypothesized that the pyrazine analogs, odors of the predatory wolf, are at least a portion of the putative kairomones that induce avoidance and fear in various prey species. In this study, we explored the effects of pyrazine analogs on Hokkaido deer (*Cervus nippon yesoensis*), a large herbivore. The analogs were found to act as repellents for deer and also to directly elicit fear-related reactions in deer, such as "tail-flag", "flight" and "jump" (Caro, 2005; Stankowich and Coss, 2006). The present results suggest that the pyrazine analogs provoke aversion and fear not only in mice but also in large herbivores.

### **MATERIALS AND METHODS**

### **STUDY AREA**

The field work was conducted in a deer park (44° 12′ N and 142° 48′ E, Nishiokoppe, Hokkaido, Japan), a commercial wildlife park located within a reservation and conservation area. Over 30 Hokkaido deer inhabited an enclosed area of more than 9 ha. They had free access to herbage, bamboo grass, tree leaves and bark, and water located in the park. All of them were considered healthy. They were sometimes fed with steam-flaked corn, whose chemical composition was crude protein 7.6%, ether extract 3.8%, crude fiber 1.7%, crude ash 1.2%, nitrogen-free extract 71.3%, and moisture 14.5% (Hokuren Federation of Agricultural Cooperatives, Hokkaido, Japan).

### **EXPERIMENTAL DESIGN**

The study was carried out in accordance with the Guidelines for the Use of Laboratory Animals of the Asahikawa Medical University and approved by the Nishiokoppe collegium of deer nurturing (NOP-130708). 2,6-Dimethyl pyrazine (DMP) and TMP were purchased from Tokyo Chemical Industry (Tokyo, Japan), and 3-ethyl-2,5-dimethyl pyrazine (EDMP) was purchased from Alfa Aesar (Ward Hill, MA, USA). Feeding bioassay trials (**Figure 1A**) were carried out twice, on 27 August and 19 September, 2013. The deer in the trials included 12 males and 10 females in August, and 16 males and 9 females in September. The basic design of the bioassay trial utilized square translucent sheets (1.8 m × 1.8 m) with food and odor sources. Four sheets, each with 5 kg of steam-flaked corn placed at its center (feeding area), were laid out in a line at intervals of approximately 3 m. In order to prevent the animals from accidentally destroying the odor sources, self-made odor generators were constructed from iron tubes (2.5 cm i.d. × 25 cm length, equipped with 40 odor holes, each having a diameter of 5 mm), into which were inserted 2 ml of pyrazine cocktail (DMP, TMP, and EDMP, 33% v/v of each) or no odorant (control) mixed with cotton. At two of the four feeding areas, an odor generator containing the pyrazine cocktail was placed at each of the four corners (that is, 8 ml of pyrazine cocktail per feeding area), and the other two areas were fitted with control odor generators. The animals' movements and behaviors were recorded by two observers, each with a video camera, positioned 10 m away (a distance that did not interfere with the animals' behavior). The trials were terminated after 15 min.
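The odor load described above can be recomputed as a quick consistency check. The sketch below is illustrative arithmetic only: the constant names are hypothetical, the values are those stated in the text, and the equal three-way split is an assumption based on the stated 33% v/v of each component.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
ML_PER_GENERATOR = 2.0               # ml of pyrazine cocktail per odor generator
GENERATORS_PER_AREA = 4              # one generator at each corner of a sheet
TREATED_AREAS = 2                    # two of the four feeding areas carried the cocktail
COMPONENTS = ("DMP", "TMP", "EDMP")  # assumed to be mixed in equal volumes

ml_per_area = ML_PER_GENERATOR * GENERATORS_PER_AREA
ml_per_trial = ml_per_area * TREATED_AREAS
ml_per_component_per_area = {name: ml_per_area / len(COMPONENTS) for name in COMPONENTS}

print(ml_per_area)   # 8.0, matching the 8 ml per feeding area stated in the text
print(ml_per_trial)  # 16.0
```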

**FIGURE 1 | Feeding trial. (A)** An average of 24 deer participated in the feeding trial at the deer park in Nishiokoppe, Hokkaido. The pyrazine cocktails were placed in two of the four feeding areas. **(B)** Average changes in the number of deer surrounding the pyrazine cocktail (closed symbols) or control feeding area (open symbols). The numbers were obtained by counting the deer near each feeding area every 30 s. Circles and triangles indicate the average number from two feeding areas per odor condition in the trials in August and September, respectively. Lines show the average values from both trials.

### **DATA COLLECTION AND PROCESSING**

We collected and analyzed data from adults and juveniles (>1 year old), distinguishing the sex and age of the deer according to antler and body size. Because of their inconsistent participation (just a few seconds) in the trials, we excluded three fawns (all <1 year old). We conducted the behavioral observations on a total of 47 deer (28 males and 19 females). Movements of individual deer were evaluated by identifying each animal's position every 2.5 s as recorded by the video camera. The positions were defined according to five position indexes: the animal pressed its head into the sheet of the control (+2) or pyrazine cocktail (−2) area; the animal was within 1 m of the feeding area but did not press its head to the sheet (control, +1; pyrazine cocktail, −1); the animal was far from the feeding area (0). We defined ±2 and ±1 of the position index as "access" and "approach", respectively, and then quantified the avoidance behaviors from the position index traces. In addition, we noticed that some deer lifted up their tail upon accessing the feeding sheet (tail-flag), rapidly escaped with their neck retracted (flight), or sprang back (jump) from the feeding sheet associated with the pyrazine cocktail odor generators. Therefore, we recorded these reactions as behavioral measures that might indicate fear (Caro, 2005; Stankowich and Coss, 2006). An observer who was not aware of each animal's test condition later analyzed each deer's movements and behaviors as recorded on video.
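The position-index scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the duration estimate (number of samples × 2.5 s) and the definition of an access bout as a run of consecutive ±2 samples are assumptions, and the example trace is made up.

```python
from collections import Counter

SAMPLE_INTERVAL_S = 2.5  # one position score every 2.5 s, as in the text
VALID_INDEXES = {-2, -1, 0, 1, 2}


def count_bouts(trace, target):
    """Count runs of consecutive samples equal to `target` (assumed bout definition)."""
    bouts = 0
    previous = None
    for position in trace:
        if position == target and previous != target:
            bouts += 1
        previous = position
    return bouts


def summarize_trace(trace):
    """Summarize one animal's position-index trace for both feeding areas."""
    if any(position not in VALID_INDEXES for position in trace):
        raise ValueError("position index must be one of -2, -1, 0, +1, +2")
    counts = Counter(trace)
    summary = {}
    # Positive indexes refer to the control area, negative ones to the pyrazine area.
    for area, sign in (("control", 1), ("pyrazine", -1)):
        summary[area] = {
            "access_s": counts[2 * sign] * SAMPLE_INTERVAL_S,
            "approach_s": counts[1 * sign] * SAMPLE_INTERVAL_S,
            "access_bouts": count_bouts(trace, 2 * sign),
        }
    return summary


# Made-up trace of 11 samples (27.5 s); not data from the study.
example_trace = [0, 1, 2, 2, 1, 0, -1, -2, -1, 0, 2]
example_summary = summarize_trace(example_trace)
print(example_summary["control"])   # {'access_s': 7.5, 'approach_s': 5.0, 'access_bouts': 2}
print(example_summary["pyrazine"])  # {'access_s': 2.5, 'approach_s': 5.0, 'access_bouts': 1}
```

Using the sign of the index to separate the two areas keeps a single trace per animal, which mirrors the way the text folds both odor conditions into one five-level index.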

### **STATISTICAL ANALYSIS**

Data are given as means ± SEM. Overall statistical differences were determined using Friedman tests for changes in the duration and frequency of access. Differences between the pyrazine cocktail and control areas were detected using Wilcoxon signed-rank tests for paired time periods. Differences between males and females were tested by Mann-Whitney *U*-tests. The criterion for statistical significance was *p* < 0.05 in all cases.
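To make the summary statistic and the paired comparison concrete, the sketch below computes a mean ± SEM and the Wilcoxon signed-rank statistic using only the Python standard library. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the data are made up, zero differences are dropped and ties receive average ranks (one common convention), and no p-value is computed. In practice a statistics package would normally be used; SciPy, for example, provides `scipy.stats.wilcoxon`, `scipy.stats.mannwhitneyu`, and `scipy.stats.friedmanchisquare`.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def average_ranks(values):
    """Rank values starting at 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank_statistic(x, y):
    """Return min(W+, W-) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


# Made-up access durations (s) for five animals; not data from the study.
control_s = [40.0, 55.0, 30.0, 62.5, 47.5]
pyrazine_s = [10.0, 20.0, 32.5, 15.0, 5.0]

pyrazine_mean, pyrazine_sem = mean_sem(pyrazine_s)
w_statistic = wilcoxon_signed_rank_statistic(control_s, pyrazine_s)
print(pyrazine_mean, round(pyrazine_sem, 2))  # 16.5 4.72
print(w_statistic)                            # 1.0
```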

### **RESULTS**

### **PYRAZINE ANALOG-INDUCED SUPPRESSION OF DEER APPROACH FEEDING AREA**

To explore the avoidance effect of pyrazine analogs on deer, we conducted feeding trials in August and September (**Figure 1A**). During the first 5 min, the average number of deer attracted to a feeding area was approximately two in the presence and six in the absence of the pyrazine cocktail (**Figure 1B**). The feeding areas pervaded by the pyrazine cocktail odor continued to attract few deer until the end of the 15-min trial (**Figure 1B**). This result indicates that the odor of pyrazine analogs may inhibit deer from approaching despite the presence of maize.

### **AVOIDANCE BEHAVIORS ELICITED BY ODOR OF PYRAZINE ANALOGS**

In order to examine the effect of the pyrazine cocktail on individual deer, we first evaluated the movements of the individuals (see details in Section Materials and Methods). Among the 28 males and 19 females participating in the two trials, most of them spent more time eating maize grain in the control feeding area than in the pyrazine cocktail area (**Figures 2A,B**).

From the movement traces, we quantified avoidance behaviors at the pyrazine cocktail and control odor feeding areas. Between the trials in August and September, there were no dramatic differences in any of the avoidance or fear-related behaviors (described below) of deer at the feeding areas (*p* > 0.05, Mann-Whitney *U*-test; Supplementary Figure S1). We then compared avoidance behaviors between males and females (**Figure 3**). Both males and females spent less time in the pyrazine cocktail area than in the control area during the first 5 min of the trial, and this was also the case throughout the 15-min trial (**Figures 3A,B**). The changes in the frequency of access were also similar to those in the duration (**Figures 3C,D**). Moreover, for both sexes, the odor of the pyrazine cocktail increased the latencies to reach the feeding area from their approach within 1 m (**Figure 3E**). Interestingly, females showed poorer approaches to the feeding area than males in the presence of the pyrazine cocktail (**Figures 3F,G**). These results indicate that both males and females avoid pyrazine cocktail odor and do not become easily habituated to the odor for tens of minutes.

### **FEAR-RELATED BEHAVIORS PROVOKED BY PYRAZINE ANALOGS**

Since the pyrazine cocktail provokes fear-related behaviors in mice (Osada et al., 2013), we examined whether the pyrazine cocktail could induce these behaviors in deer. We quantified the tail-flag, flight, and jump actions, which are known to be fear responses of deer to predators (Caro, 2005; Stankowich and Coss, 2006). In both sexes, tail-flag was observed more frequently during access to the pyrazine cocktail feeding area than during access to the control feeding area (**Figure 4A**). The other fear responses, flight and jump, were observed mainly in females (**Figures 4B,C**). These results indicate that the odor of the pyrazine cocktail could provoke fear in deer.

were obtained from the individual animals during every 2.5-min period of the 15-min trial. **(C,D)** Frequency of access was defined as the number of times that males and females craned their necks to the sheet in either area. Data were obtained from the same deer pack as in **(A)** and **(B)**. **(E)** Latency to the access from the approach. Data were obtained from the same deer pack

of the position indexes shown in **Figure 2**. Open and closed bars indicate the control and pyrazine cocktail areas, respectively, in all panels. The time-dependent differences in the values for the pyrazine cocktail in **(A–D)** are not significant (p > 0.05, Friedman test). \* p < 0.05, \*\* p < 0.01, Wilcoxon signed-rank test. † p < 0.05, Mann-Whitney U-test.

### **DISCUSSION**

The present study shows that Hokkaido deer are repelled by the odor of a cocktail of pyrazines identified in wolf urine, and that this odor in a feeding area significantly inhibits their approach to the area (**Figure 1**). In addition, analysis of individual deer behaviors clarified the ability of the pyrazine cocktail odor to keep deer from entering foraging areas (**Figures 2**–**4**). Moreover, similar effects were observed at least 1 month after the first experiment day (Supplementary Figure S1). As mentioned in the Introduction section, we recently clarified that wolf urine odors induce aversive and fear-related responses in mice in an experimental setting (Osada et al., 2013). In that paper, we clarified that these activities were mainly due to the presence of certain volatile pyrazine compounds. Previous studies identified novel kairomones from odor sources of predators of rodents (Vernet-Maury et al., 1984; Wallace and Rosen, 2000; Papes et al., 2010; Ferrero et al., 2011). However, we did not find any reports that confirmed the effects of these kairomones on other kinds of mammals, including ungulates.

Previous studies clearly indicated that wolf urine contains semiochemicals that repel their prey species (Jorgenson et al., 1978; Raymer et al., 1984; Sullivan et al., 1985a,b; Nolte et al., 1994). Although there are numerous studies of repellents for ungulates, we are not aware of any that identified kairomone(s) effective against ungulates. Among the compounds examined, Δ<sup>3</sup>-isopentenyl methyl sulfide and its derivatives are candidate predator kairomones (Wilson et al., 1978). However, their capacity to provoke avoidance behaviors in ungulates is limited (Wilson et al., 1978; Hani and Conover, 1995; Lindgren et al., 1995). Therefore, to the best of our knowledge, a mixture of pyrazine analogs is the first example of kairomones that provoke an aversive effect in both rodents and ungulates. However, we do not preclude the significance of the above-mentioned putative kairomones previously identified. In addition, a synergistic effect might exist between pyrazine analogs and these alkyl sulfides.

In the present study, we observed that the proportions of fear-related behaviors, such as tail-flag, flight, and jump, were significantly higher in the presence of the pyrazine cocktail than in the presence of the control. Interestingly, some of these fearful behaviors depended on sex: female but not male deer exhibited fearful reactions (**Figure 4**). In a previous study on experimental mice, the magnitude of avoidance of trimethylthiazoline (TMT) as well as of natural fox feces was significantly higher in females than in males (Buron et al., 2007). Moreover, Perrot-Sinal et al. (1996) provided evidence for sex differences in meadow voles in both basal activity level and activity following exposure to the odor of a predator, the red fox. The present result therefore suggests that ungulates also exhibit sex differences in avoidance behavior.

The extent to which Hokkaido deer remain averse to pyrazine analogs over time remains to be seen. In this study, the avoidance effects provoked by pyrazine analogs were still observed 1 month after the first application of the pyrazine cocktail to deer belonging to the same herd (Supplementary Figure S1). This implies that pyrazine analogs maintain their effect on deer over time. A few previous studies demonstrated the continued effectiveness of predator odors as repellents. Parsons and Blumstein (2010) demonstrated that, despite repeated exposure to the scent of dingo, macropodids persistently avoided an area of highly palatable food. In addition, Sullivan et al. (1985b) demonstrated that wolf urine odor suppressed the feeding of black-tailed deer on salal significantly more effectively than a control for at least 6 days. Therefore, it is conceivable that the pyrazine analogs are at least a portion of the components that evoke the significant and prolonged aversive effect of wolf urine on prey animals. Indeed, in our preliminary experiment on mice, the analogs retained a powerful effect over repeated exposures (data not shown). In the present study, however, we conducted trials on only two occasions. Further experimental study is needed to determine whether the odor of pyrazine analogs has a continued aversive effect on deer over extended periods.

Our observations also raise the question of why Hokkaido deer avoid pyrazine analogs even though the Japanese wolf (*Canis lupus hodophilax*), a potential predator, has been extinct for about 100 years (Ministry of the Environment, 2014). The extinction of a large carnivore as a consequence of anthropogenic disturbance induces important changes in ecological patterns involving behavior and interspecific ecological interactions (Berger, 1999). Indeed, Pyare and Berger (2003) demonstrated that female moose (*Alces alces*) from a region (Mainland Alaska) with a wolf and grizzly bear (*Ursus arctos*) assemblage responded significantly more strongly to odors of both carnivores than did female moose from Grand Teton National Park (Wyoming), where these predators had been absent for 60–75 years until the 1990s. At first glance, our present results may therefore appear to be at odds with these previous results. However, they also found that the vigilance response of Mainland Alaska moose to wolf odor, although significantly higher than that of Wyoming moose, was surprisingly not higher than that of moose from a predator-free population (Kenai Peninsula), suggesting that learning was not a necessary component of wolf urine avoidance (Pyare and Berger, 2003). Moreover, a recent study demonstrated that black-tailed deer react more strongly to wolf cues than to cues associated with the less dangerous black bear (*Ursus americanus*), despite having had no contact with wolves for more than 100 years (Chamaillé-Jammes et al., 2014). Therefore, the present results suggest that pyrazine analogs are at least one of the components that provoke avoidance in prey at an instinctive level. Kimball et al. (2009) indicated that avoidance of blood and other animal-derived substances may be the result of an "evolutionary memory" (Provenza, 1995) that conveys information about potential sources of pathogens. Similarly, pyrazine analogs might convey information about predator odor to the prey even if the prey has never encountered that species of predator.

Prey animals detect predator odors via the main olfactory and/or the vomeronasal systems (reviewed in Takahashi, 2014). Naïve rats and mice exposed to the odor of foxes or TMT, the most effective fear-inducing component in fox feces, showed species-specific defensive responses, such as freezing in place (Vernet-Maury et al., 1984; Wallace and Rosen, 2000; Fendt et al., 2005; Buron et al., 2007; Fendt and Endres, 2008; Janitzky et al., 2009). TMT is mainly detected by the main olfactory system (Kobayakawa et al., 2007). Ferrero et al. (2011) reported that 2-phenylethylamine (2-PEA), a common constituent of carnivore urine, triggers hard-wired aversion via the olfactory sensory neurons. On the other hand, rodents exposed to cat-derived odors demonstrated fear-related responses and the elevation of stress hormones (Takahashi et al., 2005, 2007, 2008) via the accessory olfactory bulb (AOB) in the vomeronasal system (Staples et al., 2008). Papes et al. (2010) demonstrated that derivatives of major urinary proteins of rat and cat activate the vomeronasal organ and AOB neurons, and initiate defensive behaviors in mice. In a previous study, we showed that wolf urine and the volatile pyrazine cocktail also stimulate the murine vomeronasal system (Osada et al., 2013). Therefore, in mice, pyrazine analogs appear to induce avoidance and freezing behaviors via stimulation of the vomeronasal system and perhaps of the main olfactory system as well. Artiodactyla, including deer (Park et al., 2014), have both olfactory systems, as do mice, suggesting that deer also detect pyrazine analogs via olfactory systems similar to those of mice. Previous reports found that TMT and 2-PEA, which induce avoidance and freezing behaviors in rodents, increase plasma corticosterone levels (Kobayakawa et al., 2007; Ferrero et al., 2011). In deer, the stress level could be evaluated by measuring fecal glucocorticoid levels (Millspaugh and Washburn, 2004). Further studies are required on this point.

Although the present study was conducted in a semi-natural experimental setting, we have clearly illustrated that (1) pyrazine analogs identified in wolf urine provoke an aversive effect not only in mice but also in an ungulate, Hokkaido deer; (2) fear-related behaviors as well as avoidance behaviors were observed in deer; and (3) the effects of pyrazine analogs were reproduced 1 month after the first experiment, suggesting the continuity of the aversive effects of the pyrazine analogs in these Hokkaido deer.

This report describes the first experimental demonstration that wolf urine kairomones, pyrazine analogs, have a robust and continual aversive effect on ungulates. However, further studies are needed in order to confirm whether pyrazine analogs provoke an aversive effect on other kinds of wild animals.

### **AUTHOR CONTRIBUTIONS**

Makoto Kashiwayanagi and Kazumi Osada designed the experiment. Makoto Kashiwayanagi, Kazumi Osada, and Sadaharu Miyazono performed the experiment. Sadaharu Miyazono and Makoto Kashiwayanagi analyzed the data. Kazumi Osada and Sadaharu Miyazono wrote the first draft of the manuscript. Makoto Kashiwayanagi critically revised the manuscript and all authors approved the final version.

### **ACKNOWLEDGMENTS**

We thank all of the staff of the deer park for providing a study site. This work was supported by the *A-step* feasibility study program from the Japan Science and Technology Agency (No. AS251Z00533M to Makoto Kashiwayanagi); by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (No. 24770064 to Sadaharu Miyazono); and by Asahikawa Medical University and Health Sciences University of Hokkaido.

### **SUPPLEMENTARY MATERIAL**

The supplementary material for this article can be found online at: http://www.frontiersin.org/Journal/10.3389/fnbeh.2014.00276/abstract

### **REFERENCES**


red data book. Available online at: http://www.env.go.jp/en/nature/biodiv/reddata.html (Accessed 2014 Apr 23).


deer (*Odocoileus hemionus columbianus*). *J. Chem. Ecol.* 11, 921–935. doi: 10.1007/BF01012078


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 April 2014; accepted: 28 July 2014; published online: 14 August 2014*. *Citation: Osada K, Miyazono S and Kashiwayanagi M (2014) Pyrazine analogs are active components of wolf urine that induce avoidance and fear-related behaviors in deer. Front. Behav. Neurosci. 8:276. doi: 10.3389/fnbeh.2014.00276*

*This article was submitted to the journal Frontiers in Behavioral Neuroscience*. *Copyright © 2014 Osada, Miyazono and Kashiwayanagi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.