Neural correlates of feedback processing in decision-making under risk

Schuermann, Beate; Endrass, Tanja; Kathmann, Norbert

doi:10.3389/fnhum.2012.00204

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 06 July 2012
Sec. Cognitive Neuroscience
Volume 6 - 2012 | https://doi.org/10.3389/fnhum.2012.00204

Beate Schuermann* Tanja Endrass Norbert Kathmann

Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany

Introduction: Event-related brain potentials (ERPs) provide important information about how sensitively the brain processes varying risks. The aim of the present study was to determine how different risk levels are reflected in decision-related ERPs, namely the feedback-related negativity (FRN) and the P300. Materials and Methods: Twenty participants performed a probabilistic two-choice gambling task while an electroencephalogram (EEG) was recorded. Participants chose between a low-risk option yielding low rewards and low losses and a high-risk option yielding high rewards and high losses. While options differed in expected risks, they were equal in expected values and in feedback probabilities. Results: At the behavioral level, participants were generally risk-averse but modulated their risk-taking behavior according to reward history. An early positivity (P200) was enhanced on negative feedback in high-risk compared to low-risk choices. With regard to the FRN, there were significant amplitude differences between positive and negative feedback on high-risk choices, but not on low-risk choices. While the FRN on negative feedback did not vary with decision riskiness, reduced amplitudes were found for positive feedback in high-risk relative to low-risk choices. P300 amplitudes were larger in high-risk decisions and, in an additive way, after negative compared to positive feedback. Discussion: The present study revealed significant influences of risk and valence processing on ERPs. FRN findings suggest that the reward prediction error signal is increased after high-risk decisions. The increased P200 on negative feedback in risky decisions suggests that large negative prediction errors are already processed in the P200 time range. The later P300 amplitude is sensitive to feedback valence as well as to the risk associated with a decision. Thus, the P300 carries additional information for reward processing, mainly the enhanced motivational significance of risky decisions.

Introduction

A significant function of the human brain is to assess the riskiness of decisions in order to prevent negative outcomes. Brain imaging studies indicate that frontolimbic brain circuits involving the ventromedial prefrontal cortex, amygdala, insula, ventral striatum, and anterior cingulate cortex (ACC) are implicated in risk processing. In particular, the ACC is important for detecting and evaluating unfavorable outcomes (Bush et al., 2000; Luu et al., 2000), and for risk assessment (Ernst et al., 2004; Fukui et al., 2005; McCoy and Platt, 2005). Greater ACC activity predicts enhanced error avoidance (Johansen and Fields, 2004; Frank et al., 2005) and less risk-taking behavior (Paulus and Frank, 2006). An influential model of decision-making under risk is prospect theory (Tversky and Kahneman, 1981). It proposes that human decision makers are generally risk-averse when choosing between alternatives. Nevertheless, risk-taking behavior may also depend on the context (Tversky and Kahneman, 1981): risk aversion increases after gains and decreases after losses.

Studies using event-related brain potentials (ERPs) have revealed that the human brain is able to evaluate the outcome of actions within a few hundred milliseconds. Specific brain potentials are elicited by self-generated responses and performance feedback (Holroyd and Coles, 2002). The error-related negativity (ERN; Falkenstein et al., 1990; Gehring et al., 1990) and the feedback-related negativity (FRN; Miltner et al., 1997) are elicited by erroneous responses and by negative feedback or losses, respectively. ERN and FRN are assumed to originate from the anterior midcingulate cortex (Gehring and Willoughby, 2002; Debener et al., 2005). Therefore, ERN and FRN may reflect similar mechanisms of monitoring and controlling behavior. It has been suggested that the ACC uses reinforcement learning (RL) signals conveyed by the midbrain dopamine system to optimize future decision-making behavior (Holroyd and Coles, 2002). According to the RL theory, ERN and FRN reflect a reward prediction error signal in the ACC that occurs when ongoing events are worse than expected. Subsequently, the ACC triggers an adaptive modification of behavior by relating actions to their consequences (Holroyd and Coles, 2002; Rushworth et al., 2004). Another ERP component that has been shown to carry important information for reward processing is the feedback-related P300, a parietally distributed positivity (Yeung and Sanfey, 2004; Polezzi et al., 2009). It has been suggested that the feedback-related P300 may reflect the extent to which information is motivationally significant or salient (for a review, see Nieuwenhuis et al., 2005). In line with this, the P300 amplitude varies with the motivational significance of feedback information (Yeung and Sanfey, 2004; Polezzi et al., 2009) and is increased in individuals who attribute more meaning to feedback (de Bruijn et al., 2004).

Economic decision theories presume that risk depends on potential losses and increases with their probability and magnitude (Tversky and Kahneman, 1981; Brown and Braver, 2007). In this regard, rational decisions are made on the basis of the expected value, which is a multiplicative combination of the two components (Machina, 1982). Recent studies investigated the different components of risk-taking by assessing the influences of feedback valence, magnitude, and probability on ERP amplitudes. Research on the impact of the probability of feedback has generally shown that both the FRN and the P300 are modulated by this variable, with unexpected feedback being associated with enhanced amplitudes (Holroyd et al., 2004; Hajcak et al., 2007). Furthermore, it was demonstrated that amplitudes are modulated by an interaction between feedback valence and expectancy: unexpected negative feedback is associated with larger amplitudes than unexpected positive feedback (Frank et al., 2005; Moser and Simons, 2009). In gambling paradigms, an additional important variable associated with decision-making under risk is outcome magnitude. Yeung and Sanfey (2004) studied the effects of winning or losing large or small amounts of money on the FRN and P300 and concluded that only the P300 was affected by the amount of monetary loss, whereas the FRN was insensitive to outcome magnitude. In line with this, Toyomaki and Murohashi (2005) reported effects of magnitude on the participants' subjective assessment of losses, but no effects on FRN amplitudes (see also Sato et al., 2005; Hajcak et al., 2006). Other studies reported significant magnitude effects on the FRN (Goyer et al., 2008; Wu and Zhou, 2009). However, tasks in these studies required participants to choose from alternatives without having any information about reward magnitude. To conclude, FRN and P300 seem to reflect different aspects of risk processing in economic decision-making, namely valence and magnitude processing, respectively.

A limitation of most previous studies is that they did not independently control for the effects of probability, magnitude, and expected value. Some studies focusing on neural correlates of feedback processing used different expected values of choices to determine learning (van der Helden et al., 2010; Schuermann et al., 2011). Furthermore, sometimes participants were unaware of possible outcome magnitudes prior to receiving feedback, and thus could not make informed choices (Goyer et al., 2008; Wu and Zhou, 2009). Finally, for gambling tasks, choices often differed in outcome probability (Yeung and Sanfey, 2004; Cohen et al., 2007). To overcome some of these limitations, we designed a gambling task in which expected risk was manipulated independently of expected value and reward probability. Specifically, participants were requested to select between a low-risk option yielding low rewards and low losses and a high-risk option yielding high rewards and high losses. Unlike traditional RL tasks used in ERP research, participants in the present task were not required to learn outcome contingencies throughout the course of the task. In this study, expected values were equal for both options. There was also no difference in reward probabilities between the low-risk and the high-risk option. Examining risk effects also requires that probabilities involved in a decision are explicitly known (Brand et al., 2006; Brown and Braver, 2007). Therefore, in the present task participants were informed about the outcome probabilities. In sum, the present task should provide a better means of assessing pure risk preference and of evaluating the influence of risk parameters on ERPs.

The aim of the present study was to determine how different risk levels are reflected in decision-related ERPs, namely the FRN and the feedback-related P300. To this end, we developed and tested a novel two-choice gambling task allowing for the examination of risk-taking in unambiguous situations (Pilot experiment). The associated electrocortical indicators of risk-taking behavior were examined in the main experiment. Considering that expected values of high-risk and low-risk options were equal, we predicted that participants would be predominantly risk-averse, that is, less willing to choose risky options (Tversky and Kahneman, 1981). Furthermore, we assumed that participants would be more risk-averse following gains and relatively more risk-seeking following losses (Tversky and Kahneman, 1981). According to the RL theory, which states that the FRN responds to the difference between experienced and anticipated rewards, we predicted enhanced FRN amplitudes for high-risk compared to low-risk decisions. We also assumed that P300 amplitudes would be enhanced for high-risk decisions compared to low-risk ones due to the enhanced motivational significance of risky decisions (Nieuwenhuis et al., 2005).

Pilot Experiment

Materials and Methods

Participants

Fifty participants (30 women and 20 men) took part in the pilot experiment. Their mean age was 30.5 years (SD: 11.4; range: 18–50). Three of the participants were left-handed. Participants had no history of neurological or psychiatric diseases. All participants received verbal and written explanations of the purpose and procedures of the study, and gave written informed consent in accordance with the Declaration of Helsinki.

Task and procedure

A computerized probabilistic two-choice gambling task was administered, which involved low-risk and high-risk options. On each trial participants were asked to choose between two options that were presented on a computer screen (see Figure 1). The colors of the stimuli indicated the relative probability of winning (green), which was always 75%, and the relative probability of losing (red), which was always 25%. Reward magnitudes associated with choice options were displayed in each stimulus. Choices were made by pressing one of two corresponding response buttons. After 700 ms, participants were shown the outcome associated with the selected option for 1100 ms. A red frowny face together with a negative amount indicated negative feedback, while a green smiley face together with a positive amount indicated positive feedback. In addition, the total account balance across trials was presented below the feedback stimuli. Choices had to be made within 2300 ms; otherwise participants were prompted to respond more quickly. The next trial was presented after an intertrial interval of 750–950 ms. Following standardized written instructions, participants performed two practice trials. The pilot experiment consisted of 112 trials in total and lasted about 5 min. Participants were instructed to earn as many points as possible and were told that each point corresponded to one Euro cent. Participants received on average 4.50€ in the pilot experiment. Table 1 presents an overview of the reinforcement schedule. In each trial, participants had to choose either between options A and B (56 trials) or between options C and D (56 trials). The options with the larger maximum outcomes were termed high-risk (options B and D), and the options with the smaller maximum outcomes were termed low-risk (options A and C). Positions of options on the computer screen changed across trials in pseudo-random order. At the beginning of the pilot experiment, participants were informed that the presented options differed in expected risks, while the expected values were equal for high-risk and low-risk options. Following Brown and Braver (2007), the expected risk of each option was defined as [loss probability × (rewards – losses)], and the expected value of each option was defined as [(reward probability × rewards) + (loss probability × losses)].
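The two definitions can be illustrated with a minimal sketch. The reward and loss magnitudes below are hypothetical and were chosen only so that the two options share an expected value; they are not the values of the reinforcement schedule in Table 1. The sketch also assumes that losses enter both formulas as signed (negative) numbers, which is one reading of the bracketed definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    reward: float           # points gained on a win (positive)
    loss: float             # points on a loss, stored as a signed, negative number (assumption)
    p_reward: float = 0.75  # reward probability stated in the task description
    p_loss: float = 0.25    # loss probability stated in the task description


def expected_value(opt: Option) -> float:
    # (reward probability x rewards) + (loss probability x losses)
    return opt.p_reward * opt.reward + opt.p_loss * opt.loss


def expected_risk(opt: Option) -> float:
    # loss probability x (rewards - losses)
    return opt.p_loss * (opt.reward - opt.loss)


# Hypothetical magnitudes for illustration only, NOT taken from Table 1.
low = Option("low-risk (hypothetical)", reward=10, loss=-10)
high = Option("high-risk (hypothetical)", reward=30, loss=-70)

for opt in (low, high):
    print(f"{opt.name}: EV = {expected_value(opt)}, risk = {expected_risk(opt)}")
```

With these illustrative numbers both options have the same expected value, while the expected risk of the high-risk option is five times that of the low-risk option, which is the kind of dissociation the task is designed to produce.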

Figure 1. Schematic depiction of the probabilistic two-choice gambling task. During each trial, participants were asked to choose between two options that were represented visually by a histogram. The colors of the histogram indicated the relative probability of gaining (colored green) which was always 75% and the relative probability of losing (colored red) which was always 25%. The current amount of gains and losses associated with each option were displayed in each histogram. Choices were made by pressing one of two corresponding response buttons. After 700 ms, participants were shown the outcome associated with their choice for 1100 ms. A red frowny face together with a negative amount indicated negative feedback, while a green smiley face together with a positive amount indicated positive feedback. In addition, the current total amount was presented below the feedback stimuli.

Table 1. Reinforcement schedule in the probabilistic two-choice gambling task.

Data analyses

To assess risk-taking behavior, percentages (relative to the total number of choices) of low-risk (options A and C) and high-risk choices (options B and D) were determined and analyzed using two-tailed t-tests. Percentages of low-risk and high-risk choices were further analyzed as a function of total account balance (positive account balance, i.e., >0€, vs. negative account balance, i.e., <0€) using two-tailed t-tests. Moreover, we analyzed whether the probability of high-risk choices on a given trial varied as a function of prior feedback valence and prior risk-taking behavior. This was done with an ANOVA with the within-subject factors previous feedback valence (gains vs. losses on the previous trial) and previous risk-taking (low-risk options vs. high-risk options on the previous trial). Statistical analysis was carried out with the Predictive Analytic Software (PASW) 19.0 for Windows.
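As a rough illustration of the sequential analysis, the following sketch computes, for one participant, the proportion of high-risk choices in each cell defined by the previous trial's feedback valence and the previous trial's risk level. The function name, the trial encoding, and the toy sequence are all hypothetical; the original analysis was run in PASW, and the ANOVA over these per-participant cell values is not shown here.

```python
from collections import defaultdict


def high_risk_rate_by_history(trials):
    """Proportion of high-risk choices per (previous valence, previous risk) cell.

    `trials` is a list of (chose_high_risk, won) boolean pairs in trial order
    (a hypothetical encoding chosen for this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [high-risk choices, all choices]
    # Pair every trial with its successor; the first trial has no predecessor.
    for (prev_high, prev_won), (cur_high, _) in zip(trials, trials[1:]):
        cell = ("gain" if prev_won else "loss", "high" if prev_high else "low")
        counts[cell][0] += int(cur_high)
        counts[cell][1] += 1
    return {cell: n_high / n_all for cell, (n_high, n_all) in counts.items()}


# Made-up trial sequence for illustration only, not study data.
toy_trials = [
    (False, True),   # low-risk choice, gain
    (False, False),  # low-risk choice, loss
    (True, False),   # high-risk choice, loss
    (True, True),    # high-risk choice, gain
    (False, True),   # low-risk choice, gain
    (True, True),    # high-risk choice, gain
]

rates = high_risk_rate_by_history(toy_trials)
for cell, rate in sorted(rates.items()):
    print(cell, rate)
```

Each participant would contribute one value per cell, and these four values per participant form the input to a 2 × 2 within-subject ANOVA of the kind described above.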

Results

Table 2 presents the behavioral results. A significant preference for the low-risk options over the high-risk options was found throughout the task [t(49) = 2.84, p = 0.007]. The analysis of risk-taking as a function of actual account balance revealed that participants avoided high-risk options when their current balance was positive [t(49) = 4.71, p < 0.001]. By contrast, high-risk options were preferred when the current balance was negative [t(49) = 4.05, p < 0.001]. Further, it was shown that risk preference varied as a function of prior feedback valence [F(1, 49) = 25.62, p < 0.001] and prior risk-taking [F(1, 49) = 14.85, p < 0.001]. The interaction of feedback valence and prior risk-taking was not significant [F(1, 49) < 1, p = 0.970]. These effects reflect that participants preferred higher risks following losses than following gains, as well as following a high-risk decision as compared to a low-risk decision.

Table 2. Behavioral results of the pilot experiment (N = 50) and the main experiment (N = 20) presenting mean (M) and standard deviation (SD).

Discussion

With this pilot experiment we aimed to explore risk-taking behavior using a probabilistic two-choice gambling task. During each trial, participants were required to choose between options associated with two different risk levels. As expected, participants preferred the low-risk options over the high-risk options, although options did not differ with respect to expected values. These results are consistent with the findings of Polezzi et al. (2008). In that study, participants had to choose between a predictable option (which was always associated with a gain of 10€) and an unpredictable option (which was associated with a gain of 30€ or a loss of 10€). The results showed a clear preference for options associated with a predictable outcome, although the expected value of both options was identical. Analysis of the choice history also revealed a tendency to avoid losses. Participants strongly avoided the high-risk options following gains and when they had positive balances. This was not the case after losses and with negative account balances. When faced with rewarding feedback, participants were possibly more willing to protect the money they had and thus showed more conservative behavior. By contrast, the increased risk-taking after losses might reflect the anticipation of larger monetary rewards that would reduce the negative consequences (in terms of corrective actions). These findings are in line with previous studies (Gehring and Willoughby, 2002; Goyer et al., 2008) showing that participants are more likely to engage in risky choices following losses. In summary, the pilot experiment demonstrated that the two-choice gambling task is suitable for examining risk-taking behavior in unambiguous situations.