Edited by: Josef P. Rauschecker, Georgetown University School of Medicine, USA
Reviewed by: Kimmo Alho, University of Helsinki, Finland; Amber Leaver, Georgetown University, USA
*Correspondence: Annerose Engel and Peter E. Keller, Music Cognition and Action Group, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstr. 1a, 04105 Leipzig, Germany. e-mail:
This article was submitted to Frontiers in Auditory Cognitive Neuroscience, a specialty of Frontiers in Psychology.
This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
The ability to evaluate spontaneity in human behavior is called upon in the esthetic appreciation of dramatic arts and music. The current study addresses the behavioral and brain mechanisms that mediate the perception of spontaneity in music performance. In a functional magnetic resonance imaging experiment, 22 jazz musicians listened to piano melodies and judged whether they were improvised or imitated. Judgment accuracy (mean 55%; range 44–65%), which was low but above chance, was positively correlated with musical experience and empathy. Analysis of listeners’ hemodynamic responses revealed that amygdala activation was stronger for improvisations than imitations. This activation correlated with the variability of performance timing and intensity (loudness) in the melodies, suggesting that the amygdala is involved in the detection of behavioral uncertainty. An analysis based on the subjective classification of melodies according to listeners’ judgments revealed that a network including the pre-supplementary motor area, frontal operculum, and anterior insula was most strongly activated for melodies judged to be improvised. This may reflect the increased engagement of an action simulation network when melodic predictions are rendered challenging due to perceived instability in the performer's actions. Taken together, our results suggest that, while certain brain regions in skilled individuals may be generally sensitive to objective cues to spontaneity in human behavior, the ability to evaluate spontaneity accurately depends upon whether an individual's action-related experience and perspective taking skills enable faithful internal simulation of the given behavior.
Imagine stepping into a jazz club, where you are met by the strains of a pianist negotiating a mesmerizing solo unlike anything that you have heard before. Would you be able to tell from the sounds alone whether the pianist is improvising or playing a rehearsed melody?
Spontaneity is a highly valued quality in many of the world's music performance traditions. Its appreciation by listeners presumably relies upon the interaction of objective auditory cues in a musical performance and the subjective experience and expertise of the listener. It is, therefore, quite likely that individuals differ in their ability to evaluate spontaneity in a performer's actions. This ability, broadly speaking, concerns the sensitivity of one individual to the degree of spontaneity in another's behavior. Such sensitivity is relevant not only to the esthetic appreciation of music and drama, but may also be relevant when inferring others’ intentions in everyday situations (e.g., when judging whether someone's behavior is calculated and intended to deceive). Improvised musical performance, however, presents a paradigmatic domain in which to study the perception of spontaneity in human behavior.
Musical improvisation is a creative process during which a performer aims to compose novel music by deciding what sounds to produce – and when, as well as how, to produce them – within the real-time constraints of the performance itself (Pressing,
Our own previous research (Keller et al.,
The current study extends this work to behavioral and brain processes associated with the perception of musical spontaneity by listeners. Specifically, in a functional magnetic resonance imaging (fMRI) experiment, we investigated the ability of musically trained listeners to differentiate between excerpts from the improvised and rehearsed jazz piano solos that were analyzed by Keller et al. (
With respect to listener behavior, we assumed that the perception of musical spontaneity is based on the detection of auditory cues reflecting uncertainty in the performer, and, therefore, that the ability to judge whether a performance is improvised or rehearsed depends on the listener's sensitivity to fluctuations in parameters such as event timing and intensity. Sensitivity to these fluctuations could potentially be influenced by factors that affect general responsiveness to uncertainty in other individuals’ behavior, as well as by factors related more specifically to uncertainty in the production of piano melodies. The former, general factors may include socio-cognitive variables like empathy, i.e., the ability to understand others’ feelings (of uncertainty, in the present case). Factors related more specifically to the perception of piano melodies include the listener's own experience at playing the piano. Through such experience, an individual has the opportunity to learn about the effects of uncertainty on the variability of timing and intensity in their own playing. Experienced listeners – especially if highly empathic – may therefore be able to recognize these hallmarks of spontaneity in another pianist's performance.
A distinction between domain general and music specific processes can, likewise, be hypothesized with respect to the brain mechanisms underlying the perception of musical spontaneity. At one level, brain areas that are generally sensitive to behavioral variability related to uncertainty may play a role in the detection of such spontaneity. Studies investigating the neural correlates of the perception of behavioral uncertainty point to the involvement of brain regions including anterior cingulate cortex, insula, and amygdala (Singer et al.,
In addition to brain regions that are generally sensitive to behavioral uncertainty, neural mechanisms that enable a skilled listener to perceive uncertainty in a performer's actions on a specific instrument (in this case, the piano) may facilitate the evaluation of musical spontaneity. Functional links between perceptual and motor processes constitute such a mechanism. Considerable evidence for such perception–action links has accumulated in the auditory domain (Kohler et al.,
The co-activation of sensory and motor areas (in the absence of overt movement) is consistent with the proposal that action perception recruits covert sensorimotor processes that internally simulate the observed action (Rizzolatti and Craighero,
In the context of music listening, internal simulation processes may trigger anticipatory auditory images of upcoming sounds (Keller,
Viewing action simulation in light of the broader claim that perception and action recruit common neural networks (Hommel et al.,
While research studies on musical spontaneity are small in number, a relatively large body of related work has been conducted in the broader field of voluntary action control. In this field, actions are classified along a continuum reflecting the degree to which they are controlled internally (i.e., endogenously) by the agent or externally (i.e., exogenously) by environmental cues (Waszak et al.,
Based on the foregoing, we hypothesized that listeners with jazz piano experience would be sensitive to differences in the degree of musical spontaneity in improvised and imitated jazz piano solos. Specifically, the ability to discriminate between these modes of performance should vary as a function of the listener's amount of musical experience and empathy, to the extent that these factors affect the ability to simulate the performers’ actions (for both improvisation and imitation) and to recognize auditory cues to uncertainty in the timing and intensity of the performances. With respect to the neural correlates of evaluating musical spontaneity, we hypothesized that brain regions that have been implicated in the detection of behavioral uncertainty (anterior cingulate cortex, insula, and amygdala) would be sensitive to uncertainty related fluctuations in performance timing and intensity. Furthermore, we expected that the probability of judging a performance to be improvised would increase with increasing demands placed on brain regions involved in action simulation (vPM, IPL, anterior insula, frontal operculum) due to perceived unpredictability in the performer's actions. Finally, to the extent that simulations are high in fidelity, differences in brain activation when listening to improvisations vs. imitations should be observed in similar regions to those found for the production of improvised and rehearsed music (pre-SMA, dPM, DLPFC) and of internally and externally controlled actions (RCZ, SFG).
To take into account the possibility that some of the above-mentioned brain regions may differentiate between improvisations and imitations even when listeners fail to make accurate explicit judgments, we analyzed the fMRI data in accordance with a 2 × 2 factorial design that incorporated both the objective classification of stimulus melodies (real improvisations/real imitations) and subjective classifications based on listeners’ responses (judged improvised/judged imitated). The main effect contrast for the objective classification was expected to reveal differences in neural processing related to physical differences between improvised and imitated melodies. The main effect contrast for subjective classifications tested the neural bases of listeners’ beliefs, and was expected to be informative about experience-related processes such as action simulation.
The analyzed sample of listeners consisted of 22 healthy male jazz musicians (mean age = 24 years; range 19–32 years) who had on average 12.8 years (SD 6.8) of piano playing experience, 6.8 years (SD 4.8) of which involved playing jazz. Piano was the primary instrument for 7 of the participants and the second instrument for 15 participants. The average amount of time spent practicing the piano per day was 1 h (SD 1), with 0.5 h (SD 0.7) focused on jazz. In addition, participants spent on average 2.4 h/week (SD 3.7) playing piano in ensembles, with 1.5 h (SD 3.1) devoted to jazz.
Further details concerning participants’ musical experience are as follows. Thirteen of the 22 individuals were studying, or had completed studies in, music at the university level (specializing in jazz performance, music education, church music, or music theory). The whole sample (
Twenty participants were right-handed and two were left-handed according to the Edinburgh handedness inventory (Oldfield,
The experimental stimulus set included 84 10-s excerpts from piano melodies that had been recorded over novel “backing tracks” representing three contrasting styles found in jazz (swing, bossa nova, blues ballad; see Keller et al.,
The production of stimulus materials proceeded via the following steps. First, chord progressions that are characteristic of the three styles (swing, bossa nova, blues ballad) were composed by a professional jazz pianist/composer (Andrea Keller)
Six different pianists (mean ± SD: 11.8 ± 5.8 years of piano experience, of which 6.0 ± 4.7 years were jazz piano experience) were recruited to create the stimuli for the present study. However, one pianist (who had only one year of experience playing jazz piano) was judged by authors Annerose Engel and Peter E. Keller to be generally poor at imitating other pianists’ improvisations, and all of his performances were discarded. The five pianists who were retained had an average of 11.2 years (SD 6.3) of formal piano training, of which 7.0 years (SD 4.5) involved the development of jazz piano skills, including improvisation. Three of these pianists were professional musicians (a jazz pianist, church musician, and music teacher) and two were highly competent amateur musicians. The pianists practiced jazz piano daily (1.9 ± SD 1.9 h; range 0.5–5 h) and played in jazz bands on a weekly basis (for 3.2 ± SD 2.2 h on average; range 1–6 h).
The pianists were asked to improvise melodies over the backing tracks on a digital piano (Yamaha Clavinova, CLP150). Anvil Studio was used to present the backing tracks by playing back the relevant MIDI files on a second, identical digital piano. The backing tracks were unfamiliar to the pianists prior to the improvisation session. Charts showing chord symbols that indicated the harmonic progressions in each backing track were visible during improvisation. Each pianist performed three improvisations per backing track, which were recorded in MIDI format using Anvil Studio. From these improvisations, 30–60 s excerpts (selected for their musical integrity by an experienced music producer and author Peter E. Keller) per pianist/style were transcribed by a professional musician using software for musical transcription and notation (Finale 2005, Coda Music Technology/MakeMusic, Inc.)
After 4–12 weeks had elapsed, each of the six pianists returned to the laboratory to imitate the selected excerpts from his or her improvisations (self-imitation) and those produced by two of the other pianists (other imitation). Scanned versions of the transcribed excerpts were sent to the pianists via email approximately one week prior to these imitation sessions, so that the pianists could learn the notes. During the imitation recording sessions, pianists were instructed to reproduce all audible details of the improvised performances, including the notes and stylistic performance parameters related to timing and expression. Pianists were permitted to listen to the original improvisations – and to practice imitating them – as many times as was needed in order to feel confident in reproducing them, and the transcriptions of the improvised excerpts remained in view during the subsequent recording of imitated performances. Two versions of each excerpt were recorded in MIDI format using Anvil Studio. The original improvisation and backing track could be heard while recording the first (“duet”) version, while the second (“solo”) version was accompanied by only the backing track. Pianists were allowed to record several takes of each imitation, until they indicated to the experimenter that they were satisfied that they had produced the best possible imitation.
In addition to the pianist who was judged to be poor at imitating other pianists’ improvisations (see above), another individual was technically unable to imitate one of the bossa improvisations properly. Consequently, 14 improvisations (5 blues, 4 bossa, and 5 swing) and 14 corresponding (solo, other) imitations played by five of the pianists were used as the basis for generating stimuli for the fMRI experiment. These performances and accompanying backing tracks were played back on a digital piano under the control of Anvil Studio, and sound output was recorded as .wav audio files by Logic Pro 8.0.2 (Apple, Inc.)
Several performance parameters were extracted from portions of the MIDI files corresponding to each 10-s stimulus item: (a) identity and number of notes played (i.e., keystrokes produced), (b) the duration of inter-onset intervals (IOIs) between successive keystrokes (a measure of performance timing), and (c) the relative intensity of keystrokes within a stimulus item (note that intensity, or loudness, is proportional to the force with which a piano key is struck, and can be measured in “MIDI velocity” in arbitrary units ranging from 1 (soft) to 127 (loud)). These performance parameters were compared across improvisations and imitations using two-tailed
The number of notes played (mean ± SD) did not differ significantly between improvised items (27.0 ± 7.6) and imitated items (25.9 ± 7.2),
Performance timing and intensity were analyzed separately in each performed melody item by computing the mean, variance, and Shannon's information entropy (Shannon,
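Parameter | Measure | Improvisations | Imitations |
---|---|---|---|
Timing (IOI, ms) | Mean | 376.9 ± 117.5 | 392.7 ± 118.5 |
Timing (IOI, ms) | Variance | 135051 ± 131281 | 141390 ± 156586 |
Timing (IOI, ms) | Entropy | 2.90 ± 0.23 | 2.79 ± 0.23 |
Intensity (MIDI velocity) | Mean | 74.1 ± 8.8 | 74.3 ± 6.7 |
Intensity (MIDI velocity) | Variance | 276.2 ± 91.4 | 149.0 ± 83.6 |
Intensity (MIDI velocity) | Entropy | 2.96 ± 0.24 | 2.79 ± 0.25 |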
Valid entropy values range from 0 to 5.99 for IOIs and from 0 to 4.85 for MIDI velocity. Zero entropy represents perfect order (i.e., no randomness) and large entropy values indicate high randomness in a probability distribution.
Two-tailed
A brief note on the relationship between variance and entropy is required. Variance and entropy are related measures of the concentration of values in a probability distribution. These measures are equivalent for normally distributed values: both variance and entropy are low if there is a single concentration of values around the mean of the distribution. Such a situation may arise with intensity (MIDI velocity) data, for example, if a performer plays at a single intensity level with minor fluctuations. For non-normal distributions, variance is high if entropy is high, but not necessarily vice versa. If there are several concentrations of values that are not contiguous but scattered at intervals throughout a distribution, for instance, then variance will be relatively high while entropy may still be low. Thus, a bimodal distribution with values concentrated in its tails – which could arise if a performer used low and high (but no intermediate) intensities while playing – may have low entropy but high variance. The current study does not aim to address the shapes of the timing and intensity distributions generated while playing (though this may be an interesting topic for future research), and variance and entropy are treated as equally informative, alternative indices of variability in these performance parameters.
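The contrast between the two measures can be illustrated with a minimal Python sketch. The velocity lists are invented for illustration, and the binning (a hypothetical `bin_width` of 10 MIDI velocity units) is an assumption, since the binning used for the reported entropy values is not described here.

```python
import math
from collections import Counter
from statistics import pvariance


def shannon_entropy(values, bin_width=10):
    """Shannon entropy (in bits) of values grouped into bins of bin_width.

    The bin width is an assumed, illustrative choice.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical MIDI velocities for three 16-note melodies.
steady = [72, 73, 74, 75] * 4      # single concentration around the mean
bimodal = [30, 110] * 8            # only soft and loud keystrokes
spread = list(range(40, 120, 5))   # 16 evenly scattered velocities

for name, vel in [("steady", steady), ("bimodal", bimodal), ("spread", spread)]:
    print(name, "variance:", pvariance(vel), "entropy:", shannon_entropy(vel))
```

In this toy example the bimodal melody has the largest variance but lower entropy than the evenly spread melody, which is the dissociation described above. The steady melody is low on both measures.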
The fMRI experiment comprised 84 experimental trials, 21 baseline trials (which were not utilized in the analyses reported in this article), and 21 null events. In experimental trials, participants heard a melody together with the accompanying backing track. The task was to judge whether the melody was improvised or imitated. In baseline trials, participants heard only a backing track excerpt and were required to judge whether or not it was played well. During null events, which occurred after every five trials, participants had no specified task and were instructed to relax.
Improvised and imitated items were presented across experimental trials in randomized order, with the constraint that at least 30 experimental or baseline trials intervened between the presentation of improvised and imitated versions of the same melody. Whether the improvised or imitated version of a melody appeared first in the trial order was varied randomly. No more than three improvisations or imitations could appear in immediate succession. The order of appearance of baseline (backing track) stimuli was randomized and there were at least three experimental trials between baseline trials.
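The ordering constraints above can be expressed as a simple check on a candidate trial sequence. The following Python sketch is illustrative only: the function name, the tuple representation of trials, and the reading that a baseline trial interrupts a run of improvisations or imitations are assumptions, and null events are left out of the sequence.

```python
def order_is_valid(trials, min_gap=30, max_run=3, min_baseline_gap=3):
    """Check a trial order against the constraints described in the text.

    trials: list of (kind, melody_id) tuples, where kind is "improvised",
    "imitated", or "baseline" and melody_id is None for baseline trials.
    """
    first_seen = {}        # melody_id -> index of the first version presented
    last_baseline = None   # index of the most recent baseline trial
    run_kind, run_len = None, 0
    for i, (kind, melody_id) in enumerate(trials):
        if kind == "baseline":
            # Require enough experimental trials between baseline trials.
            if last_baseline is not None and i - last_baseline - 1 < min_baseline_gap:
                return False
            last_baseline = i
        else:
            # Require enough intervening trials between the two versions
            # of the same melody.
            if melody_id in first_seen and i - first_seen[melody_id] - 1 < min_gap:
                return False
            first_seen.setdefault(melody_id, i)
        # Track runs of identical trial types.
        run_len = run_len + 1 if kind == run_kind else 1
        run_kind = kind
        if kind != "baseline" and run_len > max_run:
            return False
    return True
```

A randomization routine could shuffle the trial list repeatedly and keep the first order for which `order_is_valid` returns `True`, although a constructive procedure may be needed in practice given how restrictive the 30-trial gap is.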
Each trial lasted 20.8 s and consisted of a period without scanner noise (11.7 s) and a period with scanner noise (9.1 s), during which data acquisition took place in a sparse sampling design (see
Baseline trials had the same procedure as experimental trials, except items consisted only of backing tracks, instructional text at the beginning of a trial was the German equivalent of “backing well played?”, and the participant was required to indicate whether these items were well played (yes or no) by pressing a key (left or right). For null events, the instruction “break” was displayed together with the fixation cross and a black screen was presented after the fixation cross.
During the experiment, participants lay supine on the scanner bed, with the right hand resting on the response box. Visually presented instructions were projected by an LCD projector onto a screen placed behind the participant's head. The screen was viewed via a mirror on the top of the head coil. All auditory stimuli were presented over scanner compatible headphones (Resonance Technology, Inc.)
Participants were familiarized with the task prior to the scanning session. They were informed about how the melodies were created (i.e., improvised or imitated) and they were played examples of the backing tracks with and without improvised melodies. In a training phase, participants were presented six 10-s stimulus items, and were asked to judge whether each item was improvised or imitated. Feedback about the correctness of each response was provided and the matched improvised/imitated item was played. Finally, the procedure was practiced (without feedback) first without and then with scanner noise to familiarize participants with the sequence of events: listening to music followed by the presence of scanner noise after the melody presentation (see Figure
After the experiment, participants were debriefed and asked about their strategies for solving the improvised/imitated judgment task. They filled out a questionnaire concerning musical background and the Interpersonal Reactivity Index (Davis,
All functional images were collected with a 3T scanner (Medspec 30/100, Bruker, Ettlingen) equipped with a standard birdcage head coil for excitation and signal collection. To avoid contamination by scanner noise during stimulus presentation, we applied a sparse sampling technique, namely interleaved silent steady state echo planar imaging (Schwarzbauer et al.,
Prior to functional image acquisition, two sets of two-dimensional anatomical images were acquired: T1-weighted modified Driven Equilibrium Fourier Transform images (data matrix 256 × 256, TR = 1300 ms, TI = 650 ms, TE = 10 ms) were obtained with a non-slice-selective inversion pulse followed by a single excitation of each slice; T2*-weighted images were obtained with the same parameters as the functional scans. After functional image acquisition, geometric distortions were characterized by a B0 field-map scan. The field-map scan consisted of a gradient-echo readout (24 echoes, inter-echo time 0.95 ms) with a standard 2D phase encoding. The B0 field was obtained by a linear fit to the unwrapped phases of all odd echoes.
Structural images were acquired on a 3T scanner (Siemens TRIO, Erlangen) on a different day before the functional scanning session using a T1-weighted 3D MP-RAGE (magnetization-prepared rapid gradient echo) sequence with selective water excitation and linear phase encoding. Magnetization preparation consisted of a non-selective inversion pulse. The following imaging parameters were applied: TI = 650 ms; repetition time of the total sequence cycle, TR = 1300 ms; repetition time of the gradient-echo kernel (snapshot FLASH), TR,A = 10 ms; TE = 3.93 ms; alpha = 10°; bandwidth = 130 Hz/pixel (i.e., 67 kHz total); image matrix = 256 × 240; FOV = 256 mm × 240 mm; slab thickness = 192 mm; 128 partitions; 95% slice resolution; sagittal orientation; spatial resolution = 1 mm × 1 mm × 1.5 mm; 2 acquisitions. To avoid aliasing, oversampling was performed in the read direction (head–foot).
Functional magnetic resonance imaging data were analyzed using SPM5
In the first level analysis, pre-processed images of each participant were analyzed with a General Linear Model comprising five predictors modeled using a finite impulse response function with the onset of the first acquired volume after the silent period and with a length of 3.9 s (corresponding to the length of three volumes). It was assumed that the first three volumes acquired in each trial mainly reflect brain activity associated with listening to stimulus items rather than activity associated with the following motor response (given a 4- to 6-s lag of the hemodynamic response). Stimuli were assigned to the predictors according to whether they were improvised or imitated (objective classification) and whether they were judged to be improvised or imitated (subjective classification). Resulting predictors cover (1) improvised melodies judged to be improvised; (2) improvised melodies judged to be imitated; (3) imitated melodies judged to be improvised; (4) imitated melodies judged to be imitated; and (5) backing tracks without melody. Contrasts for predictors 1–4 of the first level analysis for each individual participant were entered into a one way analysis of variance (ANOVA) model for a second-level group analysis. Results for contrasts addressing the baseline (backing track) condition are not reported in this article.
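The assignment of trials to predictors can be sketched as a lookup over the objective and subjective classifications. The Python below is a hedged illustration: the names are hypothetical, and the contrast weight vectors are the conventional main-effect weights for a 2 × 2 design, assumed here rather than taken from the analysis itself.

```python
# Predictor numbering follows the enumeration in the text; all names are
# hypothetical.
PREDICTORS = {
    ("improvised", "improvised"): 1,  # improvised, judged improvised
    ("improvised", "imitated"): 2,    # improvised, judged imitated
    ("imitated", "improvised"): 3,    # imitated, judged improvised
    ("imitated", "imitated"): 4,      # imitated, judged imitated
}
BACKING_TRACK = 5                     # backing track without melody


def assign_predictor(objective, judged=None):
    """Map a trial to its predictor index (1-5)."""
    if objective == "backing":
        return BACKING_TRACK
    return PREDICTORS[(objective, judged)]


# Assumed main-effect weights over predictors 1-4 (not stated in the text).
OBJECTIVE_CONTRAST = [1, 1, -1, -1]   # real improvisations > real imitations
SUBJECTIVE_CONTRAST = [1, -1, 1, -1]  # judged improvised > judged imitated


def dot(a, b):
    """Dot product of two equal-length weight vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Under these assumed weights the two main-effect contrasts are orthogonal (`dot(OBJECTIVE_CONTRAST, SUBJECTIVE_CONTRAST)` is zero), so the objective and subjective effects can be tested separately within the same model.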
Additionally, two parametric analyses were conducted. In the first level of these analyses, activity associated with listening to a melody (regardless of whether it was improvised or imitated and irrespective of participants’ judgments) was modeled by a single predictor. Each item of that predictor was weighted according to its entropy of timing or entropy of intensity value (see
The following regions of interest (ROIs) were chosen to test our prior hypotheses based on the literature reviewed in the Introduction: amygdala, vPM, IPL, anterior insula, frontal operculum, pre-SMA, dPM, DLPFC, RCZ, and SFG. For these areas, we report activations that were significant at the
Listeners were able to judge whether a melody was improvised or imitated with an average correct response rate of 55% (SD 5.4), which is significantly better than chance (50%; two-tailed one-sample
An item analysis was conducted to examine the relationship between subjective judgments for each stimulus item (averaged across listeners) and objective measures of performance instability (i.e., variance and entropy of keystroke timing and intensity; see
Functional magnetic resonance imaging data were analyzed according to a 2 × 2 factorial design that took into account the objective classification of stimuli (real improvisations/real imitations) and subjective classifications based on participants’ responses (judged improvised/judged imitated). First, a logical “AND” conjunction analysis (see Figure
Anatomical region | Hemisphere | MNI x | MNI y | MNI z | Statistic |
---|---|---|---|---|---|
Auditory cortex | R | 57 | −15 | 3 | >10 |
Auditory cortex | L | −48 | −21 | 3 | >10 |
Precentral gyrus (vPM) | R | 57 | 0 | 42 | 7.18 |
Cerebellum | R | 3 | −45 | −15 | 5.57 |
Supplementary motor area | R | 6 | −24 | 60 | 5.56 |
The main effect contrast for the objective classification of stimuli (listening to real improvisations vs. real imitations) revealed a bilateral activation cluster in the amygdala nuclei region (left amygdala,
Anatomical region | Hemisphere | MNI x | MNI y | MNI z | Statistic |
---|---|---|---|---|---|
Amygdala (mainly superficial group) | L | −15 | −6 | −9 | 3.80 |
Amygdala (mainly superficial group) | R | 18 | −6 | −9 | 3.60 |
Pre-SMA | R | 15 | 6 | 60 | 3.91 |
RCZ | R | 9 | 15 | 45 | 3.33 |
Insula | L | −36 | 24 | 3 | 3.28 |
Frontal operculum | L | −54 | 0 | 3 | 3.21 |
Medial frontal gyrus | R | 36 | 3 | 54 | 3.45 |
Head of caudate | R | 9 | 6 | 0 | 3.84 |
The reverse objective contrast comparing activity associated with listening to real imitations vs. listening to real improvisations yielded no significant differences in any ROIs. In a whole brain analysis, this contrast showed two small activations in the left hippocampal gyrus (MNI coordinates:
The main effect contrast for the subjective classification of stimuli (listening to melodies judged to be improvised vs. melodies judged to be imitated) revealed differential activity in several ROIs (see Figure
The reverse subjective contrast, which compared brain activity associated with listening to melodies that were judged to be imitated vs. listening to melodies that were judged to be improvised, yielded no significant differences in any ROIs or the whole brain analysis at a significance level of
Finally, two sets of parametric analyses were conducted to examine relationships between observed brain activations and features of the stimuli pertaining to timing and intensity in the musical performances. In these analyses, each stimulus item (i.e., each melody regardless of its true status as improvised or imitated) was weighted according to either the entropy of IOIs (timing) or the entropy of keystroke velocities (intensity). Results indicated that the entropy of both measures was correlated positively with activity in the left amygdala (
Anatomical region | Hemisphere | MNI x | MNI y | MNI z |
---|---|---|---|---|
Entropy of timing | | | | |
Amygdala | L | −15 | −3 | −9 |
Auditory cortex | R | 45 | −21 | 6 |
Auditory cortex | L | −42 | −30 | 9 |
Brainstem, periaqueductal gray | L/R | 3 | −30 | −3 |
Entropy of intensity (loudness) | | | | |
Amygdala | L | −15 | −3 | −9 |
Auditory cortex | R | 54 | −12 | 3 |
Auditory cortex | L | −42 | −24 | 6 |
Entropy of timing | | | | |
Amygdala | L | −18 | −3 | −9 |
Auditory cortex | R | 48 | −21 | 6 |
Auditory cortex | L | −42 | −27 | 9 |
Brainstem, periaqueductal gray | L/R | 6 | −33 | −6 |
Entropy of intensity (loudness) | | | | |
Amygdala | L | −15 | −3 | −9 |
Auditory cortex | R | 51 | −9 | 0 |
Auditory cortex | L | −42 | −33 | 9 |
The current study addressed the perception of musical spontaneity by examining differences in brain activation associated with listening to improvised vs. imitated jazz piano performances. Behavioral data indicated that listeners (experienced jazz musicians) were able, on average, to classify these performances as improvisations or imitations at an accuracy level that – despite being low (55%) – was significantly better than chance. Listeners’ judgments quite likely reflected their sensitivity to differences in the variability of timing and intensity in improvised and imitated performances. Analyses of the performances themselves revealed that the entropy of keystroke timing and intensity was generally higher during improvisation than imitation. This suggests that the spontaneous variability of motor control parameters governing pianists’ movement timing and finger force was greater when inventing melodies than when producing rehearsed versions of these melodies. It may be the case that a performer's degree of (un)certainty about upcoming actions fluctuates more widely during improvisation than imitation (see Keller et al.,
The examination of individual differences in the ability to make accurate improvisation/imitation judgments revealed that accuracy was positively correlated with listeners’ musical experience and scores on a self-report measure of the “perspective taking” dimension of empathy. These factors may have influenced the detection of variability in timing and intensity; indeed, previous research has shown that musical training enhances auditory sensitivity to timing deviations (Rammsayer and Altenmueller,
Analyses of listeners’ hemodynamic responses were conducted on the basis of two contrasts applied to the fMRI data. The first contrast – which was based on the objective classification of stimuli as real improvisations vs. real imitations – revealed that listening to improvisations was associated with relatively strong activity in the amygdala. This structure may play a role in the detection of cues to behavioral uncertainty in physical stimulus parameters such as random fluctuations in performance timing and intensity.
Classical views describing amygdala involvement in threat detection, fear conditioning, and the processing of negatively valenced emotional stimuli (LeDoux,
In functional terms, the amygdala may be involved in heightening vigilance and attention in response to ambiguity in external signals (Whalen,
The second contrast applied to the fMRI data was based on participants’ subjective classifications of stimuli as improvisations or imitations. It thus tested the neural bases of listeners’ beliefs about the spontaneity of each performance, and, in doing so, is potentially more informative than the objective contrast when it comes to examining experience-related processes such as action simulation. The subjective contrast revealed that hemodynamic responses in the pre-SMA (extending to the SFG), RCZ, frontal operculum, and anterior insula were stronger when listening to melodies that were ultimately judged to be improvised than for melodies that were judged to be imitated. These findings are consistent with our hypothesis that listening to improvisations vs. imitations would generally be associated with cortical activations that overlap with those observed in studies of internally vs. externally guided action execution and in work on covert action simulation. Notably, studies examining differences associated with producing improvised vs. imitated or pre-learned melodies have also reported stronger activation of the pre-SMA, RCZ, and frontal operculum during improvisation (Bengtsson et al.,
The relatively strong activation of the pre-SMA, RCZ, frontal operculum, and anterior insula when listening to performances that were judged to be improvised in our study may reflect the greater engagement of an action simulation network related to free response selection. Converging evidence that listeners engaged in action simulation in the current task was provided by a conjunction analysis examining overlap in brain areas activated by improvised and imitated melodies (regardless of whether they were judged to be improvised or imitated). This analysis revealed the involvement of motor-related areas, including the cerebellum, SMA, and vPM, in addition to primary and secondary auditory regions. This pattern of activations is consistent with those observed in previous studies on covert simulation during music listening (Griffiths et al.,
The differential involvement of an action simulation network specializing in freely selected responses when listening to judged improvisations vs. judged imitations may reflect differences in the degree of effort (i.e., amount of processing) required by the cognitive/motor system to generate online predictions about upcoming events in the performances. This prediction process, which may involve auditory imagery (Keller,
Spontaneously improvised piano melodies are characterized by greater variability in timing and intensity than rehearsed imitations of the same melodies, and highly experienced, empathic listeners can detect these differences more accurately than expected by chance. Distinct patterns of brain activation associated with listening to improvised vs. imitated performances occur at two levels. At one level, differences based on the objective classification of performances reflect a distinction in the way the brain processes improvisations and imitations independently of whether the listener classifies them correctly. The amygdala seems to be involved in this differentiation, operating as a detector of cues to behavioral uncertainty on the part of the performer who recorded the melody. At the other level, differences in brain activation were related to the listener's subjective belief that a performance is improvised or imitated. A cortical network involved in generating online predictions via covert action simulation may mediate judgments about whether a melody is improvised or imitated, perhaps based on the degree of expectancy violation produced by perceived fluctuations in performance stability. It should be noted that the above effects were found with musically trained listeners. Whether they generalize to untrained individuals remains to be seen.
The current findings point to a bipartite answer to the question posed at the opening of this article: Although your amygdala may be sensitive to whether the mesmerizing pianist is engaged in spontaneous improvisation or rehearsed imitation, the ability to judge this would depend on whether your musical experience and perspective taking skills enable faithful internal simulation of the performance. Thus, while certain brain regions may be generally sensitive to cues to behavioral spontaneity, the conscious evaluation of spontaneity may rely upon action-relevant experience and personality characteristics related to empathy.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at
Example of a 10-s excerpt of an improvised melody played over the swing backing track.
Example of an imitated version of the improvised swing melody in Audio
Example of a 10-s excerpt of an improvised melody played over the bossa nova backing track.
Example of an imitated version of the improvised bossa nova melody in Audio
Example of a 10-s excerpt of an improvised melody played over the blues ballad backing track.
Example of an imitated version of the improvised blues ballad melody in Audio
This research was supported by the Max Planck Society. We thank Andrea Keller for composing and recording the backing tracks used in this study. We are also grateful to Andreas Weber for recording and editing the pianists’ improvisations and imitations for use as stimuli, and for assisting with fMRI data acquisition. Finally, we thank Jöran Lepsien for helpful comments on the design, Toralf Mildner for implementation of the fMRI ISSS sequence, Karsten Müller for discussions concerning the ISSS data analysis, and Johannes Stelzer for assisting with the analysis of the MIDI performance data.